Repulsive_Tart3669 t1_j0jlaa6 wrote on December 17, 2022 at 3:15 AM

Back in 2012 we were experimenting with an engineering-based approach to extracting relations and events from text. Examples of events are company announcements, mergers and acquisitions, management position changes, customer complaints about products, etc. Our NLP pipeline included two major steps: named entity recognizers and a rule-based engine over a graph of annotations. The first step extracted various types of entities: names of companies and people, geographical locations, and temporal expressions. It also included a dictionary-based extractor for anchor verbs (e.g., acquire, purchase, announce, step down). The second step used a rule-based engine that tried to match tokens and named entities into high-level concepts using a regular-expression-like syntax, e.g., 'annotate[COMPANY_ANNOUNCEMENT] if match[COMPANY ANNOUNCEMENT_VERB]'. Then, if I recall correctly, we switched to rules over the dependency structure of sentences (something like subject - verb - object). That gave slightly lower precision but much better recall. But this was 10 years ago, and a lot has changed since then.
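
To make the two steps concrete, here is a minimal sketch in Python. It is not the original system: the dictionaries, the `Annotation` class, and the `annotate` and `apply_rule` functions are all hypothetical names made up for illustration, plain dictionary lookup stands in for real named entity recognizers, and a flat list of spans stands in for the annotation graph. The rule is read here as "a COMPANY annotation immediately followed by an ANNOUNCEMENT_VERB annotation", which is an assumption about the rule syntax quoted above.

```python
import re
from dataclasses import dataclass

# Hypothetical dictionaries standing in for real entity recognizers.
COMPANIES = {"Acme Corp", "Globex"}
ANNOUNCEMENT_VERBS = {"announce", "announces", "announced"}


@dataclass
class Annotation:
    label: str
    text: str
    start: int
    end: int


def annotate(text):
    """Step 1: dictionary lookup that produces labeled spans in text order."""
    spans = []
    dictionaries = (("COMPANY", COMPANIES), ("ANNOUNCEMENT_VERB", ANNOUNCEMENT_VERBS))
    for label, terms in dictionaries:
        for term in terms:
            # Whole-word match so "announce" does not fire inside "announced".
            for m in re.finditer(r"\b" + re.escape(term) + r"\b", text):
                spans.append(Annotation(label, m.group(), m.start(), m.end()))
    return sorted(spans, key=lambda a: a.start)


def apply_rule(annotations, pattern, new_label, text):
    """Step 2: emit new_label wherever the label sequence `pattern` occurs
    as consecutive annotations separated only by whitespace."""
    results = []
    n = len(pattern)
    for i in range(len(annotations) - n + 1):
        window = annotations[i:i + n]
        if [a.label for a in window] != pattern:
            continue
        gaps_ok = all(
            text[a.end:b.start].strip() == ""
            for a, b in zip(window, window[1:])
        )
        if gaps_ok:
            start, end = window[0].start, window[-1].end
            results.append(Annotation(new_label, text[start:end], start, end))
    return results


text = "Acme Corp announced a new product. Globex said nothing."
annotations = annotate(text)
events = apply_rule(
    annotations,
    ["COMPANY", "ANNOUNCEMENT_VERB"],
    "COMPANY_ANNOUNCEMENT",
    text,
)
for event in events:
    print(event.label, repr(event.text))
```

The adjacency check is what makes this kind of rule precise but brittle: "Acme Corp, the widget maker, announced ..." would not match. A rule over the dependency structure would instead ask whether the company is the subject of the verb, regardless of what sits between them, which fits the recall gain described above.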