In this work we present a system for the automatic annotation of opinions in Spanish texts. We focus mainly on the definition of a TFS-style model for opinion predicates and their arguments, on the creation of a lexicon of opinion predicates, and on two additional variants for identifying the source of opinions. The original system extracts opinions and all their elements (predicate, source, topic and message) using hand-coded rules. The first variant uses a CRF model to learn the source, assuming that the predicate is already tagged. The second variant is a combined version in which the source recognized by the rule-based system is added as an additional attribute for training the CRF model. We found that this hybrid system performs better than either system evaluated separately. This work involved the construction of several resources for Spanish: a lexicon of opinion predicates, a 13,000-word corpus with full opinion annotations, and a 40,000-word corpus with annotations of opinion predicates and sources.
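The combined variant described in the abstract can be illustrated with a minimal sketch of per-token feature extraction, in which the output of a rule-based tagger becomes one more attribute for a CRF. Everything here is an assumption made for illustration: the function names, the feature names, and the toy capitalization rule are hypothetical and do not reproduce the authors' rules or feature set. The resulting list of dictionaries is the kind of input that common CRF toolkits accept, but no specific library is implied.

```python
from typing import Dict, List


def rule_based_source_tags(tokens: List[str], predicate_index: int) -> List[str]:
    """Toy stand-in for hand-coded rules (hypothetical, not the authors' rules):
    tag capitalized tokens that precede the opinion predicate as the source."""
    tags = []
    for i, tok in enumerate(tokens):
        if i < predicate_index and tok[:1].isupper():
            tags.append("SRC")
        else:
            tags.append("O")
    return tags


def token_features(
    tokens: List[str], predicate_index: int, rule_tags: List[str], i: int
) -> Dict[str, object]:
    """Build the attribute dictionary for token i (feature names are illustrative)."""
    return {
        "word.lower": tokens[i].lower(),
        "is_capitalized": tokens[i][:1].isupper(),
        # The predicate is assumed to be already tagged, as in the abstract.
        "is_predicate": i == predicate_index,
        "offset_from_predicate": i - predicate_index,
        # Hybrid step: the rule-based decision is added as an extra attribute.
        "rule_tag": rule_tags[i],
    }


def sentence_features(tokens: List[str], predicate_index: int) -> List[Dict[str, object]]:
    """Run the rule-based tagger once, then build features for every token."""
    rule_tags = rule_based_source_tags(tokens, predicate_index)
    return [
        token_features(tokens, predicate_index, rule_tags, i)
        for i in range(len(tokens))
    ]


# Example sentence: "María dijo que el plan es bueno" (predicate: "dijo")
tokens = ["María", "dijo", "que", "el", "plan", "es", "bueno"]
features = sentence_features(tokens, predicate_index=1)
```

In this arrangement the learner can weigh the rule-based decision against the lexical and positional attributes, rather than accepting the rules' output directly.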
Combining Rules and CRF Learning for Opinion Source Identification in Spanish Texts
Type
              Journal article
          Year
              2012
          Publisher
              Springer Berlin Heidelberg
          Pages
              452
          Volume
              7637
          Citekey
              citeulike:12275311
          doi
              10.1007/978-3-642-34654-5_46