Sentiment phrase generation using statistical methods

Abstract

In this paper, we describe a new algorithm for generating lexical resources for sentiment analysis. Working from corpora of customer reviews, we select words and phrases as candidates for our sentiment lexicon solely by calculating a word co-occurrence measure and by considering word frequencies. The sentiment value of each word or phrase is derived automatically from the review titles and their associated ratings. We deliberately avoid natural language processing methods in order to keep the algorithm language-independent. Furthermore, by using exclusively statistical methods, we are able to identify rather unusual word combinations, such as idiomatic expressions. This distinguishes our work from most prior approaches, which concentrate on single words or word-modifier combinations. An example lexicon is generated from a corpus of 1.5 million German Amazon customer reviews.
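The abstract does not name the co-occurrence measure or the exact rule for turning ratings into sentiment values, so the following is only a minimal sketch under stated assumptions, not the authors' algorithm. It assumes pointwise mutual information (PMI) over adjacent word pairs as the co-occurrence measure, a minimum frequency threshold, whitespace-tokenized input, and a sentiment value equal to the mean star rating of the titles containing a phrase, rescaled to [-1, 1]. The function names `bigram_pmi` and `phrase_sentiment`, their parameters, and the toy data are all hypothetical.

```python
import math
from collections import Counter


def bigram_pmi(token_lists, min_count=2):
    """Score adjacent word pairs by PMI (an assumed measure, not the paper's)."""
    unigrams = Counter()
    bigrams = Counter()
    for tokens in token_lists:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        # Frequency filter: rare pairs give unreliable PMI estimates.
        if count < min_count:
            continue
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    return scores


def phrase_sentiment(titled_ratings, phrase, min_rating=1, max_rating=5):
    """Mean rating of titles containing the phrase, rescaled to [-1, 1].

    Assumption: plain lowercase substring matching stands in for whatever
    matching rule the paper actually uses.
    """
    ratings = [r for title, r in titled_ratings if phrase in title.lower()]
    if not ratings:
        return None
    mean = sum(ratings) / len(ratings)
    midpoint = (min_rating + max_rating) / 2
    half_range = (max_rating - min_rating) / 2
    return (mean - midpoint) / half_range


# Made-up toy data, for illustration only.
reviews = [["sehr", "gut"], ["sehr", "gut"], ["nicht", "gut"]]
titles = [("Sehr gut", 5), ("sehr gut verarbeitet", 4), ("Schlecht", 1)]

candidates = bigram_pmi(reviews)
value = phrase_sentiment(titles, "sehr gut")
```

Because both functions rely only on counts and ratings, nothing in the sketch depends on a language-specific tagger or parser, which is the property the abstract emphasizes.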

More about this title

Title Sentiment phrase generation using statistical methods
Published in SAC 2018: Symposium on Applied Computing
Publisher Association for Computing Machinery
ISBN 978-1-4503-5191-1
Authors Dirk Reinel, Prof. Dr. Jörg Scheidt, Andreas Henrich, Niko Brucker
Pages pp. 452-460
Publication date 9 April 2018
Citation Reinel, Dirk; Scheidt, Jörg; Henrich, Andreas; Brucker, Niko (2018): Sentiment phrase generation using statistical methods. SAC 2018: Symposium on Applied Computing, pp. 452-460.