Finding Sentiment Dimension in Vector Space of Movie Reviews: An Unsupervised Approach

Year: 2017
Volume: 18
Issue: 1
Pages: 85-102
Authors: Youngsam Kim & Hyopil Shin
Abstract
This study proposes an unsupervised method for finding the sentiment orientations of words in Korean movie reviews. The orientations are represented as real values on a sentiment dimension, which is derived from a high-dimensional vector space built from the movie reviews. To search for this dimension, Pointwise Mutual Information is first used to select a set of words that are close to common modifiers; phrases composed of these words often form good/bad associations (e.g., “good acting”, “terrible acting”). A neural language model (Word2Vec) is then used to calculate the point-wise similarity distances between the chosen words, and dimensionality reduction algorithms (e.g., PCA, MDS) are employed to find the axis of the sentiment orientations. Finally, the performance of the method is measured by unsupervised classification of two movie review datasets based on the orientation values. According to the results, the best accuracies are 66% and 76% on the two datasets.
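
The axis-finding step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the words, the 4-dimensional vectors, and the function name first_principal_axis are all invented for illustration, and PCA is applied directly to the vectors rather than to Word2Vec similarity distances. The sketch only shows how projecting word vectors onto the first principal component can yield one real-valued orientation per word.

```python
import numpy as np

# Hypothetical 4-dimensional embeddings standing in for Word2Vec vectors.
# Both the words and the values are invented for illustration only.
toy_vectors = {
    "good":      np.array([0.9, 0.1, 0.3, 0.2]),
    "great":     np.array([0.8, 0.2, 0.2, 0.3]),
    "excellent": np.array([0.7, 0.1, 0.4, 0.2]),
    "bad":       np.array([-0.8, 0.2, 0.3, 0.2]),
    "terrible":  np.array([-0.9, 0.1, 0.2, 0.3]),
    "boring":    np.array([-0.7, 0.2, 0.4, 0.2]),
}


def first_principal_axis(matrix):
    """Project each row onto the first principal component (PCA via SVD)."""
    centered = matrix - matrix.mean(axis=0)
    # Rows of vt are the principal directions, ordered by explained variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[0]


words = list(toy_vectors)
matrix = np.stack([toy_vectors[w] for w in words])
scores = first_principal_axis(matrix)

# One real-valued orientation per word along the recovered axis.
orientation = dict(zip(words, scores))

for word, score in orientation.items():
    print(f"{word:10s} {score:+.3f}")
```

The sign of a principal component is arbitrary, so which end of the axis counts as positive would have to be fixed separately, for example by checking the sign assigned to a known word such as "good".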

Keywords: Sentiment Analysis, Vector Semantics, Unsupervised Approach, Word Space Models