Domain Adaptation for Sentiment Classification in Light of Multiple Sources

Published Online: https://doi.org/10.1287/ijoc.2013.0585

Sentiment classification is one of the most extensively studied problems in sentiment analysis, and supervised learning methods, which require labeled data for training, have proven quite effective. However, supervised methods assume that the training domain and the testing domain share the same distribution; when they do not, accuracy drops dramatically. This poses no problem when labeled training data are readily available, but in some circumstances labeled data are quite expensive to acquire. For instance, if we want to detect sentiment in Tweets or Facebook comments, the only way to obtain labeled data is to annotate the text manually, which is prohibitively burdensome and time-consuming. In this paper, we propose a hybrid approach that integrates the sentiment information from source-domain labeled data and a set of preselected sentiment words to solve this problem. The experimental results suggest that our method outperforms the state of the art by a statistically significant margin and, in some cases, even surpasses the in-domain gold standard.
