Enhancing deep learning sentiment analysis with ensemble techniques in social applications

doi:10.1016/j.eswa.2017.02.002

Expert Systems with Applications

Volume 77, 1 July 2017, Pages 236-246

https://doi.org/10.1016/j.eswa.2017.02.002

Under a Creative Commons license

open access

Highlights

• A taxonomy that classifies ensemble models in the literature is presented.
• Surface and deep features integration is explored to improve classification.
• Several ensembles of classifiers and features are proposed and evaluated.
• Performance of the proposed models is evaluated on several sentiment datasets.

Abstract

Deep learning techniques for Sentiment Analysis have become very popular. They provide automatic feature extraction, richer representation capabilities, and better performance than traditional feature-based techniques (i.e., surface methods). Traditional surface approaches are based on complex manually extracted features, and this extraction process is a fundamental question in feature-driven methods. These long-established approaches can yield strong baselines, and their predictive capabilities can be used in conjunction with the emerging deep learning methods. In this paper we seek to improve the performance of deep learning techniques by integrating them with traditional surface approaches based on manually extracted features. The contributions of this paper are sixfold. First, we develop a deep-learning-based sentiment classifier using a word embeddings model and a linear machine learning algorithm. This classifier serves as a baseline against which subsequent results are compared. Second, we propose two ensemble techniques that aggregate our baseline classifier with other surface classifiers widely used in Sentiment Analysis. Third, we propose two models that combine surface and deep features to merge information from several sources. Fourth, we introduce a taxonomy for classifying the different models found in the literature, as well as the ones we propose. Fifth, we conduct several experiments to compare the performance of these models with the deep learning baseline, using seven public datasets drawn from the microblogging and movie review domains. Finally, a statistical study confirms that the proposed models surpass our original baseline on F1-Score.
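
The two integration ideas named in the abstract, aggregating the predictions of several classifiers and merging surface features with deep features, can be illustrated with a minimal sketch. The abstract does not specify the aggregation rule or the feature sets, so everything below is an assumption made for illustration: majority voting stands in for the ensemble technique, plain vector concatenation stands in for feature combination, and the three toy classifiers (`lexicon_clf`, `emoticon_clf`, `stand_in_deep_clf`) are hypothetical placeholders, not the models evaluated in the paper.

```python
from collections import Counter
from typing import Callable, List, Sequence

# A classifier maps a text to a label: 1 = positive, 0 = negative.
Classifier = Callable[[str], int]

# Tiny hypothetical lexicons, used only to make the example runnable.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "awful", "hate"}


def lexicon_clf(text: str) -> int:
    """Toy surface classifier: counts whitespace-separated lexicon hits."""
    tokens = text.lower().split()
    score = sum(t in POSITIVE for t in tokens) - sum(t in NEGATIVE for t in tokens)
    return 1 if score > 0 else 0


def emoticon_clf(text: str) -> int:
    """Toy surface classifier: positive only if a smiley is present."""
    return 1 if ":)" in text else 0


def stand_in_deep_clf(text: str) -> int:
    """Placeholder for a word-embeddings + linear model baseline (assumption)."""
    return 0 if "not" in text.lower().split() else 1


def majority_vote(classifiers: Sequence[Classifier], text: str) -> int:
    """Ensemble of classifiers: return the label most classifiers predict.

    With an even number of classifiers a tie is possible; Counter.most_common
    then returns the label that was voted first.
    """
    votes = Counter(clf(text) for clf in classifiers)
    return votes.most_common(1)[0][0]


def concat_features(surface: Sequence[float], deep: Sequence[float]) -> List[float]:
    """Ensemble of features: join surface and deep vectors into one input."""
    return list(surface) + list(deep)


if __name__ == "__main__":
    ensemble = [lexicon_clf, emoticon_clf, stand_in_deep_clf]
    print(majority_vote(ensemble, "I love this movie :)"))
    print(concat_features([1.0, 0.0], [0.2, 0.3]))
```

In this sketch the combined feature vector would be passed to a single downstream classifier, whereas the voting function leaves each classifier untouched and only merges their outputs. Which of the two designs works better is an empirical question of the kind the paper's experiments address.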

Keywords

Ensemble

Deep learning

Sentiment analysis

Machine learning

Natural language processing