Understanding Emotions in Text Using Deep Learning and Big Data

doi:10.1016/j.chb.2018.12.029

Computers in Human Behavior

Volume 93, April 2019, Pages 309-317

https://doi.org/10.1016/j.chb.2018.12.029

Highlights

• Emotion detection in text has several practical applications, such as modulating the responses of real-world chat-bots.
• Combining the sentiment and semantic information in a text improves an emotion detection system.
• Our approach learns diverse ways of expressing emotions and significantly outperforms methods described in the literature.

Abstract

Big Data and Deep Learning algorithms, combined with enormous computing power, have paved the way for significant technological advancements. Technology is evolving to anticipate, understand and address our unmet needs. However, to fully meet human needs, machines or computers must deeply understand human behavior, including emotions. Emotions are physiological states generated in humans as a reaction to internal or external events. They are complex and are studied across numerous fields, including computer science. As humans, on reading “Why don't you ever text me!”, we can interpret it as either a sad or an angry emotion, and the same ambiguity exists for machines. The lack of facial expressions and voice modulations makes detecting emotions in text a challenging problem. However, in today's online world, humans are increasingly communicating using text messaging applications and digital agents. Hence, it is imperative for machines to understand emotions in textual dialogue to provide emotionally aware responses to users. In this paper, we propose a novel Deep Learning based approach to detect three emotions (Happy, Sad and Angry) in textual dialogues. The essence of our approach lies in combining both semantic and sentiment based representations for more accurate emotion detection. We use semi-automated techniques to gather large scale training data with diverse ways of expressing emotions to train our model. Evaluation of our approach on real world dialogue datasets reveals that it significantly outperforms traditional Machine Learning baselines as well as other off-the-shelf Deep Learning models.

Introduction

Technology is continuously evolving to amplify human ingenuity, to make our day-to-day life simpler, and to anticipate and address our unmet needs. In order to anticipate our needs, it is essential for machines or computers to be able to deeply understand human behavior. Human behavior is very complex. Culture, social norms, faith and language, among many other things, play a role in defining human behavior. In particular, understanding and expressing emotion is a key element of human behavior. Machines and computers must deeply understand emotions to be able to anticipate human needs.

Emotions such as happiness, anger and sadness are physiological states that humans routinely experience. In the field of cognitive computing, where we develop technologies to mimic the functioning of the human brain, understanding emotions is an important area of research (Thilmany, 2007). With the growing prominence of messaging platforms like WhatsApp and Twitter, interaction through textual dialogue is increasing. Several digital agents and chat bots operate on these messaging platforms and are currently used by a large number of online users. The success of these agents depends on their ability to modulate responses based on user emotions, for which it is imperative to detect emotions in textual dialogues and avoid responding inappropriately (Miner et al., Linos).

Furthermore, the ability of machines or computers to understand emotions is critical for the success of several other applications. For instance, in the domain of customer service, social media platforms like Twitter are gaining prominence, and customers expect quick responses. When the flow of tweets is heavy, the turn-around time for responses increases. Prioritizing tweets according to their emotional content and responding to them in that order, for example responding to an angry tweet before a basic inquiry, will result in increased customer satisfaction. In addition, in this era of text messaging, users are constantly texting and may send inappropriately angry messages to others. If emotion detection is implemented, the application can take appropriate action in such cases, such as showing a warning to the user before the message is sent. Emotion detection also has social applications, such as flagging content that indicates bullying or depression in Twitter streams or online fora. Thus, emotion detection in textual dialogue has several applications in today's online world.

Emotions have been studied for many years by researchers (Hochschild, 2002; Lane et al., 1996; Plutchik, 1994) in fields such as psychology, sociology, medicine and computer science. Prominent work in understanding and categorizing emotions includes Ekman's six class categorization (Ekman) and Plutchik's “Wheel of Emotion”, which suggested eight primary bipolar emotions (Plutchik & Kellerman, 1986). Given the breadth of study in this field, there is no broad consensus on the granularity of emotion classes. Hence, as a first step, we restricted our current study to the three most frequently observed emotions in our user logs: Happy, Sad and Angry.

Problem Definition

In a textual dialogue, given the user utterance along with its context, classify the emotion of the user utterance as one of Happy, Sad, Angry or Others.

Understanding emotions in textual conversations can be a challenging problem in the absence of facial expressions and voice modulations. Fig. 1 provides an example where it is difficult, even for a human, to detect the emotion of the user utterance solely on the basis of the text of the conversation. The emotion of the user whose messages are on the left could be interpreted as angry or sad. The challenge of understanding emotions is further compounded by the difficulty of understanding context and sarcasm, class size imbalance, natural language ambiguity and rapidly growing Internet slang. However, big data and powerful deep learning algorithms have paved the way for us to attack this problem.

In this paper, we propose an end-to-end trainable deep learning model, called “Sentiment and Semantic Based Emotion Detector (SS-BED)”, for detecting emotions in textual dialogues. The essence of our approach lies in leveraging both the sentiment and semantic representations of the user utterance for accurate emotion detection. The motivation behind combining sentiment and semantic representations can be understood from the following example. Consider the utterance “On road again … miss my amazing partner though!”. This utterance contains a negative sentiment word ‘miss’ as well as a positive sentiment word ‘amazing’, but the overall emotion of the utterance is Sad. By combining the sentiment of different words in the utterance with a semantic understanding of the sentence, we can detect the emotion in this case. Hence, our intuition is that combining both sentiment and semantic features improves the classification of emotions in such scenarios.
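The ambiguity in this example can be made concrete with a small sketch. The lexicon below is hypothetical (two hand-picked polarity scores, not taken from the paper or from any real sentiment resource), and `word_polarities` is an illustrative helper name. It only shows that the word-level polarities cancel out, so word sentiment alone does not settle the emotion of the utterance.

```python
# Hypothetical word-level polarity scores, chosen only for illustration.
TOY_LEXICON = {"miss": -1.0, "amazing": 1.0}


def word_polarities(utterance):
    """Return (word, polarity) pairs for the words found in the toy lexicon."""
    tokens = [token.strip(".,!?").lower() for token in utterance.split()]
    return [(token, TOY_LEXICON[token]) for token in tokens if token in TOY_LEXICON]


utterance = "On road again … miss my amazing partner though!"
hits = word_polarities(utterance)
net = sum(score for _, score in hits)

print(hits)  # one negative word and one positive word
print(net)   # the polarities cancel, so the sum carries no emotion signal
```

A purely lexicon-based score is therefore uninformative here, which is the situation in which additional semantic information about the whole sentence is expected to help.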

Given a user utterance, SS-BED takes the individual sentiment and semantic representations of its input words and combines them into a unified representation for the entire utterance, which is used for predicting the emotion. We evaluate SS-BED on real world textual dialogues, where it outperforms traditional Machine Learning approaches and other Deep Learning approaches. The main contributions of our paper are as follows:

• We propose a novel approach towards understanding emotions in textual conversations, using a deep-learning system called SS-BED.
• We evaluate various Deep Learning techniques and embeddings, along with Machine Learning algorithms (such as Support Vector Machines (SVM), Decision Trees and Naive Bayes), on real world textual conversations and compare their effectiveness for the task of understanding emotions.

Practical Application: Our current research is in the context of an online chat-bot, designed for informal conversations with users. In this scenario, we notice that users often express a variety of emotions such as being nervous about exams, excited about a new job, feeling sad about a break-up, etc. In such cases, the boundaries between computers and humans blur, and users expect computers to deeply understand human behavior including emotions. Understanding these emotions and providing an emotionally aware response not only creates a deeper and sustained engagement with users but takes us a step closer to deeply understanding humans and anticipating their psychological needs.

The rest of the paper is organized as follows: Section 2 provides a summary of related work. Section 3 describes our approach (SS-BED) in detail. Our experimental setup is discussed in Section 4 and our results are in Section 5. Finally, Section 6 concludes the paper, followed by future direction for our work.

Section snippets

Related work

Considerable work has been done in the space of image based emotion recognition (Wang et al., 2018; Zhang et al., 2016). However, classifying textual dialogues based on emotions is a relatively new research area. Emotion-detection algorithms can be broadly grouped into the following two categories:

(a) Hand-crafted Feature Engineering Based Approaches: Many methods exploit the usage of keywords in a sentence with explicit emotional/affective value (Balahur et al., 2011; Chaumartin, 2007; Kozareva et

Our approach

We model the task of understanding emotions as a multi-class classification problem where, given a user utterance, the model outputs the probabilities of it belonging to four output classes: Happy, Sad, Angry and Others. The architecture of our proposed SS-BED model is shown in Fig. 2. Our model uses LSTMs (Hochreiter and Schmidhuber, 1997), which are effective in processing sequential information. The input user utterance is fed into two LSTM layers using two different word embedding matrices. One
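The two-branch idea described in this section can be sketched in plain Python as follows. This is a minimal, untrained illustration and not the SS-BED implementation: the class names `TinyLSTM` and `TwoBranchClassifier`, all dimensions, the random weights, the omission of bias terms, and the choice to concatenate the two final hidden states before a single linear softmax layer are assumptions made for the example.

```python
import math
import random

CLASSES = ["Happy", "Sad", "Angry", "Others"]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix, vector):
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def rand_matrix(rows, cols, rng):
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


def softmax(scores):
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


class TinyLSTM:
    """Single-layer LSTM (no biases) that returns its final hidden state."""

    def __init__(self, input_dim, hidden_dim, rng):
        self.hidden_dim = hidden_dim
        # One weight matrix per gate: input (i), forget (f), output (o),
        # candidate (g). Each acts on the concatenation [x_t ; h_{t-1}].
        self.weights = {
            gate: rand_matrix(hidden_dim, input_dim + hidden_dim, rng)
            for gate in "ifog"
        }

    def encode(self, vectors):
        h = [0.0] * self.hidden_dim
        c = [0.0] * self.hidden_dim
        for x in vectors:
            xh = x + h  # list concatenation of input and previous hidden state
            i = [sigmoid(z) for z in matvec(self.weights["i"], xh)]
            f = [sigmoid(z) for z in matvec(self.weights["f"], xh)]
            o = [sigmoid(z) for z in matvec(self.weights["o"], xh)]
            g = [math.tanh(z) for z in matvec(self.weights["g"], xh)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


class TwoBranchClassifier:
    """Two embedding matrices, two LSTMs, one softmax over four classes."""

    def __init__(self, vocab, sem_dim=8, sent_dim=4, hidden_dim=6, seed=0):
        rng = random.Random(seed)
        self.index = {word: k for k, word in enumerate(vocab)}
        self.semantic_emb = rand_matrix(len(vocab), sem_dim, rng)
        self.sentiment_emb = rand_matrix(len(vocab), sent_dim, rng)
        self.semantic_lstm = TinyLSTM(sem_dim, hidden_dim, rng)
        self.sentiment_lstm = TinyLSTM(sent_dim, hidden_dim, rng)
        self.output = rand_matrix(len(CLASSES), 2 * hidden_dim, rng)

    def predict_proba(self, utterance):
        ids = [self.index[w] for w in utterance.lower().split() if w in self.index]
        sem = self.semantic_lstm.encode([self.semantic_emb[k] for k in ids])
        sent = self.sentiment_lstm.encode([self.sentiment_emb[k] for k in ids])
        combined = sem + sent  # assumed combination: simple concatenation
        return dict(zip(CLASSES, softmax(matvec(self.output, combined))))


model = TwoBranchClassifier(["why", "don't", "you", "ever", "text", "me"])
print(model.predict_proba("Why don't you ever text me"))
```

Because the weights are random, the printed probabilities are close to uniform; the sketch only shows how one utterance flows through two separately embedded branches into a single four-class distribution.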

Experimental setup

In this section, we describe the evaluation dataset used to compare various techniques, and the baseline methods used for comparison.

Results

A summary of results from various techniques on the dataset described in Section 4.1 is presented in Table 6. SS-BED gives the best performance on F1 score for each emotion class as well as on Macro and Micro F1, as can be seen more clearly from Fig. 3. The improvement of SS-BED over all other models is statistically significant (p < 0.005) as measured by McNemar's test (McNemar, 1969). Our results thus indicate that combining sentiment and semantic features in SS-BED outperforms individual
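For readers unfamiliar with the significance test mentioned above, the following sketch shows one common form of McNemar's test for comparing two classifiers on the same test items. It uses the chi-square approximation with continuity correction; whether the paper used this variant is not stated in the text, so that choice is an assumption. The function name `mcnemar_test` and the counts in the usage line are hypothetical and are not results from the paper.

```python
import math


def mcnemar_test(only_a_correct, only_b_correct):
    """McNemar's test (chi-square approximation, continuity-corrected).

    only_a_correct: items classifier A labels correctly and B incorrectly.
    only_b_correct: items classifier B labels correctly and A incorrectly.
    Returns (statistic, p_value) for a chi-square with 1 degree of freedom.
    """
    discordant = only_a_correct + only_b_correct
    if discordant == 0:
        # The two classifiers never disagree, so there is no evidence of a difference.
        return 0.0, 1.0
    statistic = (abs(only_a_correct - only_b_correct) - 1) ** 2 / discordant
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value


# Hypothetical disagreement counts, for illustration only.
statistic, p_value = mcnemar_test(only_a_correct=40, only_b_correct=15)
print(f"chi-square = {statistic:.3f}, p = {p_value:.4f}")
```

Only the items on which the two classifiers disagree enter the statistic, which is why the test suits paired comparisons on a shared evaluation set.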

Conclusion

In this paper, we discuss the problem of understanding emotions in text by machines. To be able to anticipate human needs, emotions must be deeply understood by machines and computers, as understanding and expressing emotion is a key element of human behavior. Detecting emotions helps in the modulation and regulation of responses for real-world chat-bots and other textual-dialogue based applications. For this problem, we harness the power of deep learning and big data and propose a Deep Learning based

Future work

As part of our future work, we plan to extend this approach to detect more emotional classes such as Surprise, Fear, Disgust etc. Currently, our model is limited by the fact that it does not train on the context of the dialogue. We plan to train models that also take the dialogue context into account besides the current user utterance.

References (63)

S.-H. Wang et al., Intelligent facial emotion recognition based on stationary wavelet entropy and jaya algorithm, Neurocomputing (2018)
M. Abdul-Mageed et al., Emonet: Fine-grained emotion detection with gated recurrent neural networks
A. Agrawal et al., Unsupervised emotion detection from text using semantic and syntactic relations
C.O. Alm et al., Emotions from text: Machine learning for text-based emotion prediction
R. C. Balabantaray, M. Mohammad, N. Sharma, Multi-class twitter emotion classification: A new approach, International...
A. Balahur et al., Detecting implicit expressions of sentiment in text based on commonsense knowledge
L. Canales, P. Martínez-Barco, Emotion detection from text: A survey, processing in the 5th information systems...
F.-R. Chaumartin, Upar7: A knowledge-based system for headline sentiment tagging
V. Chernykh, G. Sterling, P. Prihodko, Emotion recognition from speech with recurrent neural networks, arXiv preprint...
C. Cortes, V. Vapnik, Support-vector networks, Machine Learning Vol. 20, pages...
P. R. Dachapally, Facial emotion detection using convolutional neural networks and representational autoencoder units,...
T. Danisman et al., Emotion classification of audio signals using ensemble of support vector machines
D. Davidov et al., Enhanced sentiment learning using twitter hashtags and smileys
P. Ekman, An argument for basic emotions, Cognition & Emotion Vol. 6, pages...
A. Esuli et al., Sentiwordnet: A high-coverage lexical resource for opinion mining, Evaluation (2007)
B. Felbo et al.
J. Friedman et al., The elements of statistical learning (2001)
I. Goodfellow et al., Deep learning (2016)
M. Hasan et al., Using hashtags as labels for supervised learning of emotions in twitter messages
M. Hasan, E. Rundensteiner, E. Agu, Emotex: Detecting emotions in twitter...
S. Hochreiter, J. Schmidhuber, Long short-term memory, Neural Computation Vol. 9, pages...
A.R. Hochschild, The sociology of emotion as a way of seeing
I.T. Jolliffe, Principal component analysis and factor analysis
A. Joulin, E. Grave, P. Bojanowski, T. Mikolov, Bag of tricks for efficient text classification, arXiv preprint...
Y. Kim, Convolutional neural networks for sentence classification, arXiv preprint...
M. Köper et al.
Z. Kozareva et al., Ua-zbsa: A headline emotion classification through web information
A. Krizhevsky et al., Imagenet classification with deep convolutional neural networks
F. Kunneman et al., The (un) predictability of emotional hashtags in twitter
R.D. Lane et al., Impaired verbal and nonverbal emotion recognition in alexithymia, Psychosomatic Medicine (1996)
N. Liang et al., TRSDL: Tag-aware recommender system based on deep learning–intelligent computing systems, Applied Sciences (2018)

