TY - GEN
T1 - Multimodal deep learning framework for sentiment analysis from text-image web data
AU - Thuseethan, Selvarajah
AU - Janarthan, Sivasubramaniam
AU - Rajasegarar, Sutharshan
AU - Kumari, Priya
AU - Yearwood, John
PY - 2020/12
Y1 - 2020/12
N2 - Understanding people's sentiments from data published on the web is a significant research problem with a variety of applications, such as learning context, predicting election results and gauging opinion about an incident. So far, sentiment analysis of web data has focused primarily on a single modality, such as text or images. However, the readily available multimodal information, such as images and different forms of text, can in combination help to estimate sentiments more accurately. Blindly combining visual and textual features, however, increases model complexity and ultimately reduces sentiment analysis performance, as it often fails to capture the correct interrelationships between the modalities. Hence, in this study, a sentiment analysis framework is proposed that carefully fuses salient visual cues and high-attention textual cues, exploiting the interrelationships within multimodal web data. A multimodal deep association learner is stacked to learn the relationships between the learned salient visual features and textual features. Further, to automatically learn discriminative features from the image and text, two streams of unimodal deep feature extractors are proposed to extract the visual and textual features most relevant to the sentiments. Finally, the sentiment is estimated from the features combined using a late fusion mechanism. Extensive evaluations show that the proposed framework achieves promising results for sentiment analysis on web data in comparison with existing unimodal approaches and multimodal approaches that blindly combine visual and textual features.
KW - Affective Computing
KW - Deep Learning
KW - Multimodal Features
KW - Sentiment Analysis
KW - Web Data
UR - http://www.scopus.com/inward/record.url?scp=85114408367&partnerID=8YFLogxK
DO - 10.1109/WIIAT50758.2020.00039
M3 - Conference Paper published in Proceedings
AN - SCOPUS:85114408367
SN - 9781665430173
T3 - Proceedings - 2020 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2020
SP - 267
EP - 274
BT - Proceedings - 2020 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2020
A2 - He, Jing
A2 - Purohit, Hemant
A2 - Huang, Guangyan
A2 - Gao, Xiaoying
A2 - Deng, Ke
PB - Institute of Electrical and Electronics Engineers (IEEE)
CY - New York
T2 - 2020 IEEE/WIC/ACM International Joint Conference on Web Intelligence and Intelligent Agent Technology, WI-IAT 2020
Y2 - 14 December 2020 through 17 December 2020
ER -