As the amount of content online grows exponentially, new networks and interactions are also growing tremendously fast. EUNOMIA’s user trustworthiness indicators support fair and balanced interaction on social networks.
Sentiment analysis is one of EUNOMIA’s trustworthiness indicators, helping users assess the trustworthiness of online information. It relies on the automatic identification of the sentiment expressed in a user post (negative, positive, or neutral). Sentiment analysis algorithms draw on principles from machine learning and natural language processing. Current trends in the field include AI techniques that outperform traditional dictionary-based approaches.
Dictionary-based techniques work as follows: a list of opinion words, such as adjectives, nouns, verbs and word phrases (e.g. excellent, love, supports, expensive, terrible, hate, complicated), constitutes the prior knowledge for extracting the sentiment polarity of a piece of text. For example, in “I love playing basketball” a dictionary-based method would identify the word “love” and use it to infer the positive polarity of the expression.
Unfortunately, these methods are unable to grasp long-range sentiment dependencies, sentiment fluctuations or opinion modifiers (e.g. “not so much expensive”, “less terrible”) that exist in abundance in user-generated text.
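To make the dictionary-based idea and its blind spot concrete, here is a minimal sketch in Python. The lexicon, the scoring scheme (+1/-1 per word) and the function name `dictionary_polarity` are illustrative assumptions, not part of EUNOMIA.

```python
import re

# Hypothetical toy lexicon: +1 for positive opinion words, -1 for negative ones.
LEXICON = {
    "excellent": 1, "love": 1, "supports": 1,
    "expensive": -1, "terrible": -1, "hate": -1, "complicated": -1,
}

def dictionary_polarity(text):
    """Sum the lexicon scores of the words in `text` and map the sum to a label."""
    words = re.findall(r"[a-z']+", text.lower())
    score = sum(LEXICON.get(word, 0) for word in words)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(dictionary_polarity("I love playing basketball"))  # positive
print(dictionary_polarity("not so much expensive"))      # negative
```

The second call shows the limitation: each word is looked up in isolation, so the modifier “not” is ignored and the phrase is labeled negative.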
We use two models that process user-generated content in parallel. The first model relies on sentiment patterns to extract polarity. For example, in “not so much expensive” the model would identify the relation between “not” and “expensive” and assign positive polarity, whereas a dictionary-based method would rely only on the negative word “expensive”.
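One simple way to picture a sentiment pattern is a negation rule that flips the polarity of an opinion word when a negator appears shortly before it. The sketch below is only an illustration under that assumption: the lexicon, the negator list, the three-token window and the name `pattern_polarity` are all hypothetical, and the patterns EUNOMIA actually uses are not described here.

```python
import re

# Hypothetical toy lexicon and negator list, for illustration only.
LEXICON = {"love": 1, "excellent": 1, "expensive": -1, "terrible": -1, "hate": -1}
NEGATORS = {"not", "no", "never"}
WINDOW = 3  # assumed: how many preceding tokens are searched for a negator

def pattern_polarity(text):
    """Score opinion words, flipping the sign when a negator precedes them."""
    tokens = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, token in enumerate(tokens):
        value = LEXICON.get(token, 0)
        if value == 0:
            continue  # not an opinion word
        preceding = tokens[max(0, i - WINDOW):i]
        if any(t in NEGATORS for t in preceding):
            value = -value  # "not ... expensive" counts as positive
        score += value
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"

print(pattern_polarity("not so much expensive"))  # positive
print(pattern_polarity("expensive"))              # negative
```

A fixed window is a crude stand-in for a real pattern matcher, but it is enough to show why relating “not” to “expensive” changes the outcome.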
The second model is an advanced machine learning model that relies on a trained neural network and can identify longer-range sentiment fluctuations. In short, the first (pattern-based) model relies on sentiment patterns to extract the sentiment orientation, while the second relies on a neural network that is trained on labeled data and is capable of distinguishing between positive, neutral and negative text with high accuracy.
The outputs of both models are processed by an ensemble algorithm that decides on the final sentiment classification and on the degree to which the models are confident in their predictions.
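The ensemble algorithm itself is not specified here. As a rough illustration, one common way to combine two classifiers is soft voting: average their per-class probabilities, pick the class with the highest average, and report that average as the confidence. Everything in the sketch below (the function name `ensemble`, the `neural_weight` parameter and the example probabilities) is an assumption made for illustration.

```python
LABELS = ("negative", "neutral", "positive")

def ensemble(pattern_probs, neural_probs, neural_weight=0.5):
    """Soft voting: weighted average of two per-class probability dicts.

    Returns the winning label and its averaged probability (the confidence).
    """
    combined = {
        label: (1 - neural_weight) * pattern_probs[label]
        + neural_weight * neural_probs[label]
        for label in LABELS
    }
    label = max(combined, key=combined.get)
    return label, combined[label]

# Hypothetical model outputs for a single post.
pattern_out = {"negative": 0.1, "neutral": 0.3, "positive": 0.6}
neural_out = {"negative": 0.2, "neutral": 0.1, "positive": 0.7}

label, confidence = ensemble(pattern_out, neural_out)
print(label, round(confidence, 2))  # positive 0.65
```

Raising `neural_weight` towards 1 would let the neural model dominate the decision, while lowering it favors the pattern-based model.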
The results of the sentiment analysis process provide one of EUNOMIA’s indicators. Sentiment and emotion in language are quite frequently connected with subjectivity and, on many occasions, with deceitful information. EUNOMIA raises an alert, and the user, by consulting additional meta-information such as EUNOMIA’s other indicators, can then investigate the content further and decide whether it is valid and can be safely consumed or shared further with the community.
Pantelis Agathangelou, PhD Candidate, University of Nicosia
The featured photo is by Domingo Alvarez E on Unsplash