Orange: Sentimen Analysis

From OnnoCenterWiki
Jump to: navigation, search


Lecture at University of Bologna Lecture at University of Bologna

The main goal of the lecture was to lay out the possibilities that contemporary technology offers to researchers and to showcase a few simple text mining tasks in Orange. We analysed Trump’s and Clinton’s Twitter timeline and discovered that their tweets are highly distinct from one another and that you can easily find significant words they’re using in their tweets. Moreover, we’ve discovered that Trump is much better at social media than Clinton, creating highly likable and shareable content and inventing his own hashtags. Could that be a tell-tale sign of his recent victory?

Perhaps. Our future, data-mining savvy political scientists will decide. Below, you can see some examples of the workflows presented at the workshop. bologna-workflow1 Author predictions from Tweet content. Logistic Regression reports on 92% classification accuracy and AUC score. Confusion Matrix can output misclassified tweets to Corpus Viewer, where we can inspect these tweets further.

bologna-wordcloud Word Cloud from preprocessed tweets. We removed stopwords and punctuation to find frequencies for meaningful words only.

bologna-enrichment Word Enrichment by Author. First we find Donald’s tweets with Select Rows and then compare them to the entire corpus in Word Enrichment. The widget outputs a ranked list of significant words for the provided subset. We do the same for Hillary’s tweets.

bologna-topicmodelling Finding potential topics with LDA.

bologna-emotions Finally, we offered a sneak peek of our recent Tweet Profiler widget. Tweet Profiler is intended for sentiment analysis of tweets and can output classes. probabilities and embeddings. The widget is not yet officially available, but will be included in the upcoming release.