<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://lms.onnocenter.or.id/wiki/index.php?action=history&amp;feed=atom&amp;title=NLTK%3A_Basic_Sentiment_Analysis_with_Python</id>
	<title>NLTK: Basic Sentiment Analysis with Python - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://lms.onnocenter.or.id/wiki/index.php?action=history&amp;feed=atom&amp;title=NLTK%3A_Basic_Sentiment_Analysis_with_Python"/>
	<link rel="alternate" type="text/html" href="https://lms.onnocenter.or.id/wiki/index.php?title=NLTK:_Basic_Sentiment_Analysis_with_Python&amp;action=history"/>
	<updated>2026-04-20T03:37:35Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.1</generator>
	<entry>
		<id>https://lms.onnocenter.or.id/wiki/index.php?title=NLTK:_Basic_Sentiment_Analysis_with_Python&amp;diff=46960&amp;oldid=prev</id>
		<title>Onnowpurbo at 00:10, 9 February 2017</title>
		<link rel="alternate" type="text/html" href="https://lms.onnocenter.or.id/wiki/index.php?title=NLTK:_Basic_Sentiment_Analysis_with_Python&amp;diff=46960&amp;oldid=prev"/>
		<updated>2017-02-09T00:10:30Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 00:10, 9 February 2017&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l3&quot;&gt;Line 3:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 3:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;01 nov 2012&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;01 nov 2012&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[Update]: you can check out the code on Github&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;[Update]: you can check out the code on Github &lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;https://github.com/fjavieralba/basic_sentiment_analysis/blob/master/basic_sentiment_analysis.py&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;In this post I will try to give a very introductory view of some techniques that could be useful when you want to perform a basic analysis of opinions written in english.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;In this post I will try to give a very introductory view of some techniques that could be useful when you want to perform a basic analysis of opinions written in english.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
	<entry>
		<id>https://lms.onnocenter.or.id/wiki/index.php?title=NLTK:_Basic_Sentiment_Analysis_with_Python&amp;diff=46959&amp;oldid=prev</id>
		<title>Onnowpurbo: Created page with &quot; Basic Sentiment Analysis with Python  01 nov 2012  [Update]: you can check out the code on Github  In this post I will try to give a very introductory view of some techniques...&quot;</title>
		<link rel="alternate" type="text/html" href="https://lms.onnocenter.or.id/wiki/index.php?title=NLTK:_Basic_Sentiment_Analysis_with_Python&amp;diff=46959&amp;oldid=prev"/>
		<updated>2017-02-09T00:08:47Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot; Basic Sentiment Analysis with Python  01 nov 2012  [Update]: you can check out the code on Github  In this post I will try to give a very introductory view of some techniques...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt; Basic Sentiment Analysis with Python&lt;br /&gt;
&lt;br /&gt;
01 nov 2012&lt;br /&gt;
&lt;br /&gt;
[Update]: you can check out the code on Github&lt;br /&gt;
&lt;br /&gt;
In this post I will try to give a very introductory view of some techniques that could be useful when you want to perform a basic analysis of opinions written in English.&lt;br /&gt;
&lt;br /&gt;
These techniques come 100% from experience in real-life projects. Don&amp;#039;t expect a theoretical introduction to Sentiment Analysis and the multiple strategies out there for opinion mining; this is only a practical example of applying some basic rules to extract the polarity (positive or negative) of a text.&lt;br /&gt;
&lt;br /&gt;
Let&amp;#039;s start looking at an example opinion:&lt;br /&gt;
&lt;br /&gt;
    &amp;quot;What can I say about this place. The staff of the restaurant is nice and the eggplant is not bad. Apart from that, very uninspired food, lack of atmosphere and too expensive. I am a staunch vegetarian and was sorely dissapointed with the veggie options on the menu. Will be the last time I visit, I recommend others to avoid.&amp;quot;&lt;br /&gt;
&lt;br /&gt;
As you can see, this is a mainly negative review about a restaurant.&lt;br /&gt;
&lt;br /&gt;
General or detailed sentiment&lt;br /&gt;
&lt;br /&gt;
Sometimes we only want an overall rating of the sentiment of the whole review. In other cases, we need a little more detail, and we want each negative or positive comment identified.&lt;br /&gt;
&lt;br /&gt;
This kind of detailed detection can be quite challenging. Sometimes the aspect is explicit. An example is the opinion &amp;quot;very uninspired food&amp;quot;, where the criticized aspect is the food. In other cases, it is implicit: the sentence &amp;quot;too expensive&amp;quot; gives a negative opinion about the price without mentioning it.&lt;br /&gt;
&lt;br /&gt;
In this post I will focus on detecting the overall polarity of a review, leaving for later the identification of individual opinions on concrete aspects of the restaurant. To compute the polarity of a review, I&amp;#039;m going to use an approach based on dictionaries and some basic algorithms.&lt;br /&gt;
&lt;br /&gt;
A note about the dictionaries&lt;br /&gt;
&lt;br /&gt;
A dictionary is no more than a list of words that share a category. For example, you can have a dictionary for positive expressions, and another one for stop words.&lt;br /&gt;
&lt;br /&gt;
The design of the dictionaries depends heavily on the concrete topic where you want to perform the opinion mining. Mining hotel opinions is quite different from mining laptop opinions: not only may the positive/negative expressions differ, but the context vocabulary is also quite distinct.&lt;br /&gt;
&lt;br /&gt;
Defining a structure for the text&lt;br /&gt;
&lt;br /&gt;
Before writing code, there is an important decision to make. Our code will have to interact with text, splitting, tagging, and extracting information from it.&lt;br /&gt;
&lt;br /&gt;
But what should be the structure of our text?&lt;br /&gt;
&lt;br /&gt;
This is a key decision because it will shape our algorithms in some ways. We should decide if we want to differentiate sentences inside a paragraph. We could define a sentence as a list of tokens. But what is a token? A string? A more complex structure? Note that we will want to assign tags to our tokens. Should we allow only one tag per token, or unlimited ones?&lt;br /&gt;
&lt;br /&gt;
There are endless options here. We could choose a very simple structure, for example defining the text simply as a list of words. Or we could define a more elaborate structure carrying every possible attribute of a processed text (word lemmas, word forms, multiple taggings, inflections...).&lt;br /&gt;
&lt;br /&gt;
As usual, a compromise between these two extremes can be a good way to go.&lt;br /&gt;
&lt;br /&gt;
For the examples of this post, I&amp;#039;m going to use the following structure:&lt;br /&gt;
&lt;br /&gt;
        Each text is a list of sentences&lt;br /&gt;
        Each sentence is a list of tokens&lt;br /&gt;
        Each token is a tuple of three elements: a word form (the exact word that appeared in the text), a word lemma (a generalized version of the word), and a list of associated tags&lt;br /&gt;
&lt;br /&gt;
This is a structure I&amp;#039;ve found quite useful. It is ready for some &amp;quot;advanced&amp;quot; processing (lemmatization, multiple tags) without being too complex (at least in Python).&lt;br /&gt;
&lt;br /&gt;
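A minimal sketch of this token structure (with hypothetical example values, not taken from the post) shows how the three positions are used and why a list of tags is convenient:

```python
# A token is a (form, lemma, tags) tuple: the exact word, a generalized
# version of it, and an open-ended list of tags.
token = ("is", "be", ["VBZ"])

form, lemma, tags = token
# Because tags is a list, later stages can attach more tags
# without disturbing the ones already present.
tags.append("extra-tag")
print(form, lemma, tags)
```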
This is an example of a POS-tagged paragraph:&lt;br /&gt;
&lt;br /&gt;
 [[(&amp;#039;All&amp;#039;, &amp;#039;All&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
 (&amp;#039;that&amp;#039;, &amp;#039;that&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
 (&amp;#039;is&amp;#039;, &amp;#039;is&amp;#039;, [&amp;#039;VBZ&amp;#039;]),&lt;br /&gt;
 (&amp;#039;gold&amp;#039;, &amp;#039;gold&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
 (&amp;#039;does&amp;#039;, &amp;#039;does&amp;#039;, [&amp;#039;VBZ&amp;#039;]),&lt;br /&gt;
 (&amp;#039;not&amp;#039;, &amp;#039;not&amp;#039;, [&amp;#039;RB&amp;#039;]),&lt;br /&gt;
 (&amp;#039;glitter&amp;#039;, &amp;#039;glitter&amp;#039;, [&amp;#039;VB&amp;#039;]),&lt;br /&gt;
 (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
[(&amp;#039;Not&amp;#039;, &amp;#039;Not&amp;#039;, [&amp;#039;RB&amp;#039;]),&lt;br /&gt;
 (&amp;#039;all&amp;#039;, &amp;#039;all&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
 (&amp;#039;those&amp;#039;, &amp;#039;those&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
 (&amp;#039;who&amp;#039;, &amp;#039;who&amp;#039;, [&amp;#039;WP&amp;#039;]),&lt;br /&gt;
 (&amp;#039;wander&amp;#039;, &amp;#039;wander&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
 (&amp;#039;are&amp;#039;, &amp;#039;are&amp;#039;, [&amp;#039;VBP&amp;#039;]),&lt;br /&gt;
 (&amp;#039;lost&amp;#039;, &amp;#039;lost&amp;#039;, [&amp;#039;VBN&amp;#039;])]]&lt;br /&gt;
&lt;br /&gt;
Preprocessing the Text&lt;br /&gt;
&lt;br /&gt;
Once we have decided the structural shape of our processed text, we can start writing some code to read and pre-process this text. By pre-processing I mean some common first steps in NLP: tokenization, sentence splitting, and POS tagging.&lt;br /&gt;
&lt;br /&gt;
I will use the NLTK library for these tasks:&lt;br /&gt;
import nltk&lt;br /&gt;
&lt;br /&gt;
class Splitter(object):&lt;br /&gt;
    def __init__(self):&lt;br /&gt;
        self.nltk_splitter = nltk.data.load(&amp;#039;tokenizers/punkt/english.pickle&amp;#039;)&lt;br /&gt;
        self.nltk_tokenizer = nltk.tokenize.TreebankWordTokenizer()&lt;br /&gt;
&lt;br /&gt;
    def split(self, text):&lt;br /&gt;
        &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
        input format: a paragraph of text&lt;br /&gt;
        output format: a list of lists of words.&lt;br /&gt;
            e.g.: [[&amp;#039;this&amp;#039;, &amp;#039;is&amp;#039;, &amp;#039;a&amp;#039;, &amp;#039;sentence&amp;#039;], [&amp;#039;this&amp;#039;, &amp;#039;is&amp;#039;, &amp;#039;another&amp;#039;, &amp;#039;one&amp;#039;]]&lt;br /&gt;
        &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
        sentences = self.nltk_splitter.tokenize(text)&lt;br /&gt;
        tokenized_sentences = [self.nltk_tokenizer.tokenize(sent) for sent in sentences]&lt;br /&gt;
        return tokenized_sentences&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
class POSTagger(object):&lt;br /&gt;
    def __init__(self):&lt;br /&gt;
        pass&lt;br /&gt;
        &lt;br /&gt;
    def pos_tag(self, sentences):&lt;br /&gt;
        &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
        input format: list of lists of words&lt;br /&gt;
            e.g.: [[&amp;#039;this&amp;#039;, &amp;#039;is&amp;#039;, &amp;#039;a&amp;#039;, &amp;#039;sentence&amp;#039;], [&amp;#039;this&amp;#039;, &amp;#039;is&amp;#039;, &amp;#039;another&amp;#039;, &amp;#039;one&amp;#039;]]&lt;br /&gt;
        output format: list of lists of tagged tokens. Each tagged tokens has a&lt;br /&gt;
        form, a lemma, and a list of tags&lt;br /&gt;
            e.g: [[(&amp;#039;this&amp;#039;, &amp;#039;this&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;is&amp;#039;, &amp;#039;be&amp;#039;, [&amp;#039;VB&amp;#039;]), (&amp;#039;a&amp;#039;, &amp;#039;a&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;sentence&amp;#039;, &amp;#039;sentence&amp;#039;, [&amp;#039;NN&amp;#039;])],&lt;br /&gt;
                    [(&amp;#039;this&amp;#039;, &amp;#039;this&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;is&amp;#039;, &amp;#039;be&amp;#039;, [&amp;#039;VB&amp;#039;]), (&amp;#039;another&amp;#039;, &amp;#039;another&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;one&amp;#039;, &amp;#039;one&amp;#039;, [&amp;#039;CARD&amp;#039;])]]&lt;br /&gt;
        &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
        pos = [nltk.pos_tag(sentence) for sentence in sentences]&lt;br /&gt;
        #adapt format&lt;br /&gt;
        pos = [[(word, word, [postag]) for (word, postag) in sentence] for sentence in pos]&lt;br /&gt;
        return pos&lt;br /&gt;
&lt;br /&gt;
Now, using these two simple wrapper classes, I can perform a basic text preprocessing, where the input is the text as a string and the output is a collection of sentences, each of which is again a collection of tokens.&lt;br /&gt;
&lt;br /&gt;
For the moment, our tokens are quite simple. Since NLTK as used here does not lemmatize words, our forms and lemmas will always be identical. At this point in the process, the only tag associated with each word is its POS tag, provided by NLTK.&lt;br /&gt;
&lt;br /&gt;
text = &amp;quot;&amp;quot;&amp;quot;What can I say about this place. The staff of the restaurant is nice and the eggplant is not bad. Apart from that, very uninspired food, lack of atmosphere and too expensive. I am a staunch vegetarian and was sorely dissapointed with the veggie options on the menu. Will be the last time I visit, I recommend others to avoid.&amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
&lt;br /&gt;
splitter = Splitter()&lt;br /&gt;
postagger = POSTagger()&lt;br /&gt;
&lt;br /&gt;
splitted_sentences = splitter.split(text)&lt;br /&gt;
&lt;br /&gt;
print(splitted_sentences)&lt;br /&gt;
[[&amp;#039;What&amp;#039;, &amp;#039;can&amp;#039;, &amp;#039;I&amp;#039;, &amp;#039;say&amp;#039;, &amp;#039;about&amp;#039;, &amp;#039;this&amp;#039;, &amp;#039;place&amp;#039;, &amp;#039;.&amp;#039;], [&amp;#039;The&amp;#039;, &amp;#039;staff&amp;#039;, &amp;#039;of&amp;#039;, &amp;#039;the&amp;#039;, &amp;#039;restaurant&amp;#039;, &amp;#039;is&amp;#039;, &amp;#039;nice&amp;#039;, &amp;#039;and&amp;#039;, &amp;#039;eggplant&amp;#039;, &amp;#039;is&amp;#039;, &amp;#039;not&amp;#039;, &amp;#039;bad&amp;#039;, &amp;#039;.&amp;#039;], [&amp;#039;apart&amp;#039;, &amp;#039;from&amp;#039;, &amp;#039;that&amp;#039;, &amp;#039;,&amp;#039;, &amp;#039;very&amp;#039;, &amp;#039;uninspired&amp;#039;, &amp;#039;food&amp;#039;, &amp;#039;,&amp;#039;, &amp;#039;lack&amp;#039;, &amp;#039;of&amp;#039;, &amp;#039;atmosphere&amp;#039;, &amp;#039;and&amp;#039;, &amp;#039;too&amp;#039;, &amp;#039;expensive&amp;#039;, &amp;#039;.&amp;#039;], [&amp;#039;I&amp;#039;, &amp;#039;am&amp;#039;, &amp;#039;a&amp;#039;, &amp;#039;staunch&amp;#039;, &amp;#039;vegetarian&amp;#039;, &amp;#039;and&amp;#039;, &amp;#039;was&amp;#039;, &amp;#039;sorely&amp;#039;, &amp;#039;dissapointed&amp;#039;, &amp;#039;with&amp;#039;, &amp;#039;the&amp;#039;, &amp;#039;veggie&amp;#039;, &amp;#039;options&amp;#039;, &amp;#039;on&amp;#039;, &amp;#039;the&amp;#039;, &amp;#039;menu&amp;#039;, &amp;#039;.&amp;#039;], [&amp;#039;Will&amp;#039;, &amp;#039;be&amp;#039;, &amp;#039;the&amp;#039;, &amp;#039;last&amp;#039;, &amp;#039;time&amp;#039;, &amp;#039;I&amp;#039;, &amp;#039;visit&amp;#039;, &amp;#039;,&amp;#039;, &amp;#039;I&amp;#039;, &amp;#039;recommend&amp;#039;, &amp;#039;others&amp;#039;, &amp;#039;to&amp;#039;, &amp;#039;avoid&amp;#039;, &amp;#039;.&amp;#039;]]&lt;br /&gt;
&lt;br /&gt;
pos_tagged_sentences = postagger.pos_tag(splitted_sentences)&lt;br /&gt;
&lt;br /&gt;
print(pos_tagged_sentences)&lt;br /&gt;
[[(&amp;#039;What&amp;#039;, &amp;#039;What&amp;#039;, [&amp;#039;WP&amp;#039;]), (&amp;#039;can&amp;#039;, &amp;#039;can&amp;#039;, [&amp;#039;MD&amp;#039;]), (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]), (&amp;#039;say&amp;#039;, &amp;#039;say&amp;#039;, [&amp;#039;VB&amp;#039;]), (&amp;#039;about&amp;#039;, &amp;#039;about&amp;#039;, [&amp;#039;IN&amp;#039;]), (&amp;#039;this&amp;#039;, &amp;#039;this&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;place&amp;#039;, &amp;#039;place&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])], [(&amp;#039;The&amp;#039;, &amp;#039;The&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;staff&amp;#039;, &amp;#039;staff&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;of&amp;#039;, &amp;#039;of&amp;#039;, [&amp;#039;IN&amp;#039;]), (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;restaurant&amp;#039;, &amp;#039;restaurant&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;is&amp;#039;, &amp;#039;is&amp;#039;, [&amp;#039;VBZ&amp;#039;]), (&amp;#039;nice&amp;#039;, &amp;#039;nice&amp;#039;, [&amp;#039;JJ&amp;#039;]), (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]), (&amp;#039;eggplant&amp;#039;, &amp;#039;eggplant&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;is&amp;#039;, &amp;#039;is&amp;#039;, [&amp;#039;VBZ&amp;#039;]), (&amp;#039;not&amp;#039;, &amp;#039;not&amp;#039;, [&amp;#039;RB&amp;#039;]), (&amp;#039;bad&amp;#039;, &amp;#039;bad&amp;#039;, [&amp;#039;JJ&amp;#039;]), (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])], [(&amp;#039;apart&amp;#039;, &amp;#039;apart&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;from&amp;#039;, &amp;#039;from&amp;#039;, [&amp;#039;IN&amp;#039;]), (&amp;#039;that&amp;#039;, &amp;#039;that&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]), 
(&amp;#039;very&amp;#039;, &amp;#039;very&amp;#039;, [&amp;#039;RB&amp;#039;]), (&amp;#039;uninspired&amp;#039;, &amp;#039;uninspired&amp;#039;, [&amp;#039;VBN&amp;#039;]), (&amp;#039;food&amp;#039;, &amp;#039;food&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]), (&amp;#039;lack&amp;#039;, &amp;#039;lack&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;of&amp;#039;, &amp;#039;of&amp;#039;, [&amp;#039;IN&amp;#039;]), (&amp;#039;atmosphere&amp;#039;, &amp;#039;atmosphere&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]), (&amp;#039;too&amp;#039;, &amp;#039;too&amp;#039;, [&amp;#039;RB&amp;#039;]), (&amp;#039;expensive&amp;#039;, &amp;#039;expensive&amp;#039;, [&amp;#039;JJ&amp;#039;]), (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])], [(&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]), (&amp;#039;am&amp;#039;, &amp;#039;am&amp;#039;, [&amp;#039;VBP&amp;#039;]), (&amp;#039;a&amp;#039;, &amp;#039;a&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;staunch&amp;#039;, &amp;#039;staunch&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;vegetarian&amp;#039;, &amp;#039;vegetarian&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]), (&amp;#039;was&amp;#039;, &amp;#039;was&amp;#039;, [&amp;#039;VBD&amp;#039;]), (&amp;#039;sorely&amp;#039;, &amp;#039;sorely&amp;#039;, [&amp;#039;RB&amp;#039;]), (&amp;#039;dissapointed&amp;#039;, &amp;#039;dissapointed&amp;#039;, [&amp;#039;VBN&amp;#039;]), (&amp;#039;with&amp;#039;, &amp;#039;with&amp;#039;, [&amp;#039;IN&amp;#039;]), (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;veggie&amp;#039;, &amp;#039;veggie&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;options&amp;#039;, &amp;#039;options&amp;#039;, [&amp;#039;NNS&amp;#039;]), (&amp;#039;on&amp;#039;, 
&amp;#039;on&amp;#039;, [&amp;#039;IN&amp;#039;]), (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;menu&amp;#039;, &amp;#039;menu&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])], [(&amp;#039;Will&amp;#039;, &amp;#039;Will&amp;#039;, [&amp;#039;NNP&amp;#039;]), (&amp;#039;be&amp;#039;, &amp;#039;be&amp;#039;, [&amp;#039;VB&amp;#039;]), (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]), (&amp;#039;last&amp;#039;, &amp;#039;last&amp;#039;, [&amp;#039;JJ&amp;#039;]), (&amp;#039;time&amp;#039;, &amp;#039;time&amp;#039;, [&amp;#039;NN&amp;#039;]), (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]), (&amp;#039;visit&amp;#039;, &amp;#039;visit&amp;#039;, [&amp;#039;VBP&amp;#039;]), (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]), (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]), (&amp;#039;recommend&amp;#039;, &amp;#039;recommend&amp;#039;, [&amp;#039;VBP&amp;#039;]), (&amp;#039;others&amp;#039;, &amp;#039;others&amp;#039;, [&amp;#039;NNS&amp;#039;]), (&amp;#039;to&amp;#039;, &amp;#039;to&amp;#039;, [&amp;#039;TO&amp;#039;]), (&amp;#039;avoid&amp;#039;, &amp;#039;avoid&amp;#039;, [&amp;#039;VB&amp;#039;]), (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])]]&lt;br /&gt;
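In the output above every form equals its lemma, since no lemmatization was performed. As a toy illustration only, a hypothetical rule-based stand-in (not a real lemmatizer, and not part of the original code) shows where a lemma would slot into the token tuple:

```python
# Hypothetical stand-in (NOT a real lemmatizer): a crude suffix rule,
# only to show where a lemma would go in the (form, lemma, tags) tuple.
def crude_lemma(word):
    lowered = word.lower()
    # Strip a plural-looking trailing "s" from longer words.
    if lowered.endswith("s") and len(lowered) > 3:
        return lowered[:-1]
    return lowered

token = ("options", crude_lemma("options"), ["NNS"])
print(token)
```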
&lt;br /&gt;
Defining a dictionary of positive and negative expressions&lt;br /&gt;
&lt;br /&gt;
The next step is to recognize positive and negative expressions. To achieve this, I&amp;#039;m going to use dictionaries, i.e. simple files containing expressions that will be searched for in our text.&lt;br /&gt;
&lt;br /&gt;
For example, I&amp;#039;m going to define two tiny dictionaries, one for positive expressions and another for negative ones:&lt;br /&gt;
&lt;br /&gt;
positive.yml&lt;br /&gt;
&lt;br /&gt;
nice: [positive]&lt;br /&gt;
awesome: [positive]&lt;br /&gt;
cool: [positive]&lt;br /&gt;
superb: [positive]&lt;br /&gt;
&lt;br /&gt;
negative.yml&lt;br /&gt;
&lt;br /&gt;
bad: [negative]&lt;br /&gt;
uninspired: [negative]&lt;br /&gt;
expensive: [negative]&lt;br /&gt;
dissapointed: [negative]&lt;br /&gt;
recommend others to avoid: [negative]&lt;br /&gt;
&lt;br /&gt;
In case you were wondering, we could have used a simpler format, or used only one file, but this dictionary format will be useful later.&lt;br /&gt;
&lt;br /&gt;
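Once loaded with a YAML parser, each file is just a mapping from an expression to a list of tags; the equivalent structures can be sketched as plain Python (no YAML needed):

```python
# The two YAML files above, loaded, are just mappings from an
# expression to a list of tags:
positive_dict = {
    "nice": ["positive"],
    "awesome": ["positive"],
    "cool": ["positive"],
    "superb": ["positive"],
}
negative_dict = {
    "bad": ["negative"],
    "uninspired": ["negative"],
    "expensive": ["negative"],
    "dissapointed": ["negative"],
    "recommend others to avoid": ["negative"],
}

# Keys may be multiword expressions; the longest key bounds how wide
# a window the tagger will need to scan.
longest_key = max(len(k) for k in list(positive_dict) + list(negative_dict))
print(longest_key)
```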
Note that these are only two example dictionaries, far too small to be useful in a real-life project.&lt;br /&gt;
&lt;br /&gt;
Tagging the text with dictionaries&lt;br /&gt;
&lt;br /&gt;
The following code defines a class that I will use to tag our pre-processed text with the dictionaries we just defined:&lt;br /&gt;
import yaml&lt;br /&gt;
&lt;br /&gt;
class DictionaryTagger(object):&lt;br /&gt;
    def __init__(self, dictionary_paths):&lt;br /&gt;
        files = [open(path, &amp;#039;r&amp;#039;) for path in dictionary_paths]&lt;br /&gt;
        dictionaries = [yaml.safe_load(dict_file) for dict_file in files]&lt;br /&gt;
        for dict_file in files:&lt;br /&gt;
            dict_file.close()&lt;br /&gt;
        self.dictionary = {}&lt;br /&gt;
        self.max_key_size = 0&lt;br /&gt;
        for curr_dict in dictionaries:&lt;br /&gt;
            for key in curr_dict:&lt;br /&gt;
                if key in self.dictionary:&lt;br /&gt;
                    self.dictionary[key].extend(curr_dict[key])&lt;br /&gt;
                else:&lt;br /&gt;
                    self.dictionary[key] = curr_dict[key]&lt;br /&gt;
                    self.max_key_size = max(self.max_key_size, len(key))&lt;br /&gt;
&lt;br /&gt;
    def tag(self, postagged_sentences):&lt;br /&gt;
        return [self.tag_sentence(sentence) for sentence in postagged_sentences]&lt;br /&gt;
&lt;br /&gt;
    def tag_sentence(self, sentence, tag_with_lemmas=False):&lt;br /&gt;
        &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
        the result is only one tagging of all the possible ones.&lt;br /&gt;
        The resulting tagging is determined by these two priority rules:&lt;br /&gt;
            - longest matches have higher priority&lt;br /&gt;
            - search is made from left to right&lt;br /&gt;
        &amp;quot;&amp;quot;&amp;quot;&lt;br /&gt;
        tag_sentence = []&lt;br /&gt;
        N = len(sentence)&lt;br /&gt;
        if self.max_key_size == 0:&lt;br /&gt;
            self.max_key_size = N&lt;br /&gt;
        i = 0&lt;br /&gt;
        while (i &amp;lt; N):&lt;br /&gt;
            j = min(i + self.max_key_size, N) #avoid overflow&lt;br /&gt;
            tagged = False&lt;br /&gt;
            while (j &amp;gt; i):&lt;br /&gt;
                expression_form = &amp;#039; &amp;#039;.join([word[0] for word in sentence[i:j]]).lower()&lt;br /&gt;
                expression_lemma = &amp;#039; &amp;#039;.join([word[1] for word in sentence[i:j]]).lower()&lt;br /&gt;
                if tag_with_lemmas:&lt;br /&gt;
                    literal = expression_lemma&lt;br /&gt;
                else:&lt;br /&gt;
                    literal = expression_form&lt;br /&gt;
                if literal in self.dictionary:&lt;br /&gt;
                    #self.logger.debug(&amp;quot;found: %s&amp;quot; % literal)&lt;br /&gt;
                    is_single_token = j - i == 1&lt;br /&gt;
                    original_position = i&lt;br /&gt;
                    i = j&lt;br /&gt;
                    taggings = [tag for tag in self.dictionary[literal]]&lt;br /&gt;
                    tagged_expression = (expression_form, expression_lemma, taggings)&lt;br /&gt;
                    if is_single_token: #if the tagged literal is a single token, conserve its previous taggings:&lt;br /&gt;
                        original_token_tagging = sentence[original_position][2]&lt;br /&gt;
                        tagged_expression[2].extend(original_token_tagging)&lt;br /&gt;
                    tag_sentence.append(tagged_expression)&lt;br /&gt;
                    tagged = True&lt;br /&gt;
                else:&lt;br /&gt;
                    j = j - 1&lt;br /&gt;
            if not tagged:&lt;br /&gt;
                tag_sentence.append(sentence[i])&lt;br /&gt;
                i += 1&lt;br /&gt;
        return tag_sentence&lt;br /&gt;
&lt;br /&gt;
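To see the two priority rules (longest match first, left-to-right scan) in isolation, here is a simplified standalone sketch of the same idea (not the class above), using a hypothetical mini-dictionary:

```python
def longest_match_tag(words, dictionary, max_key_words=5):
    # Scan left to right; at each position try the widest window first,
    # shrinking it until a dictionary expression matches.
    result = []
    i = 0
    n = len(words)
    while i != n:
        matched = False
        j = min(i + max_key_words, n)
        while j > i:
            phrase = " ".join(words[i:j]).lower()
            if phrase in dictionary:
                result.append((phrase, dictionary[phrase]))
                i = j
                matched = True
                break
            j -= 1
        if not matched:
            result.append((words[i], []))
            i += 1
    return result

# "avoid" alone is also in the dictionary, but the longer expression wins:
mini_dict = {"recommend others to avoid": ["negative"], "avoid": ["negative"]}
tagged = longest_match_tag(["I", "recommend", "others", "to", "avoid"], mini_dict)
print(tagged)
```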
When tagging our review, the input is the previously preprocessed text, and the output is the same text, enriched with tags of type &amp;quot;positive&amp;quot; or &amp;quot;negative&amp;quot;:&lt;br /&gt;
from pprint import pprint&lt;br /&gt;
&lt;br /&gt;
dicttagger = DictionaryTagger([&amp;#039;dicts/positive.yml&amp;#039;, &amp;#039;dicts/negative.yml&amp;#039;])&lt;br /&gt;
&lt;br /&gt;
dict_tagged_sentences = dicttagger.tag(pos_tagged_sentences)&lt;br /&gt;
&lt;br /&gt;
pprint(dict_tagged_sentences)&lt;br /&gt;
[[(&amp;#039;What&amp;#039;, &amp;#039;What&amp;#039;, [&amp;#039;WP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;can&amp;#039;, &amp;#039;can&amp;#039;, [&amp;#039;MD&amp;#039;]),&lt;br /&gt;
  (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;say&amp;#039;, &amp;#039;say&amp;#039;, [&amp;#039;VB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;about&amp;#039;, &amp;#039;about&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;this&amp;#039;, &amp;#039;this&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;place&amp;#039;, &amp;#039;place&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;The&amp;#039;, &amp;#039;The&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;staff&amp;#039;, &amp;#039;staff&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;of&amp;#039;, &amp;#039;of&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;restaurant&amp;#039;, &amp;#039;restaurant&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;is&amp;#039;, &amp;#039;is&amp;#039;, [&amp;#039;VBZ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;nice&amp;#039;, &amp;#039;nice&amp;#039;, [&amp;#039;positive&amp;#039;, &amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]),&lt;br /&gt;
  (&amp;#039;eggplant&amp;#039;, &amp;#039;eggplant&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;is&amp;#039;, &amp;#039;is&amp;#039;, [&amp;#039;VBZ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;not&amp;#039;, &amp;#039;not&amp;#039;, [&amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;bad&amp;#039;, &amp;#039;bad&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;apart&amp;#039;, &amp;#039;apart&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;from&amp;#039;, &amp;#039;from&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;that&amp;#039;, &amp;#039;that&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]),&lt;br /&gt;
  (&amp;#039;very&amp;#039;, &amp;#039;very&amp;#039;, [&amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;uninspired&amp;#039;, &amp;#039;uninspired&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;VBN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;food&amp;#039;, &amp;#039;food&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]),&lt;br /&gt;
  (&amp;#039;lack&amp;#039;, &amp;#039;lack&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;of&amp;#039;, &amp;#039;of&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;atmosphere&amp;#039;, &amp;#039;atmosphere&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]),&lt;br /&gt;
  (&amp;#039;too&amp;#039;, &amp;#039;too&amp;#039;, [&amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;expensive&amp;#039;, &amp;#039;expensive&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;am&amp;#039;, &amp;#039;am&amp;#039;, [&amp;#039;VBP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;a&amp;#039;, &amp;#039;a&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;staunch&amp;#039;, &amp;#039;staunch&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;vegetarian&amp;#039;, &amp;#039;vegetarian&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]),&lt;br /&gt;
  (&amp;#039;was&amp;#039;, &amp;#039;was&amp;#039;, [&amp;#039;VBD&amp;#039;]),&lt;br /&gt;
  (&amp;#039;sorely&amp;#039;, &amp;#039;sorely&amp;#039;, [&amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;dissapointed&amp;#039;, &amp;#039;dissapointed&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;VBN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;with&amp;#039;, &amp;#039;with&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;veggie&amp;#039;, &amp;#039;veggie&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;options&amp;#039;, &amp;#039;options&amp;#039;, [&amp;#039;NNS&amp;#039;]),&lt;br /&gt;
  (&amp;#039;on&amp;#039;, &amp;#039;on&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;menu&amp;#039;, &amp;#039;menu&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;Will&amp;#039;, &amp;#039;Will&amp;#039;, [&amp;#039;NNP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;be&amp;#039;, &amp;#039;be&amp;#039;, [&amp;#039;VB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;last&amp;#039;, &amp;#039;last&amp;#039;, [&amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;time&amp;#039;, &amp;#039;time&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;visit&amp;#039;, &amp;#039;visit&amp;#039;, [&amp;#039;VBP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]),&lt;br /&gt;
  (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;recommend others to avoid&amp;#039;, &amp;#039;recommend others to avoid&amp;#039;, [&amp;#039;negative&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])]]&lt;br /&gt;
A simple sentiment measure&lt;br /&gt;
&lt;br /&gt;
We could already compute a basic measure of how positive or negative a review is.&lt;br /&gt;
&lt;br /&gt;
Simply counting how many positive and negative expressions we detected could serve as a (very naive) sentiment measure.&lt;br /&gt;
&lt;br /&gt;
The following code snippet applies this idea:&lt;br /&gt;
def value_of(sentiment):&lt;br /&gt;
    if sentiment == &amp;#039;positive&amp;#039;: return 1&lt;br /&gt;
    if sentiment == &amp;#039;negative&amp;#039;: return -1&lt;br /&gt;
    return 0&lt;br /&gt;
&lt;br /&gt;
def sentiment_score(review):&lt;br /&gt;
    return sum([value_of(tag) for sentence in review for token in sentence for tag in token[2]])&lt;br /&gt;
sentiment_score(dict_tagged_sentences)&lt;br /&gt;
-4&lt;br /&gt;
&lt;br /&gt;
So, our review could be considered &amp;quot;quite negative&amp;quot;, since it has a score of -4.&lt;br /&gt;
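The naive counting idea can be reproduced in a self-contained sketch. The tagged review below is a toy stand-in, keeping only a few tokens in the (word, lemma, [tags]) shape shown in the output above:

```python
# Self-contained sketch of the naive scorer on toy tagged data.
def value_of(sentiment):
    if sentiment == 'positive': return 1
    if sentiment == 'negative': return -1
    return 0

def sentiment_score(review):
    # Score the `review` argument: +1 per 'positive' tag, -1 per 'negative' tag.
    return sum(value_of(tag)
               for sentence in review
               for token in sentence
               for tag in token[2])

toy_review = [
    [('nice', 'nice', ['positive', 'JJ']),
     ('bad', 'bad', ['negative', 'JJ'])],
    [('uninspired', 'uninspired', ['negative', 'VBN']),
     ('expensive', 'expensive', ['negative', 'JJ'])],
]
print(sentiment_score(toy_review))  # 1 - 1 - 1 - 1 = -2
```

The function simply walks every tag of every token and sums the values, so word order and context do not matter yet.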
Incrementers and decrementers&lt;br /&gt;
&lt;br /&gt;
The previous &amp;quot;sentiment score&amp;quot; was very basic: it simply counts positive and negative expressions and sums them, without taking into account that some expressions may be more positive or negative than others.&lt;br /&gt;
&lt;br /&gt;
One way of modelling this &amp;quot;strength&amp;quot; is to use two new dictionaries: one for &amp;quot;incrementers&amp;quot; and another for &amp;quot;decrementers&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Let&amp;#039;s define two tiny examples:&lt;br /&gt;
&lt;br /&gt;
inc.yml&lt;br /&gt;
&lt;br /&gt;
too: [inc]&lt;br /&gt;
very: [inc]&lt;br /&gt;
sorely: [inc]&lt;br /&gt;
&lt;br /&gt;
dec.yml&lt;br /&gt;
&lt;br /&gt;
barely: [dec]&lt;br /&gt;
little: [dec]&lt;br /&gt;
&lt;br /&gt;
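Conceptually, each YAML file maps an expression to a list of tags, and tagging a token prepends any matching tags to its POS tag. The following is a hypothetical sketch of that merging step using plain Python dicts (the real DictionaryTagger, defined earlier in the post, also handles multi-word keys):

```python
# Toy versions of the YAML dictionaries: expression -> list of tags.
positive = {'nice': ['positive']}
negative = {'expensive': ['negative']}
inc = {'too': ['inc'], 'very': ['inc'], 'sorely': ['inc']}
dec = {'barely': ['dec'], 'little': ['dec']}

# Merge all dictionaries into a single lookup table.
dictionary = {}
for d in (positive, negative, inc, dec):
    dictionary.update(d)

def tag_token(word, pos_tag):
    # Prepend any dictionary tags to the existing POS tag.
    return (word, word, dictionary.get(word.lower(), []) + [pos_tag])

print(tag_token('too', 'RB'))        # ('too', 'too', ['inc', 'RB'])
print(tag_token('expensive', 'JJ'))  # ('expensive', 'expensive', ['negative', 'JJ'])
```

Untagged words simply keep their POS tag, which is why most tokens in the output above carry a single-element tag list.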
We instantiate our tagger again, telling it to use the two new dictionaries:&lt;br /&gt;
dicttagger = DictionaryTagger([ &amp;#039;dicts/positive.yml&amp;#039;, &amp;#039;dicts/negative.yml&amp;#039;, &amp;#039;dicts/inc.yml&amp;#039;, &amp;#039;dicts/dec.yml&amp;#039;])&lt;br /&gt;
&lt;br /&gt;
dict_tagged_sentences = dicttagger.tag(pos_tagged_sentences)&lt;br /&gt;
&lt;br /&gt;
pprint(dict_tagged_sentences)&lt;br /&gt;
[[(&amp;#039;What&amp;#039;, &amp;#039;What&amp;#039;, [&amp;#039;WP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;can&amp;#039;, &amp;#039;can&amp;#039;, [&amp;#039;MD&amp;#039;]),&lt;br /&gt;
  (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;say&amp;#039;, &amp;#039;say&amp;#039;, [&amp;#039;VB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;about&amp;#039;, &amp;#039;about&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;this&amp;#039;, &amp;#039;this&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;place&amp;#039;, &amp;#039;place&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;The&amp;#039;, &amp;#039;The&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;staff&amp;#039;, &amp;#039;staff&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;of&amp;#039;, &amp;#039;of&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;restaurant&amp;#039;, &amp;#039;restaurant&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;is&amp;#039;, &amp;#039;is&amp;#039;, [&amp;#039;VBZ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;nice&amp;#039;, &amp;#039;nice&amp;#039;, [&amp;#039;positive&amp;#039;, &amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]),&lt;br /&gt;
  (&amp;#039;eggplant&amp;#039;, &amp;#039;eggplant&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;is&amp;#039;, &amp;#039;is&amp;#039;, [&amp;#039;VBZ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;not&amp;#039;, &amp;#039;not&amp;#039;, [&amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;bad&amp;#039;, &amp;#039;bad&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;apart&amp;#039;, &amp;#039;apart&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;from&amp;#039;, &amp;#039;from&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;that&amp;#039;, &amp;#039;that&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]),&lt;br /&gt;
  (&amp;#039;very&amp;#039;, &amp;#039;very&amp;#039;, [&amp;#039;inc&amp;#039;, &amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;uninspired&amp;#039;, &amp;#039;uninspired&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;VBN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;food&amp;#039;, &amp;#039;food&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]),&lt;br /&gt;
  (&amp;#039;lack&amp;#039;, &amp;#039;lack&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;of&amp;#039;, &amp;#039;of&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;atmosphere&amp;#039;, &amp;#039;atmosphere&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]),&lt;br /&gt;
  (&amp;#039;too&amp;#039;, &amp;#039;too&amp;#039;, [&amp;#039;inc&amp;#039;, &amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;expensive&amp;#039;, &amp;#039;expensive&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;am&amp;#039;, &amp;#039;am&amp;#039;, [&amp;#039;VBP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;a&amp;#039;, &amp;#039;a&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;staunch&amp;#039;, &amp;#039;staunch&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;vegetarian&amp;#039;, &amp;#039;vegetarian&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]),&lt;br /&gt;
  (&amp;#039;was&amp;#039;, &amp;#039;was&amp;#039;, [&amp;#039;VBD&amp;#039;]),&lt;br /&gt;
  (&amp;#039;sorely&amp;#039;, &amp;#039;sorely&amp;#039;, [&amp;#039;inc&amp;#039;, &amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;dissapointed&amp;#039;, &amp;#039;dissapointed&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;VBN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;with&amp;#039;, &amp;#039;with&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;veggie&amp;#039;, &amp;#039;veggie&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;options&amp;#039;, &amp;#039;options&amp;#039;, [&amp;#039;NNS&amp;#039;]),&lt;br /&gt;
  (&amp;#039;on&amp;#039;, &amp;#039;on&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;menu&amp;#039;, &amp;#039;menu&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;Will&amp;#039;, &amp;#039;Will&amp;#039;, [&amp;#039;NNP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;be&amp;#039;, &amp;#039;be&amp;#039;, [&amp;#039;VB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;last&amp;#039;, &amp;#039;last&amp;#039;, [&amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;time&amp;#039;, &amp;#039;time&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;visit&amp;#039;, &amp;#039;visit&amp;#039;, [&amp;#039;VBP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]),&lt;br /&gt;
  (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;recommend others to avoid&amp;#039;, &amp;#039;recommend others to avoid&amp;#039;, [&amp;#039;negative&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])]]&lt;br /&gt;
&lt;br /&gt;
Now we can improve our sentiment score somewhat. The idea is that &amp;quot;good&amp;quot; has more strength than &amp;quot;barely good&amp;quot; but less than &amp;quot;very good&amp;quot;.&lt;br /&gt;
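In numbers, the intended effect is simple, assuming a base value of 1 for a positive word, doubling for an incrementer and halving for a decrementer:

```python
# Worked numbers for the "strength" idea.
good = 1.0
very_good = good * 2.0    # 'very' is an incrementer: doubles the score
barely_good = good / 2.0  # 'barely' is a decrementer: halves the score
print(barely_good, good, very_good)  # 0.5 1.0 2.0
```

The factors 2.0 and 0.5 are of course arbitrary choices; any multiplier greater or less than 1 would express the same ordering.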
&lt;br /&gt;
The following code defines the recursive function sentence_score to compute the sentiment score of a sentence. The key point is that it uses information about the previous token to decide the score of the current token.&lt;br /&gt;
&lt;br /&gt;
This function is then used by our new sentiment_score function:&lt;br /&gt;
def sentence_score(sentence_tokens, previous_token, acum_score):    &lt;br /&gt;
    if not sentence_tokens:&lt;br /&gt;
        return acum_score&lt;br /&gt;
    else:&lt;br /&gt;
        current_token = sentence_tokens[0]&lt;br /&gt;
        tags = current_token[2]&lt;br /&gt;
        token_score = sum([value_of(tag) for tag in tags])&lt;br /&gt;
        if previous_token is not None:&lt;br /&gt;
            previous_tags = previous_token[2]&lt;br /&gt;
            if &amp;#039;inc&amp;#039; in previous_tags:&lt;br /&gt;
                token_score *= 2.0&lt;br /&gt;
            elif &amp;#039;dec&amp;#039; in previous_tags:&lt;br /&gt;
                token_score /= 2.0&lt;br /&gt;
        return sentence_score(sentence_tokens[1:], current_token, acum_score + token_score)&lt;br /&gt;
&lt;br /&gt;
def sentiment_score(review):&lt;br /&gt;
    return sum([sentence_score(sentence, None, 0.0) for sentence in review])&lt;br /&gt;
sentiment_score(dict_tagged_sentences)&lt;br /&gt;
-7.0&lt;br /&gt;
&lt;br /&gt;
Notice that the review is now considered more negative, due to the appearance of expressions such as &amp;quot;very uninspired&amp;quot;, &amp;quot;too expensive&amp;quot; and &amp;quot;sorely dissapointed&amp;quot;.&lt;br /&gt;
Inverters and polarity flips&lt;br /&gt;
&lt;br /&gt;
With the approach we&amp;#039;ve been following so far, some expressions could be incorrectly tagged. For example, this part of our example review:&lt;br /&gt;
&lt;br /&gt;
    the eggplant is not bad&lt;br /&gt;
&lt;br /&gt;
contains the word bad but the sentence is a positive opinion about the eggplant.&lt;br /&gt;
&lt;br /&gt;
This is because of the appearance of the negation word not, which flips the meaning of the negative adjective bad.&lt;br /&gt;
&lt;br /&gt;
We can take these kinds of polarity flips into account by defining a dictionary of inverters:&lt;br /&gt;
&lt;br /&gt;
inv.yml&lt;br /&gt;
&lt;br /&gt;
lack of: [inv]&lt;br /&gt;
not: [inv]&lt;br /&gt;
&lt;br /&gt;
When tagging our text, we should also specify this new dictionary in the instantiation of our tagger:&lt;br /&gt;
dicttagger = DictionaryTagger([ &amp;#039;dicts/positive.yml&amp;#039;, &amp;#039;dicts/negative.yml&amp;#039;, &amp;#039;dicts/inc.yml&amp;#039;, &amp;#039;dicts/dec.yml&amp;#039;, &amp;#039;dicts/inv.yml&amp;#039;])&lt;br /&gt;
&lt;br /&gt;
dict_tagged_sentences = dicttagger.tag(pos_tagged_sentences)&lt;br /&gt;
&lt;br /&gt;
pprint(dict_tagged_sentences)&lt;br /&gt;
[[(&amp;#039;What&amp;#039;, &amp;#039;What&amp;#039;, [&amp;#039;WP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;can&amp;#039;, &amp;#039;can&amp;#039;, [&amp;#039;MD&amp;#039;]),&lt;br /&gt;
  (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;say&amp;#039;, &amp;#039;say&amp;#039;, [&amp;#039;VB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;about&amp;#039;, &amp;#039;about&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;this&amp;#039;, &amp;#039;this&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;place&amp;#039;, &amp;#039;place&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;The&amp;#039;, &amp;#039;The&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;staff&amp;#039;, &amp;#039;staff&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;of&amp;#039;, &amp;#039;of&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;restaurant&amp;#039;, &amp;#039;restaurant&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;is&amp;#039;, &amp;#039;is&amp;#039;, [&amp;#039;VBZ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;nice&amp;#039;, &amp;#039;nice&amp;#039;, [&amp;#039;positive&amp;#039;, &amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]),&lt;br /&gt;
  (&amp;#039;eggplant&amp;#039;, &amp;#039;eggplant&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;is&amp;#039;, &amp;#039;is&amp;#039;, [&amp;#039;VBZ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;not&amp;#039;, &amp;#039;not&amp;#039;, [&amp;#039;inv&amp;#039;, &amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;bad&amp;#039;, &amp;#039;bad&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;apart&amp;#039;, &amp;#039;apart&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;from&amp;#039;, &amp;#039;from&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;that&amp;#039;, &amp;#039;that&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]),&lt;br /&gt;
  (&amp;#039;very&amp;#039;, &amp;#039;very&amp;#039;, [&amp;#039;inc&amp;#039;, &amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;uninspired&amp;#039;, &amp;#039;uninspired&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;VBN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;food&amp;#039;, &amp;#039;food&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]),&lt;br /&gt;
  (&amp;#039;lack of&amp;#039;, &amp;#039;lack of&amp;#039;, [&amp;#039;inv&amp;#039;]),&lt;br /&gt;
  (&amp;#039;atmosphere&amp;#039;, &amp;#039;atmosphere&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]),&lt;br /&gt;
  (&amp;#039;too&amp;#039;, &amp;#039;too&amp;#039;, [&amp;#039;inc&amp;#039;, &amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;expensive&amp;#039;, &amp;#039;expensive&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;am&amp;#039;, &amp;#039;am&amp;#039;, [&amp;#039;VBP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;a&amp;#039;, &amp;#039;a&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;staunch&amp;#039;, &amp;#039;staunch&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;vegetarian&amp;#039;, &amp;#039;vegetarian&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;and&amp;#039;, &amp;#039;and&amp;#039;, [&amp;#039;CC&amp;#039;]),&lt;br /&gt;
  (&amp;#039;was&amp;#039;, &amp;#039;was&amp;#039;, [&amp;#039;VBD&amp;#039;]),&lt;br /&gt;
  (&amp;#039;sorely&amp;#039;, &amp;#039;sorely&amp;#039;, [&amp;#039;inc&amp;#039;, &amp;#039;RB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;dissapointed&amp;#039;, &amp;#039;dissapointed&amp;#039;, [&amp;#039;negative&amp;#039;, &amp;#039;VBN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;with&amp;#039;, &amp;#039;with&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;veggie&amp;#039;, &amp;#039;veggie&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;options&amp;#039;, &amp;#039;options&amp;#039;, [&amp;#039;NNS&amp;#039;]),&lt;br /&gt;
  (&amp;#039;on&amp;#039;, &amp;#039;on&amp;#039;, [&amp;#039;IN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;menu&amp;#039;, &amp;#039;menu&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])],&lt;br /&gt;
 [(&amp;#039;Will&amp;#039;, &amp;#039;Will&amp;#039;, [&amp;#039;NNP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;be&amp;#039;, &amp;#039;be&amp;#039;, [&amp;#039;VB&amp;#039;]),&lt;br /&gt;
  (&amp;#039;the&amp;#039;, &amp;#039;the&amp;#039;, [&amp;#039;DT&amp;#039;]),&lt;br /&gt;
  (&amp;#039;last&amp;#039;, &amp;#039;last&amp;#039;, [&amp;#039;JJ&amp;#039;]),&lt;br /&gt;
  (&amp;#039;time&amp;#039;, &amp;#039;time&amp;#039;, [&amp;#039;NN&amp;#039;]),&lt;br /&gt;
  (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;visit&amp;#039;, &amp;#039;visit&amp;#039;, [&amp;#039;VBP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;,&amp;#039;, &amp;#039;,&amp;#039;, [&amp;#039;,&amp;#039;]),&lt;br /&gt;
  (&amp;#039;I&amp;#039;, &amp;#039;I&amp;#039;, [&amp;#039;PRP&amp;#039;]),&lt;br /&gt;
  (&amp;#039;recommend others to avoid&amp;#039;, &amp;#039;recommend others to avoid&amp;#039;, [&amp;#039;negative&amp;#039;]),&lt;br /&gt;
  (&amp;#039;.&amp;#039;, &amp;#039;.&amp;#039;, [&amp;#039;.&amp;#039;])]]&lt;br /&gt;
&lt;br /&gt;
Then, we could adapt our sentiment_score function. We want it to flip the polarity of a sentiment word when it is preceded by an inverter:&lt;br /&gt;
def sentence_score(sentence_tokens, previous_token, acum_score):    &lt;br /&gt;
    if not sentence_tokens:&lt;br /&gt;
        return acum_score&lt;br /&gt;
    else:&lt;br /&gt;
        current_token = sentence_tokens[0]&lt;br /&gt;
        tags = current_token[2]&lt;br /&gt;
        token_score = sum([value_of(tag) for tag in tags])&lt;br /&gt;
        if previous_token is not None:&lt;br /&gt;
            previous_tags = previous_token[2]&lt;br /&gt;
            if &amp;#039;inc&amp;#039; in previous_tags:&lt;br /&gt;
                token_score *= 2.0&lt;br /&gt;
            elif &amp;#039;dec&amp;#039; in previous_tags:&lt;br /&gt;
                token_score /= 2.0&lt;br /&gt;
            elif &amp;#039;inv&amp;#039; in previous_tags:&lt;br /&gt;
                token_score *= -1.0&lt;br /&gt;
        return sentence_score(sentence_tokens[1:], current_token, acum_score + token_score)&lt;br /&gt;
&lt;br /&gt;
def sentiment_score(review):&lt;br /&gt;
    return sum([sentence_score(sentence, None, 0.0) for sentence in review])&lt;br /&gt;
&lt;br /&gt;
Recalculating the sentiment score:&lt;br /&gt;
sentiment_score(dict_tagged_sentences)&lt;br /&gt;
-5.0&lt;br /&gt;
&lt;br /&gt;
It&amp;#039;s now -5.0 since &amp;quot;not bad&amp;quot; is considered positive.&lt;br /&gt;
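To see where the -5.0 comes from, the final scorer can be re-run on a condensed version of the review, keeping only the tokens that influence the score (the tags match the tagger output above; the intervening neutral words contribute 0 and are omitted):

```python
def value_of(tag):
    if tag == 'positive': return 1
    if tag == 'negative': return -1
    return 0

def sentence_score(tokens, previous_token, acum_score):
    if not tokens:
        return acum_score
    current_token = tokens[0]
    token_score = sum(value_of(tag) for tag in current_token[2])
    if previous_token is not None:
        previous_tags = previous_token[2]
        if 'inc' in previous_tags:
            token_score *= 2.0
        elif 'dec' in previous_tags:
            token_score /= 2.0
        elif 'inv' in previous_tags:
            token_score *= -1.0
    return sentence_score(tokens[1:], current_token, acum_score + token_score)

review = [
    [],                                                      # sentence 1: no sentiment words -> 0
    [('nice', 'nice', ['positive', 'JJ']),                   # +1
     ('not', 'not', ['inv', 'RB']),
     ('bad', 'bad', ['negative', 'JJ'])],                    # -1 flipped by 'not' -> +1
    [('very', 'very', ['inc', 'RB']),
     ('uninspired', 'uninspired', ['negative', 'VBN']),      # -1 doubled by 'very' -> -2
     ('too', 'too', ['inc', 'RB']),
     ('expensive', 'expensive', ['negative', 'JJ'])],        # -1 doubled by 'too' -> -2
    [('sorely', 'sorely', ['inc', 'RB']),
     ('dissapointed', 'dissapointed', ['negative', 'VBN'])], # -1 doubled by 'sorely' -> -2
    [('recommend others to avoid', 'recommend others to avoid',
      ['negative'])],                                        # -1
]
scores = [sentence_score(s, None, 0.0) for s in review]
print(scores)       # [0.0, 2.0, -4.0, -2.0, -1.0]
print(sum(scores))  # -5.0
```

The second sentence is now worth +2 (both "nice" and the flipped "not bad" count as positive), which is what moves the total from -7.0 to -5.0.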
Conclusion&lt;br /&gt;
&lt;br /&gt;
We have seen a brief introduction to some basic techniques and algorithms that can give us an overall &amp;quot;score&amp;quot; of how positive or negative a review is.&lt;br /&gt;
&lt;br /&gt;
The steps we&amp;#039;ve followed are:&lt;br /&gt;
&lt;br /&gt;
        Split the text into sentences, and each sentence into tokens&lt;br /&gt;
        Add POS (Part-of-Speech) tags to the split text, using NLTK&lt;br /&gt;
        Enrich the POS-tagged text with our own tags using dictionaries. These tags are at a different &amp;quot;semantic level&amp;quot; than POS tags: &amp;quot;positive&amp;quot;, &amp;quot;negative&amp;quot;, &amp;quot;inverter&amp;quot;, &amp;quot;incrementer&amp;quot; and &amp;quot;decrementer&amp;quot;&lt;br /&gt;
        Implement some basic extraction rules over the tagged text, in the form of Python functions&lt;br /&gt;
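The steps above can be sketched end to end in one small program. Here the splitter and POS tagger are deliberately trivial stand-ins for the NLTK-based Splitter and POSTagger classes used in the post, and the dictionary is a tiny toy:

```python
# Toy end-to-end pipeline: split -> POS-tag (stubbed) -> dictionary-tag -> score.
def split(text):
    # Stand-in for the NLTK-based Splitter: sentences on '.', tokens on whitespace.
    return [s.split() for s in text.split('.') if s.strip()]

def pos_tag(sentences):
    # Stand-in for the NLTK-based POSTagger: tags everything 'NN'.
    return [[(w, w, ['NN']) for w in sentence] for sentence in sentences]

DICTIONARY = {'nice': 'positive', 'bad': 'negative', 'not': 'inv', 'very': 'inc'}

def dict_tag(sentences):
    # Prepend dictionary tags, keeping the POS tags.
    return [[(w, lemma, ([DICTIONARY[w.lower()]] if w.lower() in DICTIONARY else []) + tags)
             for (w, lemma, tags) in sentence] for sentence in sentences]

def value_of(tag):
    return {'positive': 1, 'negative': -1}.get(tag, 0)

def sentence_score(tokens, previous_token, acum_score):
    if not tokens:
        return acum_score
    current = tokens[0]
    score = sum(value_of(t) for t in current[2])
    if previous_token is not None:
        prev = previous_token[2]
        if 'inc' in prev: score *= 2.0
        elif 'dec' in prev: score /= 2.0
        elif 'inv' in prev: score *= -1.0
    return sentence_score(tokens[1:], current, acum_score + score)

def sentiment_score(review):
    return sum(sentence_score(s, None, 0.0) for s in review)

review = dict_tag(pos_tag(split('The staff is very nice. The food is not bad.')))
print(sentiment_score(review))  # 'very nice' -> 2.0, 'not bad' -> 1.0, total 3.0
```

Swapping the stubs for real sentence splitting and POS tagging (and for richer dictionaries) recovers the pipeline described in this post.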
&lt;br /&gt;
That could be a good starting point for someone interested in sentiment analysis, but it is only the very beginning.&lt;br /&gt;
&lt;br /&gt;
In a real-life system you would need to work much harder, especially on the extraction rules (and, of course, on the dictionaries).&lt;br /&gt;
&lt;br /&gt;
The method described so far is a rule-based approach. There are other techniques for sentiment analysis, for example applying machine-learning algorithms. In any case, I think that advanced rule-based or machine-learning systems are out of scope for an introductory post like this.&lt;br /&gt;
&lt;br /&gt;
Hope you enjoyed the reading!&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Referensi==&lt;br /&gt;
&lt;br /&gt;
* http://fjavieralba.com/basic-sentiment-analysis-with-python.html&lt;/div&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
</feed>