<?xml version="1.0"?>
<feed xmlns="http://www.w3.org/2005/Atom" xml:lang="en">
	<id>https://lms.onnocenter.or.id/wiki/index.php?action=history&amp;feed=atom&amp;title=Python%3A_Twitter_Analytic</id>
	<title>Python: Twitter Analytic - Revision history</title>
	<link rel="self" type="application/atom+xml" href="https://lms.onnocenter.or.id/wiki/index.php?action=history&amp;feed=atom&amp;title=Python%3A_Twitter_Analytic"/>
	<link rel="alternate" type="text/html" href="https://lms.onnocenter.or.id/wiki/index.php?title=Python:_Twitter_Analytic&amp;action=history"/>
	<updated>2026-04-20T06:08:05Z</updated>
	<subtitle>Revision history for this page on the wiki</subtitle>
	<generator>MediaWiki 1.45.1</generator>
	<entry>
		<id>https://lms.onnocenter.or.id/wiki/index.php?title=Python:_Twitter_Analytic&amp;diff=46566&amp;oldid=prev</id>
		<title>Onnowpurbo: /* 2. Reading and Understanding the data */</title>
		<link rel="alternate" type="text/html" href="https://lms.onnocenter.or.id/wiki/index.php?title=Python:_Twitter_Analytic&amp;diff=46566&amp;oldid=prev"/>
		<updated>2017-01-22T09:15:27Z</updated>

		<summary type="html">&lt;p&gt;&lt;span class=&quot;autocomment&quot;&gt;2. Reading and Understanding the data&lt;/span&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 09:15, 22 January 2017&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l78&quot;&gt;Line 78:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 78:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  {&amp;quot;created_at&amp;quot;:&amp;quot;Tue Jul 15 14:19:30 +0000 2014&amp;quot;,&amp;quot;id&amp;quot;:489051636304990208,&amp;quot;id_str&amp;quot;:&amp;quot;489051636304990208&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Yaayyy I learned some JavaScript today!  #thatwasntsohard #yesitwas #stoptalkingtoyourself #hashbrown #hashtag&amp;quot;,&amp;quot;source&amp;quot;:&amp;quot;\u003ca href=\&amp;quot;http:\/\/twitter.com\/download\/iphone\&amp;quot; rel=\&amp;quot;nofollow\&amp;quot;\u003eTwitter for iPhone\u003c\/a\u003e&amp;quot;,&amp;quot;truncated&amp;quot;:false,&amp;quot;in_reply_to_status_id&amp;quot;:null,&amp;quot;in_reply_to_status_id_str&amp;quot;:null,&amp;quot;in_reply_to_user_id&amp;quot;:null,&amp;quot;in_reply_to_user_id_str&amp;quot;:null,&amp;quot;in_reply_to_screen_name&amp;quot;:null,&amp;quot;user&amp;quot;:{&amp;quot;id&amp;quot;:2301702187,&amp;quot;id_str&amp;quot;:&amp;quot;2301702187&amp;quot;,&amp;quot;name&amp;quot;:&amp;quot;Toni Barlettano&amp;quot;,&amp;quot;screen_name&amp;quot;:&amp;quot;itsmetonib&amp;quot;,&amp;quot;location&amp;quot;:&amp;quot;Greater NYC Area&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;http:\/\/www.tonib.me&amp;quot;,&amp;quot;description&amp;quot;:&amp;quot;So Full of Art   |   \nToni Barlettano Creative Media + Design&amp;quot;,&amp;quot;protected&amp;quot;:false,&amp;quot;followers_count&amp;quot;:8,&amp;quot;friends_count&amp;quot;:25,&amp;quot;listed_count&amp;quot;:0,&amp;quot;created_at&amp;quot;:&amp;quot;Mon Jan 20 16:49:46 +0000 2014&amp;quot;,&amp;quot;favourites_count&amp;quot;:6,&amp;quot;utc_offset&amp;quot;:null,&amp;quot;time_zone&amp;quot;:null,&amp;quot;geo_enabled&amp;quot;:false,&amp;quot;verified&amp;quot;:false,&amp;quot;statuses_count&amp;quot;:20,&amp;quot;lang&amp;quot;:&amp;quot;en&amp;quot;,&amp;quot;contributors_enabled&amp;quot;:false,&amp;quot;is_translator&amp;quot;:false,&amp;quot;is_translation_enabled&amp;quot;:false,&amp;quot;profile_background_color&amp;quot;:&amp;quot;C0DEED&amp;quot;,&amp;quot;profile_background_image_url&amp;quot;:&amp;quot;http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png&amp;quot;,&amp;quot;profile_background_image_url_https&amp;quot;:&amp;quot;https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png&amp;quot;,&amp;quot;profile_background_tile&amp;quot;:false,&amp;quot;profile_image_url&amp;quot;:&amp;quot;http:\/\/pbs.twimg.com\/profile_images\/425313048320958464\/Z2GcderW_normal.jpeg&amp;quot;,&amp;quot;profile_image_url_https&amp;quot;:&amp;quot;https:\/\/pbs.twimg.com\/profile_images\/425313048320958464\/Z2GcderW_normal.jpeg&amp;quot;,&amp;quot;profile_link_color&amp;quot;:&amp;quot;0084B4&amp;quot;,&amp;quot;profile_sidebar_border_color&amp;quot;:&amp;quot;C0DEED&amp;quot;,&amp;quot;profile_sidebar_fill_color&amp;quot;:&amp;quot;DDEEF6&amp;quot;,&amp;quot;profile_text_color&amp;quot;:&amp;quot;333333&amp;quot;,&amp;quot;profile_use_background_image&amp;quot;:true,&amp;quot;default_profile&amp;quot;:true,&amp;quot;default_profile_image&amp;quot;:false,&amp;quot;following&amp;quot;:null,&amp;quot;follow_request_sent&amp;quot;:null,&amp;quot;notifications&amp;quot;:null},&amp;quot;geo&amp;quot;:null,&amp;quot;coordinates&amp;quot;:null,&amp;quot;place&amp;quot;:null,&amp;quot;contributors&amp;quot;:null,&amp;quot;retweet_count&amp;quot;:0,&amp;quot;favorite_count&amp;quot;:0,&amp;quot;entities&amp;quot;:{&amp;quot;hashtags&amp;quot;:[{&amp;quot;text&amp;quot;:&amp;quot;thatwasntsohard&amp;quot;,&amp;quot;indices&amp;quot;:[40,56]},{&amp;quot;text&amp;quot;:&amp;quot;yesitwas&amp;quot;,&amp;quot;indices&amp;quot;:[57,66]},{&amp;quot;text&amp;quot;:&amp;quot;stoptalkingtoyourself&amp;quot;,&amp;quot;indices&amp;quot;:[67,89]},{&amp;quot;text&amp;quot;:&amp;quot;hashbrown&amp;quot;,&amp;quot;indices&amp;quot;:[90,100]},{&amp;quot;text&amp;quot;:&amp;quot;hashtag&amp;quot;,&amp;quot;indices&amp;quot;:[101,109]}],&amp;quot;symbols&amp;quot;:[],&amp;quot;urls&amp;quot;:[],&amp;quot;user_mentions&amp;quot;:[]},&amp;quot;favorited&amp;quot;:false,&amp;quot;retweeted&amp;quot;:false,&amp;quot;filter_level&amp;quot;:&amp;quot;medium&amp;quot;,&amp;quot;lang&amp;quot;:&amp;quot;en&amp;quot;}&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  {&amp;quot;created_at&amp;quot;:&amp;quot;Tue Jul 15 14:19:30 +0000 2014&amp;quot;,&amp;quot;id&amp;quot;:489051636304990208,&amp;quot;id_str&amp;quot;:&amp;quot;489051636304990208&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Yaayyy I learned some JavaScript today!  #thatwasntsohard #yesitwas #stoptalkingtoyourself #hashbrown #hashtag&amp;quot;,&amp;quot;source&amp;quot;:&amp;quot;\u003ca href=\&amp;quot;http:\/\/twitter.com\/download\/iphone\&amp;quot; rel=\&amp;quot;nofollow\&amp;quot;\u003eTwitter for iPhone\u003c\/a\u003e&amp;quot;,&amp;quot;truncated&amp;quot;:false,&amp;quot;in_reply_to_status_id&amp;quot;:null,&amp;quot;in_reply_to_status_id_str&amp;quot;:null,&amp;quot;in_reply_to_user_id&amp;quot;:null,&amp;quot;in_reply_to_user_id_str&amp;quot;:null,&amp;quot;in_reply_to_screen_name&amp;quot;:null,&amp;quot;user&amp;quot;:{&amp;quot;id&amp;quot;:2301702187,&amp;quot;id_str&amp;quot;:&amp;quot;2301702187&amp;quot;,&amp;quot;name&amp;quot;:&amp;quot;Toni Barlettano&amp;quot;,&amp;quot;screen_name&amp;quot;:&amp;quot;itsmetonib&amp;quot;,&amp;quot;location&amp;quot;:&amp;quot;Greater NYC Area&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;http:\/\/www.tonib.me&amp;quot;,&amp;quot;description&amp;quot;:&amp;quot;So Full of Art   |   \nToni Barlettano Creative Media + Design&amp;quot;,&amp;quot;protected&amp;quot;:false,&amp;quot;followers_count&amp;quot;:8,&amp;quot;friends_count&amp;quot;:25,&amp;quot;listed_count&amp;quot;:0,&amp;quot;created_at&amp;quot;:&amp;quot;Mon Jan 20 16:49:46 +0000 2014&amp;quot;,&amp;quot;favourites_count&amp;quot;:6,&amp;quot;utc_offset&amp;quot;:null,&amp;quot;time_zone&amp;quot;:null,&amp;quot;geo_enabled&amp;quot;:false,&amp;quot;verified&amp;quot;:false,&amp;quot;statuses_count&amp;quot;:20,&amp;quot;lang&amp;quot;:&amp;quot;en&amp;quot;,&amp;quot;contributors_enabled&amp;quot;:false,&amp;quot;is_translator&amp;quot;:false,&amp;quot;is_translation_enabled&amp;quot;:false,&amp;quot;profile_background_color&amp;quot;:&amp;quot;C0DEED&amp;quot;,&amp;quot;profile_background_image_url&amp;quot;:&amp;quot;http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png&amp;quot;,&amp;quot;profile_background_image_url_https&amp;quot;:&amp;quot;https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png&amp;quot;,&amp;quot;profile_background_tile&amp;quot;:false,&amp;quot;profile_image_url&amp;quot;:&amp;quot;http:\/\/pbs.twimg.com\/profile_images\/425313048320958464\/Z2GcderW_normal.jpeg&amp;quot;,&amp;quot;profile_image_url_https&amp;quot;:&amp;quot;https:\/\/pbs.twimg.com\/profile_images\/425313048320958464\/Z2GcderW_normal.jpeg&amp;quot;,&amp;quot;profile_link_color&amp;quot;:&amp;quot;0084B4&amp;quot;,&amp;quot;profile_sidebar_border_color&amp;quot;:&amp;quot;C0DEED&amp;quot;,&amp;quot;profile_sidebar_fill_color&amp;quot;:&amp;quot;DDEEF6&amp;quot;,&amp;quot;profile_text_color&amp;quot;:&amp;quot;333333&amp;quot;,&amp;quot;profile_use_background_image&amp;quot;:true,&amp;quot;default_profile&amp;quot;:true,&amp;quot;default_profile_image&amp;quot;:false,&amp;quot;following&amp;quot;:null,&amp;quot;follow_request_sent&amp;quot;:null,&amp;quot;notifications&amp;quot;:null},&amp;quot;geo&amp;quot;:null,&amp;quot;coordinates&amp;quot;:null,&amp;quot;place&amp;quot;:null,&amp;quot;contributors&amp;quot;:null,&amp;quot;retweet_count&amp;quot;:0,&amp;quot;favorite_count&amp;quot;:0,&amp;quot;entities&amp;quot;:{&amp;quot;hashtags&amp;quot;:[{&amp;quot;text&amp;quot;:&amp;quot;thatwasntsohard&amp;quot;,&amp;quot;indices&amp;quot;:[40,56]},{&amp;quot;text&amp;quot;:&amp;quot;yesitwas&amp;quot;,&amp;quot;indices&amp;quot;:[57,66]},{&amp;quot;text&amp;quot;:&amp;quot;stoptalkingtoyourself&amp;quot;,&amp;quot;indices&amp;quot;:[67,89]},{&amp;quot;text&amp;quot;:&amp;quot;hashbrown&amp;quot;,&amp;quot;indices&amp;quot;:[90,100]},{&amp;quot;text&amp;quot;:&amp;quot;hashtag&amp;quot;,&amp;quot;indices&amp;quot;:[101,109]}],&amp;quot;symbols&amp;quot;:[],&amp;quot;urls&amp;quot;:[],&amp;quot;user_mentions&amp;quot;:[]},&amp;quot;favorited&amp;quot;:false,&amp;quot;retweeted&amp;quot;:false,&amp;quot;filter_level&amp;quot;:&amp;quot;medium&amp;quot;,&amp;quot;lang&amp;quot;:&amp;quot;en&amp;quot;}&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;For the remaining of this tutorial, we will be using 4 Python libraries json for parsing the data, pandas for data manipulation, matplotlib for creating charts, adn re for regular expressions. The json and re libraries are installed by default in Python. You should install pandas and matplotlib if you don&#039;t have them in your machine.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;For the remaining of this tutorial, we will be using 4 Python libraries json for parsing the data, pandas for data manipulation, matplotlib for creating charts, adn re for regular expressions. The json and re libraries are installed by default in Python. You should install pandas and matplotlib if you  &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;don&#039;t have them in your machine.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Install aplikasi pendukung&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; sudo pip install pandas&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; sudo pip install matplotlib&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; sudo apt install python-tk&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;We will start first by uploading json and pandas using the commands below:&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;We will start first by uploading json and pandas using the commands below:&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l136&quot;&gt;Line 136:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 144:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  ax.set_title(&amp;#039;Top 5 countries&amp;#039;, fontsize=15, fontweight=&amp;#039;bold&amp;#039;)&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  ax.set_title(&amp;#039;Top 5 countries&amp;#039;, fontsize=15, fontweight=&amp;#039;bold&amp;#039;)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  tweets_by_country[:5].plot(ax=ax, kind=&amp;#039;bar&amp;#039;, color=&amp;#039;blue&amp;#039;)&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  tweets_by_country[:5].plot(ax=ax, kind=&amp;#039;bar&amp;#039;, color=&amp;#039;blue&amp;#039;)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Secara lengkap&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; import json&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; import pandas as pd&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; import matplotlib.pyplot as plt&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets_data_path = &#039;/home/onno/Twitter-big-data/pilkada-data.txt&#039;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets_data = []&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets_file = open(tweets_data_path, &quot;r&quot;)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; for line in tweets_file:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;     try:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;         tweet = json.loads(line)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;         tweets_data.append(tweet)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;     except:&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;         continue&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; print len(tweets_data)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets = pd.DataFrame()&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets[&#039;text&#039;] = map(lambda tweet: tweet[&#039;text&#039;], tweets_data)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets[&#039;lang&#039;] = map(lambda tweet: tweet[&#039;lang&#039;], tweets_data)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets[&#039;country&#039;] = map(lambda tweet: tweet[&#039;place&#039;][&#039;country&#039;] if tweet[&#039;place&#039;] != None else None, tweets_data)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets_by_lang = tweets[&#039;lang&#039;].value_counts()&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; fig, ax = plt.subplots()&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; ax.tick_params(axis=&#039;x&#039;, labelsize=15)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; ax.tick_params(axis=&#039;y&#039;, labelsize=10)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; ax.set_xlabel(&#039;Languages&#039;, fontsize=15)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; ax.set_ylabel(&#039;Number of tweets&#039; , fontsize=15)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; ax.set_title(&#039;Top 5 languages&#039;, fontsize=15, fontweight=&#039;bold&#039;)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets_by_lang[:5].plot(ax=ax, kind=&#039;bar&#039;, color=&#039;red&#039;)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; plt.show()&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets_by_country = tweets[&#039;country&#039;].value_counts() &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; fig, ax = plt.subplots()&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; ax.tick_params(axis=&#039;x&#039;, labelsize=15)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; ax.tick_params(axis=&#039;y&#039;, labelsize=10)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; ax.set_xlabel(&#039;Countries&#039;, fontsize=15)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; ax.set_ylabel(&#039;Number of tweets&#039; , fontsize=15)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; ax.set_title(&#039;Top 5 countries&#039;, fontsize=15, fontweight=&#039;bold&#039;)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; tweets_by_country[:5].plot(ax=ax, kind=&#039;bar&#039;, color=&#039;blue&#039;)&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; plt.show()&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===3. Mining the tweets===&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;===3. Mining the tweets===&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l281&quot;&gt;Line 281:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 341:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;In this tutorial, we covered many techniques used in text mining. The code provide in this post could be modified to create a deeper analysis or could be adapted to another use case. For those who want to go further in text mining, I recommend to follow up by studying regular expressions.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;In this tutorial, we covered many techniques used in text mining. The code provide in this post could be modified to create a deeper analysis or could be adapted to another use case. For those who want to go further in text mining, I recommend to follow up by studying regular expressions.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;You can find the source code from this tutorial in this github repository github link.  &lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;You can find the source code from this tutorial in this github repository github link.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;==Referensi==&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;==Referensi==&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* http://adilmoujahid.com/posts/2014/07/twitter-analytics/&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* http://adilmoujahid.com/posts/2014/07/twitter-analytics/&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
	<entry>
		<id>https://lms.onnocenter.or.id/wiki/index.php?title=Python:_Twitter_Analytic&amp;diff=46565&amp;oldid=prev</id>
		<title>Onnowpurbo at 08:42, 22 January 2017</title>
		<link rel="alternate" type="text/html" href="https://lms.onnocenter.or.id/wiki/index.php?title=Python:_Twitter_Analytic&amp;diff=46565&amp;oldid=prev"/>
		<updated>2017-01-22T08:42:33Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;table style=&quot;background-color: #fff; color: #202122;&quot; data-mw=&quot;interface&quot;&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;col class=&quot;diff-marker&quot; /&gt;
				&lt;col class=&quot;diff-content&quot; /&gt;
				&lt;tr class=&quot;diff-title&quot; lang=&quot;en&quot;&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;← Older revision&lt;/td&gt;
				&lt;td colspan=&quot;2&quot; style=&quot;background-color: #fff; color: #202122; text-align: center;&quot;&gt;Revision as of 08:42, 22 January 2017&lt;/td&gt;
				&lt;/tr&gt;&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l5&quot;&gt;Line 5:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 5:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Using only 2 days worth of Twitter data, I could retrieve 644 links to python tutorials, 413 to javascript tutorials and 136 to ruby tutorials. Furthermore, I could confirm that python is 1.5 times more popular than javascript and 4 times more popular than ruby.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Using only 2 days worth of Twitter data, I could retrieve 644 links to python tutorials, 413 to javascript tutorials and 136 to ruby tutorials. Furthermore, I could confirm that python is 1.5 times more popular than javascript and 4 times more popular than ruby.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Alt Text&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==&lt;/ins&gt;1. Getting Data from Twitter Streaming API&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;1. Getting Data from Twitter Streaming API&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;API stands for Application Programming Interface. It is a tool that makes the interaction with computer programs and web services easy. Many web services provides APIs to developers to interact with their services and to access data in programmatic way. For this tutorial, we will use Twitter Streaming API to download tweets related to 3 keywords: &amp;quot;python&amp;quot;, &amp;quot;javascript&amp;quot;, and &amp;quot;ruby&amp;quot;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;API stands for Application Programming Interface. It is a tool that makes the interaction with computer programs and web services easy. Many web services provides APIs to developers to interact with their services and to access data in programmatic way. For this tutorial, we will use Twitter Streaming API to download tweets related to 3 keywords: &amp;quot;python&amp;quot;, &amp;quot;javascript&amp;quot;, and &amp;quot;ruby&amp;quot;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Step 1: Getting Twitter API keys&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===&lt;/ins&gt;Step 1: Getting Twitter API keys&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;In order to access Twitter Streaming API, we need to get 4 pieces of information from Twitter: API key, API secret, Access token and Access token secret. Follow the steps below to get all 4 elements:&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;In order to access Twitter Streaming API, we need to get 4 pieces of information from Twitter: API key, API secret, Access token and Access token secret. Follow the steps below to get all 4 elements:&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l20&quot;&gt;Line 20:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 20:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Scroll down and click &amp;quot;Create my access token&amp;quot;, and copy your &amp;quot;Access token&amp;quot; and &amp;quot;Access token secret&amp;quot;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;* Scroll down and click &amp;quot;Create my access token&amp;quot;, and copy your &amp;quot;Access token&amp;quot; and &amp;quot;Access token secret&amp;quot;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Step 2: Connecting to Twitter Streaming API and downloading data&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===&lt;/ins&gt;Step 2: Connecting to Twitter Streaming API and downloading data&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;We will be using a Python library called Tweepy to connect to Twitter Streaming API and downloading the data. If you don&amp;#039;t have Tweepy installed in your machine, go to this link, and follow the installation instructions.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;We will be using a Python library called Tweepy to connect to Twitter Streaming API and downloading the data. If you don&amp;#039;t have Tweepy installed in your machine, go to this link, and follow the installation instructions.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l58&quot;&gt;Line 58:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 58:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;      stream.filter(track=[&amp;#039;python&amp;#039;, &amp;#039;javascript&amp;#039;, &amp;#039;ruby&amp;#039;])&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;      stream.filter(track=[&amp;#039;python&amp;#039;, &amp;#039;javascript&amp;#039;, &amp;#039;ruby&amp;#039;])&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;If you run the program from your terminal using the command: python twitter_streaming.py&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;, you will see data flowing like the picture below.&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;If you run the program from your terminal using the command:&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;python twitter_streaming.py&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Alt Text&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;you will see data flowing like the picture below.&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;You can stop the program by pressing Ctrl-C.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;You can stop the program by pressing Ctrl-C.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;We want to capture this data into a file that we will use later for the analysis. You can do so by piping the output to a file using the following command: python twitter_streaming.py &amp;gt; twitter_data.txt.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;We want to capture this data into a file that we will use later for the analysis. You can do so by piping the output to a file using the following command:&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt; &lt;/ins&gt;python twitter_streaming.py &amp;gt; twitter_data.txt.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;I run the program for 2 days (from 2014/07/15 till 2014/07/17) to get a meaningful data sample. This file size is 242 MB.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;I run the program for 2 days (from 2014/07/15 till 2014/07/17) to get a meaningful data sample. This file size is 242 MB.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;2. Reading and Understanding the data&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt; &lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-deleted&quot;&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==&lt;/ins&gt;2. Reading and Understanding the data&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;==&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;The data that we stored twitter_data.txt is in JSON format. JSON stands for JavaScript Object Notation. This format makes it easy to humans to read the data, and for machines to parse it. Below is an example for one tweet in JSON format. You can see that the tweet contains additional information in addition to the main text which in this example: &amp;quot;Yaayyy I learned some JavaScript today! #thatwasntsohard #yesitwas #stoptalkingtoyourself #hashbrown #hashtag&amp;quot;.&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;The data that we stored twitter_data.txt is in JSON format. JSON stands for JavaScript Object Notation. This format makes it easy to humans to read the data, and for machines to parse it. Below is an example for one tweet in JSON format. You can see that the tweet contains additional information in addition to the main text which in this example: &amp;quot;Yaayyy I learned some JavaScript today! #thatwasntsohard #yesitwas #stoptalkingtoyourself #hashbrown #hashtag&amp;quot;.&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot; id=&quot;mw-diff-left-l132&quot;&gt;Line 132:&lt;/td&gt;
&lt;td colspan=&quot;2&quot; class=&quot;diff-lineno&quot;&gt;Line 137:&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  tweets_by_country[:5].plot(ax=ax, kind=&amp;#039;bar&amp;#039;, color=&amp;#039;blue&amp;#039;)&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;  tweets_by_country[:5].plot(ax=ax, kind=&amp;#039;bar&amp;#039;, color=&amp;#039;blue&amp;#039;)&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;del style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;Alt Text&lt;/del&gt;&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;+&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #a3d3ff; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===&lt;/ins&gt;3. Mining the tweets&lt;ins style=&quot;font-weight: bold; text-decoration: none;&quot;&gt;===&lt;/ins&gt;&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot; data-marker=&quot;−&quot;&gt;&lt;/td&gt;&lt;td style=&quot;color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #ffe49c; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;3. Mining the tweets&lt;/div&gt;&lt;/td&gt;&lt;td colspan=&quot;2&quot; class=&quot;diff-side-added&quot;&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;br&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Our main goals in these text mining tasks are: compare the popularity of Python, Ruby and Javascript programming languages and to retrieve programming tutorial links. We will do this in 3 steps:&lt;/div&gt;&lt;/td&gt;&lt;td class=&quot;diff-marker&quot;&gt;&lt;/td&gt;&lt;td style=&quot;background-color: #f8f9fa; color: #202122; font-size: 88%; border-style: solid; border-width: 1px 1px 1px 4px; border-radius: 0.33em; border-color: #eaecf0; vertical-align: top; white-space: pre-wrap;&quot;&gt;&lt;div&gt;Our main goals in these text mining tasks are: compare the popularity of Python, Ruby and Javascript programming languages and to retrieve programming tutorial links. We will do this in 3 steps:&lt;/div&gt;&lt;/td&gt;&lt;/tr&gt;
&lt;/table&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
	<entry>
		<id>https://lms.onnocenter.or.id/wiki/index.php?title=Python:_Twitter_Analytic&amp;diff=46564&amp;oldid=prev</id>
		<title>Onnowpurbo at 08:30, 22 January 2017</title>
		<link rel="alternate" type="text/html" href="https://lms.onnocenter.or.id/wiki/index.php?title=Python:_Twitter_Analytic&amp;diff=46564&amp;oldid=prev"/>
		<updated>2017-01-22T08:30:04Z</updated>

		<summary type="html">&lt;p&gt;&lt;/p&gt;
&lt;a href=&quot;https://lms.onnocenter.or.id/wiki/index.php?title=Python:_Twitter_Analytic&amp;amp;diff=46564&amp;amp;oldid=46563&quot;&gt;Show changes&lt;/a&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
	<entry>
		<id>https://lms.onnocenter.or.id/wiki/index.php?title=Python:_Twitter_Analytic&amp;diff=46563&amp;oldid=prev</id>
		<title>Onnowpurbo: Created page with &quot;Text mining is the application of natural language processing techniques and analytical methods to text data in order to derive relevant information. Text mining is getting a...&quot;</title>
		<link rel="alternate" type="text/html" href="https://lms.onnocenter.or.id/wiki/index.php?title=Python:_Twitter_Analytic&amp;diff=46563&amp;oldid=prev"/>
		<updated>2017-01-22T08:20:59Z</updated>

		<summary type="html">&lt;p&gt;Created page with &amp;quot;Text mining is the application of natural language processing techniques and analytical methods to text data in order to derive relevant information. Text mining is getting a...&amp;quot;&lt;/p&gt;
&lt;p&gt;&lt;b&gt;New page&lt;/b&gt;&lt;/p&gt;&lt;div&gt;Text mining is the application of natural language processing techniques and analytical methods to text data in order to derive relevant information. Text mining is getting a lot attention these last years, due to an exponential increase in digital text data from web pages, google&amp;#039;s projects such as google books and google ngram, and social media services such as Twitter. Twitter data constitutes a rich source that can be used for capturing information about any topic imaginable. This data can be used in different use cases such as finding trends related to a specific keyword, measuring brand sentiment, and gathering feedback about new products and services.&lt;br /&gt;
&lt;br /&gt;
In this tutorial, I will use Twitter data to compare the popularity of 3 programming languages: Python, Javascript and Ruby, and to retrieve links to programming tutorials. In the first paragraph, I will explaing how to connect to Twitter Streaming API and how to get the data. In the second paragraph, I will explain how to structure the data for analysis, and in the last paragraph, I will explain how to filter the data and extract links from tweets.&lt;br /&gt;
&lt;br /&gt;
Using only 2 days worth of Twitter data, I could retrieve 644 links to python tutorials, 413 to javascript tutorials and 136 to ruby tutorials. Furthermore, I could confirm that python is 1.5 times more popular than javascript and 4 times more popular than ruby.&lt;br /&gt;
&lt;br /&gt;
Alt Text&lt;br /&gt;
1. Getting Data from Twitter Streaming API&lt;br /&gt;
&lt;br /&gt;
API stands for Application Programming Interface. It is a tool that makes the interaction with computer programs and web services easy. Many web services provides APIs to developers to interact with their services and to access data in programmatic way. For this tutorial, we will use Twitter Streaming API to download tweets related to 3 keywords: &amp;quot;python&amp;quot;, &amp;quot;javascript&amp;quot;, and &amp;quot;ruby&amp;quot;.&lt;br /&gt;
Step 1: Getting Twitter API keys&lt;br /&gt;
&lt;br /&gt;
In order to access Twitter Streaming API, we need to get 4 pieces of information from Twitter: API key, API secret, Access token and Access token secret. Follow the steps below to get all 4 elements:&lt;br /&gt;
&lt;br /&gt;
    Create a twitter account if you do not already have one.&lt;br /&gt;
    Go to https://apps.twitter.com/ and log in with your twitter credentials.&lt;br /&gt;
    Click &amp;quot;Create New App&amp;quot;&lt;br /&gt;
    Fill out the form, agree to the terms, and click &amp;quot;Create your Twitter application&amp;quot;&lt;br /&gt;
    In the next page, click on &amp;quot;API keys&amp;quot; tab, and copy your &amp;quot;API key&amp;quot; and &amp;quot;API secret&amp;quot;.&lt;br /&gt;
    Scroll down and click &amp;quot;Create my access token&amp;quot;, and copy your &amp;quot;Access token&amp;quot; and &amp;quot;Access token secret&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
Step 2: Connecting to Twitter Streaming API and downloading data&lt;br /&gt;
&lt;br /&gt;
We will be using a Python library called Tweepy to connect to Twitter Streaming API and downloading the data. If you don&amp;#039;t have Tweepy installed in your machine, go to this link, and follow the installation instructions.&lt;br /&gt;
&lt;br /&gt;
Next create, a file called twitter_streaming.py, and copy into it the code below. Make sure to enter your credentials into access_token, access_token_secret, consumer_key, and consumer_secret.&lt;br /&gt;
&lt;br /&gt;
#Import the necessary methods from tweepy library&lt;br /&gt;
from tweepy.streaming import StreamListener&lt;br /&gt;
from tweepy import OAuthHandler&lt;br /&gt;
from tweepy import Stream&lt;br /&gt;
&lt;br /&gt;
#Variables that contains the user credentials to access Twitter API &lt;br /&gt;
access_token = &amp;quot;ENTER YOUR ACCESS TOKEN&amp;quot;&lt;br /&gt;
access_token_secret = &amp;quot;ENTER YOUR ACCESS TOKEN SECRET&amp;quot;&lt;br /&gt;
consumer_key = &amp;quot;ENTER YOUR API KEY&amp;quot;&lt;br /&gt;
consumer_secret = &amp;quot;ENTER YOUR API SECRET&amp;quot;&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
#This is a basic listener that just prints received tweets to stdout.&lt;br /&gt;
class StdOutListener(StreamListener):&lt;br /&gt;
&lt;br /&gt;
    def on_data(self, data):&lt;br /&gt;
        print data&lt;br /&gt;
        return True&lt;br /&gt;
&lt;br /&gt;
    def on_error(self, status):&lt;br /&gt;
        print status&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
if __name__ == &amp;#039;__main__&amp;#039;:&lt;br /&gt;
&lt;br /&gt;
    #This handles Twitter authetification and the connection to Twitter Streaming API&lt;br /&gt;
    l = StdOutListener()&lt;br /&gt;
    auth = OAuthHandler(consumer_key, consumer_secret)&lt;br /&gt;
    auth.set_access_token(access_token, access_token_secret)&lt;br /&gt;
    stream = Stream(auth, l)&lt;br /&gt;
&lt;br /&gt;
    #This line filter Twitter Streams to capture data by the keywords: &amp;#039;python&amp;#039;, &amp;#039;javascript&amp;#039;, &amp;#039;ruby&amp;#039;&lt;br /&gt;
    stream.filter(track=[&amp;#039;python&amp;#039;, &amp;#039;javascript&amp;#039;, &amp;#039;ruby&amp;#039;])&lt;br /&gt;
&lt;br /&gt;
If you run the program from your terminal using the command: python twitter_streaming.py, you will see data flowing like the picture below.&lt;br /&gt;
&lt;br /&gt;
Alt Text&lt;br /&gt;
&lt;br /&gt;
You can stop the program by pressing Ctrl-C.&lt;br /&gt;
&lt;br /&gt;
We want to capture this data into a file that we will use later for the analysis. You can do so by piping the output to a file using the following command: python twitter_streaming.py &amp;gt; twitter_data.txt.&lt;br /&gt;
&lt;br /&gt;
I run the program for 2 days (from 2014/07/15 till 2014/07/17) to get a meaningful data sample. This file size is 242 MB.&lt;br /&gt;
2. Reading and Understanding the data&lt;br /&gt;
&lt;br /&gt;
The data that we stored twitter_data.txt is in JSON format. JSON stands for JavaScript Object Notation. This format makes it easy to humans to read the data, and for machines to parse it. Below is an example for one tweet in JSON format. You can see that the tweet contains additional information in addition to the main text which in this example: &amp;quot;Yaayyy I learned some JavaScript today! #thatwasntsohard #yesitwas #stoptalkingtoyourself #hashbrown #hashtag&amp;quot;.&lt;br /&gt;
&lt;br /&gt;
{&amp;quot;created_at&amp;quot;:&amp;quot;Tue Jul 15 14:19:30 +0000 2014&amp;quot;,&amp;quot;id&amp;quot;:489051636304990208,&amp;quot;id_str&amp;quot;:&amp;quot;489051636304990208&amp;quot;,&amp;quot;text&amp;quot;:&amp;quot;Yaayyy I learned some JavaScript today! #thatwasntsohard #yesitwas #stoptalkingtoyourself #hashbrown #hashtag&amp;quot;,&amp;quot;source&amp;quot;:&amp;quot;\u003ca href=\&amp;quot;http:\/\/twitter.com\/download\/iphone\&amp;quot; rel=\&amp;quot;nofollow\&amp;quot;\u003eTwitter for iPhone\u003c\/a\u003e&amp;quot;,&amp;quot;truncated&amp;quot;:false,&amp;quot;in_reply_to_status_id&amp;quot;:null,&amp;quot;in_reply_to_status_id_str&amp;quot;:null,&amp;quot;in_reply_to_user_id&amp;quot;:null,&amp;quot;in_reply_to_user_id_str&amp;quot;:null,&amp;quot;in_reply_to_screen_name&amp;quot;:null,&amp;quot;user&amp;quot;:{&amp;quot;id&amp;quot;:2301702187,&amp;quot;id_str&amp;quot;:&amp;quot;2301702187&amp;quot;,&amp;quot;name&amp;quot;:&amp;quot;Toni Barlettano&amp;quot;,&amp;quot;screen_name&amp;quot;:&amp;quot;itsmetonib&amp;quot;,&amp;quot;location&amp;quot;:&amp;quot;Greater NYC Area&amp;quot;,&amp;quot;url&amp;quot;:&amp;quot;http:\/\/www.tonib.me&amp;quot;,&amp;quot;description&amp;quot;:&amp;quot;So Full of Art   |   \nToni Barlettano Creative Media + Design&amp;quot;,&amp;quot;protected&amp;quot;:false,&amp;quot;followers_count&amp;quot;:8,&amp;quot;friends_count&amp;quot;:25,&amp;quot;listed_count&amp;quot;:0,&amp;quot;created_at&amp;quot;:&amp;quot;Mon Jan 20 16:49:46 +0000 2014&amp;quot;,&amp;quot;favourites_count&amp;quot;:6,&amp;quot;utc_offset&amp;quot;:null,&amp;quot;time_zone&amp;quot;:null,&amp;quot;geo_enabled&amp;quot;:false,&amp;quot;verified&amp;quot;:false,&amp;quot;statuses_count&amp;quot;:20,&amp;quot;lang&amp;quot;:&amp;quot;en&amp;quot;,&amp;quot;contributors_enabled&amp;quot;:false,&amp;quot;is_translator&amp;quot;:false,&amp;quot;is_translation_enabled&amp;quot;:false,&amp;quot;profile_background_color&amp;quot;:&amp;quot;C0DEED&amp;quot;,&amp;quot;profile_background_image_url&amp;quot;:&amp;quot;http:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png&amp;quot;,&amp;quot;profile_background_image_url_https&amp;quot;:&amp;quot;https:\/\/abs.twimg.com\/images\/themes\/theme1\/bg.png&amp;quot;,&amp;quot;profile_background_tile&amp;quot;:false,&amp;quot;profile_image_url&amp;quot;:&amp;quot;http:\/\/pbs.twimg.com\/profile_images\/425313048320958464\/Z2GcderW_normal.jpeg&amp;quot;,&amp;quot;profile_image_url_https&amp;quot;:&amp;quot;https:\/\/pbs.twimg.com\/profile_images\/425313048320958464\/Z2GcderW_normal.jpeg&amp;quot;,&amp;quot;profile_link_color&amp;quot;:&amp;quot;0084B4&amp;quot;,&amp;quot;profile_sidebar_border_color&amp;quot;:&amp;quot;C0DEED&amp;quot;,&amp;quot;profile_sidebar_fill_color&amp;quot;:&amp;quot;DDEEF6&amp;quot;,&amp;quot;profile_text_color&amp;quot;:&amp;quot;333333&amp;quot;,&amp;quot;profile_use_background_image&amp;quot;:true,&amp;quot;default_profile&amp;quot;:true,&amp;quot;default_profile_image&amp;quot;:false,&amp;quot;following&amp;quot;:null,&amp;quot;follow_request_sent&amp;quot;:null,&amp;quot;notifications&amp;quot;:null},&amp;quot;geo&amp;quot;:null,&amp;quot;coordinates&amp;quot;:null,&amp;quot;place&amp;quot;:null,&amp;quot;contributors&amp;quot;:null,&amp;quot;retweet_count&amp;quot;:0,&amp;quot;favorite_count&amp;quot;:0,&amp;quot;entities&amp;quot;:{&amp;quot;hashtags&amp;quot;:[{&amp;quot;text&amp;quot;:&amp;quot;thatwasntsohard&amp;quot;,&amp;quot;indices&amp;quot;:[40,56]},{&amp;quot;text&amp;quot;:&amp;quot;yesitwas&amp;quot;,&amp;quot;indices&amp;quot;:[57,66]},{&amp;quot;text&amp;quot;:&amp;quot;stoptalkingtoyourself&amp;quot;,&amp;quot;indices&amp;quot;:[67,89]},{&amp;quot;text&amp;quot;:&amp;quot;hashbrown&amp;quot;,&amp;quot;indices&amp;quot;:[90,100]},{&amp;quot;text&amp;quot;:&amp;quot;hashtag&amp;quot;,&amp;quot;indices&amp;quot;:[101,109]}],&amp;quot;symbols&amp;quot;:[],&amp;quot;urls&amp;quot;:[],&amp;quot;user_mentions&amp;quot;:[]},&amp;quot;favorited&amp;quot;:false,&amp;quot;retweeted&amp;quot;:false,&amp;quot;filter_level&amp;quot;:&amp;quot;medium&amp;quot;,&amp;quot;lang&amp;quot;:&amp;quot;en&amp;quot;}&lt;br /&gt;
&lt;br /&gt;
For the remaining of this tutorial, we will be using 4 Python libraries json for parsing the data, pandas for data manipulation, matplotlib for creating charts, adn re for regular expressions. The json and re libraries are installed by default in Python. You should install pandas and matplotlib if you don&amp;#039;t have them in your machine.&lt;br /&gt;
&lt;br /&gt;
We will start first by uploading json and pandas using the commands below:&lt;br /&gt;
&lt;br /&gt;
import json&lt;br /&gt;
import pandas as pd&lt;br /&gt;
import matplotlib.pyplot as plt&lt;br /&gt;
&lt;br /&gt;
Next we will read the data in into an array that we call tweets.&lt;br /&gt;
&lt;br /&gt;
tweets_data_path = &amp;#039;../data/twitter_data.txt&amp;#039;&lt;br /&gt;
&lt;br /&gt;
tweets_data = []&lt;br /&gt;
tweets_file = open(tweets_data_path, &amp;quot;r&amp;quot;)&lt;br /&gt;
for line in tweets_file:&lt;br /&gt;
    try:&lt;br /&gt;
        tweet = json.loads(line)&lt;br /&gt;
        tweets_data.append(tweet)&lt;br /&gt;
    except:&lt;br /&gt;
        continue&lt;br /&gt;
&lt;br /&gt;
We can print the number of tweets using the command below. For the dataset that I prepared, the number is 71238.&lt;br /&gt;
&lt;br /&gt;
print len(tweets_data)&lt;br /&gt;
&lt;br /&gt;
Next, we will structure the tweets data into a pandas DataFrame to simplify the data manipulation. We will start by creating an empty DataFrame called tweets using the following command.&lt;br /&gt;
&lt;br /&gt;
tweets = pd.DataFrame()&lt;br /&gt;
&lt;br /&gt;
Next, we will add 3 columns to the tweets DataFrame called text, lang, and country. text column contains the tweet, lang column contains the language in which the tweet was written, and country the country from which the tweet was sent.&lt;br /&gt;
&lt;br /&gt;
tweets[&amp;#039;text&amp;#039;] = map(lambda tweet: tweet[&amp;#039;text&amp;#039;], tweets_data)&lt;br /&gt;
tweets[&amp;#039;lang&amp;#039;] = map(lambda tweet: tweet[&amp;#039;lang&amp;#039;], tweets_data)&lt;br /&gt;
tweets[&amp;#039;country&amp;#039;] = map(lambda tweet: tweet[&amp;#039;place&amp;#039;][&amp;#039;country&amp;#039;] if tweet[&amp;#039;place&amp;#039;] != None else None, tweets_data)&lt;br /&gt;
&lt;br /&gt;
Next, we will create 2 charts: The first one describing the Top 5 languages in which the tweets were written, and the second the Top 5 countries from which the tweets were sent.&lt;br /&gt;
&lt;br /&gt;
tweets_by_lang = tweets[&amp;#039;lang&amp;#039;].value_counts()&lt;br /&gt;
&lt;br /&gt;
fig, ax = plt.subplots()&lt;br /&gt;
ax.tick_params(axis=&amp;#039;x&amp;#039;, labelsize=15)&lt;br /&gt;
ax.tick_params(axis=&amp;#039;y&amp;#039;, labelsize=10)&lt;br /&gt;
ax.set_xlabel(&amp;#039;Languages&amp;#039;, fontsize=15)&lt;br /&gt;
ax.set_ylabel(&amp;#039;Number of tweets&amp;#039; , fontsize=15)&lt;br /&gt;
ax.set_title(&amp;#039;Top 5 languages&amp;#039;, fontsize=15, fontweight=&amp;#039;bold&amp;#039;)&lt;br /&gt;
tweets_by_lang[:5].plot(ax=ax, kind=&amp;#039;bar&amp;#039;, color=&amp;#039;red&amp;#039;)&lt;br /&gt;
&lt;br /&gt;
Alt Text&lt;br /&gt;
&lt;br /&gt;
tweets_by_country = tweets[&amp;#039;country&amp;#039;].value_counts()&lt;br /&gt;
&lt;br /&gt;
fig, ax = plt.subplots()&lt;br /&gt;
ax.tick_params(axis=&amp;#039;x&amp;#039;, labelsize=15)&lt;br /&gt;
ax.tick_params(axis=&amp;#039;y&amp;#039;, labelsize=10)&lt;br /&gt;
ax.set_xlabel(&amp;#039;Countries&amp;#039;, fontsize=15)&lt;br /&gt;
ax.set_ylabel(&amp;#039;Number of tweets&amp;#039; , fontsize=15)&lt;br /&gt;
ax.set_title(&amp;#039;Top 5 countries&amp;#039;, fontsize=15, fontweight=&amp;#039;bold&amp;#039;)&lt;br /&gt;
tweets_by_country[:5].plot(ax=ax, kind=&amp;#039;bar&amp;#039;, color=&amp;#039;blue&amp;#039;)&lt;br /&gt;
&lt;br /&gt;
Alt Text&lt;br /&gt;
3. Mining the tweets&lt;br /&gt;
&lt;br /&gt;
Our main goals in these text mining tasks are: compare the popularity of Python, Ruby and Javascript programming languages and to retrieve programming tutorial links. We will do this in 3 steps:&lt;br /&gt;
&lt;br /&gt;
    We will add tags to our tweets DataFrame in order to be able to manipualte the data easily.&lt;br /&gt;
    Target tweets that have &amp;quot;pogramming&amp;quot; or &amp;quot;tutorial&amp;quot; keywords.&lt;br /&gt;
    Extract links from the relevants tweets&lt;br /&gt;
&lt;br /&gt;
Adding Python, Ruby, and Javascript tags&lt;br /&gt;
&lt;br /&gt;
First, we will create a function that checks if a specific keyword is present in a text. We will do this by using regular expressions. Python provides a library for regular expression called re. We will start by importing this library&lt;br /&gt;
&lt;br /&gt;
import re&lt;br /&gt;
&lt;br /&gt;
Next we will create a function called word_in_text(word, text). This function return True if a word is found in text, otherwise it returns False.&lt;br /&gt;
&lt;br /&gt;
def word_in_text(word, text):&lt;br /&gt;
    word = word.lower()&lt;br /&gt;
    text = text.lower()&lt;br /&gt;
    match = re.search(word, text)&lt;br /&gt;
    if match:&lt;br /&gt;
        return True&lt;br /&gt;
    return False&lt;br /&gt;
&lt;br /&gt;
Next, we will add 3 columns to our tweets DataFrame.&lt;br /&gt;
&lt;br /&gt;
tweets[&amp;#039;python&amp;#039;] = tweets[&amp;#039;text&amp;#039;].apply(lambda tweet: word_in_text(&amp;#039;python&amp;#039;, tweet))&lt;br /&gt;
tweets[&amp;#039;javascript&amp;#039;] = tweets[&amp;#039;text&amp;#039;].apply(lambda tweet: word_in_text(&amp;#039;javascript&amp;#039;, tweet))&lt;br /&gt;
tweets[&amp;#039;ruby&amp;#039;] = tweets[&amp;#039;text&amp;#039;].apply(lambda tweet: word_in_text(&amp;#039;ruby&amp;#039;, tweet))&lt;br /&gt;
&lt;br /&gt;
We can calculate the number of tweets for each programming language as follows:&lt;br /&gt;
&lt;br /&gt;
print tweets[&amp;#039;python&amp;#039;].value_counts()[True]&lt;br /&gt;
print tweets[&amp;#039;javascript&amp;#039;].value_counts()[True]&lt;br /&gt;
print tweets[&amp;#039;ruby&amp;#039;].value_counts()[True]&lt;br /&gt;
&lt;br /&gt;
This returns: 21839 for python, 16154 for javascript and 31410 for ruby. We can make a simple comparaison chart by executing the following:&lt;br /&gt;
&lt;br /&gt;
prg_langs = [&amp;#039;python&amp;#039;, &amp;#039;javascript&amp;#039;, &amp;#039;ruby&amp;#039;]&lt;br /&gt;
tweets_by_prg_lang = [tweets[&amp;#039;python&amp;#039;].value_counts()[True], tweets[&amp;#039;javascript&amp;#039;].value_counts()[True], tweets[&amp;#039;ruby&amp;#039;].value_counts()[True]]&lt;br /&gt;
&lt;br /&gt;
x_pos = list(range(len(prg_langs)))&lt;br /&gt;
width = 0.8&lt;br /&gt;
fig, ax = plt.subplots()&lt;br /&gt;
plt.bar(x_pos, tweets_by_prg_lang, width, alpha=1, color=&amp;#039;g&amp;#039;)&lt;br /&gt;
&lt;br /&gt;
# Setting axis labels and ticks&lt;br /&gt;
ax.set_ylabel(&amp;#039;Number of tweets&amp;#039;, fontsize=15)&lt;br /&gt;
ax.set_title(&amp;#039;Ranking: python vs. javascript vs. ruby (Raw data)&amp;#039;, fontsize=10, fontweight=&amp;#039;bold&amp;#039;)&lt;br /&gt;
ax.set_xticks([p + 0.4 * width for p in x_pos])&lt;br /&gt;
ax.set_xticklabels(prg_langs)&lt;br /&gt;
plt.grid()&lt;br /&gt;
&lt;br /&gt;
Alt Text&lt;br /&gt;
&lt;br /&gt;
This shows, that the keyword ruby is the most popular, followed by python then javascript. However, the tweets DataFrame contains information about all tweets that contains one of the 3 keywords and doesn&amp;#039;t restrict the information to the programming languages. For example, there are a lot tweets that contains the keyword ruby and that are related to a political scandal called Rubygate. In the next section, we will filter the tweets and re-run the analysis to make a more accurate comparison.&lt;br /&gt;
Targeting relevant tweets&lt;br /&gt;
&lt;br /&gt;
We are intersted in targetting tweets that are related to programming languages. Such tweets often have one of the 2 keywords: &amp;quot;programming&amp;quot; or &amp;quot;tutorial&amp;quot;. We will create 2 additional columns to our tweets DataFrame where we will add this information.&lt;br /&gt;
&lt;br /&gt;
tweets[&amp;#039;programming&amp;#039;] = tweets[&amp;#039;text&amp;#039;].apply(lambda tweet: word_in_text(&amp;#039;programming&amp;#039;, tweet))&lt;br /&gt;
tweets[&amp;#039;tutorial&amp;#039;] = tweets[&amp;#039;text&amp;#039;].apply(lambda tweet: word_in_text(&amp;#039;tutorial&amp;#039;, tweet))&lt;br /&gt;
&lt;br /&gt;
We will add an additional column called relevant that take value True if the tweet has either &amp;quot;programming&amp;quot; or &amp;quot;tutorial&amp;quot; keyword, otherwise it takes value False.&lt;br /&gt;
&lt;br /&gt;
tweets[&amp;#039;relevant&amp;#039;] = tweets[&amp;#039;text&amp;#039;].apply(lambda tweet: word_in_text(&amp;#039;programming&amp;#039;, tweet) or word_in_text(&amp;#039;tutorial&amp;#039;, tweet))&lt;br /&gt;
&lt;br /&gt;
We can print the counts of relevant tweet by executing the commands below.&lt;br /&gt;
&lt;br /&gt;
print tweets[&amp;#039;programming&amp;#039;].value_counts()[True]&lt;br /&gt;
print tweets[&amp;#039;tutorial&amp;#039;].value_counts()[True]&lt;br /&gt;
print tweets[&amp;#039;relevant&amp;#039;].value_counts()[True]&lt;br /&gt;
&lt;br /&gt;
This returns, 871 for programming column, 511 for tutorial column, and 1356 for relevant column.&lt;br /&gt;
&lt;br /&gt;
We can compare now the popularity of the programming languages by executing the commands below.&lt;br /&gt;
&lt;br /&gt;
print tweets[tweets[&amp;#039;relevant&amp;#039;] == True][&amp;#039;python&amp;#039;].value_counts()[True]&lt;br /&gt;
print tweets[tweets[&amp;#039;relevant&amp;#039;] == True][&amp;#039;javascript&amp;#039;].value_counts()[True]&lt;br /&gt;
print tweets[tweets[&amp;#039;relevant&amp;#039;] == True][&amp;#039;ruby&amp;#039;].value_counts()[True]&lt;br /&gt;
&lt;br /&gt;
Python is the most popular with a count of 732, followed by javascript by a count of 473, and ruby by a count of 185. We can make a comparaison graph by executing the commands below:&lt;br /&gt;
&lt;br /&gt;
tweets_by_prg_lang = [tweets[tweets[&amp;#039;relevant&amp;#039;] == True][&amp;#039;python&amp;#039;].value_counts()[True], &lt;br /&gt;
                      tweets[tweets[&amp;#039;relevant&amp;#039;] == True][&amp;#039;javascript&amp;#039;].value_counts()[True], &lt;br /&gt;
                      tweets[tweets[&amp;#039;relevant&amp;#039;] == True][&amp;#039;ruby&amp;#039;].value_counts()[True]]&lt;br /&gt;
x_pos = list(range(len(prg_langs)))&lt;br /&gt;
width = 0.8&lt;br /&gt;
fig, ax = plt.subplots()&lt;br /&gt;
plt.bar(x_pos, tweets_by_prg_lang, width,alpha=1,color=&amp;#039;g&amp;#039;)&lt;br /&gt;
ax.set_ylabel(&amp;#039;Number of tweets&amp;#039;, fontsize=15)&lt;br /&gt;
ax.set_title(&amp;#039;Ranking: python vs. javascript vs. ruby (Relevant data)&amp;#039;, fontsize=10, fontweight=&amp;#039;bold&amp;#039;)&lt;br /&gt;
ax.set_xticks([p + 0.4 * width for p in x_pos])&lt;br /&gt;
ax.set_xticklabels(prg_langs)&lt;br /&gt;
plt.grid()&lt;br /&gt;
&lt;br /&gt;
Alt Text&lt;br /&gt;
Extracting links from the relevants tweets&lt;br /&gt;
&lt;br /&gt;
Now that we extracted the relevant tweets, we want to retrieve links to programming tutorials. We will start by creating a function that uses regular expressions for retrieving link that start with &amp;quot;http://&amp;quot; or &amp;quot;https://&amp;quot; from a text. This function will return the url if found, otherwise it returns an empty string.&lt;br /&gt;
&lt;br /&gt;
def extract_link(text):&lt;br /&gt;
    regex = r&amp;#039;https?://[^\s&amp;lt;&amp;gt;&amp;quot;]+|www\.[^\s&amp;lt;&amp;gt;&amp;quot;]+&amp;#039;&lt;br /&gt;
    match = re.search(regex, text)&lt;br /&gt;
    if match:&lt;br /&gt;
        return match.group()&lt;br /&gt;
    return &amp;#039;&amp;#039;&lt;br /&gt;
&lt;br /&gt;
Next, we will add a column called link to our tweets DataFrame. This column will contain the urls information.&lt;br /&gt;
&lt;br /&gt;
tweets[&amp;#039;link&amp;#039;] = tweets[&amp;#039;text&amp;#039;].apply(lambda tweet: extract_link(tweet))&lt;br /&gt;
&lt;br /&gt;
Next we will create a new DataFrame called tweets_relevant_with_link. This DataFrame is a subset of tweets DataFrame and contains all relevant tweets that have a link.&lt;br /&gt;
&lt;br /&gt;
tweets_relevant = tweets[tweets[&amp;#039;relevant&amp;#039;] == True]&lt;br /&gt;
tweets_relevant_with_link = tweets_relevant[tweets_relevant[&amp;#039;link&amp;#039;] != &amp;#039;&amp;#039;]&lt;br /&gt;
&lt;br /&gt;
We can now print out all links for python, javascript, and ruby by executing the commands below:&lt;br /&gt;
&lt;br /&gt;
print tweets_relevant_with_link[tweets_relevant_with_link[&amp;#039;python&amp;#039;] == True][&amp;#039;link&amp;#039;]&lt;br /&gt;
print tweets_relevant_with_link[tweets_relevant_with_link[&amp;#039;javascript&amp;#039;] == True][&amp;#039;link&amp;#039;]&lt;br /&gt;
print tweets_relevant_with_link[tweets_relevant_with_link[&amp;#039;ruby&amp;#039;] == True][&amp;#039;link&amp;#039;]&lt;br /&gt;
&lt;br /&gt;
This returns 644 links for python, 413 links for javascript, and 136 for ruby. Below are some python related links&lt;br /&gt;
&lt;br /&gt;
    http://t.co/WmTccp3rb1&lt;br /&gt;
    http://t.co/5qE3vPAy7N&lt;br /&gt;
    http://t.co/1rvmhqPsXD&lt;br /&gt;
    http://t.co/S9aq2AahjH&lt;br /&gt;
    http://t.co/ORg6IL8qXT&lt;br /&gt;
    http://t.co/EnK2UlDcJ8&lt;br /&gt;
    http://t.co/gtu9WVQCLK&lt;br /&gt;
    http://t.co/xvMTzqLGg0&lt;br /&gt;
    http://t.co/bgMZ0jIpA7&lt;br /&gt;
    http://t.co/O03VrRyEAb&lt;br /&gt;
    http://t.co/CfWYefZML7&lt;br /&gt;
    http://t.co/N3iU2ZYa2z&lt;br /&gt;
    http://t.co/S9aq2AahjH&lt;br /&gt;
    http://t.co/ytms7bcsQV&lt;br /&gt;
&lt;br /&gt;
Conclusion&lt;br /&gt;
&lt;br /&gt;
In this tutorial, we covered many techniques used in text mining. The code provide in this post could be modified to create a deeper analysis or could be adapted to another use case. For those who want to go further in text mining, I recommend to follow up by studying regular expressions.&lt;br /&gt;
&lt;br /&gt;
You can find the source code from this tutorial in this github repository github link. &lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
&lt;br /&gt;
==Referensi==&lt;br /&gt;
&lt;br /&gt;
* http://adilmoujahid.com/posts/2014/07/twitter-analytics/&lt;/div&gt;</summary>
		<author><name>Onnowpurbo</name></author>
	</entry>
</feed>