NLTK: Sentiment Strength Detection in Bahasa Indonesia
From OnnoCenterWiki
Revision as of 03:36, 25 February 2017
SentiStrengthID
Sentiment Strength Detection in Bahasa Indonesia. This is an unsupervised version of SentiStrength (http://sentistrength.wlv.ac.uk/) for Bahasa Indonesia. Core features:
- Sentiment Lookup
- Negation Word Lookup
- Booster Word Lookup
- Emoticon Lookup
- Idiom Lookup
- Question Word Lookup
- Slang Word Lookup
- Spelling Correction (optional) using Peter Norvig's spelling corrector (http://norvig.com/spell-correct.html)
- Negative emotion is ignored in questions
- Exclamation marks count as +2
- Repeated Punctuation boosts sentiment
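As a rough illustration of how the sentiment, negation and booster lookups listed above can combine, here is a minimal sketch. It is not the SentiStrengthID implementation: the function name `score_sentence`, the tiny word lists and every weight are hypothetical, and the emoticon, idiom, question and punctuation rules are left out. It only assumes the SentiStrength convention of reporting a positive strength from 1 to 5 and a negative strength from -1 to -5.

```python
# Toy lexicons. The words are ordinary Indonesian words, but the weights
# are hypothetical and are NOT taken from the SentiStrengthID word lists.
SENTIMENT = {"bagus": 3, "senang": 4, "buruk": -3, "benci": -4}
NEGATION = {"tidak", "bukan"}
BOOSTER = {"sangat": 1, "banget": 1}


def score_sentence(text):
    """Return (positive, negative) strengths in the ranges 1..5 and -1..-5."""
    words = text.lower().split()
    pos, neg = 1, -1  # neutral baseline on both scales
    for i, word in enumerate(words):
        if word not in SENTIMENT:
            continue
        score = SENTIMENT[word]
        prev_word = words[i - 1] if i > 0 else ""
        next_word = words[i + 1] if i + 1 < len(words) else ""
        # Booster lookup: a booster right before or after the word
        # pushes its score further away from zero.
        boost = BOOSTER.get(prev_word, 0) + BOOSTER.get(next_word, 0)
        score = score + boost if score > 0 else score - boost
        # Negation lookup: a negation word among the two preceding
        # words flips the polarity.
        if any(w in NEGATION for w in words[max(0, i - 2):i]):
            score = -score
        # Keep the strongest positive and strongest negative word seen.
        pos = max(pos, min(score, 5))
        neg = min(neg, max(score, -5))
    return pos, neg


if __name__ == "__main__":
    for sentence in ("filmnya sangat bagus", "pelayanannya tidak bagus"):
        print(sentence, score_sentence(sentence))
```

Keeping the positive and negative strengths separate, instead of summing them, lets a sentence that contains both a strongly positive and a strongly negative word report both.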
Ignored rules:
- Repeated letters (more than two) boost the sentiment score. This rule is not applied because my own pre-processing removes a word's extra characters.
- A score of +2, -2 for the word "miss". This does not apply in Bahasa Indonesia.
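The pre-processing that removes a word's extra characters is not shown on this page. One plausible form, given here only as an assumption, collapses any run of three or more identical characters into one, so an elongated word matches its lexicon entry again. The function name `squeeze_repeats` is hypothetical.

```python
import re


def squeeze_repeats(word):
    """Collapse a run of three or more identical characters into one.

    Runs of exactly two are left alone, because doubled letters are
    legitimate in Indonesian words such as "maaf".
    """
    return re.sub(r"(.)\1{2,}", r"\1", word)


if __name__ == "__main__":
    for w in ("bagusssss", "mantaaap", "maaf"):
        print(w, "->", squeeze_repeats(w))
```

Once elongated words are normalised this way, the information the repeated-letter rule relies on is gone, which is consistent with that rule being skipped.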
Warning!
This is work in progress. Experimental for my Master Thesis
Modify the Source Code
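import argparse

def parse_args():
    # -i / --infile: path of the text file to score
    parser = argparse.ArgumentParser()
    parser.add_argument('-i', '--infile', default='', help='input filename')
    return parser.parse_args()

def main():
    args = parse_args()
    infile = args.infile
    # Read the input file and split it into lines, one text per line.
    # (Iterating over the raw string would yield single characters.)
    with open(infile, 'r') as f:
        lines = f.read().splitlines()
    # sentiStrength and spellCheck are assumed to be the classes
    # defined earlier in the same SentiStrengthID source file.
    ss = sentiStrength()
    sc = spellCheck()
    for t in lines:
        print(ss.main(t))
    print("=====================")
    print(ss.getSentimenScore())

main()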