In search of reputation assessment: experiences with polarity classification in RepLab 2013


Bibliographic details
Main author:	Saias, José (author)
Format:	article
Language:	eng
Published:	2014
Subjects:	opinion mining, reputation assessment, NLP, Machine Learning
Full text:	http://hdl.handle.net/10174/10352
Country:	Portugal
OAI:	oai:dspace.uevora.pt:10174/10352

Description
Abstract:	The diue system uses a supervised Machine Learning approach for the polarity classification subtask of RepLab. We used the Python NLTK for preprocessing, including file parsing, text analysis and feature extraction. Our best solution is a mixed strategy that combines bag-of-words with a limited set of features based on sentiment lexicons and superficial text analysis. The system begins by applying tokenization and lemmatization. Each tweet is then analyzed and 18 features are obtained, related to the presence of polarized terms, negation before a polarized expression and references to the entity. For the first run, learning and classification were performed with the Decision Tree algorithm from the NLTK framework. In the second run, we used a pipeline of classifiers. The first classifier applies Naive Bayes to a bag-of-words feature model built from the 1500 most frequent words in the training set. The second classifier uses the features from the first run plus one additional feature holding the result of the previous classifier. Our system's best result had an accuracy of 0.54694 and an F measure of 0.31506.
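
The feature design described in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the tiny lexicons, the negation list, the whitespace tokenizer (a stand-in for the NLTK tokenization and lemmatization step) and all function and feature names are hypothetical, and only four of the 18 features are approximated.

```python
from collections import Counter

# Hypothetical toy lexicons; the abstract does not name the lexicons used.
POSITIVE = {"good", "great", "love"}
NEGATIVE = {"bad", "poor", "hate"}
NEGATIONS = {"not", "no", "never"}
PUNCT = ".,!?;:\"'()"


def tokenize(text):
    # Simplified stand-in for NLTK tokenization and lemmatization.
    stripped = (tok.strip(PUNCT).lower() for tok in text.split())
    return [tok for tok in stripped if tok]


def top_words(tweets, n=1500):
    # Vocabulary made of the n most frequent words in the training tweets.
    counts = Counter(tok for tweet in tweets for tok in tokenize(tweet))
    return [word for word, _ in counts.most_common(n)]


def bow_features(text, vocabulary):
    # Boolean bag-of-words features, one per vocabulary word.
    present = set(tokenize(text))
    return {f"contains({w})": (w in present) for w in vocabulary}


def lexicon_features(text, entity):
    # Illustrative subset of the lexicon and surface features.
    tokens = tokenize(text)
    polarized = POSITIVE | NEGATIVE
    negated = any(
        prev in NEGATIONS and tok in polarized
        for prev, tok in zip(tokens, tokens[1:])
    )
    return {
        "has_positive": any(t in POSITIVE for t in tokens),
        "has_negative": any(t in NEGATIVE for t in tokens),
        "negation_before_polarized": negated,
        "mentions_entity": entity.lower() in tokens,
    }


def stacked_features(text, entity, first_stage_label):
    # Second-stage input: lexicon features plus the first classifier's output.
    features = lexicon_features(text, entity)
    features["first_stage_label"] = first_stage_label
    return features
```

Dictionaries of this shape match the feature-set format that NLTK classifiers such as `nltk.NaiveBayesClassifier` and `nltk.DecisionTreeClassifier` accept in their `train` methods, so under these assumptions the first stage would be trained on `bow_features` output and the second on `stacked_features` output.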
