Research on Sentiment Classification of Chinese Micro Blog Based on Machine Learning

Duolin Liu
February 2013
International Journal of Digital Content Technology & its Applic;Feb2013, Vol. 7 Issue 3, p395
Academic Journal
This study presents an empirical comparison of three machine learning algorithms, three feature selection algorithms, and three feature weighting algorithms for sentiment classification of micro blog posts. The experimental results show that SVM and Naïve Bayes each have their own advantages, depending on the feature weighting algorithm used, and that the Information Gain (IG) feature selection algorithm is clearly more effective than the other methods. Taking the three factors together, the most effective combination for sentiment classification of micro blog posts is SVM with IG feature selection and TF-IDF (Term Frequency-Inverse Document Frequency) feature weighting. The study also compares how well the classification model generalizes between micro blog comments and ordinary comments in the film domain; the results show that sentiment classification performance depends on the style of the reviews.
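
The abstract names IG feature selection and TF-IDF weighting but gives no formulas or data. The following is a minimal sketch of those two steps using standard textbook definitions, not the author's implementation. The toy corpus, the function names, and the choice of k are all assumptions made for illustration.

```python
import math
from collections import Counter

# Hypothetical toy corpus (not from the paper): tokenized posts,
# label 1 = positive, 0 = negative.
DOCS = [
    (["good", "film", "love"], 1),
    (["great", "film", "good"], 1),
    (["bad", "film", "boring"], 0),
    (["boring", "plot", "bad"], 0),
]

def entropy(labels):
    """Shannon entropy in bits of a list of class labels."""
    if not labels:
        return 0.0
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

def information_gain(term, docs):
    """Entropy reduction from knowing whether `term` occurs in a document."""
    present = [y for tokens, y in docs if term in tokens]
    absent = [y for tokens, y in docs if term not in tokens]
    n = len(docs)
    return (entropy([y for _, y in docs])
            - len(present) / n * entropy(present)
            - len(absent) / n * entropy(absent))

def select_features(docs, k):
    """Keep the k terms with the highest IG (ties broken alphabetically)."""
    vocab = {t for tokens, _ in docs for t in tokens}
    return sorted(vocab, key=lambda t: (-information_gain(t, docs), t))[:k]

def tfidf_vector(tokens, docs, features):
    """TF-IDF weights of one document over the selected features."""
    n = len(docs)
    tf = Counter(tokens)
    vec = []
    for t in features:
        # Document frequency is never zero: features come from the corpus vocabulary.
        df = sum(1 for d, _ in docs if t in d)
        vec.append(tf[t] / len(tokens) * math.log(n / df))
    return vec

features = select_features(DOCS, k=3)
vectors = [tfidf_vector(tokens, DOCS, features) for tokens, _ in DOCS]
```

In a full pipeline, vectors like these would be passed to a classifier such as an SVM. That step is left out here because it would normally rely on an external library.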


Related Articles

  • A Survey Paper: Areas, Techniques and Challenges of Opinion Mining. Rashid, Ayesha; Anwer, Naveed; Iqbal, Muddaser; Sher, Muhammad // International Journal of Computer Science Issues (IJCSI);Nov2013, Vol. 10 Issue 6, p18 

    Opinion Mining is a promising discipline, defined as an intersection of information retrieval and computational linguistic techniques to deal with the opinions expressed in a document. The field aims at solving the problems related to opinions about products, politicians in newsgroup posts, review...

  • Analysis and Sentiment Detection in Online Reviews of Tax Professionals: A Comparison of Three Software Packages. Witherspoon, Candace L.; Stone, Dan N. // Journal of Emerging Technologies in Accounting;2013, Vol. 10, p89 

    This study analyzes the "tax talk" in online client reviews of tax preparers, and evaluates off-the-shelf (SentiStrength, LIWC2007, and DICTION 6.0) and customized software packages' detection of sentiment in these reviews. Compared to human-coded sentiment, three off-the-shelf programs poorly...

  • Sentiment Analysis on News Comments Based on Supervised Learning Method. Yan Zhao; Suyu Dong; Leixiao Li // International Journal of Multimedia & Ubiquitous Engineering;2014, Vol. 9 Issue 7, p333 

    Up to now, sentiment analysis has become one of the most active research areas in NLP, and many researchers have conducted sentiment analysis for foreign language documents. Compared with the research on foreign language documents, there are few studies on sentiment classification of Chinese document,...

  • Semi-supervised Sentiment Classification using Ranked Opinion Words. Suke Li; Yanbing Jiang // International Journal of Database Theory & Application;Dec2013, Vol. 6 Issue 6, p51 

    This work proposes a semi-supervised sentiment classification method which is based on the co-training framework. The proposed method needs to construct three sentiment classifiers. We use common text features to construct the first classifier. We extract opinion words from consumer reviews, and...

  • Binary Classification with Class-Dependent Misclassification Cost and Class-Dependent Reject Cost. Enhui Zheng; Cong Zhang; Xiai Chen; Jian Sun; Huijuan Lu // International Journal of Digital Content Technology & its Applic;Nov2012, Vol. 6 Issue 20, p714 

    To minimize "0-1" loss, most traditional classification algorithms implicitly assume that all errors cost equally and all classification results are acceptable. However, in some real-world applications the assumption is not reasonable, such as medical diagnosis, face diagnosis and fraud...

  • Adaptive ensemble with trust networks and collaborative recommendations. Zou, Haitao; Gong, Zhiguo; Zhang, Nan; Li, Qing; Rao, Yanghui // Knowledge & Information Systems;Sep2015, Vol. 44 Issue 3, p663 

    Several existing recommender algorithms combine collaborative filtering and social/trust networks together in order to overcome the problems caused by data scarcity and to produce more effective recommendations for users. In general, those methods fuse a user's own taste and his trusted...

  • Research of the Improved Adaboost Algorithm Based on Unbalanced Data. Shang Fuhua // International Journal of Computer Science & Network Security;May2014, Vol. 14 Issue 5, p14 

    A large amount of non-equilibrium data exists in the real world. Because traditional classification methods assume class balance and equal misclassification costs across categories, and use evaluation criteria based on the accuracy of the overall sample...

  • Feature Selection Method Based on Improved Document Frequency. Wei Zheng; Guohe Feng // Telkomnika;Dec2014, Vol. 12 Issue 4, p905 

    Feature selection is an important part of the process of text classification, and the evaluation function has a direct impact on the quality of feature selection. Document frequency (DF) is one of several methods commonly used for feature selection; its shortcoming is the lack of...

  • Research on the Chinese Word Segmentation System Based on Incremental Learning. Fanjin Mai; Shitong Wu; Laiyue Wang // Applied Mechanics & Materials;2014, Issue 602-605, p3469 

    In order to cope with the increasing size of the training corpus and adapt to the requirements of incremental learning, this paper introduces a feature selection algorithm of maximum entropy model into the research of Chinese word segmentation technology, designs and implements a Chinese word...

