TITLE

Performance of Decision Trees on Arabic Text Categorization

AUTHOR(S)
Harrag, Fouzi; AI-Qawasrnah, Eyas; Hamdi-Cherif, Aboubekeur
PUB. DATE
December 2009
SOURCE
Journal of Digital Information Management;Dec2009, Vol. 7 Issue 6, p377
SOURCE TYPE
Academic Journal
DOC. TYPE
Article
ABSTRACT
Text classification is the task of assigning a document to one or more of pre-defined categories based on its contents. This paper presents the results of classifying Arabic text documents using a decision tree algorithm. Experiments are performed over two self collected data corpus and the results show that the suggested hybrid approach of Document Frequency Thresholding using an embedded information gain criterion of the decision tree algorithm is the preferable feature selection criterion. The average accuracy of using Feature selection is 0.93 for the scientific corpus, while for the literary corpus the average accuracy is 0.91. We also conclude that the effectiveness of the decision tree classifier was increased as we increase the training size.
ACCESSION #
47637858

 

Related Articles

  • GENETICALLY ENGINEERED DECISION TREES: POPULATION DIVERSITY PRODUCES SMARTER TREES. Zhiwei Fu; Golden, Bruce; Lele, Shreevardhan; Raghavan, S.; Wasil, Edward // Operations Research;Nov/Dec2003, Vol. 51 Issue 6, p894 

    When considering a decision tree for the purpose of classification, accuracy is usually the sole performance measure used in the construction process. In this paper, we introduce the idea of combining a decision tree's expected value and variance in a new probabilistic measure for assessing the...

  • Comprehensive Decision Tree Models in Bioinformatics. Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter // PLoS ONE;Mar2012, Vol. 7 Issue 3, p1 

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification...

  • Comprehensive Decision Tree Models in Bioinformatics. Stiglic, Gregor; Kocbek, Simon; Pernek, Igor; Kokol, Peter // PLoS ONE;Mar2012, Vol. 7 Issue 3, p1 

    Purpose: Classification is an important and widely used machine learning technique in bioinformatics. Researchers and other end-users of machine learning software often prefer to work with comprehensible models where knowledge extraction and explanation of reasoning behind the classification...

  • Usefulness of decision tree analysis. Buckley, Adrian // Accountancy;Jul72, Vol. 83 Issue 947, p80 

    The article discusses decision tree analysis, with an emphasis on its role in complex sequential investment decisions. The author has addressed some of the issues which confronts investment decisions, including company's decision to expand, the future development of the product in the market,...

  • Predicting Protein Function using Decision Tree. Singh, Manpreet; Wadhwa, Parminder Kaur; Kaur, Surinder // Proceedings of World Academy of Science: Engineering & Technolog;May2008, Vol. 41, p350 

    The drug discovery process starts with protein identification because proteins are responsible for many functions required for maintenance of life. Protein identification further needs determination of protein function. Proposed method develops a classifier for human protein function prediction....

  • Usefulness of decision tree analysis. Buckley, Adrian // Accountancy;Jul1972, Vol. 83 Issue 947, p80 

    The article discusses decision tree analysis, with an emphasis on its role in complex sequential investment decisions. The author has addressed some of the issues which confronts investment decisions, including company's decision to expand, the future development of the product in the market,...

  • Against Classification Attacks: A Decision Tree Pruning Approach to Privacy Protection in Data Mining. Xiao-Bai Li; Sarkar, Sumit // Operations Research;Nov2009, Vol. 57 Issue 6, p1496 

    Data-mining techniques can be used not only to study collective behavior about customers, but also to discover private information about individuals. In this study, we demonstrate that decision trees, a popular classification technique for data mining, can be used to effectively reveal...

  • Query Join Processing over Uncertain Data for Decision Tree Classifiers. Kumar, V. Yaswanth; Kalyani, G. // International Journal of Computer Science Engineering & Technolo;Jun2012, Vol. 2 Issue 6, p1249 

    Traditional decision tree classifiers work with the data whose values are known and precise. We can also extend those classifiers to handle data with uncertain information. Value uncertainty arises in many applications during the data collection process. Example sources of uncertainty...

  • COMPUTING EQUILIBRIA OF TWO-PERSON GAMES FROM THE EXTENSIVE FORM. Wilson, Robert // Management Science;Mar72, Vol. 18 Issue 7, p448 

    The Lemke-Howson algorithm for computing equilibria of finite 2-person noncooperative games in normal form is modified to restrict the computations to the ordinarily small portion corresponding to the strategies actually used by the players, and further it is shown that in games with perfect...

Share

Read the Article

Courtesy of VIRGINIA BEACH PUBLIC LIBRARY AND SYSTEM

Sorry, but this item is not currently available from your library.

Try another library?
Sign out of this library

Other Topics