Pruning Techniques in Associative Classification: Survey and Comparison

Thabtah, Fadi
September 2006
Journal of Digital Information Management;Sep2006, Vol. 4 Issue 3, p197
Academic Journal
Association rule discovery and classification are common data mining tasks. Integrating the two, an approach known as associative classification, is promising because it derives classifiers whose accuracy is highly competitive with that of traditional classification approaches such as rule induction and decision trees. However, the classifiers generated by associative classification are often large, which makes pruning an essential task. In this paper, we survey the rule pruning methods used by current associative classification techniques. Further, we compare the effect of three pruning methods (database coverage, pessimistic error estimation, and lazy pruning) on the accuracy rate and the number of rules derived from different classification data sets. Results from experiments on data sets from the UCI data collection indicate that lazy pruning algorithms may produce slightly more accurate classifiers than those that use database coverage or pessimistic error pruning. However, the practical use of such classifiers is limited because they are difficult for the end user to understand and maintain.
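As a rough illustration of the first pruning method named in the abstract, the following sketch shows one common reading of database coverage pruning, as popularised by CBA-style associative classifiers: rank the rules, keep a rule only if it correctly classifies at least one training case that is still uncovered, then discard every case the rule covers. The abstract itself does not specify the procedure, so the ranking order (confidence, then support, then shorter antecedent) and all names here (`Rule`, `database_coverage`) are assumptions made for the example, not the paper's implementation.

```python
from typing import NamedTuple


class Rule(NamedTuple):
    """Hypothetical rule record: antecedent items imply a class label."""
    antecedent: frozenset
    label: str
    confidence: float
    support: float


def database_coverage(rules, training):
    """Prune `rules` against `training`, a list of (items, label) pairs.

    Assumed ranking: higher confidence first, then higher support,
    then fewer items in the antecedent.
    """
    ranked = sorted(
        rules,
        key=lambda r: (-r.confidence, -r.support, len(r.antecedent)),
    )
    remaining = list(training)
    kept = []
    for rule in ranked:
        if not remaining:
            break  # every training case is already covered
        # Cases whose items contain the whole antecedent are "covered".
        covered = [row for row in remaining if rule.antecedent <= row[0]]
        # Keep the rule only if it classifies a covered case correctly.
        if any(label == rule.label for _, label in covered):
            kept.append(rule)
            remaining = [
                row for row in remaining if not rule.antecedent <= row[0]
            ]
    return kept


# Toy data, invented purely for illustration.
training = [
    (frozenset({"a", "b"}), "yes"),
    (frozenset({"a", "c"}), "yes"),
    (frozenset({"b", "c"}), "no"),
    (frozenset({"c"}), "no"),
]
rules = [
    Rule(frozenset({"b"}), "yes", 0.50, 0.25),
    Rule(frozenset({"c"}), "no", 0.67, 0.50),
    Rule(frozenset({"a", "b"}), "yes", 1.00, 0.25),
    Rule(frozenset({"a"}), "yes", 1.00, 0.50),
]
kept = database_coverage(rules, training)
for rule in kept:
    print(sorted(rule.antecedent), "->", rule.label)
```

On this toy input, four candidate rules shrink to two. That is the kind of reduction in classifier size the abstract contrasts with lazy pruning, which retains far more rules.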


Related Articles

  • An Optimistic Data Mining Approach for Handling Large Data Set using Data Partitioning Techniques. Patil, Dipak V.; Bichkar, R. S. // International Journal of Computer Applications;Jun2011, Vol. 24, p29 

    The use of the Internet for various purposes leads to collection of large volume of data. The knowledge contents of large data can be utilized to improve decision-making process of an organization. The knowledge discovery on this high volume data becomes very slow, as it has to be done serially...

  • Mining Educational Data to Analyze Students' Performance. Baradwaj, Brijesh Kumar; Pal, Saurabh // International Journal of Advanced Computer Science & Application;Jun2011, Vol. 2 Issue 6, p63 

    The main objective of higher education institutions is to provide quality education to its students. One way to achieve highest level of quality in higher education system is by discovering knowledge for prediction regarding enrolment of students in a particular course, alienation of traditional...

  • COMPARATIVE STUDY OF DATA MINING ALGORITHMS FOR HIGH DIMENSIONAL DATA ANALYSIS. Smitha, T.; Sundaram, V. // International Journal of Advances in Engineering & Technology;Sep2012, Vol. 4 Issue 2, p173 

    The main objective of this research paper is to prove the effectiveness of high dimensional data analysis and of different algorithms in the prediction process of data mining. The approach taken for this survey includes an extensive literature search on published papers as well as text books in the...

  • EXTRACTING USEFUL RULES THROUGH IMPROVED DECISION TREE INDUCTION USING INFORMATION ENTROPY. Ali, Mohd. Mahmood; Qaseem, Mohd. S.; Rajamani, Lakshmi; Govardhan, A. // International Journal of Information Sciences & Techniques;Jan2013, Vol. 3 Issue 1, p27 

    Classification is widely used technique in the data mining domain, where scalability and efficiency are the immediate problems in classification algorithms for large databases. We suggest improvements to the existing C4.5 decision tree algorithm. In this paper attribute oriented induction (AOI)...

  • Analyzing Data Mining Algorithms in SQL Server. Gupta, B. Sreekanth; Moorthy, M. Narayana; Babu, M. Rajasekhara // International Journal of Research & Reviews in Computer Science;Jun2011, Vol. 2 Issue 3, p670 

    The latest database buzzword is data mining, which is used to find and predict hidden patterns in data, to find trends, and to help analysts design better algorithms and take more correct and faster actions to prevent wastage. This paper provides an introduction to Data mining defines...

  • Incorporating heuristics for efficient search space pruning in frequent itemset mining strategies. Kalpana, B.; Nadarajan, R. // Current Science (00113891);1/10/2008, Vol. 94 Issue 1, p97 

    Recent studies have shown an increasing thrust on the development of algorithms based on a lattice framework. Efficient pruning of the search space is an important factor which determines the performance of such algorithms. In this communication, we present certain lattice theoretical concepts...

  • An Attribute-Centre Based Decision Tree Classification Algorithm. Silahtaroğlu, Gökhan // World Academy of Science, Engineering & Technology;Aug2009, Issue 32, p302 

    Decision tree algorithms have a very important place in the classification models of data mining. In the literature, algorithms use the entropy concept or the Gini index to form the tree. The shape of the classes and their closeness to each other are some of the factors that affect the performance of the algorithm. In...

  • An Efficient Way of Frequent Embedded Subtree Mining on Biological Data. Wei Liu; Ling Chen // Journal of Computers;Dec2011, Vol. 6 Issue 12, p2574 

    Data mining provides biological research a useful information analyzing tool. The key factors which influence the performance of biological data mining approaches are the large-scale of biological data and the high similarities among patterns mined. In this paper, we present an efficient...

  • The Application of Data Mining Technology in the Customer Churn Prewarning. Lichang Zhen; Xin Gao; Yiming Wang; Yongchun Gao // Applied Mechanics & Materials;2014, Issue 644-650, p2198 

    With the further reform and market division in the telecommunication industry, there are more and more choices for customers to select telecom products and operators, which lead to the fiercer competition for customers between telecom operators. As the technical method to identify customers...

