Clustering Frequent Graph Patterns

Yong Liu; Jianzhong Li; Jinghua Zhu; Hong Gao
December 2007
Journal of Digital Information Management;Dec2007, Vol. 5 Issue 6, p335
Academic Journal
In recent years, graph mining has attracted much attention in the data mining community. Several efficient frequent subgraph mining algorithms have been recently proposed. However, the number of frequent graph patterns generated by these graph mining algorithms may be too large to be effectively explored by users, especially when the support threshold is low. In this paper, we propose to summarize frequent graph patterns by a much smaller number of representative graph patterns. Several novel concepts such as δ-cover graph, jump value and δ-jump pattern are proposed for efficiently summarizing frequent graph patterns. Based on the property of all δ-jump patterns being representative graph patterns, we propose two efficient algorithms for summarizing frequent graph patterns, RP-FP and RP-GD. The RP-FP algorithm computes representative graph patterns from a set of closed frequent graph patterns, whereas the RP-GD algorithm directly mines representative graph patterns from graph databases. The RPFP has tight approximation (summarization) bound but higher computational complexity. The RP-GD has no approximation bound guarantee but is far more efficient. We conducted extensive experimental studies using various real and synthetic datasets. Experimental results show that RP-FP and RP-GD are able to obtain compact summarization in both real and synthetic graph databases. When the number of closed graph patterns is very large, RP-GD is much more efficient than RP-FP, while achieving comparable summarization quality.


Related Articles

  • Frequent Item Set Mining using Convertible Constraints in IMine. Prakash, T. Senthil; Thangaraj, P. // European Journal of Scientific Research;11/29/2011, Vol. 65 Issue 1, p93 

    Frequent itemset mining has turned out to be a familiar area of investigation in the data mining field. Its key objective is to determine the sets of items that come collectively more than a specified threshold value based on the need. The IMine index structure provides the more common and...

  • Detection and Extraction of Videos using Decision Trees. Nabi, Abdul; Rasool, Shaik; Premchand, P. // International Journal of Advanced Computer Science & Application;Dec2011, Vol. 2 Issue 12, p147 

    This paper addresses a new multimedia data mining framework for the extraction of events in videos by using decision tree logic. The aim of our DEVDT (Detection and Extraction of Videos using Decision Trees) system is for improving the indexing and retrieval of multimedia information. The...

  • An Algorithm of Top-k High Utility Itemsets Mining over Data Stream. Tianjun Lu; Yang Liu; Le Wang // Journal of Software (1796217X);Sep2014, Vol. 9 Issue 9, p2342 

    Existing top-k high utility itemset (HUI) mining algorithms generate candidate itemsets in the mining process; their time & space performance might be severely affected when the dataset is large or contains many long transactions; and when applied to data streams, the performance of...

  • Critical Evaluation of Validation Rules Automated Extraction from Data. Pejcoch, David // Journal of Systems Integration (1804-2724);2014, Vol. 5 Issue 4, p32 

    The goal of this article is to critically evaluate a possibility of automatic extraction of such kind of rules which could be later used within a Data Quality Management process for validation of records newly incoming to Information System. For practical demonstration the 4FT-Miner procedure...

  • ICA: An Incremental Clustering Algorithm Based on OPTICS. Fu, Jun-Song; Liu, Yun; Chao, Han-Chieh // Wireless Personal Communications;Oct2015, Vol. 84 Issue 3, p2151 

    Clustering algorithms play an important role in data mining no matter whether they are used as a stand-alone tool or as a preprocessing step for further analysis on the data. With the arrival of the information era, the speed of data generation is faster and faster. As a result, clustering...

  • An Approach of Automatic Web Data Record Extraction Using Clustering Techniques. YongQuan Dong; QingZhong Li // Proceedings of the International Symposium on Information System;2009, p441 

    A novel approach is proposed to automatically extract data records from web pages using clustering techniques. The approach is completed in four steps. Firstly, it utilizes the visual information to identify data region containing data records. Secondly, it uses clustering algorithm to divide...

  • Overview of the BioCreative III Workshop.  // BMC Bioinformatics;2011 Supplement, Vol. 12 Issue Suppl 8, pS1 

    The article presents information related to the BioCreative Workshops, which promotes the development of text mining and text processing tools. It states that humanly annotated test data for several basic tasks in text mining applied to the biomedical literature were discussed in previous...

  • Decision Support Through Subgroup Discovery: Three Case Studies and the Lessons Learned. LAVRAČ, NADA; CESTNIK, BOJAN; GAMBERGER, DRAGAN; FLACH, PETER // Machine Learning;Oct2004, Vol. 57 Issue 1/2, p115 

    This paper presents ways to use subgroup discovery to generate actionable knowledge for decision support. Actionable knowledge is explicit symbolic knowledge, typically presented in the form of rules, that allows the decision maker to recognize some important relations and to perform an...

  • DECISION SUPPORT SYSTEM FOR PREDICTING THE DEGREE OF A CANCER PATIENT'S EMPOWERMENT. ABOUABDELLAH, Abdellah; CHERKAOUI, Abdelghani // Journal of Theoretical & Applied Information Technology;2/28/2014, Vol. 60 Issue 3, p517 

    In this paper, we develop a decision support system (DSS) based on Knowledge Discovery from Databases (KDD). This system allows the prediction of the empowerment of a patient with cancer treated by chemotherapy. The first part of this article presents the process of empowerment of the patient....


Read the Article


Sorry, but this item is not currently available from your library.

Try another library?
Sign out of this library

Other Topics