A New Text Clustering Method Based on KGA

ZhanGang Hao
May 2012
Journal of Software (1796217X);May2012, Vol. 7 Issue 5, p1094
Academic Journal
Text clustering is a key research area in data mining. K-medoids is a classical partitioning algorithm that handles isolated points well, but it often converges to a local optimum. In this paper, we propose a new genetic algorithm, called KGA, that embeds k-medoids in the genetic algorithm. KGA forms local optimal solutions from multiple initial species groups, applies crossover both within a species group and among species groups, and uses a mutation threshold to control mutation. The algorithm increases the diversity of the species groups and strengthens the optimization capability of the genetic algorithm, thereby improving clustering accuracy and the ability to identify isolated points.
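The abstract gives no implementation details, so the following is only a minimal sketch of how the ingredients it names might fit together, not the author's KGA. Everything concrete here is an assumption made for illustration: the function names (`cost`, `refine`, `crossover`, `mutate`, `kga_sketch`), the encoding of an individual as a tuple of medoid indices, the 0.3 rate of crossover among groups, the elitist selection, and the reading of "mutation threshold" as a probability cutoff. Points are plain coordinate tuples rather than text vectors.

```python
import math
import random

def cost(points, medoids):
    """Total distance from each point to its nearest medoid (lower is better)."""
    return sum(min(math.dist(p, points[m]) for m in medoids) for p in points)

def refine(points, medoids):
    """One k-medoids step: assign points, then re-pick the medoid of each cluster."""
    clusters = {m: [] for m in medoids}
    for i, p in enumerate(points):
        nearest = min(medoids, key=lambda m: math.dist(p, points[m]))
        clusters[nearest].append(i)
    new = []
    for m, members in clusters.items():
        if not members:          # empty cluster: keep the old medoid
            new.append(m)
            continue
        new.append(min(members, key=lambda c: sum(
            math.dist(points[c], points[j]) for j in members)))
    # Fall back to the input if the step produced duplicate medoids.
    return tuple(new) if len(set(new)) == len(medoids) else tuple(medoids)

def crossover(a, b, k, rng):
    """Child draws k distinct medoids from the union of both parents."""
    return tuple(rng.sample(sorted(set(a) | set(b)), k))

def mutate(medoids, n, rng):
    """Swap one medoid for a randomly chosen non-medoid point."""
    out = list(medoids)
    free = [i for i in range(n) if i not in medoids]
    if free:
        out[rng.randrange(len(out))] = rng.choice(free)
    return tuple(out)

def kga_sketch(points, k, groups=3, size=6, generations=20,
               mutation_threshold=0.2, seed=0):
    rng = random.Random(seed)
    n = len(points)
    # Multiple initial species groups, each a list of random medoid sets.
    pops = [[tuple(rng.sample(range(n), k)) for _ in range(size)]
            for _ in range(groups)]
    for _ in range(generations):
        for g, pop in enumerate(pops):
            pop.sort(key=lambda ind: cost(points, ind))
            parents = pop[: size // 2]            # assumed elitist selection
            children = []
            while len(parents) + len(children) < size:
                a = rng.choice(parents)
                if groups > 1 and rng.random() < 0.3:
                    # Crossover among groups: mate with another group's best.
                    h = rng.choice([i for i in range(groups) if i != g])
                    b = min(pops[h], key=lambda ind: cost(points, ind))
                else:
                    # Crossover within the group.
                    b = rng.choice(parents)
                child = crossover(a, b, k, rng)
                # Assumed reading of the mutation threshold: mutate only
                # when a uniform draw falls below it.
                if rng.random() < mutation_threshold:
                    child = mutate(child, n, rng)
                children.append(refine(points, child))
            pops[g] = parents + children
    return min((ind for pop in pops for ind in pop),
               key=lambda ind: cost(points, ind))
```

A call such as `kga_sketch(points, k=2)` returns a tuple of indices into `points`. Keeping several groups and only occasionally mating across them is one common way to preserve diversity; whether the paper uses this particular scheme cannot be determined from the abstract.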


Related Articles

  • Novel Hybrid Clustering Optimization Algorithms Based on Plant Growth Simulation Algorithm. Tavakolian, Rozita; Charkari, Nasroollah Moghaddam // Journal of Advanced Computer Science & Technology Research;2011, Vol. 1 Issue 2, p84 

    Data clustering as one of the important data mining techniques is a fundamental and widely used method to achieve useful information about data. In face of the clustering problem, clustering methods still suffer from trapping in a local optimum and cannot often find global clusters. In general,...

  • An Improved Genetic Algorithm for Text Clustering. Shidong YU; Yuan DING; Xicheng MA; Jian SUN // Advanced Materials Research;7/24/2014, Vol. 989-994, p1853 

    The genetic algorithm (GA) is a self-adapted probability search method used to solve optimization problems, which has been applied widely in science and engineering. In this paper, we propose an improved variable string length genetic algorithm (IVGA) for text clustering. Our algorithm has been...

  • Evolving rule induction algorithms with multi-objective grammar-based genetic programming. Pappa, Gisele L.; Freitas, Alex A. // Knowledge & Information Systems;Jun2009, Vol. 19 Issue 3, p283 

    Multi-objective optimization has played a major role in solving problems where two or more conflicting objectives need to be simultaneously optimized. This paper presents a Multi-Objective grammar-based genetic programming (MOGGP) system that automatically evolves complete rule induction...

  • Para Miner: a generic pattern mining algorithm for multi-core architectures. Negrevergne, Benjamin; Termier, Alexandre; Rousset, Marie-Christine; Méhaut, Jean-François // Data Mining & Knowledge Discovery;May2014, Vol. 28 Issue 3, p593 

    In this paper, we present Para Miner which is a generic and parallel algorithm for closed pattern mining. Para Miner is built on the principles of pattern enumeration in strongly accessible set systems. Its efficiency is due to a novel dataset reduction technique (that we call EL-reduction),...

  • An Improved Animal Migration Optimization Algorithm for Clustering Analysis. Ma, Mingzhi; Luo, Qifang; Zhou, Yongquan; Chen, Xin; Li, Liangliang // Discrete Dynamics in Nature & Society;1/5/2015, Vol. 2015, p1 

    Animal migration optimization (AMO) is one of the most recently introduced algorithms based on the behavior of animal swarm migration. This paper presents an improved AMO algorithm (IAMO), which significantly improves the original AMO in solving complex optimization problems. Clustering is a...

  • A DIFFERENTIAL CLUSTERING ALGORITHM BASED ON ELITE STRATEGY. Xiongjun Wen; Qun Zhou; Sheng Huang // Scientific Bulletin of National Mining University;2016, Issue 2, p116 

    Purpose. Cluster analysis is not only an important research field of data mining but also a significant means and method in data partitioning or packet processing. The research aims to further improve the effect of clustering algorithm and overcome the existing defects of differential evolution...

  • THE ISLAND MODEL AS A MARKOV DYNAMIC SYSTEM. SCHAEFER, ROBERT; BYRSKI, ALEKSANDER; SMOŁKA, MACIEJ // International Journal of Applied Mathematics & Computer Science;Dec2012, Vol. 22 Issue 4, p971 

    Parallel multi-deme genetic algorithms are especially advantageous because they allow reducing the time of computations and can perform a much broader search than single-population ones. However, their formal analysis does not seem to have been studied exhaustively enough. In this paper we...

  • Parallel Implementation of Genetic Algorithm using K-Means Clustering. Senthil Kumar, A. V.; Mythili, S. // International Journal of Advanced Networking & Applications;May/Jun2012, Vol. 3 Issue 6, p1450 

    The existing clustering algorithm has a sequential execution of the data. The speed of the execution is very less and more time is taken for the execution of a single data. A new algorithm Parallel Implementation of Genetic Algorithm using KMeans Clustering (PIGAKM) is proposed to overcome the...

  • Role of Rough Sets in Data Analysis. Anitha, K. // Annual International Conference on Optoelectronics, Photonics & ;2016, p155 

    Data Clustering is the process of dividing a data set into groups of similar items. This paper describes the data analysis technique based on Rough sets. The main objective of data analysis using Rough Set theory is it discovers hidden patterns of data from high dimensional data base. Moreover...

