An Effective Evolutionary Clustering Algorithm: Hepatitis C case study

Marghny, M. H.; El-Aziz, Rasha M. Abd; Taloba, Ahmed I.
November 2011
International Journal of Computer Applications;Nov2011, Vol. 34, p1
Academic Journal
Clustering analysis plays an important role in scientific research and commercial applications. The K-means algorithm is a widely used partitioning method for clustering. However, K-means is known to get stuck at suboptimal solutions, depending on the choice of the initial cluster centers. In this article, we propose a technique for handling large-scale data that uses genetic algorithms (GAs) to select the initial cluster centers purposefully, reduces sensitivity to isolated points, avoids splitting large clusters, and to some degree overcomes the data skew caused by disproportionate data partitioning under multi-sampling. We applied our method to several public datasets, which show the advantages of the proposed approach; one example is the Hepatitis C dataset, taken from the machine learning repository of the University of California. Our aim is to evaluate the hepatitis dataset. To do so, we performed some preprocessing, whose purpose is to summarize the data in the form best suited to our algorithm. Missing values in the instances are filled in using the local mean method.
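The abstract does not spell out the GA itself, so the following is only a minimal sketch of the general idea (a genetic search over candidate initial centers, followed by ordinary K-means), not the authors' method. Every name and parameter here (`ga_initial_centers`, `pop_size`, `generations`, the 0.2 mutation rate, the synthetic demo points) is an assumption made for illustration. A chromosome is a list of k point indices, and its cost is the sum of squared distances from each point to its nearest candidate center.

```python
import random

def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def sse(points, centers):
    """Sum of squared distances from each point to its nearest center."""
    return sum(min(sq_dist(p, c) for c in centers) for p in points)

def kmeans(points, centers, iters=20):
    """Plain Lloyd iterations starting from the given centers."""
    centers = [tuple(c) for c in centers]
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)),
                          key=lambda i: sq_dist(p, centers[i]))
            groups[nearest].append(p)
        # Move each center to its group's mean; keep it if the group is empty.
        centers = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers

def ga_initial_centers(points, k, pop_size=20, generations=30, seed=0):
    """Hypothetical GA: evolve sets of k point indices with low SSE."""
    rng = random.Random(seed)
    n = len(points)

    def cost(chrom):
        return sse(points, [points[i] for i in chrom])

    population = [rng.sample(range(n), k) for _ in range(pop_size)]
    history = []  # best cost seen at the start of each generation
    for _ in range(generations):
        population.sort(key=cost)
        history.append(cost(population[0]))
        parents = population[: pop_size // 2]  # elitist selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            pool = list(dict.fromkeys(a + b))  # unique indices of both parents
            child = rng.sample(pool, k)        # crossover
            if rng.random() < 0.2:             # mutation (assumed rate)
                new_index = rng.randrange(n)
                if new_index not in child:
                    child[rng.randrange(k)] = new_index
            children.append(child)
        population = parents + children
    best = min(population, key=cost)
    return [points[i] for i in best], history

# Synthetic demo data: three 2-D blobs (not the Hepatitis C dataset).
demo_rng = random.Random(1)
points = [(cx + demo_rng.gauss(0, 0.5), cy + demo_rng.gauss(0, 0.5))
          for cx, cy in [(0, 0), (5, 5), (0, 5)] for _ in range(30)]
seeds, history = ga_initial_centers(points, k=3)
final_centers = kmeans(points, seeds)
```

Because the better half of each generation is carried over unchanged, the best cost in `history` never increases, and the K-means refinement can only keep or lower the cost of the GA-chosen seeds.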


Related Articles

  • MULTI-DENSITY DBSCAN USING REPRESENTATIVES: MDBSCAN-UR. Ahmed, Rwand; El-Zaza, Eman; Ashour, Wesam // Computing & Information Systems;Oct2011, Vol. 15 Issue 2, p1 

    DBSCAN is one of the most popular algorithms for cluster analysis. It can discover clusters of arbitrary shape and separate out noise. But this algorithm cannot choose its parameters according to the distribution of the dataset. It simply uses the global minimum number of points (MinPts) parameter,...

  • AVOIDING NOISE AND OUTLIERS IN K-MEANS. Jnena, Rami; Timraz, Mohammed; Ashour, Wesam // Computing & Information Systems;Oct2011, Vol. 15 Issue 2, p1 

    Applying the k-means algorithm to datasets that include a large number of noise and outlier objects gives unclear clustering results. In this paper we propose a new technique for avoiding this noise and these outliers by applying some preprocessing and post-processing steps to the dataset that have to...

  • K-Means for Spherical Clusters with Large Variance in Sizes. Fahim, A. M.; Saake, G.; Salem, A. M.; Torkey, F. A.; Ramadan, M. A. // International Journal of Computer Science;2009, Vol. 4 Issue 3, p145 

    Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical shaped clusters of similar sizes and densities. The quality of...

  • DERIVING CLUSTER KNOWLEDGE USING ROUGH SET THEORY. Upadhyaya, Shuchita; Arora, Alka; Jain, Rajni // Journal of Theoretical & Applied Information Technology;2008, Vol. 4 Issue 8, p688 

    Clustering algorithms give a general description of the clusters, listing the number of clusters and the member entities in those clusters. They fall short in generating cluster descriptions in the form of patterns. Deriving patterns from clusters along with grouping of data into clusters is important from data...

  • HYBRID ANT-BASED CLUSTERING ALGORITHM WITH CLUSTER ANALYSIS TECHNIQUES. Omar, Wafa'a; Badr, Amr; El-Fattah Hegazy, Abd // Journal of Computer Science;Jun2013, Vol. 9 Issue 6, p780 

    Cluster analysis is a data mining technology designed to derive a good understanding of data to solve clustering problems by extracting useful information from a large volume of mixed data elements. Recently, researchers have aimed to derive clustering algorithms from nature's swarm behaviors....

  • Avoiding Objects with few Neighbors in the K-Means Process and Adding ROCK Links to Its Distance. Alnabriss, Hadi A.; Ashour, Wesam // International Journal of Computer Applications;Aug2011, Vol. 28, p12 

    K-means is considered one of the most common and powerful algorithms in data clustering. In this paper we present new techniques to solve two problems in the traditional K-means clustering algorithm. The first problem is its sensitivity to outliers; in this part we are going to...

  • Multicore Processing for Clustering Algorithms. Rao, Rekhansh; Nagwanshi, Kapil Kumar; Dubey, Sipi // International Journal of Computer Technology & Applications;2012, Vol. 3 Issue 2, p555 

    Data mining algorithms such as classification and clustering are the future of computation, though multidimensional data processing is required. People are using multicore processors with GPUs. Most programming languages do not provide multiprocessing facilities and hence wastage of...

  • An effective web document clustering algorithm based on bisection and merge. Ingyu Lee; Byung-Won On // Artificial Intelligence Review;Jun2011, Vol. 36 Issue 1, p69 

    To cluster web documents, all of which have the same name entities, we attempted to use existing clustering algorithms such as K-means and spectral clustering. Unexpectedly, it turned out that these algorithms are not effective to cluster web documents. According to our intensive investigation,...

  • Combining PSO cluster and nonlinear mapping algorithm to perform clustering performance analysis: take the enterprise financial alarming as example. Pan, Wen-Tsao // Quality & Quantity;Oct2011, Vol. 45 Issue 6, p1291 

    Algorithms that simulate biological intelligence and apply it to optimization problems are still at an emerging stage. Among them, the Particle Swarm Optimization algorithm is a concept and method based on group intelligence. Using the exploration and exploitation features of the particle swarm,...

