TITLE

K-Means for Spherical Clusters with Large Variance in Sizes

AUTHOR(S)
Fahim, A. M.; Saake, G.; Salem, A. M.; Torkey, F. A.; Ramadan, M. A.
PUB. DATE
September 2009
SOURCE
International Journal of Computer Science;2009, Vol. 4 Issue 3, p145
SOURCE TYPE
Academic Journal
DOC. TYPE
Article
ABSTRACT
Data clustering is an important data exploration technique with many applications in data mining. The k-means algorithm is well known for its efficiency in clustering large data sets. However, this algorithm is suitable for spherical clusters of similar sizes and densities. The quality of the resulting clusters decreases when the data set contains spherical clusters with a large variance in sizes. In this paper, we introduce a competent procedure to overcome this problem. The proposed method is based on shifting the center of the large cluster toward the small cluster and recomputing the membership of the small cluster's points. The experimental results reveal that the proposed algorithm produces satisfactory results.
ACCESSION #
48125477
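
The idea described in the abstract can be illustrated with a small sketch. This is not the paper's procedure: the centers are fixed by hand rather than obtained by running k-means, and the function names, the toy points, and the shift fraction `alpha` are all assumptions made for illustration. It shows how nearest-center assignment hands an edge point of a large cluster to a nearby small cluster, and how moving the large center toward the small one and re-checking only the small cluster's points can return it.

```python
import math

def assign(points, centers):
    """Nearest-center assignment, the rule plain k-means uses."""
    return [min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))
            for p in points]

def shift_and_reassign(points, labels, centers, large, small, alpha):
    """Hypothetical sketch (not the paper's exact method): move the large
    cluster's center a fraction `alpha` of the way toward the small
    cluster's center, then re-evaluate only the points labelled `small`."""
    c_large, c_small = centers[large], centers[small]
    shifted = tuple(a + alpha * (b - a) for a, b in zip(c_large, c_small))
    new_labels = list(labels)
    for i, p in enumerate(points):
        # A small-cluster point moves only if the shifted large center
        # is strictly closer than the small center.
        if labels[i] == small and math.dist(p, shifted) < math.dist(p, c_small):
            new_labels[i] = large
    return shifted, new_labels

# Toy data (assumed): a wide cluster around (0, 0) and a tight one around (12, 0).
large_pts = [(0.0, 0.0), (-9.0, 0.0), (0.0, 9.0), (0.0, -9.0), (9.0, 0.0)]
small_pts = [(12.0, 0.0), (12.5, 0.5), (11.5, -0.5)]
points = large_pts + small_pts
centers = [(0.0, 0.0), (12.0, 0.0)]  # hand-picked, not computed by k-means

labels = assign(points, centers)
# (9, 0) lies in the wide cluster but is nearer to (12, 0), so it is labelled 1.

shifted, fixed = shift_and_reassign(points, labels, centers,
                                    large=0, small=1, alpha=0.75)
print(labels)
print(fixed)
```

In this toy setting the point (9, 0) is first assigned to the small cluster and is returned to the large one after the shift, while the three genuine small-cluster points stay put. How far to shift, and when to stop, are exactly the details the abstract does not give, so `alpha` here is an arbitrary choice.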

 

Related Articles

  • HYBRID ANT-BASED CLUSTERING ALGORITHM WITH CLUSTER ANALYSIS TECHNIQUES. Omar, Wafa'a; Badr, Amr; El-Fattah Hegazy, Abd // Journal of Computer Science;Jun2013, Vol. 9 Issue 6, p780 

    Cluster analysis is a data mining technology designed to derive a good understanding of data to solve clustering problems by extracting useful information from a large volume of mixed data elements. Recently, researchers have aimed to derive clustering algorithms from nature's swarm behaviors....

  • AVOIDING NOISE AND OUTLIERS IN K-MEANS. Jnena, Rami; Timraz, Mohammed; Ashour, Wesam // Computing & Information Systems;Oct2011, Vol. 15 Issue 2, p1 

    Applying the k-means algorithm to datasets that include a large number of noise and outlier objects gives unclear clustering results. In this paper we propose a new technique for avoiding noise and outliers by applying some preprocessing and post-processing steps for the dataset that have to...

  • DERIVING CLUSTER KNOWLEDGE USING ROUGH SET THEORY. Upadhyaya, Shuchita; Arora, Alka; Jain, Rajni // Journal of Theoretical & Applied Information Technology;2008, Vol. 4 Issue 8, p688 

    Clustering algorithms give a general description of the clusters, listing the number of clusters and the member entities in those clusters. They do not generate cluster descriptions in the form of patterns. Deriving patterns from clusters along with grouping of data into clusters is important from data...

  • MULTI-DENSITY DBSCAN USING REPRESENTATIVES: MDBSCAN-UR. Ahmed, Rwand; El-Zaza, Eman; Ashour, Wesam // Computing & Information Systems;Oct2011, Vol. 15 Issue 2, p1 

    DBSCAN is one of the most popular algorithms for cluster analysis. It can discover clusters of arbitrary shape and separate out noise. But this algorithm cannot choose its parameters according to the distribution of the dataset. It simply uses the global minimum number of points (MinPts) parameter,...

  • Avoiding Objects with few Neighbors in the K-Means Process and Adding ROCK Links to Its Distance. Alnabriss, Hadi A.; Ashour, Wesam // International Journal of Computer Applications;Aug2011, Vol. 28, p12 

    K-means is considered one of the most common and powerful algorithms in data clustering. In this paper we present new techniques to solve two problems in the traditional K-means clustering algorithm. The first problem is its sensitivity to outliers; in this part we are going to...

  • An effective web document clustering algorithm based on bisection and merge. Ingyu Lee; Byung-Won On // Artificial Intelligence Review;Jun2011, Vol. 36 Issue 1, p69 

    To cluster web documents, all of which have the same name entities, we attempted to use existing clustering algorithms such as K-means and spectral clustering. Unexpectedly, it turned out that these algorithms are not effective to cluster web documents. According to our intensive investigation,...

  • Cluster Dynamic XML Documents based on Frequently Changing Structures. Wei Li; Xiongfei Li; Regen Te // Advances in Information Sciences & Service Sciences;Apr2012, Vol. 4 Issue 6, p70 

    XML document cluster analysis is a hot research topic; however, most of the existing work focuses on snapshot XML data, while XML documents are dynamic in practical applications. This paper introduces a method to discover frequently changing structures (FCS) from a sequence of versions of...

  • Composite kernels for semi-supervised clustering. Domeniconi, Carlotta; Jing Peng; Yan, Bojun // Knowledge & Information Systems;Jul2011, Vol. 28 Issue 1, p99 

    A critical problem related to kernel-based methods is how to select optimal kernels. A kernel function must conform to the learning target in order to obtain meaningful results. While solutions to the problem of estimating optimal kernel functions and corresponding parameters have been proposed in...

  • An Effective Evolutionary Clustering Algorithm: Hepatitis C case study. Marghny, M. H.; El-Aziz, Rasha M. Abd; Taloba, Ahmed I. // International Journal of Computer Applications;Nov2011, Vol. 34, p1 

    Clustering analysis plays an important role in scientific research and commercial application. K-means algorithm is a widely used partition method in clustering. However, it is known that the K-means algorithm may get stuck at suboptimal solutions, depending on the choice of the initial cluster...
