Compressed labeling on distilled labelsets for multi-label learning

Zhou, Tianyi; Tao, Dacheng; Wu, Xindong
July 2012
Machine Learning;Jul2012, Vol. 88 Issue 1/2, p69
Academic Journal
Directly applying single-label classification methods to multi-label learning problems substantially limits both performance and speed because of the imbalance, dependence and high dimensionality of the given label matrix. Existing methods either ignore these three problems or reduce one at the price of aggravating another. In this paper, we propose a {0,1} label matrix compression and recovery method termed 'compressed labeling (CL)' to simultaneously solve, or at least reduce, these three problems. CL first compresses the original label matrix to improve balance and independence by preserving the signs of its Gaussian random projections. Afterward, we directly apply popular binary classification methods (e.g., support vector machines) to each new label. A fast recovery algorithm is developed to recover the original labels from the predicted new labels. In the recovery algorithm, a 'labelset distilling method' is designed to extract distilled labelsets (DLs), i.e., the frequently appearing label subsets of the original labels, via recursive clustering and subtraction. Given a distilled and an original label vector, we discover that the signs of their random projections have an explicit joint distribution that can be quickly computed from a geometric inference. Based on this observation, the original label vector is exactly determined after performing a series of Kullback-Leibler divergence based hypothesis tests on the distribution of the new labels. CL significantly improves the balance of the training samples and reduces the dependence between different labels. Moreover, it accelerates the learning process by training fewer binary classifiers for compressed labels, and makes use of label dependence via DL-based tests. Theoretically, we prove recovery bounds for CL, which verify the effectiveness of CL for label compression and the improvement in multi-label classification performance brought by the label correlations preserved in DLs.
We show the effectiveness, efficiency and robustness of CL via 5 groups of experiments on 21 datasets from text classification, image annotation, scene classification, music categorization, genomics and web page classification.
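
The compression step described in the abstract (keeping only the signs of Gaussian random projections of the label matrix) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `compress_labels` and `sign_agreement_probability`, the toy label matrix, the choice of `k`, and the convention of coding a non-negative projection as 1 are all assumptions made for illustration. The agreement formula used below, 1 - angle/pi, is the standard result for signs of Gaussian projections of two nonzero vectors; the paper's joint distribution between a distilled labelset and an original label vector may be stated differently.

```python
import numpy as np

def compress_labels(Y, k, seed=0):
    """Sketch: project each {0,1} label vector onto k Gaussian directions
    and keep only the signs (coded as 1 for >= 0, else 0).
    Names and the sign coding are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((Y.shape[1], k))   # Gaussian projection matrix
    Z = (Y @ A >= 0).astype(np.int8)           # compressed binary labels
    return Z, A

def sign_agreement_probability(u, v):
    """Standard geometric fact: for a Gaussian direction a and nonzero u, v,
    P[sign(a.u) == sign(a.v)] = 1 - angle(u, v) / pi."""
    cos = np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))
    return 1.0 - np.arccos(np.clip(cos, -1.0, 1.0)) / np.pi

# Toy label matrix: 4 samples, 5 original labels (made-up data).
Y = np.array([[1, 0, 0, 1, 0],
              [1, 1, 0, 0, 0],
              [0, 0, 1, 0, 0],
              [1, 0, 0, 1, 1]], dtype=float)

# Actual compression: 5 original labels -> 3 new binary labels.
Z, A = compress_labels(Y, k=3)

# Monte Carlo check of the geometric fact, using many projections
# purely to estimate probabilities (not a realistic k).
Z_mc, _ = compress_labels(Y, k=20000, seed=1)
empirical = float(np.mean(Z_mc[0] == Z_mc[3]))
predicted = float(sign_agreement_probability(Y[0], Y[3]))
balance = float(Z_mc[0].mean())                # fraction of positive new labels

print("compressed shape:", Z.shape)
print("sign agreement (empirical vs. predicted):", empirical, predicted)
print("positive rate of new labels for sample 0:", balance)
```

In this sketch, each new label of a nonzero label vector is positive with probability one half, which is one way to see why sign-based compression tends to improve balance. One binary classifier (for example, a support vector machine) would then be trained per column of `Z`; the recovery step based on distilled labelsets and Kullback-Leibler divergence tests is not shown here.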


Related Articles

  • Multiple instance learning via Gaussian processes. Kim, Minyoung; Torre, Fernando // Data Mining & Knowledge Discovery;Jul2014, Vol. 28 Issue 4, p1078 

    Multiple instance learning (MIL) is a binary classification problem with loosely supervised data where a class label is assigned only to a bag of instances indicating presence/absence of positive instances. In this paper we introduce a novel MIL algorithm using Gaussian processes (GP). The bag...

  • Improving a SVM Meta-classifier for Text Documents by using Naive Bayes. Morariu, D.; Creţulescu, R.; Vinţan, L. // International Journal of Computers, Communications & Control;Sep2010, Vol. 5 Issue 3, p351 

    Text categorization is the problem of classifying text documents into a set of predefined classes. In this paper, we investigated two approaches: a) to develop a classifier for text document based on Naive Bayes Theory and b) to integrate this classifier into a meta-classifier in order to...

  • SOME METHODS OF CONSTRUCTING KERNELS IN STATISTICAL LEARNING. Górecki, Tomasz; Łuczak, Maciej // Discussiones Mathematicae: Probability & Statistics;2010, Vol. 30 Issue 2, p179 

    This paper is a collection of numerous methods and results concerning a design of kernel functions. It gives a short overview of methods of building kernels in metric spaces, especially R^n and S^n. However we also present a new theory. Introducing kernels was motivated by searching for non-linear...

  • Offline Signature Identification by Fusion of Multiple Classifiers using Statistical Learning Theory. Kisku, Dakshina Ranjan; Gupta, Phalguni; Sing, Jamuna Kanta // International Journal of Security & Its Applications;Jul2010, Vol. 4 Issue 3, p35 

    This paper uses Support Vector Machines (SVM) to fuse multiple classifiers for an offline signature system. From the signature images, global and local features are extracted and the signatures are verified with the help of Gaussian empirical rule, Euclidean and Mahalanobis distance based...

  • Particle Swarm Optimization for Semi-supervised Support Vector Machine. QING WU; SAN-YANG LIU; LE-YOU ZHANG // Journal of Information Science & Engineering;Sep2010, Vol. 26 Issue 5, p1695 

    Semi-supervised Support vector machine has become an increasingly popular tool for machine learning due to its wide applicability. Unlike SVM, their formulation leads to a non-smooth non-convex optimization problem. In 2005, Chapelle and Zien used a Gaussian approximation as a smooth function...

  • Bayesian multi-instance multi-label learning using Gaussian process prior. He, Jianjun; Gu, Hong; Wang, Zhelong // Machine Learning;Jul2012, Vol. 88 Issue 1/2, p273 

    Multi-instance multi-label learning (MIML) is a newly proposed framework, in which the multi-label problems are investigated by representing each sample with multiple feature vectors named instances. In this framework, the multi-label learning task becomes to learn a many-to-many relationship,...

  • Multiple Kernel Learning Algorithms. Gönen, Mehmet; Alpaydın, Ethem // Journal of Machine Learning Research;Jul2011, Vol. 12 Issue 7, p2211 

    In recent years, several methods have been proposed to combine multiple kernels instead of using a single one. These different kernels may correspond to using different notions of similarity or may be using information coming from multiple sources (different representations or different feature...

  • Regression and Classification Method Based on Gaussian Processes. WU Xinghui; ZHOU Yuping // Advanced Materials Research;2014, Vol. 971-973, p1949 

    Gaussian processes is a kind of machine learning method developed in recent years and also a promising technology that has been applied both in the regression problem and the classification problem. In this paper, the general principle of regression and classification based on Gaussian process...


    The performance of the kernel-based learning algorithms, such as support vector domain description, depends heavily on the proper choice of the kernel parameter. It is desirable for the kernel machines to work on the optimal kernel parameter that adapts well to the input data and the pattern...

