False Positives Reduction in Top-down Protein Informatics using Support Vector Machines

Huijuan Guo
April 2012
Journal of Computers;Apr2012, Vol. 7 Issue 4, p913
Academic Journal
The small but persistent chance of false positive matches [1], [2] in protein database search [3] has always cast a shadow over the reliability of results. The situation can be improved by viewing the protein data within a descriptive and a probabilistic framework together. In the first stage, following the conventional approach, a top-down protein search engine descriptively searches the top-down protein data for proteins, then scores and ranks the results. We then suggest applying a Support Vector Machine (SVM) as a second-stage probabilistic scoring system to the first-stage database search results, so as to further improve protein classification. For SVM scoring, features are extracted from the top-down data and a feature table is constructed. An SVM with a Radial Basis Function kernel is trained on this feature table and then used to classify the test data. The classification can then be viewed together with the previously calculated search engine score, and the top-ranked proteins may be reordered.
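The second stage described above can be sketched as follows. This is a minimal illustration, not the article's implementation: it assumes an RBF-kernel SVM has already been trained, and it only evaluates the standard decision function f(x) = sum_i c_i K(s_i, x) + b before reordering the first-stage hits. The support vectors, coefficients, two-column feature layout, protein identifiers, and the reordering rule (SVM-accepted hits first, then by search engine score) are all hypothetical choices made for the example.

```python
import math

def rbf_kernel(x, z, gamma):
    """Radial Basis Function kernel: K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svm_decision(x, support_vectors, dual_coefs, bias, gamma):
    """Decision value of a trained RBF SVM; a positive value means 'true match'."""
    return sum(c * rbf_kernel(sv, x, gamma)
               for sv, c in zip(support_vectors, dual_coefs)) + bias

def rerank(hits, support_vectors, dual_coefs, bias, gamma):
    """Attach SVM output to each first-stage hit and reorder the list.

    Assumed rule (not from the article): hits the SVM accepts come first,
    and ties within each group are broken by the search engine score.
    """
    scored = []
    for hit in hits:
        f = svm_decision(hit["features"], support_vectors, dual_coefs, bias, gamma)
        scored.append({**hit, "svm_score": f, "svm_label": 1 if f > 0 else -1})
    return sorted(scored, key=lambda h: (h["svm_label"], h["engine_score"]),
                  reverse=True)

# Hypothetical trained model: one positive and one negative support vector.
SUPPORT_VECTORS = [(1.0, 1.0), (0.0, 0.0)]
DUAL_COEFS = [1.0, -1.0]   # alpha_i * y_i for each support vector
BIAS = 0.0
GAMMA = 1.0

# Hypothetical first-stage results: search engine score plus extracted features.
hits = [
    {"id": "P1", "engine_score": 95.0, "features": (0.1, 0.2)},
    {"id": "P2", "engine_score": 90.0, "features": (0.9, 0.8)},
    {"id": "P3", "engine_score": 70.0, "features": (0.8, 1.1)},
]

ranked = rerank(hits, SUPPORT_VECTORS, DUAL_COEFS, BIAS, GAMMA)
for h in ranked:
    print(h["id"], h["engine_score"], round(h["svm_score"], 3), h["svm_label"])
```

In this toy data the top-scoring first-stage hit, P1, lies close to the negative support vector, so the SVM flags it as a likely false positive and it drops below P2 and P3. In practice the model parameters would come from training on the feature table rather than being written by hand.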


Related Articles

  • Twin Support Vector Machines Based on the Mixed Kernel Function. Fulin Wu; Shifei Ding // Journal of Computers;Jul2014, Vol. 9 Issue 7, p1690 

    The efficiency and performance of the Twin Support Vector Machines (TWSVM) are better than the traditional support vector machines when it deals with the problems. However, it also has the problem of selecting kernel functions. Generally, TWSVM selects the Gaussian radial basis kernel function....

  • Using SVM as Back-End Classifier for Language Identification. Hongbin Suo; Ming Li; Ping Lu; Yonghong Yan // EURASIP Journal on Audio Speech & Music Processing;2008, Vol. 2008, Special section p1 

    Robust automatic language identification (LID) is a task of identifying the language from a short utterance spoken by an unknown speaker. One of the mainstream approaches named parallel phone recognition language modeling (PPRLM) has achieved a very good performance. The log-likelihood ratio...

  • The Application of Entropy Method in Wind Power Combined Prediction. Kangping Li // Applied Mechanics & Materials;2014, Issue 602-605, p3043 

    A new combined method of wind power prediction based on the entropy method is proposed according to information fusion technique. First, the wind-power forecast is carried out with a BP neural network, a radial basis function neural network and a support vector machine respectively. Then, weights of...

  • Intelligent Optimization Methods for High-Dimensional Data Classification for Support Vector Machines. Sheng Ding; Li Chen // Intelligent Information Management;Jun2010, Vol. 2 Issue 6, p354 

    Support vector machine (SVM) is a popular pattern classification method with many application areas. SVM shows its outstanding performance in high-dimensional data classification. In the process of classification, SVM kernel parameter setting during the SVM training procedure, along with the...

  • A comparison of several neural networks to predict the execution times in injection molding production for automotive industry. Fernández-Delgado, M.; Reboreda, M.; Cernadas, E.; Barro, S. // Neural Computing & Applications;Jul2010, Vol. 19 Issue 5, p741 

    In the industrial environment, specifically in the automotive industry, an accurate prediction of execution times for each production task is very useful in order to plan the work and to optimize the human, technical and material resources. In this paper, we applied several regression neural...

  • Payload Estimation in Universal Steganalysis. P. P., Amritha; Madathil, Anoj // Defence Science Journal;2010, Vol. 60 Issue 4, p412 

    Universal Steganalysis can classify images without the knowledge of steganographic algorithms. This steganalysis will blindly classify an image as cover or not, but finding how much payload is embedded is still an open problem. This paper focuses on the above problem. Firstly, they use features...

  • A Hybrid Framework using RBF and SVM for Direct Marketing. Govindarajan, M. // International Journal of Advanced Computer Science & Application;Apr2013, Vol. 4 Issue 4, p121 

    One of the major developments in machine learning in the past decade is the ensemble method, which finds highly accurate classifier by combining many moderately accurate component classifiers. This paper addresses using an ensemble of classification methods for direct marketing. Direct marketing...

  • Classification of Normal and Pathological Voice Using SVM and RBFNN. Sellam, V.; Jagadeesan, J. // Journal of Signal & Information Processing;Feb2014, Vol. 5 Issue 1, p1 

    The identification and classification of pathological voice are still a challenging area of research in speech processing. Acoustic features of speech are used mainly to discriminate normal voices from pathological voices. This paper explores and compares various classification models to find...

  • Evaluation of Ensemble Classifiers for Handwriting Recognition. Govindarajan, M. // International Journal of Modern Education & Computer Science;Nov2013, Vol. 5 Issue 11, p11 

    One of the major developments in machine learning in the past decade is the ensemble method, which finds highly accurate classifier by combining many moderately accurate component classifiers. In this research work, new ensemble classification methods are proposed for homogeneous ensemble...

