TITLE

Role of Spectral Peaks in Autocorrelation Domain for Robust Speech Recognition

AUTHOR(S)
Bansal, Poonam; Dev, Amita; Jain, Shail Bala
PUB. DATE
September 2009
SOURCE
Journal of Computing & Information Technology;Sep2009, Vol. 17 Issue 3, p295
SOURCE TYPE
Academic Journal
DOC. TYPE
Article
ABSTRACT
This paper presents a new front-end for robust speech recognition. The front-end focuses on the spectral features of the filtered speech signal in the autocorrelation domain, which is well known for its pole-preserving and noise-separation properties. A novel method for robust feature extraction is proposed: the speech signal corrupted by additive noise is given a new representation, an initial filtering stage reduces the additive noise before the speech features are computed, and the peaks of the autocorrelation spectrum are then extracted. Robust features based on these peaks are derived under the assumption that the corrupting noise is stationary. A speaker-independent isolated-word recognition task is used to demonstrate the effectiveness of these features, with white noise and colored noise (factory, babble, and F16) as test conditions. Experimental results show significant improvement over traditional front-end methods. Further gains are obtained by applying cepstral mean normalization (CMN) to the extracted features.
ACCESSION #
45061569
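
NOTE: The following is a minimal, hypothetical sketch (in Python/NumPy) of the kind of pipeline the abstract describes: framing, a one-sided autocorrelation per frame, suppression of the low-order lags where a stationary additive noise tends to concentrate, isolation of the peaks of the resulting autocorrelation spectrum, DCT compression, and cepstral mean normalization. The function names, frame sizes, lag cut-off, and DCT reduction are illustrative assumptions only, not the authors' published algorithm.

import numpy as np

def frame_signal(x, frame_len=400, hop=160):
    """Split a 1-D signal into overlapping Hamming-windowed frames
    (e.g. 25 ms frames with a 10 ms hop at 16 kHz)."""
    n_frames = 1 + max(0, (len(x) - frame_len) // hop)
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    return x[idx] * np.hamming(frame_len)

def autocorr_peak_features(frames, n_fft=512, n_feat=13):
    """Hypothetical autocorrelation-domain peak features.

    Per frame: (1) one-sided autocorrelation, (2) discard the low-order
    lags, where a stationary noise's autocorrelation is concentrated,
    (3) magnitude spectrum of the remaining lags, (4) keep only local
    spectral peaks, (5) log-compress and reduce with a DCT.
    """
    feats = []
    for frame in frames:
        # One-sided (non-negative lag) autocorrelation of the frame.
        r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
        # Drop low-order lags (assumed cut-off) to suppress stationary noise.
        r = r[int(0.02 * len(r)):]
        spec = np.abs(np.fft.rfft(r, n_fft))
        # Keep only local spectral peaks; zero everything else.
        peaks = np.zeros_like(spec)
        mask = (spec[1:-1] > spec[:-2]) & (spec[1:-1] > spec[2:])
        peaks[1:-1][mask] = spec[1:-1][mask]
        logspec = np.log(peaks + 1e-8)
        # Decorrelate with a DCT-II-style transform, keeping n_feat coefficients.
        k = np.arange(len(logspec))
        dct = np.cos(np.pi * np.outer(np.arange(n_feat), (k + 0.5) / len(logspec)))
        feats.append(dct @ logspec)
    return np.array(feats)

def cepstral_mean_normalization(feats):
    """Subtract the per-utterance mean from each coefficient (CMN)."""
    return feats - feats.mean(axis=0, keepdims=True)

# Example: features for one second of (here synthetic) noisy 16 kHz audio.
noisy = np.random.randn(16000)
features = cepstral_mean_normalization(autocorr_peak_features(frame_signal(noisy)))
print(features.shape)  # (number of frames, n_feat)
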

 

Related Articles

  • Automatic Speech Recognition Technique for Bangla Words. Ali, Md. Akkas; Hossain, Manwar; Bhuiyan, Mohammad Nuruzzaman // International Journal of Advanced Science & Technology;Jan2013, Vol. 50, p51 

    Automatic recognition of spoken words is one of the most challenging tasks in the field of speech recognition. The difficulty of this task is due to the acoustic similarity of many of the words and their syllables. Accurate recognition requires the system to perform fine phonetic distinctions....

  • Emotion Recognition from Speech using Discriminative Features. Chandrasekar, Purnima; Chapaneri, Santosh; Jayaswal, Deepak // International Journal of Computer Applications;Sep2014, Vol. 101 Issue 1-16, p31 

    Creating an accurate Speech Emotion Recognition (SER) system depends on extracting features relevant to that of emotions from speech. In this paper, the features that are extracted from the speech samples include Mel Frequency Cepstral Coefficients (MFCC), energy, pitch, spectral flux, spectral...

  • Bimodal Emotion Recognition via Facial Expression and Speech. Shiqing Zhang; Lemin Li; Zhijin Zhao // Advances in Information Sciences & Service Sciences;Dec2012, Vol. 4 Issue 22, p256 

    Automatic emotion recognition is a largely unexplored and challenging topic. Most existing emotion recognition systems focus on the facial expression modality and speech modality alone. In this paper, we present an approach to bimodal emotion recognition integrating facial expression and speech....

  • A NOVEL APPROACH FOR SIMULTANEOUS GENDER AND HINDI VOWEL RECOGNITION USING A MULTIPLE-INPUT MULTIPLE-OUTPUT CO-ACTIVE NEURO-FUZZY INFERENCE SYSTEM. LAKRA, SACHIN; PRASAD, T. V.; RAMAKRISHNA, G. // Journal of Theoretical & Applied Information Technology;9/1/2015, Vol. 79 Issue 1, p101 

    Human beings can simultaneously recognize vowels in speech as well as the gender of a speaker in spite of high variability. However, machines have not been able to simultaneously overcome both the gender variability and the vowel variability existing in speech due to gender. This paper uses a Multiple-Input...

  • Filterbank optimization for robust ASR using GA and PSO. Aggarwal, R.; Dave, M. // International Journal of Speech Technology;Jun2012, Vol. 15 Issue 2, p191 

    Automatic speech recognition (ASR) systems follow a well-established pattern recognition approach: signal-processing-based feature extraction at the front-end and likelihood evaluation of feature vectors at the back-end. Mel-frequency cepstral coefficients (MFCCs) are the features widely used...

  • Articulatory Feature Extraction for Speech Recognition Using Neural Network. Huda, Mohammad Nurul; Hasan, Mohammad Mahedi; Hassan, Foyzul; Kotwal, Mohammed Rokibul Alam; Muhammad, Ghulam; Rahman, Chowdhury Mofizur // International Review on Computers & Software;Jan2011, Vol. 6 Issue 1, p25 

    This paper presents a method for articulatory features (AFs) extraction with lower cost. The method comprises two multilayer neural networks (MLNs). The first MLN, MLNLF-DPF, maps local features (LFs) of an input speech signal into discrete AFs and the second MLN, MLNDyn, restricts dynamics of...

  • A Novel Method for Speaker Independent Recognition Based on Hidden Markov Model. Feng-Long Huang // International Journal on Recent Trends in Engineering & Technolo;May2010, Vol. 3 Issue 1, p144 

    In this paper, we address the speaker-independent recognition of Chinese number speeches 0~9 based on HMM. Our former results of inside and outside testing achieved 92.5% and 76.79% respectively. To further improve the performance, two important features of speech, MFCC and cluster number of...

  • Performance Assessment of Joint Feature Derived from Mellin-Cepstrum for Vowel Recognition. Jamaati, M.; Marvi, H. // International Review of Electrical Engineering;Nov/Dec2009, Vol. 3 Issue 6, p1077 

    Vowel recognition has been commonly used in speech recognition systems for large-vocabulary continuous speech and isolated-word recognition. A vowel recognition system uses a parametric form of the vowel to get the most important features in the signal. Thus, the main stage of vowel recognition can...

  • Speaker recognition utilizing distributed DCT-II based Mel frequency cepstral coefficients and fuzzy vector quantization. Hossan, M.; Gregory, Mark // International Journal of Speech Technology;Mar2013, Vol. 16 Issue 1, p103 

    In this paper, a new and novel Automatic Speaker Recognition (ASR) system is presented. The new ASR system includes novel feature extraction and vector classification steps utilizing distributed Discrete Cosine Transform (DCT-II) based Mel Frequency Cepstral Coefficients (MFCC) and Fuzzy Vector...
