Speech Synthesis Using Damped Sinusoids

Hillenbrand, James M.; Houde, Robert A.
August 2002
Journal of Speech, Language & Hearing Research;Aug2002, Vol. 45 Issue 4, p639
Academic Journal
A speech synthesizer was developed that operates by summing exponentially damped sinusoids at frequencies and amplitudes corresponding to peaks derived from the spectrum envelope of the speech signal. The spectrum analysis begins with the calculation of a smoothed Fourier spectrum. A masking threshold is then computed for each frame as the running average of spectral amplitudes over an 800-Hz window. In a rough simulation of lateral suppression, the running average is then subtracted from the smoothed spectrum (with negative spectral values set to zero), producing a masked spectrum. The signal is resynthesized by summing exponentially damped sinusoids at frequencies corresponding to peaks in the masked spectra. If a periodicity measure indicates that a given analysis frame is voiced, the damped sinusoids are pulsed at a rate corresponding to the measured fundamental period. For unvoiced speech, the damped sinusoids are pulsed on and off at random intervals. A perceptual evaluation of speech produced by the damped sinewave synthesizer showed excellent sentence intelligibility, excellent intelligibility for vowels in /hVd/ syllables, and fair intelligibility for consonants in CV nonsense syllables.
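The analysis and resynthesis steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the 16 kHz sample rate, the fixed decay rate, the simple local-maximum peak picker, and the exponentially distributed random pulse gaps for unvoiced frames are all hypothetical choices that the abstract does not specify. Only the 800-Hz running-average window, the zeroing of negative values, and the pulsing of damped sinusoids come from the text.

```python
import numpy as np

def masked_spectrum(smoothed, bin_hz, window_hz=800.0):
    """Subtract a running average (the masking threshold) and zero negatives."""
    width = max(1, int(round(window_hz / bin_hz)))
    kernel = np.ones(width) / width
    # Edge padding keeps the running average well defined at the band edges
    # (an assumption; the abstract does not say how edges are handled).
    padded = np.pad(smoothed, (width // 2, width - 1 - width // 2), mode="edge")
    threshold = np.convolve(padded, kernel, mode="valid")
    return np.maximum(smoothed - threshold, 0.0)

def pick_peaks(spectrum, min_amp=1e-9):
    """Return indices of interior local maxima above min_amp (hypothetical picker)."""
    s = np.asarray(spectrum)
    interior = np.arange(1, len(s) - 1)
    is_peak = (s[1:-1] > s[:-2]) & (s[1:-1] >= s[2:]) & (s[1:-1] > min_amp)
    return interior[is_peak]

def pulse_times(duration_s, f0_hz=None, mean_gap_s=0.002, seed=0):
    """Voiced: one pulse per fundamental period. Unvoiced: random intervals."""
    if f0_hz is not None:
        return np.arange(0.0, duration_s, 1.0 / f0_hz)
    rng = np.random.default_rng(seed)
    # Exponentially distributed gaps are an assumption, not taken from the source.
    gaps = rng.exponential(mean_gap_s, size=int(4 * duration_s / mean_gap_s) + 1)
    times = np.cumsum(gaps)
    return times[times < duration_s]

def damped_sinusoid_frame(freqs_hz, amps, pulses_s, duration_s,
                          fs=16000, decay_per_s=200.0):
    """Sum exponentially damped sinusoids, restarted at every pulse time."""
    n = int(round(duration_s * fs))
    out = np.zeros(n)
    for t0 in pulses_s:
        start = int(round(t0 * fs))
        if start >= n:
            continue
        t = np.arange(n - start) / fs
        envelope = np.exp(-decay_per_s * t)  # assumed fixed decay rate
        for f, a in zip(freqs_hz, amps):
            out[start:] += a * envelope * np.sin(2.0 * np.pi * f * t)
    return out

# Toy example: a flat spectrum with two bumps at 500 Hz and 1500 Hz.
bin_hz = 10.0
freq_axis = np.arange(0.0, 4000.0 + bin_hz, bin_hz)
smoothed = (1.0
            + np.exp(-((freq_axis - 500.0) / 50.0) ** 2 / 2.0)
            + 0.5 * np.exp(-((freq_axis - 1500.0) / 50.0) ** 2 / 2.0))
masked = masked_spectrum(smoothed, bin_hz)
peaks = pick_peaks(masked)
voiced = damped_sinusoid_frame(freq_axis[peaks], masked[peaks],
                               pulse_times(0.05, f0_hz=100.0), 0.05)
unvoiced = damped_sinusoid_frame(freq_axis[peaks], masked[peaks],
                                 pulse_times(0.05), 0.05)
```

In this sketch the voiced and unvoiced frames share the same spectral peaks and differ only in pulse timing, which mirrors the voiced/unvoiced distinction the abstract draws.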


Related Articles

  • Unit Selection Algorithm Using Bi-grams Model For Corpus-Based Speech Synthesis. Kammoun, Mohamed Ali; Hamida, Ahmed Ben // International Journal of Signal Processing;2009, Vol. 5 Issue 2, p120 

    In this paper, we present a novel statistical approach to corpus-based speech synthesis. Classically, phonetic information is defined and considered as acoustic reference to be respected. In this way, many studies were elaborated for acoustical unit classification. This type of classification...

  • Implementation of Phonetic Context Variable Length Unit Selection Module for Malay Text to Speech. Tian-Swee Tan // Journal of Computer Science;2008, Vol. 4 Issue 7, p550 

    Problem statement: The main problem with current Malay Text-To-Speech (MTTS) synthesis system is the poor quality of the generated speech sound due to the inability of traditional TTS system to provide multiple choices of unit for generating more accurate synthesized speech. Approach: This study...

  • COMPUTER AIDED INSTRUCTION IN SPEECH SCIENCE. O'Malley, Michael H.; Kloker, Dean R. // Today's Speech;Spring1972, Vol. 20 Issue 2, p17 

    The use of a computer based speech analysis and synthesis system for elementary instruction in speech science is described. System organization and the development of an interactive command language are covered. This approach, in which the student actively uses the computer, is contrasted with...

  • Extraction of Arabic Standard Micromelody. Chentir, A.; Guerti, M.; Hirst, D. J. // Journal of Computer Science;2009, Vol. 5 Issue 2, p86 

    Problem statement: In the early days of speech synthesis research the obvious focus of attention was intelligibility. But many researchers agree that the major remaining obstacle to fully acceptable synthetic speech is that it continues to be insufficiently natural. Approach: In this study, we...

  • TÜRKÇE METİNDEN KONUŞMA SENTEZLEMEDE YAŞANAN SIKINTILAR VE ÇÖZÜM YÖNTEMLERİ [Problems Encountered in Turkish Text-to-Speech Synthesis and Solution Methods]. Canal, Ş. Murat; Kurnaz, Sefer; Yilmaz, A. Egemen // Journal of Aeronautics & Space Technologies / Havacilik ve Uzay;2010, Vol. 4 Issue 3, p47 

    Even though numerous research studies have so far been conducted on "text-to-speech (TTS)" systems for different languages, the number of studies regarding agglutinative languages such as Turkish is relatively few. A feature of agglutinative languages is that many words and phrases can be built...

  • Peculiarities of Person's Speech and Thought Activity in the Context of Multicultural Education. Olegovna Shishova, Evgeniya // Middle East Journal of Scientific Research;1/17/2014, Vol. 19 Issue 9, p1137 

    The article discusses the necessity to develop organizational and pedagogical conditions of optimizing speech and thought activity taking into account individual speech experience of students. It also displays the problem of bilingualism and multilingualism in the context of multicultural...

  • INTERACTIONS OF SPEECH AND MANUAL MOVEMENT IN A SYNCOPATED TASK. Inui, Nobuyuki // Perceptual & Motor Skills;Oct2007, Vol. 105 Issue 2, p447 

    The present study examined interactions of speech production and finger-tapping movement, using a syncopated motor task with two movements in 10 male right-handed undergraduate students (M age=21.0 yr.; SD=1.4). On the syncopated task, participants were required to produce one movement exactly...

  • On a Pitch Alteration for Speech Synthesis Systems. Kim, JongKuk; Hahn, HernSoo; Yoon, Uei-Joong; Bae, MyungJin // Wireless Personal Communications;Oct2009, Vol. 50 Issue 4, p435 

    Speech synthesis can be classified into waveform coding, source coding, or hybrid coding by the synthesis method. Among these, waveform coding is especially suitable for high-quality speech synthesis. However, a synthesis technique using syllable or phoneme units is not desirable since it fails to...

  • A general-purpose IsiZulu speech synthesizer. Louw, J. A.; Davel, M.; Barnard, E. // South African Journal of African Languages;2005, Vol. 25 Issue 2, p92 

    A general-purpose isiZulu text-to-speech (TTS) system was developed, based on the ‘Multisyn’ unit-selection approach supported by the Festival TTS toolkit. The development involved a number of challenges related to the interface between speech technology and linguistics for...

