TITLE
Simple Displays of Talker Location Improve Voice Identification Performance in Multitalker, Spatialized Audio Environments

AUTHOR(S)
Kilgore, Ryan M.
PUB. DATE
April 2009
SOURCE
Human Factors;Apr2009, Vol. 51 Issue 2, p224
SOURCE TYPE
Academic Journal
DOC. TYPE
Article
ABSTRACT
Objective: The aim of this study was to assess the voice identification benefits of visual depictions of the relative locations of spatialized talkers in a serial listening task.
Background: Although spatialized audio is known to improve speech intelligibility and voice identification accuracy within multitalker environments, prior studies have not found any additional benefit for augmenting spatialized audio with visual depictions of relative voice locations. These studies, however, were restricted to small audio environments (four voices), potentially limiting the ability of simple talker location displays to provide additional identification benefit.
Method: In the first experiment, 18 participants performed a voice identification task for four- and eight-voice environments under three display conditions: (a) nonspatialized voices with an audio-only display, (b) spatialized voices with an audio-only display, and (c) spatialized voices augmented by a visual display of relative talker locations. In the second experiment, 32 participants performed the same voice identification task within a spatialized eight-voice environment but with audio and visual displays of differing angular scale.
Results: Visually depicting relative talker locations improved voice identification performance in terms of both accuracy and response time, particularly for more populous auditory spaces. Both auditory and visual display scale affected these benefits, with large-angle displays performing the best for both modalities.
Conclusion: Results indicate that simple visual representations of spatialized audio environments help listeners identify voices and that these representations are more effective when the angular spacing (auditory and visual) between talker locations is increased.
Application: These results have important implications for the design and implementation of collaborative audio environments for shared, desktop, and portable communication devices.
ACCESSION #
43374040

 

Related Articles

  • Decision rule based on ordered-nearest-neighbour with applications to utterance verification. Huang, C.-S.; Lee, C.-H.; Wang, H.-C. // Electronics Letters;2/6/2003, Vol. 39 Issue 3, p327 

    Proposes a novel decision rule based on ordered-nearest-neighbor for utterance verification in automatic speech recognition technology. Exploitation of the underlying neighborhood distribution associated with each class as an auxiliary criterion for the plug-in maximum a posteriori decision...

  • Croatian Large Vocabulary Automatic Speech Recognition. Martinčić-Ipšić, Sanda; Pobar, Miran; Ipšić, Ivo // Automatika: Journal for Control, Measurement, Electronics, Compu;2011, Vol. 52 Issue 2, p147 

    This paper presents procedures used for development of a Croatian large vocabulary automatic speech recognition system (LVASR). The proposed acoustic model is based on context-dependent triphone hidden Markov models and Croatian phonetic rules. Different acoustic and language models, developed...

  • PROCEED With CAUTION. Yeager, David // For the Record (Great Valley Publishing Company, Inc.);1/30/2012, Vol. 24 Issue 2, p10 

    The article discusses the challenge of translating speech recognition technology's promise of streamlining operations and reducing staffing needs into savings without sacrificing quality and efficiency. It reminds hospitals about the ramifications of medical record and cites the training and...

  • A Study on Connection of Facility DB for People with Disability to Smartphone for Location and Voice Recognition and QR Code Recognition and to Navigation. Sung-Yong Yang; Dea-Woo Park // International Journal of Digital Content Technology & its Applic;Apr2014, Vol. 8 Issue 2, p157 

    The number of people with a disability registered with Korea's Ministry of Health and Welfare was more than 2.5 million in 2010 and approximately 2.51 million as of late 2012. Considering the potential disabilities brought on by aging, which is progressing rapidly in Korea, the number of people with...

  • Do-it-yourself. Stait, Rob // Utility Week;11/25/2011, p24 

    The article discusses the impact of the service incentive mechanism (SIM) on the approach by water companies to improving the quality of their service particularly in Great Britain. According to the author, there seems to be a consensus among water companies on the increasing importance of web...

  • Innovative Research in the Labs Part V: Intervoice Center for Conversational Technology at the University of Texas at Dallas. Jamison, Nancy // Speech Technology Magazine;Nov/Dec2006, Vol. 11 Issue 6, p40 

    The article focuses on technologies that were developed at the Intervoice Center for Conversational Technologies, located at the Human Language Technology Research Institute of the University of Texas at Dallas. One of those technologies is the paraphraser which takes input as a set of task...

  • Toys push technology into mainstream. Teague, Paul E. // Design News;12/7/92, Vol. 48 Issue 23, p25 

    The article reports on voice recognition technology applied to toys in the U.S. Voice recognition, once relegated to high-end applications such as telecommunications, has finally moved into toyland, as have scanning and multimedia technologies. And the cost-cutting technical refinements that...

  • Speaking and Listening. Silverstein, Alvin; Silverstein, Virginia; Nunn, Laura Silverstein // Hearing (Senses & Sensors);2001, p31 

    This chapter describes human speech. Human speech is more than just the ability to make sounds that are grouped into recognizable words. Speech also involves listening. Speech recognition systems were developed by computer scientists to recognize simple speech in jobs that require a...

  • What Ever Happened To Voice? Lowenstein, Mark // Wireless Week;6/15/2003, Vol. 9 Issue 13, p45 

    Focuses on voice portals and applications using voice navigation. Technology improvements in voice recognition and TTS; Promising concept for voice application; Feature deployed in standard application platforms; Example of how session management could help improve the user experience.
