Audiovisual recalibration of vowel categories

Franken, M. K., Eisner, F., Schoffelen, J.-M., Acheson, D. J., Hagoort, P., & McQueen, J. M. (2017). Audiovisual recalibration of vowel categories. Talk presented at Psycholinguistics in Flanders (PiF 2017). Leuven, Belgium. 2017-05-29 - 2017-05-30.
One of the most daunting tasks of a listener is to map a continuous auditory stream onto known speech sound categories and lexical items. A major issue with this mapping problem is the variability in the acoustic realizations of sound categories, both within and across speakers. Past research has suggested listeners may use various sources of information, such as lexical knowledge or visual cues (e.g., lip-reading), to recalibrate these speech categories to the current speaker. Previous studies have focused on audiovisual recalibration of consonant categories. The present study explores whether vowel categorization, which is known to show less sharply defined category boundaries, also benefits from visual cues.
Participants were exposed to videos of a speaker pronouncing one of two Dutch vowels (/e/ or /ø/), paired with audio that was ambiguous between the two vowels. The most ambiguous vowel token was determined on an individual basis by a categorization task at the beginning of the experiment. In one group of participants, this auditory token was paired with a video of an /e/ articulation, in the other group with an /ø/ video. After exposure to these videos, an audio-only categorization task showed that participants had adapted their categorization behavior as a function of the video exposure. The group that was exposed to /e/ videos showed a reduction of /ø/ classifications, suggesting they had recalibrated their vowel categories based on the available visual information. These results show that listeners indeed use visual information to recalibrate vowel categories, which is in line with previous work on audiovisual recalibration of consonant categories and on lexically guided recalibration of both vowels and consonants.
In addition, a secondary aim of the current study was to explore individual variability in audiovisual recalibration. Phoneme categories vary not only in terms of boundary location, but also in terms of boundary sharpness, or how strictly categories are distinguished. The present study explores whether this sharpness is associated with the amount of audiovisual recalibration. The results tentatively suggest that a fuzzier boundary is associated with stronger recalibration, and hence that listeners’ category sharpness may be related to the weight they assign to visual information in audiovisual speech perception. If listeners with fuzzy boundaries assign more weight to visual cues, then, given that vowel categories have less sharp boundaries than consonants, there ought to be audiovisual recalibration for vowels as well. This is exactly what was found in the current study.
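The abstract does not state how boundary location, boundary sharpness, or the most ambiguous token were estimated. As an illustration only, one common approach is to fit a logistic psychometric function to each listener's categorization responses, taking the 50% crossover as the boundary and the slope as the sharpness. The sketch below assumes that approach; the function names, the 7-step continuum, and the response counts are hypothetical and are not data from the study.

```python
import numpy as np

def fit_logistic(steps, n_second, n_total, iters=50):
    """Fit P(/ø/ response) = 1 / (1 + exp(-(a + b * (step - mean_step))))
    by maximum likelihood using Newton's method.
    Returns (boundary, slope): the 50% crossover and the logistic slope."""
    x = np.asarray(steps, dtype=float)
    k = np.asarray(n_second, dtype=float)   # /ø/ responses per step
    n = np.asarray(n_total, dtype=float)    # trials per step
    xc = x - x.mean()                       # centering keeps Newton stable
    X = np.column_stack([np.ones_like(xc), xc])
    beta = np.zeros(2)
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (k - n * p)            # gradient of the log-likelihood
        w = n * p * (1.0 - p)
        hess = X.T @ (X * w[:, None])       # negative Hessian
        beta = beta + np.linalg.solve(hess, grad)
    a, b = beta
    boundary = x.mean() - a / b             # step value where P = 0.5
    return boundary, b

def most_ambiguous_step(steps, boundary):
    """Continuum step closest to the fitted category boundary."""
    x = np.asarray(steps, dtype=float)
    return x[np.argmin(np.abs(x - boundary))]

# Hypothetical counts of /ø/ responses out of 20 trials per continuum step.
steps = [1, 2, 3, 4, 5, 6, 7]
trials = [20] * 7
sharp_listener = [1, 2, 5, 10, 15, 18, 19]   # steep category boundary
fuzzy_listener = [6, 7, 8, 10, 12, 13, 14]   # shallow category boundary

boundary_sharp, slope_sharp = fit_logistic(steps, sharp_listener, trials)
boundary_fuzzy, slope_fuzzy = fit_logistic(steps, fuzzy_listener, trials)
ambiguous_token = most_ambiguous_step(steps, boundary_sharp)
```

Under these assumptions, a smaller fitted slope corresponds to a fuzzier boundary, and the per-listener slope could then be related to the size of the post-exposure shift in /ø/ classifications.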
Publication type
Talk
Publication date
2017
