Displaying 1 - 24 of 24
-
Dai, B., McQueen, J. M., Terporten, R., Hagoort, P., & Kösem, A. (2022). Distracting Linguistic Information Impairs Neural Tracking of Attended Speech. Current Research in Neurobiology, 3: 100043. doi:10.1016/j.crneur.2022.100043.
Abstract
Listening to speech is difficult in noisy environments, and is even harder when the interfering noise consists of intelligible speech as compared to unintelligible sounds. This suggests that the competing linguistic information interferes with the neural processing of target speech. Interference could either arise from a degradation of the neural representation of the target speech, or from increased representation of distracting speech that enters in competition with the target speech. We tested these alternative hypotheses using magnetoencephalography (MEG) while participants listened to a target clear speech in the presence of distracting noise-vocoded speech. Crucially, the distractors were initially unintelligible but became more intelligible after a short training session. Results showed that the comprehension of the target speech was poorer after training than before training. The neural tracking of target speech in the delta range (1–4 Hz) reduced in strength in the presence of a more intelligible distractor. In contrast, the neural tracking of distracting signals was not significantly modulated by intelligibility. These results suggest that the presence of distracting speech signals degrades the linguistic representation of target speech carried by delta oscillations. -
Hintz, F., Voeten, C. C., McQueen, J. M., & Meyer, A. S. (2022). Quantifying the relationships between linguistic experience, general cognitive skills and linguistic processing skills. In J. Culbertson, A. Perfors, H. Rabagliati, & V. Ramenzoni (
Eds. ), Proceedings of the 44th Annual Conference of the Cognitive Science Society (CogSci 2022) (pp. 2491-2496). Toronto, Canada: Cognitive Science Society.Abstract
Humans differ greatly in their ability to use language. Contemporary psycholinguistic theories assume that individual differences in language skills arise from variability in linguistic experience and in general cognitive skills. While much previous research has tested the involvement of select verbal and non-verbal variables in select domains of linguistic processing, comprehensive characterizations of the relationships among the skills underlying language use are rare. We contribute to such a research program by re-analyzing a publicly available set of data from 112 young adults tested on 35 behavioral tests. The tests assessed nine key constructs reflecting linguistic processing skills, linguistic experience and general cognitive skills. Correlation and hierarchical clustering analyses of the test scores showed that most of the tests assumed to measure the same construct correlated moderately to strongly and largely clustered together. Furthermore, the results suggest important roles of processing speed in comprehension, and of linguistic experience in production. -
Menks, W. M., Ekerdt, C., Janzen, G., Kidd, E., Lemhöfer, K., Fernández, G., & McQueen, J. M. (2022). Study protocol: A comprehensive multi-method neuroimaging approach to disentangle developmental effects and individual differences in second language learning. BMC Psychology, 10: 169. doi:10.1186/s40359-022-00873-x.
Abstract
Background
While it is well established that second language (L2) learning success changes with age and across individuals, the underlying neural mechanisms responsible for this developmental shift and these individual differences are largely unknown. We will study the behavioral and neural factors that subserve new grammar and word learning in a large cross-sectional developmental sample. This study falls under the NWO (Nederlandse Organisatie voor Wetenschappelijk Onderzoek [Dutch Research Council]) Language in Interaction consortium (website: https://www.languageininteraction.nl/).
Methods
We will sample 360 healthy individuals across a broad age range between 8 and 25 years. In this paper, we describe the study design and protocol, which involves multiple study visits covering a comprehensive behavioral battery and extensive magnetic resonance imaging (MRI) protocols. On the basis of these measures, we will create behavioral and neural fingerprints that capture age-based and individual variability in new language learning. The behavioral fingerprint will be based on first and second language proficiency, memory systems, and executive functioning. We will map the neural fingerprint for each participant using the following MRI modalities: T1‐weighted, diffusion-weighted, resting-state functional MRI, and multiple functional-MRI paradigms. With respect to the functional MRI measures, half of the sample will learn grammatical features and half will learn words of a new language. Combining all individual fingerprints allows us to explore the neural maturation effects on grammar and word learning.
Discussion
This will be one of the largest neuroimaging studies to date that investigates the developmental shift in L2 learning covering preadolescence to adulthood. Our comprehensive approach of combining behavioral and neuroimaging data will contribute to the understanding of the mechanisms influencing this developmental shift and individual differences in new language learning. We aim to answer: (I) do these fingerprints differ according to age and can these explain the age-related differences observed in new language learning? And (II) which aspects of the behavioral and neural fingerprints explain individual differences (across and within ages) in grammar and word learning? The results of this study provide a unique opportunity to understand how the development of brain structure and function influence new language learning success. -
Severijnen, G. G. A., Bosker, H. R., & McQueen, J. M. (2022). Acoustic correlates of Dutch lexical stress re-examined: Spectral tilt is not always more reliable than intensity. In S. Frota, M. Cruz, & M. Vigário (
Eds. ), Proceedings of Speech Prosody 2022 (pp. 278-282). doi:10.21437/SpeechProsody.2022-57.Abstract
The present study examined two acoustic cues in the production
of lexical stress in Dutch: spectral tilt and overall intensity.
Sluijter and Van Heuven (1996) reported that spectral tilt is a
more reliable cue to stress than intensity. However, that study
included only a small number of talkers (10) and only syllables
with the vowels /aː/ and /ɔ/.
The present study re-examined this issue in a larger and
more variable dataset. We recorded 38 native speakers of Dutch
(20 females) producing 744 tokens of Dutch segmentally
overlapping words (e.g., VOORnaam vs. voorNAAM, “first
name” vs. “respectable”), targeting 10 different vowels, in
variable sentence contexts. For each syllable, we measured
overall intensity and spectral tilt following Sluijter and Van
Heuven (1996).
Results from Linear Discriminant Analyses showed that,
for the vowel /aː/ alone, spectral tilt showed an advantage over
intensity, as evidenced by higher stressed/unstressed syllable
classification accuracy scores for spectral tilt. However, when
all vowels were included in the analysis, the advantage
disappeared.
These findings confirm that spectral tilt plays a larger role
in signaling stress in Dutch /aː/ but show that, for a larger
sample of Dutch vowels, overall intensity and spectral tilt are
equally important. -
Strauß, A., Wu, T., McQueen, J. M., Scharenborg, O., & Hintz, F. (2022). The differential roles of lexical and sublexical processing during spoken-word recognition in clear and in noise. Cortex, 151, 70-88. doi:10.1016/j.cortex.2022.02.011.
Abstract
Successful spoken-word recognition relies on an interplay between lexical and sublexical processing. Previous research demonstrated that listeners readily shift between more lexically-biased and more sublexically-biased modes of processing in response to the situational context in which language comprehension takes place. Recognizing words in the presence of background noise reduces the perceptual evidence for the speech signal and – compared to the clear – results in greater uncertainty. It has been proposed that, when dealing with greater uncertainty, listeners rely more strongly on sublexical processing. The present study tested this proposal using behavioral and electroencephalography (EEG) measures. We reasoned that such an adjustment would be reflected in changes in the effects of variables predicting recognition performance with loci at lexical and sublexical levels, respectively. We presented native speakers of Dutch with words featuring substantial variability in (1) word frequency (locus at lexical level), (2) phonological neighborhood density (loci at lexical and sublexical levels) and (3) phonotactic probability (locus at sublexical level). Each participant heard each word in noise (presented at one of three signal-to-noise ratios) and in the clear and performed a two-stage lexical decision and transcription task while EEG was recorded. Using linear mixed-effects analyses, we observed behavioral evidence that listeners relied more strongly on sublexical processing when speech quality decreased. Mixed-effects modelling of the EEG signal in the clear condition showed that sublexical effects were reflected in early modulations of ERP components (e.g., within the first 300 ms post word onset). In noise, EEG effects occurred later and involved multiple regions activated in parallel. Taken together, we found evidence – especially in the behavioral data – supporting previous accounts that the presence of background noise induces a stronger reliance on sublexical processing. -
Bakker-Marshall, I., Takashima, A., Schoffelen, J.-M., Van Hell, J. G., Janzen, G., & McQueen, J. M. (2018). Theta-band Oscillations in the Middle Temporal Gyrus Reflect Novel Word Consolidation. Journal of Cognitive Neuroscience, 30(5), 621-633. doi:10.1162/jocn_a_01240.
Abstract
Like many other types of memory formation, novel word learning benefits from an offline consolidation period after the initial encoding phase. A previous EEG study has shown that retrieval of novel words elicited more word-like-induced electrophysiological brain activity in the theta band after consolidation [Bakker, I., Takashima, A., van Hell, J. G., Janzen, G., & McQueen, J. M. Changes in theta and beta oscillations as signatures of novel word consolidation. Journal of Cognitive Neuroscience, 27, 1286–1297, 2015]. This suggests that theta-band oscillations play a role in lexicalization, but it has not been demonstrated that this effect is directly caused by the formation of lexical representations. This study used magnetoencephalography to localize the theta consolidation effect to the left posterior middle temporal gyrus (pMTG), a region known to be involved in lexical storage. Both untrained novel words and words learned immediately before test elicited lower theta power during retrieval than existing words in this region. After a 24-hr consolidation period, the difference between novel and existing words decreased significantly, most strongly in the left pMTG. The magnitude of the decrease after consolidation correlated with an increase in behavioral competition effects between novel words and existing words with similar spelling, reflecting functional integration into the mental lexicon. These results thus provide new evidence that consolidation aids the development of lexical representations mediated by the left pMTG. Theta synchronization may enable lexical access by facilitating the simultaneous activation of distributed semantic, phonological, and orthographic representations that are bound together in the pMTG. -
Eisner, F., & McQueen, J. M. (2018). Speech perception. In S. Thompson-Schill (
Ed. ), Stevens’ handbook of experimental psychology and cognitive neuroscience (4th ed.). Volume 3: Language & thought (pp. 1-46). Hoboken: Wiley. doi:10.1002/9781119170174.epcn301.Abstract
This chapter reviews the computational processes that are responsible for recognizing word forms in the speech stream. We outline the different stages in a processing hierarchy from the extraction of general acoustic features, through speech‐specific prelexical processes, to the retrieval and selection of lexical representations. We argue that two recurring properties of the system as a whole are abstraction and adaptability. We also present evidence for parallel processing of information on different timescales, more specifically that segmental material in the speech stream (its consonants and vowels) is processed in parallel with suprasegmental material (the prosodic structures of spoken words). We consider evidence from both psycholinguistics and neurobiology wherever possible, and discuss how the two fields are beginning to address common computational problems. The challenge for future research in speech perception will be to build an account that links these computational problems, through functional mechanisms that address them, to neurobiological implementation. -
Francisco, A. A., Takashima, A., McQueen, J. M., Van den Bunt, M., Jesse, A., & Groen, M. A. (2018). Adult dyslexic readers benefit less from visual input during audiovisual speech processing: fMRI evidence. Neuropsychologia, 117, 454-471. doi:10.1016/j.neuropsychologia.2018.07.009.
Abstract
The aim of the present fMRI study was to investigate whether typical and dyslexic adult readers differed in the neural correlates of audiovisual speech processing. We tested for Blood Oxygen-Level Dependent (BOLD) activity differences between these two groups in a 1-back task, as they processed written (word, illegal consonant strings) and spoken (auditory, visual and audiovisual) stimuli. When processing written stimuli, dyslexic readers showed reduced activity in the supramarginal gyrus, a region suggested to play an important role in phonological processing, but only when they processed strings of consonants, not when they read words. During the speech perception tasks, dyslexic readers were only slower than typical readers in their behavioral responses in the visual speech condition. Additionally, dyslexic readers presented reduced neural activation in the auditory, the visual, and the audiovisual speech conditions. The groups also differed in terms of superadditivity, with dyslexic readers showing decreased neural activation in the regions of interest. An additional analysis focusing on vision-related processing during the audiovisual condition showed diminished activation for the dyslexic readers in a fusiform gyrus cluster. Our results thus suggest that there are differences in audiovisual speech processing between dyslexic and normal readers. These differences might be explained by difficulties in processing the unisensory components of audiovisual speech, more specifically, dyslexic readers may benefit less from visual information during audiovisual speech processing than typical readers. Given that visual speech processing supports the development of phonological skills fundamental in reading, differences in processing of visual speech could contribute to differences in reading ability between typical and dyslexic readers. -
Franken, M. K., Acheson, D. J., McQueen, J. M., Hagoort, P., & Eisner, F. (2018). Opposing and following responses in sensorimotor speech control: Why responses go both ways. Psychonomic Bulletin & Review, 25(4), 1458-1467. doi:10.3758/s13423-018-1494-x.
Abstract
When talking, speakers continuously monitor and use the auditory feedback of their own voice to control and inform speech production processes. When speakers are provided with auditory feedback that is perturbed in real time, most of them compensate for this by opposing the feedback perturbation. But some speakers follow the perturbation. In the current study, we investigated whether the state of the speech production system at perturbation onset may determine what type of response (opposing or following) is given. The results suggest that whether a perturbation-related response is opposing or following depends on ongoing fluctuations of the production system: It initially responds by doing the opposite of what it was doing. This effect and the non-trivial proportion of following responses suggest that current production models are inadequate: They need to account for why responses to unexpected sensory feedback depend on the production-system’s state at the time of perturbation.Additional information
https://link.springer.com/article/10.3758%2Fs13423-018-1494-x#SupplementaryMate… -
Franken, M. K., Eisner, F., Acheson, D. J., McQueen, J. M., Hagoort, P., & Schoffelen, J.-M. (2018). Self-monitoring in the cerebral cortex: Neural responses to pitch-perturbed auditory feedback during speech production. NeuroImage, 179, 326-336. doi:10.1016/j.neuroimage.2018.06.061.
Abstract
Speaking is a complex motor skill which requires near instantaneous integration of sensory and motor-related information. Current theory hypothesizes a complex interplay between motor and auditory processes during speech production, involving the online comparison of the speech output with an internally generated forward model. To examine the neural correlates of this intricate interplay between sensory and motor processes, the current study uses altered auditory feedback (AAF) in combination with magnetoencephalography (MEG). Participants vocalized the vowel/e/and heard auditory feedback that was temporarily pitch-shifted by only 25 cents, while neural activity was recorded with MEG. As a control condition, participants also heard the recordings of the same auditory feedback that they heard in the first half of the experiment, now without vocalizing. The participants were not aware of any perturbation of the auditory feedback. We found auditory cortical areas responded more strongly to the pitch shifts during vocalization. In addition, auditory feedback perturbation resulted in spectral power increases in the θ and lower β bands, predominantly in sensorimotor areas. These results are in line with current models of speech production, suggesting auditory cortical areas are involved in an active comparison between a forward model's prediction and the actual sensory input. Subsequently, these areas interact with motor areas to generate a motor response. Furthermore, the results suggest that θ and β power increases support auditory-motor interaction, motor error detection and/or sensory prediction processing. -
Goriot, C., Broersma, M., McQueen, J. M., Unsworth, S., & Van Hout, R. (2018). Language balance and switching ability in children acquiring English as a second language. Journal of Experimental Child Psychology, 173, 168-186. doi:10.1016/j.jecp.2018.03.019.
Abstract
This study investigated whether relative lexical proficiency in Dutch and English in child second language (L2) learners is related to executive functioning. Participants were Dutch primary school pupils of three different age groups (4–5, 8–9, and 11–12 years) who either were enrolled in an early-English schooling program or were age-matched controls not on that early-English program. Participants performed tasks that measured switching, inhibition, and working memory. Early-English program pupils had greater knowledge of English vocabulary and more balanced Dutch–English lexicons. In both groups, lexical balance, a ratio measure obtained by dividing vocabulary scores in English by those in Dutch, was related to switching but not to inhibition or working memory performance. These results show that for children who are learning an L2 in an instructional setting, and for whom managing two languages is not yet an automatized process, language balance may be more important than L2 proficiency in influencing the relation between childhood bilingualism and switching abilities. -
Mitterer, H., Reinisch, E., & McQueen, J. M. (2018). Allophones, not phonemes in spoken-word recognition. Journal of Memory and Language, 98, 77-92. doi:10.1016/j.jml.2017.09.005.
Abstract
What are the phonological representations that listeners use to map information about the segmental content of speech onto the mental lexicon during spoken-word recognition? Recent evidence from perceptual-learning paradigms seems to support (context-dependent) allophones as the basic representational units in spoken-word recognition. But recent evidence from a selective-adaptation paradigm seems to suggest that context-independent phonemes also play a role. We present three experiments using selective adaptation that constitute strong tests of these representational hypotheses. In Experiment 1, we tested generalization of selective adaptation using different allophones of Dutch /r/ and /l/ – a case where generalization has not been found with perceptual learning. In Experiments 2 and 3, we tested generalization of selective adaptation using German back fricatives in which allophonic and phonemic identity were varied orthogonally. In all three experiments, selective adaptation was observed only if adaptors and test stimuli shared allophones. Phonemic identity, in contrast, was neither necessary nor sufficient for generalization of selective adaptation to occur. These findings and other recent data using the perceptual-learning paradigm suggest that pre-lexical processing during spoken-word recognition is based on allophones, and not on context-independent phonemes -
Norris, D., McQueen, J. M., & Cutler, A. (2018). Commentary on “Interaction in spoken word recognition models". Frontiers in Psychology, 9: 1568. doi:10.3389/fpsyg.2018.01568.
-
Thorin, J., Sadakata, M., Desain, P., & McQueen, J. M. (2018). Perception and production in interaction during non-native speech category learning. The Journal of the Acoustical Society of America, 144(1), 92-103. doi:10.1121/1.5044415.
Abstract
Establishing non-native phoneme categories can be a notoriously difficult endeavour—in both speech perception and speech production. This study asks how these two domains interact in the course of this learning process. It investigates the effect of perceptual learning and related production practice of a challenging non-native category on the perception and/or production of that category. A four-day perceptual training protocol on the British English /æ/-/ɛ/ vowel contrast was combined with either related or unrelated production practice. After feedback on perceptual categorisation of the contrast, native Dutch participants in the related production group (N = 19) pronounced the trial's correct answer, while participants in the unrelated production group (N = 19) pronounced similar but phonologically unrelated words. Comparison of pre- and post-tests showed significant improvement over the course of training in both perception and production, but no differences between the groups were found. The lack of an effect of production practice is discussed in the light of previous, competing results and models of second-language speech perception and production. This study confirms that, even in the context of related production practice, perceptual training boosts production learning. -
Viebahn, M., McQueen, J. M., Ernestus, M., Frauenfelder, U. H., & Bürki, A. (2018). How much does orthography influence the processing of reduced word forms? Evidence from novel-word learning about French schwa deletion. The Quarterly Journal of Experimental Psychology, 71(11), 2378-2394. doi:10.1177/1747021817741859.
Abstract
This study examines the influence of orthography on the processing of reduced word forms. For this purpose, we compared the impact of phonological variation with the impact of spelling-sound consistency on the processing of words that may be produced with or without the vowel schwa. Participants learnt novel French words in which the vowel schwa was present or absent in the first syllable. In Experiment 1, the words were consistently produced without schwa or produced in a variable manner (i.e., sometimes produced with and sometimes produced without schwa). In Experiment 2, words were always produced in a consistent manner, but an orthographic exposure phase was included in which words that were produced without schwa were either spelled with or without the letter. Results from naming and eye-tracking tasks suggest that both phonological variation and spelling-sound consistency influence the processing of spoken novel words. However, the influence of phonological variation outweighs the effect of spelling-sound consistency. Our findings therefore suggest that the influence of orthography on the processing of reduced word forms is relatively small. -
Cho, T., & McQueen, J. M. (2005). Prosodic influences on consonant production in Dutch: Effects of prosodic boundaries, phrasal accent and lexical stress. Journal of Phonetics, 33(2), 121-157. doi:10.1016/j.wocn.2005.01.001.
Abstract
Prosodic influences on phonetic realizations of four Dutch consonants (/t d s z/) were examined. Sentences were constructed containing these consonants in word-initial position; the factors lexical stress, phrasal accent and prosodic boundary were manipulated between sentences. Eleven Dutch speakers read these sentences aloud. The patterns found in acoustic measurements of these utterances (e.g., voice onset time (VOT), consonant duration, voicing during closure, spectral center of gravity, burst energy) indicate that the low-level phonetic implementation of all four consonants is modulated by prosodic structure. Boundary effects on domain-initial segments were observed in stressed and unstressed syllables, extending previous findings which have been on stressed syllables alone. Three aspects of the data are highlighted. First, shorter VOTs were found for /t/ in prosodically stronger locations (stressed, accented and domain-initial), as opposed to longer VOTs in these positions in English. This suggests that prosodically driven phonetic realization is bounded by language-specific constraints on how phonetic features are specified with phonetic content: Shortened VOT in Dutch reflects enhancement of the phonetic feature {−spread glottis}, while lengthened VOT in English reflects enhancement of {+spread glottis}. Prosodic strengthening therefore appears to operate primarily at the phonetic level, such that prosodically driven enhancement of phonological contrast is determined by phonetic implementation of these (language-specific) phonetic features. Second, an accent effect was observed in stressed and unstressed syllables, and was independent of prosodic boundary size. The domain of accentuation in Dutch is thus larger than the foot. Third, within a prosodic category consisting of those utterances with a boundary tone but no pause, tokens with syntactically defined Phonological Phrase boundaries could be differentiated from the other tokens. This syntactic influence on prosodic phrasing implies the existence of an intermediate-level phrase in the prosodic hierarchy of Dutch. -
Cutler, A., McQueen, J. M., & Norris, D. (2005). The lexical utility of phoneme-category plasticity. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 103-107).
-
Eisner, F., & McQueen, J. M. (2005). The specificity of perceptual learning in speech processing. Perception & Psychophysics, 67(2), 224-238.
Abstract
We conducted four experiments to investigate the specificity of perceptual adjustments made to unusual speech sounds. Dutch listeners heard a female talker produce an ambiguous fricative [?] (between [f] and [s]) in [f]- or [s]-biased lexical contexts. Listeners with [f]-biased exposure (e.g., [witlo?]; from witlof, “chicory”; witlos is meaningless) subsequently categorized more sounds on an [εf]–[εs] continuum as [f] than did listeners with [s]-biased exposure. This occurred when the continuum was based on the exposure talker's speech (Experiment 1), and when the same test fricatives appeared after vowels spoken by novel female and male talkers (Experiments 1 and 2). When the continuum was made entirely from a novel talker's speech, there was no exposure effect (Experiment 3) unless fricatives from that talker had been spliced into the exposure talker's speech during exposure (Experiment 4). We conclude that perceptual learning about idiosyncratic speech is applied at a segmental level and is, under these exposure conditions, talker specific. -
McQueen, J. M. (2005). Speech perception. In K. Lamberts, & R. Goldstone (
Eds. ), The Handbook of Cognition (pp. 255-275). London: Sage Publications. -
McQueen, J. M. (2005). Spoken word recognition and production: Regular but not inseparable bedfellows. In A. Cutler (
Ed. ), Twenty-first century psycholinguistics: Four cornerstones (pp. 229-244). Mahwah, NJ: Erlbaum. -
McQueen, J. M., & Sereno, J. (2005). Cleaving automatic processes from strategic biases in phonological priming. Memory & Cognition, 33(7), 1185-1209.
Abstract
In a phonological priming experiment using spoken Dutch words, Dutch listeners were taught varying expectancies and relatedness relations about the phonological form of target words, given particular primes. They learned to expect that, after a particular prime, if the target was a word, it would be from a specific phonological category. The expectancy either involved phonological overlap (e.g., honk-vonk, “base-spark”; expected related) or did not (e.g., nest-galm, “nest-boom”; expected unrelated, where the learned expectation after hearing nest was a word rhyming in -alm). Targets were occasionally inconsistent with expectations. In these inconsistent expectancy trials, targets were either unrelated (e.g., honk-mest, “base-manure”; unexpected unrelated), where the listener was expecting a related target, or related (e.g., nest-pest, “nest-plague”; unexpected related), where the listener was expecting an unrelated target. Participant expectations and phonological relatedness were thus manipulated factorially for three types of phonological overlap (rhyme, one onset phoneme, and three onset phonemes) at three interstimulus intervals (ISIs; 50, 500, and 2,000 msec). Lexical decisions to targets revealed evidence of expectancy-based strategies for all three types of overlap (e.g., faster responses to expected than to unexpected targets, irrespective of phonological relatedness) and evidence of automatic phonological processes, but only for the rhyme and three-phoneme onset overlap conditions and, most strongly, at the shortest ISI (e.g., faster responses to related than to unrelated targets, irrespective of expectations). Although phonological priming thus has both automatic and strategic components, it is possible to cleave them apart. -
McQueen, J. M., & Mitterer, H. (2005). Lexically-driven perceptual adjustments of vowel categories. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 233-236).
-
Scharenborg, O., Norris, D., Ten Bosch, L., & McQueen, J. M. (2005). How should a speech recognizer work? Cognitive Science, 29(6), 867-918. doi:10.1207/s15516709cog0000_37.
Abstract
Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that research in these related fields has focused on the mechanics of how speech can be recognized. In Marr's (1982) terms, emphasis has been on the algorithmic and implementational levels rather than on the computational level. In this article, we provide a computational-level analysis of the task of speech recognition, which reveals the close parallels between research concerned with HSR and ASR. We illustrate this relation by presenting a new computational model of human spoken-word recognition, built using techniques from the field of ASR that, in contrast to current existing models of HSR, recognizes words from real speech input. -
Warner, N., Smits, R., McQueen, J. M., & Cutler, A. (2005). Phonological and statistical effects on timing of speech perception: Insights from a database of Dutch diphone perception. Speech Communication, 46(1), 53-72. doi:10.1016/j.specom.2005.01.003.
Abstract
We report detailed analyses of a very large database on timing of speech perception collected by Smits et al. (Smits, R., Warner, N., McQueen, J.M., Cutler, A., 2003. Unfolding of phonetic information over time: A database of Dutch diphone perception. J. Acoust. Soc. Am. 113, 563–574). Eighteen listeners heard all possible diphones of Dutch, gated in portions of varying size and presented without background noise. The present report analyzes listeners’ responses across gates in terms of phonological features (voicing, place, and manner for consonants; height, backness, and length for vowels). The resulting patterns for feature perception differ from patterns reported when speech is presented in noise. The data are also analyzed for effects of stress and of phonological context (neighboring vowel vs. consonant); effects of these factors are observed to be surprisingly limited. Finally, statistical effects, such as overall phoneme frequency and transitional probabilities, along with response biases, are examined; these too exercise only limited effects on response patterns. The results suggest highly accurate speech perception on the basis of acoustic information alone.
Share this page