Displaying 1 - 24 of 24
-
Dai, B., McQueen, J. M., Terporten, R., Hagoort, P., & Kösem, A. (2022). Distracting Linguistic Information Impairs Neural Tracking of Attended Speech. Current Research in Neurobiology, 3: 100043. doi:10.1016/j.crneur.2022.100043.
Abstract
Listening to speech is difficult in noisy environments, and is even harder when the interfering noise consists of intelligible speech as compared to unintelligible sounds. This suggests that the competing linguistic information interferes with the neural processing of target speech. Interference could either arise from a degradation of the neural representation of the target speech, or from increased representation of distracting speech that enters in competition with the target speech. We tested these alternative hypotheses using magnetoencephalography (MEG) while participants listened to a target clear speech in the presence of distracting noise-vocoded speech. Crucially, the distractors were initially unintelligible but became more intelligible after a short training session. Results showed that the comprehension of the target speech was poorer after training than before training. The neural tracking of target speech in the delta range (1–4 Hz) reduced in strength in the presence of a more intelligible distractor. In contrast, the neural tracking of distracting signals was not significantly modulated by intelligibility. These results suggest that the presence of distracting speech signals degrades the linguistic representation of target speech carried by delta oscillations. -
Hintz, F., Voeten, C. C., McQueen, J. M., & Meyer, A. S. (2022). Quantifying the relationships between linguistic experience, general cognitive skills and linguistic processing skills. In J. Culbertson, A. Perfors, H. Rabagliati, & V. Ramenzoni (
Eds. ), Proceedings of the 44th Annual Conference of the Cognitive Science Society (CogSci 2022) (pp. 2491-2496). Toronto, Canada: Cognitive Science Society.Abstract
Humans differ greatly in their ability to use language. Contemporary psycholinguistic theories assume that individual differences in language skills arise from variability in linguistic experience and in general cognitive skills. While much previous research has tested the involvement of select verbal and non-verbal variables in select domains of linguistic processing, comprehensive characterizations of the relationships among the skills underlying language use are rare. We contribute to such a research program by re-analyzing a publicly available set of data from 112 young adults tested on 35 behavioral tests. The tests assessed nine key constructs reflecting linguistic processing skills, linguistic experience and general cognitive skills. Correlation and hierarchical clustering analyses of the test scores showed that most of the tests assumed to measure the same construct correlated moderately to strongly and largely clustered together. Furthermore, the results suggest important roles of processing speed in comprehension, and of linguistic experience in production. -
Menks, W. M., Ekerdt, C., Janzen, G., Kidd, E., Lemhöfer, K., Fernández, G., & McQueen, J. M. (2022). Study protocol: A comprehensive multi-method neuroimaging approach to disentangle developmental effects and individual differences in second language learning. BMC Psychology, 10: 169. doi:10.1186/s40359-022-00873-x.
Abstract
Background
While it is well established that second language (L2) learning success changes with age and across individuals, the underlying neural mechanisms responsible for this developmental shift and these individual differences are largely unknown. We will study the behavioral and neural factors that subserve new grammar and word learning in a large cross-sectional developmental sample. This study falls under the NWO (Nederlandse Organisatie voor Wetenschappelijk Onderzoek [Dutch Research Council]) Language in Interaction consortium (website: https://www.languageininteraction.nl/).
Methods
We will sample 360 healthy individuals across a broad age range between 8 and 25 years. In this paper, we describe the study design and protocol, which involves multiple study visits covering a comprehensive behavioral battery and extensive magnetic resonance imaging (MRI) protocols. On the basis of these measures, we will create behavioral and neural fingerprints that capture age-based and individual variability in new language learning. The behavioral fingerprint will be based on first and second language proficiency, memory systems, and executive functioning. We will map the neural fingerprint for each participant using the following MRI modalities: T1‐weighted, diffusion-weighted, resting-state functional MRI, and multiple functional-MRI paradigms. With respect to the functional MRI measures, half of the sample will learn grammatical features and half will learn words of a new language. Combining all individual fingerprints allows us to explore the neural maturation effects on grammar and word learning.
Discussion
This will be one of the largest neuroimaging studies to date that investigates the developmental shift in L2 learning covering preadolescence to adulthood. Our comprehensive approach of combining behavioral and neuroimaging data will contribute to the understanding of the mechanisms influencing this developmental shift and individual differences in new language learning. We aim to answer: (I) do these fingerprints differ according to age and can these explain the age-related differences observed in new language learning? And (II) which aspects of the behavioral and neural fingerprints explain individual differences (across and within ages) in grammar and word learning? The results of this study provide a unique opportunity to understand how the development of brain structure and function influence new language learning success. -
Severijnen, G. G. A., Bosker, H. R., & McQueen, J. M. (2022). Acoustic correlates of Dutch lexical stress re-examined: Spectral tilt is not always more reliable than intensity. In S. Frota, M. Cruz, & M. Vigário (
Eds. ), Proceedings of Speech Prosody 2022 (pp. 278-282). doi:10.21437/SpeechProsody.2022-57.Abstract
The present study examined two acoustic cues in the production
of lexical stress in Dutch: spectral tilt and overall intensity.
Sluijter and Van Heuven (1996) reported that spectral tilt is a
more reliable cue to stress than intensity. However, that study
included only a small number of talkers (10) and only syllables
with the vowels /aː/ and /ɔ/.
The present study re-examined this issue in a larger and
more variable dataset. We recorded 38 native speakers of Dutch
(20 females) producing 744 tokens of Dutch segmentally
overlapping words (e.g., VOORnaam vs. voorNAAM, “first
name” vs. “respectable”), targeting 10 different vowels, in
variable sentence contexts. For each syllable, we measured
overall intensity and spectral tilt following Sluijter and Van
Heuven (1996).
Results from Linear Discriminant Analyses showed that,
for the vowel /aː/ alone, spectral tilt showed an advantage over
intensity, as evidenced by higher stressed/unstressed syllable
classification accuracy scores for spectral tilt. However, when
all vowels were included in the analysis, the advantage
disappeared.
These findings confirm that spectral tilt plays a larger role
in signaling stress in Dutch /aː/ but show that, for a larger
sample of Dutch vowels, overall intensity and spectral tilt are
equally important. -
Strauß, A., Wu, T., McQueen, J. M., Scharenborg, O., & Hintz, F. (2022). The differential roles of lexical and sublexical processing during spoken-word recognition in clear and in noise. Cortex, 151, 70-88. doi:10.1016/j.cortex.2022.02.011.
Abstract
Successful spoken-word recognition relies on an interplay between lexical and sublexical processing. Previous research demonstrated that listeners readily shift between more lexically-biased and more sublexically-biased modes of processing in response to the situational context in which language comprehension takes place. Recognizing words in the presence of background noise reduces the perceptual evidence for the speech signal and – compared to the clear – results in greater uncertainty. It has been proposed that, when dealing with greater uncertainty, listeners rely more strongly on sublexical processing. The present study tested this proposal using behavioral and electroencephalography (EEG) measures. We reasoned that such an adjustment would be reflected in changes in the effects of variables predicting recognition performance with loci at lexical and sublexical levels, respectively. We presented native speakers of Dutch with words featuring substantial variability in (1) word frequency (locus at lexical level), (2) phonological neighborhood density (loci at lexical and sublexical levels) and (3) phonotactic probability (locus at sublexical level). Each participant heard each word in noise (presented at one of three signal-to-noise ratios) and in the clear and performed a two-stage lexical decision and transcription task while EEG was recorded. Using linear mixed-effects analyses, we observed behavioral evidence that listeners relied more strongly on sublexical processing when speech quality decreased. Mixed-effects modelling of the EEG signal in the clear condition showed that sublexical effects were reflected in early modulations of ERP components (e.g., within the first 300 ms post word onset). In noise, EEG effects occurred later and involved multiple regions activated in parallel. Taken together, we found evidence – especially in the behavioral data – supporting previous accounts that the presence of background noise induces a stronger reliance on sublexical processing. -
Franken, M. K., Acheson, D. J., McQueen, J. M., Hagoort, P., & Eisner, F. (2019). Consistency influences altered auditory feedback processing. Quarterly Journal of Experimental Psychology, 72(10), 2371-2379. doi:10.1177/1747021819838939.
Abstract
Previous research on the effect of perturbed auditory feedback in speech production has focused on two types of responses. In the short term, speakers generate compensatory motor commands in response to unexpected perturbations. In the longer term, speakers adapt feedforward motor programmes in response to feedback perturbations, to avoid future errors. The current study investigated the relation between these two types of responses to altered auditory feedback. Specifically, it was hypothesised that consistency in previous feedback perturbations would influence whether speakers adapt their feedforward motor programmes. In an altered auditory feedback paradigm, formant perturbations were applied either across all trials (the consistent condition) or only to some trials, whereas the others remained unperturbed (the inconsistent condition). The results showed that speakers’ responses were affected by feedback consistency, with stronger speech changes in the consistent condition compared with the inconsistent condition. Current models of speech-motor control can explain this consistency effect. However, the data also suggest that compensation and adaptation are distinct processes, which are not in line with all current models. -
Grey, S., Schubel, L. C., McQueen, J. M., & Van Hell, J. G. (2019). Processing foreign-accented speech in a second language: Evidence from ERPs during sentence comprehension in bilinguals. Bilingualism: Language and Cognition, 22(5), 912-929. doi:10.1017/S1366728918000937.
Abstract
This study examined electrophysiological correlates of sentence comprehension of native-accented and foreign-accented
speech in a second language (L2), for sentences produced in a foreign accent different from that associated with the listeners’
L1. Bilingual speaker-listeners process different accents in their L2 conversations, but the effects on real-time L2 sentence
comprehension are unknown. Dutch–English bilinguals listened to native American-English accented sentences and foreign
(and for them unfamiliarly-accented) Chinese-English accented sentences while EEG was recorded. Behavioral sentence
comprehension was highly accurate for both native-accented and foreign-accented sentences. ERPs showed different patterns
for L2 grammar and semantic processing of native- and foreign-accented speech. For grammar, only native-accented speech
elicited an Nref. For semantics, both native- and foreign-accented speech elicited an N400 effect, but with a delayed onset
across both accent conditions. These findings suggest that the way listeners comprehend native- and foreign-accented
sentences in their L2 depends on their familiarity with the accent. -
Janssen, C., Segers, E., McQueen, J. M., & Verhoeven, L. (2019). Comparing effects of instruction on word meaning and word form on early literacy abilities in kindergarten. Early Education and Development, 30(3), 375-399. doi:10.1080/10409289.2018.1547563.
Abstract
Research Findings: The present study compared effects of explicit instruction on and practice with the phonological form of words (form-focused instruction) versus explicit instruction on and practice with the meaning of words (meaning-focused instruction). Instruction was given via interactive storybook reading in the kindergarten classroom of children learning Dutch. We asked whether the 2 types of instruction had different effects on vocabulary development and 2 precursors of reading ability—phonological awareness and letter knowledge—and we examined effects on these measures of the ability to learn new words with minimal acoustic-phonetic differences. Learners showed similar receptive target-word vocabulary gain after both types of instruction, but learners who received form-focused vocabulary instruction showed more gain in semantic knowledge of target vocabulary, phonological awareness, and letter knowledge than learners who received meaning-focused vocabulary instruction. Level of ability to learn pairs of words with minimal acoustic-phonetic differences predicted gain in semantic knowledge of target vocabulary and in letter knowledge in the form-focused instruction group only. Practice or Policy: A focus on the form of words during instruction appears to have benefits for young children learning vocabulary. -
McQueen, J. M., & Meyer, A. S. (2019). Key issues and future directions: Towards a comprehensive cognitive architecture for language use. In P. Hagoort (
Ed. ), Human language: From genes and brain to behavior (pp. 85-96). Cambridge, MA: MIT Press. -
Mickan, A., McQueen, J. M., & Lemhöfer, K. (2019). Bridging the gap between second language acquisition research and memory science: The case of foreign language attrition. Frontiers in Human Neuroscience, 13: 397. doi:10.3389/fnhum.2019.00397.
Abstract
The field of second language acquisition (SLA) is by nature of its subject a highly interdisciplinary area of research. Learning a (foreign) language, for example, involves encoding new words, consolidating and committing them to long-term memory, and later retrieving them. All of these processes have direct parallels in the domain of human memory and have been thoroughly studied by researchers in that field. Yet, despite these clear links, the two fields have largely developed in parallel and in isolation from one another. The present paper aims to promote more cross-talk between SLA and memory science. We focus on foreign language (FL) attrition as an example of a research topic in SLA where the parallels with memory science are especially apparent. We discuss evidence that suggests that competition between languages is one of the mechanisms of FL attrition, paralleling the interference process thought to underlie forgetting in other domains of human memory. Backed up by concrete suggestions, we advocate the use of paradigms from the memory literature to study these interference effects in the language domain. In doing so, we hope to facilitate future cross-talk between the two fields, and to further our understanding of FL attrition as a memory phenomenon. -
Schuerman, W. L., McQueen, J. M., & Meyer, A. S. (2019). Speaker statistical averageness modulates word recognition in adverse listening conditions. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (
Eds. ), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1203-1207). Canberra, Australia: Australasian Speech Science and Technology Association Inc.Abstract
We tested whether statistical averageness (SA) at the level of the individual speaker could predict a speaker’s intelligibility. 28 female and 21 male speakers of Dutch were recorded producing 336 sentences,
each containing two target nouns. Recordings were compared to those of all other same-sex speakers using dynamic time warping (DTW). For each sentence, the DTW distance constituted a metric
of phonetic distance from one speaker to all other speakers. SA comprised the average of these distances. Later, the same participants performed a word recognition task on the target nouns in the same sentences, under three degraded listening conditions. In all three conditions, accuracy increased with SA. This held even when participants listened to their own utterances. These findings suggest that listeners process speech with respect to the statistical
properties of the language spoken in their community, rather than using their own speech as a reference -
Solberg Økland, H., Todorović, A., Lüttke, C. S., McQueen, J. M., & De Lange, F. P. (2019). Combined predictive effects of sentential and visual constraints in early audiovisual speech processing. Scientific Reports, 9: 7870. doi:10.1038/s41598-019-44311-2.
Abstract
In language comprehension, a variety of contextual cues act in unison to render upcoming words more or less predictable. As a sentence unfolds, we use prior context (sentential constraints) to predict what the next words might be. Additionally, in a conversation, we can predict upcoming sounds through observing the mouth movements of a speaker (visual constraints). In electrophysiological studies, effects of visual constraints have typically been observed early in language processing, while effects of sentential constraints have typically been observed later. We hypothesized that the visual and the sentential constraints might feed into the same predictive process such that effects of sentential constraints might also be detectable early in language processing through modulations of the early effects of visual salience. We presented participants with audiovisual speech while recording their brain activity with magnetoencephalography. Participants saw videos of a person saying sentences where the last word was either sententially constrained or not, and began with a salient or non-salient mouth movement. We found that sentential constraints indeed exerted an early (N1) influence on language processing. Sentential modulations of the N1 visual predictability effect were visible in brain areas associated with semantic processing, and were differently expressed in the two hemispheres. In the left hemisphere, visual and sentential constraints jointly suppressed the auditory evoked field, while the right hemisphere was sensitive to visual constraints only in the absence of strong sentential constraints. These results suggest that sentential and visual constraints can jointly influence even very early stages of audiovisual speech comprehension. -
Takashima, A., Bakker-Marshall, I., Van Hell, J. G., McQueen, J. M., & Janzen, G. (2019). Neural correlates of word learning in children. Developmental Cognitive Neuroscience, 37: 100647. doi:10.1016/j.dcn.2019.100649.
Abstract
Memory representations of words are thought to undergo changes with consolidation: Episodic memories of novel words are transformed into lexical representations that interact with other words in the mental dictionary. Behavioral studies have shown that this lexical integration process is enhanced when there is more time for consolidation. Neuroimaging studies have further revealed that novel word representations are initially represented in a hippocampally-centered system, whereas left posterior middle temporal cortex activation increases with lexicalization. In this study, we measured behavioral and brain responses to newly-learned words in children. Two groups of Dutch children, aged between 8-10 and 14-16 years, were trained on 30 novel Japanese words depicting novel concepts. Children were tested on word-forms, word-meanings, and the novel words’ influence on existing word processing immediately after training, and again after a week. In line with the adult findings, hippocampal involvement decreased with time. Lexical integration, however, was not observed immediately or after a week, neither behaviorally nor neurally. It appears that time alone is not always sufficient for lexical integration to occur. We suggest that other factors (e.g., the novelty of the concepts and familiarity with the language the words are derived from) might also influence the integration process.Additional information
Supplementary data -
Van Goch, M. M., Verhoeven, L., & McQueen, J. M. (2019). Success in learning similar-sounding words predicts vocabulary depth above and beyond vocabulary breadth. Journal of Child Language, 46(1), 184-197. doi:10.1017/S0305000918000338.
Abstract
In lexical development, the specificity of phonological representations is important. The ability to build phonologically specific lexical representations predicts the number of words a child knows (vocabulary breadth), but it is not clear if it also fosters how well words are known (vocabulary depth). Sixty-six children were studied in kindergarten (age 5;7) and first grade (age 6;8). The predictive value of the ability to learn phonologically similar new words, phoneme discrimination ability, and phonological awareness on vocabulary breadth and depth were assessed using hierarchical regression. Word learning explained unique variance in kindergarten and first-grade vocabulary depth, over the other phonological factors. It did not explain unique variance in vocabulary breadth. Furthermore, even after controlling for kindergarten vocabulary breadth, kindergarten word learning still explained unique variance in first-grade vocabulary depth. Skill in learning phonologically similar words appears to predict knowledge children have about what words mean. -
Wagner, M. A., Broersma, M., McQueen, J. M., & Lemhöfer, K. (2019). Imitating speech in an unfamiliar language and an unfamiliar non-native accent in the native language. In S. Calhoun, P. Escudero, M. Tabain, & P. Warren (
Eds. ), Proceedings of the 19th International Congress of Phonetic Sciences (ICPhS 20195) (pp. 1362-1366). Canberra, Australia: Australasian Speech Science and Technology Association Inc.Abstract
This study concerns individual differences in speech imitation ability and the role that lexical representations play in imitation. We examined 1) whether imitation of sounds in an unfamiliar language (L0) is related to imitation of sounds in an unfamiliar
non-native accent in the speaker’s native language (L1) and 2) whether it is easier or harder to imitate speech when you know the words to be imitated. Fifty-nine native Dutch speakers imitated words with target vowels in Basque (/a/ and /e/) and Greekaccented
Dutch (/i/ and /u/). Spectral and durational
analyses of the target vowels revealed no relationship between the success of L0 and L1 imitation and no difference in performance between tasks (i.e., L1
imitation was neither aided nor blocked by lexical knowledge about the correct pronunciation). The results suggest instead that the relationship of the vowels to native phonological categories plays a bigger role in imitation -
Cho, T., & McQueen, J. M. (2005). Prosodic influences on consonant production in Dutch: Effects of prosodic boundaries, phrasal accent and lexical stress. Journal of Phonetics, 33(2), 121-157. doi:10.1016/j.wocn.2005.01.001.
Abstract
Prosodic influences on phonetic realizations of four Dutch consonants (/t d s z/) were examined. Sentences were constructed containing these consonants in word-initial position; the factors lexical stress, phrasal accent and prosodic boundary were manipulated between sentences. Eleven Dutch speakers read these sentences aloud. The patterns found in acoustic measurements of these utterances (e.g., voice onset time (VOT), consonant duration, voicing during closure, spectral center of gravity, burst energy) indicate that the low-level phonetic implementation of all four consonants is modulated by prosodic structure. Boundary effects on domain-initial segments were observed in stressed and unstressed syllables, extending previous findings which have been on stressed syllables alone. Three aspects of the data are highlighted. First, shorter VOTs were found for /t/ in prosodically stronger locations (stressed, accented and domain-initial), as opposed to longer VOTs in these positions in English. This suggests that prosodically driven phonetic realization is bounded by language-specific constraints on how phonetic features are specified with phonetic content: Shortened VOT in Dutch reflects enhancement of the phonetic feature {−spread glottis}, while lengthened VOT in English reflects enhancement of {+spread glottis}. Prosodic strengthening therefore appears to operate primarily at the phonetic level, such that prosodically driven enhancement of phonological contrast is determined by phonetic implementation of these (language-specific) phonetic features. Second, an accent effect was observed in stressed and unstressed syllables, and was independent of prosodic boundary size. The domain of accentuation in Dutch is thus larger than the foot. Third, within a prosodic category consisting of those utterances with a boundary tone but no pause, tokens with syntactically defined Phonological Phrase boundaries could be differentiated from the other tokens. This syntactic influence on prosodic phrasing implies the existence of an intermediate-level phrase in the prosodic hierarchy of Dutch. -
Cutler, A., McQueen, J. M., & Norris, D. (2005). The lexical utility of phoneme-category plasticity. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 103-107).
-
Eisner, F., & McQueen, J. M. (2005). The specificity of perceptual learning in speech processing. Perception & Psychophysics, 67(2), 224-238.
Abstract
We conducted four experiments to investigate the specificity of perceptual adjustments made to unusual speech sounds. Dutch listeners heard a female talker produce an ambiguous fricative [?] (between [f] and [s]) in [f]- or [s]-biased lexical contexts. Listeners with [f]-biased exposure (e.g., [witlo?]; from witlof, “chicory”; witlos is meaningless) subsequently categorized more sounds on an [εf]–[εs] continuum as [f] than did listeners with [s]-biased exposure. This occurred when the continuum was based on the exposure talker's speech (Experiment 1), and when the same test fricatives appeared after vowels spoken by novel female and male talkers (Experiments 1 and 2). When the continuum was made entirely from a novel talker's speech, there was no exposure effect (Experiment 3) unless fricatives from that talker had been spliced into the exposure talker's speech during exposure (Experiment 4). We conclude that perceptual learning about idiosyncratic speech is applied at a segmental level and is, under these exposure conditions, talker specific. -
McQueen, J. M. (2005). Speech perception. In K. Lamberts, & R. Goldstone (
Eds. ), The Handbook of Cognition (pp. 255-275). London: Sage Publications. -
McQueen, J. M. (2005). Spoken word recognition and production: Regular but not inseparable bedfellows. In A. Cutler (
Ed. ), Twenty-first century psycholinguistics: Four cornerstones (pp. 229-244). Mahwah, NJ: Erlbaum. -
McQueen, J. M., & Sereno, J. (2005). Cleaving automatic processes from strategic biases in phonological priming. Memory & Cognition, 33(7), 1185-1209.
Abstract
In a phonological priming experiment using spoken Dutch words, Dutch listeners were taught varying expectancies and relatedness relations about the phonological form of target words, given particular primes. They learned to expect that, after a particular prime, if the target was a word, it would be from a specific phonological category. The expectancy either involved phonological overlap (e.g., honk-vonk, “base-spark”; expected related) or did not (e.g., nest-galm, “nest-boom”; expected unrelated, where the learned expectation after hearing nest was a word rhyming in -alm). Targets were occasionally inconsistent with expectations. In these inconsistent expectancy trials, targets were either unrelated (e.g., honk-mest, “base-manure”; unexpected unrelated), where the listener was expecting a related target, or related (e.g., nest-pest, “nest-plague”; unexpected related), where the listener was expecting an unrelated target. Participant expectations and phonological relatedness were thus manipulated factorially for three types of phonological overlap (rhyme, one onset phoneme, and three onset phonemes) at three interstimulus intervals (ISIs; 50, 500, and 2,000 msec). Lexical decisions to targets revealed evidence of expectancy-based strategies for all three types of overlap (e.g., faster responses to expected than to unexpected targets, irrespective of phonological relatedness) and evidence of automatic phonological processes, but only for the rhyme and three-phoneme onset overlap conditions and, most strongly, at the shortest ISI (e.g., faster responses to related than to unrelated targets, irrespective of expectations). Although phonological priming thus has both automatic and strategic components, it is possible to cleave them apart. -
McQueen, J. M., & Mitterer, H. (2005). Lexically-driven perceptual adjustments of vowel categories. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 233-236).
-
Scharenborg, O., Norris, D., Ten Bosch, L., & McQueen, J. M. (2005). How should a speech recognizer work? Cognitive Science, 29(6), 867-918. doi:10.1207/s15516709cog0000_37.
Abstract
Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that research in these related fields has focused on the mechanics of how speech can be recognized. In Marr's (1982) terms, emphasis has been on the algorithmic and implementational levels rather than on the computational level. In this article, we provide a computational-level analysis of the task of speech recognition, which reveals the close parallels between research concerned with HSR and ASR. We illustrate this relation by presenting a new computational model of human spoken-word recognition, built using techniques from the field of ASR that, in contrast to current existing models of HSR, recognizes words from real speech input. -
Warner, N., Smits, R., McQueen, J. M., & Cutler, A. (2005). Phonological and statistical effects on timing of speech perception: Insights from a database of Dutch diphone perception. Speech Communication, 46(1), 53-72. doi:10.1016/j.specom.2005.01.003.
Abstract
We report detailed analyses of a very large database on timing of speech perception collected by Smits et al. (Smits, R., Warner, N., McQueen, J.M., Cutler, A., 2003. Unfolding of phonetic information over time: A database of Dutch diphone perception. J. Acoust. Soc. Am. 113, 563–574). Eighteen listeners heard all possible diphones of Dutch, gated in portions of varying size and presented without background noise. The present report analyzes listeners’ responses across gates in terms of phonological features (voicing, place, and manner for consonants; height, backness, and length for vowels). The resulting patterns for feature perception differ from patterns reported when speech is presented in noise. The data are also analyzed for effects of stress and of phonological context (neighboring vowel vs. consonant); effects of these factors are observed to be surprisingly limited. Finally, statistical effects, such as overall phoneme frequency and transitional probabilities, along with response biases, are examined; these too exercise only limited effects on response patterns. The results suggest highly accurate speech perception on the basis of acoustic information alone.
Share this page