James McQueen

Publications

  • Cho, T., & McQueen, J. M. (2005). Prosodic influences on consonant production in Dutch: Effects of prosodic boundaries, phrasal accent and lexical stress. Journal of Phonetics, 33(2), 121-157. doi:10.1016/j.wocn.2005.01.001.

    Abstract

    Prosodic influences on phonetic realizations of four Dutch consonants (/t d s z/) were examined. Sentences were constructed containing these consonants in word-initial position; the factors lexical stress, phrasal accent and prosodic boundary were manipulated between sentences. Eleven Dutch speakers read these sentences aloud. The patterns found in acoustic measurements of these utterances (e.g., voice onset time (VOT), consonant duration, voicing during closure, spectral center of gravity, burst energy) indicate that the low-level phonetic implementation of all four consonants is modulated by prosodic structure. Boundary effects on domain-initial segments were observed in stressed and unstressed syllables, extending previous findings which have been on stressed syllables alone. Three aspects of the data are highlighted. First, shorter VOTs were found for /t/ in prosodically stronger locations (stressed, accented and domain-initial), as opposed to longer VOTs in these positions in English. This suggests that prosodically driven phonetic realization is bounded by language-specific constraints on how phonetic features are specified with phonetic content: Shortened VOT in Dutch reflects enhancement of the phonetic feature {−spread glottis}, while lengthened VOT in English reflects enhancement of {+spread glottis}. Prosodic strengthening therefore appears to operate primarily at the phonetic level, such that prosodically driven enhancement of phonological contrast is determined by phonetic implementation of these (language-specific) phonetic features. Second, an accent effect was observed in stressed and unstressed syllables, and was independent of prosodic boundary size. The domain of accentuation in Dutch is thus larger than the foot. Third, within a prosodic category consisting of those utterances with a boundary tone but no pause, tokens with syntactically defined Phonological Phrase boundaries could be differentiated from the other tokens. 
    This syntactic influence on prosodic phrasing implies the existence of an intermediate-level phrase in the prosodic hierarchy of Dutch.
  • Cutler, A., McQueen, J. M., & Norris, D. (2005). The lexical utility of phoneme-category plasticity. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 103-107).
  • Eisner, F., & McQueen, J. M. (2005). The specificity of perceptual learning in speech processing. Perception & Psychophysics, 67(2), 224-238.

    Abstract

    We conducted four experiments to investigate the specificity of perceptual adjustments made to unusual speech sounds. Dutch listeners heard a female talker produce an ambiguous fricative [?] (between [f] and [s]) in [f]- or [s]-biased lexical contexts. Listeners with [f]-biased exposure (e.g., [witlo?]; from witlof, “chicory”; witlos is meaningless) subsequently categorized more sounds on an [εf]–[εs] continuum as [f] than did listeners with [s]-biased exposure. This occurred when the continuum was based on the exposure talker's speech (Experiment 1), and when the same test fricatives appeared after vowels spoken by novel female and male talkers (Experiments 1 and 2). When the continuum was made entirely from a novel talker's speech, there was no exposure effect (Experiment 3) unless fricatives from that talker had been spliced into the exposure talker's speech during exposure (Experiment 4). We conclude that perceptual learning about idiosyncratic speech is applied at a segmental level and is, under these exposure conditions, talker specific.
  • McQueen, J. M. (2005). Speech perception. In K. Lamberts, & R. Goldstone (Eds.), The Handbook of Cognition (pp. 255-275). London: Sage Publications.
  • McQueen, J. M. (2005). Spoken word recognition and production: Regular but not inseparable bedfellows. In A. Cutler (Ed.), Twenty-first century psycholinguistics: Four cornerstones (pp. 229-244). Mahwah, NJ: Erlbaum.
  • McQueen, J. M., & Sereno, J. (2005). Cleaving automatic processes from strategic biases in phonological priming. Memory & Cognition, 33(7), 1185-1209.

    Abstract

    In a phonological priming experiment using spoken Dutch words, Dutch listeners were taught varying expectancies and relatedness relations about the phonological form of target words, given particular primes. They learned to expect that, after a particular prime, if the target was a word, it would be from a specific phonological category. The expectancy either involved phonological overlap (e.g., honk-vonk, “base-spark”; expected related) or did not (e.g., nest-galm, “nest-boom”; expected unrelated, where the learned expectation after hearing nest was a word rhyming in -alm). Targets were occasionally inconsistent with expectations. In these inconsistent expectancy trials, targets were either unrelated (e.g., honk-mest, “base-manure”; unexpected unrelated), where the listener was expecting a related target, or related (e.g., nest-pest, “nest-plague”; unexpected related), where the listener was expecting an unrelated target. Participant expectations and phonological relatedness were thus manipulated factorially for three types of phonological overlap (rhyme, one onset phoneme, and three onset phonemes) at three interstimulus intervals (ISIs; 50, 500, and 2,000 msec). Lexical decisions to targets revealed evidence of expectancy-based strategies for all three types of overlap (e.g., faster responses to expected than to unexpected targets, irrespective of phonological relatedness) and evidence of automatic phonological processes, but only for the rhyme and three-phoneme onset overlap conditions and, most strongly, at the shortest ISI (e.g., faster responses to related than to unrelated targets, irrespective of expectations). Although phonological priming thus has both automatic and strategic components, it is possible to cleave them apart.
  • McQueen, J. M., & Mitterer, H. (2005). Lexically-driven perceptual adjustments of vowel categories. In Proceedings of the ISCA Workshop on Plasticity in Speech Perception (PSP2005) (pp. 233-236).
  • Scharenborg, O., Norris, D., Ten Bosch, L., & McQueen, J. M. (2005). How should a speech recognizer work? Cognitive Science, 29(6), 867-918. doi:10.1207/s15516709cog0000_37.

    Abstract

    Although researchers studying human speech recognition (HSR) and automatic speech recognition (ASR) share a common interest in how information processing systems (human or machine) recognize spoken language, there is little communication between the two disciplines. We suggest that this lack of communication follows largely from the fact that research in these related fields has focused on the mechanics of how speech can be recognized. In Marr's (1982) terms, emphasis has been on the algorithmic and implementational levels rather than on the computational level. In this article, we provide a computational-level analysis of the task of speech recognition, which reveals the close parallels between research concerned with HSR and ASR. We illustrate this relation by presenting a new computational model of human spoken-word recognition, built using techniques from the field of ASR that, in contrast to current existing models of HSR, recognizes words from real speech input.
  • Warner, N., Smits, R., McQueen, J. M., & Cutler, A. (2005). Phonological and statistical effects on timing of speech perception: Insights from a database of Dutch diphone perception. Speech Communication, 46(1), 53-72. doi:10.1016/j.specom.2005.01.003.

    Abstract

    We report detailed analyses of a very large database on timing of speech perception collected by Smits et al. (Smits, R., Warner, N., McQueen, J.M., Cutler, A., 2003. Unfolding of phonetic information over time: A database of Dutch diphone perception. J. Acoust. Soc. Am. 113, 563–574). Eighteen listeners heard all possible diphones of Dutch, gated in portions of varying size and presented without background noise. The present report analyzes listeners’ responses across gates in terms of phonological features (voicing, place, and manner for consonants; height, backness, and length for vowels). The resulting patterns for feature perception differ from patterns reported when speech is presented in noise. The data are also analyzed for effects of stress and of phonological context (neighboring vowel vs. consonant); effects of these factors are observed to be surprisingly limited. Finally, statistical effects, such as overall phoneme frequency and transitional probabilities, along with response biases, are examined; these too exercise only limited effects on response patterns. The results suggest highly accurate speech perception on the basis of acoustic information alone.
  • Baayen, R. H., McQueen, J. M., Dijkstra, T., & Schreuder, R. (2003). Frequency effects in regular inflectional morphology: Revisiting Dutch plurals. In R. H. Baayen, & R. Schreuder (Eds.), Morphological structure in language processing (pp. 355-390). Berlin: Mouton de Gruyter.
  • McQueen, J. M. (2003). The ghost of Christmas future: Didn't Scrooge learn to be good? Commentary on Magnuson, McMurray, Tanenhaus and Aslin (2003). Cognitive Science, 27(5), 795-799. doi:10.1207/s15516709cog2705_6.

    Abstract

    Magnuson, McMurray, Tanenhaus, and Aslin [Cogn. Sci. 27 (2003) 285] suggest that they have evidence of lexical feedback in speech perception, and that this evidence thus challenges the purely feedforward Merge model [Behav. Brain Sci. 23 (2000) 299]. This evidence is open to an alternative explanation, however, one which preserves the assumption in Merge that there is no lexical-prelexical feedback during on-line speech processing. This explanation invokes the distinction between perceptual processing that occurs in the short term, as an utterance is heard, and processing that occurs over the longer term, for perceptual learning.
  • McQueen, J. M., & Cho, T. (2003). The use of domain-initial strengthening in segmentation of continuous English speech. In Proceedings of the 15th International Congress of Phonetic Sciences (ICPhS 2003) (pp. 2993-2996). Adelaide: Causal Productions.
  • McQueen, J. M., Dahan, D., & Cutler, A. (2003). Continuity and gradedness in speech processing. In N. O. Schiller, & A. S. Meyer (Eds.), Phonetics and phonology in language comprehension and production: Differences and similarities (pp. 39-78). Berlin: Mouton de Gruyter.
  • McQueen, J. M., Cutler, A., & Norris, D. (2003). Flow of information in the spoken word recognition system. Speech Communication, 41(1), 257-270. doi:10.1016/S0167-6393(02)00108-5.

    Abstract

    Spoken word recognition consists of two major component processes. First, at the prelexical stage, an abstract description of the utterance is generated from the information in the speech signal. Second, at the lexical stage, this description is used to activate all the words stored in the mental lexicon which match the input. These multiple candidate words then compete with each other. We review evidence which suggests that positive (match) and negative (mismatch) information of both a segmental and a suprasegmental nature is used to constrain this activation and competition process. We then ask whether, in addition to the necessary influence of the prelexical stage on the lexical stage, there is also feedback from the lexicon to the prelexical level. In two phonetic categorization experiments, Dutch listeners were asked to label both syllable-initial and syllable-final ambiguous fricatives (e.g., sounds ranging from [f] to [s]) in the word–nonword series maf–mas, and the nonword–word series jaf–jas. They tended to label the sounds in a lexically consistent manner (i.e., consistent with the word endpoints of the series). These lexical effects became smaller in listeners’ slower responses, even when the listeners were put under pressure to respond as fast as possible. Our results challenge models of spoken word recognition in which feedback modulates the prelexical analysis of the component sounds of a word whenever that word is heard.
  • Norris, D., McQueen, J. M., & Cutler, A. (2003). Perceptual learning in speech. Cognitive Psychology, 47(2), 204-238. doi:10.1016/S0010-0285(03)00006-9.

    Abstract

    This study demonstrates that listeners use lexical knowledge in perceptual learning of speech sounds. Dutch listeners first made lexical decisions on Dutch words and nonwords. The final fricative of 20 critical words had been replaced by an ambiguous sound, between [f] and [s]. One group of listeners heard ambiguous [f]-final words (e.g., [wɪtlo?], from witlof, chicory) and unambiguous [s]-final words (e.g., naaldbos, pine forest). Another group heard the reverse (e.g., ambiguous [na:ldbo?], unambiguous witlof). Listeners who had heard [?] in [f]-final words were subsequently more likely to categorize ambiguous sounds on an [f]–[s] continuum as [f] than those who heard [?] in [s]-final words. Control conditions ruled out alternative explanations based on selective adaptation and contrast. Lexical information can thus be used to train categorization of speech. This use of lexical information differs from the on-line lexical feedback embodied in interactive models of speech perception. In contrast to on-line feedback, lexical feedback for learning is of benefit to spoken word recognition (e.g., in adapting to a newly encountered dialect).
  • Salverda, A. P., Dahan, D., & McQueen, J. M. (2003). The role of prosodic boundaries in the resolution of lexical embedding in speech comprehension. Cognition, 90(1), 51-89. doi:10.1016/S0010-0277(03)00139-2.

    Abstract

    Participants' eye movements were monitored as they heard sentences and saw four pictured objects on a computer screen. Participants were instructed to click on the object mentioned in the sentence. There were more transitory fixations to pictures representing monosyllabic words (e.g. ham) when the first syllable of the target word (e.g. hamster) had been replaced by a recording of the monosyllabic word than when it came from a different recording of the target word. This demonstrates that a phonemically identical sequence can contain cues that modulate its lexical interpretation. This effect was governed by the duration of the sequence, rather than by its origin (i.e. which type of word it came from). The longer the sequence, the more monosyllabic-word interpretations it generated. We argue that cues to lexical-embedding disambiguation, such as segmental lengthening, result from the realization of a prosodic boundary that often but not always follows monosyllabic words, and that lexical candidates whose word boundaries are aligned with prosodic boundaries are favored in the word-recognition process.
  • Scharenborg, O., McQueen, J. M., Ten Bosch, L., & Norris, D. (2003). Modelling human speech recognition using automatic speech recognition paradigms in SpeM. In Proceedings of Eurospeech 2003 (pp. 2097-2100). Adelaide: Causal Productions.

    Abstract

    We have recently developed a new model of human speech recognition, based on automatic speech recognition techniques [1]. The present paper has two goals. First, we show that the new model performs well in the recognition of lexically ambiguous input. These demonstrations suggest that the model is able to operate in the same optimal way as human listeners. Second, we discuss how to relate the behaviour of a recogniser, designed to discover the optimum path through a word lattice, to data from human listening experiments. We argue that this requires a metric that combines both path-based and word-based measures of recognition performance. The combined metric varies continuously as the input speech signal unfolds over time.
  • Smits, R., Warner, N., McQueen, J. M., & Cutler, A. (2003). Unfolding of phonetic information over time: A database of Dutch diphone perception. Journal of the Acoustical Society of America, 113(1), 563-574. doi:10.1121/1.1525287.

    Abstract

    We present the results of a large-scale study on speech perception, assessing the number and type of perceptual hypotheses which listeners entertain about possible phoneme sequences in their language. Dutch listeners were asked to identify gated fragments of all 1179 diphones of Dutch, providing a total of 488 520 phoneme categorizations. The results manifest orderly uptake of acoustic information in the signal. Differences across phonemes in the rate at which fully correct recognition was achieved arose as a result of whether or not potential confusions could occur with other phonemes of the language (long with short vowels, affricates with their initial components, etc.). These data can be used to improve models of how acoustic phonetic information is mapped onto the mental lexicon during speech comprehension.
  • Spinelli, E., McQueen, J. M., & Cutler, A. (2003). Processing resyllabified words in French. Journal of Memory and Language, 48(2), 233-254. doi:10.1016/S0749-596X(02)00513-2.
  • Cutler, A., McQueen, J. M., Norris, D., & Somejuan, A. (2001). The roll of the silly ball. In E. Dupoux (Ed.), Language, brain and cognitive development: Essays in honor of Jacques Mehler (pp. 181-194). Cambridge, MA: MIT Press.
  • McQueen, J. M., Norris, D., & Cutler, A. (2001). Can lexical knowledge modulate prelexical representations over time? In R. Smits, J. Kingston, T. Nearey, & R. Zondervan (Eds.), Proceedings of the workshop on Speech Recognition as Pattern Classification (SPRAAC) (pp. 145-150). Nijmegen: Max Planck Institute for Psycholinguistics.

    Abstract

    The results of a study on perceptual learning are reported. Dutch subjects made lexical decisions on a list of words and nonwords. Embedded in the list were either [f]- or [s]-final words in which the final fricative had been replaced by an ambiguous sound, midway between [f] and [s]. One group of listeners heard ambiguous [f]-final Dutch words like [kara?] (based on karaf, carafe) and unambiguous [s]-final words (e.g., karkas, carcase). A second group heard the reverse (e.g., ambiguous [karka?] and unambiguous karaf). After this training phase, listeners labelled ambiguous fricatives on an [f]–[s] continuum. The subjects who had heard [?] in [f]-final words categorised these fricatives as [f] reliably more often than those who had heard [?] in [s]-final words. These results suggest that speech recognition is dynamic: the system adjusts to the constraints of each particular listening situation. The lexicon can provide this adjustment process with a training signal.
  • McQueen, J. M., & Cutler, A. (Eds.). (2001). Spoken word access processes. Hove, UK: Psychology Press.
  • McQueen, J. M., & Cutler, A. (2001). Spoken word access processes: An introduction. Language and Cognitive Processes, 16, 469-490. doi:10.1080/01690960143000209.

    Abstract

    We introduce the papers in this special issue by summarising the current major issues in spoken word recognition. We argue that a full understanding of the process of lexical access during speech comprehension will depend on resolving several key representational issues: what is the form of the representations used for lexical access; how is phonological information coded in the mental lexicon; and how is the morphological and semantic information about each word stored? We then discuss a number of distinct access processes: competition between lexical hypotheses; the computation of goodness-of-fit between the signal and stored lexical knowledge; segmentation of continuous speech; whether the lexicon influences prelexical processing through feedback; and the relationship of form-based processing to the processes responsible for deriving an interpretation of a complete utterance. We conclude that further progress may well be made by swapping ideas among the different sub-domains of the discipline.
  • McQueen, J. M., Otake, T., & Cutler, A. (2001). Rhythmic cues and possible-word constraints in Japanese speech segmentation. Journal of Memory and Language, 45, 103-132. doi:10.1006/jmla.2000.2763.

    Abstract

    In two word-spotting experiments, Japanese listeners detected Japanese words faster in vowel contexts (e.g., agura, to sit cross-legged, in oagura) than in consonant contexts (e.g., tagura). In the same experiments, however, listeners spotted words in vowel contexts (e.g., saru, monkey, in sarua) no faster than in moraic nasal contexts (e.g., saruN). In a third word-spotting experiment, words like uni, sea urchin, followed contexts consisting of a consonant-consonant-vowel mora (e.g., gya) plus either a moraic nasal (gyaNuni), a vowel (gyaouni) or a consonant (gyabuni). Listeners spotted words as easily in the first as in the second context (where in each case the target words were aligned with mora boundaries), but found it almost impossible to spot words in the third (where there was a single consonant, such as the [b] in gyabuni, between the beginning of the word and the nearest preceding mora boundary). Three control experiments confirmed that these effects reflected the relative ease of segmentation of the words from their contexts. We argue that the listeners showed sensitivity to the viability of sound sequences as possible Japanese words in the way that they parsed the speech into words. Since single consonants are not possible Japanese words, the listeners avoided lexical parses including single consonants and thus had difficulty recognizing words in the consonant contexts. Even though moraic nasals are also impossible words, they were not difficult segmentation contexts because, as with the vowel contexts, the mora boundaries between the contexts and the target words signaled likely word boundaries. Moraic rhythm appears to provide Japanese listeners with important segmentation cues.
  • Norris, D., McQueen, J. M., Cutler, A., Butterfield, S., & Kearns, R. (2001). Language-universal constraints on speech segmentation. Language and Cognitive Processes, 16, 637-660. doi:10.1080/01690960143000119.

    Abstract

    Two word-spotting experiments are reported that examine whether the Possible-Word Constraint (PWC) is a language-specific or language-universal strategy for the segmentation of continuous speech. The PWC disfavours parses which leave an impossible residue between the end of a candidate word and any likely location of a word boundary, as cued in the speech signal. The experiments examined cases where the residue was either a CVC syllable with a schwa, or a CV syllable with a lax vowel. Although neither of these syllable contexts is a possible lexical word in English, word-spotting in both contexts was easier than in a context consisting of a single consonant. Two control lexical-decision experiments showed that the word-spotting results reflected the relative segmentation difficulty of the words in different contexts. The PWC appears to be language-universal rather than language-specific.
  • Van Alphen, P. M., & McQueen, J. M. (2001). The time-limited influence of sentential context on function word identification. Journal of Experimental Psychology: Human Perception and Performance, 27, 1057-1071. doi:10.1037/0096-1523.27.5.1057.

    Abstract

    Sentential context effects on the identification of the Dutch function words te (to) and de (the) were examined. In Experiment 1, listeners labeled words on a [tə]-[də] continuum more often as te when the context was te biased (Ik probeer [?ə] schieten [I try to/the shoot]) than when it was de biased (Ik probeer [?ə] schoenen [I try to/the shoes]). The effect was weaker in slower responses. In Experiment 2, disambiguation began later, in the second word after [?ə]. There was a weak context effect only in the slower responses. In Experiments 3 and 4, disambiguation occurred on the word before [?ə]: There was no context effect when one set of sentences was used, but there was an effect (larger in the faster responses) when more sentences were used. Syntactic processing affects word identification only within a limited time frame. It appears to do so not by influencing lexical access processes through feedback but, instead, by biasing decision making.
