Publications

  • Räsänen, O., Seshadri, S., & Casillas, M. (2018). Comparison of syllabification algorithms and training strategies for robust word count estimation across different languages and recording conditions. In Proceedings of Interspeech 2018 (pp. 1200-1204). doi:10.21437/Interspeech.2018-1047.

    Abstract

    Word count estimation (WCE) from audio recordings has a number of applications, including quantifying the amount of speech that language-learning infants hear in their natural environments, as captured by daylong recordings made with devices worn by infants. To be applicable in a wide range of scenarios and also low-resource domains, WCE tools should be extremely robust against varying signal conditions and require minimal access to labeled training data in the target domain. For this purpose, earlier work has used automatic syllabification of speech, followed by a least-squares-mapping of syllables to word counts. This paper compares a number of previously proposed syllabifiers in the WCE task, including a supervised bi-directional long short-term memory (BLSTM) network that is trained on a language for which high quality syllable annotations are available (a “high resource language”), and reports how the alternative methods compare on different languages and signal conditions. We also explore additive noise and varying-channel data augmentation strategies for BLSTM training, and show how they improve performance in both matching and mismatching languages. Intriguingly, we also find that even though the BLSTM works on languages beyond its training data, the unsupervised algorithms can still outperform it in challenging signal conditions on novel languages.
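    The least-squares mapping step mentioned in this abstract can be illustrated with a minimal sketch (assumptions: a simple affine fit from per-segment syllable counts to word counts on a small labelled adaptation set; function names and numbers are illustrative, not the authors' implementation).

    ```python
    import numpy as np

    def fit_syllable_to_word_mapping(syllable_counts, word_counts):
        """Fit word_count ~ a * syllable_count + b by ordinary least squares."""
        X = np.column_stack([syllable_counts, np.ones(len(syllable_counts))])
        coeffs, *_ = np.linalg.lstsq(X, np.asarray(word_counts, dtype=float), rcond=None)
        return coeffs  # (a, b)

    def estimate_word_counts(syllable_counts, coeffs):
        """Apply the fitted mapping to syllable counts from new recordings."""
        a, b = coeffs
        return a * np.asarray(syllable_counts, dtype=float) + b

    # Hypothetical adaptation data: three segments with known word counts.
    coeffs = fit_syllable_to_word_mapping([120, 80, 200], [95, 60, 160])
    print(estimate_word_counts([150, 40], coeffs))
    ```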
  • Rowland, C. F. (2018). The principles of scientific inquiry. Linguistic Approaches to Bilingualism, 8(6), 770-775. doi:10.1075/lab.18056.row.
  • Von Holzen, K., & Bergmann, C. (2018). A Meta-Analysis of Infants’ Mispronunciation Sensitivity Development. In C. Kalish, M. Rau, J. Zhu, & T. T. Rogers (Eds.), Proceedings of the 40th Annual Conference of the Cognitive Science Society (CogSci 2018) (pp. 1159-1164). Austin, TX: Cognitive Science Society.

    Abstract

    Before infants become mature speakers of their native language, they must acquire a robust word-recognition system which allows them to strike the balance between allowing some variation (mood, voice, accent) and recognizing variability that potentially changes meaning (e.g. cat vs hat). The current meta-analysis quantifies how the latter, termed mispronunciation sensitivity, changes over infants’ first three years, testing competing predictions of mainstream language acquisition theories. Our results show that infants were sensitive to mispronunciations, but accepted them as labels for target objects. Interestingly, and in contrast to predictions of mainstream theories, mispronunciation sensitivity was not modulated by infant age, suggesting that a sufficiently flexible understanding of native language phonology is in place at a young age.
  • Abbot-Smith, K., Chang, F., Rowland, C. F., Ferguson, H., & Pine, J. (2017). Do two and three year old children use an incremental first-NP-as-agent bias to process active transitive and passive sentences?: A permutation analysis. PLoS One, 12(10): e0186129. doi:10.1371/journal.pone.0186129.

    Abstract

    We used eye-tracking to investigate if and when children show an incremental bias to assume that the first noun phrase in a sentence is the agent (first-NP-as-agent bias) while processing the meaning of English active and passive transitive sentences. We also investigated whether children can override this bias to successfully distinguish active from passive sentences, after processing the remainder of the sentence frame. For this second question we used eye-tracking (Study 1) and forced-choice pointing (Study 2). For both studies, we used a paradigm in which participants simultaneously saw two novel actions with reversed agent-patient relations while listening to active and passive sentences. We compared English-speaking 25-month-olds and 41-month-olds in between-subjects sentence structure conditions (Active Transitive Condition vs. Passive Condition). A permutation analysis found that both age groups showed a bias to incrementally map the first noun in a sentence onto an agent role. Regarding the second question, 25-month-olds showed some evidence of distinguishing the two structures in the eye-tracking study. However, the 25-month-olds did not distinguish active from passive sentences in the forced choice pointing task. In contrast, the 41-month-old children did reanalyse their initial first-NP-as-agent bias to the extent that they clearly distinguished between active and passive sentences both in the eye-tracking data and in the pointing task. The results are discussed in relation to the development of syntactic (re)parsing.

    Additional information

    Data available from OSF
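    The permutation analysis referred to in this abstract can be sketched in generic form (assumptions: a simple two-sided label-permutation test on the difference in mean looking proportions; the paper's actual statistic and procedure may differ, and the data below are invented).

    ```python
    import numpy as np

    def permutation_test(group_a, group_b, n_permutations=10000, seed=0):
        """Two-sided permutation test on the difference in group means."""
        rng = np.random.default_rng(seed)
        pooled = np.concatenate([np.asarray(group_a, float), np.asarray(group_b, float)])
        n_a = len(group_a)
        observed = pooled[:n_a].mean() - pooled[n_a:].mean()
        exceed = 0
        for _ in range(n_permutations):
            rng.shuffle(pooled)  # randomly reassign condition labels
            diff = pooled[:n_a].mean() - pooled[n_a:].mean()
            if abs(diff) >= abs(observed):
                exceed += 1
        return observed, exceed / n_permutations

    # Invented looking proportions for two between-subjects conditions.
    obs, p = permutation_test([0.62, 0.58, 0.71, 0.66], [0.49, 0.55, 0.47, 0.52])
    print(obs, p)
    ```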
  • Casillas, M., Bergelson, E., Warlaumont, A. S., Cristia, A., Soderstrom, M., VanDam, M., & Sloetjes, H. (2017). A New Workflow for Semi-automatized Annotations: Tests with Long-Form Naturalistic Recordings of Children's Language Environments. In Proceedings of Interspeech 2017 (pp. 2098-2102). doi:10.21437/Interspeech.2017-1418.

    Abstract

    Interoperable annotation formats are fundamental to the utility, expansion, and sustainability of collective data repositories. In language development research, shared annotation schemes have been critical to facilitating the transition from raw acoustic data to searchable, structured corpora. Current schemes typically require comprehensive and manual annotation of utterance boundaries and orthographic speech content, with an additional, optional range of tags of interest. These schemes have been enormously successful for datasets on the scale of dozens of recording hours but are untenable for long-format recording corpora, which routinely contain hundreds to thousands of audio hours. Long-format corpora would benefit greatly from (semi-)automated analyses, both on the earliest steps of annotation—voice activity detection, utterance segmentation, and speaker diarization—as well as later steps—e.g., classification-based codes such as child-vs-adult-directed speech, and speech recognition to produce phonetic/orthographic representations. We present an annotation workflow specifically designed for long-format corpora which can be tailored by individual researchers and which interfaces with the current dominant scheme for short-format recordings. The workflow allows semi-automated annotation and analyses at higher linguistic levels. We give one example of how the workflow has been successfully implemented in a large cross-database project.
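    The semi-automated first pass described in this abstract can be sketched schematically (assumptions: the stage order follows the abstract, but the stub functions, dummy outputs, and JSON export are placeholders rather than the workflow's actual tools or formats).

    ```python
    import json
    from typing import Dict, List, Tuple

    def detect_voice_activity(audio_path: str) -> List[Tuple[float, float]]:
        # A real pipeline would run a voice activity detector on the audio here.
        return [(0.0, 1.8), (2.5, 4.1)]  # dummy (start, end) times in seconds

    def diarize(regions: List[Tuple[float, float]]) -> List[Dict]:
        # A real pipeline would assign speaker labels automatically.
        return [{"start": s, "end": e, "speaker": "UNK"} for s, e in regions]

    def export_for_manual_annotation(segments: List[Dict], out_path: str) -> None:
        # Write a simple interchange file for later manual correction and coding.
        with open(out_path, "w") as f:
            json.dump(segments, f, indent=2)

    def first_pass(audio_path: str, out_path: str) -> None:
        regions = detect_voice_activity(audio_path)
        segments = diarize(regions)
        export_for_manual_annotation(segments, out_path)

    first_pass("daylong_recording.wav", "segments.json")
    ```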
  • Casillas, M., Amatuni, A., Seidl, A., Soderstrom, M., Warlaumont, A., & Bergelson, E. (2017). What do Babies hear? Analyses of Child- and Adult-Directed Speech. In Proceedings of Interspeech 2017 (pp. 2093-2097). doi:10.21437/Interspeech.2017-1409.

    Abstract

    Child-directed speech is argued to facilitate language development, and is found cross-linguistically and cross-culturally to varying degrees. However, previous research has generally focused on short samples of child-caregiver interaction, often in the lab or with experimenters present. We test the generalizability of this phenomenon with an initial descriptive analysis of the speech heard by young children in a large, unique collection of naturalistic, daylong home recordings. Trained annotators coded automatically-detected adult speech 'utterances' from 61 homes across 4 North American cities, gathered from children (age 2-24 months) wearing audio recorders during a typical day. Coders marked the speaker gender (male/female) and intended addressee (child/adult), yielding 10,886 addressee and gender tags from 2,523 minutes of audio (cf. HB-CHAAC Interspeech ComParE challenge; Schuller et al., in press). Automated speaker-diarization (LENA) incorrectly gender-tagged 30% of male adult utterances, compared to manually-coded consensus. Furthermore, we find effects of SES and gender on child-directed and overall speech, increasing child-directed speech with child age, and interactions of speaker gender, child gender, and child age: female caretakers increased their child-directed speech more with age than male caretakers did, but only for male infants. Implications for language acquisition and existing classification algorithms are discussed.
  • Jones, G., & Rowland, C. F. (2017). Diversity not quantity in caregiver speech: Using computational modeling to isolate the effects of the quantity and the diversity of the input on vocabulary growth. Cognitive Psychology, 98, 1-21. doi:10.1016/j.cogpsych.2017.07.002.

    Abstract

    Children who hear large amounts of diverse speech learn language more quickly than children who do not. However, high correlations between the amount and the diversity of the input in speech samples makes it difficult to isolate the influence of each. We overcame this problem by controlling the input to a computational model so that amount of exposure to linguistic input (quantity) and the quality of that input (lexical diversity) were independently manipulated. Sublexical, lexical, and multi-word knowledge were charted across development (Study 1), showing that while input quantity may be important early in learning, lexical diversity is ultimately more crucial, a prediction confirmed against children’s data (Study 2). The model trained on a lexically diverse input also performed better on nonword repetition and sentence recall tests (Study 3) and was quicker to learn new words over time (Study 4). A language input that is rich in lexical diversity outperforms equivalent richness in quantity for learned sublexical and lexical knowledge, for well-established language tests, and for acquiring words that have never been encountered before.
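    The abstract's central manipulation, varying input quantity and lexical diversity independently, can be illustrated with a rough sketch (assumption: utterances are sampled uniformly from an artificial vocabulary; this is a hypothetical construction for exposition, not the corpus procedure used in the paper).

    ```python
    import random

    def generate_input(n_utterances, vocab_size, utterance_length=4, seed=0):
        """Sample utterances from a vocabulary of a given size."""
        rng = random.Random(seed)
        vocab = [f"word{i}" for i in range(vocab_size)]
        return [[rng.choice(vocab) for _ in range(utterance_length)]
                for _ in range(n_utterances)]

    # Same quantity, different lexical diversity:
    low_diversity = generate_input(n_utterances=1000, vocab_size=50)
    high_diversity = generate_input(n_utterances=1000, vocab_size=500)
    # Same lexical diversity, different quantity:
    small_corpus = generate_input(n_utterances=500, vocab_size=200)
    large_corpus = generate_input(n_utterances=5000, vocab_size=200)
    ```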
  • Monaghan, P., & Rowland, C. F. (2017). Combining language corpora with experimental and computational approaches for language acquisition research. Language Learning, 67(S1), 14-39. doi:10.1111/lang.12221.

    Abstract

    Historically, first language acquisition research was a painstaking process of observation, requiring the laborious hand coding of children's linguistic productions, followed by the generation of abstract theoretical proposals for how the developmental process unfolds. Recently, the ability to collect large-scale corpora of children's language exposure has revolutionized the field. New techniques enable more precise measurements of children's actual language input, and these corpora constrain computational and cognitive theories of language development, which can then generate predictions about learning behavior. We describe several instances where corpus, computational, and experimental work have been productively combined to uncover the first language acquisition process and the richness of multimodal properties of the environment, highlighting how these methods can be extended to address related issues in second language research. Finally, we outline some of the difficulties that can be encountered when applying multimethod approaches and show how these difficulties can be obviated.
  • Rowland, C. F., & Monaghan, P. (2017). Developmental psycholinguistics teaches us that we need multi-method, not single-method, approaches to the study of linguistic representation. Commentary on Branigan and Pickering "An experimental approach to linguistic representation". Behavioral and Brain Sciences, 40: e308. doi:10.1017/S0140525X17000565.

    Abstract

    In developmental psycholinguistics, we have, for many years, been generating and testing theories that propose both descriptions of adult representations and explanations of how those representations develop. We have learnt that restricting ourselves to any one methodology yields only incomplete data about the nature of linguistic representations. We argue that we need a multi-method approach to the study of representation.
