Displaying 1 - 100 of 150
  • Sümer, B., & Özyürek, A. (in press). Action bias in describing object locations by signing children. Sign Language and Linguistics.
  • Ter Bekke, M., Drijvers, L., & Holler, J. (in press). Co-speech hand gestures are used to predict upcoming meaning. Psychological Science.
  • Zora, H., Bowin, H., Heldner, M., Riad, T., & Hagoort, P. (in press). Lexical and information structure functions of prosody and their relevance for spoken communication: Evidence from psychometric and EEG data. Journal of Cognitive Neuroscience.
  • Drijvers, L., Small, S. L., & Skipper, J. I. (2025). Language is widely distributed throughout the brain. Nature Reviews Neuroscience, 26: 189. doi:10.1038/s41583-024-00903-0.
  • Emmendorfer, A. K., & Holler, J. (2025). Facial signals shape predictions about the nature of upcoming conversational responses. Scientific Reports, 15: 1381. doi:10.1038/s41598-025-85192-y.

    Abstract

    Increasing evidence suggests that interlocutors use visual communicative signals to form predictions about unfolding utterances, but there is little data on the predictive potential of facial signals in conversation. In an online experiment with virtual agents, we examine whether facial signals produced by an addressee may allow speakers to anticipate the response to a question before it is given. Participants (n = 80) viewed videos of short conversation fragments between two virtual humans. Each fragment ended with the Questioner asking a question, followed by a pause during which the Responder looked either straight at the Questioner (baseline), or averted their gaze, or accompanied the straight gaze with one of the following facial signals: brow raise, brow frown, nose wrinkle, smile, squint, mouth corner pulled back (dimpler). Participants then indicated on a 6-point scale whether they expected a “yes” or “no” response. Analyses revealed that all signals received different ratings relative to the baseline: brow raises, dimplers, and smiles were associated with more positive responses, gaze aversions, brow frowns, nose wrinkles, and squints with more negative responses. Qur findings show that interlocutors may form strong associations between facial signals and upcoming responses to questions, highlighting their predictive potential in face-to-face conversation.

    Additional information

    supplementary materials
  • Esmer, Ş. C., Turan, E., Karadöller, D. Z., & Göksun, T. (2025). Sources of variation in preschoolers’ relational reasoning: The interaction between language use and working memory. Journal of Experimental Child Psychology, 252: 106149. doi:10.1016/j.jecp.2024.106149.

    Abstract

    Previous research has suggested the importance of relational language and working memory in children’s relational reasoning. The tendency to use language (e.g., using more relational than object-focused language, prioritizing focal objects over background in linguistic descriptions) could reflect children’s biases toward the relational versus object-based solutions in a relational match-to-sample (RMTS) task. In the lack of any apparent object match as a foil option, object-focused children might rely on other cognitive mechanisms (i.e., working memory) to choose a relational match in the RMTS task. The current study examined the interactive roles of language- and working memory-related sources of variation in Turkish-learning preschoolers’ relational reasoning. We collected data from 4- and 5-year-olds (N = 41) via Zoom in the RMTS task, a scene description task, and a backward word span task. Generalized binomial mixed effects models revealed that children who used more relational language and background-focused scene descriptions performed worse in the relational reasoning task. Furthermore, children with less frequent relational language use and focal object descriptions of the scenes benefited more from working memory to succeed in the relational reasoning task. These results suggest additional working memory demands for object-focused children to choose relational matches in the RMTS task, highlighting the importance of examining the interactive effects of different cognitive mechanisms on relational reasoning.

    Additional information

    supplementary material
  • Göksun, T., Aktan-Erciyes, A., Karadöller, D. Z., & Demir-Lira, Ö. E. (2025). Multifaceted nature of early vocabulary development: Connecting child characteristics with parental input types. Child Development Perspectives, 19(1), 30-37. doi:10.1111/cdep.12524.

    Abstract

    Children need to learn the demands of their native language in the early vocabulary development phase. In this dynamic process, parental multimodal input may shape neurodevelopmental trajectories while also being tailored by child-related factors. Moving beyond typically characterized group profiles, in this article, we synthesize growing evidence on the effects of parental multimodal input (amount, quality, or absence), domain-specific input (space and math), and language-specific input (causal verbs and sound symbols) on preterm, full-term, and deaf children's early vocabulary development, focusing primarily on research with children learning Turkish and Turkish Sign Language. We advocate for a theoretical perspective, integrating neonatal characteristics and parental input, and acknowledging the unique constraints of languages.
  • Karadöller, D. Z., Demir-Lira, Ö. E., & Göksun, T. (2025). Full-term children with lower vocabulary scores receive more multimodal math input than preterm children. Journal of Cognition and Development. Advance online publication. doi:10.1080/15248372.2025.2470245.

    Abstract

    One of the earliest sources of mathematical input arises in dyadic parent–child interactions. However, the emphasis has been on parental input only in speech and how input varies across different environmental and child-specific factors remains largely unexplored. Here, we investigated the relationship among parental math input modality and type, children’s gestational status (being preterm vs. full-term born), and vocabulary development. Using book-reading as a medium for parental math input in dyadic interaction, we coded specific math input elicited by Turkish-speaking parents and their 26-month-old children (N = 58, 24 preterms) for speech-only and multimodal (speech and gestures combined) input. Results showed that multimodal math input, as opposed to speech-only math input, was uniquely associated with gestational status, expressive vocabulary, and the interaction between the two. Full-term children with lower expressive vocabulary scores received more multimodal input compared to their preterm peers. However, there was no association between expressive vocabulary and multimodal math input for preterm children. Moreover, cardinality was the most frequent type for both speech-only and multimodal input. These findings suggest that the specific type of multimodal math input can be produced as a function of children’s gestational status and vocabulary development.
  • Özer, D., Özyürek, A., & Göksun, T. (2025). Spatial working memory is critical for gesture processing: Evidence from gestures with varying semantic links to speech. Psychonomic Bulletin & Review. Advance online publication. doi:10.3758/s13423-025-02642-4.

    Abstract

    Gestures express redundant or complementary information to speech they accompany by depicting visual and spatial features of referents. In doing so, they recruit both spatial and verbal cognitive resources that underpin the processing of visual semantic information and its integration with speech. The relation between spatial and verbal skills and gesture comprehension, where gestures may serve different roles in relation to speech is yet to be explored. This study examined the role of spatial and verbal skills in processing gestures that expressed redundant or complementary information to speech during the comprehension of spatial relations between objects. Turkish-speaking adults (N=74) watched videos describing the spatial location of objects that involved perspective-taking (left-right) or not (on-under) with speech and gesture. Gestures either conveyed redundant information to speech (e.g., saying and gesturing “left”) or complemented the accompanying demonstrative in speech (e.g., saying “here,” gesturing “left”). We also measured participants’ spatial (the Corsi block span and the mental rotation tasks) and verbal skills (the digit span task). Our results revealed nuanced interactions between these skills and spatial language comprehension, depending on the modality in which the information was expressed. One insight emerged prominently. Spatial skills, particularly spatial working memory capacity, were related to enhanced comprehension of visual semantic information conveyed through gestures especially when this information was not present in the accompanying speech. This study highlights the critical role of spatial working memory in gesture processing and underscores the importance of examining the interplay among cognitive and contextual factors to understand the complex dynamics of multimodal language.

    Additional information

    supplementary file data via OSF
  • Rubio-Fernandez, P. (2025). First acquiring articles in a second language: A new approach to the study of language and social cognition. Lingua, 313: 103851. doi:10.1016/j.lingua.2024.103851.

    Abstract

    Pragmatic phenomena are characterized by extreme variability, which makes it difficult to draw sound generalizations about the role of social cognition in pragmatic language by and large. I introduce cultural evolutionary pragmatics as a new framework for the study of the interdependence between language and social cognition, and point at the study of common-ground management across languages and ages as a way to test the reliance of pragmatic language on social cognition. I illustrate this new research line with three experiments on article use by second language speakers, whose mother tongue lacks articles. These L2 speakers are known to find article use challenging and it is often argued that their difficulties stem from articles being pragmatically redundant. Contrary to this view, the results of this exploratory study support the view that proficient article use requires automatizing basic socio-cognitive processes, offering a window into the interdependence between language and social cognition.
  • Rubio-Fernandez, P., Berke, M. D., & Jara-Ettinger, J. (2025). Tracking minds in communication. Trends in Cognitive Sciences, 29(3), 269-281. doi:10.1016/j.tics.2024.11.005.

    Abstract

    How might social cognition help us communicate through language? At what levels does this interaction occur? In classical views, social cognition is independent of language, and integrating the two can be slow, effortful, and error-prone. But new research into word level processes reveals that communication
    is brimming with social micro-processes that happen in real time, guiding even the simplest choices like how we use adjectives, articles, and demonstratives. We interpret these findings in the context of advances in theoretical models of social cognition and propose a Communicative Mind-Tracking
    framework, where social micro-processes aren’t a secondary process in how we use language—they are fundamental to how communication works.
  • Soberanes, M., Pérez-Ramírez, C. A., & Assaneo, M. F. (2025). Insights into the effect of general attentional state, coarticulation, and primed speech rate in phoneme production time. Journal of Speech, Language, and Hearing Research. Advance online publication. doi:10.1044/2025_JSLHR-24-00595.

    Abstract

    Purpose:
    This study aimed to identify how a set of predefined factors modulates phoneme articulation time within a speaker.
    Method:
    We used a custom in-lab system that records lip muscle activity through electromyography signals, aligned with the produced speech, to measure phoneme articulation time. Twenty Spanish-speaking participants (12 females) were evaluated while producing sequences of a consonant–vowel syllable, with each sequence consisting of repeated articulations of either /pa/ or /pu/. Before starting the sequences, participants underwent a priming step with either a fast or slow speech rate. Additionally, the general attentional state level was assessed at the beginning, middle, and end of the protocol. To analyze the variability in the duration of /p/ and vowel articulation, we fitted individual linear mixed-models considering three factors: general attentional state level, priming rate, and coarticulation effects (for /p/, i.e., followed by /a/ or /u/) or phoneme identity (for vowels, i.e., being /a/ or /u/).
    Results:
    We found that the level of general attentional state positively correlated with production time for both the consonant /p/ and the vowels. Additionally, /p/ production was influenced by the nature of the following vowel (i.e., coarticulation effects), while vowel production time was affected by the primed speech rate.
    Conclusions:
    Phoneme duration appears to be influenced by both stable, speaker-specific characteristics (idiosyncratic traits) and internal, state-dependent factors related to the speaker's condition at the time of speech production. While some factors affect both consonants and vowels, others specifically modify only one of these types.

    Additional information

    supplemental material
  • Tilston, O., Holler, J., & Bangerter, A. (2025). Opening social interactions: The coordination of approach, gaze, speech and handshakes during greetings. Cognitive Science, 49(2): e70049. doi:10.1111/cogs.70049.

    Abstract

    Despite the importance of greetings for opening social interactions, their multimodal coordination processes remain poorly understood. We used a naturalistic, lab-based setup where pairs of unacquainted participants approached and greeted each other while unaware their greeting behavior was studied. We measured the prevalence and time course of multimodal behaviors potentially culminating in a handshake, including motor behaviors (e.g., walking, standing up, hand movements like raise, grasp, and retraction), gaze patterns (using eye tracking glasses), and speech (close and distant verbal salutations). We further manipulated the visibility of partners’ eyes to test its effect on gaze. Our findings reveal that gaze to a partner's face increases over the course of a greeting, but is partly averted during approach and is influenced by the visibility of partners’ eyes. Gaze helps coordinate handshakes, by signaling intent and guiding the grasp. The timing of adjacency pairs in verbal salutations is comparable to the precision of floor transitions in the main body of conversations, and varies according to greeting phase, with distant salutation pair parts featuring more gaps and close salutation pair parts featuring more overlap. Gender composition and a range of multimodal behaviors affect whether pairs chose to shake hands or not. These findings fill several gaps in our understanding of greetings and provide avenues for future research, including advancements in social robotics and human−robot interaction.
  • Trujillo, J. P., & Holler, J. (2025). Multimodal information density is highest in question beginnings, and early entropy is associated with fewer but longer visual signals. Discourse Processes. Advance online publication. doi:10.1080/0163853X.2024.2413314.

    Abstract

    When engaged in spoken conversation, speakers convey meaning using both speech and visual signals, such as facial expressions and manual gestures. An important question is how information is distributed in utterances during face-to-face interaction when information from visual signals is also present. In a corpus of casual Dutch face-to-face conversations, we focus on spoken questions in particular because they occur frequently, thus constituting core building blocks of conversation. We quantified information density (i.e. lexical entropy and surprisal) and the number and relative duration of facial and manual signals. We tested whether lexical information density or the number of visual signals differed between the first and last halves of questions, as well as whether the number of visual signals occurring in the less-predictable portion of a question was associated with the lexical information density of the same portion of the question in a systematic manner. We found that information density, as well as number of visual signals, were higher in the first half of questions, and specifically lexical entropy was associated with fewer, but longer visual signals. The multimodal front-loading of questions and the complementary distribution of visual signals and high entropy words in Dutch casual face-to-face conversations may have implications for the parallel processes of utterance comprehension and response planning during turn-taking.

    Additional information

    supplemental material
  • Trujillo, J. P., Dyer, R. M. K., & Holler, J. (2025). Dyadic differences in empathy scores are associated with kinematic similarity during conversational question-answer pairs. Discourse Processes. Advance online publication. doi:10.1080/0163853X.2025.2467605.

    Abstract

    During conversation, speakers coordinate and synergize their behaviors at multiple levels, and in different ways. The extent to which individuals converge or diverge in their behaviors during interaction may relate to interpersonal differences relevant to social interaction, such as empathy as measured by the empathy quotient (EQ). An association between interpersonal difference in empathy and interpersonal entrainment could help to throw light on how interlocutor characteristics influence interpersonal entrainment. We investigated this possibility in a corpus of unconstrained conversation between dyads. We used dynamic time warping to quantify entrainment between interlocutors of head motion, hand motion, and maximum speech f0 during question–response sequences. We additionally calculated interlocutor differences in EQ scores. We found that, for both head and hand motion, greater difference in EQ was associated with higher entrainment. Thus, we consider that people who are dissimilar in EQ may need to “ground” their interaction with low-level movement entrainment. There was no significant relationship between f0 entrainment and EQ score differences.
  • Ünal, E., Kırbaşoğlu, K., Karadöller, D. Z., Sumer, B., & Özyürek, A. (2025). Gesture reduces mapping difficulties in the development of spatial language depending on the complexity of spatial relations. Cognitive Science, 49(2): e70046. doi:10.1111/cogs.70046.

    Abstract

    In spoken languages, children acquire locative terms in a cross-linguistically stable order. Terms similar in meaning to in and on emerge earlier than those similar to front and behind, followed by left and right. This order has been attributed to the complexity of the relations expressed by different locative terms. An additional possibility is that children may be delayed in expressing certain spatial meanings partly due to difficulties in discovering the mappings between locative terms in speech and spatial relation they express. We investigate cognitive and mapping difficulties in the domain of spatial language by comparing how children map spatial meanings onto speech versus visually motivated forms in co-speech gesture across different spatial relations. Twenty-four 8-year-old and 23 adult native Turkish-speakers described four-picture displays where the target picture depicted in-on, front-behind, or left-right relations between objects. As the complexity of spatial relations increased, children were more likely to rely on gestures as opposed to speech to informatively express the spatial relation. Adults overwhelmingly relied on speech to informatively express the spatial relation, and this did not change across the complexity of spatial relations. Nevertheless, even when spatial expressions in both speech and co-speech gesture were considered, children lagged behind adults when expressing the most complex left-right relations. These findings suggest that cognitive development and mapping difficulties introduced by the modality of expressions interact in shaping the development of spatial language.

    Additional information

    list of stimuli and descriptions
  • Yılmaz, B., Doğan, I., Karadöller, D. Z., Demir-Lira, Ö. E., & Göksun, T. (2025). Parental attitudes and beliefs about mathematics and the use of gestures in children’s math development. Cognitive Development, 73: 101531. doi:10.1016/j.cogdev.2024.101531.

    Abstract

    Children vary in mathematical skills even before formal schooling. The current study investigated how parental math beliefs, parents’ math anxiety, and children's spontaneous gestures contribute to preschool-aged children’s math performance. Sixty-three Turkish-reared children (33 girls, Mage = 49.9 months, SD = 3.68) were assessed on verbal counting, cardinality, and arithmetic tasks (nonverbal and verbal). Results showed that parental math beliefs were related to children’s verbal counting, cardinality and arithmetic scores. Children whose parents have higher math beliefs along with low math anxiety scored highest in the cardinality task. Children’s gesture use was also related to lower cardinality performance and the relation between parental math beliefs and children’s performance became stronger when child gestures were absent. These findings highlight the importance of parent and child-related contributors in explaining the variability in preschool-aged children’s math skills.
  • Yılmaz, B., Doğan, I., Karadöller, D. Z., Demir-Lira, Ö. E., & Göksun, T. (2025). Parental attitudes and beliefs about mathematics and the use of gestures in children’s math development. Cognitive Development, 73: 101531. doi:10.1016/j.cogdev.2024.101531.

    Abstract

    Children vary in mathematical skills even before formal schooling. The current study investigated how parental math beliefs, parents’ math anxiety, and children's spontaneous gestures contribute to preschool-aged children’s math performance. Sixty-three Turkish-reared children (33 girls, Mage = 49.9 months, SD = 3.68) were assessed on verbal counting, cardinality, and arithmetic tasks (nonverbal and verbal). Results showed that parental math beliefs were related to children’s verbal counting, cardinality and arithmetic scores. Children whose parents have higher math beliefs along with low math anxiety scored highest in the cardinality task. Children’s gesture use was also related to lower cardinality performance and the relation between parental math beliefs and children’s performance became stronger when child gestures were absent. These findings highlight the importance of parent and child-related contributors in explaining the variability in preschool-aged children’s math skills.

    Additional information

    supplementary material
  • Zora, H., Kabak, B., & Hagoort, P. (2025). Relevance of prosodic focus and lexical stress for discourse comprehension in Turkish: Evidence from psychometric and electrophysiological data. Journal of Cognitive Neuroscience, 37(3), 693-736. doi:10.1162/jocn_a_02262.

    Abstract

    Prosody underpins various linguistic domains ranging from semantics and syntax to discourse. For instance, prosodic information in the form of lexical stress modifies meanings and, as such, syntactic contexts of words as in Turkish kaz-má "pickaxe" (noun) versus káz-ma "do not dig" (imperative). Likewise, prosody indicates the focused constituent of an utterance as the noun phrase filling the wh-spot in a dialogue like What did you eat? I ate----. In the present study, we investigated the relevance of such prosodic variations for discourse comprehension in Turkish. We aimed at answering how lexical stress and prosodic focus mismatches on critical noun phrases-resulting in grammatical anomalies involving both semantics and syntax and discourse-level anomalies, respectively-affect the perceived correctness of an answer to a question in a given context. To that end, 80 native speakers of Turkish, 40 participating in a psychometric experiment and 40 participating in an EEG experiment, were asked to judge the acceptability of prosodic mismatches that occur either separately or concurrently. Psychometric results indicated that lexical stress mismatch led to a lower correctness score than prosodic focus mismatch, and combined mismatch received the lowest score. Consistent with the psychometric data, EEG results revealed an N400 effect to combined mismatch, and this effect was followed by a P600 response to lexical stress mismatch. Conjointly, these results suggest that every source of prosodic information is immediately available and codetermines the interpretation of an utterance; however, semantically and syntactically relevant lexical stress information is assigned more significance by the language comprehension system compared with prosodic focus information.
  • Cho, S.-J., Brown-Schmidt, S., Clough, S., & Duff, M. C. (2024). Comparing Functional Trend and Learning among Groups in Intensive Binary Longitudinal Eye-Tracking Data using By-Variable Smooth Functions of GAMM. Psychometrika. Advance online publication. doi:10.1007/s11336-024-09986-1.

    Abstract

    This paper presents a model specification for group comparisons regarding a functional trend over time within a trial and learning across a series of trials in intensive binary longitudinal eye-tracking data. The functional trend and learning effects are modeled using by-variable smooth functions. This model specification is formulated as a generalized additive mixed model, which allowed for the use of the freely available mgcv package (Wood in Package ‘mgcv.’ https://cran.r-project.org/web/packages/mgcv/mgcv.pdf, 2023) in R. The model specification was applied to intensive binary longitudinal eye-tracking data, where the questions of interest concern differences between individuals with and without brain injury in their real-time language comprehension and how this affects their learning over time. The results of the simulation study show that the model parameters are recovered well and the by-variable smooth functions are adequately predicted in the same condition as those found in the application.
  • Clough, S., Brown-Schmidt, S., Cho, S.-J., & Duff, M. C. (2024). Reduced on-line speech gesture integration during multimodal language processing in adults with moderate-severe traumatic brain injury: Evidence from eye-tracking. Cortex, 181, 26-46. doi:10.1016/j.cortex.2024.08.008.

    Abstract

    Background
    Language is multimodal and situated in rich visual contexts. Language is also incremental, unfolding moment-to-moment in real time, yet few studies have examined how spoken language interacts with gesture and visual context during multimodal language processing. Gesture is a rich communication cue that is integrally related to speech and often depicts concrete referents from the visual world. Using eye-tracking in an adapted visual world paradigm, we examined how participants with and without moderate-severe traumatic brain injury (TBI) use gesture to resolve temporary referential ambiguity.

    Methods
    Participants viewed a screen with four objects and one video. The speaker in the video produced sentences (e.g., “The girl will eat the very good sandwich”), paired with either a meaningful gesture (e.g., sandwich-holding gesture) or a meaningless grooming movement (e.g., arm scratch) at the verb “will eat.” We measured participants’ gaze to the target object (e.g., sandwich), a semantic competitor (e.g., apple), and two unrelated distractors (e.g., piano, guitar) during the critical window between movement onset in the gesture modality and onset of the spoken referent in speech.

    Results
    Both participants with and without TBI were more likely to fixate the target when the speaker produced a gesture compared to a grooming movement; however, relative to non-injured participants, the effect was significantly attenuated in the TBI group.

    Discussion
    We demonstrated evidence of reduced speech-gesture integration in participants with TBI relative to non-injured peers. This study advances our understanding of the communicative abilities of adults with TBI and could lead to a more mechanistic account of the communication difficulties adults with TBI experience in rich communication contexts that require the processing and integration of multiple co-occurring cues. This work has the potential to increase the ecological validity of language assessment and provide insights into the cognitive and neural mechanisms that support multimodal language processing.

    Additional information

    supplementary data
  • Dikshit, A. P., Das, D., Samal, R. R., Parashar, K., Mishra, C., & Parashar, S. (2024). Optimization of (Ba1-xCax)(Ti0.9Sn0.1)O3 ceramics in X-band using Machine Learning. Journal of Alloys and Compounds, 982: 173797. doi:10.1016/j.jallcom.2024.173797.

    Abstract

    Developing efficient electromagnetic interference shielding materials has become significantly important in present times. This paper reports a series of (Ba1-xCax)(Ti0.9Sn0.1)O3 (BCTS) ((x =0, 0.01, 0.05, & 0.1)ceramics synthesized by conventional method which were studied for electromagnetic interference shielding (EMI) applications in X-band (8-12.4 GHz). EMI shielding properties and all S parameters (S11 & S12) of BCTS ceramic pellets were measured in the frequency range (8-12.4 GHz) using a Vector Network Analyser (VNA). The BCTS ceramic pellets for x = 0.05 showed maximum total effective shielding of 46 dB indicating good shielding behaviour for high-frequency applications. However, the development of lead-free ceramics with different concentrations usually requires iterative experiments resulting in, longer development cycles and higher costs. To address this, we used a machine learning (ML) strategy to predict the EMI shielding for different concentrations and experimentally verify the concentration predicted to give the best EMI shielding. The ML model predicted BCTS ceramics with concentration (x = 0.06, 0.07, 0.08, and 0.09) to have higher shielding values. On experimental verification, a shielding value of 58 dB was obtained for x = 0.08, which was significantly higher than what was obtained experimentally before applying the ML approach. Our results show the potential of using ML in accelerating the process of optimal material development, reducing the need for repeated experimental measures significantly.
  • Evans, M. J., Clough, S., Duff, M. C., & Brown‐Schmidt, S. (2024). Temporal organization of narrative recall is present but attenuated in adults with hippocampal amnesia. Hippocampus, 34(8), 438-451. doi:10.1002/hipo.23620.

    Abstract

    Studies of the impact of brain injury on memory processes often focus on the quantity and episodic richness of those recollections. Here, we argue that the organization of one's recollections offers critical insights into the impact of brain injury on functional memory. It is well-established in studies of word list memory that free recall of unrelated words exhibits a clear temporal organization. This temporal contiguity effect refers to the fact that the order in which word lists are recalled reflects the original presentation order. Little is known, however, about the organization of recall for semantically rich materials, nor how recall organization is impacted by hippocampal damage and memory impairment. The present research is the first study, to our knowledge, of temporal organization in semantically rich narratives in three groups: (1) Adults with bilateral hippocampal damage and severe declarative memory impairment, (2) adults with bilateral ventromedial prefrontal cortex (vmPFC) damage and no memory impairment, and (3) demographically matched non-brain-injured comparison participants. We find that although the narrative recall of adults with bilateral hippocampal damage reflected the temporal order in which those narratives were experienced above chance levels, their temporal contiguity effect was significantly attenuated relative to comparison groups. In contrast, individuals with vmPFC damage did not differ from non-brain-injured comparison participants in temporal contiguity. This pattern of group differences yields insights into the cognitive and neural systems that support the use of temporal organization in recall. These data provide evidence that the retrieval of temporal context in narrative recall is hippocampal-dependent, whereas damage to the vmPFC does not impair the temporal organization of narrative recall. This evidence of limited but demonstrable organization of memory in participants with hippocampal damage and amnesia speaks to the power of narrative structures in supporting meaningfully organized recall despite memory impairment.

    Additional information

    supporting information
  • Feller, J. J., Duff, M. C., Clough, S., Jacobson, G. P., Roberts, R. A., & Romero, D. J. (2024). Evidence of peripheral vestibular impairment among adults with chronic moderate–severe traumatic brain injury. American Journal of Audiology, 33, 1118-1134. doi:10.1044/2024_AJA-24-00058.

    Abstract

    Purpose:
    Traumatic brain injury (TBI) is a leading cause of death and disability among adults in the United States. There is evidence to suggest the peripheral vestibular system is vulnerable to damage in individuals with TBI. However, there are limited prospective studies that describe the type and frequency of vestibular impairment in individuals with chronic moderate–severe TBI (> 6 months postinjury).

    Method:
    Cervical and ocular vestibular evoked myogenic potentials (VEMPs) and video head impulse test (vHIT) were used to assess the function of otolith organ and horizontal semicircular canal (hSCC) pathways in adults with chronic moderate–severe TBI and in noninjured comparison (NC) participants. Self-report questionnaires were administered to participants with TBI to determine prevalence of vestibular symptoms and quality of life associated with those symptoms.

    Results:
    Chronic moderate–severe TBI was associated with a greater degree of impairment in otolith organ, rather than hSCC, pathways. About 63% of participants with TBI had abnormal VEMP responses, compared to only ~10% with abnormal vHIT responses. The NC group had significantly less abnormal VEMP responses (~7%), while none of the NC participants had abnormal vHIT responses. As many as 80% of participants with TBI reported vestibular symptoms, and up to 36% reported that these symptoms negatively affected their quality of life.

    Conclusions:
    Adults with TBI reported vestibular symptoms and decreased quality of life related to those symptoms and had objective evidence of peripheral vestibular impairment. Vestibular testing for adults with chronic TBI who report persistent dizziness and imbalance may serve as a guide for treatment and rehabilitation in these individuals.
  • Gordon, J. K., & Clough, S. (2024). The Flu-ID: A new evidence-based method of assessing fluency in aphasia. American Journal of Speech-Language Pathology, 33, 2972-2990. doi:10.1044/2024_AJSLP-23-00424.

    Abstract

    Purpose:
    Assessing fluency in aphasia is diagnostically important for determining aphasia type and severity and therapeutically important for determining appropriate treatment targets. However, wide variability in the measures and criteria used to assess fluency, as revealed by a recent survey of clinicians (Gordon & Clough, 2022), results in poor reliability. Furthermore, poor specificity in many fluency measures makes it difficult to identify the underlying impairments. Here, we introduce the Flu-ID Aphasia, an evidence-based tool that provides a more informative method of assessing fluency by capturing the range of behaviors that can affect the flow of speech in aphasia.

    Method:
    The development of the Flu-ID was based on prior evidence about factors underlying fluency (Clough & Gordon, 2020; Gordon & Clough, 2020) and clinical perceptions about the measurement of fluency (Gordon & Clough, 2022). Clinical utility is maximized by automated counting of fluency behaviors in an Excel template. Reliability is maximized by outlining thorough guidelines for transcription and coding. Eighteen narrative samples representing a range of fluency were coded independently by the authors to examine the Flu-ID's utility, reliability, and validity.

    Results:
    Overall reliability was very good, with point-to-point agreement of 86% between coders. Ten of the 12 dimensions showed good to excellent reliability. Validity analyses indicated that Flu-ID scores were similar to clinician ratings on some dimensions, but differed on others. Possible reasons and implications of the discrepancies are discussed, along with opportunities for improvement.

    Conclusions:
    The Flu-ID assesses fluency in aphasia using a consistent and comprehensive set of measures and semi-automated procedures to generate individual fluency profiles. The profiles generated in the current study illustrate how similar ratings of fluency can arise from different underlying impairments. Supplemental materials include an analysis template, extensive guidelines for transcription and coding, a completed sample, and a quick reference guide.

    Additional information

    supplemental material
  • Hagoort, P., & Özyürek, A. (2024). Extending the architecture of language from a multimodal perspective. Topics in Cognitive Science. Advance online publication. doi:10.1111/tops.12728.

    Abstract

    Language is inherently multimodal. In spoken languages, combined spoken and visual signals (e.g., co-speech gestures) are an integral part of linguistic structure and language representation. This requires an extension of the parallel architecture, which needs to include the visual signals concomitant to speech. We present the evidence for the multimodality of language. In addition, we propose that distributional semantics might provide a format for integrating speech and co-speech gestures in a common semantic representation.
  • Jara-Ettinger, J., & Rubio-Fernandez, P. (2024). Demonstratives as attention tools: Evidence of mentalistic representations in language. Proceedings of the National Academy of Sciences of the United States of America, 121(32): e2402068121. doi:10.1073/pnas.2402068121.

    Abstract

    Linguistic communication is an intrinsically social activity that enables us to share thoughts across minds. Many complex social uses of language can be captured by domain-general representations of other minds (i.e., mentalistic representations) that externally modulate linguistic meaning through Gricean reasoning. However, here we show that representations of others’ attention are embedded within language itself. Across ten languages, we show that demonstratives—basic grammatical words (e.g.,“this”/“that”) which are evolutionarily ancient, learned early in life, and documented in all known languages—are intrinsic attention tools. Beyond their spatial meanings, demonstratives encode both joint attention and the direction in which the listenermmust turn to establish it. Crucially, the frequency of the spatial and attentional uses of demonstratives varies across languages, suggesting that both spatial and mentalistic representations are part of their conventional meaning. Using computational modeling, we show that mentalistic representations of others’ attention are internally encoded in demonstratives, with their effect further boosted by Gricean reasoning. Yet, speakers are largely unaware of this, incorrectly reporting that they primarily capture spatial representations. Our findings show that representations of other people’s cognitive states (namely, their attention) are embedded in language and suggest that the most basic building blocks of the linguistic system crucially rely on social cognition.

    Additional information

    pnas.2402068121.sapp.pdf
  • Karadöller, D. Z., Sümer, B., Ünal, E., & Özyürek, A. (2024). Sign advantage: Both children and adults’ spatial expressions in sign are more informative than those in speech and gestures combined. Journal of Child Language, 51(4), 876-902. doi:10.1017/S0305000922000642.

    Abstract

    Expressing Left-Right relations is challenging for speaking-children. Yet, this challenge was absent for signing-children, possibly due to iconicity in the visual-spatial modality of expression. We investigate whether there is also a modality advantage when speaking-children’s co-speech gestures are considered. Eight-year-old child and adult hearing monolingual Turkish speakers and deaf signers of Turkish-Sign-Language described pictures of objects in various spatial relations. Descriptions were coded for informativeness in speech, sign, and speech-gesture combinations for encoding Left-Right relations. The use of co-speech gestures increased the informativeness of speakers’ spatial expressions compared to speech-only. This pattern was more prominent for children than adults. However, signing-adults and children were more informative than child and adult speakers even when co-speech gestures were considered. Thus, both speaking- and signing-children benefit from iconic expressions in visual modality. Finally, in each modality, children were less informative than adults, pointing to the challenge of this spatial domain in development.
  • Karadöller, D. Z., Peeters, D., Manhardt, F., Özyürek, A., & Ortega, G. (2024). Iconicity and gesture jointly facilitate learning of second language signs at first exposure in hearing non-signers. Language Learning, 74(4), 781-813. doi:10.1111/lang.12636.

    Abstract

    When learning a spoken second language (L2), words overlapping in form and meaning with one’s native language (L1) help break into the new language. When non-signing speakers learn a sign language as L2, such forms are absent because of the modality differences (L1:speech, L2:sign). In such cases, non-signing speakers might use iconic form-meaning mappings in signs or their own gestural experience as gateways into the to-be-acquired sign language. Here, we investigated how both these factors may contribute jointly to the acquisition of sign language vocabulary by hearing non-signers. Participants were presented with three types of sign in NGT (Sign Language of the Netherlands): arbitrary signs, iconic signs with high or low gesture overlap. Signs that were both iconic and highly overlapping with gestures boosted learning most at first exposure, and this effect remained the day after. Findings highlight the influence of modality-specific factors supporting the acquisition of a signed lexicon.
  • Karadöller*, D. Z., Sümer*, B., & Özyürek, A. (2024). First-language acquisition in a multimodal language framework: Insights from speech, gesture, and sign. First Language. Advance online publication. doi:10.1177/01427237241290678.

    Abstract

    *=shared first authorship
    Children across the world acquire their first language(s) naturally, regardless of typology or modality (e.g. sign or spoken). Various attempts have been made to explain the puzzle of language acquisition using several approaches, trying to understand to what extent it can be explained by what children bring to language-learning situations as well as what they learn from the input and the interactive context. However, most of these approaches consider only speech development, thus ignoring the inherently multimodal nature of human language. As a multimodal view of language is becoming more widely adopted for the study of adult language, a multimodal approach to language acquisition is inevitable. Not only do children have the capacity to learn spoken and sign language equally easily, but spoken language acquisition consists of learning to coordinate linguistic expressions in both modalities, that is, in both speech and gesture. To provide a step forward in this direction, this article aims to synthesize findings from research studies that take a multimodal perspective on language acquisition in different sign and spoken languages, including the development of speech and accompanying gestures. Our review shows that while some aspects of language acquisition seem to be modality-independent, others might differ according to the affordances of each modality when used separately as well as together (either in sign, speech, and/or gesture). We argue that these findings need to be integrated into our understanding of language acquisition. We also identify which areas need future research for both spoken and sign language acquisition, taking into account not only multimodal but also cross-linguistic variation.
  • Kejriwal, J., Mishra, C., Skantze, G., Offrede, T., & Beňuš, Š. (2024). Does a robot’s gaze behavior affect entrainment in HRI? Computing and Informatics, 43(5), 1256-1284. doi:10.31577/cai_2024_5_1256.

    Abstract

    Speakers tend to engage in adaptive behavior, known as entrainment, when they reuse their partner's linguistic representations, including lexical, acoustic prosodic, semantic, or syntactic structures during a conversation. Studies have explored the relationship between entrainment and social factors such as likeability, task success, and rapport. Still, limited research has investigated the relationship between entrainment and gaze. To address this gap, we conducted a within-subjects user study (N = 33) to test if gaze behavior of a robotic head affects entrainment of subjects toward the robot on four linguistic dimensions: lexical, syntactic, semantic, and acoustic-prosodic. Our results show that participants entrain more on lexical and acoustic-prosodic features when the robot exhibits well-timed gaze aversions similar to the ones observed in human gaze behavior, as compared to when the robot keeps staring at participants constantly. Our results support the predictions of the computers as social actors (CASA) model and suggest that implementing well-timed gaze aversion behavior in a robot can lead to speech entrainment in human-robot interactions.
  • Kimmel, M., Schneider, S. M., & Fisher, V. J. (2024). "Introjecting" imagery: A process model of how minds and bodies are co-enacted. Language Sciences, 102: 101602. doi:10.1016/j.langsci.2023.101602.

    Abstract

    Somatic practices frequently use imagery, typically via verbal instructions, to scaffold sensorimotor organization and experience, a phenomenon we term “introjection”. We argue that introjection is an imagery practice in which sensorimotor and conceptual aspects are co-orchestrated, suggesting the necessity of crosstalk between somatics, phenomenology, psychology, embodied-enactive cognition, and linguistic research on embodied simulation. We presently focus on the scarcely addressed details of the process necessary to enact instructions of a literal or metaphoric nature through the body. Based on vignettes from dance, Feldenkrais, and Taichi practice, we describe introjection as a complex form of processual sense-making, in which context-interpretive, mental, attentional and physical sub-processes recursively braid. Our analysis focuses on how mental and body-related processes progressively align, inform and augment each other. This dialectic requires emphasis on the active body, which implies that uni-directional models (concept ⇒ body) are inadequate and should be replaced by interactionist alternatives (concept ⇔ body). Furthermore, we emphasize that both the source image itself and the body are specifically conceptualized for the context through constructive operations, and both evolve through their interplay. At this level introjection employs representational operations that are embedded in enactive dynamics of a fully situated person.
  • Long, M., Rohde, H., Oraa Ali, M., & Rubio-Fernandez, P. (2024). The role of cognitive control and referential complexity on adults’ choice of referring expressions: Testing and expanding the referential complexity scale. Journal of Experimental Psychology: Learning, Memory, and Cognition, 50(1), 109-136. doi:10.1037/xlm0001273.

    Abstract

    This study aims to advance our understanding of the nature and source(s) of individual differences in pragmatic language behavior over the adult lifespan. Across four story continuation experiments, we probed adults’ (N = 496 participants, ages 18–82) choice of referential forms (i.e., names vs. pronouns to refer to the main character). Our manipulations were based on Fossard et al.’s (2018) scale of referential complexity which varies according to the visual properties of the scene: low complexity (one character), intermediate complexity (two characters of different genders), and high complexity (two characters of the same gender). Since pronouns signal topic continuity (i.e., that the discourse will continue to be about the same referent), the use of pronouns is expected to decrease as referential complexity increases. The choice of names versus pronouns, therefore, provides insight into participants’ perception of the topicality of a referent, and whether that varies by age and cognitive capacity. In Experiment 1, we used the scale to test the association between referential choice, aging, and cognition, identifying a link between older adults’ switching skills and optimal referential choice. In Experiments 2–4, we tested novel manipulations that could impact the scale and found both the timing of a competitor referent’s presence and emphasis placed on competitors modulated referential choice, leading us to refine the scale for future use. Collectively, Experiments 1–4 highlight what type of contextual information is prioritized at different ages, revealing older adults’ preserved sensitivity to (visual) scene complexity but reduced sensitivity to linguistic prominence cues, compared to younger adults.
  • Long, M., MacPherson, S. E., & Rubio-Fernandez, P. (2024). Prosocial speech acts: Links to pragmatics and aging. Developmental Psychology, 60(3), 491-504. doi:10.1037/dev0001725.

    Abstract

    This study investigated how adults over the lifespan flexibly adapt their use of prosocial speech acts when conveying bad news to communicative partners. Experiment 1a (N = 100 Scottish adults aged 18–72 years) assessed whether participants’ use of prosocial speech acts varied according to audience design considerations (i.e., whether or not the recipient of the news was directly affected). Experiment 1b (N = 100 Scottish adults aged 19–70 years) assessed whether participants adjusted for whether the bad news was more or less severe (an index of general knowledge). Younger adults displayed more flexible adaptation to the recipient manipulation, while no age differences were found for severity. These findings are consistent with prior work showing age-related decline in audience design but not in the use of general knowledge during language production. Experiment 2 further probed younger adults (N = 40, Scottish, aged 18–37 years) and older adults’ (N = 40, Scottish, aged 70–89 years) prosocial linguistic behavior by investigating whether health (vs. nonhealth-related) matters would affect responses. While older adults used prosocial speech acts to a greater extent than younger adults, they did not distinguish between conditions. Our results suggest that prosocial linguistic behavior is likely influenced by a combination of differences in audience design and communicative styles at different ages. Collectively, these findings highlight the importance of situating prosocial speech acts within the pragmatics and aging literature, allowing us to uncover the factors modulating prosocial linguistic behavior at different developmental stages.

    Additional information

    figures
  • Ronderos, C. R., Aparicio, H., Long, M., Shukla, V., Jara-Ettinger, J., & Rubio-Fernandez, P. (2024). Perceptual, semantic, and pragmatic factors affect the derivation of contrastive inferences. Open mind: discoveries in cognitive science, 8, 1213-1227. doi:10.1162/opmi_a_00165.

    Abstract

    People derive contrastive inferences when interpreting adjectives (e.g., inferring that ‘the short pencil’ is being contrasted with a longer one). However, classic eye-tracking studies revealed contrastive inferences with scalar and material adjectives, but not with color adjectives. This was explained as a difference in listeners’ informativity expectations, since color adjectives are often used descriptively (hence not warranting a contrastive interpretation). Here we hypothesized that, beyond these pragmatic factors, perceptual factors (i.e., the relative perceptibility of color, material and scalar contrast) and semantic factors (i.e., the difference between gradable and non-gradable properties) also affect the real-time derivation of contrastive inferences. We tested these predictions in three languages with prenominal modification (English, Hindi, and Hungarian) and found that people derive contrastive inferences for color and scalar adjectives, but not for material adjectives. In addition, the processing of scalar adjectives was more context dependent than that of color and material adjectives, confirming that pragmatic, perceptual and semantic factors affect the derivation of contrastive inferences.
  • Rubianes, M., Drijvers, L., Muñoz, F., Jiménez-Ortega, L., Almeida-Rivera, T., Sánchez-García, J., Fondevila, S., Casado, P., & Martín-Loeches, M. (2024). The self-reference effect can modulate language syntactic processing even without explicit awareness: An electroencephalography study. Journal of Cognitive Neuroscience, 36(3), 460-474. doi:10.1162/jocn_a_02104.

    Abstract

    Although it is well established that self-related information can rapidly capture our attention and bias cognitive functioning, whether this self-bias can affect language processing remains largely unknown. In addition, there is an ongoing debate as to the functional independence of language processes, notably regarding the syntactic domain. Hence, this study investigated the influence of self-related content on syntactic speech processing. Participants listened to sentences that could contain morphosyntactic anomalies while the masked face identity (self, friend, or unknown faces) was presented for 16 msec preceding the critical word. The language-related ERP components (left anterior negativity [LAN] and P600) appeared for all identity conditions. However, the largest LAN effect followed by a reduced P600 effect was observed for self-faces, whereas a larger LAN with no reduction of the P600 was found for friend faces compared with unknown faces. These data suggest that both early and late syntactic processes can be modulated by self-related content. In addition, alpha power was more suppressed over the left inferior frontal gyrus only when self-faces appeared before the critical word. This may reflect higher semantic demands concomitant to early syntactic operations (around 150–550 msec). Our data also provide further evidence of self-specific response, as reflected by the N250 component. Collectively, our results suggest that identity-related information is rapidly decoded from facial stimuli and may impact core linguistic processes, supporting an interactive view of syntactic processing. This study provides evidence that the self-reference effect can be extended to syntactic processing.
  • Sekine, K., & Özyürek, A. (2024). Children benefit from gestures to understand degraded speech but to a lesser extent than adults. Frontiers in Psychology, 14: 1305562. doi:10.3389/fpsyg.2023.1305562.

    Abstract

    The present study investigated to what extent children, compared to adults, benefit from gestures to disambiguate degraded speech by manipulating speech signals and manual modality. Dutch-speaking adults (N = 20) and 6- and 7-year-old children (N = 15) were presented with a series of video clips in which an actor produced a Dutch action verb with or without an accompanying iconic gesture. Participants were then asked to repeat what they had heard. The speech signal was either clear or altered into 4- or 8-band noise-vocoded speech. Children had more difficulty than adults in disambiguating degraded speech in the speech-only condition. However, when presented with both speech and gestures, children reached a comparable level of accuracy to that of adults in the degraded-speech-only condition. Furthermore, for adults, the enhancement of gestures was greater in the 4-band condition than in the 8-band condition, whereas children showed the opposite pattern. Gestures help children to disambiguate degraded speech, but children need more phonological information than adults to benefit from use of gestures. Children’s multimodal language integration needs to further develop to adapt flexibly to challenging situations such as degraded speech, as tested in our study, or instances where speech is heard with environmental noise or through a face mask.

    Additional information

    supplemental material
  • Slonimska, A. (2024). The role of iconicity and simultaneity in efficient communication in the visual modality: Evidence from LIS (Italian Sign Language) [Dissertation Abstract]. Sign Language & Linguistics, 27(1), 116-124. doi:10.1075/sll.00084.slo.
  • Ter Bekke, M., Drijvers, L., & Holler, J. (2024). Hand gestures have predictive potential during conversation: An investigation of the timing of gestures in relation to speech. Cognitive Science, 48(1): e13407. doi:10.1111/cogs.13407.

    Abstract

    During face-to-face conversation, transitions between speaker turns are incredibly fast. These fast turn exchanges seem to involve next speakers predicting upcoming semantic information, such that next turn planning can begin before a current turn is complete. Given that face-to-face conversation also involves the use of communicative bodily signals, an important question is how bodily signals such as co-speech hand gestures play into these processes of prediction and fast responding. In this corpus study, we found that hand gestures that depict or refer to semantic information started before the corresponding information in speech, which held both for the onset of the gesture as a whole, as well as the onset of the stroke (the most meaningful part of the gesture). This early timing potentially allows listeners to use the gestural information to predict the corresponding semantic information to be conveyed in speech. Moreover, we provided further evidence that questions with gestures got faster responses than questions without gestures. However, we found no evidence for the idea that how much a gesture precedes its lexical affiliate (i.e., its predictive potential) relates to how fast responses were given. The findings presented here highlight the importance of the temporal relation between speech and gesture and help to illuminate the potential mechanisms underpinning multimodal language processing during face-to-face conversation.
  • Ter Bekke, M., Drijvers, L., & Holler, J. (2024). Gestures speed up responses to questions. Language, Cognition and Neuroscience, 39(4), 423-430. doi:10.1080/23273798.2024.2314021.

    Abstract

    Most language use occurs in face-to-face conversation, which involves rapid turn-taking. Seeing communicative bodily signals in addition to hearing speech may facilitate such fast responding. We tested whether this holds for co-speech hand gestures by investigating whether these gestures speed up button press responses to questions. Sixty native speakers of Dutch viewed videos in which an actress asked yes/no-questions, either with or without a corresponding iconic hand gesture. Participants answered the questions as quickly and accurately as possible via button press. Gestures did not impact response accuracy, but crucially, gestures sped up responses, suggesting that response planning may be finished earlier when gestures are seen. How much gestures sped up responses was not related to their timing in the question or their timing with respect to the corresponding information in speech. Overall, these results are in line with the idea that multimodality may facilitate fast responding during face-to-face conversation.
  • Ter Bekke, M., Levinson, S. C., Van Otterdijk, L., Kühn, M., & Holler, J. (2024). Visual bodily signals and conversational context benefit the anticipation of turn ends. Cognition, 248: 105806. doi:10.1016/j.cognition.2024.105806.

    Abstract

    The typical pattern of alternating turns in conversation seems trivial at first sight. But a closer look quickly reveals the cognitive challenges involved, with much of it resulting from the fast-paced nature of conversation. One core ingredient to turn coordination is the anticipation of upcoming turn ends so as to be able to ready oneself for providing the next contribution. Across two experiments, we investigated two variables inherent to face-to-face conversation, the presence of visual bodily signals and preceding discourse context, in terms of their contribution to turn end anticipation. In a reaction time paradigm, participants anticipated conversational turn ends better when seeing the speaker and their visual bodily signals than when they did not, especially so for longer turns. Likewise, participants were better able to anticipate turn ends when they had access to the preceding discourse context than when they did not, and especially so for longer turns. Critically, the two variables did not interact, showing that visual bodily signals retain their influence even in the context of preceding discourse. In a pre-registered follow-up experiment, we manipulated the visibility of the speaker's head, eyes and upper body (i.e. torso + arms). Participants were better able to anticipate turn ends when the speaker's upper body was visible, suggesting a role for manual gestures in turn end anticipation. Together, these findings show that seeing the speaker during conversation may critically facilitate turn coordination in interaction.
  • Trujillo, J. P., & Holler, J. (2024). Conversational facial signals combine into compositional meanings that change the interpretation of speaker intentions. Scientific Reports, 14: 2286. doi:10.1038/s41598-024-52589-0.

    Abstract

    Human language is extremely versatile, combining a limited set of signals in an unlimited number of ways. However, it is unknown whether conversational visual signals feed into the composite utterances with which speakers communicate their intentions. We assessed whether different combinations of visual signals lead to different intent interpretations of the same spoken utterance. Participants viewed a virtual avatar uttering spoken questions while producing single visual signals (i.e., head turn, head tilt, eyebrow raise) or combinations of these signals. After each video, participants classified the communicative intention behind the question. We found that composite utterances combining several visual signals conveyed different meaning compared to utterances accompanied by the single visual signals. However, responses to combinations of signals were more similar to the responses to related, rather than unrelated, individual signals, indicating a consistent influence of the individual visual signals on the whole. This study therefore provides first evidence for compositional, non-additive (i.e., Gestalt-like) perception of multimodal language.

    Additional information

    41598_2024_52589_MOESM1_ESM.docx
  • Trujillo, J. P., & Holler, J. (2024). Information distribution patterns in naturalistic dialogue differ across languages. Psychonomic Bulletin & Review, 31, 1723-1734. doi:10.3758/s13423-024-02452-0.

    Abstract

    The natural ecology of language is conversation, with individuals taking turns speaking to communicate in a back-and-forth fashion. Language in this context involves strings of words that a listener must process while simultaneously planning their own next utterance. It would thus be highly advantageous if language users distributed information within an utterance in a way that may facilitate this processing–planning dynamic. While some studies have investigated how information is distributed at the level of single words or clauses, or in written language, little is known about how information is distributed within spoken utterances produced during naturalistic conversation. It also is not known how information distribution patterns of spoken utterances may differ across languages. We used a set of matched corpora (CallHome) containing 898 telephone conversations conducted in six different languages (Arabic, English, German, Japanese, Mandarin, and Spanish), analyzing more than 58,000 utterances, to assess whether there is evidence of distinct patterns of information distributions at the utterance level, and whether these patterns are similar or differed across the languages. We found that English, Spanish, and Mandarin typically show a back-loaded distribution, with higher information (i.e., surprisal) in the last half of utterances compared with the first half, while Arabic, German, and Japanese showed front-loaded distributions, with higher information in the first half compared with the last half. Additional analyses suggest that these patterns may be related to word order and rate of noun and verb usage. We additionally found that back-loaded languages have longer turn transition times (i.e.,time between speaker turns)

    Additional information

    Data availability
  • Ünal, E., Mamus, E., & Özyürek, A. (2024). Multimodal encoding of motion events in speech, gesture, and cognition. Language and Cognition, 16(4), 785-804. doi:10.1017/langcog.2023.61.

    Abstract

    How people communicate about motion events and how this is shaped by language typology are mostly studied with a focus on linguistic encoding in speech. Yet, human communication typically involves an interactional exchange of multimodal signals, such as hand gestures that have different affordances for representing event components. Here, we review recent empirical evidence on multimodal encoding of motion in speech and gesture to gain a deeper understanding of whether and how language typology shapes linguistic expressions in different modalities, and how this changes across different sensory modalities of input and interacts with other aspects of cognition. Empirical evidence strongly suggests that Talmy’s typology of event integration predicts multimodal event descriptions in speech and gesture and visual attention to event components prior to producing these descriptions. Furthermore, variability within the event itself, such as type and modality of stimuli, may override the influence of language typology, especially for expression of manner.
  • Coventry, K. R., Gudde, H. B., Diessel, H., Collier, J., Guijarro-Fuentes, P., Vulchanova, M., Vulchanov, V., Todisco, E., Reile, M., Breunesse, M., Plado, H., Bohnemeyer, J., Bsili, R., Caldano, M., Dekova, R., Donelson, K., Forker, D., Park, Y., Pathak, L. S., Peeters, D. and 25 moreCoventry, K. R., Gudde, H. B., Diessel, H., Collier, J., Guijarro-Fuentes, P., Vulchanova, M., Vulchanov, V., Todisco, E., Reile, M., Breunesse, M., Plado, H., Bohnemeyer, J., Bsili, R., Caldano, M., Dekova, R., Donelson, K., Forker, D., Park, Y., Pathak, L. S., Peeters, D., Pizzuto, G., Serhan, B., Apse, L., Hesse, F., Hoang, L., Hoang, P., Igari, Y., Kapiley, K., Haupt-Khutsishvili, T., Kolding, S., Priiki, K., Mačiukaitytė, I., Mohite, V., Nahkola, T., Tsoi, S. Y., Williams, S., Yasuda, S., Cangelosi, A., Duñabeitia, J. A., Mishra, R. K., Rocca, R., Šķilters, J., Wallentin, M., Žilinskaitė-Šinkūnienė, E., & Incel, O. D. (2023). Spatial communication systems across languages reflect universal action constraints. Nature Human Behaviour, 77, 2099-2110. doi:10.1038/s41562-023-01697-4.

    Abstract

    The extent to which languages share properties reflecting the non-linguistic constraints of the speakers who speak them is key to the debate regarding the relationship between language and cognition. A critical case is spatial communication, where it has been argued that semantic universals should exist, if anywhere. Here, using an experimental paradigm able to separate variation within a language from variation between languages, we tested the use of spatial demonstratives—the most fundamental and frequent spatial terms across languages. In n = 874 speakers across 29 languages, we show that speakers of all tested languages use spatial demonstratives as a function of being able to reach or act on an object being referred to. In some languages, the position of the addressee is also relevant in selecting between demonstrative forms. Commonalities and differences across languages in spatial communication can be understood in terms of universal constraints on action shaping spatial language and cognition.
  • Dingemanse, M., Liesenfeld, A., Rasenberg, M., Albert, S., Ameka, F. K., Birhane, A., Bolis, D., Cassell, J., Clift, R., Cuffari, E., De Jaegher, H., Dutilh Novaes, C., Enfield, N. J., Fusaroli, R., Gregoromichelaki, E., Hutchins, E., Konvalinka, I., Milton, D., Rączaszek-Leonardi, J., Reddy, V. and 8 moreDingemanse, M., Liesenfeld, A., Rasenberg, M., Albert, S., Ameka, F. K., Birhane, A., Bolis, D., Cassell, J., Clift, R., Cuffari, E., De Jaegher, H., Dutilh Novaes, C., Enfield, N. J., Fusaroli, R., Gregoromichelaki, E., Hutchins, E., Konvalinka, I., Milton, D., Rączaszek-Leonardi, J., Reddy, V., Rossano, F., Schlangen, D., Seibt, J., Stokoe, E., Suchman, L. A., Vesper, C., Wheatley, T., & Wiltschko, M. (2023). Beyond single-mindedness: A figure-ground reversal for the cognitive sciences. Cognitive Science, 47(1): e13230. doi:10.1111/cogs.13230.

    Abstract

    A fundamental fact about human minds is that they are never truly alone: all minds are steeped in situated interaction. That social interaction matters is recognised by any experimentalist who seeks to exclude its influence by studying individuals in isolation. On this view, interaction complicates cognition. Here we explore the more radical stance that interaction co-constitutes cognition: that we benefit from looking beyond single minds towards cognition as a process involving interacting minds. All around the cognitive sciences, there are approaches that put interaction centre stage. Their diverse and pluralistic origins may obscure the fact that collectively, they harbour insights and methods that can respecify foundational assumptions and fuel novel interdisciplinary work. What might the cognitive sciences gain from stronger interactional foundations? This represents, we believe, one of the key questions for the future. Writing as a multidisciplinary collective assembled from across the classic cognitive science hexagon and beyond, we highlight the opportunity for a figure-ground reversal that puts interaction at the heart of cognition. The interactive stance is a way of seeing that deserves to be a key part of the conceptual toolkit of cognitive scientists.
  • Karadöller, D. Z., Sumer, B., Ünal, E., & Özyürek, A. (2023). Late sign language exposure does not modulate the relation between spatial language and spatial memory in deaf children and adults. Memory & Cognition, 51, 582-600. doi:10.3758/s13421-022-01281-7.

    Abstract

    Prior work with hearing children acquiring a spoken language as their first language shows that spatial language and cognition are related systems and spatial language use predicts spatial memory. Here, we further investigate the extent of this relationship in signing deaf children and adults and ask if late sign language exposure, as well as the frequency and the type of spatial language use that might be affected by late exposure, modulate subsequent memory for spatial relations. To do so, we compared spatial language and memory of 8-year-old late-signing children (after 2 years of exposure to a sign language at the school for the deaf) and late-signing adults to their native-signing counterparts. We elicited picture descriptions of Left-Right relations in Turkish Sign Language (Türk İşaret Dili) and measured the subsequent recognition memory accuracy of the described pictures. Results showed that late-signing adults and children were similar to their native-signing counterparts in how often they encoded the spatial relation. However, late-signing adults but not children differed from their native-signing counterparts in the type of spatial language they used. However, neither late sign language exposure nor the frequency and type of spatial language use modulated spatial memory accuracy. Therefore, even though late language exposure seems to influence the type of spatial language use, this does not predict subsequent memory for spatial relations. We discuss the implications of these findings based on the theories concerning the correspondence between spatial language and cognition as related or rather independent systems.
  • Mamus, E., Speed, L. J., Rissman, L., Majid, A., & Özyürek, A. (2023). Lack of visual experience affects multimodal language production: Evidence from congenitally blind and sighted people. Cognitive Science, 47(1): e13228. doi:10.1111/cogs.13228.

    Abstract

    The human experience is shaped by information from different perceptual channels, but it is still debated whether and how differential experience influences language use. To address this, we compared congenitally blind, blindfolded, and sighted people's descriptions of the same motion events experienced auditorily by all participants (i.e., via sound alone) and conveyed in speech and gesture. Comparison of blind and sighted participants to blindfolded participants helped us disentangle the effects of a lifetime experience of being blind versus the task-specific effects of experiencing a motion event by sound alone. Compared to sighted people, blind people's speech focused more on path and less on manner of motion, and encoded paths in a more segmented fashion using more landmarks and path verbs. Gestures followed the speech, such that blind people pointed to landmarks more and depicted manner less than sighted people. This suggests that visual experience affects how people express spatial events in the multimodal language and that blindness may enhance sensitivity to paths of motion due to changes in event construal. These findings have implications for the claims that language processes are deeply rooted in our sensory experiences.
  • Mamus, E., Speed, L., Özyürek, A., & Majid, A. (2023). The effect of input sensory modality on the multimodal encoding of motion events. Language, Cognition and Neuroscience, 38(5), 711-723. doi:10.1080/23273798.2022.2141282.

    Abstract

    Each sensory modality has different affordances: vision has higher spatial acuity than audition, whereas audition has better temporal acuity. This may have consequences for the encoding of events and its subsequent multimodal language production—an issue that has received relatively little attention to date. In this study, we compared motion events presented as audio-only, visual-only, or multimodal (visual + audio) input and measured speech and co-speech gesture depicting path and manner of motion in Turkish. Input modality affected speech production. Speakers with audio-only input produced more path descriptions and fewer manner descriptions in speech compared to speakers who received visual input. In contrast, the type and frequency of gestures did not change across conditions. Path-only gestures dominated throughout. Our results suggest that while speech is more susceptible to auditory vs. visual input in encoding aspects of motion events, gesture is less sensitive to such differences.

    Additional information

    Supplemental material
  • Manhardt, F., Brouwer, S., Van Wijk, E., & Özyürek, A. (2023). Word order preference in sign influences speech in hearing bimodal bilinguals but not vice versa: Evidence from behavior and eye-gaze. Bilingualism: Language and Cognition, 26(1), 48-61. doi:10.1017/S1366728922000311.

    Abstract

    We investigated cross-modal influences between speech and sign in hearing bimodal bilinguals, proficient in a spoken and a sign language, and its consequences on visual attention during message preparation using eye-tracking. We focused on spatial expressions in which sign languages, unlike spoken languages, have a modality-driven preference to mention grounds (big objects) prior to figures (smaller objects). We compared hearing bimodal bilinguals’ spatial expressions and visual attention in Dutch and Dutch Sign Language (N = 18) to those of their hearing non-signing (N = 20) and deaf signing peers (N = 18). In speech, hearing bimodal bilinguals expressed more ground-first descriptions and fixated grounds more than hearing non-signers, showing influence from sign. In sign, they used as many ground-first descriptions as deaf signers and fixated grounds equally often, demonstrating no influence from speech. Cross-linguistic influence of word order preference and visual attention in hearing bimodal bilinguals appears to be one-directional modulated by modality-driven differences.
  • Özer, D., Karadöller, D. Z., Özyürek, A., & Göksun, T. (2023). Gestures cued by demonstratives in speech guide listeners' visual attention during spatial language comprehension. Journal of Experimental Psychology: General, 152(9), 2623-2635. doi:10.1037/xge0001402.

    Abstract

    Gestures help speakers and listeners during communication and thinking, particularly for visual-spatial information. Speakers tend to use gestures to complement the accompanying spoken deictic constructions, such as demonstratives, when communicating spatial information (e.g., saying “The candle is here” and gesturing to the right side to express that the candle is on the speaker's right). Visual information conveyed by gestures enhances listeners’ comprehension. Whether and how listeners allocate overt visual attention to gestures in different speech contexts is mostly unknown. We asked if (a) listeners gazed at gestures more when they complement demonstratives in speech (“here”) compared to when they express redundant information to speech (e.g., “right”) and (b) gazing at gestures related to listeners’ information uptake from those gestures. We demonstrated that listeners fixated gestures more when they expressed complementary than redundant information in the accompanying speech. Moreover, overt visual attention to gestures did not predict listeners’ comprehension. These results suggest that the heightened communicative value of gestures as signaled by external cues, such as demonstratives, guides listeners’ visual attention to gestures. However, overt visual attention does not seem to be necessary to extract the cued information from the multimodal message.
  • Eijk, L., Rasenberg, M., Arnese, F., Blokpoel, M., Dingemanse, M., Doeller, C. F., Ernestus, M., Holler, J., Milivojevic, B., Özyürek, A., Pouw, W., Van Rooij, I., Schriefers, H., Toni, I., Trujillo, J. P., & Bögels, S. (2022). The CABB dataset: A multimodal corpus of communicative interactions for behavioural and neural analyses. NeuroImage, 264: 119734. doi:10.1016/j.neuroimage.2022.119734.

    Abstract

    We present a dataset of behavioural and fMRI observations acquired in the context of humans involved in multimodal referential communication. The dataset contains audio/video and motion-tracking recordings of face-to-face, task-based communicative interactions in Dutch, as well as behavioural and neural correlates of participants’ representations of dialogue referents. Seventy-one pairs of unacquainted participants performed two interleaved interactional tasks in which they described and located 16 novel geometrical objects (i.e., Fribbles) yielding spontaneous interactions of about one hour. We share high-quality video (from three cameras), audio (from head-mounted microphones), and motion-tracking (Kinect) data, as well as speech transcripts of the interactions. Before and after engaging in the face-to-face communicative interactions, participants’ individual representations of the 16 Fribbles were estimated. Behaviourally, participants provided a written description (one to three words) for each Fribble and positioned them along 29 independent conceptual dimensions (e.g., rounded, human, audible). Neurally, fMRI signal evoked by each Fribble was measured during a one-back working-memory task. To enable functional hyperalignment across participants, the dataset also includes fMRI measurements obtained during visual presentation of eight animated movies (35 minutes total). We present analyses for the various types of data demonstrating their quality and consistency with earlier research. Besides high-resolution multimodal interactional data, this dataset includes different correlates of communicative referents, obtained before and after face-to-face dialogue, allowing for novel investigations into the relation between communicative behaviours and the representational space shared by communicators. This unique combination of data can be used for research in neuroscience, psychology, linguistics, and beyond.
  • Heesen, R., Fröhlich, M., Sievers, C., Woensdregt, M., & Dingemanse, M. (2022). Coordinating social action: A primer for the cross-species investigation of communicative repair. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 377(1859): 20210110. doi:10.1098/rstb.2021.0110.

    Abstract

    Human joint action is inherently cooperative, manifested in the collaborative efforts of participants to minimize communicative trouble through interactive repair. Although interactive repair requires sophisticated cognitive abilities,
    it can be dissected into basic building blocks shared with non-human animal species. A review of the primate literature shows that interactionally contingent signal sequences are at least common among species of nonhuman great apes, suggesting a gradual evolution of repair. To pioneer a cross-species assessment of repair this paper aims at (i) identifying necessary precursors of human interactive repair; (ii) proposing a coding framework for its comparative study in humans and non-human species; and (iii) using this framework to analyse examples of interactions of humans (adults/children) and non-human great apes. We hope this paper will serve as a primer for cross-species comparisons of communicative breakdowns and how they are repaired.
  • Pearson, L., & Pouw, W. (2022). Gesture–vocal coupling in Karnatak music performance: A neuro–bodily distributed aesthetic entanglement. Annals of the New York Academy of Sciences, 1515(1), 219-236. doi:10.1111/nyas.14806.

    Abstract

    In many musical styles, vocalists manually gesture while they sing. Coupling between gesture kinematics and vocalization has been examined in speech contexts, but it is an open question how these couple in music making. We examine this in a corpus of South Indian, Karnatak vocal music that includes motion-capture data. Through peak magnitude analysis (linear mixed regression) and continuous time-series analyses (generalized additive modeling), we assessed whether vocal trajectories around peaks in vertical velocity, speed, or acceleration were coupling with changes in vocal acoustics (namely, F0 and amplitude). Kinematic coupling was stronger for F0 change versus amplitude, pointing to F0's musical significance. Acceleration was the most predictive for F0 change and had the most reliable magnitude coupling, showing a one-third power relation. That acceleration, rather than other kinematics, is maximally predictive for vocalization is interesting because acceleration entails force transfers onto the body. As a theoretical contribution, we argue that gesturing in musical contexts should be understood in relation to the physical connections between gesturing and vocal production that are brought into harmony with the vocalists’ (enculturated) performance goals. Gesture–vocal coupling should, therefore, be viewed as a neuro–bodily distributed aesthetic entanglement.

    Additional information

    tables
  • Pouw, W., & Holler, J. (2022). Timing in conversation is dynamically adjusted turn by turn in dyadic telephone conversations. Cognition, 222: 105015. doi:10.1016/j.cognition.2022.105015.

    Abstract

    Conversational turn taking in humans involves incredibly rapid responding. The timing mechanisms underpinning such responses have been heavily debated, including questions such as who is doing the timing. Similar to findings on rhythmic tapping to a metronome, we show that floor transfer offsets (FTOs) in telephone conversations are serially dependent, such that FTOs are lag-1 negatively autocorrelated. Finding this serial dependence on a turn-by-turn basis (lag-1) rather than on the basis of two or more turns, suggests a counter-adjustment mechanism operating at the level of the dyad in FTOs during telephone conversations, rather than a more individualistic self-adjustment within speakers. This finding, if replicated, has major implications for models describing turn taking, and confirms the joint, dyadic nature of human conversational dynamics. Future research is needed to see how pervasive serial dependencies in FTOs are, such as for example in richer communicative face-to-face contexts where visual signals affect conversational timing.
  • Pouw, W., & Dixon, J. A. (2022). What you hear and see specifies the perception of a limb-respiratory-vocal act. Proceedings of the Royal Society B: Biological Sciences, 289(1979): 20221026. doi:10.1098/rspb.2022.1026.
  • Pouw, W., Harrison, S. J., & Dixon, J. A. (2022). The importance of visual control and biomechanics in the regulation of gesture-speech synchrony for an individual deprived of proprioceptive feedback of body position. Scientific Reports, 12: 14775. doi:10.1038/s41598-022-18300-x.

    Abstract

    Do communicative actions such as gestures fundamentally differ in their control mechanisms from other actions? Evidence for such fundamental differences comes from a classic gesture-speech coordination experiment performed with a person (IW) with deafferentation (McNeill, 2005). Although IW has lost both his primary source of information about body position (i.e., proprioception) and discriminative touch from the neck down, his gesture-speech coordination has been reported to be largely unaffected, even if his vision is blocked. This is surprising because, without vision, his object-directed actions almost completely break down. We examine the hypothesis that IW’s gesture-speech coordination is supported by the biomechanical effects of gesturing on head posture and speech. We find that when vision is blocked, there are micro-scale increases in gesture-speech timing variability, consistent with IW’s reported experience that gesturing is difficult without vision. Supporting the hypothesis that IW exploits biomechanical consequences of the act of gesturing, we find that: (1) gestures with larger physical impulses co-occur with greater head movement, (2) gesture-speech synchrony relates to larger gesture-concurrent head movements (i.e. for bimanual gestures), (3) when vision is blocked, gestures generate more physical impulse, and (4) moments of acoustic prominence couple more with peaks of physical impulse when vision is blocked. It can be concluded that IW’s gesturing ability is not based on a specialized language-based feedforward control as originally concluded from previous research, but is still dependent on a varied means of recurrent feedback from the body.

    Additional information

    supplementary tables
  • Pouw, W., & Fuchs, S. (2022). Origins of vocal-entangled gesture. Neuroscience and Biobehavioral Reviews, 141: 104836. doi:10.1016/j.neubiorev.2022.104836.

    Abstract

    Gestures during speaking are typically understood in a representational framework: they represent absent or distal states of affairs by means of pointing, resemblance, or symbolic replacement. However, humans also gesture along with the rhythm of speaking, which is amenable to a non-representational perspective. Such a perspective centers on the phenomenon of vocal-entangled gestures and builds on evidence showing that when an upper limb with a certain mass decelerates/accelerates sufficiently, it yields impulses on the body that cascade in various ways into the respiratory–vocal system. It entails a physical entanglement between body motions, respiration, and vocal activities. It is shown that vocal-entangled gestures are realized in infant vocal–motor babbling before any representational use of gesture develops. Similarly, an overview is given of vocal-entangled processes in non-human animals. They can frequently be found in rats, bats, birds, and a range of other species that developed even earlier in the phylogenetic tree. Thus, the origins of human gesture lie in biomechanics, emerging early in ontogeny and running deep in phylogeny.
  • Rasenberg, M., Pouw, W., Özyürek, A., & Dingemanse, M. (2022). The multimodal nature of communicative efficiency in social interaction. Scientific Reports, 12: 19111. doi:10.1038/s41598-022-22883-w.

    Abstract

    How does communicative efficiency shape language use? We approach this question by studying it at the level of the dyad, and in terms of multimodal utterances. We investigate whether and how people minimize their joint speech and gesture efforts in face-to-face interactions, using linguistic and kinematic analyses. We zoom in on other-initiated repair—a conversational microcosm where people coordinate their utterances to solve problems with perceiving or understanding. We find that efforts in the spoken and gestural modalities are wielded in parallel across repair turns of different types, and that people repair conversational problems in the most cost-efficient way possible, minimizing the joint multimodal effort for the dyad as a whole. These results are in line with the principle of least collaborative effort in speech and with the reduction of joint costs in non-linguistic joint actions. The results extend our understanding of those coefficiency principles by revealing that they pertain to multimodal utterance design.

    Additional information

    Data and analysis scripts
  • Rasenberg, M., Özyürek, A., Bögels, S., & Dingemanse, M. (2022). The primacy of multimodal alignment in converging on shared symbols for novel referents. Discourse Processes, 59(3), 209-236. doi:10.1080/0163853X.2021.1992235.

    Abstract

    When people establish shared symbols for novel objects or concepts, they have been shown to rely on the use of multiple communicative modalities as well as on alignment (i.e., cross-participant repetition of communicative behavior). Yet these interactional resources have rarely been studied together, so little is known about if and how people combine multiple modalities in alignment to achieve joint reference. To investigate this, we systematically track the emergence of lexical and gestural alignment in a referential communication task with novel objects. Quantitative analyses reveal that people frequently use a combination of lexical and gestural alignment, and that such multimodal alignment tends to emerge earlier compared to unimodal alignment. Qualitative analyses of the interactional contexts in which alignment emerges reveal how people flexibly deploy lexical and gestural alignment (independently, simultaneously or successively) to adjust to communicative pressures.
  • Schubotz, L., Özyürek, A., & Holler, J. (2022). Individual differences in working memory and semantic fluency predict younger and older adults' multimodal recipient design in an interactive spatial task. Acta Psychologica, 229: 103690. doi:10.1016/j.actpsy.2022.103690.

    Abstract

    Aging appears to impair the ability to adapt speech and gestures based on knowledge shared with an addressee
    (common ground-based recipient design) in narrative settings. Here, we test whether this extends to spatial settings
    and is modulated by cognitive abilities. Younger and older adults gave instructions on how to assemble 3D-
    models from building blocks on six consecutive trials. We induced mutually shared knowledge by either
    showing speaker and addressee the model beforehand, or not. Additionally, shared knowledge accumulated
    across the trials. Younger and crucially also older adults provided recipient-designed utterances, indicated by a
    significant reduction in the number of words and of gestures when common ground was present. Additionally, we
    observed a reduction in semantic content and a shift in cross-modal distribution of information across trials.
    Rather than age, individual differences in verbal and visual working memory and semantic fluency predicted the
    extent of addressee-based adaptations. Thus, in spatial tasks, individual cognitive abilities modulate the inter-
    active language use of both younger and older adul

    Additional information

    1-s2.0-S0001691822002050-mmc1.docx
  • Slonimska, A., Özyürek, A., & Capirci, O. (2022). Simultaneity as an emergent property of efficient communication in language: A comparison of silent gesture and sign language. Wiley Interdisciplinary Reviews: Cognitive Science, 46(5): 13133. doi:10.1111/cogs.13133.

    Abstract

    Sign languages use multiple articulators and iconicity in the visual modality which allow linguistic units to be organized not only linearly but also simultaneously. Recent research has shown that users of an established sign language such as LIS (Italian Sign Language) use simultaneous and iconic constructions as a modality-specific resource to achieve communicative efficiency when they are required to encode informationally rich events. However, it remains to be explored whether the use of such simultaneous and iconic constructions recruited for communicative efficiency can be employed even without a linguistic system (i.e., in silent gesture) or whether they are specific to linguistic patterning (i.e., in LIS). In the present study, we conducted the same experiment as in Slonimska et al. with 23 Italian speakers using silent gesture and compared the results of the two studies. The findings showed that while simultaneity was afforded by the visual modality to some extent, its use in silent gesture was nevertheless less frequent and qualitatively different than when used within a linguistic system. Thus, the use of simultaneous and iconic constructions for communicative efficiency constitutes an emergent property of sign languages. The present study highlights the importance of studying modality-specific resources and their use for linguistic expression in order to promote a more thorough understanding of the language faculty and its modality-specific adaptive capabilities.
  • Sumer, B., & Özyürek, A. (2022). Cross-modal investigation of event component omissions in language development: A comparison of signing and speaking children. Language, Cognition and Neuroscience, 37(8), 1023-1039. doi:10.1080/23273798.2022.2042336.

    Abstract

    Language development research suggests a universal tendency for children to be under- informative in narrating motion events by omitting components such as Path, Manner or Ground. However, this assumption has not been tested for children acquiring sign language. Due to the affordances of the visual-spatial modality of sign languages for iconic expression, signing children might omit event components less frequently than speaking children. Here we analysed motion event descriptions elicited from deaf children (4–10 years) acquiring Turkish Sign Language (TİD) and their Turkish-speaking peers. While children omitted all types of event components more often than adults, signing children and adults encoded more Path and Manner in TİD than their peers in Turkish. These results provide more evidence for a general universal tendency for children to omit event components as well as a modality bias for sign languages to encode both Manner and Path more frequently than spoken languages.
  • Sumer, B., & Özyürek, A. (2022). Language use in deaf children with early-signing versus late-signing deaf parents. Frontiers in Communication, 6: 804900. doi:10.3389/fcomm.2021.804900.

    Abstract

    Previous research has shown that spatial language is sensitive to the effects of delayed language exposure. Locative encodings of late-signing deaf adults varied from those of early-signing deaf adults in the preferred types of linguistic forms. In the current study, we investigated whether such differences would be found in spatial language use of deaf children with deaf parents who are either early or late signers of Turkish Sign Language (TİD). We analyzed locative encodings elicited from these two groups of deaf children for the use of different linguistic forms and the types of classifier handshapes. Our findings revealed differences between these two groups of deaf children in their preferred types of linguistic forms, which showed parallels to differences between late versus early deaf adult signers as reported by earlier studies. Deaf children in the current study, however, were similar to each other in the type of classifier handshapes that they used in their classifier constructions. Our findings have implications for expanding current knowledge on to what extent variation in language input (i.e., from early vs. late deaf signers) is reflected in children’s productions as well as the role of linguistic input on language development in general.
  • Trujillo, J. P., Özyürek, A., Kan, C., Sheftel-Simanova, I., & Bekkering, H. (2022). Differences in functional brain organization during gesture recognition between autistic and neurotypical individuals. Social Cognitive and Affective Neuroscience, 17(11), 1021-1034. doi:10.1093/scan/nsac026.

    Abstract

    Persons with and without autism process sensory information differently. Differences in sensory processing are directly relevant to social functioning and communicative abilities, which are known to be hampered in persons with autism. We collected functional magnetic resonance imaging (fMRI) data from 25 autistic individuals and 25 neurotypical individuals while they performed a silent gesture recognition task. We exploited brain network topology, a holistic quantification of how networks within the brain are organized to provide new insights into how visual communicative signals are processed in autistic and neurotypical individuals. Performing graph theoretical analysis, we calculated two network properties of the action observation network: local efficiency, as a measure of network segregation, and global efficiency, as a measure of network integration. We found that persons with autism and neurotypical persons differ in how the action observation network is organized. Persons with autism utilize a more clustered, local-processing-oriented network configuration (i.e., higher local efficiency), rather than the more integrative network organization seen in neurotypicals (i.e., higher global efficiency). These results shed new light on the complex interplay between social and sensory processing in autism.

    Additional information

    nsac026_supp.zip
  • Ünal, E., Manhardt, F., & Özyürek, A. (2022). Speaking and gesturing guide event perception during message conceptualization: Evidence from eye movements. Cognition, 225: 105127. doi:10.1016/j.cognition.2022.105127.

    Abstract

    Speakers’ visual attention to events is guided by linguistic conceptualization of information in spoken language
    production and in language-specific ways. Does production of language-specific co-speech gestures further guide
    speakers’ visual attention during message preparation? Here, we examine the link between visual attention and
    multimodal event descriptions in Turkish. Turkish is a verb-framed language where speakers’ speech and gesture
    show language specificity with path of motion mostly expressed within the main verb accompanied by path
    gestures. Turkish-speaking adults viewed motion events while their eye movements were recorded during non-
    linguistic (viewing-only) and linguistic (viewing-before-describing) tasks. The relative attention allocated to path
    over manner was higher in the linguistic task compared to the non-linguistic task. Furthermore, the relative
    attention allocated to path over manner within the linguistic task was higher when speakers (a) encoded path in
    the main verb versus outside the verb and (b) used additional path gestures accompanying speech versus not.
    Results strongly suggest that speakers’ visual attention is guided by language-specific event encoding not only in
    speech but also in gesture. This provides evidence consistent with models that propose integration of speech and
    gesture at the conceptualization level of language production and suggests that the links between the eye and the
    mouth may be extended to the eye and the hand.
  • Brown, A. R., Pouw, W., Brentari, D., & Goldin-Meadow, S. (2021). People are less susceptible to illusion when they use their hands to communicate rather than estimate. Psychological Science, 32, 1227-1237. doi:10.1177/0956797621991552.

    Abstract

    When we use our hands to estimate the length of a stick in the Müller-Lyer illusion, we are highly susceptible to the illusion. But when we prepare to act on sticks under the same conditions, we are significantly less susceptible. Here, we asked whether people are susceptible to illusion when they use their hands not to act on objects but to describe them in spontaneous co-speech gestures or conventional sign languages of the deaf. Thirty-two English speakers and 13 American Sign Language signers used their hands to act on, estimate the length of, and describe sticks eliciting the Müller-Lyer illusion. For both gesture and sign, the magnitude of illusion in the description task was smaller than the magnitude of illusion in the estimation task and not different from the magnitude of illusion in the action task. The mechanisms responsible for producing gesture in speech and sign thus appear to operate not on percepts involved in estimation but on percepts derived from the way we act on objects.

    Additional information

    supplementary material data via OSF
  • Fisher, V. J. (2021). Embodied songs: Insights into the nature of cross-modal meaning-making within sign language informed, embodied interpretations of vocal music. Frontiers in Psychology, 12: 624689. doi:10.3389/fpsyg.2021.624689.

    Abstract

    Embodied song practices involve the transformation of songs from the acoustic modality into an embodied-visual form, to increase meaningful access for d/Deaf audiences. This goes beyond the translation of lyrics, by combining poetic sign language with other bodily movements to embody the para-linguistic expressive and musical features that enhance the message of a song. To date, the limited research into this phenomenon has focussed on linguistic features and interactions with rhythm. The relationship between bodily actions and music has not been probed beyond an assumed implication of conformance. However, as the primary objective is to communicate equivalent meanings, the ways that the acoustic and embodied-visual signals relate to each other should reveal something about underlying conceptual agreement. This paper draws together a range of pertinent theories from within a grounded cognition framework including semiotics, analogy mapping and cross-modal correspondences. These theories are applied to embodiment strategies used by prominent d/Deaf and hearing Dutch practitioners, to unpack the relationship between acoustic songs, their embodied representations, and their broader conceptual and affective meanings. This leads to the proposition that meaning primarily arises through shared patterns of internal relations across a range of amodal and cross-modal features with an emphasis on dynamic qualities. These analogous patterns can inform metaphorical interpretations and trigger shared emotional responses. This exploratory survey offers insights into the nature of cross-modal and embodied meaning-making, as a jumping-off point for further research.
  • Karadöller, D. Z., Sumer, B., & Ozyurek, A. (2021). Effects and non-effects of late language exposure on spatial language development: Evidence from deaf adults and children. Language Learning and Development, 17(1), 1-25. doi:10.1080/15475441.2020.1823846.

    Abstract

    Late exposure to the first language, as in the case of deaf children with hearing parents, hinders the production of linguistic expressions, even in adulthood. Less is known about the development of language soon after language exposure and if late exposure hinders all domains of language in children and adults. We compared late signing adults and children (MAge = 8;5) 2 years after exposure to sign language, to their age-matched native signing peers in expressions of two types of locative relations that are acquired in certain cognitive-developmental order: view-independent (IN-ON-UNDER) and view-dependent (LEFT-RIGHT). Late signing children and adults differed from native signers in their use of linguistic devices for view-dependent relations but not for view-independent relations. These effects were also modulated by the morphological complexity. Hindering effects of late language exposure on the development of language in children and adults are not absolute but are modulated by cognitive and linguistic complexity.
  • Manhardt, F., Brouwer, S., & Ozyurek, A. (2021). A tale of two modalities: Sign and speech influence in each other in bimodal bilinguals. Psychological Science, 32(3), 424-436. doi:10.1177/0956797620968789.

    Abstract

    Bimodal bilinguals are hearing individuals fluent in a sign and a spoken language. Can the two languages influence each other in such individuals despite differences in the visual (sign) and vocal (speech) modalities of expression? We investigated cross-linguistic influences on bimodal bilinguals’ expression of spatial relations. Unlike spoken languages, sign uses iconic linguistic forms that resemble physical features of objects in a spatial relation and thus expresses specific semantic information. Hearing bimodal bilinguals (n = 21) fluent in Dutch and Sign Language of the Netherlands and their hearing nonsigning and deaf signing peers (n = 20 each) described left/right relations between two objects. Bimodal bilinguals expressed more specific information about physical features of objects in speech than nonsigners, showing influence from sign language. They also used fewer iconic signs with specific semantic information than deaf signers, demonstrating influence from speech. Bimodal bilinguals’ speech and signs are shaped by two languages from different modalities.

    Additional information

    supplementary materials
  • Nielsen, A. K. S., & Dingemanse, M. (2021). Iconicity in word learning and beyond: A critical review. Language and Speech, 64(1), 52-72. doi:10.1177/0023830920914339.

    Abstract

    Interest in iconicity (the resemblance-based mapping between aspects of form and meaning) is in the midst of a resurgence, and a prominent focus in the field has been the possible role of iconicity in language learning. Here we critically review theory and empirical findings in this domain. We distinguish local learning enhancement (where the iconicity of certain lexical items influences the learning of those items) and general learning enhancement (where the iconicity of certain lexical items influences the later learning of non-iconic items or systems). We find that evidence for local learning enhancement is quite strong, though not as clear cut as it is often described and based on a limited sample of languages. Despite common claims about broader facilitatory effects of iconicity on learning, we find that current evidence for general learning enhancement is lacking. We suggest a number of productive avenues for future research and specify what types of evidence would be required to show a role for iconicity in general learning enhancement. We also review evidence for functions of iconicity beyond word learning: iconicity enhances comprehension by providing complementary representations, supports communication about sensory imagery, and expresses affective meanings. Even if learning benefits may be modest or cross-linguistically varied, on balance, iconicity emerges as a vital aspect of language.
  • Ozyurek, A. (2021). Considering the nature of multimodal language from a crosslinguistic perspective. Journal of Cognition, 4(1): 42. doi:10.5334/joc.165.

    Abstract

    Language in its primary face-to-face context is multimodal (e.g., Holler and Levinson, 2019; Perniss, 2018). Thus, understanding how expressions in the vocal and visual modalities together contribute to our notions of language structure, use, processing, and transmission (i.e., acquisition, evolution, emergence) in different languages and cultures should be a fundamental goal of language sciences. This requires a new framework of language that brings together how arbitrary and non-arbitrary and motivated semiotic resources of language relate to each other. Current commentary evaluates such a proposal by Murgiano et al (2021) from a crosslinguistic perspective taking variation as well as systematicity in multimodal utterances into account.
  • Pouw, W., Dingemanse, M., Motamedi, Y., & Ozyurek, A. (2021). A systematic investigation of gesture kinematics in evolving manual languages in the lab. Cognitive Science, 45(7): e13014. doi:10.1111/cogs.13014.

    Abstract

    Silent gestures consist of complex multi-articulatory movements but are now primarily studied through categorical coding of the referential gesture content. The relation of categorical linguistic content with continuous kinematics is therefore poorly understood. Here, we reanalyzed the video data from a gestural evolution experiment (Motamedi, Schouwstra, Smith, Culbertson, & Kirby, 2019), which showed increases in the systematicity of gesture content over time. We applied computer vision techniques to quantify the kinematics of the original data. Our kinematic analyses demonstrated that gestures become more efficient and less complex in their kinematics over generations of learners. We further detect the systematicity of gesture form on the level of thegesture kinematic interrelations, which directly scales with the systematicity obtained on semantic coding of the gestures. Thus, from continuous kinematics alone, we can tap into linguistic aspects that were previously only approachable through categorical coding of meaning. Finally, going beyond issues of systematicity, we show how unique gesture kinematic dialects emerged over generations as isolated chains of participants gradually diverged over iterations from other chains. We, thereby, conclude that gestures can come to embody the linguistic system at the level of interrelationships between communicative tokens, which should calibrate our theories about form and linguistic content.
  • Pouw, W., Proksch, S., Drijvers, L., Gamba, M., Holler, J., Kello, C., Schaefer, R. S., & Wiggins, G. A. (2021). Multilevel rhythms in multimodal communication. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 376: 20200334. doi:10.1098/rstb.2020.0334.

    Abstract

    It is now widely accepted that the brunt of animal communication is conducted via several modalities, e.g. acoustic and visual, either simultaneously or sequentially. This is a laudable multimodal turn relative to traditional accounts of temporal aspects of animal communication which have focused on a single modality at a time. However, the fields that are currently contributing to the study of multimodal communication are highly varied, and still largely disconnected given their sole focus on a particular level of description or their particular concern with human or non-human animals. Here, we provide an integrative overview of converging findings that show how multimodal processes occurring at neural, bodily, as well as social interactional levels each contribute uniquely to the complex rhythms that characterize communication in human and non-human animals. Though we address findings for each of these levels independently, we conclude that the most important challenge in this field is to identify how processes at these different levels connect.
  • Pouw, W., De Jonge-Hoekstra, L., Harrison, S. J., Paxton, A., & Dixon, J. A. (2021). Gesture-speech physics in fluent speech and rhythmic upper limb movements. Annals of the New York Academy of Sciences, 1491(1), 89-105. doi:10.1111/nyas.14532.

    Abstract

    Communicative hand gestures are often coordinated with prosodic aspects of speech, and salient moments of gestural movement (e.g., quick changes in speed) often co-occur with salient moments in speech (e.g., near peaks in fundamental frequency and intensity). A common understanding is that such gesture and speech coordination is culturally and cognitively acquired, rather than having a biological basis. Recently, however, the biomechanical physical coupling of arm movements to speech movements has been identified as a potentially important factor in understanding the emergence of gesture-speech coordination. Specifically, in the case of steady-state vocalization and mono-syllable utterances, forces produced during gesturing are transferred onto the tensioned body, leading to changes in respiratory-related activity and thereby affecting vocalization F0 and intensity. In the current experiment (N = 37), we extend this previous line of work to show that gesture-speech physics impacts fluent speech, too. Compared with non-movement, participants who are producing fluent self-formulated speech, while rhythmically moving their limbs, demonstrate heightened F0 and amplitude envelope, and such effects are more pronounced for higher-impulse arm versus lower-impulse wrist movement. We replicate that acoustic peaks arise especially during moments of peak-impulse (i.e., the beat) of the movement, namely around deceleration phases of the movement. Finally, higher deceleration rates of higher-mass arm movements were related to higher peaks in acoustics. These results confirm a role for physical-impulses of gesture affecting the speech system. We discuss the implications of
    gesture-speech physics for understanding of the emergence of communicative gesture, both ontogenetically and phylogenetically.

    Additional information

    data and analyses
  • Schubotz, L., Holler, J., Drijvers, L., & Ozyurek, A. (2021). Aging and working memory modulate the ability to benefit from visible speech and iconic gestures during speech-in-noise comprehension. Psychological Research, 85, 1997-2011. doi:10.1007/s00426-020-01363-8.

    Abstract

    When comprehending speech-in-noise (SiN), younger and older adults benefit from seeing the speaker’s mouth, i.e. visible speech. Younger adults additionally benefit from manual iconic co-speech gestures. Here, we investigate to what extent younger and older adults benefit from perceiving both visual articulators while comprehending SiN, and whether this is modulated by working memory and inhibitory control. Twenty-eight younger and 28 older adults performed a word recognition task in three visual contexts: mouth blurred (speech-only), visible speech, or visible speech + iconic gesture. The speech signal was either clear or embedded in multitalker babble. Additionally, there were two visual-only conditions (visible speech, visible speech + gesture). Accuracy levels for both age groups were higher when both visual articulators were present compared to either one or none. However, older adults received a significantly smaller benefit than younger adults, although they performed equally well in speech-only and visual-only word recognition. Individual differences in verbal working memory and inhibitory control partly accounted for age-related performance differences. To conclude, perceiving iconic gestures in addition to visible speech improves younger and older adults’ comprehension of SiN. Yet, the ability to benefit from this additional visual information is modulated by age and verbal working memory. Future research will have to show whether these findings extend beyond the single word level.

    Additional information

    supplementary material
  • Slonimska, A., Ozyurek, A., & Capirci, O. (2021). Using depiction for efficient communication in LIS (Italian Sign Language). Language and Cognition, 13(3), 367 -396. doi:10.1017/langcog.2021.7.

    Abstract

    Meanings communicated with depictions constitute an integral part of how speakers and signers actually use language (Clark, 2016). Recent studies have argued that, in sign languages, depicting strategy like constructed action (CA), in which a signer enacts the referent, is used for referential purposes in narratives. Here, we tested the referential function of CA in a more controlled experimental setting and outside narrative context. Given the iconic properties of CA we hypothesized that this strategy could be used for efficient information transmission. Thus, we asked if use of CA increased with the increase in the information required to be communicated. Twenty-three deaf signers of LIS described unconnected images, which varied in the amount of information represented, to another player in a director–matcher game. Results revealed that participants used CA to communicate core information about the images and also increased the use of CA as images became informatively denser. The findings show that iconic features of CA can be used for referential function in addition to its depictive function outside narrative context and to achieve communicative efficiency.
  • Trujillo, J. P., Ozyurek, A., Holler, J., & Drijvers, L. (2021). Speakers exhibit a multimodal Lombard effect in noise. Scientific Reports, 11: 16721. doi:10.1038/s41598-021-95791-0.

    Abstract

    In everyday conversation, we are often challenged with communicating in non-ideal settings, such as in noise. Increased speech intensity and larger mouth movements are used to overcome noise in constrained settings (the Lombard effect). How we adapt to noise in face-to-face interaction, the natural environment of human language use, where manual gestures are ubiquitous, is currently unknown. We asked Dutch adults to wear headphones with varying levels of multi-talker babble while attempting to communicate action verbs to one another. Using quantitative motion capture and acoustic analyses, we found that (1) noise is associated with increased speech intensity and enhanced gesture kinematics and mouth movements, and (2) acoustic modulation only occurs when gestures are not present, while kinematic modulation occurs regardless of co-occurring speech. Thus, in face-to-face encounters the Lombard effect is not constrained to speech but is a multimodal phenomenon where the visual channel carries most of the communicative burden.

    Additional information

    supplementary material
  • Trujillo, J. P., Ozyurek, A., Kan, C. C., Sheftel-Simanova, I., & Bekkering, H. (2021). Differences in the production and perception of communicative kinematics in autism. Autism Research, 14(12), 2640-2653. doi:10.1002/aur.2611.

    Abstract

    In human communication, social intentions and meaning are often revealed in the way we move. In this study, we investigate the flexibility of human communication in terms of kinematic modulation in a clinical population, namely, autistic individuals. The aim of this study was twofold: to assess (a) whether communicatively relevant kinematic features of gestures differ between autistic and neurotypical individuals, and (b) if autistic individuals use communicative kinematic modulation to support gesture recognition. We tested autistic and neurotypical individuals on a silent gesture production task and a gesture comprehension task. We measured movement during the gesture production task using a Kinect motion tracking device in order to determine if autistic individuals differed from neurotypical individuals in their gesture kinematics. For the gesture comprehension task, we assessed whether autistic individuals used communicatively relevant kinematic cues to support recognition. This was done by using stick-light figures as stimuli and testing for a correlation between the kinematics of these videos and recognition performance. We found that (a) silent gestures produced by autistic and neurotypical individuals differ in communicatively relevant kinematic features, such as the number of meaningful holds between movements, and (b) while autistic individuals are overall unimpaired at recognizing gestures, they processed repetition and complexity, measured as the amount of submovements perceived, differently than neurotypicals do. These findings highlight how subtle aspects of neurotypical behavior can be experienced differently by autistic individuals. They further demonstrate the relationship between movement kinematics and social interaction in high-functioning autistic individuals.

    Additional information

    supporting information
  • Azar, Z., Backus, A., & Ozyurek, A. (2020). Language contact does not drive gesture transfer: Heritage speakers maintain language specific gesture patterns in each language. Bilingualism: Language and Cognition, 23(2), 414-428. doi:10.1017/S136672891900018X.

    Abstract

    This paper investigates whether there are changes in gesture rate when speakers of two languages with different gesture rates (Turkish-high gesture; Dutch-low gesture) come into daily contact. We analyzed gestures produced by second-generation heritage speakers of Turkish in the Netherlands in each language, comparing them to monolingual baselines. We did not find differences between bilingual and monolingual speakers, possibly because bilinguals were proficient in both languages and used them frequently – in line with a usage-based approach to language. However, bilinguals produced more deictic gestures than monolinguals in both Turkish and Dutch, which we interpret as a bilingual strategy. Deictic gestures may help organize discourse by placing entities in gesture space and help reduce the cognitive load associated with being bilingual, e.g., inhibition cost. Therefore, gesture rate does not necessarily change in contact situations but might be modulated by frequency of language use, proficiency, and cognitive factors related to being bilingual.
  • Azar, Z., Ozyurek, A., & Backus, A. (2020). Turkish-Dutch bilinguals maintain language-specific reference tracking strategies in elicited narratives. International Journal of Bilingualism, 24(2), 376-409. doi:10.1177/1367006919838375.

    Abstract

    Aim:

    This paper examines whether second-generation Turkish heritage speakers in the Netherlands follow language-specific patterns of reference tracking in Turkish and Dutch, focusing on discourse status and pragmatic contexts as factors that may modulate the choice of referring expressions (REs), that is, the noun phrase (NP), overt pronoun and null pronoun.
    Methodology:

    Two short silent videos were used to elicit narratives from 20 heritage speakers of Turkish, both in Turkish and in Dutch. Monolingual baseline data were collected from 20 monolingually raised speakers of Turkish in Turkey and 20 monolingually raised speakers of Dutch in the Netherlands. We also collected language background data from bilinguals with an extensive survey.
    Data and analysis:

    Using generalised logistic mixed-effect regression, we analysed the influence of discourse status and pragmatic context on the choice of subject REs in Turkish and Dutch, comparing bilingual data to the monolingual baseline in each language.
    Findings:

    Heritage speakers used overt versus null pronouns in Turkish and stressed versus reduced pronouns in Dutch in pragmatically appropriate contexts. There was, however, a slight increase in the proportions of overt pronouns as opposed to NPs in Turkish and as opposed to null pronouns in Dutch. We suggest an explanation based on the degree of entrenchment of differential RE types in relation to discourse status as the possible source of the increase.
    Originality:

    This paper provides data from an understudied language pair in the domain of reference tracking in language contact situations. Unlike several studies of pronouns in language contact, we do not find differences across monolingual and bilingual speakers with regard to pragmatic constraints on overt pronouns in the minority pro-drop language.
    Significance:

    Our findings highlight the importance of taking language proficiency and use into account while studying bilingualism and combining formal approaches to language use with usage-based approaches for a more complete understanding of bilingual language production.
  • Burghoorn, F., Dingemanse, M., Van Lier, R., & Van Leeuwen, T. M. (2020). The relation between the degree of synaesthesia, autistic traits, and local/global visual perception. Journal of Autism and Developmental Disorders, 50, 12-29. doi:10.1007/s10803-019-04222-7.

    Abstract

    In individuals with synaesthesia specific sensory stimulation leads to unusual concurrent perceptions in the same or a different modality. Recent studies have demonstrated a high co-occurrence between synaesthesia and autism spectrum disorder (ASD), a condition also characterized by altered perception. A potentially shared characteristic of synaesthesia and ASD is a bias towards local (detail-focussed) perception. We investigated whether a bias towards local perception is indeed shared between synaesthesia and ASD. In a neurotypical population, we studied the relation between the degree of autistic traits (measured by the AQ) and the degree of grapheme-colour synaesthesia (measured by a consistency task), as well as whether both are related to a local bias in tasks assessing local/global visual perception. A positive correlation between total AQ scores and the degree of synaesthesia was found. Our study extends previous studies that found a high ASD-synaesthesia co-occurrence in clinical populations. Consistent with the hypothesized local perceptual bias in ASD, scores on the AQ-attention to detail subscale were related to increased performance on an Embedded Figures Task (EFT), and we found evidence for a relation to reduced susceptibility to visual illusions. We found no relation between autistic traits and local visual perception in a motion coherence task (MCT). Also, no relation between synaesthesia and local visual perception was found, although a reduced susceptibility to visual illusions resembled the results obtained for AQ-atttention to detail subscale. A suggested explanation for the absence of a relationship between the degree of synaesthesia and a local bias is that a possible local bias might be more pronounced in supra-threshold synaesthetes (compared to neurotypicals).
  • Dingemanse, M., Perlman, M., & Perniss, P. (2020). Construals of iconicity: Experimental approaches to form-meaning resemblances in language. Language and Cognition, 12(1), 1-14. doi:10.1017/langcog.2019.48.

    Abstract

    While speculations on form–meaning resemblances in language go back millennia, the experimental study of iconicity is only about a century old. Here we take stock of experimental work on iconicity and present a double special issue with a diverse set of new contributions. We contextualise the work by introducing a typology of approaches to iconicity in language. Some approaches construe iconicity as a discrete property that is either present or absent; others treat it as involving semiotic relationships that come in kinds; and yet others see it as a gradient substance that comes in degrees. We show the benefits and limitations that come with each of these construals and stress the importance of developing accounts that can fluently switch between them. With operationalisations of iconicity that are well defined yet flexible enough to deal with differences in tasks, modalities, and levels of analysis, experimental research on iconicity is well equipped to contribute to a comprehensive science of language.
  • Dingemanse, M. (2020). Resource-rationality beyond individual minds: The case of interactive language use. Behavioral and Brain Sciences, 43, 23-24. doi:10.1017/S0140525X19001638.

    Abstract

    Resource-rational approaches offer much promise for understanding human cognition, especially if they can reach beyond the confines of individual minds. Language allows people to transcend individual resource limitations by augmenting computation and enabling distributed cognition. Interactive language use, an environment where social rational agents routinely deal with resource constraints together, offers a natural laboratory to test resource-rationality in the wild.
  • Dingemanse, M. (2020). Der Raum zwischen unseren Köpfen. Technology Review, 2020(13), 10-15.

    Abstract

    Aktuelle Vorstellungen von Gehirn-zu-Gehirn-Schnittstellen versprechen, die Sprache zu umgehen. Aber wenn wir sie verfeinern, um ihr kollaboratives Potenzial voll auszuschöpfen, sehen wir Sprache — oder zumindest ein sprachähnliches Infrastruktur für Kommunika­tion und Koordination — durch die Hintertür wieder hereinkommen. Es wäre nicht das erste Mal, dass sich die Sprache neu erfindet.

    Current conceptions of brain-to-brain interfaces attempt to bypass language. But when we refine them to more fully realise their collaborative potential we find language —or at least a language-like infrastructure for communication and coordination— slipping through the back door. It wouldn't be the first time that language reinvented itself.
  • Dingemanse, M., & Thompson, B. (2020). Playful iconicity: Structural markedness underlies the relation between funniness and iconicity. Language and Cognition, 12(1), 203-224. doi:10.1017/langcog.2019.49.

    Abstract

    Words like ‘waddle’, ‘flop’ and ‘zigzag’ combine playful connotations with iconic form-meaning resemblances. Here we propose that structural markedness may be a common factor underlying perceptions of playfulness and iconicity. Using collected and estimated lexical ratings covering a total of over 70,000 English words, we assess the robustness of this assocation. We identify cues of phonotactic complexity that covary with funniness and iconicity ratings and that, we propose, serve as metacommunicative signals to draw attention to words as playful and performative. To assess the generalisability of the findings we develop a method to estimate lexical ratings from distributional semantics and apply it to a dataset 20 times the size of the original set of human ratings. The method can be used more generally to extend coverage of lexical ratings. We find that it reliably reproduces correlations between funniness and iconicity as well as cues of structural markedness, though it also amplifies biases present in the human ratings. Our study shows that the playful and the poetic are part of the very texture of the lexicon.
  • Dowell, C., Hajnal, A., Pouw, W., & Wagman, J. B. (2020). Visual and haptic perception of affordances of feelies. Perception, 49(9), 905-925. doi:10.1177/0301006620946532.

    Abstract

    Most objects have well-defined affordances. Investigating perception of affordances of objects that were not created for a specific purpose would provide insight into how affordances are perceived. In addition, comparison of perception of affordances for such objects across different exploratory modalities (visual vs. haptic) would offer a strong test of the lawfulness of information about affordances (i.e., the invariance of such information over transformation). Along these lines, “feelies”— objects created by Gibson with no obvious function and unlike any common object—could shed light on the processes underlying affordance perception. This study showed that when observers reported potential uses for feelies, modality significantly influenced what kind of affordances were perceived. Specifically, visual exploration resulted in more noun labels (e.g., “toy”) than haptic exploration which resulted in more verb labels (i.e., “throw”). These results suggested that overlapping, but distinct classes of action possibilities are perceivable using vision and haptics. Semantic network analyses revealed that visual exploration resulted in object-oriented responses focused on object identification, whereas haptic exploration resulted in action-oriented responses. Cluster analyses confirmed these results. Affordance labels produced in the visual condition were more consistent, used fewer descriptors, were less diverse, but more novel than in the haptic condition.
  • Drijvers, L., & Ozyurek, A. (2020). Non-native listeners benefit less from gestures and visible speech than native listeners during degraded speech comprehension. Language and Speech, 63(2), 209-220. doi:10.1177/0023830919831311.

    Abstract

    Native listeners benefit from both visible speech and iconic gestures to enhance degraded speech comprehension (Drijvers & Ozyürek, 2017). We tested how highly proficient non-native listeners benefit from these visual articulators compared to native listeners. We presented videos of an actress uttering a verb in clear, moderately, or severely degraded speech, while her lips were blurred, visible, or visible and accompanied by a gesture. Our results revealed that unlike native listeners, non-native listeners were less likely to benefit from the combined enhancement of visible speech and gestures, especially since the benefit from visible speech was minimal when the signal quality was not sufficient.
  • Eielts, C., Pouw, W., Ouwehand, K., Van Gog, T., Zwaan, R. A., & Paas, F. (2020). Co-thought gesturing supports more complex problem solving in subjects with lower visual working-memory capacity. Psychological Research, 84, 502-513. doi:10.1007/s00426-018-1065-9.

    Abstract

    During silent problem solving, hand gestures arise that have no communicative intent. The role of such co-thought gestures in
    cognition has been understudied in cognitive research as compared to co-speech gestures. We investigated whether gesticulation
    during silent problem solving supported subsequent performance in a Tower of Hanoi problem-solving task, in relation
    to visual working-memory capacity and task complexity. Seventy-six participants were assigned to either an instructed gesture
    condition or a condition that allowed them to gesture, but without explicit instructions to do so. This resulted in three
    gesture groups: (1) non-gesturing; (2) spontaneous gesturing; (3) instructed gesturing. In line with the embedded/extended
    cognition perspective on gesture, gesturing benefited complex problem-solving performance for participants with a lower
    visual working-memory capacity, but not for participants with a lower spatial working-memory capacity.
  • Hostetter, A. B., Pouw, W., & Wakefield, E. M. (2020). Learning from gesture and action: An investigation of memory for where objects went and how they got there. Cognitive Science, 44(9): e12889. doi:10.1111/cogs.12889.

    Abstract

    Speakers often use gesture to demonstrate how to perform actions—for example, they might show how to open the top of a jar by making a twisting motion above the jar. Yet it is unclear whether listeners learn as much from seeing such gestures as they learn from seeing actions that physically change the position of objects (i.e., actually opening the jar). Here, we examined participants' implicit and explicit understanding about a series of movements that demonstrated how to move a set of objects. The movements were either shown with actions that physically relocated each object or with gestures that represented the relocation without touching the objects. Further, the end location that was indicated for each object covaried with whether the object was grasped with one or two hands. We found that memory for the end location of each object was better after seeing the physical relocation of the objects, that is, after seeing action, than after seeing gesture, regardless of whether speech was absent (Experiment 1) or present (Experiment 2). However, gesture and action built similar implicit understanding of how a particular handgrasp corresponded with a particular end location. Although gestures miss the benefit of showing the end state of objects that have been acted upon, the data show that gestures are as good as action in building knowledge of how to perform an action.

    Additional information

    additional analyses Open Data OSF
  • Kendrick, K. H., Brown, P., Dingemanse, M., Floyd, S., Gipper, S., Hayano, K., Hoey, E., Hoymann, G., Manrique, E., Rossi, G., & Levinson, S. C. (2020). Sequence organization: A universal infrastructure for social action. Journal of Pragmatics, 168, 119-138. doi:10.1016/j.pragma.2020.06.009.

    Abstract

    This article makes the case for the universality of the sequence organization observable in informal human conversational interaction. Using the descriptive schema developed by Schegloff (2007), we examine the major patterns of action-sequencing in a dozen nearly all unrelated languages. What we find is that these patterns are instantiated in very similar ways for the most part right down to the types of different action sequences. There are also some notably different cultural exploitations of the patterns, but the patterns themselves look strongly universal. Recent work in gestural communication in the great apes suggests that sequence organization may have been a crucial route into the development of language. Taken together with the fundamental role of this organization in language acquisition, sequential behavior of this kind seems to have both phylogenetic and ontogenetic priority, which probably puts substantial functional pressure on language form.

    Additional information

    Supplementary data
  • Macuch Silva, V., Holler, J., Ozyurek, A., & Roberts, S. G. (2020). Multimodality and the origin of a novel communication system in face-to-face interaction. Royal Society Open Science, 7: 182056. doi:10.1098/rsos.182056.

    Abstract

    Face-to-face communication is multimodal at its core: it consists of a combination of vocal and visual signalling. However, current evidence suggests that, in the absence of an established communication system, visual signalling, especially in the form of visible gesture, is a more powerful form of communication than vocalisation, and therefore likely to have played a primary role in the emergence of human language. This argument is based on experimental evidence of how vocal and visual modalities (i.e., gesture) are employed to communicate about familiar concepts when participants cannot use their existing languages. To investigate this further, we introduce an experiment where pairs of participants performed a referential communication task in which they described unfamiliar stimuli in order to reduce reliance on conventional signals. Visual and auditory stimuli were described in three conditions: using visible gestures only, using non-linguistic vocalisations only and given the option to use both (multimodal communication). The results suggest that even in the absence of conventional signals, gesture is a more powerful mode of communication compared to vocalisation, but that there are also advantages to multimodality compared to using gesture alone. Participants with an option to produce multimodal signals had comparable accuracy to those using only gesture, but gained an efficiency advantage. The analysis of the interactions between participants showed that interactants developed novel communication systems for unfamiliar stimuli by deploying different modalities flexibly to suit their needs and by taking advantage of multimodality when required.
  • Manhardt, F., Ozyurek, A., Sumer, B., Mulder, K., Karadöller, D. Z., & Brouwer, S. (2020). Iconicity in spatial language guides visual attention: A comparison between signers’ and speakers’ eye gaze during message preparation. Journal of Experimental Psychology: Learning, Memory, and Cognition, 46(9), 1735-1753. doi:10.1037/xlm0000843.

    Abstract

    To talk about space, spoken languages rely on arbitrary and categorical forms (e.g., left, right). In sign languages, however, the visual–spatial modality allows for iconic encodings (motivated form-meaning mappings) of space in which form and location of the hands bear resemblance to the objects and spatial relations depicted. We assessed whether the iconic encodings in sign languages guide visual attention to spatial relations differently than spatial encodings in spoken languages during message preparation at the sentence level. Using a visual world production eye-tracking paradigm, we compared 20 deaf native signers of Sign-Language-of-the-Netherlands and 20 Dutch speakers’ visual attention to describe left versus right configurations of objects (e.g., “pen is to the left/right of cup”). Participants viewed 4-picture displays in which each picture contained the same 2 objects but in different spatial relations (lateral [left/right], sagittal [front/behind], topological [in/on]) to each other. They described the target picture (left/right) highlighted by an arrow. During message preparation, signers, but not speakers, experienced increasing eye-gaze competition from other spatial configurations. This effect was absent during picture viewing prior to message preparation of relational encoding. Moreover, signers’ visual attention to lateral and/or sagittal relations was predicted by the type of iconicity (i.e., object and space resemblance vs. space resemblance only) in their spatial descriptions. Findings are discussed in relation to how “thinking for speaking” differs from “thinking for signing” and how iconicity can mediate the link between language and human experience and guides signers’ but not speakers’ attention to visual aspects of the world.

    Additional information

    Supplementary materials
  • Ortega, G., Ozyurek, A., & Peeters, D. (2020). Iconic gestures serve as manual cognates in hearing second language learners of a sign language: An ERP study. Journal of Experimental Psychology: Learning, Memory, and Cognition, 46(3), 403-415. doi:10.1037/xlm0000729.

    Abstract

    When learning a second spoken language, cognates, words overlapping in form and meaning with one’s native language, help breaking into the language one wishes to acquire. But what happens when the to-be-acquired second language is a sign language? We tested whether hearing nonsigners rely on their gestural repertoire at first exposure to a sign language. Participants saw iconic signs with high and low overlap with the form of iconic gestures while electrophysiological brain activity was recorded. Upon first exposure, signs with low overlap with gestures elicited enhanced positive amplitude in the P3a component compared to signs with high overlap. This effect disappeared after a training session. We conclude that nonsigners generate expectations about the form of iconic signs never seen before based on their implicit knowledge of gestures, even without having to produce them. Learners thus draw from any available semiotic resources when acquiring a second language, and not only from their linguistic experience
  • Ortega, G., & Ozyurek, A. (2020). Systematic mappings between semantic categories and types of iconic representations in the manual modality: A normed database of silent gesture. Behavior Research Methods, 52, 51-67. doi:10.3758/s13428-019-01204-6.

    Abstract

    An unprecedented number of empirical studies have shown that iconic gestures—those that mimic the sensorimotor attributes of a referent—contribute significantly to language acquisition, perception, and processing. However, there has been a lack of normed studies describing generalizable principles in gesture production and in comprehension of the mappings of different types of iconic strategies (i.e., modes of representation; Müller, 2013). In Study 1 we elicited silent gestures in order to explore the implementation of different types of iconic representation (i.e., acting, representing, drawing, and personification) to express concepts across five semantic domains. In Study 2 we investigated the degree of meaning transparency (i.e., iconicity ratings) of the gestures elicited in Study 1. We found systematicity in the gestural forms of 109 concepts across all participants, with different types of iconicity aligning with specific semantic domains: Acting was favored for actions and manipulable objects, drawing for nonmanipulable objects, and personification for animate entities. Interpretation of gesture–meaning transparency was modulated by the interaction between mode of representation and semantic domain, with some couplings being more transparent than others: Acting yielded higher ratings for actions, representing for object-related concepts, personification for animate entities, and drawing for nonmanipulable entities. This study provides mapping principles that may extend to all forms of manual communication (gesture and sign). This database includes a list of the most systematic silent gestures in the group of participants, a notation of the form of each gesture based on four features (hand configuration, orientation, placement, and movement), each gesture’s mode of representation, iconicity ratings, and professionally filmed videos that can be used for experimental and clinical endeavors.
  • Ortega, G., & Ozyurek, A. (2020). Types of iconicity and combinatorial strategies distinguish semantic categories in silent gesture. Language and Cognition, 12(1), 84-113. doi:10.1017/langcog.2019.28.

    Abstract

    In this study we explore whether different types of iconic gestures
    (i.e., acting, drawing, representing) and their combinations are used
    systematically to distinguish between different semantic categories in
    production and comprehension. In Study 1, we elicited silent gestures
    from Mexican and Dutch participants to represent concepts from three
    semantic categories: actions, manipulable objects, and non-manipulable
    objects. Both groups favoured the acting strategy to represent actions and
    manipulable objects; while non-manipulable objects were represented
    through the drawing strategy. Actions elicited primarily single gestures
    whereas objects elicited combinations of different types of iconic gestures
    as well as pointing. In Study 2, a different group of participants were
    shown gestures from Study 1 and were asked to guess their meaning.
    Single-gesture depictions for actions were more accurately guessed than
    for objects. Objects represented through two-gesture combinations (e.g.,
    acting + drawing) were more accurately guessed than objects represented
    with a single gesture. We suggest iconicity is exploited to make direct
    links with a referent, but when it lends itself to ambiguity, individuals
    resort to combinatorial structures to clarify the intended referent.
    Iconicity and the need to communicate a clear signal shape the structure
    of silent gestures and this in turn supports comprehension.
  • Pouw, W., Paxton, A., Harrison, S. J., & Dixon, J. A. (2020). Reply to Ravignani and Kotz: Physical impulses from upper-limb movements impact the respiratory–vocal system. Proceedings of the National Academy of Sciences of the United States of America, 117(38), 23225-23226. doi:10.1073/pnas.2015452117.
  • Pouw, W., Paxton, A., Harrison, S. J., & Dixon, J. A. (2020). Acoustic information about upper limb movement in voicing. Proceedings of the National Academy of Sciences of the United States of America, 117(21), 11364-11367. doi:10.1073/pnas.2004163117.

    Abstract

    We show that the human voice has complex acoustic qualities that are directly coupled to peripheral musculoskeletal tensioning of the body, such as subtle wrist movements. In this study, human vocalizers produced a steady-state vocalization while rhythmically moving the wrist or the arm at different tempos. Although listeners could only hear but not see the vocalizer, they were able to completely synchronize their own rhythmic wrist or arm movement with the movement of the vocalizer which they perceived in the voice acoustics. This study corroborates
    recent evidence suggesting that the human voice is constrained by bodily tensioning affecting the respiratory-vocal system. The current results show that the human voice contains a bodily imprint that is directly informative for the interpersonal perception of another’s dynamic physical states.
  • Pouw, W., Harrison, S. J., Esteve-Gibert, N., & Dixon, J. A. (2020). Energy flows in gesture-speech physics: The respiratory-vocal system and its coupling with hand gestures. The Journal of the Acoustical Society of America, 148(3): 1231. doi:10.1121/10.0001730.

    Abstract

    Expressive moments in communicative hand gestures often align with emphatic stress in speech. It has recently been found that acoustic markers of emphatic stress arise naturally during steady-state phonation when upper-limb movements impart physical impulses on the body, most likely affecting acoustics via respiratory activity. In this confirmatory study, participants (N = 29) repeatedly uttered consonant-vowel (/pa/) mono-syllables while moving in particular phase relations with speech, or not moving the upper limbs. This study shows that respiration-related activity is affected by (especially high-impulse) gesturing when vocalizations occur near peaks in physical impulse. This study further shows that gesture-induced moments of bodily impulses increase the amplitude envelope of speech, while not similarly affecting the Fundamental Frequency (F0). Finally, tight relations between respiration-related activity and vocalization were observed, even in the absence of movement, but even more so when upper-limb movement is present. The current findings expand a developing line of research showing that speech is modulated by functional biomechanical linkages between hand gestures and the respiratory system. This identification of gesture-speech biomechanics promises to provide an alternative phylogenetic, ontogenetic, and mechanistic explanatory route of why communicative upper limb movements co-occur with speech in humans.
    ACKNOWLEDGMENTS

    Additional information

    Link to Preprint on OSF
  • Pouw, W., & Dixon, J. A. (2020). Gesture networks: Introducing dynamic time warping and network analysis for the kinematic study of gesture ensembles. Discourse Processes, 57(4), 301-319. doi:10.1080/0163853X.2019.1678967.

    Abstract

    We introduce applications of established methods in time-series and network
    analysis that we jointly apply here for the kinematic study of gesture
    ensembles. We define a gesture ensemble as the set of gestures produced
    during discourse by a single person or a group of persons. Here we are
    interested in how gestures kinematically relate to one another. We use
    a bivariate time-series analysis called dynamic time warping to assess how
    similar each gesture is to other gestures in the ensemble in terms of their
    velocity profiles (as well as studying multivariate cases with gesture velocity
    and speech amplitude envelope profiles). By relating each gesture event to
    all other gesture events produced in the ensemble, we obtain a weighted
    matrix that essentially represents a network of similarity relationships. We
    can therefore apply network analysis that can gauge, for example, how
    diverse or coherent certain gestures are with respect to the gesture ensemble.
    We believe these analyses promise to be of great value for gesture
    studies, as we can come to understand how low-level gesture features
    (kinematics of gesture) relate to the higher-order organizational structures
    present at the level of discourse.

    Additional information

    Open Data OSF

Share this page