Displaying 1 - 57 of 57
-
Akamine, S., Ghaleb, E., Rasenberg, M., Fernandez, R., Meyer, A. S., & Özyürek, A. (2024). Speakers align both their gestures and words not only to establish but also to maintain reference to create shared labels for novel objects in interaction. In L. K. Samuelson, S. L. Frank, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 2435-2442).Abstract
When we communicate with others, we often repeat aspects of each other's communicative behavior such as sentence structures and words. Such behavioral alignment has been mostly studied for speech or text. Yet, language use is mostly multimodal, flexibly using speech and gestures to convey messages. Here, we explore the use of alignment in speech (words) and co-speech gestures (iconic gestures) in a referential communication task aimed at finding labels for novel objects in interaction. In particular, we investigate how people flexibly use lexical and gestural alignment to create shared labels for novel objects and whether alignment in speech and gesture are related over time. The present study shows that interlocutors establish shared labels multimodally, and alignment in words and iconic gestures are used throughout the interaction. We also show that the amount of lexical alignment positively associates with the amount of gestural alignment over time, suggesting a close relationship between alignment in the vocal and manual modalities.Additional information
link to eScholarship -
Ben-Ami, S., Shukla, Vishakha, V., Gupta, P., Shah, P., Ralekar, C., Ganesh, S., Gilad-Gutnick, S., Rubio-Fernández, P., & Sinha, P. (2024). Form perception as a bridge to real-world functional proficiency. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 6094-6102).Abstract
Recognizing the limitations of standard vision assessments in capturing the real-world capabilities of individuals with low vision, we investigated the potential of the Seguin Form Board Test (SFBT), a widely-used intelligence assessment employing a visuo-haptic shape-fitting task, as an estimator of vision's practical utility. We present findings from 23 children from India, who underwent treatment for congenital bilateral dense cataracts, and 21 control participants. To assess the development of functional visual ability, we conducted the SFBT and the standard measure of visual acuity, before and longitudinally after treatment. We observed a dissociation in the development of shape-fitting and visual acuity. Improvements of patients' shape-fitting preceded enhancements in their visual acuity after surgery and emerged even with acuity worse than that of control participants. Our findings highlight the importance of incorporating multi-modal and cognitive aspects into evaluations of visual proficiency in low-vision conditions, to better reflect vision's impact on daily activities.Additional information
link to eScholarship -
Cho, S.-J., Brown-Schmidt, S., Clough, S., & Duff, M. C. (2024). Comparing Functional Trend and Learning among Groups in Intensive Binary Longitudinal Eye-Tracking Data using By-Variable Smooth Functions of GAMM. Psychometrika. Advance online publication. doi:10.1007/s11336-024-09986-1.
Abstract
This paper presents a model specification for group comparisons regarding a functional trend over time within a trial and learning across a series of trials in intensive binary longitudinal eye-tracking data. The functional trend and learning effects are modeled using by-variable smooth functions. This model specification is formulated as a generalized additive mixed model, which allowed for the use of the freely available mgcv package (Wood in Package ‘mgcv.’ https://cran.r-project.org/web/packages/mgcv/mgcv.pdf, 2023) in R. The model specification was applied to intensive binary longitudinal eye-tracking data, where the questions of interest concern differences between individuals with and without brain injury in their real-time language comprehension and how this affects their learning over time. The results of the simulation study show that the model parameters are recovered well and the by-variable smooth functions are adequately predicted in the same condition as those found in the application.Additional information
The data and the R code used in the illustration can be found in the Open Scien… -
Clough, S., Brown-Schmidt, S., Cho, S.-J., & Duff, M. C. (2024). Reduced on-line speech gesture integration during multimodal language processing in adults with moderate-severe traumatic brain injury: Evidence from eye-tracking. Cortex, 181, 26-46. doi:10.1016/j.cortex.2024.08.008.
Abstract
Background
Language is multimodal and situated in rich visual contexts. Language is also incremental, unfolding moment-to-moment in real time, yet few studies have examined how spoken language interacts with gesture and visual context during multimodal language processing. Gesture is a rich communication cue that is integrally related to speech and often depicts concrete referents from the visual world. Using eye-tracking in an adapted visual world paradigm, we examined how participants with and without moderate-severe traumatic brain injury (TBI) use gesture to resolve temporary referential ambiguity.
Methods
Participants viewed a screen with four objects and one video. The speaker in the video produced sentences (e.g., “The girl will eat the very good sandwich”), paired with either a meaningful gesture (e.g., sandwich-holding gesture) or a meaningless grooming movement (e.g., arm scratch) at the verb “will eat.” We measured participants’ gaze to the target object (e.g., sandwich), a semantic competitor (e.g., apple), and two unrelated distractors (e.g., piano, guitar) during the critical window between movement onset in the gesture modality and onset of the spoken referent in speech.
Results
Both participants with and without TBI were more likely to fixate the target when the speaker produced a gesture compared to a grooming movement; however, relative to non-injured participants, the effect was significantly attenuated in the TBI group.
Discussion
We demonstrated evidence of reduced speech-gesture integration in participants with TBI relative to non-injured peers. This study advances our understanding of the communicative abilities of adults with TBI and could lead to a more mechanistic account of the communication difficulties adults with TBI experience in rich communication contexts that require the processing and integration of multiple co-occurring cues. This work has the potential to increase the ecological validity of language assessment and provide insights into the cognitive and neural mechanisms that support multimodal language processing.Additional information
supplementary data -
Dikshit, A. P., Das, D., Samal, R. R., Parashar, K., Mishra, C., & Parashar, S. (2024). Optimization of (Ba1-xCax)(Ti0.9Sn0.1)O3 ceramics in X-band using Machine Learning. Journal of Alloys and Compounds, 982: 173797. doi:10.1016/j.jallcom.2024.173797.
Abstract
Developing efficient electromagnetic interference shielding materials has become significantly important in present times. This paper reports a series of (Ba1-xCax)(Ti0.9Sn0.1)O3 (BCTS) ((x =0, 0.01, 0.05, & 0.1)ceramics synthesized by conventional method which were studied for electromagnetic interference shielding (EMI) applications in X-band (8-12.4 GHz). EMI shielding properties and all S parameters (S11 & S12) of BCTS ceramic pellets were measured in the frequency range (8-12.4 GHz) using a Vector Network Analyser (VNA). The BCTS ceramic pellets for x = 0.05 showed maximum total effective shielding of 46 dB indicating good shielding behaviour for high-frequency applications. However, the development of lead-free ceramics with different concentrations usually requires iterative experiments resulting in, longer development cycles and higher costs. To address this, we used a machine learning (ML) strategy to predict the EMI shielding for different concentrations and experimentally verify the concentration predicted to give the best EMI shielding. The ML model predicted BCTS ceramics with concentration (x = 0.06, 0.07, 0.08, and 0.09) to have higher shielding values. On experimental verification, a shielding value of 58 dB was obtained for x = 0.08, which was significantly higher than what was obtained experimentally before applying the ML approach. Our results show the potential of using ML in accelerating the process of optimal material development, reducing the need for repeated experimental measures significantly. -
Dona, L., & Schouwstra, M. (2024). Balancing regularization and variation: The roles of priming and motivatedness. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (
Eds. ), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 130-133). Nijmegen: The Evolution of Language Conferences. -
Evans, M. J., Clough, S., Duff, M. C., & Brown‐Schmidt, S. (2024). Temporal organization of narrative recall is present but attenuated in adults with hippocampal amnesia. Hippocampus, 34(8), 438-451. doi:10.1002/hipo.23620.
Abstract
Studies of the impact of brain injury on memory processes often focus on the quantity and episodic richness of those recollections. Here, we argue that the organization of one's recollections offers critical insights into the impact of brain injury on functional memory. It is well-established in studies of word list memory that free recall of unrelated words exhibits a clear temporal organization. This temporal contiguity effect refers to the fact that the order in which word lists are recalled reflects the original presentation order. Little is known, however, about the organization of recall for semantically rich materials, nor how recall organization is impacted by hippocampal damage and memory impairment. The present research is the first study, to our knowledge, of temporal organization in semantically rich narratives in three groups: (1) Adults with bilateral hippocampal damage and severe declarative memory impairment, (2) adults with bilateral ventromedial prefrontal cortex (vmPFC) damage and no memory impairment, and (3) demographically matched non-brain-injured comparison participants. We find that although the narrative recall of adults with bilateral hippocampal damage reflected the temporal order in which those narratives were experienced above chance levels, their temporal contiguity effect was significantly attenuated relative to comparison groups. In contrast, individuals with vmPFC damage did not differ from non-brain-injured comparison participants in temporal contiguity. This pattern of group differences yields insights into the cognitive and neural systems that support the use of temporal organization in recall. These data provide evidence that the retrieval of temporal context in narrative recall is hippocampal-dependent, whereas damage to the vmPFC does not impair the temporal organization of narrative recall. This evidence of limited but demonstrable organization of memory in participants with hippocampal damage and amnesia speaks to the power of narrative structures in supporting meaningfully organized recall despite memory impairment.Additional information
supporting information -
Feller, J. J., Duff, M. C., Clough, S., Jacobson, G. P., Roberts, R. A., & Romero, D. J. (2024). Evidence of peripheral vestibular impairment among adults with chronic moderate–severe traumatic brain injury. American Journal of Audiology, 33, 1118-1134. doi:10.1044/2024_AJA-24-00058.
Abstract
Purpose:
Traumatic brain injury (TBI) is a leading cause of death and disability among adults in the United States. There is evidence to suggest the peripheral vestibular system is vulnerable to damage in individuals with TBI. However, there are limited prospective studies that describe the type and frequency of vestibular impairment in individuals with chronic moderate–severe TBI (> 6 months postinjury).
Method:
Cervical and ocular vestibular evoked myogenic potentials (VEMPs) and video head impulse test (vHIT) were used to assess the function of otolith organ and horizontal semicircular canal (hSCC) pathways in adults with chronic moderate–severe TBI and in noninjured comparison (NC) participants. Self-report questionnaires were administered to participants with TBI to determine prevalence of vestibular symptoms and quality of life associated with those symptoms.
Results:
Chronic moderate–severe TBI was associated with a greater degree of impairment in otolith organ, rather than hSCC, pathways. About 63% of participants with TBI had abnormal VEMP responses, compared to only ~10% with abnormal vHIT responses. The NC group had significantly less abnormal VEMP responses (~7%), while none of the NC participants had abnormal vHIT responses. As many as 80% of participants with TBI reported vestibular symptoms, and up to 36% reported that these symptoms negatively affected their quality of life.
Conclusions:
Adults with TBI reported vestibular symptoms and decreased quality of life related to those symptoms and had objective evidence of peripheral vestibular impairment. Vestibular testing for adults with chronic TBI who report persistent dizziness and imbalance may serve as a guide for treatment and rehabilitation in these individuals. -
Ghaleb, E., Rasenberg, M., Pouw, W., Toni, I., Holler, J., Özyürek, A., & Fernandez, R. (2024). Analysing cross-speaker convergence through the lens of automatically detected shared linguistic constructions. In L. K. Samuelson, S. L. Frank, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 1717-1723).Abstract
Conversation requires a substantial amount of coordination between dialogue participants, from managing turn taking to negotiating mutual understanding. Part of this coordination effort surfaces as the reuse of linguistic behaviour across speakers, a process often referred to as alignment. While the presence of linguistic alignment is well documented in the literature, several questions remain open, including the extent to which patterns of reuse across speakers have an impact on the emergence of labelling conventions for novel referents. In this study, we put forward a methodology for automatically detecting shared lemmatised constructions---expressions with a common lexical core used by both speakers within a dialogue---and apply it to a referential communication corpus where participants aim to identify novel objects for which no established labels exist. Our analyses uncover the usage patterns of shared constructions in interaction and reveal that features such as their frequency and the amount of different constructions used for a referent are associated with the degree of object labelling convergence the participants exhibit after social interaction. More generally, the present study shows that automatically detected shared constructions offer a useful level of analysis to investigate the dynamics of reference negotiation in dialogue.Additional information
link to eScholarship -
Ghaleb, E., Burenko, I., Rasenberg, M., Pouw, W., Uhrig, P., Holler, J., Toni, I., Ozyurek, A., & Fernandez, R. (2024). Cospeech gesture detection through multi-phase sequence labeling. In Proceedings of IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2024) (pp. 4007-4015).
Abstract
Gestures are integral components of face-to-face communication. They unfold over time, often following predictable movement phases of preparation, stroke, and re-
traction. Yet, the prevalent approach to automatic gesture detection treats the problem as binary classification, classifying a segment as either containing a gesture or not, thus failing to capture its inherently sequential and contextual nature. To address this, we introduce a novel framework that reframes the task as a multi-phase sequence labeling problem rather than binary classification. Our model processes sequences of skeletal movements over time windows, uses Transformer encoders to learn contextual embeddings, and leverages Conditional Random Fields to perform sequence labeling. We evaluate our proposal on a large dataset of diverse co-speech gestures in task-oriented face-to-face dialogues. The results consistently demonstrate that our method significantly outperforms strong baseline models in detecting gesture strokes. Furthermore, applying Transformer encoders to learn contextual embeddings from movement sequences substantially improves gesture unit detection. These results highlight our framework’s capacity to capture the fine-grained dynamics of co-speech gesture phases, paving the way for more nuanced and accurate gesture detection and analysis. -
Gordon, J. K., & Clough, S. (2024). The Flu-ID: A new evidence-based method of assessing fluency in aphasia. American Journal of Speech-Language Pathology, 33, 2972-2990. doi:10.1044/2024_AJSLP-23-00424.
Abstract
Purpose:
Assessing fluency in aphasia is diagnostically important for determining aphasia type and severity and therapeutically important for determining appropriate treatment targets. However, wide variability in the measures and criteria used to assess fluency, as revealed by a recent survey of clinicians (Gordon & Clough, 2022), results in poor reliability. Furthermore, poor specificity in many fluency measures makes it difficult to identify the underlying impairments. Here, we introduce the Flu-ID Aphasia, an evidence-based tool that provides a more informative method of assessing fluency by capturing the range of behaviors that can affect the flow of speech in aphasia.
Method:
The development of the Flu-ID was based on prior evidence about factors underlying fluency (Clough & Gordon, 2020; Gordon & Clough, 2020) and clinical perceptions about the measurement of fluency (Gordon & Clough, 2022). Clinical utility is maximized by automated counting of fluency behaviors in an Excel template. Reliability is maximized by outlining thorough guidelines for transcription and coding. Eighteen narrative samples representing a range of fluency were coded independently by the authors to examine the Flu-ID's utility, reliability, and validity.
Results:
Overall reliability was very good, with point-to-point agreement of 86% between coders. Ten of the 12 dimensions showed good to excellent reliability. Validity analyses indicated that Flu-ID scores were similar to clinician ratings on some dimensions, but differed on others. Possible reasons and implications of the discrepancies are discussed, along with opportunities for improvement.
Conclusions:
The Flu-ID assesses fluency in aphasia using a consistent and comprehensive set of measures and semi-automated procedures to generate individual fluency profiles. The profiles generated in the current study illustrate how similar ratings of fluency can arise from different underlying impairments. Supplemental materials include an analysis template, extensive guidelines for transcription and coding, a completed sample, and a quick reference guide.Additional information
supplemental material -
Hagoort, P., & Özyürek, A. (2024). Extending the architecture of language from a multimodal perspective. Topics in Cognitive Science. Advance online publication. doi:10.1111/tops.12728.
Abstract
Language is inherently multimodal. In spoken languages, combined spoken and visual signals (e.g., co-speech gestures) are an integral part of linguistic structure and language representation. This requires an extension of the parallel architecture, which needs to include the visual signals concomitant to speech. We present the evidence for the multimodality of language. In addition, we propose that distributional semantics might provide a format for integrating speech and co-speech gestures in a common semantic representation. -
Jara-Ettinger, J., & Rubio-Fernandez, P. (2024). Demonstratives as attention tools: Evidence of mentalistic representations in language. Proceedings of the National Academy of Sciences of the United States of America, 121(32): e2402068121. doi:10.1073/pnas.2402068121.
Abstract
Linguistic communication is an intrinsically social activity that enables us to share thoughts across minds. Many complex social uses of language can be captured by domain-general representations of other minds (i.e., mentalistic representations) that externally modulate linguistic meaning through Gricean reasoning. However, here we show that representations of others’ attention are embedded within language itself. Across ten languages, we show that demonstratives—basic grammatical words (e.g.,“this”/“that”) which are evolutionarily ancient, learned early in life, and documented in all known languages—are intrinsic attention tools. Beyond their spatial meanings, demonstratives encode both joint attention and the direction in which the listenermmust turn to establish it. Crucially, the frequency of the spatial and attentional uses of demonstratives varies across languages, suggesting that both spatial and mentalistic representations are part of their conventional meaning. Using computational modeling, we show that mentalistic representations of others’ attention are internally encoded in demonstratives, with their effect further boosted by Gricean reasoning. Yet, speakers are largely unaware of this, incorrectly reporting that they primarily capture spatial representations. Our findings show that representations of other people’s cognitive states (namely, their attention) are embedded in language and suggest that the most basic building blocks of the linguistic system crucially rely on social cognition.Additional information
pnas.2402068121.sapp.pdf -
Joshi, A., Mohanty, R., Kanakanti, M., Mangla, A., Choudhary, S., Barbate, M., & Modi, A. (2024). iSign: A benchmark for Indian Sign Language processing. In L.-W. Ku, A. Martins, & V. Srikumar (
Eds. ), Findings of the Association for Computational Linguistics ACL 2024 (pp. 10827-10844). Bangkok, Thailand: Association for Computational Linguistics.Abstract
Indian Sign Language has limited resources for developing machine learning and data-driven approaches for automated language processing. Though text/audio-based language processing techniques have shown colossal research interest and tremendous improvements in the last few years, Sign Languages still need to catch up due to the need for more resources. To bridge this gap, in this work, we propose iSign: a benchmark for Indian Sign Language (ISL) Processing. We make three primary contributions to this work. First, we release one of the largest ISL-English datasets with more than video-sentence/phrase pairs. To the best of our knowledge, it is the largest sign language dataset available for ISL. Second, we propose multiple NLP-specific tasks (including SignVideo2Text, SignPose2Text, Text2Pose, Word Prediction, and Sign Semantics) and benchmark them with the baseline models for easier access to the research community. Third, we provide detailed insights into the proposed benchmarks with a few linguistic insights into the working of ISL. We streamline the evaluation of Sign Language processing, addressing the gaps in the NLP research community for Sign Languages. We release the dataset, tasks and models via the following website: https://exploration-lab.github.io/iSign/
Additional information
dataset, tasks, models -
Karadöller, D. Z., Sümer, B., Ünal, E., & Özyürek, A. (2024). Sign advantage: Both children and adults’ spatial expressions in sign are more informative than those in speech and gestures combined. Journal of Child Language, 51(4), 876-902. doi:10.1017/S0305000922000642.
Abstract
Expressing Left-Right relations is challenging for speaking-children. Yet, this challenge was absent for signing-children, possibly due to iconicity in the visual-spatial modality of expression. We investigate whether there is also a modality advantage when speaking-children’s co-speech gestures are considered. Eight-year-old child and adult hearing monolingual Turkish speakers and deaf signers of Turkish-Sign-Language described pictures of objects in various spatial relations. Descriptions were coded for informativeness in speech, sign, and speech-gesture combinations for encoding Left-Right relations. The use of co-speech gestures increased the informativeness of speakers’ spatial expressions compared to speech-only. This pattern was more prominent for children than adults. However, signing-adults and children were more informative than child and adult speakers even when co-speech gestures were considered. Thus, both speaking- and signing-children benefit from iconic expressions in visual modality. Finally, in each modality, children were less informative than adults, pointing to the challenge of this spatial domain in development. -
Karadöller, D. Z., Peeters, D., Manhardt, F., Özyürek, A., & Ortega, G. (2024). Iconicity and gesture jointly facilitate learning of second language signs at first exposure in hearing non-signers. Language Learning, 74(4), 781-813. doi:10.1111/lang.12636.
Abstract
When learning a spoken second language (L2), words overlapping in form and meaning with one’s native language (L1) help break into the new language. When non-signing speakers learn a sign language as L2, such forms are absent because of the modality differences (L1:speech, L2:sign). In such cases, non-signing speakers might use iconic form-meaning mappings in signs or their own gestural experience as gateways into the to-be-acquired sign language. Here, we investigated how both these factors may contribute jointly to the acquisition of sign language vocabulary by hearing non-signers. Participants were presented with three types of sign in NGT (Sign Language of the Netherlands): arbitrary signs, iconic signs with high or low gesture overlap. Signs that were both iconic and highly overlapping with gestures boosted learning most at first exposure, and this effect remained the day after. Findings highlight the influence of modality-specific factors supporting the acquisition of a signed lexicon. -
Karadöller*, D. Z., Sümer*, B., & Özyürek, A. (2024). First-language acquisition in a multimodal language framework: Insights from speech, gesture, and sign. First Language. Advance online publication. doi:10.1177/01427237241290678.
Abstract
*=shared first authorship
Children across the world acquire their first language(s) naturally, regardless of typology or modality (e.g. sign or spoken). Various attempts have been made to explain the puzzle of language acquisition using several approaches, trying to understand to what extent it can be explained by what children bring to language-learning situations as well as what they learn from the input and the interactive context. However, most of these approaches consider only speech development, thus ignoring the inherently multimodal nature of human language. As a multimodal view of language is becoming more widely adopted for the study of adult language, a multimodal approach to language acquisition is inevitable. Not only do children have the capacity to learn spoken and sign language equally easily, but spoken language acquisition consists of learning to coordinate linguistic expressions in both modalities, that is, in both speech and gesture. To provide a step forward in this direction, this article aims to synthesize findings from research studies that take a multimodal perspective on language acquisition in different sign and spoken languages, including the development of speech and accompanying gestures. Our review shows that while some aspects of language acquisition seem to be modality-independent, others might differ according to the affordances of each modality when used separately as well as together (either in sign, speech, and/or gesture). We argue that these findings need to be integrated into our understanding of language acquisition. We also identify which areas need future research for both spoken and sign language acquisition, taking into account not only multimodal but also cross-linguistic variation. -
Kejriwal, J., Mishra, C., Skantze, G., Offrede, T., & Beňuš, Š. (2024). Does a robot’s gaze behavior affect entrainment in HRI? Computing and Informatics, 43(5), 1256-1284. doi:10.31577/cai_2024_5_1256.
Abstract
Speakers tend to engage in adaptive behavior, known as entrainment, when they reuse their partner's linguistic representations, including lexical, acoustic prosodic, semantic, or syntactic structures during a conversation. Studies have explored the relationship between entrainment and social factors such as likeability, task success, and rapport. Still, limited research has investigated the relationship between entrainment and gaze. To address this gap, we conducted a within-subjects user study (N = 33) to test if gaze behavior of a robotic head affects entrainment of subjects toward the robot on four linguistic dimensions: lexical, syntactic, semantic, and acoustic-prosodic. Our results show that participants entrain more on lexical and acoustic-prosodic features when the robot exhibits well-timed gaze aversions similar to the ones observed in human gaze behavior, as compared to when the robot keeps staring at participants constantly. Our results support the predictions of the computers as social actors (CASA) model and suggest that implementing well-timed gaze aversion behavior in a robot can lead to speech entrainment in human-robot interactions. -
Kendrick, K. H., & Holler, J. (2024). Conversation. In M. C. Frank, & A. Majid (
Eds. ), Open Encyclopedia of Cognitive Science. Cambridge: MIT Press. doi:10.21428/e2759450.3c00b537. -
Kimmel, M., Schneider, S. M., & Fisher, V. J. (2024). "Introjecting" imagery: A process model of how minds and bodies are co-enacted. Language Sciences, 102: 101602. doi:10.1016/j.langsci.2023.101602.
Abstract
Somatic practices frequently use imagery, typically via verbal instructions, to scaffold sensorimotor organization and experience, a phenomenon we term “introjection”. We argue that introjection is an imagery practice in which sensorimotor and conceptual aspects are co-orchestrated, suggesting the necessity of crosstalk between somatics, phenomenology, psychology, embodied-enactive cognition, and linguistic research on embodied simulation. We presently focus on the scarcely addressed details of the process necessary to enact instructions of a literal or metaphoric nature through the body. Based on vignettes from dance, Feldenkrais, and Taichi practice, we describe introjection as a complex form of processual sense-making, in which context-interpretive, mental, attentional and physical sub-processes recursively braid. Our analysis focuses on how mental and body-related processes progressively align, inform and augment each other. This dialectic requires emphasis on the active body, which implies that uni-directional models (concept ⇒ body) are inadequate and should be replaced by interactionist alternatives (concept ⇔ body). Furthermore, we emphasize that both the source image itself and the body are specifically conceptualized for the context through constructive operations, and both evolve through their interplay. At this level introjection employs representational operations that are embedded in enactive dynamics of a fully situated person. -
Long, M., Rohde, H., Oraa Ali, M., & Rubio-Fernandez, P. (2024). The role of cognitive control and referential complexity on adults’ choice of referring expressions: Testing and expanding the referential complexity scale. Journal of Experimental Psychology: Learning, Memory, and Cognition, 50(1), 109-136. doi:10.1037/xlm0001273.
Abstract
This study aims to advance our understanding of the nature and source(s) of individual differences in pragmatic language behavior over the adult lifespan. Across four story continuation experiments, we probed adults’ (N = 496 participants, ages 18–82) choice of referential forms (i.e., names vs. pronouns to refer to the main character). Our manipulations were based on Fossard et al.’s (2018) scale of referential complexity which varies according to the visual properties of the scene: low complexity (one character), intermediate complexity (two characters of different genders), and high complexity (two characters of the same gender). Since pronouns signal topic continuity (i.e., that the discourse will continue to be about the same referent), the use of pronouns is expected to decrease as referential complexity increases. The choice of names versus pronouns, therefore, provides insight into participants’ perception of the topicality of a referent, and whether that varies by age and cognitive capacity. In Experiment 1, we used the scale to test the association between referential choice, aging, and cognition, identifying a link between older adults’ switching skills and optimal referential choice. In Experiments 2–4, we tested novel manipulations that could impact the scale and found both the timing of a competitor referent’s presence and emphasis placed on competitors modulated referential choice, leading us to refine the scale for future use. Collectively, Experiments 1–4 highlight what type of contextual information is prioritized at different ages, revealing older adults’ preserved sensitivity to (visual) scene complexity but reduced sensitivity to linguistic prominence cues, compared to younger adults. -
Long, M., MacPherson, S. E., & Rubio-Fernandez, P. (2024). Prosocial speech acts: Links to pragmatics and aging. Developmental Psychology, 60(3), 491-504. doi:10.1037/dev0001725.
Abstract
This study investigated how adults over the lifespan flexibly adapt their use of prosocial speech acts when conveying bad news to communicative partners. Experiment 1a (N = 100 Scottish adults aged 18–72 years) assessed whether participants’ use of prosocial speech acts varied according to audience design considerations (i.e., whether or not the recipient of the news was directly affected). Experiment 1b (N = 100 Scottish adults aged 19–70 years) assessed whether participants adjusted for whether the bad news was more or less severe (an index of general knowledge). Younger adults displayed more flexible adaptation to the recipient manipulation, while no age differences were found for severity. These findings are consistent with prior work showing age-related decline in audience design but not in the use of general knowledge during language production. Experiment 2 further probed younger adults (N = 40, Scottish, aged 18–37 years) and older adults’ (N = 40, Scottish, aged 70–89 years) prosocial linguistic behavior by investigating whether health (vs. nonhealth-related) matters would affect responses. While older adults used prosocial speech acts to a greater extent than younger adults, they did not distinguish between conditions. Our results suggest that prosocial linguistic behavior is likely influenced by a combination of differences in audience design and communicative styles at different ages. Collectively, these findings highlight the importance of situating prosocial speech acts within the pragmatics and aging literature, allowing us to uncover the factors modulating prosocial linguistic behavior at different developmental stages.Additional information
figures -
Long, M., & Rubio-Fernandez, P. (2024). Beyond typicality: Lexical category affects the use and processing of color words. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 4925-4930).Abstract
Speakers and listeners show an informativity bias in the use and interpretation of color modifiers. For example, speakers use color more often when referring to objects that vary in color than to objects with a prototypical color. Likewise, listeners look away from objects with prototypical colors upon hearing that color mentioned. Here we test whether speakers and listeners account for another factor related to informativity: the strength of the association between lexical categories and color. Our results demonstrate that speakers and listeners' choices are indeed influenced by this factor; as such, it should be integrated into current pragmatic theories of informativity and computational models of color reference.Additional information
link to eScholarship -
Mamus, E. (2024). Perceptual experience shapes how blind and sighted people express concepts in multimodal language. PhD Thesis, Radboud University Nijmegen, Nijmegen.
Additional information
fullt text via Radboud Repository -
Mishra, C., Nandanwar, A., & Mishra, S. (2024). HRI in Indian education: Challenges opportunities. In H. Admoni, D. Szafir, W. Johal, & A. Sandygulova (
Eds. ), Designing an introductory HRI course (workshop at HRI 2024). ArXiv. doi:10.48550/arXiv.2403.12223.Abstract
With the recent advancements in the field of robotics and the increased focus on having general-purpose robots widely available to the general public, it has become increasingly necessary to pursue research into Human-robot interaction (HRI). While there have been a lot of works discussing frameworks for teaching HRI in educational institutions with a few institutions already offering courses to students, a consensus on the course content still eludes the field. In this work, we highlight a few challenges and opportunities while designing an HRI course from an Indian perspective. These topics warrant further deliberations as they have a direct impact on the design of HRI courses and wider implications for the entire field. -
Motiekaitytė, K., Grosseck, O., Wolf, L., Bosker, H. R., Peeters, D., Perlman, M., Ortega, G., & Raviv, L. (2024). Iconicity and compositionality in emerging vocal communication systems: a Virtual Reality approach. In J. Nölle, L. Raviv, K. E. Graham, S. Hartmann, Y. Jadoul, M. Josserand, T. Matzinger, K. Mudd, M. Pleyer, A. Slonimska, & S. Wacewicz (
Eds. ), The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV) (pp. 387-389). Nijmegen: The Evolution of Language Conferences. -
Nölle, J., Raviv, L., Graham, K. E., Hartmann, S., Jadoul, Y., Josserand, M., Matzinger, T., Mudd, K., Pleyer, M., Slonimska, A., Wacewicz, S., & Watson, S. (
Eds. ). (2024). The Evolution of Language: Proceedings of the 15th International Conference (EVOLANG XV). Nijmegen: The Evolution of Language Conferences. doi:10.17617/2.3587960. -
Plate, L., Fisher, V. J., Nabibaks, F., & Feenstra, M. (2024). Feeling the traces of the Dutch colonial past: Dance as an affective methodology in Farida Nabibaks’s radiant shadow. In E. Van Bijnen, P. Brandon, K. Fatah-Black, I. Limon, W. Modest, & M. Schavemaker (
Eds. ), The future of the Dutch colonial past: From dialogues to new narratives (pp. 126-139). Amsterdam: Amsterdam University Press. -
Ronderos, C. R., Zhang, Y., & Rubio-Fernandez, P. (2024). Weighted parameters in demonstrative use: The case of Spanish teens and adults. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 3279-3286).Additional information
link to eScholarship -
Ronderos, C. R., Aparicio, H., Long, M., Shukla, V., Jara-Ettinger, J., & Rubio-Fernandez, P. (2024). Perceptual, semantic, and pragmatic factors affect the derivation of contrastive inferences. Open mind: discoveries in cognitive science, 8, 1213-1227. doi:10.1162/opmi_a_00165.
Abstract
People derive contrastive inferences when interpreting adjectives (e.g., inferring that ‘the short pencil’ is being contrasted with a longer one). However, classic eye-tracking studies revealed contrastive inferences with scalar and material adjectives, but not with color adjectives. This was explained as a difference in listeners’ informativity expectations, since color adjectives are often used descriptively (hence not warranting a contrastive interpretation). Here we hypothesized that, beyond these pragmatic factors, perceptual factors (i.e., the relative perceptibility of color, material and scalar contrast) and semantic factors (i.e., the difference between gradable and non-gradable properties) also affect the real-time derivation of contrastive inferences. We tested these predictions in three languages with prenominal modification (English, Hindi, and Hungarian) and found that people derive contrastive inferences for color and scalar adjectives, but not for material adjectives. In addition, the processing of scalar adjectives was more context dependent than that of color and material adjectives, confirming that pragmatic, perceptual and semantic factors affect the derivation of contrastive inferences.
-
Rubianes, M., Drijvers, L., Muñoz, F., Jiménez-Ortega, L., Almeida-Rivera, T., Sánchez-García, J., Fondevila, S., Casado, P., & Martín-Loeches, M. (2024). The self-reference effect can modulate language syntactic processing even without explicit awareness: An electroencephalography study. Journal of Cognitive Neuroscience, 36(3), 460-474. doi:10.1162/jocn_a_02104.
Abstract
Although it is well established that self-related information can rapidly capture our attention and bias cognitive functioning, whether this self-bias can affect language processing remains largely unknown. In addition, there is an ongoing debate as to the functional independence of language processes, notably regarding the syntactic domain. Hence, this study investigated the influence of self-related content on syntactic speech processing. Participants listened to sentences that could contain morphosyntactic anomalies while the masked face identity (self, friend, or unknown faces) was presented for 16 msec preceding the critical word. The language-related ERP components (left anterior negativity [LAN] and P600) appeared for all identity conditions. However, the largest LAN effect followed by a reduced P600 effect was observed for self-faces, whereas a larger LAN with no reduction of the P600 was found for friend faces compared with unknown faces. These data suggest that both early and late syntactic processes can be modulated by self-related content. In addition, alpha power was more suppressed over the left inferior frontal gyrus only when self-faces appeared before the critical word. This may reflect higher semantic demands concomitant to early syntactic operations (around 150–550 msec). Our data also provide further evidence of self-specific response, as reflected by the N250 component. Collectively, our results suggest that identity-related information is rapidly decoded from facial stimuli and may impact core linguistic processes, supporting an interactive view of syntactic processing. This study provides evidence that the self-reference effect can be extended to syntactic processing. -
Rubio-Fernandez, P., Long, M., Shukla, V., Bhatia, V., Mahapatra, A., Ralekar, C., Ben-Ami, S., & Sinha, P. (2024). Multimodal communication in newly sighted children: An investigation of the relation between visual experience and pragmatic development. In L. K. Samuelson, S. L. Frank, M. Toneva, A. Mackey, & E. Hazeltine (
Eds. ), Proceedings of the 46th Annual Meeting of the Cognitive Science Society (CogSci 2024) (pp. 2560-2567).Abstract
We investigated the relationship between visual experience and pragmatic development by testing the socio-communicative skills of a unique population: the Prakash children of India, who received treatment for congenital cataracts after years of visual deprivation. Using two different referential communication tasks, our study investigated Prakash' children ability to produce sufficiently informative referential expressions (e.g., ‘the green pear' or ‘the small plate') and pay attention to their interlocutor's face during the task (Experiment 1), as well as their ability to recognize a speaker's referential intent through non-verbal cues such as head turning and pointing (Experiment 2). Our results show that Prakash children have strong pragmatic skills, but do not look at their interlocutor's face as often as neurotypical children do. However, longitudinal analyses revealed an increase in face fixations, suggesting that over time, Prakash children come to utilize their improved visual skills for efficient referential communication.Additional information
link to eScholarship -
Sekine, K., & Özyürek, A. (2024). Children benefit from gestures to understand degraded speech but to a lesser extent than adults. Frontiers in Psychology, 14: 1305562. doi:10.3389/fpsyg.2023.1305562.
Abstract
The present study investigated to what extent children, compared to adults, benefit from gestures to disambiguate degraded speech by manipulating speech signals and manual modality. Dutch-speaking adults (N = 20) and 6- and 7-year-old children (N = 15) were presented with a series of video clips in which an actor produced a Dutch action verb with or without an accompanying iconic gesture. Participants were then asked to repeat what they had heard. The speech signal was either clear or altered into 4- or 8-band noise-vocoded speech. Children had more difficulty than adults in disambiguating degraded speech in the speech-only condition. However, when presented with both speech and gestures, children reached a comparable level of accuracy to that of adults in the degraded-speech-only condition. Furthermore, for adults, the enhancement of gestures was greater in the 4-band condition than in the 8-band condition, whereas children showed the opposite pattern. Gestures help children to disambiguate degraded speech, but children need more phonological information than adults to benefit from use of gestures. Children’s multimodal language integration needs to further develop to adapt flexibly to challenging situations such as degraded speech, as tested in our study, or instances where speech is heard with environmental noise or through a face mask.Additional information
supplemental material -
Slonimska, A. (2024). The role of iconicity and simultaneity in efficient communication in the visual modality: Evidence from LIS (Italian Sign Language) [Dissertation Abstract]. Sign Language & Linguistics, 27(1), 116-124. doi:10.1075/sll.00084.slo.
-
Ter Bekke, M., Drijvers, L., & Holler, J. (2024). Hand gestures have predictive potential during conversation: An investigation of the timing of gestures in relation to speech. Cognitive Science, 48(1): e13407. doi:10.1111/cogs.13407.
Abstract
During face-to-face conversation, transitions between speaker turns are incredibly fast. These fast turn exchanges seem to involve next speakers predicting upcoming semantic information, such that next turn planning can begin before a current turn is complete. Given that face-to-face conversation also involves the use of communicative bodily signals, an important question is how bodily signals such as co-speech hand gestures play into these processes of prediction and fast responding. In this corpus study, we found that hand gestures that depict or refer to semantic information started before the corresponding information in speech, which held both for the onset of the gesture as a whole, as well as the onset of the stroke (the most meaningful part of the gesture). This early timing potentially allows listeners to use the gestural information to predict the corresponding semantic information to be conveyed in speech. Moreover, we provided further evidence that questions with gestures got faster responses than questions without gestures. However, we found no evidence for the idea that how much a gesture precedes its lexical affiliate (i.e., its predictive potential) relates to how fast responses were given. The findings presented here highlight the importance of the temporal relation between speech and gesture and help to illuminate the potential mechanisms underpinning multimodal language processing during face-to-face conversation. -
Ter Bekke, M., Drijvers, L., & Holler, J. (2024). Gestures speed up responses to questions. Language, Cognition and Neuroscience, 39(4), 423-430. doi:10.1080/23273798.2024.2314021.
Abstract
Most language use occurs in face-to-face conversation, which involves rapid turn-taking. Seeing communicative bodily signals in addition to hearing speech may facilitate such fast responding. We tested whether this holds for co-speech hand gestures by investigating whether these gestures speed up button press responses to questions. Sixty native speakers of Dutch viewed videos in which an actress asked yes/no-questions, either with or without a corresponding iconic hand gesture. Participants answered the questions as quickly and accurately as possible via button press. Gestures did not impact response accuracy, but crucially, gestures sped up responses, suggesting that response planning may be finished earlier when gestures are seen. How much gestures sped up responses was not related to their timing in the question or their timing with respect to the corresponding information in speech. Overall, these results are in line with the idea that multimodality may facilitate fast responding during face-to-face conversation. -
Ter Bekke, M., Levinson, S. C., Van Otterdijk, L., Kühn, M., & Holler, J. (2024). Visual bodily signals and conversational context benefit the anticipation of turn ends. Cognition, 248: 105806. doi:10.1016/j.cognition.2024.105806.
Abstract
The typical pattern of alternating turns in conversation seems trivial at first sight. But a closer look quickly reveals the cognitive challenges involved, with much of it resulting from the fast-paced nature of conversation. One core ingredient to turn coordination is the anticipation of upcoming turn ends so as to be able to ready oneself for providing the next contribution. Across two experiments, we investigated two variables inherent to face-to-face conversation, the presence of visual bodily signals and preceding discourse context, in terms of their contribution to turn end anticipation. In a reaction time paradigm, participants anticipated conversational turn ends better when seeing the speaker and their visual bodily signals than when they did not, especially so for longer turns. Likewise, participants were better able to anticipate turn ends when they had access to the preceding discourse context than when they did not, and especially so for longer turns. Critically, the two variables did not interact, showing that visual bodily signals retain their influence even in the context of preceding discourse. In a pre-registered follow-up experiment, we manipulated the visibility of the speaker's head, eyes and upper body (i.e. torso + arms). Participants were better able to anticipate turn ends when the speaker's upper body was visible, suggesting a role for manual gestures in turn end anticipation. Together, these findings show that seeing the speaker during conversation may critically facilitate turn coordination in interaction. -
Trujillo, J. P. (2024). Motion-tracking technology for the study of gesture. In A. Cienki (
Ed. ), The Cambridge Handbook of Gesture Studies. Cambridge: Cambridge University Press. -
Trujillo, J. P., & Holler, J. (2024). Conversational facial signals combine into compositional meanings that change the interpretation of speaker intentions. Scientific Reports, 14: 2286. doi:10.1038/s41598-024-52589-0.
Abstract
Human language is extremely versatile, combining a limited set of signals in an unlimited number of ways. However, it is unknown whether conversational visual signals feed into the composite utterances with which speakers communicate their intentions. We assessed whether different combinations of visual signals lead to different intent interpretations of the same spoken utterance. Participants viewed a virtual avatar uttering spoken questions while producing single visual signals (i.e., head turn, head tilt, eyebrow raise) or combinations of these signals. After each video, participants classified the communicative intention behind the question. We found that composite utterances combining several visual signals conveyed different meaning compared to utterances accompanied by the single visual signals. However, responses to combinations of signals were more similar to the responses to related, rather than unrelated, individual signals, indicating a consistent influence of the individual visual signals on the whole. This study therefore provides first evidence for compositional, non-additive (i.e., Gestalt-like) perception of multimodal language.Additional information
41598_2024_52589_MOESM1_ESM.docx -
Trujillo, J. P., & Holler, J. (2024). Information distribution patterns in naturalistic dialogue differ across languages. Psychonomic Bulletin & Review, 31, 1723-1734. doi:10.3758/s13423-024-02452-0.
Abstract
The natural ecology of language is conversation, with individuals taking turns speaking to communicate in a back-and-forth fashion. Language in this context involves strings of words that a listener must process while simultaneously planning their own next utterance. It would thus be highly advantageous if language users distributed information within an utterance in a way that may facilitate this processing–planning dynamic. While some studies have investigated how information is distributed at the level of single words or clauses, or in written language, little is known about how information is distributed within spoken utterances produced during naturalistic conversation. It also is not known how information distribution patterns of spoken utterances may differ across languages. We used a set of matched corpora (CallHome) containing 898 telephone conversations conducted in six different languages (Arabic, English, German, Japanese, Mandarin, and Spanish), analyzing more than 58,000 utterances, to assess whether there is evidence of distinct patterns of information distributions at the utterance level, and whether these patterns are similar or differed across the languages. We found that English, Spanish, and Mandarin typically show a back-loaded distribution, with higher information (i.e., surprisal) in the last half of utterances compared with the first half, while Arabic, German, and Japanese showed front-loaded distributions, with higher information in the first half compared with the last half. Additional analyses suggest that these patterns may be related to word order and rate of noun and verb usage. We additionally found that back-loaded languages have longer turn transition times (i.e.,time between speaker turns)Additional information
Data availability -
Ünal, E., Mamus, E., & Özyürek, A. (2024). Multimodal encoding of motion events in speech, gesture, and cognition. Language and Cognition, 16(4), 785-804. doi:10.1017/langcog.2023.61.
Abstract
How people communicate about motion events and how this is shaped by language typology are mostly studied with a focus on linguistic encoding in speech. Yet, human communication typically involves an interactional exchange of multimodal signals, such as hand gestures that have different affordances for representing event components. Here, we review recent empirical evidence on multimodal encoding of motion in speech and gesture to gain a deeper understanding of whether and how language typology shapes linguistic expressions in different modalities, and how this changes across different sensory modalities of input and interacts with other aspects of cognition. Empirical evidence strongly suggests that Talmy’s typology of event integration predicts multimodal event descriptions in speech and gesture and visual attention to event components prior to producing these descriptions. Furthermore, variability within the event itself, such as type and modality of stimuli, may override the influence of language typology, especially for expression of manner. -
Akita, K., & Dingemanse, M. (2019). Ideophones (Mimetics, Expressives). In Oxford Research Encyclopedia for Linguistics. Oxford: Oxford University Press. doi:10.1093/acrefore/9780199384655.013.477.
Abstract
Ideophones, also termed “mimetics” or “expressives,” are marked words that depict sensory imagery. They are found in many of the world’s languages, and sizable lexical classes of ideophones are particularly well-documented in languages of Asia, Africa, and the Americas. Ideophones are not limited to onomatopoeia like meow and smack, but cover a wide range of sensory domains, such as manner of motion (e.g., plisti plasta ‘splish-splash’ in Basque), texture (e.g., tsaklii ‘rough’ in Ewe), and psychological states (e.g., wakuwaku ‘excited’ in Japanese). Across languages, ideophones stand out as marked words due to special phonotactics, expressive morphology including certain types of reduplication, and relative syntactic independence, in addition to production features like prosodic foregrounding and common co-occurrence with iconic gestures.
Three intertwined issues have been repeatedly debated in the century-long literature on ideophones. (a) Definition: Isolated descriptive traditions and cross-linguistic variation have sometimes obscured a typologically unified view of ideophones, but recent advances show the promise of a prototype definition of ideophones as conventionalised depictions in speech, with room for language-specific nuances. (b) Integration: The variable integration of ideophones across linguistic levels reveals an interaction between expressiveness and grammatical integration, and has important implications for how to conceive of dependencies between linguistic systems. (c) Iconicity: Ideophones form a natural laboratory for the study of iconic form-meaning associations in natural languages, and converging evidence from corpus and experimental studies suggests important developmental, evolutionary, and communicative advantages of ideophones. -
Azar, Z., Backus, A., & Ozyurek, A. (2019). General and language specific factors influence reference tracking in speech and gesture in discourse. Discourse Processes, 56(7), 553-574. doi:10.1080/0163853X.2018.1519368.
Abstract
Referent accessibility influences expressions in speech and gestures in similar ways. Speakers mostly use richer forms as noun phrases (NPs) in speech and gesture more when referents have low accessibility, whereas they use reduced forms such as pronouns more often and gesture less when referents have high accessibility. We investigated the relationships between speech and gesture during reference tracking in a pro-drop language—Turkish. Overt pronouns were not strongly associated with accessibility but with pragmatic context (i.e., marking similarity, contrast). Nevertheless, speakers gestured more when referents were re-introduced versus maintained and when referents were expressed with NPs versus pronouns. Pragmatic context did not influence gestures. Further, pronouns in low-accessibility contexts were accompanied with gestures—possibly for reference disambiguation—more often than previously found for non-pro-drop languages in such contexts. These findings enhance our understanding of the relationships between speech and gesture at the discourse level. -
Cuskley, C., Dingemanse, M., Kirby, S., & Van Leeuwen, T. M. (2019). Cross-modal associations and synesthesia: Categorical perception and structure in vowel–color mappings in a large online sample. Behavior Research Methods, 51, 1651-1675. doi:10.3758/s13428-019-01203-7.
Abstract
We report associations between vowel sounds, graphemes, and colours collected online from over 1000 Dutch speakers. We provide open materials including a Python implementation of the structure measure, and code for a single page web application to run simple cross-modal tasks. We also provide a full dataset of colour-vowel associations from 1164 participants, including over 200 synaesthetes identified using consistency measures. Our analysis reveals salient patterns in cross-modal associations, and introduces a novel measure of isomorphism in cross-modal mappings. We find that while acoustic features of vowels significantly predict certain mappings (replicating prior work), both vowel phoneme category and grapheme category are even better predictors of colour choice. Phoneme category is the best predictor of colour choice overall, pointing to the importance of phonological representations in addition to acoustic cues. Generally, high/front vowels are lighter, more green, and more yellow than low/back vowels. Synaesthetes respond more strongly on some dimensions, choosing lighter and more yellow colours for high and mid front vowels than non-synaesthetes. We also present a novel measure of cross-modal mappings adapted from ecology, which uses a simulated distribution of mappings to measure the extent to which participants' actual mappings are structured isomorphically across modalities. Synaesthetes have mappings that tend to be more structured than non-synaesthetes, and more consistent colour choices across trials correlate with higher structure scores. Nevertheless, the large majority (~70%) of participants produce structured mappings, indicating that the capacity to make isomorphically structured mappings across distinct modalities is shared to a large extent, even if the exact nature of mappings varies across individuals. Overall, this novel structure measure suggests a distribution of structured cross-modal association in the population, with synaesthetes on one extreme and participants with unstructured associations on the other. -
Dideriksen, C., Fusaroli, R., Tylén, K., Dingemanse, M., & Christiansen, M. H. (2019). Contextualizing Conversational Strategies: Backchannel, Repair and Linguistic Alignment in Spontaneous and Task-Oriented Conversations. In A. K. Goel, C. M. Seifert, & C. Freksa (
Eds. ), Proceedings of the 41st Annual Conference of the Cognitive Science Society (CogSci 2019) (pp. 261-267). Montreal, QB: Cognitive Science Society.Abstract
Do interlocutors adjust their conversational strategies to the specific contextual demands of a given situation? Prior studies have yielded conflicting results, making it unclear how strategies vary with demands. We combine insights from qualitative and quantitative approaches in a within-participant experimental design involving two different contexts: spontaneously occurring conversations (SOC) and task-oriented conversations (TOC). We systematically assess backchanneling, other-repair and linguistic alignment. We find that SOC exhibit a higher number of backchannels, a reduced and more generic repair format and higher rates of lexical and syntactic alignment. TOC are characterized by a high number of specific repairs and a lower rate of lexical and syntactic alignment. However, when alignment occurs, more linguistic forms are aligned. The findings show that conversational strategies adapt to specific contextual demands. -
Dingemanse, M. (2019). 'Ideophone' as a comparative concept. In K. Akita, & P. Pardeshi (
Eds. ), Ideophones, Mimetics, and Expressives (pp. 13-33). Amsterdam: John Benjamins. doi:10.1075/ill.16.02din.Abstract
This chapter makes the case for ‘ideophone’ as a comparative concept: a notion that captures a recurrent typological pattern and provides a template for understanding language-specific phenomena that prove similar. It revises an earlier definition to account for the observation that ideophones typically form an open lexical class, and uses insights from canonical typology to explore the larger typological space. According to the resulting definition, a canonical ideophone is a member of an open lexical class of marked words that depict sensory imagery. The five elements of this definition can be seen as dimensions that together generate a possibility space to characterise cross-linguistic diversity in depictive means of expression. This approach allows for the systematic comparative treatment of ideophones and ideophone-like phenomena. Some phenomena in the larger typological space are discussed to demonstrate the utility of the approach: phonaesthemes in European languages, specialised semantic classes in West-Chadic, diachronic diversions in Aslian, and depicting constructions in signed languages. -
Drijvers, L., Vaitonyte, J., & Ozyurek, A. (2019). Degree of language experience modulates visual attention to visible speech and iconic gestures during clear and degraded speech comprehension. Cognitive Science, 43: e12789. doi:10.1111/cogs.12789.
Abstract
Visual information conveyed by iconic hand gestures and visible speech can enhance speech comprehension under adverse listening conditions for both native and non‐native listeners. However, how a listener allocates visual attention to these articulators during speech comprehension is unknown. We used eye‐tracking to investigate whether and how native and highly proficient non‐native listeners of Dutch allocated overt eye gaze to visible speech and gestures during clear and degraded speech comprehension. Participants watched video clips of an actress uttering a clear or degraded (6‐band noise‐vocoded) action verb while performing a gesture or not, and were asked to indicate the word they heard in a cued‐recall task. Gestural enhancement was the largest (i.e., a relative reduction in reaction time cost) when speech was degraded for all listeners, but it was stronger for native listeners. Both native and non‐native listeners mostly gazed at the face during comprehension, but non‐native listeners gazed more often at gestures than native listeners. However, only native but not non‐native listeners' gaze allocation to gestures predicted gestural benefit during degraded speech comprehension. We conclude that non‐native listeners might gaze at gesture more as it might be more challenging for non‐native listeners to resolve the degraded auditory cues and couple those cues to phonological information that is conveyed by visible speech. This diminished phonological knowledge might hinder the use of semantic information that is conveyed by gestures for non‐native compared to native listeners. Our results demonstrate that the degree of language experience impacts overt visual attention to visual articulators, resulting in different visual benefits for native versus non‐native listeners.Additional information
Supporting information -
Drijvers, L., Van der Plas, M., Ozyurek, A., & Jensen, O. (2019). Native and non-native listeners show similar yet distinct oscillatory dynamics when using gestures to access speech in noise. NeuroImage, 194, 55-67. doi:10.1016/j.neuroimage.2019.03.032.
Abstract
Listeners are often challenged by adverse listening conditions during language comprehension induced by external factors, such as noise, but also internal factors, such as being a non-native listener. Visible cues, such as semantic information conveyed by iconic gestures, can enhance language comprehension in such situations. Using magnetoencephalography (MEG) we investigated whether spatiotemporal oscillatory dynamics can predict a listener's benefit of iconic gestures during language comprehension in both internally (non-native versus native listeners) and externally (clear/degraded speech) induced adverse listening conditions. Proficient non-native speakers of Dutch were presented with videos in which an actress uttered a degraded or clear verb, accompanied by a gesture or not, and completed a cued-recall task after every video. The behavioral and oscillatory results obtained from non-native listeners were compared to an MEG study where we presented the same stimuli to native listeners (Drijvers et al., 2018a). Non-native listeners demonstrated a similar gestural enhancement effect as native listeners, but overall scored significantly slower on the cued-recall task. In both native and non-native listeners, an alpha/beta power suppression revealed engagement of the extended language network, motor and visual regions during gestural enhancement of degraded speech comprehension, suggesting similar core processes that support unification and lexical access processes. An individual's alpha/beta power modulation predicted the gestural benefit a listener experienced during degraded speech comprehension. Importantly, however, non-native listeners showed less engagement of the mouth area of the primary somatosensory cortex, left insula (beta), LIFG and ATL (alpha) than native listeners, which suggests that non-native listeners might be hindered in processing the degraded phonological cues and coupling them to the semantic information conveyed by the gesture. Native and non-native listeners thus demonstrated similar yet distinct spatiotemporal oscillatory dynamics when recruiting visual cues to disambiguate degraded speech.Additional information
1-s2.0-S1053811919302216-mmc1.docx -
Mamus, E., Rissman, L., Majid, A., & Ozyurek, A. (2019). Effects of blindfolding on verbal and gestural expression of path in auditory motion events. In A. K. Goel, C. M. Seifert, & C. C. Freksa (
Eds. ), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2275-2281). Montreal, QB: Cognitive Science Society.Abstract
Studies have claimed that blind people’s spatial representations are different from sighted people, and blind people display superior auditory processing. Due to the nature of auditory and haptic information, it has been proposed that blind people have spatial representations that are more sequential than sighted people. Even the temporary loss of sight—such as through blindfolding—can affect spatial representations, but not much research has been done on this topic. We compared blindfolded and sighted people’s linguistic spatial expressions and non-linguistic localization accuracy to test how blindfolding affects the representation of path in auditory motion events. We found that blindfolded people were as good as sighted people when localizing simple sounds, but they outperformed sighted people when localizing auditory motion events. Blindfolded people’s path related speech also included more sequential, and less holistic elements. Our results indicate that even temporary loss of sight influences spatial representations of auditory motion eventsAdditional information
https://mindmodeling.org/cogsci2019/papers/0395/index.html -
Ortega, G., Schiefner, A., & Ozyurek, A. (2019). Hearing non-signers use their gestures to predict iconic form-meaning mappings at first exposure to sign. Cognition, 191: 103996. doi:10.1016/j.cognition.2019.06.008.
Abstract
The sign languages of deaf communities and the gestures produced by hearing people are communicative systems that exploit the manual-visual modality as means of expression. Despite their striking differences they share the property of iconicity, understood as the direct relationship between a symbol and its referent. Here we investigate whether non-signing hearing adults exploit their implicit knowledge of gestures to bootstrap accurate understanding of the meaning of iconic signs they have never seen before. In Study 1 we show that for some concepts gestures exhibit systematic forms across participants, and share different degrees of form overlap with the signs for the same concepts (full, partial, and no overlap). In Study 2 we found that signs with stronger resemblance with signs are more accurately guessed and are assigned higher iconicity ratings by non-signers than signs with low overlap. In addition, when more people produced a systematic gesture resembling a sign, they assigned higher iconicity ratings to that sign. Furthermore, participants had a bias to assume that signs represent actions and not objects. The similarities between some signs and gestures could be explained by deaf signers and hearing gesturers sharing a conceptual substrate that is rooted in our embodied experiences with the world. The finding that gestural knowledge can ease the interpretation of the meaning of novel signs and predicts iconicity ratings is in line with embodied accounts of cognition and the influence of prior knowledge to acquire new schemas. Through these mechanisms we propose that iconic gestures that overlap in form with signs may serve as some type of ‘manual cognates’ that help non-signing adults to break into a new language at first exposure.Additional information
Supplementary Materials -
Ozyurek, A., & Woll, B. (2019). Language in the visual modality: Cospeech gesture and sign language. In P. Hagoort (
Ed. ), Human language: From genes and brain to behavior (pp. 67-83). Cambridge, MA: MIT Press. -
Rissman, L., & Majid, A. (2019). Agency drives category structure in instrumental events. In A. K. Goel, C. M. Seifert, & C. Freksa (
Eds. ), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2661-2667). Montreal, QB: Cognitive Science Society.Abstract
Thematic roles such as Agent and Instrument have a long-standing place in theories of event representation. Nonetheless, the structure of these categories has been difficult to determine. We investigated how instrumental events, such as someone slicing bread with a knife, are categorized in English. Speakers described a variety of typical and atypical instrumental events, and we determined the similarity structure of their descriptions using correspondence analysis. We found that events where the instrument is an extension of an intentional agent were most likely to elicit similar language, highlighting the importance of agency in structuring instrumental categories.Additional information
https://mindmodeling.org/cogsci2019/papers/0455/0455.pdf -
Rissman, L., & Majid, A. (2019). Thematic roles: Core knowledge or linguistic construct? Psychonomic Bulletin & Review, 26(6), 1850-1869. doi:10.3758/s13423-019-01634-5.
Abstract
The status of thematic roles such as Agent and Patient in cognitive science is highly controversial: To some they are universal components of core knowledge, to others they are scholarly fictions without psychological reality. We address this debate by posing two critical questions: to what extent do humans represent events in terms of abstract role categories, and to what extent are these categories shaped by universal cognitive biases? We review a range of literature that contributes answers to these questions: psycholinguistic and event cognition experiments with adults, children, and infants; typological studies grounded in cross-linguistic data; and studies of emerging sign languages. We pose these questions for a variety of roles and find that the answers depend on the role. For Agents and Patients, there is strong evidence for abstract role categories and a universal bias to distinguish the two roles. For Goals and Recipients, we find clear evidence for abstraction but mixed evidence as to whether there is a bias to encode Goals and Recipients as part of one or two distinct categories. Finally, we discuss the Instrumental role and do not find clear evidence for either abstraction or universal biases to structure instrumental categories. -
Schubotz, L., Ozyurek, A., & Holler, J. (2019). Age-related differences in multimodal recipient design: Younger, but not older adults, adapt speech and co-speech gestures to common ground. Language, Cognition and Neuroscience, 34(2), 254-271. doi:10.1080/23273798.2018.1527377.
Abstract
Speakers can adapt their speech and co-speech gestures based on knowledge shared with an addressee (common ground-based recipient design). Here, we investigate whether these adaptations are modulated by the speaker’s age and cognitive abilities. Younger and older participants narrated six short comic stories to a same-aged addressee. Half of each story was known to both participants, the other half only to the speaker. The two age groups did not differ in terms of the number of words and narrative events mentioned per narration, or in terms of gesture frequency, gesture rate, or percentage of events expressed multimodally. However, only the younger participants reduced the amount of verbal and gestural information when narrating mutually known as opposed to novel story content. Age-related differences in cognitive abilities did not predict these differences in common ground-based recipient design. The older participants’ communicative behaviour may therefore also reflect differences in social or pragmatic goals.Additional information
plcp_a_1527377_sm4510.pdf -
Ter Bekke, M., Ozyurek, A., & Ünal, E. (2019). Speaking but not gesturing predicts motion event memory within and across languages. In A. Goel, C. Seifert, & C. Freksa (
Eds. ), Proceedings of the 41st Annual Meeting of the Cognitive Science Society (CogSci 2019) (pp. 2940-2946). Montreal, QB: Cognitive Science Society.Abstract
In everyday life, people see, describe and remember motion events. We tested whether the type of motion event information (path or manner) encoded in speech and gesture predicts which information is remembered and if this varies across speakers of typologically different languages. We focus on intransitive motion events (e.g., a woman running to a tree) that are described differently in speech and co-speech gesture across languages, based on how these languages typologically encode manner and path information (Kita & Özyürek, 2003; Talmy, 1985). Speakers of Dutch (n = 19) and Turkish (n = 22) watched and described motion events. With a surprise (i.e. unexpected) recognition memory task, memory for manner and path components of these events was measured. Neither Dutch nor Turkish speakers’ memory for manner went above chance levels. However, we found a positive relation between path speech and path change detection: participants who described the path during encoding were more accurate at detecting changes to the path of an event during the memory task. In addition, the relation between path speech and path memory changed with native language: for Dutch speakers encoding path in speech was related to improved path memory, but for Turkish speakers no such relation existed. For both languages, co-speech gesture did not predict memory speakers. We discuss the implications of these findings for our understanding of the relations between speech, gesture, type of encoding in language and memory.Additional information
https://mindmodeling.org/cogsci2019/papers/0496/0496.pdf -
Trujillo, J. P., Vaitonyte, J., Simanova, I., & Ozyurek, A. (2019). Toward the markerless and automatic analysis of kinematic features: A toolkit for gesture and movement research. Behavior Research Methods, 51(2), 769-777. doi:10.3758/s13428-018-1086-8.
Abstract
Action, gesture, and sign represent unique aspects of human communication that use form and movement to convey meaning. Researchers typically use manual coding of video data to characterize naturalistic, meaningful movements at various levels of description, but the availability of markerless motion-tracking technology allows for quantification of the kinematic features of gestures or any meaningful human movement. We present a novel protocol for extracting a set of kinematic features from movements recorded with Microsoft Kinect. Our protocol captures spatial and temporal features, such as height, velocity, submovements/strokes, and holds. This approach is based on studies of communicative actions and gestures and attempts to capture features that are consistently implicated as important kinematic aspects of communication. We provide open-source code for the protocol, a description of how the features are calculated, a validation of these features as quantified by our protocol versus manual coders, and a discussion of how the protocol can be applied. The protocol effectively quantifies kinematic features that are important in the production (e.g., characterizing different contexts) as well as the comprehension (e.g., used by addressees to understand intent and semantics) of manual acts. The protocol can also be integrated with qualitative analysis, allowing fast and objective demarcation of movement units, providing accurate coding even of complex movements. This can be useful to clinicians, as well as to researchers studying multimodal communication or human–robot interactions. By making this protocol available, we hope to provide a tool that can be applied to understanding meaningful movement characteristics in human communication.Additional information
https://link.springer.com/article/10.3758/s13428-018-1086-8#SupplementaryMateri… -
Van Leeuwen, T. M., Van Petersen, E., Burghoorn, F., Dingemanse, M., & Van Lier, R. (2019). Autistic traits in synaesthesia: Atypical sensory sensitivity and enhanced perception of details. Philosophical Transactions of the Royal Society of London, Series B: Biological Sciences, 374: 20190024. doi:10.1098/rstb.2019.0024.
Abstract
In synaesthetes specific sensory stimuli (e.g., black letters) elicit additional experiences (e.g. colour). Synaesthesia is highly prevalent among individuals with autism spectrum disorder but the mechanisms of this co-occurrence are not clear. We hypothesized autism and synaesthesia share atypical sensory sensitivity and perception. We assessed autistic traits, sensory sensitivity, and visual perception in two synaesthete populations. In Study 1, synaesthetes (N=79, of different types) scored higher than non-synaesthetes (N=76) on the Attention-to-detail and Social skills subscales of the Autism Spectrum Quotient indexing autistic traits, and on the Glasgow Sensory Questionnaire indexing sensory hypersensitivity and hyposensitivity which frequently occur in autism. Synaesthetes performed two local/global visual tasks because individuals with autism typically show a bias toward detail processing. In synaesthetes, elevated motion coherence thresholds suggested reduced global motion perception and higher accuracy on an embedded figures task suggested enhanced local perception. In Study 2 sequence-space synaesthetes (N=18) completed the same tasks. Questionnaire and embedded figures results qualitatively resembled Study 1 results but no significant group differences with non-synaesthetes (N=20) were obtained. Unexpectedly, sequence-space synaesthetes had reduced motion coherence thresholds. Altogether, our studies suggest atypical sensory sensitivity and a bias towards detail processing are shared features of synaesthesia and autism spectrum disorder.
Share this page