Seeking Temporal Predictability in Speech: Comparing Statistical Approaches on 18 World Languages
Temporal regularities in speech, such as interdependencies in the timing of speech
events, are thought to scaffold early acquisition of the building blocks in speech. By
providing on-line clues to the location and duration of upcoming syllables, temporal
structure may aid segmentation and clustering of continuous speech into separable
units. This hypothesis tacitly assumes that learners exploit predictability in the temporal
structure of speech. Existing measures of speech timing tend to focus on first-order
regularities among adjacent units, and are overly sensitive to idiosyncrasies in the
data they describe. Here, we compare several statistical methods on a sample of 18
languages, testing whether syllable occurrence is predictable over time. Rather than
looking for differences between languages, we aim to find, across languages and using
clearly defined acoustic rather than orthographic measures, temporal predictability
in the speech signal that a language learner could exploit. First, we
analyse distributional regularities using two novel techniques: a Bayesian ideal learner
analysis, and a simple distributional measure. Second, we model higher-order temporal
structure (regularities arising in an ordered series of syllable timings), testing the
hypothesis that non-adjacent temporal structures may explain the gap between
subjectively perceived temporal regularities and the absence of universally accepted
lower-order objective measures. Together, our analyses provide limited evidence for
predictability at different time scales, though higher-order predictability is difficult to
reliably infer. We conclude that temporal predictability in speech may well arise from
a combination of individually weak perceptual cues at multiple structural levels, but is
challenging to pinpoint.
Additional information
https://www.frontiersin.org/article/10.3389/fnhum.2016.00586/full#supplementary-material