Predicting the dative alternation
Theoretical linguists have traditionally relied on linguistic intuitions such as grammaticality
judgments for their data. But the massive growth of computer-readable texts and recordings, the availability of cheaper, more powerful computers and software,
and the development of new probabilistic models for language have now made the spontaneous use of language in natural settings a rich and easily accessible alternative
source of data. Surprisingly, many linguists believe that such ‘usage data’ are irrelevant to the
theory of grammar. Four problems are repeatedly brought up in the critiques of usage data:
1. correlated factors seeming to support reductive theories,
2. pooled data invalidating grammatical inference,
3. syntactic choices reducing to lexical biases, and
4. cross-corpus differences undermining corpus studies.
Presenting a case study of work on the English dative alternation, we show first, that linguistic intuitions of grammaticality are deeply flawed and seriously underestimate
the space of grammatical possibility, and second, that the four problems in the critique of usage data are empirical issues that can be resolved by using modern
statistical theory and modelling strategies widely used in other fields.
The new models allow linguistic theory to solve more difficult problems than it has in the past, and to build convergent projects with psychology, computer science,
and allied fields of cognitive science.