Neighborhood Density and Frequency Across Languages and Modalities
This research exploits the English and Dutch CELEX lexical database to investigate the form similarity relations between words. Lexical statistics analyses replicate and extend the findings of Landauer and Streeter (1973) concerning the relation between a word's frequency and the density and frequency of its similarity neighborhood. The results for both Dutch and English reveal only a weak tendency for high-frequency written and spoken words to have more neighbors than rare words, and for these neighbors to be more frequent than those of rare words. The number of neighbors, however, was found to correlate more highly with bigram frequency than with word frequency. To clarify the relations between these properties, a stochastic model is presented that captures the relevant effects of phonotactic structure on neighborhood similarities. The implications of these findings for models of language production and comprehension are considered.
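As an illustration of the quantity at issue, neighborhood density can be sketched in a few lines of Python. This is a minimal sketch, not the procedure used in the study: it assumes that a neighbor is a word of the same length differing in exactly one position (a substitution-only definition), and the function name `neighborhood_density` and the toy lexicon are hypothetical, not drawn from CELEX.

```python
from collections import defaultdict


def neighborhood_density(lexicon):
    """Map each word to its number of one-substitution neighbors.

    Assumption: two words are neighbors if they have the same length
    and differ in exactly one position. Other definitions (allowing
    additions or deletions) would need a different key scheme.
    """
    words = set(lexicon)

    # Index every word under each (prefix, suffix) pair obtained by
    # blanking out one position. Words sharing a key differ only there.
    buckets = defaultdict(set)
    for word in words:
        for i in range(len(word)):
            buckets[(word[:i], word[i + 1:])].add(word)

    density = {}
    for word in words:
        neighbors = set()
        for i in range(len(word)):
            neighbors |= buckets[(word[:i], word[i + 1:])]
        neighbors.discard(word)  # a word is not its own neighbor
        density[word] = len(neighbors)
    return density


# Toy lexicon, invented for illustration only.
toy_lexicon = ["cat", "cot", "cut", "hat", "dog", "dot"]
density = neighborhood_density(toy_lexicon)
```

In this toy lexicon, "cat" has three neighbors ("cot", "cut", "hat") while "hat" has one ("cat"). Correlating such counts with word frequency or bigram frequency would additionally require frequency counts, which are not shown here.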