Language Acquisition as the Detection, Memorization, and Reproduction of Statistical Regularities in Perceived Language

Reinhard Rapp
In this article we investigate the hypothesis that language learning is based on detecting and memorizing particular statistical regularities in perceived language, and that these regularities are reproduced during language production. We give an overview of the regularities for which we have been able to demonstrate this behaviour. We find that not only statistics of order zero (frequencies) and order one (co-occurrences) are important, but also statistics of higher order. For several types of statistics we present simulation results and evaluate them quantitatively by comparing them with experimental data obtained from test subjects.

Key words: language acquisition, corpus statistics, unsupervised learning, associationism