Semantic Modeling of Collocations for Lexicographic Purposes

Volume 16
Issue 3
Lothar Lemnitzer & Alexander Geyken
The study presented in this paper investigates and models lexical-semantic properties of collocations. Pairs of words that co-occur with statistical salience are extracted automatically from a large German corpus with the help of the “Wortprofil”, a sketch-engine-like application. From these sets of co-occurring words, collocations in the narrow sense are selected manually. With these data, we address the following research questions: a) concerning the collocates: can we classify them into lexical-semantic classes and group them accordingly? b) concerning the bases: can we find significant numbers of shared collocates for lexical-semantically related bases and thus arrive at some form of generalization and regular patterns? In our study we apply the Meaning-Text Theory of Mel’čuk, more precisely, the concept of Lexical Functions (LF). The idea of employing LFs for lexicographic work on collocations is not new. However, the combination of LFs with semantic wordnets for abstracting over individual bases (addressing question b above) is innovative: to the best of our knowledge, it has not yet been used for modeling a larger subset of collocations in any language. In the study reported here, we have focused on a set of lexical items and their collocations in order to test the appropriateness of Lexical Functions, to model the phenomena, and to generalize collocational patterns from the intersection of collocates of related base words. A practical goal of our work is to gain a clearer view of how to use lexical-semantic features for the encoding of collocations in semasiological dictionaries such as the Digitales Wörterbuch der Deutschen Sprache (DWDS). A further goal, and a contribution to lexicological theory, is to better understand the interdependence between regularity and arbitrariness of lexical choice.
While the arbitrariness of lexical choice makes collocations hard for second-language learners to acquire, we assume that there are some (sub-)regularities, at least within groups of semantically related headwords. Applied properly to the task of language learning, such regularities should facilitate the acquisition of this part of the vocabulary.

Keywords: Collocation, Lexical Function, Corpus Linguistics, Lexical Semantics