Tag Archives: Methods
Lexicography in Natural Language Processing

Orin Hargraves – Independent Scholar
Course time: Tuesday/Thursday 9:00-10:50 am
2325 Mason Hall

See Course Description

Determining what words mean is the core skill and practice of lexicography. Determining what words mean is also a central challenge in natural language processing (NLP), where it is usually classed under the exercise of word sense disambiguation (WSD). Until the late 20th century, lexicography was dominated by scholars with backgrounds in philosophy, literature, and other humanistic disciplines, and the writing of dictionaries was based strongly on intuition, and only secondarily on induction from the study of examples of usage. Linguistics, in this same period, established itself as a discipline with strong scientific credentials. With the development of corpora and other computational tools for processing text, dictionary makers recognized first the value, and soon the indispensability, of using evidence-based data to develop dictionary definitions, and this brought them increasingly into contact with computational linguists. The developers of computational linguistic tools and resources eventually turned their attention back to the dictionary and found that it was a document that could be exploited for use in the newly emerging fields of linguistic inquiry that computation made possible: NLP, artificial intelligence, machine learning, and machine translation. This course will explore the computational tools that lexicographers use today to write dictionaries, and the ways in which computational linguists use dictionaries in their pursuits. The aim is to give students an appreciation of the unexploited opportunities that dictionary databases offer to NLP, and of the challenges that stand in the way of their exploitation. Students will have an opportunity to explore the ways in which dictionaries may aid or hinder automatic WSD, and they will be encouraged to develop their own models for the use of dictionary databases in NLP.

Students must have native-speaker fluency in English. Thorough knowledge of English grammar and morphology is an advantage, as is knowledge of the rudiments of NLP.


Modeling and Measuring Inflectional Paradigms

Andrew Hippisley – University of Kentucky
Greg Stump – University of Kentucky
Raphael Finkel – University of Kentucky
Course time: Tuesday/Thursday 9:00-10:50 am
2333 Mason Hall

See Course Description

The emergence of inferential-realizational approaches to inflection has led to a dramatic reversal of a perspective on morphology that dominated twentieth-century grammatical theory, where inflectional paradigms were regarded as an epiphenomenon of the combinatory properties of inflectional morphemes and were accorded no theoretical importance. The new perspective suggests that paradigms are essential to the definition of a language’s inflectional morphology and that they constitute a significant domain of measurable typological variation. The purpose of this course is to investigate both the universal principles of paradigm structure and the dimensions and degrees of cross-linguistic variation in paradigm structure. Central to our method is the use of computational resources for the formal modeling and typological measurement of inflectional paradigms. We begin by examining inferential-realizational theories of inflection and their place in the broader theoretical landscape. Numerous considerations decisively favor the inferential-realizational approach. We exemplify this approach with Paradigm Function Morphology, a precise system of universal principles for the definition of inflectional systems. We then consider two different approaches to modeling paradigm realization in inferential-realizational theories: the exponence-based approach, computationally illustrated through Network Morphology, and the implicative approach, computationally illustrated by the Principal-Parts Analyzer. The two approaches are then contrasted in the way they account for inflectional classes: for the exponence-based account we introduce the concept of a default inheritance hierarchy, and for the implicative account the notion of principal parts. We move on to look at the diversity of paradigm structures, treating it as various departures from a canonical norm. Two kinds of phenomena responsible for paradigm structure variation are syncretism and deponency, both covered in some detail. Further variation is identified by considering the predictability of cells, and we consider the implicative structure of paradigms. We go on to relate this concept to the property of inflectional complexity, a point of comparison between languages’ morphological systems that lends itself to a typological treatment. Throughout the course, practical hands-on computational sessions will supplement and illustrate the theoretical points made. An introductory linguistics course is strongly advised, and knowledge of morphology is desirable.