Jason Eisner – Johns Hopkins University
Course time: Tuesday/Thursday 1:30–3:20 pm, and Friday, June 28, 1:00–5:00 pm
Location: 1401 Mason Hall
This class presents fundamental methods of computational linguistics. We will develop probabilistic models to describe what structures are likely in a language. After estimating the parameters of such models, it is possible to recover underlying structure from surface observations. We will examine algorithms to accomplish these tasks.
Specifically, we will focus on modeling:
- trees (via probabilistic context-free grammars and their relatives)
- sequences (via n-gram models, hidden Markov models, and other probabilistic finite-state processes)
- bags of words (via topic models)
- lexicons (via hierarchical generative models)
We will also survey a range of current tasks in applied natural language processing. Many of these tasks can be addressed with the techniques covered in class.
Some previous exposure to probability and programming may be helpful. However, probabilistic modeling techniques will be carefully introduced, and programming expertise will not be required. We will use a very high-level language (Dyna) to describe algorithms and visualize their execution.
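As a small taste, here is how the inside (probabilistic CKY) algorithm for a probabilistic context-free grammar can be written as Dyna rules. This is a sketch in the style of published Dyna examples; the item names `word`, `rewrite`, `phrase`, and `goal` are illustrative, and exact syntax varies across Dyna versions:

```
% Inside algorithm for a PCFG, written as Dyna rules.
% word(W,I,J)     -- word W occupies the input span from position I to J.
% rewrite(X,...)  -- probability of the grammar rule X -> ... (given as input).
% phrase(X,I,J)   -- inside probability that nonterminal X derives the span I..J.
phrase(X,I,J) += rewrite(X,W) * word(W,I,J).
phrase(X,I,J) += rewrite(X,Y,Z) * phrase(Y,I,K) * phrase(Z,K,J).
goal += phrase(s,0,N) whenever length(N).   % probability of the whole sentence
```

Each `+=` rule says that an item's value accumulates contributions from all ways of deriving it; the Dyna runtime decides the order in which to fill in the parse chart, so these three rules are the entire algorithm.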
Useful related courses include Machine Learning, Python 3 for Linguists, Corpus-based Linguistic Research, and Computational Psycholinguistics.