The relation called Agree plays a very prominent role in Minimalist theory: it underlies phi-feature agreement, Case valuation, and syntactic movement. This course will explore in detail some of the rich morpho-syntactic phenomena connected with Agree and their implications for syntactic theory and Universal Grammar. Bantu languages will provide much but not all of the empirical content, which will also draw on English, German, Icelandic, and other languages TBA. Topics will likely include various inversion constructions, complementizer agreement, (transitive) expletive constructions, concord phenomena, Feature Inheritance theory, and issues in structural and inherent Case.
This course will assume familiarity with Minimalist syntactic theory.
How can speech sounds be described in terms of their articulations, so that not only contrasts but also small phonetic differences can be understood? This course will cover selected topics related to the articulation of speech sounds, probably including: the articulatory framework of the IPA; articulatory descriptions of languages, such as in Ladefoged and Maddieson’s The Sounds of the World’s Languages (1996); aerodynamic data and modeling for different sound types; phonation types, including high-speed imaging of the glottis, electroglottography, and acoustic analysis; articulatory strengthening and prosodic structure; coarticulation.
Articulatory Phonology (AP) is a view of the sound structure of a language that tries to account both for its abstract aspect (contrast and pattern) and for its physical realization in speech production and perception. And it does so without assuming a dualistic, mind-body-style distinction between phonology and phonetics. The key aspect of AP that allows it to be non-dualistic is a dynamical framework that allows for a principled (non-arbitrary) relation between symbolic/discrete entities and continuous motion. This course will start by introducing students to the dynamical framework of task dynamics, and to how contrasts and patterns are expressed in this framework. Students will also be introduced to TaDa, a computational engine allowing for the derivation of continuous motion of articulators and formants from an utterance described in terms of overlapped elementary contrasts, expressed as gestures. Segmental and prosodic phonology will be discussed in combination with each other throughout the course.
Research on beliefs about language and reactions to it has gone beyond interest in such matters for their own sake, and researchers have used internal, classificatory mechanisms related to attitudes and beliefs to explain both the deployment of linguistic resources and the paths of language change. This course will examine historical and current trends in the study of attitudes and ideologies with reference to their role in more structured accounts of language variation and change. We begin with Hymesian ethnographic studies and social psychological approaches to attitude as developed by Lambert et al. Early uses of ideology and attitude in variationist studies will also be noted, and the continuation of the Hymesian tradition by linguistic anthropologists will be discussed. The course next elaborates on two recent turns — indexicality, as developed by Silverstein, and accounts of variability in linguistic theory, as suggested in attempts to build variable OT representations and the attaching of sociocultural information to forms in exemplar theory. The course also evaluates trends in both discoursal and experimental investigations. In the first, we look at content analyses, at linguistic anthropologists’ use of interaction in extracting ideologies from actions, and at more recent attempts to link attitudinal and ideological content to form in critical discourse analysis as well as proposals to link form and attitude by means of pragmatic analyses. Finally, we investigate task-based and experimental procedures in identifying and interpreting attitudes and ideologies, ranging from overt tasks such as those used in work on perceptual dialectology, including very recent uses of georeferencing techniques, to matched-guise and experimental response settings that seek to expose respondents’ unconscious reactions.
We will look carefully at the design of experiments that relate attitudinal and ideological factors to structural elements, including implicit-measure techniques developed in social psychology. We conclude with an overview of the cognitive foundations of attitudinal and ideological processing, touching on acquisition, change, and deployment.
Bilingual mixed languages are the result of the fusion of two identifiable source languages, normally in situations of community bilingualism. As recently as the 1990s, the existence of these languages was often denied, or they were dismissed as cases of code-switching, adstrate influence or borrowing (see e.g. Greenberg 1999). Nonetheless they were brought to the attention of contact linguistics by Thomason and Kaufman (1988) as a legitimate form of contact language. Since then a number of edited volumes, papers and monographs have drawn together substantial amounts of data from various languages which have been identified as being ‘mixed’. This course focuses on a number of these languages including Angloromani (England), Bilingual Navajo (US), Gurindji Kriol (Australia), Helsinki Slang (Finland), Light Warlpiri (Australia), Ma’á (Tanzania), Media Lengua (Ecuador), Mednyj Aleut (Bering Strait) and Michif (Canada). Topics to be covered in this course include the socio-historical and structural origins and features of mixed languages; linguistic innovation and continuity in mixed languages; and the relationship of mixed languages to other forms of language contact such as code-switching, borrowing, metatypy and creolisation. A number of current issues will also be covered, including whether mixed language phonologies are stratified; whether mixed languages can be considered autonomous language systems; and how to characterise variation in mixed languages.
Languages change continuously, in part because their speakers also use other languages: this is language contact. I will discuss different ways of studying language contact from the perspective of stability. Which aspects of language remain stable, and under which circumstances is there stability? Which methodologies can be used to study stability, and can they reinforce each other? How does stability relate to borrowability? Language contact can be studied at different levels of time depth and geographical scope:
* deep time contacts involving large areas, such as the Circum-Pacific or Eurasia
* historical time contacts involving countries and single languages, such as the history of English in Great Britain
* recent time contacts involving bilingual speech communities, such as the Puerto Rican community in New York
* instant time contacts in experimental settings with cross-linguistic priming of multilingual speakers
These different levels have yielded different and sometimes apparently contradictory results. Some deep time and instant time studies have suggested much less stability than historical and recent time studies. Are these contrasts real or an artifact of the particular study? How could they be explained? I will focus on recent results from our research with deep time relations in the Amazon region (time depth at least 5000 years), historical time depth relations in the Republic of Surinam (time depth about 500 years), studies on heritage languages in the Netherlands (time depth about 50 years), and priming experiments with Turkish-Dutch and Papiamentu-Dutch bilinguals (very limited time depth).
In this course, we’ll explore the phenomenon of codeswitching, both for its own sake and for what it can tell us about language in general. Overall, codeswitching and contact-induced change will be confronted with the usage-based approach to linguistic competence, providing what could be called a usage-based contact linguistics.
Codeswitching is the use of overt material from two or more different languages. It is very common in the speech of bi- and multilinguals the world over, and has attracted the attention of all kinds of linguists, from different sub-branches and different theoretical persuasions. The class will provide a brief historical overview, in order to map the various contributions to the understanding of the phenomenon, and assess to what degree the study of codeswitching is a coherent field (or not).
The usage-based approach will be shown to be relevant for the study of codeswitching and contact-induced language change because it has the potential to unite various strands of CS research that have so far had little contact with one another, locked as they are in different corners of the discipline and studied through different theoretical lenses. To do this, it will be necessary to study codeswitching in conjunction with other contact phenomena, primarily loan translation and grammatical interference, and to do so on both synchronic and diachronic planes. Traditionally, codeswitching has been viewed from a self-contained, purely synchronic point of view; the course will explore to what degree a diachronic perspective can enrich both the account of the phenomenon itself and its embedding in general linguistics. Seen this way, the study of codeswitching opens new windows on the essence of language.
This course aims to introduce students to research on comparative syntax. It is directed at students interested in a more thorough understanding of the common properties of the syntax of human languages and of the possible variation in their structure.
Human languages have strikingly similar structural features, but at the same time they also vary in significant respects. Many advances in our understanding of human language have resulted from the individual and comparative analysis of distinct languages. Their similarities and differences can be explored from cognitive, formal, theoretical and typological perspectives. This course focuses on a generative perspective on comparative syntax, while also taking into account insights from linguistic typology. It investigates approaches aiming to explain both common properties and boundaries of variation across languages. Some of the questions that arise in this context are: what structural principles are common across different human languages? What kind of variation can we find across human languages? What parameters or alternative mechanisms determine the range of this variation? How can this variation be analyzed and understood in a precise way? What mechanisms give rise to this sort of cross-linguistic variation over time?
The course focus will be: (i) to introduce students to a generative approach to syntactic variation across languages, by discussing aspects of variation that have received prominent attention in the linguistics literature (e.g. word order variation regarding verb movement, wh-questions, empty categories); (ii) to explore extensions to different approaches to cross-linguistic variation (e.g. variation in clause structure and word order, and across case systems); (iii) to consider potential difficulties and limitations to unifying approaches to syntactic variation (e.g. non-configurational languages).
Students in this course should have taken an introductory undergraduate course in syntax or semantics.
Decades of empirical research have led to an increasingly nuanced picture of the nature of phonetic and phonological change, incorporating insights from speech production and perception, cognitive biases, and social factors. However, there remains a significant gap between observed patterns and proposed mechanisms, in part due to the difficulty of conducting the type of controlled studies necessary to test hypotheses about historical change. Computational and mathematical models provide an alternative means by which such hypotheses can be fruitfully explored. With an eye towards Box’s dictum (all models are wrong, but some are useful), this course asks: how can computational models be useful for understanding why phonetic and phonological change occurs? Students will study the growing and varied literature on computational and mathematical modeling of sound change that has emerged over the past decade and a half, including models of phonetic change in individuals over the lifespan, phonological change in speech communities in historical time, and lexical diffusion. Discussion topics will include the strengths and weaknesses of different approaches (e.g. simulation-based vs. mathematical models); identifying which modeling frameworks are best suited for particular types of research questions; and methodological considerations in modeling phonetic and phonological change. For this course, some background in probability theory, single-variable calculus, and/or linear algebra is helpful but not required.
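The simulation-based approach mentioned above can be sketched in a few lines. The following toy model iterates transmission of a mean phonetic target (say, a VOT value) across generations; a small constant production bias accumulates into directed change. All parameter values are hypothetical, and the bias mechanism is chosen purely for illustration.

```python
import random

def produce(target, noise_sd=1.0, bias=0.3):
    """One production token: Gaussian articulatory noise plus a small
    constant channel bias (all values hypothetical)."""
    return random.gauss(target, noise_sd) + bias

def transmit(generations=50, start=20.0, tokens=100):
    """Iterated transmission: each generation's target is the mean of
    the tokens it hears from the previous generation."""
    target = start
    history = [target]
    for _ in range(generations):
        heard = [produce(target) for _ in range(tokens)]
        target = sum(heard) / len(heard)  # learner averages its input
        history.append(target)
    return history

history = transmit()
# The constant positive bias accumulates: the target drifts upward by
# roughly 0.3 per generation, a minimal picture of directed change.
```

Replacing the constant bias with, say, a perceptual-compensation term or a frequency-dependent one is how such models begin to address why change starts in some settings but not others.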
This course examines cognitive models of human sentence comprehension. Such models are programs that express psycholinguistic theories of how people unconsciously put together words and phrases in order to make sense of what they hear (or read). They hold out the promise of rigorously connecting behavioral measurements to broader theories, for instance theories of natural language syntax or cognitive architecture. The course brings students up to speed on the role of computer models in cognitive science generally, and situates the topic in relation to neighboring fields such as psychology and generative grammar. Students master several different viewpoints on what it might mean to “attach” a piece of phrase structure. Attendees will get familiar with notions of experience, probability and information theory as candidate explanations of human sentence processing difficulty. This course has no prerequisites although exposure to artificial intelligence, generative grammar and cognitive psychology will help deepen the experience.
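As a concrete taste of the information-theoretic notions involved, surprisal (a standard candidate predictor of word-by-word processing difficulty) can be estimated from counts. The toy corpus and bigram maximum-likelihood estimates below are illustrative only.

```python
import math
from collections import Counter

# Toy corpus; real surprisal estimates come from large corpora or
# trained language models.
corpus = "the dog ran . the cat ran . the dog slept .".split()

bigrams = Counter(zip(corpus, corpus[1:]))
unigrams = Counter(corpus)

def surprisal(prev, word):
    """Surprisal in bits: -log2 P(word | prev), bigram MLE."""
    return -math.log2(bigrams[(prev, word)] / unigrams[prev])

# "dog" follows "the" more often here than "cat" does, so it is more
# predictable and carries lower surprisal:
assert surprisal("the", "dog") < surprisal("the", "cat")
```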
Big, fast, cheap, computers; ubiquitous digital networks; huge and
growing archives of text and speech; good and improving algorithms for
automatic analysis of text and speech: all of this creates a
cornucopia of research opportunities, at every level of linguistic
analysis from phonetics to pragmatics. This course will survey the
history and prospects of corpus-based research on speech, language,
and communication, in the context of class participation in a series
of representative projects. Programming ability, though helpful, is
not required.
This course will cover:
* How to find or create resources for empirical research in linguistics
* How to turn abstract issues in linguistic theory into concrete
questions about linguistic data
* Problems of task definition and inter-annotator agreement
* Exploratory data analysis versus hypothesis testing
* Programs and programming: practical methods for searching,
classifying, counting, and measuring
* A survey of relevant machine-learning algorithms and applications
We will explore these topics through a series of empirical research
exercises, some planned in advance and some developed in response to
the interests of participants.
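A minimal sketch of the kind of searching, counting, and measuring involved (the toy text and the crude regex tokenizer are stand-ins for real corpus resources and for a considered task definition):

```python
import re
from collections import Counter

# A toy "corpus"; a real project would read files from a text archive.
text = """The cat sat on the mat. The dog, however, sat on the cat.
A cat and a dog can be friends."""

# Searching: a regex tokenizer (one of many possible task definitions;
# tokenization choices affect every count downstream).
tokens = re.findall(r"[a-z]+", text.lower())

# Counting: token frequencies per word type.
freqs = Counter(tokens)

# Measuring: type/token ratio as a rough lexical-diversity measure.
ttr = len(freqs) / len(tokens)
print(freqs.most_common(2), round(ttr, 2))  # → [('the', 4), ('cat', 3)] 0.57
```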
This course will cover a number of issues in contemporary Minimalist Theory and analysis. We will discuss why Minimalism, with its commitment to explanation, not mere description or just “data coverage,” accords with the standard goals of scientific theorizing. The question of which properties of human grammars are specifically linguistic and which might follow from more general laws (third-factor explanations) will be discussed in this context as well. We will cover many aspects of Chomsky’s most recent work, and our own lines of research concerning this framework of inquiry, including: the fundamental properties of derivations; the nature of computational efficiency; representations; Bare Output Conditions; the operations Merge, Agree, Labelling; constraints like the No Tampering Condition; the primacy of CI; Feature Inheritance and set intersected representation in Bare Phrase Structure (multi-dominance).
Ideally the student will already have two courses in syntax, will know the mechanics of basic Minimalist analysis, and will have a strong interest in the goals of minimalist method, specifically the quest for explanation.
This course explores the relationship between the English language and ethnicity in the United States by merging anthropological understandings of race and ethnicity with sociolinguistic methods of description and analysis. In doing so, it introduces students to both traditional and current models of language and ethno-racial identity. Specifically, the course explores sociolinguistic assumptions that may equate “race/ethnicity” with “non-whiteness,” that overlook the inherent relationships between racial categories, and that treat race as an isolatable dimension. It will also question conceptions of ethno-racial language as an objective set of features by considering how language is a sociocultural set of practices and resources that produce meanings, identities, and ideologies.
The course will introduce students to a range of ethnolectal models that have been traditionally adopted as well as the problems and politics inherent in them. In particular, it will explore sites across the United States that complicate traditional models, including communities in which groups defy easy categorization in a black-white racial paradigm, cases in which speakers use features associated with racial outgroups, and speakers who simultaneously index gendered, classed, and racialized meanings. The course will additionally emphasize the real-world relevance of studying language and race, namely by considering racist and anti-racist language practices in institutional and media contexts.
The emerging field of experimental pragmatics combines an interest in the theoretical complexities of language use with the experimental methodologies of psycholinguistics. This course will present a broad survey of recent work in this area that has attempted to apply the methods of experimental psychology to classic issues in theoretical pragmatics. Each class session will include both theoretical and experimental readings on topics such as reference, information structure, implicature, and speech acts. These topics wrestle with the relationship between the sentence, as an abstract object with phonological, syntactic, and semantic properties assigned by the grammar of the language, and the utterance, as the concrete realization of that sentence with properties inherited from consideration of the discourse situation. The class will also focus on a number of experimental and analytical methodologies that have been used to investigate these topics, including reaction time studies, eyetracking, and corpus analysis. In general, the course will be organized primarily around discussion of the assigned readings, and students will have the opportunity to develop a research proposal relevant to issues in language use. No specific background in or familiarity with particular experimental methods or approaches is required.
This course is an introduction to linguistic field methods. We will work with a speaker of a language that none of us know, endeavoring to discover as much as possible about the structure of the language, at all levels – phonetic, phonological, morphological, syntactic, semantic – through a combination of structured questioning and working with texts that we will record from the speaker. The emphasis will be on how to discover the systematicity of an unknown language on its own terms.
Prerequisite: Background in linguistics. Students should be able to transcribe and to carry out morphological and syntactic analysis.
In the past ten years, the study of hand gestures has become an established area of investigation in different disciplines. This course will provide an introduction to theoretical and methodological issues in manual gesture research. The course will provide a solid foundation for further research into the phenomenon by the course participants. We will explore the role of manual gesture in language, culture and cognition and provide hands-on training in methods in gesture research. The basic functions of gesture in communication, its interaction with speech in the creation of meaning, as well as its role in cognition will be introduced. One focus will be how to document gesture in actual language use during fieldwork. In the practical component, participants will learn how to record gesture data in naturalistic as well as in experimental settings. In addition, the course will provide the opportunity to learn how to annotate and code gesture with available software. Participants are encouraged to bring their own recordings for annotation and analysis. Some familiarity with general linguistics is presumed.
This course provides a very basic introduction to core topics in historical linguistics, appropriate for beginning graduate students or advanced undergraduates who have not taken a previous course on the subject. The following topics will be surveyed: patterns and causes of phonological change (week 1), morphological change (week 2), and syntactic and semantic change (week 3); and methods of reconstruction, determining relatedness and subgrouping, and patterns of diversification (week 4).
One of the great mysteries of linguistics is the so-called actuation problem, first articulated in Weinreich, Labov, and Herzog 1968, and still largely unanswered to this day. The question is: what causes the inception of a language change, if the linguistic conditions favoring particular changes are always present? Previous studies on sound change have mainly focused on group effects, that is, effects observed in a population as a whole. Recent work has drawn on interspeaker variation for a solution to the actuation puzzle. The main impetus for considering individual differences in the context of sound change comes from the need to build a linking theory that bridges the gap between the emergence of new linguistic variants and their eventual propagation.
This course will explore sources of individual linguistic differences, and the role they may play in the initiation and propagation of sound change. Idiosyncratic variation provides an opportunity to understand the limits and flexibility of the human capacity for language, and to better understand the observed properties of natural languages, which are systems that must be shared by individuals who differ from each other in important ways. We will focus on three types of individual-level factors that have been implicated in language variation and change, namely covert linguistic/phonetic differences (e.g., differences in lexicon, articulation, and cue weighting), social-attitudinal matters, and neuro-cognitive factors.
Students enrolling in this course should have at least one course in phonetics and/or phonology.
This course explores the meanings of linguistic descriptions of beliefs, dreams, hopes, desires, the past, the future, and what might have been. For instance, while extensional semantics might detail what the world is like when the sentence “it’s raining” is true, intensional semantics asks what the world is like when “Mary thinks it’s raining” is true. (Hint: it doesn’t have to be raining.)
Topics will include tense (“it’s raining” vs. “it rained”), aspect (“it’s raining” vs. “it rains”), modal statements (“it might rain”), propositional attitudes (“I wish it would rain”), conditionals (“If it rains, I’ll get wet”), and the de re / de dicto distinction – the reason why one can say “Mary thinks someone in this room is outside” even if Mary is not crazy.
An introductory theoretical semantics class will be assumed, but aspiring students may study a textbook such as Heim & Kratzer’s Semantics in Generative Grammar (Blackwell, 1998) on their own or enroll concurrently in the introductory semantics class.
This class presents fundamental methods of computational linguistics. We will develop probabilistic models to describe what structures are likely in a language. After estimating the parameters of such models, it is possible to recover underlying structure from surface observations. We will examine algorithms to accomplish these tasks.
Specifically, we will focus on modeling:
* trees (via probabilistic context-free grammars and their relatives)
* sequences (via n-gram models, hidden Markov models, and other probabilistic finite-state processes)
* bags of words (via topic models)
* lexicons (via hierarchical generative models)
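To make “recovering underlying structure from surface observations” concrete for sequence models, here is a minimal Viterbi decoder over a toy hidden Markov model; the states, words, and probabilities are invented for illustration.

```python
import math

# Toy HMM for part-of-speech tagging; all probabilities are invented.
states = ["DET", "NOUN", "VERB"]
start = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
trans = {
    "DET":  {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
emit = {
    "DET":  {"the": 0.9, "bird": 0.0, "flies": 0.0},
    "NOUN": {"the": 0.0, "bird": 0.6, "flies": 0.4},
    "VERB": {"the": 0.0, "bird": 0.1, "flies": 0.9},
}

def lp(p):
    """Log probability, with log 0 = -inf."""
    return math.log(p) if p > 0 else float("-inf")

def viterbi(words):
    """Most probable hidden state sequence for the observed words."""
    # best[s] = (log prob of the best path ending in state s, that path)
    best = {s: (lp(start[s]) + lp(emit[s][words[0]]), [s]) for s in states}
    for w in words[1:]:
        best = {
            s: max((best[r][0] + lp(trans[r][s]) + lp(emit[s][w]),
                    best[r][1] + [s]) for r in states)
            for s in states
        }
    return max(best.values())[1]

print(viterbi(["the", "bird", "flies"]))  # → ['DET', 'NOUN', 'VERB']
```

The ambiguous word “flies” is resolved by context: the same dynamic-programming idea scales to the parameter estimation and decoding tasks the course covers.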
We will also survey a range of current tasks in applied natural language processing. Many of these tasks can be addressed with techniques from the class.
Some previous exposure to probability and programming may be helpful. However, probabilistic modeling techniques will be carefully introduced, and programming expertise will not be required. We will use a very high-level language (Dyna) to describe algorithms and visualize their execution.
Useful related courses include Machine Learning, Python 3 for Linguists, Corpus-based Linguistic Research, and Computational Psycholinguistics.
This course will be built around novel analytic techniques made available with the adoption of minimalist assumptions. As novelty is best appreciated against the background of what is conventional, there will be some retrospective glances, with an eye to understanding both what is new and what is continuous with earlier approaches. To ground the discussion empirically, we will concentrate on the following “hot” areas:
Control and Binding
Multiple Interrogation and Superiority
Multiple Spell-Out, Cyclicity, Islands and Ellipsis
The main idea will be to introduce the central concepts of Minimalism in the context of analyses of these kinds of phenomena. The minimalist concepts we will discuss include:
Bare phrase structure, labels
Merge, Internal and External
Economy, Merge over Move
Relativized Minimality, minimal-domains, Minimal Link Condition
Extension, Virus Theory
Bare Output Conditions
Last Resort and Greed
Features, Interpretability, Valuation
Agree, Probes, Goals
Copy Theory and Reconstruction
Topic by Topic:
How To Build A Simple Sentence: For the beginning of classes read Chapter 3 of The Minimalist Program, and Bare Phrase Structure
Case, minimality and minimal domains, S-structure
X’-theory, Bare Phrase structure, Generalized Transformations
For the following topics readings will be added as we move along.
How To build a Complex sentence: Raising, Passive, Wh-movement
Greed, various kinds
Probes and Goals
Extension and Virus Theory
Control and Raising
Features, Greed, EPP
Bare Phrase structure and the status of PRO
Parasitic Gaps and Adjunct Control
Extension and Virus theory
Probes/Goals and Greed
Binding and Reconstruction
Copies, LF and PF
Principles A, B, C
Binding and movement
Locality and spell out/phases
Attract vs Move
Virus Theory and Extension
Agree and Move
Binding and features
Merge over Move
By this time we will hopefully have passed the semester equator (we are planning about ten weeks for those core topics). At this point we want to open the discussion to Phase Theory. Since this is a more current topic, we expect the course to evolve towards a more participatory, seminar-like environment, and we even plan to invite more senior graduate students to join in on the show. Be prepared to engage in a frank discussion of the topic.
The usual for this sort of class. We will have regular homeworks that we may even exchange among participants. We allow – in fact encourage – collective work on homeworks, so long as eventually every participant writes their own contribution and participation is explicitly acknowledged. We will expect a conference-style abstract by the middle of the semester with a concrete suggestion for a research topic. The requirements will end with a short squib, based on the abstract, which can be the basis for a future paper, hopefully to be submitted to a conference.
A survey of ways in which morphological structure influences and constrains phonological processes. The first half of the course will focus on cases in which phonological processes are sensitive to morphological constituency or the existence of morphologically related forms, including morpheme structure constraints, cyclicity effects in derived words, and paradigm effects (uniformity, antihomophony) in inflected forms. We will contrast two main approaches: cyclic evaluation of subparts of the word, and surface evaluation employing output-output correspondence constraints. We then turn to processes that affect some morphemes but not others, including lexically specific allomorphy and affix-specific processes, comparing representational approaches (diacritics, floating features) with dual-route approaches (grammar + memorized exceptions). A recurring theme throughout the course will be the predictions that different approaches make for acquisition and historical change; in addition, we will consider evidence from corpus studies and psycholinguistic experiments.
This course is an introduction to the internal structure of words and its relation to the structure of phrases and sentences. The topics covered will include examination of the primitives of word structure, isomorphism between syntactic and morphological structure and departures from such isomorphism, and the interplay between syntax and morphology in determining morpheme order. The course will draw on data from typologically diverse languages, and will use the tools of current morphological theory to analyze phenomena such as agreement, cliticization, and argument-structure changing morphology.
Requirements: Students must have had an introductory-level course in linguistics. Some previous experience in syntax is recommended.
This course concentrates on the neural machinery that underlies our ability to speak and understand language. Topics discussed include the brain bases of speech perception and reading, lexical processing, syntax, and semantics. We will draw on a range of state-of-the-art functional neuroimaging techniques, as well as the study of neurological and developmental language disorders. Special attention will be given to how models of linguistic computations and representations can inform, and be informed by, our understanding of the brain.
This course is an introduction to phonological theory, centering on the representations and the analysis of phonotactic patterns. We will cover theories of phonological representations, beginning with distinctive feature matrices as in The Sound Pattern of English (Chomsky & Halle 1968) and continuing through feature geometry (Clements 1985; Sagey 1986; McCarthy 1988) as well as theories of articulatorily and auditorily detailed representations (Browman & Goldstein 1986 et seq.; Gafos 1999; Flemming 2002). The analysis of phonotactic patterns such as dissimilatory co-occurrence restrictions, consonant-vowel interactions, and harmony patterns will be considered in detail, with attention paid both to the representational components of the analysis and to the structure of grammatical statements (e.g., the form of markedness constraints). The discussion will largely assume either Autosegmental Phonology (Goldsmith 1976) or Optimality Theory (Prince & Smolensky 1993/2004) as the formal framework. The course is not designed to provide a systematic introduction to either of these frameworks, though a brief introduction to each will be given. Prior knowledge of either framework is not required. A basic understanding of the phonetic properties of speech sounds will be assumed. There is no textbook for the class, though there will be readings for each class and the lecture notes will be made available. There will be several homework assignments, as well as in-class exercises to work through the course material.
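Since Optimality Theory is one of the frameworks assumed, a minimal sketch of OT evaluation may help orient newcomers: ranked constraints assign violation profiles, compared lexicographically, and the candidate with the best profile wins. The toy final-devoicing constraints and candidates below are invented for illustration.

```python
# Markedness: penalize a word-final voiced obstruent (toy definition
# over orthographic strings, standing in for featural representations).
def star_final_voiced(inp, out):
    return 1 if out[-1] in "bdgvz" else 0

# Faithfulness: penalize each segment changed from input to output.
def ident(inp, out):
    return sum(1 for i, o in zip(inp, out) if i != o)

def evaluate(inp, candidates, ranking):
    """Return the optimal candidate: lexicographic comparison of
    violation profiles implements strict constraint ranking."""
    return min(candidates, key=lambda cand: [c(inp, cand) for c in ranking])

# Markedness ranked over faithfulness derives final devoicing...
assert evaluate("bad", ["bad", "bat"], [star_final_voiced, ident]) == "bat"
# ...while the reverse ranking preserves the input form.
assert evaluate("bad", ["bad", "bat"], [ident, star_final_voiced]) == "bad"
```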
Language scientists attempt to answer three fundamental questions: 1. What does one know when one knows a language? 2. How does an individual access and use that knowledge when producing or understanding language? 3. How did we get this way? This course will focus on the first two of these three questions. Students will gain an appreciation for the kinds of theories that language scientists have developed to answer these questions as well as the research methods used to investigate them. The course will focus chiefly on comprehension issues, but we will also examine contemporary theories of speech production, such as Levelt and Roelofs’ WEAVER++ and Dell’s interactive account.
Students will review contemporary accounts of lexical, syntactic, and discourse processing. This review includes accounts of normal language function as well as the sequelae of brain damage and other forms of language dysfunction. Topics relating to lexical processing include theories of semantic representation, lexical access, and the neural basis of lexical representation and processing. Topics relating to syntactic processing include accounts of syntactic parsing, serial versus parallel processing approaches, and processing of unbounded dependencies. Topics relating to discourse processing include contemporary accounts of discourse representation, inferencing, and the neural basis of discourse processing and representations.
This course is an introduction to the study of meaning in language, with a focus on formal semantic theory but also touching on some issues in pragmatics and lexical semantics. We will first establish the core principles of formal semantic research, with a special focus on compositionality – how the meaning of sentences and phrases arises from the meanings of their parts. We will also discuss how every utterance involves several layers of meaning, including literal meaning, presupposition, and pragmatic implications. Then, we will look at many of the components that come together to create the meaning of sentences, showing that even seemingly simple and familiar sentences may contain surprising depths. Questions under discussion will include the semantics of modals, propositional attitudes, quantification, definiteness, and plurality.
There are no requirements for the course, but familiarity with basic set theory and logic will be helpful, as will familiarity with basic syntactic concepts such as constituency.
Linguistic theory aims to specify the range of grammars permitted by the human language faculty, and thereby to specify the child’s “hypothesis space” during language acquisition. This course shows, step by step, how to use acquisition data to test theoretical claims about grammatical variation. The text is the instructor’s book, Child Language: The Parametric Approach, published by Oxford University Press. The book covers a number of methodologies, but the course will focus on the analysis of longitudinal corpora of children’s spontaneous speech, and will cover methods of statistical hypothesis-testing that are appropriate for this type of data. The students in the course will each conduct an individual project using data from the Child Language Data Exchange System (CHILDES), which includes corpora for a range of languages. Students will learn how to use correlational analysis and distributional statistics to analyze group data, as well as non-distributional methods that are appropriate for use in single-child case-studies.
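The correlational analyses mentioned above can be illustrated with a few lines of Python; the age and mean-length-of-utterance (MLU) figures here are invented for illustration, not drawn from CHILDES:

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation between two equal-length samples."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Invented example: a child's age in months vs. mean length of utterance (MLU).
ages = [20, 24, 28, 32, 36]
mlus = [1.2, 1.6, 2.1, 2.5, 3.0]
r = pearson_r(ages, mlus)
```

In real single-child case studies, of course, significance testing and the non-distributional methods discussed in the course would follow this first descriptive step.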
Prerequisites: A decent grounding in syntax and/or phonology. Algebra-level mathematics. Basic computer skills in a Mac or PC environment.
Course Requirements: Enrolled students are required to attend regularly, participate actively in classroom discussion, complete an individual project using data from CHILDES, and present their findings at the final class meeting.
Because language contact is a fact of life for most of the world’s people and all of the world’s nations, it is hardly surprising that it often plays a major role in language change. This course will begin (Week 1) with a survey of historical, social, and political settings of language contact (when, where, and why do languages come into contact?) and with a consideration of this question: when two languages come into contact, is one of them doomed to vanish within a few decades? These background discussions will serve as an introduction to the main focus of the course: contact-induced language change. The main topics that will be covered in Week 2 are social and linguistic predictors of the effects of language contact (together with a discussion of why they can never be expected to yield deterministic predictions); the effects of contact-induced language change on the structure of the receiving language; and criteria for establishing contact as a cause of a language change (and how to react when the criteria can’t all be met). In Week 3 we will consider mechanisms of contact-induced change and linguistic areas as a special problem for the study of contact and change, and in Week 4 we’ll focus on mixed languages (pidgins, creoles, and bilingual mixed languages) and contact-induced changes in some (not all) dying languages.
“Language ideologies” are the conceptualizations people have about the languages, speakers, and discursive practices in their purview. Both embedded in practices and reflexive of them, language ideologies are pervaded with political and moral interests, and are shaped in a cultural setting. To study language ideologies is to explore the nexus of language, culture, and politics – to examine the representations, whether explicit or implicit, that construe language’s role in a social and cultural world and that are themselves acts within it. This course considers current topics and debates in the study of language ideologies, such as: what should we mean by “ideology”? What is ideological in conceptions of “language” itself? In what ways are language ideologies positioned, with respect to distributions of power and resources? What are the sites of language ideologies – the practices and scenes in which they are enacted (and revealed)? What is the role of language ideologies in organizing social identities, groups, boundaries, and activities? How do language ideologies influence linguistic and social change? We will consider these questions in the light of case materials representing a wide range of ethnographic, historical, and linguistic circumstances.
This course focuses on syntactic typology. We correlate syntactic types with
genetic and areal groupings of languages and certain morphological patterns. We present five major topics selected on a broad semantic basis. For each topic we present patterns of variation, review or suggest explanations for them, and assess how well they can be characterized in current syntactic theory. We close with a formal presentation of how to state language universals over syntactically non-isomorphic languages.
1. Word order types: Verb Final, Verb Initial, Serial Verb Languages
Hixkaryana: The odd man out?
2. A. Relative Clause Formation
Prenominal, postnominal, resumptive pronouns, finite/non-finite verbs
B. Valency Affecting Operations (VAOs)
Rich Voice Systems (W. Austronesian)
Rich Applicative Systems (Bantu)
Syntactic Role: VAOs feed extraction
3. Anaphora Patterns in the World’s Languages
Reflexives: templatic, affixal, clitics, full arguments
Reciprocals: templatic, affixal, clitics, full arguments
4. Quantification in the World’s Languages
Linguistic Invariants over non-isomorphic grammars
Pre-requisite for this course: an introductory linguistics course (with at least some basic syntax).
This course will explore research on the significance of natural language variation in shaping human thought. The first unit of the course provides essential background by introducing historical and conceptual perspectives on the relation of language and reality that continue to shape our understanding of language variation and by surveying early work in anthropology (Boas, Sapir, Whorf) and psychology (Brown, Lenneberg, Carroll) linking language variation to thought. Classic topics involving lexical forms denoting “color” and “snow” will be discussed critically. The second unit reviews and contrasts prominent contemporary approaches from within anthropological linguistics, including both structure-centered approaches (Lucy et al.) and domain-centered approaches (Levinson et al.), as well as several influential approaches within psychology (Slobin, Boroditsky). The discussions will highlight both continuities and innovations with respect to earlier work. The third unit will review recent research extending these approaches to new populations including the deaf, young children, bilinguals, etc. These approaches not only offer avenues to exploring underlying mechanisms but also open up ways of theorizing the centrality and trade-offs of relying on language in human thought. The final unit will explore variations in the cultural and institutional regimentation of language-thought relationships, first in the areas of standard language as promulgated through education and literacy, and then within the research enterprise itself in areas involving practical translation, including comparative linguistic research. Readings will be drawn from many fields but will emphasize classic works that take comparative, developmental, and critical approaches and provide a foundation for further research. Class time will be divided between general orienting lectures on theoretical issues and close discussion of key empirical works.
Determining what words mean is the core skill and practice of lexicography. Determining what words mean is also a central challenge in natural language processing (NLP), where it is usually classed under the exercise of word sense disambiguation (WSD). Until the late 20th century, lexicography was dominated by scholars with backgrounds in philosophy, literature, and other humanistic disciplines, and the writing of dictionaries was based strongly on intuition, and only secondarily on induction from the study of examples of usage. Linguistics, in this same period, established itself as a discipline with strong scientific credentials. With the development of corpora and other computational tools for processing text, dictionary makers recognized first the value, and soon the indispensability, of using evidence-based data to develop dictionary definitions, and this brought them increasingly into contact with computational linguists. The developers of computational linguistic tools and resources eventually turned their attention back to the dictionary and found that it was a document that could be exploited for use in the newly emerging fields of linguistic inquiry that computation made possible: NLP, artificial intelligence, machine learning, and machine translation. This course will explore the computational tools that lexicographers use today to write dictionaries, and the ways in which computational linguists use dictionaries in their pursuits. The aim is to give students an appreciation of the unexploited opportunities that dictionary databases offer to NLP, and of the challenges that stand in the way of their exploitation. Students will have an opportunity to explore the ways in which dictionaries may aid or hinder automatic WSD, and they will be encouraged to develop their own models for the use of dictionary databases in NLP.
Students must have native-speaker fluency in English. Thorough knowledge of English grammar and morphology is an advantage, as is knowledge of the rudiments of NLP.
In keeping with the theme of the Institute, Universality and Variation, this course addresses language diversity and language change, and how they interact with one another. It investigates broadly the following questions:
(1) How many language families are there in the world? How do we know?
(2) How many language isolates are there in the world, and how can we investigate their history?
(3) What are the prospects for finding new language classifications and thus of reducing the ultimate number of independent language families? How might controversial proposals of distant genetic relationship be resolved?
(4) How many of the existing languages are endangered? What are the implications of this for linguistic diversity and the classification of languages? Can endangered languages undergo changes that are not possible in fully viable non-endangered languages? What are their implications for historical linguistics generally?
(5) What implications does the discovery of unusual or unique linguistic traits in recent documentation of endangered languages have for how we view universals, linguistic typology, and aspects of language change?
(6) How do language contact and diffusion affect views of linguistic diversity?
(7) What is the relevance, if any, of human genetics, the farming/language dispersal hypothesis, and related matters to language classification and linguistic diversity?
Students at any level of preparation in linguistics are welcome to register for this course, although it will be clearest for students who have had at least a solid introduction to general linguistics and some familiarity with the basic concepts of phonology/phonetics, grammar, and historical linguistics.
Linguistics as a Forensic Science introduces students to the current state of the art in forensic linguistics. Students learn the legal standards that linguistic evidence must meet, how linguistic research has produced methods that meet these standards, as well as examples of methodological failure. Cases and rulings are discussed in the context of methodological issues for linguistics, and to demonstrate the seriousness of legal standards. Examined in detail are linguistic methods for author identification, text classification, intertextuality, and linguistic profiling. Most forensic linguistic methods attempt to identify, individuate, or classify texts, so texts are automatically seen as instances of either individual or group variation (i.e. the method must be able to categorize texts as belonging to different individuals, the method must be able to classify texts as belonging to a particular type of text, the method must be able to identify texts as coming from a person with a certain level of education or dialect, and so forth).
The paradigm which students learn in this course is one in which (1) universal principles provide methodological grounding for the analysis of variation, (2) texts are analyzed for the instantiation of syntactic and semantic properties, (3) the instantiations are quantified, (4) the quantifications are subjected to statistical analysis, (5) the statistical analysis is subjected to validation testing for error rates. This paradigm – known as computational forensic linguistics – poses several challenges to linguistics as a science, such as the choice of levels and units for linguistic analysis of forensic texts for specific tasks, the predictability of linguistic behavior, tools for analysis of variable linguistic behavior, and the model of language which is at once circumscribed by universal principles and instantiated in group and individual behaviors. Thus, computational forensic linguistics provides a proving ground for how universal principles ground analysis and method so that individual and group variability can be accurately captured and then used for prediction – the core of scientific endeavors.
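Steps (2)–(4) of this paradigm can be sketched in a few lines of Python. The "features" below (relative frequencies of a handful of function words) and the sample texts are purely illustrative; real methods use far richer syntactic and semantic properties and proper statistical validation:

```python
from collections import Counter

# An illustrative (hypothetical) feature set: a few English function words.
FUNCTION_WORDS = ("the", "a", "of", "and", "to", "in", "that", "it")

def feature_vector(text):
    """Step (3): quantify the instantiation of each feature in a text,
    here as the relative frequency of each function word."""
    tokens = text.lower().split()
    counts = Counter(t for t in tokens if t in FUNCTION_WORDS)
    return {w: counts[w] / len(tokens) for w in FUNCTION_WORDS}

def distance(v1, v2):
    """A toy stand-in for step (4): Euclidean distance between profiles."""
    return sum((v1[w] - v2[w]) ** 2 for w in FUNCTION_WORDS) ** 0.5

v_a = feature_vector("the cat sat on the mat")
v_b = feature_vector("a dog and a bird ran to it")
```

Step (5), validation testing for error rates, is what separates an exploratory sketch like this from a method admissible as legal evidence.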
Current forensic linguistics methods exemplify the tension between universality and variability. The ways in which different methods embrace universality or variability have either enabled or prevented linguistic methods from reaching error rates low enough for legal use. Admissible methods that have successfully met the scientific rigor required for legal evidence combine analysis based on universal principles of linguistic structure with statistical analysis of linguistic variability. On the other hand, methods which have focused on variability to the exclusion of universal principles have failed methodologically to produce repeatable results or low error rates, and have thus not met legal standards and are generally ruled as inadmissible. The computational forensic linguistic paradigm embraces variability as the core of most forensic linguistic problems, with universal structural principles as the primary analytical approach for solving these problems. Only this synergistic approach – a structural-behaviorist approach – actually works to produce feasible forensic linguistic methods that are theoretically grounded, replicable, and reliable.
Students in this course should have already taken an introductory linguistics course. Students may also find the Institute courses on R and Python to be good courses to take at the same time, but they are not required.
This course provides a general introduction to machine learning. Unlike results in learnability, which are very abstract and have limited practical consequences, machine learning methods are eminently practical, and provide detailed understanding of the space of possibilities for human language learning.
Machine learning has come to dominate the field of computational linguistics: virtually every problem of language processing is treated as a learning problem. Machine learning is also making inroads into mainstream linguistics, particularly in the area of phonology. Stochastic Optimality Theory and the use of maximum entropy models for phonotactics may be cited as two examples.
The course will focus on giving a general understanding of how machine learning methods work, in a way that is accessible to linguistics students. There will be some discussion of software, but the focus will be on understanding what the software is doing, not on the details of using a particular package.
The topics to be touched on include classification methods (Naive Bayes, the perceptron, support vector machines, boosting, decision trees, maximum entropy classifiers), clustering (hierarchical clustering, k-means clustering, the EM algorithm, latent semantic indexing), sequential models (Hidden Markov Models, conditional random fields), grammatical inference (probabilistic context-free grammars, distributional learning), semi-supervised learning (self-training, co-training, spectral methods), and reinforcement learning.
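To give a flavor of one of these methods, here is a minimal multinomial Naive Bayes classifier in plain Python; the toy "language identification" corpus is invented for illustration:

```python
from collections import Counter, defaultdict
import math

def train_nb(docs):
    """Train a multinomial Naive Bayes model with add-one smoothing.
    docs is a list of (label, token-list) pairs."""
    label_counts = Counter(label for label, _ in docs)
    word_counts = defaultdict(Counter)
    vocab = set()
    for label, tokens in docs:
        word_counts[label].update(tokens)
        vocab.update(tokens)
    log_prior = {l: math.log(c / len(docs)) for l, c in label_counts.items()}
    log_like = {}
    for l in label_counts:
        total = sum(word_counts[l].values()) + len(vocab)
        log_like[l] = {w: math.log((word_counts[l][w] + 1) / total) for w in vocab}
    return log_prior, log_like, vocab

def classify(tokens, log_prior, log_like, vocab):
    """Pick the label maximizing log P(label) + sum of log P(word | label)."""
    def score(label):
        return log_prior[label] + sum(log_like[label][w] for w in tokens if w in vocab)
    return max(log_prior, key=score)

# Invented toy corpus: classify a phrase as English-like or French-like.
docs = [("en", ["the", "cat"]), ("en", ["the", "dog"]),
        ("fr", ["le", "chat"]), ("fr", ["le", "chien"])]
log_prior, log_like, vocab = train_nb(docs)
```

Unseen words are simply skipped here; real systems handle out-of-vocabulary items more carefully.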
With increasing use of quantitative behavioral data, statistical data analysis has rapidly become a crucial part of linguistic training. Linguistic data analysis is often particularly challenging because (i) the relevant data are often sparse, (ii) the data sets are often unbalanced with regard to the variables of interest, and (iii) data points are typically not sampled independently of each other, making it necessary to account for—possibly hierarchical—grouping structures (clusters) in the data. This course provides an introduction to several advanced data analysis techniques that help us to address these challenges. We will focus on the Generalized Linear Model (GLM) and Generalized Linear Mixed Model (GLMM) – what they are, how to fit them, what common ‘traps’ to be aware of, how to interpret them, and how to report and visualize results obtained from these models. GLMs and GLMMs are a powerful tool for understanding complex data, revealing not only whether effects are significant but also what direction and shape they have. GLMs have been used in corpus and sociolinguistics since at least the 1960s. GLMMs have recently been introduced to language research through corpus- and psycholinguistics. They are rapidly becoming popular data analysis techniques in these and other fields (e.g. sociolinguistics).
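As a concrete illustration, logistic regression (the GLM for binary responses, with a logit link) can be fit by gradient ascent on the log-likelihood. A minimal pure-Python sketch with invented data; real analyses would of course use R or a statistics library:

```python
import math

def fit_logistic(xs, ys, lr=0.1, steps=2000):
    """Fit y ~ sigmoid(b0 + b1*x), a GLM with binomial family and logit link,
    by gradient ascent on the log-likelihood."""
    b0 = b1 = 0.0
    for _ in range(steps):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1 / (1 + math.exp(-(b0 + b1 * x)))
            g0 += y - p          # dLL/db0
            g1 += (y - p) * x    # dLL/db1
        b0 += lr * g0
        b1 += lr * g1
    return b0, b1

# Invented data: probability of a binary linguistic response rises with x.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 0, 1, 1, 1]
b0, b1 = fit_logistic(xs, ys)
```

A GLMM adds random effects for the grouping structure (e.g. per-speaker intercepts), which is exactly the clustering problem the course description highlights.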
In this course, I will assume a basic statistical background and a conceptual understanding of at least linear regression.
The emergence of inferential-realizational approaches to inflection has led to a dramatic reversal of a perspective on morphology that dominated twentieth-century grammatical theory, where inflectional paradigms were regarded as an epiphenomenon of the combinatory properties of inflectional morphemes and were accorded no theoretical importance. The new perspective suggests that paradigms are essential to the definition of a language’s inflectional morphology and that they constitute a significant domain of measurable typological variation. The purpose of this course is to investigate both the universal principles of paradigm structure and the dimensions and degrees of cross-linguistic variation in paradigm structure. Central to our method is the use of computational resources for the formal modeling and typological measurement of inflectional paradigms. We begin by examining inferential-realizational theories of inflection and their place in the broader theoretical landscape. Numerous considerations decisively favor the inferential-realizational approach. We exemplify this approach with Paradigm Function Morphology, a precise system of universal principles for the definition of inflectional systems. We then consider two different approaches to modeling paradigm realization in inferential-realizational theories: the exponence-based approach, computationally illustrated through Network Morphology; and the implicative approach, computationally illustrated by the Principal-Parts Analyzer. Both approaches are then contrasted in the way they account for inflectional classes; for the exponence-based account we introduce the concept of a default inheritance hierarchy, and for the implicative approach the notion of principal parts. We move on to look at the diversity of paradigm structures, treating it as various departures from a canonical norm. Two kinds of phenomena responsible for paradigm structure variation are syncretism and deponency, both covered in some detail.
Further variation is identified by considering the predictability of cells, and we consider the implicative structure of paradigms. We go on to relate this concept to the property of inflectional complexity, a point of comparison between languages’ morphological systems that lends itself to a typological treatment. Throughout the course practical hands-on computational sessions will supplement and illustrate theoretical points made. An introduction to linguistics course is strongly advised, and knowledge of morphology is desirable.
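The exponence-based mode of realization discussed above can be made concrete in a short sketch; the inflection-class label and exponents below are an invented Latin-like fragment, not material from the course, and Python is used purely for illustration:

```python
# Hypothetical first-declension fragment: exponents indexed by
# morphosyntactic property sets (case, number).
EXPONENTS = {
    "decl1": {("nom", "sg"): "a", ("acc", "sg"): "am",
              ("nom", "pl"): "ae", ("acc", "pl"): "as"},
}

def realize(stem, infl_class, case, number):
    """Exponence-based realization: look up the exponent for the
    property set and suffix it to the stem."""
    return stem + EXPONENTS[infl_class][(case, number)]

# Fill every cell of the paradigm for the stem "puell-".
paradigm = {cell: realize("puell", "decl1", *cell) for cell in EXPONENTS["decl1"]}
```

An implicative analysis would instead ask which cells (the principal parts) suffice to predict the rest of the paradigm.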
This class is an introduction to several aspects of child phonological acquisition: what early phonologies sound like, how child speech is similar to and also different from adult phonologies, what properties of child speech seem universal vs. language-specific, and how current phonological theories and models capture and predict developmental stages in child speech, and with what success. The empirical focus will be child L1 and some L2 production from about 18 months to five years, in a wide variety of languages, and class meetings will be data-intensive. The grammatical focus will be constraint-based, as in Optimality Theory and Harmonic Grammar, but many different models of learning will be explored. Over the course of the session, we will study the acquisition of segments, syllables, word shapes and simple morpho-phonology, drawing evidence from longitudinal and cross-sectional studies, and also consider the interactions of phonological development and word learning, and some recent insights drawn from computational simulations of phonological learning.
Some of the most compelling questions in the field of pidgins and creoles consist in identifying the linguistic sources and cognitive forces that shape a given creole: why does a particular creole look and sound the way it does? Where do its linguistic properties come from? What are the original populations and languages that contributed to its genesis? This investigation ultimately hopes to shed light on two major cognitive questions: how does the mind pull together linguistic materials from distinct sources to form a creole? What is the nature of the cognitive processes involved in creole formation? In exploring some of these queries, this particular course will focus on the processes of convergence, relexification, and grammaticalization and will contrast, regarding the latter point, general theories of grammaticalization (Lehmann, 2002; Hopper & Traugott, 2003; Fischer, 2007) with their generative (Van Gelderen, 2004) and usage-based (Tomasello, 2005; Boyer & Harder, 2012) counterparts. Comparing these approaches will allow us to gauge how each framework accounts for specific aspects of creole grammars and to assess their contribution to our understanding of how creole languages develop. Besides its focus on cognitive issues in creole formation, other major topics in this course will include:
1) Socio-historical contexts of creole genesis: how a distinct history of population contact results in distinct structural outcomes;
2) Examination of the morpho-syntactic properties of a set of creole languages;
3) Contributions of L1 and L2 to the emergence of creole-specific features.
Students enrolling in this class should have taken an introductory course in linguistics.
This course introduces basic automation and scripting skills for linguists using Praat. The course will expand upon a basic familiarity with Praat and explore how scripting can help you automate mundane tasks, ensure consistency in your analyses, and provide implicit (and richly detailed) methodological documentation of your research. Our main goals will be:
1. To expand upon a basic familiarity with Praat by exploring the software’s capabilities and learning the details of its scripting language.
2. To learn a set of scripting best practices that will help you not only write and maintain your own scripts but also evaluate scripts written by others.
The course assumes participants have read and practiced with the Intro from Praat’s help manual. Topics to be covered include:
o Working with the Objects, Editor, and Picture windows
o Finding available commands
o Creating new commands
o Working with TextGrids
o Conditionals, flow control, and error handling
o Using strings, numbers, formulas, arrays, and tables
o Automating phonetic analysis
o Testing, adapting, and using scripts from the internet
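As a taste of the kind of automation covered, the following minimal Praat script loops over a set of sound files and reports each one's duration to the Info window; the directory path is a placeholder, and the measure chosen is arbitrary:

```
# Batch-measure the durations of all .wav files in a folder (hypothetical path).
strings = Create Strings as file list: "files", "/path/to/recordings/*.wav"
n = Get number of strings
for i to n
    selectObject: strings
    name$ = Get string: i
    sound = Read from file: "/path/to/recordings/" + name$
    duration = Get total duration
    appendInfoLine: name$, tab$, duration
    removeObject: sound
endfor
removeObject: strings
```

The same loop pattern extends naturally to pitch, formant, or TextGrid-based measurements once those objects are created inside the loop.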
Relying on recent theoretical and cross-linguistic work, the course will begin by clarifying the semantic and pragmatic properties of information structure that natural languages choose to represent in one way or another – syntactically, prosodically, or lexically. The emphasis will be on the representation of discourse-old versus discourse-new, various types of contrasts, and various types of topicality. We will then explore how different languages exploit morphological, syntactic, or prosodic formats of representation to express information structural distinctions. Given the expertise of the instructors, the prosodic reflexes of information structure and their various sources in the grammar will get the most attention in this course, but, in line with the theme of this Summer Institute, we will also document that prosody is just one possible way of representing information structural properties in the languages of the world: there is no necessary connection between prosody and information structure. Throughout the course, we will probe into possible grammatical architectures that might be responsible for the observed range of realizations for information structural properties. Pre-requisites: Graduate student level familiarity with phonology, syntax, and semantics.
This course introduces basic programming and scripting skills to linguists using the Python 3 programming language and common development environments. Our main goals are:
- to offer an entry point to programming and computation for humanities students, and anyone else who is interested;
- to do so without requiring any previous programming or IT knowledge (beyond basic computer experience and common lay-person computer skills).
The course covers in eight sessions the interaction with the Python programming environment, an introduction to programming, and an introduction to linguistically relevant text and data processing algorithms, including quantitative and statistical analyses, as well as qualitative and symbolic methods.
Existing Python code libraries and components will be discussed, and practical usage examples given. The emphasis in this course is on being creative with a programming language, and on teaching content that is geared towards the specific tasks that linguists are confronted with, where computation over large amounts of data or time-consuming annotation and data manipulation tasks are necessary. Among the tasks we consider essential are:
- reading text and language data from, and writing it to, files in various encodings, using different orthographic systems and standards, corpus encoding formats and technologies (e.g. XML),
- generating and processing of word lists, linguistic annotation models, N-gram models, frequency profiles to study quantitative and qualitative aspects of language, for example, variation in language, computational dialectology, similarity or dissimilarity at different linguistic levels,
- symbolic processing of regular grammar rules to be used in finite state automata for processing of phonotactic information or morphology, but also context free grammars and parsers for syntactic analyses, and higher level grammar formalisms, and the use of these grammars and language processing algorithms.
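As one small example of the word-list and N-gram tasks above, a bigram frequency profile can be built with the standard library alone; the sample sentence is invented:

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list, as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

text = "the cat sat on the mat and the cat slept"
tokens = text.split()
bigram_profile = Counter(ngrams(tokens, 2))  # e.g. ("the", "cat") occurs twice
```

The same `Counter` pattern scales directly to word lists and frequency profiles over whole corpora read from files.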
In the grammar architecture of classical Optimality Theory (Prince and Smolensky 1993), constraints are ranked and the grammar generates exactly one winner per input. Phonologists have proposed instead that we should consider models in which the constraints, rather than being ranked, bear weights (real numbers, intuitively related to constraint strength). Weights are employed to calculate probabilities for all members of the candidate set.
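For instance, in a maximum entropy ("MaxEnt") grammar, each candidate's harmony is the weighted sum of its constraint violations, and its probability is proportional to exp(-harmony). A minimal sketch, with invented constraint names, weights, and violation counts:

```python
import math

# Hypothetical weights and violation profiles for one input's candidate set.
weights = {"NoCoda": 2.0, "Max": 1.0}
candidates = {
    "pat": {"NoCoda": 1, "Max": 0},  # faithful candidate keeps the coda
    "pa":  {"NoCoda": 0, "Max": 1},  # deletion candidate violates Max
}

def maxent_probs(candidates, weights):
    """P(candidate) = exp(-harmony) / Z, where harmony is the weighted
    sum of the candidate's constraint violations."""
    harmony = {c: sum(weights[k] * v for k, v in viols.items())
               for c, viols in candidates.items()}
    z = sum(math.exp(-h) for h in harmony.values())
    return {c: math.exp(-h) / z for c, h in harmony.items()}

probs = maxent_probs(candidates, weights)
```

With these invented weights, the deletion candidate wins about 73% of the time; adjusting the weights shifts the distribution, which is how such grammars model free variation.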
Such quantitative grammars open up new research possibilities for constraint-based phonology:
(a) Modeling free variation and the multiple factors that shift the statistical distribution of outputs across contexts;
(b) Modeling gradient intuitions (intermediate well-formedness, ambivalence among output choices);
(c) Modeling quantitative lexical patterns and how they are characteristically mimicked in experiments where native speakers are tested on their phonological knowledge;
(d) Modeling phonological learning: even in areas where the ambient language doesn’t vary at all, the child’s conception of what is likely to be the correct grammar of it will change (approaching certainty) as more data are taken in; modeling can trace this process.
This course will be an introduction to these models and research areas. It will emphasize learning by doing. Participants will use software tools that embody the theories at hand and will examine and model data from a variety of digital corpora. The course will not cover computational phonology per se, but it will cover enough computation to give participants a good understanding of the tools they are using. Pre-requisite for this course: a course in phonology.
Pragmatics studies the interactions of linguistic meaning, as determined by grammatical compositional mechanisms, with inferential processes that involve reasoning about speakers’ communicative intentions. In recent years, pragmatics has started being studied with techniques as rigorous as those used in formal semantics, giving rise to the relatively new field of Formal Pragmatics. One of the results of this trend has been that certain widely accepted assumptions about the division of labor between semantics and pragmatics have been challenged. One of this course’s goals is to enable students to understand these recent debates. We will focus on one particular type of inference, known as scalar implicatures, whose status is currently under debate; while the traditional, Gricean view conceives of them as paradigmatic cases of pragmatic inferences, it has been recently argued that they should rather be thought of as a grammatical phenomenon. We will present in a detailed way the various arguments of the two sides of the debate. As we’ll see, they appear to involve a variety of linguistic phenomena, such as the interpretation of interrogative clauses, focus-marking and focus-sensitive operators, modal environments, the interpretation of number and gender features, numerals, negative polarity items, and free-choice items, among others. We will also discuss a few experimental studies which are relevant to the theory of scalar implicatures and related issues. Students enrolling in this course should have significant background knowledge in formal semantics and/or logic (typically at the level of a graduate or advanced undergraduate introductory class).
The modern day study of second language acquisition (SLA) dates back to the late 1960s. What launched it was the discovery of common acquisition orders and sequences of development among all learners of a given second language. Of course, there was clear native language influence on such orders and sequences, but the L1 interference was perceived to minimally “disturb” them. This finding of universality has been remarkably robust and is widely accepted among second language acquisition researchers. It has inspired many theoretical explanations, from the existence of an innate universal grammar, still accessible in SLA, to processability theory, which explains the common order by appealing to sentence processing constraints, to usage-based theories, which attribute the universality to features in the input, such as the frequency, saliency, and contingency of form-meaning mapping of certain constructions. More recently, there has been a shift to focusing on variability in the SLA process. While it has always been acknowledged to be part of SLA, awareness of its ubiquity has been heightened through increased attention to social and contextual factors. In addition, when one examines individual learners, as opposed to group phenomena, variability is obvious. Gaussian statistics, which emphasize averages, should at least be complemented with Pareto-based statistics, which feature (nearly) infinite variance. In addition, variability has been recognized to play an important role in stimulating language development among second language learners, leading researchers to focus upon variable performance, looking for “motors of change.” The course will conclude with a consideration of a complexity-theory view of language and its learning, which inspires us to look for what unites universality and variability.
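The contrast drawn above between Gaussian and Pareto-based statistics can be made concrete with a small sketch (in Python, purely for illustration; the numbers and distributions are hypothetical, not drawn from SLA data). A Pareto distribution with shape parameter alpha at or below 2 has infinite variance, which is why averages alone understate learner variability in such a regime:

```python
import math
import random

def pareto_variance(alpha, x_min=1.0):
    """Theoretical variance of a Pareto(alpha, x_min) distribution.
    It is infinite whenever alpha <= 2: the heavy-tailed regime in
    which sample averages stop being representative."""
    if alpha <= 2:
        return math.inf
    return (x_min ** 2 * alpha) / ((alpha - 1) ** 2 * (alpha - 2))

def sample_std(alpha, n, seed=0):
    """Standard deviation of n Pareto(alpha) draws (illustrative only)."""
    rng = random.Random(seed)
    xs = [rng.paretovariate(alpha) for _ in range(n)]
    mean = sum(xs) / n
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / n)

# Below shape 2 the theoretical variance diverges, so any finite-sample
# standard deviation understates the true spread:
print(pareto_variance(1.5))   # inf
print(pareto_variance(3.0))   # 0.75
```

Under the hypothetical shape values used here, the point is simply that a Gaussian summary (mean plus standard deviation) is meaningful only when the variance is finite.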
This course introduces participants to the methodology of collecting semantic/pragmatic data in collaboration with linguistically untrained native speaker consultants.
Data that may inform semantic/pragmatic theorizing are typically quite complex, consisting of 1) one or more grammatical sentences that are 2) uttered in an appropriately designed context, and 3) a native speaker’s judgment about the acceptability or the truth of the sentence(s) uttered in that context.
The goal of the course is to familiarize students with the empirical, theoretical and methodological considerations relevant to obtaining such data. In particular, topics to be discussed include the kinds of judgments obtainable from native speakers, distinguishing syntactically ill-formed from semantically/pragmatically anomalous sentences/utterances, the importance of context and how to appropriately control for it, reporting semantic/pragmatic data, and the generalizability of results.
The course also examines the benefits of and difficulties with exploring semantic/pragmatic research questions through texts. The relative merits of one-on-one elicitation and controlled experiments with linguistically untrained native speakers are also considered.
Although much of the data provided for in-class discussion comes from Paraguayan Guaraní (Tupí-Guaraní), in particular studies of temporal and nominal reference, and of presuppositions and other projective contents, the course aims to prepare participants to conduct semantic/pragmatic fieldwork on any topic in any language. Note that this course does not have a regular practical component during which course participants work with a native speaker consultant; Professor Keren Rice’s field methods course (http://lsa2013.lsa.umich.edu/2012/05/field-methods/) is highly recommended for this purpose.
This course is targeted at students already familiar with formal syntax, semantics and pragmatics who wish to collect data with native speakers, as well as students who already have experience in conducting research with native speakers and want to extend their research to semantic/pragmatic topics. Interested course participants should contact the instructor (firstname.lastname@example.org) with questions about the course content and suitability.
Knowing the grammar of your language entails understanding how meanings map to syntactic structures, but these mappings are not strictly one-to-one. We know, for example, that “Chris gave the book to Kim” and “Chris gave Kim the book” are semantically equivalent and interchangeable. Likewise, we know “That car don’t run” is semantically equivalent to “That car doesn’t run,” but the two expressions are not interchangeable because the former is sociolinguistically marked. In this class, we explore the intersection of syntactic variation and sentence processing. Our approach assumes that knowledge of syntactic alternants, and of the social patterning of those alternants, is incorporated into our mental representations of grammar. As such, this knowledge should also be reflected in psycholinguistic theories. We will consider current theorizing that bears on this topic, and its limitations. Readings and discussion will address the following set of issues:
1. How do children deal with syntactic variation in the input?
2. How do adults represent and acquire syntactic variants that they themselves don’t use?
3. What is the role of language variation in sentence processing?
4. How do/can current models of linguistic competence and processing accommodate syntactic variation?
This course will be taught seminar-style, with students leading some of the discussions. The readings will focus on recent experimental research using a variety of online and offline methodologies. Students will work together to develop research proposals, which they will present to the class and write up as a final paper.
The “information age” has brought with it an explosion of new kinds of communication, from electronic mail to discussion forums, chat, weblogs, texting, video sharing and many other hybrid modes. Millions of people participate on a daily basis in these “Social Media”, presenting new opportunities and challenges for linguistic research. Social media often offer readily available data, allowing both the content and context of ordinary communication to be studied as it never has before. At the same time, the scale of the available data, its sometimes uncertain provenance, and the constantly evolving status of the supporting media raise significant challenges for analysis. This course addresses the analysis of language in social media, through systematic exploration of current research literature on social media, focusing especially on the uses of computational techniques for the analysis of both language and context.
The purpose of this course is to provide training in discourse analysis that focuses on how culture is manifest in discourse practices. Recordings of socially occurring speech render relatively ephemeral speech in a material and permanent form that gives it cultural reliability and repeatability not available in data collected through other anthropological/ethnographic research methods such as participant observation and note taking. Topics include: 1) Research design. When is recording useful, appropriate, and ethical; what kinds of activities will be recorded and how much material in hours will be recorded? 2) Transcription, translation and computer entry of recordings. How to choose what to transcribe and how much to transcribe; in-field versus after-fieldwork transcription and translation; selection of transcription formats and software for coding data. 3) Analysis based on recordings, transcripts and coding of transcripts. Using the comparative method, identification of relevant units of interaction and their internal sequencing; comparison of multiple instances of the same units of interaction; comparison of multiple kinds of units of interaction and forms of talk; relating discourse analysis to other kinds of data concerning forms of local knowledge in order to make claims for sociocultural processes greater in scale than the discourse data. 4) Analysis of linguistic structures crucial to the interactional constitution of cultural processes, e.g. mood/modality; agency; evidentiality. This will be a hands-on course involving analysis of data provided by the instructors. This approach can serve scholars interested in how culture and language are mutually constituted through not only socially occurring speech, but also in interviews, in written records and in the media. 
The planning and implementation of research in linguistic anthropology, cultural anthropology, sociolinguistics, and language change can be strengthened by greater knowledge of the theoretical and methodological underpinnings of discourse analysis.
Some experience with linguistic analysis/description is preferred, but not required.
Sociophonetic research focuses on the implications for theories of speech production, speech processing and phonological acquisition of the presence of a rich array of social-indexical information inextricably woven into the substance of speech. Research in recent years has shown that this social-indexical channel can be sensitively controlled by speakers, is readily interpretable by listeners, and is accessible to language-learners, and findings such as these are starting to have a significant impact on our understanding of different aspects of the speech chain, not least in respect of what we understand as an individual’s “phonological knowledge”. Sociophonetics is also concerned with the application of methods and theories from different areas of phonetic research to the theories and models of phonological variation and change which have arisen most notably from work within variationist sociolinguistics.
This course begins with an evaluation of the factors which have led to such a rapid convergence of interest in sociophonetics from a number of different directions over the past 15-20 years. It then focuses on the research questions which define the sociophonetics research community, discussing key studies in the field, methodological innovations, and theoretical insights. The material covered will include empirical studies of speech production, perception, and acquisition, the development and application of new experimental methods for investigating sociophonetic questions, and an evaluation of the theoretical innovations associated with this rapidly developing field of research. The course will conclude by considering the methodological and theoretical challenges which are likely to shape the next stage in the development of sociophonetic research.
This course examines the role of phonetics in the construction of gender and sexuality. We begin with the premise that phonetic material carries a range of social meanings that themselves are constitutive of gender and sexuality. Our goals are to review techniques for acoustically quantifying the phonetic characteristics of vowels, consonants, prosody, and voice quality, with an emphasis on those that are used to distinguish speakers on the basis of gender or sexuality; survey classic and current literature on the sociophonetics of gender and sexuality; and unpack the ideological processes that enable language users to forge indexical connections from phonetic forms to gender and sexuality. In addressing this last issue, we will consider a range of issues, including the following: sounding gay, and as a point of contrast, sounding lesbian; the role of phonetics in constructing transgendered identity; intragender phonetic difference; and the pathologization of young women’s voices in the media, with a focus on the creaky voice (or vocal fry) phenomenon.
This course introduces students to the basic principles and theories of speech perception. We will take a hands-on approach, conducting small-scale experiments to illustrate classic phenomena and test selected theoretical claims.
In a very broad sense, much of the research in the roughly 60-year history of experimental speech perception investigates how listeners map the input acoustic signal onto phonological units. Determining the nature of the mapping is a complex issue because the acoustic signal is highly variable, yet perception remains nearly constant across many types of variation. Some theoretical approaches to speech perception postulate that invariant properties in the input signal underlie perceptual constancy. Other approaches do not assume invariants but either require principles that account for the necessarily more complex mapping between signal and phonological representation, or require more complex representations. As a result, theoretical approaches differ in their assumptions concerning the relevant phonological units (features, gestures, segments, words) and the structure of these units (e.g., abstract representations, stored memory traces of auditory experiences). These issues will serve as our overarching framework. However, in addressing them we will also consider: What initial perceptual capabilities do infants have, what is the nature of our perceptual experiences, and how do these determine perceptual learning? How do listeners weight multiple sources of information, and integrate these cues into a coherent linguistic percept? How might cue weighting serve as an impetus for sound change? How do social categories and phonetic categories interact in perception?
Some background in acoustic phonetics is recommended for this course.
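The cue-weighting and cue-integration questions raised above are often modeled with a weighted logistic function, as in logistic-regression analyses of perception data. The sketch below illustrates the idea; the cue values, weights, and category labels are hypothetical placeholders, not empirical estimates:

```python
import math

def p_category_a(cues, weights, bias=0.0):
    """Weighted-logistic cue integration: probability of one category
    (say, 'voiced') given several normalized acoustic cues.
    A higher weight means the listener relies more on that cue."""
    z = bias + sum(w * c for w, c in zip(weights, cues))
    return 1 / (1 + math.exp(-z))

# Hypothetical example with two cues (e.g., VOT and onset f0).
# With neutral cue values, the percept sits at chance:
print(p_category_a([0.0, 0.0], [1.0, 1.0]))   # 0.5
# Up-weighting the first cue makes the percept track it more strongly,
# even when the second cue points the other way:
print(p_category_a([1.0, -1.0], [2.0, 0.5]))
```

On this kind of model, "cue weighting" is simply the relative magnitude of the weights, and reweighting over time is one way to formalize the course's question about cue weighting as an impetus for sound change.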
The prescriptive-descriptive binary, a commonplace in most introductory linguistics textbooks, can make it seem like prescriptivism lies outside the purview of serious linguistic study. This course puts prescriptivism at its center, as an important sociolinguistic factor in the development of Modern English as well as a key challenge to linguists in engaging the public in dialogue about linguistic diversity. In this course we will briefly cover the rise of standardization and Standard English in the history of English, and discuss the ways that morality—discourses of good and bad, right and wrong, pure and corrupt—has become entangled with grammar over the past three centuries. The course will tackle the definitions of Standard English and prescriptivism, as well as the nature of standard language ideology and authority. We’ll read a few key theoretical pieces as background and then address: (a) evolving attitudes about the prescriptive authority of usage guides and dictionaries; and (b) “grammar teaching” and Standard English in the educational system. At the end of the course, we will examine recent debates in the national media about language and “correctness” to think through how linguists can most productively engage in public discussions about language given the prescriptive language ideologies in widespread circulation.
This course is aimed at beginners in statistics and will cover (1) the theoretical foundations of statistical reasoning as well as (2) selected practical applications. As for (1), we will discuss notions such as (different types of) variables, operationalization, (null and alternative) hypotheses, additive and interactive effects, significance testing and p-values, model(ing) and model selection, etc. As for (2), we will be concerned with how to annotate and prepare data for statistical analysis using spreadsheet software, and how to use the open-source language and environment R (<www.r-project.org>) to
- explore data visually using a multitude of graphs (an important precursor to any kind of statistical analysis) and exploratory statistical tools (e.g., cluster analysis);
- conduct some basic statistical tests;
- briefly explore more advanced statistical regression modeling techniques.
The course will be based on the second edition of my textbook on statistics for linguists (to be published in 2013 by Mouton de Gruyter). Examples will include observational and experimental data from a variety of linguistic sub-disciplines.
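Although the course itself works in R, the logic of significance testing it covers can be sketched in any language. Below is a minimal exact permutation test for a difference in group means, written in Python for illustration; the data values are invented, not from any linguistic study:

```python
from itertools import combinations

def exact_permutation_test(a, b):
    """Exact two-sample permutation test on |difference in means|.
    Enumerates every way of splitting the pooled data into groups of
    the original sizes; the p-value is the share of splits whose mean
    difference is at least as extreme as the one actually observed."""
    pooled = a + b
    n = len(a)
    observed = abs(sum(a) / len(a) - sum(b) / len(b))
    extreme = total = 0
    for idx in combinations(range(len(pooled)), n):
        chosen = set(idx)
        ga = [pooled[i] for i in chosen]
        gb = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        diff = abs(sum(ga) / len(ga) - sum(gb) / len(gb))
        total += 1
        if diff >= observed - 1e-12:
            extreme += 1
    return extreme / total

# Made-up measurements for two hypothetical conditions: only 2 of the
# 20 possible splits are as extreme as the observed one, so p = 0.1.
print(exact_permutation_test([1, 2, 3], [10, 11, 12]))   # 0.1
```

The exhaustive enumeration makes the p-value exact rather than simulated, which is feasible only for small samples; with larger groups one would sample random permutations instead.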
This class will explore the basic principles that create and sustain the richness of the lexicon in human languages. We will consider how new words are created, how they are learned, and how they are replicated through social interactions in human communities. Empirical data will be drawn from classical sources, from language on the Internet, and from computer-based “games with a purpose”. Using concepts from research on population biology and social dynamics, we will also discuss mathematical approaches to modeling the life and death of words.
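The population-dynamics modeling mentioned above can be sketched with a toy model (all parameter values are invented for illustration): treat a new word's share of a speech community as logistic growth, the standard S-curve used in models of innovation spread, in which adoption depends both on current users who spread the word and on non-users who can still pick it up.

```python
def logistic_adoption(x0=0.01, r=0.5, steps=40):
    """Toy logistic model of a new word's adoption share.
    x is the fraction of the community using the word; per-step growth
    r*x*(1-x) is proportional both to current users (who transmit the
    word) and to remaining non-users (who can still adopt it)."""
    xs = [x0]
    for _ in range(steps):
        x = xs[-1]
        xs.append(x + r * x * (1 - x))
    return xs

trajectory = logistic_adoption()
# Adoption follows an S-curve: slow start, rapid spread, saturation.
print(round(trajectory[0], 3), round(trajectory[-1], 3))
```

The "death" of a word can be modeled in the same framework with a negative growth term; richer models add competition between variant forms, in the spirit of the population-biology work the course draws on.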
This course is designed to walk students, beginning with conceptual basics, through the myriad of complex issues which surround the relationship between the two distinct approaches to `universalism' (typological generalization and formal model construction) and the task of syntactic reconstruction. There is considerable debate in the literature as to the possibility of actually reconstructing the syntax of a protolanguage, with a general split between nay-sayers (usually ‘formalists’, though I myself hate using labels) and advocates (usually ‘functionalists’, though, ditto) regarding the process.
We will begin with a consideration of the relationship between typology, formal model construction and reconstruction methodology in a somewhat less controversial (though still subject to much debate) domain: that of phonological reconstruction, thus exploring the debate between typologists and formalists in a domain within which there is no serious dissent as to the practicability of reconstruction.
We next turn to a survey of typological approaches to syntactic structure, including the wealth of new tools (e.g., the WALS database) now available to assist scholars in establishing an empirical foundation for their investigation. The general theoretical issue of the ‘grounding’ of typological generalizations will be raised at this juncture as well (since this forms part of the basis for the conflict between ‘functionalists’ and ‘formalists’ in reconstruction).
Next, we turn to the often very different kinds of generalizations that ‘formalist’ models seek to account for, the types of evidence which are offered up for such generalizations, and the ‘grounding’ (in this case, in UG) of the accounts offered.
We turn finally to the question of how these two types of approach play out for the issue of syntactic reconstruction. What are the ‘units of analysis’ in the two domains which COULD (in principle) be reconstructed? What would the successful reconstruction of such units tell us about the ‘syntax’ (in the descriptive sense) of the proto-language and what would it leave unclear?
In conclusion, we move to the practical consideration of three specific ‘test cases’:
(1) embedded clause structures in Proto-Indo-European,
(2) the ergative vs. accusative reconstruction of Proto-Polynesian and
(3) Wackernagel’s Law and the ‘left periphery’ (i.e., syntax-discourse interface) in Proto-Indo-European. We will conclude with some general lessons, open avenues for future research, etc.
This course will explore how inherent variability, as asserted by linguistic variation scholarship, can be understood with respect to variation in syntax. We will limit ourselves to what Suzanne Romaine has referred to as “pure syntactic variables”, as opposed to morphosyntactic/morpholexical, morphophonemic, and purely phonological variables. The approach to syntax that we will assume is the Minimalist Program (MP) of Noam Chomsky. Relating variation in “pure syntax” to the MP is more daunting than it might at first seem, because MP theory and variation theory are not about the same thing. MP syntax is about sentences. Variation analysis is about utterances. Sentences are abstractions; utterances are observable events. One cannot, strictly speaking, write or speak a sentence, only an utterance approximating one. The course will propose that we need separate approaches to sentences and utterances. Both contribute to the understanding of language, but are fundamentally separate.
We will take up in detail four cases of “pure” syntactic variation: 1) the alternation among which, that and zero in relative clauses as studied by Tagliamonte et al. (2005), 2) the variation between pronouns and reflexives that exists even where the classic binding theory forbids it, 3) variation in word order in Dutch verb clusters as researched by Barbiers (2005), and 4) the alternation between preposition stranding and pied-piping of WH-noun phrases. The second and third cases emerge from my own research.
The approach will be in-class lectures augmented by PowerPoint slides. There will be two out-of-class assignments, in which students will be asked to search an online corpus for examples that support or challenge the analyses of reflexives and pied-piping presented in class. The assignments will call for a discussion of how the examples relate to the presented analyses.
Tense and aspectual properties in AAE are at the top of the list of descriptions—especially those from Creolist and Africanist perspectives—that are intended to highlight the ways in which the linguistic variety differs from other varieties of English. On the other hand, modality in AAE is not commonly addressed in the literature. This course will examine syntactic/semantic and morphological properties of tense, modality, and aspect (TMA) in AAE. Questions have been raised about the interpretation and syntactic representation of tense, especially given weak morphology and the fact that overt tense markers may not be expressed in AAE. This course will present a general overview of tense marking and the ways in which time-related meaning is computed in AAE.
The second part of the course considers grammaticalized markers in AAE that combine with predicates and other markers to indicate information about the way an event is carried out. Questions about properties of tense marking within aspectual sequences in AAE have not received much attention perhaps because so much emphasis has been placed on grammaticalized aspect markers, with the view that AAE is aspect prominent. For instance, some aspectual sequences can take a present or past perspective while still others are limited to present contexts. We will analyze empirical data from different sources in investigating the TMA system in AAE. This section of the course will also consider the types of subtle distinctions that are made in the AAE tense/aspect system. For instance, when overt or covert present tense auxiliary BE (i.e. is) combines with V(erb)-ing, the result is an in-progress reading, as in the following:
1) Sue IS running.
2) Sue running.
Sue’s running is already in progress.
However, when aspectual be combines with V(erb)-ing, the result is an in-progress or inception reading, as in the following:
3) Sue be running when the Mardi Gras characters pass by.
In-Progress Reading 1: Sue’s running is generally already in progress when the Mardi Gras characters pass by.
Inception Reading 2: Sue generally begins to run when the Mardi Gras characters pass by.
In addition to considering verb types (e.g., state and activity) and their lexical properties, we will also examine the role of morphological endings, such as –ing and –ed, in aspectual sequences. Finally, this course will investigate modality in light of modal auxiliaries as well as mood markers in AAE.
We will extend the study of TMA in AAE to practical contexts by considering questions such as the following:
1) How is the TMA system acquired, and how is it reflected in child AAE?
2) How is TMA marking reflected in the discourse structure of ex-slave narratives?
For a century and a half, data have been gathered on brain organization for language. From early on, as this research was initially carried out primarily in Europe, questions have been asked about how language is represented and processed in the brains of bilinguals and multilinguals. In this class we will review the questions that have been asked and the currently held answers concerning how brains handle more than one language. After an initial review of brain regions that have been identified as crucial for language generally and methods for studying them, topics will include a selection of the following:
- Parallel and differential impairment and recovery from aphasia in bilinguals
- Consequences of age of L2 acquisition
- Consequences of age of diminishment of L1 or L2 use (e.g., in heritage-language speakers)
- Bilingual switching
- Cognitive advantages of bilingualism
- Talented L2 learning and hyperpolyglots
- Particular difficulties with L2 learning (links to dyslexia)
- L1 and L2 attrition
- Shared and distinct components of the bilingual’s two languages (e.g., cognates vs. non-cognates; bidialectalism vs. bilingualism)
- Differences between bilingualism and multilingualism
- Bilingualism in Alzheimer’s disease
Our focus will be not only on the phenomena of interest, but also on how neurolinguistic methods lead to findings and what the relative advantages and disadvantages of the commonly used techniques are.
This course introduces students to sociolinguistic research exploring a related set of issues at the intersection of child language acquisition research and research into sociolinguistic variation:
What is the timing of the emergence of sociolinguistic variation in children? In particular:
How does the timing of the production of variation relate to other universal milestones of language development? When do children begin to show perceptual awareness of variation in the linguistic forms produced by others? What do we “know” about the types of social categories children conceptualize and employ in their social lives?
We begin with Dell Hymes’ notion of communicative competence, establishing a basis for understanding the abilities comprising sociolinguistic competence (systemic potential, appropriateness, occurrence, feasibility). From there, we explore classic linguistic literature regarding milestones of child language acquisition. We then examine key research focused upon the emergence of variability in children: 1) the history of language variation studies targeting children and preadolescent children, 2) systematic variability in children’s output (production), 3) variability in input to children, 4) evidence for language change in second dialect acquisition. We will look at one particular family of models, exemplar models, that seem promising for modeling of the mapping of linguistic form to social meaning in the mind because they can accommodate the types of findings described in the literature we have explored. We will consider, in particular, how and whether metalinguistic commentary from children (their “talk about talk”) might provide insights into the social categories they perceive, and the evolution of linkages between these categories and linguistic forms over time in the exemplar space.
This course will explore the Institute theme of universality and the complexities of linguistic variability by examining major morphological and syntactic features in languages indigenous to North America. The languages show considerable diversity among themselves, comprising well over 50 distinct families. At the same time, we find a number of areal traits that were apparently spread through contact. Many of the languages exhibit highly developed structures that are relatively rare or less developed elsewhere. A number show elaborate morphological structure, which has implications for syntactic structure. After an overview of traditional and current issues in morphological and syntactic typology, we will move to more specific topics. Among them will be functional differences between roots and affixes; compounding, noun incorporation, and bipartite stem structures; certain elaborately developed sets of distinctions in domains such as space, means and manner, evidentiality, and reality status; relations between morphologically-defined and syntactically-defined lexical categories; head versus dependent marking and differences that arise from the locus of marking; pronouns and agreement; polysynthesis and ‘configurationality’; cross-linguistic differences in the core/oblique distinction; alternative alignment patterns and their combinations; the variable strength of syntactic relations between predicates and lexical arguments; affix order and constituent order; and issues in clause combining, including ‘switch-reference’, logophoricity, and continua of finiteness.
All human languages construct words from meaningless elements—either speech sounds (in spoken languages) or manual gestures (in signed linguistic systems). Not only are phonological patterns evident in every known human language, but they even emerge anew, in the systems generated spontaneously in the homes of deaf signers and in newly emerging languages. Why do humans engage in phonological patterning? And what mechanisms support our capacity to extend our phonological reflexes to novel forms?
This course addresses these questions from a broad interdisciplinary perspective. We consider evidence from diverse sources, ranging from linguistic analysis to experimental studies of humans, comparative animal work, neurological evidence, genetic studies and computational simulations of language evolution. Select issues include:
Specialization and innateness: What is an innate specialized cognitive system?
Generalizations: What computational mechanisms support phonological generalizations? Do they exhibit the capacity for discrete infinity?
Design: Are there constraints on the design of phonological systems—actual and potential? What is the nature of such restrictions: do they concern language, broadly, or speech, specifically?
Hardware: What genes and brain “hardware” regulate the phonological system? Is this hardware specialized for language?
Ontogeny: Are some phonological precursors present at birth?
Phylogeny: What components of the phonological mind are shared with our evolutionary ancestors? How did the human capacity for phonological patterning evolve?
Phonological technologies: Unlike language, reading and writing are “linguistic technologies” that emerge (sometimes spontaneously) on the basis of linguistic principles. Why is reading based on phonology? And why do reading disorders impair speech perception?
The role of children in contact-induced language change is relatively under-explored, as most work on language contact and language change investigates adult speakers. Little is known about when and how the adult speakers developed their speech repertoires, or how their speech styles as children interact with long-term change. Yet in several contexts children learning (a) first language(s) have clearly played a role in contact-induced language change, and recent studies have detailed the contribution of children using empirical data and/or historical records. It appears that children’s roles differ according to context, where contextual factors include age, type of variation, medium of interaction (sign or oral), dialect contact, new dialect or language formation, and/or different degrees of input in the language being acquired. This course will bring together studies of children’s language development in several types of contact situation and attempt to provide a synthesis which links sociolinguistic situations, socio- and psycholinguistic processes, and linguistic outcomes. We will discuss evidence of children’s roles in the nativization of pidgins and creoles (oral and sign), mixed languages, and dialect formation, linking these to evidence of children’s language processing in other instances of first language acquisition.
Sociolinguistic variation is most widely known for its correlations with broad social categories, particularly as it represents the spread of linguistic change through and across communities. These patterns, however, are the tip of the variation iceberg. Variation offers up a robust social-semiotic system capable of indexing the full range of a community’s social concerns, and of changing with those concerns. The Third Wave approach to variation is based in the understanding that the social-indexical potential of variation is central to the social functioning of language. This course will examine the social meaning of variation up close, and trace the relation between local meaning and the formation of abstract social categories. Topics will cover the nature of stylistic practice, the role of variation in social change, the sources of variables, and the range of indexical functions of variation from affect to stance to persona construction.
This four-week course will cover a selection of the software, hardware, and stimulus kits/surveys that are most useful in documenting languages. The course will begin with an overview of software tools for organizing language data, including Toolbox and ELAN, and of hardware (e.g. audio and video recorders) for making recordings. Week 2 will focus on tools related to grammatical documentation (e.g. in the writing of reference grammars) and will include the use of structured stimulus kits, questionnaires, and tools for organizing transcripts and analytical data. Week 3 will focus on corpus planning and the collection of narratives and conversational data. Week 4 will concentrate on software and techniques for lexical elicitation, along with the archiving of collections. Each class will have a practical component, and class participants are encouraged to bring their own data sets; however, data samples will also be available for those who need them. Some familiarity with general linguistics is presumed.
This course develops a constructionist approach to First and Second Language Acquisition (L1A, L2A). It presents psycholinguistic and corpus linguistic evidence for L2 constructions and for the inseparability of lexis, grammar, and semantics. It outlines a psycholinguistic theory of language learning that follows general cognitive principles of category learning, with schematic constructions emerging from usage. It reviews how the following factors jointly determine how a construction is learned: (1) the frequencies of its exemplars and their Zipfian distribution; (2) the salience of their form; (3) the significance of their functional interpretation; (4) the exemplars' similarity to the construction prototype; and (5) the reliability of these form-function mappings. It tests these proposals against large corpora of usage and longitudinal corpora of L1 and L2 learner language, using statistical and computational modelling. It considers the psychology of transfer and learned attention in L2A in order to understand how L2A differs from L1A: L2A involves reconstructing language, with learners' expectations and attentional biases tuned by experience of their L1. A central theme of the course is that patterns of language usage, structure, acquisition, and change are emergent, and that there is value in viewing Language as a Complex Adaptive System.
Week 1: Constructions, their cognition and acquisition
Week 2: A frequency-informed construction grammar of English usage
Week 3: Construction learning in L1A and L2A longitudinal corpora
Week 4: L2A, learned attention, and transfer, and their implications for instruction
Course Areas: Language Acquisition, Semantics/Pragmatics, Psycholinguistics, Corpus Linguistics, Cognitive Linguistics