INTRODUCTION / DESCRIPTION

In this course we will learn to use corpora (large collections of electronic text) for several types of linguistic analysis.  After discussing the basic methodology of corpus linguistics and how it compares to more formal models of language, we will turn to the mechanics of corpus creation: design, acquisition, and annotation.  We will then focus on existing software tools for analyzing corpora, including concordancers such as WordCruncher and WordSmith, grep-like tools, and relational databases.  The second half of the course will be oriented towards using corpora to study a range of linguistic issues, including language variation and change, language learning, and cross-linguistic comparison.  After successfully completing this course, participants should be able to use corpora effectively as a source of data for studying a wide range of linguistic phenomena.