In this course, we will explore how corpora (i.e. large collections of texts) can be used to study languages, with a focus on English. We will discuss tools for processing large electronic corpora, such as specialized websites (e.g. English-corpora.org) and software (e.g. AntConc), and see how they can be used to extract and analyse data.

Electronic corpora benefit multiple fields of linguistics, especially those that take a usage-based perspective. We will discuss how corpora can be used to study word formation, grammatical structures, metaphors, and other linguistic phenomena. Corpus linguistics can also be applied to literary texts. For instance, you can find out which actions and adjectives are associated with your favourite characters, and use these findings in broader discussions, such as how women and men are portrayed by your favourite authors.

The main goal of this course is for students to be able to use these tools to conduct their own studies on topics that interest them. Another goal is to be able to critically read studies that involve corpus linguistics. This course does not require any prior knowledge of computer science. Some technical concepts, such as the use of regular expressions to identify specific constructions, will be discussed in class, but they will be introduced in a beginner-friendly way. As some of our sessions will be practical, bringing a laptop is highly recommended but not compulsory.
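
To give a flavour of what a regular expression looks like, here is a minimal sketch in Python. It is only an illustration, not course material: the sample sentence is invented, and the pattern is a deliberately rough approximation of the English progressive (a form of "be" followed by a word ending in "-ing").

```python
import re

# Invented sample text (an assumption for illustration);
# in a real study this would be read from a corpus file.
text = "She was reading while he was cooking. They were laughing, and I am writing."

# A form of "be", then whitespace, then any word ending in -ing.
# This is a rough heuristic: it would also match non-progressive
# sequences such as "was interesting" or "is king".
pattern = re.compile(r"\b(am|is|are|was|were)\s+(\w+ing)\b", re.IGNORECASE)

# Collect the full text of every match, in order of appearance.
matches = [m.group(0) for m in pattern.finditer(text)]
print(matches)
# → ['was reading', 'was cooking', 'were laughing', 'am writing']
```

Tools such as AntConc accept similar patterns in their search boxes, so the same idea applies without writing any code.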