OVERVIEW
 


Hunston 3-23, 213-16

 

  • What gives a corpus value for researchers?

  • Give a few examples of the relationship between frequency and register (text type)

  • Give a few examples of corpus insights into phraseology and collocation

  • What are some general uses of corpora?  Which interest you the most?

  • What are some different types of corpora? Which would be the most useful for you?

  • Define the following: type, token, hapax, lemma, word-form, tag, parse, annotation

  • Give some examples where corpora provide speakers (even native speakers) with insights that might not otherwise be available.

  • [ACTIVITY] Hunston claims that native speaker intuition usually isn’t very good at guessing frequency and/or collocation.  Try answering the following questions in your head, and only then compare your intuitions with actual data from the 560+ million word Corpus of Contemporary American English:

o   What is the relative frequency of the following verbs: look, live, like, get, take? (Enter each one separately, and limit the search to the infinitival form of the verb, [vvi]; e.g. like.[vvi])

o   What is the relative frequency of the following adjectives: important, big, other, only, hard?

o   What adjectives occur most frequently with painfully and with completely? (e.g. painfully [j*])

o   What verbs occur most frequently with slowly and with hardly? (e.g. [vvd] slowly) (note: [vvd] = -ED form of a lexical (non-AUX) verb)

o   What verbs occur most frequently in the phrase hard to V? (e.g. hard to [vvi]) (note: [vvi] = infinitival form of a lexical verb)

  • What considerations do we need to keep in mind in interpreting corpus data?

  • (p213) What does Hunston mean when she says that corpora can be both authoritarian and empowering?

  • What does Hunston mean when she says that corpora have made language analysis more simple, as well as more complex?
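
For the definitions question above (type, token, hapax) and the relative-frequency questions in the activity, a small sketch may help make the counting concrete. It is only an illustration under simple assumptions: the tokenizer is deliberately naive, the sample sentence is invented, and the function names (tokenize, corpus_counts, per_million) are hypothetical helpers, not part of any corpus tool or of Hunston's text.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and pull out runs of letters/apostrophes.

    Assumption: a naive tokenizer for illustration only; real corpora
    use far more careful tokenization.
    """
    return re.findall(r"[a-z']+", text.lower())


def corpus_counts(text):
    """Return token count, type count, hapaxes, and the frequency table."""
    tokens = tokenize(text)          # every running word is a token
    freq = Counter(tokens)           # frequency of each distinct word-form
    hapaxes = sorted(w for w, n in freq.items() if n == 1)  # types seen once
    return {
        "tokens": len(tokens),
        "types": len(freq),          # number of distinct word-forms
        "hapaxes": hapaxes,
        "freq": freq,
    }


def per_million(count, corpus_size):
    """Normalize a raw count to a frequency per million words."""
    return count / corpus_size * 1_000_000


# Invented sample sentence (not corpus data).
sample = "The cat sat on the mat and the dog sat too"
stats = corpus_counts(sample)
print(stats["tokens"], stats["types"], stats["hapaxes"])
```

Normalizing to a per-million figure is what makes counts comparable across corpora, or across sections of a corpus, that differ in size; raw counts alone can be misleading.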

Note: the assigned reading does not cover the following questions, so you don't need to prepare them before class; I'll discuss them in class anyway:

  • What is the difference between a rationalist and an empirical approach to language?

  • What is the difference between competence and performance? Which one did Chomsky favor, and why?

  • Discuss the issue of introspection (vs external data), and how it relates to corpus linguistics

  • What was the situation with data processing in the 1950s-1970s, and how did this impact on corpus linguistics?

  • (1.4) ["Corpus linguistics strikes back"] Why is a corpus at least as good as, or better than, individual introspection?

  • How have advances in data processing aided the resurgence of corpus linguistics?