OVERVIEW
Hunston 3-23, 213-16
-
What gives a corpus value
for researchers?
-
Give a few examples of the
relationship between frequency and register (text type)
-
Give a few examples of
corpus insights into phraseology and collocation
-
What are some general uses
of corpora? Which interest you the most?
-
What are some different
types of corpora? Which would be the most useful for you?
-
Define the following:
type, token, hapax, lemma, word-form, tag, parse, annotation
-
the man put the
books on the table (8 tokens, 6 types)
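The counts for this sentence can be checked mechanically. The following is a minimal sketch, assuming whitespace tokenization and case-folding (real corpora tokenize more carefully); the function name `type_token_counts` is made up for illustration. It also lists the hapaxes, i.e. the types that occur exactly once.

```python
from collections import Counter

def type_token_counts(sentence):
    """Return (number of tokens, number of types, sorted hapaxes)."""
    # Assumption: tokens are whitespace-separated, lowercased strings.
    tokens = sentence.lower().split()
    freq = Counter(tokens)
    # A hapax is a type that occurs exactly once in the text.
    hapaxes = sorted(w for w, n in freq.items() if n == 1)
    return len(tokens), len(freq), hapaxes

n_tokens, n_types, hapaxes = type_token_counts("the man put the books on the table")
print(n_tokens, n_types, hapaxes)
# prints: 8 6 ['books', 'man', 'on', 'put', 'table']
```

Here "the" accounts for three of the eight tokens but only one of the six types.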
-
COCA for lemma,
POS (no grouping)
-
Give some examples where
corpora provide speakers (even native speakers) with insights that otherwise
might not be available.
-
cause
-
naked eye
-
go/come + ADJ
-
help (to)
-
[ACTIVITY] Hunston claims
that native speaker intuition usually isn’t very good at guessing frequency
and/or collocation. Try answering the following questions in your
head, and only afterward compare your
intuitions with actual data from the Corpus of Contemporary American
English:
o What is the relative
frequency of the following verbs: look, live, like, get, take (input each
one separately, and limit to infinitival form of the verb ([VVI]);
e.g. like.[vvi] )
o What is the relative
frequency of the following adjectives: important, big, other, only, hard
o What adjectives
occur most frequently with painfully and with completely? (e.g.
painfully [j*])
o What verbs occur
most frequently with slowly and with hardly? (e.g.
[vvd] slowly) (note: [vvd] = -ED
form of a lexical (non-AUX) verb)
o What verbs occur
most frequently in the phrase: hard to V (hard
to [vvi]) (note: [vvi] = infinitival form of a lexical verb)
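To see what a query like painfully [j*] is doing, here is a rough sketch of collocate counting over a tagged text. It is not the COCA interface: the (word, tag) pairs are invented toy data, the lowercase tags only imitate the style of the tags above, and `right_collocates` is a hypothetical helper.

```python
from collections import Counter

# Toy (word, tag) pairs invented for illustration; NOT real COCA data.
TAGGED = [
    ("painfully", "rr"), ("slow", "jj"), ("and", "cc"),
    ("painfully", "rr"), ("aware", "jj"), ("but", "cc"),
    ("painfully", "rr"), ("slow", "jj"), ("walked", "vvd"),
    ("slowly", "rr"), ("hard", "jj"), ("to", "to"), ("believe", "vvi"),
]

def right_collocates(tagged, node, tag_prefix):
    """Count words immediately after `node` whose tag starts with `tag_prefix`."""
    counts = Counter()
    # Walk over adjacent pairs of (word, tag) entries.
    for (w1, _), (w2, t2) in zip(tagged, tagged[1:]):
        if w1 == node and t2.startswith(tag_prefix):
            counts[w2] += 1
    return counts

print(right_collocates(TAGGED, "painfully", "j").most_common())
# prints: [('slow', 2), ('aware', 1)]
```

The corpus only supplies the ranked counts; deciding what the ranking means is still up to us.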
-
What considerations do we
need to keep in mind in interpreting corpus data?
-
± frequent does
not = ± possible (mauve carpet)
-
value of corpus
and possible data a function of corpus design (e.g. CREA)
-
little
non-textual context (oh, sure)
-
gives frequency;
we interpret (go/come + ADJ)
-
(p213) What does Hunston
mean when she says that corpora can be both authoritarian and empowering?
-
What does Hunston mean
when she says that corpora have made language analysis simpler, as well as
more complex?
-
preposition
stranding ([vv*] with): (vs. prescriptive rule) genre, time (after WW II),
by verb, other factors?
Note: you
didn't do the reading for the following questions, so no need to be
prepared before class, but I'll discuss these in class anyway:
-
What is the
difference between a rationalist and an empirical approach to language?
-
What is the difference
between competence and performance? Which one did Chomsky favor, and why?
-
Discuss the issue of
introspection (vs external data), and how it relates to corpus linguistics
(conferences)
-
What was the situation with
data processing in the 1950s-1970s, and how did this impact on corpus
linguistics?
-
How have advances in data
processing aided the resurgence of corpus linguistics? (my Dad)
QUIZ
1. Concordance: total/totally,
interested/interesting, fast/slow, catch/caught
2. Type of corpora: NOT mentioned: parallel,
learner, religious, historical
3. Terms: NOT mentioned: regularizing, tagging,
parsing, annotation
4. Harder (p213-6): might as well, in terms of, for
all [pronoun] know, under the influence