CORPORA -- STRUCTURED
Lee, David (2010) "What corpora are available?". In Routledge
Handbook of Corpus Linguistics, pp. 107-22. (Note: somewhat biased;
very BNC-centric)
Corpora, Collections, Data Archives (mainly for English) (David Lee)
General points:
- accessibility
- copyright (e.g. OUP / BNC)
1-2 main corpora from each of:
- general
- speech
- parsed
- historical
- Web as Corpus
- learner
- parallel
- non-English (2-3 languages mentioned?)
Pay special attention to (if there):
- Brown / FROWN / LOB / FLOB
- Australian Corpus of English (ACE) / Wellington Corpus / Kolhapur Corpus
- London-Lund
- British National Corpus (BNC)
- American National Corpus (ANC)
- Bank of English / Cobuild
- International Corpus of English (ICE)
- Switchboard
- Helsinki
- CHILDES
- MICASE
- International Corpus of Learner English (ICLE)