Day 1: "Getting comfortable with corpora"

 


 

A. British National Corpus (Davies)

 

Note 1: If you are using a non-BYU computer, you may have to "register" after 20 or so queries.  It's a simple two-step process, and should take no more than two minutes.

 

Note 2: Input the yellow-highlighted search strings exactly as shown (it is probably best to just copy and paste).  Any alteration, however small (even an unneeded space), may mean that the query won't produce any hits.

 

GENERAL (INPUT INTO THE SEARCH WORD/PHRASE FIELD)

 

1. MORPHOLOGY: Look for words starting with the prefix con- (e.g. confabulation).  What are the three most common singular nouns (con*.[nn1]), the three most common adjectives (con*.[aj0]), and the three most common infinitival verbs (con*.[vvi])?

 

2. LEXICAL: Search for prophet and then compare the frequency in all 70 registers (by clicking on [1] in the results window).  In which three registers is it the most/least common?

 

3. LEXICO-GRAMMAR: Look at the top five adjectives following come and go (e.g. = come [aj*]).  Is there any pattern in terms of which adjectives occur with the two verbs?

 

4. SEMANTICS I: Compare the collocates of sheer/utter/total (use {sheer/utter/total} [nn*]). Any patterns here?

 

5. SEMANTICS II: Compare the adjectives occurring near boy and girl (boy/girl as search word, select SURROUNDING words, ADJ.ALL, 5 words to left, 5 words to right). Anything interesting?

 

6. COLLOCATIONS: What are the 7-8 most frequent nouns with treasure (treasure as search word, select SURROUNDING words, NOUN.ALL, 5 words to left, 5 words to right). Any surprises?
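
For anyone curious what a SURROUNDING words search (as in 5 and 6) is doing, here is a rough sketch in Python.  It is only an illustration, not how the corpus interface itself is implemented: the tagged sentence is made up, the tags are simplified stand-ins, and the function name collocates is hypothetical.  It counts words whose tag starts with a given prefix (here nn, for nouns) within 5 words to the left and 5 words to the right of the search word.

```python
from collections import Counter

# Made-up tagged sentence for illustration: (word, tag) pairs.
# The tags are simplified stand-ins, not real corpus output.
tokens = [
    ("the", "at0"), ("pirates", "nn2"), ("buried", "vvd"), ("the", "at0"),
    ("treasure", "nn1"), ("in", "prp"), ("a", "at0"), ("chest", "nn1"),
    ("on", "prp"), ("the", "at0"), ("island", "nn1"), ("and", "cjc"),
    ("drew", "vvd"), ("a", "at0"), ("map", "nn1"),
]

def collocates(tokens, node, tag_prefix, left=5, right=5):
    """Count words near `node` whose tag starts with `tag_prefix`."""
    counts = Counter()
    for i, (word, _) in enumerate(tokens):
        if word != node:
            continue
        # Up to `left` tokens before the node and `right` tokens after it.
        start = max(0, i - left)
        window = tokens[start:i] + tokens[i + 1:i + 1 + right]
        for w, t in window:
            if t.startswith(tag_prefix):
                counts[w] += 1
    return counts

result = collocates(tokens, "treasure", "nn")
print(result.most_common())
```

In this toy sentence, island and map are more than 5 words to the right of treasure, so they are not counted.  Widening or narrowing the window changes which collocates turn up, which is worth keeping in mind when interpreting the results.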

 

COMPARING REGISTERS (INPUT INTO THE SEARCH WORD/PHRASE FIELD)

 

7. LEXICO-GRAMMAR: Compare the 8-10 most common phrases with we [v*] in SPOKEN vs ACADEMIC (e.g. we hope that).  What is the major difference between the two registers?

 

8. LEXICAL: Compare the most frequent singular nouns (= [nn1]) in FICTION vs ACADEMIC.  Which types are more common in each register?

 


 

B. Historical: TIME Corpus of American English (Davies)

 

9. LEXICON: Has the frequency of the words and phrases end up, basically, and turn on increased or decreased over the past 100 years? (Use "CHARTS")

 

10. MORPHOLOGY: What are the five most common words ending in -gate (e.g. *gate) that appear for the first time in the 1990s (i.e. set Section 1 to 1990s and Section 2 to 1980s). Any surprises? Does this relate to anything going on in society during this time?

 

11. SYNTAX: Has the frequency of phrasal verbs with on (e.g. to turn on, to take on) (i.e. to [v*] on) increased or decreased over the past 100 years?

 

12. SEMANTICS: Look at the collocates with chip (i.e. chip + Surrounding Words) over the past 100 years. What are some new meanings in the past 20-30 years, and meanings that have decreased or disappeared since the 1920s-1940s?