COLLOCATES

 

1. Choose a moderately frequent English word, with a frequency rank somewhere between about #2000 and #4000. Please choose a different word from the one you used for the concordances project. Also, please number the sections of your response to match the numbering below.

2. Before you start looking for the word in corpora, write down 4-5 of the most frequent collocates that you think of, and get the same data from two other people (i.e. "what words do you think of when you think of ---?").

3. Search for the word in COCA.

3.1 What are the 7-8 most common collocates to the left? (indicate span)

3.2 What are the 7-8 most common collocates to the right? (indicate span)
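
To make "collocates to the left" and "collocates to the right" concrete, here is a minimal sketch of counting words inside a span around a node word. The toy sentence, the function name `collocates`, and its parameters are illustrative assumptions, not part of COCA or any corpus tool.

```python
from collections import Counter

def collocates(tokens, node, left=0, right=0):
    """Count words within `left` tokens before and `right` tokens after each node hit."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok != node:
            continue
        start = max(0, i - left)  # do not run past the start of the text
        window = tokens[start:i] + tokens[i + 1:i + 1 + right]
        counts.update(window)
    return counts

# Toy data (hypothetical), whitespace-tokenized
text = "strong tea and strong coffee but weak tea with strong tea again"
tokens = text.split()

left_counts = collocates(tokens, "tea", left=1)    # span: 1 left, 0 right
right_counts = collocates(tokens, "tea", right=1)  # span: 0 left, 1 right

print(left_counts.most_common())
print(right_counts.most_common())
```

Even in this tiny example the left span picks up the modifiers ("strong", "weak") while the right span mostly picks up function words, which is the kind of contrast the left/right comparison is meant to bring out.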

4. Which direction (left or right) gives the best results? Why?

5. What are 2-3 words (even beyond the 7-8 listed above) that are a surprise? How are they used with the node word?

6. Are there 2-3 words in the list (you might need to go fairly far down) that seem to be "errors"? Are the errors of the type "abortion / supreme" or "rove / presidential"?

7. In the searches above, you didn't limit by part of speech. Completely reset the form and re-do the search, limiting the collocates to one part of speech that you think might be useful. How do the results compare with those in #3 above?
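
Limiting by part of speech amounts to keeping only those words in the span that carry a given tag. A rough sketch, assuming a hand-tagged toy sentence and made-up tag labels (real corpora use their own tagsets, and `pos_collocates` is a hypothetical helper):

```python
from collections import Counter

# Hypothetical hand-tagged tokens: (word, part-of-speech tag)
tagged = [
    ("she", "PRON"), ("drinks", "VERB"), ("strong", "ADJ"), ("tea", "NOUN"),
    ("and", "CONJ"), ("weak", "ADJ"), ("tea", "NOUN"), ("daily", "ADV"),
]

def pos_collocates(tagged_tokens, node, pos, span=1):
    """Count collocates of `node` within `span` words on either side that carry tag `pos`."""
    counts = Counter()
    for i, (word, _) in enumerate(tagged_tokens):
        if word != node:
            continue
        window = tagged_tokens[max(0, i - span):i] + tagged_tokens[i + 1:i + 1 + span]
        counts.update(w for w, tag in window if tag == pos)
    return counts

adjectives = pos_collocates(tagged, "tea", "ADJ", span=1)
print(adjectives.most_common())
```

Filtering this way removes function words such as "and" from the list, which is usually why a part-of-speech limit makes the collocate list easier to read.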

8. Reset the form again, and this time sort by Mutual Information score, with a minimum frequency of 5 or 10.

8.1 What are the top 7-8 collocates now?

8.2 Which search seemed to produce the best results: #3 above (sorting by raw frequency, with an MI threshold) or this one (sorting by Mutual Information)?
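
Mutual Information compares how often the node word and a collocate actually co-occur with how often they would be expected to co-occur by chance. Below is a minimal sketch, assuming the common formulation log2((joint * corpus_size) / (node_freq * collocate_freq * span)). The exact formula and span adjustment used by a given corpus interface may differ, and the function name and all counts here are hypothetical.

```python
import math

def mutual_information(joint, node_freq, collocate_freq, corpus_size, span=1):
    """MI score: log2 of observed co-occurrences over those expected by chance.

    joint          -- times the collocate appears within the span of the node word
    node_freq      -- total frequency of the node word in the corpus
    collocate_freq -- total frequency of the collocate in the corpus
    corpus_size    -- number of tokens in the corpus
    span           -- size of the search window in words (assumed adjustment)
    """
    expected = node_freq * collocate_freq * span / corpus_size
    return math.log2(joint / expected)

# Hypothetical counts: a rare collocate versus a very frequent one
rare = mutual_information(joint=8, node_freq=16, collocate_freq=32, corpus_size=1024)
frequent = mutual_information(joint=8, node_freq=16, collocate_freq=512, corpus_size=1024)

print(rare, frequent)
```

With the same number of co-occurrences, the rarer collocate gets the higher score. This is why sorting by MI tends to favor low-frequency words, and why a minimum frequency of 5 or 10 is applied.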

9. Re-do the search in the BNC. Are there any important differences from COCA?

10. What about the Brown corpus (via AntConc)? Is there enough data to see many patterns?