STATISTICS
 


  • Types of data

    • Quantitative (test scores, number of tokens of a word)

    • Ordinal (top 50 words in FICT and ACAD)

    • Nominal (M/F, state of origin, ethnic background)
       

  • Normalization

    • 120 tokens in 20m words vs. 30 in 4m words (6 vs. 7.5 per million words)
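
The normalization above can be sketched in a few lines of Python. This is only a minimal illustration; the function name `per_million` is a made-up helper, not something from the original notes.

```python
def per_million(tokens: int, corpus_size: int) -> float:
    """Return a word's frequency per million words of corpus."""
    # Multiply before dividing to keep the arithmetic exact for round numbers.
    return tokens * 1_000_000 / corpus_size


# The two raw counts from the example above.
large_corpus_rate = per_million(120, 20_000_000)
small_corpus_rate = per_million(30, 4_000_000)

print(large_corpus_rate)  # 6.0
print(small_corpus_rate)  # 7.5
```

Although the raw count is four times higher in the larger corpus, the normalized rate is actually higher in the smaller one.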
       

  • Simple stats

    Months   Score
    12       67
    14       68
    16       72
    18       71
    20       74
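
The notes do not say which simple statistics are meant for the Months/Score table. As an assumption, the sketch below computes the mean score and the Pearson correlation between months and score, using only the standard library. The helper name `pearson_r` is hypothetical.

```python
import math

# Data copied from the Months/Score table.
months = [12, 14, 16, 18, 20]
scores = [67, 68, 72, 71, 74]


def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    mx, my = mean(xs), mean(ys)
    # Sum of cross-products and sums of squared deviations.
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


print(f"mean score = {mean(scores):.1f}")
print(f"r = {pearson_r(months, scores):.3f}")
```

A value of r close to 1 would indicate that scores tend to rise with the number of months.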

               infected   uninfected
    placebo    81         1427
    vaccine    179        2824
  • With chi-square, want p <= .05 for a significant difference. Here p = .42, so no significant difference between placebo and vaccine
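
The chi-square result can be checked with a short standard-library sketch. It assumes a plain Pearson chi-square test without a continuity correction (which is what reproduces p = .42 for these counts); the function name `chi_square_2x2` is a hypothetical helper.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square (no continuity correction) for a 2x2 table.

    Returns (chi2, p_value) with 1 degree of freedom.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)

    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if row and column were independent.
            expected = row_totals[i] * col_totals[j] / total
            chi2 += (observed - expected) ** 2 / expected

    # For 1 degree of freedom, P(X > chi2) equals erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Rows: placebo, vaccine. Columns: infected, uninfected.
observed = [[81, 1427], [179, 2824]]
chi2, p = chi_square_2x2(observed)
print(f"chi2 = {chi2:.2f}, p = {p:.2f}")  # chi2 = 0.64, p = 0.42
```

Statistical packages often apply a continuity correction to 2x2 tables by default, which would give a somewhat larger p-value for the same counts.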
     

  • Mutual information and z-score (e.g. collocates of havoc, strike, break)
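
As a rough sketch, both collocation measures compare the observed number of co-occurrences with the number expected by chance. The formulas below are one common formulation (several variants exist), and all counts are hypothetical, invented purely for illustration rather than taken from any corpus.

```python
import math


def expected_cooccurrences(node_freq, collocate_freq, corpus_size, span):
    """Co-occurrences expected by chance within a window of `span` words."""
    return node_freq * collocate_freq * span / corpus_size


def mutual_information(observed, expected):
    """MI score: log2 of the observed-to-expected ratio."""
    return math.log2(observed / expected)


def z_score(observed, expected):
    """Simplified z-score: deviation from expectation in units of sqrt(expected)."""
    return (observed - expected) / math.sqrt(expected)


# Hypothetical counts (assumptions, not real corpus data).
node_freq = 1_000         # occurrences of the node word
collocate_freq = 2_000    # occurrences of the candidate collocate
corpus_size = 100_000_000
span = 8                  # e.g. 4 words to the left and 4 to the right
observed = 50             # times the collocate appears within the span

expected = expected_cooccurrences(node_freq, collocate_freq, corpus_size, span)
print(f"expected = {expected:.2f}")
print(f"MI = {mutual_information(observed, expected):.2f}")
print(f"z = {z_score(observed, expected):.1f}")
```

MI tends to favor rare but exclusive collocates, while the z-score gives more weight to raw frequency, so the two measures can rank the same collocates differently.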

  • ANOVA: Analysis of variance (which factors are most important)