Mark Davies
Professor, Corpus Linguistics
Brigham Young University

 

EDUCATION


PUBLICATIONS


  1. (Forthcoming) A Frequency Dictionary of American English: Core Vocabulary for Learners. Routledge. (Co-authored with Dee Gardner.)

  2. (Forthcoming) Corpus linguistic applications: current studies, new directions. Rodopi. (Co-editor, with Stefan Gries and Stefanie Wulff.)

  3. (Forthcoming) "The 385+ Million Word Corpus of Contemporary American English (1990-present)". International Journal of Corpus Linguistics.

  4. (Forthcoming) "Corpus Linguistics Questions and Answers". In Perspectives on Corpus Linguistics, ed. Geoff Barnbrook and Vander Viana. John Benjamins.

  5. (Forthcoming) "Creating Useful Historical Corpora: A Comparison of CORDE, the Corpus del Español, and the Corpus do Português". In Diacronía de las lenguas iberorromances: nuevas perspectivas desde la lingüística de corpus, ed. Andrés Enrique-Arias. Frankfurt/Madrid: Vervuert/Iberoamericana.

  6. (Forthcoming) “Relational databases as a robust architecture for the analysis of word frequency”. In AHRC ICT Methods Network: Expert Seminar on Linguistics: Word Frequency and Keyword Extraction, ed. Dawn Archer. Ashgate.

  7. (Forthcoming) "Research on historical pragmatics with Biblia Medieval (an aligned parallel corpus of medieval Spanish)". In Claus Pusch, et al (eds), Romance Corpus Linguistics III: Corpora and Pragmatics. Gunter Narr. (Co-authored with Andrés Enrique-Arias).

  8. (Forthcoming) "Review of Using Spanish Corpora". Modern Language Journal.

  9. (Forthcoming) "Review of The International Corpus of English – British Component (ICE-GB), the Diachronic Corpus of Present-day Spoken English (DCPSE), and ICECUP 3.1". Language.
     

  10. (2008) "Spanish and Portuguese Corpus Linguistics". Studies in Hispanic and Lusophone Linguistics. 1:149-86.

  11. (2008) "The corpus-based Frequency Dictionary of Portuguese: A new tool for learners and teachers." In Proceedings of TALC 8: Teaching and Language Corpora, ed. Ana Frankenberg-Garcia, et al. Lisbon. (Co-authored with Ana Maria Raposo Preto-Bay)

  12. (2008) "The Corpus of Contemporary American English--a Useful Tool for English Teaching and Research". Computer-Assisted Foreign Language Education in China. 5:24-31 (Co-authored with Wang Xingfu and Liu Guohui).

  13. (2007) A Frequency Dictionary of Portuguese: Core Vocabulary for Learners.  Routledge. (Co-authored with Ana Maria Raposo Preto-Bay)

  14. (2007) "Pointing Out Frequent Phrasal Verbs: A Corpus-Based Analysis". TESOL Quarterly 41:339-59. (Co-authored with Dee Gardner)

  15. (2007) "Semantically-based queries with a joint BNC/WordNet database". In Corpus Linguistics twenty-five years on, ed. Roberta Facchinetti. Amsterdam: Rodopi. 149-167.

  16. (2006) "Towards the first comprehensive survey of register variation in Spanish".  In Corpus Linguistics Beyond the Word: Corpus Research from Phrase to Discourse, ed. Eileen Fitzpatrick. Rodopi. 73-86.

  17. (2006) “Vocabulary Coverage in Spanish Textbooks: How Representative is It?” In Selected Proceedings from the Conference on the Acquisition of Spanish and Portuguese as First and Second Languages, ed. Jacqueline Toribio. Cascadilla. 132-43. (Co-authored with Timothy L. Face).

  18. (2006) "Spoken and written register variation in Spanish: A Multi-dimensional Analysis." Corpora 1:1-37. (Co-authored with Doug Biber, James Jones, and Nicole Tracy-Ventura).

  19. (2005) A Frequency Dictionary of Spanish: Core Vocabulary for Learners.  Routledge.

  20. (2005) "The advantage of using relational databases for large corpora: speed, advanced queries, and unlimited annotation".  International Journal of Corpus Linguistics 10: 301-28.

  21. (2005) "On diachronic shifts with Spanish se: preliminary evidence from large electronic corpora." In Claus Pusch, et al (eds), Romance Corpus Linguistics II: Corpora and Diachronic Linguistics. Gunter Narr. 431-42.

  22. (2005) "Vocabulary Range and Text Coverage: Insights from the Forthcoming Routledge Frequency Dictionary of Spanish". In David Eddington, (ed), Selected Proceedings from the 7th Hispanic Linguistics Symposium. 106-15.

  23. (2005) "Advanced research on syntactic and semantic change with the Corpus del Español". In Claus Pusch, et al (eds), Romance Corpus Linguistics II: Corpora and Diachronic Linguistics. Gunter Narr. 203-14. Reprinted in: Teubert, Wolfgang & Ramesh Krishnamurthy (eds.). 2007. Corpus Linguistics. Critical Concepts in Linguistics (6 vols.). London: Routledge. 337-48 (Volume 5).

  24. (2004) El uso del Corpus del Español y otros corpus para investigar la variación actual y los cambios  históricos. Tokyo: Univ. Sophia.

  25. (2004) Review of Léxico Hispanoamericano (Peter Boyd-Bowman, et al). La Corónica: A Journal of Medieval Spanish Literature and Language 33:259-64.

  26. (2004) "Student use of large, annotated corpora to analyze syntactic variation". In Guy Aston, et al (eds). Corpora and Language Learners. Philadelphia: John Benjamins. 259-69.

  27. (2004) Review of Computer Learner Corpora, Second Language Acquisition and Foreign Language Teaching (Sylviane Granger, et al). Modern Language Journal.

  28. (2004) "Student use of large corpora to investigate language change". In Thomas Upton, et al (eds).  Applied Corpus Linguistics: A Multidimensional Perspective. Amsterdam: Rodopi. 207-22.

  29. (2003)  "Diachronic Shifts and Register Variation with the "Lexical Subject of Infinitive" Construction. (Para yo hacerlo)". In Silvina Montrul and Francisco Ordóñez, Linguistic Theory and Language Development in Hispanic Languages. Somerville, MA: Cascadilla Press. 13-29. 

  30. (2003)  "Annotation without lexicons: an alternative to the standard bootstrapping approach".  In Paul Rayson, et al. Proceedings from Corpus Linguistics 2003 (Lancaster, England, March 2003). 174-83.

  31. (2002) "Un corpus anotado de 100.000.000 palabras del español histórico y moderno". SEPLN 2002 (Sociedad Española para el Procesamiento del Lenguaje Natural). (Valladolid).  21-27.

  32. (2002) "'Esto es ligero de fazer': Object to Subject Raising in Medieval and Early Modern Spanish".  In James F. Lee, et al, Structure, Meaning, and Acquisition of Spanish. Somerville, MA: Cascadilla Press.  19-31.

  33. (2001) “Review of Construcciones causativas en el español medieval by Milagros Alfonso Vega”. Revista Canadiense de Estudios Hispánicos 25:329-30.

  34. (2001) "Creating and using multi-million word corpora from web-based newspapers".  In Corpus Linguistics in North America, eds. Rita C. Simpson and John M. Swales. Ann Arbor: U Michigan P.  58-75.

  35. (2000)  "Using multi-million word corpora of historical and dialectal Spanish texts to teach advanced courses in Spanish linguistics".  In Rethinking Language Pedagogy from a Corpus Perspective, eds. Lou Burnard and Tony McEnery.  Frankfurt am Main; New York: P. Lang. 173-85.

  36. (2000) "Syntactic Diffusion in Spanish and Portuguese Infinitival Complements".  In New Approaches to Old Problems: Issues in Romance Historical Linguistics, eds. Steven Dworkin and Dieter Wanner.  Amsterdam; Philadelphia: John Benjamins. 109-27.

  37. (1999)  "The Historical Development of Subject Raising in Portuguese: A Corpus-Based Approach". Neuphilologische Mitteilungen 100:95-110.

  38. (1999)  "A Computer Corpus-Based Study of Subject Raising in Modern Portuguese". Lingvisticae Investigationes 21:379-400.

  39. (1998)  "The Evolution of Spanish Clitic Climbing: A Corpus-Based Approach." Studia Neophilologica 69:251-63.

  40. (1997)  "A Corpus-Based Approach to Diachronic Clitic Climbing in Portuguese." Hispanic Journal 17: 93-111.

  41. (1997) "Using Large Computer-Based Corpora as a Philological Tool: An Analysis of Four Medieval Spanish Bibles." Dactylus 16: 70-92.

  42. (1997)  "The History of Subject Raising in Spanish". Bulletin of Hispanic Studies (Liverpool) 74: 399-411.

  43. (1997)  "A Corpus-Based Analysis of Subject Raising in Modern Spanish." Hispanic Linguistics 9: 33-63.

  44. (1996)  "The Diachronic Interplay of Finite and Nonfinite Verbal Complements in Spanish and Portuguese." Bulletin of Hispanic Studies (Glasgow) 73:137-58.

  45. (1996) "The Diachronic Evolution of the Causative Construction in Portuguese." Journal of Hispanic Philology 17:261-92.

  46. (1995)  "The Evolution of Causative Constructions in Spanish and Portuguese." In Current Research in Romance Linguistics, ed. John Amastae, et al. Philadelphia: John Benjamins, 1995. 105-122.

  47. (1995)  "Omnipage and WordCruncher: Tools for Creating and Searching Digitalized Text Corpora." La Corónica 23:111-115.

  48. (1995)  "The Evolution of the Spanish Causative Construction." Hispanic Review 63:57-77.

  49. (1995) "Analyzing Syntactic Variation with Computer-Based Corpora: The Case of Modern Spanish Clitic Climbing". Hispania 78:370-380.

  50. (1994)  "Parameters, Passives, and Parsing: Explaining Diachronic Shifts in Spanish and Portuguese". In Variation and Linguistic Theory, ed. K. Beals, et al. Chicago: CLS. Vol 2. 46-60.

  51. (1992)  "A Tentative Bibliography of Historical Spanish Syntax." Hispanic Linguistics 5:279-351.

CONFERENCE PRESENTATIONS / WORKSHOPS


  1. (2009) "The Corpus of Contemporary American English (385+ Million Words, 1990-present)". Conference on "Practical Applications in Language and Computers (PALC)". Univ. Lodz, Poland. [Keynote speaker]

  2. (2009) "Examining recent syntactic shifts with the Corpus of Contemporary American English". ICAME (International Computer Archive of Modern and Medieval English). Lancaster Univ, England.

  3. (2009) "Examining genre-based variation and recent historical shifts with the Corpus of Contemporary American English". Conference on "Design and Delivery of Online Corpora". Univ. Glasgow, Scotland. [Keynote speaker]

  4. (2009) "The 385+ Million Word Corpus of Contemporary American English (1990-present): A new tool for examining language variation and change". Digital Humanities 2009. Univ. Maryland.

  5. (2009) "Large text archives and structured corpora: Case studies from syntactic shifts in twentieth-century American English". 19th International Conference on Historical Linguistics. Nijmegen, Netherlands.

  6. (2009) Workshop on Online Corpora. Univ. Bergen, Norway. [Invited presentation]

  7. (2009) American Association for Corpus Linguistics. Univ. Alberta. [Keynote speaker] (Topic to be announced)

  8. (2008) "The 360 million word BYU Corpus of American English (1990-2007)". ICAME (International Computer Archive of Modern and Medieval English). Univ. Zurich.

  9. (2007) "Investigating Recent Linguistic Shifts with a New 100+ Million Word Corpus of American English from the 1900s". SHEL 5 (Studies in the History of the English Language). Univ. Georgia.

  10. (2007) "From the Corpus del español to the Corpus do portugues (and back again): Evolving architectures for historical corpora". Colloquium on Ibero-Romance Historical Corpora. Univ. Balearic Islands, Spain. [Keynote speaker]

  11. (2007) "A 100+ Million Word Corpus of American Magazines, 1900-1999".  ICAME (International Computer Archive of Modern and Medieval English). Stratford-upon-Avon.

  12. (2009) Series of presentations on using online corpora. Univ. Murcia, Spain. [Invited presentation]

  13. (2006) "A new, 37 million word, Web-based corpus of historical English". ICAME (International Computer Archive of Modern and Medieval English). Univ. of Helsinki.

  14. (2006) "Competing architectures for large historical corpora".  Workshop on Historical Text Mining (AHRC ICT Methods workshop). Univ. of Lancaster (England). [Keynote speaker]

  15. (2006) "Towards a 250 Million Word Corpus of Historical English". Bringing Text Alive: The Future of Scholarship, Pedagogy, and Electronic Publication (Text Creation Partnership). U Michigan.

  16. (2006) "Resolving Trade Name Legal Disputes through Corpus Research". American Association of Applied Corpus Linguistics. Northern Arizona U.

  17. (2006) "Incorporating “meaning-based” searches into corpus architectures and interfaces". American Association of Applied Corpus Linguistics. Northern Arizona U.

  18. (2006) "Size, speed, and annotation with historical corpora". Digital Historical Corpora - Architecture, Annotation, and Retrieval. Dagstuhl Int'l Conference and Research Center for Computer Science (#06491). [Invited speaker]

  19. (2005) “Vocabulary Coverage in Spanish Textbooks: How Representative is It?” Conference on the Acquisition of Spanish and Portuguese as First and Second Languages. Penn State Univ.

  20. (2005) “Relational databases as a robust architecture for the analysis of word frequency”. AHRC ICT Methods Network: Expert Seminar on Linguistics: Word Frequency and Keyword Extraction. Univ. of Lancaster, England. [Invited speaker]

  21. (2005) “A new interface for examining use of synonyms (and other related words) in the BNC.” Corpus Linguistics 2005. Univ. of Birmingham, England.

  22. (2004) "The frequency and distribution of [se] constructions in Spanish: a corpus and learner-based approach."  7th Conference on the Acquisition of Spanish and Portuguese as First and Second Languages. Univ. of Minnesota. (With Timothy L. Face)

  23. (2004) "Incorporating register variation into BNC queries: a relational database approach."  Sixth International Conference on Teaching and Language Corpora. Granada, Spain.

  24. (2004) "El uso del Corpus del Español y otros corpus para investigar la variación actual y los cambios históricos."  Series of workshops presented at Sophia University (Tokyo, Japan).  [Invited presentation]

  25. (2004) "Creating and Using Corpora to Investigate Language Change and Variation."  Department of Linguistics, Sophia University (Tokyo, Japan).

  26. (2004) "El diseño y uso de los corpus grandes para investigar el cambio lingüístico y la variación actual." Kobe University, Japan.

  27. (2004) "A joint BNC/WordNet database: the best of both worlds". The Fifth North American Symposium on Corpus Linguistics. Montclair State, NJ.

  28. (2004) "A multi-dimensional analysis of register variation in Spanish". The Fifth North American Symposium on Corpus Linguistics. Montclair State, NJ.

  29. (2004) "A match made in corpus heaven: the BNC and WordNet in relational database form." 25th Conference of the International Computer Archive of Modern and Medieval English. Verona, Italy.

  30. (2004) "The impact of phrasal forms in corpus-based vocabulary studies".  AAAL 2004 (American Association of Applied Linguistics). Portland, OR.

  31. (2003) "How much vocabulary is enough?: Insights from recent corpus-based studies".  6th Conference on the Acquisition of Spanish and Portuguese as First and Second Languages. U New Mexico.

  32. (2003) "A multidimensional analysis of register variation in Spanish".  7th Hispanic Linguistics Symposium. U New Mexico.

  33. (2003) "Advanced research on syntactic and semantic change with the 100 million word, fully-annotated Corpus del Español".  2nd Freiburg Workshop on Romance Corpus Linguistics. U Freiburg, Germany. September 2003.

  34. (2003) "On the frequency, use, and omission of se: Evidence from the 100 million word Corpus del Español".  2nd Freiburg Workshop on Romance Corpus Linguistics. U Freiburg, Germany. September 2003.

  35. (2003) "Annotation without lexicons".  Corpus Linguistics 2003. Lancaster University, UK. March 2003.

  36. (2003) "Relational n-gram databases as a basis for unlimited annotation on very large corpora". Workshop on the Shallow Processing of Large Corpora.  Lancaster University, UK. March 2003.

  37. (2002) "Using Relational Databases to Create Highly Searchable and Very Large Corpora". The Fourth North American Symposium on Corpus Linguistics. IUPUI, Indianapolis, IN.

  38. (2002) "Student use of a 100 million word, fully annotated corpus of Spanish to model language variation and change". Fifth International Conference on Teaching and Language Corpora. Bertinoro, Italy.

  39. (2002) "Un corpus anotado de 100.000.000 palabras del español histórico y moderno". SEPLN 2002 (Sociedad Española para el Procesamiento del Lenguaje Natural). (Univ. de Valladolid, Spain).

  40. (2002) "Modeling Syntactic Change with the Fully Annotated, 100 Million Word 'Corpus del Español': Suppression of se with Causative Verbs". Sixth Hispanic Linguistics Symposium. Univ. of Iowa.

  41. (2002) "A Searchable, Fully-Annotated, 100 Million Word Corpus of Historical Spanish Texts". Kentucky Foreign Language Conference. Univ. of Kentucky.

  42. (2002) "A 100 Million Word Corpus of Historical and Modern Spanish, Searchable by Grammatical Category, Lemma, and Related Words".  XXX Romance Linguistics Symposium. Cambridge, England.

  43. (2002) "How to Make Large Corpora both Fast and Highly Annotated". 5th Annual CLUK (Computational Linguistics in the UK) Research Colloquium. Leeds, England. 

  44. (2001) "Multimillion Word Online Corpora as a Tool for Language Learning".  The Seventh Sloan-C International Conference on Online Learning: Emerging Standards of Excellence in Asynchronous Learning Networks. Orlando, FL.

  45. (2001) "Virtually Unlimited Annotation on Very Large Corpora".  IRCS Workshop on Linguistic Databases.  Philadelphia, PA.

  46. (2001) "Dialectal Variation and Diachronic Shifts with the "Preposition + Subject + Infinitive" Construction (para yo hacerlo)".  Fifth Hispanic Linguistics Symposium. Univ. Illinois.

  47. (2001) "Large Historical Corpora on the Web: Helping Students to Model Linguistic Change". The Third North American Symposium on Corpus Linguistics and Language Teaching. Boston, MA.

  48. (2000) "Diachronic Shifts in Spanish Raising Constructions".  4th Hispanic Linguistics Symposium, Indiana University.

  49. (2000) "Using Large Computer-Based Corpora as a Philological Tool: An Analysis of Five Medieval Spanish Bibles". 35th International Congress on Medieval Studies, Univ. Western Michigan, May 2000.

  50. (2000) "Using Large Computer-Based Parallel Texts to Study Lexical (and Other) Changes from Old Spanish to Modern Spanish". Kentucky Foreign Language Conference, April 2000.

  51. (1999)  "Diachronic shifts in the interpretation of Spanish infinitival complements". Conference on Spanish Semantics and Pragmatics. Ohio State Univ.

  52. (1999)  "Languages as dialects and dialects as languages: explaining parallel syntactic shifts in Spanish and Portuguese". International Conference on Historical Linguistics (ICHL XIV). Univ. of British Columbia.

  53. (1999)  "Creating Multimillion Word Corpora from Web-based Newspapers". North American Symposium on Corpora in Linguistics and Language Teaching. Univ. of Michigan.

  54. (1999)  "Modeling syntactic change: evidence from computer-based studies of infinitival complements in Spanish and Portuguese". Linguistic Symposium on Romance Languages (LSRL 29). Univ. of Michigan.

  55. (1998)  "Using multimillion word corpora of historical and dialectal Spanish texts to teach 'Advanced Spanish Syntax'".  Teaching and Language Corpora 1998 conference. Oxford University, England.

  56. (1998)  "A Corpus-Based Analysis of Subject Raising in Historical and Modern Spanish". Univ. Texas-Austin Colloquium on Romance Linguistics. Austin, TX

  57. (1997)  "Using Large Computer-Based Corpora to Investigate Language Variation and Change". Deseret Language and Linguistics Symposium. Provo, UT.

  58. (1996)  "The Use of Large Computer-Based Corpus in Research and Teaching". Colloquium on Spanish Linguistics. Roanoke, VA.

  59. (1995)  "The Diachronic Evolution of Portuguese Clitic Climbing". Annual Meeting of the Modern Language Association (MLA), Chicago, IL.

  60. (1995)  "Exploring Foreign Language Resources on the Internet". A series of presentations and workshops given to Illinois K-12 teachers; organized under the auspices of the Illinois Council on the Teaching of Foreign Languages (ICTFL).

  61. (1994)  "A Corpus-Based Approach to Modern Spanish Clitic Climbing." American Association of the Teachers of Spanish and Portuguese Annual Meeting (AATSP). Philadelphia, PA.

  62. (1994)  "Parameters, Passives, and Parsing: Explaining Diachronic Shifts in Spanish and Portuguese". Parasession at the Chicago Linguistics Society (CLS). Chicago, IL.

  63. (1992)  "Explaining Diachronic Shifts in Spanish and Portuguese Causative Constructions". Linguistic Symposium on Romance Languages (LSRL XXII). El Paso, TX.

  64. (1991)  "A Diachronic Look at Infinitival Complements in Spanish and Portuguese." Modern Language Association (MLA). San Francisco, CA.

  65. (1991)  "Parameters in Diachronic Spanish and Portuguese Causative Constructions." South Central Modern Language Association (SCMLA). Dallas, TX.

  66. (1991)  "Parameters in the Development of Diachronic Infinitival Complements in Spanish and Portuguese." Language Association of the Southwest (LASSO). Austin, TX.

  67. (1991)  "Towards a Unified Account of Diachronic Spanish Clitic Placement." Linguistic Symposium on Romance Languages (LSRL XXI). Santa Barbara, CA.

  68. (1990)  "Functional-Typological Explanations for Diachronic Shifts in Spanish Clitic Placement." "Explanation in Historical Linguistics" conference. Univ. Wisconsin-Milwaukee

CONSULTANCIES (LEGAL)


2004-05.  Expert witness for Wilmer Cutler Pickering Hale and Dorr LLP (Washington DC) in a case involving the generic status of a product name in Latin America

2008.  Expert witness for Wilmer Cutler Pickering Hale and Dorr LLP (Washington DC) in a case involving the generic status of a product name in Latin America (different lawsuit)

TEACHING EXPERIENCE


Brigham Young University (2003-present)

Graduate Courses (Illinois State University)

Undergraduate Courses (Illinois State University)

Graduate Course (Visiting Professor, Brigham Young University, Summer 1995)

  • Spanish Morphosyntax (625, Graduate Seminar)

HONORS AND AWARDS


External

  • 2009. $200,000, two year grant from the National Endowment for the Humanities to create a 300 million word, annotated, online Corpus of Historical American English.

  • 2004. $250,000, two year grant from the National Endowment for the Humanities to create a 45 million word, annotated, online Corpus do Português.

  • 2002. $155,000, two year grant from the National Science Foundation to research the "Multi-dimensional analysis of register variation in Spanish". Co-PI with Douglas Biber of Northern Arizona University. 

  • 2001. $125,000, two year grant from the National Endowment for the Humanities to develop a "100 million word web-based searchable corpus of historical Spanish texts."

Internal

  • 2008. [BYU] Barker Lectureship (One scholar in language/linguistics selected from among the faculty in the College of Humanities)

  • 2001-02. [ISU] One of three mentors for the Distance Learning Training Program.

  • 2000. [ISU] Grant from the "Extended University" to develop online course: "Variation in Spanish Syntax".

  • 1999. [ISU] Grant from the "Extended University" to develop online course: "History of the Spanish Language".

  • 1997. [ISU] Grant from the Provost's Office to develop online course: "Foreign Language Resources on the Internet".

  • 1997. [ISU] Teaching Initiative Award (One of seven awarded at the university)

  • 1997. [ISU] "Faculty Fellow" award from the Center for the Advancement of Teaching.

  • 1995. [ISU] Research Initiative Award (One of seven awarded at the university).

  • 1996. [ISU] Office of Instructional Technology Grant to develop a course to be taught entirely by means of the Internet

  • 1996. "Best of Illinois" award from the 1200+ member Illinois Council on the Teaching of Foreign Languages (ICTFL) (One award given each year)