Native language identification and writing proficiency

Kristopher Kyle, Scott A. Crossley, You Jin Kim

Research output: Contribution to journal › Article › peer-review

9 Citations (Scopus)


This study evaluates the impact of writing proficiency on native language identification (NLI), a topic with important implications for the generalizability of NLI models and for detection-based arguments for cross-linguistic influence (CLI; Jarvis 2010, 2012). The study uses multinomial logistic regression to classify the first language (L1) group membership of essays at two proficiency levels, based on systematic lexical and phrasal choices made by members of five L1 groups. The results indicate that lower proficiency essays are significantly easier to classify than higher proficiency essays, suggesting that lower proficiency writers who share an L1 make lexical and phrasal choices that are more similar to one another than those of higher proficiency writers who share an L1. A close analysis of the findings also indicates that the relationship between NLI accuracy and proficiency differs across L1 groups.
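To make the classification setup in the abstract concrete, the following is a minimal sketch of multinomial (softmax) logistic regression over word unigram and bigram counts, written in plain Python. It is an illustration only and not the authors' implementation: the feature set, the training procedure (plain stochastic gradient ascent), the hyperparameters, the group labels `L1_A`, `L1_B`, `L1_C`, and the toy sentences are all assumptions invented for this example.

```python
import math
from collections import Counter


def ngram_features(text, n_max=2):
    """Count word unigrams and bigrams (rough stand-ins for lexical and phrasal choices)."""
    tokens = text.lower().split()
    feats = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            feats[" ".join(tokens[i:i + n])] += 1
    return feats


def softmax(scores):
    """Turn a {label: score} dict into a {label: probability} dict."""
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {lab: math.exp(s - top) for lab, s in scores.items()}
    total = sum(exps.values())
    return {lab: e / total for lab, e in exps.items()}


class MultinomialLogit:
    """Softmax regression with one sparse weight vector per L1 label (hypothetical helper)."""

    def __init__(self, labels):
        self.labels = sorted(set(labels))
        self.weights = {lab: {} for lab in self.labels}

    def predict_proba(self, feats):
        scores = {
            lab: sum(w.get(f, 0.0) * v for f, v in feats.items())
            for lab, w in self.weights.items()
        }
        return softmax(scores)

    def predict(self, feats):
        probs = self.predict_proba(feats)
        return max(self.labels, key=lambda lab: probs[lab])

    def fit(self, X, y, epochs=200, lr=0.1):
        # Stochastic gradient ascent on the log-likelihood, no regularization.
        for _ in range(epochs):
            for feats, gold in zip(X, y):
                probs = self.predict_proba(feats)
                for lab in self.labels:
                    error = (1.0 if lab == gold else 0.0) - probs[lab]
                    w = self.weights[lab]
                    for f, v in feats.items():
                        w[f] = w.get(f, 0.0) + lr * error * v
        return self


# Invented toy "essays"; the labels are anonymous placeholders, not real L1 groups.
toy_essays = [
    ("i am agree with this opinion", "L1_A"),
    ("i am agree that students must study", "L1_A"),
    ("according to me the city is big", "L1_B"),
    ("according to me this plan is good", "L1_B"),
    ("he discussed about the problem", "L1_C"),
    ("we discussed about the plan today", "L1_C"),
]
X = [ngram_features(text) for text, _ in toy_essays]
y = [label for _, label in toy_essays]

model = MultinomialLogit(y).fit(X, y)
print(model.predict(ngram_features("i am agree with you")))
```

Under these assumptions, a comparison like the one the abstract describes would amount to fitting and evaluating such a classifier separately on lower and higher proficiency essays and comparing the resulting accuracies; a real study would also need held-out evaluation and far more data than this toy set.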

Original language: English
Pages (from-to): 187-209
Number of pages: 23
Journal: International Journal of Learner Corpus Research
Issue number: 2
Publication status: Published - 2015

Bibliographical note

Publisher Copyright:
© John Benjamins Publishing Company.

All Science Journal Classification (ASJC) codes

  • Language and Linguistics
  • Linguistics and Language
  • Education

