Native Language Identification: A Key N-gram Category Approach

Kristopher Kyle, Scott Crossley, Jianmin Dai, Danielle S. McNamara

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

5 Citations (Scopus)

Abstract

This study explores the efficacy of an approach to native language identification that uses grammatical, rhetorical, semantic, syntactic, and cohesive function categories composed of key n-grams. A model based on these key n-gram categories predicted the L1 of essays written in English by L2 learners from 11 different L1 backgrounds with an accuracy of 59%. Preliminary findings concerning instances of crosslinguistic influence are discussed, along with evidence of language similarities based on patterns of language misclassification.

Original language: English
Title of host publication: Proceedings of the 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013
Editors: Joel Tetreault, Jill Burstein, Claudia Leacock
Publisher: Association for Computational Linguistics (ACL)
Pages: 242-250
Number of pages: 9
ISBN (Electronic): 9781937284473
Publication status: Published - 2013
Event: 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013 - Atlanta, United States
Duration: 2013 Jun 13 → …

Publication series

Name: Proceedings of the 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013

Conference

Conference: 8th Workshop on Innovative Use of NLP for Building Educational Applications, BEA 2013
Country/Territory: United States
City: Atlanta
Period: 13/6/13 → …

Bibliographical note

Publisher Copyright:
© 2013 Association for Computational Linguistics.

All Science Journal Classification (ASJC) codes

  • Computer Science Applications
  • Information Systems
  • Computational Theory and Mathematics
