Language Identification: A Computational Linguistics Primer

Slides and results from a talk I gave at Kalamazoo College on language identification.

My co-worker at Powerset, Chris Biemann, has a nice paper on Unsupervised Language Identification.


One response to "Language Identification: A Computational Linguistics Primer"

  1. Daniel Lemire April 27, 2009 at 3:40 pm

    Great. Thanks for sharing.

    I did some vaguely related work hashing n-grams… you may appreciate it:

    Recursive n-gram hashing is pairwise independent, at best

    (You provided the initial motivation of this paper a long, long, long time ago!)

