Detects the language of a given piece of text.
The package attempts to detect the language of a text sample by ranking the sample's 3-gram (trigram) frequencies and correlating that ranking with stored tables of 3-gram frequencies for known languages.
It implements a variant of the technique originally proposed by Cavnar & Trenkle (1994) in "N-Gram-Based Text Categorization".
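
To make the idea concrete, here is a minimal Python sketch of the rank-comparison approach described by Cavnar & Trenkle. It is not this package's code or API: the function names (`ranked_trigrams`, `out_of_place`, `detect`), the 300-trigram cutoff, the penalty value, and the tiny training sentences are all assumptions chosen for illustration. A real detector would build its profiles from much larger corpora.

```python
from collections import Counter


def ranked_trigrams(text, limit=300):
    """Map the most frequent trigrams of `text` to their rank (0 = most frequent).

    The `limit` of 300 is an illustrative assumption, not a documented value.
    """
    # Lowercase, collapse whitespace, and pad with spaces so that
    # word boundaries contribute trigrams too.
    cleaned = " " + " ".join(text.lower().split()) + " "
    counts = Counter(cleaned[i:i + 3] for i in range(len(cleaned) - 2))
    # Sort by descending count; break ties alphabetically for determinism.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return {gram: rank for rank, (gram, _) in enumerate(ordered[:limit])}


def out_of_place(sample, profile, max_penalty=300):
    """Sum the rank differences between a sample and a language profile.

    Trigrams missing from the profile incur `max_penalty` (an assumed value).
    Lower totals mean the sample resembles the profile more closely.
    """
    total = 0
    for gram, rank in sample.items():
        if gram in profile:
            total += abs(rank - profile[gram])
        else:
            total += max_penalty
    return total


def detect(text, profiles):
    """Return the name of the profile with the smallest out-of-place distance."""
    sample = ranked_trigrams(text)
    return min(profiles, key=lambda lang: out_of_place(sample, profiles[lang]))


# Toy training texts (hypothetical; far too small for real use).
TRAINING = {
    "english": "the quick brown fox jumps over the lazy dog and the cat "
               "sat on the mat with the other animals",
    "german": "der schnelle braune fuchs springt gegen den faulen hund "
              "und die katze sitzt auf der matte",
}
PROFILES = {lang: ranked_trigrams(text) for lang, text in TRAINING.items()}

if __name__ == "__main__":
    print(detect("the dog and the cat", PROFILES))
    print(detect("der hund und die katze", PROFILES))
```

The sketch scores by summed rank differences (the "out-of-place" measure from the paper) rather than by a statistical correlation coefficient; either way, the language whose stored ranking is closest to the sample's ranking wins.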