Detecting the language

First, you may want to get the list of supported languages. Retrieve it by calling getLanguages() on a Text_LanguageDetect object. It returns an array of strings naming the languages, e.g. array('albanian', 'arabic', 'azeri').
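A minimal sketch of listing the supported languages. It assumes the package is installed via PEAR so that Text/LanguageDetect.php is on the include path; the variable names are illustrative only.

```php
<?php
// Assumption: the PEAR package is installed and on the include path.
require_once 'Text/LanguageDetect.php';

$detector = new Text_LanguageDetect();

// getLanguages() returns an array of language name strings.
$languages = $detector->getLanguages();

echo count($languages), " languages supported:\n";
foreach ($languages as $language) {
    echo $language, "\n";
}
```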

To detect the language of a piece of text, use the detect() method on the Text_LanguageDetect object. It takes the text as its first parameter and an optional $limit as its second parameter, which caps the number of candidate languages returned. The method returns an array with the language names as keys and their scores as values, sorted so that the most likely language comes first. If no language is detected, an empty array is returned.
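The following sketch shows one way to call detect() with a limit and walk over the result. The sample text and the limit of 3 are arbitrary choices for illustration, and the include path is an assumption about a standard PEAR installation.

```php
<?php
// Assumption: the PEAR package is installed and on the include path.
require_once 'Text/LanguageDetect.php';

$detector = new Text_LanguageDetect();

// Illustrative input; a few sentences give the detector more to work with.
$text = 'The weather was pleasant this morning, so we walked to the market. '
      . 'We bought bread, cheese and some fruit for the afternoon. '
      . 'Later we sat by the river and read until the sun went down.';

// Ask for at most three candidate languages.
$result = $detector->detect($text, 3);

if (empty($result)) {
    echo "No language detected\n";
} else {
    // Keys are language names, values are their scores.
    foreach ($result as $language => $score) {
        echo $language, ': ', $score, "\n";
    }
}
```

Checking for an empty array before iterating keeps the "nothing detected" case explicit instead of silently printing nothing.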

To get only the most likely language, use detectSimple(), which returns the language name as a string, or null if none was detected.
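A short sketch of detectSimple() with the null case handled. As before, the include path and the sample text are assumptions made for the example.

```php
<?php
// Assumption: the PEAR package is installed and on the include path.
require_once 'Text/LanguageDetect.php';

$detector = new Text_LanguageDetect();

// Illustrative input text of a few sentences.
$text = 'The train leaves at nine in the morning and arrives before noon. '
      . 'Please remember to bring your ticket and a valid passport. '
      . 'We will meet the rest of the group at the station entrance.';

// Returns the most likely language as a string, or null if none was detected.
$language = $detector->detectSimple($text);

if ($language === null) {
    echo "Could not detect the language\n";
} else {
    echo 'Detected language: ', $language, "\n";
}
```

The strict comparison with null distinguishes "nothing detected" from any string result.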

For reliable detection, the input text should be at least a few sentences long.

Last updated: Sat, 16 Feb 2019