Package Information: I18N_UnicodeNormalizer

» Summary
Unicode Normalizer
» License
The BSD License
» Current Release
1.0.0 (stable) was released on 2007-08-04 (Changelog)
» Bug Summary
No open bugs
» Description
"...Unicode's normalization is the concept of character composition and decomposition.
Character composition is the process of combining simpler characters into fewer precomposed characters, such as the n character and the combining ~ character into the single ñ character. Decomposition is the opposite process, breaking precomposed characters back into their component pieces...
...Normalization is important when comparing text strings for searching and sorting (collation)..." [Wikipedia]
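To make the quoted point about comparison concrete, here is a minimal sketch using Python's standard `unicodedata` module. It is not this package's API; the helper name `same_text` is hypothetical and only illustrates why two visually identical strings need normalizing before they are compared.

```python
import unicodedata

decomposed = "n\u0303"   # 'n' followed by U+0303 COMBINING TILDE
precomposed = "\u00f1"   # U+00F1 LATIN SMALL LETTER N WITH TILDE

def same_text(a: str, b: str) -> bool:
    """Compare two strings after bringing both to NFC (hypothetical helper)."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The raw code point sequences differ, so a plain comparison fails.
print(decomposed == precomposed)

# After normalization both sides are the single precomposed character.
print(same_text(decomposed, precomposed))

# Decomposition goes the other way: one code point becomes two.
print(len(unicodedata.normalize("NFD", precomposed)))
```

The same idea applies to sorting and searching: normalize both the stored text and the query to one form before comparing them.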
Performs the 4 normalizations:
  NFD: Canonical Decomposition
  NFC: Canonical Decomposition, followed by Canonical Composition
  NFKD: Compatibility Decomposition
  NFKC: Compatibility Decomposition, followed by Canonical Composition
Complies with the official Unicode.org regression test.
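The difference between the four forms is easiest to see on a string that contains both a compatibility character and an accented letter. The sketch below again uses Python's `unicodedata` rather than this package; `all_forms` and `code_points` are hypothetical helper names introduced for illustration.

```python
import unicodedata

FORMS = ("NFD", "NFC", "NFKD", "NFKC")

def all_forms(text: str) -> dict:
    """Return the text normalized to each of the four Unicode forms."""
    return {form: unicodedata.normalize(form, text) for form in FORMS}

def code_points(text: str) -> str:
    """Render a string as a space-separated list of U+XXXX code points."""
    return " ".join(f"U+{ord(ch):04X}" for ch in text)

# U+FB01 is the 'fi' ligature (a compatibility character),
# U+00E9 is 'e' with acute accent (a canonical composite).
sample = "\ufb01\u00e9"

for form, result in all_forms(sample).items():
    print(form, code_points(result))
```

The canonical forms (NFD, NFC) leave the ligature untouched and only split or rejoin the accented letter, while the compatibility forms (NFKD, NFKC) also replace the ligature with the plain letters f and i.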
Uses UTF-8 binary strings natively, but can normalize a string in any UTF format.
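One plausible way to handle input in another UTF encoding is to decode it, normalize, and encode it back. The sketch below shows that round trip in Python; it is an assumption about the general approach, not a description of how this package does it internally, and `normalize_bytes` is a hypothetical name.

```python
import unicodedata

def normalize_bytes(data: bytes, encoding: str, form: str = "NFC") -> bytes:
    """Decode bytes from the given UTF encoding, normalize, and re-encode."""
    text = data.decode(encoding)
    return unicodedata.normalize(form, text).encode(encoding)

# 'n' + combining tilde, stored as UTF-16 little-endian (4 bytes, no BOM).
utf16_input = "n\u0303".encode("utf-16-le")

# After NFC the two code points collapse into U+00F1 (2 bytes in UTF-16).
utf16_output = normalize_bytes(utf16_input, "utf-16-le")
print(len(utf16_input), len(utf16_output))
```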
Fully tested with PHPUnit. Code coverage is close to 100%.