I18N_UnicodeNormalizer

Class: I18N_UnicodeNormalizer_String

Source Location: /I18N_UnicodeNormalizer-1.0.0RC4/UnicodeNormalizer/String.php

Class Overview


Manipulation of Unicode and UTF-8 strings


Author(s):

Version:

  • Release: @package_version@

Copyright:

  • 2007 Michel Corne


Class Details

[line 53]
Manipulation of Unicode and UTF-8 strings

Converts characters and strings between Unicode code points in UCN format and UTF-8. Splits UTF-8 strings into characters. Extracts a single character from a UTF-8 string.





Method Detail

char2unicode   [line 67]

string char2unicode( mixed $char, [string $invalid = '?'])

Converts a UTF-8 character into a Unicode code point
  • Return: the Unicode code point in UCN format, e.g. \u00A0; an invalid character is replaced by the substitution ASCII character
  • Access: public

Parameters:

mixed   $char   —  the UTF-8 character as a multibyte binary string or as an integer
string   $invalid   —  the ASCII character substituted when the UTF-8 character is invalid; the default is "?"
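
The conversion described above can be sketched without the package. The function below is a hypothetical stand-in (sketch_char2unicode is not part of I18N_UnicodeNormalizer), it handles only the string form of $char, and the uppercase hexadecimal output is an assumption.

```php
<?php
// Hypothetical illustration only; not the package's implementation.
function sketch_char2unicode(string $char, string $invalid = '?'): string
{
    // With the /u modifier, preg_match() fails on malformed UTF-8,
    // so this test also rejects invalid input and multi-character strings.
    if (preg_match('/\A.\z/us', $char) !== 1) {
        return $invalid;
    }
    $bytes = array_values(unpack('C*', $char));
    $count = count($bytes);
    if ($count === 1) {
        $code = $bytes[0];
    } else {
        // The lead byte carries 5, 4, or 3 payload bits for 2-, 3-, or 4-byte sequences.
        $code = $bytes[0] & (0xFF >> ($count + 1));
        for ($i = 1; $i < $count; $i++) {
            // Each continuation byte contributes its low 6 bits.
            $code = ($code << 6) | ($bytes[$i] & 0x3F);
        }
    }
    return sprintf($code <= 0xFFFF ? '\u%04X' : '\U%08X', $code);
}
```

Under these assumptions, the two-byte sequence C2 A0 (no-break space) yields \u00A0, and a lone FF byte yields the substitution character.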


dec2ucn   [line 140]

string dec2ucn( integer $int)

Converts an integer to a hexadecimal string in UCN format, e.g. \u00A0 or \U0012abcd
  • Return: the UCN string
  • Access: public

Parameters:

integer   $int   —  the integer
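
A minimal sketch of the UCN formatting, assuming four hexadecimal digits after \u for code points up to FFFF and eight digits after \U above that. The function name sketch_dec2ucn is hypothetical, and the uppercase digits are an assumption (the examples above show both cases).

```php
<?php
// Hypothetical illustration only; not the package's implementation.
function sketch_dec2ucn(int $int): string
{
    if ($int <= 0xFFFF) {
        // Short form: \u followed by 4 hexadecimal digits.
        return sprintf('\u%04X', $int);
    }
    // Long form: \U followed by 8 hexadecimal digits.
    return sprintf('\U%08X', $int);
}
```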


getChar   [line 167]

string getChar( string $string, integer &$pos, integer $length, [boolean $lookahead = false], [string $invalid = '?'])

Gets the current character from a UTF-8 string

Expects a valid UTF-8 string. Returns the substitution character if the first byte of the character is invalid; the bytes following the first one are not validated.

  • Return: the UTF-8 character, or false if there are no more characters to get
  • Access: public

Parameters:

string   $string   —  the UTF-8 string
integer   &$pos   —  the current byte position within the UTF-8 string, the position is updated to the next character on exit
integer   $length   —  the length of the UTF-8 string
boolean   $lookahead   —  leaves the position unchanged if true, updates it to the next UTF-8 character if false; the default is false
string   $invalid   —  the ASCII character replacing an invalid byte, e.g. "?"; invalid bytes are silently ignored if null
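
The position handling can be sketched as follows. This is a hypothetical stand-in (sketch_get_char is not the package's code) that derives the character length from the first byte only, assumes $lookahead = true leaves $pos untouched, and does not cover the null $invalid case.

```php
<?php
// Hypothetical illustration only; not the package's implementation.
function sketch_get_char(string $string, int &$pos, int $length, bool $lookahead = false, string $invalid = '?')
{
    if ($pos >= $length) {
        return false; // no more characters to get
    }
    $byte = ord($string[$pos]);
    if ($byte <= 0x7F) {
        $size = 1; // 0xxxxxxx: ASCII
    } elseif (($byte & 0xE0) === 0xC0) {
        $size = 2; // 110xxxxx
    } elseif (($byte & 0xF0) === 0xE0) {
        $size = 3; // 1110xxxx
    } elseif (($byte & 0xF8) === 0xF0) {
        $size = 4; // 11110xxx
    } else {
        $size = 0; // stray continuation byte or invalid first byte
    }
    // The following bytes are taken as they are, without validation.
    $char = $size > 0 ? substr($string, $pos, $size) : $invalid;
    if (!$lookahead) {
        $pos += max($size, 1); // skip exactly one byte when the first byte is invalid
    }
    return $char;
}

// Walk a string character by character.
$string = "a\xC2\xA0b";
$length = strlen($string);
$pos = 0;
$chars = [];
while (($char = sketch_get_char($string, $pos, $length)) !== false) {
    $chars[] = $char;
}
```

The loop compares against false with !== so that a "0" character does not end the walk early.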


split   [line 213]

array split( string $string)

Splits a UTF-8 string into its characters

Expects a valid UTF-8 string.

  • Return: the UTF-8 string characters
  • Access: public

Parameters:

string   $string   —  the UTF-8 string
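
For comparison, the same result can be obtained with PHP's built-in preg_split() and the /u modifier. This is a sketch of an alternative technique, not the package's code; sketch_split is a hypothetical name, and returning an empty array for malformed input is an assumption.

```php
<?php
// Hypothetical illustration only; not the package's implementation.
function sketch_split(string $string): array
{
    // An empty pattern with /u splits between UTF-8 characters rather than bytes.
    $chars = preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY);
    // preg_split() returns false on malformed UTF-8.
    return $chars === false ? [] : $chars;
}
```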


string2unicode   [line 253]

mixed string2unicode( string $string, [boolean $toString = true])

Converts a UTF-8 string to a Unicode string in UCN format

Expects a valid UTF-8 string. Example: string2unicode('123') returns '\u0031\u0032\u0033'.

  • Return: the Unicode string in UCN format, or an array of UCN characters if $toString is false
  • Access: public

Parameters:

string   $string   —  the UTF-8 string
boolean   $toString   —  returns a string if true, or an array of characters if false, the default is true
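
The effect of $toString can be sketched as below. The function sketch_string2unicode is a hypothetical stand-in that assumes valid UTF-8 input and uppercase hexadecimal digits; it is not the package's implementation.

```php
<?php
// Hypothetical illustration only; not the package's implementation.
function sketch_string2unicode(string $string, bool $toString = true)
{
    $ucn = [];
    // Split into UTF-8 characters, then decode each one to its code point.
    foreach (preg_split('//u', $string, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $bytes = array_values(unpack('C*', $char));
        $count = count($bytes);
        $code = $count === 1 ? $bytes[0] : $bytes[0] & (0xFF >> ($count + 1));
        for ($i = 1; $i < $count; $i++) {
            $code = ($code << 6) | ($bytes[$i] & 0x3F);
        }
        $ucn[] = sprintf($code <= 0xFFFF ? '\u%04X' : '\U%08X', $code);
    }
    // One UCN string, or one array entry per character.
    return $toString ? implode('', $ucn) : $ucn;
}
```

With $toString set to false, the sketch returns one UCN entry per input character instead of a single concatenated string.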


unicode2char   [line 277]

string unicode2char( mixed $code, [string $invalid = '?'])

Converts a Unicode code point into a UTF-8 character
  • Return: the UTF-8 character, or the substitution ASCII character if the Unicode code point is invalid
  • Access: public

Parameters:

mixed   $code   —  the Unicode code point as a hexadecimal string, e.g. 000A, or 0x000A, or \u000A, or as an integer, e.g. (int)10
string   $invalid   —  the substitution ASCII character if the Unicode code point is invalid; the default is "?"
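
A sketch of how the accepted input forms could be normalized and then encoded as UTF-8. The function sketch_unicode2char is hypothetical, and treating surrogates and values above 10FFFF as invalid is an assumption about what "invalid" covers.

```php
<?php
// Hypothetical illustration only; not the package's implementation.
function sketch_unicode2char($code, string $invalid = '?'): string
{
    if (is_string($code)) {
        // Strip an optional 0x, \u or \U prefix, then require plain hexadecimal digits.
        $hex = preg_replace('/^(0x|\\\\u)/i', '', $code);
        if (!preg_match('/^[0-9A-Fa-f]{1,8}\z/', $hex)) {
            return $invalid;
        }
        $code = hexdec($hex);
    }
    // Assumption: surrogates and out-of-range values count as invalid.
    if (!is_int($code) || $code < 0 || $code > 0x10FFFF || ($code >= 0xD800 && $code <= 0xDFFF)) {
        return $invalid;
    }
    if ($code <= 0x7F) {
        return chr($code);
    }
    if ($code <= 0x7FF) {
        return chr(0xC0 | ($code >> 6)) . chr(0x80 | ($code & 0x3F));
    }
    if ($code <= 0xFFFF) {
        return chr(0xE0 | ($code >> 12)) . chr(0x80 | (($code >> 6) & 0x3F)) . chr(0x80 | ($code & 0x3F));
    }
    return chr(0xF0 | ($code >> 18)) . chr(0x80 | (($code >> 12) & 0x3F))
        . chr(0x80 | (($code >> 6) & 0x3F)) . chr(0x80 | ($code & 0x3F));
}
```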


unicode2string   [line 319]

string unicode2string( string $string)

Converts a Unicode string in the UCN format to a UTF-8 string

Expects a valid Unicode string. Any character outside the [0-9A-Fa-f] range is treated as a separator. Example: unicode2string('\u0031\u0032\u0033') returns '123'. Example: unicode2string('31 32x33') returns '123'.

  • Return: the UTF-8 string
  • Access: public

Parameters:

string   $string   —  the Unicode string in UCN format
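
The separator rule can be sketched as a tokenizing step that yields the code points; each one would then be encoded as a UTF-8 character. The helper sketch_ucn_code_points is hypothetical, and treating a run of several separators as a single break is an assumption.

```php
<?php
// Hypothetical illustration only; not the package's implementation.
function sketch_ucn_code_points(string $string): array
{
    // Every run of characters outside [0-9A-Fa-f] acts as a separator,
    // so the "\" and "u" of \u0031 are separators just like " " or "x".
    $tokens = preg_split('/[^0-9A-Fa-f]+/', $string, -1, PREG_SPLIT_NO_EMPTY);
    // Each remaining token is read as a hexadecimal code point.
    return array_map('hexdec', $tokens);
}
```

Both documented examples reduce to the same code points 0x31, 0x32 and 0x33, which is why both return '123'.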



Documentation generated on Mon, 11 Mar 2019 15:09:22 -0400 by phpDocumentor 1.4.4.