t3lib_cs Class Reference

Public Member Functions

 parse_charset ($charset)
 get_locale_charset ($locale)
 conv ($str, $fromCS, $toCS, $useEntityForNoChar=0)
 convArray (&$array, $fromCS, $toCS, $useEntityForNoChar=0)
 utf8_encode ($str, $charset)
 utf8_decode ($str, $charset, $useEntityForNoChar=0)
 utf8_to_entities ($str)
 entities_to_utf8 ($str, $alsoStdHtmlEnt=0)
 utf8_to_numberarray ($str, $convEntities=0, $retChar=0)
 UnumberToChar ($cbyte)
 utf8CharToUnumber ($str, $hex=0)
 initCharset ($charset)
 initUnicodeData ($mode=null)
 initCaseFolding ($charset)
 initToASCII ($charset)
 substr ($charset, $string, $start, $len=null)
 strlen ($charset, $string)
 crop ($charset, $string, $len, $crop='')
 strtrunc ($charset, $string, $len)
 conv_case ($charset, $string, $case)
 specCharsToASCII ($charset, $string)
 sb_char_mapping ($str, $charset, $mode, $opt='')
 utf8_substr ($str, $start, $len=null)
 utf8_strlen ($str)
 utf8_strtrunc ($str, $len)
 utf8_strpos ($haystack, $needle, $offset=0)
 utf8_strrpos ($haystack, $needle)
 utf8_char2byte_pos ($str, $pos)
 utf8_byte2char_pos ($str, $pos)
 utf8_char_mapping ($str, $mode, $opt='')
 euc_strtrunc ($str, $len, $charset)
 euc_substr ($str, $start, $charset, $len=null)
 euc_strlen ($str, $charset)
 euc_char2byte_pos ($str, $pos, $charset)
 euc_char_mapping ($str, $charset, $mode, $opt='')

Public Attributes

 $noCharByteVal = 63
 $parsedCharsets = array()
 $caseFolding = array()
 $toASCII = array()
 $twoByteSets
 $fourByteSets
 $eucBasedSets
 $synonyms
 $lang_to_langfamily
 $lang_to_charset_unix
 $lang_to_charset_windows
 $locale_to_charset
 $charSetArray
 $isoArray

Detailed Description

Definition at line 136 of file class.t3lib_cs.php.


Member Function Documentation

t3lib_cs::conv ($str, $fromCS, $toCS, $useEntityForNoChar = 0)

Convert from one charset to another charset.

Parameters:
string Input string
string From charset (the current charset of the string)
string To charset (the output charset wanted)
boolean If set, then characters that are not available in the destination character set will be encoded as numeric entities
Returns:
string Converted string
See also:
convArray()

Definition at line 577 of file class.t3lib_cs.php.

References $GLOBALS, and utf8_decode().

Referenced by convArray().
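
To make the effect of $useEntityForNoChar concrete, here is a minimal sketch. It is not the t3lib_cs implementation: it assumes PHP 7.2 or later with mbstring, handles only UTF-8 to ISO-8859-1, and the function name sketch_utf8_to_latin1() is hypothetical. Characters missing from the target charset become either a numeric entity or byte 63 ("?"), the value listed for $noCharByteVal above.

```php
<?php
// Hypothetical helper, not part of t3lib_cs: UTF-8 to ISO-8859-1 only.
function sketch_utf8_to_latin1(string $str, bool $useEntityForNoChar = false): string
{
    $out = '';
    // The "u" modifier makes PCRE split the input into whole UTF-8 characters.
    foreach (preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY) as $char) {
        $cp = mb_ord($char, 'UTF-8');
        if ($cp < 256) {
            // ISO-8859-1 maps code points 0..255 to the same byte values.
            $out .= chr($cp);
        } elseif ($useEntityForNoChar) {
            // Not representable: emit a decimal numeric entity.
            $out .= '&#' . $cp . ';';
        } else {
            // Not representable: emit the replacement byte 63 ("?").
            $out .= chr(63);
        }
    }
    return $out;
}

$input = "\u{00E4}\u{20AC}"; // a-umlaut followed by the euro sign
$plain = sketch_utf8_to_latin1($input);
$withEntities = sketch_utf8_to_latin1($input, true);
```

With the flag unset the euro sign is lost; with it set the information survives as an entity that a browser can still render.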

t3lib_cs::conv_case ($charset, $string, $case)

Translates all characters of a string into their respective case values. Unlike strtolower() and strtoupper(), this method is locale independent. Note that the string length may change, e.g. lower case German ß (sharp S) becomes upper case "SS". Real case folding is language dependent; this method ignores this fact. Unit-tested by Kasper.

Parameters:
string Character set of string
string Input string to convert case for
string Case keyword: "toLower" means lowercase conversion, anything else is uppercase (use "toUpper" )
Returns:
string The converted string
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>
See also:
strtolower(), strtoupper()

Definition at line 1501 of file class.t3lib_cs.php.

References euc_char_mapping(), sb_char_mapping(), and utf8_char_mapping().
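
The note about changing string length can be illustrated with a table-driven mapping. This is only a sketch under stated assumptions: PHP 7.0 or later, UTF-8 input, a hypothetical function name sketch_to_upper(), and a tiny illustrative table rather than the class's real case folding data.

```php
<?php
// Hypothetical sketch: table-driven upper-casing of a UTF-8 string.
function sketch_to_upper(string $utf8): string
{
    // ASCII letters a-z map to A-Z.
    $table = array_combine(range('a', 'z'), range('A', 'Z'));
    // A few illustrative non-ASCII entries; a real table would be far larger.
    $table["\u{00E4}"] = "\u{00C4}"; // a-umlaut to A-umlaut
    $table["\u{00DF}"] = 'SS';       // sharp s to "SS": one character becomes two
    // strtr() with an array replaces every key it finds, longest keys first.
    return strtr($utf8, $table);
}

$in  = "stra\u{00DF}e";
$out = sketch_to_upper($in);
```

Because the mapping does not depend on setlocale(), the result is the same on every system, which is the property the description above emphasizes.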

t3lib_cs::convArray (&$array, $fromCS, $toCS, $useEntityForNoChar = 0)

Convert all elements in an array from one charset to another charset. NOTICE: The array is passed by reference!

Parameters:
array Input array, possibly multidimensional
string From charset (the current charset of the string)
string To charset (the output charset wanted)
boolean If set, then characters that are not available in the destination character set will be encoded as numeric entities
Returns:
void
See also:
conv()

Definition at line 617 of file class.t3lib_cs.php.

References $key, conv(), convArray(), and value.

Referenced by convArray().
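
The by-reference, recursive pattern can be sketched as follows. The function name sketch_conv_array() and the callable parameter are hypothetical, and the sketch assumes PHP 7.1 or later; unlike the documented method it converts string values only and takes the conversion as a callback instead of charset names.

```php
<?php
// Hypothetical sketch of an in-place, recursive array conversion.
function sketch_conv_array(array &$array, callable $convert): void
{
    foreach ($array as $key => $value) {
        if (is_array($value)) {
            // Recurse into the nested array itself, not into the copy in $value.
            sketch_conv_array($array[$key], $convert);
        } elseif (is_string($value)) {
            $array[$key] = $convert($value);
        }
    }
}

$data = ['a' => 'x', 'b' => ['c' => 'y', 'd' => 5]];
sketch_conv_array($data, 'strtoupper'); // strtoupper stands in for a charset conversion
```

Nothing is returned: the caller's array is modified directly, which is why the notice about pass-by-reference matters.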

t3lib_cs::crop ($charset, $string, $len, $crop = '')

Truncates a string and prepends or appends a crop string. Unit-tested by Kasper.

Parameters:
string The character set
string Character string
integer Length (in characters)
string Crop signifier
Returns:
string The shortened string
See also:
substr(), mb_strimwidth()
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 1414 of file class.t3lib_cs.php.

References $i, euc_char2byte_pos(), strlen(), substr(), and utf8_char2byte_pos().
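
A minimal sketch of cropping by character count, assuming UTF-8 input and the mbstring extension. The name sketch_crop() is hypothetical, and the handling of a negative or zero length (keep the end of the string and prepend the signifier, or return the string unchanged) is an assumption made for illustration, not taken from the source.

```php
<?php
// Hypothetical sketch: crop a UTF-8 string to a number of characters.
function sketch_crop(string $string, int $len, string $crop = ''): string
{
    // Assumption: nothing to do for a zero length or a string that already fits.
    if ($len === 0 || mb_strlen($string, 'UTF-8') <= abs($len)) {
        return $string;
    }
    if ($len > 0) {
        // Keep the first $len characters and append the crop signifier.
        return mb_substr($string, 0, $len, 'UTF-8') . $crop;
    }
    // Assumption: a negative length keeps the last characters and prepends the signifier.
    return $crop . mb_substr($string, $len, null, 'UTF-8');
}

$text  = "H\u{00E9}llo w\u{00F6}rld";
$head  = sketch_crop($text, 5, '...');
$tail  = sketch_crop($text, -5, '...');
```

Counting in characters rather than bytes keeps multibyte characters such as the accented letters above intact.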

t3lib_cs::entities_to_utf8 ($str, $alsoStdHtmlEnt = 0)

Converts numeric entities (Unicode, e.g. decimal (&#1234;) or hexadecimal (&#x1b;)) to UTF-8 multibyte chars.

Parameters:
string Input string, UTF-8
boolean If set, then all named HTML entities (like &amp;) will be converted as well
Returns:
string Output string

Definition at line 756 of file class.t3lib_cs.php.

References $k, $parts, substr(), UnumberToChar(), and utf8_encode().

Referenced by utf8_to_numberarray().
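
The conversion from a numeric entity to UTF-8 bytes can be sketched like this. Both function names are hypothetical, the sketch assumes PHP 7.0 or later, handles numeric entities only, and does not validate the code point range.

```php
<?php
// Hypothetical sketch: encode one Unicode code point as UTF-8 bytes.
function sketch_codepoint_to_utf8(int $cp): string
{
    if ($cp < 0x80) {
        return chr($cp);
    }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12))
            . chr(0x80 | (($cp >> 6) & 0x3F))
            . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18))
        . chr(0x80 | (($cp >> 12) & 0x3F))
        . chr(0x80 | (($cp >> 6) & 0x3F))
        . chr(0x80 | ($cp & 0x3F));
}

// Hypothetical sketch: replace decimal and hexadecimal numeric entities.
function sketch_entities_to_utf8(string $str): string
{
    return preg_replace_callback(
        '/&#(x[0-9a-fA-F]+|[0-9]+);/',
        function (array $m): string {
            // A leading "x" marks a hexadecimal entity.
            $cp = ($m[1][0] === 'x') ? (int) hexdec(substr($m[1], 1)) : (int) $m[1];
            return sketch_codepoint_to_utf8($cp);
        },
        $str
    );
}

$decoded = sketch_entities_to_utf8('&#228; &#x20AC;');
```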

t3lib_cs::euc_char2byte_pos ($str, $pos, $charset)

Translates a character position into an 'absolute' byte position.

Parameters:
string EUC multibyte character string
integer character position (negative values start from the end)
string the charset
Returns:
integer byte position
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 1962 of file class.t3lib_cs.php.

References $charset, $i, pos, and strlen().

Referenced by crop(), and euc_substr().

t3lib_cs::euc_char_mapping ($str, $charset, $mode, $opt = '')

Maps all characters of a string in the EUC charset family.

Parameters:
string EUC multibyte character string
string the charset
string mode: 'case' (case folding) or 'ascii' (ASCII transliteration)
string 'case': conversion 'toLower' or 'toUpper'
Returns:
string the converted string
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 2003 of file class.t3lib_cs.php.

References $charset, $i, strlen(), and substr().

Referenced by conv_case(), and specCharsToASCII().

t3lib_cs::euc_strlen ($str, $charset)

Counts the number of characters of a string in the EUC charset family.

Parameters:
string EUC multibyte character string
string the charset
Returns:
integer the number of characters
See also:
strlen()
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 1935 of file class.t3lib_cs.php.

References $charset, $i, and strlen().

Referenced by strlen().
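
Counting characters in an EUC string means stepping over lead bytes. The following is a sketch only: sketch_euc_strlen() is a hypothetical name, the input is assumed to be well-formed, and the special cases shown are the EUC-JP single-shift bytes 0x8E and 0x8F; other EUC charsets are treated as plain one- or two-byte sequences.

```php
<?php
// Hypothetical sketch: count characters in a well-formed EUC string.
function sketch_euc_strlen(string $str, string $charset = 'euc-jp'): int
{
    $count = 0;
    $len = strlen($str);
    for ($i = 0; $i < $len; $count++) {
        $c = ord($str[$i]);
        if ($charset === 'euc-jp' && $c === 0x8E) {
            $i += 2; // single shift 2: half-width katakana, two bytes
        } elseif ($charset === 'euc-jp' && $c === 0x8F) {
            $i += 3; // single shift 3: JIS X 0212, three bytes
        } elseif ($c >= 0xA1) {
            $i += 2; // regular double-byte character
        } else {
            $i += 1; // ASCII
        }
    }
    return $count;
}

// "A", hiragana A (0xA4 0xA2) and half-width katakana A (0x8E 0xB1) in EUC-JP.
$eucLength = sketch_euc_strlen("A\xA4\xA2\x8E\xB1");
```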

t3lib_cs::euc_strtrunc ($str, $len, $charset)

Cuts a string in the EUC charset family short at a given byte length.

Parameters:
string EUC multibyte character string
integer the byte length
string the charset
Returns:
string the shortened string
See also:
mb_strcut()
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 1881 of file class.t3lib_cs.php.

References $charset, $i, strlen(), and substr().

Referenced by strtrunc().

t3lib_cs::euc_substr ($str, $start, $charset, $len = null)

Returns a part of a string in the EUC charset family.

Parameters:
string EUC multibyte character string
integer start position (character position)
string the charset
integer length (in characters)
Returns:
string the substring
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 1910 of file class.t3lib_cs.php.

References euc_char2byte_pos(), start(), and substr().

Referenced by substr().

t3lib_cs::get_locale_charset ($locale)

Get the charset of a locale. The locale string may take one of these forms:

ln            language
ln_CN         language / country
ln_CN.cs      language / country / charset
ln_CN.cs@mod  language / country / charset / modifier

Parameters:
string Locale string
Returns:
string Charset resolved for locale string
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 524 of file class.t3lib_cs.php.

References parse_charset().
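
Splitting a locale string into its parts can be sketched with one regular expression. This assumes the usual POSIX layout language[_country][.charset][@modifier]; the function name sketch_locale_parts() and the returned array keys are hypothetical, and the sketch does not reproduce the lookup the method performs when the locale carries no charset.

```php
<?php
// Hypothetical sketch: split a locale string into its components.
function sketch_locale_parts(string $locale): array
{
    $m = [];
    preg_match(
        '/^([A-Za-z]+)(?:_([A-Za-z0-9]+))?(?:\.([^@]+))?(?:@(.+))?$/',
        $locale,
        $m
    );
    return [
        'language' => $m[1] ?? '',
        'country'  => $m[2] ?? '',
        // Charset names are compared in lowercase throughout this class.
        'charset'  => strtolower($m[3] ?? ''),
        'modifier' => $m[4] ?? '',
    ];
}

$full  = sketch_locale_parts('de_DE.UTF-8@euro');
$short = sketch_locale_parts('fr');
```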

t3lib_cs::initCaseFolding ($charset)

This function initializes the folding table for a charset other than UTF-8. This function is automatically called by the case folding functions.

Parameters:
string Charset for which to initialize case folding.
Returns:
integer Returns FALSE on error, or a TRUE value on success: 1 = table already loaded, 2 = cached version loaded, 3 = table parsed (and cached).
Access:
private

Definition at line 1215 of file class.t3lib_cs.php.

References $charset, t3lib_div::getFileAbsFileName(), utf8_decode(), and t3lib_div::writeFileToTypo3tempDir().

t3lib_cs::initCharset ($charset)

This will initialize a charset for use if it is defined in the PATH_t3lib.'csconvtbl/' folder. This function is automatically called by the conversion functions.

PLEASE SEE: http://www.unicode.org/Public/MAPPINGS/

Parameters:
string The charset to be initialized. Always use a lowercase charset name; it must match exactly a filename in the csconvtbl/ folder ([charset].tbl).
Returns:
integer Returns 1 if already loaded. Returns FALSE if the charset conversion table was not found. Returns 2 if the charset conversion table was found and parsed.
Access:
private

Definition at line 928 of file class.t3lib_cs.php.

References $charset, $lines, t3lib_div::getFileAbsFileName(), substr(), t3lib_div::trimExplode(), UnumberToChar(), t3lib_div::validPathStr(), value, and t3lib_div::writeFileToTypo3tempDir().
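
Parsing a conversion table can be sketched as below, assuming the table follows the layout of the unicode.org mapping files referenced above: one mapping per line with the charset byte value and the Unicode value in hexadecimal, and "#" starting a comment. The function name sketch_parse_mapping() is hypothetical, and the caching step mentioned in the return values is left out.

```php
<?php
// Hypothetical sketch: parse "0xXX <tab> 0xXXXX <tab> # name" mapping lines.
function sketch_parse_mapping(string $tableText): array
{
    $map = [];
    foreach (preg_split('/\R/', $tableText) as $line) {
        $line = trim($line);
        // Skip empty lines and full-line comments.
        if ($line === '' || $line[0] === '#') {
            continue;
        }
        $cols = preg_split('/\s+/', $line);
        // Skip lines without a Unicode value, e.g. bytes marked as undefined.
        if (count($cols) < 2 || strncasecmp($cols[1], '0x', 2) !== 0) {
            continue;
        }
        // Strip the "0x" prefix before converting from hexadecimal.
        $map[hexdec(substr($cols[0], 2))] = hexdec(substr($cols[1], 2));
    }
    return $map;
}

$table = "# sample table\n"
    . "0x41\t0x0041\t# LATIN CAPITAL LETTER A\n"
    . "0xE4\t0x00E4\t# LATIN SMALL LETTER A WITH DIAERESIS\n"
    . "0x81\t\t#UNDEFINED\n";
$mapping = sketch_parse_mapping($table);
```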

t3lib_cs::initToASCII ($charset)

This function initializes the to-ASCII conversion table for a charset other than UTF-8. This function is automatically called by the ASCII transliteration functions.

Parameters:
string Charset for which to initialize conversion.
Returns:
integer Returns FALSE on error, or a TRUE value on success: 1 = table already loaded, 2 = cached version loaded, 3 = table parsed (and cached).
Access:
private

Definition at line 1277 of file class.t3lib_cs.php.

References $charset, t3lib_div::getFileAbsFileName(), utf8_decode(), and t3lib_div::writeFileToTypo3tempDir().

t3lib_cs::initUnicodeData ($mode = null)

This function initializes all UTF-8 character data tables.

PLEASE SEE: http://www.unicode.org/Public/UNIDATA/

Parameters:
string Mode ("case", "ascii", ...)
Returns:
integer Returns FALSE on error, or a TRUE value on success: 1 = table already loaded, 2 = cached version loaded, 3 = table parsed (and cached).
Access:
private

Definition at line 990 of file class.t3lib_cs.php.

References $i, t3lib_div::getFileAbsFileName(), t3lib_div::trimExplode(), UnumberToChar(), t3lib_div::validPathStr(), and t3lib_div::writeFileToTypo3tempDir().

t3lib_cs::parse_charset ($charset)

Normalize - changes input character set to lowercase letters.

Parameters:
string Input charset
Returns:
string Normalized charset
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 505 of file class.t3lib_cs.php.

References $charset.

Referenced by get_locale_charset().
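
A sketch of the normalization step. Lowercasing is what the description above states; the additional synonym lookup is an assumption suggested by the $synonyms attribute, and both the function name sketch_parse_charset() and the synonym entries are hypothetical.

```php
<?php
// Hypothetical sketch: normalize a charset name.
function sketch_parse_charset(string $charset, array $synonyms = []): string
{
    $charset = strtolower(trim($charset));
    // Assumption: map known aliases to one canonical name if a table is given.
    return $synonyms[$charset] ?? $charset;
}

$aliases = ['latin1' => 'iso-8859-1']; // illustrative entry only
$normalized = sketch_parse_charset('UTF-8');
$resolved = sketch_parse_charset('Latin1', $aliases);
```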

t3lib_cs::sb_char_mapping ($str, $charset, $mode, $opt = '')

Maps all characters of a string in a single byte charset.

Parameters:
string the string
string the charset
string mode: 'case' (case folding) or 'ascii' (ASCII transliteration)
string 'case': conversion 'toLower' or 'toUpper'
Returns:
string the converted string
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 1567 of file class.t3lib_cs.php.

References $charset, $i, and strlen().

Referenced by conv_case(), and specCharsToASCII().

t3lib_cs::specCharsToASCII ($charset, $string)

Converts special chars (like æøå, umlauts etc.) to ASCII equivalents (usually two characters, like æ => ae).

Parameters:
string Character set of string
string Input string to convert
Returns:
string The converted string

Definition at line 1527 of file class.t3lib_cs.php.

References euc_char_mapping(), sb_char_mapping(), and utf8_char_mapping().
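
ASCII transliteration can be sketched as a plain replacement table. The function name sketch_to_ascii() and the four table entries are illustrative assumptions for UTF-8 input on PHP 7.0 or later; the class loads its real tables through initToASCII() and initUnicodeData().

```php
<?php
// Hypothetical sketch: transliterate a few UTF-8 characters to ASCII.
function sketch_to_ascii(string $utf8): string
{
    // Illustrative entries only; each special character becomes two ASCII letters.
    $table = [
        "\u{00E6}" => 'ae', // ae ligature
        "\u{00F8}" => 'oe', // o with stroke
        "\u{00E5}" => 'aa', // a with ring
        "\u{00FC}" => 'ue', // u-umlaut
    ];
    return strtr($utf8, $table);
}

$ascii = sketch_to_ascii("bl\u{00E5}b\u{00E6}r");
```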

t3lib_cs::strlen ($charset, $string)

Counts the number of characters. Unit-tested by Kasper (single byte charsets only)

Parameters:
string The character set
string Character string
Returns:
integer The number of characters
See also:
strlen()
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 1386 of file class.t3lib_cs.php.

References euc_strlen(), strlen(), and utf8_strlen().

Referenced by crop(), euc_char2byte_pos(), euc_char_mapping(), euc_strlen(), euc_strtrunc(), sb_char_mapping(), strlen(), utf8_byte2char_pos(), utf8_char2byte_pos(), utf8_char_mapping(), utf8_decode(), utf8_encode(), utf8_strlen(), utf8_to_entities(), and utf8_to_numberarray().
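
For UTF-8, the difference between byte length and character count can be sketched by counting every byte that is not a continuation byte. The function name sketch_utf8_strlen() is hypothetical, the input is assumed to be well-formed UTF-8, and this is not necessarily how utf8_strlen() is implemented.

```php
<?php
// Hypothetical sketch: count characters in a well-formed UTF-8 string.
function sketch_utf8_strlen(string $str): int
{
    $count = 0;
    for ($i = 0, $len = strlen($str); $i < $len; $i++) {
        // Continuation bytes have the bit pattern 10xxxxxx; every other byte starts a character.
        if ((ord($str[$i]) & 0xC0) !== 0x80) {
            $count++;
        }
    }
    return $count;
}

$word = "stra\u{00DF}e";
$chars = sketch_utf8_strlen($word); // characters
$bytes = strlen($word);             // bytes, as reported by PHP's native strlen()
```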

t3lib_cs::strtrunc ($charset, $string, $len)

Cuts a string short at a given byte length.

Parameters:
string The character set
string Character string
integer The byte length
Returns:
string The shortened string
See also:
mb_strcut()
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 1467 of file class.t3lib_cs.php.

References euc_strtrunc(), substr(), and utf8_strtrunc().
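
Cutting at a byte length without splitting a multibyte character can be sketched for UTF-8 as follows. The function name sketch_utf8_strtrunc() is hypothetical and the input is assumed to be well-formed UTF-8; the idea is to move the cut point back until it no longer falls inside a character.

```php
<?php
// Hypothetical sketch: truncate UTF-8 to at most $len bytes on a character boundary.
function sketch_utf8_strtrunc(string $str, int $len): string
{
    if ($len <= 0) {
        return '';
    }
    if (strlen($str) <= $len) {
        return $str;
    }
    // If the first dropped byte is a continuation byte (10xxxxxx), the cut
    // would split a character, so step back to that character's lead byte.
    while ($len > 0 && (ord($str[$len]) & 0xC0) === 0x80) {
        $len--;
    }
    return substr($str, 0, $len);
}

$sample = "a\u{00E4}b"; // bytes: 61 C3 A4 62
$cutAt2 = sketch_utf8_strtrunc($sample, 2);
$cutAt3 = sketch_utf8_strtrunc($sample, 3);
```

The result may be shorter than the requested byte length, but it never ends in a partial character.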

t3lib_cs::substr ($charset, $string, $start, $len = null)

Returns a part of a string. Unit-tested by Kasper (single byte charsets only)

Parameters:
string The character set
string Character string
integer Start position (character position)
integer Length (in characters)
Returns:
string The substring
See also:
substr(), mb_substr()
Author:
Martin Kutschker <martin.t.kutschker@blackbox.net>

Definition at line 1348 of file class.t3lib_cs.php.

References $enc, euc_substr(), start(), substr(), and utf8_substr().

t3lib_cs::substr charset,
string,