public function PhpTransliteration::transliterate in Drupal 8
Same name and namespace in other branches
- 9 core/lib/Drupal/Component/Transliteration/PhpTransliteration.php \Drupal\Component\Transliteration\PhpTransliteration::transliterate()
Transliterates text from Unicode to US-ASCII.
Parameters
string $string: The string to transliterate.
string $langcode: (optional) The language code of the language the string is in. Defaults to 'en' if not provided. Warning: this can be unfiltered user input.
string $unknown_character: (optional) The character to substitute for characters in $string without transliterated equivalents. Defaults to '?'.
int $max_length: (optional) If provided, return at most this many characters, ensuring that the transliteration does not split in the middle of an input character's transliteration.
Return value
string $string with non-US-ASCII characters transliterated to US-ASCII characters, and unknown characters replaced with $unknown_character.
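The effect of $unknown_character and $max_length can be sketched with a simplified, self-contained stand-in. This is not the Drupal implementation: the function name sketch_transliterate() and the three-entry $map are assumptions made for illustration (Drupal looks replacements up per language through its own replace() method), and the input is assumed to be valid UTF-8.

```php
<?php
// Hypothetical, simplified stand-in for PhpTransliteration::transliterate().
// The $map below is illustrative only and is not Drupal's real data.
function sketch_transliterate(string $string, string $unknown_character = '?', ?int $max_length = NULL): string {
  $map = ['ä' => 'ae', 'ß' => 'ss', 'é' => 'e'];
  $result = '';
  $length = 0;
  // Split into Unicode characters (assumes $string is valid UTF-8).
  foreach (preg_split('//u', $string, 0, PREG_SPLIT_NO_EMPTY) as $character) {
    if (strlen($character) === 1) {
      // In valid UTF-8, a single-byte character is already US-ASCII.
      $to_add = $character;
    }
    else {
      // Unmapped characters fall back to the unknown character.
      $to_add = $map[$character] ?? $unknown_character;
    }
    if ($max_length !== NULL) {
      $length += strlen($to_add);
      if ($length > $max_length) {
        // Adding this replacement would exceed the limit, so stop
        // before it rather than emit half of it.
        return $result;
      }
    }
    $result .= $to_add;
  }
  return $result;
}

echo sketch_transliterate('Straße'), "\n";          // Strasse
echo sketch_transliterate('Straße', '?', 5), "\n";  // Stra
echo sketch_transliterate('日本', '_'), "\n";        // __
```

With a limit of 5, the result is 'Stra' rather than 'Stras': the two-character replacement for 'ß' does not fit, so it is dropped as a whole.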
Overrides TransliterationInterface::transliterate
File
- core/lib/Drupal/Component/Transliteration/PhpTransliteration.php, line 125
Class
- PhpTransliteration: Implements transliteration without using the PECL extensions.
Namespace
Drupal\Component\Transliteration
Code
public function transliterate($string, $langcode = 'en', $unknown_character = '?', $max_length = NULL) {
  $result = '';
  $length = 0;
  $hash = FALSE;

  // Replace question marks with a unique hash if necessary. This is because
  // mb_convert_encoding() replaces all invalid characters with a question
  // mark.
  if ($unknown_character != '?' && strpos($string, '?') !== FALSE) {
    $hash = hash('sha256', $string);
    $string = str_replace('?', $hash, $string);
  }

  // Ensure the string is valid UTF8 for preg_split(). Unknown characters will
  // be replaced by a question mark.
  $string = mb_convert_encoding($string, 'UTF-8', 'UTF-8');

  // Use the provided unknown character instead of a question mark.
  if ($unknown_character != '?') {
    $string = str_replace('?', $unknown_character, $string);
    // Restore original question marks if necessary.
    if ($hash !== FALSE) {
      $string = str_replace($hash, '?', $string);
    }
  }

  // Split into Unicode characters and transliterate each one.
  foreach (preg_split('//u', $string, 0, PREG_SPLIT_NO_EMPTY) as $character) {
    $code = self::ordUTF8($character);
    if ($code == -1) {
      $to_add = $unknown_character;
    }
    else {
      $to_add = $this->replace($code, $langcode, $unknown_character);
    }

    // Check if this exceeds the maximum allowed length.
    if (isset($max_length)) {
      $length += strlen($to_add);
      if ($length > $max_length) {
        // There is no more space.
        return $result;
      }
    }

    $result .= $to_add;
  }

  return $result;
}
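
The question-mark handling at the top of the function can be tried in isolation. The following is a minimal sketch, not part of Drupal: the helper name sketch_sanitize() is hypothetical, it needs the mbstring extension, and it assumes the default mbstring substitute character ('?').

```php
<?php
// Hypothetical helper that isolates the sanitizing step of transliterate().
function sketch_sanitize(string $string, string $unknown_character): string {
  $hash = FALSE;
  // Hide genuine question marks behind a hash of the input so they can be
  // told apart from the ones mb_convert_encoding() inserts.
  if ($unknown_character !== '?' && strpos($string, '?') !== FALSE) {
    $hash = hash('sha256', $string);
    $string = str_replace('?', $hash, $string);
  }
  // Invalid byte sequences are replaced by the substitute character.
  $string = mb_convert_encoding($string, 'UTF-8', 'UTF-8');
  if ($unknown_character !== '?') {
    // Any question mark left now marks an invalid sequence.
    $string = str_replace('?', $unknown_character, $string);
    if ($hash !== FALSE) {
      // Put the genuine question marks back.
      $string = str_replace($hash, '?', $string);
    }
  }
  return $string;
}

// "\xFF" is never valid in UTF-8, so it becomes the unknown character,
// while the literal question mark survives.
echo sketch_sanitize("ok?\xFF", '_'), "\n"; // ok?_
```

Without the hash step, the literal '?' in the input would also be turned into '_', because it would be indistinguishable from the replacement inserted for the invalid byte.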