public static function Utf8::encode in Extensible BBCode 4.0.x

Same name and namespace in other branches
  1. 8.3 src/Utf8.php \Drupal\xbbcode\Utf8::encode()

Escape specified characters in a UTF-8 string to \uXXXX and \UXXXXXXXX.

(This resembles the escape sequences of json_encode, but uses a single eight-digit hex code for non-BMP characters instead of a UTF-16 surrogate pair.)

Any run of backslashes directly before an existing \uXXXX or \UXXXXXXXX sequence, or before a character that is newly escaped, is doubled so that original and generated sequences can be told apart. Other backslashes are left unchanged.

Parameters

string $string: The string to encode.

string $characters: A valid character group (without the enclosing []) listing the characters to escape. If omitted or empty, all non-ASCII characters are escaped.

Return value

string The encoded string.
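
The escaping rules described above can be tried out with a small standalone sketch. It is not the class method itself: encode_sketch() is a hypothetical name, and mb_ord() from the mbstring extension is assumed to behave like the class's own ord() helper (returning the Unicode code point of one character).

```php
<?php

/**
 * Standalone sketch of the escaping rules (hypothetical helper, not the
 * module's API). Assumption: mb_ord() stands in for Utf8::ord().
 */
function encode_sketch(string $string, ?string $characters = NULL): string {
  // Default group: every character outside the ASCII range.
  $characters = $characters ?: '^\\x00-\\x7f';

  // Double the backslashes in front of text that already looks like a sequence.
  $string = preg_replace('/(\\\\+)(u[\\da-fA-F]{4}|U[\\da-fA-F]{8})/', '$1$0', $string);

  // Escape each matching character, doubling any backslashes right before it.
  return preg_replace_callback('/(\\\\*)([' . $characters . '])/u', static function (array $match): string {
    $code = mb_ord($match[2], 'UTF-8');
    $sequence = sprintf($code < 0x10000 ? '\\u%04x' : '\\U%08x', $code);
    return $match[1] . $match[1] . $sequence;
  }, $string);
}

// BMP character: four hex digits.
echo encode_sketch('café'), "\n";            // caf\u00e9
// Non-BMP character (U+1F600): eight hex digits, no surrogate pair.
echo encode_sketch("\u{1F600}"), "\n";       // \U0001f600
// Text that already looks like a sequence: its backslash is doubled.
echo encode_sketch('\\u00e9'), "\n";         // \\u00e9
// Backslash directly before an escaped character: doubled, then the sequence.
echo encode_sketch('\\é'), "\n";             // \\\u00e9
// Custom group: escape only square brackets.
echo encode_sketch('x[y]', '\\[\\]'), "\n";  // x\u005by\u005d
```

Because both original and generated sequences end up preceded by a distinguishable number of backslashes, a decoder can in principle restore the input exactly; that reverse step is not shown here.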

1 call to Utf8::encode()
CodeTagPlugin::prepare in standard/src/Plugin/XBBCode/CodeTagPlugin.php
Transform an element's content, to armor against other filters.

File

src/Utf8.php, line 116

Class

Utf8
Implementation of UTF-8 character utilities.

Namespace

Drupal\xbbcode

Code

public static function encode(string $string, $characters = NULL) : string {
  $characters = $characters ?: '^\\x00-\\x7f';

  // Escape existing \uXXXX and \UXXXXXXXX sequences.
  // This is done by doubling the number of backslashes preceding them.
  $string = preg_replace('/(\\\\+)(u[\\da-fA-F]{4}|U[\\da-fA-F]{8})/', '$1$0', $string);

  // Encode all blacklisted characters (or all non-ASCII characters).
  // Double any backslashes preceding them.
  return preg_replace_callback('/(\\\\*)([' . $characters . '])/u', static function ($match) {
    $code = self::ord($match[2]);
    $sequence = sprintf($code < 0x10000 ? '\\u%04x' : '\\U%08x', $code);
    return $match[1] . $match[1] . $sequence;
  }, $string);
}