function _strip_utf8mb4_for_text_fields in Strip 4-byte UTF8 7

Returns the processed text, in which the non-UTF-8 characters have been replaced.

Parameters

string $text_data: The text to process.

string $replace_text: The text that replaces each stripped sequence. Defaults to an empty string.

Return value

string

2 calls to _strip_utf8mb4_for_text_fields()
strip_utf8mb4_field_attach_presave in ./strip_utf8mb4.module
Implements hook_field_attach_presave().
strip_utf8mb4_webform_submission_presave in ./strip_utf8mb4.module
Implements hook_webform_submission_presave().

File

./strip_utf8mb4.module, line 152
Allows users to strip 4-byte UTF-8 characters: overly long 2-byte sequences and characters above U+10000 are stripped, and overly long 3-byte sequences and UTF-16 surrogates are rejected.

Code

function _strip_utf8mb4_for_text_fields($text_data, $replace_text = '') {
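  // Collects the replacement count reported by each preg_replace() call.
  $replacements_done = array();

  // Strip control characters, stray continuation bytes, overly long 2 byte
  // sequences, malformed 2 and 3 byte sequences, and characters at or above
  // U+10000 (4 byte sequences), and replace them with $replace_text.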
  $processed_text_data = preg_replace('/[\\x00-\\x08\\x10\\x0B\\x0C\\x0E-\\x19\\x7F]' . '|[\\x00-\\x7F][\\x80-\\xBF]+' . '|([\\xC0\\xC1]|[\\xF0-\\xFF])[\\x80-\\xBF]*' . '|[\\xC2-\\xDF]((?![\\x80-\\xBF])|[\\x80-\\xBF]{2,})' . '|[\\xE0-\\xEF](([\\x80-\\xBF](?![\\x80-\\xBF]))|(?![\\x80-\\xBF]{2})|[\\x80-\\xBF]{3,})/S', $replace_text, $text_data, -1, $replacements_done[]);

  // Strip overly long 3 byte sequences and UTF-16 surrogates and replace
  // them with $replace_text.
  $processed_text_data = preg_replace('/\\xE0[\\x80-\\x9F][\\x80-\\xBF]' . '|\\xED[\\xA0-\\xBF][\\x80-\\xBF]/S', $replace_text, $processed_text_data, -1, $replacements_done[]);
  if (array_sum($replacements_done) > 0) {
    $message = t('Unsupported characters in your text were replaced with "!replacement"', array(
      '!replacement' => $replace_text,
    ));
    drupal_set_message($message, 'warning', FALSE);
  }
  return $processed_text_data;
}
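
To make the effect of the first pattern easier to see, here is a minimal standalone sketch of one of its alternatives: the part that matches 4-byte lead bytes. The helper name example_strip_four_byte_utf8() is hypothetical and not part of the module. It leaves out the Drupal calls (t() and drupal_set_message()) and the other alternatives, so it only illustrates the byte-wise matching and is not a replacement for the function above.

```php
<?php

/**
 * Hypothetical helper for illustration only; not part of strip_utf8mb4.
 *
 * Replaces every byte sequence that starts with a lead byte of 0xF0 or
 * higher, which covers all 4-byte UTF-8 characters (U+10000 and above).
 */
function example_strip_four_byte_utf8($text, $replace_text = '') {
  // No /u modifier: the pattern works on raw bytes. It matches one lead
  // byte in 0xF0-0xFF followed by any run of continuation bytes 0x80-0xBF.
  return preg_replace('/[\\xF0-\\xFF][\\x80-\\xBF]*/', $replace_text, $text);
}

// U+1F600 is encoded as the four bytes F0 9F 98 80, so it is replaced.
echo example_strip_four_byte_utf8("Hi \xF0\x9F\x98\x80!", '?'), "\n";

// U+20AC (E2 82 AC) is a 3-byte character and is left untouched.
echo example_strip_four_byte_utf8("\xE2\x82\xAC 5", '?'), "\n";
```

The first call prints "Hi ?!" and the second prints the input unchanged. Because the whole 4-byte sequence is consumed by a single match, each stripped character yields exactly one copy of the replacement text.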