protected function FeedsExQueryPathHtml::convertEncoding in Feeds extensible parsers 7.2

Converts a string to UTF-8.

Requires the iconv, GNU recode or mbstring PHP extension.
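
As a rough illustration of the extension requirement, the sketch below tries each of the three extensions in turn and falls back to the original string when none can convert the data. It is modeled loosely on the approach drupal_convert_to_utf8() is understood to take; the function name example_to_utf8() is hypothetical and is not part of Feeds extensible parsers or Drupal core.

```php
<?php

/**
 * Hypothetical helper (not part of the module): converts $data from
 * $encoding to UTF-8 with whichever supported extension is loaded.
 *
 * Returns the original string when no extension is available or the
 * conversion reports failure.
 */
function example_to_utf8($data, $encoding) {
  $out = FALSE;
  if (function_exists('iconv')) {
    // iconv() takes the source encoding first, then the target.
    $out = @iconv($encoding, 'UTF-8', $data);
  }
  elseif (function_exists('mb_convert_encoding')) {
    // mb_convert_encoding() takes the target encoding first.
    $out = @mb_convert_encoding($data, 'UTF-8', $encoding);
  }
  elseif (function_exists('recode_string')) {
    // GNU recode expects a "from..to" request string.
    $out = @recode_string($encoding . '..utf-8', $data);
  }
  return ($out === FALSE || $out === NULL) ? $data : $out;
}

// "café" encoded as ISO-8859-1 (the final byte is 0xE9).
$utf8 = example_to_utf8("caf\xE9", 'ISO-8859-1');
```

Note that the argument order differs between iconv() and mb_convert_encoding(), which is one reason a wrapper of this kind is convenient.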

Parameters

string $data: The string to convert.

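string $encoding: The encoding the data is assumed to be in. Defaults to 'UTF-8'. It is overridden by a charset declared in a <meta> tag or, failing that, by the encoding detected from the data.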

Return value

string The encoded string, or the original string if encoding failed.

Overrides FeedsExXml::convertEncoding

See also

drupal_convert_to_utf8()

1 call to FeedsExQueryPathHtml::convertEncoding()
FeedsExQueryPathHtml::prepareDocument in src/FeedsExQueryPathHtml.inc
Prepares the DOM document.

File

src/FeedsExQueryPathHtml.inc, line 34
Contains FeedsExQueryPathHtml.

Class

FeedsExQueryPathHtml
Parses HTML documents with QueryPath.

Code

protected function convertEncoding($data, $encoding = 'UTF-8') {
  // Check for an encoding declaration.
  $matches = FALSE;
  if (preg_match('/<meta[^>]+charset\\s*=\\s*["\']?([\\w-]+)\\b/i', $data, $matches)) {
    $encoding = $matches[1];
  }
  elseif ($detected = parent::detectEncoding($data)) {
    $encoding = $detected;
  }

  // Data in the encodings listed here is returned unchanged; data in any
  // other encoding is converted to UTF-8 below.
  $php_supported = array(
    'utf-8',
    'us-ascii',
    'ascii',
  );
  if (in_array(strtolower($encoding), $php_supported)) {
    return $data;
  }
  $data = parent::convertEncoding($data, $encoding);
  // Rewrite the first charset declaration so it matches the converted data.
  if ($matches) {
    $data = preg_replace('/(<meta[^>]+charset\\s*=\\s*["\']?)([\\w-]+)\\b/i', '$1UTF-8', $data, 1);
  }
  return $data;
}
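
The charset handling above can be tried in isolation. The following is a minimal sketch that applies the same regular expression to two sample documents outside the class; the variable names and sample markup are made up for illustration. The pattern is written with single backslashes, which in a single-quoted PHP string is equivalent to the doubled form shown in the listing.

```php
<?php

// Same pattern as in convertEncoding(): group 1 is everything up to the
// charset value, group 2 is the charset value itself.
$pattern = '/(<meta[^>]+charset\s*=\s*["\']?)([\w-]+)\b/i';

// Hypothetical sample documents, one per common declaration style.
$html5 = '<html><head><meta charset="ISO-8859-1"></head><body></body></html>';
$html4 = '<meta http-equiv="Content-Type" content="text/html; charset=windows-1252">';

// Detect the declared encoding, if any.
$declared = NULL;
if (preg_match($pattern, $html5, $matches)) {
  $declared = $matches[2];
}

// Rewrite only the first declaration to UTF-8, as the method does after
// converting the data.
$rewritten5 = preg_replace($pattern, '$1UTF-8', $html5, 1);
$rewritten4 = preg_replace($pattern, '$1UTF-8', $html4, 1);

print $declared . "\n";
print $rewritten5 . "\n";
print $rewritten4 . "\n";
```

Rewriting the declaration presumably keeps a later parser from decoding the already converted data a second time using the stale charset, though the source does not state the reason explicitly.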