protected function FeedsExQueryPathHtml::convertEncoding in Feeds extensible parsers 7.2
Converts a string to UTF-8.
Requires the iconv, GNU recode or mbstring PHP extension.
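The extension requirement above can be checked at runtime. The following is a minimal sketch, not part of the module: the function name available_encoding_backend() is hypothetical, and the order in which the backends are tried is an assumption made for illustration.

```php
<?php

/**
 * Hypothetical helper: reports which conversion backend is loaded, if any.
 *
 * The preference order (iconv, mbstring, recode) is an assumption of this
 * sketch, not something the documented method prescribes.
 */
function available_encoding_backend() {
  if (function_exists('iconv')) {
    return 'iconv';
  }
  if (function_exists('mb_convert_encoding')) {
    return 'mbstring';
  }
  if (function_exists('recode_string')) {
    return 'recode';
  }
  // None of the three extensions is available.
  return NULL;
}

$backend = available_encoding_backend();
```

If the helper returns NULL, no conversion can take place, which corresponds to the case where the documented method hands back the original string.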
Parameters
string $data: The string to convert.
string $encoding: The source encoding to assume when the document declares none and none can be detected. Defaults to 'UTF-8'.
Return value
string The encoded string, or the original string if encoding failed.
Overrides FeedsExXml::convertEncoding
1 call to FeedsExQueryPathHtml::convertEncoding()
- FeedsExQueryPathHtml::prepareDocument in src/FeedsExQueryPathHtml.inc - Prepares the DOM document.
File
- src/FeedsExQueryPathHtml.inc, line 34 - Contains FeedsExQueryPathHtml.
Class
- FeedsExQueryPathHtml - Parses HTML documents with QueryPath.
Code
protected function convertEncoding($data, $encoding = 'UTF-8') {
  // Check for an encoding declaration.
  $matches = FALSE;
  if (preg_match('/<meta[^>]+charset\\s*=\\s*["\']?([\\w-]+)\\b/i', $data, $matches)) {
    $encoding = $matches[1];
  }
  elseif ($detected = parent::detectEncoding($data)) {
    $encoding = $detected;
  }
  // Unsupported encodings are converted here into UTF-8.
  $php_supported = array(
    'utf-8',
    'us-ascii',
    'ascii',
  );
  if (in_array(strtolower($encoding), $php_supported)) {
    return $data;
  }
  $data = parent::convertEncoding($data, $encoding);
  if ($matches) {
    $data = preg_replace('/(<meta[^>]+charset\\s*=\\s*["\']?)([\\w-]+)\\b/i', '$1UTF-8', $data, 1);
  }
  return $data;
}
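The flow of the method above (read the declared charset, skip the conversion for UTF-8 and ASCII, convert the bytes, then rewrite the declaration) can be sketched as a standalone function. This is an illustration under stated assumptions, not the module's code: convert_html_to_utf8() is a hypothetical name, mbstring's mb_convert_encoding() stands in for the parent class's conversion, and the fallback to encoding detection is left out.

```php
<?php

/**
 * Hypothetical standalone sketch of the meta charset handling.
 *
 * Assumes the mbstring extension and a charset name that mbstring knows.
 * An unknown name makes mb_convert_encoding() fail.
 */
function convert_html_to_utf8($html) {
  $pattern = '/(<meta[^>]+charset\s*=\s*["\']?)([\w-]+)\b/i';

  // No declaration found: this sketch leaves the document untouched.
  if (!preg_match($pattern, $html, $matches)) {
    return $html;
  }

  // UTF-8 and ASCII input needs no conversion.
  if (in_array(strtolower($matches[2]), array('utf-8', 'us-ascii', 'ascii'), TRUE)) {
    return $html;
  }

  // Convert the bytes, then make the first declaration agree with them.
  $html = mb_convert_encoding($html, 'UTF-8', $matches[2]);
  return preg_replace($pattern, '$1UTF-8', $html, 1);
}

// An ISO-8859-1 document: "\xE9" is "é" in that encoding.
$latin1 = "<html><head><meta charset=\"ISO-8859-1\"></head><body>caf\xE9</body></html>";
$utf8 = convert_html_to_utf8($latin1);
```

Rewriting the declaration matters because a DOM parser that later reads the converted string would otherwise trust the stale charset and decode the UTF-8 bytes a second time.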