function cc::xml_php4_create_parser in Constant Contact 6.2
Same name and namespace in other branches
- 6.3 class.cc.php \cc::xml_php4_create_parser()
- 7.3 class.cc.php \cc::xml_php4_create_parser()
Instantiate an XML parser under PHP4
Unfortunately PHP4's support for character encodings, and especially XML and character encodings, sucks. As long as the documents you parse only contain characters from the ISO-8859-1 character set (a superset of ASCII; its characters are a subset of those UTF-8 can encode) you're fine. However, once you step out of that comfy little world things get mad, bad, and dangerous to know.
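The following code is based on SJM's work with FoF. When detection is enabled and an empty string is passed as the encoding, the function reads the encoding from the XML declaration and falls back to UTF-8 if none is declared.
@access private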
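The detection step can be tried in isolation. The sketch below is illustrative only: `example_detect_xml_encoding()` is a hypothetical helper, not part of class.cc.php, and its pattern escapes the `?` characters of the XML declaration, which is an assumption about the intended behavior.

```php
<?php
// Hypothetical helper (not part of class.cc.php): return the encoding
// named in the XML declaration, upper-cased, or a default when the
// declaration has no encoding attribute.
function example_detect_xml_encoding($source, $default = 'UTF-8') {
  if (preg_match('/<\?xml.*?encoding=[\'"](.*?)[\'"].*?\?>/m', $source, $m)) {
    return strtoupper($m[1]);
  }
  return $default;
}

$latin = '<?xml version="1.0" encoding="iso-8859-1"?><feed/>';
$bare  = '<?xml version="1.0"?><feed/>';

echo example_detect_xml_encoding($latin), "\n"; // ISO-8859-1
echo example_detect_xml_encoding($bare), "\n";  // UTF-8
```

Upper-casing the result mirrors the listing, which normalizes the name before checking it against the known encodings.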
See also
http://minutillo.com/steve/weblog/2004/6/17/php-xml-and-character-encodi...
1 call to cc::xml_php4_create_parser()
- cc::xml_create_parser in ./class.cc.php
  Return XML parser, and possibly re-encoded source. @access private
File
- ./class.cc.php, line 1581
Class
- cc
- @file
Code
function xml_php4_create_parser($source, $in_enc, $detect) {
  // Detection disabled: use the caller's encoding as given.
  if (!$detect) {
    return array(
      xml_parser_create($in_enc),
      $source,
    );
  }
  // No encoding supplied: read it from the XML declaration, else assume UTF-8.
  if (!$in_enc) {
    if (preg_match('/<\?xml.*?encoding=[\'"](.*?)[\'"].*?\?>/m', $source, $m)) {
      $in_enc = strtoupper($m[1]);
      $this->xml_source_encoding = $in_enc;
    }
    else {
      $in_enc = 'UTF-8';
    }
  }
  // Encodings PHP's XML parser already knows need no conversion.
  if ($this->xml_known_encoding($in_enc)) {
    return array(
      xml_parser_create($in_enc),
      $source,
    );
  }
  // The detected encoding is not one of the simple encodings PHP knows.
  // Attempt to use the iconv extension to cast the XML to a known encoding.
  // @see http://php.net/iconv
  if (function_exists('iconv')) {
    $encoded_source = iconv($in_enc, 'UTF-8', $source);
    if ($encoded_source) {
      return array(
        xml_parser_create('UTF-8'),
        $encoded_source,
      );
    }
  }
  // iconv didn't work, try mb_convert_encoding.
  // @see http://php.net/mbstring
  if (function_exists('mb_convert_encoding')) {
    $encoded_source = mb_convert_encoding($source, 'UTF-8', $in_enc);
    if ($encoded_source) {
      return array(
        xml_parser_create('UTF-8'),
        $encoded_source,
      );
    }
  }
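  // Neither conversion worked: raise a notice and fall back to a default
  // parser with the unconverted source. (A call to exit() here would make
  // the return below unreachable.)
  trigger_error("Feed is in an unsupported character encoding. ({$in_enc}) " . "You may see strange artifacts, and mangled characters.", E_USER_NOTICE);
  return array(
    xml_parser_create(),
    $source,
  );
}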
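Every return path above yields a two-element array: the parser, then the source text to feed it. The following is a minimal sketch of how a caller might consume that pair. It builds the array by hand rather than calling the method, and the sample feed and the `$text` collector are assumptions made for illustration.

```php
<?php
// Stand-in for the function's return value on the "no detection" path:
// array(parser, source), built by hand for this sketch.
$source = '<?xml version="1.0" encoding="UTF-8"?><feed><title>Hello</title></feed>';
$result = array(xml_parser_create('UTF-8'), $source);

// Unpack the pair the same way a caller would.
list($parser, $xml) = $result;

// Collect character data to show the returned parser is ready to use.
$text = '';
xml_set_character_data_handler($parser, function ($p, $data) use (&$text) {
  $text .= $data;
});

// TRUE marks this chunk as the final piece of the document.
$ok = xml_parse($parser, $xml, TRUE);

echo $ok ? "parsed: {$text}\n" : "parse error\n";
```

Returning the source alongside the parser matters because the iconv and mbstring branches hand back a re-encoded copy, so the caller must parse the returned text rather than the original.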