function cc::xml_php4_create_parser in Constant Contact 7.3

Same name and namespace in other branches
  1. 6.3 class.cc.php \cc::xml_php4_create_parser()
  2. 6.2 class.cc.php \cc::xml_php4_create_parser()

Instantiate an XML parser under PHP4.

Unfortunately, PHP4's support for character encodings, and especially for XML and character encodings, sucks. As long as the documents you parse only contain characters from the ISO-8859-1 character set (a superset of ASCII, and a subset of Unicode) you're fine. However, once you step out of that comfy little world, things get mad, bad, and dangerous to know.
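
To make the problem concrete, here is a minimal sketch showing why the same text is a different byte sequence in ISO-8859-1 and in UTF-8. It is not part of the cc class: the variable names are illustrative, and it assumes a PHP build with the iconv extension and PCRE's /u modifier available.

```php
<?php
// Illustrative only: the variable names are made up for this sketch.
// "café" encoded as ISO-8859-1: the e-acute is the single byte 0xE9.
$latin1 = "caf\xE9";

// Re-encode to UTF-8: the e-acute becomes the two bytes 0xC3 0xA9.
$utf8 = iconv('ISO-8859-1', 'UTF-8', $latin1);

// preg_match() with the /u modifier returns FALSE when the subject
// is not valid UTF-8, so it works as a cheap validity check.
$latin1_is_utf8 = (preg_match('//u', $latin1) === 1);
$utf8_is_utf8 = (preg_match('//u', $utf8) === 1);
```

A parser that expects UTF-8 input would reject or mangle the first string, which is why a source outside the parser's known encodings has to be converted before parsing.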

The following code is based on SJM's work with FoF @link http://minutillo.com/steve/weblog/2004/6/17/php-xml-and-character-encodi... if passed an empty string as the encoding.

@access private

1 call to cc::xml_php4_create_parser()
cc::xml_create_parser in ./class.cc.php
Return XML parser, and possibly re-encoded source.

File

./class.cc.php, line 1772
Constant Contact PHP Class

Class

cc
@file Constant Contact PHP Class

Code

function xml_php4_create_parser($source, $in_enc, $detect) {
  if (!$detect) {
    return array(
      xml_parser_create($in_enc),
      $source,
    );
  }
  if (!$in_enc) {
    if (preg_match('/<\?xml.*encoding=[\'"](.*?)[\'"].*\?>/m', $source, $m)) {
      $in_enc = drupal_strtoupper($m[1]);
      $this->xml_source_encoding = $in_enc;
    }
    else {
      $in_enc = 'UTF-8';
    }
  }
  if ($this->xml_known_encoding($in_enc)) {
    return array(
      xml_parser_create($in_enc),
      $source,
    );
  }

  // The detected encoding is not one of the simple encodings PHP knows.
  // Attempt to use the iconv extension to cast the XML to a known encoding.
  // @link http://php.net/iconv
  if (function_exists('iconv')) {
    $encoded_source = iconv($in_enc, 'UTF-8', $source);
    if ($encoded_source) {
      return array(
        xml_parser_create('UTF-8'),
        $encoded_source,
      );
    }
  }

  // iconv didn't work, try mb_convert_encoding
  // @link http://php.net/mbstring
  if (function_exists('mb_convert_encoding')) {
    $encoded_source = mb_convert_encoding($source, 'UTF-8', $in_enc);
    if ($encoded_source) {
      return array(
        xml_parser_create('UTF-8'),
        $encoded_source,
      );
    }
  }
  trigger_error(check_plain("Feed is in an unsupported character encoding. ({$in_enc}) You may see strange artifacts, and mangled characters."), E_USER_ERROR);
}
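
The conversion fallback used in the listing (try iconv first, then mbstring) can be sketched as a stand-alone function. This is a hedged illustration, not the cc class method: the name sketch_to_utf8() is hypothetical, it targets a current PHP version rather than PHP4, and it returns FALSE instead of raising an error when no converter succeeds.

```php
<?php
/**
 * Hypothetical helper: re-encode $source from $in_enc to UTF-8.
 *
 * Returns the converted string, or FALSE if neither iconv nor mbstring
 * could perform the conversion.
 */
function sketch_to_utf8($source, $in_enc) {
  // Nothing to do when the input is already declared as UTF-8.
  if (strtoupper($in_enc) === 'UTF-8') {
    return $source;
  }

  // First choice: the iconv extension. It returns FALSE on failure.
  if (function_exists('iconv')) {
    $converted = @iconv($in_enc, 'UTF-8', $source);
    if ($converted !== FALSE && $converted !== '') {
      return $converted;
    }
  }

  // Second choice: mbstring. Recent PHP versions throw on an unknown
  // encoding name, older ones emit a warning, so guard against both.
  if (function_exists('mb_convert_encoding')) {
    try {
      $converted = @mb_convert_encoding($source, 'UTF-8', $in_enc);
      if ($converted !== FALSE && $converted !== '') {
        return $converted;
      }
    }
    catch (\Throwable $e) {
      // Fall through to the failure case below.
    }
  }

  return FALSE;
}
```

A caller would then pair the converted string with a parser created by xml_parser_create('UTF-8'), mirroring the array that the listing returns. Note the differing argument order, which the listing also reflects: iconv() takes the source encoding first, while mb_convert_encoding() takes the string first and the source encoding last.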