public function Crawler::addHtmlContent in Zircon Profile 8

Same name and namespace in other branches
  1. 8.0 vendor/symfony/dom-crawler/Crawler.php \Symfony\Component\DomCrawler\Crawler::addHtmlContent()

Adds HTML content to the list of nodes.

libxml errors are suppressed while the content is parsed.

To collect parsing errors, enable internal error handling with libxml_use_internal_errors(true) before calling this method, then read the errors with libxml_get_errors(). Call libxml_clear_errors() afterward to empty the error buffer.
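The error-collection pattern described above can be sketched as follows. This is a minimal illustration, not part of the Crawler API: the helper name collectLibxmlErrors() is hypothetical, and a plain \DOMDocument::loadHTML() call stands in for $crawler->addHtmlContent($html) so that the snippet runs without Composer autoloading.

```php
<?php
// Hypothetical helper (not part of Symfony): runs $parse with libxml
// internal errors enabled and returns whatever errors were recorded.
function collectLibxmlErrors(callable $parse): array
{
    // Remember the caller's setting so it can be restored afterward.
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();
    try {
        $parse();
        $errors = libxml_get_errors();
    } finally {
        // Always empty the buffer and restore the previous setting.
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
    return $errors;
}

// Stand-in for $crawler->addHtmlContent($html); the markup is arbitrary.
$errors = collectLibxmlErrors(function () {
    $dom = new \DOMDocument();
    $dom->loadHTML('<html><body><p>Unclosed paragraph<div></body></html>');
});

foreach ($errors as $error) {
    // Each entry is a LibXMLError object with line, column, and message.
    printf("line %d: %s\n", $error->line, trim($error->message));
}
```

Which errors are reported, if any, depends on the installed libxml2 version, so the loop may print nothing for lenient input.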

Parameters

string $content The HTML content.

string $charset The charset of the content. Defaults to 'UTF-8'.

1 call to Crawler::addHtmlContent()
Crawler::addContent in vendor/symfony/dom-crawler/Crawler.php
Adds HTML/XML content.

File

vendor/symfony/dom-crawler/Crawler.php, line 151

Class

Crawler
Crawler eases navigation of a list of \DOMElement objects.

Namespace

Symfony\Component\DomCrawler

Code

public function addHtmlContent($content, $charset = 'UTF-8') {
  $internalErrors = libxml_use_internal_errors(true);
  $disableEntities = libxml_disable_entity_loader(true);
  $dom = new \DOMDocument('1.0', $charset);
  $dom->validateOnParse = true;
  set_error_handler(function () {
    throw new \Exception();
  });
  try {

    // Convert charset to HTML-entities to work around bugs in DOMDocument::loadHTML()
    if (function_exists('mb_convert_encoding')) {
      $content = mb_convert_encoding($content, 'HTML-ENTITIES', $charset);
    }
    elseif (function_exists('iconv')) {
      $content = preg_replace_callback('/[\\x80-\\xFF]+/', function ($m) {
        $m = unpack('C*', $m[0]);
        $i = 1;
        $entities = '';
        while (isset($m[$i])) {
          if (0xf0 <= $m[$i]) {
            $c = ($m[$i++] - 0xf0 << 18) + ($m[$i++] - 0x80 << 12) + ($m[$i++] - 0x80 << 6) + $m[$i++] - 0x80;
          }
          elseif (0xe0 <= $m[$i]) {
            $c = ($m[$i++] - 0xe0 << 12) + ($m[$i++] - 0x80 << 6) + $m[$i++] - 0x80;
          }
          else {
            $c = ($m[$i++] - 0xc0 << 6) + $m[$i++] - 0x80;
          }
          $entities .= '&#' . $c . ';';
        }
        return $entities;
      }, iconv($charset, 'UTF-8', $content));
    }
  } catch (\Exception $e) {
    // Conversion failed; $content is left unconverted.
  }
  restore_error_handler();
  if ('' !== trim($content)) {
    @$dom->loadHTML($content);
  }
  libxml_use_internal_errors($internalErrors);
  libxml_disable_entity_loader($disableEntities);
  $this->addDocument($dom);
  $base = $this
    ->filterRelativeXPath('descendant-or-self::base')
    ->extract(array('href'));
  $baseHref = current($base);
  if (count($base) && !empty($baseHref)) {
    if ($this->baseHref) {
      $linkNode = $dom->createElement('a');
      $linkNode->setAttribute('href', $baseHref);
      $link = new Link($linkNode, $this->baseHref);
      $this->baseHref = $link->getUri();
    }
    else {
      $this->baseHref = $baseHref;
    }
  }
}
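
The iconv fallback in the code above decodes each run of non-ASCII UTF-8 bytes into code points and emits them as numeric character references. The standalone sketch below reproduces that arithmetic for illustration. The function name utf8ToNumericEntities() is hypothetical, and the input is assumed to be valid UTF-8 already (the original first passes the content through iconv()).

```php
<?php
// Hypothetical standalone version of the fallback callback.
// Assumes $utf8 is valid UTF-8; malformed input is not handled.
function utf8ToNumericEntities(string $utf8): string
{
    return preg_replace_callback('/[\x80-\xFF]+/', function (array $match): string {
        // unpack('C*') returns a 1-indexed array of unsigned byte values.
        $bytes = unpack('C*', $match[0]);
        $i = 1;
        $entities = '';
        while (isset($bytes[$i])) {
            if (0xF0 <= $bytes[$i]) {
                // Four-byte sequence: 3 + 6 + 6 + 6 payload bits.
                $code = (($bytes[$i++] - 0xF0) << 18)
                    + (($bytes[$i++] - 0x80) << 12)
                    + (($bytes[$i++] - 0x80) << 6)
                    + ($bytes[$i++] - 0x80);
            } elseif (0xE0 <= $bytes[$i]) {
                // Three-byte sequence: 4 + 6 + 6 payload bits.
                $code = (($bytes[$i++] - 0xE0) << 12)
                    + (($bytes[$i++] - 0x80) << 6)
                    + ($bytes[$i++] - 0x80);
            } else {
                // Two-byte sequence: 5 + 6 payload bits.
                $code = (($bytes[$i++] - 0xC0) << 6)
                    + ($bytes[$i++] - 0x80);
            }
            $entities .= '&#' . $code . ';';
        }
        return $entities;
    }, $utf8);
}

// "\xC3\xA9" is U+00E9 and "\xE2\x82\xAC" is U+20AC, written as raw bytes.
echo utf8ToNumericEntities("caf\xC3\xA9 \xE2\x82\xAC"), "\n"; // prints caf&#233; &#8364;
```

ASCII bytes fall outside the pattern and pass through unchanged, so only characters above U+007F are rewritten.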