public function Crawler::addHtmlContent in Zircon Profile 8
Same name and namespace in other branches
- 8.0 vendor/symfony/dom-crawler/Crawler.php \Symfony\Component\DomCrawler\Crawler::addHtmlContent()
Adds HTML content to the list of nodes.
The libxml errors are disabled when the content is parsed.
If you want to get parsing errors, enable internal errors via libxml_use_internal_errors(true) before calling this method, then retrieve them via libxml_get_errors(). Be sure to clear the errors with libxml_clear_errors() afterward.
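The error-handling pattern described above can be sketched as follows. This is a minimal illustration that calls DOMDocument::loadHTML() directly rather than going through the Crawler, so that it runs without Symfony installed; the sample markup and variable names are assumptions made for the example.

```php
<?php
// Sample markup with an unclosed <p> and an unknown tag (illustrative only).
$html = '<html><body><p>Unclosed paragraph<nosuchtag></body></html>';

// Buffer libxml errors internally instead of emitting PHP warnings.
// The previous setting is returned so it can be restored later.
$previous = libxml_use_internal_errors(true);
libxml_clear_errors();

$dom = new DOMDocument();
$dom->loadHTML($html);

// Each entry is a LibXMLError object with level, code, line and message.
$errors = libxml_get_errors();
foreach ($errors as $error) {
    printf("line %d: %s\n", $error->line, trim($error->message));
}

// Empty the buffer and restore the caller's original setting.
libxml_clear_errors();
libxml_use_internal_errors($previous);
```

Which errors are reported, if any, depends on the installed libxml2 version, so the sketch prints whatever was collected instead of assuming a fixed list.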
Parameters
string $content: The HTML content.
string $charset: The charset.
1 call to Crawler::addHtmlContent()
- Crawler::addContent in vendor/symfony/dom-crawler/Crawler.php - Adds HTML/XML content.
File
- vendor/symfony/dom-crawler/Crawler.php, line 151
Class
- Crawler
- Crawler eases navigation of a list of \DOMElement objects.
Namespace
Symfony\Component\DomCrawler
Code
public function addHtmlContent($content, $charset = 'UTF-8') {
  $internalErrors = libxml_use_internal_errors(true);
  $disableEntities = libxml_disable_entity_loader(true);
  $dom = new \DOMDocument('1.0', $charset);
  $dom->validateOnParse = true;
  set_error_handler(function () {
    throw new \Exception();
  });
  try {
    // Convert charset to HTML-entities to work around bugs in DOMDocument::loadHTML()
    if (function_exists('mb_convert_encoding')) {
      $content = mb_convert_encoding($content, 'HTML-ENTITIES', $charset);
    }
    elseif (function_exists('iconv')) {
      $content = preg_replace_callback('/[\\x80-\\xFF]+/', function ($m) {
        $m = unpack('C*', $m[0]);
        $i = 1;
        $entities = '';
        while (isset($m[$i])) {
          if (0xf0 <= $m[$i]) {
            $c = ($m[$i++] - 0xf0 << 18) + ($m[$i++] - 0x80 << 12) + ($m[$i++] - 0x80 << 6) + $m[$i++] - 0x80;
          }
          elseif (0xe0 <= $m[$i]) {
            $c = ($m[$i++] - 0xe0 << 12) + ($m[$i++] - 0x80 << 6) + $m[$i++] - 0x80;
          }
          else {
            $c = ($m[$i++] - 0xc0 << 6) + $m[$i++] - 0x80;
          }
          $entities .= '&#' . $c . ';';
        }
        return $entities;
      }, iconv($charset, 'UTF-8', $content));
    }
  } catch (\Exception $e) {
  }
  restore_error_handler();
  if ('' !== trim($content)) {
    @$dom->loadHTML($content);
  }
  libxml_use_internal_errors($internalErrors);
  libxml_disable_entity_loader($disableEntities);
  $this->addDocument($dom);
  $base = $this->filterRelativeXPath('descendant-or-self::base')->extract(array('href'));
  $baseHref = current($base);
  if (count($base) && !empty($baseHref)) {
    if ($this->baseHref) {
      $linkNode = $dom->createElement('a');
      $linkNode->setAttribute('href', $baseHref);
      $link = new Link($linkNode, $this->baseHref);
      $this->baseHref = $link->getUri();
    }
    else {
      $this->baseHref = $baseHref;
    }
  }
}
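The iconv fallback branch above rewrites every non-ASCII UTF-8 sequence as a decimal numeric character reference before the content reaches loadHTML(). The following standalone sketch shows the same idea using explicit bit masks. The function name utf8ToNumericEntities is hypothetical and not part of Symfony, and the sketch assumes the input is already valid UTF-8 (malformed sequences are not handled).

```php
<?php
// Hypothetical helper, not part of Symfony: converts each run of non-ASCII
// bytes in a valid UTF-8 string to decimal numeric character references.
function utf8ToNumericEntities(string $utf8): string
{
    return preg_replace_callback('/[\x80-\xFF]+/', function (array $match): string {
        // unpack() returns a 1-indexed array; re-index from 0 for clarity.
        $bytes = array_values(unpack('C*', $match[0]));
        $count = count($bytes);
        $entities = '';
        $i = 0;
        while ($i < $count) {
            $lead = $bytes[$i];
            if ($lead >= 0xF0) {
                // 4-byte sequence: 3 payload bits in the lead byte.
                $code = (($lead & 0x07) << 18)
                    | (($bytes[$i + 1] & 0x3F) << 12)
                    | (($bytes[$i + 2] & 0x3F) << 6)
                    | ($bytes[$i + 3] & 0x3F);
                $i += 4;
            } elseif ($lead >= 0xE0) {
                // 3-byte sequence: 4 payload bits in the lead byte.
                $code = (($lead & 0x0F) << 12)
                    | (($bytes[$i + 1] & 0x3F) << 6)
                    | ($bytes[$i + 2] & 0x3F);
                $i += 3;
            } else {
                // 2-byte sequence: 5 payload bits in the lead byte.
                $code = (($lead & 0x1F) << 6) | ($bytes[$i + 1] & 0x3F);
                $i += 2;
            }
            $entities .= '&#' . $code . ';';
        }
        return $entities;
    }, $utf8);
}

// "\u{e9}" is e-acute (U+00E9) and "\u{20ac}" is the euro sign (U+20AC).
echo utf8ToNumericEntities("caf\u{e9} \u{20ac}"), "\n"; // prints "caf&#233; &#8364;"
```

Masking the payload bits gives the same code points as the subtraction-and-shift arithmetic in the method for well-formed input; ASCII characters are left untouched because the pattern only matches bytes 0x80 and above.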