class FeedsXPathParserHTML in Feeds XPath Parser 6

Same name and namespace in other branches
  1. 7 FeedsXPathParserHTML.inc \FeedsXPathParserHTML

Parse HTML using XPath.
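
As a minimal sketch of what parsing HTML with XPath looks like in plain PHP, independent of Feeds, the following loads a markup string with DOMDocument and evaluates a query with DOMXPath. The function name example_xpath_query_text() and the sample markup are hypothetical and are not part of the module.

```php
<?php
/**
 * Hypothetical helper: returns the trimmed text of every node matched by $query.
 */
function example_xpath_query_text($html, $query) {
  $doc = new DOMDocument();
  // Buffer libxml warnings internally instead of emitting PHP warnings.
  $previous = libxml_use_internal_errors(TRUE);
  $loaded = $doc->loadHTML($html);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);
  if (!$loaded) {
    return array();
  }

  $xpath = new DOMXPath($doc);
  $nodes = $xpath->query($query);
  // DOMXPath::query() returns FALSE for a malformed expression.
  if ($nodes === FALSE) {
    return array();
  }

  $values = array();
  foreach ($nodes as $node) {
    $values[] = trim($node->textContent);
  }
  return $values;
}

// Sample markup, assumed for illustration only.
$html = '<html><body><h2>First</h2><p>Body</p><h2>Second</h2></body></html>';
$titles = example_xpath_query_text($html, '//h2');
```

The class below follows the same outline, but adds an optional tidy pass and reports the buffered libxml errors instead of discarding them.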

2 string references to 'FeedsXPathParserHTML'
FeedsXPathParseHTMLTestCase::test in tests/feeds_xpathparser_parser_html.test
Run tests.
feeds_xpathparser_feeds_plugins in ./feeds_xpathparser.module
Implementation of hook_feeds_plugins().

File

./FeedsXPathParserHTML.inc, line 11
Provides the FeedsXPathParserHTML class.

View source
class FeedsXPathParserHTML extends FeedsXPathParserBase {

  /**
   * Implementation of FeedsXPathParserBase::setup().
   */
  protected function setup($source_config, FeedsImportBatch $batch) {
    if (!empty($source_config['exp']['tidy'])) {
      $config = array(
        'merge-divs' => FALSE,
        'merge-spans' => FALSE,
        'join-styles' => FALSE,
        'drop-empty-paras' => FALSE,
        'wrap' => 0,
        'tidy-mark' => FALSE,
        'escape-cdata' => TRUE,
        'word-2000' => TRUE,
      );

      // Default tidy encoding is UTF8.
      $encoding = $source_config['exp']['tidy_encoding'];
      $raw = tidy_repair_string(trim($batch->getRaw()), $config, $encoding);
    }
    else {
      $raw = $batch->getRaw();
    }
    $doc = new DOMDocument();

    // Use our own error handling.
    $use = $this->errorStart();
    $success = $doc->loadHTML($raw);
    unset($raw);
    $this->errorStop($use, $source_config['exp']['errors']);
    if (!$success) {
      throw new Exception(t('There was an error parsing the HTML document.'));
    }
    return $doc;
  }

  /**
   * Overrides FeedsXPathParserBase::getRaw().
   */
  protected function getRaw(DOMNode $node) {
    // DOMDocument::saveHTML() cannot take $node as an argument prior to 5.3.6.
    if (version_compare(phpversion(), '5.3.6', '>=')) {
      return $this->doc->saveHTML($node);
    }
    return $this->doc->saveXML($node);
  }

}
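
errorStart() and errorStop() are inherited from FeedsXPathParserBase, and their bodies are not shown on this page. A rough sketch of the same pattern, assuming they wrap libxml's internal error buffer, could look like the following. The names example_error_start(), example_error_stop() and example_load_html() are hypothetical stand-ins, not the module's methods.

```php
<?php
/**
 * Hypothetical stand-in for errorStart(): buffer libxml errors internally.
 *
 * Returns the previous setting so that it can be restored afterwards.
 */
function example_error_start() {
  return libxml_use_internal_errors(TRUE);
}

/**
 * Hypothetical stand-in for errorStop(): collect messages, restore the setting.
 */
function example_error_stop($previous) {
  $messages = array();
  foreach (libxml_get_errors() as $error) {
    $messages[] = sprintf('%s on line %d', trim($error->message), $error->line);
  }
  libxml_clear_errors();
  libxml_use_internal_errors($previous);
  return $messages;
}

/**
 * Loads $raw as HTML and returns the document plus any buffered messages.
 */
function example_load_html($raw) {
  $doc = new DOMDocument();
  $previous = example_error_start();
  $success = $doc->loadHTML($raw);
  $messages = example_error_stop($previous);
  if (!$success) {
    throw new Exception('There was an error parsing the HTML document.');
  }
  return array($doc, $messages);
}

// Deliberately sloppy sample markup, assumed for illustration only.
list($doc, $messages) = example_load_html('<p>Unclosed <b>bold</p>');
```

Restoring the previous libxml setting keeps the change local to this call, so other code that relies on default PHP warnings is unaffected.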

Members

| Name | Modifiers | Type | Description | Overrides |
| --- | --- | --- | --- | --- |
| FeedsConfigurable::$config | protected | property | | |
| FeedsConfigurable::$disabled | protected | property | CTools export enabled status of this object. | |
| FeedsConfigurable::$export_type | protected | property | | |
| FeedsConfigurable::$id | protected | property | | |
| FeedsConfigurable::addConfig | public | function | Similar to setConfig but adds to existing configuration. | 1 |
| FeedsConfigurable::configFormSubmit | public | function | Submission handler for configForm(). | 3 |
| FeedsConfigurable::copy | public | function | Copy a configuration. | 1 |
| FeedsConfigurable::existing | public | function | Determine whether this object is persistent and enabled, i.e. it is defined either in code or in the database and it is enabled. | 1 |
| FeedsConfigurable::getConfig | public | function | Implementation of getConfig(). | 1 |
| FeedsConfigurable::instance | public static | function | Instantiate a FeedsConfigurable object. | 1 |
| FeedsConfigurable::setConfig | public | function | Set configuration. | 1 |
| FeedsConfigurable::__get | public | function | Override magic method __get(). Make sure that $this->config goes through getConfig(). | |
| FeedsConfigurable::__isset | public | function | Override magic method __isset(). This is needed due to overriding __get(). | |
| FeedsParser::clear | public | function | Clear all caches for results for given source. | |
| FeedsParser::getSourceElement | public | function | Get an element identified by $element_key of the given item. The element key corresponds to the values in the array returned by FeedsParser::getMappingSources(). | 1 |
| FeedsPlugin::hasSourceConfig | public | function | Returns TRUE if $this->sourceForm() returns a form. | Overrides FeedsSourceInterface::hasSourceConfig |
| FeedsPlugin::loadMappers | protected static | function | Loads on-behalf implementations from mappers/ directory. | |
| FeedsPlugin::save | public | function | Save changes to the configuration of this object. Delegate saving to parent (= Feed) which will collect information from this object by way of getConfig() and store it. | Overrides FeedsConfigurable::save |
| FeedsPlugin::sourceDelete | public | function | A source is being deleted. | Overrides FeedsSourceInterface::sourceDelete 1 |
| FeedsPlugin::sourceSave | public | function | A source is being saved. | Overrides FeedsSourceInterface::sourceSave 1 |
| FeedsPlugin::__construct | protected | function | Constructor. | Overrides FeedsConfigurable::__construct |
| FeedsXPathParserBase::$doc | protected | property | | |
| FeedsXPathParserBase::$modified_queries | protected | property | | |
| FeedsXPathParserBase::$rawXML | protected | property | | |
| FeedsXPathParserBase::$xpath | protected | property | | |
| FeedsXPathParserBase::configDefaults | public | function | Define defaults. | Overrides FeedsConfigurable::configDefaults |
| FeedsXPathParserBase::configForm | public | function | Override parent::configForm(). | Overrides FeedsConfigurable::configForm |
| FeedsXPathParserBase::configFormValidate | public | function | Override parent::sourceFormValidate(). | Overrides FeedsConfigurable::configFormValidate |
| FeedsXPathParserBase::errorStart | protected | function | | |
| FeedsXPathParserBase::errorStop | protected | function | | |
| FeedsXPathParserBase::filterMappings | protected | function | Filters mappings, returning the ones that belong to us. | |
| FeedsXPathParserBase::getMappingSources | public | function | Override parent::getMappingSources(). | Overrides FeedsParser::getMappingSources |
| FeedsXPathParserBase::getOwnMappings | protected | function | | |
| FeedsXPathParserBase::parse | public | function | Implements FeedsParser::parse(). | Overrides FeedsParser::parse |
| FeedsXPathParserBase::parseSourceElement | protected | function | Parses one item from the context array. | |
| FeedsXPathParserBase::sourceDefaults | public | function | Define defaults. | Overrides FeedsPlugin::sourceDefaults |
| FeedsXPathParserBase::sourceForm | public | function | Source form. | Overrides FeedsPlugin::sourceForm |
| FeedsXPathParserBase::sourceFormValidate | public | function | Override parent::sourceFormValidate(). | Overrides FeedsPlugin::sourceFormValidate |
| FeedsXPathParserHTML::getRaw | protected | function | | Overrides FeedsXPathParserBase::getRaw |
| FeedsXPathParserHTML::setup | protected | function | Implementation of FeedsXPathParserBase::setup(). | Overrides FeedsXPathParserBase::setup |
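
The getRaw() override listed above returns the markup of a single node rather than of the whole document. A small standalone sketch of that serialization step follows. The function name example_node_markup() and the sample markup are hypothetical; the version check mirrors the one in the class source.

```php
<?php
/**
 * Hypothetical helper: serializes one node of $doc back to markup.
 */
function example_node_markup(DOMDocument $doc, DOMNode $node) {
  // DOMDocument::saveHTML() accepts a node argument from PHP 5.3.6 onwards.
  if (version_compare(PHP_VERSION, '5.3.6', '>=')) {
    return $doc->saveHTML($node);
  }
  // Older PHP: fall back to XML serialization of the node.
  return $doc->saveXML($node);
}

// Sample markup, assumed for illustration only.
$doc = new DOMDocument();
$doc->loadHTML('<html><body><div id="a"><p>Hello</p></div></body></html>');
$xpath = new DOMXPath($doc);
$node = $xpath->query('//div[@id="a"]')->item(0);
$markup = example_node_markup($doc, $node);
```

The two branches are not fully equivalent: saveXML() may emit XML-style output such as self-closed empty elements, so results can differ slightly on older PHP versions.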