class FeedsExHtml in Feeds extensible parsers 7

Same name and namespace in other branches
  1. 7.2 src/FeedsExHtml.inc \FeedsExHtml

Parses HTML documents with XPath.
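
As a rough illustration of what parsing HTML with XPath involves, the sketch below uses only PHP's built-in DOM extension. It is not the module's own code: the sample markup, the variable names, and the two queries are assumptions made for the example, and the module's internal document creation may differ.

```php
<?php
// Hypothetical sample markup, not taken from the module.
$html = '<html><body>'
  . '<div class="post"><h2>First</h2></div>'
  . '<div class="post"><h2>Second</h2></div>'
  . '</body></html>';

$document = new DOMDocument();

// Collect libxml warnings internally instead of emitting PHP warnings,
// since real-world HTML is rarely well formed.
$previous = libxml_use_internal_errors(TRUE);
$document->loadHTML($html);
libxml_clear_errors();
libxml_use_internal_errors($previous);

$xpath = new DOMXPath($document);

$titles = array();
// A context query selects each row; the source query is then evaluated
// relative to that row.
foreach ($xpath->query('//div[@class="post"]') as $row) {
  $titles[] = $xpath->evaluate('string(h2)', $row);
}
```

The split into a context query and a per-row source query mirrors how the members listed further down (executeContext and executeSourceExpression) are described, but the exact expressions here are only illustrative.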

Hierarchy

Expanded class hierarchy of FeedsExHtml

2 string references to 'FeedsExHtml'
FeedsExHtmlTests::setUp in src/Tests/FeedsExHtml.test
feeds_ex_feeds_plugins in ./feeds_ex.feeds.inc
Implements hook_feeds_plugins().

File

src/FeedsExHtml.inc, line 11
Contains FeedsExHtml.

class FeedsExHtml extends FeedsExXml {

  /**
   * Whether this version of PHP has the correct saveHTML() method.
   *
   * @var bool
   */
  protected $useSaveHTML;

  /**
   * {@inheritdoc}
   */
  protected $encoderClass = 'FeedsExHtmlEncoder';

  /**
   * {@inheritdoc}
   */
  public function __construct($id) {
    parent::__construct($id);

    // DOMDocument::saveHTML() cannot take $node as an argument prior to 5.3.6.
    $this->useSaveHTML = version_compare(PHP_VERSION, '5.3.6', '>=');
  }

  /**
   * {@inheritdoc}
   */
  protected function prepareDocument(FeedsSource $source, FeedsFetcherResult $fetcher_result) {
    $raw = $this->prepareRaw($fetcher_result);
    if ($this->config['use_tidy'] && extension_loaded('tidy')) {
      $raw = tidy_repair_string($raw, $this->getTidyConfig(), 'utf8');
    }
    return FeedsExXmlUtility::createHtmlDocument($raw);
  }

  /**
   * {@inheritdoc}
   */
  protected function getRaw(DOMNode $node) {
    if ($this->useSaveHTML) {
      return $node->ownerDocument->saveHTML($node);
    }
    return $node->ownerDocument->saveXML($node, LIBXML_NOEMPTYTAG);
  }

  /**
   * {@inheritdoc}
   */
  protected function getTidyConfig() {
    return array(
      'merge-divs' => FALSE,
      'merge-spans' => FALSE,
      'join-styles' => FALSE,
      'drop-empty-paras' => FALSE,
      'wrap' => 0,
      'tidy-mark' => FALSE,
      'escape-cdata' => TRUE,
    );
  }

}
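
The version check in the constructor and the fallback in getRaw() can be tried outside the module with a small standalone sketch. The function name example_get_raw() and the sample markup are hypothetical; only the DOMDocument calls and the LIBXML_NOEMPTYTAG flag come from the listing above. The flag is presumably used so that empty elements are written with explicit closing tags rather than in the self-closing XML form.

```php
<?php
/**
 * Standalone sketch of the getRaw() logic (hypothetical helper name).
 */
function example_get_raw(DOMNode $node) {
  // DOMDocument::saveHTML() accepts a node argument from PHP 5.3.6 onward.
  if (version_compare(PHP_VERSION, '5.3.6', '>=')) {
    return $node->ownerDocument->saveHTML($node);
  }
  // Older PHP: fall back to XML serialization with explicit closing tags.
  return $node->ownerDocument->saveXML($node, LIBXML_NOEMPTYTAG);
}

// Hypothetical sample markup containing a void element.
$document = new DOMDocument();
$document->loadHTML('<html><body><div><p>Hi</p><br></div></body></html>');
$node = $document->getElementsByTagName('div')->item(0);

// Serialization chosen by the version check.
$raw = example_get_raw($node);

// Both styles side by side, to compare how the <br> element is written.
$html_style = $document->saveHTML($node);
$xml_style = $document->saveXML($node, LIBXML_NOEMPTYTAG);
```

Comparing $html_style and $xml_style shows the practical difference between the two branches: the HTML serializer writes void elements such as br the way HTML expects, while the XML fallback emits an explicit closing tag for every element.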

Members

Name Modifiers Type Description Overrides
FeedsExBase::$encoder protected property The encoder used to convert encodings.
FeedsExBase::$messenger protected property The object used to display messages to the user.
FeedsExBase::debug protected function Renders our debug messages into a list.
FeedsExBase::executeSources protected function Executes the source expressions.
FeedsExBase::getEncoder public function Returns the encoder.
FeedsExBase::getFormHeader protected function Returns the configuration form table header.
FeedsExBase::getMappingSources public function
FeedsExBase::getMessenger public function Returns the messenger.
FeedsExBase::hasConfigForm public function
FeedsExBase::hasConfigurableContext protected function Returns whether or not this parser uses a context query. 2
FeedsExBase::hasSourceConfig public function
FeedsExBase::loadLibrary protected function Loads the necessary library. 2
FeedsExBase::logErrors protected function Logs errors.
FeedsExBase::parse public function
FeedsExBase::parseItems protected function Performs the actual parsing. 2
FeedsExBase::prepareExpressions protected function Prepares the expressions for parsing.
FeedsExBase::prepareRaw protected function Prepares the raw string for parsing.
FeedsExBase::prepareVariables protected function Prepares the variable map used for substitution.
FeedsExBase::printErrors protected function Prints errors to the screen.
FeedsExBase::setEncoder public function Sets the encoder.
FeedsExBase::setMessenger public function Sets the messenger to be used to display messages.
FeedsExBase::sourceDefaults public function
FeedsExBase::sourceForm public function
FeedsExBase::sourceFormValidate public function
FeedsExBase::sourceSave public function
FeedsExHtml::$encoderClass protected property The class used as the text encoder. Overrides FeedsExXml::$encoderClass
FeedsExHtml::$useSaveHTML protected property Whether this version of PHP has the correct saveHTML() method.
FeedsExHtml::getRaw protected function Returns the raw XML of a DOM node. Overrides FeedsExXml::getRaw
FeedsExHtml::getTidyConfig protected function Returns the options for phptidy. Overrides FeedsExXml::getTidyConfig
FeedsExHtml::prepareDocument protected function Prepares the DOM document. Overrides FeedsExXml::prepareDocument
FeedsExHtml::__construct public function
FeedsExXml::$entityLoader protected property The previous value for the entity loader.
FeedsExXml::$handleXmlErrors protected property The previous value for XML error handling.
FeedsExXml::$xpath protected property The FeedsExXpathDomXpath object used for parsing.
FeedsExXml::cleanUp protected function Allows subclasses to cleanup after parsing. Overrides FeedsExBase::cleanUp
FeedsExXml::configDefaults public function Overrides FeedsExBase::configDefaults
FeedsExXml::configForm public function Overrides FeedsExBase::configForm
FeedsExXml::configFormTableColumn protected function Returns a form element for a specific column. Overrides FeedsExBase::configFormTableColumn 1
FeedsExXml::configFormTableHeader protected function Returns the list of table headers. Overrides FeedsExBase::configFormTableHeader 1
FeedsExXml::configFormValidate public function Overrides FeedsExBase::configFormValidate
FeedsExXml::executeContext protected function Returns rows to be parsed. Overrides FeedsExBase::executeContext 1
FeedsExXml::executeSourceExpression protected function Executes a single source expression. Overrides FeedsExBase::executeSourceExpression 1
FeedsExXml::getErrors protected function Returns the errors after parsing. Overrides FeedsExBase::getErrors
FeedsExXml::getInnerXml protected function Returns the inner XML of a DOM node.
FeedsExXml::setUp protected function Allows subclasses to prepare for parsing. Overrides FeedsExBase::setUp 1
FeedsExXml::startErrorHandling protected function Starts internal error handling. Overrides FeedsExBase::startErrorHandling
FeedsExXml::stopErrorHandling protected function Stops internal error handling. Overrides FeedsExBase::stopErrorHandling
FeedsExXml::validateExpression protected function Validates an expression. Overrides FeedsExBase::validateExpression 1