class QueryPathHtmlParser in Feeds extensible parsers 8

Defines an HTML parser using QueryPath.

@todo Make convertEncoding() into a helper function so that it isn't copied in 2 places.

Plugin annotation


@FeedsParser(
  id = "querypathhtml",
  title = @Translation("QueryPath HTML"),
  description = @Translation("Parse HTML with QueryPath.")
)

Hierarchy

Expanded class hierarchy of QueryPathHtmlParser

1 file declares its use of QueryPathHtmlParser
QueryPathHtmlParserTest.php in tests/src/Unit/Feeds/Parser/QueryPathHtmlParserTest.php

File

src/Feeds/Parser/QueryPathHtmlParser.php, line 22

Namespace

Drupal\feeds_ex\Feeds\Parser
class QueryPathHtmlParser extends QueryPathXmlParser {

  /**
   * {@inheritdoc}
   */
  protected $encoderClass = '\\Drupal\\feeds_ex\\Encoder\\HtmlEncoder';

  /**
   * {@inheritdoc}
   */
  protected function setUp(FeedInterface $feed, FetcherResultInterface $fetcher_result, StateInterface $state) {

    // Change some parser settings.
    $this->queryPathOptions['use_parser'] = 'html';
  }

  /**
   * {@inheritdoc}
   */
  protected function getRawValue(DOMQuery $node) {
    return $node->html();
  }

  /**
   * {@inheritdoc}
   */
  protected function prepareDocument(FeedInterface $feed, FetcherResultInterface $fetcher_result) {
    $raw = $this->prepareRaw($fetcher_result);
    if ($this->configuration['use_tidy'] && extension_loaded('tidy')) {
      $raw = tidy_repair_string($raw, $this->getTidyConfig(), 'utf8');
    }
    return $this->utility->createHtmlDocument($raw);
  }

  /**
   * {@inheritdoc}
   */
  protected function getTidyConfig() {
    return [
      'merge-divs' => FALSE,
      'merge-spans' => FALSE,
      'join-styles' => FALSE,
      'drop-empty-paras' => FALSE,
      'wrap' => 0,
      'tidy-mark' => FALSE,
      'escape-cdata' => TRUE,
      'word-2000' => TRUE,
    ];
  }

}
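
The prepareDocument() method above optionally repairs the raw markup with tidy and then hands it to a helper that builds the document. A minimal standalone sketch of that repair-then-load flow follows, using only PHP's tidy and DOM extensions. The function load_html_document(), its $use_tidy flag, and the sample markup are hypothetical stand-ins for prepareDocument(), $this->configuration['use_tidy'], and the fetched content; what createHtmlDocument() actually does is not shown on this page, so the DOMDocument::loadHTML() call is an assumption.

```php
<?php

/**
 * Sketch of a repair-then-load flow outside of Drupal.
 *
 * Hypothetical helper: it stands in for prepareDocument() and is not part of
 * feeds_ex.
 */
function load_html_document(string $raw, bool $use_tidy, array $tidy_config = []): DOMDocument {
  // Only repair the markup when asked to and when ext-tidy is available.
  if ($use_tidy && extension_loaded('tidy')) {
    $repaired = tidy_repair_string($raw, $tidy_config, 'utf8');
    // tidy_repair_string() returns FALSE on failure; keep the input then.
    if ($repaired !== FALSE) {
      $raw = $repaired;
    }
  }

  // Collect libxml warnings internally instead of raising PHP warnings.
  $previous = libxml_use_internal_errors(TRUE);
  $document = new DOMDocument();
  // Assumption: the real helper may load the markup differently.
  $document->loadHTML($raw);
  libxml_clear_errors();
  libxml_use_internal_errors($previous);

  return $document;
}

// Sample input with an unclosed <b> element.
$document = load_html_document(
  '<html><body><p>Hello <b>world</p></body></html>',
  TRUE,
  ['wrap' => 0, 'tidy-mark' => FALSE]
);
echo $document->getElementsByTagName('p')->length, "\n";
```

The extension_loaded() check mirrors the class source: when tidy is not installed the markup is loaded unrepaired rather than failing.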

Members

Name Modifiers Type Description Overrides
DependencySerializationTrait::$_entityStorages protected property An array of entity type IDs keyed by the property name of their storages.
DependencySerializationTrait::$_serviceIds protected property An array of service IDs keyed by property name used for serialization.
DependencySerializationTrait::__sleep public function 1
DependencySerializationTrait::__wakeup public function 2
DependencyTrait::$dependencies protected property The object's dependencies.
DependencyTrait::addDependencies protected function Adds multiple dependencies.
DependencyTrait::addDependency protected function Adds a dependency.
MessengerTrait::$messenger protected property The messenger. 29
MessengerTrait::messenger public function Gets the messenger. 29
MessengerTrait::setMessenger public function Sets the messenger.
ParserBase::$encoder protected property The encoder used to convert encodings.
ParserBase::$feedsExMessenger protected property The messenger, for compatibility with Drupal 8.5.
ParserBase::$htmlTags protected static property The default list of HTML tags allowed by Xss::filter().
ParserBase::buildFeedForm public function
ParserBase::configSourceDescription protected function Returns the description for single source. 1
ParserBase::debug protected function Renders our debug messages into a list.
ParserBase::executeSources protected function Executes the source expressions.
ParserBase::getEncoder public function Returns the encoder.
ParserBase::getFormHeader protected function Returns the configuration form table header.
ParserBase::getMappingSources public function Declare the possible mapping sources that this parser produces. Overrides ParserInterface::getMappingSources
ParserBase::getMessenger public function Gets the messenger.
ParserBase::hasConfigurableContext protected function Returns whether or not this parser uses a context query. 2
ParserBase::hasSourceConfig public function
ParserBase::mappingFormAlter public function Alter mapping form. Overrides ParserBase::mappingFormAlter
ParserBase::mappingFormSubmit public function Submit handler for the mapping form. Overrides ParserBase::mappingFormSubmit
ParserBase::mappingFormValidate public function Validate handler for the mapping form. Overrides ParserBase::mappingFormValidate
ParserBase::parse public function Parses content returned by fetcher. Overrides ParserInterface::parse
ParserBase::parseItems protected function Performs the actual parsing. 2
ParserBase::prepareExpressions protected function Prepares the expressions for parsing.
ParserBase::prepareRaw protected function Prepares the raw string for parsing.
ParserBase::prepareVariables protected function Prepares the variable map used for substitution.
ParserBase::printErrors protected function Prints errors to the screen.
ParserBase::setEncoder public function Sets the encoder.
ParserBase::setFeedsExMessenger public function Sets the messenger.
ParserBase::sourceDefaults public function
ParserBase::sourceFormValidate public function
ParserBase::sourceSave public function
ParserBase::submitConfigurationForm public function Form submission handler. Overrides PluginFormInterface::submitConfigurationForm
ParserBase::validateConfigurationForm public function Form validation handler. Overrides PluginFormInterface::validateConfigurationForm
ParserBase::_buildConfigurationForm public function Builds configuration form for the parser settings.
PluginBase::$configuration protected property Configuration information passed into the plugin. 1
PluginBase::$feedType protected property The importer this plugin is working for.
PluginBase::$linkGenerator protected property The link generator.
PluginBase::$pluginDefinition protected property The plugin implementation definition. 1
PluginBase::$pluginId protected property The plugin_id.
PluginBase::$urlGenerator protected property The url generator.
PluginBase::calculateDependencies public function Calculates dependencies for the configured plugin. Overrides DependentPluginInterface::calculateDependencies 2
PluginBase::container private function Returns the service container.
PluginBase::defaultFeedConfiguration public function Returns default feed configuration. Overrides FeedsPluginInterface::defaultFeedConfiguration 3
PluginBase::DERIVATIVE_SEPARATOR constant A string which is used to separate base plugin IDs from the derivative ID.
PluginBase::getBaseId public function Gets the base_plugin_id of the plugin instance. Overrides DerivativeInspectionInterface::getBaseId
PluginBase::getConfiguration public function Gets this plugin's configuration. Overrides ConfigurableInterface::getConfiguration
PluginBase::getDerivativeId public function Gets the derivative_id of the plugin instance. Overrides DerivativeInspectionInterface::getDerivativeId
PluginBase::getPluginDefinition public function Gets the definition of the plugin implementation. Overrides PluginInspectionInterface::getPluginDefinition 3
PluginBase::getPluginId public function Gets the plugin_id of the plugin instance. Overrides PluginInspectionInterface::getPluginId
PluginBase::isConfigurable public function Determines if the plugin is configurable.
PluginBase::l protected function Renders a link to a route given a route name and its parameters.
PluginBase::linkGenerator protected function Returns the link generator service.
PluginBase::onFeedDeleteMultiple public function A feed is being deleted. 3
PluginBase::onFeedSave public function A feed is being saved.
PluginBase::onFeedTypeDelete public function The feed type is being deleted. 1
PluginBase::onFeedTypeSave public function The feed type is being saved. 1
PluginBase::pluginType public function Returns the type of plugin. Overrides FeedsPluginInterface::pluginType
PluginBase::setConfiguration public function Sets the configuration for this plugin instance. Overrides ConfigurableInterface::setConfiguration 1
PluginBase::url protected function Generates a URL or path for a specific route based on the given parameters.
PluginBase::urlGenerator protected function Returns the URL generator service.
QueryPathHtmlParser::$encoderClass protected property The class used as the text encoder. Overrides XmlParser::$encoderClass
QueryPathHtmlParser::getRawValue protected function Returns the raw value. Overrides QueryPathXmlParser::getRawValue
QueryPathHtmlParser::getTidyConfig protected function Returns the options for phptidy. Overrides XmlParser::getTidyConfig
QueryPathHtmlParser::prepareDocument protected function Prepares the DOM document. Overrides XmlParser::prepareDocument
QueryPathHtmlParser::setUp protected function Allows subclasses to prepare for parsing. Overrides XmlParser::setUp
QueryPathXmlParser::$queryPathOptions protected property Options passed to QueryPath.
QueryPathXmlParser::configFormTableColumn protected function Returns a form element for a specific column. Overrides XmlParser::configFormTableColumn
QueryPathXmlParser::configFormTableHeader protected function Returns the list of table headers. Overrides XmlParser::configFormTableHeader
QueryPathXmlParser::executeContext protected function Returns rows to be parsed. Overrides XmlParser::executeContext
QueryPathXmlParser::executeSourceExpression protected function Executes a single source expression. Overrides XmlParser::executeSourceExpression
QueryPathXmlParser::loadLibrary protected function Loads the necessary library. Overrides ParserBase::loadLibrary
QueryPathXmlParser::validateExpression protected function Validates an expression. Overrides XmlParser::validateExpression
StringTranslationTrait::$stringTranslation protected property The string translation service. 1
StringTranslationTrait::formatPlural protected function Formats a string containing a count of items.
StringTranslationTrait::getNumberOfPlurals protected function Returns the number of plurals supported by a given language.
StringTranslationTrait::getStringTranslation protected function Gets the string translation service.
StringTranslationTrait::setStringTranslation public function Sets the string translation service to use. 2
StringTranslationTrait::t protected function Translates a string to the current language or to a given language.
XmlParser::$entityLoader protected property The previous value for the entity loader.
XmlParser::$handleXmlErrors protected property The previous value for XML error handling.
XmlParser::$utility protected property The XML helper class.
XmlParser::$xpath protected property The XpathDomXpath object used for parsing.
XmlParser::buildConfigurationForm public function Form constructor. Overrides ParserBase::buildConfigurationForm
XmlParser::cleanUp protected function Allows subclasses to cleanup after parsing. Overrides ParserBase::cleanUp
XmlParser::configFormValidate public function Overrides ParserBase::configFormValidate
XmlParser::configSourceLabel protected function Returns the label for single source. Overrides ParserBase::configSourceLabel
XmlParser::create public static function Creates an instance of the plugin. Overrides ContainerFactoryPluginInterface::create
XmlParser::defaultConfiguration public function Gets default configuration for this plugin. Overrides ParserBase::defaultConfiguration
XmlParser::getErrors protected function Returns the errors after parsing. Overrides ParserBase::getErrors
XmlParser::getInnerXml protected function Returns the inner XML of a DOM node.
XmlParser::getRaw protected function Returns the raw XML of a DOM node. 1
XmlParser::hasConfigForm public function Overrides ParserBase::hasConfigForm
XmlParser::startErrorHandling protected function Starts internal error handling. Overrides ParserBase::startErrorHandling
XmlParser::stopErrorHandling protected function Stops internal error handling. Overrides ParserBase::stopErrorHandling
XmlParser::__construct public function Constructs an XmlParser object. Overrides ParserBase::__construct
XmlParserTrait::$_elementRegex protected static property Matches the characters of an XML element.
XmlParserTrait::$_entityLoader protected static property The previous value of the entity loader.
XmlParserTrait::$_errors protected static property The errors reported during parsing.
XmlParserTrait::$_useError protected static property The previous value of libxml error reporting.
XmlParserTrait::getDomDocument protected static function Returns a new DOMDocument.
XmlParserTrait::getXmlErrors protected static function Returns the errors reported during parsing.
XmlParserTrait::removeDefaultNamespaces protected static function Strips the default namespaces from an XML string.
XmlParserTrait::startXmlErrorHandling protected static function Starts custom error handling.
XmlParserTrait::stopXmlErrorHandling protected static function Stops custom error handling.
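
Among the members listed above, QueryPathHtmlParser::getRawValue returns the node's HTML (via $node->html() in the class source) rather than plain text. The difference between a node's markup and its text can be illustrated with PHP's DOM extension alone. This is only an analogy: it does not use QueryPath, and the sample markup and XPath expression are made up for the illustration.

```php
<?php

$document = new DOMDocument();
$document->loadHTML('<html><body><div class="item"><p>Hello <b>world</b></p></div></body></html>');

// Select the first <p> inside the hypothetical "item" container.
$xpath = new DOMXPath($document);
$node = $xpath->query('//div[@class="item"]/p')->item(0);

// Serialized markup of the node, tags included.
$markup = $document->saveHTML($node);
// Text content only, with the tags stripped.
$text = $node->textContent;

echo $markup, "\n";
echo $text, "\n";
```

Returning markup lets a mapped field keep inline formatting such as the bold element here, which a text-only value would lose.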