class FilterHtml in Markdown 8.2

Extends FilterHtml to allow more permissive global attributes.
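
The source listed on this page merges extra global attributes into the core ones with PHP's array union (+) operator, so that core-defined entries are never overridden. A minimal sketch of that behavior follows; the two arrays are illustrative stand-ins chosen for this example, not values read from the module.

```php
<?php
// Hypothetical stand-in for the global attributes defined by core; the real
// values come from the parent filter's getHTMLRestrictions().
$coreGlobal = ['style' => FALSE, 'on*' => FALSE, 'lang' => TRUE];

// Hypothetical user- or plugin-supplied global attributes.
$addedGlobal = ['style' => TRUE, 'data-*' => TRUE];

// The union operator keeps the left-hand value whenever a key exists on both
// sides, so 'style' stays FALSE while the new 'data-*' key is appended.
$merged = $coreGlobal + $addedGlobal;

var_export($merged);
```

Because the core array sits on the left-hand side, an attempt to re-enable 'style' has no effect, while attributes that core does not mention are added.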

Hierarchy

Expanded class hierarchy of FilterHtml

5 files declare their use of FilterHtml
BaseParser.php in src/Plugin/Markdown/BaseParser.php
HeadingPermalinkExtension.php in src/Plugin/Markdown/CommonMark/Extension/HeadingPermalinkExtension.php
MissingParser.php in src/Plugin/Markdown/MissingParser.php
Parsedown.php in src/Plugin/Markdown/Parsedown/Parsedown.php
ParserConfigurationForm.php in src/Form/ParserConfigurationForm.php

File

src/Util/FilterHtml.php, line 16

Namespace

Drupal\markdown\Util

Source

class FilterHtml extends CoreFilterHtml implements ParserAwareInterface {
  use ParserAwareTrait;

  /**
   * The placeholder value used to protect asterisks in values.
   *
   * @var string
   */
  const ASTERISK_PLACEHOLDER = '__zqh6vxfbk3cg__';

  /**
   * Creates a new instance.
   *
   * @param string $allowedHtml
   *   Optional. The allowed HTML.
   *
   * @return static
   */
  public static function create($allowedHtml = '') {
    return new static([
      'settings' => [
        'allowed_html' => $allowedHtml,
        'filter_html_help' => 1,
        'filter_html_nofollow' => 0,
      ],
    ], 'filter_html', [
      'provider' => 'markdown',
    ]);
  }

  /**
   * Create a new instance from a Markdown Parser instance.
   *
   * @param \Drupal\markdown\Plugin\Markdown\ParserInterface $parser
   *   A Markdown Parser instance.
   *
   * @return static
   */
  public static function fromParser(ParserInterface $parser) {
    return static::create($parser
      ->getCustomAllowedHtml())
      ->setParser($parser);
  }

  /**
   * Merges allowed HTML tags.
   *
   * @param array $normalizedTags
   *   An existing normalized allowed HTML tags array.
   * @param array ...$tags
   *   One or more arrays of allowed HTML tags to merge onto $normalizedTags.
   *
   * @return array
   *   The merged $normalizedTags.
   */
  public static function mergeAllowedTags(array $normalizedTags, array $tags) {
    $args = func_get_args();
    $normalizedTags = array_shift($args);
    foreach ($args as $tags) {
      if (!is_array($tags) || !$tags) {
        continue;
      }

      // Normalize the tags to merge.
      $tags = static::normalizeTags($tags);
      foreach ($tags as $tag => $attributes) {

        // Add tag if it doesn't already exist.
        if (!isset($normalizedTags[$tag])) {
          $normalizedTags[$tag] = $attributes;
          continue;
        }

        // Existing tag already allows all attributes, skip merge.
        if (!empty($normalizedTags[$tag]['*'])) {
          continue;
        }

        // New tag allows all attributes, replace existing tag.
        if (!empty($attributes['*'])) {
          $normalizedTags[$tag] = [
            '*' => TRUE,
          ];
          continue;
        }

        // Now merge in individual attributes from tag.
        foreach ($attributes as $name => $value) {

          // Add attribute if it doesn't already exist.
          if (!isset($normalizedTags[$tag][$name])) {
            $normalizedTags[$tag][$name] = $value;
            continue;
          }

          // Existing tag attribute already allows all values, skip merge.
          if ($normalizedTags[$tag][$name] === TRUE) {
            continue;
          }

          // New tag attribute allows all values, replace existing attribute.
          if ($value === TRUE) {
            $normalizedTags[$tag][$name] = $value;
            continue;
          }

          // Finally, if specific attribute values are specified, merge them.
          if (is_array($value)) {
            if (!is_array($normalizedTags[$tag][$name])) {
              $normalizedTags[$tag][$name] = [];
            }
            $normalizedTags[$tag][$name] = array_replace($normalizedTags[$tag][$name], $value);
          }
        }
      }
    }
    ksort($normalizedTags);
    return $normalizedTags;
  }

  /**
   * Normalizes allowed HTML tags.
   *
   * @param array $tags
   *   The tags to normalize.
   *
   * @return array
   *   The normalized allowed HTML tags.
   */
  public static function normalizeTags(array $tags) {
    $tags = array_map(function ($attributes) {
      if (is_array($attributes)) {
        foreach ($attributes as $name => $value) {
          if (!is_bool($value)) {
            $attributes[$name] = is_array($value) ? $value : [
              $value => TRUE,
            ];
          }
        }
        return $attributes;
      }
      return $attributes === FALSE ? [] : [
        '*' => TRUE,
      ];
    }, $tags);
    ksort($tags);
    return $tags;
  }

  /**
   * Extracts HTML tags (and attributes) from a DOMNode.
   *
   * @param \DOMNode $node
   *   The node to extract from.
   * @param bool $attributeNames
   *   Flag indicating whether to extract attribute names.
   * @param bool $attributeValues
   *   Flag indicating whether to extract attribute values.
   *
   * @return array
   *   The HTML tags extracted from the DOM node.
   */
  protected static function extractDomNodeTags(\DOMNode $node, $attributeNames = TRUE, $attributeValues = FALSE) {
    $tags = [];
    if (!isset($tags[$node->nodeName])) {
      $tags[$node->nodeName] = [];
    }
    if ($attributeNames && $node->attributes) {
      for ($i = 0, $l = $node->attributes->length; $i < $l; ++$i) {
        $attribute = $node->attributes
          ->item($i);
        $name = $attribute->name;
        $tags[$node->nodeName][$name] = $attributeValues ? $attribute->nodeValue : TRUE;
      }
      if ($node
        ->hasChildNodes()) {
        foreach ($node->childNodes as $childNode) {
          $tags = NestedArray::mergeDeep($tags, static::extractDomNodeTags($childNode));
        }
      }
    }
    return $tags;
  }

  /**
   * Extracts an array of tags (and attributes) from an HTML string.
   *
   * @param string $html
   *   The HTML string to extract tags and attributes from.
   * @param bool $attributeNames
   *   Flag indicating whether to extract attribute names.
   * @param bool $attributeValues
   *   Flag indicating whether to extract attribute values.
   *
   * @return array
   *   The HTML tags extracted from the HTML string.
   */
  public static function tagsFromHtml($html = NULL, $attributeNames = TRUE, $attributeValues = FALSE) {
    $tags = [];
    if (!$html || strpos($html, '<') === FALSE) {
      return $tags;
    }
    libxml_use_internal_errors(true);
    $dom = new \DOMDocument();
    $dom
      ->loadHTML($html);
    libxml_clear_errors();
    foreach ($dom
      ->getElementsByTagName('body')
      ->item(0)->childNodes as $childNode) {
      $tags = NestedArray::mergeDeep($tags, static::extractDomNodeTags($childNode, $attributeNames, $attributeValues));
    }
    return $tags;
  }

  /**
   * Converts an array of tags (and their potential attributes) to a string.
   *
   * @param array $tags
   *   An associative array of tags, where the key is the tag and the value can
   *   be a boolean (TRUE if allowed, FALSE otherwise) or an associative array
   *   containing key/value pairs of acceptable boolean based attribute values
   *   (i.e. 'dir' => ['ltr' => TRUE, 'rtl' => TRUE]).
   *
   * @return string
   *   The tags, in string format.
   */
  public static function tagsToString(array $tags = []) {
    $items = [];
    ksort($tags);
    foreach (static::normalizeTags($tags) as $tag => $attributes) {
      $tag = "<{$tag}";
      if (is_array($attributes)) {
        foreach ($attributes as $attribute => $value) {
          if (!$value) {
            continue;
          }
          $tag .= " {$attribute}";
          if ($value && $value !== TRUE) {
            if (is_array($value)) {
              $value = implode(' ', array_keys(array_filter($value)));
            }
            $tag .= "='{$value}'";
          }
        }
      }
      $tag .= '>';
      $items[] = $tag;
    }
    return implode(' ', $items);
  }

  /**
   * Retrieves the allowed HTML.
   *
   * @param bool $includeGlobal
   *   Flag indicating whether to include global elements (i.e. *).
   *
   * @return string
   *   The allowed HTML.
   */
  public function getAllowedHtml($includeGlobal = TRUE) {
    $restrictions = $this
      ->getHtmlRestrictions();
    if (!$includeGlobal) {
      unset($restrictions['allowed']['*']);
    }
    return static::tagsToString($restrictions['allowed']);
  }

  /**
   * Retrieves an array of allowed HTML tags.
   *
   * @return string[]
   *   An indexed array of allowed HTML tags.
   */
  public function getAllowedTags() {
    $restrictions = $this
      ->getHtmlRestrictions();

    // The '*' key holds global attributes rather than a real tag, so remove
    // it before returning the list of tag names.
    unset($restrictions['allowed']['*']);
    return array_keys($restrictions['allowed']);
  }

  /**
   * {@inheritdoc}
   */
  public function getHTMLRestrictions() {

    // phpcs:ignore
    if ($this->restrictions) {
      return $this->restrictions;
    }
    $activeTheme = \Drupal::theme()
      ->getActiveTheme();
    $parser = $this
      ->getParser();
    $allowedHtmlPlugins = $parser ? AllowedHtmlManager::create()
      ->appliesTo($parser, $activeTheme) : [];
    $cacheTags = $parser ? $parser
      ->getCacheTags() : [];
    $cid = 'markdown_allowed_html:' . Crypt::hashBase64(serialize(array_merge($cacheTags, $allowedHtmlPlugins)));

    // Return cached HTML restrictions.
    $discoveryCache = \Drupal::cache('discovery');
    if (($cached = $discoveryCache
      ->get($cid)) && !empty($cached->data)) {
      $this->restrictions = $cached->data;
      return $this->restrictions;
    }
    $restrictions = parent::getHTMLRestrictions();

    // Save the original global attributes.
    $originalGlobalAttributes = $restrictions['allowed']['*'];
    unset($restrictions['allowed']['*']);

    // Determine if any user global attributes were provided (from a filter).
    $addedGlobalAttributes = [];
    if (isset($restrictions['allowed'][static::ASTERISK_PLACEHOLDER])) {
      $addedGlobalAttributes['*'] = $restrictions['allowed'][static::ASTERISK_PLACEHOLDER];
      $addedGlobalAttributes = static::normalizeTags($addedGlobalAttributes);
      unset($restrictions['allowed'][static::ASTERISK_PLACEHOLDER]);
    }

    // Normalize the allowed tags.
    $normalizedTags = static::normalizeTags($restrictions['allowed']);

    // Merge in the plugins' allowed HTML tags.
    foreach ($allowedHtmlPlugins as $plugin_id => $allowedHtml) {

      // Retrieve the plugin's allowed HTML tags.
      $tags = $allowedHtml
        ->allowedHtmlTags($parser, $activeTheme);

      // Merge the plugin's global attributes with the user provided ones.
      if (isset($tags['*'])) {
        $addedGlobalAttributes = static::mergeAllowedTags($addedGlobalAttributes, [
          '*' => $tags['*'],
        ]);
        unset($tags['*']);
      }

      // Now merge the plugin's tags with the allowed HTML.
      $normalizedTags = static::mergeAllowedTags($normalizedTags, $tags);
    }

    // Replace the allowed tags with the normalized/merged tags.
    $restrictions['allowed'] = $normalizedTags;

    // Restore the original global attributes.
    $restrictions['allowed']['*'] = $originalGlobalAttributes;

    // Now merge the added global attributes using the array union (+) operator.
    // This ensures that the original core defined global attributes are never
    // overridden so users cannot specify attributes like 'style' and 'on*'
    // which are highly vulnerable to XSS.
    if (!empty($addedGlobalAttributes['*'])) {
      $restrictions['allowed']['*'] += $addedGlobalAttributes['*'];
    }
    $discoveryCache
      ->set($cid, $restrictions, CacheBackendInterface::CACHE_PERMANENT, $cacheTags);
    $this->restrictions = $restrictions;
    return $restrictions;
  }

}
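
To illustrate the array shapes that normalizeTags() and tagsToString() work with, here is a minimal standalone sketch. It copies the logic of those two methods into plain functions so it can run without Drupal; the names sketch_normalize_tags() and sketch_tags_to_string() are hypothetical and exist only for this example.

```php
<?php
// Standalone copy of the normalizeTags() logic (hypothetical function name).
function sketch_normalize_tags(array $tags) {
  $tags = array_map(function ($attributes) {
    if (is_array($attributes)) {
      foreach ($attributes as $name => $value) {
        // Scalar attribute values become ['value' => TRUE].
        if (!is_bool($value)) {
          $attributes[$name] = is_array($value) ? $value : [$value => TRUE];
        }
      }
      return $attributes;
    }
    // FALSE means "no attributes"; anything else means "all attributes".
    return $attributes === FALSE ? [] : ['*' => TRUE];
  }, $tags);
  ksort($tags);
  return $tags;
}

// Standalone copy of the tagsToString() logic (hypothetical function name).
function sketch_tags_to_string(array $tags) {
  $items = [];
  foreach (sketch_normalize_tags($tags) as $tag => $attributes) {
    $tag = "<{$tag}";
    foreach ($attributes as $attribute => $value) {
      if (!$value) {
        continue;
      }
      $tag .= " {$attribute}";
      if ($value !== TRUE) {
        if (is_array($value)) {
          $value = implode(' ', array_keys(array_filter($value)));
        }
        $tag .= "='{$value}'";
      }
    }
    $items[] = $tag . '>';
  }
  return implode(' ', $items);
}

$input = [
  'p' => TRUE,
  'a' => ['href' => TRUE, 'dir' => 'ltr'],
  'br' => FALSE,
];

$normalized = sketch_normalize_tags($input);
// $normalized is:
// ['a' => ['href' => TRUE, 'dir' => ['ltr' => TRUE]], 'br' => [], 'p' => ['*' => TRUE]]

$string = sketch_tags_to_string($input);
echo $string, PHP_EOL;
// Prints: <a href dir='ltr'> <br> <p *>
```

Note that a tag given as TRUE normalizes to ['*' => TRUE] and is therefore rendered with a literal * attribute, while a tag given as FALSE is rendered with no attributes.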

Members

Name Modifiers Type Description Overrides
DependencySerializationTrait::$_entityStorages protected property An array of entity type IDs keyed by the property name of their storages.
DependencySerializationTrait::$_serviceIds protected property An array of service IDs keyed by property name used for serialization.
DependencySerializationTrait::__sleep public function 1
DependencySerializationTrait::__wakeup public function 2
FilterBase::$provider public property The name of the provider that owns this filter.
FilterBase::$settings public property An associative array containing the configured settings of this filter.
FilterBase::$status public property A Boolean indicating whether this filter is enabled.
FilterBase::$weight public property The weight of this filter compared to others in a filter collection.
FilterBase::calculateDependencies public function Calculates dependencies for the configured plugin. Overrides DependentPluginInterface::calculateDependencies 1
FilterBase::defaultConfiguration public function Gets default configuration for this plugin. Overrides ConfigurableInterface::defaultConfiguration
FilterBase::getConfiguration public function Gets this plugin's configuration. Overrides ConfigurableInterface::getConfiguration
FilterBase::getDescription public function Returns the administrative description for this filter plugin. Overrides FilterInterface::getDescription
FilterBase::getLabel public function Returns the administrative label for this filter plugin. Overrides FilterInterface::getLabel
FilterBase::getType public function Returns the processing type of this filter plugin. Overrides FilterInterface::getType
FilterBase::prepare public function Prepares the text for processing. Overrides FilterInterface::prepare
FilterBase::__construct public function Constructs a \Drupal\Component\Plugin\PluginBase object. Overrides PluginBase::__construct 4
FilterHtml::$restrictions protected property The processed HTML restrictions.
FilterHtml::ASTERISK_PLACEHOLDER constant The placeholder value used to protect asterisks in values.
FilterHtml::create public static function Creates a new instance.
FilterHtml::extractDomNodeTags protected static function Extracts HTML tags (and attributes) from a DOMNode.
FilterHtml::filterAttributes public function Provides filtering of tag attributes into accepted HTML.
FilterHtml::filterElementAttributes protected function Filters attributes on an element according to a list of allowed values.
FilterHtml::findAllowedValue protected function Helper function to handle prefix matching.
FilterHtml::fromParser public static function Create a new instance from a Markdown Parser instance.
FilterHtml::getAllowedHtml public function Retrieves the allowed HTML.
FilterHtml::getAllowedTags public function Retrieves an array of allowed HTML tags.
FilterHtml::getHTMLRestrictions public function Returns HTML allowed by this filter's configuration. Overrides FilterHtml::getHTMLRestrictions
FilterHtml::mergeAllowedTags public static function Merges allowed HTML tags.
FilterHtml::normalizeTags public static function Normalizes allowed HTML tags.
FilterHtml::prepareAttributeValues protected function Helper function to prepare attribute values including wildcards.
FilterHtml::process public function Performs the filter processing. Overrides FilterInterface::process
FilterHtml::setConfiguration public function Sets the configuration for this plugin instance. Overrides FilterBase::setConfiguration
FilterHtml::settingsForm public function Generates a filter's settings form. Overrides FilterBase::settingsForm
FilterHtml::tagsFromHtml public static function Extracts an array of tags (and attributes) from an HTML string.
FilterHtml::tagsToString public static function Converts an array of tags (and their potential attributes) to a string.
FilterHtml::tips public function Generates a filter's tip. Overrides FilterBase::tips
FilterInterface::TYPE_HTML_RESTRICTOR constant HTML tag and attribute restricting filters to prevent XSS attacks.
FilterInterface::TYPE_MARKUP_LANGUAGE constant Non-HTML markup language filters that generate HTML.
FilterInterface::TYPE_TRANSFORM_IRREVERSIBLE constant Irreversible transformation filters.
FilterInterface::TYPE_TRANSFORM_REVERSIBLE constant Reversible transformation filters.
MessengerTrait::$messenger protected property The messenger. 29
MessengerTrait::messenger public function Gets the messenger. 29
MessengerTrait::setMessenger public function Sets the messenger.
ParserAwareTrait::$parser protected property A Markdown Parser instance.
ParserAwareTrait::getParser public function 1
ParserAwareTrait::setParser public function
PluginBase::$configuration protected property Configuration information passed into the plugin. 1
PluginBase::$pluginDefinition protected property The plugin implementation definition. 1
PluginBase::$pluginId protected property The plugin_id.
PluginBase::DERIVATIVE_SEPARATOR constant A string which is used to separate base plugin IDs from the derivative ID.
PluginBase::getBaseId public function Gets the base_plugin_id of the plugin instance. Overrides DerivativeInspectionInterface::getBaseId
PluginBase::getDerivativeId public function Gets the derivative_id of the plugin instance. Overrides DerivativeInspectionInterface::getDerivativeId
PluginBase::getPluginDefinition public function Gets the definition of the plugin implementation. Overrides PluginInspectionInterface::getPluginDefinition 3
PluginBase::getPluginId public function Gets the plugin_id of the plugin instance. Overrides PluginInspectionInterface::getPluginId
PluginBase::isConfigurable public function Determines if the plugin is configurable.
StringTranslationTrait::$stringTranslation protected property The string translation service. 1
StringTranslationTrait::formatPlural protected function Formats a string containing a count of items.
StringTranslationTrait::getNumberOfPlurals protected function Returns the number of plurals supported by a given language.
StringTranslationTrait::getStringTranslation protected function Gets the string translation service.
StringTranslationTrait::setStringTranslation public function Sets the string translation service to use. 2
StringTranslationTrait::t protected function Translates a string to the current language or to a given language.