function _pathologic_filter in Pathologic 7.2

Same name and namespace in other branches
  1. 8 pathologic.module \_pathologic_filter()
  2. 7.3 pathologic.module \_pathologic_filter()

Pathologic filter callback.

Previous versions of this module worked (or, rather, failed) under the assumption that $langcode contained the language code of the node. Sadly, this isn't the case. However, it turns out that the language of the current node isn't as important as the language of the node we're linking to, and even then only if language path prefixing (e.g. /ja/node/123) is in use. REMEMBER THIS IN THE FUTURE, ALBRIGHT.

The below code uses the @ operator before parse_url() calls because in PHP 5.3.2 and earlier, parse_url() causes a warning if parsing fails. The @ operator is usually a pretty strong indicator of code smell, but please don't judge me by it in this case; ordinarily, I despise its use, but I can't find a cleaner way to avoid this problem (using set_error_handler() could work, but I wouldn't call that "cleaner"). Fortunately, Drupal 8 will require at least PHP 5.3.5, so this mess doesn't have to spread into the D8 branch of Pathologic.
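
As a rough illustration of the pattern described above, the sketch below wraps a suppressed parse_url() call and normalizes the failure case. The helper name example_safe_parse_url() and the example.com URLs are assumptions made for illustration; they are not part of Pathologic.

```php
<?php
// Hypothetical helper, not part of Pathologic: wraps the suppressed call and
// turns a failed parse into an empty array.
function example_safe_parse_url($url) {
  // parse_url() returns FALSE for seriously malformed URLs. On older PHP
  // versions it could also emit a warning, which the @ operator suppresses.
  $parts = @parse_url($url);
  return is_array($parts) ? $parts : array();
}

// A well-formed URL yields the usual component array.
$good = example_safe_parse_url('http://example.com/drupal/');

// A scheme followed by three slashes is treated as malformed.
$bad = example_safe_parse_url('http:///example.com');
```

Checking the return value is still necessary with @ in place, since suppression only hides the warning and does not change the FALSE result.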

@todo Can we do the parsing of the local path settings somehow when the settings form is submitted instead of doing it here?
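
The local path parsing this @todo refers to can be tried in isolation. The following is only a sketch: the $setting value is an assumed example of what the "local_paths" setting might contain, and the array_values() call is added here for readability and is not in the module.

```php
<?php
// Assumed example value of the "local_paths" filter setting, including a
// Windows line ending, a blank line and stray whitespace.
$setting = "http://example.com/\r\n\n  /drupal/  \nhttp://dev.example.com/\n";

// Same chain as in the function: split on newlines, trim each line, then drop
// entries that evaluate to FALSE (such as empty strings).
$local_paths = array_filter(array_map('trim', explode("\n", $setting)));

// array_filter() preserves the original keys, so re-index for readability.
$local_paths = array_values($local_paths);

// Explode each remaining path into its component parts.
$exploded = array();
foreach ($local_paths as $local) {
  $exploded[] = @parse_url($local);
}
```

Note that array_filter() without a callback also drops the string "0", which is harmless for URLs and paths but worth knowing if the pattern is reused elsewhere.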

See also

http://drupal.org/node/1812264

https://drupal.org/node/2104849

1 call to _pathologic_filter()
PathologicTestCase::testPathologic in ./pathologic.test
2 string references to '_pathologic_filter'
pathologic_filter_info in ./pathologic.module
Implements hook_filter_info().
_pathologic_replace in ./pathologic.module
Process and replace paths. preg_replace_callback() callback.

File

./pathologic.module, line 91
Pathologic text filter for Drupal.

Code

function _pathologic_filter($text, $filter, $format, $langcode, $cache, $cache_id) {

  // Get the base URL and explode it into component parts. We add these parts
  // to the exploded local paths settings later.
  global $base_url;
  $base_url_parts = @parse_url($base_url . '/');

  // Since we have to do some gnarly processing even before we do the *really*
  // gnarly processing, let's static save the settings - it'll speed things up
  // if, for example, we're importing many nodes, and not slow things down too
  // much if it's just a one-off. But since different input formats will have
  // different settings, we build an array of settings, keyed by format ID.
  $cached_settings =& drupal_static(__FUNCTION__, array());
  if (!isset($cached_settings[$filter->format])) {
    $filter_settings = $filter->settings;
    $filter_settings['local_paths_exploded'] = array();
    if ($filter_settings['local_paths'] !== '') {

      // Build an array of the exploded local paths for this format's settings.
      // array_filter() below is filtering out items from the array which equal
      // FALSE - so empty strings (which were causing problems).
      // @see http://drupal.org/node/1727492
      $local_paths = array_filter(array_map('trim', explode("\n", $filter_settings['local_paths'])));
      foreach ($local_paths as $local) {
        $parts = @parse_url($local);

        // Okay, what the hellish "if" statement is doing below is checking to
        // make sure we aren't about to add a path to our array of exploded
        // local paths which matches the current "local" path. We consider it
        // not a match, if…
        // @todo: This is pretty horrible. Can this be simplified?
        if (isset($parts['host']) && ($parts['host'] !== $base_url_parts['host'] || ((isset($parts['path']) xor isset($base_url_parts['path'])) || isset($parts['path']) && isset($base_url_parts['path']) && $parts['path'] !== $base_url_parts['path'])) || !isset($parts['host']) && (!isset($parts['path']) || !isset($base_url_parts['path']) || $parts['path'] !== $base_url_parts['path'])) {

          // Add it to the list.
          $filter_settings['local_paths_exploded'][] = $parts;
        }
      }
    }

    // Now add local paths based on "this" server URL.
    $filter_settings['local_paths_exploded'][] = array(
      'path' => $base_url_parts['path'],
    );
    $filter_settings['local_paths_exploded'][] = array(
      'path' => $base_url_parts['path'],
      'host' => $base_url_parts['host'],
    );

    // We'll also just store the host part separately for easy access.
    $filter_settings['base_url_host'] = $base_url_parts['host'];
    $cached_settings[$filter->format] = $filter_settings;
  }

  // Get the language code for the text we're about to process.
  $cached_settings['langcode'] = $langcode;

  // And also take note of which settings in the settings array should apply.
  $cached_settings['current_settings'] =& $cached_settings[$filter->format];

  // Now that we have all of our settings prepared, attempt to process all
  // paths in href, src, action or longdesc HTML attributes. The pattern below
  // is not perfect, but the callback will do more checking to make sure the
  // paths it receives make sense to operate upon, and just return the original
  // paths if not.
  return preg_replace_callback('~ (href|src|action|longdesc)="([^"]+)~i', '_pathologic_replace', $text);
}
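
To see what the pattern in the final preg_replace_callback() call hands to its callback, here is a minimal sketch. The callback example_replace() and the base URL http://example.com/ are hypothetical stand-ins; the real _pathologic_replace() does considerably more checking.

```php
<?php
// Hypothetical stand-in for _pathologic_replace(), for illustration only.
function example_replace($matches) {
  // $matches[0] is the full match (leading space, attribute name, opening
  // quote and path), $matches[1] the attribute name, $matches[2] the path.
  if (preg_match('~^([a-z][a-z0-9+.-]*:|//|#)~i', $matches[2])) {
    // Has a scheme, is protocol-relative or is a fragment: leave unchanged.
    return $matches[0];
  }
  // The pattern does not consume the closing quote, so the replacement must
  // not add one. The base URL here is an assumption.
  return ' ' . $matches[1] . '="http://example.com/' . ltrim($matches[2], '/');
}

$html = '<a href="node/123">One</a> <img src="http://other.example/a.png">';
$out = preg_replace_callback('~ (href|src|action|longdesc)="([^"]+)~i', 'example_replace', $html);
```

Because the match stops before the closing quote, the callback returns everything up to (but not including) it, and the original quote stays in place after the replacement.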