function linkit_scan_url in Linkit 7.2

Retrieve relevant information about a URL. This function is intended for internal (absolute) URLs, but it also works for external URLs.

Parameters

$url: The url that should be scanned.

Return value

$path_info: FALSE if $url is not an absolute URL; otherwise an associative array containing:

  • url: The same as the argument $url, untouched.
  • target: Either "internal" or "external".
  • requested_path: If internal, the requested path relative to the Drupal root. The one exception is a URL that points directly at the front page, in which case this is whatever the front page is set to.
  • system_path: If internal and the path is valid, the Drupal system path, e.g. "node/23".
  • query_fragment: If internal, the query and fragment of the URL. It is typically not needed for searching and is simply reappended once the path has been processed. Example: "?foo=bar#anchor".
  • safe_url: If external and the protocol is http or https, the original URL stripped of everything that could potentially be dangerous. E.g. "http://user:pass@example.org/settings?evilaction=true" becomes "http://example.org/settings".
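
The safe_url rule above can be sketched in plain PHP, outside Drupal. This is only an illustration: example_safe_url() is a hypothetical helper, not part of Linkit, and the isset() guard around the path is a defensive assumption that the listing further down does not make.

```php
<?php
// Hypothetical helper (not part of Linkit): mirrors the external-URL branch
// of linkit_scan_url() using only PHP's built-in parse_url().
function example_safe_url($url) {
  $parts = parse_url($url);
  if ($parts === FALSE || !isset($parts['scheme'], $parts['host'])) {
    // Malformed or not an absolute URL.
    return NULL;
  }
  // Only http and https URLs get a safe variant.
  if (!preg_match('~^https?$~', $parts['scheme'])) {
    return NULL;
  }
  // Keep scheme, host and path; user, pass, port, query and fragment are dropped.
  $path = isset($parts['path']) ? $parts['path'] : '';
  return $parts['scheme'] . '://' . $parts['host'] . $path;
}

// Prints "http://example.org/settings".
echo example_safe_url('http://user:pass@example.org/settings?evilaction=true'), "\n";
```

Rebuilding the URL from a whitelist of components, rather than removing known-bad parts, means anything parse_url() reports that is not explicitly copied over is discarded.
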

1 call to linkit_scan_url()

_linkit_result_from_url in ./linkit.module
Retrieve the result object from an absolute URL. This function calls the enabled plugins' "path info callback" to look for a result object. Both internal and external paths work.

File

./linkit.module, line 349
Main file for linkit module.

Code

function linkit_scan_url($url) {
  global $base_url;

  // We will not use the Drupal wrapper function 'drupal_parse_url' as that
  // function should only be used for URLs that have been generated by the
  // system, and we can't be sure that this is the case here.
  $parts = parse_url(trim($url, '/'));
  if (!isset($parts['scheme']) || !isset($parts['host'])) {

    // Not an absolute URL.
    return FALSE;
  }

  // Make a new array, this will hold the components from parse_url() and our
  // own "Linkit" components.
  $path_info = array();

  // Append the original components from parse_url() to our array.
  $path_info += $parts;

  // Save the whole URL.
  $path_info['url'] = $url;
  if (!isset($path_info['query'])) {
    $path_info['query'] = '';
  }

  // Convert the query string to an array as Drupal can only handle queries as
  // arrays.
  // @see http://api.drupal.org/drupal_http_build_query
  parse_str($path_info['query'], $path_info['query']);

  // The 'q' parameter contains the path of the current page if clean URLs are
  // disabled. It overrides the 'path' of the URL when present, even if clean
  // URLs are enabled, due to how Apache rewriting rules work.
  if (isset($path_info['query']['q'])) {
    $path_info['path'] = $path_info['query']['q'];
    unset($path_info['query']['q']);
  }

  // Internal URL.
  // @TODO: Handle https and other schemes here?
  if (trim($path_info['scheme'] . '://' . $path_info['host'] . base_path(), '/') == $base_url) {
    $path_info['target'] = 'internal';

    // Remove the subdirectory name from the path if the site is installed in
    // a subdirectory. It will be added again by the url() function.
    if (base_path() != "/") {
      $path_info['path'] = trim(preg_replace(base_path(), '', $path_info['path'], 1), '/');
    }

    // Trim the path from slashes.
    $path_info['path'] = trim($path_info['path'], '/');

    // If we have an empty path, and an internal target, we can assume that the
    // URL should go to the frontpage.
    if (empty($path_info['path'])) {
      $path_info['frontpage'] = TRUE;
      $path_info['path'] = variable_get('site_frontpage', 'node');
    }

    // Check if the path already is an alias.
    if (!($processed_path = drupal_lookup_path('source', $path_info['path']))) {

      // Not an alias, so keep the original value.
      $processed_path = $path_info['path'];
    }

    // Add the "real" system path (not the alias) if the current user has
    // access to the URL.
    $path_info['system_path'] = drupal_valid_path($processed_path) ? $processed_path : FALSE;
  }
  else {
    $path_info['target'] = 'external';
    if (preg_match('~^https?$~', $parts['scheme'])) {
      $path_info['safe_url'] = $parts['scheme'] . '://' . $parts['host'] . $parts['path'];
    }
  }
  return $path_info;
}
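
The opening steps of the listing (the absolute-URL check, the query string conversion and the 'q' override) depend only on PHP built-ins, so they can be tried without a Drupal bootstrap. The sketch below is a simplified illustration under that assumption: example_split_url() is a hypothetical name, and the internal/external detection, alias lookup and access check are deliberately left out.

```php
<?php
// Hypothetical helper (not part of Linkit): reproduces only the first steps of
// linkit_scan_url(), without any Drupal dependencies.
function example_split_url($url) {
  $parts = parse_url(trim($url, '/'));
  if ($parts === FALSE || !isset($parts['scheme']) || !isset($parts['host'])) {
    // Not an absolute URL.
    return FALSE;
  }

  // Start from the parse_url() components and keep the original URL.
  $info = $parts;
  $info['url'] = $url;

  // Turn the query string into an array; an absent query becomes array().
  $query = array();
  parse_str(isset($parts['query']) ? $parts['query'] : '', $query);
  $info['query'] = $query;

  // Assumption for this sketch: default to an empty path when none is given.
  if (!isset($info['path'])) {
    $info['path'] = '';
  }

  // A 'q' parameter overrides the path, as it does when clean URLs are off.
  if (isset($info['query']['q'])) {
    $info['path'] = $info['query']['q'];
    unset($info['query']['q']);
  }
  return $info;
}

$info = example_split_url('http://example.com/index.php?q=node/23&foo=bar');
echo $info['path'], "\n";                // node/23
print_r($info['query']);                 // only the "foo" key remains
var_dump(example_split_url('node/23'));  // bool(false): not an absolute URL
```
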