protected function FeedsCrawler::parseHref in Feeds Crawler 7

Same name and namespace in other branches
  1. 6.2 FeedsCrawler.inc \FeedsCrawler::parseHref()

Builds a fully qualified URL from the source URL if necessary.

2 calls to FeedsCrawler::parseHref()
FeedsCrawler::parseAuto in ./FeedsCrawler.inc
Paginates using Atom's rel=next link automatically.
FeedsCrawler::parseXPath in ./FeedsCrawler.inc
Finds the "next" link on a page via XPath.
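
As a rough illustration of where the $href argument might come from, the sketch below pulls an Atom rel=next link out of a feed with SimpleXML. It is not the module's parseAuto() or parseXPath() code, which this page does not show. The sample feed, the "atom" prefix, and the XPath expression are assumptions made for the example.

```php
<?php
// Sample Atom document (made up for this example).
$xml = <<<XML
<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example</title>
  <link rel="self" href="http://example.com/blog/feed.xml"/>
  <link rel="next" href="feed.xml?page=2"/>
</feed>
XML;

$feed = simplexml_load_string($xml);

// Atom elements live in a namespace, so bind a prefix before querying.
// The prefix name "atom" is an arbitrary choice for this sketch.
$feed->registerXPathNamespace('atom', 'http://www.w3.org/2005/Atom');

// SimpleXMLElement::xpath() returns an array of matches. An array like
// this is the kind of value parseHref() iterates over.
$matches = $feed->xpath('//atom:link[@rel="next"]/@href');

// Casting a match to string yields the attribute value.
$next = (string) $matches[0];
```

Here the extracted value is relative ("feed.xml?page=2"), which is the case parseHref() exists to handle.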

File

./FeedsCrawler.inc, line 112
Home of the FeedsCrawler.

Class

FeedsCrawler
Fetches data via HTTP.

Code

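protected function parseHref($href, $source_url) {
  // Nothing to resolve.
  if ($href === FALSE || empty($href)) {
    return FALSE;
  }
  // $href is iterated as a list of candidates; use the first one that is
  // non-empty after trimming.
  foreach ($href as $h) {
    $h = trim((string) $h);
    if (!empty($h)) {
      $href = $h;
      break;
    }
  }
  // Only values without an http:// or https:// prefix need to be resolved.
  if (strpos($href, 'http://') !== 0 && strpos($href, 'https://') !== 0) {
    if (substr($href, 0, 1) == '/') {
      // Leading slash: append the path to baseUrl($source_url).
      $href = ltrim($href, '/');
      $href = $this->baseUrl($source_url) . '/' . $href;
    }
    else {
      // No leading slash: baseUrl() is called with TRUE as its second
      // argument before the value is appended.
      $href = $this->baseUrl($source_url, TRUE) . '/' . $href;
    }
  }
  return $href;
}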
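
The three cases handled above (absolute, leading slash, relative) can be tried outside the class with the standalone sketch below. FeedsCrawler::baseUrl() is not shown on this page, so example_base_url() is a hypothetical stand-in. It assumes that baseUrl() returns scheme, host, and optional port, and that passing TRUE also appends the directory part of the source path. example_parse_href() is likewise a hypothetical copy of the logic as a plain function.

```php
<?php
/**
 * Hypothetical stand-in for FeedsCrawler::baseUrl() (assumed behavior).
 * Expects an absolute URL with a scheme and a host.
 */
function example_base_url($url, $dir = FALSE) {
  $p = parse_url($url);
  $base = $p['scheme'] . '://' . $p['host'];
  if (isset($p['port'])) {
    $base .= ':' . $p['port'];
  }
  if ($dir && isset($p['path'])) {
    // Keep the path up to, but not including, its last slash.
    $base .= substr($p['path'], 0, strrpos($p['path'], '/'));
  }
  return $base;
}

/**
 * Standalone copy of the parseHref() logic, using the stand-in above.
 */
function example_parse_href($href, $source_url) {
  if ($href === FALSE || empty($href)) {
    return FALSE;
  }
  // Use the first candidate that is non-empty after trimming.
  foreach ($href as $h) {
    $h = trim((string) $h);
    if (!empty($h)) {
      $href = $h;
      break;
    }
  }
  if (strpos($href, 'http://') !== 0 && strpos($href, 'https://') !== 0) {
    if (substr($href, 0, 1) == '/') {
      // Leading slash: resolve against scheme and host only.
      $href = example_base_url($source_url) . '/' . ltrim($href, '/');
    }
    else {
      // No leading slash: resolve against the source URL's directory.
      $href = example_base_url($source_url, TRUE) . '/' . $href;
    }
  }
  return $href;
}

$source = 'http://example.com/blog/feed.xml';

// Already absolute: returned unchanged.
$absolute = example_parse_href(array('https://example.org/next'), $source);
// Leading slash: resolved against the host.
$rooted = example_parse_href(array('/feed?page=2'), $source);
// No leading slash: resolved against the directory of the source URL.
$relative = example_parse_href(array('page2.xml'), $source);
```

Under these assumptions, $rooted is http://example.com/feed?page=2 and $relative is http://example.com/blog/page2.xml. Note that the logic only prepends a base; it does not collapse "../" segments or handle protocol-relative values such as //example.com/feed.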