public function SitemapParser::parse in Feeds 8.3

Parses content returned by fetcher.

@todo This needs more documentation.

Parameters

\Drupal\feeds\FeedInterface $feed: The feed we are parsing for.

\Drupal\feeds\Result\FetcherResultInterface $fetcher_result: The result returned by the fetcher.

\Drupal\feeds\StateInterface $state: The state object.

Return value

\Drupal\feeds\Result\ParserResultInterface: The parser result object.

Overrides ParserInterface::parse

File

src/Feeds/Parser/SitemapParser.php, line 30

Class

SitemapParser
Defines a SitemapXML feed parser.

Namespace

Drupal\feeds\Feeds\Parser

Code

public function parse(FeedInterface $feed, FetcherResultInterface $fetcher_result, StateInterface $state) {

  // Set time zone to GMT for parsing dates with strtotime().
  $tz = date_default_timezone_get();
  date_default_timezone_set('GMT');
  $raw = trim($fetcher_result->getRaw());
  if (!strlen($raw)) {
    // Restore the original time zone before bailing out.
    date_default_timezone_set($tz);
    throw new EmptyFeedException();
  }

  // Yes, using a DOM parser is a bit inefficient, but will do for now.
  // @todo XML error handling.
  static::startXmlErrorHandling();
  $xml = new \SimpleXMLElement($raw);
  static::stopXmlErrorHandling();
  $result = new ParserResult();
  foreach ($xml->url as $url) {
    $item = new SitemapItem();
    $item->set('url', (string) $url->loc);
    if ($url->lastmod) {
      // Cast to string so strtotime() receives text, not a SimpleXMLElement.
      $item->set('lastmod', strtotime((string) $url->lastmod));
    }
    if ($url->changefreq) {
      $item->set('changefreq', (string) $url->changefreq);
    }
    if ($url->priority) {
      $item->set('priority', (string) $url->priority);
    }
    $result->addItem($item);
  }
  date_default_timezone_set($tz);
  return $result;
}
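
Example

The following is a minimal standalone sketch of the same parsing loop, for trying the logic outside Drupal. It is not part of the Feeds module: the function name parse_sitemap_string(), the plain-array item format, and the sample sitemap are assumptions made for illustration. It also wraps the work in try/finally so the caller's time zone is restored even when the XML is malformed, which is one possible way to handle that case rather than what the module does.

```php
<?php

/**
 * Hypothetical helper, not part of Feeds: parses a sitemap XML string
 * into a list of plain arrays instead of SitemapItem objects.
 */
function parse_sitemap_string(string $raw): array {
  $raw = trim($raw);
  if ($raw === '') {
    throw new InvalidArgumentException('The sitemap is empty.');
  }

  // Parse dates as GMT, as the original method does.
  $tz = date_default_timezone_get();
  date_default_timezone_set('GMT');
  try {
    // The constructor throws an \Exception if the XML cannot be parsed.
    $xml = new SimpleXMLElement($raw);
    $items = [];
    foreach ($xml->url as $url) {
      $item = ['url' => (string) $url->loc];
      if (isset($url->lastmod)) {
        $item['lastmod'] = strtotime((string) $url->lastmod);
      }
      if (isset($url->changefreq)) {
        $item['changefreq'] = (string) $url->changefreq;
      }
      if (isset($url->priority)) {
        $item['priority'] = (string) $url->priority;
      }
      $items[] = $item;
    }
    return $items;
  }
  finally {
    // Runs on both success and failure.
    date_default_timezone_set($tz);
  }
}

// Made-up sample data using the sitemaps.org 0.9 namespace.
$sample = <<<'XML'
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://example.com/</loc>
    <lastmod>2024-01-15</lastmod>
    <changefreq>daily</changefreq>
  </url>
  <url>
    <loc>https://example.com/about</loc>
    <priority>0.5</priority>
  </url>
</urlset>
XML;

$items = parse_sitemap_string($sample);
foreach ($items as $item) {
  echo $item['url'], PHP_EOL;
}
```

Optional elements (lastmod, changefreq, priority) are only present in an item when the sitemap entry supplies them, mirroring the conditional set() calls in the method above.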