You are here

class Scanner in Zircon Profile 8

Same name and namespace in other branches
  1. 8.0 vendor/masterminds/html5/src/HTML5/Parser/Scanner.php \Masterminds\HTML5\Parser\Scanner

The scanner.

This scans over an input stream.

Hierarchy

  • class \Masterminds\HTML5\Parser\Scanner

Expanded class hierarchy of Scanner

5 files declare their use of Scanner
DOMTreeBuilderTest.php in vendor/masterminds/html5/test/HTML5/Parser/DOMTreeBuilderTest.php
Test the Tree Builder.
HTML5.php in vendor/masterminds/html5/src/HTML5.php
ScannerTest.php in vendor/masterminds/html5/test/HTML5/Parser/ScannerTest.php
Test the Scanner. This requires the InputStream tests are all good.
TokenizerTest.php in vendor/masterminds/html5/test/HTML5/Parser/TokenizerTest.php
TreeBuildingRulesTest.php in vendor/masterminds/html5/test/HTML5/Parser/TreeBuildingRulesTest.php
Test the Tree Builder's special-case rules.

File

vendor/masterminds/html5/src/HTML5/Parser/Scanner.php, line 9

Namespace

Masterminds\HTML5\Parser
View source
class Scanner {
  const CHARS_HEX = 'abcdefABCDEF01234567890';
  const CHARS_ALNUM = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ01234567890';
  const CHARS_ALPHA = 'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ';
  protected $is;

  // Flipping this to true will give minisculely more debugging info.
  public $debug = false;

  /**
   * Create a new Scanner.
   *
   * @param \Masterminds\HTML5\Parser\InputStream $input
   *            An InputStream to be scanned.
   */
  public function __construct($input) {
    $this->is = $input;
  }

  /**
   * Get the current position.
   *
   * @return int The current intiger byte position.
   */
  public function position() {
    return $this->is
      ->key();
  }

  /**
   * Take a peek at the next character in the data.
   *
   * @return string The next character.
   */
  public function peek() {
    return $this->is
      ->peek();
  }

  /**
   * Get the next character.
   *
   * Note: This advances the pointer.
   *
   * @return string The next character.
   */
  public function next() {
    $this->is
      ->next();
    if ($this->is
      ->valid()) {
      if ($this->debug) {
        fprintf(STDOUT, "> %s\n", $this->is
          ->current());
      }
      return $this->is
        ->current();
    }
    return false;
  }

  /**
   * Get the current character.
   *
   * Note, this does not advance the pointer.
   *
   * @return string The current character.
   */
  public function current() {
    if ($this->is
      ->valid()) {
      return $this->is
        ->current();
    }
    return false;
  }

  /**
   * Silently consume N chars.
   */
  public function consume($count = 1) {
    for ($i = 0; $i < $count; ++$i) {
      $this
        ->next();
    }
  }

  /**
   * Unconsume some of the data.
   * This moves the data pointer backwards.
   *
   * @param int $howMany
   *            The number of characters to move the pointer back.
   */
  public function unconsume($howMany = 1) {
    $this->is
      ->unconsume($howMany);
  }

  /**
   * Get the next group of that contains hex characters.
   *
   * Note, along with getting the characters the pointer in the data will be
   * moved as well.
   *
   * @return string The next group that is hex characters.
   */
  public function getHex() {
    return $this->is
      ->charsWhile(static::CHARS_HEX);
  }

  /**
   * Get the next group of characters that are ASCII Alpha characters.
   *
   * Note, along with getting the characters the pointer in the data will be
   * moved as well.
   *
   * @return string The next group of ASCII alpha characters.
   */
  public function getAsciiAlpha() {
    return $this->is
      ->charsWhile(static::CHARS_ALPHA);
  }

  /**
   * Get the next group of characters that are ASCII Alpha characters and numbers.
   *
   * Note, along with getting the characters the pointer in the data will be
   * moved as well.
   *
   * @return string The next group of ASCII alpha characters and numbers.
   */
  public function getAsciiAlphaNum() {
    return $this->is
      ->charsWhile(static::CHARS_ALNUM);
  }

  /**
   * Get the next group of numbers.
   *
   * Note, along with getting the characters the pointer in the data will be
   * moved as well.
   *
   * @return string The next group of numbers.
   */
  public function getNumeric() {
    return $this->is
      ->charsWhile('0123456789');
  }

  /**
   * Consume whitespace.
   *
   * Whitespace in HTML5 is: formfeed, tab, newline, space.
   */
  public function whitespace() {
    return $this->is
      ->charsWhile("\n\t\f ");
  }

  /**
   * Returns the current line that is being consumed.
   *
   * @return int The current line number.
   */
  public function currentLine() {
    return $this->is
      ->currentLine();
  }

  /**
   * Read chars until something in the mask is encountered.
   */
  public function charsUntil($mask) {
    return $this->is
      ->charsUntil($mask);
  }

  /**
   * Read chars as long as the mask matches.
   */
  public function charsWhile($mask) {
    return $this->is
      ->charsWhile($mask);
  }

  /**
   * Returns the current column of the current line that the tokenizer is at.
   *
   * Newlines are column 0. The first char after a newline is column 1.
   *
   * @return int The column number.
   */
  public function columnOffset() {
    return $this->is
      ->columnOffset();
  }

  /**
   * Get all characters until EOF.
   *
   * This consumes characters until the EOF.
   *
   * @return int The number of characters remaining.
   */
  public function remainingChars() {
    return $this->is
      ->remainingChars();
  }

}

Members

Namesort descending Modifiers Type Description Overrides
Scanner::$debug public property
Scanner::$is protected property
Scanner::charsUntil public function Read chars until something in the mask is encountered.
Scanner::charsWhile public function Read chars as long as the mask matches.
Scanner::CHARS_ALNUM constant
Scanner::CHARS_ALPHA constant
Scanner::CHARS_HEX constant
Scanner::columnOffset public function Returns the current column of the current line that the tokenizer is at.
Scanner::consume public function Silently consume N chars.
Scanner::current public function Get the current character.
Scanner::currentLine public function Returns the current line that is being consumed.
Scanner::getAsciiAlpha public function Get the next group of characters that are ASCII Alpha characters.
Scanner::getAsciiAlphaNum public function Get the next group of characters that are ASCII Alpha characters and numbers.
Scanner::getHex public function Get the next group of that contains hex characters.
Scanner::getNumeric public function Get the next group of numbers.
Scanner::next public function Get the next character.
Scanner::peek public function Take a peek at the next character in the data.
Scanner::position public function Get the current position.
Scanner::remainingChars public function Get all characters until EOF.
Scanner::unconsume public function Unconsume some of the data. This moves the data pointer backwards.
Scanner::whitespace public function Consume whitespace.
Scanner::__construct public function Create a new Scanner.