public function Terms::parseInput in Search API 8

Parses search keys input by the user.

Parameters

string $keys: The keywords to parse.

Return value

array|string|null: The parsed keywords: either a string, an array specifying a complex search expression, or NULL if no keywords should be set for this input. An array contains a '#conjunction' key specifying the conjunction type, with search strings or nested expression arrays at numeric keys. A '#negation' key may also be present; unless it maps to a FALSE value, the search keys in that array are negated (that is, they must not be present in returned results). Negation applies to the array as a whole, not to each contained term individually: with the "AND" conjunction and negation, only results that contain all the terms in the array are excluded; with the "OR" conjunction and negation, all results containing one or more of the terms in the array are excluded.
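
The negation rules above can be made concrete with a small evaluator. This is only a sketch: keys_match() is a hypothetical helper, not part of Search API, and it assumes plain case-insensitive substring matching, which real backends do not necessarily use.

```php
<?php

/**
 * Hypothetical helper (not part of Search API): tests whether $text
 * satisfies a parsed keys structure, using substring matching.
 */
function keys_match($keys, string $text): bool {
  // NULL means no keywords were set, so nothing restricts the result.
  if ($keys === NULL) {
    return TRUE;
  }
  if (is_string($keys)) {
    return $keys === '' || stripos($text, $keys) !== FALSE;
  }

  $conjunction = $keys['#conjunction'] ?? 'AND';
  $negated = !empty($keys['#negation']);
  $all = TRUE;
  $any = FALSE;
  foreach ($keys as $key => $child) {
    // Only numeric keys hold search strings or nested expressions.
    if (!is_int($key)) {
      continue;
    }
    $hit = keys_match($child, $text);
    $all = $all && $hit;
    $any = $any || $hit;
  }

  $result = $conjunction === 'OR' ? $any : $all;
  // Negation flips the outcome of the whole array, not of each term.
  return $negated ? !$result : $result;
}

// "foo", but neither "bar" nor "baz" (negated OR group).
$or_negated = [
  '#conjunction' => 'AND',
  'foo',
  ['#negation' => TRUE, '#conjunction' => 'OR', 'bar', 'baz'],
];
// "foo", but not both "bar" and "baz" together (negated AND group).
$and_negated = [
  '#conjunction' => 'AND',
  'foo',
  ['#negation' => TRUE, '#conjunction' => 'AND', 'bar', 'baz'],
];

var_dump(keys_match($or_negated, 'foo and bar'));
var_dump(keys_match($and_negated, 'foo and bar'));
```

With the negated "OR" group, a text containing only one of the negated terms is already excluded; with the negated "AND" group, the same text is still accepted because it does not contain all of them.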

Overrides ParseModeInterface::parseInput

File

src/Plugin/search_api/parse_mode/Terms.php, line 21

Class

Terms
Represents a parse mode that parses the input into multiple words.

Namespace

Drupal\search_api\Plugin\search_api\parse_mode

Code

public function parseInput($keys) {

  // Split the keys into tokens. Any whitespace is considered as a delimiter
  // for tokens. This covers ASCII white spaces as well as multi-byte "spaces"
  // which for example are common in Japanese.
  $tokens = preg_split('/\\s+/u', $keys);
  $quoted = FALSE;
  $negated = FALSE;
  $phrase_contents = [];
  $ret = [
    '#conjunction' => $this->getConjunction(),
  ];
  foreach ($tokens as $token) {

    // Ignore empty tokens. (Also helps keep the following code simpler.)
    if ($token === '') {
      continue;
    }

    // Check for negation.
    if ($token[0] === '-' && !$quoted) {
      $token = ltrim($token, '-');

      // If token is empty after trimming, ignore it.
      if ($token === '') {
        continue;
      }
      $negated = TRUE;
    }

    // Depending on whether we are currently in a quoted phrase, or maybe just
    // starting one, act accordingly.
    if ($quoted) {
      if (substr($token, -1) === '"') {
        $token = substr($token, 0, -1);
        $phrase_contents[] = trim($token);
        $phrase_contents = array_filter($phrase_contents, 'strlen');
        $phrase_contents = implode(' ', $phrase_contents);
        if ($phrase_contents !== '') {
          $ret[] = $phrase_contents;
        }
        $quoted = FALSE;
      }
      else {
        $phrase_contents[] = trim($token);
        continue;
      }
    }
    elseif ($token[0] === '"') {
      $len = strlen($token);
      if ($len > 1 && $token[$len - 1] === '"') {
        $ret[] = substr($token, 1, -1);
      }
      else {
        $phrase_contents = [
          trim(substr($token, 1)),
        ];
        $quoted = TRUE;
        continue;
      }
    }
    else {
      $ret[] = $token;
    }

    // If negation was set, change the last added keyword to be negated.
    if ($negated) {
      $i = count($ret) - 2;
      $ret[$i] = [
        '#negation' => TRUE,
        '#conjunction' => 'AND',
        $ret[$i],
      ];
      $negated = FALSE;
    }
  }

  // Take care of any quoted phrase missing its closing quotation mark.
  if ($quoted) {
    $phrase_contents = implode(' ', array_filter($phrase_contents, 'strlen'));
    if ($phrase_contents !== '') {
      $ret[] = $phrase_contents;
    }
  }
  return $ret;
}
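
Two steps of the method are easy to misread: the Unicode-aware tokenization and the "count($ret) - 2" index used when wrapping a negated keyword. The standalone sketch below reproduces just those two steps outside Drupal. The input string and the hard-coded 'AND' conjunction are assumptions chosen for illustration.

```php
<?php

// Step 1: tokenization. With the "u" modifier, \s also matches
// multi-byte white space such as the ideographic space U+3000.
$input = "foo \"bar baz\"\u{3000}-qux";
$tokens = preg_split('/\s+/u', $input);
// The quoted phrase is still split at this point; the loop in
// parseInput() later rejoins '"bar' and 'baz"' into 'bar baz'.

// Step 2: negation wrapping. '#conjunction' is counted by count(),
// and numeric keys start at 0, so the most recently appended keyword
// sits at index count($ret) - 2.
$ret = ['#conjunction' => 'AND'];
$ret[] = 'foo';
$ret[] = 'bar baz';
$ret[] = 'qux';

$i = count($ret) - 2;
$ret[$i] = [
  '#negation' => TRUE,
  '#conjunction' => 'AND',
  $ret[$i],
];

print_r($tokens);
print_r($ret);
```

The wrapped element always uses the 'AND' conjunction, whatever the parse mode's own conjunction is; since the wrapper holds a single keyword, the choice has no effect on which results are excluded.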