public static function Utility::flattenKeys in Search API Solr 8.3

Same name and namespace in other branches
  1. 4.x src/Utility/Utility.php \Drupal\search_api_solr\Utility\Utility::flattenKeys()

Flattens keys and fields into a single search string.

Formatting the keys into a Solr query can be a bit complex. Keep in mind that the default operator is OR. Some combinations allow more than one interpretation, so we had to make decisions, and we have to ensure that stop words in boolean combinations don't lead to zero results. Therefore this function produces the following queries:

Read the table with care: in the phrase and sloppy phrase parse modes, A and B stand for different phrases. To be very clear, A (or B) can consist of multiple words.


#conjunction | #negation | fields | parse mode     | return value
---------------------------------------------------------------------------
AND          | FALSE     | []     | terms / phrase | +(+A +B)
AND          | TRUE      | []     | terms / phrase | -(+A +B)
OR           | FALSE     | []     | terms / phrase | +(A B)
OR           | TRUE      | []     | terms / phrase | -(A B)
AND          | FALSE     | [x]    | terms / phrase | +(x:(+A +B)^1)
AND          | TRUE      | [x]    | terms / phrase | -(x:(+A +B)^1)
OR           | FALSE     | [x]    | terms / phrase | +(x:(A B)^1)
OR           | TRUE      | [x]    | terms / phrase | -(x:(A B)^1)
AND          | FALSE     | [x,y]  | terms          | +((+(x:A^1 y:A^1) +(x:B^1 y:B^1)) x:(+A +B)^1 y:(+A +B)^1)
AND          | FALSE     | [x,y]  | phrase         | +(x:(+A +B)^1 y:(+A +B)^1)
AND          | TRUE      | [x,y]  | terms          | -((+(x:A^1 y:A^1) +(x:B^1 y:B^1)) x:(+A +B)^1 y:(+A +B)^1)
AND          | TRUE      | [x,y]  | phrase         | -(x:(+A +B)^1 y:(+A +B)^1)
OR           | FALSE     | [x,y]  | terms          | +(((x:A^1 y:A^1) (x:B^1 y:B^1)) x:(A B)^1 y:(A B)^1)
OR           | FALSE     | [x,y]  | phrase         | +(x:(A B)^1 y:(A B)^1)
OR           | TRUE      | [x,y]  | terms          | -(((x:A^1 y:A^1) (x:B^1 y:B^1)) x:(A B)^1 y:(A B)^1)
OR           | TRUE      | [x,y]  | phrase         | -(x:(A B)^1 y:(A B)^1)
AND          | FALSE     | [x,y]  | sloppy_terms   | +(x:(+"A"~10000000 +"B"~10000000)^1 y:(+"A"~10000000 +"B"~10000000)^1)
AND          | TRUE      | [x,y]  | sloppy_terms   | -(x:(+"A"~10000000 +"B"~10000000)^1 y:(+"A"~10000000 +"B"~10000000)^1)
OR           | FALSE     | [x,y]  | sloppy_terms   | +(x:("A"~10000000 "B"~10000000)^1 y:("A"~10000000 "B"~10000000)^1)
OR           | TRUE      | [x,y]  | sloppy_terms   | -(x:("A"~10000000 "B"~10000000)^1 y:("A"~10000000 "B"~10000000)^1)
AND          | FALSE     | [x,y]  | sloppy_phrase  | +(x:(+"A"~10000000 +"B"~10000000)^1 y:(+"A"~10000000 +"B"~10000000)^1)
AND          | TRUE      | [x,y]  | sloppy_phrase  | -(x:(+"A"~10000000 +"B"~10000000)^1 y:(+"A"~10000000 +"B"~10000000)^1)
OR           | FALSE     | [x,y]  | sloppy_phrase  | +(x:("A"~10000000 "B"~10000000)^1 y:("A"~10000000 "B"~10000000)^1)
OR           | TRUE      | [x,y]  | sloppy_phrase  | -(x:("A"~10000000 "B"~10000000)^1 y:("A"~10000000 "B"~10000000)^1)
AND          | FALSE     | [x,y]  | edismax        | +({!edismax qf=x^1,y^1}+A +B)
AND          | TRUE      | [x,y]  | edismax        | -({!edismax qf=x^1,y^1}+A +B)
OR           | FALSE     | [x,y]  | edismax        | +({!edismax qf=x^1,y^1}A B)
OR           | TRUE      | [x,y]  | edismax        | -({!edismax qf=x^1,y^1}A B)
AND / OR     | FALSE     | [x]    | direct         | +(x:(A)^1)
AND / OR     | TRUE      | [x]    | direct         | -(x:(A)^1)
AND / OR     | FALSE     | [x,y]  | direct         | +(x:(A)^1 y:(A)^1)
AND / OR     | TRUE      | [x,y]  | direct         | -(x:(A)^1 y:(A)^1)
AND          | FALSE     | []     | keys           | +A +B
AND          | TRUE      | []     | keys           | -(+A +B)
OR           | FALSE     | []     | keys           | A B
OR           | TRUE      | []     | keys           | -(A B)
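
The sloppy_terms and sloppy_phrase rows append ~10000000 to each phrase. The following is a minimal sketch of that step, modeled on the code listing for this function; the helper name sketch_add_sloppiness() is hypothetical and not part of the module. As in the listing, sloppiness is only added to a quoted key that contains a blank, so a single quoted word stays unchanged.

```php
<?php

/**
 * Appends a sloppiness factor to quoted multi-word phrases.
 *
 * Hypothetical helper for illustration only. It mirrors the condition in the
 * listing: the key must start with a double quote and contain a blank.
 */
function sketch_add_sloppiness(array $terms, string $sloppiness = '~10000000'): array {
  foreach ($terms as &$term) {
    $is_quoted = strpos($term, '"') === 0;
    $has_blank = strpos($term, ' ') !== FALSE;
    if ($is_quoted && $has_blank) {
      $term .= $sloppiness;
    }
  }
  // Break the reference so later code cannot modify the last element.
  unset($term);
  return $terms;
}

// Prints: "quick fox"~10000000 "fox"
echo implode(' ', sketch_add_sloppiness(['"quick fox"', '"fox"'])), PHP_EOL;
```

In the table, A and B abbreviate such quoted phrases, which is why every sloppy row shows the factor on both.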

Parameters

array|string $keys: The keys array to flatten, formatted as specified by \Drupal\search_api\Query\QueryInterface::getKeys() or a phrase string.

array $fields: (optional) An array of field names.

string $parse_mode_id: (optional) The parse mode ID. Defaults to "phrase".
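
To illustrate the shape of $keys, here is a minimal sketch of how '#conjunction' and '#negation' turn into the '+' and '-' prefixes when no fields are given. It is a simplification under stated assumptions: it handles only flat string keys (no nested arrays, no '#escaped'), and sketch_escape_phrase() is a hypothetical stand-in for the escapePhrase() method of the 'solarium.query_helper' service, assumed here to quote the input and backslash-escape double quotes and backslashes. Like the function body listed on this page, the sketch only prepends '-' for negated keys, so it does not emit the leading '+' shown in the table.

```php
<?php

/**
 * Hypothetical stand-in for the query helper's escapePhrase().
 */
function sketch_escape_phrase(string $input): string {
  // Backslash-escape '"' and '\', then wrap the result in double quotes.
  return '"' . addcslashes($input, '"\\') . '"';
}

/**
 * Flattens a flat keys array without fields (illustration only).
 */
function sketch_flatten_no_fields(array $keys): string {
  // AND is the default; only an explicit OR drops the '+' prefix.
  $pre = (($keys['#conjunction'] ?? 'AND') === 'OR') ? '' : '+';
  $neg = !empty($keys['#negation']) ? '-' : '';
  $terms = [];
  foreach ($keys as $name => $key) {
    // Skip empty keys, '#'-prefixed properties and anything not a string.
    if (!$key || !is_string($key) || strpos((string) $name, '#') === 0) {
      continue;
    }
    $terms[] = sketch_escape_phrase(trim($key));
  }
  if (!$terms) {
    return '';
  }
  return $neg . '(' . $pre . implode(' ' . $pre, $terms) . ')';
}

// Prints: (+"A" +"B")
echo sketch_flatten_no_fields(['#conjunction' => 'AND', 'A', 'B']), PHP_EOL;
// Prints: -("A" "B")
echo sketch_flatten_no_fields(['#conjunction' => 'OR', '#negation' => TRUE, 'A', 'B']), PHP_EOL;
```

The quotes in the output come from the phrase escaping; the table abbreviates each escaped key as A or B.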

Return value

string A Solr query string representing the same keys.

Throws

\Drupal\search_api_solr\SearchApiSolrException

3 calls to Utility::flattenKeys()
SearchApiSolrBackend::createFilterQueries in src/Plugin/search_api/backend/SearchApiSolrBackend.php
Recursively transforms conditions into a flat array of Solr filter queries.
SearchApiSolrBackend::search in src/Plugin/search_api/backend/SearchApiSolrBackend.php
Options on $query prefixed by 'solr_param_' will be passed natively to Solr as query parameter without the prefix. For example you can set the "Minimum Should Match" parameter 'mm' to '75%' like this:
SearchApiSolrTest::checkQueryParsers in tests/src/Kernel/SearchApiSolrTest.php
Tests the conversion of Search API queries into Solr queries.

File

src/Utility/Utility.php, line 667

Class

Utility
Provides various helper functions for Solr backends.

Namespace

Drupal\search_api_solr\Utility

Code

public static function flattenKeys($keys, array $fields = [], string $parse_mode_id = 'phrase') : string {
  switch ($parse_mode_id) {
    case 'keys':
      if (!empty($fields)) {
        throw new SearchApiSolrException(sprintf('Parse mode %s could not handle fields.', $parse_mode_id));
      }
      break;
    case 'edismax':
    case 'direct':
      if (empty($fields)) {
        throw new SearchApiSolrException(sprintf('Parse mode %s requires fields.', $parse_mode_id));
      }
      break;
  }
  $k = [];
  $pre = '+';
  $neg = '';
  $query_parts = [];
  $sloppiness = '';
  if (is_array($keys)) {
    $queryHelper = \Drupal::service('solarium.query_helper');
    if (isset($keys['#conjunction']) && $keys['#conjunction'] === 'OR') {
      $pre = '';
    }
    if (!empty($keys['#negation'])) {
      $neg = '-';
    }
    $escaped = $keys['#escaped'] ?? FALSE;
    foreach ($keys as $key_nr => $key) {

      // We cannot use \Drupal\Core\Render\Element::children() anymore because
      // $keys is not a valid render array.
      if (!$key || strpos($key_nr, '#') === 0) {
        continue;
      }
      if (is_array($key)) {
        if ('edismax' === $parse_mode_id) {
          throw new SearchApiSolrException('Incompatible parse mode.');
        }
        if ($subkeys = self::flattenKeys($key, $fields, $parse_mode_id)) {
          $query_parts[] = $subkeys;
        }
      }
      elseif ($escaped) {
        $k[] = trim($key);
      }
      else {
        switch ($parse_mode_id) {

          // Using the 'phrase' or 'sloppy_phrase' parse mode, Search API
          // provides one big phrase as keys. Using the 'terms' parse mode,
          // Search API provides chunks of single terms as keys. But these
          // chunks might contain not just real terms but again a phrase if
          // you enter something like this in the search box:
          // term1 "term2 as phrase" term3.
          // This will be converted into this keys array:
          // ['term1', 'term2 as phrase', 'term3'].
          // To have Solr behave like the database backend, these three
          // "terms" should be handled like three phrases.
          case 'terms':
          case 'sloppy_terms':
          case 'phrase':
          case 'sloppy_phrase':
          case 'edismax':
          case 'keys':
            $k[] = $queryHelper
              ->escapePhrase(trim($key));
            break;
          default:
            throw new SearchApiSolrException('Incompatible parse mode.');
        }
      }
    }
  }
  elseif (is_string($keys)) {
    switch ($parse_mode_id) {
      case 'direct':
        $pre = '';
        $k[] = '(' . trim($keys) . ')';
        break;
      default:
        throw new SearchApiSolrException('Incompatible parse mode.');
    }
  }
  if ($k) {
    switch ($parse_mode_id) {
      case 'edismax':
        $query_parts[] = "({!edismax qf='" . implode(' ', $fields) . "'}" . $pre . implode(' ' . $pre, $k) . ')';
        break;
      case 'keys':
        $query_parts[] = $pre . implode(' ' . $pre, $k);
        break;
      case 'sloppy_terms':
      case 'sloppy_phrase':

        // @todo Factor should be configurable.
        $sloppiness = '~10000000';

      // No break! Execute 'default', too. 'terms' will be skipped when $k
      // just contains one element.
      case 'terms':
        if (count($k) > 1 && count($fields) > 0) {
          $key_parts = [];
          foreach ($k as $l) {
            $field_parts = [];
            foreach ($fields as $f) {
              $field = $f;
              $boost = '';

              // Split on operators:
              // - boost (^)
              // - fixed score (^=)
              if ($split = preg_split('/([\\^])/', $f, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE)) {
                $field = array_shift($split);
                $boost = implode('', $split);
              }
              $field_parts[] = $field . ':' . $l . $boost;
            }
            $key_parts[] = $pre . '(' . implode(' ', $field_parts) . ')';
          }
          $query_parts[] = '(' . implode(' ', $key_parts) . ')';
        }

      // No break! Execute 'default', too.
      default:
        if ($sloppiness) {
          foreach ($k as &$term_or_phrase) {

            // Only add sloppiness if we really have a phrase, indicated
            // by double quotes and terms separated by blanks.
            if (strpos($term_or_phrase, ' ') && strpos($term_or_phrase, '"') === 0) {
              $term_or_phrase .= $sloppiness;
            }
          }
          unset($term_or_phrase);
        }
        if (count($fields) > 0) {
          foreach ($fields as $f) {
            $field = $f;
            $boost = '';

            // Split on operators:
            // - boost (^)
            // - fixed score (^=)
            if ($split = preg_split('/([\\^])/', $f, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE)) {
              $field = array_shift($split);
              $boost = implode('', $split);
            }
            $query_parts[] = $field . ':(' . $pre . implode(' ' . $pre, $k) . ')' . $boost;
          }
        }
        else {
          $query_parts[] = '(' . $pre . implode(' ' . $pre, $k) . ')';
        }
    }
  }
  if (count($query_parts) === 1) {
    return $neg . reset($query_parts);
  }
  if (count($query_parts) > 1) {
    return $neg . '(' . implode(' ', $query_parts) . ')';
  }
  return '';
}
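
The listing splits each entry of $fields on '^' to separate the field name from its boost (for example x^1) or fixed score (for example x^=2), then re-attaches the boost after the parenthesised terms. The sketch below isolates that step; the two function names are hypothetical and exist only for this illustration, while the regular expression and flags are the ones used in the listing.

```php
<?php

/**
 * Splits a field spec such as 'x^1' into [field name, boost suffix].
 *
 * Hypothetical helper for illustration only.
 */
function sketch_split_field_boost(string $field_spec): array {
  $field = $field_spec;
  $boost = '';
  // Keep the '^' delimiter in the result so the boost can be rebuilt.
  $split = preg_split('/([\\^])/', $field_spec, -1, PREG_SPLIT_NO_EMPTY | PREG_SPLIT_DELIM_CAPTURE);
  if ($split) {
    $field = array_shift($split);
    $boost = implode('', $split);
  }
  return [$field, $boost];
}

/**
 * Builds one per-field clause the way the default branch does.
 */
function sketch_field_clause(string $field_spec, array $terms, string $pre): string {
  [$field, $boost] = sketch_split_field_boost($field_spec);
  return $field . ':(' . $pre . implode(' ' . $pre, $terms) . ')' . $boost;
}

// Prints: x:(+"A" +"B")^1
echo sketch_field_clause('x^1', ['"A"', '"B"'], '+'), PHP_EOL;
// Prints: y:("A" "B")^=2
echo sketch_field_clause('y^=2', ['"A"', '"B"'], ''), PHP_EOL;
```

A field spec without '^' passes through unchanged with an empty boost, so the clause simply ends at the closing parenthesis.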