public function Tokenizer::buildConfigurationForm in Search API 8

Form constructor.

Plugin forms are embedded in other forms. In order to know where the plugin form is located in the parent form, #parents and #array_parents must be known, but these are not available during the initial build phase. In order to have these properties available when building the plugin form's elements, let this method return a form element that has a #process callback and build the rest of the form in the callback. By the time the callback is executed, the element's #parents and #array_parents properties will have been set by the form API. For more documentation on #parents and #array_parents, see \Drupal\Core\Render\Element\FormElement.
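The ordering described above can be illustrated with a toy model in plain PHP. This is only a sketch: toy_build() is a hypothetical stand-in for the form builder and is not Drupal's implementation, and the callbacks here take only the element, whereas real #process callbacks also receive the form state and the complete form. It shows why element construction is deferred to a #process callback: by then the element's position in the parent form has been recorded.

```php
<?php

// Toy stand-in for the form builder (NOT Drupal's code): it records each
// element's position in '#parents' first and only then runs '#process'.
function toy_build(array $element, array $parents = []): array {
  $element['#parents'] = $parents;
  foreach ($element['#process'] ?? [] as $callback) {
    $element = $callback($element);
  }
  foreach ($element as $key => $child) {
    // Keys starting with '#' are properties; all other keys are children.
    if (is_array($child) && strncmp((string) $key, '#', 1) !== 0) {
      $element[$key] = toy_build($child, array_merge($parents, [$key]));
    }
  }
  return $element;
}

// Initial build phase: '#parents' is not known yet, so only a shell element
// with a '#process' callback is returned.
function plugin_form_initial(): array {
  return [
    '#type' => 'container',
    '#process' => ['plugin_form_process'],
  ];
}

// Process phase: '#parents' has been set, so the rest of the form can use it.
function plugin_form_process(array $element): array {
  $element['ignored'] = [
    '#type' => 'textfield',
    '#description' => 'built under ' . implode('][', $element['#parents']),
  ];
  return $element;
}

// The plugin form is embedded two levels deep in a hypothetical parent form.
$form = toy_build(['settings' => ['tokenizer' => plugin_form_initial()]]);
```

In this model the child added by the callback is itself built afterwards, so it also receives its own #parents value.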

Parameters

array $form: An associative array containing the initial structure of the plugin form.

\Drupal\Core\Form\FormStateInterface $form_state: The current state of the form. Calling code should pass on a subform state created through \Drupal\Core\Form\SubformState::createForSubform().

Return value

array: The form structure.

Overrides FieldsProcessorPluginBase::buildConfigurationForm

File

src/Plugin/search_api/processor/Tokenizer.php, line 70

Class

Tokenizer
Splits text into individual words for searching.

Namespace

Drupal\search_api\Plugin\search_api\processor

Code

public function buildConfigurationForm(array $form, FormStateInterface $form_state) {
  $form = parent::buildConfigurationForm($form, $form_state);
  $args = [
    ':pcre-url' => Url::fromUri('https://php.net/manual/regexp.reference.character-classes.php')->toString(),
    ':doc-url' => Url::fromUri('https://api.drupal.org/api/drupal/core!lib!Drupal!Component!Utility!Unicode.php/constant/Unicode%3A%3APREG_CLASS_WORD_BOUNDARY/8')->toString(),
  ];
  $form['ignored'] = [
    '#type' => 'textfield',
    '#title' => $this->t('Ignored characters'),
    '#description' => $this->t('Specify the characters that should be removed prior to processing. Dots, dashes, and underscores are ignored by default to allow meaningful search behavior with acronyms and URLs. Specify the characters as the inside of a <a href=":pcre-url">PCRE character class</a>.', $args),
    '#default_value' => $this->configuration['ignored'],
  ];
  $form['spaces'] = [
    '#type' => 'textfield',
    '#title' => $this->t('Whitespace characters'),
    '#description' => $this->t('Specify the characters that should be regarded as whitespace and therefore used as word-delimiters. Specify the characters as the inside of a <a href=":pcre-url">PCRE character class</a>. Leave empty to use a <a href=":doc-url">default</a> which should be suitable for most languages with a Latin alphabet.', $args),
    '#default_value' => $this->configuration['spaces'],
  ];
  $form['overlap_cjk'] = [
    '#type' => 'checkbox',
    '#title' => $this->t('Simple CJK handling'),
    '#default_value' => $this->configuration['overlap_cjk'],
    '#description' => $this->t('Whether to apply a simple Chinese/Japanese/Korean tokenizer based on overlapping sequences. Does not affect other languages.'),
  ];
  $form['minimum_word_size'] = [
    '#type' => 'number',
    '#title' => $this->t('Minimum word length to index'),
    '#default_value' => $this->configuration['minimum_word_size'],
    '#min' => 1,
    '#max' => 1000,
    '#description' => $this->t('The number of characters a word has to be to be indexed. A lower setting means better search result ranking, but also a larger database. Each search query must contain at least one keyword that is this size (or longer).'),
  ];
  return $form;
}
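
The form descriptions ask for values written as "the inside of a PCRE character class". A minimal sketch of what that means in plain PHP follows. The helper functions and sample values are hypothetical and are not taken from the Tokenizer class; in particular, escaping "/" and the way the minimum length is applied are assumptions of this sketch, not a description of how the processor works internally.

```php
<?php

// Hypothetical helper: wraps a character-class body in [...] to form a
// complete pattern. The "u" modifier treats pattern and subject as UTF-8.
function character_class_pattern(string $class_body): string {
  // Assumption of this sketch: escape the delimiter so "/" may be listed.
  return '/[' . str_replace('/', '\/', $class_body) . ']+/u';
}

// Hypothetical helper: removes every character matched by the class body.
function strip_ignored_characters(string $text, string $ignored): string {
  return preg_replace(character_class_pattern($ignored), '', $text) ?? $text;
}

// Hypothetical helper: splits on the class body and drops short words.
function split_words(string $text, string $spaces, int $minimum_word_size): array {
  $words = preg_split(character_class_pattern($spaces), $text, -1, PREG_SPLIT_NO_EMPTY);
  // Count characters (not bytes) by matching in UTF-8 mode.
  $long_enough = function (string $word) use ($minimum_word_size): bool {
    return preg_match('/^.{' . $minimum_word_size . ',}$/su', $word) === 1;
  };
  return array_values(array_filter($words, $long_enough));
}

// Dots, dashes and underscores as the class body; the trailing "-" is
// literal because it is the last character inside the class.
$stripped = strip_ignored_characters('U.S.A. e-mail foo_bar', '._-');

// Space and tab as delimiters, keeping words of four or more characters.
$words = split_words("the quick  brown\tfox", ' \t', 4);
```

Because the setting is only the body of the class, characters with a special meaning inside brackets (such as "]", "\" or "^" in first position) need escaping by whoever enters the value.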