public function TikaServerExtractor::extract in Search API attachments 9.0.x

Same name and namespace in other branches
  1. 8 src/Plugin/search_api_attachments/TikaServerExtractor.php \Drupal\search_api_attachments\Plugin\search_api_attachments\TikaServerExtractor::extract()

Extracts the text of a file using a Tika JAX-RS server.

Parameters

\Drupal\file\Entity\File $file: A file object.

Return value

string The text extracted from the file.

Throws

\GuzzleHttp\Exception\GuzzleException

\Exception Thrown when the server responds with a status code other than 200.

Overrides TextExtractorPluginBase::extract

File

src/Plugin/search_api_attachments/TikaServerExtractor.php, line 70

Class

TikaServerExtractor
Provides a Tika server extractor.

Namespace

Drupal\search_api_attachments\Plugin\search_api_attachments

Code

public function extract(File $file) {
  $data = NULL;
  $options = [
    'timeout' => $this->configuration['timeout'],
    'body' => fopen($file->getFileUri(), 'r'),
    'headers' => [
      'Accept' => 'text/plain',
    ],
  ];
  $response = $this->httpClient->request('PUT', $this->getServerUri() . '/tika', $options);
  if ($response->getStatusCode() === 200) {
    $data = (string) $response->getBody();
  }
  else {
    throw new \Exception('Tika JAX-RS Server is not available.');
  }
  return $data;
}
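
The request that extract() sends can be reproduced outside Drupal for debugging. The following is a minimal sketch, not the module's implementation: it swaps Guzzle for PHP's built-in HTTP stream wrapper, and the function names tika_put_context_options(), tika_status_code() and tika_extract_text() are hypothetical helpers introduced only for illustration. Unlike the method above, which streams the file through fopen(), this sketch reads the whole file into memory.

```php
<?php

/**
 * Builds stream-context options for a PUT request to a Tika server.
 *
 * Hypothetical helper; it mirrors the timeout, body and Accept header
 * that extract() passes to the HTTP client.
 */
function tika_put_context_options(string $contents, float $timeout): array {
  return [
    'http' => [
      'method' => 'PUT',
      'header' => "Accept: text/plain\r\n",
      'content' => $contents,
      'timeout' => $timeout,
      // Return the body even on 4xx/5xx so the status can be inspected.
      'ignore_errors' => TRUE,
    ],
  ];
}

/**
 * Parses the numeric status code out of an HTTP status line.
 *
 * Returns 0 when the line is not a recognizable status line.
 */
function tika_status_code(string $status_line): int {
  // A status line looks like "HTTP/1.1 200 OK".
  return preg_match('#^HTTP/\S+\s+(\d{3})#', $status_line, $m) ? (int) $m[1] : 0;
}

/**
 * Sends a file to a Tika server and returns the extracted plain text.
 *
 * Assumption: $server_uri is the base URI of a running Tika server.
 */
function tika_extract_text(string $server_uri, string $path, float $timeout = 5.0): string {
  $contents = file_get_contents($path);
  if ($contents === FALSE) {
    throw new RuntimeException("Cannot read $path.");
  }
  $context = stream_context_create(tika_put_context_options($contents, $timeout));
  $body = @file_get_contents(rtrim($server_uri, '/') . '/tika', FALSE, $context);
  // The HTTP wrapper fills $http_response_header in the local scope;
  // element 0 is the status line of the first response.
  $status = isset($http_response_header[0]) ? tika_status_code($http_response_header[0]) : 0;
  if ($body === FALSE || $status !== 200) {
    throw new RuntimeException('Tika JAX-RS Server is not available.');
  }
  return $body;
}
```

Assuming a Tika server listening on its default port, a call such as tika_extract_text('http://localhost:9998', '/tmp/report.pdf') would return the extracted text; both the URI and the file path here are placeholders.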