class CSV in Migrate Source CSV 8.3

Same name and namespace in other branches
  1. 8 src/Plugin/migrate/source/CSV.php \Drupal\migrate_source_csv\Plugin\migrate\source\CSV
  2. 8.2 src/Plugin/migrate/source/CSV.php \Drupal\migrate_source_csv\Plugin\migrate\source\CSV

Source for CSV files.

Available configuration options:

  • path: Path to the CSV file. File streams are supported.
  • ids: Array of column names that uniquely identify each record.
  • header_offset: (optional) The record to be used as the CSV header and thereby each record's field names. Defaults to 0, because records are zero indexed. Can be set to null to indicate no header record.
  • fields: (optional) Nested array of names and labels to use instead of a header record. Overrides any values provided by the header record. If used, name is required. If no label is provided, name is used as the field description.
  • delimiter: (optional) The field delimiter (one character only). Defaults to a comma (,).
  • enclosure: (optional) The field enclosure character (one character only). Defaults to double quote marks.
  • escape: (optional) The field escape character (one character only). Defaults to a backslash (\).
  • create_record_number: (optional) Boolean value specifying whether to create an incremented value for each record in the file. Defaults to FALSE.
  • record_number_field: (optional) The name of a field that holds an incremented value for each record in the file. Defaults to record_number.
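
As a sketch of the stream support mentioned for path: createReader() checks and opens the path with file_exists() and fopen(), so a stream wrapper URI such as Drupal's public:// scheme should work in place of an absolute path. The directory and file name below are hypothetical.

```yaml
source:
  plugin: csv
  # Hypothetical location inside the public files directory.
  path: 'public://import/countries.csv'
  ids: [id]
```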

@codingStandardsIgnoreStart

Example with minimal options:


source:
  plugin: csv
  path: /tmp/countries.csv
  ids: [id]

# countries.csv
id,country
1,Nicaragua
2,Spain
3,United States

In the example above, the migration source uses a single-column ID taken from the 'id' column of the CSV file.
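
Because ids is an array, more than one column can be listed when no single column is unique. The following is a minimal sketch under that assumption; the file name and column names are hypothetical. getIds() declares every listed column as a string ID.

```yaml
source:
  plugin: csv
  # Hypothetical file in which neither column is unique on its own.
  path: /tmp/cities.csv
  ids: [country, city]

# cities.csv
# country,city
# Spain,Madrid
# Spain,Valencia
# Nicaragua,Managua
```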

Example with most options configured:


source:
  plugin: csv
  path: /tmp/countries.csv
  ids: [id]
  delimiter: '|'
  enclosure: "'"
  escape: '`'
  header_offset: null
  fields:
    -
      name: id
      label: ID
    -
      name: country
      label: Country

# countries.csv
'really long string that makes this unique'|'United States'
'even longer really long string that makes this unique'|'Nicaragua'
'even more longer really long string that makes this unique'|'Spain'
'escaped data'|'one`'s country'

In the example above, we override the default control characters for delimiter, enclosure and escape. We also set a null header offset to indicate that the file has no header record, so the column names come from the fields option.
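
The record number options are not covered by the examples above. A minimal sketch, assuming the countries.csv file from the first example and a hypothetical field name row_number:

```yaml
source:
  plugin: csv
  path: /tmp/countries.csv
  ids: [id]
  create_record_number: true
  # Hypothetical field name; when omitted, the default is record_number.
  record_number_field: row_number
```

Going by getGenerator(), the counter starts at header_offset (0 when the offset is null) and is incremented before it is assigned, so with the default header offset the first data record receives 1.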

@codingStandardsIgnoreEnd

Plugin annotation


@MigrateSource(
  id = "csv",
  source_module = "migrate_source_csv"
)

Hierarchy

Expanded class hierarchy of CSV

See also

http://php.net/manual/en/splfileobject.setcsvcontrol.php

2 files declare their use of CSV
CSVUnitTest.php in tests/src/Unit/Plugin/migrate/source/CSVUnitTest.php
YieldRows.php in tests/modules/csv_source_yield_test/src/Plugin/migrate/source/YieldRows.php
2 string references to 'CSV'
migrate_csv_test.yml in tests/modules/migrate_source_csv_test/migrations/migrate_csv_test.yml
migrate_source_csv.source.schema.yml in config/schema/migrate_source_csv.source.schema.yml

File

src/Plugin/migrate/source/CSV.php, line 91

Namespace

Drupal\migrate_source_csv\Plugin\migrate\source
class CSV extends SourcePluginBase implements ConfigurableInterface {

  /**
   * {@inheritdoc}
   *
   * @throws \InvalidArgumentException
   * @throws \Drupal\migrate\MigrateException
   */
  public function __construct(array $configuration, $plugin_id, $plugin_definition, MigrationInterface $migration) {
    parent::__construct($configuration, $plugin_id, $plugin_definition, $migration);
    $this->setConfiguration($configuration);

    // Path is required.
    if (empty($this->configuration['path'])) {
      throw new \InvalidArgumentException('You must declare the "path" to the source CSV file in your source settings.');
    }

    // IDs are required.
    if (empty($this->configuration['ids']) || !is_array($this->configuration['ids'])) {
      throw new \InvalidArgumentException('You must declare "ids" as a unique array of fields in your source settings.');
    }

    // IDs must be an array of strings.
    if ($this->configuration['ids'] !== array_unique(array_filter($this->configuration['ids'], 'is_string'))) {
      throw new \InvalidArgumentException('The ids must be a flat array with unique string values.');
    }

    // CSV character control characters must be exactly 1 character.
    foreach ([
      'delimiter',
      'enclosure',
      'escape',
    ] as $character) {
      if (1 !== strlen($this->configuration[$character])) {
        throw new \InvalidArgumentException(sprintf('%s must be a single character; %s given', $character, $this->configuration[$character]));
      }
    }

    // The configuration "header_offset" must be null or an integer.
    if (!(NULL === $this->configuration['header_offset'] || is_int($this->configuration['header_offset']))) {
      throw new \InvalidArgumentException('The configuration "header_offset" must be null or an integer.');
    }

    // The configuration "header_offset" must be greater or equal to 0.
    if (NULL !== $this->configuration['header_offset'] && 0 > $this->configuration['header_offset']) {
      throw new \InvalidArgumentException('The configuration "header_offset" must be greater or equal to 0.');
    }

    // If set, all fields must have at least a defined "name" property.
    if ($this->configuration['fields']) {
      foreach ($this->configuration['fields'] as $delta => $field) {
        if (!isset($field['name'])) {
          throw new \InvalidArgumentException(sprintf('The "name" configuration for "fields" in index position %s is not defined.', $delta));
        }
      }
    }

    // If "create_record_number" is specified, "record_number_field" must be a
    // non-empty string.
    if ($this->configuration['create_record_number'] && (!is_scalar($this->configuration['record_number_field']) || empty($this->configuration['record_number_field']))) {
      throw new \InvalidArgumentException('The configuration "record_number_field" must be a non-empty string.');
    }
  }

  /**
   * {@inheritdoc}
   */
  public function defaultConfiguration() {
    return [
      'path' => '',
      'ids' => [],
      'header_offset' => 0,
      'fields' => [],
      'delimiter' => ",",
      'enclosure' => "\"",
      'escape' => "\\",
      'create_record_number' => FALSE,
      'record_number_field' => 'record_number',
    ];
  }

  /**
   * {@inheritdoc}
   */
  public function getConfiguration() {
    return $this->configuration;
  }

  /**
   * {@inheritdoc}
   */
  public function setConfiguration(array $configuration) {

    // We must preserve integer keys for column_name mapping.
    $this->configuration = NestedArray::mergeDeepArray([
      $this->defaultConfiguration(),
      $configuration,
    ], TRUE);
  }

  /**
   * Return a string representing the source file path.
   *
   * @return string
   *   The file path.
   */
  public function __toString() {
    return $this->configuration['path'];
  }

  /**
   * {@inheritdoc}
   *
   * @throws \Drupal\migrate\MigrateException
   * @throws \League\Csv\Exception
   */
  public function initializeIterator() {
    $header = $this->getReader()->getHeader();
    if ($this->configuration['fields']) {
      // If fields are configured, they replace the header record. fields()
      // returns labels keyed by name, so flip it to make the names the
      // header values.
      $header = array_flip($this->fields());
    }
    return $this->getGenerator($this->getReader()->getRecords($header));
  }

  /**
   * {@inheritdoc}
   */
  public function getIds() {
    $ids = [];
    foreach ($this->configuration['ids'] as $value) {
      $ids[$value]['type'] = 'string';
    }
    return $ids;
  }

  /**
   * {@inheritdoc}
   */
  public function fields() {

    // If fields are not defined, use the header record.
    if (empty($this->configuration['fields'])) {
      $header = $this->getReader()->getHeader();
      return array_combine($header, $header);
    }
    $fields = [];
    foreach ($this->configuration['fields'] as $field) {
      $fields[$field['name']] = isset($field['label']) ? $field['label'] : $field['name'];
    }
    return $fields;
  }

  /**
   * Get the generator.
   *
   * @param \Iterator $records
   *   The CSV records.
   *
   * @codingStandardsIgnoreStart
   *
   * @return \Generator
   *   The records generator.
   *
   * @codingStandardsIgnoreEnd
   */
  protected function getGenerator(\Iterator $records) {
    $record_num = $this->configuration['header_offset'] ?? 0;
    foreach ($records as $record) {
      if ($this->configuration['create_record_number']) {
        $record[$this->configuration['record_number_field']] = ++$record_num;
      }
      yield $record;
    }
  }

  /**
   * Get the CSV reader.
   *
   * @return \League\Csv\Reader
   *   The reader.
   *
   * @throws \Drupal\migrate\MigrateException
   * @throws \League\Csv\Exception
   */
  protected function getReader() {
    $reader = $this->createReader();
    $reader->setDelimiter($this->configuration['delimiter']);
    $reader->setEnclosure($this->configuration['enclosure']);
    $reader->setEscape($this->configuration['escape']);
    $reader->setHeaderOffset($this->configuration['header_offset']);
    return $reader;
  }

  /**
   * Construct a new CSV reader.
   *
   * @return \League\Csv\Reader
   *   The reader.
   *
   * @throws \RuntimeException
   *   If the file does not exist or cannot be opened.
   */
  protected function createReader() {
    if (!file_exists($this->configuration['path'])) {
      throw new \RuntimeException(sprintf('File "%s" was not found.', $this->configuration['path']));
    }
    $csv = fopen($this->configuration['path'], 'r');
    if (!$csv) {
      throw new \RuntimeException(sprintf('File "%s" could not be opened.', $this->configuration['path']));
    }
    return Reader::createFromStream($csv);
  }

}

Members

Name Modifiers Type Description Overrides
CSV::createReader protected function Construct a new CSV reader.
CSV::defaultConfiguration public function Gets default configuration for this plugin. Overrides ConfigurableInterface::defaultConfiguration
CSV::fields public function Returns available fields on the source. Overrides MigrateSourceInterface::fields
CSV::getConfiguration public function Gets this plugin's configuration. Overrides ConfigurableInterface::getConfiguration
CSV::getGenerator protected function Get the generator.
CSV::getIds public function Defines the source fields uniquely identifying a source row. Overrides MigrateSourceInterface::getIds
CSV::getReader protected function Get the CSV reader.
CSV::initializeIterator public function Overrides SourcePluginBase::initializeIterator 1
CSV::setConfiguration public function Sets the configuration for this plugin instance. Overrides ConfigurableInterface::setConfiguration
CSV::__construct public function Overrides SourcePluginBase::__construct
CSV::__toString public function Return a string representing the source file path. Overrides MigrateSourceInterface::__toString
DependencySerializationTrait::$_entityStorages protected property An array of entity type IDs keyed by the property name of their storages.
DependencySerializationTrait::$_serviceIds protected property An array of service IDs keyed by property name used for serialization.
DependencySerializationTrait::__sleep public function 1
DependencySerializationTrait::__wakeup public function 2
MessengerTrait::$messenger protected property The messenger. 29
MessengerTrait::messenger public function Gets the messenger. 29
MessengerTrait::setMessenger public function Sets the messenger.
PluginBase::$configuration protected property Configuration information passed into the plugin. 1
PluginBase::$pluginDefinition protected property The plugin implementation definition. 1
PluginBase::$pluginId protected property The plugin_id.
PluginBase::DERIVATIVE_SEPARATOR constant A string which is used to separate base plugin IDs from the derivative ID.
PluginBase::getBaseId public function Gets the base_plugin_id of the plugin instance. Overrides DerivativeInspectionInterface::getBaseId
PluginBase::getDerivativeId public function Gets the derivative_id of the plugin instance. Overrides DerivativeInspectionInterface::getDerivativeId
PluginBase::getPluginDefinition public function Gets the definition of the plugin implementation. Overrides PluginInspectionInterface::getPluginDefinition 3
PluginBase::getPluginId public function Gets the plugin_id of the plugin instance. Overrides PluginInspectionInterface::getPluginId
PluginBase::isConfigurable public function Determines if the plugin is configurable.
SourcePluginBase::$cache protected property The backend cache.
SourcePluginBase::$cacheCounts protected property Whether this instance should cache the source count. 1
SourcePluginBase::$cacheKey protected property Key to use for caching counts.
SourcePluginBase::$currentRow protected property The current row from the query.
SourcePluginBase::$currentSourceIds protected property The primary key of the current row.
SourcePluginBase::$highWaterProperty protected property Information on the property used as the high-water mark.
SourcePluginBase::$highWaterStorage protected property The key-value storage for the high-water value.
SourcePluginBase::$idMap protected property The migration ID map.
SourcePluginBase::$iterator protected property The iterator to iterate over the source rows.
SourcePluginBase::$mapRowAdded protected property Flags whether source plugin will read the map row and add to data row.
SourcePluginBase::$migration protected property The entity migration object.
SourcePluginBase::$moduleHandler protected property The module handler service. 2
SourcePluginBase::$originalHighWater protected property The high water mark at the beginning of the import operation.
SourcePluginBase::$skipCount protected property Whether this instance should not attempt to count the source. 1
SourcePluginBase::$trackChanges protected property Flags whether to track changes to incoming data. 1
SourcePluginBase::aboveHighwater protected function Check if the incoming data is newer than what we've previously imported.
SourcePluginBase::count public function Gets the source count. 4
SourcePluginBase::current public function
SourcePluginBase::doCount protected function Gets the source count checking if the source is countable or using the iterator_count function. 1
SourcePluginBase::fetchNextRow protected function Position the iterator to the following row. 1
SourcePluginBase::getCache protected function Gets the cache object.
SourcePluginBase::getCurrentIds public function Gets the currentSourceIds data member.
SourcePluginBase::getHighWater protected function The current value of the high water mark.
SourcePluginBase::getHighWaterField protected function Get the name of the field used as the high watermark.
SourcePluginBase::getHighWaterProperty protected function Get information on the property used as the high watermark.
SourcePluginBase::getHighWaterStorage protected function Get the high water storage object. 1
SourcePluginBase::getIterator protected function Returns the iterator that will yield the row arrays to be processed.
SourcePluginBase::getModuleHandler protected function Gets the module handler.
SourcePluginBase::getSourceModule public function Gets the source module providing the source data. Overrides MigrateSourceInterface::getSourceModule
SourcePluginBase::key public function Gets the iterator key.
SourcePluginBase::next public function The migration iterates over rows returned by the source plugin. This method determines the next row which will be processed and imported into the system.
SourcePluginBase::postRollback public function Performs post-rollback tasks. Overrides RollbackAwareInterface::postRollback
SourcePluginBase::prepareRow public function Adds additional data to the row. Overrides MigrateSourceInterface::prepareRow 50
SourcePluginBase::preRollback public function Performs pre-rollback tasks. Overrides RollbackAwareInterface::preRollback
SourcePluginBase::rewind public function Rewinds the iterator.
SourcePluginBase::rowChanged protected function Checks if the incoming row has changed since our last import.
SourcePluginBase::saveHighWater protected function Save the new high water mark.
SourcePluginBase::valid public function Checks whether the iterator is currently valid.
StringTranslationTrait::$stringTranslation protected property The string translation service. 1
StringTranslationTrait::formatPlural protected function Formats a string containing a count of items.
StringTranslationTrait::getNumberOfPlurals protected function Returns the number of plurals supported by a given language.
StringTranslationTrait::getStringTranslation protected function Gets the string translation service.
StringTranslationTrait::setStringTranslation public function Sets the string translation service to use. 2
StringTranslationTrait::t protected function Translates a string to the current language or to a given language.