private function HumanNameParser_Parser::parse in Bibliography Module 7

Same name and namespace in other branches
  1. 6.2 includes/Parser.php \HumanNameParser_Parser::parse()

Parse the name into its constituent parts.

Sequentially captures each name part, working in from the ends and trimming the name string as it goes.
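The capture-and-trim approach described above can be sketched in isolation. This is a minimal illustration and not the module's code: `chop_with_regex()` is a hypothetical stand-in for `Name::chopWithRegex()`, the suffix list is cut down to two entries, and the nickname, prefix, and leading-initial steps are left out.

```php
<?php
// Hypothetical stand-in for Name::chopWithRegex(): removes everything the
// regex matched from $str and returns only the requested submatch.
function chop_with_regex(string &$str, string $regex, int $group = 0): string {
  if (!preg_match($regex, $str, $m)) {
    return '';
  }
  // Replace the whole match with a space, then tidy the remaining string.
  $str = trim(preg_replace($regex, ' ', $str, 1));
  return isset($m[$group]) ? trim($m[$group]) : '';
}

$name = 'John Q. Smith Jr.';

// Work in from the right-hand end: suffix first, then last name.
$suffix = chop_with_regex($name, '/,* *(Jr\.*|Sr\.*)$/', 1);
$last = chop_with_regex($name, '/(?!^)\b[^ ]+$/u', 0);

// Then from the left-hand end: first name.
$first = chop_with_regex($name, '/^[^ ]+/', 0);

// Whatever is left over is treated as the middle name.
$middle = $name;

printf("%s | %s | %s | %s\n", $first, $middle, $last, $suffix);
// John | Q. | Smith | Jr.
```

Because each step removes its match from the string, the later patterns can stay simple: by the time the last-name pattern runs, the suffix is already gone, so "the final token" is a safe definition of the last name.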

Return value

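Boolean TRUE on success. Throws an Exception when no last name can be found, or when no first name can be found and the category is not 5.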

1 call to HumanNameParser_Parser::parse()
HumanNameParser_Parser::setName in includes/Parser.php
Sets the name string and parses it. Takes a Name object or a simple string (which is converted into a Name object), then parses it and loads its constituent parts.

File

includes/Parser.php, line 198

Class

HumanNameParser_Parser
Works with a Name object to parse out the parts of a name.

Code

private function parse() {

  // Each suffix gets a "\.*" behind it.
  $suffixes = implode("\\.*|\\s", $this->suffixes) . "\\.*";

  // Each prefix gets a " " behind it.
  $prefixes = implode(" |", $this->prefixes) . " ";

  // The regex use is a bit tricky. *Everything* matched by the regex will be replaced,
  // but you can select a particular parenthesized submatch to be returned.
  // Also, note that each regex requires that the preceding ones have been run and their matches chopped out.
  // Names that start or end with an apostrophe break this.
  $nicknamesRegex = "/ ('|\"|\\(\"*'*)(.+?)('|\"|\"*'*\\)) /";
  $suffixRegex = "/,* *({$suffixes})\$/";
  $lastRegex = "/(?!^)\\b([^ ]+ y |{$prefixes})*[^ ]+\$/u";

  // Note the lookahead, which isn't returned or replaced.
  $leadingInitRegex = "/^(.\\.*)(?= \\p{L}{2})/";

  // The first name is everything up to the first space.
  $firstRegex = "/^[^ ]+/";

  // Short circuit for a simple single string that would otherwise cause an Exception;
  // we take this as the last name and everything else will be empty (the default)
  if (preg_match('@^\\s*(\\p{L}+)\\s*$@u', $this->name
    ->getStr(), $matches)) {
    $this->last = $matches[1];
    return TRUE;
  }

  // Get nickname, if there is one.
  $this->nicknames = $this->name
    ->chopWithRegex($nicknamesRegex, 2);

  // Get suffix, if there is one.
  $this->suffix = $this->name
    ->chopWithRegex($suffixRegex, 1);

  // Flip the before-comma and after-comma parts of the name.
  $this->name
    ->flip(",");

  // Get the last name.
  $this->last = $this->name
    ->chopWithRegex($lastRegex, 0);
  if (!$this->last) {
    throw new Exception("Couldn't find a last name in '{$this->name->getStr()}'.");
  }

  // Get the first initial, if there is one.
  $this->leadingInit = $this->name
    ->chopWithRegex($leadingInitRegex, 1);

  // Get the first name.
  $this->first = $this->name
    ->chopWithRegex($firstRegex, 0);
  if (!$this->first && $this->category != 5) {
    throw new Exception("Couldn't find a first name in '{$this->name->getStr()}'");
  }

  // If anything's left, that's the middle name.
  $this->middle = $this->name
    ->getStr();
  return TRUE;
}
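The comma flip in the listing is what lets "Last, First" input be handled by the same patterns as "First Last" input. The following is only a sketch of that idea: `flip_around()` is a hypothetical helper, and the module's `Name::flip()` may treat edge cases (such as more than one comma) differently.

```php
<?php
// Hypothetical helper illustrating the comma flip; not the module's Name::flip().
function flip_around(string $str, string $separator): string {
  $parts = explode($separator, $str);
  if (count($parts) !== 2) {
    // No separator, or more than one: this sketch leaves the string unchanged.
    return trim($str);
  }
  // Swap the two halves so the part after the separator comes first.
  return trim($parts[1]) . ' ' . trim($parts[0]);
}

echo flip_around('Smith, John Q.', ','), "\n"; // John Q. Smith
echo flip_around('John Q. Smith', ','), "\n";  // John Q. Smith
```

After the flip, both inputs have the same shape, so the last-name pattern can always look at the end of the string.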