public function biblio_handler_field_contributor::get_utf8_regex in Bibliography Module 7

Same name and namespace in other branches
  1. 7.2 views/biblio_handler_field_contributor.inc \biblio_handler_field_contributor::get_utf8_regex()
1 call to biblio_handler_field_contributor::get_utf8_regex()
biblio_handler_field_contributor::get_regex_patterns in views/biblio_handler_field_contributor.inc

File

views/biblio_handler_field_contributor.inc, line 288

Class

biblio_handler_field_contributor

Code

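/**
 * Returns Unicode-aware PCRE character class fragments.
 *
 * Each fragment is written for use inside a bracketed character class,
 * e.g. "[$alpha]", together with the returned pattern modifier(s).
 *
 * @return array
 *   A numerically indexed array, in this order: alnum, alpha, cntrl, dash,
 *   digit, graph, lower, print, punct, space, upper, word, pattern modifiers.
 */
public function get_utf8_regex() {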

  // Matches Unicode letters & digits:
  // Unicode-aware equivalent of "[:alnum:]".
  $alnum = "\\p{Ll}\\p{Lu}\\p{Lt}\\p{Lo}\\p{Nd}";

  // Matches Unicode letters:
  // Unicode-aware equivalent of "[:alpha:]".
  $alpha = "\\p{Ll}\\p{Lu}\\p{Lt}\\p{Lo}";

  // Matches Unicode control codes & characters not in other categories:
  // Unicode-aware equivalent of "[:cntrl:]".
  $cntrl = "\\p{C}";

  // Matches Unicode dashes & hyphens:
  $dash = "\\p{Pd}";

  // Matches Unicode digits:
  // Unicode-aware equivalent of "[:digit:]".
  $digit = "\\p{Nd}";

  // Matches Unicode printing characters (excluding space):
  // Unicode-aware equivalent of "[:graph:]". The leading "^" negates the
  // class, so this fragment must come first inside the brackets.
  $graph = "^\\p{C}\t\n\f\r\\p{Z}";

  // Matches Unicode lower case letters (plus combining marks, "\p{M}"):
  // Unicode-aware equivalent of "[:lower:]".
  $lower = "\\p{Ll}\\p{M}";

  // Matches Unicode printing characters (including space):
  // same as "^\p{C}", Unicode-aware equivalent of "[:print:]".
  $print = "\\P{C}";

  // Matches Unicode punctuation (printing characters excluding letters & digits):
  // Unicode-aware equivalent of "[:punct:]".
  $punct = "\\p{P}";

  // Matches Unicode whitespace (separating characters with no visual representation):
  // Unicode-aware equivalent of "[:space:]".
  $space = "\t\n\f\r\\p{Z}";

  // Matches Unicode upper case letters:
  // Unicode-aware equivalent of "[:upper:]".
  $upper = "\\p{Lu}\\p{Lt}";

  // Matches Unicode "word" characters:
  // Unicode-aware equivalent of "[:word:]" (or "[:alnum:]" plus "_")
  $word = "_\\p{Ll}\\p{Lu}\\p{Lt}\\p{Lo}\\p{Nd}";

  // Defines the PCRE pattern modifier(s) to be used with the above variables.
  // The "u" (PCRE_UTF8) modifier causes PHP/PCRE to treat both pattern and
  // subject strings as UTF-8.
  // More info: <http://www.php.net/manual/en/reference.pcre.pattern.modifiers.php>
  $patternModifiers = "u";
  return array(
    $alnum,
    $alpha,
    $cntrl,
    $dash,
    $digit,
    $graph,
    $lower,
    $print,
    $punct,
    $space,
    $upper,
    $word,
    $patternModifiers,
  );
}
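
Usage sketch

The listing above returns bare character class fragments, not complete patterns, so a caller has to wrap each one in brackets and delimiters and append the modifier string. The following is a minimal sketch of that step, not the module's own calling code: `example_utf8_regex()` is a hypothetical stand-in that returns only three of the fragments plus the modifier, and the variable names `$capitalized` and `$whitespace` are assumptions made for illustration.

```php
<?php
// Hypothetical stand-in for get_utf8_regex(): same fragment strings,
// reduced to the values this sketch needs.
function example_utf8_regex() {
  $alpha = "\\p{Ll}\\p{Lu}\\p{Lt}\\p{Lo}";
  $space = "\t\n\f\r\\p{Z}";
  $upper = "\\p{Lu}\\p{Lt}";
  $patternModifiers = "u";
  return array($alpha, $space, $upper, $patternModifiers);
}

// Unpack positionally; the order must match the returned array.
list($alpha, $space, $upper, $patternModifiers) = example_utf8_regex();

// A capitalized word: one upper or title case letter, then any letters.
$capitalized = '/^[' . $upper . '][' . $alpha . ']*$/' . $patternModifiers;

// One or more Unicode whitespace characters.
$whitespace = '/[' . $space . ']+/' . $patternModifiers;

// "\xC3\x89" is the UTF-8 encoding of U+00C9 (capital E with acute).
$is_capitalized = preg_match($capitalized, "\xC3\x89mile");

// "\xC2\xA0" is the UTF-8 encoding of U+00A0 (no-break space).
$parts = preg_split($whitespace, "Jean\xC2\xA0Paul Sartre");
```

Because the real method returns a numerically indexed array of thirteen values, a caller using `list()` has to name all thirteen in the documented order; the shortened stand-in here only keeps the sketch compact.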