function search_simplify in Drupal 5
Same name and namespace in other branches
- 8 core/modules/search/search.module \search_simplify()
- 4 modules/search.module \search_simplify()
- 6 modules/search/search.module \search_simplify()
- 7 modules/search/search.module \search_simplify()
- 9 core/modules/search/search.module \search_simplify()
Simplifies a string according to indexing rules.
2 calls to search_simplify()
- search_index_split in modules/search/search.module - Splits a string into tokens for indexing.
- search_parse_query in modules/search/search.module - Parse a search query into SQL conditions.
File
- modules/search/search.module, line 336 - Enables site-wide keyword searching.
Code
function search_simplify($text) {
  // Decode entities to UTF-8
  $text = decode_entities($text);

  // Lowercase
  $text = drupal_strtolower($text);

  // Call an external processor for word handling.
  search_preprocess($text);

  // Simple CJK handling
  if (variable_get('overlap_cjk', TRUE)) {
    $text = preg_replace_callback('/[' . PREG_CLASS_CJK . ']+/u', 'search_expand_cjk', $text);
  }

  // To improve searching for numerical data such as dates, IP addresses
  // or version numbers, we consider a group of numerical characters
  // separated only by punctuation characters to be one piece.
  // This also means that searching for e.g. '20/03/1984' also returns
  // results with '20-03-1984' in them.
  // Readable regexp: ([number]+)[punctuation]+(?=[number])
  $text = preg_replace('/([' . PREG_CLASS_NUMBERS . ']+)[' . PREG_CLASS_PUNCTUATION . ']+(?=[' . PREG_CLASS_NUMBERS . '])/u', '\\1', $text);

  // The dot, underscore and dash are simply removed. This allows meaningful
  // search behaviour with acronyms and URLs.
  $text = preg_replace('/[._-]+/', '', $text);

  // With the exception of the rules above, we consider all punctuation,
  // marks, spacers, etc, to be a word boundary.
  $text = preg_replace('/[' . PREG_CLASS_SEARCH_EXCLUDE . ']+/u', ' ', $text);

  return $text;
}
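The last three replacement steps above can be tried outside Drupal with a small standalone sketch. It is only an approximation: demo_simplify(), DEMO_CLASS_NUMBERS and DEMO_CLASS_PUNCTUATION are hypothetical names that are not part of Drupal, the character classes are ASCII-only stand-ins for PREG_CLASS_NUMBERS, PREG_CLASS_PUNCTUATION and PREG_CLASS_SEARCH_EXCLUDE, and entity decoding, search_preprocess() and the CJK step are left out.

```php
<?php
// Hypothetical ASCII-only stand-ins for Drupal's PREG_CLASS_NUMBERS and
// PREG_CLASS_PUNCTUATION constants (assumption: the real classes cover
// many more Unicode ranges).
const DEMO_CLASS_NUMBERS = '0-9';
const DEMO_CLASS_PUNCTUATION = '\\/:,;.\\-';

// Hypothetical helper mirroring the last three steps of search_simplify().
function demo_simplify($text) {
  // Lowercase; plain strtolower() stands in for drupal_strtolower().
  $text = strtolower($text);

  // Join groups of digits that are separated only by punctuation.
  $text = preg_replace('/([' . DEMO_CLASS_NUMBERS . ']+)[' . DEMO_CLASS_PUNCTUATION . ']+(?=[' . DEMO_CLASS_NUMBERS . '])/u', '\\1', $text);

  // Remove dots, underscores and dashes (same pattern as the original).
  $text = preg_replace('/[._-]+/', '', $text);

  // Treat anything that is not an ASCII letter or digit as a word boundary;
  // this stands in for PREG_CLASS_SEARCH_EXCLUDE.
  $text = preg_replace('/[^a-z0-9]+/', ' ', $text);

  return $text;
}

echo demo_simplify('20/03/1984'), "\n";                // 20031984
echo demo_simplify('20-03-1984'), "\n";                // 20031984
echo demo_simplify('U.S.A.'), "\n";                    // usa
echo demo_simplify('Version 1.2.3 of foo_bar'), "\n";  // version 123 of foobar
```

Because both date spellings reduce to the same token, a query for either one matches content indexed with the other, which is the behaviour the comment in the function describes.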