function views_natural_sort_encode in Views Natural Sort 7
Same name and namespace in other branches
- 6 views_natural_sort.module \views_natural_sort_encode()
Encodes a string into an ASCII-sortable string such that:
- Leading articles in common languages are ignored: The A An El La Le Il
- Unimportant punctuation is ignored: # ' " ( )
- Unimportant words are ignored: and of or
- Embedded numbers sort in numerical order. The following possibilities are supported:
  - A leading dash indicates a negative number, unless it is preceded by a non-whitespace character, in which case it is treated as just a dash.
  - Leading zeros are ignored so that they do not influence sort order.
  - Decimal numbers are supported, using a period as the decimal character.
  - Thousands separators are ignored, using the comma as the thousands character.
  - Numbers may be up to 99 digits before the decimal point, up to the precision of the processor.
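As a rough illustration of the word and punctuation rules above, here is a standalone sketch that does not depend on Drupal. The helper name `example_sort_key()` and the hard-coded lists are assumptions made for this example only; the module reads its lists from configuration and also encodes numbers, which this sketch leaves out.

```php
<?php
// Hypothetical stand-ins for the module's configurable lists.
$beginning_words = array('The', 'A', 'An', 'El', 'La', 'Le', 'Il');
$words = array('and', 'of', 'or');
$symbols = "#'\"()";

// Illustrative helper, not part of the module: strips ignorable words and
// symbols so that the remainder can be compared as plain text.
function example_sort_key($title, $beginning_words, $words, $symbols) {
  $quote = function ($word) {
    return preg_quote($word, '/');
  };
  $begin = implode('|', array_map($quote, $beginning_words));
  $inner = implode('|', array_map($quote, $words));
  // Drop one leading article.
  $title = preg_replace('/^(' . $begin . ')\s+/i', '', $title);
  // Drop unimportant words that stand between other words.
  $title = preg_replace('/\s(' . $inner . ')\s+/i', ' ', $title);
  // Drop unimportant punctuation.
  $title = preg_replace('/[' . preg_quote($symbols, '/') . ']/', '', $title);
  return $title;
}

$titles = array('The Lord of the Rings', 'A Tale of Two Cities', '"Emma"');
$keys = array();
foreach ($titles as $title) {
  $keys[$title] = example_sort_key($title, $beginning_words, $words, $symbols);
}
// Sort the titles by their stripped keys using plain string comparison.
asort($keys, SORT_STRING);
print_r($keys);
```

With these inputs the titles sort as "Emma", The Lord of the Rings, A Tale of Two Cities. The "the" in the middle of the second title is kept, because articles are only removed from the beginning.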
Parameters
$string string: The string to be encoded
Return value
string The encoded string
File
./views_natural_sort.module, line 155
Provides a views filter that sorts titles in a more natural manner by ignoring articles like "The" and "A."
Code
function views_natural_sort_encode($string) {
  $words = variable_get('views_natural_sort_words_remove', array());
  $beginning_words = variable_get('views_natural_sort_beginning_words_remove', array());
  $symbols = variable_get('views_natural_sort_symbols_remove', '');
  // Escape the words so they can be used safely inside a regex.
  $beginning_words = array_map('preg_quote', $beginning_words);
  $words = array_map('preg_quote', $words);
  $regex = array();
  $replace = array();
  // Remove words from the beginning only!
  if (!empty($beginning_words)) {
    $regex[] = '/^(' . implode('|', $beginning_words) . ')\\s+/i';
    $replace[] = '';
  }
  // Remove words regardless of where they are, as long as they are whole words.
  if (!empty($words)) {
    $regex[] = '/\\s(' . implode('|', $words) . ')\\s+/i';
    $replace[] = ' ';
    $regex[] = '/^(' . implode('|', $words) . ')\\s+/i';
    $replace[] = '';
  }
  // Remove symbols.
  if (strlen($symbols) != 0) {
    $regex[] = '/[' . preg_quote($symbols, '/') . ']/';
    $replace[] = '';
  }
  if (!empty($regex) && !empty($replace)) {
    $string = preg_replace($regex, $replace, $string);
  }
  // Find an optional leading dash (either preceded by whitespace or the first
  // character) followed by either:
  // - an optional series of digits (with optional embedded commas), then a
  //   period, then a series of digits, OR
  // - a series of digits (with optional embedded commas).
  $string = preg_replace_callback('/(\\s-|^-)?(?:(\\d[\\d,]*)?\\.(\\d+)|(\\d[\\d,]*))/', '_views_natural_sort_number_encode_match_callback', $string);
  // Not exactly sure why data that has been preg-replaced sometimes comes back
  // without UTF-8 encoding. This has been known to make MySQL vomit, so
  // encoding here. This isn't seen anywhere else in Drupal though.
  // @see http://drupal.org/node/1914098
  $string = utf8_encode($string);
  // The size limit on the content field for views_natural_sort is sometimes
  // not enough. Let's truncate all data down to that size. I personally feel
  // the inaccuracy is an acceptable loss.
  return substr($string, 0, 255);
}
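The callback passed to preg_replace_callback() is not shown on this page. The sketch below is a hypothetical illustration of one way a matched number can be turned into text that sorts numerically: prefixing the digit count. The 99-digit limit in the description is consistent with a two-digit length prefix, but that is an assumption, and `example_number_encode_callback()` and `example_encode()` are made-up names, not the module's own code. Negative numbers are passed through without special handling to keep the sketch short.

```php
<?php
// Hypothetical callback, NOT _views_natural_sort_number_encode_match_callback().
function example_number_encode_callback($match) {
  // Group 1: optional dash (with any preceding whitespace), kept unchanged.
  $sign = isset($match[1]) ? $match[1] : '';
  // Group 4 holds a plain integer; groups 2 and 3 hold the parts of a decimal.
  if (isset($match[4]) && $match[4] !== '') {
    $whole = $match[4];
    $fraction = '';
  }
  else {
    $whole = isset($match[2]) ? $match[2] : '';
    $fraction = isset($match[3]) ? $match[3] : '';
  }
  // Remove thousands separators and leading zeros.
  $whole = ltrim(str_replace(',', '', $whole), '0');
  // Prefix the digit count, zero-padded to two characters, so that a plain
  // string comparison ranks integers with fewer digits first.
  $encoded = sprintf('%02d', strlen($whole)) . $whole;
  if ($fraction !== '') {
    $encoded .= '.' . $fraction;
  }
  return $sign . $encoded;
}

// Applies the same number-matching pattern as the function above.
function example_encode($string) {
  return preg_replace_callback('/(\s-|^-)?(?:(\d[\d,]*)?\.(\d+)|(\d[\d,]*))/', 'example_number_encode_callback', $string);
}

$titles = array('Chapter 10', 'Chapter 1,000', 'Chapter 2', 'Chapter 007');
$keys = array();
foreach ($titles as $title) {
  $keys[$title] = example_encode($title);
}
// Sort the titles by their encoded keys using plain string comparison.
asort($keys, SORT_STRING);
print_r($keys);
```

Under these assumptions "Chapter 10" becomes "Chapter 0210" and "Chapter 2" becomes "Chapter 012", so the titles sort as Chapter 2, Chapter 007, Chapter 10, Chapter 1,000 even though the comparison is purely textual.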