function _invisimail_email_matching_regexes in Invisimail 7
Generates the two pattern-matching regexes used to find email addresses.
This lives in a separate function for cleanliness, and because the regex is complex enough that it is worth encapsulating on its own.
Return value
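array An associative array with the keys 'diff_link' and 'same_link', each holding a PCRE pattern string.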
1 call to _invisimail_email_matching_regexes()
- invisimail_encode_string in ./invisimail.module - Encodes all email addresses in a string using the specified encoder.
File
- ./invisimail.module, line 188 - This module provides a filter that will search content for email addresses and replace them with their ascii equivalents before display. This is not a complete protection from spam harvesters, but it is some help.
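As a rough illustration of what replacing an address with its "ascii equivalents" could look like, the sketch below turns every character into a decimal HTML character reference. The function name example_entity_encode() is hypothetical and is not part of Invisimail; the module's real encoders may work differently.

```php
<?php
// Hypothetical helper, not part of Invisimail: encode every byte of an
// ASCII string as a decimal HTML character reference such as &#64;.
function example_entity_encode($string) {
  $encoded = '';
  for ($i = 0; $i < strlen($string); $i++) {
    $encoded .= '&#' . ord($string[$i]) . ';';
  }
  return $encoded;
}

$encoded = example_entity_encode('a@b.org');
echo $encoded, "\n";
// Decoding the entities (as a browser does when rendering) restores the text.
echo html_entity_decode($encoded), "\n";
```

The encoded form still displays as the original address in a browser, but a harvester scanning the raw markup for a literal "@" will not find one. As the module description says, this is only partial protection.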
Code
function _invisimail_email_matching_regexes() {
// The check for the user/name portion of the email address. This is an
// encapsulable regex that matches a run of one or more valid characters
// that is not preceded by a valid character (in most cases, the preceding
// character is a space).
$valid_user_chars = 'a-zA-Z0-9_\\-\\.\\+\\^!#\\$%&*+\\/\\=\\?\\`\\|\\{\\}~\'';
$user = "(?<![{$valid_user_chars}])[{$valid_user_chars}]+";
// For the domain portion of an email address, you can have a named domain,
// an IPv4 address, or an IPv6 address. These three regexes match each of
// those possibilities, respectively.
$domain = '(?:[a-zA-Z0-9](?:[a-zA-Z0-9\\-]*[a-zA-Z0-9])?\\.)+[a-zA-Z]{2,6}';
$ipv4 = '[0-9]{1,3}(?:\\.[0-9]{1,3}){3}';
// The groups of an IPv6 address are separated by colons. Only the full
// eight-group form is matched.
$ipv6 = '[0-9a-fA-F]{1,4}(?:\\:[0-9a-fA-F]{1,4}){7}';
// Now we stick it all together into a generalized, encapsulated, portable,
// and non-subitem-capturing (hence all the '(?:', which mark subpatterns as
// non-capturing) regex for grabbing email addresses.
$mail = "(?:{$user})+\\@(?:{$domain}|(?:\\[(?:{$ipv4}|{$ipv6})\\]))";
// PCRE pattern modifiers: 'i' enables case-insensitivity, and 'S' enables
// additional pattern analysis, from which our regex can benefit (it is a
// non-anchored pattern without a single fixed starting character; see
// http://us2.php.net/manual/en/reference.pcre.pattern.modifiers.php).
// Global case insensitivity is a little sloppy to use, but selectively
// toggling it within only some of the subpatterns isn't really worth the
// effort.
$modifiers = 'iS';
// The final pattern. We deal with these as an entire group because invisimail
// allows options that require us to deal with both an href and its text
// in relation to one another.
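// Combined form of the two patterns below. It is not part of the return value.
$pattern = "@(?:(<a [^>]*href=['\"](mailto:{$mail})['\"][^>]*>)?((?" . ">(?<!mailto:)){$mail}))|(<a [^>]*href=['\"](mailto:{$mail})['\"][^>]*>(.*?)</a>)@{$modifiers}";
// Matches a complete mailto link, capturing the whole link, the href value,
// and the link text.
$pattern_diff_link_text = "@(<a [^>]*href=['\"](mailto:{$mail})['\"][^>]*>(.*?)</a>)@{$modifiers}";
// Matches an email address in text, optionally preceded by the opening tag
// of a mailto link. An address that directly follows 'mailto:' is skipped.
$pattern_same_link_text = "@(?:(<a [^>]*href=['\"](mailto:{$mail})['\"][^>]*>)?((?" . ">(?<!mailto:)){$mail}))@{$modifiers}";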
return array(
'diff_link' => $pattern_diff_link_text,
'same_link' => $pattern_same_link_text,
);
}
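To make the capture groups of the two returned patterns concrete, here is a minimal sketch of how they might be applied. It uses example_email_regexes(), a hypothetical, trimmed stand-in for the function above that accepts named domains only (no IP literals); the sample strings are likewise invented for illustration.

```php
<?php
// Hypothetical, trimmed stand-in for _invisimail_email_matching_regexes().
// It keeps the structure shown above but accepts named domains only.
function example_email_regexes() {
  $valid_user_chars = 'a-zA-Z0-9_\\-\\.\\+\\^!#\\$%&*+\\/\\=\\?\\`\\|\\{\\}~\'';
  $user = "(?<![{$valid_user_chars}])[{$valid_user_chars}]+";
  $domain = '(?:[a-zA-Z0-9](?:[a-zA-Z0-9\\-]*[a-zA-Z0-9])?\\.)+[a-zA-Z]{2,6}';
  $mail = "(?:{$user})+\\@(?:{$domain})";
  $modifiers = 'iS';
  return array(
    'diff_link' => "@(<a [^>]*href=['\"](mailto:{$mail})['\"][^>]*>(.*?)</a>)@{$modifiers}",
    'same_link' => "@(?:(<a [^>]*href=['\"](mailto:{$mail})['\"][^>]*>)?((?" . ">(?<!mailto:)){$mail}))@{$modifiers}",
  );
}

$regexes = example_email_regexes();

// A bare address in running text: group 3 of 'same_link' holds the address.
preg_match($regexes['same_link'], 'Contact joe@example.com today.', $plain);
echo $plain[3], "\n";

// A mailto link with its own text: group 2 is the href, group 3 the link text.
$link_html = '<a href="mailto:joe@example.com">Email Joe</a>';
preg_match($regexes['diff_link'], $link_html, $link);
echo $link[2], ' / ', $link[3], "\n";
```

Because of the (?<!mailto:) lookbehind, 'same_link' does not pick up the address inside the href of the second sample, which is why a separate 'diff_link' pattern is needed for links whose text is not itself an address.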