function _spamspan_filter_process in SpamSpan filter 7
SpamSpan filter process callback.
Scans the text and replaces email addresses with span tags.
The aim is to replace each email address with markup like this: <span class="spamspan"> <span class="u">user</span> [at] <span class="d">example [dot] com</span> <span class="t">tag contents</span></span>
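The target markup can be illustrated with a minimal sketch. The helper name `example_spamspan_markup()` and its parameters are hypothetical and are not part of spamspan.module; the module builds its real output elsewhere, so this only shows the shape of the result, without the optional whitespace between the spans.

```php
<?php
// Hypothetical helper, not part of spamspan.module: builds the obfuscated
// markup for one address, following the template described above.
function example_spamspan_markup($user, $domain, $contents = '') {
  // Spell out every dot in the domain as " [dot] ".
  $domain = str_replace('.', ' [dot] ', $domain);
  $output = '<span class="u">' . htmlspecialchars($user) . '</span>'
    . ' [at] '
    . '<span class="d">' . htmlspecialchars($domain) . '</span>';
  // The "t" span is only emitted when there is link text to keep.
  if ($contents !== '') {
    $output .= ' <span class="t">' . htmlspecialchars($contents) . '</span>';
  }
  return '<span class="spamspan">' . $output . '</span>';
}

echo example_spamspan_markup('user', 'example.com', 'Contact us');
```

Keeping the user name, the domain and the link text in separate classed spans gives a script or stylesheet enough structure to reassemble the address for human visitors.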
1 call to _spamspan_filter_process()
- spamspan in ./spamspan.module - A simple utility function wrapping the main processing callback. This function may be called by other modules and themes.
1 string reference to '_spamspan_filter_process'
- spamspan_filter_info in ./spamspan.module - Implements hook_filter_info(). This function is called on every page so keep it fast and simple.
File
- ./spamspan.module, line 101 - This module implements the spamspan technique (http://www.spamspan.com) for hiding email addresses from spambots.
Code
function _spamspan_filter_process($text, $filter) {
  // The preg_replace_callback functions below cannot take any additional
  // arguments, so we pass the relevant settings in via spamspan_admin().
  spamspan_admin()->filter_set($filter);
  // Top and tail the email regexp so that it is case insensitive and
  // ignores whitespace.
  $emailpattern = '!' . SPAMSPAN_EMAIL . '!ix';
  $emailpattern_with_options = '!' . SPAMSPAN_EMAIL . '\\[(.*?)\\]!ix';
  // Next set up a regex for mailto: URLs.
  // - see http://www.faqs.org/rfcs/rfc2368.html
  // Groups one and two hold any leading attributes and the quote character
  // (hence the \2 backreference), so the whole mailto: URL lands in the
  // third group. Assuming SPAMSPAN_EMAIL contributes one group each for the
  // name and the domain, those are the fourth and fifth groups, any trailing
  // attributes the sixth, and the tag contents the seventh.
  $mailtopattern = "!<a\\s+ # opening <a and spaces\n ((?:\\w+\\s*=\\s*)(?:\\w+|\"[^\"]*\"|'[^']*'))*? # any attributes\n \\s* # whitespace\n href\\s*=\\s*(['\"])(mailto:" . SPAMSPAN_EMAIL . "(?:\\?[A-Za-z0-9_= %\\.\\-\\~\\_\\&;\\!\\*\\(\\)\\'#&]*)?)" . "\\2 # the relevant quote\n # character\n ((?:\\s+\\w+\\s*=\\s*)(?:\\w+|\"[^\"]*\"|'[^']*'))*? # any more attributes\n > # end of the first tag\n (.*?) # tag contents. NB this\n # will not work properly\n # if there is a nested\n # <a>, but this is not\n # valid xhtml anyway.\n </a> # closing tag\n !ix";
  // HTML image tags need to be handled separately, as they may contain
  // base64-encoded images that slow down the email regex.
  // Therefore, remove all image contents and add them back later.
  // See https://drupal.org/node/1243042 for details.
  $images = array(
    array(),
  );
  $inline_image_pattern = '/data\\:(?:.+?)base64(?:.+?)["|\']/';
  preg_match_all($inline_image_pattern, $text, $images);
  $text = preg_replace($inline_image_pattern, '__spamspan_img_placeholder__', $text);
  // Now we can convert all mailto URLs,
  $text = preg_replace_callback($mailtopattern, '_spamspan_callback_mailto', $text);
  // then all bare email addresses with optional formatting information,
  $text = preg_replace_callback($emailpattern_with_options, '_spamspan_email_addresses_with_options', $text);
  // and finally, all remaining bare email addresses.
  $text = preg_replace_callback($emailpattern, '_spamspan_bare_email_addresses', $text);
  // Restore the original image contents, one placeholder per image, in order.
  foreach ($images[0] as $image) {
    $text = preg_replace('/__spamspan_img_placeholder__/', $image, $text, 1);
  }
  return $text;
}
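The placeholder step in the function above can be tried in isolation with the following sketch. The function name `example_protect_inline_images()` and the `$process` callback are hypothetical stand-ins for the module's email regexes; the image pattern and placeholder string are copied from the listing. One deliberate difference, stated as an assumption about what is safer rather than as the module's behaviour: the restore step uses preg_replace_callback(), so a payload containing "$" or "\" is reinserted literally instead of being read as a backreference.

```php
<?php
// Hypothetical stand-alone demo of the placeholder technique; the function
// name and the $process parameter are assumptions made for illustration.
function example_protect_inline_images($text, callable $process) {
  // Pattern and placeholder as used in _spamspan_filter_process().
  $pattern = '/data\\:(?:.+?)base64(?:.+?)["|\']/';
  $placeholder = '__spamspan_img_placeholder__';

  // Remember every inline image payload, then swap each for the placeholder.
  preg_match_all($pattern, $text, $images);
  $text = preg_replace($pattern, $placeholder, $text);

  // Run the expensive processing on the shortened text.
  $text = $process($text);

  // Put the payloads back in their original order, one placeholder at a time.
  // Returning the payload from a callback keeps it literal.
  foreach ($images[0] as $image) {
    $text = preg_replace_callback(
      '/' . $placeholder . '/',
      function ($match) use ($image) {
        return $image;
      },
      $text,
      1
    );
  }
  return $text;
}

$html = '<img src="data:image/png;base64,AAAA"> mail me';
echo example_protect_inline_images($html, function ($t) {
  return str_replace('mail me', 'write to me', $t);
});
```

The processing step must leave the placeholder string intact, otherwise the payloads cannot be matched up again; that is why the demo callback only touches the surrounding text.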