function filter_xss in Drupal 5
Same name and namespace in other branches
- 4 modules/filter.module \filter_xss()
- 6 modules/filter/filter.module \filter_xss()
- 7 includes/common.inc \filter_xss()
Filters an HTML string to prevent cross-site scripting (XSS). Based on kses by Ulf Harnhammar; see http://sourceforge.net/projects/kses
For examples of various XSS attacks, see: http://ha.ckers.org/xss.html
This code does four things:
- Removes characters and constructs that can trick browsers
- Makes sure all HTML entities are well-formed
- Makes sure all HTML tags and attributes are well-formed
- Makes sure no HTML tags contain URLs with a disallowed protocol (e.g. javascript:)
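The entity handling in the second item can be sketched in isolation. The snippet below is a minimal illustration, not the Drupal implementation itself: example_normalize_entities() is a hypothetical name, and it only reproduces the "defuse everything, then restore well-formed entities" idea using the same three patterns as the function body.

```php
<?php
// Hypothetical helper for illustration only; not part of Drupal.
function example_normalize_entities($string) {
  // Defuse every ampersand so that nothing is treated as an entity yet.
  $string = str_replace('&', '&amp;', $string);
  // Restore well-formed named entities such as &copy;
  $string = preg_replace('/&amp;([A-Za-z][A-Za-z0-9]*;)/', '&\\1', $string);
  // Restore well-formed decimal numeric entities such as &#60;
  $string = preg_replace('/&amp;#([0-9]+;)/', '&#\\1', $string);
  // Restore well-formed hexadecimal numeric entities such as &#x3C;
  $string = preg_replace('/&amp;#[Xx]0*((?:[0-9A-Fa-f]{2})+;)/', '&#x\\1', $string);
  return $string;
}

// A bare ampersand and an unterminated entity stay escaped; valid ones survive.
echo example_normalize_entities('AT&T &copy; &#60; &bogus'), "\n";
// prints: AT&amp;T &copy; &#60; &amp;bogus
```

Escaping first and restoring second means anything that does not match one of the three patterns stays inert, so malformed input does not need a rule of its own.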
Parameters
$string: The string with raw HTML in it. It will be stripped of everything that can cause an XSS attack.
$allowed_tags: An array of allowed tag names. Defaults to a, em, strong, cite, code, ul, ol, li, dl, dt and dd.
$format: The format to use. Note that the function signature shown under Code does not declare this parameter.
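A typical call passes the raw string and, optionally, a narrower tag list. The following is a hedged usage sketch: example_filter_comment() is a hypothetical wrapper, and the htmlspecialchars() fallback is an assumption added only so the snippet also runs outside a Drupal bootstrap.

```php
<?php
// Hypothetical wrapper for illustration only; not part of Drupal.
function example_filter_comment($raw, array $tags = array('a', 'em', 'strong')) {
  if (!function_exists('filter_xss')) {
    // Assumption: outside Drupal, escape everything rather than filter.
    return htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');
  }
  // Inside Drupal: keep only the listed tags and drop everything unsafe.
  return filter_xss($raw, $tags);
}

echo example_filter_comment('<em>hello</em> <script>alert(1)</script>'), "\n";
```

Passing a shorter list than the default is a way to tighten the filter for untrusted input such as comments.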
4 calls to filter_xss()
- aggregator_filter_xss in modules/aggregator/aggregator.module - Safely render HTML content, as allowed.
- filter_xss_admin in modules/filter/filter.module - Very permissive XSS/HTML filter for admin-only use.
- node_revision_overview in modules/node/node.module - Generate an overview table of older revisions of a node.
- _filter_html in modules/filter/filter.module - HTML filter. Provides filtering of input into accepted HTML.
File
- modules/filter/filter.module, line 1276 - Framework for handling filtering of content.
Code
function filter_xss($string, $allowed_tags = array(
  'a',
  'em',
  'strong',
  'cite',
  'code',
  'ul',
  'ol',
  'li',
  'dl',
  'dt',
  'dd',
)) {
  // Only operate on valid UTF-8 strings. This is necessary to prevent cross
  // site scripting issues on Internet Explorer 6.
  if (!drupal_validate_utf8($string)) {
    return '';
  }
  // Store the allowed tags so the _filter_xss_split() callback below can use them
  _filter_xss_split($allowed_tags, TRUE);
  // Remove NUL characters (ignored by some browsers)
  $string = str_replace(chr(0), '', $string);
  // Remove Netscape 4 JS entities
  $string = preg_replace('%&\\s*\\{[^}]*(\\}\\s*;?|$)%', '', $string);
  // Defuse all HTML entities
  $string = str_replace('&', '&amp;', $string);
  // Change back only well-formed entities in our whitelist
  // Named entities
  $string = preg_replace('/&amp;([A-Za-z][A-Za-z0-9]*;)/', '&\\1', $string);
  // Decimal numeric entities
  $string = preg_replace('/&amp;#([0-9]+;)/', '&#\\1', $string);
  // Hexadecimal numeric entities
  $string = preg_replace('/&amp;#[Xx]0*((?:[0-9A-Fa-f]{2})+;)/', '&#x\\1', $string);
  return preg_replace_callback('%
    (
    <(?=[^a-zA-Z!/])  # a lone <
    |                 # or
    <[^>]*(>|$)       # a string that starts with a <, up until the > or the end of the string
    |                 # or
    >                 # just a >
    )%x', '_filter_xss_split', $string);
}
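To see which chunks the final pattern hands to its callback, the pattern can be run with a stand-in callback that only records what it receives. This is a rough sketch: example_collect_chunks() is a hypothetical name, and the closure replaces _filter_xss_split(), which is not reproduced here.

```php
<?php
// Hypothetical helper for illustration only; not part of Drupal.
// It reuses the splitting pattern but only records each matched chunk.
function example_collect_chunks($string) {
  $chunks = array();
  preg_replace_callback('%
    (
    <(?=[^a-zA-Z!/])  # a lone <
    |                 # or
    <[^>]*(>|$)       # a string that starts with a <, up until the > or the end of the string
    |                 # or
    >                 # just a >
    )%x', function ($m) use (&$chunks) {
      // $m[1] is the chunk captured by the outer group.
      $chunks[] = $m[1];
      return $m[1];
    }, $string);
  return $chunks;
}

print_r(example_collect_chunks('a < b <em>x</em> > <script'));
// The chunks are, in order: "<", "<em>", "</em>", ">", "<script"
```

The lone-bracket alternative comes first so that a stray "<" followed by a non-tag character is matched alone and does not swallow the text up to the next ">". An unterminated tag at the end of the string is still captured, so the callback can deal with it.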