PHP’s strip_tags() function seems like an easy solution for allowing certain HTML elements while blocking others. But it has a dangerous flaw.

The Problem

strip_tags() accepts a second parameter to whitelist specific tags:

$clean = strip_tags($input, '<a><b><i>');

Looks safe, right? But what if the input contains:

<a href="javascript:alert('XSS')" onclick="stealCookies()">Click me</a>

The <a> tag is whitelisted, so it passes through - complete with its malicious attributes. strip_tags() removes disallowed tags, but it leaves allowed tags exactly as written, attributes included.
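
A quick way to see this for yourself is a minimal sketch like the one below. The input string is made up for illustration; only strip_tags() itself comes from PHP.

```php
<?php
// Hypothetical input: an allowed <a> tag with a script URL and an event
// handler, followed by a <script> element that is not on the allowlist.
$input = '<a href="javascript:alert(1)" onclick="stealCookies()">Click me</a>'
       . '<script>evil()</script>';

$clean = strip_tags($input, '<a><b><i>');

// The <script> tags are removed (their inner text stays), but the <a> tag
// comes back unchanged, href and onclick included.
echo $clean, PHP_EOL;
```

The printed string should still contain both the javascript: URL and the onclick handler.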

The Solution

If you don’t need attributes, strip them entirely:

function strip_tags_with_attributes($string, $allowedTags) {
    // First, strip disallowed tags
    $string = strip_tags($string, $allowedTags);

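    // Then remove all attributes from the remaining tags. Quoted values are
    // skipped as a unit, so a ">" inside an attribute value is not mistaken
    // for the end of the tag.
    return preg_replace('/<(\w+)(?:[^>"\']|"[^"]*"|\'[^\']*\')*>/', '<$1>', $string);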
}

// Usage
$input = '<a href="javascript:bad()" onclick="evil()">Link</a>';
$clean = strip_tags_with_attributes($input, '<a><b><i>');
// Result: <a>Link</a>

When This Isn’t Enough

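If you actually need to preserve safe attributes (like href for links), stripping everything is not an option. You have to allow specific attributes and also validate their values, because an href can itself carry a javascript: URL. Two common routes:

  • HTML Purifier - The gold standard for PHP; a dedicated sanitizer with configurable tag and attribute whitelists
  • DOMDocument - Not a sanitizer itself, but a parser you can use to walk the markup and whitelist specific attributes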
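
As a rough illustration of the DOMDocument route, the sketch below parses the markup and drops every attribute that is not explicitly allowed. The function name sanitize_html_sketch, the tag and attribute whitelist, and the rule that href must start with http://, https://, or mailto: are all assumptions made for this example, not part of any library. Treat it as a starting point rather than a vetted sanitizer.

```php
<?php
// Hypothetical helper for illustration; not a library function.
function sanitize_html_sketch(string $html): string
{
    // Assumed policy: tag name => attributes that may stay on that tag.
    $allowed = ['a' => ['href'], 'b' => [], 'i' => []];

    // Remove disallowed tags first, then parse what is left.
    $html = strip_tags($html, '<a><b><i>');

    $doc = new DOMDocument();
    $previous = libxml_use_internal_errors(true); // keep parser warnings quiet
    // The XML declaration is a common trick to hint UTF-8 to the HTML parser;
    // the wrapper <div> gives us a single known root to read back from.
    $doc->loadHTML('<?xml encoding="UTF-8"><div>' . $html . '</div>');
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $root = $doc->getElementsByTagName('div')->item(0);
    if ($root === null) {
        return '';
    }

    foreach ($root->getElementsByTagName('*') as $element) {
        $keep = $allowed[strtolower($element->nodeName)] ?? [];
        // Copy the attribute list first: removing entries while iterating
        // the live collection can skip some of them.
        foreach (iterator_to_array($element->attributes, false) as $attr) {
            $name = strtolower($attr->name);
            $ok = in_array($name, $keep, true);
            // Assumed rule: href must start with an explicitly allowed scheme.
            if ($ok && $name === 'href'
                && !preg_match('#^(?:https?://|mailto:)#i', $attr->value)) {
                $ok = false;
            }
            if (!$ok) {
                $element->removeAttribute($attr->name);
            }
        }
    }

    // Serialize the children of the wrapper, not the wrapper itself.
    $out = '';
    foreach ($root->childNodes as $child) {
        $out .= $doc->saveHTML($child);
    }
    return $out;
}

echo sanitize_html_sketch(
    '<a href="javascript:bad()" onclick="evil()">Bad</a> '
    . '<a href="https://example.com/" title="x">Good</a>'
), PHP_EOL;
```

The scheme check is deliberately strict and also rejects relative URLs. Checking for an allowed prefix, rather than trying to block known-bad schemes, avoids having to anticipate every way a browser might interpret an obfuscated javascript: URL.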

Key Takeaway

strip_tags() alone is not safe enough when you’re whitelisting tags. Always consider what attributes could slip through and either:

  1. Remove all attributes (simple approach above)
  2. Use a proper HTML sanitization library (complex but flexible)

Never trust user input, even when it appears to be sanitized.