CWE-79: Cross-Site Scripting (XSS) - Perl
Overview
Cross-Site Scripting (CWE-79) occurs when untrusted data is included in web pages without proper encoding. Attackers inject malicious scripts that execute in victim browsers, leading to session hijacking, credential theft, defacement, or malware distribution. Perl applications must apply context-appropriate encoding such as escapeHTML() from CGI.pm, encode_entities() from HTML::Entities, or the escaping features of their template engine. Unlike CWE-80, which focuses on basic XSS, CWE-79 encompasses reflected, stored, and DOM-based XSS across all output contexts (HTML, JavaScript, URL, CSS, JSON).
Primary Defence: Encode all user-controlled output for the context in which it appears. In HTML contexts, use HTML::Entities::encode_entities() or CGI.pm's escapeHTML(), or the template layer's escaping (Template Toolkit's html filter, for example [% FILTER html %]; Mojolicious's <%= %> tag, which escapes by default). Use separate encoders for JavaScript, URL, and CSS contexts. As defence in depth, send an explicit Content-Type with a charset, set X-Content-Type-Options: nosniff, and deploy a Content Security Policy (CSP).
Common Vulnerable Patterns
Direct Variable Interpolation
# VULNERABLE - No escaping
use CGI;
my $q = CGI->new;
my $name = $q->param('name');
print "<h1>Welcome, $name</h1>"; # XSS vulnerability!
# Attack: name=<script>alert('XSS')</script>
Unescaped CGI param() Output
# VULNERABLE - Direct param usage
use CGI;
my $q = CGI->new;
print $q->header;
print "<html><body>";
print "<p>Search results for: " . $q->param('query') . "</p>"; # VULNERABLE
print "</body></html>";
Template Toolkit with EVAL_PERL Enabled
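# VULNERABLE - EVAL_PERL lets templates run arbitrary Perl
use Template;
my $tt = Template->new({
    EVAL_PERL => 1, # DANGEROUS!
});
$tt->process('template.html', { user_data => $user_input });
# Any template an attacker can influence may now execute Perl:
# [% PERL %] system($stash->get('user_data')); [% END %]
# [% user_data %] is also emitted without HTML escaping unless filtered.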
Unsafe Mason Filters
# VULNERABLE - Bypassing Mason's auto-escape
<div><% $user_input | n %></div> # No escaping - XSS!
# Attack: user_input = "<script>alert('XSS')</script>"
Manual HTML Construction Without Encoding
# VULNERABLE - String concatenation
my $html = "<div class='user'>" . $user_bio . "</div>";
print $html; # No escaping!
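To make the patterns above concrete, the following sketch renders one payload twice: once by plain concatenation and once through encode_entities() from HTML::Entities (a CPAN module). The render_unsafe() and render_safe() helper names are hypothetical and exist only for this illustration.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

# Hypothetical helper: mirrors the vulnerable concatenation pattern
sub render_unsafe {
    my ($bio) = @_;
    return "<div class='user'>" . $bio . "</div>";
}

# Hypothetical helper: same markup, value encoded at output time
sub render_safe {
    my ($bio) = @_;
    return "<div class='user'>" . encode_entities($bio, q{<>&"'}) . "</div>";
}

my $payload = q{<script>alert('XSS')</script>};
print render_unsafe($payload), "\n";   # markup reaches the browser intact
print render_safe($payload), "\n";     # markup arrives as inert text
```

The only difference between the two helpers is the encoding call at the point of output, which is where the fix belongs.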
Secure Patterns
CGI.pm with escapeHTML()
# SECURE - Explicit HTML escaping
use CGI qw(:standard escapeHTML);
my $q = CGI->new;
my $user_comment = $q->param('comment');
my $safe_comment = escapeHTML($user_comment);
print $q->header;
print "<p>Comment: $safe_comment</p>";
# Form helpers escape their values by default (autoEscape); note that
# CGI.pm's HTML-generation functions are no longer maintained upstream:
print textfield(-name => 'username', -value => $user_value);
print textarea(-name => 'bio', -value => $user_bio);
print popup_menu(-name => 'role', -values => \@roles);
print checkbox(-name => 'agree', -value => 'yes');
Why this works:
CGI.pm's escapeHTML() converts the HTML metacharacters <, >, &, " and ' into their entity equivalents (&lt;, &gt;, &amp;, &quot;, &#39;) so the browser displays them as literal text rather than interpreting them as markup. The encoding happens in your Perl code before the HTTP response is generated, so injected tags arrive as harmless text. For example, if an attacker submits <script>alert('xss')</script>, escapeHTML() returns &lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;, which is rendered as visible text instead of being executed. Encoding double quotes (" → &quot;) prevents quote-based injection in HTML attributes, but only if the attribute value is itself quoted: use qq{<div title="$safe">} rather than <div title=$safe>. CGI.pm's form-generation functions such as textfield() and popup_menu() escape their values by default (controlled by autoEscape()), although upstream no longer maintains these HTML-generation functions; escapeHTML() remains necessary whenever you build HTML by hand. This encoding is designed for HTML contexts only. JavaScript contexts need JSON encoding, URLs need percent-encoding (CGI.pm's escape() or URI::Escape), and CSS contexts need their own escaping.
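As a minimal sketch of the behaviour described above, the snippet below passes one hostile value through escapeHTML() and places the result in both a quoted attribute and element content. The payload and field name are illustrative assumptions.

```perl
use strict;
use warnings;
use CGI qw(escapeHTML);

# Payload that tries to close the attribute and open a script element
my $payload = q{"><script>alert('xss')</script>};
my $safe    = escapeHTML($payload);

# Quoted attribute: the encoded double quote cannot close the value early
print qq{<input type="text" name="q" value="$safe">\n};

# Element content: the tags are displayed, not parsed
print qq{<p>You searched for: $safe</p>\n};
```

The same encoded string is safe in both places only because the attribute value is wrapped in double quotes.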
HTML::Entities for HTML Body and Attribute Contexts
# SECURE - HTML entity encoding
use HTML::Entities qw(encode_entities);
my $safe_html = encode_entities($user_data);
print "<div>$safe_html</div>";
# For attributes
my $safe_attr = encode_entities($attr_value, '<>&"\'');
print qq{<input value="$safe_attr">};
Why this works:
HTML::Entities' encode_entities() encodes a broader set of characters than the five HTML metacharacters. Called without a second argument, encode_entities($text) encodes control characters, high-bit characters, and the HTML-significant characters <, >, &, " and ' as named or numeric entities (&lt;, &gt;, &amp;, &quot;, &#39;). This is useful when the output charset cannot be guaranteed, because accented letters and symbols travel as plain ASCII entities. The optional second argument lists exactly which characters to encode: encode_entities($text, '<>&"\'') encodes only the characters relevant to XSS prevention and leaves other text readable in the HTML source. Either way, input such as <script>alert('xss')</script> becomes &lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;, which browsers display as text instead of executing. For HTML attributes, make sure both quote characters (" and ') are in the encoded set and always wrap the attribute value in quotes. Recent CGI.pm releases implement escapeHTML() on top of HTML::Entities, so for the five metacharacters the two produce the same result; prefer calling encode_entities() directly when CGI.pm is not otherwise needed. Like other HTML encoding methods, this protects HTML contexts only: JavaScript contexts need JSON encoding, and URL contexts need percent-encoding with URI::Escape.
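The difference between the default character set and an explicit one can be seen in a short sketch. The sample string is an assumption chosen to contain both markup and one accented letter.

```perl
use strict;
use warnings;
use HTML::Entities qw(encode_entities);

my $text = "caf\x{e9} <b>&</b>";   # \x{e9} is e-acute

# Default set: markup characters plus control and high-bit characters
my $all = encode_entities($text);

# Explicit set: only the characters that matter for XSS prevention
my $minimal = encode_entities($text, q{<>&"'});

print "$all\n";   # caf&eacute; &lt;b&gt;&amp;&lt;/b&gt;
```

With the explicit set, the accented letter is left untouched and only the markup characters are replaced, so the page must declare its charset correctly for that letter to display as intended.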
Template Toolkit with Explicit Escaping
# SECURE - Safe Template Toolkit configuration plus explicit filters
use Template;
my $tt = Template->new({
    EVAL_PERL   => 0, # Never enable this (0 is also the default)
    INTERPOLATE => 0, # Keep $var in template text literal (default)
    POST_CHOMP  => 1,
});
# Template Toolkit does NOT escape output by default
$tt->process('page.html', {
    username => $user_input,
    comment  => $user_comment,
}) or die $tt->error;
# Template (page.html):
# <div>[% username | html %]</div> <!-- Escaped -->
# <p>[% comment | html %]</p> <!-- Escaped -->
#
# Block form of the same filter:
# <div>[% FILTER html %][% username %][% END %]</div>
#
# URL parameter inside an attribute (percent-encode, then HTML-encode):
# <a href="/user?id=[% user_id | uri | html %]">Profile</a>
Why this works:
Template Toolkit (TT) does not escape template variables automatically: [% username %] inserts the value exactly as supplied, so every untrusted value needs an explicit filter. The html filter converts <, >, & and " into &lt;, &gt;, &amp; and &quot; when the template is processed, after your Perl code passes data to $tt->process() but before the response is sent. With [% username | html %], an injected <script>alert(1)</script> is emitted as &lt;script&gt;alert(1)&lt;/script&gt; and rendered as harmless text. Depending on the TT version, the html filter may leave single quotes unencoded, so wrap attribute values in double quotes. For URL components use the uri filter, which percent-encodes reserved characters such as & and =; the url filter is less aggressive and leaves those characters alone, so it suits whole URLs rather than parameter values. Core TT ships no filter for JavaScript string contexts; encode such values as JSON in Perl before passing them to the template. EVAL_PERL => 0 (the default) stops templates from executing arbitrary Perl through [% PERL %] blocks, and INTERPOLATE => 0 (also the default) keeps $name-style references in template text from being expanded, so every variable reference stays inside [% ... %] tags where a missing filter is easy to spot in review. Never enable EVAL_PERL in production. If you want escaping applied to every variable without writing the filter each time, consider a wrapper such as the CPAN module Template::AutoFilter, and verify its behaviour in your environment.
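A quick way to confirm how Template Toolkit treats a value is to process a small in-memory template. The sketch below assumes only the Template module itself; the template text and the sample value are illustrative.

```perl
use strict;
use warnings;
use Template;

my $tt = Template->new({ EVAL_PERL => 0, INTERPOLATE => 0 })
    or die Template->error;

# Same variable rendered three ways: unfiltered, HTML-filtered, and
# percent-encoded for use as a query-string value
my $template = <<'EOT';
raw:      [% name %]
filtered: [% name | html %]
link:     /user?id=[% name | uri | html %]
EOT

my $output = '';
$tt->process(\$template, { name => '<b>Tom & Jerry</b>' }, \$output)
    or die $tt->error;
print $output;
```

The "raw" line comes out with its markup intact, which is the behaviour to keep in mind whenever a template omits the filter.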
Context-Specific Encoding
HTML Attribute Context
# SECURE - HTML attribute encoding
use CGI qw(escapeHTML);
my $safe_attr = escapeHTML($user_value);
print qq{<input type="text" value="$safe_attr">};
# Always wrap attributes in quotes for full protection
print qq{<div title="$safe_attr" class="user-input">};
Why this works:
HTML attribute contexts require careful encoding because attackers can break out of an attribute value with a quote character. escapeHTML() encodes double quotes (" → &quot;) and single quotes (' → &#39;), preventing quote-based injection. Always wrap attribute values in quotes (preferably double quotes): unquoted attributes like <div class=$value> are vulnerable even with encoding, because a space in the value is enough to start a new attribute. The qq{} operator lets you use double quotes inside the string without backslashes, which keeps the code readable. This encoding is specifically for HTML attribute values; it does not make event-handler attributes (such as onclick), href or src URLs, or JavaScript contexts safe.
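The point about unquoted attributes can be illustrated with a value that contains no HTML metacharacters at all, so encoding leaves it unchanged. This is a sketch; the payload is an assumption chosen for the demonstration.

```perl
use strict;
use warnings;
use CGI qw(escapeHTML);

# No <, >, &, or quotes: escapeHTML() has nothing to encode here
my $value = 'x onmouseover=alert(1)';
my $safe  = escapeHTML($value);

# Unquoted: the space ends the class value and starts a second attribute
my $unquoted = "<div class=$safe>";

# Quoted: the whole string stays inside one attribute value
my $quoted = qq{<div class="$safe">};

print "$unquoted\n$quoted\n";
```

Only the quoting differs between the two lines, which is why encoding and quoting have to be applied together.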
JavaScript Context
# SECURE - Use JSON for embedding data in JavaScript
use CGI;
use JSON::XS;
my $q = CGI->new;
my $search_term = $q->param('search');
# ascii() escapes non-ASCII characters (including U+2028/U+2029) as \uXXXX
my $json = JSON::XS->new->ascii->encode({ term => $search_term });
# JSON::XS leaves <, > and & as-is, so neutralize them for the HTML parser
$json =~ s/</\\u003c/g;
$json =~ s/>/\\u003e/g;
$json =~ s/&/\\u0026/g;
print qq{
<script>
var searchData = $json; // Safe - JSON encoded
console.log(searchData.term);
</script>
};
# For manual escaping (use JSON encoding instead when possible)
sub escape_js {
    my $str = shift;
    $str =~ s/\\/\\\\/g; # Backslash
    $str =~ s/'/\\'/g; # Single quote
    $str =~ s/"/\\"/g; # Double quote
    $str =~ s/\n/\\n/g; # Newline
    $str =~ s/\r/\\r/g; # Carriage return
    $str =~ s/</\\x3C/g; # Blocks closing-tag and HTML comment breakouts
    return $str;
}
Why this works:
JavaScript contexts require JSON encoding rather than HTML entity encoding, because the JavaScript parser does not decode HTML entities inside a <script> block and would treat them as literal characters. JSON::XS's encode() serializes the Perl data structure as JSON, escaping backslashes (\ → \\), double quotes (" → \") and control characters such as newlines and tabs, so the value cannot terminate the string literal it is placed in. JSON encoding alone is not enough inside an inline <script> element, however: JSON::XS does not escape <, > or &, and the browser's HTML parser ends the script block at the first </script> it sees, regardless of JavaScript string boundaries. That is why the example rewrites those characters as \u003c, \u003e and \u0026 after encoding; the JavaScript engine decodes them back to the original characters, but the HTML parser never sees a closing tag. For example, if $search_term contains </script><script>alert('xss')</script>, the emitted literal begins with \u003c/script\u003e and stays inside the string. The approach works for scalars, arrays and hashes alike, which makes it suitable for passing complex Perl data to client-side code. Never use HTML entity encoding inside <script> blocks, and avoid concatenating user input into JavaScript by hand; the escape_js() helper is a fallback for string literals only.
URL Context
# SECURE - URL parameter encoding
use CGI qw(escape);
my $safe_param = escape($user_input);
print "<a href='/search?q=$safe_param'>Search</a>";
# Or use URI::Escape for more control
use URI::Escape qw(uri_escape uri_escape_utf8);
my $encoded = uri_escape_utf8($param);
print "<a href='/results?query=$encoded'>Results</a>";
Why this works:
URL contexts require percent-encoding (URL encoding), which converts special characters into %XX hexadecimal form. CGI.pm's escape() and URI::Escape's uri_escape() encode characters that have special meaning in URLs (&, =, ?, /, #, spaces, and so on). For example, hello world&delete=all becomes hello%20world%26delete%3Dall, which prevents &delete=all from being interpreted as a separate parameter. Use uri_escape_utf8() for Unicode strings so that characters are encoded as UTF-8 bytes first. URLs that appear inside HTML attributes strictly need both URL encoding (to form a valid URL) and HTML attribute encoding (to prevent breaking out of the attribute); applying only URL encoding to a parameter value is common because both functions also percent-encode quotes and angle brackets. Percent-encoding a parameter does not make a fully user-supplied URL safe: validate the scheme of any such URL so that javascript: links are rejected.
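The JavaScript and URL rules in this section can be sketched as two small helpers. The names js_literal() and href_with_query() are hypothetical, and JSON::PP (bundled with Perl) stands in for JSON::XS here, on the assumption that only the shared new/ascii/encode interface is needed.

```perl
use strict;
use warnings;
use JSON::PP;
use URI::Escape qw(uri_escape_utf8);
use HTML::Entities qw(encode_entities);

# Hypothetical helper: JSON literal that is safe inside an inline <script>
sub js_literal {
    my ($data) = @_;
    my $json = JSON::PP->new->ascii->allow_nonref->encode($data);
    $json =~ s/</\\u003c/g;   # keeps a closing script tag out of the page
    $json =~ s/>/\\u003e/g;
    $json =~ s/&/\\u0026/g;
    return $json;
}

# Hypothetical helper: query-string value placed in a quoted href attribute
sub href_with_query {
    my ($base, $value) = @_;
    my $url = $base . '?q=' . uri_escape_utf8($value);   # URL layer first
    return encode_entities($url, q{<>&"'});              # then HTML layer
}

my $input = q{</script><script>alert('xss')</script>};
print "<script>var term = ", js_literal($input), ";</script>\n";
print '<a href="', href_with_query('/search', 'hello world&delete=all'),
      qq{">Search</a>\n};
```

Each helper applies the encoder for the innermost context first and the encoder for the enclosing HTML context last, which is the order the nested contexts are parsed in reverse.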
Mojolicious Framework (Auto-Escaping)
# SECURE - Mojolicious auto-escapes by default
use Mojolicious::Lite;
get '/profile' => sub {
    my $c = shift;
    my $bio = $c->param('bio');
    $c->render(template => 'profile', bio => $bio);
};
app->start;
# Template (profile.html.ep):
# <div class="bio">
# <%= $bio %> <!-- Auto-escaped -->
# </div>
#
# Raw output (DANGEROUS - only for trusted content):
# <%== $trusted_html %> <!-- NOT escaped -->
Why this works:
Mojolicious provides automatic HTML escaping through its embedded Perl (EP) template system, which applies entity encoding to every <%= $variable %> expression. When you use <%= $bio %> in a template, the framework converts the HTML metacharacters <, >, &, " and ' into &lt;, &gt;, &amp;, &quot; and &#39; during rendering, after your controller passes data via $c->render() but before the HTTP response is sent. This secure-by-default behaviour is built into the EP renderer, so XSS requires a developer to bypass it explicitly with the <%== %> raw output syntax. The single-equals <%= %> form escapes and should be used for all untrusted data; the double-equals <%== %> form skips escaping and should only be used for pre-sanitized HTML from trusted sources. For example, if an attacker submits <script>alert('xss')</script> through a form parameter, <%= $bio %> emits &lt;script&gt;alert(&#39;xss&#39;)&lt;/script&gt;, which displays as text rather than executing. Because the dangerous form requires an explicit opt-out, risky code is easy to find during review by searching for <%==. For non-HTML contexts, use url_for with proper encoding for URLs and Mojo::JSON for data embedded in JavaScript.
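The two tag forms can be compared outside a running application with Mojo::Template, the engine behind EP templates. This is a sketch under one assumption worth stating: the EP renderer switches auto_escape on itself, whereas a standalone Mojo::Template object needs it set explicitly, as done here.

```perl
use strict;
use warnings;
use Mojo::Template;

# vars => 1 passes named variables; auto_escape => 1 mirrors EP behaviour
my $mt = Mojo::Template->new(vars => 1, auto_escape => 1);

my $bio  = q{<script>alert('xss')</script>};
my $html = $mt->render(<<'EOT', { bio => $bio });
escaped: <%= $bio %>
raw:     <%== $bio %>
EOT

print $html;
```

The "raw" line shows what reaches the browser when <%== %> is used on untrusted data.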
Mason Framework (Auto-Escaping)
# SECURE - HTML::Mason escapes output once default_escape_flags is set
# Set in httpd.conf or handler.pl (escaping is off without it):
# default_escape_flags => 'h' # HTML escaping enabled
# In Mason component:
<div class="user-content">
<% $user_input %> <!-- Auto-escaped -->
<% $user_comment | h %> <!-- Explicitly escaped -->
</div>
# Raw output (DANGEROUS - only for trusted HTML)
<div><% $trusted_html | n %></div> <!-- NOT escaped -->
Why this works:
HTML::Mason escapes output automatically only when default_escape_flags => 'h' is set; without it, <% $user_input %> is emitted exactly as supplied. With the flag set, every <% ... %> expression is HTML-encoded during component rendering, turning <, >, &, " and ' into &lt;, &gt;, &amp;, &quot; and &#39;. The | h flag applies HTML escaping explicitly; it is redundant once the default is configured, but it makes the intent clear and keeps the component safe if the configuration is ever lost. The | n flag ("no escaping") switches the default flags off for that expression and should only be used for pre-sanitized HTML from trusted sources, never for user input. Set the flag in httpd.conf or handler.pl and verify it in every deployment. For non-HTML contexts such as JavaScript or URLs, you need additional encoding beyond Mason's HTML escaping (Mason also provides a u flag for URL escaping).
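The effect of default_escape_flags can be checked from a standalone script. The sketch below assumes HTML::Mason 1.x and uses an anonymous component compiled from a string; the component text and argument name are illustrative.

```perl
use strict;
use warnings;
use HTML::Mason;

my $out = '';
my $interp = HTML::Mason::Interp->new(
    default_escape_flags => 'h',     # the setting discussed above
    out_method           => \$out,   # collect output in a scalar
);

# Anonymous component: one expression with default flags, one with | n
my $comp = $interp->make_component(comp_source => <<'EOT');
<%args>
$user_input
</%args>
default: <% $user_input %>
raw:     <% $user_input | n %>
EOT

$interp->exec($comp, user_input => '<b>hi</b>');
print $out;
```

Removing the default_escape_flags line and rerunning the script is a simple way to see what an unconfigured installation would emit.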
Framework-Specific Guidance
CGI.pm
use CGI qw(:standard);
# ALWAYS escape user input
my $q = CGI->new;
my $input = escapeHTML($q->param('input'));
# Use CGI functions for form elements (auto-escape)
print textfield(-name => 'username', -value => $user_value);
print popup_menu(-name => 'role', -values => \@roles);
Template Toolkit
# CRITICAL: Disable EVAL_PERL
my $tt = Template->new({
    EVAL_PERL => 0, # MUST be 0 for security
});
# Template Toolkit does not escape by default - filter every untrusted value:
# [% variable | html %]
Mason
# Auto-escape is OFF unless configured
# Set in httpd.conf or handler.pl:
# default_escape_flags => 'h' # HTML escaping enabled
# In components:
<% $user_input %> <!-- Escaped only when default_escape_flags => 'h' -->
<% $user_input | h %> <!-- Explicitly escaped -->
<% $user_input | n %> <!-- NOT escaped - AVOID -->
Catalyst
# SECURE - Catalyst with Template Toolkit
package MyApp::Controller::User;
use Moose;
use namespace::autoclean;
BEGIN { extends 'Catalyst::Controller'; }
sub profile : Local {
    my ($self, $c) = @_;
    my $bio = $c->request->param('bio');
    $c->stash(
        template => 'user/profile.tt',
        bio => $bio, # Escaped in the template via | html
    );
}
# Template (user/profile.tt):
# <div>[% bio | html %]</div> <!-- Escaped by the html filter -->
Dancer2
# SECURE - Dancer2 with Template Toolkit
use Dancer2;
get '/welcome' => sub {
    my $name = param('name');
    template 'welcome', {
        name => $name, # Escaped in the template via | html
    };
};
# Template (views/welcome.tt):
# <h1>Welcome, [% name | html %]!</h1> <!-- Escaped by the html filter -->
Verify Encoding Functions
use Test::More tests => 4;
use HTML::Entities qw(encode_entities);
my $malicious = '<script>alert("XSS")</script>';
my $encoded = encode_entities($malicious);
unlike($encoded, qr/<script>/, 'Script tags are encoded');
like($encoded, qr/&lt;script&gt;/, 'Contains encoded entities');
like($encoded, qr/&quot;XSS&quot;/, 'Double quotes are encoded');
ok(length($encoded) > length($malicious), 'Encoded string is longer');
Check Template Configuration
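# Verify Template Toolkit settings by behaviour
use Template;
my $tt = Template->new({ EVAL_PERL => 0, INTERPOLATE => 0 });
my $out = '';
# With EVAL_PERL off, a PERL block must fail instead of running
my $ok = $tt->process(\'[% PERL %]print "ran";[% END %]', {}, \$out);
die "EVAL_PERL appears to be enabled\n" if $ok;
# With INTERPOLATE off, $name in template text stays literal
$tt->process(\'Hello $name', { name => 'x' }, \$out) or die $tt->error;
die "INTERPOLATE appears to be enabled\n" unless $out =~ /\$name/;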
Verification
After implementing the recommended secure patterns, verify the fix through multiple approaches:
- Manual testing: Submit XSS payloads for each output context (for example <script>alert(1)</script>, "><img src=x onerror=alert(1)>, and javascript: URLs) and confirm they are rendered as inert text or rejected
- Code review: Confirm every output of user-controlled data passes through the encoder for its context (HTML, attribute, JavaScript, URL), with no raw interpolation or concatenation into markup
- Static analysis: Use security scanners to verify no new vulnerabilities exist and the original finding is resolved
- Regression testing: Ensure legitimate user inputs and application workflows continue to function correctly, including text that legitimately contains &, < or quotes
- Edge case validation: Test with quotes, angle brackets, ampersands, non-ASCII text, and unusually long inputs to verify proper handling
- Framework verification: Confirm the template engine's escaping is actually active (explicit html filters in Template Toolkit, default_escape_flags in Mason, no <%== %> on untrusted data in Mojolicious)
- Authentication/session testing: Verify security controls remain effective and cannot be bypassed (if applicable to the vulnerability type)
- Rescan: Run the scanner that reported the original finding again to confirm it is resolved and no new issues were introduced
Security Checklist
- All user input encoded at output
- All user input validated at input
- Correct context-specific encoding
- No EVAL_PERL => 1 in templates
- No | n filter with user data
- CSP and security headers set
- Character encoding (UTF-8) specified
- Unit tests verify encoding
- Security scan shows issue resolved
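The header-related checklist items (CSP, nosniff, explicit UTF-8 charset) can be sketched with CGI.pm's header() method, which turns additional -Name => value pairs into response headers. The policy string below is an illustrative assumption, not a recommended policy for any particular application.

```perl
use strict;
use warnings;
use CGI;

my $q = CGI->new({});   # empty parameter set; a real request supplies its own

# Extra -Name => value pairs are emitted as additional response headers
my $headers = $q->header(
    -type                    => 'text/html',
    -charset                 => 'UTF-8',
    -Content_Security_Policy => "default-src 'self'; script-src 'self'",
    -X_Content_Type_Options  => 'nosniff',
);

print $headers;
```

CGI.pm normalizes the capitalization of the extra header names; this is harmless because HTTP header names are case-insensitive.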