CWE-79: Cross-Site Scripting (XSS)

Overview

Cross-Site Scripting (XSS) occurs when an application includes untrusted data in web pages without proper validation or encoding, allowing attackers to inject malicious scripts that execute in victims' browsers. XSS can appear in HTML content, attributes, JavaScript, CSS, or URLs. This is one of the most common web application vulnerabilities.

OWASP Classification

A05:2025 - Injection

Risk

High to Critical: Attackers can execute arbitrary JavaScript in the victim's browser, leading to session hijacking (cookie theft), credential theft via fake login forms, page defacement, malware distribution, or performing actions on behalf of the victim. All user data accessible to the application is at risk.

Remediation Steps

Core principle: Never render untrusted input directly into executable browser contexts; ensure all untrusted data is output-encoded for its specific context so it remains data, not script.

Trace the Data Path

Analyze how untrusted data reaches web page output:

Source: Identify where untrusted data enters (user input, external files, databases, network requests, cookies, headers)
Data Flow: Trace transformations between source and output
Sink: Locate where data is rendered (response writing, template rendering, DOM manipulation)
Output Context: Determine where data appears (HTML body, attribute, JavaScript, CSS, URL)
Missing Encoding: Check for encoding/escaping functions (or their absence)

Apply Context-Aware Output Encoding (Primary Defense)

Always encode untrusted data based on output context:

Encoding by context:

HTML Body: HTML entity encoding (e.g., < → &lt;)
HTML Attributes: Attribute encoding (quote and escape special chars)
JavaScript: JavaScript encoding (escape quotes, backslashes, etc.)
URLs: URL encoding (percent-encode special characters)
CSS: CSS encoding (escape CSS special characters)

Critical rules:

Use framework-provided encoding functions, never write your own
Encode at output time, not at input time
Different contexts require different encoding
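As a rough illustration of the contexts listed above, the sketch below shows what encoding for three of them might look like in plain JavaScript. The names encodeForHtml, encodeForUrlParam, and encodeForJsString are hypothetical and exist only for this example. In line with the rules above, production code should use the encoders provided by the framework or a vetted library instead of these.

```javascript
// Illustrative encoders only (hypothetical names); prefer framework-provided ones.

// HTML body and quoted-attribute context: replace the five significant characters.
const HTML_ENTITIES = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#x27;' };

function encodeForHtml(value) {
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// URL parameter context: percent-encode a value before placing it in a query string.
function encodeForUrlParam(value) {
  return encodeURIComponent(String(value));
}

// JavaScript string context inside an inline script block: JSON.stringify escapes
// quotes and backslashes; the extra replacements keep "</script>" and the
// U+2028/U+2029 line separators from appearing literally in the output.
function encodeForJsString(value) {
  return JSON.stringify(String(value))
    .replace(/</g, '\\u003c')
    .replace(/>/g, '\\u003e')
    .replace(/&/g, '\\u0026')
    .replace(/\u2028/g, '\\u2028')
    .replace(/\u2029/g, '\\u2029');
}

const payload = `"><script>alert(1)</script>`;
console.log(encodeForHtml(payload));     // &quot;&gt;&lt;script&gt;alert(1)&lt;/script&gt;
console.log(encodeForUrlParam(payload)); // %22%3E%3Cscript%3Ealert(1)%3C%2Fscript%3E
console.log(encodeForJsString(payload));
```

The same payload yields three different encodings, which is why a single generic "sanitize" step cannot cover every output context.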

Use Safe APIs and Avoid Dangerous Functions

Leverage framework protections and avoid dangerous APIs:

Use safe defaults:

Template engines with auto-escaping (Thymeleaf, Razor, Jinja2, etc.)
Safe DOM manipulation (.textContent, not .innerHTML)
Framework data binding that auto-escapes

Avoid dangerous functions:

Never pass untrusted data to: eval(), .innerHTML, document.write()
Avoid framework "escape hatches" with untrusted data

Never Use Framework Security Bypasses with Untrusted Data

Modern frameworks have "escape hatches" that bypass XSS protection. Never use these with untrusted data:

React: dangerouslySetInnerHTML
Angular: bypassSecurityTrustHtml(), bypassSecurityTrustScript(), bypassSecurityTrustUrl()
Vue.js: v-html directive
Lit/Polymer: unsafeHTML(), htmlLiteral()
Jinja2: {{ data | safe }}
Thymeleaf: th:utext
Razor: @Html.Raw()

Key Principle: Any API with "unsafe", "raw", "bypass", "dangerously", or "trust" in the name is a security risk. Only use with:

Trusted, server-generated content (never untrusted data)
Content sanitized with DOMPurify or similar
Properly escaped content for the specific context

Add Input Validation and Content Security Policy (Defense in Depth)

Input Validation (supplementary control):

Validate expected data format (email, phone, numeric, etc.)
Use allowlists for enumerated values
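Reject input containing script tags or event handlers where such content is never expected (blocklists are easy to bypass, so treat this as a secondary check)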
Never rely solely on input validation - encoding is still required
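The format and allowlist checks above might be sketched as follows. The field names, the pattern, and the allowed values are hypothetical placeholders chosen for illustration. Validated values still need output encoding when rendered.

```javascript
// Hypothetical field rules; adjust the pattern and allowed values to the application.
const ALLOWED_SORT_ORDERS = new Set(['asc', 'desc']);
const NUMERIC_ID = /^[0-9]{1,10}$/;

// Allowlist check for an enumerated value: anything not listed is rejected.
function validateSortOrder(value) {
  return ALLOWED_SORT_ORDERS.has(value) ? value : null;
}

// Format check for a numeric identifier: only 1 to 10 ASCII digits are accepted.
function validateNumericId(value) {
  return typeof value === 'string' && NUMERIC_ID.test(value) ? Number(value) : null;
}

console.log(validateSortOrder('desc'));                          // desc
console.log(validateSortOrder('<script>'));                      // null
console.log(validateNumericId('42'));                            // 42
console.log(validateNumericId('42<svg onload=alert(1)>'));       // null
```

Returning null for rejected input lets the caller decide whether to respond with an error or fall back to a default.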

Content Security Policy (CSP):

Implement strict CSP header to prevent inline scripts
Disallow unsafe-inline and unsafe-eval
Use nonces or hashes for legitimate inline scripts
Restrict script sources to trusted domains only
CSP is defense-in-depth, not a replacement for encoding
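A nonce-based policy along the lines described above could look like the following sketch for a plain Node.js handler. The names createNonce, buildCspHeader, and handleRequest are assumptions made for this example, and the directive set is one reasonable starting point, not a complete policy for every application.

```javascript
const crypto = require('node:crypto');

// Generate a fresh, unpredictable nonce for every response.
function createNonce() {
  return crypto.randomBytes(16).toString('base64');
}

// Build a policy without unsafe-inline or unsafe-eval; inline scripts run
// only when they carry the matching nonce attribute.
function buildCspHeader(nonce) {
  return [
    "default-src 'self'",
    `script-src 'self' 'nonce-${nonce}'`,
    "object-src 'none'",
    "base-uri 'none'",
  ].join('; ');
}

// Hypothetical request handler: sets the header and tags its own inline script.
function handleRequest(req, res) {
  const nonce = createNonce();
  res.setHeader('Content-Type', 'text/html; charset=utf-8');
  res.setHeader('Content-Security-Policy', buildCspHeader(nonce));
  res.end(`<!doctype html><script nonce="${nonce}">console.log("allowed inline script")</script>`);
}
```

The handler can be passed to http.createServer(handleRequest). An injected inline script without the per-response nonce would be refused by a browser that enforces the policy.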

Test with XSS Payloads

Verify your encoding with attack vectors:

Basic XSS:

<script>alert(1)</script>
<img src=x onerror='alert(1)'>
<svg onload=alert(1)>

Context-specific payloads:

Attribute injection: " onclick="alert(1)"
JavaScript injection: '; alert(1); //
URL injection: javascript:alert(1)

Verification:

Load page with malicious inputs
Verify payloads displayed as text (not executed)
Check browser console for JavaScript errors
Use browser DevTools to inspect encoded output
Ensure legitimate functionality still works
Run automated scanners (OWASP ZAP, Burp Suite)

Framework-specific examples of the escape hatches listed above:

React

dangerouslySetInnerHTML - Bypasses React's XSS protection:

  // VULNERABLE
  <div dangerouslySetInnerHTML={{__html: userInput}} />

  // SAFE - React auto-escapes
  <div>{userInput}</div>

React does not block javascript: or data: URLs in attributes such as href - validate the URL scheme:

  // VULNERABLE
  <a href={userUrl}>Click</a>

  // SAFE - validate URL scheme
  const safeUrl = userUrl.startsWith('https://') || userUrl.startsWith('http://') 
                  ? userUrl : '#';
  <a href={safeUrl}>Click</a>
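The prefix check above covers absolute http(s) links. An alternative sketch, assuming the standard URL class is available (it is in modern browsers and Node.js), parses the value and compares the parsed protocol against an allowlist. This is a different technique from the startsWith check, and toSafeHref and ALLOWED_PROTOCOLS are hypothetical names used only here.

```javascript
// Allowlist of URL schemes considered safe for links (an assumption for this sketch).
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']);

// Returns a normalized absolute URL when the scheme is allowed, else the fallback.
function toSafeHref(userUrl, fallback = '#') {
  try {
    const parsed = new URL(String(userUrl).trim());
    // The parser lowercases the scheme, so "JaVaScRiPt:" is still caught.
    return ALLOWED_PROTOCOLS.has(parsed.protocol) ? parsed.href : fallback;
  } catch {
    // Relative or malformed input cannot be parsed without a base URL.
    return fallback;
  }
}

console.log(toSafeHref('https://example.com/a?b=1')); // https://example.com/a?b=1
console.log(toSafeHref('JaVaScRiPt:alert(1)'));       // #
console.log(toSafeHref('data:text/html,<script>alert(1)</script>')); // #
```

Note that this sketch rejects relative URLs. An application that needs them would have to pass a trusted base URL to the parser and decide separately how to treat the result.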

Angular

bypassSecurityTrustHtml() - Disables sanitization:

  // VULNERABLE
  this.sanitizer.bypassSecurityTrustHtml(userInput)

  // SAFE - let Angular sanitize automatically
  // Just bind to template: {{ userInput }}

bypassSecurityTrustScript() - Allows script execution
bypassSecurityTrustStyle() - Bypasses CSS sanitization
bypassSecurityTrustUrl() - Allows javascript: URLs
bypassSecurityTrustResourceUrl() - Allows unsafe resource loads

Vue.js

v-html directive - Renders raw HTML:

  <!-- VULNERABLE -->
  <div v-html="userInput"></div>

  <!-- SAFE - Vue auto-escapes -->
  <div>{{ userInput }}</div>

Lit / Polymer

unsafeHTML() - Bypasses Lit's HTML escaping:

  // VULNERABLE
  render() {
    return html`<div>${unsafeHTML(userInput)}</div>`;
  }

  // SAFE
  render() {
    return html`<div>${userInput}</div>`;
  }

Polymer inner-h-t-m-l attribute
htmlLiteral() function

Server-Side Templating

Jinja2 (Python/Flask):

{# VULNERABLE - the safe filter disables autoescaping for this value #}
{{ userInput | safe }}

{# SAFE - Default autoescape #}
{{ userInput }}

Thymeleaf (Java/Spring):

<!-- VULNERABLE - Unescaped -->
<div th:utext="${userInput}"></div>

<!-- SAFE - Escaped -->
<div th:text="${userInput}"></div>

Razor (.NET/ASP.NET):

@* VULNERABLE - Raw HTML *@
@Html.Raw(userInput)

@* SAFE - Auto-encoded *@
@userInput


Test Cases to Validate Remediation

Normal inputs: John Doe, test@example.com (should display correctly)
Special HTML characters: <div>, &, " (should be encoded, not interpreted)
XSS payloads:
- <script>alert('XSS')</script>
- <img src=x onerror='alert(1)'>
- "><script>alert(1)</script>
- javascript:alert(document.cookie)
- <svg onload=alert(1)>
Context-specific payloads:
- For attributes: " onclick="alert(1)"
- For JavaScript: '; alert(1); //
- For URLs: javascript:alert(1)

Verification Steps

Load the page with malicious inputs
Verify payloads are displayed as text (not executed)
Check browser console for JavaScript errors
Confirm business functionality still works
Use browser DevTools to inspect encoded output
Run automated XSS scanners (OWASP ZAP, Burp Suite)
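Part of the manual verification above can be automated as a regression check. The sketch below assumes an HTML body context only. renderGreeting is a hypothetical stand-in for the application's real rendering path, and findUnencodedPayloads is an illustrative helper, not part of any scanner.

```javascript
// Payloads taken from the test cases in this document.
const XSS_PAYLOADS = [
  `<script>alert('XSS')</script>`,
  `<img src=x onerror='alert(1)'>`,
  `"><script>alert(1)</script>`,
  `javascript:alert(document.cookie)`,
  `<svg onload=alert(1)>`,
  `" onclick="alert(1)"`,
  `'; alert(1); //`,
];

// Hypothetical rendering function under test; replace with the real one.
function renderGreeting(name) {
  const encoded = String(name).replace(/[&<>"']/g, (ch) => `&#${ch.charCodeAt(0)};`);
  return `<div>Welcome, ${encoded}</div>`;
}

// Returns the payloads that contain HTML-significant characters and still
// appear verbatim in the rendered output, meaning they were not encoded.
function findUnencodedPayloads(render, payloads) {
  return payloads.filter((p) => /[<>"']/.test(p) && render(p).includes(p));
}

console.log(findUnencodedPayloads(renderGreeting, XSS_PAYLOADS)); // prints []
```

A check like this catches missing HTML encoding only. Payloads such as javascript:alert(1) contain no HTML-significant characters, so URL and JavaScript contexts still need their own checks and the browser-based steps listed above.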

Common Vulnerable Patterns

Unencoded User Data in HTML Output (JavaScript)

// Direct output without encoding
response.write("<div>Welcome, " + username + "</div>")
// Attack: username = "<script>alert(document.cookie)</script>"
// Result: Script executes, stealing session cookies

// Using innerHTML with user data
element.innerHTML = userInput
// Attack: userInput = "<img src=x onerror='alert(1)'>"
// Result: JavaScript executes when image fails to load

Secure Patterns

HTML Entity Encoding and Safe DOM APIs (JavaScript)

// HTML-encode user data before output
response.write("<div>Welcome, " + htmlEncode(username) + "</div>")
// Result: "<script>..." becomes "&lt;script&gt;..." (displayed as text)

// Use safe DOM APIs
element.textContent = userInput
// Result: Content is treated as text, not HTML/script

Why this works: Output encoding transforms dangerous characters like <, >, &, ", and ' into their HTML entity equivalents (&lt;, &gt;, &amp;, &quot;, &#x27;), ensuring browsers interpret user input as text data rather than executable HTML or JavaScript code. Template systems with auto-escaping apply this encoding automatically at render time, eliminating the need for developers to remember to encode every variable. The .textContent DOM API treats all content as plain text (not HTML), preventing script execution even if the content contains <script> tags or event handlers. Context-aware encoding ensures data inserted into HTML attributes, JavaScript contexts, or URLs receives appropriate encoding for that context (HTML encoding for body content, JavaScript encoding for <script> blocks, URL encoding for href attributes). By treating all user input as untrusted data requiring encoding, this approach prevents XSS attacks where attackers inject malicious code to steal session cookies, perform actions on behalf of victims, or deface websites.

Language-Specific Guidance

For detailed, language-specific examples and framework-specific patterns:

C# - ASP.NET Core, Razor with automatic encoding
Java - Spring Boot, JSP, Thymeleaf with context-aware escaping
JavaScript/Node.js - Express, React, Vue, Angular with XSS prevention
Perl - CGI, Catalyst with HTML escaping
PHP - Laravel, Symfony with htmlspecialchars
Python - Flask, Django, Jinja2 with autoescaping

Dynamic Scan Guidance

For guidance on remediating this CWE when detected by dynamic (DAST) scanners:

Dynamic Scan Guidance - Analyzing DAST findings and mapping to source code