CWE-80: Improper Neutralization of Script-Related HTML Tags (Basic XSS) - JavaScript

Overview

Cross-Site Scripting (CWE-80) occurs when untrusted data is included in web pages without proper encoding. In JavaScript applications, this happens when user input is directly inserted into the DOM, innerHTML, or HTML attributes without sanitization. Attackers inject malicious scripts that execute in victim browsers, leading to session theft, credential harvesting, or defacement.

Primary Defense: Use textContent or innerText instead of innerHTML when displaying user input; rely on framework auto-escaping (React's JSX, Vue's templates, Angular's templates); sanitize with DOMPurify or a similar library when HTML rendering is necessary; deploy a Content Security Policy (CSP) that blocks inline scripts as a second layer; and validate user input against the format you expect.

Common Vulnerable Patterns

Direct innerHTML Assignment

// VULNERABLE - Direct user input to innerHTML
function displayUserComment(comment) {
    document.getElementById('comment').innerHTML = comment;
    // Attack: comment = "<img src=x onerror=alert(document.cookie)>"
}

// VULNERABLE - Template literal with user input
function showMessage(message) {
    document.body.innerHTML = `<div class="alert">${message}</div>`;
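    // Attack: message = "<img src=x onerror=alert('XSS')>"
    // (a <script> tag inserted via innerHTML is parsed but not executed,
    // so practical payloads use event-handler attributes such as onerror)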
}

DOM Manipulation with User Input

// VULNERABLE - Setting outerHTML
function updateContent(html) {
    document.querySelector('.content').outerHTML = html;
}

// VULNERABLE - insertAdjacentHTML
function addNotification(text) {
    document.querySelector('.notifications')
        .insertAdjacentHTML('beforeend', `<div>${text}</div>`);
}

Dynamic Script Creation

// VULNERABLE - Creating script elements
function loadUserScript(code) {
    const script = document.createElement('script');
    script.innerHTML = code; // Direct code injection
    document.body.appendChild(script);
}

// VULNERABLE - eval with user input
function processUserFunction(userCode) {
    eval(userCode); // Never use eval with user input
}

Attribute Injection

// VULNERABLE - Setting href with user input
function createLink(url, text) {
    const link = document.createElement('a');
    link.href = url; // Attack: url = "javascript:alert('XSS')"
    link.innerHTML = text; // Also vulnerable: text is parsed as HTML
    return link;
}

// VULNERABLE - Event handler injection
function setClickHandler(code) {
    document.querySelector('button').setAttribute('onclick', code);
}

Secure Patterns

Use textContent or Direct DOM Creation (Preferred)

// SECURE - textContent automatically escapes HTML
function displayUserComment(comment) {
    document.getElementById('comment').textContent = comment;
    // "<script>alert('XSS')</script>" is displayed as text, not executed
}

// SECURE - createElement + textContent for dynamic content
function showMessage(message) {
    const div = document.createElement('div');
    div.className = 'alert';
    div.textContent = message; // Safe - automatically escapes
    document.body.appendChild(div);
}

Why this works: The textContent property treats everything assigned to it as plain text, never as HTML. Special characters like <, >, and & are displayed literally rather than interpreted as markup, because the browser creates a text node and never runs the HTML parser on the value. Combined with createElement(), this builds the DOM without any string parsing, so there is nowhere for injected markup to take effect. The one exception is elements whose text is itself code, such as <script> or <style>: never assign user input to their textContent. This is the preferred approach for displaying user-generated content in modern JavaScript applications.

HTML Encoding Function (When innerHTML is Required)

// SECURE - Use only when you must use innerHTML with user data
function encodeHtml(value) {
    const text = document.createTextNode(value);
    const p = document.createElement('p');
    p.appendChild(text);
    return p.innerHTML;
}

// Example: Building HTML strings that will be set via innerHTML
function displayComment(comment) {
    const encoded = encodeHtml(comment);
    document.getElementById('comment').innerHTML = `<div class="comment">${encoded}</div>`;
}

// Note: Direct DOM creation is preferred over this approach when possible

Why this works: This technique leverages the browser's own HTML serializer. By creating a text node (which treats content as plain text) and then reading the innerHTML of its parent element, we make the browser convert &, <, and > into their HTML entity equivalents (&amp;, &lt;, &gt;). Quote characters (" and ') are not encoded, because they are harmless inside element content, so the result is safe between tags but must not be interpolated into an attribute value. This approach is useful when you need to manually construct HTML strings for legacy code that requires innerHTML, but it's less efficient than direct DOM manipulation because it creates temporary elements. Modern best practice is to use textContent or createElement() directly rather than building HTML strings, but when you must use innerHTML (e.g., integrating with legacy libraries), this encoding function provides a way to include user data in element content without external dependencies.
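
Where no DOM is available (for example in server-side rendering), or where a value ends up inside a quoted attribute, a string-based encoder is a common alternative. The following is a minimal sketch rather than part of the pattern above: the name escapeHtml is hypothetical, and unlike serializing a text node through innerHTML, it also encodes both quote characters.

```javascript
// Hypothetical helper: string-based HTML encoder that also covers quotes.
const HTML_ENTITIES = {
    '&': '&amp;',
    '<': '&lt;',
    '>': '&gt;',
    '"': '&quot;',
    "'": '&#39;'
};

function escapeHtml(value) {
    // Coerce first so numbers, null, and undefined do not throw.
    // A single pass avoids double-encoding the "&" of entities just produced.
    return String(value).replace(/[&<>"']/g, (ch) => HTML_ENTITIES[ch]);
}

// The encoded value cannot close the surrounding double-quoted attribute.
const title = escapeHtml('" onmouseover="alert(1)');
const markup = `<span title="${title}">hover</span>`;
```

The encoded string is only appropriate for HTML element content and quoted attribute values; URLs, inline scripts, and CSS need their own encoding.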

DOMPurify Library (Recommended for Rich Content)

// SECURE - Using DOMPurify for HTML sanitization
import DOMPurify from 'dompurify';

function displayRichContent(userHtml) {
    const clean = DOMPurify.sanitize(userHtml);
    document.getElementById('content').innerHTML = clean;
}

// Configure DOMPurify for specific use cases
function displayWithConfig(userHtml) {
    const clean = DOMPurify.sanitize(userHtml, {
        ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'a', 'p'],
        ALLOWED_ATTR: ['href'],
        ALLOW_DATA_ATTR: false
    });
    document.getElementById('content').innerHTML = clean;
}

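// Remove all HTML tags, keeping only their text
function displayTextOnly(userHtml) {
    const clean = DOMPurify.sanitize(userHtml, {
        ALLOWED_TAGS: [],
        KEEP_CONTENT: true
    });
    // clean is still an HTML string (e.g. "&" is returned as "&amp;"),
    // so assign it with innerHTML; textContent would show the entities literally
    document.getElementById('content').innerHTML = clean;
}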

Why this works: DOMPurify is a battle-tested HTML sanitizer that uses an allowlist approach: it parses the user's HTML, removes dangerous elements (like <script> and <iframe>) and dangerous attributes (like the onclick event handler), and returns only safe HTML. It's designed to handle complex attack vectors including mutation XSS (mXSS) and DOM clobbering. The library is actively maintained and updated as new browser quirks and attack techniques are discovered. By configuring ALLOWED_TAGS and ALLOWED_ATTR, you can precisely control what HTML is permitted, allowing rich formatting while blocking XSS. This is essential when you need to support user-provided HTML (like from WYSIWYG editors) but can't use plain textContent.

Safe DOM Manipulation

// SECURE - Building DOM safely
function createUserCard(user) {
    const card = document.createElement('div');
    card.className = 'user-card';

    const name = document.createElement('h3');
    name.textContent = user.name; // Safe

    const email = document.createElement('p');
    email.textContent = user.email; // Safe

    card.appendChild(name);
    card.appendChild(email);

    return card;
}

// SECURE - Using data attributes safely
function setDataAttribute(element, key, value) {
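    // Validate key to prevent attribute injection. dataset keys are camelCase
    // (userId maps to data-user-id); a key containing a hyphen followed by a
    // lowercase letter makes the dataset setter throw a SyntaxError
    if (!/^[a-z][a-zA-Z0-9]*$/.test(key)) {
        throw new Error('Invalid data attribute name');
    }
    element.dataset[key] = value; // Stored as attribute text, never parsed as HTML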
}

Why this works: This pattern uses DOM APIs (createElement, textContent, appendChild) that construct elements programmatically rather than parsing strings. When you call createElement('div'), you get a real DOM Element object, not a string representation, and setting textContent on that element stores the value as plain text. There is no parsing step in which an attacker could inject markup, because you directly create the DOM structure you intend. The dataset API is safe for the same reason: values are stored as attribute text, and validating the attribute name (with the regex check) rejects unexpected or malformed names. Unlike innerHTML, which parses strings and can activate embedded event handlers, this method gives you complete control over the DOM structure, which makes it the preferred way to build dynamic content in JavaScript.
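
The relationship between a dataset key and the attribute it produces can be sketched without a DOM. The function name toDataAttributeName is hypothetical, and the validation pattern is an assumption chosen to accept only camelCase keys; the conversion rule (each uppercase ASCII letter becomes a hyphen plus its lowercase form) follows how the HTML specification defines dataset.

```javascript
// Hypothetical helper: validate a dataset key and compute its attribute name.
function toDataAttributeName(key) {
    // Assumed policy: a lowercase letter followed by ASCII letters or digits.
    if (!/^[a-z][a-zA-Z0-9]*$/.test(key)) {
        throw new Error('Invalid data attribute name');
    }
    // userId -> data-user-id
    return 'data-' + key.replace(/[A-Z]/g, (ch) => '-' + ch.toLowerCase());
}

const attributeName = toDataAttributeName('userId');
```

Rejecting anything outside the pattern means that a key such as "x onclick" never reaches the DOM at all.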

URL Validation and Sanitization

// SECURE - Validate URLs before use
function createSafeLink(url, text) {
    const link = document.createElement('a');

    // Allowlist allowed protocols
    const allowedProtocols = ['http:', 'https:', 'mailto:'];

    try {
        const parsed = new URL(url, window.location.href);
        if (!allowedProtocols.includes(parsed.protocol)) {
            throw new Error('Disallowed protocol');
        }
        link.href = parsed.href;
    } catch (e) {
        // Invalid URL or disallowed protocol
        link.href = '#';
        link.addEventListener('click', (e) => e.preventDefault());
    }

    link.textContent = text; // Safe
    return link;
}

// SECURE - Sanitize URL parameters
function buildUrlWithParams(baseUrl, params) {
    const url = new URL(baseUrl);
    for (const [key, value] of Object.entries(params)) {
        url.searchParams.append(key, value); // Automatically URL-encodes
    }
    return url.toString();
}

Why this works: This pattern prevents javascript: URL attacks by validating the protocol before setting the href attribute. The URL constructor parses the URL and exposes its components (protocol, host, path, etc.), allowing us to check against an allowlist of safe protocols. If an attacker tries to inject javascript:alert('XSS'), the protocol check fails and we default to a safe # link with prevented navigation. The searchParams.append() method automatically URL-encodes parameter values, preventing injection of additional parameters or manipulation of the URL structure. This defense-in-depth approach combines validation (protocol allowlist) with safe API usage (URL constructor and searchParams).
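
Parsing before checking matters because the URL parser normalizes the scheme (case, leading whitespace, embedded tabs) in the same way the browser will. The sketch below illustrates this outside the DOM; isAllowedUrl and the https://example.com/ base are hypothetical names chosen for the example.

```javascript
const ALLOWED_PROTOCOLS = ['http:', 'https:', 'mailto:'];

// Hypothetical helper: true only when the parsed scheme is allowlisted.
function isAllowedUrl(candidate, base = 'https://example.com/') {
    try {
        const parsed = new URL(candidate, base);
        return ALLOWED_PROTOCOLS.includes(parsed.protocol);
    } catch {
        return false; // Unparseable input is rejected
    }
}

// The plain payload plus variants that a naive startsWith('javascript:')
// string check would miss; the parser reduces all of them to "javascript:".
const attempts = [
    'javascript:alert(1)',
    'JaVaScRiPt:alert(1)',
    '  javascript:alert(1)',
    'java\tscript:alert(1)'
];
const results = attempts.map((url) => isAllowedUrl(url));

// searchParams.append() percent-encodes reserved characters in values.
const search = new URL('https://example.com/search');
search.searchParams.append('q', 'a&b=c');
const encodedUrl = search.toString();
```

Relative inputs such as /profile resolve against the base and inherit its https: scheme, so they pass the check.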

Template Literals with Encoding

// SECURE - Template function with auto-escaping
function html(strings, ...values) {
    return strings.reduce((result, string, i) => {
        const value = values[i - 1];
        const encoded = value !== undefined ? encodeHtml(String(value)) : '';
        return result + encoded + string;
    });
}

// Usage
function displayUserInfo(name, bio) {
    const content = html`
        <div class="user-info">
            <h2>${name}</h2>
            <p>${bio}</p>
        </div>
    `;
    document.getElementById('profile').innerHTML = content;
}

Why this works: This pattern uses JavaScript tagged template literals to create an auto-escaping HTML builder function. Tagged templates let you process template strings before they're concatenated: the html function receives the static string parts separately from the interpolated values. By encoding each value with encodeHtml() before concatenation, all user data is escaped while the static HTML structure remains intact. The result resembles the auto-escaped interpolation of JSX ({value}) or Vue templates ({{ value }}) but works in vanilla JavaScript. The String(value) conversion ensures non-string values are handled too. Because the DOM-based encodeHtml() leaves quote characters unencoded, keep interpolations in element content (as in the usage above) and not inside attribute values. This approach combines the readability of template literals with the security of proper encoding, making it harder to accidentally introduce XSS when building HTML strings. However, direct DOM manipulation with createElement() and textContent is still preferred when possible, as it avoids string parsing entirely.
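
To make the separation of static parts and interpolated values concrete, here is a small sketch of what a tag function receives. The inspect function is hypothetical and exists only to expose the two arrays.

```javascript
// Hypothetical tag function that only reports what it was given.
function inspect(strings, ...values) {
    return { strings: Array.from(strings), values };
}

const name = '<b>Ann</b>';
const parts = inspect`<h2>${name}</h2>`;
// parts.strings holds the markup written by the developer: '<h2>' and '</h2>'
// parts.values holds the untrusted data, still unencoded: '<b>Ann</b>'
```

There is always exactly one more static string than there are values, which is why an escaping tag can safely interleave them.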

Framework-Specific Guidance

React

// SECURE - React automatically escapes values
function UserComment({ comment }) {
    return <div className="comment">{comment}</div>;
    // JSX auto-escapes, safe from XSS
}

// VULNERABLE - dangerouslySetInnerHTML
function RichComment({ html }) {
    return <div dangerouslySetInnerHTML={{ __html: html }} />;
    // Only use with sanitized content!
}

// SECURE - With DOMPurify
import DOMPurify from 'dompurify';

function SafeRichComment({ html }) {
    const clean = DOMPurify.sanitize(html);
    return <div dangerouslySetInnerHTML={{ __html: clean }} />;
}

Why this works: React's JSX syntax automatically escapes all values embedded in curly braces {}, converting special characters to HTML entities before rendering. This means that expressions like {comment} are inherently safe from XSS - React treats them as text content, not markup. The deliberate naming of dangerouslySetInnerHTML serves as a clear warning that you're bypassing React's protections. When you must render rich HTML, combining dangerouslySetInnerHTML with DOMPurify ensures the HTML is sanitized before React renders it, maintaining security. This secure-by-default design makes XSS much harder to introduce accidentally in React applications.

Vue.js

// SECURE - Vue automatically escapes
<template>
  <div class="comment">{{ userComment }}</div>
  <!-- Auto-escaped, safe -->
</template>

// VULNERABLE - v-html directive
<template>
  <div v-html="userComment"></div>
  <!-- Renders raw HTML, dangerous! -->
</template>

// SECURE - With DOMPurify
<template>
  <div v-html="sanitizedComment"></div>
</template>

<script>
import DOMPurify from 'dompurify';

export default {
  props: ['userComment'],
  computed: {
    sanitizedComment() {
      return DOMPurify.sanitize(this.userComment);
    }
  }
}
</script>

Why this works: Vue's mustache syntax {{ }} automatically escapes HTML, treating all interpolated values as plain text. This default behavior prevents XSS by ensuring user content is displayed, not executed. The v-html directive bypasses this protection and should be used sparingly. When you must render HTML (like from a rich text editor), the pattern shown here uses a computed property to sanitize the content with DOMPurify before passing it to v-html. This approach combines Vue's reactivity with DOMPurify's sanitization, ensuring that even if the user updates their input, it's re-sanitized automatically. The computed property caches the sanitized result for performance.

Angular

// SECURE - Angular sanitizes by default
@Component({
  template: `<div>{{ userComment }}</div>`
  // Auto-escaped by Angular
})

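// SECURE - Explicitly sanitizing HTML with DomSanitizer
import { SecurityContext } from '@angular/core';
import { DomSanitizer } from '@angular/platform-browser';

export class CommentComponent {
  safeHtml: string | null = null;

  constructor(private sanitizer: DomSanitizer) {}

  setSafeContent(html: string) {
    // sanitize() strips unsafe markup and returns a plain string (or null).
    // Do not confuse it with bypassSecurityTrustHtml(), which disables
    // sanitization and must only receive content that is already trusted.
    this.safeHtml = this.sanitizer.sanitize(SecurityContext.HTML, html);
  }
}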

Express.js (Server-Side)

// SECURE - Using template engines with auto-escaping
const express = require('express');
const app = express();

// Handlebars (auto-escapes by default)
app.set('view engine', 'hbs');
app.get('/user/:id', (req, res) => {
    res.render('profile', { 
        username: req.params.id // Auto-escaped in template
    });
});

// EJS (use <%= %> for escaped output)
app.set('view engine', 'ejs');
// In template: <%= username %> is safe
// VULNERABLE - <%- username %> renders raw HTML

// SECURE - Manual encoding in responses
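const encodeHtml = (value) => {
    // req.query values can be undefined, arrays, or objects, so coerce to a string first
    return String(value ?? '')
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#039;');
};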

app.get('/api/comment', (req, res) => {
    const comment = encodeHtml(req.query.text);
    res.send(`<div>${comment}</div>`);
});

Content Security Policy (CSP)

// SECURE - Implement CSP headers with a per-response nonce
const crypto = require('crypto');

// A nonce must be unpredictable and regenerated for every response;
// a fixed value such as 'nonce-random123' offers no protection
function generateNonce() {
    return crypto.randomBytes(16).toString('base64');
}

// In Express middleware
app.use((req, res, next) => {
    res.locals.nonce = generateNonce();
    res.setHeader(
        'Content-Security-Policy',
        "default-src 'self'; " +
        `script-src 'self' 'nonce-${res.locals.nonce}'; ` +
        "style-src 'self' 'unsafe-inline'; " +
        "img-src 'self' https:; " +
        "font-src 'self' data:; " +
        "connect-src 'self'; " +
        "frame-ancestors 'none'; " +
        "base-uri 'self'; " +
        "form-action 'self'"
    );
    next();
});

// In HTML template
// <script nonce="<%= nonce %>">...</script>

Security Best Practices

Input Validation

// SECURE - Validate input format
function validateEmail(email) {
    const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
    return emailRegex.test(email);
}

function processUserEmail(email) {
    if (!validateEmail(email)) {
        throw new Error('Invalid email format');
    }
    document.getElementById('email').textContent = email;
}

// SECURE - Allowlist allowed characters
function sanitizeUsername(username) {
    // Only allow alphanumeric and underscore
    return username.replace(/[^a-zA-Z0-9_]/g, '');
}

Context-Aware Encoding

// SECURE - Different encoding for different contexts
const Encoder = {
    // HTML context
    forHtml: (value) => {
        const text = document.createTextNode(value);
        const p = document.createElement('p');
        p.appendChild(text);
        return p.innerHTML;
    },

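    // JavaScript context (inside a double-quoted string literal)
    forJavaScript: (value) => {
        // JSON.stringify escapes double quotes, backslashes, and control characters.
        // Also escape "<" so "</script>" or "<!--" cannot end an inline script,
        // and the line separators U+2028/U+2029 for older engines.
        return JSON.stringify(String(value)).slice(1, -1)
            .replace(/</g, '\\u003c')
            .replace(/\u2028/g, '\\u2028')
            .replace(/\u2029/g, '\\u2029');
    },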

    // URL parameter context
    forUrl: (value) => {
        return encodeURIComponent(value);
    },

    // CSS context
    forCss: (value) => {
        return String(value).replace(/[^a-zA-Z0-9-_]/g, '\\$&');
    }
};

// Usage
function displayInContext(data) {
    // HTML context
    document.querySelector('.html-content').innerHTML = 
        Encoder.forHtml(data.message);

    // JavaScript context
    const script = document.createElement('script');
    script.textContent = `var userMsg = "${Encoder.forJavaScript(data.message)}";`;

    // URL context
    const link = document.createElement('a');
    link.href = `/search?q=${Encoder.forUrl(data.query)}`;
}
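
The JavaScript context deserves a closer look, because the HTML parser finds the end of an inline script before the JavaScript engine ever sees the string. The sketch below, using the hypothetical helper toScriptLiteral, shows one way to serialize a JSON-compatible value so that a "</script>" sequence in user data cannot close the element. It assumes the value is JSON-serializable.

```javascript
// Hypothetical helper: serialize a value for embedding in an inline <script>.
function toScriptLiteral(value) {
    return JSON.stringify(value)
        .replace(/</g, '\\u003c')       // blocks "</script>" and "<!--"
        .replace(/\u2028/g, '\\u2028')  // line separator
        .replace(/\u2029/g, '\\u2029'); // paragraph separator
}

const payload = '</script><script>alert(1)</script>';
const literal = toScriptLiteral(payload);
const inlineScript = `<script>var userMsg = ${literal};</script>`;

// The escaped literal still evaluates to the original string.
const roundTripped = JSON.parse(literal);
```

Because the escapes are valid inside both JSON and JavaScript string literals, the data survives unchanged while the only "</script>" left in the output is the one the developer wrote.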

Avoid Dangerous Functions

// VULNERABLE - Never use these with user input
eval(userInput);                    // Code execution
new Function(userInput)();          // Code execution
setTimeout(userInput, 1000);        // Code execution
setInterval(userInput, 1000);       // Code execution
element.innerHTML = userInput;      // HTML injection
document.write(userInput);          // HTML injection

// SECURE - Safe alternatives
// Instead of eval, use JSON.parse for data
const data = JSON.parse(userJsonString);

// Instead of innerHTML, use textContent
element.textContent = userInput;

// Instead of setTimeout with string, use function
setTimeout(() => safeFunction(userInput), 1000);
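
JSON.parse throws on anything that is not valid JSON, so callers should handle the failure path. A minimal sketch, with the hypothetical wrapper name parseUserJson:

```javascript
// Hypothetical wrapper: parse untrusted JSON text without ever executing it.
function parseUserJson(text) {
    try {
        return { ok: true, value: JSON.parse(text) };
    } catch (err) {
        // Code-like input is simply a syntax error to the JSON parser.
        return { ok: false, error: err.message };
    }
}

const good = parseUserJson('{"theme":"dark","size":12}');
const bad = parseUserJson('alert(document.cookie)');
// good.ok is true and good.value.theme is "dark"; bad.ok is false
```

Parsed values are still untrusted data, so they need the same output encoding as any other user input when they are rendered.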

Verification and Detection

Security testing requires multiple approaches - manual testing alone is insufficient.

Static Application Security Testing (SAST)

Use automated tools to detect XSS vulnerabilities in JavaScript code:

Commercial Tools:

Checkmarx - JavaScript/TypeScript XSS detection
Snyk Code - Real-time security scanning in IDE
Veracode - JavaScript security analysis

Open Source Tools:

ESLint with security plugins

npm install --save-dev eslint-plugin-security eslint-plugin-no-unsanitized

.eslintrc.json

{
  "plugins": ["security", "no-unsanitized"],
  "extends": ["plugin:security/recommended"],
  "rules": {
    "no-unsanitized/method": "error",
    "no-unsanitized/property": "error"
  }
}

Semgrep - Pattern-based security scanning

semgrep --config=p/javascript .
semgrep --config=p/xss .

SonarQube - Continuous code quality and security
```
npm install --save-dev sonarqube-scanner
```

Dynamic Application Security Testing (DAST)

Test running applications:

OWASP ZAP - Automated vulnerability scanner
Burp Suite - Manual and automated testing
Acunetix - XSS detection in SPAs
Arachni - Web application security scanner

Code Review Checklist

Manually verify:

No innerHTML with user input
textContent used instead of innerHTML where possible
All dynamic HTML uses DOMPurify or framework escaping
No eval(), Function(), or setTimeout(string) with user input
Event handlers don't use user input directly
URL attributes validated for javascript: protocol
CSP headers properly configured
Framework auto-escaping enabled (React JSX, Vue templates, Angular)

Framework-Specific Tools

React:

# ESLint React security plugin
npm install --save-dev eslint-plugin-react-security

Vue:

# Vue ESLint security rules
npm install --save-dev eslint-plugin-vue

Angular:

# Angular security linting
ng lint

Browser DevTools Testing

Manual security checks:

Inspect Element - View rendered HTML for unescaped content
Console - Check for CSP violations
Network Tab - Review response headers for security headers
Security Tab - Check certificate and security warnings

Limited Role of Automated Tests

Automated tests can verify encoding but can't guarantee security:

// Tests verify encoding works - NOT comprehensive security
const assert = require('assert');

function testEncodingFunction() {
    const malicious = '<script>alert("XSS")</script>';
    const encoded = encodeHtml(malicious);

    // Verify encoding works
    assert(!encoded.includes('<script>'));
    assert(encoded.includes('&lt;script&gt;'));
}
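
A slightly broader check runs several payload shapes through the encoder and reports any whose output still contains a raw metacharacter. This is a sketch under stated assumptions: the encoder is a string-based encodeHtml repeated here so the example is self-contained, findUnencoded is a hypothetical helper, and the payload list is illustrative, not exhaustive.

```javascript
// String-based encoder, repeated here so the sketch is self-contained.
function encodeHtml(value) {
    return String(value)
        .replace(/&/g, '&amp;')
        .replace(/</g, '&lt;')
        .replace(/>/g, '&gt;')
        .replace(/"/g, '&quot;')
        .replace(/'/g, '&#039;');
}

// Illustrative payloads only; a real attacker is not limited to this list.
const payloads = [
    '<script>alert(1)</script>',
    '<img src=x onerror=alert(1)>',
    '"><svg onload=alert(1)>',
    "' autofocus onfocus=alert(1) x='"
];

// Returns the inputs whose encoded form still contains <, >, " or '.
function findUnencoded(encode, inputs) {
    return inputs.filter((input) => /[<>"']/.test(encode(input)));
}

const failures = findUnencoded(encodeHtml, payloads); // expected to be empty
```

An empty result shows only that these inputs are encoded for the HTML context; it says nothing about other contexts or about code paths that bypass the encoder.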

Important: Passing tests does NOT mean your app is secure. Use SAST/DAST tools to find actual vulnerabilities.

End-to-End Testing

// Playwright test example (the same check can be written with Cypress)
const { test, expect } = require('@playwright/test');

test('XSS payload should be encoded', async ({ page }) => {
    const xssPayload = '<script>alert("XSS")</script>';

    await page.goto(`/profile?name=${encodeURIComponent(xssPayload)}`);

    const html = await page.content();
    expect(html).not.toContain('<script>alert');
    expect(html).toContain('&lt;script&gt;');
});

Continuous Security

CI/CD Integration - Run linters and SAST in GitHub Actions/GitLab CI

Pre-commit Hooks - Use Husky + lint-staged

npm install --save-dev husky lint-staged
npx husky init  # Husky v9+; v8 and earlier use: npx husky install

Dependency Scanning - Check for vulnerable packages
```
npm audit
npm audit fix
```

Security Headers - Verify CSP, X-Frame-Options, etc.
Penetration Testing - Professional security assessments

Dependencies and Installation

# DOMPurify (recommended for HTML sanitization)
npm install dompurify
npm install --save-dev @types/dompurify  # TypeScript types (only needed for DOMPurify releases that do not bundle their own)

# For older browsers, include polyfill
npm install core-js

<!-- CDN for DOMPurify -->
<script src="https://cdn.jsdelivr.net/npm/dompurify@3.0.6/dist/purify.min.js"></script>