CWE-80: Improper Neutralization of Script-Related HTML Tags (Basic XSS) - JavaScript
Overview
Cross-Site Scripting (CWE-80) occurs when untrusted data is included in web pages without proper encoding. In JavaScript applications, this happens when user input is directly inserted into the DOM, innerHTML, or HTML attributes without sanitization. Attackers inject malicious scripts that execute in victim browsers, leading to session theft, credential harvesting, or defacement.
Primary Defence: Use textContent or innerText instead of innerHTML when displaying user input, and rely on framework auto-escaping (React's JSX, Vue's templates, Angular's templates) wherever it is available. When HTML rendering is necessary, sanitize the markup with DOMPurify or a similar library. As additional layers, implement a Content Security Policy (CSP) that blocks inline scripts, and validate all user input.
Common Vulnerable Patterns
Direct innerHTML Assignment
// VULNERABLE - Direct user input to innerHTML
function displayUserComment(comment) {
document.getElementById('comment').innerHTML = comment;
// Attack: comment = "<img src=x onerror=alert(document.cookie)>"
}
// VULNERABLE - Template literal with user input
function showMessage(message) {
document.body.innerHTML = `<div class="alert">${message}</div>`;
// Attack: message = "<img src=x onerror=alert('XSS')>"
// (script tags inserted via innerHTML do not run, but event handler attributes do)
}
DOM Manipulation with User Input
// VULNERABLE - Setting outerHTML
function updateContent(html) {
document.querySelector('.content').outerHTML = html;
}
// VULNERABLE - insertAdjacentHTML
function addNotification(text) {
document.querySelector('.notifications')
.insertAdjacentHTML('beforeend', `<div>${text}</div>`);
}
Dynamic Script Creation
// VULNERABLE - Creating script elements
function loadUserScript(code) {
const script = document.createElement('script');
script.innerHTML = code; // Direct code injection
document.body.appendChild(script);
}
// VULNERABLE - eval with user input
function processUserFunction(userCode) {
eval(userCode); // Never use eval with user input
}
Attribute Injection
// VULNERABLE - Setting href with user input
function createLink(url, text) {
const link = document.createElement('a');
link.href = url; // Attack: url = "javascript:alert('XSS')"
link.innerHTML = text;
return link;
}
// VULNERABLE - Event handler injection
function setClickHandler(code) {
document.querySelector('button').setAttribute('onclick', code);
}
Secure Patterns
Use textContent or Direct DOM Creation (Preferred)
// SECURE - textContent automatically escapes HTML
function displayUserComment(comment) {
document.getElementById('comment').textContent = comment;
// "<script>alert('XSS')</script>" is displayed as text, not executed
}
// SECURE - createElement + textContent for dynamic content
function showMessage(message) {
const div = document.createElement('div');
div.className = 'alert';
div.textContent = message; // Safe - automatically escapes
document.body.appendChild(div);
}
Why this works: The textContent property automatically treats all content as plain text, never as HTML. This means special characters like <, >, and & are displayed literally rather than interpreted as markup. When you set textContent, the browser handles all escaping internally - there's no way for an attacker to inject executable code because the browser never parses the content as HTML. Combined with createElement(), this provides type-safe DOM manipulation that's immune to XSS. This is the preferred approach for displaying user-generated content in modern JavaScript applications.
HTML Encoding Function (When innerHTML is Required)
// SECURE - Use only when you must use innerHTML with user data
function encodeHtml(value) {
const text = document.createTextNode(value);
const p = document.createElement('p');
p.appendChild(text);
return p.innerHTML;
}
// Example: Building HTML strings that will be set via innerHTML
function displayComment(comment) {
const encoded = encodeHtml(comment);
document.getElementById('comment').innerHTML = `<div class="comment">${encoded}</div>`;
}
// Note: Direct DOM creation is preferred over this approach when possible
Why this works: This technique leverages the browser's own HTML serialization. Creating a text node (which treats content as plain text) and then reading the innerHTML of its parent element makes the browser convert &, <, and > into their HTML entity equivalents (&amp;, &lt;, &gt;). Text-node serialization does not convert " or ', so the encoded value is safe between tags, as in displayComment, but must not be placed inside an attribute value. This approach is useful when you need to manually construct HTML strings for legacy code that requires innerHTML, but it's less efficient than direct DOM manipulation because it creates temporary elements. Modern best practice is to use textContent or createElement() directly rather than building HTML strings, but when you must use innerHTML (e.g., integrating with legacy libraries), this encoding function provides a way to include user data in element content without external dependencies.
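Because text-node serialization leaves quotation marks alone, attribute values need an encoder that also covers " and '. The following is a minimal sketch of a string-based alternative that also runs without a DOM (for example in Node.js). The names escapeHtml and buildInput are hypothetical, and &#39; is just one valid choice of entity for the apostrophe.

```javascript
// Hypothetical helper: escapes the five characters that matter in HTML
// element content and in quoted attribute values.
const HTML_ESCAPES = {
  '&': '&amp;',
  '<': '&lt;',
  '>': '&gt;',
  '"': '&quot;',
  "'": '&#39;'
};

function escapeHtml(value) {
  // String() lets numbers, null, and undefined pass through without throwing.
  return String(value).replace(/[&<>"']/g, (ch) => HTML_ESCAPES[ch]);
}

// Attribute context: the value must stay inside quotes for this to be sufficient.
function buildInput(userValue) {
  return `<input type="text" value="${escapeHtml(userValue)}">`;
}

console.log(buildInput('" onfocus="alert(1)'));
// prints: <input type="text" value="&quot; onfocus=&quot;alert(1)">
```

Escaping alone does not make unquoted attributes, URL attributes such as href, or event handler attributes safe; those contexts need their own handling.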
DOMPurify Library (Recommended for Rich Content)
// SECURE - Using DOMPurify for HTML sanitization
import DOMPurify from 'dompurify';
function displayRichContent(userHtml) {
const clean = DOMPurify.sanitize(userHtml);
document.getElementById('content').innerHTML = clean;
}
// Configure DOMPurify for specific use cases
function displayWithConfig(userHtml) {
const clean = DOMPurify.sanitize(userHtml, {
ALLOWED_TAGS: ['b', 'i', 'em', 'strong', 'a', 'p'],
ALLOWED_ATTR: ['href'],
ALLOW_DATA_ATTR: false
});
document.getElementById('content').innerHTML = clean;
}
// Remove all HTML tags
function displayTextOnly(userHtml) {
const clean = DOMPurify.sanitize(userHtml, {
ALLOWED_TAGS: [],
KEEP_CONTENT: true
});
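// The result has no tags but is still entity-encoded HTML (e.g. &amp;),
// so assign it via innerHTML; textContent would show the entities literally
document.getElementById('content').innerHTML = clean;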
}
Why this works: DOMPurify is a widely deployed HTML sanitizer built on an allowlist approach: it parses the user's HTML, removes dangerous elements and attributes (such as <script>, <iframe>, and event handlers like onclick), and returns only safe HTML. It's designed to handle complex attack vectors including mutation XSS (mXSS) and DOM clobbering. The library is actively maintained and updated as new browser quirks and attack techniques are discovered. By configuring ALLOWED_TAGS and ALLOWED_ATTR, you can precisely control what HTML is permitted, allowing rich formatting while blocking XSS. This is essential when you need to support user-provided HTML (like from WYSIWYG editors) but can't use plain textContent.
Safe DOM Manipulation
// SECURE - Building DOM safely
function createUserCard(user) {
const card = document.createElement('div');
card.className = 'user-card';
const name = document.createElement('h3');
name.textContent = user.name; // Safe
const email = document.createElement('p');
email.textContent = user.email; // Safe
card.appendChild(name);
card.appendChild(email);
return card;
}
// SECURE - Using data attributes safely
function setDataAttribute(element, key, value) {
// dataset keys are camelCase (userId maps to data-user-id); a hyphen followed
// by a lowercase letter makes the dataset setter throw, so reject hyphens
if (!/^[a-z][a-zA-Z0-9]*$/.test(key)) {
throw new Error('Invalid data attribute name');
}
element.dataset[key] = value; // Stored as a string, never parsed as HTML
}
Why this works: This pattern uses type-safe DOM APIs (createElement, textContent, appendChild) that construct HTML elements programmatically rather than parsing strings. When you call createElement('div'), you get a real DOM Element object, not a string representation. Setting textContent on that element automatically escapes any special characters, treating the value as plain text. This approach is immune to XSS because there's no parsing step where an attacker could inject malicious markup - you're directly creating the DOM structure you intend. The dataset API for data attributes is also safe because it treats values as text, but validating the attribute name (with the regex check) prevents attackers from injecting arbitrary attributes. This is the gold standard for dynamic HTML in JavaScript: it's type-safe, performant, readable, and inherently secure. Unlike innerHTML which parses strings and can execute embedded scripts, this method gives you complete control over the DOM structure.
URL Validation and Sanitization
// SECURE - Validate URLs before use
function createSafeLink(url, text) {
const link = document.createElement('a');
// Allowlist allowed protocols
const allowedProtocols = ['http:', 'https:', 'mailto:'];
try {
const parsed = new URL(url, window.location.href);
if (!allowedProtocols.includes(parsed.protocol)) {
throw new Error('Disallowed protocol');
}
link.href = parsed.href;
} catch (e) {
// Invalid URL or disallowed protocol
link.href = '#';
link.addEventListener('click', (e) => e.preventDefault());
}
link.textContent = text; // Safe
return link;
}
// SECURE - Sanitize URL parameters
function buildUrlWithParams(baseUrl, params) {
const url = new URL(baseUrl);
for (const [key, value] of Object.entries(params)) {
url.searchParams.append(key, value); // Automatically URL-encodes
}
return url.toString();
}
Why this works: This pattern prevents javascript: URL attacks by validating the protocol before setting the href attribute. The URL constructor parses the URL and exposes its components (protocol, host, path, etc.), allowing us to check against an allowlist of safe protocols. If an attacker tries to inject javascript:alert('XSS'), the protocol check fails and we default to a safe # link with prevented navigation. The searchParams.append() method automatically URL-encodes parameter values, preventing injection of additional parameters or manipulation of the URL structure. This defense-in-depth approach combines validation (protocol allowlist) with safe API usage (URL constructor and searchParams).
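The protocol check can also be exercised outside the browser, which makes it easy to see how the URL parser normalizes obfuscated schemes before the allowlist is consulted. This is a minimal sketch: isSafeUrl is a hypothetical helper, and the default base of https://example.com/ stands in for window.location.href.

```javascript
// Hypothetical helper mirroring the protocol allowlist used in createSafeLink.
const ALLOWED_PROTOCOLS = new Set(['http:', 'https:', 'mailto:']);

function isSafeUrl(candidate, base = 'https://example.com/') {
  try {
    // The URL parser lowercases the scheme and strips leading whitespace and
    // embedded tabs or newlines, so obfuscated schemes are normalized first.
    const parsed = new URL(candidate, base);
    return ALLOWED_PROTOCOLS.has(parsed.protocol);
  } catch {
    // Unparseable input is treated as unsafe.
    return false;
  }
}

const samples = [
  '/profile?id=7',
  'https://example.org/docs',
  'JaVaScRiPt:alert(1)',
  ' javascript:alert(1)',
  'java\tscript:alert(1)',
  'data:text/html,<script>alert(1)</script>'
];

for (const sample of samples) {
  console.log(JSON.stringify(sample), '->', isSafeUrl(sample));
}
```

Only the first two samples pass. Checking the parsed protocol rather than matching on the raw string is the design choice that defeats the mixed-case and whitespace variants.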
Template Literals with Encoding
// SECURE - Template function with auto-escaping
function html(strings, ...values) {
return strings.reduce((result, string, i) => {
const value = values[i - 1];
const encoded = value !== undefined ? encodeHtml(String(value)) : '';
return result + encoded + string;
});
}
// Usage
function displayUserInfo(name, bio) {
const content = html`
<div class="user-info">
<h2>${name}</h2>
<p>${bio}</p>
</div>
`;
document.getElementById('profile').innerHTML = content;
}
Why this works: This pattern uses JavaScript tagged template literals to create an auto-escaping HTML builder function. Tagged templates let you process template strings before they're concatenated: the html function receives the static string parts separately from the interpolated values. By encoding each value with encodeHtml() before concatenation, all user data is automatically escaped while the static HTML structure remains intact. This provides syntax similar to JSX or Vue templates (where ${value} expressions are auto-escaped) but works in vanilla JavaScript. The String(value) conversion ensures even non-string values are handled safely. This approach combines the readability of template literals with the security of proper encoding, making it harder to accidentally introduce XSS when building HTML strings. However, direct DOM manipulation with createElement() and textContent is still preferred when possible, as it avoids string parsing entirely.
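A variant of the same tagged-template idea can be sketched with a string-based encoder, so that interpolations inside quoted attributes are covered as well and the function runs without a DOM. The escapeHtml helper here is an assumption for illustration, not the DOM-based encodeHtml defined earlier, and rendering null or undefined as an empty string is a design choice of this sketch.

```javascript
// Assumed string-based encoder (covers quotes, unlike text-node serialization).
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// Tagged template: static parts pass through, interpolated values are escaped.
function html(strings, ...values) {
  let result = strings[0];
  values.forEach((value, i) => {
    // null and undefined render as empty strings rather than "null"/"undefined".
    result += (value == null ? '' : escapeHtml(value)) + strings[i + 1];
  });
  return result;
}

const name = '<img src=x onerror=alert(1)>';
const title = 'He said "hi"';
const markup = html`<h2 title="${title}">${name}</h2>`;

console.log(markup);
// prints: <h2 title="He said &quot;hi&quot;">&lt;img src=x onerror=alert(1)&gt;</h2>
```

Interpolations must still sit either between tags or inside quoted attribute values; the tag cannot make script blocks, URLs, or unquoted attributes safe.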
Framework-Specific Guidance
React
// SECURE - React automatically escapes values
function UserComment({ comment }) {
return <div className="comment">{comment}</div>;
// JSX auto-escapes, safe from XSS
}
// VULNERABLE - dangerouslySetInnerHTML
function RichComment({ html }) {
return <div dangerouslySetInnerHTML={{ __html: html }} />;
// Only use with sanitized content!
}
// SECURE - With DOMPurify
import DOMPurify from 'dompurify';
function SafeRichComment({ html }) {
const clean = DOMPurify.sanitize(html);
return <div dangerouslySetInnerHTML={{ __html: clean }} />;
}
Why this works: React's JSX escapes all values embedded in curly braces {} before rendering them, so expressions like {comment} are treated as text content, not markup. The deliberate naming of dangerouslySetInnerHTML serves as a clear warning that you're bypassing React's protections. When you must render rich HTML, combining dangerouslySetInnerHTML with DOMPurify ensures the HTML is sanitized before React renders it. Escaping does not validate URLs, so a user-supplied href or src still needs the protocol allowlist described under URL Validation and Sanitization. This secure-by-default design makes XSS much harder to introduce accidentally in React applications.
Vue.js
// SECURE - Vue automatically escapes
<template>
<div class="comment">{{ userComment }}</div>
<!-- Auto-escaped, safe -->
</template>
// VULNERABLE - v-html directive
<template>
<div v-html="userComment"></div>
<!-- Renders raw HTML, dangerous! -->
</template>
// SECURE - With DOMPurify
<template>
<div v-html="sanitizedComment"></div>
</template>
<script>
import DOMPurify from 'dompurify';
export default {
props: ['userComment'],
computed: {
sanitizedComment() {
return DOMPurify.sanitize(this.userComment);
}
}
}
</script>
Why this works: Vue's mustache syntax {{ }} automatically escapes HTML, treating all interpolated values as plain text. This default behavior prevents XSS by ensuring user content is displayed, not executed. The v-html directive bypasses this protection and should be used sparingly. When you must render HTML (like from a rich text editor), the pattern shown here uses a computed property to sanitize the content with DOMPurify before passing it to v-html. This approach combines Vue's reactivity with DOMPurify's sanitization, ensuring that even if the user updates their input, it's re-sanitized automatically. The computed property caches the sanitized result for performance.
Angular
// SECURE - Angular sanitizes by default
@Component({
template: `<div>{{ userComment }}</div>`
// Auto-escaped by Angular
})
// SECURE - Using DomSanitizer to sanitize untrusted HTML
import { SecurityContext } from '@angular/core';
import { DomSanitizer } from '@angular/platform-browser';
export class CommentComponent {
safeHtml: string | null = null;
constructor(private sanitizer: DomSanitizer) {}
setSafeContent(html: string) {
// sanitize() strips unsafe markup and returns a plain string (or null).
// Avoid bypassSecurityTrustHtml() unless the content is already sanitized.
this.safeHtml = this.sanitizer.sanitize(SecurityContext.HTML, html);
}
}
Express.js (Server-Side)
// SECURE - Using template engines with auto-escaping
const express = require('express');
const app = express();
// Handlebars (auto-escapes by default)
app.set('view engine', 'hbs');
app.get('/user/:id', (req, res) => {
res.render('profile', {
username: req.params.id // Auto-escaped in template
});
});
// EJS (use <%= %> for escaped output)
app.set('view engine', 'ejs');
// In template: <%= username %> is safe
// VULNERABLE - <%- username %> renders raw HTML
// SECURE - Manual encoding in responses
const encodeHtml = (str) => {
// String() guards against missing or non-string query values
return String(str)
.replace(/&/g, '&amp;')
.replace(/</g, '&lt;')
.replace(/>/g, '&gt;')
.replace(/"/g, '&quot;')
.replace(/'/g, '&#39;');
};
app.get('/api/comment', (req, res) => {
const comment = encodeHtml(req.query.text);
res.send(`<div>${comment}</div>`);
});
Content Security Policy (CSP)
// SECURE - Implement CSP headers with a per-request nonce
const crypto = require('crypto');
function generateNonce() {
return crypto.randomBytes(16).toString('base64');
}
// In Express middleware
app.use((req, res, next) => {
// A nonce must be unpredictable and regenerated for every response;
// a fixed value such as 'nonce-random123' gives no protection
res.locals.nonce = generateNonce();
res.setHeader(
'Content-Security-Policy',
"default-src 'self'; " +
`script-src 'self' 'nonce-${res.locals.nonce}'; ` +
"style-src 'self' 'unsafe-inline'; " +
"img-src 'self' https:; " +
"font-src 'self' data:; " +
"connect-src 'self'; " +
"frame-ancestors 'none'; " +
"base-uri 'self'; " +
"form-action 'self'"
);
next();
});
// In HTML template (res.locals.nonce is exposed to views as nonce)
// <script nonce="<%= nonce %>">...</script>
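Assembling the header from a directive map can make a policy easier to review than a concatenated string. The sketch below is one possible shape, not a prescribed one: buildCsp and cspForResponse are hypothetical helpers, and the directive set is illustrative.

```javascript
const crypto = require('crypto');

// Hypothetical helper: turns a directive map into a header value.
function buildCsp(directives) {
  return Object.entries(directives)
    .map(([name, sources]) => `${name} ${sources.join(' ')}`)
    .join('; ');
}

// Returns a fresh nonce and the matching header value for one response.
function cspForResponse() {
  const nonce = crypto.randomBytes(16).toString('base64');
  const header = buildCsp({
    'default-src': ["'self'"],
    'script-src': ["'self'", `'nonce-${nonce}'`],
    'object-src': ["'none'"],
    'base-uri': ["'self'"],
    'frame-ancestors': ["'none'"]
  });
  return { nonce, header };
}

const first = cspForResponse();
const second = cspForResponse();

console.log(first.header);
console.log('nonces differ:', first.nonce !== second.nonce);
```

In an Express middleware, the returned nonce would go into res.locals and the header into res.setHeader, once per request.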
Security Best Practices
Input Validation
// SECURE - Validate input format
function validateEmail(email) {
const emailRegex = /^[^\s@]+@[^\s@]+\.[^\s@]+$/;
return emailRegex.test(email);
}
function processUserEmail(email) {
if (!validateEmail(email)) {
throw new Error('Invalid email format');
}
document.getElementById('email').textContent = email;
}
// SECURE - Allowlist allowed characters
function sanitizeUsername(username) {
// Only allow alphanumeric and underscore
return username.replace(/[^a-zA-Z0-9_]/g, '');
}
Context-Aware Encoding
// SECURE - Different encoding for different contexts
const Encoder = {
// HTML context
forHtml: (value) => {
const text = document.createTextNode(value);
const p = document.createElement('p');
p.appendChild(text);
return p.innerHTML;
},
// JavaScript context
forJavaScript: (value) => {
return JSON.stringify(String(value)).slice(1, -1);
},
// URL parameter context
forUrl: (value) => {
return encodeURIComponent(value);
},
// CSS context
forCss: (value) => {
return String(value).replace(/[^a-zA-Z0-9-_]/g, '\\$&');
}
};
// Usage
function displayInContext(data) {
// HTML context
document.querySelector('.html-content').innerHTML =
Encoder.forHtml(data.message);
// JavaScript context
const script = document.createElement('script');
script.textContent = `var userMsg = "${Encoder.forJavaScript(data.message)}";`;
// URL context
const link = document.createElement('a');
link.href = `/search?q=${Encoder.forUrl(data.query)}`;
}
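The forJavaScript encoder above is applied through a script element's textContent, where the DOM never re-parses the text as HTML. When a server inlines data into a <script> block of an HTML response instead, JSON.stringify alone is not enough, because a literal </script> in the data ends the block. The following is a minimal sketch of one common mitigation, escaping <, >, &, U+2028, and U+2029 as \u sequences. The names jsonForScript and window.__DATA__ are hypothetical, and the value is assumed to be JSON-serializable.

```javascript
// Hypothetical helper for values inlined into a <script> block by the server.
const SCRIPT_UNSAFE = /[<>&\u2028\u2029]/g;

function jsonForScript(value) {
  // Replace each risky character with its \uXXXX escape; the result is
  // still valid JSON and a valid JavaScript expression.
  return JSON.stringify(value).replace(
    SCRIPT_UNSAFE,
    (ch) => '\\u' + ch.charCodeAt(0).toString(16).padStart(4, '0')
  );
}

const payload = { message: '</script><script>alert(1)</script>' };
const inlined = jsonForScript(payload);

console.log(`<script>window.__DATA__ = ${inlined};</script>`);
// The escaped text decodes back to the original value.
console.log(JSON.parse(inlined).message === payload.message);
```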
Avoid Dangerous Functions
// VULNERABLE - Never use these with user input
eval(userInput); // Code execution
new Function(userInput)(); // Code execution
setTimeout(userInput, 1000); // Code execution
setInterval(userInput, 1000); // Code execution
element.innerHTML = userInput; // HTML injection
document.write(userInput); // HTML injection
// SECURE - Safe alternatives
// Instead of eval, use JSON.parse for data
const data = JSON.parse(userJsonString);
// Instead of innerHTML, use textContent
element.textContent = userInput;
// Instead of setTimeout with string, use function
setTimeout(() => safeFunction(userInput), 1000);
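When user input is meant to select behavior rather than carry data, an allowlisted lookup table is one way to avoid eval and new Function entirely. The operation names and the runOperation helper below are hypothetical; the sketch only shows the dispatch pattern.

```javascript
// Hypothetical allowlist of operations a user may request by name.
// A Map is used so inherited keys such as "constructor" never match.
const OPERATIONS = new Map([
  ['add', (a, b) => a + b],
  ['subtract', (a, b) => a - b],
  ['multiply', (a, b) => a * b]
]);

function runOperation(name, a, b) {
  const operation = OPERATIONS.get(name);
  if (!operation) {
    throw new Error(`Unsupported operation: ${name}`);
  }
  if (!Number.isFinite(a) || !Number.isFinite(b)) {
    throw new TypeError('Operands must be finite numbers');
  }
  return operation(a, b);
}

console.log(runOperation('add', 2, 3)); // prints 5
```

The user controls only which predefined function runs and with which validated operands, never the code itself.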
Verification and Detection
Security testing requires multiple approaches - manual testing alone is insufficient.
Static Application Security Testing (SAST)
Use automated tools to detect XSS vulnerabilities in JavaScript code:
Commercial Tools:
- Checkmarx - JavaScript/TypeScript XSS detection
- Snyk Code - Real-time security scanning in IDE
- Veracode - JavaScript security analysis
Open Source Tools:
- ESLint with security plugins
- Semgrep - Pattern-based security scanning
- SonarQube - Continuous code quality and security
Dynamic Application Security Testing (DAST)
Test running applications:
- OWASP ZAP - Automated vulnerability scanner
- Burp Suite - Manual and automated testing
- Acunetix - XSS detection in SPAs
- Arachni - Web application security scanner
Code Review Checklist
Manually verify:
- No innerHTML with user input
- textContent used instead of innerHTML where possible
- All dynamic HTML uses DOMPurify or framework escaping
- No eval(), Function(), or setTimeout(string) with user input
- Event handlers don't use user input directly
- URL attributes validated for javascript: protocol
- CSP headers properly configured
- Framework auto-escaping enabled (React JSX, Vue templates, Angular)
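As a rough aid during review, a few lines of script can flag the sinks named in the checklist so a reviewer knows where to look first. This is a deliberately crude sketch with hypothetical names (SINK_PATTERNS, findSinks). It matches text, not data flow, so it produces false positives and misses indirect usage, and it is no substitute for the SAST tools listed above.

```javascript
// Hypothetical patterns for sinks named in the checklist; deliberately crude.
const SINK_PATTERNS = [
  { name: 'innerHTML/outerHTML assignment', regex: /\.(innerHTML|outerHTML)\s*=/ },
  { name: 'insertAdjacentHTML', regex: /\.insertAdjacentHTML\s*\(/ },
  { name: 'document.write', regex: /document\.write(ln)?\s*\(/ },
  { name: 'eval', regex: /\beval\s*\(/ },
  { name: 'Function constructor', regex: /\bnew\s+Function\s*\(/ },
  { name: 'dangerouslySetInnerHTML', regex: /dangerouslySetInnerHTML/ },
  { name: 'v-html', regex: /\bv-html\s*=/ }
];

// Returns one finding per (line, pattern) match, with 1-based line numbers.
function findSinks(source) {
  const findings = [];
  source.split('\n').forEach((text, index) => {
    for (const { name, regex } of SINK_PATTERNS) {
      if (regex.test(text)) {
        findings.push({ line: index + 1, sink: name });
      }
    }
  });
  return findings;
}

const sample = [
  'el.textContent = input;',
  'el.innerHTML = input;',
  'eval(code);'
].join('\n');

console.log(findSinks(sample));
```

Each finding still needs a human to decide whether the value reaching the sink is user-controlled.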
Framework-Specific Tools
React:
Vue:
Angular:
Browser DevTools Testing
Manual security checks:
- Inspect Element - View rendered HTML for unescaped content
- Console - Check for CSP violations
- Network Tab - Review response headers for security headers
- Security Tab - Check certificate and security warnings
Limited Role of Automated Tests
Automated tests can verify encoding but can't guarantee security:
// Tests verify encoding works - NOT comprehensive security
const assert = require('assert');
function testEncodingFunction() {
const malicious = '<script>alert("XSS")</script>';
const encoded = encodeHtml(malicious);
// Verify encoding works
assert(!encoded.includes('<script>'));
assert(encoded.includes('&lt;script&gt;'));
}
Important: Passing tests does NOT mean your app is secure. Use SAST/DAST tools to find actual vulnerabilities.
End-to-End Testing
// Playwright test example
const { test, expect } = require('@playwright/test');
test('XSS payload should be encoded', async ({ page }) => {
const xssPayload = '<script>alert("XSS")</script>';
// Relative URL assumes baseURL is set in the Playwright config
await page.goto(`/profile?name=${encodeURIComponent(xssPayload)}`);
const html = await page.content();
expect(html).not.toContain('<script>alert');
expect(html).toContain('&lt;script&gt;');
});
Continuous Security
- CI/CD Integration - Run linters and SAST in GitHub Actions/GitLab CI
- Pre-commit Hooks - Use Husky + lint-staged
- Dependency Scanning - Check for vulnerable packages
- Security Headers - Verify CSP, X-Frame-Options, etc.
- Penetration Testing - Professional security assessments
Dependencies and Installation
# DOMPurify (recommended for HTML sanitization)
npm install dompurify
npm install --save-dev @types/dompurify # TypeScript types
# For older browsers, include polyfill
npm install core-js
<!-- CDN for DOMPurify -->
<script src="https://cdn.jsdelivr.net/npm/dompurify@3.0.6/dist/purify.min.js"></script>