CWE-91: XML Injection

Overview

XML Injection occurs when untrusted user input is used to construct XML documents or queries without proper validation or escaping, allowing attackers to modify the structure or content of XML data.

OWASP Classification

A05:2025 - Injection

Risk

High: Attackers can alter application logic, bypass authentication, or trigger denial of service by injecting malicious XML. This can lead to data corruption, unauthorized access, or system compromise.

Remediation Steps

Core principle: Never construct XML by concatenating untrusted input; use XML libraries that treat user input as data and automatically escape it so it cannot alter document structure.

Trace the Data Path and Locate the Vulnerability

Analyze how untrusted data reaches XML construction:

Review the flaw details to identify the specific file, line number, and code pattern
Source: Where untrusted data enters (user input, external file, database, network request)
Sink: XML document construction or serialization
String concatenation: Look for untrusted data being concatenated into XML strings
Understand the data flow from source to sink

Use XML Libraries with Built-in Escaping (Primary Defense)

Replace string concatenation with safe XML API usage:

Use XML libraries that automatically escape or encode XML data
Avoid manual string concatenation when building XML
Use DOM builders or streaming XML writers (e.g., ElementTree, XmlWriter, xmlbuilder2)
Let the library handle character escaping (e.g., < → &lt;, > → &gt;)
Follow the secure code examples provided in language-specific guidance
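
As a minimal sketch of the points above, assuming Python's standard-library ElementTree (one of the builders named in the list), the snippet below sets both an attribute and element text from untrusted values. The helper name build_user_element and the sample values are hypothetical and only serve the illustration.

```python
import xml.etree.ElementTree as ET


def build_user_element(name: str, role: str) -> str:
    """Build <user role="..."><name>...</name></user> through the library API."""
    user = ET.Element("user")
    user.set("role", role)                    # attribute value, escaped on output
    ET.SubElement(user, "name").text = name   # text node, escaped on output
    return ET.tostring(user, encoding="unicode")


# Hypothetical hostile values that try to add an attribute and an element.
serialized = build_user_element(
    name="</name><admin>true</admin>",
    role='guest" admin="true',
)

# Parsing the result back yields the original strings as plain data.
parsed = ET.fromstring(serialized)
```

Because the values are assigned through the API rather than spliced into markup, the parsed document still has a single name child and a single role attribute.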

Validate and Sanitize Input (Defense in Depth)

Add input validation as an additional security layer:

Enforce strict type, length, and format checks on all untrusted data
Validate expected patterns before XML construction
Reject or escape special XML characters (<, >, &, ', ")
Use allowlists for enumerated values
Enforce length limits to reduce the risk of denial of service (DoS)
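
The checks listed above might look like the following sketch. The field names, the 64-character limit, the character pattern, and the role allowlist are all assumptions chosen for illustration; real limits depend on the application's data model.

```python
import re

# Hypothetical rules: adjust to the formats the application actually expects.
ALLOWED_ROLES = {"viewer", "editor", "admin"}
USERNAME_PATTERN = re.compile(r"[A-Za-z0-9_.-]{1,64}")


def validate_username(value: object) -> str:
    """Type, length, and format check; rejects XML metacharacters by omission."""
    if not isinstance(value, str):
        raise ValueError("username must be a string")
    if not USERNAME_PATTERN.fullmatch(value):
        raise ValueError("username has an unexpected length or characters")
    return value


def validate_role(value: object) -> str:
    """Allowlist check for an enumerated value."""
    if not isinstance(value, str) or value not in ALLOWED_ROLES:
        raise ValueError("role is not in the allowlist")
    return value
```

Validation of this kind complements library-based XML construction; it does not replace it, since values that pass validation should still be written through an escaping API.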

Disable Dangerous XML Features

Harden XML parsers to prevent related attacks:

Disable external entity resolution (prevents XXE attacks)
Disable DTD processing if not required
Do not resolve external references (external DTDs, schemas, XInclude) during XML processing
Set secure defaults on XML parsers
Use hardened libraries like defusedxml (Python)
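
Where a hardened library such as defusedxml is not available, one possible standard-library approach is to refuse any document that carries a DOCTYPE before parsing it. The sketch below is an assumption-laden illustration, not a substitute for a hardened parser: the names DTDForbidden and parse_untrusted are hypothetical, and the input is parsed twice for simplicity.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DTDForbidden(ValueError):
    """Raised when untrusted XML contains a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    raise DTDForbidden(f"DOCTYPE declarations are not allowed (found {name!r})")


def parse_untrusted(xml_text: str) -> ET.Element:
    # First pass: expat invokes the handler when it meets <!DOCTYPE ...>,
    # and the exception raised there propagates out of Parse().
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(xml_text, True)
    # Second pass: the document has no DTD, so it cannot declare entities.
    return ET.fromstring(xml_text)
```

Rejecting the DOCTYPE outright covers both external entity declarations and entity-expansion tricks that rely on an internal DTD subset, at the cost of refusing documents that legitimately need a DTD.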

Monitor and Test

Verify your fixes and enable detection:

Test with XML injection payloads: </name><admin>true</admin><name>, <![CDATA[<script>alert(1)</script>]]>
Test with encoded characters: &lt;script&gt;, &lt;admin&gt;
Log all XML parsing errors and suspicious activity
Alert on malformed or unexpected XML input
Verify the specific input that triggered the finding no longer causes the vulnerability
Ensure legitimate functionality still works correctly
Re-scan with the security scanner to confirm the issue is resolved
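
A regression check along these lines can be sketched in Python. The function build_user_xml stands in for whatever fixed builder the application uses and is hypothetical; the first two payloads come from the list above, and the entity-encoded one is an illustrative assumption.

```python
import xml.etree.ElementTree as ET

PAYLOADS = [
    "</name><admin>true</admin><name>",
    "<![CDATA[<script>alert(1)</script>]]>",
    "&lt;admin&gt;true&lt;/admin&gt;",
]


def build_user_xml(name: str) -> str:
    """Hypothetical function under test: the fixed XML builder."""
    user = ET.Element("user")
    ET.SubElement(user, "name").text = name
    return ET.tostring(user, encoding="unicode")


def check_payload(payload: str) -> bool:
    """Return True when the payload survives as data and adds no structure."""
    root = ET.fromstring(build_user_xml(payload))
    children = list(root)
    return (
        len(children) == 1
        and children[0].tag == "name"
        and len(children[0]) == 0          # no injected nested elements
        and children[0].text == payload    # value round-trips unchanged
    )


results = {payload: check_payload(payload) for payload in PAYLOADS}
```

Each entry in results should be True for a fixed builder; running the same check against the old concatenation code is a quick way to confirm the test actually detects the flaw.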

Common Vulnerable Patterns

Concatenating untrusted data into XML documents
Failing to escape or validate XML content

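String Concatenation for XML Construction (Python)

# Dangerous: untrusted input is interpolated directly into XML markup
user_input = "x</name><admin>true</admin><name>y"  # example attacker-controlled value
xml = f"<user><name>{user_input}</name></user>"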
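
To make the impact concrete, the following sketch wraps the concatenation pattern above in a hypothetical function and parses the result, showing that the injected markup becomes real document structure. The function name and the payload are assumptions for illustration.

```python
import xml.etree.ElementTree as ET


def build_user_xml_unsafe(user_input: str) -> str:
    """Vulnerable on purpose: mirrors the string-concatenation pattern."""
    return f"<user><name>{user_input}</name></user>"


# Attacker closes <name>, adds an <admin> element, then reopens <name>.
payload = "x</name><admin>true</admin><name>y"
doc = ET.fromstring(build_user_xml_unsafe(payload))

# The parsed document now has three children instead of one.
child_tags = [child.tag for child in doc]
injected_admin = doc.findtext("admin")
```

Any code that later reads the admin element from this document would see the attacker-supplied value.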

Secure Patterns

XML Library with Automatic Escaping (Python)

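# Safe: build the document with an XML library instead of string formatting
import xml.etree.ElementTree as ET

user_input = "</name><admin>true</admin><name>"  # example untrusted value
user_elem = ET.Element('user')
name_elem = ET.SubElement(user_elem, 'name')
name_elem.text = user_input  # Stored as text; escaped when serialized
xml = ET.tostring(user_elem, encoding='unicode')  # str instead of bytes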

Why this works:

Uses XML library APIs that automatically escape markup-significant characters (<, >, and & in text; quotes as needed in attribute values)
Prevents injection of malicious XML tags, attributes, or CDATA sections
Treats user input as data content rather than markup structure
Blocks attackers from breaking out of XML context to inject arbitrary elements
Avoids string concatenation that could allow XML structure manipulation

Language-Specific Guidance

For detailed, framework-specific examples and best practices:

Python - ElementTree, lxml, defusedxml, Django, Flask
Java - DOM, StAX, JAXB, Jackson XML, Apache Commons Text
JavaScript/Node.js - xmlbuilder2, xml2js, he, Next.js
C# - LINQ to XML, XmlWriter, XmlSerializer, ASP.NET Core

Dynamic Scan Guidance

For guidance on remediating this CWE when detected by dynamic (DAST) scanners:

Dynamic Scan Guidance - Analyzing DAST findings and mapping to source code