CWE-91: XML Injection
Overview
XML Injection occurs when untrusted user input is used to construct XML documents or queries without proper validation or escaping, allowing attackers to modify the structure or content of XML data.
OWASP Classification
A05:2025 - Injection
Risk
High: Attackers can alter application logic, bypass authentication, or trigger denial of service by injecting malicious XML. This can lead to data corruption, unauthorized access, or system compromise.
Remediation Steps
Core principle: Never construct XML by concatenating untrusted input; use XML libraries that treat user input as data and automatically escape it so it cannot alter document structure.
Trace the Data Path and Locate the Vulnerability
Analyze how untrusted data reaches XML construction:
- Review the flaw details to identify the specific file, line number, and code pattern
- Source: Where untrusted data enters (user input, external file, database, network request)
- Sink: XML document construction or serialization
- String concatenation: Look for untrusted data being concatenated into XML strings
- Understand the data flow from source to sink
Use XML Libraries with Built-in Escaping (Primary Defense)
Replace string concatenation with safe XML API usage:
- Use XML libraries that automatically escape or encode XML data
- Avoid manual string concatenation when building XML
- Use DOM builders (e.g., ElementTree, XmlWriter, xmlbuilder2)
- Let the library handle character escaping (e.g., < → &lt;, > → &gt;)
- Follow the secure code examples provided in language-specific guidance
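As a minimal sketch of letting the library do the escaping, the Python snippet below builds an element with both an attribute and text content from hostile values, then parses the result back to show the structure is unchanged. The function name `build_user` and the sample values are hypothetical.

```python
import xml.etree.ElementTree as ET

def build_user(name: str, role: str) -> str:
    """Build <user role="..."><name>...</name></user> through the ElementTree API."""
    user = ET.Element("user")
    user.set("role", role)                    # attribute value is escaped on output
    ET.SubElement(user, "name").text = name   # text content is escaped on output
    return ET.tostring(user, encoding="unicode")

# Hypothetical hostile values: one targets element text, one targets an attribute
hostile_name = "</name><admin>true</admin><name>"
hostile_role = 'guest" admin="true'

doc = build_user(hostile_name, hostile_role)
parsed = ET.fromstring(doc)

print(doc)
print([child.tag for child in parsed])  # tags of the direct children
print(parsed.attrib)                    # attributes on <user>
```

Because the values are assigned through the API rather than spliced into a string, the serializer decides how to encode them for each context, so the same call is safe for both text and attribute positions.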
Validate and Sanitize Input (Defense in Depth)
Add input validation as an additional security layer:
- Enforce strict type, length, and format checks on all untrusted data
- Validate expected patterns before XML construction
- Reject or escape special XML characters (<, >, &, ', ")
- Use allowlists for enumerated values
- Enforce length limits so oversized input cannot be used for denial of service
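The checks above might look like the following in Python. This is a sketch under assumed rules: the allowed roles, the username pattern, and the 64-character limit are hypothetical and should be replaced by whatever the application actually expects.

```python
import re

ALLOWED_ROLES = {"viewer", "editor", "admin"}      # hypothetical enumeration
USERNAME_RE = re.compile(r"[A-Za-z0-9_.-]{1,64}")  # hypothetical format and length rule

def validate_username(value: str) -> str:
    """Return the value unchanged if it matches the expected type, format, and length."""
    if not isinstance(value, str):
        raise ValueError("username must be a string")
    if not USERNAME_RE.fullmatch(value):
        raise ValueError("username has an unexpected format or length")
    return value

def validate_role(value: str) -> str:
    """Accept only values that appear in the allowlist."""
    if value not in ALLOWED_ROLES:
        raise ValueError("role is not in the allowlist")
    return value
```

Validation of this kind complements library escaping rather than replacing it: a value that passes these checks should still be written through the XML API, not concatenated.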
Disable Dangerous XML Features
Harden XML parsers to prevent related attacks:
- Disable external entity resolution (prevents XXE attacks)
- Disable DTD processing if not required
- Avoid external references in XML processing
- Set secure defaults on XML parsers
- Use hardened libraries like defusedxml (Python)
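As one possible illustration using only the Python standard library, the function below rejects any document that carries a DOCTYPE declaration before handing it to ElementTree. The names `DTDForbidden` and `parse_untrusted` are invented for this sketch. The defusedxml package mentioned above offers maintained equivalents and is usually the better choice in production.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

class DTDForbidden(ValueError):
    """Raised when an untrusted document contains a DOCTYPE declaration."""

def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat when it encounters <!DOCTYPE ...>
    raise DTDForbidden(f"DOCTYPE declarations are not allowed (found {name!r})")

def parse_untrusted(data: bytes) -> ET.Element:
    """Parse XML only if it declares no DTD (and therefore no custom entities)."""
    checker = expat.ParserCreate()
    checker.StartDoctypeDeclHandler = _reject_doctype
    checker.Parse(data, True)   # raises DTDForbidden, or ExpatError if malformed
    return ET.fromstring(data)  # second pass builds the tree

root = parse_untrusted(b"<order><id>42</id></order>")
print(root.find("id").text)
```

The document is parsed twice here for simplicity. Since entities can only be declared inside a DTD, refusing the DOCTYPE up front is a simple way to rule out external-entity and entity-expansion payloads.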
Monitor and Test
Verify your fixes and enable detection:
- Test with XML injection payloads:
</name><admin>true</admin><name>, <![CDATA[<script>alert(1)</script>]]>
- Test with encoded characters: &lt;script&gt;, &lt;admin&gt;
- Log all XML parsing errors and suspicious activity
- Alert on malformed or unexpected XML input
- Verify the specific input that triggered the finding no longer causes the vulnerability
- Ensure legitimate functionality still works correctly
- Re-scan with the security scanner to confirm the issue is resolved
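A regression test along these lines can be sketched in Python as follows. `build_user_xml` is a hypothetical stand-in for the application's fixed XML builder. The check is that each payload round-trips as plain text and adds no elements.

```python
import xml.etree.ElementTree as ET

PAYLOADS = [
    "</name><admin>true</admin><name>",
    "<![CDATA[<script>alert(1)</script>]]>",
    "&lt;script&gt;",
    "&lt;admin&gt;",
]

def build_user_xml(name: str) -> bytes:
    """Hypothetical stand-in for the application's fixed XML builder."""
    user = ET.Element("user")
    ET.SubElement(user, "name").text = name
    return ET.tostring(user)

def check_payload(payload: str) -> None:
    root = ET.fromstring(build_user_xml(payload))
    # The document must still contain exactly one <name> child ...
    assert [child.tag for child in root] == ["name"], payload
    # ... that has no nested elements ...
    assert len(root[0]) == 0, payload
    # ... and the payload must come back unchanged as text.
    assert root[0].text == payload, payload

for p in PAYLOADS:
    check_payload(p)
print("all payloads handled as data")
```

In a real project the same assertions would normally live in the existing test suite and call the actual builder function instead of the stand-in.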
Common Vulnerable Patterns
- Concatenating untrusted data into XML documents
- Failing to escape or validate XML content
String Concatenation for XML Construction (Pseudocode)
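The pattern can be illustrated with a short Python sketch. The function name and payload are hypothetical. The code is intentionally vulnerable and is shown only to make the failure mode concrete.

```python
import xml.etree.ElementTree as ET

def build_user_xml_unsafe(name: str) -> str:
    # VULNERABLE: untrusted input is spliced directly into the markup
    return "<user><name>" + name + "</name></user>"

# Attacker-controlled value that closes <name> and opens a new element
payload = "</name><admin>true</admin><name>"

doc = build_user_xml_unsafe(payload)
root = ET.fromstring(doc)

print(doc)
print([child.tag for child in root])  # the injected element is now part of the tree
```

Any consumer that looks for an admin element in this document would find the attacker's value, which is how concatenation turns data into structure.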
Secure Patterns
XML Library with Automatic Escaping (Python)
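# Safe: build the document with an XML library instead of string concatenation
import xml.etree.ElementTree as ET

user_input = '</name><admin>true</admin><name>'  # example untrusted value
user_elem = ET.Element('user')
name_elem = ET.SubElement(user_elem, 'name')
name_elem.text = user_input  # serializer escapes <, >, & in text content
xml_bytes = ET.tostring(user_elem)  # returns bytes; pass encoding='unicode' for str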
Why this works:
- Uses XML library APIs that automatically escape the characters that matter in each context (<, >, & in element text, plus the delimiting quote character in attribute values)
- Prevents injection of malicious XML tags, attributes, or CDATA sections
- Treats user input as data content rather than markup structure
- Blocks attackers from breaking out of XML context to inject arbitrary elements
- Avoids string concatenation that could allow XML structure manipulation
Language-Specific Guidance
For detailed, framework-specific examples and best practices:
- Python - ElementTree, lxml, defusedxml, Django, Flask
- Java - DOM, StAX, JAXB, Jackson XML, Apache Commons Text
- JavaScript/Node.js - xmlbuilder2, xml2js, he, Next.js
- C# - LINQ to XML, XmlWriter, XmlSerializer, ASP.NET Core
Dynamic Scan Guidance
For guidance on remediating this CWE when detected by dynamic (DAST) scanners:
- Dynamic Scan Guidance - Analyzing DAST findings and mapping to source code