CWE-611: XML External Entity (XXE) Injection

Overview

XML External Entity (XXE) injection occurs when a weakly configured XML parser processes input that references an external entity. Many XML parsers resolve external entities declared in a Document Type Definition (DTD) by default, so an attacker who controls the XML can inject entity definitions that read arbitrary files, perform Server-Side Request Forgery (SSRF), cause Denial of Service (DoS), or, in rare cases, achieve remote code execution.

OWASP Classification

A02:2025 - Security Misconfiguration

Risk

XXE vulnerabilities can lead to severe security impacts:

Confidential data disclosure: Read /etc/passwd, config files, application source code, cloud metadata
SSRF attacks: Scan internal networks, access internal services, exploit cloud instance metadata endpoints
Denial of Service: Billion Laughs attack (exponential entity expansion), external entity recursion
Remote code execution: In rare cases, combined with PHP expect:// wrapper or similar mechanisms
Authentication bypass: Access internal authentication services or retrieve sensitive credentials

XXE is particularly dangerous in cloud environments where instance metadata services (AWS, Azure, GCP) expose sensitive credentials.

Remediation Steps

Core principle: Disable XML external entities and DTD processing unless explicitly required and safely constrained; XML parsing behavior must be fully constrained by the server.

Locate XML external entity vulnerability

Review the security findings to identify the specific file, line number, and XML parsing operation
Identify where XML data enters the application: user input, external files, databases, network requests, document uploads
Trace data flow from source to the XML parser initialization and parsing call
Determine which XML parser library is being used (see Language-Specific Guidance)
Check if the parser is configured with secure settings
Look for DTD processing, external entity resolution, or XInclude features
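The inventory step can be partly automated. The following is a minimal sketch, assuming that a plain substring search for the parser names listed under Language-Specific Guidance is a good enough first pass; the function name, the name list, and the file suffixes are illustrative choices, and every hit still needs manual review of how the parser is configured.

```python
from pathlib import Path

# Parser names taken from the Language-Specific Guidance list (heuristic, not exhaustive).
PARSER_NAMES = [
    "XmlReader", "XDocument", "XmlDocument",          # C#
    "DocumentBuilder", "SAXParser", "XMLStreamReader",  # Java
    "libxmljs", "xml2js", "fast-xml-parser",          # JavaScript/Node.js
    "SimpleXML", "DOMDocument", "XMLReader",          # PHP
    "lxml", "xml.etree", "defusedxml",                # Python
]
SOURCE_SUFFIXES = {".cs", ".java", ".js", ".ts", ".php", ".py"}  # assumed set


def find_xml_parser_calls(root):
    """Return (path, line_number, parser_name) for each line mentioning a known parser."""
    hits = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in SOURCE_SUFFIXES:
            continue
        text = path.read_text(encoding="utf-8", errors="replace")
        for lineno, line in enumerate(text.splitlines(), start=1):
            for name in PARSER_NAMES:
                if name in line:
                    hits.append((str(path), lineno, name))
    return hits
```

Each reported location is only a starting point for tracing the data flow from the input source to the parsing call.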

Disable external entity processing in XML parsers (Primary Defense)

Disable DTD processing entirely: The parser should reject any document containing <!DOCTYPE (safest option)
If DTDs are required, disable external entities: Turn off external entity resolution and external DTD loading
Disable XInclude processing: Block <xi:include> elements
Disable parameter entity processing: Prevent parameter entity attacks
Set security features BEFORE parsing: All security configuration must be applied before parsing any XML data
Apply to ALL parsers: Configure every XML parser instance in the application
Why this works: If the parser never resolves external entities, an injected entity declaration has nothing to read or fetch, which removes the root cause of XXE
Language-specific configuration: See Python, Java, JavaScript, C#, PHP subdirectories for exact parser configuration code
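As an illustration of the safest option (rejecting any DOCTYPE), here is a minimal Python sketch that uses only the standard library. It assumes a two-pass design: an expat guard configured before any input is read, followed by ordinary ElementTree parsing. The names DoctypeForbidden and parse_untrusted are hypothetical, and a maintained library such as defusedxml may be preferable in production.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat


class DoctypeForbidden(ValueError):
    """Raised when a document carries a DOCTYPE declaration."""


def _reject_doctype(name, system_id, public_id, has_internal_subset):
    # Called by expat as soon as it sees <!DOCTYPE ...>; raising aborts the parse.
    raise DoctypeForbidden(f"DOCTYPE {name!r} is not allowed")


def parse_untrusted(data: bytes) -> ET.Element:
    # Pass 1: install the guard BEFORE any input is parsed, then scan the document.
    guard = expat.ParserCreate()
    guard.StartDoctypeDeclHandler = _reject_doctype
    guard.Parse(data, True)
    # Pass 2: without a DOCTYPE there are no entity declarations left to abuse.
    return ET.fromstring(data)
```

A call such as parse_untrusted(b"<root><a>1</a></root>") returns the root element, while any document that starts a DTD raises DoctypeForbidden before its entities are touched.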

Eliminate XML processing when possible

Replace XML with safer data formats for new implementations:
- Use JSON for API communication and data exchange
- Use YAML with safe loading (yaml.safe_load()) for configuration
- Use Protocol Buffers for structured binary data
- Use plain text or CSV for simple data
Only use XML when required by:
- Legacy systems or industry standards (SOAP, SAML, RSS, SVG)
- Document formats that require XML (Office documents, DOCX, XLSX)
- Digital signatures (XML-DSig)
Redesign to avoid XML parsing of untrusted data when possible

Add input validation for XML documents (Defense in Depth)

Validate XML against strict XSD schema: Define and enforce expected structure before processing
Reject documents containing <!DOCTYPE: If DTDs are not needed, block them at the input validation layer
Check for suspicious DTD declarations: Scan for <!ENTITY and <!ELEMENT patterns
Limit XML document size and complexity: Maximum elements, nesting depth, document size
Set parser limits: Maximum entity expansions, maximum entity size to prevent DoS
Use allowlists for elements/attributes: Only permit expected XML elements and attributes
Note: Input validation is defense-in-depth only; it does not replace secure parser configuration
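Several of these checks can be combined in a single streaming pass. The sketch below is one possible shape, assuming the limits shown and a hypothetical element allowlist for an order document; validate_xml and XmlRejected are illustrative names, and it does not replace XSD validation.

```python
from xml.parsers import expat

MAX_BYTES = 1_000_000      # assumed size limit
MAX_DEPTH = 20             # assumed nesting limit
MAX_ELEMENTS = 10_000      # assumed element-count limit
ALLOWED_ELEMENTS = {"order", "item", "sku", "quantity"}  # hypothetical allowlist


class XmlRejected(ValueError):
    """Raised when a document fails pre-processing validation."""


def validate_xml(data: bytes) -> None:
    if len(data) > MAX_BYTES:
        raise XmlRejected("document too large")
    depth = 0
    count = 0

    def on_start(name, attrs):
        nonlocal depth, count
        depth += 1
        count += 1
        if depth > MAX_DEPTH:
            raise XmlRejected("nesting too deep")
        if count > MAX_ELEMENTS:
            raise XmlRejected("too many elements")
        if name not in ALLOWED_ELEMENTS:
            raise XmlRejected(f"unexpected element {name!r}")

    def on_end(name):
        nonlocal depth
        depth -= 1

    def on_doctype(name, system_id, public_id, has_internal_subset):
        raise XmlRejected("DOCTYPE not allowed")

    parser = expat.ParserCreate()
    parser.StartElementHandler = on_start
    parser.EndElementHandler = on_end
    parser.StartDoctypeDeclHandler = on_doctype
    try:
        parser.Parse(data, True)
    except expat.ExpatError as exc:
        raise XmlRejected(f"not well-formed: {exc}") from exc
```

Using the parser's own DOCTYPE callback instead of a raw byte search avoids misses caused by alternate encodings.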

Apply defense-in-depth protections

Keep XML parsing libraries up to date (monitor security advisories)
Replace deprecated or unmaintained libraries
Use dependency scanning tools (OWASP Dependency-Check, Snyk)
Run XML processing with least privilege (minimal file system access)
Apply network egress filtering to prevent SSRF (block access to internal networks, cloud metadata endpoints)
Monitor and log XML parsing errors and entity resolution attempts
Set resource limits: memory, CPU time for XML processing
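Egress filtering is normally enforced at the network layer, but an application-level check can complement it. The following sketch assumes a hypothetical helper, is_blocked_destination, that an application would call before any outbound fetch; it fails closed for hostnames because it does not resolve DNS.

```python
import ipaddress
from urllib.parse import urlsplit

# Cloud metadata address used in the test steps of this guide.
METADATA_ADDRESSES = {ipaddress.ip_address("169.254.169.254")}


def is_blocked_destination(url: str) -> bool:
    """Return True when the URL must not be fetched by the XML-processing service."""
    host = urlsplit(url).hostname
    if host is None:
        return True  # no network host at all, e.g. file:///etc/passwd
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # A hostname, not an IP literal: this sketch does no DNS lookup, so refuse.
        return True
    return (
        ip in METADATA_ADDRESSES
        or ip.is_private
        or ip.is_loopback
        or ip.is_link_local
        or ip.is_reserved
        or ip.is_multicast
        or ip.is_unspecified
    )
```

A check like this is only a second layer; firewall or security-group rules remain the control that actually stops a parser from reaching internal services.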

Test and verify XXE protection thoroughly

Test with basic XXE payload attempting file disclosure: <!ENTITY xxe SYSTEM "file:///etc/passwd">
Verify parser rejects or safely handles the payload (no file contents returned)
Test SSRF via XXE: <!ENTITY xxe SYSTEM "http://internal-service/admin">
Verify parser doesn't make external HTTP requests
Test Billion Laughs DoS attack (exponential entity expansion)
Verify parser rejects or times out (doesn't consume excessive memory)
Test cloud metadata endpoint access: http://169.254.169.254/latest/meta-data/
Test parameter entity attacks: <!ENTITY % xxe SYSTEM "file:///etc/passwd">
Verify legitimate XML documents without entities still parse correctly
Re-scan with the security scanner to confirm the issue is resolved
Check for any new findings introduced by the changes

Basic XXE Test Payloads

File Disclosure Test

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>

Expected result: Parser should reject this or return empty/error response (not file contents).

SSRF via XXE Test

<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://internal-service:8080/admin">
]>
<root>&xxe;</root>

Expected result: Parser should not make external HTTP requests.

Billion Laughs DoS Test

<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
]>
<root>&lol3;</root>

Expected result: Parser should reject or time out (not consume excessive memory).
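The three payloads above can be turned into an automated regression check. The following is a hedged sketch: find_xxe_failures accepts any parse callable that returns an ElementTree element, and hardened_parse is only a stand-in for the application's real parser. Both names are hypothetical.

```python
import xml.etree.ElementTree as ET
from xml.parsers import expat

PAYLOADS = {
    "file_disclosure": b"""<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<root>&xxe;</root>""",
    "ssrf": b"""<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "http://internal-service:8080/admin">
]>
<root>&xxe;</root>""",
    "billion_laughs": b"""<?xml version="1.0"?>
<!DOCTYPE lolz [
  <!ENTITY lol "lol">
  <!ENTITY lol2 "&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;">
  <!ENTITY lol3 "&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;">
]>
<root>&lol3;</root>""",
}
BENIGN = b"<root><item>ok</item></root>"


def find_xxe_failures(parse):
    """Return the names of the checks that the given parse callable fails."""
    failures = []
    for name, payload in PAYLOADS.items():
        try:
            root = parse(payload)
        except Exception:
            continue  # rejecting the document outright is an acceptable outcome
        # <root> holds only an entity reference, so any text means it was expanded.
        if "".join(root.itertext()).strip():
            failures.append(name)
    try:
        parse(BENIGN)
    except Exception:
        failures.append("benign_document")  # legitimate XML must still parse
    return failures


def hardened_parse(data):
    """Stand-in for the application's parser: refuses any DOCTYPE, then parses."""
    def refuse(*_args):
        raise ValueError("DOCTYPE not allowed")
    guard = expat.ParserCreate()
    guard.StartDoctypeDeclHandler = refuse
    guard.Parse(data, True)
    return ET.fromstring(data)
```

Calling find_xxe_failures(hardened_parse) should return an empty list. The check only detects content reflected into the parsed document; confirming that no outbound request was made still requires a listener or egress logs.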

Language-Specific Guidance

For detailed, language-specific parser configurations and framework-specific patterns:

C# - XmlReader, XDocument, XmlDocument with DTD disabled
Java - DocumentBuilder, SAXParser, XMLStreamReader with XXE prevention
JavaScript/Node.js - libxmljs, xml2js, fast-xml-parser with secure defaults
PHP - SimpleXML, DOMDocument, XMLReader with entity loading disabled
Python - lxml, xml.etree, defusedxml for safe XML parsing