CWE-112: Missing XML Validation
Overview
Missing XML validation occurs when an application parses XML without validating it against a defined schema (XSD, DTD, or RELAX NG), so malformed, malicious, or unexpected XML structures are processed as if they were trusted. This can lead to injection attacks, denial of service, or business logic bypasses.
OWASP Classification
A05:2025 - Injection
Risk
High: Without XML validation, attackers can submit malicious XML with unexpected elements, XXE payloads, billion laughs attacks, or data that violates business rules. This can lead to code execution, data exfiltration, DoS, or application compromise.
Remediation Steps
Core principle: Never process untrusted XML without validating it against a strict, application-defined schema; reject any XML that does not conform exactly to the expected structure.
Define Strict XML Schema
Create a comprehensive XSD (XML Schema Definition) that defines the expected structure:
- Define all elements: Specify every allowed element, its type, and whether it's required or optional
- Set data type constraints: Use XSD types (string, int, date, etc.) to enforce data validation
- Enforce length limits: Set maxLength for strings to prevent oversized data
- Define cardinality: Specify minOccurs, maxOccurs to limit how many times elements can appear
- Restrict attribute values: Use enumerations or patterns to allowlist attribute values
- Prevent unbounded nesting: Avoid recursive type definitions so the schema itself bounds nesting depth (XSD has no direct maximum-depth setting)
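As a rough illustration of the constraints listed above, the sketch below compiles a small inline XSD and checks a few documents against it. The element names (order, customer, item), the status values, and the limits are hypothetical and chosen only for the example; a real schema would normally live in its own .xsd file.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;

final class StrictSchemaSketch {
    // Hypothetical schema: names and limits are illustrative assumptions.
    static final String XSD = String.join("\n",
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>",
        "  <xs:element name='order'>",
        "    <xs:complexType>",
        "      <xs:sequence>",
        "        <xs:element name='customer'>",
        "          <xs:simpleType>",
        "            <xs:restriction base='xs:string'>",
        "              <xs:minLength value='1'/>",
        "              <xs:maxLength value='64'/>",      // length limit
        "            </xs:restriction>",
        "          </xs:simpleType>",
        "        </xs:element>",
        // cardinality: between one and three items
        "        <xs:element name='item' type='xs:string' minOccurs='1' maxOccurs='3'/>",
        "      </xs:sequence>",
        "      <xs:attribute name='status' use='required'>",
        "        <xs:simpleType>",
        "          <xs:restriction base='xs:string'>",
        "            <xs:enumeration value='NEW'/>",      // allowlisted values
        "            <xs:enumeration value='PAID'/>",
        "          </xs:restriction>",
        "        </xs:simpleType>",
        "      </xs:attribute>",
        "    </xs:complexType>",
        "  </xs:element>",
        "</xs:schema>");

    /** Returns true only if the document conforms to the schema above. */
    static boolean conforms(String xml) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            sf.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
            Validator validator = schema.newValidator();
            // With no ErrorHandler set, a Validator throws on the first error.
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

Because the schema contains no recursive type, nesting depth is bounded by the schema itself: any extra level of nesting is simply an unexpected element.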
Configure Secure XML Parser
Enable security features in the XML parser to prevent XXE and other attacks:
- Enable schema validation: Set the schema on the parser factory to validate all XML
- Disable external entities: Set disallow-doctype-decl to true, and disable external-general-entities and external-parameter-entities
- Disable DTD processing: If DTDs aren't needed, completely disable DTD processing
- Set entity expansion limits: Limit how many times entities can be expanded (prevent billion laughs)
- Disable XInclude: Set XIncludeAware to false to prevent external file inclusion
- Use modern libraries: Ensure XML parser libraries are up-to-date with security patches
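The parser settings above might be gathered into one factory method, roughly as follows. This is a sketch that assumes the JDK's built-in JAXP parser; the class and method names (HardenedParserSketch, newHardenedFactory, accepts) are invented for the example, and other parser implementations may not recognize every feature URI.

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;
import org.xml.sax.helpers.DefaultHandler;

final class HardenedParserSketch {
    static DocumentBuilderFactory newHardenedFactory() throws ParserConfigurationException {
        DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
        f.setNamespaceAware(true);
        // Asks the implementation to apply its processing limits; on the JDK's
        // built-in parser this includes entity expansion limits.
        f.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Reject any document that carries a DOCTYPE declaration.
        f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        // Belt and braces in case DOCTYPEs are ever re-enabled.
        f.setFeature("http://xml.org/sax/features/external-general-entities", false);
        f.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        f.setXIncludeAware(false);
        f.setExpandEntityReferences(false);
        return f;
    }

    /** Returns true if the hardened parser accepts the document. */
    static boolean accepts(String xml) {
        try {
            DocumentBuilder builder = newHardenedFactory().newDocumentBuilder();
            builder.setErrorHandler(new DefaultHandler() {
                // DefaultHandler already rethrows fatal errors; also fail on plain errors.
                @Override
                public void error(SAXParseException e) throws SAXException {
                    throw e;
                }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            return false;
        }
    }
}
```

Creating the factory through a single helper keeps every call site on the same hardened configuration instead of relying on each caller to repeat the settings.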
Validate XML Against Schema Before Processing
Parse XML with validation enabled and reject invalid input:
- Set schema on parser factory: Configure DocumentBuilderFactory.setSchema() or equivalent
- Enable namespace awareness: Call setNamespaceAware(true) for proper schema validation
- Use strict error handler: Implement an error handler that fails on any validation error (don't just log warnings)
- Reject invalid XML: If schema validation fails, return error to user and don't process the XML
- Log validation failures: Record invalid XML attempts for security monitoring
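The strict error handler matters because schema violations are reported to the parser's ErrorHandler as recoverable errors; a handler that only logs them lets the parse finish and return a document. A minimal sketch, assuming a one-element schema (count) invented for the example:

```java
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

final class StrictValidationSketch {
    /** Aborts the parse on warnings, errors, and fatal errors alike. */
    static final class StrictErrorHandler implements ErrorHandler {
        @Override public void warning(SAXParseException e) throws SAXParseException { throw e; }
        @Override public void error(SAXParseException e) throws SAXParseException { throw e; }
        @Override public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
    }

    // Hypothetical schema: a single positive integer element.
    static final String XSD =
        "<xs:schema xmlns:xs='http://www.w3.org/2001/XMLSchema'>"
        + "<xs:element name='count' type='xs:positiveInteger'/>"
        + "</xs:schema>";

    /** Returns true only if the XML parses and validates without any reported problem. */
    static boolean accepted(String xml) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));

            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);   // required for schema validation
            f.setSchema(schema);         // validate while parsing
            f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);

            DocumentBuilder builder = f.newDocumentBuilder();
            builder.setErrorHandler(new StrictErrorHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (Exception e) {
            // A real service would log the failure here and return an error to the caller.
            return false;
        }
    }
}
```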
Implement Business Logic Validation
After XML is structurally valid, validate business rules:
- Validate data formats: Even if XSD allows strings, validate email format, URL format, etc.
- Check business constraints: Verify numeric values are within acceptable business ranges
- Validate relationships: Check that referenced IDs exist, foreign key constraints are met
- Enforce access controls: Verify the user has permission to submit the provided data
- Use allowlists for enumerated values: For fields like status, country, category, validate against known-good lists
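Checks of this kind run after schema validation, on values already extracted from the document. The sketch below is one possible shape; the field names, the age range, and the country allowlist are assumptions made for the example, not rules from any particular application.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class BusinessRulesSketch {
    // Hypothetical allowlist of accepted country codes.
    private static final Set<String> ALLOWED_COUNTRIES =
        new HashSet<>(Arrays.asList("DE", "FR", "US"));

    // Deliberately simple email shape check: local part, '@', dotted domain.
    private static final Pattern EMAIL =
        Pattern.compile("^[A-Za-z0-9+_.-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+$");

    /** Returns the names of the fields that violate a rule; empty means acceptable. */
    static List<String> violations(String email, int age, String country) {
        List<String> problems = new ArrayList<>();
        if (email == null || email.length() > 254 || !EMAIL.matcher(email).matches()) {
            problems.add("email");
        }
        if (age < 13 || age > 120) {            // assumed business range
            problems.add("age");
        }
        if (country == null || !ALLOWED_COUNTRIES.contains(country)) {
            problems.add("country");
        }
        return problems;
    }
}
```

Returning field names rather than the offending values avoids echoing attacker-controlled input back in error messages or logs.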
Apply Defense in Depth
Combine multiple security layers:
- Validate on multiple levels: Schema validation + parser security settings + business logic validation
- Use parameterized queries: If XML data goes into database, use prepared statements to prevent SQL injection
- Apply output encoding: If XML data is rendered in HTML, encode to prevent XSS
- Monitor for attacks: Log and alert on validation failures, malformed XML, XXE attempts
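To illustrate the output-encoding layer, here is a minimal hand-written HTML encoder for values taken from validated XML. It is a sketch for element-content and quoted-attribute contexts only; in production a vetted encoder library such as the OWASP Java Encoder is usually the better choice.

```java
final class HtmlEncodeSketch {
    /** Encodes the five characters that are significant in HTML text and quoted attributes. */
    static String escapeHtml(String s) {
        StringBuilder out = new StringBuilder(s.length() + 16);
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '&':  out.append("&amp;");  break;
                case '<':  out.append("&lt;");   break;
                case '>':  out.append("&gt;");   break;
                case '"':  out.append("&quot;"); break;
                case '\'': out.append("&#x27;"); break;
                default:   out.append(c);
            }
        }
        return out.toString();
    }
}
```

Even schema-valid strings can contain markup characters, so encoding at the point of output remains necessary regardless of how strict the schema is.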
Test with Malicious XML Payloads
Verify the fix handles attacks:
- Test with invalid structure: Submit XML with unexpected elements, missing required fields
- Test with XXE payloads: Try external entity attacks (<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>)
- Test with billion laughs: Submit XML with nested entity definitions that expand exponentially
- Test with oversized data: Send extremely long strings, deeply nested structures
- Test with malformed XML: Invalid syntax, unclosed tags, encoding issues
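A few of these payloads can be turned into a small regression check. The sketch below assumes a parser configured with disallow-doctype-decl, as recommended earlier, and uses shortened payloads; the class and constant names are invented for the example, and a real test suite would run the same inputs through the application's actual parsing entry point.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

final class MaliciousPayloadSketch {
    static final String XXE =
        "<?xml version=\"1.0\"?>"
        + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
        + "<foo>&xxe;</foo>";

    // Shortened billion-laughs payload: each level multiplies the previous one by ten.
    static final String BILLION_LAUGHS =
        "<?xml version=\"1.0\"?><!DOCTYPE lolz ["
        + "<!ENTITY lol \"lol\">"
        + "<!ENTITY lol2 \"&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;&lol;\">"
        + "<!ENTITY lol3 \"&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;&lol2;\">"
        + "]><lolz>&lol3;</lolz>";

    static final String UNCLOSED_TAG = "<users><user></users>";

    /** Returns true if the hardened parser refuses the input. */
    static boolean rejected(String xml) {
        try {
            DocumentBuilderFactory f = DocumentBuilderFactory.newInstance();
            f.setNamespaceAware(true);
            f.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            f.setXIncludeAware(false);
            DocumentBuilder builder = f.newDocumentBuilder();
            // DefaultHandler rethrows fatal errors and keeps them off stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (Exception e) {
            return true;
        }
    }
}
```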
Dynamic Scan Guidance
For guidance on remediating this CWE when detected by dynamic (DAST) scanners:
- Dynamic Scan Guidance - Analyzing DAST findings and mapping to source code
Common Vulnerable Patterns
Parsing XML without schema validation
import java.io.StringReader;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.InputSource;

public class VulnerableXMLParser {
    public void processXML(String xmlInput) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();
        // Dangerous: no schema validation, accepts any XML
        Document doc = builder.parse(new InputSource(new StringReader(xmlInput)));
        // Process document without validation
        NodeList users = doc.getElementsByTagName("user");
        // ...
    }
}
Why this is vulnerable:
- The parser accepts any well-formed XML structure, including unexpected elements
- Nothing constrains element depth or count
- Data types and formats are never validated
- XML from untrusted sources is trusted as-is
Secure Patterns
Validate XML against strict schema with secure parser configuration
import java.io.File;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.*;
import javax.xml.validation.*;
import org.w3c.dom.*;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

public class SecureXMLParser {
    private final Schema schema;

    public SecureXMLParser() throws SAXException {
        // Load and compile XSD schema
        SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        schemaFactory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        schema = schemaFactory.newSchema(new File("user-schema.xsd"));
    }

    public void processXML(String xmlInput) throws Exception {
        // Configure secure parser
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setSchema(schema); // Enable schema validation
        // Disable XXE
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);

        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new StrictErrorHandler()); // Fail on validation errors
        try {
            // Parse with schema validation
            Document doc = builder.parse(new InputSource(new StringReader(xmlInput)));
            // Additional business logic validation
            validateBusinessRules(doc);
            // Process validated document
            processValidatedDocument(doc);
        } catch (SAXException e) {
            // Keep the original exception as the cause for diagnostics
            throw new ValidationException("XML validation failed: " + e.getMessage(), e);
        }
    }

    private void validateBusinessRules(Document doc) throws ValidationException {
        NodeList users = doc.getElementsByTagName("user");
        for (int i = 0; i < users.getLength(); i++) {
            Element user = (Element) users.item(i);
            // The schema guarantees exactly one <email> child per <user>
            String email = user.getElementsByTagName("email").item(0).getTextContent();
            // Validate email format
            if (!email.matches("^[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+$")) {
                throw new ValidationException("Invalid email format: " + email);
            }
        }
    }

    private void processValidatedDocument(Document doc) {
        // Safe to process validated XML
    }

    // The two types below are referenced above but not defined in the original
    // example; these minimal versions are assumptions so that the class compiles.
    static class StrictErrorHandler implements ErrorHandler {
        @Override public void warning(SAXParseException e) throws SAXException { throw e; }
        @Override public void error(SAXParseException e) throws SAXException { throw e; }
        @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
    }

    static class ValidationException extends Exception {
        ValidationException(String message) { super(message); }
        ValidationException(String message, Throwable cause) { super(message, cause); }
    }
}
<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="users">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="user" maxOccurs="100">
          <xs:complexType>
            <xs:sequence>
              <xs:element name="username" type="xs:string"/>
              <xs:element name="email" type="xs:string"/>
              <xs:element name="age" type="xs:positiveInteger"/>
            </xs:sequence>
            <xs:attribute name="id" type="xs:positiveInteger" use="required"/>
          </xs:complexType>
        </xs:element>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Why this works: Schema validation enforces a strict XML structure (defined in XSD) before any processing occurs, ensuring only expected elements and data types are accepted. The parser is configured to disable XXE attacks (external entities, DTDs) and entity expansion. Business logic validation adds an additional layer to check semantic rules beyond structure. Together, these defenses prevent malicious XML (XXE, billion laughs, unexpected elements) from being processed.