CWE-112: Missing XML Validation

Overview

Missing XML validation occurs when an application parses XML without validating it against a defined schema (XSD, DTD, or RELAX NG), so malformed, malicious, or unexpected XML structures are processed. This can lead to injection attacks, denial of service, or business logic bypasses.

OWASP Classification

A05:2025 - Injection

Risk

High: Without XML validation, attackers can submit malicious XML with unexpected elements, XXE payloads, billion laughs attacks, or data that violates business rules. This can lead to code execution, data exfiltration, DoS, or application compromise.

Remediation Steps

Core principle: Never process untrusted XML without validating it against a strict, application-defined schema; reject any XML that does not conform exactly to the expected structure.

Define Strict XML Schema

Create a comprehensive XSD (XML Schema Definition) that defines the expected structure:

  • Define all elements: Specify every allowed element, its type, and whether it's required or optional
  • Set data type constraints: Use XSD types (string, int, date, etc.) to enforce data validation
  • Enforce length limits: Set maxLength for strings to prevent oversized data
  • Define cardinality: Specify minOccurs, maxOccurs to limit how many times elements can appear
  • Restrict attribute values: Use enumerations or patterns to allowlist attribute values
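  • Prevent unbounded nesting: XSD has no facet for nesting depth, so avoid recursive type definitions (or enforce a maximum depth in application code) to keep depth bounded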
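
The constraints listed above can be sketched with a small schema compiled from a string and checked through the standard javax.xml.validation API. The schema, the class name SchemaConstraintDemo, and the element names (order, sku, status) are hypothetical and exist only for illustration; parser hardening is left out here to keep the focus on the schema facets.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

final class SchemaConstraintDemo {
    // Hypothetical schema: one <order> holding 1 to 3 <sku> children of at most
    // 8 characters each, plus a required status attribute limited to an enumeration.
    // No type refers to itself, so nesting depth is fixed by the schema.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"order\"><xs:complexType>"
      + "<xs:sequence>"
      + "<xs:element name=\"sku\" minOccurs=\"1\" maxOccurs=\"3\">"
      + "<xs:simpleType><xs:restriction base=\"xs:string\">"
      + "<xs:maxLength value=\"8\"/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:element>"
      + "</xs:sequence>"
      + "<xs:attribute name=\"status\" use=\"required\">"
      + "<xs:simpleType><xs:restriction base=\"xs:string\">"
      + "<xs:enumeration value=\"new\"/><xs:enumeration value=\"paid\"/>"
      + "</xs:restriction></xs:simpleType>"
      + "</xs:attribute>"
      + "</xs:complexType></xs:element>"
      + "</xs:schema>";

    /** Returns true only if the XML conforms to the schema above. */
    static boolean isValid(String xml) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));
            // With no ErrorHandler set, a Validator throws on errors and fatal errors.
            Validator validator = schema.newValidator();
            validator.validate(new StreamSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // schema violation or malformed XML
        } catch (IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Under these assumptions, an order with a 12-character sku, a fourth sku, a status outside the enumeration, or an undeclared child element is rejected, while a conforming order passes.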

Configure Secure XML Parser

Enable security features in the XML parser to prevent XXE and other attacks:

  • Enable schema validation: Set the schema on the parser factory to validate all XML
  • Disable external entities: Set disallow-doctype-decl to true, disable external-general-entities and external-parameter-entities
  • Disable DTD processing: If DTDs aren't needed, completely disable DTD processing
  • Set entity expansion limits: Limit how many times entities can be expanded (prevent billion laughs)
  • Disable XInclude: Set XIncludeAware to false to prevent external file inclusion
  • Use modern libraries: Ensure XML parser libraries are up-to-date with security patches
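
As a rough sketch of the settings above, the following helper builds a DocumentBuilderFactory with DOCTYPE declarations, external entities, and XInclude turned off. The class and method names (HardenedParserFactory, newFactory, parses) are hypothetical. The feature URIs are the Xerces-style ones supported by the JDK's built-in parser; other JAXP implementations may reject some of them, so treat the list as an assumption to verify against the parser actually in use.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class HardenedParserFactory {
    /** Builds a factory that refuses DTDs, external entities, and XInclude. */
    static DocumentBuilderFactory newFactory() throws ParserConfigurationException {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        // Apply the implementation's built-in processing limits (e.g. entity expansion).
        factory.setFeature(XMLConstants.FEATURE_SECURE_PROCESSING, true);
        // Reject any document that carries a DOCTYPE declaration.
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        // JAXP 1.5 properties: an empty string means no protocol may be used
        // to fetch external DTDs or schemas.
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_DTD, "");
        factory.setAttribute(XMLConstants.ACCESS_EXTERNAL_SCHEMA, "");
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);
        return factory;
    }

    /** Returns true if the hardened parser accepts the input, false if it rejects it. */
    static boolean parses(String xml) {
        try {
            DocumentBuilder builder = newFactory().newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Disallowing DOCTYPE declarations outright is the bluntest of these settings; the external-entity features and access properties matter mainly as a fallback if DTDs ever have to be re-enabled.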

Validate XML Against Schema Before Processing

Parse XML with validation enabled and reject invalid input:

  • Set schema on parser factory: Configure DocumentBuilderFactory.setSchema() or equivalent
  • Enable namespace awareness: Set setNamespaceAware(true) for proper schema validation
  • Use a strict error handler: Implement an error handler that fails on any validation error (don't just log warnings)
  • Reject invalid XML: If schema validation fails, return an error to the caller and do not process the XML
  • Log validation failures: Record invalid XML attempts for security monitoring
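
A strict error handler matters because the SAX ErrorHandler contract treats validation errors as recoverable: unless the handler throws, the parser is allowed to continue. The sketch below shows one possible handler that throws on everything, wired into a schema-validating DocumentBuilder. StrictValidationDemo and its one-element schema are hypothetical; StrictErrorHandler is only one plausible shape for such a class.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

/** Treats every parser or validation diagnostic, including warnings, as a failure. */
final class StrictErrorHandler implements ErrorHandler {
    @Override public void warning(SAXParseException e) throws SAXException { throw e; }
    @Override public void error(SAXParseException e) throws SAXException { throw e; }
    @Override public void fatalError(SAXParseException e) throws SAXException { throw e; }
}

final class StrictValidationDemo {
    // Hypothetical schema: a single <age> element holding a positive integer.
    static final String XSD =
        "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
      + "<xs:element name=\"age\" type=\"xs:positiveInteger\"/>"
      + "</xs:schema>";

    /** Returns true only if the input parses and validates without any diagnostic. */
    static boolean accepts(String xml) {
        try {
            SchemaFactory sf = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
            Schema schema = sf.newSchema(new StreamSource(new StringReader(XSD)));

            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true); // needed for schema validation
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            factory.setSchema(schema);

            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.setErrorHandler(new StrictErrorHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            return false; // reject: caller should log the failure and return an error
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

Throwing on warnings as well is a deliberately conservative choice in this sketch; an application that sees benign warnings may prefer to log those and throw only on errors.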

Implement Business Logic Validation

After XML is structurally valid, validate business rules:

  • Validate data formats: Even if XSD allows strings, validate email format, URL format, etc.
  • Check business constraints: Verify numeric values are within acceptable business ranges
  • Validate relationships: Check that referenced IDs exist, foreign key constraints are met
  • Enforce access controls: Verify the user has permission to submit the provided data
  • Use allowlists for enumerated values: For fields like status, country, category, validate against known-good lists
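
The checks above might look like the following once values have been extracted from a schema-valid document. Everything here is illustrative: the class name UserBusinessRules, the country allowlist, the age range, and the simplified email pattern are assumptions, not rules taken from any real application.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.regex.Pattern;

final class UserBusinessRules {
    // Hypothetical allowlist and limits; real values come from application requirements.
    private static final Set<String> ALLOWED_COUNTRIES = Set.of("DE", "FR", "US");
    private static final int MIN_AGE = 13;
    private static final int MAX_AGE = 120;
    // Simplified pattern: local part, "@", and a domain with at least one dot.
    private static final Pattern EMAIL =
        Pattern.compile("^[A-Za-z0-9+_.-]+@[A-Za-z0-9-]+(\\.[A-Za-z0-9-]+)+$");

    /** Returns the names of the fields that violate a business rule (empty if none). */
    static List<String> violations(String email, int age, String country) {
        List<String> problems = new ArrayList<>();
        if (email == null || email.length() > 254 || !EMAIL.matcher(email).matches()) {
            problems.add("email");
        }
        if (age < MIN_AGE || age > MAX_AGE) {
            problems.add("age");
        }
        if (country == null || !ALLOWED_COUNTRIES.contains(country)) {
            problems.add("country");
        }
        return problems;
    }
}
```

Returning field names rather than echoing the submitted values keeps attacker-controlled data out of error messages and logs. Relationship and access-control checks need application state (a database, the current user) and are not shown.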

Apply Defense in Depth

Combine multiple security layers:

  • Validate on multiple levels: Schema validation + parser security settings + business logic validation
  • Use parameterized queries: If XML data goes into a database, use prepared statements to prevent SQL injection
  • Apply output encoding: If XML data is rendered in HTML, encode to prevent XSS
  • Monitor for attacks: Log and alert on validation failures, malformed XML, XXE attempts

Test with Malicious XML Payloads

Verify the fix handles attacks:

  • Test with invalid structure: Submit XML with unexpected elements, missing required fields
  • Test with XXE payloads: Try external entity attacks (<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>)
  • Test with billion laughs: Submit XML with nested entity definitions that expand exponentially
  • Test with oversized data: Send extremely long strings, deeply nested structures
  • Test with malformed XML: Invalid syntax, unclosed tags, encoding issues
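
A few of these payloads can be turned into a small regression probe. The sketch below assumes the parser under test is configured with disallow-doctype-decl, as recommended earlier. MaliciousXmlProbe and its constants are hypothetical names, and the billion laughs payload is kept to three levels so that it stays harmless even against a misconfigured parser.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

final class MaliciousXmlProbe {
    // External entity pointing at a local file.
    static final String XXE =
        "<?xml version=\"1.0\"?>"
      + "<!DOCTYPE foo [<!ENTITY xxe SYSTEM \"file:///etc/passwd\">]>"
      + "<foo>&xxe;</foo>";

    // Shortened billion laughs: each entity expands to several copies of the previous one.
    static final String BILLION_LAUGHS =
        "<?xml version=\"1.0\"?>"
      + "<!DOCTYPE lolz ["
      + "<!ENTITY lol \"lol\">"
      + "<!ENTITY lol2 \"&lol;&lol;&lol;&lol;\">"
      + "<!ENTITY lol3 \"&lol2;&lol2;&lol2;&lol2;\">"
      + "]><lolz>&lol3;</lolz>";

    // Unclosed <user> tag.
    static final String MALFORMED = "<users><user></users>";

    /** Returns true if the parser under test refuses the payload. */
    static boolean rejected(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setNamespaceAware(true);
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            DocumentBuilder builder = factory.newDocumentBuilder();
            builder.parse(new InputSource(new StringReader(xml)));
            return false;
        } catch (SAXException e) {
            return true; // parser refused the document
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }
}
```

This probe covers only parser-level rejection. Unexpected elements, oversized strings, and deep nesting are caught by schema validation and input size limits, so a fuller suite would run the same payload list through the complete validation path.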

Dynamic Scan Guidance

For guidance on remediating this CWE when detected by dynamic (DAST) scanners:

Common Vulnerable Patterns

Parsing XML without schema validation

import java.io.StringReader;
import javax.xml.parsers.*;
import org.w3c.dom.*;
import org.xml.sax.InputSource;

public class VulnerableXMLParser {
    public void processXML(String xmlInput) throws Exception {
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        DocumentBuilder builder = factory.newDocumentBuilder();

        // Dangerous: no schema validation, accepts any XML
        Document doc = builder.parse(new InputSource(new StringReader(xmlInput)));

        // Process document without validation
        NodeList users = doc.getElementsByTagName("user");
        // ...
    }
}

Why this is vulnerable:

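  • Accepts any well-formed XML, whatever its structure
  • Places no constraints on element depth or count
  • Performs no validation of data types or formats
  • Trusts XML from untrusted sources while leaving the parser at its default settings, which typically do not block DOCTYPE declarations or external entities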

Secure Patterns

Validate XML against strict schema with secure parser configuration

import java.io.File;
import java.io.StringReader;
import javax.xml.XMLConstants;
import javax.xml.parsers.*;
import javax.xml.validation.*;
import javax.xml.transform.stream.StreamSource;
import org.w3c.dom.*;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;

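// StrictErrorHandler and ValidationException are assumed to be application-defined
// classes (not shown): an org.xml.sax.ErrorHandler that rethrows every error, and
// an exception type for rejected input.
public class SecureXMLParser {
    private final Schema schema;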

    public SecureXMLParser() throws SAXException {
        // Load and compile XSD schema
        SchemaFactory schemaFactory = SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        schema = schemaFactory.newSchema(new File("user-schema.xsd"));
    }

    public void processXML(String xmlInput) throws Exception {
        // Configure secure parser
        DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
        factory.setNamespaceAware(true);
        factory.setSchema(schema); // Enable schema validation

        // Disable XXE
        factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
        factory.setFeature("http://xml.org/sax/features/external-general-entities", false);
        factory.setFeature("http://xml.org/sax/features/external-parameter-entities", false);
        factory.setXIncludeAware(false);
        factory.setExpandEntityReferences(false);

        DocumentBuilder builder = factory.newDocumentBuilder();
        builder.setErrorHandler(new StrictErrorHandler()); // Fail on validation errors

        try {
            // Parse with schema validation
            Document doc = builder.parse(new InputSource(new StringReader(xmlInput)));

            // Additional business logic validation
            validateBusinessRules(doc);

            // Process validated document
            processValidatedDocument(doc);

        } catch (SAXException e) {
            throw new ValidationException("XML validation failed: " + e.getMessage());
        }
    }

    private void validateBusinessRules(Document doc) throws ValidationException {
        NodeList users = doc.getElementsByTagName("user");

        for (int i = 0; i < users.getLength(); i++) {
            Element user = (Element) users.item(i);
            String email = user.getElementsByTagName("email").item(0).getTextContent();

            // Validate email format
            if (!email.matches("^[A-Za-z0-9+_.-]+@[A-Za-z0-9.-]+$")) {
                throw new ValidationException("Invalid email format: " + email);
            }
        }
    }

    private void processValidatedDocument(Document doc) {
        // Safe to process validated XML
    }
}

user-schema.xsd

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
    <xs:element name="users">
        <xs:complexType>
            <xs:sequence>
                <xs:element name="user" maxOccurs="100">
                    <xs:complexType>
                        <xs:sequence>
                            <xs:element name="username" type="xs:string"/>
                            <xs:element name="email" type="xs:string"/>
                            <xs:element name="age" type="xs:positiveInteger"/>
                        </xs:sequence>
                        <xs:attribute name="id" type="xs:positiveInteger" use="required"/>
                    </xs:complexType>
                </xs:element>
            </xs:sequence>
        </xs:complexType>
    </xs:element>
</xs:schema>

Why this works: Schema validation enforces the structure defined in the XSD before any processing occurs, so only the expected elements and data types are accepted. The parser configuration rejects DOCTYPE declarations and external entities, which blocks XXE and entity-expansion attacks such as billion laughs. Business logic validation then checks semantic rules that the schema cannot express. Together, these layers keep malicious XML (XXE payloads, billion laughs, unexpected elements) from reaching application logic.

Additional Resources