CWE-611: XML External Entity (XXE) Injection - PHP

Overview

PHP XML security depends on the PHP and libxml versions and the parse flags used. Modern PHP/libxml disables entity substitution by default, but XXE risk is reintroduced when code enables entity substitution, DTD loading, DTD validation, or external subsets for untrusted XML.

Primary Defence: For untrusted XML, do not pass LIBXML_NOENT, LIBXML_DTDLOAD, LIBXML_DTDATTR, or LIBXML_DTDVALID, and pass LIBXML_NONET to block network fetches. Reject DOCTYPE/ENTITY declarations unless a trusted integration explicitly requires them. On PHP < 8.0, call libxml_disable_entity_loader(true) before parsing. On PHP 8.4+ with libxml 2.13+, add LIBXML_NO_XXE if entity substitution or DTD features are unavoidable.

Common Vulnerable Patterns

simplexml_load_string with Unsafe Flags

<?php
// VULNERABLE - entity substitution and DTD loading are enabled
$xml = $_POST['xml'];
$data = simplexml_load_string(
    $xml,
    'SimpleXMLElement',
    LIBXML_NOENT | LIBXML_DTDLOAD
);

// Attacker sends:
// <?xml version="1.0"?>
// <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
// <root><name>&xxe;</name></root>

Why this is vulnerable:

  • LIBXML_NOENT substitutes entities and LIBXML_DTDLOAD loads external subsets.
  • Enables file disclosure, SSRF, and entity expansion DoS.

DOMDocument::loadXML with Unsafe Flags

<?php
// VULNERABLE - DOMDocument configured to load DTDs and substitute entities
$xml = file_get_contents('php://input');

$dom = new DOMDocument();
$dom->loadXML($xml, LIBXML_NOENT | LIBXML_DTDLOAD);  // DANGEROUS!

$name = $dom->getElementsByTagName('name')->item(0)->nodeValue;

Why this is vulnerable:

  • Entity substitution and external DTD loading are explicitly enabled.
  • Enables file disclosure and SSRF.

XMLReader with Unsafe Flags

<?php
// VULNERABLE - XMLReader configured to substitute entities
$xml = $_POST['xml'];

$reader = new XMLReader();
$reader->XML($xml, null, LIBXML_NOENT | LIBXML_DTDLOAD);

while ($reader->read()) {
    if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'name') {
        echo $reader->readString();
    }
}

Why this is vulnerable:

  • DTD loading and entity substitution are explicitly enabled.
  • Enables file disclosure and SSRF via XML input.

SimpleXMLElement with Entity Substitution

<?php
// VULNERABLE - SimpleXMLElement with unsafe parser flags
$xml = $_POST['xml'];
$element = new SimpleXMLElement($xml, LIBXML_NOENT | LIBXML_DTDLOAD);
echo $element->name;

Why this is vulnerable:

  • Entity substitution and DTD loading are enabled.
  • Enables file disclosure and blind SSRF.

Secure Patterns

DOMDocument with Safe Flags

<?php
// SECURE - Avoid entity substitution and DTD loading
function parse_xml_secure($xml) {
    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }

    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();

    // Do not pass LIBXML_NOENT, LIBXML_DTDLOAD, LIBXML_DTDATTR, or LIBXML_DTDVALID
    $success = $dom->loadXML($xml, LIBXML_NONET);

    libxml_use_internal_errors($previous);

    if (!$success) {
        throw new Exception('Invalid XML');
    }

    return $dom;
}

// Usage:
try {
    $dom = parse_xml_secure($_POST['xml']);
    $name = $dom->getElementsByTagName('name')->item(0)->nodeValue;
} catch (Exception $e) {
    http_response_code(400);
    echo json_encode(['error' => $e->getMessage()]);
}

Why this works:

  • Avoids the flags that enable entity substitution or external DTD loading.
  • LIBXML_NONET blocks network access if a future change introduces external loading.
  • The PHP < 8.0 compatibility call disables external entity loading for older deployments.

simplexml_load_string with Safe Flags

<?php
// SECURE - Avoid entity substitution and DTD loading
function parse_simplexml_secure($xml) {
    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }

    // Collect libxml errors internally and start from a clean error buffer
    $previous = libxml_use_internal_errors(true);
    libxml_clear_errors();

    // Parse XML
    $data = simplexml_load_string(
        $xml,
        'SimpleXMLElement',
        LIBXML_NONET | LIBXML_NOCDATA  // Block network access; merge CDATA into text nodes
    );

    // Capture errors, then restore the caller's error-handling mode
    $errors = libxml_get_errors();
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if ($data === false) {
        throw new Exception('XML parse error: ' . json_encode($errors));
    }

    return $data;
}

// Usage:
try {
    $xml_obj = parse_simplexml_secure($_POST['xml']);
    $name = (string)$xml_obj->name;
    $email = (string)$xml_obj->email;
} catch (Exception $e) {
    http_response_code(400);
    echo json_encode(['error' => $e->getMessage()]);
}

Why this works:

  • Avoids entity substitution and DTD loading while blocking network access.
  • LIBXML_NOCDATA only merges CDATA sections into text nodes; it has no effect on entity handling.

XMLReader with Safe Flags

<?php
// SECURE - XMLReader without entity substitution or DTD loading
function parse_xmlreader_secure($xml) {
    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }

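    $reader = new XMLReader();
    if (!$reader->XML($xml, null, LIBXML_NONET)) {
        throw new Exception('Unable to read XML');
    }

    $data = [];
    $currentElement = null;

    while ($reader->read()) {
        if ($reader->nodeType === XMLReader::ELEMENT) {
            // Remember the element so the text node that follows is attributed to it;
            // self-closing elements have no text and no END_ELEMENT event
            $currentElement = $reader->isEmptyElement ? null : $reader->name;
        } elseif ($reader->nodeType === XMLReader::TEXT && $currentElement !== null) {
            $data[$currentElement] = $reader->value;
        } elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
            $currentElement = null;
        }
    }

    $reader->close();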

    return $data;
}

// Usage:
try {
    $data = parse_xmlreader_secure($_POST['xml']);
    echo json_encode($data);
} catch (Exception $e) {
    http_response_code(400);
    echo json_encode(['error' => $e->getMessage()]);
}

Why this works:

  • Avoids entity substitution and blocks network access for streaming parsing.
  • Reads nodes one at a time instead of building a full DOM tree, which suits large XML inputs.

Validating XML Before Parsing

<?php
// SECURE - Block DOCTYPE entirely
function validate_and_parse_xml($xml) {
    // Block if contains DOCTYPE or ENTITY declarations
    if (stripos($xml, '<!DOCTYPE') !== false) {
        throw new Exception('DOCTYPE not allowed');
    }

    if (stripos($xml, '<!ENTITY') !== false) {
        throw new Exception('ENTITY declarations not allowed');
    }

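    // Check for named entity references other than the five predefined ones.
    // XML names may also contain ".", "-", and ":", so the pattern allows them.
    if (preg_match('/&(?!(?:amp|lt|gt|quot|apos);)[A-Za-z_:][\w.\-:]*;/', $xml)) {
        throw new Exception('Custom entity references not allowed');
    }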

    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }

    $dom = new DOMDocument();
    if (!$dom->loadXML($xml, LIBXML_NONET)) {
        throw new Exception('Invalid XML');
    }

    return $dom;
}

Why this works:

  • Rejects DOCTYPE/ENTITY and custom entity refs before parsing.
  • Parser hardening provides a second safety layer.
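
String checks such as stripos() work on raw bytes, so they can miss a DOCTYPE in a document encoded as UTF-16. A complementary check, sketched below, parses with the safe flags first and then rejects any document whose parsed tree carries a document type node. The function name parse_xml_reject_doctype is hypothetical, and the sketch assumes PHP 8.0+ with the DOM extension.

```php
<?php
// Sketch: reject DOCTYPE after parsing, using the parser's own view of the document.
// parse_xml_reject_doctype() is an illustrative name, not a library function.
function parse_xml_reject_doctype(string $xml): DOMDocument
{
    if ($xml === '') {
        throw new InvalidArgumentException('Empty XML');
    }

    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // No LIBXML_NOENT or LIBXML_DTDLOAD: entities are not substituted
    // and external DTDs are not loaded.
    $ok = $dom->loadXML($xml, LIBXML_NONET);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        throw new InvalidArgumentException('Invalid XML');
    }

    // DOMDocument::$doctype is non-null when the document declared a DOCTYPE
    if ($dom->doctype !== null) {
        throw new InvalidArgumentException('DOCTYPE not allowed');
    }

    return $dom;
}
```

Because the decision is made on the parsed tree, it does not depend on how the declaration was spelled or encoded in the input.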

Framework-Specific Guidance

Use these patterns as starting points for common PHP frameworks:

  • Enforce application/xml content types before parsing.
  • Centralize secure XML parsing helpers and reuse them.
  • Validate extracted fields with framework validators.

Laravel

<?php
// SECURE - Laravel controller with secure XML parsing

namespace App\Http\Controllers;

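use App\Models\User; // Assumes the default Laravel 8+ model namespace
use Illuminate\Http\Request;
use Illuminate\Http\JsonResponse;

class XmlController extends Controller
{
    public function processXml(Request $request): JsonResponse
    {
        // Validate content type, ignoring parameters such as "; charset=utf-8"
        $contentType = strtolower(trim(explode(';', (string) $request->header('Content-Type'))[0]));
        if ($contentType !== 'application/xml') {
            return response()->json(['error' => 'Invalid content type'], 400);
        }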

        try {
            $xml = $request->getContent();

            // Parse securely
            $dom = $this->parseXmlSecure($xml);

            // Extract data
            $name = $dom->getElementsByTagName('name')->item(0)?->nodeValue;
            $email = $dom->getElementsByTagName('email')->item(0)?->nodeValue;

            // Validate
            $validated = validator([
                'name' => $name,
                'email' => $email
            ], [
                'name' => 'required|string|max:100',
                'email' => 'required|email'
            ])->validate();

            // Create user
            $user = User::create($validated);

            return response()->json($user, 201);

        } catch (\Exception $e) {
            return response()->json(['error' => $e->getMessage()], 400);
        }
    }

    private function parseXmlSecure(string $xml): \DOMDocument
    {
        if (PHP_VERSION_ID < 80000) {
            libxml_disable_entity_loader(true);
        }
        libxml_use_internal_errors(true);

        $dom = new \DOMDocument();

        if (!$dom->loadXML($xml, LIBXML_NONET)) {
            $errors = libxml_get_errors();
            libxml_clear_errors();
            throw new \Exception('Invalid XML: ' . json_encode($errors));
        }

        return $dom;
    }
}

Why this works:

  • Validates application/xml and uses a hardened parser.
  • Uses framework validation before persistence.

Symfony

<?php
// SECURE - Symfony controller with XML parsing

namespace App\Controller;

use App\Entity\User; // Assumes the default Symfony entity namespace
use Doctrine\ORM\EntityManagerInterface;
use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\JsonResponse;
use Symfony\Component\Validator\Validator\ValidatorInterface;

class ApiController extends AbstractController
{
    public function processXml(
        Request $request,
        ValidatorInterface $validator,
        EntityManagerInterface $entityManager
    ): JsonResponse {
        // getContentTypeFormat() replaces getContentType() as of Symfony 6.2
        $format = method_exists($request, 'getContentTypeFormat')
            ? $request->getContentTypeFormat()
            : $request->getContentType();

        if ($format !== 'xml') {
            return new JsonResponse(['error' => 'Content-Type must be application/xml'], 400);
        }

        try {
            $xml = $request->getContent();

            // Parse with security
            if (PHP_VERSION_ID < 80000) {
                libxml_disable_entity_loader(true);
            }
            libxml_use_internal_errors(true);

            $dom = new \DOMDocument();
            if (!$dom->loadXML($xml, LIBXML_NONET)) {
                throw new \Exception('Invalid XML');
            }

            // Extract and create entity
            $user = new User();
            $user->setName($dom->getElementsByTagName('name')->item(0)?->nodeValue ?? '');
            $user->setEmail($dom->getElementsByTagName('email')->item(0)?->nodeValue ?? '');

            // Validate entity
            $errors = $validator->validate($user);

            if (count($errors) > 0) {
                return new JsonResponse(['errors' => (string)$errors], 400);
            }

            // Save (the entity manager is injected; getDoctrine() was removed in Symfony 6.0)
            $entityManager->persist($user);
            $entityManager->flush();

            return new JsonResponse([
                'id' => $user->getId(),
                'name' => $user->getName(),
                'email' => $user->getEmail()
            ], 201);

        } catch (\Exception $e) {
            return new JsonResponse(['error' => $e->getMessage()], 400);
        }
    }
}

Why this works:

  • Enforces XML content type and avoids entity substitution or DTD loading.
  • Validates the entity before saving.

RSS/Atom Feed Parsing

<?php
// SECURE - Parse RSS feed safely
function parse_rss_feed($feed_url) {
    // Fetch feed
    $xml = file_get_contents($feed_url);

    if ($xml === false) {
        throw new Exception('Failed to fetch feed');
    }

    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }

    // Parse with SimpleXML
    $feed = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NONET);

    if ($feed === false) {
        throw new Exception('Invalid RSS feed');
    }

    $items = [];
    foreach ($feed->channel->item as $item) {
        $items[] = [
            'title' => (string)$item->title,
            'link' => (string)$item->link,
            'description' => (string)$item->description
        ];
    }

    return $items;
}

// Usage:
try {
    $items = parse_rss_feed('https://example.com/feed.xml');
    foreach ($items as $item) {
        echo htmlspecialchars($item['title']) . "<br>";
    }
} catch (Exception $e) {
    error_log($e->getMessage());
}

SOAP Client

<?php
// SECURE - SOAP client with XXE protection
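class SecureSoapClient extends SoapClient {
    // On PHP 8.0+ this override is a pass-through. The attribute below avoids the
    // return-type deprecation notice that PHP 8.1+ may raise for untyped overrides
    // of internal methods; PHP 7 parses it as a comment.
    #[\ReturnTypeWillChange]
    public function __doRequest($request, $location, $action, $version, $one_way = 0) {
        if (PHP_VERSION_ID < 80000) {
            libxml_disable_entity_loader(true);
        }

        return parent::__doRequest($request, $location, $action, $version, $one_way);
    }
}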

// Usage:
$wsdl = 'https://example.com/service?wsdl';
$options = [
    'trace' => 1,
    'exceptions' => true,
    'features' => SOAP_SINGLE_ELEMENT_ARRAYS
];

$client = new SecureSoapClient($wsdl, $options);

try {
    $result = $client->someMethod(['param' => 'value']);
} catch (SoapFault $e) {
    error_log('SOAP error: ' . $e->getMessage());
}

Input Validation

<?php
// Validate XML structure and content
function validate_user_xml($xml) {
    // Pre-validation: Block dangerous patterns
    if (preg_match('/<!DOCTYPE|<!ENTITY/i', $xml)) {
        throw new Exception('DOCTYPE and ENTITY declarations not allowed');
    }

    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }

    $dom = new DOMDocument();
    if (!$dom->loadXML($xml, LIBXML_NONET)) {
        throw new Exception('Invalid XML format');
    }

    // Validate structure
    $root = $dom->documentElement;
    if ($root->tagName !== 'user') {
        throw new Exception('Root element must be <user>');
    }

    // Extract elements
    $name = $dom->getElementsByTagName('name')->item(0);
    $email = $dom->getElementsByTagName('email')->item(0);

    // Validate presence
    if (!$name || !$name->nodeValue) {
        throw new Exception('Name is required');
    }

    if (!$email || !$email->nodeValue) {
        throw new Exception('Email is required');
    }

    // Validate content
    $nameValue = trim($name->nodeValue);
    $emailValue = trim($email->nodeValue);

    if (strlen($nameValue) > 100) {
        throw new Exception('Name too long');
    }

    if (!filter_var($emailValue, FILTER_VALIDATE_EMAIL)) {
        throw new Exception('Invalid email format');
    }

    return [
        'name' => $nameValue,
        'email' => $emailValue
    ];
}
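
Where the expected structure is fixed, the hand-written checks above can be complemented by schema validation. The following is a minimal sketch that validates the same <user> document against an inline XSD with DOMDocument::schemaValidateSource(). The schema, the USER_XSD constant, and the validate_user_schema name are illustrative assumptions, and the sketch assumes PHP 8.0+.

```php
<?php
// Sketch: validate a <user> document against an inline XSD (illustrative schema).
const USER_XSD = <<<'XSD'
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="user">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="name">
          <xs:simpleType>
            <xs:restriction base="xs:string">
              <xs:minLength value="1"/>
              <xs:maxLength value="100"/>
            </xs:restriction>
          </xs:simpleType>
        </xs:element>
        <xs:element name="email" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
XSD;

function validate_user_schema(string $xml): DOMDocument
{
    $previous = libxml_use_internal_errors(true);
    try {
        $dom = new DOMDocument();
        if ($xml === '' || !$dom->loadXML($xml, LIBXML_NONET)) {
            throw new InvalidArgumentException('Invalid XML format');
        }
        // Reject documents that declare a DOCTYPE before validating structure
        if ($dom->doctype !== null) {
            throw new InvalidArgumentException('DOCTYPE not allowed');
        }
        // The schema is trusted and supplied from memory, not from the payload
        if (!$dom->schemaValidateSource(USER_XSD)) {
            throw new InvalidArgumentException('XML does not match the expected schema');
        }
        return $dom;
    } finally {
        // Always restore the caller's libxml error mode
        libxml_clear_errors();
        libxml_use_internal_errors($previous);
    }
}
```

Schema validation covers structure and lengths; field-level checks such as FILTER_VALIDATE_EMAIL still apply afterwards.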

PHP Configuration

; php.ini security settings

; There is no php.ini directive that disables external entity loading.
; PHP 8.0+ requires libxml 2.9.0 or later, which disables external entity
; loading by default, so libxml_disable_entity_loader() is deprecated there.

; For older PHP versions, always call in code:
; libxml_disable_entity_loader(true);

Remediation Steps

  1. Locate every XML parsing path, including DOMDocument, SimpleXML, XMLReader, SOAP clients, RSS/Atom feed parsing, uploaded XML/SVG files, and framework helpers.
  2. Identify whether untrusted XML can reach DTD loading, entity substitution, validation, external subsets, or downstream reparsing.
  3. Remove dangerous parse flags such as LIBXML_NOENT, LIBXML_DTDLOAD, LIBXML_DTDATTR, and LIBXML_DTDVALID from untrusted XML paths.
  4. Reject DOCTYPE and ENTITY declarations at the application boundary unless a documented trusted integration requires them.
  5. Use LIBXML_NONET, safe parser wrappers, and PHP-version-specific protections such as libxml_disable_entity_loader(true) on PHP < 8.0.
  6. Add size limits, schema validation where appropriate, and monitoring for unexpected file or network access during XML processing.
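
Steps 3 to 6 can be gathered into one central helper so that individual call sites cannot reintroduce unsafe flags. The following is a sketch under stated assumptions: the names load_untrusted_xml, xml_forbidden_flags, and UNTRUSTED_XML_MAX_BYTES are hypothetical, and the 1 MiB limit is an arbitrary example value.

```php
<?php
// Sketch of a central parsing helper; names and limits are illustrative assumptions.
const UNTRUSTED_XML_MAX_BYTES = 1048576; // 1 MiB, an arbitrary example limit

function xml_forbidden_flags(): int
{
    // Flags that enable entity substitution, DTD loading, or DTD validation
    return LIBXML_NOENT | LIBXML_DTDLOAD | LIBXML_DTDATTR | LIBXML_DTDVALID;
}

function load_untrusted_xml(string $xml, int $extraFlags = 0): DOMDocument
{
    // Refuse callers that try to opt back in to dangerous parser features
    if (($extraFlags & xml_forbidden_flags()) !== 0) {
        throw new InvalidArgumentException('Unsafe libxml flag requested for untrusted XML');
    }

    if ($xml === '' || strlen($xml) > UNTRUSTED_XML_MAX_BYTES) {
        throw new InvalidArgumentException('XML payload is empty or too large');
    }

    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }

    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $ok = $dom->loadXML($xml, LIBXML_NONET | $extraFlags);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        throw new InvalidArgumentException('Invalid XML');
    }

    // Reject DOCTYPE at the boundary, based on the parsed document
    if ($dom->doctype !== null) {
        throw new InvalidArgumentException('DOCTYPE not allowed');
    }

    return $dom;
}
```

A trusted integration that genuinely needs DTD features would use a separate, documented code path instead of weakening this helper.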

Testing

  • Test normal XML, SOAP, and RSS/Atom payloads expected by the application.
  • Test file-disclosure XXE payloads that reference local files and confirm they are rejected or never expanded.
  • Test SSRF-style external entities that reference internal HTTP endpoints or cloud metadata addresses.
  • Test Billion Laughs or nested entity expansion payloads with strict input size and timeout limits.
  • Test mixed-case DOCTYPE and ENTITY declarations across request bodies, uploads, queues, and third-party responses.
  • Retest static analysis findings for dangerous libxml flags and review runtime logs for blocked XML parsing attempts.
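
The file-disclosure test can be automated with a canary file. The sketch below writes a random marker to a temporary file, sends a payload whose external entity points at that file, and reports whether the marker shows up in the parsed output. The names parse_candidate and xxe_leaks_file are hypothetical; parse_candidate stands in for whichever parser wrapper is under test, and the file:// path construction assumes a POSIX-style filesystem.

```php
<?php
// Sketch of an XXE regression check; parse_candidate() stands in for the parser under test.
function parse_candidate(string $xml): ?DOMDocument
{
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $ok = $dom->loadXML($xml, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return $ok ? $dom : null;
}

function xxe_leaks_file(callable $parser): bool
{
    // Write a unique marker to a temporary file the payload will reference
    $marker = 'XXE_CANARY_' . bin2hex(random_bytes(8));
    $path = tempnam(sys_get_temp_dir(), 'xxe');
    file_put_contents($path, $marker);

    $payload = '<?xml version="1.0"?>'
        . '<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file://' . $path . '">]>'
        . '<root><name>&xxe;</name></root>';

    try {
        $dom = $parser($payload);
        if ($dom === null) {
            return false; // The parser rejected the payload outright
        }
        // The marker must not appear in the serialized or extracted content
        $output = $dom->saveXML() . $dom->documentElement->textContent;
        return strpos($output, $marker) !== false;
    } finally {
        unlink($path);
    }
}
```

A test suite would assert that xxe_leaks_file() returns false for every parsing helper the application exposes.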

Common Pitfalls

  • Assuming modern PHP defaults stay safe after adding LIBXML_NOENT, DTD loading, or validation flags.
  • Calling libxml_disable_entity_loader(true) on older PHP but still passing unsafe parse flags.
  • Blocking only DOCTYPE while allowing ENTITY declarations or downstream reparsing.
  • Sanitizing or validating XML after entity expansion has already occurred.
  • Treating LIBXML_NONET as protection against local file disclosure; it only blocks network access.
  • Protecting request-body XML but forgetting SOAP clients, feed readers, uploaded SVG/XML files, and queued payloads.
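
Several of these pitfalls come down to an unsafe flag surviving somewhere in the codebase. A rough source scan can help locate them; the sketch below uses token_get_all() so that mentions inside comments are ignored. The function name find_unsafe_libxml_flags is hypothetical, and the scan is a starting point for review, not a substitute for static analysis.

```php
<?php
// Sketch: list uses of libxml flags that are unsafe for untrusted XML in PHP source code.
function find_unsafe_libxml_flags(string $source): array
{
    $unsafe = ['LIBXML_NOENT', 'LIBXML_DTDLOAD', 'LIBXML_DTDATTR', 'LIBXML_DTDVALID'];
    $findings = [];

    foreach (token_get_all($source) as $token) {
        if (!is_array($token)) {
            continue; // Single-character tokens such as "|" or ";"
        }

        [$type, $text, $line] = $token;
        if ($type === T_COMMENT || $type === T_DOC_COMMENT) {
            continue; // Ignore flags that are only mentioned in comments
        }

        // Handle fully qualified names such as \LIBXML_NOENT
        $name = ltrim($text, '\\');
        if (in_array($name, $unsafe, true)) {
            $findings[] = ['line' => $line, 'flag' => $name];
        }
    }

    return $findings;
}
```

Each finding still needs a manual check of whether untrusted XML can reach that call.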

Dependencies and Installation

  • PHP DOM, SimpleXML, XMLReader, SOAP, and libxml behavior varies by PHP and libxml version; verify production versions.
  • libxml_disable_entity_loader() is relevant for PHP < 8.0 and deprecated in PHP 8.0 because external entity loading is disabled by default.
  • LIBXML_NO_XXE is available only on newer PHP/libxml combinations and should be used as additional protection when DTD/entity features are unavoidable.
  • Keep PHP and libxml current, and avoid parser wrappers that hide unsafe libxml flags.
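
Because the available protections depend on the runtime, it can help to detect them instead of assuming them. The sketch below reports the PHP and libxml versions and adds LIBXML_NO_XXE to the parse flags only when the constant exists. The function names untrusted_xml_flags and describe_xml_runtime are hypothetical.

```php
<?php
// Sketch: pick parse flags based on what the running PHP/libxml build offers.
function untrusted_xml_flags(): int
{
    $flags = LIBXML_NONET;

    // LIBXML_NO_XXE is only defined on newer PHP/libxml combinations,
    // so it is looked up at runtime instead of referenced directly.
    if (defined('LIBXML_NO_XXE')) {
        $flags |= constant('LIBXML_NO_XXE');
    }

    return $flags;
}

function describe_xml_runtime(): array
{
    return [
        'php' => PHP_VERSION,
        'libxml' => LIBXML_DOTTED_VERSION,
        // libxml_disable_entity_loader(true) is only relevant below PHP 8.0
        'needs_entity_loader_call' => PHP_VERSION_ID < 80000,
        'has_no_xxe_flag' => defined('LIBXML_NO_XXE'),
    ];
}
```

Logging the result of describe_xml_runtime() at deploy time gives a record of which protections the production environment actually has.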

Additional Resources