CWE-611: XML External Entity (XXE) Injection - PHP
Overview
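PHP's XML parsers (SimpleXML, DOMDocument, XMLReader) are built on libxml2. They resolve external entities by default when PHP is linked against libxml2 older than 2.9.0, and on any version when the parser is called with flags such as LIBXML_NOENT or LIBXML_DTDLOAD. The result can be file disclosure, SSRF attacks, and denial of service. Proper configuration is essential to prevent XXE vulnerabilities.
Primary Defence: On PHP < 8.0, call libxml_disable_entity_loader(true) before parsing XML. On every version, never pass LIBXML_NOENT (the flag enables entity substitution; it is not a boolean setting) or LIBXML_DTDLOAD when parsing untrusted XML, and pass LIBXML_NONET to block network access.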
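The primary defence can be wrapped in one version-aware helper. The following is a minimal sketch, not a definitive implementation: the function name load_untrusted_xml is hypothetical, and it assumes that rejecting empty input and throwing InvalidArgumentException suits the calling code.

```php
<?php
// Hypothetical helper (the name is illustrative, not part of any library).
function load_untrusted_xml(string $xml): DOMDocument
{
    // libxml_disable_entity_loader() is deprecated since PHP 8.0,
    // so it is only called on older runtimes.
    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }

    // Collect parse errors internally instead of emitting warnings.
    $previous = libxml_use_internal_errors(true);

    $dom = new DOMDocument();
    // LIBXML_NONET blocks network access. LIBXML_NOENT and LIBXML_DTDLOAD
    // are deliberately not passed.
    $ok = $xml !== '' && $dom->loadXML($xml, LIBXML_NONET);

    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        throw new InvalidArgumentException('Invalid XML');
    }

    return $dom;
}
```

Keeping the flags in one place means a later review only has to audit a single loadXML call.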
Common Vulnerable Patterns
simplexml_load_string (Default)
<?php
// VULNERABLE - simplexml processes external entities by default
$xml = $_POST['xml'];
$data = simplexml_load_string($xml); // DANGEROUS!
// Attacker sends:
// <?xml version="1.0"?>
// <!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]>
// <root><name>&xxe;</name></root>
Why this is vulnerable:
- External entities are processed by default on older PHP (builds linked against libxml2 < 2.9.0), and on any version once LIBXML_NOENT is passed.
- Enables file disclosure, SSRF, and entity expansion DoS.
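The effect of LIBXML_NOENT can be observed without touching the filesystem by using a harmless internal entity. This is an illustrative sketch; the helper name first_child_type and the entity name greeting are made up for the example. It inspects which node type DOMDocument produces for the entity reference with and without the flag.

```php
<?php
// A document with an internal (inline) entity, so nothing external is loaded.
$xml = '<?xml version="1.0"?>'
     . '<!DOCTYPE root [<!ENTITY greeting "hello">]>'
     . '<root><name>&greeting;</name></root>';

// Hypothetical helper: returns the node type of the first child of <name>.
function first_child_type(string $xml, int $flags): int
{
    $dom = new DOMDocument();
    $dom->loadXML($xml, $flags);

    return $dom->getElementsByTagName('name')->item(0)->firstChild->nodeType;
}

// Without LIBXML_NOENT the reference is kept as an entity reference node.
$keptAsReference = first_child_type($xml, 0) === XML_ENTITY_REF_NODE;

// With LIBXML_NOENT the parser substitutes the entity and produces text.
$substituted = first_child_type($xml, LIBXML_NOENT) === XML_TEXT_NODE;

var_dump($keptAsReference, $substituted);
```

The same substitution step is what pulls in file contents when the entity is declared with SYSTEM, which is why the flag should not be used on untrusted input.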
DOMDocument::loadXML (Default)
<?php
// VULNERABLE - DOMDocument allows external entities
$xml = file_get_contents('php://input');
$dom = new DOMDocument();
$dom->loadXML($xml); // DANGEROUS!
$name = $dom->getElementsByTagName('name')->item(0)->nodeValue;
Why this is vulnerable:
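- External entities are loaded by default on PHP builds linked against libxml2 < 2.9.0, and on any version once LIBXML_NOENT or LIBXML_DTDLOAD is passed.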
- Enables file disclosure and SSRF.
XMLReader (Default)
<?php
// VULNERABLE - XMLReader with default settings
$xml = $_POST['xml'];
$reader = new XMLReader();
$reader->XML($xml); // Allows external entities!
while ($reader->read()) {
if ($reader->nodeType == XMLReader::ELEMENT && $reader->name == 'name') {
echo $reader->readString();
}
}
Why this is vulnerable:
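- DTDs and external entities are processed by default on PHP builds linked against libxml2 < 2.9.0, or when parser properties such as XMLReader::SUBST_ENTITIES or XMLReader::LOADDTD are enabled.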
- Enables file disclosure and SSRF via XML input.
SimpleXMLElement
<?php
// VULNERABLE - SimpleXMLElement processes entities
$xml = $_POST['xml'];
$element = new SimpleXMLElement($xml); // DANGEROUS!
echo $element->name;
Why this is vulnerable:
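- External entities are processed by default on PHP builds linked against libxml2 < 2.9.0, and on any version once LIBXML_NOENT is passed as an option.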
- Enables file disclosure and blind SSRF.
Secure Patterns
DOMDocument with libxml_disable_entity_loader
<?php
// SECURE - Disable external entity loading
function parse_xml_secure($xml) {
// Disable external entity loading globally (needed only before PHP 8.0;
// the function is deprecated from 8.0 onward)
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
// Collect parse errors internally instead of emitting warnings
$previous = libxml_use_internal_errors(true);
$dom = new DOMDocument();
// Do NOT pass LIBXML_NOENT (it enables entity substitution) or
// LIBXML_DTDLOAD / LIBXML_DTDATTR (they load DTDs). LIBXML_NONET blocks network access.
$success = $dom->loadXML($xml, LIBXML_NONET);
libxml_use_internal_errors($previous);
if (!$success) {
throw new Exception('Invalid XML');
}
return $dom;
}
// Usage:
try {
$dom = parse_xml_secure($_POST['xml']);
$name = $dom->getElementsByTagName('name')->item(0)->nodeValue;
} catch (Exception $e) {
http_response_code(400);
echo json_encode(['error' => $e->getMessage()]);
}
Why this works:
- On PHP < 8.0, libxml_disable_entity_loader(true) globally disables external entity resolution in libxml2.
- Works across DOMDocument, SimpleXML, and XMLReader.
simplexml_load_string with Entity Loader Disabled
<?php
// SECURE - Disable entities before parsing
function parse_simplexml_secure($xml) {
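// Disable external entity loading (needed only before PHP 8.0; deprecated from 8.0 onward)
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
// Collect errors internally, then clear any left over from earlier parses
libxml_use_internal_errors(true);
libxml_clear_errors();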
// Parse XML
$data = simplexml_load_string(
$xml,
'SimpleXMLElement',
LIBXML_NONET | LIBXML_NOCDATA // Block network access
);
if ($data === false) {
$errors = libxml_get_errors();
libxml_clear_errors();
throw new Exception('XML parse error: ' . json_encode($errors));
}
return $data;
}
// Usage:
try {
$xml_obj = parse_simplexml_secure($_POST['xml']);
$name = (string)$xml_obj->name;
$email = (string)$xml_obj->email;
} catch (Exception $e) {
http_response_code(400);
echo json_encode(['error' => $e->getMessage()]);
}
Why this works:
- Disables external entities and blocks network access.
- SimpleXML stays safe for untrusted input.
XMLReader with Secure Settings
<?php
// SECURE - XMLReader with entity loading disabled
function parse_xmlreader_secure($xml) {
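// Needed only before PHP 8.0; the function is deprecated from 8.0 onward
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
$reader = new XMLReader();
if ($xml === '' || !$reader->XML($xml, null, LIBXML_NONET)) {
throw new Exception('Invalid XML');
}
$data = [];
$current = null; // Name of the element whose text is expected next
while ($reader->read()) {
if ($reader->nodeType === XMLReader::ELEMENT) {
$current = $reader->name;
} elseif ($reader->nodeType === XMLReader::TEXT && $current !== null) {
// Record the text without consuming an extra node, so child elements are not skipped
$data[$current] = $reader->value;
} elseif ($reader->nodeType === XMLReader::END_ELEMENT) {
$current = null;
}
}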
$reader->close();
return $data;
}
// Usage:
try {
$data = parse_xmlreader_secure($_POST['xml']);
echo json_encode($data);
} catch (Exception $e) {
http_response_code(400);
echo json_encode(['error' => $e->getMessage()]);
}
Why this works:
- Blocks external entities and network access for streaming parsing.
- Safe for large XML inputs without loading the full document.
Validating XML Before Parsing
<?php
// SECURE - Block DOCTYPE entirely
function validate_and_parse_xml($xml) {
// Block if contains DOCTYPE or ENTITY declarations
if (stripos($xml, '<!DOCTYPE') !== false) {
throw new Exception('DOCTYPE not allowed');
}
if (stripos($xml, '<!ENTITY') !== false) {
throw new Exception('ENTITY declarations not allowed');
}
// Check for entity references (except safe ones)
if (preg_match('/&(?!amp;|lt;|gt;|quot;|apos;)[a-zA-Z0-9_]+;/', $xml)) {
throw new Exception('Custom entity references not allowed');
}
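// Disable entity loading (needed only before PHP 8.0; deprecated from 8.0 onward)
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
$dom = new DOMDocument();
// LIBXML_DTDLOAD and LIBXML_DTDATTR are omitted: they would ask libxml2 to load DTDs
if (!$dom->loadXML($xml, LIBXML_NONET)) {
throw new Exception('Invalid XML');
}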
return $dom;
}
Why this works:
- Rejects DOCTYPE/ENTITY and custom entity refs before parsing.
- Parser hardening provides a second safety layer.
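String matching on the raw input can miss a DOCTYPE in documents that use another encoding such as UTF-16. A complementary check is to let the hardened parser read the document and then reject it if a document type node is present. The sketch below assumes this stricter policy is acceptable for the application; the function name parse_xml_without_doctype is hypothetical.

```php
<?php
// Hypothetical helper: parse first, then reject any document that carries a DOCTYPE.
function parse_xml_without_doctype(string $xml): DOMDocument
{
    // Needed only before PHP 8.0; the function is deprecated from 8.0 onward.
    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }

    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    // No LIBXML_NOENT or LIBXML_DTDLOAD, so entities are not substituted
    // and no external DTD is fetched while parsing.
    $ok = $xml !== '' && $dom->loadXML($xml, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    if (!$ok) {
        throw new InvalidArgumentException('Invalid XML');
    }

    // The DOCTYPE, if any, appears as a top-level child of the document.
    foreach ($dom->childNodes as $child) {
        if ($child->nodeType === XML_DOCUMENT_TYPE_NODE) {
            throw new InvalidArgumentException('DOCTYPE not allowed');
        }
    }

    return $dom;
}
```

Because the check runs on the parsed tree, it does not depend on how the DOCTYPE was spelled or encoded in the raw bytes.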
Framework-Specific Guidance
Use these patterns as starting points for common PHP frameworks:
- Enforce application/xml content types before parsing.
- Centralize secure XML parsing helpers and reuse them.
- Validate extracted fields with framework validators.
Laravel
<?php
// SECURE - Laravel controller with secure XML parsing
namespace App\Http\Controllers;
use App\Models\User; // Assumes the default Laravel model location
use Illuminate\Http\Request;
use Illuminate\Http\JsonResponse;
class XmlController extends Controller
{
public function processXml(Request $request): JsonResponse
{
// Validate content type (prefix match so "application/xml; charset=utf-8" is accepted)
if (!str_starts_with((string) $request->header('Content-Type'), 'application/xml')) {
return response()->json(['error' => 'Invalid content type'], 400);
}
try {
$xml = $request->getContent();
// Parse securely
$dom = $this->parseXmlSecure($xml);
// Extract data
$name = $dom->getElementsByTagName('name')->item(0)?->nodeValue;
$email = $dom->getElementsByTagName('email')->item(0)?->nodeValue;
// Validate
$validated = validator([
'name' => $name,
'email' => $email
], [
'name' => 'required|string|max:100',
'email' => 'required|email'
])->validate();
// Create user
$user = User::create($validated);
return response()->json($user, 201);
} catch (\Exception $e) {
return response()->json(['error' => $e->getMessage()], 400);
}
}
private function parseXmlSecure(string $xml): \DOMDocument
{
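// libxml_disable_entity_loader() is deprecated since PHP 8.0; call it only on older runtimes
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
libxml_use_internal_errors(true);
$dom = new \DOMDocument();
// LIBXML_DTDLOAD is omitted so that no external DTD is loaded
if (!$dom->loadXML($xml, LIBXML_NONET)) {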
$errors = libxml_get_errors();
libxml_clear_errors();
throw new \Exception('Invalid XML: ' . json_encode($errors));
}
return $dom;
}
}
Why this works:
- Validates application/xml and uses a hardened parser.
- Uses framework validation before persistence.
Symfony
<?php
// SECURE - Symfony controller with XML parsing
namespace App\Controller;
use App\Entity\User; // Assumes the entity lives in the default App\Entity namespace
use Symfony\Bundle\FrameworkBundle\Controller\AbstractController;
use Symfony\Component\HttpFoundation\Request;
use Symfony\Component\HttpFoundation\JsonResponse;
use Symfony\Component\Validator\Validator\ValidatorInterface;
class ApiController extends AbstractController
{
public function processXml(Request $request, ValidatorInterface $validator): JsonResponse
{
// On Symfony 6.2+, getContentType() is deprecated; use $request->getContentTypeFormat() there
if ($request->getContentType() !== 'xml') {
return new JsonResponse(['error' => 'Content-Type must be application/xml'], 400);
}
try {
$xml = $request->getContent();
// Parse with security
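// libxml_disable_entity_loader() is deprecated since PHP 8.0; call it only on older runtimes
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}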
libxml_use_internal_errors(true);
$dom = new \DOMDocument();
if (!$dom->loadXML($xml, LIBXML_NONET)) {
throw new \Exception('Invalid XML');
}
// Extract and create entity
$user = new User();
$user->setName($dom->getElementsByTagName('name')->item(0)?->nodeValue ?? '');
$user->setEmail($dom->getElementsByTagName('email')->item(0)?->nodeValue ?? '');
// Validate entity
$errors = $validator->validate($user);
if (count($errors) > 0) {
return new JsonResponse(['errors' => (string)$errors], 400);
}
// Save (getDoctrine() was removed in Symfony 6; inject EntityManagerInterface there instead)
$entityManager = $this->getDoctrine()->getManager();
$entityManager->persist($user);
$entityManager->flush();
return new JsonResponse([
'id' => $user->getId(),
'name' => $user->getName(),
'email' => $user->getEmail()
], 201);
} catch (\Exception $e) {
return new JsonResponse(['error' => $e->getMessage()], 400);
}
}
}
Why this works:
- Enforces XML content type and disables entity loading.
- Validates the entity before saving.
RSS/Atom Feed Parsing
<?php
// SECURE - Parse RSS feed safely
function parse_rss_feed($feed_url) {
// Fetch feed
$xml = file_get_contents($feed_url);
if ($xml === false) {
throw new Exception('Failed to fetch feed');
}
// Disable entity loading (needed only before PHP 8.0; deprecated from 8.0 onward)
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
// Parse with SimpleXML
$feed = simplexml_load_string($xml, 'SimpleXMLElement', LIBXML_NONET);
if ($feed === false) {
throw new Exception('Invalid RSS feed');
}
$items = [];
foreach ($feed->channel->item as $item) {
$items[] = [
'title' => (string)$item->title,
'link' => (string)$item->link,
'description' => (string)$item->description
];
}
return $items;
}
// Usage:
try {
$items = parse_rss_feed('https://example.com/feed.xml');
foreach ($items as $item) {
echo htmlspecialchars($item['title']) . "<br>";
}
} catch (Exception $e) {
error_log($e->getMessage());
}
SOAP Client
<?php
// SECURE - SOAP client with XXE protection
class SecureSoapClient extends SoapClient {
public function __doRequest($request, $location, $action, $version, $one_way = 0) {
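// Disable entity loading before the SOAP response is parsed
// (needed only before PHP 8.0; the function is deprecated from 8.0 onward)
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}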
return parent::__doRequest($request, $location, $action, $version, $one_way);
}
}
// Usage:
$wsdl = 'https://example.com/service?wsdl';
$options = [
'trace' => 1,
'exceptions' => true,
'features' => SOAP_SINGLE_ELEMENT_ARRAYS
];
$client = new SecureSoapClient($wsdl, $options);
try {
$result = $client->someMethod(['param' => 'value']);
} catch (SoapFault $e) {
error_log('SOAP error: ' . $e->getMessage());
}
Input Validation
<?php
// Validate XML structure and content
function validate_user_xml($xml) {
// Pre-validation: Block dangerous patterns
if (stripos($xml, '<!DOCTYPE') !== false) {
throw new Exception('DOCTYPE declarations not allowed');
}
// Parse securely (the entity loader call is needed only before PHP 8.0)
if (PHP_VERSION_ID < 80000) {
libxml_disable_entity_loader(true);
}
$dom = new DOMDocument();
if (!$dom->loadXML($xml, LIBXML_NONET)) {
throw new Exception('Invalid XML format');
}
// Validate structure
$root = $dom->documentElement;
if ($root->tagName !== 'user') {
throw new Exception('Root element must be <user>');
}
// Extract elements
$name = $dom->getElementsByTagName('name')->item(0);
$email = $dom->getElementsByTagName('email')->item(0);
// Validate presence
if (!$name || !$name->nodeValue) {
throw new Exception('Name is required');
}
if (!$email || !$email->nodeValue) {
throw new Exception('Email is required');
}
// Validate content
$nameValue = trim($name->nodeValue);
$emailValue = trim($email->nodeValue);
if (strlen($nameValue) > 100) {
throw new Exception('Name too long');
}
if (!filter_var($emailValue, FILTER_VALIDATE_EMAIL)) {
throw new Exception('Invalid email format');
}
return [
'name' => $nameValue,
'email' => $emailValue
];
}
Verification
After implementing the recommended secure patterns, verify the fix through multiple approaches:
- Manual testing: Submit XXE payloads (an external entity pointing at file:///etc/passwd, an external DTD hosted on a server you control, a nested entity-expansion payload) and confirm that no file content is returned, no outbound request is made, and the parser rejects or ignores the entity
- Code review: Confirm every XML parsing call uses the secure pattern: no LIBXML_NOENT, LIBXML_DTDLOAD, or LIBXML_DTDVALID on untrusted input, and libxml_disable_entity_loader(true) on PHP < 8.0
- Static analysis: Use security scanners to verify no new vulnerabilities exist and the original finding is resolved
- Regression testing: Ensure legitimate XML documents and application workflows continue to function correctly
- Edge case validation: Test with alternative encodings (for example UTF-16), parameter entities, very large documents, and malformed XML to verify proper handling
- Framework verification: If using a framework or library, confirm the recommended APIs are used correctly according to documentation
- Authentication/session testing: Verify that XML endpoints still enforce their security controls and cannot be bypassed (if applicable)
- Rescan: Run the security scanner again to confirm the finding is resolved and no new issues were introduced
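The manual testing step can be automated as a regression check. The sketch below is one possible approach under stated assumptions: it writes a random canary value to a temporary file, builds an XXE payload that points at it, and reports whether the parser under test returns the canary. The function name xxe_payload_leaks and the callable contract (takes an XML string, returns extracted text) are hypothetical, and the file:// URL construction assumes a POSIX-style absolute path.

```php
<?php
// Hypothetical check: returns true if $parser leaks a local file through an external entity.
function xxe_payload_leaks(callable $parser): bool
{
    $secret = 'xxe-canary-' . bin2hex(random_bytes(8));
    $path = tempnam(sys_get_temp_dir(), 'xxe');
    file_put_contents($path, $secret);

    // Assumes a POSIX-style absolute path, giving a URL like file:///tmp/xxeAbc123
    $payload = '<?xml version="1.0"?>'
        . '<!DOCTYPE root [<!ENTITY xxe SYSTEM "file://' . $path . '">]>'
        . '<root><name>&xxe;</name></root>';

    try {
        $output = (string) $parser($payload);
    } catch (Throwable $e) {
        // A parser that rejects the payload outright does not leak.
        $output = '';
    } finally {
        unlink($path);
    }

    return strpos($output, $secret) !== false;
}

// Parser under test: hardened settings, returns the document's text content.
$hardened = function (string $xml): string {
    if (PHP_VERSION_ID < 80000) {
        libxml_disable_entity_loader(true);
    }
    $previous = libxml_use_internal_errors(true);
    $dom = new DOMDocument();
    $ok = $dom->loadXML($xml, LIBXML_NONET);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    return ($ok && $dom->documentElement) ? $dom->documentElement->textContent : '';
};

var_dump(xxe_payload_leaks($hardened));
```

Passing the application's real parsing function as the callable lets the same check run in a test suite after every change to XML handling.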
PHP Configuration
; php.ini security settings
; The libxml extension has no php.ini directive for external entity loading.
; PHP 8.0+: libxml_disable_entity_loader() is deprecated because PHP requires
; libxml2 >= 2.9.0, where external entity loading is disabled by default.
; External entities are still resolved if code passes LIBXML_NOENT.
; For older PHP versions, always call in code:
; libxml_disable_entity_loader(true);
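On PHP 8.0 and later, where libxml_disable_entity_loader() is deprecated, an application-wide safeguard can be registered with libxml_set_external_entity_loader(). The following sketch installs a resolver that refuses every external resource; the wrapper name deny_external_xml_resources is hypothetical, and whether a blanket denial fits depends on whether any trusted code path needs external DTDs.

```php
<?php
// Hypothetical wrapper: register a resolver that denies all external entities and DTDs.
function deny_external_xml_resources(): bool
{
    return libxml_set_external_entity_loader(
        // libxml passes the public ID, the system ID (URL or path), and a context array.
        static function ($publicId, $systemId, $context) {
            // Returning null tells libxml that the resource could not be resolved.
            return null;
        }
    );
}

// Typically called once during application bootstrap.
deny_external_xml_resources();
```

Passing null to libxml_set_external_entity_loader() restores the default loader, which is useful in tests that need the original behaviour back.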