CWE-611: XML External Entity (XXE) Injection - Go

Overview

XXE vulnerabilities in Go applications occur when an XML parser resolves external entity references in untrusted XML input. Go's standard encoding/xml decoder does not resolve external entities: it surfaces <!DOCTYPE> and <!ENTITY> declarations as opaque directives and never fetches a file or URL on a document's behalf. This guide uses Go 1.16+ as its baseline. Developers who use third-party XML libraries, or who wire their own entity expansion into a decoder, can still introduce vulnerabilities.

Unlike ecosystems where XML parsers have often resolved external entities by default (Java, older PHP configurations), Go's standard library takes a secure-by-default approach. The xml.Decoder processes XML without expanding external entities, preventing file disclosure and SSRF attacks through XXE. However, developers must be aware of configuration that can weaken security, particularly custom entity maps on a decoder or third-party libraries such as libxml2 bindings.

The Go ecosystem strongly favors JSON over XML for APIs and configuration, reducing XXE exposure. When XML processing is necessary (SOAP services, legacy integrations, XML configurations), understanding the security characteristics of encoding/xml and avoiding third-party parsers with unsafe defaults is critical. The simplicity of Go's standard library XML parser - with minimal configuration options - is a security advantage, as there are fewer knobs to misconfigure compared to feature-rich parsers in other languages.

Primary Defence: Use Go's standard encoding/xml package (Go 1.16+) which doesn't resolve external entities by default. Avoid third-party XML libraries unless necessary, and if used, explicitly disable external entity processing and DTD validation.

Common Vulnerable Patterns

Old Go Versions (Pre-1.16)

// RISKY - Go versions before 1.16 are out of support and unpatched
package main

import (
    "encoding/xml"
    "fmt"
    "io"
    "net/http"
)

type Result struct {
    Data string `xml:"data"`
}

func parseXMLHandler(w http.ResponseWriter, r *http.Request) {
    // No size limit on the request body
    body, err := io.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "Read error", 400)
        return
    }

    var result Result
    // On a pre-1.16 toolchain, do not assume current parser behavior
    err = xml.Unmarshal(body, &result)
    if err != nil {
        // Echoes raw parser errors to the client
        http.Error(w, err.Error(), 400)
        return
    }

    fmt.Fprintf(w, "Data: %s", result.Data)
}

// Attack XML:
// <?xml version="1.0"?>
// <!DOCTYPE foo [
//   <!ENTITY xxe SYSTEM "file:///etc/passwd">
// ]>
// <result><data>&xxe;</data></result>
//
// A parser that resolves external entities would read /etc/passwd here

Why this is vulnerable: Go releases before 1.16 are out of support and no longer receive security fixes, including fixes to encoding/xml. The decoder exposes no API for loading external entities, so the payload above is expected to fail with a parse error rather than disclose a file, but on a legacy toolchain that should be verified instead of assumed. The handler also reads an unbounded request body and echoes raw parser errors to the client.

Third-Party XML Libraries

// VULNERABLE - Using CGo-based libxml2 binding
import (
    "io"
    "net/http"

    "github.com/lestrrat-go/libxml2"
)

func parseWithLibxml2(w http.ResponseWriter, r *http.Request) {
    body, err := io.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "Read error", 400)
        return
    }

    // DANGEROUS: libxml2 may have unsafe defaults
    doc, err := libxml2.Parse(body)
    if err != nil {
        http.Error(w, err.Error(), 400)
        return
    }
    defer doc.Free()

    // Process document...
}

// Attack: Same XXE payload can trigger file disclosure
// if libxml2 has entity resolution enabled

Why this is vulnerable: Third-party libraries, especially those wrapping native XML parsers through CGo (libxml2, expat), may have different security defaults than Go's standard library. Depending on the library version and the parse options passed, these parsers can load DTDs and substitute external entities, so they should be treated as vulnerable unless explicitly configured with security options. CGo dependencies also complicate deployment and security reviews.

Custom Entity Resolution

// VULNERABLE - Custom entity expansion backed by the filesystem
import (
    "encoding/xml"
    "io"
    "os"
)

// encoding/xml has no resolver callback or xml.Entity type; the only
// hook is the Decoder.Entity map of entity name -> replacement text.
func newConfigDecoder(r io.Reader) (*xml.Decoder, error) {
    // DANGEROUS: Entity value is loaded from the filesystem
    data, err := os.ReadFile("/etc/app/config.xml")
    if err != nil {
        return nil, err
    }

    decoder := xml.NewDecoder(r)
    decoder.Entity = map[string]string{
        "config": string(data),
    }
    return decoder, nil
}

// Attack: Any untrusted document that references &config; gets the
// file contents substituted into its parsed output

Why this is vulnerable: Implementing a custom entity resolver that reads files or makes network requests creates an intentional XXE vulnerability. Even if the implementation seems safe (checking entity names), it provides an attack surface if the resolver logic has flaws or can be bypassed. Custom resolvers should be avoided unless absolutely necessary and heavily restricted.
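Where a legacy feed really does depend on a handful of named entities, one lower-risk option is to fill Decoder.Entity with fixed literal strings instead of anything read from disk or the network. The sketch below assumes a hypothetical staticEntities allowlist, Note type, and decodeNote helper; none of these names come from a library.

```go
package main

import (
    "encoding/xml"
    "fmt"
    "strings"
)

// staticEntities is a hypothetical allowlist of compile-time literals.
// Nothing here is read from files, environment, or the network.
var staticEntities = map[string]string{
    "company": "Example Corp",
}

// Note is an illustrative document type.
type Note struct {
    Body string `xml:"body"`
}

// decodeNote decodes one document with the static entity map installed.
// Entities outside the map (and the five predefined XML entities) are
// rejected by the decoder's default strict mode.
func decodeNote(doc string) (Note, error) {
    dec := xml.NewDecoder(strings.NewReader(doc))
    dec.Entity = staticEntities

    var n Note
    err := dec.Decode(&n)
    return n, err
}

func main() {
    n, err := decodeNote(`<note><body>Hello &company;</body></note>`)
    fmt.Printf("body=%q err=%v\n", n.Body, err)

    _, err = decodeNote(`<note><body>&other;</body></note>`)
    fmt.Printf("unknown entity err=%v\n", err)
}
```

Because the replacement text is a constant, a crafted document can at most choose which allowlisted literal appears in its own output.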

DTD Validation Enabled

// POTENTIALLY VULNERABLE - If DTD validation is somehow enabled
// (Note: encoding/xml doesn't provide built-in DTD validation,
// but this illustrates the concept)

func parseWithValidation(w http.ResponseWriter, r *http.Request) {
    body, err := io.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "Read error", 400)
        return
    }

    // If using a library that validates DTDs,
    // external entity references in the DTD could be processed
    decoder := xml.NewDecoder(bytes.NewReader(body))

    // Process XML...
    // (encoding/xml skips content when the target is an empty interface;
    // real code should decode into a concrete struct)
    var result interface{}
    err = decoder.Decode(&result)
    if err != nil {
        http.Error(w, err.Error(), 400)
        return
    }
}

// Attack: DTD with external entity declarations

Why this is vulnerable: DTD validation can trigger external entity processing because DTDs can define entities that reference external resources. While Go's encoding/xml doesn't validate DTDs, third-party libraries that do can be vulnerable if they don't disable external entity resolution during validation.

Secure Patterns

Use encoding/xml (Go 1.16+) - Default Safe

// SECURE - Standard library is safe by default in Go 1.16+
package main

import (
    "encoding/json"
    "encoding/xml"
    "io"
    "log"
    "net/http"
)

type User struct {
    XMLName xml.Name `xml:"user"`
    Name    string   `xml:"name"`
    Email   string   `xml:"email"`
    Role    string   `xml:"role"`
}

func parseXMLHandler(w http.ResponseWriter, r *http.Request) {
    // Limit request body size to prevent DoS
    r.Body = http.MaxBytesReader(w, r.Body, 1048576) // 1MB limit
    defer r.Body.Close()

    body, err := io.ReadAll(r.Body)
    if err != nil {
        http.Error(w, "Request too large", http.StatusRequestEntityTooLarge)
        return
    }

    var user User
    // SECURE: encoding/xml doesn't process external entities (Go 1.16+)
    err = xml.Unmarshal(body, &user)
    if err != nil {
        log.Printf("XML parse error: %v", err)
        http.Error(w, "Invalid XML", http.StatusBadRequest)
        return
    }

    // Validate parsed data
    if user.Name == "" || user.Email == "" {
        http.Error(w, "Missing required fields", http.StatusBadRequest)
        return
    }

    // Process user data safely; encode the response so the name is escaped
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(map[string]string{
        "status": "success",
        "name":   user.Name,
    })
}

Why this works: In Go 1.16 and later (this guide's baseline), encoding/xml does not resolve external entities. The decoder does not interpret a declaration such as <!ENTITY xxe SYSTEM "file:///etc/passwd"> at all, so when it reaches &xxe; in its default strict mode it finds no definition and returns a syntax error instead of reading the file, preventing file disclosure and SSRF. The http.MaxBytesReader caps the request body at 1MB, which bounds memory use; because the decoder does not expand DTD-declared entities, XML bomb payloads (billion laughs, quadratic blowup) have nothing to expand. Input validation after parsing ensures that well-formed but incomplete documents are rejected before the business logic runs.
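This behavior is easy to check on a given toolchain. The following minimal sketch feeds the attack payload shown earlier to xml.Unmarshal and prints the outcome; the parseResult helper and xxePayload constant are illustrative names, not library APIs.

```go
package main

import (
    "encoding/xml"
    "fmt"
)

// Result mirrors the struct used in the handler examples.
type Result struct {
    Data string `xml:"data"`
}

// xxePayload is the external-entity attack document (illustrative).
const xxePayload = `<?xml version="1.0"?>
<!DOCTYPE foo [
  <!ENTITY xxe SYSTEM "file:///etc/passwd">
]>
<result><data>&xxe;</data></result>`

// parseResult is a hypothetical helper that decodes a single document
// with the default (strict) decoder settings.
func parseResult(doc []byte) (Result, error) {
    var res Result
    err := xml.Unmarshal(doc, &res)
    return res, err
}

func main() {
    // The undefined entity &xxe; is expected to surface as a parse error;
    // the decoder never opens the referenced file.
    res, err := parseResult([]byte(xxePayload))
    fmt.Printf("attack: data=%q err=%v\n", res.Data, err)

    // A document without entity tricks decodes normally.
    res, err = parseResult([]byte(`<result><data>ok</data></result>`))
    fmt.Printf("benign: data=%q err=%v\n", res.Data, err)
}
```

Running a check like this in CI is one way to confirm the parser's behavior on whichever Go versions a project supports.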

XML Decoder with Streaming (Large Documents)

// SECURE - Streaming XML parsing for large documents
import (
    "encoding/xml"
    "io"
    "net/http"
)

type Item struct {
    ID    string `xml:"id,attr"`
    Name  string `xml:"name"`
    Value string `xml:"value"`
}

func streamXMLHandler(w http.ResponseWriter, r *http.Request) {
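    // Cap the body size (adjust to the expected feed size), then
    // create a decoder that reads from the request body incrementally
    r.Body = http.MaxBytesReader(w, r.Body, 64<<20) // 64MB limit
    decoder := xml.NewDecoder(r.Body)
    defer r.Body.Close()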

    // SECURE: Decoder also doesn't resolve external entities
    var items []Item

    for {
        token, err := decoder.Token()
        if err == io.EOF {
            break
        }
        if err != nil {
            http.Error(w, "Parse error", 400)
            return
        }

        // Process start elements
        if se, ok := token.(xml.StartElement); ok {
            if se.Name.Local == "item" {
                var item Item
                if err := decoder.DecodeElement(&item, &se); err != nil {
                    http.Error(w, "Decode error", 400)
                    return
                }
                items = append(items, item)
            }
        }
    }

    // Process items...
    w.WriteHeader(http.StatusOK)
}

Why this works: xml.NewDecoder() creates a streaming decoder that processes XML incrementally, reducing memory usage for large documents. The decoder inherits the same XXE protection as xml.Unmarshal() - external entities are not resolved. Streaming also provides better control over parsing, allowing rejection of documents with unexpected structure before full parsing. This pattern is ideal for SOAP services or XML feeds where documents may be large.
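As one illustration of rejecting unexpected structure early, a token loop can track element nesting and stop as soon as a document goes deeper than the application expects. This is a minimal sketch; the checkDepth helper, the errTooDeep value, and the limits used are assumptions for the example.

```go
package main

import (
    "encoding/xml"
    "errors"
    "fmt"
    "io"
    "strings"
)

// errTooDeep is a hypothetical sentinel error for over-nested documents.
var errTooDeep = errors.New("xml nesting exceeds allowed depth")

// checkDepth streams through doc and returns errTooDeep as soon as an
// element is nested more than maxDepth levels. It stops reading at that
// point, so the rest of the document is never parsed.
func checkDepth(doc string, maxDepth int) error {
    dec := xml.NewDecoder(strings.NewReader(doc))
    depth := 0
    for {
        tok, err := dec.Token()
        if err == io.EOF {
            return nil
        }
        if err != nil {
            return err
        }
        switch tok.(type) {
        case xml.StartElement:
            depth++
            if depth > maxDepth {
                return errTooDeep
            }
        case xml.EndElement:
            depth--
        }
    }
}

func main() {
    doc := `<a><b><c/></b></a>`
    fmt.Println(checkDepth(doc, 2)) // nesting reaches 3 levels
    fmt.Println(checkDepth(doc, 3))
}
```

The same loop shape can enforce other structural limits, such as a maximum number of items, before any data is handed to business logic.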

Input Sanitization (Defense in Depth)

// SECURE - Reject XML with DOCTYPE declarations
import (
    "bytes"
    "encoding/xml"
    "io"
    "net/http"
)

func secureXMLHandler(w http.ResponseWriter, r *http.Request) {
    body, err := io.ReadAll(io.LimitReader(r.Body, 1048576))
    if err != nil {
        http.Error(w, "Read error", 500)
        return
    }
    defer r.Body.Close()

    // SECURE: Reject XML with DOCTYPE (defense in depth)
    if containsDOCTYPE(body) {
        http.Error(w, "DOCTYPE declarations not allowed", 400)
        return
    }

    // Decoding into an empty interface only checks well-formedness;
    // decode into a concrete struct to extract data
    var data interface{}
    err = xml.Unmarshal(body, &data)
    if err != nil {
        http.Error(w, "Invalid XML", 400)
        return
    }

    w.WriteHeader(http.StatusOK)
}

func containsDOCTYPE(xmlData []byte) bool {
    // Scan the whole buffer at once. A line-based bufio.Scanner stops at
    // any line longer than 64KB and would miss a DOCTYPE that follows it.
    return bytes.Contains(bytes.ToUpper(xmlData), []byte("<!DOCTYPE"))
}

Why this works: While encoding/xml doesn't process external entities, explicitly rejecting XML documents with <!DOCTYPE> declarations provides defense-in-depth. This prevents any possibility of XXE regardless of Go version or parser implementation changes. Scanning for DOCTYPE is fast and occurs before XML parsing, providing an early rejection point. This is particularly valuable if supporting multiple Go versions or as insurance against future parser changes.
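A byte scan can be complemented by asking the decoder itself, since encoding/xml reports DOCTYPE and other <!...> declarations as xml.Directive tokens. The sketch below walks the token stream and fails on the first directive; rejectDirectives and errDirective are hypothetical names used only for this example.

```go
package main

import (
    "bytes"
    "encoding/xml"
    "errors"
    "fmt"
    "io"
)

// errDirective is a hypothetical sentinel error for forbidden declarations.
var errDirective = errors.New("xml directives (DOCTYPE, ENTITY) are not allowed")

// rejectDirectives tokenizes doc and returns errDirective if any <!...>
// directive appears. It also returns syntax errors for malformed input.
func rejectDirectives(doc []byte) error {
    dec := xml.NewDecoder(bytes.NewReader(doc))
    for {
        tok, err := dec.Token()
        if err == io.EOF {
            return nil // reached the end without seeing a directive
        }
        if err != nil {
            return err
        }
        if _, ok := tok.(xml.Directive); ok {
            return errDirective
        }
    }
}

func main() {
    attack := []byte(`<!DOCTYPE foo [<!ENTITY xxe SYSTEM "file:///etc/passwd">]><a>&xxe;</a>`)
    fmt.Println(rejectDirectives(attack))

    benign := []byte(`<a><b>ok</b></a>`)
    fmt.Println(rejectDirectives(benign))
}
```

This tokenizes the input once before the real decode, trading a second pass for a check that follows the parser's own view of the document and ignores look-alike text inside comments or CDATA.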

Prefer JSON Over XML

// SECURE - Use JSON instead of XML (best practice)
import (
    "encoding/json"
    "io"
    "net/http"
)

type UserRequest struct {
    Name  string `json:"name"`
    Email string `json:"email"`
    Role  string `json:"role"`
}

func jsonHandler(w http.ResponseWriter, r *http.Request) {
    var user UserRequest

    // SECURE: JSON has no XXE vulnerabilities
    decoder := json.NewDecoder(io.LimitReader(r.Body, 1048576))
    decoder.DisallowUnknownFields() // Strict parsing

    err := decoder.Decode(&user)
    if err != nil {
        http.Error(w, "Invalid JSON", 400)
        return
    }
    defer r.Body.Close()

    // Validate
    if user.Name == "" || user.Email == "" {
        http.Error(w, "Missing fields", 400)
        return
    }

    // Process...
    w.Header().Set("Content-Type", "application/json")
    json.NewEncoder(w).Encode(map[string]string{
        "status": "success",
        "name":   user.Name,
    })
}

Why this works: JSON parsers have no concept of external entities, DTDs, or entity references - the format simply doesn't support them. By using JSON instead of XML for APIs and configuration, XXE vulnerabilities are completely eliminated. JSON is also simpler to parse, has broader client support, and generates smaller payloads. This is the recommended approach for new systems unless XML is required for compatibility (SOAP, legacy systems).

Framework-Specific Guidance

Gin with XML Binding

// SECURE - Gin XML binding uses encoding/xml
package main

import (
    "net/http"

    "github.com/gin-gonic/gin"
)

type Config struct {
    AppName string `xml:"appname"`
    Version string `xml:"version"`
    Debug   bool   `xml:"debug"`
}

func main() {
    r := gin.Default()

    r.POST("/config", configHandler)

    r.Run(":8080")
}

func configHandler(c *gin.Context) {
    // Limit request body size (1MB). Gin's MaxMultipartMemory only applies
    // to multipart form parsing and does not cap XML bodies.
    c.Request.Body = http.MaxBytesReader(c.Writer, c.Request.Body, 1<<20)

    var config Config

    // SECURE: Gin uses encoding/xml internally
    if err := c.ShouldBindXML(&config); err != nil {
        c.JSON(http.StatusBadRequest, gin.H{"error": "Invalid XML"})
        return
    }

    // Validate
    if config.AppName == "" {
        c.JSON(http.StatusBadRequest, gin.H{"error": "appname required"})
        return
    }

    c.JSON(http.StatusOK, gin.H{
        "status": "configured",
        "app":    config.AppName,
    })
}

Why this works: Gin's ShouldBindXML() and BindXML() methods use Go's standard encoding/xml package internally, inheriting its XXE protection. Gin does not cap request bodies on its own (MaxMultipartMemory only governs multipart form parsing), so the handler wraps the body with http.MaxBytesReader. Binding populates the struct from matching elements but is not schema validation: unknown elements are ignored, so required fields are checked explicitly after binding. With these measures, Gin is safe for XML processing in modern Go versions.

Echo XML Deserialization

// SECURE - Echo with XML binding
package main

import (
    "net/http"

    "github.com/labstack/echo/v4"
    "github.com/labstack/echo/v4/middleware"
)

type Message struct {
    To      string `xml:"to"`
    From    string `xml:"from"`
    Content string `xml:"content"`
}

func main() {
    e := echo.New()

    // Middleware
    e.Use(middleware.Logger())
    e.Use(middleware.Recover())
    e.Use(middleware.BodyLimit("1M")) // Limit body size

    e.POST("/message", messageHandler)

    e.Start(":8080")
}

func messageHandler(c echo.Context) error {
    var msg Message

    // SECURE: Echo uses encoding/xml
    if err := c.Bind(&msg); err != nil {
        return c.JSON(http.StatusBadRequest, map[string]string{
            "error": "Invalid XML",
        })
    }

    // Validate
    if msg.To == "" || msg.From == "" {
        return c.JSON(http.StatusBadRequest, map[string]string{
            "error": "Missing required fields",
        })
    }

    return c.JSON(http.StatusOK, map[string]interface{}{
        "status":  "received",
        "to":      msg.To,
        "from":    msg.From,
        "preview": msg.Content[:min(len(msg.Content), 50)],
    })
}

func min(a, b int) int {
    if a < b {
        return a
    }
    return b
}

Why this works: Echo's Bind() method detects the request content type and uses the matching deserializer. For Content-Type: application/xml, it uses encoding/xml, providing XXE protection. The middleware.BodyLimit middleware enforces a request size limit across all routes, bounding the memory an oversized XML payload can consume. The handler returns generic error messages for binding failures, so parser details are not exposed to the client.

Additional Resources