CWE-117: Log Injection

Overview

Log Injection occurs when untrusted user input is written to logs without proper validation or encoding, allowing attackers to forge log entries, hide malicious activity, or inject misleading information.

OWASP Classification

A09:2025 - Security Logging & Alerting Failures

Risk

Medium: Attackers can manipulate log files, obscure their actions, or inject malicious content that may be interpreted by log analysis tools, leading to incident response failures or further exploitation.

Remediation Steps

Core principle: Use structured JSON/ECS logging to separate data from structure so untrusted input is always encoded inside fields and cannot forge entries or inject control characters.

Locate the log injection vulnerability in your code

Review the security findings to identify where untrusted data is written to logs
Find the source: where untrusted data enters (HTTP parameters, headers, cookies, external files, databases, network requests)
Trace to the sink: locate the logging statement (logger.info(), log.warn(), console.log(), etc.)
Check the data flow: review each frame in the data_path for missing encoding

Use structured JSON/ECS logging (Primary Defense)

RECOMMENDED: Emit JSON/ECS at the logging sink. Configure the logger to emit JSON (or ECS JSON) so control characters are encoded within fields rather than rendered as line breaks.
Why JSON/ECS is preferred: Encoding happens centrally and consistently, preserving full forensic evidence while preventing log forging.
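Coverage: Any conforming JSON encoder escapes the ASCII control characters \x00-\x1F. DEL (\x7F) and the Unicode line separators (\u0085, \u2028, \u2029) are legal unescaped inside JSON strings, so confirm that your formatter escapes them (for example through an ASCII-only output option) or encode them before logging.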
Use framework-provided JSON/ECS formatters: e.g., Logback/Log4j2 JSON, Serilog ECS/JSON, python-json-logger/structlog, winston/pino JSON.
Fallback when JSON/ECS isn’t available: Encode the full ASCII control range (\x00-\x1F, \x7F) plus Unicode line separators (\u0085, \u2028, \u2029) to visible sequences; do not remove them.
Limit length: Truncate very long strings to prevent log flooding (e.g., max 1000 characters).
Handle ANSI escape sequences: Disable colorized output in production or encode/remove ANSI control codes.
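
As a minimal sketch of the JSON-at-the-sink approach, the following uses only the Python standard library. The names JsonLineFormatter, render, and MAX_FIELD_LENGTH are illustrative, the 1000-character limit is taken from the example above, and the field names are ECS-like rather than a complete ECS implementation; a maintained formatter from your logging framework is usually the better choice.

```python
import io
import json
import logging

# Attribute names present on every LogRecord; anything else arrived via `extra=`.
_STANDARD_ATTRS = set(logging.LogRecord("", 0, "", 0, "", (), None).__dict__) | {
    "message",
    "asctime",
}

MAX_FIELD_LENGTH = 1000  # assumed limit, adjust to your own policy


class JsonLineFormatter(logging.Formatter):
    """Emit each record as a single JSON object on one line."""

    def format(self, record: logging.LogRecord) -> str:
        event = {
            "@timestamp": self.formatTime(record),
            "log.level": record.levelname,
            "message": record.getMessage(),
        }
        # Copy caller-supplied fields, truncating long values.
        for key, value in record.__dict__.items():
            if key not in _STANDARD_ATTRS:
                text = value if isinstance(value, str) else repr(value)
                event[key] = text[:MAX_FIELD_LENGTH]
        # ensure_ascii=True escapes all non-ASCII characters,
        # including U+0085, U+2028, and U+2029.
        line = json.dumps(event, ensure_ascii=True)
        # DEL (0x7F) is ASCII and JSON does not require escaping it,
        # so it is escaped explicitly here.
        return line.replace("\x7f", "\\u007f")


def render(message: str, **fields: str) -> str:
    """Log one event to an in-memory stream and return the raw output."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonLineFormatter())
    logger = logging.getLogger("json_demo")
    logger.handlers = [handler]
    logger.propagate = False
    logger.setLevel(logging.INFO)
    logger.info(message, extra=fields)
    return stream.getvalue()
```

Calling render("User input", user_input="test\nFAKE") yields one physical line in which the newline appears as the two characters \n inside the user_input field, and parsing that line as JSON recovers the original value.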

Use structured logging to separate data from format

Use JSON/ECS logging: Emit one JSON object per log entry with separate fields for message and user data.
Ensure one event per line: Configure JSON layouts with an end-of-event delimiter (e.g., eventEol=true in Log4j2's JsonLayout) so entries are not merged.
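Parameterize log messages: Use placeholders instead of concatenating user data into log strings (logger.info("User: {}", username), not logger.info("User: " + username)). Placeholders let a structured formatter capture the value as a separate field; with a plain-text layout they do not neutralize control characters on their own.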
Avoid string concatenation: Don’t build log messages by concatenating untrusted data.
Use logging framework features: Use MDC (Mapped Diagnostic Context) or similar to separate data from structure.
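
Python's logging module has no built-in MDC, so the sketch below approximates one with contextvars and a logging.Filter. The names bind_context and ContextFilter are hypothetical, and the approach assumes a structured formatter is attached downstream to emit the extra attributes as encoded fields.

```python
import contextvars
import logging

# Holds per-request key/value pairs, similar in spirit to an MDC.
_request_context = contextvars.ContextVar("request_context")


def bind_context(**fields):
    """Add fields to the current context without mutating earlier state."""
    current = _request_context.get({})
    _request_context.set({**current, **fields})


class ContextFilter(logging.Filter):
    """Copy context fields onto each record as attributes."""

    def filter(self, record):
        for key, value in _request_context.get({}).items():
            # Never overwrite built-in record attributes such as msg or args.
            if not hasattr(record, key):
                setattr(record, key, value)
        return True
```

The log message stays a constant string ("Login attempt"), while untrusted values such as a username travel as record attributes that the formatter encodes.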

Validate and restrict log content

Enforce length limits: Reject or truncate data exceeding reasonable log entry size
Validate data type: Ensure logged data matches expected type (numeric, email, etc.)
Use allowlists: For enumerated values, validate against known-good list before logging
Avoid logging sensitive data: Don't log passwords, tokens, credit cards, PII (or redact them)
Check what's necessary: Only log data needed for debugging/auditing, not everything
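
The checks above might be sketched as small helpers applied before a value reaches the logger. Everything here is an assumption for illustration: the allowlist contents, the key names treated as sensitive, the numeric ID format, and the 1000-character limit.

```python
import re

MAX_LOG_FIELD_LENGTH = 1000  # assumed limit
ALLOWED_ACTIONS = frozenset({"login", "logout", "password_reset"})  # hypothetical
SENSITIVE_KEYS = frozenset({"password", "token", "card_number"})  # hypothetical
_NUMERIC_ID = re.compile(r"[0-9]{1,18}")


def truncate(value, limit=MAX_LOG_FIELD_LENGTH):
    """Cut long values and record how much was dropped."""
    if len(value) <= limit:
        return value
    return f"{value[:limit]}...[truncated {len(value) - limit} chars]"


def loggable_action(action):
    """Log enumerated values only if they are on the allowlist."""
    return action if action in ALLOWED_ACTIONS else "invalid_action"


def loggable_user_id(raw):
    """Accept only values that are entirely digits."""
    return raw if _NUMERIC_ID.fullmatch(raw) else "invalid_id"


def redact(fields):
    """Replace values of sensitive keys before the dict is logged."""
    return {
        key: "[REDACTED]" if key.lower() in SENSITIVE_KEYS else value
        for key, value in fields.items()
    }
```

Using fullmatch rather than match with a trailing $ matters here, because $ also matches just before a final newline.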

Monitor and audit log files for injection attempts

Review logs regularly for suspicious or malformed entries
Alert on injection patterns: monitor for newlines, control characters, unusual log formats
Implement log integrity checks: use log signing or SIEM tools to detect tampering
Separate application and security logs: keep audit logs separate from debug logs
Restrict log file access: limit who can read/write log files
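
One way to alert on injection patterns, assuming logs are written as one JSON object per line, is to flag lines that fail to parse and fields whose decoded value contains control characters. The function name audit_line and the decision to ignore tabs are assumptions; a SIEM rule would normally replace a script like this.

```python
import json
import re

# Control characters and Unicode line separators in a decoded field value.
# Tab (\x09) is left out to reduce false positives.
_SUSPICIOUS = re.compile(r"[\x00-\x08\x0a-\x1f\x7f\u0085\u2028\u2029]")


def audit_line(line):
    """Return a list of findings for one raw log line (empty if it looks clean)."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return ["not valid JSON (possible forged or split entry)"]
    if not isinstance(event, dict):
        return ["not a JSON object"]
    findings = []
    for key, value in event.items():
        if isinstance(value, str) and _SUSPICIOUS.search(value):
            findings.append(f"control character in field {key!r}")
    return findings
```

A finding on a properly encoded field does not mean the injection succeeded; it means someone submitted a payload worth investigating.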

Test with log injection payloads

Test with ASCII newlines: try input like value\n[2024-01-01] ADMIN LOGIN SUCCESS (forged log entry)
Test with Unicode newlines: try value\u2028FAKE LOG ENTRY (bypasses filters that replace only \n and \r)
Test with ANSI codes: try \x1b[31mERROR\x1b[0m (color codes)
Test with long strings: submit extremely long input to test truncation
Test with control characters: try \r, \n, \t, \0 (null bytes), \u0085, \u2028, \u2029
Verify legitimate logging works: ensure normal log entries are still recorded correctly
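
The payload tests above could be automated along these lines. This is a sketch with illustrative names (PAYLOADS, MinimalJsonFormatter, emitted_lines, check_formatter); in a real suite you would pass in the formatter your application actually uses.

```python
import io
import json
import logging

PAYLOADS = [
    "value\n[2024-01-01] ADMIN LOGIN SUCCESS",
    "value\r\nFAKE",
    "value\u2028FAKE LOG ENTRY",
    "value\u0085FAKE",
    "value\u2029FAKE",
    "\x1b[31mERROR\x1b[0m",
    "null\x00byte",
    "A" * 5000,
    "ordinary input",
]


class MinimalJsonFormatter(logging.Formatter):
    """Stand-in for the formatter under test."""

    def format(self, record):
        return json.dumps(
            {"message": record.getMessage(), "user_input": record.user_input}
        )


def emitted_lines(formatter, payload):
    """Log one event and return the lines a log reader would see."""
    stream = io.StringIO()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(formatter)
    # Standalone logger so the check does not touch global logging config.
    logger = logging.Logger("injection_test")
    logger.addHandler(handler)
    logger.info("User input", extra={"user_input": payload})
    # str.splitlines() also splits on U+0085, U+2028, and U+2029.
    return stream.getvalue().splitlines()


def check_formatter(formatter):
    """Return the payloads that produced anything other than exactly one line."""
    return [p for p in PAYLOADS if len(emitted_lines(formatter, p)) != 1]
```

An empty result from check_formatter means every payload, including the ordinary one, was recorded as a single entry. This check covers line splitting only; truncation and ANSI handling need their own assertions.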

Common Vulnerable Patterns

Logging user input directly without sanitization

# Dangerous: with a plain-text formatter, control characters in user_input reach the log unencoded
logger.info('User input: %s', user_input)

Why this is vulnerable: User input containing newline characters (\n, \r) or ANSI escape sequences can inject forged log entries, allowing attackers to create fake audit trails, hide malicious activity, or manipulate log analysis tools. Control characters can split one log entry into multiple lines, making attacks appear as legitimate system events.
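
To make the forging concrete, the short demonstration below logs a crafted value through a plain-text formatter into an in-memory stream. The format string and the payload are invented for illustration.

```python
import io
import logging

stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s %(message)s"))
# Standalone logger so the demo does not touch global logging config.
logger = logging.Logger("vulnerable_demo")
logger.addHandler(handler)

# The attacker appends a newline followed by text shaped like a real entry.
user_input = "guest\nINFO User admin logged in successfully"
logger.info("User input: %s", user_input)

output = stream.getvalue()
# output now contains two lines; the second looks like a genuine INFO entry.
```

Note that the placeholder-style call does not help here: the value is substituted into the message verbatim, newline included.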

Allowing newlines or control characters in log entries

Secure Patterns

Use JSON/ECS logging (Recommended)

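# RECOMMENDED: JSON/ECS logging (preserves audit trail)
# Requires a JSON/ECS formatter on the handler; user_input then becomes its own field
# and control characters are encoded inside the field value
logger.info('User input', extra={'user_input': user_input})
# Attack attempt stays visible in the log output: "test\nFAKE LOG ENTRY" (backslash and n, not a line break)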

Why this works: Structured logging (JSON format) escapes control characters automatically and separates data from log structure, preventing injection attacks while maintaining complete data for analysis.

Encode control characters (Fallback)

# Alternative: Encode control characters when JSON/ECS output is not available
# encode_control_chars is an application-defined helper, not a library function
encoded_input = encode_control_chars(user_input)  # \n → \\n, \r → \\r, \x00 → \\u0000
logger.info('User input: %s', encoded_input)

Why this works: Encoding control characters (turning a newline into the two visible characters \n, a carriage return into \r) makes them appear as text rather than act as line breaks, preventing log injection while preserving the complete audit trail of what attackers attempted. This is superior to removal because security teams can see the full attack payload.
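
The fallback snippet calls encode_control_chars without defining it. One possible implementation is sketched below; escaping existing backslashes first is a design choice made here so that encoded output cannot be confused with text the user typed literally.

```python
import re

# Full ASCII control range, DEL, and the Unicode line separators.
_CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f\u0085\u2028\u2029]")
_SHORT_ESCAPES = {"\n": "\\n", "\r": "\\r", "\t": "\\t"}


def _escape(match: re.Match) -> str:
    """Map one control character to a visible escape sequence."""
    char = match.group()
    return _SHORT_ESCAPES.get(char, f"\\u{ord(char):04x}")


def encode_control_chars(value: str) -> str:
    """Replace control characters with visible escapes; nothing is removed."""
    # Double existing backslashes so a typed "\n" stays distinguishable
    # from an encoded newline.
    value = value.replace("\\", "\\\\")
    return _CONTROL_CHARS.sub(_escape, value)
```

Because the ESC byte (\x1b) falls inside the control range, ANSI color sequences are rendered inert by the same function.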

Remove control characters (Not Recommended - Loses Forensic Evidence)

# NOT RECOMMENDED: Remove control chars (loses critical forensic evidence)
clean_input = remove_control_chars(user_input)  # regex: [\x00-\x1F\x7F\u0085\u2028\u2029]
logger.info('User input: %s', clean_input)
# Attack becomes: "testFAKE LOG ENTRY" - you don't see the injection attempt
# Security teams cannot see what attackers attempted

Why this works, and what it costs: Removing ASCII control characters (\r, \n, \t, etc.) and Unicode newlines prevents log forging, but it also removes evidence of attack attempts from the logs. Use this only when JSON/ECS output and encoding are both truly infeasible (e.g., legacy systems with no control over log format). For production systems that require incident response or threat intelligence, this approach is not recommended because it blinds security teams to attack patterns.

Language-Specific Guidance

For detailed, language-specific examples and logging framework patterns:

C# - ILogger, Serilog, NLog with message templates
Java - Log4j, Logback, SLF4J with parameterized logging
JavaScript/NodeJS - winston, pino, bunyan with safe log formatting
Python - logging module, structlog for structured logging