CWE-93: CRLF Injection - Python

Overview

CRLF Injection in Python applications occurs when untrusted user input containing carriage return (\r, %0D) and line feed (\n, %0A) characters is used in HTTP headers or other protocol fields without proper validation or sanitization. Attackers can exploit this to perform HTTP response splitting, header injection, log injection, cache poisoning, and cross-site scripting (XSS) attacks.

Primary Defence: Reject or strip all carriage return and line feed characters (\r, \n) from user input before it reaches HTTP headers or log messages. Set headers through framework response objects (Flask's make_response(), Django's HttpResponse()), which in current releases reject header values containing newlines rather than emitting them; validate header values against strict allowlists or regex patterns; and use structured logging (JSON format) so that embedded newlines cannot forge log entries. Together these controls prevent CRLF injection, HTTP response splitting, and log injection.
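
As a minimal sketch of the "reject" half of this defence, the helper below refuses any value that contains a control character instead of silently stripping it. The name require_single_line and the 200-character default are assumptions for illustration, not part of any framework.

```python
import re

# Matches CR, LF, and every other ASCII control character.
_CONTROL_CHARS = re.compile(r'[\x00-\x1f\x7f]')


def require_single_line(value: str, max_length: int = 200) -> str:
    """Return value unchanged if it is safe to place in a header or log line.

    Raises ValueError instead of stripping, so the caller can answer with
    HTTP 400 and the attempt stays visible in monitoring.
    """
    if len(value) > max_length:
        raise ValueError("value too long")
    if _CONTROL_CHARS.search(value):
        raise ValueError("control character in value")
    return value
```

A route handler would call this on each user-supplied value before assigning it to a header and translate ValueError into a 400 response.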

Common Vulnerable Patterns

Flask Redirect with User Input

# VULNERABLE - Direct user input in redirect location
from flask import Flask, request, redirect

app = Flask(__name__)

@app.route('/redirect')
def vulnerable_redirect():
    url = request.args.get('url', '')

    # VULNERABLE - User input directly in redirect
    return redirect(url)

# Attack: /redirect?url=http://example.com%0d%0aSet-Cookie:%20admin=true
# Results in HTTP response splitting:
# HTTP/1.1 302 Found
# Location: http://example.com
# Set-Cookie: admin=true
# Attacker can inject arbitrary headers

Why this is vulnerable:

  • No validation or sanitization
  • CRLF characters allow header injection
  • Response splitting possible
  • Can set malicious cookies or headers
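
The mechanics of the split can be reproduced without a web framework. The sketch below builds a raw response by plain string formatting, which is what an unprotected server effectively does; naive_response is a hypothetical stand-in, not Flask's implementation.

```python
from urllib.parse import unquote


def naive_response(location: str) -> str:
    """Build a raw HTTP response by string formatting (illustration only)."""
    return (
        "HTTP/1.1 302 Found\r\n"
        f"Location: {location}\r\n"
        "\r\n"
    )


# The parameter as the attacker sends it, then as the application sees it
# after the framework has URL-decoded the query string.
raw_param = "http://example.com%0d%0aSet-Cookie:%20admin=true"
decoded = unquote(raw_param)

# Each element is one line of the response as a client would parse it.
header_lines = naive_response(decoded).split("\r\n")
```

After decoding, the single Location value spans two lines, so the client reads Set-Cookie as a separate header.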

Custom Response Headers

# VULNERABLE - User input in custom headers
from flask import Flask, request, Response

app = Flask(__name__)

@app.route('/api/data')
def vulnerable_headers():
    username = request.args.get('username', '')

    response = Response("User data")

    # VULNERABLE - User input in custom header
    response.headers['X-User-Name'] = username
    response.headers['X-Requested-By'] = request.headers.get('User-Agent', '')

    return response

# Attack: ?username=admin%0d%0aContent-Length:%200%0d%0a%0d%0a<script>alert('XSS')</script>
# Injects new headers and content

Why this is vulnerable:

  • Custom headers accept unsanitized input
  • Response splitting via CRLF
  • XSS via injected content
  • Cache poisoning

Django HttpResponse Headers

# VULNERABLE - Django with user-controlled headers
from django.http import HttpResponse
from django.views.decorators.http import require_GET

@require_GET
def vulnerable_view(request):
    callback = request.GET.get('callback', '')
    data = '{"status": "success"}'

    response = HttpResponse(data, content_type='application/json')

    # VULNERABLE - User input in JSONP callback header
    response['X-Callback'] = callback

    return response

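# Attack: ?callback=test%0d%0aSet-Cookie:%20sessionid=stolen
# Attempts to inject a Set-Cookie header. Django raises BadHeaderError for
# header values containing newlines, so the result here is an unhandled
# exception (HTTP 500) rather than a split response.

Why this is vulnerable:

  • User input reaches a response header without any validation
  • Django's BadHeaderError check is the only control; nothing in the view rejects or handles CR/LF
  • Attack input surfaces as an unhandled exception instead of a clean 400 response
  • The same pattern allows header injection (for example, session fixation via Set-Cookie) in any code path that writes headers without that check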

Email Header Injection

# VULNERABLE - Email headers with user input
import smtplib
from email.message import EmailMessage

def send_feedback(name, email, subject, message):
    msg = EmailMessage()

    # VULNERABLE - User input in email headers
    msg['From'] = email
    msg['To'] = 'admin@example.com'
    msg['Subject'] = subject
    msg.set_content(message)

    # VULNERABLE - Name in additional header
    msg['X-Sender-Name'] = name

    smtp = smtplib.SMTP('localhost')
    smtp.send_message(msg)
    smtp.quit()

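# Attack: email = "attacker@evil.com\nBcc: victim@example.com"
# Attack: subject = "Feedback\nTo: victim2@example.com"
# Attempts to inject additional recipients (the newline typically arrives
# URL-encoded as %0a in a web form and is decoded before this function runs).
# EmailMessage's default policy raises ValueError on such values; header
# strings assembled by hand would be split into extra headers.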

Why this is vulnerable:

  • Email headers vulnerable to CRLF
  • Can add Bcc, Cc recipients
  • Spam relay possible
  • Email spoofing

Log Injection

# VULNERABLE - Logging user input without sanitization
import logging

logger = logging.getLogger(__name__)

def process_login(username, password):
    # VULNERABLE - User input in log message
    logger.info(f"Login attempt for user: {username}")

    if authenticate(username, password):
        logger.info(f"Successful login: {username}")
        return True
    else:
        logger.warning(f"Failed login for: {username}")
        return False

# Attack: username = "admin\nINFO:root:Successful login: attacker\nINFO:root:Admin access granted"
# Creates fake log entries

Why this is vulnerable:

  • Log injection via newlines
  • Can forge log entries
  • Audit trail manipulation
  • Security monitoring bypass
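
The forgery is easy to reproduce with the standard library alone. This sketch logs to an in-memory stream so the output can be inspected; the logger name crlf_demo is an arbitrary choice for the example.

```python
import io
import logging

# Log to an in-memory stream so the output can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(logging.Formatter("%(levelname)s:%(name)s:%(message)s"))

demo_logger = logging.getLogger("crlf_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.addHandler(handler)
demo_logger.propagate = False  # keep the demo output out of the root logger

# Attacker-controlled value containing a newline and a forged entry.
username = "admin\nINFO:crlf_demo:Successful login: attacker"
demo_logger.info(f"Login attempt for user: {username}")

log_lines = stream.getvalue().splitlines()
```

A single call to info() produces two lines, and the second is indistinguishable from a genuine record written by the application.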

FastAPI Response Headers

# VULNERABLE - FastAPI with custom headers
from fastapi import FastAPI, Query, Response

app = FastAPI()

@app.get("/download")
async def download_file(filename: str = Query(...)):
    content = "File content"

    # VULNERABLE - User input in Content-Disposition header
    response = Response(content=content, media_type="application/octet-stream")
    response.headers["Content-Disposition"] = f"attachment; filename={filename}"

    return response

# Attack: ?filename=file.txt%0d%0aX-Injected:%20malicious
# Injects additional headers

Why this is vulnerable:

  • FastAPI doesn't sanitize header values
  • Content-Disposition vulnerable
  • File download manipulation
  • Header injection

CSV Export with User Data

# VULNERABLE - CSV export with unsanitized data
import csv
from io import StringIO
from flask import Flask, request, Response

app = Flask(__name__)

@app.route('/export')
def export_csv():
    users = [
        {'name': request.args.get('name', 'User'), 'email': 'user@example.com'},
    ]

    # VULNERABLE - User data in CSV without sanitization
    output = StringIO()
    writer = csv.DictWriter(output, fieldnames=['name', 'email'])
    writer.writeheader()
    writer.writerows(users)

    return Response(
        output.getvalue(),
        mimetype='text/csv',
        headers={'Content-Disposition': 'attachment; filename=users.csv'}
    )

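# Attack: ?name=admin%0aadmin2,admin2@evil.com
# The csv module quotes the embedded newline, but consumers that split the
# file on raw line breaks still see an injected row
# Attack: ?name=%3D1%2B1 (decodes to =1+1)
# The cell is evaluated as a formula when the file is opened in a spreadsheet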

Why this is vulnerable:

  • CSV injection via newlines
  • Can inject formulas
  • Data exfiltration
  • Code execution in Excel
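
A minimal sketch of one commonly recommended mitigation follows: prefix cells that begin with a formula trigger character with a single quote, and let the csv module handle quoting of embedded newlines. The helper names neutralize_cell and write_rows and the prefix list are assumptions for illustration.

```python
import csv
from io import StringIO

# Leading characters that spreadsheet applications may treat as a formula.
FORMULA_PREFIXES = ('=', '+', '-', '@', '\t', '\r')


def neutralize_cell(value: str) -> str:
    """Prefix formula-like cells with a quote so they are read as text."""
    if value.startswith(FORMULA_PREFIXES):
        return "'" + value
    return value


def write_rows(rows) -> str:
    """Serialize rows to CSV text with every cell neutralized."""
    output = StringIO()
    writer = csv.writer(output)
    for row in rows:
        writer.writerow([neutralize_cell(cell) for cell in row])
    return output.getvalue()


exported = write_rows([
    ["admin\nadmin2,admin2@evil.com", "user@example.com"],
    ["=1+1", "x@example.com"],
])

# A conforming CSV parser keeps the quoted newline inside a single field.
parsed = list(csv.reader(StringIO(exported, newline='')))
```

Parsed with a real CSV reader, the export still has two rows, and the formula cell now starts with a quote character.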

HTTP Proxy Headers

# VULNERABLE - Proxy forwarding with user headers
from flask import Flask, request
import requests

app = Flask(__name__)

@app.route('/proxy')
def proxy_request():
    target_url = request.args.get('url', '')

    # VULNERABLE - Forwarding user-controlled headers
    headers = {
        'X-Forwarded-For': request.headers.get('X-Forwarded-For', ''),
        'X-Real-IP': request.headers.get('X-Real-IP', ''),
        'X-Custom': request.headers.get('X-Custom', '')
    }

    response = requests.get(target_url, headers=headers)
    return response.text

# Attack: X-Forwarded-For: 1.2.3.4%0d%0aX-Admin:%20true
# Injects headers to backend

Why this is vulnerable:

  • Proxy headers not sanitized
  • Backend header injection
  • Authentication bypass
  • IP spoofing
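
One way to avoid forwarding attacker-controlled values is to build the outbound headers from validated data only. In this sketch the function name safe_forward_headers and the token format for X-Custom are assumptions; in Flask the peer address would come from request.remote_addr.

```python
import ipaddress
import re

# Hypothetical format for the X-Custom value: a short opaque token.
_TOKEN = re.compile(r'[A-Za-z0-9._-]{1,64}')


def safe_forward_headers(incoming: dict, peer_ip: str) -> dict:
    """Build outbound proxy headers from validated values only."""
    headers = {
        # Derive the client address from the socket peer, never from a
        # client-supplied header; ip_address() raises ValueError if invalid.
        'X-Forwarded-For': str(ipaddress.ip_address(peer_ip)),
    }
    custom = incoming.get('X-Custom', '')
    # fullmatch ensures the whole value matches, with no trailing newline.
    if _TOKEN.fullmatch(custom):
        headers['X-Custom'] = custom
    return headers
```

Values that fail validation are dropped rather than repaired, so nothing the client sent can add lines to the backend request.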

Secure Patterns

Flask Redirect with Validation

# SECURE - Flask redirect with CRLF removal and validation
from flask import Flask, request, redirect, abort
import re
from urllib.parse import urlparse

app = Flask(__name__)

def sanitize_url(url):
    """Remove CRLF characters and validate URL"""
    if not url:
        return None

    # Remove CRLF characters, literal and percent-encoded (either case)
    clean_url = re.sub(r'[\r\n]|%0[da]', '', url, flags=re.IGNORECASE)

    # Validate URL format
    try:
        parsed = urlparse(clean_url)
        # Only allow http/https schemes
        if parsed.scheme not in ['http', 'https', '']:
            return None
        # Optionally: allowlist domains
        # if parsed.netloc not in ['example.com', 'trusted.com']:
        #     return None
        return clean_url
    except ValueError:
        return None

@app.route('/redirect')
def secure_redirect():
    url = request.args.get('url', '')

    # SECURE - Sanitize and validate URL
    clean_url = sanitize_url(url)

    if not clean_url:
        abort(400, "Invalid redirect URL")

    return redirect(clean_url)

if __name__ == '__main__':
    app.run()

Why this works:

This pattern prevents CRLF injection through multiple defensive layers. The sanitize_url() function first removes literal CRLF characters (\r, \n) and their URL-encoded equivalents (%0d, %0a), preventing attackers from injecting header delimiters. By handling both literal and encoded forms (including uppercase variants), the sanitization catches different encoding variations that attackers might use to bypass simple filters. This comprehensive character removal ensures that even if the URL passes validation, it cannot contain the characters needed for response splitting.

The URL parsing and validation using urlparse() provides structural validation beyond just character filtering. By checking that the scheme is either empty (relative URL) or explicitly http/https, the code prevents javascript:, data:, or other exotic schemes that could be used for XSS attacks. The optional domain allowlist (commented out in the example) demonstrates how you can further restrict redirects to trusted destinations, preventing open redirect vulnerabilities where attackers trick users into visiting malicious sites.

Returning None for invalid URLs and checking this result in the route handler implements secure failure handling. The abort(400) call explicitly rejects malicious requests rather than attempting to redirect to a potentially dangerous location. This "fail securely" approach is critical for security functions - if validation detects an attack, the safest response is to reject the request entirely. The combination of character sanitization, structural validation, scheme allowlisting, and secure error handling creates defense-in-depth that protects against CRLF injection, open redirects, and XSS through the redirect parameter.
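
The optional domain allowlist can be sketched as a standalone check. The host names in ALLOWED_HOSTS and the function name is_safe_redirect are placeholders for illustration; this variant rejects suspicious input outright instead of cleaning it.

```python
from urllib.parse import urlparse

ALLOWED_HOSTS = {'example.com', 'trusted.example'}  # hypothetical allowlist


def is_safe_redirect(url: str) -> bool:
    """Accept same-site relative paths or http(s) URLs on an allowlisted host."""
    # Check for CR/LF before parsing; backslashes are rejected because
    # some browsers treat them like forward slashes.
    if not url or any(ch in url for ch in '\r\n\\'):
        return False
    try:
        parsed = urlparse(url)
    except ValueError:
        return False
    if parsed.scheme == '' and parsed.netloc == '':
        # Relative reference: require exactly one leading slash.
        return url.startswith('/') and not url.startswith('//')
    return parsed.scheme in ('http', 'https') and parsed.hostname in ALLOWED_HOSTS
```

Comparing parsed.hostname rather than the raw netloc means a URL such as https://example.com@evil.example/ is judged by its real host.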

Custom Headers with Sanitization

# SECURE - Custom headers with CRLF removal
from flask import Flask, request, Response
import re

app = Flask(__name__)

def sanitize_header_value(value):
    """Remove CRLF and other control characters"""
    if not value:
        return ''

    # Remove CRLF characters (including encoded versions)
    clean = value.replace('\r', '').replace('\n', '')
    clean = clean.replace('%0d', '').replace('%0a', '')
    clean = clean.replace('%0D', '').replace('%0A', '')

    # Remove other control characters
    clean = re.sub(r'[\x00-\x1f\x7f]', '', clean)

    # Limit length
    return clean[:200]

def validate_username(username):
    """Validate username format"""
    if not username:
        return False
    # fullmatch: with re.match, '$' would also accept a trailing newline
    return bool(re.fullmatch(r'[a-zA-Z0-9._-]{3,50}', username))

@app.route('/api/data')
def secure_headers():
    username = request.args.get('username', '')

    # SECURE - Validate input
    if not validate_username(username):
        return Response("Invalid username", status=400)

    response = Response("User data")

    # SECURE - Sanitize header value
    clean_username = sanitize_header_value(username)
    response.headers['X-User-Name'] = clean_username

    return response

Why this works:

This pattern demonstrates comprehensive input sanitization for HTTP headers through both validation and character filtering. The sanitize_header_value() function removes CRLF characters in multiple forms: literal \r and \n, lowercase URL-encoded %0d and %0a, and uppercase URL-encoded %0D and %0A. This multi-encoding approach prevents bypass attempts where attackers use different encoding schemes to evade simple filters. The regex [\x00-\x1f\x7f] removes all ASCII control characters, including not just CRLF but also null bytes, tabs, and escape sequences that could manipulate header parsing or terminal displays.

The username validation using regex (^[a-zA-Z0-9._-]{3,50}$) enforces a strict allowlist of allowed characters before the value is ever used. This validation-first approach means that only alphanumeric characters, dots, underscores, and hyphens are permitted in usernames. By rejecting any input that doesn't match this pattern, you eliminate entire classes of attacks - not just CRLF injection but also SQL injection attempts, XSS payloads, and other malicious input that might be disguised as a username. Returning HTTP 400 for invalid usernames provides clear feedback that the request was malformed.

The 200-character length limit in sanitize_header_value() provides additional defense against denial-of-service attacks where attackers send extremely long header values to consume server resources or trigger buffer-related vulnerabilities. By validating before sanitizing, the code ensures that only well-formed usernames even reach the sanitization function, while sanitization provides an extra layer of protection if validation is somehow bypassed or if other code paths use the sanitizer. This defense-in-depth approach - validation, sanitization, length limits - ensures that even if one control fails, others prevent the attack.
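
One caveat about removing percent-encoded sequences is worth seeing concretely: a single removal pass can reassemble the very sequence it deleted. The sketch below uses hypothetical helper names and contrasts single-pass stripping with a detection-only check. The encoded form matters only if some later component URL-decodes the value again.

```python
import re


def strip_once(value: str) -> str:
    """Single-pass removal of percent-encoded CR/LF."""
    return re.sub(r'%0[da]', '', value, flags=re.IGNORECASE)


def contains_crlf(value: str) -> bool:
    """Detection only: literal or percent-encoded CR/LF anywhere in the value."""
    return re.search(r'[\r\n]|%0[da]', value, flags=re.IGNORECASE) is not None


# '%0' + '%0d' + 'd' collapses to '%0d' once the inner sequence is removed.
nested = "admin%0%0dd%0%0aaSet-Cookie: x=1"
after_strip = strip_once(nested)
```

Because the stripped result still contains %0d%0a, rejecting any input for which contains_crlf() is true is more robust than trying to clean it.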

Django with Header Sanitization

# SECURE - Django with proper header handling
from django.http import HttpResponse, HttpResponseBadRequest
from django.views.decorators.http import require_GET
import re

def sanitize_header_value(value):
    """Remove CRLF and control characters"""
    if not value:
        return ''
    # Remove all newline variations
    clean = re.sub(r'[\r\n\x00-\x1f\x7f]', '', value)
    # Remove URL-encoded CRLF
    clean = re.sub(r'%0[dDaA]', '', clean)
    return clean[:200]

def validate_callback(callback):
    """Validate JSONP callback name"""
    # fullmatch: with re.match, '$' would also accept a trailing newline
    return bool(re.fullmatch(r'[a-zA-Z_][a-zA-Z0-9_]*', callback))

@require_GET
def secure_view(request):
    callback = request.GET.get('callback', '')

    # SECURE - Validate callback format
    if not validate_callback(callback):
        return HttpResponseBadRequest("Invalid callback name")

    data = '{"status": "success"}'
    response = HttpResponse(data, content_type='application/json')

    # SECURE - Use validated callback (no sanitization needed)
    response['X-Callback'] = callback

    return response

Why this works:

This Django pattern combines strict input validation with sanitization to prevent CRLF injection in custom headers. The validate_callback() function uses a regex that enforces JSONP callback naming conventions: must start with a letter or underscore, followed by any combination of letters, numbers, and underscores. This allowlist approach is extremely secure because it only accepts characters that are valid in JavaScript identifiers, completely eliminating the possibility of CRLF characters or other special characters being present in the callback parameter.

The sanitize_header_value() function provides defense-in-depth by removing CRLF characters even though the validation should prevent them from occurring. The regex [\r\n\x00-\x1f\x7f] removes literal newlines and all control characters, while the second regex %0[dDaA] removes URL-encoded CRLF sequences in both lowercase and uppercase. This double-layer protection is valuable because it protects against scenarios where the sanitizer might be used elsewhere in the codebase, or if validation is accidentally bypassed due to code changes.

Returning HttpResponseBadRequest for invalid callbacks implements proper error handling for security violations. Rather than attempting to sanitize invalid input or using a default value, the code explicitly rejects malicious requests with HTTP 400. This approach makes attack attempts visible in server logs and prevents attackers from discovering what sanitization is applied. Because the callback passed validation using the strict regex, it doesn't need sanitization before being set as a header value - the comment "no sanitization needed" reflects this. However, having the sanitization function available demonstrates good security architecture for other headers that might not have such strict validation.
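
One subtlety with anchored allowlist patterns in Python: `$` matches at the end of the string and also just before a trailing newline, so re.match() with `^...$` accepts a value ending in "\n". The sketch below compares that with re.fullmatch(), which has no such exception; the function names are illustrative.

```python
import re

PATTERN = r'[a-zA-Z_][a-zA-Z0-9_]*'


def validate_with_dollar(callback: str) -> bool:
    """'$' also matches just before a trailing newline."""
    return bool(re.match('^' + PATTERN + '$', callback))


def validate_with_fullmatch(callback: str) -> bool:
    """fullmatch requires the entire string to match the pattern."""
    return bool(re.fullmatch(PATTERN, callback))
```

For validators that gate header values, fullmatch() (or the \Z anchor) is the safer choice because a lone trailing newline is rejected.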

Email with Header Validation

# SECURE - Email with header sanitization
import smtplib
from email.message import EmailMessage
from email.utils import parseaddr
import re

def sanitize_email_header(value):
    """Remove CRLF from email headers"""
    if not value:
        return ''
    # Remove CRLF and control characters
    return re.sub(r'[\r\n\x00-\x1f\x7f]', '', value)

def validate_email(email):
    """Validate email format"""
    if not email or len(email) > 254:
        return False
    name, addr = parseaddr(email)
    if not addr:
        return False
    # Additional validation (fullmatch, so a trailing newline cannot slip past '$')
    return bool(re.fullmatch(r'[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}', addr))

def send_feedback_secure(name, email, subject, message):
    """Send email with sanitized headers"""
    # SECURE - Validate inputs
    if not validate_email(email):
        raise ValueError("Invalid email address")

    if len(subject) > 200 or len(message) > 5000:
        raise ValueError("Content too long")

    # SECURE - Sanitize all header values
    clean_email = sanitize_email_header(email)
    clean_subject = sanitize_email_header(subject)
    clean_name = sanitize_email_header(name)

    msg = EmailMessage()
    msg['From'] = clean_email
    msg['To'] = 'admin@example.com'
    msg['Subject'] = clean_subject
    msg.set_content(message)
    msg['X-Sender-Name'] = clean_name

    smtp = smtplib.SMTP('localhost')
    smtp.send_message(msg)
    smtp.quit()

Why this works:

This email pattern prevents header injection attacks through comprehensive validation and sanitization of all email header fields. The validate_email() function performs structural validation using parseaddr() from Python's email utilities, which properly parses email addresses according to RFC standards. The length check (254 characters maximum per RFC 5321) and regex validation ensure that only properly formatted email addresses are accepted. This prevents attackers from injecting additional recipients through Bcc headers or manipulating the From field with CRLF sequences.

The sanitize_email_header() function removes all control characters including CRLF from header values using the regex [\r\n\x00-\x1f\x7f]. Email headers are particularly vulnerable to injection because SMTP protocol uses CRLF as a delimiter between headers and between headers and message body. An attacker who injects \r\n into a subject line could add Bcc: attacker@evil.com, turning your email system into a spam relay. By removing these characters from all header fields (From, Subject, custom headers like X-Sender-Name), the code prevents this entire class of attacks.

The length validation (200 characters for subject, 5000 for message) prevents denial-of-service attacks and limits the scope of any potential injection. The pattern validates before sanitizing, ensuring that only valid emails are processed, then sanitizes as defense-in-depth. Using Python's EmailMessage class is safer than manually constructing SMTP commands, as the class handles header encoding and formatting according to email RFCs. With its default policy, EmailMessage also raises ValueError when a header value contains an embedded carriage return or line feed, so the explicit sanitization serves two purposes: it turns would-be exceptions into clean, predictable header values, and it keeps the code safe if it is later ported to an API that performs no such check. This pattern demonstrates that even when using high-level libraries, you should still validate and sanitize user input before placing it in protocol-sensitive contexts like email headers.
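
How EmailMessage reacts to an embedded newline can be checked directly. This sketch uses only the standard library; build_message and try_build are hypothetical helper names, and the behaviour shown is that of the default policy used by EmailMessage.

```python
from email.message import EmailMessage


def build_message(sender: str, subject: str) -> EmailMessage:
    """Assemble a message; header assignment validates each value."""
    msg = EmailMessage()
    msg['From'] = sender
    msg['To'] = 'admin@example.com'
    msg['Subject'] = subject
    msg.set_content('Feedback body')
    return msg


def try_build(sender: str, subject: str):
    """Return the message, or None if a header value was rejected."""
    try:
        return build_message(sender, subject)
    except ValueError:
        # Raised by the default policy for values with embedded CR/LF.
        return None
```

Sanitizing first, as the secure pattern does, means legitimate submissions with stray control characters still go through instead of failing with an exception.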

Secure Logging

# SECURE - Logging with sanitization
import logging
import re

logger = logging.getLogger(__name__)

def sanitize_log_input(value):
    """Remove CRLF and control characters for logging"""
    if not value:
        return ''
    # Remove newlines and control characters
    clean = re.sub(r'[\r\n\x00-\x1f\x7f]', ' ', value)
    # Limit length
    return clean[:200]

def validate_username(username):
    """Validate username format"""
    # fullmatch: with re.match, '$' would also accept a trailing newline
    return bool(re.fullmatch(r'[a-zA-Z0-9._-]{3,50}', username))

def process_login_secure(username, password):
    """Process login with secure logging"""
    # SECURE - Validate username
    if not validate_username(username):
        logger.warning("Invalid username format in login attempt")
        return False

    # SECURE - Sanitize for logging
    clean_username = sanitize_log_input(username)
    logger.info(f"Login attempt for user: {clean_username}")

    if authenticate(username, password):
        logger.info(f"Successful login: {clean_username}")
        return True
    else:
        logger.warning(f"Failed login for: {clean_username}")
        return False

def authenticate(username, password):
    # Authentication logic
    return True

Why this works:

This logging pattern prevents log injection attacks by sanitizing user input before it's written to log files. The sanitize_log_input() function uses regex to replace all newlines and control characters with spaces, preventing attackers from creating fake log entries. Without this protection, an attacker could provide a username like "admin\nINFO: User hacker performed GRANT ADMIN", which would create a completely fabricated log entry that appears legitimate in log analysis tools, SIEMs, and audit reviews. This could enable attackers to hide their activities or frame other users.

The regex [\r\n\x00-\x1f\x7f] matches not just CRLF but all ASCII control characters. This comprehensive approach prevents attacks using other control characters like tabs or escape sequences that might be used to manipulate log file display, inject terminal escape codes, or interfere with log parsing. Replacing these characters with spaces rather than removing them entirely preserves the readability of log entries while neutralizing the attack - users can still see what input was provided, but it can't break the log structure.

The username validation using regex (^[a-zA-Z0-9._-]{3,50}$) provides a first line of defense by rejecting usernames that don't match expected format. This validation-first approach means most attacks are caught before reaching the sanitization function. The 200-character length limit in sanitize_log_input() prevents excessively long inputs that could fill disk space or trigger buffer issues. By combining validation (rejecting invalid usernames entirely), sanitization (cleaning what passes validation), and length limits, this pattern creates multiple layers of protection. Using f-strings with sanitized values maintains clean, readable code while ensuring all logged user input is safe.
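
Structured logging is an alternative to character replacement. In this minimal sketch the JsonFormatter class and the username field are assumptions rather than a standard library feature. Each record is written as one JSON object, so a newline in user input is stored as an escaped \n inside a string and never as a line break.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Hypothetical formatter: one JSON object per log record."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "level": record.levelname,
            "logger": record.name,
            "message": record.getMessage(),
            "username": getattr(record, "username", None),
        }
        # json.dumps escapes control characters, keeping the record on one line.
        return json.dumps(payload)


stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(JsonFormatter())

json_logger = logging.getLogger("crlf_json_demo")
json_logger.setLevel(logging.INFO)
json_logger.addHandler(handler)
json_logger.propagate = False

# User input travels as a separate field, not inside the message text.
json_logger.info(
    "Login attempt",
    extra={"username": "admin\nINFO:root:Successful login: attacker"},
)

lines = stream.getvalue().splitlines()
record = json.loads(lines[0])
```

The original input is preserved exactly for investigators, yet the log still contains a single line for the single event.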

FastAPI with Pydantic Validation

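# SECURE - FastAPI with Pydantic validation
# Assumes FastAPI 0.115+ (query parameter models). @validator is the
# Pydantic v1 decorator; Pydantic v2 deprecates it in favour of field_validator.
from typing import Annotated

from fastapi import FastAPI, Query, Response
from pydantic import BaseModel, validator
import re

app = FastAPI()

class DownloadRequest(BaseModel):
    filename: str

    @validator('filename')
    def validate_filename(cls, v):
        # Remove CRLF
        clean = re.sub(r'[\r\n\x00-\x1f\x7f]', '', v)
        # Validate format
        if not re.match(r'^[a-zA-Z0-9._-]+\.[a-zA-Z0-9]+$', clean):
            raise ValueError('Invalid filename format')
        if len(clean) > 100:
            raise ValueError('Filename too long')
        return clean

# Annotated[..., Query()] binds the model to the query string; a bare
# `req: DownloadRequest` would be read from the request body instead
@app.get("/download")
async def download_file(req: Annotated[DownloadRequest, Query()]):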
    content = "File content"

    # SECURE - Use validated filename
    response = Response(content=content, media_type="application/octet-stream")
    response.headers["Content-Disposition"] = f"attachment; filename={req.filename}"

    return response

Why this works:

This FastAPI pattern leverages Pydantic's validation framework to prevent CRLF injection at the data model level, ensuring that invalid input is rejected before it reaches any application logic. The @validator decorator on the filename field executes automatically whenever a DownloadRequest is created, providing centralized validation that can't be accidentally bypassed. The regex re.sub(r'[\r\n\x00-\x1f\x7f]', '', v) removes all control characters including CRLF, ensuring the filename is clean before further validation.

The filename format validation using ^[a-zA-Z0-9._-]+\.[a-zA-Z0-9]+$ enforces a strict allowlist: alphanumeric characters, dots, hyphens, and underscores, ending in a dot followed by an alphanumeric extension. This pattern prevents not only CRLF injection but also path traversal attacks (by disallowing / and \) and other attacks that depend on characters outside the allowlist. Note that it still accepts leading dots and multiple dots (for example .hidden.txt or archive.tar.gz); tighten the first part of the pattern if those must be excluded. If the filename doesn't match this pattern after CRLF removal, the validator raises a ValueError, which Pydantic reports as a validation error and FastAPI converts to an HTTP 422 Unprocessable Entity response with detailed error information.

The 100-character length limit prevents denial-of-service through excessively long filenames that could cause filesystem issues or consume excessive memory. Because the validation happens in the Pydantic model, it's automatically applied to all code paths that use DownloadRequest - you can't accidentally forget to validate the filename in some route handler. Once the request object is created, you can trust that req.filename has been validated and sanitized, allowing you to use it confidently in the Content-Disposition header. This declarative validation approach is superior to imperative validation scattered throughout route handlers because it's centralized, automatically applied, type-safe, and generates consistent error responses.
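
To see exactly which names the filename pattern admits, it helps to exercise it outside the framework. In this sketch is_allowed_filename mirrors the validator's regex and length check (without the control-character stripping step), while STRICT_PATTERN is a hypothetical tighter variant that also excludes leading dots and multi-dot names.

```python
import re

# Same character rules as the validator above.
FILENAME_PATTERN = re.compile(r'[a-zA-Z0-9._-]+\.[a-zA-Z0-9]+')

# Hypothetical stricter variant: alphanumeric first character, one dot only.
STRICT_PATTERN = re.compile(r'[a-zA-Z0-9][a-zA-Z0-9_-]*\.[a-zA-Z0-9]+')


def is_allowed_filename(name: str, pattern=FILENAME_PATTERN, max_length: int = 100) -> bool:
    """True if the whole name matches the pattern and fits the length limit."""
    return len(name) <= max_length and pattern.fullmatch(name) is not None
```

Both patterns reject slashes, spaces, colons, and control characters, so neither a path traversal string nor an injected header line can pass.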

Verification

After implementing the recommended secure patterns, verify the fix through multiple approaches:

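  • Manual testing: Submit CRLF payloads (for example %0d%0aSet-Cookie:%20admin=true) in every parameter that reaches a header, email field, or log message, and confirm they are rejected or neutralized with no extra headers or log lines produced
  • Code review: Confirm every place where user input reaches a header, email field, or log message applies the secure pattern (newline rejection, allowlist validation, structured logging), with no unvalidated string concatenation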
  • Static analysis: Use security scanners to verify no new vulnerabilities exist and the original finding is resolved
  • Regression testing: Ensure legitimate user inputs and application workflows continue to function correctly
  • Edge case validation: Test with special characters, boundary conditions, and unusual inputs to verify proper handling
  • Framework verification: If using a framework or library, confirm the recommended APIs are used correctly according to documentation
  • Authentication/session testing: Verify security controls remain effective and cannot be bypassed (if applicable to the vulnerability type)
  • Rescan: Run the security scanner again to confirm the finding is resolved and no new issues were introduced
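
The manual and regression checks above can be partly automated. The sketch below is one possible shape for such a test, using unittest; sanitize_header_value is copied from the Django example as a stand-in for whichever sanitizer is under test, and the payload list is illustrative rather than exhaustive.

```python
import re
import unittest


def sanitize_header_value(value):
    """Stand-in for the sanitizer under test (logic from the Django example)."""
    if not value:
        return ''
    clean = re.sub(r'[\r\n\x00-\x1f\x7f]', '', value)
    clean = re.sub(r'%0[dDaA]', '', clean)
    return clean[:200]


CRLF_PAYLOADS = [
    "test\r\nSet-Cookie: sessionid=stolen",
    "test\nX-Injected: 1",
    "test\rX-Injected: 1",
    "test%0d%0aX-Injected: 1",
    "test%0D%0AX-Injected: 1",
    "test\x00\x1b[31m",
]


class CrlfRegressionTests(unittest.TestCase):
    def test_payloads_are_neutralized(self):
        # No control character or encoded CR/LF may survive sanitization.
        for payload in CRLF_PAYLOADS:
            with self.subTest(payload=payload):
                cleaned = sanitize_header_value(payload)
                self.assertIsNone(
                    re.search(r'[\x00-\x1f\x7f]|%0[da]', cleaned, re.IGNORECASE)
                )

    def test_legitimate_value_is_unchanged(self):
        self.assertEqual(sanitize_header_value("alice.smith-01"), "alice.smith-01")
```

Running the module with python -m unittest executes both tests; adding each newly discovered bypass to CRLF_PAYLOADS keeps the fix from regressing.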

Additional Resources