CWE-434: Unrestricted File Upload

Overview

Unrestricted file upload vulnerabilities occur when applications accept file uploads without properly validating file type, content, size, or destination. Attackers exploit this to upload malicious files including web shells (PHP, JSP, ASPX), executable malware, HTML files with XSS payloads, SVG files with embedded JavaScript, or oversized files causing denial of service. The vulnerability is particularly dangerous when uploaded files are stored within the webroot and can be directly accessed and executed by the web server.

OWASP Classification

A06:2025 - Insecure Design

Risk

Critical: Can lead to remote code execution, server compromise, malware distribution, stored XSS, denial of service, and defacement. Unrestricted uploads combined with path traversal can overwrite application or system files when the upload destination is attacker-controlled. Even "safe" file types like images can contain malicious metadata, parser exploit payloads, or content used for social engineering attacks.

Remediation Steps

Core principle: Use an allowlist of business-required file types, validate both extension and content, store uploads outside the webroot, rename files, and never execute uploaded files. Serve original uploads only as untrusted downloads with authorization and safe response headers.

Validate Extension and Content

# VULNERABLE - trusts file extension
if filename.endswith('.jpg'):
    save_file(filename, data)

# SECURE - validate extension and actual file content
import magic
from pathlib import Path

ALLOWED_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif'}
ALLOWED_TYPES = {
    'image/jpeg': {'.jpg', '.jpeg'},
    'image/png': {'.png'},
    'image/gif': {'.gif'},
}

extension = Path(filename).suffix.lower()
if extension not in ALLOWED_EXTENSIONS:
    return error('Invalid file extension')

mime = magic.from_buffer(file_data, mime=True)
if mime not in ALLOWED_TYPES:
    return error('Invalid file type')
if extension not in ALLOWED_TYPES[mime]:
    return error('File extension does not match content')

File signatures and MIME detection are useful checks, but they are not a complete security boundary. Keep the allowlist narrow, reject ambiguous or polyglot files where possible, and parse or re-encode files with maintained libraries before trusting them as images, documents, or archives.
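To show what a signature check does under the hood, here is a minimal standard-library sketch that compares the leading bytes of an upload against known signatures. It is an illustrative stand-in for `magic.from_buffer`, not a replacement for it; `SIGNATURES`, `sniff_image_type`, and `check_upload` are hypothetical names introduced for this example.

```python
from pathlib import Path

# Leading-byte signatures for the three allowlisted formats
SIGNATURES = {
    'image/jpeg': (b'\xff\xd8\xff',),
    'image/png': (b'\x89PNG\r\n\x1a\n',),
    'image/gif': (b'GIF87a', b'GIF89a'),
}
ALLOWED_TYPES = {
    'image/jpeg': {'.jpg', '.jpeg'},
    'image/png': {'.png'},
    'image/gif': {'.gif'},
}


def sniff_image_type(data: bytes):
    """Return the MIME type whose signature prefixes data, or None."""
    for mime, prefixes in SIGNATURES.items():
        if data.startswith(prefixes):
            return mime
    return None


def check_upload(filename: str, data: bytes) -> str:
    """Return the validated extension, or raise ValueError (hypothetical helper)."""
    extension = Path(filename).suffix.lower()
    mime = sniff_image_type(data)
    if mime is None:
        raise ValueError('Invalid file type')
    if extension not in ALLOWED_TYPES[mime]:
        raise ValueError('File extension does not match content')
    return extension
```

A file that starts with `GIF89a` and continues with script code still passes this check, which is why the signature test is paired with re-encoding and safe storage rather than used alone.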

Store Files Outside Webroot

# VULNERABLE - files in webroot can be executed
UPLOAD_DIR = '/var/www/html/uploads'  # Dangerous!

# SECURE - files outside webroot
UPLOAD_DIR = '/var/app_data/uploads'  # Not web-accessible

# Serve through application with access controls
@app.route('/file/<file_id>')
def serve_file(file_id):
    # Authenticate user, check permissions
    if not user.can_access(file_id):
        abort(403)

    filepath = get_secure_path(file_id)
    return send_file(
        filepath,
        as_attachment=True,
        download_name=get_original_display_name(file_id),
        mimetype='application/octet-stream'
    )
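The route above relies on `get_secure_path`, which the document does not define. One possible shape for it, assuming file IDs are the `<uuid><extension>` storage names generated at upload time, is sketched below. The validation rules and the `base_dir` parameter are assumptions made for illustration.

```python
import uuid
from pathlib import Path

UPLOAD_DIR = Path('/var/app_data/uploads')
ALLOWED_EXTENSIONS = {'.jpg', '.jpeg', '.png', '.gif'}


def get_secure_path(file_id: str, base_dir: Path = UPLOAD_DIR) -> Path:
    """Map a stored file ID to a path inside base_dir, or raise ValueError."""
    stem, dot, extension = file_id.rpartition('.')
    if dot + extension not in ALLOWED_EXTENSIONS:
        raise ValueError('Unknown file ID')
    # Accept only the canonical lowercase UUID form produced at upload time
    try:
        canonical = str(uuid.UUID(stem))
    except ValueError:
        raise ValueError('Unknown file ID') from None
    if stem != canonical:
        raise ValueError('Unknown file ID')
    candidate = (base_dir / file_id).resolve()
    # Second check: the resolved path must still sit under base_dir
    if base_dir.resolve() not in candidate.parents:
        raise ValueError('Unknown file ID')
    return candidate
```

Because the ID must parse as a UUID plus an allowlisted extension, inputs such as `../../etc/passwd` are rejected before any filesystem path is built.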

Rename Uploaded Files

# VULNERABLE - uses original filename
save_path = os.path.join(UPLOAD_DIR, uploaded_filename)

# SECURE - generate random storage name and keep original only for display
import os
import uuid
extension = get_validated_extension(uploaded_filename)
display_name = sanitize_display_name(uploaded_filename)
new_filename = f"{uuid.uuid4()}{extension}"
save_path = os.path.join(UPLOAD_DIR, new_filename)

# Store mapping in database
db.store_file(user_id, file_id=new_filename, original_name=display_name)

Do not use the original filename in the storage path. Normalize the display name, remove path separators and control characters, and enforce a maximum length before showing it back to users.
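The display-name rules above could be implemented roughly as follows. This is a sketch using only the standard library; the 100-character limit and the `'file'` fallback are arbitrary assumptions, and `sanitize_display_name` is the helper name the earlier snippet already uses without defining.

```python
import unicodedata

MAX_DISPLAY_NAME_LENGTH = 100  # assumed limit, adjust to the application


def sanitize_display_name(name: str, max_length: int = MAX_DISPLAY_NAME_LENGTH) -> str:
    """Return a name that is safe to show to users (never used as a storage path)."""
    # Fold compatibility forms so look-alike separators become plain ones
    name = unicodedata.normalize('NFKC', name)
    # Keep only the final path component, for both / and \ separators
    name = name.replace('\\', '/').rsplit('/', 1)[-1]
    # Drop control, format, and other non-printing characters (Unicode categories C*)
    name = ''.join(ch for ch in name if unicodedata.category(ch)[0] != 'C')
    # Trim leading and trailing spaces and dots, then enforce the length limit
    name = name.strip(' .')
    return name[:max_length] or 'file'
```

For example, `sanitize_display_name('..\\..\\evil\x00.jpg')` yields `'evil.jpg'`. The result is for display only, so it still needs HTML escaping wherever it is rendered.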

Implement File Size Limits

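# Limit upload size before reading the whole request body into memory
MAX_FILE_SIZE = 5 * 1024 * 1024  # 5 MB
app.config['MAX_CONTENT_LENGTH'] = MAX_FILE_SIZE  # Flask answers 413 above this

# In the handler, read at most one byte past the limit
# (uploaded_file: the upload object, e.g. request.files['file'] in Flask)
file_data = uploaded_file.read(MAX_FILE_SIZE + 1)
if len(file_data) > MAX_FILE_SIZE:
    return error('File too large')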

Apply limits at the reverse proxy, application server, and application layer. For archives and compressed formats, also limit decompressed size, file count, nesting depth, and extraction paths to avoid archive bombs and path traversal.
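For ZIP uploads, the archive limits above can be checked before anything is extracted. The sketch below uses the standard `zipfile` module. The limit values and the `check_zip` name are assumptions, and rejecting every nested `.zip` is just one simple way to bound nesting depth.

```python
import io
import zipfile
from pathlib import PurePosixPath

MAX_MEMBERS = 100                          # assumed file-count limit
MAX_TOTAL_UNCOMPRESSED = 50 * 1024 * 1024  # assumed 50 MB expansion limit


def check_zip(data: bytes, max_members: int = MAX_MEMBERS,
              max_total: int = MAX_TOTAL_UNCOMPRESSED) -> list:
    """Return member names if the archive passes every limit, else raise ValueError."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        members = archive.infolist()
        if len(members) > max_members:
            raise ValueError('Too many files in archive')
        total = 0
        for info in members:
            path = PurePosixPath(info.filename.replace('\\', '/'))
            # Reject absolute paths and any parent-directory component
            if path.is_absolute() or '..' in path.parts:
                raise ValueError('Unsafe path in archive')
            if path.suffix.lower() == '.zip':
                raise ValueError('Nested archives are not accepted')
            # file_size is the size declared in the archive headers
            total += info.file_size
            if total > max_total:
                raise ValueError('Archive expands beyond the size limit')
        return [info.filename for info in members]
```

Declared sizes come from the archive itself and can be forged, so extraction code should also count bytes as it streams each member and stop once the limit is exceeded. Data that is not a valid ZIP raises `zipfile.BadZipFile`, which the caller should treat as a rejection.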

Sanitize Image Uploads

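from PIL import Image
import io

# Strip metadata and re-encode
# Pillow warns above this pixel count and raises DecompressionBombError above twice it
Image.MAX_IMAGE_PIXELS = 20_000_000

# verify() checks integrity but leaves the object unusable, so reopen afterwards
image = Image.open(io.BytesIO(file_data))
image.verify()
image = Image.open(io.BytesIO(file_data))
image.thumbnail((4096, 4096))  # Cap dimensions, preserving aspect ratio

# PNG cannot store every mode (for example CMYK JPEGs), so convert those first
if image.mode not in ('1', 'L', 'LA', 'P', 'RGB', 'RGBA'):
    image = image.convert('RGB')
clean_image = io.BytesIO()
image.save(clean_image, format='PNG')  # Re-encode as PNG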

Re-encoding reduces metadata and active-content risk for raster images, but it is only as safe as the image library performing it, so keep that library patched. Reject formats that can carry scriptable content, such as SVG, unless the application has a dedicated sanitizer and a safe serving policy.

Configure Web Server

Storing uploads outside the webroot is the preferred control. If uploaded files must be reachable through the web server, configure the upload location so that it cannot execute scripts and serves untrusted content as downloads.

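# Nginx - defense in depth for a web-served upload directory
# ^~ keeps regex locations elsewhere in the server block (for example a
# PHP handler matching \.php$) from taking precedence over this prefix
location ^~ /uploads/ {
    # Serve files as downloads only
    add_header Content-Disposition "attachment";
    add_header X-Content-Type-Options "nosniff";

    # Refuse requests for script extensions, case-insensitively
    location ~* \.(php|phtml|phar|jsp|asp|aspx|cgi)$ {
        deny all;
    }
}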
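The same download-only policy can also be applied from application code when files are served through a route rather than directly by the web server. The sketch below builds framework-neutral response headers. `download_headers` is a hypothetical helper, and the Content-Security-Policy value is an extra hardening assumption that the Nginx example does not include.

```python
from urllib.parse import quote


def download_headers(display_name: str) -> dict:
    """Build response headers that force an untrusted file to download."""
    # ASCII fallback: replace quotes, backslashes, control and non-ASCII characters
    fallback = ''.join(
        ch if ch.isascii() and ch.isprintable() and ch not in '"\\' else '_'
        for ch in display_name
    ) or 'download'
    return {
        'Content-Type': 'application/octet-stream',
        # filename* carries the percent-encoded UTF-8 name for modern clients
        'Content-Disposition': (
            f'attachment; filename="{fallback}"; '
            f"filename*=UTF-8''{quote(display_name, safe='')}"
        ),
        'X-Content-Type-Options': 'nosniff',
        'Content-Security-Policy': "default-src 'none'; sandbox",
    }
```

Replacing or percent-encoding carriage returns and line feeds keeps a crafted display name from injecting extra response headers.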

Scan Uploads for Malware

Integrate anti-virus scanning:

ClamAV for open-source scanning
Cloud services: VirusTotal API, Amazon GuardDuty Malware Protection for S3, Microsoft Defender for Storage (formerly part of Azure Defender)
Quarantine files until scan completes

Malware scanning is defense-in-depth, not a substitute for type allowlisting and safe storage. Do not submit sensitive user files to third-party scanning services unless the data-sharing and retention implications are acceptable for the application.
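A quarantine step might look like the sketch below: files land in a quarantine directory and move to their final location only after a clean result. It assumes the ClamAV `clamscan` command-line tool is installed. The function names, the 120-second timeout, and the directory layout are illustrative assumptions.

```python
import shutil
import subprocess
from pathlib import Path


def clamscan_is_clean(path: Path) -> bool:
    """Run clamscan: exit code 0 means no match, 1 means a signature matched."""
    result = subprocess.run(
        ['clamscan', '--no-summary', str(path)],
        capture_output=True,
        timeout=120,
    )
    if result.returncode == 0:
        return True
    if result.returncode == 1:
        return False
    # Any other exit code signals a scanner error: fail closed
    raise RuntimeError('Scan failed')


def release_from_quarantine(path: Path, clean_dir: Path,
                            is_clean=clamscan_is_clean) -> Path:
    """Move a quarantined file into clean_dir only after a clean scan."""
    if not is_clean(path):
        path.unlink()  # discard the rejected upload
        raise ValueError('Upload rejected by malware scan')
    clean_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.move(str(path), str(clean_dir / path.name)))
```

If the scanner errors out or times out, the exception propagates and the file stays in quarantine, so an unavailable scanner never results in an unscanned file being released.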

Protect Upload Endpoints

Treat upload routes as state-changing operations:

Require authentication and authorization before accepting the file
Protect browser-based upload forms from CSRF
Rate-limit upload attempts and enforce per-user storage quotas
Log upload decisions with safe metadata only, not raw file contents
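Rate limiting and quotas from the list above can be sketched as a small in-memory guard. This is an illustration only: `UploadGuard` and its default limits are assumptions, and a real deployment would usually keep these counters in a shared store so that they survive restarts and work across multiple workers.

```python
import time
from collections import defaultdict, deque


class UploadGuard:
    """In-memory sliding-window rate limit plus a per-user storage quota."""

    def __init__(self, max_uploads=10, window_seconds=60,
                 quota_bytes=100 * 1024 * 1024):
        self.max_uploads = max_uploads
        self.window_seconds = window_seconds
        self.quota_bytes = quota_bytes
        self._attempts = defaultdict(deque)  # user_id -> recent attempt times
        self._stored = defaultdict(int)      # user_id -> bytes accepted so far

    def allow(self, user_id, size, now=None):
        """Record an attempt; return True only if both limits are respected."""
        now = time.monotonic() if now is None else now
        attempts = self._attempts[user_id]
        # Forget attempts that have left the window
        while attempts and now - attempts[0] >= self.window_seconds:
            attempts.popleft()
        attempts.append(now)  # rejected attempts count toward the limit too
        if len(attempts) > self.max_uploads:
            return False
        if self._stored[user_id] + size > self.quota_bytes:
            return False
        self._stored[user_id] += size
        return True
```

A handler would call `guard.allow(user_id, len(file_data))` after authentication and before storing the file, and respond with 429 or a quota error when it returns False.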