CWE-183: Permissive List of Allowed Inputs - JavaScript/TypeScript

Overview

JavaScript and TypeScript-specific guidance for implementing strict input validation using regular expressions, URL APIs, and Set-based allowlists.

Primary Defence: Use fully anchored regex patterns (^ and $), parse with native APIs such as the URL constructor for URLs and path.resolve() for file paths, prefer Set-based allowlists over regex where the set of valid values is known, and enforce strict length limits. Together these ensure the entire input is validated, not just a fragment of it, which closes off injection through partially matched input.

Common Vulnerable Patterns

Unanchored Regular Expressions

// VULNERABLE - no anchors, matches substring
function validateUsername(username) {
    // Attacker: "admin'; DROP TABLE users--"
    const pattern = /[a-zA-Z0-9]+/;
    return pattern.test(username);  // Matches substring!
}

// VULNERABLE - permissive URL validation
function validateURL(url) {
    // Attacker: "javascript:alert(1)"
    return /.*:\/\/.*/.test(url);  // Allows any protocol!
}

Permissive File Extension Check

function validateFilename(filename) {
    // VULNERABLE - checks if extension appears anywhere
    // Attacker: "malware.jpg.exe" (passes, but the real extension is .exe)
    return /\.(jpg|png|gif)/.test(filename);
}

Secure Patterns

Strict Username Validation

const MAX_USERNAME_LENGTH = 20;
const USERNAME_PATTERN = /^[a-z0-9_]{3,20}$/i;
const RESERVED_NAMES = new Set(['admin', 'root', 'system', 'administrator']);

function validateUsername(username: string): boolean {
    if (!username || username.length > MAX_USERNAME_LENGTH) {
        return false;
    }

    // Strict: anchored pattern
    if (!USERNAME_PATTERN.test(username)) {
        return false;
    }

    // Reject reserved names
    if (RESERVED_NAMES.has(username.toLowerCase())) {
        return false;
    }

    return true;
}

Why this works: The anchored regex pattern /^[a-z0-9_]{3,20}$/i uses ^ (start) and $ (end) anchors so the entire string must match, which stops inputs such as "admin'; DROP TABLE users--" from passing on the strength of a valid substring. The case-insensitive flag (i) accepts either case without widening the character set. The explicit length check rejects oversized input before the regex runs, which bounds matching cost; it overlaps with the {3,20} quantifier and is kept as defense-in-depth. The Set of reserved names provides O(1) lookups and stops users from registering names that could be mistaken for privileged accounts; this is an impersonation control, not an authorization mechanism. Defining the pattern as a module-level constant keeps validation consistent across the application.
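
One JavaScript-specific caveat may be worth illustrating: adding the m (multiline) flag makes ^ and $ match at line boundaries, which quietly reintroduces substring matching. The sketch below is a minimal demonstration under that assumption; the names STRICT, MULTILINE, and payload are hypothetical and not part of the validator above.

```typescript
// Hypothetical demonstration; the pattern mirrors USERNAME_PATTERN above.
const STRICT = /^[a-z0-9_]{3,20}$/i;
const MULTILINE = /^[a-z0-9_]{3,20}$/im; // same pattern with the m flag added

// A valid-looking first line followed by a newline and a payload
const payload = "bob\n<script>alert(1)</script>";

// Without m, $ matches only at the very end of the input
const strictResult = STRICT.test(payload);

// With m, $ also matches just before "\n", so the first line alone satisfies the pattern
const multilineResult = MULTILINE.test(payload);

console.log(strictResult, multilineResult); // false true
```

If a validation pattern needs flags at all, i is usually sufficient; m is best avoided for whole-string checks.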

Strict URL Validation

const ALLOWED_PROTOCOLS = new Set(['http:', 'https:']);

function validateURL(urlString: string): boolean {
    try {
        const url = new URL(urlString);

        // Strict: only allow specific protocols
        if (!ALLOWED_PROTOCOLS.has(url.protocol)) {
            return false;
        }

        // Validate hostname exists
        if (!url.hostname) {
            return false;
        }

        // Optional: reject localhost and common private IPv4 ranges (not exhaustive)
        const hostname = url.hostname.toLowerCase();
        if (hostname === 'localhost' || 
            hostname === '127.0.0.1' ||
            hostname.startsWith('192.168.') ||
            hostname.startsWith('10.') ||
            /^172\.(1[6-9]|2[0-9]|3[0-1])\./.test(hostname)) {
            return false;
        }

        return true;
    } catch (e) {
        return false;
    }
}

Why this works: The native URL constructor parses input according to the WHATWG URL Standard and throws on anything it cannot parse. Validating url.protocol against a Set of allowed protocols rejects dangerous schemes like javascript:, data:, file:, or vbscript: that could enable XSS or local file access attacks. The hostname check is a defensive guard: for http: and https: the constructor already throws when the host is missing (for example, http://). The private IP and localhost checks reduce exposure to SSRF attacks targeting internal services (192.168.x.x, 10.x.x.x, 172.16-31.x.x, 127.0.0.1), but they are not exhaustive: they do not cover the rest of 127.0.0.0/8, link-local addresses such as 169.254.169.254, IPv6 literals such as [::1], or public hostnames that resolve to internal addresses. Using try-catch for parsing errors ensures that any malformed URL results in rejection, following a fail-secure pattern. This approach is much harder to bypass than regex-based URL validation.
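
The internal-address check can be widened with a small helper. The following is a sketch, not a complete SSRF defence: isInternalHostname is a hypothetical name, it assumes its argument comes from new URL(...).hostname, and it deliberately leaves out IPv6 private ranges and DNS resolution, which need separate handling.

```typescript
// Hypothetical helper: flags hostnames that are loopback, private,
// or link-local address literals. Assumes the value was taken from
// new URL(...).hostname, so IPv6 literals still carry their brackets.
function isInternalHostname(hostname: string): boolean {
    const host = hostname.toLowerCase();

    // Loopback names and the IPv6 loopback / unspecified addresses
    if (host === 'localhost' || host.endsWith('.localhost') ||
        host === '[::1]' || host === '[::]') {
        return true;
    }

    // Only dotted-quad IPv4 literals are inspected below
    const match = /^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/.exec(host);
    if (!match) {
        return false; // a DNS name: needs a check at resolution time instead
    }
    const a = Number(match[1]);
    const b = Number(match[2]);

    return a === 0 ||                           // 0.0.0.0/8
           a === 10 ||                          // 10.0.0.0/8
           a === 127 ||                         // 127.0.0.0/8 loopback
           (a === 169 && b === 254) ||          // 169.254.0.0/16 link-local
           (a === 172 && b >= 16 && b <= 31) || // 172.16.0.0/12
           (a === 192 && b === 168);            // 192.168.0.0/16
}
```

Unlike the startsWith() checks, matching whole octets does not misclassify a DNS name such as 10.example.com as private.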

Strict Email Validation

This check accepts only the simple local@domain.tld form. Full RFC 5322 compliance is considerably more complex.

const MAX_EMAIL_LENGTH = 254;
const MAX_LOCAL_LENGTH = 64;
const EMAIL_PATTERN = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

function validateEmail(email: string): boolean {
    if (!email || email.length > MAX_EMAIL_LENGTH) {
        return false;
    }

    // Anchored regex validates entire string
    if (!EMAIL_PATTERN.test(email)) {
        return false;
    }

    // Additional semantic checks
    const [local] = email.split('@');
    if (local.length > MAX_LOCAL_LENGTH) {
        return false;
    }

    return true;
}

Why this works: The anchored pattern /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/ enforces a simple email structure with clear separation between local part, @ symbol, domain, and TLD. The anchors prevent accepting emails embedded in larger strings (like "user@example.com<script>alert(1)</script>"). The 254-character limit follows from the RFC 5321 path length limit and bounds regex backtracking on extremely long inputs. The local part length check (64 characters) enforces the RFC 5321 local-part limit. Requiring a TLD of at least two letters (.co, .uk) rejects dotless hosts such as user@localhost, but it does not confirm that the domain exists, and it excludes internationalized TLDs in punycode form. The pattern also still accepts some invalid addresses, such as those with consecutive dots in the local part. This simplified approach balances security with usability - full RFC 5322 compliance is extremely complex and rarely needed for web applications.

Strict Filename Validation

const MAX_FILENAME_LENGTH = 255;
const FILENAME_PATTERN = /^[a-zA-Z0-9_-]+\.(jpg|png|gif)$/i;

function validateFilename(filename: string): boolean {
    if (!filename || filename.length > MAX_FILENAME_LENGTH) {
        return false;
    }

    // Anchored pattern - must END with allowed extension
    if (!FILENAME_PATTERN.test(filename)) {
        return false;
    }

    // Additional security checks
    if (filename.includes('..') || filename.includes('/') || filename.includes('\\')) {
        return false;
    }

    return true;
}

Why this works: The pattern /^[a-zA-Z0-9_-]+\.(jpg|png|gif)$/i uses the $ anchor to ensure the filename ends with an allowed extension, which rejects names like "malware.jpg.exe" where .jpg appears in the filename but the real extension is .exe. Because the character allowlist [a-zA-Z0-9_-] for the base name excludes dots, names with more than one extension, such as "malware.exe.jpg", are rejected as well. The same allowlist blocks special characters that could be used for path traversal or command injection. Length validation rejects names longer than common filesystem limits and avoids needless work on extremely long input. The explicit checks for .., /, and \ are redundant with the regex and are kept as defense-in-depth in case the pattern is later loosened. Case-insensitive matching (i flag) accepts "file.JPG" as well as "file.jpg", so legitimate uppercase extensions are not rejected.
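
To make the difference concrete, the sketch below runs the strict pattern and the unanchored variant from the vulnerable example over a few names. The sample names and the UNANCHORED and results identifiers are hypothetical, chosen only for illustration.

```typescript
// Pattern copied from the example above; the sample names are hypothetical.
const FILENAME_PATTERN = /^[a-zA-Z0-9_-]+\.(jpg|png|gif)$/i;
const UNANCHORED = /\.(jpg|png|gif)/; // the vulnerable variant, for comparison

const samples = [
    'photo.jpg',        // legitimate
    'photo.JPG',        // legitimate, uppercase extension
    'malware.jpg.exe',  // allowed extension in the middle
    'malware.exe.jpg',  // more than one extension
    '../photo.jpg',     // traversal attempt
];

// Evaluate each name against both patterns
const results = samples.map((name) => ({
    name,
    unanchored: UNANCHORED.test(name),
    strict: FILENAME_PATTERN.test(name),
}));

console.table(results);
```

Only the first two names pass the strict pattern. The unanchored variant accepts the last three and, lacking the i flag, rejects photo.JPG.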

Path Validation (Node.js)

import path from 'path';
import fs from 'fs';

const BASE_DIR = path.resolve('/var/data');
const ALLOWED_FILES = new Set(['report.pdf', 'data.csv', 'summary.txt']);

function getFilePath(filename: string): string {
    // Strict allowlist
    if (!ALLOWED_FILES.has(filename)) {
        throw new Error('File not allowed');
    }

    // Resolve to absolute path
    const filePath = path.resolve(BASE_DIR, filename);

    // Verify within allowed directory
    if (!filePath.startsWith(BASE_DIR + path.sep)) {
        throw new Error('Path traversal detected');
    }

    // Verify file exists
    if (!fs.existsSync(filePath)) {
        throw new Error('File not found');
    }

    return filePath;
}

Why this works: The allowlist approach with ALLOWED_FILES provides the strongest control by explicitly defining which files can be accessed; any other name is rejected before a path is built. The path.resolve() method converts relative paths to absolute paths and normalizes them (removing ., .., and redundant separators), which neutralizes traversal sequences like "../../etc/passwd". It does not resolve symbolic links; use fs.realpathSync() when symlinks inside the base directory are a concern. The startsWith() check with BASE_DIR + path.sep ensures the normalized path remains within the base directory and, because of the trailing separator, does not accept sibling directories such as /var/data-old. Using path.sep keeps the check platform-independent (works on both Windows \ and Unix /). The fs.existsSync() check gives a clear error for missing files, but it does not prevent time-of-check-time-of-use (TOCTOU) races: the file can change between the check and its use, so callers must still handle errors when opening it. This defense-in-depth approach combines allowlisting, canonicalization, and boundary checking.
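
Where symbolic links inside the base directory are a realistic concern, the containment check can be run on real paths instead. This is a minimal sketch assuming Node.js; resolveInsideBase and its parameters are hypothetical names, and it complements the allowlist, it does not replace it.

```typescript
import * as fs from 'node:fs';
import * as path from 'node:path';

// Hypothetical helper: resolves symlinks before the containment check.
// baseDir and filename are illustrative parameters.
function resolveInsideBase(baseDir: string, filename: string): string {
    // Canonical form of the base directory (symlinks resolved)
    const realBase = fs.realpathSync(baseDir);

    // realpathSync throws if the target does not exist, so no separate
    // existence check is needed here
    const realTarget = fs.realpathSync(path.resolve(realBase, filename));

    // Compare canonical paths; the trailing separator excludes siblings
    if (!realTarget.startsWith(realBase + path.sep)) {
        throw new Error('Path escapes base directory');
    }
    return realTarget;
}
```

A race between this check and the later open is still possible, so the caller should treat errors from the subsequent file operation as expected.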

Enum-Based Validation (TypeScript)

enum Role {
    USER = 'user',
    MODERATOR = 'moderator',
    ADMIN = 'admin'
}

function validateRole(role: string): boolean {
    // Type-safe validation
    return Object.values(Role).includes(role as Role);
}

// Alternative: using Set
const ALLOWED_ROLES = new Set(['user', 'moderator', 'admin']);

function validateRoleSet(role: string): boolean {
    return ALLOWED_ROLES.has(role.toLowerCase());
}

Why this works: A TypeScript string enum defines a fixed set of allowed values and provides compile-time type safety. At runtime the enum is an ordinary object, so the guarantee depends on the code never modifying it, not on the language. The Object.values(Role).includes() check ensures only values that exist in the enum are accepted, providing exact-match allowlist validation without regex complexity. The Set alternative provides O(1) lookup performance compared to array .includes() which is O(n), although the difference is negligible for a handful of values. Converting input to lowercase enables case-insensitive matching; callers should then use the lowercased value, not the raw input, in later comparisons. Exact matching removes the substring and partial-match bypasses that affect pattern-based validation - the value either exists in the allowed set or it doesn't. Using enums also improves IDE autocomplete, enables refactoring tools, and prevents typos through type checking.
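
One way to make sure callers use the normalized value is to return it from the validator. The sketch below is an alternative formulation, not the original code: ROLES, RoleName, isRoleName, and parseRole are hypothetical names, and it uses a const array with a type guard in place of the enum.

```typescript
// Hypothetical alternative: a const array as the single source of truth
const ROLES = ['user', 'moderator', 'admin'] as const;
type RoleName = typeof ROLES[number];

// Read-only view so the allowlist cannot be extended through this reference
const ROLE_SET: ReadonlySet<string> = new Set<string>(ROLES);

// Type guard: narrows a string to RoleName when it is in the allowlist
function isRoleName(value: string): value is RoleName {
    return ROLE_SET.has(value);
}

// Returns the canonical lowercase role, or null when it is not allowlisted
function parseRole(input: string): RoleName | null {
    const normalized = input.toLowerCase();
    return isRoleName(normalized) ? normalized : null;
}

console.log(parseRole('ADMIN'), parseRole('superuser'));
```

Because parseRole returns the lowercased value, downstream code compares against 'admin' and never sees the raw 'ADMIN'.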

Numeric ID Validation

const ID_PATTERN = /^[0-9]{8}$/;
const MIN_ID = 10000000;
const MAX_ID = 99999999;

function validateID(idStr: string): boolean {
    // Format validation
    if (!ID_PATTERN.test(idStr)) {
        return false;
    }

    // Semantic validation: check range
    const id = parseInt(idStr, 10);
    return id >= MIN_ID && id <= MAX_ID;
}

Why this works: The pattern /^[0-9]{8}$/ enforces exactly 8 digits with anchors, preventing inputs like "12345678abc" or "abc12345678" that contain valid substrings. This format validation happens before parsing, catching malformed input early and preventing issues with parseInt() which would silently ignore trailing non-numeric characters (e.g., parseInt("123abc", 10) returns 123). The range check with MIN_ID and MAX_ID enforces semantic validity - for example, if your IDs start at 10000000, inputs like "00000001" or "09999999" that match the format but fall below the valid range get rejected. With an 8-digit format the upper bound of 99999999 is always satisfied; it is kept so the range stays explicit if the format changes. This layered validation (format → parsing → range) provides defense-in-depth and clear error boundaries.
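
The layering can also be expressed as a parser that hands back the validated number, so callers never re-parse the raw string. This is a sketch under the same constants as above; parseID and loose are hypothetical names.

```typescript
// Constants repeated from the example above so the block is self-contained
const ID_PATTERN = /^[0-9]{8}$/;
const MIN_ID = 10000000;
const MAX_ID = 99999999;

// Hypothetical helper: returns the parsed ID, or null when validation fails
function parseID(idStr: string): number | null {
    if (!ID_PATTERN.test(idStr)) {
        return null; // wrong format: not exactly 8 ASCII digits
    }
    const id = parseInt(idStr, 10);
    return id >= MIN_ID && id <= MAX_ID ? id : null;
}

// parseInt on its own is permissive: it stops at the first non-digit
const loose = parseInt('12345678abc', 10);

console.log(loose);                   // 12345678
console.log(parseID('12345678abc'));  // null
console.log(parseID('00000001'));     // null
console.log(parseID('12345678'));     // 12345678
```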

JavaScript/TypeScript-Specific Best Practices

Use Anchored Regular Expressions

// WRONG: matches substring
const pattern1 = /[a-z0-9]+/;

// CORRECT: anchored to match entire string
const pattern2 = /^[a-z0-9]+$/;

// Test the difference
console.log(pattern1.test("admin'; DROP TABLE")); // true (matches "admin")
console.log(pattern2.test("admin'; DROP TABLE")); // false (no full match)

Define Patterns as Constants

// Define once, reuse many times
const USERNAME_PATTERN = /^[a-z0-9_]{3,20}$/i;
const EMAIL_PATTERN = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;

function validate(username: string): boolean {
    return USERNAME_PATTERN.test(username);
}
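
A caveat when sharing a pattern constant: a regex with the g (or y) flag keeps state in lastIndex between test() calls, so a shared constant can return alternating results for the same input. The sketch below illustrates this with hypothetical names (STATEFUL, STATELESS); the patterns defined above have no g flag and are not affected.

```typescript
// Hypothetical demonstration of the lastIndex pitfall with a shared /g regex
const STATEFUL = /^[a-z0-9_]{3,20}$/gi; // g flag: test() updates lastIndex
const STATELESS = /^[a-z0-9_]{3,20}$/i; // no g flag: test() always starts at index 0

// Same input three times against the stateful pattern
const first = STATEFUL.test('test_user');  // matches; lastIndex moves to the end
const second = STATEFUL.test('test_user'); // search resumes at lastIndex and fails
const third = STATEFUL.test('test_user');  // lastIndex was reset by the failure

// The stateless pattern gives the same answer every time
const stable = [1, 2, 3].map(() => STATELESS.test('test_user'));

console.log(first, second, third); // true false true
console.log(stable);               // [ true, true, true ]
```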

Use Sets for Allowlists

// Faster lookup than arrays
const ALLOWED_EXTENSIONS = new Set(['jpg', 'png', 'gif', 'pdf']);

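function hasAllowedExtension(filename: string): boolean {
    // Require a dot that is not the first character, so a bare name
    // like "pdf" or a dotfile like ".jpg" is not treated as an extension
    const dot = filename.lastIndexOf('.');
    if (dot <= 0) {
        return false;
    }
    return ALLOWED_EXTENSIONS.has(filename.slice(dot + 1).toLowerCase());
}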

Use URL API for URL Validation

// Built-in URL parsing is safer than regex
function isValidHttpUrl(urlString: string): boolean {
    try {
        const url = new URL(urlString);
        return url.protocol === 'http:' || url.protocol === 'https:';
    } catch {
        return false;
    }
}

TypeScript Branded Types

type ValidatedString = string & { __validated: true };

function validateAndTag(input: string): ValidatedString | null {
    const pattern = /^[a-z0-9_]{3,20}$/i;
    if (pattern.test(input)) {
        return input as ValidatedString;
    }
    return null;
}

// Usage ensures validated strings are used safely
function processValidatedInput(input: ValidatedString) {
    // Input is guaranteed to be validated
    console.log(input);
}

const userInput = "test_user";
const validated = validateAndTag(userInput);
if (validated) {
    processValidatedInput(validated);
}

Frontend-Specific Considerations

Client-Side Validation is Not Security

// Client-side validation for UX only
function validateClientSide(email: string): boolean {
    const pattern = /^[a-zA-Z0-9._%+-]+@[a-zA-Z0-9.-]+\.[a-zA-Z]{2,}$/;
    return pattern.test(email);
}

// ALWAYS validate on server side too
// Never trust client-side validation for security

Sanitize Before DOM Insertion

// Even with validation, sanitize for XSS prevention
function sanitizeForHTML(input: string): string {
    const div = document.createElement('div');
    div.textContent = input;
    return div.innerHTML;
}

// Or use a library like DOMPurify
import DOMPurify from 'dompurify';

function displayUserInput(input: string) {
    const clean = DOMPurify.sanitize(input);
    document.getElementById('output')!.innerHTML = clean;
}

Verification

After implementing the recommended secure patterns, verify the fix through multiple approaches:

Manual testing: Submit malicious payloads relevant to this vulnerability and confirm they're handled safely without executing unintended operations
Code review: Confirm all validation uses the secure patterns (fully anchored regexes, Set- or enum-based allowlists, URL and path APIs) with no unanchored or substring matching
Static analysis: Use security scanners to verify no new vulnerabilities exist and the original finding is resolved
Regression testing: Ensure legitimate user inputs and application workflows continue to function correctly
Edge case validation: Test with special characters, boundary conditions, and unusual inputs to verify proper handling
Framework verification: If using a framework or library, confirm the recommended APIs are used correctly according to documentation
Authentication/session testing: Verify security controls remain effective and cannot be bypassed (if applicable to the vulnerability type)
Rescan: Run the security scanner again to confirm the finding is resolved and no new issues were introduced
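
The manual, regression, and edge case checks above can be captured in a small table-driven test. The sketch below is one possible shape, not a prescribed test suite: the cases are hypothetical, and validateUsername is a condensed copy of the earlier example so the block stands alone.

```typescript
// Condensed copy of the username validator so the block is self-contained
const USERNAME_PATTERN = /^[a-z0-9_]{3,20}$/i;
const RESERVED_NAMES = new Set(['admin', 'root', 'system', 'administrator']);

function validateUsername(username: string): boolean {
    return USERNAME_PATTERN.test(username) &&
           !RESERVED_NAMES.has(username.toLowerCase());
}

// Hypothetical cases: each pairs an input with the expected verdict
const cases: Array<[string, boolean]> = [
    ['test_user', true],                    // regression: legitimate input
    ['abc', true],                          // boundary: minimum length
    ['a'.repeat(20), true],                 // boundary: maximum length
    ['ab', false],                          // boundary: too short
    ['a'.repeat(21), false],                // boundary: too long
    ["admin'; DROP TABLE users--", false],  // malicious payload
    ['bob\n', false],                       // trailing newline
    ['Admin', false],                       // reserved name, different case
    ['', false],                            // empty input
];

// Collect every case whose actual result differs from the expectation
const failures = cases.filter(([input, expected]) => validateUsername(input) !== expected);

console.log(failures.length === 0 ? 'all cases pass' : failures);
```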