CWE-330: Use of Insufficiently Random Values - Python

Overview

Weak random number generation in Python occurs when developers use the random module for security-sensitive operations such as generating session tokens, password reset tokens, API keys, CSRF tokens, or cryptographic keys. The random module uses the Mersenne Twister algorithm, which is deterministic: an attacker who observes enough outputs or guesses the seed can predict every subsequent value. For security purposes, Python provides the secrets module (Python 3.6+) and os.urandom(), which draw from the operating system's cryptographically secure pseudorandom number generator (CSPRNG).

Primary Defence: Use the secrets module (Python 3.6+) for all security-sensitive random values, including tokens and keys.

Common Vulnerable Patterns

random.randint() for Session IDs

import random

# VULNERABLE - Predictable session ID
def generate_session_id():
    session_id = random.randint(1000000, 9999999)
    return session_id

# The ID space holds only 9,000,000 values (~23 bits), so it can be brute-forced outright.
# With enough observed IDs (e.g., 3045213, 8765443, ...), an attacker can also
# reconstruct the PRNG state and predict future values

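Why this is vulnerable: random is not cryptographically secure. The Mersenne Twister state can be recovered from 624 consecutive 32-bit outputs; truncated outputs such as these IDs generally take more observations, but the state remains recoverable.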
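
To make the state-recovery claim concrete, here is a minimal sketch of the well-known "untempering" technique. It assumes the attacker sees 624 consecutive, untruncated outputs of getrandbits(32); the helper names (untemper, clone_generator) are illustrative and not part of any library.

```python
import random

def _undo_right_xor(y, shift):
    # Invert y = x ^ (x >> shift); each pass fixes `shift` more high bits
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x

def _undo_left_xor(y, shift, mask):
    # Invert y = x ^ ((x << shift) & mask); each pass fixes more low bits
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & 0xFFFFFFFF

def untemper(y):
    # Reverse the four MT19937 tempering steps in opposite order
    y = _undo_right_xor(y, 18)
    y = _undo_left_xor(y, 15, 0xEFC60000)
    y = _undo_left_xor(y, 7, 0x9D2C5680)
    y = _undo_right_xor(y, 11)
    return y

def clone_generator(outputs):
    # Rebuild a random.Random whose future output matches the victim's
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    state = tuple(untemper(o) for o in outputs) + (624,)
    clone = random.Random()
    clone.setstate((3, state, None))
    return clone

victim = random.Random()  # seeded from OS entropy, unknown to the attacker
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_generator(observed)
predicted = [clone.getrandbits(32) for _ in range(10)]
actual = [victim.getrandbits(32) for _ in range(10)]
print(predicted == actual)
```

Note that the victim generator here was seeded from OS entropy; a strong seed does not help once the outputs themselves leak the internal state.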

random.choice() for Password Reset Tokens

import random
import string

# VULNERABLE - Predictable reset token
def generate_reset_token():
    chars = string.ascii_letters + string.digits
    token = ''.join(random.choice(chars) for _ in range(32))
    return token

# Token looks random: "a7B3xQ9..." but is predictable
# Attacker can brute force or predict based on seed

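Why this is vulnerable: random.choice() draws from the Mersenne Twister. An attacker who learns the seed (for example, an explicit time-based seed) or reconstructs the generator state from observed outputs can regenerate the same tokens.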

Time-based Seed

import random
import time

# VULNERABLE - Predictable seed
random.seed(int(time.time()))  # Seed with current timestamp
token = random.randint(0, 999999)

# An attacker who knows the approximate time can brute-force the seed
# A Unix timestamp holds at most ~32 bits, and far fewer unknown bits
# once the attacker knows roughly when the token was issued

Why this is vulnerable: Time is predictable. Attacker can try all possible timestamps within a time window and reproduce tokens.
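
The brute-force search described above can be sketched in a few lines. The timestamp, the one-hour window, and the helper names (tokens_from_seed, recover_seed) are assumptions chosen for illustration; observing three tokens instead of one makes a false match effectively impossible.

```python
import random

def tokens_from_seed(seed, count=3):
    # Reproduce the vulnerable pattern: timestamp seed, then randint()
    rng = random.Random(seed)
    return [rng.randint(0, 999999) for _ in range(count)]

def recover_seed(observed, approx_time, window=3600):
    # Try every second within +/- window of the attacker's time estimate
    for candidate in range(approx_time - window, approx_time + window + 1):
        if tokens_from_seed(candidate, len(observed)) == observed:
            return candidate
    return None

secret_seed = 1_700_000_123  # hypothetical server timestamp
observed = tokens_from_seed(secret_seed)
recovered = recover_seed(observed, approx_time=1_700_000_000)
print(recovered == secret_seed)
```

At most 7,201 candidates are tested here, which takes well under a second on ordinary hardware.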

UUID4 Fallback to random

import uuid

# POTENTIALLY VULNERABLE: older uuid4 implementations could fall back to random
user_id = str(uuid.uuid4())  # Secure on current CPython; verify on legacy runtimes

# Current CPython 3 releases build uuid4() from os.urandom(16).
# Legacy versions could fall back to the random module
# (e.g., random.getrandbits()) when no OS source was available

Why this is vulnerable: uuid.uuid4() uses os.urandom() on modern Python, but legacy versions and other implementations may fall back to random if no CSPRNG is available.

random.shuffle() for Security

import random

# VULNERABLE - Shuffling for security purposes
def generate_verification_code(user_id):
    digits = list("0123456789" * 10)
    random.shuffle(digits)
    code = ''.join(digits[:6])
    return code

# Appears random but is predictable with seed knowledge

Why this is vulnerable: random.shuffle() uses Mersenne Twister. Patterns are reproducible.

Predictable Encryption Key

import random
from cryptography.fernet import Fernet

# VULNERABLE - Key derived from weak random
def generate_encryption_key():
    key_bytes = bytes([random.randint(0, 255) for _ in range(32)])
    # This is NOT how you generate Fernet keys, but illustrates the point
    return key_bytes

# Attacker can predict key if they know seed

Why this is vulnerable: Encryption keys must have full entropy. Predictable random = predictable keys = broken encryption.

Random Salt for Passwords

import random
import hashlib

# VULNERABLE - Weak salt
def hash_password(password):
    salt = str(random.randint(0, 999999))  # Predictable salt
    salted = salt + password
    hashed = hashlib.sha256(salted.encode()).hexdigest()
    return hashed, salt

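# Salt must be unique and drawn from a large space to defeat precomputed tables

Why this is vulnerable: A salt with only 1,000,000 possible values, generated predictably, lets an attacker precompute tables for every salt, defeating the purpose of salting. A single SHA-256 pass is also far too fast for password storage.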

Random Nonce/IV

import random
from Crypto.Cipher import AES

# VULNERABLE - Predictable nonce/IV
def encrypt_data(data, key):
    nonce = bytes([random.randint(0, 255) for _ in range(16)])  # WEAK
    cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
    ciphertext, tag = cipher.encrypt_and_digest(data)
    return nonce, ciphertext, tag

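# Nonce reuse under the same key breaks AES-GCM security

Why this is vulnerable: An AES-GCM nonce must never repeat under the same key. A PRNG whose state can be reproduced (for example after re-seeding, or in forked worker processes that inherit the same state) can emit the same nonce twice, and nonce reuse is a catastrophic failure.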

Secure Patterns

secrets.token_hex() for Session IDs

import secrets

# SECURE - Cryptographically strong session ID
def generate_session_id():
    # 32 hex chars = 128 bits of entropy
    session_id = secrets.token_hex(16)
    return session_id

# Example: "3a7b9c8e4f1d2a5b6c7e8f9a0b1c2d3e"
# Unpredictable even with knowledge of previous tokens

Why this works:

secrets module uses the OS CSPRNG: Unlike random (Mersenne Twister, recoverable from 624 outputs), secrets uses os.urandom() internally
OS entropy sources: /dev/urandom or getrandom() (Linux/Unix), or CryptGenRandom/BCryptGenRandom (Windows)
128 bits makes collisions negligible: 16 bytes = 32 hex chars; any two IDs collide with probability 2^-128, and by the birthday bound roughly 2^64 IDs are needed before a collision becomes likely
Unpredictable output: Observing any number of IDs provides no prediction capability
Prevents session prediction: Attackers cannot guess or enumerate valid IDs, which blocks hijacking by prediction; preventing session fixation additionally requires issuing a new ID at login

secrets.token_urlsafe() for Reset Tokens

import secrets

# SECURE - URL-safe reset token
def generate_reset_token():
    # 32 bytes = 256 bits of entropy, base64-encoded (URL-safe)
    token = secrets.token_urlsafe(32)
    return token

# Example: "A3b7K9xQmZpLr4tYwFj2nVc8hG1sE6uD..."
# Can be safely used in URLs, emails

Why this works:

256 bits prevents brute-force: At 1 trillion guesses/second, exhausting half the space (2^255 tokens) takes on the order of 10^57 years
base64url encoding is URL-safe: Replaces + with -, / with _, omits = for safe transmission in URLs/emails
Independent token generation: Observing millions of tokens provides no prediction capability
Superior to predictable schemes: Unlike timestamp/sequential tokens which can be predicted or enumerated
Best practices: Single-use, 1-24 hour expiration, rate limiting to prevent brute-force
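
The single-use and expiration practices listed above can be sketched as follows. This is an illustration under stated assumptions, not a production design: the in-memory dict stands in for a database, the one-hour lifetime and the function names are hypothetical, storing only a SHA-256 digest of the token is an extra hardening choice not required by the text above, and rate limiting is omitted.

```python
import hashlib
import secrets
import time

RESET_TTL_SECONDS = 3600  # assumed 1-hour lifetime
_pending = {}  # sha256(token) -> (user_id, expires_at); stand-in for a database

def _digest(token):
    return hashlib.sha256(token.encode()).hexdigest()

def issue_reset_token(user_id, now=None):
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    # Store only a hash so a leaked table does not expose usable tokens
    _pending[_digest(token)] = (user_id, now + RESET_TTL_SECONDS)
    return token

def redeem_reset_token(token, now=None):
    now = time.time() if now is None else now
    # pop() makes the token single-use whether or not it is still valid
    entry = _pending.pop(_digest(token), None)
    if entry is None:
        return None
    user_id, expires_at = entry
    return user_id if now <= expires_at else None
```

A caller would email the value returned by issue_reset_token() and pass the value from the reset link to redeem_reset_token(), treating None as "invalid or expired".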

os.urandom() for Encryption Keys

import os
from cryptography.fernet import Fernet

# SECURE - Generate Fernet key properly
def generate_encryption_key():
    # Fernet.generate_key() uses os.urandom() internally
    key = Fernet.generate_key()
    return key

# Or directly use os.urandom for custom needs
def generate_custom_key():
    # 32 bytes = 256 bits
    key = os.urandom(32)
    return key

Why this works:

Highest quality randomness: os.urandom() accesses OS cryptographic source combining hardware entropy (CPU jitter, interrupt timing, hardware RNG) with crypto algorithms
256-bit security: Provides sufficient margin against brute-force; even with quantum computers using Grover's algorithm (quadratic speedup), 256-bit keys retain 128 bits post-quantum security
Fernet integration: Fernet.generate_key() returns the base64url encoding of os.urandom(32); the 32 bytes are split into a 16-byte signing key and a 16-byte encryption key (AES-128-CBC + HMAC-SHA256 authenticated encryption)
Direct usage: os.urandom() appropriate for raw random bytes, custom cryptographic constructions, specific key sizes
Superior to alternatives: Avoids deriving keys from passwords without KDFs or using weak sources like random.random() (predictable/reproducible)

secrets.choice() for Random Selection

import secrets
import string

# SECURE - Random password generation
def generate_password(length=16):
    chars = string.ascii_letters + string.digits + string.punctuation
    password = ''.join(secrets.choice(chars) for _ in range(length))
    return password

# Example: "x7!aB9#qZ3$mK2&p"
# Each character independently random

Why this works:

Cryptographic randomness: secrets.choice() uses os.urandom() for independent, uniformly distributed character selection across the character set
High entropy: 94-character set (uppercase, lowercase, digits, punctuation) with 16 characters = 94^16 ≈ 3.7×10^31 possibilities (≈105 bits entropy)
Independence: Knowing any number of generated passwords provides no advantage in predicting future passwords (unlike random.choice() where Mersenne Twister state can be reconstructed)
Uniform distribution critical: Unbiased selection maximizes entropy per character; string module constants provide standard sets (filter ambiguous chars like O, 0, l, 1 for usability)
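
As a hedged illustration of the last point, the sketch below filters out look-alike characters and reports the resulting upper bound on entropy. The AMBIGUOUS set, the default length of 20, and the helper names are assumptions for this example; removing characters shrinks the alphabet, so the length is increased to compensate.

```python
import math
import secrets
import string

AMBIGUOUS = set("O0oIl1")  # assumed list; adjust for your font and audience
READABLE_CHARS = "".join(
    c for c in string.ascii_letters + string.digits if c not in AMBIGUOUS
)

def generate_readable_password(length=20):
    return "".join(secrets.choice(READABLE_CHARS) for _ in range(length))

def max_entropy_bits(length, alphabet_size):
    # Upper bound: assumes uniform, independent selection per character
    return length * math.log2(alphabet_size)

password = generate_readable_password()
bits = max_entropy_bits(len(password), len(READABLE_CHARS))
print(f"{len(READABLE_CHARS)}-character alphabet, about {bits:.1f} bits")
```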

secrets.randbelow() for Random Integers

import secrets

CODE_SPACE = 1_000_000  # 6-digit decimal codes

def generate_verification_code():
    code = secrets.randbelow(CODE_SPACE)
    return f"{code:06d}"

Why this works:

Avoids modulo bias: Naively reducing random_bytes with % n introduces bias when the RNG's range is not an exact multiple of n (Example: 0..255 % 10 gives 0–5 occurring 26 times each, 6–9 occurring 25 times each → biased)
Rejection sampling: secrets.randbelow() internally uses rejection sampling to discard out-of-range values
Equal probability: For 6-digit codes (0-999999), each code has identical generation probability with ~20 bits entropy (log2(1000000) ≈ 19.93)
Short-lived use cases: Suitable for email confirmation/2FA when combined with rate limiting, expiration (5-15 min), account lockout
Unpredictable: Cryptographic randomness prevents prediction even with knowledge of previous codes
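
The modulo-bias arithmetic above can be checked directly, and the fix sketched with a simple byte-based rejection sampler. The sampler is one common form of rejection sampling written for illustration; it is not claimed to be the exact algorithm inside secrets.randbelow(), and the helper names are hypothetical.

```python
import secrets
from collections import Counter

def residue_counts(source_range, n):
    # Count how often each residue appears when every possible
    # source value is reduced with % n
    return Counter(value % n for value in range(source_range))

def unbiased_below(n):
    # Rejection sampling over single bytes; requires 1 <= n <= 256
    if not 1 <= n <= 256:
        raise ValueError("n must be between 1 and 256")
    limit = 256 - (256 % n)  # largest multiple of n that fits in a byte
    while True:
        value = secrets.token_bytes(1)[0]
        if value < limit:
            return value % n

counts = residue_counts(256, 10)
print(counts[0], counts[9])
```

In application code, calling secrets.randbelow() directly remains the simpler choice.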

UUID4 with Explicit CSPRNG Check

import uuid
import os

# SECURE - uuid4() draws from os.urandom() on Python 3
def generate_user_id():
    return str(uuid.uuid4())

# Alternatively, build the UUID manually so the CSPRNG source is explicit
def generate_user_id_manual():
    random_bytes = os.urandom(16)
    # version=4 overwrites the version and variant bits per RFC 4122
    return str(uuid.UUID(bytes=random_bytes, version=4))

# Example: "f47ac10b-58cc-4372-a567-0e02b2c3d479"

Why this works:

122 bits randomness (RFC 4122 v4): 6 bits reserved for version/variant identifiers
Python 3.6+ uses os.urandom(): Modern versions ensure cryptographic randomness; older versions may fall back to weaker sources
Manual generation guarantees security: Using os.urandom(16) explicitly ensures cryptographic randomness regardless of Python version
Negligible collision probability: At 1 billion UUIDs/second, it takes about 86 years (~2.7 × 10^18 UUIDs) to reach a 50% collision chance
Ideal for distributed systems: No coordination needed; prevents information leakage vs sequential IDs
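
The collision estimate follows from the birthday approximation and can be reproduced numerically. The generation rate is the assumed 1 billion UUIDs per second, and the function name is illustrative.

```python
import math

RANDOM_BITS = 122                 # random bits in a version 4 UUID
RATE_PER_SECOND = 1_000_000_000   # assumed generation rate
SECONDS_PER_YEAR = 365.25 * 24 * 3600

def draws_for_collision(bits, probability=0.5):
    # Birthday approximation: n ~ sqrt(2 * N * ln(1 / (1 - p)))
    space = 2 ** bits
    return math.sqrt(2 * space * math.log(1 / (1 - probability)))

uuids_needed = draws_for_collision(RANDOM_BITS)
years = uuids_needed / RATE_PER_SECOND / SECONDS_PER_YEAR
print(f"{uuids_needed:.2e} UUIDs, about {years:.0f} years")
```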

Password Hashing with Cryptographic Salt

import hashlib
import os

# SECURE - Proper salting with os.urandom
def hash_password(password):
    salt = os.urandom(16)  # 128 bits of random salt
    # PBKDF2 mixes the salt in itself; no manual concatenation is needed
    hashed = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 600000)
    return hashed.hex(), salt.hex()

# Better: Use argon2 or bcrypt library
from argon2 import PasswordHasher

def hash_password_argon2(password):
    ph = PasswordHasher()
    # Argon2 handles salt generation internally with CSPRNG
    hashed = ph.hash(password)
    return hashed

Why this works:

Cryptographic salt prevents rainbow tables: os.urandom(16) generates 128 bits random salt ensuring unique hashes even for identical passwords
Salt doesn't need secrecy: Stored with hash; must be unique and unpredictable to prevent precomputation attacks
PBKDF2 with 600,000 iterations: Makes brute-force expensive - 600,000 hash operations per guess (OWASP 2023 recommendation)
Modern best practice: Argon2/bcrypt: Argon2 is memory-hard and resists GPU/ASIC attacks (it won the 2015 Password Hashing Competition); bcrypt is not memory-hard but remains a widely accepted choice
Libraries auto-generate salts: Argon2/bcrypt use CSPRNG internally, eliminate manual salt handling, include constant-time comparison for timing attack protection
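
When PBKDF2 is used directly rather than through a library, verification has to be written by hand. The sketch below shows one way to do it with the standard library only; the function names and the choice to return the iteration count alongside the hash are assumptions for illustration.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 600_000

def hash_password(password, iterations=PBKDF2_ITERATIONS):
    salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    # Keep the iteration count with the hash so it can be raised later
    return digest.hex(), salt.hex(), iterations

def verify_password(password, digest_hex, salt_hex, iterations):
    # Recompute with the stored salt and iteration count
    candidate = hashlib.pbkdf2_hmac(
        "sha256", password.encode(), bytes.fromhex(salt_hex), iterations
    )
    # compare_digest avoids leaking where the first mismatch occurs
    return hmac.compare_digest(candidate, bytes.fromhex(digest_hex))
```

Usage: store the three values returned by hash_password() and pass them back to verify_password() at login.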

Cryptographic Nonce/IV Generation

import os
from Crypto.Cipher import AES

# SECURE - Proper nonce generation
def encrypt_data(data, key):
    # AES-GCM nonce should be 12 bytes (96 bits) for best performance
    nonce = os.urandom(12)
    cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
    ciphertext, tag = cipher.encrypt_and_digest(data)
    return nonce, ciphertext, tag

# Nonce is unpredictable and unique with overwhelming probability

Why this works:

Nonce uniqueness critical: For AES-GCM, nonce must be unique per encryption with same key - reusing catastrophically breaks authentication, allowing forgery and key recovery
Optimal size: 12-byte (96-bit) nonce is optimal for AES-GCM performance, avoiding internal GHASH operations needed for other sizes
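Statistical uniqueness: os.urandom(12) provides statistical rather than guaranteed uniqueness - with 96 random bits, the chance of a repeat reaches roughly 40% near 2^48 nonces (~281 trillion), so NIST SP 800-38D limits random-nonce use to 2^32 encryptions per key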
CBC vs GCM: AES-CBC requires 16-byte IV that's unpredictable (not just unique) to prevent chosen-plaintext attacks; GCM only requires uniqueness
Storage: Nonce doesn't need secrecy; returned alongside ciphertext and auth tag for decryption
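
The birthday approximation gives a feel for how quickly random 96-bit nonces can repeat under one key. This is a numerical sketch only; the function name is illustrative, and the two message counts are chosen to show the gap between a conservative per-key limit (2^32) and the birthday bound (2^48).

```python
import math

NONCE_BITS = 96

def collision_probability(messages, bits=NONCE_BITS):
    # Birthday approximation: p ~ 1 - exp(-n^2 / 2^(bits + 1))
    return -math.expm1(-(messages ** 2) / 2 ** (bits + 1))

p_at_2_32 = collision_probability(2 ** 32)
p_at_2_48 = collision_probability(2 ** 48)
print(f"2^32 messages: {p_at_2_32:.3e}")
print(f"2^48 messages: {p_at_2_48:.3f}")
```

Rotating the key well before the message count approaches these figures keeps the repeat probability negligible.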

Key Security Functions

Token Generation Helper

import secrets

def generate_token(purpose, nbytes=32):
    """
    Generate secure tokens for various purposes

    Args:
        purpose: 'session', 'reset', 'api', 'csrf'
        nbytes: Number of random bytes (default 32 = 256 bits)

    Returns:
        Secure random token as hex string
    """
    token = secrets.token_hex(nbytes)
    # Optionally prefix with purpose for identification
    return f"{purpose}_{token}"

# Usage
session_token = generate_token('session')  # "session_a7b3c9..."
reset_token = generate_token('reset', 48)  # Longer for password reset

Secure Random String Generator

import secrets
import string

def generate_secure_string(length, charset='alphanumeric'):
    """
    Generate cryptographically secure random string

    Args:
        length: Desired string length
        charset: 'alphanumeric', 'hex', 'base64', 'ascii', 'digits'

    Returns:
        Random string
    """
    charsets = {
        'alphanumeric': string.ascii_letters + string.digits,
        'hex': '0123456789abcdef',  # string.hexdigits.lower() would repeat a-f and bias selection
        'ascii': string.ascii_letters + string.digits + string.punctuation,
        'digits': string.digits,
        'base64': string.ascii_letters + string.digits + '+/'
    }

    chars = charsets.get(charset, charsets['alphanumeric'])
    return ''.join(secrets.choice(chars) for _ in range(length))

# Usage
api_key = generate_secure_string(32, 'alphanumeric')
pin = generate_secure_string(6, 'digits')

Entropy Checker

import math
import secrets

def estimate_entropy(value, alphabet_size):
    """
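    Estimate the maximum entropy in bits of a random value, assuming each
    symbol was drawn uniformly and independently from the alphabet. This
    measures size only, not the quality of the generator.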

    Args:
        value: The random value (string or bytes)
        alphabet_size: Size of character set (e.g., 62 for alphanumeric)

    Returns:
        Estimated entropy in bits
    """
    length = len(value)
    entropy_bits = length * math.log2(alphabet_size)
    return entropy_bits

# Usage
token = secrets.token_hex(32)  # 64 hex chars
entropy = estimate_entropy(token, 16)  # 256 bits
print(f"Token entropy: {entropy} bits")  # Should be >= 128 for security

# Minimum entropy recommendations:
# - Session tokens: 128 bits
# - Password reset: 128-256 bits
# - API keys: 128-256 bits
# - Encryption keys: 128-256 bits (AES-128/AES-256)

Verification

After implementing the recommended secure patterns, verify the fix through multiple approaches:

Manual testing: Generate a batch of tokens and confirm they have the expected length and character set and that no value repeats
Code review: Confirm every security-sensitive value comes from secrets or os.urandom(), with no remaining random.* calls, random.seed() usage, or time-derived values
Static analysis: Use security scanners (for example Bandit, which reports random usage as B311) to verify no new vulnerabilities exist and the original finding is resolved
Regression testing: Ensure workflows that consume the values (login, password reset, API authentication) continue to function with the new token format and length
Edge case validation: Check storage and transport limits such as database column widths, URL encoding, and cookie size against the longer tokens
Framework verification: If a framework or library generates tokens for you, confirm the recommended CSPRNG-backed APIs are used correctly according to documentation
Authentication/session testing: Verify tokens expire or are single-use where required and cannot be guessed or enumerated
Rescan: Run the security scanner again to confirm the finding is resolved and no new issues were introduced
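
Part of the code review step can be automated with a short script. The sketch below uses the standard ast module to list calls made through the random module; find_random_calls is a hypothetical helper, it does not track aliased or from-style imports, and it is a rough aid rather than a replacement for a dedicated scanner.

```python
import ast

def find_random_calls(source, filename="<string>"):
    # Return (line, call) pairs for calls made through the random module,
    # e.g. random.choice(...). Aliased or from-imports are not tracked.
    findings = []
    for node in ast.walk(ast.parse(source, filename=filename)):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and isinstance(node.func.value, ast.Name)
            and node.func.value.id == "random"
        ):
            findings.append((node.lineno, f"random.{node.func.attr}"))
    return sorted(findings)

sample = '''
import random
import secrets

def bad():
    return random.randint(0, 9)

def good():
    return secrets.randbelow(10)
'''
findings = find_random_calls(sample)
print(findings)
```

Each finding still needs a human decision: random is fine for simulations and shuffling test data, and only security-sensitive uses need replacing.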

Analysis Steps

Locate the weak random usage

# Line 45 in auth/tokens.py
import random
import string

def generate_api_key():
    chars = string.ascii_letters + string.digits
    token = ''.join(random.choice(chars) for _ in range(32))  # VULNERABLE
    return token

Identify the purpose

API key generation (security-critical)
Requires unpredictability
Used for authentication

Assess the risk

API keys grant access to protected resources
Predictable keys = unauthorized access
Impact: High (authentication bypass)

Determine required entropy

API keys should have 128-256 bits entropy
Current: 32 chars from 62-char alphabet = ~190 bits (if truly random)
Problem: Not truly random - predictable with seed knowledge

Remediation Steps

Replace random with secrets

# BEFORE (Line 45 - vulnerable)
import random
import string

def generate_api_key():
    chars = string.ascii_letters + string.digits
    token = ''.join(random.choice(chars) for _ in range(32))
    return token

# AFTER (fixed)
import secrets
import string

def generate_api_key():
    chars = string.ascii_letters + string.digits
    token = ''.join(secrets.choice(chars) for _ in range(32))
    return token

# Or simpler with token_urlsafe
def generate_api_key_simple():
    return secrets.token_urlsafe(32)  # 256 bits entropy

Add entropy validation

import secrets

MIN_ENTROPY_BITS = 128

def generate_api_key(nbytes=32):
    # Entropy is set by the number of random bytes requested, not by the
    # length of the encoded string, so validate the request size
    entropy_bits = nbytes * 8

    if entropy_bits < MIN_ENTROPY_BITS:
        raise ValueError(f"Insufficient entropy: {entropy_bits} bits")

    return secrets.token_urlsafe(nbytes)

Re-scan

bandit -r auth/tokens.py
# Or semgrep scan
