CWE-330: Use of Insufficiently Random Values - Python
Overview
Weak random number generation in Python occurs when developers use the random module for security-sensitive operations like generating session tokens, password reset tokens, API keys, CSRF tokens, or cryptographic keys. The random module uses the Mersenne Twister algorithm, which is deterministic and becomes predictable once an attacker observes enough outputs or guesses the seed. For security purposes, Python provides the secrets module (Python 3.6+) and os.urandom(), which use cryptographically secure pseudorandom number generators (CSPRNGs).
Primary Defence: Use secrets module (Python 3.6+) for all security-sensitive random value generation including tokens and keys.
Common Vulnerable Patterns
random.randint() for Session IDs
import random
# VULNERABLE - Predictable session ID
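def generate_session_id():
    session_id = random.randint(1000000, 9999999)
    return session_id
# Attacker can predict: after observing enough session IDs (e.g., 3045213, 8765443, ...),
# they can reconstruct the PRNG state and predict future values
Why this is vulnerable: random is not cryptographically secure. The Mersenne Twister state can be fully recovered from 624 consecutive 32-bit outputs; truncated outputs such as these 7-digit IDs leak fewer bits each, so an attacker needs more samples, but the generator remains predictable. The ID space is also only 9,000,000 values (about 23 bits), which is small enough to brute-force.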
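The state-recovery claim can be checked directly. The following is a minimal sketch, assuming the attacker sees 624 consecutive full 32-bit outputs (as produced by getrandbits(32)); the helper names untemper and clone_generator are illustrative and not part of any library.

```python
import random

def _undo_left(y, shift, mask):
    # Invert x ^= (x << shift) & mask; each pass fixes `shift` more low bits
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ ((x << shift) & mask)
    return x & 0xFFFFFFFF

def _undo_right(y, shift):
    # Invert x ^= x >> shift; each pass fixes `shift` more high bits
    x = y
    for _ in range(32 // shift + 1):
        x = y ^ (x >> shift)
    return x & 0xFFFFFFFF

def untemper(output):
    # Reverse the four MT19937 tempering steps in the opposite order
    y = _undo_right(output, 18)
    y = _undo_left(y, 15, 0xEFC60000)
    y = _undo_left(y, 7, 0x9D2C5680)
    return _undo_right(y, 11)

def clone_generator(outputs):
    # Rebuild a generator from 624 consecutive 32-bit outputs
    if len(outputs) != 624:
        raise ValueError("need exactly 624 consecutive 32-bit outputs")
    words = tuple(untemper(o) for o in outputs)
    clone = random.Random()
    # CPython state layout: (version, 624 state words + position index, gauss_next)
    clone.setstate((3, words + (624,), None))
    return clone

# Usage: the "victim" generator is never inspected, only observed
victim = random.Random()
observed = [victim.getrandbits(32) for _ in range(624)]
clone = clone_generator(observed)
print(clone.getrandbits(32) == victim.getrandbits(32))
```

The same approach cannot work against secrets or os.urandom(), because their outputs do not expose a recoverable internal state.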
random.choice() for Password Reset Tokens
import random
import string
# VULNERABLE - Predictable reset token
def generate_reset_token():
    chars = string.ascii_letters + string.digits
    token = ''.join(random.choice(chars) for _ in range(32))
    return token
# Token looks random: "a7B3xQ9..." but is predictable
# An attacker who recovers the generator state or seed can reproduce it
Why this is vulnerable: random.choice() draws from the Mersenne Twister. An attacker who reconstructs the generator state from other observed outputs, or who knows the seed when the application sets one (often time-based), can generate the same tokens.
time-based Seed
import random
import time
# VULNERABLE - Predictable seed
random.seed(int(time.time())) # Seed with current timestamp
token = random.randint(0, 999999)
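# Attacker knowing approximate time can brute force seed
# A timestamp known to within an hour leaves only ~3,600 candidate seeds (~12 bits)
Why this is vulnerable: Time is predictable. An attacker can try every timestamp within a plausible window, rebuild the generator for each one, and match the resulting tokens. The token itself also has only 1,000,000 possible values.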
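The brute-force search described above can be sketched in a few lines. This is an illustration under stated assumptions: the attacker knows the issue time to within a few minutes and has seen one token; weak_token and candidate_seeds are hypothetical helper names.

```python
import random

def weak_token(seed):
    # Mirrors the vulnerable pattern: seed with a timestamp, then draw one token
    rng = random.Random(seed)
    return rng.randint(0, 999999)

def candidate_seeds(observed_token, approx_time, window_seconds):
    # Try every timestamp in the window and keep those that reproduce the token
    start = approx_time - window_seconds
    end = approx_time + window_seconds
    return [s for s in range(start, end + 1) if weak_token(s) == observed_token]

# Usage: the attacker's clock estimate is off by 90 seconds
issued_at = 1_700_000_000
token = weak_token(issued_at)
seeds = candidate_seeds(token, issued_at + 90, window_seconds=300)
print(issued_at in seeds, len(seeds))
```

Because the token has only 1,000,000 values, a wide window can return a few false candidates; a second observed token from the same generator narrows them down.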
UUID4 Fallback to random
import uuid
# USUALLY SECURE - in current CPython 3, uuid4() reads 16 bytes from os.urandom()
user_id = str(uuid.uuid4())
# Legacy implementations (for example, the Python 2 uuid module) fell back to
# the random module when os.urandom() was unavailable
Why this is vulnerable: uuid.uuid4() is only as strong as its random source. Current CPython 3 uses os.urandom(), but legacy or alternative implementations may fall back to random when no CSPRNG is available, so confirm the behavior on the runtime you deploy.
random.shuffle() for Security
import random
# VULNERABLE - Shuffling for security purposes
def generate_verification_code(user_id):
    digits = list("0123456789" * 10)
    random.shuffle(digits)
    code = ''.join(digits[:6])
    return code
# Appears random but is predictable with knowledge of the seed or generator state
Why this is vulnerable: random.shuffle() uses the Mersenne Twister, so an attacker who knows the seed or recovers the state can reproduce every shuffle and therefore every code.
Predictable Encryption Key
import random
from cryptography.fernet import Fernet
# VULNERABLE - Key derived from weak random
def generate_encryption_key():
    key_bytes = bytes([random.randint(0, 255) for _ in range(32)])
    # This is NOT how you generate Fernet keys, but illustrates the point
    return key_bytes
# Attacker can predict key if they know seed
Why this is vulnerable: Encryption keys must have full entropy. Predictable random = predictable keys = broken encryption.
Random Salt for Passwords
import random
import hashlib
# VULNERABLE - Weak salt
def hash_password(password):
    salt = str(random.randint(0, 999999)) # Predictable salt
    salted = salt + password
    hashed = hashlib.sha256(salted.encode()).hexdigest()
    return hashed, salt
# Salt must be unique and unpredictable to prevent rainbow tables
Why this is vulnerable: With only 1,000,000 possible salts (about 20 bits) drawn from a predictable generator, an attacker can precompute rainbow tables for every salt, defeating the purpose of salting. A single fast SHA-256 pass also makes each guess cheap.
Random Nonce/IV
import random
from Crypto.Cipher import AES
# VULNERABLE - Predictable nonce/IV
def encrypt_data(data, key):
    nonce = bytes([random.randint(0, 255) for _ in range(16)]) # WEAK
    cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
    ciphertext, tag = cipher.encrypt_and_digest(data)
    return nonce, ciphertext, tag
# Nonce reuse breaks AES-GCM security
Why this is vulnerable: AES-GCM nonces must never repeat under the same key, and CBC-style IVs must also be unpredictable. A non-cryptographic generator can repeat values (for example, when re-seeded with the same value) and lets an attacker predict upcoming nonces; a repeated GCM nonce is a catastrophic failure.
Secure Patterns
secrets.token_hex() for Session IDs
import secrets
# SECURE - Cryptographically strong session ID
def generate_session_id():
    # 32 hex chars = 128 bits of entropy
    session_id = secrets.token_hex(16)
    return session_id
# Example: "3a7b9c8e4f1d2a5b6c7e8f9a0b1c2d3e"
# Unpredictable even with knowledge of previous tokens
Why this works:
- secrets module uses the OS CSPRNG: Unlike random (Mersenne Twister, recoverable from 624 outputs), it uses os.urandom() internally
- OS entropy sources: /dev/urandom (Linux/Unix) or CryptGenRandom/BCryptGenRandom (Windows)
- 128 bits resists guessing and collisions: 16 bytes = 32 hex chars; each guess at a given ID succeeds with probability 2^-128, and by the birthday bound a collision only becomes likely after roughly 2^64 IDs
- Unpredictable output: Observing any number of IDs provides no prediction capability
- Prevents session prediction: Cryptographic randomness stops attackers from guessing or predicting valid IDs to hijack sessions (session fixation needs a separate control: regenerate the ID at login)
secrets.token_urlsafe() for Reset Tokens
import secrets
# SECURE - URL-safe reset token
def generate_reset_token():
    # 32 bytes = 256 bits of entropy, base64-encoded (URL-safe)
    token = secrets.token_urlsafe(32)
    return token
# Example: "A3b7K9xQmZpLr4tYwFj2nVc8hG1sE6uD..."
# Can be safely used in URLs, emails
Why this works:
- 256 bits prevents brute-force: Testing 1 trillion tokens/second takes ~2×10^57 years to exhaust half the space
- base64url encoding is URL-safe: Replaces + with -, / with _, omits = for safe transmission in URLs/emails
- Independent token generation: Observing millions of tokens provides no prediction capability
- Superior to predictable schemes: Unlike timestamp/sequential tokens which can be predicted or enumerated
- Best practices: Single-use, 1-24 hour expiration, rate limiting to prevent brute-force
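The single-use and expiration practices can be sketched as follows. This is a minimal illustration, not a complete implementation: the in-memory dict _pending stands in for a real database table, the one-hour lifetime is an assumed value, and issue_reset_token and redeem_reset_token are hypothetical names. Only a SHA-256 hash of the token is stored, so a leaked store does not reveal usable tokens.

```python
import hashlib
import hmac
import secrets
import time

RESET_TTL_SECONDS = 3600  # assumed 1-hour lifetime
_pending = {}  # hypothetical store: user_id -> (token_hash, expires_at)

def _hash_token(token):
    return hashlib.sha256(token.encode()).hexdigest()

def issue_reset_token(user_id, now=None):
    now = time.time() if now is None else now
    token = secrets.token_urlsafe(32)
    _pending[user_id] = (_hash_token(token), now + RESET_TTL_SECONDS)
    return token  # sent to the user; only the hash is kept server-side

def redeem_reset_token(user_id, token, now=None):
    now = time.time() if now is None else now
    record = _pending.get(user_id)
    if record is None:
        return False
    token_hash, expires_at = record
    if now > expires_at:
        del _pending[user_id]  # expired tokens are discarded
        return False
    if hmac.compare_digest(_hash_token(token), token_hash):
        del _pending[user_id]  # single-use: a second redemption fails
        return True
    return False
```

Rate limiting of failed attempts is deliberately left out of this sketch and would sit in front of redeem_reset_token.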
os.urandom() for Encryption Keys
import os
from cryptography.fernet import Fernet
# SECURE - Generate Fernet key properly
def generate_encryption_key():
    # Fernet.generate_key() uses os.urandom() internally
    key = Fernet.generate_key()
    return key
# Or directly use os.urandom for custom needs
def generate_custom_key():
    # 32 bytes = 256 bits
    key = os.urandom(32)
    return key
Why this works:
- Highest quality randomness: os.urandom() accesses the OS cryptographic source, which combines hardware entropy (CPU jitter, interrupt timing, hardware RNG) with cryptographic algorithms
- 256-bit security: Provides sufficient margin against brute-force; even with quantum computers using Grover's algorithm (quadratic speedup), 256-bit keys retain 128 bits of post-quantum security
- Fernet integration: Fernet.generate_key() internally calls os.urandom(32) and returns the 32 bytes base64url-encoded; Fernet uses half as the HMAC-SHA256 signing key and half as the AES-128-CBC encryption key
- Direct usage: os.urandom() is appropriate for raw random bytes, custom cryptographic constructions, and specific key sizes
- Superior to alternatives: Avoids deriving keys from passwords without KDFs or using weak sources like random.random() (predictable/reproducible)
secrets.choice() for Random Selection
import secrets
import string
# SECURE - Random password generation
def generate_password(length=16):
    chars = string.ascii_letters + string.digits + string.punctuation
    password = ''.join(secrets.choice(chars) for _ in range(length))
    return password
# Example: "x7!aB9#qZ3$mK2&p"
# Each character independently random
Why this works:
- Cryptographic randomness: secrets.choice() uses os.urandom() for independent, uniformly distributed character selection across the character set
- High entropy: 94-character set (uppercase, lowercase, digits, punctuation) with 16 characters = 94^16 ≈ 3.7×10^31 possibilities (≈105 bits of entropy)
- Independence: Knowing any number of generated passwords provides no advantage in predicting future passwords (unlike random.choice(), where the Mersenne Twister state can be reconstructed)
- Uniform distribution critical: Unbiased selection maximizes entropy per character; string module constants provide standard sets (filter ambiguous chars like O, 0, l, 1 for usability)
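Filtering ambiguous characters shrinks the alphabet, so the length should grow to compensate. A minimal sketch, assuming the ambiguous set is exactly O, 0, I, l, and 1 (the set, the default length, and the function names are illustrative choices):

```python
import math
import secrets
import string

AMBIGUOUS = "O0Il1"  # assumed set of look-alike characters
READABLE_ALPHABET = "".join(
    c for c in string.ascii_letters + string.digits if c not in AMBIGUOUS
)

def generate_readable_password(length=20):
    # Each character is drawn independently and uniformly from the filtered set
    return "".join(secrets.choice(READABLE_ALPHABET) for _ in range(length))

def entropy_bits(length, alphabet_size):
    return length * math.log2(alphabet_size)

password = generate_readable_password()
print(len(READABLE_ALPHABET), round(entropy_bits(len(password), len(READABLE_ALPHABET))))
```

With 57 characters left, each position carries about 5.8 bits, so 20 characters still clear the 128-bit guideline only if lengthened slightly; 22 characters would.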
secrets.randbelow() for Random Integers
import secrets
CODE_SPACE = 1_000_000 # 6-digit decimal codes
def generate_verification_code():
    code = secrets.randbelow(CODE_SPACE)
    return f"{code:06d}"
Why this works:
- Avoids modulo bias: Naively reducing random_bytes with % n introduces bias when the RNG's range is not an exact multiple of n (Example: 0..255 % 10 gives 0–5 occurring 26 times each, 6–9 occurring 25 times each → biased)
- Rejection sampling: secrets.randbelow() internally uses rejection sampling to discard out-of-range values
- Equal probability: For 6-digit codes (0-999999), each code has identical generation probability with ~20 bits entropy (log2(1000000) ≈ 19.93)
- Short-lived use cases: Suitable for email confirmation/2FA when combined with rate limiting, expiration (5-15 min), account lockout
- Unpredictable: Cryptographic randomness prevents prediction even with knowledge of previous codes
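The modulo-bias figures above can be reproduced by counting, and the rejection idea can be shown on single bytes. This sketch is an illustration of the technique only; randbelow_rejection is a hypothetical helper and is not how CPython implements secrets.randbelow() internally.

```python
from collections import Counter
import secrets

def modulo_counts(n, source_size=256):
    # How often each residue appears when every byte value is reduced with % n
    return Counter(value % n for value in range(source_size))

def randbelow_rejection(n):
    # Illustrative rejection sampling over single bytes (requires 1 <= n <= 256)
    if not 1 <= n <= 256:
        raise ValueError("n must be between 1 and 256")
    limit = 256 - (256 % n)  # largest multiple of n that fits in 0..256
    while True:
        value = secrets.token_bytes(1)[0]
        if value < limit:  # discard the tail that would cause bias
            return value % n

counts = modulo_counts(10)
print(counts[0], counts[9])  # 26 25
```

In application code, secrets.randbelow(n) remains the right call; the helper only makes the discarded range visible.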
UUID4 with Explicit CSPRNG Check
import uuid
import os
# SECURE - uuid4 backed by the OS CSPRNG
def generate_user_id():
    # In current CPython 3, uuid4() reads its 16 bytes from os.urandom()
    return str(uuid.uuid4())
# Alternatively, build the UUID from os.urandom() explicitly
def generate_user_id_explicit():
    random_bytes = os.urandom(16)
    # version=4 overwrites the version and variant bits, leaving 122 random bits
    return str(uuid.UUID(bytes=random_bytes, version=4))
# Example: "f47ac10b-58cc-4372-a567-0e02b2c3d479"
Why this works:
- 122 bits of randomness (RFC 4122 v4): 6 of the 128 bits are reserved for version/variant identifiers
- CPython 3 uses os.urandom(): uuid4() draws from the OS CSPRNG; legacy Python 2 could fall back to weaker sources
- Explicit generation documents the source: Using os.urandom(16) directly makes the CSPRNG dependency visible regardless of runtime
- Negligible collision probability: Reaching a 50% collision chance takes ~2.7 × 10^18 UUIDs; at 1 billion UUIDs/second that is ~2.7 × 10^9 seconds (about 86 years)
- Ideal for distributed systems: No coordination needed; prevents information leakage vs sequential IDs
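The collision estimate follows from the birthday approximation n ≈ sqrt(2 · 2^bits · ln(1 / (1 - p))). A short sketch of that arithmetic, where the generation rate of 1 billion IDs per second is the same assumption used in the list above and the function name is illustrative:

```python
import math

def ids_for_collision_probability(random_bits, probability=0.5):
    # Birthday approximation for the number of random IDs that yields
    # the given chance of at least one collision
    space = 2 ** random_bits
    return math.sqrt(2 * space * math.log(1 / (1 - probability)))

SECONDS_PER_YEAR = 365.25 * 24 * 3600

n = ids_for_collision_probability(122)  # UUID4 has 122 random bits
years = n / 1e9 / SECONDS_PER_YEAR      # at 1 billion UUIDs per second
print(f"{n:.2e} UUIDs, about {years:.0f} years")
```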
Password Hashing with Cryptographic Salt
import hashlib
import os
# SECURE - Proper salting with os.urandom
def hash_password(password):
    salt = os.urandom(16) # 128 bits of random salt
    # PBKDF2 takes the salt as a separate argument, so no manual concatenation is needed
    hashed = hashlib.pbkdf2_hmac('sha256', password.encode(), salt, 600000)
    return hashed.hex(), salt.hex()
# Better: Use argon2 or bcrypt library
from argon2 import PasswordHasher
def hash_password_argon2(password):
    ph = PasswordHasher()
    # Argon2 handles salt generation internally with CSPRNG
    hashed = ph.hash(password)
    return hashed
Why this works:
- Cryptographic salt prevents rainbow tables: os.urandom(16) generates a 128-bit random salt, ensuring unique hashes even for identical passwords
- Salt doesn't need secrecy: Stored with the hash; must be unique and unpredictable to prevent precomputation attacks
- PBKDF2 with 600,000 iterations: Makes brute-force expensive, at 600,000 HMAC-SHA256 iterations per guess (OWASP 2023 recommendation)
- Modern best practice: Argon2/bcrypt: Argon2 is memory-hard and resists GPU/ASIC attacks; bcrypt is not memory-hard but is still costly to accelerate; Argon2 won the 2015 Password Hashing Competition
- Libraries auto-generate salts: Argon2/bcrypt libraries use a CSPRNG internally, eliminate manual salt handling, and provide verification helpers that compare in constant time
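When PBKDF2 from the standard library is used instead of a dedicated library, verification has to be written by hand. A minimal sketch, assuming the salt, iteration count, and digest are stored together per user; the function names and the tuple layout are illustrative.

```python
import hashlib
import hmac
import os

PBKDF2_ITERATIONS = 600_000

def hash_password(password, iterations=PBKDF2_ITERATIONS):
    salt = os.urandom(16)  # fresh 128-bit salt per password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return salt, iterations, digest

def verify_password(password, salt, iterations, expected_digest):
    # Recompute with the stored salt and iteration count, then compare
    # in constant time to avoid leaking how many bytes matched
    candidate = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, iterations)
    return hmac.compare_digest(candidate, expected_digest)

salt, iterations, digest = hash_password("correct horse battery staple")
print(verify_password("correct horse battery staple", salt, iterations, digest))
```

Storing the iteration count alongside the hash allows the work factor to be raised later without invalidating existing records.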
Cryptographic Nonce/IV Generation
import os
from Crypto.Cipher import AES
# SECURE - Proper nonce generation
def encrypt_data(data, key):
    # AES-GCM nonce should be 12 bytes (96 bits) for best performance
    nonce = os.urandom(12)
    cipher = AES.new(key, AES.MODE_GCM, nonce=nonce)
    ciphertext, tag = cipher.encrypt_and_digest(data)
    return nonce, ciphertext, tag
# Nonce is unpredictable and unique with overwhelming probability
Why this works:
- Nonce uniqueness critical: For AES-GCM, the nonce must be unique per encryption with the same key; reuse exposes the XOR of the plaintexts and lets an attacker recover the GHASH authentication key and forge messages
- Optimal size: 12-byte (96-bit) nonce is optimal for AES-GCM performance, avoiding the extra GHASH pass needed for other sizes
- Statistical uniqueness: os.urandom(12) gives uniqueness only with high probability; with 96 bits the birthday bound puts a 50% collision chance near 2^48 nonces (~281 trillion), so NIST SP 800-38D caps random-nonce use at 2^32 encryptions per key
- CBC vs GCM: AES-CBC requires a 16-byte IV that's unpredictable (not just unique) to prevent chosen-plaintext attacks; GCM only requires uniqueness
- Storage: Nonce doesn't need secrecy; returned alongside ciphertext and auth tag for decryption
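How quickly random 96-bit nonces approach a collision can be estimated with the birthday approximation p ≈ n(n-1) / 2^(bits+1), which is accurate while p is small. The sketch below assumes a per-key limit of 2^32 encryptions, the bound NIST SP 800-38D gives for randomly generated nonces; the function and constant names are illustrative.

```python
MAX_ENCRYPTIONS_PER_KEY = 2 ** 32  # assumed limit for random 96-bit nonces

def nonce_collision_probability(messages, nonce_bits=96):
    # Birthday approximation; only meaningful while the result is small
    return messages * (messages - 1) / 2 ** (nonce_bits + 1)

def key_needs_rotation(encryption_count):
    # Hypothetical guard an application could check before each encryption
    return encryption_count >= MAX_ENCRYPTIONS_PER_KEY

print(nonce_collision_probability(2 ** 32))  # roughly 2^-33
print(key_needs_rotation(10_000))
```

Tracking an encryption counter per key and rotating before the limit keeps the collision probability below about one in eight billion.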
Key Security Functions
Token Generation Helper
import secrets
def generate_token(purpose, nbytes=32):
    """
    Generate secure tokens for various purposes
    Args:
        purpose: 'session', 'reset', 'api', 'csrf'
        nbytes: Number of random bytes (default 32 = 256 bits)
    Returns:
        Secure random token as hex string, prefixed with the purpose
    """
    token = secrets.token_hex(nbytes)
    # Optionally prefix with purpose for identification
    return f"{purpose}_{token}"
# Usage
session_token = generate_token('session') # "session_a7b3c9..."
reset_token = generate_token('reset', 48) # Longer for password reset
Secure Random String Generator
import secrets
import string
def generate_secure_string(length, charset='alphanumeric'):
    """
    Generate cryptographically secure random string
    Args:
        length: Desired string length
        charset: 'alphanumeric', 'hex', 'base64', 'ascii', 'digits'
    Returns:
        Random string
    """
    charsets = {
        'alphanumeric': string.ascii_letters + string.digits,
        # string.hexdigits.lower() would list a-f twice and bias the output
        'hex': string.digits + 'abcdef',
        'ascii': string.ascii_letters + string.digits + string.punctuation,
        'digits': string.digits,
        'base64': string.ascii_letters + string.digits + '+/'
    }
    # Unknown charset names fall back to alphanumeric
    chars = charsets.get(charset, charsets['alphanumeric'])
    return ''.join(secrets.choice(chars) for _ in range(length))
# Usage
api_key = generate_secure_string(32, 'alphanumeric')
pin = generate_secure_string(6, 'digits')
Entropy Checker
import math
import secrets
def estimate_entropy(value, alphabet_size):
    """
    Estimate entropy bits for a random value
    Args:
        value: The random value (string or bytes)
        alphabet_size: Size of character set (e.g., 62 for alphanumeric)
    Returns:
        Estimated entropy in bits (an upper bound that assumes each
        character was chosen uniformly by a CSPRNG)
    """
    length = len(value)
    entropy_bits = length * math.log2(alphabet_size)
    return entropy_bits
# Usage
token = secrets.token_hex(32) # 64 hex chars
entropy = estimate_entropy(token, 16) # 256 bits
print(f"Token entropy: {entropy} bits") # Should be >= 128 for security
# Minimum entropy recommendations:
# - Session tokens: 128 bits
# - Password reset: 128-256 bits
# - API keys: 128-256 bits
# - Encryption keys: 128-256 bits (AES-128/AES-256)
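The same formula can be inverted to find how long a token must be to reach a target entropy for a given alphabet. A small sketch; min_length is an illustrative helper name.

```python
import math

def min_length(target_bits, alphabet_size):
    # Smallest number of uniformly random characters that reaches target_bits
    return math.ceil(target_bits / math.log2(alphabet_size))

for name, size in [("hex", 16), ("alphanumeric", 62), ("digits", 10)]:
    print(name, min_length(128, size))
```

For a 128-bit target this gives 32 hex characters, 22 alphanumeric characters, or 39 decimal digits, which is why short numeric codes are only acceptable with rate limiting and expiry.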
Verification
After implementing the recommended secure patterns, verify the fix through multiple approaches:
- Manual testing: Collect a sample of generated tokens, IDs, and codes and confirm they show no repeats, sequential patterns, or dependence on time or process state
- Code review: Confirm all security-sensitive values come from secrets or os.urandom(), with no remaining random calls, manual seeding, or time-derived values
- Static analysis: Use security scanners to verify no new vulnerabilities exist and the original finding is resolved
- Regression testing: Ensure legitimate user inputs and application workflows continue to function correctly
- Edge case validation: Test boundary conditions such as minimum token lengths, leading zeros in numeric codes, and URL-safe encoding of tokens
- Framework verification: If using a framework or library, confirm the recommended APIs are used correctly according to documentation
- Authentication/session testing: Verify security controls remain effective and cannot be bypassed (if applicable to the vulnerability type)
- Rescan: Run the security scanner again to confirm the finding is resolved and no new issues were introduced
Analysis Steps
Locate the weak random usage
# Line 45 in auth/tokens.py
import random
import string
def generate_api_key():
    chars = string.ascii_letters + string.digits
    token = ''.join(random.choice(chars) for _ in range(32)) # VULNERABLE
    return token
Identify the purpose
- API key generation (security-critical)
- Requires unpredictability
- Used for authentication
Assess the risk
- API keys grant access to protected resources
- Predictable keys = unauthorized access
- Impact: High (authentication bypass)
Determine required entropy
- API keys should have 128-256 bits entropy
- Current: 32 chars from 62-char alphabet = ~190 bits (if truly random)
- Problem: Not truly random - predictable with seed knowledge
Remediation Steps
Replace random with secrets
# BEFORE (Line 45 - vulnerable)
import random
import string
def generate_api_key():
    chars = string.ascii_letters + string.digits
    token = ''.join(random.choice(chars) for _ in range(32))
    return token
# AFTER (fixed)
import secrets
import string
def generate_api_key():
    chars = string.ascii_letters + string.digits
    token = ''.join(secrets.choice(chars) for _ in range(32))
    return token
# Or simpler with token_urlsafe
def generate_api_key_simple():
    return secrets.token_urlsafe(32) # 256 bits entropy
Add entropy validation
import secrets
MIN_ENTROPY = 128
def generate_api_key(nbytes=32):
    # Verify minimum entropy (optional defensive check)
    # Entropy is set by the number of random bytes requested, not the encoded
    # length: token_urlsafe(32) returns 43 characters carrying 256 bits
    entropy_bits = nbytes * 8
    if entropy_bits < MIN_ENTROPY:
        raise ValueError(f"Insufficient entropy: {entropy_bits} bits")
    return secrets.token_urlsafe(nbytes)
Re-scan
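Alongside a scanner run, a small script can confirm that no module in the security-sensitive code path still imports random. The following is a rough sketch built on the standard ast module; find_random_imports is a hypothetical helper, and it only detects imports, not aliased or dynamic usage, so it complements a real scanner rather than replacing it.

```python
import ast

def find_random_imports(source):
    # Return (line number, statement) pairs for every import of the random module
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name == "random":
                    findings.append((node.lineno, "import random"))
        elif isinstance(node, ast.ImportFrom) and node.module == "random":
            names = ", ".join(alias.name for alias in node.names)
            findings.append((node.lineno, f"from random import {names}"))
    return sorted(findings)

before = "import random\nimport string\n\ndef key():\n    return random.choice(string.digits)\n"
after = "import secrets\nimport string\n\ndef key():\n    return secrets.choice(string.digits)\n"
print(find_random_imports(before))
print(find_random_imports(after))
```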
Security Checklist
- Never use the random module for security (session IDs, tokens, keys, passwords)
- Use the secrets module (Python 3.6+) or os.urandom() for all security-critical random values
- Use secrets.token_hex() or secrets.token_urlsafe() for tokens
- Use secrets.choice() for random character selection
- Use secrets.randbelow() for random integers
- Use os.urandom() or Fernet.generate_key() for encryption keys
- Never manually seed random generators with time or PID
- Ensure minimum 128 bits entropy for security-critical values
- Use uuid.uuid4() carefully (verify CSPRNG usage)
- Use Argon2, bcrypt, or PBKDF2 for password hashing (Argon2 and bcrypt libraries handle salts internally; PBKDF2 via hashlib needs an os.urandom() salt)
- Run Bandit or Semgrep to detect random usage
- Test token uniqueness in unit tests
- Regenerate existing tokens if weak random was used