Generic Input Validation Pattern

A deep dive into a reusable Python function that validates incoming data against expected keys and types, with special handling for edge cases like numeric strings and empty values. This pattern is essential for building robust APIs and can be applied across various programming languages and frameworks.

Published

Wed Jun 11 2025

Technologies Used

Python Flask
Beginner 8 minutes

Purpose

The Problem

Every web API receives untrusted data from clients. A common beginner mistake is writing repetitive validation code for each route:

# Bad: Repetitive validation
if "name" not in data:
    return "Missing name", 400
if type(data["name"]) is not str:
    return "Name must be string", 400
if "age" not in data:
    return "Missing age", 400
if type(data["age"]) is not int:
    return "Age must be integer", 400
# ... repeat for 10 more fields

This approach is error-prone, hard to maintain, and violates the DRY (Don’t Repeat Yourself) principle. Professional codebases use generic validation functions that work with any data structure.

The Solution

We’re studying the validate_input_data_generic() function from server.py, which validates any dictionary against expected keys and types using a single reusable function.

Why This Matters

This validator handles edge cases that junior developers often miss:

  • Numeric strings that should be accepted for integer fields (“42” passes as an int)
  • Empty strings as “optional data” signals
  • Float values that arrive as strings (for example, from form fields or clients that quote their numbers)

Prerequisites & Tooling

Knowledge Base:

  • Python 3.11+ basic syntax (loops, conditionals)
  • Understanding of dictionaries and type checking
  • Basic exception handling (try/except)

Environment:

# No special dependencies - pure Python!
python --version  # Should be 3.11+

High-Level Architecture

graph TD
    A[Incoming Dictionary] --> B{Is it a dict?}
    B -->|No| C[Return Error: Not a dictionary]
    B -->|Yes| D[Loop through expected keys]
    D --> E{Key exists?}
    E -->|No| F[Return Error: Missing key]
    E -->|Yes| G{Value is empty string?}
    G -->|Yes| H[Skip validation - allow empty]
    G -->|No| I{Type matches?}
    I -->|Yes| J[Continue to next key]
    I -->|No| K{Expected type is int/float?}
    K -->|Yes| L{Can convert to numeric?}
    L -->|Yes| J
    L -->|No| M[Return Error: Invalid type]
    K -->|No| M
    J --> N{More keys?}
    N -->|Yes| D
    N -->|No| O[Return True - Valid!]

Analogy: Think of this like a security checkpoint at an airport. Each piece of luggage (data field) must:

  1. Exist (not missing)
  2. Be the right type (liquid in liquid containers, solids in solid containers)
  3. Have special allowances (empty containers are okay)

Implementation

Step 1: Function Signature and Initial Type Check

Logic: We need to accept three inputs: the data to validate, the list of expected keys, and their corresponding types. First, we verify the data is even a dictionary.

def validate_input_data_generic(in_data, expected_keys, expected_types):
    """
    Validates that input data is a dictionary with correct information

    Parameters
    ----------
    in_data : dict
        Object received by the POST request
    expected_keys : list
        Keys that should be found in the POST request dictionary
    expected_types : list
        The value data types that should be found (must match order of expected_keys)

    Returns
    -------
    str: Error message if there is a problem, or
    bool: True if input data is valid
    """
    # CRITICAL: Check type before attempting dictionary operations
    if type(in_data) is not dict:
        return "Input is not a dictionary"

🔴 Danger: Never assume in_data is a dictionary! A request body can decode to a string, list, or number. If a client sends "hello" instead of {"key": "value"}, evaluating in_data["key"] raises TypeError: string indices must be integers.
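
To make the failure modes concrete, here is a small sketch of what happens when dictionary-style operations are applied to non-dict values. The helper name probe is hypothetical and exists only for this illustration.

```python
import json


def probe(in_data, key="key"):
    """Try a dictionary-style lookup and report what happens (illustrative helper)."""
    try:
        return in_data[key]
    except TypeError as exc:
        return "TypeError: {}".format(exc)


# A JSON body does not have to decode to a dict.
decoded_values = [json.loads(text) for text in ('{"key": 1}', '"hello"', '[1, 2]', '42')]

for value in decoded_values:
    print(type(value).__name__, "->", probe(value))

# Membership tests can also mislead: on a string, "in" is a substring check.
print("key" in "monkey")  # True, even though there is no dictionary key here
```

Only the dict input returns a value; the string, list, and integer all end in a TypeError, which is why the type check has to come first.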

Step 2: Loop Through Expected Keys

Logic: We use zip() to pair each expected key with its expected type, then iterate over the pairs. Because zip() stops at the shorter input, expected_keys and expected_types must have the same length.

    # Pair each key with its expected type and iterate
    for key, value_type in zip(expected_keys, expected_types):
        # Check if the key exists in the input dictionary
        if key not in in_data:
            # Use .format() for dynamic error messages
            return "Key {} is missing from input".format(key)

🔵 Deep Dive: Why zip()? This Python built-in pairs up items from two lists and returns a lazy iterator:

list(zip(['name', 'age'], [str, int]))
# Produces: [('name', <class 'str'>), ('age', <class 'int'>)]

This is more elegant than indexing: for i in range(len(expected_keys)).

Step 3: Handle Empty Strings (The Clever Part)

Logic: This system treats empty strings as “no data provided” rather than invalid data. This allows partial updates (e.g., updating only the patient name without CPAP data).

        # Allow empty strings to pass validation
        if in_data[key] == "":
            continue  # Skip to next key

Real-World Example:

# Valid request: Update only the name
{"patient_name": "John Doe", "CPAP_pressure": ""}

# Valid request: Update only CPAP data
{"patient_name": "", "CPAP_pressure": "15"}
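
One way a caller might act on this convention is to drop the empty fields before applying a partial update. This is only a sketch: the helper apply_partial_update and the stored record are hypothetical, not part of server.py.

```python
def apply_partial_update(record, update):
    """Return a copy of record with only the non-empty fields of update applied."""
    provided = {key: value for key, value in update.items() if value != ""}
    return {**record, **provided}


stored = {"patient_name": "Jane Roe", "CPAP_pressure": "12"}
request = {"patient_name": "John Doe", "CPAP_pressure": ""}

print(apply_partial_update(stored, request))
# {'patient_name': 'John Doe', 'CPAP_pressure': '12'}
```

The empty CPAP_pressure field leaves the stored value untouched, while the name is overwritten.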

Step 4: Type Validation with Numeric String Handling

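Logic: JSON itself preserves number types, but values often arrive as strings anyway: HTML form fields and query parameters are always text, and some clients quote their numbers. We need to handle "42" as valid for int fields.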
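
A quick standard-library sketch of where numeric strings tend to come from: JSON keeps numbers typed unless the client quotes them, while query strings and form bodies are always text. The sample payloads are invented for illustration.

```python
import json
from urllib.parse import parse_qs

# JSON preserves numeric types...
typed = json.loads('{"CPAP_pressure": 15}')
print(typed)   # {'CPAP_pressure': 15}

# ...unless the client sends the number in quotes.
quoted = json.loads('{"CPAP_pressure": "15"}')
print(quoted)  # {'CPAP_pressure': '15'}

# Form-encoded and query-string data is always text.
form = parse_qs("CPAP_pressure=15")
print(form)    # {'CPAP_pressure': ['15']}
```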

        # Check if the actual type matches the expected type
        if type(in_data[key]) is not value_type:
            # Special handling for float types
            if value_type == float:
                try:
                    # Attempt to convert to float
                    float(in_data[key])
                    continue  # Conversion succeeded, move to next key
                except (TypeError, ValueError):
                    # Not a number or numeric string (TypeError covers None, lists, etc.)
                    return "Key {} is not a float or numeric string".format(key)

            # Special handling for int types
            if value_type == int:
                # .isnumeric() is True only when every character is numeric (no sign, decimal point, or spaces)
                if str(in_data[key]).isnumeric() is False:
                    return "Key {} is not an int or numeric string".format(key)
            else:
                # For all other types (str, list, etc.), reject mismatches
                return "Key {} has the incorrect value type".format(key)

    # All validations passed!
    return True

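🔴 Danger: The isnumeric() method only accepts non-negative whole numbers written as plain digits. Signs and decimal points are rejected, and some Unicode numerals such as "²" are accepted even though int() cannot parse them: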

"42".isnumeric()    # True
"-42".isnumeric()   # False (negative sign not allowed)
"4.2".isnumeric()   # False (decimal point not allowed)

This is acceptable here because CPAP pressure values are always positive integers (4-25 cmH2O).
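
If negative or whitespace-padded values ever need to pass, one alternative is to let int() do the parsing. The helper below, is_int_like, is a hypothetical sketch and not the check used in server.py. It also rejects bool explicitly, since bool is a subclass of int.

```python
def is_int_like(value):
    """Return True for ints and for strings that int() can parse."""
    if isinstance(value, bool):
        return False  # True/False should not count as integers
    if isinstance(value, int):
        return True
    if isinstance(value, str):
        try:
            int(value)
        except ValueError:
            return False
        return True
    return False


for sample in ["42", "-42", " 42 ", "4.2", "²", 42, True, None]:
    print(repr(sample), "isnumeric:", str(sample).isnumeric(),
          "is_int_like:", is_int_like(sample))
```

The int() round trip accepts "-42" and " 42 " and rejects "²", which is the opposite of what isnumeric() does for those three inputs. Which behavior is right depends on what the field is allowed to contain.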

Complete Function

def validate_input_data_generic(in_data, expected_keys, expected_types):
    """Validates dictionary against expected structure"""

    # Step 1: Type check
    if type(in_data) is not dict:
        return "Input is not a dictionary"

    # Step 2-4: Validate each key-value pair
    for key, value_type in zip(expected_keys, expected_types):
        if key not in in_data:
            return "Key {} is missing from input".format(key)

        if in_data[key] == "":
            continue  # Allow empty strings

        if type(in_data[key]) is not value_type:
            if value_type == float:
                try:
                    float(in_data[key])
                    continue
                except (TypeError, ValueError):
                    return "Key {} is not a float or numeric string".format(key)
            if value_type == int:
                if str(in_data[key]).isnumeric() is False:
                    return "Key {} is not an int or numeric string".format(key)
            else:
                return "Key {} has the incorrect value type".format(key)

    return True
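
Because the function returns either True or an error string, and a non-empty string is itself truthy, callers should compare against True explicitly instead of writing `if result:`. Below is a minimal sketch of turning the result into a (body, status) pair, in the same style as the tuples returned in the repetitive example at the start. The helper name to_response is hypothetical.

```python
def to_response(result):
    """Map the validator's return value to a (body, status_code) pair."""
    if result is True:
        return "OK", 200
    # Anything else is an error message string from the validator.
    return result, 400


print(to_response(True))                             # ('OK', 200)
print(to_response("Key age is missing from input"))  # ('Key age is missing from input', 400)

# A non-empty error string is truthy, so `if result:` would not catch errors.
print(bool("Key age is missing from input"))         # True
```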

Under the Hood

Memory Efficiency

This function runs in O(n) time, where n is the number of expected keys. It makes a single pass through the keys without creating intermediate data structures.

Memory footprint:

  • zip() returns an iterator (not a list), so it uses O(1) space regardless of input size
  • No temporary dictionaries or lists are created
  • Early returns prevent unnecessary work (fails fast)

Why type(x) is not int Instead of isinstance(x, int)?

# This codebase uses:
type(in_data[key]) is not value_type

# Why not this?
isinstance(in_data[key], value_type)

The type() check is strict and rejects subclasses, while isinstance() accepts them:

type(True) is int       # False (bool is a subclass of int)
isinstance(True, int)   # True (bool inherits from int)

For API validation, strictness is preferred. We don’t want {"age": True} to pass as a valid integer.
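
The strictness cuts both ways: an exact type() comparison also rejects harmless subclasses, such as collections.OrderedDict at the dictionary check. If that ever becomes a problem, a middle ground is isinstance() with an explicit bool exclusion. The helper is_strict_int below is a hypothetical sketch, not code from server.py.

```python
from collections import OrderedDict


def is_strict_int(value):
    """Accept int and its subclasses, except bool."""
    return isinstance(value, int) and not isinstance(value, bool)


print(is_strict_int(5))     # True
print(is_strict_int(True))  # False

# Exact type checks reject every subclass, including dict subclasses.
ordered = OrderedDict(name="John Doe")
print(type(ordered) is dict)      # False
print(isinstance(ordered, dict))  # True
```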

Edge Cases & Pitfalls

Edge Case 1: None vs. Empty String

# This passes validation:
{"patient_name": ""}

# This fails with "incorrect value type":
{"patient_name": None}

The code explicitly checks for "" but not None. In a production system, you might want:

if in_data[key] == "" or in_data[key] is None:
    continue

Edge Case 2: Numeric Strings with Spaces

# This fails validation:
{"room_number": " 42 "}  # Leading/trailing spaces

# Fix: Add .strip() before validation
if str(in_data[key]).strip().isnumeric() is False:
    return "Key {} is not an int or numeric string".format(key)

Security Consideration: Injection Attacks

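🔴 Danger: Format-string injection is a real risk with .format(), but only when untrusted text is used as the template itself, because replacement fields such as {0.__class__} can read object attributes. This function is not exposed to that: the template is a literal, and key comes from expected_keys, which the server defines.

# Template is a literal; key is server-defined
"Key {} is missing from input".format(key)

# Only an untrusted *template* would be dangerous (not used here)
# untrusted_template.format(some_object)

A worthwhile hardening step if keys could ever contain unusual characters:

return f"Key {key!r} is missing from input"  # repr() quotes the key and escapes control characters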
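
For context, the format-string risk usually discussed for str.format() arises when untrusted text is used as the template, not as an argument. The sketch below contrasts the two cases. The Settings class and its api_token attribute are invented for illustration.

```python
class Settings:
    api_token = "not-a-real-token"  # stand-in for something sensitive


settings = Settings()

# Risky: the template itself comes from an untrusted source.
untrusted_template = "{0.api_token}"
print(untrusted_template.format(settings))
# not-a-real-token

# Safe: the template is a literal; untrusted text is only an argument.
untrusted_key = "{0.api_token}"
print("Key {} is missing from input".format(untrusted_key))
# Key {0.api_token} is missing from input

# repr() additionally quotes the value and escapes control characters.
print("Key {!r} is missing from input".format("name\n"))
# Key 'name\n' is missing from input
```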

Conclusion

What You Learned:

  1. Generic Programming Pattern: Write one function that works for multiple data structures by parameterizing the expected schema
  2. Defensive Programming: Always validate input type before accessing properties
  3. Graceful Degradation: Handle edge cases (numeric strings, empty values) instead of rejecting them
  4. Early Returns: Fail fast to avoid unnecessary processing

Skill Transfer: This pattern applies to:

  • Form validation in web frameworks (React, Vue)
  • Config file parsing
  • CSV/Excel data ingestion pipelines
  • API middleware authentication checks
