Introduction to Marshmallow in Python

Stanley Ulili
Updated on June 24, 2025

Marshmallow is a Python library that converts complex data types, such as objects, to and from native Python data types. It provides schema validation, serialization, and deserialization capabilities.

Marshmallow is great for building APIs, handling forms, and making sure data in your pipeline is valid. Its easy-to-use schema system and flexible options make working with data simpler while keeping everything accurate and well-structured.

This guide will show you how to use Marshmallow in your Python projects. You’ll learn how to create schemas, validate incoming data, deal with errors, and use Marshmallow in real-world applications.

Let's dive in!

Prerequisites

Before you continue with this tutorial, ensure that you have Python 3.7 or higher installed on your system, along with pip for package management. You should also have a solid understanding of Python fundamentals, including classes, dictionaries, and exception handling, since Marshmallow builds upon these core concepts.

Setting up the project environment

In this section, you'll set up a Python environment that's ready for working with Marshmallow. You'll create a clean project where you can easily experiment with different methods to validate data.

First, make a new folder for your project and go into it:

 
mkdir marshmallow-validation && cd marshmallow-validation

Create a virtual environment to isolate your project dependencies:

 
python3 -m venv venv

Activate the virtual environment:

 
source venv/bin/activate

Install Marshmallow:

 
pip install marshmallow

For development convenience, also install this optional but useful package:

 
pip install python-dateutil 

Create a requirements file to track your dependencies:

 
pip freeze > requirements.txt

Your development environment is now configured and ready for Marshmallow experimentation. The virtual environment ensures that your project dependencies remain isolated and manageable.

Getting started with Marshmallow

In this section, you'll learn the fundamental concepts of Marshmallow through practical schema creation and data validation. Marshmallow schemas define the structure and rules for your data, providing both validation and transformation capabilities.

Create a new file called schemas.py in your project directory:

schemas.py
from marshmallow import Schema, fields

class UserSchema(Schema):
    name = fields.Str(required=True)
    age = fields.Int(required=True)
    email = fields.Email(required=True)

user_schema = UserSchema()

This schema establishes validation rules for user data:

  • name must be a string and is required
  • age must be an integer and is required
  • email must be a valid email address and is required

Now create a main application file to test the schema validation. Add this content to main.py:

main.py
from marshmallow import ValidationError
from schemas import user_schema

user_data = {
    'name': 'Sarah',
    'age': 28,
    'email': 'sarah@example.com'
}

try:
    result = user_schema.load(user_data)
    print('Valid user data:', result)
except ValidationError as error:
    print('Validation failed:', error.messages)

The load() method validates the input data against your schema definition. When validation succeeds, it returns the cleaned data. If validation fails, Marshmallow raises a ValidationError with detailed information about what went wrong.

Execute the validation script:

 
python main.py

With valid input data, you'll see this confirmation:

Output
Valid user data: {'name': 'Sarah', 'age': 28, 'email': 'sarah@example.com'}

The successful validation shows that your data meets all schema requirements. Marshmallow has verified the data types and confirmed that required fields are present with appropriate values.

Customizing validations in Marshmallow

Marshmallow offers extensive validation customization through field-specific constraints, custom validation functions, and schema-level validation rules.

Basic Validation Flow

These features help you enforce business logic and data integrity requirements beyond basic type checking.

Adding field constraints

Field-level constraints allow you to specify detailed validation rules for individual schema attributes. You can enhance the UserSchema with more sophisticated validation requirements:

schemas.py
from marshmallow import Schema, fields, validate

class UserSchema(Schema):
    name = fields.Str(
        required=True,
        validate=validate.Length(min=2, max=50, error="Name must be between 2 and 50 characters")
    )
    age = fields.Int(
        required=True,
        validate=validate.Range(min=18, max=120, error="Age must be between 18 and 120")
    )
    email = fields.Email(required=True)
    password = fields.Str(
        required=True,
        validate=validate.Length(min=8, error="Password must be at least 8 characters long")
    )

user_schema = UserSchema()

These enhanced constraints provide:

  • Length() validation ensures names fall within reasonable character limits
  • Range() validation restricts ages to realistic human values
  • Password length requirements enforce basic security standards
  • Custom error messages provide clear feedback when validation fails

Test these constraints by modifying main.py with invalid data:

main.py
from marshmallow import ValidationError
from schemas import user_schema

invalid_data = {
    'name': 'A',  # Too short
    'age': 15,  # Below minimum
    'email': 'invalid',  # Not an email
    'password': '123'  # Too short
}

try:
    result = user_schema.load(invalid_data)
    print('Valid user data:', result)
except ValidationError as error:
    print('Validation errors:', error.messages)

Running this code reveals structured validation feedback:

Output
Validation errors: {'name': ['Name must be between 2 and 50 characters'], 'age': ['Age must be between 18 and 120'], 'email': ['Not a valid email address.'], 'password': ['Password must be at least 8 characters long']}

The error dictionary maps each invalid field to its specific validation messages, making it straightforward to identify and address data quality issues.

Creating custom validation functions

Custom validators help you implement business-specific validation logic that goes beyond Marshmallow's built-in constraints. These functions give you complete control over validation behavior:

schemas.py
from marshmallow import Schema, fields, validate, validates, ValidationError
import re

class UserSchema(Schema):
    name = fields.Str(required=True, validate=validate.Length(min=2, max=50))
    age = fields.Int(required=True, validate=validate.Range(min=18, max=120))
    email = fields.Email(required=True)
    password = fields.Str(required=True, validate=validate.Length(min=8))

    @validates('password')
    def validate_password_complexity(self, value, **kwargs):
        if not re.search(r'\d', value):
            raise ValidationError('Password must contain at least one number')
        if not re.search(r'[A-Z]', value):
            raise ValidationError('Password must contain at least one uppercase letter')

user_schema = UserSchema()

The @validates decorator attaches custom validation logic to specific fields. This password validator ensures that passwords contain both numbers and uppercase letters, enforcing stronger security requirements.

Test the custom validation with a weak password:

main.py
from marshmallow import ValidationError
from schemas import user_schema

weak_password_data = {
    'name': 'Alice',
    'age': 25,
    'email': 'alice@example.com',
    'password': 'weakpassword'  # Missing number and uppercase
}

try:
    result = user_schema.load(weak_password_data)
    print('Valid user data:', result)
except ValidationError as error:
    print('Validation errors:', error.messages)

The validation will fail with specific password requirements:

Output
Validation errors: {'password': ['Password must contain at least one number']}

Schema-level validation

Schema-level validation allows you to validate relationships between multiple fields or implement complex business rules that span the entire data structure:

schemas.py
from marshmallow import Schema, fields, validate, validates_schema, ValidationError

class UserSchema(Schema):
    name = fields.Str(required=True, validate=validate.Length(min=2, max=50))
    age = fields.Int(required=True, validate=validate.Range(min=18, max=120))
    email = fields.Email(required=True)
    password = fields.Str(required=True, validate=validate.Length(min=8))
    confirm_password = fields.Str(required=True)

    @validates_schema
    def validate_passwords_match(self, data, **kwargs):
        if data.get('password') != data.get('confirm_password'):
            raise ValidationError('Passwords must match', field_name='confirm_password')

user_schema = UserSchema()

Schema-level validators receive the entire data dictionary, so you can validate across fields. This example ensures that password confirmation matches the original password entry.

Test the schema validation with mismatched passwords:

main.py
from marshmallow import ValidationError
from schemas import user_schema

mismatched_data = {
    'name': 'Bob',
    'age': 30,
    'email': 'bob@example.com',
    'password': 'SecurePass123',
    'confirm_password': 'DifferentPass123'
}

try:
    result = user_schema.load(mismatched_data)
    print('Valid user data:', result)
except ValidationError as error:
    print('Validation errors:', error.messages)

The validation output shows the cross-field validation error:

Output
Validation errors: {'confirm_password': ['Passwords must match']}

This comprehensive validation approach ensures data integrity at both individual field and overall schema levels.

Handling validation errors effectively

Marshmallow provides structured error handling that makes it easy to process validation failures and provide meaningful feedback to users. Understanding how to work with validation errors is crucial for building robust applications.

Marshmallow's ValidationError exposes detailed information about validation failures through its messages attribute. The following example wraps load() in a helper that returns a structured result instead of raising:

main.py
from marshmallow import ValidationError
from schemas import user_schema

def validate_user_data(data):
    try:
        result = user_schema.load(data)
        return {'success': True, 'data': result}
    except ValidationError as error:
        return {'success': False, 'errors': error.messages}

# Test with multiple validation errors
invalid_data = {
    'name': '',
    'age': 'not_a_number',
    'email': 'invalid_email',
    'password': '123'
}

validation_result = validate_user_data(invalid_data)

if validation_result['success']:
    print('User data is valid:', validation_result['data'])
else:
    print('Validation failed with errors:')
    for field, messages in validation_result['errors'].items():
        for message in messages:
            print(f'  {field}: {message}')

This error handling approach transforms validation failures into structured data that applications can easily process. When you run this code, you'll see organized error output:

Output
Validation failed with errors:
  name: Length must be between 2 and 50.
  age: Not a valid integer.
  email: Not a valid email address.
  password: Shorter than minimum length 8.
  confirm_password: Missing data for required field.

Creating user-friendly error messages

Raw validation errors can be technical and difficult for end users to understand. Creating a translation layer helps present errors in a more user-friendly format:

error_handler.py
def format_validation_errors(error_messages):
    """Convert technical validation errors to user-friendly messages"""
    user_friendly_errors = {}

    field_translations = {
        'name': 'Full Name',
        'age': 'Age',
        'email': 'Email Address',
        'password': 'Password',
        'confirm_password': 'Password Confirmation'
    }

    for field, messages in error_messages.items():
        friendly_field = field_translations.get(field, field.title())
        user_friendly_errors[friendly_field] = messages

    return user_friendly_errors

This code defines a function that makes raw validation errors easier to read. It replaces technical field names with readable labels (like "Email Address" instead of "email") and leaves the messages themselves unchanged.

Now use this function in your main application:

main.py
from marshmallow import ValidationError
from schemas import user_schema
from error_handler import format_validation_errors

try:
    user_schema.load({'name': 'A', 'age': 15})
except ValidationError as error:
    friendly_errors = format_validation_errors(error.messages)
    print('Please correct the following issues:')
    for field, messages in friendly_errors.items():
        for message in messages:
            print(f'• {field}: {message}')

If the data fails validation, the code catches the ValidationError, passes error.messages through format_validation_errors, and prints each message under its readable label.

This produces more accessible error messages:

Output

Please correct the following issues:
• Full Name: Length must be between 2 and 50.
• Age: Must be greater than or equal to 18 and less than or equal to 120.
• Email Address: Missing data for required field.
• Password: Missing data for required field.
• Password Confirmation: Missing data for required field.

The transformation makes validation feedback clearer for non-technical users while preserving the detailed error information that developers need.
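The format_validation_errors helper assumes every field maps to a flat list of messages. With nested schemas or list fields, the error dictionary can itself contain dictionaries, keyed by nested field name or by item index. The following is a minimal sketch of one way to flatten such a structure into dotted paths. The flatten_errors name and the sample data are assumptions made for illustration, and the sample is hand-written rather than produced by a real schema.

```python
def flatten_errors(errors, prefix=''):
    """Flatten a nested error mapping into (dotted_path, message) pairs."""
    flat = []
    for key, value in errors.items():
        path = f'{prefix}.{key}' if prefix else str(key)
        if isinstance(value, dict):
            # Nested schema or list item: recurse with the extended path.
            flat.extend(flatten_errors(value, path))
        else:
            # Leaf: a list of messages for a single field.
            flat.extend((path, message) for message in value)
    return flat


# Hand-written sample shaped like the errors a nested schema might produce.
sample_errors = {
    'name': ['Missing data for required field.'],
    'address': {'city': ['Missing data for required field.']},
    'tags': {0: ['Not a valid string.']},
}

for path, message in flatten_errors(sample_errors):
    print(f'{path}: {message}')
```

The flattened pairs can then be passed through a label lookup similar to field_translations before being shown to users.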

Data serialization and transformation

Beyond validation, Marshmallow excels at transforming data between different representations. This capability is essential for API development, where you need to convert between internal data structures and external formats.

Serialization and Deserialization Cycle

Serializing Python objects to dictionaries

Serialization converts Python objects into dictionary representations suitable for JSON APIs or external systems:

models.py
from dataclasses import dataclass

@dataclass
class User:
    name: str
    email: str
    age: int
    is_active: bool = True

This code defines a simple User dataclass with four attributes. The is_active field has a default value of True, making it optional when creating new user instances. This dataclass will serve as our Python object that we want to serialize and deserialize.

Update your schemas file to include object creation capabilities:

schemas.py
from marshmallow import Schema, fields, post_load
from models import User

class UserSchema(Schema):
    name = fields.Str(required=True)
    email = fields.Email(required=True)
    age = fields.Int(required=True)
    is_active = fields.Bool(load_default=True)

    @post_load
    def make_user(self, data, **kwargs):
        return User(**data)

user_schema = UserSchema()

This enhanced schema includes a @post_load decorator that automatically converts validated dictionary data into User objects.

When you call load(), instead of getting back a dictionary, you'll receive a fully instantiated User object. The load_default=True parameter ensures that if is_active isn't provided in the input data, it defaults to True.

The following script uses dump() to serialize a User object and load() to build one from a dictionary:

main.py
from schemas import user_schema
from models import User

# Create a User object
user_object = User(
    name="Emily", 
    email="emily@example.com", 
    age=29, 
    is_active=True
)

# Serialize to dictionary
serialized_data = user_schema.dump(user_object)
print('Serialized user:', serialized_data)

# Load and validate from dictionary
user_dict = {
    'name': 'Michael',
    'email': 'michael@example.com',
    'age': 35
}

validated_user = user_schema.load(user_dict)
print('Loaded user object:', validated_user)
print('User type:', type(validated_user))

This example demonstrates the complete serialization cycle. First, it creates a User object manually, then uses dump() to convert it into a dictionary suitable for JSON APIs.

Next, it takes a dictionary of user data and uses load() to both validate the data and create a new User object. Notice that the user_dict doesn't include is_active, but the resulting object still has this field set to True due to the default value.

Run the serialization example:

 
python main.py

The output demonstrates the bidirectional transformation:

Output
Serialized user: {'name': 'Emily', 'email': 'emily@example.com', 'age': 29, 'is_active': True}
Loaded user object: User(name='Michael', email='michael@example.com', age=35, is_active=True)
User type: <class 'models.User'>

Field-level data transformation

Marshmallow supports custom field transformations that modify data during serialization and deserialization. Create a new schema with transformation methods:

phone_schema.py
from marshmallow import Schema, fields, pre_load, post_dump
import re

class UserPhoneSchema(Schema):
    name = fields.Str(required=True)
    email = fields.Email(required=True)
    phone = fields.Str()

    @pre_load
    def clean_phone_number(self, data, **kwargs):
        if 'phone' in data and data['phone']:
            # Remove all non-digit characters from phone number
            data['phone'] = re.sub(r'\D', '', data['phone'])
        return data

    @post_dump
    def format_phone_number(self, data, **kwargs):
        if 'phone' in data and data['phone'] and len(data['phone']) == 10:
            # Format as (XXX) XXX-XXXX
            phone = data['phone']
            data['phone'] = f'({phone[:3]}) {phone[3:6]}-{phone[6:]}'
        return data

phone_schema = UserPhoneSchema()

This schema demonstrates data transformation hooks that automatically clean and format data. The @pre_load decorator runs before validation and strips all non-digit characters from phone numbers, ensuring consistent internal storage.

The @post_dump decorator runs after serialization and formats clean phone numbers into a human-readable format with parentheses and dashes. This approach separates data storage (clean digits) from data presentation (formatted display).

The following script exercises both hooks:

phone_main.py
from phone_schema import phone_schema

messy_data = {
    'name': 'Alex',
    'email': 'alex@example.com',
    'phone': '(555) 123-4567'
}

# Load cleans the phone number
loaded_data = phone_schema.load(messy_data)
print('Cleaned data:', loaded_data)

# Dump formats the phone number
formatted_data = phone_schema.dump(loaded_data)
print('Formatted data:', formatted_data)

This example shows the transformation pipeline in action. The input data contains a formatted phone number with parentheses, spaces, and dashes. During load(), the @pre_load method strips these characters, leaving only digits for internal storage.

When dump() is called on the cleaned data, the @post_dump method reformats the phone number back into a user-friendly display format. This ensures your application stores clean data while presenting it nicely to users.

Run the transformation example:

 
python phone_main.py

The transformation pipeline handles data cleaning and formatting seamlessly:

Output
Cleaned data: {'name': 'Alex', 'email': 'alex@example.com', 'phone': '5551234567'}
Formatted data: {'name': 'Alex', 'email': 'alex@example.com', 'phone': '(555) 123-4567'}

Final thoughts

This guide explored Marshmallow, a Python library for data validation, serialization, and transformation. Through practical examples, you covered schema creation, custom validation, error handling, and serialization.

With this knowledge, you’re ready to build reliable data validation into your Python apps. Marshmallow’s clear, flexible design helps you keep your code clean while making sure your data is accurate and well-structured.

For more details and advanced usage, check out the official Marshmallow documentation.
