
A Complete Guide to Pydantic

Stanley Ulili
Updated on February 17, 2025

Pydantic is a popular data validation and serialization library for Python. It enforces type hints at runtime, ensuring that your data conforms to the expected structure while offering high performance.

Pydantic provides all the essential features expected in a data validation library, such as strict type enforcement, field constraints, custom validation rules, and serialization options.

It also stands out with its ease of use and flexibility, allowing developers to define models effortlessly while ensuring data integrity.

This article will guide you through getting started with Pydantic, defining models, validating data, customizing fields, and handling advanced use cases.

Prerequisites

Before diving into Pydantic, make sure you have a recent version of Python installed on your machine. This guide was tested with Python 3.13, although Pydantic 2 also runs on earlier Python 3 releases. Pydantic builds on modern Python features like type hints and dataclasses, so an up-to-date environment is recommended.

You can check your Python version by running:

 
python3 --version
Output
Python 3.13.2

Setting up the project directory

In this section, you'll set up a project directory and create a virtual environment before installing dependencies. Using a virtual environment helps keep your project dependencies isolated and organized.

First, create a new directory for your project and navigate into it:

 
mkdir pydantic-demo && cd pydantic-demo

Next, create a virtual environment inside your project directory:

 
python3 -m venv venv

Then, activate the virtual environment:

 
source venv/bin/activate 

Once the virtual environment is activated, install the latest version of Pydantic using pip:

 
pip install pydantic

To ensure Pydantic is installed correctly, open a Python shell:

 
python

Then, try importing BaseModel from Pydantic:

 
>>> from pydantic import BaseModel

If this command runs without errors, Pydantic is successfully installed!

You can now exit the Python shell by typing:

 
exit()

Now that Pydantic is installed, you can use it in your project.

Getting started with Pydantic

Pydantic is a data validation and settings management library for Python that makes it easy to enforce data types, constraints, and serialization rules.

At its core, Pydantic leverages Python type hints to define structured data models, ensuring data integrity with minimal effort.

Let's start with a simple example. Create a new Python file, main.py, and define your first Pydantic model:

main.py
from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int


user = User(name="Alice", age=30)
print(user)

In this example, you define a User model with two fields: name as a string and age as an integer. The model inherits from BaseModel, which provides the validation and serialization functionality.

Execute the script with:

 
python main.py

You will see output that looks like:

Output
name='Alice' age=30

Notice that Pydantic validates the input against the declared types when the model is created.

Now, let's see what happens when incorrect data is provided. Modify the script to introduce incorrect data:

main.py

user = User(name="Alice", age="thirty")
print(user)

When you run this, Pydantic will raise a validation error:

Output
...
pydantic_core._pydantic_core.ValidationError: 1 validation error for User
age
  Input should be a valid integer, unable to parse string as an integer [type=int_parsing, input_value='thirty', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/int_parsing

Pydantic ensures that only valid data is accepted, reducing potential errors in your application.
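In a real application you would usually catch the error instead of letting the script crash. The following is a minimal sketch, assuming the same two-field User model as above; it catches ValidationError and reads the structured details from errors():

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


try:
    User(name="Alice", age="thirty")
except ValidationError as exc:
    # errors() returns one dictionary per failed field
    first_error = exc.errors()[0]
    print(first_error["loc"], first_error["type"])
```

The loc and type keys identify which field failed and why, which is handy when you need to build your own error responses.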

Pydantic is also smart enough to coerce some types automatically:

main.py

user = User(name="Alice", age="30") # Age is a string, not an int
print(user)

When you run the script, you'll see:

Output
name='Alice' age=30

Even though "30" was a string, Pydantic converted it to an integer automatically.
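If you would rather reject such input than coerce it, Pydantic 2 offers a strict mode. Here is a small sketch that compares the default behavior with strict validation, using the strict argument of model_validate() on the same User model:

```python
from pydantic import BaseModel, ValidationError


class User(BaseModel):
    name: str
    age: int


# Default (lax) mode: the numeric string is coerced to an int
lax_user = User.model_validate({"name": "Alice", "age": "30"})
print(lax_user.age)

# Strict mode: the same input is rejected instead of converted
try:
    User.model_validate({"name": "Alice", "age": "30"}, strict=True)
    strict_rejected = False
except ValidationError:
    strict_rejected = True
print(strict_rejected)
```

Whether coercion is desirable depends on where the data comes from; lax mode suits form or query-string input, while strict mode is a safer fit for data that should already be typed.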

Now that you've seen how to create basic models and validate data, you can proceed to the next section to explore field validation and constraints for even more precise data control.

Field validation and constraints

When working with structured data, enforcing rules that ensure data integrity and consistency is important. Field validation allows you to define restrictions on input data, preventing invalid or unexpected values from entering your system.

Pydantic provides an easy way to define validation rules and constraints using the Field function. This helps enforce requirements and ensures that your application only processes valid data.

Update the User model with validation rules that protect data quality: names shouldn't be empty or unreasonably long, ages must be within human limits, and email addresses should follow a standard format.

You can enforce these constraints using the Field function:

main.py
from pydantic import BaseModel, Field


class User(BaseModel):
    name: str = Field(..., min_length=2, max_length=50)  # Name must be between 2 and 50 characters
    age: int = Field(..., gt=0, lt=120)  # Age must be greater than 0 and less than 120
    email: str = Field(..., pattern=r"^\S+@\S+\.\S+$")  # Must be a valid email format


user = User(name="Alice", age=30, email="alice@example.com")
print(user)

Each field in the User model uses specific validation rules: the name field must be between 2 and 50 characters long, the age field must be greater than 0 and less than 120, and the email field must contain an @ symbol followed by a domain.

The ... in each Field() means these fields are required when creating a new User.
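To see how Pydantic decides whether a field is required, you can inspect the model's field metadata. The sketch below is illustrative only: nickname and city are hypothetical fields that are not part of this guide's User model. It shows that in Pydantic 2 a Field() with no default is required even without the ..., while a Field() with a default is not:

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    name: str = Field(..., min_length=2)  # explicit "required" marker
    nickname: str = Field(min_length=2)   # hypothetical field, no default: also required
    city: str = Field("Unknown")          # hypothetical field with a default: not required


# model_fields maps each field name to its FieldInfo metadata
for field_name, field_info in User.model_fields.items():
    print(field_name, field_info.is_required())
```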

When you run the script, you'll see that Pydantic performs these validations automatically:

Output
name='Alice' age=30 email='alice@example.com'

Now, test what happens when invalid input is provided:

 
main.py
user = User(name="A", age=-5, email="invalid-email")

Running this will raise multiple validation errors:

Output
3 validation errors for User
name
  String should have at least 2 characters [type=string_too_short, input_value='A', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/string_too_short
age
  Input should be greater than 0 [type=greater_than, input_value=-5, input_type=int]
    For further information visit https://errors.pydantic.dev/2.10/v/greater_than
email
  String should match pattern '^\S+@\S+\.\S+$' [type=string_pattern_mismatch, input_value='invalid-email', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/string_pattern_mismatch

Defining the constraints lets you catch errors early rather than allowing bad data to cause issues later.
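Because all failures are reported in a single ValidationError, you can turn them into a per-field summary. The following sketch assumes the constrained User model from this section and builds a simple field-to-message mapping from errors():

```python
from pydantic import BaseModel, Field, ValidationError


class User(BaseModel):
    name: str = Field(..., min_length=2, max_length=50)
    age: int = Field(..., gt=0, lt=120)
    email: str = Field(..., pattern=r"^\S+@\S+\.\S+$")


try:
    User(name="A", age=-5, email="invalid-email")
except ValidationError as exc:
    # Map each failing field name to its human-readable message
    problems = {err["loc"][0]: err["msg"] for err in exc.errors()}
    for field_name, message in problems.items():
        print(f"{field_name}: {message}")
```

A mapping like this is one possible shape for returning validation feedback to a form or an API client.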

Another key aspect of Pydantic is that it requires all fields to be provided by default. However, in real-world applications, some fields may be optional and not always necessary.

A default value ensures that if a field is not provided, it will automatically take on a sensible default value instead of causing an error.

In the main.py file, add an is_active field that defaults to True:

main.py
class User(BaseModel):
    name: str = Field(..., min_length=2, max_length=50)
    ...
    is_active: bool = True  # Default value


# Creating a user without specifying 'is_active'
user = User(name="Bob", age=25, email="bob@example.com")
print(user)

When you run the file, the output looks like:

Output
name='Bob' age=25 email='bob@example.com' is_active=True

Since is_active wasn’t provided, it automatically defaults to True.

This ensures that every user has an active status unless explicitly set otherwise.
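Defaults can be overridden at creation time, and for mutable defaults such as lists you can use default_factory so each instance gets its own value. This is a minimal sketch on a trimmed-down model; the tags field is a hypothetical addition used only for illustration:

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    name: str
    is_active: bool = True
    # Hypothetical field: default_factory builds a fresh list for every instance
    tags: list[str] = Field(default_factory=list)


bob = User(name="Bob")
carol = User(name="Carol", is_active=False)  # explicitly override the default

bob.tags.append("admin")
print(bob.tags, carol.tags, carol.is_active)
```

Appending to bob.tags leaves carol.tags untouched, since the two instances do not share one list.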

In some cases, fields should be optional, meaning they can be left out entirely without causing errors.

To make a field optional, use Optional from the typing module and set None as the default value:

main.py
from pydantic import BaseModel, Field
from typing import Optional


class User(BaseModel):
    ...
    email: Optional[str] = None  # Now optional
    is_active: bool = True  # Default value


# Creating a user without an email
user = User(name="Charlie", age=28)
print(user)

When you run the script, you'll see:

Output
name='Charlie' age=28 email=None is_active=True

Since the email field is optional, it defaults to None when not provided. This makes the model more adaptable to real-world scenarios where certain fields may not always be required.
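Note that it is the None default, not the Optional annotation alone, that allows the field to be omitted. The sketch below illustrates this Pydantic 2 behavior with two hypothetical models that exist only for comparison:

```python
from typing import Optional

from pydantic import BaseModel, ValidationError


class OptionalEmailUser(BaseModel):
    name: str
    email: Optional[str] = None  # may be omitted entirely


class NullableEmailUser(BaseModel):
    name: str
    email: Optional[str]  # must be passed, but None is an accepted value


print(OptionalEmailUser(name="Charlie").email)
print(NullableEmailUser(name="Charlie", email=None).email)

try:
    NullableEmailUser(name="Charlie")
    omitted_ok = True
except ValidationError as exc:
    omitted_ok = False
    error_type = exc.errors()[0]["type"]
    print(error_type)
```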

You can design more flexible and reliable data structures by incorporating validation rules, default values, and optional fields.

Next, you'll learn how to use custom validators to implement more advanced validation logic.

Custom validators for advanced data validation

While Pydantic provides built-in validation for basic data types and constraints, some cases call for more complex validation logic. This is where custom validators come in.

A custom validator allows you to define specific rules that go beyond standard type checking.

For example, you might want to:

  • Ensure a username contains only alphanumeric characters
  • Validate a password for minimum complexity requirements
  • Check that a date falls within a specific range

With Pydantic, you can create custom validation functions using the @field_validator decorator.

Extend the User model by adding a password field with custom validation:

main.py
from pydantic import BaseModel, Field, field_validator
import re


class User(BaseModel):
    name: str = Field(..., min_length=2, max_length=50)
    age: int = Field(..., gt=0, lt=120)
    email: str = Field(..., pattern=r"^\S+@\S+\.\S+$")
    password: str = Field(..., min_length=8)  # Password must be at least 8 characters

    @field_validator("password")
    @classmethod  # Field validators are class methods, so the first argument is cls
    def password_complexity(cls, value):
        """Ensure password has at least one uppercase letter, one lowercase letter, and one number."""
        if not (
            re.search(r"[A-Z]", value)
            and re.search(r"[a-z]", value)
            and re.search(r"\d", value)
        ):
            raise ValueError(
                "Password must contain at least one uppercase letter, one lowercase letter, and one number"
            )
        # Return the value so Pydantic can store it on the model
        return value


# Valid user
user = User(name="Alice123", age=30, email="alice@example.com", password="Secure123")
print(user)

The @field_validator("password") function ensures that the password meets complexity requirements by requiring at least one uppercase letter, one lowercase letter, and one number.

When you run the file with valid input, you'll see the following output:

Output
name='Alice123' age=30 email='alice@example.com' password='Secure123'

Now, test what happens when an invalid password is provided:

main.py
user = User(name="Alice!", age=25, email="alice@example.com", password="weakpass")
print(user)

Since "weakpass" lacks an uppercase letter and a number, Pydantic raises a validation error:

Output
1 validation error for User
password
  Value error, Password must contain at least one uppercase letter, one lowercase letter, and one number [type=value_error, input_value='weakpass', input_type=str]
    For further information visit https://errors.pydantic.dev/2.10/v/value_error

This ensures that only correctly formatted data enters the system, reducing the risk of errors and security vulnerabilities.
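The same pattern covers the other rules listed earlier, such as alphanumeric usernames. Here is a short sketch using a hypothetical Account model with a username field (neither is part of this guide's User model):

```python
from pydantic import BaseModel, ValidationError, field_validator


class Account(BaseModel):  # hypothetical model for illustration
    username: str

    @field_validator("username")
    @classmethod
    def username_alphanumeric(cls, value: str) -> str:
        # str.isalnum() is False for empty strings and for any punctuation or spaces
        if not value.isalnum():
            raise ValueError("Username must contain only letters and digits")
        return value


print(Account(username="alice123").username)

try:
    Account(username="alice!")
    rejected = False
except ValidationError as exc:
    rejected = True
    print(exc.errors()[0]["msg"])
```

Keep in mind that str.isalnum() also accepts non-ASCII letters and digits; if you need ASCII only, a regular expression would be a stricter choice.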

Now that you've mastered custom validators, it's time to explore data serialization and transformation, converting Pydantic models into dictionaries and JSON.

Data serialization and transformation

Beyond validation, Pydantic provides serialization and transformation features, allowing you to convert models into different formats such as dictionaries, JSON, and custom data structures.

This is useful when storing data in databases, sending API responses, or interacting with external services.

Update main.py with the following code:

main.py
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(..., min_length=2, max_length=50)
    age: int = Field(..., gt=0, lt=120)
    email: str = Field(..., pattern=r"^\S+@\S+\.\S+$")
    is_active: bool = True

user = User(name="Alice", age=30, email="alice@example.com")

# Convert to dictionary
user_dict = user.model_dump()
print(user_dict)

Most of these concepts should feel familiar, except for the user.model_dump(). The .model_dump() method converts the model into a native Python dictionary while preserving all field values.

When you run this program, you'll see:

Output
{'name': 'Alice', 'age': 30, 'email': 'alice@example.com', 'is_active': True}
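The conversion also works in the other direction. As a minimal sketch, assuming the same User model, you can pass the dictionary back through model_validate() to rebuild a validated instance:

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    name: str = Field(..., min_length=2, max_length=50)
    age: int = Field(..., gt=0, lt=120)
    email: str = Field(..., pattern=r"^\S+@\S+\.\S+$")
    is_active: bool = True


user = User(name="Alice", age=30, email="alice@example.com")
user_dict = user.model_dump()

# Rebuild the model from the dictionary; all validation rules run again
restored = User.model_validate(user_dict)
print(restored == user)
```

This round trip is useful when a dictionary has been stored or modified elsewhere and you want to be sure it still satisfies the model's constraints.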

Another neat feature is the ability to serialize data into JSON format, which is essential when working with web APIs or storing data in document databases.

To convert your model to JSON, update your code with the model_dump_json() method:

main.py
....
user = User(name="Alice", age=30, email="alice@example.com")

# Convert to JSON
user_json = user.model_dump_json()
print(user_json)

The .model_dump_json() method automatically handles the conversion of Python types to their JSON equivalents, including proper formatting of boolean values (like true instead of True) and ensuring the output is a valid JSON string.

Running the file produces:

Output
{"name":"Alice","age":30,"email":"alice@example.com","is_active":true}
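For logs or files meant for people to read, you may prefer indented JSON, and you will often need to parse JSON back into a model. The sketch below assumes the same User model and uses the indent argument of model_dump_json() together with model_validate_json():

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    name: str = Field(..., min_length=2, max_length=50)
    age: int = Field(..., gt=0, lt=120)
    email: str = Field(..., pattern=r"^\S+@\S+\.\S+$")
    is_active: bool = True


user = User(name="Alice", age=30, email="alice@example.com")

# Pretty-print the JSON with two-space indentation
pretty_json = user.model_dump_json(indent=2)
print(pretty_json)

# Parse and validate a JSON string in one step
restored = User.model_validate_json(pretty_json)
print(restored.is_active)
```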

Pydantic also allows you to filter fields during serialization. This is particularly useful when hiding sensitive information or customizing the output for different purposes.

Update the code to incorporate field filtering:

main.py
...
user = User(name="Alice", age=30, email="alice@example.com")

# Exclude specific fields
filtered_dict = user.model_dump(exclude={"is_active"})
print(filtered_dict)

# Include only specific fields
partial_json = user.model_dump_json(include={"name", "email"})
print(partial_json)

The exclude parameter removes specified fields from the output, while include lets you select only the fields you want to keep.

Running this shows:

Output
{'name': 'Alice', 'age': 30, 'email': 'alice@example.com'}
{"name":"Alice","email":"alice@example.com"}

As you can see, the first output excludes the is_active field, while the second output includes only the name and email fields.
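Besides naming fields explicitly, model_dump() can filter by value with exclude_unset, exclude_defaults, and exclude_none. The following sketch uses a trimmed-down model with a hypothetical nickname field to show how the three options differ:

```python
from typing import Optional

from pydantic import BaseModel


class User(BaseModel):
    name: str
    age: int
    nickname: Optional[str] = None  # hypothetical field, for illustration only
    is_active: bool = True


# nickname is passed explicitly as None; is_active is left to its default
user = User(name="Alice", age=30, nickname=None)

only_set = user.model_dump(exclude_unset=True)         # drops fields never passed in
non_default = user.model_dump(exclude_defaults=True)   # drops fields equal to their default
without_none = user.model_dump(exclude_none=True)      # drops fields whose value is None

print(only_set)
print(non_default)
print(without_none)
```

These options are a common way to build partial-update payloads, where only the fields a client actually sent should be written back.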

Now that you can serialize Pydantic models into dictionaries and JSON while filtering specific fields, you’re ready to explore JSON Schema generation.

Working with JSON Schemas in Pydantic

Pydantic not only helps with data validation and serialization but also provides JSON Schema generation for your models. JSON Schema is a widely used format for defining JSON data's structure, validation rules, and constraints.

Pydantic allows you to automatically generate JSON Schema representations of your models, ensuring that your data format is well-defined and compatible with external applications.

Pydantic models come with a built-in method called .model_json_schema(), which allows you to generate the corresponding JSON Schema for any model.

To see how this works, update your main.py file:

main.py
from pydantic import BaseModel, Field

class User(BaseModel):
    name: str = Field(..., min_length=2, max_length=50)
    age: int = Field(..., gt=0, lt=120)
    email: str = Field(..., pattern=r"^\S+@\S+\.\S+$")
    is_active: bool = True

user_schema = User.model_json_schema()
print(user_schema)

When you run this script, it prints the schema as a Python dictionary. Formatted as JSON, the schema looks similar to:

Output
{
    "title": "User",
    "type": "object",
    "properties": {
        "name": {
            "type": "string",
            "title": "Name",
            "minLength": 2,
            "maxLength": 50
        },
        "age": {
            "type": "integer",
            "title": "Age",
            "exclusiveMinimum": 0,
            "exclusiveMaximum": 120
        },
        "email": {
            "type": "string",
            "title": "Email",
            "pattern": "^\\S+@\\S+\\.\\S+$"
        },
        "is_active": {
            "type": "boolean",
            "title": "Is Active",
            "default": true
        }
    },
    "required": ["name", "age", "email"]
}

This JSON Schema describes the structure of the User model, including field types, constraints, required fields, and default values.

This schema can be used for API documentation, client-side validation, or schema validation in databases.
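Since model_json_schema() returns a plain Python dictionary, you can pass it to the standard json module to get a JSON document you can save or share. A minimal sketch, assuming the same User model:

```python
import json

from pydantic import BaseModel, Field


class User(BaseModel):
    name: str = Field(..., min_length=2, max_length=50)
    age: int = Field(..., gt=0, lt=120)
    email: str = Field(..., pattern=r"^\S+@\S+\.\S+$")
    is_active: bool = True


schema = User.model_json_schema()

# Serialize the schema dictionary as indented JSON text
schema_json = json.dumps(schema, indent=2)
print(schema_json)
```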

Pydantic allows customization of the generated JSON Schema. You can modify field descriptions, add metadata, and adjust the schema structure to suit your needs.

To enhance the schema with descriptions, update the User model:

main.py
...
class User(BaseModel):
    name: str = Field(
        ..., min_length=2, max_length=50, description="The full name of the user"
    )
    age: int = Field(..., gt=0, lt=120, description="The user's age in years")
    email: str = Field(
        ..., pattern=r"^\S+@\S+\.\S+$", description="A valid email address"
    )
    is_active: bool = Field(default=True, description="Indicates if the user is active")


user_schema = User.model_json_schema()
print(user_schema)

This modification ensures that the JSON Schema includes descriptions, making it more informative:

Output
{
    "properties": {
        "name": {
            "description": "The full name of the user",
            ...
        },
        "age": {
            "description": "The user's age in years",
            ...
        },
        "email": {
            "description": "A valid email address",
            ...
        },
        "is_active": {
            "description": "Indicates if the user is active",
            ...
        }
    },
    ...
}

Adding descriptions makes the schema easier to understand when shared with others or used in API documentation.
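Descriptions are not the only metadata you can attach. As a small sketch on a one-field model, Field() also accepts an examples list in Pydantic 2, which ends up in the generated schema next to the description; the sample value "Alice" is just an illustration:

```python
from pydantic import BaseModel, Field


class User(BaseModel):
    name: str = Field(
        ...,
        min_length=2,
        max_length=50,
        description="The full name of the user",
        examples=["Alice"],  # illustrative sample value shown in the schema
    )


name_schema = User.model_json_schema()["properties"]["name"]
print(name_schema)
```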

Final thoughts

In this article, you learned how to use Pydantic for data validation and serialization in Python. With these skills, you can now build robust data models, ensure input correctness, and simplify API development.

For more advanced features and best practices, check out the Pydantic Documentation.

Stanley Ulili
Stanley Ulili is a technical educator at Better Stack based in Malawi. He specializes in backend development and has freelanced for platforms like DigitalOcean, LogRocket, and AppSignal. Stanley is passionate about making complex topics accessible to developers.
Licensed under CC-BY-NC-SA

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
