
Working With JSON Data in Python

Stanley Ulili
Updated on April 30, 2025

JSON (JavaScript Object Notation) has become a de facto standard format for data exchange between systems. Its simple, human-readable structure makes it well suited to configuration files, API communication, and data storage.

Python supports JSON through the json module in its standard library, and third-party packages add features such as schema validation.

This guide shows you how to handle JSON in your Python applications. It covers basic conversion, output formatting, file handling, custom type conversion, and schema validation.

Prerequisites

Before starting, ensure that you have Python 3.7 or a later version installed. This guide assumes you are familiar with basic Python concepts, such as dictionaries and lists, which closely match the structures of JSON.

Getting started with JSON in Python

Python's standard json module has everything you need to work with JSON. Unlike third-party packages, this module comes built into Python, so you don't need to install anything extra to use it.

Let's set up a project to explore these features:

 
mkdir json-python && cd json-python

Create a virtual environment to keep your dependencies clean:

 
python3 -m venv venv

Activate it. The command below is for macOS and Linux; on Windows, run venv\Scripts\activate instead:

 
source venv/bin/activate

Now, create a file named app.py to use throughout this tutorial.

Converting Python dictionaries to JSON strings

Converting Python objects to JSON text is called serialization. Python's dictionaries and lists naturally convert to JSON objects and arrays, making the process straightforward.

To demonstrate this, add the following code to app.py:

app.py
import json

# Sample Python dictionary
person = {
    "name": "Jane Smith",
    "age": 32,
    "is_active": True,
    "tags": ["developer", "python", "json"],
    "address": {"street": "123 Main St", "city": "Boston", "state": "MA", "zip": "02108"}
}

# Converting Python dictionary to JSON string
json_string = json.dumps(person)
print("JSON String:")
print(json_string)

The json.dumps() function handles this conversion automatically: it takes your Python object and returns a JSON-formatted string.

Run this script to see how Python converts your dictionary to JSON:

 
python app.py
Output
JSON String:
{"name": "Jane Smith", "age": 32, "is_active": true, "tags": ["developer", "python", "json"], "address": {"street": "123 Main St", "city": "Boston", "state": "MA", "zip": "02108"}}

Notice the subtle but essential difference in the output: Python's True boolean becomes lowercase true in JSON.

The json.dumps() function handles conversions like this automatically, along with escaping special characters such as quotes and serializing nested structures.

This makes it easy to generate valid JSON every time, even with complex data structures.
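To make the escaping concrete, here is a minimal sketch that serializes a string containing quotes, a newline, and a non-ASCII character. The note dictionary is made-up sample data. By default, json.dumps() writes non-ASCII characters as \uXXXX escapes; passing ensure_ascii=False keeps them readable.

```python
import json

# Hypothetical sample data containing characters that need escaping
note = {"text": 'She said "hello"\n', "city": "Zürich"}

# Default: quotes and newlines are escaped, non-ASCII becomes \uXXXX
escaped = json.dumps(note)
print(escaped)

# ensure_ascii=False leaves non-ASCII characters as they are
readable = json.dumps(note, ensure_ascii=False)
print(readable)

# Both strings decode back to the same Python data
print(json.loads(escaped) == json.loads(readable) == note)
```

Both forms are valid JSON. The escaped form is safer when you are unsure how the receiving side handles encodings, while the readable form is easier to inspect by eye.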

Parsing JSON strings back to Python dictionaries

When you receive JSON data from an API or file, you need to convert it back to Python objects that you can work with. This process, called deserialization, is handled by the json.loads() function.

The function takes a JSON string and returns the equivalent Python data structure:

app.py
import json

# Sample Python dictionary
person = {
    "name": "Jane Smith",
    "age": 32,
    "is_active": True,
    "tags": ["developer", "python", "json"],
    "address": {"street": "123 Main St", "city": "Boston", "state": "MA", "zip": "02108"}
}

# Converting Python dictionary to JSON string
json_string = json.dumps(person)
print("JSON String:")
print(json_string)

# Parsing JSON string back to Python dictionary
parsed_data = json.loads(json_string)
print("\nParsed Python object:")
print(type(parsed_data))
print(f"Name: {parsed_data['name']}")
print(f"City: {parsed_data['address']['city']}")
print(f"Is active: {parsed_data['is_active']} (type: {type(parsed_data['is_active']).__name__})")

In this code, json.loads() converts the JSON string back to a Python dictionary. Then, the script prints out the name, city, and is_active fields.

Run the file again:

 
python app.py
Output
JSON String:
{"name": "Jane Smith", "age": 32, "is_active": true, "tags": ["developer", "python", "json"], "address": {"street": "123 Main St", "city": "Boston", "state": "MA", "zip": "02108"}}

Parsed Python object:
<class 'dict'>
Name: Jane Smith
City: Boston
Is active: True (type: bool)

The JSON values are reconstructed into appropriate Python types. JSON's true becomes Python's True, numbers remain numbers, and the nested structure is preserved, with objects becoming dictionaries and arrays becoming lists.

This round trip between Python and JSON preserves the types that JSON supports, which makes it easy to work with data from various sources.

Once parsed, you can access nested elements using standard dictionary and list operations and apply all of Python's powerful data processing capabilities to analyze or transform the data.
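Data from external sources is not always valid JSON. As a minimal sketch, the hypothetical helper below (parse_json_or_none is not part of the json module) wraps json.loads() and uses the msg, lineno, and colno attributes of json.JSONDecodeError to report where parsing failed.

```python
import json

def parse_json_or_none(text):
    """Return the parsed value, or None if text is not valid JSON."""
    try:
        return json.loads(text)
    except json.JSONDecodeError as e:
        # The exception records what went wrong and where
        print(f"Invalid JSON at line {e.lineno}, column {e.colno}: {e.msg}")
        return None

good = parse_json_or_none('{"name": "Jane Smith", "age": 32}')
bad = parse_json_or_none('{"name": "Jane Smith", "age": }')  # value is missing

print(good)
print(bad)
```

One caveat of this design: the JSON literal null also parses to None, so if you need to tell "invalid input" apart from a valid null, raise or return a distinct sentinel instead.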

Formatting JSON output for readability

Compact JSON is efficient for storage and transmission, but it can be difficult to read. When debugging, creating config files, or exploring data, you'll want prettier formatting. Python's json module offers formatting options to make JSON more readable.

Here's how to control the format of your JSON output. Update app.py with the following code:

app.py
import json

# Sample data for formatting examples
user_data = {
    "users": [
        {
            "id": 1,
            "name": "Alice Johnson",
            "roles": ["admin", "user"],
            "metadata": {"last_login": "2023-03-01T14:30:22Z"}
        },
        {
            "id": 2,
            "name": "Bob Smith",
            "roles": ["user"],
            "metadata": {"last_login": "2023-02-28T10:15:11Z"}
        }
    ],
    "total": 2,
    "page": 1
}

# Default JSON encoding (compact)
compact_json = json.dumps(user_data)
print("Compact JSON:")
print(compact_json)

# Pretty-printed JSON with indentation
pretty_json = json.dumps(user_data, indent=4)
print("\nPretty JSON with 4-space indentation:")
print(pretty_json)

# Sort keys alphabetically
sorted_json = json.dumps(user_data, indent=4, sort_keys=True)
print("\nPretty JSON with sorted keys:")
print(sorted_json)

The indent parameter puts each item on its own line and indents every level of nesting by the given number of spaces, making the structure visually apparent. You can adjust the number (2, 4, etc.) to control how many spaces are used for each level.

The sort_keys parameter arranges keys alphabetically, giving consistent output regardless of the original order. This is particularly valuable when:

  • Comparing different versions of JSON data
  • Creating config files that will be tracked in version control
  • Generating predictable output for testing
  • Making JSON easier to scan visually for specific keys

When you run this code, you'll see three versions of the same data:

 
python app.py

The pretty-printed versions are much easier to read than the compact one (omitted here for brevity):

Output
Pretty JSON with 4-space indentation:
{
    "users": [
        {
            "id": 1,
            "name": "Alice Johnson",
            "roles": [
                "admin",
                "user"
            ],
            "metadata": {
                "last_login": "2023-03-01T14:30:22Z"
            }
        },
        {
            "id": 2,
            "name": "Bob Smith",
            "roles": [
                "user"
            ],
            "metadata": {
                "last_login": "2023-02-28T10:15:11Z"
            }
        }
    ],
    "total": 2,
    "page": 1
}

Pretty JSON with sorted keys:
{
    "page": 1,
    "total": 2,
    "users": [
        ...
    ]
}

For logging or debugging, these formatting options can significantly improve the readability of complex JSON structures without altering the underlying data.

When working with large JSON objects, proper formatting can save you significant time in understanding and troubleshooting the data.
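Two related options are worth a quick sketch, using made-up sample data. The separators parameter of json.dumps() removes the spaces that the default output places after commas and colons, which gives the smallest output. The sort_keys parameter makes the output independent of key order, so two dictionaries with the same content serialize to identical strings.

```python
import json

# Hypothetical sample data
data = {"b": [1, 2], "a": {"y": True, "x": None}}

# separators=(",", ":") drops the spaces after commas and colons
minified = json.dumps(data, separators=(",", ":"))
print(minified)  # {"b":[1,2],"a":{"y":true,"x":null}}

# Same content, different key order
reordered = {"a": {"x": None, "y": True}, "b": [1, 2]}

# With sort_keys=True both serialize to the same string
same = json.dumps(data, sort_keys=True) == json.dumps(reordered, sort_keys=True)
print(same)  # True
```

Comparing sorted output like this is a simple way to check whether two JSON documents are equivalent in tests or diffs.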

Working with JSON files in Python

You'll often need to save JSON data to files or read settings from JSON configuration files. Python has specialized functions for these file operations that are more efficient than manually handling files and strings separately.

Writing JSON to files

To save a Python object to a JSON file, use the json.dump() function. Unlike dumps(), which returns a string, dump() writes directly to a file object, combining serialization and file writing in a single operation.

Update app.py with the following code:

app.py
import json

# Sample configuration data for file operations
config = {
    "database": {
        "host": "localhost",
        "port": 5432,
        "user": "admin",
        "name": "appdb"
    },
    "api": {
        "url": "https://api.example.com/v2",
        "key": "api_key_12345"
    },
    "logging": {
        "level": "INFO",
        "file": "app.log"
    },
    "features": {
        "dark_mode": True,
        "notifications": True
    }
}

# Writing JSON to a file
with open("config.json", "w") as file:
    json.dump(config, file, indent=2)
    print("JSON data written to config.json")

This script generates a well-formatted configuration file. The indent=2 parameter ensures the output is human-readable with appropriate spacing, which is particularly helpful for configuration files that might need to be manually edited later.

The with statement ensures proper file handling by closing the file automatically, even if an exception occurs, which is an important practice for real-world applications.

Run the file again:

 
python app.py
Output
JSON data written to config.json

Look at the created file:

 
cat config.json
Output
{
  "database": {
    "host": "localhost",
    "port": 5432,
    "user": "admin",
    "name": "appdb"
  },
  "api": {
    "url": "https://api.example.com/v2",
    "key": "api_key_12345"
  },
  "logging": {
    "level": "INFO",
    "file": "app.log"
  },
  "features": {
    "dark_mode": true,
    "notifications": true
  }
}

Notice that Python's boolean True has been correctly converted to JSON's lowercase true in the file.

When creating JSON files, consider:

  • Using indent for human-editable configuration files
  • Omitting indent for maximum storage efficiency with large datasets
  • Adding error handling for disk space or permission issues
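As one possible way to add that error handling, the sketch below writes to a temporary file first and only then moves it over the target, so a failed write does not leave a half-written config file behind. The helper name save_json_atomically is hypothetical, and the approach assumes the temporary file and the target live in the same directory (on POSIX systems the final rename is atomic).

```python
import json
import os
import tempfile

def save_json_atomically(data, path):
    """Write data as JSON to path, replacing an existing file only after a full write."""
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file in the same directory first
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as tmp_file:
            json.dump(data, tmp_file, indent=2)
        # Move the finished file into place
        os.replace(tmp_path, path)
    except (OSError, TypeError, ValueError):
        # Disk, permission, or serialization problem: remove the partial file, then re-raise
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise

# Demo in a throwaway directory so no files are left behind
with tempfile.TemporaryDirectory() as demo_dir:
    target = os.path.join(demo_dir, "config.json")
    try:
        save_json_atomically({"logging": {"level": "INFO"}}, target)
        print("JSON data written to", target)
    except OSError as e:
        print(f"Could not write {target}: {e}")
```

The caller decides how to react to the re-raised exception, for example by logging it or falling back to an in-memory configuration.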

Reading JSON from files

To load data from a JSON file, use the json.load() function. This function reads directly from a file object and parses the JSON content, returning the equivalent Python data structure.

Create a new read_config.py file to read the config we just saved:

read_config.py
import json

# Reading JSON from an existing file
try:
    with open("config.json", "r") as file:
        loaded_config = json.load(file)

    print("JSON data loaded from file:")
    print(f"Database host: {loaded_config['database']['host']}")
    print(f"API URL: {loaded_config['api']['url']}")
    print(f"Log level: {loaded_config['logging']['level']}")
    print(f"Dark mode enabled: {loaded_config['features']['dark_mode']}")

except FileNotFoundError:
    print("Error: config.json file not found. Run app.py first to create it.")
except json.JSONDecodeError:
    print("Error: Invalid JSON format in config.json")

This script includes error handling for two common problems when reading files:

  • Missing files (FileNotFoundError)
  • Malformed JSON content (JSONDecodeError)

The json.load() function, like loads(), automatically converts JSON types to their Python equivalents, so you can immediately work with the loaded data using familiar Python syntax and operations.

Run this script after creating the config file:

 
python read_config.py
Output
JSON data loaded from file:
Database host: localhost
API URL: https://api.example.com/v2
Log level: INFO
Dark mode enabled: True

Clear, actionable error messages help identify problems quickly. In production applications, you might also want to provide default values when configuration files are missing or implement more sophisticated error recovery mechanisms.
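A fallback to default values could look like the following sketch. The load_config helper and the DEFAULT_CONFIG values are hypothetical, and the code assumes the file holds a JSON object at the top level. Keys found in the file override the defaults; nested dictionaries are replaced as a whole rather than merged.

```python
import json

# Hypothetical defaults used when the file is missing or unreadable
DEFAULT_CONFIG = {"logging": {"level": "INFO"}, "features": {"dark_mode": False}}

def load_config(path, defaults=DEFAULT_CONFIG):
    """Load a JSON config from path, falling back to defaults if it is missing or invalid."""
    try:
        with open(path, "r", encoding="utf-8") as file:
            loaded = json.load(file)
    except FileNotFoundError:
        print(f"{path} not found, using defaults")
        return dict(defaults)
    except json.JSONDecodeError as e:
        print(f"{path} is not valid JSON (line {e.lineno}, column {e.colno}), using defaults")
        return dict(defaults)
    # Top-level keys from the file override the defaults (shallow merge)
    return {**defaults, **loaded}

config = load_config("does_not_exist.json")
print(config["logging"]["level"])  # INFO
```

Whether silently falling back is appropriate depends on the application; for settings such as database credentials, failing loudly is usually the safer choice.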

Handling JSON type conversions

JSON supports fewer data types than Python. While the basic types map well (strings, numbers, booleans, arrays, and objects), Python has many specialized types that don't have direct JSON equivalents.

Here's how Python types map to JSON and vice versa:

[Diagram: bidirectional mapping between Python data types and JSON data types]

Python type     JSON type
--------------  ---------
dict            object
list, tuple     array
str             string
int, float      number
True            true
False           false
None            null
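The mapping can be checked with a short sketch that encodes one value of each type and decodes the result again. The sample dictionary is made up for illustration. Note that the mapping is not fully symmetric: tuples come back as lists, and non-string dictionary keys come back as strings.

```python
import json

# One value for each Python type in the table (hypothetical sample data)
sample = {
    "obj": {"k": "v"}, "list": [1, 2], "tuple": (3, 4), "text": "hi",
    "int": 7, "float": 1.5, "yes": True, "no": False, "nothing": None,
}

encoded = json.dumps(sample)
print(encoded)

decoded = json.loads(encoded)
# Tuples do not survive a round trip: JSON only has arrays
print(type(decoded["tuple"]).__name__)  # list

# Non-string keys are converted to strings when encoding
print(json.loads(json.dumps({1: "one"})))  # {'1': 'one'}
```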

But what about Python's specialized types like dates, decimals, sets, or custom classes? To handle these, you need to create custom encoders and decoders:

app.py
import json
from datetime import datetime, date
from decimal import Decimal
import uuid

# Sample data with Python-specific types
complex_data = {
    "id": uuid.uuid4(),
    "created_at": datetime.now(),
    "date_only": date(2023, 3, 1),
    "price": Decimal("19.99"),
    "tags": {"python", "json", "tutorial"},  # This is a set
}

# This will raise a TypeError
try:
    json_string = json.dumps(complex_data)
except TypeError as e:
    print(f"Error when serializing complex types: {e}")

# Custom JSON encoder
class CustomJSONEncoder(json.JSONEncoder):
    def default(self, obj):
        if isinstance(obj, (datetime, date)):
            return obj.isoformat()
        if isinstance(obj, uuid.UUID):
            return str(obj)
        if isinstance(obj, Decimal):
            return float(obj)
        if isinstance(obj, set):
            return list(obj)
        return super().default(obj)

# Use the custom encoder
json_string = json.dumps(complex_data, cls=CustomJSONEncoder, indent=2)
print("\nJSON with custom encoder:")
print(json_string)

# Parse the JSON back
parsed_data = json.loads(json_string)
print("\nParsed data types:")
print(f"id: {type(parsed_data['id'])}")
print(f"created_at: {type(parsed_data['created_at'])}")
print(f"tags: {type(parsed_data['tags'])}")

The custom encoder successfully serializes the specialized types by converting them to standard JSON types:

  • Dates and timestamps become ISO-format strings
  • UUIDs become standard strings
  • Decimal numbers become regular floats (which can lose precision)
  • Sets become arrays
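Subclassing json.JSONEncoder is not the only option. As an alternative sketch, json.dumps() also accepts a default function that is called for any object it cannot serialize on its own. The to_jsonable function and the order data below are hypothetical. This variant writes Decimal values as strings so the exact digits are kept, and sorts sets so the output is stable between runs.

```python
import json
from decimal import Decimal

def to_jsonable(obj):
    """Fallback for json.dumps: called only for objects json cannot serialize itself."""
    if isinstance(obj, Decimal):
        return str(obj)      # keep the exact digits instead of converting to float
    if isinstance(obj, set):
        return sorted(obj)   # a sorted list gives the same output on every run
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

order = {"price": Decimal("19.99"), "tags": {"python", "json"}}  # hypothetical data
print(json.dumps(order, default=to_jsonable))
# {"price": "19.99", "tags": ["json", "python"]}
```

The trade-off is that consumers must know the price is a string and convert it themselves. The rest of this section continues with the CustomJSONEncoder version of app.py.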

Upon running the file, you will see:

 
python app.py
Output
Error when serializing complex types: Object of type UUID is not JSON serializable

JSON with custom encoder:
{
  "id": "4980ecd7-e984-4b3c-9d33-e6732f47b618",
  "created_at": "2025-04-30T12:17:20.440531",
  "date_only": "2023-03-01",
  "price": 19.99,
  "tags": [
    "tutorial",
    "python",
    "json"
  ]
}

Parsed data types:
id: <class 'str'>
created_at: <class 'str'>
tags: <class 'list'>

However, when parsing this JSON back into Python objects, we lose the original type information. The dates are now strings, and we can't automatically convert them back to date objects. To restore the original types, we need a custom decoder:

app.py
import json
print(f"tags: {type(parsed_data['tags'])}")
...

# Custom JSON decoder function for object_hook
def custom_json_decoder(obj):
    # Look for ISO format datetime strings
    for key, value in obj.items():
        if isinstance(value, str):
            # Try to parse as datetime
            try:
                if 'T' in value and ('+' in value or 'Z' in value):
                    obj[key] = datetime.fromisoformat(value.replace('Z', '+00:00'))
                    continue
            except ValueError:
                pass
            # Try to parse as date
            try:
                if value.count('-') == 2 and 'T' not in value:
                    obj[key] = date.fromisoformat(value)
                    continue
            except ValueError:
                pass
            # Try to parse as UUID
            try:
                if '-' in value and len(value) == 36:
                    obj[key] = uuid.UUID(value)
                    continue
            except ValueError:
                pass
    return obj

# Use the custom decoder
parsed_data_with_types = json.loads(json_string, object_hook=custom_json_decoder)
print("\nParsed data with custom decoder:")
print(f"id: {type(parsed_data_with_types['id'])}")
print(f"created_at: {type(parsed_data_with_types['created_at'])}")
print(f"date_only: {type(parsed_data_with_types['date_only'])}")

The object_hook function examines each value and attempts to convert string values back to their original Python types based on their pattern.

This approach works well for many common types, but it relies on string pattern recognition, which might not always be reliable.

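The custom decoder restores the UUID and the date. The created_at value stays a string because datetime.now() produces a timestamp without a UTC offset, so it fails the decoder's check for '+' or 'Z':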

Output
...
Parsed data types:
id: <class 'str'>
created_at: <class 'str'>
tags: <class 'list'>

Parsed data with custom decoder:
id: <class 'uuid.UUID'>
created_at: <class 'str'>
date_only: <class 'datetime.date'>
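A more lenient check that also accepts timestamps without a UTC offset, such as the created_at value above, could be sketched as follows. The parse_iso_datetime helper is hypothetical; it relies on datetime.fromisoformat() raising ValueError for strings that are not ISO timestamps.

```python
from datetime import datetime

def parse_iso_datetime(value):
    """Return a datetime for ISO 8601 strings with or without an offset, else None."""
    if not isinstance(value, str) or "T" not in value:
        return None
    # fromisoformat() before Python 3.11 rejects a trailing "Z", so normalize it
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    try:
        return datetime.fromisoformat(value)
    except ValueError:
        return None

print(parse_iso_datetime("2025-04-30T12:17:20.440531"))  # naive datetime
print(parse_iso_datetime("2023-03-01T14:30:22Z"))        # timezone-aware (UTC)
print(parse_iso_datetime("TODO: write docs"))            # None
```

Any string that happens to look like a timestamp will still be converted, so this remains a heuristic rather than a guarantee.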

For more effective type handling in complex applications, consider:

  • Adding explicit type information in your JSON
  • Using dedicated libraries like Pydantic or marshmallow
  • Creating application-specific encoding schemes for custom objects
  • Implementing schema definitions that include type information

This pattern of custom encoders and decoders is essential when your application needs to preserve specialized Python types during serialization and deserialization.
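To illustrate the first suggestion, adding explicit type information, here is a minimal sketch. The "__type__" marker key and the encode_tagged and decode_tagged helpers are hypothetical names chosen for this example, not part of the json module. Each unsupported value is wrapped in a small object that names its type, so the decoder no longer has to guess from string patterns.

```python
import json
import uuid
from datetime import date, datetime
from decimal import Decimal

TYPE_KEY = "__type__"  # hypothetical marker key; pick one that cannot clash with your data

def encode_tagged(obj):
    """default= hook: wrap unsupported types in a dict that names the type."""
    if isinstance(obj, datetime):  # check datetime first, since it subclasses date
        return {TYPE_KEY: "datetime", "value": obj.isoformat()}
    if isinstance(obj, date):
        return {TYPE_KEY: "date", "value": obj.isoformat()}
    if isinstance(obj, uuid.UUID):
        return {TYPE_KEY: "uuid", "value": str(obj)}
    if isinstance(obj, Decimal):
        return {TYPE_KEY: "decimal", "value": str(obj)}
    raise TypeError(f"Object of type {type(obj).__name__} is not JSON serializable")

# Maps each tag to the callable that rebuilds the original type
DECODERS = {
    "datetime": datetime.fromisoformat,
    "date": date.fromisoformat,
    "uuid": uuid.UUID,
    "decimal": Decimal,
}

def decode_tagged(obj):
    """object_hook: turn tagged dicts back into the original type."""
    tag = obj.get(TYPE_KEY)
    if isinstance(tag, str) and tag in DECODERS and "value" in obj:
        return DECODERS[tag](obj["value"])
    return obj

original = {
    "id": uuid.uuid4(),
    "created_at": datetime.now(),
    "date_only": date(2023, 3, 1),
    "price": Decimal("19.99"),
}
text = json.dumps(original, default=encode_tagged)
restored = json.loads(text, object_hook=decode_tagged)
print(restored == original)  # True
```

The cost of this design is that the JSON is no longer plain data: every consumer has to understand the tagging convention, which is one reason libraries such as Pydantic define types in a schema instead.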

Validating JSON data with JSON Schema

When receiving JSON from external sources, APIs, or user input, it is crucial to ensure the data meets your application's requirements. JSON Schema provides a standardized way to define validation rules for JSON data.

[Flowchart: the JSON Schema validation process, in which JSON data is validated against a JSON Schema]

You can use the jsonschema library to validate incoming data before processing it. You need to install it first using:

 
pip install jsonschema

Once installed, update app.py with the following code:

app.py
import json
from jsonschema import validate, ValidationError

# Define a JSON schema
user_schema = {
    "type": "object",
    "required": ["id", "name", "email"],
    "properties": {
        "id": {"type": "integer", "minimum": 1},
        "name": {"type": "string", "minLength": 2},
        "email": {"type": "string", "format": "email"}
    }
}

# Example of valid and invalid data
valid_user = {"id": 1, "name": "John Doe", "email": "john.doe@example.com"}
invalid_user = {"id": 2, "name": "Alice"}

# Validate valid user
try:
    validate(instance=valid_user, schema=user_schema)
    print("Valid user data!")
except ValidationError as e:
    print(f"Validation error: {e}")

# Validate invalid user
try:
    validate(instance=invalid_user, schema=user_schema)
except ValidationError as e:
    print(f"Validation error: {e.message}")

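In the code, you define a JSON schema to validate user data, ensuring that the id, name, and email fields meet specified criteria. Note that jsonschema does not enforce the format keyword by default, so the email format is only checked if you pass a format_checker argument (for example, jsonschema.FormatChecker()) to validate().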

You then validate both valid and invalid user data. For the valid data, the validation passes, and a success message is printed.

For the invalid data, the validation fails, and an error message is printed, specifying the issue (e.g., missing or incorrect data).

Upon running the file, you will see:

 
python app.py
Output
Valid user data!
Validation error: 'email' is a required property

As you can see, the valid data passes the validation, while the invalid data triggers an error due to the missing email field.

Final thoughts

This guide showed how to work efficiently with JSON data in Python using the built-in json module. You learned how to serialize Python objects to JSON, deserialize JSON back into Python dictionaries, and format JSON output for better readability, among other tasks.

For more details, refer to the official Python documentation on the json module:
Python JSON Documentation

Article by
Stanley Ulili
Stanley Ulili is a technical educator at Better Stack based in Malawi. He specializes in backend development and has freelanced for platforms like DigitalOcean, LogRocket, and AppSignal. Stanley is passionate about making complex topics accessible to developers.
Licensed under CC-BY-NC-SA

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
