
Introduction to JSON Logging: Techniques and Best Practices

Ayooluwa Isaiah
Updated on February 16, 2024

Messages generated by software systems are typically crafted by developers using functions similar to the widely known printf(). While these messages are intended for human interpretation, they are increasingly being processed by machines, such as automated log aggregation and management tools.

The challenge arises because these machine-based systems frequently struggle to interpret messages designed for human readability. These messages often lack a predictable structure, omit crucial information, and are prone to changes, making automated analysis difficult.
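
To illustrate the problem, here's a minimal Python sketch (the log line and pattern are hypothetical): the parser must hard-code the exact wording of the message, so even a minor rewording silently breaks it.

import re

# A typical printf()-style message intended for humans
line = "User alice logged in from 10.0.0.1"

# The parser must mirror the exact wording of the message
pattern = re.compile(r"User (\w+) logged in from ([\d.]+)")

match = pattern.match(line)
if match:
    user, ip = match.groups()
    print(user, ip)  # alice 10.0.0.1

# If the message is later reworded to "User alice signed in from ...",
# the pattern silently stops matching and the data is lost.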

Given these challenges, there is a growing interest in introducing more structured approaches to logging within software systems. This need for structured logging has led to the increasing adoption of JSON as the preferred format for log data.

In this article, we'll explore JSON logging in detail, examining its advantages, methods for generating JSON-formatted logs, and best practices to adopt in the process.

Let's get started!

What is JSON?

JSON (JavaScript Object Notation) is a versatile data format derived from a subset of the JavaScript programming language. Known for its human readability and ease of parsing by machines, JSON has become the go-to format for data interchange, particularly in web applications and between servers.

Despite its roots in JavaScript, JSON's utility is not confined to this language alone. Its language-agnostic nature has allowed it to be seamlessly integrated with most mainstream programming languages. This broad compatibility has made JSON a universal standard for data exchange on the web.

What is JSON logging?

JSON logging is a method of recording log data where each log entry is structured as a JSON object. This structured approach provides clarity and ease of parsing for both humans and machines, making it highly compatible with various log management systems, analytics tools, and other software applications.

Consider a traditional log entry from an Apache server, which typically follows the Common Log Format:

 
127.0.0.1 alice Alice [06/May/2021:11:26:42 +0200] "GET / HTTP/1.1" 200 3477

In its JSON-logged form, the same entry would be represented as:

 
{
  "ip_address": "127.0.0.1",
  "user_identifier": "alice",
  "user_authentication": "Alice",
  "timestamp": "06/May/2021:11:26:42 +0200",
  "request_method": "GET",
  "request_url": "/",
  "protocol": "HTTP/1.1",
  "status_code": 200,
  "response_size": 3477
}

In this JSON format, the log entry is indeed more verbose, but its key-value pairing makes each data point clear and understandable, facilitating easy identification and extraction of specific information.

Benefits of logging in JSON

Logging in JSON offers several notable benefits, especially in modern computing environments where data analysis and integration with log management tools are critical for deriving insights from logs. Here are some of the key advantages:

1. It is easy to read and parse

JSON logging provides a highly structured format that's easily readable by both humans and machines. This dual benefit arises from its consistent data organization into key-value pairs, making each log entry unambiguous.
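
For example, a program can consume such an entry without any custom parsing logic. Here's a minimal sketch using Python's standard json module:

import json

entry = '{"ip_address": "127.0.0.1", "request_method": "GET", "status_code": 200}'

# Each field is immediately addressable by name, no regex required
record = json.loads(entry)
print(record["status_code"])  # 200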

2. It can capture all kinds of data

Any modification in log data can disrupt established parsers in unstructured or semi-structured logging formats, necessitating changes to parsing scripts or regular expressions used in log analysis. This issue arises because these formats are less adaptable to changes in the data structure.

Conversely, JSON excels in its inherent flexibility as it allows for the easy addition of new data fields without impacting the functionality of existing log parsing systems.

Additionally, its ability to represent a wide array of data structures, from basic key-value pairs to more complex data like nested objects and arrays, solidifies its position as one of the most versatile logging formats for all kinds of applications.
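
For instance, a single entry can combine scalar values, arrays, and nested objects (the fields below are purely illustrative):

{
  "level": "info",
  "message": "order placed",
  "tags": ["checkout", "payment"],
  "order": {
    "id": 18392,
    "items": [
      { "sku": "A-100", "quantity": 2 }
    ]
  }
}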

3. It facilitates automated monitoring and alerting

Capturing JSON logs and centralizing them in a log management system significantly streamlines the setup of automated monitoring and alerting to enable rapid response to critical incidents.

Key metrics like the presence, absence, or specific values of fields, as well as their frequency of occurrence, can be continuously monitored. Automated alerts can then be configured to trigger upon detecting notable events or anomalies, ensuring timely intervention and resolution.

In addition, structured logs greatly simplify the task of correlating logs from different sources. This correlation is essential in complex systems where understanding the interplay between different components is key to diagnosing issues.

Side note: Get a logs dashboard

Save hours of sifting through application logs. Centralize with Better Stack and start visualizing your log data in minutes.

See the demo dashboard live.

Drawbacks to logging in JSON

A primary challenge when logging in JSON is its reduced readability in development environments. The dense format, characterized by numerous curly braces, commas, and quotes, can make the log entries appear cluttered and difficult to decipher when presented in a single line.

Screenshot of a JSON log

However, this readability issue can be effectively addressed using tools like jq, which formats the output by placing each property on its own line and adds color coding for keys and values. This makes the logs more legible and easier to navigate.
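
For example, assuming your application writes JSON logs to a file named app.log (an illustrative path), you can pretty-print the entries or extract specific fields:

jq '.' app.log
jq 'select(.level == "error") | .message' app.log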

Screenshot of a formatted JSON log

The trade-off, however, is that this formatted output occupies more screen space, potentially limiting the number of log entries that can be viewed simultaneously.

An alternative approach is to use a more readable format like Logfmt in development environments while keeping JSON logs for production. This flexibility can often be managed through environment variables, allowing for easy switching between formats.
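
Here's a minimal sketch of this pattern with Structlog, using a hypothetical LOG_FORMAT environment variable to switch between logfmt output in development and JSON everywhere else (LogfmtRenderer ships with recent Structlog versions):

import os
import structlog

# Hypothetical LOG_FORMAT variable: "logfmt" for development, JSON otherwise
if os.environ.get("LOG_FORMAT") == "logfmt":
    renderer = structlog.processors.LogfmtRenderer()
else:
    renderer = structlog.processors.JSONRenderer()

structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.add_log_level,
        renderer,
    ]
)

structlog.get_logger().info("image uploaded", name="image.jpg")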

Screenshot of Logfmt

How to log in JSON

Now that you're aware of the benefits and drawbacks of JSON logging, let's look at the three main approaches for generating JSON logs.

1. Using a structured logging framework

A straightforward approach to generating JSON logs is through the use of a structured logging framework that inherently supports JSON output. While many programming environments may require third-party libraries for this functionality, an increasing number are incorporating structured JSON logging into their standard libraries.

For instance, in Python, you can utilize the Structlog package to produce structured JSON logs. Here's an illustrative example:

 
import structlog

structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.add_log_level,
        structlog.processors.EventRenamer("msg"),
        structlog.processors.JSONRenderer(),
    ]
)

logger = structlog.get_logger()
logger.info("image uploaded", name="image.jpg", size_bytes=2382)

This configuration yields a structured and easily parseable JSON output:

 
{
  "name": "image.jpg",
  "size_bytes": 2382,
  "timestamp": "2023-11-30T04:07:41.746806Z",
  "level": "info",
  "msg": "image uploaded"
}

For recommendations on structured logging frameworks for your programming environment, refer to our comparison articles below:

Learn more: 8 Factors for Choosing a Logging Framework

2. Setting up third-party dependencies

Many dependencies in your application environment can also be configured to generate structured JSON logs. A classic example is PostgreSQL, which traditionally generates logs in a standard textual format:

 
2023-07-30 08:31:50.628 UTC [2176] postgres@chinook LOG:  statement: select albumid, title from album where artistid = 2;

However, starting with version 15, PostgreSQL can produce logs in JSON format with a simple configuration change in postgresql.conf:

/etc/postgresql/15/main/postgresql.conf
log_destination = 'jsonlog'
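
Note that jsonlog, like csvlog, requires the logging collector to be enabled (logging_collector = on, a setting that only takes effect on server restart). A change to log_destination itself can then be applied with a configuration reload:

SELECT pg_reload_conf();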

With this configuration, PostgreSQL starts outputting logs in a structured JSON format, as shown below:

 
{
  "timestamp": "2023-07-30 22:48:01.817 UTC",
  "user": "postgres",
  "dbname": "chinook",
  "pid": 18254,
  "remote_host": "[local]",
  "session_id": "64c6e836.474e",
  "line_num": 1,
  "ps": "idle",
  "session_start": "2023-07-30 22:46:14 UTC",
  "vxid": "4/3",
  "txid": 0,
  "error_severity": "LOG",
  "message": "statement: SHOW data_directory;",
  "application_name": "psql",
  "backend_type": "client backend",
  "query_id": 0
}

This structured format dramatically simplifies the parsing and analysis of your database logs, making them more accessible and interpretable for log management tools.

3. Converting unstructured logs to JSON

When generating JSON logs directly at the source is infeasible, an alternative approach is to use a log aggregation tool. These tools can collect unstructured or semi-structured logs and convert them into JSON format.

This conversion process typically involves identifying individual elements within each log message (often using regular expressions) and then mapping these elements to specific JSON keys.
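
Here's a minimal sketch of that idea in Python, using a simplified pattern for an Nginx error-log line (illustrative only; a real pipeline would handle many more message variants):

import json
import re

# Simplified pattern for an Nginx error-log line
pattern = re.compile(
    r"(?P<timestamp>\d{4}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) "
    r"\[(?P<severity>\w+)\] (?P<pid>\d+)#(?P<tid>\d+): (?P<message>.*)"
)

line = "2021/04/01 13:02:31 [error] 31#31: *1 open() failed"

match = pattern.match(line)
if match:
    # Map the captured elements to JSON keys
    print(json.dumps(match.groupdict()))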

Take, for example, the transformation of Nginx error logs from their default format into structured JSON logs using a tool like Vector.

Original Nginx error log format:

 
2021/04/01 13:02:31 [error] 31#31: *1 open() "/usr/share/nginx/html/not-found" failed (2: No such file or directory), client: 172.17.0.1, server: localhost, request: "POST /not-found HTTP/1.1", host: "localhost:8081"

Converted JSON formatted Nginx error log:

 
{
  "cid": 1,
  "client": "172.17.0.1",
  "host": "localhost:8081",
  "message": "open() \"/usr/share/nginx/html/not-found\" failed (2: No such file or directory)",
  "pid": 31,
  "request": "POST /not-found HTTP/1.1",
  "server": "localhost",
  "severity": "error",
  "tid": 31,
  "timestamp": "2021-04-01T13:02:31Z"
}

The process involves a specific Vector configuration, as illustrated below. For further details, refer to Vector's documentation:

 
[sources.nginx_error]
type = "file"
include = ["/var/log/nginx/error.log"]
read_from = "end"

[transforms.nginx_error_to_json]
type = "remap"
inputs = [ "nginx_error" ]
source = """
structured, err = parse_nginx_log(.message, "error")
if err != null {
log("Unable to parse Nginx log: " + err, level: "error")
} else {
. = merge(., structured)
}
"""
[sinks.print] type = "console" inputs = ["nginx_error_to_json"] encoding.codec = "json"

Once the logs are transformed, you can analyze and correlate the data programmatically alongside your other structured log data.

Best practices for logging in JSON

To maximize the effectiveness of your JSON logging, it's necessary to follow certain best practices:

1. Maintain a uniform logging schema

Implementing a consistent schema across all applications and services is crucial for streamlined data aggregation and querying. Consistency in field names ensures coherence and reliability in your log data.

For example, when logging user IDs, it's important to choose a standard field name, such as user_id or userID, and use it uniformly across all logs. This approach ensures that essential data points aren't overlooked during the filtering and analysis stages.

In scenarios where you don't have control over log field names at the source, like with logs from third-party applications or dependencies, consider employing a log shipper to transform and standardize field names, ensuring they align with your logging schema before they are sent to a log management system.

This helps maintain consistency across logs from various sources, enhancing the overall effectiveness of your log management strategy.
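
For instance, with Vector's remap transform, a field rename can be expressed in a few lines of VRL (the field names and the some_source input here are illustrative):

[transforms.normalize_fields]
type = "remap"
inputs = ["some_source"]
source = """
# Rename userID to user_id to match the logging schema
if exists(.userID) {
  .user_id = del(.userID)
}
"""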

Learn more: What is Log Aggregation? Getting Started and Best Practices

2. Specify units in numerical field names

Specifying units directly in the field names of numerical values enhances the clarity of log data. For example, rather than a vague field name such as duration, more descriptive names like duration_secs or duration_msecs clearly indicate the unit of measurement used.

This effectively removes ambiguity, providing precise and clear data within the logs. By explicitly defining units in field names, it becomes straightforward for anyone analyzing the logs to accurately understand and interpret the data, without additional explanations or context.
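
With a structured logging framework, this is simply a matter of naming the fields you pass. Continuing the Structlog example from earlier (the values are illustrative):

logger.info("database query completed", duration_ms=42, response_size_bytes=2048)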

3. Format exception stack traces as JSON

Typically, exception stack traces are logged as strings, which can be challenging to read and process:

 
{
  "message": "Application failed to execute database query",
  "timestamp": "2022-03-31T13:32:33.928188+00:00",
  "logger": "__main__",
  "level": "error",
"exception": "Traceback (most recent call last):\n File \"app.py\", line 50, in execute_query\n results = database.query(sql_query)\n File \"database.py\", line 34, in query\n raise DatabaseException(\"Failed to execute query\")\nDatabaseException: Failed to execute query"
}

A more effective approach is to format the exception field as a structured JSON object. This structuring breaks the exception down into a hierarchy of attributes, like file names, line numbers, methods, and error messages.

Consider the following revised log entry:

 
{
  "message": "Application failed to execute database query",
  "timestamp": "2022-03-31T13:32:33.928188+00:00",
  "logger": "__main__",
  "level": "error",
"exception": {
"type": "DatabaseException",
"message": "Failed to execute query",
"trace": [
{
"file": "app.py",
"line": 50,
"method": "execute_query"
},
{
"file": "database.py",
"line": 34,
"method": "query"
}
]
}
}

In this JSON-structured format, the exception information is organized into a clear, readable format, making automated parsing and analysis more straightforward.

Python's Structlog and Go's Zerolog are two notable logging frameworks that provide the ability to output a structured error stack trace out of the box.
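
For instance, adding Structlog's dict_tracebacks processor to the configuration shown earlier renders exceptions as structured objects. A minimal sketch:

import structlog

structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.add_log_level,
        structlog.processors.dict_tracebacks,  # render tracebacks as JSON objects
        structlog.processors.JSONRenderer(),
    ]
)

logger = structlog.get_logger()
try:
    1 / 0
except ZeroDivisionError:
    logger.exception("database query failed")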

4. Enhance logs with ample contextual information

One of the primary benefits of JSON logging is the opportunity to infuse log entries with comprehensive contextual information. This enhancement is not just about understanding the immediate circumstances of a log event but also about enabling a deeper analysis, particularly when correlating with other log data.

Incorporating detailed context, such as user details, session identifiers, and environment states, significantly increases the utility of each log entry. This enriched data should prove valuable when debugging, tracking transaction flows or user actions, and diagnosing issues.

 
{
  "timestamp": "2023-11-30T15:45:12.123Z",
  "level": "info",
  "message": "User login failed",
  "user_id": "123456",
  "username": "johndoe",
  "session": {
    "id": "abc123xyz",
    "ip": "192.168.1.10",
    "device": "iPhone 13, iOS 15.4",
    "location": "New York, USA"
  },
  "service": {
    "name": "UserAuthenticationService",
    "version": "1.2.0",
    "host": "auth-server-01"
  },
  "request": {
    "path": "/login",
    "status": 401,
    "duration_ms": 200
  }
}
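
In Structlog, for example, such context can be attached once with bind() so that every subsequent entry from that logger includes it automatically (the values below mirror the illustrative entry above):

logger = structlog.get_logger().bind(user_id="123456", session_id="abc123xyz")
logger.info("User login failed", path="/login", status=401)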

5. Centralize your JSON logs

After creating well-structured JSON logs filled with rich contextual information, the next step is to maximize their utility by centralizing them in a log management system, where they can be processed automatically and used to trigger alerts on significant events.

Screenshot of filtering logs by level, message, and request URL in Better Stack

Better Stack provides a log analysis and observability platform that simplifies managing your JSON logs. It automatically extracts fields, offering capabilities for filtering, sorting, querying, and customizing visualizations to suit your specific requirements.

It also allows you to create custom report views based on the fields that matter to you and set up alerts to notify you of notable events through your preferred channels (SMS, Slack, email, etc.). The best part? You can start using it for free, with no credit card required!

Final thoughts

In this guide, we've delved deeply into the intricacies of JSON logging, examining its advantages and how it can enhance the logging process.

If you're ready to start implementing JSON logging in your application, the library resources discussed earlier are a great starting point.

For further exploration, consider browsing our extensive logging guides to learn more about effective techniques for production logging.

Thanks for reading, and happy logging!
