Back to Logging guides

A Beginner's Guide to JSON Logging

Ayooluwa Isaiah
Updated on May 7, 2024

Software logs used to be written for humans using APIs like the classic printf(). However, with automated log analysis on the rise, those unstructured messages have become a real headache. They lack consistency, change often, and may be missing important data.

That's why developers are turning to JSON for logging. It's a familiar and widely supported structured format that plays nicely with modern log analysis tools, facilitating a much deeper level of insight than was possible in the past.

In this article, we'll dive into JSON logging: the benefits, how to do it, and best practices to make your logs truly powerful. Let's jump right in!

What is JSON?

JSON (JavaScript Object Notation) is a versatile data format derived from a subset of the JavaScript programming language.

While inspired by JavaScript syntax, JSON's usefulness extends far beyond its namesake language. It's unique blend of human readability, machine parsability, and language-agnosticism have made it the go-to format for data exchange across web applications, servers, and diverse systems.

What is JSON logging?

JSON logging is the recording of log entries as structured JSON objects. This approach makes log data easy to parse and analyze with various log management systems, analytics tools, and other software applications.

Consider a traditional log entry from an Apache server, which typically follows the Common Log Format:

 
127.0.0.1 alice Alice [06/May/2021:11:26:42 +0200] "GET / HTTP/1.1" 200 3477

While decipherable, its format doesn't lend itself to easy correlation with other log data. The same entry in JSON would look like this:

 
{
  "ip_address": "127.0.0.1",
  "user_id": "alice",
  "username": "Alice",
  "timestamp": "06/May/2021:11:26:42 +0200",
  "request_method": "GET",
  "request_url": "/",
  "protocol": "HTTP/1.1",
  "status_code": 200,
  "response_size_bytes": 3477
}

The JSON version is a bit more verbose, but the key/value pairs make pinpointing specific information (like the IP address or status code) quick and intuitive.

Benefits of logging in JSON

Choosing JSON for your logging offers several significant advantages, especially when it comes to the analysis and management of log data in today's complex systems. Here's why:

1. It is easy for humans and machines to understand

JSON's structure is simple and familiar to most developers. By presenting log data in key/value pairs, each log entry remains grokkable by humans, and parsable by log analysis tooling just like any other JSON data.

2. It is resilient to changes in log schema

Changes to log data, such as adding or modifying fields, can often complicate unstructured or semi-structured logging formats. This is because these formats aren't very adaptable, typically requiring updates to the parsing scripts or regular expressions used for log analysis.

JSON shines because its flexibility lets you add or remove fields without causing headaches. This adaptability makes it perfect for applications whose log data might evolve.

3. It facilitates true observability

Having observability means that you can drill down into your data from any angle to find the root cause of issues.

Structured JSON logs play a key role in achieving true observability by providing the rich, contextual data needed to understand the complete system state at the time of any event (for optimal results, see best practices).

Drawbacks to logging in JSON

While JSON offers many benefits, there are some trade-offs to consider:

1. Readability in development environments

Raw JSON logs, with their curly braces and punctuation, can be visually dense which might hinder quick at-a-glance insights during development.

Screenshot of a JSON log

This readability issue can be effectively addressed using tools like jq, which formats the output by placing each property on its own line and adding color coding for keys and values.

Screenshot of a formatted JSON log

An alternative approach is using a more human-readable format like Logfmt in development environments, while keeping JSON logs in production. This can often be managed through environment variables, allowing for easy switching between formats.

Screenshot of Logfmt

2. Increased storage requirements

JSON logs can be slightly larger in file size compared to some more compact formats, but this is a minor concern with modern storage and compression algorithms.

3. Processing overhead

There is a slight performance impact when generating and parsing JSON. However, for most applications, this overhead is negligible compared to the analysis benefits.

How to log in JSON

Now that you're aware of the benefits and potential drawbacks of JSON logging, let's explore the main approaches for generating JSON logs in production systems:

1. Using a structured logging framework

For applications, the easiest way to generate JSON logs is by using a logging framework that supports JSON output natively. Many programming languages offer excellent third-party libraries; some are even starting to include structured JSON logging in their standard libraries.

Here's a Python example that uses the Structlog package to produce structured JSON logs:

 
import structlog

structlog.configure(
    processors=[
        structlog.processors.TimeStamper(fmt="iso"),
        structlog.processors.add_log_level,
        structlog.processors.EventRenamer("msg"),
        structlog.processors.JSONRenderer(),
    ]
)

logger = structlog.get_logger()
logger.info("image uploaded", name="image.jpg", size_bytes=2382)

This produces clear, parsable JSON output:

Output
{
  "name": "image.jpg",
  "size_bytes": 2382,
  "timestamp": "2023-11-30T04:07:41.746806Z",
  "level": "info",
  "msg": "image uploaded"
}

To find the right logging framework for your programming language of choice, refer to our comparison articles below:

Learn more: 8 Factors for Choosing a Logging Framework

2. Configuring your dependencies

It's worth exploring the documentation for the other dependencies you use. You might be surprised to find they can also generate JSON logs!

A classic example is PostgreSQL, which generates logs in a traditional textual format by default:

 
2023-07-30 08:31:50.628 UTC [2176] postgres@chinook LOG:  statement: select albumid, title from album where artistid = 2;

However, from version 15 onwards, PostgreSQL offers a built-in option for structured JSON logging. A simple change in postgresql.conf does the trick:

/etc/postgresql/15/main/postgresql.conf
log_destination = 'jsonlog'

Now your PostgreSQL logs will look like this:

 
{
  "timestamp": "2023-07-30 22:48:01.817 UTC",
  "user": "postgres",
  "dbname": "chinook",
  "pid": 18254,
  "remote_host": "[local]",
  "session_id": "64c6e836.474e",
  "line_num": 1,
  "ps": "idle",
  "session_start": "2023-07-30 22:46:14 UTC",
  "vxid": "4/3",
  "txid": 0,
  "error_severity": "LOG",
  "message": "statement: SHOW data_directory;",
  "application_name": "psql",
  "backend_type": "client backend",
  "query_id": 0
}

This structured format makes parsing and analyzing your database logs significantly easier, especially when paired with log management tools.

3. Parsing unstructured logs to JSON

If you can't generate JSON logs directly from the source, don't worry! Log aggregation tools can help with parsing such unstructured or semi-structured data and convert them into JSON.

This conversion process typically involves identifying individual elements within each log message (often using regular expressions) and then mapping these elements to specific JSON keys.

Let's say you have Nginx error logs in their default format:

 
2021/04/01 13:02:31 [error] 31#31: *1 open() "/usr/share/nginx/html/not-found" failed (2: No such file or directory), client: 172.17.0.1, server: localhost, request: "POST /not-found HTTP/1.1", host: "localhost:8081"

Using a tool like Vector, you can transform this into JSON:

 
{
  "cid": 1,
  "client": "172.17.0.1",
  "host": "localhost:8081",
  "message": "open() \"/usr/share/nginx/html/not-found\" failed (2: No such file or directory)",
  "pid": 31,
  "request": "POST /not-found HTTP/1.1",
  "server": "localhost",
  "severity": "error",
  "tid": 31,
  "timestamp": "2021-04-01T13:02:31Z"
}

This can be achieved using the configuration below. The basic idea is that it uses Vector's parse_nginx_log() function to parse the error logs and turn them into structured JSON. For further details, please refer to Vector's documentation

/etc/vector/vector.toml
[sources.nginx_error]
type = "file"
include = ["/var/log/nginx/error.log"]
read_from = "end"

[transforms.nginx_error_to_json]
type = "remap"
inputs = [ "nginx_error" ]
source = """
structured, err = parse_nginx_log(.message, "error")
if err != null {
log("Unable to parse Nginx log: " + err, level: "error")
} else {
. = merge(., structured)
}
"""
[sinks.print] type = "console" inputs = ["nginx_error_to_json"] encoding.codec = "json"

Once all your logs are in JSON, correlating and analyzing them with other structured data becomes much easier!

Best practices for JSON logging

To get the most out of your JSON logs, keep these best practices in mind:

1. Maintain a uniform logging schema

A consistent schema is essential for making the most of your JSON logs. By ensuring that field names remain the same across all your services, you'll streamline data querying and analysis. For example, you can choose a standard field name for logging user IDs such as user_id or userID and stick to it throughout your production environment.

If you're using logs from third-party applications where you can't control the field names, a log shipper can help you transform and standardize the names to match your schema before the logs reach your management system. This keeps the data consistent, even when it comes from diverse sources.

Learn more: What is Log Aggregation? Getting Started and Best Practices

2. Specify units in numerical field names

Make your numerical data instantly understandable by including units directly in the field names. Instead of using a generic field like duration, use duration_secs or duration_msecs. This removes any chance of misinterpretation and makes your logs clear for both humans and analysis tools.

3. Format exception stack traces as JSON

When errors occur, their stack traces (which show the chain of functions that led to the problem) are often logged as plain text. This can be hard to read and analyze. Here's an example:

 
{
  "message": "Application failed to execute database query",
  "timestamp": "2022-03-31T13:32:33.928188+00:00",
  "logger": "__main__",
  "level": "error",
"exception": "Traceback (most recent call last):\n File \"app.py\", line 50, in execute_query\n results = database.query(sql_query)\n File \"database.py\", line 34, in query\n raise DatabaseException(\"Failed to execute query\")\nDatabaseException: Failed to execute query"
}

If possible, parse the entire stack trace into an hierarchy of attributes like file names, line numbers, methods, and error messages:

 
{
  "message": "Application failed to execute database query",
  "timestamp": "2022-03-31T13:32:33.928188+00:00",
  "logger": "__main__",
  "level": "error",
"exception": {
"type": "DatabaseException",
"message": "Failed to execute query",
"trace": [
{
"file": "app.py",
"line": 50,
"method": "execute_query"
},
{
"file": "database.py",
"line": 34,
"method": "query"
}
]
}
}

With this structure, log analysis tools can easily pull out the error type, message, or specific lines in the code.

Python's Structlog and Go's Zerolog are two notable logging frameworks that provide the ability to output a structured error stack trace out of the box.

4. Enrich logs with contextual information

One of the key advantages of JSON logging is the ability to enrich log entries with extensive contextual information about what is being logged. This provides clarity about the immediate context of a log event and supports much deeper analysis when correlating data across different logs.

Go beyond just recording what happened – log the context that will you answer the why. Include user and session identifiers, error messages, and other details that will shed light on the root cause of problems.

 
{
  "timestamp": "2023-11-30T15:45:12.123Z",
  "level": "info",
  "message": "User login failed",
  "user_id": "123456",
  "username": "johndoe",
  "session": {
    "id": "abc123xyz",
    "ip": "192.168.1.10",
    "device": "iPhone 13, iOS 15.4",
    "location": "New York, USA"
  },
  "service": {
    "name": "UserAuthenticationService",
    "version": "1.2.0",
    "host": "auth-server-01"
  },
  "request": {
    "path": "/login",
    "status": 401,
    "duration_ms": 200
  }
}

5. Centralize and monitor your JSON logs

In today's complex systems, keeping JSON logs scattered is a recipe for troubleshooting chaos. A centralized log management system transforms this chaos into order, offering several powerful benefits:

  • Single source of truth: Manually sifting through mountains of log files is neither efficient nor scalable. A central repository means no more hunting for log files across servers or applications.

  • Unlocking "unknown unknowns": The structured nature of JSON, paired with a powerful analysis tool, allows you to find patterns, correlations, and anomalies you might never have noticed in raw text logs.

  • Targeted alerting: You can set up alerts based on specific JSON fields, values, and even a combination of events across multiple entries ensuring you're promptly notified of critical events when they occur.

level-error-message-request-url-search-Better-Stack.png

Better Stack provides a user-friendly log analysis and observability platform specifically designed to make managing JSON logs effortless.

It automatically understands the structure of your logs, making it simple to search, sort, and filter log entries. Need to dive deeper? Advanced querying with SQL lets you pinpoint the exact data you're looking for, and collaborative features

You can also visualize your log data with customizable dashboards, and generate tailored reports to track the events that matter most, while setting up alerting so you can take immediate action when issues arise.

Start for free today and use Better Stack to uncover the hidden insights in your JSON logs!

Final thoughts

You've seen how JSON logging can streamline analysis and troubleshooting. If you're ready to take the plunge, the resources we discussed will help you implement it seamlessly.

Want to go even further? Check out our other logging guides for more strategies to get the most from your logs.

Thanks for reading, and happy logging!

Author's avatar
Article by
Ayooluwa Isaiah
Ayo is a technical content manager at Better Stack. His passion is simplifying and communicating complex technical ideas effectively. His work was featured on several esteemed publications including LWN.net, Digital Ocean, and CSS-Tricks. When he's not writing or coding, he loves to travel, bike, and play tennis.
Got an article suggestion? Let us know
Next article
Why Structured Logging is Fundamental to Observability
Unstructured logs hinder observability, but transforming them into structured data unlocks their immense value
Licensed under CC-BY-NC-SA

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Make your mark

Join the writer's program

Are you a developer and love writing and sharing your knowledge with the world? Join our guest writing program and get paid for writing amazing technical guides. We'll get them to the right readers that will appreciate them.

Write for us
Writer of the month
Marin Bezhanov
Marin is a software engineer and architect with a broad range of experience working...
Build on top of Better Stack

Write a script, app or project on top of Better Stack and share it with the world. Make a public repository and share it with us at our email.

community@betterstack.com

or submit a pull request and help us build better products for everyone.

See the full list of amazing projects on github