Fluentd can parse a JSON object embedded inside a log field with the parser filter plugin. This is useful when a field such as message holds a JSON-encoded string and you want to extract or transform specific fields from it.

Step-by-Step Guide to Parsing Inner JSON in Fluentd

1. Example Log Data

Assume you have logs in the following format, where the message field holds a JSON object encoded as a string. The record is shown across several lines for readability; in the log file each record sits on a single line:

{
  "timestamp": "2024-09-13T10:00:00Z",
  "level": "INFO",
  "message": "{\"user\":\"john_doe\",\"action\":\"login\",\"details\":{\"ip\":\"\",\"browser\":\"Chrome\"}}"
}

2. Fluentd Configuration

To parse this inner JSON, follow these steps:

a. Define the Input Source

Start by defining how Fluentd should collect logs. For example, if you are using a tail input plugin:

<source>
  @type tail
  path /var/log/app.log
  pos_file /var/log/td-agent/app.log.pos
  tag app.logs
  # Parse each line of the file as a JSON record
  <parse>
    @type json
  </parse>
</source>

b. Parse Inner JSON

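Use the parser filter to extract and parse the inner JSON. In Fluentd v1 (and current td-agent packages) this filter is part of Fluentd core, so nothing needs to be installed. Only very old installations that lack it need the separate fluent-plugin-parser gem:

td-agent-gem install fluent-plugin-parser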

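Then, configure the parser filter to parse the inner JSON:

<filter app.logs>
  @type parser
  key_name message
  # Keep the outer fields (timestamp, level); by default they are discarded
  reserve_data true
  # Store the parsed object under message, replacing the original string
  hash_value_field message
  <parse>
    @type json
  </parse>
</filter>

In this configuration:

  • key_name specifies the field containing the inner JSON (message in this case).
  • reserve_data true keeps the outer fields; without it, the parsed fields replace the whole record.
  • hash_value_field message stores the parsed object under the message key; without it, the parsed fields are merged into the top level of the record.
  • The <parse> section configures Fluentd to parse this field as JSON.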
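As a variation on the filter above, the following sketch merges the parsed keys into the top level of the record instead of nesting them, and drops the original message string. It assumes the outer and inner objects share no key names; inject_key_prefix is shown commented out as one way to avoid collisions, and the msg_ prefix is only an illustrative choice.

```conf
<filter app.logs>
  @type parser
  key_name message
  # Keep timestamp and level from the outer record
  reserve_data true
  # Drop the original JSON string once it has been parsed
  remove_key_name_field true
  # Optional: prefix the parsed keys to avoid name collisions (hypothetical prefix)
  # inject_key_prefix msg_
  <parse>
    @type json
  </parse>
</filter>
```

With this variation, user and action become top-level fields, while details remains a nested object.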

c. Optional: Flatten Nested JSON

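If the parsed object is stored under the message key and you want its fields at the top level, use the record_transformer filter. Nested lookups are Ruby expressions, so enable_ruby must be set:

<filter app.logs>
  @type record_transformer
  enable_ruby true
  <record>
    # Example of flattening nested JSON fields
    user ${record.dig("message", "user")}
    action ${record.dig("message", "action")}
    ip ${record.dig("message", "details", "ip")}
    browser ${record.dig("message", "details", "browser")}
  </record>
</filter>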
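Once the fields have been copied to the top level, the nested message object is redundant. A minimal sketch, assuming the nested copy is no longer needed downstream, uses the remove_keys parameter of record_transformer in a follow-up filter:

```conf
<filter app.logs>
  @type record_transformer
  # Drop the nested object after its fields have been copied out
  remove_keys message
</filter>
```

The same remove_keys line can instead be added to the flattening filter itself, since keys are removed after the new fields are computed.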

3. Define the Output

Specify where to send the parsed logs. For example, output to a file, writing each record as JSON:

<match app.logs>
  @type file
  path /var/log/fluentd/parsed_logs.log
  <format>
    @type json
  </format>
</match>

Complete Configuration Example

Here’s a complete example of a Fluentd configuration file that parses inner JSON:

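<source>
  @type tail
  path /var/log/app.log
  pos_file /var/log/td-agent/app.log.pos
  tag app.logs
  <parse>
    @type json
  </parse>
</source>

<filter app.logs>
  @type parser
  key_name message
  # Keep timestamp and level, and store the parsed object under message
  reserve_data true
  hash_value_field message
  <parse>
    @type json
  </parse>
</filter>

<filter app.logs>
  @type record_transformer
  # Nested lookups are Ruby expressions
  enable_ruby true
  <record>
    # timestamp and level are already preserved by reserve_data
    user ${record.dig("message", "user")}
    action ${record.dig("message", "action")}
    ip ${record.dig("message", "details", "ip")}
    browser ${record.dig("message", "details", "browser")}
  </record>
</filter>

<match app.logs>
  @type file
  path /var/log/fluentd/parsed_logs.log
  <format>
    @type json
  </format>
</match>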

4. Testing and Validation

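  1. Restart Fluentd to apply the new configuration:

    sudo systemctl restart td-agent

  2. Verify the logs: append a test record to /var/log/app.log and check the output to confirm it is parsed and transformed correctly. The file output treats path as a prefix and appends a time-based suffix, so look for files whose names start with /var/log/fluentd/parsed_logs.log. Because the file output is buffered, records may not appear until the buffer is flushed.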
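While validating, it can help to see each event as Fluentd sees it. One option, sketched here as a temporary debugging aid, is the built-in stdout filter, which prints every matching event to Fluentd's own log and passes it on unchanged. Place it after the other filters and remove it when done:

```conf
<filter app.logs>
  # Temporary: print each event after parsing and flattening
  @type stdout
</filter>
```

With td-agent, the printed events typically show up in /var/log/td-agent/td-agent.log.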

Common Use Cases

  • Extracting Nested Fields: Extract specific fields from nested JSON objects for more straightforward querying and analysis.
  • Flattening JSON: Simplify the log structure by flattening nested JSON fields into a more accessible format.
  • Transforming Data: Apply transformations to make logs more informative or consistent.

Additional Considerations

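  • Performance: Every filter runs on each matching event, and enable_ruby expressions cost more than plain placeholders, so keep match patterns narrow and the filter chain short.
  • Error Handling: By default the parser filter routes records whose field cannot be parsed to the @ERROR label (emit_invalid_record_to_error), so handle that label if you want to keep them. Configure output buffering and retries to avoid data loss when the destination is unavailable.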
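As one way to keep records that fail to parse, the sketch below captures them through Fluentd's built-in @ERROR label and writes them to a separate file for later inspection. The parse_errors.log path is a hypothetical name chosen for illustration; any output plugin could be used inside the label.

```conf
<label @ERROR>
  <match app.logs>
    @type file
    # Hypothetical destination for records that could not be parsed
    path /var/log/fluentd/parse_errors.log
    <format>
      @type json
    </format>
  </match>
</label>
```

Without such a label, Fluentd logs a warning for each invalid record and discards it.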
