How to Collect, Process, and Ship Log Data with Fluent Bit
In distributed systems, efficient log shipping is essential. A log shipper is a tool that gathers logs from various sources, such as containers and servers, and directs them to a central location for analysis. Several options, including Logstash and Fluentd, are available for this purpose. Among them, Fluent Bit stands out as a lightweight, high-performance log shipper introduced by Treasure Data.
Fluent Bit was developed in response to the growing need for a log shipper that could operate in resource-constrained environments, such as embedded systems and containers. With a minimal memory footprint of 1MB, Fluent Bit efficiently collects logs from multiple sources, transforms the data, and forwards it to diverse destinations for storage and analysis. Key features of Fluent Bit include SQL stream processing, backpressure handling, vendor neutrality, and an Apache 2.0 license. Fluent Bit is also highly flexible thanks to its pluggable architecture, which supports easy integration and customization. With over 100 built-in plugins, it offers extensive options for collecting, filtering, and forwarding data.
Fluent Bit's reliability is underscored by its adoption by major cloud providers like DigitalOcean, AWS Cloud, and Google Cloud, processing vast amounts of data daily.
In this comprehensive guide, you will use Fluent Bit to gather logs from diverse sources, transform them, and deliver them to various destinations. The tutorial will walk you through reading logs from a file and forwarding them to the console. Subsequently, you will explore how Fluent Bit can collect logs from multiple containers and route them to a centralized location. Finally, you will monitor Fluent Bit's health to ensure its smooth operation.
Prerequisites
To follow this guide, you need access to a system that has a non-root user account with sudo privileges. Optionally, you should install Docker and Docker Compose if you intend to follow along with the later parts of this tutorial that involve collecting logs from Docker containers. If you're uncertain about the need for a log shipper, you can read this article on log shippers to understand their benefits, how to choose one, and compare a few options.
Once you've met these prerequisites, create a root project directory that will contain the application and configuration files with the following command:
mkdir log-processing-stack
Move into the newly created directory:
cd log-processing-stack
Next, create a subdirectory named logify for the demo application you'll be building in the upcoming section:
mkdir logify
Change into the subdirectory:
cd logify
With these directories in place, you're ready to proceed to the next step, where you'll create the demo logging application.
Developing a demo logging application
In this section, you'll create a sample logging script using Bash that generates log entries at regular intervals and writes them to a file.
Create a logify.sh file within the logify directory. You can use your preferred text editor. This tutorial uses nano:
nano logify.sh
In your logify.sh file, enter the following contents to generate log entries with Bash:
#!/bin/bash

filepath="/var/log/logify/app.log"

# Generate a single log entry in JSON format
create_log_entry() {
    local info_messages=("Connected to database" "Task completed successfully" "Operation finished" "Initialized application")
    local random_message=${info_messages[$RANDOM % ${#info_messages[@]}]}
    local http_status_code=200
    local ip_address="127.0.0.1"
    local emailAddress="user@mail.com"
    local level=30
    local pid=$$
    local ssn="407-01-2433"
    local time=$(date +%s)
    local log='{"status": "'$http_status_code'", "ip": "'$ip_address'", "level": '$level', "emailAddress": "'$emailAddress'", "msg": "'$random_message'", "pid": '$pid', "ssn": "'$ssn'", "timestamp": '$time'}'
    echo "$log"
}

# Append a new log entry to the file every three seconds
while true; do
    log_record=$(create_log_entry)
    echo "${log_record}" >> "${filepath}"
    sleep 3
done
The create_log_entry() function generates log entries in JSON format and includes various details such as HTTP status codes, severity levels, and random log messages. It also intentionally includes sensitive fields like IP address, Social Security Number (SSN), and email address to demonstrate Fluent Bit's ability to remove or redact sensitive data. To learn more about best practices for logging sensitive data, refer to our guide.
Next, the infinite loop continuously invokes the create_log_entry() function to generate a log record every 3 seconds and appends it to the specified file in the /var/log/logify/ directory.
When you are finished, save the new changes and make the script executable:
chmod +x logify.sh
Create a directory to store the application logs:
sudo mkdir /var/log/logify
Assign ownership of the directory to the currently logged-in user:
sudo chown -R $USER:$USER /var/log/logify/
Then, run the Bash script in the background:
./logify.sh &
The script will start writing logs to the app.log file. To view the last few log entries, use the tail command:
tail -n 4 /var/log/logify/app.log
{"status": "200", "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Task completed successfully", "pid": 2833, "ssn": "407-01-2433", "timestamp": 1696071877}
{"status": "200", "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Initialized application", "pid": 2833, "ssn": "407-01-2433", "timestamp": 1696071880}
{"status": "200", "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Operation finished", "pid": 2833, "ssn": "407-01-2433", "timestamp": 1696071883}
{"status": "200", "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Initialized application", "pid": 2833, "ssn": "407-01-2433", "timestamp": 1696071886}
Each line in the output represents a log event or record.
With the log entries being generated, the next step is to install Fluent Bit.
Installing Fluent Bit
In this section, you'll install the latest version of Fluent Bit on your Ubuntu 22.04 system. If you're using a different operating system, refer to the official documentation page for specific installation instructions.
Fluent Bit is not available in Ubuntu's default package repositories. To install it, first add the Fluent Bit GPG key:
sudo sh -c 'curl https://packages.fluentbit.io/fluentbit.key | gpg --dearmor > /usr/share/keyrings/fluentbit-keyring.gpg'
Next, check your Ubuntu code name:
lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 22.04.3 LTS
Release: 22.04
Codename: jammy
Then export the codename as an environment variable:
export CODENAME="jammy"
Following that, add the Fluent Bit source list to the sources.list.d directory:
echo "deb [signed-by=/usr/share/keyrings/fluentbit-keyring.gpg] https://packages.fluentbit.io/ubuntu/$CODENAME/ \
$CODENAME main" | sudo tee /etc/apt/sources.list.d/fluentbit.list
apt will search for new sources in the sources.list.d directory. To ensure that apt recognises the Fluent Bit source you just added, update your package list using the following command:
sudo apt update
Then install Fluent Bit:
sudo apt install fluent-bit
Fluent Bit is now installed on your system.
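You can confirm the installation by asking the binary, which the package installs under /opt/fluent-bit/bin, for its version:
/opt/fluent-bit/bin/fluent-bit --version
It should print a version string such as Fluent Bit v2.1.10.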
How Fluent Bit works
Fluent Bit operates as a robust pipeline for handling log data. You can imagine it as a sequence where logs flow through distinct stages, each performing a specific task. Let's break down Fluent Bit's core components and plugins to provide a clearer understanding:
At the beginning of the pipeline, Fluent Bit collects logs from various sources. These logs then pass through a Parser, transforming unstructured data into structured log events. Subsequently, the log event stream encounters the Filter, which can enrich, exclude, or modify the data according to project requirements. After filtration, the logs are temporarily stored in a Buffer, either in memory or the filesystem, ensuring smooth processing. Finally, the Router directs the data to diverse destinations for analysis and storage.
To put this into practice, you can define Fluent Bit's behavior in a configuration file located at /etc/fluent-bit/fluent-bit.conf:
[SERVICE]
    ...

[INPUT]
    ...

[FILTER]
    ...

[OUTPUT]
    ...
Let's look at these components in detail:
- [SERVICE]: contains global settings for the running service.
- [INPUT]: specifies sources of log records for Fluent Bit to collect.
- [FILTER]: applies transformations to log records.
- [OUTPUT]: determines the destination where Fluent Bit sends the processed logs.
Each of these components relies on a plugin to do its work. Here is a brief overview of the plugins available for Fluent Bit.
Fluent Bit input plugins
For the [INPUT] component, the following are some of the input plugins that can come in handy:
- tail: monitors and collects logs from the end of a file, akin to the tail -f command.
- syslog: gathers Syslog logs from a Unix socket server.
- http: captures logs via a REST endpoint.
- opentelemetry: fetches telemetry data from OpenTelemetry sources.
Fluent Bit filter plugins
When you need to transform logs, Fluent Bit provides a range of filter plugins suited for different modifications:
- record_modifier: modifies log records.
- lua: alters log records using Lua scripts.
- grep: matches or excludes log records, similar to the grep command.
- modify: changes log records based on specified conditions or rules.
Fluent Bit output plugins
To dispatch logs to various destinations, Fluent Bit offers versatile output plugins:
- file: writes logs to a specified file.
- amazon_s3: sends logs to Amazon S3.
- http: pushes records to an HTTP endpoint.
- websocket: forwards log records to a WebSocket endpoint.
Now that you have a rough idea of how Fluent Bit works, you can proceed to the next section to start using Fluent Bit.
Getting started with Fluent Bit
In this section, you will configure Fluent Bit to read logs from a file using the tail input plugin and display them in the console.
First, open the Fluent Bit configuration file located at /etc/fluent-bit/fluent-bit.conf using the following command:
sudo nano /etc/fluent-bit/fluent-bit.conf
Clear the existing contents of the file and add the following configuration code:
[SERVICE]
    Flush     1
    Daemon    off
    Log_Level debug

[INPUT]
    Name      tail
    Path      /var/log/logify/app.log
    Tag       filelogs

[OUTPUT]
    Name      stdout
    Match     filelogs
The [SERVICE] section defines global settings for Fluent Bit. It specifies that Fluent Bit should flush every 1 second, run in the foreground, and set the log level to debug.
The [INPUT] section uses the tail plugin to read logs from the specified file at /var/log/logify/app.log. The Tag allows other Fluent Bit components, such as [FILTER] and [OUTPUT], to identify these log records.
The [OUTPUT] component uses the stdout plugin to forward logs to the console. The Match parameter ensures only logs with the filelogs tag are delivered to the console.
After making these changes, save the file.
Next, validate your configuration file for errors:
sudo /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf --dry-run
Fluent Bit v2.1.10
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io
[2023/09/30 11:11:05] [ info] Configuration:
[2023/09/30 11:11:05] [ info] flush time | 1.000000 seconds
[2023/09/30 11:11:05] [ info] grace | 5 seconds
[2023/09/30 11:11:05] [ info] daemon | 0
[2023/09/30 11:11:05] [ info] ___________
[2023/09/30 11:11:05] [ info] inputs:
[2023/09/30 11:11:05] [ info] tail
[2023/09/30 11:11:05] [ info] ___________
[2023/09/30 11:11:05] [ info] filters:
[2023/09/30 11:11:05] [ info] ___________
[2023/09/30 11:11:05] [ info] outputs:
[2023/09/30 11:11:05] [ info] stdout.0
[2023/09/30 11:11:05] [ info] ___________
[2023/09/30 11:11:05] [ info] collectors:
configuration test is successful
If the output displays "configuration test is successful", your configuration file is valid and error-free.
In the logify directory, run the Bash program in the background:
./logify.sh &
Now, start Fluent Bit, specifying the path to the configuration file:
sudo /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
The -c option takes the path to the Fluent Bit configuration file.
When Fluent Bit starts, you should see an output similar to the following:
...
[2023/09/30 11:13:06] [debug] [input:tail:tail.0] scanning path /var/log/logify/app.log
[2023/09/30 11:13:06] [debug] [input:tail:tail.0] inode=255633 with offset=35483 appended as /var/log/logify/app.log
[2023/09/30 11:13:06] [debug] [input:tail:tail.0] scan_glob add(): /var/log/logify/app.log, inode 255633
[2023/09/30 11:13:06] [debug] [input:tail:tail.0] 1 new files found on path '/var/log/logify/app.log'
[2023/09/30 11:13:06] [debug] [stdout:stdout.0] created event channels: read=29 write=30
[2023/09/30 11:13:06] [ info] [sp] stream processor started
[2023/09/30 11:13:06] [debug] [input:tail:tail.0] inode=255633 file=/var/log/logify/app.log promote to TAIL_EVENT
[2023/09/30 11:13:06] [ info] [input:tail:tail.0] inotify_fs_add(): inode=255633 watch_fd=1 name=/var/log/logify/app.log
[2023/09/30 11:13:06] [debug] [input:tail:tail.0] [static files] processed 0b, done
[2023/09/30 11:13:06] [ info] [output:stdout:stdout.0] worker #0 started
[2023/09/30 11:13:09] [debug] [input:tail:tail.0] inode=255633, /var/log/logify/app.log, events: IN_MODIFY
[2023/09/30 11:13:09] [debug] [input chunk] update output instances with new chunk size diff=207, records=1, input=tail.0
[2023/09/30 11:13:09] [debug] [task] created task=0x7f59c2833f80 id=0 OK
[2023/09/30 11:13:09] [debug] [output:stdout:stdout.0] task_id=0 assigned to thread #0
Following that, you will see the log messages appear:
[0] filelogs: [[1696072389.439042696, {}], {"log"=>"{"status": "200", "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Operation finished", "pid": 2833, "ssn": "407-01-2433", "timestamp": 1696072389}"}]
...
[0] filelogs: [[1696072392.449147983, {}], {"log"=>"{"status": "200", "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Initialized application", "pid": 2833, "ssn": "407-01-2433", "timestamp": 1696072392}"}]
Fluent Bit is now displaying the log messages along with additional context. You can exit Fluent Bit by pressing CTRL + C.
Transforming logs with Fluent Bit
When collecting logs with Fluent Bit, processing them to enhance their utility is often necessary. Fluent Bit provides a powerful array of filter plugins designed to transform event streams effectively. In this section, we will explore various essential log transformation tasks:
- Parsing JSON logs.
- Removing unwanted fields.
- Adding new fields.
- Converting Unix timestamps to the ISO format.
- Masking sensitive data.
Parsing JSON logs with Fluent Bit
When working with logs generated in JSON format, it's crucial to parse them accurately. This ensures the data maintains its integrity and adheres to the expected structure. This section will focus on parsing JSON log records as valid JSON to provide a well-defined structure.
To do that, let's examine a log event from the last section in detail:
[0] filelogs: [[1696072392.449147983, {}], {"log"=>"{"status": "200", "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Initialized application", "pid": 2833, "ssn": "407-01-2433", "timestamp": 1696072392}"}]
Upon close inspection, you will see that Fluent Bit wraps the original line in key=value pairs as an escaped string under a log key, so the data lacks a consistent JSON structure. You can create a Parser to parse the logs as JSON in Fluent Bit.
In your text editor, create a parser_json.conf file:
sudo nano /etc/fluent-bit/parser_json.conf
In your parser_json.conf file, add the following code:
[PARSER]
    Name   json_parser
    Format json
The [PARSER] component takes the parser's name and the format in which log events should be parsed, which is json here.
In the Fluent Bit configuration file /etc/fluent-bit/fluent-bit.conf, make the following modifications:
[SERVICE]
    Flush        1
    Daemon       off
    Log_Level    debug
    Parsers_File parser_json.conf

[INPUT]
    Name         tail
    Path         /var/log/logify/app.log
    Parser       json_parser
    Tag          filelogs

[OUTPUT]
    Name         stdout
    format       json
    Match        filelogs
The Parsers_File parameter references the parser_json.conf file, which defines the json_parser for parsing JSON logs.
In the [INPUT] component, you add the Parser parameter with the value json_parser. This specifies that the incoming logs should be parsed using the JSON parser defined in parser_json.conf.
Finally, in the [OUTPUT] section, you set the format parameter to json, ensuring that the logs forwarded to the output are in JSON format.
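If you'd rather emit one JSON object per line instead of a JSON array, the stdout plugin also accepts the json_lines format; this variation is a sketch and isn't used in the rest of the tutorial:
[OUTPUT]
    Name   stdout
    format json_lines
    Match  filelogs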
After making these changes, save the configuration file and restart Fluent Bit using the following command:
sudo /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
[{"date":1696075163.805419,"status":"200","ip":"127.0.0.1","level":30,"emailAddress":"user@mail.com","msg":"Operation finished","pid":2833,"ssn":"407-01-2433","timestamp":1696075163}]
...
[{"date":1696075166.815878,"status":"200","ip":"127.0.0.1","level":30,"emailAddress":"user@mail.com","msg":"Initialized application","pid":2833,"ssn":"407-01-2433","timestamp":1696075166}]
You can now observe that the logs are formatted as JSON.
Now, you can stop Fluent Bit with CTRL + C.
You have learned how to parse incoming JSON logs correctly. Fluent Bit provides various parsers to handle diverse log formats:
- regex: uses regular expressions to parse log events.
- logfmt: parses log records in the logfmt format.
- ltsv: parses log events in the LTSV format.
These parsing methods offer flexibility, allowing Fluent Bit to handle many log formats efficiently.
Now that you can parse JSON logs, you will alter log record attributes in the next section.
Adding and removing fields with Fluent Bit
In this section, you'll customize log records by removing sensitive data and adding new fields. Specifically, you will remove the emailAddress field due to its sensitive nature and add a hostname field to enhance log context.
Open your Fluent Bit configuration file in your text editor:
sudo nano /etc/fluent-bit/fluent-bit.conf
Integrate the following [FILTER] component into your configuration:
[SERVICE]
    Flush        1
    Daemon       off
    Log_Level    debug
    Parsers_File parser_json.conf

[INPUT]
    Name         tail
    Path         /var/log/logify/app.log
    Parser       json_parser
    Tag          filelogs

[FILTER]
    Name         record_modifier
    Match        filelogs
    Remove_key   emailAddress
    Record       hostname ${HOSTNAME}

[OUTPUT]
    Name         stdout
    format       json
    Match        filelogs
In the [FILTER] component, the Name parameter denotes that the record_modifier plugin is being used. To exclude the emailAddress field, you use the Remove_key parameter. The Record parameter also introduces a new field called hostname, which is automatically populated with the system's hostname information.
Save your changes and restart Fluent Bit to apply the modifications:
sudo /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
When Fluent Bit runs, you will observe the log events without the emailAddress field, and the hostname field will be incorporated into the log events:
[{"date":1696075326.2961,"status":"200","ip":"127.0.0.1","level":30,"msg":"Task completed successfully","pid":2833,"ssn":"407-01-2433","timestamp":1696075326,"hostname":"fluent-bit-host"}]
...
[{"date":1696075329.308298,"status":"200","ip":"127.0.0.1","level":30,"msg":"Connected to database","pid":2833,"ssn":"407-01-2433","timestamp":1696075329,"hostname":"fluent-bit-host"}]
...
That takes care of removing fields and adding new fields. In the next section, you will format the timestamps.
Formatting dates with Fluent Bit
The Bash script generates logs with a Unix timestamp, representing the number of seconds elapsed since January 1st, 1970, at 00:00:00 UTC. While these timestamps are precise, they aren't user-friendly. As a result, you'll convert them into the more human-readable ISO format.
At the time of writing, it isn't easy to do this with existing plugins. A better option is to use a Lua script to perform the conversion and reference it in the configuration file using the lua plugin.
In your /etc/fluent-bit/ directory, create the convert_timestamp.lua file:

sudo nano /etc/fluent-bit/convert_timestamp.lua
Next, add the following code to convert the timestamp field from Unix timestamp to ISO format:
function append_converted_timestamp(tag, timestamp, record)
    new_record = record
    new_record["timestamp"] = os.date("!%Y-%m-%dT%TZ", record["timestamp"])
    return 2, timestamp, new_record
end
The append_converted_timestamp() function creates a new record and sets the timestamp field to the value returned by the os.date() method, configured to format dates into the ISO format.
Save and exit your file. Then open the Fluent Bit configuration:
Update the configuration to include the Lua script in a new [FILTER] component:
...
[FILTER]
    Name   lua
    Match  filelogs
    Script convert_timestamp.lua
    Call   append_converted_timestamp

[OUTPUT]
    Name   stdout
    format json
    Match  filelogs
The [FILTER] component uses the lua plugin to modify log records dynamically. The Script parameter holds the path to the Lua script file. Meanwhile, the Call parameter specifies the function within the Lua script that will be invoked to perform the conversion.
Upon saving the file, start Fluent Bit:
sudo /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
Fluent Bit will yield output similar to the following:
[{"date":1696075449.671689,"ip":"127.0.0.1","pid":2833,"ssn":"407-01-2433","timestamp":"2023-09-30T12:04:09Z","hostname":"fluent-bit-host","msg":"Initialized application","level":30,"status":"200"}]
...
[{"date":1696075455.691909,"ip":"127.0.0.1","pid":2833,"ssn":"407-01-2433","timestamp":"2023-09-30T12:04:15Z","hostname":"fluent-bit-host","msg":"Operation finished","level":30,"status":"200"}]
The timestamp field is now in a human-readable ISO format, which makes it easier to see when each log event occurred.
Working with conditional statements in Fluent Bit
While Fluent Bit doesn't natively support conditional statements, you can achieve similar functionality by leveraging the modify plugin. In this section, you'll learn how to check if the status field equals 200 and add an is_successful field set to true when this condition is met.
First, open your /etc/fluent-bit/fluent-bit.conf configuration file:
sudo nano /etc/fluent-bit/fluent-bit.conf
Inside the file, add the following [FILTER] component:
...
[FILTER]
    Name      modify
    Match     filelogs
    Condition Key_Value_Equals status "200"
    Add       is_successful true

[OUTPUT]
    Name      stdout
    format    json
    Match     filelogs
The modify plugin provides the Condition parameter with a Key_Value_Equals option that checks if the status field value equals "200". If the condition is met, the Add option appends an is_successful field to the log event.
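Key_Value_Equals is just one of the conditions the modify plugin supports; others include Key_Exists, Key_Does_Not_Exist, and Key_Value_Does_Not_Equal. As a hypothetical sketch, this filter would flag any record that still carries an emailAddress field (contains_pii is an illustrative field name):
[FILTER]
    Name      modify
    Match     filelogs
    Condition Key_Exists emailAddress
    Add       contains_pii true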
Save the configuration file and start Fluent Bit:
sudo /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
[{"date":1696075554.97921,"hostname":"fluent-bit-host","level":30,"msg":"Operation finished","timestamp":"2023-09-30T12:05:54Z","pid":2833,"ssn":"407-01-2433","ip":"127.0.0.1","status":"200","is_successful":"true"}]
...
[{"date":1696075557.990212,"hostname":"fluent-bit-host","level":30,"msg":"Operation finished","timestamp":"2023-09-30T12:05:57Z","pid":2833,"ssn":"407-01-2433","ip":"127.0.0.1","status":"200","is_successful":"true"}]
You will now see the is_successful field, indicating the outcomes where the status field equals 200.
Masking sensitive data with Fluent Bit
In the earlier steps, you successfully removed the emailAddress field from the log records, yet sensitive fields like IP addresses and Social Security Numbers remain. To keep personal information out of your logs, it's crucial to mask this data, especially when sensitive details are part of a field that can't be removed entirely. While many built-in plugins redact entire field values, a Lua script is the better option here because it lets you selectively mask only specific portions of the data.
Create a redact.lua script with your text editor:

sudo nano /etc/fluent-bit/redact.lua
Add the following code to the redact.lua script:
-- Function to redact SSNs and IP addresses in any field
function redact_sensitive_portions(record)
    local redacted_record = {} -- Initialize a new table for the redacted record
    for key, value in pairs(record) do
        local redacted_value = value -- Initialize redacted_value with the original value
        -- Redact SSNs
        redacted_value, _ = string.gsub(redacted_value, '%d%d%d%-%d%d%-%d%d%d%d', 'REDACTED')
        -- Redact IP addresses
        redacted_value, _ = string.gsub(redacted_value, '%d+%.%d+%.%d+%.%d+', 'REDACTED')
        redacted_record[key] = redacted_value -- Add the redacted value to the new table
    end
    return redacted_record
end

-- Entry point for the Fluent Bit lua filter
function filter(tag, timestamp, record)
    local redacted_record = redact_sensitive_portions(record)
    return 1, timestamp, redacted_record
end
In this code snippet, the redact_sensitive_portions() function iterates through each field, using the string.gsub() method to locate and replace IP addresses and Social Security Numbers with the text "REDACTED".
The filter() function acts as the entry point that Fluent Bit invokes for each record. It calls redact_sensitive_portions() to mask sensitive portions within the log record and returns the modified record to Fluent Bit.
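If you want to sanity-check the patterns outside Fluent Bit, and a standalone lua interpreter is available, you can exercise the same gsub call directly; the extra parentheses discard gsub's second return value (the number of matches):
lua -e 'print((string.gsub("SSN: 407-01-2433 from 127.0.0.1", "%d%d%d%-%d%d%-%d%d%d%d", "REDACTED")))'
SSN: REDACTED from 127.0.0.1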
Now, open your Fluent Bit configuration file:
sudo nano /etc/fluent-bit/fluent-bit.conf
Add a [FILTER] component to reference the redact.lua script:
...
[FILTER]
    Name   lua
    Match  filelogs
    Script redact.lua
    Call   filter

[OUTPUT]
    Name   stdout
    format json
    Match  filelogs
The [FILTER] component references the redact.lua file, and the Call parameter invokes the filter function as the entry point.
When you are done, start Fluent Bit:
sudo /opt/fluent-bit/bin/fluent-bit -c /etc/fluent-bit/fluent-bit.conf
[{"date":1696075807.732392,"ip":"REDACTED","level":"30","msg":"Connected to database","is_successful":"true","timestamp":"2023-09-30T12:10:07Z","hostname":"fluent-bit-host","pid":"2833","ssn":"REDACTED","status":"200"}]
...
[{"date":1696075810.743,"ip":"REDACTED","level":"30","msg":"Initialized application","is_successful":"true","timestamp":"2023-09-30T12:10:10Z","hostname":"fluent-bit-host","pid":"2833","ssn":"REDACTED","status":"200"}]
The IP address and SSN have now been masked. In scenarios where a field contains both an IP address and an SSN, like this:
{..., "privateInfo": "This is a sample message with SSN: 123-45-6789 and IP: 192.168.0.1"}
Fluent Bit will redact the sensitive portions only:
{..., "privateInfo": "This is a sample message with SSN: REDACTED and IP: REDACTED"}
Let's now stop the logify.sh script. To do that, you need the program's process ID:
jobs -l | grep "logify"
[1]+ 2833 Running ./logify.sh &
Then, terminate the program with the kill command, substituting the process ID you found in the previous output:

kill -9 2833
Now that you can mask sensitive portions, you can move on to collecting logs from Docker containers.
Collecting logs from Docker containers and centralizing logs
In this section, you'll containerize the Bash program and use an Nginx hello world Docker image, which has been preconfigured to generate JSON Nginx logs upon each incoming request. Subsequently, you will deploy a Fluent Bit container to collect logs from Bash and Nginx containers and forward them to Better Stack for centralization.
Dockerizing the Bash script
Containerization lets you encapsulate the script and its dependencies, which makes it portable across different environments.
To containerize the Bash program, ensure you are still in the log-processing-stack/logify directory. After that, create a Dockerfile, which will contain instructions on how to build the image:
nano Dockerfile
In your Dockerfile, add the following lines of code:
# Start from the latest Ubuntu base image
FROM ubuntu:latest

# Copy the script into the image and make it executable
COPY . .
RUN chmod +x logify.sh

# Create the log directory and redirect the log file to stdout
RUN mkdir -p /var/log/logify
RUN ln -sf /dev/stdout /var/log/logify/app.log

# Run the script when the container starts
CMD ["./logify.sh"]
In this Dockerfile, you start with the latest version of Ubuntu as the base image. You then copy the script into the container, make it executable, and create the directory where the application will write its logs. Next, you redirect all log data written to /var/log/logify/app.log to the standard output so Docker can capture it. Finally, you specify the command to run when the container starts.
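Before wiring the image into Docker Compose, you can optionally build and smoke-test it on its own; the logify-test tag is arbitrary:
docker build -t logify-test .
docker run --rm logify-test
Because app.log is symlinked to /dev/stdout, the generated entries should print straight to your terminal; stop the container when you're done.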
Now, move into the parent project directory:
cd ..
Create a docker-compose.yml file:
nano docker-compose.yml
Now define the Bash Script and Nginx services:
version: '3'
services:
  logify-script:
    build:
      context: ./logify
    container_name: logify
  nginx:
    image: betterstackcommunity/nginx-helloworld:latest
    container_name: nginx
    ports:
      - '80:80'
In this configuration file, you create the logify-script and nginx services. The logify-script service is built from the ./logify directory context. The nginx service uses the pre-built Nginx image, and you map port 80 on the host to port 80 within the container. Make sure no other application is using port 80 to avoid conflicts.
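If you're not sure whether something is already listening on port 80, you can check before starting the stack; no output means the port is likely free:
sudo ss -ltnp | grep ':80'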
Next, build the Bash program Docker image and create the containers:
docker compose up -d
The -d flag runs the services in the background.
To see if the containers are running, type the following:
docker compose ps
The text "running" will be displayed under the "STATUS" column for both containers resembling this:
NAME COMMAND SERVICE STATUS PORTS
logify "./logify.sh" logify-script running
nginx "/runner.sh nginx" nginx running 0.0.0.0:80->80/tcp, :::80->80/tcp
Now that the containers are running, send HTTP requests to the Nginx service using curl to generate logs:
curl http://localhost:80/?[1-5]
Then, view the logs with the following command:
docker compose logs
logify | {"status": "200", "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Initialized application", "pid": 1, "ssn": "407-01-2433", "timestamp": 1696077723}
logify | {"status": "200", "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Task completed successfully", "pid": 1, "ssn": "407-01-2433", "timestamp": 1696077726}
...
nginx | {"timestamp":"2023-09-30T12:41:53+00:00","pid":"8","remote_addr":"172.18.0.1","remote_user":"","request":"GET /?1 HTTP/1.1","status": "200","body_bytes_sent":"11109","request_time":"0.000","http_referrer":"","http_user_agent":"curl/7.81.0","time_taken_ms":"1696077713.858"}
nginx | {"timestamp":"2023-09-30T12:41:53+00:00","pid":"8","remote_addr":"172.18.0.1","remote_user":"","request":"GET /?2 HTTP/1.1","status": "200","body_bytes_sent":"11109","request_time":"0.000","http_referrer":"","http_user_agent":"curl/7.81.0","time_taken_ms":"1696077713.863"}
You will see the logs from the Nginx and Bash program containers in the output.
With both services running and generating log data, it's time to collect these logs using Fluent Bit.
Defining the Fluent Bit service with Docker Compose
In the Docker Compose configuration, you will now integrate a Fluent Bit service to collect logs from the active containers and centralize them to Better Stack. You will define a Fluent Bit configuration, containerize Fluent Bit, and set up the Fluent Bit service.
Begin by opening the docker-compose.yml file:

nano docker-compose.yml
Add the following code to the docker-compose.yml file to define the Fluent Bit service:
version: '3'
services:
  logify-script:
    build:
      context: ./logify
    container_name: logify
    logging:
      driver: "fluentd"
      options:
        tag: docker.logify
        fluentd-address: 127.0.0.1:24224
    depends_on:
      - fluent-bit
    links:
      - fluent-bit
  nginx:
    image: betterstackcommunity/nginx-helloworld:latest
    container_name: nginx
    ports:
      - '80:80'
    logging:
      driver: "fluentd"
      options:
        tag: docker.nginx
        fluentd-address: 127.0.0.1:24224
    depends_on:
      - fluent-bit
    links:
      - fluent-bit
  fluent-bit:
    image: fluent/fluent-bit:latest
    volumes:
      - ./fluent-bit:/fluent-bit/etc
      - /var/run/docker.sock:/var/run/docker.sock
    command: ["fluent-bit", "-c", "/fluent-bit/etc/fluent-bit.conf"]
    container_name: fluent-bit
    ports:
      - "24224:24224"
In the updated docker-compose.yml configuration, the logify-script and nginx services are linked to the fluent-bit service and depend on it. Both services are configured to use the fluentd logging driver, and the fluentd-address option specifies the address to which Docker will send the logs. Tags are added to each container: the logify-script service is tagged docker.logify, and the nginx service is tagged docker.nginx. These tags will help identify the source of the Docker logs.
The fluent-bit service uses the pre-built fluent/fluent-bit image and mounts a volume for Fluent Bit's configuration file (which you will create shortly). The command parameter runs Fluent Bit with fluent-bit.conf when the container starts. Additionally, port 24224 is exposed to receive logs from the other containers.
Next, create the fluent-bit directory and navigate into it:
mkdir fluent-bit && cd fluent-bit
Following that, create a fluent-bit.conf file with your text editor:
nano fluent-bit.conf
Define the [INPUT] section to listen for logs on port 24224 using the forward plugin:
[INPUT]
    Name   forward
    Listen 0.0.0.0
    Port   24224
The [INPUT] configuration uses the forward plugin to receive logs sent by the services through port 24224.
Next, you will set up the destination to forward the logs. To centralize the logs, you will use Better Stack.
First, create a free Better Stack account. Once you're logged in, visit the Sources section:
Once on the Sources page, click the Connect source button:
Enter a source name (e.g., "Logify logs") and select "Fluent-bit" as the platform:
Once the source is created, copy the Source Token field to the clipboard:
Return to the fluent-bit.conf file and add the [OUTPUT] component at the end of the file to deliver Docker logs to Better Stack. Make sure to update the source token:
[INPUT]
    Name   forward
    Listen 0.0.0.0
    Port   24224

[OUTPUT]
    name        http
    match       docker.logify
    tls         On
    host        in.logs.betterstack.com
    port        443
    uri         /
    header      Authorization Bearer <your_logify_source_token>
    header      Content-Type application/msgpack
    format      msgpack
    retry_limit 5
In the [OUTPUT] component, Fluent Bit matches log entries tagged with docker.logify and forwards them to Better Stack using the http plugin. The tag is set in the docker-compose.yml file for the logify-script service, allowing Fluent Bit to identify the log entries correctly. The <your_logify_source_token> placeholder should be replaced with the source token obtained from Better Stack during source creation.
After adding the configuration, save and exit the file. Return to the project's root directory using the following command:
cd ..
Start the newly configured Fluent Bit service using Docker Compose:
docker compose up -d
Check Better Stack to verify if the log entries are being successfully delivered. You should see the log entries uploading to Better Stack's interface:
For the Nginx logs, follow similar steps. Create a new source for Nginx logs on Better Stack. After creating the source, the interface will look like this:
Obtain the source token and add a second [OUTPUT] component to the fluent-bit.conf file to match and forward the Nginx logs:
...
[OUTPUT]
    name        http
    match       docker.nginx
    tls         On
    host        in.logs.betterstack.com
    port        443
    uri         /
    header      Authorization Bearer <your_nginx_source_token>
    header      Content-Type application/msgpack
    format      msgpack
    retry_limit 5
After making the necessary changes, stop all services using the command:
docker compose down
Start all the services again:
docker compose up -d
Send more requests to the Nginx service:
curl http://localhost:80/?[1-5]
The Nginx logs will be uploaded to Better Stack:
That takes care of centralizing data in Better Stack.
Monitoring Fluent Bit health with Better Stack
Fluent Bit provides a health endpoint that allows you to monitor its health using external tools like Better Stack. These tools periodically send requests to determine if Fluent Bit is functioning correctly.
To enable this endpoint, open the Fluent Bit configuration file:
nano fluent-bit/fluent-bit.conf
Add the following lines at the top of the file to enable Fluent Bit's health endpoint and configure its settings:
[SERVICE]
    HTTP_Server            On
    HTTP_Listen            0.0.0.0
    HTTP_PORT              2020
    Health_Check           On
    HC_Errors_Count        5
    HC_Retry_Failure_Count 5
    HC_Period              5
...
These configurations instruct Fluent Bit to start listening for requests on port 2020 so that external tools can check its health status.
Next, update the docker-compose.yml file to expose the port that hosts the health endpoint:
    container_name: fluent-bit
    ports:
      - "2020:2020"
      - "24224:24224"
Now, start the Fluent Bit service with the updated changes:
docker compose up -d
Verify the health endpoint is functioning:
curl -s http://127.0.0.1:2020/api/v1/health
ok
Next, log in to Better Stack.
On the Monitors page, click the Create monitor button:
Then, select a suitable triggering option in Better Stack along with your preferred notification preferences, and input your server's IP address or domain name followed by the /api/v1/health endpoint on port 2020. After that, click the Create monitor button:
Upon completion, Better Stack will regularly monitor Fluent Bit's health endpoint:
Let's see what happens when Fluent Bit malfunctions. Halt all services using the command:
docker compose stop
After a brief interval, check Better Stack. The status will transition to "Down":
When there is an outage, Better Stack will promptly notify you. An email alert will be dispatched detailing the downtime:
With these tools, you can proactively manage Fluent Bit's health and swiftly respond to operational interruptions.
Final thoughts
In this comprehensive article, you learned how Fluent Bit can be integrated with tools like Docker, Nginx, and Better Stack for managing logs. First, you created a Fluent Bit configuration to read logs from a file and display them in the console. You then employed Fluent Bit to collect logs from multiple Docker containers and centralize them on Better Stack. Finally, you set up a health endpoint to monitor Fluent Bit's health using Better Stack.
You can now effectively manage logs on your system using Fluent Bit. To delve deeper into Fluent Bit's capabilities, consult the documentation. Fluent Bit offers powerful features such as SQL stream processing, which you can explore further here. Additionally, to hone your skills in Docker and Docker Compose, refer to their respective documentation pages: Docker and Docker Compose. To gain insights into Docker logging, consult our comprehensive guide.
Thanks for reading, and happy logging!