How to Collect, Process, and Ship Log Data with Logstash
Logs are invaluable assets, originating from various sources such as applications, containers, databases, and operating systems. When analyzed, they offer crucial insights, especially in diagnosing issues. For their effectiveness, it's essential to centralize them, allowing for in-depth analysis and pattern recognition all in one place. This centralization process involves using a log shipper, a tool designed to gather logs from diverse sources, process them, and then forward them to different destinations.
One powerful log shipper is Logstash, a free and open-source tool created by Elastic and an integral part of the Elastic Stack, formerly known as the ELK stack. With a robust framework and over 200 plugins, Logstash offers unparalleled flexibility. These plugins enable Logstash to support various sources and perform complex manipulations, ensuring log records are well prepared before reaching their final destination.
In this comprehensive guide, you'll use Logstash to collect logs from various sources, process and forward them to multiple destinations. First, you'll use Logstash to collect logs from a file and send them to the console. Building upon that, you will use Logstash to gather logs from multiple Docker containers and centralize the logs. Finally, you'll monitor the health of a Logstash instance to ensure it performs optimally and reliably.
Prerequisites
Before you begin, ensure you have access to a system with a non-root user account with sudo
privileges. For certain parts of this guide that involve collecting logs from Docker containers, you'll need to have Docker and Docker Compose installed. If you're unfamiliar with log shippers, you can gain insights into their advantages by checking out this article.
With the prerequisites in order, create a root project directory named log-processing-stack
. This directory will serve as the core container for your application and its configurations:
mkdir log-processing-stack
Next, navigate into the newly created directory:
cd log-processing-stack
Within the log-processing-stack
directory, create a subdirectory named logify
for the demo application:
mkdir logify
Move into the logify
subdirectory:
cd logify
Now that the necessary directories are in place, you're ready to create the application.
Developing a demo logging application
In this section, you will build a sample logging application using Bash that generates logs at regular intervals and appends them to a file.
Within the logify
directory, create a logify.sh
file using your preferred text editor. For this tutorial, we will use nano
:
nano logify.sh
In your logify.sh
file, add the following code to start generating logs using Bash:
#!/bin/bash
filepath="/var/log/logify/app.log"

create_log_entry() {
  local info_messages=("Connected to database" "Task completed successfully" "Operation finished" "Initialized application")
  local random_message=${info_messages[$RANDOM % ${#info_messages[@]}]}
  local http_status_code=200
  local ip_address="127.0.0.1"
  local emailAddress="user@mail.com"
  local level=30
  local pid=$$
  local ssn="407-01-2433"
  local time=$(date +%s)
  local log='{"status": '$http_status_code', "ip": "'$ip_address'", "level": '$level', "emailAddress": "'$emailAddress'", "msg": "'$random_message'", "pid": '$pid', "ssn": "'$ssn'", "timestamp": '$time'}'
  echo "$log"
}

while true; do
  log_record=$(create_log_entry)
  echo "${log_record}" >> "${filepath}"
  sleep 3
done
The create_log_entry()
function generates log entries in JSON format, containing essential details such as HTTP status codes, severity levels, and random log messages. We deliberately include sensitive fields like the IP address, Social Security Number (SSN), and email address. The reason is to showcase Logstash's ability to remove or redact sensitive data. For comprehensive guidelines on best practices for handling sensitive data in logs, consult our guide.
In the continuous loop, the create_log_entry()
function is invoked every 3 seconds, generating a new log record. These records are then appended to a designated file in the /var/log/logify/
directory.
After you finish writing the code, save your file. To grant the script execution permission, use the following command:
chmod +x logify.sh
Next, create the /var/log/logify
directory, which will serve as the destination for your application logs:
sudo mkdir /var/log/logify
After creating the directory, change its ownership to the currently logged-in user specified in the $USER
environment variable:
sudo chown -R $USER:$USER /var/log/logify/
Next, run the Bash script in the background to start generating the logs:
./logify.sh &
The program will continuously append logs to app.log
. To view recent log entries, use the tail
command:
tail -n 4 /var/log/logify/app.log
This command displays the last four lines of app.log
, allowing you to monitor the real-time log records that your application is generating.
When you run the command, you will see logs looking like this:
{"status": 200, "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Connected to database", "pid": 17089, "ssn": "407-01-2433", "timestamp": 1696150204}
{"status": 200, "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Task completed successfully", "pid": 17089, "ssn": "407-01-2433", "timestamp": 1696150207}
{"status": 200, "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Connected to database", "pid": 17089, "ssn": "407-01-2433", "timestamp": 1696150210}
{"status": 200, "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Connected to database", "pid": 17089, "ssn": "407-01-2433", "timestamp": 1696150213}
In the output, the logs are structured in the JSON format, containing various fields.
With the program actively generating logs, your next step is installing Logstash on your system to process and analyze this data.
Installing Logstash
In this section, you'll install the latest version of Logstash, which is 8.10 at the time of writing, on an Ubuntu 22.04 system. For other systems, visit the official documentation page for instructions.
Logstash is not available in Ubuntu's default package repositories. You'll need to add the Logstash package source list to install it via apt
.
First, import the Logstash public GPG key to apt
:
wget -qO - https://artifacts.elastic.co/GPG-KEY-elasticsearch | sudo gpg --dearmor -o /usr/share/keyrings/elastic-keyring.gpg
Next, install the apt-transport-https
package:
sudo apt-get install apt-transport-https
Then, add the Logstash source list to the sources.list.d
directory:
echo "deb [signed-by=/usr/share/keyrings/elastic-keyring.gpg] https://artifacts.elastic.co/packages/8.x/apt stable main" | sudo tee -a /etc/apt/sources.list.d/elastic-8.x.list
To ensure apt
can read the newly added source, update the package list:
sudo apt-get update
Now, install Logstash:
sudo apt-get install logstash
Logstash has been successfully installed on your system. You're now ready to process logs with Logstash.
How Logstash works
Before diving into Logstash, it's essential to grasp the fundamental concepts:
Logstash is easier to understand when you imagine it as a pipeline. At one end of this pipeline are the inputs, representing the data sources. As log records traverse through the Logstash pipeline, they can be enriched, filtered, or manipulated according to your requirements. Ultimately, when they reach the pipeline's end, Logstash can deliver these logs to configured destinations for storage or analysis.
To create this data processing pipeline, you can configure Logstash using a configuration file.
A typical Logstash configuration file is structured as follows:
input {
  plugin_name {...}
}
filter {
  plugin_name {...}
}
output {
  plugin_name {...}
}
Let's explore the roles of these components:
- input: represents the sources of logs, such as files or HTTP endpoints.
- filter (optional): unifies and transforms log records.
- output: the destination for forwarding the processed logs.
For these inputs, filters, and outputs to fulfill their roles, they rely on plugins. These plugins are the building blocks that empower Logstash, allowing it to achieve a wide array of tasks. Let's explore these plugins to provide you with a clearer understanding of Logstash's capabilities.
Logstash input plugins
For the inputs, Logstash provides input plugins that can collect logs from various sources, such as:
- HTTP: receives log records over HTTP endpoints.
- Beats: collects logs from the Beats framework.
- Redis: gathers log records from a Redis instance.
- Unix: reads log records via a Unix socket.
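For instance, a minimal http input that accepts log records posted over HTTP might look like the sketch below; the port number is an arbitrary choice and not part of this tutorial's setup:

input {
  http {
    # Listen for log events sent as HTTP POST requests
    port => 8080
  }
}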
Logstash filter plugins
When you want to manipulate, enrich, or modify logs, some of the filter plugins here can help you do that:
- JSON: parses JSON logs.
- Grok: parses log data and structures it.
- I18n: removes special characters from your log records.
- Geoip: adds geographical information.
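As an illustration, a grok filter can pull structured fields out of a plain-text line. The pattern and field names in this sketch are illustrative rather than tied to this tutorial's logs:

filter {
  grok {
    # Extract an IP address, HTTP method, and request path from a plain-text line
    match => { "message" => "%{IP:client_ip} %{WORD:method} %{URIPATHPARAM:request_path}" }
  }
}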
Logstash output plugins
After processing data, the following output plugins can be useful:
- WebSocket: forward the logs to a WebSocket endpoint.
- S3: send log records to Amazon Simple Storage Service (Amazon S3).
- Syslog: forward logs to a Syslog server.
- Elasticsearch: deliver log entries to Elasticsearch, which is part of the Elastic stack.
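For example, forwarding events to a local Elasticsearch instance could look like the sketch below; the host and index name are assumptions, not part of this tutorial's setup:

output {
  elasticsearch {
    hosts => ["http://localhost:9200"]
    # Write events to a date-based index
    index => "logify-%{+YYYY.MM.dd}"
  }
}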
Getting started with Logstash
Now that you understand how Logstash operates, you will use it to read log records from a file and display them in the console.
To set up the Logstash pipeline, create a configuration file in the /etc/logstash/conf.d
directory:
sudo nano /etc/logstash/conf.d/logstash.conf
In your logstash.conf
file, add the following lines to instruct Logstash to read logs from a file and forward them to the console:
input {
  file {
    path => "/var/log/logify/app.log"
    start_position => "beginning"
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
The input
component uses the file
plugin to read logs from a file. The path
parameter specifies the location of the file to be read, and the start_position
instructs Logstash to begin reading files from the beginning.
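Note that the file input tracks how far it has read in a sincedb file, so start_position => "beginning" only applies to files Logstash hasn't seen before. While experimenting, a common trick is to point sincedb_path at /dev/null so the file is re-read on every restart; a minimal sketch:

input {
  file {
    path => "/var/log/logify/app.log"
    start_position => "beginning"
    # Don't persist read offsets; re-read the file on each restart (useful for testing)
    sincedb_path => "/dev/null"
  }
}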
The output
component uses the stdout
plugin to display logs in the console. The rubydebug
codec is used for pretty printing.
After adding the code, save the file. To ensure your configuration file has no errors, run the following command:
sudo -u logstash /usr/share/logstash/bin/logstash --path.settings /etc/logstash -t
...
Configuration OK
[2023-10-01T08:53:31,628][INFO ][logstash.runner ] Using config.test_and_exit mode. Config Validation Result: OK. Exiting Logstash
If the output includes "Configuration OK," your configuration file is error-free.
Next, change the ownership of the /usr/share/logstash/data
directory to the logstash
user:
sudo chown -R logstash:logstash /usr/share/logstash/data
Now, start Logstash by passing the path to the configuration file:
sudo -u logstash /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf
When Logstash starts running, you will see output similar to this:
Using bundled JDK: /usr/share/logstash/jdk
WARNING: Could not find logstash.yml which is typically located in $LS_HOME/config or /etc/logstash. You can specify the path using --path.settings. Continuing using the defaults
Could not find log4j2 configuration at path /usr/share/logstash/config/log4j2.properties. Using default config which logs errors to the console
[INFO ] 2023-10-01 08:58:47.142 [main] runner - Starting Logstash {"logstash.version"=>"8.10.2", "jruby.version"=>"jruby 9.4.2.0 (3.1.0) 2023-03-08 90d2913fda OpenJDK 64-Bit Server VM 17.0.8+7 on 17.0.8+7 +indy +jit [x86_64-linux]"}
...
[INFO ] 2023-10-01 08:58:49.053 [[main]-pipeline-manager] javapipeline - Pipeline started {"pipeline.id"=>"main"}
[INFO ] 2023-10-01 08:58:49.059 [[main]<file] observingtail - START, creating Discoverer, Watch with file and sincedb collections
[INFO ] 2023-10-01 08:58:49.064 [Agent thread] agent - Pipelines running {:count=>1, :running_pipelines=>[:main], :non_running_pipelines=>[]}
Once Logstash starts running, the log events will be formatted and displayed neatly in the console:
{
"message" => "{\"status\": 200, \"ip\": \"127.0.0.1\", \"level\": 30, \"emailAddress\": \"user@mail.com\", \"msg\": \"Initialized application\", \"pid\": 17089, \"ssn\": \"407-01-2433\", \"timestamp\": 1696150640}",
"event" => {
"original" => "{\"status\": 200, \"ip\": \"127.0.0.1\", \"level\": 30, \"emailAddress\": \"user@mail.com\", \"msg\": \"Initialized application\", \"pid\": 17089, \"ssn\": \"407-01-2433\", \"timestamp\": 1696150640}"
},
"host" => {
"name" => "logstash-host"
},
"log" => {
"file" => {
"path" => "/var/log/logify/app.log"
}
},
"@timestamp" => 2023-10-01T08:58:49.104572151Z,
"@version" => "1"
}
...
In the output, Logstash has added additional fields, such as host
, file
, and version
, to add more context.
Now that you can observe the formatted logs in the console, you can exit Logstash by pressing CTRL + C
.
In the upcoming section, you will transform these logs before forwarding them to the desired output destination.
Transforming logs with Logstash
In this section, you will enrich, modify fields, and mask sensitive information in your logs to ensure privacy and enhance the usefulness of the log data.
Logstash uses various filter plugins to manipulate log records. Using these plugins, you can perform essential operations such as:
- Parsing JSON logs.
- Removing unwanted fields.
- Adding new fields.
- Masking sensitive data.
Parsing JSON logs with Logstash
Since your application produces logs in JSON format, it is crucial to parse them. Parsing JSON logs is essential because it allows you to retain the benefits of the structured JSON format.
To understand the importance of parsing data, consider the log event output from the previous section:
{
"message" => "{\"status\": 200, \"ip\": \"127.0.0.1\", \"level\": 30, \"emailAddress\": \"user@mail.com\", \"msg\": \"Initialized application\", \"pid\": 17089, \"ssn\": \"407-01-2433\", \"timestamp\": 1696150640}",
"event" => {
"original" => "{\"status\": 200, \"ip\": \"127.0.0.1\", \"level\": 30, \"emailAddress\": \"user@mail.com\", \"msg\": \"Initialized application\", \"pid\": 17089, \"ssn\": \"407-01-2433\", \"timestamp\": 1696150640}"
},
...
}
Upon inspecting the log event, you will notice that the log message is in the string format, and some special characters are escaped with backslashes. To ensure that Logstash can parse these logs as valid JSON, you need to configure a filter in the Logstash configuration file.
Open the Logstash configuration file for editing:
sudo nano /etc/logstash/conf.d/logstash.conf
Add the following code to the configuration file to parse JSON in the message
field:
input {
  file {
    path => "/var/log/logify/app.log"
    start_position => "beginning"
  }
}
filter {
  if [message] =~ /^{.*}$/ {
    json {
      source => "message"
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
In the filter component, a conditional checks whether the message field contains a JSON object using a regex pattern. If the condition is met, the json plugin parses the message field as valid JSON and adds the parsed fields to the log event.
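If you prefer to keep the parsed fields grouped, or to tolerate malformed lines, the json filter also supports target and skip_on_invalid_json options. A small sketch, not part of this tutorial's pipeline; the app target name is arbitrary:

filter {
  json {
    source => "message"
    # Nest the parsed fields under "app" instead of the event root
    target => "app"
    # Silently skip events whose message isn't valid JSON
    skip_on_invalid_json => true
  }
}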
To verify if the message
field is being parsed as JSON, save your file and restart Logstash:
sudo -u logstash /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf
{
"msg" => "Operation finished",
"ssn" => "407-01-2433",
"message" => "{\"status\": 200, \"ip\": \"127.0.0.1\", \"level\": 30, \"emailAddress\": \"user@mail.com\", \"msg\": \"Operation finished\", \"pid\": 17089, \"ssn\": \"407-01-2433\", \"timestamp\": 1696151020}",
"emailAddress" => "user@mail.com",
"@timestamp" => 2023-10-01T09:03:40.987814908Z,
"@version" => "1",
"log" => {
"file" => {
"path" => "/var/log/logify/app.log"
}
},
}
The logs have been successfully parsed as valid JSON, and the resulting fields have been added to the log event as key-value pairs. You can stop Logstash now.
While your application produces logs in JSON format, it's important to note that logs can come in various formats. Logstash provides several filter
plugins that enable parsing of different log events, such as:
- bytes: parses string representations of computer storage sizes.
- csv: parses log records in CSV format.
- kv: parses entries in the key=value syntax.
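For instance, a kv filter could turn a line such as user=alice action=login into separate fields. A minimal sketch; the field names and separators here are illustrative:

filter {
  kv {
    source => "message"
    # Pairs are separated by spaces, keys and values by "="
    field_split => " "
    value_split => "="
  }
}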
These filter plugins are invaluable for parsing logs of diverse formats. In the next section, you will learn how to modify the log entries further to suit your requirements.
Adding and removing fields with Logstash
In this section, you will remove the emailAddress
field, which is considered sensitive information, and eliminate some redundant fields. Additionally, you will add a new field to the log event.
To modify the log event, open the Logstash configuration file:
sudo nano /etc/logstash/conf.d/logstash.conf
Modify the filter
section as follows:
...
filter {
  if [message] =~ /^{.*}$/ {
    json {
      source => "message"
    }
  }
  mutate {
    remove_field => ["event", "message", "emailAddress"]
    add_field => { "env" => "development" }
  }
}
...
The mutate
plugin manipulates log entries. The remove_field
option accepts a list of fields to remove. Since the JSON parsing added all the fields to the log event, you no longer need the event
and message
fields. So you remove them.
To ensure data privacy, you also remove the emailAddress
field. The add_field
option adds a new field called env
with the value "development".
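Beyond removing and adding fields, mutate can also rename fields or convert their types. A small sketch using this tutorial's field names, though it is not part of the pipeline you are building here; the new field name is arbitrary:

filter {
  mutate {
    # Rename "msg" to a more descriptive name
    rename => { "msg" => "message_text" }
    # Ensure "status" is stored as an integer
    convert => { "status" => "integer" }
  }
}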
Save the file and restart Logstash:
sudo -u logstash /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf
The resulting log event will look like this:
{
"log" => {
"file" => {
"path" => "/var/log/logify/app.log"
}
},
"@version" => "1",
"timestamp" => 1696151284,
"@timestamp" => 2023-10-01T09:08:05.248324999Z,
"env" => "development",
"msg" => "Connected to database",
"ip" => "127.0.0.1",
"ssn" => "407-01-2433",
"level" => 30,
"host" => {
"name" => "logstash-host"
},
"status" => 200,
"pid" => 17089
}
...
Removing the event
, message
, and emailAddress
fields reduces the noise in the log event. The logs now contain only the essential information.
In the next section, you will add fields to the log event based on conditional statements.
Working with conditional statements in Logstash
In this section, you will write a conditional statement that checks if the status
field equals 200
. If true, Logstash will add an is_successful
field with the value true
; otherwise, it will be set to false
.
Open the Logstash configuration file:
sudo nano /etc/logstash/conf.d/logstash.conf
To create a conditional statement, add the following code:
...
filter {
  if [message] =~ /^{.*}$/ {
    json {
      source => "message"
    }
  }
  mutate {
    remove_field => ["event", "message", "emailAddress"]
    add_field => { "env" => "development" }
  }
  # Add the 'is_successful' field based on the 'status' field
  if [status] == 200 {
    mutate {
      add_field => { "is_successful" => "true" }
    }
  } else {
    mutate {
      add_field => { "is_successful" => "false" }
    }
  }
}
...
In the provided code, a conditional statement is implemented to check if the status
field equals the value 200
. If true, the is_successful
field is set to true
; otherwise, it is set to false
.
Save and exit the configuration file. Restart Logstash with the updated configuration:
sudo -u logstash /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf
The resulting log event will include the is_successful
field, indicating whether the operation was successful:
{
"@version" => "1",
"ip" => "127.0.0.1",
"ssn" => "407-01-2433",
"msg" => "Task completed successfully",
"pid" => 17089,
"level" => 30,
"@timestamp" => 2023-10-01T09:57:54.025875098Z,
"log" => {
"file" => {
"path" => "/var/log/logify/app.log"
}
},
"is_successful" => "true",
"host" => {
"name" => "logstash-host"
},
"status" => 200,
"timestamp" => 1696154270,
"env" => "development"
}
...
The is_successful
field indicates the operation's success in the log event. The field would be set to false
if the status code differed.
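Conditionals can also chain with else if and compare against ranges. As a hedged sketch, the following derives a hypothetical severity field from the status code; neither the field nor the thresholds are part of this tutorial's pipeline:

filter {
  if [status] >= 500 {
    mutate { add_field => { "severity" => "error" } }
  } else if [status] >= 400 {
    mutate { add_field => { "severity" => "warn" } }
  } else {
    mutate { add_field => { "severity" => "info" } }
  }
}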
Redacting sensitive data with Logstash
In the previous section, you removed the emailAddress field from the log event. However, sensitive fields such as the IP address and Social Security Number (SSN) remain. To protect personal information, especially when it is embedded in strings you can't simply drop, it's crucial to mask such data.
To redact the IP address and SSN, open the Logstash configuration file:
sudo nano /etc/logstash/conf.d/logstash.conf
In the configuration file, add the code below to mask sensitive portions:
...
input {
  file {
    path => "/var/log/logify/app.log"
    start_position => "beginning"
  }
}
filter {
  # Redact SSNs and IP addresses
  mutate {
    gsub => [ "message", "(\d{3}-\d{2}-\d{4})", "REDACTED" ]
    gsub => [ "message", "\b(?:\d{1,3}\.){3}\d{1,3}\b", "REDACTED" ]
  }
  # Parse JSON if the message field matches the JSON pattern
  if [message] =~ /^{.*}$/ {
    json {
      source => "message"
    }
  }
  mutate {
    remove_field => ["event", "message", "emailAddress"]
    add_field => { "env" => "development" }
  }
  # Add the 'is_successful' field based on the 'status' field
  if [status] == 200 {
    mutate {
      add_field => { "is_successful" => "true" }
    }
  } else {
    mutate {
      add_field => { "is_successful" => "false" }
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}
...
In this code snippet, the mutate plugin is used with the gsub option. It takes the message field, applies regular expressions to find sensitive portions, and replaces them with the text "REDACTED". The first gsub pair replaces SSNs, and the second replaces IP addresses.
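If your logs also embed email addresses inside message strings, you could add another gsub pair to the same mutate block. A rough sketch; the regex is a simple approximation of email addresses, not an exhaustive one:

mutate {
  # Redact anything that looks like an email address
  gsub => [ "message", "[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}", "REDACTED" ]
}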
Save and exit the configuration file. Restart Logstash with the updated configuration:
sudo -u logstash /usr/share/logstash/bin/logstash -f /etc/logstash/conf.d/logstash.conf
{
"@timestamp" => 2023-10-01T09:59:45.295997458Z,
"msg" => "Initialized application",
"timestamp" => 1696154385,
"level" => 30,
"@version" => "1",
"pid" => 17089,
"log" => {
"file" => {
"path" => "/var/log/logify/app.log"
}
},
"is_successful" => "true",
"ssn" => "REDACTED",
"ip" => "REDACTED",
"host" => {
"name" => "logstash-host"
},
"status" => 200,
"env" => "development"
}
...
You will notice that sensitive information such as SSN and IP addresses have been successfully redacted from the log events.
Masking sensitive data is crucial, especially when dealing with fields like this:
{..., "privateInfo": "This is a sample message with SSN: 123-45-6789 and IP: 192.168.0.1"}
The masking ensures that only sensitive portions are redacted, preserving the integrity of the rest of the message:
{..., "privateInfo": "This is a sample message with SSN: REDACTED and IP: REDACTED"}
Now that you can mask data, you can stop Logstash, and the logify.sh
script. To stop the bash program, obtain the process ID:
jobs -l | grep "logify"
[1]+ 23750 Running ./logify.sh &
Pass the process ID to the kill command to terminate the process, substituting the PID from your own output:
kill -9 23750
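Alternatively, if you'd rather not look up the PID manually, pkill can match the script by name, assuming no other running process matches the pattern:

pkill -f logify.sh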
With the script stopped, you can move on to collecting logs from Docker containers.
Collecting logs from Docker containers and centralizing logs
In this section, you will containerize the Bash program and leverage the Nginx hello world Docker image, preconfigured to produce JSON Nginx logs every time it receives a request. Logstash will collect logs from the Bash program and the Nginx containers and forward them to Better Stack for centralization.
Dockerizing the Bash script
First, you will containerize the Bash program responsible for generating log data. Containerization offers several benefits, including encapsulating the script and its dependencies and ensuring portability across various environments.
Make sure you are in the log-processing-stack/logify
directory. Then, create a Dockerfile
:
nano Dockerfile
Inside your Dockerfile
, include the following instructions for creating a Docker image for your Bash script:
FROM ubuntu:latest
COPY . .
RUN chmod +x logify.sh
RUN mkdir -p /var/log/logify
RUN ln -sf /dev/stdout /var/log/logify/app.log
CMD ["./logify.sh"]
In this Dockerfile
, you begin with the latest Ubuntu image as the base. You then copy the program file, change permissions to make it executable, create a directory to store log files, and redirect all data written to /var/log/logify/app.log
to the standard output. This redirection lets you view the container logs using the docker logs
command. Finally, you specify the command to run when the Docker container starts.
Save and exit the file. Change back to the parent project directory:
cd ..
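Optionally, you can confirm the image builds before wiring it into Docker Compose by pointing docker build at the logify directory; the logify-test tag is an arbitrary name used only for this check:

docker build -t logify-test ./logify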
In your editor, create a docker-compose.yml
file:
nano docker-compose.yml
Add the following code to define the Bash program and Nginx services:
version: '3'
services:
  logify-script:
    build:
      context: ./logify
    container_name: logify
  nginx:
    image: betterstackcommunity/nginx-helloworld:latest
    container_name: nginx
    ports:
      - '80:80'
In this configuration file, you define two services: logify-script
and nginx
. The logify-script
service is built using the ./logify
directory context. The nginx
service uses a pre-built Nginx image. You then map port 80
on the host to port 80
within the container. Ensure no other services are running on port 80
on the host to avoid port conflicts.
After defining the services, build the Docker images and create the containers:
docker compose up -d
The -d
option starts the services in the background.
Check the container status to verify that they are running:
docker compose ps
You will see "running" status under the "STATUS" column for the two containers:
NAME COMMAND SERVICE STATUS PORTS
logify "./logify.sh" logify-script running
nginx "/runner.sh nginx" nginx running 0.0.0.0:80->80/tcp, :::80->80/tcp
Now that the containers are running, send five HTTP requests to the Nginx service using curl
:
curl http://localhost:80/?[1-5]
View all the logs generated by the running containers with:
docker compose logs
You will see logs similar to the following output, representing the data generated by both the Nginx service and the Bash program:
nginx | {"timestamp": "2023-10-01T10:07:13+00:00", "pid": "7", "remote_addr": "172.18.0.1", "remote_user":" ", "request": "GET /?1 HTTP/1.1", "status": "200", "body_bytes_sent": "11109", "request_time": "0.000", "http_referrer":" ", "http_user_agent": "curl/7.81.0", "time_taken_ms": "1696154833.915"}
...
logify | {"status": 200, "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Operation finished", "pid": 1, "ssn": "407-01-2433", "timestamp": 1696154843}
This output displays all the logs generated by both services.
With the Bash program and Nginx service containers running and generating data, you can now move on to collecting these logs with Logstash.
Defining the Logstash service with Docker Compose
In this section, you will define a Logstash service in the Docker Compose setup to gather logs from the existing containers and deliver them to Better Stack. The process involves creating a Logstash configuration file and deploying the Logstash service.
Open the docker-compose.yml
file again:
nano docker-compose.yml
Update the file with the following code to define the Logstash service:
version: '3'
services:
  logify-script:
    build:
      context: ./logify
    container_name: logify
    logging:
      driver: gelf
      options:
        gelf-address: "udp://127.0.0.1:5000"
        tag: docker.logify
  nginx:
    image: betterstackcommunity/nginx-helloworld:latest
    container_name: nginx
    ports:
      - '80:80'
    logging:
      driver: gelf
      options:
        gelf-address: "udp://127.0.0.1:5000"
        tag: docker.nginx
  logstash:
    image: docker.elastic.co/logstash/logstash:8.10.2
    container_name: logstash
    volumes:
      - ./logstash/config/logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    ports:
      - "5000:5000/udp"
The logify-script
and nginx
services are configured to use the gelf
logging driver to send logs over the network to 127.0.0.1:5000
using the UDP protocol. Logstash will run a gelf
input on port 5000
to receive log events. You then add the docker.logify
and docker.nginx
tags to distinguish the log events originating from different services.
Additionally, you define a logstash
service using an official Logstash image. It incorporates a volume mapping of Logstash's configuration file logstash.conf
, which you will create shortly. The service is configured to expose port 5000
on the UDP protocol to receive log events from the services.
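For reference, the same gelf logging configuration can be expressed with plain docker run flags. A quick sketch using a throwaway container; the alpine image and docker.test tag are arbitrary choices:

docker run --rm \
  --log-driver gelf \
  --log-opt gelf-address=udp://127.0.0.1:5000 \
  --log-opt tag=docker.test \
  alpine echo "hello from gelf"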
Next, create the logstash/config
directory to store the configuration file:
mkdir -p logstash/config
Change into the directory:
cd logstash/config
Afterward, create the logstash.conf
configuration file:
nano logstash.conf
Add the input component using gelf
:
input {
  gelf {
    port => 5000
  }
}
This specifies that Logstash should use gelf
to listen to events on port 5000
.
Now, you need to set up the destination to send these logs for centralization. You will use Better Stack for log management.
Before defining the output component, create a free Better Stack account. Once you've logged in, navigate to the Sources section:
Once you are on the Sources page, click the Connect source button:
Next, provide your source a name of your choosing and select "Logstash" as the platform:
After creating the source, copy the Source Token value to the clipboard:
After copying the source token, go back to the logstash.conf
file and add the filter
component to match and assign tags:
input {
  gelf {
    port => 5000
  }
}
filter {
  if [tag] == "docker.logify" {
    mutate { add_tag => "docker_logify" }
  }
  if [tag] == "docker.nginx" {
    mutate { add_tag => "docker_nginx" }
  }
}
In this configuration, if the tag
field equals docker.logify
, Logstash adds the docker_logify
tag. Similarly, if the tag field equals docker.nginx
, Logstash adds the docker_nginx
tag. The name of the tag you choose doesn't matter; just ensure it is consistent.
Next, add the output
component to forward the logs with the docker_logify
tag to Better Stack:
...
output {
  if "docker_logify" in [tags] {
    http {
      url => "https://in.logs.betterstack.com/"
      http_method => "post"
      headers => {
        "Authorization" => "Bearer <your_logify_source_token>"
      }
      format => "json"
    }
  }
}
Save and exit the configuration file.
Return to the project root directory:
cd ../..
Start the newly created Logstash service:
docker compose up -d
After a few seconds, visit Better Stack to confirm that Logstash is forwarding the logs:
The Bash program logs will be forwarded to Better Stack.
To forward Nginx logs, create a second source by following the same steps you used to create the first one.
After creating the sources, the Better Stack interface will look like this:
Now, add the following output
to deliver Nginx logs to Better Stack, ensuring you update the source token accordingly:
...
output {
  if "docker_logify" in [tags] {
    http {
      url => "https://in.logs.betterstack.com/"
      http_method => "post"
      headers => {
        "Authorization" => "Bearer <your_logify_source_token>"
      }
      format => "json"
    }
  }
  if "docker_nginx" in [tags] {
    http {
      url => "https://in.logs.betterstack.com/"
      http_method => "post"
      headers => {
        "Authorization" => "Bearer <your_nginx_source_token>"
      }
      format => "json"
    }
  }
}
If the tag equals docker_nginx
, Logstash sends the logs to Better Stack's Nginx source. When you save the file, run the following command:
docker compose up -d
Send more requests to the Nginx service:
curl http://localhost:80/?[1-5]
The Nginx logs will now be uploaded to Better Stack:
Monitoring Logstash health with Better Stack
Logstash offers a monitoring API that starts automatically every time you run it. To track whether Logstash is up or down, you can add its health endpoint to Better Stack, which will periodically check that it responds.
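Beyond the root endpoint, the monitoring API also exposes node statistics, including per-pipeline event counts. Once port 9600 is reachable (as configured below), you can query it like this:

curl -XGET 'localhost:9600/_node/stats/pipelines?pretty'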
First, update the docker-compose.yml
file to expose and map the port for the Logstash monitoring API:
    container_name: logstash
    volumes:
      - ./logstash/config/logstash.conf:/usr/share/logstash/pipeline/logstash.conf
    ports:
      - "9600:9600"
      - "5000:5000/udp"
When you're finished, stop and remove all the services:
docker compose down
Then start all the services:
docker compose up -d
Verify that the Logstash endpoint works:
curl -XGET 'localhost:9600/?pretty'
{
"host" : "7102e1b6ba5b",
"version" : "8.10.2",
"http_address" : "0.0.0.0:9600",
"id" : "dc30a5ab-618f-4b23-99b4-f8d64a8fea6c",
"name" : "7102e1b6ba5b",
"ephemeral_id" : "fd94098a-7bf3-4ca4-80d1-bbad59d6f574",
"status" : "green",
"snapshot" : false,
"pipeline" : {
"workers" : 2,
"batch_size" : 125,
"batch_delay" : 50
},
"build_date" : "2023-09-18T15:58:34+00:00",
"build_sha" : "cc67511c41a1531b7d563a04fbcf9782ae6f9f98",
"build_snapshot" : false
}
Now, log in to Better Stack.
On the Monitors page, click the Create monitor button:
Next, enter and check the relevant information and click the Create monitor button:
Choose your preferred way for Better Stack to trigger an alert, provide your server's IP address or domain name on port 9600, and finally select how you prefer to be notified.
Upon completing the configuration, Better Stack will initiate monitoring the Logstash health endpoint and start providing performance statistics:
To see what happens when Logstash is down, stop all the services with:
docker compose stop
When you return to Better Stack, the status will be updated to "Down" after a few moments, since the endpoint is no longer reachable:
If you configured Better Stack to send email alerts, you will receive a notification email similar to this:
With this, you can proactively manage Logstash's health and address any issues promptly.
Final thoughts
In this article, you explored the comprehensive process of using Logstash to collect, process, and forward logs, integrating it seamlessly with Docker, Nginx, and Better Stack for efficient log management. You should feel comfortable incorporating Logstash into your projects.
As a next step, visit the Logstash documentation to explore more features. If you wish to enhance your knowledge of Docker and Docker Compose, consult their respective documentation pages: Docker and Docker Compose. Additionally, for a comprehensive understanding of Docker logging mechanisms, check out this guide.
If you are interested in exploring Logstash alternatives, consider looking into various log shippers.
Thanks for reading, and happy logging!