How to Collect, Process, and Ship Log Data with Filebeat

Stanley Ulili
Updated on October 17, 2023

Managing logs has become necessary in the ever-evolving landscape of modern computing. Operating systems, applications, and databases produce logs, which are invaluable for understanding system behavior, troubleshooting issues, and ensuring uninterrupted operations. To simplify the complex task of handling these logs, it becomes crucial to centralize the logs. This involves using a log shipper to collect and forward the logs to a central location.

Enter Filebeat, a powerful log shipper designed to streamline collecting, processing, and forwarding logs from diverse sources to various destinations. Developed with efficiency in mind, Filebeat ensures that managing logs is seamless and reliable. Its lightweight nature and ability to handle significant volumes of data make it a preferred choice among developers and system administrators.

In this comprehensive guide, you will explore the capabilities of Filebeat in depth. Starting with the basics, you'll set up Filebeat to collect logs from various sources. You'll then delve into the intricacies of processing these logs efficiently, gathering logs from Docker containers and forwarding them to different destinations for analysis and monitoring.

Prerequisites

Before you begin, ensure you have access to a system with a non-root user account with sudo privileges. Additionally, if you plan to follow along with the later sections where Filebeat collects logs from Docker containers, ensure that you have Docker and Docker Compose installed on your system. If you're new to the concept of log shippers and their significance, take a moment to explore their advantages by reading this informative article.

With these prerequisites in place, create a dedicated project directory:

 
mkdir log-processing-stack

Navigate to the newly created directory:

 
cd log-processing-stack

Create a subdirectory for the demo application and move into the directory:

 
mkdir logify && cd logify

With these steps completed, you can create the demo logging application in the next section.

Developing a demo logging application

In this section, you'll build a basic logging application using the Bash scripting language. The application will generate logs at regular intervals, simulating a real-world scenario where applications produce log data.

In the logify directory, create a logify.sh file using your preferred text editor:

 
nano logify.sh

In your logify.sh file, add the following code to generate logs:

log-processing-stack/logify/logify.sh
#!/bin/bash
filepath="/var/log/logify/app.log"

create_log_entry() {
    local info_messages=("Connected to database" "Task completed successfully" "Operation finished" "Initialized application")
    local random_message=${info_messages[$RANDOM % ${#info_messages[@]}]}
    local http_status_code=200
    local ip_address="127.0.0.1"
    local emailAddress="user@mail.com"
    local level=30
    local pid=$$
    local ssn="407-01-2433"
    local time=$(date +%s)
    local log='{"status": '$http_status_code', "ip": "'$ip_address'", "level": '$level', "emailAddress": "'$emailAddress'", "msg": "'$random_message'", "pid": '$pid', "ssn": "'$ssn'", "timestamp": '$time'}'
    echo "$log"
}

while true; do
    log_record=$(create_log_entry)
    echo "${log_record}" >> "${filepath}"
    sleep 3
done

The create_log_entry() function generates log records in JSON format, with fields such as the severity level, message, HTTP status code, process ID, and timestamp. It also includes sensitive fields, namely an email address, a Social Security Number (SSN), and an IP address, which are deliberately included to demonstrate Filebeat's ability to mask sensitive data in fields.

Next, the program enters an infinite loop that repeatedly invokes the create_log_entry() function and writes the logs to a specified file in the /var/log/logify directory.
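Because the script assembles each JSON record by concatenating strings, a message that contains a double quote or a backslash would produce an invalid record. The fixed messages above never trigger this, but a small helper can guard against it. The following is a minimal sketch: json_escape is a hypothetical helper that is not part of the original script, and it only handles backslashes and double quotes, not control characters.

```bash
#!/bin/bash
# Hypothetical helper (not in the original script): escape the two characters
# most likely to break a hand-built JSON string.
json_escape() {
    local s=$1
    s=${s//\\/\\\\}   # backslash becomes two backslashes (must run first)
    s=${s//\"/\\\"}   # double quote becomes backslash + double quote
    printf '%s' "$s"
}

# Example: escape a message before embedding it in a log record
msg=$(json_escape 'Task "42" completed')
echo "{\"msg\": \"$msg\"}"
```

In logify.sh, the same call could wrap $random_message before it is placed into the log variable.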

After you finish adding the code, save the changes and make the script executable:

 
chmod +x logify.sh

Afterward, create the /var/log/logify directory to store your application logs:

 
sudo mkdir /var/log/logify

Next, assign ownership of the /var/log/logify directory to the currently logged-in user using the $USER environment variable:

 
sudo chown -R $USER:$USER /var/log/logify/

Run the logify.sh script in the background:

 
./logify.sh &

The & symbol at the end of the command instructs the script to run in the background, allowing you to continue using the terminal for other tasks while the logging application runs independently.

When the program starts, it will display output that looks like this:

Output
[1] 91773

Here, 91773 represents the process ID, which can be used to terminate the script later if needed.

To view the contents of the app.log file, you can use the tail command:

 
tail -n 4 /var/log/logify/app.log

This command displays the last 4 log entries in the app.log file in JSON format:

Output
{"status": 200, "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Task completed successfully", "pid": 2764, "ssn": "407-01-2433", "timestamp": 1697281092}
{"status": 200, "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Initialized application", "pid": 2764, "ssn": "407-01-2433", "timestamp": 1697281095}
{"status": 200, "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Initialized application", "pid": 2764, "ssn": "407-01-2433", "timestamp": 1697281098}
{"status": 200, "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Connected to database", "pid": 2764, "ssn": "407-01-2433", "timestamp": 1697281101}

You have now successfully created a logging application that generates sample log entries.

Installing Filebeat

Let's proceed with installing Filebeat. The instructions covered in this section are specific to Ubuntu 22.04. If you use a different operating system, consult the official Filebeat documentation for installation instructions.

To begin, download the Filebeat .deb package into your home directory and install it using the dpkg command:

 
curl -L -o ~/filebeat-8.10.2-amd64.deb https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-8.10.2-amd64.deb
 
sudo dpkg -i ~/filebeat-8.10.2-amd64.deb

After installation, verify that Filebeat has been installed successfully:

 
filebeat version

If the installation is successful, the output will resemble the following:

Output
filebeat version 8.10.2 (amd64), libbeat 8.10.2 [480bccf4f0423099bb2c0e672a44c54ecd7a805e built 2023-09-18 18:09:06 +0000 UTC]

With Filebeat in place, you can proceed to learn how it works.

How Filebeat works

Before you start using Filebeat, it's crucial to understand how it works. In this section, we'll explore its essential components and processes, ensuring you have a solid foundation before diving into practical usage:

Diagram showing how filebeat works

Understanding how Filebeat works mainly involves familiarizing yourself with the following components:

  • Harvesters: harvesters are responsible for reading the contents of a file line by line. A harvester is initiated for each file when Filebeat is configured to monitor specific log files. These harvesters not only read the log data but also manage opening and closing files. By reading files incrementally, line by line, harvesters ensure that newly appended log data is efficiently collected and forwarded for processing.

  • Inputs: inputs serve as the bridge between harvesters and the data sources. They are responsible for managing the harvesters and locating all the sources from which Filebeat needs to read log data. Inputs can be configured for various sources, such as log files, containers, or system logs. Users can specify which files or locations Filebeat should monitor by defining inputs.

After Filebeat reads the log data, the log events are transformed or enriched with additional data and then sent to the specified destinations.

To put this into practice, you can specify this behavior in the /etc/filebeat/filebeat.yml configuration file:

 
filebeat.inputs:
  . . .
processors:
  . . .
output.plugin_name:
   . . .

Let's now look at each section in detail:

  • filebeat.inputs: the input sources that a Filebeat instance should monitor.
  • processors: enrich, modify, or filter data before it's sent to the output.
  • output.plugin_name: the output destination where Filebeat should forward the log data.

Each of these directives requires you to specify a plugin that carries out its respective task.

Now, let's explore some inputs, processors, and outputs that can be used with Filebeat.

Filebeat input plugins

Filebeat provides a range of input plugins, each tailored to collect log data from a specific source, such as log files, containers, or system logs.

Filebeat output plugins

Filebeat provides a variety of output plugins, enabling you to send your collected log data to diverse destinations:

  • File: writes log events to files.
  • Elasticsearch: enables Filebeat to forward logs to Elasticsearch using its HTTP API.
  • Kafka: delivers log records to Apache Kafka.
  • Logstash: sends logs directly to Logstash.

Filebeat modules plugins

Filebeat streamlines log processing through its modules, which provide pre-configured setups for specific log formats. These modules let you ingest, parse, and enrich log data without extensive manual configuration.

Getting started with Filebeat

Now that you understand the workings of Filebeat, let's configure it to read log entries from a file and display them on the console.

To start, open the Filebeat configuration file located at /etc/filebeat/filebeat.yml:

 
sudo nano /etc/filebeat/filebeat.yml 

Next, clear the existing contents of the file and replace them with the following code:

/etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  paths:
    - /var/log/logify/app.log

output.console:
  pretty: true

In the filebeat.inputs section, you specify that Filebeat should read logs from a file using the log input. The paths parameter lists the files that Filebeat will monitor, here only /var/log/logify/app.log.

The output.console section sends the collected log data to the console. The pretty: true parameter ensures that log entries are presented in a readable and well-structured format when shown on the console.

Once you've added these configurations, save the file.

Before executing Filebeat, it's essential to verify the configuration file syntax to identify and rectify any errors:

 
sudo filebeat test config

If the configuration file is correct, you should see the following output:

Output
Config OK

Now, proceed to run Filebeat:

 
sudo filebeat

When Filebeat runs, it automatically picks up the configuration file in the /etc/filebeat directory. If your configuration file is in a different location, you can specify the path like this:

 
sudo filebeat -e -c </path/to/filebeat.yml>

As Filebeat starts running, it will display log entries similar to the following:

Output
{
  "@timestamp": "2023-10-14T11:08:20.906Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.10.2"
  },
  "agent": {
    "version": "8.10.2",
    "ephemeral_id": "de914ac1-d6e4-48e6-bf10-e634fc784a15",
    "id": "d6e2a5a6-f532-4453-9750-75ac5ff39d90",
    "name": "filebeat-host",
    "type": "filebeat"
  },
  "log": {
    "offset": 36073,
    "file": {
      "path": "/var/log/logify/app.log"
    }
  },
  "message": "{\"status\": 200, \"ip\": \"127.0.0.1\", \"level\": 30, \"emailAddress\": \"user@mail.com\", \"msg\": \"Task completed successfully\", \"pid\": 2764, \"ssn\": \"407-01-2433\", \"timestamp\": 1697281700}",
  "input": {
    "type": "log"
  },
  "ecs": {
    "version": "8.0.0"
  },
  "host": {
    "name": "filebeat-host"
  }
}
...

Filebeat now displays log messages in the console. The log events from the Bash script are under the message field, and Filebeat has added additional fields to provide context. You can now stop Filebeat by pressing CTRL + C.

Having successfully configured Filebeat to read and forward logs to the console, the next section will focus on data transformation.

Transforming logs with Filebeat

When Filebeat collects data, you can process it before sending it to the output. You can enrich it with new fields, parse the data, and remove or redact sensitive fields to ensure data privacy.

In this section, you'll transform the logs in the following ways:

  • Parsing JSON logs.
  • Removing unwanted fields.
  • Adding new fields.
  • Masking sensitive data.

Parsing JSON logs with Filebeat

As the demo logging application generates logs in JSON format, it's essential to parse them correctly for structured analysis.

Let's examine an example log event from the previous section:

Output
{
  ...
  "message": "{\"status\": 200, \"ip\": \"127.0.0.1\", \"level\": 30, \"emailAddress\": \"user@mail.com\", \"msg\": \"Task completed successfully\", \"pid\": 2764, \"ssn\": \"407-01-2433\", \"timestamp\": 1697281700}",
  "input": {
    "type": "log"
  },
  ...
}

In the message field, double quotes surround the entire log event, and the quotes inside it are escaped with backslashes. Filebeat has treated the original log line as a plain string rather than as structured JSON, so its properties are not available as individual fields on the event.

To parse the log event as valid JSON, open the Filebeat configuration file:

 
sudo nano /etc/filebeat/filebeat.yml 

Then, update the file with the following lines of code:

/etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  paths:
    - /var/log/logify/app.log

processors:
  - decode_json_fields:
      fields: ["message"]
      target: ""

output.console:
  pretty: true

In the snippet above, you configure a decode_json_fields processor to decode JSON-encoded data in each log entry's message field and attach it to the log event.

Save and exit your file. Rerun Filebeat with the following command:

 
sudo filebeat
Output
{
  "@timestamp": "2023-10-14T11:34:46.027Z",
  "@metadata": {
    ...
  },
  "host": {
    "name": "filebeat-host"
  },
  "ip": "127.0.0.1",
  "msg": "Initialized application",
  "ssn": "407-01-2433",
  "timestamp": 1697283285,
  "status": 200,
  "log": {
    "offset": 127921,
    "file": {
      "path": "/var/log/logify/app.log"
    }
  },
  "input": {
    "type": "log"
  },
  "pid": 2764,
  "ecs": {
    "version": "8.0.0"
  },
  "emailAddress": "user@mail.com",
  "level": 30,
  "message": "{\"status\": 200, \"ip\": \"127.0.0.1\", \"level\": 30, \"emailAddress\": \"user@mail.com\", \"msg\": \"Initialized application\", \"pid\": 2764, \"ssn\": \"407-01-2433\", \"timestamp\": 1697283285}",
  "agent": {
   ...
  }
}
...

In the output, you will see that all properties in the message field, such as msg, ip, etc., have been added to the log event.

Now that you can parse JSON logs, you will modify attributes on a log event.

Adding and removing fields with Filebeat

The log event contains a sensitive emailAddress field that needs to be protected. In this section, you'll remove the emailAddress field and add a new field to the log event to provide more context.

Open the Filebeat configuration file:

 
sudo nano /etc/filebeat/filebeat.yml 

Add the following lines to modify the log event:

/etc/filebeat/filebeat.yml
filebeat.inputs:
- type: log
  paths:
    - /var/log/logify/app.log

processors:
  - decode_json_fields:
      fields: ["message"]
      target: ""
  - drop_fields:
      fields: ["emailAddress", "message"]
  - add_fields:
      fields:
        env: "environment"  # Add a new 'env' field set to "environment"

output.console:
  pretty: true

To modify the log event, you add the drop_fields processor, whose fields option takes a list of fields to remove, here the sensitive emailAddress field and the message field. You remove the message field because, after parsing, its properties have been incorporated into the log event, which makes the original message field redundant. The add_fields processor then adds an env field; because no target is set, it is placed under the fields key.

After writing the code, save and exit the file. Then, restart Filebeat:

 
sudo filebeat

Upon running Filebeat, you will notice that the emailAddress field has been removed and a new env field has been added to the log event under the fields key:

Output
{
  "@timestamp": "2023-10-14T11:37:27.883Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.10.2"
  },
  "host": {
    "name": "filebeat-host"
  },
  "ip": "127.0.0.1",
  "fields": {
    "env": "environment"
  },
  "input": {
    "type": "log"
  },
  "ssn": "407-01-2433",
  "status": 200,
  "msg": "Operation finished",
  "log": {
    "offset": 137136,
    "file": {
      "path": "/var/log/logify/app.log"
    }
  },
  "ecs": {
    "version": "8.0.0"
  },
  "agent": {
    "ephemeral_id": "9216e279-50aa-4946-9d46-db71bd0d8ea0",
    "id": "d6e2a5a6-f532-4453-9750-75ac5ff39d90",
    "name": "filebeat-host",
    "type": "filebeat",
    "version": "8.10.2"
  },
  "timestamp": 1697283445,
  "level": 30,
  "pid": 2764
}
...

Now that you can enrich and remove unwanted fields, you will write conditional statements next.

Working with conditional statements in Filebeat

Filebeat allows you to check a condition and add a field when it evaluates to true. In this section, you will check whether the status value equals 200, and if the condition is met, you will add an is_successful field to the log event.

To achieve this, open the configuration file:

 
sudo nano /etc/filebeat/filebeat.yml 

Following that, add the highlighted lines to add the is_successful field based on the specified condition:

/etc/filebeat/filebeat.yml
...
processors:
  - decode_json_fields:
      fields: ["message"]
      target: ""
  - drop_fields:
      fields: ["emailAddress"]  # Remove the 'emailAddress' field

  - add_fields:
      fields:
        env: "environment"  # Add a new 'env' field set to "environment"
  - add_fields:
      when:
        equals:
          status: 200
      target: ""
      fields:
        is_successful: true
...

The when option checks if the status field value equals 200. If true, the is_successful field is added to the log event.

After saving the new changes, start Filebeat:

 
sudo filebeat

Filebeat will yield output similar to this:

Output
{
   ...
  "is_successful": true,
  "fields": {
    "env": "environment"
  },
  "ip": "127.0.0.1",
  "timestamp": 1697283529,
  "status": 200,
  "msg": "Operation finished",
  "pid": 2764,
  "log": {
    "offset": 142019,
    "file": {
      "path": "/var/log/logify/app.log"
    }
  }
}
...

In the output, the is_successful field has been added to the log entries with an HTTP status code of 200.

That takes care of adding a new field based on a condition.

Redacting sensitive data with Filebeat

Earlier in the article, you removed the emailAddress field to ensure data privacy. However, sensitive fields such as IP addresses and Social Security Numbers (SSN) remain in the log event. Moreover, sensitive data can be inadvertently added to log events by other developers within an organization. Redacting data that match specific patterns allows you to mask any sensitive information without needing to remove entire fields, ensuring the message's significance is preserved.

In your text editor, open the Filebeat configuration file:

 
sudo nano /etc/filebeat/filebeat.yml 

Add the following code to redact the IP address and Social Security Number:

/etc/filebeat/filebeat.yml
...
processors:
  - script:
      lang: javascript
      id: redact-sensitive-info
      source: |
        function process(event) {
            // Redact SSNs (e.g., 123-45-6789) from the "message" field
            event.Put("message", event.Get("message").replace(/\d{3}-\d{2}-\d{4}/g, "[REDACTED-SSN]"));
            // Redact IP addresses (e.g., 192.168.1.1) from the "message" field
            event.Put("message", event.Get("message").replace(/\b\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}\b/g, "[REDACTED-IP]"));
        }
  - decode_json_fields:
      fields: ["message"]
      target: ""
  - drop_fields:
      fields: ["emailAddress"]  # Remove the 'emailAddress' field
  - add_fields:
      fields:
        env: "environment"  # Add a new 'env' field set to "environment"
  - add_fields:
      when:
        equals:
          status: 200
      target: ""
      fields:
        is_successful: true
...

In the added code, you define a script written in JavaScript that redacts sensitive information from a log event. The script uses regular expressions to identify SSNs and IP addresses, replacing them with [REDACTED-SSN] and [REDACTED-IP], respectively.

After adding the code, run Filebeat:

 
sudo filebeat
Output
...
{
  "@timestamp": "2023-10-15T08:16:38.162Z",
  "@metadata": {
    "beat": "filebeat",
    "type": "_doc",
    "version": "8.10.2"
  },
  "ecs": {
    "version": "8.0.0"
  },
  "msg": "Connected to database",
  "ssn": "[REDACTED-SSN]",
  "log": {
    "file": {
      "path": "/var/log/logify/app.log"
    },
    "offset": 562969
  },
  "agent": {
    "name": "filebeat-host",
    "type": "filebeat",
    "version": "8.10.2",
    "ephemeral_id": "83d935b9-84d9-4513-ac90-94237c0860fa",
    "id": "d6e2a5a6-f532-4453-9750-75ac5ff39d90"
  },
  "level": 30,
  "pid": 2764,
  "status": 200,
  "host": {
    "name": "filebeat-host"
  },
  "ip": "[REDACTED-IP]",
  "timestamp": 1697290792,
  "fields": {
    "env": "environment"
  },
  "input": {
    "type": "log"
  }
}
...

The log events in the output will now have the IP address and SSN fields redacted.

In scenarios where you have a field like the following:

Output
{..., "privateInfo": "This is a sample message with SSN: 123-45-6789 and IP: 192.168.0.1"}

After processing with Filebeat, only the sensitive portions will be removed, and the log event will appear as follows:

Output
{..., "privateInfo": "This is a sample message with SSN: [REDACTED-SSN] and IP: [REDACTED-IP]"}

The sensitive portions have now been redacted, and the context of the log message remains intact.
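Before editing the Filebeat configuration, it can help to try the patterns on a sample line from the shell. The following is a rough sketch that approximates the two regular expressions with sed; redact_preview is a hypothetical helper, and the patterns are not identical to the JavaScript ones (the \b word boundaries are omitted for portability across sed implementations), so treat it as a quick preview rather than an exact equivalent of the script processor.

```bash
#!/bin/bash
# Hypothetical helper: approximate the two redaction patterns with sed.
# Reads lines on stdin and writes the redacted lines to stdout.
redact_preview() {
    sed -E \
        -e 's/[0-9]{3}-[0-9]{2}-[0-9]{4}/[REDACTED-SSN]/g' \
        -e 's/[0-9]{1,3}(\.[0-9]{1,3}){3}/[REDACTED-IP]/g'
}

# Try it on a sample line
echo 'SSN: 123-45-6789 and IP: 192.168.0.1' | redact_preview
```

The same function can be fed real records, for example with tail -n 4 /var/log/logify/app.log | redact_preview.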

You can now stop Filebeat and the logify.sh program.

To stop the bash script, obtain the process ID:

 
jobs -l | grep "logify"
Output
[1]+ 91773 Running                 ./logify.sh &

Substitute the process ID in the kill command:

 
kill -9 <process_id>

The program will now be terminated.
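Note that kill -9 sends SIGKILL, which a process cannot intercept. A gentler pattern is to record the process ID when the job is started and send the default SIGTERM instead. The sketch below illustrates the idea with sleep standing in for ./logify.sh; the logify_pid variable name is only illustrative.

```bash
#!/bin/bash
# Stand-in for "./logify.sh &"; any long-running command works here
sleep 300 &
logify_pid=$!    # $! holds the PID of the most recent background job

kill "$logify_pid"                        # sends SIGTERM by default
wait "$logify_pid" 2>/dev/null || true    # reap the job; ignore its non-zero exit status

# kill -0 sends no signal; it only checks whether the process still exists
if kill -0 "$logify_pid" 2>/dev/null; then
    echo "still running"
else
    echo "stopped"
fi
```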

Having successfully redacted sensitive fields, you can now collect logs from Docker containers using Filebeat and centralize them for further analysis and monitoring.

Collecting logs from Docker containers and centralizing logs

In this section, you will containerize the Bash script and use the Nginx hello world Docker image, which is preconfigured to generate JSON Nginx logs for each incoming request. Subsequently, you will create a Filebeat container to gather logs from both containers and centralize them to Better Stack for analysis.

Dockerizing the Bash script

In this section, you'll containerize the Bash script that generates log data. This step allows you to encapsulate the script and its dependencies.

Before you begin, ensure you are in the log-processing-stack/logify directory. Then, create a Dockerfile using your preferred text editor:

 
nano Dockerfile

In your Dockerfile, add the following code:

log-processing-stack/logify/Dockerfile
FROM ubuntu:latest

COPY . .

RUN chmod +x logify.sh

RUN mkdir -p /var/log/logify

RUN ln -sf /dev/stdout /var/log/logify/app.log

CMD ["./logify.sh"]

In the first line, the latest version of Ubuntu is specified as the base image. Next, the script is copied into the container, made executable, and a directory to store log files is created. Subsequently, any data written to /var/log/logify/app.log is redirected to the standard output so that it can be viewed using the docker logs command. Finally, you specify the command to run when the container is first launched.

Save and exit the file after making these changes. Then, change into the parent project directory:

 
cd ..

Next, create a docker-compose.yml file to define the services and volumes for your Docker containers:

 
nano docker-compose.yml

Define the Bash script and Nginx services:

log-processing-stack/docker-compose.yml
version: '3'
services:
  logify-script:
    build:
      context: ./logify
    image: logify:latest
    container_name: logify
  nginx:
    image: betterstackcommunity/nginx-helloworld:latest
    logging:
      driver: json-file
    container_name: nginx
    ports:
      - '80:80'

In this Docker Compose file, you define two services: logify-script and nginx. The logify-script service is built from the ./logify directory context, creating an image tagged as logify:latest. The nginx service uses the latest version of the nginx-helloworld image and the json-file logging driver. Additionally, port 80 on the host is mapped to port 80 within the container. Ensure no other services use port 80 to prevent conflicts.

To build the logify-script service image and start the containers for each defined service, use the following command:

 
docker compose up -d

The -d option puts the services in the background.

Now, check if the services are running:

 
docker compose ps

You should see a "running" status under the "STATUS" column for both containers, which should look like this:

Output
NAME                COMMAND              SERVICE             STATUS              PORTS
logify              "./logify.sh"        logify-script       running
nginx               "/runner.sh nginx"   nginx               running             0.0.0.0:80->80/tcp, :::80->80/tcp

With the containers running, use the curl command to send HTTP requests to the Nginx service:

 
curl http://localhost:80/?[1-5]

View all the logs from both containers with the following command:

 
docker compose logs
Output
nginx  | {"timestamp":"2023-10-15T08:22:27+00:00","pid":"8","remote_addr":"172.18.0.1","remote_user":"","request":"GET /?1 HTTP/1.1","status": "200","body_bytes_sent":"11109","request_time":"0.000","http_referrer":"","http_user_agent":"curl/7.81.0","time_taken_ms":"1697358147.663"}
...
logify  | {"status": 200, "ip": "127.0.0.1", "level": 30, "emailAddress": "user@mail.com", "msg": "Task completed successfully", "pid": 1, "ssn": "407-01-2433", "timestamp": 1697358157}

The output displays logs produced from both containers.

Now that the Bash script and Nginx services are running and generating logs, you can collect and centralize these logs using Filebeat.

Defining the Filebeat service with Docker Compose

In this section, you will define two Filebeat services to collect logs from the logify-script and nginx services and forward them to Better Stack.

First, open the docker-compose.yml file:

 
nano docker-compose.yml

Add the Filebeat service to read logs from the logify-script service:

log-processing-stack/docker-compose.yml
version: '3'
services:
  logify-script:
    build:
      context: ./logify
    image: logify:latest
    container_name: logify
  nginx:
    image: betterstackcommunity/nginx-helloworld:latest
    logging:
      driver: json-file
    container_name: nginx
    ports:
      - '80:80'

  filebeat-logify:
    image: docker.elastic.co/beats/filebeat:8.10.3
    container_name: filebeat-logify
    user: root
    command:
      - "-e"
      - "--strict.perms=false"
    volumes:
      - ./filebeat/filebeat-logify.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock

In this updated configuration, the filebeat-logify service uses the filebeat:8.10.3 image and runs as the root user. The Filebeat configuration is stored in the filebeat-logify.yml file, which you will define shortly and which is mounted read-only into the container alongside the Docker container logs directory and the Docker socket.

To proceed, create a new directory named filebeat and navigate into it:

 
mkdir filebeat && cd filebeat

Then, create the filebeat-logify.yml configuration file:

 
nano filebeat-logify.yml

In your filebeat-logify.yml, add the following code to specify how Filebeat should read logs from the logify-script service:

log-processing-stack/filebeat/filebeat-logify.yml
filebeat.autodiscover:
  providers:
    - type: docker
      labels.dedot: true
      templates:
        - condition:
            contains:
              docker.container.image: "logify"
          config:
            - type: log
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log
              processors:
                - decode_json_fields:
                    fields: ["message"]
                    target: ""

In this configuration, you set up Filebeat's automatic log discovery to collect logs from Docker containers whose image names contain the substring logify. This corresponds to the container defined under the logify-script service. Filebeat uses the log input to read Docker logs specified under paths. Additionally, a processor is added to decode JSON fields.

Next, let's configure the destination to forward these logs. In this tutorial, you will send the logs to Better Stack for centralization and analysis.

Start by creating a free Better Stack account. Once you've logged in, navigate to the Sources section.

Screenshot of the **Sources** link pointed by an arrow

On the Sources page, click the Connect source button:

Screenshot indicating the **Connect source** button

Give your source a name, for example, "Logify logs", and select "Filebeat" as the platform:

Screenshot of the Better Stack interface with the name field filled as "Logify logs" and the Platform set to "Filebeat"

Next, copy the Source Token from the Better Stack interface:

Screenshot of Better Stack page with an arrow pointing to the "Source Token" field

In your filebeat-logify.yml, add the destination to forward the logs and replace <your_logify_source_token> with your actual source token:

log-processing-stack/filebeat/filebeat-logify.yml
...
output.elasticsearch:
  hosts: 'https://in-elasticsearch.logs.betterstack.com:443'
  headers:
    X-Better-Stack-Source-Token: '<your_logify_source_token>'

Using the elasticsearch output, you provide your source token and send logs to Better Stack on port 443.

After making the changes, save and exit the file.

Navigate to the parent directory:

 
cd ..

Start the Filebeat service:

 
docker compose up -d

After some time passes, check Better Stack to confirm that Filebeat is sending the logs:

Screenshot displaying the log entries in Better Stack

In the screenshot, Better Stack is successfully receiving the logs from Filebeat.

Now that you have successfully forwarded the Bash script logs, it's time to deliver Nginx service logs as well.

First, create an additional source named "Nginx logs" by following the same steps you did previously. Copy the source token to a safe place.

After creating the sources, the interface will look like this:

Screenshot of Better Stack with two sources: Logify, and Nginx

Next, move into the Filebeat directory:

 
cd filebeat

Create the filebeat-nginx.yml configuration file:

 
nano filebeat-nginx.yml

Now, add the following code to read logs from the nginx service and forward them to Better Stack. Make sure you update the source tokens accordingly:

log-processing-stack/filebeat/filebeat-nginx.yml
filebeat.autodiscover:
  providers:
    - type: docker
      labels.dedot: true
      templates:
        - condition:
            contains:
              docker.container.image: "betterstackcommunity/nginx-helloworld"
          config:
            - type: log
              paths:
                - /var/lib/docker/containers/${data.docker.container.id}/*.log
              processors:
                - decode_json_fields:
                    fields: ["message"]
                    target: ""

output.elasticsearch:
  hosts: 'https://in-elasticsearch.logs.betterstack.com:443'
  headers:
    X-Better-Stack-Source-Token: '<your_nginx_source_token>'

In this file, Filebeat automatically discovers Docker containers whose image name contains the substring "betterstackcommunity/nginx-helloworld". You then use log as the input to read the container logs, decode the JSON fields, and send the logs to Better Stack.

Go back to the parent directory:

 
cd ..

Next, open the docker-compose.yml file:

 
nano docker-compose.yml

Add another Filebeat service to the docker-compose.yml file:

log-processing-stack/docker-compose.yml
version: '3'
...

  filebeat-logify:
    image: docker.elastic.co/beats/filebeat:8.10.3
    container_name: filebeat-logify
    user: root
    command:
      - "-e"
      - "--strict.perms=false"
    volumes:
      - ./filebeat/filebeat-logify.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock
  filebeat-nginx:
    image: docker.elastic.co/beats/filebeat:8.10.3
    container_name: filebeat-nginx
    user: root
    command:
      - "-e"
      - "--strict.perms=false"
    volumes:
      - ./filebeat/filebeat-nginx.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock

Start the services:

 
docker compose up -d

Send a few requests to the Nginx service:

 
curl "http://localhost:80/?[1-5]"
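The `[1-5]` part is curl's URL globbing: curl expands the range itself and sends one request per value. As a rough illustration, the sketch below prints the URLs that this range stands for. `expand_urls` is a hypothetical helper written for this example, not part of curl.

```shell
#!/bin/sh
# expand_urls (hypothetical helper): print one URL per integer in the
# inclusive range first..last, appended to the given base URL.
expand_urls() {
    base="$1"
    i="$2"
    last="$3"
    while [ "$i" -le "$last" ]; do
        printf '%s%s\n' "$base" "$i"
        i=$((i + 1))
    done
}

# Show the five URLs requested by the curl command above.
expand_urls "http://localhost:80/?" 1 5
```

Quoting the URL keeps shells such as zsh from treating `?` and `[1-5]` as a filename pattern before curl sees them.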

Now go back to Better Stack to confirm that the "Nginx logs" source is receiving the logs:

Screenshot of Nginx logs in Better Stack

Now that you can successfully centralize logs to Better Stack using Filebeat, you will monitor Filebeat's health in the next section.

Monitoring Filebeat health with Better Stack

Filebeat doesn't have a dedicated /health endpoint, but it can expose an HTTP endpoint for metrics. Once that endpoint is enabled, an external service can poll it to determine whether Filebeat is up or down. In this tutorial, you will enable the HTTP endpoint for the filebeat-logify service.

First, navigate to the filebeat directory:

 
cd filebeat

Open the filebeat-logify.yml configuration file:

 
nano filebeat-logify.yml

Add the following lines at the top of the file:

log-processing-stack/filebeat/filebeat-logify.yml
http.enabled: true
http.host: 0.0.0.0
http.port: 5066
...

These settings enable the metrics endpoint and make it listen on all network interfaces on port 5066.

Next, move back to the parent directory and open the docker-compose.yml file again:

 
cd .. && nano docker-compose.yml

After that, update the Docker Compose file to map port 5066 between the host and the container:

log-processing-stack/docker-compose.yml
...
  filebeat-logify:
    image: docker.elastic.co/beats/filebeat:8.10.3
    container_name: filebeat-logify
    user: root
    ports:
      - "5066:5066"
    command:
      - "-e"
      - "--strict.perms=false"
    volumes:
      - ./filebeat/filebeat-logify.yml:/usr/share/filebeat/filebeat.yml:ro
      - /var/lib/docker/containers:/var/lib/docker/containers:ro
      - /var/run/docker.sock:/var/run/docker.sock
...

Restart the Filebeat service to apply the changes:

 
docker compose up -d

Verify that the Filebeat metrics endpoint is functioning properly:

 
curl -XGET 'localhost:5066/?pretty'
Output
{
  "beat": "filebeat",
  "binary_arch": "amd64",
  "build_commit": "37113021c2d283b4f5a226d81bc77d9af0c8799f",
  "build_time": "2023-10-05T05:53:41.000Z",
  "elastic_licensed": true,
  "ephemeral_id": "09a59a41-96fc-468f-bb68-1f465e0a54ca",
  "gid": "0",
  "hostname": "e8e6678906ad",
  "name": "e8e6678906ad",
  "uid": "0",
  "username": "root",
  "uuid": "bec11966-70f3-4be3-8826-98b2eb467b08",
  "version": "8.10.3"
}
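An external monitor essentially polls this endpoint and treats a failed or non-2xx response as downtime. The sketch below shows that idea in shell as a rough illustration, not as a description of how Better Stack is implemented. `classify_status` and `probe` are hypothetical helper names, and the URL assumes the port mapping configured above.

```shell
#!/bin/sh
# classify_status (hypothetical helper): map an HTTP status code to up/down.
# Any 2xx code counts as "up"; everything else, including the "000" that
# curl reports for a failed connection, counts as "down".
classify_status() {
    case "$1" in
        2[0-9][0-9]) echo "up" ;;
        *) echo "down" ;;
    esac
}

# probe (hypothetical helper): request the URL and print up or down.
probe() {
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 5 "$1" 2>/dev/null)" || true
    classify_status "$code"
}

# Assumes Filebeat's HTTP endpoint is published on localhost:5066.
if command -v curl >/dev/null 2>&1; then
    echo "filebeat-logify is $(probe 'http://localhost:5066/')"
fi
```

The same HTTP endpoint also serves more detailed metrics under /stats, which you can inspect with curl -XGET 'localhost:5066/stats?pretty'.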

Next, sign in to Better Stack.

On the Monitors page, click the Create monitor button:

Screenshot of the monitors page, providing an option to create a monitor

Choose how you want Better Stack to alert you, provide your server's IP address or domain name followed by port 5066, and click the Create monitor button:

Screenshot of Better Stack configured with the necessary options

Better Stack will start monitoring the Filebeat endpoint and provide performance insights:

Screenshot of Better Stack monitoring the REST API endpoint

Now, let's observe how Better Stack responds when Filebeat stops working by stopping all the services:

 
docker compose stop

After some time has passed, return to Better Stack to observe that the status has been updated to "Down":

Screenshot of Better Stack indicating that the endpoint is down

If you configured Better Stack to send you an email, check your email inbox. You will receive an email alert:

Screenshot showing the email Alert that Better Stack sent

That takes care of monitoring Filebeat using Better Stack.

Final thoughts

In this tutorial, you learned how to use Filebeat alongside Docker, Nginx, and Better Stack to manage logs. You started by reading logs from a file and displaying them in the console. Then, you explored various ways to transform log messages. After that, you collected logs from multiple Docker containers and forwarded them to Better Stack. Finally, you monitored Filebeat's health using Better Stack and received alerts when it went down.

As a next step, refer to the Filebeat documentation for more in-depth information. To learn more about Docker and Docker Compose, explore their respective documentation pages: Docker and Docker Compose. To enhance your Docker logging knowledge, check out our comprehensive guide.

Apart from Filebeat, there are other log shippers available that you can explore. Check out the log shippers guide to learn about them.

Thanks for reading, and happy logging!

Article by
Stanley Ulili
Stanley is a freelance web developer and researcher from Malawi. He loves learning new things and writing about them to understand and solidify concepts. He hopes that by sharing his experience, others can learn something from them too!
Licensed under CC-BY-NC-SA

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
