Better Stack Kafka monitoring

Monitor Apache Kafka with Better Stack collector. Broker discovery and partition health out of the box, full broker internals with the Prometheus JMX exporter.

What you get out of the box

Install Better Stack collector on the hosts running Kafka. The collector automatically discovers your brokers and, with no Kafka configuration needed, starts collecting these cluster metrics:

  • kafka_brokers: broker count
  • kafka_topic_partitions: partitions per topic
  • kafka_topic_partition_in_sync_replica: in-sync replicas (ISR) per partition
  • kafka_topic_partition_leader: leader status per partition
  • kafka_topic_partition_under_replicated_partition: under-replication status

Kafka dashboard

These metrics power the Overview and Partitions & replication sections of the Kafka dashboard.

The collector connects to brokers from the host network. Running Kafka in Docker? Publish the broker port and advertise a listener address that clients on the host can reach:

docker-compose.yml
services:
  kafka:
    image: apache/kafka:4.2.1
    ports:
      - "9092:9092"
    environment:
      KAFKA_LISTENERS: PLAINTEXT_HOST://0.0.0.0:9092,PLAINTEXT://0.0.0.0:19092,CONTROLLER://0.0.0.0:9093
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT_HOST://localhost:9092,PLAINTEXT://kafka:19092

Containers on the compose network keep connecting to kafka:19092.

Seeing your Kafka as Configuration required or Unreachable? Go to Sources → your collector → Configure → Collect metrics, open the Kafka target, and point it at a broker address reachable from the host, e.g. localhost:9092.

Use the broker address, not a metrics endpoint

The Kafka target is a connection to the Kafka protocol port, typically 9092. Don't point it at the JMX exporter port. The JMX exporter is connected separately as a Prometheus scrape target below.

Get full Kafka metrics with JMX exporter

Broker-level performance metrics like throughput, request rates, controller state, and storage live in Kafka's JMX (Java Management Extensions) and aren't exposed outside the Java process by default. Deploy the Prometheus JMX exporter as a Java agent on each broker to light up the rest of the Kafka dashboard:

  • kafka_server_brokertopicmetrics_*: bytes and messages in/out, per topic
  • kafka_controller_kafkacontroller_*: active controller, broker count, offline partitions
  • kafka_server_replicamanager_*: leader count, partition count, ISR changes
  • kafka_network_requestmetrics_*: request rates per request type
  • kafka_log_log_size: log size per topic and partition

Download the Java agent

Download the JMX exporter agent JAR to each Kafka broker:

Download JMX exporter agent
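# Create the target directory first; -f makes curl fail on HTTP errors
# instead of saving an error page as the JAR
mkdir -p /opt/jmx-exporter
curl -fsSL -o /opt/jmx-exporter/jmx_prometheus_javaagent.jar \
  https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/1.0.1/jmx_prometheus_javaagent-1.0.1.jar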

Create the configuration file

Save the following configuration next to the agent JAR:

/opt/jmx-exporter/jmx-kafka-config.yaml
lowercaseOutputName: true
lowercaseOutputLabelNames: true
rules:
# Per-topic broker throughput (BytesInPerSec, MessagesInPerSec, ...)
- pattern: kafka.server<type=(.+), name=(.+), topic=(.+)><>Count
  name: kafka_server_$1_$2_total
  type: COUNTER
  labels:
    topic: "$3"
# Per-client, per-partition gauges
- pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), topic=(.+), partition=(.*)><>Value
  name: kafka_server_$1_$2
  type: GAUGE
  labels:
    clientId: "$3"
    topic: "$4"
    partition: "$5"
- pattern: kafka.server<type=(.+), name=(.+), clientId=(.+), brokerHost=(.+), brokerPort=(.+)><>Value
  name: kafka_server_$1_$2
  type: GAUGE
  labels:
    clientId: "$3"
    broker: "$4:$5"
# Log size per topic-partition
- pattern: kafka.log<type=(.+), name=(.+), topic=(.+), partition=(.*)><>Value
  name: kafka_log_$1_$2
  type: GAUGE
  labels:
    topic: "$3"
    partition: "$4"
# Network request counters
- pattern: kafka.network<type=(.+), name=(.+), request=(.+)><>Count
  name: kafka_network_$1_$2_total
  type: COUNTER
  labels:
    request: "$3"
# Catch-alls with a single extra mbean property, kept as a label
- pattern: kafka.server<type=(.+), name=(.+), (.+)=(.+)><>Value
  name: kafka_server_$1_$2
  type: GAUGE
  labels:
    $3: "$4"
- pattern: kafka.server<type=(.+), name=(.+), (.+)=(.+)><>Count
  name: kafka_server_$1_$2_total
  type: COUNTER
  labels:
    $3: "$4"
# Generic gauges and counters
- pattern: kafka.server<type=(.+), name=(.+)><>Value
  name: kafka_server_$1_$2
  type: GAUGE
- pattern: kafka.server<type=(.+), name=(.+)><>Count
  name: kafka_server_$1_$2_total
  type: COUNTER
- pattern: kafka.controller<type=(.+), name=(.+)><>Value
  name: kafka_controller_$1_$2
  type: GAUGE
- pattern: kafka.network<type=(.+), name=(.+)><>Value
  name: kafka_network_$1_$2
  type: GAUGE
- pattern: kafka.log<type=(.+), name=(.+)><>Value
  name: kafka_log_$1_$2
  type: GAUGE

Rules are applied first-match-wins. Keep the specific patterns above the generic catch-alls, otherwise per-topic metric names get mangled.
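
To see why the order matters, here is a minimal shell sketch that mimics first-match-wins with two simplified patterns. The function name metric_name, the sample MBean string, and the orders topic are illustrative assumptions. The real exporter matches with Java regular expressions and applies its own name sanitizing, so treat the output as an approximation.

```shell
#!/bin/sh
# Rough illustration of first-match-wins. This is NOT the exporter's code:
# it uses sed -E instead of Java regex and a simplified name cleanup.

specific='kafka\.server<type=(.+), name=(.+), topic=(.+)><>Count'
generic='kafka\.server<type=(.+), name=(.+)><>Count'

# metric_name MBEAN PATTERN...
# Prints the metric name produced by the first pattern that matches.
metric_name() {
  mbean=$1
  shift
  for pattern in "$@"; do
    # Both sample rules share the name template kafka_server_$1_$2_total
    name=$(printf '%s\n' "$mbean" | sed -nE "s/^${pattern}\$/kafka_server_\1_\2_total/p")
    if [ -n "$name" ]; then
      # Lowercase, then replace characters that are not valid in metric names
      printf '%s\n' "$name" | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9_:]/_/g'
      return 0
    fi
  done
  return 1
}

# Hypothetical MBean for a topic called "orders"
mbean='kafka.server<type=BrokerTopicMetrics, name=BytesInPerSec, topic=orders><>Count'

metric_name "$mbean" "$specific" "$generic"   # specific rule listed first
metric_name "$mbean" "$generic" "$specific"   # generic rule listed first
```

With the specific rule first, the topic is captured separately and the name stays clean. With the generic rule first, the name capture group swallows the topic property, so the topic ends up baked into the metric name instead of becoming a label.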

Attach the agent to Kafka

Add the agent to Kafka's JVM options. The exporter serves Prometheus metrics on port 7071:

Linux service
# Add to the Kafka service environment, then restart Kafka
export KAFKA_OPTS="-javaagent:/opt/jmx-exporter/jmx_prometheus_javaagent.jar=7071:/opt/jmx-exporter/jmx-kafka-config.yaml"

Docker Compose
services:
  kafka:
    image: apache/kafka:4.2.1
    ports:
      - "7071:7071" # JMX exporter Prometheus endpoint
    environment:
      KAFKA_OPTS: >-
        -javaagent:/opt/jmx-exporter/jmx_prometheus_javaagent.jar=7071:/opt/jmx-exporter/jmx-kafka-config.yaml
    volumes:
      - ./jmx-exporter:/opt/jmx-exporter:ro

Kubernetes
# In the Kafka pod template: fetch the agent in an init container
initContainers:
  - name: jmx-exporter
    image: curlimages/curl:8.14.1
    command: ["curl", "-sSL", "-o", "/jmx-exporter/jmx_prometheus_javaagent.jar",
              "https://repo1.maven.org/maven2/io/prometheus/jmx/jmx_prometheus_javaagent/1.0.1/jmx_prometheus_javaagent-1.0.1.jar"]
    volumeMounts:
      - name: jmx-exporter
        mountPath: /jmx-exporter
containers:
  - name: kafka
    env:
      - name: KAFKA_OPTS
        # jmx-kafka-config.yaml must also be available at this path in the container
        value: "-javaagent:/jmx-exporter/jmx_prometheus_javaagent.jar=7071:/jmx-exporter/jmx-kafka-config.yaml"
    ports:
      - containerPort: 7071
    volumeMounts:
      - name: jmx-exporter
        mountPath: /jmx-exporter

Restart Kafka and verify the endpoint:

Verify the metrics endpoint
curl -s http://localhost:7071/metrics | grep kafka_server
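
For a slightly more thorough check, the sketch below reports which of the metric families listed earlier have no samples yet. The function name missing_families is hypothetical, and the prefix list assumes the configuration file from this guide is in use unchanged.

```shell
#!/bin/sh
# missing_families: read Prometheus text format from stdin and print every
# expected metric prefix that has no sample line. Prints nothing when all
# families are present.
missing_families() {
  metrics=$(cat)
  for prefix in \
    kafka_server_brokertopicmetrics_ \
    kafka_controller_kafkacontroller_ \
    kafka_server_replicamanager_ \
    kafka_network_requestmetrics_ \
    kafka_log_log_size
  do
    # Sample lines start with the metric name; comment lines start with '#'
    printf '%s\n' "$metrics" | grep -q "^${prefix}" || printf '%s\n' "$prefix"
  done
}

# Usage, assuming the exporter listens on localhost:7071 as configured above:
#   curl -s http://localhost:7071/metrics | missing_families
```

Some families only appear after the broker has handled traffic, so a prefix reported as missing right after a restart is not necessarily a configuration error.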

Beware of Kafka CLI tools inheriting KAFKA_OPTS

Every Kafka command-line tool started in the same environment, including Docker healthchecks like kafka-broker-api-versions.sh, picks up KAFKA_OPTS. The tool then starts its own copy of the agent, fails to bind port 7071 because the broker already holds it, and crashes.

To resolve this, clear the variable for CLI invocations:

Healthcheck without the agent
KAFKA_OPTS= /opt/kafka/bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092
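
If several scripts call Kafka tools, a small wrapper keeps the override in one place. This is a minimal sketch: the function name run_without_agent is hypothetical, and it relies only on the standard env utility to blank the variable for the child process.

```shell
#!/bin/sh
# run_without_agent COMMAND [ARGS...]
# Runs the command with KAFKA_OPTS set to an empty string for that one
# process. The variable in the calling shell is left unchanged.
run_without_agent() {
  env KAFKA_OPTS= "$@"
}

# Usage, assuming Kafka is installed under /opt/kafka as in the example above:
#   run_without_agent /opt/kafka/bin/kafka-broker-api-versions.sh --bootstrap-server localhost:9092
```

Because the override applies only to the child process, the broker itself still starts with the agent attached.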

Scrape the metrics with the collector

Kubernetes
# Add to the Kafka pod template; the collector discovers
# annotated pods and scrapes them automatically.
annotations:
  prometheus.io/scrape: "true"
  prometheus.io/port: "7071"
  prometheus.io/path: "/metrics"

Docker & other deployments
Add the endpoint as a scrape target in Better Stack:

1. Go to Sources -> your collector -> Configure -> Collect metrics.
2. Click Collect metrics and select Custom Prometheus exporter or service.
3. Set Service name to match your Kafka service, e.g. kafka.
4. Set Endpoint to http://localhost:7071/metrics.

The collector scrapes the endpoint from the host network every 30 seconds. When Kafka runs in a container, publish port 7071 so the endpoint is reachable on the host.

Using the same service name as the automatically discovered Kafka service groups both metric sources under one service in Better Stack.

Verify the configuration

Within a few minutes, metrics like kafka_server_brokertopicmetrics_bytesinpersec_total and kafka_controller_kafkacontroller_activecontrollercount appear in your collector source, and the scrape target shows as Active in Configure → Collect metrics.

Kafka metrics are now flowing into Better Stack

Check out the Kafka dashboard: broker throughput, controller state, partition health, and storage in one place. Charts plotting rates need two scrapes before drawing the first point. Give them a minute.
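
The reason for the two-scrape delay is that a rate is the difference between two counter samples divided by the time between them. A minimal worked example, assuming the 30-second scrape interval mentioned above and made-up counter values (the function name per_second_rate is hypothetical, and counter resets are ignored):

```shell
#!/bin/sh
# per_second_rate PREVIOUS CURRENT INTERVAL_SECONDS
# Integer per-second rate between two samples of a counter.
per_second_rate() {
  echo $(( ($2 - $1) / $3 ))
}

# Hypothetical bytes-in counter: 1,500,000 at the first scrape and
# 1,800,000 at the second scrape 30 seconds later.
per_second_rate 1500000 1800000 30   # (1800000 - 1500000) / 30 = 10000 bytes per second
```

With only one sample there is no difference to divide, so the first point on a rate chart can only be drawn after the second scrape.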

Need help?

Please let us know at hello@betterstack.com. We're happy to help! 🙏