A Beginner's Guide to the OpenTelemetry Demo
The OpenTelemetry Demo provides a practical, hands-on environment for exploring and implementing OpenTelemetry.
It demonstrates how OpenTelemetry can be used to instrument and monitor a microservice-based application, with realistic workflows and fault simulations that illustrate effective use of the OpenTelemetry SDK, API, Collector, and other components.
In this article, we'll delve into the architecture of the OpenTelemetry Demo, examine how its services interact, and explain how telemetry data is collected and analyzed to help you better understand and adopt OpenTelemetry in your own systems.
Let's get started!
Prerequisites
- Some familiarity with OpenTelemetry's basic concepts.
- Docker and Docker Compose installed. I recommend using OrbStack on macOS instead of Docker Desktop.
What is the OpenTelemetry Demo?
The OpenTelemetry Demo simulates an online astronomy-themed retail shop. This fictional e-commerce application serves as a realistic environment for showcasing everything OpenTelemetry has to offer for instrumenting and monitoring a distributed system.
The shop features astronomy-related items, such as telescopes, star charts, and space-themed merchandise. It also includes a shopping workflow involving browsing products, adding items to a cart, checking out, and receiving an order confirmation, the typical interactions in an e-commerce platform.
It was developed with the following primary goals:
- To showcase how OpenTelemetry operates in a complex environment with multiple interconnected services built in different technologies.
- To allow you to observe distributed tracing, metrics, and logs in action for monitoring and troubleshooting an application, and to learn how to implement them in your own services.
- To serve as a foundational platform for vendors and tooling developers to showcase their OpenTelemetry integrations.
- To give OpenTelemetry contributors a functional, living environment for testing and validating new versions of the API, SDK, or other OpenTelemetry components before updates are released to the broader community.
Components
The services that comprise the demo can be divided into four broad categories:
- Core services: Microservices written in different programming languages that talk to each other over gRPC and HTTP. These services are all instrumented with OpenTelemetry to produce traces, metrics, and logs.
- Dependencies: Services that support the application services, such as Redis, Kafka, and Valkey.
- Telemetry services: Components that deal with the telemetry data generated by the above services, like the OpenTelemetry Collector, Prometheus, Grafana, OpenSearch, and Jaeger.
- Utility services: Additional services that provide specific functionality to support the core application, such as the load-generator, flagd, and flagdui services.
Let's look at how you can set up the demo on your local machine next.
Setting up the OpenTelemetry Demo
Setting up the OpenTelemetry Demo is straightforward with Docker, and there's also an option to deploy it through Kubernetes if that's your preference.
Begin by cloning its GitHub repository to your local machine:
git clone https://github.com/open-telemetry/opentelemetry-demo.git
Then navigate into the cloned repository with:
cd opentelemetry-demo
You can now start the demo with docker compose:
docker compose up --force-recreate --remove-orphans --detach
Since the demo involves a large number of containers and services, downloading and building all the necessary Docker images might take several minutes depending on your internet speed.
Once the setup is complete, you should see output similar to the following:
. . .
[+] Running 26/26
✔ Network opentelemetry-demo Created 0.2s
✔ Container flagd Started 2.2s
✔ Container grafana Started 2.5s
✔ Container opensearch Healthy 12.6s
✔ Container valkey-cart Started 2.5s
✔ Container prometheus Started 2.5s
✔ Container kafka Healthy 17.1s
✔ Container jaeger Started 2.1s
✔ Container otel-collector Started 12.7s
✔ Container cart-service Started 14.5s
✔ Container flagdui Started 14.5s
✔ Container quote-service Started 13.4s
✔ Container frauddetection-service Started 17.4s
✔ Container currency-service Started 14.1s
✔ Container email-service Started 14.1s
✔ Container imageprovider Started 13.5s
✔ Container ad-service Started 14.8s
✔ Container accounting Started 17.4s
✔ Container product-catalog-service Started 14.1s
✔ Container shipping-service Started 14.8s
✔ Container payment-service Started 13.3s
✔ Container recommendation-service Started 14.6s
✔ Container checkout-service Started 17.2s
✔ Container frontend Started 17.5s
✔ Container load-generator Started 18.0s
✔ Container frontend-proxy Started 18.5s
The Docker Compose setup includes 25 containers, most of which represent the microservices in the demo application. Additionally, the setup includes:
- prometheus, grafana, opensearch, and jaeger: For inspecting and visualizing telemetry data.
- load-generator: Uses Locust to simulate user traffic.
- flagd and flagdui: Provide support for changing feature flags through a user interface.
Once all services are up and running, open your browser and navigate to http://localhost:8080 to interact with the demo application:
Scrolling down reveals the available products for purchase:
Clicking on a product takes you to its details page, where you can select a quantity and add it to your shopping cart:
The shopping cart displays the selected products, calculated total, and demo payment options:
Scroll to the bottom of the cart and click the Place Order button:
You'll be directed to an order confirmation page, verifying that the demo is functioning as expected:
As you interact with the application, multiple microservices communicate and work together to handle your actions.
These interactions are fully instrumented with OpenTelemetry, allowing you to collect and analyze traces, metrics, and logs with the included telemetry tools or with a vendor you're currently evaluating.
Exploring the telemetry instrumentation and coverage
Each service in the demo application uses its respective OpenTelemetry SDKs to collect telemetry data—traces, metrics, and logs. However, the extent of coverage for each telemetry type varies by service, reflecting differences in functionality and maturity of instrumentation.
Trace coverage
Tracing is the most comprehensively implemented telemetry type in the demo. Most services feature robust trace instrumentation, including:
- Automatic and manual span creation
- Span enrichment
- Context propagation
Features like baggage and span links are selectively implemented in services with complex interactions, such as the Checkout and Fraud Detection services.
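To give a sense of what this looks like in practice, here is a minimal sketch of manual span creation and enrichment using the JavaScript SDK. The tracer name and the processCharge helper are illustrative assumptions, not code taken from the demo's source.
const { trace, SpanStatusCode } = require('@opentelemetry/api');

// Acquire a tracer; the name typically identifies the instrumented module.
const tracer = trace.getTracer('payment');

async function chargeCard(request) {
  // startActiveSpan creates a span and makes it current for the callback.
  return tracer.startActiveSpan('charge', async (span) => {
    try {
      // Enrich the span with application-specific attributes.
      span.setAttribute('app.payment.amount', request.amount);
      return await processCharge(request); // processCharge is a hypothetical helper
    } catch (err) {
      span.recordException(err);
      span.setStatus({ code: SpanStatusCode.ERROR });
      throw err;
    } finally {
      span.end();
    }
  });
}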
Metric coverage
The metric coverage varies significantly, with most services having incomplete instrumentation. Advanced features like multiple instruments, views, and exemplars are largely missing across the board.
However, the existing metrics are sufficient to demonstrate core OpenTelemetry metric collection capabilities.
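As a rough illustration of that core capability, a counter instrument created with the JavaScript metrics API might look like the following sketch; the meter and metric names are made up for the example.
const { metrics } = require('@opentelemetry/api');

// Acquire a meter and create a counter instrument.
const meter = metrics.getMeter('checkout');
const orderCounter = meter.createCounter('app.orders.placed', {
  description: 'Number of orders placed',
});

// Record a measurement with attributes when an order completes.
orderCounter.add(1, { 'app.order.currency': 'USD' });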
Log coverage
Logs are the least developed aspect of telemetry instrumentation in the demo as only a few services have implemented OpenTelemetry Protocol (OTLP) log export.
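For reference, exporting logs over OTLP from a Node.js service might look roughly like the sketch below. The exact packages and setup vary by SDK version, so treat this as an assumption-laden outline rather than the demo's actual code.
const { SeverityNumber } = require('@opentelemetry/api-logs');
const { LoggerProvider, BatchLogRecordProcessor } = require('@opentelemetry/sdk-logs');
const { OTLPLogExporter } = require('@opentelemetry/exporter-logs-otlp-grpc');

// Send log records to the Collector's OTLP/gRPC endpoint in batches.
const provider = new LoggerProvider();
provider.addLogRecordProcessor(new BatchLogRecordProcessor(new OTLPLogExporter()));

// Emit a structured log record that can be correlated with traces.
const logger = provider.getLogger('checkout');
logger.emit({
  severityNumber: SeverityNumber.INFO,
  severityText: 'INFO',
  body: 'order placed',
  attributes: { 'app.order.id': 'order-123' },
});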
Context propagation
. . .
// `context` and `propagation` are imported from the '@opentelemetry/api' package
// Check baggage for synthetic_request=true, and add charged attribute accordingly
const baggage = propagation.getBaggage(context.active());
if (baggage && baggage.getEntry('synthetic_request') && baggage.getEntry('synthetic_request').value === 'true') {
span.setAttribute('app.payment.charged', false);
} else {
span.setAttribute('app.payment.charged', true);
}
. . .
Context propagation is a key feature in OpenTelemetry that enables the correlation of telemetry signals regardless of where they are generated. Here's how it's implemented in the demo:
- Trace headers (such as traceparent and tracestate) are passed along with requests as they travel between services.
- Baggage is used to carry additional context across service boundaries. In the demo, it is used to annotate synthetic requests from the load generator (a sketch of the producer side follows this list).
- Some metrics collected in the demo include trace exemplars, which are detailed samples of individual traces associated with specific metrics.
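The snippet above shows the consumer side reading baggage in the payment service. The producer side might look something like this JavaScript sketch; in the demo the baggage is actually set by the Locust-based load generator, so this is illustrative only.
const { context, propagation } = require('@opentelemetry/api');

// Create a baggage entry marking this request as synthetic traffic.
const baggage = propagation.createBaggage({
  synthetic_request: { value: 'true' },
});

// Activate the baggage so the configured propagators inject it into
// outgoing request headers alongside traceparent/tracestate.
context.with(propagation.setBaggage(context.active(), baggage), () => {
  // ...issue the outbound HTTP or gRPC call here...
});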
Telemetry collection and export
The generated traces, metrics, and logs are sent to the OpenTelemetry Collector via gRPC and exported to the following services:
- Prometheus: Scrapes the metrics and exemplars generated by the services.
- Grafana: Visualizes metric data in customizable dashboards.
- Jaeger: Processes and displays distributed traces.
- OpenSearch: Used to centralize logging data from services.
The configuration for the Collector defines how the telemetry data is received, processed, and exported. Below is a snippet of the configuration:
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: ${env:OTEL_COLLECTOR_HOST}:${env:OTEL_COLLECTOR_PORT_GRPC}
      http:
        endpoint: ${env:OTEL_COLLECTOR_HOST}:${env:OTEL_COLLECTOR_PORT_HTTP}
        cors:
          allowed_origins:
            - "http://*"
            - "https://*"
. . .
exporters:
  debug:
  otlp:
    endpoint: "jaeger:4317"
    tls:
      insecure: true
  otlphttp/prometheus:
    endpoint: "http://prometheus:9090/api/v1/otlp"
    tls:
      insecure: true
  opensearch:
    logs_index: otel
    http:
      endpoint: "http://opensearch:9200"
      tls:
        insecure: true
processors:
  batch:
. . .
connectors:
  spanmetrics:
service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform, batch]
      exporters: [otlp, debug, spanmetrics]
    metrics:
      receivers: [hostmetrics, docker_stats, httpcheck/frontendproxy, otlp, prometheus, redis, spanmetrics]
      processors: [batch]
      exporters: [otlphttp/prometheus, debug]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [opensearch, debug]
For traces, a single OTLP receiver is configured to accept data over both gRPC and HTTP. The trace data is enriched using a transform processor, batched for efficient handling, and then exported to Jaeger for visualization, the debug exporter for troubleshooting, and spanmetrics for generating trace-derived metrics.
Metrics are collected from various sources, including system-level data with hostmetrics and docker_stats, and service-level data through the OTLP and Prometheus receivers. The data is processed in batches and exported to Prometheus.
Finally, logs are collected via the OTLP receiver, then batched and exported to OpenSearch for centralized storage and querying.
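On the sending side, each service points its OTLP exporter at the Collector. As a rough Node.js sketch, assuming the Collector is reachable at otel-collector:4317 on the Docker network, the setup might look like this; in the demo the endpoint is typically supplied through environment variables rather than hard-coded.
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');

// Export spans to the Collector's OTLP/gRPC receiver. The URL here is an
// assumption; it is usually taken from OTEL_EXPORTER_OTLP_ENDPOINT instead.
const sdk = new NodeSDK({
  serviceName: 'example-service', // illustrative name, not one of the demo services
  traceExporter: new OTLPTraceExporter({ url: 'http://otel-collector:4317' }),
});

sdk.start();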
Within the Grafana instance (exposed at http://localhost:8080/grafana/), you can select the OpenTelemetry Collector Data Flow dashboard to monitor both egress and ingress metrics, along with the observability data flow within the system:
Viewing trace data in Jaeger
To explore the trace data generated by the application's services, open the Jaeger UI in your browser at http://localhost:8080/jaeger/ui/.
On the System Architecture page, you can view a high-level visualization of how the various components in the system interact with one another.
The DAG (Directed Acyclic Graph) tab provides insights into the flow of calls between services by showing the number of calls from one service to another.
In the Force Directed Graph tab, you can click on a service and highlight all the components it interacts with, which helps simplify the debugging process by narrowing down dependencies.
Switching to the Search page, you'll find a list of recorded traces corresponding to service interactions once you click Find Traces:
By clicking on a trace, you can drill down into the details of a specific request. This reveals the services involved in processing the request and the time each service took, allowing you to identify potential performance bottlenecks.
The application activity observed is due to the load-generator component, which simulates user traffic using Locust. You can observe its activity at http://localhost:8080/loadgen/:
If you're interested in tracking specific user actions in Jaeger, I recommend stopping the load generator temporarily.
This ensures that only the traces corresponding to your actions in the application are recorded, making it easier to identify and analyze them.
Viewing metric data in Prometheus and Grafana
Service metrics in the OpenTelemetry Demo are collected and stored in Prometheus, whose UI is accessible at http://localhost:9090:
While Prometheus provides raw data and query capabilities, visualizing the metrics is much easier and more effective with the pre-built dashboards in Grafana.
These dashboards display key metrics like latency, request rates, and resource usage for each service in a user-friendly format.
To access the Grafana dashboards, navigate to http://localhost:8080/grafana/dashboards in your browser. You'll be greeted with the following set of default dashboards:
Click on the Demo Dashboard entry and select a specific service to view detailed metric graphs. For example, selecting the adservice will show visualizations for metrics like response time, request count, and CPU usage:
Simulating service faults with feature flags
The OpenTelemetry Demo uses flagd to implement feature flagging based on the OpenFeature specification. This setup allows dynamic control over application behavior without redeployment, enabling the simulation of various scenarios and fault conditions.
It includes a range of feature flags that can simulate realistic application behaviors and faults, such as:
- paymentServiceFailure: Simulates an error when the charge method is invoked in the Payment Service.
- cartServiceFailure: Causes the EmptyCart method to fail in the Cart Service.
- kafkaQueueProblems: Overloads the Kafka queue and introduces consumer-side delays, leading to lag spikes.
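To give a sense of how a service consumes one of these flags, here is a hedged JavaScript sketch using the OpenFeature SDK, assuming a flagd provider has been registered elsewhere. The demo's cart service is written in .NET, so this illustrates the pattern rather than its real code.
const { OpenFeature } = require('@openfeature/server-sdk');

// Assumes a flagd provider has already been registered with OpenFeature.
const client = OpenFeature.getClient();

async function emptyCart(userId) {
  // Evaluate the feature flag before performing the operation.
  const shouldFail = await client.getBooleanValue('cartServiceFailure', false);
  if (shouldFail) {
    throw new Error('simulated cart service failure'); // mirrors the flag's intent
  }
  // ...normal empty-cart logic would go here...
}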
To manage these feature flags, access the Flagd Configurator at http://localhost:8080/feature:
The configurator offers two views:
- Basic: Allows you to toggle predefined values for each flag, such as on or off.
- Advanced: Displays the raw JSON configuration for direct editing, providing greater flexibility for customization.
By default, all feature flags are disabled. To simulate a failure, you can enable the cartServiceFailure flag by toggling it on and clicking Save:
With the flag activated, the Empty Cart button in the application will stop functioning as expected:
These simulated faults are also reflected in the telemetry data. In Jaeger, errors will begin appearing in new traces:
Examining the trace spans in Jaeger will pinpoint the source of the errors to the cartservice, correlating directly to the feature flag change:
Since this is only simulated behavior, you can revert the feature flag change to restore normal operation.
In a real-world scenario, however, the same telemetry would point you to the underlying bug in the code, or at least tell you what to investigate further.
Final thoughts
The OpenTelemetry Demo offers a great way to explore the capabilities of OpenTelemetry while learning how to effectively use its various components.
Even running the demo application for a few minutes generates a significant amount of telemetry data, which helps you understand how the services interact within a distributed system.
For a deeper understanding, you can dive into the code for services written in your preferred programming languages to see how they are instrumented and how the collector ties everything together.
The flexibility of the OpenTelemetry Collector also makes the demo an excellent tool for evaluating and comparing different observability backends. You can specify multiple backends and see how each vendor handles the telemetry data.
With the help of feature flags, you can also simulate faults to see which tools help you identify and resolve issues the fastest.
Thanks for reading!