# A Beginner's Guide to the OpenTelemetry Demo

[The OpenTelemetry Demo](https://github.com/open-telemetry/opentelemetry-demo)
provides a practical, hands-on environment for exploring and implementing
OpenTelemetry.

**It demonstrates how OpenTelemetry can be used to instrument and monitor a
microservice-based application featuring realistic workflows** and fault
simulations to illustrate the effective use of the [OpenTelemetry
SDK](https://betterstack.com/community/guides/observability/opentelemetry-sdk/), API, Collector, and other components.

In this article, we'll delve into the architecture of the OpenTelemetry Demo,
examine how its services interact, and explain how telemetry data is collected
and analyzed to help you better understand and adopt OpenTelemetry in your own
systems.

Let's get started!

<iframe width="100%" height="315" src="https://www.youtube.com/embed/LzLULxhyIpU" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>


## Prerequisites

- Some familiarity with [OpenTelemetry's basic concepts](https://betterstack.com/community/guides/observability/what-is-opentelemetry/).
- [Docker](https://docs.docker.com/engine/install/) and Docker Compose
  installed. I recommend using [OrbStack](https://betterstack.com/community/guides/scaling-docker/switching-to-orbstack-on-macos/) on
  macOS instead of Docker Desktop.

## What is the OpenTelemetry Demo?

The OpenTelemetry Demo simulates an online astronomy-themed retail shop. This
fictional e-commerce application serves as a realistic environment for
showcasing everything OpenTelemetry has to offer for instrumenting and
monitoring a distributed system.

The shop features astronomy-related items, such as telescopes, star charts, and
space-themed merchandise. It also includes a shopping workflow involving
browsing products, adding items to a cart, checking out, and receiving an order
confirmation—typical interactions in an e-commerce platform.

![OpenTelemetry Demo product page](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/a86558eb-2d03-4f95-562a-e58d03fffc00/lg1x
=2169x1024)

It was developed with the following primary goals:

- To showcase how OpenTelemetry operates in a complex environment with multiple
  interconnected services built in different technologies.

- To allow you to observe distributed tracing, metrics, and logs in action for
  monitoring and troubleshooting an application, and to learn how to implement
  them in your own services.

- To serve as a foundational platform for vendors and tooling developers to
  showcase their OpenTelemetry integrations.

- To provide OpenTelemetry contributors with a functional, living environment
  for testing and validating new versions of the API, SDK, or other
  OpenTelemetry components before updates are released to the broader community.

### Components

The services that comprise the demo can be divided into four broad categories:

1. **Core services**: These are microservices written in different programming
   languages that talk to each other over gRPC and HTTP. These services are all
   instrumented with OpenTelemetry to produce [traces, metrics, and
   logs](https://betterstack.com/community/guides/observability/logging-metrics-tracing/).

2. **Dependencies**: Services that support the application services, such as
   Kafka and Valkey (a Redis-compatible store that backs the shopping cart).

3. **Telemetry services**: Components that deal with the telemetry data
   generated by the above services like the OpenTelemetry Collector, Prometheus,
   Grafana, OpenSearch, and Jaeger.

4. **Utility services**: These are additional services that provide specific
   functionality to support the core application, such as the `load-generator`,
   `flagd`, and `flagdui` services.

Let's look at how you can set up the demo on your local machine next.

## Setting up the OpenTelemetry Demo

Setting up the OpenTelemetry Demo is straightforward with Docker, and there's
also an option to
[deploy it through Kubernetes](https://opentelemetry.io/docs/demo/kubernetes-deployment/)
if that's your preference.

Begin by cloning its
[GitHub repository](https://github.com/open-telemetry/opentelemetry-demo.git) to
your local machine:

```command
git clone https://github.com/open-telemetry/opentelemetry-demo.git
```

Then navigate into the cloned repository with:

```command
cd opentelemetry-demo
```

You can now start the demo with `docker compose`:

```command
docker compose up --force-recreate --remove-orphans --detach
```

Since the demo involves a large number of containers and services, downloading
and building all the necessary Docker images might take several minutes
depending on your internet speed.

Once the setup is complete, you should see output similar to the following:

```text
[output]
. . .
[+] Running 26/26
 ✔ Network opentelemetry-demo         Created                        0.2s
 ✔ Container flagd                    Started                        2.2s
 ✔ Container grafana                  Started                        2.5s
 ✔ Container opensearch               Healthy                       12.6s
 ✔ Container valkey-cart              Started                        2.5s
 ✔ Container prometheus               Started                        2.5s
 ✔ Container kafka                    Healthy                       17.1s
 ✔ Container jaeger                   Started                        2.1s
 ✔ Container otel-collector           Started                       12.7s
 ✔ Container cart-service             Started                       14.5s
 ✔ Container flagdui                  Started                       14.5s
 ✔ Container quote-service            Started                       13.4s
 ✔ Container frauddetection-service   Started                       17.4s
 ✔ Container currency-service         Started                       14.1s
 ✔ Container email-service            Started                       14.1s
 ✔ Container imageprovider            Started                       13.5s
 ✔ Container ad-service               Started                       14.8s
 ✔ Container accounting               Started                       17.4s
 ✔ Container product-catalog-service  Started                       14.1s
 ✔ Container shipping-service         Started                       14.8s
 ✔ Container payment-service          Started                       13.3s
 ✔ Container recommendation-service   Started                       14.6s
 ✔ Container checkout-service         Started                       17.2s
 ✔ Container frontend                 Started                       17.5s
 ✔ Container load-generator           Started                       18.0s
 ✔ Container frontend-proxy           Started                       18.5s
```

The Docker Compose setup includes 25 containers, most of which represent the
microservices in the demo application. Additionally, the setup includes:

- `prometheus`, `grafana`, `opensearch`, and `jaeger`: For inspecting and
  visualizing telemetry data.
- `load-generator`: Uses [Locust](https://locust.io/) to simulate user traffic.
- `flagd`, `flagdui`: Provide support for changing feature flags through a user
  interface.

Once all services are up and running, open your browser and navigate to
`http://localhost:8080` to interact with the demo application:

![OpenTelemetry Demo homepage](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/832ee114-1dc3-46ac-93ea-308731f1a400/lg2x
=2099x1182)

Scrolling down reveals the available products for purchase:

![OpenTelemetry Demo products](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/8411e072-a871-4a1a-67f8-60f4d21c9300/public
=2354x1363)

Clicking on a product takes you to its details page, where you can select a
quantity and add it to your shopping cart:

![OpenTelemetry Demo product page](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/c0e9e29f-a77d-44ba-e3b6-51c5de865500/md2x
=2285x1072)

The shopping cart displays the selected products, calculated total, and demo
payment options:

![OpenTelemetry Demo shopping cart](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/e2a122df-c20e-42b1-9075-553de4279d00/md2x
=2285x1016)

Scroll to the bottom of the cart and click the **Place Order** button:

![OpenTelemetry Demo place order](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/6676faa8-2527-4d2b-2dc7-6a584d1b9c00/lg1x
=2285x1016)

You'll be directed to an order confirmation page, verifying that the demo is
functioning as expected:

![OpenTelemetry Demo order confirmation](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/6c4359e8-550c-4a8b-365f-f6f20a132600/orig
=2285x1016)

As you interact with the application, multiple microservices communicate and
work together to handle your actions.

These interactions are fully instrumented with OpenTelemetry, allowing you to
collect and analyze traces, metrics, and logs with the included telemetry tools
or with a vendor you're currently evaluating.

[ad-logs]

## Exploring the telemetry instrumentation and coverage

Each service in the demo application uses its respective OpenTelemetry SDKs to
collect telemetry data—traces, metrics, and logs. However, the extent of
coverage for each telemetry type varies by service, reflecting differences in
functionality and maturity of instrumentation.
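
To give a sense of what this looks like, here's a minimal sketch of how a
Node.js service might bootstrap the SDK. It uses the standard
`@opentelemetry/sdk-node` packages rather than the demo's exact per-service
setup files:

```javascript
// Minimal Node.js SDK bootstrap (a sketch, not the demo's actual code).
const { NodeSDK } = require('@opentelemetry/sdk-node');
const { OTLPTraceExporter } = require('@opentelemetry/exporter-trace-otlp-grpc');
const { getNodeAutoInstrumentations } = require('@opentelemetry/auto-instrumentations-node');

const sdk = new NodeSDK({
  // With no explicit endpoint, the exporter honors the standard
  // OTEL_EXPORTER_OTLP_ENDPOINT environment variable.
  traceExporter: new OTLPTraceExporter(),
  // Auto-instruments common libraries (http, grpc, and so on)
  instrumentations: [getNodeAutoInstrumentations()],
});

sdk.start();
```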

### Trace coverage

![OpenTelemetry Demo trace coverage](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/85081a16-2e78-4ae0-f5ac-4fb0fd9a2100/lg1x
=3180x2240)

Tracing is the most comprehensively implemented telemetry type in the demo. Most
services feature robust trace instrumentation, including:

- Automatic and manual span creation
- Span enrichment
- Context propagation

Features like baggage and span links are selectively implemented in services
with complex interactions, such as the **Checkout** and **Fraud Detection**
services.
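
As a rough illustration of manual span creation and enrichment, a service
written in JavaScript might wrap an operation like this (the names here are
illustrative, not taken verbatim from a demo service):

```javascript
// Sketch: manual span creation and enrichment with the JS API.
const { trace } = require('@opentelemetry/api');

const tracer = trace.getTracer('checkout');

function placeOrder(order) {
  return tracer.startActiveSpan('placeOrder', (span) => {
    try {
      // Span enrichment: attach business attributes for later analysis
      span.setAttribute('app.order.items.count', order.items.length);
      // ... call payment, shipping, and other downstream services ...
      return { ok: true };
    } finally {
      span.end();
    }
  });
}
```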

### Metric coverage

![OpenTelemetry Demo metric coverage](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/c4a2ffaf-57fb-4238-3d83-1eedb18e6600/md1x
=3236x2113)

The metric coverage varies significantly, with most services having incomplete
instrumentation. Advanced features like multiple instruments, views, and
exemplars are largely missing across the board.

However, the existing metrics are sufficient to demonstrate core OpenTelemetry
metric collection capabilities.
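
For reference, recording a basic metric through the JavaScript API looks
roughly like this (the instrument name and attributes are illustrative):

```javascript
// Sketch: a simple counter using the OpenTelemetry metrics API.
const { metrics } = require('@opentelemetry/api');

const meter = metrics.getMeter('cart');
const itemsAdded = meter.createCounter('app.cart.items_added', {
  description: 'Number of items added to shopping carts',
});

// Record a measurement, tagged with attributes, whenever an item is added
itemsAdded.add(1, { 'app.product.id': 'OLJCESPC7Z' });
```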

### Log coverage

![OpenTelemetry Demo log coverage](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/e4569876-ab93-41bf-64fa-9b27b074a400/lg2x
=2778x1936)

Logs are the least developed aspect of telemetry instrumentation in the demo,
as only a few services have implemented [OpenTelemetry Protocol (OTLP)](https://betterstack.com/community/guides/observability/otlp/) log
export.

### Context propagation

```javascript
[label src/paymentservice/charge.js]
// Baggage access requires the OpenTelemetry API package
const { context, propagation } = require('@opentelemetry/api');
. . .
  // Check baggage for synthetic_request=true, and add charged attribute accordingly
[highlight]
  const baggage = propagation.getBaggage(context.active());
[/highlight]
  if (baggage && baggage.getEntry('synthetic_request') && baggage.getEntry('synthetic_request').value === 'true') {
    span.setAttribute('app.payment.charged', false);
  } else {
    span.setAttribute('app.payment.charged', true);
  }
. . .
```

Context propagation is a key feature in OpenTelemetry that enables the
correlation of telemetry signals regardless of where they are generated. Here's
how it's implemented in the demo:

- Trace headers (such as `traceparent` and `tracestate`) are passed along with
  requests as they travel between services.
- Baggage is used to carry additional context across service boundaries. In the
  demo, it is used to annotate synthetic requests from the load generator.
- Some metrics collected in the demo include
  [exemplars](https://cloud.google.com/stackdriver/docs/instrumentation/advanced-topics/exemplars),
  which attach the trace context of sampled measurements to a metric, linking
  it back to the individual traces that produced those values.
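
Here's a small sketch of what the producing side of baggage propagation looks
like with the JavaScript API (assuming the SDK has registered the default W3C
trace context and baggage propagators; the demo's load generator does the
equivalent in Python):

```javascript
// Sketch: setting a baggage entry and injecting propagation headers.
const { context, propagation } = require('@opentelemetry/api');

// Create a baggage entry marking this request as synthetic
const baggage = propagation.createBaggage({
  synthetic_request: { value: 'true' },
});
const ctx = propagation.setBaggage(context.active(), baggage);

// Inject the active trace context and baggage into outgoing HTTP headers
const headers = {};
propagation.inject(ctx, headers);
// headers now carries 'traceparent' and 'baggage' entries for the next service
```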

## Telemetry collection and export

The generated traces, metrics, and logs are sent to the [OpenTelemetry
Collector](https://betterstack.com/community/guides/observability/opentelemetry-collector/) via gRPC and exported to the following
services:

- [Prometheus](https://betterstack.com/community/guides/monitoring/prometheus/): Scrapes the metrics and exemplars generated by the
  services.
- [Grafana](https://grafana.com/): Visualizes metric data in customizable
  dashboards.
- [Jaeger](https://betterstack.com/community/guides/observability/jaeger-guide/): Processes and displays distributed traces.
- [OpenSearch](https://opensearch.org/): Used to centralize logging data from
  services.

The configuration for the Collector defines how the telemetry data is received,
processed, and exported. Below is a snippet of the configuration:

```yaml
[label src/otel-collector/otelcol-config.yml]
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: ${env:OTEL_COLLECTOR_HOST}:${env:OTEL_COLLECTOR_PORT_GRPC}
      http:
        endpoint: ${env:OTEL_COLLECTOR_HOST}:${env:OTEL_COLLECTOR_PORT_HTTP}
        cors:
          allowed_origins:
            - "http://*"
            - "https://*"
  . . .

exporters:
  debug:
  otlp:
    endpoint: "jaeger:4317"
    tls:
      insecure: true
  otlphttp/prometheus:
    endpoint: "http://prometheus:9090/api/v1/otlp"
    tls:
      insecure: true
  opensearch:
    logs_index: otel
    http:
      endpoint: "http://opensearch:9200"
      tls:
        insecure: true

processors:
  batch:
  . . .

connectors:
  spanmetrics:

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform, batch]
      exporters: [otlp, debug, spanmetrics]
    metrics:
      receivers: [hostmetrics, docker_stats, httpcheck/frontendproxy, otlp, prometheus, redis, spanmetrics]
      processors: [batch]
      exporters: [otlphttp/prometheus, debug]
    logs:
      receivers: [otlp]
      processors: [batch]
      exporters: [opensearch, debug]
```

For traces, a single OTLP receiver is configured to accept data over both gRPC
and HTTP. The trace data is enriched using a `transform` processor, batched for
efficient handling, and then exported to Jaeger for visualization, the `debug`
exporter for troubleshooting, and `spanmetrics` for generating trace-derived
metrics.

Metrics are collected from various sources, including system-level data via the
`hostmetrics` and `docker_stats` receivers, and service-level data through the
OTLP, Prometheus, and Redis receivers. The data is processed in batches and
exported to Prometheus.

Finally, logs are collected via the OTLP receiver, then batched and exported to
OpenSearch for centralized storage and querying.

![OpenTelemetry Collector config visualized via OTelbin](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/daa29baf-e4fa-4fb5-d6e9-dcec0aeef300/lg1x
=3640x2472)

Within the Grafana instance (exposed at `http://localhost:8080/grafana/`), you
can select the **OpenTelemetry Collector Data Flow** dashboard to monitor both
egress and ingress metrics, along with the [observability data
flow](https://betterstack.com/community/guides/observability/what-is-observability/) within the system:

![OpenTelemetry Collector data flow dashboard](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/82da511d-0a13-4085-9642-ba2478b1cc00/public
=6144x4500)


[summary]
### Skip the setup complexity with automatic instrumentation

While the OpenTelemetry Demo shows how to manually instrument services and configure collectors, [Better Stack Tracing](https://betterstack.com/tracing/) uses eBPF to automatically instrument your Kubernetes or Docker workloads without code changes. Your traces, logs, and metrics start flowing immediately.

**Predictable pricing and up to 30x cheaper than Datadog.** Start free in minutes.
[/summary]

![Better Stack Tracing bubble up view highlighting the root cause of a slow request](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/ea6d6faf-b150-4ef2-0765-02113ea7b100/md2x =4160x2378)


## Viewing trace data in Jaeger

To explore the trace data generated by the application's services, open the
Jaeger UI in your browser at `http://localhost:8080/jaeger/ui/`.

On the **System Architecture** page, you can view a high-level visualization of
how the various components in the system interact with one another.

![Jaeger System Architecture page](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/42075838-cbea-44a0-82d6-4da5d261fb00/lg1x
=2048x1185)

The **DAG (Directed Acyclic Graph)** tab provides insights into the flow of
calls between services by showing the number of calls from one service to
another.

In the **Force Directed Graph** tab, you can click on a service and highlight
all the components it interacts with, which helps simplify the debugging process
by narrowing down dependencies.

![Jaeger System Architecture Force Directed Graph](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/9c164328-8a35-4057-6f58-6698a85f3100/lg2x
=1978x1077)

Switching to the **Search** page, you'll find a list of recorded traces
corresponding to service interactions once you click **Find Traces**:

![Jaeger Trace list](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/e7640dc1-a631-4bc5-e950-85c3150db100/md2x
=2391x1419)

By clicking on a trace, you can drill down into the details of a specific
request. This reveals the services involved in processing the request and the
time each service took, allowing you to identify potential performance
bottlenecks.

![Jaeger Trace View](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/6676e850-9839-4ec4-53ec-5ca3621a1200/lg1x
=2391x951)

The application activity you observe is generated by the `load-generator`
component, which simulates user traffic using Locust. You can watch it at work
at `http://localhost:8080/loadgen/`:

![Load Generator Open Telemetry Demo service using Locust](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/d7c0fa2b-5ede-4ee9-f73c-18073d1fe300/orig
=2391x951)

If you're interested in tracking specific user actions in Jaeger, I recommend
stopping the load generator temporarily.

This ensures that only the traces corresponding to your actions in the
application are recorded, making it easier to identify and analyze them.
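
You can stop it from the Locust web interface, or from the command line.
Assuming the Compose service is named `load-generator` (check
`docker compose ps --services` if it differs in your copy of the repository):

```command
docker compose stop load-generator
```

Run `docker compose start load-generator` afterwards to resume the simulated
traffic.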

## Viewing metric data in Prometheus and Grafana

Service metrics in the OpenTelemetry Demo are collected and stored in
Prometheus, whose UI is accessible at `http://localhost:9090`:

![Prometheus interface showing metrics](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/6e337ec7-0fe0-4aed-cb81-45884566af00/lg2x
=2391x701)
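
You can query these metrics directly with PromQL. For instance, the
`spanmetrics` connector derives request counts from spans, so a query along
these lines shows a per-service request rate (the metric and label names below
assume the connector's defaults and may differ in your version):

```promql
rate(traces_span_metrics_calls_total{service_name="checkout"}[5m])
```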

While Prometheus provides raw data and query capabilities, visualizing the
metrics is much easier and more effective with the pre-built dashboards in
Grafana.

These dashboards display key metrics like latency, request rates, and resource
usage for each service in a user-friendly format.

To access the Grafana dashboards, navigate to
`http://localhost:8080/grafana/dashboards` in your browser. You'll be greeted
with the following set of default dashboards:

![Grafana default dashboards](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/ba1f4791-861e-47e6-ddb8-2b40e3c5da00/md2x
=2391x909)

Click on the **Demo Dashboard** entry and select a specific service to view
detailed metric graphs. For example, selecting the **adservice** will show
visualizations for metrics like response time, request count, and CPU usage:

![Grafana demo dashboard adservice](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/c6903f2f-9494-4e4e-4602-bc9888efdf00/md2x
=3072x1612)

## Simulating service faults with feature flags

The OpenTelemetry Demo utilizes [flagd](https://flagd.dev/) to implement feature
flagging based on the [OpenFeature specification](https://openfeature.dev/).
This setup allows dynamic control over application behavior without
redeployment, enabling the simulation of various scenarios and fault
conditions.

It includes a range of
[feature flags](https://opentelemetry.io/docs/demo/feature-flags/#implemented-feature-flags)
that can simulate realistic application behaviors and faults, such as:

- `paymentServiceFailure`: Simulates an error when the `charge` method is
  invoked in the Payment Service.
- `cartServiceFailure`: Causes the `EmptyCart` method to fail in the Cart
  Service.
- `kafkaQueueProblems`: Overloads the Kafka queue and introduces consumer-side
  delays, leading to lag spikes.

To manage these feature flags, access the **Flagd Configurator** at
`http://localhost:8080/feature`.

The configurator offers two views:

- **Basic**: Allows you to toggle predefined values for each flag, such as on or
  off.

  ![Flagd Configurator Basic view](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/09975719-819e-487f-a2da-65d3c0f41100/md2x
  =2808x1473)

- **Advanced**: Displays the raw JSON configuration for direct editing,
  providing greater flexibility for customization (a sketch of the flag format
  follows after this list).

  ![Flagd Configurator Advanced view](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/45312334-5c1e-4abd-adf4-291b94321d00/md2x
  =2808x1473)
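
The flag definitions follow the flagd JSON schema. A minimal entry for
`cartServiceFailure` looks roughly like this (abridged; the demo's actual file
defines many more flags and may word things differently):

```json
{
  "flags": {
    "cartServiceFailure": {
      "description": "Fail the EmptyCart operation when enabled",
      "state": "ENABLED",
      "variants": { "on": true, "off": false },
      "defaultVariant": "off"
    }
  }
}
```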

By default, all feature flags are disabled. To simulate a failure, you can
enable the `cartServiceFailure` flag by toggling it on and clicking **Save**:

![Turn on feature flag](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/4d22f2f1-c728-4e15-51d9-629d4937e400/public
=2808x1151)

With the flag activated, the **Empty Cart** button in the application will stop
functioning as expected:

![Empty Cart OpenTelemetry demo](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/ea06011a-5b53-48d1-b62a-cdbfaa20c200/public
=1955x996)

These simulated faults are also reflected in the telemetry data. In Jaeger,
errors will begin appearing in new traces:

![Errors in Jaeger](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/3bbee9d8-3309-4289-5b66-8d2706ae9200/public
=1955x996)

Examining the trace spans in Jaeger pinpoints the source of the errors to the
`cartservice`, correlating directly with the feature flag change:

![Spans in Jaeger showing errors](https://imagedelivery.net/xZXo0QFi-1_4Zimer-T0XQ/ef29fc9e-f800-4b54-53f7-82281ae57100/orig
=2423x1081)

Since this is only simulated behavior, you can revert the feature flag change to
restore normal operation.

In a real-world scenario, the same telemetry would point you just as quickly to
the underlying bug in the code, or at least tell you what to investigate
further.

## Simplifying observability with Better Stack

The OpenTelemetry Demo shows how to instrument multiple services, configure collectors, and route telemetry data to various backends. While this approach gives you control over every aspect of your observability pipeline, it requires significant setup and maintenance work.

<iframe width="100%" height="315" src="https://www.youtube.com/embed/wQKjCDD7nfk" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

[Better Stack Tracing](https://betterstack.com/tracing/) takes a different approach that eliminates this complexity:

- eBPF-based automatic instrumentation captures traces without modifying service code
- Databases like PostgreSQL, MySQL, Redis, and MongoDB get recognized automatically
- No need to configure collectors or manage routing logic
- Context propagation works immediately across your services
- Visual "bubble up" investigation surfaces performance issues through drag and drop
- AI analyzes your service map and logs during incidents, suggesting root causes
- OpenTelemetry-native architecture keeps your data portable
- Combines traces, logs, metrics, and incident management in one platform

Instead of setting up instrumentation in each service like the demo shows, you point Better Stack at your cluster and traces start flowing. The same observability insights, without the setup overhead.

If you'd like to try automatic instrumentation while keeping OpenTelemetry compatibility, check out [Better Stack Tracing](https://betterstack.com/tracing/).

## Final thoughts

The **OpenTelemetry Demo offers a great way to explore the capabilities of OpenTelemetry** while learning how to effectively use its various components.

Even running the demo application for a few minutes generates a significant amount of telemetry data, which helps you understand how services interact within a distributed system.

For a deeper understanding, you can dive into the code for services written in your preferred programming languages to see how they are instrumented and how the collector ties everything together.

**The flexibility of the OpenTelemetry Collector also makes the demo an excellent tool for evaluating and comparing different observability backends**. You can specify multiple backends and see how each vendor handles the telemetry data.
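
For instance, you could register an additional OTLP exporter in the Collector
configuration and add it to the relevant pipelines (the endpoint below is a
placeholder; substitute your backend's OTLP URL and credentials):

```yaml
exporters:
  otlphttp/vendor:
    # Hypothetical vendor endpoint for evaluation purposes
    endpoint: "https://otlp.example.com"
    headers:
      authorization: "Bearer <token>"

service:
  pipelines:
    traces:
      receivers: [otlp]
      processors: [transform, batch]
      exporters: [otlp, debug, spanmetrics, otlphttp/vendor]
```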

If manual instrumentation feels like too much overhead, [Better Stack Tracing](https://betterstack.com/tracing/) uses eBPF to automatically instrument your workloads without code changes while maintaining OpenTelemetry compatibility.

With the help of feature flags, you can also simulate faults to see which tools help you identify and resolve issues the fastest.

Thanks for reading!