Datadog Pricing Gotchas in 2026 Explained
Datadog stands as the market leader in observability with a $40 billion market capitalization and an impressive feature set. However, as your infrastructure scales, you might encounter unexpected billing scenarios that catch even experienced engineering teams off guard. This article breaks down the core pricing mechanics that cause surprise bills and explains how each gotcha actually works.
Understanding Datadog's multi-dimensional pricing model
Datadog uses a usage-based pricing model that spans multiple products, each with its own billing metric. Unlike simpler alternatives that charge based on data volume alone, Datadog combines host-based pricing, cardinality-based metrics, event-based logs, and session-based monitoring into a single bill. This complexity creates several areas where costs can spiral unexpectedly.
The challenge isn't that Datadog is expensive—it's that the pricing model punishes common architectural patterns and operational practices that work well in modern cloud environments.
The high-water mark trap
Datadog bills Infrastructure Monitoring and APM based on the number of hosts you monitor each month. However, the billing calculation uses a high-water mark system that can create significant cost surprises during scaling events.
How high-water mark billing works
Every hour, Datadog measures your active host count. At the end of the month, it discards the top 1% of hours with the highest usage, then bills you for the entire month based on the 99th percentile hour. This protects you from brief anomalous spikes but penalizes sustained usage increases.
Here's a concrete example of how this affects real deployments:
Your application normally runs on 50 hosts at $31/host for APM, resulting in a typical monthly bill of $1,550. You're running a 5-day marketing campaign and scale to 200 hosts to handle the increased traffic. The campaign runs smoothly, but at month's end, you receive a bill for $6,200.
What happened? Datadog ignored only the top 1% of hours (roughly 7 hours out of 720), but your campaign ran for 120 hours. The 99th percentile hour still showed 200 hosts, so you're billed for 200 hosts for the entire month—not just the campaign period.
The math reveals the impact: you paid 4x your normal rate even though the elevated usage represented only 17% of the month. You're effectively paying for peak capacity as if it were sustained capacity.
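To make the mechanics concrete, here's a minimal sketch of the percentile billing described above, using the numbers from this example (720 hourly samples, $31/host APM); the exact rounding Datadog applies may differ:

```python
# Simulate one 30-day month of hourly host-count samples.
baseline = [50] * 600    # normal operation: 50 hosts for ~25 days
campaign = [200] * 120   # 5-day campaign: 200 hosts

hours = sorted(baseline + campaign)

# Discard the top 1% of hours (~7 of 720), then bill the whole month
# on the highest remaining hour -- the 99th percentile hour.
p99_index = int(len(hours) * 0.99) - 1
billable_hosts = hours[p99_index]

apm_rate = 31  # $/host/month for APM
print(billable_hosts)             # 200 -- the campaign still sets the mark
print(billable_hosts * apm_rate)  # 6200, not the usual 1550
```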
The cost of elasticity
This billing model creates a perverse incentive against elastic scaling. Teams start making architectural decisions based on monitoring costs rather than application needs. You might consolidate services onto fewer, larger instances to reduce your host count, even though distributed microservices would be more maintainable. You might delay scaling up during traffic increases, risking degraded performance to avoid billing spikes.
The high-water mark system particularly impacts:
- Marketing campaigns and seasonal traffic patterns
- A/B tests requiring infrastructure duplication
- Gradual rollouts using blue-green deployments
- Disaster recovery testing that spins up additional capacity
- Development and staging environments that mirror production
The container multiplication problem
Host-based pricing becomes especially dangerous in containerized environments. Datadog recommends running one agent per host (such as one per Kubernetes node), but a common misconfiguration can multiply your costs dramatically.
The intended setup is a DaemonSet, which schedules exactly one agent pod per node. If you instead inject the agent as a sidecar into every application pod, each pod registers as a separate host. A 50-node Kubernetes cluster running 10 pods per node suddenly becomes 500 billable hosts instead of 50, a 10x cost increase from a single configuration error.
This gotcha catches teams during their initial Kubernetes setup. The error often goes unnoticed in development with just a few pods, then creates a massive bill when you deploy to production with hundreds of pods across dozens of nodes.
Datadog does include container allotments (5 containers per host for Pro, 10 for Enterprise), but once you exceed these allotments, you pay $0.002 per container per hour, or roughly $1.50 per container per month. For a large Kubernetes cluster, container overages can add thousands to your monthly bill.
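As a rough sketch of both failure modes, using the allotment and per-container rate quoted above (a 730-hour month and a Pro plan are assumed):

```python
nodes, pods_per_node = 50, 10

# Misconfiguration: an agent in every pod makes every pod a billable host.
misconfigured_hosts = nodes * pods_per_node      # 500 hosts instead of 50
print(misconfigured_hosts, "billable hosts instead of", nodes)

# Even a correct setup pays container overages beyond the Pro allotment.
containers = nodes * pods_per_node               # 500 running containers
included = nodes * 5                             # 5 free containers per host
overage = max(0, containers - included)          # 250 billable containers
overage_cost = overage * 0.002 * 730             # $0.002/container/hour
print(f"${overage_cost:,.0f}/month in container overages")  # $365
```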
The custom metrics cardinality explosion
Custom metrics represent one of the most unpredictable aspects of Datadog billing. At scale, custom metrics frequently constitute 30-50% of the total bill, yet teams often don't realize they're accumulating charges until the bill arrives.
How custom metrics are defined and counted
Datadog charges a premium for "custom metrics"—any metrics not provided by standard integrations. This includes all metrics from your own applications and, critically, all metrics sent via OpenTelemetry, which many teams use as their observability standard.
The Pro plan includes 100 custom metrics per infrastructure host, and additional metrics cost $5 per 100 metrics monthly. However, understanding what counts as "one metric" requires understanding cardinality.
In Datadog's model, a metric isn't just the metric name. It's the unique combination of the metric name and all its tags. If you have a metric called api.request.latency with three tags—endpoint (10 values), status_code (5 values), and customer_tier (3 values)—you've created 150 unique time series:
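10 endpoint values × 5 status_code values × 3 customer_tier values = 150 unique time series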
That single metric name has already consumed more than your entire allotment for one host. Now imagine adding a customer_id tag with thousands of unique values. Only tag combinations that actually occur become time series, but even so, your metric count can explode into the tens or hundreds of thousands, generating hundreds or thousands of dollars in monthly charges.
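A back-of-the-envelope sketch of how cardinality becomes dollars, using the Pro-plan numbers above; the 1,000-customer count is an illustrative assumption, and real series counts are lower because only combinations that are actually emitted become time series:

```python
from math import prod

# Worst case: one metric name x every combination of tag values.
tag_values = {"endpoint": 10, "status_code": 5, "customer_tier": 3}
print(prod(tag_values.values()))        # 150 time series

tag_values["customer_id"] = 1_000       # illustrative high-cardinality tag
series = prod(tag_values.values())      # up to 150,000 time series

included = 100                          # Pro allotment (single host assumed)
overage = max(0, series - included)
print(f"${overage / 100 * 5:,.0f}/month at $5 per 100 metrics")  # $7,495
```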
The OpenTelemetry penalty
Many teams adopt OpenTelemetry for vendor-neutral observability. However, Datadog treats all OpenTelemetry metrics as custom metrics. If you're instrumenting your application with OTel and sending metrics to Datadog, you're paying the custom metric premium for what would be standard metrics in other platforms.
This creates a difficult decision: maintain vendor neutrality and pay premium prices, or use Datadog-specific instrumentation to get standard metrics pricing but lock yourself into the platform.
Metrics without limits complexity
Datadog offers "Metrics without Limits™" to manage cardinality costs. You can configure which tags are indexed (searchable) and which are only ingested. However, this introduces two separate charges:
- Ingested metrics: $0.10 per 100 metrics for all the original metric combinations you send
- Indexed metrics: The standard overage rate for metrics you choose to index with specific tags
You're essentially choosing between paying a high price for full visibility or juggling two different billing dimensions to reduce costs. Teams need specialized knowledge to configure this effectively, and misconfiguration can either eliminate the cost savings or eliminate critical observability.
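As a sketch of how the two billing dimensions interact (the 50,000 ingested combinations and 5,000 indexed combinations are illustrative assumptions, and per-host allotments are ignored for simplicity):

```python
ingested_series = 50_000   # every tag combination you send
indexed_series = 5_000     # the subset you keep searchable

ingest_cost = ingested_series / 100 * 0.10  # $0.10 per 100 ingested metrics
index_cost = indexed_series / 100 * 5.00    # standard $5 per 100 overage rate

print(f"Ingested: ${ingest_cost:,.0f}/month")  # $50
print(f"Indexed:  ${index_cost:,.0f}/month")   # $250
```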
The double-charge for logs
Datadog's log management uses a two-part pricing model that looks inexpensive at first glance but becomes costly in practice. You pay separately to collect logs (Ingest) and to make them searchable (Index).
Understanding the two-part tariff
Ingestion costs $0.10 per GB; you pay this just to get logs into Datadog's system. However, logs that are only ingested aren't searchable; they're useful mainly for archiving. To query logs during incident response, you need to index them, which costs $1.70 per million log events.
Consider an application generating 200 GB of logs monthly, equivalent to roughly 100 million log events:
- Ingest cost: 200 GB × $0.10 = $20
- Indexing cost: 100 million events × $1.70 = $170
- Total: $190 per month
The indexing cost is 8.5x the ingestion cost. The real gotcha is that you must choose which logs to index before an incident occurs. To control costs, teams typically index only 10-20% of their logs, meaning 80-90% of log data is unsearchable when you need it most.
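A sketch of the trade-off, reusing the numbers above; the roughly 2 KB average event size is implied by 200 GB mapping to about 100 million events, and the 15% indexing rate is an illustrative pick from the 10-20% range just mentioned:

```python
gb_ingested = 200
events_m = 100                        # ~100M events at ~2 KB each

ingest = gb_ingested * 0.10           # $0.10/GB -> $20
index_all = events_m * 1.70           # index everything -> $170
index_15pct = events_m * 0.15 * 1.70  # index 15% -> $25.50, but 85% of
                                      # logs are unsearchable in an incident
print(f"Ingest ${ingest:.0f}, full index ${index_all:.0f} "
      f"({index_all / ingest:.1f}x ingest), 15% index ${index_15pct:.2f}")
```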
The incident response dilemma
During an outage, you need comprehensive logs to diagnose issues quickly. However, the cost structure forces you to make a painful choice:
- Index everything and pay the full cost, which can reach thousands of dollars monthly
- Index selectively and risk missing the critical log line that would explain the root cause
Teams often start with aggressive log indexing, see the bill, then reduce indexing to control costs. Later, during a critical incident, they discover they can't access the logs they need because they weren't indexed. The missing logs could be rehydrated from archives, but this process takes time when every minute of downtime costs money.
The session-based RUM pricing challenge
Real User Monitoring (RUM) introduces a different billing dynamic. You pay per 1,000 sessions, but Datadog's RUM model requires understanding the difference between RUM Measure, RUM Investigate, and Session Replay.
The three-tier RUM structure
Datadog splits RUM into three products:
- RUM Measure ($0.15 per 1,000 sessions): Collects all sessions and generates 30+ out-of-the-box performance metrics
- RUM Investigate ($3 per 1,000 sessions): Retains filtered sessions for 30 days for detailed troubleshooting
- Session Replay ($2.50 per 1,000 sessions): Adds video-like replay of user sessions
You must purchase both RUM Measure and RUM Investigate to get meaningful RUM capabilities. RUM Measure alone gives you aggregated metrics but no ability to drill into individual problematic sessions. RUM Investigate lets you filter and retain only high-value sessions (those with errors, crashes, or specific user actions), but you're paying $3 per 1,000 sessions on top of the $0.15 base rate.
For a site with 10 million sessions monthly:
- RUM Measure: 10,000 × $0.15 = $1,500
- RUM Investigate (if you filter to retain 30%): 3,000 × $3 = $9,000
- Session Replay (if you replay 20% of retained sessions): 600 × $2.50 = $1,500
- Total: $12,000 per month
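The same breakdown as a sketch; the 30% retention and 20% replay rates are the illustrative assumptions used in the list above:

```python
sessions = 10_000_000

measure = sessions / 1_000 * 0.15         # every session is measured
retained = sessions * 0.30                # filtered into RUM Investigate
investigate = retained / 1_000 * 3.00
replayed = retained * 0.20                # subset kept for Session Replay
replay = replayed / 1_000 * 2.50

print(f"${measure:,.0f} + ${investigate:,.0f} + ${replay:,.0f} "
      f"= ${measure + investigate + replay:,.0f}/month")
# $1,500 + $9,000 + $1,500 = $12,000/month
```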
The pricing structure means you need to configure filtering carefully. If you retain too many sessions in RUM Investigate, costs balloon. If you filter too aggressively, you miss the sessions you need for debugging.
The APM span indexing trap
APM follows a similar pattern to logs: you pay separately for ingestion and indexing of traces. The default allotment is 150 GB of span ingestion and 1 million indexed spans per APM host per month. However, the hourly billing for on-demand usage creates unexpected charges.
How hourly span metering works
For committed plans, Datadog aggregates your usage across the entire month. If one host uses 200 GB and another uses 100 GB, you average 150 GB per host and stay within your allotment.
However, on-demand billing uses hourly metering. Your allotments are divided by the hours in the month, giving you 0.205 GB ingested spans and 1,370 indexed spans per host per hour. If any hour exceeds these thresholds across your fleet, you're charged overages—even if your monthly average stays under the allotment.
This catches teams that have bursty traffic patterns. Your application might trace heavily during business hours and lightly overnight. The daytime hours generate overages, and the nighttime hours don't offset them under hourly metering.
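Here's a sketch of why bursty traffic loses under hourly metering, assuming a 10-host on-demand fleet, a 730-hour month, the span-ingestion allotment above, and a simple 12-hours-busy, 12-hours-quiet daily pattern:

```python
hosts, hours_in_month = 10, 730

# Monthly allotment prorated to an hourly, fleet-wide threshold.
hourly_allotment_gb = 150 * hosts / hours_in_month   # ~2.05 GB/hour

busy_gb, quiet_gb = 3.0, 0.5   # GB of spans traced per hour, day vs night
overage_gb = 0.0
for hour in range(hours_in_month):
    usage = busy_gb if hour % 24 < 12 else quiet_gb
    overage_gb += max(0.0, usage - hourly_allotment_gb)

print(f"Average {(busy_gb + quiet_gb) / 2:.2f} GB/h vs "
      f"allotment {hourly_allotment_gb:.2f} GB/h")  # average is under
print(f"Billable overage: {overage_gb:,.0f} GB")    # ~350 GB anyway
```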
The accumulation effect
The real gotcha isn't any single pricing mechanism—it's how these elements combine. A team adopting Datadog often starts with Infrastructure Monitoring and APM, then adds Log Management for troubleshooting, RUM to understand user experience, and Database Monitoring for query optimization. Each addition makes sense individually, but the cumulative cost grows non-linearly.
Here's how costs accumulate for a mid-sized deployment (100 hosts, moderate usage):
- Infrastructure Pro: 100 hosts × $15 = $1,500
- APM Enterprise: 100 hosts × $40 = $4,000
- Custom metrics (10,000 above allotment): 100 × $5 = $500
- Log Management (1 TB ingested, 500M indexed): $100 + $850 = $950
- RUM (5M sessions with Measure + Investigate): (5,000 × $0.15) + (1,500 × $3) = $5,250
- Database Monitoring (20 database hosts): 20 × $70 = $1,400
Total monthly cost: $13,600
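Summing the list above as a quick sanity check:

```python
costs = {
    "Infrastructure Pro": 100 * 15,
    "APM Enterprise": 100 * 40,
    "Custom metrics": 10_000 / 100 * 5,
    "Log Management": 1_000 * 0.10 + 500 * 1.70,
    "RUM": 5_000 * 0.15 + 1_500 * 3,
    "Database Monitoring": 20 * 70,
}
print(f"${sum(costs.values()):,.0f}/month")  # $13,600
```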
This represents a modest production environment. Larger deployments with microservices architectures, high-cardinality metrics, and comprehensive logging can easily reach $50,000-$100,000+ monthly.
Why Better Stack is better
Better Stack delivers OpenTelemetry-native tracing, log management, infrastructure monitoring, and incident management without the pricing gotchas described above. Its eBPF-based collector automatically instruments Kubernetes and Docker environments without code changes, eliminating the host-based billing traps that catch teams using traditional platforms.
The difference isn't just lower costs. It's predictable, transparent pricing that aligns with how modern engineering teams work.
Simple, volume-based pricing
Better Stack uses straightforward data volume pricing across all telemetry types:
- Logs & Traces: $0.10 per GB ingested
- Metrics: $0.50 per GB stored on NVMe SSD
- Error Tracking: $0.000050 per exception (approximately 6x cheaper than Sentry)
There's no host-based billing, no custom metric premiums, and no separate charges for ingestion versus indexing. You send data, you pay for the volume—the pricing model doesn't change based on your architecture, cardinality, or how you've tagged your metrics.
No architectural penalties
Better Stack's pricing doesn't punish modern cloud-native patterns:
- No high-water mark billing: Scale elastically without paying peak prices for the entire month
- No container traps: Container density doesn't affect your bill
- No cardinality penalties: High-cardinality metrics cost the same as low-cardinality metrics
- No double-charging: Logs are searchable immediately—no separate indexing fee
You can architect your infrastructure for reliability and maintainability without considering monitoring costs. Run microservices, scale dynamically, use detailed tags, and log comprehensively without triggering unexpected charges.
OpenTelemetry native without penalties
Better Stack is built from the ground up to be OpenTelemetry native. Unlike Datadog, which charges premium rates for OpenTelemetry metrics as "custom metrics," Better Stack fully embraces OTel as a first-class citizen. This means:
- No vendor lock-in—your instrumentation code remains portable
- Full support for OTel semantic conventions with deeper out-of-the-box insights
- No price penalty for using open standards
- Native support for trace funnels, external API monitoring, and messaging queue monitoring
Recent additions like intelligent trace sampling and automated external API performance tracking demonstrate Better Stack's commitment to OTel-native innovation.
Transparent, predictable costs
Better Stack's pricing page shows exactly what you'll pay with no hidden dimensions:
- No per-host charges that vary by infrastructure type
- No separate ingestion and indexing tiers to manage
- No multi-tier session pricing requiring complex filtering
- No hourly metering that penalizes bursty workloads
The $0.10 per GB logs pricing is comprehensive—it includes ingestion, transformation with VRL, querying with SQL or PromQL, real-time live tail, and alerting. What you see is what you pay.
Real cost comparison
Let's compare the same 100-host deployment we calculated for Datadog:
Datadog costs (monthly):
- Infrastructure Pro: $1,500
- APM Enterprise: $4,000
- Custom metrics: $500
- Log Management: $950
- RUM: $5,250
- Database Monitoring: $1,400
- Total: $13,600/month
Better Stack equivalent (monthly):
- 500 GB logs @ $0.10/GB: $50
- 500 GB traces @ $0.10/GB: $50
- 100 GB metrics @ $0.50/GB: $50
- Uptime monitoring (50 monitors): $21
- Error tracking (10M exceptions): $500
- Total: $671/month
Better Stack delivers comprehensive observability at approximately 5% of Datadog's cost for this deployment. The savings increase as you scale because Better Stack's pricing grows linearly with data volume rather than compounding across multiple billing dimensions.
Flexible deployment options
Better Stack offers hosting flexibility that adapts to your needs:
- Better Stack Cloud: Fully managed SaaS for teams wanting zero infrastructure overhead
- Better Stack Enterprise: Self-hosted in your own cloud or on-premise for strict data residency requirements
You're not locked into a single deployment model as your requirements evolve.
Built for developer experience
Better Stack provides modern observability features without complexity:
- Drag-and-drop query builder alongside SQL for power users
- Transform logs with VRL or JavaScript before storage
- Convert logs to metrics for long-term analysis
- Anomaly detection and alerting built-in
- Integrated incident management and on-call scheduling
- Shareable dashboards and embedded charts for status pages
The platform focuses on solving observability problems rather than billing optimization.
Migration made simple
Worried about switching costs? Better Stack provides an automated migration tool that translates Datadog dashboards in minutes without manual rebuilding. You can test Better Stack risk-free with a 60-day money-back guarantee.
The combination of predictable pricing, no architectural penalties, OpenTelemetry native support, and comprehensive features makes Better Stack an ideal choice for teams seeking observability without surprise bills.
Making informed decisions
Understanding these gotchas doesn't necessarily mean avoiding Datadog. For many organizations, especially those needing extensive features in a single platform, Datadog provides genuine value. However, informed decision-making requires understanding the true cost of scale.
Before committing to Datadog, consider:
- How your current infrastructure patterns align with host-based pricing
- Your container density and whether per-container charges will apply
- The cardinality of your custom metrics and OpenTelemetry adoption plans
- Whether you need comprehensive log indexing or can accept selective indexing
- The volume of user sessions and which RUM features you actually need
For teams seeking more predictable pricing based primarily on data volume rather than hosts, cardinality, and multiple billing dimensions, alternatives exist that eliminate these gotchas. Better Stack, for example, charges $0.10 per GB for logs and traces with straightforward ingestion-based pricing. There's no host-based billing, no custom metric premiums, and no multi-tier session pricing. You send data, you pay for the data volume—the pricing model doesn't penalize modern architectural patterns or high-cardinality metrics.
Better Stack provides distributed tracing, log management, error tracking, uptime monitoring, and incident management with transparent pricing that scales linearly with your actual data usage. For teams that find Datadog's multi-dimensional pricing unpredictable, this simpler model can reduce both costs and cognitive overhead.
The key is matching your observability strategy to both your technical needs and your cost tolerance as you scale. Whether you choose Datadog or an alternative, understanding the pricing mechanics helps you avoid surprise bills and make architectural decisions based on application needs rather than monitoring costs.