# Benchmarking in Go: A Comprehensive Handbook

Performance optimization is crucial for building efficient applications, but
without proper measurement, optimization becomes mere guesswork. As Donald Knuth
famously stated, "premature optimization is the root of all evil." This is where
benchmarking comes in.

Go stands out among programming languages by providing built-in benchmarking as
part of its standard library. This native support reflects Go's philosophy of
making performance testing accessible to all developers, not just performance
specialists.

Benchmarking in Go allows you to:

- Measure code performance with nanosecond precision.
- Compare implementation alternatives.
- Detect performance regressions.
- Understand memory allocation patterns.
- Make data-driven optimization decisions.

This guide will walk you through everything you need to know about benchmarking
in Go, from basic concepts to advanced techniques.

<iframe width="100%" height="315" src="https://www.youtube.com/embed/6KGH47rdeBs" title="YouTube video player" frameborder="0" allow="accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share" allowfullscreen></iframe>

## Getting started with Go benchmarks

Go benchmarks are functions that live in `*_test.go` files, just like unit
tests. While tests begin with `Test`, benchmarks follow a specific naming
convention:

```go
func BenchmarkXxx(b *testing.B) {
    // benchmark code
}
```

The benchmark function must:

1. Start with `Benchmark`.
2. Accept a `*testing.B` parameter.
3. Be in a file with a `_test.go` suffix.

The `testing.B` type provides the benchmarking infrastructure, including timing,
iteration control, and reporting facilities.

Let's create a simple benchmark for a string concatenation function:

```go
[label concat.go]
package concat

func JoinStrings(strs []string) string {
    var result string
    for _, s := range strs {
        result += s
    }
    return result
}
```

```go
[label concat_test.go]
package concat

import "testing"

func BenchmarkJoinStrings(b *testing.B) {
    strs := []string{"Hello", ", ", "world", "!"}

    // The benchmark runner will call this function b.N times
    for i := 0; i < b.N; i++ {
        JoinStrings(strs)
    }
}
```

To run a benchmark, use the `go test` command with the `-bench` flag:

```command
go test -bench=.
```

```text
[output]
goos: linux
goarch: amd64
pkg: github.com/betterstack-community/golang-benchmarks
cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
BenchmarkJoinStrings-16          9762195               123.0 ns/op
PASS
ok      github.com/betterstack-community/golang-benchmarks      1.330s
```

This means:

- The `-16` suffix reflects the value of `GOMAXPROCS` during the run (16 here).
- The benchmark executed 9762195 iterations.
- Each operation took approximately 123 nanoseconds.

## Understanding b.N

The benchmark framework automatically determines the value of `b.N` by running
your benchmark multiple times with increasing values until the measurements are
reliable.

The framework starts with a small value (usually 1) and increases it until the
benchmark runs for a sufficient duration (default is 1 second). This is why your
benchmark function must execute the code under test `b.N` times:

```go
func BenchmarkSomething(b *testing.B) {
    // Optional setup code

    b.ResetTimer() // Reset the timer if setup took significant time

    for i := 0; i < b.N; i++ {
        // Code you want to measure
    }
}
```

Often, benchmarks require setup and teardown code that shouldn't be included in
the timing measurements:

```go
func BenchmarkComplexOperation(b *testing.B) {
    // Setup
    data := createLargeDataset()

    // Reset the timer to exclude setup time
    b.ResetTimer()

    for i := 0; i < b.N; i++ {
        processData(data)
    }

    // Optionally pause timer during cleanup
    b.StopTimer()
    cleanupResources()
}
```

The key timing control methods include:

- `b.ResetTimer()`: Resets the timer to zero.
- `b.StartTimer()`: Resumes the timer after it was stopped.
- `b.StopTimer()`: Temporarily stops the timer.

Note that the Go compiler might optimize away code that doesn't have observable
effects, potentially invalidating your benchmark:

```go
func BenchmarkMightBeOptimizedAway(b *testing.B) {
    for i := 0; i < b.N; i++ {
        // This computation might be eliminated by the compiler
        // since its result is never used
        math.Sqrt(float64(i))
    }
}
```

To prevent this, ensure the result is used:

```go
func BenchmarkPreventOptimization(b *testing.B) {
    var result float64
    for i := 0; i < b.N; i++ {
        result += math.Sqrt(float64(i))
    }
    // Use the result to prevent optimization
    if result < 0 {
        b.Fatalf("negative result: %f", result)
    }
}
```

## Introducing b.Loop

[Go 1.24](https://betterstack.com/community/guides/scaling-go/go-1-24/) introduces a cleaner, more efficient approach to benchmarking
with the `testing.B.Loop` method, which addresses several nuances and potential
pitfalls of the traditional `b.N` loop:

```go
func BenchmarkStringConversion(b *testing.B) {
    // Setup - prepare a large integer to convert to string
    number := 9876543210
    b.ResetTimer()

    // We need a result variable to prevent optimization
    var result string

    for i := 0; i < b.N; i++ {
        // The operation we want to benchmark
        result = strconv.Itoa(number)
    }

    // Prevent compiler from optimizing away the unused result
    if len(result) == 0 {
        b.Fatal("unexpected empty string")
    }
}
```

Several issues arise with this approach:

1. The benchmark function runs multiple times, causing setup code to execute
   repeatedly.
2. You must remember to call `b.ResetTimer()` to exclude setup time from
   measurements.
3. You need to use a result variable and ensure it's used somehow to prevent the
   compiler from optimizing away your benchmark code.

The new `b.Loop()` approach eliminates these concerns:

```go
func BenchmarkStringConversion(b *testing.B) {
    // Setup - prepare a large integer to convert to string
    number := 9876543210

    // No need for b.ResetTimer() - everything outside the loop is excluded
    // No need for a result variable to prevent optimization

    for b.Loop() {
        // The operation we want to benchmark
        strconv.Itoa(number)
    }
}
```

Key advantages of `b.Loop()`:

1. The benchmark function executes only once per `-count`, so setup code runs
   just once.
2. Code outside the `b.Loop()` call doesn't affect benchmark timing, eliminating
   the need for `b.ResetTimer()`.
3. The compiler won't optimize away function calls within a `b.Loop()` body,
   even if results aren't used.

This results in benchmarks that are easier to write, less error-prone, and
potentially more accurate by avoiding repeated setup overhead.

Note that your benchmarks should use either `b.Loop()` or a `b.N`-style loop,
but not both in the same benchmark function.

## Benchmarking different types of code

Go's benchmarking framework is versatile enough to handle various code patterns
and structures. Whether you're benchmarking simple functions, methods on
structs, concurrent operations, or memory-intensive processes, the framework
provides appropriate tools and approaches.

With the introduction of the `b.Loop()` method in Go 1.24, benchmarking becomes
even more straightforward and less error-prone across these different scenarios.
Let's explore how to effectively benchmark various types of Go code using this
improved approach.

### Function benchmarks

We've already seen simple function benchmarks. For functions with parameters,
make sure to create representative inputs:

```go
func BenchmarkCalculate(b *testing.B) {
    // Prepare realistic input data
    input := generateRepresentativeData()

    for i := 0; i < b.N; i++ {
        Calculate(input)
    }
}
```

### Method benchmarks

Method benchmarks are similar to function benchmarks but involve struct
instances:

```go
func BenchmarkProcessor_Process(b *testing.B) {
    processor := NewProcessor(/* config */)
    data := generateTestData()

    for b.Loop() {
        processor.Process(data)
    }
}
```

### Concurrent code benchmarks

For benchmarking concurrent code, you may need to synchronize goroutines:

```go
func BenchmarkConcurrentOperation(b *testing.B) {
    for b.Loop() {
        var wg sync.WaitGroup
        wg.Add(10)

        for j := 0; j < 10; j++ {
            go func() {
                defer wg.Done()
                // Concurrent operation
                processItem()
            }()
        }

        wg.Wait()
    }
}
```

### Memory allocation benchmarks

Go allows benchmarking memory allocations as well as execution time:

```go
func BenchmarkMemoryIntensive(b *testing.B) {
    // Report memory allocations
    b.ReportAllocs()

    for b.Loop() {
        createLargeData()
    }
}
```

Running with the `-benchmem` flag provides allocation statistics:

```command
go test -bench=MemoryIntensive -benchmem
```

Output includes bytes allocated and allocations per operation:

```text
BenchmarkMemoryIntensive-8    100000    15234 ns/op    8192 B/op    16 allocs/op
```

---

As you've seen, the same core principles apply whether you're benchmarking a
simple function or complex concurrent operations. The `b.Loop()` method
simplifies all these cases by handling iteration count automatically and
excluding setup code from timing measurements.

Now that we've covered the basics of benchmarking different code types, let's
explore more advanced techniques that allow for more sophisticated performance
analysis and comparative benchmarking.

## Advanced benchmarking techniques

While basic benchmarks provide valuable insights, Go's benchmarking framework
offers advanced capabilities that enable more sophisticated performance
analysis.

These techniques help you benchmark across different parameters, compare
multiple implementations, and gain deeper insights into performance
characteristics under varying conditions.

The following approaches will help you create comprehensive benchmark suites
that can identify subtle performance differences and guide your optimization
efforts more effectively.

### Subbenchmarks

Subbenchmarks allow running variants of a benchmark with different parameters:

```go
func BenchmarkSort(b *testing.B) {
   sizes := []int{100, 1000, 10000, 100000}

   for _, size := range sizes {
       b.Run(fmt.Sprintf("Size-%d", size), func(b *testing.B) {
           data := generateRandomSlice(size)

           for b.Loop() {
               // Create a copy to avoid measuring the sorting of already sorted data
               dataCopy := make([]int, len(data))
               copy(dataCopy, data)
               sort.Ints(dataCopy)
           }
       })
   }
}
```

### Benchmark tables

Similar to table-driven tests, table-driven benchmarks help test multiple
scenarios:

```go
func BenchmarkHashFunctions(b *testing.B) {
   benchmarks := []struct {
       name    string
       input   []byte
       hashFn  func([]byte) []byte
   }{
       {"MD5", []byte("test data"), md5Sum},
       {"SHA1", []byte("test data"), sha1Sum},
       {"SHA256", []byte("test data"), sha256Sum},
   }

   for _, bm := range benchmarks {
       b.Run(bm.name, func(b *testing.B) {
           for b.Loop() {
               bm.hashFn(bm.input)
           }
       })
   }
}
```

### Parameterized input sizes

To understand how an algorithm performs with different input sizes:

```go
func BenchmarkSliceOperations(b *testing.B) {
   for _, size := range []int{10, 100, 1000, 10000} {
       slice := make([]int, size)
       for i := range slice {
           slice[i] = i
       }

       b.Run(fmt.Sprintf("Sum-%d", size), func(b *testing.B) {
           for b.Loop() {
               sum := 0
               for _, v := range slice {
                   sum += v
               }
               // Use sum to prevent optimization
               if sum < 0 {
                   b.Fatalf("negative sum")
               }
           }
       })
   }
}
```

### Custom timing

For precise control over what gets timed:

```go
func BenchmarkWithPreciseControl(b *testing.B) {
   // Setup code outside the loop is excluded from timing
   data := prepareData()

   var result Result
   for b.Loop() {
       // Only the code inside the loop body is timed
       result = process(data)
   }

   // Validation after the loop doesn't affect the measurement;
   // with b.Loop() there's no need for manual StopTimer/StartTimer calls
   validate(result)
}
```

## Analyzing benchmark results

Standard benchmark output provides a wealth of information, though it appears
deceptively simple. Consider this typical benchmark result:

```text
BenchmarkJoinStrings-8    5000000    264 ns/op    48 B/op    2 allocs/op
```

This condensed line tells us several important things about the benchmark
execution:

The first section, `BenchmarkJoinStrings-8`, identifies the benchmark name
followed by the value of `GOMAXPROCS` (the number of CPUs usable by the
benchmark) during execution. This hyphenated suffix helps when comparing
results across different machines.

The second figure, `5000000`, represents the number of iterations the
benchmark ran. The Go testing framework automatically determines this number by
repeatedly running your benchmark with increasing iteration counts until it
achieves statistical significance—typically aiming for a total run time of at
least one second.

The third figure, `264 ns/op`, is the average time per operation in nanoseconds.
This is your primary performance metric, telling you how long, on average, each
execution of your benchmarked code took.

When memory statistics are enabled with the `-benchmem` flag, you'll see two
additional metrics: `48 B/op` shows average memory allocated per operation (48
bytes in this case), and `2 allocs/op` indicates the average number of distinct
memory allocations per operation.

## Comparing benchmarks with benchstat

Raw benchmark numbers can be difficult to interpret, especially when comparing
different implementations or tracking performance changes over time.

The `benchstat` tool, part of the Go performance measurement toolkit, applies
statistical analysis to benchmark results to provide more meaningful
comparisons.

To use `benchstat`, first install it:

```command
go install golang.org/x/perf/cmd/benchstat@latest
```

Then, capture benchmark results from different versions of your code:

```command
go test -bench=. -count=10 > old.txt
```

When you make changes to your code, capture the new benchmark results in a
different file:

```command
go test -bench=. -count=10 > new.txt
```

Then compare both results with:

```command
benchstat old.txt new.txt
```

```text
[output]
goos: linux
goarch: amd64
pkg: github.com/betterstack-community/golang-benchmarks
cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
               │   old.txt    │            new.txt             │
               │    sec/op    │    sec/op     vs base          │
JoinStrings-16   74.27n ± 15%   73.98n ± 11%  ~ (p=0.684 n=10)
```

Here, the result shows:

- Old implementation: 74.27 nanoseconds per operation with 15% variability.
- New implementation: 73.98 nanoseconds per operation with 11% variability.

For the statistical analysis:

- The tilde (~) indicates no statistically significant difference between the
  old and new implementations.
- The p-value of 0.684 is well above the typical threshold of 0.05, confirming
  that the difference is not statistically significant.
- "n=10" indicates that 10 samples were used for this statistical analysis.

In practical terms, this means that despite the small nominal improvement from
74.27ns to 73.98ns (about 0.4% faster), the high variability in the measurements
(15% and 11%) and the high p-value (0.684) indicate that this difference is
likely just random variation. The two implementations should be considered
equivalent in performance.

This is a good example of why proper statistical analysis is important in
benchmarking: looking at just the raw numbers might have led someone to
incorrectly conclude that the new implementation was faster, when in fact
there's no meaningful performance difference.

## Final thoughts

Benchmarking in Go is more than just a development practice—it's a mindset that
encourages performance-conscious programming. Go's `testing` package provides a
robust framework for measuring, analyzing, and optimizing code performance
without requiring external tools or complex setups.

Performance optimization without measurement is guesswork, but with Go's
benchmarking tools, you can make data-driven decisions. By integrating
benchmarking into your development workflow—whether through manual testing
during development or automated performance monitoring in CI pipelines—you
establish a foundation for maintaining and improving application performance
over time.

Remember that the goal of benchmarking isn't just to make code faster—it's to
understand the performance implications of your design choices and to ensure
that your application meets its performance requirements consistently. A
well-crafted benchmark suite serves as both documentation of your performance
expectations and a safeguard against unexpected regressions.

Armed with these benchmarking techniques and best practices, you're
well-equipped to build Go applications that are not only correct and
maintainable but also performant and efficient.

Thanks for reading!
