Performance optimization is crucial for building efficient applications, but without proper measurement, optimization becomes mere guesswork. As Donald Knuth famously stated, "premature optimization is the root of all evil." This is where benchmarking comes in.
Go stands out among programming languages by providing built-in benchmarking as part of its standard library. This native support reflects Go's philosophy of making performance testing accessible to all developers, not just performance specialists.
Benchmarking in Go allows you to:
- Measure code performance with nanosecond precision.
- Compare implementation alternatives.
- Detect performance regressions.
- Understand memory allocation patterns.
- Make data-driven optimization decisions.
This guide will walk you through everything you need to know about benchmarking in Go, from basic concepts to advanced techniques.
Getting started with Go benchmarks
Go benchmarks are functions that live in `*_test.go` files, just like unit tests. While tests begin with `Test`, benchmarks follow a specific naming convention:
```go
func BenchmarkXxx(b *testing.B) {
	// benchmark code
}
```
The benchmark function must:
- Start with `Benchmark`.
- Accept a `*testing.B` parameter.
- Be in a file with a `_test.go` suffix.
The `testing.B` type provides the benchmarking infrastructure, including timing, iteration control, and reporting facilities.
Let's create a simple benchmark for a string concatenation function:
```go
package concat

func JoinStrings(strs []string) string {
	var result string
	for _, s := range strs {
		result += s
	}
	return result
}
```
```go
package concat

import "testing"

func BenchmarkJoinStrings(b *testing.B) {
	strs := []string{"Hello", ", ", "world", "!"}

	// The benchmark runner will call this function b.N times
	for i := 0; i < b.N; i++ {
		JoinStrings(strs)
	}
}
```
To run a benchmark, use the `go test` command with the `-bench` flag:

```
go test -bench=.
```

```
goos: linux
goarch: amd64
pkg: github.com/betterstack-community/golang-benchmarks
cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
BenchmarkJoinStrings-16    9762195    123.0 ns/op
PASS
ok      github.com/betterstack-community/golang-benchmarks    1.330s
```
This means:
- The benchmark ran with 16 CPUs available (the `-16` suffix).
- It executed 9,762,195 times.
- Each operation took approximately 123 nanoseconds.
Understanding `b.N`
The benchmark framework automatically determines the value of `b.N` by running your benchmark multiple times with increasing values until it gets a reliable measurement.

The framework starts with a small value (usually 1) and increases it until the benchmark runs for a sufficient duration (one second by default). This is why your benchmark function must execute the code under test `b.N` times:
```go
func BenchmarkSomething(b *testing.B) {
	// Optional setup code
	b.ResetTimer() // Reset the timer if setup took significant time

	for i := 0; i < b.N; i++ {
		// Code you want to measure
	}
}
```
Often, benchmarks require setup and teardown code that shouldn't be included in the timing measurements:
```go
func BenchmarkComplexOperation(b *testing.B) {
	// Setup
	data := createLargeDataset()

	// Reset the timer to exclude setup time
	b.ResetTimer()

	for i := 0; i < b.N; i++ {
		processData(data)
	}

	// Optionally pause the timer during cleanup
	b.StopTimer()
	cleanupResources()
}
```
The key timing control methods include:
- `b.ResetTimer()`: Resets the timer to zero.
- `b.StartTimer()`: Resumes the timer after it was stopped.
- `b.StopTimer()`: Temporarily stops the timer.
Note that the Go compiler might optimize away code that doesn't have observable effects, potentially invalidating your benchmark:
```go
func BenchmarkMightBeOptimizedAway(b *testing.B) {
	for i := 0; i < b.N; i++ {
		// This computation might be eliminated by the compiler
		// since its result is never used
		math.Sqrt(float64(i))
	}
}
```
To prevent this, ensure the result is used:
```go
func BenchmarkPreventOptimization(b *testing.B) {
	var result float64
	for i := 0; i < b.N; i++ {
		result += math.Sqrt(float64(i))
	}
	// Use the result to prevent optimization
	if result < 0 {
		b.Fatalf("negative result: %f", result)
	}
}
```
Introducing `b.Loop`
Go 1.24 introduces a cleaner, more efficient approach to benchmarking with the `testing.B.Loop` method, which addresses several nuances and potential pitfalls of the traditional `b.N` loop:
```go
func BenchmarkStringConversion(b *testing.B) {
	// Setup - prepare a large integer to convert to string
	number := 9876543210
	b.ResetTimer()

	// We need a result variable to prevent optimization
	var result string
	for i := 0; i < b.N; i++ {
		// The operation we want to benchmark
		result = strconv.Itoa(number)
	}

	// Prevent compiler from optimizing away the unused result
	if len(result) == 0 {
		b.Fatal("unexpected empty string")
	}
}
```
Several issues arise with this approach:
- The benchmark function runs multiple times, causing setup code to execute repeatedly.
- You must remember to call `b.ResetTimer()` to exclude setup time from measurements.
- You need a result variable, and must ensure it's used somehow, to prevent the compiler from optimizing away your benchmark code.
The new `b.Loop()` approach eliminates these concerns:
```go
func BenchmarkStringConversion(b *testing.B) {
	// Setup - prepare a large integer to convert to string
	number := 9876543210

	// No need for b.ResetTimer() - everything outside the loop is excluded
	// No need for a result variable to prevent optimization
	for b.Loop() {
		// The operation we want to benchmark
		strconv.Itoa(number)
	}
}
```
Key advantages of `b.Loop()`:
- The benchmark function executes only once per `-count`, so setup code runs just once.
- Code outside the `b.Loop()` loop doesn't affect benchmark timing, eliminating the need for `b.ResetTimer()`.
- The compiler won't optimize away function calls within a `b.Loop()` body, even if results aren't used.
This results in benchmarks that are easier to write, less error-prone, and potentially more accurate by avoiding repeated setup overhead.
Note that your benchmarks should use either `b.Loop()` or a `b.N`-style loop, but not both in the same benchmark function.
Benchmarking different types of code
Go's benchmarking framework is versatile enough to handle various code patterns and structures. Whether you're benchmarking simple functions, methods on structs, concurrent operations, or memory-intensive processes, the framework provides appropriate tools and approaches.
With the introduction of the `b.Loop()` method in Go 1.24, benchmarking becomes even more straightforward and less error-prone across these different scenarios. Let's explore how to effectively benchmark various types of Go code using this improved approach.
Function benchmarks
We've already seen simple function benchmarks. For functions with parameters, make sure to create representative inputs:
```go
func BenchmarkCalculate(b *testing.B) {
	// Prepare realistic input data
	input := generateRepresentativeData()
	b.ResetTimer() // exclude setup from the measurement

	for i := 0; i < b.N; i++ {
		Calculate(input)
	}
}
```
Method benchmarks
Method benchmarks are similar to function benchmarks but involve struct instances:

```go
func BenchmarkProcessor_Process(b *testing.B) {
	processor := NewProcessor(/* config */)
	data := generateTestData()

	for b.Loop() {
		processor.Process(data)
	}
}
```
Concurrent code benchmarks
For benchmarking concurrent code, you may need to synchronize goroutines:
```go
func BenchmarkConcurrentOperation(b *testing.B) {
	for b.Loop() {
		var wg sync.WaitGroup
		wg.Add(10)
		for j := 0; j < 10; j++ {
			go func() {
				defer wg.Done()
				// Concurrent operation
				processItem()
			}()
		}
		wg.Wait()
	}
}
```
Memory allocation benchmarks
Go allows benchmarking memory allocations as well as execution time:
```go
func BenchmarkMemoryIntensive(b *testing.B) {
	// Report memory allocations
	b.ReportAllocs()

	for b.Loop() {
		createLargeData()
	}
}
```
Running with the `-benchmem` flag provides allocation statistics:

```
go test -bench=MemoryIntensive -benchmem
```
Output includes bytes allocated and allocations per operation:
```
BenchmarkMemoryIntensive-8    100000    15234 ns/op    8192 B/op    16 allocs/op
```
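To see these allocation metrics drive a real decision, here is a sketch that compares the naive `+=` concatenation from earlier against `strings.Builder`, reading `allocs/op` programmatically via `testing.Benchmark` (the input strings and the `Grow` hint are arbitrary choices for this illustration):

```go
package main

import (
	"fmt"
	"strings"
	"testing"
)

var parts = []string{"Hello", ", ", "world", "!"}

// sinkStr keeps results observable so neither loop is optimized away.
var sinkStr string

// benchNaive concatenates with +=, allocating a new string on most appends.
func benchNaive(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		var s string
		for _, p := range parts {
			s += p
		}
		sinkStr = s
	}
}

// benchBuilder reuses strings.Builder's internal buffer, allocating less.
func benchBuilder(b *testing.B) {
	b.ReportAllocs()
	for i := 0; i < b.N; i++ {
		var sb strings.Builder
		sb.Grow(16) // pre-size the buffer (total input is 13 bytes)
		for _, p := range parts {
			sb.WriteString(p)
		}
		sinkStr = sb.String()
	}
}

func main() {
	naive := testing.Benchmark(benchNaive)
	builder := testing.Benchmark(benchBuilder)
	fmt.Printf("naive:   %d allocs/op\n", naive.AllocsPerOp())
	fmt.Printf("builder: %d allocs/op\n", builder.AllocsPerOp())
}
```

On typical runs the builder version reports fewer allocations per operation, which is exactly the kind of difference `-benchmem` surfaces.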
As you've seen, the same core principles apply whether you're benchmarking a simple function or complex concurrent operations. The `b.Loop()` method simplifies all these cases by handling iteration count automatically and excluding setup code from timing measurements.
Now that we've covered the basics of benchmarking different code types, let's explore more advanced techniques that allow for more sophisticated performance analysis and comparative benchmarking.
Advanced benchmarking techniques
While basic benchmarks provide valuable insights, Go's benchmarking framework offers advanced capabilities that enable more sophisticated performance analysis.
These techniques help you benchmark across different parameters, compare multiple implementations, and gain deeper insights into performance characteristics under varying conditions.
The following approaches will help you create comprehensive benchmark suites that can identify subtle performance differences and guide your optimization efforts more effectively.
Subbenchmarks
Subbenchmarks allow running variants of a benchmark with different parameters:
```go
func BenchmarkSort(b *testing.B) {
	sizes := []int{100, 1000, 10000, 100000}
	for _, size := range sizes {
		b.Run(fmt.Sprintf("Size-%d", size), func(b *testing.B) {
			data := generateRandomSlice(size)
			for b.Loop() {
				// Create a copy to avoid measuring the sorting of already sorted data
				dataCopy := make([]int, len(data))
				copy(dataCopy, data)
				sort.Ints(dataCopy)
			}
		})
	}
}
```
Benchmark tables
Similar to table-driven tests, table-driven benchmarks help test multiple scenarios:
```go
func BenchmarkHashFunctions(b *testing.B) {
	benchmarks := []struct {
		name   string
		input  []byte
		hashFn func([]byte) []byte
	}{
		{"MD5", []byte("test data"), md5Sum},
		{"SHA1", []byte("test data"), sha1Sum},
		{"SHA256", []byte("test data"), sha256Sum},
	}

	for _, bm := range benchmarks {
		b.Run(bm.name, func(b *testing.B) {
			for b.Loop() {
				bm.hashFn(bm.input)
			}
		})
	}
}
```
Parameterized input sizes
To understand how an algorithm performs with different input sizes:
```go
func BenchmarkSliceOperations(b *testing.B) {
	for _, size := range []int{10, 100, 1000, 10000} {
		slice := make([]int, size)
		for i := range slice {
			slice[i] = i
		}

		b.Run(fmt.Sprintf("Sum-%d", size), func(b *testing.B) {
			for b.Loop() {
				sum := 0
				for _, v := range slice {
					sum += v
				}
				// Use sum to prevent optimization
				if sum < 0 {
					b.Fatalf("negative sum")
				}
			}
		})
	}
}
```
Custom timing
For precise control over what gets timed:
```go
func BenchmarkWithPreciseControl(b *testing.B) {
	// Setup code not included in timing
	data := prepareData()

	for b.Loop() {
		// Only this operation is timed
		result := process(data)

		// With b.Loop(), we don't need to manually stop/start timers
		// around setup, as only code inside the loop is measured
		validate(result)
	}
}
```
Analyzing benchmark results
Standard benchmark output provides a wealth of information, though it appears deceptively simple. Consider this typical benchmark result:
```
BenchmarkJoinStrings-8    5000000    264 ns/op    48 B/op    2 allocs/op
```
This condensed line tells us several important things about the benchmark execution:
The first section, `BenchmarkJoinStrings-8`, identifies the benchmark name followed by the number of CPUs available during execution (the `GOMAXPROCS` value). This hyphenated suffix helps when comparing results across different machines.
The second figure, `5,000,000`, represents the number of iterations the benchmark ran. The Go testing framework automatically determines this number by repeatedly running your benchmark with increasing iteration counts until the total run time reaches at least one second (the default).
The third figure, `264 ns/op`, is the average time per operation in nanoseconds. This is your primary performance metric, telling you how long, on average, each execution of your benchmarked code took.
When memory statistics are enabled with the `-benchmem` flag, you'll see two additional metrics: `48 B/op` shows the average memory allocated per operation (48 bytes in this case), and `2 allocs/op` indicates the average number of distinct memory allocations per operation.
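If you ever need these figures programmatically rather than from `go test` output, the standard library's `testing.Benchmark` function drives a benchmark function directly and returns a `BenchmarkResult` carrying the same metrics. A minimal sketch (a naive string-joining function is reproduced inline so the program is self-contained):

```go
package main

import (
	"fmt"
	"testing"
)

// joinStrings mirrors the naive concatenation benchmarked earlier.
func joinStrings(strs []string) string {
	var result string
	for _, s := range strs {
		result += s
	}
	return result
}

func main() {
	strs := []string{"Hello", ", ", "world", "!"}

	// testing.Benchmark drives the closure the same way `go test -bench`
	// would, and returns the measured result.
	res := testing.Benchmark(func(b *testing.B) {
		for i := 0; i < b.N; i++ {
			joinStrings(strs)
		}
	})

	fmt.Printf("%d iterations, %d ns/op, %d allocs/op\n",
		res.N, res.NsPerOp(), res.AllocsPerOp())
}
```

This is occasionally handy for ad-hoc measurements inside tools, though `go test -bench` remains the standard workflow.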
Comparing benchmarks with benchstat
Raw benchmark numbers can be difficult to interpret, especially when comparing different implementations or tracking performance changes over time.
The `benchstat` tool, part of the Go performance measurement toolkit (`golang.org/x/perf`), applies statistical analysis to benchmark results to provide more meaningful comparisons.
To use `benchstat`, first install it:

```
go install golang.org/x/perf/cmd/benchstat@latest
```
Then, capture benchmark results from different versions of your code:
```
go test -bench=. -count=10 > old.txt
```
When you make changes to your code, capture the new benchmark results in a different file:
```
go test -bench=. -count=10 > new.txt
```
Then compare both results with:

```
benchstat old.txt new.txt
```

```
goos: linux
goarch: amd64
pkg: github.com/betterstack-community/golang-benchmarks
cpu: 11th Gen Intel(R) Core(TM) i7-11850H @ 2.50GHz
                 │   old.txt    │              new.txt               │
                 │    sec/op    │    sec/op     vs base              │
JoinStrings-16     74.27n ± 15%   73.98n ± 11%  ~ (p=0.684 n=10)
```
Here, the result shows:
- Old implementation: 74.27 nanoseconds per operation with 15% variability.
- New implementation: 73.98 nanoseconds per operation with 11% variability.
For the statistical analysis:
- The tilde (~) indicates no statistically significant difference between the old and new implementations.
- The p-value of 0.684 is well above the typical threshold of 0.05, confirming that the difference is not statistically significant.
- "n=10" indicates that 10 samples were used for this statistical analysis.
In practical terms, this means that despite the small nominal improvement from 74.27ns to 73.98ns (about 0.4% faster), the high variability in the measurements (15% and 11%) and the high p-value (0.684) indicate that this difference is likely just random variation. The two implementations should be considered equivalent in performance.
This is a good example of why proper statistical analysis is important in benchmarking: looking at just the raw numbers might have led someone to conclude, incorrectly, that the new implementation was faster, when in fact there's no meaningful performance difference.
Final thoughts
Benchmarking in Go is more than just a development practice: it's a mindset that encourages performance-conscious programming. Go's `testing` package provides a robust framework for measuring, analyzing, and optimizing code performance without requiring external tools or complex setups.
Performance optimization without measurement is guesswork, but with Go's benchmarking tools, you can make data-driven decisions. By integrating benchmarking into your development workflow—whether through manual testing during development or automated performance monitoring in CI pipelines—you establish a foundation for maintaining and improving application performance over time.
Remember that the goal of benchmarking isn't just to make code faster—it's to understand the performance implications of your design choices and to ensure that your application meets its performance requirements consistently. A well-crafted benchmark suite serves as both documentation of your performance expectations and a safeguard against unexpected regressions.
Armed with these benchmarking techniques and best practices, you're well-equipped to build Go applications that are not only correct and maintainable but also performant and efficient.
Thanks for reading!