The CRAP Metric: Quantifying Code Risk with Cyclomatic Complexity and Test Coverage

Stanley Ulili

Updated on May 23, 2026

The Change Risk Anti-Patterns (CRAP) index is a code quality metric that combines cyclomatic complexity and test coverage into a single score. It was introduced in 2007 by Alberto Savoia and Bob Evans in a Google Testing Blog post as a way to quantify the subjective experience of risky, hard-to-maintain code.

Screenshot of the original Google Testing Blog post titled "This Code is CRAP" introducing the metric

The two components

Cyclomatic complexity

Cyclomatic complexity counts the number of linearly independent execution paths through a function. Every decision point adds to this count:

Conditional branches (if, else if); a plain else adds no decision point of its own
Loops (for, while, loop)
Match arms or switch cases
Ternary operators
Error handling paths (catch, ? in Rust)

A score of 1–5 is simple and easy to reason about. A score of 6–10 is moderately complex. Above 10, a function becomes significantly harder to understand; above 15, it is likely to cause problems when modified.
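To make the counting concrete, here is a small hypothetical Rust function (the name `classify_readings` and its thresholds are invented for illustration) with each decision point annotated. Counting conventions differ between tools, for example in how match arms or boolean operators are tallied, so the total below is one common tally and not necessarily what a given tool reports.

```rust
/// Classifies a batch of sensor readings. Hypothetical example; the
/// thresholds are arbitrary.
///
/// Decision points are marked with `+1`. Starting from a base of 1 for the
/// function itself, this tally gives a cyclomatic complexity of 5.
pub fn classify_readings(readings: &[f64]) -> Result<&'static str, String> {
    if readings.is_empty() {                // +1 (if)
        return Err("no readings".to_string());
    }

    let mut sum = 0.0;
    for r in readings {                     // +1 (loop)
        if r.is_nan() {                     // +1 (if)
            return Err("invalid reading".to_string());
        }
        sum += r;
    }

    let mean = sum / readings.len() as f64;
    if mean > 100.0 {                       // +1 (if)
        Ok("critical")
    } else {                                // a plain else adds nothing
        Ok("normal")
    }
}
```

With a complexity of 5, this function sits at the top of the "simple" band described above.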

Diagram illustrating cyclomatic complexity with nodes and edges showing how decision points create multiple paths

High complexity alone is not fatal. A complex but thoroughly tested function is manageable: tests document expected behavior and catch regressions when the code changes.

Test coverage

Coverage measures the percentage of code executed by automated tests. Low coverage on simple code is low risk. Low coverage on complex code is the problem the CRAP metric targets.

Even 100% line coverage does not guarantee correctness. It means the code was executed, not that every path was checked against the correct output. The CRAP metric does not close that gap; it treats coverage as a proxy and scales risk with the untested fraction of a function's complexity.
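The gap between execution and verification is easy to reproduce. In the sketch below (the function `apply_discount` and its bug are hypothetical), the first check runs every line of the function but compares nothing, so it would pass and report full line coverage despite the bug. The second check is the one that would catch it.

```rust
/// Intended to take `percent` off `price`. The bug: it subtracts `percent`
/// as an absolute amount instead. Hypothetical example.
pub fn apply_discount(price: f64, percent: f64) -> f64 {
    price - percent // should be: price * (1.0 - percent / 100.0)
}

/// Executes every line of `apply_discount`, so a coverage tool would count
/// the function as fully covered, but it asserts nothing and cannot fail.
pub fn check_without_assertion() {
    let _ = apply_discount(200.0, 10.0);
}

/// Compares the result to the expected value. Returns `false` for the buggy
/// implementation above (it yields 190.0 where 180.0 is expected).
pub fn check_with_assertion() -> bool {
    (apply_discount(200.0, 10.0) - 180.0).abs() < 1e-9
}
```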

The formula

CRAP(m) = CC² × (1 - cov)³ + CC

Where CC is the cyclomatic complexity and cov is the coverage fraction (0 to 1).
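As a minimal sketch, the formula translates directly into a Rust function. The name `crap_score` and the choice to clamp out-of-range coverage values are assumptions made for this example, not part of the metric's definition.

```rust
/// CRAP score for a single function.
///
/// `cc` is the cyclomatic complexity; `coverage` is the covered fraction
/// in the range 0.0..=1.0 (values outside that range are clamped, which is
/// a choice made for this sketch).
pub fn crap_score(cc: u32, coverage: f64) -> f64 {
    let cc = f64::from(cc);
    let uncovered = 1.0 - coverage.clamp(0.0, 1.0);
    cc.powi(2) * uncovered.powi(3) + cc
}
```

For instance, `crap_score(15, 0.0)` evaluates to 240.0 and `crap_score(15, 1.0)` to 15.0.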

The key design choices are the exponents. Complexity is squared, and the uncovered fraction is cubed. This makes the formula non-linear: increasing coverage on a complex function reduces the score dramatically, while adequate coverage on a simple function produces a score close to the bare complexity value.

100% coverage: (1 - 1) = 0, so the first term vanishes. CRAP equals CC. The risk is the inherent complexity, acknowledged but considered managed by the tests.

0% coverage: (1 - 0)³ = 1. CRAP = CC² + CC. For CC = 15: 225 + 15 = 240.

50% coverage, CC = 15: CRAP = 15² × (0.5)³ + 15 = 225 × 0.125 + 15 = 28.125 + 15 = 43.125

Even 50% coverage leaves a score of 43 for a function with CC = 15, which most thresholds would still flag. The formula communicates that complex code requires high coverage to be considered low-risk, not just adequate coverage.
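The same formula can be inverted to ask how much coverage a function needs before it passes a given threshold. Solving CC² × (1 - cov)³ + CC ≤ T for cov gives cov ≥ 1 - ((T - CC) / CC²)^(1/3). The sketch below implements that; the function name `min_coverage` is hypothetical, and it assumes CC is at least 1.

```rust
/// Smallest coverage fraction at which a function of complexity `cc` scores
/// at or below `threshold`, or `None` if even 100% coverage is not enough
/// (that is, when `cc` itself exceeds the threshold).
///
/// Derived by solving cc^2 * (1 - cov)^3 + cc <= threshold for cov.
/// Assumes `cc >= 1`.
pub fn min_coverage(cc: u32, threshold: f64) -> Option<f64> {
    let cc = f64::from(cc);
    if cc > threshold {
        return None;
    }
    // Largest uncovered fraction the threshold tolerates.
    let max_uncovered = ((threshold - cc) / (cc * cc)).cbrt();
    // Simple functions may pass with no coverage at all, hence the clamp.
    Some((1.0 - max_uncovered).clamp(0.0, 1.0))
}
```

With a threshold of 30, this gives roughly 59% for CC = 15 and 80% for CC = 25. A function with CC above 30 cannot pass on coverage alone and would have to be simplified or split.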

Practical application with cargo-crap

cargo-crap is a Rust command-line tool that calculates CRAP scores from coverage data.

Setup

cargo install cargo-llvm-cov

cargo install cargo-crap

Generating coverage and running analysis

From the project root, generate an lcov coverage file:

cargo llvm-cov --lcov --output-path lcov.info

Then run cargo-crap against it:

cargo crap --lcov lcov.info

The tool calculates cyclomatic complexity for each function, combines it with coverage from lcov.info, and produces a ranked table.
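Conceptually, that last step amounts to scoring each function and sorting by score. The following is a sketch of the idea only, not cargo-crap's actual implementation; the `FunctionReport` struct, its field names, and `rank_by_crap` are all hypothetical.

```rust
/// One row of a CRAP report. Field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
pub struct FunctionReport {
    pub name: String,
    pub cc: u32,
    pub coverage: f64, // fraction, 0.0..=1.0
    pub crap: f64,
}

/// Builds report rows from (name, complexity, coverage) triples and sorts
/// them so the riskiest functions come first.
pub fn rank_by_crap(inputs: &[(&str, u32, f64)]) -> Vec<FunctionReport> {
    let mut rows: Vec<FunctionReport> = inputs
        .iter()
        .map(|&(name, cc, coverage)| {
            let c = f64::from(cc);
            FunctionReport {
                name: name.to_string(),
                cc,
                coverage,
                crap: c * c * (1.0 - coverage).powi(3) + c,
            }
        })
        .collect();
    // Descending by score; total_cmp gives a total order over f64.
    rows.sort_by(|a, b| b.crap.total_cmp(&a.crap));
    rows
}
```

Sorting in descending order puts the functions most in need of tests or refactoring at the top of the report.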

Reading the output

A well-tested complex function:

cargo-crap output table showing a low CRAP score for a complex but well-tested function

CRAP	CC	Coverage	Function	Location
13.0	13	96.0%	`process_device_telemetry`	./main.rs:77

A function with CC = 13 and 96% coverage scores 13.0, well below the default threshold of 30.

The same function with tests removed:

cargo-crap output showing the CRAP score skyrocketing after tests are removed

CRAP	CC	Coverage	Function	Location
182.0	13	0.0%	`process_device_telemetry`	./main.rs:77

CC is unchanged at 13. Coverage drops to 0%. CRAP jumps to 182.0. The score change is the formula's non-linear penalty for untested complexity, not a change in the code itself.
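Both table rows can be checked by hand against the formula, using CC = 13:

```latex
\begin{align*}
\text{cov} = 0.96:&\quad 13^2 \times (1 - 0.96)^3 + 13 = 169 \times 0.000064 + 13 \approx 13.01 \\
\text{cov} = 0:&\quad 13^2 \times (1 - 0)^3 + 13 = 169 + 13 = 182
\end{align*}
```

At 96% coverage the first term contributes only about 0.01, which is why the reported score rounds to the complexity itself.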

CI integration

cargo-crap exits with a non-zero status code if any function exceeds the threshold, which causes CI to fail:

cargo crap --lcov lcov.info --fail-above --threshold 30

--threshold 30 is the default. Set it lower for stricter enforcement or higher for legacy codebases with accumulated debt.

For existing projects, cargo-crap supports a baseline mode: generate an initial report as the baseline, then configure the CI check to fail only when a new change increases the CRAP score of an existing function or introduces a new function above the threshold. This allows incremental improvement without halting feature development to address all existing debt at once.
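The baseline rule described above can be sketched as a comparison between two score maps. This illustrates the logic only and is not cargo-crap's implementation or file format; the function name `regressions` and the use of a `HashMap` keyed by function name are assumptions.

```rust
use std::collections::HashMap;

/// Returns the names of functions that should fail a baseline-aware check:
/// existing functions whose score went up, and new functions that score
/// above `threshold`. Names are returned sorted for stable output.
pub fn regressions(
    baseline: &HashMap<String, f64>,
    current: &HashMap<String, f64>,
    threshold: f64,
) -> Vec<String> {
    let mut failing = Vec::new();
    for (name, &score) in current {
        let regressed = match baseline.get(name) {
            // Known function: fail only if its score increased.
            Some(&old) => score > old,
            // New function: fail only if it exceeds the threshold.
            None => score > threshold,
        };
        if regressed {
            failing.push(name.clone());
        }
    }
    failing.sort();
    failing
}
```

Under this rule, a legacy function that already scores 120 in the baseline does not fail the build as long as its score does not rise, while a new function scoring 56 does.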

Relevance to AI-generated code

AI coding assistants generate syntactically correct and often complex code quickly. They are less consistent at generating the unit tests needed to cover all execution paths in that code. A function with CC = 20 may be produced in seconds; the 20 or so test cases needed to exercise its independent paths may not follow.

Blog post excerpt explaining that AI agents are generating code and moving quickly through codebases

A CI check on compilation and linting will pass for this code. A human reviewer may approve it based on plausibility. Without an objective score, the untested complexity enters the codebase silently. A CRAP threshold in CI catches this regardless of whether the code was written by a human or generated by a tool.

Final thoughts

The CRAP metric is most useful as a CI gate rather than a code review discussion point. Its value is in automation: a threshold violation fails the build, which creates a clear prompt to either reduce complexity or add tests before the code merges. The formula's non-linear scaling means that teams do not need to enforce 100% coverage on all code, only on code complex enough to warrant it.

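cargo-crap is Rust-specific, but the metric itself is language-agnostic. Similar tools exist for Java (crap4j, the original implementation by the metric's authors), PHP (PHPUnit's code coverage reports include a CRAP index), JavaScript, and other ecosystems. The formula is the same regardless of the implementation.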
