Baselines: ratcheting thresholds on existing code
When you introduce metric thresholds on an existing codebase, you
usually hit the same wall: every reasonable threshold flags hundreds of
existing functions, and CI goes red on every push. The realistic
adoption path is "ratchet from current state, fail only on new
offenders". The baseline file (issue #99) is how bca check supports
that workflow.
Baselines are the complement to in-source suppression markers, not a substitute. Use suppression markers (see Suppression markers) when a function is intentionally complex forever (a parser, a state machine, generated code). Use a baseline when the team intends to pay the debt down. Both can live in the same repo; suppression is checked first.
End-to-end adoption flow
1. Pick initial thresholds
Either start from gut-feel numbers (cyclomatic=15, cognitive=20) or pull
them from a bca check --no-fail run over the repo, which shows the
current distribution.
# bca-thresholds.toml
[thresholds]
cyclomatic = 15
cognitive = 20
"loc.lloc" = 200
2. Bootstrap the baseline
bca --paths src/ check \
--config bca-thresholds.toml \
--write-baseline .bca-baseline.toml
Commit both files in the same change:
git add bca-thresholds.toml .bca-baseline.toml
git commit -m "ci: introduce metric thresholds with baseline"
3. Wire the CI gate
GitHub Actions:
- name: Check code complexity thresholds
run: |
bca --paths src/ check \
--config bca-thresholds.toml \
--baseline .bca-baseline.toml
GitLab CI (snippet for the relevant job):
threshold-check:
image: rust:1
before_script:
- cargo install --locked big-code-analysis-cli@<VERSION>
script:
- bca --paths src/ check
--config bca-thresholds.toml
--baseline .bca-baseline.toml
Exit codes: 0 clean, 2 regression or new offender, 1 tool error.
See CI integration for the broader matrix of CI surfaces.
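For CI glue that wraps the gate, it can help to separate "the code regressed" from "the tool broke". A minimal sketch, assuming only the three exit codes listed above; the function names and message wording are hypothetical and not part of bca:

```python
# Exit codes as documented for bca check; the message text is illustrative.
EXIT_MEANINGS = {
    0: "clean: no new offenders or regressions",
    1: "tool error: the check itself did not complete",
    2: "gate failed: regression or new offender",
}


def describe_exit(code: int) -> str:
    """Return a human-readable meaning for a bca check exit code."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def is_gate_failure(code: int) -> bool:
    """True only when the thresholds were violated (exit code 2)."""
    return code == 2
```

A wrapper could call is_gate_failure on the process return code to decide whether to post a review comment (code problem) or page whoever maintains the pipeline (tool problem).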
4. Refresh the baseline as the team pays debt down
Every few weeks, or after a focused refactor:
bca --paths src/ check \
--config bca-thresholds.toml \
--write-baseline .bca-baseline.toml
git diff .bca-baseline.toml
A shrinking diff is the goal. Two --write-baseline runs over an
unchanged tree produce byte-identical output, so spurious diffs only
appear when actual offenders changed.
5. PR-review heuristics
- Baseline shrank. Debt paid down. No further action.
- Baseline grew. Someone added a new offender to the file intentionally. Review the values — was this a deliberate stopgap, or did the author bypass the gate? Either is fine if conscious; the point of the file being committed is to make the choice reviewable.
- A single entry got a higher value. The author re-ran --write-baseline after the function got worse. Treat the same as "baseline grew": surface the change in review.
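The three heuristics above amount to a set comparison between the old and new baseline. The sketch below illustrates that comparison on in-memory dictionaries keyed by the (path, function, start_line, metric) tuple; the dictionary shape and the function names are assumptions for illustration, and it does not parse the actual .bca-baseline.toml layout.

```python
# Hypothetical in-memory view of a baseline:
# {(path, function, start_line, metric): recorded_value}
Key = tuple[str, str, int, str]


def classify_baseline_change(
    old: dict[Key, float], new: dict[Key, float]
) -> dict[str, list[Key]]:
    """Split a baseline diff into removed, added, and raised entries."""
    return {
        # Entry gone: debt paid down (or the function moved, see Limitations).
        "removed": sorted(k for k in old if k not in new),
        # Entry appeared: a new offender was accepted into the baseline.
        "added": sorted(k for k in new if k not in old),
        # Same entry, higher value: the function got worse and was re-recorded.
        "raised": sorted(k for k in new if k in old and new[k] > old[k]),
    }


def needs_review(change: dict[str, list[Key]]) -> bool:
    """Growth and raised values deserve a reviewer's attention."""
    return bool(change["added"] or change["raised"])
```

A diff that only populates "removed" is the shrinking case and needs no further action.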
Reading the gate output
A failing bca check --baseline run prefixes each surviving violation
with a tag and follows the list with a per-file rollup:
bca: filtered 422 violations via baseline
[regr +60%] src/foo.rs:1-865: <file>: halstead.effort = 1557107.72 (limit 50000)
[new] src/bar.rs:506-747: act_on_file: cognitive = 63 (limit 25)
...
--- summary ---
src/foo.rs: 5 violations (worst: halstead.effort = 1557107.72 vs limit 50000 at L1)
src/bar.rs: 4 violations (worst: cognitive = 63 vs limit 25 at L506)
Tag prefixes:
- [new]: no baseline entry for this (path, function, start_line, metric) tuple. The violation is new since the baseline was written.
- [regr +N%]: the baseline contains a recorded value and the current value is N% higher. Special cases:
  - [regr from 0] when the recorded value is 0.0, where computing a percentage would divide by zero.
  - [regr +>9999%] caps the display once the regression exceeds 100× the baseline value.
  - [regr NaN] when the current metric value is NaN (degenerate Halstead inputs on trivial functions).
Tags only appear when --baseline is passed; without it the line
format is byte-identical to the no-baseline default. CI tooling that
grep-pipes the stderr stream can suppress the trailing summary with
--no-summary.
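The tag rules can be read as a small decision function. The following is a sketch of that reading, not bca's actual implementation: the function name is hypothetical, rounding to the nearest whole percent is an assumption, the order of the NaN and zero checks is an assumption, and the cap is placed where the percentage would exceed 9999.

```python
import math
from typing import Optional


def baseline_tag(recorded: Optional[float], current: float) -> str:
    """Pick the tag for a violation that survived baseline filtering.

    `recorded` is the baseline value, or None when no entry matches.
    Assumes the caller only passes violations whose current value is
    worse than the recorded one (or NaN).
    """
    if recorded is None:
        return "[new]"
    if math.isnan(current):
        return "[regr NaN]"
    if recorded == 0.0:
        # A percentage relative to zero is undefined.
        return "[regr from 0]"
    percent = (current - recorded) * 100.0 / recorded
    if percent > 9999:
        # Assumed cutoff: keep the tag at four digits.
        return "[regr +>9999%]"
    return f"[regr +{round(percent)}%]"
```

Under these assumptions, a recorded value of 100 and a current value of 160 yields the tag [regr +60%].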
The summary footer groups violations by file, cites the single worst
metric per file (max value / limit ratio), and sorts rows by
violation count descending then path ascending. It is the fastest way
to read a long offender list and spot which file to start with.
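The grouping and ordering of the summary can be sketched in a few lines. This is an illustration of the rules described above, assuming a hypothetical Violation record and strictly positive limits; the sample values are made up apart from the two lines quoted from the example output.

```python
from collections import defaultdict
from typing import NamedTuple


class Violation(NamedTuple):
    # Hypothetical record shape, for illustration only.
    path: str
    start_line: int
    metric: str
    value: float
    limit: float


def summarize(violations: list[Violation]) -> list[tuple[str, int, Violation]]:
    """Return (path, count, worst) rows in summary order."""
    by_file: dict[str, list[Violation]] = defaultdict(list)
    for v in violations:
        by_file[v.path].append(v)
    rows = []
    for path, items in by_file.items():
        # "Worst" is the violation with the largest value / limit ratio.
        worst = max(items, key=lambda v: v.value / v.limit)
        rows.append((path, len(items), worst))
    # Violation count descending, then path ascending.
    rows.sort(key=lambda row: (-row[1], row[0]))
    return rows


sample = [
    Violation("src/foo.rs", 1, "halstead.effort", 1557107.72, 50000),
    Violation("src/foo.rs", 40, "cognitive", 30, 25),
    Violation("src/bar.rs", 506, "cognitive", 63, 25),
    Violation("src/bar.rs", 10, "cyclomatic", 20, 15),
    Violation("src/zed.rs", 3, "cognitive", 26, 25),
]
rows = summarize(sample)
```

With this sample, src/bar.rs and src/foo.rs tie on count, so the path ordering decides which row comes first.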
6. Retire the baseline
When .bca-baseline.toml contains only version = 2 and no entries,
drop the --baseline flag from CI and delete the file. The thresholds
now stand on their own.
Composition with suppression markers
--write-baseline already excludes any function silenced by a
bca: suppress or #lizard forgives marker, so the same function
doesn't end up in two places. If a function is intentionally exempt
forever, prefer the in-source marker (lives next to the code, survives
refactors, no extra file to commit). Use the baseline only for
violations the team genuinely intends to fix.
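The order in which the two mechanisms apply can be sketched as a single decision function. This is an illustrative model rather than bca's code: it assumes suppression is tracked per (path, function), baseline entries per (path, function, start_line, metric), and that a violation at or below its recorded value is filtered. All names are hypothetical.

```python
import math


def gate_outcome(key, value, suppressed, baseline):
    """Classify one violation.

    key        -- (path, function, start_line, metric)
    suppressed -- set of (path, function) silenced by an in-source marker
    baseline   -- dict mapping key -> recorded value
    """
    path, function, _start_line, _metric = key
    # Suppression is checked first, so a marker wins over any baseline entry.
    if (path, function) in suppressed:
        return "suppressed"
    if key not in baseline:
        return "new"
    # A NaN or higher value than recorded surfaces as a regression.
    if math.isnan(value) or value > baseline[key]:
        return "regression"
    return "filtered"
```

Only "new" and "regression" outcomes would fail the gate in this model; "suppressed" and "filtered" stay silent.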
To audit the un-filtered offender set — every violation regardless of
suppression or baseline — pass --no-suppress and omit --baseline:
bca --paths src/ check \
--config bca-thresholds.toml \
--no-suppress \
--no-fail
Combined with --write-baseline, --no-suppress records every
violation including the ones that suppression markers normally hide.
Limitations
- Line drift. Entries key on (path, function, start_line, metric). Editing code above a function shifts its start_line and the baseline entry stops matching, surfacing as a "new" offender. Refresh with --write-baseline and commit the diff.
- Path identity. Entries record the path the walker saw. Run --write-baseline and --baseline from the same working directory with the same --paths argument; a relative --paths src/ and an absolute --paths /repo/src/ produce non-matching baselines.
- OS portability. Paths are normalized to forward slashes on write and re-normalized on read, so a baseline generated on Linux matches the same tree on Windows. Non-UTF-8 paths fall back to a lossy display form and may not round-trip exactly.
- Tightening a threshold. Lowering a limit may newly expose functions that were previously clean. They will not be in the baseline → CI will fail. This is correct — tightening should expose new offenders. Refresh the baseline if the team chooses to absorb the new entries.
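The first three limitations all follow from exact-match lookups on the entry key. A minimal sketch, assuming a dictionary keyed by the (path, function, start_line, metric) tuple and a plain backslash-to-slash normalization; the helper names and sample entry are hypothetical:

```python
def normalize_path(path: str) -> str:
    """Normalize separators to forward slashes, as described above."""
    return path.replace("\\", "/")


def make_key(path: str, function: str, start_line: int, metric: str):
    return (normalize_path(path), function, start_line, metric)


# Baseline written on a machine that uses backslash separators.
baseline = {make_key("src\\bar.rs", "act_on_file", 506, "cognitive"): 63.0}

# OS portability: same tree, different separator, still matches.
same_tree = make_key("src/bar.rs", "act_on_file", 506, "cognitive") in baseline

# Line drift: three lines added above the function, so the lookup misses.
after_drift = make_key("src/bar.rs", "act_on_file", 509, "cognitive") in baseline

# Path identity: an absolute path is a different key than a relative one.
absolute = make_key("/repo/src/bar.rs", "act_on_file", 506, "cognitive") in baseline
```

Here same_tree is True while after_drift and absolute are both False, which is why the latter two cases surface as "new" offenders until the baseline is refreshed.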