big-code-analysis

big-code-analysis is a Rust library to analyze and extract information from source code written in many different programming languages. It is based on Tree Sitter, a parser generator tool and incremental parsing library.

You can find the source code of this software on GitHub, while issues and feature requests can be posted on the respective GitHub Issue Tracker.

📖 Rust API reference: the full type-and-method reference for the crate lives on docs.rs. This book is task-oriented; docs.rs is the authoritative, always-current reference generated straight from the source.

Supported platforms

big-code-analysis can run on the most common platforms: Linux, macOS, and Windows.

On our GitHub Release Page you can find the Linux and Windows binaries already compiled and packed for you.

API docs

If you prefer to use big-code-analysis as a crate, the complete API docs generated by Rustdoc — every public type, trait, and function, including the feature-gated vcs module — live on docs.rs.

For task-oriented guides on embedding the crate — quick start, in-memory analysis, walking FuncSpace results, and error handling — see the Using as a Library section.

For the PyO3 bindings — pip install big-code-analysis, batch processing, flat-record iteration, SARIF output, and async patterns — see the Python Bindings section.

License

Mozilla-defined grammars are released under the MIT license.
big-code-analysis, big-code-analysis-cli and big-code-analysis-web are released under the Mozilla Public License v2.0.

Supported Languages

This is the list of programming languages parsed by big-code-analysis. Each entry below is a real LANG variant (defined by the mk_langs! invocation in src/langs.rs) and is gated behind the matching per-language Cargo feature documented in Per-language Cargo features.

Bash
C
C/C++
C#
Elixir
Go
Groovy
Irules
Java
JavaScript
Kotlin
Lua
Mozcpp
Mozjs
Objective-C
Perl
Php
Python
Ruby
Rust
Tcl
Tsx
Typescript

Some entries are variants of a shared grammar pipeline. JavaScript (the upstream tree-sitter-javascript grammar) is the default for .js, .mjs, .cjs, and .jsx files; Mozjs is the Mozilla / SpiderMonkey fork, now opt-in: it owns only the .jsm (Firefox module) extension and reports the canonical slug mozjs. The two are metric-equivalent on ordinary JavaScript. Tsx is Typescript with JSX syntax enabled and reports the distinct slug tsx. C# reports the slug csharp.

Since #721 C has its own variant C (slug c, upstream tree-sitter-c), owning .c and the c emacs mode; the C/C++ variant (slug cpp, upstream tree-sitter-cpp since #720) keeps .cpp / .cc / .h and the rest. .h deliberately stays on Cpp: a C++ header through the C grammar ERROR-cascades on class / template, whereas a C header through the C++ grammar only trips on C++-keyword identifiers. The Mozilla/Gecko C++ dialect is the opt-in Mozcpp variant (slug mozcpp), which owns no file extensions and is selected only by name, exactly as Mozjs relates to JavaScript.

Since #724 Objective-C (slug objc, upstream tree-sitter-objc) owns .m and the objc / objective-c emacs modes. Objective-C++ (.mm) stays on Cpp: a .mm file mixes Objective-C with C++, and the tree-sitter-objc grammar cannot parse the C++ half (templates, namespaces, ::), so the C++ grammar, which only stumbles on the Objective-C glue, degrades more gracefully there. This is the same trade-off .h uses. Metrics for the Objective-C parts of a .mm file are therefore approximate.

Every variant's slug is its LANG::name, lowercase and punctuation-free so it round-trips through FromStr.

Internal helper variants

The following LANG variants are not user-facing languages — they are internal helpers in the C-family analysis pipeline (they ride every C-family Cargo feature: cpp, c, and mozcpp) and are not selected directly when analysing source files:

Ccomment — focuses on C/C++ comments.
Preproc — focuses on C/C++ preprocessor macros.

Note: Since #720 the Mozilla/Gecko C++ dialect is exposed as the Mozcpp LANG variant (backed by the vendored bca-tree-sitter-mozcpp crate, pulled in by the opt-in mozcpp feature). It is a fully public, name-selectable language that owns no file extensions — unlike the Ccomment / Preproc helpers above, it is not internal.

Supported Code Metrics

This chapter is a guided tour of every metric that big-code-analysis computes. Each section starts from the original research paper, walks through the algorithm, and explains both the way the metric was originally meant to be used and the ways the industry has actually ended up using it years later. If you are new to software metrics, read the sections in order — the later metrics (Maintainability Index in particular) are explicitly built on top of the earlier ones (Halstead, Cyclomatic, LOC).

A few framing notes before we start:

A metric is a measurement, not a verdict. Every number on this page summarises a structural property of source code. None of them measures correctness, productivity, or developer skill. The most important question for any metric is always "compared with what?" — the same module, a month ago; this module versus its siblings; this codebase versus an industry baseline. Absolute thresholds are rough heuristics at best.
Most metrics here are computed at three scopes: per function / method, per class or unit-like space, and per file. The underlying tree-sitter parser produces a tree of "spaces" (functions, closures, classes, namespaces, …) and every metric is rolled up through that tree.
Object-oriented metrics only fire on object-oriented constructs. WMC, NPA, and NPM report 0 on a Rust file that has no impl blocks or on a Python module without classes; that is the correct answer, not a bug.

Index

Metric	Measures	First defined by
ABC	Size as `<Assignments, Branches, Conditions>`	Fitzpatrick, 1997
Cognitive Complexity	How hard a function is to read	Campbell / SonarSource, 2017
Cyclomatic Complexity (CC)	Independent paths through a function	McCabe, 1976
Halstead	Vocabulary-based size, difficulty, effort, bugs	Halstead, 1977
Lines of Code (SLOC, PLOC, LLOC, CLOC, BLANK)	Raw, physical, logical, comment, and blank line counts	Conte, Dunsmore & Shen, 1986
Maintainability Index (MI)	Composite maintainability score	Oman & Hagemeister, 1992; Coleman et al., 1994
NArgs	Number of arguments per function	folk metric
NExits	Number of exit points per function	structured-programming literature
NOM	Number of methods and closures	Lorenz & Kidd, 1994
NPA	Number of public attributes	Lorenz & Kidd, 1994
NPM	Number of public methods	Lorenz & Kidd, 1994
Tokens	Tree-sitter leaf-token count (size proxy)	Lizard tool, Terry Yin
WMC	Sum of cyclomatic complexity across a class's methods	Chidamber & Kemerer, 1994

ABC

The ABC metric measures the size of a piece of code as a three-dimensional vector. Each component counts one kind of operation:

Assignments — anything that stores a value into a variable, including compound assignments (+=, ++) and explicit initialisation.
Branches — function and method calls. Despite the name, this is not the count of conditional jumps; it is the number of points where control branches out to other code.
Conditions — boolean tests: comparison operators (==, !=, <=, >=, <, >), ternary operators (?), and the fixed keyword set (else, case, try, catch). The default / wildcard arm is not counted in any language (see the per-language deviations below). The short-circuit logical operators && and || are not counted on their own — instead, each non-comparison operand of a && / || chain contributes one condition via Fitzpatrick's "unary conditional expression" rule. The next subsection walks through the rules, the per-language deviations, and worked examples.

The metric was introduced by Jerry Fitzpatrick in the 1997 C++ Report article Applying the ABC metric to C, C++ and Java. The current canonical specification, including the rules for what counts as an A, B, or C in modern languages, is maintained on Fitzpatrick's Software Renovation site.

Counting rules

Fitzpatrick's paper enumerates the rules in three figures — Figure 2 (C), Figure 3 (C++, which extends Figure 2), and Figure 4 (Java). Big-code-analysis implements those rule sets directly per language; the table below summarises what counts in each component, with each row attributed to the figure that introduces it.

Assignments

Rule	Counted as `A`	First defined in
Plain assignment (`=`)	one per occurrence	Figure 2 (C)
Compound assignment (`+=`, `-=`, `*=`, `/=`, `%=`, `<<=`, `>>=`, `&=`, `\|=`, `^=`)	one per occurrence	Figure 2 (C) / Figure 4 (Java)
Java unsigned-right-shift-assign (`>>>=`)	one per occurrence	Figure 4 (Java)
Pre- or post-increment / decrement (`++`, `--`)	one per occurrence	Figure 2 (C)
Initializing constructor invocation	one per occurrence	Figure 3 (C++)

Branches

Rule	Counted as `B`	First defined in
Function or method call	one per call site	Figure 2 (C) / Figure 4 (Java)
`new` operator	one per occurrence	Figure 3 (C++) / Figure 4 (Java)
`delete` operator	one per occurrence	Figure 3 (C++)
`goto label`, `break label`, `continue label`	one per occurrence	Figure 2 (C) / Figure 3 (C++) / Figure 4 (Java, labeled `break` / `continue` only — Java has no `goto`)

Conditions

Rule	Counted as `C`	First defined in
Comparison operator (`==`, `!=`, `<=`, `>=`, `<`, `>`)	one per occurrence	Figure 2, Rule 5
Ternary `? :`	one per occurrence	Figure 2, Rule 5
`else`, `case`	one per occurrence	Figure 2, Rule 5
Preprocessor `#else`, `#elif`	one per occurrence	Figure 2, Rule 5
`try`, `catch`	one per occurrence	Figure 3 (C++) / Figure 4 (Java)
Unary conditional expression	one per non-comparison operand of `&&` / `\|\|` (and per `!`-wrapped or bare-truthy condition in `if` / `while` / argument / `return` slots)	Figure 3, Rule 7 / Figure 4, Rule 9

The short-circuit logical operators (&&, ||, and per-language equivalents — Ruby and / or, Python and / or, Perl and / or / xor, Lua and / or, Tcl && / ||, iRules && / || / and / or) do not contribute a condition on their own. Each non-comparison operand contributes one instead, via the unary-conditional rule. The paper makes this explicit twice:

Listing 2 annotates (am >= 0 && am <= 0xF) ? '/' : 'C' as accc — one assignment plus three conditions, where the three conditions are the two comparisons (>=, <=) and the ternary (?). The && itself contributes zero.
Rule 7 / Rule 9 instead counts each operand: for if (x || y) printf("test failure\n"); the paper writes "there are two unary conditions since both x and y are tested as conditional expressions". The || again contributes zero; x and y each contribute one.
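The operand-counting rule can be sketched on a toy expression tree. This is a minimal illustration, not the crate's walker: the `Expr` enum and the `conditions` function are hypothetical names invented here, and a `!`-wrapped operand is modelled as a single unary leaf.

```rust
/// Hypothetical miniature expression tree (not the crate's AST).
pub enum Expr {
    /// A comparison such as `am >= 0`.
    Comparison,
    /// A bare or `!`-wrapped operand tested for truthiness, such as `x` or `!a`.
    Unary,
    /// A short-circuit `&&` / `||` with its left and right operands.
    Logical(Box<Expr>, Box<Expr>),
}

/// Counts ABC conditions contributed by an expression in a condition slot.
pub fn conditions(expr: &Expr) -> u32 {
    match expr {
        // A comparison operator counts once (Rule 5).
        Expr::Comparison => 1,
        // A non-comparison operand counts once as a unary conditional.
        Expr::Unary => 1,
        // The logical operator itself adds nothing; only its operands count.
        Expr::Logical(lhs, rhs) => conditions(lhs) + conditions(rhs),
    }
}
```

Under these assumptions, `am >= 0 && am <= 0xF` is a `Logical` of two `Comparison` leaves and `x || y` is a `Logical` of two `Unary` leaves. Both yield two conditions, with nothing added for the operator.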

Per-language deviations

Per-language impl Abc blocks narrow the paper rule set where the language has no equivalent construct, or where strict literal application would over-count.

Language	Deviation	Reason
C, Go, Rust	`try` / `catch` omitted	No `try`/`catch` keyword in the grammar; error handling uses `errno` (C), returned error values (Go), or `Result` (Rust).
Ruby	`Rescue` substitutes for `catch`	Ruby's exception-handling keyword is `rescue`; the AST node `Rescue` plays the role of Java's `catch`.
All languages	`default` / `_` wildcard arm excluded from the condition set	Fitzpatrick's Figure 2 lists `default`, but it falls through unconditionally — counting it would inflate `C` on every `switch` / `match` regardless of body. big-code-analysis omits it for every language (the Rust `_ =>` and Java `default:` arms included).
Tcl	Chain-operand unary conditions wired; bare-truthy / argument / `return` slots are not	Each operand of a `&&` / `\|\|` chain inside `expr {…}` counts as one condition, so `if {$a && $b}` reports two. The broader Phase 2B slot routing is not wired, so a bare-truthy `if {$a}` still reports zero.
iRules	Chain-operand unary conditions wired (like its Tcl sibling); bare-truthy / argument / `return` slots are not	Each operand of a `&&` / `\|\|` / `and` / `or` chain counts as one condition (Rule 9), so `if {!$a && !$b}` reports two. iRules also recognises the word-form string-match comparators (`contains`, `starts_with`, `ends_with`, `equals`, `matches`, …) that Tcl lacks (Tcl's `eq` / `ne` / `in` / `ni` are shared). The broader Phase 2B slot routing is not wired, so a bare-truthy `if {$a}` still reports zero.
All Phase 2 languages (Java, Groovy, C#, Rust, Go, JavaScript, TypeScript, TSX, Mozjs, PHP, C++, Python, Perl, Lua)	`if (true) {}`, `m(!a, !b)`, `return !x` count their operand(s)	Phase 2B routes `if` / `while` / `do-while` / argument-list / `return` / ternary slots through the same walker, so the rule applies uniformly across decision-bearing positions. A bare `return x` continues to report zero — Fitzpatrick treats an identifier in a return slot as a value, not a unary conditional.
Ruby	Bare-predicate `if` / `unless` / `while` / `until` (block and modifier forms) count one condition	Idiomatic Ruby favours bare predicates (`if flag`, `x if flag`); counting the condition slot keeps ABC conditions at or above Ruby's cyclomatic decision count (the alignment enforced across the other languages). A comparison (`if a == b`) or `&&` / `\|\|` chain in the predicate is counted by its own operator / walker arm and is not double-counted.
Bash	`if` / `elif` / `while` and each non-wildcard `case` arm count one condition	A Bash predicate is a command, so the branch keyword itself — not an embedded boolean expression — is the condition signal. Each matches a Bash cyclomatic decision; the bare `*)` case arm (the analogue of `default:`) is excluded, mirroring the cyclomatic standard count.
Kotlin	`try` counts a condition alongside `catch`	Fitzpatrick counts both keywords, and Java / C# / C++ / Groovy already count both; Kotlin previously counted only the catch block.

Worked example

Consider this C function:

char digit_or_C(int am) {
    char c;
    if (am >= 0 && am <= 0xF) {
        c = '/';
    } else {
        c = 'C';
    }
    return c;
}

Walking the function body:

Token / construct	Component	Why
`am >= 0`	C += 1	Comparison (Rule 5, `>=`)
`am <= 0xF`	C += 1	Comparison (Rule 5, `<=`)
`&&`	—	Logical operator — does not contribute on its own.
`if`/`else`	C += 1	`else` keyword (Rule 5)
`c = '/'`	A += 1	Assignment (Rule 1)
`c = 'C'`	A += 1	Assignment (Rule 1)

Total: <A,B,C> = <2, 0, 3>, magnitude √13 ≈ 3.61.

If the same body is rewritten with a unary conditional —

if (am_in_range || force_letter) {
    c = 'C';
}

the walker counts am_in_range and force_letter once each (Rule 7 / 9 unary conditional). The || operator itself contributes zero. This matches Fitzpatrick's Rule 7 / 9 worked example in the paper: if (x || y) printf("test failure\n"); "there are two unary conditions since both x and y are tested as conditional expressions."

Comparison with other ABC tools

The project follows Fitzpatrick's original paper for && / ||: the operator does not count; each non-comparison operand counts once as a unary conditional. This deviates from RuboCop's Metrics/AbcSize (which counts and / or directly) and matches StepicOrg/abcmeter and eoinnoble/python-abc. When comparing ABC numbers across tools, the operator-counting choice is the single biggest source of disagreement on the same source.

Algorithm

The implementation walks every leaf node of the syntax tree exactly once. For every node it asks the language's per-language Abc trait implementation three yes/no questions: is this an assignment? a branch? a condition? — and increments the matching counter. The four headline values are:

the three components themselves, assignments, branches, conditions;
the magnitude |<A,B,C>| = √(A² + B² + C²), which is the way Fitzpatrick recommends summarising the vector as a single number.

The full serialised output (src/metrics/abc.rs) emits these four together with value (the per-space magnitude the CLI thresholds against, which equals magnitude at a leaf space), the per-component averages (assignments_average, branches_average, conditions_average), and per-component *_min / *_max at the file scope, for fourteen fields total. The metric is specialised per language in src/languages/language_*.rs.
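As a small illustration of the magnitude formula, the sketch below computes it for a single vector. The `AbcVector` type is a hypothetical stand-in, not the struct defined in src/metrics/abc.rs; only the field names echo the component names used above.

```rust
/// Hypothetical ABC vector for one space (not the crate's own type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AbcVector {
    pub assignments: f64,
    pub branches: f64,
    pub conditions: f64,
}

impl AbcVector {
    /// Magnitude |<A,B,C>| = sqrt(A^2 + B^2 + C^2).
    pub fn magnitude(&self) -> f64 {
        (self.assignments.powi(2) + self.branches.powi(2) + self.conditions.powi(2)).sqrt()
    }
}
```

For the worked example's vector `<2, 0, 3>` this gives √13, roughly 3.61.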

How to read it

ABC is a size metric, not a complexity metric — a long, dull function with no decisions still scores high if it does a lot of assignments. Fitzpatrick's original recommendation was to use the magnitude as a relative ruler: rank a file's functions by ABC magnitude and look at the top decile.

In practice ABC ended up being most widely adopted by the Ruby community, where the rubocop linter and the flog tool both default to threshold-based warnings. A Ruby method with an ABC magnitude over about 17 is conventionally a refactoring candidate; over 30 is considered hard to maintain. Those thresholds are language-specific — expect higher values in C++ and Java, which use explicit getter/setter assignments more aggressively.

Cognitive Complexity

Cognitive Complexity was introduced by G. Ann Campbell at SonarSource in the 2017 white paper Cognitive Complexity — A new way of measuring understandability and the follow-up IEEE TechDebt 2018 paper Cognitive Complexity — An Overview and Evaluation. The white paper itself is available as CognitiveComplexity.pdf on the SonarSource site.

The metric was designed as a deliberate replacement for Cyclomatic Complexity in code-quality tooling. The argument Campbell makes is that cyclomatic complexity measures how hard code is to test, not how hard it is to understand: a 1024-arm switch statement scores the same as a deeply nested chain of ifs that perform identical logic, yet a human reader has a much harder time following the nested code.

Algorithm

Cognitive Complexity starts at zero and applies three rules as it walks the tree:

Ignore "shorthand" control flow. Constructs that simply route to a single block — a top-level if with no nesting, an else without conditions of its own, the head of a for, a ?: ternary — add a baseline +1 each, but they do not punish you for the pattern.
Penalise breaks in linear flow. Every if, else if, else, switch, try/catch, loop, jump (goto, break label, continue label), and recursive call adds at least +1.
Punish nesting. Every time control flow appears inside an already-nested block, the metric adds an extra +1 per level of nesting. An if inside a for inside an outer if inside a method scores 1 + 2 + 3 = 6, where a flat sequence of the same three constructs would have scored 1 + 1 + 1 = 3.
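The nesting rule can be sketched on a bare skeleton of control structures. This is an illustrative model only: the `Control` type and `score` function are hypothetical, every structure is treated alike, and the other increments (jumps, recursion, boolean operators) are left out.

```rust
/// Hypothetical control-flow skeleton: one node per `if` / loop / etc.,
/// owning the control structures nested directly inside its body.
pub struct Control {
    pub nested: Vec<Control>,
}

/// Each control structure costs 1 plus the depth at which it appears.
pub fn score(structures: &[Control], depth: u32) -> u32 {
    structures
        .iter()
        .map(|c| 1 + depth + score(&c.nested, depth + 1))
        .sum()
}
```

Calling `score` with depth 0 on an `if` containing a `for` containing an `if` gives 1 + 2 + 3 = 6, while three sibling structures at the top level give 3.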

Sequences of identical boolean operators (a && b && c) score +1 for the whole run, on the grounds that a chain of &&s is no harder to read than a single &&. Switching operators (a && b || c) is where the cognitive load jumps, so the second operator earns its own +1.
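A rough sketch of the operator-run rule follows. It assumes the operators of one condition are given as a flat left-to-right list and ignores parentheses; `BoolOp` and `boolean_sequence_score` are hypothetical names, not items from src/metrics/cognitive.rs.

```rust
/// The two short-circuit operators, in a hypothetical flat representation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum BoolOp {
    And,
    Or,
}

/// Adds +1 for each run of identical operators, reading left to right.
pub fn boolean_sequence_score(ops: &[BoolOp]) -> u32 {
    let mut score = 0;
    let mut previous: Option<BoolOp> = None;
    for &op in ops {
        // A new run starts whenever the operator differs from the previous one.
        if previous != Some(op) {
            score += 1;
        }
        previous = Some(op);
    }
    score
}
```

With this model `a && b && c` (two `And`s) scores 1 and `a && b || c` scores 2.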

big-code-analysis exports the per-function structural score along with the file-wide sum, min, max, and a per-function average. The implementation is in src/metrics/cognitive.rs.

How to read it

A Cognitive Complexity of 0 means the function is purely linear; no branches, no loops. SonarSource's tooling defaults to flagging functions above 15 as "too complex" and Campbell's recommendation in the white paper is that a function should rarely exceed about 25. Unlike Cyclomatic Complexity, the metric scales smoothly: deeply nested code with the same number of decisions scores significantly higher than flat code with the same decisions.

The emergent use case is refactoring guidance during code review: because the metric penalises nesting specifically, it tends to flag exactly the kind of function that benefits from an early-return or "extract method" refactor. SonarLint's IDE plugins (IntelliJ, VS Code, Visual Studio, Eclipse) all surface it as the headline complexity number on hover, and the metric has since been picked up by several language servers and code-review platforms outside the Sonar ecosystem.

Per-language deviations

Elixir does not score recursion or jump statements. Elixir control flow (if / unless / cond / case / with) is built from macro-shaped Call nodes rather than dedicated grammar productions, and the language has no break / continue / goto; the implementation therefore scores only the nesting-bearing constructs it can identify and omits the recursion (B3) and unstructured-jump (B2) increments that the SonarSource specification adds for languages that expose those shapes syntactically.
For every language with a syntactic function-definition node, a nested function (a local function, lambda, or a method on a local / inner class) resets the nesting counter to zero at its boundary and adds a function-depth surcharge, so control flow inside it is scored against the nested function's own depth rather than the enclosing function's nesting. Byte-equivalent constructs therefore score identically across languages.

Cyclomatic Complexity (CC)

The original software complexity metric, introduced by Thomas J. McCabe in 1976 in A Complexity Measure (IEEE Transactions on Software Engineering, SE-2(4), pages 308–320).

McCabe's idea was to apply graph theory to the control-flow graph of a function. If you draw every basic block as a node and every jump between blocks as an edge, the cyclomatic number of that graph is

M = E − N + 2P

where E is the number of edges, N the number of nodes, and P the number of connected components. Crucially, M is also exactly the number of linearly independent paths through the function, which makes it an upper bound on the number of test cases needed to cover every branch at least once.
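To make the formula concrete, here is a minimal sketch that evaluates it on a control-flow graph given as a list of directed edges. The function name and the edge-list representation are assumptions made for illustration, and the number of connected components is passed in rather than computed.

```rust
use std::collections::BTreeSet;

/// M = E - N + 2P for a control-flow graph given as directed edges.
/// Nodes are inferred from the edge endpoints; `components` is P.
pub fn cyclomatic_number(edges: &[(u32, u32)], components: usize) -> i64 {
    let nodes: BTreeSet<u32> = edges
        .iter()
        .flat_map(|&(from, to)| [from, to])
        .collect();
    edges.len() as i64 - nodes.len() as i64 + 2 * components as i64
}
```

An `if` / `else` diamond (condition, two arms, join) has four edges and four nodes, so M = 4 − 4 + 2 = 2. A straight-line function with one edge between two blocks has M = 1.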

Algorithm

big-code-analysis does not literally build a control-flow graph. Instead it uses the equivalent, much cheaper, formulation McCabe proved in the 1976 paper for structured programs:

Cyclomatic Complexity = 1 + (number of decision points)

A "decision point" is any node where control can branch:

if, else if, ternary ?:
case / when arms in switch / match / select
while, do … while, every variant of for
exception-handler catch clauses
short-circuit boolean operators && and ||

The per-language Cyclomatic trait, in src/metrics/cyclomatic.rs, asks each tree-sitter node "are you a decision?" and increments the counter. The metric is rolled up per function and per file; per-class aggregation across method bodies is provided separately by WMC below.

Modified cyclomatic

big-code-analysis also reports a modified variant that collapses all case / match / when arms inside a single switch statement into one decision point, regardless of how many arms it has. This tends to undercount big dispatch tables in a way that often matches developer intuition better than the strict McCabe definition — a 30-arm enum dispatch reads as one decision, not thirty. (The convention itself is not original to this project: it echoes the long-standing -m mode from Terry Yin's lizard tool, which is where many readers will first have seen it.) Both numbers are exported side by side; pick one and be consistent.
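The difference between the two counts can be sketched on a flattened list of decision points. The `Decision` enum and both functions are hypothetical; the real trait works on tree-sitter nodes, and how it treats default arms is not modelled here.

```rust
use std::collections::BTreeSet;

/// Hypothetical flattened view of one function's decision points.
pub enum Decision {
    /// `if`, `else if`, ternary, loop, `catch`, `&&`, `||`.
    Simple,
    /// One `case` / `when` / `match` arm; the id names the switch that owns it.
    CaseArm { switch_id: u32 },
}

/// Standard count: 1 plus every decision point.
pub fn standard(decisions: &[Decision]) -> usize {
    1 + decisions.len()
}

/// Modified count: all arms of the same switch collapse into one decision.
pub fn modified(decisions: &[Decision]) -> usize {
    let mut switches = BTreeSet::new();
    let mut simple = 0;
    for decision in decisions {
        match decision {
            Decision::Simple => simple += 1,
            Decision::CaseArm { switch_id } => {
                switches.insert(*switch_id);
            }
        }
    }
    1 + simple + switches.len()
}
```

For a function that is nothing but a 30-arm dispatch, this model reports 31 for the standard count and 2 for the modified one.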

Counting Rust's `?` operator

By default Rust's ? operator (the try_expression grammar node) adds +1 to both standard and modified cyclomatic, matching upstream rust-code-analysis: ? is an early-return branch. When cyclomatic is used as a maintainability gate, this can over-penalize linear-but-fallible code that threads a handful of ? through a happy path. You can opt out so ? is treated as linear error propagation:

Library: MetricsOptions::default().with_count_cyclomatic_try(false).
CLI: --cyclomatic-count-try=false (or the deprecated --no-cyclomatic-try alias), or cyclomatic_count_try = false in bca.toml (the CLI value overrides the key in either direction).
A repo gate: set cyclomatic_count_try = false in the auto-discovered bca.toml (this project's own make self-scan does exactly this — no flag or env var). Toggling the policy shifts cyclomatic values, so regenerate .bca-baseline.toml in the same change.

The default is unchanged — ? keeps counting — so published metric values are preserved. The toggle is Rust-only; no other language emits try_expression.
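The kind of code the toggle is aimed at looks like the sketch below. The function is a made-up example, and the counts in the comments are derived only from the rule stated above rather than from running the tool.

```rust
use std::num::ParseIntError;

/// Linear-but-fallible: no `if`, no loop, only two `?` early returns.
/// Going by the rule above, this would score 1 + 2 = 3 with the default
/// policy and 1 when `?` is not counted.
pub fn parse_endpoint_ports(low: &str, high: &str) -> Result<(u16, u16), ParseIntError> {
    let lo = low.trim().parse::<u16>()?; // +1 only when `?` counts
    let hi = high.trim().parse::<u16>()?; // +1 only when `?` counts
    Ok((lo, hi))
}
```

Nothing in the body branches on a condition the reader has to reason about, which is the argument for treating `?` as linear error propagation in a maintainability gate.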

How to read it

McCabe's original recommendation, repeated in the 1976 paper and preserved by NIST's Structured Testing report (Special Publication 500-235, 1996), is to treat 10 as the upper bound for a single function: above that, the number of test cases needed for branch coverage grows uncomfortably large.

The emergent uses of cyclomatic complexity have been:

Defect prediction. Complexity correlates well — though imperfectly — with the probability of a function containing a bug, and most static-analysis tools flag high-CC functions as risky.
Test-coverage planning. CC is an upper bound on the number of test cases needed to cover every branch, so test teams use it directly to budget effort.
Refactor triage. Cyclomatic Complexity is the headline "complexity" number in almost every code-quality dashboard, often as a tie-breaker between two functions that look similar in length.

Be aware of the metric's well-known blind spot: it treats every decision as equal weight. A 30-arm switch over an enum and a function with some thirty ifs nested inside one another both score around 30, even though they are very different reading experiences. Cognitive Complexity (above) was designed to fix exactly that.

Halstead

The Halstead suite is the oldest size-and-effort metric family on this page. Maurice H. Halstead introduced it in his 1977 book Elements of Software Science (Elsevier, ISBN 0-444-00205-7); the Wikipedia page on Halstead complexity measures summarises the formulas. Halstead's project was strikingly ambitious: he wanted a quantitative, empirical science of software in the same way that physics is the empirical science of matter.

The four base counts

Halstead reduces a program to its tokens, then partitions them into two categories:

Operators — anything that does something: keywords (if, return, while), arithmetic and logical operators, assignment, function-call syntax, punctuation that controls flow.
Operands — anything that is something: identifiers and literals.

From these you derive four base counts:

JSON key	Symbol	Meaning
`unique_operators`	n1	number of distinct operators
`unique_operands`	n2	number of distinct operands
`total_operators`	N1	total count of operator occurrences
`total_operands`	N2	total count of operand occurrences

The serialized output uses the descriptive JSON key column; the derived-metric formulas below use Halstead's classic n1/N1/n2/N2 notation for the same four counts.

big-code-analysis records these four numbers in src/metrics/halstead.rs per function and per file. The per-language trait classifies tokens as operator vs. operand on a token-by-token basis; the rules deliberately exclude pure layout punctuation like parentheses and statement separators, which is why the Halstead totals are not the same as the Tokens count.

Derived metrics

Halstead then derives a small zoo of formulas. big-code-analysis reports all of the standard ones, plus three less-common derivations (estimated_program_length, purity_ratio, level) that are part of the original suite:

vocabulary               n  = n1 + n2
length                   N  = N1 + N2
estimated_program_length N̂  = n1·log2(n1) + n2·log2(n2)
purity_ratio                = N̂ / N
volume                   V  = N · log2(n)                          (bits)
difficulty               D  = (n1 / 2) · (N2 / n2)
level                    L  = 1 / D
effort                   E  = D · V          (elementary mental discriminations)
time                     T  = E / 18                               (seconds)
bugs                     B  = E^(2/3) / 3000 (estimated delivered defects)

The numeric constants come from Halstead's empirical fits against a heterogeneous corpus of CDC-era programs including FORTRAN, PL/I, and Algol-family languages. The T = E / 18 "Stroud number" is separate — it comes from psychology: Halstead borrowed John Stroud's estimate that the human mind makes about 18 elementary discriminations per second.
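The derived formulas are plain arithmetic on the four base counts, as the sketch below shows. `HalsteadCounts` is a hypothetical struct whose fields borrow the JSON key names; it is not the type in src/metrics/halstead.rs, and treating `0 · log2(0)` as 0 is an assumption of this sketch.

```rust
/// The four base counts, named after the JSON keys (hypothetical struct).
pub struct HalsteadCounts {
    pub unique_operators: f64, // n1
    pub unique_operands: f64,  // n2
    pub total_operators: f64,  // N1
    pub total_operands: f64,   // N2
}

/// n * log2(n), with the n = 0 corner defined as 0 (sketch assumption).
fn n_log2_n(n: f64) -> f64 {
    if n > 0.0 { n * n.log2() } else { 0.0 }
}

impl HalsteadCounts {
    pub fn vocabulary(&self) -> f64 { self.unique_operators + self.unique_operands }
    pub fn length(&self) -> f64 { self.total_operators + self.total_operands }
    pub fn estimated_program_length(&self) -> f64 {
        n_log2_n(self.unique_operators) + n_log2_n(self.unique_operands)
    }
    pub fn purity_ratio(&self) -> f64 { self.estimated_program_length() / self.length() }
    pub fn volume(&self) -> f64 { self.length() * self.vocabulary().log2() }
    pub fn difficulty(&self) -> f64 {
        (self.unique_operators / 2.0) * (self.total_operands / self.unique_operands)
    }
    pub fn level(&self) -> f64 { 1.0 / self.difficulty() }
    pub fn effort(&self) -> f64 { self.difficulty() * self.volume() }
    pub fn time(&self) -> f64 { self.effort() / 18.0 }
    pub fn bugs(&self) -> f64 { self.effort().powf(2.0 / 3.0) / 3000.0 }
}
```

With n1 = n2 = 4 and N1 = N2 = 8, the vocabulary is 8, the length 16, the volume 16 · log2(8) = 48 bits, the difficulty 4, and the effort 192.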

How to read it

Halstead's original intent was to predict three things about a program before it was even written: how big it would be in bits, how long it would take to implement, and how many bugs to expect in deployment. The empirical evidence for the volume and length predictions is reasonable; the time and bugs predictions are more controversial and have been criticised at length, notably in the Purdue technical report Software Science Revisited.

In modern practice the Halstead numbers are used for three things:

As inputs into composite metrics, most importantly the Maintainability Index (covered later in this chapter), which depends on Halstead volume.
As a language-independent size proxy: volume in bits scales smoothly across languages in a way that LOC does not.
For comparative effort budgeting: when two refactoring candidates have similar cyclomatic complexity, the one with the higher Halstead difficulty is the one more likely to introduce regressions.

Lines of Code

This section covers the five LOC variants — SLOC, PLOC, LLOC, CLOC, and BLANK. "Counting lines" sounds trivial until you have to define exactly what counts. The five variants below are the de-facto standard breakdown, going back to Samuel Conte, Hubert Dunsmore and Vincent Shen's 1986 textbook Software Engineering Metrics and Models (Benjamin/Cummings, ISBN 0-8053-2162-4), which codified the distinction between physical and logical lines. The Wikipedia entry on source lines of code is a readable summary of that physical-versus-logical distinction.

Variant	Counts
SLOC	Source Lines Of Code — every line in the file, comments, blanks, and code alike
PLOC	Physical Lines Of Code — non-blank, non-comment-only lines
LLOC	Logical Lines Of Code — statement-bearing lines (definitions, assignments, declarations)
CLOC	Comment Lines Of Code — lines that contain a comment (with or without code on the same line)
BLANK	Blank lines — whitespace-only lines

Algorithm

big-code-analysis derives all five counts from a single pass over the tree-sitter syntax tree (see src/metrics/loc.rs). Comments and strings are identified by their AST node type rather than by lexical scanning, so multi-line strings, raw strings, doc comments, and string interpolations are all handled correctly. The per-language Loc trait specifies which node kinds count as a "statement" for LLOC; this is the subtle one, because what counts as a statement is language-defined.

The five counts satisfy a couple of useful identities:

SLOC = PLOC + BLANK + (lines that are comment-only)
CLOC ≥ (lines that are comment-only)        # CLOC also counts mixed code+comment lines
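The identities can be checked with a deliberately naive line classifier. Unlike the tree-sitter based implementation described above, this sketch is purely lexical and only understands `//` line comments, so it would misclassify a `//` inside a string literal. The `LineCounts` struct and `count_lines` function are hypothetical.

```rust
/// Line counts produced by the naive classifier below (hypothetical struct).
#[derive(Debug, Default, PartialEq, Eq)]
pub struct LineCounts {
    pub sloc: usize,
    pub ploc: usize,
    pub cloc: usize,
    pub blank: usize,
    pub comment_only: usize,
}

/// Classifies each line lexically; handles `//` comments only.
pub fn count_lines(source: &str) -> LineCounts {
    let mut counts = LineCounts::default();
    for line in source.lines() {
        counts.sloc += 1;
        let trimmed = line.trim();
        if trimmed.is_empty() {
            counts.blank += 1;
        } else if trimmed.starts_with("//") {
            // Comment-only line: counts towards CLOC but not PLOC.
            counts.comment_only += 1;
            counts.cloc += 1;
        } else {
            counts.ploc += 1;
            // Mixed code-and-comment line: counts towards both PLOC and CLOC.
            if trimmed.contains("//") {
                counts.cloc += 1;
            }
        }
    }
    counts
}
```

On a five-line file with one comment-only line, one blank line, and three code lines (one carrying a trailing comment), this yields SLOC 5, PLOC 3, BLANK 1, and CLOC 2, which satisfies both identities.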

How to read it

SLOC is what most people mean colloquially by "lines of code". It is the canonical size proxy and the size measure used inside the Maintainability Index formulas below, but it is sensitive to formatting and not portable across language conventions.
PLOC strips away the visual noise of blank and comment-only lines.
LLOC is the most reliable statement count. It is the right measure if you are budgeting test cases per statement, or comparing the density of a Python file against a Java file.
CLOC, combined with PLOC, gives you a comment density — CLOC / PLOC is a useful rough proxy for how much of the file is documentation versus implementation.
BLANK is mostly diagnostic: a file with very low BLANK proportion is often hard to read.

The emergent uses of LOC variants go well beyond raw size. They are the most common input into cost-estimation models (COCOMO and COCOMO II both use KSLOC, thousands of source lines, as their base unit), they feed effort prediction in product-portfolio dashboards, and they are used as a normalising denominator for almost every other metric: defects per KSLOC, churn per KSLOC, test cases per KSLOC. The weakness is that LOC is easy to game, and differences in coding style alone can produce a 2× difference in LOC for the same logic; that is the reason this chapter has so many other metrics in it.

Maintainability Index (MI)

The Maintainability Index is a composite metric that rolls several of the metrics above into a single 0-to-100ish number meant to be read as "how maintainable is this code?". It was proposed by Paul Oman and Jack Hagemeister in their 1992 ICSM paper Metrics for assessing a software system's maintainability and refined by Don Coleman, Dan Ash, Bruce Lowther, and Paul Oman in the 1994 IEEE Computer paper Using metrics to evaluate software system maintainability (IEEE Computer 27(8), pages 44-49). Their methodology was empirical: they collected expert maintainability ratings on a handful of production Hewlett-Packard systems, computed forty candidate metrics on each, and let regression analysis pick the best linear combination. The combination that survived used Halstead volume, cyclomatic complexity, lines of code, and comment density.

big-code-analysis reports the three formulas that have stuck in practice. The values nest under the mi object as the keys original, sei, and visual_studio (with the dotted threshold names mi.original, mi.sei, and mi.visual_studio):

mi.original      = 171 − 5.2·ln(HV) − 0.23·CC − 16.2·ln(SLOC)
mi.sei           = 171 − 5.2·log2(HV) − 0.23·CC − 16.2·log2(SLOC) + 50·sin(√(2.4·comment_percentage))
mi.visual_studio = max(0, mi.original · 100 / 171)

mi.original is the Coleman–Oman formula. It can be negative for pathological files.
mi.sei is the Software Engineering Institute's refinement, which adds a comment-density term — the sin(√(...)) shape was chosen so that some comments help, but adding more after a point does not. comment_percentage is the comment-line share expressed as a percentage in [0, 100] (not a ratio in [0, 1]); the code feeds this percentage straight into the SEI term (see src/metrics/mi.rs and issue #241).
mi.visual_studio is the linear rescaling Microsoft chose for Visual Studio, where the score is clamped to [0, 100] and shown to developers traffic-light style: green ≥ 20, yellow ≥ 10, red below.

The historical context, and a sharp critique of the metric, is collected on Arie van Deursen's blog post Think Twice Before Using the Maintainability Index.

Algorithm

The implementation is purely arithmetic: src/metrics/mi.rs consumes the already-computed Halstead, Cyclomatic, and LOC metrics and applies the three formulas. Because the formulas take logarithms of Halstead volume and SLOC (natural log in mi.original, base-2 in mi.sei), MI is undefined for empty files; big-code-analysis returns 0.0 for any file with zero SLOC or zero Halstead volume.
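The three formulas can be written out directly. The following is a minimal sketch rather than a port of src/metrics/mi.rs: the function name and the returned dictionary are illustrative, and it assumes the zero guard applies to all three values at once.

```python
import math


def maintainability_index(hv: float, cc: float, sloc: float,
                          comment_percentage: float) -> dict:
    """Compute the three MI variants from already-known inputs.

    hv is Halstead volume, cc is cyclomatic complexity, and
    comment_percentage is in [0, 100], not a ratio in [0, 1].
    """
    # Logarithms are undefined at zero, so empty inputs yield 0.0
    # (assumed here to apply to every variant).
    if hv <= 0 or sloc <= 0:
        return {"original": 0.0, "sei": 0.0, "visual_studio": 0.0}

    original = 171 - 5.2 * math.log(hv) - 0.23 * cc - 16.2 * math.log(sloc)
    sei = (171 - 5.2 * math.log2(hv) - 0.23 * cc - 16.2 * math.log2(sloc)
           + 50 * math.sin(math.sqrt(2.4 * comment_percentage)))
    visual_studio = max(0.0, original * 100 / 171)
    return {"original": original, "sei": sei, "visual_studio": visual_studio}


mi = maintainability_index(hv=1000, cc=10, sloc=100, comment_percentage=20)
print(round(mi["original"], 2), round(mi["visual_studio"], 2))
```

Note that only the lower bound of mi.visual_studio needs an explicit clamp: mi.original cannot exceed 171 when HV and SLOC are at least 1, so the rescaled value stays at or below 100.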

How to read it

MI was originally designed as a portfolio-level score: "how much maintenance pain should we expect from this codebase over the next year?". It is fairly stable across releases of a healthy system and tends to drop measurably before a system enters the "legacy" quadrant.

The emergent use case is the Visual Studio traffic-light rendering: every C# developer who has hovered a method in the IDE has seen the green / yellow / red icon, and the underlying number is mi.visual_studio. This made MI by far the most user-facing software metric for an entire generation of .NET developers, which is also why it is the metric that has attracted the most criticism. Treat it as a smoke detector, not a thermostat: a sudden drop is a useful signal, but the absolute number is noisy.

NArgs

NArgs counts the number of arguments declared by a function, method, or closure. The metric does not have a famous origin paper — it is folk wisdom dating to at least Kernighan and Plauger's The Elements of Programming Style (1974) and prominently re-stated in Robert C. Martin's Clean Code (2008), which suggests three arguments as a soft ceiling.

big-code-analysis splits the count by callable kind: every aggregate is reported separately for functions and closures so a Rust file heavy on |…| … closures and a Java file with only methods produce comparable numbers. The serialised output (src/metrics/nargs.rs) is function_args, closure_args, function_args_average, closure_args_average, total, average, function_args_min, function_args_max, closure_args_min, closure_args_max. The implementation handles default arguments, variadic arguments, keyword-only arguments, and destructured parameters consistently per language.

How to read it

A function with many arguments is hard to call correctly and even harder to test exhaustively — the test matrix grows roughly exponentially. The classic refactoring advice is the introduce parameter object pattern: when a function takes more than four related arguments, group them into a record / struct / dataclass.
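To make the introduce parameter object advice concrete, here is a hedged before-and-after sketch. All names (create_user, Address, nargs) are hypothetical, and the count uses Python's runtime inspect module as a stand-in; big-code-analysis derives NArgs from the syntax tree instead.

```python
import inspect
from dataclasses import dataclass


def create_user(name, email, street, city, postcode, country):
    """Before: six positional arguments, four of which belong together."""
    return f"{name} <{email}>, {street}, {city} {postcode}, {country}"


@dataclass
class Address:
    """Parameter object grouping the related address fields."""
    street: str
    city: str
    postcode: str
    country: str


def create_user_grouped(name, email, address: Address):
    """After: the address travels as one argument."""
    return (f"{name} <{email}>, {address.street}, "
            f"{address.city} {address.postcode}, {address.country}")


def nargs(func) -> int:
    """Count the parameters a callable declares."""
    return len(inspect.signature(func).parameters)


print(nargs(create_user), nargs(create_user_grouped))  # 6 3
```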

The emergent use is as a review-blocking lint rule: most modern linters (pylint's R0913, ESLint's max-params, Checkstyle's ParameterNumber) flag functions with more than a configurable threshold. NArgs is also a useful component of API-design dashboards: public APIs whose average NArgs has crept upward over time tend to be ones that have accreted "just one more parameter" feature flags.

NExits

NExits counts the number of distinct exit points from a function — every explicit return, every throw / raise, and (in Rust) every ? early-return. The implicit fall-through return at the end of a function is not counted; only explicit exits are (see issue #243).

The metric goes back to the structured-programming literature of the 1970s, where Edsger Dijkstra and others argued that functions should have a single entry and a single exit point (the "SESE" rule). Modern thinking is much more nuanced — see Steve McConnell's Code Complete, 2nd edition (Microsoft Press, 2004), which explicitly recommends early returns as a clarity-improving pattern when they reduce nesting.

big-code-analysis walks each function's syntax tree, identifies the language-specific exit nodes (see the per-language Exit trait in src/metrics/nexits.rs), and reports per-function counts plus file-level sum, average, min, and max. The serialised field name is nexits, matching the prose acronym used here.
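The counting rule can be approximated for Python source with the standard ast module. This is a sketch of the idea only: big-code-analysis works on tree-sitter nodes through its per-language Exit trait, and the choice below to leave a nested function's exits out of the enclosing function's count is an assumption of this example.

```python
import ast


class _ExitCounter(ast.NodeVisitor):
    """Counts explicit return and raise statements in one function body."""

    def __init__(self):
        self.count = 0

    def visit_Return(self, node):
        self.count += 1
        self.generic_visit(node)

    def visit_Raise(self, node):
        self.count += 1
        self.generic_visit(node)

    def visit_FunctionDef(self, node):
        # Do not descend: a nested function's exits are attributed to it.
        pass

    visit_AsyncFunctionDef = visit_FunctionDef


def count_exits(source: str, name: str) -> int:
    """Return the number of explicit exits in the function called `name`."""
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)) and node.name == name:
            counter = _ExitCounter()
            for stmt in node.body:
                counter.visit(stmt)
            return counter.count
    raise KeyError(name)


SAMPLE = '''
def parse(text):
    if text is None:
        raise ValueError("no input")
    if not text:
        return []
    items = text.split(",")
    return items

def log(msg):
    print(msg)
'''

print(count_exits(SAMPLE, "parse"))  # 3
print(count_exits(SAMPLE, "log"))    # 0
```

The second function shows the implicit fall-through case: it returns at the end of its body, but has no explicit exit and so counts as zero.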

How to read it

Strict SESE coding standards (DO-178C for avionics, MISRA C for embedded automotive — see MISRA's official site) still require an NExits of 1 per function, because multiple exit points complicate certified control-flow analysis. Outside those domains, an NExits of 2-4 is usually a good sign — it almost always means the function uses guard clauses to handle preconditions and then proceeds in a flat body.

A very high NExits — say above 8 — is the warning sign. It usually means the function should have been split into several smaller functions, with each "successful branch" becoming its own helper.

NOM

NOM stands for Number Of Methods and counts every function, method, and closure defined inside a given scope (file, class, or namespace). For object-oriented codebases it is one of the first metrics introduced by Mark Lorenz and Jeff Kidd in their 1994 book Object-Oriented Software Metrics (Prentice Hall, ISBN 0-13-179292-X), where it is treated as the primary class-size indicator.

big-code-analysis reports the count split by callable kind in src/metrics/nom.rs. The serialised fields are functions, closures, functions_average, closures_average, total, average (overall average across containing spaces), and per-kind functions_min, functions_max, closures_min, closures_max.

The split lets you ask different questions of the same code: a Rust crate with many closures and few functions is typical of iterator-heavy code; a Python module with many functions and few closures is typical of script-style code.

How to read it

NOM is the input to several other metrics — WMC sums cyclomatic complexity across the same set of methods that NOM counts, and NPM filters that same set down to public methods. As a standalone metric, the Lorenz–Kidd recommendation is ≤ 20 methods per class. The emergent use is as a God-class detector: a class with NOM in the dozens is almost always doing too much, and is a strong candidate for "extract collaborator" refactoring as documented in Martin Fowler's Refactoring catalogue entry on Large Class.

NPA

NPA counts the number of public attributes (a.k.a. fields, properties, instance variables) declared by a class or interface. It is part of the metric family introduced by Lorenz and Kidd in Object-Oriented Software Metrics (1994) and was later folded into the MOOD ("Metrics for Object-Oriented Design") suite proposed by Brito e Abreu and Carapuça (1994).

big-code-analysis splits the count by definition-site kind: classes (concrete types with state) and interfaces (abstract contracts). The serialised output (src/metrics/npa.rs) is class_npa_sum (sum of NPA across all classes), interface_npa_sum (sum across interfaces), class_attributes (sum of all attributes — public or not — across classes), interface_attributes, class_cda (class density of public attributes — an accessibility ratio, not an average), interface_cda, total, total_attributes, and cda. The per-language Npa trait decides what counts as "public" (Java public, C# public, Rust pub, Python's "no leading underscore" convention, …) and what counts as "attribute" rather than "method".
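The relationship between the public count, the total count, and the accessibility ratio can be sketched as follows. The example assumes the ratio is public attributes divided by all attributes, applies Python's "no leading underscore" convention, and uses a hypothetical helper name; the real decision logic lives in the per-language Npa trait.

```python
def public_attribute_ratio(attribute_names):
    """Return (public, total, ratio) for one class's attribute names.

    A name counts as public when it has no leading underscore. An empty
    class yields a ratio of 0.0, which is an assumption of this sketch.
    """
    total = len(attribute_names)
    public = sum(1 for name in attribute_names if not name.startswith("_"))
    ratio = public / total if total else 0.0
    return public, total, ratio


print(public_attribute_ratio(["id", "name", "_cache", "_lock"]))  # (2, 4, 0.5)
```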

How to read it

NPA is a direct measure of encapsulation. Every public attribute is a piece of internal state that callers can read or write without going through a method, which means it is a piece of internal state the class cannot validate or evolve without breaking callers. The canonical guidance — first explicitly stated in Bertrand Meyer's Object-Oriented Software Construction (Prentice Hall, 1988) and known as the Uniform Access Principle — is to keep NPA at or near zero and to expose state through public methods instead.

The emergent use is API-stability auditing: a public library class whose NPA grows over time accumulates breaking-change liability faster than its public-method surface.

NPM

NPM counts the number of public methods declared by a class or interface. It is the method-side companion to NPA and was again codified by Lorenz and Kidd (1994).

As with NPA, big-code-analysis splits NPM by definition-site kind (classes vs. interfaces). The serialised output (src/metrics/npm.rs) is class_npm_sum (sum of NPM across classes), interface_npm_sum, class_methods (sum of all methods — public or not — across classes), interface_methods, class_coa, interface_coa (operation-accessibility ratios, not averages), total, total_methods, and coa. The language-specific Npm trait decides what counts as public — for example, Rust's pub, Python's leading-underscore convention, C++'s public: section — and folds together regular methods, constructors, and operator overloads as appropriate.

NPM is also one of the inputs into the Class Interface Size (CIS) metric of Jagdish Bansiya and Carl Davis's QMOOD model, and into Chidamber and Kemerer's Response For a Class (RFC).

How to read it

NPM is the public interface size. A class with NPM in the dozens is a class with too large an API contract: every public method is something callers can come to depend on, and every change to it is a breaking change. The Lorenz–Kidd guidance is ≤ 20 public methods per class, with anything over 40 being considered a strong refactoring candidate. The same rule applies particularly forcefully to interfaces in Java and C#, where the contract really is the shape clients pin against.

The emergent use is as a public-API change tracker for libraries: monitoring NPM at the package level catches accidental expansion of a library's surface area in the same way that NPA catches accidental exposure of internal fields.

Tokens

Tokens is a per-function and per-file count of the tree-sitter leaf tokens — identifiers, literals, keywords, punctuation — excluding any token whose AST ancestor is a comment node. It is a modern, lexer-driven size proxy intended as a more formatting-resilient alternative to LOC. (The same idea is well known from Terry Yin's lizard command-line tool, which is where many readers will first have seen a token-count metric.)

The implementation lives in src/metrics/tokens.rs. Because Tokens counts every leaf, including punctuation that Halstead deliberately skips, the value will not equal Halstead N1 + N2, and because it counts tokens rather than lines it is not equivalent to any LOC variant. Whitespace-only reformatting, renaming a variable, and removing a comment all leave Tokens unchanged. Edits that change the tokens themselves (adding an if, adding optional braces around a single-statement block, or inserting or removing semicolons in a language where they are optional) do change the count.
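The invariance properties are easy to see with any lexer. The sketch below uses Python's standard tokenize module as a stand-in for tree-sitter leaf tokens, so the absolute counts will not match what big-code-analysis reports; count_tokens is a hypothetical helper written for this example.

```python
import io
import tokenize

# Token kinds that carry layout or comments rather than code.
_SKIPPED = {
    tokenize.COMMENT, tokenize.NL, tokenize.NEWLINE,
    tokenize.INDENT, tokenize.DEDENT, tokenize.ENDMARKER,
}


def count_tokens(source: str) -> int:
    """Count code tokens, ignoring comments, newlines, and indentation."""
    readline = io.StringIO(source).readline
    return sum(1 for tok in tokenize.generate_tokens(readline)
               if tok.type not in _SKIPPED)


original = "total = price * 2  # double it\n"
reformatted = "total=price*2\n"                  # whitespace and comment removed
renamed = "t = p * 2\n"                          # identifiers renamed
guarded = "if price:\n    total = price * 2\n"   # an added `if` changes the tokens

print(count_tokens(original), count_tokens(reformatted),
      count_tokens(renamed), count_tokens(guarded))  # 5 5 5 8
```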

How to read it

Tokens is the most formatting-resilient size proxy in the suite. It is the right size measure to use when you are normalising another metric across languages or across teams with different style conventions — bugs per KSLOC is sensitive to formatting, while bugs per 1000 tokens is much less so.

The emergent use is as the defect-density denominator of choice in cross-language research: a 1000-line Java file and a 1000-line Lisp file contain very different amounts of code, but a 1000-token slice of each contains roughly the same amount of information. This makes Tokens particularly useful for machine-learning code-quality models that train across many languages.

WMC

WMC — Weighted Methods per Class — is the first metric in the Chidamber and Kemerer suite, introduced in their 1994 IEEE Transactions on Software Engineering paper A Metrics Suite for Object Oriented Design (volume 20, issue 6, pages 476-493). The CK suite — WMC, DIT, NOC, CBO, RFC, LCOM — is the single most-cited collection of OO metrics in the academic literature; big-code-analysis currently implements WMC and the simpler size metrics (NOM, NPA, NPM), with the inheritance- and coupling-based ones tracked for future work.

WMC is the sum of the cyclomatic complexity of every method defined in a class. The original paper deliberately left the "weighting" abstract — Chidamber and Kemerer wrote that "if all method complexities are considered to be unity, then WMC = n, the number of methods" — but the empirical follow-up literature has almost universally settled on cyclomatic complexity as the weight, and that is what big-code-analysis uses.

Algorithm

For each class or interface found by the per-language parser, big-code-analysis sums the standard cyclomatic complexity of every method body inside it (src/metrics/wmc.rs). The file-level serialised output is three fields: class_wmc_sum (sum of WMC across all classes in the file), interface_wmc_sum (sum across interfaces), and total (the two combined). No min/max/average aggregation is emitted at the file scope — to rank individual classes by WMC, use the report subcommand, which surfaces a Type hotspots (top N by WMC) section (see Commands → Report).
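The aggregation itself is a pair of sums. In the sketch below the per-method cyclomatic values are supplied by hand and the input shape (type name mapped to method name mapped to complexity) is an assumption of the example; only the three output keys come from the description above.

```python
def wmc(method_complexities: dict) -> int:
    """WMC of one type: the sum of its methods' cyclomatic complexities."""
    return sum(method_complexities.values())


def file_wmc(classes: dict, interfaces: dict) -> dict:
    """File-level roll-up using the three serialised field names."""
    class_sum = sum(wmc(methods) for methods in classes.values())
    interface_sum = sum(wmc(methods) for methods in interfaces.values())
    return {
        "class_wmc_sum": class_sum,
        "interface_wmc_sum": interface_sum,
        "total": class_sum + interface_sum,
    }


classes = {
    "Parser": {"parse": 7, "peek": 1, "advance": 2},
    "Token": {"kind": 1},
}
interfaces = {"Visitor": {"visit": 1, "leave": 1}}

print(file_wmc(classes, interfaces))
# {'class_wmc_sum': 11, 'interface_wmc_sum': 2, 'total': 13}
```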

How to read it

Chidamber and Kemerer offered three hypotheses about WMC, all of which have been validated repeatedly since:

Higher WMC predicts higher maintenance effort. A class whose methods are individually complex will resist comprehension.
Higher WMC reduces reuse. Classes that do many complicated things are hard to drop into a new context.
Higher WMC suggests broader application-specific behaviour. Such classes tend to be "main loop"-style coordinators rather than reusable building blocks.

The emergent use is God-class detection: combined with NOM, WMC is one of the clearest signals that a class needs to be split. A class with high NOM but low WMC is a passive data holder (probably fine). A class with low NOM and high WMC has a few gargantuan methods (split the methods, not the class). A class with both high NOM and high WMC is the classic God class.
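The three cases above amount to a two-by-two triage. A minimal sketch follows; the nom_limit default echoes the Lorenz and Kidd figure quoted earlier, while wmc_limit is a purely hypothetical threshold chosen for illustration and should be calibrated against your own codebase.

```python
def triage(nom: int, wmc: int, nom_limit: int = 20, wmc_limit: int = 50) -> str:
    """Classify a class by combining NOM and WMC (thresholds are illustrative)."""
    high_nom = nom > nom_limit
    high_wmc = wmc > wmc_limit
    if high_nom and high_wmc:
        return "god class: split the class"
    if high_nom:
        return "passive data holder: probably fine"
    if high_wmc:
        return "few oversized methods: split the methods"
    return "unremarkable"


print(triage(nom=45, wmc=30))   # passive data holder: probably fine
print(triage(nom=4, wmc=120))   # few oversized methods: split the methods
print(triage(nom=60, wmc=300))  # god class: split the class
```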

Where to go next

The Supported Languages chapter lists every supported language and grammar. Metric coverage varies by language because some metric definitions (NPA, NPM, WMC) only make sense in languages with classes.
The Supported Change-history (VCS) Metrics chapter covers the complementary family derived from version-control history — commit frequency, churn, ownership, and the composite risk and hotspot scores — rather than from the source AST.
The Commands → Metrics page documents how to invoke bca metrics to produce the JSON / YAML / TOML / CBOR output for any of these numbers.
The Recipes chapter shows end-to-end examples of producing quality reports from these metrics, including pipelining them into dashboards.

Supported Change-history (VCS) Metrics

This chapter is a guided tour of the change-history metrics — the signals big-code-analysis derives from version-control history rather than from the source AST. Where the Supported Code Metrics chapter measures the shape of the code as it exists today, these metrics measure how that code got there: how often it changes, how much, by how many people, and in what kind of commits. Each section starts from the empirical paper that first connected the signal to defects, walks through how big-code-analysis computes it, and explains how to read the number in practice.

The whole family is computed by the bca vcs command and attached to bca metrics --vcs output; the Commands → Change-history (VCS) metrics page documents the flags, windows, output formats, and caching. This chapter is the why; that page is the how.

A few framing notes before we start:

These metrics predict where defects cluster, not whether code is correct. The defect- and vulnerability-prediction literature consistently finds that process signals — how a file has been edited over time — out-predict product signals like size or complexity. Graves et al. put it bluntly in 2000: the number of times a module has been changed is a better predictor of its fault count than its length. None of these numbers measures correctness; they rank files by the probability that a bug is hiding in one.
Everything is measured over two windows. A single history walk per invocation produces every signal over a long window (default 12mo ≈ 365 days) and a recent window (default 90d). Recent activity is weighted more heavily than old activity throughout, because recency is the strongest single signal in the just-in-time defect-prediction line.
The composite scores are ordinal, not cardinal. The risk_score, hotspot_score, and commit-level JIT score are rulers, not thermometers: only relative ranks carry meaning. Rank a repository's files (or commits) against each other, or against the repository's own distribution over time. Do not read an absolute magnitude as a probability or compare a raw score across unrelated projects.
An absent record is not a zero. An untracked file has no record at all, which is distinct from a tracked file with zero in-window activity. A computed 0.0 (for example, a file that only ever changed alone has zero co-change entropy) is a real measurement.

Index

Metric	Measures	First connected to defects by
Commit frequency	How often a file changes	Graves et al., 2000
Code churn	How many lines change	Nagappan & Ball, 2005
Burst	Concentration of change into the recent window	recency principle (Graves; JIT line)
Authorship and ownership	How many hands touch a file, and how concentrated	Bird et al., 2011; Meneely & Williams, 2009
Fix, security, and revert commits	History of corrective change	Śliwerski, Zimmermann & Zeller, 2005
Age and last-modified	When a file was born and last touched	just-in-time defect-prediction line
Change entropy	How scattered the commits touching a file are	Hassan, 2009
Co-change entropy	How wide a file's change-ripple blast radius is	arXiv 2504.18511, 2025
Composite risk score	Weighted roll-up of every signal above	this project (formula v2)
Hotspot score	Complexity × recent churn	Tornhill, 2015
Bus factor	Knowledge concentration across a set of files	Avelino et al., 2016
Just-in-time commit score	Defect-induction risk of one commit	Kamei et al., 2013

Commit frequency

The simplest change-history signal is how many distinct commits have touched a file. big-code-analysis records it per window as commits_long and commits_recent.

The metric's standing as a defect predictor comes from Todd Graves, Alan Karr, J. S. Marron, and Harvey Siy's 2000 IEEE TSE paper Predicting Fault Incidence Using Software Change History. Studying a large telephone-switching system, they found that process measures drawn from the change history predicted fault rates better than any product metric of the code itself, and that the count of prior changes was among the strongest. The intuition is direct: a file nobody touches is a file nobody is breaking.

Algorithm

A single history walk visits each commit reachable from the analysed ref (first-parent only by default, the full DAG with --full-history) and attributes it to every file it modifies. Merge commits are excluded unless --include-merges is passed, and renames are followed by default so a file's history survives a move. The walk counts distinct commits per file per window, not diffs or hunks, so a commit that touches a file once and a commit that touches it in three hunks both count as one.
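The counting step of the walk can be sketched over a pre-extracted commit list. This example deliberately leaves out first-parent traversal, merge filtering, and rename following; the input tuple shape and the inclusive window boundaries are assumptions of the sketch, not documented behaviour.

```python
from collections import defaultdict


def commit_frequency(commits, long_days=365, recent_days=90):
    """Count distinct commits per file in the long and recent windows.

    `commits` is an iterable of (commit_id, age_in_days, touched_paths).
    """
    seen_long = defaultdict(set)
    seen_recent = defaultdict(set)
    for commit_id, age_days, paths in commits:
        if age_days > long_days:
            continue  # outside both windows
        for path in set(paths):  # several hunks in one file still count once
            seen_long[path].add(commit_id)
            if age_days <= recent_days:
                seen_recent[path].add(commit_id)
    return {
        path: {"commits_long": len(ids), "commits_recent": len(seen_recent[path])}
        for path, ids in seen_long.items()
    }


history = [
    ("c1", 10, ["a.rs", "b.rs"]),
    ("c2", 200, ["a.rs", "a.rs"]),  # duplicate path models a multi-hunk diff
    ("c3", 400, ["a.rs"]),          # older than the long window
]
print(commit_frequency(history))
# {'a.rs': {'commits_long': 2, 'commits_recent': 1}, 'b.rs': {'commits_long': 1, 'commits_recent': 1}}
```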

How to read it

Commit frequency is a rate, so it is only meaningful relative to a baseline: this file versus its siblings, or this file this quarter versus last. A consistently high count flags a file that is either under active feature development or structurally unstable; pairing it with code churn tells the two apart, since heavy churn on few commits is a different story from light churn on many.

Code churn

Code churn is the volume of change: the sum of added and deleted lines touching a file, recorded per window as churn_long and churn_recent.

Churn's defect-prediction pedigree is Nachiappan Nagappan and Thomas Ball's 2005 ICSE paper Use of Relative Code Churn Measures to Predict System Defect Density. Their key finding, validated on Windows Server 2003, is that absolute churn is a poor predictor on its own, but churn relative to file size and to the temporal spread of the changes is highly predictive of defect density. big-code-analysis keeps the raw windowed churn as a signal and lets the composite score combine it with size and recency, rather than reporting a single absolute number as a verdict.

Algorithm

For every in-window commit touching a file, the walk adds that diff's added-plus-deleted line count to the file's churn total for the window. Because added and deleted lines are summed, a one-line edit counts as two (one deletion, one addition) — churn measures activity, not net growth.
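A minimal sketch of the accumulation, assuming per-file diff statistics have already been extracted as (path, age_in_days, added, deleted) tuples. That tuple shape and the inclusive window boundaries are assumptions of the example.

```python
def windowed_churn(diffs, long_days=365, recent_days=90):
    """Sum added plus deleted lines per file over both windows."""
    churn = {}
    for path, age_days, added, deleted in diffs:
        if age_days > long_days:
            continue
        entry = churn.setdefault(path, {"churn_long": 0, "churn_recent": 0})
        entry["churn_long"] += added + deleted
        if age_days <= recent_days:
            entry["churn_recent"] += added + deleted
    return churn


diffs = [
    ("a.rs", 5, 1, 1),      # a one-line edit: one deletion plus one addition
    ("a.rs", 120, 40, 10),  # an older, larger change
]
print(windowed_churn(diffs))
# {'a.rs': {'churn_long': 52, 'churn_recent': 2}}
```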

How to read it

Churn is the natural numerator for a density ratio. Churn per commit distinguishes steady editing from a few large rewrites; churn against file size recovers Nagappan and Ball's relative measure. A file whose recent churn is high relative to its long-window churn is changing faster now than its history would predict — exactly the burst signal below.

Burst

Burst is the share of a file's activity concentrated in the recent window: commits_recent / commits_long, reported as a ratio in [0, 1]. A value near 1 means almost all of the file's commits are recent; a value near 0 means the file was active long ago and has since gone quiet.
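As a worked example of the ratio, the snippet below computes burst for two files. Returning 0.0 when the long window holds no commits is an assumption of this sketch; the text above does not say how that case is reported.

```python
def burst(commits_recent: int, commits_long: int) -> float:
    """Share of long-window commits that fall in the recent window."""
    return commits_recent / commits_long if commits_long else 0.0


print(burst(commits_recent=9, commits_long=10))  # 0.9, almost all activity is recent
print(burst(commits_recent=1, commits_long=40))  # 0.025, active long ago and now quiet
```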

The signal follows directly from the recency principle that runs through the change-history literature — Graves et al. found that recent changes weigh more heavily on fault incidence than old ones, and the just-in-time defect-prediction line (see the JIT score) is built on the same observation. A file whose change history is front-loaded into the present is in active flux, and active flux is when defects enter.

How to read it

Burst is a tie-breaker, not a standalone alarm. A high burst on a file with little total history is a young, fast-moving file; a high burst on a file with deep history is an old file that has suddenly woken up, which is often the more interesting case. Read it alongside age to tell the two apart.

Authorship and ownership

Two related signals describe who changes a file. Author counts (authors_long, authors_recent) count the distinct people who touched it in each window. Ownership (ownership_top_share, a ratio in [0, 1]) is the fraction of edits attributable to the single most active author; a low share means diffuse ownership, a high share means one person dominates.

Both trace to empirical work on ownership and security. Christian Bird and colleagues' 2011 FSE paper Don't Touch My Code! Examining the Effects of Ownership on Software Quality found, on Windows Vista and 7, that diffuse ownership — many low-expertise contributors, a low top-owner share — predicts both pre-release faults and post-release failures. On the security side, Andrew Meneely and Laurie Williams' 2009 CCS paper Secure Open Source Collaboration: An Empirical Study of Linus' Law reported that Red Hat Enterprise Linux 4 files touched by nine or more developers were roughly sixteen times more likely to harbour a vulnerability.

Algorithm

Author identities are canonicalised through the repository .mailmap and counted by lowercased email, so a contributor who commits under two addresses counts once. Co-authored-by: trailers add participants. Bot identities (dependabot[bot], renovate[bot], github-actions[bot], and similar) are excluded by default so automated churn does not inflate the count. ownership_top_share is the top author's edit count over the file's total edits in the window.
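The counting rules can be sketched as below. The example skips .mailmap resolution and Co-authored-by: trailers entirely, and it detects bots by a "[bot]" name suffix, which is an assumption made for illustration rather than the project's actual bot list.

```python
from collections import Counter

BOT_SUFFIX = "[bot]"  # assumed marker; the real exclusion list may differ


def ownership(edits):
    """Summarise authorship from (author_name, email) pairs for one file and window."""
    counts = Counter(
        email.lower()
        for name, email in edits
        if not name.endswith(BOT_SUFFIX)
    )
    total = sum(counts.values())
    top = max(counts.values(), default=0)
    return {
        "authors": len(counts),
        "ownership_top_share": top / total if total else 0.0,
    }


edits = [
    ("Ada", "ada@example.com"),
    ("Ada", "ADA@example.com"),  # same person, different casing
    ("Ada", "ada@example.com"),
    ("Bo", "bo@example.com"),
    ("dependabot[bot]", "support@example.com"),
]
print(ownership(edits))  # {'authors': 2, 'ownership_top_share': 0.75}
```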

How to read it

A rising author count on a long-lived file is a knowledge-diffusion signal: more people are now obliged to understand it. The composite risk score folds this in two ways — it scales the author factor by an ownership-dilution term (1 - ownership_top_share), so the same head-count counts for more when ownership is spread thin, and it adds categorical bumps at the six- and nine-developer marks that encode Meneely and Williams' RHEL4 thresholds. For knowledge concentration across a set of files rather than within one, see the bus factor.

Fix, security, and revert commits

big-code-analysis classifies each long-window commit touching a file by the intent of its message and keeps three counts: bug_fix_commits (messages matching a bug-fix keyword), security_fix_commits (messages matching CVE-####, security, vuln, exploit, sanitize, and similar), and revert_commits (subjects that are a revert or rollback).

Reading a commit's purpose from its message is the technique introduced by Jacek Śliwerski, Thomas Zimmermann, and Andreas Zeller's 2005 MSR paper When Do Changes Induce Fixes?, the work now universally abbreviated SZZ. The premise is that a file's history of corrective change is itself predictive: a file that has needed many fixes is a file likely to need more. A security fix is a sharper signal than an ordinary bug fix, and a revert marks a change that had to be undone — a localized admission that something went wrong there.

Algorithm

Classification is keyword-based on the commit subject and body (see src/vcs/classify.rs). It is deliberately simple and language- and tracker-agnostic: there is no issue-tracker linkage and no blame back-tracing to the fix-inducing commit, so this is the lightweight message-classification half of SZZ, not the full algorithm. The counts are kept over the long window only.
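A keyword classifier of this kind fits in a few lines. The patterns below are illustrative guesses: the security keywords come from the list quoted earlier, but the bug-fix keywords, the exact CVE pattern, and the decision to return three independent flags are assumptions of this sketch, not a copy of src/vcs/classify.rs.

```python
import re

# Hypothetical keyword lists, for illustration only.
BUG_FIX = re.compile(r"\b(fix(es|ed)?|bug|defect)\b", re.IGNORECASE)
SECURITY = re.compile(
    r"\bCVE-\d{4}-\d+\b|\b(security|vuln\w*|exploit\w*|sanitiz\w*)\b",
    re.IGNORECASE,
)
REVERT = re.compile(r"^\s*(revert|rollback|roll back)\b", re.IGNORECASE)


def classify(message: str) -> dict:
    """Flag a commit message as bug fix, security fix, and/or revert."""
    subject = message.splitlines()[0] if message else ""
    return {
        "bug_fix": bool(BUG_FIX.search(message)),        # subject and body
        "security_fix": bool(SECURITY.search(message)),  # subject and body
        "revert": bool(REVERT.match(subject)),           # subject only
    }


print(classify("Fix null dereference in parser"))
print(classify("Sanitize upload path (CVE-2024-1234)"))
print(classify('Revert "Add response cache"'))
```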

How to read it

A high bug_fix_commits count is a file that keeps needing repair. The composite score feeds all three through a single log-scaled term with security fixes double-weighted, so a file with a history of security fixes ranks above an otherwise-identical file with only ordinary bug fixes. Because the classifier reads messages, its accuracy tracks your project's commit hygiene: a repository with terse or templated messages will under-count.

Age and last-modified

Two timing signals bound a file's history. age_days is the number of days since the file's first in-window commit (capped at the long window); last_modified_days is the number of days since its most recent in-window commit.

These anchor the recency reasoning the rest of the family depends on. A small age_days marks a new file, and newly added code carries elevated risk — an observation from the just-in-time line and from Chromium's own defect analysis, where freshly added features were disproportionately fault-prone. A small last_modified_days marks a file edited recently, which the windows already weight.

How to read it

age_days is most useful as the new-file trigger it feeds into the risk score: a file first seen inside the recent window earns a small additive bonus, reflecting that brand-new code has had the least chance to be exercised and reviewed. last_modified_days is a staleness read in the other direction: a high-risk file that has not been touched in months is a different maintenance proposition from one being edited today.

Change entropy

Change entropy measures how scattered the commits touching a file are. It is reported per window as change_entropy_long and change_entropy_recent, in bits.

The metric is Ahmed Hassan's History Complexity Metric from the 2009 ICSE paper Predicting Faults Using the Complexity of Code Changes. The idea adapts Shannon entropy to commits: for each commit, take the distribution of its churn across the files it touched and compute that distribution's entropy in bits. A commit that touches one file scores 0; a commit that spreads its churn evenly across n files scores log₂(n), the maximum for n files. A scattered, cross-cutting commit is harder to reason about than a focused one. Hassan reported that this change-process complexity predicts faults better than prior code-based or change-count models; later work measured file-level change entropy at a Pearson correlation up to 0.54 with defect counts on eight Apache projects (see co-change entropy).

Algorithm

For each commit, big-code-analysis computes the Shannon entropy H of its churn distribution across the touched files (see src/vcs/entropy.rs). Each participating file is then credited its churn share pᵢ·H of that commit — Hassan's History Complexity Metric — and these shares are summed per file per window. A file repeatedly caught up in diffuse, multi-file changes accumulates high change entropy; a file that only ever changes in tightly focused commits stays low.
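The two steps (per-commit entropy, then per-file credit) can be sketched as follows. Windowing is omitted, and the input shape, one dictionary of path to churn per commit, is an assumption of the example rather than the data model in src/vcs/entropy.rs.

```python
import math
from collections import defaultdict


def commit_entropy(churn_by_file: dict) -> float:
    """Shannon entropy, in bits, of one commit's churn distribution."""
    total = sum(churn_by_file.values())
    if total == 0:
        return 0.0
    entropy = 0.0
    for churn in churn_by_file.values():
        if churn > 0:
            p = churn / total
            entropy -= p * math.log2(p)
    return entropy


def change_entropy(commits) -> dict:
    """Credit each file its churn share p_i * H of every commit touching it."""
    credit = defaultdict(float)
    for churn_by_file in commits:
        total = sum(churn_by_file.values())
        if total == 0:
            continue
        h = commit_entropy(churn_by_file)
        for path, churn in churn_by_file.items():
            credit[path] += (churn / total) * h
    return dict(credit)


print(commit_entropy({"a.rs": 10}))                                  # 0.0
print(commit_entropy({"a.rs": 1, "b.rs": 1, "c.rs": 1, "d.rs": 1}))  # 2.0
print(change_entropy([{"a.rs": 5, "b.rs": 5}, {"a.rs": 10}]))
# {'a.rs': 0.5, 'b.rs': 0.5}
```

In the last call the focused second commit adds nothing to a.rs, so its total comes only from the evenly split first commit.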

How to read it

Change entropy distinguishes a file that changes as part of large, sprawling edits from one that changes in self-contained commits, even when their raw churn is identical. High change entropy is a smell of cross-cutting concern: edits to this file keep dragging in many others. It enters the recent-window risk term additively, complementing rather than restating the churn and commit counts.

Co-change entropy

Co-change entropy measures how wide a file's change blast radius is — how many different files it tends to change alongside. It is reported per window as cochange_entropy_long and cochange_entropy_recent, in bits.

The signal comes from the 2025 study Co-Change Graph Entropy: A New Process Metric for Defect Prediction (arXiv 2504.18511). Build a weighted graph in which two files share an edge whenever they change in the same commit, the weight being the number of shared commits. A file's co-change entropy is the Shannon entropy of its edge-weight distribution: low when it always co-changes with the same one partner, high when its changes ripple out across many different files. The study found that adding co-change entropy to change entropy improved AUROC in 82.5% of cases over the prior signal set across eight Apache projects.

Algorithm

The walk records, per commit, which in-scope files changed together and accumulates the co-change graph's edge weights, then computes each file's edge-weight entropy per window. Bulk-import commits touching more than 1000 files are excluded from the co-change graph — its edge count grows with the square of commit width — though they still contribute their (linear-cost) change entropy. A computed 0.0 means the file has no co-change neighbours in the window, which is a real measurement, not a missing one.
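A compact sketch of the graph construction and the per-file entropy follows. It handles a single window, takes each commit as a plain list of paths (an assumed input shape), and reuses the 1000-file cut-off from the paragraph above as a default parameter.

```python
import math
from collections import defaultdict
from itertools import combinations


def cochange_entropy(commits, max_files=1000) -> dict:
    """Entropy, in bits, of each file's co-change edge-weight distribution."""
    weights = defaultdict(lambda: defaultdict(int))
    for paths in commits:
        unique = sorted(set(paths))
        if len(unique) > max_files:
            continue  # bulk import: excluded from the graph
        for a, b in combinations(unique, 2):
            weights[a][b] += 1  # edge weight = number of shared commits
            weights[b][a] += 1

    # Files with no neighbours get a real, computed 0.0.
    result = {path: 0.0 for paths in commits for path in paths}
    for path, edges in weights.items():
        total = sum(edges.values())
        result[path] = sum((w / total) * math.log2(total / w)
                           for w in edges.values())
    return result


commits = [
    ["a.rs", "b.rs"], ["a.rs", "b.rs"],
    ["a.rs", "c.rs"], ["a.rs", "c.rs"],
    ["d.rs"],
]
print(cochange_entropy(commits))
# {'a.rs': 1.0, 'b.rs': 0.0, 'c.rs': 0.0, 'd.rs': 0.0}
```

Here a.rs splits its co-changes evenly between two partners (one bit), while b.rs and c.rs always change with the same single partner and d.rs has no partner at all.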

How to read it

Where change entropy asks "how scattered are the commits that touch this file?", co-change entropy asks "how many other files does touching this one drag along?". A high value flags a file whose edits have unpredictable, far-reaching consequences — a coupling hotspot. The two entropy signals are designed to complement each other, and the risk score v2 adds both recent-window terms together.

Composite risk score

The risk score rolls every signal above into a single per-file number, risk_score. It is the headline output of bca vcs and the field its ranked tables sort on. Two formulas are offered, both versioned together by a single risk_score_version (currently 2) so downstream consumers can detect a change.

The weighted formula (default)

The default formula is a log-scaled weighted sum with categorical multiplicative bumps. Counts are passed through ln(1 + x) so that the difference between 10 and 20 commits matters more than the difference between 110 and 120, matching how change activity actually saturates:

base = 0.30 · ln(1 + churn_recent)
     + 0.25 · ln(1 + commits_recent)
     + 0.15 · ln(1 + commits_long)
     + 0.15 · ln(1 + authors_long) · (1 + dilution)
     + 0.10 · ln(1 + bug_fix_commits + 2 · security_fix_commits)
     + 0.05 · ln(1 + churn_long)
     + 0.10 · change_entropy_recent + 0.05 · cochange_entropy_recent
     + ln(1 + sloc)² / 100

risk_score = base · (1 + dev_bonus + new_file_bonus)

where dilution = 1 - ownership_top_share, the dev_bonus is 0.35 for nine or more long-window authors and 0.15 for six or more, and the new_file_bonus is 0.15 for a file first seen inside the recent window. The weights are grounded term by term in the literature cited throughout this chapter: recent churn and commit frequency carry the most weight (Nagappan & Ball; the JIT line), the author factor is scaled by ownership dilution (Bird et al.) and bumped at the RHEL4 developer-count thresholds (Meneely & Williams), security fixes are double-weighted, and the two recent-window entropy terms enter additively (Hassan; arXiv 2504.18511). The full derivation lives in src/vcs/score.rs.
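
As a reading aid, the weighted formula can be transcribed directly into Python. This is a sketch under stated assumptions, not the contents of src/vcs/score.rs: the FileSignals container and its first_seen_in_recent_window field are hypothetical names, and ln(1 + sloc)² is read as the square of the logarithm.

```python
import math
from dataclasses import dataclass


@dataclass
class FileSignals:  # hypothetical container for the per-file signals
    churn_recent: int
    commits_recent: int
    commits_long: int
    authors_long: int
    bug_fix_commits: int
    security_fix_commits: int
    churn_long: int
    change_entropy_recent: float
    cochange_entropy_recent: float
    sloc: int
    ownership_top_share: float
    first_seen_in_recent_window: bool  # hypothetical field name


def weighted_risk_score(s: FileSignals) -> float:
    ln1p = math.log1p  # ln(1 + x)
    dilution = 1.0 - s.ownership_top_share
    base = (
        0.30 * ln1p(s.churn_recent)
        + 0.25 * ln1p(s.commits_recent)
        + 0.15 * ln1p(s.commits_long)
        + 0.15 * ln1p(s.authors_long) * (1.0 + dilution)
        + 0.10 * ln1p(s.bug_fix_commits + 2 * s.security_fix_commits)
        + 0.05 * ln1p(s.churn_long)
        + 0.10 * s.change_entropy_recent
        + 0.05 * s.cochange_entropy_recent
        + ln1p(s.sloc) ** 2 / 100.0
    )
    # Developer-count bump: nine or more authors, else six or more.
    if s.authors_long >= 9:
        dev_bonus = 0.35
    elif s.authors_long >= 6:
        dev_bonus = 0.15
    else:
        dev_bonus = 0.0
    new_file_bonus = 0.15 if s.first_seen_in_recent_window else 0.0
    return base * (1.0 + dev_bonus + new_file_bonus)
```

Because the bonuses multiply the base rather than add to it, a file with no activity at all still scores 0.0 regardless of how many bumps apply.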

The size term ln(1 + sloc)² / 100 is a genuine contributor, not the tiny tie-breaker its position at the end of the sum might suggest: it reaches about 0.85 at 10k SLOC and exceeds 1.0 past roughly 22k SLOC (where ln(1 + sloc) reaches 10), comparable in magnitude to the churn terms. Large files are only weakly correlated with defects, but the squared-log scaling keeps size a first-class additive signal rather than letting it dominate.

The percentile formula

--risk-formula percentile is the alternative: each signal is re-ranked to its percentile within the analysed set, and the per-file mean of those percentiles becomes the score. This trades the literature-tuned weights for cross-project robustness — the prediction literature generally recommends relative triggers over hard absolute thresholds — at the cost of a score that is only meaningful within a single run's file set.
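
A minimal sketch of the idea follows. The percentile definition used here (the share of files whose value is less than or equal to this file's) and the input shape are assumptions for illustration; the tool's exact ranking rule is not specified on this page.

```python
from bisect import bisect_right


def percentile_scores(signals):
    """signals: {file: {signal_name: value}}, same signal names for every file."""
    files = list(signals)
    if not files:
        return {}
    names = list(signals[files[0]])
    totals = {f: 0.0 for f in files}
    for name in names:
        ordered = sorted(signals[f][name] for f in files)
        for f in files:
            # Number of files whose value is <= this file's value.
            rank = bisect_right(ordered, signals[f][name])
            totals[f] += rank / len(files)
    # The score is the per-file mean of the per-signal percentiles.
    return {f: totals[f] / len(names) for f in files}
```

The result depends entirely on which files are in the input, which is why a percentile score from one run cannot be compared with one from another.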

How to read it

The score is ordinal. Sort a repository's files by it and look at the top of the list; that is the ranking the score exists to produce. Do not compare a raw risk_score between two repositories, and do not read its magnitude as a defect probability. To watch a single file move over time, use bca vcs trend, which re-anchors the walk at each historical point so the series reflects what the file actually looked like then.

Hotspot score

The hotspot score is the product of a file's complexity and its recent churn: hotspot_score = cyclomatic_sum × churn_recent. It is an Option, present only when an AST complexity figure is computed alongside the history (for example bca metrics --metrics cyclomatic --vcs), because it needs both halves.

The metric is the central idea of Adam Tornhill's 2015 book Your Code as a Crime Scene and the CodeScene tooling built on it. The argument is that complexity on its own is cheap to ignore — a complex file nobody touches costs nothing — and churn on its own is cheap too. The danger is their intersection: code that is both complicated and changing often is where defects concentrate and where developer effort is repeatedly spent. Tornhill observes that a small fraction of a codebase typically accounts for a large majority of its change activity, and the hotspot score is built to find that fraction.

How to read it

Like the risk score, the hotspot product is ordinal: rank files by it, do not read the magnitude. The CLI uses the file-level cyclomatic sum as the complexity axis by convention, but any AST complexity figure serves. The score's value is its prioritisation: of all the complex files, it surfaces the ones actively being edited, which are both the likeliest to break and the cheapest to refactor while they are already open on someone's screen.
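
The product and its optional nature can be sketched as follows. The function names and the tuple input are hypothetical; only the formula cyclomatic_sum × churn_recent comes from the text above.

```python
from typing import Optional


def hotspot_score(cyclomatic_sum: Optional[float], churn_recent: int) -> Optional[float]:
    """Return None when no complexity figure was computed for the file."""
    if cyclomatic_sum is None:
        return None
    return cyclomatic_sum * churn_recent


def rank_hotspots(files):
    """files: iterable of (path, cyclomatic_sum or None, churn_recent)."""
    scored = [
        (path, score)
        for path, cc, churn in files
        if (score := hotspot_score(cc, churn)) is not None
    ]
    # Ordinal use: only the order matters, highest product first.
    return sorted(scored, key=lambda item: item[1], reverse=True)
```

A complex file with zero recent churn scores 0 and sinks to the bottom, which is the "complex but untouched costs nothing" argument expressed as arithmetic.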

Bus factor

Where ownership_top_share measures knowledge concentration within a single file, the bus factor (also called the truck factor) measures it across a set of files: the minimum number of developers whose departure would leave more than half of a directory's files without a knowledgeable maintainer. The broader concept has a Wikipedia summary; big-code-analysis emits it as a vcs_aggregate object covering the whole repository, each top-level directory, and each of its immediate subdirectories.

The estimation method is Guilherme Avelino, Leonardo Passos, Andre Hora, and Marco Tulio Valente's 2016 ICPC paper A Novel Approach for Estimating Truck Factors. Each developer's authorship of each file is scored with their Degree-of-Authorship heuristic:

DoA(d, f) = 3.293 + 1.098 · FA + 0.164 · DL − 0.321 · ln(1 + AC)

where FA is first authorship (1 if developer d created file f), DL is d's number of deliveries (changes) to f, and AC is the changes made by other developers. A developer is an author of a file when their DoA, normalised by the file's maximum, clears 0.75 (the paper's threshold).
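
The scoring and the normalised threshold can be sketched like this. The function names and the input shape are hypothetical, and whether the 0.75 comparison is strict or inclusive is an assumption of the example.

```python
import math


def doa(first_author: bool, deliveries: int, other_changes: int) -> float:
    """Degree-of-Authorship for one developer on one file."""
    fa = 1.0 if first_author else 0.0
    return 3.293 + 1.098 * fa + 0.164 * deliveries - 0.321 * math.log1p(other_changes)


def authors_of(file_history, threshold=0.75):
    """file_history: {developer: (first_author, deliveries, other_changes)}."""
    scores = {dev: doa(*stats) for dev, stats in file_history.items()}
    if not scores:
        return set()
    top = max(scores.values())
    if top <= 0:
        return set()  # degenerate case, handled conservatively in this sketch
    # Normalise by the file's maximum DoA; strict comparison is an assumption.
    return {dev for dev, score in scores.items() if score / top > threshold}
```

With a creator who made 10 changes and a second developer who made 2, the second developer's normalised DoA is about 0.50, so only the creator counts as an author.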

Algorithm

The truck factor is a greedy removal (see src/vcs/bus_factor.rs): repeatedly drop the developer who authors the most still-covered files, stopping once more than --bus-factor-threshold (default 0.5, per Avelino) of the files are orphaned, and report how many developers were removed. The aggregate covers every in-scope file in one walk, so by_directory entries are computed over all files recursively beneath each directory.
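
The greedy loop can be sketched as below. This is not the code in src/vcs/bus_factor.rs: the function name, the input shape, and the alphabetical tie-break between equally prolific developers are assumptions.

```python
def truck_factor(file_authors, threshold=0.5):
    """file_authors: {file: set of author names}. Returns developers removed."""
    remaining = {f: set(authors) for f, authors in file_authors.items()}
    total = len(remaining)
    if total == 0:
        return 0
    removed = 0
    while True:
        orphaned = sum(1 for authors in remaining.values() if not authors)
        if orphaned > threshold * total:
            return removed
        # Count how many still-covered files each developer authors.
        counts = {}
        for authors in remaining.values():
            for dev in authors:
                counts[dev] = counts.get(dev, 0) + 1
        if not counts:
            return removed
        # Tie-break: alphabetically first name (an assumption of this sketch).
        top = max(sorted(counts), key=counts.get)
        for authors in remaining.values():
            authors.discard(top)
        removed += 1
```

Note that the stopping test is "more than" the threshold, so orphaning exactly half of the files is not enough to stop at the default of 0.5.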

How to read it

A bus factor of 1 means losing one person orphans the set — common, and correct, for a repository of mostly single-author files. Treat the number as a planning signal, not a guarantee: it is a heuristic over observed authorship within the long window, so "first authorship" means the earliest commit seen in that window, not necessarily a file's true creation. Use it to find the directories where knowledge is dangerously concentrated and spread review accordingly.

Just-in-time commit score

Everything above ranks files at a point in time. The just-in-time (JIT) score instead scores a single commit for its defect-induction risk — the unit a continuous-integration gate actually reviews at check-in. It is produced by bca vcs commit <commit> and reported as an ordinal risk_score with a per-group contributions breakdown.

The feature groups and their signs are taken from the just-in-time defect-prediction literature, beginning with Yasutaka Kamei and colleagues' 2013 IEEE TSE paper A Large-Scale Empirical Study of Just-in-Time Quality Assurance and confirmed by the open replications Commit Guru (FSE 2015) and Shane McIntosh and Yasutaka Kamei's Are Fix-Inducing Changes a Moving Target? (IEEE TSE 2018). big-code-analysis implements a static, rule-based scorer rather than a trained model, so nothing drifts as the project ages.

Algorithm

Five feature groups move the score, each scored against the commit's first parent (see src/vcs/jit.rs):

Group	Features	Direction
Size	lines added / deleted, files touched, diff hunks	larger ⇒ riskier
Diffusion	distinct subsystems and directories, within-commit change entropy	more scattered ⇒ riskier
History	the touched files' priors — prior changes, distinct authors, fix counts, and their composite risk — measured before the commit	turbulent history ⇒ riskier
Experience	the author's prior commit count (long and recent)	more experience ⇒ less risky (this group subtracts)
Purpose	fix / security-fix / revert classification of the message	fixes add, reverts dampen

The contributions block reports each group's signed contribution, so a consumer can see why a commit ranked where it did. A merge commit is flagged and scored against its first parent; a root commit and any new files carry zero priors by construction, so the score then leans on size and author experience, exactly as the literature prescribes for changes with no file history. A bare git diff can also be scored (bca vcs commit --diff), but only the size and diffusion groups are computable from a diff, so that path emits a deliberately partial partial_risk_score.

How to read it

Like the file-level score, the JIT score is ordinal: rank commits by it, or compare a commit against the repository's own commit-score distribution, but do not read the magnitude as a probability. Its intended use is a check-in gate — bca vcs commit HEAD --fail-above <N> exits non-zero when a commit scores at or above a threshold — with the threshold calibrated against your own history rather than treated as an absolute. Any change to the formula bumps a jit_score_version that is independent of the file-level risk_score_version.
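
One way to calibrate the threshold against your own history is a nearest-rank quantile over scores you have already collected. This sketch assumes you have gathered a list of past commit risk_score values yourself; the function name and the 0.95 default are hypothetical, not part of bca.

```python
import math


def calibrate_threshold(history, quantile=0.95):
    """Return the nearest-rank quantile of previously observed commit scores."""
    ordered = sorted(history)
    if not ordered:
        raise ValueError("no historical scores to calibrate against")
    # Nearest-rank method: the smallest score with at least
    # `quantile` of the history at or below it.
    rank = max(1, math.ceil(quantile * len(ordered)))
    return ordered[rank - 1]
```

The returned value could then be passed as the <N> in --fail-above, and recomputed periodically as the repository's score distribution shifts.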

Where to go next

The Commands → Change-history (VCS) metrics page documents how to invoke bca vcs, configure the windows, choose an output format, and use the persistent history cache.
The Supported Code Metrics chapter covers the AST-derived metrics these change-history signals complement — the hotspot score in particular combines the two families.
The Python Bindings → Change-history (VCS) metrics page shows the same family through the big_code_analysis.vcs module.

Migration: Flag CLI to Subcommand CLI

The CLI was restructured from a flat flag-style interface (one process, many mutually-exclusive --action flags) into a subcommand-style interface (bca <verb>). This page maps every old invocation to its replacement.

Why the change

The flag CLI overloaded --output-format with two unrelated meanings: per-file serialization (-O json/yaml/toml/cbor) and a post-walk aggregated report (-O markdown). It needed two clap ArgGroups plus runtime checks to police invalid combinations, and --top / --strip-prefix lived as global flags that only applied to one format. Future aggregated formats (e.g. HTML) would compound the fragility.

The subcommand CLI fixes the structure: bca metrics and bca ops emit per-file output; bca report <FORMAT> emits an aggregated report; each verb has its own scoped flag set.

Migration mapping

Old	New
`--metrics -O markdown` (+ `--top`, `--strip-prefix`)	`report markdown`
`--metrics -O json/yaml/toml/cbor`	`metrics -O json/yaml/toml/cbor`
`--metrics -O checkstyle/sarif/code-climate/clang-warning/msvc-warning`	`check --threshold ... --report-format <fmt> [--output FILE]`
`--ops -O ...`	`ops -O ...`
`--dump`	`dump`
`--find <NODE>`	`find -t <NODE> [-t <NODE>...]`
`--count <LIST>`	`count -t <NODE> [-t <NODE>...]`
`--function`	`functions`
`--comments [--in-place]`	`strip-comments [--in-place]`
`--preproc <FILE> <FILE>...` (producer)	`preproc -o <OUT>`
`--preproc <FILE>` (consumer)	`--preproc-data <FILE>` (per subcommand, after the verb)
`--list-metrics [MODE]`	`list-metrics [MODE]`
`--pr` (pretty)	`--pretty` (on `metrics` and `ops`)
`--ls`, `--le` (global)	`--line-start`, `--line-end` on `dump`/`find` (`--ls`/`--le` kept as deprecated aliases)
`-p`, `-I`, `-X`, `-j`, `-l`	scoped to the subcommand; pass them after the verb (`-w` stays universal)

Side-by-side examples

Aggregated markdown report

# OLD
big-code-analysis-cli \
    --metrics \
    --paths "$PWD" \
    --output-format markdown \
    --jobs $(nproc) \
    --top 20 \
    --strip-prefix "$PWD/"

# NEW
bca \
    report markdown \
    --paths "$PWD" \
    --top 20 \
    --strip-prefix "$PWD/"

Per-file metric extraction

# OLD
big-code-analysis-cli --metrics --paths ./src --output-format json --output ./out/

# NEW (per-file tree: --output became --output-dir in 2.0)
bca metrics --paths ./src -O json --output-dir ./out/

Per-file ops extraction

# OLD: big-code-analysis-cli --ops --paths ./src -O json -o ./out/
# NEW: bca ops --paths ./src -O json --output-dir ./out/

AST dump

# OLD: big-code-analysis-cli --dump --paths ./file.rs
# NEW: bca dump --paths ./file.rs

Find / count nodes

# OLD: big-code-analysis-cli --find call_expression --paths ./src
# NEW: bca find --paths ./src -t call_expression

# OLD: big-code-analysis-cli --count if_statement,for_statement --paths ./src
# NEW: bca count --paths ./src -t if_statement -t for_statement

Note: count now takes one node type per repeatable -t / --type flag rather than one comma-separated string.
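
Scripts that still hold the old comma-separated list can translate it mechanically. The helper below is a hypothetical convenience, not part of bca; it only builds the argument vector shown in the example above, with paths passed positionally.

```python
def count_args(legacy_list, paths):
    """Turn 'if_statement,for_statement' into repeated -t flags for `bca count`."""
    kinds = [kind.strip() for kind in legacy_list.split(",") if kind.strip()]
    argv = ["bca", "count"]
    for kind in kinds:
        argv += ["-t", kind]  # one -t per node kind
    return argv + list(paths)
```

The resulting list can be handed to subprocess.run without any shell quoting.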

Function spans

# OLD: big-code-analysis-cli --function --paths ./src
# NEW: bca functions --paths ./src

Strip comments

# OLD: big-code-analysis-cli --comments --in-place --paths ./src
# NEW: bca strip-comments --paths ./src --in-place

Preproc data — producer

# OLD
big-code-analysis-cli --metrics --preproc a.h --preproc b.h \
    --paths ./src -o /tmp/p.json

# NEW
bca preproc --paths ./src -o /tmp/p.json

Preproc data — consumer

# OLD
big-code-analysis-cli --metrics --preproc /tmp/p.json \
    --paths ./src -O json -o ./out/

# NEW
bca metrics --paths ./src --preproc-data /tmp/p.json \
    -O json --output-dir ./out/

List metrics

# OLD: big-code-analysis-cli --list-metrics descriptions
# NEW: bca list-metrics descriptions

Migration hint at runtime

If you run a legacy invocation, the CLI prints a hint identifying the recognized old flags and their new equivalents before clap's own error. For example:

$ bca --metrics -O markdown
note: the CLI was restructured into subcommands. See migration.md for the full mapping.
  --metrics  ->  bca metrics
  -O markdown  ->  bca report markdown|html [--top N] [--strip-prefix P]
  Run `bca --help` for the new command list.

error: unexpected argument '--metrics' found

Commands

bca offers a range of commands to analyze and extract information from source code. Each command may include parameters specific to the task it performs. Below, we describe the core types of commands available in bca.

Installation

The bca command-line tool is available as a pip-installable wheel. The distribution name is big-code-analysis-cli and the installed command is bca — the two differ deliberately (the bca name on PyPI belongs to an unrelated project, and big-code-analysis is this project's importable library bindings):

pip install big-code-analysis-cli   # installs the `bca` command on PATH
bca --version

This drops the compiled bca binary onto your PATH the way pip install ruff gives you the ruff command — no Rust toolchain required. The wheel carries the full all-languages grammar set, so every supported language works out of the box. A single py3-none-<platform> wheel covers every CPython 3.x (and PyPy) on that platform; prebuilt wheels ship for Linux (manylinux_2_28 x86_64 / aarch64), macOS (x86_64 / arm64), and Windows (x86_64). On any other platform pip falls back to a source build, which needs a Rust toolchain.

This is the binary CLI, distinct from the importable Python bindings (pip install big-code-analysis). Other install paths — Homebrew, .deb / .rpm / .apk packages, prebuilt release archives, or cargo install big-code-analysis-cli — are described in the repository README.

The wheel build and publish matrix is defined in .github/workflows/python-cli-wheels.yml.

Exit codes

bca follows one exit-code convention across every subcommand, so CI scripts can branch on the process status without inspecting output:

Code	Meaning
`0`	Success.
`1`	Tool error — a bad flag / threshold / glob spec, unreadable input, or a parse failure. This includes usage errors (unknown flag, bad subcommand, a malformed `--threshold` value rejected by clap). Never a metric signal.
`2`	Metric gate: `check` thresholds were exceeded, `vcs commit --fail-above` was breached, or `diff` / `diff-baseline` under `--exit-code` found a non-empty filtered diff.
`3`–`5`	`check --exit-codes=tiered` only: tiered violation severity (regression-only / mixed / hard-breach; in tiered mode code `2` means new-only).

Codes 2–5 are gate signals, emitted only by check, vcs commit --fail-above, and diff / diff-baseline under the opt-in --exit-code flag; they report a metric result, not a failure of the tool. Every other subcommand (metrics, ops, report, exemptions, init, and the rest, plus diff / diff-baseline when --exit-code is not passed) exits 0 on success and 1 on error. Because 1 is reserved for tool errors, usage errors included, a typo'd flag never lands in the gate band, and CI can always distinguish "the gate found a regression" (2–5) from "the tool itself failed" (1).
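
A CI wrapper can branch on this convention without parsing any output. The sketch below is illustrative: the function names and category labels are hypothetical, and only the numeric bands come from the table above.

```python
import subprocess


def classify_exit(code: int) -> str:
    """Map a bca exit status onto the bands described above."""
    if code == 0:
        return "success"
    if code == 1:
        return "tool-error"  # bad flag, unreadable input, parse failure
    if 2 <= code <= 5:
        return "gate"  # a metric gate tripped; the tool itself worked
    return "unknown"


def run_and_classify(argv):
    """Run a bca command line (a list of strings) and classify its status."""
    return classify_exit(subprocess.run(argv).returncode)
```

A pipeline might, for instance, retry or page on "tool-error" while treating "gate" as an ordinary failed quality check.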

Flag placement and input paths

Most subcommands read the input they analyze as a trailing positional path, so the common case reads like every other code tool (tokei, cloc, scc, rg). The exceptions: report and vcs select input with --paths, diff compares two result sets, and init targets a directory via --dir.

bca metrics src/            # analyze the src/ tree
bca check src/ tests/       # gate two subtrees
bca find -t function_item . # find every function in the current tree

Flags are scoped to the subcommand that consumes them and must be written after the subcommand token:

bca metrics --exclude '*.generated.rs' src/   # correct
bca --exclude '*.generated.rs' metrics src/   # ERROR (exit 1)

Only -w / --warnings and --report-skipped are universal and accepted in any position. Every input-selection flag (-p / --paths, -I / --include, -X / --exclude, -l / --language, --paths-from, --exclude-from, --no-ignore, --no-skip-generated, --no-config), walker-tuning flag (-j / --jobs, --exclude-tests, --cyclomatic-count-try), the preprocessor flag (--preproc-data), and the output flag (--color) lives in a help-grouped section (Input selection / Walker tuning / Preprocessor / Output) on the subcommands that read it. A flag passed to a subcommand that never consumed it is a hard usage error (exit 1) rather than a silent no-op — so bca vcs commit --exclude-tests and bca list-metrics --paths both error, and bca list-metrics --help does not advertise walker flags.

The -p / --paths flag still works and is unioned with the positional paths, so bca metrics a.rs --paths b.rs walks both. The find and count subcommands take their node kinds via a repeatable -t / --type flag (so the positional slot is free for paths): bca find -t function_item -t struct_item src/.

Metrics

Metrics provide quantitative measures of source code, which can help in:

Comparing different programming languages
Providing information on the quality of a codebase
Telling developers where their code is harder to maintain
Discovering potential issues early in the development process

big-code-analysis calculates the metrics starting from the source code of a program. Metrics of this kind are called static metrics.

Nodes

To represent the structure of program code, bca builds an Abstract Syntax Tree (AST). A node is an element of this tree and denotes any syntactic construct present in a language.

Nodes can be used to:

Create the syntactic structure of a source file
Discover if a construct of a language is present in the analyzed code
Count the number of constructs of a certain kind
Detect errors in the source code

REST API

bca-web runs a server offering a REST API. This allows users to send source code via HTTP and receive corresponding metrics in JSON format.

Skipping generated code

Generated bindings (protobuf stubs, OpenAPI clients, lex/yacc output, build-system plumbing) inflate metrics for code no human will refactor. By default, bca scans the first ~50 lines / 5 KiB of each file for a generated-code marker and skips matches before parsing, so the skipped file pays no tree-sitter parse cost.

Recognized markers (case-insensitive):

@generated — Facebook / Meta convention; also emitted by buck2, rustfmt, prettier, and many code generators.
DO NOT EDIT — Go's // Code generated by … DO NOT EDIT. is the canonical form; the bare phrase is also widely copied (Bazel, protoc, OpenAPI clients).
GENERATED CODE — Lizard's marker, recognized for compatibility.

A marker phrase that appears only deep in the file body (past the scan window) does not trigger the skip — the detector deliberately looks only at the file header.
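
The header scan can be approximated as follows. This is a sketch, not the detector's implementation: it assumes the window ends at whichever limit (50 lines or 5 KiB) is reached first, and the function name is hypothetical.

```python
MARKERS = (b"@generated", b"do not edit", b"generated code")
MAX_LINES = 50
MAX_BYTES = 5 * 1024


def looks_generated(data: bytes) -> bool:
    """Check only the file header for a generated-code marker."""
    head = data[:MAX_BYTES]
    lines = head.splitlines(keepends=True)[:MAX_LINES]
    # Lower-case the window so the match is case-insensitive.
    window = b"".join(lines).lower()
    return any(marker in window for marker in MARKERS)
```

Because only the header window is inspected, the check costs the same for a 100-line file as for a 100,000-line one.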

The skip applies uniformly to bca metrics, bca report, and the threshold engine.

Flags

--no-skip-generated — disable the auto-skip and restore the previous behavior (every file is parsed).
--report-skipped — log skipped (generated): <path> to stderr for each file the detector excludes, so you can audit the exclusions and add an explicit include if a file was wrongly tagged.

Respecting `.gitignore`

When a directory is passed to --paths, bca walks it with .gitignore awareness by default. Files matched by any of the following are skipped before parsing:

.gitignore files inside the walked tree.
.ignore files (the ripgrep / fd convention).
.git/info/exclude.
The global gitignore (~/.config/git/ignore, or whatever core.excludesFile points at).
.gitignore files in ancestor directories of the seed (so bca metrics src/ from a project root picks up the project's top-level .gitignore).

The walker honors .gitignore even outside a checked-in git repository, so an extracted source tarball with a .gitignore file gets the same treatment as a fresh git clone.

Hidden files (those whose basename starts with .) are filtered during the walk, matching the previous behavior.

Explicit paths bypass the filter

Files passed by name — via --paths or --paths-from — are always analyzed, even when they would be excluded by .gitignore. This makes it safe to do bca metrics --paths-from - from git diff --name-only-style pipelines without losing files that happen to be covered by a wildcard ignore rule.

Path discovery flags

--no-ignore — disable .gitignore / .ignore / global-gitignore awareness when expanding directory seeds.
--paths-from <FILE> — read newline-separated input paths from <FILE>, or from stdin when <FILE> is -. Combined as a union with any --paths values; -I / -X globs still apply. Blank lines are skipped; # is treated as a path character (not a comment). To pass a file literally named -, write ./-.
--exclude-from <FILE> — read newline-separated --exclude glob patterns from <FILE>, or from stdin when <FILE> is -. Patterns are unioned with any inline --exclude / -X values into a single deny-set; order does not matter. .gitignore-style: blank lines and lines whose first non-whitespace character is # are skipped, and a leading UTF-8 BOM is stripped. Convention is a .bcaignore at the repo root, mirroring .gitignore / .dockerignore. To pass a file literally named -, write ./-.
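
The two list formats differ in how they treat #, which is easy to miss. The sketch below mirrors the rules stated above; the function names are hypothetical, and trimming surrounding whitespace from each exclude pattern is an assumption.

```python
def parse_exclude_file(text):
    """--exclude-from rules: skip blanks and # comments, drop a leading BOM."""
    if text.startswith("\ufeff"):
        text = text[1:]
    patterns = []
    for line in text.splitlines():
        stripped = line.strip()  # trimming is an assumption of this sketch
        if not stripped or stripped.startswith("#"):
            continue
        patterns.append(stripped)
    return patterns


def parse_paths_file(text):
    """--paths-from rules: skip blanks only; # is an ordinary path character."""
    return [line for line in text.splitlines() if line.strip()]
```

A file named #notes.rs therefore survives a --paths-from list but would be read as a comment in an --exclude-from list.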

Metrics

bca metrics computes per-file metrics and emits them either to stdout, to a single aggregate file (--output), or to a directory of per-file structured files (--output-dir).

Migrating? This command replaces the pre-restructure --metrics flag. The aggregated report previously selected with -O markdown now lives under bca report, and the CI/IDE offender formats (Checkstyle, SARIF, code-climate, clang-warning, msvc-warning) moved to bca check --report-format <fmt>. See the migration guide.

Display metrics

To compute and display metrics for a given file or directory, run:

bca metrics --paths /path/to/your/file/or/directory

--paths (or -p): file or directory to analyze. If a directory is provided, metrics are computed for every supported file it contains. Paths may also be given positionally (bca metrics file.rs dir/).

Explicitly-named files must be parseable. When you name a file directly (positionally or via --paths/--paths-from) whose language the tool cannot recognize, bca prints a warning on stderr and — if the run produced no output at all — exits 1, mirroring the way a nonexistent explicit path fails. A mixed run that analyzed at least one file still exits 0 with the warning. Pass --language <lang> to force a parser when a file's extension lies about its contents. Files reached only by walking a directory are skipped silently (a tree full of READMEs and configs must not be noisy); pass -w to surface those skips too.

Exporting metrics

bca metrics supports five per-file output formats:

CBOR
CSV
JSON
TOML
YAML

Both JSON and TOML output can be pretty-printed with --pretty.

The three top-level output kinds map to three separate commands so each one stays consistent with its data model:

Command	Output	Audience
`bca metrics`	Per-file metric trees	Downstream tooling
`bca report`	Aggregated quality dashboards	Humans / PRs
`bca check`	Threshold-violation reports	CI / IDE

The CI/IDE offender formats (Checkstyle, SARIF, code-climate, clang-warning, msvc-warning) used to live on bca metrics -O <fmt>. They moved to bca check --report-format <fmt> because their input is a list of threshold violations, not the per-file metric tree that the other formats above carry. See the bca check chapter for the new invocation.

Export command

To export metrics as a single aggregate JSON file:

bca metrics --paths /path/to/your/file/or/directory \
    -O json --output metrics.json

-O, --format: output format. Defaults to text — a human-readable colored metric tree printed to stdout; pass --format text to request that default explicitly (for example to override a bca.toml that set a structured format). The structured per-file serializers are cbor, csv, json, toml, and yaml. --output-format is accepted as a deprecated alias and is slated for removal in the next major.
-o, --output: a single file holding one aggregate document for the whole run — a top-level array of the per-file results (TOML wraps the array under a files key; CSV concatenates each file's rows). If omitted, results are printed to stdout.
--output-dir: a directory holding one document per input file, named by the input path plus the format extension. Mutually exclusive with --output; passing both is an error.
CBOR is binary and so requires a destination (--output or --output-dir). Passing either destination without a structured --format is an error (the default text format streams to stdout and writes no files), so a destination never silently no-ops (#661).
--metrics <name,…>: restrict computation to a subset of metrics (comma-separated and/or repeated, e.g. --metrics cyclomatic,cognitive --metrics loc). Names are the canonical ids bca list-metrics prints — the same vocabulary bca check --threshold and bca diff --metric use; dotted (cyclomatic.modified) and bare loc sub-metric (sloc) spellings are accepted. Derived metrics pull in their dependencies automatically. An unknown name errors with a "did you mean" hint. Omit it to compute every metric (#691).

CSV (spreadsheets and Pandas)

bca metrics --paths /path/to/your/code \
    -O csv --output-dir csv-output

The CSV writer emits one row per FuncSpace (function, class, struct, unit, etc.) with the entire metric matrix as columns. Header order is fixed — see CSV_HEADER in src/output/csv.rs for the canonical list. Identity columns come first (path, space_name, space_kind, start_line, end_line) followed by every leaf metric using the same dotted JSON-style names (loc.lloc, halstead.volume, cyclomatic.modified.average, etc.) so a single column name addresses the metric in both CSV and JSON.

Empty cells (no value, not 0) signal "not applicable for this space": for example, the OOP-only metrics (wmc.*, npm.*, npa.*) appear empty for procedural code. RFC 4180 quoting is delegated to the csv crate, so paths and names containing commas, quotes, or newlines round-trip cleanly.
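
When loading the CSV downstream, the empty-versus-zero distinction has to be preserved explicitly. The sketch below uses Python's standard csv module on a hand-written sample; the sample rows and the wmc.total column name are hypothetical, chosen only to show an empty OOP-only cell.

```python
import csv
import io

# Hand-written sample. Real header order comes from CSV_HEADER, and
# "wmc.total" is a hypothetical leaf name standing in for a wmc.* column.
SAMPLE = """path,space_name,space_kind,start_line,end_line,loc.lloc,wmc.total
src/lib.rs,parse,function,10,42,18,
src/lib.rs,Parser,struct,50,90,12,7
"""


def metric(row, column):
    """Return the cell as a float, or None when the cell is empty."""
    cell = row[column]
    return float(cell) if cell != "" else None


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
```

Mapping empty cells to None rather than 0.0 keeps "not applicable" rows out of averages and sums.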

Stream the result to a single file by redirecting stdout:

bca metrics --paths /path/to/your/code -O csv \
    > metrics.csv

CSV is a per-file format; with --output-dir <dir> each input file produces a <input>.csv mirror under the output directory. With --output <file> every file's rows are concatenated into one aggregate CSV.

An aggregated HTML report covering the whole walk is available via bca report html. The previous per-file bca metrics -O html writer was removed because it degraded to an unopenable single-file table on real-world repos — CSV is the right shape for flat per-FuncSpace rows.

Pretty print

bca metrics --paths /path/to/your/file/or/directory \
    --pretty -O json

Excluding inline test code

bca metrics --paths /path/to/your/code --exclude-tests

By default, every node in the AST is counted, including inline test items. Rust files following the idiomatic #[cfg(test)] mod tests { ... } layout therefore have headline metrics that mix production and test code together.

Pass --exclude-tests to elide test-only subtrees before any metric is computed. The flag is recognised by every subcommand that walks the AST (metrics, report, check), and currently understands the following Rust attribute shapes:

#[test] and #[rstest] / #[test_case] / #[wasm_bindgen_test]
#[cfg(test)], #[cfg(all(test, ...))], #[cfg(any(test, ...))]
#[tokio::test], #[async_std::test], #[test_log::test], … (any path ending in ::test)
#![cfg(test)] on mod items (inner attribute form)

Languages without a Checker::should_skip_subtree override simply ignore the flag — only Rust applies the pruning today. The default remains off so existing metric numbers stay byte-identical for users who do not opt in.

To opt a whole project in without repeating the flag, set exclude_tests = true in the repo's bca.toml manifest. Because --exclude-tests is presence-only (no =false form), the manifest key can only turn pruning on; a CLI --exclude-tests still wins, but the manifest cannot turn it back off. Note that pruning lowers the node-counted metrics (cyclomatic, cognitive, Halstead, nom, nargs, …) but leaves unit-level loc.sloc at the full file extent, since unit SLOC is the file root span rather than a traversal accumulation.

Aggregated report

For a comprehensive, human-readable quality report, use bca report markdown. That command aggregates metrics across all analyzed files and produces per-language hotspot tables.

Listing available metrics

Tooling that drives the CLI can discover the metric catalog at runtime instead of hard-coding it:

bca list-metrics

This prints the metric names, one per line. Pass descriptions for a one-line summary of each metric:

bca list-metrics descriptions

Change-history (VCS) metrics

bca vcs ranks files by change-history risk — signals derived from version-control history rather than the source AST. It is the project's first language-agnostic, non-AST metric family. The goal is to surface the files most likely to harbour bugs or vulnerabilities, using the signals the empirical defect- and vulnerability-prediction literature most consistently backs.

A single history walk runs once per invocation (never per file) and produces per-file signals over two configurable windows — a long window (default 12mo ≈ 365 days) and a recent window (default 90d).

Quick start

$ bca vcs --paths src --top 20
Change-history risk (long window 365d, recent 90d, formula v2)
 RANK      RISK  COMMITS rec/long  CHURN rec/long  AUTHORS long  FILE
    1       7.2             68/68     11634/11634             1  src/metrics/cyclomatic.rs
    2       6.9             68/68       7299/7299             1  src/metrics/npa.rs
    ...

With no --format, a human-readable ranked table is printed. Pass --format markdown|html for a rendered report page, or --format json|yaml|toml|cbor|csv for structured output. Unlike bca metrics / bca ops (whose --output-dir is a directory of per-file emissions), a change-history report is a single whole-repo document, so bca vcs --output <file> writes one file (CBOR, being binary, requires --output). The global --paths / --include / --exclude / --no-ignore filters are reused to pick which tracked files to report.

bca vcs errors clearly when run outside a git working tree.

File-type scope

By default bca vcs ranks only the files bca computes metrics for — the same set bca metrics would analyse. High-churn non-source files (CHANGELOG.md, Cargo.lock, generated config) carry no maintainability meaning yet maximise the churn / commit / author signals, so ranking them beside source code is noise; scoping to files-with-metrics also keeps the standalone ranking aligned with the AST hotspot tables in bca report --vcs.

--file-types <SCOPE> selects the scope:

Value	Meaning
`metrics` (default)	Only files bca has a language/metrics for, by extension
`all`	Every tracked, non-binary, non-symlink text file
`rs,py,toml,…`	A comma-separated extension allow-list (leading dots optional, case-insensitive)

bca vcs                          # rank source files only (default)
bca vcs --file-types all         # rank every tracked text file
bca vcs --file-types rs,py       # rank only Rust and Python files

The check is extension-only (no file content is read) and ANDs with the --paths / --include / --exclude / --no-ignore filters — a file must pass both to be ranked. Extension-less files (Makefile, Dockerfile, LICENSE) and unknown extensions are out of the metrics scope; a custom list is a literal extension filter, so it can include a non-metrics type like toml. An empty or all-blank custom list is a clear error rather than a scope that silently ranks nothing.
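The scope rules above can be sketched in a few lines of Python. This is an illustration only: `parse_scope`, `in_scope`, and the `METRICS_EXTS` stand-in set are hypothetical names, and the real extension table in the crate is larger.

```python
from pathlib import PurePosixPath

# Hypothetical stand-in for the extensions bca computes metrics for.
METRICS_EXTS = {"rs", "py", "c", "cpp", "js"}

def parse_scope(value):
    """Return None for 'all', otherwise the set of allowed lowercase extensions."""
    if value == "metrics":
        return set(METRICS_EXTS)
    if value == "all":
        return None
    # Custom list: leading dots optional, case-insensitive.
    exts = {part.strip().lstrip(".").lower() for part in value.split(",")}
    exts.discard("")
    if not exts:
        # An empty or all-blank list is an error, not a scope that ranks nothing.
        raise ValueError("empty --file-types list")
    return exts

def in_scope(path, scope):
    """Extension-only check: the file content is never read."""
    if scope is None:
        return True
    ext = PurePosixPath(path).suffix.lstrip(".").lower()
    return ext in scope
```

With this sketch, `in_scope("Makefile", parse_scope("metrics"))` is false because the file has no extension, while `in_scope("Cargo.toml", parse_scope("toml"))` is true because a custom list is a literal filter.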

Rendered report page

bca vcs --top 50 --format html --output vcs.html
bca vcs --top 50 --format markdown --output vcs.md

--format html produces a self-contained, sortable page styled exactly like bca report html (click any column header to re-sort); --format markdown produces the same ranked table as GitHub-Flavored Markdown. Both render every signal column (the complete, sortable view of the same data the structured formats carry). The column set is defined once and shared by both renderers, so they cannot drift.

To fold the ranking into the aggregated quality report instead of a standalone page, pass bca report --vcs, which appends a "Change-history risk" section to report markdown / report html.

Signals

Field	Type	Description
`commits_long` / `commits_recent`	u32	Distinct commits touching the file in each window
`churn_long` / `churn_recent`	u64	Σ(added + deleted) lines in each window
`authors_long` / `authors_recent`	u32	Distinct canonical author identities in each window
`ownership_top_share`	f64 ∈ [0,1]	Share of edits attributable to the top author (lower = more diluted)
`burst`	f64 ∈ [0,1]	`commits_recent / commits_long`
`bug_fix_commits`	u32	Long-window commits whose message matches a bug-fix keyword
`security_fix_commits`	u32	Long-window commits matching security keywords (`CVE-####`, `security`, `vuln`, `exploit`, `sanitize`, …)
`revert_commits`	u32	Long-window commits whose subject is a revert / rollback
`age_days`	u32	Days since the file's first in-window commit (capped at the long window)
`last_modified_days`	u32	Days since the file's most recent in-window commit
`change_entropy_long` / `change_entropy_recent`	f64	Change entropy in bits per window (see below)
`cochange_entropy_long` / `cochange_entropy_recent`	f64	Co-change graph entropy in bits per window (see below)
`risk_score`	f64	Composite, formula-versioned (see below) — ordinal, not cardinal
`hotspot_score`	f64?	`complexity × churn_recent`; present only when AST metrics are computed alongside
`risk_score_version` / `vcs_schema_version`	u32	Forward-compatibility version stamps. Carried once on the report envelope, alongside `long_window_days` / `recent_window_days` — not repeated inside each per-file `vcs` block (issue #635)

Author identities are canonicalised through the repository .mailmap and counted by lowercased email; Co-authored-by: trailers add participants. Bot identities (dependabot[bot], renovate[bot], github-actions[bot], …) are excluded by default. Binary files and symlinks are skipped; an untracked file has no record at all (distinct from a tracked file with zero in-window activity).

Change & co-change entropy

Two process-entropy signals (added in risk_score_version 2) capture how a file changes, not just how much:

Change entropy (Hassan, 2009 — Predicting Faults Using the Complexity of Code Changes). For each commit, the Shannon entropy (in bits) of its churn distribution across the files it touched measures how scattered that change was: a one-file commit is 0; a commit spreading churn evenly across n files approaches log₂(n). Each file is then credited its churn share pᵢ·H of every commit it took part in (Hassan's History Complexity Metric). Higher = the file is repeatedly caught up in diffuse, cross-cutting changes. Later work (arXiv 2504.18511, below) measured file-level change entropy at a Pearson correlation up to 0.54 with defect counts on eight Apache projects.
Co-change graph entropy (arXiv 2504.18511, 2025). Files that change in the same commit are joined by a weighted edge (weight = number of shared commits). A file's co-change entropy is the Shannon entropy of its edge-weight distribution: low when it always co-changes with the same partner, high when its changes ripple across many different files. Combined with change entropy it improved AUROC in 82.5% of cases over the v1 signal set on eight Apache projects.

Both are reported per window. A 0.0 is computed, not missing: the file only ever changed alone (no co-change neighbours, or single-file commits with zero change entropy). Bulk-import commits touching more than 1000 files are excluded from the co-change graph — its edge count grows O(width²) — but still contribute their O(width) change entropy.
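A minimal Python sketch of both signals for a single window, assuming each commit is given as a `{path: churn}` mapping. The function names and the input shape are hypothetical; the 1000-file cutoff is the one stated above.

```python
import math
from collections import defaultdict

def shannon_bits(weights):
    """Shannon entropy, in bits, of a non-negative weight distribution."""
    weights = list(weights)
    total = sum(weights)
    if total <= 0:
        return 0.0
    return sum((w / total) * math.log2(total / w) for w in weights if w > 0)

def change_entropy(commits):
    """Credit each file its churn share p_i * H of every commit it took part in."""
    credit = defaultdict(float)
    for churn in commits:
        total = sum(churn.values())
        h = shannon_bits(churn.values())
        for path, lines in churn.items():
            if total > 0:
                credit[path] += (lines / total) * h
    return dict(credit)

def cochange_entropy(commits, max_width=1000):
    """Entropy of each file's co-change edge weights (weight = shared commits)."""
    edges = defaultdict(lambda: defaultdict(int))
    seen = set()
    for churn in commits:
        seen.update(churn)
        if len(churn) > max_width:
            continue  # bulk imports are left out of the co-change graph
        for a in churn:
            for b in churn:
                if a != b:
                    edges[a][b] += 1
    # A file with no neighbours gets a computed 0.0, not a missing value.
    return {p: shannon_bits(edges[p].values()) for p in seen}

commits = [{"a": 10}, {"a": 5, "b": 5}, {"a": 1, "c": 1}, {"d": 3}]
print(change_entropy(commits))
print(cochange_entropy(commits))
```

In this toy history, file `a` accumulates change entropy from two evenly split two-file commits and has two equally weighted co-change partners, while `d` only ever changed alone and scores 0.0 on both signals.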

Composite risk score

The default weighted formula is a log-scaled weighted sum with categorical multiplicative bumps:

recency_churn  = ln(1 + churn_recent)
recency_count  = ln(1 + commits_recent)
long_count     = ln(1 + commits_long)
long_churn     = ln(1 + churn_long)
author_factor  = ln(1 + authors_long)
dilution       = (1 - ownership_top_share).clamp(0, 1)
fix_factor     = ln(1 + bug_fix_commits + 2 * security_fix_commits)
size_factor    = ln(1 + sloc)^2 / 100              // full coefficient, not a tie-breaker
entropy_factor = 0.10 * change_entropy_recent + 0.05 * cochange_entropy_recent
new_file_bonus = 0.15 if age_days < recent_window_days else 0
dev_bonus      = 0.35 if authors_long >= 9 else 0.15 if authors_long >= 6 else 0

base = 0.30 * recency_churn
     + 0.25 * recency_count
     + 0.15 * long_count
     + 0.15 * author_factor * (1 + dilution)
     + 0.10 * fix_factor
     + 0.05 * long_churn
     + entropy_factor
     + size_factor

risk_score = base * (1 + dev_bonus + new_file_bonus)

The term weights are grounded in the literature: recent churn and commit frequency carry the highest weight (Nagappan & Ball relative churn; just-in-time defect prediction; Firefox NumChanges PD 86); the author factor is scaled by ownership dilution (Avelino DoA / truck-factor; Bird et al.); the categorical developer-count bumps encode the RHEL4 finding that files touched by ≥9 developers were ~16× more likely to harbour a vulnerability; security fixes are double-weighted (Sentence-Level VFC studies; PySecDB); and the recent-window change- and co-change-entropy terms enter additively (Hassan 2009; arXiv 2504.18511). The full derivation lives in src/vcs/score.rs.
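For readers who want to experiment with the weights, the formula above transcribes directly into Python. This is a sketch, not the code in src/vcs/score.rs: the function name, the dict-based input, and the sample values are illustrative, and it assumes the square in `size_factor` applies to the logarithm.

```python
import math

def weighted_risk(s, recent_window_days=90):
    """Composite score from a dict of per-file signals (weighted formula)."""
    ln1p = math.log1p  # ln(1 + x)
    dilution = min(max(1.0 - s["ownership_top_share"], 0.0), 1.0)
    fix_factor = ln1p(s["bug_fix_commits"] + 2 * s["security_fix_commits"])
    # Assumption: the square applies to the logarithm, i.e. (ln(1 + sloc))^2.
    size_factor = ln1p(s["sloc"]) ** 2 / 100
    entropy_factor = (0.10 * s["change_entropy_recent"]
                      + 0.05 * s["cochange_entropy_recent"])
    new_file_bonus = 0.15 if s["age_days"] < recent_window_days else 0.0
    authors = s["authors_long"]
    dev_bonus = 0.35 if authors >= 9 else 0.15 if authors >= 6 else 0.0
    base = (0.30 * ln1p(s["churn_recent"])
            + 0.25 * ln1p(s["commits_recent"])
            + 0.15 * ln1p(s["commits_long"])
            + 0.15 * ln1p(authors) * (1 + dilution)
            + 0.10 * fix_factor
            + 0.05 * ln1p(s["churn_long"])
            + entropy_factor
            + size_factor)
    return base * (1 + dev_bonus + new_file_bonus)

# Made-up signal values, for illustration only.
example = {
    "churn_recent": 211, "commits_recent": 6, "commits_long": 15,
    "churn_long": 900, "authors_long": 3, "ownership_top_share": 0.6,
    "bug_fix_commits": 4, "security_fix_commits": 1, "sloc": 800,
    "change_entropy_recent": 1.2, "cochange_entropy_recent": 2.0,
    "age_days": 365,
}
print(round(weighted_risk(example), 2))
```

Because the bonuses are multiplicative, the same file scored as a new file (`age_days` below the recent window) comes out exactly 15% higher than the established one.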

The score is ordinal: only relative ranks have meaning. A single risk_score_version (now 2) covers both formulas: any change to the weighted sum or to the --risk-formula percentile blend bumps it. The recent-window entropy pair feeds both formulas.

--risk-formula percentile is an alternative: each signal is re-ranked to its percentile within the analyzed set, then averaged — the literature recommends relative triggers over hard thresholds for cross-project robustness.
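The percentile idea can be sketched as follows. The percentile definition used here (fraction of files with a value at or below this one), the equal-weight average, and the choice of signals are all assumptions made for illustration; the source only states that signals are re-ranked to percentiles and averaged.

```python
from bisect import bisect_right

def percentile_ranks(values):
    """For each value, the fraction of the set that is less than or equal to it."""
    ordered = sorted(values)
    n = len(ordered)
    return [bisect_right(ordered, v) / n for v in values]

def percentile_risk(files, signals):
    """Average each file's per-signal percentile within the analyzed set."""
    columns = [percentile_ranks([f[s] for f in files]) for s in signals]
    return [sum(col[i] for col in columns) / len(signals)
            for i in range(len(files))]

files = [
    {"churn_recent": 10, "commits_recent": 1},
    {"churn_recent": 500, "commits_recent": 9},
    {"churn_recent": 40, "commits_recent": 3},
]
print(percentile_risk(files, ["churn_recent", "commits_recent"]))
```

A score built this way depends only on where a file sits relative to its peers, which is why it transfers across projects with very different absolute churn levels.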

Flags

Flag	Default	Meaning
`--long-window <DUR>`	`12mo`	Long window (`12mo`, `2y`, `8w`, `365d`, ISO 8601 `P1Y`)
`--recent-window <DUR>`	`90d`	Recent window
`--top <N>`	`50`	Show only the top N (`0` = all)
`--file-types <SCOPE>`	`metrics`	Files to rank: `metrics`, `all`, or an extension list (`rs,py`)
`--ref <REF>`	`HEAD`	Revision to analyse
`--full-history`	off	Walk the full DAG (default: first-parent only)
`--include-merges`	off	Include merge commits
`--no-follow-renames`	off	Stop following renames (default: follow)
`--no-exclude-bots` / `--bot-pattern <RE>`	exclude	Bot-author filtering
`--as-of <WHEN>`	wall clock	Reference "now" (RFC 3339 / `@unix` / git date) for reproducible snapshots
`--risk-formula {weighted\|percentile}`	`weighted`	Composite formula
`--emit-author-details`	off	Emit SHA-256-hashed canonical author IDs
`--author-hash-key <KEY>`	unset	Harden the emitted author digests into a keyed HMAC (see Author-detail privacy); requires `--emit-author-details`
`--include-deleted`	off	Also rank files deleted at the target ref
`--no-cache`	off	Skip the persistent history cache (always walk fresh)
`--clear-cache`	off	Wipe this repo's cached history before running
`--cache-dir <DIR>`	platform cache	Override the cache directory

Caching

Ranking re-walks only the part of history inside the long window, but on a large, active repository that is still the dominant cost — and in CI the interesting deltas between runs are just the commits pushed since the last one. bca vcs therefore keeps a persistent cache of each walk, keyed by the resolved HEAD SHA and the repository's identity:

On an unchanged tree the prior result is replayed without a history walk.
When HEAD has advanced, the walk visits only the new commits and splices them onto the cached history.
A force-push (the cached head is no longer an ancestor of the new one) falls back to a full walk.

The cache is a pure optimization: a hit is bit-identical to a fresh walk, and the time windows are recomputed against the current moment on every run, so a cached result is never stale. An entry is ignored — and the history recomputed — whenever the schema, the score-formula version, or the walk-affecting options differ; in particular changing a window forces a fresh walk. (Finalization-only knobs such as --risk-formula, --emit-author-details, --author-hash-key, and --include-deleted are applied on replay, so they reuse the same cached walk — a cached walk even re-finalizes under a different author-hash key without re-walking.)
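The three cases and the invalidation rule amount to a small decision procedure. The sketch below models the commit graph as a `{sha: [parent shas]}` dict; `plan_walk`, `is_ancestor`, and the `key_matches` flag (standing in for the schema, score-formula version, and walk-affecting options check) are hypothetical names, not the crate's API.

```python
def is_ancestor(parents, ancestor, descendant):
    """Walk the parent map from `descendant`, looking for `ancestor`."""
    stack, seen = [descendant], set()
    while stack:
        sha = stack.pop()
        if sha == ancestor:
            return True
        if sha in seen:
            continue
        seen.add(sha)
        stack.extend(parents.get(sha, ()))
    return False

def plan_walk(parents, cached_head, head, key_matches=True):
    """Decide how much history must be walked for this run."""
    if cached_head is None or not key_matches:
        return "full"          # no usable entry
    if cached_head == head:
        return "replay"        # unchanged tree
    if is_ancestor(parents, cached_head, head):
        return "incremental"   # HEAD advanced: walk only the new commits
    return "full"              # force-push: cached head is not an ancestor

history = {"c": ["b"], "b": ["a"], "a": [], "x": ["a"]}
print(plan_walk(history, "a", "c"))
```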

By default the cache lives under $XDG_CACHE_HOME/big-code-analysis/vcs (%LOCALAPPDATA% on Windows, ~/.cache otherwise). Author identities are stored only as their SHA-256 digests — never plaintext — so the cache holds no raw author emails. Note this is pseudonymization, not anonymization: the digests are recoverable against a candidate email set (see --emit-author-details). The same cache transparently accelerates bca metrics --vcs and bca report --vcs.

# First run primes the cache; the second replays it.
bca vcs --paths .
bca vcs --paths .                 # reuses prior work

bca vcs --no-cache --paths .      # ignore the cache for this run
bca vcs --clear-cache --paths .   # rebuild from scratch
bca vcs --cache-dir /tmp/bca-cache --paths .

The REST (POST /v1/vcs) and Python (vcs.rank) surfaces expose the same behaviour through optional no_cache / cache_dir parameters.

The cache is specific to the file ranking. The trend and commit subcommands — and the /v1/vcs/trend and /v1/vcs/jit endpoints — do not use it, so the cache flags do not apply there: passing --no-cache / --cache-dir alongside a subcommand is a usage error, and the trend endpoint rejects a no_cache / cache_dir field rather than silently ignoring it (issue #961).

In `bca metrics`

Pass bca metrics --vcs to attach a vcs block (plus a hotspot_score computed from the file's cyclomatic sum) to each file's metrics:

$ bca metrics --vcs --paths src/parser.rs --format json
{ "name": "src/parser.rs",
  "metrics": { "cyclomatic": { ... },
    "vcs": { "commits_long": 15, "churn_recent": 211,
             "risk_score": 3.7, "hotspot_score": 7596.0, ... } } }

bca metrics --vcs uses the default windows and weighted formula; for window / formula tuning use bca vcs.

Per-function attribution

bca metrics --vcs-per-function (which implies --vcs) additionally attaches a vcs block to every nested function, method, and class space. It blames each file once with git blame and buckets the surviving lines into the AST function spans, so you can rank the risky function inside a risky file:

$ bca metrics --vcs-per-function --paths src/parser.rs --format json
{ "name": "src/parser.rs",
  "metrics": { "vcs": { "risk_score": 3.7, ... } },   // file-level block
  "spaces": [
    { "name": "parse", "kind": "function",
      "metrics": { "vcs": { "commits_long": 4, "churn_recent": 12,
                            "risk_score": 2.1, "hotspot_score": 144.0 } } } ] }

The per-function block is a current-blame snapshot and is not directly comparable to the file-level block: its churn counts surviving lines whose last touch falls inside the window (not historical added+deleted churn), and ownership is credited per touching commit. A function nobody has changed within the window reports zero counts. Lines whose last touch predates the long window contribute to the function's size but to none of the windowed counts.
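The bucketing step can be pictured with a short Python sketch for a single window. The input shapes and the `bucket_blame` name are hypothetical, and nested spans are treated independently here (each span counts every line it contains), which is an assumption rather than documented behaviour.

```python
def bucket_blame(spans, blame, window_start):
    """Bucket blamed lines into function spans.

    spans: {name: (first_line, last_line)}, 1-based and inclusive.
    blame: {line_no: (commit_id, author, commit_time)} for surviving lines.
    """
    out = {}
    for name, (first, last) in spans.items():
        commits, churn = set(), 0
        for line in range(first, last + 1):
            entry = blame.get(line)
            if entry is None:
                continue
            commit_id, _author, when = entry
            if when >= window_start:
                # Only lines whose last touch is inside the window count.
                commits.add(commit_id)
                churn += 1
        out[name] = {"commits": len(commits), "churn": churn,
                     "lines": last - first + 1}
    return out

spans = {"parse": (1, 3), "helper": (4, 5)}
blame = {1: ("c1", "a", 100), 2: ("c2", "b", 200), 3: ("c2", "b", 200),
         4: ("c0", "a", 50), 5: ("c0", "a", 50)}
print(bucket_blame(spans, blame, window_start=150))
```

Here `helper` reports zero counts because every one of its lines was last touched before the window, yet its two lines still count toward its size.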

Limitations. Blame follows file renames (so edits under a former path still attribute), but attributes a line moved between functions to its current position only. A function split into two has no record of its pre-split identity, and a deleted-then-recreated function attributes to the recreating commits. If a file cannot be blamed — untracked, or the rare gix-blame failure on pathologically repetitive content — its per-function blocks are simply omitted while the file-level block (and the AST metrics) still emit.

Just-in-time (commit-level) scoring

Where everything above ranks files at a ref, bca vcs commit <commit> scores a single commit for defect-induction risk, the unit a CI gate reviews at check-in. (The subcommand was renamed from bca vcs jit in 2.0; the old jit spelling keeps working as a hidden alias for one release cycle. "Just-in-time (JIT)" remains the term used in the literature, and in the references and field names below.) It is a static, rule-based scorer (no trained model, so nothing drifts as the project ages), with the feature groups and signs taken from the just-in-time defect-prediction literature: Kamei et al., A Large-Scale Empirical Study of Just-in-Time Quality Assurance, IEEE TSE 2013, with the open replications Commit Guru (FSE 2015) and McIntosh & Kamei, Are Fix-Inducing Changes a Moving Target? (IEEE TSE 2018).

$ bca vcs commit HEAD --pretty
{
  "jit_schema_version": 3,
  "jit_score_version": 1,
  "source": "commit",
  "risk_score": 4.40,
  "commit": { "id": "5176d3e…", "parent_count": 1, "is_merge": false,
              "purpose": { "is_fix": true, "is_security_fix": false,
                           "is_revert": false } },
  "features": {
    "size":       { "lines_added": 942, "lines_deleted": 60,
                    "files_touched": 19, "hunks": 78 },
    "diffusion":  { "subsystems": 5, "directories": 8, "entropy": 3.48 },
    "history":    { "prior_changes": 275, "prior_distinct_authors": 1,
                    "prior_bug_fix_commits": 237,
                    "prior_security_fix_commits": 21,
                    "file_risk_max": 10.97, "file_risk_mean": 3.87,
                    "new_files": 2 },
    "experience": { "author_prior_commits": 962,
                    "author_recent_commits": 962 }
  },
  "contributions": { "size": 2.74, "diffusion": 0.97, "history": 1.57,
                     "purpose": 0.15, "experience": -1.03 }
}

The five feature groups, and how each moves the score:

Group	Features	Direction
Size	lines added / deleted, files touched, diff hunks	larger ⇒ riskier
Diffusion	distinct subsystems & directories, within-commit change entropy	more scattered ⇒ riskier
History	the touched files' priors — prior changes, distinct authors, bug- and security-fix counts, and the composite `risk_score` — measured from history before the commit	turbulent file history ⇒ riskier
Experience	the author's prior commit count (long & recent)	more experience ⇒ less risky (this group subtracts)
Purpose	fix / security-fix / revert classification of the message	fixes add, reverts dampen

The contributions block reports each group's signed contribution to the ordinal risk_score, so a consumer can see why a commit ranked where it did. Like the file-level risk_score, the score is ordinal: rank commits by it, or compare a commit against the repository's own distribution, but do not read the magnitude as a probability. Any formula change bumps jit_score_version (separate from the file-level risk_score_version).
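A consumer can use the contributions block to explain a ranking. The sketch below uses the numbers from the sample report above; in that sample the five contributions add up to the reported risk_score, but treating that as an exact invariant of every report is an assumption here.

```python
report = {
    "risk_score": 4.40,
    "contributions": {"size": 2.74, "diffusion": 0.97, "history": 1.57,
                      "purpose": 0.15, "experience": -1.03},
}

def explain(report):
    """Order the feature groups by how strongly they moved the score."""
    return sorted(report["contributions"].items(),
                  key=lambda kv: abs(kv[1]), reverse=True)

for group, value in explain(report):
    print(f"{group:<11}{value:+.2f}")

# In the sample, the signed contributions sum to the score (after rounding).
total = round(sum(report["contributions"].values()), 2)
print(total == report["risk_score"])
```

Sorting by absolute value keeps the negative experience term visible next to the positive groups it offsets.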

The commit is scored against its first parent. A merge commit is flagged (is_merge, parent_count ≥ 2) and scored against that first parent. A root commit and any new files carry zero priors by construction — the score then leans on size and author experience, exactly as the literature prescribes for changes with no file history.

The window / --ref / bot / merge / rename flags are shared with the parent bca vcs command; the commit-only flags are the positional <commit> (default HEAD), --format json|yaml|toml|cbor (default json), --output, --pretty, and:

# CI gate: exit 2 when the commit scores at or above the threshold.
bca vcs commit HEAD --fail-above 6.0

--fail-above uses exit code 2 (the same "metric gate" convention as bca check; exit 1 stays reserved for tool errors). Because the score is ordinal, calibrate the threshold against your repository's own commit-score distribution rather than treating it as an absolute.
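One way to calibrate is to score a sample of past commits (for example by running bca vcs commit over recent history and collecting each risk_score) and pick a high percentile as the gate. The helper below is a hypothetical sketch using a nearest-rank percentile; the function names and the 95th-percentile choice are illustrative, not a recommendation from the tool.

```python
def calibrate_threshold(scores, percentile=95):
    """Nearest-rank percentile of historical commit scores."""
    if not scores:
        raise ValueError("need at least one historical score")
    ordered = sorted(scores)
    # Integer ceiling of percentile/100 * n, kept in integers to avoid float drift.
    rank = max(1, (percentile * len(ordered) + 99) // 100)
    return ordered[rank - 1]

def gate_fails(score, threshold):
    """Mirror the documented rule: fail at or above the threshold."""
    return score >= threshold

history = [float(n) for n in range(1, 21)]  # stand-in scores 1.0 .. 20.0
threshold = calibrate_threshold(history, 95)
print(threshold, gate_fails(4.40, threshold))
```

The resulting number is what would be passed to --fail-above; re-deriving it periodically keeps the gate aligned with the repository's own distribution.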

Scoring an arbitrary diff (`--diff`)

bca vcs commit --diff <file> scores a git diff instead of a commit (use --diff - to read the diff from stdin). This is handy in a pre-commit hook or a code-review bot, where the change exists only as a diff and has not been committed yet.

git diff --cached | bca vcs commit --diff - --pretty

The input must be a git-style unified diff carrying diff --git file headers, as produced by git diff or git format-patch. Plain diff -u / diff -ru output (which has ---/+++ header lines but no diff --git header) parses to zero files, and combined / merge diffs (git diff --cc, with @@@ hunk headers) are rejected as a malformed diff — pipe a regular two-way git diff instead.
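A wrapper script can check the input shape before handing it to bca. The following Python sketch is a hypothetical pre-flight check, not the parser bca uses: it only looks for diff --git headers and @@@ hunk headers, as described above.

```python
def classify_diff(text):
    """Classify a diff as 'git', 'plain', or 'combined' by its headers."""
    has_git_header = False
    for line in text.splitlines():
        if line.startswith("@@@"):
            return "combined"   # merge diff (git diff --cc): would be rejected
        if line.startswith("diff --git "):
            has_git_header = True
    # Without a diff --git header the input would parse to zero files.
    return "git" if has_git_header else "plain"

git_diff = "diff --git a/x b/x\n--- a/x\n+++ b/x\n@@ -1 +1 @@\n-a\n+b\n"
plain_diff = "--- x\n+++ y\n@@ -1 +1 @@\n-a\n+b\n"
combined_diff = "diff --cc x\n@@@ -1,1 -1,1 +1,1 @@@\n"
print(classify_diff(git_diff), classify_diff(plain_diff),
      classify_diff(combined_diff))
```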

A bare diff carries no author, parent, or file history, so only the size and diffusion groups are computable. The output is therefore a deliberately partial report — a distinct shape from a commit report:

$ git diff | bca vcs commit --diff - --pretty
{
  "jit_schema_version": 3,
  "jit_score_version": 1,
  "source": "diff",
  "partial_risk_score": 1.83,
  "size":      { "lines_added": 42, "lines_deleted": 8,
                 "files_touched": 3, "hunks": 6 },
  "diffusion": { "subsystems": 2, "directories": 3, "entropy": 1.46 },
  "contributions": { "size": 1.18, "diffusion": 0.65 }
}

The source field is a permanent "diff" marker, and the history / experience / purpose groups are absent from the report entirely — not present as zero. Zero is a real value (a commit genuinely with no prior history scores those groups at zero); an absent group means "unavailable", so a consumer can never mistake an unscored group for "low risk". For the same reason the score field is named partial_risk_score, not risk_score.

A diff-only score is not comparable to a commit score. The partial score sums only size + diffusion, whereas the full commit score for the same change also folds in history, purpose, and experience (which subtracts), so the two differ for reasons unrelated to the change itself. Rank diffs against other diffs, never against commit scores. --diff and the positional <commit> are mutually exclusive; --fail-above works in both modes (calibrate the diff-mode threshold against your own diff-score distribution).
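A consumer that accepts both report shapes can branch on the source field and keep "absent" distinct from zero. This Python sketch relies only on the fields shown in the two sample reports; the helper names are hypothetical.

```python
def read_score(report):
    """Return (kind, score), never mixing partial and full scores."""
    if report["source"] == "diff":
        return ("diff", report["partial_risk_score"])
    return ("commit", report["risk_score"])

def group_contribution(report, group):
    """None means the group is unavailable, which is not the same as 0.0."""
    return report["contributions"].get(group)

diff_report = {"source": "diff", "partial_risk_score": 1.83,
               "contributions": {"size": 1.18, "diffusion": 0.65}}
commit_report = {"source": "commit", "risk_score": 4.40,
                 "contributions": {"size": 2.74, "diffusion": 0.97,
                                   "history": 1.57, "purpose": 0.15,
                                   "experience": -1.03}}

print(read_score(diff_report), group_contribution(diff_report, "history"))
print(read_score(commit_report), group_contribution(commit_report, "history"))
```

Carrying the kind alongside the score makes it harder to sort diff scores and commit scores into the same ranking by accident.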

The parser understands git's default C-style path quoting (core.quotePath=true), so a diff touching a file with a non-ASCII or spaced name (which git emits as "a/na\303\257ve.txt") is grouped under its decoded path in the diffusion features, not the raw quoted string.
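To illustrate what decoding a quoted path involves, here is a small Python sketch of git's C-style unquoting. It is not bca's parser: `unquote_git_path` is a hypothetical helper that assumes a well-formed quoted string and handles the single-character escapes plus three-digit octal byte escapes.

```python
_SIMPLE = {"a": 7, "b": 8, "t": 9, "n": 10, "v": 11, "f": 12, "r": 13,
           '"': 34, "\\": 92}

def unquote_git_path(raw):
    """Decode a C-style quoted path; unquoted paths are returned unchanged."""
    if not (len(raw) >= 2 and raw.startswith('"') and raw.endswith('"')):
        return raw
    body = raw[1:-1]
    out = bytearray()
    i = 0
    while i < len(body):
        ch = body[i]
        if ch != "\\":
            out.extend(ch.encode("utf-8"))
            i += 1
            continue
        nxt = body[i + 1]
        if nxt in _SIMPLE:
            out.append(_SIMPLE[nxt])
            i += 2
        else:
            # Three octal digits encode one raw byte of the UTF-8 path.
            out.append(int(body[i + 1:i + 4], 8))
            i += 4
    return out.decode("utf-8", errors="replace")

print(unquote_git_path(r'"a/na\303\257ve.txt"'))
```

The two octal escapes in the example are the bytes 0xC3 0xAF, which together form the UTF-8 encoding of a single non-ASCII character in the decoded path.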

REST and Python parity

The JIT score is also available off the CLI:

REST: POST /v1/vcs/jit with { "id", "repo_path", "commit" } returns the commit JitReport JSON; with { "id", "diff" } it returns the partial diff report. See Driving the REST API.
Python: vcs.commit(repo_path, commit=...) returns the commit report as a dict, and vcs.score_diff(diff) the partial diff report. See Change-history (VCS) metrics.

ML-based JIT models and server-side hook integration remain out of scope.

Historical trend (over time)

A single bca vcs run answers "what is risky now." bca vcs trend answers "is it getting better or worse" — the actionable question for a technical-debt programme — by sampling the metrics at several points in time and emitting a per-file time series.

$ bca vcs --top 20 trend --points 12 --span 24mo --pretty
{
  "trend_schema_version": 1,
  "vcs_schema_version": 2,
  "risk_score_version": 2,
  "long_window_days": 365,
  "recent_window_days": 90,
  "truncated_shallow_clone": false,
  "as_of_points": [ 1700000000, 1705259520, ... ],
  "files": {
    "src/parser.rs": [
      null,                       // did not exist at the oldest point
      { "as_of": 1705259520, "vcs": { "risk_score": 4.1, ... } },
      { "as_of": 1710519040, "vcs": { "risk_score": 6.8, ... } }
    ]
  },
  "deltas": {
    "improved":  [ { "path": "src/old.rs",    "delta": -3.2, ... } ],
    "regressed": [ { "path": "src/parser.rs", "delta":  2.7, ... } ]
  }
}

--points N evenly-spaced samples (inclusive of both endpoints) cover --span DURATION, ending at --as-of (or wall-clock now). as_of_points lists the sample timestamps oldest-first; every file's array aligns to it 1:1, with a null element marking a point where the file did not exist yet. deltas ranks the files whose risk_score fell the most (improved) and rose the most (regressed) between each file's earliest and latest present points; --top-deltas trims each list.
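The sampling and delta rules can be sketched in Python. The function names are hypothetical and the integer rounding of intermediate points is an assumption; the 2 to 120 bound on the point count is the one documented for this subcommand.

```python
def sample_points(as_of, span_seconds, points):
    """Evenly spaced Unix timestamps, oldest first, inclusive of both endpoints."""
    if not 2 <= points <= 120:
        raise ValueError("points must be between 2 and 120")
    start = as_of - span_seconds
    return [start + (span_seconds * i) // (points - 1) for i in range(points)]

def risk_delta(series):
    """Latest minus earliest risk_score over the points where the file exists."""
    present = [p["vcs"]["risk_score"] for p in series if p is not None]
    if len(present) < 2:
        return None
    return present[-1] - present[0]

series = [
    None,  # the file did not exist at the oldest point
    {"as_of": 1705259520, "vcs": {"risk_score": 4.1}},
    {"as_of": 1710519040, "vcs": {"risk_score": 6.8}},
]
print(sample_points(1000, 900, 4))
print(risk_delta(series))
```

Applied to the src/parser.rs series from the sample output, the delta comes out at about 2.7, matching its entry under regressed.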

Crucially, each point re-anchors at the mainline tip that existed at or before that moment — it does not just re-window today's HEAD tree. That is what makes a file born later show as null at older points (rather than leaking its present-day metrics backwards). Files kept in the series are the --top highest-risk by their most-recent sample.

Flags reused from the parent bca vcs command: the window (--long-window / --recent-window), --ref, --file-types, bot / merge / rename toggles, --as-of (the most-recent anchor), and --top. -O accepts json (default), yaml, or cbor; TOML is excluded because an absent point serializes as null, which TOML cannot represent. The point count is bounded (2–120) to keep the per-point history walks tractable on deep histories.

Rename caveat. Renames are followed within each sample's walk, but a file renamed between two samples appears as two separate path series (its old name, then its new name) rather than one continuous line. Cross-sample rename stitching is a deferred follow-up.

Bus factor (directory & repo level)

Where the per-file ownership_top_share measures concentration within a file, the bus factor (a.k.a. truck factor) measures it across a set of files: the minimum number of developers whose departure would leave more than half of a directory's files without a knowledgeable maintainer. bca vcs emits it as a top-level vcs_aggregate object alongside the ranked files:

{
  "vcs_aggregate": {
    "bus_factor": {
      "bus_factor_schema_version": 2,
      "coverage_threshold": 0.5,
      "doa_threshold": 0.75,
      "repo": { "bus_factor": 3, "files": 412, "authors": 11 },
      "by_directory": [
        { "directory": "src", "bus_factor": 2, "files": 180, "authors": 7 },
        { "directory": "src/vcs", "bus_factor": 1, "files": 24, "authors": 3 }
      ]
    }
  }
}

Each developer's authorship of each file is scored with the Avelino Degree-of-Authorship heuristic (Avelino, Passos, Hora & Valente, A Novel Approach for Estimating Truck Factors, ICPC 2016):

DoA(d, f) = 3.293 + 1.098·FA + 0.164·DL − 0.321·ln(1 + AC)

where FA is first authorship (1 if d created f), DL is d's deliveries (changes) to f, and AC is the changes made by other developers. A developer is an author of f when their DoA, normalised by the file's maximum, clears 0.75 (the paper's threshold). The truck factor is then a greedy removal: drop the developer who authors the most still-covered files, repeat until more than --bus-factor-threshold (default 0.5, per Avelino) of the files are orphaned, and report how many were removed. by_directory covers each top-level directory and each of its immediate subdirectories, computed over every file recursively beneath it.
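The procedure can be sketched end to end in Python. This is an illustrative reading of the description above, not the crate's implementation: the function names, the input shapes, the strict "greater than 0.75" comparison, and the alphabetical tie-break are assumptions.

```python
import math

def doa(first_author, deliveries, other_changes):
    """Avelino Degree-of-Authorship for one developer on one file."""
    return (3.293 + 1.098 * first_author + 0.164 * deliveries
            - 0.321 * math.log(1 + other_changes))

def file_authors(contribs, doa_threshold=0.75):
    """contribs: {dev: (first_author_flag, deliveries)} for a single file."""
    total = sum(dl for _, dl in contribs.values())
    scores = {d: doa(fa, dl, total - dl) for d, (fa, dl) in contribs.items()}
    top = max(scores.values())
    # A developer is an author when their normalised DoA clears the threshold.
    return {d for d, s in scores.items() if s / top > doa_threshold}

def bus_factor(files, coverage_threshold=0.5):
    """files: {path: set of authors}. Returns (bus factor, removal order)."""
    covered = {p: set(a) for p, a in files.items() if a}
    total = len(covered)
    removed = []
    # Keep removing while the orphaned share has not exceeded the threshold.
    while covered and (total - len(covered)) / total <= coverage_threshold:
        counts = {}
        for authors in covered.values():
            for dev in authors:
                counts[dev] = counts.get(dev, 0) + 1
        key_dev = max(sorted(counts), key=counts.get)  # ties: alphabetical
        removed.append(key_dev)
        for path in list(covered):
            covered[path].discard(key_dev)
            if not covered[path]:
                del covered[path]  # the file is now orphaned
    return len(removed), removed

files = {"a.rs": {"alice"}, "b.rs": {"alice"}, "c.rs": {"alice", "bob"}}
print(bus_factor(files))
```

In the example, removing alice orphans two of three files, which already exceeds the 0.5 coverage threshold, so the bus factor is 1.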

Caveats, by construction:

A repository (or directory) whose files mostly share the same single author reports a bus factor of 1: losing that one author orphans each of those files. This is the heuristic working as intended, not a bug; treat the number as a planning signal, not a guarantee.
Bot identities are filtered (like the per-file signals), and files with no in-window activity carry no authorship and are excluded from the denominator.
"First authorship" means the earliest commit observed within the long window, not necessarily a file's true creation.

The aggregate reflects the whole repository within the file-type scope (one history walk covers every in-scope file — by default the files-with-metrics set, see File-type scope — so --file-types all widens the bus factor to every tracked file). --paths / --include / --exclude scope only the ranked per-file list, not the bus factor. To focus on a subsystem, read its entry in by_directory rather than filtering the walk.

--emit-author-details adds a key_author_ids list to each group — the SHA-256-hashed identities of the removed key developers, in removal order (plaintext identities never leave the process). The aggregate is computed only for the dedicated bca vcs / bca report --vcs reports and the REST / Python endpoints; the per-file bca metrics --vcs injection path does not pay for it.

Author-detail privacy

The key_author_ids digests are a stable pseudonym, not anonymization. Hashing keeps plaintext emails out of the report and the cache and deters casual disclosure, but the hash is not cryptographically irreversible. The pre-image is an email — low-entropy and enumerable — and commit histories are public, so anyone with a candidate set of emails can recover which digest belongs to whom by hashing each candidate or with a precomputed email→hash table. This is the same weakness that broke Gravatar's email hashing.

Treat published key_author_ids (and the per-file author_ids) as pseudonymization that avoids emitting plaintext emails, not as a guarantee that authors cannot be re-identified by a determined attacker. If you need that guarantee, do not publish the digests.

Hardened mode: `--author-hash-key`

For stronger resistance, pass a secret key with --author-hash-key <KEY> (requires --emit-author-details). The emitted digests then become an HMAC-SHA256(key, SHA-256(email)) instead of a bare hash: an attacker without the key can no longer hash a candidate email to recognise its digest, nor use a precomputed email→hash table — both attacks need the secret key. Pick a high-entropy key and keep it secret; anyone who learns it can re-run the enumeration.

The key is stable: the same key yields the same digests across every report and across a persistent-cache replay, so cross-report correlation and the cache still work. Different keys produce unrelated digests, so two teams sharing histories cannot cross-link authors unless they share the key.
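The construction and its stability properties can be demonstrated with the Python standard library. This is a sketch of the HMAC-SHA256(key, SHA-256(email)) shape described above; the lowercasing, the use of the raw inner digest bytes as the HMAC message, and the hex output encoding are assumptions, so the digests printed here need not match what bca emits.

```python
import hashlib
import hmac

def author_digest(email, key=None):
    """Unkeyed SHA-256 pseudonym, or a keyed HMAC over it when a key is given."""
    inner = hashlib.sha256(email.lower().encode("utf-8")).digest()
    if key is None:
        return inner.hex()
    return hmac.new(key, inner, hashlib.sha256).hexdigest()

email = "dev@example.com"
bare = author_digest(email)
team_a = author_digest(email, key=b"team-a-secret")
team_b = author_digest(email, key=b"team-b-secret")

print(bare == author_digest("Dev@Example.com"))   # same author, same pseudonym
print(team_a == author_digest(email, key=b"team-a-secret"))  # stable per key
print(team_a != team_b)                            # unrelated across keys
```

Without the key, hashing a candidate email only reproduces `bare`, never `team_a`, which is what defeats the enumeration and lookup-table attacks described above.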

Prefer the BCA_AUTHOR_HASH_KEY environment variable over the flag: a key on the command line is visible to other local users via the process list (ps) and is saved in shell history. When both are set, the flag takes precedence. To supply the key through the environment:

export BCA_AUTHOR_HASH_KEY="$(cat ~/.config/bca/author-key)"
bca vcs --emit-author-details

What the key does not cover: the on-disk history cache (issue #334) deliberately stores the unkeyed inner SHA-256 digest, because the key is applied at finalization so a cached walk can be re-finalized under any key without re-walking. The cache is local-only and never published, but if your threat model includes an attacker reading your local cache directory, disable the cache (--no-cache) or clear it (--clear-cache). The same key option is available on the REST endpoint (author_hash_key) and in Python (vcs.Options(author_hash_key=…)).

Dogfooding in this repo

This project runs bca vcs on its own source. make vcs prints the ranked table (path selection and the .bcaignore deny-set come from the repo-root bca.toml manifest, the same config make self-scan and make report use; BCA_VCS_TOP overrides the row cap). The manifest's [vcs] file_types key sets the default scope (the --file-types CLI flag replaces it when given). On every push to main the Pages CI job folds the rendered ranking into the flagship report — bca report html --vcs / report markdown --vcs — so the published reports/index.html shows the change-history risk section side-by-side with the AST hotspots, and additionally publishes the full top-100 ranking as reports/vcs-report.json for tooling.

REST and Python

REST: POST /v1/vcs with a JSON body { "id": "...", "repo_path": "/path/to/repo", ... } returns the ranked report, and POST /v1/vcs/trend (same fields plus points / span / top_deltas) returns the historical time series. See Driving the REST API.
Python: big_code_analysis.vcs.rank(repo_path, …) returns the ranked report as a dict, vcs.trend(repo_path, points=…, span=…, …) returns the time series, and analyze(path, vcs=True) attaches a vcs block to a single file's metrics.

Both POST /v1/vcs and vcs.rank() (through vcs.Options) accept an optional file_types ("metrics" / "all" / "rs,py") to scope which files are ranked, mirroring the CLI --file-types.

Both surfaces also include the vcs_aggregate bus factor in the result and accept a bus_factor_threshold (a fraction in the open interval (0, 1)) to tune the coverage fraction.

Report

bca report [--format <FORMAT>] produces an aggregated quality-metrics report across every file walked. It is designed for pasting into pull requests, wikis, or issue trackers.

Pick the format with --format / -O (bca report --format html). When omitted, the report defaults to markdown. The bare positional form (bca report markdown) still works as a deprecated alias but is slated for removal in the next major release; prefer --format.

CI integration. For runnable GitHub Actions and GitLab CI recipes that post the Markdown report as a PR/MR comment, see the CI integration recipe.

Two formats are available: markdown (plain-text, ideal for PR comments) and html (a self-contained dashboard with sortable tables, ideal for sharing as a build artifact).

Migrating? This command replaces the pre-restructure --metrics -O markdown invocation. See the migration guide.

Quick start

Print to stdout:

bca report --paths /path/to/project --format markdown

Write to a file:

bca report --paths /path/to/project --format markdown --output report.md

Note: --output must be a file path, not a directory.

Flags

Flag	Default	Description
`--top N`	20	Maximum entries per hotspot table (`0` = all).
`--strip-prefix PATH`	(empty)	Prefix removed from file paths.
`--no-suppress`	(off)	Include functions silenced by in-source suppression markers (raw audit view).
`--vcs`	(off)	Append a "Change-history risk" section ranking files by VCS risk (default windows), mirroring `bca metrics --vcs`. The section ranks the same files-with-metrics as the AST hotspot tables (the `metrics` file-type scope, #576), so both halves describe one file universe. Ignored with a warning outside a git working tree. See `bca vcs`.
`-o, --output FILE`	(stdout)	Output file. Parent directory must exist.

Suppression markers

By default, bca report markdown|html honours in-source suppression markers — the same // bca: suppress, // bca: suppress-file, and #lizard forgives comments that bca check and the SARIF emitter respect (see Suppression). A function is omitted from a metric's hotspot table when that metric is suppressed for it, so the published report agrees with the threshold gate instead of re-surfacing every silenced offender.

Suppression is per-metric: a // bca: suppress(cyclomatic) marker drops the function from the Cyclomatic table only — it still appears in the Cognitive, Halstead, and other tables. A bare // bca: suppress (or // bca: suppress-file) covers every metric.

Pass --no-suppress for the raw audit view that lists every offender regardless of markers. The setting can also be pinned in the bca.toml manifest:

[report]
no_suppress = true

The CLI flag wins; a bare --no-suppress can force the audit view on, but the manifest never forces it off.

Examples

Show only the five worst hotspots per section:

bca report -p src/ --format markdown --top 5

Strip the workspace root from displayed paths:

bca report -p /home/user/project markdown \
    --strip-prefix /home/user/project/

A typical day-to-day invocation:

bca report \
    --paths "$PWD" \
    markdown \
    --top 20 \
    --strip-prefix "$PWD/"

Report structure

A generated report contains the following sections (each section is omitted when no data exists for it). Every hotspot table includes a Tokens column (Lizard-style leaf-token count, comments excluded) alongside SLOC so two complementary size proxies are visible per row.

Project summary — files analyzed, languages, total SLOC / PLOC / comment counts, function and class counts, comment ratio.
Per-language overview table — one row per language with file count, SLOC, function count, the SLOC-weighted average Maintainability Index (MI), average Cyclomatic Complexity (CC), and average Cognitive Complexity. The MI average is size-weighted and uses the unclamped Visual Studio value, so a language whose files are mostly unmaintainable reads negative instead of saturating at the 0 floor of the per-file MI column.
Per-language hotspot sections (repeated for each language). Every hotspot title follows one template, <Concept> hotspots (top N by <column>) — the truncation clause states the actual --top state (top 20 by CC, or all, by CC for --top 0):
- Summary — file count, SLOC, PLOC, comment ratio, and Average MI (SLOC-weighted) with a GOOD / MODERATE / LOW rating. The headline is the SLOC-weighted mean of the unclamped Visual Studio MI: large files dominate, and a file whose displayed MI clamps to 0 still contributes its true (often negative) maintainability rather than a misleading 0.
- Actionable Summary — counts of functions exceeding common thresholds (by default CC > 10, cognitive > 15, SLOC > 100, args > 3, Halstead bugs > 1; a manifest [thresholds] table overrides each cutoff). Emitted first, directly after the Summary and before any hotspot table, so a reader who stops after a table or two still sees the highest-altitude counts. These are raw counts that ignore suppression; the section is captioned to say so, naming how many suppressed functions are folded in. When suppression empties a hotspot table whose metric this summary still counts, the table is replaced by a one-line "table omitted: all N matching functions suppressed" note so a summary bullet never points at a missing table.
- Maintainability Index hotspots (lowest N by MI) — files sorted ascending by MI.
- Cyclomatic complexity hotspots (top N by CC) — functions sorted descending by CC, with summary statistics (average, max, counts above 10 and 20).
- Cognitive complexity hotspots (top N by Cognitive) — functions sorted descending by cognitive complexity.
- Halstead effort hotspots (top N by Effort) — functions sorted descending by Halstead effort, including volume and estimated bugs. Effort and Volume render as rounded integers with thousands separators (8,845); full precision lives in JSON/CSV.
- Function size hotspots (top N by SLOC) — functions sorted descending by source lines of code.
- Many parameters hotspots (top N by Args) — functions with more than three parameters, sorted descending.
- Type hotspots (top N by WMC) — types sorted descending by Weighted Methods per Class, with NOM, NPA, and NPM. "Type" covers all six kinds the report counts: class, struct, trait, impl, interface, namespace (the legend's WMC entry lists them).
- Exit points hotspots (top N by Exits) — functions with more than two exit points, sorted descending. A single return is the baseline, not a hotspot, so the table admits only nexits > 2; when nothing clears the floor the section is omitted.
- ABC magnitude hotspots (top N by ABC) — functions sorted descending by ABC metric magnitude.
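The difference between the SLOC-weighted headline and the clamped per-file column is easiest to see with numbers. The sketch below is a plain arithmetic model; the function names and the sample figures are made up for illustration and are not taken from bca.

```python
def weighted_mi(files):
    """SLOC-weighted mean of the unclamped MI over (sloc, mi) pairs."""
    total_sloc = sum(sloc for sloc, _ in files)
    if total_sloc == 0:
        return None  # nothing to average
    return sum(sloc * mi for sloc, mi in files) / total_sloc


def displayed_mi(mi):
    """Per-file column value: the unclamped MI floored at 0."""
    return max(0.0, mi)


# (SLOC, unclamped Visual Studio MI): one large unmaintainable file,
# one small healthy file. Sample values only.
files = [(900, -12.0), (100, 60.0)]
headline = weighted_mi(files)  # the large file dominates and pulls it negative
```

The large file shows 0 in the per-file column, yet it still contributes its true negative value to the headline, which is why the average can read below zero.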

Format consistency

The Markdown and HTML reports are two renderings of one underlying data model, so they always present the same data. Every shared figure (project and per-language summaries, hotspot table membership, and each hotspot caption such as the cyclomatic Average / Max / CC > 10 note) is computed once and rendered by both, so the numbers are identical whether you run bca report markdown or bca report html.

Both formats also carry a Legend that defines every metric column abbreviation (CC, MI, ABC, WMC, …) plus the global-header stats (PLOC, Comments, Comment ratio) — a ## Legend section in Markdown (its own outline entry, not nested under the last language) and an expanded (<details open>) block in HTML, so it survives print, mobile, and screen readers as well as a hover tooltip. Each entry links to its chapter in the Supported Metrics reference, so a one-line definition can hand the reader the full explanation. The definitions come from the same shared column specs the tooltips use, so the two formats cannot drift.

Both formats also close with a provenance footer stating the bca version, generation date, the seed paths scanned, the per-table --top value, and whether suppression markers were honored — so a detached artifact (a PR comment, a Pages deployment, a file on a ticket) records what it was generated from. The date honors SOURCE_DATE_EPOCH for reproducible builds. The HTML report additionally carries a <meta name="viewport"> tag and wraps every table in a horizontal-scroll container so wide tables stay reachable on mobile and narrow windows, and its table of contents nests each language's hotspot subsections under a collapsible entry.
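For readers unfamiliar with the SOURCE_DATE_EPOCH convention, here is a minimal sketch of how a generator can pin its footer date to that variable. The function name and the `%Y-%m-%d` output format are assumptions; the footer's actual date format is not specified here.

```python
import os
from datetime import datetime, timezone


def generation_date(environ=os.environ):
    """UTC date for a footer, pinned by SOURCE_DATE_EPOCH when it is set."""
    epoch = environ.get("SOURCE_DATE_EPOCH")
    if epoch is not None:
        moment = datetime.fromtimestamp(int(epoch), tz=timezone.utc)
    else:
        moment = datetime.now(timezone.utc)
    return moment.strftime("%Y-%m-%d")  # output format is an assumption
```

With the variable set, two runs over the same tree produce the same date, so the generated artifact can be byte-for-byte reproducible.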

Suppression is applied uniformly across every output, not just the reports. A function silenced for a metric — via an in-source marker or the baseline — is dropped from bca check's offender formats (code-climate, sarif, checkstyle, clang-warning, msvc-warning) and from the matching report hotspot table alike. The CodeClimate, SARIF, and Checkstyle documents are themselves three renderings of one offender set, so they agree by construction; the reports honour the same per-metric suppression decisions.

The single deliberate exception is the Actionable Summary, a whole-codebase health indicator that intentionally counts raw measurements regardless of suppression — silencing a function in one metric's hotspot table does not erase it from that aggregate concern count. Every other figure, including each hotspot table's caption, reflects the suppression-filtered set. To stop a reader mistaking the two populations for a double-count, each is captioned: the cyclomatic note adds "(excluding suppressed functions)", and the Actionable Summary names the raw, suppression-ignoring basis of its counts.

HTML format

bca report html emits a single self-contained HTML page covering the same sections as the Markdown report. It is designed to be served as a static artifact: inline CSS, inline vanilla JavaScript for click-to-sort on every hotspot table, and zero external dependencies (no CDN, no fonts, no template engine). The page renders identically offline.

Write it to a file and open in any browser:

bca report --paths /path/to/project \
    html --top 10 --output report.html

Click any column header to sort that table ascending; click again to switch to descending. Each table sorts independently. Empty cells (where a metric was not measured) sort as if they were positive infinity, which keeps "no data" rows out of the visible top of a hotspot.
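The empty-cell rule can be sketched as a sort key. The page itself does this in inline JavaScript; the Python below only models the ordering, and the names and sample rows are hypothetical.

```python
import math


def sort_key(cell):
    """Empty (None) cells compare as positive infinity."""
    return math.inf if cell is None else cell


rows = [("a.rs", 12), ("b.rs", None), ("c.rs", 3)]  # (file, metric value)
ascending = sorted(rows, key=lambda row: sort_key(row[1]))
```

In the ascending order the row with no value lands last, behind every measured row.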

Hover (or keyboard-focus, where the browser supports it) any metric column header — SLOC, MI, CC, ABC, WMC, NPA, NPM, Exits, etc. — for a one-sentence plain-English explanation of the metric. The tooltip is delivered through the native HTML title attribute, so it works offline with no JavaScript.

Because title tooltips are hover-only — invisible in print, on mobile, and to screen readers — the page also ends with a visible, collapsible Legend (<details>) listing every metric column's one-line definition. Both the tooltips and the legend draw from the same column specs, so a definition cannot say one thing on hover and another in the legend.

Every interpolated string — function name, file path, language label — is HTML-escaped on the way out, so a crafted source path or symbol name cannot inject markup or break out of an attribute value.
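As an illustration of the escaping rule, the sketch below uses Python's standard `html.escape` to render a table cell and a header with a `title` attribute. It shows the general technique only; the helper names are hypothetical and the report's real renderer is not shown here.

```python
from html import escape


def cell(text):
    """Render a table cell with the text escaped for element content."""
    return "<td>" + escape(text) + "</td>"


def header(label, tooltip):
    """Render a header whose tooltip is escaped for an attribute value."""
    # quote=True also escapes double and single quotes, so the value
    # cannot terminate the attribute early.
    return '<th title="' + escape(tooltip, quote=True) + '">' + escape(label) + "</th>"
```

A symbol named `<b>x</b>` then renders as literal text, and a tooltip containing `"` stays inside its attribute.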

Each per-language <section> carries a stable lang-<name> class (e.g. lang-rust, lang-python) styled with a low-alpha background tint and matching left border so a multi-language report's section boundaries are obvious at a glance. Languages without an explicit palette entry fall back to a neutral lang-other tint, and a prefers-color-scheme: dark adapter raises the alpha so contrast holds in both themes.

Metric values of zero

A metric value of 0 in the report means the metric was not measured for that item (e.g. Halstead metrics on an empty function). Sections whose entries are all zero are omitted entirely.

Check

bca check evaluates per-function metrics against thresholds and exits non-zero when any function exceeds a limit. It is the CI integration point: wire it into a build step and a regression in code complexity fails the pipeline before the change lands.

Looking for full CI recipes? The CI integration recipe consolidates the --report-format matrix, runnable GitHub Actions and .gitlab-ci.yml examples, the baseline / ratchet pattern, and the GitLab Code Quality path. This page documents the command itself; the recipe documents how to wire it into a pipeline.

Exit codes

Code	Meaning
`0`	All functions within thresholds (or `--no-fail` set).
`2`	At least one threshold exceeded.
`1`	Tool error (bad arguments, unreadable config, unknown metric).

1 is reserved so CI can distinguish a regression (2) from a tool misconfiguration (1).

Tiered exit codes (`--exit-codes=tiered`)

--exit-codes=tiered (or [check] exit_codes = "tiered" in bca.toml) splits the single violation code 2 by severity so CI can branch on it without parsing the [new] / [regr +N%] stderr tags:

Code	Meaning (tiered mode)
`0`	All functions within thresholds (or `--no-fail` set).
`1`	Tool error.
`2`	New offenders only (no `--baseline` entry matched).
`3`	Baseline regressions only (a baselined offender worsened).
`4`	Both new offenders and regressions.
`5`	A `--tier=soft` violation that also breaches the hard limit.

The tiered codes are opt-in; the default contract above stays 0/1/2. Every fail-state remains non-zero, so exit != 0 → fail wrappers keep working — only tooling that tests $? -eq 2 explicitly needs to widen to 2-5. --no-fail still forces exit 0. Code 5 is emitted only at the soft tier; at the hard tier every violation is a hard breach by definition, so the 2/3/4 split applies instead. --exit-codes <default|tiered> is value-taking; the CLI value overrides the [check] exit_codes manifest key in either direction. An invalid exit_codes value is a tool error (1). --print-effective-config reports the resolved exit_codes style. The deprecated --strict-exit-codes flag is a one-cycle alias for --exit-codes tiered (warns; removed at the next major).
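A CI wrapper that branches on the exit status could decode it with a lookup like the one below. The tables restate the two contracts above; the function names and the wording of the descriptions are illustrative, not part of bca.

```python
DEFAULT_CODES = {
    0: "all functions within thresholds (or --no-fail set)",
    1: "tool error",
    2: "at least one threshold exceeded",
}

TIERED_CODES = {
    0: "all functions within thresholds (or --no-fail set)",
    1: "tool error",
    2: "new offenders only",
    3: "baseline regressions only",
    4: "new offenders and regressions",
    5: "soft-tier violation that also breaches the hard limit",
}


def describe_exit(code, tiered=False):
    """Human-readable meaning of a `bca check` exit status."""
    table = TIERED_CODES if tiered else DEFAULT_CODES
    return table.get(code, "unexpected exit code")


def is_violation(code, tiered=False):
    """True for a threshold failure, False for success or a tool error."""
    return code in ((2, 3, 4, 5) if tiered else (2,))
```

A wrapper written this way treats 1 as a misconfiguration to fix rather than a regression to triage, in both modes.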

Declaring thresholds

Pass --threshold <metric>=<limit> once per metric (repeatable). Metric names match bca list-metrics; sub-metrics use a dotted form. 0 is a valid limit and means "no value permitted".

bca check --paths src/ \
    --threshold cyclomatic=15 \
    --threshold cognitive=20 \
    --threshold loc.lloc=200

Or keep thresholds in the bca.toml manifest (one place to version CI thresholds alongside the code). Dropped at the repo root, it is auto-discovered — a bare bca check reads it with no --config flag:

# bca.toml
paths = ["src"]

[thresholds]
cyclomatic = 15
cognitive = 20
"loc.lloc" = 200
"halstead.volume" = 1000

bca check

To merge a separate threshold file on top of the manifest for one run, pass it explicitly with --config; CLI flags and --config values override the manifest for the same metric name, so you can keep a project-wide default and tighten a single metric for a specific run:

bca check --paths src/ --config bca.toml
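The layering can be pictured as successive dictionary updates: manifest first, then the --config file, then each --threshold flag. This is a sketch of the precedence only; the function names are hypothetical and real parsing and validation are omitted.

```python
def parse_threshold_flag(arg):
    """Split one `<metric>=<limit>` argument into (name, limit)."""
    name, sep, limit = arg.partition("=")
    if not sep or not name:
        raise ValueError("expected <metric>=<limit>, got %r" % arg)
    return name, float(limit)


def effective_thresholds(manifest, config_file, flags):
    """Merge limits so later layers override earlier ones per metric."""
    merged = dict(manifest)        # project-wide defaults
    merged.update(config_file)     # --config overrides the manifest
    for arg in flags:              # --threshold flags apply last
        name, limit = parse_threshold_flag(arg)
        merged[name] = limit
    return merged
```

For example, a manifest with `cyclomatic = 15` and a run with `--threshold cyclomatic=10` gates at 10 while every other metric keeps its manifest value.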

Accepted metric names

Top-level scalar metrics use their list-metrics names directly: cognitive, cyclomatic, nargs, nexits, nom, tokens, abc, wmc, npm, npa. Metric suites with multiple sub-fields use a dotted form:

Metric	Accepted threshold names
Cyclomatic	`cyclomatic`, `cyclomatic.modified`
Halstead	`halstead.volume`, `halstead.difficulty`, `halstead.effort`, `halstead.time`, `halstead.bugs`
Lines of code	`loc.sloc`, `loc.ploc`, `loc.lloc`, `loc.cloc`, `loc.blank`
Maintainability Index	`mi.original`, `mi.sei`, `mi.visual_studio`

An unknown threshold name is a tool error (exit 1), not silently ignored.

Threshold scope

A threshold is checked only against the space kind its metric actually measures, so a metric's whole-file or whole-impl aggregate is never mistaken for a per-function limit. Each metric has a fixed scope; there is nothing to configure.

Scope	Gated spaces	Metrics
File	the whole-file root only	`loc.sloc`, `loc.ploc`, `loc.lloc`, `loc.cloc`, `loc.blank`
Function	individual functions, methods, and closures	`cognitive`, `cyclomatic`, `cyclomatic.modified`, `halstead.*`, `mi.*`, `abc`, `nargs`, `nexits`, `tokens`
Container	classes, structs, traits, impls, namespaces, interfaces	`nom`, `wmc`, `npm`, `npa`

The Function-scoped metrics include the subtree sums (nargs, nexits, tokens, halstead.*): these still roll a function's own nested closures into its figure, but they are no longer summed across an entire file or impl. The Container-scoped metrics describe a type's method set (methods per class, weighted methods, public members), so they gate the container rather than every leaf function. This means a clean file whose functions are individually fine no longer trips an additive limit purely from the file-wide total — the false positive that bca: suppress-file markers used to mask.
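The fixed metric-to-scope mapping from the table above can be written as a lookup. The sketch simplifies space kinds to three labels (`"file"`, `"function"`, `"container"`); those labels and the function names are assumptions for illustration.

```python
FILE_METRICS = {"loc.sloc", "loc.ploc", "loc.lloc", "loc.cloc", "loc.blank"}
CONTAINER_METRICS = {"nom", "wmc", "npm", "npa"}
FUNCTION_METRICS = {
    "cognitive", "cyclomatic", "cyclomatic.modified",
    "abc", "nargs", "nexits", "tokens",
}
FUNCTION_PREFIXES = ("halstead.", "mi.")  # halstead.* and mi.*


def scope_of(metric):
    """Return the space kind a metric's threshold is checked against."""
    if metric in FILE_METRICS:
        return "file"
    if metric in CONTAINER_METRICS:
        return "container"
    if metric in FUNCTION_METRICS or metric.startswith(FUNCTION_PREFIXES):
        return "function"
    raise ValueError("unknown metric: %s" % metric)


def is_gated(metric, space_kind):
    """A threshold applies only to spaces of the metric's own scope."""
    return scope_of(metric) == space_kind
```

Under this model a `nargs` limit is never compared against a whole-file total, and a `wmc` limit is never compared against a single function.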

The bare bca diff --metric spelling of a loc sub-metric is accepted as an alias for its dotted form (sloc is equivalent to loc.sloc, and so on for ploc/lloc/cloc/blank), so a name copied from a diff run gates correctly. A bare family head with no single threshold scalar (halstead, mi) is ambiguous and rejected with a "did you mean" hint listing the concrete sub-metrics — pick one (e.g. halstead.volume).
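The alias and ambiguity rules can be sketched as a name normaliser. The sub-metric lists come from the table of accepted names; the function name and the exact wording of the hint are hypothetical.

```python
LOC_SUBMETRICS = ("sloc", "ploc", "lloc", "cloc", "blank")
FAMILY_SUBMETRICS = {
    "halstead": ("volume", "difficulty", "effort", "time", "bugs"),
    "mi": ("original", "sei", "visual_studio"),
}


def canonical_name(name):
    """Map a threshold name to its dotted form, rejecting bare families."""
    if name in LOC_SUBMETRICS:
        return "loc." + name  # bare diff spelling, e.g. sloc -> loc.sloc
    if name in FAMILY_SUBMETRICS:
        options = ", ".join(name + "." + sub for sub in FAMILY_SUBMETRICS[name])
        raise ValueError(
            "ambiguous metric '%s'; did you mean one of: %s" % (name, options)
        )
    return name
```

A name copied from a diff run (`sloc`) resolves to `loc.sloc`, while `halstead` alone fails with a list of concrete choices.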

Two-tier thresholds (`--tier`)

--tier <hard|soft|soft=RATIO> selects which threshold tier the gate compares against. hard (the default) uses the [thresholds] table verbatim; soft is an early-warning tier that fires before the hard gate, flagging a function at RATIO of any limit. A bare --tier means soft; soft alone uses the default ratio 0.95; soft=0.90 pins the ratio to 0.90; soft=1.0 disables the blanket scale.

A [thresholds.soft] table sets per-metric soft limits, each either an absolute number or a "<ratio>x" string that scales the metric's hard limit:

[thresholds]
cognitive  = 25
cyclomatic = 15
nargs      = 7

[thresholds.soft]
cognitive  = 22       # absolute soft limit
cyclomatic = "0.9x"   # 90% of the hard limit → 13.5
# nargs absent → soft tier inherits the hard limit (no soft band)

bca check --paths src/ --tier=soft

The soft tier resolves in a fixed order:

1. Start from [thresholds] (a bca.toml manifest, merged with --config).
2. If a [thresholds.soft] table exists, merge its overrides on top; metrics absent from it inherit their hard limit. The blanket RATIO does not apply (explicit per-metric limits win).
3. Otherwise scale every limit by the soft RATIO (default 0.95 for a bare soft; soft=1.0 disables scaling).
4. Repeated --threshold name=value flags apply last, absolutely.

The soft RATIO (and the scale factor in a "<ratio>x" string) must be in (0, 1]. The [check] headroom manifest key supplies the ratio for a bare --tier=soft. The deprecated --headroom <R> flag is a one-cycle alias for --tier=soft=<R> (warns; removed at the next major) — it now promotes a hard run to the soft tier. Both tiers ratchet through the same --baseline, and --print-effective-config reports the resolved tier alongside the post-merge limits. See the Local threshold gates recipe for the migration tip and rationale.
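The resolution order can be sketched in a few lines. This models only the documented order and the (0, 1] constraint; the function names are hypothetical, and a soft entry for a metric with no hard limit is not handled.

```python
def _check_ratio(ratio):
    """Ratios must lie in the half-open interval (0, 1]."""
    if not 0 < ratio <= 1:
        raise ValueError("ratio must be in (0, 1], got %r" % ratio)
    return ratio


def _soft_value(value, hard_limit):
    """An absolute number, or a "<ratio>x" string scaling the hard limit."""
    if isinstance(value, str):
        if not value.endswith("x"):
            raise ValueError("expected '<ratio>x', got %r" % value)
        return hard_limit * _check_ratio(float(value[:-1]))
    return value


def soft_limits(hard, soft_table=None, ratio=0.95, flags=None):
    """Resolve soft-tier limits following the documented order."""
    _check_ratio(ratio)
    limits = dict(hard)                       # start from [thresholds]
    if soft_table is not None:                # [thresholds.soft] overrides
        for name, value in soft_table.items():
            limits[name] = _soft_value(value, hard[name])
    else:                                     # otherwise the blanket ratio
        limits = {name: limit * ratio for name, limit in hard.items()}
    limits.update(flags or {})                # --threshold flags last
    return limits
```

Fed the example tables above, the sketch yields cognitive 22, cyclomatic 13.5, and nargs 7 (inherited from the hard limit).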

Offender output

Every offending (function, metric) pair prints one line to stderr in this stable format:

<path>:<start_line>-<end_line>: <function_name>: <metric> = <value> (limit <limit>)

For example:

src/parser.rs:42-117: parse_expression: cognitive = 31 (limit 20)
src/parser.rs:42-117: parse_expression: cyclomatic = 22 (limit 15)

Lines are sorted by path, then start line, then metric name, so output is deterministic across runs over the same tree.
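Because the format is stable, downstream tooling can parse it with a single regular expression. The sketch below handles the plain line shape shown above; it does not account for any extra tags a baseline run may add, and the names are hypothetical.

```python
import re

OFFENDER = re.compile(
    r"^(?P<path>.+?):(?P<start>\d+)-(?P<end>\d+): "
    r"(?P<function>.+): (?P<metric>[\w.]+) = (?P<value>\S+) "
    r"\(limit (?P<limit>\S+)\)$"
)


def parse_offender(line):
    """Parse one offender line into a dict, or return None on no match."""
    match = OFFENDER.match(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    record["start"] = int(record["start"])
    record["end"] = int(record["end"])
    record["value"] = float(record["value"])
    record["limit"] = float(record["limit"])
    return record


def sort_key(record):
    """Same ordering as the output: path, start line, metric name."""
    return (record["path"], record["start"], record["metric"])
```

Sorting parsed records with `sort_key` reproduces the documented order, which is handy when merging output from several runs.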

Silencing violations with suppression markers

In-source comments can silence threshold violations on individual functions or whole files without editing the offending code or excluding it from the walk. The native dialect is bca: suppress / bca: suppress-file; Lizard's #lizard forgives is recognized as a compatibility shim. See Suppression markers for the full reference and the --no-suppress CI-audit flag.

Exempting whole file categories (`[check.exclude]`)

Some files should be analysed and reported but never gated: test fixtures that intentionally trip cognitive/cyclomatic, generated bindings, macro-dispatch modules whose complexity is structural and will never be "fixed". Putting these in .bcaignore is too blunt — it removes them from the walk entirely, so bca report loses them too. Baselining them is also wrong — they are not debt being paid down, and they churn the baseline diff forever.

[check.exclude] is the glob-level middle ground: matching files are walked, parsed, metric'd, and shown by bca report, but bca check drops their violations before emitting offenders and before --write-baseline records anything, so the structural exemptions stay out of .bca-baseline.toml.

In bca.toml:

[check]
exclude = [
    "tests/**",
    "src/languages/language_*.rs",
    "xtask/**",
]

Or on the command line (--check-exclude is repeatable and unions with --check-exclude-from):

bca check --check-exclude "tests/**" --check-exclude "xtask/**"
bca check --check-exclude-from .bcacheckignore

--check-exclude-from reads a .gitignore-style file (blank lines and #-comments skipped); the conventional name is .bcacheckignore, mirroring .bcaignore for the walker. Globs match the path exactly as the walker matched it for --exclude. Because exclude is a negative filter, an explicit --check-exclude list is unioned with the manifest [check] exclude list rather than replacing it, so a CLI run cannot accidentally re-gate a path the manifest deliberately exempted. Duplicates collapse; CLI patterns sort first. Pass --no-config to drop the manifest's exemptions entirely. (Positive scope keys like paths / include still replace on the CLI; only the exclude filters merge.)
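The merge rule can be sketched as an order-preserving union. This is one reading of "CLI patterns sort first" (CLI entries lead, each group keeps its own order); the function name is hypothetical and glob matching itself is not modelled.

```python
def merged_excludes(cli, manifest, no_config=False):
    """Union the CLI and manifest exclude lists, CLI patterns first."""
    candidates = list(cli)
    if not no_config:  # --no-config drops the manifest's exemptions
        candidates += list(manifest)
    merged = []
    for pattern in candidates:
        if pattern not in merged:  # duplicates collapse
            merged.append(pattern)
    return merged
```

Adding `--check-exclude "xtask/**"` to a project whose manifest already exempts `tests/**` gates neither category.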

Precedence with the other suppression mechanisms

Most-specific to least, bca check resolves exemptions in this order:

1. In-source markers (bca: suppress / bca: suppress-file): always win; applied during the walk so the function never becomes a violation.
2. [check.exclude] globs: exempt categories of files (tests, generated code).
3. .bca-baseline.toml: known offenders being paid down.

--print-effective-config reports the resolved check_exclude globs alongside the other gate inputs.

Baselines

When you adopt thresholds on an existing codebase you typically face a binary choice between "raise the limit until nothing fires" and "fix every offender before turning the gate on". A baseline file is the ratchet-down alternative: record today's offenders, fail only on regressions and new offenders, and shrink the file over time as the team pays down debt.

Baselines are complementary to the suppression markers from Suppression markers, not a substitute. Suppressions express "this function is intentionally exempt forever" and live in source; baselines express "this is tech debt we're paying down" and live in a committed TOML file. bca check honors suppressions first and applies the baseline filter to whatever remains.

Writing a baseline

bca check --paths src/ \
    --write-baseline .bca-baseline.toml

This walks the tree, captures every threshold violation that would otherwise fail the check, and writes them to the file as sorted TOML. The run exits 0 regardless of offender count — the point is to capture them.

# bca baseline file. Generated by `bca check --write-baseline`.
# Listed offenders are filtered from threshold checks; a function that
# gets worse than its recorded value still fails. Refresh with
# `--write-baseline` when entries become stale.
version = 5

[provenance]
tier = "hard"

[[entry]]
path = "src/parser.rs"
qualified = "Parser::parse_expression"
start_line = 42
metric = "cyclomatic"
value = 22.0

The qualified field is the function's qualified symbol (the ::-joined chain of enclosing named containers plus the function name); start_line is retained only to disambiguate a symbol shared by several functions. With --baseline-fuzzy-match, each entry also carries a body_hash for rename-tolerant matching.

Functions already covered by an in-source suppression marker are excluded. Pass --no-suppress together with --write-baseline to record every violation (CI-auditor flow).

--write-baseline cannot be combined with --baseline, --report-format, --output, --since, or --changed-only — the baseline file is the output.

Reading a baseline

bca check --paths src/ \
    --baseline .bca-baseline.toml

A violation is suppressed when both conditions hold:

An entry matches by (path, qualified_symbol, metric) — independent of line number — or, failing that and with --baseline-fuzzy-match, by body hash. (See the Baselines recipe for the full resolution order.)
The current value is less than or equal to the recorded value.

A function that gets worse than its baseline value still fails. New offenders not listed in the baseline still fail. Improvements pass silently (the entry remains at its older, higher value until the next --write-baseline refresh).
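The exact-match path of the baseline filter reduces to a key lookup plus a value comparison. The sketch below ignores line tolerance and fuzzy (body hash) matching, and its data shapes and names are assumptions for illustration.

```python
def baseline_index(entries):
    """Index baseline entries by (path, qualified symbol, metric)."""
    return {
        (e["path"], e["qualified"], e["metric"]): e["value"] for e in entries
    }


def still_fails(violation, index):
    """True when a violation is new or worse than its recorded value."""
    key = (violation["path"], violation["qualified"], violation["metric"])
    recorded = index.get(key)
    return recorded is None or violation["value"] > recorded


# The entry from the sample baseline file above.
index = baseline_index([
    {"path": "src/parser.rs", "qualified": "Parser::parse_expression",
     "start_line": 42, "metric": "cyclomatic", "value": 22.0},
])
```

A current value of 22 or lower is filtered, 23 fails as a regression, and any function absent from the index fails as a new offender.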

A baseline file that does not exist, is empty, has a missing or unsupported version, or fails to parse is a tool error (exit 1), not a silent zero-match.

Path keys are canonicalised relative to the baseline file's own directory (the anchor), so --paths ., --paths src/, and --paths "$PWD" produce byte-identical baselines and a --baseline run matches regardless of which --paths form generated the file — switch between them freely without re-running --write-baseline.

Limitations

Ambiguous symbols / anonymous functions. Entries key on the qualified symbol, so inserting code above a named function no longer re-keys it. The exceptions: functions sharing a qualified symbol that drift beyond --baseline-line-tolerance apart, and anonymous closures/lambdas (whose synthetic symbol embeds the line). Both re-key as "new" on movement; refresh with --write-baseline.
OS portability. Paths are stored with forward slashes so a baseline written on one OS matches the same tree on another. Paths that are not valid UTF-8 fall back to a lossy display form (U+FFFD substitution) and may not round-trip exactly.

See the Baselines recipe for the end-to-end adoption flow and CI integration patterns.

Reporting without failing

--no-fail prints offenders to stderr but exits 0. Useful while adopting baselines without flipping CI red. Other CI tools call this behavior --report-only or --soft-fail; here the flag is spelled --no-fail.

bca check --paths src/ --no-fail

Actionable failure output

When bca check fails, five flags shape the failure stream so a developer skimming a CI log can see what tripped, where in their PR it tripped, and what to do next. Each flag is independent and all auto-detect from GitHub Actions env vars when present, so the common CI case needs zero explicit configuration.

Flag	Effect	Auto-detect env
`--since <ref>`	Partition per-file footer into "Files in this range" + "Other offenders"	`BCA_DIFF_BASE`, `GITHUB_BASE_REF`, `GITHUB_EVENT_BEFORE`
`--changed-only`	Drop violations outside the diff scope entirely	Requires a resolvable base (`--since` or one of the above)
`--github-annotations <auto|always|never>`	Emit `::error file=…::msg` workflow commands for inline file annotations (bare flag = `always`)	`auto` detects `GITHUB_ACTIONS == "true"`
`--summary-file <path|auto|never>`	Append markdown digest (per-file rollup + breakdown + top-10 offenders); `never` suppresses it	`auto` detects `GITHUB_STEP_SUMMARY`
`--no-remediation`	Suppress the trailing `--- next steps ---` block	Block emitted on failure unless this flag is passed

The per-violation stderr lines and the per-file rollup footer remain unchanged when none of the above are active, so existing CI tooling that grep-anchors on the legacy output keeps working.

See the CI integration recipe for worked examples — including a "putting it all together" GHA snippet that composes all five into one step — and the Baselines recipe for the --write-baseline refresh flow the remediation block links to.

Diff-base auto-detection precedence

When --since is omitted, bca consults env vars in this order:

1. BCA_DIFF_BASE: explicit override hatch for local shells or non-GHA CI runners.
2. GITHUB_BASE_REF: set by GHA on pull_request events. Expanded to origin/<value>; the runner is responsible for the corresponding git fetch (fetch-depth: 0 on actions/checkout).
3. GITHUB_EVENT_BEFORE: set by GHA on push events to the SHA at HEAD before the push. The all-zeroes sentinel (force push, brand-new branch) is treated as no signal.

Failing to resolve a base is non-fatal unless --changed-only is passed, in which case the run aborts with an error: silently suppressing every violation under a misconfigured base is exactly the failure mode this feature exists to prevent. --write-baseline also conflicts with --since / --changed-only (a partial baseline would silently mask every offender outside the diff scope on the next full-tree run).
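The lookup order can be sketched as a function over an environment mapping. Treating an empty variable as unset is an assumption of this sketch, as are the function name and the `None` return for "no base resolved".

```python
def resolve_diff_base(since, env):
    """Pick the diff base: explicit --since, then env vars in order."""
    if since:
        return since
    if env.get("BCA_DIFF_BASE"):
        return env["BCA_DIFF_BASE"]
    if env.get("GITHUB_BASE_REF"):
        return "origin/" + env["GITHUB_BASE_REF"]
    before = env.get("GITHUB_EVENT_BEFORE")
    if before and set(before) != {"0"}:  # all-zeroes SHA carries no signal
        return before
    return None  # unresolved: fatal only together with --changed-only
```

Passing `os.environ` as `env` gives the behaviour on a real runner; passing a plain dict makes the precedence easy to test.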

CI example (GitHub Actions)

- name: Check code complexity thresholds
  run: |
    bca check
  # Thresholds and paths come from the auto-discovered `bca.toml`
  # manifest at the repo root. The default behavior — non-zero exit
  # fails the step — is exactly what we want here. No extra wiring.

If you want to keep the job green and surface offenders as a build annotation while you reduce the count, swap in --no-fail:

- name: Surface complexity hot spots (non-blocking)
  run: |
    bca check --paths src/ --no-fail

Exporting offender records

bca check also emits a single CI/IDE document covering every offender in the walk. Pass --report-format <fmt> to pick the shape and --output <file> to write it to disk (stdout if omitted). The --format, -O, and --output-format spellings are accepted as deprecated aliases and will be removed in a future release. The exit-code contract is unaffected by these flags: 0 clean, 2 on any violation (unless --no-fail), 1 on tool error.

When --output is given without --report-format, the format is inferred from the output extension: .sarif selects sarif and .xml selects checkstyle. An extension with no unique format (notably .json, which both sarif and code-climate produce) or no extension at all is a usage error (exit 1) naming --report-format — an explicit --output is never silently ignored. An explicit --report-format always wins over the extension.
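The inference rule amounts to a two-entry extension table with everything else rejected. A minimal sketch follows; the function name, the error wording, and exact-case extension matching are assumptions.

```python
import os

# Only extensions that identify exactly one format are inferable.
EXTENSION_FORMATS = {".sarif": "sarif", ".xml": "checkstyle"}


def infer_format(output, report_format=None):
    """Resolve the report format for an explicit --output path."""
    if report_format:  # an explicit --report-format always wins
        return report_format
    extension = os.path.splitext(output)[1]
    if extension not in EXTENSION_FORMATS:
        raise ValueError(
            "cannot infer a format from %r; pass --report-format" % output
        )
    return EXTENSION_FORMATS[extension]
```

Note that `report.sarif.json` ends in `.json`, so it is ambiguous on its own and needs an explicit `--report-format sarif`.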

Format	Audience
`checkstyle`	Jenkins, SonarQube, GitLab, "warnings plugin" CI
`sarif`	GitHub Code Scanning, modern IDEs / security tooling
`code-climate`	GitLab MR Code Quality widget
`clang-warning`	Editor quickfix parsers, GitHub Actions problem matcher
`msvc-warning`	Visual Studio, VS Code, Windows CI runners

When no offenders exist the writer emits a well-formed but empty document — empty runs[].results array for SARIF, empty JSON array ([]) for Code Climate, no <file> children under the <checkstyle> root for Checkstyle, and zero bytes for the two warning-line formats — so CI consumers can ingest clean runs unchanged.

Checkstyle (CI integration)

bca check --paths src/ \
    --threshold cyclomatic=15 \
    --report-format checkstyle \
    --output report.checkstyle.xml

The Checkstyle writer emits a single <checkstyle version="4.3"> document containing one <file> element per source path, each holding one <error> per metric-threshold violation. The schema is the Checkstyle 4.3 XSD that Jenkins and SonarQube's "Warnings Next Generation" / "Generic Issue" importers consume directly.

SARIF (GitHub Code Scanning)

bca check --paths src/ \
    --threshold cyclomatic=15 \
    --report-format sarif \
    --output report.sarif.json

The SARIF writer emits a single SARIF 2.1.0 JSON document with one runs[] element. Each metric-threshold violation becomes a result under runs[0].results[]; the metric names appearing in the run are deduplicated into runs[0].tool.driver.rules[] with short descriptions.

To upload a SARIF file to GitHub Code Scanning from a workflow:

name: bca-sarif
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - name: Run big-code-analysis
        run: |
          bca check --paths . \
              --report-format sarif \
              --output report.sarif.json \
              --no-fail
      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: report.sarif.json

--no-fail keeps the job green so the SARIF upload step still runs when offenders exist; remove it once you want a metric regression to fail the workflow.

GitLab Code Quality (Code Climate JSON)

bca check --paths src/ \
    --threshold cyclomatic=15 \
    --report-format code-climate \
    --output gl-code-quality-report.json

The Code Climate writer emits a single JSON array of issue objects matching GitLab's strict subset of the upstream Code Climate engine spec — one entry per metric-threshold violation, no byte-order-mark, one trailing newline (empty input renders as []\n). Each issue carries a namespaced check_name (big-code-analysis/<metric>), a stable SHA-256 fingerprint over path \0 function \0 metric (line- and value-insensitive so cosmetic edits still dedup in the MR widget), and a severity mapped from the value/threshold ratio onto GitLab's five-level enum: ≤ 1.5× → minor, ≤ 2× → major, ≤ 4× → critical, > 4× → blocker (inverted for the mi.* family where lower is worse). The full enum is info/minor/major/critical/blocker; bca never emits info — a threshold violation always lands at minor or higher.
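The fingerprint and severity rules can be sketched as follows. Several details are assumptions of this sketch rather than documented behaviour: UTF-8 encoding and a hex digest for the fingerprint, `limit / value` as the inverted ratio for the `mi.*` family, and treating a non-positive denominator as the worst severity.

```python
import hashlib


def fingerprint(path, function, metric):
    """SHA-256 over path, function, and metric joined by NUL bytes."""
    payload = "\0".join((path, function, metric)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def severity(metric, value, limit):
    """Map the value/threshold ratio onto GitLab's severity levels."""
    if metric.startswith("mi."):  # lower is worse, so invert the ratio
        ratio = limit / value if value > 0 else float("inf")
    else:
        ratio = value / limit if limit > 0 else float("inf")
    if ratio <= 1.5:
        return "minor"
    if ratio <= 2:
        return "major"
    if ratio <= 4:
        return "critical"
    return "blocker"
```

Because neither the line number nor the measured value feeds the hash, moving a function or nudging its score leaves the fingerprint unchanged.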

To wire the artifact into GitLab's MR Code Quality widget:

code_quality:
  stage: quality
  script:
    - bca check --paths "$CI_PROJECT_DIR"
          --report-format code-climate
          --output gl-code-quality-report.json
          --no-fail
  artifacts:
    when: always
    reports:
      codequality: gl-code-quality-report.json
    paths:
      - gl-code-quality-report.json

See the GitLab Code Quality widget recipe for the full pipeline (combined Code Climate + Checkstyle + Markdown report) and a local jq smoke check.

--no-fail keeps the job green so the Code Quality report still uploads when offenders exist; remove it once you want a metric regression to fail the pipeline.

Clang/GCC warning lines (editor quickfix and CI annotators)

bca check --paths src/ \
    --threshold cyclomatic=15 \
    --report-format clang-warning \
    --output report.txt

The Clang format emits one offender per line in the conventional compiler-warning shape:

path/to/file.rs:42:5: warning: cyclomatic 17 exceeds limit 15 [big-code-analysis-cyclomatic]

This is the format clang -fdiagnostics-format= produces and the shape every editor quickfix parser (VS Code, IntelliJ, Vim) and most CI annotators understand without configuration.

GitHub Actions surfaces the lines as inline annotations on the PR diff via the built-in GCC problem matcher (or any community compiler-problem-matchers action):

name: bca-clang-warnings
on: [push, pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Enable GCC problem matcher
        run: echo "::add-matcher::$RUNNER_TOOL_CACHE/problem-matchers/gcc.json"
      - name: Run big-code-analysis
        run: |
          bca check --paths . \
              --report-format clang-warning \
              --no-fail

If your runner does not ship a GCC matcher, fall back to streaming the lines and re-emitting them as ::warning file=...,line=...:: workflow commands.
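A minimal sketch of that fallback, assuming the line shape shown above (path:line:col: warning: message). The helper names are hypothetical, and the sketch skips the escaping GitHub expects for special characters in workflow-command values, so treat it as a starting point rather than a finished annotator.

```python
import re

# Matches the conventional compiler-warning shape emitted by
# --report-format clang-warning.
CLANG_LINE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<col>\d+): warning: (?P<message>.*)$"
)


def to_workflow_command(line: str):
    """Turn one clang-style warning line into a ::warning command, or None."""
    match = CLANG_LINE.match(line.rstrip("\r\n"))
    if match is None:
        return None  # not a warning line; the caller may pass it through
    return "::warning file={path},line={line},col={col}::{message}".format(
        **match.groupdict()
    )


def convert(lines):
    """Yield a workflow command for every line that parses as a warning."""
    for line in lines:
        command = to_workflow_command(line)
        if command is not None:
            yield command
```

In a workflow step, pipe the bca output into a small wrapper that prints each item of convert(sys.stdin); GitHub picks the commands up from the step's stdout.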

MSVC warning lines (Visual Studio and Windows CI)

bca check --paths src/ \
    --threshold cyclomatic=15 \
    --report-format msvc-warning \
    --output report.txt

The MSVC format emits one offender per line in Visual Studio's cl.exe diagnostic shape:

path\to\file.rs(42,5): warning : cyclomatic 17 exceeds limit 15

Note the space before the colon after warning/error — that is the MSVC convention. On Windows the path is normalized to use \ separators (matching cl.exe output); on other platforms the path is emitted as-is. Visual Studio, VS Code with the C/C++ extension, and Windows CI runners (Azure Pipelines, GitHub Actions on windows-latest) parse these inline without extra configuration.

Suppression markers

In-source suppression markers silence threshold violations without editing the offending function or excluding the file from the walk. Drop a marker in any comment in the source file and bca check treats the covered metrics as if they were within limits for that scope. Metric computation is unaffected — raw bca metrics output still reports every number. Suppression is a measurement-display concern: bca check drops the covered violations from the gate, and bca report markdown|html omits the covered functions from the matching hotspot tables by default (pass bca report --no-suppress for the raw audit view — see report).

Markers exist for the cases where editing the code is not an option: generated-style legacy modules awaiting rewrite, accepted exceptions documented in the comment, and migration from Lizard's #lizard forgives convention.

Native markers (`bca:`)

The native dialect uses the bca: namespace and the suppress verb, matching the project's internal "suppression" vocabulary (SuppressionPolicy, FuncSpace::suppressed, --no-suppress). Four forms:

Marker	Scope	Effect
`bca: suppress`	Enclosing function	Suppress every metric
`bca: suppress(metric, ...)`	Enclosing function	Suppress only the listed metrics
`bca: suppress-file`	File	Suppress every metric
`bca: suppress-file(metric, ...)`	File	Suppress only the listed metrics

A function-scope marker attaches to the innermost FuncSpace (see the FuncSpace rustdoc) whose source range contains the comment. A function-scope marker outside every function body is silently ignored; for file-wide silencing use the explicit suppress-file verb. A file-scope marker may appear anywhere in the source — there is no "must be in first N lines" rule.

`bca: suppress` — function-scoped, all metrics (Rust)

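fn legacy_dispatch(opcode: u8) -> Action {
    // bca: suppress
    // The marker sits inside the body so it attaches to this function;
    // a marker above the signature would fall outside every function.
    // dense match on every supported opcode
    match opcode { /* ... */ }
}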

`bca: suppress(metric, ...)` — function-scoped, listed metrics (Python)

def parse_token_stream(tokens):
    # bca: suppress(cognitive)
    # cognitive complexity is intrinsic to this state machine;
    # cyclomatic is still bounded.
    ...

Other thresholds (cyclomatic, halstead, loc, ...) still apply.

`bca: suppress-file` — file-scoped, all metrics (JavaScript)

// bca: suppress-file
// Hand-tuned hot path; do not rewrite to satisfy thresholds.
function transform(input) { /* ... */ }
function validate(input) { /* ... */ }

`bca: suppress-file(metric, ...)` — file-scoped, listed metrics (C++)

/* bca: suppress-file(halstead) */
// Halstead volume is inflated by the generated tables below; every
// other metric is still enforced file-wide.

Prefer a narrower tool first. Since threshold scope (#969), a metric's file-wide or impl-wide aggregate no longer fires as a per-function limit, so the most common reason suppress-file was reached for — muting a file-level halstead / nargs / nexits / nom total — is gone. Reach for suppress-file only when you genuinely want to silence a metric for every function in the file. To excuse one irreducibly-complex function, use a function-scoped bca: suppress(...) inside it; to grandfather existing offenders without blinding the gate to future regressions, prefer a baseline entry, which keeps firing once a function gets worse than its recorded value.

Lizard compatibility markers

Two Lizard-style markers are recognized verbatim so existing Lizard-instrumented codebases need no rewrites:

Lizard marker	Scope	Equivalent native marker
`#lizard forgives`	Enclosing function	`bca: suppress`
`#lizard forgive global`	File	`bca: suppress-file`

The compatibility layer is intentionally narrow: only these two shapes are accepted. Other Lizard directives parse as ordinary comments. Lizard offers no per-metric scoping, so the native form's bca: suppress(metric, ...) list has no Lizard analogue — every Lizard-style marker silences every metric.

Lizard's GENERATED CODE marker is not handled here; it is part of the generated-code auto-skip mechanism (see Skipping generated code and the --no-skip-generated flag).

Native vs Lizard side by side

Effect	Native form	Lizard form
Silence every metric for one function	`// bca: suppress`	`// #lizard forgives`
Silence one metric for one function	`// bca: suppress(cyclomatic)`	(no equivalent)
Silence every metric for the whole file	`// bca: suppress-file`	`// #lizard forgive global`
Silence one metric for the whole file	`// bca: suppress-file(halstead)`	(no equivalent)

Metric identifiers

The identifiers accepted inside bca: suppress(...) and bca: suppress-file(...) are:

abc, cognitive, cyclomatic, halstead, loc, mi, nargs, nexits, nom, npa, npm, wmc.

These match the threshold names and the JSON field names emitted on CodeMetrics, with one retired alias and one deliberate exclusion:

nexits is the canonical spelling — bca: suppress(nexits) silences a nexits threshold violation. The legacy exit alias was retired in #555 and is no longer accepted; spelling it exit is an unknown identifier, which warns and voids the entire marker (see below).
tokens is a threshold-checkable metric (and a CodeMetrics JSON field) but is deliberately absent from the suppression list: a marker cannot turn it off. Treat tokens as a hard resource cap, not a maintainability heuristic.

Silencing a family (for example halstead) covers every sub-metric threshold under it (halstead.volume, halstead.effort, ...); suppression vocabulary has no dotted form.

Unknown identifiers in a bca: suppress(...) list emit a stderr warning of the form

warning: path/to/file.rs:42: unknown metric 'no_such_metric' in bca suppression marker; known metrics: abc, cognitive, ...

The marker is dropped — a typo never silently widens scope to other metrics. Unknown verbs (anything other than suppress / suppress-file) and malformed bodies (unbalanced parentheses, trailing garbage) produce the same shape of warning and are similarly dropped. None of these are fatal: a typo in one file does not derail a workspace walk.

Where markers may appear

A marker is recognized inside any source comment, regardless of comment style. The scanner strips the following delimiter characters from both ends of the comment text before matching: /, *, !, #, ;, -, and ASCII whitespace. That covers every comment shape bca parses today:

C-family line comments: // bca: suppress
C-family block comments: /* bca: suppress */
Rust inner doc comments: //! bca: suppress and /*! bca: suppress */
Python / shell / Ruby / Perl # comments: # bca: suppress
Lisp / Lua / SQL line comments: ;; bca: suppress, -- bca: suppress

Function-scope markers attach to the innermost Function-kind FuncSpace whose (start_line..=end_line) range contains the comment's line. Markers buried in a class or struct body but outside every method are silently ignored — for class-wide silencing use bca: suppress-file or repeat the marker on each method.
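The attachment rule can be sketched as a recursive search over a space tree. The sketch below assumes each space is a dict with the kind, start_line, end_line, and spaces keys that the JSON output uses; the function name is hypothetical and this is not the library's own lookup.

```python
def innermost_function(space: dict, line: int):
    """Return the deepest function-kind space containing `line`, or None."""
    if not (space["start_line"] <= line <= space["end_line"]):
        return None
    # Prefer a match in a child: children are nested inside the parent range,
    # so any hit there is more specific than the current space.
    for child in space.get("spaces", []):
        found = innermost_function(child, line)
        if found is not None:
            return found
    # No child claimed the line; only a function-kind space may own a marker.
    return space if space.get("kind") == "function" else None
```

A comment inside a class body but between two methods falls through every child and lands on the class itself, which is not function-kind, so the search returns None. That is the "silently ignored" case described above.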

File-scope markers are merged into the top-level Unit space and apply to every function in the file regardless of nesting.

Position the marker near the start of the comment. The scanner trims delimiter characters from both ends and then expects bca: (or #lizard) at the very front; markers buried deep in a multi-line block comment will not be recognized.
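To make the recognition rules concrete, here is a small Python sketch of a native-marker parser. It is an approximation built from the rules in this section, not the real scanner: the function name, the regular expression, and the returned tuple shape are all hypothetical, optional whitespace after bca: is an assumption, and the Lizard dialect is left out.

```python
import re

KNOWN_METRICS = frozenset({
    "abc", "cognitive", "cyclomatic", "halstead", "loc", "mi",
    "nargs", "nexits", "nom", "npa", "npm", "wmc",
})
# Comment delimiters plus ASCII whitespace, trimmed from both ends.
DELIMITERS = "/*!#;-" + " \t\n\r\f\v"
MARKER = re.compile(r"^bca:\s*(?P<verb>[a-z-]+)\s*(?:\((?P<body>[^()]*)\))?$")


def parse_marker(comment: str):
    """Return (scope, metrics) for a valid marker, else None.

    scope is "function" or "file"; metrics is None for "every metric".
    """
    text = comment.strip(DELIMITERS)
    match = MARKER.match(text)  # bca: must sit at the very front
    if match is None:
        return None  # ordinary comment, or a malformed body
    verb = match["verb"]
    if verb not in ("suppress", "suppress-file"):
        return None  # unknown verb: the marker is dropped
    scope = "file" if verb == "suppress-file" else "function"
    if match["body"] is None:
        return scope, None
    metrics = [item.strip() for item in match["body"].split(",")]
    if any(item not in KNOWN_METRICS for item in metrics):
        return None  # one unknown identifier voids the whole marker
    return scope, metrics
```

The real tool also prints a stderr warning when it drops a marker; the sketch only models the accept-or-drop decision.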

`--no-suppress` (CI auditing)

bca check --no-suppress ignores every suppression marker — native and Lizard alike — and reports every threshold violation in the walk. Use it in audit pipelines that need the raw, un-silenced offender list:

bca check --paths src/ --no-suppress

The flag has no effect on metric values themselves: raw bca metrics output always reports every number. bca report markdown|html honours markers in its hotspot tables by default and accepts its own --no-suppress flag for the same raw audit view.

Surfacing suppressed debt (`--report-suppressed`)

Suppression keeps an offender out of the gate, which also keeps it out of the --report-format document, so a suppressed module disappears from the code-scan report entirely. bca check --report-suppressed puts it back, as suppressed rather than active:

bca check --report-format sarif --no-fail --report-suppressed \
    --tier=soft=0.95 --output bca.sarif

Offenders silenced by an in-source marker or covered by the baseline are emitted into the SARIF document with a SARIF suppressions entry — kind: "inSource" for markers, kind: "external" for the baseline. The suppression never fails the gate (exit code and the human stderr stream are unaffected); the suppressions entry lets downstream tooling tell suppressed debt apart from active offenders.
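Downstream tooling can separate the two groups by looking at each result's suppressions array. The sketch below relies only on the standard SARIF layout (runs, results, suppressions, kind); the function name is hypothetical, and the ruleId values in any sample you feed it are whatever bca emitted.

```python
def split_results(sarif: dict):
    """Partition SARIF results into (active, suppressed) lists.

    Each suppressed entry is paired with its suppression kinds, e.g.
    ["inSource"] for a marker or ["external"] for a baseline entry.
    """
    active, suppressed = [], []
    for run in sarif.get("runs", []):
        for result in run.get("results", []):
            kinds = [s.get("kind") for s in result.get("suppressions", [])]
            if kinds:
                suppressed.append((result, kinds))
            else:
                active.append(result)
    return active, suppressed
```

Load the file written by --output with json.load and pass the resulting dict in; a dashboard can then count suppressed debt separately from active offenders.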

GitHub Code Scanning caveat. GitHub does not honor the SARIF suppressions property natively — it ingests suppressed results as open alerts, not closed ones. To dismiss them on the Security tab you need a follow-up step such as the advanced-security/dismiss-alerts action, which reads suppressions[] and dismisses the matching alerts. If you only want active offenders to appear, omit --report-suppressed from the upload (this repo's own Pages workflow does exactly that).

Notes:

Only the SARIF format represents suppression; other --report-format values ignore the flag and emit the active offenders alone.
Pair it with --tier=soft=0.95 (matching your baseline's provenance) so baseline-covered offenders that sit below the hard limit still appear.
Mutually exclusive with --no-suppress (which un-silences markers to show the raw offender list) and --write-baseline.

Auditing exemptions (`bca exemptions`)

--no-suppress shows you the offenders a marker silences, but not the markers themselves — to find every silencer you previously had to diff a --no-suppress run against a normal one. bca exemptions replaces that workaround with a direct listing of everything the bca check gate skips, across all three exemption tiers, in one report:

Tier	Granularity	Source
In-source markers	per-function / per-file	`bca: suppress`, `#lizard forgives`, …
`[check.exclude]` globs	per-glob (categories of files)	`bca.toml` `[check] exclude` / `--check-exclude`
Baseline entries	per-`(path, symbol, metric)`	`.bca-baseline.toml`

# List every exemption in the tree (in-source markers honour
# [walker.exclude] just like every other walking command).
bca exemptions --paths src/

# In-source markers (2)
  src/parser.rs:120  bca: suppress       metrics=all  parse_long
  src/lib.rs:1       bca: suppress-file  metrics=halstead  (whole file)

# [check.exclude] globs (1)
  tests/**

# Baseline (.bca-baseline.toml, 1 entry)
  src/markdown_report.rs:88 write_language_section cognitive 29

The surrounding function (for function-scoped markers) gives scope context; file-scoped markers read (whole file), and a function-scoped marker written outside any function — which silences nothing — reads (no enclosing fn) so dead markers are visible.

Formats and section filters

--format markdown emits tables for PR comments; --format json nests all three tiers under a single suppressions envelope for dashboards and jq filtering:

bca exemptions --paths src/ --format json | jq '.suppressions.markers[] | select(.dialect == "lizard")'

In the JSON form, a section filtered out by a --*-only flag is null, while a requested-but-empty section is [], so filters can tell the two apart.
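A consumer can lean on that distinction when summarizing the report. In this sketch only the suppressions envelope and its markers key come from the example above; the function name is hypothetical, and the code deliberately iterates over whatever section keys are present instead of assuming the names of the other two tiers.

```python
def describe_sections(report: dict) -> dict:
    """Summarize each tier under the `suppressions` envelope."""
    summary = {}
    for name, section in report["suppressions"].items():
        if section is None:
            summary[name] = "not requested"      # filtered out by a --*-only flag
        elif len(section) == 0:
            summary[name] = "requested, empty"   # tier was scanned, nothing found
        else:
            summary[name] = f"{len(section)} entries"
    return summary
```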

The mutually-exclusive --markers-only / --excludes-only / --baseline-only flags narrow the report to a single tier for PR-bot specialisation (e.g. a bot that only comments on newly-added in-source markers). The baseline (bca.toml top-level baseline) and [check.exclude] ([check] exclude) inputs default to the same sources bca check reads, so the audit reflects exactly what the gate would skip; override the baseline with --baseline <path>.

The earlier --only-markers / --only-excludes / --only-baseline spellings remain as hidden aliases for one release cycle to keep existing PR-bot invocations working; prefer the --<section>-only forms, which match the diff-baseline section filters.

Unlike bca check, bca exemptions is informational and always exits 0 on success — it is a review surface, not a gate.

See also the Baselines recipe for using bca exemptions alongside bca diff-baseline during PR review.

JSON output

FuncSpace exposes the merged suppression scope as the optional suppressed field in its JSON output. When no marker applies to a space the field is elided so existing snapshot consumers see no change. When a marker fires the field carries one of two shapes:

{ "suppressed": { "kind": "all" } }

{ "suppressed": { "kind": "some", "metrics": ["cognitive", "loc"] } }

kind: all corresponds to a bare marker (bca: suppress, bca: suppress-file, or any Lizard-style marker). kind: some carries the explicit metric list from bca: suppress(...) / bca: suppress-file(...). Both shapes are stable serialization output suitable for dashboards and audit logs.
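A dashboard that wants to know whether a given threshold is covered can read the field as follows. This is a sketch over the two shapes shown above: the function name is hypothetical, the family lookup follows the rule that suppression vocabulary has no dotted form, and treating tokens as never suppressible is this sketch's reading of the Metric identifiers section.

```python
# Per the metric-identifier rules, a marker cannot turn off `tokens`.
NOT_SUPPRESSIBLE = frozenset({"tokens"})


def is_suppressed(space: dict, threshold_name: str) -> bool:
    """Report whether `threshold_name` is covered by the space's marker."""
    family = threshold_name.split(".", 1)[0]  # halstead.volume -> halstead
    if family in NOT_SUPPRESSIBLE:
        return False
    marker = space.get("suppressed")
    if marker is None:
        return False  # field elided: no marker applies to this space
    if marker["kind"] == "all":
        return True
    return family in marker.get("metrics", [])
```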

Migrating from Lizard

The compatibility layer means migration is incremental:

Existing #lizard forgives and #lizard forgive global markers continue to work with no change. bca check honors them out of the box.
Rewrite to the native form opportunistically. bca: suppress(...) gives per-metric scoping (the Lizard form silences everything) and is the form future audit-trail features will extend.

The project will keep the Lizard compatibility layer indefinitely; there is no removal date.

Reserved syntax

These shapes are reserved for future use and are not parsed today:

bca: suppress(metric, reason = "...") — audit-trail prose alongside the metric list, mirroring Rust's reason = "…" attribute argument.
bca: suppress-next — silence the immediately following declaration rather than the enclosing function.

Authors should avoid using either form today: a reason = "..." argument is currently parsed as an unknown metric identifier and discarded with a stderr warning, and bca: suppress-next is rejected as an unknown verb. Both will be promoted to first-class behavior in a future release without breaking existing markers.

Nodes

bca provides commands to analyze and extract information about nodes in the Abstract Syntax Tree (AST) of a source file.

Migrating? The verbs below replace the pre-restructure flag actions (-d, -f, --count, ...). See the migration guide.

Error detection

To detect syntactic errors in your code, run:

bca find -t ERROR -I "*.ext" /path/to/your/file/or/directory

[PATHS]... / -p, --paths: file or directory to analyze (analyzes all files when given a directory). Paths are given positionally or via --paths; both are unioned. Flags follow the subcommand.
-t, --type: the node type to match. Repeat the flag for several types (-t function_item -t struct_item); at least one is required. A string value matches the node-type name exactly (for example function_item). A purely numeric value is instead interpreted as a raw tree-sitter kind_id and matches nodes whose internal symbol id equals that number (so -t 0 matches the end/ERROR sentinel). The numeric form is an escape hatch for grammar inspection and is unstable: a kind_id is an index into the grammar's symbol table, so the same number names a different node after a grammar-version bump. Prefer the string form unless you specifically need a kind that has no stable name.
-I, --include: glob filter for selecting files by extension (e.g. *.js, *.rs). Each -I takes exactly one value, so a following positional path is never swallowed.

Counting nodes

Count occurrences of one or more node types with the count command:

bca count -t <NODE_TYPE> [-t <NODE_TYPE>...] -I "*.ext" \
    /path/to/your/file/or/directory

Printing the AST

To visualize the AST of a source file, use the dump command (which requires an explicit path — a whole-tree AST dump is never useful):

bca dump /path/to/your/file/or/directory

Analyzing code portions

To analyze only a specific portion of the code, use the dump subcommand's --line-start and --line-end options. For example, to print the AST of a single function from line 5 to line 10:

bca dump --line-start 5 --line-end 10 /path/to/your/file/or/directory

These flags are specific to dump and find, so they must follow the subcommand. The short --ls / --le spellings still work as deprecated aliases but are slated for removal in the next major release.

Listing functions

For a list of every function or method and its line span, use:

bca functions /path/to/your/file/or/directory

REST API

bca-web is a web server that allows users to analyze source code through a REST API. This service is useful for anyone looking to perform code analysis over HTTP.

The server can be run on any host and port, and supports the following main functionalities:

Remove Comments from source code.
Retrieve Function Spans for given code.
Retrieve the AST (abstract syntax tree) for given code.
Compute Metrics for the provided source code.

Running the Server

To run the server, you can use the following command:

bca-web --host 127.0.0.1 --port 9090

--host specifies the IP address where the server should run (default is 127.0.0.1).
--port specifies the port to be used (default is 8080).
-j specifies the number of parallel jobs (optional).
--cors enables CORS for browser-based tooling (off by default).

For the full flag set, environment variables, resource limits, and the trust boundaries to respect before exposing the daemon, see Operating bca-web.

CORS

By default bca-web emits no CORS headers: a browser script served from a different origin cannot read the API's responses. This keeps a local bca-web (the default 127.0.0.1 bind) from exposing its repository paths and metrics to any website the operator happens to be visiting.

Pass --cors to opt in. The argument is an explicit, comma-separated allow-list of origins; only those origins receive an Access-Control-Allow-Origin header, and the matched origin is echoed back verbatim (a request from any other origin gets no header and is blocked by the browser):

bca-web --cors https://app.example,https://tools.example

To answer every origin with Access-Control-Allow-Origin: *, pass a literal *:

bca-web --cors '*'

A wide-open * exposes the server's metrics and repository paths to any origin, so use it only on trusted networks.
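The allow-list behavior described above reduces to a small decision function. The following Python sketch models only the Access-Control-Allow-Origin choice; the function name is hypothetical, exact string comparison of origins is an assumption, and the server's real matching may differ in details such as case handling.

```python
def allow_origin_header(cors_arg, request_origin):
    """Return the Access-Control-Allow-Origin value, or None for no header.

    cors_arg is the raw --cors argument (None when the flag is absent).
    """
    if cors_arg is None:
        return None  # CORS off by default: no header at all
    if cors_arg.strip() == "*":
        return "*"   # wide open: every origin is answered with *
    allowed = {origin.strip() for origin in cors_arg.split(",") if origin.strip()}
    # A listed origin is echoed back verbatim; anything else gets no header.
    return request_origin if request_origin in allowed else None
```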

When CORS is enabled, a preflight OPTIONS request is answered 204 No Content with Access-Control-Allow-Origin, Access-Control-Allow-Methods (the resource's own accepted methods, the same set the Allow header advertises), and Access-Control-Allow-Headers (echoing the request's Access-Control-Request-Headers, or Content-Type, Accept for a bare probe). The API has no authentication or cookies, so Access-Control-Allow-Credentials is never sent.

API Versioning

All endpoints are mounted under a /v1 prefix (for example /v1/metrics). The full route set is /v1/ping, /v1/version, /v1/languages, /v1/ast, /v1/comment, /v1/function, /v1/metrics, /v1/vcs, /v1/vcs/trend, /v1/vcs/jit, and the route index /v1. To discover them programmatically, GET /v1 (see Route index).

The unprefixed paths (/metrics, /comment, /ast, /, …) that earlier 1.x releases served as deprecated aliases were removed in 2.0; requesting one now returns 404. Use the /v1 form everywhere.

Error responses

Errors are reported with an HTTP status code, not inside a 200 body. Every error — on the JSON endpoints, the raw/octet-stream endpoints, and the 415/405/404 fallbacks alike — returns one uniform machine-readable JSON body so clients parse a single error shape regardless of the success content-type:

{
  "error": "human-readable message",
  "error_kind": "stable_machine_token",
  "id": "echoed-request-id"
}

error is the specific human-readable cause; error_kind is a stable snake_case machine token (e.g. unknown_field, unsupported_language, bad_request, parse_timeout) so clients branch on the cause without string-matching the prose (issue #631). The token vocabulary is closed and governed by STABILITY.md.

The id key is always present. It carries the client-supplied correlation id when the request had one (the JSON endpoints), and an empty string otherwise (the octet-stream / query endpoints carry no id, and the content-type / method / not-found fallbacks — and any request whose body failed to parse before the id was read — have no parsed id to echo).

Status codes:

400 Bad Request — a malformed body or query parameter: invalid JSON, a missing required field, an unrecognised key (the strict deny_unknown_fields parse, including the removed unit flag — see Compute Metrics below), or a scope value that is not full / file.
422 Unprocessable Entity — the file_name extension (and content sniffing) maps to no supported language. The route matched and the body parsed; only the submitted entity cannot be processed. The response carries the stable machine token "error_kind": "unsupported_language"; query GET /v1/languages for the supported set. (Before 2.0 this was a 404, indistinguishable from an unknown URL; see issue #634.)
404 Not Found — the URL matches no endpoint.
415 Unsupported Media Type — a known POST endpoint received a Content-Type that is neither application/json nor application/octet-stream (a charset parameter is allowed).
405 Method Not Allowed — a known endpoint was called with the wrong HTTP method (the analysis endpoints are POST-only; /ping, /version, and /languages are GET-only).
406 Not Acceptable — the request Accept header named only media types the server cannot produce. The structured analysis endpoints serve application/json, application/yaml, and application/cbor; any other concrete type (e.g. application/xml) is a 406 carrying the not_acceptable token. See Content negotiation.
413 Payload Too Large — the request body exceeded the server limit.
500 Internal Server Error — metric computation or AST construction failed for an otherwise-valid request, or a /vcs history walk failed on the server side.
503 Service Unavailable — the parse pool is saturated by orphaned (timed-out) tasks; retry later.
504 Gateway Timeout — the parse (or history walk) exceeded the server's configured deadline.
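On the client side, the uniform envelope means one small helper can handle every failure. The sketch below is an illustration, not a shipped client: the function name and the returned dict shape are hypothetical, and marking only 503 as retryable reflects the "retry later" note above rather than a documented retry policy.

```python
import json

# Only 503 is described as "retry later" in the status list.
RETRYABLE_STATUSES = frozenset({503})


def parse_error(status: int, body: bytes) -> dict:
    """Decode the uniform {error, error_kind, id} envelope of a failed call."""
    envelope = json.loads(body)
    return {
        "status": status,
        "kind": envelope["error_kind"],        # stable token: branch on this
        "message": envelope["error"],          # prose: log it, do not match it
        "request_id": envelope["id"] or None,  # empty string means no id
        "retryable": status in RETRYABLE_STATUSES,
    }
```

Callers can then switch on the kind field (for example unsupported_language) without string-matching the human-readable message.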

Content negotiation

The structured analysis endpoints — /v1/ast, /v1/comment (JSON variant), /v1/function, /v1/metrics, /v1/vcs, /v1/vcs/trend, and /v1/vcs/jit — choose their response serialization from the request Accept header, mirroring the CLI's -O json|yaml|cbor outputs. The same value serializes byte-for-byte identically whether it comes from the CLI or the server.

`Accept` value	Response `Content-Type`
absent, `*/*`, `application/*`, `application/json`	`application/json`
`application/yaml` (or `text/yaml`, `application/x-yaml`)	`application/yaml`
`application/cbor`	`application/cbor`
any other concrete type	`406 Not Acceptable`

Rules:

JSON is the default. A request with no Accept header, or one that includes */* / application/* / application/json, gets JSON — the same body and Content-Type earlier releases always returned, so existing clients need no change.
q weights are honored. Among the supported types the highest q-weighted entry wins (Accept: application/json;q=0.5, application/yaml;q=0.9 returns YAML); q=0 refuses a type. Ties keep the first-listed entry.
Unsupported types are a 406, not a silent JSON fallback. The body is the uniform {error, error_kind, id} envelope with error_kind: "not_acceptable", and the message lists the supported media types.
Only structured serializations are offered. TOML and CSV are excluded: TOML is awkward for the deeply nested space tree and CSV is flat/tabular. The error-envelope body and the /v1/comment octet-stream variant (raw byte-in / byte-out) are always JSON / raw bytes respectively and do not negotiate. The introspection routes (/v1, /v1/version, /v1/languages) return JSON metadata only.

# YAML metrics
curl --silent \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/yaml' \
  --data '{"file_name": "foo.py", "code": "def f():\n    pass\n"}' \
  http://127.0.0.1:8080/v1/metrics

# CBOR metrics (binary; pipe to a decoder)
curl --silent \
  --header 'Content-Type: application/json' \
  --header 'Accept: application/cbor' \
  --data '{"file_name": "foo.py", "code": "def f():\n    pass\n"}' \
  http://127.0.0.1:8080/v1/metrics --output metrics.cbor
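The negotiation rules listed above can be sketched as a short Python function. It is a simplified model for reasoning about which serialization a given Accept header should yield, not the server's parser: the function name is hypothetical, and treating a header that only refuses types (q=0) as a 406 is an assumption.

```python
# Accept entries the server can satisfy, mapped to the response type.
MEDIA_TYPES = {
    "*/*": "application/json",
    "application/*": "application/json",
    "application/json": "application/json",
    "application/yaml": "application/yaml",
    "text/yaml": "application/yaml",
    "application/x-yaml": "application/yaml",
    "application/cbor": "application/cbor",
}


def negotiate(accept):
    """Return the response Content-Type, or None to signal 406."""
    if accept is None or not accept.strip():
        return "application/json"  # JSON is the default
    best, best_q = None, 0.0
    for entry in accept.split(","):
        parts = [part.strip() for part in entry.split(";")]
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0  # unparsable weight: treat as refused
        target = MEDIA_TYPES.get(parts[0].lower())
        if target is None or quality <= 0:
            continue  # unsupported type, or explicitly refused with q=0
        if quality > best_q:  # strict comparison keeps the first-listed on ties
            best, best_q = target, quality
    return best
```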

Endpoints

1. Ping the Server

Use this endpoint to check if the server is running.

Request:

GET http://127.0.0.1:8080/v1/ping

Response:

Status Code: 200 OK
Body: empty.

Use curl -sf http://127.0.0.1:8080/v1/ping && echo ok to script a liveness check — -f makes curl exit non-zero on any HTTP error.

2. Remove Comments

This endpoint removes comments from the provided source code. It accepts two Content-Type variants. Use application/octet-stream for raw byte-in / byte-out, and application/json for a JSON envelope.

Request:

POST http://127.0.0.1:8080/v1/comment

Payload:

{
  "id": "unique-id",
  "file_name": "filename.ext",
  "code": "source code with comments"
}

id: A unique identifier for the request. Optional (issue #645); omitting it defaults to an empty string (treated as "no correlation id").
file_name: The name of the file being analyzed.
code: The source code with comments.

Response (JSON variant):

{
  "id": "unique-id",
  "language": "cpp",
  "code": "print"
}

The response envelope reports id, the detected language (the canonical lowercase slug — see Compute Metrics below), and the code result key. The code field is a string holding the stripped source: the request code arrived as a JSON string, so the stripped output is guaranteed valid UTF-8 and is handed back as a string, matching the request and every other JSON endpoint. The application/octet-stream variant returns the stripped source as the raw response body (no envelope), which is the correct home for binary-faithful round-trips and simpler for shell pipelines; its errors still use the uniform JSON error body above.

When the source contains no removable comments, both variants signal the empty result with a 200 status and an empty payload: the JSON variant returns "code": "" (an empty string) and the octet-stream variant returns an empty body. The status code and envelope shape are therefore identical regardless of the requested Content-Type; the octet-stream variant returns an empty 200 body rather than 204 No Content.

3. Retrieve Function Spans

This endpoint retrieves the spans of functions in the provided source code.

Request:

POST http://127.0.0.1:8080/v1/function

Payload:

{
  "id": "unique-id",
  "file_name": "filename.ext",
  "code": "source code with functions"
}

id: A unique identifier for the request. Optional (issue #645); omitting it defaults to an empty string (treated as "no correlation id").
file_name: The name of the file being analyzed.
code: The source code with functions.

Response:

{
  "id": "unique-id",
  "language": "cpp",
  "spans": [
    {
      "name": "function_name",
      "start_line": 1,
      "end_line": 10
    }
  ]
}

The envelope reports id, the detected language slug, and the spans result key. name is null when the parser could not resolve the function's name from the AST (e.g. an anonymous or malformed definition). A null name is the malformed-span signal.

4. Retrieve the AST

This endpoint returns the full tree-sitter abstract syntax tree (AST) for the provided source code as a recursive JSON node tree.

Request:

POST http://127.0.0.1:8080/v1/ast

Payload:

{
  "id": "unique-id",
  "file_name": "filename.ext",
  "code": "source code to parse",
  "comment": false,
  "span": true
}

id: A unique identifier for the request. Optional; omitting it defaults to an empty string (treated as "no correlation id").
file_name: The name of the file being analyzed.
code: The source code to parse.
comment: When true, comment nodes are omitted from the tree. Optional; defaults to false.
span: When true, each node carries its source span; when false, span is null. Optional; defaults to false.

id, comment, and span are optional and default as noted above (issue #645); file_name and code are required. Unknown keys are rejected with 400 (issue #633).

Response:

{
  "id": "unique-id",
  "language": "rust",
  "root": {
    "type": "source_file",
    "value": "",
    "span": { "start_line": 1, "start_col": 1, "end_line": 2, "end_col": 1 },
    "field_name": null,
    "children": [
      {
        "type": "function_item",
        "value": "",
        "span": { "start_line": 1, "start_col": 1, "end_line": 1, "end_col": 13 },
        "field_name": null,
        "children": [
          {
            "type": "identifier",
            "value": "main",
            "span": { "start_line": 1, "start_col": 4, "end_line": 1, "end_col": 8 },
            "field_name": "name",
            "children": []
          }
        ]
      }
    ]
  }
}

The envelope reports id, the detected language slug, and root — the root AST node. Each node carries:

type: the tree-sitter grammar node kind (grammar-specific; the language slug tells you which grammar produced it).
value: the source text for leaf/named tokens (empty for interior nodes).
span: a { start_line, start_col, end_line, end_col } object (all 1-based), or null when span was false. These span keys use the *_line vocabulary shared with /function and /metrics (issue #638 renamed the former *_row keys).
field_name: the tree-sitter grammar field through which the parent reaches this node (e.g. name, left, body), or null for the root, anonymous tokens, and unfielded children.
children: the node's child nodes, recursively.

Unlike the metric and function endpoints, the AST endpoint reports node coordinates over the exact bytes the client submitted — the source is not EOL-normalised, so spans line up with the client's own copy (issue #640).
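Since the response is a plain recursive tree, a client can search it with a few lines of code. The sketch below walks the root object of a /v1/ast response and collects nodes of a given type, much as bca find does on the command line; the function name is hypothetical.

```python
def find_nodes(node: dict, node_type: str) -> list:
    """Collect every node whose `type` equals `node_type`, depth first."""
    matches = []
    if node["type"] == node_type:
        matches.append(node)
    for child in node.get("children", []):
        matches.extend(find_nodes(child, node_type))
    return matches
```

Applied to the example response above with node_type set to identifier, it returns the single node whose value is main and whose field_name is name.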

5. Compute Metrics

This endpoint computes various metrics for the provided source code.

Request:

POST http://127.0.0.1:8080/v1/metrics

Payload:

{
  "id": "unique-id",
  "file_name": "filename.ext",
  "code": "source code for metrics",
  "scope": "full"
}

id: Unique identifier for the request. Optional (issue #645); omitting it defaults to an empty string (treated as "no correlation id").
file_name: The filename of the source code file.
code: The source code to analyze.
scope: How much of the space tree to return. full (the default) returns the complete nested space tree: the file-level root plus a recursive spaces list for every function, class, and other unit. file returns only the file-level root with its spaces children cleared. This field replaces the pre-2.0 boolean unit flag (issue #638); sending the old unit key now fails with 400.

The payload is validated strictly: an unrecognised key (a typo, or the removed unit) is rejected with 400 and the uniform JSON error body, naming the offending field (issue #633).

On the application/octet-stream variant, the source is the raw request body and scope is supplied as a query parameter (?file_name=…&scope=full). When the parameter is omitted it defaults to full; an unrecognised value is rejected with 400.
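The two request variants can be built with the Python standard library as follows. This is a sketch: the helper names are hypothetical and BASE_URL assumes the default host and port. The functions only construct the request objects; pass one to urllib.request.urlopen to actually send it to a running bca-web.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://127.0.0.1:8080/v1"  # assumes the default --host / --port


def metrics_json_request(file_name, code, scope="full", request_id=None):
    """Build the application/json variant of POST /v1/metrics."""
    payload = {"file_name": file_name, "code": code, "scope": scope}
    if request_id is not None:
        payload["id"] = request_id  # optional correlation id
    return Request(
        f"{BASE_URL}/metrics",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def metrics_raw_request(file_name, source, scope="full"):
    """Build the application/octet-stream variant: raw body, query parameters."""
    query = urlencode({"file_name": file_name, "scope": scope})
    return Request(
        f"{BASE_URL}/metrics?{query}",
        data=source,  # raw source bytes, sent unchanged
        headers={"Content-Type": "application/octet-stream"},
        method="POST",
    )
```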

Response:

{
  "id": "unique-id",
  "language": "rust",
  "root": {
    "name": "sample.rs",
    "start_line": 1,
    "end_line": 7,
    "kind": "unit",
    "spaces": [
      {
        "name": "double",
        "start_line": 1,
        "end_line": 7,
        "kind": "function",
        "spaces": [],
        "metrics": {
          "cyclomatic": { "sum": 2, "average": 2.0, "min": 2, "max": 2 },
          "loc": { "sloc": 7, "ploc": 7, "lloc": 1, "cloc": 0, "blank": 0 },
          "nom": { "functions": 1, "closures": 0, "total": 1 }
        }
      }
    ],
    "metrics": { "...": "the same metric block, aggregated over the file" }
  }
}

The response envelope reports id, the detected language slug, and root — the single file-level space object (issue #638 renamed this key from the misleading plural spaces). root carries name (the request file_name), the start_line / end_line span, a kind discriminator (unit, function, class, …), its metrics block, and a recursive spaces list of child units. The example above is trimmed: each real metrics block contains every metric family — cyclomatic, cognitive, halstead, loc, nom, nargs, nexits, tokens, mi, abc, and wmc — with the same nested fields the bca CLI emits. When scope is file, root.spaces is an empty list.

The language value is the canonical lowercase slug (e.g. rust, cpp, csharp, tsx) — the same token the language vocabulary accepts — not a human-pretty display name. Every analysis endpoint (/ast, /comment, /function, /metrics) reports this language field so clients can confirm which grammar was selected.
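The recursive root / spaces shape can be walked with a short generator. This is a sketch over a trimmed copy of the example response above (metrics omitted); walk_spaces is an illustrative helper name, not part of any bca client.

```python
def walk_spaces(space, depth=0):
    """Yield (depth, kind, name, start_line, end_line) for a space and its children."""
    yield depth, space["kind"], space["name"], space["start_line"], space["end_line"]
    for child in space.get("spaces", []):
        yield from walk_spaces(child, depth + 1)


# Trimmed copy of the example /v1/metrics response.
response = {
    "id": "unique-id",
    "language": "rust",
    "root": {
        "name": "sample.rs",
        "start_line": 1,
        "end_line": 7,
        "kind": "unit",
        "spaces": [
            {
                "name": "double",
                "start_line": 1,
                "end_line": 7,
                "kind": "function",
                "spaces": [],
            }
        ],
    },
}

rows = list(walk_spaces(response["root"]))
for depth, kind, name, start, end in rows:
    print(f"{'  ' * depth}{kind} {name} [{start}-{end}]")
# prints:
# unit sample.rs [1-7]
#   function double [1-7]
```

With scope set to file, root.spaces is empty and the generator yields only the file-level row.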

6. Server and Library Version

Reports the running server version and the version of the big-code-analysis library it was built against.

Request:

GET http://127.0.0.1:8080/v1/version

Response:

{
  "server": "2.0.0",
  "library": "2.0.0"
}

7. Supported Languages

Lists the supported languages and their registered file extensions. The names are the canonical lowercase slugs; the list and extensions are sourced from the library's language table, never hardcoded.

Request:

GET http://127.0.0.1:8080/v1/languages

Response:

{
  "languages": [
    { "name": "cpp", "extensions": ["cpp", "cc", "hpp", "..."] },
    { "name": "rust", "extensions": ["rs"] }
  ]
}

Like every endpoint, /version and /languages are served only under the /v1 prefix; the unprefixed 1.x aliases were removed in 2.0 (see API Versioning).
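A client that wants to predict which grammar a file will select could invert the /v1/languages response into an extension lookup. The sketch below uses a sample shaped like the response above; extension_map is an illustrative name, and letting a later entry win on a duplicate extension is an assumption of this sketch, not documented server behaviour.

```python
def extension_map(languages_response):
    """Map each registered extension to the slug of the language that lists it."""
    mapping = {}
    for entry in languages_response["languages"]:
        for ext in entry["extensions"]:
            mapping[ext] = entry["name"]
    return mapping


# Sample shaped like the /v1/languages response (extension lists shortened).
sample = {
    "languages": [
        {"name": "cpp", "extensions": ["cpp", "cc", "hpp"]},
        {"name": "rust", "extensions": ["rs"]},
    ]
}

by_ext = extension_map(sample)
print(by_ext["rs"], by_ext["cc"])  # prints: rust cpp
```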

8. Route index

Returns a machine-readable index of every registered route — its path, the HTTP methods it accepts, and a one-line description — so clients can discover the API surface without scraping this chapter. The index is generated from the same route table the server registers, so it cannot drift from the live routing.

Request:

GET http://127.0.0.1:8080/v1

Response:

{
  "service": "bca-web",
  "version": "2.0.0",
  "routes": [
    { "path": "/v1", "methods": ["GET", "HEAD"], "description": "This route index." },
    { "path": "/v1/metrics", "methods": ["POST"], "description": "Compute maintainability metrics for the source." }
  ]
}

service is always bca-web; version matches the server field of GET /v1/version. The unprefixed root / that earlier releases served as this endpoint's alias was removed in 2.0.
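One possible use of the route index is a capability check before calling an endpoint. A minimal sketch, assuming the response shape shown above; route_allows is a hypothetical helper name.

```python
def route_allows(index, path, method):
    """Return True when the route index lists `path` and it accepts `method`."""
    for route in index["routes"]:
        if route["path"] == path:
            return method.upper() in route["methods"]
    return False


# Sample copied from the example response above (descriptions omitted).
index = {
    "service": "bca-web",
    "version": "2.0.0",
    "routes": [
        {"path": "/v1", "methods": ["GET", "HEAD"]},
        {"path": "/v1/metrics", "methods": ["POST"]},
    ],
}

print(route_allows(index, "/v1/metrics", "post"))  # prints: True
print(route_allows(index, "/v1/metrics", "GET"))   # prints: False
```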

Change-history (VCS) metrics

Three endpoints expose the change-history (version-control) metrics — the same numbers bca vcs computes from the CLI. Unlike every other endpoint, these analyse a git repository already present on the server's filesystem rather than source code carried in the request body: VCS metrics derive from commit history, which has no in-request representation.

Operator warning — repo_path is a trust boundary. The repo_path field is a server-side filesystem path. These endpoints make the server walk any git repository it can read and return that repository's relative file paths, churn, and author signals. This is materially different from the source-in-body endpoints, which only ever see code the client sends. The optional cache_dir field is a second caller-supplied server-side path that grants a write capability: with caching enabled (the default), the server creates directories and writes JSON cache files under it (<cache_dir>/<repo>/<head_sha>.json), so a caller controlling cache_dir can direct the server to write cache files at any path the server process can write to. (cache_dir is accepted only by /vcs; /vcs/trend and /vcs/jit do not cache.) The endpoint's filesystem reach is therefore an arbitrary read of any readable git repository and an arbitrary write of cache files under any writable path. Do not expose /vcs, /vcs/trend, or /vcs/jit to untrusted clients without an authorization layer in front of bca-web. The default 127.0.0.1 bind keeps them local. Each walk runs under the same parse-timeout and blocking-pool guard as the analysis endpoints.

All three endpoints are POST-only, accept application/json, echo the request id, and report errors with the uniform {error, error_kind, id} body (its error_kind tokens are the vcs_* family, e.g. vcs_not_a_repository, vcs_invalid_window). A client mistake is a 400: a repo_path that does not exist or is not a git working tree (both carry vcs_not_a_repository), an unresolvable ref/commit, a malformed or non-diff diff, or a malformed window / timestamp / formula / file-type / threshold / trend parameter. A failure of the history walk itself is a 500. A nonexistent repo_path is usually a typo, the most common client error here, so it answers 400 like a path that exists but is not a repository, not a 500 (issue #653).
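The 400 / 500 split and the uniform error body can be handled in one place on the client. The sketch below is illustrative: explain_vcs_error is a hypothetical helper, and the error message text in the sample body is made up for the example (only the three keys and the vcs_not_a_repository token come from this chapter).

```python
import json


def explain_vcs_error(status, raw_body):
    """Turn a VCS error response into a one-line, log-friendly summary."""
    body = json.loads(raw_body)
    if status == 400:
        who = "client mistake"
    elif status == 500:
        who = "history walk failed"
    else:
        who = f"unexpected status {status}"
    return f"[{body.get('id', '')}] {who}: {body['error_kind']}: {body['error']}"


# Sample body; the "error" text is invented for illustration.
raw = json.dumps(
    {
        "error": "path is not a git repository",
        "error_kind": "vcs_not_a_repository",
        "id": "unique-id",
    }
)
print(explain_vcs_error(400, raw))
# prints: [unique-id] client mistake: vcs_not_a_repository: path is not a git repository
```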

9. Rank files by risk — `/vcs`

Walks the repository's history once and returns its files ranked by a composite risk score (issue #328).

Request:

POST http://127.0.0.1:8080/v1/vcs

Payload:

{
  "id": "unique-id",
  "repo_path": "/srv/repos/my-project"
}

repo_path is required; every other field is optional and defaults to the bca vcs default. id is optional too (issue #645) — omitting it defaults to an empty string, echoed back unchanged. The optional fields are:

long_window / recent_window: window specs (e.g. 12mo, 90d). Defaults 12mo / 90d.
top: keep only the top N files by risk. Absent defaults to 50 (the bca vcs --top default); an explicit 0 returns all files.
ref: revision to analyse (default HEAD).
risk_formula: weighted (default) or percentile.
file_types: metrics (default — only files bca has metrics for), all (every tracked text file), or a comma-separated extension allow-list (rs,py).
full_history: walk the full DAG rather than first-parent only.
include_merges: include merge commits.
follow_renames: follow renames (default true).
exclude_bots: exclude bot identities (default true).
bot_pattern: override the bot-author exclusion regex.
as_of: reference "now" (RFC 3339 / @unix / git date) for snapshots.
emit_author_details: emit SHA-256-hashed author identities.
author_hash_key: secret key that hardens emit_author_details into a keyed HMAC-SHA256 (requires emit_author_details; an empty key or one without the flag is a 400).
include_deleted: include files deleted at the target ref.
bus_factor_threshold: bus-factor coverage threshold in (0, 1) (default 0.5).
no_cache: disable the persistent change-history cache for this request (default false).
cache_dir: override the server-side cache directory.

Response:

{
  "id": "unique-id",
  "vcs_schema_version": 2,
  "risk_score_version": 2,
  "long_window_days": 365,
  "recent_window_days": 90,
  "truncated_shallow_clone": false,
  "vcs_aggregate": { "...": "directory- / repo-level bus factor" },
  "files": [
    {
      "path": "src/main.rs",
      "vcs": {
        "commits_long": 12,
        "commits_recent": 3,
        "churn_long": 540,
        "churn_recent": 80,
        "authors_long": 4,
        "authors_recent": 2,
        "risk_score": 1.42
      }
    }
  ]
}

files is ordered by descending vcs.risk_score. Each entry carries the repository-relative path plus a nested vcs metric block (the same shape bca vcs emits, issue #684): commit and churn counts over the long and recent windows, author counts, ownership share, burst, bug-fix / security-fix / revert counts, age, change and co-change entropy, and the composite risk_score. hotspot_score and the hashed author_ids appear inside that block only when computable / requested. The four constant stamps vcs_schema_version, risk_score_version, long_window_days, and recent_window_days sit once at the top level, never per row (issue #635). vcs_aggregate carries the directory- and repo-level bus factor (issue #332).
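A consumer can rely on the documented ordering but may still want to assert it before slicing. A minimal sketch over a response shaped like the example above; top_paths is an illustrative name and the second file entry is invented for the example.

```python
def top_paths(vcs_response, n):
    """Return the first n paths, after checking the descending risk_score order."""
    files = vcs_response["files"]
    scores = [entry["vcs"]["risk_score"] for entry in files]
    if scores != sorted(scores, reverse=True):
        raise ValueError("files are not ordered by descending vcs.risk_score")
    return [entry["path"] for entry in files[:n]]


# Sample shaped like the /v1/vcs response; src/lib.rs is a made-up second row.
sample = {
    "id": "unique-id",
    "vcs_schema_version": 2,
    "files": [
        {"path": "src/main.rs", "vcs": {"risk_score": 1.42}},
        {"path": "src/lib.rs", "vcs": {"risk_score": 0.9}},
    ],
}

print(top_paths(sample, 1))  # prints: ['src/main.rs']
```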

10. Historical trend — `/vcs/trend`

Samples the change-history metrics at several evenly-spaced points in time and returns the per-file time series (issue #333). Its response is a series, not a ranked snapshot, so it is a distinct route from /vcs.

Request:

POST http://127.0.0.1:8080/v1/vcs/trend

Payload: every /vcs field above except the cache controls (no_cache / cache_dir) — trend does not use the persistent cache, so sending either field is a 400 (issue #961) rather than a silent no-op — plus:

points: number of evenly-spaced sample points (>= 2). Defaults to 12 (the bca vcs trend --points default) when omitted.
span: total look-back the points cover (default 12mo).
top_deltas: top N files per improving / regressing list. Absent defaults to 10; an explicit 0 returns all.

Response:

{
  "id": "unique-id",
  "trend_schema_version": 1,
  "vcs_schema_version": 2,
  "risk_score_version": 2,
  "long_window_days": 365,
  "recent_window_days": 90,
  "truncated_shallow_clone": false,
  "as_of_points": [1704067200, 1711929600],
  "files": {
    "src/main.rs": [ { "as_of": 1704067200, "vcs": { "risk_score": 1.1 } }, null ]
  },
  "deltas": { "improved": [], "regressed": [] }
}

as_of_points lists the sample timestamps oldest-first. Each file's array in files aligns to it 1:1, with a null element at a point where the file did not yet exist; each present element is { "as_of": ..., "vcs": { ... } }, with that file's VCS block nested under vcs at that moment (issue #684). The four constant stamps sit once at the top level, never per point (issue #635). deltas ranks the most-improved and most-regressed files by their risk-score movement across the series.
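The 1:1 alignment between as_of_points and each file's array can be unpacked as below. This sketch assumes the timestamps are Unix seconds in UTC, which is how the example values read; risk_series is an illustrative helper name.

```python
from datetime import datetime, timezone


def risk_series(trend, path):
    """Pair each sample date with the file's risk_score, or None where the slot is null."""
    series = []
    for ts, sample in zip(trend["as_of_points"], trend["files"][path]):
        day = datetime.fromtimestamp(ts, tz=timezone.utc).date().isoformat()
        score = None if sample is None else sample["vcs"]["risk_score"]
        series.append((day, score))
    return series


# Sample copied from the example /v1/vcs/trend response above.
trend = {
    "as_of_points": [1704067200, 1711929600],
    "files": {
        "src/main.rs": [{"as_of": 1704067200, "vcs": {"risk_score": 1.1}}, None]
    },
}

print(risk_series(trend, "src/main.rs"))
# prints: [('2024-01-01', 1.1), ('2024-04-01', None)]
```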

11. Just-in-time risk — `/vcs/jit`

Scores the just-in-time risk of a single change — either one commit on a server-side repository, or an arbitrary unified diff carried in the request body (issues #331 / #580). The two modes are mutually exclusive.

Commit mode scores a commit on repo_path:

{
  "id": "unique-id",
  "repo_path": "/srv/repos/my-project",
  "commit": "HEAD"
}

Commit mode also accepts the experience-window knobs long_window, recent_window, full_history, include_merges, follow_renames, and as_of. The response is a full report whose source is "commit" and whose risk_score folds in all five feature groups (size, diffusion, history, experience, purpose):

{
  "id": "unique-id",
  "jit_schema_version": 3,
  "jit_score_version": 1,
  "source": "commit",
  "long_window_days": 365,
  "recent_window_days": 90,
  "risk_score": 0.87,
  "commit": { "id": "…", "parent_count": 1, "is_merge": false, "purpose": {} },
  "features": { "size": {}, "diffusion": {}, "history": {}, "experience": {} },
  "contributions": { "size": 0.4, "diffusion": 0.2, "history": 0.1, "purpose": 0.0, "experience": -0.1 }
}

Diff mode scores an arbitrary unified diff with no repository:

{
  "id": "unique-id",
  "diff": "--- a/x\n+++ b/x\n@@ -1 +1 @@\n-old\n+new\n"
}

A bare diff carries no author, parent, or history, so only the size and diffusion groups are computable. The diff report's source is "diff" and it reports partial_risk_score — not risk_score — because the missing groups are absent from the body entirely, never present as zero:

{
  "id": "unique-id",
  "jit_schema_version": 3,
  "jit_score_version": 1,
  "source": "diff",
  "partial_risk_score": 0.6,
  "size": {},
  "diffusion": {},
  "contributions": { "size": 0.4, "diffusion": 0.2 }
}

Branch on the source discriminator ("commit" vs "diff") to read the right score field. partial_risk_score is not comparable to a commit's risk_score for the same change: it omits the history, experience, and purpose groups and lives on a different scale. Rank diffs against other diffs, never against commit scores.

Mode conflict. Supplying diff together with any commit-mode field (repo_path, commit, a window, history, rename, or as_of knob) is rejected with a 400 rather than silently honouring the diff and dropping the rest. The two modes answer different, non-comparable questions, so the combination is treated as a client mistake (issue #632).

Non-diff diff. A diff value that is not a git unified diff (the wrong field, an accidentally mangled string, arbitrary text) is rejected with a 400 (vcs_invalid_diff) rather than scored as a confident partial_risk_score of 0.0. On a risk-gating endpoint a spurious "zero risk" is the most dangerous failure mode, so non-diff input is a hard error (issue #652). An empty or whitespace-only diff is the one exception: it legitimately means "no changes", so it still returns a valid 0.0, and a CI step that computed an empty diff gets the zero-risk answer it expects.
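A client can mirror the mode rules before sending anything and branch on source when reading the reply. The sketch below is one way to do that; jit_payload and read_score are hypothetical helpers, and the client-side check only anticipates the 400 the server would return for a mode conflict.

```python
# Commit-mode fields named in this section.
COMMIT_MODE_FIELDS = {
    "repo_path", "commit", "long_window", "recent_window",
    "full_history", "include_merges", "follow_renames", "as_of",
}


def jit_payload(request_id="", diff=None, **commit_fields):
    """Build a /v1/vcs/jit payload, refusing to mix diff mode with commit mode."""
    unknown = set(commit_fields) - COMMIT_MODE_FIELDS
    if unknown:
        raise ValueError(f"not a documented commit-mode field: {sorted(unknown)}")
    if diff is not None and commit_fields:
        raise ValueError("diff and commit-mode fields are mutually exclusive")
    payload = {"id": request_id}
    if diff is not None:
        payload["diff"] = diff
    else:
        payload.update(commit_fields)
    return payload


def read_score(report):
    """Return (source, score), reading the field that matches the discriminator."""
    field = {"commit": "risk_score", "diff": "partial_risk_score"}[report["source"]]
    return report["source"], report[field]


print(read_score({"source": "diff", "partial_risk_score": 0.6}))
# prints: ('diff', 0.6)
```

Keeping the source value next to the score makes it harder to rank a diff score against a commit score by accident.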

Operating bca-web

bca-web is the HTTP daemon that wraps the big-code-analysis library, exposing comment removal, function spans, AST dumps, maintainability metrics, and change-history (VCS) metrics over a REST API. This page is for the operator running the daemon: how to build and start it, which flags and environment variables tune it, and the trust boundaries to respect before exposing it.

It is the operations companion to two reference pages. The REST API reference documents every endpoint, its request and response shapes, and the error contract. The Driving the REST API recipe shows end-to-end curl calls. This page covers the process itself and links to those two rather than repeating them.

Build and run

bca-web is the binary of the big-code-analysis-web crate. From a checkout, run it through Cargo:

cargo run -p big-code-analysis-web -- --host 127.0.0.1 --port 8080

To install the binary on your PATH, build a release artifact and copy it out, or install from the crate:

cargo install big-code-analysis-web   # installs the `bca-web` command
bca-web --host 127.0.0.1 --port 8080

bca-web binds the requested address, serves until interrupted, and exits non-zero if it cannot bind the port or hits an I/O error, so a supervisor (systemd, a container orchestrator, or a CI smoke check) sees the failure and can restart or alert.

Building with a subset of languages does not work

The shipped bca-web binary compiles every supported tree-sitter grammar in. The big-code-analysis-web crate pins the library's all-languages feature set explicitly, so passing --no-default-features or a custom --features list to cargo build -p big-code-analysis-web does not drop grammars from the resulting binary. Dropping a grammar silently from a user-facing daemon would surface as "language X stopped working" at request time rather than as a build error, so the crate forbids it (issue #252).

If you need a reduced grammar set, embed the big-code-analysis library in your own Rust code and select features in your own Cargo.toml. The per-language Cargo features chapter lists every feature with a worked example.

Command-line flags

The full flag set, with defaults:

Flag	Default	Purpose
`-j`, `--num-jobs <N|auto>`	`auto`	Worker-thread count. `auto` resolves to the OS-reported effective CPU count.
`--host <HOST>`	`127.0.0.1`	Address to bind.
`-p`, `--port <PORT>`	`8080`	TCP port.
`--parse-timeout-secs <SECS>`	`30`	Per-parse deadline. `0` disables it.
`--cors <ORIGINS>`	off	Enable CORS for a comma-separated origin allow-list.
`-h`, `--help`		Print help and exit.
`-V`, `--version`		Print version and exit.

--num-jobs auto is cgroup-quota- and cpuset-aware on Linux: in a container with a CPU quota it resolves to the quota rather than the host's physical core count, matching the bca CLI's --num-jobs. This count sizes the worker pool and the parse-admission semaphore, so it caps how many parses run concurrently. The minimum is 1; 0 is rejected when the command line is parsed.

--parse-timeout-secs bounds how long a single parse may run before the request returns 504 Gateway Timeout. The default of 30 guards against a pathological input wedging a worker indefinitely. Setting it to 0 removes the deadline and, with it, the load-shedding described below; use 0 only when an unbounded parse is acceptable. See the REST API reference for the response body the timeout returns.

CORS is off by default. The CORS section of the reference documents it in full, covering preflight handling, the wildcard form, and the absence of credentials. The short version: pass --cors with an explicit origin allow-list to let browser tooling read responses; omit it to emit no Access-Control-* headers at all.

Environment variables

Variable	Default	Purpose
`BCA_MAX_ORPHANED_TASKS`	`max(num_jobs * 2, 4)`	Cap on orphaned (timed-out but still-running) parse tasks before new requests are shed with `503`.
`RUST_LOG`	`info`	Log filter for the `tracing` subscriber.

RUST_LOG uses the EnvFilter syntax (for example RUST_LOG=big_code_analysis_web=debug). The daemon emits one access-log line per completed request, carrying the method, route, status, and latency.

Resource limits and back-pressure

Two limits protect the daemon from a single client exhausting it.

Request body size. Every endpoint rejects a request body larger than 4 MiB with 413 Payload Too Large. The limit applies uniformly to the JSON and raw-octet-stream content types, so both reject oversized bodies at the same threshold.

Orphaned-task admission control. When a parse exceeds --parse-timeout-secs, the request returns 504, but the blocking thread keeps running on the pool until the parse finishes on its own, because tree-sitter cannot be interrupted mid-parse. To stop sustained pathological input from piling up unbounded background work, new requests are rejected with 503 Service Unavailable once the count of orphaned tasks reaches a soft cap. The cap defaults to max(num_jobs * 2, 4) and is overridable through BCA_MAX_ORPHANED_TASKS (parsed as an unsigned integer; an invalid or zero value falls back to the default). Setting --parse-timeout-secs 0 disables this mechanism entirely, since with no deadline no task is ever orphaned.
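The two limits can be restated as a few lines of Python for capacity planning or client-side pre-checks. This is a sketch of the documented rules, not the daemon's implementation; orphan_cap and fits_body_limit are illustrative names, and treating only plain ASCII digit strings as valid is this sketch's reading of "parsed as an unsigned integer".

```python
MAX_BODY_BYTES = 4 * 1024 * 1024  # bodies larger than 4 MiB get 413


def orphan_cap(num_jobs, env_value=None):
    """Effective orphaned-task cap: BCA_MAX_ORPHANED_TASKS if valid, else the default."""
    default = max(num_jobs * 2, 4)
    if env_value is None:
        return default
    # An invalid or zero value falls back to the default.
    if not (env_value.isascii() and env_value.isdigit()):
        return default
    value = int(env_value)
    return value if value > 0 else default


def fits_body_limit(body):
    """True when a request body would not be rejected for size."""
    return len(body) <= MAX_BODY_BYTES


print(orphan_cap(1), orphan_cap(8), orphan_cap(8, "32"), orphan_cap(8, "0"))
# prints: 4 16 32 16
```

On the client side, a 503 signals shed load and may be worth retrying after a pause, while a 504 or 413 will likely repeat for the same input; that retry policy is a suggestion, not something the server enforces.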

Security and trust boundaries

bca-web has no authentication, authorization, or rate limiting of its own. The defaults are chosen for a local, single-operator deployment; widen them deliberately.

Default bind is loopback. The server binds 127.0.0.1 unless --host says otherwise. Keep it there, or put an authenticating proxy in front, before exposing it to a network. Binding 0.0.0.0 makes every capability below reachable by anyone who can route to the port.

CORS is off by default. With no --cors flag, a browser script from another origin cannot read API responses, so a page the operator happens to visit cannot quietly drive a loopback bca-web. The wildcard form (--cors '*') answers every origin and exposes the server's metrics and repository paths to any site; use it only on trusted networks. Full semantics are under CORS.

The VCS endpoints read server-side repositories. Unlike the source-in-body endpoints, /v1/vcs, /v1/vcs/trend, and /v1/vcs/jit analyze a git repository already on the server's filesystem, named by the request's repo_path. A caller who can reach these endpoints can make the server walk any git repository it can read and learn that repository's file paths, churn, and author signals. The VCS trust-boundary warning in the reference covers this in full; do not expose these endpoints to untrusted clients without an authorization layer.

The VCS cache directory is a client-controlled write path. The /v1/vcs endpoint accepts an optional cache_dir field that overrides where the persistent change-history cache is written, defaulting to the platform cache location ($XDG_CACHE_HOME/big-code-analysis/vcs); /v1/vcs/trend and /v1/vcs/jit do not cache and do not accept it. A caller who can set cache_dir chooses a directory the server process writes into, so an untrusted client could direct cache writes to an attacker-chosen path. This is one more reason the VCS endpoints belong behind an authorization layer, never open to untrusted input.

Recipes

Task-oriented examples for getting work done with bca and bca-web. Each recipe assumes you have built the binaries (cargo build --release) and that bca is on your PATH.

The recipes are grouped by goal:

Quality reports — generate Markdown reports suitable for pull requests, dashboards, or wikis, including the C/C++ preprocessor-aware workflow.
CI integration — wire bca check and bca report into GitHub Actions and GitLab CI, including the baseline / ratchet pattern and the Code Quality widget path.
Baselines — record existing offenders in a .bca-baseline.toml so the gate only fires on new or worsened violations, and audit or diff that baseline during review.
Local threshold gates — mirror the CI threshold gate on a developer machine with a two-tier (hard + headroom) Makefile / just / pre-commit pattern, so regressions never reach the pull request.
Feeding metrics to an agent — wire bca check into an agentic coding tool's after-edit feedback loop (Claude Code PostToolUse hook, opencode plugin), with the anti-gaming guidance that keeps the loop honest.
AST queries — search for syntactic constructs, count node types, dump trees, and detect parse errors.
Exporting metric data — emit structured output (JSON / YAML / TOML / CBOR) and consume it from shell pipelines.
Driving the REST API — run the HTTP server and call every endpoint with curl.

If you want a deeper look at any flag the recipes use, see the per-command pages under Commands. For the full list of metrics that show up in these recipes, see Supported Metrics.

Upstream reference. big-code-analysis is a fork of Mozilla's rust-code-analysis. Recipes that work for the upstream rust-code-analysis-cli binary usually translate directly — replace the binary name and adjust for the subcommand restructure documented in the migration guide.

Quality reports

Recipes for producing aggregated, human-readable Markdown reports.

Wiring reports into CI? See the CI integration recipe for runnable GitHub Actions and GitLab CI examples that post the Markdown report as a PR/MR comment and surface threshold violations through the platform's native code quality widgets.

Live example reports

big-code-analysis publishes the output of bca report -O markdown --vcs and bca report -O html --vcs against its own source tree on every push to main. Open either to see exactly what the recipes on this page produce on a multi-language Rust + Python codebase:

HTML hotspot report (sortable tables, per-language sections, plus a "Change-history risk" section from --vcs): https://dekobon.github.io/big-code-analysis/reports/index.html
Markdown PR/MR comment (paste-into-issue ready): https://dekobon.github.io/big-code-analysis/reports/report.md
Change-history risk, full top-100 ranking as machine-readable JSON (see bca vcs): https://dekobon.github.io/big-code-analysis/reports/vcs-report.json

The wiring that produces them lives in .github/workflows/pages.yml. The same workflow runs the threshold gate; see CI integration for the full pipeline shape.

Generate a project-wide quality report

Run from the project root and write the report to a file:

bca report \
    --paths "$PWD" \
    -O markdown \
    --top 20 \
    --strip-prefix "$PWD/" \
    --output report.md

--strip-prefix keeps the file paths short and stable across machines — without it every row carries the absolute path of the current checkout.
--top controls how many rows appear in each hotspot table. 20 is a good default for a PR comment; drop to 5 for a dashboard tile, or pass 0 to list every row.
--jobs defaults to the effective CPU count (cgroup-/cpuset-aware on Linux); pass --jobs 1 only to force serial mode for debugging.

Limit the report to specific languages

bca infers language from extension, so the include/exclude globs do the filtering:

bca report \
    --include "*.rs" --include "*.py" \
    --paths "$PWD" \
    -O markdown --output report.md

To exclude vendored or generated trees, layer in --exclude:

bca report \
    --include "*.rs" \
    --exclude "**/target/**" --exclude "**/vendor/**" \
    --paths "$PWD" \
    -O markdown

Flag arity. --include and --exclude take exactly one glob per occurrence; repeat the flag for additional patterns. The = form works the same way: --include="*.rs" --exclude="**/target/**".

Leading ./ is optional. A bare-relative pattern and its ./-prefixed spelling are equivalent: --exclude "vendor/**" matches exactly what --exclude "./vendor/**" does. This holds for every glob surface — --include, --exclude, --exclude-from, .bcaignore, and the [check.exclude] gate-exemption set.

For a stable repo-wide deny-set, keep the patterns in a file at the repo root (a .bcaignore by convention) and load it with --exclude-from. Patterns are unioned with any inline --exclude values; blank lines and #-prefixed comments are skipped:

bca report \
    --paths . \
    --exclude-from .bcaignore \
    -O markdown --output report.md

Show only the worst offenders

For a quick triage view that highlights the top three problems per section:

bca report -p src/ -O markdown --top 3

The report still includes every section, but each table is short enough to scan at a glance.

Compare two revisions

Aggregate reports do not diff revisions on their own. Run the report on each side and diff the Markdown:

git worktree add /tmp/before main
bca report -p /tmp/before -O markdown \
    --strip-prefix /tmp/before/ --output /tmp/before.md

bca report -p "$PWD" -O markdown \
    --strip-prefix "$PWD/" --output /tmp/after.md

diff -u /tmp/before.md /tmp/after.md | less

Because both reports use the same --strip-prefix shape, the path columns line up and the diff is dominated by metric changes rather than path noise.

C/C++ preprocessor-aware reports

Macro-heavy C/C++ codebases benefit from feeding preprocessor data into the analyzer so that conditional compilation is interpreted the way the compiler sees it. The workflow is two steps:

# 1. Build a preprocessor-data JSON from the headers and sources.
bca preproc \
    --paths src/ include/ \
    --output /tmp/preproc.json

# 2. Run the report (or any other command) with that data attached.
bca report \
    --paths src/ \
    --preproc-data /tmp/preproc.json \
    -O markdown --output report.md

--preproc-data is accepted by every metric-computing walking subcommand (metrics, ops, functions, report, check, …) — anywhere accurate C/C++ analysis matters. Subcommands that do not consume it (vcs, preproc, list-metrics, diff-baseline) reject it as a usage error.

Analyze only files changed in a PR

Pipe a list of changed files into --paths-from - to score just the diff, not the whole tree:

git diff --name-only --diff-filter=AM origin/main...HEAD \
    | bca metrics --paths-from - -O json --output-dir ./out

--diff-filter=AM keeps Added and Modified files and drops Deletions — you cannot analyze a file that no longer exists.
--paths-from - reads newline-separated paths from stdin. A file argument works the same way: --paths-from changed.txt.
Paths fed in this way are treated as explicit, so they bypass any .gitignore rule that would have hidden them in a directory walk. Combine with -I '*.py' -I '*.rs' to filter by language (repeat the flag once per glob).

For a PR-scoped Markdown summary, swap metrics for the report pipeline:

git diff --name-only --diff-filter=AM origin/main...HEAD \
    | bca report --paths-from - -O markdown \
        --top 10 --output pr-report.md

.gitignore is honored automatically when walking a directory, so recipes earlier in this page no longer need an explicit -X "**/target/**" -X "**/node_modules/**" if those paths are already covered by your project's .gitignore. Add --no-ignore if you do need to analyze gitignored trees.

CI integration

Recipes for wiring bca into a build pipeline. The bca check command already ships every output shape a modern CI needs (Checkstyle, SARIF, GitLab Code Climate JSON, clang/GCC warning lines, MSVC warning lines), plus bca report -O markdown for humans. This page is a consolidated map from the user's goal to the right combination of subcommand, flags, and platform glue.

Picking outputs

The matrix below maps each common goal to the bca invocation that feeds the corresponding CI surface. The linked sections below have the runnable examples.

Goal	Command + flags
Hard gate on threshold regressions	`bca check` (thresholds from the auto-discovered `bca.toml`)
Ratchet thresholds on an existing codebase	`bca check --baseline .bca-baseline.toml` (‡)
Inline PR annotations (GitHub)	`bca check … --report-format clang-warning --no-fail` + GCC problem matcher
Code Scanning alerts (GitHub)	`bca check … --report-format sarif --no-fail` + `github/codeql-action/upload-sarif`
Merge-request widget (GitLab Code Quality)	`bca check … --report-format code-climate --no-fail`
Jenkins / SonarQube ingestion	`bca check … --report-format checkstyle`
Human-readable PR/MR comment or downloadable	`bca report -O markdown --top 20 --strip-prefix "$PWD/"`
Machine-readable artifact for dashboards	`bca metrics --format json --output-dir ./out`

(‡) Recommended adoption path when introducing thresholds on a codebase with existing offenders. See the Baselines recipe for the bootstrap-refresh-retire workflow.

The full reference for bca check's output formats, exit codes (0 clean, 2 violation, 1 tool error), and threshold config lives in the Check command page. For the Markdown report shape, see the Report command page and the Quality reports recipe.
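A wrapper script can translate the three exit codes into something a pipeline step can act on. A minimal sketch, assuming bca is on PATH; interpret_exit and run_check are hypothetical helper names, and the arguments passed to run_check are whatever bca check flags your job already uses.

```python
import subprocess

# Exit codes documented for `bca check`.
EXIT_MEANINGS = {0: "clean", 2: "violation", 1: "tool error"}


def interpret_exit(code):
    """Map a `bca check` exit code to its documented meaning."""
    return EXIT_MEANINGS.get(code, f"unexpected exit code {code}")


def run_check(args=()):
    """Run `bca check` with extra arguments and return (meaning, stdout)."""
    proc = subprocess.run(["bca", "check", *args], capture_output=True, text=True)
    return interpret_exit(proc.returncode), proc.stdout


print(interpret_exit(2))  # prints: violation
```

Separating "violation" from "tool error" lets a job fail the gate on the former and page the pipeline owner on the latter.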

GitHub Actions

Live worked example

big-code-analysis runs the recipes below against its own source on every push and PR. The workflow source — .github/workflows/pages.yml — exercises the threshold gate, the baseline ratchet, both report formats, and a SARIF upload to GitHub Code Scanning end-to-end against the workspace itself. (The SARIF upload runs on same-repo pushes and PRs only; fork PRs skip it because the upload needs a write-scoped token, exactly as the clippy SARIF job does.) The output sits on GitHub Pages alongside this book:

HTML hotspot report: https://dekobon.github.io/big-code-analysis/reports/index.html
Markdown PR/MR comment: https://dekobon.github.io/big-code-analysis/reports/report.md

Copy snippets below straight into your own workflow; the bca version quoted is the latest published release at the time of writing.

The in-tree workflow installs bca by building it from the current checkout rather than downloading a pinned release — this avoids the CLI-artifact schema-skew failure mode described under Installing bca from a GitHub Release below for repos whose .bca-baseline.toml is always written by the same bca that gates it. Downstream adopters tracking a stable release line should stick with the pinned-tarball pattern; only switch to "build from checkout" if you, too, are mutating CLI artifact schemas in lockstep with the binary.

Threshold gate, SARIF, and clang-warning matcher

The three pre-existing recipes — hard threshold gate, SARIF upload to Code Scanning, and clang-warning + GCC problem matcher for inline PR annotations — live in the Check command page. Use the link rather than re-implementing them here.

Installing `bca` from a GitHub Release (recommended)

The fastest, most reproducible install path is the prebuilt tarball from this repository's GitHub Releases. It is a single curl | sha256sum | tar, requires no Rust toolchain, and produces byte-identical binaries across runs. Pair it with actions/cache keyed by version so a green-path rerun skips the download entirely:

CLI-artifact schema compatibility. The BCA_VERSION you pin here must support the schema version of every CLI artifact your repo commits — most importantly .bca-baseline.toml (carries its own version field) and the bca.toml manifest. A baseline file written by a newer bca (carrying a newer schema version) is not loadable by an older bca and the gate will fail with baseline version N is not supported by this bca. When tracking main or regenerating baselines locally with a newer bca, either re-pin to a release that covers the new schema or switch to a cargo install --git build of bca pointed at the same commit your baseline was written from (see the cargo install alternative below). The compatibility contract is recorded in STABILITY.md.

env:
  BCA_VERSION: "2.0.0"
  BCA_TARGET:  "x86_64-unknown-linux-gnu"
  # sha256 of big-code-analysis-${BCA_VERSION}-${BCA_TARGET}.tar.gz from the
  # release's SHA256SUMS file. Bump together with BCA_VERSION.
  BCA_SHA256:  "a205fff13108d0f8c679a062e352ba8468109c4adfdd8c9e3567cf5fcc99c3d5"

steps:
  # Cache key MUST include BCA_SHA256 (and BCA_TARGET). Without the
  # sha256 in the key, rotating the published checksum without bumping
  # the version returns a stale binary on cache hit and silently
  # bypasses the `sha256sum --check` in the install step (which is
  # gated on cache miss). Including BCA_TARGET matters when the same
  # workflow runs against multiple `runs-on`.
  - name: Cache bca binary
    id: bca-cache
    uses: actions/cache@v5
    with:
      path: ~/.local/bin/bca
      key: bca-${{ runner.os }}-${{ env.BCA_TARGET }}-${{ env.BCA_VERSION }}-${{ env.BCA_SHA256 }}

  - name: Install bca from GitHub Releases
    if: steps.bca-cache.outputs.cache-hit != 'true'
    run: |
      set -euo pipefail
      stage="big-code-analysis-${BCA_VERSION}-${BCA_TARGET}"
      tarball="${stage}.tar.gz"
      url="https://github.com/dekobon/big-code-analysis/releases/download/v${BCA_VERSION}/${tarball}"
      mkdir -p "$HOME/.local/bin"
      curl -fsSL --proto '=https' --tlsv1.2 -o "/tmp/${tarball}" "$url"
      echo "${BCA_SHA256}  /tmp/${tarball}" | sha256sum --check --strict -
      tar -xzf "/tmp/${tarball}" -C /tmp
      install -m 0755 "/tmp/${stage}/bca" "$HOME/.local/bin/bca"
      rm -rf "/tmp/${tarball}" "/tmp/${stage}"

  - name: Prepend ~/.local/bin to PATH
    run: echo "$HOME/.local/bin" >> "$GITHUB_PATH"

Available BCA_TARGET values (pick the one that matches runs-on): x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, aarch64-unknown-linux-gnu, aarch64-unknown-linux-musl, aarch64-apple-darwin, x86_64-pc-windows-msvc, aarch64-pc-windows-msvc. Windows assets use .zip instead of .tar.gz; the bca-web binary ships alongside bca in the same archive.

Alternative: `cargo install` via prebuilt-aware actions

When you cannot reach github.com from a runner (air-gapped, custom mirror) but can reach crates.io, the following two actions fall back transparently to cargo install when no prebuilt is published — at the cost of compile time on the cold path. Both pin to the same crates.io release as the GitHub Releases assets, so the CLI-artifact schema compatibility warning applies here unchanged.

If you specifically need a bca ahead of the latest crates.io release (e.g., your .bca-baseline.toml is committed at a newer schema than any published bca understands), replace the tool: big-code-analysis-cli@<version> input (or the --version form) with cargo install --git https://github.com/dekobon/big-code-analysis --rev <SHA> --locked big-code-analysis-cli, pointed at the exact commit the baseline was generated from. This is what the in-tree pages.yml workflow does (against the local checkout via --path); it is a deliberate workaround for bca's own repo, not a recommended default for downstream adopters.

# Option 1: taiki-e/install-action
- name: Install bca
  uses: taiki-e/install-action@v2
  with:
    tool: big-code-analysis-cli@2.0.0

# Option 2: cargo-binstall
- name: Install cargo-binstall
  uses: cargo-bins/cargo-binstall@main
- name: Install bca
  run: cargo binstall --no-confirm big-code-analysis-cli --version 2.0.0

If either action falls back to compilation, cache the cargo registry + the installed binary so the second run is fast:

- name: Cache cargo registry and bca binary
  uses: actions/cache@v5
  with:
    path: |
      ~/.cargo/registry
      ~/.cargo/git
      ~/.cargo/bin/bca
    # crates.io publishes immutable releases, so a `<version>` key is
    # sufficient here — there is no sha256 to rotate. (The GitHub
    # Releases install path above is different: republished release
    # assets share a version, so its cache key must include the sha256.)
    key: bca-${{ runner.os }}-2.0.0

Pin to a specific version (matching a published big-code-analysis-cli release on crates.io) so reports stay reproducible across runs. A floating install surfaces metric-counting changes as "mysterious CI flakes" on Mondays.

Posting the Markdown report as a PR comment

bca report markdown is purpose-built for PR/MR comments: a stable header structure, one row per hot spot, and short paths once you pass --strip-prefix. Pair it with marocchino/sticky-pull-request-comment so each push updates a single comment instead of stacking new ones:

name: bca-pr-report
on:
  pull_request:
    branches: [main]
jobs:
  report:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - name: Install bca
        uses: taiki-e/install-action@v2
        with:
          tool: big-code-analysis-cli@2.0.0
      - name: Generate report
        run: |
          bca \
            report -O markdown \
            --paths "$PWD" \
            --top 20 \
            --strip-prefix "$PWD/" \
            --output report.md
      - name: Post or update PR comment
        uses: marocchino/sticky-pull-request-comment@v2
        with:
          path: report.md
          header: bca-quality-report

The same Markdown file is suitable for upload as a build artifact (actions/upload-artifact@v7) if you want it downloadable from the workflow run page in addition to the PR comment.

Baseline / ratchet pattern

bca check --baseline is the native ratchet: record today's offenders in a committed TOML file, fail only on regressions and new offenders, and shrink the file over time. Bootstrap once, commit, then point CI at it:

# Once, on a developer machine. Commit both files.
bca check --paths src/ \
    --write-baseline .bca-baseline.toml
git add bca.toml .bca-baseline.toml

This snippet bootstraps from src/ only — appropriate for a single-crate library. For a multi-crate workspace, see the live worked example: its .github/workflows/pages.yml scans the entire repo with --exclude-from .bcaignore, a checked-in deny-set covering vendored grammars, generated trees, and tests.

Share the exclude list across workflow, recipe, and bootstrap. Put the deny-set in a single file at the repo root (a .bcaignore by convention, mirroring .gitignore / .dockerignore) and point every bca invocation at it with --exclude-from .bcaignore. Patterns from --exclude-from are unioned with any inline --exclude <GLOB> flags into one deny-set — keep --exclude for one-off ad-hoc excludes. Blank lines and #-prefixed comment lines in the file are skipped. Patterns follow the same ./-prefix convention as --exclude arguments (the walker's emitted form). Pair edits to .bcaignore with a --write-baseline refresh — the baseline keys are sensitive to which files the walker visits.
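
As a rough illustration of the rules above (skip blank and #-prefixed lines, then union with inline --exclude globs), the sketch below builds a deny-set in Python. It is not bca's implementation: the function name load_deny_set, the whitespace trimming, and the order-preserving de-duplication are assumptions made for the example.

```python
def load_deny_set(ignore_text, inline_excludes=()):
    """Union patterns from a .bcaignore-style file with inline --exclude globs."""
    patterns = []
    for raw in ignore_text.splitlines():
        line = raw.strip()  # trimming whitespace is an assumption
        # Blank lines and '#'-prefixed comment lines are skipped.
        if not line or line.startswith("#"):
            continue
        patterns.append(line)
    # Order-preserving union: a pattern listed twice appears once.
    return list(dict.fromkeys([*patterns, *inline_excludes]))


sample = """\
# vendored grammars
./vendor/**

./target/**
"""
deny_set = load_deny_set(sample, inline_excludes=["./scratch/**", "./target/**"])
print(deny_set)
```

The duplicate ./target/** collapses into one entry, which is the behaviour a single unioned deny-set implies.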

- name: Threshold check with baseline
  run: |
    bca check --paths src/ \
        --baseline .bca-baseline.toml

A regressed function (current value > baseline value) still fails. A new offender not in the baseline still fails. An improved function passes silently and stays in the baseline until the next --write-baseline refresh.

Each surviving violation in the stderr stream is prefixed with a tag so a developer can tell at a glance whether they are looking at a brand-new offender or a known one that has worsened:

[new] — no baseline entry for this function / metric.
[regr +N%] — current value exceeds the recorded baseline by N percent. Special forms: [regr from 0] when the baseline value was zero, [regr +>9999%] when the regression exceeds 100× the baseline, [regr NaN] when the current value is NaN.
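
The tag rules above can be sketched as a small Python function. This is an illustration under stated assumptions, not bca's code: the name violation_tag is hypothetical, a missing baseline entry is modelled as None, and rounding N to the nearest integer is a guess.

```python
import math


def violation_tag(current, baseline):
    """Return the stderr prefix for one surviving violation."""
    if baseline is None:
        return "[new]"  # no baseline entry for this function / metric
    if math.isnan(current):
        return "[regr NaN]"
    if baseline == 0:
        return "[regr from 0]"  # a percentage would divide by zero
    percent = (current - baseline) / baseline * 100
    if percent > 9999:
        return "[regr +>9999%]"  # roughly 100x the baseline or more
    return f"[regr +{round(percent)}%]"  # rounding mode is an assumption


print(violation_tag(63, 60))
print(violation_tag(11, None))
```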

After the per-violation lines the stderr stream emits a per-file rollup footer with the format <path>: <count> violations (worst: <metric> = <value> vs limit <limit> at L<start>), sorted by violation count descending. This is intended to be the first thing a reader looks at: which file has the most problems, and which metric is the loudest in that file. Pass --no-summary to suppress the footer for downstream tooling that grep-pipes the stderr stream.
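
To make the footer format concrete, here is a minimal Python sketch that groups violations by file and renders one line per file. The record shape, the tie-break by path, and choosing the "worst" entry by value / limit ratio are assumptions for illustration; bca's own selection rule may differ (for example for lower-is-worse metrics).

```python
from collections import defaultdict


def rollup_footer(violations):
    """Render one footer line per file, most violations first."""
    by_path = defaultdict(list)
    for v in violations:
        by_path[v["path"]].append(v)
    lines = []
    # Sort by violation count descending; ties broken by path (assumption).
    for path, vs in sorted(by_path.items(), key=lambda kv: (-len(kv[1]), kv[0])):
        # "Worst" = highest value / limit ratio (assumption).
        worst = max(vs, key=lambda v: v["value"] / v["limit"])
        noun = "violation" if len(vs) == 1 else "violations"
        lines.append(
            f"{path}: {len(vs)} {noun} (worst: {worst['metric']} = "
            f"{worst['value']} vs limit {worst['limit']} at L{worst['line']})"
        )
    return lines


sample = [
    {"path": "./src/b.rs", "metric": "cyclomatic", "value": 11, "limit": 2, "line": 1},
    {"path": "./src/a.rs", "metric": "cognitive", "value": 30, "limit": 20, "line": 40},
    {"path": "./src/a.rs", "metric": "cyclomatic", "value": 12, "limit": 2, "line": 7},
]
footer = rollup_footer(sample)
print("\n".join(footer))
```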

Actionable failure output

The four sub-sections below turn bca check's failure output from "a wall of offender lines" into a stack of CI-aware presentations: which files in this PR tripped a threshold (--since / --changed-only), inline file-diff annotations (--github-annotations), a rendered step-summary digest ($GITHUB_STEP_SUMMARY), and a copy-paste-safe remediation block. Each is independent; mix and match per CI surface. A combined worked example is at the end of this group.

Diff-aware mode (`--since` / `--changed-only`)

On a PR or push, the developer's first question is usually which of my files in this change tripped a threshold — not the whole-tree offender list. Two flags answer that:

--since <ref> partitions the per-file footer into a "Files in this range:" section (offenders in files touched between <ref> and HEAD) followed by "Other offenders:" (everything else). Per-violation lines are unchanged so existing grep-anchored tooling keeps working.
--changed-only drops violations from files outside the touched set entirely. Use it for PR gates that should be terse.
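
The difference between the two flags can be sketched in a few lines of Python. The function name partition_offenders and the plain path lists are hypothetical; the sketch only models which files end up in which footer section.

```python
def partition_offenders(offender_paths, touched, changed_only=False):
    """Split offending files into (files in range, other offenders)."""
    in_range = [p for p in offender_paths if p in touched]
    # --changed-only drops everything outside the touched set entirely.
    others = [] if changed_only else [p for p in offender_paths if p not in touched]
    return in_range, others


offenders = ["./src/a.rs", "./src/b.rs"]
touched = {"./src/a.rs"}
print(partition_offenders(offenders, touched))
print(partition_offenders(offenders, touched, changed_only=True))
```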

- uses: actions/checkout@v4
  with:
    # `--since origin/<base>` resolves a merge-base. The default
    # `fetch-depth: 1` checkout makes that ref unreachable; `0`
    # pulls the full history so the diff base resolves.
    fetch-depth: 0
- name: Threshold check with diff-aware footer
  run: |
    bca check --paths . --exclude-from .bcaignore \
        --baseline .bca-baseline.toml \
        --since "origin/${{ github.base_ref }}"

When --since is omitted, bca auto-detects the diff base from the environment in this precedence:

BCA_DIFF_BASE — the explicit-override hatch. Use this from a local shell or non-GHA CI runner to mimic the auto-detection.
GITHUB_BASE_REF — set by GitHub Actions on pull_request events. Expanded to origin/<value>; the runner is responsible for the corresponding git fetch.
GITHUB_EVENT_BEFORE — set by GitHub Actions on push events to the SHA at HEAD before the push. The all-zeroes SHA (force push, brand-new branch) is treated as no signal.

Auto-detection can fail when git is missing, the ref cannot be resolved, or the directory is not a git checkout. Without --changed-only the failure is non-fatal: bca prints a warning and falls back to the unpartitioned whole-tree footer. With --changed-only, the same failure is fatal so a misconfigured CI does not silently green-light by suppressing every violation.
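
The precedence described above can be sketched as follows. This is a reading of the documented rules rather than bca's source: the function name resolve_diff_base is hypothetical, and treating an empty variable the same as an unset one is an assumption.

```python
def resolve_diff_base(env):
    """Return (ref, signal) for the first usable variable, else None."""
    explicit = env.get("BCA_DIFF_BASE")
    if explicit:
        return explicit, "BCA_DIFF_BASE"
    base_ref = env.get("GITHUB_BASE_REF")
    if base_ref:
        # Expanded to origin/<value>; fetching that ref is the runner's job.
        return f"origin/{base_ref}", "GITHUB_BASE_REF"
    before = env.get("GITHUB_EVENT_BEFORE", "")
    # The all-zeroes SHA (force push, brand-new branch) carries no signal.
    if before and set(before) != {"0"}:
        return before, "GITHUB_EVENT_BEFORE"
    return None


print(resolve_diff_base({"GITHUB_BASE_REF": "main"}))
```

The returned pair corresponds to the base and signal named in the "Files in this range" banner.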

The "Files in this range:" banner names the resolved base and the signal that produced it, so a CI-log reader can verify the gate latched onto the expected ref:

Files in this range (diff base: origin/main via GITHUB_BASE_REF):
./src/a.rs: 1 violation (worst: cyclomatic = 11 vs limit 2 at L1)

Other offenders:
./src/b.rs: 1 violation (worst: cyclomatic = 11 vs limit 2 at L1)

This is distinct from bca diff-baseline, which diffs baseline files between two on-disk paths and reports added / removed / worsened / improved entries. --since diffs source files between two git refs.

GitHub Actions inline annotations (`--github-annotations`)

The GHA UI renders ::error file=…,line=…,title=…::msg workflow commands as inline annotations on the file-diff view, which is much more discoverable than scrolling the raw job log. bca check emits one per violation, controlled by the tri-state --github-annotations <auto|always|never>. auto (the default) enables them when $GITHUB_ACTIONS == "true" (set by every GHA workflow step); always forces them on; never suppresses them even inside a step (handy when a workflow runs bca check twice and wants annotations from only one run). A bare --github-annotations means always.

The annotations ride on top of the existing per-violation human stream — both are emitted. To avoid exhausting GitHub's 10-error-per-step UI quota, annotations are capped at 10 per metric; overflow rolls up to a single ::error::N more <metric> violations not shown line so the count stays visible.
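
A minimal sketch of the per-metric cap with an overflow rollup is shown below. Only the overflow line format comes from the text above; the function name, the record shape, and the title and message wording of the per-violation command are assumptions.

```python
from collections import Counter

CAP_PER_METRIC = 10


def annotation_lines(violations):
    """Emit at most CAP_PER_METRIC annotations per metric, then a rollup."""
    emitted = Counter()
    overflow = Counter()
    out = []
    for v in violations:
        metric = v["metric"]
        if emitted[metric] < CAP_PER_METRIC:
            emitted[metric] += 1
            # Title and message wording are illustrative, not bca's exact text.
            out.append(
                f"::error file={v['path']},line={v['line']},title={metric}::"
                f"{metric} = {v['value']} vs limit {v['limit']}"
            )
        else:
            overflow[metric] += 1
    for metric, extra in overflow.items():
        out.append(f"::error::{extra} more {metric} violations not shown")
    return out


many = [
    {"path": f"./src/f{i}.rs", "line": 1, "metric": "cyclomatic", "value": 11, "limit": 2}
    for i in range(12)
]
lines = annotation_lines(many)
print(lines[-1])
```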

- name: Threshold check with inline annotations
  run: |
    bca check --paths . --exclude-from .bcaignore \
        --baseline .bca-baseline.toml
  # No `--github-annotations` flag needed — auto-enabled in GHA.

Pair this with --since (above) so the annotations point at the files in the PR, not the entire offender list.

Step-summary markdown digest (`$GITHUB_STEP_SUMMARY`)

GitHub Actions exposes $GITHUB_STEP_SUMMARY, a path to a markdown file that, when populated, renders as the step's summary view in the job UI. Whenever that env var is set, or when --summary-file <path> is passed explicitly, bca check appends a digest containing the per-file rollup table, a per-metric count breakdown, and the top-10 offenders by value / limit ratio. Passing the literal value never (--summary-file never) suppresses the digest even when $GITHUB_STEP_SUMMARY is set.

The digest is bracketed by a pair of HTML-comment markers so a retried step replaces (rather than stacks on) the previous block: three retries converge to exactly one up-to-date digest. Content outside the markers (e.g. summaries written by other tools earlier in the same step) is preserved.
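
The replace-not-stack behaviour can be sketched as an idempotent "upsert" between two markers. The marker strings below are hypothetical placeholders, not the ones bca writes, and upsert_digest is an illustrative name.

```python
BEGIN = "<!-- digest:begin -->"  # hypothetical marker, not bca's real one
END = "<!-- digest:end -->"      # hypothetical marker, not bca's real one


def upsert_digest(existing, digest):
    """Replace the marked block if present, otherwise append a new one."""
    block = f"{BEGIN}\n{digest}\n{END}\n"
    start = existing.find(BEGIN)
    end = existing.find(END, start + len(BEGIN)) if start != -1 else -1
    if start == -1 or end == -1:
        # No previous block: append, keeping whatever other tools wrote.
        sep = "" if not existing or existing.endswith("\n") else "\n"
        return existing + sep + block
    tail = existing[end + len(END):]
    if tail.startswith("\n"):
        tail = tail[1:]  # the new block already ends with a newline
    return existing[:start] + block + tail


summary = "other tool output\n"
first = upsert_digest(summary, "1 violation")
second = upsert_digest(first, "No threshold violations.")
third = upsert_digest(second, "No threshold violations.")
print(third)
```

Running the same update twice yields the same file, which is the property that lets retries converge on a single digest.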

- name: Threshold check with step-summary digest
  run: |
    bca check --paths . --exclude-from .bcaignore \
        --baseline .bca-baseline.toml
  # No flag needed — `$GITHUB_STEP_SUMMARY` is set automatically in GHA.

Local users can pipe the digest into any markdown file with --summary-file <path>. Empty input (clean run) still writes a "✓ No threshold violations." block so the step summary positively confirms the gate ran.

Remediation block (`--no-remediation`)

When the gate finds violations, bca check emits a trailing --- next steps --- block on stderr (and inside the step-summary digest from above):

--- next steps ---
* Detailed reports: bca-reports artifact at https://github.com/<owner>/<repo>/actions/runs/<run-id>
* To refresh baseline: bca check --paths . --exclude-from .bcaignore --write-baseline .bca-baseline.toml
* Adoption guide: https://dekobon.github.io/big-code-analysis/recipes/baselines.html

The refresh invocation mirrors the gate's resolved --paths, --exclude, --exclude-from, --config, and --baseline so a first-time reader of a failing CI log can copy-paste it verbatim. The artifact URL is derived from $GITHUB_REPOSITORY and $GITHUB_RUN_ID when both are present (always true in GHA); local runs — where there is no upload to point at — instead suggest running bca report to see the detailed view locally.
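
As a sketch of how the first line of the block could be derived from the environment: the URL shape matches the example above, while the function name and the wording of the local fallback are assumptions.

```python
def detailed_reports_hint(env):
    """Build the 'Detailed reports' line of the next-steps block."""
    repo = env.get("GITHUB_REPOSITORY")  # "<owner>/<repo>"
    run_id = env.get("GITHUB_RUN_ID")
    if repo and run_id:
        return (
            "* Detailed reports: bca-reports artifact at "
            f"https://github.com/{repo}/actions/runs/{run_id}"
        )
    # Local run: nothing was uploaded, so point at the report command instead.
    return "* Detailed reports: run `bca report` locally"  # hypothetical wording


print(detailed_reports_hint({"GITHUB_REPOSITORY": "acme/widgets", "GITHUB_RUN_ID": "42"}))
```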

Suppress the block with --no-remediation for downstream tooling that grep-pipes stderr.

Refresh after focused refactors:

bca check --paths src/ \
    --write-baseline .bca-baseline.toml
git diff .bca-baseline.toml   # expect a shrinking file

Two --write-baseline runs over an unchanged tree produce byte-identical output, so spurious diffs only appear when offenders actually changed. See the Baselines recipe for the full adoption flow, PR-review heuristics, and the suppression composition rules.

Putting it all together

The four features above compose. For a PR-gate workflow, the recommended invocation is:

- uses: actions/checkout@v4
  with:
    # `--since origin/<base>` resolves a merge-base. Default
    # `fetch-depth: 1` makes that ref unreachable; `0` pulls the
    # full history so the diff resolves.
    fetch-depth: 0

- name: Threshold gate (diff-aware + GHA UX)
  run: |
    bca check --paths . --exclude-from .bcaignore \
        --baseline .bca-baseline.toml \
        --since "origin/${{ github.base_ref }}"
  # No `--github-annotations` or `--summary-file` flag needed:
  # both auto-enable from `$GITHUB_ACTIONS == "true"` and
  # `$GITHUB_STEP_SUMMARY`. The trailing remediation block is also
  # auto-emitted.

What this gives you on a failing PR:

Per-violation stderr lines — same shape as the legacy gate, so existing grep tooling keeps working.
Per-file rollup footer with Files in this range: (touched in the PR) listed before Other offenders: — the developer sees their own contributions first.
Inline GHA annotations on the file-diff view, capped at 10 per metric with an overflow rollup.
Step-summary panel with a rendered markdown digest (per-file rollup, per-metric breakdown, top-10 offenders by ratio).
Trailing remediation block naming the artifact, printing the exact --write-baseline refresh invocation, and linking to the Baselines recipe.

Knobs:

Flag	Effect	Default
`--since <ref>`	Partition footer; auto-detect from env if omitted	Off unless `BCA_DIFF_BASE` / `GITHUB_BASE_REF` / `GITHUB_EVENT_BEFORE` supplies a base
`--changed-only`	Drop violations outside the diff entirely	Off
`--github-annotations <auto|always|never>`	Emit `::error file=…::msg` workflow commands (bare flag = `always`)	`auto` enables when `$GITHUB_ACTIONS == "true"`; `never` opts out
`--summary-file <path|auto|never>`	Append markdown digest; `never` opts out	`auto` detects `$GITHUB_STEP_SUMMARY`
`--no-remediation`	Suppress the trailing `--- next steps ---` block	Block emitted on failure

Local users running bca check outside GHA see no change in default behaviour: none of the four features auto-enable without an env signal. To preview the GHA experience locally:

GITHUB_ACTIONS=true GITHUB_STEP_SUMMARY=/tmp/bca-summary.md \
  BCA_DIFF_BASE=main \
  bca check --paths . --exclude-from .bcaignore \
      --baseline .bca-baseline.toml
cat /tmp/bca-summary.md

For a non-GHA CI (GitLab, Buildkite, Jenkins), set the env vars your runner exposes (or pass the flags explicitly) and the same output paths fire.

Offender-count delta against merge base (stopgap)

For teams who cannot commit a baseline file (e.g. policy reasons), a coarser approximation counts <error> elements in two Checkstyle documents — one on the merge base, one on the PR head — and fails when the count grows:

- name: Compute offender deltas vs. merge base
  run: |
    set -euo pipefail
    BASE="$(git merge-base origin/main HEAD)"
    git worktree add /tmp/base "$BASE"

    bca check --paths /tmp/base \
        --report-format checkstyle \
        --output /tmp/base.xml \
        --no-fail
    BASE_COUNT=$(grep -c "<error" /tmp/base.xml || true)

    bca check --paths "$PWD" \
        --report-format checkstyle \
        --output /tmp/head.xml \
        --no-fail
    HEAD_COUNT=$(grep -c "<error" /tmp/head.xml || true)

    echo "Offenders: base=$BASE_COUNT head=$HEAD_COUNT"
    if [ "$HEAD_COUNT" -gt "$BASE_COUNT" ]; then
      echo "::error::Offender count grew from $BASE_COUNT to $HEAD_COUNT"
      exit 1
    fi

This counts violations, not their identity: renaming an offender does not register as a regression, fixing one offender while introducing another nets to zero, and an existing offender that gets worse is not detected at all. The native baseline flow above is strictly more precise and is the recommended approach.
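
The blind spot is easy to see with a toy model. The sketch below compares a count-based gate with an identity-based one; both functions and the sample data are illustrative only and do not reflect how bca stores its baseline.

```python
def count_gate_fails(base, head):
    """Stopgap: fail only when the number of offenders grows."""
    return len(head) > len(base)


def identity_gate_fails(base, head):
    """Ratchet: fail on any new offender or any worsened value."""
    new = [k for k in head if k not in base]
    regressed = [k for k in head if k in base and head[k] > base[k]]
    return bool(new or regressed)


# Keys are (path, function, metric); values are the measured metric.
base = {
    ("src/a.rs", "parse", "cyclomatic"): 20,
    ("src/b.rs", "walk", "cyclomatic"): 18,
}
head = {
    ("src/a.rs", "parse", "cyclomatic"): 35,  # got worse
    ("src/c.rs", "emit", "cyclomatic"): 16,   # brand new; walk was fixed
}
print(count_gate_fails(base, head), identity_gate_fails(base, head))
```

Two offenders on each side means the count gate passes even though one function regressed and another is new.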

Self-scan threshold gate (local mirror of the CI gate)

CI's threshold gate fires only after push, which is too late if a refactor silently nudged a metric past its limit. The big-code-analysis repo's Makefile exposes four targets that mirror the CI gate (the Threshold gate step in .github/workflows/pages.yml) locally and add a second tier at 95% of every limit so encroachment is caught a commit or two before the hard gate trips:

make self-scan                            # hard gate, 100% of bca.toml thresholds
make self-scan-headroom                   # soft gate, default 95% (BCA_HEADROOM)
make self-scan-write-baseline             # refresh baseline at hard thresholds
make self-scan-write-baseline-headroom    # refresh baseline at soft thresholds

Path selection, the .bcaignore deny-set, the per-function thresholds, the cyclomatic ? policy, and the baseline file all live in the repo-root bca.toml manifest, which bca discovers automatically. The hard tier is exactly what CI runs; expanded, it is a bare check (no path / threshold / baseline flags — the manifest supplies them):

cargo run --quiet --release -p big-code-analysis-cli -- check

Both tiers consume the same bca.toml thresholds and the same .bca-baseline.toml; the soft tier just runs the hard recipe with every threshold value multiplied by BCA_HEADROOM. Both exit 0 clean, 2 on any threshold violation, 1 on tool error — the soft tier is a real gate, not advisory, so do not wrap make self-scan-headroom in || true. The two gate targets (self-scan, self-scan-headroom) are wired into make pre-commit, make ci, and .pre-commit-config.yaml; those chains run the hard tier before the soft tier, so a true regression always reports before near-limit headroom. The two write-baseline targets are side-effecting and deliberately not wired in.

BCA_HEADROOM=0.90 make self-scan-headroom widens the band; BCA_HEADROOM=0.99 tightens it to the last 1%. When the soft tier fires, absorb the offender into the baseline with make self-scan-write-baseline-headroom (which records every offender at the scaled thresholds — strictly a superset of the hard-tier offenders).
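
The soft tier amounts to comparing against scaled limits. The sketch below assumes a plain multiplication of every threshold and a strict greater-than comparison; the repo's actual helper script may round, or treat lower-is-worse metrics differently.

```python
def scale_thresholds(thresholds, headroom=0.95):
    """Multiply every limit by the headroom factor (0 < headroom <= 1)."""
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    return {metric: limit * headroom for metric, limit in thresholds.items()}


def trips(value, limit):
    """A value is an offender when it exceeds the limit (assumption: strict >)."""
    return value > limit


hard = {"cyclomatic": 15, "cognitive": 20}
soft = scale_thresholds(hard, headroom=0.95)

# A function sitting exactly at the hard limit passes the hard tier
# but is already inside the early-warning band.
print(trips(15, hard["cyclomatic"]), trips(15, soft["cyclomatic"]))
```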

The pattern (hard tier mirroring CI + soft tier as early-warning band, both ratcheted by the same baseline) is project-agnostic — the Local threshold gates recipe documents the underlying principles, drop-in Makefile / just / package.json skeletons, and the helper script that scales thresholds, so you can adopt the same workflow in your own repo. The generic recipe uses the same BCA_* env-var names as the Makefile above, so overrides like BCA_HEADROOM=0.90 work identically across both.

GitLab CI

Full `.gitlab-ci.yml` example

The job below installs bca, runs the threshold check producing Code Climate JSON (for the MR Code Quality widget), Checkstyle XML, and a Markdown report, then uploads them as artifacts.

The same CLI-artifact schema-compatibility note from the GitHub Actions section applies here — the BCA_VERSION pin must cover the schema version of every CLI artifact you commit.

stages:
  - quality

variables:
  BCA_VERSION: "2.0.0"  # pin a published big-code-analysis-cli release
  BCA_TARGET:  "x86_64-unknown-linux-gnu"
  # sha256 of big-code-analysis-${BCA_VERSION}-${BCA_TARGET}.tar.gz from
  # the release's SHA256SUMS file. Bump together with BCA_VERSION.
  BCA_SHA256:  "a205fff13108d0f8c679a062e352ba8468109c4adfdd8c9e3567cf5fcc99c3d5"

bca-quality:
  stage: quality
  image: debian:stable-slim
  cache:
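    # Same key shape as the GitHub Actions snippet: target, version, and
    # sha256. Without the sha256 in the key, a rotated checksum would be
    # served a stale cached binary, because the `sha256sum --check` below
    # only runs on a cache miss.
    key: "bca-$BCA_TARGET-$BCA_VERSION-$BCA_SHA256"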
    paths:
      - .cache/bca/
  before_script:
    - apt-get update -qq && apt-get install -y --no-install-recommends ca-certificates curl tar
    - |
      set -euo pipefail
      install -d "$CI_PROJECT_DIR/.cache/bca" "$HOME/.local/bin"
      if [ ! -x "$CI_PROJECT_DIR/.cache/bca/bca" ]; then
        stage="big-code-analysis-${BCA_VERSION}-${BCA_TARGET}"
        tarball="${stage}.tar.gz"
        url="https://github.com/dekobon/big-code-analysis/releases/download/v${BCA_VERSION}/${tarball}"
        curl -fsSL --proto '=https' --tlsv1.2 -o "/tmp/${tarball}" "$url"
        echo "${BCA_SHA256}  /tmp/${tarball}" | sha256sum --check --strict -
        tar -xzf "/tmp/${tarball}" -C /tmp
        install -m 0755 "/tmp/${stage}/bca" "$CI_PROJECT_DIR/.cache/bca/bca"
        rm -rf "/tmp/${tarball}" "/tmp/${stage}"
      fi
      install -m 0755 "$CI_PROJECT_DIR/.cache/bca/bca" "$HOME/.local/bin/bca"
      export PATH="$HOME/.local/bin:$PATH"
  script:
    - bca
        check
        --paths "$PWD"
        --report-format code-climate
        --output gl-code-quality-report.json
        --no-fail
    - bca
        check
        --paths "$PWD"
        --report-format checkstyle
        --output bca-checkstyle.xml
        --no-fail
    - bca
        report -O markdown
        --paths "$PWD"
        --top 20
        --strip-prefix "$PWD/"
        --output bca-report.md
    # The threshold gate runs separately so the artifacts above still
    # publish on failure. Exit 2 = at least one threshold exceeded.
    - bca check --paths "$PWD"
  artifacts:
    when: always
    reports:
      codequality: gl-code-quality-report.json
    paths:
      - gl-code-quality-report.json
      - bca-checkstyle.xml
      - bca-report.md

A few notes about the example:

The first two bca check … --no-fail invocations collect offenders for the artifacts; the final bca check (no --no-fail) is the pass/fail gate. All three runs use the same threshold config so the artifacts always match the gate decision.
artifacts:when: always ensures every artifact is downloadable even on a red pipeline — which is exactly when you want them most.
artifacts:reports:codequality wires the Code Climate JSON directly into GitLab's MR Code Quality widget — see the Code Quality widget section below for the field-by-field semantics.

Code Quality widget

GitLab's first-class Code Quality experience (inline complaints on the MR diff, summary on the MR overview page) consumes Code Climate JSON. bca check emits this natively via --report-format code-climate, so the integration is a one-liner:

code_quality:
  stage: quality
  script:
    - bca check --paths "$CI_PROJECT_DIR"
          --report-format code-climate
          --output gl-code-quality-report.json
          --no-fail
  artifacts:
    when: always
    reports:
      codequality: gl-code-quality-report.json
    paths:
      - gl-code-quality-report.json

Severity bands are derived from how far each metric exceeds its configured threshold (value / limit ratio, inverted for the maintainability-index family where lower is worse): ≤ 1.5× → minor, ≤ 2× → major, ≤ 4× → critical, > 4× → blocker. The widget deduplicates findings by fingerprint; bca hashes path \0 function \0 metric (no line, no value) so a violation surviving an upstream line-drift edit still collapses into the same widget entry across pipeline runs.
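
The banding and fingerprint rules can be sketched as below. The band boundaries and the fingerprint fields come from the paragraph above; the function names, the handling of non-positive values, and the use of SHA-256 are assumptions, since the actual hash function is not stated here.

```python
import hashlib


def severity(value, limit, lower_is_worse=False):
    """Map how far a metric exceeds its threshold to a severity band."""
    if lower_is_worse:
        # Inverted ratio for the maintainability-index family.
        ratio = limit / value if value > 0 else float("inf")
    else:
        ratio = value / limit
    if ratio <= 1.5:
        return "minor"
    if ratio <= 2:
        return "major"
    if ratio <= 4:
        return "critical"
    return "blocker"


def fingerprint(path, function, metric):
    """Hash path, function, and metric only: no line number, no value."""
    key = "\0".join((path, function, metric)).encode("utf-8")
    return hashlib.sha256(key).hexdigest()  # hash choice is an assumption


print(severity(18, 15), severity(45, 15), severity(10, 20, lower_is_worse=True))
print(fingerprint("src/foo.rs", "do_thing", "cognitive")[:12])
```

Because neither the line nor the value feeds the hash, the same function keeps the same fingerprint after unrelated edits shift it down the file.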

Sanity-check a generated report locally:

jq 'all(.[]; has("description") and has("check_name")
     and has("fingerprint") and has("severity")
     and has("location"))' gl-code-quality-report.json
# → true
jq '[.[] | .severity] | unique' gl-code-quality-report.json
# → a subset of ["info","minor","major","critical","blocker"]

MR-only comment with the Markdown report

To attach the Markdown report as an MR note (the GitLab analogue of the GitHub PR comment recipe), use a project access token and the Notes API:

bca-mr-comment:
  stage: quality
  image: alpine:3
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  needs: ["bca-quality"]
  before_script:
    - apk add --no-cache curl jq
  script:
    - |
      BODY=$(jq -Rs '.' < bca-report.md)
      curl --fail --silent --show-error \
        --request POST \
        --header "PRIVATE-TOKEN: $CI_BCA_BOT_TOKEN" \
        --header "Content-Type: application/json" \
        --data "{\"body\": $BODY}" \
        "$CI_API_V4_URL/projects/$CI_PROJECT_ID/merge_requests/$CI_MERGE_REQUEST_IID/notes"

CI_BCA_BOT_TOKEN is a project access token with api scope. The job depends on bca-quality so the Markdown artifact is in place before it runs.

Jenkins / SonarQube

Both Jenkins (via the Warnings Next Generation plugin) and SonarQube (via its Generic Issue importer) consume Checkstyle 4.3 XML directly. The same invocation feeds both:

bca check --paths src/ \
    --report-format checkstyle \
    --output report.checkstyle.xml

Wire report.checkstyle.xml into your existing Jenkins Record Issues / SonarQube External Issues step. The Checkstyle writer emits an empty (well-formed) document when there are no offenders, so neither tool needs special-casing for a clean run. See the Check command page for the writer's schema details.

Generic CI guidance

Applies regardless of provider:

Pin bca to a specific version. Both cargo install --version and cargo binstall --version accept the published crate version of big-code-analysis-cli. A floating install surfaces metric-counting changes as "mysterious CI flakes" on Mondays. Pin to a version whose CLI-artifact schemas (baseline, thresholds) match the files your repo commits — see the schema-compatibility note in the install section.
--jobs defaults to the effective CPU count. The flag honors available_parallelism() — cgroup-/cpuset-/quota-aware on Linux, OS CPU count on macOS/Windows — so CI runners no longer need to thread --jobs "$(nproc)" through every recipe. --jobs 1 remains a debugging knob, not a default.
Always pass --strip-prefix "$PWD/" to bca report markdown so the path column is identical across runners with different workspace paths. Without it the diff between two reports is dominated by /home/runner/work/... vs. /builds/group/project/... noise.
Store bca.toml at the repo root, alongside Cargo.toml / pyproject.toml / package.json. bca discovers it automatically, so a bare bca check reads the committed thresholds, paths, and baseline. Treat it as source: review threshold relaxations in code review.
Exit-code contract. bca check exits 0 clean, 2 on any threshold violation, 1 on tool error (bad config, unknown metric, unreadable path). Reserving 1 for tool errors lets CI distinguish "a function got too complex" from "the analyzer crashed". Pass --exit-codes=tiered (or set [check] exit_codes = "tiered" in bca.toml) to split the violation case by severity: 2 new offenders only, 3 regressions only, 4 both, 5 a --tier=soft violation that also breaches the hard limit. The tiered codes are opt-in; the default stays 0/1/2. Every fail-state remains non-zero, so exit != 0 → fail wrappers keep working — only tooling that tests $? -eq 2 explicitly needs to widen to 2-5.
Honor in-source suppression markers, audit with --no-suppress. The default bca check honors bca: suppress / bca: suppress-file markers; passing --no-suppress ignores them so auditors see the raw offender list.
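
For wrappers that branch on the exit code, the contract above can be captured in a small lookup. The sketch is a hypothetical CI-side helper, not part of bca; the labels paraphrase the bullet on exit codes.

```python
TIERED = {
    2: "new offenders only",
    3: "regressions only",
    4: "new offenders and regressions",
    5: "soft-tier violation that also breaches the hard limit",
}


def classify_exit(code, tiered=False):
    """Translate a bca check exit code into a human-readable outcome."""
    if code == 0:
        return "clean"
    if code == 1:
        return "tool error"
    if tiered and code in TIERED:
        return TIERED[code]
    if code == 2:
        return "threshold violation"
    return "unexpected exit code"


def is_violation(code):
    """Widened test for tooling that used to check for exactly 2."""
    return 2 <= code <= 5


print(classify_exit(2), "|", classify_exit(3, tiered=True))
```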

Baselines: ratcheting thresholds on existing code

When you introduce metric thresholds on an existing codebase, you usually hit the same wall: every reasonable threshold flags hundreds of existing functions, and CI goes red on every push. The realistic adoption path is "ratchet from current state, fail only on new offenders". The baseline file is how bca check supports that workflow.

Baselines are the complement to in-source suppression markers, not a substitute. Use suppression markers (see Suppression markers) when a function is intentionally complex forever (a parser, a state machine, generated code). Use a baseline when the team intends to pay the debt down. Both can live in the same repo; suppression is checked first.

End-to-end adoption flow

One-shot shortcut: bca init scaffolds a consolidated bca.toml manifest (with paths, exclude_from, baseline, and a [thresholds] table), the .bcaignore it references, and an initial .bca-baseline.toml derived from the current tree in a single command. With the manifest in place, a bare bca check discovers it and gates zero-config; pass --force to overwrite existing files or --no-baseline to skip the walk. The longer recipe below is useful when you want to tune thresholds before bootstrapping the baseline.

1. Pick initial thresholds

Either start from gut-feel numbers (cyclomatic=15, cognitive=20) or derive them from a bca check --no-fail run over the repo, which shows the current distribution.

# bca.toml — dropped at the repo root, auto-discovered by `bca check`.
paths = ["src"]

[check]
baseline = ".bca-baseline.toml"

[thresholds]
cyclomatic = 15
cognitive = 20
"loc.lloc" = 200

2. Bootstrap the baseline

bca check --write-baseline

A bare --write-baseline (no path) writes to the baseline key from the bca.toml you just created, so the filename lives in exactly one place. Pass an explicit path (--write-baseline <file>) only when you have no manifest baseline to default to — without one, the bare form errors rather than guessing a filename.

Commit both files in the same change:

git add bca.toml .bca-baseline.toml
git commit -m "ci: introduce metric thresholds with baseline"

Path keys in the baseline are stored relative to the baseline file's own directory (the anchor). --paths ., --paths src/, and --paths "$PWD" produce byte-identical baselines, and --baseline runs match regardless of which --paths form CI uses — switch between them freely without re-running --write-baseline.
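
The anchoring rule can be illustrated with a few lines of path arithmetic. This is a sketch of the idea, not bca's code: baseline_key is a hypothetical name, and posixpath is used so the example behaves the same on every platform.

```python
import posixpath


def baseline_key(source_path, baseline_file, cwd):
    """Express a source path relative to the baseline file's directory."""
    anchor = posixpath.dirname(posixpath.normpath(posixpath.join(cwd, baseline_file)))
    absolute = posixpath.normpath(posixpath.join(cwd, source_path))
    return posixpath.relpath(absolute, anchor)


cwd = "/repo"
keys = {
    baseline_key(p, ".bca-baseline.toml", cwd)
    for p in ("./src/lib.rs", "src/lib.rs", "/repo/src/lib.rs")
}
print(keys)
```

All three spellings of the same file collapse to one key, which is why switching between --paths forms does not require a baseline refresh.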

3. Wire the CI gate

GitHub Actions:

- name: Check code complexity thresholds
  run: |
    bca check
  # `paths`, thresholds, and `baseline` all come from the
  # auto-discovered `bca.toml` manifest at the repo root.

GitLab CI (snippet for the relevant job):

threshold-check:
  image: rust:1
  before_script:
    - cargo install --locked big-code-analysis-cli@<VERSION>
  script:
    - bca check

Exit codes: 0 clean, 2 regression or new offender, 1 tool error. See CI integration for the broader matrix of CI surfaces.

4. Refresh the baseline as the team pays debt down

Every few weeks, or after a focused refactor:

cp .bca-baseline.toml .bca-baseline.old.toml
bca check --write-baseline
bca diff-baseline .bca-baseline.old.toml .bca-baseline.toml

A shrinking diff is the goal. Two --write-baseline runs over an unchanged tree produce byte-identical output, so spurious diffs only appear when actual offenders changed.

5. PR-review heuristics

Run bca diff-baseline <old> <new> and read the summary instead of parsing a raw git diff .bca-baseline.toml in your head. It pairs entries on their (path, qualified, metric) identity — so a function that merely drifted up or down the file is not reported as a remove + add — and buckets every real change:

1 added, 1 removed, 2 worsened, 0 improved

## Added
  src/new.rs::shiny        cognitive  = 30

## Removed
  src/gone.rs::old_fn      nargs      = 9

## Worsened
  src/bar.rs::act_on_file  cognitive  60 → 63
  src/foo.rs::do_thing     cognitive  25 → 27

Map the buckets back to the old heuristics:

removed (baseline shrank). Debt paid down. No further action.
added (baseline grew). Someone added a new offender to the file intentionally. Review the values — was this a deliberate stopgap, or did the author bypass the gate? Either is fine if conscious; the point of the file being committed is to make the choice reviewable.
worsened (an entry got a higher value). The author re-ran --write-baseline after the function got worse. Treat the same as added — surface the change in review.
improved. A recorded offender got better without dropping out of the baseline; harmless, and a good sign the refactor is working.

For a PR bot, bca diff-baseline <old> <new> --format markdown emits a fenced block ready to drop into a sticky comment, and the --worsened-only / --added-only filters narrow it to just the regressions reviewers must look at. --format json feeds the same diff to other tooling. The command exits 0 by default — it informs review, it does not gate (the gate is bca check itself) — though the opt-in --exit-code flag exits 2 on a non-empty filtered diff.
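
The bucketing described above can be sketched in a few lines. This is a minimal illustration, assuming each baseline is reduced to a mapping from the (path, qualified, metric) identity to a recorded value and that a higher value is always worse. The function name bucket_changes and the in-memory representation are hypothetical, not the tool's internal schema.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str, str]  # (path, qualified symbol, metric)


def bucket_changes(old: Dict[Key, float], new: Dict[Key, float]) -> Dict[str, List[Key]]:
    """Pair entries on their identity and sort every real change into a bucket."""
    buckets: Dict[str, List[Key]] = {
        "added": sorted(new.keys() - old.keys()),
        "removed": sorted(old.keys() - new.keys()),
        "worsened": [],
        "improved": [],
    }
    for key in sorted(old.keys() & new.keys()):
        if new[key] > old[key]:
            buckets["worsened"].append(key)
        elif new[key] < old[key]:
            buckets["improved"].append(key)
    return buckets


old = {
    ("src/gone.rs", "old_fn", "nargs"): 9,
    ("src/bar.rs", "act_on_file", "cognitive"): 60,
    ("src/foo.rs", "do_thing", "cognitive"): 25,
}
new = {
    ("src/new.rs", "shiny", "cognitive"): 30,
    ("src/bar.rs", "act_on_file", "cognitive"): 63,
    ("src/foo.rs", "do_thing", "cognitive"): 27,
}
result = bucket_changes(old, new)
# prints "1 added, 1 removed, 2 worsened, 0 improved"
print(", ".join(f"{len(result[name])} {name}"
                for name in ("added", "removed", "worsened", "improved")))
```

Line numbers play no part in the identity, which is why a function that only moved within its file lands in no bucket at all.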

Reading the gate output

A failing bca check --baseline run prefixes each surviving violation with a tag and follows the list with a per-file rollup:

bca: filtered 422 violations via baseline
[regr +60%] src/foo.rs:1-865: <file>: halstead.effort = 1557107.72 (limit 50000)
[new] src/bar.rs:506-747: act_on_file: cognitive = 63 (limit 25)
...

--- summary ---
src/foo.rs: 5 violations (worst: halstead.effort = 1557107.72 vs limit 50000 at L1)
src/bar.rs: 4 violations (worst: cognitive = 63 vs limit 25 at L506)

Tag prefixes:

[new] — no baseline entry matched this violation by qualified symbol (within the line tolerance) or, when --baseline-fuzzy-match is set, by body hash. The violation is new since the baseline was written. See Matching for the resolution order.
[regr +N%] — the baseline contains a recorded value and the current value is N% higher. Cases:
- [regr from 0] when the recorded value is 0.0 and a non-zero percentage would divide by zero.
- [regr +>9999%] caps once the regression exceeds 100× the baseline value.
- [regr NaN] when the current metric value is NaN (degenerate Halstead inputs on trivial functions).
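
As a rough sketch of how the tag cases above fit together, the following function picks a tag from a recorded and a current value. The name regr_tag, the rounding to a whole percent, and the exact cap boundary (anything above 9999%) are assumptions made for illustration; the tool's own formatting may differ.

```python
import math


def regr_tag(recorded: float, current: float) -> str:
    """Choose a regression tag for a violation that matched a baseline entry."""
    if math.isnan(current):
        return "[regr NaN]"          # degenerate metric value
    if recorded == 0.0:
        return "[regr from 0]"       # a percentage would divide by zero
    percent = (current - recorded) / recorded * 100.0
    if percent > 9999:
        return "[regr +>9999%]"      # capped (assumed boundary)
    return f"[regr +{percent:.0f}%]"


print(regr_tag(60, 63))
print(regr_tag(0.0, 4.0))
print(regr_tag(1.0, 500.0))
print(regr_tag(10.0, float("nan")))
```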

Tags only appear when --baseline is passed; without it the line format is byte-identical to the no-baseline default. CI tooling that grep-pipes the stderr stream can suppress the trailing summary with --no-summary.

The summary footer groups violations by file, cites the single worst metric per file (max value / limit ratio), and sorts rows by violation count descending then path ascending. It is the fastest way to read a long offender list and spot which file to start with.
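
The grouping and ordering rules of the footer can be sketched as follows. The Violation record and the summarize function are hypothetical names, limits are assumed to be positive, and the row text only approximates the real output.

```python
from collections import defaultdict
from typing import Iterable, List, NamedTuple


class Violation(NamedTuple):
    path: str
    metric: str
    value: float
    limit: float
    start_line: int


def summarize(violations: Iterable[Violation]) -> List[str]:
    """Group by file, cite the worst value/limit ratio, sort by count then path."""
    by_file = defaultdict(list)
    for v in violations:
        by_file[v.path].append(v)
    # Most violations first; ties broken by path, ascending.
    ordered = sorted(by_file.items(), key=lambda item: (-len(item[1]), item[0]))
    rows = []
    for path, items in ordered:
        worst = max(items, key=lambda v: v.value / v.limit)
        rows.append(
            f"{path}: {len(items)} violations "
            f"(worst: {worst.metric} = {worst.value} vs limit {worst.limit} "
            f"at L{worst.start_line})"
        )
    return rows


sample = [
    Violation("src/b.rs", "cyclomatic", 16, 15, 5),
    Violation("src/a.rs", "cognitive", 30, 25, 10),
    Violation("src/a.rs", "nargs", 9, 7, 40),
]
for row in summarize(sample):
    print(row)
```

Ranking by ratio rather than raw value keeps a large-valued metric such as halstead.effort from always winning the "worst" slot.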

6. Retire the baseline

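When .bca-baseline.toml contains only version = 5 and no entries, delete the file and stop pointing the gate at it: drop the --baseline flag from CI, or remove the baseline key under [check] if the gate reads it from bca.toml. The thresholds now stand on their own.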

Tier/headroom provenance

A baseline written with --write-baseline (v5+) records which gate it was written against in a [provenance] table:

version = 5

[provenance]
tier = "soft"
headroom = 0.95

tier = "hard" — written by the hard gate (bca check --write-baseline …); no headroom key.
tier = "soft", headroom = <ratio> — written by the soft gate scaled by the soft ratio (bca check --tier=soft=0.95 --write-baseline …).
tier = "soft" with no headroom — written by the soft gate driven by a [thresholds.soft] table (per-metric limits, no single ratio).

The provenance is a real TOML table, not a comment, so bca diff-baseline and external tooling can read it. Baselines written by an older bca (v2–v4) carry no provenance and are read without error.

The stricter-than-baseline warning

bca check reduces the baseline's provenance and the current run's effective limits to a single strictness scalar (hard → 1.0; soft scaled by h → h; smaller means stricter) and warns when the current run is stricter than the baseline was written against:

warning: this check's effective limits (strictness 0.9) are stricter
than the baseline was written against (strictness 0.95); the baseline
may under-cover and the gate can fire on untouched files. Refresh it at
the matching tier, …

This is the silent-desync the baseline-refresh discipline guards against: a baseline written looser than the current gate may not list every offender the tighter gate produces, so the gate can suddenly fire on files nobody touched.

The warning is directional. It fires only when the current run is stricter. It stays silent in the safe direction — a hard check (strictness 1.0) reading a soft-0.95 baseline sees a superset of its offenders, which is exactly the intended single-baseline setup where make self-scan (hard) and make self-scan-headroom (soft) ratchet through the same .bca-baseline.toml. It also stays silent for equal provenance, for pre-v5 baselines (provenance unknown), and when either side is a [thresholds.soft]-table baseline (no single ratio to compare). To clear a genuine warning, refresh the baseline at the current tier with the matching --write-baseline recipe.
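
The directional rule reduces to a small comparison. The sketch below assumes the strictness scalar is exactly as described (hard maps to 1.0, soft with a ratio maps to that ratio) and uses None for the cases with no single scalar. The function names are hypothetical.

```python
from typing import Optional


def strictness(tier: Optional[str], headroom: Optional[float] = None) -> Optional[float]:
    """Reduce a tier/headroom pair to one scalar; smaller means stricter."""
    if tier == "hard":
        return 1.0
    if tier == "soft" and headroom is not None:
        return headroom
    # Soft tier driven by a [thresholds.soft] table, or a pre-v5 baseline
    # with no provenance: there is no single ratio to compare.
    return None


def should_warn(current: Optional[float], baseline: Optional[float]) -> bool:
    """Warn only when the current run is stricter than the baseline's gate."""
    if current is None or baseline is None:
        return False
    return current < baseline


print(should_warn(strictness("soft", 0.9), strictness("soft", 0.95)))
print(should_warn(strictness("hard"), strictness("soft", 0.95)))
```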

How matching works

Each entry is keyed on (path, qualified_symbol, metric) — the qualified symbol being the ::-joined chain of enclosing named containers plus the function name (MyStruct::do_thing, my_namespace::MyClass::method). The top-level file space collapses to <file>. A violation is resolved against the baseline in this order:

Qualified symbol. If exactly one entry shares the violation's (path, qualified_symbol, metric), it matches regardless of line number — so editing code above a function no longer re-keys it as [new].
Start-line tolerance. If several entries share that key (two methods named is_valid on different impl blocks the analyzer could not tell apart, overloads, …), the entry whose recorded start_line is closest to the violation — and within --baseline-line-tolerance lines (default 50) — wins. Beyond the tolerance the violation is [new].
Body hash (opt-in). With --baseline-fuzzy-match, a violation whose qualified symbol no longer matches is matched against entries with an identical normalised body hash within the same (path, metric). This absorbs a rename that kept the function's shape (the digest elides the function's own name and is insensitive to indentation, blank lines, and CRLF). The hash is written into the baseline only when --baseline-fuzzy-match is set, so seed it with one fuzzy --write-baseline to enable fuzzy reads.

Both matching options can also be set under [check] in bca.toml, as baseline_line_tolerance and baseline_fuzzy_match (the bare top-level spelling is deprecated and warns; see #599).

Anonymous functions (closures, lambdas) have no stable name, so their qualified symbol bakes in the line (outer::<anon@L42>). They therefore re-key as [new] when they move — the symbol fix only survives line drift for named top-level and method-bound functions, which produce the bulk of baseline churn.
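
The resolution order above can be summarised as a sketch. It assumes each baseline entry carries a path, qualified symbol, metric, start line, and an optional body hash, and that "within the tolerance" is inclusive. The Entry record and the resolve function are hypothetical; a return value of None corresponds to a [new] violation.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Entry:
    path: str
    symbol: str
    metric: str
    start_line: int
    body_hash: Optional[str] = None


def resolve(violation: Entry, baseline: List[Entry],
            line_tolerance: int = 50, fuzzy: bool = False) -> Optional[Entry]:
    """Return the matching baseline entry, or None when the violation is new."""
    key = (violation.path, violation.symbol, violation.metric)
    same_key = [e for e in baseline if (e.path, e.symbol, e.metric) == key]
    # 1. Exactly one entry shares the key: match regardless of line number.
    if len(same_key) == 1:
        return same_key[0]
    # 2. Several entries share the key: nearest start line, within tolerance.
    if same_key:
        nearest = min(same_key,
                      key=lambda e: abs(e.start_line - violation.start_line))
        if abs(nearest.start_line - violation.start_line) <= line_tolerance:
            return nearest
        return None
    # 3. Opt-in: identical body hash within the same (path, metric).
    if fuzzy and violation.body_hash is not None:
        for e in baseline:
            if (e.path, e.metric, e.body_hash) == (
                    violation.path, violation.metric, violation.body_hash):
                return e
    return None


baseline = [Entry("src/foo.rs", "MyStruct::do_thing", "cognitive", 120, "abc")]
moved = Entry("src/foo.rs", "MyStruct::do_thing", "cognitive", 400)
renamed = Entry("src/foo.rs", "MyStruct::do_it", "cognitive", 120, "abc")
print(resolve(moved, baseline) is not None)
print(resolve(renamed, baseline) is not None)
print(resolve(renamed, baseline, fuzzy=True) is not None)
```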

When the gate finds violations, bca check emits a trailing --- next steps --- block on stderr (and inside the $GITHUB_STEP_SUMMARY digest) that names the artifact, prints a copy-paste-safe --write-baseline refresh invocation, and links back to this recipe. The refresh invocation mirrors the gate's resolved --paths / --exclude / --exclude-from / --config / --baseline arguments, so a first-time reader of a failing CI log can refresh the baseline without leaving the page.

Suppress the block with --no-remediation if downstream tooling grep-pipes the stderr stream and the trailing block confuses it.

Composition with suppression markers

--write-baseline already excludes any function silenced by a bca: suppress or #lizard forgives marker, so the same function doesn't end up in two places. If a function is intentionally exempt forever, prefer the in-source marker (lives next to the code, survives refactors, no extra file to commit). Use the baseline only for violations the team genuinely intends to fix.

To audit the un-filtered offender set — every violation regardless of suppression or baseline — pass --no-suppress and omit --baseline:

bca check --paths src/ \
    --no-suppress \
    --no-fail

Combined with --write-baseline, --no-suppress records every violation including the ones that suppression markers normally hide.

Auditing every exemption at once

A baseline is one of three ways code escapes the gate; the other two are in-source bca: suppress markers and [check.exclude] globs. bca exemptions lists all three tiers in a single report so a reviewer can see everything bca check is skipping without running three commands:

bca exemptions --paths src/

# In-source markers (2)
  src/parser.rs:120  bca: suppress       metrics=all  parse_long
  ...

# [check.exclude] globs (1)
  tests/**

# Baseline (.bca-baseline.toml, 417 entries)
  src/markdown_report.rs:88 write_language_section cognitive 29
  ...

The baseline section reads the same --baseline / bca.toml [check] baseline source bca check does (or .bca-baseline.toml by default). Use --baseline-only to list just the baselined offenders, --format markdown for a PR comment, or --format json for dashboards. During PR review, pair it with bca diff-baseline <old> <new> (above): the diff shows what changed in the baseline, bca exemptions shows the full current exemption surface. See the Suppression markers page for the complete flag reference.

Limitations

Ambiguous symbols. When two functions share a qualified symbol (the analyzer could not resolve distinct containers, or a language permits overloads) and both have drifted beyond --baseline-line-tolerance from their recorded lines, neither disambiguates and the violations surface as [new]. Refresh with --write-baseline, or raise the tolerance.
Anonymous functions. Closures and lambdas re-key on movement because their synthetic symbol embeds the line (see How matching works).
OS portability. Paths are normalized to forward slashes on write and re-normalized on read, so a baseline generated on Linux matches the same tree on Windows. Non-UTF-8 paths fall back to a lossy display form and may not round-trip exactly.
Tightening a threshold. Lowering a limit may newly expose functions that were previously clean. They will not be in the baseline, so CI will fail. This is correct: tightening should expose new offenders. Refresh the baseline if the team chooses to absorb the new entries.

Local threshold gates

CI is the last line of defence, not the first. By the time bca check (reading the repo-root bca.toml manifest and its .bca-baseline.toml) fires red on a pull request, the offending change has already been pushed, the author has context-switched, and someone has to revisit the diff to nudge a metric back under its limit. A local threshold gate moves that feedback to the moment of git commit — the same moment cargo fmt --check and cargo clippy -- -D warnings already fire — so the regression never makes it past the developer's keyboard.

This recipe captures the pattern big-code-analysis uses on its own source (Makefile's self-scan* targets, backed by a consolidated bca.toml manifest) and distils it into something you can drop into your own repo's Makefile, justfile, package.json script, or pre-commit config. The underlying idea is provider-neutral: any threshold checker (bca, ESLint, clippy, SonarLint, Qodana) can be wired the same way.

Principles

Three principles drive the design. They are not specific to bca; they are the same conclusions Sonar reached when it pivoted its default Quality Gate to focus on new code and that the broader ratchet pattern formalises.

Gate locally, mirror CI exactly. The local gate must run the same binary with the same arguments and the same threshold / baseline / exclude files as CI. If the local gate is "almost what CI runs", it stops catching regressions the moment the two diverge. Running the gate once before pushing is cheap; a red PR-bot ping is not.
Ratchet, don't reset. When you introduce thresholds on an existing codebase, every reasonable limit fires on dozens of pre-existing functions. The realistic adoption path is "absorb today's offenders into a baseline file, fail only on new or worsening ones, shrink the baseline over time". This is the same strategy that lets a multi-year codebase introduce strict TypeScript or strict clippy lints without a months-long boil-the-ocean pass. See the Baselines recipe for the bootstrap → CI → refresh → retire flow.
Warn before you fail. A hard 100% gate fails at the limit and gives no signal as a function creeps from 80% to 95% to 99% of its threshold. A second, looser tier that fires at e.g. 95% of every limit gives a one-or-two-commit early warning. The author still has the file open, the test cases in their head, and the freedom to refactor before the offender hardens into "well, it's in main now". Sonar's "new code" Quality Gate, the GCC -Wall / -Werror split, and clippy's warn vs. deny lint levels all encode the same insight: a tier between clean and broken is where teams actually catch drift.

The two tiers

The pattern is two recipes wrapping the same checker, plus two recipes for refreshing the baseline at each tier.

Target	Tier	Thresholds	Baseline-filtered	Use case
`self-scan`	hard	100% of config	yes	Mirror of CI. Must stay green on every commit.
`self-scan-headroom`	soft	config × `HEADROOM`	yes	Early-warning band. Fires before the hard tier.
`self-scan-write-baseline`	hard	100% of config	(write)	Absorb today's hard-tier offenders.
`self-scan-write-baseline-headroom`	soft	config × `HEADROOM`	(write)	Absorb soft-tier offenders when launching or widening the band.

The hard tier and the soft tier consume the same [thresholds] table and the same .bca-baseline.toml. The only difference between them is a scalar multiplier that bca check applies to every threshold value before the offender comparison.

Write the shared baseline at the soft tier (self-scan-write-baseline-headroom). A v5 baseline records the tier and headroom it was written against in a [provenance] table, and bca check warns when the current run is stricter than the baseline was written for. A soft-0.95 baseline is a superset of the hard gate's offenders, so the hard self-scan reads it silently; writing the baseline at the hard tier instead would make the soft self-scan-headroom warn that it is the stricter gate. See Tier/headroom provenance.

This matters: a contributor who wants the soft tier to be stricter (catch encroachment further out) lowers a single ratio (the --tier=soft=<ratio> value, or BCA_HEADROOM in the skeletons below) rather than maintaining a parallel soft-threshold file that drifts out of sync with the hard config the first time anyone forgets to update both files.

Two-tier thresholds

bca check --tier <hard|soft|soft=RATIO> selects which tier to gate against. hard (the default) compares against [thresholds] verbatim. soft is the early-warning tier, resolved in this order:

Start from [thresholds] (manifest, merged with --config).
If a [thresholds.soft] table exists, merge its overrides on top. Metrics absent from the soft table inherit their hard limit (no soft band). When a soft table is present, the blanket RATIO does not apply — explicit per-metric limits win over the scalar.
Otherwise scale every limit by the soft RATIO (default 0.95 for a bare soft, so --tier=soft is never a silent no-op; soft=1.0 disables scaling).
Repeated --threshold name=value flags apply last, absolutely.

A bare --tier=soft (ratio 0.95) is the zero-config entry point. A [thresholds.soft] table is the surface a mature project grows into, because it expresses a different soft band per metric — and keeps that band recorded next to the hard limit instead of buried in a runtime multiplier:

[thresholds]
cognitive  = 25
cyclomatic = 15
nargs      = 7

[thresholds.soft]
cognitive  = 22       # absolute soft limit
cyclomatic = "0.9x"   # 90% of the hard limit → 13.5
# nargs absent → soft tier inherits the hard limit (no soft band)

Soft limits with integer types read more cleanly as absolute values than as float-scaled ones: prefer nargs = 6 (for a hard nargs = 7) over the 0.95 × 7 = 6.65 a scalar would produce. Use the "<ratio>x" form for the large-valued metrics (halstead.*, loc.*) where an exact integer soft limit is fussy to pick. The scale factor must be in (0, 1] — the soft tier is an early-warning band that fires before the hard gate, never looser than it.

Both tiers ratchet through the same .bca-baseline.toml (no separate soft baseline file). bca check --print-effective-config --tier=soft prints the resolved limits — paste its [thresholds] output into [thresholds.soft] to migrate from a blanket-ratio band to explicit per-metric limits.
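
The soft-tier resolution order can be sketched as a small function. This is an illustration under stated assumptions, not the tool's code: resolve_soft_limits is a hypothetical name, the inputs are plain dictionaries standing in for the parsed [thresholds] and [thresholds.soft] tables, and overrides stands in for repeated --threshold flags.

```python
from typing import Dict, Optional, Union

SoftValue = Union[int, float, str]


def resolve_soft_limits(hard: Dict[str, float],
                        soft_table: Optional[Dict[str, SoftValue]] = None,
                        ratio: float = 0.95,
                        overrides: Optional[Dict[str, float]] = None) -> Dict[str, float]:
    """Resolve the limits a soft-tier run would gate against."""
    limits = {metric: float(limit) for metric, limit in hard.items()}
    if soft_table is not None:
        # A soft table wins over the blanket ratio; absent metrics keep
        # their hard limit.
        for metric, value in soft_table.items():
            if isinstance(value, str):
                if not value.endswith("x"):
                    raise ValueError(f"expected '<ratio>x', got {value!r}")
                factor = float(value[:-1])
                if not 0.0 < factor <= 1.0:
                    raise ValueError("scale factor must be in (0, 1]")
                limits[metric] = hard[metric] * factor
            else:
                limits[metric] = float(value)
    else:
        limits = {metric: limit * ratio for metric, limit in limits.items()}
    # Explicit overrides apply last and are absolute (never rescaled).
    limits.update(overrides or {})
    return limits


hard = {"cognitive": 25, "cyclomatic": 15, "nargs": 7}
soft = {"cognitive": 22, "cyclomatic": "0.9x"}
print(resolve_soft_limits(hard, soft))
print(resolve_soft_limits(hard))
```

With the soft table, nargs keeps its hard limit of 7; with the blanket ratio it becomes roughly 6.65, the fractional value the paragraph above recommends avoiding for integer metrics.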

Zero-config: the `bca.toml` manifest

Rather than thread --paths, --exclude-from, --jobs, --config, --baseline, and --tier=soft=<ratio> through every recipe, drop a bca.toml at the repo root and let bca check discover it:

# bca.toml — discovered automatically at (or above) the working dir.
paths        = ["."]
exclude_from = ".bcaignore"
jobs         = "auto"          # or an integer (was `num_jobs`)

[check]
baseline     = ".bca-baseline.toml"

[thresholds]
cognitive    = 25
cyclomatic   = 15
"halstead.effort" = 50000
nom          = 30
nargs        = 7
nexits       = 5
abc          = 50
wmc          = 60

The manifest also accepts an optional headroom key under [check] (not shown above), which sets the soft-tier scale ratio. It only takes effect under --tier=soft, so a bare bca check (hard tier) stays the exact CI mirror regardless of any headroom key. For per-metric soft limits, prefer a [thresholds.soft] table (see Two-tier thresholds) over the scalar: it records the band you chose next to the hard limit instead of leaving it to a runtime multiplier.

With that file in place the four recipes collapse to at most two flags each:

.PHONY: self-scan self-scan-headroom \
        self-scan-write-baseline self-scan-write-baseline-headroom

self-scan:                          # hard tier (CI mirror)
	bca check
self-scan-headroom:                 # soft tier (early warning)
	bca check --tier=soft=0.95
self-scan-write-baseline:           # absorb hard-tier offenders
	bca check --write-baseline
self-scan-write-baseline-headroom:  # absorb soft-tier offenders
	bca check --tier=soft=0.95 --write-baseline

Discovery and precedence

bca climbs from the working directory to the repo root (the directory containing .git) looking for bca.toml; the first match wins. Relative paths inside the manifest resolve against the manifest's own directory, so a bca.toml above the current directory still points at the right files.
Scalars and positive scope keys: CLI wins. Any explicit --baseline, --tier, --jobs, etc. overrides the corresponding manifest key, and the positive scope list keys (paths, include) are replaced by any explicit CLI value (bca check one.rs with manifest paths = ["src"] checks just one.rs). --config <file> merges on top of the manifest [thresholds] table (config keys win on collision), and repeated --threshold name=value flags apply last as absolute limits. The full resolution order — [thresholds] → --config → tier resolution ([thresholds.soft] or the soft RATIO scaling, only under --tier=soft) → --threshold overrides — is shared across all of --config / --tier / the manifest.
Negative filter keys: CLI unions with the manifest. The exclude list keys (top-level exclude, [check] exclude) are merged, not replaced: a CLI --exclude / --check-exclude is added to the manifest's deny-set. This way a command-line filter can never silently un-exclude a directory the project config deliberately skipped (e.g. vendor/). Duplicates across the two sources collapse; CLI patterns sort first. This mirrors ruff's exclude (replace) vs extend-exclude (add), generalized: targets replace, filters add. As always, --x and --x-from union with each other regardless. Reach for --no-config when you need the manifest excludes gone entirely.
--no-config skips discovery entirely, for reproducible fully-explicit invocations that must not pick up repo-level config. bca init also ignores any existing manifest — it scaffolds config rather than consuming it.
The top-level include / exclude keys are the global file-filter globs (the --include / --exclude flags) that decide which files are analysed at all. They are distinct from the [check] exclude table (analysed-and-reported but ungated paths; see Exempting whole file categories).
A [check] table sets gate-only options. exclude is a glob list whose matching files are analysed and reported but exempt from the threshold gate (and from --write-baseline); exclude_from points at a .gitignore-style file of the same globs (both mirror the --check-exclude / --check-exclude-from flags). exit_codes = "tiered" opts into the finer-grained exit codes (mirrors --exit-codes=tiered; see Exit codes); "default" (the implicit value) keeps the stable 0/1/2 contract. The baseline and headroom keys are gate-only too, so they live here: baseline (the file bca check reads and a bare --write-baseline writes), baseline_line_tolerance, baseline_fuzzy_match, and headroom (the soft-tier scale ratio; mirrors --tier=soft=<R>). The CLI value overrides the table value for each, in either direction.
These four keys used to sit at the top level; that spelling is deprecated (#599) and prints a one-time warning. It is honoured for one release cycle, then removed in the next major version. Move baseline, baseline_line_tolerance, baseline_fuzzy_match, and headroom under [check]. When a key is set in both places, the [check] value wins.
A [vcs] table sets change-history ranking options for bca vcs. Its file_types key ("metrics" — the default — / "all" / a "rs,py"-style extension list) scopes which files are ranked; as a positive scope key, an explicit --file-types CLI flag replaces it (see bca vcs file-type scope).
cyclomatic_count_try and exclude_tests are walker-tuning bools that mirror the --cyclomatic-count-try / --exclude-tests flags. exclude_tests = true prunes Rust inline-test subtrees (#[test], #[cfg(test)], …) before metric computation. Both are Rust-only and inert for other grammars. --exclude-tests is presence-only (no =false form), so its manifest key can only turn pruning on — a CLI --exclude-tests wins, but the manifest cannot turn off a key the CLI did not set.
A [thresholds.soft] table sets per-metric soft-tier limits (consumed by --tier=soft; see Two-tier thresholds). Unrecognized keys are ignored with a one-line warning, so you can pre-adopt forthcoming schema additions without breaking older bca builds.
bca check --print-effective-config prints the resolved view, including a manifest provenance line, so you can see exactly what the merge produced.
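
Two of the rules above, manifest discovery and the replace-versus-union merge, can be sketched as follows. The function names are hypothetical, "CLI patterns sort first" is read here as "CLI patterns come first in the merged list", and the behaviour outside a git repository (the sketch keeps climbing to the filesystem root) is an assumption.

```python
from pathlib import Path
from typing import List, Optional


def discover_manifest(start: Path) -> Optional[Path]:
    """Climb from an absolute start directory; the first bca.toml wins."""
    for directory in [start, *start.parents]:
        candidate = directory / "bca.toml"
        if candidate.is_file():
            return candidate
        if (directory / ".git").exists():
            return None  # reached the repo root without finding a manifest
    return None


def merge_paths(manifest: List[str], cli: List[str]) -> List[str]:
    """Positive scope keys: an explicit CLI value replaces the manifest list."""
    return list(cli) if cli else list(manifest)


def merge_excludes(manifest: List[str], cli: List[str]) -> List[str]:
    """Negative filter keys: CLI first, then manifest, duplicates collapsed."""
    return list(dict.fromkeys([*cli, *manifest]))


print(merge_paths(["src"], ["one.rs"]))
print(merge_excludes(["vendor/**", "tests/**"], ["tests/**", "gen/**"]))
```

The asymmetry is the point: a narrower target list is a deliberate request, while a narrower deny-set would silently widen the scan.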

The explicit-flag skeletons below remain fully supported — the manifest is sugar over the same flags, not a replacement. Reach for them when you can't drop a file at the repo root, or when one CI job needs a different layout than the committed manifest (pair the flags with --no-config).

Skeleton: GNU Make (explicit flags)

The four recipes below are a self-contained drop-in that threads every flag explicitly: the long form of the manifest recipe above. Adjust the BCA variable to point at whatever invocation gives you the checker (a pinned release binary, cargo run --release, an npm / pip wrapper). Adjust BCA_PATHS and BCA_EXCLUDE_FROM to match your layout.

# --- bca local threshold gates ------------------------------------------
# HARD tier mirrors CI exactly. Both tiers consume the same
# thresholds.toml + .bca-baseline.toml; the soft tier scales every
# threshold by $(BCA_HEADROOM) (default 0.95).
#
# Knobs are namespaced with `BCA_` so they don't collide with anything
# else in your environment. The big-code-analysis repo itself uses the
# manifest form above (a single `bca.toml`) rather than these explicit
# flags; reach for this skeleton when you can't drop a manifest at the
# repo root and must point `--config` at a standalone threshold file.
BCA               := bca
BCA_PATHS         := .
BCA_EXCLUDE_FROM  := .bcaignore
BCA_THRESHOLDS    := thresholds.toml
BCA_BASELINE      := .bca-baseline.toml
BCA_HEADROOM      ?= 0.95

# Common args, factored out so the four recipes stay in lockstep.
# `--jobs` defaults to the OS-reported effective CPU count
# (cgroup-/cpuset-aware on Linux), so no `$(nproc)` plumbing is
# needed. Override with `--jobs N` (or `--jobs 1` to force
# serial mode for debugging).
BCA_BASE_ARGS := --paths $(BCA_PATHS) --exclude-from $(BCA_EXCLUDE_FROM)

.PHONY: self-scan self-scan-headroom \
        self-scan-write-baseline self-scan-write-baseline-headroom

self-scan:
	@echo "bca self-scan (hard gate)..."
	@$(BCA) check $(BCA_BASE_ARGS) \
	  --config $(BCA_THRESHOLDS) \
	  --baseline $(BCA_BASELINE)

# `self-scan-headroom: self-scan` is intentional: under `make -j` Make
# would otherwise run both gates in parallel and the soft tier's scaled
# error message could land before the true regression on the hard tier.
# `--tier=soft=$(BCA_HEADROOM)` scales every config limit before the
# offender comparison — no helper script, no second TOML file.
self-scan-headroom: self-scan
	@echo "bca self-scan (soft gate, BCA_HEADROOM=$(BCA_HEADROOM))..."
	@$(BCA) check $(BCA_BASE_ARGS) \
	  --config $(BCA_THRESHOLDS) \
	  --tier=soft=$(BCA_HEADROOM) \
	  --baseline $(BCA_BASELINE)

self-scan-write-baseline:
	@echo "Refreshing $(BCA_BASELINE) at hard thresholds..."
	@$(BCA) check $(BCA_BASE_ARGS) \
	  --config $(BCA_THRESHOLDS) \
	  --write-baseline $(BCA_BASELINE)

# Soft-tier baseline write. NOTE: this and `self-scan-write-baseline`
# both write `$(BCA_BASELINE)`; never compose them as parallel
# prerequisites of one umbrella target or invoke them with `make -j2`,
# or the two `bca` processes will race on the same file and the
# losing tier's offenders will silently vanish from the baseline.
# Run them sequentially (hard first, then soft) and commit the diff.
self-scan-write-baseline-headroom:
	@echo "Refreshing $(BCA_BASELINE) at soft thresholds (BCA_HEADROOM=$(BCA_HEADROOM))..."
	@$(BCA) check $(BCA_BASE_ARGS) \
	  --config $(BCA_THRESHOLDS) \
	  --tier=soft=$(BCA_HEADROOM) \
	  --write-baseline $(BCA_BASELINE)

bca check --tier=soft=<ratio> scales every limit from --config by the ratio (default 0.95 for a bare --tier=soft) before the offender comparison, then filters against the same .bca-baseline.toml the hard tier writes. Explicit --threshold name=value overrides are absolute and are not rescaled. There is no separate helper script or second TOML file to maintain — the soft tier is the hard-tier invocation plus one flag.

Exit codes

The gate exit codes propagate verbatim from bca check: 0 clean, 2 on any threshold violation (hard or soft), 1 on tool error. The soft tier is a real gate — never wrap make self-scan-headroom in || true thinking it's advisory; the non-zero exit is the whole point of the encroachment band.

Pass --exit-codes=tiered (or set [check] exit_codes = "tiered") to split the single violation code 2 by severity: 2 new offenders only, 3 regressions only, 4 both, 5 a --tier=soft violation that also breaches the hard limit. The tiered codes are opt-in; the default stays 0/1/2, and every fail-state remains non-zero. Use them when CI needs to route "a new offender appeared" differently from "a baselined offender got worse" without parsing the [new] / [regr +N%] stderr tags.
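
For CI glue that routes on the exit code, a lookup table is usually enough. The sketch below encodes only the codes listed above; describe_exit and run_gate are hypothetical helpers, and run_gate assumes a bca binary on PATH.

```python
import subprocess

DEFAULT_CODES = {0: "clean", 1: "tool error", 2: "threshold violation"}
TIERED_CODES = {
    0: "clean",
    1: "tool error",
    2: "new offenders only",
    3: "regressions only",
    4: "new offenders and regressions",
    5: "soft-tier violation that also breaches the hard limit",
}


def describe_exit(code: int, tiered: bool = False) -> str:
    """Translate a bca check exit code into a human-readable outcome."""
    table = TIERED_CODES if tiered else DEFAULT_CODES
    return table.get(code, f"unexpected exit code {code}")


def run_gate() -> str:
    """Run the gate with tiered codes and describe the result."""
    result = subprocess.run(["bca", "check", "--exit-codes=tiered"])
    return describe_exit(result.returncode, tiered=True)


print(describe_exit(2))
print(describe_exit(3, tiered=True))
```

Every fail-state stays non-zero in both tables, so a script that only checks for a non-zero exit keeps working after opting in to the tiered codes.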

Wiring into pre-commit and CI

Add the soft gate to whatever umbrella target your developers already run before pushing. The hard gate runs as its prerequisite (see the self-scan-headroom: self-scan edge above), so listing only the soft target is enough — and crucially survives make -j, which would otherwise schedule both leaves in parallel and interleave their output:

.PHONY: pre-commit
pre-commit: fmt-check clippy test self-scan-headroom

Ordering matters: the hard tier names a true regression with the 100% limit, not the scaled one. The prerequisite edge enforces that order even under parallel Make.

In CI, run only the hard tier:

- name: Threshold gate
  run: make self-scan

The soft tier is a developer feedback knob, not a release gate. Running it in CI either duplicates the hard tier (when nothing has encroached) or fires noisily on a baseline-absorbed offender that crept upward without crossing 100% — neither buys you anything CI doesn't already cover.

The headroom knob

BCA_HEADROOM is a single scalar in (0, 1]. The interesting band is narrow:

`BCA_HEADROOM`	Fires when a function reaches…	Use case
`0.99`	99% of any limit	Tightest possible warning, fires on the last commit before the hard gate would.
`0.95`	95% of any limit (default)	One-or-two-commit lead time. Good default.
`0.90`	90% of any limit	Wider band — useful immediately after raising a limit, while the new ceiling settles.
`1.00`	100% (parity with hard gate)	Sanity check that the two tiers agree.

Values below ~0.80 turn the soft tier into a second hard tier with arbitrary numbers and stop being useful: every threshold has some function near 80% of it on a real codebase, and the soft tier becomes a permanent baseline-management chore rather than an early-warning signal.
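
To get a feel for how wide the band is at each setting, the short calculation below prints the gap between the hard limit and the scaled soft limit. The limits are the ones from the manifest example; the function name band_width is hypothetical.

```python
def band_width(limit: float, headroom: float) -> float:
    """Distance between the hard limit and the soft limit it scales to."""
    if not 0.0 < headroom <= 1.0:
        raise ValueError("headroom must be in (0, 1]")
    return limit * (1.0 - headroom)


hard = {"cognitive": 25, "cyclomatic": 15, "nargs": 7}
for headroom in (0.99, 0.95, 0.90):
    widths = {metric: round(band_width(limit, headroom), 2)
              for metric, limit in hard.items()}
    print(headroom, widths)
```

For small integer metrics such as nargs the band at 0.95 is narrower than one whole unit, which is why an explicit per-metric soft limit reads more cleanly there than a scalar.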

When the soft tier fires

A failed soft gate is a decision point, not a bug report. There are exactly three legitimate resolutions:

Refactor. Same workflow as any other complexity regression — extract a helper, collapse a dispatch arm, split the function. This is the common case, and the soft tier exists to give you the time to do it on the same branch.
Raise the limit. Edit the [thresholds] table (in bca.toml for this repo, or your own threshold file) and leave a why-comment explaining what changed (a new language module, a genuine algorithmic floor, a re-classified macro). Re-run make self-scan-headroom to confirm the new value covers the offender with room to spare.
Absorb into the baseline. Run make self-scan-write-baseline (hard tier) or make self-scan-write-baseline-headroom (soft tier) when the team accepts the value for now and intends to pay it down later. Commit the diff in .bca-baseline.toml in the same PR as the code that produced it. When the value is legitimate forever (a parser dispatch arm whose width matches the grammar it covers, a stable state machine, generated code), prefer an in-source suppression marker instead, as the Baselines recipe advises.

Don't pick "raise the limit" silently to make the gate go away. The committed why-comment is the only audit trail the next reader has; without it the bumped limit looks indistinguishable from neglect.

Skeleton: `justfile`

For projects that prefer just:

# bca local threshold gates. Hard tier mirrors CI; soft tier (headroom)
# is local-only early warning.
bca         := "bca"
paths       := "."
exclude     := ".bcaignore"
thresholds  := "thresholds.toml"
baseline    := ".bca-baseline.toml"
headroom    := env_var_or_default("BCA_HEADROOM", "0.95")

# `--jobs` defaults to the effective CPU count, so the skeleton
# no longer threads `$(nproc)` through `just`. Override
# inline if needed: `just self-scan --jobs 1`.
base_args   := "--paths " + paths + " --exclude-from " + exclude

self-scan:
    {{bca}} check {{base_args}} \
        --config {{thresholds}} --baseline {{baseline}}

self-scan-headroom: self-scan
    {{bca}} check {{base_args}} \
        --config {{thresholds}} --tier=soft={{headroom}} --baseline {{baseline}}

self-scan-write-baseline:
    {{bca}} check {{base_args}} \
        --config {{thresholds}} --write-baseline {{baseline}}

# Like the Make skeleton, never compose this with `self-scan-write-baseline`
# in parallel — they race on the same {{baseline}} file.
self-scan-write-baseline-headroom:
    {{bca}} check {{base_args}} \
        --config {{thresholds}} --tier=soft={{headroom}} --write-baseline {{baseline}}

Skeleton: `package.json` scripts

For JavaScript projects pulling in bca via npx or a pinned binary. --jobs defaults to the effective CPU count (cgroup-/cpuset-aware on Linux), so the npm tier no longer needs a BCA_NUM_JOBS env var to keep its bca check invocations byte-identical to the Make / just ones. Pass --jobs 1 explicitly only when debugging:

{
  "scripts": {
    "self-scan": "bca check --paths . --exclude-from .bcaignore --config thresholds.toml --baseline .bca-baseline.toml",
    "self-scan-headroom": "bca check --paths . --exclude-from .bcaignore --config thresholds.toml --tier=soft=0.95 --baseline .bca-baseline.toml",
    "self-scan-write-baseline": "bca check --paths . --exclude-from .bcaignore --config thresholds.toml --write-baseline .bca-baseline.toml",
    "self-scan-write-baseline-headroom": "bca check --paths . --exclude-from .bcaignore --config thresholds.toml --tier=soft=0.95 --write-baseline .bca-baseline.toml"
  }
}

Because the soft tier is now a plain bca check invocation, the npm scripts are byte-identical across shells — no helper script, no python3-vs-py alias to paper over, no env-var-vs-shell-expansion portability traps. To widen the band, edit the literal 0.95 in the script (or wire it through your task runner of choice); the flag parses the same on every platform.

Pair with husky or pre-commit so the same scripts run on git commit.

Skeleton: `pre-commit` hook

If you use the pre-commit framework (version 3.2.0 or newer — see the version note below), both tiers are local hooks that shell out to make:

- repo: local
  hooks:
    - id: bca-self-scan
      name: bca self-scan (hard gate)
      entry: make self-scan
      language: system
      pass_filenames: false
      stages: [pre-commit]
    - id: bca-self-scan-headroom
      name: bca self-scan-headroom (soft gate)
      entry: make self-scan-headroom
      language: system
      pass_filenames: false
      stages: [pre-commit]

pass_filenames: false is deliberate — bca discovers its own inputs from --paths plus the baseline. Letting pre-commit pass the changed files in would shrink the scan to just those files and miss the cross-file effect of a baseline refresh.

Minimum pre-commit version 3.2.0. The stages: vocabulary was renamed in pre-commit 3.2.0 (March 2023): commit → pre-commit, push → pre-push, etc. Older installs (notably RHEL 8 EPEL, Ubuntu 20.04 default packages, and any .pre-commit-config.yaml pinned to the legacy vocabulary) reject stages: [pre-commit] as an unknown stage name and the hook never registers. If you must support older installations, substitute stages: [commit]; in mixed fleets, state the pre-commit ≥ 3.2.0 requirement in the dev-tooling docs (developers can check with pre-commit --version) so the mismatch does not surface silently.
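
For mixed fleets, a bootstrap script can choose the stage name from the installed version. The following is a rough Python sketch, assuming `pre-commit --version` prints a line such as "pre-commit 3.5.0"; the helper name stage_name is hypothetical.

```python
import re


def stage_name(version_output):
    """Pick the commit-stage name for a given `pre-commit --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError("no version number found in: %r" % version_output)
    # Compare numerically, component by component (3.10.x sorts after 3.2.x).
    version = tuple(int(part) for part in match.groups())
    return "pre-commit" if version >= (3, 2, 0) else "commit"
```

Comparing integer tuples rather than strings is the point of the sketch: a string comparison would rank "3.10.0" below "3.2.0".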

Composition with the broader baseline workflow

The four self-scan* targets above are not a replacement for the documented Baselines recipe — they are that recipe, mechanised into developer-machine commands. The same ordering still applies:

Bootstrap once. Write the initial thresholds, write the initial baseline, commit both.
Gate on every commit. Hard tier fails on regression; soft tier fails on encroachment.
Refresh during focused refactors. When a function legitimately moved (someone did pay down debt), regenerate the baseline and review the diff.
Retire when empty. When .bca-baseline.toml shrinks to just version = 5 (the bare schema stamp with no offender entries), drop the --baseline flag and delete the file. The thresholds now stand on their own.

The local tiers shorten the feedback loop on steps 2 and 3 from "red CI on a pull request" to "red Make recipe before git commit returns". That is the whole pitch.

The hard / soft tier split is one instance of a broader pattern. If you have used any of the following, the mental model carries over:

Sonar's Quality Gates focused on new code. Old code is held at its current state; changes must not make things worse. The baseline file is bca's native form of the "new code" / "leak period" idea.
clippy's warn-vs-deny lint levels. A warn lint surfaces in local builds; the same lint denied with -D warnings fails CI. The two-tier configuration gives you a place to land experimental tighter rules.
The ratchet pattern in general migration tooling: record today's count, fail on increase, lower the ceiling as the count drops. bca check ratchets per-function rather than per-pattern, but the monotonicity guarantee is the same.
-Wall + -Werror in C/C++. A first pass with -Wall reveals the noise; promoting to -Werror after the baseline reaches zero is the same retirement step as deleting .bca-baseline.toml once it's empty.
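
The ratchet pattern named in the list above fits in a few lines. This Python sketch is a generic illustration of "record, fail on increase, lower as the count drops", with hypothetical names; it is not how bca check is implemented.

```python
class RatchetError(Exception):
    """Raised when the current count exceeds the recorded ceiling."""


def ratchet(ceiling, current):
    """Return the new ceiling, or raise if the count went up."""
    if current > ceiling:
        raise RatchetError("count rose from %d to %d" % (ceiling, current))
    # Lower the ceiling as the count drops; it never moves back up.
    return min(ceiling, current)
```

Feeding each run's result back in as the next ceiling gives the monotonicity guarantee: the recorded value can only stay level or shrink.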

Feeding metrics to an agentic coding tool

An agentic coding tool — Claude Code, opencode, and the like — is a fast-growing consumer of maintainability feedback, and it wants that feedback in a different shape than a human in an editor does. A human keystrokes, so the editor loop reaches for a language server: parse on didChange, render complexity in the margin, update on every character. An agent does not keystroke. It writes a whole edit through a tool call and yields the turn. The right feedback for that consumer is batch, after-edit, structured — exactly what bca check already emits.

So this recipe ships no new binary surface. bca check already gives an agent everything it needs:

A machine-parseable offender list (per-violation lines on stderr, or --report-format sarif | code-climate | clang-warning | msvc-warning | checkstyle for a structured document).
An exit code a hook can branch on (0 clean, 2 when offenders are present, 1 on tool error), so the hook can answer "did this edit make the code too complex?" without parsing anything.
Baseline filtering, in-source suppression markers, and [check.exclude] globs, so the signal an agent sees is the same ratcheted signal a human sees.
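
As an illustration of branching on the exit status alone, here is a small Python sketch. The function name and the returned labels are hypothetical; treating every status of 2 or above as "offenders" is an assumption chosen so that opt-in tiered exit codes are still reported.

```python
def classify_exit(code):
    """Map a `bca check` exit status to an action label for a hook."""
    if code == 0:
        return "clean"
    if code >= 2:
        # 2 is the default offender status; higher values are assumed to be
        # offender statuses as well.
        return "offenders"
    # 1, or a negative status from a signal: the tool itself failed.
    return "tool-error"
```

A hook would pass subprocess.run(...).returncode to this function and emit feedback only for "offenders".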

What was missing is the wiring. This page is that wiring: a copy-pasteable feedback loop per tool, plus the agent-facing guidance that keeps the loop from backfiring.

After-edit, not keystroke-time

This recipe is the deliberate counterpart to the proposed bca lsp server (#384). The two serve different consumers and do not depend on each other:

	`bca lsp` (#384)	This recipe
Consumer	Human in an editor	Agent in a tool loop
Trigger	Keystroke (`didChange`)	Tool call completes (an edit lands)
Reparse	Incremental	Whole-file, once per edit
Surface	Margin diagnostics	Text fed back into the model
Status	Proposed	Works today with `bca check`

Reach for the LSP when a person is typing; reach for this when an agent is editing. Wiring an agent through the LSP's incremental didChange path would pay for machinery the agent never uses.

The command every per-tool section below invokes is the same one you would run by hand:

# Exit 2 ⇒ this file has at least one offender. Thresholds come from
# the repo-root bca.toml (discovered automatically); override ad hoc
# with one or more --threshold flags.
bca check path/to/edited_file.rs --threshold cognitive=25

With a bca.toml at the repo root (see Local threshold gates) the --threshold flags are unnecessary — a bare bca check <file> reads the committed limits, baseline, and excludes, so the agent loop gates on exactly what CI gates on.

Claude Code

Mechanism: a PostToolUse hook in .claude/settings.json, with the matcher scoped to the file-editing tools.

Feedback channel: this is the strongest fit of any agentic tool. A PostToolUse hook fires the instant an edit lands, and it can inject text straight back into the model — either by exiting 2 with the message on stderr (Claude reads stderr as context about what happened) or by emitting JSON with hookSpecificOutput.additionalContext. Feedback arrives at the exact edit boundary and costs zero tokens until there is something to say.

Wire the hook in .claude/settings.json:

{
  "hooks": {
    "PostToolUse": [
      {
        "matcher": "Edit|Write|MultiEdit",
        "hooks": [
          {
            "type": "command",
            "command": "${CLAUDE_PROJECT_DIR}/.claude/hooks/bca-check.sh"
          }
        ]
      }
    ]
  }
}

The wrapper script reads the edited file's path from the hook's stdin JSON, runs bca check on just that file, and — only when the gate trips — prints the offender list followed by the agent-guidance block on stderr and exits 2:

#!/usr/bin/env bash
# .claude/hooks/bca-check.sh — gate a single edited file after the edit.
set -euo pipefail

# PostToolUse delivers the tool call as JSON on stdin; the edited
# file's path is .tool_input.file_path for Edit/Write/MultiEdit.
file_path="$(jq -r '.tool_input.file_path // empty')"
[ -n "$file_path" ] || exit 0          # nothing to check; stay silent.
[ -f "$file_path" ] || exit 0          # file gone (e.g. a delete).

# Thresholds, baseline, and excludes come from the repo-root bca.toml.
# --no-summary / --no-remediation keep the feedback to the offender
# lines themselves; the guidance below tells the agent what to do.
status=0
report="$(bca check "$file_path" --no-summary --no-remediation 2>&1)" || status=$?

# bca check exits 0 clean, 2 on offenders, 1 on tool error. Branch on 2
# specifically so a config/IO error is not mislabelled as "complexity".
case "$status" in
  0) exit 0 ;;                         # clean ⇒ say nothing.
  2) ;;                                # offenders ⇒ report them below.
  *) printf 'bca check could not run (exit %s):\n%s\n' "$status" "$report" >&2
     exit 0 ;;
esac

# Exit 2 makes Claude read stderr as context about the edit.
cat >&2 <<EOF
bca flagged complexity in the file you just edited:

$report

$(cat "${CLAUDE_PROJECT_DIR}/.claude/hooks/bca-guidance.txt")
EOF
exit 2

Store the agent-guidance block in .claude/hooks/bca-guidance.txt so the hook feedback and your CLAUDE.md cite identical wording. (jq is the only dependency; it ships with most agent images and is a one-line install otherwise.)

If you would rather inject advisory context without the exit 2 signal, emit JSON on stdout instead of writing to stderr:

jq -n --arg ctx "$report" '{
  hookSpecificOutput: {
    hookEventName: "PostToolUse",
    additionalContext: $ctx
  }
}'
exit 0

Use additionalContext when you want the offender list in Claude's view as a note; use exit 2 when you want it framed as a problem to address before moving on. Both leave the edit in place — PostToolUse runs after the tool, so neither can undo it.
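
If the hook is written in Python rather than bash and jq, the two JSON steps look roughly like this. The sketch assumes only the field names already shown above (.tool_input.file_path on stdin, hookSpecificOutput.additionalContext on stdout); the helper names are hypothetical.

```python
import json


def edited_file_path(stdin_text):
    """Extract .tool_input.file_path from the hook's stdin JSON, or None."""
    payload = json.loads(stdin_text)
    tool_input = payload.get("tool_input") or {}
    return tool_input.get("file_path") or None


def advisory_output(report):
    """Build the stdout JSON that injects `report` as additionalContext."""
    return json.dumps({
        "hookSpecificOutput": {
            "hookEventName": "PostToolUse",
            "additionalContext": report,
        }
    })
```

A wrapper would read sys.stdin, run bca check on the returned path, then either print advisory_output(...) and exit 0, or write the report to stderr and exit 2.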

opencode

Mechanism: a plugin — a JavaScript or TypeScript module exporting an async function that returns a hooks object — using the tool.execute.after hook, which fires after a tool runs (including the write and edit file-modification tools).

Feedback channel: the after-hook surfaces a problem to the agent by throwing. opencode's public plugins page documents the throw-to-signal pattern but not an advisory return value for the after-hook, so this recipe defaults to throwing — a thrown Error carries its message back to the agent as the tool's failure.

Mind the argument shape — it differs from the before hook. The published @opencode-ai/plugin types give tool.execute.after the signature (input, output) where the tool name is input.tool and the tool arguments are on input.args (the file path is input.args.filePath). This is the trap: the docs' only worked example is for tool.execute.before, where args live on output.args (mutable, pre-execution). Copy that into an after hook and output.args is undefined, the guard below always trips, and the plugin silently never runs — a no-op that looks installed. Read input.args.filePath in the after hook.

Drop this file in .opencode/plugins/ (project-level; auto-loaded — no opencode.json entry needed, that key is for npm-published plugins). The plugin below is plain JavaScript; for a TypeScript plugin, import type { Plugin } from "@opencode-ai/plugin" and annotate the export, and declare any extra deps in .opencode/package.json (opencode installs it with Bun):

// .opencode/plugins/bca-check.js
const GUIDANCE = `
Responding to bca metric feedback: make the code genuinely simpler,
not the number smaller. Do not extract a meaningless helper or split a
cohesive function to dodge the count — a spurious helper often raises
file-level nom/nargs and helps nothing. If the complexity is essential
and the function is clearest left whole, add a suppression marker with
a one-line reason instead of contorting the code.
`.trim()

export const BcaCheck = async ({ $ }) => {
  return {
    // Note: the after-hook's args are on `input.args`, NOT `output.args`
    // (output carries the tool's result: title/output/metadata).
    "tool.execute.after": async (input, _output) => {
      // React only to the file-writing tools. (Patch-style edit tools
      // carry no single filePath and are intentionally not covered.)
      if (input.tool !== "write" && input.tool !== "edit") return
      const filePath = input.args?.filePath
      if (!filePath) return

      // `bca check` exits 0 clean, 2 on offenders, 1 on tool error.
      // Bun's $ throws on non-zero by default; capture instead so we
      // can branch on the exact code.
      const res = await $`bca check ${filePath} --no-summary --no-remediation`
        .quiet()
        .nothrow()
      // 0 clean, 1 tool error: not a complexity issue. Return on `< 2`
      // rather than on `!== 2` so the tiered exit codes (3-5, from
      // `--exit-codes=tiered` / `exit_codes = "tiered"`) still report.
      if (res.exitCode < 2) return

      // Surface the offenders to the agent by throwing.
      const offenders = res.stderr.toString().trim() || res.stdout.toString().trim()
      throw new Error(`bca flagged complexity in ${filePath}:\n\n${offenders}\n\n${GUIDANCE}`)
    },
  }
}

Keep the GUIDANCE string in sync with the verbatim block below (or read it from a shared file). Because the channel is a thrown error, opencode reports it as a failed post-edit step — which is the intended "address this before continuing" framing. The edit itself still lands: tool.execute.after runs once the write or edit tool has already written the file, so the throw frames the next step without undoing the change.

Restart opencode after adding the plugin. Plugins are loaded once at startup and are not hot-reloaded. A freshly dropped .opencode/plugins/bca-check.js does nothing in the running session; quit and relaunch opencode, then confirm by editing a file you know is over threshold and checking that the edit/write tool reports the bca failure. An installed-but-inert plugin is the most likely symptom, and a stale session is the most likely cause.

For a hardened reference, this repository ships its own copy at .opencode/plugins/bca-check.js. It adds three guards the minimal example omits, each worth porting for a real project:

Repo-scope guard. Resolve the path and skip anything outside the project root, so the hook never runs bca on a file the agent edits elsewhere on disk.
Local-build resolution. Prefer $BCA, then a target/release/bca in the checkout, then bca on PATH. A project that builds bca itself then gates against its own analyzer instead of whatever is installed globally.
Shared guidance. Read the guidance text from one file that both this plugin and the Claude Code hook cite, so the two never drift.
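
The first two guards are language-neutral. Here is a Python sketch of the same logic, as an illustration and not the shipped plugin's code; the function names and the injectable env / which parameters are hypothetical conveniences for testing.

```python
import os
import shutil
from pathlib import Path


def inside_repo(path, root):
    """Repo-scope guard: True only if `path` resolves to somewhere under `root`."""
    resolved = Path(path).resolve()
    root = Path(root).resolve()
    return resolved == root or root in resolved.parents


def resolve_bca(root, env=None, which=shutil.which):
    """Local-build resolution: $BCA, then target/release/bca, then PATH."""
    env = os.environ if env is None else env
    if env.get("BCA"):
        return env["BCA"]
    local = Path(root) / "target" / "release" / "bca"
    if local.is_file():
        return str(local)
    return which("bca")  # None when bca is not installed at all
```

Resolving both paths before comparing means a ../ segment or a symlink cannot smuggle an outside file past the guard.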

Agent guidance (ship this verbatim)

The feedback channel is only half the recipe. A bare "cognitive 26 > 25" reliably triggers the gaming move — the agent extracts a semantically empty helper to shave the per-function number, lowering per-function complexity while raising file-level nom/nargs and leaving the code worse. The mitigation is telling the agent what a violation means and what to do about it. Paste this block into your agent rules file (CLAUDE.md, opencode AGENTS.md) and into the hook feedback text, so the instruction is present both as standing policy and at the moment of the violation:

**Responding to `bca` metric feedback.** A threshold violation
(cognitive, cyclomatic, ABC, …) means *this function is hard for a
human to follow*. The number is a proxy for that, not the goal. Your
job is to make the code genuinely simpler — not to make the number go
down.

- **Do not game the metric.** Do not extract a helper that exists only
  to move complexity off one function, split a cohesive function at an
  arbitrary line, collapse readable branches into a dense expression,
  or inline/obfuscate logic to dodge the count. These lower the
  per-function score while making the code worse — and a spurious
  helper often *raises* file-level `nom`/`nargs`, so you have not even
  helped the file.
- **Refactor only when it truly clarifies.** A good split has a name
  that means something and a boundary a reader would have drawn anyway.
  If you cannot name the extracted piece without inventing a
  `foo_part2`, the split is gaming — stop.
- **When the complexity is essential, suppress with a reason.** Some
  functions are irreducibly complex *and clearest left whole* — a
  dispatch `match`, a hand-rolled parser table, an exhaustive state
  machine. For these, do not contort the code: add a suppression marker
  with a one-line rationale and move on. A clear function with an
  honest `// bca: suppress(...)` is better than a "compliant" tangle.

Honest suppression (exact syntax)

The guidance above tells the agent that suppression is a legitimate move. For that to work the agent has to spell the marker correctly — and the marker syntax is a frequent source of silent no-ops. Teach it precisely (full reference: Suppression markers):

Per function — place the marker in a comment inside the function body, naming the metric(s): // bca: suppress(cyclomatic, abc). A bare // bca: suppress (no list) silences every metric for that function.
Whole file — // bca: suppress-file(halstead, nargs, nexits) anywhere in the file; the bare // bca: suppress-file form silences every metric file-wide.
Use canonical metric names. The accepted identifiers are abc, cognitive, cyclomatic, halstead, loc, mi, nargs, nexits, nom, npa, npm, wmc. It is nexits, not exit (the legacy exit alias was retired). An unknown identifier triggers a warning and voids the entire marker, un-suppressing everything it listed. tokens is deliberately not suppressible; treat it as a hard resource cap.
Always pair a suppression with a rationale comment so a reviewer — human or agent — can later tell an honest exemption from a dodge. bca exemptions lists every marker in the tree for exactly this audit.
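
Because one misspelled identifier voids the whole marker, a pre-flight check can be worth having. The Python sketch below is a hypothetical helper, not bca's own parser: the regular expression is an approximation of the marker grammar, and only the identifier list is taken from the text above.

```python
import re

# Canonical identifiers as listed above; `tokens` is deliberately absent.
CANONICAL = frozenset({
    "abc", "cognitive", "cyclomatic", "halstead", "loc", "mi",
    "nargs", "nexits", "nom", "npa", "npm", "wmc",
})

# Matches `bca: suppress` or `bca: suppress-file`, with an optional list.
MARKER = re.compile(r"bca:\s*suppress(?:-file)?\s*(?:\(([^)]*)\))?")


def unknown_metrics(comment):
    """Return the identifiers in a marker that are not canonical names."""
    match = MARKER.search(comment)
    if match is None or match.group(1) is None:
        return []  # no marker at all, or the bare form with no list
    names = [name.strip() for name in match.group(1).split(",") if name.strip()]
    return [name for name in names if name not in CANONICAL]
```

An agent or reviewer can run this over the comments in a diff and reject any marker for which the returned list is non-empty.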

Gate at the task boundary, not per edit

The per-edit hooks above are an early-warning convenience, not the gate. The gate is the same check a human runs before declaring a task done: the two-tier make self-scan / pre-commit pattern (hard tier mirroring CI, soft tier as a 95%-of-limit headroom band). Point the agent at that as its "before I'm finished" step.

Going finer-grained than the task boundary is low- or negative-value for an agent. A complexity threshold is a proxy, not a correctness gate (see the caveats below), so re-running it after every micro-edit mid-refactor just produces transient violations that resolve themselves a few edits later — noise the agent then wastes a turn "fixing". Let the agent finish a coherent change, then gate once.

Caveats

Two caveats the recipe depends on; ignore them and the loop does more harm than good.

Goodhart's law / metric-gaming. "Make this number smaller" is not the same instruction as "make this code simpler", and an LLM told the former will satisfy it the cheapest way it can — usually by shuffling complexity across a function boundary rather than removing it. The agent-guidance block and the honest-suppression section are the mitigation; they are load-bearing, not optional polish. Ship them with the hook or expect the gaming move.
Thresholds are proxies, not correctness gates. A failing compile or a red test is unambiguous — a tight agent loop on those converges. A complexity threshold is softer: crossing it means "a human should look at whether this is still readable", which is a judgement call, not a defect. Set expectations accordingly — wire complexity feedback as advice the agent weighs, not a pass/fail it must drive to zero at any cost.

AST queries

Recipes that work with the parsed syntax tree directly: searching for node types, counting them, or dumping the tree.

Library-side equivalents. Every recipe below has an in-process Rust counterpart in Walking the AST directly — useful when shelling out per file is too slow or when you want to compose metrics with custom AST analysis in one parse.

Detect parse errors before committing

Tree-sitter exposes a synthetic ERROR node anywhere it could not parse. Use find to surface them:

bca find \
    --include "*.rs" \
    --paths "$PWD" \
    -t ERROR

One glob per occurrence. --include and --exclude take exactly one value each time they appear; repeat the flag for multiple globs (--include "*.rs" --include "*.py"). A positional path that follows is never swallowed. The = form (--include="*.rs") also works.

A clean run prints nothing. Wire this into a pre-commit hook to fail fast when a syntactically broken file is staged.

Count specific syntactic constructs

count takes one or more node types via the repeatable -t/--type flag and reports the totals. For example, to count if, for, and while constructs across a Rust project:

bca count \
    --include "*.rs" \
    --paths src/ \
    -t if_expression -t for_expression -t while_expression

The exact node-type names come from the underlying tree-sitter grammar. To discover them, dump the AST of a small sample file (see below) and read the node names off the tree.

Find all `unsafe` blocks in a Rust crate

bca find \
    --include "*.rs" \
    --paths src/ \
    -t unsafe_block

Each match prints the file path and the line range of the node.

Dump the AST of a file

Useful for understanding why a metric came out the way it did, or for discovering the tree-sitter node names you need for find / count:

bca dump --paths src/lib.rs

To narrow the dump to a specific function or block, add line bounds with the --line-start and --line-end flags (they must follow the dump subcommand):

bca dump \
    --paths src/lib.rs \
    --line-start 42 --line-end 88

--line-start / --line-end apply to dump and find, so the same range can be used to scope a search to a single function:

bca find \
    --paths src/lib.rs \
    --line-start 42 --line-end 88 -t return_expression

The short --ls / --le spellings remain as deprecated aliases and are slated for removal in the next major release.

List every function or method

For a quick human-readable inventory:

bca functions \
    --include "*.rs" \
    --paths src/

The output is a tree per file: an In file … header followed by an indented row per function with name and line span. It is intended for reading, not parsing.

For tooling that needs a structured inventory — coverage mapping, documentation generation, code-owner reports — use the JSON metrics output instead and walk .spaces[] recursively, taking entries whose kind is function:

bca metrics \
    --include "*.rs" \
    --paths src/ \
    -O json \
  | jq -c '
      . as $root
      | def funcs: if .kind == "function" then [.] else [] end
                   + (.spaces // [] | map(funcs) | add // []);
      funcs[] | {file: $root.name, name, start_line, end_line}
    '

This emits one JSON object per function and is safe to pipe into downstream tooling.
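
For consumers that would rather not depend on jq, the same recursive walk can be sketched in Python. It assumes only the fields the jq filter above already uses (kind, spaces, name, start_line, end_line); the function names are hypothetical.

```python
def functions(space):
    """Collect every space whose kind is 'function', depth-first."""
    found = [space] if space.get("kind") == "function" else []
    for child in space.get("spaces") or []:
        found.extend(functions(child))
    return found


def inventory(root):
    """One record per function, tagged with the file-level name."""
    return [
        {
            "file": root.get("name"),
            "name": func.get("name"),
            "start_line": func.get("start_line"),
            "end_line": func.get("end_line"),
        }
        for func in functions(root)
    ]
```

Pass it one parsed per-file document at a time, for example the result of json.loads on each object bca metrics -O json emits.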

Exporting metric data

The metrics, ops, and preproc subcommands all support structured output formats meant for machine consumption. Pair them with a JSON processor like jq for ad-hoc analysis, or feed them into a database or dashboard.

Export per-file metrics as JSON

bca metrics \
    --paths src/ \
    -O json \
    --output-dir /tmp/metrics

This writes one JSON file per analyzed source file under /tmp/metrics/. The output filename mirrors the input path with the format extension appended — src/lib.rs becomes src/lib.rs.json, not src/lib.json. Use --pretty if you intend to read the files by hand:

bca metrics -p src/ --pretty -O json --output-dir /tmp/metrics
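
Downstream scripts often need to predict where a given source file's report landed. This is a minimal Python sketch of the naming rule described above, assuming relative input paths; the helper name is hypothetical.

```python
from pathlib import PurePosixPath


def output_path(output_dir, source_path, fmt="json"):
    """Mirror a relative input path under output_dir, appending the format extension."""
    # The extension is appended, not substituted: lib.rs -> lib.rs.json.
    return str(PurePosixPath(output_dir) / (source_path + "." + fmt))
```

For example, output_path("/tmp/metrics", "src/lib.rs") gives "/tmp/metrics/src/lib.rs.json".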

To collect the whole run into one file instead of a tree, use --output <file>; it writes a single aggregate document (a top-level JSON array of the per-file results):

bca metrics -p src/ -O json --output /tmp/metrics.json

CBOR (-O cbor) is the most compact format; it is binary and so requires a destination (--output or --output-dir). JSON, TOML, and YAML can all be streamed to stdout when no destination is given, which is useful for pipelines.

Compare two metric runs with `bca diff`

bca diff compares two JSON metric runs and reports, per metric, which files changed (old → new), plus any files added or removed between the two sets. Each side is either a single per-file JSON document or a whole directory tree of them (the form metrics -O json --output-dir <dir> writes), so the common workflow is two bca metrics runs into separate directories:

# Capture the "before" state.
bca metrics -p src/ -O json --output-dir /tmp/before

# ...make a change (e.g. bump a tree-sitter grammar)...

# Capture the "after" state and diff.
bca metrics -p src/ -O json --output-dir /tmp/after
bca diff /tmp/before /tmp/after

The output buckets every per-file delta by metric name — the same names bca list-metrics prints (cyclomatic, cognitive, sloc, …):

2 metric(s) changed, 1 added file(s), 0 removed file(s)

## Added files
  src/new_module.rs.json

## cyclomatic (2 change(s))
  src/lib.rs.json.sum  12 → 14
  src/lib.rs.json.max   4 → 6

## halstead (1 change(s))
  src/lib.rs.json.effort  5820.3 → 7104.9

Useful flags:

--format markdown for a sticky PR comment, --format json for a stable machine-readable schema (CI consumers).
--min-change <N> reports only deltas whose absolute change is at least N (the default 0 reports any change).
--metric <name> (repeatable) restricts the diff to specific metrics.
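
The two filtering flags combine as a simple predicate. The Python sketch below illustrates the semantics described above ("absolute change is at least N", restricted to named metrics); the record shape with metric / key / old / new fields is hypothetical and is not the schema of --format json.

```python
def filter_deltas(deltas, min_change=0.0, metrics=None):
    """Keep deltas for the selected metrics whose |new - old| >= min_change."""
    kept = []
    for delta in deltas:
        if metrics and delta["metric"] not in metrics:
            continue
        if abs(delta["new"] - delta["old"]) >= min_change:
            kept.append(delta)
    return kept
```

Applied to the sample output above, a minimum change of 3 would keep only the halstead effort row, since both cyclomatic rows moved by 2.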

Diff against a git ref with `--since`

For the interactive "what did my uncommitted change do to the metrics?" case, --since <ref> skips the two-capture dance: it analyzes the tree at a git ref for the before side, and diffs it against the current working tree (or an explicit after-side tree):

# Before = the tree at HEAD~1; after = the current working tree.
bca diff --since HEAD~1 -p src/

# After = an explicit tree (e.g. a second checkout) instead of the
# working tree. The single positional is the after side.
bca diff --since main /path/to/other-checkout -p src/

--since materializes the ref's tree into a temporary directory (via git archive), runs the same metric walk against it, then diffs — the temp tree is removed automatically, including on error. The same -p/--paths, -I/--include, and -X/--exclude selection applies to both sides so they analyze the same file set. Selection paths are repo-root-relative and must be relative: the working-tree side is anchored at the repository root (matching the whole-tree git archive of the ref), so bca diff --since produces the same result from any subdirectory. An absolute --paths is rejected (exit 1) — it cannot address the extracted ref tree.

Unlike bca check --since (best-effort), bca diff --since is an explicit request: an unresolvable ref, a missing git, or a non-git working directory is a hard error (exit 1). --since takes at most one positional (the after side); passing two is an error.

bca diff exits 0 on success — it is informational, not a gate — unless the opt-in --exit-code flag is passed, which exits 2 when the filtered diff is non-empty. It replaces the former json-minimal-tests + split-minimal-tests.py chain used to validate that a grammar bump did not regress metrics; the check-grammar-crate.py helper now calls bca diff internally.

Pull a single metric across an entire tree

Combine streamed JSON output with jq to extract one value per file:

bca metrics -p src/ -O json \
  | jq -c '{file: .name, mi: .metrics.mi.visual_studio}'

The same idea works for any metric — cyclomatic.sum, cognitive.sum, loc.sloc, and so on. Run bca list-metrics descriptions to see the catalog.
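
When the consumer is a script and not a shell pipeline, the same lookup is a short dictionary walk. This Python sketch assumes the document shape implied by the jq filter above (a top-level name and a nested metrics object); the helper name pull is hypothetical.

```python
def pull(doc, dotted):
    """Follow a dotted path such as 'metrics.mi.visual_studio'; None if absent."""
    node = doc
    for key in dotted.split("."):
        if not isinstance(node, dict) or key not in node:
            return None
        node = node[key]
    return node
```

Returning None for a missing key keeps one absent metric from aborting a whole-tree sweep.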

Discover the metric catalog at runtime

Tooling that drives the CLI shouldn't hard-code metric names. Ask the binary:

bca list-metrics                # one name per line
bca list-metrics descriptions   # name + summary

This is the right input for code generators, schema definitions, or tab-completion.

Extract operands and operators (Halstead)

ops emits the raw operand and operator lists per file, which is the input to Halstead-style metric calculations beyond what the built-in report shows:

bca ops \
    --include "*.rs" \
    --paths src/ \
    -O json --pretty \
    --output-dir /tmp/ops

One glob per occurrence. --include and --exclude take exactly one value each time they appear; repeat the flag for multiple globs (--include "*.rs" --include "*.py"). A positional path that follows is never swallowed. The = form (--include="*.rs") also works.

Each output file mirrors the input path under /tmp/ops/.

Strip comments from a tree

strip-comments rewrites source so that downstream tools that don't understand comment syntax can still consume the code. Output routing has three modes:

stdout (default). With neither flag, the stripped source streams to stdout — best for a single file in a pipeline.
--output / -o <FILE> (single file). Writes the stripped source to <FILE>, leaving the input untouched. Only meaningful for one input file; mutually exclusive with --in-place.
--in-place (multi-file). Rewrites each matched input file on disk. Use this for a whole tree; mutually exclusive with --output.

# Stream a single file with comments removed.
bca strip-comments --paths src/lib.rs

# Write a single stripped file to a new path (input untouched).
bca strip-comments --paths src/lib.rs --output src/lib.stripped.rs

# Rewrite every Python file in src/ in place.
bca strip-comments --include "*.py" --paths src/ \
    --in-place

--in-place is destructive — make sure the tree is committed or backed up first. Passing both --in-place and --output is a usage error.

Driving the REST API

bca-web exposes the same analysis primitives over HTTP. Use it when the consumer is a long-running service (an editor plugin, CI worker, or web app) that should not pay the cost of spawning the CLI per file.

For the full endpoint reference, see Rest API. The recipes below show practical end-to-end calls with curl. Every endpoint is mounted under the /v1 prefix; the old unprefixed paths were removed at the 2.0 release (#637) and now return 404.

Start the server

bca-web --host 127.0.0.1 --port 8080 -j "$(nproc)"

Verify it's up:

curl -sf http://127.0.0.1:8080/v1/ping && echo "ok"
# => ok

/v1/ping returns 200 OK with an empty body; curl -sf exits 0 on success and non-zero on any HTTP error, which is what scripts want.

For building, flag and environment-variable tuning, resource limits, and the security posture of the daemon itself, see Operating bca-web.

Compute metrics for an inline snippet

curl -s http://127.0.0.1:8080/v1/metrics \
    -H 'Content-Type: application/json' \
    -d '{
          "id": "snippet-1",
          "file_name": "demo.rs",
          "code": "fn add(a: i32, b: i32) -> i32 { a + b }",
          "scope": "full"
        }' \
  | jq '.root.metrics'

scope: "file" returns only top-level metrics; "full" (the default) walks every function and class space inside the snippet. The server infers language from file_name, so the extension matters.

Compute metrics for a file from disk

curl --data-binary plus jq makes it easy to package a real file into the JSON envelope the server expects:

jq -nc \
    --arg id "$(uuidgen)" \
    --arg file_name "src/lib.rs" \
    --rawfile code src/lib.rs \
    '{id: $id, file_name: $file_name, code: $code, scope: "full"}' \
  | curl -s http://127.0.0.1:8080/v1/metrics \
      -H 'Content-Type: application/json' \
      --data-binary @- \
  | jq '.root.metrics.cyclomatic, .root.metrics.cognitive'

This pattern — jq -n --rawfile to build the request, curl --data-binary @- to stream it — is the easiest way to avoid quoting problems with multi-line source code.
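For a client written in Rust rather than shell, the same envelope can be assembled by hand. The sketch below is a minimal illustration using only the standard library: json_escape and metrics_request are hypothetical helper names, not part of big-code-analysis, and a real client would more likely use a JSON library. It only shows which characters need escaping when multi-line source is embedded in the code field.

```rust
/// Escape a string for embedding inside a JSON string literal.
/// Hypothetical helper for illustration; not part of big-code-analysis.
fn json_escape(input: &str) -> String {
    let mut out = String::with_capacity(input.len() + 2);
    for ch in input.chars() {
        match ch {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            // Remaining control characters are written as \u00XX.
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Build a request body with the fields the recipes above send to
/// /v1/metrics: id, file_name, code, and scope.
fn metrics_request(id: &str, file_name: &str, code: &str) -> String {
    format!(
        "{{\"id\":\"{}\",\"file_name\":\"{}\",\"code\":\"{}\",\"scope\":\"full\"}}",
        json_escape(id),
        json_escape(file_name),
        json_escape(code),
    )
}

fn main() {
    let code = "fn main() {\n    println!(\"hi\");\n}\n";
    println!("{}", metrics_request("snippet-1", "demo.rs", code));
}
```

The resulting string can be sent as the request body with any HTTP client, using the same Content-Type: application/json header as the curl calls.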

Strip comments through the API

The endpoint is /comment (singular). It has two variants selected by Content-Type:

application/json — wraps the request and response in JSON. The response code field is a byte array, not a string, because the underlying API is byte-oriented.
application/octet-stream — accepts the source as the raw request body and returns the stripped source as the raw response body. This is by far the easiest variant to use from the shell.

Octet-stream form (recommended for one-off shell use):

curl -s "http://127.0.0.1:8080/v1/comment?file_name=demo.py" \
    -H 'Content-Type: application/octet-stream' \
    --data-binary $'# leading comment\nprint("hi")  # trailing'
# => print("hi")

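JSON form (use when your client speaks JSON natively). Decode the byte array with jq … | implode. implode treats each number as a Unicode code point, so this round-trips ASCII source exactly; source containing multi-byte UTF-8 needs a byte-aware decoder on the client side: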

curl -s http://127.0.0.1:8080/v1/comment \
    -H 'Content-Type: application/json' \
    -d '{
          "id": "strip-1",
          "file_name": "demo.py",
          "code": "# leading comment\nprint(\"hi\")  # trailing"
        }' \
  | jq -r '.code | implode'

The JSON response carries the same id you sent, so a client that multiplexes many requests can correlate them.
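A client that receives the JSON form has to turn the code byte array back into text. The sketch below shows one way to do that in Rust with only the standard library, assuming the array has already been pulled out of the response as the text of a JSON array of integers. decode_code_field is a hypothetical helper, and the hand-rolled parsing stands in for whatever JSON library the client already uses. Decoding the numbers as bytes keeps multi-byte UTF-8 characters intact.

```rust
/// Decode the textual form of a JSON byte array (for example "[112,114]")
/// into a string. Hypothetical helper for illustration only.
fn decode_code_field(json_array: &str) -> Result<String, String> {
    let inner = json_array
        .trim()
        .strip_prefix('[')
        .and_then(|rest| rest.strip_suffix(']'))
        .ok_or_else(|| "expected a JSON array".to_owned())?;

    let mut bytes = Vec::new();
    for item in inner.split(',') {
        let item = item.trim();
        if item.is_empty() {
            continue; // tolerates the empty array "[]"
        }
        // Values outside 0..=255 are rejected by the u8 parse.
        let byte: u8 = item
            .parse()
            .map_err(|err| format!("invalid byte {item:?}: {err}"))?;
        bytes.push(byte);
    }

    // Interpret the bytes as UTF-8 rather than one code point per number.
    String::from_utf8(bytes).map_err(|err| err.to_string())
}

fn main() {
    // 104 is 'h'; 195, 169 is the two-byte UTF-8 encoding of U+00E9.
    match decode_code_field("[104, 195, 169]") {
        Ok(text) => println!("{text}"),
        Err(err) => eprintln!("decode failed: {err}"),
    }
}
```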

Extract function spans for an editor plugin

The endpoint is /function (singular):

curl -s http://127.0.0.1:8080/v1/function \
    -H 'Content-Type: application/json' \
    -d '{
          "id": "spans-1",
          "file_name": "demo.rs",
          "code": "fn a() {}\nfn b() {}\n"
        }' \
  | jq '.spans'

Each entry has name, start_line, end_line, and an error boolean (set when the parser flagged the function span as malformed) — enough for an editor to draw a function navigator without re-parsing the file locally.

Rank a repository by change-history risk

The /vcs endpoint analyses a git working tree that already exists on the server's filesystem (change history has no in-request representation), and returns the files ranked by composite risk score. See Change-history (VCS) metrics for the signal and formula reference.

Security: unlike the other endpoints, /vcs takes a server-side repo_path and will walk any git repository the server process can read, returning that repo's file paths and change signals. Do not expose /vcs to untrusted clients without an authorization layer; the default 127.0.0.1 bind keeps it local.

curl -s http://127.0.0.1:8080/v1/vcs \
    -H 'Content-Type: application/json' \
    -d '{
          "id": "risk-1",
          "repo_path": "/srv/checkouts/my-project",
          "top": 20
        }' \
  | jq '.files[] | {path, risk_score, churn_recent}'

The body accepts the same knobs as bca vcs (long_window, recent_window, top, ref, risk_formula, file_types, full_history, include_merges, follow_renames, exclude_bots, bot_pattern, as_of, emit_author_details, include_deleted, bus_factor_threshold, no_cache, cache_dir) as optional fields. file_types scopes the ranking and mirrors the CLI --file-types: it takes metrics (the default), all, or a comma-separated extension list such as rs,py. A repo_path that is not a git working tree, or a malformed window / timestamp / formula / scope, returns 400 with the uniform JSON error body.

Trend over time

POST /vcs/trend samples the same metrics at several points in time and returns a per-file time series (see Historical trend). The body takes the /vcs fields plus points (>= 2), span (default 12mo), and top_deltas.

curl -s http://127.0.0.1:8080/v1/vcs/trend \
    -H 'Content-Type: application/json' \
    -d '{
          "id": "trend-1",
          "repo_path": "/srv/checkouts/my-project",
          "points": 12,
          "span": "24mo",
          "top": 20
        }' \
  | jq '.deltas.regressed[] | {path, delta}'

A point count below 2 (or above the supported maximum) returns 400 with the uniform JSON error body, like the other bad-request cases.

Just-in-time commit / diff scoring

POST /vcs/jit scores a single commit (see Just-in-time scoring). The body takes repo_path, commit (default HEAD), and the long_window / recent_window / full_history / include_merges / follow_renames / as_of knobs; it returns the commit JitReport JSON with the echoed id.

curl -s http://127.0.0.1:8080/v1/vcs/jit \
    -H 'Content-Type: application/json' \
    -d '{ "id": "jit-1", "repo_path": "/srv/checkouts/my-project",
          "commit": "HEAD" }' \
  | jq '{risk_score, purpose: .commit.purpose}'

To score an arbitrary diff instead, send a diff field carrying the unified diff (no repo_path needed). The response is then the partial report — source: "diff", a partial_risk_score, and no history / experience / purpose groups, which are absent rather than zero. The partial score is not comparable to a commit score.

git diff | jq -Rs '{id: "jit-diff", diff: .}' \
  | curl -s http://127.0.0.1:8080/v1/vcs/jit \
      -H 'Content-Type: application/json' -d @- \
  | jq '{source, partial_risk_score}'

A malformed diff (or an unresolvable commit, or a repo_path that is not a git working tree) returns 400 with the uniform JSON error body.

Calling the API from CI

The server starts in milliseconds, so for short-lived CI jobs it's often simplest to start it as a background process inside the job and tear it down at the end:

bca-web --port 8080 &
SERVER_PID=$!
trap 'kill "$SERVER_PID"' EXIT

# Wait for it to come up.
until curl -sf http://127.0.0.1:8080/v1/ping >/dev/null; do sleep 0.1; done

# … run your analysis calls here …

For longer-lived workers, run the server as a systemd unit (or container) and point your jobs at its host/port.

Using as a Library

big-code-analysis is published on crates.io as a Rust library. The CLI (bca) and REST server (bca-web) are both thin wrappers around the same public API, so anything they can do you can do directly from your own crate.

This section is task-oriented. For full type signatures and field docs, follow the rustdoc on docs.rs.

When to embed the library

Reach for the library (instead of shelling out to bca) when you want one or more of the following:

In-process analysis. Avoid the cost of spawning a subprocess per file when scoring thousands of files in a custom tool, IDE plugin, or static-analysis pipeline.
In-memory source. Score generated, pre-processed, or streamed source without writing it to disk first. See Analyzing in-memory source.
Selective walking. Drive a custom traversal over the FuncSpace tree to extract per-function metrics on your own schedule. See Walking FuncSpace results.
Custom output. Skip the JSON / YAML / TOML / CBOR serializers shipped under src/output/ and emit your own report format (CSV, SARIF, a database row, whatever).

If you just want a Markdown quality report or a CI threshold gate, the bca CLI is faster to wire up.

What is on offer today

Quick start — parse a string, get a FuncSpace, print the cognitive complexity.
Analyzing in-memory source — feed source from a buffer rather than a file.
Reusing an existing tree-sitter Tree — feed a caller-built tree_sitter::Tree into the metric walker.
Parse once, run metrics many times — hold a parsed Ast and run multiple metric subsets / custom walks against the same tree.
Walking the AST directly — count syntactic constructs, find nodes by kind, detect parse errors, or build a symbol table alongside the metrics walk.
Selecting metrics — compute only the metrics you need with MetricsOptions::with_only, including how dependent metrics pull in their inputs.
Walking FuncSpace results — recurse into nested function / class / impl spaces.
Error handling — what Result<FuncSpace, MetricsError> means today and how to turn it into a useful diagnostic.
Stability and versioning — what you can and cannot rely on across the 2.x line.

A note on API stability

The library is on the 2.x line and ships under a written stability contract: the shape of the public API is held stable across patch and minor bumps, and breaking changes are reserved for the next major bump. Every example in this section compiles against the current published crate and is expected to keep compiling across 2.x without edits.

Metric values may still drift across minor bumps when a grammar pin moves or a metric definition is fixed — see STABILITY.md § What is stable in value for the carve-out. Each drift is called out in the changelog entry that introduces it.

Quick start

This page walks through the minimum amount of code needed to compute metrics from a string of source code.

1. Add the crate

# Cargo.toml
[dependencies]
big-code-analysis = "2.0.0"

The crate uses Rust edition 2024 and pins rust-version = "1.94". Older toolchains will not build it — see the MSRV section of STABILITY.md for the policy.

2. Compute metrics from a string

The recommended entry point is analyze: pass a Source carrying the language, source bytes, and an optional display name, plus a MetricsOptions for any per-traversal flags. No filesystem path is needed.

use big_code_analysis::{analyze, MetricsOptions, Source, LANG};

fn main() {
    let source = "fn add(a: i32, b: i32) -> i32 { a + b }";

    let space = analyze(
        Source::new(LANG::Rust, source.as_bytes())
            .with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    )
    .expect("Rust source should parse");

    println!(
        "cognitive complexity (file-level): {}",
        space.metrics.cognitive.cognitive_sum(),
    );
}

Source::name ends up as the top-level FuncSpace::name; passing None leaves the top-level name unset. The return type is Result<FuncSpace, MetricsError>. In practice the Err variant means the requested language's Cargo feature is disabled in this build; a parse failure does not produce Err (tree-sitter recovers with ERROR nodes). See Error handling for the variant set and matching patterns. MetricsError is #[non_exhaustive], so always include a _ arm when matching.

Tip: use big_code_analysis::prelude::*; brings the recommended entry points (analyze, Ast, Source, MetricsOptions, MetricsError, LANG, FuncSpace, CodeMetrics, SpaceKind, Metric) into scope in one line. Anything outside the prelude can still be reached by name — for example use big_code_analysis::guess_language;.

Need more than metrics from one parse — operators/operands, an AST dump, a function-span list? Parse once with Ast::parse and call the per-pass methods on the handle. See Parse once, run metrics many times. If you already drive your own tree_sitter::Parser, adopt the resulting tree with Ast::from_tree_sitter (see Reusing an existing tree-sitter Tree).

3. What you got back

FuncSpace is a tree of spaces. The top-level node represents the whole file; its spaces field holds nested function / class / impl spaces. Every node carries the same CodeMetrics struct, so you can read any metric at any level of granularity.

use big_code_analysis::{analyze, MetricsOptions, Source, SpaceKind, LANG};

fn main() {
    let source = "\
fn outer() {
    fn inner() {}
}
";
    let space = analyze(
        Source::new(LANG::Rust, source.as_bytes())
            .with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    )
    .expect("Rust source should parse");

    assert_eq!(space.kind, SpaceKind::Unit);
    assert_eq!(space.spaces.len(), 1); // `outer`
    assert_eq!(space.spaces[0].spaces.len(), 1); // `inner`
}

For a deeper walk over FuncSpace, see Walking FuncSpace results.
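The nesting described in this step can be pictured with a small stand-in type. The sketch below does not use the crate: Space is a hypothetical mirror of just two FuncSpace fields mentioned here (an optional name and the spaces vector), and collect_names is an illustrative helper showing how a recursive walk reaches every level of the tree.

```rust
/// Hypothetical stand-in for the shape described above: an optional
/// name plus nested child spaces. The real `FuncSpace` carries more
/// fields (kind, metrics, and so on).
struct Space {
    name: Option<String>,
    spaces: Vec<Space>,
}

/// Collect (depth, name) pairs in depth-first pre-order.
fn collect_names(space: &Space, depth: usize, out: &mut Vec<(usize, String)>) {
    let label = space.name.clone().unwrap_or_else(|| "<unnamed>".to_owned());
    out.push((depth, label));
    for child in &space.spaces {
        collect_names(child, depth + 1, out);
    }
}

/// Mirrors the `outer` / `inner` snippet: file -> outer -> inner.
fn sample() -> Space {
    let inner = Space { name: Some("inner".to_owned()), spaces: vec![] };
    let outer = Space { name: Some("outer".to_owned()), spaces: vec![inner] };
    Space { name: Some("snippet.rs".to_owned()), spaces: vec![outer] }
}

fn main() {
    let mut names = Vec::new();
    collect_names(&sample(), 0, &mut names);
    for (depth, name) in &names {
        // Indent each name by its nesting depth.
        println!("{}{}", "  ".repeat(*depth), name);
    }
}
```

With the real type, the same recursion would read the metrics field at each level instead of only the name.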

Picking a language

If you do not know the language up front, use guess_language — it consults the path extension, an Emacs mode line in the buffer, and the shebang in that order:

use std::path::PathBuf;

use big_code_analysis::{analyze, guess_language, MetricsOptions, Source};

fn main() {
    let source = b"print('hi')\n";
    let path = PathBuf::from("hello.py");

    let (Some(lang), _name) = guess_language(source, &path) else {
        eprintln!("unrecognised language");
        return;
    };

    let _space = analyze(
        Source::new(lang, source).with_name(Some("hello.py".to_owned())),
        MetricsOptions::default(),
    );
}

guess_language returns (None, _) for unknown extensions; treat that as "skip this file" rather than as a parse error.

What changes when

The recommended entry point is analyze(Source, MetricsOptions), which returns Result<FuncSpace, MetricsError>.

Analyzing in-memory source

big-code-analysis never requires source to live on disk. The recommended entry point analyze takes a Source carrying the language, source bytes, and an optional caller-supplied display name; no filesystem path is involved unless the C/C++ preprocessor lookup needs one (Source::preproc_path).

This is useful for:

Scoring generated code before it is written out.
Scoring pre-processed or bundled source (e.g. after a template expansion).
Driving the analyzer from a language server or editor plugin that already holds the buffer in memory.
Stdin pipelines and unit tests that should not touch the filesystem.

Reading from a buffer

#![allow(unused)]
fn main() {
use big_code_analysis::{analyze, MetricsOptions, Source, LANG};

fn analyze_buffer(source: &[u8]) -> Option<u64> {
    // `Source::name` is the display identifier baked into the
    // top-level `FuncSpace`. Pick whatever is meaningful for
    // downstream consumers (logs, JSON output); pass `None` if
    // you have nothing useful to attach.
    let space = analyze(
        Source::new(LANG::Python, source).with_name(Some("<stdin>".to_owned())),
        MetricsOptions::default(),
    )
    .ok()?;

    Some(space.metrics.cognitive.cognitive_sum())
}
}

Source::new borrows the source bytes — the caller retains ownership. If your downstream pipeline needs to highlight findings on the same bytes, you can keep using the original buffer after analyze returns.

Reading from stdin

use std::io::{self, Read};

use big_code_analysis::{analyze, MetricsOptions, Source, LANG};

fn main() -> io::Result<()> {
    let mut source = Vec::new();
    io::stdin().read_to_end(&mut source)?;

    let space = match analyze(
        Source::new(LANG::Javascript, &source)
            .with_name(Some("<stdin>".to_owned())),
        MetricsOptions::default(),
    ) {
        Ok(space) => space,
        Err(err) => {
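            // `Err` signals an analysis-level failure such as a disabled
            // language feature; malformed input still parses (with ERROR nodes).
            eprintln!("analysis failed: {err}");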
            std::process::exit(1);
        }
    };

    println!("{}", space.metrics.cyclomatic.cyclomatic_sum());
    Ok(())
}

Picking the language from content

If you do not know the language up front, combine guess_language with analyze. guess_language peeks at the path extension, an Emacs mode-line, and the shebang in that order:

#![allow(unused)]
fn main() {
use std::path::PathBuf;

use big_code_analysis::{analyze, guess_language, MetricsOptions, Source};

fn analyze_unknown(path: PathBuf, source: Vec<u8>) -> Option<()> {
    let (lang, _name) = guess_language(&source, &path);
    let lang = lang?;
    // `.ok()?` collapses `MetricsError` into `None` so this helper's
    // `Option` return shape is preserved. See `error-handling.md` for
    // a richer mapping that preserves the variant.
    let _space = analyze(
        Source::new(lang, &source)
            .with_name(path.to_str().map(str::to_owned)),
        MetricsOptions::default(),
    )
    .ok()?;
    Some(())
}
}

guess_language returns (None, _) for unrecognised extensions — treat that as "skip" rather than as a hard error.

Watch out for these

Name identity matters. Top-level FuncSpace::name is whatever string you put in Source::name. Two analyses sharing the same name will look identical to a downstream consumer that keys on it. Use distinct labels for distinct buffers.
Source::name is Option<String>. Passing None leaves the top-level FuncSpace::name as None — useful for ad-hoc snippets that have no meaningful identity. Downstream consumers that require a stable identifier should check for None explicitly.
No filesystem fallback. Unlike the CLI, the library does not read sibling files, follow #includes, or interpret a .gitignore. Feed it exactly the bytes you want analyzed.

Reusing one parse for several passes

analyze is the one-shot entry point: bytes in, FuncSpace out. When you need more than metrics from a single parse — operators and operands, an AST dump, a function-span list — parse once with Ast::parse and call the per-pass methods on the returned handle (metrics, ops, dump, functions, …). See Parse once, run metrics many times.

Reusing an existing tree-sitter Tree

A common pain point is that callers who already drive tree-sitter for syntax highlighting, code folding, or queries end up parsing every file twice: once for their own tree, once inside the metric walker. The parse seam lets you hand big-code-analysis an already-parsed tree_sitter::Tree and get the same FuncSpace back without re-parsing.

Use Ast::from_tree_sitter. It adopts a caller-built tree_sitter::Tree and lets you run the metric walker more than once against the same parse (different MetricsOptions::with_only selections, custom tree-sitter walks interleaved with metrics, Ast::ops for operator/operand extraction, etc.). See Parse once, run metrics many times. It carries an explicit name: Option<String> rather than deriving the top-level FuncSpace::name from a path via lossy UTF-8 conversion.

When to use this

Use the parse seam if you:

Already keep a tree_sitter::Tree per open buffer (editor, language server, custom static-analysis pipeline) and want to reuse that parse for metrics rather than paying for a second parse from bytes.
Want to run multiple passes (metrics + AST dump + custom analysis) against one parse result.
Want your tree-sitter version to stay in lockstep with this library without declaring a separate tree-sitter dependency. The re-exported big_code_analysis::tree_sitter module is the same crate we link against, so the types agree by definition.

Use the byte-based entry point analyze (with a Source) if you do not already have a tree — it constructs the parser internally and owns the parse end to end.

Working example

use big_code_analysis::{analyze, tree_sitter, Ast, LANG, MetricsOptions, Source};

let source_code = "fn main() { if true { 1 } else { 2 }; }";
let source = source_code.as_bytes().to_vec();

// Step 1: build a tree with the *re-exported* tree-sitter crate.
// Using `big_code_analysis::tree_sitter` (rather than a direct
// `tree-sitter` dependency on your side) guarantees the version
// matches the one the metric walker was compiled against.
let mut parser = tree_sitter::Parser::new();
parser
    .set_language(
        &LANG::Rust.tree_sitter_language().expect("rust feature enabled"),
    )
    .expect("rust grammar pinned to a compatible version");
let tree = parser
    .parse(&source, None)
    .expect("parser has a language set");

// Step 2: adopt the tree with an explicit display name.
let from_tree = Ast::from_tree_sitter(
    LANG::Rust,
    tree,
    source.clone(),
    Some("foo.rs".to_owned()),
)
.expect("rust feature enabled")
.metrics(MetricsOptions::default())
.expect("walker succeeds");

// Step 3 (optional): confirm the values match the byte-based path.
let from_bytes = analyze(
    Source::new(LANG::Rust, &source).with_name(Some("foo.rs".to_owned())),
    MetricsOptions::default(),
)
.expect("rust feature enabled");

assert_eq!(
    from_tree.metrics.cyclomatic.cyclomatic_sum(),
    from_bytes.metrics.cyclomatic.cyclomatic_sum(),
);

The same shape works for any LANG variant — pass the matching grammar to tree_sitter::Parser::set_language (via LANG::tree_sitter_language) and the metric walker will produce the same FuncSpace it would have produced from bytes.

The single tree-adoption seam

Ast::from_tree_sitter is the entry point for tree reuse — it dispatches on a LANG at runtime and hides the parser plumbing entirely. The former lower-level path (the generic Parser<T> / ParserTrait and the per-language *Parser / *Code tag types) is now crate-private (pub(crate)) and is no longer part of the public surface; see STABILITY.md. Library consumers should adopt a tree through Ast::from_tree_sitter, which does not expose any per-language tag types or trait bounds.

Out of scope

Incremental re-computation. Applying a tree_sitter::InputEdit and re-querying only the changed spans is not supported yet — the metric walker still walks the entire tree on every call. The parse seam is the first step; making the walker itself incremental is a follow-up.
Promoting all of Node's pub(crate) traversal methods. Node still exposes its inner tree_sitter::Node through the public .0 field for ad-hoc traversal; the wrapper helpers remain crate-private.

Parse once, run metrics many times

big-code-analysis's one-shot entry point analyze re-parses its Source on every call. For pipelines that score a file multiple times — different metric subsets, an interleaved custom tree-sitter walk, or a metric re-run after a configuration change — that re-parse is wasted work.

The Ast type, added in 1.1.0, exposes the seam: parse the source once, then call Ast::metrics as many times as you need against the held parse.

When to use this

Reach for Ast when any of the following applies:

Selective metric runs. You compute one set of metrics for a report, then another for a CI threshold gate, against the same file.
Custom tree-sitter walks. You already drive a tree_sitter::Tree for queries / highlighting / symbol extraction and want to fold the metric walker into the same parse.
Cached analysis. An LSP-like service that holds parsed files in memory should be able to re-run metrics on demand when configuration changes, without going back to bytes.

If you only ever compute every metric once per file, stick with analyze — it now delegates to Ast internally, so the shapes line up but the one-shot API stays simpler.

Selective metrics across calls

#![allow(unused)]
fn main() {
use big_code_analysis::{Ast, LANG, Metric, MetricsOptions, Source};

let source = b"fn f(x: i32) -> i32 { if x > 0 { 1 } else { -1 } }";

// One parse, two metric subsets.
let ast = Ast::parse(Source::new(LANG::Rust, source))
    .expect("rust feature enabled");

let loc = ast
    .metrics(MetricsOptions::default().with_only(&[Metric::Loc]))
    .expect("walker succeeds");
let cyclomatic = ast
    .metrics(MetricsOptions::default().with_only(&[Metric::Cyclomatic]))
    .expect("walker succeeds");

println!("ploc = {}", loc.metrics.loc.ploc());
println!("ccn  = {}", cyclomatic.metrics.cyclomatic.cyclomatic_sum());
}

Each metrics call walks the tree once. The savings versus calling analyze twice come from skipping the parse, which dominates runtime for everything except the very largest source files.

Custom tree-sitter walk + metrics on the same parse

Ast::as_tree_sitter borrows the underlying tree_sitter::Tree. The returned reference is valid for the lifetime of the Ast; nodes obtained from it resolve against Ast::source (see the note on the C++ preprocessor below for what source returns under macro expansion).

For realistic AST work — counting node kinds, finding constructs by name, detecting parse errors, building a symbol table — see Walking the AST directly. The example below is a minimal smoke test; the dedicated chapter shows the full pattern (reusable depth-first walker, field-name lookup, error detection).

#![allow(unused)]
fn main() {
use big_code_analysis::{Ast, LANG, MetricsOptions, Source};

let ast = Ast::parse(Source::new(LANG::Rust, b"fn f() {}"))
    .expect("rust feature enabled");

// Walk the tree for your own purposes…
let root = ast.as_tree_sitter().root_node();
assert_eq!(root.kind(), "source_file");

// …and run the metric walker over the same parse.
let space = ast
    .metrics(MetricsOptions::default())
    .expect("walker succeeds");
println!("name = {:?}", space.name);
}

Operators and operands on the same parse

Ast::ops returns the per-space operator/operand Ops tree (the data behind the Halstead metrics). The top-level Ops::name is the Source::name you supplied — carried explicitly, including the None case — rather than a lossy-path string, so Ops::name_was_lossy is never set on this path.

#![allow(unused)]
fn main() {
use big_code_analysis::{Ast, LANG, Source};

let ops = Ast::parse(
    Source::new(LANG::Rust, b"fn f() { let x = 1 + 2; }")
        .with_name(Some("snippet.rs".to_owned())),
)
.expect("rust feature enabled")
.ops()
.expect("walker succeeds");
assert_eq!(ops.name.as_deref(), Some("snippet.rs"));
assert!(ops.operators.iter().any(|op| op == "+"));
}

Adopting a caller-built tree

If you already build the tree_sitter::Tree yourself (e.g. because your editor / LSP has its own parser pool), Ast::from_tree_sitter is the tree-adoption seam. It carries an explicit name: Option<String> end-to-end instead of deriving one from a path via lossy UTF-8 conversion.

#![allow(unused)]
fn main() {
use big_code_analysis::{Ast, LANG, MetricsOptions, tree_sitter};

let source = b"fn f() {}".to_vec();
let mut parser = tree_sitter::Parser::new();
parser
    .set_language(
        &LANG::Rust
            .tree_sitter_language()
            .expect("rust feature enabled"),
    )
    .expect("rust grammar compatible");
let tree = parser
    .parse(&source, None)
    .expect("parser has a language set");

let ast = Ast::from_tree_sitter(LANG::Rust, tree, source, None)
    .expect("rust feature enabled");
let _ = ast.metrics(MetricsOptions::default()).expect("walker succeeds");
}

The tree must have been produced from code with the grammar returned by LANG::tree_sitter_language for lang; a mismatch is not unsafe, but the metric walker matches on tree-sitter kind_id values that come from the language's enum, so values from a different grammar yield nonsensical results.

C++ preprocessor

When Ast::parse is called on a Source carrying preprocessor inputs (Source::with_preproc_path + Source::with_preproc) and the language is LANG::Cpp, the macro pre-pass runs before tree-sitter does — and Ast::source returns the expanded bytes the parser actually saw, not the original input.

Ast::from_tree_sitter is unaffected: it adopts whatever tree the caller built. Whatever expansion (or lack thereof) the caller applied before building the tree is what Ast::source reflects.

Concurrency

Ast is Send + Sync. Running Ast::metrics from multiple threads against the same &Ast is safe — the walker only reads from the held tree_sitter::Tree. (Benchmarking parallel metric runs is a separate follow-up.)
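The sharing pattern this enables can be sketched with scoped threads. The example below uses only the standard library: ParsedFile is a hypothetical stand-in for a held Ast (any read-only Sync value behaves the same way), and its two counting methods stand in for two Ast::metrics calls with different metric subsets.

```rust
use std::thread;

/// Hypothetical stand-in for a held parse: immutable once built.
struct ParsedFile {
    lines: Vec<String>,
}

impl ParsedFile {
    fn parse(source: &str) -> Self {
        Self { lines: source.lines().map(str::to_owned).collect() }
    }

    /// Stand-in for one metric subset: count non-blank lines.
    fn non_blank_lines(&self) -> usize {
        self.lines.iter().filter(|l| !l.trim().is_empty()).count()
    }

    /// Stand-in for another subset: count lines containing an `if`.
    fn if_lines(&self) -> usize {
        self.lines.iter().filter(|l| l.contains("if ")).count()
    }
}

/// Run both passes concurrently against one shared reference.
fn run_passes(parsed: &ParsedFile) -> (usize, usize) {
    thread::scope(|scope| {
        // Both closures only read through the shared reference.
        let loc = scope.spawn(|| parsed.non_blank_lines());
        let branches = scope.spawn(|| parsed.if_lines());
        (
            loc.join().expect("pass panicked"),
            branches.join().expect("pass panicked"),
        )
    })
}

fn main() {
    let parsed = ParsedFile::parse("fn f(x: i32) -> i32 {\n    if x > 0 { 1 } else { -1 }\n}\n");
    let (loc, branches) = run_passes(&parsed);
    println!("non-blank lines = {loc}, if lines = {branches}");
}
```

Scoped threads let the passes borrow the parsed value directly, so no Arc or clone is needed as long as every thread finishes before the value is dropped.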

Out of scope

Incremental reparse via tree_sitter::InputEdit. Caching a stable Ast across an analysis pipeline is in scope; editing the held tree is not.
Parallel-by-default APIs. Ast::metrics does not internally parallelize across the metric set. Callers that want one thread per subset are free to do so.

Walking the AST directly

Ast::parse gives you a parsed tree_sitter::Tree together with the source bytes it was parsed from; Ast::as_tree_sitter hands that tree out as a borrowed reference. This chapter shows how to use it to drive your own syntax-tree analysis — counting node kinds, finding constructs by name, detecting parse errors, or pulling out a symbol table — without paying for a second parse.

When to use this

Reach for direct AST traversal when:

You want to count or find syntactic constructs in-process. The CLI equivalents (bca count -t <kind> and bca find -t <kind>, from the AST-queries recipe) shell out per file; the library path is one parse and one Rust loop.
You want to detect parse errors programmatically. Tree-sitter emits a synthetic ERROR node anywhere the grammar could not match; Node::has_error is O(1) — tree-sitter caches the error bit on every node — so the check is free even on a multi-MB source file.
You want to mix metrics with custom analysis in one parse — e.g. capture metric values and a list of function names for a coverage mapping, an IDE outline, or a code-owner report.

If you only need standard metrics, stay with analyze or Ast::metrics — they walk the tree for you. The direct path is for things the metric walker does not already compute.

Use the re-exported `tree_sitter`

Import tree_sitter from big_code_analysis::tree_sitter rather than adding a sibling tree-sitter dependency. The re-export is pinned to the exact version the metric walker was built against, so the Tree types agree by definition. See Reusing an existing tree-sitter Tree and Stability and versioning for the value-not-stable posture this re-export carries.

A reusable DFS walker

Most of the examples below need a depth-first traversal of every descendant. Tree-sitter ships a TreeCursor that does this in O(1) per step (no allocations beyond the cursor itself). The canonical walk is short enough to inline:

#![allow(unused)]
fn main() {
use big_code_analysis::tree_sitter;

/// Visit every node in `tree` in pre-order, root first, passing each
/// node to `visit`. Allocation-free apart from the cursor itself.
fn walk_preorder<F: FnMut(tree_sitter::Node<'_>)>(
    tree: &tree_sitter::Tree,
    mut visit: F,
) {
    let mut cursor = tree.walk();
    'walk: loop {
        visit(cursor.node());
        if cursor.goto_first_child() {
            continue;
        }
        loop {
            if cursor.goto_next_sibling() {
                continue 'walk;
            }
            if !cursor.goto_parent() {
                return;
            }
        }
    }
}
}

The pattern is: visit, descend, climb back up while there is no next sibling, repeat. Every example in this chapter is a thin wrapper around this walker — the code fences below are marked ignore because they assume walk_preorder is already in scope; the matching set of tests in tests/book_ast_traversal_examples.rs keeps them honest, so a refactor that broke an example would fail cargo test.

Count nodes by kind

Library equivalent of bca count -t if_expression -t for_expression -t while_expression from the AST-queries recipe:

use big_code_analysis::{Ast, LANG, Source};
use std::collections::HashMap;

let ast = Ast::parse(Source::new(
    LANG::Rust,
    b"fn a() { if true { 1 } else { 2 } } fn b() { for _ in 0..10 {} }",
))
.expect("rust feature enabled");

let mut counts: HashMap<&str, usize> = HashMap::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    *counts.entry(node.kind()).or_default() += 1;
});

assert_eq!(counts.get("if_expression").copied().unwrap_or(0), 1);
assert_eq!(counts.get("for_expression").copied().unwrap_or(0), 1);

The string keys ("if_expression", "for_expression", …) are the tree-sitter grammar's node-type names. The fastest way to discover them for a new language is bca dump --paths sample.rs, which prints the full AST.

Anonymous tokens. The walker visits every node tree-sitter emits, including anonymous tokens like "{", ";", and keyword literals. The targeted counts.get("if_expression") lookups above are unaffected — anonymous tokens have different kind names — but counts.values().sum() would be much larger than the count of named grammar productions. Filter with tree_sitter::Node::is_named() inside the visitor if you only want named nodes.

Find nodes by kind

Library equivalent of bca find -t unsafe_block:

use big_code_analysis::{Ast, LANG, Source};

let ast = Ast::parse(Source::new(
    LANG::Rust,
    b"fn safe() {} fn risky() { unsafe { } }",
))
.expect("rust feature enabled");

let source = ast.source();
// Captured slices borrow from `source` — no per-hit `String` allocation.
let mut hits: Vec<((usize, usize), &str)> = Vec::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    if node.kind() == "unsafe_block" {
        let span = (node.start_position().row, node.end_position().row);
        let text = node
            .utf8_text(source)
            .expect("source is valid utf-8");
        hits.push((span, text));
    }
});

assert_eq!(hits.len(), 1);

Node::utf8_text(&source[..]) slices the source bytes by the node's byte range. Pair it with Ast::source — for C++ with preprocessor inputs supplied to Ast::parse, source is the expanded buffer the parser actually saw, not the original input (see the C++ preprocessor note).

Detect parse errors

Tree-sitter is error-tolerant: even on malformed input it returns a tree, but nodes that could not be matched are tagged as errors. The cheapest check is on the root:

#![allow(unused)]
fn main() {
use big_code_analysis::{Ast, LANG, Source};

let ast = Ast::parse(Source::new(LANG::Rust, b"fn broken("))
    .expect("rust feature enabled");

// `has_error()` reports whether the tree contains any error at all,
// but does not enumerate the individual error sites.
assert!(ast.as_tree_sitter().root_node().has_error());
}

To list the offending nodes, walk the tree and check each:

use big_code_analysis::{Ast, LANG, Source};

let ast = Ast::parse(Source::new(LANG::Rust, b"fn broken("))
    .expect("rust feature enabled");

let mut error_lines = Vec::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    if node.is_error() || node.is_missing() {
        error_lines.push(node.start_position().row);
    }
});

assert!(!error_lines.is_empty());

Node::is_error() flags the synthetic ERROR node tree-sitter inserts where it could not match the grammar; Node::is_missing() flags phantom nodes the parser invented to recover from a missing token. The CLI's bca find -t ERROR recipe uses the same nodes.

Combine metrics with a custom walk

The whole point of Ast is parse-once / compute-many. A realistic pipeline computes metrics and extracts a symbol table from the same parse:

use big_code_analysis::{Ast, LANG, MetricsOptions, Source};

let ast = Ast::parse(Source::new(
    LANG::Rust,
    b"fn outer() { fn inner() {} } fn alone() {}",
))
.expect("rust feature enabled");

// One parse: metrics walker uses it…
let space = ast
    .metrics(MetricsOptions::default())
    .expect("walker succeeds");

// …and so does the custom walk, against the very same tree. The
// captured names borrow from `source` rather than allocating a fresh
// `String` per function, the same pattern as the `unsafe_block`
// search above.
let source = ast.source();
let mut functions: Vec<&str> = Vec::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    if node.kind() == "function_item"
        && let Some(name_node) = node.child_by_field_name("name")
    {
        let name = name_node
            .utf8_text(source)
            .expect("source is valid utf-8");
        functions.push(name);
    }
});

assert_eq!(space.metrics.nom.functions_sum(), 3);
assert_eq!(functions, ["outer", "inner", "alone"]);

Node::child_by_field_name walks the named grammar fields — the same fields that show up in the field_name key of the serialized AST (REST /ast, Ast::dump). Field-based lookup is more robust than positional indexing because it does not depend on which children the grammar emits for anonymous tokens (commas, parentheses, …).

Want a serializable JSON tree?

For pipelines that want a structured AST as data — diffing, queries on the wire, language-agnostic schema work — Ast::dump materializes the tree as a Serialize-able AstResponse of AstNodes. This is the same shape the REST /ast endpoint produces. Call it on the parse handle:

#![allow(unused)]
fn main() {
use big_code_analysis::{Ast, AstCfg, AstPayload, LANG, Source};

let payload = AstPayload {
    id: "snippet".to_owned(),
    file_name: "snippet.rs".to_owned(),
    code: "fn f() {}".to_owned(),
    comment: false,
    span: true,
};
let cfg = AstCfg {
    id: payload.id.clone(),
    language: "rust".to_owned(),
    comment: payload.comment,
    span: payload.span,
};
let response = Ast::parse(
    Source::new(LANG::Rust, payload.code.as_bytes())
        .with_name(Some(payload.file_name.clone())),
)
.expect("rust feature enabled")
.dump(cfg);
let json = serde_json::to_string(&response).expect("AstResponse serializes");
println!("{json}");
}

For one-off in-process work, the as_tree_sitter() walker above is cheaper (no allocation per node). Reach for Ast::dump when you need a serializable owned tree.

Out of scope

Incremental reparse: tree-sitter supports tree_sitter::InputEdit for incremental updates, but Ast is a snapshot. To reflect a source edit, either call Ast::parse again on the new source, or drive tree_sitter::Parser::parse(&new_source, Some(&old_tree)) directly via the re-exported tree_sitter and feed the result through Ast::from_tree_sitter.
The crate-internal big_code_analysis::Node wrapper. It is exposed for the metric walker's traversal needs, but most of its traversal methods (kind, child_count, children, cursor, …) stay pub(crate). Library consumers should reach the tree-sitter Node through as_tree_sitter().root_node() — that is the documented seam.

Selecting metrics

By default, every call to analyze computes the full metric suite — ABC, cognitive, cyclomatic, Halstead, LoC, MI, NArgs, NExits, NOM, NPA, NPM, tokens, and WMC. That is the right default for the CLI, where the user has just asked for the metrics, but it is heavyweight for callers that only want one number per file.

MetricsOptions::with_only(&[Metric]) lets you restrict the walker to a subset of metrics. Unselected metrics are skipped at the per-node level — no T::Halstead::compute, no T::Cognitive::compute, etc. — and elided from the CodeMetrics serialization output.

A worked example

Compute LoC only, then read the result:

use big_code_analysis::{analyze, LANG, Metric, MetricsOptions, Source};

fn main() {
    let source = b"fn f(x: i32) -> i32 { if x > 0 { 1 } else { 0 } }";

    let opts = MetricsOptions::default().with_only(&[Metric::Loc]);
    let space = analyze(
        Source::new(LANG::Rust, source).with_name(Some("snippet.rs".to_owned())),
        opts,
    )
    .expect("parses");

    // LoC was selected — it carries real numbers.
    println!("ploc = {}", space.metrics.loc.ploc());

    // Halstead, cognitive, cyclomatic, … were skipped. Their
    // `Stats` fields are at `Default` and elided from JSON output.
    let json = serde_json::to_string_pretty(&space.metrics).unwrap();
    println!("{json}");
}

The JSON output for that call contains only the loc object; every other metric is absent.

Dependencies between metrics

Two metrics are derived — they consume the outputs of other metrics during the finalize step:

Metric	Dependencies
`Metric::Mi`	`Loc`, `Cyclomatic`, `Halstead`
`Metric::Wmc`	`Cyclomatic`, `Nom`

with_only resolves these closures silently. Asking for Mi alone still computes Loc + Cyclomatic + Halstead, so the MI value is meaningful rather than a function of zero-default inputs:

#![allow(unused)]
fn main() {
use big_code_analysis::{Metric, MetricsOptions};
let opts = MetricsOptions::default().with_only(&[Metric::Mi]);
// opts.metrics now contains Mi + Loc + Cyclomatic + Halstead.
}

You can introspect the final set from the resulting FuncSpace via space.metrics.selected():

#![allow(unused)]
fn main() {
use big_code_analysis::{analyze, LANG, Metric, MetricsOptions, Source};
let space = analyze(
    Source::new(LANG::Rust, b"fn f() {}"),
    MetricsOptions::default().with_only(&[Metric::Mi]),
).unwrap();
let sel = space.metrics.selected();
assert!(sel.contains(Metric::Mi));
assert!(sel.contains(Metric::Loc)); // auto-added dependency
}

Default behaviour is unchanged

MetricsOptions::default() selects every metric. Calling analyze (or Ast::metrics) without with_only produces byte-for-byte the same JSON it always did.

What about "everything except X"?

There is no built-in complement API — with_only takes a positive selection, not an exclusion list. The intentional asymmetry keeps the dependency closure unambiguous: a positive list always grows through Metric::dependencies, whereas an exclusion list would need to decide what to do when the caller excludes a dependency of a metric they kept.

If you genuinely want "all except Halstead", build the list explicitly. Because Metric is #[non_exhaustive], downstream crates can construct the variants but cannot exhaustively match on them, so the conventional pattern is to enumerate the variants you want and accept that adding a future Metric variant will not silently opt you in:

#![allow(unused)]
fn main() {
use big_code_analysis::{Metric, MetricsOptions};

let opts = MetricsOptions::default().with_only(&[
    Metric::Cognitive,
    Metric::Cyclomatic,
    Metric::Loc,
    Metric::Nom,
    Metric::Tokens,
    Metric::Nargs,
    Metric::Nexits,
    Metric::Abc,
    Metric::Npm,
    Metric::Npa,
    Metric::Wmc,
    // Metric::Mi intentionally omitted: it would pull Halstead
    // back in via the dependency closure.
]);
}

Note the trap: keeping Metric::Mi re-adds Metric::Halstead through Metric::dependencies. To truly drop Halstead you must also drop Mi.
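The closure rule can be illustrated with a small, self-contained sketch. The `Metric` enum, `dependencies`, and `close_over` below are local stand-ins written for this example, not the crate's own types or implementation; only the dependency table above is taken from the documentation. Under those assumptions, the sketch shows why a selection that keeps `Mi` ends up containing `Halstead`:

```rust
use std::collections::BTreeSet;

// Hypothetical stand-in for the crate's `Metric` enum; only the variants
// needed to illustrate the dependency table are listed.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Metric {
    Cyclomatic,
    Halstead,
    Loc,
    Mi,
    Nom,
    Wmc,
}

// Direct dependencies, copied from the table above.
fn dependencies(metric: Metric) -> &'static [Metric] {
    match metric {
        Metric::Mi => &[Metric::Loc, Metric::Cyclomatic, Metric::Halstead],
        Metric::Wmc => &[Metric::Cyclomatic, Metric::Nom],
        _ => &[],
    }
}

// Grow a positive selection until every dependency is present.
fn close_over(selection: &[Metric]) -> BTreeSet<Metric> {
    let mut selected = BTreeSet::new();
    let mut pending: Vec<Metric> = selection.to_vec();
    while let Some(metric) = pending.pop() {
        // `insert` returns true only the first time a metric is seen.
        if selected.insert(metric) {
            pending.extend_from_slice(dependencies(metric));
        }
    }
    selected
}

fn main() {
    // Keeping Mi pulls Halstead back in.
    let with_mi = close_over(&[Metric::Loc, Metric::Mi]);
    assert!(with_mi.contains(&Metric::Halstead));

    // Dropping Mi keeps Halstead out, even with Wmc selected.
    let without_mi = close_over(&[Metric::Loc, Metric::Wmc]);
    assert!(!without_mi.contains(&Metric::Halstead));
    println!("{with_mi:?}");
}
```

Because the selection only ever grows, the result does not depend on the order in which metrics are listed, which is the property an exclusion list would not have.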

When to reach for `with_only`

Hot paths that need only one or two metrics per file — Halstead in particular owns its own per-space HalsteadMaps allocation and is the headline saving for an LoC-only run.
CI integrations that only display one number (e.g. a cognitive-complexity gate) and want the rest of CodeMetrics to drop out of the cached JSON payload.
Library callers wiring big-code-analysis into their own reports who would otherwise see fields for every metric in their own UI.

Per-metric Cargo features (compile-time stripping) are not covered by this knob.

Per-language Cargo features

Every tree-sitter grammar this library bundles is gated behind its own Cargo feature. The default feature set is all-languages, so the default

[dependencies]
big-code-analysis = "2.0.0"

pulls every grammar in — matching the library's historical behaviour and what the bca / bca-web binaries themselves ship with. The cost is concrete: every grammar crate compiles when the library compiles, and every grammar's parsing tables stay live in the final binary.

Library consumers that only need a subset of languages can opt out of the defaults and re-enable just the grammars they care about.

A worked example

A downstream service that only analyses Rust and TypeScript:

[dependencies]
big-code-analysis = { version = "2.0.0", default-features = false, features = ["rust", "typescript"] }

The library still compiles, the LANG enum still has every variant, and analyze / Ast / the rest of the dispatch surface still work for the enabled languages.

Supported features

The following per-language features are available. Each feature pulls in the matching grammar crate (and any helper grammars the per-language pipeline depends on).

Feature	Grammar crates pulled in
`bash`	`tree-sitter-bash`
`c`	`tree-sitter-c` (+ `c-family-helpers`); dedicated `C` variant owning `.c`, added in #721
`c-family-helpers`	`bca-tree-sitter-ccomment`, `bca-tree-sitter-preproc` — internal, enabled automatically by `c` / `cpp` / `mozcpp`; gates the `Ccomment` / `Preproc` helper variants. Not meant to be selected directly
`cpp`	`tree-sitter-cpp` (+ `c-family-helpers`); the `Cpp` variant, upstream grammar since #720. Also enables the `Ccomment` / `Preproc` helper variants
`csharp`	`tree-sitter-c-sharp`
`elixir`	`tree-sitter-elixir`
`go`	`tree-sitter-go`
`groovy`	`dekobon-tree-sitter-groovy`
`irules`	`tree-sitter-irules` (F5 iRules, a Tcl dialect)
`java`	`tree-sitter-java`
`javascript`	`tree-sitter-javascript`
`kotlin`	`tree-sitter-kotlin-ng`
`lua`	`tree-sitter-lua`
`mozcpp`	`bca-tree-sitter-mozcpp` (+ `c-family-helpers`); opt-in Mozilla/Gecko C++ dialect, `Mozcpp` variant — owns no extensions, selected by name
`mozjs`	`bca-tree-sitter-mozjs`
`objc`	`tree-sitter-objc`; the `Objc` variant (Objective-C). Owns `.m`; `.mm` Objective-C++ stays on `Cpp`
`perl`	`tree-sitter-perl`
`php`	`tree-sitter-php`
`python`	`tree-sitter-python`
`ruby`	`tree-sitter-ruby`
`rust`	`tree-sitter-rust`
`tcl`	`bca-tree-sitter-tcl`
`typescript`	`tree-sitter-typescript` (used by both the `Typescript` and `Tsx` variants)

The umbrella all-languages feature enables every entry in this table. The bca-tree-sitter-* crates are in-tree forks of the upstream Mozilla / community grammars; the Rust import path remains tree_sitter_<lang> regardless. See RELEASING.md for the rename rationale and the workspace package = ... alias trick that keeps consumer call sites unchanged.

What happens when a feature is off

The LANG enum keeps every variant defined regardless of the active feature set — disabling a feature does not change the enum surface or any of the file-extension / emacs-mode detection helpers. Selecting a LANG whose feature is off only affects the dispatch path.

Every dispatch entry point that returns a Result surfaces the disabled state as Err(MetricsError::LanguageDisabled(LANG)):

analyze
Ast::parse / Ast::from_tree_sitter (and the metrics / ops methods on the returned Ast)
LANG::tree_sitter_language — this returns Result<tree_sitter::Language, MetricsError> rather than the bare Language the upstream project returned

Callers can query the compiled-in set without going through a dispatcher:

#![allow(unused)]
fn main() {
use big_code_analysis::LANG;

for lang in LANG::into_enum_iter() {
    if lang.is_enabled() {
        println!("{:?} is compiled in", lang);
    }
}
}

This pairs well with the get_language_for_file / guess_language helpers, which still hand back any LANG variant for a recognised extension — callers walking a directory may want to skip files whose language is not enabled in the current build.

Stability

Per-language features are themselves part of the contract. Adding a new language feature is an additive minor-bump change; removing one is a major-bump (3.0) break. The all-languages default is permanent within 2.x, so the variants a default build covers do not shrink before 3.0. Any such change will be flagged in the changelog under (breaking).

Walking `FuncSpace` results

FuncSpace is the tree the library hands back from analyze. The top-level node represents the whole file; its spaces field holds nested function / class / impl / trait / namespace spaces. Each node carries the same CodeMetrics payload, so any metric is available at any level of granularity.

Anatomy of a `FuncSpace`

The fields you reach for most often are:

Field	Type	What it is
`name`	`Option<String>`	Caller-supplied identifier (top-level) or symbol name (nested)
`kind`	`SpaceKind`	`Unit`, `Function`, `Class`, `Impl`, …
`start_line`	`usize`	First line (1-based)
`end_line`	`usize`	Last line (1-based)
`spaces`	`Vec<FuncSpace>`	Nested spaces
`metrics`	`CodeMetrics`	All per-space metric values
`suppressed`	`SuppressionScope`	In-source suppression markers

SpaceKind is an enum — match on it to filter what you care about (Function only, or "anything that owns methods").
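As a rough illustration of the "anything that owns methods" filter, here is a minimal sketch. The `SpaceKind` enum below is a local stand-in so the snippet compiles on its own; its variant list is an assumption based on the kinds named in this chapter, and which kinds count as method owners is a judgment call made for the example.

```rust
// Hypothetical stand-in for the crate's `SpaceKind`; the real enum may
// have more variants or different names.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SpaceKind {
    Unit,
    Function,
    Class,
    Impl,
    Trait,
    Namespace,
}

// "Anything that owns methods": containers whose children are functions.
fn owns_methods(kind: SpaceKind) -> bool {
    matches!(kind, SpaceKind::Class | SpaceKind::Impl | SpaceKind::Trait)
}

fn main() {
    let kinds = [
        SpaceKind::Unit,
        SpaceKind::Function,
        SpaceKind::Class,
        SpaceKind::Impl,
        SpaceKind::Trait,
        SpaceKind::Namespace,
    ];
    let owners: Vec<SpaceKind> = kinds
        .iter()
        .copied()
        .filter(|kind| owns_methods(*kind))
        .collect();
    assert_eq!(owners, [SpaceKind::Class, SpaceKind::Impl, SpaceKind::Trait]);
}
```

A `matches!` predicate evaluates to false for any variant it does not list, so a filter written this way keeps compiling if the enum gains variants later.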

Recursive walk

Recursion mirrors the tree shape. Here we collect every function space whose cognitive complexity exceeds a threshold:

use big_code_analysis::{
    analyze, FuncSpace, MetricsOptions, SpaceKind, Source, LANG,
};

fn hotspots(space: &FuncSpace, threshold: u64, out: &mut Vec<String>) {
    if space.kind == SpaceKind::Function
        && space.metrics.cognitive.cognitive_sum() > threshold
        && let Some(name) = &space.name
    {
        out.push(format!(
            "{name} (lines {}–{})",
            space.start_line, space.end_line,
        ));
    }
    for child in &space.spaces {
        hotspots(child, threshold, out);
    }
}

fn main() {
    let source = b"\
fn easy() { let _ = 1; }
fn hard(x: i32) -> i32 {
    if x > 0 { if x > 10 { 1 } else { 2 } } else { 3 }
}
";
    let space = analyze(
        Source::new(LANG::Rust, source).with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    )
    .expect("parses");

    let mut hits = Vec::new();
    hotspots(&space, 2, &mut hits);
    for hit in hits {
        println!("{hit}");
    }
}

Iterative walk

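For deep trees, prefer an explicit stack: the recursive walk uses one call-stack frame per nesting level, Rust does not guarantee tail-call optimisation (and this recursion is not in tail position anyway), and pathological generated code can be arbitrarily nested: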

#![allow(unused)]
fn main() {
use big_code_analysis::FuncSpace;

fn total_functions(root: &FuncSpace) -> usize {
    let mut stack = vec![root];
    let mut count = 0;
    while let Some(space) = stack.pop() {
        if space.kind == big_code_analysis::SpaceKind::Function {
            count += 1;
        }
        stack.extend(space.spaces.iter());
    }
    count
}
}

Reading per-metric numbers

CodeMetrics exposes each metric as its own Stats struct. Inside, each struct offers integer-valued summary accessors plus per-space derived ones. A few patterns:

#![allow(unused)]
fn main() {
use big_code_analysis::FuncSpace;

fn summary(space: &FuncSpace) {
    let m = &space.metrics;

    println!("cognitive (this space):     {}", m.cognitive.cognitive_sum());
    println!("cyclomatic (this space):    {}", m.cyclomatic.cyclomatic_sum());
    println!("# functions in this space:  {}", m.nom.functions_sum());
    println!("source lines (sloc):        {}", m.loc.sloc());
    println!("physical lines (ploc):      {}", m.loc.ploc());
    println!("ABC branches:               {}", m.abc.branches());
}
}

The *_sum accessors aggregate across child spaces; bare accessors like m.loc.sloc() are the value attributable to this node. The full list of fields and methods lives in the per-metric rustdoc.

Don't rely on traversal order

The library walks the AST in source order, but the contract is only that every space appears once in the tree. If you need a stable order across versions, sort by start_line after the walk:

#![allow(unused)]
fn main() {
use big_code_analysis::FuncSpace;

fn flatten(space: &FuncSpace, out: &mut Vec<(usize, String)>) {
    if let Some(name) = &space.name {
        out.push((space.start_line, name.clone()));
    }
    for child in &space.spaces {
        flatten(child, out);
    }
}

fn sorted(space: &FuncSpace) -> Vec<(usize, String)> {
    let mut v = Vec::new();
    flatten(space, &mut v);
    v.sort_by_key(|&(line, _)| line);
    v
}
}

Error handling

The entry point analyze returns Result<FuncSpace, MetricsError>. This page documents what each variant means and how to act on it.

Heads up. MetricsError is #[non_exhaustive], so always include a wildcard _ arm when matching on it to stay forward-compatible with future variants.

Pattern-matching the error variants

use big_code_analysis::{analyze, LANG, MetricsError, MetricsOptions, Source};

fn main() {
    let result = analyze(
        Source::new(LANG::Rust, b"this is not rust")
            .with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    );

    match result {
        Ok(space) => println!("ok: {} lines", space.metrics.loc.sloc()),
        Err(MetricsError::EmptyRoot) => {
            eprintln!("walker produced no top-level FuncSpace");
        }
        Err(MetricsError::LanguageDisabled(lang)) => {
            eprintln!("language {:?} is not enabled in this build", lang);
        }
        // `MetricsError` is `#[non_exhaustive]`; new variants may be added.
        Err(_) => eprintln!("unexpected MetricsError variant"),
    }
}

What each variant means

EmptyRoot — Reserved; not produced today. metrics_with_options always pushes a synthetic top-level Unit FuncSpace before walking the AST, so every parse — including empty, whitespace-only, and comment-only input — returns Ok(FuncSpace { kind: Unit, .. }). The variant is kept for a future walker change that could let the state stack legitimately drain to empty.
LanguageDisabled(LANG) — The requested LANG is not enabled in this build. Every dispatch entry point produces it when the caller selects a LANG whose per-language Cargo feature is off. The default feature set (all-languages) compiles every grammar in, so you only see this variant after opting into a narrower set (--no-default-features --features rust,…).

MetricsError has no ParseHasErrors or NonUtf8Path variant; since it stays #[non_exhaustive], a future strict-parsing or strict-identifier mode can introduce them without a breaking change. Non-UTF-8 paths are already handled up front: the recommended analyze entry point takes a caller-supplied Source::name (Option<String>), so a lossy path is never round-tripped in the first place.

Tree-sitter does not always say "no"

Most parse errors do not surface as Err(_). Tree-sitter is an error-recovering parser — it will produce a tree even for syntactically broken input, marking the bad regions with ERROR nodes. The metric walk happily computes numbers over the recovered tree. That means:

Garbage in, numbers out. Feeding C++ source to LANG::Python generally produces an Ok(FuncSpace) whose metrics are nonsense. Make sure you have selected the right language (e.g. via guess_language) before trusting the result.
Partial files score. A truncated file with an unterminated brace will still return Ok(FuncSpace). The metrics reflect the recovered tree, not the intended source.

If you need to know whether the input parsed cleanly, count ERROR nodes by walking the tree-sitter AST yourself (see the Node escape hatch in STABILITY.md) or use bca find -t ERROR on the CLI side (see the Nodes page).

Bubbling `MetricsError` through `?`

Because MetricsError implements std::error::Error, you can bubble it through any Result<_, Box<dyn Error>> chain without boilerplate:

#![allow(unused)]
fn main() {
use std::error::Error;

use big_code_analysis::{analyze, FuncSpace, LANG, MetricsOptions, Source};

pub fn run(
    lang: LANG,
    source: &[u8],
    name: Option<String>,
) -> Result<FuncSpace, Box<dyn Error>> {
    Ok(analyze(
        Source::new(lang, source).with_name(name),
        MetricsOptions::default(),
    )?)
}
}

If you want a project-specific error type, an explicit From impl keeps call sites clean while letting you attach extra context (file path, language guess, etc.).
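A minimal sketch of that pattern follows. To stay self-contained it defines a local stand-in `MetricsError` (the real `LanguageDisabled` variant carries a `LANG`, not a string), and `AppError`, `with_path`, and `analyze_stub` are hypothetical names invented for the example.

```rust
use std::error::Error;
use std::fmt;
use std::path::{Path, PathBuf};

// Hypothetical stand-in for `big_code_analysis::MetricsError`, so the
// sketch compiles without the crate.
#[derive(Debug)]
enum MetricsError {
    EmptyRoot,
    LanguageDisabled(&'static str),
}

impl fmt::Display for MetricsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MetricsError::EmptyRoot => write!(f, "empty root"),
            MetricsError::LanguageDisabled(lang) => {
                write!(f, "language {lang} is disabled")
            }
        }
    }
}

impl Error for MetricsError {}

// Project-specific error: wraps the library error plus optional context.
#[derive(Debug)]
struct AppError {
    path: Option<PathBuf>,
    source: MetricsError,
}

impl AppError {
    fn with_path(mut self, path: &Path) -> Self {
        self.path = Some(path.to_path_buf());
        self
    }
}

impl fmt::Display for AppError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.path {
            Some(path) => write!(f, "{}: {}", path.display(), self.source),
            None => write!(f, "{}", self.source),
        }
    }
}

impl Error for AppError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

// The `From` impl is what lets `?` convert the library error automatically.
impl From<MetricsError> for AppError {
    fn from(source: MetricsError) -> Self {
        AppError { path: None, source }
    }
}

// Stand-in for a call into the library that fails.
fn analyze_stub() -> Result<u64, MetricsError> {
    Err(MetricsError::LanguageDisabled("rust"))
}

fn run(path: &Path) -> Result<u64, AppError> {
    // Attach the path where it is known; `Ok(analyze_stub()?)` would also
    // compile, just without the extra context.
    analyze_stub().map_err(|e| AppError::from(e).with_path(path))
}

fn main() {
    let err = run(Path::new("src/lib.rs")).unwrap_err();
    assert_eq!(err.to_string(), "src/lib.rs: language rust is disabled");
}
```

Keeping the original error reachable through `Error::source` means callers that log the full chain still see the library's own message.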

Warnings are not errors

The library writes warnings to stderr for non-fatal issues (malformed bca: suppression markers, mainly). They do not abort the walk and they do not flip Ok to Err. If you are running embedded inside a server or library and need to capture those warnings, redirect stderr at the process level — the library does not currently expose a programmatic warning sink.
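One way to redirect at the process level, assuming the analysis can run in a child process (a small worker binary of your own, for example), is to pipe the child's stderr and read it back. The sketch below uses only the standard library; `run_capturing_stderr` is a hypothetical helper, and a POSIX `sh` stands in for the analysis process.

```rust
use std::io;
use std::process::{Command, Stdio};

// Run `program` with `args` as a child process and return (stdout, stderr).
// Anything the child writes to stderr arrives in the second string.
fn run_capturing_stderr(program: &str, args: &[&str]) -> io::Result<(String, String)> {
    let output = Command::new(program)
        .args(args)
        .stdin(Stdio::null())
        .stdout(Stdio::piped())
        .stderr(Stdio::piped())
        .output()?;
    Ok((
        String::from_utf8_lossy(&output.stdout).into_owned(),
        String::from_utf8_lossy(&output.stderr).into_owned(),
    ))
}

fn main() -> io::Result<()> {
    // `sh` stands in for the analysis process; this assumes a POSIX shell.
    let (out, err) =
        run_capturing_stderr("sh", &["-c", "echo result; echo warning 1>&2"])?;
    assert_eq!(out.trim(), "result");
    assert_eq!(err.trim(), "warning");
    Ok(())
}
```

The captured stderr text can then be forwarded to whatever logging the host application already uses.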

Stability and versioning

big-code-analysis is on the 2.x line (currently 2.0.0). The full stability contract lives in STABILITY.md at the root of the repository — that file is the source of truth and is updated alongside the changelog at every release.

The headlines for library consumers:

Shape stability across patch and minor bumps. Every public type and function signature listed in STABILITY.md § "What is stable in shape" is held across the 2.x line. Additive changes (new items, new LANG variants, new MetricsError variants, new language features) are allowed in minor bumps. Breaking shape changes are reserved for the next major bump and will appear in the changelog under (breaking) in the 3.0.0 section.
No value stability guarantee within 2.x. A grammar pin bump or a bug fix in a metric definition can shift any metric value on any file in any direction, even across a patch bump. Each such drift is flagged in the changelog. Pin to an exact version (big-code-analysis = "= 2.0.0") if you need bit-for-bit reproducibility across runs.
MSRV is 1.94. Bumping the MSRV is treated as a minor-bump event and is flagged in the changelog under (breaking) — see STABILITY.md § MSRV policy.
Escape hatches. The Node wrapper exposes tree_sitter::Node through .0, and the tree_sitter crate is re-exported as big_code_analysis::tree_sitter. Anything reached through those seams follows the pinned tree-sitter version, not our own SemVer. See STABILITY.md § Escape hatches before depending on them.

On the `3.0` horizon

The breaking changes once staged for 2.0 have shipped in 2.0.0: the #[non_exhaustive] markers on the open public enums, the serialized-key normalization, the integer-metric u64 shift, the language-dispatch and grammar defaults, the Python and REST surface changes, and a consolidated metric-value re-baseline folding in the drift accumulated since 1.0. The path-positional callback dispatch (action / the Callback trait), the free metrics / metrics_with_options / get_function_spaces / metrics_from_tree / get_ops functions, and the generic Parser<T> / ParserTrait plumbing were removed at the same time — analyze and Ast are now the single analysis seam, with Parser and the per-language parser/tag types demoted to pub(crate).

One loose end is deferred to the next major: the per-metric Stats structs are not yet #[non_exhaustive], so adding a field is a shape break in the strict SemVer sense. In practice field additions are treated as additive in minor bumps and flagged in the changelog; marking the structs #[non_exhaustive] is on the 3.0 roadmap so that carve-out can be retired.

No 3.0 is scheduled. The #[non_exhaustive] markers added at 2.0 keep most future additions (new enum variants, new fields) non-breaking, so 2.x is the surface you should depend on.

Python Bindings

big-code-analysis ships first-party Python bindings (PyO3 + maturin) that expose the same metric pipeline as the Rust library and the bca CLI — same JSON shape, same numeric formatting, same language coverage.

import big_code_analysis as bca

result = bca.analyze("src/main.rs")
if result is not None:
    print(result["metrics"]["cyclomatic"]["sum"])

The bindings are a peer of the Rust API: anywhere this book points at a Rust function (big_code_analysis::analyze, FuncSpace, the metric modules), Python has a one-to-one equivalent. Pick whichever language fits your pipeline — the metrics are identical.

When to reach for Python

You're already in a data-pipeline stack (pandas, Jupyter, Airflow, dbt, Polars) and want metric records as dict/DataFrame rows without shelling out to the CLI.
You're integrating with a Python-native security tool that consumes SARIF — see SARIF output.
You're building a code-quality dashboard whose backend is a Python web framework (FastAPI, Django).

If you only need a one-shot quality report from the command line, the bca CLI is the simpler tool — see Commands → Metrics.

If you're embedding the analysis into a long-running Rust program, the Rust library is the lower-overhead option.

Chapter contents

Installation — pip install, wheel matrix, building from source.
Quick start — analyse one file, print one metric.
Batch processing — analyze_batch, AnalysisFailure, parallelism with ThreadPoolExecutor.
Flat-record iteration — flatten_spaces feeding sqlite / pandas.
Metric selection — metrics= kwarg, bca.METRIC_NAMES, dependency-pull semantics.
AST traversal — Ast, Node, walk(), and find() over the held parse.
SARIF output — to_sarif + GitHub Code Scanning upload.
Change-history (VCS) metrics — vcs.rank, vcs.trend, vcs.commit, and vcs.score_diff over a git working tree.
Error handling — the full exception taxonomy and the never-raise batch contract.
Async patterns — asyncio.to_thread is the canonical recipe.

The headline example on each page is embedded verbatim from an importable file under big-code-analysis-py/examples/ and exercised end-to-end by big-code-analysis-py/tests/test_book_examples.py, so a renamed kwarg or a removed function on the primary path fails CI before it can rot the docs. Shorter illustrative snippets that surround the embedded example (logging recipes, regex parsing of the errno suffix, the asyncio anti-pattern, the pandas one-liner, …) are inline and intentionally not test-pinned — treat the embedded blocks as the canonical reference when the two disagree.

Installation

The bindings are distributed as a Python package with prebuilt wheels. The recommended install is via pip (or your preferred lockfile manager, such as uv, poetry, or pdm).

pip install big-code-analysis

Python >=3.12 is required. The compiled extension uses CPython's stable abi3 surface (abi3-py312), so one wheel covers 3.12, 3.13, and every future minor release without a per-version wheel build.

Wheel matrix

CI publishes wheels for the following targets today. If your platform is not listed, build from source.

Platform	Architectures
Linux (`manylinux_2_28`)	`x86_64`, `aarch64`

The wheel matrix is defined in .github/workflows/python-wheels.yml. manylinux_2_28 requires glibc >= 2.28 (RHEL 8 / Debian 10 / Ubuntu 18.10 and newer); older distributions (RHEL 7 / CentOS 7, glibc 2.17) need to build from source. macOS and Windows wheels are not yet shipped — pip install on those platforms falls back to a source build today.

Verifying the install

python -c "import big_code_analysis as bca; print(bca.__version__)"

The version printed equals [workspace.package].version from the Rust workspace's Cargo.toml: the bindings and the Rust library are versioned in lockstep.

Building from source

If no wheel matches your platform, or you want to bind against an unreleased Rust commit, build with maturin:

git clone https://github.com/dekobon/big-code-analysis.git
cd big-code-analysis/big-code-analysis-py
python -m venv .venv && source .venv/bin/activate
pip install --upgrade pip
pip install "maturin>=1.7,<2.0"
maturin develop --release   # editable install of big_code_analysis
python -c "import big_code_analysis as bca; print(bca.__version__)"

maturin develop builds the Rust extension in-place and installs it into the active venv so import big_code_analysis resolves locally — no separate pip install -e . step is required. The --release flag turns on the optimiser; omit it during development for faster rebuilds.

You will also need:

A stable Rust toolchain (MSRV: 1.94). Install via rustup.
A C compiler (used by the tree-sitter grammar crates).
CPython development headers (python3-dev on Debian / Ubuntu).

Walk through the quick-start to compute your first metric, or skip ahead to batch processing if you're wiring this into a pipeline over many files.

Quick start

This page walks through the minimum amount of code needed to compute metrics from a single source file.

1. Install the package

pip install big-code-analysis

See Installation for the wheel matrix and build-from-source instructions.

2. Analyse a file

bca.analyze(path) returns a dict matching the JSON bca metrics --format json emits for the same file — same field order, same numeric formatting, same shape.

"""Quick-start: analyse one file and print the headline cyclomatic count.

Mirrors the worked example shown on the book's
``python/quick-start.md`` page. The book embeds this file verbatim,
so the snippet is the test fixture — if the API drifts, the
``test_book_examples.py`` test fails and the docs are forced back
into sync.
"""

from __future__ import annotations

from pathlib import Path

import big_code_analysis as bca
from big_code_analysis import FuncSpaceDict


def run(path: Path) -> FuncSpaceDict:
    """Analyse ``path`` and return its metric dict."""
    result = bca.analyze(path)
    if result is None:
        msg = f"{path} was skipped (empty, binary, or generated)"
        raise SystemExit(msg)

    cyclomatic = result["metrics"]["cyclomatic"]
    print(f"{result['name']}: cyclomatic sum = {cyclomatic['sum']:.0f}")
    return result


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 2:
        sys.exit("usage: python quick_start.py <path>")
    run(Path(sys.argv[1]))

A few details worth noting:

analyze returns None for any file the CLI walker would skip: one that is three bytes or fewer (treated as empty), one whose leading window is not valid UTF-8 (treated as binary), or — with the default skip_generated=True — one matching the walker's is_generated predicate (a leading @generated, DO NOT EDIT, or GENERATED CODE marker). Always handle the optional return before reaching into result["metrics"].
The returned object is a plain dict at runtime — safe to serialise with json.dumps, ship to a downstream service, or feed into flatten_spaces for tabular consumers. Type checkers see it as the FuncSpaceDict TypedDict (generated from the Rust wire shapes), so nested metric access checks statically under mypy/pyright without casts.
Language detection mirrors the CLI exactly: path extension first, then shebang / emacs-mode fallback. Call bca.analyze_source(code, language) instead if you have the source in memory.

3. Analyse an in-memory snippet

import big_code_analysis as bca

metrics = bca.analyze_source("fn main() {}\n", "rust")
print(metrics["metrics"]["loc"]["sloc"])

analyze_source accepts str, bytes, or bytearray. The returned dict has the same shape as analyze's output, with name set to None (no path is associated with an in-memory buffer).

Where to go next

Batch processing — analyze_batch for many files without per-file try/except clutter.
Metric selection — compute only the metrics you need.
Error handling — the full exception taxonomy.
The CLI's Metrics command is the equivalent shell-level workflow.

Batch processing

bca.analyze_batch(paths) runs the same analysis as bca.analyze over every path in an iterable and never raises on per-file errors: each result element is either an analysis dict or a bca.AnalysisFailure describing the failure. Results preserve input order, so zip(inputs, results) lines up by index when no path is skipped. analyze_batch shares analyze's keyword-only options (exclude_tests, allow_lossy_path, skip_generated with its default of True, and metrics), so switching between the two entry points does not change behaviour.

from collections.abc import Iterable
from pathlib import Path

import big_code_analysis as bca


def run(paths: Iterable[Path]) -> dict[str, int]:
    """Analyse ``paths`` as a batch and bucket successes vs failures.

    Returns a small summary dict (`ok`, `errors`, `total`) so the
    accompanying test can assert on it without re-parsing.
    """
    materialised = list(paths)
    # `skip_generated=False` guarantees one result element per input
    # (generated files are analysed, not dropped), so the `strict=True`
    # zip against `materialised` cannot raise `ValueError`. Under the
    # 2.0 default (`skip_generated=True`) a generated input yields no
    # slot, the lengths diverge, and the strict zip blows up — the same
    # bug #660 fixed in `pipeline_db.py`.
    results = bca.analyze_batch(materialised, skip_generated=False)

    ok = 0
    errors = 0
    for path, result in zip(materialised, results, strict=True):
        if isinstance(result, bca.AnalysisFailure):
            errors += 1
            print(f"  skip {path}: ({result.error_kind}) {result.error}")
        else:
            ok += 1
            sloc = result["metrics"]["loc"]["sloc"]
            print(f"  ok   {path}: sloc = {sloc:.0f}")

    return {"ok": ok, "errors": errors, "total": len(materialised)}

A few key contracts:

AnalysisFailure is returned, not raised. It is not an Exception subclass — isinstance(slot, bca.AnalysisFailure) is the discriminator.
paths is consumed lazily, so generators work — but if you want to keep the input around for zip, materialise it into a list first.
With the default skip_generated=True, a generated file is skipped and produces no element, so the result list can be shorter than the input — exactly matching single-file analyze, which returns None for a generated file. Pass skip_generated=False to guarantee one element per input (the pre-2.0 default). This default flipped at 2.0 so that switching between analyze and analyze_batch no longer silently changes generated-file handling.
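Because success slots are plain dicts and failures are AnalysisFailure values, a batch can also be bucketed without zipping against the inputs, which sidesteps the length mismatch described above. The following is a minimal sketch, not part of the bindings: partition is a hypothetical helper, and FakeFailure is a stand-in for bca.AnalysisFailure (only the error_kind and error attributes used on this page are assumed) so the snippet runs without the extension installed. In real code, prefer isinstance(slot, bca.AnalysisFailure) as the discriminator.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass
from typing import Any, Iterable


@dataclass(frozen=True)
class FakeFailure:
    """Hypothetical stand-in for bca.AnalysisFailure (illustration only)."""

    error_kind: str
    error: str


def partition(results: Iterable[Any]) -> tuple[list[dict], Counter]:
    """Split batch slots into analysis dicts and a per-kind failure tally.

    Success slots are plain dicts at runtime; anything else is treated
    as a failure value exposing ``error_kind``.
    """
    ok: list[dict] = []
    failures: Counter = Counter()
    for slot in results:
        if isinstance(slot, dict):
            ok.append(slot)
        else:
            failures[slot.error_kind] += 1
    return ok, failures


# Hand-written sample slots; real ones come from bca.analyze_batch(...).
sample = [
    {"name": "a.py", "metrics": {"loc": {"sloc": 10.0}}},
    FakeFailure("IoError", "No such file or directory (os error 2)"),
    FakeFailure("UnsupportedLanguage", "unknown extension"),
]
ok, failures = partition(sample)
print(len(ok), dict(failures))
```

This shape works with either skip_generated setting, since it never relies on the result list having the same length as the input.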

Walking a directory: `analyze_paths`

analyze_batch analyses an explicit list of paths verbatim. When you instead want to find the source files first — "analyze my repo" — reach for analyze_paths (#658), which reuses the CLI's gitignore-aware walker:

import big_code_analysis as bca

results = bca.analyze_paths("path/to/repo", include="*.py")

Each positional seed may be a file or a directory; directories are walked honouring .gitignore, the include / exclude globs (a single glob string or a sequence; a leading ./ is optional, so dir/** ≡ ./dir/**), and the generated-file filter. A seed naming a file directly is always analysed regardless of exclude — an explicit request overrides ignore-style rules — while include still narrows it by basename. respect_gitignore=False opts into walking ignored files. The result is the same list[FuncSpaceDict | AnalysisFailure] shape and never-raise contract as analyze_batch, and it forwards the same exclude_tests / allow_lossy_path / skip_generated / metrics / vcs / vcs_per_function kwargs.

Attaching change-history metrics

analyze_batch and analyze_paths accept the same vcs=True / vcs_per_function=True kwargs as single-file analyze (#670). The batch builds one history index / blame engine per containing repository and reuses it across that repo's files — amortising the walk that a comprehension over analyze(p, vcs=True) would repeat per file. A VCS failure on one file leaves its AST metrics intact (it never becomes an AnalysisFailure); a file outside any repository simply gets no vcs block. For ranking a whole repository (rather than per-file attachment), use the dedicated big_code_analysis.vcs surface instead.

Parallel execution

There is no built-in concurrency inside analyze_batch — it is a sequential sweep. For parallelism, fan the per-file analyze call out across a thread pool:

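from collections.abc import Iterable
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path

import big_code_analysis as bca
from big_code_analysis import FuncSpaceDict


def run_parallel(paths: Iterable[Path], *, workers: int = 4) -> list[FuncSpaceDict | None]: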
    """Fan ``analyze`` out across a thread pool.

    PyO3 releases the GIL across each file's read + parse, so a
    thread pool actually parallelises the heavy work. Use this when
    you need per-file exceptions instead of ``AnalysisFailure`` slots.
    """

    def _analyze(p: Path) -> FuncSpaceDict | None:
        return bca.analyze(p)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(_analyze, paths))

PyO3's Python::detach releases the GIL across each file's read + tree-sitter parse, so the threads do not serialise on the interpreter lock — this is real parallelism, not contended co-operation.
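One caveat of pool.map is that it re-raises the first worker exception while the results are being iterated, so later results are lost. If you want every file's outcome, a variant built on submit can capture the exception per file. This is a sketch under stated assumptions: fan_out and _demo_worker are hypothetical names, and the worker is passed in as a parameter (in practice it would be bca.analyze) so the snippet runs on its own.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any, Callable, Iterable


def fan_out(
    worker: Callable[[Path], Any],
    paths: Iterable[Path],
    *,
    workers: int = 4,
) -> list[tuple[Path, Any, BaseException | None]]:
    """Run ``worker`` over ``paths`` in a thread pool, in input order.

    Each element is ``(path, result, error)``. Exceptions are captured
    per file instead of aborting the whole sweep.
    """
    materialised = list(paths)
    out: list[tuple[Path, Any, BaseException | None]] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(worker, p) for p in materialised]
        for path, future in zip(materialised, futures):
            error = future.exception()  # blocks until this task finishes
            result = None if error is not None else future.result()
            out.append((path, result, error))
    return out


def _demo_worker(path: Path) -> int:
    """Hypothetical worker standing in for ``bca.analyze``."""
    if path.suffix != ".py":
        raise ValueError(f"unsupported: {path}")
    return len(path.name)


rows = fan_out(_demo_worker, [Path("a.py"), Path("b.txt")])
for path, result, error in rows:
    print(path, result, type(error).__name__ if error else "ok")
```

With bca.analyze as the worker, the captured error would be the typed exception (for example an OSError subclass) rather than an AnalysisFailure value.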

`AnalysisFailure` taxonomy

error_kind is a closed Literal:

`error_kind`	Triggered by
`"UnsupportedLanguage"`	Unknown extension + no shebang / emacs-mode hit
`"ParseError"`	tree-sitter rejected the source, or a rare internal serialisation failure (`internal: serialization error: …`)
`"IoError"`	`std::fs::read` failed or the path was not valid UTF-8

AnalysisFailure is frozen and implements __eq__ / __hash__ / __repr__ over all three fields, so callers can put errors in a set to deduplicate failures across runs. For retry classification, the errno is preserved in the error string via Rust's default formatting:

import re

match = re.search(r"\(os error (\d+)\)$", slot.error)
errno = int(match.group(1)) if match else None

If you need typed dispatch (FileNotFoundError, PermissionError, …) call bca.analyze(path) per-file instead of analyze_batch — single-file analyze raises the canonical OSError subclass. See Error handling.
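The errno extraction shown above can be wrapped in a small helper for retry classification. This is a sketch: errno_of and is_transient are hypothetical helpers, the set of "transient" codes is an illustrative choice rather than anything the bindings define, and it assumes a POSIX platform where the number in "(os error N)" is an errno (on Windows the number is a Win32 error code instead).

```python
from __future__ import annotations

import errno
import re

_OS_ERROR = re.compile(r"\(os error (\d+)\)$")

# Illustrative choice of codes worth retrying; tune for your workload.
_TRANSIENT = {errno.EAGAIN, errno.EINTR, errno.EBUSY, errno.EMFILE, errno.ENFILE}


def errno_of(error: str) -> int | None:
    """Extract the trailing ``(os error N)`` number, or None if absent."""
    match = _OS_ERROR.search(error)
    return int(match.group(1)) if match else None


def is_transient(error: str) -> bool:
    """True when the error string carries one of the retry-worthy codes."""
    return errno_of(error) in _TRANSIENT


print(errno_of("No such file or directory (os error 2)"))
print(is_transient(f"Too many open files (os error {errno.EMFILE})"))
```

A caller would pass slot.error for each "IoError" failure and re-queue only the paths for which is_transient returns True.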

Flat-record iteration

bca.flatten_spaces(result) walks the nested FuncSpace tree in pre-order and yields one flat, scalar-only dict per node — ready for sqlite3.executemany, pandas.DataFrame.from_records, or any other tabular consumer.

Metric keys use the same dotted convention as the CLI's CSV writer (cyclomatic.modified.sum, halstead.volume, loc.lloc_average, …). Identity keys (path, name, kind, start_line, end_line, parent_name, depth) are added on every record.

SQLite via `executemany`

The example below analyses one file and inserts one row per FuncSpace into a sqlite table whose columns are the union of all flattened keys.

"""Flatten a FuncSpace tree into scalar rows for sqlite / pandas.

Demonstrates ``bca.flatten_spaces`` + ``sqlite3.executemany``. The
pandas equivalent is shown in the book as a non-executed snippet so
this example stays dependency-free (sqlite ships with the stdlib).

Tied to the book's ``python/flat-records.md`` page.
"""

from __future__ import annotations

import sqlite3
from contextlib import closing
from pathlib import Path

import big_code_analysis as bca


def run(path: Path, db_path: Path) -> int:
    """Analyse ``path`` and insert one row per FuncSpace into ``db_path``.

    Returns the number of rows inserted so the test can assert on it.
    """
    result = bca.analyze(path)
    if result is None:
        msg = f"{path} was skipped (empty, binary, or generated)"
        raise SystemExit(msg)

    # The flattened keys are dotted, lowercase names
    # (`halstead.unique_operators`, `halstead.total_operators`, …) that
    # are unique under SQLite's case-insensitive column comparison (the
    # old `N1`/`n1` Halstead collision was removed in #511), so each
    # lands on its own column without renaming.
    records = [dict(r) for r in bca.flatten_spaces(result)]
    if not records:
        return 0

    columns = sorted({k for r in records for k in r})
    cols_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    rows = [tuple(r.get(c) for c in columns) for r in records]

    # `closing(sqlite3.connect(...))` is the documented idiom — the
    # bare ``with sqlite3.connect(...)`` context manager only commits
    # / rolls back the transaction; it does NOT close the connection,
    # so a long-running consumer leaks file descriptors (and on
    # Windows holds an exclusive write lock on the db file).
    with closing(sqlite3.connect(db_path)) as db, db:
        db.execute(f"CREATE TABLE IF NOT EXISTS metrics ({cols_sql})")
        db.executemany(
            f"INSERT INTO metrics ({cols_sql}) VALUES ({placeholders})",
            rows,
        )

    return len(rows)


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 3:
        sys.exit("usage: python flat_records.py <source-file> <out.db>")
    inserted = run(Path(sys.argv[1]), Path(sys.argv[2]))
    print(f"inserted {inserted} rows into {sys.argv[2]}")

The iterator is lazy and single-use: it walks the input once without materialising the whole list. A second iteration of the same iterator yields nothing — call list() once if you need to re-iterate.

Pandas

flatten_spaces is the natural input to pandas.DataFrame.from_records. Pandas is not a dependency of the bindings; install it separately if you want the DataFrame view.

import big_code_analysis as bca
import pandas as pd

result = bca.analyze("src/lib.rs")
if result is not None:
    df = pd.DataFrame.from_records(bca.flatten_spaces(result))
    print(df.head())
    # Group by space kind to inspect the average cyclomatic per
    # function vs. per class vs. per file.
    by_kind = df.groupby("kind")["cyclomatic.sum"].mean()

Identity columns vs CLI CSV

The flat-record schema is mostly aligned with the CLI's CSV writer, with a couple of intentional deltas:

Identity columns use name / kind here; the CSV writer uses space_name / space_kind. Flat records also add parent_name / depth; the CSV writer omits those.
tokens.* flattens to the JSON shape (tokens.tokens, tokens.average, tokens.min, tokens.max). Only the sum leaf differs from CSV, which spells it tokens.sum; the average / min / max leaves now match (#590). Rename the sum leaf in the consumer if you need exact CSV alignment.

Anonymous spaces (Rust closures, JavaScript function expressions / arrows) keep their name == "<anonymous>" marker verbatim — flatten_spaces does not normalise.

Caveats

parent_name alone cannot disambiguate same-named siblings nested under different parents (e.g. two Inner classes under two different outer classes both surface as parent_name == "Inner" for their own children). Pair with depth and source-order position, or rebuild the qualified name in your consumer, if you need a fully-qualified path.
Do not mutate the input result while iterating: the walker keeps references into it, so mutations to not-yet-yielded subtrees will be observed in later records.
Missing metric subtrees produce no keys (absent, not None), matching the "Halstead disabled" edge case for metric selection.
flatten_spaces raises TypeError if the input is not a mapping; callers must filter None returns from bca.analyze (e.g. generated files with skip_generated=True) before passing.
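The qualified-name rebuild mentioned in the first caveat can be sketched with a stack keyed on depth, because the records arrive in pre-order. This is an illustration only: with_qualified_names is a hypothetical helper, the sample records are hand-written (the kind values are made up), and it assumes depth increases by one per nesting level.

```python
from __future__ import annotations

from typing import Any, Iterable, Iterator


def with_qualified_names(
    records: Iterable[dict[str, Any]], sep: str = "::"
) -> Iterator[dict[str, Any]]:
    """Yield copies of pre-order records with an added ``qualified_name``.

    The ancestors of a record are the most recent records seen at each
    shallower depth, so a stack truncated to the current depth holds them.
    """
    stack: list[str] = []
    base: int | None = None  # depth of the first record, whatever it is
    for record in records:
        if base is None:
            base = record["depth"]
        level = max(record["depth"] - base, 0)
        del stack[level:]  # close every space at this depth or deeper
        stack.append(str(record["name"]))
        yield {**record, "qualified_name": sep.join(stack)}


# Hand-written sample in the flat-record shape (identity keys only).
sample = [
    {"name": "mod.py", "kind": "unit", "depth": 0, "parent_name": None},
    {"name": "A", "kind": "class", "depth": 1, "parent_name": "mod.py"},
    {"name": "Inner", "kind": "class", "depth": 2, "parent_name": "A"},
    {"name": "run", "kind": "function", "depth": 3, "parent_name": "Inner"},
    {"name": "B", "kind": "class", "depth": 1, "parent_name": "mod.py"},
    {"name": "Inner", "kind": "class", "depth": 2, "parent_name": "B"},
    {"name": "run", "kind": "function", "depth": 3, "parent_name": "Inner"},
]
qualified = [r["qualified_name"] for r in with_qualified_names(sample)]
print(qualified)
```

The two run records share parent_name == "Inner" but receive distinct qualified names, which is the disambiguation parent_name alone cannot provide.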

Metric selection

Pass metrics=[…] to compute only a subset of the metric suite. metrics=None (the default) preserves the "compute everything" behaviour. Unrequested metrics are absent from the result dict (not present with None placeholders).

from pathlib import Path

import big_code_analysis as bca
from big_code_analysis import FuncSpaceDict


def run(path: Path) -> FuncSpaceDict:
    """Compute only LoC + cyclomatic for ``path`` and return the result.

    ``bca.METRIC_NAMES`` is a ``tuple[MetricName, ...]`` of canonical
    names accepted by ``metrics=``; its ``StrEnum`` members are
    ``str``-comparable, so ``"halstead" in bca.METRIC_NAMES`` works. It is
    an ABI smoke check that the catalog is populated, not a test of the
    selection.
    """
    if "halstead" not in bca.METRIC_NAMES:
        msg = "halstead is missing from METRIC_NAMES — bindings ABI drift"
        raise RuntimeError(msg)
    selected = bca.analyze(path, metrics=["loc", "cyclomatic"])
    if selected is None:
        msg = f"{path} was skipped (empty, binary, or generated)"
        raise SystemExit(msg)

    metric_keys = sorted(selected["metrics"])
    print(f"computed only: {metric_keys}")
    return selected


def run_derived(path: Path) -> FuncSpaceDict:
    """Selecting ``mi`` auto-pulls in its three dependencies."""
    selected = bca.analyze(path, metrics=["mi"])
    if selected is None:
        msg = f"{path} was skipped (empty, binary, or generated)"
        raise SystemExit(msg)

    pulled = sorted(selected["metrics"])
    print(f"mi pulled in: {pulled}")
    return selected

The same kwarg is honoured by bca.analyze_source and bca.analyze_batch — the latter applies the selection uniformly to every file in the batch. Validation runs before any file I/O: an empty list or unknown name raises ValueError immediately and never returns an AnalysisFailure slot for what is really a caller bug.

Canonical names

The full set is available as a tuple of MetricName members. Each member is a StrEnum, so it is a str — "halstead" in bca.METRIC_NAMES works, and bca.MetricName.HALSTEAD == "halstead" is True. Pass either a plain string or a member to metrics=:

import big_code_analysis as bca
from big_code_analysis import MetricName

assert "halstead" in bca.METRIC_NAMES
assert bca.MetricName.HALSTEAD == "halstead"

# Either spelling works in `metrics=`:
selection = [MetricName.CYCLOMATIC, "cognitive"]

The members are generated from the same Metric table the CLI and JSON output use, so the values never drift from the slugs you see in bca metrics --format json.

Names are case-sensitive lowercase; passing an unknown name raises ValueError with the canonical list in the message. The canonical spelling for the exit-point metric is "nexits" everywhere (the enum Display, METRIC_NAMES, and the JSON output key). The legacy "exit" alias was retired at 2.0 and now raises ValueError like any other unknown name. Duplicates are silently collapsed.

Metric	JSON key	Dependencies pulled in
LoC	`loc`	—
Cyclomatic	`cyclomatic`	—
Cognitive	`cognitive`	—
Halstead	`halstead`	—
ABC	`abc`	—
`nargs`	`nargs`	—
`nom`	`nom`	—
`npa`	`npa`	—
`npm`	`npm`	—
`nexits`	`nexits`	—
`tokens`	`tokens`	—
Maintainability Index	`mi`	`loc`, `cyclomatic`, `halstead`
Weighted Methods per Class	`wmc`	`cyclomatic`, `nom`
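To predict which keys a selection will produce, the dependency column of the table can be applied by hand. The sketch below is a client-side illustration, not the library's implementation: expected_keys is a hypothetical helper, and it assumes pulled-in dependencies appear in result["metrics"] alongside the requested metric, as the run_derived example suggests.

```python
from __future__ import annotations

from typing import Iterable

# Dependencies as listed in the table; every other metric pulls in nothing.
DEPENDENCIES: dict[str, tuple[str, ...]] = {
    "mi": ("loc", "cyclomatic", "halstead"),
    "wmc": ("cyclomatic", "nom"),
}


def expected_keys(selection: Iterable[str]) -> list[str]:
    """Return the sorted metric keys a ``metrics=`` selection should yield."""
    keys: set[str] = set()
    for name in selection:
        keys.add(str(name))  # duplicates collapse, as in the bindings
        keys.update(DEPENDENCIES.get(str(name), ()))
    return sorted(keys)


print(expected_keys(["mi"]))
print(expected_keys(["wmc", "loc", "loc"]))
```

Comparing expected_keys(selection) with sorted(result["metrics"]) is one way to assert in a test that a selection produced what you intended.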

Performance trade-off

Computing the full suite is the default because it is what the CLI does. Selecting a subset is strictly faster, because the compute pass for every unselected metric is skipped, but the tree-sitter parse and the AST walk are the dominant cost on most inputs, so the saving on a single file is small. The benefit scales with batch size: when analyze_batch runs across a large repository, dropping the most expensive metric you do not need (often Halstead, on deep call trees) is a measurable win.

Unrequested metrics are absent from the result. Code that unconditionally indexes into result["metrics"]["mi"] will raise KeyError if you opted out of mi; guard with if "mi" in result["metrics"] or use .get("mi").

AST traversal

bca.analyze(...) gives you metrics. When you need the syntax tree itself — to find every function definition, pull a docstring, or port a py-tree-sitter matcher — parse once into an Ast and walk it with lazy Node handles.

The `Ast` handle

bca.Ast.parse(code, language) (or bca.Ast.from_path(path)) parses the source once and hands back a handle you can draw both metrics and the tree from, instead of parsing twice — once in py-tree-sitter, once in analyze():

import big_code_analysis as bca

ast = bca.Ast.parse("fn main() { let x = 1 + 2; }", "rust")
ast.metrics()      # same dict as analyze_source(...)
ast.root_node      # the syntax tree, walked lazily (below)

The handle is immutable and thread-safe, so it composes with ThreadPoolExecutor fan-out exactly like analyze.

The `Node` handle

ast.root_node is the tree's root as a lazy Node. Unlike ast.dump(), which materialises one dict per node, a Node is a cursor into the retained tree: it costs nothing until you read from it, and a selective extractor pays only for the nodes it visits.

root = ast.root_node
root.kind                       # "source_file"
root.type                       # "source_file" (py-tree-sitter alias for kind)
root.children                   # list[Node], direct children
root.child_by_field_name("…")   # a field child, or None
root.text                       # the node's source bytes

Traversal mirrors py-tree-sitter: children / named_children, parent, next_sibling / prev_sibling (and the *_named_* variants), child(i) / named_child(i), child_by_field_name(name) / children_by_field_name(name), and the field_name the parent reaches a node through.

Walking the whole subtree

walk() is a lazy pre-order iterator over a node and its descendants; descendants_by_kind(kinds) collects the matches in one pass; and ast.find(filters) searches the whole tree, accepting the same vocabulary as bca count (function, call, comment, string, an exact kind, …):

# Every function name in the file, the lazy way.
for fn in ast.find(["function_item"]):
    name = fn.child_by_field_name("name")
    if name is not None:  # child_by_field_name may return None
        print(name.text.decode())

# Or filter a subtree by raw grammar kind.
idents = root.descendants_by_kind(["identifier"])

These have Rust counterparts — Node::preorder and Node::descendants_by_kind — so library callers get the same helpers.

Coordinates

A node reports its one location in every vocabulary, so nothing has to be converted by hand:

Accessor	Meaning
`start_byte` / `end_byte`	byte offsets into `ast.source`
`start_point` / `end_point`	0-based `(row, col)` (py-tree-sitter parity)
`start_line` / `end_line`	1-based lines
`span`	the 1-based `{start_line, start_col, …, start_byte, end_byte}` dict `dump()` emits

So node.start_line == node.start_point[0] + 1, and ast.source[node.start_byte:node.end_byte] == node.text.
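The relationship between the three vocabularies can be reproduced in plain Python, which is sometimes handy when checking a matcher's output by hand. This is a sketch, not the bindings' code: point_of and line_of are hypothetical helpers, and the column is assumed to count bytes from the start of the line, as in py-tree-sitter.

```python
from __future__ import annotations


def point_of(source: bytes, byte_offset: int) -> tuple[int, int]:
    """Return the 0-based ``(row, col)`` of ``byte_offset`` in ``source``.

    ``col`` counts bytes from the start of the line.
    """
    row = source.count(b"\n", 0, byte_offset)
    line_start = source.rfind(b"\n", 0, byte_offset) + 1  # 0 when no newline precedes
    return row, byte_offset - line_start


def line_of(source: bytes, byte_offset: int) -> int:
    """Return the 1-based line number of ``byte_offset``."""
    return point_of(source, byte_offset)[0] + 1


source = b"fn main() {\n    let x = 1;\n}\n"
start = source.index(b"let")
print(point_of(source, start), line_of(source, start))
```

Here the let keyword sits on the second line, so the 0-based row is one less than the 1-based line, matching start_line == start_point[0] + 1.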

node.type is a py-tree-sitter-compatible alias for node.kind, so a matcher written against py-tree-sitter's node.type ports over unchanged; kind stays the canonical bca spelling.

Lazy nodes vs. `dump()`

ast.dump() returns the tree as nested dicts; ast.root_node returns lazy handles. They differ in two ways that matter:

Memory. dump() builds one dict (with span, value, children) per node, which is fine for small files but costly for large ones. A Node walk allocates only the handles you touch.
Taxonomy. A Node's kind is the raw grammar kind. dump() kinds pass through bca's Alterator and are curated — for example, string-literal nodes are renamed to "string" and flattened (their grammar children removed). So the two surfaces intentionally disagree on altered nodes:
```
ast = bca.Ast.parse('fn f() { let s = "hi"; }', "rust")
# Raw tree: the string keeps its quote/content children.
raw = next(n for n in ast.root_node.walk() if "string" in n.kind)
assert raw.children
```
Use lazy nodes when you want exactly what the grammar produced (the right choice for porting a py-tree-sitter matcher); use dump() when you want bca's curated, JSON-serialisable view.

Lifetime and threading

A Node keeps its Ast alive: it stays valid even after you drop every other reference to the parse, so returning a node (or a list of nodes) from a function that builds the Ast locally is safe. Nodes are also safe to share across threads.

C/C++ preprocessor. For Cpp parsed with preprocessor inputs, ast.source — and therefore every node's byte offsets — indexes into the expanded source the parser saw, not the on-disk file.

Where to go next

Metric selection — compute only the metrics you need from the same parse.
The CLI's dump and count commands are the shell-level equivalents of dump() and find().

SARIF output

bca.to_sarif(result, *, thresholds=None) renders an analysis result (or an iterable of them) into a SARIF 2.1.0 JSON document, ready for upload to GitHub Code Scanning or any other SARIF consumer. The output is produced by the same Rust writer that backs bca check --report-format sarif, so the schema URL, tool driver name / version, and rule descriptions match the CLI byte-for-byte.

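from collections.abc import Iterable, Mapping
from pathlib import Path

import big_code_analysis as bca


def run(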
    paths: Iterable[Path],
    sarif_path: Path,
    thresholds: Mapping[str, float],
) -> str:
    """Analyse ``paths`` and write a SARIF document to ``sarif_path``.

    Returns the rendered SARIF JSON so the caller (or the test) can
    inspect it without re-reading the file.
    """
    batch = bca.analyze_batch(paths)
    sarif = bca.to_sarif(batch, thresholds=dict(thresholds))

    sarif_path.parent.mkdir(parents=True, exist_ok=True)
    sarif_path.write_text(sarif, encoding="utf-8")
    print(f"wrote {sarif_path} ({len(sarif.encode('utf-8'))} bytes)")
    return sarif

to_sarif accepts:

A single dict returned by bca.analyze or bca.analyze_source.
Any iterable yielding such dicts and / or bca.AnalysisFailure instances (the natural shape of bca.analyze_batch's return value). AnalysisFailure entries are skipped silently — they represent files that could not be analysed, not findings.

Thresholds

Accepted threshold names mirror the CLI's EXTRACTORS table in big-code-analysis-cli/src/thresholds.rs:

cognitive, cyclomatic, cyclomatic.modified
halstead.volume, halstead.difficulty, halstead.effort, halstead.time, halstead.bugs
loc.sloc, loc.ploc, loc.lloc, loc.cloc, loc.blank
nom, tokens, nexits, nargs
mi.original, mi.sei, mi.visual_studio
abc, wmc, npm, npa

An unknown name raises ValueError listing the accepted set, so a typo fails fast instead of silently producing an empty SARIF run.

thresholds=None (the default) and thresholds={} both produce a well-formed SARIF document with empty results and rules arrays. This matches the CLI's posture: there are no built-in default thresholds; every check run supplies its own limits.
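Since the output is ordinary SARIF 2.1.0 JSON, a consumer can inspect it with the standard library, for instance to count findings per rule before uploading. The sketch below relies only on the standard SARIF layout (runs, results, ruleId); findings_per_rule is a hypothetical helper, and the sample document, including its ruleId values, is hand-written for illustration rather than real to_sarif output.

```python
from __future__ import annotations

import json
from collections import Counter


def findings_per_rule(sarif_text: str) -> Counter:
    """Count SARIF results by ``ruleId`` across every run in the document."""
    document = json.loads(sarif_text)
    counts: Counter = Counter()
    for run in document.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts


# Hand-written minimal document; real output carries more fields
# (schema URL, rule descriptions, locations, ...).
sample = json.dumps({
    "version": "2.1.0",
    "runs": [{
        "tool": {"driver": {"name": "example-tool", "rules": []}},
        "results": [
            {"ruleId": "cyclomatic", "message": {"text": "over limit"}},
            {"ruleId": "cyclomatic", "message": {"text": "over limit"}},
            {"ruleId": "loc.sloc", "message": {"text": "over limit"}},
        ],
    }],
})
counts = findings_per_rule(sample)
print(dict(counts))
```

A document rendered with no thresholds would yield an empty Counter here, which makes the "no limits supplied" case easy to detect in CI.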

Upload to GitHub Code Scanning

# .github/workflows/code-scanning.yml (excerpt)
- name: Compute metric SARIF
  run: |
    python - <<'PY'
    import big_code_analysis as bca
    with open("paths.txt", encoding="utf-8") as paths_fh:
        results = bca.analyze_batch(paths_fh.read().splitlines())
    with open("metrics.sarif", "w", encoding="utf-8") as fh:
        fh.write(bca.to_sarif(results, thresholds={"cyclomatic": 15}))
    PY
- name: Upload to Code Scanning
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: metrics.sarif

The upload action is documented under github/codeql-action/upload-sarif. The bindings produce one SARIF run per call; the action handles the upload to the repository's Code Scanning alerts.

What "Unit" findings mean

to_sarif emits a finding at every space — the file unit, each container, and each leaf function or closure — whose own value breaches its limit, exactly matching bca check --report-format sarif. For most metrics the JSON headline at a space already is that space's own value. The four subtree-aggregate metrics — cyclomatic, cyclomatic.modified, cognitive, and abc — additionally expose a sum / magnitude rolled up across child spaces; the binding reads their per-space value field instead, so it reports an interior breach (for example a function whose own complexity breaches even though a nested closure's does not) without being fooled by the larger aggregate. Before the value field existed the binding could read only the aggregate and so emitted these four only at leaf spaces, missing genuine interior breaches the CLI reports (#958).

Unit findings carry logicalLocations: [{"fullyQualifiedName": "<file>"}]. Nameless non-unit spaces (rare parse-failure case) carry "<unnamed>" — both matching the CLI's function_token placeholders.

Change-history (VCS) metrics

The big_code_analysis.vcs submodule ranks files and scores commits by change-history risk — signals derived from version-control history rather than the source AST. It is the Python analogue of the bca vcs CLI command, and the same Rust engine backs both, so the returned dicts match the CLI's structured output field-for-field.

from big_code_analysis import vcs

report = vcs.rank("path/to/repo", top=20)
trend = vcs.trend("path/to/repo", points=6)
commit = vcs.commit("path/to/repo", commit="HEAD")
diff = vcs.score_diff(unified_diff_text)

The four entry points mirror the bca vcs subcommands: vcs.rank ranks files (bca vcs), vcs.trend samples that ranking over time (bca vcs trend), vcs.commit scores one commit (bca vcs commit), and vcs.score_diff scores a bare unified diff (bca vcs commit --diff). For background on the signals, the composite risk score, and the underlying defect-prediction literature, read the CLI chapter; this page covers the Python surface.

This is distinct from analyze(..., vcs=True), which attaches a vcs block to a single file's metrics. The vcs submodule walks history once for a whole repository, so prefer it for ranking; reach for the analyze kwarg only when you want change-history numbers alongside a file's AST metrics. See Batch processing for the analyze_batch(..., vcs=True) path that amortises the walk across a repository's files.

Ranking files

vcs.rank(repo_path, *, options=None, top=None, no_cache=False, cache_dir=None) ranks every in-scope file by descending risk and returns a VcsReportDict. The keyword-only knobs that vary per call live on rank; the history-walk knobs shared with trend and commit live on a shared Options object (covered below).

from big_code_analysis import vcs

report = vcs.rank("path/to/repo", top=20)

print(f"long window:  {report['long_window_days']} days")
print(f"recent window: {report['recent_window_days']} days")

for ranked in report["files"]:
    block = ranked["vcs"]
    print(f"{block['risk_score']:6.2f}  {ranked['path']}")

top caps how many files the ranking keeps; 0 or None keeps all. The report carries the resolved window lengths and the risk_score_version / vcs_schema_version stamps once at the top level (not on each file's vcs block). The files list is ordered by descending vcs.risk_score, and a vcs_aggregate key holds the repository bus-factor summary when it was computed:

aggregate = report.get("vcs_aggregate")
if aggregate is not None:
    bus = aggregate["bus_factor"]
    print(f"repo bus factor: {bus['repo']['bus_factor']}")

Scoring a commit

vcs.commit(repo_path, *, commit="HEAD", options=None) scores a single commit for just-in-time (commit-level) risk against its first parent, returning a JitCommitReportDict. The commit argument is any git revision spelling ("HEAD", "HEAD~3", a branch, a tag, a SHA).

from big_code_analysis import vcs

report = vcs.commit("path/to/repo", commit="HEAD")

print(f"risk score: {report['risk_score']}")
print(f"is merge:   {report['commit']['is_merge']}")

size = report["features"]["size"]
print(f"+{size['lines_added']} -{size['lines_deleted']} "
      f"across {size['files_touched']} files")

# Each feature group's signed push on the ordinal score.
for group, value in report["contributions"].items():
    print(f"  {group:<11} {value}")

The score is ordinal: rank commits by it, or compare a commit against the repository's own distribution, but do not read the magnitude as a probability. The contributions block reports each feature group's signed contribution so a consumer can see why a commit ranked where it did.
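To surface why a commit ranked where it did, the contributions block can be sorted by absolute magnitude. This is a minimal sketch: top_drivers is a hypothetical helper, the sample values are invented, and it assumes each contribution is a plain number where a positive value pushes the score up.

```python
from __future__ import annotations

from typing import Mapping


def top_drivers(
    contributions: Mapping[str, float], limit: int = 3
) -> list[tuple[str, float]]:
    """Return the ``limit`` feature groups with the largest absolute push."""
    ranked = sorted(contributions.items(), key=lambda item: abs(item[1]), reverse=True)
    return ranked[:limit]


# Invented numbers in the shape of report["contributions"].
sample = {
    "size": 0.8,
    "diffusion": 0.3,
    "history": -0.5,
    "experience": -0.1,
    "purpose": 0.0,
}
for group, value in top_drivers(sample, limit=2):
    direction = "raises" if value > 0 else "lowers"
    print(f"{group:<11} {direction} the score by {abs(value):.2f}")
```

Sorting by absolute value keeps strongly negative groups visible, which a plain descending sort would bury at the bottom.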

Scoring an arbitrary diff

vcs.score_diff(diff) scores a git-style unified diff that has not been committed yet — the shape a pre-commit hook or code-review bot works with. It returns a JitDiffReportDict.

import subprocess
from big_code_analysis import vcs

staged = subprocess.run(
    ["git", "diff", "--cached"],
    capture_output=True, text=True, check=True,
).stdout

report = vcs.score_diff(staged)
print(f"partial risk: {report['partial_risk_score']}")

A bare diff carries no author, parent, or file history, so source is the literal "diff", only the size and diffusion groups are computable, and partial_risk_score is not comparable to a commit's risk_score. The history, experience, and purpose groups are absent, not zero.

Sampling a trend

vcs.trend(repo_path, *, options=None, points=12, span=None, top=None, top_deltas=None) samples the file ranking at several points in time and returns a VcsTrendDict. Each point re-anchors at that moment's mainline tip, so the series is a sequence of true historical snapshots rather than the current ranking re-projected backwards.

from big_code_analysis import vcs

trend = vcs.trend("path/to/repo", points=6, span="6mo", top_deltas=10)

# The sample timestamps, oldest first (Unix seconds).
print("sampled at:", trend["as_of_points"])

# Files that regressed most over the window.
for delta in trend["deltas"]["regressed"]:
    print(f"  +{delta['delta']:.2f}  {delta['path']}")

points (at least 2) sets how many samples are taken across span (default 12mo), ending at options.as_of. The files map aligns each file's series 1:1 with as_of_points, with a None element where the file did not yet exist. The deltas summary splits files into improved and regressed lists; top_deltas trims each list and top caps how many files are kept.
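Because each file's series lines up index-for-index with as_of_points, pairing the two and dropping the None gaps gives the samples where the file actually existed. The sketch below makes assumptions beyond this page: existing_samples is a hypothetical helper, trend["files"] is taken to be a mapping from path to a list, and each series element is treated as opaque (plain floats are used in the invented sample).

```python
from __future__ import annotations

from datetime import datetime, timezone
from typing import Any


def existing_samples(trend: dict[str, Any], path: str) -> list[tuple[datetime, Any]]:
    """Pair one file's series with its sample times, dropping ``None`` gaps."""
    stamps = trend["as_of_points"]  # Unix seconds, oldest first
    series = trend["files"][path]
    return [
        (datetime.fromtimestamp(ts, tz=timezone.utc), sample)
        for ts, sample in zip(stamps, series)
        if sample is not None
    ]


# Invented data in the documented shape: the file is missing at the first point.
sample_trend = {
    "as_of_points": [1700000000, 1710000000, 1720000000],
    "files": {"src/new.rs": [None, 1.5, 2.0]},
}
samples = existing_samples(sample_trend, "src/new.rs")
for when, value in samples:
    print(when.date().isoformat(), value)
```

Filtering on "is not None" rather than truthiness keeps legitimate zero-valued samples in the series.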

Shared options

The three repository-walking entry points — rank, trend, and commit — accept the same vcs.Options object, so one configuration can drive a rank plus a trend pass without restating the common knobs. Every field is keyword-only and optional, and the defaults reproduce the bca vcs CLI defaults, so Options() matches the default ranking.

from datetime import datetime, timezone
from big_code_analysis import vcs

options = vcs.Options(
    long_window="2y",
    recent_window="60d",
    risk_formula="percentile",
    file_types=["rs", "py"],
    as_of=datetime(2026, 1, 1, tzinfo=timezone.utc),
)

report = vcs.rank("path/to/repo", options=options, top=20)

The widened option kwargs (issue #619) each accept more than a bare string:

file_types selects which files to rank: "metrics" (the default — only files bca computes metrics for), "all" (every tracked text file), a comma-separated extension allow-list ("rs,py"), or a Sequence[str] of extensions (["rs", "py"]).
as_of pins the reference "now" for reproducible snapshots, as either a datetime or a string (RFC 3339, @unix, or a git date). Pinning as_of makes a run reproducible: the ranking is computed as it stood at that moment, not against the wall clock.
cache_dir (on rank, not Options) accepts a str or any os.PathLike — a pathlib.Path passes straight through.

The history-walk toggles mirror the CLI flags: full_history, include_merges, follow_renames (default True), exclude_bots (default True), and bot_pattern to override the bot-author regular expression. bus_factor_threshold sets the coverage fraction for the bus-factor flag (default 0.5), and emit_author_details includes SHA-256-hashed canonical author identities. author_hash_key (requires emit_author_details) hardens those digests into a keyed HMAC-SHA256, the same opt-in described under Author-detail privacy.

Caching

vcs.rank keeps a persistent cache of each history walk, on by default. A cache hit is bit-identical to a fresh walk, and the time windows are recomputed against the current moment on every run, so a cached result is never stale.

from big_code_analysis import vcs

# First call primes the cache; the second replays it.
vcs.rank("path/to/repo")
vcs.rank("path/to/repo")  # reuses the prior walk

vcs.rank("path/to/repo", no_cache=True)            # ignore the cache
vcs.rank("path/to/repo", cache_dir="/tmp/bca")     # override the directory

By default the cache lives under the platform cache directory. Author identities are stored only as their SHA-256 digests, never plaintext. Note that hashing is pseudonymization, not anonymization: the digests are recoverable against a candidate email set — see Author-detail privacy.

Releasing the GIL

The repository-walking calls (vcs.rank, vcs.trend, and the commit score in vcs.commit) release the GIL across the history walk (issue #620), so a ThreadPoolExecutor can rank several repositories in parallel without serialising on the interpreter lock:

from concurrent.futures import ThreadPoolExecutor
from big_code_analysis import vcs

repos = ["service-a", "service-b", "service-c"]

with ThreadPoolExecutor() as pool:
    reports = list(pool.map(lambda r: vcs.rank(r, top=20), repos))

This is the same pattern the Async patterns page applies to the per-file analyze calls.
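One caveat with pool.map is that the first exception propagates when its result is reached, and the remaining reports are lost. A minimal sketch of a variant that keeps per-repository errors follows. The helper rank_all and the stand-in fake_rank are hypothetical names; in real use the rank argument would be something like lambda r: vcs.rank(r, top=20).

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def rank_all(
    repos: list[str],
    rank: Callable[[str], object],
    max_workers: int = 4,
) -> tuple[dict[str, object], dict[str, Exception]]:
    """Rank every repo in parallel; return (reports, errors) keyed by path."""
    reports: dict[str, object] = {}
    errors: dict[str, Exception] = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Submit everything first so the walks overlap, then collect in order.
        futures = {repo: pool.submit(rank, repo) for repo in repos}
        for repo, future in futures.items():
            try:
                reports[repo] = future.result()
            except Exception as err:  # e.g. a VcsError subclass in real use
                errors[repo] = err
    return reports, errors


# Stand-in for vcs.rank so the sketch runs without a repository.
def fake_rank(repo: str) -> str:
    if repo == "not-a-repo":
        raise ValueError(f"{repo} is not inside a git working tree")
    return f"report for {repo}"


reports, errors = rank_all(["service-a", "not-a-repo"], fake_rank)
print(reports)
print(errors)
```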

Errors

The vcs functions raise a typed exception hierarchy rooted at bca.VcsError (itself a ValueError). See Error handling for the full taxonomy and which call raises which type.

Error handling

The bindings split errors into two domains:

Caller errors are raised — ValueError for bad arguments, TypeError for the wrong type, OSError and its subclasses for filesystem failures.
Per-file analysis errors in a batch are returned as bca.AnalysisFailure values inside the result list. They are not exceptions and never raise.

The single-file bca.analyze walks the first path; the batch bca.analyze_batch walks the second.

from pathlib import Path
from typing import Any

import big_code_analysis as bca


def run(
    fixtures: Path,
    *,
    missing_path: Path,
) -> dict[str, Any]:
    """Trigger each error path and return a small report.

    ``fixtures`` is a directory containing at least ``hello.rs``;
    ``missing_path`` must NOT exist on disk.
    """
    report: dict[str, Any] = {
        "file_not_found": False,
        "unsupported": False,
        "batch_errors": 0,
    }

    # 1. analyze() on a missing path raises a typed OSError subclass.
    try:
        bca.analyze(missing_path)
    except FileNotFoundError as err:
        report["file_not_found"] = True
        print(f"file_not_found: errno={err.errno} filename={err.filename}")

    # 2. analyze() on an unknown extension raises
    #    UnsupportedLanguageError (itself a ValueError subclass).
    #    The write is inside the try/finally so a future second
    #    mutation before the analyse call still gets cleaned up.
    unknown = fixtures / "hello.unknown_extension"
    try:
        unknown.write_text("noop", encoding="utf-8")
        bca.analyze(unknown)
    except bca.UnsupportedLanguageError as err:
        report["unsupported"] = True
        print(f"unsupported_language: {err}")
    finally:
        unknown.unlink(missing_ok=True)

    # 3. analyze_batch() returns AnalysisFailure, never raises per-file.
    paths = [fixtures / "hello.rs", missing_path]
    for slot in bca.analyze_batch(paths):
        if isinstance(slot, bca.AnalysisFailure):
            report["batch_errors"] += 1
            print(f"batch_error: ({slot.error_kind}) {slot.error}")

    return report

Single-file exceptions

bca.analyze and bca.analyze_source raise:

Exception	Subclass of	Triggered by
`bca.UnsupportedLanguageError`	`ValueError`	Unknown extension + no shebang / emacs-mode hit
`bca.ParseError`	`ValueError`	tree-sitter rejected the source
`ValueError` (raw)	—	Non-UTF-8 path with `allow_lossy_path=False` (the default)
`OSError` and subclasses	—	`std::fs::read` failed

The OSError raised by analyze dispatches to the canonical subclass based on errno:

import big_code_analysis as bca

path = "src/example.rs"

try:
    bca.analyze(path)
except FileNotFoundError as err:
    print("missing:", err.errno, err.filename)
except PermissionError as err:
    print("denied:", err.errno, err.filename)
except IsADirectoryError as err:
    print("directory:", err.errno, err.filename)

The table below lists the errno behind each branch:

Exception	Typical `err.errno` (Linux)	When it fires
`FileNotFoundError`	2 (`ENOENT`)	Path does not exist.
`PermissionError`	13 (`EACCES`)	Read bit denied for the calling user.
`IsADirectoryError`	21 (`EISDIR`)	Path resolves to a directory.

Use except OSError if you want to catch the whole family and inspect err.errno / err.filename yourself.
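A minimal sketch of that catch-all approach, using only the standard library. The helper name describe_os_error is hypothetical, and a plain open() call stands in for bca.analyze so the example runs without the bindings installed.

```python
import errno


def describe_os_error(err: OSError) -> str:
    """Turn an OSError into a short label by inspecting err.errno."""
    labels = {
        errno.ENOENT: "missing",
        errno.EACCES: "denied",
        errno.EISDIR: "directory",
    }
    label = labels.get(err.errno, "other")
    return f"{label}: errno={err.errno} filename={err.filename}"


# A real filesystem call as a stand-in for bca.analyze on a missing path.
try:
    open("/definitely/not/here.rs", "rb")
except OSError as err:
    message = describe_os_error(err)
    print(message)
```

Using the errno module's constants instead of literal numbers keeps the mapping portable, since the numeric values in the table above are the typical Linux ones.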

UnsupportedLanguageError and ParseError are both ValueError subclasses, so a single except ValueError catches both. Prefer the typed catches when you want to differentiate.

Batch errors

bca.analyze_batch returns bca.AnalysisFailure values instead of raising, so a single bad file does not break the whole batch.

for slot in bca.analyze_batch(paths):
    if isinstance(slot, bca.AnalysisFailure):
        log.warning("%s (%s): %s", slot.path, slot.error_kind, slot.error)
    else:
        process(slot)

error_kind is a closed Literal:

"UnsupportedLanguage" — extension and shebang / emacs-mode resolution both came up empty.
"ParseError" — tree-sitter rejected the input, or (rare) a Rust-side JSON serialisation of the result failed. The serialisation case is prefixed with internal: serialization error: in the error string; check for the prefix when the distinction matters (serialisation failures are not recoverable by re-reading the file).
"IoError" — the most common kind: std::fs::read failed. The closed taxonomy also folds in non-UTF-8 path failures, so a path-encoding error surfaces as "IoError" rather than as a distinct fourth value.

For "IoError" instances the underlying OS errno is preserved in the error string via Rust's default formatting ("<msg> (os error <N>)" on Unix). Parse with regex if you need it for retry classification:

import re

match = re.search(r"\(os error (\d+)\)$", slot.error)
errno = int(match.group(1)) if match else None

If you need typed OSError subclasses, call bca.analyze per file instead of analyze_batch — single-file analyze raises FileNotFoundError / PermissionError / IsADirectoryError directly.
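Putting the error_kind values, the serialisation prefix, and the errno pattern together, a batch consumer might triage failures roughly as follows. This is a sketch under stated assumptions: the function classify, its three labels, and the set of errno values treated as transient are hypothetical policy choices, not part of the bindings.

```python
import re

OS_ERROR = re.compile(r"\(os error (\d+)\)$")
SERIALIZATION_PREFIX = "internal: serialization error:"
# Hypothetical retry policy: EINTR (4), EAGAIN (11), EMFILE (24) on Linux.
TRANSIENT_ERRNOS = {4, 11, 24}


def classify(error_kind: str, error: str) -> str:
    """Return "retry", "skip", or "report" for one batch failure."""
    if error_kind == "UnsupportedLanguage":
        return "skip"
    if error_kind == "ParseError":
        # Serialisation failures are internal; re-reading the file will not help.
        return "report" if error.startswith(SERIALIZATION_PREFIX) else "skip"
    # Remaining kind is "IoError": look for a trailing "(os error N)".
    match = OS_ERROR.search(error)
    if match and int(match.group(1)) in TRANSIENT_ERRNOS:
        return "retry"
    return "skip"


print(classify("IoError", "Too many open files (os error 24)"))
print(classify("ParseError", "internal: serialization error: example"))
```

In real use the two arguments would come from slot.error_kind and slot.error on each bca.AnalysisFailure.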

Programmer errors in batches

analyze_batch does still raise on caller bugs:

TypeError if paths is not iterable, or an element is not str / os.PathLike[str]. This aborts the whole call; any results computed before the bad element are discarded.
ValueError if metrics= is an explicitly empty sequence or contains an unknown name. Validation runs before the input iterable's __iter__ is called, so on this raise path a generator passed as paths is left untouched: none of its side effects run and none of its items are consumed.

Change-history (VCS) exceptions

The big_code_analysis.vcs functions raise a typed hierarchy rooted at bca.VcsError, itself a ValueError, so an existing except ValueError (or except bca.VcsError) catches every VCS failure (#624). The analyze(..., vcs=True) kwarg shares the same option-parsing errors.

Exception	Subclass of	Triggered by
`bca.NotARepositoryError`	`bca.VcsError`	`repo_path` is not inside a git working tree
`bca.InvalidRevisionError`	`bca.VcsError`	A `reference` / `commit` could not be resolved
`bca.InvalidDiffError`	`bca.VcsError`	The `diff` passed to `vcs.score_diff` is malformed
`bca.VcsEnvironmentError`	`bca.VcsError`	History walk, diffing, `.mailmap`, blame, or cache I/O failed
`bca.VcsError` (raw)	`ValueError`	A malformed option value (window / timestamp / formula / file-type scope / bus-factor threshold / bot pattern / trend point count); the message names the offending value

NotARepositoryError is the variant to branch on for "not a repo, skip this directory". The base VcsError is raised directly for a bad option, where the message names the offending value, while the named subclasses cover the input failures (a missing revision, a malformed diff). VcsEnvironmentError is the environment / backend bucket, mirroring the 500 (not 400) responses the web crate returns for the same failures.

import big_code_analysis as bca
from big_code_analysis import vcs

try:
    report = vcs.rank("path/to/repo", top=20)
except bca.NotARepositoryError:
    print("not a git repository, skipping")
except bca.VcsError as err:
    # Malformed window, formula, file-type scope, and so on.
    print("bad VCS option:", err)

analyze(..., vcs=True) is the exception to the NotARepositoryError rule: a file outside any repository simply yields no vcs block rather than raising, so only the option-parsing VcsError reaches the caller from that path.

Logging recipe

A small logging helper for batch output keeps successes / failures aligned without bespoke formatting:

import logging
import big_code_analysis as bca

log = logging.getLogger(__name__)

def report(paths: list[str]) -> None:
    # skip_generated=False keeps the result list index-aligned with
    # `paths`; with the default True, a generated file yields no slot
    # and the zip silently misaligns.
    for path, slot in zip(paths, bca.analyze_batch(paths, skip_generated=False)):
        if isinstance(slot, bca.AnalysisFailure):
            log.warning(
                "skip %s (%s): %s", path, slot.error_kind, slot.error
            )
        else:
            log.info(
                "ok %s sloc=%s", path,
                slot["metrics"]["loc"]["sloc"],
            )

Async patterns

bca.analyze is CPU-bound: the work is a tree-sitter parse plus the metric passes, both of which release the GIL on the Rust side via PyO3's Python::detach. The canonical async pattern is therefore asyncio.to_thread:

async def analyze_async(path: Path) -> FuncSpaceDict | None:
    """Run ``bca.analyze(path)`` on the default thread executor."""
    return await asyncio.to_thread(bca.analyze, path)


async def analyze_all(
    paths: Iterable[Path],
) -> list[FuncSpaceDict | BaseException | None]:
    """Fan ``analyze_async`` out across ``paths`` with ``asyncio.gather``.

    ``return_exceptions=True`` matters here: ``bca.analyze`` runs
    inside ``asyncio.to_thread`` and Python threads cannot be
    cancelled. If one call raises and gather re-raises with
    ``return_exceptions=False``, the surviving threads keep running
    in the default executor, producing results that are silently
    discarded. With ``return_exceptions=True`` every thread's
    result (success OR exception) lands in the returned list so
    the caller can dispatch per-file.
    """
    return await asyncio.gather(
        *(analyze_async(p) for p in paths),
        return_exceptions=True,
    )
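Because return_exceptions=True mixes results and exceptions in one list, the caller usually needs a dispatch step afterwards. A minimal sketch of that step follows. The helper partition and the stand-in fake_analyze are hypothetical names; fake_analyze replaces bca.analyze so the example runs without the bindings.

```python
import asyncio


def partition(paths, results):
    """Split gather(..., return_exceptions=True) output into (ok, failed)."""
    ok, failed = {}, {}
    # gather returns results in the same order as its awaitables.
    for path, result in zip(paths, results):
        if isinstance(result, BaseException):
            failed[path] = result
        else:
            ok[path] = result
    return ok, failed


def fake_analyze(path: str) -> dict:
    """Stand-in for bca.analyze: raises for one path, succeeds otherwise."""
    if path.endswith(".missing"):
        raise FileNotFoundError(2, "No such file or directory", path)
    return {"name": path}


async def main():
    paths = ["a.rs", "b.missing"]
    results = await asyncio.gather(
        *(asyncio.to_thread(fake_analyze, p) for p in paths),
        return_exceptions=True,
    )
    return partition(paths, results)


ok, failed = asyncio.run(main())
print(sorted(ok), sorted(failed))
```

A None result (which the signature above allows) is not an exception, so it lands in the ok mapping and still needs its own check.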

Why `to_thread`, not native `async`

bca.analyze is a synchronous Python function backed by synchronous Rust code — there is no await boundary inside it. Wrapping it in asyncio.to_thread:

Schedules the call on the default thread pool.
Lets other coroutines progress while the parse + metric pass runs.
Returns the result back to the calling coroutine when done.

Because the Rust side releases the GIL across the heavy work, several to_thread(bca.analyze, ...) calls genuinely run in parallel — this is not co-operative I/O multiplexing, it is real multi-core utilisation gated on the thread pool's size.

Custom executors

For a tighter cap on the worker count, replace to_thread with loop.run_in_executor and a purpose-built executor (to_thread always uses the loop's default executor):

import asyncio
from concurrent.futures import ThreadPoolExecutor

import big_code_analysis as bca

async def analyze_many(paths: list[str]) -> list[object]:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=8) as pool:
        return await asyncio.gather(
            *(loop.run_in_executor(pool, bca.analyze, p) for p in paths)
        )

Eight workers on an 8-core machine is the comfortable upper bound for purely CPU-bound work; raising it further oversubscribes the machine and trades throughput for context-switch overhead.
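A different way to cap concurrency, shown here as a sketch rather than a recommendation, is to keep asyncio.to_thread and guard it with an asyncio.Semaphore. The helper bounded_map and the stand-in fake_analyze are hypothetical; fake_analyze records how many calls overlap so the cap can be observed.

```python
import asyncio
import threading
import time


async def bounded_map(func, items, limit: int):
    """Run func(item) in worker threads, at most `limit` at a time."""
    gate = asyncio.Semaphore(limit)

    async def run_one(item):
        async with gate:
            return await asyncio.to_thread(func, item)

    return await asyncio.gather(*(run_one(item) for item in items))


# Stand-in workload that records the peak number of overlapping calls.
stats = {"active": 0, "peak": 0}
lock = threading.Lock()


def fake_analyze(path: str) -> str:
    with lock:
        stats["active"] += 1
        stats["peak"] = max(stats["peak"], stats["active"])
    time.sleep(0.02)  # simulate a parse
    with lock:
        stats["active"] -= 1
    return path.upper()


results = asyncio.run(
    bounded_map(fake_analyze, ["a.rs", "b.rs", "c.rs", "d.rs"], limit=2)
)
print(results, stats["peak"])
```

The semaphore limits how many calls are in flight, but the threads still come from the default executor, so a dedicated ThreadPoolExecutor remains the better fit when the pool itself must be isolated.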

Streaming results

asyncio.as_completed lets you start consuming results as soon as the first analysis finishes — useful when the per-file work varies wildly in cost (a 5 KB file vs a 500 KB generated bundle):

import asyncio
import big_code_analysis as bca

async def first_failure(paths: list[str]) -> str | None:
    """Return the path of the first file with cyclomatic > 50."""
    tasks = [asyncio.create_task(asyncio.to_thread(bca.analyze, p)) for p in paths]
    try:
        for coro in asyncio.as_completed(tasks):
            result = await coro
            if result is None:
                continue
            if result["metrics"]["cyclomatic"]["sum"] > 50:
                return result["name"]
    finally:
        for t in tasks:
            t.cancel()
    return None

The finally-block cancellation matters: as_completed does not auto-cancel pending tasks when the caller returns early. Cancelling drops the calls that are still queued on the thread pool; a call that has already started cannot be interrupted and runs to completion, but its result is discarded.

Anti-pattern: calling `bca.analyze` directly in a coroutine

# Don't do this.
async def bad(path: str) -> dict | None:
    return bca.analyze(path)  # blocks the event loop on every call

async def does not make the body asynchronous. Without to_thread or an explicit executor, every coroutine that calls bca.analyze stalls the event loop for the full duration of the parse — other tasks waiting on I/O, timers, or queues all freeze until the parse returns. The to_thread wrapper is one line and makes the difference between a responsive server and a single-threaded one.
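The stall can be observed with a heartbeat task that counts how often the event loop gets to run while a slow call is in progress. This is an illustrative sketch: slow_parse uses time.sleep as a stand-in for a long bca.analyze call, and count_ticks is a hypothetical helper.

```python
import asyncio
import time


def slow_parse() -> None:
    time.sleep(0.2)  # stand-in for a long bca.analyze call


async def count_ticks(use_thread: bool) -> int:
    """Count how often a heartbeat task runs while slow_parse executes."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            ticks += 1
            await asyncio.sleep(0.01)

    beat = asyncio.create_task(heartbeat())
    if use_thread:
        await asyncio.to_thread(slow_parse)  # loop stays free for heartbeat
    else:
        slow_parse()  # blocks the loop; heartbeat never gets a turn
    beat.cancel()
    return ticks


blocked = asyncio.run(count_ticks(use_thread=False))
threaded = asyncio.run(count_ticks(use_thread=True))
print(blocked, threaded)
```

In the blocking variant the heartbeat never runs before it is cancelled, while the to_thread variant lets it tick for the whole duration of the call.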

When `analyze_batch` is the better fit

If you are processing a static, finite list of paths and do not need streaming results, bca.analyze_batch is simpler than gather(*to_thread(...)): it runs sequentially on the calling thread but never raises on per-file errors. Wrap the whole analyze_batch call in asyncio.to_thread to keep the event loop responsive:

import asyncio
import big_code_analysis as bca

async def batch(paths: list[str]) -> list[object]:
    return await asyncio.to_thread(bca.analyze_batch, paths)

This trades the per-file parallelism of gather for the simpler error model of analyze_batch. Pick gather when you want both parallelism and typed OSError dispatch; pick to_thread(analyze_batch, paths) when you want one async call and the never-raise contract.

Developers Guide

If you want to contribute to the development of big-code-analysis, we have summarized here a series of guidelines to help you through the build process.

As a prerequisite, you need to install the latest available version of Rust. You can learn how to do that here.

Clone Repository

First of all, you need to clone the repository. You can do that:

through HTTPS

git clone -j8 https://github.com/dekobon/big-code-analysis.git

or through SSH

git clone -j8 git@github.com:dekobon/big-code-analysis.git

Make is the canonical entry point

The repository ships a Makefile that wraps every common build, test, lint, format, and docs task. Run make help to see the full list of targets, and make check-tools to verify which optional tools (taplo, rumdl, shellcheck, shfmt, checkmake, mdbook, cargo-insta, cargo-udeps) are present on your machine.

The two composite targets you will use most:

make pre-commit — the recommended local gate before committing. Runs cargo fmt --check, both clippy invocations (default-features and --all-features), cargo test --workspace --all-features (lib + bin + integration + doc), cargo +nightly udeps, and the markdown / TOML / shell / Makefile lint families in one parallel pass.
make ci — the same checks in the order CI runs them, with no auto-fixing. Use this to reproduce a failing CI run locally.

If GNU Make 4 or any of the optional tools are unavailable, fall back to the raw cargo commands shown below — they are equivalent to the corresponding Make targets.

Building

To build the big-code-analysis library, the CLI, and the web server in one shot:

make build           # cargo build --workspace --all-targets
make build-release   # cargo build --workspace --release

For an individual crate, invoke cargo directly:

cargo build                              # library only
cargo build -p big-code-analysis-cli     # CLI only
cargo build -p big-code-analysis-web     # web server only

make check runs cargo check --workspace --all-targets for fast type-checking during iteration.

Testing

To verify that all tests pass:

make test       # cargo test --workspace --all-features --lib --bins --tests
make test-doc   # cargo test --workspace --all-features --doc

If you only want to run the cargo command yourself:

cargo test --workspace --all-features --verbose

Updating insta tests

We use insta; install cargo-insta to manage snapshots. The Makefile wraps the two operations you need:

make insta-review   # cargo insta test --review (interactive)
make insta-accept   # cargo insta test --accept (use with care)

make insta-review runs the tests, generates the new snapshot references, and lets you review each diff. Reach for make insta-accept only for bulk metric-value-only refreshes (grammar bumps, Halstead operator reclassification) where you have already verified the diff pattern is uniform.

Code Formatting

If all the previous steps went well and you want to open a pull request to integrate your work into the codebase, the last remaining step is code formatting. The make fmt target runs every formatter in the project (Rust, Markdown, TOML, Bash) in one shot; make fmt-check verifies formatting without modifying files.

make fmt         # cargo fmt + rumdl check --fix + shfmt -w + taplo fmt
make fmt-check   # the equivalent --check variants

Rustfmt

This tool formats your code according to Rust style guidelines.

To install:

rustup component add rustfmt

To format the code (handled automatically by make fmt):

cargo fmt

Clippy

This tool helps developers write better code by automatically catching many common mistakes. It reports a series of errors and warnings in your code that must be fixed before making a pull request.

make clippy runs both clippy invocations the project enforces (default-features and --all-features); make lint additionally runs the markdown, shell, TOML, and Makefile linters.

To install:

rustup component add clippy

To detect errors and warnings:

make clippy
# or, manually:
cargo clippy --workspace --all-targets -- -D warnings
cargo clippy --workspace --all-targets --all-features -- -D warnings

Unused dependencies

make udeps runs cargo +nightly udeps --workspace --all-targets to catch dependencies declared in Cargo.toml but never referenced. Requires the nightly toolchain (rustup toolchain install nightly) and cargo-udeps.

Code Documentation

make doc        # cargo doc --no-deps --workspace --all-features  (warning-tolerant)
make doc-open   # same, then open in a browser
make doc-check  # strict gate: appends -D warnings to RUSTDOCFLAGS, fails on any rustdoc warning

make doc and make doc-open are the interactive viewers — they build whatever they can so you can still inspect rendered output mid-refactor. make doc-check is the strict gate that runs as part of make pre-commit and CI (cargo doc --no-deps --workspace --all-features with RUSTDOCFLAGS extended by -D warnings); it catches broken intra-doc links, links into private items, and other rustdoc regressions.

Remove the --no-deps option from the underlying cargo invocation if you also want to build the documentation of each dependency used by big-code-analysis.

Building this book

The book you are reading lives under big-code-analysis-book/:

make book        # mdbook build
make book-serve  # mdbook serve with live reload

Run your code

You can run bca using:

cargo run -p big-code-analysis-cli -- [bca-parameters]

To list the bca parameters, run:

cargo run -p big-code-analysis-cli -- --help

You can run bca-web using:

cargo run -p big-code-analysis-web -- [bca-web-parameters]

To list the bca-web parameters, run:

cargo run -p big-code-analysis-web -- --help

make install, make install-cli, and make install-web invoke cargo install --path for the respective binary crates.

Practical advice

When you add a new feature, add at least one unit or integration test to verify that everything works correctly
Document public API
Do not add dead code
Comment intricate code such that others can comprehend what you have accomplished
Run make pre-commit before pushing — it is the same gate CI runs

Supporting a new language

This section is to help developers implement support for a new language in big-code-analysis.

To implement a new language, two steps are required:

Generate the grammar
Add the grammar to big-code-analysis

A number of metrics are supported; guidance on implementing them is covered elsewhere in the documentation.

Generating the grammar

As a prerequisite for adding a new grammar, a tree-sitter grammar for the desired language must exist, and it must match the tree-sitter version used in this project.

The grammars are generated by a project in this repository called enums. The following steps add the language support from the language crate and generate an enum file that is then used as the grammar in this project to evaluate metrics.

Add the language specific tree-sitter crate to the enums crate, making sure the dependency is pinned with =X.Y.Z to the same version used in the root big-code-analysis Cargo.toml. For example, for the Rust support the following line exists in the /enums/Cargo.toml: tree-sitter-rust = "=0.24.2".
Append the language to the enum crate in /enums/src/languages.rs. Keeping with Rust as the example, the line would be (Rust, tree_sitter_rust). The first parameter is the name of the Rust enum that will be generated, the second is the tree-sitter function to call to get the language's grammar.
Add a case to the end of the match in mk_get_language macro rule in /enums/src/macros.rs. The current convention uses the LANGUAGE constant exposed by modern grammar crates: for Rust that line is Lang::Rust => tree_sitter_rust::LANGUAGE.into().
Lastly, we execute the /recreate-grammars.sh script that runs the enums crate to generate the grammar for the new language.

At this point we should have a new grammar file for the new language in /src/languages/. See /src/languages/language_rust.rs as an example of the generated enum.

Adding the new grammar to big-code-analysis

Add the language specific tree-sitter crate to the big-code-analysis workspace, with the same =X.Y.Z pin as the enums crate uses. For example, for the Rust support the line in the root Cargo.toml is tree-sitter-rust = "=0.24.2".
Next we add the new tree-sitter language namespace to /src/languages/mod.rs, e.g.:

pub mod language_rust;
pub use language_rust::*;

Lastly, we add a definition of the language to the arguments of mk_langs! macro in /src/langs.rs.

// 1) Cargo feature name that enables this variant's grammar
// 2) Name for enum
// 3) Language description
// 4) Display name
// 5) Empty struct name to implement
// 6) Parser name
// 7) tree-sitter function to call to get a Language
// 8) file extensions
// 9) emacs modes
// 10) pinned grammar crate version (mirrors the `=X.Y.Z` pin in the
//     workspace Cargo.toml; a drift test asserts the two agree)
(
    "rust",
    Rust,
    "The `Rust` language",
    "rust",
    RustCode,
    RustParser,
    tree_sitter_rust,
    [rs],
    ["rust"],
    "0.24.2"
)

Implementing traits and tests

Wiring the grammar is only the first step. The new <Lang>Code type must also implement the AST plumbing and every metric trait the workspace defines:

Checker in /src/checker.rs — comment, function, closure, call, string-literal, and else-if predicates over the grammar's kind_ids.
Getter in /src/getter.rs — get_space_kind plus the Halstead operator/operand classification table.
Alterator in /src/alterator.rs — usually only string-literal preservation; the default impl works for most languages.
All thirteen metric traits: Abc, Cognitive, Cyclomatic, Exit, Halstead, Loc, Mi, NArgs, Nom, Npa, Npm, Tokens, Wmc. Register each via the implement_metric_trait! macro invocation in /src/metrics/ to start with default (no-op) bodies, then replace with real impls for the metrics that have meaningful semantics for the language.

Audit aliased grammar variants

Tree-sitter grammars frequently emit several distinct kind_ids that map to the same node.kind() string (Identifier / Identifier2 / Identifier3 in Go, InvocationExpression / InvocationExpression2 in C#, QuotedContent ⋯ QuotedContent20 in Elixir). Every match node.kind_id() arm that touches an aliasable rule must either list every numbered variant or compare on the string node.kind() instead. Missing an alias silently drops nodes from the metric. See the add-lang skill for the mechanical audit procedure and lessons 2, 4, and 13 in docs/development/lessons_learned.md for the failure modes.
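The failure mode is easy to reproduce in a toy model. The sketch below is plain Python with invented ids, not the tree-sitter API: it only mirrors the pattern of several numeric ids sharing one kind string, and shows how an id-based match undercounts while a string-based match does not.

```python
# Toy node table of (kind_id, kind) pairs. The ids are invented; only the
# pattern (several ids sharing one kind string) mirrors the text above.
NODES = [
    (1, "identifier"),
    (57, "identifier"),  # alias, in the style of an Identifier2 variant
    (90, "identifier"),  # alias, in the style of an Identifier3 variant
    (12, "call_expression"),
]


def count_by_id(nodes, wanted_ids):
    """Count nodes whose numeric id is in wanted_ids."""
    return sum(1 for kind_id, _ in nodes if kind_id in wanted_ids)


def count_by_kind(nodes, wanted_kind):
    """Count nodes whose kind string equals wanted_kind."""
    return sum(1 for _, kind in nodes if kind == wanted_kind)


def aliased_kinds(nodes):
    """Return each kind string that is produced by more than one id."""
    ids_per_kind = {}
    for kind_id, kind in nodes:
        ids_per_kind.setdefault(kind, set()).add(kind_id)
    return {kind: sorted(ids) for kind, ids in ids_per_kind.items() if len(ids) > 1}


print(count_by_id(NODES, {1}))             # misses the two aliases
print(count_by_kind(NODES, "identifier"))  # sees all three
print(aliased_kinds(NODES))                # the audit list
```

The aliased_kinds output is the kind of list an audit needs: every kind it reports must have all of its ids covered by any arm that matches on the numeric id.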

Tests

Add per-language tests under each src/metrics/*.rs test module — aim for parity with the existing Rust coverage across the metric files. Every insta::assert_json_snapshot! call MUST be anchored: either with an inline expected block, a positive assert_eq! on the headline integer accessor above it, or an explanatory // expected: comment. make snapshot-anchors (run as part of make pre-commit) enforces this against .snapshot-anchor-baseline.txt.

End-to-end workflow

For an opinionated, end-to-end recipe — including the alias audit, test layout, snapshot anchoring, and code-quality post-passes — see the project's add-lang Claude Code skill. It is the canonical workflow used by recent language additions (Elixir, PHP, C#, Bash, Go).

Lines of Code (LoC)

This document gives some guidance on how to implement the LoC metrics available in this crate. Lines of code is a software metric that indicates the size of a piece of source code by counting its lines. There are several types of LoC, so we first explain them by way of an example.

Types of LoC

/*
Instruction: Implement factorial function
For extra credit, do not use mutable state or an imperative loop like `for` or `while`.
 */

/// Factorial: n! = n*(n-1)*(n-2)*(n-3)...3*2*1
fn factorial(num: u64) -> u64 {

    // use `product` on `Iterator`
    (1..=num).product()
}

The example above will be used to illustrate each of the LoC metrics described below.

SLOC

A straight count of all lines in the file including code, comments, and blank lines.
METRIC VALUE: 11

PLOC

A count of the instruction lines of code contained in the source code. This includes any brackets or similar syntax that sit on a line of their own. Comments and blank lines are not counted.
METRIC VALUE: 3

LLOC

The "logical" lines metric is a count of the number of statements in the code. Note that what a statement is depends on the language. In the above example there is only a single statement, which is the function call of product with the Iterator as its argument.
METRIC VALUE: 1

CLOC

A count of the comment lines in the code. The type of comment does not matter, i.e., single-line, block, or doc.
METRIC VALUE: 6

BLANK

Last but not least, this metric counts the blank lines present in the code.
METRIC VALUE: 2
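The line-based values above can be reproduced with a naive line classifier. This is only an illustrative sketch in Python, not how the crate computes the metrics (it works on the tree-sitter syntax tree): it assumes C-style comment syntax, counts every line inside a block comment as a comment line, and does not attempt LLOC, which needs real statement parsing.

```python
EXAMPLE = """\
/*
Instruction: Implement factorial function
For extra credit, do not use mutable state or an imperative loop like `for` or `while`.
 */

/// Factorial: n! = n*(n-1)*(n-2)*(n-3)...3*2*1
fn factorial(num: u64) -> u64 {

    // use `product` on `Iterator`
    (1..=num).product()
}
"""


def count_lines(source: str) -> dict[str, int]:
    """Classify each line as blank, comment, or code and tally the totals."""
    counts = {"sloc": 0, "ploc": 0, "cloc": 0, "blank": 0}
    in_block = False  # True while inside a /* ... */ comment
    for line in source.splitlines():
        counts["sloc"] += 1
        text = line.strip()
        if in_block:
            counts["cloc"] += 1
            if "*/" in text:
                in_block = False
        elif not text:
            counts["blank"] += 1
        elif text.startswith("//"):
            counts["cloc"] += 1
        elif text.startswith("/*"):
            counts["cloc"] += 1
            in_block = "*/" not in text
        else:
            counts["ploc"] += 1
    return counts


print(count_lines(EXAMPLE))
```

For the factorial example the tallies agree with the metric values listed in this section: 11 total lines, 3 code lines, 6 comment lines, and 2 blank lines.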

Implementation

To implement the LoC related metrics described above you need to implement the Loc trait for the language you want to support.

This requires implementing the compute function. See /src/metrics/loc.rs for where to implement, as well as examples from other languages.

Update grammars

Each programming language must be parsed in order to extract its syntax and semantics; the rules that drive this parsing are the so-called grammar of the language. In big-code-analysis, we use tree-sitter as the parsing library because it provides a distinct grammar for each of our supported programming languages. A grammar is not a static monolith, however: it changes over time and can be affected by bugs, so it needs to be updated every now and then.

For now, because the update steps are automated with Bash scripts, grammars can be updated natively only on Linux and macOS, although the scripts also run on Windows under WSL.

In big-code-analysis we use both third-party and internal grammars. The former are published on crates.io and maintained by external developers, while the latter were designed and defined inside the project to handle variants of some languages used in Firefox. The following sections explain how to update both kinds.

Third-party grammars

Update the grammar version in Cargo.toml and enums/Cargo.toml. Below is an example for the tree-sitter-java grammar:

tree-sitter-java = "x.xx.x"

where x represents a digit.

Run ./recreate-grammars.sh to recreate and refresh all grammar structures and data:

./recreate-grammars.sh

Once the script above has finished, fix any failing tests and other problems introduced by the grammar changes.

Commit your changes and create a new pull request

Internal grammars

Update the version of tree-sitter-cli in the package.json file of the internal grammar, then refresh its committed package-lock.json with npm install --package-lock-only --ignore-scripts in the same directory and commit both files together. The regen scripts install with npm ci, which fails loudly when the lockfile is missing or out of sync with package.json — this keeps every regen hash-verified and byte-reproducible (OpenSSF Scorecard Pinned-Dependencies).

The five vendored grammars publish under the bca-tree-sitter-* namespace (see RELEASING.md for the rename rationale), but consumer call sites still reference them as tree-sitter-<lang> via Cargo's package = ... alias. A grammar refresh does not bump the leaf's version on its own — every crate in this repository shares one workspace-wide version, and bumping the leaves out of step with the parent is not allowed (see the "Lockstep version policy" in RELEASING.md). Regenerate the parser tables, accept the resulting test-snapshot drift, and ship the change under the current version. The next workspace release picks up the new grammars at whatever shared version the next tag declares.

If a regeneration also needs an updated tree-sitter runtime dependency, bump the dev-dependency line inside the leaf's Cargo.toml:

[dev-dependencies]
tree-sitter = "=x.x.x"

Leave [package] name = "bca-tree-sitter-<lang>", [package] version, and [lib] name = "tree_sitter_<lang>" untouched — the rename trick in [lib] is what keeps Rust import paths stable, and the version line is managed by the lockstep bump at release time.

Run the appropriate script to update the grammar by recreating and refreshing every file and script.

For tree-sitter-ccomment and tree-sitter-preproc run ./generate-grammars/generate-grammar.sh followed by the name of the grammar. Below is an example using the tree-sitter-ccomment grammar:

./generate-grammars/generate-grammar.sh tree-sitter-ccomment

Instead, for tree-sitter-mozcpp and tree-sitter-mozjs, use their specific scripts.

For tree-sitter-mozcpp, run

./generate-grammars/generate-mozcpp.sh

For tree-sitter-mozjs, run

./generate-grammars/generate-mozjs.sh

tree-sitter-tcl, the fifth vendored grammar, has no regeneration script: it vendors pre-generated parser sources only (no grammar.js), so updating it means re-vendoring the generated src/ from its upstream project rather than running tree-sitter generate locally.

Once the script has finished, fix any failing tests and other problems introduced by the grammar changes.

Commit your changes and create a new pull request

big-code-analysis Documentation