big-code-analysis
big-code-analysis is a Rust library that analyzes and extracts information from source code written in many different programming languages. It is built on Tree-sitter, a parser generator tool and incremental parsing library.
You can find the source code of this software on GitHub, while issues and feature requests can be posted on the respective GitHub Issue Tracker.
Supported platforms
big-code-analysis can run on the most common platforms: Linux, macOS, and Windows.
On our GitHub Release Page you can find the Linux and Windows binaries already compiled and packed for you.
API docs
If you prefer to use big-code-analysis as a crate, you can find the API docs generated by Rustdoc here.
For task-oriented guides on embedding the crate — quick start,
in-memory analysis, walking FuncSpace results, and error
handling — see the Using as a Library section.
For the PyO3 bindings — pip install big-code-analysis, batch
processing, flat-record iteration, SARIF output, and async
patterns — see the Python Bindings section.
License
- Mozilla-defined grammars are released under the MIT license.
- big-code-analysis, big-code-analysis-cli and big-code-analysis-web are released under the Mozilla Public License v2.0.
Supported Languages
This is the list of programming languages parsed by big-code-analysis.
- C
- C++
- C#
- Mozcpp
- Bash
- Ccomment
- Elixir
- Preproc
- Go
- Groovy
- Java
- JavaScript
- Kotlin
- Lua
- Mozjs
- Perl
- Php
- Python
- Ruby
- Rust
- Tcl
- Tsx
- Typescript
Supported Metrics
This chapter is a guided tour of every metric that big-code-analysis computes. Each section starts from the original research paper, walks through the algorithm, and explains both the way the metric was originally meant to be used and the ways the industry has actually ended up using it years later. If you are new to software metrics, read the sections in order — the later metrics (Maintainability Index in particular) are explicitly built on top of the earlier ones (Halstead, Cyclomatic, LOC).
A few framing notes before we start:
- A metric is a measurement, not a verdict. Every number on this page summarises a structural property of source code. None of them measures correctness, productivity, or developer skill. The most important question for any metric is always "compared with what?" — the same module, a month ago; this module versus its siblings; this codebase versus an industry baseline. Absolute thresholds are rough heuristics at best.
- Most metrics here are computed at three scopes: per function / method, per class or unit-like space, and per file. The underlying tree-sitter parser produces a tree of "spaces" (functions, closures, classes, namespaces, …) and every metric is rolled up through that tree. See the Supported Languages chapter for which scopes apply to which languages.
- Object-oriented metrics only fire on object-oriented constructs. WMC, NPA, and NPM report 0 on a Rust file that has no impl blocks or on a Python module without classes; that is the correct answer, not a bug.
Index
| Metric | Measures | First defined by |
|---|---|---|
| ABC | Size as <Assignments, Branches, Conditions> | Fitzpatrick, 1997 |
| Cognitive Complexity | How hard a function is to read | Campbell / SonarSource, 2017 |
| Cyclomatic Complexity (CC) | Independent paths through a function | McCabe, 1976 |
| Halstead | Vocabulary-based size, difficulty, effort, bugs | Halstead, 1977 |
| Lines of Code (SLOC, PLOC, LLOC, CLOC, BLANK) | Raw, physical, logical, comment, and blank line counts | Conte, Dunsmore & Shen, 1986 |
| Maintainability Index (MI) | Composite maintainability score | Oman & Hagemeister, 1992; Coleman et al., 1994 |
| NArgs | Number of arguments per function | folk metric |
| NExits | Number of exit points per function | structured-programming literature |
| NOM | Number of methods and closures | Lorenz & Kidd, 1994 |
| NPA | Number of public attributes | Lorenz & Kidd, 1994 |
| NPM | Number of public methods | Lorenz & Kidd, 1994 |
| Tokens | Tree-sitter leaf-token count (size proxy) | Lizard tool, Terry Yin |
| WMC | Sum of cyclomatic complexity across a class's methods | Chidamber & Kemerer, 1994 |
ABC
The ABC metric measures the size of a piece of code as a three-dimensional vector. Each component counts one kind of operation:
- Assignments: anything that stores a value into a variable, including compound assignments (+=, ++) and explicit initialisation.
- Branches: function and method calls. Despite the name, this is not the count of conditional jumps; it is the number of points where control branches out to other code.
- Conditions: boolean tests, that is if, while, ternary operators, short-circuit && / ||, and the comparison operators that feed them.
The metric was introduced by Jerry Fitzpatrick in the 1997 C++ Report article Applying the ABC metric to C, C++ and Java. The current canonical specification, including the rules for what counts as an A, B, or C in modern languages, is maintained on Fitzpatrick's Software Renovation site.
Algorithm
The implementation walks every leaf node of the syntax tree exactly once. For each node it asks the per-language Abc trait implementation three yes/no questions (is this an assignment? a branch? a condition?) and increments the matching counter. The four headline values are:
- the three components themselves: assignments, branches, conditions;
- the magnitude |<A,B,C>| = √(A² + B² + C²), which is the way Fitzpatrick recommends summarising the vector as a single number.
The full serialised output (src/metrics/abc.rs) emits these four
together with the per-component averages (assignments_average,
branches_average, conditions_average) and per-component
*_min / *_max at the file scope, for thirteen fields total. The
metric is specialised per language in src/languages/language_*.rs.
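As a minimal sketch of the magnitude formula above, the following computes |<A,B,C>| from three counters. The `AbcCounts` type and its field names are hypothetical and chosen for illustration; they are not the crate's API.

```rust
/// Hypothetical container for the three ABC counters of one function.
/// (Illustrative only; not a type exported by big-code-analysis.)
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct AbcCounts {
    pub assignments: u32,
    pub branches: u32,
    pub conditions: u32,
}

impl AbcCounts {
    /// Euclidean length of the <A, B, C> vector: sqrt(A^2 + B^2 + C^2).
    pub fn magnitude(&self) -> f64 {
        let a = f64::from(self.assignments);
        let b = f64::from(self.branches);
        let c = f64::from(self.conditions);
        (a * a + b * b + c * c).sqrt()
    }
}
```

For example, a function with 3 assignments, 4 calls, and 12 conditions has magnitude √(9 + 16 + 144) = 13. Because the components are squared, the largest component dominates the result.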
How to read it
ABC is a size metric, not a complexity metric — a long, dull function with no decisions still scores high if it does a lot of assignments. Fitzpatrick's original recommendation was to use the magnitude as a relative ruler: rank a file's functions by ABC magnitude and look at the top decile.
In practice ABC ended up being most widely adopted by the Ruby
community, where the rubocop linter and the
flog tool both default to
threshold-based warnings. A Ruby method with an ABC magnitude over
about 17 is conventionally a refactoring candidate; over 30 is
considered hard to maintain. Those thresholds are language-specific —
expect higher values in C++ and Java, which use explicit getter/setter
assignments more aggressively.
Cognitive Complexity
Cognitive Complexity was introduced by G. Ann Campbell at
SonarSource in the 2017 white paper Cognitive Complexity — A new way
of measuring understandability and the follow-up IEEE TechDebt 2018
paper Cognitive Complexity — An Overview and
Evaluation. The
white paper itself is available as
CognitiveComplexity.pdf
on the SonarSource site.
The metric was designed as a deliberate replacement for Cyclomatic
Complexity in code-quality tooling. The argument Campbell makes is
that cyclomatic complexity measures how hard code is to test, not
how hard it is to understand: a 1024-arm switch statement scores
the same as a deeply nested chain of ifs that perform identical
logic, yet a human reader has a much harder time following the
nested code.
Algorithm
Cognitive Complexity starts at zero and applies three rules as it walks the tree:
- Ignore "shorthand" control flow. Constructs that simply route to a single block (a top-level if with no nesting, an else without conditions of its own, the head of a for, a ?: ternary) add a baseline +1 each, but they do not punish you for the pattern.
- Penalise breaks in linear flow. Every if, else if, else, switch, try/catch, loop, jump (goto, break label, continue label), and recursive call adds at least +1.
- Punish nesting. Every time control flow appears inside an already-nested block, the metric adds an extra +1 per level of nesting. An if inside a for inside an outer if inside a method scores 1 + 2 + 3 = 6, where a flat sequence of the same three constructs would have scored 1 + 1 + 1 = 3.
Sequences of identical boolean operators (a && b && c) score +1
for the whole run, on the grounds that a chain of &&s is no harder
to read than a single &&. Switching operators (a && b || c) is
where the cognitive load jumps, so the second operator earns its own
+1.
big-code-analysis exports the per-function structural score along
with the file-wide sum, min, max, and a per-function average.
The implementation is in src/metrics/cognitive.rs.
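The nesting rule and the boolean-sequence rule can be sketched as two small functions. This is an illustration of the arithmetic only, under the assumption that nesting depths and operator sequences have already been extracted from the syntax tree; the function names are hypothetical and do not mirror src/metrics/cognitive.rs.

```rust
/// Structural increment for flow-breaking constructs, given the nesting
/// depth at which each one appears (0 = directly in the function body).
/// Each construct costs 1 plus one extra point per level of nesting.
pub fn nesting_score(depths: &[u32]) -> u32 {
    depths.iter().map(|depth| 1 + depth).sum()
}

/// Increment for one boolean expression, given its operators in source order.
/// A run of identical operators costs 1; every switch to a different
/// operator starts a new run and costs another 1.
pub fn boolean_sequence_score(operators: &[&str]) -> u32 {
    let mut score = 0;
    let mut previous: Option<&str> = None;
    for &op in operators {
        if previous != Some(op) {
            score += 1;
        }
        previous = Some(op);
    }
    score
}
```

With depths `[0, 1, 2]` (an if containing a for containing an if) the first function yields 6, and with `[0, 0, 0]` it yields 3, matching the worked example above. For `a && b && c` the operator list is `["&&", "&&"]` and scores 1; for `a && b || c` it is `["&&", "||"]` and scores 2.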
How to read it
A Cognitive Complexity of 0 means the function is purely linear; no
branches, no loops. SonarSource's tooling defaults to flagging
functions above 15 as "too complex" and Campbell's recommendation
in the white paper is that a function should rarely exceed about
25. Unlike Cyclomatic Complexity, the metric scales smoothly:
deeply nested code with the same number of decisions scores
significantly higher than flat code with the same decisions.
The emergent use case is refactoring guidance during code review: because the metric penalises nesting specifically, it tends to flag exactly the kind of function that benefits from an early-return or "extract method" refactor. SonarLint's IDE plugins (IntelliJ, VS Code, Visual Studio, Eclipse) all surface it as the headline complexity number on hover, and the metric has since been picked up by several language servers and code-review platforms outside the Sonar ecosystem.
Cyclomatic Complexity (CC)
The original software complexity metric, introduced by Thomas J. McCabe in 1976 in A Complexity Measure (IEEE Transactions on Software Engineering, SE-2(4), pages 308–320).
McCabe's idea was to apply graph theory to the control-flow graph of a function. If you draw every basic block as a node and every jump between blocks as an edge, the cyclomatic number of that graph is
M = E − N + 2P
where E is the number of edges, N the number of nodes, and P
the number of connected components. Crucially, M is also exactly
the number of linearly independent paths through the function,
which makes it an upper bound on the number of test cases needed to
cover every branch at least once.
Algorithm
big-code-analysis does not literally build a control-flow graph. Instead it uses the equivalent, much cheaper, formulation McCabe proved in the 1976 paper for structured programs:
Cyclomatic Complexity = 1 + (number of decision points)
A "decision point" is any node where control can branch:
- if, else if, ternary ?:
- case / when arms in switch / match / select
- while, do … while, every variant of for
- exception-handler catch clauses
- short-circuit boolean operators && and ||
The per-language Cyclomatic trait, in src/metrics/cyclomatic.rs,
asks each tree-sitter node "are you a decision?" and increments the
counter. The metric is rolled up per function and per file; per-class
aggregation across method bodies is provided separately by
WMC below.
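To make the equivalence of the two formulations concrete, here is a small sketch that evaluates both. The function names are hypothetical; the inputs (edge, node, and decision counts) are assumed to be known already.

```rust
/// McCabe's graph formula M = E - N + 2P.
/// Signed arithmetic avoids underflow when E < N.
pub fn cyclomatic_from_graph(edges: u32, nodes: u32, components: u32) -> i64 {
    i64::from(edges) - i64::from(nodes) + 2 * i64::from(components)
}

/// Shortcut for structured code: one plus the number of decision points.
pub fn cyclomatic_from_decisions(decision_points: u32) -> u32 {
    1 + decision_points
}
```

Take a function consisting of a single if/else. Its control-flow graph has four nodes (condition, then-block, else-block, join) and four edges, in one connected component, so M = 4 − 4 + 2 = 2. Counting decisions gives the same answer: one if, so 1 + 1 = 2.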
Modified cyclomatic
big-code-analysis also reports a modified variant that collapses
all case / match / when arms inside a single switch
statement into one decision point, regardless of how many arms it
has. This tends to undercount big dispatch tables in a way that
often matches developer intuition better than the strict McCabe
definition — a 30-arm enum dispatch reads as one decision, not
thirty. (The convention itself is not original to this project: it
echoes the long-standing -m mode from Terry Yin's
lizard tool, which is where
many readers will first have seen it.) Both numbers are exported
side by side; pick one and be consistent.
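The difference between the two variants can be sketched as follows. The `Decisions` type is hypothetical, and the sketch assumes that every arm of a switch-like statement counts as one decision point in the strict variant; how the crate treats default arms is not specified here.

```rust
/// Hypothetical summary of one function's decision points.
pub struct Decisions {
    /// Decision points that are not switch arms (if, loops, &&, ...).
    pub other: u32,
    /// Number of arms in each switch-like statement in the function.
    pub switch_arms: Vec<u32>,
}

/// Strict count: every arm is its own decision point.
pub fn standard_cyclomatic(d: &Decisions) -> u32 {
    1 + d.other + d.switch_arms.iter().sum::<u32>()
}

/// Modified count: each switch-like statement counts once,
/// however many arms it has.
pub fn modified_cyclomatic(d: &Decisions) -> u32 {
    1 + d.other + d.switch_arms.len() as u32
}
```

Under these assumptions a function containing nothing but a 30-arm dispatch scores 31 in the strict variant and 2 in the modified one. A function with no switch-like statements scores the same in both.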
How to read it
McCabe's original recommendation, repeated in the 1976 paper and
preserved by NIST's Structured Testing
report (Special
Publication 500-235, 1996), is to treat 10 as the upper bound for a
single function: above that, the number of test cases needed for
branch coverage grows uncomfortably large.
The emergent uses of cyclomatic complexity have been:
- Defect prediction. Complexity correlates well — though imperfectly — with the probability of a function containing a bug, and most static-analysis tools flag high-CC functions as risky.
- Test-coverage planning. CC is an upper bound on the number of test cases needed to cover every branch, so test teams use it directly to budget effort.
- Refactor triage. Cyclomatic Complexity is the headline "complexity" number in almost every code-quality dashboard, often as a tie-breaker between two functions that look similar in length.
Be aware of the metric's well-known blind spot: it treats every
decision as equal weight. A 30-arm switch over an enum and a
function whose roughly thirty ifs are nested several levels deep both
score around 30, even though they are very different reading
experiences. Cognitive Complexity (above) was designed to fix exactly
that.
Halstead
The Halstead suite is the oldest size-and-effort metric family on this page. Maurice H. Halstead introduced it in his 1977 book Elements of Software Science (Elsevier, ISBN 0-444-00205-7); the Wikipedia page on Halstead complexity measures summarises the formulas. Halstead's project was strikingly ambitious: he wanted a quantitative, empirical science of software in the same way that physics is the empirical science of matter.
The four base counts
Halstead reduces a program to its tokens, then partitions them into two categories:
- Operators: anything that does something, such as keywords (if, return, while), arithmetic and logical operators, assignment, function-call syntax, and punctuation that controls flow.
- Operands: anything that is something, namely identifiers and literals.
From these you derive four base counts:
| Symbol | Meaning |
|---|---|
| n1 | number of distinct operators |
| n2 | number of distinct operands |
| N1 | total count of operator occurrences |
| N2 | total count of operand occurrences |
big-code-analysis records these four numbers in
src/metrics/halstead.rs per function and per file. The per-language
trait classifies tokens as operator vs. operand on a token-by-token
basis; the rules deliberately exclude pure layout punctuation like
parentheses and statement separators, which is why the Halstead
totals are not the same as the Tokens count.
Derived metrics
Halstead then derives a small zoo of formulas. big-code-analysis
reports all of the standard ones, plus three less-common derivations
(estimated_program_length, purity_ratio, level) that are part
of the original suite:
vocabulary n = n1 + n2
length N = N1 + N2
estimated_program_length N̂ = n1·log2(n1) + n2·log2(n2)
purity_ratio = N̂ / N
volume V = N · log2(n) (bits)
difficulty D = (n1 / 2) · (N2 / n2)
level L = 1 / D
effort E = D · V (elementary mental discriminations)
time T = E / 18 (seconds)
bugs B = E^(2/3) / 3000 (estimated delivered defects)
The numeric constants come from Halstead's empirical fits against a
heterogeneous corpus of CDC-era programs including FORTRAN, PL/I, and
Algol-family languages. The T = E / 18 "Stroud number" is separate
— it comes from psychology: Halstead borrowed John Stroud's estimate
that the human mind makes about 18 elementary discriminations per
second.
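The derived formulas can be sketched directly from the four base counts. The `HalsteadCounts` type below is hypothetical and is not the structure used in src/metrics/halstead.rs; it also assumes all four counts are positive and does not guard against zero.

```rust
/// The four Halstead base counts for one function or file.
/// Hypothetical type for illustration; all counts are assumed positive.
#[derive(Debug, Clone, Copy)]
pub struct HalsteadCounts {
    pub n1: f64,     // distinct operators
    pub n2: f64,     // distinct operands
    pub big_n1: f64, // total operator occurrences (N1)
    pub big_n2: f64, // total operand occurrences (N2)
}

impl HalsteadCounts {
    pub fn vocabulary(&self) -> f64 {
        self.n1 + self.n2
    }
    pub fn length(&self) -> f64 {
        self.big_n1 + self.big_n2
    }
    pub fn estimated_program_length(&self) -> f64 {
        self.n1 * self.n1.log2() + self.n2 * self.n2.log2()
    }
    pub fn purity_ratio(&self) -> f64 {
        self.estimated_program_length() / self.length()
    }
    /// Volume in bits: N * log2(n).
    pub fn volume(&self) -> f64 {
        self.length() * self.vocabulary().log2()
    }
    pub fn difficulty(&self) -> f64 {
        (self.n1 / 2.0) * (self.big_n2 / self.n2)
    }
    pub fn level(&self) -> f64 {
        1.0 / self.difficulty()
    }
    pub fn effort(&self) -> f64 {
        self.difficulty() * self.volume()
    }
    /// Time in seconds, using the Stroud number 18.
    pub fn time(&self) -> f64 {
        self.effort() / 18.0
    }
    pub fn bugs(&self) -> f64 {
        self.effort().powf(2.0 / 3.0) / 3000.0
    }
}
```

As a worked example, take n1 = 4, n2 = 4, N1 = 8, N2 = 8. Then n = 8, N = 16, N̂ = 4·2 + 4·2 = 16, V = 16·log2(8) = 48 bits, D = (4/2)·(8/4) = 4, E = 4·48 = 192, and T = 192/18, roughly 10.7 seconds.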
How to read it
Halstead's original intent was to predict three things about a program before it was even written: how big it would be in bits, how long it would take to implement, and how many bugs to expect in deployment. The empirical evidence for the volume and length predictions is reasonable; the time and bugs predictions are more controversial and have been criticised at length, notably in the Purdue technical report Software Science Revisited.
In modern practice the Halstead numbers are used for three things:
- As inputs into composite metrics — most importantly the Maintainability Index (next section), which depends on Halstead volume.
- As a language-independent size proxy: volume in bits scales smoothly across languages in a way that LOC does not.
- For comparative effort budgeting: when two refactoring candidates have similar cyclomatic complexity, the one with the higher Halstead difficulty is the one more likely to introduce regressions.
Lines of Code
This section covers the five LOC variants — SLOC, PLOC, LLOC, CLOC, and BLANK. "Counting lines" sounds trivial until you have to define exactly what counts. The five variants below are the de-facto standard breakdown, going back to Samuel Conte, Hubert Dunsmore and Vincent Shen's 1986 textbook Software Engineering Metrics and Models (Benjamin/Cummings, ISBN 0-8053-2162-4), which codified the distinction between physical and logical lines. The OpenStaticAnalyzer project maintains a readable summary of the modern definitions.
| Variant | Counts |
|---|---|
| SLOC | Source Lines Of Code — every line in the file, comments, blanks, and code alike |
| PLOC | Physical Lines Of Code — non-blank, non-comment-only lines |
| LLOC | Logical Lines Of Code — statement-bearing lines (definitions, assignments, declarations) |
| CLOC | Comment Lines Of Code — lines that contain a comment (with or without code on the same line) |
| BLANK | Blank lines — whitespace-only lines |
Algorithm
big-code-analysis derives all five counts from a single pass over the
tree-sitter syntax tree (see src/metrics/loc.rs). Comments and
strings are identified by their AST node type rather than by lexical
scanning, so multi-line strings, raw strings, doc comments, and
string interpolations are all handled correctly. The per-language
Loc trait specifies which node kinds count as a "statement" for
LLOC; this is the subtle one, because what counts as a statement is
language-defined.
The five counts satisfy a couple of useful identities:
SLOC = PLOC + BLANK + (lines that are comment-only)
CLOC ≥ (lines that are comment-only) # CLOC also counts mixed code+comment lines
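The two identities can be expressed as simple checks. In this sketch the `LineCounts` type is hypothetical, and `comment_only` is an auxiliary count introduced purely to state the identities; it is not one of the five reported variants.

```rust
/// Hypothetical per-file line counts (illustrative only).
pub struct LineCounts {
    pub sloc: u32,
    pub ploc: u32,
    pub cloc: u32,
    pub blank: u32,
    /// Lines that contain a comment and no code.
    pub comment_only: u32,
}

impl LineCounts {
    /// SLOC = PLOC + BLANK + comment-only lines.
    pub fn sloc_identity_holds(&self) -> bool {
        self.sloc == self.ploc + self.blank + self.comment_only
    }

    /// CLOC >= comment-only lines, because CLOC also counts
    /// lines that mix code and a comment.
    pub fn cloc_bound_holds(&self) -> bool {
        self.cloc >= self.comment_only
    }
}
```

Consider a ten-line file with six code lines (one of which carries a trailing comment), two comment-only lines, and two blank lines. Then SLOC = 10, PLOC = 6, BLANK = 2, and CLOC = 3, so 10 = 6 + 2 + 2 and 3 ≥ 2.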
How to read it
- SLOC is what most people mean colloquially by "lines of code". It is the canonical size proxy, but is sensitive to formatting and not portable across language conventions. It is also the size measure used inside the Maintainability Index formulas below.
- PLOC strips away the visual noise of blank and comment-only lines.
- LLOC is the most reliable statement count. It is the right measure if you are budgeting test cases per statement, or comparing the density of a Python file against a Java file.
- CLOC, combined with PLOC, gives you a comment density: CLOC / PLOC is a useful rough proxy for how much of the file is documentation versus implementation.
- BLANK is mostly diagnostic: a file with very low BLANK proportion is often hard to read.
The emergent uses of LOC variants go well beyond raw size. They are the most common input into cost-estimation models (COCOMO and COCOMO II both use KSLOC, thousands of source lines, as their base unit), they feed effort prediction in product-portfolio dashboards, and they are used as a normalising denominator for almost every other metric: defects per KSLOC, churn per KSLOC, test cases per KSLOC. The weakness is that LOC is easy to game, and differences in coding style alone can change the count by a factor of two; that is the reason this chapter has so many other metrics in it.
Maintainability Index (MI)
The Maintainability Index is a composite metric that rolls several of the metrics above into a single 0-to-100ish number meant to be read as "how maintainable is this code?". It was proposed by Paul Oman and Jack Hagemeister in their 1992 ICSM paper Metrics for assessing a software system's maintainability and refined by Don Coleman, Dan Ash, Bruce Lowther, and Paul Oman in the 1994 IEEE Computer paper Using metrics to evaluate software system maintainability (IEEE Computer 27(8), pages 44-49). Their methodology was empirical: they collected expert maintainability ratings on a handful of production Hewlett-Packard systems, computed forty candidate metrics on each, and let regression analysis pick the best linear combination. The combination that survived used Halstead volume, cyclomatic complexity, lines of code, and comment density.
big-code-analysis reports the three formulas that have stuck in practice:
mi_original = 171 − 5.2·ln(HV) − 0.23·CC − 16.2·ln(SLOC)
mi_sei = 171 − 5.2·log2(HV) − 0.23·CC − 16.2·log2(SLOC) + 50·sin(√(2.4·comment_ratio))
mi_visual_studio = max(0, mi_original · 100 / 171)
- mi_original is the Coleman–Oman formula. It can be negative for pathological files.
- mi_sei is the Software Engineering Institute's refinement, which adds a comment-density term. The sin(√(...)) shape was chosen so that some comments help, but adding more after a point does not.
- mi_visual_studio is the linear rescaling Microsoft chose for Visual Studio, where the score is clamped to [0, 100] and shown to developers traffic-light style: green ≥ 20, yellow ≥ 10, red below.
The historical context, and a sharp critique of the metric, is collected on Arie van Deursen's blog post Think Twice Before Using the Maintainability Index.
Algorithm
The implementation is purely arithmetic: src/metrics/mi.rs
consumes the already-computed Halstead, Cyclomatic, and LOC
metrics and applies the three formulas. Because the formulas take
logarithms of Halstead volume and SLOC (natural log in mi_original,
log2 in mi_sei), MI is undefined for empty files;
big-code-analysis returns 0.0 for any file with zero SLOC or
zero Halstead volume.
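The three formulas and the zero guard can be sketched as below. The `MiInputs` type is hypothetical, and `comment_ratio` is assumed to be a fraction in [0, 1]; the exact definition used by the crate is not stated in this chapter, so treat that input as an assumption.

```rust
/// Inputs to the three Maintainability Index formulas.
/// Hypothetical type; not the structure used in src/metrics/mi.rs.
pub struct MiInputs {
    pub halstead_volume: f64,
    pub cyclomatic: f64,
    pub sloc: f64,
    /// Assumed to be a fraction of comment lines in [0, 1].
    pub comment_ratio: f64,
}

impl MiInputs {
    /// The logarithms are undefined for zero volume or zero SLOC.
    fn is_degenerate(&self) -> bool {
        self.halstead_volume <= 0.0 || self.sloc <= 0.0
    }

    pub fn mi_original(&self) -> f64 {
        if self.is_degenerate() {
            return 0.0;
        }
        171.0 - 5.2 * self.halstead_volume.ln()
            - 0.23 * self.cyclomatic
            - 16.2 * self.sloc.ln()
    }

    pub fn mi_sei(&self) -> f64 {
        if self.is_degenerate() {
            return 0.0;
        }
        171.0 - 5.2 * self.halstead_volume.log2()
            - 0.23 * self.cyclomatic
            - 16.2 * self.sloc.log2()
            + 50.0 * (2.4 * self.comment_ratio).sqrt().sin()
    }

    /// Linear rescaling of mi_original, floored at zero.
    pub fn mi_visual_studio(&self) -> f64 {
        (self.mi_original() * 100.0 / 171.0).max(0.0)
    }
}
```

For a function with Halstead volume 1000, cyclomatic complexity 5, and 100 SLOC, mi_original is 171 − 5.2·ln(1000) − 0.23·5 − 16.2·ln(100), which is roughly 59.3; rescaled for Visual Studio that is roughly 34.7, comfortably in the green band.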
How to read it
MI was originally designed as a portfolio-level score: "how much maintenance pain should we expect from this codebase over the next year?". It is fairly stable across releases of a healthy system and tends to drop measurably before a system enters the "legacy" quadrant.
The emergent use case is the Visual Studio traffic-light rendering:
every C# developer who has hovered a method in the IDE has seen the
green / yellow / red icon, and the underlying number is mi_visual_studio.
This made MI by far the most user-facing software metric for an
entire generation of .NET developers, which is also why it is the
metric that has attracted the most criticism. Treat it as a smoke
detector, not a thermostat: a sudden drop is a useful signal, but
the absolute number is noisy.
NArgs
NArgs counts the number of arguments declared by a function, method, or closure. The metric does not have a famous origin paper — it is folk wisdom dating to at least Kernighan and Plauger's The Elements of Programming Style (1974) and prominently re-stated in Robert C. Martin's Clean Code (2008), which suggests three arguments as a soft ceiling.
big-code-analysis splits the count by callable kind: every aggregate
is reported separately for functions and closures so a Rust file
heavy on |…| … closures and a Java file with only methods produce
comparable numbers. The serialised output
(src/metrics/nargs.rs) is total_functions, total_closures,
average_functions, average_closures, total, average,
functions_min, functions_max, closures_min, closures_max.
The implementation handles default arguments, variadic arguments,
keyword-only arguments, and destructured parameters consistently per
language.
How to read it
A function with many arguments is hard to call correctly and even harder to test exhaustively — the test matrix grows roughly exponentially. The classic refactoring advice is the introduce parameter object pattern: when a function takes more than four related arguments, group them into a record / struct / dataclass.
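A small sketch of the introduce parameter object refactoring, using hypothetical names chosen only for illustration. The first function takes five positional arguments; the second groups the four related ones into a struct, which brings its argument count down to two.

```rust
/// Before: five positional f64 arguments, easy to pass in the wrong order.
pub fn scaled_area_flat(left: f64, top: f64, right: f64, bottom: f64, scale: f64) -> f64 {
    (right - left) * (bottom - top) * scale * scale
}

/// Parameter object grouping the four related coordinates.
pub struct Rect {
    pub left: f64,
    pub top: f64,
    pub right: f64,
    pub bottom: f64,
}

/// After: two arguments, and the call site names every coordinate.
pub fn scaled_area(rect: &Rect, scale: f64) -> f64 {
    (rect.right - rect.left) * (rect.bottom - rect.top) * scale * scale
}
```

Both functions compute the same value; the difference is that a call such as `scaled_area(&Rect { left: 0.0, top: 0.0, right: 2.0, bottom: 3.0 }, 2.0)` cannot silently swap two coordinates the way a row of bare numbers can.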
The emergent use is as a review-blocking lint rule: most modern
linters (pylint's R0913, ESLint's max-params, Checkstyle's
ParameterNumber) flag functions with more than a configurable
threshold. NArgs is also a useful component of API-design dashboards:
public APIs whose average NArgs has crept upward over time tend to be
ones that have accreted "just one more parameter" feature flags.
NExits
NExits counts the number of distinct exit points from a
function — every return, every throw / raise, and the implicit
fall-through return at the end of a void function.
The metric goes back to the structured-programming literature of the 1970s, where Edsger Dijkstra and others argued that functions should have a single entry and a single exit point (the "SESE" rule). Modern thinking is much more nuanced — see Steve McConnell's Code Complete, 2nd edition (Microsoft Press, 2004), which explicitly recommends early returns as a clarity-improving pattern when they reduce nesting.
big-code-analysis walks each function's syntax tree, identifies the
language-specific exit nodes (see the per-language Exit trait in
src/metrics/exit.rs), and reports per-function counts plus
file-level sum, average, min, and max. The serialised
field name is nexits, matching the prose acronym used here.
How to read it
Strict SESE coding standards (DO-178C for avionics, MISRA C for
embedded automotive — see MISRA's official
site) still require an NExits of 1 per
function, because multiple exit points complicate certified
control-flow analysis. Outside those domains, an NExits of 2-4 is
usually a good sign — it almost always means the function uses
guard clauses to handle preconditions and then proceeds in a flat
body.
A very high NExits — say above 8 — is the warning sign. It usually means the function should have been split into several smaller functions, with each "successful branch" becoming its own helper.
NOM
NOM stands for Number Of Methods and counts every function, method, and closure defined inside a given scope (file, class, or namespace). For object-oriented codebases it is one of the first metrics introduced by Mark Lorenz and Jeff Kidd in their 1994 book Object-Oriented Software Metrics (Prentice Hall, ISBN 0-13-179292-X), where it is treated as the primary class-size indicator.
big-code-analysis reports the count split by callable kind in
src/metrics/nom.rs. The serialised fields are functions,
closures, functions_average, closures_average, total,
average (overall average across containing spaces), and per-kind
functions_min, functions_max, closures_min, closures_max.
The split lets you ask different questions of the same code: a Rust crate with many closures and few functions is typical of iterator-heavy code; a Python module with many functions and few closures is typical of script-style code.
How to read it
NOM is the input to several other metrics — WMC sums cyclomatic
complexity across the same set of methods that NOM counts, and NPM
filters that same set down to public methods. As a standalone
metric, the Lorenz–Kidd recommendation is ≤ 20 methods per class.
The emergent use is as a God-class detector: a class with NOM in
the dozens is almost always doing too much, and is a strong
candidate for "extract collaborator" refactoring as documented in
Martin Fowler's Refactoring catalogue
entry on Large Class.
NPA
NPA counts the number of public attributes (a.k.a. fields, properties, instance variables) declared by a class or interface. It is part of the metric family introduced by Lorenz and Kidd in Object-Oriented Software Metrics (1994) and was later folded into the MOOD ("Metrics for Object-Oriented Design") suite proposed by Brito e Abreu and Carapuça (1994).
big-code-analysis splits the count by definition-site kind:
classes (concrete types with state) and interfaces (abstract
contracts). The serialised output (src/metrics/npa.rs) is
classes (sum of NPA across all classes), interfaces (sum across
interfaces), class_attributes (sum of all attributes — public or
not — across classes), interface_attributes, classes_average
(class density of public attributes), interfaces_average, total,
total_attributes, and average. The per-language Npa trait
decides what counts as "public" (Java public, C# public, Rust
pub, Python's "no leading underscore" convention, …) and what
counts as "attribute" rather than "method".
How to read it
NPA is a direct measure of encapsulation. Every public attribute is a piece of internal state that callers can read or write without going through a method, which means it is a piece of internal state the class cannot validate or evolve without breaking callers. The canonical guidance — first explicitly stated in Bertrand Meyer's Object-Oriented Software Construction (Prentice Hall, 1988) and known as the Uniform Access Principle — is to keep NPA at or near zero and to expose state through public methods instead.
The emergent use is API-stability auditing: a public library class whose NPA grows over time accumulates breaking-change liability faster than its public-method surface.
NPM
NPM counts the number of public methods declared by a class or interface. It is the method-side companion to NPA and was again codified by Lorenz and Kidd (1994).
As with NPA, big-code-analysis splits NPM by definition-site kind
(classes vs. interfaces). The serialised output
(src/metrics/npm.rs) is classes (sum of NPM across classes),
interfaces, class_methods (sum of all methods — public or
not — across classes), interface_methods, classes_average,
interfaces_average, total, total_methods, and average.
The language-specific Npm trait decides what counts as public —
for example, Rust's pub, Python's leading-underscore convention,
C++'s public: section — and folds together regular methods,
constructors, and operator overloads as appropriate.
NPM is also one of the inputs into Mark Hitz and Behzad Montazeri's Class Interface Size metric, and into Chidamber and Kemerer's Response For a Class (RFC).
How to read it
NPM is the public interface size. A class with NPM in the dozens
is a class with too large an API contract: every public method is
something callers can come to depend on, and every change to it is a
breaking change. The Lorenz–Kidd guidance is ≤ 20 public methods
per class, with anything over 40 being considered a strong
refactoring candidate. The same rule applies particularly forcefully
to interfaces in Java and C#, where the contract really is the
shape clients pin against.
The emergent use is as a public-API change tracker for libraries: monitoring NPM at the package level catches accidental expansion of a library's surface area in the same way that NPA catches accidental exposure of internal fields.
Tokens
Tokens is a per-function and per-file count of the tree-sitter
leaf tokens — identifiers, literals, keywords, punctuation —
excluding any token whose AST ancestor is a comment node. It is a
modern, lexer-driven size proxy intended as a more
formatting-resilient alternative to LOC. (The same idea is well
known from Terry Yin's lizard
command-line tool, which is where many readers will first have seen
a token-count metric.)
The implementation lives in src/metrics/tokens.rs. Because Tokens
counts every leaf, including punctuation that Halstead
deliberately skips, the value will not equal Halstead N1 + N2,
and because it counts tokens rather than lines it is not
equivalent to any LOC variant. Whitespace-only reformatting does not
change Tokens; renaming a variable does not change the count;
removing a comment does not change Tokens. Edits that change the
tokens themselves — adding an if, adding optional braces around
a single-statement block, or inserting/removing semicolons in a
language where they are optional — do change the count.
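The invariances described above can be illustrated with a toy token counter. This is deliberately not tree-sitter: it is a hypothetical lexer that treats each run of alphanumerics or underscores as one token and every other non-whitespace character as its own token, and it knows nothing about comments, strings, or multi-character operators.

```rust
/// Toy token counter (an assumption for illustration, not tree-sitter).
/// A run of alphanumerics or '_' is one token; every other
/// non-whitespace character is a token of its own.
pub fn count_tokens(source: &str) -> usize {
    let mut count = 0;
    let mut in_word = false;
    for ch in source.chars() {
        if ch.is_alphanumeric() || ch == '_' {
            if !in_word {
                count += 1;
                in_word = true;
            }
        } else {
            in_word = false;
            if !ch.is_whitespace() {
                count += 1;
            }
        }
    }
    count
}
```

Under this toy definition `x=a+b;` and `x = a + b ;` both count six tokens, and so does `total = alpha + beta;`, because spacing and identifier length do not matter. Wrapping the statement as `if(c){x=a+b;}` raises the count to twelve, since the keyword, parentheses, condition, and braces are all new tokens.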
How to read it
Tokens is the most formatting-resilient size proxy in the suite.
It is the right size measure to use when you are normalising another
metric across languages or across teams with different style
conventions — bugs per KSLOC is sensitive to formatting, while
bugs per 1000 tokens is much less so.
The emergent use is as the defect-density denominator of choice in cross-language research: a 1000-line Java file and a 1000-line Lisp file contain very different amounts of code, but a 1000-token slice of each contains roughly the same amount of information. This makes Tokens particularly useful for machine-learning code-quality models that train across many languages.
WMC
WMC — Weighted Methods per Class — is the first metric in the Chidamber and Kemerer suite, introduced in their 1994 IEEE Transactions on Software Engineering paper A Metrics Suite for Object Oriented Design (volume 20, issue 6, pages 476-493). The CK suite — WMC, DIT, NOC, CBO, RFC, LCOM — is the single most-cited collection of OO metrics in the academic literature; big-code-analysis currently implements WMC and the simpler size metrics (NOM, NPA, NPM), with the inheritance- and coupling-based ones tracked for future work.
WMC is the sum of the cyclomatic complexity of every method defined in a class. The original paper deliberately left the "weighting" abstract — Chidamber and Kemerer wrote that "if all method complexities are considered to be unity, then WMC = n, the number of methods" — but the empirical follow-up literature has almost universally settled on cyclomatic complexity as the weight, and that is what big-code-analysis uses.
Algorithm
For each class or interface found by the per-language parser,
big-code-analysis sums the standard cyclomatic complexity of every
method body inside it (src/metrics/wmc.rs). The file-level
serialised output is three fields: classes (sum of WMC across
all classes in the file), interfaces (sum across interfaces),
and total (the two combined). No min/max/average aggregation is
emitted at the file scope — to rank individual classes by WMC, use
the report subcommand, which surfaces a WMC hotspots section
(see Commands → Report).
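As a rough illustration of the aggregation described above, the sketch below sums per-method cyclomatic values into a per-class WMC and then into the three file-level fields. It is not the code in src/metrics/wmc.rs; the TypeSpace record and the sample numbers are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class TypeSpace:
    """Hypothetical stand-in for one class or interface found in a file."""
    name: str
    kind: str  # "class" or "interface"
    method_cyclomatic: list = field(default_factory=list)  # one value per method


def wmc(space: TypeSpace) -> int:
    """WMC of one type: the sum of its methods' cyclomatic complexity."""
    return sum(space.method_cyclomatic)


def file_level_wmc(spaces: list) -> dict:
    """Collapse per-type WMC into the classes / interfaces / total fields."""
    classes = sum(wmc(s) for s in spaces if s.kind == "class")
    interfaces = sum(wmc(s) for s in spaces if s.kind == "interface")
    return {"classes": classes, "interfaces": interfaces, "total": classes + interfaces}


spaces = [
    TypeSpace("Parser", "class", [1, 4, 7]),
    TypeSpace("Lexer", "class", [2, 2]),
    TypeSpace("Visitor", "interface", [1, 1, 1]),
]
summary = file_level_wmc(spaces)
print(summary)
```

Because the file-level fields are sums, a single very complex class and many small ones can produce the same number; ranking individual classes needs the per-class values.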
How to read it
Chidamber and Kemerer offered three hypotheses about WMC, all of which have been validated repeatedly since:
- Higher WMC predicts higher maintenance effort. A class whose methods are individually complex will resist comprehension.
- Higher WMC reduces reuse. Classes that do many complicated things are hard to drop into a new context.
- Higher WMC suggests broader application-specific behaviour. Such classes tend to be "main loop"-style coordinators rather than reusable building blocks.
The emergent use is God-class detection: combined with NOM, WMC is one of the clearest signals that a class needs to be split. A class with high NOM but low WMC is a passive data holder (probably fine). A class with low NOM and high WMC has a few gargantuan methods (split the methods, not the class). A class with both high NOM and high WMC is the classic God class.
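The three cases above can be expressed as a small decision rule. This is only a sketch: the function name and both cut-off values are assumptions chosen for illustration, not thresholds shipped by big-code-analysis.

```python
def classify_class(nom: int, wmc: int, nom_limit: int = 20, wmc_limit: int = 50) -> str:
    """Label a class from its method count (NOM) and WMC.

    nom_limit and wmc_limit are illustrative cut-offs, not tool defaults.
    """
    many_methods = nom > nom_limit
    heavy_methods = wmc > wmc_limit
    if many_methods and heavy_methods:
        return "god class: consider splitting the class"
    if heavy_methods:
        return "few heavy methods: split the methods, not the class"
    if many_methods:
        return "passive data holder: probably fine"
    return "unremarkable"


print(classify_class(nom=30, wmc=32))
print(classify_class(nom=4, wmc=90))
print(classify_class(nom=45, wmc=210))
```

In practice the cut-offs would be tuned per language and per codebase.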
Where to go next
- The Supported Languages chapter lists which metrics fire for which languages; coverage varies because some metric definitions (NPA, NPM, WMC) only make sense in languages with classes.
- The Commands → Metrics page documents how to invoke bca metrics to produce the JSON / YAML / TOML / CBOR output for any of these numbers.
- The Recipes chapter shows end-to-end examples of producing quality reports from these metrics, including pipelining them into dashboards.
Migration: Flag CLI to Subcommand CLI
The CLI was restructured from a flat flag-style interface (one process,
many mutually-exclusive --action flags) into a subcommand-style
interface (bca <verb>). This page maps every old invocation to its
replacement.
Why the change
The flag CLI overloaded --output-format with two unrelated meanings:
per-file serialization (-O json/yaml/toml/cbor) and a post-walk
aggregated report (-O markdown). It needed two clap ArgGroups plus
runtime checks to police invalid combinations, and --top /
--strip-prefix lived as global flags that only applied to one format.
Future aggregated formats (e.g. HTML) would compound the fragility.
The subcommand CLI fixes the structure: bca metrics and bca ops emit
per-file output; bca report <FORMAT> emits an aggregated report; each
verb has its own scoped flag set.
Migration mapping
| Old | New |
|---|---|
| --metrics -O markdown (+ --top, --strip-prefix) | report markdown |
| --metrics -O json/yaml/toml/cbor | metrics -O json/yaml/toml/cbor |
| --metrics -O checkstyle/sarif/code-climate/clang-warning/msvc-warning | check --threshold ... --output-format <fmt> [--output FILE] |
| --ops -O ... | ops -O ... |
| --dump | dump |
| --find <NODE> | find <NODE> [<NODE>...] |
| --count <LIST> | count <NODE> [<NODE>...] |
| --function | functions |
| --comments [--in-place] | strip-comments [--in-place] |
| --preproc <FILE> <FILE>... (producer) | preproc -o <OUT> |
| --preproc <FILE> (consumer) | --preproc-data <FILE> (global) |
| --list-metrics [MODE] | list-metrics [MODE] |
| --pr (pretty) | --pretty (on metrics and ops) |
| -p, -I, -X, -j, -l, --ls, --le, -w | unchanged; global |
Side-by-side examples
Aggregated markdown report
# OLD
big-code-analysis-cli \
--metrics \
--paths "$PWD" \
--output-format markdown \
--num-jobs $(nproc) \
--top 20 \
--strip-prefix "$PWD/"
# NEW
bca \
--paths "$PWD" \
--num-jobs $(nproc) \
report markdown \
--top 20 \
--strip-prefix "$PWD/"
Per-file metric extraction
# OLD
big-code-analysis-cli --metrics --paths ./src --output-format json --output ./out/
# NEW
bca --paths ./src metrics -O json --output ./out/
Per-file ops extraction
# OLD: big-code-analysis-cli --ops --paths ./src -O json -o ./out/
# NEW: bca --paths ./src ops -O json -o ./out/
AST dump
# OLD: big-code-analysis-cli --dump --paths ./file.rs
# NEW: bca --paths ./file.rs dump
Find / count nodes
# OLD: big-code-analysis-cli --find call_expression --paths ./src
# NEW: bca --paths ./src find call_expression
# OLD: big-code-analysis-cli --count if_statement,for_statement --paths ./src
# NEW: bca --paths ./src count if_statement for_statement
Note: count now takes one node type per positional argument (space-separated) rather than one comma-separated string.
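For scripts that still carry the old comma-separated list, a small rewrite helper may ease the transition. The sketch below only builds the new command line as a string; the helper name migrate_count is hypothetical, and the argument order follows the examples on this page.

```python
import shlex


def migrate_count(old_list: str, paths: str) -> str:
    """Turn an old `--count a,b,c` value into the new positional form."""
    # Split on commas and drop empty fragments such as a trailing comma.
    nodes = [node.strip() for node in old_list.split(",") if node.strip()]
    return shlex.join(["bca", "--paths", paths, "count", *nodes])


print(migrate_count("if_statement,for_statement", "./src"))
```

shlex.join quotes any argument that needs it, so paths containing spaces stay safe to paste into a shell.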
Function spans
# OLD: big-code-analysis-cli --function --paths ./src
# NEW: bca --paths ./src functions
Strip comments
# OLD: big-code-analysis-cli --comments --in-place --paths ./src
# NEW: bca --paths ./src strip-comments --in-place
Preproc data — producer
# OLD
big-code-analysis-cli --metrics --preproc a.h --preproc b.h \
--paths ./src -o /tmp/p.json
# NEW
bca --paths ./src preproc -o /tmp/p.json
Preproc data — consumer
# OLD
big-code-analysis-cli --metrics --preproc /tmp/p.json \
--paths ./src -O json -o ./out/
# NEW
bca --paths ./src --preproc-data /tmp/p.json \
metrics -O json -o ./out/
List metrics
# OLD: big-code-analysis-cli --list-metrics descriptions
# NEW: bca list-metrics descriptions
Migration hint at runtime
If you run a legacy invocation, the CLI prints a hint identifying the recognized old flags and their new equivalents before clap's own error. For example:
$ bca --metrics -O markdown
note: the CLI was restructured into subcommands. See migration.md for the full mapping.
--metrics -> bca metrics
-O markdown -> bca report markdown [--top N] [--strip-prefix P]
Run `bca --help` for the new command list.
error: unexpected argument '--metrics' found
Commands
bca offers a range of commands to analyze and extract information from source code. Each command may include parameters specific to the task it performs. Below, we describe the core types of commands available in bca.
Metrics
Metrics provide quantitative measures of source code. They can help with:
- Comparing different programming languages
- Assessing the quality of a codebase
- Showing developers where their code is hardest to work with
- Discovering potential issues early in the development process
big-code-analysis calculates the metrics directly from the source code of a program, without running it. Metrics of this kind are called static metrics.
Nodes
To represent the structure of program code, bca builds an Abstract Syntax Tree (AST). A node is an element of this tree and denotes any syntactic construct present in a language.
Nodes can be used to:
- Create the syntactic structure of a source file
- Discover if a construct of a language is present in the analyzed code
- Count the number of constructs of a certain kind
- Detect errors in the source code
REST API
bca-web runs a server offering a REST API. This allows users to
send source code via HTTP and receive corresponding metrics in JSON
format.
Skipping generated code
Generated bindings (protobuf stubs, OpenAPI clients, lex/yacc output,
build-system plumbing) inflate metrics for code no human will refactor.
By default, bca scans the first ~50 lines / 5 KiB of
each file for a generated-code marker and skips matches before parsing,
so the skipped file pays no tree-sitter parse cost.
Recognized markers (case-insensitive):
- @generated: Facebook / Meta convention; also emitted by buck2, rustfmt, prettier, and many code generators.
- DO NOT EDIT: Go's // Code generated by … DO NOT EDIT. is the canonical form; the bare phrase is also widely copied (Bazel, protoc, OpenAPI clients).
- GENERATED CODE: Lizard's marker, recognized for compatibility.
A marker phrase that appears only deep in the file body (past the scan window) does not trigger the skip — the detector deliberately looks only at the file header.
The skip applies uniformly to bca metrics, bca report, and the
threshold engine.
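The header scan can be pictured with a short sketch. It is not the detector shipped in bca: the function name is hypothetical, and it assumes the two limits combine as "stop at whichever is reached first", which the text above does not spell out.

```python
# Marker phrases from the list above, lower-cased for case-insensitive matching.
MARKERS = ("@generated", "do not edit", "generated code")
MAX_LINES = 50          # approximate line window described above
MAX_BYTES = 5 * 1024    # 5 KiB byte window described above


def looks_generated(data: bytes) -> bool:
    """Return True when a marker appears inside the header window."""
    header = data[:MAX_BYTES]                     # byte limit first
    lines = header.splitlines()[:MAX_LINES]       # then the line limit
    text = b"\n".join(lines).decode("utf-8", errors="replace").lower()
    return any(marker in text for marker in MARKERS)


go_stub = b"// Code generated by protoc-gen-go. DO NOT EDIT.\npackage pb\n"
deep_marker = b"x\n" * 199 + b"// @generated\n"   # marker on line 200
print(looks_generated(go_stub), looks_generated(deep_marker))
```

Scanning only a bounded prefix keeps the check cheap and is what lets a skipped file avoid the parse cost entirely.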
Flags
- --no-skip-generated: disable the auto-skip and restore the previous behavior (every file is parsed).
- --report-skipped: log skipped (generated): <path> to stderr for each file the detector excludes, so you can audit the exclusions and add an explicit include if a file was wrongly tagged.
Respecting .gitignore
When a directory is passed to --paths, bca walks
it with .gitignore awareness by default. Files matched by any of the
following are skipped before parsing:
- .gitignore files inside the walked tree.
- .ignore files (the ripgrep / fd convention).
- .git/info/exclude.
- The global gitignore (~/.config/git/ignore, or whatever core.excludesFile points at).
- .gitignore files in ancestor directories of the seed (so bca --paths src/ from a project root picks up the project's top-level .gitignore).
The walker honors .gitignore even outside a checked-in git
repository, so an extracted source tarball with a .gitignore file
gets the same treatment as a fresh git clone.
Hidden files (those whose basename starts with .) are filtered
during the walk, matching the previous behavior.
Explicit paths bypass the filter
Files passed by name — via --paths or --paths-from — are always
analyzed, even when they would be excluded by .gitignore. This makes
it safe to do bca metrics --paths-from - from git diff --name-only-style pipelines without losing files that happen to be
covered by a wildcard ignore rule.
Path discovery flags
- --no-ignore: disable .gitignore / .ignore / global-gitignore awareness when expanding directory seeds.
- --paths-from <FILE>: read newline-separated input paths from <FILE>, or from stdin when <FILE> is -. Combined as a union with any --paths values; -I / -X globs still apply. Blank lines are skipped; # is treated as a path character (not a comment). To pass a file literally named -, write ./-.
- --exclude-from <FILE>: read newline-separated --exclude glob patterns from <FILE>, or from stdin when <FILE> is -. Patterns are unioned with any inline --exclude / -X values into a single deny-set; order does not matter. .gitignore-style: blank lines and lines whose first non-whitespace character is # are skipped, and a leading UTF-8 BOM is stripped. Convention is a .bcaignore at the repo root, mirroring .gitignore / .dockerignore. To pass a file literally named -, write ./-.
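The parsing rules for an --exclude-from file are simple enough to sketch. The code below is an illustration, not the bca implementation: the function names are hypothetical, and trimming surrounding whitespace from each pattern is an assumption the text does not confirm.

```python
def parse_exclude_patterns(text: str) -> list:
    """Extract glob patterns from .bcaignore-style text."""
    if text.startswith("\ufeff"):        # strip a leading UTF-8 BOM
        text = text[1:]
    patterns = []
    for line in text.splitlines():
        stripped = line.strip()          # assumption: surrounding whitespace is dropped
        if not stripped or stripped.startswith("#"):
            continue                     # blank line or comment
        patterns.append(stripped)
    return patterns


def deny_set(inline_excludes: list, file_text: str) -> set:
    """Union inline --exclude values with file patterns; order is irrelevant."""
    return set(inline_excludes) | set(parse_exclude_patterns(file_text))


sample = "\ufeff# vendored code\n\nvendor/**\n   # indented comment\n*.pb.go\n"
print(parse_exclude_patterns(sample))
print(sorted(deny_set(["target/**"], sample)))
```

Note the contrast with --paths-from, where # is an ordinary path character rather than a comment marker.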
Metrics
bca metrics computes per-file metrics and emits them either to stdout
or to a directory of structured files.
Migrating? This command replaces the pre-restructure --metrics flag. The aggregated report previously selected with -O markdown now lives under bca report, and the CI/IDE offender formats (Checkstyle, SARIF, code-climate, clang-warning, msvc-warning) moved to bca check --output-format <fmt>. See the migration guide.
Display metrics
To compute and display metrics for a given file or directory, run:
bca --paths /path/to/your/file/or/directory metrics
- --paths (or -p): file or directory to analyze. If a directory is provided, metrics are computed for every supported file it contains.
Exporting metrics
bca metrics supports five per-file output formats:
- CBOR
- CSV
- JSON
- TOML
- YAML
Both JSON and TOML output can be pretty-printed.
The three top-level output kinds map to three separate commands so each one stays consistent with its data model:
| Command | Output | Audience |
|---|---|---|
| bca metrics | Per-file metric trees | Downstream tooling |
| bca report | Aggregated quality dashboards | Humans / PRs |
| bca check | Threshold-violation reports | CI / IDE |
The CI/IDE offender formats (Checkstyle, SARIF, code-climate,
clang-warning, msvc-warning) used to live on bca metrics -O <fmt>.
They moved to
bca check --output-format <fmt> in #235 because their input is a
list of threshold violations, not the per-file metric tree that the
other formats above carry. See the
bca check chapter for the
new invocation.
Export command
To export metrics as JSON files:
bca --paths /path/to/your/file/or/directory metrics \
-O json -o /path/to/output/directory
- -O, --output-format: per-file output format (cbor, csv, json, toml, yaml).
- -o, --output: directory to save output files. Filenames mirror the input file plus the format extension. If omitted, results are printed to stdout. CBOR is binary and therefore requires -o.
CSV (spreadsheets and Pandas)
bca --paths /path/to/your/code metrics \
-O csv -o csv-output
The CSV writer emits one row per FuncSpace (function, class,
struct, unit, etc.) with the entire metric matrix as columns. Header
order is fixed — see CSV_HEADER in
src/output/csv.rs
for the canonical list. Identity columns come first
(path, space_name, space_kind, start_line, end_line)
followed by every leaf metric using the same dotted JSON-style names
(loc.lloc, halstead.volume, cyclomatic.modified.average, etc.)
so a single column name addresses the metric in both CSV and JSON.
Empty cells (no value, not 0) signal "not applicable for this
space" — for example, the OOP-only metrics (wmc.*, npm.*,
npa.*) appear empty for procedural code. RFC 4180 quoting is
delegated to the csv crate, so paths and names containing commas,
quotes, or newlines round-trip cleanly.
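When loading the CSV downstream, the distinction between an empty cell and 0 is easy to lose. The sketch below reads a tiny hand-written sample with Python's csv module and maps empty cells to None. The sample rows are invented, and the wmc.total column name is an assumption; check CSV_HEADER for the real column list.

```python
import csv
import io

# Hand-written sample in the shape described above; values are invented.
SAMPLE = (
    "path,space_name,space_kind,start_line,end_line,loc.lloc,wmc.total\n"
    "src/a.rs,parse,function,10,42,18,\n"
    "src/b.java,Parser,class,1,200,120,37\n"
)


def to_number(cell: str):
    """Empty cell means 'not applicable'; keep it distinct from 0."""
    return None if cell == "" else float(cell)


rows = list(csv.DictReader(io.StringIO(SAMPLE)))
wmc_values = [to_number(row["wmc.total"]) for row in rows]
lloc_values = [to_number(row["loc.lloc"]) for row in rows]
print(wmc_values, lloc_values)
```

Averaging over the non-None values only avoids dragging an OOP metric towards zero just because a file contains free functions.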
Stream the result to a single file by redirecting stdout:
bca --paths /path/to/your/code metrics -O csv \
> metrics.csv
CSV is a per-file format; with --output <dir> each input file
produces a <input>.csv mirror under the output directory.
An aggregated HTML report covering the whole walk is available via
bca report html. The previous per-file bca metrics -O html writer was removed because it degraded to an unopenable single-file table on real-world repos; CSV is the right shape for flat per-FuncSpace rows.
Pretty print
bca --paths /path/to/your/file/or/directory metrics \
--pretty -O json
Excluding inline test code
bca --paths /path/to/your/code --exclude-tests metrics
By default, every node in the AST is counted, including inline test
items. Rust files following the idiomatic
#[cfg(test)] mod tests { ... } layout therefore have headline
metrics that mix production and test code together.
Pass --exclude-tests to elide test-only subtrees before any metric
is computed. The flag is recognised by every subcommand that walks
the AST (metrics, report, check), and currently understands the
following Rust attribute shapes:
- #[test] and #[rstest] / #[test_case] / #[wasm_bindgen_test]
- #[cfg(test)], #[cfg(all(test, ...))], #[cfg(any(test, ...))]
- #[tokio::test], #[async_std::test], #[test_log::test], … (any path ending in ::test)
- #![cfg(test)] on mod items (inner attribute form)
Languages without a Checker::should_skip_subtree override simply
ignore the flag — only Rust applies the pruning today. The default
remains off so existing metric numbers stay byte-identical for users
who do not opt in.
Aggregated report
For a comprehensive, human-readable quality report, use
bca report markdown. That command aggregates metrics
across all analyzed files and produces per-language hotspot tables.
Listing available metrics
Tooling that drives the CLI can discover the metric catalog at runtime instead of hard-coding it:
bca list-metrics
prints metric names one per line. Pass descriptions for a one-line
summary of each metric:
bca list-metrics descriptions
Report
bca report <FORMAT> produces an aggregated quality-metrics report
across every file walked. It is designed for pasting into pull
requests, wikis, or issue trackers.
CI integration. For runnable GitHub Actions and GitLab CI recipes that post the Markdown report as a PR/MR comment, see the CI integration recipe.
Two formats are available: markdown (plain-text, ideal for PR
comments) and html (a self-contained dashboard with sortable tables,
ideal for sharing as a build artifact).
Migrating? This command replaces the pre-restructure --metrics -O markdown invocation. See the migration guide.
Quick start
Print to stdout:
bca --paths /path/to/project report markdown
Write to a file:
bca --paths /path/to/project report markdown --output report.md
Note: --output must be a file path, not a directory.
Flags
| Flag | Default | Description |
|---|---|---|
| --top N | 20 | Maximum entries per hotspot table. |
| --strip-prefix PATH | (empty) | Prefix removed from file paths. |
| -o, --output FILE | (stdout) | Output file. Parent directory must exist. |
Examples
Show only the five worst hotspots per section:
bca -p src/ report markdown --top 5
Strip the workspace root from displayed paths:
bca -p /home/user/project report markdown \
--strip-prefix /home/user/project/
A typical daily-driver invocation:
bca \
--paths "$PWD" \
--num-jobs $(nproc) \
report markdown \
--top 20 \
--strip-prefix "$PWD/"
Report structure
A generated report contains the following sections (each section is
omitted when no data exists for it). Every hotspot table includes a
Tokens column (Lizard-style leaf-token count, comments excluded)
alongside SLOC so two complementary size proxies are visible per row.
- Project summary — files analyzed, languages, total SLOC / PLOC / comment counts, function and class counts, comment ratio.
- Per-language overview table — one row per language with file count, SLOC, function count, average Maintainability Index (MI), average Cyclomatic Complexity (CC), and average Cognitive Complexity.
- Per-language hotspot sections (repeated for each language):
- Summary — file count, SLOC, PLOC, comment ratio, average MI with a GOOD / MODERATE / LOW rating.
- Maintainability Index (lowest files) — files sorted ascending by MI.
- Cyclomatic Complexity Hotspots — functions sorted descending by CC, with summary statistics (average, max, counts above 10 and 20).
- Cognitive Complexity Hotspots — functions sorted descending by cognitive complexity.
- Halstead Effort Hotspots — functions sorted descending by Halstead effort, including volume and estimated bugs.
- Largest Functions by SLOC — functions sorted descending by source lines of code.
- Functions With Many Parameters (>3) — functions with more than three parameters, sorted descending.
- Actionable Summary — counts of functions exceeding common thresholds (CC > 10, cognitive > 15, SLOC > 100, args > 3, Halstead bugs > 1).
- Class/Trait/Impl Hotspots (WMC) — classes sorted descending by Weighted Methods per Class, with NOM, NPA, and NPM.
- Functions with the most exit points (NEXITS) — sorted descending by exit count.
- ABC Magnitude Hotspots — functions sorted descending by ABC metric magnitude.
HTML format
bca report html emits a single self-contained HTML page covering the
same sections as the Markdown report. It is designed to be served as a
static artifact: inline CSS, inline vanilla JavaScript for click-to-sort
on every hotspot table, and zero external dependencies (no CDN, no
fonts, no template engine). The page renders identically offline.
Write it to a file and open in any browser:
bca --paths /path/to/project \
report html --top 10 --output report.html
Click any column header to sort that table in ascending order; click again to switch to descending. Each table sorts independently. Empty cells (where a metric was not measured) sort as if they were positive infinity, which keeps "no data" rows out of the visible top of a hotspot.
Hover (or keyboard-focus, where the browser supports it) any metric
column header — SLOC, MI, CC, ABC, WMC, NPA, NPM,
Exits, etc. — for a one-sentence plain-English explanation of the
metric. The tooltip is delivered through the native HTML title
attribute, so it works offline with no JavaScript.
Every interpolated string — function name, file path, language label — is HTML-escaped on the way out, so a crafted source path or symbol name cannot inject markup or break out of an attribute value.
Each per-language <section> carries a stable lang-<name> class
(e.g. lang-rust, lang-python) styled with a low-alpha background
tint and matching left border so a multi-language report's section
boundaries are obvious at a glance. Languages without an explicit
palette entry fall back to a neutral lang-other tint, and a
prefers-color-scheme: dark adapter raises the alpha so contrast
holds in both themes.
Metric values of zero
A metric value of 0 in the report means the metric was not measured for that item (e.g. Halstead metrics on an empty function). Sections whose entries are all zero are omitted entirely.
Check
bca check evaluates per-function metrics against thresholds and exits
non-zero when any function exceeds a limit. It is the CI integration
point: wire it into a build step and a regression in code complexity
fails the pipeline before the change lands.
Looking for full CI recipes? The CI integration recipe consolidates the
--output-format matrix, runnable GitHub Actions and .gitlab-ci.yml examples, the baseline / ratchet pattern, and the GitLab Code Quality path. This page documents the command itself; the recipe documents how to wire it into a pipeline.
Exit codes
| Code | Meaning |
|---|---|
| 0 | All functions within thresholds (or --no-fail set). |
| 2 | At least one threshold exceeded. |
| 1 | Tool error (bad arguments, unreadable config, unknown metric). |
1 is reserved so CI can distinguish a regression (2) from a tool
misconfiguration (1).
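A wrapper script can lean on this contract to react differently to a regression and to a broken setup. The sketch below is one possible shape: interpret_exit and run_check are hypothetical helper names, and run_check assumes a bca binary is on PATH.

```python
import subprocess


def interpret_exit(code: int) -> str:
    """Map a `bca check` exit code onto the contract in the table above."""
    if code == 0:
        return "pass"
    if code == 2:
        return "regression"      # at least one threshold exceeded
    if code == 1:
        return "tool-error"      # bad arguments, unreadable config, unknown metric
    return "unknown"             # not part of the documented contract


def run_check(argv: list) -> str:
    """Run bca with the given arguments (assumes `bca` is installed)."""
    completed = subprocess.run(["bca", *argv])
    return interpret_exit(completed.returncode)


print([interpret_exit(code) for code in (0, 1, 2)])
```

A pipeline might, for example, retry or page an operator on "tool-error" while simply failing the merge request on "regression".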
Declaring thresholds
Pass --threshold <metric>=<limit> once per metric (repeatable). Metric
names match bca list-metrics; sub-metrics use a dotted form. 0 is a
valid limit and means "no value permitted".
bca --paths src/ check \
--threshold cyclomatic=15 \
--threshold cognitive=20 \
--threshold loc.lloc=200
Or pull thresholds from a TOML config (one place to keep CI thresholds versioned alongside the code):
# bca-thresholds.toml
[thresholds]
cyclomatic = 15
cognitive = 20
"loc.lloc" = 200
"halstead.volume" = 1000
bca --paths src/ check --config bca-thresholds.toml
CLI flags override values from --config for the same metric name, so
you can keep a project-wide default and tighten a single metric for a
specific run.
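The precedence rule amounts to a dictionary merge in which command-line values win. The sketch below illustrates that reading; it is not bca's own parser, the helper names are hypothetical, and the config is passed in as an already-loaded dict rather than read from TOML.

```python
def parse_threshold(arg: str):
    """Split one `<metric>=<limit>` argument into (name, limit)."""
    name, sep, limit = arg.partition("=")
    if not sep or not name:
        raise ValueError(f"expected <metric>=<limit>, got {arg!r}")
    return name, float(limit)


def effective_thresholds(config: dict, cli_args: list) -> dict:
    """Start from the config values, then let CLI flags override by name."""
    merged = dict(config)
    for arg in cli_args:
        name, limit = parse_threshold(arg)
        merged[name] = limit
    return merged


config = {"cyclomatic": 15, "cognitive": 20, "loc.lloc": 200}
print(effective_thresholds(config, ["cyclomatic=10"]))
```

Metrics named only on the command line are simply added to the set, so a one-off run can also check something the project config does not mention.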
Accepted metric names
Top-level scalar metrics use their list-metrics names directly:
cognitive, cyclomatic, nargs, nexits, nom, tokens, abc,
wmc, npm, npa. Metric suites with multiple sub-fields use a dotted
form:
| Metric | Accepted threshold names |
|---|---|
| Cyclomatic | cyclomatic, cyclomatic.modified |
| Halstead | halstead.volume, halstead.difficulty, halstead.effort, halstead.time, halstead.bugs |
| Lines of code | loc.sloc, loc.ploc, loc.lloc, loc.cloc, loc.blank |
| Maintainability Index | mi.original, mi.sei, mi.visual_studio |
An unknown threshold name is a tool error (exit 1), not silently
ignored.
Offender output
Every offending (function, metric) pair prints one line to stderr in
this stable format:
<path>:<start_line>-<end_line>: <function_name>: <metric> = <value> (limit <limit>)
For example:
src/parser.rs:42-117: parse_expression: cyclomatic = 22 (limit 15)
src/parser.rs:42-117: parse_expression: cognitive = 31 (limit 20)
Lines are sorted by path, then start line, then metric name, so output is deterministic across runs over the same tree.
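Because the line format is stable, it can be parsed with a single regular expression. The sketch below is one way to do that in a post-processing script; the pattern and the parse_offender name are illustrative, not part of bca.

```python
import re

# Mirrors: <path>:<start>-<end>: <function>: <metric> = <value> (limit <limit>)
OFFENDER = re.compile(
    r"^(?P<path>.+?):(?P<start_line>\d+)-(?P<end_line>\d+): "
    r"(?P<function>.+?): (?P<metric>[\w.]+) = (?P<value>\S+) "
    r"\(limit (?P<limit>\S+)\)$"
)


def parse_offender(line: str):
    """Return the fields of one offender line, or None if it does not match."""
    match = OFFENDER.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["start_line"] = int(fields["start_line"])
    fields["end_line"] = int(fields["end_line"])
    fields["value"] = float(fields["value"])
    fields["limit"] = float(fields["limit"])
    return fields


print(parse_offender("src/parser.rs:42-117: parse_expression: cyclomatic = 22 (limit 15)"))
```

For anything beyond quick scripting, the structured --output-format documents are likely a sturdier input than scraping stderr.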
Silencing violations with suppression markers
In-source comments can silence threshold violations on individual
functions or whole files without editing the offending code or
excluding it from the walk. The native dialect is bca: suppress /
bca: suppress-file; Lizard's #lizard forgives is recognized as a
compatibility shim. See Suppression markers for
the full reference and the --no-suppress CI-audit flag.
Baselines
When you adopt thresholds on an existing codebase you typically face a binary choice between "raise the limit until nothing fires" and "fix every offender before turning the gate on". A baseline file is the ratchet-down alternative: record today's offenders, fail only on regressions and new offenders, and shrink the file over time as the team pays down debt.
Baselines are complementary to the suppression markers from
Suppression markers, not a substitute. Suppressions
express "this function is intentionally exempt forever" and live in
source; baselines express "this is tech debt we're paying down" and
live in a committed TOML file. bca check honors suppressions first
and applies the baseline filter to whatever remains.
Writing a baseline
bca --paths src/ check \
--config bca-thresholds.toml \
--write-baseline .bca-baseline.toml
This walks the tree, captures every threshold violation that would
otherwise fail the check, and writes them to the file as sorted TOML.
The run exits 0 regardless of offender count — the point is to
capture them.
# bca baseline file. Generated by `bca check --write-baseline`.
# Listed offenders are filtered from threshold checks; a function that
# gets worse than its recorded value still fails. Refresh with
# `--write-baseline` when entries become stale.
version = 1
[[entry]]
path = "src/parser.rs"
function = "parse_expression"
start_line = 42
metric = "cyclomatic"
value = 22.0
Functions already covered by an in-source suppression marker are
excluded. Pass --no-suppress together with --write-baseline to
record every violation (CI-auditor flow).
--write-baseline cannot be combined with --baseline,
--output-format, or --output — the baseline file is the output.
Reading a baseline
bca --paths src/ check \
--config bca-thresholds.toml \
--baseline .bca-baseline.toml
A violation is suppressed when both conditions hold:
- An entry exists at (path, function, start_line, metric).
- The current value is less than or equal to the recorded value.
A function that gets worse than its baseline value still fails. New
offenders not listed in the baseline still fail. Improvements pass
silently (the entry remains at its older, higher value until the next
--write-baseline refresh).
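The matching rule can be stated in a few lines of code. The sketch below models the baseline as a dict keyed by (path, function, start_line, metric); that in-memory shape and the function name are assumptions for illustration, not the tool's internals.

```python
def is_suppressed(violation: dict, baseline: dict) -> bool:
    """True when the baseline lists this offender at an equal or higher value."""
    key = (
        violation["path"],
        violation["function"],
        violation["start_line"],
        violation["metric"],
    )
    recorded = baseline.get(key)
    return recorded is not None and violation["value"] <= recorded


baseline = {("src/parser.rs", "parse_expression", 42, "cyclomatic"): 22.0}
current = {
    "path": "src/parser.rs",
    "function": "parse_expression",
    "start_line": 42,
    "metric": "cyclomatic",
    "value": 22.0,
}
print(is_suppressed(current, baseline))                        # unchanged offender
print(is_suppressed({**current, "value": 23.0}, baseline))     # got worse
print(is_suppressed({**current, "start_line": 43}, baseline))  # key no longer matches
```

The third call shows why moving a function by even one line surfaces it as a new offender until the baseline is refreshed.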
A baseline file that does not exist, is empty, has a missing or
unsupported version, or fails to parse is a tool error (exit 1),
not a silent zero-match.
Limitations
- Line drift. The entry key is (path, function, start_line, metric). Inserting code above a function shifts its start_line and the entry stops matching, surfacing as a "new" offender. Run --write-baseline to refresh and commit the diff.
- Path identity. Entries record the path as the walker saw it. Generate and consume the baseline with the same --paths argument from the same working directory; a relative --paths src/ and an absolute --paths /repo/src/ do not match each other.
- OS portability. Paths are stored with forward slashes so a baseline written on one OS matches the same tree on another. Paths that are not valid UTF-8 fall back to a lossy display form (U+FFFD substitution) and may not round-trip exactly.
See the Baselines recipe for the end-to-end adoption flow and CI integration patterns.
Reporting without failing
--no-fail prints offenders to stderr but exits 0. Useful while
adopting baselines without flipping CI red. Other CI tools call this
behavior --report-only or --soft-fail; here the flag is spelled
--no-fail.
bca --paths src/ check \
--config bca-thresholds.toml --no-fail
CI example (GitHub Actions)
- name: Check code complexity thresholds
run: |
bca --paths src/ check --config bca-thresholds.toml
# The default behavior — non-zero exit fails the step — is exactly
# what we want here. No extra wiring needed.
If you want to keep the job green and surface offenders as a build
annotation while you reduce the count, swap in --no-fail:
- name: Surface complexity hot spots (non-blocking)
run: |
bca --paths src/ check \
--config bca-thresholds.toml --no-fail
Exporting offender records
bca check also emits a single CI/IDE document covering every
offender in the walk. Pass --output-format <fmt> to pick the shape
and --output <file> to write it to disk (stdout if omitted). The
exit-code contract is unaffected by these flags: 0 clean, 2 on any
violation (unless --no-fail), 1 on tool error.
| Format | Audience |
|---|---|
| checkstyle | Jenkins, SonarQube, GitLab, "warnings plugin" CI |
| sarif | GitHub Code Scanning, modern IDEs / security tooling |
| code-climate | GitLab MR Code Quality widget |
| clang-warning | Editor quickfix parsers, GitHub Actions problem matcher |
| msvc-warning | Visual Studio, VS Code, Windows CI runners |
When no offenders exist the writer emits a well-formed but empty
document — empty runs[].results array for SARIF, empty JSON array
([]) for Code Climate, no <file> children under the
<checkstyle> root for Checkstyle, and zero bytes for the two
warning-line formats — so CI consumers can ingest clean runs
unchanged.
Checkstyle (CI integration)
bca --paths src/ check \
--threshold cyclomatic=15 \
--output-format checkstyle \
--output report.checkstyle.xml
The Checkstyle writer emits a single <checkstyle version="4.3">
document containing one <file> element per source path, each
holding one <error> per metric-threshold violation. The schema is
the Checkstyle 4.3 XSD that Jenkins and SonarQube's "Warnings Next
Generation" / "Generic Issue" importers consume directly.
SARIF (GitHub Code Scanning)
bca --paths src/ check \
--threshold cyclomatic=15 \
--output-format sarif \
--output report.sarif.json
The SARIF writer emits a single SARIF 2.1.0 JSON document with one
runs[] element. Each metric-threshold violation becomes a result
under runs[0].results[]; the metric names appearing in the run are
deduplicated into runs[0].tool.driver.rules[] with short
descriptions.
To upload a SARIF file to GitHub Code Scanning from a workflow:
name: bca-sarif
on: [push, pull_request]
jobs:
scan:
runs-on: ubuntu-latest
permissions:
security-events: write
steps:
- uses: actions/checkout@v4
- name: Run big-code-analysis
run: |
bca --paths . check \
--config bca-thresholds.toml \
--output-format sarif \
--output report.sarif.json \
--no-fail
- name: Upload SARIF
uses: github/codeql-action/upload-sarif@v3
with:
sarif_file: report.sarif.json
--no-fail keeps the job green so the SARIF upload step still runs
when offenders exist; remove it once you want a metric regression to
fail the workflow.
GitLab Code Quality (Code Climate JSON)
bca --paths src/ check \
--threshold cyclomatic=15 \
--output-format code-climate \
--output gl-code-quality-report.json
The Code Climate writer emits a single JSON array of issue objects
matching GitLab's strict subset
of the upstream Code Climate engine spec — one entry per
metric-threshold violation, no byte-order-mark, one trailing
newline (empty input renders as []\n). Each issue carries a
namespaced check_name (big-code-analysis/<metric>), a stable
SHA-256 fingerprint over path \0 function \0 metric (line- and
value-insensitive so cosmetic edits still dedup in the MR widget),
and a severity mapped from the value/threshold ratio onto
GitLab's five-level enum: ≤ 1.5× → minor, ≤ 2× → major,
≤ 4× → critical, > 4× → blocker (inverted for the mi.*
family where lower is worse). The full enum is
info/minor/major/critical/blocker; bca never emits
info — a threshold violation always lands at minor or higher.
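The severity bands and the fingerprint recipe can be sketched as follows. This is a reading of the description above, not the writer's source: the function names are hypothetical, the mi.* inversion is assumed to mean "use limit / value instead of value / limit", and hex encoding of the digest is also an assumption.

```python
import hashlib


def severity(metric: str, value: float, limit: float) -> str:
    """Map how far a value overshoots its limit onto GitLab's severities."""
    if metric.startswith("mi."):
        # Assumption: for Maintainability Index lower is worse, so invert.
        ratio = limit / value if value > 0 else float("inf")
    else:
        ratio = value / limit if limit > 0 else float("inf")
    if ratio <= 1.5:
        return "minor"
    if ratio <= 2:
        return "major"
    if ratio <= 4:
        return "critical"
    return "blocker"


def fingerprint(path: str, function: str, metric: str) -> str:
    """SHA-256 over path, function, and metric joined by NUL bytes."""
    payload = "\0".join((path, function, metric)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


print(severity("cyclomatic", 22, 15), severity("cyclomatic", 61, 15))
print(fingerprint("src/parser.rs", "parse_expression", "cyclomatic")[:12])
```

Leaving line numbers and values out of the hashed payload is what keeps the fingerprint stable when a function merely moves or its score changes.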
To wire the artifact into GitLab's MR Code Quality widget:
code_quality:
stage: quality
script:
- bca --paths "$CI_PROJECT_DIR" check
--config bca-thresholds.toml
--output-format code-climate
--output gl-code-quality-report.json
--no-fail
artifacts:
when: always
reports:
codequality: gl-code-quality-report.json
paths:
- gl-code-quality-report.json
See the
GitLab Code Quality widget recipe
for the full pipeline (combined Code Climate + Checkstyle + Markdown
report) and a local jq smoke check.
--no-fail keeps the job green so the Code Quality report still
uploads when offenders exist; remove it once you want a metric
regression to fail the pipeline.
Clang/GCC warning lines (editor quickfix and CI annotators)
bca --paths src/ check \
--threshold cyclomatic=15 \
--output-format clang-warning \
--output report.txt
The Clang format emits one offender per line in the conventional compiler-warning shape:
path/to/file.rs:42:5: warning: cyclomatic 17 exceeds limit 15 [big-code-analysis-cyclomatic]
This is the format clang -fdiagnostics-format= produces and the
shape every editor quickfix parser (VS Code, IntelliJ, Vim) and most
CI annotators understand without configuration.
GitHub Actions surfaces the lines as inline annotations on the PR
diff via the built-in GCC problem matcher (or any community
compiler-problem-matchers action):
name: bca-clang-warnings
on: [push, pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Enable GCC problem matcher
        run: echo "::add-matcher::$RUNNER_TOOL_CACHE/problem-matchers/gcc.json"
      - name: Run big-code-analysis
        run: |
          bca --paths . check \
            --config bca-thresholds.toml \
            --output-format clang-warning \
            --no-fail
If your runner does not ship a GCC matcher, fall back to streaming
the lines and re-emitting them as ::warning file=...,line=...::
workflow commands.
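A minimal sketch of that fallback, assuming the warning lines follow the path:line:col: warning: message shape shown earlier. The function name is hypothetical, and the sketch does not escape special characters (such as % or newlines) that GitHub workflow commands treat specially.

```python
import re

# Matches the compiler-warning shape: path:line:col: warning: message
_CLANG_LINE = re.compile(
    r"^(?P<path>.+?):(?P<line>\d+):(?P<col>\d+): warning: (?P<msg>.*)$"
)


def to_workflow_command(line: str):
    """Turn one clang-style warning line into a ::warning workflow command.

    Returns None for lines that do not match the expected shape.
    """
    match = _CLANG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return "::warning file={path},line={line},col={col}::{msg}".format(
        **match.groupdict()
    )
```

In a workflow step, each line of the bca output would be passed through this function and the non-None results printed to stdout, where the runner picks them up as annotations.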
MSVC warning lines (Visual Studio and Windows CI)
bca --paths src/ check \
--threshold cyclomatic=15 \
--output-format msvc-warning \
--output report.txt
The MSVC format emits one offender per line in Visual Studio's
cl.exe diagnostic shape:
path\to\file.rs(42,5): warning : cyclomatic 17 exceeds limit 15
Note the space before the colon after warning/error — that is
the MSVC convention. On Windows the path is normalized to use \
separators (matching cl.exe output); on other platforms the path is
emitted as-is. Visual Studio, VS Code with the C/C++ extension, and
Windows CI runners (Azure Pipelines, GitHub Actions on
windows-latest) parse these inline without extra configuration.
Suppression markers
In-source suppression markers silence threshold violations without
editing the offending function or excluding the file from the walk.
Drop a marker in any comment in the source file and bca check
treats the covered metrics as if they were within limits for that
scope. Metric computation is unaffected — raw bca metrics /
bca report output still reports every number. Suppression is a
threshold-check concern only.
Markers exist for cases where editing the code is not an option:
generated-style legacy modules awaiting rewrite, accepted exceptions
documented in the comment, and migration from
Lizard's #lizard forgives
convention.
Native markers (bca:)
The native dialect uses the bca: namespace and the suppress verb,
matching the project's internal "suppression" vocabulary
(SuppressionPolicy, FuncSpace::suppressed, --no-suppress). Four
forms:
| Marker | Scope | Effect |
|---|---|---|
| bca: suppress | Enclosing function | Suppress every metric |
| bca: suppress(metric, ...) | Enclosing function | Suppress only the listed metrics |
| bca: suppress-file | File | Suppress every metric |
| bca: suppress-file(metric, ...) | File | Suppress only the listed metrics |
A function-scope marker attaches to the innermost FuncSpace
(see the FuncSpace rustdoc)
whose source range contains the comment.
A function-scope marker outside every function body is silently
ignored; for file-wide silencing use the explicit suppress-file verb.
A file-scope marker may appear anywhere in the source — there is no
"must be in first N lines" rule.
bca: suppress — function-scoped, all metrics (Rust)
// bca: suppress
fn legacy_dispatch(opcode: u8) -> Action {
    // dense match on every supported opcode; rewrite tracked in #123
    match opcode { /* ... */ }
}
bca: suppress(metric, ...) — function-scoped, listed metrics (Python)
def parse_token_stream(tokens):
# bca: suppress(cognitive)
# cognitive complexity is intrinsic to this state machine;
# cyclomatic is still bounded.
...
Other thresholds (cyclomatic, halstead, loc, ...) still apply.
bca: suppress-file — file-scoped, all metrics (JavaScript)
// bca: suppress-file
// Hand-tuned hot path; do not rewrite to satisfy thresholds.
function transform(input) { /* ... */ }
function validate(input) { /* ... */ }
bca: suppress-file(metric, ...) — file-scoped, listed metrics (C++)
/* bca: suppress-file(halstead) */
// Halstead volume is inflated by the generated tables below; every
// other metric is still enforced file-wide.
Lizard compatibility markers
Two Lizard-style markers are recognized verbatim so existing Lizard-instrumented codebases need no rewrites:
| Lizard marker | Scope | Equivalent native marker |
|---|---|---|
| #lizard forgives | Enclosing function | bca: suppress |
| #lizard forgive global | File | bca: suppress-file |
The compatibility layer is intentionally narrow: only these two
shapes are accepted. Other Lizard directives parse as ordinary
comments. Lizard offers no per-metric scoping, so the native form's
bca: suppress(metric, ...) list has no Lizard analogue — every
Lizard-style marker silences every metric.
Lizard's GENERATED CODE marker is not handled here; it is part
of the generated-code auto-skip mechanism (see
Skipping generated code and the
--no-skip-generated flag).
Native vs Lizard side by side
| Effect | Native form | Lizard form |
|---|---|---|
| Silence every metric for one function | // bca: suppress | // #lizard forgives |
| Silence one metric for one function | // bca: suppress(cyclomatic) | (no equivalent) |
| Silence every metric for the whole file | // bca: suppress-file | // #lizard forgive global |
| Silence one metric for the whole file | // bca: suppress-file(halstead) | (no equivalent) |
Metric identifiers
The identifiers accepted inside bca: suppress(...) and
bca: suppress-file(...) are:
abc, cognitive, cyclomatic, exit, halstead, loc, mi,
nargs, nom, npa, npm, wmc.
They mostly match the JSON field names emitted on CodeMetrics, with
two deliberate differences:
- exit is the suppression spelling for the threshold name nexits (the JSON field is also nexits): bca: suppress(exit) silences a nexits threshold violation.
- tokens is a threshold-checkable metric (and a CodeMetrics JSON field) but is deliberately absent from the suppression list: a marker cannot turn it off. Treat tokens as a hard resource cap, not a maintainability heuristic.
Silencing a family (for example halstead) covers every sub-metric
threshold under it (halstead.volume, halstead.effort, ...);
suppression vocabulary has no dotted form.
Unknown identifiers in a bca: suppress(...) list emit a stderr warning
of the form
warning: path/to/file.rs:42: unknown metric 'no_such_metric' in bca suppression marker; known metrics: abc, cognitive, ...
The marker is dropped — a typo never silently widens scope to other
metrics. Unknown verbs (anything other than suppress / suppress-file)
and malformed bodies (unbalanced parentheses, trailing garbage)
produce the same shape of warning and are similarly dropped. None of
these are fatal: a typo in one file does not derail a workspace walk.
Where markers may appear
A marker is recognized inside any source comment, regardless of
comment style. The scanner strips the following leading delimiter
characters before matching: /, *, !, #, ;, -, and ASCII
whitespace. That covers every comment shape bca parses today:
- C-family line comments: // bca: suppress
- C-family block comments: /* bca: suppress */
- Rust inner doc comments: //! bca: suppress and /*! bca: suppress */
- Python / shell / Ruby / Perl # comments: # bca: suppress
- Lisp / Lua / SQL line comments: ;; bca: suppress, -- bca: suppress
Function-scope markers attach to the innermost Function-kind
FuncSpace whose (start_line..=end_line) range contains the
comment's line. Markers buried in a class or struct body but outside
every method are silently ignored — for class-wide silencing use
bca: suppress-file or repeat the marker on each method.
File-scope markers are merged into the top-level Unit space and
apply to every function in the file regardless of nesting.
Position the marker near the start of the comment. The scanner trims
delimiter characters from both ends and then expects bca: (or
#lizard) at the very front; markers buried deep in a multi-line
block comment will not be recognized.
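The recognition rules above can be approximated as follows. This is a sketch for the native bca: dialect only, not the scanner's real implementation: the function name and return shape are hypothetical, Lizard markers are left out, and the stderr warnings for dropped markers are not modeled.

```python
import re

KNOWN_METRICS = frozenset({
    "abc", "cognitive", "cyclomatic", "exit", "halstead", "loc",
    "mi", "nargs", "nom", "npa", "npm", "wmc",
})

# Delimiter characters listed in the text, plus ASCII whitespace.
_DELIMS = "/*!#;- \t\r\n"

# The verb must sit at the very front; anything after the optional
# metric list counts as trailing garbage and fails the match.
_MARKER = re.compile(
    r"^bca:\s*(?P<verb>suppress-file|suppress)\s*(?:\((?P<metrics>[^()]*)\))?$"
)


def parse_marker(comment: str):
    """Return (scope, metrics) for a native marker, or None if not recognized.

    scope is "function" or "file"; metrics is None for a bare marker
    (every metric) or a tuple of identifiers for the listed form.
    """
    body = comment.strip(_DELIMS)
    match = _MARKER.match(body)
    if match is None:
        return None
    scope = "file" if match["verb"] == "suppress-file" else "function"
    if match["metrics"] is None:
        return scope, None
    metrics = tuple(name.strip() for name in match["metrics"].split(","))
    if any(name not in KNOWN_METRICS for name in metrics):
        # An unknown identifier drops the marker rather than widening scope.
        return None
    return scope, metrics
```

For example, parse_marker("/* bca: suppress-file(halstead) */") yields ("file", ("halstead",)), while a marker preceded by other prose in the same comment is not recognized.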
--no-suppress (CI auditing)
bca check --no-suppress ignores every suppression marker — native
and Lizard alike — and reports every threshold violation in the
walk. Use it in audit pipelines that need the raw, un-silenced
offender list:
bca --paths src/ check --config bca-thresholds.toml --no-suppress
The flag has no effect on metric values themselves: raw
bca metrics / bca report output already ignores markers, since
suppression is a threshold-check concern only.
JSON output
FuncSpace exposes the merged suppression scope as the optional
suppressed field in its JSON output. When no marker applies to a
space the field is elided so existing snapshot consumers see no
change. When a marker fires the field carries one of two shapes:
{ "suppressed": { "kind": "all" } }
{ "suppressed": { "kind": "some", "metrics": ["cognitive", "loc"] } }
kind: all corresponds to a bare marker (bca: suppress,
bca: suppress-file, or any Lizard-style marker). kind: some carries
the explicit metric list from bca: suppress(...) /
bca: suppress-file(...). Both shapes are stable serialization output
suitable for dashboards and audit logs.
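A dashboard consuming this field might interpret it along these lines. The helper below is a hypothetical sketch: it takes one already-parsed space object as a dict, assumes the metrics list uses the suppression spellings listed under Metric identifiers (so exit rather than nexits), and applies the family rule that a name such as halstead covers halstead.volume.

```python
# Threshold names whose suppression spelling differs (see Metric identifiers).
_SUPPRESSION_SPELLING = {"nexits": "exit"}


def is_suppressed(space: dict, threshold_name: str) -> bool:
    """Report whether the marker on this space covers the given threshold."""
    marker = space.get("suppressed")
    if marker is None:
        # The field is elided when no marker applies.
        return False
    if marker["kind"] == "all":
        return True
    # A dotted threshold such as "halstead.volume" belongs to family "halstead".
    family = threshold_name.split(".", 1)[0]
    family = _SUPPRESSION_SPELLING.get(family, family)
    return family in marker.get("metrics", [])
```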
Migrating from Lizard
The compatibility layer means migration is incremental:
- Existing #lizard forgives and #lizard forgive global markers continue to work with no change. bca check honors them out of the box.
- Rewrite to the native form opportunistically. bca: suppress(...) gives per-metric scoping (the Lizard form silences everything) and is the form future audit-trail features will extend.
The project will keep the Lizard compatibility layer indefinitely; there is no removal date.
Reserved syntax
These shapes are reserved for future use and are not parsed today:
- bca: suppress(metric, reason = "..."), which would carry audit-trail prose alongside the metric list, mirroring Rust's reason = "…" attribute argument.
- bca: suppress-next, which would silence the immediately following declaration rather than the enclosing function.
Authors should avoid using either form today: a reason = "..."
argument is currently parsed as an unknown metric identifier and
discarded with a stderr warning, and bca: suppress-next is rejected
as an unknown verb. Both will be promoted to first-class behavior
in a future release without breaking existing markers.
Nodes
bca provides commands to analyze and extract
information about nodes in the Abstract Syntax Tree (AST) of a
source file.
Migrating? The verbs below replace the pre-restructure flag actions (-d, -f, --count, ...). See the migration guide.
Error detection
To detect syntactic errors in your code, run:
bca -I "*.ext" -p /path/to/your/file/or/directory find ERROR
- -p, --paths: file or directory (analyzes all files when given a directory).
- -I, --include: glob filter for selecting files by extension (e.g. *.js, *.rs). Variadic: put it before -p so the subcommand isn't swallowed as another glob, or use the -I=GLOB single-value form.
- find <NODE>: search for nodes of a specific type (one or more positional names).
Counting nodes
Count occurrences of one or more node types with the count command:
bca -I "*.ext" -p /path/to/your/file/or/directory \
count <NODE_TYPE> [<NODE_TYPE>...]
Printing the AST
To visualize the AST of a source file, use the dump command:
bca -p /path/to/your/file/or/directory dump
Analyzing code portions
To analyze only a specific portion of the code, use the global --ls
(line start) and --le (line end) options. For example, to print the
AST of a single function from line 5 to line 10:
bca -p /path/to/your/file/or/directory --ls 5 --le 10 dump
Listing functions
For a list of every function or method and its line span, use:
bca -p /path/to/your/file/or/directory functions
Rest API
bca-web is a web server that allows users to analyze source code through a REST API. This service is useful for anyone looking to perform code analysis over HTTP.
The server can be run on any host and port, and supports the following main functionalities:
- Remove Comments from source code.
- Retrieve Function Spans for given code.
- Compute Metrics for the provided source code.
Running the Server
To run the server, you can use the following command:
bca-web --host 127.0.0.1 --port 9090
- --host specifies the IP address where the server should run (default is 127.0.0.1).
- --port specifies the port to be used (default is 8080).
- -j specifies the number of parallel jobs (optional).
Endpoints
1. Ping the Server
Use this endpoint to check if the server is running.
Request:
GET http://127.0.0.1:8080/ping
Response:
- Status Code: 200 OK
- Body: empty.
Use curl -sf http://127.0.0.1:8080/ping && echo ok to script a
liveness check — -f makes curl exit non-zero on any HTTP error.
2. Remove Comments
This endpoint removes comments from the provided source code. It
accepts two Content-Type variants. Use application/octet-stream
for raw byte-in / byte-out, and application/json for a JSON
envelope.
Request:
POST http://127.0.0.1:8080/comment
Payload:
{
"id": "unique-id",
"file_name": "filename.ext",
"code": "source code with comments"
}
- id: A unique identifier for the request.
- file_name: The name of the file being analyzed.
- code: The source code with comments.
Response (JSON variant):
{
"id": "unique-id",
"code": [10, 112, 114, 105, 110, 116]
}
The code field is a byte array of the stripped source, not a
string. Decode it with jq -r '.code | implode' (ASCII/UTF-8) or
the equivalent in your client. The application/octet-stream
variant returns the stripped source as the raw response body, which
is simpler for shell pipelines.
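For clients that are not shell pipelines, the byte array can be decoded as in the sketch below. The function name is hypothetical, and the sketch assumes the stripped source is valid UTF-8.

```python
import json


def decode_stripped_source(response_body: str) -> str:
    """Turn the JSON variant's byte array back into source text."""
    payload = json.loads(response_body)
    # "code" is a list of integers in 0..255, one per byte of stripped source.
    return bytes(payload["code"]).decode("utf-8")
```

Applied to the sample response above, the six bytes decode to a newline followed by the word print.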
3. Retrieve Function Spans
This endpoint retrieves the spans of functions in the provided source code.
Request:
POST http://127.0.0.1:8080/function
Payload:
{
"id": "unique-id",
"file_name": "filename.ext",
"code": "source code with functions"
}
- id: A unique identifier for the request.
- file_name: The name of the file being analyzed.
- code: The source code with functions.
Response:
{
"id": "unique-id",
"spans": [
{
"name": "function_name",
"start_line": 1,
"end_line": 10,
"error": false
}
]
}
error is true when the parser flagged the span as malformed
(e.g. unbalanced delimiters inside the function body).
4. Compute Metrics
This endpoint computes various metrics for the provided source code.
Request:
POST http://127.0.0.1:8080/metrics
Payload:
{
"id": "unique-id",
"file_name": "filename.ext",
"code": "source code for metrics",
"unit": false
}
- id: Unique identifier for the request.
- file_name: The filename of the source code file.
- code: The source code to analyze.
- unit: A boolean value. true to compute only top-level metrics, false for detailed metrics across all units (functions, classes, etc.).
Response:
{
"id": "unique-id",
"language": "Rust",
"spaces": {
"metrics": {
"cyclomatic_complexity": 5,
"lines_of_code": 100,
"function_count": 10
}
}
}
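Building the request body with a JSON serializer avoids hand-written syntax slips such as a missing comma. The sketch below uses only the Python standard library; the helper name is hypothetical, and the application/json Content-Type is an assumption carried over from the Remove Comments endpoint.

```python
import json
import urllib.request


def build_metrics_request(base_url, file_name, code, unit=False, request_id="unique-id"):
    """Prepare (but do not send) a POST to the /metrics endpoint."""
    body = json.dumps(
        {"id": request_id, "file_name": file_name, "code": code, "unit": unit}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/metrics",
        data=body,
        # Assumption: the endpoint accepts a JSON envelope.
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Against a running server, urllib.request.urlopen(build_metrics_request("http://127.0.0.1:8080", "main.rs", source)) would send it, and the response body can be parsed with json.load.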
Recipes
Task-oriented examples for getting work done with bca and bca-web.
Each recipe assumes you have built the binaries (cargo build --release) and that bca is on your PATH.
The recipes are grouped by goal:
- Quality reports: generate Markdown reports suitable for pull requests, dashboards, or wikis, including the C/C++ preprocessor-aware workflow.
- CI integration: wire bca check and bca report into GitHub Actions and GitLab CI, including the baseline / ratchet pattern and the Code Quality widget path.
- Local threshold gates: mirror the CI threshold gate on a developer machine with a two-tier (hard + headroom) Makefile / just / pre-commit pattern, so regressions never reach the pull request.
- AST queries: search for syntactic constructs, count node types, dump trees, and detect parse errors.
- Exporting metric data: emit structured output (JSON / YAML / TOML / CBOR) and consume it from shell pipelines.
- Driving the REST API: run the HTTP server and call every endpoint with curl.
If you want a deeper look at any flag the recipes use, see the per-command pages under Commands. For the full list of metrics that show up in these recipes, see Supported Metrics.
Upstream reference. big-code-analysis is a fork of Mozilla's rust-code-analysis. Recipes that work for the upstream rust-code-analysis-cli binary usually translate directly: replace the binary name and adjust for the subcommand restructure documented in the migration guide.
Quality reports
Recipes for producing aggregated, human-readable Markdown reports.
Wiring reports into CI? See the CI integration recipe for runnable GitHub Actions and GitLab CI examples that post the Markdown report as a PR/MR comment and surface threshold violations through the platform's native code quality widgets.
Live example reports
big-code-analysis publishes the output of bca report markdown and
bca report html against its own source tree on every push to main.
Open either to see exactly what the recipes on this page produce on a
multi-language Rust + Python codebase:
- HTML hotspot report (sortable tables, per-language sections): https://dekobon.github.io/big-code-analysis/reports/index.html
- Markdown PR/MR comment (paste-into-issue ready): https://dekobon.github.io/big-code-analysis/reports/report.md
The wiring that produces them lives in
.github/workflows/pages.yml.
The same workflow runs the threshold gate; see
CI integration for the full pipeline
shape.
Generate a project-wide quality report
Run from the project root and write the report to a file:
bca \
--paths "$PWD" \
--num-jobs "$(nproc)" \
report markdown \
--top 20 \
--strip-prefix "$PWD/" \
--output report.md
- --strip-prefix keeps the file paths short and stable across machines; without it every row carries the absolute path of the current checkout.
- --top controls how many rows appear in each hotspot table. 20 is a good default for a PR comment; drop to 5 for a dashboard tile.
- --num-jobs controls parallelism. The walker is CPU-bound on most modern hardware.
Limit the report to specific languages
bca infers language from extension, so the
include/exclude globs do the filtering:
bca \
--include "*.rs" "*.py" \
--paths "$PWD" \
report markdown --output report.md
To exclude vendored or generated trees, layer in --exclude:
bca \
--include "*.rs" \
--exclude "**/target/**" "**/vendor/**" \
--paths "$PWD" \
report markdown
Flag ordering. --include and --exclude accept multiple values and stop only when the next flag begins. Put them before --paths (or any single-value flag) so the subcommand name isn't swallowed as a glob. Equivalent single-value forms with = also work: --include="*.rs" --exclude="**/target/**".
For a stable repo-wide deny-set, keep the patterns in a file at the
repo root (a .bcaignore by convention) and load it with
--exclude-from. Patterns are unioned with any inline --exclude
values; blank lines and #-prefixed comments are skipped:
bca \
--paths . \
--exclude-from .bcaignore \
report markdown --output report.md
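The file-loading rules described above (skip blank lines and #-prefixed comments, then union with inline patterns) can be sketched as follows. Both function names are hypothetical, and trimming surrounding whitespace on each line is an assumption rather than documented behavior.

```python
def load_patterns(text: str) -> list[str]:
    """Extract glob patterns from the contents of an exclude file."""
    patterns = []
    for raw in text.splitlines():
        line = raw.strip()  # Assumption: surrounding whitespace is ignored.
        if not line or line.startswith("#"):
            continue  # Blank lines and comments are skipped.
        patterns.append(line)
    return patterns


def deny_set(file_text: str, inline=()) -> set[str]:
    """Union the file's patterns with any inline --exclude values."""
    return set(load_patterns(file_text)) | set(inline)
```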
Show only the worst offenders
For a quick triage view that highlights the top three problems per section:
bca -p src/ report markdown --top 3
The report still includes every section, but each table is short enough to scan at a glance.
Compare two revisions
Aggregate reports do not diff revisions on their own. Run the report on each side and diff the Markdown:
git worktree add /tmp/before main
bca -p /tmp/before report markdown \
--strip-prefix /tmp/before/ --output /tmp/before.md
bca -p "$PWD" report markdown \
--strip-prefix "$PWD/" --output /tmp/after.md
diff -u /tmp/before.md /tmp/after.md | less
Because both reports use the same --strip-prefix shape, the path
columns line up and the diff is dominated by metric changes rather
than path noise.
C/C++ preprocessor-aware reports
Macro-heavy C/C++ codebases benefit from feeding preprocessor data into the analyzer so that conditional compilation is interpreted the way the compiler sees it. The workflow is two steps:
# 1. Build a preprocessor-data JSON from the headers and sources.
bca \
--paths src/ include/ \
preproc \
--output /tmp/preproc.json
# 2. Run the report (or any other command) with that data attached.
bca \
--paths src/ \
--preproc-data /tmp/preproc.json \
report markdown --output report.md
--preproc-data is a global flag, so it works with metrics, ops,
functions, and the other subcommands as well — anywhere accurate
C/C++ analysis matters.
Analyze only files changed in a PR
Pipe a list of changed files into --paths-from - to score just the
diff, not the whole tree:
git diff --name-only --diff-filter=AM origin/main...HEAD \
| bca --paths-from - metrics -O json -o ./out
- --diff-filter=AM keeps Added and Modified files and drops Deletions; you cannot analyze a file that no longer exists.
- --paths-from - reads newline-separated paths from stdin. A file argument works the same way: --paths-from changed.txt.
- Paths fed in this way are treated as explicit, so they bypass any .gitignore rule that would have hidden them in a directory walk. Combine with -I '*.py' '*.rs' to filter by language.
For a PR-scoped Markdown summary, swap metrics for the report
pipeline:
git diff --name-only --diff-filter=AM origin/main...HEAD \
| bca --paths-from - report markdown \
--top 10 --output pr-report.md
.gitignore is honored automatically when walking a directory, so
recipes earlier in this page no longer need an explicit
-X "**/target/**" "**/node_modules/**" if those paths are already
covered by your project's .gitignore. Add --no-ignore if you do
need to analyze gitignored trees.
CI integration
Recipes for wiring bca into a build pipeline. The
bca check command already ships every output
shape a modern CI needs (Checkstyle, SARIF, GitLab Code Climate JSON,
clang/GCC warning lines, MSVC warning lines), plus
bca report markdown
for humans. This page is a consolidated map from the user's goal to
the right combination of subcommand, flags, and platform glue.
Picking outputs
The matrix below maps each common goal to the bca invocation that
feeds the corresponding CI surface. Linked sections below have the
runnable example.
| Goal | Command + flags |
|---|---|
| Hard gate on threshold regressions | bca check --config bca-thresholds.toml |
| Ratchet thresholds on an existing codebase | bca check --config bca-thresholds.toml --baseline .bca-baseline.toml (‡) |
| Inline PR annotations (GitHub) | bca check … --output-format clang-warning --no-fail + GCC problem matcher |
| Code Scanning alerts (GitHub) | bca check … --output-format sarif --no-fail + github/codeql-action/upload-sarif |
| Merge-request widget (GitLab Code Quality) | bca check … --output-format code-climate --no-fail |
| Jenkins / SonarQube ingestion | bca check … --output-format checkstyle |
| Human-readable PR/MR comment or downloadable artifact | bca report markdown --top 20 --strip-prefix "$PWD/" |
| Machine-readable artifact for dashboards | bca metrics --output-format json --output ./out |
(‡) Recommended adoption path when introducing thresholds on a codebase with existing offenders. See the Baselines recipe for the bootstrap-refresh-retire workflow.
The full reference for bca check's output formats, exit codes
(0 clean, 2 violation, 1 tool error), and threshold config lives
in the Check command page. For the Markdown
report shape, see the Report command page and
the Quality reports recipe.
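A wrapper script that needs to tell a metric violation apart from a tool failure could branch on those exit codes as in the sketch below. The helper names are hypothetical; only the three documented codes are mapped and anything else is reported as unknown.

```python
import subprocess

# Exit codes documented for bca check.
EXIT_MEANINGS = {0: "clean", 2: "violation", 1: "tool error"}


def classify_exit(code: int) -> str:
    """Translate a bca check exit code into a short label."""
    return EXIT_MEANINGS.get(code, "unknown")


def run_check(argv: list[str]) -> str:
    """Run a command line such as ["bca", "--paths", "src/", "check", ...]."""
    completed = subprocess.run(argv, check=False)
    return classify_exit(completed.returncode)
```

Keeping the two failure modes separate lets a pipeline retry or alert on a tool error while treating a violation as an ordinary gate failure.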
GitHub Actions
Live worked example
big-code-analysis runs the recipes below against its own source on
every push and PR. The workflow source —
.github/workflows/pages.yml —
exercises the GitHub-Releases install path, the cache, the
baseline-ratcheted gate, and both report formats. The output sits on
GitHub Pages alongside this book:
- HTML hotspot report: https://dekobon.github.io/big-code-analysis/reports/index.html
- Markdown PR/MR comment: https://dekobon.github.io/big-code-analysis/reports/report.md
Copy snippets below straight into your own workflow; the bca version
quoted is the latest published release at the time of writing.
Threshold gate, SARIF, and clang-warning matcher
The three pre-existing recipes — hard threshold gate, SARIF upload to
Code Scanning, and clang-warning + GCC problem matcher for inline PR
annotations — live in the
Check command page.
Use the link rather than re-implementing them here.
Installing bca from a GitHub Release (recommended)
The fastest, most reproducible install path is the prebuilt tarball
from this repository's GitHub Releases.
It is a single curl | sha256sum | tar, requires no Rust toolchain,
and produces byte-identical binaries across runs. Pair it with
actions/cache keyed by version
so a green-path rerun skips the download entirely:
env:
  BCA_VERSION: "1.1.0"
  BCA_TARGET: "x86_64-unknown-linux-gnu"
  # sha256 of big-code-analysis-${BCA_VERSION}-${BCA_TARGET}.tar.gz from the
  # release's SHA256SUMS file. Bump together with BCA_VERSION.
  BCA_SHA256: "f11c324fd80787e1a9edf99d3c1763980e035e51abb5479527b14b1e2f83e919"
steps:
  # Cache key MUST include BCA_SHA256 (and BCA_TARGET). Without the
  # sha256 in the key, rotating the published checksum without bumping
  # the version returns a stale binary on cache hit and silently
  # bypasses the `sha256sum --check` in the install step (which is
  # gated on cache miss). Including BCA_TARGET matters when the same
  # workflow runs against multiple `runs-on`.
  - name: Cache bca binary
    id: bca-cache
    uses: actions/cache@v5
    with:
      path: ~/.local/bin/bca
      key: bca-${{ runner.os }}-${{ env.BCA_TARGET }}-${{ env.BCA_VERSION }}-${{ env.BCA_SHA256 }}
  - name: Install bca from GitHub Releases
    if: steps.bca-cache.outputs.cache-hit != 'true'
    run: |
      set -euo pipefail
      stage="big-code-analysis-${BCA_VERSION}-${BCA_TARGET}"
      tarball="${stage}.tar.gz"
      url="https://github.com/dekobon/big-code-analysis/releases/download/v${BCA_VERSION}/${tarball}"
      mkdir -p "$HOME/.local/bin"
      curl -fsSL --proto '=https' --tlsv1.2 -o "/tmp/${tarball}" "$url"
      echo "${BCA_SHA256}  /tmp/${tarball}" | sha256sum --check --strict -
      tar -xzf "/tmp/${tarball}" -C /tmp
      install -m 0755 "/tmp/${stage}/bca" "$HOME/.local/bin/bca"
      rm -rf "/tmp/${tarball}" "/tmp/${stage}"
  - name: Prepend ~/.local/bin to PATH
    run: echo "$HOME/.local/bin" >> "$GITHUB_PATH"
Available BCA_TARGET values (pick the one that matches runs-on):
x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl,
aarch64-unknown-linux-gnu, aarch64-unknown-linux-musl,
aarch64-apple-darwin, x86_64-pc-windows-msvc,
aarch64-pc-windows-msvc. Windows assets use .zip instead of
.tar.gz; the bca-web binary ships alongside bca in the same
archive.
Alternative: cargo install via prebuilt-aware actions
When you cannot reach github.com from a runner (air-gapped, custom
mirror) but can reach crates.io, the following two actions fall back
transparently to cargo install when no prebuilt is published — at
the cost of compile time on the cold path. Both pin to the same
crates.io release as the GitHub Releases assets:
# Option 1: taiki-e/install-action
- name: Install bca
  uses: taiki-e/install-action@v2
  with:
    tool: big-code-analysis-cli@1.1.0
# Option 2: cargo-binstall
- name: Install cargo-binstall
  uses: cargo-bins/cargo-binstall@main
- name: Install bca
  run: cargo binstall --no-confirm big-code-analysis-cli --version 1.1.0
If either action falls back to compilation, cache the cargo registry + the installed binary so the second run is fast:
- name: Cache cargo registry and bca binary
  uses: actions/cache@v5
  with:
    path: |
      ~/.cargo/registry
      ~/.cargo/git
      ~/.cargo/bin/bca
    # crates.io publishes immutable releases, so a `<version>` key is
    # sufficient here; there is no sha256 to rotate. (The GitHub
    # Releases install path above is different: republished release
    # assets share a version, so its cache key must include the sha256.)
    key: bca-${{ runner.os }}-1.1.0
Pin to a specific version (matching a published
big-code-analysis-cli release on crates.io) so reports stay
reproducible across runs. A floating install surfaces
metric-counting changes as "mysterious CI flakes" on Mondays.
Posting the Markdown report as a PR comment
bca report markdown is purpose-built for PR/MR comments: a stable
header structure, one row per hot spot, and short paths once you pass
--strip-prefix. Pair it with
marocchino/sticky-pull-request-comment
so each push updates a single comment instead of stacking new ones:
name: bca-pr-report
on:
  pull_request:
    branches: [main]
jobs:
  report:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - name: Install bca
        uses: taiki-e/install-action@v2
        with:
          tool: big-code-analysis-cli@1.1.0
      - name: Generate report
        run: |
          bca \
            --paths "$PWD" \
            --num-jobs "$(nproc)" \
            report markdown \
            --top 20 \
            --strip-prefix "$PWD/" \
            --output report.md
      - name: Post or update PR comment
        uses: marocchino/sticky-pull-request-comment@v2
        with:
          path: report.md
          header: bca-quality-report
The same Markdown file is suitable for upload as a build artifact
(actions/upload-artifact@v7) if you want it downloadable from the
workflow run page in addition to the PR comment.
Baseline / ratchet pattern
bca check --baseline is the native ratchet: record today's offenders
in a committed TOML file, fail only on regressions and new offenders,
and shrink the file over time. Bootstrap once, commit, then point CI
at it:
# Once, on a developer machine. Commit both files.
bca --paths src/ check \
--config bca-thresholds.toml \
--write-baseline .bca-baseline.toml
git add bca-thresholds.toml .bca-baseline.toml
Path-style stickiness. Baseline entries are keyed by the exact path string bca emits at write time. --paths src/ records src/foo.rs, --paths . records ./src/foo.rs, and --paths "$PWD" records the absolute path. The subsequent bca check --baseline MUST use the same --paths form, or every entry mismatches and the gate fails on every existing offender. Pick one form and apply it consistently in CI and in the bootstrap command.
This snippet bootstraps from src/ only — appropriate for a
single-crate library. For a multi-crate workspace, see the
live worked example: its .github/workflows/pages.yml
scans the entire repo with --exclude-from .bcaignore, a checked-in
deny-set covering vendored grammars, generated trees, and tests.
Share the exclude list across workflow, recipe, and bootstrap. Put the deny-set in a single file at the repo root (a .bcaignore by convention, mirroring .gitignore / .dockerignore) and point every bca invocation at it with --exclude-from .bcaignore. Patterns from --exclude-from are unioned with any inline --exclude <GLOB> flags into one deny-set; keep --exclude for one-off ad-hoc excludes. Blank lines and #-prefixed comment lines in the file are skipped. Patterns follow the same ./-prefix convention as --exclude arguments (the walker's emitted form). Pair edits to .bcaignore with a --write-baseline refresh, because the baseline keys are sensitive to which files the walker visits.
- name: Threshold check with baseline
  run: |
    bca --paths src/ check \
      --config bca-thresholds.toml \
      --baseline .bca-baseline.toml
A regressed function (current value > baseline value) still fails.
A new offender not in the baseline still fails. An improved function
passes silently and stays in the baseline until the next
--write-baseline refresh.
Each surviving violation in the stderr stream is prefixed with a tag so a developer can tell at a glance whether they are looking at a brand-new offender or a known one that has worsened:
- [new]: no baseline entry for this function / metric.
- [regr +N%]: current value exceeds the recorded baseline by N percent. Special forms: [regr from 0] when the baseline value was zero, [regr +>9999%] when the regression exceeds 100× the baseline, [regr NaN] when the current value is NaN.
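The tag rules can be sketched as a small function. This is an illustration, not the tool's own logic: the function name is hypothetical, and both the order in which the special forms are checked and the rounding of N to a whole percent are assumptions.

```python
import math


def violation_tag(current: float, baseline) -> str:
    """Build the prefix tag for a violation that survived the baseline."""
    if baseline is None:
        return "[new]"  # No baseline entry for this function / metric.
    if math.isnan(current):
        return "[regr NaN]"
    if baseline == 0:
        return "[regr from 0]"  # A percentage is undefined against zero.
    percent = (current - baseline) / baseline * 100.0
    if percent > 9999:
        return "[regr +>9999%]"
    # Assumption: N is rounded to the nearest whole percent.
    return f"[regr +{round(percent)}%]"
```

A cyclomatic value that moved from 20 to 30 would be tagged [regr +50%] under these assumptions.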
After the per-violation lines the stderr stream emits a per-file
rollup footer with the format <path>: <count> violations (worst: <metric> = <value> vs limit <limit> at L<start>), sorted by
violation count descending. This is intended to be the first thing a
reader looks at: which file has the most problems, and which metric
is the loudest in that file. Pass --no-summary to suppress the
footer for downstream tooling that grep-pipes the stderr stream.
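Tooling that does want the footer can parse it with a regular expression built from the format string above. The sample line in the usage note is made up for illustration, the function name is hypothetical, and accepting a singular "violation" is a defensive assumption.

```python
import re

# <path>: <count> violations (worst: <metric> = <value> vs limit <limit> at L<start>)
_ROLLUP = re.compile(
    r"^(?P<path>.+): (?P<count>\d+) violations? "
    r"\(worst: (?P<metric>\S+) = (?P<value>\S+) "
    r"vs limit (?P<limit>\S+) at L(?P<start>\d+)\)$"
)


def parse_rollup(line: str):
    """Split one rollup footer line into its fields, or return None."""
    match = _ROLLUP.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["count"] = int(fields["count"])
    fields["start"] = int(fields["start"])
    return fields
```

For a hypothetical line such as src/foo.rs: 3 violations (worst: cyclomatic = 17 vs limit 15 at L42), the result carries path src/foo.rs, count 3, and start 42, with value and limit left as strings.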
Refresh after focused refactors:
bca --paths src/ check \
--config bca-thresholds.toml \
--write-baseline .bca-baseline.toml
git diff .bca-baseline.toml # expect a shrinking file
Two --write-baseline runs over an unchanged tree produce
byte-identical output, so spurious diffs only appear when offenders
actually changed. See the Baselines recipe for the
full adoption flow, PR-review heuristics, and the suppression
composition rules.
Offender-count delta against merge base (stopgap)
For teams who cannot commit a baseline file (e.g. policy reasons), a
coarser approximation counts <error> elements in two Checkstyle
documents — one on the merge base, one on the PR head — and fails
when the count grows:
- name: Compute offender deltas vs. merge base
run: |
set -euo pipefail
BASE="$(git merge-base origin/main HEAD)"
git worktree add /tmp/base "$BASE"
bca --paths /tmp/base check \
--config bca-thresholds.toml \
--output-format checkstyle \
--output /tmp/base.xml \
--no-fail
BASE_COUNT=$(grep -c "<error" /tmp/base.xml || true)
bca --paths "$PWD" check \
--config bca-thresholds.toml \
--output-format checkstyle \
--output /tmp/head.xml \
--no-fail
HEAD_COUNT=$(grep -c "<error" /tmp/head.xml || true)
echo "Offenders: base=$BASE_COUNT head=$HEAD_COUNT"
if [ "$HEAD_COUNT" -gt "$BASE_COUNT" ]; then
echo "::error::Offender count grew from $BASE_COUNT to $HEAD_COUNT"
exit 1
fi
This counts violations, not their identity: renaming an offender does not register as a regression, an existing offender that worsens leaves the count unchanged, and fixing one offender while introducing another nets to zero. The native baseline flow above is strictly more precise and is the recommended approach.
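To make the limitation concrete, here is a minimal sketch that compares two Checkstyle documents both by count and by identity. The XML literals, the `offenders` helper, and the choice of (file, message) as the identity are illustrative assumptions; they follow the generic Checkstyle layout (`<file name=…>` wrapping `<error>` elements), not necessarily the exact attributes bca writes.

```python
import xml.etree.ElementTree as ET

# Hypothetical Checkstyle documents; file names and messages are made up.
BASE_XML = """<checkstyle version="4.3">
  <file name="src/foo.rs">
    <error line="10" severity="error" message="parse: cyclomatic = 22 (limit 15)"/>
  </file>
</checkstyle>"""

HEAD_XML = """<checkstyle version="4.3">
  <file name="src/bar.rs">
    <error line="40" severity="error" message="render: cognitive = 31 (limit 20)"/>
  </file>
</checkstyle>"""


def offenders(xml_text):
    """Return the set of (file, message) pairs found in a Checkstyle document."""
    root = ET.fromstring(xml_text)
    return {
        (file_el.get("name"), err.get("message"))
        for file_el in root.iter("file")
        for err in file_el.iter("error")
    }


base, head = offenders(BASE_XML), offenders(HEAD_XML)

# Roughly what the count-based stopgap checks: did the total grow?
count_grew = len(head) > len(base)
# What an identity-aware comparison sees: offenders absent from the base.
new_offenders = head - base

print(f"count grew: {count_grew}")
print(f"new offenders: {sorted(new_offenders)}")
```

One offender was fixed and a different one introduced, so the count stays at one and the count-based gate passes, while the identity comparison still reports the new offender.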
Self-scan threshold gate (local mirror of the CI gate)
CI's threshold gate fires only after push, which is too late if a
refactor silently nudged a metric past its limit. The
big-code-analysis repo's
Makefile
exposes four targets that mirror the CI gate (the
Threshold gate step in .github/workflows/pages.yml)
locally and add a second tier at 95% of every limit so encroachment
is caught a commit or two before the hard gate trips:
make self-scan # hard gate, 100% of bca-thresholds.toml
make self-scan-headroom # soft gate, default 95% (BCA_HEADROOM)
make self-scan-write-baseline # refresh baseline at hard thresholds
make self-scan-write-baseline-headroom # refresh baseline at soft thresholds
The hard tier is exactly what CI runs; expanded, it is:
cargo run --quiet --release -p big-code-analysis-cli -- \
--paths . --exclude-from .bcaignore \
check \
--config bca-thresholds.toml \
--baseline .bca-baseline.toml
Both tiers consume the same bca-thresholds.toml and the same
.bca-baseline.toml; the soft tier just runs the hard recipe
with every threshold value multiplied by BCA_HEADROOM. Both
exit 0 clean, 2 on any threshold violation, 1 on tool
error — the soft tier is a real gate, not advisory, so do not
wrap make self-scan-headroom in || true. Both gate targets
are wired into make pre-commit, make ci, and
.pre-commit-config.yaml, with self-scan-headroom: self-scan
as a Make prerequisite so the hard tier always reports a true
regression before the soft tier reports near-limit headroom.
BCA_HEADROOM=0.90 make self-scan-headroom widens the band;
BCA_HEADROOM=0.99 tightens it to the last 1%. When the soft
tier fires, absorb the offender into the baseline with
make self-scan-write-baseline-headroom (which records every
offender at the scaled thresholds — strictly a superset of the
hard-tier offenders).
The pattern (hard tier mirroring CI + soft tier as early-warning
band, both ratcheted by the same baseline) is project-agnostic —
the Local threshold gates recipe documents the
underlying principles, drop-in Makefile / just / package.json
skeletons, and the helper script that scales thresholds, so you
can adopt the same workflow in your own repo. The generic recipe
uses the same BCA_* env-var names as the Makefile above, so
overrides like BCA_HEADROOM=0.90 work identically across both.
GitLab CI
Full .gitlab-ci.yml example
The job below installs bca, runs the threshold check producing
Code Climate JSON (for the MR Code Quality widget), Checkstyle XML,
and a Markdown report, then uploads them as artifacts:
stages:
- quality
variables:
BCA_VERSION: "1.1.0" # pin a published big-code-analysis-cli release
BCA_TARGET: "x86_64-unknown-linux-gnu"
# sha256 of big-code-analysis-${BCA_VERSION}-${BCA_TARGET}.tar.gz from
# the release's SHA256SUMS file. Bump together with BCA_VERSION.
BCA_SHA256: "f11c324fd80787e1a9edf99d3c1763980e035e51abb5479527b14b1e2f83e919"
bca-quality:
stage: quality
image: debian:stable-slim
cache:
# Same key shape as the GitHub Actions snippet — bumping
# BCA_VERSION invalidates the cache automatically.
key: "bca-$BCA_VERSION"
paths:
- .cache/bca/
before_script:
- apt-get update -qq && apt-get install -y --no-install-recommends ca-certificates curl tar
- |
set -euo pipefail
install -d "$CI_PROJECT_DIR/.cache/bca" "$HOME/.local/bin"
if [ ! -x "$CI_PROJECT_DIR/.cache/bca/bca" ]; then
stage="big-code-analysis-${BCA_VERSION}-${BCA_TARGET}"
tarball="${stage}.tar.gz"
url="https://github.com/dekobon/big-code-analysis/releases/download/v${BCA_VERSION}/${tarball}"
curl -fsSL --proto '=https' --tlsv1.2 -o "/tmp/${tarball}" "$url"
echo "${BCA_SHA256} /tmp/${tarball}" | sha256sum --check --strict -
tar -xzf "/tmp/${tarball}" -C /tmp
install -m 0755 "/tmp/${stage}/bca" "$CI_PROJECT_DIR/.cache/bca/bca"
rm -rf "/tmp/${tarball}" "/tmp/${stage}"
fi
install -m 0755 "$CI_PROJECT_DIR/.cache/bca/bca" "$HOME/.local/bin/bca"
export PATH="$HOME/.local/bin:$PATH"
script:
- bca
--paths "$PWD"
--num-jobs "$(nproc)"
check
--config bca-thresholds.toml
--output-format code-climate
--output gl-code-quality-report.json
--no-fail
- bca
--paths "$PWD"
--num-jobs "$(nproc)"
check
--config bca-thresholds.toml
--output-format checkstyle
--output bca-checkstyle.xml
--no-fail
- bca
--paths "$PWD"
--num-jobs "$(nproc)"
report markdown
--top 20
--strip-prefix "$PWD/"
--output bca-report.md
# The threshold gate runs separately so the artifacts above still
# publish on failure. Exit 2 = at least one threshold exceeded.
- bca --paths "$PWD" check --config bca-thresholds.toml
artifacts:
when: always
reports:
codequality: gl-code-quality-report.json
paths:
- gl-code-quality-report.json
- bca-checkstyle.xml
- bca-report.md
A few notes about the example:
- The first two bca check … --no-fail invocations collect offenders for the artifacts; the final bca check (no --no-fail) is the pass/fail gate. All three runs use the same threshold config so the artifacts always match the gate decision.
- artifacts:when: always ensures every artifact is downloadable even on a red pipeline, which is exactly when you want them most.
- artifacts:reports:codequality wires the Code Climate JSON directly into GitLab's MR Code Quality widget; see the Code Quality widget section below for the field-by-field semantics.
GitLab Code Quality widget
GitLab's first-class Code Quality experience (inline complaints on
the MR diff, summary on the MR overview page) consumes
Code Climate JSON.
bca check emits this natively via --output-format code-climate,
so the integration is a one-liner:
code_quality:
stage: quality
script:
- bca --paths "$CI_PROJECT_DIR" check
--config bca-thresholds.toml
--output-format code-climate
--output gl-code-quality-report.json
--no-fail
artifacts:
when: always
reports:
codequality: gl-code-quality-report.json
paths:
- gl-code-quality-report.json
Severity bands are derived from how far each metric exceeds its
configured threshold (value / limit ratio, inverted for the
maintainability-index family where lower is worse): ≤ 1.5× →
minor, ≤ 2× → major, ≤ 4× → critical, > 4× →
blocker. The widget deduplicates findings by fingerprint; bca
hashes path \0 function \0 metric (no line, no value) so a
violation surviving an upstream line-drift edit still collapses
into the same widget entry across pipeline runs.
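The banding and fingerprint rules above can be sketched as follows. This is an illustration, not bca's implementation: the function names are hypothetical, SHA-256 is an assumed hash (the text does not name one), and the handling of a zero denominator is a choice made for this sketch.

```python
import hashlib


def severity(value, limit, lower_is_worse=False):
    """Map a violation to a severity band from its distance past the limit."""
    # For maintainability-index style metrics the ratio is inverted so that
    # a larger ratio always means "further past the limit".
    numerator, denominator = (limit, value) if lower_is_worse else (value, limit)
    if denominator == 0:
        # Assumption for this sketch: treat an undefined ratio as the worst band.
        return "blocker"
    ratio = numerator / denominator
    if ratio <= 1.5:
        return "minor"
    if ratio <= 2.0:
        return "major"
    if ratio <= 4.0:
        return "critical"
    return "blocker"


def fingerprint(path, function, metric):
    """Hash path NUL function NUL metric; line and value are deliberately excluded."""
    # SHA-256 is an assumption; any stable hash gives the same dedup behaviour.
    key = "\0".join((path, function, metric)).encode("utf-8")
    return hashlib.sha256(key).hexdigest()


print(severity(63, 25))                       # ratio 2.52
print(severity(10, 20, lower_is_worse=True))  # inverted ratio 2.0
print(fingerprint("src/bar.rs", "act_on_file", "cognitive"))
```

Because neither the line number nor the metric value feeds the hash, the same function and metric produce the same fingerprint on every pipeline run, which is what lets the widget collapse a violation that merely moved or changed value.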
Sanity-check a generated report locally:
jq 'all(.[]; has("description") and has("check_name")
and has("fingerprint") and has("severity")
and has("location"))' gl-code-quality-report.json
# → true
jq '[.[] | .severity] | unique' gl-code-quality-report.json
# → a subset of ["info","minor","major","critical","blocker"]
MR-only comment with the Markdown report
To attach the Markdown report as an MR note (the GitLab analogue of the GitHub PR comment recipe), use a project access token and the Notes API:
bca-mr-comment:
stage: quality
image: alpine:3
rules:
- if: $CI_PIPELINE_SOURCE == "merge_request_event"
needs: ["bca-quality"]
before_script:
- apk add --no-cache curl jq
script:
- |
BODY=$(jq -Rs '.' < bca-report.md)
curl --fail --silent --show-error \
--request POST \
--header "PRIVATE-TOKEN: $CI_BCA_BOT_TOKEN" \
--header "Content-Type: application/json" \
--data "{\"body\": $BODY}" \
"$CI_API_V4_URL/projects/$CI_PROJECT_ID/merge_requests/$CI_MERGE_REQUEST_IID/notes"
CI_BCA_BOT_TOKEN is a project access token with api scope. The
job depends on bca-quality so the Markdown artifact is in place
before it runs.
Jenkins / SonarQube
Both Jenkins (via the Warnings Next Generation plugin) and SonarQube (via its Generic Issue importer) consume Checkstyle 4.3 XML directly. The same invocation feeds both:
bca --paths src/ check \
--config bca-thresholds.toml \
--output-format checkstyle \
--output report.checkstyle.xml
Wire report.checkstyle.xml into your existing Jenkins
Record Issues / SonarQube External Issues step. The Checkstyle
writer emits an empty (well-formed) document when there are no
offenders, so neither tool needs special-casing for a clean run. See
the Check command page
for the writer's schema details.
Generic CI guidance
Applies regardless of provider:
- Pin bca to a specific version. Both cargo install --version and cargo binstall --version accept the published crate version of big-code-analysis-cli. A floating install surfaces metric-counting changes as "mysterious CI flakes" on Mondays.
- Use --num-jobs "$(nproc)". The walker is CPU-bound on modern hardware; --num-jobs 1 is a debugging knob, not a default.
- Always pass --strip-prefix "$PWD/" to bca report markdown so the path column is identical across runners with different workspace paths. Without it the diff between two reports is dominated by /home/runner/work/... vs. /builds/group/project/... noise.
- Store bca-thresholds.toml at the repo root, alongside Cargo.toml / pyproject.toml / package.json. Treat it as source: review threshold relaxations in code review.
- Exit-code contract. bca check exits 0 clean, 2 on any threshold violation, 1 on tool error (bad config, unknown metric, unreadable path). Reserving 1 for tool errors lets CI distinguish "a function got too complex" from "the analyzer crashed".
- Honor in-source suppression markers, audit with --no-suppress. The default bca check honors bca: suppress / bca: suppress-file markers; passing --no-suppress ignores them so auditors see the raw offender list.
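A wrapper script can lean on the exit-code contract directly. The sketch below is one hedged way to do that; `classify` and `run_gate` are hypothetical helper names, and a stand-in command replaces the real bca invocation so the example runs without bca installed.

```python
import subprocess
import sys

# Meaning of each documented `bca check` exit code.
OUTCOMES = {0: "clean", 2: "threshold violation", 1: "tool error"}


def classify(returncode):
    """Translate an exit code into the contract's vocabulary."""
    return OUTCOMES.get(returncode, f"unexpected exit code {returncode}")


def run_gate(argv):
    """Run a command and classify its exit status without raising on failure."""
    completed = subprocess.run(argv, check=False)
    return classify(completed.returncode)


# Stand-in for the real gate: a child Python process that exits with 2.
# In CI, argv would be the actual `bca ... check ...` command line.
print(run_gate([sys.executable, "-c", "raise SystemExit(2)"]))
```

Keeping 1 and 2 apart lets the wrapper annotate a pull request on a threshold violation while escalating differently when the tool itself failed.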
Baselines: ratcheting thresholds on existing code
When you introduce metric thresholds on an existing codebase, you
usually hit the same wall: every reasonable threshold flags hundreds of
existing functions, and CI goes red on every push. The realistic
adoption path is "ratchet from current state, fail only on new
offenders". The baseline file (issue #99) is how bca check supports
that workflow.
Baselines are the complement to in-source suppression markers, not a substitute. Use suppression markers (see Suppression markers) when a function is intentionally complex forever (a parser, a state machine, generated code). Use a baseline when the team intends to pay the debt down. Both can live in the same repo; suppression is checked first.
End-to-end adoption flow
1. Pick initial thresholds
Either pick gut-feel numbers (cyclomatic=15, cognitive=20) or derive
them from a bca check --no-fail run over the repo, which shows the
current distribution.
# bca-thresholds.toml
[thresholds]
cyclomatic = 15
cognitive = 20
"loc.lloc" = 200
2. Bootstrap the baseline
bca --paths src/ check \
--config bca-thresholds.toml \
--write-baseline .bca-baseline.toml
Commit both files in the same change:
git add bca-thresholds.toml .bca-baseline.toml
git commit -m "ci: introduce metric thresholds with baseline"
3. Wire the CI gate
GitHub Actions:
- name: Check code complexity thresholds
run: |
bca --paths src/ check \
--config bca-thresholds.toml \
--baseline .bca-baseline.toml
GitLab CI (snippet for the relevant job):
threshold-check:
image: rust:1
before_script:
- cargo install --locked big-code-analysis-cli@<VERSION>
script:
- bca --paths src/ check
--config bca-thresholds.toml
--baseline .bca-baseline.toml
Exit codes: 0 clean, 2 regression or new offender, 1 tool error.
See CI integration for the broader matrix of CI surfaces.
4. Refresh the baseline as the team pays debt down
Every few weeks, or after a focused refactor:
bca --paths src/ check \
--config bca-thresholds.toml \
--write-baseline .bca-baseline.toml
git diff .bca-baseline.toml
A shrinking diff is the goal. Two --write-baseline runs over an
unchanged tree produce byte-identical output, so spurious diffs only
appear when actual offenders changed.
5. PR-review heuristics
- Baseline shrank. Debt paid down. No further action.
- Baseline grew. Someone added a new offender to the file intentionally. Review the values — was this a deliberate stopgap, or did the author bypass the gate? Either is fine if conscious; the point of the file being committed is to make the choice reviewable.
- A single entry got a higher value. The author re-ran --write-baseline after the function got worse. Treat the same as "baseline grew": surface the change in review.
Reading the gate output
A failing bca check --baseline run prefixes each surviving violation
with a tag and follows the list with a per-file rollup:
bca: filtered 422 violations via baseline
[regr +60%] src/foo.rs:1-865: <file>: halstead.effort = 1557107.72 (limit 50000)
[new] src/bar.rs:506-747: act_on_file: cognitive = 63 (limit 25)
...
--- summary ---
src/foo.rs: 5 violations (worst: halstead.effort = 1557107.72 vs limit 50000 at L1)
src/bar.rs: 4 violations (worst: cognitive = 63 vs limit 25 at L506)
Tag prefixes:
- [new]: no baseline entry for this (path, function, start_line, metric) tuple. The violation is new since the baseline was written.
- [regr +N%]: the baseline contains a recorded value and the current value is N% higher. Special cases:
  - [regr from 0] when the recorded value is 0.0, so a percentage would require dividing by zero.
  - [regr +>9999%] once the regression exceeds 100× the baseline value.
  - [regr NaN] when the current metric value is NaN (degenerate Halstead inputs on trivial functions).
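The tag rules can be sketched in a few lines. This is an illustration only: `baseline_tag` is a hypothetical name, and the order of the checks and the rounding to a whole percent are assumptions, since the text does not specify them.

```python
import math


def baseline_tag(current, recorded=None):
    """Return the tag for a surviving violation.

    `recorded` is the baseline value, or None when no entry exists.
    """
    if recorded is None:
        return "[new]"
    if math.isnan(current):
        return "[regr NaN]"
    if recorded == 0.0:
        # A percentage relative to zero is undefined.
        return "[regr from 0]"
    if current > 100 * recorded:
        # Cap instead of printing an unreadably large percentage.
        return "[regr +>9999%]"
    # Rounding to a whole percent is an assumption of this sketch.
    percent = round((current - recorded) / recorded * 100)
    return f"[regr +{percent}%]"


print(baseline_tag(63))                # no baseline entry
print(baseline_tag(16, 10))            # 60% above the recorded value
print(baseline_tag(5, 0.0))            # recorded value was zero
print(baseline_tag(2000, 10))          # more than 100x the recorded value
print(baseline_tag(float("nan"), 10))  # degenerate metric value
```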
Tags only appear when --baseline is passed; without it the line
format is byte-identical to the no-baseline default. CI tooling that
grep-pipes the stderr stream can suppress the trailing summary with
--no-summary.
The summary footer groups violations by file, cites the single worst
metric per file (max value / limit ratio), and sorts rows by
violation count descending then path ascending. It is the fastest way
to read a long offender list and spot which file to start with.
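The grouping and ordering of the footer can be sketched as below. The tuple layout and sample values are made up for illustration, `rollup` is a hypothetical name, and the sketch ignores the inverted ratio that lower-is-worse metrics would need.

```python
from collections import defaultdict

# (path, start_line, metric, value, limit); sample data is illustrative.
violations = [
    ("src/foo.rs", 1, "halstead.effort", 1557107.72, 50000),
    ("src/foo.rs", 120, "cyclomatic", 18, 15),
    ("src/bar.rs", 506, "cognitive", 63, 25),
    ("src/bar.rs", 10, "cyclomatic", 16, 15),
]


def rollup(items):
    """Build one summary line per file, worst metric chosen by value / limit."""
    by_file = defaultdict(list)
    for item in items:
        by_file[item[0]].append(item)

    rows = []
    for path, group in by_file.items():
        # The "worst" violation is the one furthest past its limit.
        _, line, metric, value, limit = max(group, key=lambda v: v[3] / v[4])
        rows.append((path, len(group), metric, value, limit, line))

    # Violation count descending, then path ascending as the tie-break.
    rows.sort(key=lambda r: (-r[1], r[0]))
    return [
        f"{path}: {count} violations "
        f"(worst: {metric} = {value} vs limit {limit} at L{line})"
        for path, count, metric, value, limit, line in rows
    ]


for summary_line in rollup(violations):
    print(summary_line)
```

Both files have two violations here, so the path tie-break puts src/bar.rs ahead of src/foo.rs.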
6. Retire the baseline
When .bca-baseline.toml contains only version = 2 and no entries,
drop the --baseline flag from CI and delete the file. The thresholds
now stand on their own.
Composition with suppression markers
--write-baseline already excludes any function silenced by a
bca: suppress or #lizard forgives marker, so the same function
doesn't end up in two places. If a function is intentionally exempt
forever, prefer the in-source marker (lives next to the code, survives
refactors, no extra file to commit). Use the baseline only for
violations the team genuinely intends to fix.
To audit the un-filtered offender set — every violation regardless of
suppression or baseline — pass --no-suppress and omit --baseline:
bca --paths src/ check \
--config bca-thresholds.toml \
--no-suppress \
--no-fail
Combined with --write-baseline, --no-suppress records every
violation including the ones that suppression markers normally hide.
Limitations
- Line drift. Entries key on (path, function, start_line, metric). Editing code above a function shifts its start_line and the baseline entry stops matching, surfacing as a "new" offender. Refresh with --write-baseline and commit the diff.
- Path identity. Entries record the path the walker saw. Run --write-baseline and --baseline from the same working directory with the same --paths argument; a relative --paths src/ and an absolute --paths /repo/src/ produce non-matching baselines.
- OS portability. Paths are normalized to forward slashes on write and re-normalized on read, so a baseline generated on Linux matches the same tree on Windows. Non-UTF-8 paths fall back to a lossy display form and may not round-trip exactly.
- Tightening a threshold. Lowering a limit may newly expose functions that were previously clean. They will not be in the baseline → CI will fail. This is correct — tightening should expose new offenders. Refresh the baseline if the team chooses to absorb the new entries.
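The first two limitations both follow from exact-match lookup on the key tuple. A minimal sketch, assuming an in-memory dictionary stands in for the baseline (the record layout and the `lookup` helper are illustrative, not the on-disk TOML schema):

```python
# A baseline entry is keyed on (path, function, start_line, metric).
baseline = {
    ("./src/foo.rs", "parse", 120, "cyclomatic"): 22.0,
}


def lookup(path, function, start_line, metric):
    """Return the recorded value, or None when no entry matches exactly."""
    return baseline.get((path, function, start_line, metric))


# Same path form, same function, same line: the entry matches.
matched = lookup("./src/foo.rs", "parse", 120, "cyclomatic")

# Line drift: three lines were added above the function.
drifted = lookup("./src/foo.rs", "parse", 123, "cyclomatic")

# Path identity: the scan ran with --paths src/ instead of --paths .
other_path = lookup("src/foo.rs", "parse", 120, "cyclomatic")

print(matched, drifted, other_path)
```

In both mismatch cases the lookup returns nothing, so the gate would report the violation as new even though the function itself did not change.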
Local threshold gates
CI is the last line of defence, not the first. By the time
bca check --config bca-thresholds.toml --baseline .bca-baseline.toml
fires red on a pull request, the offending change has already been
pushed, the author has context-switched, and someone has to revisit
the diff to nudge a metric back under its limit. A local threshold
gate moves that feedback to the moment of git commit — the same
moment cargo fmt --check and cargo clippy -- -D warnings already
fire — so the regression never makes it past the developer's
keyboard.
This recipe captures the pattern big-code-analysis uses on its own
source (Makefile's self-scan* targets)
and distils it into something you can drop into your own repo's
Makefile, justfile, package.json script, or pre-commit
config. The underlying idea is provider-neutral: any threshold
checker (bca, ESLint, clippy, SonarLint, Qodana) can be wired the
same way.
Principles
Three principles drive the design. They are not specific to bca;
they are the same conclusions Sonar reached when it pivoted its
default Quality Gate to focus on
new code
and that the broader ratchet pattern formalises.
- Gate locally, mirror CI exactly. The local gate must run the same binary with the same arguments and the same threshold / baseline / exclude files as CI. If the local gate is "almost what CI runs", it stops catching regressions the moment the two diverge. Running the gate once before pushing is cheap; a red PR-bot ping is not.
- Ratchet, don't reset. When you introduce thresholds on an existing codebase, every reasonable limit fires on dozens of pre-existing functions. The realistic adoption path is "absorb today's offenders into a baseline file, fail only on new or worsening ones, shrink the baseline over time". This is the same strategy that lets a multi-year codebase introduce strict TypeScript or strict clippy lints without a months-long boil-the-ocean pass. See the Baselines recipe for the bootstrap → CI → refresh → retire flow.
- Warn before you fail. A hard 100% gate fails at the limit
and gives no signal as a function creeps from 80% to 95% to 99%
of its threshold. A second, looser tier that fires at e.g. 95%
of every limit gives a one-or-two-commit early warning. The
author still has the file open, the test cases in their head,
and the freedom to refactor before the offender hardens into
"well, it's in main now". Sonar's "new code" Quality Gate, the
GCC -Wall / -Werror split, and clippy's warn vs. deny lint levels all encode the same insight: a tier between clean and broken is where teams actually catch drift.
The two tiers
The pattern is two recipes wrapping the same checker, plus two recipes for refreshing the baseline at each tier.
| Target | Tier | Thresholds | Baseline-filtered | Use case |
|---|---|---|---|---|
| self-scan | hard | 100% of config | yes | Mirror of CI. Must stay green on every commit. |
| self-scan-headroom | soft | config × HEADROOM | yes | Early-warning band. Fires before the hard tier. |
| self-scan-write-baseline | hard | 100% of config | (write) | Absorb today's hard-tier offenders. |
| self-scan-write-baseline-headroom | soft | config × HEADROOM | (write) | Absorb soft-tier offenders when launching or widening the band. |
The hard tier and the soft tier consume the same
bca-thresholds.toml and the same .bca-baseline.toml. The
only difference between them is a scalar multiplier applied to
every threshold value before bca check sees it.
This matters: it means a contributor who wants the soft tier to be
stricter (catch encroachment further out) bumps a single
environment variable rather than maintaining a parallel
bca-thresholds-soft.toml that will drift out of sync with the
hard config the first time anyone forgets to update both files.
Skeleton: GNU Make
The four recipes below are a self-contained drop-in. Adjust the
BCA variable to point at whatever invocation gives you the
checker (a pinned release binary, cargo run --release, an npm /
pip wrapper). Adjust PATHS and EXCLUDE_FROM to match your
layout.
# --- bca local threshold gates ------------------------------------------
# HARD tier mirrors CI exactly. Both tiers consume the same
# bca-thresholds.toml + .bca-baseline.toml; the soft tier scales every
# threshold by $(BCA_HEADROOM) (default 0.95).
#
# Knobs are namespaced with `BCA_` so they don't collide with anything
# else in your environment. The big-code-analysis repo's own Makefile
# uses the same names — this skeleton is drop-in for that project too.
BCA := bca
BCA_PATHS := .
BCA_EXCLUDE_FROM := .bcaignore
BCA_THRESHOLDS := bca-thresholds.toml
BCA_BASELINE := .bca-baseline.toml
BCA_HEADROOM ?= 0.95
# `PY` lets Windows hosts override to `py -3` (the stock python.org
# installer ships `py.exe` and `python.exe` but no `python3` alias).
PY ?= python3
# Common args, factored out so the four recipes stay in lockstep.
BCA_BASE_ARGS := --paths $(BCA_PATHS) --exclude-from $(BCA_EXCLUDE_FROM) \
--num-jobs $(shell nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4)
.PHONY: self-scan self-scan-headroom \
self-scan-write-baseline self-scan-write-baseline-headroom
self-scan:
@echo "bca self-scan (hard gate)..."
@$(BCA) $(BCA_BASE_ARGS) check \
--config $(BCA_THRESHOLDS) \
--baseline $(BCA_BASELINE)
# `self-scan-headroom: self-scan` is intentional: under `make -j` Make
# would otherwise run both gates in parallel and the soft tier's scaled
# error message could land before the true regression on the hard tier.
# `BCA_THRESHOLDS` / `BCA_BASELINE` are exported because the helper
# reads them from the environment — see "Helper script" below.
self-scan-headroom: self-scan
@echo "bca self-scan (soft gate, BCA_HEADROOM=$(BCA_HEADROOM))..."
@BCA_HEADROOM=$(BCA_HEADROOM) \
BCA_THRESHOLDS=$(BCA_THRESHOLDS) \
BCA_BASELINE=$(BCA_BASELINE) \
$(PY) ./utils/bca-self-scan-headroom.py \
$(BCA) $(BCA_BASE_ARGS)
self-scan-write-baseline:
@echo "Refreshing $(BCA_BASELINE) at hard thresholds..."
@$(BCA) $(BCA_BASE_ARGS) check \
--config $(BCA_THRESHOLDS) \
--write-baseline $(BCA_BASELINE)
# Soft-tier baseline write. NOTE: this and `self-scan-write-baseline`
# both write `$(BCA_BASELINE)`; never compose them as parallel
# prerequisites of one umbrella target or invoke them with `make -j2`,
# or the two writers will race on the same file and the
# losing tier's offenders will silently vanish from the baseline.
# Run them sequentially (hard first, then soft) and commit the diff.
self-scan-write-baseline-headroom:
@echo "Refreshing $(BCA_BASELINE) at soft thresholds (BCA_HEADROOM=$(BCA_HEADROOM))..."
@BCA_HEADROOM=$(BCA_HEADROOM) \
BCA_THRESHOLDS=$(BCA_THRESHOLDS) \
BCA_BASELINE=$(BCA_BASELINE) \
BCA_HEADROOM_WRITE_BASELINE=$(BCA_BASELINE) \
$(PY) ./utils/bca-self-scan-headroom.py \
$(BCA) $(BCA_BASE_ARGS)
The helper (utils/bca-self-scan-headroom.py) reads four env vars —
BCA_HEADROOM (default 0.95), BCA_THRESHOLDS (default
bca-thresholds.toml), BCA_BASELINE (default .bca-baseline.toml),
and the optional BCA_HEADROOM_WRITE_BASELINE switch — multiplies
every value in the thresholds file by the headroom ratio, and
re-emits the limits as --threshold name=value flags so bca check
sees scaled limits without you having to maintain a second TOML
file. The Make skeleton above exports the first three so renaming
any of those paths in one place propagates to both tiers. See
Helper script below for a ready-to-paste
implementation.
The gate exit codes propagate verbatim from bca check: 0
clean, 2 on any threshold violation (hard or soft), 1 on tool
error. The soft tier is a real gate — never wrap
make self-scan-headroom in || true thinking it's advisory; the
non-zero exit is the whole point of the encroachment band.
Keep --paths identical across all four recipes. Baseline entries are keyed by the exact path string bca emits at write time: --paths . records ./src/foo.rs, --paths src/ records src/foo.rs, and --paths "$PWD" records the absolute path. A subsequent --baseline invocation that uses a different --paths form silently mismatches every entry and the gate re-fails on every existing offender. The skeletons in this recipe all use --paths . deliberately; if you change it, change it in every recipe and refresh .bca-baseline.toml once. See Baselines: path identity for the full caveat.
Wiring into pre-commit and CI
Add the soft gate to whatever umbrella target your developers
already run before pushing. The hard gate runs as its prerequisite
(see the self-scan-headroom: self-scan edge above), so listing
only the soft target is enough — and crucially survives
make -j, which would otherwise schedule both leaves in parallel
and interleave their output:
.PHONY: pre-commit
pre-commit: fmt-check clippy test self-scan-headroom
Ordering matters: the hard tier names a true regression with the 100% limit, not the scaled one. The prerequisite edge enforces that order even under parallel Make.
In CI, run only the hard tier:
- name: Threshold gate
run: make self-scan
The soft tier is a developer feedback knob, not a release gate. Running it in CI either duplicates the hard tier (when nothing has encroached) or fires noisily on a baseline-absorbed offender that crept upward without crossing 100% — neither buys you anything CI doesn't already cover.
The headroom knob
BCA_HEADROOM is a single scalar in (0, 1]. The interesting band
is narrow:
| BCA_HEADROOM | Fires when a function reaches… | Use case |
|---|---|---|
| 0.99 | 99% of any limit | Tightest possible warning, fires on the last commit before the hard gate would. |
| 0.95 | 95% of any limit (default) | One-or-two-commit lead time. Good default. |
| 0.90 | 90% of any limit | Wider band, useful immediately after raising a limit, while the new ceiling settles. |
| 1.00 | 100% (parity with hard gate) | Sanity check that the two tiers agree. |
Values below ~0.80 turn the soft tier into a second hard tier with arbitrary numbers and stop being useful: every threshold has some function near 80% of it on a real codebase, and the soft tier becomes a permanent baseline-management chore rather than an early-warning signal.
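As a worked example of the arithmetic, the sketch below scales a small threshold table by several headroom ratios. The `scale` function is a hypothetical stand-in for the multiplication step of the helper, and the threshold values are the illustrative ones from the Baselines recipe.

```python
def scale(thresholds, headroom):
    """Multiply every limit by the headroom ratio, which must lie in (0, 1]."""
    if not 0.0 < headroom <= 1.0:
        raise ValueError(f"headroom must be in (0, 1]; got {headroom}")
    return {name: limit * headroom for name, limit in thresholds.items()}


# Illustrative limits, not a recommendation.
thresholds = {"cyclomatic": 15, "cognitive": 20, "loc.lloc": 200}

for ratio in (1.00, 0.99, 0.95, 0.90):
    scaled = scale(thresholds, ratio)
    print(ratio, {name: round(limit, 2) for name, limit in scaled.items()})
```

For small integer limits the band is coarse: with cyclomatic = 15, the ratios 0.95 and 0.99 give soft limits of 14.25 and 14.85, which an integer-valued metric crosses at the same value, so the choice of ratio matters most for large limits such as loc.lloc or Halstead effort.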
When the soft tier fires
A failed soft gate is a decision point, not a bug report. There are exactly three legitimate resolutions:
- Refactor. Same workflow as any other complexity regression — extract a helper, collapse a dispatch arm, split the function. This is the common case, and the soft tier exists to give you the time to do it on the same branch.
- Raise the limit. Edit bca-thresholds.toml and leave a why-comment explaining what changed (a new language module, a genuine algorithmic floor, a re-classified macro). Re-run make self-scan-headroom to confirm the new value covers the offender with room to spare.
- Absorb into the baseline. Run make self-scan-write-baseline (hard tier) or make self-scan-write-baseline-headroom (soft tier) when the value is legitimate forever: a parser dispatch arm whose width matches the grammar it covers, a stable state machine, generated code. Commit the diff in .bca-baseline.toml in the same PR as the code that produced it.
Don't pick "raise the limit" silently to make the gate go away. The committed why-comment is the only audit trail the next reader has; without it the bumped limit looks indistinguishable from neglect.
Skeleton: justfile
For projects that prefer just:
# bca local threshold gates. Hard tier mirrors CI; soft tier (headroom)
# is local-only early warning.
bca := "bca"
paths := "."
exclude := ".bcaignore"
thresholds := "bca-thresholds.toml"
baseline := ".bca-baseline.toml"
headroom := env_var_or_default("BCA_HEADROOM", "0.95")
py := env_var_or_default("PY", "python3")
jobs := `nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4`
base_args := "--paths " + paths + " --exclude-from " + exclude + " --num-jobs " + jobs
self-scan:
{{bca}} {{base_args}} \
check --config {{thresholds}} --baseline {{baseline}}
self-scan-headroom: self-scan
BCA_HEADROOM={{headroom}} \
BCA_THRESHOLDS={{thresholds}} \
BCA_BASELINE={{baseline}} \
{{py}} ./utils/bca-self-scan-headroom.py {{bca}} {{base_args}}
self-scan-write-baseline:
{{bca}} {{base_args}} \
check --config {{thresholds}} --write-baseline {{baseline}}
# Like the Make skeleton, never compose this with `self-scan-write-baseline`
# in parallel — they race on the same {{baseline}} file.
self-scan-write-baseline-headroom:
BCA_HEADROOM={{headroom}} \
BCA_THRESHOLDS={{thresholds}} \
BCA_BASELINE={{baseline}} \
BCA_HEADROOM_WRITE_BASELINE={{baseline}} \
{{py}} ./utils/bca-self-scan-headroom.py {{bca}} {{base_args}}
Skeleton: package.json scripts
For JavaScript projects pulling in bca via npx or a pinned
binary. The --num-jobs flag is threaded through via the
BCA_NUM_JOBS env var (defaulting to 4 in the scripts below) so the
npm tier runs the same shape of command as Make / just. Per
Principle 1, all three skeletons should produce equivalent
bca check invocations:
{
"scripts": {
"self-scan": "bca --paths . --exclude-from .bcaignore --num-jobs ${BCA_NUM_JOBS:-4} check --config bca-thresholds.toml --baseline .bca-baseline.toml",
"self-scan-headroom": "npm run self-scan && python3 ./utils/bca-self-scan-headroom.py bca --paths . --exclude-from .bcaignore --num-jobs ${BCA_NUM_JOBS:-4}",
"self-scan-write-baseline": "bca --paths . --exclude-from .bcaignore --num-jobs ${BCA_NUM_JOBS:-4} check --config bca-thresholds.toml --write-baseline .bca-baseline.toml",
"self-scan-write-baseline-headroom": "BCA_HEADROOM_WRITE_BASELINE=.bca-baseline.toml python3 ./utils/bca-self-scan-headroom.py bca --paths . --exclude-from .bcaignore --num-jobs ${BCA_NUM_JOBS:-4}"
}
}
Three portability footnotes for the npm tier:
- Env vars beat shell expansion. The helper reads BCA_HEADROOM from the environment (default 0.95), so overriding the band is BCA_HEADROOM=0.90 npm run self-scan-headroom on POSIX shells. On Windows cmd.exe, set the variable separately or use cross-env: cross-env BCA_HEADROOM=0.90 npm run self-scan-headroom. Avoid ${VAR:-default} as a primary configuration mechanism, because cmd.exe passes it through literally. The ${BCA_NUM_JOBS:-4} usage above is a reasonable default for POSIX hosts; Windows users either set BCA_NUM_JOBS explicitly or replace the literal with a fixed number in a per-platform script.
- python3 vs python. The stock python.org Windows installer ships python.exe and py.exe but no python3 alias. Replace the literal python3 above with py -3 (Windows launcher) or add a one-line scripts/python3.cmd shim that forwards to py -3. macOS / Linux / WSL hosts have python3 on PATH by default.
- Use cross-env (or pnpm exec --shell-mode) if you need any env var to be portable across the package.json users' shells. Mixing bash-isms into scripts is the most common source of "works on my Mac, broken on a Windows reviewer's machine" pings.
Pair with husky or
pre-commit so the same scripts run on
git commit.
Skeleton: pre-commit hook
If you use the pre-commit framework
(version 3.2.0 or newer — see the version note below), both
tiers are local hooks that shell out to make:
- repo: local
hooks:
- id: bca-self-scan
name: bca self-scan (hard gate)
entry: make self-scan
language: system
pass_filenames: false
stages: [pre-commit]
- id: bca-self-scan-headroom
name: bca self-scan-headroom (soft gate)
entry: make self-scan-headroom
language: system
pass_filenames: false
stages: [pre-commit]
pass_filenames: false is deliberate — bca discovers its own
inputs from --paths plus the baseline. Letting pre-commit
pass the changed files in would shrink the scan to just those
files and miss the cross-file effect of a baseline refresh.
Minimum pre-commit version: 3.2.0. The stages: vocabulary was renamed in pre-commit 3.2.0 (March 2023): commit → pre-commit, push → pre-push, etc. Older installs (notably RHEL 8 EPEL, Ubuntu 20.04 default packages, and any .pre-commit-config.yaml pinned to the legacy vocabulary) reject stages: [pre-commit] as an unknown stage name and the hook never registers. If you must support older installations, substitute stages: [commit]; in mixed fleets, state the pre-commit --version ≥ 3.2.0 requirement in the dev-tooling docs so the mismatch does not surface silently.
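A bootstrap script in a mixed fleet could pick the stage name from the installed version. The sketch below is an illustration only: it assumes that pre-commit --version prints a line of the form "pre-commit X.Y.Z", and the helper names parse_version and commit_stage_name are hypothetical.

```python
import re


def parse_version(version_output: str) -> tuple[int, ...]:
    """Extract (major, minor, patch) from text such as 'pre-commit 3.7.1'."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", version_output)
    if match is None:
        raise ValueError(f"no version number in {version_output!r}")
    return tuple(int(part) for part in match.groups())


def commit_stage_name(version_output: str) -> str:
    """Return the commit-time stage name this pre-commit release accepts."""
    # 3.2.0 is the release that introduced the renamed vocabulary.
    if parse_version(version_output) >= (3, 2, 0):
        return "pre-commit"
    return "commit"


print(commit_stage_name("pre-commit 3.7.1"))   # pre-commit
print(commit_stage_name("pre-commit 2.17.0"))  # commit
```

Comparing tuples of integers avoids the string-comparison trap where "3.10.0" sorts before "3.2.0".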
Helper script
The headroom helper is possible because bca check's
--threshold name=value flag accepts overrides on the command
line. The helper reads the thresholds TOML, multiplies each limit by
the ratio, and re-emits the results as --threshold flags.
A ~40-line implementation suitable for any project. It is a
condensed restatement of big-code-analysis's own
utils/bca-self-scan-headroom.py
— same env-var contract, same defensive checks, same exit codes —
trimmed for in-line readability:
#!/usr/bin/env python3
"""Scale every threshold by $BCA_HEADROOM and run bca check."""
from __future__ import annotations

import os, subprocess, sys
from pathlib import Path

try:
    import tomllib  # Python 3.11+
except ImportError:  # pragma: no cover
    import tomli as tomllib  # `pip install tomli` on 3.9/3.10


def main() -> int:
    if len(sys.argv) < 2:
        print("usage: bca-self-scan-headroom.py <bca-invocation...>", file=sys.stderr)
        return 64
    raw = os.environ.get("BCA_HEADROOM") or "0.95"  # treat '' as unset
    try:
        ratio = float(raw)
    except ValueError:
        print(f"BCA_HEADROOM must be a number; got {raw!r}", file=sys.stderr)
        return 64
    if not 0.0 < ratio <= 1.0:
        print(f"BCA_HEADROOM must be in (0, 1]; got {ratio}", file=sys.stderr)
        return 64
    thresholds_path = Path(os.environ.get("BCA_THRESHOLDS") or "bca-thresholds.toml")
    baseline_path = Path(os.environ.get("BCA_BASELINE") or ".bca-baseline.toml")
    if not thresholds_path.is_file():
        print(f"missing {thresholds_path}", file=sys.stderr)
        return 1
    cfg = tomllib.loads(thresholds_path.read_text(encoding="utf-8"))
    thresholds = cfg.get("thresholds", {})
    if not thresholds:
        print(f"no [thresholds] table in {thresholds_path}", file=sys.stderr)
        return 1
    flags: list[str] = []
    for name, limit in thresholds.items():
        # Float so a fractional scaled limit (e.g. 6.65 for nargs=7
        # at BCA_HEADROOM=0.95) survives; flooring to int silently
        # widens the band.
        flags += ["--threshold", f"{name}={limit * ratio:.6g}"]
    write_target = os.environ.get("BCA_HEADROOM_WRITE_BASELINE")
    if write_target:
        cmd = [*sys.argv[1:], "check", "--write-baseline", write_target, *flags]
    else:
        cmd = [*sys.argv[1:], "check", "--baseline", str(baseline_path), *flags]
    return subprocess.call(cmd)


if __name__ == "__main__":
    sys.exit(main())
Five implementation details that matter in practice:
- Emit a float, not an int. bca check --threshold parses every value as f64, and the offender test is value > limit (strict). At BCA_HEADROOM=0.95, nargs=7 scales to 6.65. Flooring to 6 would silently widen the band by an extra ratio step. The {:.6g} format rounds away float-multiplication artefacts (6.6499999999999995) without losing precision on the largest thresholds in the file.
- Validate the ratio. The half-open interval (0, 1] is the only sensible range. 0 disables the gate; values above 1 would make the soft tier looser than the hard tier, so it would fire only after CI already had, which is useless. The or "0.95" idiom treats both unset and set-but-empty (BCA_HEADROOM= in a stripped CI env) as the default, so a misconfigured matrix variable does not exit 64 with the confusing message got ''.
- Same baseline as the hard tier. The soft tier's --baseline must point at the exact same file the hard tier writes; otherwise every hard-tier offender re-fires on the soft tier. The helper reads BCA_BASELINE from the env (default .bca-baseline.toml), so renaming the file in one place (the Make / just recipe) propagates to both tiers without editing the Python.
- Read everything from the environment, not argv. Env-var propagation works the same in make, just, and npm scripts on every platform; CLI parameter expansion (${HEADROOM:-0.95}) does not, because Windows cmd.exe passes it through literally. Argv carries only the literal bca invocation prefix; the four configuration knobs (BCA_HEADROOM, BCA_THRESHOLDS, BCA_BASELINE, BCA_HEADROOM_WRITE_BASELINE) all come from os.environ.
- Defensive diagnostics. The argv-length, file-exists, and empty-[thresholds] checks all exit before constructing a bca command, with stderr messages that name the helper rather than the downstream tool. Without them, a missing config file produces a confusing "no thresholds defined" error from bca itself, and the user has to bisect whether the helper, the config, or bca is at fault. The fallback import tomli as tomllib keeps the script working on Python 3.9/3.10 hosts (RHEL 8, Ubuntu 20.04, Debian bullseye); on 3.11+ tomllib is stdlib and tomli is not needed.
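The scaling arithmetic can be checked in isolation. The sketch below repeats the helper's formatting expression on a hand-written table; the function names scaled_flags and is_offender and the limit values 7 and 15 are illustrative assumptions, not values taken from any real bca-thresholds.toml.

```python
def scaled_flags(thresholds: dict[str, float], ratio: float) -> list[str]:
    """Turn {name: limit} into --threshold flags scaled by ratio."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError(f"ratio must be in (0, 1]; got {ratio}")
    flags: list[str] = []
    for name, limit in thresholds.items():
        # .6g rounds away float noise (6.6499999999999995 -> 6.65)
        # while keeping the fractional part of the scaled limit.
        flags += ["--threshold", f"{name}={limit * ratio:.6g}"]
    return flags


def is_offender(value: float, limit: float) -> bool:
    """Strict comparison, mirroring the value > limit test described above."""
    return value > limit


flags = scaled_flags({"nargs": 7, "cyclomatic": 15}, 0.95)
print(flags)
# ['--threshold', 'nargs=6.65', '--threshold', 'cyclomatic=14.25']

print(is_offender(7, 6.65))  # True: seven arguments trips the soft tier
print(is_offender(6, 6.65))  # False: six arguments stays inside the band
```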
Composition with the broader baseline workflow
The four self-scan* targets above are not a replacement for the
documented Baselines recipe — they are that
recipe, mechanised into developer-machine commands. The same
ordering still applies:
1. Bootstrap once. Write the initial thresholds, write the initial baseline, commit both.
2. Gate on every commit. Hard tier fails on regression; soft tier fails on encroachment.
3. Refresh during focused refactors. When a function legitimately moved (someone did pay down debt), regenerate the baseline and review the diff.
4. Retire when empty. When .bca-baseline.toml shrinks to just version = 2, drop the --baseline flag and delete the file. The thresholds now stand on their own.
The local tiers shorten the feedback loop on steps 2 and 3 from
"red CI on a pull request" to "red Make recipe before
git commit returns". That is the whole pitch.
Related industry patterns
The hard / soft tier split is one instance of a broader pattern. If you have used any of the following, the mental model carries over:
- Sonar's Quality Gates focused on new code. Old code is held at its current state; changes must not make things worse. The baseline file is bca's native form of the "new code" / "leak period" idea.
- clippy's warn-vs-deny lint levels. A warn lint surfaces in local builds; the same lint denied with -D warnings fails CI. The two-tier configuration gives you a place to land experimental tighter rules.
- The ratchet pattern in general migration tooling: record today's count, fail on increase, lower the ceiling as the count drops. bca check ratchets per-function rather than per-pattern, but the monotonicity guarantee is the same.
- -Wall + -Werror in C/C++. A first pass with -Wall reveals the noise; promoting to -Werror after the baseline reaches zero is the same retirement step as deleting .bca-baseline.toml once it's empty.
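To make the per-function ratchet idea concrete, here is a minimal sketch of the general pattern. It is not bca's implementation or baseline file format: the dictionaries, the function names, and the limit of 15 are all hypothetical.

```python
def ratchet_violations(
    baseline: dict[str, int], current: dict[str, int], limit: int
) -> list[str]:
    """Return the functions that break the ratchet.

    baseline maps known offenders to the value recorded for them;
    current maps every function to the value measured now.
    """
    violations = []
    for name, value in sorted(current.items()):
        if value <= limit:
            continue  # within the threshold: always fine
        # Known offenders may stay at or below their recorded value;
        # functions missing from the baseline get no allowance.
        allowed = baseline.get(name, limit)
        if value > allowed:
            violations.append(name)
    return violations


def refreshed_baseline(current: dict[str, int], limit: int) -> dict[str, int]:
    """Record only what still exceeds the limit, so the ceiling can only drop."""
    return {name: value for name, value in current.items() if value > limit}


baseline = {"parse": 30, "render": 22}
current = {"parse": 28, "render": 25, "emit": 16, "tiny": 3}
print(ratchet_violations(baseline, current, limit=15))
# ['emit', 'render']

print(refreshed_baseline({"parse": 28, "render": 14}, limit=15))
# {'parse': 28}
```

When the refreshed baseline comes back empty, the allowance table has nothing left to say and the plain limit is the whole gate.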
AST queries
Recipes that work with the parsed syntax tree directly: searching for node types, counting them, or dumping the tree.
Library-side equivalents. Every recipe below has an in-process Rust counterpart in Walking the AST directly — useful when shelling out per file is too slow or when you want to compose metrics with custom AST analysis in one parse.
Detect parse errors before committing
Tree-sitter exposes a synthetic ERROR node anywhere it could not
parse. Use find to surface them:
bca \
--include "*.rs" \
--paths "$PWD" \
find ERROR
Flag ordering. --include and --exclude are variadic and consume tokens until the next flag begins, so put them before --paths to avoid the subcommand name being eaten as a glob. The single-value = form (--include="*.rs") also works.
A clean run prints nothing. Wire this into a pre-commit hook to fail fast when a syntactically broken file is staged.
Count specific syntactic constructs
count accepts one or more node-type names and reports the totals.
For example, to count if, for, and while constructs across a
Rust project:
bca \
--include "*.rs" \
--paths src/ \
count if_expression for_expression while_expression
The exact node-type names come from the underlying tree-sitter grammar. To discover them, dump the AST of a small sample file (see below) and read the node names off the tree.
Find all unsafe blocks in a Rust crate
bca \
--include "*.rs" \
--paths src/ \
find unsafe_block
Each match prints the file path and the line range of the node.
Dump the AST of a file
Useful for understanding why a metric came out the way it did, or for
discovering the tree-sitter node names you need for find / count:
bca --paths src/lib.rs dump
To narrow the dump to a specific function or block, add line bounds
with the global --ls and --le flags:
bca \
--paths src/lib.rs \
--ls 42 --le 88 \
dump
--ls / --le apply to dump and find, so the same range can be
used to scope a search to a single function:
bca \
--paths src/lib.rs \
--ls 42 --le 88 \
find return_expression
List every function or method
For a quick human-readable inventory:
bca \
--include "*.rs" \
--paths src/ \
functions
The output is a tree per file: an In file … header followed by an
indented row per function with name and line span. It is intended for
reading, not parsing.
For tooling that needs a structured inventory — coverage mapping,
documentation generation, code-owner reports — use the JSON metrics
output instead and walk .spaces[] recursively, taking entries whose
kind is function:
bca \
--include "*.rs" \
--paths src/ \
metrics -O json \
| jq -c '
. as $root
| def funcs: if .kind == "function" then [.] else [] end
+ (.spaces // [] | map(funcs) | add // []);
funcs[] | {file: $root.name, name, start_line, end_line}
'
This emits one JSON object per function and is safe to pipe into downstream tooling.
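For consumers that prefer a script over a jq filter, the same recursion can be sketched in Python. The field names (kind, spaces, name, start_line, end_line) are the ones the jq filter above reads; the sample document is hand-written for illustration and is not real bca output.

```python
import json


def functions(space: dict) -> list[dict]:
    """Collect every space whose kind is "function", depth first."""
    found = [space] if space.get("kind") == "function" else []
    for child in space.get("spaces") or []:
        found.extend(functions(child))
    return found


def inventory(root: dict) -> list[dict]:
    """One flat record per function, tagged with the file-level name."""
    return [
        {
            "file": root.get("name"),
            "name": func.get("name"),
            "start_line": func.get("start_line"),
            "end_line": func.get("end_line"),
        }
        for func in functions(root)
    ]


# Hand-written sample shaped like the fields used above.
sample = {
    "name": "src/lib.rs", "kind": "unit", "start_line": 1, "end_line": 20,
    "spaces": [
        {"name": "outer", "kind": "function", "start_line": 1, "end_line": 8,
         "spaces": [
             {"name": "inner", "kind": "function", "start_line": 2,
              "end_line": 4, "spaces": []},
         ]},
        {"name": "Widget", "kind": "impl", "start_line": 10, "end_line": 20,
         "spaces": [
             {"name": "new", "kind": "function", "start_line": 11,
              "end_line": 13, "spaces": []},
         ]},
    ],
}

for row in inventory(sample):
    print(json.dumps(row))
```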
Exporting metric data
The metrics, ops, and preproc subcommands all support structured
output formats meant for machine consumption. Pair them with a JSON
processor like jq for ad-hoc
analysis, or feed them into a database or dashboard.
Export per-file metrics as JSON
bca \
--paths src/ \
metrics \
-O json \
-o /tmp/metrics
This writes one JSON file per analyzed source file under
/tmp/metrics/. The output filename mirrors the input path with the
format extension appended — src/lib.rs becomes src/lib.rs.json,
not src/lib.json. Use --pretty if you intend to read the files by
hand:
bca -p src/ metrics --pretty -O json -o /tmp/metrics
CBOR (-O cbor) is the most compact format; it is binary and
therefore requires -o. JSON, TOML, and YAML can all be streamed to
stdout when -o is omitted, which is useful for pipelines.
Pull a single metric across an entire tree
Combine streamed JSON output with jq to extract one value per file:
bca -p src/ metrics -O json \
| jq -c '{file: .name, mi: .metrics.mi.mi_visual_studio}'
The same idea works for any metric — cyclomatic.sum,
cognitive.sum, loc.sloc, and so on. Run bca list-metrics descriptions to see the catalog.
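If jq is not available, the stream can be split in Python. This sketch assumes that the streamed output is a sequence of JSON documents written back to back, one per file, which is the same assumption the jq pipeline above relies on. The sample values are invented.

```python
import json
from typing import Iterator


def iter_json_documents(text: str) -> Iterator[dict]:
    """Yield each top-level JSON value from a concatenated stream."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so do it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        value, pos = decoder.raw_decode(text, pos)
        yield value


def metric_per_file(text: str, *path: str) -> Iterator[tuple]:
    """Yield (file name, metric value) following path under .metrics."""
    for doc in iter_json_documents(text):
        value = doc["metrics"]
        for key in path:
            value = value[key]
        yield doc["name"], value


# Invented sample stream: two documents, one per analyzed file.
stream = (
    '{"name": "src/a.rs", "metrics": {"mi": {"mi_visual_studio": 61.5}}}\n'
    '{"name": "src/b.rs", "metrics": {"mi": {"mi_visual_studio": 48.0}}}\n'
)
for name, mi in metric_per_file(stream, "mi", "mi_visual_studio"):
    print(name, mi)
```

In real use the text would come from the standard output of the bca command shown above, for example through subprocess.run with capture_output=True and text=True.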
Discover the metric catalog at runtime
Tooling that drives the CLI shouldn't hard-code metric names. Ask the binary:
bca list-metrics # one name per line
bca list-metrics descriptions # name + summary
This is the right input for code generators, schema definitions, or tab-completion.
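A small sketch of that idea: validate user-supplied metric names against the catalog text before building a command line. It assumes only what is stated above, namely that bca list-metrics prints one name per line; the sample catalog text and the helper names are hypothetical.

```python
def parse_catalog(list_metrics_output: str) -> set[str]:
    """Turn one-name-per-line output into a set of metric names."""
    return {
        line.strip()
        for line in list_metrics_output.splitlines()
        if line.strip()
    }


def unknown_metrics(requested: list[str], catalog: set[str]) -> list[str]:
    """Names the user asked for that the binary does not know."""
    return sorted(set(requested) - catalog)


# Hypothetical catalog text; in real use, capture the stdout of
# `bca list-metrics` instead of hard-coding it.
catalog = parse_catalog("cyclomatic\ncognitive\nloc\n")
print(unknown_metrics(["loc", "cyclomatik"], catalog))  # ['cyclomatik']
```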
Extract operands and operators (Halstead)
ops emits the raw operand and operator lists per file, which is the
input to Halstead-style metric calculations beyond what the built-in
report shows:
bca \
--include "*.rs" \
--paths src/ \
ops \
-O json --pretty \
-o /tmp/ops
Flag ordering. Variadic flags like --include and --exclude consume tokens until the next flag, so put them before --paths (or use the --include=GLOB single-value form) to keep the subcommand from being eaten as a glob.
Each output file mirrors the input path under /tmp/ops/.
Strip comments from a tree
strip-comments rewrites source so that downstream tools that don't
understand comment syntax can still consume the code. It defaults to
streaming the result to stdout; pass --in-place to overwrite files
on disk:
# Stream a single file with comments removed.
bca --paths src/lib.rs strip-comments
# Rewrite every Python file in src/ in place.
bca --include "*.py" --paths src/ \
strip-comments --in-place
--in-place is destructive — make sure the tree is committed or
backed up first.
Driving the REST API
bca-web exposes the same analysis primitives over
HTTP. Use it when the consumer is a long-running service (an editor
plugin, CI worker, or web app) that should not pay the cost of
spawning the CLI per file.
For the full endpoint reference, see Rest API.
The recipes below show practical end-to-end calls with curl.
Start the server
bca-web --host 127.0.0.1 --port 8080 -j "$(nproc)"
Verify it's up:
curl -sf http://127.0.0.1:8080/ping && echo "ok"
# => ok
/ping returns 200 OK with an empty body — curl -sf exits 0 on
success and non-zero on any HTTP error, which is what scripts want.
Compute metrics for an inline snippet
curl -s http://127.0.0.1:8080/metrics \
-H 'Content-Type: application/json' \
-d '{
"id": "snippet-1",
"file_name": "demo.rs",
"code": "fn add(a: i32, b: i32) -> i32 { a + b }",
"unit": false
}' \
| jq '.spaces.metrics'
unit: true returns only top-level metrics; false walks every
function and class space inside the snippet. The server infers
language from file_name, so the extension matters.
Compute metrics for a file from disk
curl --data-binary plus jq makes it easy to package a real file
into the JSON envelope the server expects:
jq -nc \
--arg id "$(uuidgen)" \
--arg file_name "src/lib.rs" \
--rawfile code src/lib.rs \
'{id: $id, file_name: $file_name, code: $code, unit: false}' \
| curl -s http://127.0.0.1:8080/metrics \
-H 'Content-Type: application/json' \
--data-binary @- \
| jq '.spaces.metrics.cyclomatic, .spaces.metrics.cognitive'
This pattern — jq -n --rawfile to build the request, curl --data-binary @- to stream it — is the easiest way to avoid quoting
problems with multi-line source code.
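A client written in Python gets the same quoting safety from json.dumps. The sketch below builds only the request body, using the four fields shown in the envelope above; the function name metrics_request_body is an assumption made for illustration.

```python
import json


def metrics_request_body(
    file_name: str, code: str, request_id: str, unit: bool = False
) -> bytes:
    """Serialize the /metrics envelope; json.dumps handles all escaping."""
    payload = {
        "id": request_id,
        "file_name": file_name,
        "code": code,
        "unit": unit,
    }
    return json.dumps(payload).encode("utf-8")


# Quotes, backslashes, and newlines in the source need no manual escaping.
source = 'fn main() {\n    println!("a \\"quoted\\" word");\n}\n'
body = metrics_request_body("demo.rs", source, "snippet-1")
print(body.decode("utf-8"))
```

The bytes can then be sent as the data argument of urllib.request.Request together with a Content-Type: application/json header.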
Strip comments through the API
The endpoint is /comment (singular). It has two variants selected
by Content-Type:
- application/json: wraps the request and response in JSON. The response code field is a byte array, not a string, because the underlying API is byte-oriented.
- application/octet-stream: accepts the source as the raw request body and returns the stripped source as the raw response body. This is by far the easiest variant to use from the shell.
Octet-stream form (recommended for one-off shell use):
curl -s "http://127.0.0.1:8080/comment?file_name=demo.py" \
-H 'Content-Type: application/octet-stream' \
--data-binary $'# leading comment\nprint("hi") # trailing'
# => print("hi")
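JSON form (use when your client speaks JSON natively). Decode the
byte array with jq … | implode. This is exact for ASCII source only:
implode treats each number as a Unicode code point, so multi-byte
UTF-8 sequences are not reassembled: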
curl -s http://127.0.0.1:8080/comment \
-H 'Content-Type: application/json' \
-d '{
"id": "strip-1",
"file_name": "demo.py",
"code": "# leading comment\nprint(\"hi\") # trailing"
}' \
| jq -r '.code | implode'
The JSON response carries the same id you sent, so a client that
multiplexes many requests can correlate them.
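In a client with a real byte type, the code array can be decoded as UTF-8 directly, which also handles non-ASCII source. A minimal sketch, using a hand-built response that follows the shape described above (an id plus a code field holding a list of byte values):

```python
import json


def decode_code_field(response_text: str) -> str:
    """Rebuild the stripped source from the byte-array code field."""
    payload = json.loads(response_text)
    # bytes() accepts any iterable of integers in the range 0..255.
    return bytes(payload["code"]).decode("utf-8")


# Hand-built response for illustration; "é" is two bytes in UTF-8.
sample = json.dumps(
    {"id": "strip-1", "code": list('print("é")'.encode("utf-8"))}
)
print(decode_code_field(sample))  # print("é")
```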
Extract function spans for an editor plugin
The endpoint is /function (singular):
curl -s http://127.0.0.1:8080/function \
-H 'Content-Type: application/json' \
-d '{
"id": "spans-1",
"file_name": "demo.rs",
"code": "fn a() {}\nfn b() {}\n"
}' \
| jq '.spans'
Each entry has name, start_line, end_line, and an error
boolean (set when the parser flagged the function span as
malformed) — enough for an editor to draw a function navigator
without re-parsing the file locally.
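As an illustration of the editor use case, the sketch below answers "which function is the cursor in?" from a list of spans. The entries use the four fields named above; the sample data is hand-written, and treating start_line and end_line as 1-based and inclusive is an assumption.

```python
from typing import Optional


def enclosing_function(spans: list[dict], line: int) -> Optional[str]:
    """Name of the innermost well-formed span that contains line."""
    candidates = [
        span
        for span in spans
        if not span["error"] and span["start_line"] <= line <= span["end_line"]
    ]
    if not candidates:
        return None
    # Prefer the tightest span so a nested function wins over its parent.
    tightest = min(candidates, key=lambda s: s["end_line"] - s["start_line"])
    return tightest["name"]


# Hand-written spans for illustration.
spans = [
    {"name": "outer", "start_line": 1, "end_line": 10, "error": False},
    {"name": "inner", "start_line": 3, "end_line": 5, "error": False},
    {"name": "broken", "start_line": 12, "end_line": 14, "error": True},
]
print(enclosing_function(spans, 4))   # inner
print(enclosing_function(spans, 8))   # outer
print(enclosing_function(spans, 13))  # None
```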
Calling the API from CI
The server starts in milliseconds, so for short-lived CI jobs it's often simplest to start it as a background process inside the job and tear it down at the end:
bca-web --port 8080 &
SERVER_PID=$!
trap 'kill "$SERVER_PID"' EXIT
# Wait for it to come up.
until curl -sf http://127.0.0.1:8080/ping >/dev/null; do sleep 0.1; done
# … run your analysis calls here …
For longer-lived workers, run the server as a systemd unit (or container) and point your jobs at its host/port.
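The shell loop above waits forever if the server never comes up. A job driven from Python can bound the wait instead. This is a sketch using only the standard library; the function name wait_for_ping and its default timings are assumptions.

```python
import time
import urllib.error
import urllib.request


def wait_for_ping(url: str, timeout: float = 10.0, interval: float = 0.1) -> bool:
    """Poll url until it answers with a 2xx status or timeout seconds pass."""
    # The target is a local server, so bypass any proxy configured
    # in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with opener.open(url, timeout=1.0) as response:
                if 200 <= response.status < 300:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # not listening yet, or the server returned an error status
        time.sleep(interval)
    return False
```

A caller would invoke wait_for_ping("http://127.0.0.1:8080/ping") after starting the server and fail the job when it returns False, rather than hanging until the CI runner's own timeout fires.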
Using as a Library
big-code-analysis is published on crates.io as a Rust library. The
CLI (bca) and REST server (bca-web) are both thin wrappers around
the same public API, so anything they can do you can do directly from
your own crate.
This section is task-oriented. For full type signatures and field docs, follow the rustdoc on docs.rs.
When to embed the library
Reach for the library (instead of shelling out to bca) when you
want one or more of the following:
- In-process analysis. Avoid the cost of spawning a subprocess per file when scoring thousands of files in a custom tool, IDE plugin, or static-analysis pipeline.
- In-memory source. Score generated, pre-processed, or streamed source without writing it to disk first. See Analyzing in-memory source.
- Selective walking. Drive a custom traversal over the FuncSpace tree to extract per-function metrics on your own schedule. See Walking FuncSpace results.
- Custom output. Skip the JSON / YAML / TOML / CBOR serializers shipped under src/output/ and emit your own report format (CSV, SARIF, a database row, whatever).
If you just want a Markdown quality report or a CI threshold gate,
the bca CLI is faster to wire up.
What is on offer today
- Quick start: parse a string, get a FuncSpace, print the cognitive complexity.
- Analyzing in-memory source: feed source from a buffer rather than a file.
- Reusing an existing tree-sitter Tree: feed a caller-built tree_sitter::Tree into the metric walker.
- Parse once, run metrics many times: hold a parsed Ast and run multiple metric subsets / custom walks against the same tree.
- Walking the AST directly: count syntactic constructs, find nodes by kind, detect parse errors, or build a symbol table alongside the metrics walk.
- Selecting metrics: (stub, planned).
- Walking FuncSpace results: recurse into nested function / class / impl spaces.
- Error handling: what Result<FuncSpace, MetricsError> means today and how to turn it into a useful diagnostic.
- Stability and versioning: what you can and cannot rely on across the 1.x line.
A note on API stability
The library is on the 1.x line and ships under a written
stability contract: the shape of the public API
is held stable across patch and minor bumps, and breaking changes
are reserved for the next major bump. Every example in this
section compiles against the current published crate and is
expected to keep compiling across 1.x without edits.
Metric values may still drift across minor bumps when a grammar pin moves or a metric definition is fixed — see STABILITY.md § What is stable in value for the carve-out. Each drift is called out in the changelog entry that introduces it.
Quick start
This page walks through the minimum amount of code needed to compute metrics from a string of source code.
1. Add the crate
# Cargo.toml
[dependencies]
big-code-analysis = "1.1.0"
The crate uses Rust edition 2024 and pins rust-version = "1.94".
Older toolchains will not build it — see the
MSRV section of STABILITY.md for the policy.
2. Compute metrics from a string
The recommended entry point is analyze: pass a Source
carrying the language, source bytes, and an optional display name,
plus a MetricsOptions for any per-traversal flags. No
filesystem path is needed.
use big_code_analysis::{analyze, MetricsOptions, Source, LANG};

fn main() {
    let source = "fn add(a: i32, b: i32) -> i32 { a + b }";
    let space = analyze(
        Source::new(LANG::Rust, source.as_bytes())
            .with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    )
    .expect("Rust source should parse");

    println!(
        "cognitive complexity (file-level): {}",
        space.metrics.cognitive.cognitive_sum(),
    );
}
Source::name ends up as the top-level FuncSpace::name; passing
None leaves the top-level name unset. The return type is
Result<FuncSpace, MetricsError>. The Err variant
tells parse-failure apart from empty-input apart from disabled-
language; see Error handling for the variant
set and matching patterns. MetricsError is #[non_exhaustive], so
always include a _ arm when matching.
Tip: use big_code_analysis::prelude::*; brings the recommended
entry points (analyze, Source, MetricsOptions, MetricsError,
LANG, FuncSpace, CodeMetrics, SpaceKind, Metric,
metrics_from_tree) into scope in one line. Anything outside the
prelude can still be reached by name — for example
use big_code_analysis::guess_language;.
The older get_function_spaces(lang, bytes, path, pr) and metrics_with_options(parser, path, options) entry points are still available but #[deprecated]: they derive the top-level name from path via lossy UTF-8 conversion. Use them only when you already have a Parser<T> in hand from another seam.
3. What you got back
FuncSpace is a tree of spaces. The top-level node represents the
whole file; its spaces field holds nested function / class / impl
spaces. Every node carries the same CodeMetrics
struct, so you can read any metric at any level of granularity.
use big_code_analysis::{analyze, MetricsOptions, Source, SpaceKind, LANG};

fn main() {
    let source = "\
fn outer() {
    fn inner() {}
}
";
    let space = analyze(
        Source::new(LANG::Rust, source.as_bytes())
            .with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    )
    .expect("Rust source should parse");

    assert_eq!(space.kind, SpaceKind::Unit);
    assert_eq!(space.spaces.len(), 1); // `outer`
    assert_eq!(space.spaces[0].spaces.len(), 1); // `inner`
}
For a deeper walk over FuncSpace, see
Walking FuncSpace results.
Picking a language
If you do not know the language up front, use guess_language —
it consults the path extension, an Emacs mode line in the buffer,
and the shebang in that order:
use std::path::PathBuf;

use big_code_analysis::{analyze, guess_language, MetricsOptions, Source};

fn main() {
    let source = b"print('hi')\n";
    let path = PathBuf::from("hello.py");

    let (Some(lang), _name) = guess_language(source, &path) else {
        eprintln!("unrecognised language");
        return;
    };

    let _space = analyze(
        Source::new(lang, source).with_name(Some("hello.py".to_owned())),
        MetricsOptions::default(),
    );
}
guess_language returns (None, _) for unknown extensions; treat
that as "skip this file" rather than as a parse error.
What changes when
The recommended entry point is analyze(Source, MetricsOptions) and
returns Result<FuncSpace, MetricsError> (per #253 and #254).
The library-DX tracker collects the remaining shape changes —
naming, per-language features, and the parse seam.
Analyzing in-memory source
big-code-analysis never requires source to live on disk. The
recommended entry point analyze takes a Source carrying the
language, source bytes, and an optional caller-supplied display
name; no filesystem path is involved unless the C/C++ preprocessor
lookup needs one (Source::preproc_path).
This is useful for:
- Scoring generated code before it is written out.
- Scoring pre-processed or bundled source (e.g. after a template expansion).
- Driving the analyzer from a language server or editor plugin that already holds the buffer in memory.
- Stdin pipelines and unit tests that should not touch the filesystem.
Reading from a buffer
use big_code_analysis::{analyze, MetricsOptions, Source, LANG};

fn analyze_buffer(source: &[u8]) -> Option<f64> {
    // `Source::name` is the display identifier baked into the
    // top-level `FuncSpace`. Pick whatever is meaningful for
    // downstream consumers (logs, JSON output); pass `None` if
    // you have nothing useful to attach.
    let space = analyze(
        Source::new(LANG::Python, source).with_name(Some("<stdin>".to_owned())),
        MetricsOptions::default(),
    )
    .ok()?;
    Some(space.metrics.cognitive.cognitive_sum())
}
Source::new borrows the source bytes — the caller retains
ownership. If your downstream pipeline needs to highlight findings
on the same bytes, you can keep using the original buffer after
analyze returns.
Reading from stdin
use std::io::{self, Read};

use big_code_analysis::{analyze, MetricsOptions, Source, LANG};

fn main() -> io::Result<()> {
    let mut source = Vec::new();
    io::stdin().read_to_end(&mut source)?;

    let space = match analyze(
        Source::new(LANG::Javascript, &source)
            .with_name(Some("<stdin>".to_owned())),
        MetricsOptions::default(),
    ) {
        Ok(space) => space,
        Err(err) => {
            eprintln!("parse failed: {err}");
            std::process::exit(1);
        }
    };

    println!("{}", space.metrics.cyclomatic.cyclomatic_sum());
    Ok(())
}
Picking the language from content
If you do not know the language up front, combine
guess_language with analyze. guess_language peeks at the
path extension, an Emacs mode-line, and the shebang in that order:
use std::path::PathBuf;

use big_code_analysis::{analyze, guess_language, MetricsOptions, Source};

fn analyze_unknown(path: PathBuf, source: Vec<u8>) -> Option<()> {
    let (lang, _name) = guess_language(&source, &path);
    let lang = lang?;
    // `.ok()?` collapses `MetricsError` into `None` so this helper's
    // `Option` return shape is preserved. See `error-handling.md` for
    // a richer mapping that preserves the variant.
    let _space = analyze(
        Source::new(lang, &source)
            .with_name(path.to_str().map(str::to_owned)),
        MetricsOptions::default(),
    )
    .ok()?;
    Some(())
}
guess_language returns (None, _) for unrecognised extensions —
treat that as "skip" rather than as a hard error.
Watch out for these
- Name identity matters. Top-level FuncSpace::name is whatever string you put in Source::name. Two analyses sharing the same name will look identical to a downstream consumer that keys on it. Use distinct labels for distinct buffers.
- Source::name is Option<String>. Passing None leaves the top-level FuncSpace::name as None, which is useful for ad-hoc snippets that have no meaningful identity. Downstream consumers that require a stable identifier should check for None explicitly.
- No filesystem fallback. Unlike the CLI, the library does not read sibling files, follow #includes, or interpret a .gitignore. Feed it exactly the bytes you want analyzed.
Alternative: the path-positional shim
For backwards compatibility, the older path-positional entry points
(get_function_spaces and metrics_with_options) still work
but are #[deprecated] in favour of analyze. They derive
FuncSpace::name from the supplied &Path via lossy UTF-8
conversion and are otherwise equivalent.
Reusing an existing tree-sitter Tree
A common pain point is that callers who already drive
tree-sitter for syntax highlighting, code folding, or queries
end up parsing every file twice: once for their own tree, once
inside get_function_spaces. The parse seam (issue #251) lets you
hand big-code-analysis an already-parsed tree_sitter::Tree and
get the same FuncSpace back without re-parsing.
Prefer Ast::from_tree_sitter if you also want to run the metric walker more than once against the same parse (different MetricsOptions::with_only selections, custom tree-sitter walks interleaved with metrics, etc.). See Parse once, run metrics many times. The metrics_from_tree function shown below is a single-shot equivalent that constructs an Ast internally and discards it after one call.
When to use this
Use the parse seam if you:
- Already keep a tree_sitter::Tree per open buffer (editor, LSP server, custom static-analysis pipeline) and want to reuse that parse for metrics rather than paying the byte-based cost again.
- Want to run multiple passes (metrics + AST dump + custom analysis) against one parse result.
- Want to build trees on your side without taking a separate tree-sitter dependency. The re-exported big_code_analysis::tree_sitter module is the same crate we link against, so the types agree by definition.
Use the byte-based entry points
(get_function_spaces / metrics_with_options) if
you do not already have a tree — they construct the parser
internally and own the parse end to end.
Working example
use std::path::PathBuf;

use big_code_analysis::{
    get_function_spaces, metrics_from_tree, tree_sitter, LANG, MetricsOptions,
};

fn main() {
    let source_code = "fn main() { if true { 1 } else { 2 }; }";
    let path = PathBuf::from("foo.rs");
    let source = source_code.as_bytes().to_vec();

    // Step 1: build a tree with the *re-exported* tree-sitter crate.
    // Using `big_code_analysis::tree_sitter` (rather than a direct
    // `tree-sitter` dependency on your side) guarantees the version
    // matches the one the metric walker was compiled against.
    let mut parser = tree_sitter::Parser::new();
    parser
        .set_language(&LANG::Rust.get_tree_sitter_language())
        .expect("rust grammar pinned to a compatible version");
    let tree = parser
        .parse(&source, None)
        .expect("parser has a language set");

    // Step 2: feed the tree into metrics_from_tree.
    let from_tree = metrics_from_tree(
        &LANG::Rust,
        tree,
        source.clone(),
        &path,
        None,
        MetricsOptions::default(),
    )
    .expect("non-empty input");

    // Step 3 (optional): confirm the values match the byte-based path.
    let from_bytes = get_function_spaces(&LANG::Rust, source, &path, None)
        .expect("non-empty input");
    assert_eq!(
        from_tree.metrics.cyclomatic.cyclomatic_sum(),
        from_bytes.metrics.cyclomatic.cyclomatic_sum(),
    );
}
The same shape works for any LANG variant — pass the
matching grammar to tree_sitter::Parser::set_language (via
LANG::get_tree_sitter_language) and the metric
walker will produce the same FuncSpace it would have produced
from bytes.
Lower-level: Parser::from_tree (internal)
metrics_from_tree is the documented entry point for tree reuse —
it dispatches on a &LANG and hides the generic parser plumbing
entirely. The lower-level path goes through Parser<T> /
ParserTrait, which are now #[doc(hidden)] (see issue #256). They
remain pub so the in-tree macros (mk_action!, action::<T>,
the Callback dispatch shared with the REST API) can refer to
them, but they are not part of the documented surface and treating
them as a stable extension point is at your own risk.
The per-language *Parser aliases (RustParser, PythonParser,
…) emitted by mk_langs! are doc-hidden for the same reason —
see STABILITY.md for the escape-hatch caveat. For library
consumers, the higher-level metrics_from_tree shown above is the
right entry point because it dispatches on a &LANG at runtime
and does not expose any of the per-language tag types or trait
bounds.
Out of scope
- Incremental re-computation. Applying a tree_sitter::InputEdit and re-querying only the changed spans is not supported yet: the metric walker still walks the entire tree on every call. The parse seam is the first step; making the walker itself incremental is a follow-up.
- Promoting all of Node's pub(crate) traversal methods. Node still exposes its inner tree_sitter::Node through the public .0 field for ad-hoc traversal; the wrapper helpers remain crate-private and are tracked under the pub use curation issue.
Parse once, run metrics many times
big-code-analysis's one-shot entry point analyze re-parses
its Source on every call. For pipelines that score a file
multiple times — different metric subsets, an interleaved custom
tree-sitter walk, or a metric re-run after a configuration change — that
re-parse is wasted work.
The Ast type, added in 0.0.26 (#264), exposes the seam:
parse the source once, then call Ast::metrics as many
times as you need against the held parse.
When to use this
Reach for Ast when any of the following applies:
- Selective metric runs. You compute one set of metrics for a report, then another for a CI threshold gate, against the same file.
- Custom tree-sitter walks. You already drive a tree_sitter::Tree for queries / highlighting / symbol extraction and want to fold the metric walker into the same parse.
- Cached analysis. An LSP-like service that holds parsed files in memory should be able to re-run metrics on demand when configuration changes, without going back to bytes.
If you only ever compute every metric once per file, stick with
analyze — it now delegates to Ast internally, so the
shapes line up but the one-shot API stays simpler.
Selective metrics across calls
use big_code_analysis::{Ast, LANG, Metric, MetricsOptions, Source};

fn main() {
    let source = b"fn f(x: i32) -> i32 { if x > 0 { 1 } else { -1 } }";

    // One parse, two metric subsets.
    let ast = Ast::parse(Source::new(LANG::Rust, source))
        .expect("rust feature enabled");

    let loc = ast
        .metrics(MetricsOptions::default().with_only(&[Metric::Loc]))
        .expect("walker succeeds");
    let cyclomatic = ast
        .metrics(MetricsOptions::default().with_only(&[Metric::Cyclomatic]))
        .expect("walker succeeds");

    println!("ploc = {}", loc.metrics.loc.ploc());
    println!("ccn = {}", cyclomatic.metrics.cyclomatic.cyclomatic_sum());
}
Each metrics call walks the tree once. The savings versus calling
analyze twice come from skipping the parse, which dominates
runtime for everything except the very largest source files.
Custom tree-sitter walk + metrics on the same parse
Ast::as_tree_sitter borrows the underlying tree_sitter::Tree. The
returned reference is valid for the lifetime of the Ast; nodes
obtained from it resolve against Ast::source (see the note on the
C++ preprocessor below for what source returns
under macro expansion).
For realistic AST work — counting node kinds, finding constructs by name, detecting parse errors, building a symbol table — see Walking the AST directly. The example below is a minimal smoke test; the dedicated chapter shows the full pattern (reusable depth-first walker, field-name lookup, error detection).
use big_code_analysis::{Ast, LANG, MetricsOptions, Source};
let ast = Ast::parse(Source::new(LANG::Rust, b"fn f() {}"))
    .expect("rust feature enabled");
// Walk the tree for your own purposes…
let root = ast.as_tree_sitter().root_node();
assert_eq!(root.kind(), "source_file");
// …and run the metric walker over the same parse.
let space = ast
    .metrics(MetricsOptions::default())
    .expect("walker succeeds");
println!("name = {:?}", space.name);
Adopting a caller-built tree
If you already build the tree_sitter::Tree yourself (e.g. because
your editor / LSP has its own parser pool),
Ast::from_tree_sitter is the Source-flavored
counterpart of the older metrics_from_tree. It carries an
explicit name: Option<String> end-to-end instead of deriving one
from a path via lossy UTF-8 conversion.
use big_code_analysis::{Ast, LANG, MetricsOptions, tree_sitter};
let source = b"fn f() {}".to_vec();
let mut parser = tree_sitter::Parser::new();
parser
    .set_language(
        &LANG::Rust
            .get_tree_sitter_language()
            .expect("rust feature enabled"),
    )
    .expect("rust grammar compatible");
let tree = parser
    .parse(&source, None)
    .expect("parser has a language set");
let ast = Ast::from_tree_sitter(LANG::Rust, tree, source, None)
    .expect("rust feature enabled");
let _ = ast.metrics(MetricsOptions::default()).expect("walker succeeds");
The tree must have been produced from code with the grammar returned
by LANG::get_tree_sitter_language for lang; a
mismatch is not unsafe, but the metric walker matches on tree-sitter
kind_id values that come from the language's enum, so values from a
different grammar yield nonsensical results.
C++ preprocessor
When Ast::parse is called on a Source carrying preprocessor
inputs (Source::with_preproc_path + Source::with_preproc) and the
language is LANG::Cpp, the macro pre-pass runs before
tree-sitter does — and Ast::source returns the expanded bytes the
parser actually saw, not the original input.
Ast::from_tree_sitter is unaffected: it adopts whatever tree the
caller built. Whatever expansion (or lack thereof) the caller applied
before building the tree is what Ast::source reflects.
Concurrency
Ast is Send + Sync. Running Ast::metrics from multiple threads
against the same &Ast is safe — the walker only reads from the held
tree_sitter::Tree. (Benchmarking parallel metric runs is a separate
follow-up.)
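The one-thread-per-subset pattern can be sketched with the standard library alone. In the sketch below, ParsedFile and count_matching are hypothetical stand-ins for an Ast and a call to Ast::metrics; they are not part of the crate. The only property the sketch relies on is the one stated above: the shared value is read-only and Sync, so a plain shared reference can be handed to several scoped threads.

```rust
use std::thread;

/// Hypothetical stand-in for a held parse: immutable after construction,
/// so a shared `&ParsedFile` can be read from several threads at once.
struct ParsedFile {
    tokens: Vec<&'static str>,
}

impl ParsedFile {
    /// Stand-in for a read-only metric pass over the held parse.
    fn count_matching(&self, wanted: &str) -> usize {
        self.tokens.iter().filter(|t| **t == wanted).count()
    }
}

/// Run one read-only pass per subset, each on its own scoped thread.
/// Results come back in the same order as `subsets`.
fn run_subsets(parsed: &ParsedFile, subsets: &[&str]) -> Vec<usize> {
    thread::scope(|scope| {
        // Spawn every worker first so the passes can overlap.
        let handles: Vec<_> = subsets
            .iter()
            .map(|wanted| scope.spawn(move || parsed.count_matching(wanted)))
            .collect();
        // Then join them in spawn order.
        handles
            .into_iter()
            .map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

std::thread::scope is used rather than std::thread::spawn because scoped threads may borrow from the caller's stack, which matches the "same &Ast" wording above and avoids wrapping the parse in an Arc.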
Out of scope
- Incremental reparse via tree_sitter::InputEdit. Caching a stable Ast across an analysis pipeline is in scope; editing the held tree is not.
- Parallel-by-default APIs. Ast::metrics does not internally parallelize across the metric set. Callers that want one thread per subset are free to do so.
Walking the AST directly
Ast::parse gives you a parsed
tree_sitter::Tree together with the source bytes it was
parsed from; Ast::as_tree_sitter hands that tree out as a
borrowed reference. This chapter shows how to use it to drive your own
syntax-tree analysis — counting node kinds, finding constructs by name,
detecting parse errors, or pulling out a symbol table — without paying
for a second parse.
When to use this
Reach for direct AST traversal when:
- You want to count or find syntactic constructs in-process. The CLI equivalents (bca count <kind>, bca find <kind>, recipe) shell out per file; the library path is one parse and one Rust loop.
- You want to detect parse errors programmatically. Tree-sitter emits a synthetic ERROR node anywhere the grammar could not match; Node::has_error is O(1) — tree-sitter caches the error bit on every node — so the check is free even on a multi-MB source file.
- You want to mix metrics with custom analysis in one parse — e.g. capture metric values and a list of function names for a coverage mapping, an IDE outline, or a code-owner report.
If you only need standard metrics, stay with analyze or
Ast::metrics — they walk the tree for you. The direct
path is for things the metric walker does not already compute.
Use the re-exported tree_sitter
Import tree_sitter from
big_code_analysis::tree_sitter rather than adding a sibling
tree-sitter dependency. The re-export is pinned to the exact version
the metric walker was built against, so the Tree types agree by
definition. See Reusing an existing tree-sitter
Tree and
Stability and versioning for the value-not-stable
posture this re-export carries.
A reusable DFS walker
Most of the examples below need a depth-first traversal of every
descendant. Tree-sitter ships a TreeCursor that does
this in O(1) per step (no allocations beyond the cursor itself). The
canonical walk is short enough to inline:
use big_code_analysis::tree_sitter;
/// Visit every node in `tree` in pre-order, root first, passing each
/// node to `visit`. Allocation-free apart from the cursor itself.
fn walk_preorder<F: FnMut(tree_sitter::Node<'_>)>(
    tree: &tree_sitter::Tree,
    mut visit: F,
) {
    let mut cursor = tree.walk();
    'walk: loop {
        visit(cursor.node());
        if cursor.goto_first_child() {
            continue;
        }
        loop {
            if cursor.goto_next_sibling() {
                continue 'walk;
            }
            if !cursor.goto_parent() {
                return;
            }
        }
    }
}
The pattern is: visit, descend, climb back up while there is no next
sibling, repeat. Every example in this chapter is a thin wrapper around
this walker — the code fences below are marked ignore because they
assume walk_preorder is already in scope; the matching set of tests
in tests/book_ast_traversal_examples.rs keeps them
honest, so a refactor that broke an example would fail cargo test.
Count nodes by kind
Library equivalent of bca count if_expression for_expression while_expression from the
AST-queries recipe:
use big_code_analysis::{Ast, LANG, Source};
use std::collections::HashMap;
let ast = Ast::parse(Source::new(
    LANG::Rust,
    b"fn a() { if true { 1 } else { 2 } } fn b() { for _ in 0..10 {} }",
))
.expect("rust feature enabled");
let mut counts: HashMap<&str, usize> = HashMap::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    *counts.entry(node.kind()).or_default() += 1;
});
assert_eq!(counts.get("if_expression").copied().unwrap_or(0), 1);
assert_eq!(counts.get("for_expression").copied().unwrap_or(0), 1);
The string keys ("if_expression", "for_expression", …) are the
tree-sitter grammar's node-type names. The fastest way to discover them
for a new language is bca --paths sample.rs dump, which prints the
full AST.
Anonymous tokens. The walker visits every node tree-sitter emits, including anonymous tokens like "{", ";", and keyword literals. The targeted counts.get("if_expression") lookups above are unaffected — anonymous tokens have different kind names — but counts.values().sum() would be much larger than the count of named grammar productions. Filter with tree_sitter::Node::is_named() inside the visitor if you only want named nodes.
Find nodes by kind
Library equivalent of bca find unsafe_block:
use big_code_analysis::{Ast, LANG, Source};
let ast = Ast::parse(Source::new(
    LANG::Rust,
    b"fn safe() {} fn risky() { unsafe { } }",
))
.expect("rust feature enabled");
let source = ast.source();
// Captured slices borrow from `source` — no per-hit `String` allocation.
let mut hits: Vec<((usize, usize), &str)> = Vec::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    if node.kind() == "unsafe_block" {
        let span = (node.start_position().row, node.end_position().row);
        let text = node
            .utf8_text(source)
            .expect("source is valid utf-8");
        hits.push((span, text));
    }
});
assert_eq!(hits.len(), 1);
Node::utf8_text(&source[..]) slices the source bytes by the node's
byte range. Pair it with Ast::source — for C++ with
preprocessor inputs supplied to Ast::parse, source is
the expanded buffer the parser actually saw, not the original input
(see the C++ preprocessor note).
Detect parse errors
Tree-sitter is error-tolerant: even on malformed input it returns a tree, but nodes that could not be matched are tagged as errors. The cheapest check is on the root:
use big_code_analysis::{Ast, LANG, Source};
let ast = Ast::parse(Source::new(LANG::Rust, b"fn broken("))
    .expect("rust feature enabled");
// Confirms that something went wrong somewhere in the tree, but does
// not enumerate every error site.
assert!(ast.as_tree_sitter().root_node().has_error());
To list the offending nodes, walk the tree and check each:
use big_code_analysis::{Ast, LANG, Source};
let ast = Ast::parse(Source::new(LANG::Rust, b"fn broken("))
    .expect("rust feature enabled");
let mut error_lines = Vec::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    if node.is_error() || node.is_missing() {
        error_lines.push(node.start_position().row);
    }
});
assert!(!error_lines.is_empty());
Node::is_error() flags the synthetic ERROR node tree-sitter inserts
where it could not match the grammar; Node::is_missing() flags
phantom nodes the parser invented to recover from a missing token. The
CLI's bca find ERROR recipe uses the same nodes.
Combine metrics with a custom walk
The whole point of Ast is parse-once / compute-many. A
realistic pipeline computes metrics and extracts a symbol table from
the same parse:
use big_code_analysis::{Ast, LANG, MetricsOptions, Source};
let ast = Ast::parse(Source::new(
    LANG::Rust,
    b"fn outer() { fn inner() {} } fn alone() {}",
))
.expect("rust feature enabled");
// One parse: metrics walker uses it…
let space = ast
    .metrics(MetricsOptions::default())
    .expect("walker succeeds");
// …and so does the custom walk, against the very same tree. The
// captured names borrow from `source` rather than allocating a fresh
// `String` per function — the same pattern as the `unsafe_block`
// search above.
let source = ast.source();
let mut functions: Vec<&str> = Vec::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    if node.kind() == "function_item"
        && let Some(name_node) = node.child_by_field_name("name")
    {
        let name = name_node
            .utf8_text(source)
            .expect("source is valid utf-8");
        functions.push(name);
    }
});
assert_eq!(space.metrics.nom.functions_sum(), 3.0);
assert_eq!(functions, ["outer", "inner", "alone"]);
Node::child_by_field_name walks the named grammar fields — the same
fields that show up in the FieldName column when you run
bca --paths sample.rs dump. Field-based lookup is more robust than
positional indexing because it does not depend on which children the
grammar emits for anonymous tokens (commas, parentheses, …).
Want a serializable JSON tree?
For pipelines that want a structured AST as data — diffing, queries on
the wire, language-agnostic schema work — the
AstCallback / AstNode family materializes
the tree as a Serialize-able struct. This is what the REST /ast
endpoint produces (bca dump uses a separate Dump callback that
writes a human-readable form to stdout). Library consumers can call
the JSON-shaped callback directly:
use std::path::PathBuf;
use big_code_analysis::{
    AstCallback, AstCfg, AstPayload, LANG, action,
};
let payload = AstPayload {
    id: "snippet".to_owned(),
    file_name: "snippet.rs".to_owned(),
    code: "fn f() {}".to_owned(),
    comment: false,
    span: true,
};
let cfg = AstCfg {
    id: payload.id.clone(),
    comment: payload.comment,
    span: payload.span,
};
let response = action::<AstCallback>(
    &LANG::Rust,
    payload.code.into_bytes(),
    &PathBuf::from(&payload.file_name),
    None,
    cfg,
);
let json = serde_json::to_string(&response).expect("AstResponse serializes");
println!("{json}");
For one-off in-process work, the as_tree_sitter() walker above is
cheaper (no allocation per node). Reach for AstCallback when you
need a serializable owned tree.
Out of scope
- Incremental reparse — tree-sitter supports tree_sitter::InputEdit for incremental updates, but Ast is a snapshot. To reflect a source edit, build a fresh Ast with Ast::parse, or call Parser::parse(&new_source, Some(&old_tree)) directly via the re-exported tree_sitter and feed the result through Ast::from_tree_sitter.
- The crate-internal big_code_analysis::Node wrapper. It is exposed for the metric walker's traversal needs, but most of its traversal methods (kind, child_count, children, cursor, …) stay pub(crate). Library consumers should reach the tree-sitter Node through as_tree_sitter().root_node() — that is the documented seam.
Selecting metrics
By default, every call to analyze computes the full metric
suite — ABC, cognitive, cyclomatic, Halstead, LoC, MI, NArgs,
NExits, NOM, NPA, NPM, tokens, and WMC. That is the right default
for the CLI, where the user has just asked for the metrics, but
it is heavyweight for callers that only want one number per file.
MetricsOptions::with_only(&[Metric]) lets you restrict the walker
to a subset of metrics. Unselected metrics are skipped at the
per-node level — no T::Halstead::compute, no
T::Cognitive::compute, etc. — and elided from the
CodeMetrics serialization output.
A worked example
Compute LoC only, then read the result:
use big_code_analysis::{analyze, LANG, Metric, MetricsOptions, Source};
fn main() {
    let source = b"fn f(x: i32) -> i32 { if x > 0 { 1 } else { 0 } }";
    let opts = MetricsOptions::default().with_only(&[Metric::Loc]);
    let space = analyze(
        Source::new(LANG::Rust, source).with_name(Some("snippet.rs".to_owned())),
        opts,
    )
    .expect("parses");
    // LoC was selected — it carries real numbers.
    println!("ploc = {}", space.metrics.loc.ploc());
    // Halstead, cognitive, cyclomatic, … were skipped. Their
    // `Stats` fields are at `Default` and elided from JSON output.
    let json = serde_json::to_string_pretty(&space.metrics).unwrap();
    println!("{json}");
}
The JSON output for that call contains only the loc object;
every other metric is absent.
Dependencies between metrics
Two metrics are derived — they consume the outputs of other metrics during the finalize step:
| Metric | Dependencies |
|---|---|
| Metric::Mi | Loc, Cyclomatic, Halstead |
| Metric::Wmc | Cyclomatic, Nom |
with_only resolves these closures silently. Asking for Mi
alone still computes Loc + Cyclomatic + Halstead, so the MI
value is meaningful rather than a function of zero-default
inputs:
use big_code_analysis::{Metric, MetricSet, MetricsOptions};
let opts = MetricsOptions::default().with_only(&[Metric::Mi]);
// opts.metrics now contains Mi + Loc + Cyclomatic + Halstead.
You can introspect the final set from the resulting
FuncSpace via space.metrics.selected():
use big_code_analysis::{analyze, LANG, Metric, MetricsOptions, Source};
let space = analyze(
    Source::new(LANG::Rust, b"fn f() {}"),
    MetricsOptions::default().with_only(&[Metric::Mi]),
).unwrap();
let sel = space.metrics.selected();
assert!(sel.contains(Metric::Mi));
assert!(sel.contains(Metric::Loc)); // auto-added dependency
Default behaviour is unchanged
MetricsOptions::default() selects every metric. The
pre-#257 entry points (analyze without with_only, plus the
deprecated metrics / metrics_with_options shims) produce
byte-for-byte the same JSON they always did.
What about "everything except X"?
There is no built-in complement API — with_only takes a positive
selection, not an exclusion list. The intentional asymmetry keeps
the dependency closure unambiguous: a positive list always grows
through Metric::dependencies, whereas an exclusion list would
need to decide what to do when the caller excludes a dependency
of a metric they kept.
If you genuinely want "all except Halstead", build the list
explicitly. Because Metric is #[non_exhaustive], downstream
crates can construct the variants but cannot exhaustively match
on them, so the conventional pattern is to enumerate the variants
you want and accept that adding a future Metric variant will not
silently opt you in:
use big_code_analysis::{Metric, MetricsOptions};
let opts = MetricsOptions::default().with_only(&[
    Metric::Cognitive,
    Metric::Cyclomatic,
    Metric::Loc,
    Metric::Nom,
    Metric::Tokens,
    Metric::NArgs,
    Metric::Exit,
    Metric::Abc,
    Metric::Npm,
    Metric::Npa,
    Metric::Wmc,
    // Metric::Mi intentionally omitted: it would pull Halstead
    // back in via the dependency closure.
]);
Note the trap: keeping Metric::Mi re-adds Metric::Halstead
through Metric::dependencies. To truly drop Halstead you must
also drop Mi.
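The trap is easier to see with the closure rule written out. The sketch below models it with the standard library only: the Metric enum is a local stand-in that lists just the variants needed here, the dependencies table is copied from the table earlier in this section, and the closure function is a hypothetical helper, not the crate's implementation.

```rust
use std::collections::BTreeSet;

/// Local stand-in for the crate's `Metric` enum; only the variants
/// needed for the illustration are listed.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Metric {
    Cyclomatic,
    Halstead,
    Loc,
    Mi,
    Nom,
    Wmc,
}

impl Metric {
    /// Direct dependencies, copied from the table in this section.
    fn dependencies(self) -> &'static [Metric] {
        match self {
            Metric::Mi => &[Metric::Loc, Metric::Cyclomatic, Metric::Halstead],
            Metric::Wmc => &[Metric::Cyclomatic, Metric::Nom],
            _ => &[],
        }
    }
}

/// Grow a positive selection until every dependency is included.
fn closure(selected: &[Metric]) -> BTreeSet<Metric> {
    let mut out = BTreeSet::new();
    let mut pending: Vec<Metric> = selected.to_vec();
    while let Some(metric) = pending.pop() {
        // `insert` returns false for metrics already seen, so each
        // metric has its dependencies queued at most once.
        if out.insert(metric) {
            pending.extend_from_slice(metric.dependencies());
        }
    }
    out
}
```

Under this model, closure(&[Metric::Loc, Metric::Mi]) contains Halstead even though the caller never listed it, while closure(&[Metric::Loc, Metric::Wmc]) does not. A positive list can only grow, which is why the selection stays unambiguous.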
When to reach for with_only
- Hot paths that need only one or two metrics per file — Halstead in particular owns its own per-space HalsteadMaps allocation and is the headline saving for an LoC-only run.
- CI integrations that only display one number (e.g. a cognitive-complexity gate) and want the rest of CodeMetrics to drop out of the cached JSON payload.
- Library callers wiring big-code-analysis into their own reports who would otherwise see fields for every metric in their own UI.
Per-metric Cargo features (compile-time stripping) are not covered by this knob; they remain tracked separately under the grammar-feature work (#252).
Per-language Cargo features
Every tree-sitter grammar this library bundles is gated behind its
own Cargo feature. The default feature set is all-languages, so
the default
[dependencies]
big-code-analysis = "1.1.0"
pulls every grammar in — matching the library's historical
behaviour and what the bca / bca-web binaries themselves ship
with. The cost is concrete: every grammar crate compiles when the
library compiles, and every grammar's parsing tables stay live in
the final binary.
Library consumers that only need a subset of languages can opt out of the defaults and re-enable just the grammars they care about.
A worked example
A downstream service that only analyses Rust and TypeScript:
[dependencies]
big-code-analysis = { version = "1.1.0", default-features = false, features = ["rust", "typescript"] }
The library still compiles, the LANG enum still has every
variant, and analyze / metrics_from_tree / the rest of the
dispatch surface still work for the enabled languages.
Supported features
The following per-language features are available. Each feature pulls in the matching grammar crate (and any helper grammars the per-language pipeline depends on).
| Feature | Grammar crates pulled in |
|---|---|
| bash | tree-sitter-bash |
| cpp | bca-tree-sitter-mozcpp, bca-tree-sitter-ccomment, bca-tree-sitter-preproc (covers the Cpp, Ccomment, and Preproc variants) |
| csharp | tree-sitter-c-sharp |
| elixir | tree-sitter-elixir |
| go | tree-sitter-go |
| groovy | dekobon-tree-sitter-groovy |
| java | tree-sitter-java |
| javascript | tree-sitter-javascript |
| kotlin | tree-sitter-kotlin-ng |
| lua | tree-sitter-lua |
| mozjs | bca-tree-sitter-mozjs |
| perl | tree-sitter-perl |
| php | tree-sitter-php |
| python | tree-sitter-python |
| ruby | tree-sitter-ruby |
| rust | tree-sitter-rust |
| tcl | bca-tree-sitter-tcl |
| typescript | tree-sitter-typescript (used by both the Typescript and Tsx variants) |
The umbrella all-languages feature enables every entry in this
table. The bca-tree-sitter-* crates are in-tree forks of the
upstream Mozilla / community grammars; the Rust import path remains
tree_sitter_<lang> regardless. See
RELEASING.md
for the rename rationale and the workspace package = ... alias
trick that keeps consumer call sites unchanged.
What happens when a feature is off
The LANG enum keeps every variant defined regardless of the
active feature set — disabling a feature does not change the enum
surface, the per-language *Code / *Parser type aliases, or any
of the file-extension / emacs-mode detection helpers. Selecting a
LANG whose feature is off only affects the dispatch path.
Every dispatch entry point that returns a Result surfaces the
disabled state as Err(MetricsError::LanguageDisabled(LANG)):
- analyze
- metrics_from_tree
- action
- get_ops
- get_function_spaces / get_function_spaces_with_options (deprecated)
- LANG::get_tree_sitter_language — this returns Result<tree_sitter::Language, MetricsError> (changed in 0.0.26) rather than the previous Language
Callers can query the compiled-in set without going through a dispatcher:
use big_code_analysis::LANG;
for lang in LANG::into_enum_iter() {
    if lang.is_enabled() {
        println!("{:?} is compiled in", lang);
    }
}
This pairs well with the
get_language_for_file /
guess_language
helpers, which still hand back any LANG variant for a recognised
extension — callers walking a directory may want to skip files
whose language is not enabled in the current build.
Stability
Per-language features are themselves stable. Adding or removing a
language feature in the future is a minor-bump break (it changes
which LANG variants the default build covers); changes to the
default feature set will be flagged in the changelog under
(breaking).
Walking FuncSpace results
FuncSpace is the tree the library hands back from
analyze. The top-level node represents the whole file; its
spaces field holds nested function / class / impl / trait /
namespace spaces. Each node carries the same
CodeMetrics payload, so any metric is available at
any level of granularity.
Anatomy of a FuncSpace
The fields you reach for most often are:
| Field | Type | What it is |
|---|---|---|
| name | Option<String> | Caller-supplied identifier (top-level) or symbol name (nested) |
| kind | SpaceKind | Unit, Function, Class, Impl, … |
| start_line | usize | First line (1-based) |
| end_line | usize | Last line (1-based) |
| spaces | Vec<FuncSpace> | Nested spaces |
| metrics | CodeMetrics | All per-space metric values |
| suppressed | SuppressionScope | In-source suppression markers |
SpaceKind is an enum — match on it to filter what
you care about (Function only, or "anything that owns methods").
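A filter of the second kind might look like the sketch below. The SpaceKind enum here is a local stand-in that lists only the kinds named in this chapter, and the choice of which kinds count as owning methods is an assumption made for the illustration; check the crate's rustdoc for the real variant list before copying the predicate.

```rust
/// Local stand-in for the crate's `SpaceKind`; only the kinds named in
/// this chapter are listed.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum SpaceKind {
    Unit,
    Function,
    Class,
    Impl,
    Trait,
    Namespace,
}

/// True for kinds that can own methods. Which kinds belong here is an
/// assumption made for the illustration.
fn owns_methods(kind: SpaceKind) -> bool {
    matches!(kind, SpaceKind::Class | SpaceKind::Impl | SpaceKind::Trait)
}

/// Keep only the kinds accepted by `owns_methods`, preserving order.
fn method_owners(kinds: &[SpaceKind]) -> Vec<SpaceKind> {
    kinds.iter().copied().filter(|k| owns_methods(*k)).collect()
}
```

Keeping the predicate in one small function means the walk itself does not need to change when the set of interesting kinds does.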
Recursive walk
Recursion mirrors the tree shape. Here we collect every function space whose cognitive complexity exceeds a threshold:
use big_code_analysis::{
    analyze, FuncSpace, MetricsOptions, SpaceKind, Source, LANG,
};
fn hotspots(space: &FuncSpace, threshold: f64, out: &mut Vec<String>) {
    if space.kind == SpaceKind::Function
        && space.metrics.cognitive.cognitive_sum() > threshold
    {
        if let Some(name) = &space.name {
            out.push(format!(
                "{name} (lines {}–{})",
                space.start_line, space.end_line,
            ));
        }
    }
    for child in &space.spaces {
        hotspots(child, threshold, out);
    }
}
fn main() {
    let source = b"\
fn easy() { let _ = 1; }
fn hard(x: i32) -> i32 {
    if x > 0 {
        if x > 10 { 1 } else { 2 }
    } else {
        3
    }
}
";
    let space = analyze(
        Source::new(LANG::Rust, source).with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    )
    .expect("parses");
    let mut hits = Vec::new();
    hotspots(&space, 2.0, &mut hits);
    for hit in hits {
        println!("{hit}");
    }
}
Iterative walk
For deep trees, prefer an explicit stack — Rust does not tail-call-optimise, and pathological generated code can be arbitrarily nested:
use big_code_analysis::FuncSpace;
fn total_functions(root: &FuncSpace) -> usize {
    let mut stack = vec![root];
    let mut count = 0;
    while let Some(space) = stack.pop() {
        if space.kind == big_code_analysis::SpaceKind::Function {
            count += 1;
        }
        stack.extend(space.spaces.iter());
    }
    count
}
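The explicit-stack shape also works when each node needs some state from its ancestors: push that state alongside the node. The sketch below measures the deepest nesting level this way. Space is a hypothetical stand-in for FuncSpace, reduced to the one field the walk needs, so that the example compiles without the crate.

```rust
/// Hypothetical stand-in for `FuncSpace`, reduced to the one field the
/// walk needs.
struct Space {
    spaces: Vec<Space>,
}

/// Deepest nesting level at or below `root`, counting the root as 1.
fn max_depth(root: &Space) -> usize {
    // Each stack entry carries the depth of the node it refers to, which
    // replaces the depth a recursive call would keep in its own frame.
    let mut stack = vec![(root, 1usize)];
    let mut deepest = 0usize;
    while let Some((space, depth)) = stack.pop() {
        deepest = deepest.max(depth);
        stack.extend(space.spaces.iter().map(|child| (child, depth + 1)));
    }
    deepest
}
```

The visit order differs from the recursive version because the stack pops the last child first, but a maximum does not depend on order.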
Reading per-metric numbers
CodeMetrics exposes each metric as its own Stats struct.
Inside, each struct offers integer-valued summary accessors plus
per-space derived ones. A few patterns:
use big_code_analysis::FuncSpace;
fn summary(space: &FuncSpace) {
    let m = &space.metrics;
    println!("cognitive (this space): {}", m.cognitive.cognitive_sum());
    println!("cyclomatic (this space): {}", m.cyclomatic.cyclomatic_sum());
    println!("# functions in this space: {}", m.nom.functions_sum());
    println!("source lines (sloc): {}", m.loc.sloc());
    println!("physical lines (ploc): {}", m.loc.ploc());
    println!("ABC branches: {}", m.abc.branches());
}
The *_sum accessors aggregate across child spaces; bare
accessors like m.loc.sloc() are the value attributable to this
node. The full list of fields and methods lives in the
per-metric rustdoc.
Don't rely on traversal order
The library walks the AST in source order, but the contract is
only that every space appears once in the tree. If you need a
stable order across versions, sort by start_line after the
walk:
use big_code_analysis::FuncSpace;
fn flatten(space: &FuncSpace, out: &mut Vec<(usize, String)>) {
    if let Some(name) = &space.name {
        out.push((space.start_line, name.clone()));
    }
    for child in &space.spaces {
        flatten(child, out);
    }
}
fn sorted(space: &FuncSpace) -> Vec<(usize, String)> {
    let mut v = Vec::new();
    flatten(space, &mut v);
    v.sort_by_key(|&(line, _)| line);
    v
}
Error handling
The entry point analyze returns Result<FuncSpace, MetricsError>.
This page documents what each variant means and how to act on it.
Heads up. Prior to #253 this entry point returned Option<FuncSpace> and collapsed every failure mode into a single None. The Result variant set is additive — MetricsError is #[non_exhaustive], so always include a _ arm when matching exhaustively to stay forward-compatible with future variants.
Pattern-matching the error variants
use big_code_analysis::{analyze, LANG, MetricsError, MetricsOptions, Source};
fn main() {
    let result = analyze(
        Source::new(LANG::Rust, b"this is not rust")
            .with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    );
    match result {
        Ok(space) => println!("ok: {} lines", space.metrics.loc.sloc()),
        Err(MetricsError::EmptyRoot) => {
            eprintln!("walker produced no top-level FuncSpace");
        }
        Err(MetricsError::ParseHasErrors) => {
            eprintln!("tree-sitter reported syntax errors (strict mode)");
        }
        Err(MetricsError::LanguageDisabled(lang)) => {
            eprintln!("language {:?} is not enabled in this build", lang);
        }
        Err(MetricsError::NonUtf8Path) => {
            eprintln!("path is not valid UTF-8");
        }
        // `MetricsError` is `#[non_exhaustive]`; new variants may be added.
        Err(_) => eprintln!("unexpected MetricsError variant"),
    }
}
What each variant means
- EmptyRoot — The walker reached the end of the AST without producing a top-level FuncSpace. The most common cause is empty input or input whose only content is comments. Defensive failures (the traversal produced no Unit space for any supported language) also surface here; if you hit one on real-world source, please file an issue.
- ParseHasErrors — Reserved for a future strict-parsing toggle on MetricsOptions. Not produced by today's default entry points; tree-sitter's error recovery is intentionally tolerant (see below).
- LanguageDisabled(LANG) — The requested language's Cargo feature is not compiled into this build (see Per-language Cargo features and #252). The default all-languages feature set enables every supported language, so a default build never produces this variant.
- NonUtf8Path — Reserved for callers that opt into strict-identifier mode. Since #254, the recommended analyze entry point takes a caller-supplied Source::name (Option<String>), so non-UTF-8 paths are never round-tripped through lossy conversion in the first place. The deprecated path-positional shims (get_function_spaces, metrics_with_options) still fall back to Path::to_string_lossy. This variant is not produced today; it is kept for future strict-identifier validators.
Tree-sitter does not always say "no"
Most parse errors do not surface as Err(_). Tree-sitter is an
error-recovering parser — it will produce a tree even for
syntactically broken input, marking the bad regions with ERROR
nodes. The metric walk happily computes numbers over the recovered
tree. That means:
- Garbage in, numbers out. Feeding C++ source to
LANG::Pythongenerally produces anOk(FuncSpace)whose metrics are nonsense. Make sure you have selected the right language (e.g. viaguess_language) before trusting the result. - Partial files score. A truncated file with an unterminated
brace will still return
Ok(FuncSpace). The metrics reflect the recovered tree, not the intended source.
If you need to know whether the input parsed cleanly, count
ERROR nodes by walking the tree-sitter AST yourself (see the
Node escape hatch in
STABILITY.md) or use the
bca nodes subcommand on the CLI side.
Bubbling MetricsError through ?
Because MetricsError implements std::error::Error, you can
bubble it through any Result<_, Box<dyn Error>> chain without
boilerplate:
use std::error::Error;
use big_code_analysis::{analyze, FuncSpace, LANG, MetricsOptions, Source};
pub fn run(
    lang: LANG,
    source: &[u8],
    name: Option<String>,
) -> Result<FuncSpace, Box<dyn Error>> {
    Ok(analyze(
        Source::new(lang, source).with_name(name),
        MetricsOptions::default(),
    )?)
}
If you want a project-specific error type, an explicit From impl
keeps call sites clean while letting you attach extra context
(file path, language guess, etc.).
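One possible shape for such a type is sketched below. So that the example compiles on its own, MetricsError is a local one-variant stand-in for the crate's enum and analyze_stub stands in for a failing analyze call; AnalysisError and with_path are hypothetical names chosen for the illustration.

```rust
use std::error::Error;
use std::fmt;

/// Local stand-in for the crate's `MetricsError`; one variant is enough
/// for the sketch.
#[derive(Debug)]
enum MetricsError {
    EmptyRoot,
}

impl fmt::Display for MetricsError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "walker produced no top-level space")
    }
}

impl Error for MetricsError {}

/// Hypothetical project-specific error carrying the file being analysed.
#[derive(Debug)]
struct AnalysisError {
    path: Option<String>,
    source: MetricsError,
}

impl AnalysisError {
    /// Attach the path of the file that was being analysed.
    fn with_path(mut self, path: &str) -> Self {
        self.path = Some(path.to_owned());
        self
    }
}

impl From<MetricsError> for AnalysisError {
    fn from(source: MetricsError) -> Self {
        AnalysisError { path: None, source }
    }
}

impl fmt::Display for AnalysisError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match &self.path {
            Some(path) => write!(f, "{path}: {}", self.source),
            None => write!(f, "{}", self.source),
        }
    }
}

impl Error for AnalysisError {
    fn source(&self) -> Option<&(dyn Error + 'static)> {
        Some(&self.source)
    }
}

/// Stand-in for a call to `analyze` that fails.
fn analyze_stub() -> Result<(), MetricsError> {
    Err(MetricsError::EmptyRoot)
}

/// Plain `?`: the `From` impl converts the error, with no context.
fn run_plain() -> Result<(), AnalysisError> {
    analyze_stub()?;
    Ok(())
}

/// `map_err` plus the builder attaches the path on the way out.
fn run(path: &str) -> Result<(), AnalysisError> {
    analyze_stub().map_err(|e| AnalysisError::from(e).with_path(path))
}
```

The From impl alone cannot know which file was being analysed, so the sketch pairs it with a small builder: call sites that have no context use the ? operator unchanged, and call sites that do have context add it with map_err.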
Warnings are not errors
The library writes warnings to stderr for non-fatal issues
(malformed bca: suppression markers, mainly). They do not abort
the walk and they do not flip Ok to Err. If you are running
embedded inside a server or library and need to capture those
warnings, redirect stderr at the process level — the library does
not currently expose a programmatic warning sink. That is tracked
under the library-DX umbrella (#250).
Stability and versioning
big-code-analysis is on the 1.x line. The full stability
contract lives in STABILITY.md at the root of the
repository — that file is the source of truth and is updated
alongside the changelog at every release.
The headlines for library consumers:
- Shape stability across patch and minor bumps. Every public type and function signature listed in STABILITY.md § "What is stable in shape" is held across the 1.x line. Additive changes (new items, new LANG variants, new MetricsError variants, new language features) are allowed in minor bumps. Breaking shape changes are reserved for the next major bump and will appear in the changelog under (breaking) in the 2.0.0 section.
- No value stability guarantee within 1.x. A grammar pin bump or a bug fix in a metric definition can shift any metric value on any file in any direction, even across a patch bump. Each such drift is flagged in the changelog. Pin to an exact version (big-code-analysis = "= 1.1.0") if you need bit-for-bit reproducibility across runs.
- MSRV is 1.94. Bumping the MSRV is treated as a minor-bump event and is flagged in the changelog under (breaking) — see STABILITY.md § MSRV policy.
- Escape hatches. The Node wrapper exposes tree_sitter::Node through .0, and the tree_sitter crate is re-exported as big_code_analysis::tree_sitter. Anything reached through those seams follows the pinned tree-sitter version, not our own SemVer. See STABILITY.md § Escape hatches before depending on them.
On the 2.0 horizon
A small number of loose ends are deferred to 2.0; they are
listed in STABILITY.md § "On the 2.0 horizon".
The headline items are:
- The per-metric Stats structs gain #[non_exhaustive], so field additions stop being a shape break in the strict SemVer sense.
- The deprecated metrics / metrics_with_options shims (in favour of analyze) are removed.
- The accumulated metric-definition fixes that have shifted values across 1.x get a clean re-baseline note.
2.0 is not scheduled. Until then, 1.x is the surface you should
depend on.
Python Bindings
big-code-analysis ships first-party Python bindings (PyO3 +
maturin) that expose the same metric
pipeline as the Rust library and the bca CLI — same JSON shape,
same numeric formatting, same language coverage.
import big_code_analysis as bca
result = bca.analyze("src/main.rs")
if result is not None:
    print(result["metrics"]["cyclomatic"]["sum"])
The bindings are a peer of the Rust API: anywhere this book points
at a Rust function (big_code_analysis::analyze,
FuncSpace, the metric modules),
Python has a one-to-one equivalent. Pick whichever language fits
your pipeline — the metrics are identical.
When to reach for Python
- You're already in a data-pipeline stack (pandas, Jupyter, Airflow, dbt, Polars) and want metric records as dict / DataFrame rows without shelling out to the CLI.
- You're integrating with a Python-native security tool that consumes SARIF; see SARIF output.
- You're building a code-quality dashboard whose backend is a Python web framework (FastAPI, Django).
If you only need a one-shot quality report from the command line,
the bca CLI is the simpler tool — see
Commands → Metrics.
If you're embedding the analysis into a long-running Rust program, the Rust library is the lower-overhead option.
Chapter contents
- Installation: pip install, wheel matrix, building from source.
- Quick start: analyse one file, print one metric.
- Batch processing: analyze_batch, AnalysisError, parallelism with ThreadPoolExecutor.
- Flat-record iteration: flatten_spaces feeding sqlite / pandas.
- Metric selection: metrics= kwarg, bca.METRIC_NAMES, dependency-pull semantics.
- SARIF output: to_sarif + GitHub Code Scanning upload.
- Error handling: the full exception taxonomy and the never-raise batch contract.
- Async patterns: asyncio.to_thread is the canonical recipe.
The headline example on each page is embedded verbatim from an
importable file under big-code-analysis-py/examples/ and
exercised end-to-end by
big-code-analysis-py/tests/test_book_examples.py, so a renamed
kwarg or a removed function on the primary path fails CI before
it can rot the docs. Shorter illustrative snippets that surround
the embedded example (logging recipes, regex parsing of the
errno suffix, the asyncio anti-pattern, the pandas
one-liner, …) are inline and intentionally not test-pinned —
treat the embedded blocks as the canonical reference when the
two disagree.
Installation
The bindings are distributed as a pure-wheel Python package. The
recommended install is via pip (or your preferred lockfile
manager — uv, poetry, pdm).
pip install big-code-analysis
Python >=3.12 is required. The compiled extension uses CPython's
stable abi3 surface
(abi3-py312), so one wheel covers 3.12, 3.13, and every
future minor release without a per-version wheel build.
Wheel matrix
CI publishes wheels for the following targets today. If your platform is not listed, build from source.
| Platform | Architectures |
|---|---|
| Linux (manylinux_2_28) | x86_64, aarch64 |
The wheel matrix is defined in
.github/workflows/python-wheels.yml.
Phase 7
of the bindings work lit up the manylinux_2_28 Linux legs.
manylinux_2_28 requires glibc >= 2.28 (RHEL 8 / Debian 10 /
Ubuntu 18.10 and newer); older distributions (RHEL 7 / CentOS 7,
glibc 2.17) need to build from source. macOS and Windows wheel
publication is tracked under #103
and not yet shipped — pip install on those platforms falls back
to a source build today.
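Whether the published wheel applies to a given Linux host comes down to the glibc floor described above. The following is a minimal sketch of that check using only the standard library; the helper names (parse_version, wheel_should_install) are hypothetical and not part of the bindings, and the sketch ignores the architecture half of the wheel matrix.

```python
import platform


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '2.28' into (2, 28); non-numeric parts are ignored."""
    return tuple(int(part) for part in text.split(".") if part.isdigit())


def wheel_should_install(
    libc: str, version: str, minimum: tuple[int, int] = (2, 28)
) -> bool:
    """True when the reported C library is glibc at or above ``minimum``.

    Assumption: anything that is not glibc (musl, macOS, Windows)
    is treated as "no wheel", matching the matrix above.
    """
    if libc != "glibc":
        return False
    return parse_version(version)[:2] >= minimum
```

On a live host, feed it the interpreter's own view: wheel_should_install(*platform.libc_ver()). A False result suggests pip will need the source-build path described under Building from source.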
Verifying the install
python -c "import big_code_analysis as bca; print(bca.__version__)"
The version printed equals
[workspace.package].version
from the Rust workspace's Cargo.toml; the bindings and the Rust
library are versioned in lockstep.
Building from source
If no wheel matches your platform, or you want to bind against an unreleased Rust commit, build with maturin:
git clone https://github.com/dekobon/big-code-analysis.git
cd big-code-analysis/big-code-analysis-py
python -m venv .venv && source .venv/bin/activate
pip install --upgrade pip
pip install "maturin>=1.7,<2.0"
maturin develop --release # editable install of big_code_analysis
python -c "import big_code_analysis as bca; print(bca.__version__)"
maturin develop builds the Rust extension in-place and installs it
into the active venv so import big_code_analysis resolves
locally — no separate pip install -e . step is required. The
--release flag turns on the optimiser; omit it during development
for faster rebuilds.
You will also need:
- A stable Rust toolchain (MSRV: 1.94). Install via rustup.
- A C compiler (used by the tree-sitter grammar crates).
- CPython development headers (python3-dev on Debian / Ubuntu).
Next
Walk through the quick-start to compute your first metric, or skip ahead to batch processing if you're wiring this into a pipeline over many files.
Quick start
This page walks through the minimum amount of code needed to compute metrics from a single source file.
1. Install the package
pip install big-code-analysis
See Installation for the wheel matrix and build-from-source instructions.
2. Analyse a file
bca.analyze(path) returns a dict matching the JSON bca metrics --output-format json emits for the same file — same field
order, same numeric formatting, same shape.
"""Quick-start: analyse one file and print the headline cyclomatic count.

Mirrors the worked example shown on the book's
``python/quick-start.md`` page. The book embeds this file verbatim,
so the snippet is the test fixture: if the API drifts, the
``test_book_examples.py`` test fails and the docs are forced back
into sync.
"""

from __future__ import annotations

from pathlib import Path
from typing import Any

import big_code_analysis as bca


def run(path: Path) -> dict[str, Any]:
    """Analyse ``path`` and return its metric dict."""
    result = bca.analyze(path)
    if result is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)
    cyclomatic = result["metrics"]["cyclomatic"]
    print(f"{result['name']}: cyclomatic sum = {cyclomatic['sum']:.0f}")
    return result


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 2:
        sys.exit("usage: python quick_start.py <path>")
    run(Path(sys.argv[1]))
A few details worth noting:
- analyze returns None when the file matches the CLI walker's is_generated predicate (a leading @generated, DO NOT EDIT, or GENERATED CODE marker). Always handle the optional return before reaching into result["metrics"].
- The returned object is a plain dict[str, Any]. It is safe to serialise with json.dumps, ship to a downstream service, or feed into flatten_spaces for tabular consumers.
- Language detection mirrors the CLI exactly: path extension first, then shebang / emacs-mode fallback. Call bca.analyze_source(code, language) if you have the source in memory.
3. Analyse an in-memory snippet
import big_code_analysis as bca
metrics = bca.analyze_source("fn main() {}\n", "rust")
print(metrics["metrics"]["loc"]["sloc"])
analyze_source accepts str, bytes, or bytearray. The
returned dict has the same shape as analyze's output, with
name set to None (no path is associated with an in-memory
buffer).
Where to go next
- Batch processing: analyze_batch for many files without per-file try/except clutter.
- Metric selection: compute only the metrics you need.
- Error handling: the full exception taxonomy.
- The CLI's Metrics command is the equivalent shell-level workflow.
Batch processing
bca.analyze_batch(paths) runs the same analysis as bca.analyze
over every path in an iterable and never raises on per-file
errors: each result slot is either an analysis dict or a
bca.AnalysisError describing the failure. The list has the same
length as the input and preserves order one-to-one, so callers
can zip(inputs, results) without losing the pairing.
from collections.abc import Iterable
from pathlib import Path

import big_code_analysis as bca


def run(paths: Iterable[Path]) -> dict[str, int]:
    """Analyse ``paths`` as a batch and bucket successes vs failures.

    Returns a small summary dict (`ok`, `errors`, `total`) so the
    accompanying test can assert on it without re-parsing.
    """
    materialised = [str(p) for p in paths]
    results = bca.analyze_batch(materialised)
    ok = 0
    errors = 0
    for path, result in zip(materialised, results, strict=True):
        if isinstance(result, bca.AnalysisError):
            errors += 1
            print(f" skip {path}: ({result.error_kind}) {result.error}")
        else:
            ok += 1
            sloc = result["metrics"]["loc"]["sloc"]
            print(f" ok {path}: sloc = {sloc:.0f}")
    return {"ok": ok, "errors": errors, "total": len(materialised)}
A few key contracts:
- AnalysisError is returned, not raised. It is not an Exception subclass; isinstance(slot, bca.AnalysisError) is the discriminator.
- The result list is the same length as the input. paths is consumed lazily, so generators work, but if you want to keep the input around for zip, materialise it into a list first.
- analyze_batch runs with the is_generated walker filter off: every input position yields either a dict or an AnalysisError, never None. Call bca.analyze(path) per-file with the default skip_generated=True if you need the CLI walker's skip behaviour.
Parallel execution
There is no built-in concurrency inside analyze_batch — it is a
sequential sweep. For parallelism, fan the per-file analyze
call out across a thread pool:
from collections.abc import Iterable
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any
import big_code_analysis as bca

def run_parallel(paths: Iterable[Path], *, workers: int = 4) -> list[dict[str, Any] | None]:
    """Fan ``analyze`` out across a thread pool.

    PyO3 releases the GIL across each file's read + parse, so a
    thread pool actually parallelises the heavy work. Use this when
    you need per-file exceptions instead of ``AnalysisError`` slots.
    """
    def _analyze(p: Path) -> dict[str, Any] | None:
        return bca.analyze(str(p))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(_analyze, paths))
PyO3's Python::detach releases the GIL across each file's read +
tree-sitter parse, so the threads do not serialise on the
interpreter lock — this is real parallelism, not contended
co-operation.
AnalysisError taxonomy
error_kind is a closed Literal:
| error_kind | Triggered by |
|---|---|
| "UnsupportedLanguage" | Unknown extension + no shebang / emacs-mode hit |
| "ParseError" | tree-sitter rejected the source, or a rare internal serialisation failure (internal: serialization error: …) |
| "IoError" | std::fs::read failed or the path was not valid UTF-8 |
AnalysisError is frozen and implements __eq__ / __hash__ /
__repr__ over all three fields, so callers can put errors in a
set to deduplicate failures across runs. For retry
classification, the errno is preserved in the error string via
Rust's default formatting:
import re
match = re.search(r"\(os error (\d+)\)$", slot.error)
errno = int(match.group(1)) if match else None
If you need typed dispatch (FileNotFoundError,
PermissionError, …) call bca.analyze(path) per-file instead
of analyze_batch — single-file analyze raises the
canonical OSError subclass. See Error handling.
Flat-record iteration
bca.flatten_spaces(result) walks the nested FuncSpace tree in
pre-order and yields one flat, scalar-only dict per node —
ready for sqlite3.executemany,
pandas.DataFrame.from_records, or any other tabular consumer.
Metric keys use the same dotted convention as the CLI's CSV
writer (cyclomatic.modified.sum, halstead.volume,
loc.lloc_average, …). Identity keys (path, name, kind,
start_line, end_line, parent_name, depth) are added on
every record.
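To make the dotted-key convention concrete, here is a small illustrative sketch of how a nested metrics dict collapses into scalar-only keys. It is not the library's implementation (flatten_spaces also walks the space tree and adds the identity keys); flatten_metrics is a hypothetical helper, and the nested input shape is an assumption based on the JSON examples elsewhere in this chapter.

```python
from typing import Any


def flatten_metrics(metrics: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Collapse nested metric dicts into dotted, scalar-only keys."""
    flat: dict[str, Any] = {}
    for key, value in metrics.items():
        dotted = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse: each nesting level contributes one dotted segment.
            flat.update(flatten_metrics(value, dotted))
        else:
            flat[dotted] = value
    return flat


# Hypothetical input, shaped like result["metrics"] for one space.
sample = {
    "cyclomatic": {"sum": 3.0, "modified": {"sum": 2.0}},
    "halstead": {"volume": 10.5},
}
flat = flatten_metrics(sample)
```

The resulting keys (cyclomatic.sum, cyclomatic.modified.sum, halstead.volume) follow the same naming pattern as the columns described above.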
SQLite via executemany
The example below analyses one file and inserts one row per
FuncSpace into a sqlite table whose columns are the union of
all flattened keys.
"""Flatten a FuncSpace tree into scalar rows for sqlite / pandas.

Demonstrates ``bca.flatten_spaces`` + ``sqlite3.executemany``. The
pandas equivalent is shown in the book as a non-executed snippet so
this example stays dependency-free (sqlite ships with the stdlib).
Tied to the book's ``python/flat-records.md`` page.
"""

from __future__ import annotations

import sqlite3
from contextlib import closing
from pathlib import Path

import big_code_analysis as bca

# SQLite identifier names are case-insensitive, so the Halstead
# pair `N1` / `n1` (and `N2` / `n2`) collide on one column. Rewrite
# the uppercase totals to a distinct name before insertion. The
# explicit map (not a `.replace(".N", "...")` substring rewrite)
# means a hypothetical future `halstead.NN_metric` would not be
# silently mangled.
_RENAME_FOR_SQLITE: dict[str, str] = {
    "halstead.N1": "halstead.total_1",
    "halstead.N2": "halstead.total_2",
}


def _safe_column(key: str) -> str:
    return _RENAME_FOR_SQLITE.get(key, key)


def run(path: Path, db_path: Path) -> int:
    """Analyse ``path`` and insert one row per FuncSpace into ``db_path``.

    Returns the number of rows inserted so the test can assert on it.
    """
    result = bca.analyze(path)
    if result is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)
    records = [{_safe_column(k): v for k, v in r.items()} for r in bca.flatten_spaces(result)]
    if not records:
        return 0
    columns = sorted({k for r in records for k in r})
    cols_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    rows = [tuple(r.get(c) for c in columns) for r in records]
    # `closing(sqlite3.connect(...))` is the documented idiom: the
    # bare ``with sqlite3.connect(...)`` context manager only commits
    # / rolls back the transaction; it does NOT close the connection,
    # so a long-running consumer leaks file descriptors (and on
    # Windows holds an exclusive write lock on the db file).
    with closing(sqlite3.connect(db_path)) as db, db:
        db.execute(f"CREATE TABLE IF NOT EXISTS metrics ({cols_sql})")
        db.executemany(
            f"INSERT INTO metrics ({cols_sql}) VALUES ({placeholders})",
            rows,
        )
    return len(rows)


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 3:
        sys.exit("usage: python flat_records.py <source-file> <out.db>")
    inserted = run(Path(sys.argv[1]), Path(sys.argv[2]))
    print(f"inserted {inserted} rows into {sys.argv[2]}")
The iterator is lazy and single-use: it walks the input once
without materialising the whole list. A second iteration of the
same iterator yields nothing — call list() once if you need to
re-iterate.
Pandas
flatten_spaces is the natural input to
pandas.DataFrame.from_records. Pandas is not a dependency of
the bindings; install it separately if you want the DataFrame
view.
import big_code_analysis as bca
import pandas as pd
result = bca.analyze("src/lib.rs")
if result is not None:
    df = pd.DataFrame.from_records(bca.flatten_spaces(result))
    print(df.head())
    # Group by space kind to inspect the average cyclomatic per
    # function vs. per class vs. per file.
    by_kind = df.groupby("kind")["cyclomatic.sum"].mean()
Identity columns vs CLI CSV
The flat-record schema is mostly aligned with the CLI's CSV writer, with a couple of intentional deltas:
- Identity columns use name / kind here; the CSV writer uses space_name / space_kind. Flat records also add parent_name / depth; the CSV writer omits those.
- tokens.* flattens to the JSON shape (tokens.tokens, tokens.tokens_average, …), while CSV renames those to tokens.sum / .average / .min / .max. Rename in the consumer if you need exact CSV alignment.
Anonymous spaces (Rust closures, JavaScript function expressions /
arrows) keep their name == "<anonymous>" marker verbatim —
flatten_spaces does not normalise.
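If a consumer needs the CSV column names, the rename can be done on the flat records themselves. The sketch below is one possible approach, not something the bindings provide: to_csv_columns and the rename table are hypothetical, and the table covers only the pairs spelled out above. Pairing tokens.tokens with tokens.sum and tokens.tokens_average with tokens.average is an assumption inferred from the order in which they are listed; the remaining tokens.* keys are left untouched because their JSON names are not given here.

```python
from typing import Any

# Hypothetical rename table: only the pairs named in the text above.
FLAT_TO_CSV: dict[str, str] = {
    "name": "space_name",
    "kind": "space_kind",
    "tokens.tokens": "tokens.sum",              # assumed pairing
    "tokens.tokens_average": "tokens.average",  # assumed pairing
}

# Identity keys that exist in flat records but not in the CSV output.
FLAT_ONLY = frozenset({"parent_name", "depth"})


def to_csv_columns(record: dict[str, Any]) -> dict[str, Any]:
    """Rename one flat record's keys towards the CSV writer's column names."""
    return {
        FLAT_TO_CSV.get(key, key): value
        for key, value in record.items()
        if key not in FLAT_ONLY
    }


# Hypothetical flat record for illustration.
row = to_csv_columns(
    {"name": "main", "kind": "function", "depth": 1, "parent_name": "a.rs", "tokens.tokens": 42.0}
)
```

Keeping the mapping as an explicit table means an unexpected key passes through unchanged instead of being silently rewritten.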
Caveats
- parent_name alone cannot disambiguate same-named siblings nested under different parents (e.g. two Inner classes under two different outer classes both surface as parent_name == "Inner" for their own children). Pair with depth and source-order position, or rebuild the qualified name in your consumer, if you need a fully-qualified path.
- Do not mutate the input result while iterating: the walker keeps references into it, so mutations to not-yet-yielded subtrees will be observed in later records.
- Missing metric subtrees produce no keys (absent, not None), matching the "Halstead disabled" edge case for metric selection.
- flatten_spaces raises TypeError if the input is not a mapping; callers must filter None returns from bca.analyze (e.g. generated files with skip_generated=True) before passing.
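The first caveat suggests rebuilding a qualified name in the consumer. Because the records arrive in pre-order, a stack keyed on depth is enough. This is a minimal sketch under two assumptions: depth grows by one per nesting level, and every name is a string. with_qualified_names and the "::" separator are hypothetical choices, not part of the bindings.

```python
from collections.abc import Iterable, Iterator
from typing import Any


def with_qualified_names(
    records: Iterable[dict[str, Any]], sep: str = "::"
) -> Iterator[dict[str, Any]]:
    """Yield copies of pre-order records with a ``qualified_name`` key added."""
    stack: list[str] = []
    for record in records:
        # Drop every ancestor at this depth or deeper, then push ourselves.
        del stack[record["depth"]:]
        stack.append(record["name"])
        yield {**record, "qualified_name": sep.join(stack)}


# Hypothetical pre-order records: two Inner classes under different parents.
sample = [
    {"name": "a.py", "depth": 0},
    {"name": "Outer1", "depth": 1},
    {"name": "Inner", "depth": 2},
    {"name": "run", "depth": 3},
    {"name": "Outer2", "depth": 1},
    {"name": "Inner", "depth": 2},
    {"name": "run", "depth": 3},
]
qualified = [r["qualified_name"] for r in with_qualified_names(sample)]
```

The two run methods now resolve to a.py::Outer1::Inner::run and a.py::Outer2::Inner::run, which parent_name alone could not distinguish.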
Metric selection
Pass metrics=[…] to compute only a subset of the metric suite.
metrics=None (the default) preserves the "compute everything"
behaviour. Unrequested metrics are absent from the result
dict (not present with None placeholders).
from pathlib import Path
from typing import Any

import big_code_analysis as bca


def run(path: Path) -> dict[str, Any]:
    """Compute only LoC + cyclomatic for ``path`` and return the result.

    ``bca.METRIC_NAMES`` is a ``tuple[str, ...]`` of every canonical
    name accepted by ``metrics=``. The string ``"halstead"`` is one
    of them; ``in`` membership tests the selection client-side
    before any I/O is paid for.
    """
    if "halstead" not in bca.METRIC_NAMES:
        msg = "halstead is missing from METRIC_NAMES; bindings ABI drift"
        raise RuntimeError(msg)
    selected = bca.analyze(path, metrics=["loc", "cyclomatic"])
    if selected is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)
    metric_keys = sorted(selected["metrics"])
    print(f"computed only: {metric_keys}")
    return selected


def run_derived(path: Path) -> dict[str, Any]:
    """Selecting ``mi`` auto-pulls in its three dependencies."""
    selected = bca.analyze(path, metrics=["mi"])
    if selected is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)
    pulled = sorted(selected["metrics"])
    print(f"mi pulled in: {pulled}")
    return selected
The same kwarg is honoured by bca.analyze_source and
bca.analyze_batch — the latter applies the selection uniformly
to every file in the batch. Validation runs before any file
I/O: an empty list or unknown name raises ValueError
immediately and never returns an AnalysisError slot for what is
really a caller bug.
Canonical names
The full set is available as a tuple:
import big_code_analysis as bca
assert "halstead" in bca.METRIC_NAMES
Names are case-sensitive lowercase; passing an unknown name
raises ValueError with the canonical list in the message. The
"exit" Metric-Display spelling is accepted as an alias for the
canonical JSON-key spelling "nexits"; both produce a
"nexits" key in the output. Duplicates are silently collapsed.
| Metric | JSON key | Dependencies pulled in |
|---|---|---|
| LoC | loc | — |
| Cyclomatic | cyclomatic | — |
| Cognitive | cognitive | — |
| Halstead | halstead | — |
| ABC | abc | — |
| nargs | nargs | — |
| nom | nom | — |
| npa | npa | — |
| npm | npm | — |
| nexits (alias exit) | nexits | — |
| tokens | tokens | — |
| Maintainability Index | mi | loc, cyclomatic, halstead |
| Weighted Methods per Class | wmc | cyclomatic, nom |
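The selection rules above (alias, duplicate collapsing, dependency pull, fail-fast validation) can be modelled in a few lines. The sketch below is a client-side illustration of the documented behaviour, not the bindings' implementation: resolve_selection is a hypothetical helper, and CANONICAL is transcribed from the table rather than read from bca.METRIC_NAMES, whose actual order may differ.

```python
from collections.abc import Iterable

# Transcribed from the table above (assumption: same membership as
# bca.METRIC_NAMES; the order here is just the table's row order).
CANONICAL = (
    "loc", "cyclomatic", "cognitive", "halstead", "abc", "nargs",
    "nom", "npa", "npm", "nexits", "tokens", "mi", "wmc",
)
ALIASES = {"exit": "nexits"}
DEPENDENCIES = {
    "mi": ("loc", "cyclomatic", "halstead"),
    "wmc": ("cyclomatic", "nom"),
}


def resolve_selection(requested: Iterable[str]) -> list[str]:
    """Return the metric keys a ``metrics=`` selection would produce."""
    names = [ALIASES.get(name, name) for name in requested]
    if not names:
        raise ValueError("metrics= must not be empty")
    unknown = sorted(set(names) - set(CANONICAL))
    if unknown:
        raise ValueError(f"unknown metric(s) {unknown}; expected one of {list(CANONICAL)}")
    selected = set(names)  # duplicates collapse here
    for name in names:
        selected.update(DEPENDENCIES.get(name, ()))
    return [name for name in CANONICAL if name in selected]
```

For example, resolve_selection(["mi"]) yields loc, cyclomatic, halstead, and mi, which mirrors what run_derived prints for the real bindings.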
Performance trade-off
Computing the full suite is the default because it is what the
CLI does. Selecting a single metric is strictly faster —
each compute pass is skipped — but the tree-sitter parse and
the AST walk are the dominant cost on most inputs, so the saving
on a single file is small. The benefit scales with batch size:
when analyze_batch runs across a large repository, dropping
the most expensive metric you do not need (often Halstead, on
deep call trees) is a measurable win.
Unrequested metrics are absent from the result. Code that
unconditionally indexes into result["metrics"]["mi"] will
KeyError if you opted out of mi; guard with if "mi" in result["metrics"] or use .get("mi").
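When the same consumer code has to handle both full and narrowed results, a small lookup helper avoids scattering membership checks. This is a sketch only; metric_or_default is a hypothetical helper name, and the sample dict merely imitates the result shape shown in the quick start.

```python
from typing import Any


def metric_or_default(result: dict[str, Any], dotted: str, default: Any = None) -> Any:
    """Look up ``metrics.<dotted>`` and return ``default`` if any segment is absent."""
    node: Any = result.get("metrics", {})
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return default
        node = node[part]
    return node


# Hypothetical result from a metrics=["loc"] selection.
narrowed = {"name": "src/lib.rs", "metrics": {"loc": {"sloc": 12.0}}}
sloc = metric_or_default(narrowed, "loc.sloc")
cyclomatic = metric_or_default(narrowed, "cyclomatic.sum")
```

Here sloc is 12.0 and cyclomatic is None, because cyclomatic was never requested and is therefore absent rather than present with a placeholder.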
See also
- Batch processing: metrics= applies uniformly to every file in a batch; validation runs once, before the input is iterated.
- SARIF output: threshold names are independent of the metrics= selection; you can request metrics=["loc"] and still gate on cyclomatic thresholds, but the SARIF will have no findings for the dropped metrics.
- Flat-record iteration: flatten_spaces silently emits no keys for metrics that were absent from the source dict, so a metrics= selection naturally narrows the flattened columns.
SARIF output
bca.to_sarif(result, *, thresholds=None) renders an analysis
result (or an iterable of them) into a SARIF
2.1.0
JSON document, ready for upload to GitHub Code Scanning or any
other SARIF consumer. The output is produced by the same Rust
writer that backs bca check -O sarif, so the schema URL, tool
driver name / version, and rule descriptions match the CLI
byte-for-byte.
from collections.abc import Iterable, Mapping
from pathlib import Path

import big_code_analysis as bca


def run(
    paths: Iterable[Path],
    sarif_path: Path,
    thresholds: Mapping[str, float],
) -> str:
    """Analyse ``paths`` and write a SARIF document to ``sarif_path``.

    Returns the rendered SARIF JSON so the caller (or the test) can
    inspect it without re-reading the file.
    """
    batch = bca.analyze_batch([str(p) for p in paths])
    sarif = bca.to_sarif(batch, thresholds=dict(thresholds))
    sarif_path.parent.mkdir(parents=True, exist_ok=True)
    sarif_path.write_text(sarif, encoding="utf-8")
    print(f"wrote {sarif_path} ({len(sarif)} bytes)")
    return sarif
to_sarif accepts:
- A single dict returned by bca.analyze or bca.analyze_source.
- Any iterable yielding such dicts and / or bca.AnalysisError instances (the natural shape of bca.analyze_batch's return value). AnalysisError entries are skipped silently: they represent files that could not be analysed, not findings.
Thresholds
Accepted threshold names mirror the CLI's EXTRACTORS table in
big-code-analysis-cli/src/thresholds.rs:
- cognitive, cyclomatic, cyclomatic.modified
- halstead.volume, halstead.difficulty, halstead.effort, halstead.time, halstead.bugs
- loc.sloc, loc.ploc, loc.lloc, loc.cloc, loc.blank
- nom, tokens, nexits, nargs
- mi.original, mi.sei, mi.visual_studio
- abc, wmc, npm, npa
An unknown name raises ValueError listing the accepted set, so
a typo fails fast instead of silently producing an empty SARIF
run.
thresholds=None (the default) and thresholds={} both produce
a well-formed SARIF document with empty results and rules
arrays. This matches the CLI's posture: there are no built-in
default thresholds; every check run supplies its own limits.
Upload to GitHub Code Scanning
# .github/workflows/code-scanning.yml (excerpt)
- name: Compute metric SARIF
  run: |
    python - <<'PY'
    import big_code_analysis as bca
    with open("paths.txt", encoding="utf-8") as paths_fh:
        results = bca.analyze_batch(paths_fh.read().splitlines())
    with open("metrics.sarif", "w", encoding="utf-8") as fh:
        fh.write(bca.to_sarif(results, thresholds={"cyclomatic": 15}))
    PY
- name: Upload to Code Scanning
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: metrics.sarif
The upload action is documented under
github/codeql-action/upload-sarif.
The bindings produce one SARIF run per call; the action handles
the upload to the repository's Code Scanning alerts.
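Before uploading, it can be useful to check what the document contains, for example to fail a CI step locally or to confirm that an empty threshold set really produced no findings. The sketch below counts results per rule using only the SARIF 2.1.0 structure (runs, results, ruleId). findings_per_rule is a hypothetical helper, and the rule IDs in the sample are made up for illustration; the IDs the bindings actually emit are not specified on this page.

```python
import json
from collections import Counter


def findings_per_rule(sarif_text: str) -> Counter:
    """Count SARIF results per ``ruleId`` across every run in the document."""
    document = json.loads(sarif_text)
    counts: Counter = Counter()
    for run in document.get("runs", []):
        for result in run.get("results", []):
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts


# Hand-written sample document; the ruleId values are hypothetical.
sample = json.dumps(
    {
        "version": "2.1.0",
        "runs": [
            {
                "results": [
                    {"ruleId": "cyclomatic"},
                    {"ruleId": "cyclomatic"},
                    {"ruleId": "loc.sloc"},
                ]
            }
        ],
    }
)
counts = findings_per_rule(sample)
```

In practice the input would be the string returned by bca.to_sarif; an empty Counter corresponds to the empty results array described under Thresholds.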
What "Unit" findings mean
to_sarif emits file-scope (unit-space) findings for every
metric whose JSON headline at the unit space matches the CLI's
per-space accessor (loc.*, halstead.*, mi.*, nom,
nargs, nexits, tokens, abc, wmc, npm, npa). The
three exceptions — cyclomatic, cyclomatic.modified,
cognitive — are skipped at the unit level because the JSON
exposes the aggregate sum across children while the CLI's
per-space accessor returns just the unit's own scalar.
Unit findings carry logicalLocations: [{"fullyQualifiedName": "<file>"}]. Nameless non-unit spaces (rare parse-failure case)
carry "<unnamed>" — both matching the CLI's function_token
placeholders.
See also
- Batch processing: the natural source of input iterables for to_sarif; AnalysisError entries are skipped silently.
- Metric selection: threshold names are a closed set independent of metrics=; requesting a narrower metric suite while gating on a dropped threshold yields an empty SARIF run.
- Error handling: the typed exceptions to_sarif raises for bad caller input (TypeError / ValueError).
Error handling
The bindings split errors into two domains:
- Caller errors are raised: ValueError for bad arguments, TypeError for the wrong type, OSError and its subclasses for filesystem failures.
- Per-file analysis errors in a batch are returned as bca.AnalysisError values inside the result list. They are not exceptions and never raise.
The single-file bca.analyze walks the first path; the batch
bca.analyze_batch walks the second.
from pathlib import Path
from typing import Any

import big_code_analysis as bca


def run(
    fixtures: Path,
    *,
    missing_path: Path,
) -> dict[str, Any]:
    """Trigger each error path and return a small report.

    ``fixtures`` is a directory containing at least ``hello.rs``;
    ``missing_path`` must NOT exist on disk.
    """
    report: dict[str, Any] = {
        "file_not_found": False,
        "unsupported": False,
        "batch_errors": 0,
    }

    # 1. analyze() on a missing path raises a typed OSError subclass.
    try:
        bca.analyze(str(missing_path))
    except FileNotFoundError as err:
        report["file_not_found"] = True
        print(f"file_not_found: errno={err.errno} filename={err.filename}")

    # 2. analyze() on an unknown extension raises
    # UnsupportedLanguageError (itself a ValueError subclass).
    # The write is inside the try/finally so a future second
    # mutation before the analyse call still gets cleaned up.
    unknown = fixtures / "hello.unknown_extension"
    try:
        unknown.write_text("noop", encoding="utf-8")
        bca.analyze(str(unknown))
    except bca.UnsupportedLanguageError as err:
        report["unsupported"] = True
        print(f"unsupported_language: {err}")
    finally:
        unknown.unlink(missing_ok=True)

    # 3. analyze_batch() returns AnalysisError, never raises per-file.
    paths = [str(fixtures / "hello.rs"), str(missing_path)]
    for slot in bca.analyze_batch(paths):
        if isinstance(slot, bca.AnalysisError):
            report["batch_errors"] += 1
            print(f"batch_error: ({slot.error_kind}) {slot.error}")
    return report
Single-file exceptions
bca.analyze and bca.analyze_source raise:
| Exception | Subclass of | Triggered by |
|---|---|---|
| bca.UnsupportedLanguageError | ValueError | Unknown extension + no shebang / emacs-mode hit |
| bca.ParseError | ValueError | tree-sitter rejected the source |
| ValueError (raw) | — | Non-UTF-8 path with allow_lossy_path=False (the default) |
| OSError and subclasses | — | std::fs::read failed |
The OSError raised by analyze dispatches to the canonical
subclass based on errno:
import big_code_analysis as bca
path = "src/example.rs"
try:
    bca.analyze(path)
except FileNotFoundError as err:
    print("missing:", err.errno, err.filename)
except PermissionError as err:
    print("denied:", err.errno, err.filename)
except IsADirectoryError as err:
    print("directory:", err.errno, err.filename)
Each branch dispatches on the underlying errno:
| Exception | Typical err.errno (Linux) | When it fires |
|---|---|---|
| FileNotFoundError | 2 (ENOENT) | Path does not exist. |
| PermissionError | 13 (EACCES) | Read bit denied for the calling user. |
| IsADirectoryError | 21 (EISDIR) | Path resolves to a directory. |
Use except OSError if you want to catch the whole family and
inspect err.errno / err.filename yourself.
UnsupportedLanguageError and ParseError are both ValueError
subclasses, so a single except ValueError catches both. Prefer
the typed catches when you want to differentiate.
Batch errors
bca.analyze_batch returns bca.AnalysisError values instead of
raising, so a single bad file does not break the whole batch.
for slot in bca.analyze_batch(paths):
    if isinstance(slot, bca.AnalysisError):
        log.warning("%s (%s): %s", slot.path, slot.error_kind, slot.error)
    else:
        process(slot)
error_kind is a closed Literal:
- "UnsupportedLanguage": extension and shebang / emacs-mode resolution both came up empty.
- "ParseError": tree-sitter rejected the input, or (rare) a Rust-side JSON serialisation of the result failed. The serialisation case is prefixed with internal: serialization error: in the error string; check for the prefix when the distinction matters (serialisation failures are not recoverable by re-reading the file).
- "IoError": the most common kind: std::fs::read failed. The closed taxonomy also folds in non-UTF-8 path failures, so a path-encoding error surfaces as "IoError" rather than as a distinct fourth value.
For "IoError" instances the underlying OS errno is preserved
in the error string via Rust's default formatting ("<msg> (os error <N>)" on Unix). Parse with regex if you need it for retry
classification:
import re
match = re.search(r"\(os error (\d+)\)$", slot.error)
errno = int(match.group(1)) if match else None
If you need typed OSError subclasses, call bca.analyze per
file instead of analyze_batch — single-file analyze raises
FileNotFoundError / PermissionError / IsADirectoryError
directly.
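Putting the errno regex and the serialisation prefix together, a retry classifier for batch slots might look like the sketch below. It works on the plain error_kind and error strings so it can be tried without the bindings installed. The helper names are hypothetical, and the choice of which errno values count as transient is an assumption of this sketch (a policy decision for your pipeline), not something the bindings define.

```python
import errno
import re
from typing import Optional

_OS_ERROR = re.compile(r"\(os error (\d+)\)$")
_SERIALISATION_PREFIX = "internal: serialization error:"

# Assumption: errnos this sketch treats as worth retrying.
_TRANSIENT = frozenset(
    {errno.EINTR, errno.EAGAIN, errno.EBUSY, errno.EMFILE, errno.ENFILE}
)


def extract_errno(error: str) -> Optional[int]:
    """Pull N out of a trailing '(os error N)' suffix, if there is one."""
    match = _OS_ERROR.search(error)
    return int(match.group(1)) if match else None


def is_internal_failure(error_kind: str, error: str) -> bool:
    """True for the rare serialisation case folded into "ParseError"."""
    return error_kind == "ParseError" and error.startswith(_SERIALISATION_PREFIX)


def should_retry(error_kind: str, error: str) -> bool:
    """Retry only I/O failures whose errno looks transient."""
    if error_kind != "IoError":
        return False
    return extract_errno(error) in _TRANSIENT
```

With a real batch you would call should_retry(slot.error_kind, slot.error) for each AnalysisError slot. A missing file (os error 2) is not retried, while descriptor exhaustion (EMFILE) is.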
Programmer errors in batches
analyze_batch does still raise on caller bugs:
- TypeError if paths is not iterable, or an element is not str / os.PathLike[str]. This aborts the whole call; any results computed before the bad element are discarded.
- ValueError if metrics= is an explicitly empty sequence or contains an unknown name. Validation runs before the input iterable's __iter__, so a generator's side effects (and any partial yields) are preserved on this raise path.
Logging recipe
A small logging helper for batch output keeps successes / failures aligned without bespoke formatting:
import logging
import big_code_analysis as bca
log = logging.getLogger(__name__)
def report(paths: list[str]) -> None:
    for path, slot in zip(paths, bca.analyze_batch(paths)):
        if isinstance(slot, bca.AnalysisError):
            log.warning(
                "skip %s (%s): %s", path, slot.error_kind, slot.error
            )
        else:
            log.info(
                "ok %s sloc=%s", path,
                slot["metrics"]["loc"]["sloc"],
            )
See also
- Batch processing: the never-raise contract that routes per-file failures into AnalysisError slots.
- Async patterns: asyncio.gather(..., return_exceptions=True) is the async-side equivalent of the batch contract: per-task exceptions land in the result list instead of cancelling the whole gather.
- Quick start: the single-file analyze path that raises typed OSError subclasses.
Async patterns
bca.analyze is CPU-bound: the work is a tree-sitter parse plus
the metric passes, both of which release the GIL on the Rust side
via PyO3's Python::detach. The canonical async pattern is
therefore asyncio.to_thread:
import asyncio
from collections.abc import Iterable
from pathlib import Path
from typing import Any

import big_code_analysis as bca


async def analyze_async(path: Path) -> dict[str, Any] | None:
    """Run ``bca.analyze(path)`` on the default thread executor."""
    return await asyncio.to_thread(bca.analyze, str(path))


async def analyze_all(
    paths: Iterable[Path],
) -> list[dict[str, Any] | BaseException | None]:
    """Fan ``analyze_async`` out across ``paths`` with ``asyncio.gather``.

    ``return_exceptions=True`` matters here: ``bca.analyze`` runs
    inside ``asyncio.to_thread`` and Python threads cannot be
    cancelled. If one call raises and gather re-raises with
    ``return_exceptions=False``, the surviving threads keep running
    in the default executor, producing results that are silently
    discarded. With ``return_exceptions=True`` every thread's
    result (success OR exception) lands in the returned list so
    the caller can dispatch per-file.
    """
    return await asyncio.gather(
        *(analyze_async(p) for p in paths),
        return_exceptions=True,
    )
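The list returned by analyze_all mixes three kinds of slot: a result dict, None for a skipped file, and an exception instance. A per-file dispatch over that list could be sketched as follows; partition is a hypothetical helper, and the sample outcomes are hand-built stand-ins rather than real analysis results.

```python
from typing import Any, Union

Outcome = Union[dict, BaseException, None]


def partition(
    paths: list[str], outcomes: list[Outcome]
) -> tuple[dict[str, dict[str, Any]], list[str], dict[str, BaseException]]:
    """Split gather(..., return_exceptions=True) output into three buckets."""
    if len(paths) != len(outcomes):
        raise ValueError("paths and outcomes must have the same length")
    analysed: dict[str, dict[str, Any]] = {}
    skipped: list[str] = []
    failed: dict[str, BaseException] = {}
    for path, outcome in zip(paths, outcomes):
        if isinstance(outcome, BaseException):
            failed[path] = outcome   # raised inside the worker thread
        elif outcome is None:
            skipped.append(path)     # file looked generated
        else:
            analysed[path] = outcome
    return analysed, skipped, failed


# Hand-built stand-ins for three gather slots.
sample_paths = ["a.rs", "gen.rs", "missing.rs"]
sample_outcomes: list[Outcome] = [
    {"name": "a.rs"},
    None,
    FileNotFoundError(2, "No such file or directory"),
]
analysed, skipped, failed = partition(sample_paths, sample_outcomes)
```

Checking for BaseException first matters: the exception branch must win before the dict branch is considered, and the typed instance stays available for isinstance dispatch afterwards.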
Why to_thread, not native async
bca.analyze is a synchronous Python function backed by
synchronous Rust code — there is no await boundary inside it.
Wrapping it in asyncio.to_thread:
- Schedules the call on the default thread pool.
- Lets other coroutines progress while the parse + metric pass runs.
- Returns the result back to the calling coroutine when done.
Because the Rust side releases the GIL across the heavy work,
several to_thread(bca.analyze, ...) calls genuinely run in
parallel — this is not co-operative I/O multiplexing, it is real
multi-core utilisation gated on the thread pool's size.
Custom executors
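For a tighter cap on the worker count, pass a purpose-built executor
to loop.run_in_executor (asyncio.to_thread always uses the loop's
default executor and takes no executor argument):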
import asyncio
from concurrent.futures import ThreadPoolExecutor
import big_code_analysis as bca
async def analyze_many(paths: list[str]) -> list[object]:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=8) as pool:
        return await asyncio.gather(
            *(loop.run_in_executor(pool, bca.analyze, p) for p in paths)
        )
Eight workers on an 8-core machine is the comfortable upper bound for purely CPU-bound work; raising it further oversubscribes the machine and trades throughput for context-switch overhead.
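Rather than hard-coding eight, the pool size can be derived from the host. The sketch below shows one way to do that, with a stand-in function in place of bca.analyze so it runs without the bindings. default_worker_count, map_in_pool, and fake_analyze are all hypothetical names, and the cap of 8 simply echoes the figure used above.

```python
import asyncio
import os
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, TypeVar

T = TypeVar("T")


def default_worker_count(cap: int = 8) -> int:
    """One worker per reported core, clamped to the range [1, cap]."""
    return max(1, min(cap, os.cpu_count() or 1))


async def map_in_pool(func: Callable[[str], T], paths: list[str], workers: int) -> list[T]:
    """Run ``func`` over ``paths`` on a dedicated, bounded thread pool."""
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [loop.run_in_executor(pool, func, path) for path in paths]
        # gather preserves input order regardless of completion order.
        return list(await asyncio.gather(*futures))


def fake_analyze(path: str) -> int:
    """Stand-in for bca.analyze so the sketch runs without the bindings."""
    return len(path)


lengths = asyncio.run(map_in_pool(fake_analyze, ["a", "bb", "ccc"], workers=2))
```

Swapping fake_analyze for bca.analyze and workers for default_worker_count() gives a pool that adapts to the machine while staying under the cap.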
Streaming results
asyncio.as_completed lets you start consuming results as soon
as the first analysis finishes — useful when the per-file work
varies wildly in cost (a 5 KB file vs a 500 KB generated bundle):
import asyncio
import big_code_analysis as bca
async def first_failure(paths: list[str]) -> str | None:
    """Return the path of the first file with cyclomatic > 50."""
    tasks = [asyncio.create_task(asyncio.to_thread(bca.analyze, p)) for p in paths]
    try:
        for coro in asyncio.as_completed(tasks):
            result = await coro
            if result is None:
                continue
            if result["metrics"]["cyclomatic"]["sum"] > 50:
                return result["name"]
    finally:
        for t in tasks:
            t.cancel()
    return None
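The finally-block cancellation matters: as_completed does not
auto-cancel pending tasks when the caller returns early, so a
leaked task can keep running on the thread pool well after the
async function returns. Cancelling drops calls that are still queued
in the executor; a call already running on a worker thread cannot be
interrupted and finishes anyway, with its result discarded.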
Anti-pattern: calling bca.analyze directly in a coroutine
# Don't do this.
async def bad(path: str) -> dict | None:
    return bca.analyze(path)  # blocks the event loop on every call
async def does not make the body asynchronous. Without
to_thread or an explicit executor, every coroutine that calls
bca.analyze stalls the event loop for the full duration of the
parse — other tasks waiting on I/O, timers, or queues all freeze
until the parse returns. The to_thread wrapper is one line and
makes the difference between a responsive server and a
single-threaded one.
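The stall is easy to measure. The sketch below assumes a hypothetical blocking stand-in, `slow_parse`, in place of `bca.analyze`, and counts how often a 10 ms ticker coroutine gets to run while one parse is in flight.

```python
import asyncio
import time


def slow_parse(path: str) -> str:
    """Hypothetical stand-in for bca.analyze: blocks the calling thread."""
    time.sleep(0.2)
    return path


async def count_ticks(use_thread: bool) -> int:
    """Count how often a 10 ms ticker runs while one parse is in flight."""
    ticks = 0
    stop = asyncio.Event()

    async def ticker() -> None:
        nonlocal ticks
        while not stop.is_set():
            await asyncio.sleep(0.01)
            ticks += 1

    task = asyncio.create_task(ticker())
    await asyncio.sleep(0)  # let the ticker start
    if use_thread:
        await asyncio.to_thread(slow_parse, "a.rs")
    else:
        slow_parse("a.rs")  # anti-pattern: the loop is frozen here
    stop.set()
    await task
    return ticks
```

Comparing `asyncio.run(count_ticks(False))` with `asyncio.run(count_ticks(True))` shows the ticker barely advancing in the direct-call case and ticking throughout the parse in the to_thread case.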
When analyze_batch is the better fit
If you are processing a static, finite list of paths and do not
need streaming results, bca.analyze_batch is
simpler than gather(*to_thread(...)): it runs sequentially on
the calling thread but never raises on per-file errors. Wrap the
whole analyze_batch call in asyncio.to_thread to keep the
event loop responsive:
import asyncio
import big_code_analysis as bca
async def batch(paths: list[str]) -> list[object]:
    return await asyncio.to_thread(bca.analyze_batch, paths)
This trades the per-file parallelism of gather for the
simpler error model of analyze_batch. Pick gather when you
want both parallelism and typed OSError dispatch; pick
to_thread(analyze_batch, paths) when you want one async call
and the never-raise contract.
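For the gather side of that trade-off, one way to combine parallelism with typed dispatch is `return_exceptions=True`. This is a sketch under assumptions: `read_size` is a hypothetical stand-in that, like the text implies for `bca.analyze`, surfaces failures as `OSError` subclasses, and the labels "missing", "unreadable", and "os-error" are made up for illustration.

```python
import asyncio
import os
import tempfile


def read_size(path: str) -> int:
    """Hypothetical stand-in for bca.analyze: raises OSError subclasses."""
    with open(path, "rb") as fh:
        return len(fh.read())


async def analyze_with_dispatch(
    paths: list[str],
) -> tuple[dict[str, int], dict[str, str]]:
    """Run in parallel, then split results from typed OSError failures."""
    outcomes = await asyncio.gather(
        *(asyncio.to_thread(read_size, p) for p in paths),
        return_exceptions=True,
    )
    ok: dict[str, int] = {}
    failed: dict[str, str] = {}
    for path, outcome in zip(paths, outcomes):
        if isinstance(outcome, FileNotFoundError):
            failed[path] = "missing"
        elif isinstance(outcome, PermissionError):
            failed[path] = "unreadable"
        elif isinstance(outcome, OSError):
            failed[path] = "os-error"
        elif isinstance(outcome, BaseException):
            raise outcome  # not an I/O problem: surface it
        else:
            ok[path] = outcome
    return ok, failed


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        present = os.path.join(tmp, "a.rs")
        with open(present, "w") as fh:
            fh.write("fn main() {}")
        gone = os.path.join(tmp, "gone.rs")
        print(asyncio.run(analyze_with_dispatch([present, gone])))
```

One failing file no longer aborts the whole batch, and the caller still sees which kind of error each path hit.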
Developers Guide
If you want to contribute to the development of big-code-analysis, this
guide summarizes the guidelines that will help you build, test, and
submit your changes.
As a prerequisite, you need to install the latest available version of Rust.
You can learn how to do that
here.
Clone Repository
First of all, you need to clone the repository. You can do that:
through HTTPS
git clone -j8 https://github.com/dekobon/big-code-analysis.git
or through SSH
git clone -j8 git@github.com:dekobon/big-code-analysis.git
Make is the canonical entry point
The repository ships a Makefile that wraps every common build, test,
lint, format, and docs task. Run make help to see the full list of
targets, and make check-tools to verify which optional tools
(taplo, markdownlint-cli2, shellcheck, shfmt, checkmake,
mdbook, cargo-insta, cargo-udeps) are present on your machine.
The two composite targets you will use most:
- make pre-commit: the recommended local gate before committing. Runs cargo fmt --check, both clippy invocations (default-features and --all-features), cargo test --workspace --all-features (lib + bin + integration + doc), cargo +nightly udeps, and the markdown / TOML / shell / Makefile lint families in one parallel pass.
- make ci: the same checks in the order CI runs them, with no auto-fixing. Use this to reproduce a failing CI run locally.
If GNU Make 4 or any of the optional tools are unavailable, fall back to the raw cargo commands shown below — they are equivalent to the corresponding Make targets.
Building
To build the big-code-analysis library, the CLI, and the web
server in one shot:
make build # cargo build --workspace --all-targets
make build-release # cargo build --workspace --release
For an individual crate, invoke cargo directly:
cargo build # library only
cargo build -p big-code-analysis-cli # CLI only
cargo build -p big-code-analysis-web # web server only
make check runs cargo check --workspace --all-targets for fast
type-checking during iteration.
Testing
To verify that all tests pass:
make test # cargo test --workspace --all-features --lib --bins --tests
make test-doc # cargo test --workspace --all-features --doc
If you only want to run the cargo command yourself:
cargo test --workspace --all-features --verbose
Updating insta tests
We use insta; install cargo insta to manage snapshots. The Makefile wraps the two operations you need:
make insta-review # cargo insta test --review (interactive)
make insta-accept # cargo insta test --accept (use with care)
make insta-review runs the tests, generates the new snapshot
references, and lets you review each diff. Reach for make insta-accept only for bulk metric-value-only refreshes (grammar
bumps, Halstead operator reclassification) where you have already
verified the diff pattern is uniform.
Code Formatting
If all previous steps went well, and you want to make a pull request
to integrate your invaluable help in the codebase, the last step left
is code formatting. The make fmt target runs every formatter in the
project (Rust, Markdown, TOML, Bash) in one shot; make fmt-check
verifies formatting without modifying files.
make fmt # cargo fmt + markdownlint-cli2 --fix + shfmt -w + taplo fmt
make fmt-check # the equivalent --check variants
Rustfmt
This tool formats your code according to Rust style guidelines.
To install:
rustup component add rustfmt
To format the code (handled automatically by make fmt):
cargo fmt
Clippy
Clippy is a linter that automatically catches many common mistakes. Every error and warning it reports must be fixed before opening a pull request.
make clippy runs both clippy invocations the project enforces
(default-features and --all-features); make lint additionally
runs the markdown, shell, TOML, and Makefile linters.
To install:
rustup component add clippy
To detect errors and warnings:
make clippy
# or, manually:
cargo clippy --workspace --all-targets -- -D warnings
cargo clippy --workspace --all-targets --all-features -- -D warnings
Unused dependencies
make udeps runs cargo +nightly udeps --workspace --all-targets to
catch dependencies declared in Cargo.toml but never referenced.
Requires the nightly toolchain (rustup toolchain install nightly)
and cargo-udeps.
Code Documentation
make doc # cargo doc --no-deps --workspace --all-features (warning-tolerant)
make doc-open # same, then open in a browser
make doc-check # strict gate: appends -D warnings to RUSTDOCFLAGS, fails on any rustdoc warning
make doc and make doc-open are the interactive viewers — they
build whatever they can so you can still inspect rendered output
mid-refactor. make doc-check is the strict gate that runs as part
of make pre-commit and CI (cargo doc --no-deps --workspace --all-features with RUSTDOCFLAGS extended by -D warnings); it
catches broken intra-doc links, links into private items, and other
rustdoc regressions.
Remove the --no-deps option from the underlying cargo invocation if
you also want to build the documentation of each dependency used by
big-code-analysis.
Building this book
The book you are reading lives under big-code-analysis-book/:
make book # mdbook build
make book-serve # mdbook serve with live reload
Run your code
You can run bca using:
cargo run -p big-code-analysis-cli -- [bca-parameters]
To know the list of bca parameters, run:
cargo run -p big-code-analysis-cli -- --help
You can run bca-web using:
cargo run -p big-code-analysis-web -- [bca-web-parameters]
To know the list of bca-web parameters, run:
cargo run -p big-code-analysis-web -- --help
make install, make install-cli, and make install-web invoke
cargo install --path for the respective binary crates.
Practical advice
- When you add a new feature, add at least one unit or integration test to verify that everything works correctly
- Document public API
- Do not add dead code
- Comment intricate code such that others can comprehend what you have accomplished
- Run make pre-commit before pushing; it is the same gate CI runs
Supporting a new language
This section is to help developers implement support for a new
language in big-code-analysis.
To implement a new language, two steps are required:
- Generate the grammar
- Add the grammar to big-code-analysis
A number of metrics are supported; guidance on implementing them is covered elsewhere in the documentation.
Generating the grammar
As a prerequisite for adding a new grammar, there needs to exist a tree-sitter version for the desired language that matches the version used in this project.
The grammars are generated by a project in this repository called enums. The following steps add the language support from the language crate and generate an enum file that is then used as the grammar in this project to evaluate metrics.
- Add the language-specific tree-sitter crate to the enums crate, making sure the dependency is pinned with =X.Y.Z to the same version used in the root big-code-analysis Cargo.toml. For example, for the Rust support the following line exists in /enums/Cargo.toml: tree-sitter-rust = "=0.24.2".
- Append the language to the enums crate in /enums/src/languages.rs. Keeping with Rust as the example, the line would be (Rust, tree_sitter_rust). The first parameter is the name of the Rust enum that will be generated; the second is the tree-sitter function to call to get the language's grammar.
- Add a case to the end of the match in the mk_get_language macro rule in /enums/src/macros.rs. The current convention uses the LANGUAGE constant exposed by modern grammar crates: for Rust that line is Lang::Rust => tree_sitter_rust::LANGUAGE.into().
- Lastly, execute the /recreate-grammars.sh script, which runs the enums crate to generate the grammar for the new language.
At this point we should have a new grammar file for the new language in /src/languages/. See /src/languages/language_rust.rs as an example of the generated enum.
Adding the new grammar to big-code-analysis
- Add the language specific tree-sitter crate to the big-code-analysis workspace, with the same =X.Y.Z pin as the enums crate uses. For example, for the Rust support the line in the root Cargo.toml is tree-sitter-rust = "=0.24.2".
- Next we add the new tree-sitter language namespace to /src/languages/mod.rs, e.g.
pub mod language_rust;
pub use language_rust::*;
- Lastly, we add a definition of the language to the arguments of the mk_langs! macro in /src/langs.rs.
// 1) Name for enum
// 2) Language description
// 3) Display name
// 4) Empty struct name to implement
// 5) Parser name
// 6) tree-sitter function to call to get a Language
// 7) file extensions
// 8) emacs modes
(
    Rust,
    "The `Rust` language",
    "rust",
    RustCode,
    RustParser,
    tree_sitter_rust,
    [rs],
    ["rust"]
)
Implementing traits and tests
Wiring the grammar is only the first step. The new <Lang>Code type
must also implement the AST plumbing and every metric trait the
workspace defines:
- Checker in /src/checker.rs: comment, function, closure, call, string-literal, and else-if predicates over the grammar's kind_ids.
- Getter in /src/getter.rs: get_space_kind plus the Halstead operator/operand classification table.
- Alterator in /src/alterator.rs: usually only string-literal preservation; the default impl works for most languages.
- All twelve metric traits: Abc, Cognitive, Cyclomatic, Exit, Halstead, Loc, Mi, NArgs, Nom, Npa, Npm, Wmc. Register each via the implement_metric_trait! macro invocation in /src/metrics/ to start with default (no-op) bodies, then replace with real impls for the metrics that have meaningful semantics for the language.
Audit aliased grammar variants
Tree-sitter grammars frequently emit several distinct kind_ids that
map to the same node.kind() string (Identifier /
Identifier2 / Identifier3 in Go,
InvocationExpression / InvocationExpression2 in C#,
QuotedContent ⋯ QuotedContent20 in Elixir). Every match node.kind_id() arm that touches an aliasable rule must either list
every numbered variant or compare on the string node.kind()
instead. Missing an alias silently drops nodes from the metric. See
the add-lang skill for the mechanical audit procedure and
lessons 2, 4, and 13 in
docs/development/lessons_learned.md
for the failure modes.
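The idea behind the audit can be sketched independently of the Rust code. This is an illustration, not the add-lang skill's actual procedure: `find_aliases` and `missing_from_arm` are hypothetical helpers, and the ids in `GO_LIKE` are made up (real values come from the generated enum).

```python
from collections import defaultdict


def find_aliases(kinds: dict[int, str]) -> dict[str, list[int]]:
    """Group kind ids by kind string; keep strings with more than one id."""
    by_name: dict[str, list[int]] = defaultdict(list)
    for kind_id, name in sorted(kinds.items()):
        by_name[name].append(kind_id)
    return {name: ids for name, ids in by_name.items() if len(ids) > 1}


def missing_from_arm(kinds: dict[int, str], handled: set[int]) -> dict[str, list[int]]:
    """For aliased kinds an arm partly covers, list the ids it forgot."""
    gaps: dict[str, list[int]] = {}
    for name, ids in find_aliases(kinds).items():
        absent = [i for i in ids if i not in handled]
        if absent and len(absent) < len(ids):
            gaps[name] = absent
    return gaps


# Made-up ids for illustration only.
GO_LIKE = {1: "identifier", 87: "identifier", 212: "identifier", 5: "func"}
```

Here `missing_from_arm(GO_LIKE, {1, 5})` reports ids 87 and 212 under "identifier": an arm that matched only the first variant would silently drop the other two.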
Tests
Add per-language tests under each src/metrics/*.rs test module —
aim for parity with the Rust coverage (≥ 34 tests total across the
metric files). Every insta::assert_json_snapshot! call MUST be
anchored: either with an inline expected block, a positive
assert_eq! on the headline integer accessor above it, or an
explanatory // expected: comment. make snapshot-anchors (run as
part of make pre-commit) enforces this against
.snapshot-anchor-baseline.txt.
End-to-end workflow
For an opinionated, end-to-end recipe — including the alias audit,
test layout, snapshot anchoring, and code-quality post-passes — see
the project's
add-lang
Claude Code skill. It is the canonical workflow used by recent
language additions (Elixir, PHP, C#, Bash, Go).
Lines of Code (LoC)
In this document we give some guidance on how to implement the LoC metrics available in this crate. Lines of code is a software metric that gives an indication of the size of some source code by counting the lines of the source code. There are many types of LoC so we will first explain those by way of an example.
Types of LoC
/*
Instruction: Implement factorial function
For extra credits, do not use mutable state or an imperative loop like `for` or `while`.
 */

/// Factorial: n! = n*(n-1)*(n-2)*(n-3)...3*2*1
fn factorial(num: u64) -> u64 {

    // use `product` on `Iterator`
    (1..=num).product()
}
The example above will be used to illustrate each of the LoC metrics described below.
SLOC
A straight count of all lines in the file including code, comments, and blank lines.
METRIC VALUE: 11
PLOC
A count of the instruction lines of code contained in the source code. This would include any brackets or similar syntax on a new line. Note that comments and blank lines are not counted in this.
METRIC VALUE: 3
LLOC
The "logical" lines is a count of the number of statements in the
code. Note that what a statement is depends on the language.
In the above example there is only a single statement, which is the
function call of product with the Iterator as its argument.
METRIC VALUE: 1
CLOC
A count of the comment lines in the code. The type of comment does not matter, i.e. single-line, block, or doc.
METRIC VALUE: 6
BLANK
Last but not least, this metric counts the blank lines present in the code.
METRIC VALUE: 2
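The line-based metrics above can be reproduced with a rough classifier. This is a sketch, not how the crate computes them: `count_lines` is a hypothetical helper that only understands `//` and `/* */` comments, and the layout of `SOURCE` (in particular where the two blank lines sit) is an assumption chosen to be consistent with the metric values stated above. LLOC is left out because counting statements needs a real parser.

```python
SOURCE = """\
/*
Instruction: Implement factorial function
For extra credits, do not use mutable state or an imperative loop like `for` or `while`.
 */

/// Factorial: n! = n*(n-1)*(n-2)*(n-3)...3*2*1
fn factorial(num: u64) -> u64 {

    // use `product` on `Iterator`
    (1..=num).product()
}
"""


def count_lines(source: str) -> dict[str, int]:
    """Classify each physical line as comment, blank, or code."""
    counts = {"sloc": 0, "ploc": 0, "cloc": 0, "blank": 0}
    in_block = False  # True while inside a /* ... */ comment
    for line in source.splitlines():
        text = line.strip()
        counts["sloc"] += 1
        if in_block:
            counts["cloc"] += 1
            in_block = "*/" not in text
        elif not text:
            counts["blank"] += 1
        elif text.startswith("//"):
            counts["cloc"] += 1
        elif text.startswith("/*"):
            counts["cloc"] += 1
            in_block = "*/" not in text[2:]
        else:
            counts["ploc"] += 1
    return counts


if __name__ == "__main__":
    print(count_lines(SOURCE))
```

Under that layout the classifier yields SLOC 11, PLOC 3, CLOC 6, and BLANK 2, and the three categories add up to SLOC.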
Implementation
To implement the LoC related metrics described above you need to
implement the Loc trait for the language you want to support.
This requires implementing the compute function.
See
/src/metrics/loc.rs
for where to implement, as well as examples from other languages.
Update grammars
Each programming language must be parsed in order to extract its
syntax and semantics; the rules that drive this parsing are the
so-called grammar of the language.
In big-code-analysis, we use
tree-sitter as the parsing library, since
it provides a distinct grammar for each of our supported
programming languages. A grammar is not a static monolith: it
changes over time and can contain bugs, so it needs to be
updated every now and then.
At present, because the update steps are automated with Bash scripts,
grammars can be updated natively only on Linux and macOS
systems, but these scripts can also run on Windows using WSL.
In big-code-analysis we use both third-party and internal grammars.
The former are published on crates.io and maintained by external developers,
while the latter were designed and defined inside the project to handle variants of some languages
used in Firefox.
The following sections explain how to update both of them.
Third-party grammars
Update the grammar version in Cargo.toml and enums/Cargo.toml.
Below is an example for the tree-sitter-java grammar:
tree-sitter-java = "x.xx.x"
where x represents a digit.
Run ./recreate-grammars.sh to recreate and refresh all grammar
structures and data:
./recreate-grammars.sh
Once the script has finished, fix any failing tests and other problems introduced by the grammar changes.
Commit your changes and create a new pull request.
Internal grammars
Update the version of tree-sitter-cli in the package.json file of
the internal grammar and then install the updated version.
The five vendored grammars publish under the bca-tree-sitter-*
namespace (see RELEASING.md for the rename rationale), but consumer
call sites still reference them as tree-sitter-<lang> via Cargo's
package = ... alias. A grammar refresh does not bump the leaf's
version on its own — every crate in this repository shares one
workspace-wide version, and bumping the leaves out of step with the
parent is not allowed (see the "Lockstep version policy" in
RELEASING.md). Regenerate the parser tables, accept the resulting
test-snapshot drift, and ship the change under the current version.
The next workspace release picks up the new grammars at whatever
shared version the next tag declares.
If a regeneration also needs an updated tree-sitter runtime
dependency, bump the dev-dependency line inside the leaf's
Cargo.toml:
[dev-dependencies]
tree-sitter = "=x.x.x"
Leave [package] name = "bca-tree-sitter-<lang>",
[package] version, and [lib] name = "tree_sitter_<lang>"
untouched — the rename trick in [lib] is what keeps Rust import
paths stable, and the version line is managed by the lockstep
bump at release time.
Run the appropriate script to update the grammar by recreating and refreshing every file and script.
For tree-sitter-ccomment and tree-sitter-preproc run
./generate-grammars/generate-grammar.sh followed by the name of the
grammar.
Below is an example using the tree-sitter-ccomment grammar:
./generate-grammars/generate-grammar.sh tree-sitter-ccomment
For tree-sitter-mozcpp and tree-sitter-mozjs, use their dedicated scripts instead.
For tree-sitter-mozcpp, run
./generate-grammars/generate-mozcpp.sh
For tree-sitter-mozjs, run
./generate-grammars/generate-mozjs.sh
Once the script has finished, fix any failing tests and other problems introduced by the grammar changes.
Commit your changes and create a new pull request.