big-code-analysis

big-code-analysis is a Rust library that analyzes and extracts information from source code written in many different programming languages. It is built on Tree-sitter, a parser generator tool and incremental parsing library.

You can find the source code of this software on GitHub, while issues and feature requests can be posted on the respective GitHub Issue Tracker.

Supported platforms

big-code-analysis can run on the most common platforms: Linux, macOS, and Windows.

On our GitHub Release Page you can find the Linux and Windows binaries already compiled and packed for you.

API docs

If you prefer to use big-code-analysis as a crate, you can find the API docs generated by Rustdoc here.

For task-oriented guides on embedding the crate — quick start, in-memory analysis, walking FuncSpace results, and error handling — see the Using as a Library section.

For the PyO3 bindings — pip install big-code-analysis, batch processing, flat-record iteration, SARIF output, and async patterns — see the Python Bindings section.

License

  • Mozilla-defined grammars are released under the MIT license.

  • big-code-analysis, big-code-analysis-cli and big-code-analysis-web are released under the Mozilla Public License v2.0.

Supported Languages

This is the list of programming languages parsed by big-code-analysis.

  • C
  • C++
  • C#
  • Mozcpp
  • Bash
  • Ccomment
  • Elixir
  • Preproc
  • Go
  • Groovy
  • Java
  • JavaScript
  • Kotlin
  • Lua
  • Mozjs
  • Perl
  • Php
  • Python
  • Ruby
  • Rust
  • Tcl
  • Tsx
  • Typescript

Supported Metrics

This chapter is a guided tour of every metric that big-code-analysis computes. Each section starts from the original research paper, walks through the algorithm, and explains both the way the metric was originally meant to be used and the ways the industry has actually ended up using it years later. If you are new to software metrics, read the sections in order — the later metrics (Maintainability Index in particular) are explicitly built on top of the earlier ones (Halstead, Cyclomatic, LOC).

A few framing notes before we start:

  • A metric is a measurement, not a verdict. Every number on this page summarises a structural property of source code. None of them measures correctness, productivity, or developer skill. The most important question for any metric is always "compared with what?" — the same module, a month ago; this module versus its siblings; this codebase versus an industry baseline. Absolute thresholds are rough heuristics at best.
  • Most metrics here are computed at three scopes: per function / method, per class or unit-like space, and per file. The underlying tree-sitter parser produces a tree of "spaces" (functions, closures, classes, namespaces, …) and every metric is rolled up through that tree. See the Supported Languages chapter for which scopes apply to which languages.
  • Object-oriented metrics only fire on object-oriented constructs. WMC, NPA, and NPM report 0 on a Rust file that has no impl blocks or on a Python module without classes; that is the correct answer, not a bug.

Index

Metric                                         Measures                                                First defined by
ABC                                            Size as <Assignments, Branches, Conditions>             Fitzpatrick, 1997
Cognitive Complexity                           How hard a function is to read                          Campbell / SonarSource, 2017
Cyclomatic Complexity (CC)                     Independent paths through a function                    McCabe, 1976
Halstead                                       Vocabulary-based size, difficulty, effort, bugs         Halstead, 1977
Lines of Code (SLOC, PLOC, LLOC, CLOC, BLANK)  Raw, physical, logical, comment, and blank line counts  Conte, Dunsmore & Shen, 1986
Maintainability Index (MI)                     Composite maintainability score                         Oman & Hagemeister, 1992; Coleman et al., 1994
NArgs                                          Number of arguments per function                        folk metric
NExits                                         Number of exit points per function                      structured-programming literature
NOM                                            Number of methods and closures                          Lorenz & Kidd, 1994
NPA                                            Number of public attributes                             Lorenz & Kidd, 1994
NPM                                            Number of public methods                                Lorenz & Kidd, 1994
Tokens                                         Tree-sitter leaf-token count (size proxy)               Lizard tool, Terry Yin
WMC                                            Sum of cyclomatic complexity across a class's methods   Chidamber & Kemerer, 1994

ABC

The ABC metric measures the size of a piece of code as a three-dimensional vector. Each component counts one kind of operation:

  • Assignments — anything that stores a value into a variable, including compound assignments (+=, ++) and explicit initialisation.
  • Branches — function and method calls. Despite the name, this is not the count of conditional jumps; it is the number of points where control branches out to other code.
  • Conditions — boolean tests: if, while, ternary operators, short-circuit && / ||, and the comparison operators that feed them.

The metric was introduced by Jerry Fitzpatrick in the 1997 C++ Report article Applying the ABC metric to C, C++ and Java. The current canonical specification, including the rules for what counts as an A, B, or C in modern languages, is maintained on Fitzpatrick's Software Renovation site.

Algorithm

The implementation walks every leaf node of the syntax tree exactly once. For each node it asks the per-language Abc trait implementation three yes/no questions (is this an assignment? a branch? a condition?) and increments the matching counter. The four headline values are:

  • the three components themselves, assignments, branches, conditions;
  • the magnitude |<A,B,C>| = √(A² + B² + C²), which is the way Fitzpatrick recommends summarising the vector as a single number.

The full serialised output (src/metrics/abc.rs) emits these four together with the per-component averages (assignments_average, branches_average, conditions_average) and per-component *_min / *_max at the file scope, for thirteen fields total. The metric is specialised per language in src/languages/language_*.rs.
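As a rough illustration of the magnitude formula, the following Rust sketch computes |<A,B,C>| from three counters. The AbcCounts struct and its magnitude method are hypothetical names chosen for this example; they are not taken from the crate's API.

```rust
/// Hypothetical container for the three ABC counters (not the crate's type).
#[derive(Debug, Clone, Copy, PartialEq)]
pub struct AbcCounts {
    pub assignments: u32,
    pub branches: u32,
    pub conditions: u32,
}

impl AbcCounts {
    /// Magnitude of the <A, B, C> vector: sqrt(A^2 + B^2 + C^2).
    pub fn magnitude(&self) -> f64 {
        let a = f64::from(self.assignments);
        let b = f64::from(self.branches);
        let c = f64::from(self.conditions);
        (a * a + b * b + c * c).sqrt()
    }
}
```

For example, a function with 2 assignments, 3 branches, and 6 conditions has magnitude √(4 + 9 + 36) = 7.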

How to read it

ABC is a size metric, not a complexity metric — a long, dull function with no decisions still scores high if it does a lot of assignments. Fitzpatrick's original recommendation was to use the magnitude as a relative ruler: rank a file's functions by ABC magnitude and look at the top decile.

In practice ABC ended up being most widely adopted by the Ruby community, where the rubocop linter and the flog tool both default to threshold-based warnings. A Ruby method with an ABC magnitude over about 17 is conventionally a refactoring candidate; over 30 is considered hard to maintain. Those thresholds are language-specific — expect higher values in C++ and Java, which use explicit getter/setter assignments more aggressively.

Cognitive Complexity

Cognitive Complexity was introduced by G. Ann Campbell at SonarSource in the 2017 white paper Cognitive Complexity — A new way of measuring understandability and the follow-up IEEE TechDebt 2018 paper Cognitive Complexity — An Overview and Evaluation. The white paper itself is available as CognitiveComplexity.pdf on the SonarSource site.

The metric was designed as a deliberate replacement for Cyclomatic Complexity in code-quality tooling. The argument Campbell makes is that cyclomatic complexity measures how hard code is to test, not how hard it is to understand: a 1024-arm switch statement scores the same as a deeply nested chain of ifs that perform identical logic, yet a human reader has a much harder time following the nested code.

Algorithm

Cognitive Complexity starts at zero and applies three rules as it walks the tree:

  1. Ignore "shorthand" constructs. Structures that let several statements be readably condensed into one, such as a method call or a null-coalescing operator, add nothing to the score.
  2. Penalise breaks in linear flow. Every if, else if, else, switch, try/catch, loop, jump (goto, break label, continue label), and recursive call adds at least +1.
  3. Punish nesting. Every time control flow appears inside an already-nested block, the metric adds an extra +1 per level of nesting. An if inside a for inside an outer if inside a method scores 1 + 2 + 3 = 6, where a flat sequence of the same three constructs would have scored 1 + 1 + 1 = 3.

Sequences of identical boolean operators (a && b && c) score +1 for the whole run, on the grounds that a chain of &&s is no harder to read than a single &&. Switching operators (a && b || c) is where the cognitive load jumps, so the second operator earns its own +1.
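The nesting and boolean-operator rules can be sketched in a few lines of Rust. This is a deliberately simplified model, not the code in src/metrics/cognitive.rs: the Node enum and both function names are hypothetical, every flow-breaking structure is treated alike, and language-specific special cases are ignored.

```rust
/// Hypothetical, simplified model of a function body (not the crate's AST).
pub enum Node {
    /// A flow-breaking structure (if, loop, switch, catch, ...) and its children.
    Flow(Vec<Node>),
    /// A plain statement that does not affect the score.
    Stmt,
}

/// Scores sibling nodes at the given nesting level: each flow-breaking
/// structure costs 1 plus the current nesting level.
pub fn nesting_score(nodes: &[Node], nesting: u32) -> u32 {
    nodes
        .iter()
        .map(|node| match node {
            Node::Flow(children) => 1 + nesting + nesting_score(children, nesting + 1),
            Node::Stmt => 0,
        })
        .sum()
}

/// Scores a chain of boolean operators: +1 for each run of identical operators.
pub fn boolean_chain_score(operators: &[&str]) -> u32 {
    let mut score = 0;
    let mut previous: Option<&str> = None;
    for &op in operators {
        if previous != Some(op) {
            score += 1;
        }
        previous = Some(op);
    }
    score
}
```

With this model, three structures nested inside one another score 1 + 2 + 3 = 6, three flat siblings score 3, the operators of a && b && c score 1, and those of a && b || c score 2.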

big-code-analysis exports the per-function structural score along with the file-wide sum, min, max, and a per-function average. The implementation is in src/metrics/cognitive.rs.

How to read it

A Cognitive Complexity of 0 means the function is purely linear; no branches, no loops. SonarSource's tooling defaults to flagging functions above 15 as "too complex" and Campbell's recommendation in the white paper is that a function should rarely exceed about 25. Unlike Cyclomatic Complexity, the metric scales smoothly: deeply nested code with the same number of decisions scores significantly higher than flat code with the same decisions.

The emergent use case is refactoring guidance during code review: because the metric penalises nesting specifically, it tends to flag exactly the kind of function that benefits from an early-return or "extract method" refactor. SonarLint's IDE plugins (IntelliJ, VS Code, Visual Studio, Eclipse) all surface it as the headline complexity number on hover, and the metric has since been picked up by several language servers and code-review platforms outside the Sonar ecosystem.

Cyclomatic Complexity (CC)

The original software complexity metric, introduced by Thomas J. McCabe in 1976 in A Complexity Measure (IEEE Transactions on Software Engineering, SE-2(4), pages 308–320).

McCabe's idea was to apply graph theory to the control-flow graph of a function. If you draw every basic block as a node and every jump between blocks as an edge, the cyclomatic number of that graph is

M = E − N + 2P

where E is the number of edges, N the number of nodes, and P the number of connected components. Crucially, M is also exactly the number of linearly independent paths through the function, which makes it an upper bound on the number of test cases needed to cover every branch at least once.

Algorithm

big-code-analysis does not literally build a control-flow graph. Instead it uses the equivalent, much cheaper, formulation McCabe proved in the 1976 paper for structured programs:

Cyclomatic Complexity = 1 + (number of decision points)

A "decision point" is any node where control can branch:

  • if, else if, ternary ?:
  • case / when arms in switch / match / select
  • while, do … while, every variant of for
  • exception-handler catch clauses
  • short-circuit boolean operators && and ||

The per-language Cyclomatic trait, in src/metrics/cyclomatic.rs, asks each tree-sitter node "are you a decision?" and increments the counter. The metric is rolled up per function and per file; per-class aggregation across method bodies is provided separately by WMC below.
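To make the equivalence concrete, here is a small Rust sketch of both formulations. The function names are hypothetical and exist only for this example.

```rust
/// Cyclomatic number from a control-flow graph: M = E - N + 2P.
/// Signed integers are used so that an inconsistent input cannot underflow.
pub fn cyclomatic_from_graph(edges: i64, nodes: i64, components: i64) -> i64 {
    edges - nodes + 2 * components
}

/// Shortcut for structured programs: 1 + number of decision points.
pub fn cyclomatic_from_decisions(decision_points: u32) -> u32 {
    1 + decision_points
}
```

Consider a function consisting of a single if / else. Its control-flow graph has four nodes (the condition, the two branches, and the exit), four edges, and one connected component, so M = 4 − 4 + 2·1 = 2. Counting decision points gives the same answer: 1 + 1 = 2.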

Modified cyclomatic

big-code-analysis also reports a modified variant that collapses all case / match / when arms inside a single switch statement into one decision point, regardless of how many arms it has. This tends to undercount big dispatch tables in a way that often matches developer intuition better than the strict McCabe definition — a 30-arm enum dispatch reads as one decision, not thirty. (The convention itself is not original to this project: it echoes the long-standing -m mode from Terry Yin's lizard tool, which is where many readers will first have seen it.) Both numbers are exported side by side; pick one and be consistent.
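The difference between the two variants can be sketched as follows. The DecisionCounts struct is a hypothetical tally invented for this example, and the sketch assumes every case arm is counted once in the standard variant; the crate's per-language rules may differ in detail.

```rust
/// Hypothetical per-function tallies (illustrative names, not the crate's).
pub struct DecisionCounts {
    /// if / else if / ternary / loops / catch / && / || and similar.
    pub non_switch_decisions: u32,
    /// Number of switch (or match / when) statements.
    pub switch_statements: u32,
    /// Total number of case arms across all of those statements.
    pub case_arms: u32,
}

impl DecisionCounts {
    /// Standard count: every case arm is its own decision point.
    pub fn standard(&self) -> u32 {
        1 + self.non_switch_decisions + self.case_arms
    }

    /// Modified count: each switch statement is a single decision point.
    pub fn modified(&self) -> u32 {
        1 + self.non_switch_decisions + self.switch_statements
    }
}
```

Under these assumptions, a function containing nothing but one 30-arm switch scores 31 in the standard variant and 2 in the modified one.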

How to read it

McCabe's original recommendation, repeated in the 1976 paper and preserved by NIST's Structured Testing report (Special Publication 500-235, 1996), is to treat 10 as the upper bound for a single function: above that, the number of test cases needed for branch coverage grows uncomfortably large.

The emergent uses of cyclomatic complexity have been:

  1. Defect prediction. Complexity correlates well — though imperfectly — with the probability of a function containing a bug, and most static-analysis tools flag high-CC functions as risky.
  2. Test-coverage planning. CC is an upper bound on the number of test cases needed to cover every branch, and the exact number of linearly independent paths to exercise, so test teams use it directly to budget effort.
  3. Refactor triage. Cyclomatic Complexity is the headline "complexity" number in almost every code-quality dashboard, often as a tie-breaker between two functions that look similar in length.

Be aware of the metric's well-known blind spot: it gives every decision equal weight. A 30-arm switch over an enum and a function with thirty ifs nested several levels deep both score around 30, even though they are very different reading experiences. Cognitive Complexity (above) was designed to fix exactly that.

Halstead

The Halstead suite is the oldest size-and-effort metric family on this page. Maurice H. Halstead introduced it in his 1977 book Elements of Software Science (Elsevier, ISBN 0-444-00205-7); the Wikipedia page on Halstead complexity measures summarises the formulas. Halstead's project was strikingly ambitious: he wanted a quantitative, empirical science of software in the same way that physics is the empirical science of matter.

The four base counts

Halstead reduces a program to its tokens, then partitions them into two categories:

  • Operators — anything that does something: keywords (if, return, while), arithmetic and logical operators, assignment, function-call syntax, punctuation that controls flow.
  • Operands — anything that is something: identifiers and literals.

From these you derive four base counts:

Symbol  Meaning
n1      number of distinct operators
n2      number of distinct operands
N1      total count of operator occurrences
N2      total count of operand occurrences

big-code-analysis records these four numbers in src/metrics/halstead.rs per function and per file. The per-language trait classifies tokens as operator vs. operand on a token-by-token basis; the rules deliberately exclude pure layout punctuation like parentheses and statement separators, which is why the Halstead totals are not the same as the Tokens count.

Derived metrics

Halstead then derives a small zoo of formulas. big-code-analysis reports all of the standard ones, plus three less-common derivations (estimated_program_length, purity_ratio, level) that are part of the original suite:

vocabulary               n  = n1 + n2
length                   N  = N1 + N2
estimated_program_length N̂  = n1·log2(n1) + n2·log2(n2)
purity_ratio                = N̂ / N
volume                   V  = N · log2(n)                          (bits)
difficulty               D  = (n1 / 2) · (N2 / n2)
level                    L  = 1 / D
effort                   E  = D · V          (elementary mental discriminations)
time                     T  = E / 18                               (seconds)
bugs                     B  = E^(2/3) / 3000 (estimated delivered defects)

The numeric constants come from Halstead's empirical fits against a heterogeneous corpus of CDC-era programs including FORTRAN, PL/I, and Algol-family languages. The T = E / 18 "Stroud number" is separate — it comes from psychology: Halstead borrowed John Stroud's estimate that the human mind makes about 18 elementary discriminations per second.
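The formulas above translate almost line for line into code. The following Rust sketch is an illustration only: HalsteadCounts and its method names are hypothetical, floating-point fields are used for brevity, and no guard is applied for zero counts (which would produce NaN or infinite values).

```rust
/// The four Halstead base counts (hypothetical struct, not the crate's type).
pub struct HalsteadCounts {
    pub n1: f64,     // distinct operators
    pub n2: f64,     // distinct operands
    pub big_n1: f64, // total operator occurrences (N1)
    pub big_n2: f64, // total operand occurrences (N2)
}

impl HalsteadCounts {
    pub fn vocabulary(&self) -> f64 {
        self.n1 + self.n2
    }
    pub fn length(&self) -> f64 {
        self.big_n1 + self.big_n2
    }
    pub fn estimated_program_length(&self) -> f64 {
        self.n1 * self.n1.log2() + self.n2 * self.n2.log2()
    }
    pub fn purity_ratio(&self) -> f64 {
        self.estimated_program_length() / self.length()
    }
    /// Volume in bits: N * log2(n).
    pub fn volume(&self) -> f64 {
        self.length() * self.vocabulary().log2()
    }
    pub fn difficulty(&self) -> f64 {
        (self.n1 / 2.0) * (self.big_n2 / self.n2)
    }
    pub fn level(&self) -> f64 {
        1.0 / self.difficulty()
    }
    pub fn effort(&self) -> f64 {
        self.difficulty() * self.volume()
    }
    /// Time in seconds, using the Stroud number 18.
    pub fn time(&self) -> f64 {
        self.effort() / 18.0
    }
    pub fn bugs(&self) -> f64 {
        self.effort().powf(2.0 / 3.0) / 3000.0
    }
}
```

As a worked example, take n1 = 4, n2 = 4, N1 = 8, N2 = 8. Then n = 8 and N = 16, so V = 16 · log2(8) = 48 bits, D = (4 / 2) · (8 / 4) = 4, E = 4 · 48 = 192, and T = 192 / 18 ≈ 10.7 seconds.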

How to read it

Halstead's original intent was to predict three things about a program before it was even written: how big it would be in bits, how long it would take to implement, and how many bugs to expect in deployment. The empirical evidence for the volume and length predictions is reasonable; the time and bugs predictions are more controversial and have been criticised at length, notably in the Purdue technical report Software Science Revisited.

In modern practice the Halstead numbers are used for three things:

  1. As inputs into composite metrics, most importantly the Maintainability Index (below), which depends on Halstead volume.
  2. As a language-independent size proxy: volume in bits scales smoothly across languages in a way that LOC does not.
  3. For comparative effort budgeting: when two refactoring candidates have similar cyclomatic complexity, the one with the higher Halstead difficulty is the one more likely to introduce regressions.

Lines of Code

This section covers the five LOC variants — SLOC, PLOC, LLOC, CLOC, and BLANK. "Counting lines" sounds trivial until you have to define exactly what counts. The five variants below are the de-facto standard breakdown, going back to Samuel Conte, Hubert Dunsmore and Vincent Shen's 1986 textbook Software Engineering Metrics and Models (Benjamin/Cummings, ISBN 0-8053-2162-4), which codified the distinction between physical and logical lines. The OpenStaticAnalyzer project maintains a readable summary of the modern definitions.

Variant  Counts
SLOC     Source Lines Of Code: every line in the file, comments, blanks, and code alike
PLOC     Physical Lines Of Code: non-blank, non-comment-only lines
LLOC     Logical Lines Of Code: statement-bearing lines (definitions, assignments, declarations)
CLOC     Comment Lines Of Code: lines that contain a comment (with or without code on the same line)
BLANK    Blank lines: whitespace-only lines

Algorithm

big-code-analysis derives all five counts from a single pass over the tree-sitter syntax tree (see src/metrics/loc.rs). Comments and strings are identified by their AST node type rather than by lexical scanning, so multi-line strings, raw strings, doc comments, and string interpolations are all handled correctly. The per-language Loc trait specifies which node kinds count as a "statement" for LLOC; this is the subtle one, because what counts as a statement is language-defined.

The five counts satisfy a couple of useful identities:

SLOC = PLOC + BLANK + (lines that are comment-only)
CLOC ≥ (lines that are comment-only)        # CLOC also counts mixed code+comment lines
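The identities are easy to check on a toy classifier. Note that the sketch below is not how big-code-analysis works: it scans lines lexically instead of using the syntax tree, recognises only // line comments, and assumes that // never appears inside a string literal. The LineCounts struct and count_lines function are hypothetical names for this example.

```rust
/// Line counts produced by the simplified classifier (hypothetical type).
#[derive(Debug, Default, PartialEq)]
pub struct LineCounts {
    pub sloc: usize,
    pub ploc: usize,
    pub cloc: usize,
    pub blank: usize,
    pub comment_only: usize,
}

/// Classifies each line lexically, recognising only `//` line comments.
/// ASSUMPTION: no `//` inside string literals and no block comments.
pub fn count_lines(source: &str) -> LineCounts {
    let mut counts = LineCounts::default();
    for line in source.lines() {
        counts.sloc += 1;
        let trimmed = line.trim();
        if trimmed.is_empty() {
            counts.blank += 1;
        } else if trimmed.starts_with("//") {
            counts.cloc += 1;
            counts.comment_only += 1;
        } else {
            counts.ploc += 1;
            if trimmed.contains("//") {
                counts.cloc += 1; // mixed code + comment line
            }
        }
    }
    counts
}
```

For a four-line input made of one comment-only line, one blank line, one code line with a trailing comment, and one plain code line, this yields SLOC = 4, PLOC = 2, BLANK = 1, and CLOC = 2, so SLOC = PLOC + BLANK + 1 holds and CLOC exceeds the comment-only count by the one mixed line.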

How to read it

  • SLOC is what most people mean colloquially by "lines of code". It is the canonical size proxy and the size measure used inside the Maintainability Index formulas below, but it is sensitive to formatting and not portable across language conventions.
  • PLOC strips away the visual noise of blank and comment-only lines.
  • LLOC is the most reliable statement count. It is the right measure if you are budgeting test cases per statement, or comparing the density of a Python file against a Java file.
  • CLOC, combined with PLOC, gives you a comment density: CLOC / PLOC is a useful rough proxy for how much of the file is documentation versus implementation.
  • BLANK is mostly diagnostic: a file with very low BLANK proportion is often hard to read.

The emergent uses of LOC variants go well beyond raw size. They are the most common input into cost-estimation models (COCOMO and COCOMO II both use KSLOC, thousands of source lines, as their base unit), they feed effort prediction in product-portfolio dashboards, and they are used as a normalising denominator for almost every other metric: defects per KSLOC, churn per KSLOC, test cases per KSLOC. The weakness is that LOC is easy to game, and differences in coding style alone can produce a 2× difference in LOC for the same logic; that is the reason this chapter has so many other metrics in it.

Maintainability Index (MI)

The Maintainability Index is a composite metric that rolls several of the metrics above into a single 0-to-100ish number meant to be read as "how maintainable is this code?". It was proposed by Paul Oman and Jack Hagemeister in their 1992 ICSM paper Metrics for assessing a software system's maintainability and refined by Don Coleman, Dan Ash, Bruce Lowther, and Paul Oman in the 1994 IEEE Computer paper Using metrics to evaluate software system maintainability (IEEE Computer 27(8), pages 44-49). Their methodology was empirical: they collected expert maintainability ratings on a handful of production Hewlett-Packard systems, computed forty candidate metrics on each, and let regression analysis pick the best linear combination. The combination that survived used Halstead volume, cyclomatic complexity, lines of code, and comment density.

big-code-analysis reports the three formulas that have stuck in practice:

mi_original      = 171 − 5.2·ln(HV) − 0.23·CC − 16.2·ln(SLOC)
mi_sei           = 171 − 5.2·log2(HV) − 0.23·CC − 16.2·log2(SLOC) + 50·sin(√(2.4·comment_ratio))
mi_visual_studio = max(0, mi_original · 100 / 171)

  • mi_original is the Coleman–Oman formula. It can be negative for pathological files.
  • mi_sei is the Software Engineering Institute's refinement, which adds a comment-density term — the sin(√(...)) shape was chosen so that some comments help, but adding more after a point does not.
  • mi_visual_studio is the linear rescaling Microsoft chose for Visual Studio, where the score is clamped to [0, 100] and shown to developers traffic-light style: green ≥ 20, yellow ≥ 10, red below.

The historical context, and a sharp critique of the metric, can be found in Arie van Deursen's blog post Think Twice Before Using the Maintainability Index.

Algorithm

The implementation is purely arithmetic: src/metrics/mi.rs consumes the already-computed Halstead, Cyclomatic, and LOC metrics and applies the three formulas. Because the formulas take logarithms of Halstead volume and SLOC (natural log for mi_original, log2 for mi_sei), MI is undefined for empty files; big-code-analysis returns 0.0 for any file with zero SLOC or zero Halstead volume.
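A minimal Rust sketch of the three formulas, including the zero guard described above, might look like this. The function signatures are hypothetical, and comment_ratio is assumed here to be a fraction between 0 and 1 (for instance CLOC / SLOC); check src/metrics/mi.rs for the exact definition the crate uses.

```rust
/// mi_original: the Coleman-Oman formula, using natural logarithms.
pub fn mi_original(halstead_volume: f64, cyclomatic: f64, sloc: f64) -> f64 {
    if halstead_volume <= 0.0 || sloc <= 0.0 {
        return 0.0; // logarithm undefined: report 0.0 as described above
    }
    171.0 - 5.2 * halstead_volume.ln() - 0.23 * cyclomatic - 16.2 * sloc.ln()
}

/// mi_sei: base-2 logarithms plus the comment-density term.
/// ASSUMPTION: `comment_ratio` is a fraction in [0, 1].
pub fn mi_sei(halstead_volume: f64, cyclomatic: f64, sloc: f64, comment_ratio: f64) -> f64 {
    if halstead_volume <= 0.0 || sloc <= 0.0 {
        return 0.0;
    }
    171.0 - 5.2 * halstead_volume.log2() - 0.23 * cyclomatic - 16.2 * sloc.log2()
        + 50.0 * (2.4 * comment_ratio).sqrt().sin()
}

/// mi_visual_studio: mi_original rescaled by 100 / 171 and floored at 0.
pub fn mi_visual_studio(halstead_volume: f64, cyclomatic: f64, sloc: f64) -> f64 {
    (mi_original(halstead_volume, cyclomatic, sloc) * 100.0 / 171.0).max(0.0)
}
```

The boundary case makes the scaling easy to see: with HV = 1, CC = 0, and SLOC = 1 every logarithm is zero, so mi_original is 171 and mi_visual_studio is 100. A very large file, by contrast, can push mi_original below zero while mi_visual_studio stays at 0.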

How to read it

MI was originally designed as a portfolio-level score: "how much maintenance pain should we expect from this codebase over the next year?". It is fairly stable across releases of a healthy system and tends to drop measurably before a system enters the "legacy" quadrant.

The emergent use case is the Visual Studio traffic-light rendering: every C# developer who has hovered a method in the IDE has seen the green / yellow / red icon, and the underlying number is mi_visual_studio. This made MI by far the most user-facing software metric for an entire generation of .NET developers, which is also why it is the metric that has attracted the most criticism. Treat it as a smoke detector, not a thermostat: a sudden drop is a useful signal, but the absolute number is noisy.

NArgs

NArgs counts the number of arguments declared by a function, method, or closure. The metric does not have a famous origin paper — it is folk wisdom dating to at least Kernighan and Plauger's The Elements of Programming Style (1974) and prominently re-stated in Robert C. Martin's Clean Code (2008), which suggests three arguments as a soft ceiling.

big-code-analysis splits the count by callable kind: every aggregate is reported separately for functions and closures so a Rust file heavy on |…| … closures and a Java file with only methods produce comparable numbers. The serialised output (src/metrics/nargs.rs) is total_functions, total_closures, average_functions, average_closures, total, average, functions_min, functions_max, closures_min, closures_max. The implementation handles default arguments, variadic arguments, keyword-only arguments, and destructured parameters consistently per language.

How to read it

A function with many arguments is hard to call correctly and even harder to test exhaustively — the test matrix grows roughly exponentially. The classic refactoring advice is the introduce parameter object pattern: when a function takes more than four related arguments, group them into a record / struct / dataclass.

The emergent use is as a review-blocking lint rule: most modern linters (pylint's R0913, ESLint's max-params, Checkstyle's ParameterNumber) flag functions with more than a configurable threshold. NArgs is also a useful component of API-design dashboards: public APIs whose average NArgs has crept upward over time tend to be ones that have accreted "just one more parameter" feature flags.

NExits

NExits counts the number of distinct exit points from a function — every return, every throw / raise, and the implicit fall-through return at the end of a void function.

The metric goes back to the structured-programming literature of the 1970s, where Edsger Dijkstra and others argued that functions should have a single entry and a single exit point (the "SESE" rule). Modern thinking is much more nuanced — see Steve McConnell's Code Complete, 2nd edition (Microsoft Press, 2004), which explicitly recommends early returns as a clarity-improving pattern when they reduce nesting.

big-code-analysis walks each function's syntax tree, identifies the language-specific exit nodes (see the per-language Exit trait in src/metrics/exit.rs), and reports per-function counts plus file-level sum, average, min, and max. The serialised field name is nexits, matching the prose acronym used here.

How to read it

Strict SESE coding standards (DO-178C for avionics, MISRA C for embedded automotive — see MISRA's official site) still require an NExits of 1 per function, because multiple exit points complicate certified control-flow analysis. Outside those domains, an NExits of 2-4 is usually a good sign — it almost always means the function uses guard clauses to handle preconditions and then proceeds in a flat body.

A very high NExits — say above 8 — is the warning sign. It usually means the function should have been split into several smaller functions, with each "successful branch" becoming its own helper.

NOM

NOM stands for Number Of Methods and counts every function, method, and closure defined inside a given scope (file, class, or namespace). For object-oriented codebases it is one of the first metrics introduced by Mark Lorenz and Jeff Kidd in their 1994 book Object-Oriented Software Metrics (Prentice Hall, ISBN 0-13-179292-X), where it is treated as the primary class-size indicator.

big-code-analysis reports the count split by callable kind in src/metrics/nom.rs. The serialised fields are functions, closures, functions_average, closures_average, total, average (overall average across containing spaces), and per-kind functions_min, functions_max, closures_min, closures_max.

The split lets you ask different questions of the same code: a Rust crate with many closures and few functions is typical of iterator-heavy code; a Python module with many functions and few closures is typical of script-style code.

How to read it

NOM is the input to several other metrics — WMC sums cyclomatic complexity across the same set of methods that NOM counts, and NPM filters that same set down to public methods. As a standalone metric, the Lorenz–Kidd recommendation is ≤ 20 methods per class. The emergent use is as a God-class detector: a class with NOM in the dozens is almost always doing too much, and is a strong candidate for "extract collaborator" refactoring as documented in Martin Fowler's Refactoring catalogue entry on Large Class.

NPA

NPA counts the number of public attributes (a.k.a. fields, properties, instance variables) declared by a class or interface. It is part of the metric family introduced by Lorenz and Kidd in Object-Oriented Software Metrics (1994) and was later folded into the MOOD ("Metrics for Object-Oriented Design") suite proposed by Brito e Abreu and Carapuça (1994).

big-code-analysis splits the count by definition-site kind: classes (concrete types with state) and interfaces (abstract contracts). The serialised output (src/metrics/npa.rs) is classes (sum of NPA across all classes), interfaces (sum across interfaces), class_attributes (sum of all attributes — public or not — across classes), interface_attributes, classes_average (class density of public attributes), interfaces_average, total, total_attributes, and average. The per-language Npa trait decides what counts as "public" (Java public, C# public, Rust pub, Python's "no leading underscore" convention, …) and what counts as "attribute" rather than "method".

How to read it

NPA is a direct measure of encapsulation. Every public attribute is a piece of internal state that callers can read or write without going through a method, which means it is a piece of internal state the class cannot validate or evolve without breaking callers. The canonical guidance — first explicitly stated in Bertrand Meyer's Object-Oriented Software Construction (Prentice Hall, 1988) and known as the Uniform Access Principle — is to keep NPA at or near zero and to expose state through public methods instead.

The emergent use is API-stability auditing: a public library class whose NPA grows over time accumulates breaking-change liability faster than its public-method surface.

NPM

NPM counts the number of public methods declared by a class or interface. It is the method-side companion to NPA and was again codified by Lorenz and Kidd (1994).

As with NPA, big-code-analysis splits NPM by definition-site kind (classes vs. interfaces). The serialised output (src/metrics/npm.rs) is classes (sum of NPM across classes), interfaces, class_methods (sum of all methods — public or not — across classes), interface_methods, classes_average, interfaces_average, total, total_methods, and average. The language-specific Npm trait decides what counts as public — for example, Rust's pub, Python's leading-underscore convention, C++'s public: section — and folds together regular methods, constructors, and operator overloads as appropriate.

NPM is also one of the inputs into Mark Hitz and Behzad Montazeri's Class Interface Size metric, and into Chidamber and Kemerer's Response For a Class (RFC).

How to read it

NPM is the public interface size. A class with NPM in the dozens is a class with too large an API contract: every public method is something callers can come to depend on, and every change to it is a breaking change. The Lorenz–Kidd guidance is ≤ 20 public methods per class, with anything over 40 being considered a strong refactoring candidate. The same rule applies particularly forcefully to interfaces in Java and C#, where the contract really is the shape clients pin against.

The emergent use is as a public-API change tracker for libraries: monitoring NPM at the package level catches accidental expansion of a library's surface area in the same way that NPA catches accidental exposure of internal fields.

Tokens

Tokens is a per-function and per-file count of the tree-sitter leaf tokens — identifiers, literals, keywords, punctuation — excluding any token whose AST ancestor is a comment node. It is a modern, lexer-driven size proxy intended as a more formatting-resilient alternative to LOC. (The same idea is well known from Terry Yin's lizard command-line tool, which is where many readers will first have seen a token-count metric.)

The implementation lives in src/metrics/tokens.rs. Because Tokens counts every leaf, including punctuation that Halstead deliberately skips, the value will not equal Halstead N1 + N2, and because it counts tokens rather than lines it is not equivalent to any LOC variant. Whitespace-only reformatting does not change Tokens; renaming a variable does not change the count; removing a comment does not change Tokens. Edits that change the tokens themselves — adding an if, adding optional braces around a single-statement block, or inserting/removing semicolons in a language where they are optional — do change the count.
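The invariances described above can be demonstrated with a toy lexer. This sketch is not the tree-sitter based implementation: naive_token_count is a hypothetical helper that treats each run of alphanumeric or underscore characters as one token and every other non-whitespace character as one token, and it assumes the input contains no comments, strings, or multi-character operators.

```rust
/// Counts tokens with a deliberately naive lexer: a run of alphanumeric or
/// `_` characters is one token, and every other non-whitespace character is
/// one token. ASSUMPTION: no comments, strings, or multi-character operators.
pub fn naive_token_count(source: &str) -> usize {
    let mut count = 0;
    let mut in_word = false;
    for ch in source.chars() {
        if ch.is_alphanumeric() || ch == '_' {
            if !in_word {
                count += 1; // first character of a new word token
                in_word = true;
            }
        } else {
            in_word = false;
            if !ch.is_whitespace() {
                count += 1; // punctuation counts as its own token
            }
        }
    }
    count
}
```

Under this model, if (a) { b(); } and its compressed form if(a){b();} both contain 10 tokens, renaming a and b leaves the count unchanged, and dropping the optional braces reduces it to 8.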

How to read it

Tokens is the most formatting-resilient size proxy in the suite. It is the right size measure to use when you are normalising another metric across languages or across teams with different style conventions — bugs per KSLOC is sensitive to formatting, while bugs per 1000 tokens is much less so.

The emergent use is as the defect-density denominator of choice in cross-language research: a 1000-line Java file and a 1000-line Lisp file contain very different amounts of code, but a 1000-token slice of each contains roughly the same amount of information. This makes Tokens particularly useful for machine-learning code-quality models that train across many languages.

WMC

WMC (Weighted Methods per Class) is the first metric in the Chidamber and Kemerer suite, introduced in their 1994 IEEE Transactions on Software Engineering paper A Metrics Suite for Object Oriented Design (volume 20, issue 6, pages 476-493). The CK suite (WMC, DIT, NOC, CBO, RFC, LCOM) is the single most-cited collection of OO metrics in the academic literature; big-code-analysis currently implements WMC and the simpler size metrics (NOM, NPA, NPM), with the inheritance- and coupling-based ones tracked for future work.

WMC is the sum of the cyclomatic complexity of every method defined in a class. The original paper deliberately left the "weighting" abstract — Chidamber and Kemerer wrote that "if all method complexities are considered to be unity, then WMC = n, the number of methods" — but the empirical follow-up literature has almost universally settled on cyclomatic complexity as the weight, and that is what big-code-analysis uses.

Algorithm

For each class or interface found by the per-language parser, big-code-analysis sums the standard cyclomatic complexity of every method body inside it (src/metrics/wmc.rs). The file-level serialised output is three fields: classes (sum of WMC across all classes in the file), interfaces (sum across interfaces), and total (the two combined). No min/max/average aggregation is emitted at the file scope — to rank individual classes by WMC, use the report subcommand, which surfaces a WMC hotspots section (see Commands → Report).
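The aggregation described above can be sketched in a few lines. This is an illustration only, not the code in src/metrics/wmc.rs: the input shape (a mapping from class or interface name to the cyclomatic complexity of each method) and the function names are assumptions, and the numbers are made up.

```python
from typing import Dict, List


def wmc(method_complexities: List[int]) -> int:
    """WMC of one class: the sum of its methods' cyclomatic complexities."""
    return sum(method_complexities)


def file_level_wmc(
    classes: Dict[str, List[int]],
    interfaces: Dict[str, List[int]],
) -> Dict[str, int]:
    """Mirror the three file-level fields: classes, interfaces, total."""
    class_sum = sum(wmc(methods) for methods in classes.values())
    interface_sum = sum(wmc(methods) for methods in interfaces.values())
    return {
        "classes": class_sum,
        "interfaces": interface_sum,
        "total": class_sum + interface_sum,
    }


# Hypothetical file with two classes and one interface.
example = file_level_wmc(
    classes={"Parser": [1, 4, 7], "Lexer": [2, 2]},
    interfaces={"Visitor": [1, 1]},
)
print(example)
```

With every method weighted as 1, wmc reduces to the method count, which is the unity case Chidamber and Kemerer describe.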

How to read it

Chidamber and Kemerer offered three hypotheses about WMC, all of which have been validated repeatedly since:

  1. Higher WMC predicts higher maintenance effort. A class whose methods are individually complex will resist comprehension.
  2. Higher WMC reduces reuse. Classes that do many complicated things are hard to drop into a new context.
  3. Higher WMC suggests broader application-specific behaviour. Such classes tend to be "main loop"-style coordinators rather than reusable building blocks.

The emergent use is God-class detection: combined with NOM, WMC is one of the clearest signals that a class needs to be split. A class with high NOM but low WMC is a passive data holder (probably fine). A class with low NOM and high WMC has a few gargantuan methods (split the methods, not the class). A class with both high NOM and high WMC is the classic God class.
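The three cases above amount to a small decision table. The sketch below is one way to express it; the cut-off values nom_limit and wmc_limit are hypothetical defaults chosen for illustration, not thresholds defined by big-code-analysis or by the CK paper.

```python
def classify_class(nom: int, wmc: int,
                   nom_limit: int = 20, wmc_limit: int = 50) -> str:
    """Label a class from its NOM and WMC (limits are illustrative only)."""
    many_methods = nom > nom_limit
    high_complexity = wmc > wmc_limit
    if many_methods and high_complexity:
        return "god class: consider splitting the class"
    if many_methods:
        return "passive data holder: probably fine"
    if high_complexity:
        return "few large methods: split the methods, not the class"
    return "unremarkable"


print(classify_class(nom=45, wmc=180))
print(classify_class(nom=30, wmc=30))
print(classify_class(nom=4, wmc=90))
```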


Where to go next

  • The Supported Languages chapter lists which metrics fire for which languages — language coverage varies because some metric definitions (NPA, NPM, WMC) only make sense in languages with classes.
  • The Commands → Metrics page documents how to invoke bca metrics to produce the JSON / YAML / TOML / CBOR output for any of these numbers.
  • The Recipes chapter shows end-to-end examples of producing quality reports from these metrics, including pipelining them into dashboards.

Migration: Flag CLI to Subcommand CLI

The CLI was restructured from a flat flag-style interface (one process, many mutually-exclusive --action flags) into a subcommand-style interface (bca <verb>). This page maps every old invocation to its replacement.

Why the change

The flag CLI overloaded --output-format with two unrelated meanings: per-file serialization (-O json/yaml/toml/cbor) and a post-walk aggregated report (-O markdown). It needed two clap ArgGroups plus runtime checks to police invalid combinations, and --top / --strip-prefix lived as global flags that only applied to one format. Future aggregated formats (e.g. HTML) would compound the fragility.

The subcommand CLI fixes the structure: bca metrics and bca ops emit per-file output; bca report <FORMAT> emits an aggregated report; each verb has its own scoped flag set.

Migration mapping

Old | New
--metrics -O markdown (+ --top, --strip-prefix) | report markdown
--metrics -O json/yaml/toml/cbor | metrics -O json/yaml/toml/cbor
--metrics -O checkstyle/sarif/code-climate/clang-warning/msvc-warning | check --threshold ... --output-format <fmt> [--output FILE]
--ops -O ... | ops -O ...
--dump | dump
--find <NODE> | find <NODE> [<NODE>...]
--count <LIST> | count <NODE> [<NODE>...]
--function | functions
--comments [--in-place] | strip-comments [--in-place]
--preproc <FILE> <FILE>... (producer) | preproc -o <OUT>
--preproc <FILE> (consumer) | --preproc-data <FILE> (global)
--list-metrics [MODE] | list-metrics [MODE]
--pr (pretty) | --pretty (on metrics and ops)
-p, -I, -X, -j, -l, --ls, --le, -w | unchanged; global

Side-by-side examples

Aggregated markdown report

# OLD
big-code-analysis-cli \
    --metrics \
    --paths "$PWD" \
    --output-format markdown \
    --num-jobs $(nproc) \
    --top 20 \
    --strip-prefix "$PWD/"

# NEW
bca \
    --paths "$PWD" \
    --num-jobs $(nproc) \
    report markdown \
    --top 20 \
    --strip-prefix "$PWD/"

Per-file metric extraction

# OLD
big-code-analysis-cli --metrics --paths ./src --output-format json --output ./out/

# NEW
bca --paths ./src metrics -O json --output ./out/

Per-file ops extraction

# OLD: big-code-analysis-cli --ops --paths ./src -O json -o ./out/
# NEW: bca --paths ./src ops -O json -o ./out/

AST dump

# OLD: big-code-analysis-cli --dump --paths ./file.rs
# NEW: bca --paths ./file.rs dump

Find / count nodes

# OLD: big-code-analysis-cli --find call_expression --paths ./src
# NEW: bca --paths ./src find call_expression

# OLD: big-code-analysis-cli --count if_statement,for_statement --paths ./src
# NEW: bca --paths ./src count if_statement for_statement

Note: count now takes one node type per positional argument (space separated) rather than one comma-separated string.
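If a wrapper script still builds the old comma-separated value, the conversion is mechanical. A minimal sketch follows; the helper name migrate_count_args is hypothetical, and it only rewrites the argument list without invoking bca.

```python
from typing import List


def migrate_count_args(old_list: str) -> List[str]:
    """Turn the old comma-separated --count value into new positional args."""
    nodes = [node.strip() for node in old_list.split(",") if node.strip()]
    return ["count", *nodes]


# The value previously passed as: --count if_statement,for_statement
print(migrate_count_args("if_statement,for_statement"))
```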

Function spans

# OLD: big-code-analysis-cli --function --paths ./src
# NEW: bca --paths ./src functions

Strip comments

# OLD: big-code-analysis-cli --comments --in-place --paths ./src
# NEW: bca --paths ./src strip-comments --in-place

Preproc data — producer

# OLD
big-code-analysis-cli --metrics --preproc a.h --preproc b.h \
    --paths ./src -o /tmp/p.json

# NEW
bca --paths ./src preproc -o /tmp/p.json

Preproc data — consumer

# OLD
big-code-analysis-cli --metrics --preproc /tmp/p.json \
    --paths ./src -O json -o ./out/

# NEW
bca --paths ./src --preproc-data /tmp/p.json \
    metrics -O json -o ./out/

List metrics

# OLD: big-code-analysis-cli --list-metrics descriptions
# NEW: bca list-metrics descriptions

Migration hint at runtime

If you run a legacy invocation, the CLI prints a hint identifying the recognized old flags and their new equivalents before clap's own error. For example:

$ bca --metrics -O markdown
note: the CLI was restructured into subcommands. See migration.md for the full mapping.
  --metrics  ->  bca metrics
  -O markdown  ->  bca report markdown [--top N] [--strip-prefix P]
  Run `bca --help` for the new command list.

error: unexpected argument '--metrics' found

Commands

bca offers a range of commands to analyze and extract information from source code. Each command may include parameters specific to the task it performs. Below, we describe the core types of commands available in bca.

Metrics

Metrics provide quantitative measures about source code, which can help in:

  • Comparing different programming languages
  • Providing information on the quality of the code
  • Telling developers where their code is harder to handle
  • Discovering potential issues early in the development process

big-code-analysis calculates the metrics starting from the source code of a program. Metrics of this kind are called static metrics.

Nodes

To represent the structure of program code, bca builds an Abstract Syntax Tree (AST). A node is an element of this tree and denotes any syntactic construct present in a language.

Nodes can be used to:

  • Create the syntactic structure of a source file
  • Discover if a construct of a language is present in the analyzed code
  • Count the number of constructs of a certain kind
  • Detect errors in the source code

REST API

bca-web runs a server offering a REST API. This allows users to send source code via HTTP and receive corresponding metrics in JSON format.

Skipping generated code

Generated bindings (protobuf stubs, OpenAPI clients, lex/yacc output, build-system plumbing) inflate metrics for code no human will refactor. By default, bca scans the first ~50 lines / 5 KiB of each file for a generated-code marker and skips matches before parsing, so the skipped file pays no tree-sitter parse cost.

Recognized markers (case-insensitive):

  • @generated — Facebook / Meta convention; also emitted by buck2, rustfmt, prettier, and many code generators.
  • DO NOT EDIT — Go's // Code generated by … DO NOT EDIT. is the canonical form; the bare phrase is also widely copied (Bazel, protoc, OpenAPI clients).
  • GENERATED CODE — Lizard's marker, recognized for compatibility.

A marker phrase that appears only deep in the file body (past the scan window) does not trigger the skip — the detector deliberately looks only at the file header.
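The header scan can be pictured as follows. This is a sketch under stated assumptions rather than bca's detector: the text above only says "~50 lines / 5 KiB", so applying both limits together (whichever ends the window first) and matching by plain substring are guesses, and looks_generated is a hypothetical name.

```python
# Marker phrases from the list above, lower-cased for case-insensitive matching.
MARKERS = ("@generated", "do not edit", "generated code")
MAX_LINES = 50          # approximate window, per the description above
MAX_BYTES = 5 * 1024


def looks_generated(data: bytes) -> bool:
    """Return True when a marker appears inside the header window."""
    head = data[:MAX_BYTES]                      # byte limit first
    lines = head.splitlines()[:MAX_LINES]        # then line limit
    text = b"\n".join(lines).decode("utf-8", errors="replace").lower()
    return any(marker in text for marker in MARKERS)


print(looks_generated(b"// Code generated by protoc. DO NOT EDIT.\npackage x\n"))
print(looks_generated(b"\n" * 100 + b"// @generated\n"))
```

The second call returns False because the marker sits past the scan window, which matches the behaviour described above for phrases deep in the file body.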

The skip applies uniformly to bca metrics, bca report, and the threshold engine.

Flags

  • --no-skip-generated — disable the auto-skip and restore the previous behavior (every file is parsed).
  • --report-skipped — log skipped (generated): <path> to stderr for each file the detector excludes, so you can audit the exclusions and add an explicit include if a file was wrongly tagged.

Respecting .gitignore

When a directory is passed to --paths, bca walks it with .gitignore awareness by default. Files matched by any of the following are skipped before parsing:

  • .gitignore files inside the walked tree.
  • .ignore files (the ripgrep / fd convention).
  • .git/info/exclude.
  • The global gitignore (~/.config/git/ignore, or whatever core.excludesFile points at).
  • .gitignore files in ancestor directories of the seed (so bca --paths src/ from a project root picks up the project's top-level .gitignore).

The walker honors .gitignore even outside a checked-in git repository, so an extracted source tarball with a .gitignore file gets the same treatment as a fresh git clone.

Hidden files (those whose basename starts with .) are filtered during the walk, matching the previous behavior.

Explicit paths bypass the filter

Files passed by name — via --paths or --paths-from — are always analyzed, even when they would be excluded by .gitignore. This makes it safe to do bca metrics --paths-from - from git diff --name-only-style pipelines without losing files that happen to be covered by a wildcard ignore rule.

Path discovery flags

  • --no-ignore — disable .gitignore / .ignore / global-gitignore awareness when expanding directory seeds.
  • --paths-from <FILE> — read newline-separated input paths from <FILE>, or from stdin when <FILE> is -. Combined as a union with any --paths values; -I / -X globs still apply. Blank lines are skipped; # is treated as a path character (not a comment). To pass a file literally named -, write ./-.
  • --exclude-from <FILE> — read newline-separated --exclude glob patterns from <FILE>, or from stdin when <FILE> is -. Patterns are unioned with any inline --exclude / -X values into a single deny-set; order does not matter. .gitignore-style: blank lines and lines whose first non-whitespace character is # are skipped, and a leading UTF-8 BOM is stripped. Convention is a .bcaignore at the repo root, mirroring .gitignore / .dockerignore. To pass a file literally named -, write ./-.
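The two list files follow different line rules, which is easy to get wrong when generating them from a script. The sketch below restates those rules in code; it is not bca's parser, the function names are hypothetical, and leaving kept lines untrimmed is an assumption.

```python
from typing import List


def parse_paths_from(text: str) -> List[str]:
    """--paths-from rules: skip blank lines; '#' is an ordinary path character."""
    return [line for line in text.splitlines() if line.strip()]


def parse_exclude_from(text: str) -> List[str]:
    """--exclude-from rules: strip a leading BOM, skip blanks and '#' lines."""
    if text.startswith("\ufeff"):
        text = text[1:]
    patterns = []
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue
        patterns.append(line)
    return patterns


print(parse_paths_from("src/a.rs\n\n#notes.md\n"))
print(parse_exclude_from("\ufeff# vendored code\n\nvendor/**\n*.pb.go\n"))
```

Note that "#notes.md" survives as a path in the first call, while the "# vendored code" line is dropped as a comment in the second.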

Metrics

bca metrics computes per-file metrics and emits them either to stdout or to a directory of structured files.

Migrating? This command replaces the pre-restructure --metrics flag. The aggregated report previously selected with -O markdown now lives under bca report, and the CI/IDE offender formats (Checkstyle, SARIF, code-climate, clang-warning, msvc-warning) moved to bca check --output-format <fmt>. See the migration guide.

Display metrics

To compute and display metrics for a given file or directory, run:

bca --paths /path/to/your/file/or/directory metrics
  • --paths (or -p): file or directory to analyze. If a directory is provided, metrics are computed for every supported file it contains.

Exporting metrics

bca metrics supports five per-file output formats:

  • CBOR
  • CSV
  • JSON
  • TOML
  • YAML

Both JSON and TOML can be exported as pretty-printed.

The three top-level output kinds map to three separate commands so each one stays consistent with its data model:

Command | Output | Audience
bca metrics | Per-file metric trees | Downstream tooling
bca report | Aggregated quality dashboards | Humans / PRs
bca check | Threshold-violation reports | CI / IDE

The CI/IDE offender formats (Checkstyle, SARIF, code-climate, clang-warning, msvc-warning) used to live on bca metrics -O <fmt>. They moved to bca check --output-format <fmt> in #235 because their input is a list of threshold violations, not the per-file metric tree that the other formats above carry. See the bca check chapter for the new invocation.

Export command

To export metrics as JSON files:

bca --paths /path/to/your/file/or/directory metrics \
    -O json -o /path/to/output/directory
  • -O, --output-format: per-file output format (cbor, csv, json, toml, yaml).
  • -o, --output: directory to save output files. Filenames mirror the input file plus the format extension. If omitted, results are printed to stdout. CBOR is binary and therefore requires -o.

CSV (spreadsheets and Pandas)

bca --paths /path/to/your/code metrics \
    -O csv -o csv-output

The CSV writer emits one row per FuncSpace (function, class, struct, unit, etc.) with the entire metric matrix as columns. Header order is fixed — see CSV_HEADER in src/output/csv.rs for the canonical list. Identity columns come first (path, space_name, space_kind, start_line, end_line) followed by every leaf metric using the same dotted JSON-style names (loc.lloc, halstead.volume, cyclomatic.modified.average, etc.) so a single column name addresses the metric in both CSV and JSON.

Empty cells (no value, not 0) signal "not applicable for this space": for example, the OOP-only metrics (wmc.*, npm.*, npa.*) appear empty for procedural code. RFC 4180 quoting is delegated to the csv crate, so paths and names containing commas, quotes, or newlines round-trip cleanly.
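When loading the CSV, the distinction between an empty cell and 0 is worth preserving. The sketch below reads a made-up sample with Python's csv module; the identity columns and loc.lloc come from the description above, while wmc.total and the space_kind values are assumed for illustration (check CSV_HEADER for the real column list).

```python
import csv
import io
from typing import Dict, Optional


def cell(row: Dict[str, str], column: str) -> Optional[float]:
    """Return a numeric cell, or None when the cell is empty (not applicable)."""
    raw = row.get(column, "")
    return float(raw) if raw != "" else None


# Made-up sample; the quoted name exercises RFC 4180 quoting.
SAMPLE = (
    "path,space_name,space_kind,start_line,end_line,loc.lloc,wmc.total\n"
    'src/a.rs,"parse, then eval",function,10,42,17,\n'
    "src/a.rs,Parser,class,50,120,40,12\n"
)

rows = list(csv.DictReader(io.StringIO(SAMPLE)))
for row in rows:
    print(row["space_name"], cell(row, "loc.lloc"), cell(row, "wmc.total"))
```

Mapping empty cells to None rather than 0 keeps "not applicable" rows out of averages and threshold comparisons.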

Stream the result to a single file by omitting -o and redirecting stdout:

bca --paths /path/to/your/code metrics -O csv \
    > metrics.csv

CSV is a per-file format; with --output <dir> each input file produces a <input>.csv mirror under the output directory.

An aggregated HTML report covering the whole walk is available via bca report html. The previous per-file bca metrics -O html writer was removed because it degraded to an unopenable single-file table on real-world repos — CSV is the right shape for flat per-FuncSpace rows.

Pretty print

bca --paths /path/to/your/file/or/directory metrics \
    --pretty -O json

Excluding inline test code

bca --paths /path/to/your/code --exclude-tests metrics

By default, every node in the AST is counted, including inline test items. Rust files following the idiomatic #[cfg(test)] mod tests { ... } layout therefore have headline metrics that mix production and test code together.

Pass --exclude-tests to elide test-only subtrees before any metric is computed. The flag is recognised by every subcommand that walks the AST (metrics, report, check), and currently understands the following Rust attribute shapes:

  • #[test] and #[rstest] / #[test_case] / #[wasm_bindgen_test]
  • #[cfg(test)], #[cfg(all(test, ...))], #[cfg(any(test, ...))]
  • #[tokio::test], #[async_std::test], #[test_log::test], … (any path ending in ::test)
  • #![cfg(test)] on mod items (inner attribute form)

Languages without a Checker::should_skip_subtree override simply ignore the flag — only Rust applies the pruning today. The default remains off so existing metric numbers stay byte-identical for users who do not opt in.

Aggregated report

For a comprehensive, human-readable quality report, use bca report markdown. That command aggregates metrics across all analyzed files and produces per-language hotspot tables.

Listing available metrics

Tooling that drives the CLI can discover the metric catalog at runtime instead of hard-coding it:

bca list-metrics

prints metric names one per line. Pass descriptions for a one-line summary of each metric:

bca list-metrics descriptions

Report

bca report <FORMAT> produces an aggregated quality-metrics report across every file walked. It is designed for pasting into pull requests, wikis, or issue trackers.

CI integration. For runnable GitHub Actions and GitLab CI recipes that post the Markdown report as a PR/MR comment, see the CI integration recipe.

Two formats are available: markdown (plain-text, ideal for PR comments) and html (a self-contained dashboard with sortable tables, ideal for sharing as a build artifact).

Migrating? This command replaces the pre-restructure --metrics -O markdown invocation. See the migration guide.

Quick start

Print to stdout:

bca --paths /path/to/project report markdown

Write to a file:

bca --paths /path/to/project report markdown --output report.md

Note: --output must be a file path, not a directory.

Flags

Flag | Default | Description
--top N | 20 | Maximum entries per hotspot table.
--strip-prefix PATH | (empty) | Prefix removed from file paths.
-o, --output FILE | (stdout) | Output file. Parent directory must exist.

Examples

Show only the five worst hotspots per section:

bca -p src/ report markdown --top 5

Strip the workspace root from displayed paths:

bca -p /home/user/project report markdown \
    --strip-prefix /home/user/project/

A typical daily-driver invocation:

bca \
    --paths "$PWD" \
    --num-jobs $(nproc) \
    report markdown \
    --top 20 \
    --strip-prefix "$PWD/"

Report structure

A generated report contains the following sections (each section is omitted when no data exists for it). Every hotspot table includes a Tokens column (Lizard-style leaf-token count, comments excluded) alongside SLOC so two complementary size proxies are visible per row.

  1. Project summary — files analyzed, languages, total SLOC / PLOC / comment counts, function and class counts, comment ratio.
  2. Per-language overview table — one row per language with file count, SLOC, function count, average Maintainability Index (MI), average Cyclomatic Complexity (CC), and average Cognitive Complexity.
  3. Per-language hotspot sections (repeated for each language):
    • Summary — file count, SLOC, PLOC, comment ratio, average MI with a GOOD / MODERATE / LOW rating.
    • Maintainability Index (lowest files) — files sorted ascending by MI.
    • Cyclomatic Complexity Hotspots — functions sorted descending by CC, with summary statistics (average, max, counts above 10 and 20).
    • Cognitive Complexity Hotspots — functions sorted descending by cognitive complexity.
    • Halstead Effort Hotspots — functions sorted descending by Halstead effort, including volume and estimated bugs.
    • Largest Functions by SLOC — functions sorted descending by source lines of code.
    • Functions With Many Parameters (>3) — functions with more than three parameters, sorted descending.
    • Actionable Summary — counts of functions exceeding common thresholds (CC > 10, cognitive > 15, SLOC > 100, args > 3, Halstead bugs > 1).
    • Class/Trait/Impl Hotspots (WMC) — classes sorted descending by Weighted Methods per Class, with NOM, NPA, and NPM.
    • Functions with the most exit points (NEXITS) — sorted descending by exit count.
    • ABC Magnitude Hotspots — functions sorted descending by ABC metric magnitude.
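The Actionable Summary is a set of simple counts. The following sketch reproduces the idea using the thresholds quoted in the list above; the record shape and key names (cyclomatic, cognitive, sloc, nargs, halstead_bugs) are hypothetical, not bca's internal field names.

```python
from typing import Dict, Iterable

# Thresholds quoted in the Actionable Summary description.
LIMITS = {
    "cyclomatic": 10,
    "cognitive": 15,
    "sloc": 100,
    "nargs": 3,
    "halstead_bugs": 1,
}


def actionable_summary(functions: Iterable[Dict[str, float]]) -> Dict[str, int]:
    """Count functions whose value strictly exceeds each threshold."""
    counts = {name: 0 for name in LIMITS}
    for fn in functions:
        for name, limit in LIMITS.items():
            value = fn.get(name)
            if value is not None and value > limit:
                counts[name] += 1
    return counts


# Made-up per-function records.
sample_functions = [
    {"cyclomatic": 12, "cognitive": 5, "sloc": 150, "nargs": 2, "halstead_bugs": 0.4},
    {"cyclomatic": 10, "cognitive": 16, "sloc": 20, "nargs": 4},
]
print(actionable_summary(sample_functions))
```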

HTML format

bca report html emits a single self-contained HTML page covering the same sections as the Markdown report. It is designed to be served as a static artifact: inline CSS, inline vanilla JavaScript for click-to-sort on every hotspot table, and zero external dependencies (no CDN, no fonts, no template engine). The page renders identically offline.

Write it to a file and open in any browser:

bca --paths /path/to/project \
    report html --top 10 --output report.html

Click any column header to sort that table ascending, click again to toggle descending. Each table sorts independently. Empty cells (where a metric was not measured) sort as if they were positive infinity, which keeps "no data" rows out of the visible top of a hotspot.

Hover (or keyboard-focus, where the browser supports it) any metric column header — SLOC, MI, CC, ABC, WMC, NPA, NPM, Exits, etc. — for a one-sentence plain-English explanation of the metric. The tooltip is delivered through the native HTML title attribute, so it works offline with no JavaScript.

Every interpolated string — function name, file path, language label — is HTML-escaped on the way out, so a crafted source path or symbol name cannot inject markup or break out of an attribute value.

Each per-language <section> carries a stable lang-<name> class (e.g. lang-rust, lang-python) styled with a low-alpha background tint and matching left border so a multi-language report's section boundaries are obvious at a glance. Languages without an explicit palette entry fall back to a neutral lang-other tint, and a prefers-color-scheme: dark adapter raises the alpha so contrast holds in both themes.

Metric values of zero

A metric value of 0 in the report means the metric was not measured for that item (e.g. Halstead metrics on an empty function). Sections whose entries are all zero are omitted entirely.

Check

bca check evaluates per-function metrics against thresholds and exits non-zero when any function exceeds a limit. It is the CI integration point: wire it into a build step and a regression in code complexity fails the pipeline before the change lands.

Looking for full CI recipes? The CI integration recipe consolidates the --output-format matrix, runnable GitHub Actions and .gitlab-ci.yml examples, the baseline / ratchet pattern, and the GitLab Code Quality path. This page documents the command itself; the recipe documents how to wire it into a pipeline.

Exit codes

Code | Meaning
0 | All functions within thresholds (or --no-fail set).
2 | At least one threshold exceeded.
1 | Tool error (bad arguments, unreadable config, unknown metric).

1 is reserved so CI can distinguish a regression (2) from a tool misconfiguration (1).
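A wrapper script can branch on the exit code to treat the two failure modes differently. This is a minimal sketch: interpret_exit and run_check are hypothetical helper names, and the argument list passed to run_check is whatever bca check invocation your pipeline already uses.

```python
import subprocess
from typing import List


def interpret_exit(code: int) -> str:
    """Map a bca check exit code to the meanings in the table above."""
    if code == 0:
        return "ok"
    if code == 2:
        return "regression"      # at least one threshold exceeded
    if code == 1:
        return "tool-error"      # bad arguments, unreadable config, ...
    return "unknown"


def run_check(argv: List[str]) -> str:
    """Run a command without raising on failure and classify its exit code."""
    completed = subprocess.run(argv, check=False)
    return interpret_exit(completed.returncode)


print(interpret_exit(2))
```

A call such as run_check(["bca", "--paths", "src/", "check", "--config", "bca-thresholds.toml"]) would then let the caller retry or alert on "tool-error" while failing the build only on "regression".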

Declaring thresholds

Pass --threshold <metric>=<limit> once per metric (repeatable). Metric names match bca list-metrics; sub-metrics use a dotted form. 0 is a valid limit and means "no value permitted".

bca --paths src/ check \
    --threshold cyclomatic=15 \
    --threshold cognitive=20 \
    --threshold loc.lloc=200

Or pull thresholds from a TOML config (one place to keep CI thresholds versioned alongside the code):

# bca-thresholds.toml
[thresholds]
cyclomatic = 15
cognitive = 20
"loc.lloc" = 200
"halstead.volume" = 1000

bca --paths src/ check --config bca-thresholds.toml

CLI flags override values from --config for the same metric name, so you can keep a project-wide default and tighten a single metric for a specific run.
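The precedence rule can be sketched as a dictionary merge. This is an illustration of the documented behaviour, not bca's implementation; parse_threshold and effective_thresholds are hypothetical names, and the config is shown as an already-parsed mapping.

```python
from typing import Dict, Iterable, Tuple


def parse_threshold(arg: str) -> Tuple[str, float]:
    """Split one '<metric>=<limit>' argument into its two parts."""
    metric, sep, limit = arg.partition("=")
    if not sep or not metric:
        raise ValueError(f"expected <metric>=<limit>, got {arg!r}")
    return metric, float(limit)


def effective_thresholds(config: Dict[str, float],
                         cli_args: Iterable[str]) -> Dict[str, float]:
    """Start from the config values, then let CLI values win per metric."""
    merged = dict(config)
    for arg in cli_args:
        metric, limit = parse_threshold(arg)
        merged[metric] = limit
    return merged


# Values from the bca-thresholds.toml example above.
config = {"cyclomatic": 15, "cognitive": 20, "loc.lloc": 200, "halstead.volume": 1000}
print(effective_thresholds(config, ["cyclomatic=10"]))
```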

Accepted metric names

Top-level scalar metrics use their list-metrics names directly: cognitive, cyclomatic, nargs, nexits, nom, tokens, abc, wmc, npm, npa. Metric suites with multiple sub-fields use a dotted form:

Metric | Accepted threshold names
Cyclomatic | cyclomatic, cyclomatic.modified
Halstead | halstead.volume, halstead.difficulty, halstead.effort, halstead.time, halstead.bugs
Lines of code | loc.sloc, loc.ploc, loc.lloc, loc.cloc, loc.blank
Maintainability Index | mi.original, mi.sei, mi.visual_studio

An unknown threshold name is a tool error (exit 1), not silently ignored.

Offender output

Every offending (function, metric) pair prints one line to stderr in this stable format:

<path>:<start_line>-<end_line>: <function_name>: <metric> = <value> (limit <limit>)

For example:

src/parser.rs:42-117: parse_expression: cyclomatic = 22 (limit 15)
src/parser.rs:42-117: parse_expression: cognitive = 31 (limit 20)

Lines are sorted by path, then start line, then metric name, so output is deterministic across runs over the same tree.
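Because the line format is stable, it can be parsed with a single regular expression. The sketch below is one possible parser written against the format shown above; the Offender type and parse_offender are hypothetical, and function names that themselves contain ": " sequences may need a stricter pattern.

```python
import re
from typing import NamedTuple, Optional


class Offender(NamedTuple):
    path: str
    start_line: int
    end_line: int
    function: str
    metric: str
    value: float
    limit: float


# <path>:<start>-<end>: <function>: <metric> = <value> (limit <limit>)
_OFFENDER = re.compile(
    r"^(?P<path>.+?):(?P<start>\d+)-(?P<end>\d+): "
    r"(?P<function>.+): (?P<metric>[\w.]+) = (?P<value>[^\s()]+) "
    r"\(limit (?P<limit>[^\s()]+)\)$"
)


def parse_offender(line: str) -> Optional[Offender]:
    """Parse one offender line; return None for anything else on stderr."""
    m = _OFFENDER.match(line.rstrip("\n"))
    if m is None:
        return None
    return Offender(m["path"], int(m["start"]), int(m["end"]), m["function"],
                    m["metric"], float(m["value"]), float(m["limit"]))


print(parse_offender(
    "src/parser.rs:42-117: parse_expression: cyclomatic = 22 (limit 15)"))
```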

Silencing violations with suppression markers

In-source comments can silence threshold violations on individual functions or whole files without editing the offending code or excluding it from the walk. The native dialect is bca: suppress / bca: suppress-file; Lizard's #lizard forgives is recognized as a compatibility shim. See Suppression markers for the full reference and the --no-suppress CI-audit flag.

Baselines

When you adopt thresholds on an existing codebase you typically face a binary choice between "raise the limit until nothing fires" and "fix every offender before turning the gate on". A baseline file is the ratchet-down alternative: record today's offenders, fail only on regressions and new offenders, and shrink the file over time as the team pays down debt.

Baselines complement the suppression markers described in Suppression markers; they do not replace them. Suppressions express "this function is intentionally exempt forever" and live in source; baselines express "this is tech debt we're paying down" and live in a committed TOML file. bca check honors suppressions first and applies the baseline filter to whatever remains.

Writing a baseline

bca --paths src/ check \
    --config bca-thresholds.toml \
    --write-baseline .bca-baseline.toml

This walks the tree, captures every threshold violation that would otherwise fail the check, and writes them to the file as sorted TOML. The run exits 0 regardless of offender count — the point is to capture them.

# bca baseline file. Generated by `bca check --write-baseline`.
# Listed offenders are filtered from threshold checks; a function that
# gets worse than its recorded value still fails. Refresh with
# `--write-baseline` when entries become stale.
version = 1

[[entry]]
path = "src/parser.rs"
function = "parse_expression"
start_line = 42
metric = "cyclomatic"
value = 22.0

Functions already covered by an in-source suppression marker are excluded. Pass --no-suppress together with --write-baseline to record every violation (CI-auditor flow).

--write-baseline cannot be combined with --baseline, --output-format, or --output — the baseline file is the output.

Reading a baseline

bca --paths src/ check \
    --config bca-thresholds.toml \
    --baseline .bca-baseline.toml

A violation is suppressed when both conditions hold:

  • An entry exists at (path, function, start_line, metric).
  • The current value is less than or equal to the recorded value.

A function that gets worse than its baseline value still fails. New offenders not listed in the baseline still fail. Improvements pass silently (the entry remains at its older, higher value until the next --write-baseline refresh).
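The matching rule can be written down as a small filter. This is a sketch of the documented semantics only, not bca's code: the Violation type and apply_baseline are hypothetical, and the baseline is shown as an in-memory mapping rather than a parsed TOML file.

```python
from typing import Dict, Iterable, List, NamedTuple, Tuple


class Violation(NamedTuple):
    path: str
    function: str
    start_line: int
    metric: str
    value: float


# Entry key as documented: (path, function, start_line, metric).
Key = Tuple[str, str, int, str]


def apply_baseline(violations: Iterable[Violation],
                   baseline: Dict[Key, float]) -> List[Violation]:
    """Keep only violations that are new or worse than their recorded value."""
    remaining = []
    for v in violations:
        recorded = baseline.get((v.path, v.function, v.start_line, v.metric))
        if recorded is not None and v.value <= recorded:
            continue  # known offender that has not regressed
        remaining.append(v)
    return remaining


baseline = {("src/parser.rs", "parse_expression", 42, "cyclomatic"): 22.0}
unchanged = Violation("src/parser.rs", "parse_expression", 42, "cyclomatic", 22.0)
worse = Violation("src/parser.rs", "parse_expression", 42, "cyclomatic", 23.0)
print(apply_baseline([unchanged, worse], baseline))
```

Only the second violation survives the filter, since its value exceeds the recorded 22.0.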

A baseline file that does not exist, is empty, has a missing or unsupported version, or fails to parse is a tool error (exit 1), not a silent zero-match.

Limitations

  • Line drift. The entry key is (path, function, start_line, metric). Inserting code above a function shifts its start_line and the entry stops matching, surfacing as a "new" offender. Run --write-baseline to refresh and commit the diff.
  • Path identity. Entries record the path as the walker saw it. Generate and consume the baseline with the same --paths argument from the same working directory; a relative --paths src/ and an absolute --paths /repo/src/ do not match each other.
  • OS portability. Paths are stored with forward slashes so a baseline written on one OS matches the same tree on another. Paths that are not valid UTF-8 fall back to a lossy display form (U+FFFD substitution) and may not round-trip exactly.

See the Baselines recipe for the end-to-end adoption flow and CI integration patterns.

Reporting without failing

--no-fail prints offenders to stderr but exits 0. Useful while adopting baselines without flipping CI red. Other CI tools call this behavior --report-only or --soft-fail; here the flag is spelled --no-fail.

bca --paths src/ check \
    --config bca-thresholds.toml --no-fail

CI example (GitHub Actions)

- name: Check code complexity thresholds
  run: |
    bca --paths src/ check --config bca-thresholds.toml
  # The default behavior — non-zero exit fails the step — is exactly
  # what we want here. No extra wiring needed.

If you want to keep the job green and surface offenders as a build annotation while you reduce the count, swap in --no-fail:

- name: Surface complexity hot spots (non-blocking)
  run: |
    bca --paths src/ check \
        --config bca-thresholds.toml --no-fail

Exporting offender records

bca check also emits a single CI/IDE document covering every offender in the walk. Pass --output-format <fmt> to pick the shape and --output <file> to write it to disk (stdout if omitted). The exit-code contract is unaffected by these flags: 0 clean, 2 on any violation (unless --no-fail), 1 on tool error.

Format | Audience
checkstyle | Jenkins, SonarQube, GitLab, "warnings plugin" CI
sarif | GitHub Code Scanning, modern IDEs / security tooling
code-climate | GitLab MR Code Quality widget
clang-warning | Editor quickfix parsers, GitHub Actions problem matcher
msvc-warning | Visual Studio, VS Code, Windows CI runners

When no offenders exist the writer emits a well-formed but empty document — empty runs[].results array for SARIF, empty JSON array ([]) for Code Climate, no <file> children under the <checkstyle> root for Checkstyle, and zero bytes for the two warning-line formats — so CI consumers can ingest clean runs unchanged.

Checkstyle (CI integration)

bca --paths src/ check \
    --threshold cyclomatic=15 \
    --output-format checkstyle \
    --output report.checkstyle.xml

The Checkstyle writer emits a single <checkstyle version="4.3"> document containing one <file> element per source path, each holding one <error> per metric-threshold violation. The schema is the Checkstyle 4.3 XSD that Jenkins and SonarQube's "Warnings Next Generation" / "Generic Issue" importers consume directly.
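To make the document shape concrete, here is a rough sketch that builds such a tree with Python's xml.etree.ElementTree. Only the checkstyle root, its version, and the file/error nesting come from the description above; the error attributes (line, severity, message, source), their values, and the input record shape are assumptions based on the common Checkstyle layout, not bca's exact output.

```python
import xml.etree.ElementTree as ET
from typing import Dict, Iterable


def to_checkstyle(violations: Iterable[Dict]) -> str:
    """Group violations by path: one <file> per path, one <error> per violation."""
    root = ET.Element("checkstyle", version="4.3")
    files: Dict[str, ET.Element] = {}
    for v in violations:
        file_el = files.get(v["path"])
        if file_el is None:
            file_el = ET.SubElement(root, "file", name=v["path"])
            files[v["path"]] = file_el
        ET.SubElement(
            file_el,
            "error",
            line=str(v["line"]),
            severity="warning",  # assumed severity value
            message=f"{v['metric']} = {v['value']} (limit {v['limit']})",
            source=f"big-code-analysis.{v['metric']}",  # assumed source id
        )
    return ET.tostring(root, encoding="unicode")


sample = [
    {"path": "src/parser.rs", "line": 42, "metric": "cyclomatic", "value": 22, "limit": 15},
    {"path": "src/parser.rs", "line": 42, "metric": "cognitive", "value": 31, "limit": 20},
    {"path": "src/lexer.rs", "line": 7, "metric": "cyclomatic", "value": 16, "limit": 15},
]
print(to_checkstyle(sample))
```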

SARIF (GitHub Code Scanning)

bca --paths src/ check \
    --threshold cyclomatic=15 \
    --output-format sarif \
    --output report.sarif.json

The SARIF writer emits a single SARIF 2.1.0 JSON document with one runs[] element. Each metric-threshold violation becomes a result under runs[0].results[]; the metric names appearing in the run are deduplicated into runs[0].tool.driver.rules[] with short descriptions.
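The layout described above can be sketched as a plain dictionary. The property names follow the SARIF 2.1.0 object model; the driver name, message text, input record shape, and the to_sarif helper are assumptions for illustration rather than bca's exact output.

```python
import json
from typing import Dict, Iterable, List


def to_sarif(violations: Iterable[Dict], descriptions: Dict[str, str]) -> Dict:
    """Build one run: deduplicated rules plus one result per violation."""
    violations = list(violations)
    rule_ids: List[str] = sorted({v["metric"] for v in violations})
    rules = [
        {"id": rule_id, "shortDescription": {"text": descriptions.get(rule_id, rule_id)}}
        for rule_id in rule_ids
    ]
    results = [
        {
            "ruleId": v["metric"],
            "message": {"text": f"{v['metric']} = {v['value']} (limit {v['limit']})"},
            "locations": [{
                "physicalLocation": {
                    "artifactLocation": {"uri": v["path"]},
                    "region": {"startLine": v["start_line"]},
                }
            }],
        }
        for v in violations
    ]
    return {
        "version": "2.1.0",
        "runs": [{
            # Driver name is an assumption for this sketch.
            "tool": {"driver": {"name": "big-code-analysis", "rules": rules}},
            "results": results,
        }],
    }


sarif_sample = [
    {"path": "src/parser.rs", "start_line": 42, "metric": "cyclomatic", "value": 22, "limit": 15},
    {"path": "src/lexer.rs", "start_line": 7, "metric": "cyclomatic", "value": 16, "limit": 15},
]
print(json.dumps(to_sarif(sarif_sample, {"cyclomatic": "Cyclomatic complexity"}), indent=2))
```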

To upload a SARIF file to GitHub Code Scanning from a workflow:

name: bca-sarif
on: [push, pull_request]
jobs:
  scan:
    runs-on: ubuntu-latest
    permissions:
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - name: Run big-code-analysis
        run: |
          bca --paths . check \
              --config bca-thresholds.toml \
              --output-format sarif \
              --output report.sarif.json \
              --no-fail
      - name: Upload SARIF
        uses: github/codeql-action/upload-sarif@v3
        with:
          sarif_file: report.sarif.json

--no-fail keeps the job green so the SARIF upload step still runs when offenders exist; remove it once you want a metric regression to fail the workflow.

GitLab Code Quality (Code Climate JSON)

bca --paths src/ check \
    --threshold cyclomatic=15 \
    --output-format code-climate \
    --output gl-code-quality-report.json

The Code Climate writer emits a single JSON array of issue objects matching GitLab's strict subset of the upstream Code Climate engine spec: one entry per metric-threshold violation, no byte-order mark, one trailing newline (empty input renders as []\n). Each issue carries a namespaced check_name (big-code-analysis/<metric>), a stable SHA-256 fingerprint over path \0 function \0 metric (line- and value-insensitive so cosmetic edits still dedup in the MR widget), and a severity mapped from the value/threshold ratio onto GitLab's five-level enum: ≤ 1.5× maps to minor, ≤ 2× to major, ≤ 4× to critical, and > 4× to blocker (inverted for the mi.* family, where lower is worse). The full enum is info/minor/major/critical/blocker; bca never emits info, because a threshold violation always lands at minor or higher.
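The fingerprint and severity rules can be sketched as two small functions. This is an approximation under stated assumptions: UTF-8 encoding and hex output for the fingerprint, and limit/value as the "inverted" ratio for mi.* metrics, are guesses consistent with the description rather than confirmed details of bca's writer.

```python
import hashlib


def fingerprint(path: str, function: str, metric: str) -> str:
    """SHA-256 over path NUL function NUL metric (encoding assumed UTF-8)."""
    payload = "\0".join((path, function, metric)).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def severity(metric: str, value: float, limit: float) -> str:
    """Map the value/limit ratio onto minor/major/critical/blocker."""
    if metric.startswith("mi."):
        # Assumed inversion: lower MI is worse, so compare limit to value.
        ratio = float("inf") if value <= 0 else limit / value
    else:
        ratio = float("inf") if limit <= 0 else value / limit
    if ratio <= 1.5:
        return "minor"
    if ratio <= 2:
        return "major"
    if ratio <= 4:
        return "critical"
    return "blocker"


print(severity("cyclomatic", 22, 15))
print(fingerprint("src/parser.rs", "parse_expression", "cyclomatic"))
```

Because neither the line number nor the metric value enters the hash, the fingerprint stays the same when a function moves or its value changes.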

To wire the artifact into GitLab's MR Code Quality widget:

code_quality:
  stage: quality
  script:
    - bca --paths "$CI_PROJECT_DIR" check
          --config bca-thresholds.toml
          --output-format code-climate
          --output gl-code-quality-report.json
          --no-fail
  artifacts:
    when: always
    reports:
      codequality: gl-code-quality-report.json
    paths:
      - gl-code-quality-report.json

See the GitLab Code Quality widget recipe for the full pipeline (combined Code Climate + Checkstyle + Markdown report) and a local jq smoke check.

--no-fail keeps the job green so the Code Quality report still uploads when offenders exist; remove it once you want a metric regression to fail the pipeline.

Clang/GCC warning lines (editor quickfix and CI annotators)

bca --paths src/ check \
    --threshold cyclomatic=15 \
    --output-format clang-warning \
    --output report.txt

The Clang format emits one offender per line in the conventional compiler-warning shape:

path/to/file.rs:42:5: warning: cyclomatic 17 exceeds limit 15 [big-code-analysis-cyclomatic]

This is the format clang -fdiagnostics-format= produces and the shape every editor quickfix parser (VS Code, IntelliJ, Vim) and most CI annotators understand without configuration.

GitHub Actions surfaces the lines as inline annotations on the PR diff via the built-in GCC problem matcher (or any community compiler-problem-matchers action):

name: bca-clang-warnings
on: [push, pull_request]
jobs:
  lint:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Enable GCC problem matcher
        run: echo "::add-matcher::$RUNNER_TOOL_CACHE/problem-matchers/gcc.json"
      - name: Run big-code-analysis
        run: |
          bca --paths . check \
              --config bca-thresholds.toml \
              --output-format clang-warning \
              --no-fail

If your runner does not ship a GCC matcher, fall back to streaming the lines and re-emitting them as ::warning file=...,line=...:: workflow commands.
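A minimal sketch of that fallback in Python, assuming the line shape shown in the sample above. The regular expression and the function names are illustrative, and the sketch skips the percent-escaping that GitHub workflow commands require for special characters in messages and properties.

```python
import re

# Shape taken from the sample line above: path:line:col: severity: message
_CLANG_LINE = re.compile(
    r"^(?P<file>.+?):(?P<line>\d+):(?P<col>\d+): "
    r"(?P<level>warning|error): (?P<message>.*)$"
)


def to_workflow_command(line: str):
    """Convert one clang-style diagnostic into a GitHub workflow command.

    Returns None for lines that do not match the diagnostic shape.
    """
    match = _CLANG_LINE.match(line.rstrip("\n"))
    if match is None:
        return None
    return "::{level} file={file},line={line},col={col}::{message}".format(
        **match.groupdict()
    )


def reemit(lines):
    """Yield a workflow command for every diagnostic in an iterable of lines."""
    for raw in lines:
        command = to_workflow_command(raw)
        if command is not None:
            yield command
```

In a workflow step, pipe the bca output into a small script that prints each item of reemit(sys.stdin); GitHub then renders the printed commands as annotations.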

MSVC warning lines (Visual Studio and Windows CI)

bca --paths src/ check \
    --threshold cyclomatic=15 \
    --output-format msvc-warning \
    --output report.txt

The MSVC format emits one offender per line in Visual Studio's cl.exe diagnostic shape:

path\to\file.rs(42,5): warning : cyclomatic 17 exceeds limit 15

Note the space before the colon after warning/error — that is the MSVC convention. On Windows the path is normalized to use \ separators (matching cl.exe output); on other platforms the path is emitted as-is. Visual Studio, VS Code with the C/C++ extension, and Windows CI runners (Azure Pipelines, GitHub Actions on windows-latest) parse these inline without extra configuration.

Suppression markers

In-source suppression markers silence threshold violations without editing the offending function or excluding the file from the walk. Drop a marker in any comment in the source file and bca check treats the covered metrics as if they were within limits for that scope. Metric computation is unaffected — raw bca metrics / bca report output still reports every number. Suppression is a threshold-check concern only.

Markers exist for the cases where editing the code is not an option: generated-style legacy modules awaiting rewrite, accepted exceptions documented in the comment, and migration from Lizard's #lizard forgives convention.

Native markers (bca:)

The native dialect uses the bca: namespace and the suppress verb, matching the project's internal "suppression" vocabulary (SuppressionPolicy, FuncSpace::suppressed, --no-suppress). Four forms:

| Marker | Scope | Effect |
| --- | --- | --- |
| bca: suppress | Enclosing function | Suppress every metric |
| bca: suppress(metric, ...) | Enclosing function | Suppress only the listed metrics |
| bca: suppress-file | File | Suppress every metric |
| bca: suppress-file(metric, ...) | File | Suppress only the listed metrics |

A function-scope marker attaches to the innermost FuncSpace (see the FuncSpace rustdoc) whose source range contains the comment. A function-scope marker outside every function body is silently ignored; for file-wide silencing use the explicit suppress-file verb. A file-scope marker may appear anywhere in the source — there is no "must be in first N lines" rule.

bca: suppress — function-scoped, all metrics (Rust)

fn legacy_dispatch(opcode: u8) -> Action {
    // bca: suppress
    // dense match on every supported opcode; rewrite tracked in #123
    match opcode { /* ... */ }
}

bca: suppress(metric, ...) — function-scoped, listed metrics (Python)

def parse_token_stream(tokens):
    # bca: suppress(cognitive)
    # cognitive complexity is intrinsic to this state machine;
    # cyclomatic is still bounded.
    ...

Other thresholds (cyclomatic, halstead, loc, ...) still apply.

bca: suppress-file — file-scoped, all metrics (JavaScript)

// bca: suppress-file
// Hand-tuned hot path; do not rewrite to satisfy thresholds.
function transform(input) { /* ... */ }
function validate(input) { /* ... */ }

bca: suppress-file(metric, ...) — file-scoped, listed metrics (C++)

/* bca: suppress-file(halstead) */
// Halstead volume is inflated by the generated tables below; every
// other metric is still enforced file-wide.

Lizard compatibility markers

Two Lizard-style markers are recognized verbatim so existing Lizard-instrumented codebases need no rewrites:

| Lizard marker | Scope | Equivalent native marker |
| --- | --- | --- |
| #lizard forgives | Enclosing function | bca: suppress |
| #lizard forgive global | File | bca: suppress-file |

The compatibility layer is intentionally narrow: only these two shapes are accepted. Other Lizard directives parse as ordinary comments. Lizard offers no per-metric scoping, so the native form's bca: suppress(metric, ...) list has no Lizard analogue — every Lizard-style marker silences every metric.

Lizard's GENERATED CODE marker is not handled here; it is part of the generated-code auto-skip mechanism (see Skipping generated code and the --no-skip-generated flag).

Native vs Lizard side by side

| Effect | Native form | Lizard form |
| --- | --- | --- |
| Silence every metric for one function | // bca: suppress | // #lizard forgives |
| Silence one metric for one function | // bca: suppress(cyclomatic) | (no equivalent) |
| Silence every metric for the whole file | // bca: suppress-file | // #lizard forgive global |
| Silence one metric for the whole file | // bca: suppress-file(halstead) | (no equivalent) |

Metric identifiers

The identifiers accepted inside bca: suppress(...) and bca: suppress-file(...) are:

abc, cognitive, cyclomatic, exit, halstead, loc, mi, nargs, nom, npa, npm, wmc.

They mostly match the JSON field names emitted on CodeMetrics, with two deliberate differences:

  • exit is the suppression spelling for the threshold name nexits (the JSON field is also nexits) — bca: suppress(exit) silences a nexits threshold violation.
  • tokens is a threshold-checkable metric (and a CodeMetrics JSON field) but is deliberately absent from the suppression list: a marker cannot turn it off. Treat tokens as a hard resource cap, not a maintainability heuristic.

Silencing a family (for example halstead) covers every sub-metric threshold under it (halstead.volume, halstead.effort, ...); suppression vocabulary has no dotted form.
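The relationship between threshold names and suppression identifiers can be sketched as a small lookup. This is an illustration of the rules above, not code from the project; suppression_id is a hypothetical helper name.

```python
# Identifiers accepted inside bca: suppress(...), as listed above.
SUPPRESSIBLE = {
    "abc", "cognitive", "cyclomatic", "exit", "halstead", "loc",
    "mi", "nargs", "nom", "npa", "npm", "wmc",
}


def suppression_id(threshold_name: str):
    """Return the marker identifier covering a threshold, or None."""
    family = threshold_name.split(".", 1)[0]  # halstead.volume -> halstead
    if family == "nexits":
        family = "exit"  # the one renamed identifier
    # tokens (and anything unknown) cannot be suppressed by a marker.
    return family if family in SUPPRESSIBLE else None
```

For example, suppression_id("halstead.volume") gives "halstead", suppression_id("nexits") gives "exit", and suppression_id("tokens") gives None.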

Unknown identifiers in a bca: suppress(...) list emit a stderr warning of the form

warning: path/to/file.rs:42: unknown metric 'no_such_metric' in bca suppression marker; known metrics: abc, cognitive, ...

The marker is dropped — a typo never silently widens scope to other metrics. Unknown verbs (anything other than suppress / suppress-file) and malformed bodies (unbalanced parentheses, trailing garbage) produce the same shape of warning and are similarly dropped. None of these are fatal: a typo in one file does not derail a workspace walk.

Where markers may appear

A marker is recognized inside any source comment, regardless of comment style. The scanner strips the following leading delimiter characters before matching: /, *, !, #, ;, -, and ASCII whitespace. That covers every comment shape bca parses today:

  • C-family line comments: // bca: suppress
  • C-family block comments: /* bca: suppress */
  • Rust inner doc comments: //! bca: suppress and /*! bca: suppress */
  • Python / shell / Ruby / Perl # comments: # bca: suppress
  • Lisp / Lua / SQL line comments: ;; bca: suppress, -- bca: suppress

Function-scope markers attach to the innermost Function-kind FuncSpace whose (start_line..=end_line) range contains the comment's line. Markers buried in a class or struct body but outside every method are silently ignored — for class-wide silencing use bca: suppress-file or repeat the marker on each method.
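The attachment rule amounts to a range lookup. The Python sketch below illustrates it under the assumption that function spans nest properly, so the shortest containing range is the innermost one; Span and innermost_function are hypothetical names, not part of the bca API.

```python
from typing import NamedTuple, Optional


class Span(NamedTuple):
    name: str
    start_line: int
    end_line: int


def innermost_function(spans: list[Span], comment_line: int) -> Optional[Span]:
    """Pick the smallest function span whose line range contains the comment."""
    containing = [s for s in spans if s.start_line <= comment_line <= s.end_line]
    if not containing:
        return None  # marker outside every function: ignored
    # Assumption: spans nest properly, so the shortest range is the innermost.
    return min(containing, key=lambda s: s.end_line - s.start_line)
```

With an outer function on lines 1 to 20 and a nested one on lines 5 to 10, a marker on line 7 attaches to the nested function, a marker on line 15 attaches to the outer one, and a marker on line 25 attaches to nothing.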

File-scope markers are merged into the top-level Unit space and apply to every function in the file regardless of nesting.

Position the marker near the start of the comment. The scanner trims delimiter characters from both ends and then expects bca: (or #lizard) at the very front; markers buried deep in a multi-line block comment will not be recognized.
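To make the recognition rules concrete, here is a rough Python sketch of a parser for the native dialect only. It is not the scanner bca ships: it handles single-line comment text, leaves out the Lizard forms, and returns None where the real tool would also print a stderr warning. The function name and regular expression are illustrative.

```python
import re
import string

KNOWN_METRICS = {
    "abc", "cognitive", "cyclomatic", "exit", "halstead", "loc",
    "mi", "nargs", "nom", "npa", "npm", "wmc",
}
# Delimiter characters listed above, plus ASCII whitespace.
_DELIMITERS = "/*!#;-" + string.whitespace
_MARKER = re.compile(r"^(suppress-file|suppress)(?:\((.*)\))?$")


def parse_native_marker(comment: str):
    """Return (verb, metrics) for a native marker, or None if not recognized.

    metrics is None for the bare forms, which cover every metric.
    """
    text = comment.strip(_DELIMITERS)
    if not text.startswith("bca:"):
        return None  # marker must sit at the very front of the comment
    match = _MARKER.match(text[len("bca:"):].strip())
    if match is None:
        return None  # unknown verb or malformed body
    verb, body = match.groups()
    if body is None:
        return verb, None
    metrics = [item.strip() for item in body.split(",")]
    if any(item not in KNOWN_METRICS for item in metrics):
        return None  # a typo drops the whole marker
    return verb, metrics
```

Note how an unknown identifier rejects the entire marker instead of falling back to the bare form, which mirrors the "never silently widens scope" rule.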

--no-suppress (CI auditing)

bca check --no-suppress ignores every suppression marker — native and Lizard alike — and reports every threshold violation in the walk. Use it in audit pipelines that need the raw, un-silenced offender list:

bca --paths src/ check --config bca-thresholds.toml --no-suppress

The flag has no effect on metric values themselves: raw bca metrics / bca report output already ignores markers, since suppression is a threshold-check concern only.

JSON output

FuncSpace exposes the merged suppression scope as the optional suppressed field in its JSON output. When no marker applies to a space the field is elided so existing snapshot consumers see no change. When a marker fires the field carries one of two shapes:

{ "suppressed": { "kind": "all" } }
{ "suppressed": { "kind": "some", "metrics": ["cognitive", "loc"] } }

kind: all corresponds to a bare marker (bca: suppress, bca: suppress-file, or any Lizard-style marker). kind: some carries the explicit metric list from bca: suppress(...) / bca: suppress-file(...). Both shapes are stable serialization output suitable for dashboards and audit logs.
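A dashboard or audit script might consume the field along these lines. The sketch assumes nested spaces live under a "spaces" array and that each space has a "name"; only the suppressed shapes themselves come from the description above. The sample tree is hand-written.

```python
def is_suppressed(space: dict, metric: str) -> bool:
    """True when the space's suppressed field covers the given metric."""
    scope = space.get("suppressed")
    if scope is None:  # field is elided when no marker applies
        return False
    if scope["kind"] == "all":
        return True
    return metric in scope.get("metrics", [])


def suppressed_spaces(space: dict):
    """Yield (name, suppressed) for every space that carries the field."""
    if "suppressed" in space:
        yield space.get("name"), space["suppressed"]
    # Assumption: nested spaces live under a "spaces" array.
    for child in space.get("spaces", []):
        yield from suppressed_spaces(child)


# Hand-written sample, not real bca output.
SAMPLE = {
    "name": "lib.rs",
    "spaces": [
        {"name": "f", "spaces": [],
         "suppressed": {"kind": "some", "metrics": ["cognitive", "loc"]}},
        {"name": "g", "spaces": []},
    ],
}

for name, scope in suppressed_spaces(SAMPLE):
    print(name, scope["kind"])
```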

Migrating from Lizard

The compatibility layer means migration is incremental:

  1. Existing #lizard forgives and #lizard forgive global markers continue to work with no change. bca check honors them out of the box.
  2. Rewrite to the native form opportunistically. bca: suppress(...) gives per-metric scoping (the Lizard form silences everything) and is the form future audit-trail features will extend.

The project will keep the Lizard compatibility layer indefinitely; there is no removal date.

Reserved syntax

These shapes are reserved for future use and are not parsed today:

  • bca: suppress(metric, reason = "...") — audit-trail prose alongside the metric list, mirroring Rust's reason = "…" attribute argument.
  • bca: suppress-next — silence the immediately following declaration rather than the enclosing function.

Authors should avoid using either form today: a reason = "..." argument is currently parsed as an unknown metric identifier and discarded with a stderr warning, and bca: suppress-next is rejected as an unknown verb. Both will be promoted to first-class behavior in a future release without breaking existing markers.

Nodes

bca provides commands to analyze and extract information about nodes in the Abstract Syntax Tree (AST) of a source file.

Migrating? The verbs below replace the pre-restructure flag actions (-d, -f, --count, ...). See the migration guide.

Error detection

To detect syntactic errors in your code, run:

bca -I "*.ext" -p /path/to/your/file/or/directory find ERROR

  • -p, --paths: file or directory (analyzes all files when given a directory).
  • -I, --include: glob filter for selecting files by extension (e.g. *.js, *.rs). Variadic: put it before -p so the subcommand isn't swallowed as another glob, or use the -I=GLOB single-value form.
  • find <NODE>: search for nodes of a specific type (one or more positional names).

Counting nodes

Count occurrences of one or more node types with the count command:

bca -I "*.ext" -p /path/to/your/file/or/directory \
    count <NODE_TYPE> [<NODE_TYPE>...]

Printing the AST

To visualize the AST of a source file, use the dump command:

bca -p /path/to/your/file/or/directory dump

Analyzing code portions

To analyze only a specific portion of the code, use the global --ls (line start) and --le (line end) options. For example, to print the AST of a single function from line 5 to line 10:

bca -p /path/to/your/file/or/directory --ls 5 --le 10 dump

Listing functions

For a list of every function or method and its line span, use:

bca -p /path/to/your/file/or/directory functions

REST API

bca-web is a web server that allows users to analyze source code through a REST API. This service is useful for anyone looking to perform code analysis over HTTP.

The server can be run on any host and port, and supports the following main functionalities:

  • Remove Comments from source code.
  • Retrieve Function Spans for given code.
  • Compute Metrics for the provided source code.

Running the Server

To run the server, use the following command:

bca-web --host 127.0.0.1 --port 9090

  • --host specifies the IP address where the server should run (default is 127.0.0.1).
  • --port specifies the port to be used (default is 8080).
  • -j specifies the number of parallel jobs (optional).

The endpoint examples below assume the default port, 8080.

Endpoints

1. Ping the Server

Use this endpoint to check if the server is running.

Request:

GET http://127.0.0.1:8080/ping

Response:

  • Status Code: 200 OK
  • Body: empty.

Use curl -sf http://127.0.0.1:8080/ping && echo ok to script a liveness check — -f makes curl exit non-zero on any HTTP error.

2. Remove Comments

This endpoint removes comments from the provided source code. It accepts two Content-Type variants. Use application/octet-stream for raw byte-in / byte-out, and application/json for a JSON envelope.

Request:

POST http://127.0.0.1:8080/comment

Payload:

{
  "id": "unique-id",
  "file_name": "filename.ext",
  "code": "source code with comments"
}

  • id: A unique identifier for the request.
  • file_name: The name of the file being analyzed.
  • code: The source code with comments.

Response (JSON variant):

{
  "id": "unique-id",
  "code": [10, 112, 114, 105, 110, 116]
}

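The code field is a byte array of the stripped source, not a string. Decode it with jq -r '.code | implode' or the equivalent in your client. Note that implode treats each number as a Unicode code point, so the jq one-liner is exact only for ASCII output; non-ASCII UTF-8 needs a byte-level decoder. The application/octet-stream variant returns the stripped source as the raw response body, which is simpler for shell pipelines.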
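In a Python client the byte array can be decoded directly. The snippet below is a minimal sketch that reuses the sample response shown above and assumes the stripped source is UTF-8; decode_stripped_code is a hypothetical helper name.

```python
import json


def decode_stripped_code(response_body: str) -> str:
    """Turn the JSON variant's byte array back into source text."""
    payload = json.loads(response_body)
    # bytes() accepts an iterable of ints in the range 0..255.
    return bytes(payload["code"]).decode("utf-8")


# The sample response from above.
body = '{"id": "unique-id", "code": [10, 112, 114, 105, 110, 116]}'
print(repr(decode_stripped_code(body)))  # '\nprint'
```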

3. Retrieve Function Spans

This endpoint retrieves the spans of functions in the provided source code.

Request:

POST http://127.0.0.1:8080/function

Payload:

{
  "id": "unique-id",
  "file_name": "filename.ext",
  "code": "source code with functions"
}

  • id: A unique identifier for the request.
  • file_name: The name of the file being analyzed.
  • code: The source code with functions.

Response:

{
  "id": "unique-id",
  "spans": [
    {
      "name": "function_name",
      "start_line": 1,
      "end_line": 10,
      "error": false
    }
  ]
}

error is true when the parser flagged the span as malformed (e.g. unbalanced delimiters inside the function body).

4. Compute Metrics

This endpoint computes various metrics for the provided source code.

Request:

POST http://127.0.0.1:8080/metrics

Payload:

{
  "id": "unique-id",
  "file_name": "filename.ext",
  "code": "source code for metrics",
  "unit": false
}

  • id: Unique identifier for the request.
  • file_name: The filename of the source code file.
  • code: The source code to analyze.
  • unit: A boolean value. true to compute only top-level metrics, false for detailed metrics across all units (functions, classes, etc.).

Response:

{
  "id": "unique-id",
  "language": "Rust",
  "spaces": {
    "metrics": {
      "cyclomatic_complexity": 5,
      "lines_of_code": 100,
      "function_count": 10
    }
  }
}
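As a sketch of how a client might call this endpoint from Python's standard library: the payload fields are the ones documented above, while the application/json Content-Type header, the timeout, and the helper names are assumptions. The request builder is separated from the network call so it can be inspected without a running server.

```python
import json
import urllib.request

BASE_URL = "http://127.0.0.1:8080"  # default host and port


def build_metrics_request(file_name: str, code: str, unit: bool = False,
                          request_id: str = "unique-id") -> urllib.request.Request:
    """Build (but do not send) a POST /metrics request."""
    payload = {"id": request_id, "file_name": file_name, "code": code, "unit": unit}
    return urllib.request.Request(
        f"{BASE_URL}/metrics",
        data=json.dumps(payload).encode("utf-8"),
        # Assumption: the endpoint expects a JSON content type.
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_metrics(file_name: str, code: str, unit: bool = False) -> dict:
    """Send the request to a running bca-web and return the parsed JSON body."""
    request = build_metrics_request(file_name, code, unit)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())
```

With bca-web running locally, post_metrics("main.rs", "fn main() {}") returns the response as a dict; urlopen raises urllib.error.HTTPError on a non-2xx status.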

Recipes

Task-oriented examples for getting work done with bca and bca-web. Each recipe assumes you have built the binaries (cargo build --release) and that bca is on your PATH.

The recipes are grouped by goal:

  • Quality reports — generate Markdown reports suitable for pull requests, dashboards, or wikis, including the C/C++ preprocessor-aware workflow.
  • CI integration — wire bca check and bca report into GitHub Actions and GitLab CI, including the baseline / ratchet pattern and the Code Quality widget path.
  • Local threshold gates — mirror the CI threshold gate on a developer machine with a two-tier (hard + headroom) Makefile / just / pre-commit pattern, so regressions never reach the pull request.
  • AST queries — search for syntactic constructs, count node types, dump trees, and detect parse errors.
  • Exporting metric data — emit structured output (JSON / YAML / TOML / CBOR) and consume it from shell pipelines.
  • Driving the REST API — run the HTTP server and call every endpoint with curl.

If you want a deeper look at any flag the recipes use, see the per-command pages under Commands. For the full list of metrics that show up in these recipes, see Supported Metrics.

Upstream reference. big-code-analysis is a fork of Mozilla's rust-code-analysis. Recipes that work for the upstream rust-code-analysis-cli binary usually translate directly — replace the binary name and adjust for the subcommand restructure documented in the migration guide.

Quality reports

Recipes for producing aggregated, human-readable Markdown reports.

Wiring reports into CI? See the CI integration recipe for runnable GitHub Actions and GitLab CI examples that post the Markdown report as a PR/MR comment and surface threshold violations through the platform's native code quality widgets.

Live example reports

big-code-analysis publishes the output of bca report markdown and bca report html against its own source tree on every push to main. The published reports show exactly what the recipes on this page produce on a multi-language Rust + Python codebase.

The wiring that produces them lives in .github/workflows/pages.yml. The same workflow runs the threshold gate; see CI integration for the full pipeline shape.

Generate a project-wide quality report

Run from the project root and write the report to a file:

bca \
    --paths "$PWD" \
    --num-jobs "$(nproc)" \
    report markdown \
    --top 20 \
    --strip-prefix "$PWD/" \
    --output report.md

  • --strip-prefix keeps the file paths short and stable across machines; without it, every row carries the absolute path of the current checkout.
  • --top controls how many rows appear in each hotspot table. 20 is a good default for a PR comment; drop to 5 for a dashboard tile.
  • --num-jobs controls parallelism. The walker is CPU-bound on most modern hardware.

Limit the report to specific languages

bca infers language from extension, so the include/exclude globs do the filtering:

bca \
    --include "*.rs" "*.py" \
    --paths "$PWD" \
    report markdown --output report.md

To exclude vendored or generated trees, layer in --exclude:

bca \
    --include "*.rs" \
    --exclude "**/target/**" "**/vendor/**" \
    --paths "$PWD" \
    report markdown

Flag ordering. --include and --exclude accept multiple values and stop only when the next flag begins. Put them before --paths (or any single-value flag) so the subcommand name isn't swallowed as a glob. Equivalent single-value forms with = also work: --include="*.rs" --exclude="**/target/**".

For a stable repo-wide deny-set, keep the patterns in a file at the repo root (a .bcaignore by convention) and load it with --exclude-from. Patterns are unioned with any inline --exclude values; blank lines and #-prefixed comments are skipped:

bca \
    --paths . \
    --exclude-from .bcaignore \
    report markdown --output report.md

Show only the worst offenders

For a quick triage view that highlights the top three problems per section:

bca -p src/ report markdown --top 3

The report still includes every section, but each table is short enough to scan at a glance.

Compare two revisions

Aggregate reports do not diff revisions on their own. Run the report on each side and diff the Markdown:

git worktree add /tmp/before main
bca -p /tmp/before report markdown \
    --strip-prefix /tmp/before/ --output /tmp/before.md

bca -p "$PWD" report markdown \
    --strip-prefix "$PWD/" --output /tmp/after.md

diff -u /tmp/before.md /tmp/after.md | less

Because both reports use the same --strip-prefix shape, the path columns line up and the diff is dominated by metric changes rather than path noise.

C/C++ preprocessor-aware reports

Macro-heavy C/C++ codebases benefit from feeding preprocessor data into the analyzer so that conditional compilation is interpreted the way the compiler sees it. The workflow is two steps:

# 1. Build a preprocessor-data JSON from the headers and sources.
bca \
    --paths src/ include/ \
    preproc \
    --output /tmp/preproc.json

# 2. Run the report (or any other command) with that data attached.
bca \
    --paths src/ \
    --preproc-data /tmp/preproc.json \
    report markdown --output report.md

--preproc-data is a global flag, so it works with metrics, ops, functions, and the other subcommands as well — anywhere accurate C/C++ analysis matters.

Analyze only files changed in a PR

Pipe a list of changed files into --paths-from - to score just the diff, not the whole tree:

git diff --name-only --diff-filter=AM origin/main...HEAD \
    | bca --paths-from - metrics -O json -o ./out

  • --diff-filter=AM keeps Added and Modified files and drops Deletions, since you cannot analyze a file that no longer exists.
  • --paths-from - reads newline-separated paths from stdin. A file argument works the same way: --paths-from changed.txt.
  • Paths fed in this way are treated as explicit, so they bypass any .gitignore rule that would have hidden them in a directory walk. Combine with -I '*.py' '*.rs' to filter by language.

For a PR-scoped Markdown summary, swap metrics for the report pipeline:

git diff --name-only --diff-filter=AM origin/main...HEAD \
    | bca --paths-from - report markdown \
        --top 10 --output pr-report.md

.gitignore is honored automatically when walking a directory, so recipes earlier in this page no longer need an explicit -X "**/target/**" "**/node_modules/**" if those paths are already covered by your project's .gitignore. Add --no-ignore if you do need to analyze gitignored trees.

CI integration

Recipes for wiring bca into a build pipeline. The bca check command already ships every output shape a modern CI needs (Checkstyle, SARIF, GitLab Code Climate JSON, clang/GCC warning lines, MSVC warning lines), plus bca report markdown for humans. This page is a consolidated map from the user's goal to the right combination of subcommand, flags, and platform glue.

Picking outputs

The matrix below maps each common goal to the bca invocation that feeds the corresponding CI surface. The linked sections below have the runnable examples.

| Goal | Command + flags |
| --- | --- |
| Hard gate on threshold regressions | bca check --config bca-thresholds.toml |
| Ratchet thresholds on an existing codebase | bca check --config bca-thresholds.toml --baseline .bca-baseline.toml (‡) |
| Inline PR annotations (GitHub) | bca check … --output-format clang-warning --no-fail + GCC problem matcher |
| Code Scanning alerts (GitHub) | bca check … --output-format sarif --no-fail + github/codeql-action/upload-sarif |
| Merge-request widget (GitLab Code Quality) | bca check … --output-format code-climate --no-fail |
| Jenkins / SonarQube ingestion | bca check … --output-format checkstyle |
| Human-readable PR/MR comment or downloadable | bca report markdown --top 20 --strip-prefix "$PWD/" |
| Machine-readable artifact for dashboards | bca metrics --output-format json --output ./out |

(‡) Recommended adoption path when introducing thresholds on a codebase with existing offenders. See the Baselines recipe for the bootstrap-refresh-retire workflow.

The full reference for bca check's output formats, exit codes (0 clean, 2 violation, 1 tool error), and threshold config lives in the Check command page. For the Markdown report shape, see the Report command page and the Quality reports recipe.

GitHub Actions

Live worked example

big-code-analysis runs the recipes below against its own source on every push and PR. The workflow source, .github/workflows/pages.yml, exercises the GitHub-Releases install path, the cache, the baseline-ratcheted gate, and both report formats. The output is published on GitHub Pages alongside this book.

Copy snippets below straight into your own workflow; the bca version quoted is the latest published release at the time of writing.

Threshold gate, SARIF, and clang-warning matcher

The three pre-existing recipes — hard threshold gate, SARIF upload to Code Scanning, and clang-warning + GCC problem matcher for inline PR annotations — live in the Check command page. Use the link rather than re-implementing them here.

The fastest, most reproducible install path is the prebuilt tarball from this repository's GitHub Releases. It is a single curl | sha256sum | tar, requires no Rust toolchain, and produces byte-identical binaries across runs. Pair it with actions/cache keyed by version so a green-path rerun skips the download entirely:

env:
  BCA_VERSION: "1.1.0"
  BCA_TARGET:  "x86_64-unknown-linux-gnu"
  # sha256 of big-code-analysis-${BCA_VERSION}-${BCA_TARGET}.tar.gz from the
  # release's SHA256SUMS file. Bump together with BCA_VERSION.
  BCA_SHA256:  "f11c324fd80787e1a9edf99d3c1763980e035e51abb5479527b14b1e2f83e919"

steps:
  # Cache key MUST include BCA_SHA256 (and BCA_TARGET). Without the
  # sha256 in the key, rotating the published checksum without bumping
  # the version returns a stale binary on cache hit and silently
  # bypasses the `sha256sum --check` in the install step (which is
  # gated on cache miss). Including BCA_TARGET matters when the same
  # workflow runs against multiple `runs-on`.
  - name: Cache bca binary
    id: bca-cache
    uses: actions/cache@v5
    with:
      path: ~/.local/bin/bca
      key: bca-${{ runner.os }}-${{ env.BCA_TARGET }}-${{ env.BCA_VERSION }}-${{ env.BCA_SHA256 }}

  - name: Install bca from GitHub Releases
    if: steps.bca-cache.outputs.cache-hit != 'true'
    run: |
      set -euo pipefail
      stage="big-code-analysis-${BCA_VERSION}-${BCA_TARGET}"
      tarball="${stage}.tar.gz"
      url="https://github.com/dekobon/big-code-analysis/releases/download/v${BCA_VERSION}/${tarball}"
      mkdir -p "$HOME/.local/bin"
      curl -fsSL --proto '=https' --tlsv1.2 -o "/tmp/${tarball}" "$url"
      echo "${BCA_SHA256}  /tmp/${tarball}" | sha256sum --check --strict -
      tar -xzf "/tmp/${tarball}" -C /tmp
      install -m 0755 "/tmp/${stage}/bca" "$HOME/.local/bin/bca"
      rm -rf "/tmp/${tarball}" "/tmp/${stage}"

  - name: Prepend ~/.local/bin to PATH
    run: echo "$HOME/.local/bin" >> "$GITHUB_PATH"

Available BCA_TARGET values (pick the one that matches runs-on): x86_64-unknown-linux-gnu, x86_64-unknown-linux-musl, aarch64-unknown-linux-gnu, aarch64-unknown-linux-musl, aarch64-apple-darwin, x86_64-pc-windows-msvc, aarch64-pc-windows-msvc. Windows assets use .zip instead of .tar.gz; the bca-web binary ships alongside bca in the same archive.

Alternative: cargo install via prebuilt-aware actions

When you cannot reach github.com from a runner (air-gapped, custom mirror) but can reach crates.io, the following two actions fall back transparently to cargo install when no prebuilt is published — at the cost of compile time on the cold path. Both pin to the same crates.io release as the GitHub Releases assets:

# Option 1: taiki-e/install-action
- name: Install bca
  uses: taiki-e/install-action@v2
  with:
    tool: big-code-analysis-cli@1.1.0

# Option 2: cargo-binstall
- name: Install cargo-binstall
  uses: cargo-bins/cargo-binstall@main
- name: Install bca
  run: cargo binstall --no-confirm big-code-analysis-cli --version 1.1.0

If either action falls back to compilation, cache the cargo registry + the installed binary so the second run is fast:

- name: Cache cargo registry and bca binary
  uses: actions/cache@v5
  with:
    path: |
      ~/.cargo/registry
      ~/.cargo/git
      ~/.cargo/bin/bca
    # crates.io publishes immutable releases, so a `<version>` key is
    # sufficient here — there is no sha256 to rotate. (The GitHub
    # Releases install path above is different: republished release
    # assets share a version, so its cache key must include the sha256.)
    key: bca-${{ runner.os }}-1.1.0

Pin to a specific version (matching a published big-code-analysis-cli release on crates.io) so reports stay reproducible across runs. A floating install surfaces metric-counting changes as "mysterious CI flakes" on Mondays.

Posting the Markdown report as a PR comment

bca report markdown is purpose-built for PR/MR comments: a stable header structure, one row per hot spot, and short paths once you pass --strip-prefix. Pair it with marocchino/sticky-pull-request-comment so each push updates a single comment instead of stacking new ones:

name: bca-pr-report
on:
  pull_request:
    branches: [main]
jobs:
  report:
    runs-on: ubuntu-latest
    permissions:
      pull-requests: write
    steps:
      - uses: actions/checkout@v4
      - name: Install bca
        uses: taiki-e/install-action@v2
        with:
          tool: big-code-analysis-cli@1.1.0
      - name: Generate report
        run: |
          bca \
            --paths "$PWD" \
            --num-jobs "$(nproc)" \
            report markdown \
            --top 20 \
            --strip-prefix "$PWD/" \
            --output report.md
      - name: Post or update PR comment
        uses: marocchino/sticky-pull-request-comment@v2
        with:
          path: report.md
          header: bca-quality-report

The same Markdown file is suitable for upload as a build artifact (actions/upload-artifact@v7) if you want it downloadable from the workflow run page in addition to the PR comment.

Baseline / ratchet pattern

bca check --baseline is the native ratchet: record today's offenders in a committed TOML file, fail only on regressions and new offenders, and shrink the file over time. Bootstrap once, commit, then point CI at it:

# Once, on a developer machine. Commit both files.
bca --paths src/ check \
    --config bca-thresholds.toml \
    --write-baseline .bca-baseline.toml
git add bca-thresholds.toml .bca-baseline.toml

Path-style stickiness. Baseline entries are keyed by the exact path string bca emits at write time. --paths src/ records src/foo.rs, --paths . records ./src/foo.rs, and --paths "$PWD" records the absolute path. The subsequent bca check --baseline MUST use the same --paths form, or every entry mismatches and the gate fails on every existing offender. Pick one form and apply it consistently in CI and in the bootstrap command.

This snippet bootstraps from src/ only — appropriate for a single-crate library. For a multi-crate workspace, see the live worked example: its .github/workflows/pages.yml scans the entire repo with --exclude-from .bcaignore, a checked-in deny-set covering vendored grammars, generated trees, and tests.

Share the exclude list across workflow, recipe, and bootstrap. Put the deny-set in a single file at the repo root (a .bcaignore by convention, mirroring .gitignore / .dockerignore) and point every bca invocation at it with --exclude-from .bcaignore. Patterns from --exclude-from are unioned with any inline --exclude <GLOB> flags into one deny-set — keep --exclude for one-off ad-hoc excludes. Blank lines and #-prefixed comment lines in the file are skipped. Patterns follow the same ./-prefix convention as --exclude arguments (the walker's emitted form). Pair edits to .bcaignore with a --write-baseline refresh — the baseline keys are sensitive to which files the walker visits.

- name: Threshold check with baseline
  run: |
    bca --paths src/ check \
        --config bca-thresholds.toml \
        --baseline .bca-baseline.toml

A regressed function (current value > baseline value) still fails. A new offender not in the baseline still fails. An improved function passes silently and stays in the baseline until the next --write-baseline refresh.

Each surviving violation in the stderr stream is prefixed with a tag so a developer can tell at a glance whether they are looking at a brand-new offender or a known one that has worsened:

  • [new] — no baseline entry for this function / metric.
  • [regr +N%] — current value exceeds the recorded baseline by N percent. Special forms: [regr from 0] when the baseline value was zero, [regr +>9999%] when the regression exceeds 100× the baseline, [regr NaN] when the current value is NaN.
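As a rough model of the filtering and tagging rules above, the sketch below maps a recorded baseline value and a current value to a tag. It assumes a higher-is-worse metric; the function name, the whole-percent rounding, and applying the cap to the printed percentage are assumptions of this sketch, not documented bca behaviour.

```python
import math


def baseline_tag(recorded, current):
    """Return the stderr tag for a violation, or None when it is filtered.

    `recorded` is the baseline value (None when there is no entry).
    Rounding to a whole percent is an assumption of this sketch.
    """
    if recorded is None:
        return "[new]"
    if math.isnan(current):
        return "[regr NaN]"
    if current <= recorded:
        return None  # unchanged or improved: passes silently
    if recorded == 0:
        return "[regr from 0]"  # a percentage would divide by zero
    percent = (current - recorded) / recorded * 100
    if percent > 9999:
        return "[regr +>9999%]"
    return f"[regr +{round(percent)}%]"


print(baseline_tag(None, 63))  # [new]
print(baseline_tag(10, 16))    # [regr +60%]
print(baseline_tag(20, 18))    # None
```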

After the per-violation lines the stderr stream emits a per-file rollup footer with the format <path>: <count> violations (worst: <metric> = <value> vs limit <limit> at L<start>), sorted by violation count descending. This is intended to be the first thing a reader looks at: which file has the most problems, and which metric is the loudest in that file. Pass --no-summary to suppress the footer for downstream tooling that grep-pipes the stderr stream.

Refresh after focused refactors:

bca --paths src/ check \
    --config bca-thresholds.toml \
    --write-baseline .bca-baseline.toml
git diff .bca-baseline.toml   # expect a shrinking file

Two --write-baseline runs over an unchanged tree produce byte-identical output, so spurious diffs only appear when offenders actually changed. See the Baselines recipe for the full adoption flow, PR-review heuristics, and the suppression composition rules.

Offender-count delta against merge base (stopgap)

For teams who cannot commit a baseline file (e.g. policy reasons), a coarser approximation counts <error> elements in two Checkstyle documents — one on the merge base, one on the PR head — and fails when the count grows:

- name: Compute offender deltas vs. merge base
  run: |
    set -euo pipefail
    BASE="$(git merge-base origin/main HEAD)"
    git worktree add /tmp/base "$BASE"

    bca --paths /tmp/base check \
        --config bca-thresholds.toml \
        --output-format checkstyle \
        --output /tmp/base.xml \
        --no-fail
    BASE_COUNT=$(grep -c "<error" /tmp/base.xml || true)

    bca --paths "$PWD" check \
        --config bca-thresholds.toml \
        --output-format checkstyle \
        --output /tmp/head.xml \
        --no-fail
    HEAD_COUNT=$(grep -c "<error" /tmp/head.xml || true)

    echo "Offenders: base=$BASE_COUNT head=$HEAD_COUNT"
    if [ "$HEAD_COUNT" -gt "$BASE_COUNT" ]; then
      echo "::error::Offender count grew from $BASE_COUNT to $HEAD_COUNT"
      exit 1
    fi

This counts violations, not their identity: renaming an offender does not register as a regression, and improving one offender while regressing another nets to zero. The native baseline flow above is strictly more precise and is the recommended approach.
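The blind spot is easy to reproduce. The sketch below counts error elements with an XML parser (so it counts elements rather than matching lines) and compares two documents. The file and error layout follows the standard Checkstyle format; the attribute values in the sample are made up, and the exact attributes bca emits are an assumption here.

```python
import xml.etree.ElementTree as ET


def count_errors(checkstyle_xml):
    """Count <error> elements in a Checkstyle document given as a string."""
    root = ET.fromstring(checkstyle_xml)
    return sum(1 for _ in root.iter("error"))


def offenders_grew(base_xml, head_xml):
    """True when the head document has more <error> elements than the base."""
    return count_errors(head_xml) > count_errors(base_xml)


base = '<checkstyle version="4.3"><file name="a.rs"><error line="1" message="x"/></file></checkstyle>'
head = '<checkstyle version="4.3"><file name="b.rs"><error line="9" message="y"/></file></checkstyle>'

# One offender fixed, a different one introduced: the count is unchanged,
# so the delta check cannot see the swap.
print(offenders_grew(base, head))  # False
```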

Self-scan threshold gate (local mirror of the CI gate)

CI's threshold gate fires only after push, which is too late if a refactor silently nudged a metric past its limit. The big-code-analysis repo's Makefile exposes four targets that mirror the CI gate (the Threshold gate step in .github/workflows/pages.yml) locally and add a second tier at 95% of every limit so encroachment is caught a commit or two before the hard gate trips:

make self-scan                            # hard gate, 100% of bca-thresholds.toml
make self-scan-headroom                   # soft gate, default 95% (BCA_HEADROOM)
make self-scan-write-baseline             # refresh baseline at hard thresholds
make self-scan-write-baseline-headroom    # refresh baseline at soft thresholds

The hard tier is exactly what CI runs; expanded, it is:

cargo run --quiet --release -p big-code-analysis-cli -- \
    --paths . --exclude-from .bcaignore \
    check \
    --config bca-thresholds.toml \
    --baseline .bca-baseline.toml

Both tiers consume the same bca-thresholds.toml and the same .bca-baseline.toml; the soft tier just runs the hard recipe with every threshold value multiplied by BCA_HEADROOM. Both exit 0 clean, 2 on any threshold violation, 1 on tool error — the soft tier is a real gate, not advisory, so do not wrap make self-scan-headroom in || true. All four targets are wired into make pre-commit, make ci, and .pre-commit-config.yaml, with self-scan-headroom: self-scan as a Make prerequisite so the hard tier always reports a true regression before the soft tier reports near-limit headroom.

BCA_HEADROOM=0.90 make self-scan-headroom widens the band; BCA_HEADROOM=0.99 tightens it to the last 1%. When the soft tier fires, absorb the offender into the baseline with make self-scan-write-baseline-headroom (which records every offender at the scaled thresholds — strictly a superset of the hard-tier offenders).

The pattern (hard tier mirroring CI + soft tier as early-warning band, both ratcheted by the same baseline) is project-agnostic — the Local threshold gates recipe documents the underlying principles, drop-in Makefile / just / package.json skeletons, and the helper script that scales thresholds, so you can adopt the same workflow in your own repo. The generic recipe uses the same BCA_* env-var names as the Makefile above, so overrides like BCA_HEADROOM=0.90 work identically across both.

GitLab CI

Full .gitlab-ci.yml example

The job below installs bca, runs the threshold check producing Code Climate JSON (for the MR Code Quality widget), Checkstyle XML, and a Markdown report, then uploads them as artifacts:

stages:
  - quality

variables:
  BCA_VERSION: "1.1.0"  # pin a published big-code-analysis-cli release
  BCA_TARGET:  "x86_64-unknown-linux-gnu"
  # sha256 of big-code-analysis-${BCA_VERSION}-${BCA_TARGET}.tar.gz from
  # the release's SHA256SUMS file. Bump together with BCA_VERSION.
  BCA_SHA256:  "f11c324fd80787e1a9edf99d3c1763980e035e51abb5479527b14b1e2f83e919"

bca-quality:
  stage: quality
  image: debian:stable-slim
  cache:
    # Same key shape as the GitHub Actions snippet — bumping
    # BCA_VERSION invalidates the cache automatically.
    key: "bca-$BCA_VERSION"
    paths:
      - .cache/bca/
  before_script:
    - apt-get update -qq && apt-get install -y --no-install-recommends ca-certificates curl tar
    - |
      set -euo pipefail
      install -d "$CI_PROJECT_DIR/.cache/bca" "$HOME/.local/bin"
      if [ ! -x "$CI_PROJECT_DIR/.cache/bca/bca" ]; then
        stage="big-code-analysis-${BCA_VERSION}-${BCA_TARGET}"
        tarball="${stage}.tar.gz"
        url="https://github.com/dekobon/big-code-analysis/releases/download/v${BCA_VERSION}/${tarball}"
        curl -fsSL --proto '=https' --tlsv1.2 -o "/tmp/${tarball}" "$url"
        echo "${BCA_SHA256}  /tmp/${tarball}" | sha256sum --check --strict -
        tar -xzf "/tmp/${tarball}" -C /tmp
        install -m 0755 "/tmp/${stage}/bca" "$CI_PROJECT_DIR/.cache/bca/bca"
        rm -rf "/tmp/${tarball}" "/tmp/${stage}"
      fi
      install -m 0755 "$CI_PROJECT_DIR/.cache/bca/bca" "$HOME/.local/bin/bca"
      export PATH="$HOME/.local/bin:$PATH"
  script:
    - bca
        --paths "$PWD"
        --num-jobs "$(nproc)"
        check
        --config bca-thresholds.toml
        --output-format code-climate
        --output gl-code-quality-report.json
        --no-fail
    - bca
        --paths "$PWD"
        --num-jobs "$(nproc)"
        check
        --config bca-thresholds.toml
        --output-format checkstyle
        --output bca-checkstyle.xml
        --no-fail
    - bca
        --paths "$PWD"
        --num-jobs "$(nproc)"
        report markdown
        --top 20
        --strip-prefix "$PWD/"
        --output bca-report.md
    # The threshold gate runs separately so the artifacts above still
    # publish on failure. Exit 2 = at least one threshold exceeded.
    - bca --paths "$PWD" check --config bca-thresholds.toml
  artifacts:
    when: always
    reports:
      codequality: gl-code-quality-report.json
    paths:
      - gl-code-quality-report.json
      - bca-checkstyle.xml
      - bca-report.md

A few notes about the example:

  • The first two bca check … --no-fail invocations collect offenders for the artifacts; the final bca check (no --no-fail) is the pass/fail gate. All three runs use the same threshold config so the artifacts always match the gate decision.
  • artifacts:when: always ensures every artifact is downloadable even on a red pipeline — which is exactly when you want them most.
  • artifacts:reports:codequality wires the Code Climate JSON directly into GitLab's MR Code Quality widget — see the Code Quality widget section below for the field-by-field semantics.

GitLab Code Quality widget

GitLab's first-class Code Quality experience (inline complaints on the MR diff, summary on the MR overview page) consumes Code Climate JSON. bca check emits this natively via --output-format code-climate, so the integration is a one-liner:

code_quality:
  stage: quality
  script:
    - bca --paths "$CI_PROJECT_DIR" check
          --config bca-thresholds.toml
          --output-format code-climate
          --output gl-code-quality-report.json
          --no-fail
  artifacts:
    when: always
    reports:
      codequality: gl-code-quality-report.json
    paths:
      - gl-code-quality-report.json

Severity bands are derived from how far each metric exceeds its configured threshold (value / limit ratio, inverted for the maintainability-index family where lower is worse): ≤ 1.5× is minor, ≤ 2× is major, ≤ 4× is critical, and > 4× is blocker. The widget deduplicates findings by fingerprint; bca hashes path \0 function \0 metric (no line, no value) so a violation surviving an upstream line-drift edit still collapses into the same widget entry across pipeline runs.
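The banding and fingerprint rules can be sketched as follows. This is an approximation for reasoning about the output, not bca's code: the function names are hypothetical, SHA-256 is an assumed hash (the text above does not name one), limits are assumed positive, and mapping a non-positive lower-is-worse value to blocker is a choice made only for this sketch.

```python
import hashlib


def severity(value, limit, lower_is_worse=False):
    """Map a violation to a Code Climate severity using the documented bands."""
    if lower_is_worse:
        # Inverted ratio for the maintainability-index family.
        ratio = float("inf") if value <= 0 else limit / value
    else:
        ratio = value / limit
    if ratio <= 1.5:
        return "minor"
    if ratio <= 2:
        return "major"
    if ratio <= 4:
        return "critical"
    return "blocker"


def fingerprint(path, function, metric):
    """Hash path, function and metric joined by NUL bytes (no line, no value).

    SHA-256 is an assumption; the docs do not say which hash bca uses.
    """
    joined = "\0".join((path, function, metric))
    return hashlib.sha256(joined.encode("utf-8")).hexdigest()


print(severity(63, 25))              # critical (ratio 2.52)
print(severity(1557107.72, 50000))   # blocker
```

Because neither the line number nor the value feeds the hash, the same function and metric produce the same fingerprint after unrelated edits move the function around in the file.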

Sanity-check a generated report locally:

jq 'all(.[]; has("description") and has("check_name")
     and has("fingerprint") and has("severity")
     and has("location"))' gl-code-quality-report.json
# → true
jq '[.[] | .severity] | unique' gl-code-quality-report.json
# → a subset of ["info","minor","major","critical","blocker"]

MR-only comment with the Markdown report

To attach the Markdown report as an MR note (the GitLab analogue of the GitHub PR comment recipe), use the project access token and the Notes API:

bca-mr-comment:
  stage: quality
  image: alpine:3
  rules:
    - if: $CI_PIPELINE_SOURCE == "merge_request_event"
  needs: ["bca-quality"]
  before_script:
    - apk add --no-cache curl jq
  script:
    - |
      BODY=$(jq -Rs '.' < bca-report.md)
      curl --fail --silent --show-error \
        --request POST \
        --header "PRIVATE-TOKEN: $CI_BCA_BOT_TOKEN" \
        --header "Content-Type: application/json" \
        --data "{\"body\": $BODY}" \
        "$CI_API_V4_URL/projects/$CI_PROJECT_ID/merge_requests/$CI_MERGE_REQUEST_IID/notes"

CI_BCA_BOT_TOKEN is a project access token with api scope. The job depends on bca-quality so the Markdown artifact is in place before it runs.

Jenkins / SonarQube

Both Jenkins (via the Warnings Next Generation plugin) and SonarQube (via its Generic Issue importer) consume Checkstyle 4.3 XML directly. The same invocation feeds both:

bca --paths src/ check \
    --config bca-thresholds.toml \
    --output-format checkstyle \
    --output report.checkstyle.xml

Wire report.checkstyle.xml into your existing Jenkins Record Issues / SonarQube External Issues step. The Checkstyle writer emits an empty (well-formed) document when there are no offenders, so neither tool needs special-casing for a clean run. See the Check command page for the writer's schema details.

Generic CI guidance

Applies regardless of provider:

  • Pin bca to a specific version. Both cargo install --version and cargo binstall --version accept the published crate version of big-code-analysis-cli. A floating install surfaces metric-counting changes as "mysterious CI flakes" on Mondays.
  • Use --num-jobs "$(nproc)". The walker is CPU-bound on modern hardware; --num-jobs 1 is a debugging knob, not a default.
  • Always pass --strip-prefix "$PWD/" to bca report markdown so the path column is identical across runners with different workspace paths. Without it the diff between two reports is dominated by /home/runner/work/... vs. /builds/group/project/... noise.
  • Store bca-thresholds.toml at the repo root, alongside Cargo.toml / pyproject.toml / package.json. Treat it as source: review threshold relaxations in code review.
  • Exit-code contract. bca check exits 0 clean, 2 on any threshold violation, 1 on tool error (bad config, unknown metric, unreadable path). Reserving 1 for tool errors lets CI distinguish "a function got too complex" from "the analyzer crashed".
  • Honor in-source suppression markers, audit with --no-suppress. The default bca check honors bca: suppress / bca: suppress-file markers; passing --no-suppress ignores them so auditors see the raw offender list.
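For wrapper scripts that need to branch on the exit-code contract in the list above, a minimal sketch might look like this. The helper names are hypothetical; only the 0 / 1 / 2 meanings come from the documentation.

```python
import subprocess

# Exit statuses documented for `bca check`.
OUTCOMES = {
    0: "clean",
    1: "tool error",
    2: "threshold violation",
}


def describe_exit(code):
    """Translate a bca check exit status into the documented outcome."""
    return OUTCOMES.get(code, f"unexpected exit code {code}")


def run_gate(argv):
    """Run a gate command and return (exit code, outcome description)."""
    completed = subprocess.run(argv, check=False)
    return completed.returncode, describe_exit(completed.returncode)


print(describe_exit(2))  # threshold violation
```

A caller would pass the same argument list CI uses, for example run_gate(["bca", "--paths", "src/", "check", "--config", "bca-thresholds.toml"]), and treat "tool error" differently from "threshold violation".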

Baselines: ratcheting thresholds on existing code

When you introduce metric thresholds on an existing codebase, you usually hit the same wall: every reasonable threshold flags hundreds of existing functions, and CI goes red on every push. The realistic adoption path is "ratchet from current state, fail only on new offenders". The baseline file (issue #99) is how bca check supports that workflow.

Baselines are the complement to in-source suppression markers, not a substitute. Use suppression markers (see Suppression markers) when a function is intentionally complex forever (a parser, a state machine, generated code). Use a baseline when the team intends to pay the debt down. Both can live in the same repo; suppression is checked first.

End-to-end adoption flow

1. Pick initial thresholds

Either gut-feel numbers (cyclomatic=15, cognitive=20) or pull them from a bca check --no-fail run over the repo to see the current distribution.

# bca-thresholds.toml
[thresholds]
cyclomatic = 15
cognitive = 20
"loc.lloc" = 200

2. Bootstrap the baseline

bca --paths src/ check \
    --config bca-thresholds.toml \
    --write-baseline .bca-baseline.toml

Commit both files in the same change:

git add bca-thresholds.toml .bca-baseline.toml
git commit -m "ci: introduce metric thresholds with baseline"

3. Wire the CI gate

GitHub Actions:

- name: Check code complexity thresholds
  run: |
    bca --paths src/ check \
        --config bca-thresholds.toml \
        --baseline .bca-baseline.toml

GitLab CI (snippet for the relevant job):

threshold-check:
  image: rust:1
  before_script:
    - cargo install --locked big-code-analysis-cli@<VERSION>
  script:
    - bca --paths src/ check
        --config bca-thresholds.toml
        --baseline .bca-baseline.toml

Exit codes: 0 clean, 2 regression or new offender, 1 tool error. See CI integration for the broader matrix of CI surfaces.

4. Refresh the baseline as the team pays debt down

Every few weeks, or after a focused refactor:

bca --paths src/ check \
    --config bca-thresholds.toml \
    --write-baseline .bca-baseline.toml
git diff .bca-baseline.toml

A shrinking diff is the goal. Two --write-baseline runs over an unchanged tree produce byte-identical output, so spurious diffs only appear when actual offenders changed.

5. PR-review heuristics

  • Baseline shrank. Debt paid down. No further action.
  • Baseline grew. Someone added a new offender to the file intentionally. Review the values — was this a deliberate stopgap, or did the author bypass the gate? Either is fine if conscious; the point of the file being committed is to make the choice reviewable.
  • A single entry got a higher value. The author re-ran --write-baseline after the function got worse. Treat the same as "baseline grew" — surface the change in review.
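The three review cases above can be expressed as a small comparison of two baseline snapshots. The sketch assumes each baseline has already been loaded into a dict keyed by (path, function, start_line, metric); that in-memory shape is a stand-in for illustration, not the on-disk TOML schema, and the function name is hypothetical.

```python
def review_baseline_change(old, new):
    """Classify a baseline diff into the review cases listed above.

    `old` and `new` map (path, function, start_line, metric) to the
    recorded value. Higher-is-worse metrics are assumed.
    """
    added = [key for key in new if key not in old]
    removed = [key for key in old if key not in new]
    raised = [key for key in new if key in old and new[key] > old[key]]

    notes = []
    if removed and not added and not raised:
        notes.append(f"baseline shrank: {len(removed)} removed")
    if added:
        notes.append(f"baseline grew: {len(added)} added, review")
    if raised:
        notes.append(f"value raised: {len(raised)} higher, review")
    return notes


old = {
    ("src/foo.rs", "parse", 10, "cyclomatic"): 22,
    ("src/bar.rs", "run", 5, "cognitive"): 30,
}
new = {("src/foo.rs", "parse", 10, "cyclomatic"): 22}
print(review_baseline_change(old, new))  # ['baseline shrank: 1 removed']
```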

Reading the gate output

A failing bca check --baseline run prefixes each surviving violation with a tag and follows the list with a per-file rollup:

bca: filtered 422 violations via baseline
[regr +60%] src/foo.rs:1-865: <file>: halstead.effort = 1557107.72 (limit 50000)
[new] src/bar.rs:506-747: act_on_file: cognitive = 63 (limit 25)
...

--- summary ---
src/foo.rs: 5 violations (worst: halstead.effort = 1557107.72 vs limit 50000 at L1)
src/bar.rs: 4 violations (worst: cognitive = 63 vs limit 25 at L506)

Tag prefixes:

  • [new] — no baseline entry for this (path, function, start_line, metric) tuple. The violation is new since the baseline was written.
  • [regr +N%] — the baseline contains a recorded value and the current value is N% higher. Cases:
    • [regr from 0] when the recorded value is 0.0, because computing a percentage would divide by zero.
    • [regr +>9999%] caps once the regression exceeds 100× the baseline value.
    • [regr NaN] when the current metric value is NaN (degenerate Halstead inputs on trivial functions).

Tags only appear when --baseline is passed; without it the line format is byte-identical to the no-baseline default. CI tooling that grep-pipes the stderr stream can suppress the trailing summary with --no-summary.

The summary footer groups violations by file, cites the single worst metric per file (max value / limit ratio), and sorts rows by violation count descending then path ascending. It is the fastest way to read a long offender list and spot which file to start with.
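The grouping and ordering rules for the footer can be modelled as below. It is a sketch under simplifying assumptions: violations are plain tuples, every metric is treated as higher-is-worse (the inverted ratio for the maintainability-index family is left out), and the function name is hypothetical.

```python
from collections import defaultdict


def summary_rows(violations):
    """Build footer rows from (path, metric, value, limit, start_line) tuples."""
    by_file = defaultdict(list)
    for violation in violations:
        by_file[violation[0]].append(violation)

    rows = []
    for path, items in by_file.items():
        # Worst metric = largest value / limit ratio within the file.
        _, metric, value, limit, start = max(items, key=lambda v: v[2] / v[3])
        rows.append((len(items), path, metric, value, limit, start))

    # Violation count descending, then path ascending.
    rows.sort(key=lambda row: (-row[0], row[1]))
    return [
        f"{path}: {count} violations "
        f"(worst: {metric} = {value} vs limit {limit} at L{start})"
        for count, path, metric, value, limit, start in rows
    ]


violations = [
    ("src/foo.rs", "halstead.effort", 1557107.72, 50000, 1),
    ("src/foo.rs", "cognitive", 30, 25, 40),
    ("src/bar.rs", "cognitive", 63, 25, 506),
]
for row in summary_rows(violations):
    print(row)
```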

6. Retire the baseline

When .bca-baseline.toml contains only version = 2 and no entries, drop the --baseline flag from CI and delete the file. The thresholds now stand on their own.

Composition with suppression markers

--write-baseline already excludes any function silenced by a bca: suppress or #lizard forgives marker, so the same function doesn't end up in two places. If a function is intentionally exempt forever, prefer the in-source marker (lives next to the code, survives refactors, no extra file to commit). Use the baseline only for violations the team genuinely intends to fix.

To audit the un-filtered offender set — every violation regardless of suppression or baseline — pass --no-suppress and omit --baseline:

bca --paths src/ check \
    --config bca-thresholds.toml \
    --no-suppress \
    --no-fail

Combined with --write-baseline, --no-suppress records every violation including the ones that suppression markers normally hide.

Limitations

  • Line drift. Entries key on (path, function, start_line, metric). Editing code above a function shifts its start_line and the baseline entry stops matching, surfacing as a "new" offender. Refresh with --write-baseline and commit the diff.
  • Path identity. Entries record the path the walker saw. Run --write-baseline and --baseline from the same working directory with the same --paths argument; a relative --paths src/ and an absolute --paths /repo/src/ produce non-matching baselines.
  • OS portability. Paths are normalized to forward slashes on write and re-normalized on read, so a baseline generated on Linux matches the same tree on Windows. Non-UTF-8 paths fall back to a lossy display form and may not round-trip exactly.
  • Tightening a threshold. Lowering a limit may newly expose functions that were previously clean. They will not be in the baseline → CI will fail. This is correct — tightening should expose new offenders. Refresh the baseline if the team chooses to absorb the new entries.
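The first three limitations all follow from how entries are matched. The sketch below models the key as a tuple and normalises backslashes as described under OS portability; the function name is hypothetical and the real matching logic may differ in detail.

```python
def baseline_key(path, function, start_line, metric):
    """Build the lookup key for a baseline entry.

    Backslashes are normalised to forward slashes, mirroring the OS
    portability note; every other field is compared exactly.
    """
    return (path.replace("\\", "/"), function, start_line, metric)


recorded = {baseline_key("src/foo.rs", "parse", 120, "cyclomatic")}

# Same function scanned on Windows: still matches after normalisation.
print(baseline_key("src\\foo.rs", "parse", 120, "cyclomatic") in recorded)   # True
# Three lines added above the function: start_line drifts, no match.
print(baseline_key("src/foo.rs", "parse", 123, "cyclomatic") in recorded)    # False
# Scanned with --paths . instead of --paths src/: path differs, no match.
print(baseline_key("./src/foo.rs", "parse", 120, "cyclomatic") in recorded)  # False
```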

Local threshold gates

CI is the last line of defence, not the first. By the time bca check --config bca-thresholds.toml --baseline .bca-baseline.toml fires red on a pull request, the offending change has already been pushed, the author has context-switched, and someone has to revisit the diff to nudge a metric back under its limit. A local threshold gate moves that feedback to the moment of git commit — the same moment cargo fmt --check and cargo clippy -- -D warnings already fire — so the regression never makes it past the developer's keyboard.

This recipe captures the pattern big-code-analysis uses on its own source (Makefile's self-scan* targets) and distils it into something you can drop into your own repo's Makefile, justfile, package.json script, or pre-commit config. The underlying idea is provider-neutral: any threshold checker (bca, ESLint, clippy, SonarLint, Qodana) can be wired the same way.

Principles

Three principles drive the design. They are not specific to bca; they are the same conclusions Sonar reached when it pivoted its default Quality Gate to focus on new code and that the broader ratchet pattern formalises.

  1. Gate locally, mirror CI exactly. The local gate must run the same binary with the same arguments and the same threshold / baseline / exclude files as CI. If the local gate is "almost what CI runs", it stops catching regressions the moment one diverges from the other. Running the gate once before pushing is cheap; a red PR-bot ping is not.
  2. Ratchet, don't reset. When you introduce thresholds on an existing codebase, every reasonable limit fires on dozens of pre-existing functions. The realistic adoption path is "absorb today's offenders into a baseline file, fail only on new or worsening ones, shrink the baseline over time". This is the same strategy that lets a multi-year codebase introduce strict TypeScript or strict clippy lints without a months-long boil-the-ocean pass. See the Baselines recipe for the bootstrap → CI → refresh → retire flow.
  3. Warn before you fail. A hard 100% gate fails at the limit and gives no signal as a function creeps from 80% to 95% to 99% of its threshold. A second, looser tier that fires at e.g. 95% of every limit gives a one-or-two-commit early warning. The author still has the file open, the test cases in their head, and the freedom to refactor before the offender hardens into "well, it's in main now". Sonar's "new code" Quality Gate, the GCC -Wall / -Werror split, and clippy's warn vs. deny lint levels all encode the same insight: a tier between clean and broken is where teams actually catch drift.

The two tiers

The pattern is two recipes wrapping the same checker, plus two recipes for refreshing the baseline at each tier.

| Target | Tier | Thresholds | Baseline-filtered | Use case |
| --- | --- | --- | --- | --- |
| self-scan | hard | 100% of config | yes | Mirror of CI. Must stay green on every commit. |
| self-scan-headroom | soft | config × HEADROOM | yes | Early-warning band. Fires before the hard tier. |
| self-scan-write-baseline | hard | 100% of config | (write) | Absorb today's hard-tier offenders. |
| self-scan-write-baseline-headroom | soft | config × HEADROOM | (write) | Absorb soft-tier offenders when launching or widening the band. |

The hard tier and the soft tier consume the same bca-thresholds.toml and the same .bca-baseline.toml. The only difference between them is a scalar multiplier applied to every threshold value before bca check sees it.

This matters: it means a contributor who wants the soft tier to be stricter (catch encroachment further out) bumps a single environment variable rather than maintaining a parallel bca-thresholds-soft.toml that will drift out of sync with the hard config the first time anyone forgets to update both files.

Skeleton: GNU Make

The four recipes below are a self-contained drop-in. Adjust the BCA variable to point at whatever invocation gives you the checker (a pinned release binary, cargo run --release, an npm / pip wrapper). Adjust BCA_PATHS and BCA_EXCLUDE_FROM to match your layout.

# --- bca local threshold gates ------------------------------------------
# HARD tier mirrors CI exactly. Both tiers consume the same
# bca-thresholds.toml + .bca-baseline.toml; the soft tier scales every
# threshold by $(BCA_HEADROOM) (default 0.95).
#
# Knobs are namespaced with `BCA_` so they don't collide with anything
# else in your environment. The big-code-analysis repo's own Makefile
# uses the same names — this skeleton is drop-in for that project too.
BCA               := bca
BCA_PATHS         := .
BCA_EXCLUDE_FROM  := .bcaignore
BCA_THRESHOLDS    := bca-thresholds.toml
BCA_BASELINE      := .bca-baseline.toml
BCA_HEADROOM      ?= 0.95

# `PY` lets Windows hosts override to `py -3` (the stock python.org
# installer ships `py.exe` and `python.exe` but no `python3` alias).
PY                ?= python3

# Common args, factored out so the four recipes stay in lockstep.
BCA_BASE_ARGS := --paths $(BCA_PATHS) --exclude-from $(BCA_EXCLUDE_FROM) \
                 --num-jobs $(shell nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4)

.PHONY: self-scan self-scan-headroom \
        self-scan-write-baseline self-scan-write-baseline-headroom

self-scan:
	@echo "bca self-scan (hard gate)..."
	@$(BCA) $(BCA_BASE_ARGS) check \
	  --config $(BCA_THRESHOLDS) \
	  --baseline $(BCA_BASELINE)

# `self-scan-headroom: self-scan` is intentional: under `make -j` Make
# would otherwise run both gates in parallel and the soft tier's scaled
# error message could land before the true regression on the hard tier.
# `BCA_THRESHOLDS` / `BCA_BASELINE` are exported because the helper
# reads them from the environment — see "Helper script" below.
self-scan-headroom: self-scan
	@echo "bca self-scan (soft gate, BCA_HEADROOM=$(BCA_HEADROOM))..."
	@BCA_HEADROOM=$(BCA_HEADROOM) \
	  BCA_THRESHOLDS=$(BCA_THRESHOLDS) \
	  BCA_BASELINE=$(BCA_BASELINE) \
	  $(PY) ./utils/bca-self-scan-headroom.py \
	  $(BCA) $(BCA_BASE_ARGS)

self-scan-write-baseline:
	@echo "Refreshing $(BCA_BASELINE) at hard thresholds..."
	@$(BCA) $(BCA_BASE_ARGS) check \
	  --config $(BCA_THRESHOLDS) \
	  --write-baseline $(BCA_BASELINE)

# Soft-tier baseline write. NOTE: this and `self-scan-write-baseline`
# both write `$(BCA_BASELINE)`; never compose them as parallel
# prerequisites of one umbrella target or invoke them with `make -j2`,
# or the two bca runs will race on the same file and the
# losing tier's offenders will silently vanish from the baseline.
# Run them sequentially (hard first, then soft) and commit the diff.
self-scan-write-baseline-headroom:
	@echo "Refreshing $(BCA_BASELINE) at soft thresholds (BCA_HEADROOM=$(BCA_HEADROOM))..."
	@BCA_HEADROOM=$(BCA_HEADROOM) \
	  BCA_THRESHOLDS=$(BCA_THRESHOLDS) \
	  BCA_BASELINE=$(BCA_BASELINE) \
	  BCA_HEADROOM_WRITE_BASELINE=$(BCA_BASELINE) \
	  $(PY) ./utils/bca-self-scan-headroom.py \
	  $(BCA) $(BCA_BASE_ARGS)

The helper (utils/bca-self-scan-headroom.py) reads four env vars — BCA_HEADROOM (default 0.95), BCA_THRESHOLDS (default bca-thresholds.toml), BCA_BASELINE (default .bca-baseline.toml), and the optional BCA_HEADROOM_WRITE_BASELINE switch — multiplies every value in the thresholds file by the headroom ratio, and re-emits the limits as --threshold name=value flags so bca check sees scaled limits without you having to maintain a second TOML file. The Make skeleton above exports the first three so renaming any of those paths in one place propagates to both tiers. See Helper script below for a ready-to-paste implementation.
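To make the scaling step concrete, here is a minimal sketch of just that part. It is not the helper itself: the function name is hypothetical, the thresholds are passed as an already-loaded dict rather than parsed from TOML, and rounding to six decimal places is an assumption added to keep the printed values tidy.

```python
def scaled_threshold_flags(thresholds, headroom=0.95):
    """Turn a thresholds mapping into scaled --threshold name=value flags.

    `thresholds` mirrors the [thresholds] table of bca-thresholds.toml,
    already loaded into a dict (TOML parsing is omitted in this sketch).
    """
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    flags = []
    for name, limit in thresholds.items():
        scaled = round(limit * headroom, 6)  # rounding is an assumption
        flags += ["--threshold", f"{name}={scaled}"]
    return flags


limits = {"cyclomatic": 15, "cognitive": 20, "loc.lloc": 200}
print(scaled_threshold_flags(limits, headroom=0.95))
```

With the sample limits this yields cyclomatic=14.25, cognitive=19.0 and loc.lloc=190.0, which a wrapper would append to the bca check argument list in place of --config.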

The gate exit codes propagate verbatim from bca check: 0 clean, 2 on any threshold violation (hard or soft), 1 on tool error. The soft tier is a real gate — never wrap make self-scan-headroom in || true thinking it's advisory; the non-zero exit is the whole point of the encroachment band.

Keep --paths identical across all four recipes. Baseline entries are keyed by the exact path string bca emits at write time: --paths . records ./src/foo.rs, --paths src/ records src/foo.rs, and --paths "$PWD" records the absolute path. A subsequent --baseline invocation that uses a different --paths form silently mismatches every entry and the gate re-fails on every existing offender. The skeletons above all use --paths . deliberately — if you change it, change it in every recipe and refresh .bca-baseline.toml once. See Baselines: path identity for the full caveat.

Wiring into pre-commit and CI

Add the soft gate to whatever umbrella target your developers already run before pushing. The hard gate runs as its prerequisite (see the self-scan-headroom: self-scan edge above), so listing only the soft target is enough — and crucially survives make -j, which would otherwise schedule both leaves in parallel and interleave their output:

.PHONY: pre-commit
pre-commit: fmt-check clippy test self-scan-headroom

Ordering matters: the hard tier names a true regression with the 100% limit, not the scaled one. The prerequisite edge enforces that order even under parallel Make.

In CI, run only the hard tier:

- name: Threshold gate
  run: make self-scan

The soft tier is a developer feedback knob, not a release gate. Running it in CI either duplicates the hard tier (when nothing has encroached) or fires noisily on a baseline-absorbed offender that crept upward without crossing 100% — neither buys you anything CI doesn't already cover.

The headroom knob

BCA_HEADROOM is a single scalar in (0, 1]. The interesting band is narrow:

| BCA_HEADROOM | Fires when a function reaches… | Use case |
| --- | --- | --- |
| 0.99 | 99% of any limit | Tightest possible warning, fires on the last commit before the hard gate would. |
| 0.95 | 95% of any limit (default) | One-or-two-commit lead time. Good default. |
| 0.90 | 90% of any limit | Wider band, useful immediately after raising a limit, while the new ceiling settles. |
| 1.00 | 100% (parity with hard gate) | Sanity check that the two tiers agree. |

Values below ~0.80 turn the soft tier into a second hard tier with arbitrary numbers and stop being useful: every threshold has some function near 80% of it on a real codebase, and the soft tier becomes a permanent baseline-management chore rather than an early-warning signal.
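One way to picture the band is as a three-way classification of a single metric value. The sketch assumes a higher-is-worse metric and treats "exceeds" as strictly greater-than; both are assumptions, and the function name is hypothetical.

```python
def tier_for(value, limit, headroom=0.95):
    """Say which gate a higher-is-worse metric value would trip.

    Treating "exceeds" as strictly greater-than is an assumption.
    """
    if value > limit:
        return "hard"
    if value > limit * headroom:
        return "soft"
    return "clean"


# With a limit of 200, the default band starts at 190.
print(tier_for(185, 200))                 # clean
print(tier_for(195, 200))                 # soft
print(tier_for(201, 200))                 # hard
# Widening the band to 0.90 moves the soft limit to 180.
print(tier_for(185, 200, headroom=0.90))  # soft
```

At headroom 1.00 the soft limit equals the hard limit, so nothing lands in the "soft" bucket, which is the parity case in the table above.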

When the soft tier fires

A failed soft gate is a decision point, not a bug report. There are exactly three legitimate resolutions:

  1. Refactor. Same workflow as any other complexity regression — extract a helper, collapse a dispatch arm, split the function. This is the common case, and the soft tier exists to give you the time to do it on the same branch.
  2. Raise the limit. Edit bca-thresholds.toml, leave a why-comment explaining what changed (a new language module, a genuine algorithmic floor, a re-classified macro). Re-run make self-scan-headroom to confirm the new value covers the offender with room to spare.
  3. Absorb into the baseline. Run make self-scan-write-baseline (hard tier) or make self-scan-write-baseline-headroom (soft tier) when the value is legitimate forever — a parser dispatch arm whose width matches the grammar it covers, a stable state machine, generated code. Commit the diff in .bca-baseline.toml in the same PR as the code that produced it.

Don't pick "raise the limit" silently to make the gate go away. The committed why-comment is the only audit trail the next reader has; without it the bumped limit looks indistinguishable from neglect.

Skeleton: justfile

For projects that prefer just:

# bca local threshold gates. Hard tier mirrors CI; soft tier (headroom)
# is local-only early warning.
bca         := "bca"
paths       := "."
exclude     := ".bcaignore"
thresholds  := "bca-thresholds.toml"
baseline    := ".bca-baseline.toml"
headroom    := env_var_or_default("BCA_HEADROOM", "0.95")
py          := env_var_or_default("PY", "python3")

jobs        := `nproc 2>/dev/null || sysctl -n hw.ncpu 2>/dev/null || echo 4`
base_args   := "--paths " + paths + " --exclude-from " + exclude + " --num-jobs " + jobs

self-scan:
    {{bca}} {{base_args}} \
        check --config {{thresholds}} --baseline {{baseline}}

self-scan-headroom: self-scan
    BCA_HEADROOM={{headroom}} \
        BCA_THRESHOLDS={{thresholds}} \
        BCA_BASELINE={{baseline}} \
        {{py}} ./utils/bca-self-scan-headroom.py {{bca}} {{base_args}}

self-scan-write-baseline:
    {{bca}} {{base_args}} \
        check --config {{thresholds}} --write-baseline {{baseline}}

# Like the Make skeleton, never compose this with `self-scan-write-baseline`
# in parallel — they race on the same {{baseline}} file.
self-scan-write-baseline-headroom:
    BCA_HEADROOM={{headroom}} \
        BCA_THRESHOLDS={{thresholds}} \
        BCA_BASELINE={{baseline}} \
        BCA_HEADROOM_WRITE_BASELINE={{baseline}} \
        {{py}} ./utils/bca-self-scan-headroom.py {{bca}} {{base_args}}

Skeleton: package.json scripts

For JavaScript projects pulling in bca via npx or a pinned binary. The --num-jobs flag is threaded through via the BCA_NUM_JOBS env var (defaulting to 4 in the scripts below) so the npm tier runs the same shape of command as Make / just. Per Principle 1, all three skeletons should produce byte-identical bca check invocations:

{
  "scripts": {
    "self-scan": "bca --paths . --exclude-from .bcaignore --num-jobs ${BCA_NUM_JOBS:-4} check --config bca-thresholds.toml --baseline .bca-baseline.toml",
    "self-scan-headroom": "npm run self-scan && python3 ./utils/bca-self-scan-headroom.py bca --paths . --exclude-from .bcaignore --num-jobs ${BCA_NUM_JOBS:-4}",
    "self-scan-write-baseline": "bca --paths . --exclude-from .bcaignore --num-jobs ${BCA_NUM_JOBS:-4} check --config bca-thresholds.toml --write-baseline .bca-baseline.toml",
    "self-scan-write-baseline-headroom": "BCA_HEADROOM_WRITE_BASELINE=.bca-baseline.toml python3 ./utils/bca-self-scan-headroom.py bca --paths . --exclude-from .bcaignore --num-jobs ${BCA_NUM_JOBS:-4}"
  }
}

Three portability footnotes for the npm tier:

  • Env vars beat shell expansion. The helper reads BCA_HEADROOM from the environment (default 0.95), so overriding the band is BCA_HEADROOM=0.90 npm run self-scan-headroom on POSIX shells. On Windows cmd.exe, set the variable separately or use cross-env: cross-env BCA_HEADROOM=0.90 npm run self-scan-headroom. Avoid ${VAR:-default} as a primary configuration mechanism: cmd.exe passes it through literally. The ${BCA_NUM_JOBS:-4} usage above is a reasonable default for POSIX hosts; Windows users either set BCA_NUM_JOBS explicitly or replace the literal with a fixed number in a per-platform script.
  • python3 vs python. The stock python.org Windows installer ships python.exe and py.exe but no python3 alias. Replace the literal python3 above with py -3 (Windows launcher) or add a one-line scripts/python3.cmd shim that forwards to py -3. macOS / Linux / WSL hosts have python3 on PATH by default.
  • Use cross-env (or pnpm exec --shell) if you need any env var to be portable across the package.json users' shells. Mixing bash-isms into scripts is the most common source of "works on my Mac, broken on a Windows reviewer's machine" pings.
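One way to write the per-platform script mentioned above is a small Python launcher, since os.environ behaves the same under cmd.exe and POSIX shells. This is a minimal sketch, not part of big-code-analysis: num_jobs and build_bca_args are hypothetical names, the flags are the ones already used in the package.json scripts, and the fallback of 4 mirrors ${BCA_NUM_JOBS:-4}.

```python
import os


def num_jobs(env=None, default=4):
    """Return BCA_NUM_JOBS as an int; unset and empty both mean `default`."""
    env = os.environ if env is None else env
    raw = env.get("BCA_NUM_JOBS") or str(default)
    jobs = int(raw)  # raises ValueError on non-numeric input
    if jobs < 1:
        raise ValueError(f"BCA_NUM_JOBS must be >= 1; got {jobs}")
    return jobs


def build_bca_args(env=None):
    """Hypothetical helper: the argument prefix shared by the scripts above."""
    return [
        "bca", "--paths", ".", "--exclude-from", ".bcaignore",
        "--num-jobs", str(num_jobs(env)),
    ]
```

Passing the resulting list to subprocess.call avoids shell parameter expansion entirely, so the same file works unchanged on Windows.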

Pair with husky or pre-commit so the same scripts run on git commit.

Skeleton: pre-commit hook

If you use the pre-commit framework (version 3.2.0 or newer — see the version note below), both tiers are local hooks that shell out to make:

- repo: local
  hooks:
    - id: bca-self-scan
      name: bca self-scan (hard gate)
      entry: make self-scan
      language: system
      pass_filenames: false
      stages: [pre-commit]
    - id: bca-self-scan-headroom
      name: bca self-scan-headroom (soft gate)
      entry: make self-scan-headroom
      language: system
      pass_filenames: false
      stages: [pre-commit]

pass_filenames: false is deliberate — bca discovers its own inputs from --paths plus the baseline. Letting pre-commit pass the changed files in would shrink the scan to just those files and miss the cross-file effect of a baseline refresh.

Minimum pre-commit version 3.2.0. The stages: vocabulary was renamed in pre-commit 3.2.0 (March 2023): commit → pre-commit, push → pre-push, etc. Older installs (notably RHEL 8 EPEL, Ubuntu 20.04 default packages, and any .pre-commit-config.yaml pinned to the legacy vocabulary) reject stages: [pre-commit] as an unknown stage name and the hook never registers. If you must support older installations, substitute stages: [commit]; in mixed fleets, pin the framework to pre-commit ≥ 3.2.0 (check with pre-commit --version) in the dev-tooling docs so this mismatch does not surface silently.
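In a mixed fleet, a bootstrap script can choose the stage spelling from the installed version. The sketch below is an illustration only, with hypothetical function names; it assumes the usual `pre-commit --version` output of the form `pre-commit 3.2.0`.

```python
import re

MIN_VERSION = (3, 2, 0)  # first release that accepts the new stage names


def parse_version(output):
    """Extract (major, minor, patch) from `pre-commit --version` output."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", output)
    if match is None:
        raise ValueError(f"no version number in {output!r}")
    return tuple(int(part) for part in match.groups())


def stage_name(output):
    """Pick the stage spelling the installed framework understands."""
    return "pre-commit" if parse_version(output) >= MIN_VERSION else "commit"
```

Comparing integer tuples rather than strings matters here: as strings, "3.10.0" would sort before "3.2.0".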

Helper script

The headroom helper works because bca check's --threshold name=value flag accepts overrides on the command line. The helper reads the limits from the TOML file, multiplies each one by the headroom ratio, and re-emits it as a --threshold override.

A roughly 50-line implementation suitable for any project. It is a condensed restatement of big-code-analysis's own utils/bca-self-scan-headroom.py (same env-var contract, same defensive checks, same exit codes), trimmed for in-line readability:

#!/usr/bin/env python3
"""Scale every threshold by $BCA_HEADROOM and run bca check."""
from __future__ import annotations
import os, subprocess, sys
from pathlib import Path

try:
    import tomllib  # Python 3.11+
except ImportError:  # pragma: no cover
    import tomli as tomllib  # `pip install tomli` on 3.9/3.10

def main() -> int:
    if len(sys.argv) < 2:
        print("usage: bca-self-scan-headroom.py <bca-invocation...>", file=sys.stderr)
        return 64

    raw = os.environ.get("BCA_HEADROOM") or "0.95"  # treat '' as unset
    try:
        ratio = float(raw)
    except ValueError:
        print(f"BCA_HEADROOM must be a number; got {raw!r}", file=sys.stderr)
        return 64
    if not 0.0 < ratio <= 1.0:
        print(f"BCA_HEADROOM must be in (0, 1]; got {ratio}", file=sys.stderr)
        return 64

    thresholds_path = Path(os.environ.get("BCA_THRESHOLDS") or "bca-thresholds.toml")
    baseline_path = Path(os.environ.get("BCA_BASELINE") or ".bca-baseline.toml")
    if not thresholds_path.is_file():
        print(f"missing {thresholds_path}", file=sys.stderr)
        return 1
    cfg = tomllib.loads(thresholds_path.read_text(encoding="utf-8"))
    thresholds = cfg.get("thresholds", {})
    if not thresholds:
        print(f"no [thresholds] table in {thresholds_path}", file=sys.stderr)
        return 1

    flags: list[str] = []
    for name, limit in thresholds.items():
        # Float so a fractional scaled limit (e.g. 6.65 for nargs=7
        # at BCA_HEADROOM=0.95) survives — flooring to int silently
        # widens the band.
        flags += ["--threshold", f"{name}={limit * ratio:.6g}"]

    write_target = os.environ.get("BCA_HEADROOM_WRITE_BASELINE")
    if write_target:
        cmd = [*sys.argv[1:], "check", "--write-baseline", write_target, *flags]
    else:
        cmd = [*sys.argv[1:], "check", "--baseline", str(baseline_path), *flags]
    return subprocess.call(cmd)

if __name__ == "__main__":
    sys.exit(main())

Five implementation details that matter in practice:

  • Emit a float, not an int. bca check --threshold parses every value as f64, and the offender test is value > limit (strict). At BCA_HEADROOM=0.95, nargs=7 scales to 6.65. Flooring to 6 would silently widen the band by an extra ratio step. The :.6g format spec rounds away float-multiplication artefacts (6.6499999999999995) without losing precision on the largest thresholds in the file.
  • Validate the ratio. The half-open interval (0, 1] is the only sensible range. 0 disables the gate; values above 1 would make the soft tier looser than the hard tier and fire after CI — useless. The or "0.95" idiom treats both unset and set-but-empty (BCA_HEADROOM= in a stripped CI env) as the default, so a misconfigured matrix variable does not exit 64 with the confusing message got ''.
  • Same baseline as the hard tier. The soft tier --baseline must point at the exact same file the hard tier writes; otherwise every hard-tier offender re-fires on the soft tier. The helper reads BCA_BASELINE from the env (default .bca-baseline.toml) so renaming the file in one place — the Make / just recipe — propagates to both tiers without editing the Python.
  • Read everything from the environment, not argv. Env-var propagation works the same in make, just, and npm scripts on every platform; CLI parameter expansion (${HEADROOM:-0.95}) does not — Windows cmd.exe passes it through literally. Argv carries only the literal bca invocation prefix; the four configuration knobs (BCA_HEADROOM, BCA_THRESHOLDS, BCA_BASELINE, BCA_HEADROOM_WRITE_BASELINE) all come from os.environ.
  • Defensive diagnostics. The argv-length, file-exists, and empty-[thresholds] checks all exit before constructing a bca command, with stderr messages that name the helper rather than the downstream tool. Without them, a missing config file produces a confusing "no thresholds defined" error from bca itself, and the user has to bisect whether the helper, the config, or bca is at fault. The fallback import tomli as tomllib keeps the script working on Python 3.9/3.10 hosts (RHEL 8, Ubuntu 20.04, Debian bullseye); on 3.11+ tomllib is stdlib and tomli is not needed.
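The first two details can be checked in isolation. The sketch below lifts the scaling loop out of the helper into a hypothetical scaled_flags function and pairs it with the strict offender comparison described above; is_offender is an illustrative name, not a bca API.

```python
def scaled_flags(thresholds, ratio):
    """Return --threshold overrides with every limit multiplied by `ratio`."""
    if not 0.0 < ratio <= 1.0:
        raise ValueError(f"ratio must be in (0, 1]; got {ratio}")
    flags = []
    for name, limit in thresholds.items():
        # :.6g rounds to six significant digits, so the artefact
        # 6.6499999999999995 is emitted as 6.65.
        flags += ["--threshold", f"{name}={limit * ratio:.6g}"]
    return flags


def is_offender(value, limit):
    """Strict comparison: a value equal to the limit still passes."""
    return value > limit


print(scaled_flags({"nargs": 7, "cyclomatic": 20}, 0.95))
```

With these inputs, nargs is emitted as 6.65 and cyclomatic as 19, and a ratio of 0 or 1.5 is rejected before any flag is built.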

Composition with the broader baseline workflow

The four self-scan* targets above are not a replacement for the documented Baselines recipe — they are that recipe, mechanised into developer-machine commands. The same ordering still applies:

  1. Bootstrap once. Write the initial thresholds, write the initial baseline, commit both.
  2. Gate on every commit. Hard tier fails on regression; soft tier fails on encroachment.
  3. Refresh during focused refactors. When a function legitimately moved (someone did pay down debt), regenerate the baseline and review the diff.
  4. Retire when empty. When .bca-baseline.toml shrinks to just version = 2, drop the --baseline flag and delete the file. The thresholds now stand on their own.

The local tiers shorten the feedback loop on steps 2 and 3 from "red CI on a pull request" to "red Make recipe before git commit returns". That is the whole pitch.

The hard / soft tier split is one instance of a broader pattern. If you have used any of the following, the mental model carries over:

  • Sonar's Quality Gates focused on new code. Old code is held at its current state; changes must not make things worse. The baseline file is bca's native form of the "new code" / "leak period" idea.
  • clippy's warn-vs-deny lint levels. A warn lint surfaces in local builds; the same lint denied with -D warnings fails CI. The two-tier configuration gives you a place to land experimental tighter rules.
  • The ratchet pattern in general migration tooling: record today's count, fail on increase, lower the ceiling as the count drops. bca check ratchets per-function rather than per-pattern, but the monotonicity guarantee is the same.
  • -Wall + -Werror in C/C++. A first pass with -Wall reveals the noise; promoting to -Werror after the baseline reaches zero is the same retirement step as deleting .bca-baseline.toml once it's empty.
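The ratchet idea shared by all four analogies fits in a few lines. This is a generic illustration under stated assumptions, not bca's baseline format or matching logic: the baseline is modelled as a plain mapping from function name to its recorded metric value, and both function names are hypothetical.

```python
def ratchet_failures(current, baseline, limit):
    """Return names that exceed `limit` and are new or worse than recorded."""
    failures = []
    for name, value in sorted(current.items()):
        if value <= limit:
            continue  # under the threshold: never an offender
        recorded = baseline.get(name)
        if recorded is None or value > recorded:
            failures.append(name)  # new offender, or a regression
    return failures


def refreshed_baseline(current, limit):
    """Keep only entries still over the limit, so the file shrinks over time."""
    return {name: value for name, value in current.items() if value > limit}
```

Once refreshed_baseline returns an empty mapping, the thresholds stand on their own, which corresponds to the "retire when empty" step above.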

AST queries

Recipes that work with the parsed syntax tree directly: searching for node types, counting them, or dumping the tree.

Library-side equivalents. Every recipe below has an in-process Rust counterpart in Walking the AST directly — useful when shelling out per file is too slow or when you want to compose metrics with custom AST analysis in one parse.

Detect parse errors before committing

Tree-sitter exposes a synthetic ERROR node anywhere it could not parse. Use find to surface them:

bca \
    --include "*.rs" \
    --paths "$PWD" \
    find ERROR

Flag ordering. --include and --exclude are variadic and consume tokens until the next flag begins, so put them before --paths to avoid the subcommand name being eaten as a glob. The single-value = form (--include="*.rs") also works.

A clean run prints nothing. Wire this into a pre-commit hook to fail fast when a syntactically broken file is staged.
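A hook script can lean on the "clean run prints nothing" behaviour. The sketch below is one possible wrapper with hypothetical function names. It builds the argument list in the order recommended above and treats any non-blank output as a failure. It assumes bca is on PATH and deliberately ignores bca's own exit status, which this section does not describe.

```python
import subprocess


def find_error_cmd(include, paths, bca="bca"):
    """Build argv: variadic --include first, then --paths, subcommand last."""
    return [bca, "--include", include, "--paths", paths, "find", "ERROR"]


def has_parse_errors(stdout):
    """A clean run prints nothing, so non-blank output means ERROR nodes."""
    return bool(stdout.strip())


def check(include="*.rs", paths="."):
    """Run the search; return a hook-style status (0 clean, 1 broken)."""
    result = subprocess.run(
        find_error_cmd(include, paths), capture_output=True, text=True
    )
    print(result.stdout, end="")  # show the matches to the committer
    return 1 if has_parse_errors(result.stdout) else 0
```

Because the command is passed as a list, the glob reaches bca unexpanded without any shell quoting.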

Count specific syntactic constructs

count accepts one or more node-type names and reports the totals. For example, to count if, for, and while constructs across a Rust project:

bca \
    --include "*.rs" \
    --paths src/ \
    count if_expression for_expression while_expression

The exact node-type names come from the underlying tree-sitter grammar. To discover them, dump the AST of a small sample file (see below) and read the node names off the tree.

Find all unsafe blocks in a Rust crate

bca \
    --include "*.rs" \
    --paths src/ \
    find unsafe_block

Each match prints the file path and the line range of the node.

Dump the AST of a file

Useful for understanding why a metric came out the way it did, or for discovering the tree-sitter node names you need for find / count:

bca --paths src/lib.rs dump

To narrow the dump to a specific function or block, add line bounds with the global --ls and --le flags:

bca \
    --paths src/lib.rs \
    --ls 42 --le 88 \
    dump

--ls / --le apply to dump and find, so the same range can be used to scope a search to a single function:

bca \
    --paths src/lib.rs \
    --ls 42 --le 88 \
    find return_expression

List every function or method

For a quick human-readable inventory:

bca \
    --include "*.rs" \
    --paths src/ \
    functions

The output is a tree per file: an In file … header followed by an indented row per function with name and line span. It is intended for reading, not parsing.

For tooling that needs a structured inventory — coverage mapping, documentation generation, code-owner reports — use the JSON metrics output instead and walk .spaces[] recursively, taking entries whose kind is function:

bca \
    --include "*.rs" \
    --paths src/ \
    metrics -O json \
  | jq -c '
      . as $root
      | def funcs: if .kind == "function" then [.] else [] end
                   + (.spaces // [] | map(funcs) | add // []);
      funcs[] | {file: $root.name, name, start_line, end_line}
    '

This emits one JSON object per function and is safe to pipe into downstream tooling.
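For tooling that already runs Python, the same recursive walk can be written without jq. This sketch assumes only the fields the jq filter above relies on (kind, name, start_line, end_line, spaces); walk_functions and inventory are hypothetical names.

```python
def walk_functions(space):
    """Yield every space whose kind is "function", depth first."""
    if space.get("kind") == "function":
        yield space
    for child in space.get("spaces") or []:
        yield from walk_functions(child)


def inventory(root):
    """Flatten one metrics document into per-function records."""
    return [
        {
            "file": root.get("name"),
            "name": func.get("name"),
            "start_line": func.get("start_line"),
            "end_line": func.get("end_line"),
        }
        for func in walk_functions(root)
    ]
```

Nested functions are reported after their parent, matching the order the jq filter produces.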

Exporting metric data

The metrics, ops, and preproc subcommands all support structured output formats meant for machine consumption. Pair them with a JSON processor like jq for ad-hoc analysis, or feed them into a database or dashboard.

Export per-file metrics as JSON

bca \
    --paths src/ \
    metrics \
    -O json \
    -o /tmp/metrics

This writes one JSON file per analyzed source file under /tmp/metrics/. The output filename mirrors the input path with the format extension appended — src/lib.rs becomes src/lib.rs.json, not src/lib.json. Use --pretty if you intend to read the files by hand:

bca -p src/ metrics --pretty -O json -o /tmp/metrics

CBOR (-O cbor) is the most compact format; it is binary and therefore requires -o. JSON, TOML, and YAML can all be streamed to stdout when -o is omitted, which is useful for pipelines.
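A downstream tool that needs to locate the report for a given source file can apply the naming rule directly. The sketch below is a guess at the layout implied by the description above (the relative input path mirrored under the -o directory, with the format extension appended); output_path is a hypothetical helper, and the layout should be checked against a real run.

```python
from pathlib import PurePosixPath


def output_path(out_dir, source, fmt):
    """Map a relative source path to its assumed report path under out_dir."""
    # The extension is appended, not substituted: lib.rs -> lib.rs.json.
    return str(PurePosixPath(out_dir) / f"{source}.{fmt}")
```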

Pull a single metric across an entire tree

Combine streamed JSON output with jq to extract one value per file:

bca -p src/ metrics -O json \
  | jq -c '{file: .name, mi: .metrics.mi.mi_visual_studio}'

The same idea works for any metric — cyclomatic.sum, cognitive.sum, loc.sloc, and so on. Run bca list-metrics descriptions to see the catalog.
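When jq is not available, the streamed output can be consumed from Python. This sketch assumes the stream is a sequence of JSON documents, one per file, which is what the jq pipeline above implies. iter_json_documents and pluck are hypothetical helper names.

```python
import json


def iter_json_documents(text):
    """Yield each JSON value from a string of concatenated documents."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            pos += 1  # skip separators between documents
            continue
        value, pos = decoder.raw_decode(text, pos)
        yield value


def pluck(document, dotted):
    """Follow a dotted path such as "metrics.loc.sloc" into a document."""
    value = document
    for key in dotted.split("."):
        value = value[key]
    return value
```

Feeding it the captured stdout of the metrics command gives one document per analyzed file, and pluck(doc, "metrics.cyclomatic.sum") reads the same field the jq path expression would.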

Discover the metric catalog at runtime

Tooling that drives the CLI shouldn't hard-code metric names. Ask the binary:

bca list-metrics                # one name per line
bca list-metrics descriptions   # name + summary

This is the right input for code generators, schema definitions, or tab-completion.

Extract operands and operators (Halstead)

ops emits the raw operand and operator lists per file, which is the input to Halstead-style metric calculations beyond what the built-in report shows:

bca \
    --include "*.rs" \
    --paths src/ \
    ops \
    -O json --pretty \
    -o /tmp/ops

Flag ordering. Variadic flags like --include and --exclude consume tokens until the next flag, so put them before --paths (or use the --include=GLOB single-value form) to keep the subcommand from being eaten as a glob.

Each output file mirrors the input path under /tmp/ops/.

Strip comments from a tree

strip-comments rewrites source so that downstream tools that don't understand comment syntax can still consume the code. It defaults to streaming the result to stdout; pass --in-place to overwrite files on disk:

# Stream a single file with comments removed.
bca --paths src/lib.rs strip-comments

# Rewrite every Python file in src/ in place.
bca --include "*.py" --paths src/ \
    strip-comments --in-place

--in-place is destructive — make sure the tree is committed or backed up first.

Driving the REST API

bca-web exposes the same analysis primitives over HTTP. Use it when the consumer is a long-running service (an editor plugin, CI worker, or web app) that should not pay the cost of spawning the CLI per file.

For the full endpoint reference, see Rest API. The recipes below show practical end-to-end calls with curl.

Start the server

bca-web --host 127.0.0.1 --port 8080 -j "$(nproc)"

Verify it's up:

curl -sf http://127.0.0.1:8080/ping && echo "ok"
# => ok

/ping returns 200 OK with an empty body — curl -sf exits 0 on success and non-zero on any HTTP error, which is what scripts want.

Compute metrics for an inline snippet

curl -s http://127.0.0.1:8080/metrics \
    -H 'Content-Type: application/json' \
    -d '{
          "id": "snippet-1",
          "file_name": "demo.rs",
          "code": "fn add(a: i32, b: i32) -> i32 { a + b }",
          "unit": false
        }' \
  | jq '.spaces.metrics'

unit: true returns only top-level metrics; false walks every function and class space inside the snippet. The server infers language from file_name, so the extension matters.

Compute metrics for a file from disk

curl --data-binary plus jq makes it easy to package a real file into the JSON envelope the server expects:

jq -nc \
    --arg id "$(uuidgen)" \
    --arg file_name "src/lib.rs" \
    --rawfile code src/lib.rs \
    '{id: $id, file_name: $file_name, code: $code, unit: false}' \
  | curl -s http://127.0.0.1:8080/metrics \
      -H 'Content-Type: application/json' \
      --data-binary @- \
  | jq '.spaces.metrics.cyclomatic, .spaces.metrics.cognitive'

This pattern — jq -n --rawfile to build the request, curl --data-binary @- to stream it — is the easiest way to avoid quoting problems with multi-line source code.
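A Python client sidesteps the quoting problem in the same way, because json.dumps escapes the source text. The sketch below uses only the standard library and the request fields shown above. metrics_request and fetch_metrics are hypothetical names, and the base URL is the local server from the earlier recipe.

```python
import json
import urllib.request


def metrics_request(base_url, file_name, code, request_id, unit=False):
    """Build the POST /metrics request without sending it."""
    body = json.dumps(
        {"id": request_id, "file_name": file_name, "code": code, "unit": unit}
    ).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + "/metrics",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_metrics(request):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request) as response:
        return json.load(response)
```

Splitting construction from sending keeps the request easy to inspect in tests without a running server.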

Strip comments through the API

The endpoint is /comment (singular). It has two variants selected by Content-Type:

  • application/json — wraps the request and response in JSON. The response code field is a byte array, not a string, because the underlying API is byte-oriented.
  • application/octet-stream — accepts the source as the raw request body and returns the stripped source as the raw response body. This is by far the easiest variant to use from the shell.

Octet-stream form (recommended for one-off shell use):

curl -s "http://127.0.0.1:8080/comment?file_name=demo.py" \
    -H 'Content-Type: application/octet-stream' \
    --data-binary $'# leading comment\nprint("hi")  # trailing'
# => print("hi")

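JSON form (use when your client speaks JSON natively). Decode the byte array with jq … | implode for ASCII source; implode treats each number as a Unicode code point, so multi-byte UTF-8 sequences need a real byte decoder: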

curl -s http://127.0.0.1:8080/comment \
    -H 'Content-Type: application/json' \
    -d '{
          "id": "strip-1",
          "file_name": "demo.py",
          "code": "# leading comment\nprint(\"hi\")  # trailing"
        }' \
  | jq -r '.code | implode'

The JSON response carries the same id you sent, so a client that multiplexes many requests can correlate them.
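A client written in Python can decode the byte array as real bytes, which also handles non-ASCII source. This is a minimal sketch that assumes the response shape described above (an id plus a code field holding an array of integers); decode_code is a hypothetical helper.

```python
def decode_code(response):
    """Turn the JSON byte array in `code` back into text."""
    # bytes() accepts a list of integers in 0..255; decoding as UTF-8
    # reassembles multi-byte characters correctly.
    return bytes(response["code"]).decode("utf-8")
```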

Extract function spans for an editor plugin

The endpoint is /function (singular):

curl -s http://127.0.0.1:8080/function \
    -H 'Content-Type: application/json' \
    -d '{
          "id": "spans-1",
          "file_name": "demo.rs",
          "code": "fn a() {}\nfn b() {}\n"
        }' \
  | jq '.spans'

Each entry has name, start_line, end_line, and an error boolean (set when the parser flagged the function span as malformed) — enough for an editor to draw a function navigator without re-parsing the file locally.
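A function navigator usually needs the reverse lookup as well: which function contains the cursor. The sketch below works on the span fields listed above and assumes that start_line and end_line are inclusive; enclosing_function is a hypothetical helper, not part of the API.

```python
def enclosing_function(spans, line):
    """Return the innermost well-formed span containing `line`, or None."""
    best = None
    for span in spans:
        if span.get("error"):
            continue  # skip spans the parser flagged as malformed
        if not span["start_line"] <= line <= span["end_line"]:
            continue
        size = span["end_line"] - span["start_line"]
        if best is None or size < best["end_line"] - best["start_line"]:
            best = span  # prefer the tightest enclosing span
    return best
```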

Calling the API from CI

The server starts in milliseconds, so for short-lived CI jobs it's often simplest to start it as a background process inside the job and tear it down at the end:

bca-web --port 8080 &
SERVER_PID=$!
trap 'kill "$SERVER_PID"' EXIT

# Wait for it to come up.
until curl -sf http://127.0.0.1:8080/ping >/dev/null; do sleep 0.1; done

# … run your analysis calls here …

For longer-lived workers, run the server as a systemd unit (or container) and point your jobs at its host/port.
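The shell loop above waits forever if the server never starts. A CI driver written in Python can add a deadline. The sketch below uses only the standard library; ping and wait_until are hypothetical names, and the URL is the /ping endpoint shown earlier.

```python
import time
import urllib.error
import urllib.request


def ping(url):
    """Return True when the endpoint answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=1) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        return False  # connection refused, timeout, or HTTP error


def wait_until(probe, timeout=10.0, interval=0.1):
    """Call `probe` until it returns True or `timeout` seconds elapse."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

A job would call wait_until(lambda: ping("http://127.0.0.1:8080/ping")) and fail fast when it returns False.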

Using as a Library

big-code-analysis is published on crates.io as a Rust library. The CLI (bca) and REST server (bca-web) are both thin wrappers around the same public API, so anything they can do you can do directly from your own crate.

This section is task-oriented. For full type signatures and field docs, follow the rustdoc on docs.rs.

When to embed the library

Reach for the library (instead of shelling out to bca) when you want one or more of the following:

  • In-process analysis. Avoid the cost of spawning a subprocess per file when scoring thousands of files in a custom tool, IDE plugin, or static-analysis pipeline.
  • In-memory source. Score generated, pre-processed, or streamed source without writing it to disk first. See Analyzing in-memory source.
  • Selective walking. Drive a custom traversal over the FuncSpace tree to extract per-function metrics on your own schedule. See Walking FuncSpace results.
  • Custom output. Skip the JSON / YAML / TOML / CBOR serializers shipped under src/output/ and emit your own report format (CSV, SARIF, a database row, whatever).

If you just want a Markdown quality report or a CI threshold gate, the bca CLI is faster to wire up.

What is on offer today

A note on API stability

The library is on the 1.x line and ships under a written stability contract: the shape of the public API is held stable across patch and minor bumps, and breaking changes are reserved for the next major bump. Every example in this section compiles against the current published crate and is expected to keep compiling across 1.x without edits.

Metric values may still drift across minor bumps when a grammar pin moves or a metric definition is fixed — see STABILITY.md § What is stable in value for the carve-out. Each drift is called out in the changelog entry that introduces it.

Quick start

This page walks through the minimum amount of code needed to compute metrics from a string of source code.

1. Add the crate

# Cargo.toml
[dependencies]
big-code-analysis = "1.1.0"

The crate uses Rust edition 2024 and pins rust-version = "1.94". Older toolchains will not build it — see the MSRV section of STABILITY.md for the policy.

2. Compute metrics from a string

The recommended entry point is analyze: pass a Source carrying the language, source bytes, and an optional display name, plus a MetricsOptions for any per-traversal flags. No filesystem path is needed.

use big_code_analysis::{analyze, MetricsOptions, Source, LANG};

fn main() {
    let source = "fn add(a: i32, b: i32) -> i32 { a + b }";

    let space = analyze(
        Source::new(LANG::Rust, source.as_bytes())
            .with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    )
    .expect("Rust source should parse");

    println!(
        "cognitive complexity (file-level): {}",
        space.metrics.cognitive.cognitive_sum(),
    );
}

Source::name ends up as the top-level FuncSpace::name; passing None leaves the top-level name unset. The return type is Result<FuncSpace, MetricsError>. The Err variant distinguishes parse failure, empty input, and a disabled language; see Error handling for the variant set and matching patterns. MetricsError is #[non_exhaustive], so always include a _ arm when matching.

Tip: use big_code_analysis::prelude::*; brings the recommended entry points (analyze, Source, MetricsOptions, MetricsError, LANG, FuncSpace, CodeMetrics, SpaceKind, Metric, metrics_from_tree) into scope in one line. Anything outside the prelude can still be reached by name — for example use big_code_analysis::guess_language;.

The older get_function_spaces(lang, bytes, path, pr) and metrics_with_options(parser, path, options) entry points are still available but #[deprecated] — they derive the top-level name from path via lossy UTF-8 conversion. Use them only when you already have a Parser<T> in hand from another seam.

3. What you got back

FuncSpace is a tree of spaces. The top-level node represents the whole file; its spaces field holds nested function / class / impl spaces. Every node carries the same CodeMetrics struct, so you can read any metric at any level of granularity.

use big_code_analysis::{analyze, MetricsOptions, Source, SpaceKind, LANG};

fn main() {
    let source = "\
fn outer() {
    fn inner() {}
}
";
    let space = analyze(
        Source::new(LANG::Rust, source.as_bytes())
            .with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    )
    .expect("Rust source should parse");

    assert_eq!(space.kind, SpaceKind::Unit);
    assert_eq!(space.spaces.len(), 1); // `outer`
    assert_eq!(space.spaces[0].spaces.len(), 1); // `inner`
}

For a deeper walk over FuncSpace, see Walking FuncSpace results.

Picking a language

If you do not know the language up front, use guess_language — it consults the path extension, an Emacs mode line in the buffer, and the shebang in that order:

use std::path::PathBuf;

use big_code_analysis::{analyze, guess_language, MetricsOptions, Source};

fn main() {
    let source = b"print('hi')\n";
    let path = PathBuf::from("hello.py");

    let (Some(lang), _name) = guess_language(source, &path) else {
        eprintln!("unrecognised language");
        return;
    };

    let _space = analyze(
        Source::new(lang, source).with_name(Some("hello.py".to_owned())),
        MetricsOptions::default(),
    );
}

guess_language returns (None, _) for unknown extensions; treat that as "skip this file" rather than as a parse error.

What changes when

The recommended entry point is analyze(Source, MetricsOptions) and returns Result<FuncSpace, MetricsError> (per #253 and #254). The library-DX tracker collects the remaining shape changes — naming, per-language features, and the parse seam.

Analyzing in-memory source

big-code-analysis never requires source to live on disk. The recommended entry point analyze takes a Source carrying the language, source bytes, and an optional caller-supplied display name; no filesystem path is involved unless the C/C++ preprocessor lookup needs one (Source::preproc_path).

This is useful for:

  • Scoring generated code before it is written out.
  • Scoring pre-processed or bundled source (e.g. after a template expansion).
  • Driving the analyzer from a language server or editor plugin that already holds the buffer in memory.
  • Stdin pipelines and unit tests that should not touch the filesystem.

Reading from a buffer

use big_code_analysis::{analyze, MetricsOptions, Source, LANG};

fn analyze_buffer(source: &[u8]) -> Option<f64> {
    // `Source::name` is the display identifier baked into the
    // top-level `FuncSpace`. Pick whatever is meaningful for
    // downstream consumers (logs, JSON output); pass `None` if
    // you have nothing useful to attach.
    let space = analyze(
        Source::new(LANG::Python, source).with_name(Some("<stdin>".to_owned())),
        MetricsOptions::default(),
    )
    .ok()?;

    Some(space.metrics.cognitive.cognitive_sum())
}

Source::new borrows the source bytes — the caller retains ownership. If your downstream pipeline needs to highlight findings on the same bytes, you can keep using the original buffer after analyze returns.

Reading from stdin

use std::io::{self, Read};

use big_code_analysis::{analyze, MetricsOptions, Source, LANG};

fn main() -> io::Result<()> {
    let mut source = Vec::new();
    io::stdin().read_to_end(&mut source)?;

    let space = match analyze(
        Source::new(LANG::Javascript, &source)
            .with_name(Some("<stdin>".to_owned())),
        MetricsOptions::default(),
    ) {
        Ok(space) => space,
        Err(err) => {
            eprintln!("parse failed: {err}");
            std::process::exit(1);
        }
    };

    println!("{}", space.metrics.cyclomatic.cyclomatic_sum());
    Ok(())
}

Picking the language from content

If you do not know the language up front, combine guess_language with analyze. guess_language peeks at the path extension, an Emacs mode-line, and the shebang in that order:

use std::path::PathBuf;

use big_code_analysis::{analyze, guess_language, MetricsOptions, Source};

fn analyze_unknown(path: PathBuf, source: Vec<u8>) -> Option<()> {
    let (lang, _name) = guess_language(&source, &path);
    let lang = lang?;
    // `.ok()?` collapses `MetricsError` into `None` so this helper's
    // `Option` return shape is preserved. See `error-handling.md` for
    // a richer mapping that preserves the variant.
    let _space = analyze(
        Source::new(lang, &source)
            .with_name(path.to_str().map(str::to_owned)),
        MetricsOptions::default(),
    )
    .ok()?;
    Some(())
}

guess_language returns (None, _) for unrecognised extensions — treat that as "skip" rather than as a hard error.

Watch out for these

  • Name identity matters. Top-level FuncSpace::name is whatever string you put in Source::name. Two analyses sharing the same name will look identical to a downstream consumer that keys on it. Use distinct labels for distinct buffers.
  • Source::name is Option<String>. Passing None leaves the top-level FuncSpace::name as None — useful for ad-hoc snippets that have no meaningful identity. Downstream consumers that require a stable identifier should check for None explicitly.
  • No filesystem fallback. Unlike the CLI, the library does not read sibling files, follow #includes, or interpret a .gitignore. Feed it exactly the bytes you want analyzed.

Alternative: the path-positional shim

For backwards compatibility, the older path-positional entry points (get_function_spaces and metrics_with_options) still work but are #[deprecated] in favour of analyze. They derive FuncSpace::name from the supplied &Path via lossy UTF-8 conversion and are otherwise equivalent.

Reusing an existing tree-sitter Tree

A common pain point is that callers who already drive tree-sitter for syntax highlighting, code folding, or queries end up parsing every file twice: once for their own tree, once inside get_function_spaces. The parse seam (issue #251) lets you hand big-code-analysis an already-parsed tree_sitter::Tree and get the same FuncSpace back without re-parsing.

Prefer Ast::from_tree_sitter if you also want to run the metric walker more than once against the same parse (different MetricsOptions::with_only selections, custom tree-sitter walks interleaved with metrics, etc.). See Parse once, run metrics many times. The metrics_from_tree function shown below is a single-shot equivalent that constructs an Ast internally and discards it after one call.

When to use this

Use the parse seam if you:

  • Already keep a tree_sitter::Tree per open buffer (editor, language server, custom static-analysis pipeline) and want to reuse that parse for metrics rather than paying the parse cost again.
  • Want to run multiple passes (metrics + AST dump + custom analysis) against one parse result.
  • Want a tree-sitter version that is guaranteed to match this library's without declaring a separate tree-sitter dependency. The re-exported big_code_analysis::tree_sitter module is the same crate we link against, so the types agree by definition.

Use the byte-based entry points (analyze, or the deprecated get_function_spaces / metrics_with_options) if you do not already have a tree; they construct the parser internally and own the parse end to end.

Working example

use std::path::PathBuf;

use big_code_analysis::{
    get_function_spaces, metrics_from_tree, tree_sitter, LANG,
    MetricsOptions,
};

let source_code = "fn main() { if true { 1 } else { 2 }; }";
let path = PathBuf::from("foo.rs");
let source = source_code.as_bytes().to_vec();

// Step 1: build a tree with the *re-exported* tree-sitter crate.
// Using `big_code_analysis::tree_sitter` (rather than a direct
// `tree-sitter` dependency on your side) guarantees the version
// matches the one the metric walker was compiled against.
let mut parser = tree_sitter::Parser::new();
parser
    .set_language(&LANG::Rust.get_tree_sitter_language())
    .expect("rust grammar pinned to a compatible version");
let tree = parser
    .parse(&source, None)
    .expect("parser has a language set");

// Step 2: feed the tree into metrics_from_tree.
let from_tree = metrics_from_tree(
    &LANG::Rust,
    tree,
    source.clone(),
    &path,
    None,
    MetricsOptions::default(),
)
.expect("non-empty input");

// Step 3 (optional): confirm the values match the byte-based path.
let from_bytes =
    get_function_spaces(&LANG::Rust, source, &path, None)
        .expect("non-empty input");

assert_eq!(
    from_tree.metrics.cyclomatic.cyclomatic_sum(),
    from_bytes.metrics.cyclomatic.cyclomatic_sum(),
);

The same shape works for any LANG variant — pass the matching grammar to tree_sitter::Parser::set_language (via LANG::get_tree_sitter_language) and the metric walker will produce the same FuncSpace it would have produced from bytes.

Lower-level: Parser::from_tree (internal)

metrics_from_tree is the documented entry point for tree reuse — it dispatches on a &LANG and hides the generic parser plumbing entirely. The lower-level path goes through Parser<T> / ParserTrait, which are now #[doc(hidden)] (see issue #256). They remain pub so the in-tree macros (mk_action!, action::<T>, the Callback dispatch shared with the REST API) can refer to them, but they are not part of the documented surface and treating them as a stable extension point is at your own risk.

The per-language *Parser aliases (RustParser, PythonParser, …) emitted by mk_langs! are doc-hidden for the same reason — see STABILITY.md for the escape-hatch caveat. For library consumers, the higher-level metrics_from_tree shown above is the right entry point because it dispatches on a &LANG at runtime and does not expose any of the per-language tag types or trait bounds.

Out of scope

  • Incremental re-computation. Applying a tree_sitter::InputEdit and re-querying only the changed spans is not supported yet — the metric walker still walks the entire tree on every call. The parse seam is the first step; making the walker itself incremental is a follow-up.
  • Promoting all of Node's pub(crate) traversal methods. Node still exposes its inner tree_sitter::Node through the public .0 field for ad-hoc traversal; the wrapper helpers remain crate-private and are tracked under the pub use curation issue.

Parse once, run metrics many times

big-code-analysis's one-shot entry point analyze re-parses its Source on every call. For pipelines that score a file multiple times — different metric subsets, an interleaved custom tree-sitter walk, or a metric re-run after a configuration change — that re-parse is wasted work.

The Ast type, added in 0.0.26 (#264), exposes the seam: parse the source once, then call Ast::metrics as many times as you need against the held parse.

When to use this

Reach for Ast when any of the following applies:

  • Selective metric runs. You compute one set of metrics for a report, then another for a CI threshold gate, against the same file.
  • Custom tree-sitter walks. You already drive a tree_sitter::Tree for queries / highlighting / symbol extraction and want to fold the metric walker into the same parse.
  • Cached analysis. An LSP-like service that holds parsed files in memory should be able to re-run metrics on demand when configuration changes, without going back to bytes.

If you only ever compute every metric once per file, stick with analyze — it now delegates to Ast internally, so the shapes line up but the one-shot API stays simpler.

Selective metrics across calls

#![allow(unused)]
fn main() {
use big_code_analysis::{Ast, LANG, Metric, MetricsOptions, Source};

let source = b"fn f(x: i32) -> i32 { if x > 0 { 1 } else { -1 } }";

// One parse, two metric subsets.
let ast = Ast::parse(Source::new(LANG::Rust, source))
    .expect("rust feature enabled");

let loc = ast
    .metrics(MetricsOptions::default().with_only(&[Metric::Loc]))
    .expect("walker succeeds");
let cyclomatic = ast
    .metrics(MetricsOptions::default().with_only(&[Metric::Cyclomatic]))
    .expect("walker succeeds");

println!("ploc = {}", loc.metrics.loc.ploc());
println!("ccn  = {}", cyclomatic.metrics.cyclomatic.cyclomatic_sum());
}

Each metrics call walks the tree once. The savings versus calling analyze twice come from skipping the parse, which dominates runtime for everything except the very largest source files.

Custom tree-sitter walk + metrics on the same parse

Ast::as_tree_sitter borrows the underlying tree_sitter::Tree. The returned reference is valid for the lifetime of the Ast; nodes obtained from it resolve against Ast::source (see the note on the C++ preprocessor below for what source returns under macro expansion).

For realistic AST work — counting node kinds, finding constructs by name, detecting parse errors, building a symbol table — see Walking the AST directly. The example below is a minimal smoke test; the dedicated chapter shows the full pattern (reusable depth-first walker, field-name lookup, error detection).

#![allow(unused)]
fn main() {
use big_code_analysis::{Ast, LANG, MetricsOptions, Source};

let ast = Ast::parse(Source::new(LANG::Rust, b"fn f() {}"))
    .expect("rust feature enabled");

// Walk the tree for your own purposes…
let root = ast.as_tree_sitter().root_node();
assert_eq!(root.kind(), "source_file");

// …and run the metric walker over the same parse.
let space = ast
    .metrics(MetricsOptions::default())
    .expect("walker succeeds");
println!("name = {:?}", space.name);
}

Adopting a caller-built tree

If you already build the tree_sitter::Tree yourself (e.g. because your editor / LSP has its own parser pool), Ast::from_tree_sitter is the Source-flavored counterpart of the older metrics_from_tree. It carries an explicit name: Option<String> end-to-end instead of deriving one from a path via lossy UTF-8 conversion.

#![allow(unused)]
fn main() {
use big_code_analysis::{Ast, LANG, MetricsOptions, tree_sitter};

let source = b"fn f() {}".to_vec();
let mut parser = tree_sitter::Parser::new();
parser
    .set_language(
        &LANG::Rust
            .get_tree_sitter_language()
            .expect("rust feature enabled"),
    )
    .expect("rust grammar compatible");
let tree = parser
    .parse(&source, None)
    .expect("parser has a language set");

let ast = Ast::from_tree_sitter(LANG::Rust, tree, source, None)
    .expect("rust feature enabled");
let _ = ast.metrics(MetricsOptions::default()).expect("walker succeeds");
}

The tree must have been produced from the same source bytes you pass in, using the grammar returned by LANG::get_tree_sitter_language for the LANG you pass in. A mismatch is not unsafe, but the metric walker matches on tree-sitter kind_id values that come from the language's enum, so a tree built with a different grammar yields nonsensical results.

C++ preprocessor

When Ast::parse is called on a Source carrying preprocessor inputs (Source::with_preproc_path + Source::with_preproc) and the language is LANG::Cpp, the macro pre-pass runs before tree-sitter does — and Ast::source returns the expanded bytes the parser actually saw, not the original input.

Ast::from_tree_sitter is unaffected: it adopts whatever tree the caller built. Whatever expansion (or lack thereof) the caller applied before building the tree is what Ast::source reflects.

Concurrency

Ast is Send + Sync. Running Ast::metrics from multiple threads against the same &Ast is safe — the walker only reads from the held tree_sitter::Tree. (Benchmarking parallel metric runs is a separate follow-up.)

Out of scope

  • Incremental reparse via tree_sitter::InputEdit. Caching a stable Ast across an analysis pipeline is in scope; editing the held tree is not.
  • Parallel-by-default APIs. Ast::metrics does not internally parallelize across the metric set. Callers that want one thread per subset are free to do so.
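
A minimal sketch of the one-thread-per-subset pattern, using only the standard library. ParsedFile and its two counting methods are hypothetical stand-ins for a held Ast and two Ast::metrics calls with different selections; they are not part of the library. The sketch only shows that std::thread::scope lets several threads borrow one immutable value without cloning it or wrapping it in an Arc.

```rust
use std::thread;

/// Hypothetical stand-in for a held `Ast`: immutable once built.
struct ParsedFile {
    source: Vec<u8>,
}

impl ParsedFile {
    /// Stand-in for one read-only metric pass: counts newline bytes.
    fn count_newlines(&self) -> usize {
        self.source.iter().filter(|&&b| b == b'\n').count()
    }

    /// Stand-in for a second read-only pass: counts opening braces.
    fn count_open_braces(&self) -> usize {
        self.source.iter().filter(|&&b| b == b'{').count()
    }
}

/// Run both passes on separate threads against the same borrowed value.
fn run_passes(parsed: &ParsedFile) -> (usize, usize) {
    // Scoped threads may borrow `parsed` because the scope joins every
    // thread before it returns; no `Arc` or clone is needed.
    thread::scope(|s| {
        let newlines = s.spawn(|| parsed.count_newlines());
        let braces = s.spawn(|| parsed.count_open_braces());
        (newlines.join().unwrap(), braces.join().unwrap())
    })
}

fn main() {
    let parsed = ParsedFile {
        source: b"fn a() {\n}\nfn b() {\n}\n".to_vec(),
    };
    let (newlines, braces) = run_passes(&parsed);
    println!("newlines = {newlines}, open braces = {braces}");
}
```

With the real type, each closure would presumably call Ast::metrics with its own MetricsOptions; whether that pays off depends on how expensive each subset is, which this sketch does not measure.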

Walking the AST directly

Ast::parse gives you a parsed tree_sitter::Tree together with the source bytes it was parsed from; Ast::as_tree_sitter hands that tree out as a borrowed reference. This chapter shows how to use it to drive your own syntax-tree analysis — counting node kinds, finding constructs by name, detecting parse errors, or pulling out a symbol table — without paying for a second parse.

When to use this

Reach for direct AST traversal when:

  • You want to count or find syntactic constructs in-process. The CLI equivalents (bca count <kind>, bca find <kind>, recipe) shell out per file; the library path is one parse and one Rust loop.
  • You want to detect parse errors programmatically. Tree-sitter emits a synthetic ERROR node anywhere the grammar could not match; Node::has_error is O(1) — tree-sitter caches the error bit on every node — so the check is free even on a multi-MB source file.
  • You want to mix metrics with custom analysis in one parse — e.g. capture metric values and a list of function names for a coverage mapping, an IDE outline, or a code-owner report.

If you only need standard metrics, stay with analyze or Ast::metrics — they walk the tree for you. The direct path is for things the metric walker does not already compute.

Use the re-exported tree_sitter

Import tree_sitter from big_code_analysis::tree_sitter rather than adding a sibling tree-sitter dependency. The re-export is pinned to the exact version the metric walker was built against, so the Tree types agree by definition. See Reusing an existing tree-sitter Tree and Stability and versioning for the value-not-stable posture this re-export carries.

A reusable DFS walker

Most of the examples below need a depth-first traversal of every descendant. Tree-sitter ships a TreeCursor that does this in O(1) per step (no allocations beyond the cursor itself). The canonical walk is short enough to inline:

#![allow(unused)]
fn main() {
use big_code_analysis::tree_sitter;

/// Visit every node in `tree` in pre-order, root first, passing each
/// node to `visit`. Allocation-free apart from the cursor itself.
fn walk_preorder<F: FnMut(tree_sitter::Node<'_>)>(
    tree: &tree_sitter::Tree,
    mut visit: F,
) {
    let mut cursor = tree.walk();
    'walk: loop {
        visit(cursor.node());
        if cursor.goto_first_child() {
            continue;
        }
        loop {
            if cursor.goto_next_sibling() {
                continue 'walk;
            }
            if !cursor.goto_parent() {
                return;
            }
        }
    }
}
}

The pattern is: visit, descend, climb back up while there is no next sibling, repeat. Every example in this chapter is a thin wrapper around this walker — the code fences below are marked ignore because they assume walk_preorder is already in scope; the matching set of tests in tests/book_ast_traversal_examples.rs keeps them honest, so a refactor that broke an example would fail cargo test.

Count nodes by kind

Library equivalent of bca count if_expression for_expression while_expression from the AST-queries recipe:

use big_code_analysis::{Ast, LANG, Source};
use std::collections::HashMap;

let ast = Ast::parse(Source::new(
    LANG::Rust,
    b"fn a() { if true { 1 } else { 2 } } fn b() { for _ in 0..10 {} }",
))
.expect("rust feature enabled");

let mut counts: HashMap<&str, usize> = HashMap::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    *counts.entry(node.kind()).or_default() += 1;
});

assert_eq!(counts.get("if_expression").copied().unwrap_or(0), 1);
assert_eq!(counts.get("for_expression").copied().unwrap_or(0), 1);

The string keys ("if_expression", "for_expression", …) are the tree-sitter grammar's node-type names. The fastest way to discover them for a new language is bca --paths sample.rs dump, which prints the full AST.

Anonymous tokens. The walker visits every node tree-sitter emits, including anonymous tokens like "{", ";", and keyword literals. The targeted counts.get("if_expression") lookups above are unaffected — anonymous tokens have different kind names — but counts.values().sum() would be much larger than the count of named grammar productions. Filter with tree_sitter::Node::is_named() inside the visitor if you only want named nodes.

Find nodes by kind

Library equivalent of bca find unsafe_block:

use big_code_analysis::{Ast, LANG, Source};

let ast = Ast::parse(Source::new(
    LANG::Rust,
    b"fn safe() {} fn risky() { unsafe { } }",
))
.expect("rust feature enabled");

let source = ast.source();
// Captured slices borrow from `source` — no per-hit `String` allocation.
let mut hits: Vec<((usize, usize), &str)> = Vec::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    if node.kind() == "unsafe_block" {
        let span = (node.start_position().row, node.end_position().row);
        let text = node
            .utf8_text(source)
            .expect("source is valid utf-8");
        hits.push((span, text));
    }
});

assert_eq!(hits.len(), 1);

Node::utf8_text(&source[..]) slices the source bytes by the node's byte range. Pair it with Ast::source — for C++ with preprocessor inputs supplied to Ast::parse, source is the expanded buffer the parser actually saw, not the original input (see the C++ preprocessor note).

Detect parse errors

Tree-sitter is error-tolerant: even on malformed input it returns a tree, but nodes that could not be matched are tagged as errors. The cheapest check is on the root:

#![allow(unused)]
fn main() {
use big_code_analysis::{Ast, LANG, Source};

let ast = Ast::parse(Source::new(LANG::Rust, b"fn broken("))
    .expect("rust feature enabled");

// O(1) check of the cached error bit: confirms that something went
// wrong, but does not enumerate every error site.
assert!(ast.as_tree_sitter().root_node().has_error());
}

To list the offending nodes, walk the tree and check each:

use big_code_analysis::{Ast, LANG, Source};

let ast = Ast::parse(Source::new(LANG::Rust, b"fn broken("))
    .expect("rust feature enabled");

let mut error_lines = Vec::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    if node.is_error() || node.is_missing() {
        error_lines.push(node.start_position().row);
    }
});

assert!(!error_lines.is_empty());

Node::is_error() flags the synthetic ERROR node tree-sitter inserts where it could not match the grammar; Node::is_missing() flags phantom nodes the parser invented to recover from a missing token. The CLI's bca find ERROR recipe uses the same nodes.

Combine metrics with a custom walk

The whole point of Ast is parse-once / compute-many. A realistic pipeline computes metrics and extracts a symbol table from the same parse:

use big_code_analysis::{Ast, LANG, MetricsOptions, Source};

let ast = Ast::parse(Source::new(
    LANG::Rust,
    b"fn outer() { fn inner() {} } fn alone() {}",
))
.expect("rust feature enabled");

// One parse: metrics walker uses it…
let space = ast
    .metrics(MetricsOptions::default())
    .expect("walker succeeds");

// …and so does the custom walk, against the very same tree. The
// captured names borrow from `source` rather than allocating a fresh
// `String` per function, the same pattern as the unsafe-block search
// under "Find nodes by kind" above.
let source = ast.source();
let mut functions: Vec<&str> = Vec::new();
walk_preorder(ast.as_tree_sitter(), |node| {
    if node.kind() == "function_item"
        && let Some(name_node) = node.child_by_field_name("name")
    {
        let name = name_node
            .utf8_text(source)
            .expect("source is valid utf-8");
        functions.push(name);
    }
});

assert_eq!(space.metrics.nom.functions_sum(), 3.0);
assert_eq!(functions, ["outer", "inner", "alone"]);

Node::child_by_field_name walks the named grammar fields — the same fields that show up in the FieldName column when you run bca --paths sample.rs dump. Field-based lookup is more robust than positional indexing because it does not depend on which children the grammar emits for anonymous tokens (commas, parentheses, …).

Want a serializable JSON tree?

For pipelines that want a structured AST as data — diffing, queries on the wire, language-agnostic schema work — the AstCallback / AstNode family materializes the tree as a Serialize-able struct. This is what the REST /ast endpoint produces (bca dump uses a separate Dump callback that writes a human-readable form to stdout). Library consumers can call the JSON-shaped callback directly:

#![allow(unused)]
fn main() {
use std::path::PathBuf;

use big_code_analysis::{
    AstCallback, AstCfg, AstPayload, LANG, action,
};

let payload = AstPayload {
    id: "snippet".to_owned(),
    file_name: "snippet.rs".to_owned(),
    code: "fn f() {}".to_owned(),
    comment: false,
    span: true,
};
let cfg = AstCfg {
    id: payload.id.clone(),
    comment: payload.comment,
    span: payload.span,
};
let response = action::<AstCallback>(
    &LANG::Rust,
    payload.code.into_bytes(),
    &PathBuf::from(&payload.file_name),
    None,
    cfg,
);
let json = serde_json::to_string(&response).expect("AstResponse serializes");
println!("{json}");
}

For one-off in-process work, the as_tree_sitter() walker above is cheaper (no allocation per node). Reach for AstCallback when you need a serializable owned tree.

Out of scope

  • Incremental reparse — tree-sitter supports tree_sitter::InputEdit for incremental updates, but Ast is a snapshot. To reflect a source edit, build a fresh Ast::parse or call Parser::parse(&new_source, Some(&old_tree)) directly via the re-exported tree_sitter and feed the result through Ast::from_tree_sitter.
  • The crate-internal big_code_analysis::Node wrapper. It is exposed for the metric walker's traversal needs, but most of its traversal methods (kind, child_count, children, cursor, …) stay pub(crate). Library consumers should reach the tree-sitter Node through as_tree_sitter().root_node() — that is the documented seam.

Selecting metrics

By default, every call to analyze computes the full metric suite — ABC, cognitive, cyclomatic, Halstead, LoC, MI, NArgs, NExits, NOM, NPA, NPM, tokens, and WMC. That is the right default for the CLI, where the user has just asked for the metrics, but it is heavyweight for callers that only want one number per file.

MetricsOptions::with_only(&[Metric]) lets you restrict the walker to a subset of metrics. Unselected metrics are skipped at the per-node level — no T::Halstead::compute, no T::Cognitive::compute, etc. — and elided from the CodeMetrics serialization output.

A worked example

Compute LoC only, then read the result:

use big_code_analysis::{analyze, LANG, Metric, MetricsOptions, Source};

fn main() {
    let source = b"fn f(x: i32) -> i32 { if x > 0 { 1 } else { 0 } }";

    let opts = MetricsOptions::default().with_only(&[Metric::Loc]);
    let space = analyze(
        Source::new(LANG::Rust, source).with_name(Some("snippet.rs".to_owned())),
        opts,
    )
    .expect("parses");

    // LoC was selected — it carries real numbers.
    println!("ploc = {}", space.metrics.loc.ploc());

    // Halstead, cognitive, cyclomatic, … were skipped. Their
    // `Stats` fields are at `Default` and elided from JSON output.
    let json = serde_json::to_string_pretty(&space.metrics).unwrap();
    println!("{json}");
}

The JSON output for that call contains only the loc object; every other metric is absent.

Dependencies between metrics

Two metrics are derived — they consume the outputs of other metrics during the finalize step:

| Metric | Dependencies |
|---|---|
| Metric::Mi | Loc, Cyclomatic, Halstead |
| Metric::Wmc | Cyclomatic, Nom |
with_only resolves these closures silently. Asking for Mi alone still computes Loc + Cyclomatic + Halstead, so the MI value is meaningful rather than a function of zero-default inputs:

#![allow(unused)]
fn main() {
use big_code_analysis::{Metric, MetricSet, MetricsOptions};
let opts = MetricsOptions::default().with_only(&[Metric::Mi]);
// opts.metrics now contains Mi + Loc + Cyclomatic + Halstead.
}
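
The closure step can be pictured with a small standalone sketch. The Metric enum below is a hypothetical stand-in that lists only the variants needed here, and close_over_dependencies is an illustrative name, not a library function; the dependency edges are copied from the table above. How with_only implements the expansion internally is not shown in this chapter, so treat this as one plausible shape rather than the library's code.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the library's `Metric` enum; only the
/// variants needed for the illustration are listed.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Metric {
    Cyclomatic,
    Halstead,
    Loc,
    Mi,
    Nom,
    Wmc,
}

impl Metric {
    /// Direct dependencies, copied from the table in this section.
    fn dependencies(self) -> &'static [Metric] {
        match self {
            Metric::Mi => &[Metric::Loc, Metric::Cyclomatic, Metric::Halstead],
            Metric::Wmc => &[Metric::Cyclomatic, Metric::Nom],
            _ => &[],
        }
    }
}

/// Expand a positive selection so that it contains every dependency.
fn close_over_dependencies(requested: &[Metric]) -> BTreeSet<Metric> {
    let mut selected = BTreeSet::new();
    let mut pending: Vec<Metric> = requested.to_vec();
    while let Some(metric) = pending.pop() {
        // `insert` returns false for metrics already seen, so each
        // metric's dependencies are queued at most once.
        if selected.insert(metric) {
            pending.extend_from_slice(metric.dependencies());
        }
    }
    selected
}

fn main() {
    let selected = close_over_dependencies(&[Metric::Mi]);
    println!("{selected:?}");
}
```

Because the selection only ever grows, the result does not depend on the order in which the caller lists metrics.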

You can introspect the final set from the resulting FuncSpace via space.metrics.selected():

#![allow(unused)]
fn main() {
use big_code_analysis::{analyze, LANG, Metric, MetricsOptions, Source};
let space = analyze(
    Source::new(LANG::Rust, b"fn f() {}"),
    MetricsOptions::default().with_only(&[Metric::Mi]),
).unwrap();
let sel = space.metrics.selected();
assert!(sel.contains(Metric::Mi));
assert!(sel.contains(Metric::Loc)); // auto-added dependency
}

Default behaviour is unchanged

MetricsOptions::default() selects every metric. The pre-#257 entry points (analyze without with_only, plus the deprecated metrics / metrics_with_options shims) produce byte-for-byte the same JSON they always did.

What about "everything except X"?

There is no built-in complement API — with_only takes a positive selection, not an exclusion list. The intentional asymmetry keeps the dependency closure unambiguous: a positive list always grows through Metric::dependencies, whereas an exclusion list would need to decide what to do when the caller excludes a dependency of a metric they kept.

If you genuinely want "all except Halstead", build the list explicitly. Because Metric is #[non_exhaustive], downstream crates can construct the variants but cannot exhaustively match on them, so the conventional pattern is to enumerate the variants you want and accept that adding a future Metric variant will not silently opt you in:

#![allow(unused)]
fn main() {
use big_code_analysis::{Metric, MetricsOptions};

let opts = MetricsOptions::default().with_only(&[
    Metric::Cognitive,
    Metric::Cyclomatic,
    Metric::Loc,
    Metric::Nom,
    Metric::Tokens,
    Metric::NArgs,
    Metric::Exit,
    Metric::Abc,
    Metric::Npm,
    Metric::Npa,
    Metric::Wmc,
    // Metric::Mi intentionally omitted: it would pull Halstead
    // back in via the dependency closure.
]);
}

Note the trap: keeping Metric::Mi re-adds Metric::Halstead through Metric::dependencies. To truly drop Halstead you must also drop Mi.
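
If you build such lists in more than one place, a small helper can apply the "drop the dependents too" rule for you. The sketch below is standalone: the Metric enum, its ALL constant, and all_except are hypothetical stand-ins written for this illustration (this chapter does not show the library exposing an all-variants constant), and the dependency edges are the two documented above.

```rust
/// Hypothetical stand-in for the library's `Metric` enum.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Metric {
    Cognitive,
    Cyclomatic,
    Halstead,
    Loc,
    Mi,
    Nom,
    Wmc,
}

impl Metric {
    /// Every variant of this toy enum, maintained by hand.
    const ALL: [Metric; 7] = [
        Metric::Cognitive,
        Metric::Cyclomatic,
        Metric::Halstead,
        Metric::Loc,
        Metric::Mi,
        Metric::Nom,
        Metric::Wmc,
    ];

    /// Direct dependencies, as documented in this chapter.
    fn dependencies(self) -> &'static [Metric] {
        match self {
            Metric::Mi => &[Metric::Loc, Metric::Cyclomatic, Metric::Halstead],
            Metric::Wmc => &[Metric::Cyclomatic, Metric::Nom],
            _ => &[],
        }
    }

    /// True when `self` is `target` or reaches it through dependencies.
    fn requires(self, target: Metric) -> bool {
        self == target
            || self.dependencies().iter().any(|dep| dep.requires(target))
    }
}

/// Positive list of every metric that can run without `excluded`.
fn all_except(excluded: Metric) -> Vec<Metric> {
    Metric::ALL
        .iter()
        .copied()
        .filter(|metric| !metric.requires(excluded))
        .collect()
}

fn main() {
    // Dropping Halstead also drops Mi, which depends on it.
    println!("{:?}", all_except(Metric::Halstead));
}
```

The hand-maintained ALL list keeps the property described above: a variant added later is not picked up until someone adds it to the list.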

When to reach for with_only

  • Hot paths that need only one or two metrics per file — Halstead in particular owns its own per-space HalsteadMaps allocation and is the headline saving for an LoC-only run.
  • CI integrations that only display one number (e.g. a cognitive-complexity gate) and want the rest of CodeMetrics to drop out of the cached JSON payload.
  • Library callers wiring big-code-analysis into their own reports who would otherwise see fields for every metric in their own UI.

Per-metric Cargo features (compile-time stripping) are not covered by this knob; they remain tracked separately under the grammar-feature work (#252).

Per-language Cargo features

Every tree-sitter grammar this library bundles is gated behind its own Cargo feature. The default feature set is all-languages, so the default

[dependencies]
big-code-analysis = "1.1.0"

pulls every grammar in — matching the library's historical behaviour and what the bca / bca-web binaries themselves ship with. The cost is concrete: every grammar crate compiles when the library compiles, and every grammar's parsing tables stay live in the final binary.

Library consumers that only need a subset of languages can opt out of the defaults and re-enable just the grammars they care about.

A worked example

A downstream service that only analyses Rust and TypeScript:

[dependencies]
big-code-analysis = { version = "1.1.0", default-features = false, features = ["rust", "typescript"] }

The library still compiles, the LANG enum still has every variant, and analyze / metrics_from_tree / the rest of the dispatch surface still work for the enabled languages.

Supported features

The following per-language features are available. Each feature pulls in the matching grammar crate (and any helper grammars the per-language pipeline depends on).

| Feature | Grammar crates pulled in |
|---|---|
| bash | tree-sitter-bash |
| cpp | bca-tree-sitter-mozcpp, bca-tree-sitter-ccomment, bca-tree-sitter-preproc (covers the Cpp, Ccomment, and Preproc variants) |
| csharp | tree-sitter-c-sharp |
| elixir | tree-sitter-elixir |
| go | tree-sitter-go |
| groovy | dekobon-tree-sitter-groovy |
| java | tree-sitter-java |
| javascript | tree-sitter-javascript |
| kotlin | tree-sitter-kotlin-ng |
| lua | tree-sitter-lua |
| mozjs | bca-tree-sitter-mozjs |
| perl | tree-sitter-perl |
| php | tree-sitter-php |
| python | tree-sitter-python |
| ruby | tree-sitter-ruby |
| rust | tree-sitter-rust |
| tcl | bca-tree-sitter-tcl |
| typescript | tree-sitter-typescript (used by both the Typescript and Tsx variants) |

The umbrella all-languages feature enables every entry in this table. The bca-tree-sitter-* crates are in-tree forks of the upstream Mozilla / community grammars; the Rust import path remains tree_sitter_<lang> regardless. See RELEASING.md for the rename rationale and the workspace package = ... alias trick that keeps consumer call sites unchanged.

What happens when a feature is off

The LANG enum keeps every variant defined regardless of the active feature set — disabling a feature does not change the enum surface, the per-language *Code / *Parser type aliases, or any of the file-extension / emacs-mode detection helpers. Selecting a LANG whose feature is off only affects the dispatch path.

Every dispatch entry point that returns a Result surfaces the disabled state as Err(MetricsError::LanguageDisabled(LANG)).

Callers can query the compiled-in set without going through a dispatcher:

#![allow(unused)]
fn main() {
use big_code_analysis::LANG;

for lang in LANG::into_enum_iter() {
    if lang.is_enabled() {
        println!("{:?} is compiled in", lang);
    }
}
}

This pairs well with the get_language_for_file / guess_language helpers, which still hand back any LANG variant for a recognised extension — callers walking a directory may want to skip files whose language is not enabled in the current build.
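
A minimal sketch of that skip-before-dispatch pattern, written against stand-ins so it runs on its own. Lang, its is_enabled method (which simply pretends Python is compiled out), guess, and partition are all hypothetical names for this illustration; in real code the checks would be LANG::is_enabled and one of the detection helpers named above.

```rust
/// Hypothetical stand-in for `LANG`, reduced to two variants.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Lang {
    Rust,
    Python,
}

impl Lang {
    /// Stand-in for `LANG::is_enabled`. For the illustration, only
    /// Rust is treated as compiled in.
    fn is_enabled(self) -> bool {
        matches!(self, Lang::Rust)
    }
}

/// Stand-in for an extension-based language guess.
fn guess(file: &str) -> Option<Lang> {
    match file.rsplit_once('.')?.1 {
        "rs" => Some(Lang::Rust),
        "py" => Some(Lang::Python),
        _ => None,
    }
}

/// Split files into those worth analysing and those to skip, so the
/// disabled-language error never has to be handled per file.
fn partition<'a>(files: &[&'a str]) -> (Vec<(&'a str, Lang)>, Vec<&'a str>) {
    let mut todo = Vec::new();
    let mut skipped = Vec::new();
    for &file in files {
        match guess(file) {
            Some(lang) if lang.is_enabled() => todo.push((file, lang)),
            // Unknown extension, or a language this build lacks.
            _ => skipped.push(file),
        }
    }
    (todo, skipped)
}

fn main() {
    let (todo, skipped) = partition(&["a.rs", "b.py", "README"]);
    println!("analyse: {todo:?}");
    println!("skip:    {skipped:?}");
}
```

Filtering up front keeps the per-file loop free of error handling for a condition that is fixed at compile time.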

Stability

Per-language features are themselves stable. Adding or removing a language feature in the future is a minor-bump break (it changes which LANG variants the default build covers); changes to the default feature set will be flagged in the changelog under (breaking).

Walking FuncSpace results

FuncSpace is the tree the library hands back from analyze. The top-level node represents the whole file; its spaces field holds nested function / class / impl / trait / namespace spaces. Each node carries the same CodeMetrics payload, so any metric is available at any level of granularity.

Anatomy of a FuncSpace

The fields you reach for most often are:

| Field | Type | What it is |
|---|---|---|
| name | Option<String> | Caller-supplied identifier (top-level) or symbol name (nested) |
| kind | SpaceKind | Unit, Function, Class, Impl, … |
| start_line | usize | First line (1-based) |
| end_line | usize | Last line (1-based) |
| spaces | Vec<FuncSpace> | Nested spaces |
| metrics | CodeMetrics | All per-space metric values |
| suppressed | SuppressionScope | In-source suppression markers |

SpaceKind is an enum — match on it to filter what you care about (Function only, or "anything that owns methods").

Recursive walk

Recursion mirrors the tree shape. Here we collect every function space whose cognitive complexity exceeds a threshold:

use big_code_analysis::{
    analyze, FuncSpace, MetricsOptions, SpaceKind, Source, LANG,
};

fn hotspots(space: &FuncSpace, threshold: f64, out: &mut Vec<String>) {
    if space.kind == SpaceKind::Function
        && space.metrics.cognitive.cognitive_sum() > threshold
    {
        if let Some(name) = &space.name {
            out.push(format!(
                "{name} (lines {}–{})",
                space.start_line, space.end_line,
            ));
        }
    }
    for child in &space.spaces {
        hotspots(child, threshold, out);
    }
}

fn main() {
    let source = b"\
fn easy() { let _ = 1; }
fn hard(x: i32) -> i32 {
    if x > 0 { if x > 10 { 1 } else { 2 } } else { 3 }
}
";
    let space = analyze(
        Source::new(LANG::Rust, source).with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    )
    .expect("parses");

    let mut hits = Vec::new();
    hotspots(&space, 2.0, &mut hits);
    for hit in hits {
        println!("{hit}");
    }
}

Iterative walk

For deep trees, prefer an explicit stack. The recursive version uses one native stack frame per nesting level, and pathological generated code can be arbitrarily nested:

#![allow(unused)]
fn main() {
use big_code_analysis::FuncSpace;

fn total_functions(root: &FuncSpace) -> usize {
    let mut stack = vec![root];
    let mut count = 0;
    while let Some(space) = stack.pop() {
        if space.kind == big_code_analysis::SpaceKind::Function {
            count += 1;
        }
        stack.extend(space.spaces.iter());
    }
    count
}
}
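
One detail of the stack-based walk is worth spelling out: popping from the end of a Vec visits siblings last-first, which does not matter for counting but does if you collect names. The sketch below pushes children in reverse to recover source order. Space is a hypothetical, trimmed-down stand-in for FuncSpace so that the example runs without the library.

```rust
/// Hypothetical, trimmed-down stand-in for `FuncSpace`.
struct Space {
    name: &'static str,
    spaces: Vec<Space>,
}

/// Pre-order traversal with an explicit stack. Children are pushed in
/// reverse so that the first child is popped, and visited, first.
fn names_in_preorder(root: &Space) -> Vec<&'static str> {
    let mut stack = vec![root];
    let mut out = Vec::new();
    while let Some(space) = stack.pop() {
        out.push(space.name);
        stack.extend(space.spaces.iter().rev());
    }
    out
}

fn main() {
    let tree = Space {
        name: "file",
        spaces: vec![
            Space {
                name: "outer",
                spaces: vec![Space { name: "inner", spaces: vec![] }],
            },
            Space { name: "alone", spaces: vec![] },
        ],
    };
    println!("{:?}", names_in_preorder(&tree));
}
```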

Reading per-metric numbers

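CodeMetrics exposes each metric as its own Stats struct. Each struct offers summary accessors for whole-number counts (returned as floating-point values, which is why examples in this book compare against 3.0 or an f64 threshold) plus derived per-space values. A few patterns: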

#![allow(unused)]
fn main() {
use big_code_analysis::FuncSpace;

fn summary(space: &FuncSpace) {
    let m = &space.metrics;

    println!("cognitive (this space):     {}", m.cognitive.cognitive_sum());
    println!("cyclomatic (this space):    {}", m.cyclomatic.cyclomatic_sum());
    println!("# functions in this space:  {}", m.nom.functions_sum());
    println!("source lines (sloc):        {}", m.loc.sloc());
    println!("physical lines (ploc):      {}", m.loc.ploc());
    println!("ABC branches:               {}", m.abc.branches());
}
}

The *_sum accessors aggregate across child spaces; bare accessors like m.loc.sloc() are the value attributable to this node. The full list of fields and methods lives in the per-metric rustdoc.
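
The relationship between a bare accessor and its *_sum counterpart can be modelled on a toy tree. Everything in this sketch is hypothetical (Space, value, value_sum), and it computes the sum by recursion on demand purely for clarity; it models the relationship described above, not how the library stores or finalizes its numbers.

```rust
/// Hypothetical stand-in for a space carrying one metric.
struct Space {
    own: u32,
    spaces: Vec<Space>,
}

impl Space {
    /// The amount attributable to this node alone.
    fn value(&self) -> u32 {
        self.own
    }

    /// This node's amount plus that of every nested space.
    fn value_sum(&self) -> u32 {
        self.own + self.spaces.iter().map(Space::value_sum).sum::<u32>()
    }
}

fn main() {
    // A file-level space with two functions, one of which nests a closure.
    let file = Space {
        own: 1,
        spaces: vec![
            Space { own: 2, spaces: vec![Space { own: 3, spaces: vec![] }] },
            Space { own: 4, spaces: vec![] },
        ],
    };
    println!("own = {}, sum = {}", file.value(), file.value_sum());
}
```

At a leaf the two values coincide, which is a quick sanity check when reading real output.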

Don't rely on traversal order

The library walks the AST in source order, but the contract is only that every space appears once in the tree. If you need a stable order across versions, sort by start_line after the walk:

#![allow(unused)]
fn main() {
use big_code_analysis::FuncSpace;

fn flatten(space: &FuncSpace, out: &mut Vec<(usize, String)>) {
    if let Some(name) = &space.name {
        out.push((space.start_line, name.clone()));
    }
    for child in &space.spaces {
        flatten(child, out);
    }
}

fn sorted(space: &FuncSpace) -> Vec<(usize, String)> {
    let mut v = Vec::new();
    flatten(space, &mut v);
    v.sort_by_key(|&(line, _)| line);
    v
}
}

Error handling

The entry point analyze returns Result<FuncSpace, MetricsError>. This page documents what each variant means and how to act on it.

Heads up. Prior to #253 this entry point returned Option<FuncSpace> and collapsed every failure mode into a single None. The Result variant set is additive: MetricsError is #[non_exhaustive], so always include a _ arm when matching on it to stay forward-compatible with future variants.

Pattern-matching the error variants

use big_code_analysis::{analyze, LANG, MetricsError, MetricsOptions, Source};

fn main() {
    let result = analyze(
        Source::new(LANG::Rust, b"this is not rust")
            .with_name(Some("snippet.rs".to_owned())),
        MetricsOptions::default(),
    );

    match result {
        Ok(space) => println!("ok: {} lines", space.metrics.loc.sloc()),
        Err(MetricsError::EmptyRoot) => {
            eprintln!("walker produced no top-level FuncSpace");
        }
        Err(MetricsError::ParseHasErrors) => {
            eprintln!("tree-sitter reported syntax errors (strict mode)");
        }
        Err(MetricsError::LanguageDisabled(lang)) => {
            eprintln!("language {:?} is not enabled in this build", lang);
        }
        Err(MetricsError::NonUtf8Path) => {
            eprintln!("path is not valid UTF-8");
        }
        // `MetricsError` is `#[non_exhaustive]`; new variants may be added.
        Err(_) => eprintln!("unexpected MetricsError variant"),
    }
}

What each variant means

  • EmptyRoot — The walker reached the end of the AST without producing a top-level FuncSpace. The most common cause is empty input or input whose only content is comments. Defensive failures (the traversal produced no Unit space for any supported language) also surface here; if you hit one on real-world source, please file an issue.
  • ParseHasErrors — Reserved for a future strict-parsing toggle on MetricsOptions. Not produced by today's default entry points; tree-sitter's error recovery is intentionally tolerant (see below).
  • LanguageDisabled(LANG) — Returned when the requested language's Cargo feature is not compiled in (see Per-language Cargo features, #252). The default all-languages feature set enables every grammar, so default builds never produce it.
  • NonUtf8Path — Reserved for callers that opt into strict-identifier mode. Since #254, the recommended analyze entry point takes a caller-supplied Source::name (Option<String>), so non-UTF-8 paths are never round-tripped through lossy conversion in the first place. The deprecated path-positional shims (get_function_spaces, metrics_with_options) still fall back to Path::to_string_lossy. This variant is not produced today; it is kept for future strict-identifier validators.

Tree-sitter does not always say "no"

Most parse errors do not surface as Err(_). Tree-sitter is an error-recovering parser — it will produce a tree even for syntactically broken input, marking the bad regions with ERROR nodes. The metric walk happily computes numbers over the recovered tree. That means:

  • Garbage in, numbers out. Feeding C++ source to LANG::Python generally produces an Ok(FuncSpace) whose metrics are nonsense. Make sure you have selected the right language (e.g. via guess_language) before trusting the result.
  • Partial files score. A truncated file with an unterminated brace will still return Ok(FuncSpace). The metrics reflect the recovered tree, not the intended source.

If you need to know whether the input parsed cleanly, count ERROR nodes by walking the tree-sitter AST yourself (see the Node escape hatch in STABILITY.md) or use the bca nodes subcommand on the CLI side.

Bubbling MetricsError through ?

Because MetricsError implements std::error::Error, you can bubble it through any Result<_, Box<dyn Error>> chain without boilerplate:

use std::error::Error;

use big_code_analysis::{analyze, FuncSpace, LANG, MetricsOptions, Source};

pub fn run(
    lang: LANG,
    source: &[u8],
    name: Option<String>,
) -> Result<FuncSpace, Box<dyn Error>> {
    Ok(analyze(
        Source::new(lang, source).with_name(name),
        MetricsOptions::default(),
    )?)
}

If you want a project-specific error type, an explicit From impl keeps call sites clean while letting you attach extra context (file path, language guess, etc.).

Warnings are not errors

The library writes warnings to stderr for non-fatal issues (malformed bca: suppression markers, mainly). They do not abort the walk and they do not flip Ok to Err. If you are running embedded inside a server or library and need to capture those warnings, redirect stderr at the process level — the library does not currently expose a programmatic warning sink. That is tracked under the library-DX umbrella (#250).
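For callers coming through the Python bindings, one way to do that process-level redirect is to swap file descriptor 2 for the duration of a call. The sketch below rests on two assumptions: that the warnings are written to the process's stderr descriptor (so replacing sys.stderr alone would not catch them), and that nothing else writes to stderr concurrently. capture_fd_stderr is a hypothetical helper, not part of the library, and the os.write call stands in for a native warning.

```python
import os
import sys
import tempfile
from contextlib import contextmanager


@contextmanager
def capture_fd_stderr():
    """Redirect file descriptor 2 into a temp file (hypothetical helper).

    Yields a list; after the block exits it holds one string with
    everything that was written to fd 2 while the block ran.
    """
    captured = []
    sys.stderr.flush()              # push out anything Python has buffered
    saved_fd = os.dup(2)            # keep a handle on the real stderr
    with tempfile.TemporaryFile(mode="w+b") as tmp:
        os.dup2(tmp.fileno(), 2)    # fd 2 now points at the temp file
        try:
            yield captured
        finally:
            sys.stderr.flush()
            os.dup2(saved_fd, 2)    # restore the real stderr
            os.close(saved_fd)
            tmp.seek(0)
            captured.append(tmp.read().decode("utf-8", errors="replace"))


if __name__ == "__main__":
    with capture_fd_stderr() as captured:
        # Stand-in for a warning emitted by native code.
        os.write(2, b"warning: malformed suppression marker\n")
    print(captured[0], end="")
```

File descriptor 2 is process-global, so this approach is not safe when several threads analyse files at once; serialise the calls or redirect stderr for the whole process instead.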

Stability and versioning

big-code-analysis is on the 1.x line. The full stability contract lives in STABILITY.md at the root of the repository — that file is the source of truth and is updated alongside the changelog at every release.

The headlines for library consumers:

  • Shape stability across patch and minor bumps. Every public type and function signature listed in STABILITY.md § "What is stable in shape" is held across the 1.x line. Additive changes (new items, new LANG variants, new MetricsError variants, new language features) are allowed in minor bumps. Breaking shape changes are reserved for the next major bump and will appear in the changelog under (breaking) in the 2.0.0 section.
  • No value stability guarantee within 1.x. A grammar pin bump or a bug fix in a metric definition can shift any metric value on any file in any direction, even across a patch bump. Each such drift is flagged in the changelog. Pin to an exact version (big-code-analysis = "= 1.1.0") if you need bit-for-bit reproducibility across runs.
  • MSRV is 1.94. Bumping the MSRV is treated as a minor-bump event and is flagged in the changelog under (breaking) — see STABILITY.md § MSRV policy.
  • Escape hatches. The Node wrapper exposes tree_sitter::Node through .0, and the tree_sitter crate is re-exported as big_code_analysis::tree_sitter. Anything reached through those seams follows the pinned tree-sitter version, not our own SemVer. See STABILITY.md § Escape hatches before depending on them.

On the 2.0 horizon

A small number of loose ends are deferred to 2.0; they are listed in STABILITY.md § "On the 2.0 horizon". The headline items are:

  • The per-metric Stats structs gain #[non_exhaustive], so field additions stop being a shape break in the strict SemVer sense.
  • The deprecated metrics / metrics_with_options shims (in favour of analyze) are removed.
  • The accumulated metric-definition fixes that have shifted values across 1.x get a clean re-baseline note.

2.0 is not scheduled. Until then, 1.x is the surface you should depend on.

Python Bindings

big-code-analysis ships first-party Python bindings (PyO3 + maturin) that expose the same metric pipeline as the Rust library and the bca CLI — same JSON shape, same numeric formatting, same language coverage.

import big_code_analysis as bca

result = bca.analyze("src/main.rs")
if result is not None:
    print(result["metrics"]["cyclomatic"]["sum"])

The bindings are a peer of the Rust API: anywhere this book points at a Rust function (big_code_analysis::analyze, FuncSpace, the metric modules), Python has a one-to-one equivalent. Pick whichever language fits your pipeline — the metrics are identical.

When to reach for Python

  • You're already in a data-pipeline stack (pandas, Jupyter, Airflow, dbt, Polars) and want metric records as dict/DataFrame rows without shelling out to the CLI.
  • You're integrating with a Python-native security tool that consumes SARIF — see SARIF output.
  • You're building a code-quality dashboard whose backend is a Python web framework (FastAPI, Django).

If you only need a one-shot quality report from the command line, the bca CLI is the simpler tool — see Commands → Metrics.

If you're embedding the analysis into a long-running Rust program, the Rust library is the lower-overhead option.

Chapter contents

The headline example on each page is embedded verbatim from an importable file under big-code-analysis-py/examples/ and exercised end-to-end by big-code-analysis-py/tests/test_book_examples.py, so a renamed kwarg or a removed function on the primary path fails CI before it can rot the docs. Shorter illustrative snippets that surround the embedded example (logging recipes, regex parsing of the errno suffix, the asyncio anti-pattern, the pandas one-liner, …) are inline and intentionally not test-pinned — treat the embedded blocks as the canonical reference when the two disagree.

Installation

The bindings are distributed as a pure-wheel Python package. The recommended install is via pip (or your preferred lockfile manager — uv, poetry, pdm).

pip install big-code-analysis

Python >=3.12 is required. The compiled extension uses CPython's stable abi3 surface (abi3-py312), so one wheel covers 3.12, 3.13, and every future minor release without a per-version wheel build.

Wheel matrix

CI publishes wheels for the following targets today. If your platform is not listed, build from source.

| Platform | Architectures |
| --- | --- |
| Linux (manylinux_2_28) | x86_64, aarch64 |

The wheel matrix is defined in .github/workflows/python-wheels.yml. Phase 7 of the bindings work lit up the manylinux_2_28 Linux legs. manylinux_2_28 requires glibc >= 2.28 (RHEL 8 / Debian 10 / Ubuntu 18.10 and newer); older distributions (RHEL 7 / CentOS 7, glibc 2.17) need to build from source. macOS and Windows wheel publication is tracked under #103 and not yet shipped — pip install on those platforms falls back to a source build today.
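If you are unsure whether a Linux host meets the glibc floor, the standard library can report the C library it was linked against. The following is a diagnostic sketch only: pip makes the real compatibility decision, and glibc_supports_manylinux_2_28 is a hypothetical helper name.

```python
import platform

MANYLINUX_2_28_MIN_GLIBC = (2, 28)


def glibc_supports_manylinux_2_28(libc_name: str, libc_version: str) -> bool:
    """Return True when the reported C library is glibc >= 2.28."""
    if libc_name != "glibc":
        # musl-based and non-Linux platforms report another name or nothing.
        return False
    parts = libc_version.split(".")
    try:
        major, minor = int(parts[0]), int(parts[1])
    except (IndexError, ValueError):
        return False
    return (major, minor) >= MANYLINUX_2_28_MIN_GLIBC


if __name__ == "__main__":
    name, version = platform.libc_ver()
    verdict = glibc_supports_manylinux_2_28(name, version)
    print(f"libc={name or 'unknown'} {version or ''} wheel-compatible={verdict}")
```

A False result on Linux suggests that pip install will fall back to a source build, in which case the build prerequisites listed under "Building from source" apply.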

Verifying the install

python -c "import big_code_analysis as bca; print(bca.__version__)"

The printed version equals [workspace.package].version from the Rust workspace's Cargo.toml: the bindings and the Rust library are versioned in lockstep.

Building from source

If no wheel matches your platform, or you want to bind against an unreleased Rust commit, build with maturin:

git clone https://github.com/dekobon/big-code-analysis.git
cd big-code-analysis/big-code-analysis-py
python -m venv .venv && source .venv/bin/activate
pip install --upgrade pip
pip install "maturin>=1.7,<2.0"
maturin develop --release   # editable install of big_code_analysis
python -c "import big_code_analysis as bca; print(bca.__version__)"

maturin develop builds the Rust extension in-place and installs it into the active venv so import big_code_analysis resolves locally — no separate pip install -e . step is required. The --release flag turns on the optimiser; omit it during development for faster rebuilds.

You will also need:

  • A stable Rust toolchain (MSRV: 1.94). Install via rustup.
  • A C compiler (used by the tree-sitter grammar crates).
  • CPython development headers (python3-dev on Debian / Ubuntu).

Next

Walk through the quick-start to compute your first metric, or skip ahead to batch processing if you're wiring this into a pipeline over many files.

Quick start

This page walks through the minimum amount of code needed to compute metrics from a single source file.

1. Install the package

pip install big-code-analysis

See Installation for the wheel matrix and build-from-source instructions.

2. Analyse a file

bca.analyze(path) returns a dict matching the JSON bca metrics --output-format json emits for the same file — same field order, same numeric formatting, same shape.

"""Quick-start: analyse one file and print the headline cyclomatic count.

Mirrors the worked example shown on the book's
``python/quick-start.md`` page. The book embeds this file verbatim,
so the snippet is the test fixture — if the API drifts, the
``test_book_examples.py`` test fails and the docs are forced back
into sync.
"""

from __future__ import annotations

from pathlib import Path
from typing import Any

import big_code_analysis as bca


def run(path: Path) -> dict[str, Any]:
    """Analyse ``path`` and return its metric dict."""
    result = bca.analyze(path)
    if result is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)

    cyclomatic = result["metrics"]["cyclomatic"]
    print(f"{result['name']}: cyclomatic sum = {cyclomatic['sum']:.0f}")
    return result


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 2:
        sys.exit("usage: python quick_start.py <path>")
    run(Path(sys.argv[1]))

A few details worth noting:

  • analyze returns None when the file matches the CLI walker's is_generated predicate (a leading @generated, DO NOT EDIT, or GENERATED CODE marker). Always handle the optional return before reaching into result["metrics"].
  • The returned object is a plain dict[str, Any]. It is safe to serialise with json.dumps, ship to a downstream service, or feed into flatten_spaces for tabular consumers.
  • Language detection mirrors the CLI exactly: path extension first, then shebang / emacs-mode fallback. Call bca.analyze_source(code, language) instead if you already have the source in memory.
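To make the generated-file check concrete, here is a rough client-side approximation of a marker test. It is not the library's is_generated predicate: the three marker strings come from the text above, but the five-line window and the case-sensitive substring match are assumptions chosen for this sketch, and looks_generated is a hypothetical name.

```python
GENERATED_MARKERS = ("@generated", "DO NOT EDIT", "GENERATED CODE")


def looks_generated(source: str, *, head_lines: int = 5) -> bool:
    """Return True if any marker appears in the first `head_lines` lines.

    `head_lines=5` is an arbitrary window for this sketch; the real
    predicate may look at a different range.
    """
    head = source.splitlines()[:head_lines]
    return any(marker in line for line in head for marker in GENERATED_MARKERS)


if __name__ == "__main__":
    print(looks_generated("// @generated by a code generator\nfn main() {}\n"))
    print(looks_generated("fn main() {}\n"))
```

A pre-screen like this can be useful for in-memory text that never passes through the path-based walker, but treat bca.analyze's own None return as the authoritative answer for files on disk.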

3. Analyse an in-memory snippet

import big_code_analysis as bca

metrics = bca.analyze_source("fn main() {}\n", "rust")
print(metrics["metrics"]["loc"]["sloc"])

analyze_source accepts str, bytes, or bytearray. The returned dict has the same shape as analyze's output, with name set to None (no path is associated with an in-memory buffer).

Where to go next

Batch processing

bca.analyze_batch(paths) runs the same analysis as bca.analyze over every path in an iterable and never raises on per-file errors: each result slot is either an analysis dict or a bca.AnalysisError describing the failure. The list has the same length as the input and preserves order one-to-one, so callers can zip(inputs, results) without losing the pairing.

from collections.abc import Iterable
from pathlib import Path

import big_code_analysis as bca


def run(paths: Iterable[Path]) -> dict[str, int]:
    """Analyse ``paths`` as a batch and bucket successes vs failures.

    Returns a small summary dict (`ok`, `errors`, `total`) so the
    accompanying test can assert on it without re-parsing.
    """
    materialised = [str(p) for p in paths]
    results = bca.analyze_batch(materialised)

    ok = 0
    errors = 0
    for path, result in zip(materialised, results, strict=True):
        if isinstance(result, bca.AnalysisError):
            errors += 1
            print(f"  skip {path}: ({result.error_kind}) {result.error}")
        else:
            ok += 1
            sloc = result["metrics"]["loc"]["sloc"]
            print(f"  ok   {path}: sloc = {sloc:.0f}")

    return {"ok": ok, "errors": errors, "total": len(materialised)}

A few key contracts:

  • AnalysisError is returned, not raised. It is not an Exception subclass — isinstance(slot, bca.AnalysisError) is the discriminator.
  • The result list is the same length as the input. paths is consumed lazily, so generators work — but if you want to keep the input around for zip, materialise it into a list first.
  • analyze_batch runs with the is_generated walker filter off: every input position yields either a dict or an AnalysisError, never None. Call bca.analyze(path) per-file with the default skip_generated=True if you need the CLI walker's skip behaviour.

Parallel execution

There is no built-in concurrency inside analyze_batch — it is a sequential sweep. For parallelism, fan the per-file analyze call out across a thread pool:

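from collections.abc import Iterable
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Any

import big_code_analysis as bca


def run_parallel(paths: Iterable[Path], *, workers: int = 4) -> list[dict[str, Any] | None]: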
    """Fan ``analyze`` out across a thread pool.

    PyO3 releases the GIL across each file's read + parse, so a
    thread pool actually parallelises the heavy work. Use this when
    you need per-file exceptions instead of ``AnalysisError`` slots.
    """

    def _analyze(p: Path) -> dict[str, Any] | None:
        return bca.analyze(str(p))

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(_analyze, paths))

PyO3's Python::detach releases the GIL across each file's read + tree-sitter parse, so the threads do not serialise on the interpreter lock — this is real parallelism, not contended co-operation.
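One caveat of Executor.map is that an exception raised by a task is re-raised when its result is retrieved, so list(pool.map(...)) surfaces only the first failure. If you want one outcome per path, a sketch along these lines pairs every path with either its result or its exception. analyze_each and fake_analyze are hypothetical names; in real use you would pass bca.analyze as the analyze argument.

```python
from collections.abc import Callable, Iterable
from concurrent.futures import ThreadPoolExecutor
from typing import Any


def analyze_each(
    paths: Iterable[str],
    analyze: Callable[[str], Any],
    *,
    workers: int = 4,
) -> list[tuple[str, Any]]:
    """Run analyze(path) on a thread pool, keeping input order.

    Each slot is (path, result) on success or (path, exception) on failure.
    """
    materialised = list(paths)
    outcomes: list[tuple[str, Any]] = []
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(analyze, p) for p in materialised]
        for path, future in zip(materialised, futures):
            error = future.exception()  # blocks until this task has finished
            outcomes.append((path, error if error is not None else future.result()))
    return outcomes


def fake_analyze(path: str) -> dict[str, Any]:
    """Stand-in for bca.analyze so the sketch runs without the bindings."""
    if path.endswith(".missing"):
        raise FileNotFoundError(2, "No such file or directory", path)
    return {"name": path}


if __name__ == "__main__":
    for path, outcome in analyze_each(["a.rs", "b.missing", "c.rs"], fake_analyze):
        status = "error" if isinstance(outcome, BaseException) else "ok"
        print(f"{status:5} {path}")
```

Using submit and inspecting each future keeps the typed exceptions (FileNotFoundError and friends) while still giving the one-slot-per-input pairing that analyze_batch offers.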

AnalysisError taxonomy

error_kind is a closed Literal:

| error_kind | Triggered by |
| --- | --- |
| "UnsupportedLanguage" | Unknown extension + no shebang / emacs-mode hit |
| "ParseError" | tree-sitter rejected the source, or a rare internal serialisation failure (internal: serialization error: …) |
| "IoError" | std::fs::read failed or the path was not valid UTF-8 |

AnalysisError is frozen and implements __eq__ / __hash__ / __repr__ over all three fields, so callers can put errors in a set to deduplicate failures across runs. For retry classification, the errno is preserved in the error string via Rust's default formatting:

import re

match = re.search(r"\(os error (\d+)\)$", slot.error)
errno = int(match.group(1)) if match else None
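Building on that regex, a retry classifier might look like the sketch below. The set of errno values treated as transient is a made-up policy for illustration (what is worth retrying depends on your environment), and errno_from_message / should_retry are hypothetical helper names. The message strings assume the "(os error N)" suffix described above.

```python
import errno
import re
from typing import Optional

_OS_ERROR_SUFFIX = re.compile(r"\(os error (\d+)\)$")

# Hypothetical policy: errno values this sketch considers worth retrying.
_TRANSIENT = frozenset(
    {errno.EAGAIN, errno.EINTR, errno.EBUSY, errno.EMFILE, errno.ENFILE}
)


def errno_from_message(message: str) -> Optional[int]:
    """Extract N from a trailing "(os error N)", or None if it is absent."""
    match = _OS_ERROR_SUFFIX.search(message)
    return int(match.group(1)) if match else None


def should_retry(message: str) -> bool:
    """True when the message carries an errno from the transient set."""
    code = errno_from_message(message)
    return code is not None and code in _TRANSIENT


if __name__ == "__main__":
    sample = f"No such file or directory (os error {errno.ENOENT})"
    print(errno_from_message(sample), should_retry(sample))
```

Messages without the suffix (for example a "ParseError" slot) return None from errno_from_message and are never retried.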

If you need typed dispatch (FileNotFoundError, PermissionError, …) call bca.analyze(path) per-file instead of analyze_batch — single-file analyze raises the canonical OSError subclass. See Error handling.

Flat-record iteration

bca.flatten_spaces(result) walks the nested FuncSpace tree in pre-order and yields one flat, scalar-only dict per node — ready for sqlite3.executemany, pandas.DataFrame.from_records, or any other tabular consumer.

Metric keys use the same dotted convention as the CLI's CSV writer (cyclomatic.modified.sum, halstead.volume, loc.lloc_average, …). Identity keys (path, name, kind, start_line, end_line, parent_name, depth) are added on every record.
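To illustrate what pre-order flattening into dotted keys means, here is a small pure-Python approximation. It is not the bindings' implementation: flatten_sketch is a hypothetical name, the child list is assumed to live under a "spaces" key, the path identity key is left out for brevity, and the sample input is invented.

```python
from collections.abc import Iterator, Mapping
from typing import Any, Optional


def _dotted(prefix: str, value: Any) -> Iterator[tuple[str, Any]]:
    """Yield (dotted_key, scalar) pairs for a nested mapping."""
    if isinstance(value, Mapping):
        for key, child in value.items():
            yield from _dotted(f"{prefix}.{key}" if prefix else key, child)
    else:
        yield prefix, value


def flatten_sketch(
    space: Mapping[str, Any],
    parent_name: Optional[str] = None,
    depth: int = 0,
) -> Iterator[dict[str, Any]]:
    """Pre-order walk: emit the node's own record, then its children's."""
    record: dict[str, Any] = {
        "name": space.get("name"),
        "kind": space.get("kind"),
        "start_line": space.get("start_line"),
        "end_line": space.get("end_line"),
        "parent_name": parent_name,
        "depth": depth,
    }
    record.update(_dotted("", space.get("metrics", {})))
    yield record
    # "spaces" as the child key is an assumption of this sketch.
    for child in space.get("spaces", []):
        yield from flatten_sketch(child, space.get("name"), depth + 1)


if __name__ == "__main__":
    sample = {
        "name": "lib.rs", "kind": "unit", "start_line": 1, "end_line": 9,
        "metrics": {"cyclomatic": {"sum": 3.0, "modified": {"sum": 3.0}}},
        "spaces": [
            {"name": "main", "kind": "function", "start_line": 2, "end_line": 8,
             "metrics": {"cyclomatic": {"sum": 2.0}}, "spaces": []},
        ],
    }
    for row in flatten_sketch(sample):
        print(row)
```

Because the walk is a generator, it shares the lazy, single-pass behaviour you should expect from the real iterator.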

SQLite via executemany

The example below analyses one file and inserts one row per FuncSpace into a sqlite table whose columns are the union of all flattened keys.

"""Flatten a FuncSpace tree into scalar rows for sqlite / pandas.

Demonstrates ``bca.flatten_spaces`` + ``sqlite3.executemany``. The
pandas equivalent is shown in the book as a non-executed snippet so
this example stays dependency-free (sqlite ships with the stdlib).

Tied to the book's ``python/flat-records.md`` page.
"""

from __future__ import annotations

import sqlite3
from contextlib import closing
from pathlib import Path

import big_code_analysis as bca

# SQLite identifier names are case-insensitive, so the Halstead
# pair `N1` / `n1` (and `N2` / `n2`) collide on one column. Rewrite
# the uppercase totals to a distinct name before insertion. The
# explicit map (not a `.replace(".N", "...")` substring rewrite)
# means a hypothetical future `halstead.NN_metric` would not be
# silently mangled.
_RENAME_FOR_SQLITE: dict[str, str] = {
    "halstead.N1": "halstead.total_1",
    "halstead.N2": "halstead.total_2",
}


def _safe_column(key: str) -> str:
    return _RENAME_FOR_SQLITE.get(key, key)


def run(path: Path, db_path: Path) -> int:
    """Analyse ``path`` and insert one row per FuncSpace into ``db_path``.

    Returns the number of rows inserted so the test can assert on it.
    """
    result = bca.analyze(path)
    if result is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)

    records = [{_safe_column(k): v for k, v in r.items()} for r in bca.flatten_spaces(result)]
    if not records:
        return 0

    columns = sorted({k for r in records for k in r})
    cols_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    rows = [tuple(r.get(c) for c in columns) for r in records]

    # `closing(sqlite3.connect(...))` is the documented idiom — the
    # bare ``with sqlite3.connect(...)`` context manager only commits
    # / rolls back the transaction; it does NOT close the connection,
    # so a long-running consumer leaks file descriptors (and on
    # Windows holds an exclusive write lock on the db file).
    with closing(sqlite3.connect(db_path)) as db, db:
        db.execute(f"CREATE TABLE IF NOT EXISTS metrics ({cols_sql})")
        db.executemany(
            f"INSERT INTO metrics ({cols_sql}) VALUES ({placeholders})",
            rows,
        )

    return len(rows)


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 3:
        sys.exit("usage: python flat_records.py <source-file> <out.db>")
    inserted = run(Path(sys.argv[1]), Path(sys.argv[2]))
    print(f"inserted {inserted} rows into {sys.argv[2]}")

The iterator is lazy and single-use: it walks the input once without materialising the whole list. A second iteration of the same iterator yields nothing — call list() once if you need to re-iterate.

Pandas

flatten_spaces is the natural input to pandas.DataFrame.from_records. Pandas is not a dependency of the bindings; install it separately if you want the DataFrame view.

import big_code_analysis as bca
import pandas as pd

result = bca.analyze("src/lib.rs")
if result is not None:
    df = pd.DataFrame.from_records(bca.flatten_spaces(result))
    print(df.head())
    # Group by space kind to inspect the average cyclomatic per
    # function vs. per class vs. per file.
    by_kind = df.groupby("kind")["cyclomatic.sum"].mean()

Identity columns vs CLI CSV

The flat-record schema is mostly aligned with the CLI's CSV writer, with a couple of intentional deltas:

  • Identity columns use name / kind here; the CSV writer uses space_name / space_kind. Flat records also add parent_name / depth; the CSV writer omits those.
  • tokens.* flattens to the JSON shape (tokens.tokens, tokens.tokens_average, …), while CSV renames those to tokens.sum / .average / .min / .max. Rename in the consumer if you need exact CSV alignment.

Anonymous spaces (Rust closures, JavaScript function expressions / arrows) keep their name == "<anonymous>" marker verbatim — flatten_spaces does not normalise.

Caveats

  • parent_name alone cannot disambiguate same-named siblings nested under different parents (e.g. two Inner classes under two different outer classes both surface as parent_name == "Inner" for their own children). Pair with depth and source-order position, or rebuild the qualified name in your consumer, if you need a fully-qualified path.
  • Do not mutate the input result while iterating: the walker keeps references into it, so mutations to not-yet-yielded subtrees will be observed in later records.
  • Missing metric subtrees produce no keys (absent, not None), matching the "Halstead disabled" edge case for metric selection.
  • flatten_spaces raises TypeError if the input is not a mapping; callers must filter None returns from bca.analyze (e.g. generated files with skip_generated=True) before passing.
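The first caveat can be worked around in the consumer. Since records arrive in pre-order and each carries depth, a stack of ancestor names is enough to rebuild a qualified path. The sketch below assumes only the name and depth keys described above; qualified_names, the "::" separator, and the "<unnamed>" fallback are choices made for this illustration.

```python
from collections.abc import Iterable, Mapping
from typing import Any


def qualified_names(
    records: Iterable[Mapping[str, Any]], sep: str = "::"
) -> list[str]:
    """Rebuild one qualified name per record from pre-order depth values."""
    stack: list[str] = []
    qualified: list[str] = []
    for record in records:
        depth = record["depth"]
        del stack[depth:]  # drop names that are not ancestors of this record
        stack.append(record.get("name") or "<unnamed>")
        qualified.append(sep.join(stack))
    return qualified


if __name__ == "__main__":
    rows = [
        {"name": "a.py", "depth": 0},
        {"name": "A", "depth": 1},
        {"name": "Inner", "depth": 2},
        {"name": "run", "depth": 3},
        {"name": "B", "depth": 1},
        {"name": "Inner", "depth": 2},
        {"name": "run", "depth": 3},
    ]
    for name in qualified_names(rows):
        print(name)
```

With this, the two run methods that share parent_name == "Inner" end up under distinct A::Inner and B::Inner prefixes.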

Metric selection

Pass metrics=[…] to compute only a subset of the metric suite. metrics=None (the default) preserves the "compute everything" behaviour. Unrequested metrics are absent from the result dict (not present with None placeholders).

from pathlib import Path
from typing import Any

import big_code_analysis as bca


def run(path: Path) -> dict[str, Any]:
    """Compute only LoC + cyclomatic for ``path`` and return the result.

    ``bca.METRIC_NAMES`` is a ``tuple[str, ...]`` of every canonical
    name accepted by ``metrics=``. The string ``"halstead"`` is one
    of them; ``in`` membership tests the selection client-side
    before any I/O is paid for.
    """
    if "halstead" not in bca.METRIC_NAMES:
        msg = "halstead is missing from METRIC_NAMES — bindings ABI drift"
        raise RuntimeError(msg)
    selected = bca.analyze(path, metrics=["loc", "cyclomatic"])
    if selected is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)

    metric_keys = sorted(selected["metrics"])
    print(f"computed only: {metric_keys}")
    return selected


def run_derived(path: Path) -> dict[str, Any]:
    """Selecting ``mi`` auto-pulls in its three dependencies."""
    selected = bca.analyze(path, metrics=["mi"])
    if selected is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)

    pulled = sorted(selected["metrics"])
    print(f"mi pulled in: {pulled}")
    return selected

The same kwarg is honoured by bca.analyze_source and bca.analyze_batch — the latter applies the selection uniformly to every file in the batch. Validation runs before any file I/O: an empty list or unknown name raises ValueError immediately and never returns an AnalysisError slot for what is really a caller bug.

Canonical names

The full set is available as a tuple:

import big_code_analysis as bca

assert "halstead" in bca.METRIC_NAMES

Names are case-sensitive lowercase; passing an unknown name raises ValueError with the canonical list in the message. The "exit" Metric-Display spelling is accepted as an alias for the canonical JSON-key spelling "nexits"; both produce a "nexits" key in the output. Duplicates are silently collapsed.

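| Metric | JSON key | Dependencies pulled in |
| --- | --- | --- |
| LoC | loc | |
| Cyclomatic | cyclomatic | |
| Cognitive | cognitive | |
| Halstead | halstead | |
| ABC | abc | |
| nargs | nargs | |
| nom | nom | |
| npa | npa | |
| npm | npm | |
| nexits (alias exit) | nexits | |
| tokens | tokens | |
| Maintainability Index | mi | loc, cyclomatic, halstead |
| Weighted Methods per Class | wmc | cyclomatic, nom |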
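The alias and dependency rules in the table can be modelled client-side, for example to predict which keys a selection will produce. The sketch below transcribes the table by hand, so bca.METRIC_NAMES remains the source of truth; expand_selection is a hypothetical helper and its error messages are not the library's.

```python
from collections.abc import Iterable

# Transcribed from the table above; not read from the bindings.
KNOWN = (
    "loc", "cyclomatic", "cognitive", "halstead", "abc", "nargs", "nom",
    "npa", "npm", "nexits", "tokens", "mi", "wmc",
)
ALIASES = {"exit": "nexits"}
DEPENDENCIES = {
    "mi": ("loc", "cyclomatic", "halstead"),
    "wmc": ("cyclomatic", "nom"),
}


def expand_selection(names: Iterable[str]) -> list[str]:
    """Resolve aliases, drop duplicates, and add derived-metric dependencies."""
    requested = list(names)
    if not requested:
        raise ValueError("metrics selection must not be empty")
    resolved: set[str] = set()
    for raw in requested:
        name = ALIASES.get(raw, raw)
        if name not in KNOWN:
            raise ValueError(
                f"unknown metric {raw!r}; expected one of: {', '.join(KNOWN)}"
            )
        resolved.add(name)
        resolved.update(DEPENDENCIES.get(name, ()))
    return sorted(resolved)


if __name__ == "__main__":
    print(expand_selection(["mi"]))
    print(expand_selection(["exit", "exit"]))
```

Names are matched case-sensitively here, mirroring the rule stated above, so "Halstead" is rejected.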

Performance trade-off

Computing the full suite is the default because it is what the CLI does. Selecting a single metric is strictly faster, because the compute pass for every unrequested metric is skipped, but the tree-sitter parse and the AST walk are the dominant cost on most inputs, so the saving on a single file is small. The benefit scales with batch size: when analyze_batch runs across a large repository, dropping the most expensive metric you do not need (often Halstead, on deep call trees) is a measurable win.

Unrequested metrics are absent from the result. Code that unconditionally indexes into result["metrics"]["mi"] will KeyError if you opted out of mi; guard with if "mi" in result["metrics"] or use .get("mi").
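When many call sites need this guard, a small lookup helper keeps them tidy. The following is a sketch under the assumption that metrics live in nested dicts under result["metrics"], as in the examples above; metric_value is a hypothetical name and the sample result is invented.

```python
from collections.abc import Mapping
from typing import Any


def metric_value(result: Mapping[str, Any], dotted: str, default: Any = None) -> Any:
    """Look up e.g. "cyclomatic.sum" under result["metrics"].

    Returns `default` when any segment is missing, which is what happens
    for metrics that were not part of the metrics= selection.
    """
    node: Any = result.get("metrics", {})
    for part in dotted.split("."):
        if not isinstance(node, Mapping) or part not in node:
            return default
        node = node[part]
    return node


if __name__ == "__main__":
    result = {"metrics": {"loc": {"sloc": 12.0}, "cyclomatic": {"sum": 3.0}}}
    print(metric_value(result, "cyclomatic.sum"))
    print(metric_value(result, "mi", default="not computed"))
```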

See also

  • Batch processing: metrics= applies uniformly to every file in a batch; validation runs once, before the input is iterated.
  • SARIF output: threshold names are independent of the metrics= selection; you can request metrics=["loc"] and still gate on cyclomatic thresholds, but the SARIF will have no findings for the dropped metrics.
  • Flat-record iteration: flatten_spaces silently emits no keys for metrics that were absent from the source dict, so a metrics= selection naturally narrows the flattened columns.

SARIF output

bca.to_sarif(result, *, thresholds=None) renders an analysis result (or an iterable of them) into a SARIF 2.1.0 JSON document, ready for upload to GitHub Code Scanning or any other SARIF consumer. The output is produced by the same Rust writer that backs bca check -O sarif, so the schema URL, tool driver name / version, and rule descriptions match the CLI byte-for-byte.

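from collections.abc import Iterable, Mapping
from pathlib import Path

import big_code_analysis as bca


def run(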
    paths: Iterable[Path],
    sarif_path: Path,
    thresholds: Mapping[str, float],
) -> str:
    """Analyse ``paths`` and write a SARIF document to ``sarif_path``.

    Returns the rendered SARIF JSON so the caller (or the test) can
    inspect it without re-reading the file.
    """
    batch = bca.analyze_batch([str(p) for p in paths])
    sarif = bca.to_sarif(batch, thresholds=dict(thresholds))

    sarif_path.parent.mkdir(parents=True, exist_ok=True)
    sarif_path.write_text(sarif, encoding="utf-8")
    print(f"wrote {sarif_path} ({len(sarif)} bytes)")
    return sarif

to_sarif accepts:

  • A single dict returned by bca.analyze or bca.analyze_source.
  • Any iterable yielding such dicts and / or bca.AnalysisError instances (the natural shape of bca.analyze_batch's return value). AnalysisError entries are skipped silently — they represent files that could not be analysed, not findings.

Thresholds

Accepted threshold names mirror the CLI's EXTRACTORS table in big-code-analysis-cli/src/thresholds.rs:

  • cognitive, cyclomatic, cyclomatic.modified
  • halstead.volume, halstead.difficulty, halstead.effort, halstead.time, halstead.bugs
  • loc.sloc, loc.ploc, loc.lloc, loc.cloc, loc.blank
  • nom, tokens, nexits, nargs
  • mi.original, mi.sei, mi.visual_studio
  • abc, wmc, npm, npa

An unknown name raises ValueError listing the accepted set, so a typo fails fast instead of silently producing an empty SARIF run.

thresholds=None (the default) and thresholds={} both produce a well-formed SARIF document with empty results and rules arrays. This matches the CLI's posture: there are no built-in default thresholds; every check run supplies its own limits.
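A quick way to check what a SARIF document contains, including the empty case, is to count results per rule. The sketch below relies only on the generic SARIF 2.1.0 layout (runs[].results[].ruleId); the rule ids in the sample are invented and are not claimed to match what to_sarif emits. findings_per_rule is a hypothetical helper.

```python
import json
from collections import Counter


def findings_per_rule(sarif_text: str) -> Counter:
    """Count SARIF results by ruleId across every run in the document."""
    document = json.loads(sarif_text)
    counts: Counter = Counter()
    for run in document.get("runs", []):
        for result in run.get("results", []):
            # ruleId is optional in SARIF, so fall back to a placeholder.
            counts[result.get("ruleId", "<no ruleId>")] += 1
    return counts


if __name__ == "__main__":
    # Invented stand-in for a rendered document.
    sample = json.dumps({
        "version": "2.1.0",
        "runs": [{
            "tool": {"driver": {"name": "example-tool", "rules": []}},
            "results": [
                {"ruleId": "cyclomatic", "message": {"text": "too complex"}},
                {"ruleId": "cyclomatic", "message": {"text": "too complex"}},
                {"ruleId": "loc.sloc", "message": {"text": "too long"}},
            ],
        }],
    })
    print(findings_per_rule(sample))
```

An empty Counter means no thresholds were exceeded or none were supplied, which is the expected outcome for thresholds=None and thresholds={}.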

Upload to GitHub Code Scanning

# .github/workflows/code-scanning.yml (excerpt)
- name: Compute metric SARIF
  run: |
    python - <<'PY'
    import big_code_analysis as bca
    with open("paths.txt", encoding="utf-8") as paths_fh:
        results = bca.analyze_batch(paths_fh.read().splitlines())
    with open("metrics.sarif", "w", encoding="utf-8") as fh:
        fh.write(bca.to_sarif(results, thresholds={"cyclomatic": 15}))
    PY
- name: Upload to Code Scanning
  uses: github/codeql-action/upload-sarif@v3
  with:
    sarif_file: metrics.sarif

The upload action is documented under github/codeql-action/upload-sarif. The bindings produce one SARIF run per call; the action handles the upload to the repository's Code Scanning alerts.

What "Unit" findings mean

to_sarif emits file-scope (unit-space) findings for every metric whose JSON headline at the unit space matches the CLI's per-space accessor (loc.*, halstead.*, mi.*, nom, nargs, nexits, tokens, abc, wmc, npm, npa). The three exceptions — cyclomatic, cyclomatic.modified, cognitive — are skipped at the unit level because the JSON exposes the aggregate sum across children while the CLI's per-space accessor returns just the unit's own scalar.

Unit findings carry logicalLocations: [{"fullyQualifiedName": "<file>"}]. Nameless non-unit spaces (rare parse-failure case) carry "<unnamed>" — both matching the CLI's function_token placeholders.

See also

  • Batch processing — the natural source of input iterables for to_sarif; AnalysisError entries are skipped silently.
  • Metric selection — threshold names are a closed set independent of metrics=; requesting a narrower metric suite while gating on a dropped threshold yields an empty SARIF run.
  • Error handling — the typed exceptions to_sarif raises for bad caller input (TypeError / ValueError).

Error handling

The bindings split errors into two domains:

  • Caller errors are raised — ValueError for bad arguments, TypeError for the wrong type, OSError and its subclasses for filesystem failures.
  • Per-file analysis errors in a batch are returned as bca.AnalysisError values inside the result list. They are not exceptions and never raise.

The single-file bca.analyze walks the first path; the batch bca.analyze_batch walks the second.

from pathlib import Path
from typing import Any

import big_code_analysis as bca


def run(
    fixtures: Path,
    *,
    missing_path: Path,
) -> dict[str, Any]:
    """Trigger each error path and return a small report.

    ``fixtures`` is a directory containing at least ``hello.rs``;
    ``missing_path`` must NOT exist on disk.
    """
    report: dict[str, Any] = {
        "file_not_found": False,
        "unsupported": False,
        "batch_errors": 0,
    }

    # 1. analyze() on a missing path raises a typed OSError subclass.
    try:
        bca.analyze(str(missing_path))
    except FileNotFoundError as err:
        report["file_not_found"] = True
        print(f"file_not_found: errno={err.errno} filename={err.filename}")

    # 2. analyze() on an unknown extension raises
    #    UnsupportedLanguageError (itself a ValueError subclass).
    #    The write is inside the try/finally so a future second
    #    mutation before the analyse call still gets cleaned up.
    unknown = fixtures / "hello.unknown_extension"
    try:
        unknown.write_text("noop", encoding="utf-8")
        bca.analyze(str(unknown))
    except bca.UnsupportedLanguageError as err:
        report["unsupported"] = True
        print(f"unsupported_language: {err}")
    finally:
        unknown.unlink(missing_ok=True)

    # 3. analyze_batch() returns AnalysisError, never raises per-file.
    paths = [str(fixtures / "hello.rs"), str(missing_path)]
    for slot in bca.analyze_batch(paths):
        if isinstance(slot, bca.AnalysisError):
            report["batch_errors"] += 1
            print(f"batch_error: ({slot.error_kind}) {slot.error}")

    return report

Single-file exceptions

bca.analyze and bca.analyze_source raise:

| Exception | Subclass of | Triggered by |
| --- | --- | --- |
| bca.UnsupportedLanguageError | ValueError | Unknown extension + no shebang / emacs-mode hit |
| bca.ParseError | ValueError | tree-sitter rejected the source |
| ValueError (raw) | | Non-UTF-8 path with allow_lossy_path=False (the default) |
| OSError and subclasses | | std::fs::read failed |

The OSError raised by analyze dispatches to the canonical subclass based on errno:

import big_code_analysis as bca

path = "src/example.rs"

try:
    bca.analyze(path)
except FileNotFoundError as err:
    print("missing:", err.errno, err.filename)
except PermissionError as err:
    print("denied:", err.errno, err.filename)
except IsADirectoryError as err:
    print("directory:", err.errno, err.filename)

Each branch corresponds to one errno value:

| Exception | Typical err.errno (Linux) | When it fires |
| --- | --- | --- |
| FileNotFoundError | 2 (ENOENT) | Path does not exist. |
| PermissionError | 13 (EACCES) | Read bit denied for the calling user. |
| IsADirectoryError | 21 (EISDIR) | Path resolves to a directory. |

Use except OSError if you want to catch the whole family and inspect err.errno / err.filename yourself.
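The errno-to-subclass mapping is a property of CPython's OSError constructor itself, which can be handy if you hold a bare errno (for instance one parsed out of a batch AnalysisError string) and want a typed exception back. The sketch below demonstrates only that standard-library behaviour; typed_os_error is a hypothetical helper, and whether the bindings build their exceptions this way is not stated here.

```python
import errno


def typed_os_error(code: int, message: str, filename: str) -> OSError:
    """Build an OSError; CPython picks the subclass that matches `code`."""
    return OSError(code, message, filename)


if __name__ == "__main__":
    err = typed_os_error(errno.ENOENT, "No such file or directory", "src/missing.rs")
    print(type(err).__name__, err.errno, err.filename)
    try:
        raise typed_os_error(errno.EACCES, "Permission denied", "src/secret.rs")
    except PermissionError as denied:
        print("denied:", denied.filename)
```

Errno values without a dedicated subclass simply produce a plain OSError, so except OSError remains the catch-all.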

UnsupportedLanguageError and ParseError are both ValueError subclasses, so a single except ValueError catches both. Prefer the typed catches when you want to differentiate.

Batch errors

bca.analyze_batch returns bca.AnalysisError values instead of raising, so a single bad file does not break the whole batch.

for slot in bca.analyze_batch(paths):
    if isinstance(slot, bca.AnalysisError):
        log.warning("%s (%s): %s", slot.path, slot.error_kind, slot.error)
    else:
        process(slot)

error_kind is a closed Literal:

  • "UnsupportedLanguage" — extension and shebang / emacs-mode resolution both came up empty.
  • "ParseError" — tree-sitter rejected the input, or (rare) a Rust-side JSON serialisation of the result failed. The serialisation case is prefixed with internal: serialization error: in the error string; check for the prefix when the distinction matters (serialisation failures are not recoverable by re-reading the file).
  • "IoError" — the most common kind: std::fs::read failed. The closed taxonomy also folds in non-UTF-8 path failures, so a path-encoding error surfaces as "IoError" rather than as a distinct fourth value.

For "IoError" instances the underlying OS errno is preserved in the error string via Rust's default formatting ("<msg> (os error <N>)" on Unix). Parse with regex if you need it for retry classification:

import re

match = re.search(r"\(os error (\d+)\)$", slot.error)
errno = int(match.group(1)) if match else None

If you need typed OSError subclasses, call bca.analyze per file instead of analyze_batch — single-file analyze raises FileNotFoundError / PermissionError / IsADirectoryError directly.

Programmer errors in batches

analyze_batch does still raise on caller bugs:

  • TypeError if paths is not iterable, or an element is not str / os.PathLike[str]. This aborts the whole call; any results computed before the bad element are discarded.
  • ValueError if metrics= is an explicitly empty sequence or contains an unknown name. Validation runs before the input iterable's __iter__ is called, so on this raise path a generator passed as paths is left untouched: none of its side effects fire and no items are consumed.
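
Because a TypeError discards everything computed before the bad element, it can be cheaper to validate the input up front. The sketch below is one way to do that under the str / os.PathLike[str] rule stated above; checked_paths is a hypothetical helper, not part of the bindings.

```python
import os
from collections.abc import Iterable
from pathlib import PurePosixPath


def checked_paths(paths: Iterable[object]) -> list[str]:
    """Materialise paths as strings, raising TypeError on the first bad element."""
    checked: list[str] = []
    for index, item in enumerate(paths):
        # os.fspath() returns bytes for bytes-based PathLike objects,
        # so the result is checked as well as the input type.
        text = os.fspath(item) if isinstance(item, (str, os.PathLike)) else None
        if not isinstance(text, str):
            raise TypeError(
                f"paths[{index}] is {type(item).__name__}, "
                "expected str or os.PathLike[str]"
            )
        checked.append(text)
    return checked


# Usage: validate once, then hand the clean list to bca.analyze_batch.
clean = checked_paths(["src/lib.rs", PurePosixPath("src/main.rs")])
```

Note that this consumes the iterable, so a generator is exhausted by the check; pass the returned list on rather than the original object.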

Logging recipe

A small logging helper for batch output keeps successes / failures aligned without bespoke formatting:

import logging
import big_code_analysis as bca

log = logging.getLogger(__name__)

def report(paths: list[str]) -> None:
    for path, slot in zip(paths, bca.analyze_batch(paths)):
        if isinstance(slot, bca.AnalysisError):
            log.warning(
                "skip %s (%s): %s", path, slot.error_kind, slot.error
            )
        else:
            log.info(
                "ok %s sloc=%s", path,
                slot["metrics"]["loc"]["sloc"],
            )

See also

  • Batch processing — the never-raise contract that routes per-file failures into AnalysisError slots.
  • Async patterns — asyncio.gather(..., return_exceptions=True) is the async-side equivalent of the batch contract: per-task exceptions land in the result list instead of cancelling the whole gather.
  • Quick start — the single-file analyze path that raises typed OSError subclasses.

Async patterns

bca.analyze is CPU-bound: the work is a tree-sitter parse plus the metric passes, both of which release the GIL on the Rust side via PyO3's Python::detach. The canonical async pattern is therefore asyncio.to_thread:

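import asyncio
from collections.abc import Iterable
from pathlib import Path
from typing import Any

import big_code_analysis as bca


async def analyze_async(path: Path) -> dict[str, Any] | None: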
    """Run ``bca.analyze(path)`` on the default thread executor."""
    return await asyncio.to_thread(bca.analyze, str(path))


async def analyze_all(
    paths: Iterable[Path],
) -> list[dict[str, Any] | BaseException | None]:
    """Fan ``analyze_async`` out across ``paths`` with ``asyncio.gather``.

    ``return_exceptions=True`` matters here: ``bca.analyze`` runs
    inside ``asyncio.to_thread`` and Python threads cannot be
    cancelled. If one call raises and gather re-raises with
    ``return_exceptions=False``, the surviving threads keep running
    in the default executor, producing results that are silently
    discarded. With ``return_exceptions=True`` every thread's
    result (success OR exception) lands in the returned list so
    the caller can dispatch per-file.
    """
    return await asyncio.gather(
        *(analyze_async(p) for p in paths),
        return_exceptions=True,
    )

Why to_thread, not native async

bca.analyze is a synchronous Python function backed by synchronous Rust code — there is no await boundary inside it. Wrapping it in asyncio.to_thread:

  1. Schedules the call on the default thread pool.
  2. Lets other coroutines progress while the parse + metric pass runs.
  3. Returns the result back to the calling coroutine when done.

Because the Rust side releases the GIL across the heavy work, several to_thread(bca.analyze, ...) calls genuinely run in parallel — this is not co-operative I/O multiplexing, it is real multi-core utilisation gated on the thread pool's size.

Custom executors

For a tighter cap on the worker count, skip to_thread (it always uses the default executor) and hand loop.run_in_executor a purpose-built executor:

import asyncio
from concurrent.futures import ThreadPoolExecutor

import big_code_analysis as bca

async def analyze_many(paths: list[str]) -> list[object]:
    loop = asyncio.get_running_loop()
    with ThreadPoolExecutor(max_workers=8) as pool:
        return await asyncio.gather(
            *(loop.run_in_executor(pool, bca.analyze, p) for p in paths),
            # Collect per-file exceptions instead of re-raising mid-flight.
            return_exceptions=True,
        )

Eight workers on an 8-core machine is the comfortable upper bound for purely CPU-bound work; raising it further oversubscribes the machine and trades throughput for context-switch overhead.
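
Rather than hard-coding the worker count, one option is to derive it from the machine. The sketch below caps the pool at a chosen limit without exceeding the core count; worker_cap and its default of 8 are illustrative choices, not library settings.

```python
import os


def worker_cap(limit: int = 8) -> int:
    """Return a worker count no larger than `limit` or the core count."""
    cores = os.cpu_count() or 1  # cpu_count() may return None
    return max(1, min(limit, cores))
```

The result can be passed as max_workers when constructing the ThreadPoolExecutor.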

Streaming results

asyncio.as_completed lets you start consuming results as soon as the first analysis finishes — useful when the per-file work varies wildly in cost (a 5 KB file vs a 500 KB generated bundle):

import asyncio
import big_code_analysis as bca

async def first_failure(paths: list[str]) -> str | None:
    """Return the path of the first file with cyclomatic > 50."""
    tasks = [asyncio.create_task(asyncio.to_thread(bca.analyze, p)) for p in paths]
    try:
        for coro in asyncio.as_completed(tasks):
            result = await coro
            if result is None:
                continue
            if result["metrics"]["cyclomatic"]["sum"] > 50:
                return result["name"]
    finally:
        for t in tasks:
            t.cancel()
    return None

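The finally-block cancellation matters: as_completed does not auto-cancel pending tasks when the caller returns early, so without it queued calls would still run on the thread pool well after the async function returns. Cancellation only drops calls that have not started yet; a call already running in a worker thread finishes on its own.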

Anti-pattern: calling bca.analyze directly in a coroutine

# Don't do this.
async def bad(path: str) -> dict | None:
    return bca.analyze(path)  # blocks the event loop on every call

async def does not make the body asynchronous. Without to_thread or an explicit executor, every coroutine that calls bca.analyze stalls the event loop for the full duration of the parse — other tasks waiting on I/O, timers, or queues all freeze until the parse returns. The to_thread wrapper is one line and makes the difference between a responsive server and a single-threaded one.
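
The stall is easy to observe without the library. In the sketch below, slow_parse is a hypothetical stand-in for a synchronous call such as bca.analyze (it simply sleeps), and a heartbeat task counts how often the event loop gets to run while the call is in flight.

```python
import asyncio
import time
from collections.abc import Awaitable, Callable


def slow_parse(seconds: float) -> None:
    """Hypothetical stand-in for a synchronous call such as bca.analyze."""
    time.sleep(seconds)


async def ticks_during(call_parse: Callable[[], Awaitable[None]]) -> int:
    """Count heartbeat ticks the event loop manages while call_parse runs."""
    ticks = 0

    async def heartbeat() -> None:
        nonlocal ticks
        while True:
            await asyncio.sleep(0.01)
            ticks += 1

    beat = asyncio.create_task(heartbeat())
    await asyncio.sleep(0)  # let the heartbeat start its first sleep
    await call_parse()
    observed = ticks
    beat.cancel()
    return observed


async def direct() -> None:
    slow_parse(0.2)  # blocks the loop: the heartbeat cannot tick


async def threaded() -> None:
    await asyncio.to_thread(slow_parse, 0.2)  # the loop stays free


blocked_ticks = asyncio.run(ticks_during(direct))
free_ticks = asyncio.run(ticks_during(threaded))
print(blocked_ticks, free_ticks)
```

The direct variant reports zero ticks because the coroutine never yields to the loop, while the threaded variant lets the heartbeat fire repeatedly during the same 0.2 seconds.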

When analyze_batch is the better fit

If you are processing a static, finite list of paths and do not need streaming results, bca.analyze_batch is simpler than gather(*to_thread(...)): it runs sequentially on the calling thread but never raises on per-file errors. Wrap the whole analyze_batch call in asyncio.to_thread to keep the event loop responsive:

import asyncio
import big_code_analysis as bca

async def batch(paths: list[str]) -> list[object]:
    return await asyncio.to_thread(bca.analyze_batch, paths)

This trades the per-file parallelism of gather for the simpler error model of analyze_batch. Pick gather when you want both parallelism and typed OSError dispatch; pick to_thread(analyze_batch, paths) when you want one async call and the never-raise contract.

Developers Guide

If you want to contribute to the development of big-code-analysis, this guide summarizes the guidelines that will help you build, test, and submit your changes.

As a prerequisite, you need to install the latest available version of Rust. You can learn how to do that here.

Clone Repository

First of all, you need to clone the repository. You can do that:

through HTTPS

git clone -j8 https://github.com/dekobon/big-code-analysis.git

or through SSH

git clone -j8 git@github.com:dekobon/big-code-analysis.git

Make is the canonical entry point

The repository ships a Makefile that wraps every common build, test, lint, format, and docs task. Run make help to see the full list of targets, and make check-tools to verify which optional tools (taplo, markdownlint-cli2, shellcheck, shfmt, checkmake, mdbook, cargo-insta, cargo-udeps) are present on your machine.

The two composite targets you will use most:

  • make pre-commit — the recommended local gate before committing. Runs cargo fmt --check, both clippy invocations (default-features and --all-features), cargo test --workspace --all-features (lib + bin + integration + doc), cargo +nightly udeps, and the markdown / TOML / shell / Makefile lint families in one parallel pass.
  • make ci — the same checks in the order CI runs them, with no auto-fixing. Use this to reproduce a failing CI run locally.

If GNU Make 4 or any of the optional tools are unavailable, fall back to the raw cargo commands shown below — they are equivalent to the corresponding Make targets.

Building

To build the big-code-analysis library, the CLI, and the web server in one shot:

make build           # cargo build --workspace --all-targets
make build-release   # cargo build --workspace --release

For an individual crate, invoke cargo directly:

cargo build                              # library only
cargo build -p big-code-analysis-cli     # CLI only
cargo build -p big-code-analysis-web     # web server only

make check runs cargo check --workspace --all-targets for fast type-checking during iteration.

Testing

To verify that all tests pass:

make test       # cargo test --workspace --all-features --lib --bins --tests
make test-doc   # cargo test --workspace --all-features --doc

If you prefer to run the cargo command yourself:

cargo test --workspace --all-features --verbose

Updating insta tests

We use insta for snapshot tests; install cargo-insta to manage snapshots. The Makefile wraps the two operations you need:

make insta-review   # cargo insta test --review (interactive)
make insta-accept   # cargo insta test --accept (use with care)

make insta-review runs the tests, generates the new snapshot references, and lets you review each diff. Reach for make insta-accept only for bulk metric-value-only refreshes (grammar bumps, Halstead operator reclassification) where you have already verified the diff pattern is uniform.

Code Formatting

Once all the previous steps pass and you are ready to open a pull request, the last step is code formatting. The make fmt target runs every formatter in the project (Rust, Markdown, TOML, Bash) in one shot; make fmt-check verifies formatting without modifying files.

make fmt         # cargo fmt + markdownlint-cli2 --fix + shfmt -w + taplo fmt
make fmt-check   # the equivalent --check variants

Rustfmt

This tool formats your code according to Rust style guidelines.

To install:

rustup component add rustfmt

To format the code (handled automatically by make fmt):

cargo fmt

Clippy

This tool helps developers write better code by automatically catching many common mistakes. It reports errors and warnings in your code that must be fixed before making a pull request.

make clippy runs both clippy invocations the project enforces (default-features and --all-features); make lint additionally runs the markdown, shell, TOML, and Makefile linters.

To install:

rustup component add clippy

To detect errors and warnings:

make clippy
# or, manually:
cargo clippy --workspace --all-targets -- -D warnings
cargo clippy --workspace --all-targets --all-features -- -D warnings

Unused dependencies

make udeps runs cargo +nightly udeps --workspace --all-targets to catch dependencies declared in Cargo.toml but never referenced. Requires the nightly toolchain (rustup toolchain install nightly) and cargo-udeps.

Code Documentation

make doc        # cargo doc --no-deps --workspace --all-features  (warning-tolerant)
make doc-open   # same, then open in a browser
make doc-check  # strict gate: appends -D warnings to RUSTDOCFLAGS, fails on any rustdoc warning

make doc and make doc-open are the interactive viewers — they build whatever they can so you can still inspect rendered output mid-refactor. make doc-check is the strict gate that runs as part of make pre-commit and CI (cargo doc --no-deps --workspace --all-features with RUSTDOCFLAGS extended by -D warnings); it catches broken intra-doc links, links into private items, and other rustdoc regressions.

Remove the --no-deps option from the underlying cargo invocation if you also want to build the documentation of each dependency used by big-code-analysis.

Building this book

The book you are reading lives under big-code-analysis-book/:

make book        # mdbook build
make book-serve  # mdbook serve with live reload

Run your code

You can run bca using:

cargo run -p big-code-analysis-cli -- [bca-parameters]

To list the bca parameters, run:

cargo run -p big-code-analysis-cli -- --help

You can run bca-web using:

cargo run -p big-code-analysis-web -- [bca-web-parameters]

To list the bca-web parameters, run:

cargo run -p big-code-analysis-web -- --help

make install, make install-cli, and make install-web invoke cargo install --path for the respective binary crates.

Practical advice

  • When you add a new feature, add at least one unit or integration test to verify that everything works correctly
  • Document public API
  • Do not add dead code
  • Comment intricate code so that others can understand what it does
  • Run make pre-commit before pushing — it is the same gate CI runs

Supporting a new language

This section is to help developers implement support for a new language in big-code-analysis.

To implement a new language, two steps are required:

  1. Generate the grammar
  2. Add the grammar to big-code-analysis

A number of metrics are supported; guidance on implementing them is covered elsewhere in the documentation.

Generating the grammar

As a prerequisite for adding a new grammar, a tree-sitter grammar for the desired language must exist in a version that matches the tree-sitter version used in this project.

The grammars are generated by a project in this repository called enums. The following steps add the language support from the language crate and generate an enum file that is then used as the grammar in this project to evaluate metrics.

  1. Add the language specific tree-sitter crate to the enums crate, making sure the dependency is pinned with =X.Y.Z to the same version used in the root big-code-analysis Cargo.toml. For example, for the Rust support the following line exists in the /enums/Cargo.toml: tree-sitter-rust = "=0.24.2".
  2. Append the language to the enums crate in /enums/src/languages.rs. Keeping with Rust as the example, the line would be (Rust, tree_sitter_rust). The first parameter is the name of the Rust enum that will be generated, the second is the tree-sitter function to call to get the language's grammar.
  3. Add a case to the end of the match in mk_get_language macro rule in /enums/src/macros.rs. The current convention uses the LANGUAGE constant exposed by modern grammar crates: for Rust that line is Lang::Rust => tree_sitter_rust::LANGUAGE.into().
  4. Lastly, we execute the /recreate-grammars.sh script that runs the enums crate to generate the grammar for the new language.

At this point we should have a new grammar file for the new language in /src/languages/. See /src/languages/language_rust.rs as an example of the generated enum.

Adding the new grammar to big-code-analysis

  1. Add the language specific tree-sitter crate to the big-code-analysis workspace, with the same =X.Y.Z pin as the enums crate uses. For example, for the Rust support the line in the root Cargo.toml is tree-sitter-rust = "=0.24.2".
  2. Next we add the new tree-sitter language namespace to /src/languages/mod.rs, e.g.
pub mod language_rust;
pub use language_rust::*;
  3. Lastly, we add a definition of the language to the arguments of the mk_langs! macro in /src/langs.rs.
// 1) Name for enum
// 2) Language description
// 3) Display name
// 4) Empty struct name to implement
// 5) Parser name
// 6) tree-sitter function to call to get a Language
// 7) file extensions
// 8) emacs modes
(
    Rust,
    "The `Rust` language",
    "rust",
    RustCode,
    RustParser,
    tree_sitter_rust,
    [rs],
    ["rust"]
)

Implementing traits and tests

Wiring the grammar is only the first step. The new <Lang>Code type must also implement the AST plumbing and every metric trait the workspace defines:

  • Checker in /src/checker.rs — comment, function, closure, call, string-literal, and else-if predicates over the grammar's kind_ids.
  • Getter in /src/getter.rs — get_space_kind plus the Halstead operator/operand classification table.
  • Alterator in /src/alterator.rs — usually only string-literal preservation; the default impl works for most languages.
  • All twelve metric traits: Abc, Cognitive, Cyclomatic, Exit, Halstead, Loc, Mi, NArgs, Nom, Npa, Npm, Wmc. Register each via the implement_metric_trait! macro invocation in /src/metrics/ to start with default (no-op) bodies, then replace with real impls for the metrics that have meaningful semantics for the language.

Audit aliased grammar variants

Tree-sitter grammars frequently emit several distinct kind_ids that map to the same node.kind() string (Identifier / Identifier2 / Identifier3 in Go, InvocationExpression / InvocationExpression2 in C#, QuotedContent … QuotedContent20 in Elixir). Every match node.kind_id() arm that touches an aliasable rule must either list every numbered variant or compare on the string node.kind() instead. Missing an alias silently drops nodes from the metric. See the add-lang skill for the mechanical audit procedure and lessons 2, 4, and 13 in docs/development/lessons_learned.md for the failure modes.
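
The idea behind the audit can be illustrated outside Rust: group the generated variant names by the kind string they report and flag every group with more than one member. The sketch below is in Python for brevity; find_aliases is a hypothetical helper, and the sample table is an assumed excerpt (the variant names come from the Go example above, the kind strings are guesses).

```python
from collections import defaultdict


def find_aliases(kinds: dict[str, str]) -> dict[str, list[str]]:
    """Group enum variant names by the node.kind() string they map to.

    Returns only the kind strings shared by more than one variant.
    """
    by_kind: dict[str, list[str]] = defaultdict(list)
    for variant, kind in kinds.items():
        by_kind[kind].append(variant)
    return {kind: sorted(names) for kind, names in by_kind.items() if len(names) > 1}


# Hypothetical excerpt of a variant -> kind-string table (values assumed).
GO_SAMPLE = {
    "Identifier": "identifier",
    "Identifier2": "identifier",
    "Identifier3": "identifier",
    "FunctionDeclaration": "function_declaration",
}

print(find_aliases(GO_SAMPLE))
```

Every group this reports is a set of variants that a match arm must list in full, unless the arm compares on the kind string instead.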

Tests

Add per-language tests under each src/metrics/*.rs test module — aim for parity with the Rust coverage (≥ 34 tests total across the metric files). Every insta::assert_json_snapshot! call MUST be anchored: either with an inline expected block, a positive assert_eq! on the headline integer accessor above it, or an explanatory // expected: comment. make snapshot-anchors (run as part of make pre-commit) enforces this against .snapshot-anchor-baseline.txt.

End-to-end workflow

For an opinionated, end-to-end recipe — including the alias audit, test layout, snapshot anchoring, and code-quality post-passes — see the project's add-lang Claude Code skill. It is the canonical workflow used by recent language additions (Elixir, PHP, C#, Bash, Go).

Lines of Code (LoC)

In this document we give some guidance on how to implement the LoC metrics available in this crate. Lines of code is a software metric that gives an indication of the size of some source code by counting the lines of the source code. There are many types of LoC so we will first explain those by way of an example.

Types of LoC

/*
Instruction: Implement factorial function
For extra credits, do not use mutable state or an imperative loop like `for` or `while`.
 */

/// Factorial: n! = n*(n-1)*(n-2)*(n-3)...3*2*1
fn factorial(num: u64) -> u64 {

    // use `product` on `Iterator`
    (1..=num).product()
}

The example above will be used to illustrate each of the LoC metrics described below.

SLOC

A straight count of all lines in the file including code, comments, and blank lines.
METRIC VALUE: 11

PLOC

A count of the instruction lines of code contained in the source code. This includes lines that hold only brackets or similar syntax. Comments and blank lines are not counted.
METRIC VALUE: 3

LLOC

The "logical" lines metric is a count of the number of statements in the code. Note that what counts as a statement depends on the language. In the above example there is only a single statement, which is the function call of product with the Iterator as its argument.
METRIC VALUE: 1

CLOC

A count of the comment lines in the code. The type of comment does not matter, i.e., single line, block, or doc.
METRIC VALUE: 6

BLANK

Last but not least, this metric counts the blank lines present in the code.
METRIC VALUE: 2
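
The line-based values above can be checked by hand with a few lines of Python. This is only a line-prefix heuristic written for this one example, not how the crate computes the metrics (the crate works on the tree-sitter syntax tree), and it leaves out LLOC because counting statements needs a parser.

```python
SOURCE = """\
/*
Instruction: Implement factorial function
For extra credits, do not use mutable state or an imperative loop like `for` or `while`.
 */

/// Factorial: n! = n*(n-1)*(n-2)*(n-3)...3*2*1
fn factorial(num: u64) -> u64 {

    // use `product` on `Iterator`
    (1..=num).product()
}
"""


def count_lines(source: str) -> dict[str, int]:
    """Classify each line as comment, blank, or code (heuristic sketch)."""
    counts = {"sloc": 0, "ploc": 0, "cloc": 0, "blank": 0}
    in_block = False  # True while inside a /* ... */ comment
    for line in source.splitlines():
        counts["sloc"] += 1
        text = line.strip()
        if in_block:
            counts["cloc"] += 1
            if "*/" in text:
                in_block = False
        elif not text:
            counts["blank"] += 1
        elif text.startswith("/*"):
            counts["cloc"] += 1
            in_block = "*/" not in text
        elif text.startswith("//"):
            counts["cloc"] += 1
        else:
            counts["ploc"] += 1
    return counts


print(count_lines(SOURCE))
# -> {'sloc': 11, 'ploc': 3, 'cloc': 6, 'blank': 2}
```

The three remaining categories add up to the total: 3 code lines, 6 comment lines, and 2 blank lines give the SLOC of 11.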

Implementation

To implement the LoC related metrics described above you need to implement the Loc trait for the language you want to support.

This requires implementing the compute function. See /src/metrics/loc.rs for where to implement, as well as examples from other languages.

Update grammars

Each programming language needs to be parsed in order to extract its syntax and semantics, and the rules that drive the parser are the so-called grammar of a language. In big-code-analysis, we use tree-sitter as the parsing library since it provides a distinct grammar for each of our supported programming languages. A grammar is not a static monolith: it changes over time and can contain bugs, so it needs to be updated every now and then.

For now, because the update process is automated with bash scripts, grammars can be updated natively only on Linux and macOS systems, but these scripts can also run on Windows using WSL.

In big-code-analysis we use both third-party and internal grammars. The former are published on crates.io and maintained by external developers, while the latter were designed and defined inside the project to handle variants of some languages used in Firefox. The following sections explain how to update both kinds.

Third-party grammars

Update the grammar version in Cargo.toml and enums/Cargo.toml. Below is an example for the tree-sitter-java grammar:

tree-sitter-java = "=x.xx.x"

where x represents a digit.

Run ./recreate-grammars.sh to recreate and refresh all grammar structures and data:

./recreate-grammars.sh

Once the script has finished, fix any failing tests and other problems introduced by the grammar changes.

Commit your changes and create a new pull request.

Internal grammars

Update the version of tree-sitter-cli in the package.json file of the internal grammar and then install the updated version.

The five vendored grammars publish under the bca-tree-sitter-* namespace (see RELEASING.md for the rename rationale), but consumer call sites still reference them as tree-sitter-<lang> via Cargo's package = ... alias. A grammar refresh does not bump the leaf's version on its own — every crate in this repository shares one workspace-wide version, and bumping the leaves out of step with the parent is not allowed (see the "Lockstep version policy" in RELEASING.md). Regenerate the parser tables, accept the resulting test-snapshot drift, and ship the change under the current version. The next workspace release picks up the new grammars at whatever shared version the next tag declares.

If a regeneration also needs an updated tree-sitter runtime dependency, bump the dev-dependency line inside the leaf's Cargo.toml:

[dev-dependencies]
tree-sitter = "=x.x.x"

Leave [package] name = "bca-tree-sitter-<lang>", [package] version, and [lib] name = "tree_sitter_<lang>" untouched — the rename trick in [lib] is what keeps Rust import paths stable, and the version line is managed by the lockstep bump at release time.

Run the appropriate script to update the grammar by recreating and refreshing every file and script.

For tree-sitter-ccomment and tree-sitter-preproc, run ./generate-grammars/generate-grammar.sh followed by the name of the grammar. Below is an example using the tree-sitter-ccomment grammar:

./generate-grammars/generate-grammar.sh tree-sitter-ccomment

For tree-sitter-mozcpp and tree-sitter-mozjs, use their dedicated scripts instead.

For tree-sitter-mozcpp, run

./generate-grammars/generate-mozcpp.sh

For tree-sitter-mozjs, run

./generate-grammars/generate-mozjs.sh

Once the script has finished, fix any failing tests and other problems introduced by the grammar changes.

Commit your changes and create a new pull request.