Metric selection

Pass metrics=[…] to compute only a subset of the metric suite. metrics=None (the default) preserves the "compute everything" behaviour. Unrequested metrics are absent from the result dict (not present with None placeholders).

from pathlib import Path
from typing import Any

import big_code_analysis as bca


def run(path: Path) -> dict[str, Any]:
    """Compute only LoC and cyclomatic for ``path`` and return the result.

    ``bca.METRIC_NAMES`` is a ``tuple[str, ...]`` of every canonical
    name accepted by ``metrics=``. The string ``"halstead"`` is one
    of them; an ``in`` membership test lets callers validate a
    selection client-side before any I/O is paid for.
    """
    if "halstead" not in bca.METRIC_NAMES:
        msg = "halstead is missing from METRIC_NAMES — bindings ABI drift"
        raise RuntimeError(msg)
    selected = bca.analyze(path, metrics=["loc", "cyclomatic"])
    if selected is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)

    metric_keys = sorted(selected["metrics"])
    print(f"computed only: {metric_keys}")
    return selected


def run_derived(path: Path) -> dict[str, Any]:
    """Selecting ``mi`` auto-pulls in its three dependencies."""
    selected = bca.analyze(path, metrics=["mi"])
    if selected is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)

    pulled = sorted(selected["metrics"])
    print(f"mi pulled in: {pulled}")
    return selected

The same kwarg is honoured by bca.analyze_source and bca.analyze_batch — the latter applies the selection uniformly to every file in the batch. Validation runs before any file I/O: an empty list or unknown name raises ValueError immediately and never returns an AnalysisError slot for what is really a caller bug.

Canonical names

The full set is available as a tuple:

import big_code_analysis as bca

assert "halstead" in bca.METRIC_NAMES

Names are lowercase and matched case-sensitively; passing an unknown name raises ValueError with the canonical list in the message. The "exit" Metric-Display spelling is accepted as an alias for the canonical JSON-key spelling "nexits"; both produce a "nexits" key in the output. Duplicates are silently collapsed.
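The naming rules above can be mirrored client-side, for example to reject bad input before handing a selection to the library. The following is a minimal sketch, not the library's own validation code: `normalize_metrics` and `ALIASES` are hypothetical names, and keeping the first occurrence of a duplicate is an assumption (the text only says duplicates are collapsed).

```python
from collections.abc import Iterable, Sequence

# Alias documented above: "exit" maps to the canonical key "nexits".
ALIASES = {"exit": "nexits"}


def normalize_metrics(requested: Iterable[str], canonical: Sequence[str]) -> list[str]:
    """Apply the documented naming rules client-side (illustrative only)."""
    names: list[str] = []
    for raw in requested:
        name = ALIASES.get(raw, raw)
        if name not in canonical:  # case-sensitive, so "LOC" is rejected
            raise ValueError(f"unknown metric {raw!r}; expected one of {list(canonical)}")
        if name not in names:  # collapse duplicates (assumption: first occurrence wins)
            names.append(name)
    if not names:  # an empty selection is a caller bug, as in the library
        raise ValueError("metrics must not be empty")
    return names
```

In real use, `canonical` would be `bca.METRIC_NAMES`, and the returned list can be passed as `metrics=`.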

| Metric | JSON key | Dependencies pulled in |
| LoC | loc | |
| Cyclomatic | cyclomatic | |
| Cognitive | cognitive | |
| Halstead | halstead | |
| ABC | abc | |
| nargs | nargs | |
| nom | nom | |
| npa | npa | |
| npm | npm | |
| nexits (alias exit) | nexits | |
| tokens | tokens | |
| Maintainability Index | mi | loc, cyclomatic, halstead |
| Weighted Methods per Class | wmc | cyclomatic, nom |
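The dependency column can be used to predict which keys a selection should produce. This is a small sketch: `DEPENDENCIES` and `expected_keys` are hypothetical names, the mapping is copied from the table, and it assumes that pulled-in dependencies appear as keys in the result alongside the requested metric, as the `run_derived` example suggests.

```python
# Dependencies copied from the table; every other metric has none.
DEPENDENCIES: dict[str, tuple[str, ...]] = {
    "mi": ("loc", "cyclomatic", "halstead"),
    "wmc": ("cyclomatic", "nom"),
}


def expected_keys(selection: list[str]) -> list[str]:
    """Return the sorted metric keys a selection is expected to yield."""
    keys = set(selection)
    for name in selection:
        # One level is enough: no dependency in the table has dependencies itself.
        keys.update(DEPENDENCIES.get(name, ()))
    return sorted(keys)
```

For example, `expected_keys(["mi"])` gives `["cyclomatic", "halstead", "loc", "mi"]`, which can be compared against `sorted(selected["metrics"])` in a test.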

Performance trade-off

Computing the full suite is the default because it is what the CLI does. Selecting a single metric is strictly faster, because the compute pass for every unselected metric is skipped, but the tree-sitter parse and the AST walk are the dominant cost on most inputs, so the saving on a single file is small. The benefit scales with batch size: when analyze_batch runs across a large repository, dropping the most expensive metric you do not need (often Halstead, on deep call trees) is a measurable win.
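Because the saving depends on the input, it is worth measuring on your own files before restructuring code around a narrower selection. The helper below is a generic timing sketch; `time_call` is a hypothetical name and nothing in it is specific to this library.

```python
import time
from collections.abc import Callable
from typing import Any


def time_call(fn: Callable[..., Any], *args: Any, repeats: int = 5, **kwargs: Any) -> float:
    """Return the best wall-clock time in seconds over ``repeats`` calls."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        fn(*args, **kwargs)
        # Keep the minimum, which is the run least disturbed by other system activity.
        best = min(best, time.perf_counter() - start)
    return best
```

Comparing `time_call(bca.analyze, path)` with `time_call(bca.analyze, path, metrics=["loc"])` on a representative file gives a rough idea of how much the selection saves there.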

Unrequested metrics are absent from the result. Code that unconditionally indexes into result["metrics"]["mi"] will raise KeyError if you opted out of mi; guard with if "mi" in result["metrics"] or use .get("mi").
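A small accessor keeps that guard in one place. In this sketch, `metric_or_none` is a hypothetical helper, and the sample dict is a stand-in: only the `"metrics"` mapping keyed by metric name comes from the text above, while the values are placeholders.

```python
from typing import Any


def metric_or_none(result: dict[str, Any], name: str) -> Any:
    """Return result["metrics"][name], or None when the metric was not selected."""
    return result["metrics"].get(name)


# Stand-in for a result produced with metrics=["loc"]; the value is a placeholder.
result = {"metrics": {"loc": {"placeholder": 1}}}

mi = metric_or_none(result, "mi")    # None: mi was not requested
loc = metric_or_none(result, "loc")  # the placeholder value
```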

See also

  • Batch processing: metrics= applies uniformly to every file in a batch; validation runs once, before the input is iterated.
  • SARIF output: threshold names are independent of the metrics= selection; you can request metrics=["loc"] and still gate on cyclomatic thresholds, but the SARIF will have no findings for the dropped metrics.
  • Flat-record iteration: flatten_spaces silently emits no keys for metrics that were absent from the source dict, so a metrics= selection naturally narrows the flattened columns.