Quick start

This page walks through the minimum amount of code needed to compute metrics from a single source file.

1. Install the package

pip install big-code-analysis

See Installation for the wheel matrix and build-from-source instructions.

2. Analyse a file

bca.analyze(path) returns a dict that matches the JSON emitted by bca metrics --format json for the same file: same field order, same numeric formatting, same shape.

"""Quick-start: analyse one file and print the headline cyclomatic count.

Mirrors the worked example shown on the book's
``python/quick-start.md`` page. The book embeds this file verbatim,
so the snippet is the test fixture — if the API drifts, the
``test_book_examples.py`` test fails and the docs are forced back
into sync.
"""

from __future__ import annotations

from pathlib import Path

import big_code_analysis as bca
from big_code_analysis import FuncSpaceDict


def run(path: Path) -> FuncSpaceDict:
    """Analyse ``path`` and return its metric dict."""
    result = bca.analyze(path)
    if result is None:
        msg = f"{path} was skipped (empty, binary, or generated)"
        raise SystemExit(msg)

    cyclomatic = result["metrics"]["cyclomatic"]
    print(f"{result['name']}: cyclomatic sum = {cyclomatic['sum']:.0f}")
    return result


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 2:
        sys.exit("usage: python quick_start.py <path>")
    run(Path(sys.argv[1]))

A few details worth noting:

analyze returns None for any file the CLI walker would skip: one that is three bytes or fewer (treated as empty), one whose leading window is not valid UTF-8 (treated as binary), or — with the default skip_generated=True — one matching the walker's is_generated predicate (a leading @generated, DO NOT EDIT, or GENERATED CODE marker). Always handle the optional return before reaching into result["metrics"].
The returned object is a plain dict at runtime — safe to serialise with json.dumps, ship to a downstream service, or feed into flatten_spaces for tabular consumers. Type checkers see it as the FuncSpaceDict TypedDict (generated from the Rust wire shapes), so nested metric access checks statically under mypy/pyright without casts.
Language detection mirrors the CLI exactly: path extension first, then a shebang / Emacs-mode fallback. If the source is already in memory, call bca.analyze_source(code, language) instead.
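
The optional return and the plain-dict guarantee can be handled together in one small wrapper. The sketch below is illustrative and not part of the package: summarise is a hypothetical helper, and it takes the analysis function as a parameter so the example runs without big_code_analysis installed. It assumes only the name and metrics → cyclomatic → sum fields already used in the quick-start script above; fake_analyze and its values are made up for the demonstration.

```python
import json
from pathlib import Path
from typing import Any, Callable, Dict, Optional


# Hypothetical helper, not part of big_code_analysis.
def summarise(
    path: Path,
    analyze: Callable[[Path], Optional[Dict[str, Any]]],
) -> Optional[str]:
    """Return a one-line JSON summary, or None if the file was skipped."""
    result = analyze(path)
    if result is None:
        # Skipped file (empty, binary, or generated): nothing to report.
        return None
    summary = {
        "name": result["name"],
        "cyclomatic_sum": result["metrics"]["cyclomatic"]["sum"],
    }
    # The result is a plain dict, so its values serialise directly.
    return json.dumps(summary, sort_keys=True)


# Stand-in for bca.analyze so the sketch runs on its own; values are invented.
def fake_analyze(path: Path) -> Optional[Dict[str, Any]]:
    if path.suffix == ".bin":
        return None  # mimic a skipped file
    return {"name": str(path), "metrics": {"cyclomatic": {"sum": 3.0}}}


print(summarise(Path("example.py"), fake_analyze))
print(summarise(Path("blob.bin"), fake_analyze))
```

With the package installed, passing bca.analyze in place of fake_analyze should apply the same handling to real results.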

3. Analyse an in-memory snippet

import big_code_analysis as bca

metrics = bca.analyze_source("fn main() {}\n", "rust")
print(metrics["metrics"]["loc"]["sloc"])

analyze_source accepts str, bytes, or bytearray. The returned dict has the same shape as analyze's output, with name set to None (no path is associated with an in-memory buffer).
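
Because name is None for in-memory results, code that labels output by name needs a fallback. A minimal sketch, assuming a hypothetical display_name helper and hand-written dicts standing in for real results:

```python
from typing import Any, Dict


# Hypothetical helper, not part of big_code_analysis.
def display_name(result: Dict[str, Any], fallback: str = "<memory>") -> str:
    """Label a result by its name, falling back for in-memory buffers."""
    name = result.get("name")
    return fallback if name is None else str(name)


# Hand-written stand-ins for analyze / analyze_source results.
file_result = {"name": "src/main.rs", "metrics": {}}
memory_result = {"name": None, "metrics": {}}

print(display_name(file_result))    # src/main.rs
print(display_name(memory_result))  # <memory>
```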

Where to go next

Batch processing — analyze_batch for many files without per-file try/except clutter.
Metric selection — compute only the metrics you need.
Error handling — the full exception taxonomy.
The CLI's Metrics command is the equivalent shell-level workflow.
