Quick start

This page walks through the minimum amount of code needed to compute metrics from a single source file.

1. Install the package

pip install big-code-analysis

See Installation for the wheel matrix and build-from-source instructions.

2. Analyse a file

bca.analyze(path) returns a dict that matches the JSON emitted by the CLI command bca metrics --output-format json for the same file: same field order, same numeric formatting, same shape.

"""Quick-start: analyse one file and print the headline cyclomatic count.

Mirrors the worked example shown on the book's
``python/quick-start.md`` page. The book embeds this file verbatim,
so the snippet is the test fixture — if the API drifts, the
``test_book_examples.py`` test fails and the docs are forced back
into sync.
"""

from __future__ import annotations

from pathlib import Path
from typing import Any

import big_code_analysis as bca


def run(path: Path) -> dict[str, Any]:
    """Analyse ``path`` and return its metric dict."""
    result = bca.analyze(path)
    if result is None:
        msg = f"{path} was skipped (looks generated)"
        raise SystemExit(msg)

    cyclomatic = result["metrics"]["cyclomatic"]
    print(f"{result['name']}: cyclomatic sum = {cyclomatic['sum']:.0f}")
    return result


if __name__ == "__main__":
    import sys

    if len(sys.argv) != 2:
        sys.exit("usage: python quick_start.py <path>")
    run(Path(sys.argv[1]))

A few details worth noting:

  • analyze returns None when the file matches the CLI walker's is_generated predicate (a leading @generated, DO NOT EDIT, or GENERATED CODE marker). Always handle the optional return before reaching into result["metrics"].
  • The returned object is a plain dict[str, Any]. It is safe to serialise with json.dumps, ship to a downstream service, or feed into flatten_spaces for tabular consumers.
  • Language detection mirrors the CLI exactly: path extension first, then a shebang / emacs-mode fallback. Call bca.analyze_source(code, language) instead if the source is already in memory.
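The first two notes can be combined into one small helper. The sketch below is illustrative only: to_json_line is a hypothetical name, not part of big_code_analysis, and the sample dict is a hand-written stand-in that borrows just the name and metrics.cyclomatic.sum keys used in the example above. It is not real analyser output.

```python
from __future__ import annotations

import json
from typing import Any, Optional


def to_json_line(result: Optional[dict[str, Any]]) -> Optional[str]:
    """Serialise one analyze-style result to a compact JSON line.

    Hypothetical helper (not part of big_code_analysis). Returns None
    when the file was skipped, mirroring analyze's optional return.
    """
    if result is None:
        # Skipped (generated) file: nothing to serialise.
        return None
    # The result is a plain dict, so json.dumps needs no custom encoder.
    return json.dumps(result, separators=(",", ":"))


# Hand-written stand-in for an analyze result; values are illustrative only.
sample = {"name": "example.rs", "metrics": {"cyclomatic": {"sum": 3.0}}}
print(to_json_line(sample))
print(to_json_line(None))
```

In real use the argument would be the value returned by bca.analyze(path). Checking for None in one place keeps the skip handling out of every call site.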

3. Analyse an in-memory snippet

import big_code_analysis as bca

metrics = bca.analyze_source("fn main() {}\n", "rust")
print(metrics["metrics"]["loc"]["sloc"])

analyze_source accepts str, bytes, or bytearray. The returned dict has the same shape as analyze's output, with name set to None (no path is associated with an in-memory buffer).
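Because all three input types are accepted, a caller can pass along whatever it already holds without decoding first. The sketch below is a hypothetical wrapper (sloc_of is not part of the library) that takes the analyser as a parameter so it can be exercised with a stub. It assumes analyze_source always returns a dict, since this page does not say whether it can return None.

```python
from __future__ import annotations

from typing import Any, Callable, Union

Source = Union[str, bytes, bytearray]


def sloc_of(
    source: Source,
    language: str,
    analyze_source: Callable[[Source, str], dict[str, Any]],
) -> Any:
    """Return the sloc value for one in-memory snippet.

    Hypothetical helper: the analyser is injected, so pass
    bca.analyze_source in real use.
    """
    result = analyze_source(source, language)
    # name is None for in-memory buffers, so only the metrics are read.
    return result["metrics"]["loc"]["sloc"]


# Intended use (requires the package to be installed):
#   import big_code_analysis as bca
#   sloc_of(b"fn main() {}\n", "rust", bca.analyze_source)
```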

Where to go next