Python

Load catalog bundles, inspect Polars tables, query with DuckDB, and create LanceDB search indexes.

The chartcoach package loads catalog instances into a Polars-first Catalog.

Run one check without adding the package to a project:

uv run --with chartcoach python -c "from chartcoach import Catalog; print(Catalog.open().guidelines().select('id', 'title').head(3))"

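Add the package from an application root, choosing the extras the application needs:

uv add chartcoach                  # base package
uv add 'chartcoach[index]'         # adds the index extra used by LanceDB search
uv add 'chartcoach[mcp,index]'     # adds the mcp and index extras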

from chartcoach import Catalog

catalog = Catalog.open()
catalog.guidelines().select("id", "title").head(5)

Catalog.open() reads the package-pinned Default Catalog when no path is provided.

Catalog.open() and the first CLI catalog read download package-pinned artifacts. Indexed search additionally downloads an index archive. Set CHARTCOACH_CACHE_DIR to keep these writes in a directory you control.
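
As a minimal sketch of isolating cache writes, the snippet below points CHARTCOACH_CACHE_DIR at a fresh temporary directory. It assumes the variable is read when artifacts are first downloaded, which this page does not state, so the variable is set before any catalog call. The `use_isolated_cache` helper name is hypothetical.

```python
import os
import tempfile
from pathlib import Path


def use_isolated_cache(prefix: str = "chartcoach-cache-") -> Path:
    """Create a fresh directory and export it as CHARTCOACH_CACHE_DIR."""
    cache_dir = Path(tempfile.mkdtemp(prefix=prefix))
    # Assumption: chartcoach reads this variable when it first downloads
    # artifacts, so set it before Catalog.open() or default_index_path() run.
    os.environ["CHARTCOACH_CACHE_DIR"] = str(cache_dir)
    return cache_dir


cache_dir = use_isolated_cache()
print(cache_dir)
```

This pattern may suit test suites and CI jobs, where each run can start from an empty cache and leave the user-level cache untouched.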

Loading

from chartcoach import Catalog

default_catalog = Catalog.open()
bundle_catalog = Catalog.open("dist/catalog")
parquet_catalog = Catalog.open("dist/catalog/entries.parquet")
folder_catalog = Catalog.from_folder("catalog-source")

Catalog.open(path) accepts authored folders, bundle directories, standalone parquet files, and release metadata URLs. Bundle reads load MANIFEST.md and entries.parquet.

Tables

Catalog exposes derived Polars tables through guidelines(), sections(), labels(), guideline_labels(), references(), guideline_references(), and guideline_sources().

Use catalog.table(name) when a caller receives a table name from the CLI, MCP, or docs. The Cataloging scheme lists table names and columns.
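
When the table name comes from user input, it may help to check it before calling catalog.table(name). The sketch below is one way to do that. It assumes the accessor names listed above are also the valid table names, and the `load_table` helper and `KNOWN_TABLES` constant are hypothetical.

```python
# Accessor names from this page; assumed here to double as valid table names.
KNOWN_TABLES = (
    "guidelines",
    "sections",
    "labels",
    "guideline_labels",
    "references",
    "guideline_references",
    "guideline_sources",
)


def load_table(catalog, name: str):
    """Return catalog.table(name) after checking the name against KNOWN_TABLES."""
    normalized = name.strip().lower()
    if normalized not in KNOWN_TABLES:
        expected = ", ".join(KNOWN_TABLES)
        raise ValueError(f"unknown table {name!r}; expected one of: {expected}")
    return catalog.table(normalized)
```

Failing early with the list of expected names gives CLI or MCP callers a more useful message than whatever error the lookup itself would raise.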

Entries

entry = catalog.entry("compare-percentages-with-bars-not-pies")
print(entry["id"])
print(entry["title"])
print(entry["sections"][0]["role"])
print(entry["references"])

entry(id) returns one flat dictionary with id, title, description, labels, sections, and references. Unknown ids raise KeyError.
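
Because unknown ids raise KeyError, callers that take ids from search hits or user input may want a lookup that returns None instead. The helpers below are a small sketch with hypothetical names (`find_entry`, `section_roles`). They rely only on the entry keys described above and on each section carrying a "role" key, as in the example.

```python
def find_entry(catalog, entry_id: str):
    """Return the entry dictionary, or None when the id is unknown."""
    try:
        return catalog.entry(entry_id)
    except KeyError:
        return None


def section_roles(entry) -> list[str]:
    """List the role of every section in an entry, in order."""
    return [section["role"] for section in entry["sections"]]
```

A typical call would be `entry = find_entry(catalog, some_id)` followed by an `if entry is None:` branch that reports the missing id.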

Use citation records when an agent needs source provenance:

from chartcoach.catalog.references import citation_records

citations = citation_records(
    catalog,
    ids=["compare-percentages-with-bars-not-pies"],
)
print(citations[0]["id"])
print(citations[0]["url"])
print(citations[0]["sources"][0]["reference_id"])

Citation records contain id, title, url, guideline_citation, and sources. Each source includes reference_id, source_title, doi, url, and a formatted citation.

Use Catalog.from_entries() with Guideline and Section objects for in-memory catalogs. Guideline requires either body or sections and derives one from the other when needed.

DuckDB

Use DuckDB for joins across guideline tables.

conn = catalog.duckdb()
try:
    rows = conn.sql("""
        select g.id, g.title, s.role, s.content
        from guidelines g
        join sections s on s.guideline_id = g.id
        where s.role = 'advice'
        limit 5
    """).pl()
finally:
    conn.close()

The result has id, title, role, and content columns. DuckDB is installed with the base Python package.
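
For repeated queries, the try/finally pattern can be wrapped once. The sketch below uses `contextlib.closing` from the standard library and assumes only what the example above shows: catalog.duckdb() returns a connection with sql() and close(), and the query result has a pl() method. The `run_query` name is hypothetical.

```python
from contextlib import closing


def run_query(catalog, sql: str):
    """Run one SQL statement and return the result as a Polars DataFrame."""
    # closing() calls conn.close() on exit, including when the query raises.
    with closing(catalog.duckdb()) as conn:
        # .pl() materializes the rows before the connection is closed.
        return conn.sql(sql).pl()
```

Each call opens and closes its own connection, which keeps the helper simple at the cost of some setup work per query.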

Write a durable database file for another tool:

catalog.write_duckdb("./chartcoach-catalog.duckdb", overwrite=True)

LanceDB search

Add the index extra before using search in application code.

from chartcoach.catalog import default_index_path
from chartcoach.search import open as open_index, search

table = open_index(default_index_path(table_name="catalog_documents"))
hits = search(catalog, table, "overplotted scatter plots").to_dict()["rows"]
print(hits[0]["id"])
print(hits[0]["matched_role"])
print(hits[0]["matched_text"])

default_index_path() downloads and extracts the package-pinned index on first use. Later calls reuse the local copy.

Each search hit contains rank, id, title, description, labels, matched_document_id, matched_role, score, and matched_text. Treat hits as candidates, then call catalog.entry(hit["id"]) and citation_records(...) before producing a sourced answer.
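
The candidate-then-verify flow described above could look like the sketch below. It takes `citation_records` as a parameter so the block stays self-contained. In application code that would be the function imported from chartcoach.catalog.references. Two assumptions are made: several hits may share one guideline id, and each citation record's "id" is the guideline id. The `sourced_candidates` name is hypothetical.

```python
def sourced_candidates(catalog, hits, citation_records, limit: int = 3):
    """Pair the top unique hits with their full entry and citation record."""
    ids = []
    for hit in hits:
        if len(ids) >= limit:
            break
        # Assumption: more than one matched document can point at one guideline.
        if hit["id"] not in ids:
            ids.append(hit["id"])

    # Assumption: each citation record's "id" is the guideline id.
    citations = {c["id"]: c for c in citation_records(catalog, ids=ids)}
    return [
        {"entry": catalog.entry(guideline_id), "citation": citations.get(guideline_id)}
        for guideline_id in ids
    ]
```

The result keeps search order, so the first item corresponds to the best-ranked hit, and a missing citation shows up as None instead of raising.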

Use index(catalog, "./chartcoach-index", embedding=...) when the application needs to build a caller-owned table. Without an embedding function, index() creates a full-text table over the text column.

Write formats

catalog.write_folder("dist/authored")
catalog.write_parquet("dist/entries.parquet")
catalog.write_bundle("dist/catalog", overwrite=True)

write_bundle() writes MANIFEST.md, entries.parquet, and metadata.json. It requires a catalog manifest.
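
A quick check after write_bundle() can confirm that the three files named above exist. The sketch below uses only the standard library. The `missing_bundle_files` helper is hypothetical, and the demo runs against a temporary directory so it needs no catalog.

```python
import tempfile
from pathlib import Path

# File names taken from the write_bundle() description on this page.
BUNDLE_FILES = ("MANIFEST.md", "entries.parquet", "metadata.json")


def missing_bundle_files(bundle_dir) -> list[str]:
    """Return the expected bundle files that are absent from bundle_dir."""
    root = Path(bundle_dir)
    return [name for name in BUNDLE_FILES if not (root / name).is_file()]


# Demo: a directory holding only MANIFEST.md is reported as incomplete.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "MANIFEST.md").write_text("# manifest\n")
    demo_missing = missing_bundle_files(tmp)
print(demo_missing)  # ['entries.parquet', 'metadata.json']
```

In a release script, calling `missing_bundle_files("dist/catalog")` and failing on a non-empty result would catch an incomplete bundle before upload.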