chartcoach

Cataloging scheme

Catalog structure, row shape, manifest vocabulary, release metadata, cache behavior, and queryable tables for chartcoach catalogs.

The Guideline Catalog is a versioned collection of citable visualization guidelines. Each guideline has a stable id, guideline text, role-annotated sections, labels, and source references.

Core concepts

Scheme element | Meaning
--- | ---
Guideline id | Stable id used by read, cite, URLs, and table joins
Section role | Catalog-defined role such as advice attached to one guideline section
Label family | Catalog-defined prefix such as chart in labels like chart:bar:use
Manifest | MANIFEST.md, which defines section roles and label families for one catalog
Bundle | Publishable directory with metadata.json, MANIFEST.md, and entries.parquet
Default Catalog | Package-pinned release used when no source is passed

Inspect the manifest before hard-coding a role or label family. Those names are catalog vocabulary, not global chartcoach constants.

catalog-release/
├── metadata.json
├── MANIFEST.md
└── entries.parquet

uvx chartcoach@latest catalog read compare-percentages-with-bars-not-pies --source-detail minimal --format markdown

Use exact ids from catalog list, catalog query, catalog sql, or catalog find. Do not derive ids from titles.

Default Catalog and pinning

The Default Catalog is the package-pinned release used when a caller omits --source or Catalog.open() receives no path. JavaScript callers use DEFAULT_CATALOG for the same package-pinned artifact URLs and pass the loaded artifact bytes to loadCatalog(). The package constants select a release URL and digest, so repeated reads resolve the same bundle until the installed package changes or the caller passes a custom source.

The 0.1.4 release contains 781 guideline records and 262 source references. It follows the cataloging scheme described in Structured Visualization Design Knowledge for Grounding Generative Reasoning and Situated Feedback.

Examples use chartcoach@latest for one-off access to the newest published package. Pin chartcoach@0.1.4 when output must match this catalog release.

Use a custom source for another catalog instance:

uvx chartcoach@latest catalog validate --source ./dist/catalog
CHARTCOACH_SOURCE=./dist/catalog uvx chartcoach@latest catalog overview

Query and search engines

chartcoach uses external engines at two boundaries:

Engine | Responsibility in chartcoach | Reference
--- | --- | ---
DuckDB | Executes read-only SQL over catalog tables and writes catalog export duckdb database files for tools that need a durable SQL artifact | DuckDB documentation
LanceDB | Stores optional indexed catalog document rows for catalog find, MCP search, full-text search, vector search, and hybrid search | LanceDB documentation

Use chartcoach docs for catalog shape, source locators, command behavior, and cache layout. Use the engine docs for SQL syntax, database-file behavior, LanceDB table configuration, embedding functions, and search modes.

Serialized row

entries.parquet stores one row per guideline. The top-level row id must match guideline.id.

type CatalogRow = {
  id: string
  guideline: {
    id: string
    title: string
    bibliography?: string | null
    description: string
    labels: string[]
    body: string
    sections: Array<{
      role: string
      title: string
      content: string
    }>
  }
  references: string[]
}

Loaders reject duplicate ids, mismatched row ids, invalid labels, empty section roles, and manifest coverage gaps.
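
The checks listed above can be sketched in plain Python over dicts shaped like CatalogRow. This is an illustrative sketch, not the chartcoach loader: validate_rows, LABEL_RE, and the label pattern itself are assumptions, and the roles and families arguments stand in for names read from MANIFEST.md.

```python
import re

# Assumed label pattern: lowercase colon-separated parts such as "chart:bar:use".
# The real chartcoach label grammar may differ.
LABEL_RE = re.compile(r"^[a-z0-9_-]+(:[a-z0-9_-]+)*$")


def validate_rows(rows, roles, families):
    """Return a list of problems found in CatalogRow-shaped dicts."""
    problems = []
    seen = set()
    for row in rows:
        row_id = row["id"]
        guideline = row["guideline"]
        # Duplicate ids and mismatched row ids.
        if row_id in seen:
            problems.append(f"{row_id}: duplicate id")
        seen.add(row_id)
        if guideline["id"] != row_id:
            problems.append(f"{row_id}: row id does not match guideline.id")
        # Invalid labels and label families missing from the manifest.
        for label in guideline["labels"]:
            if not LABEL_RE.match(label):
                problems.append(f"{row_id}: invalid label {label!r}")
            elif label.split(":", 1)[0] not in families:
                problems.append(f"{row_id}: label family not in manifest: {label!r}")
        # Empty section roles and roles missing from the manifest.
        for section in guideline["sections"]:
            role = section["role"]
            if not role.strip():
                problems.append(f"{row_id}: empty section role")
            elif role not in roles:
                problems.append(f"{row_id}: section role not in manifest: {role!r}")
    return problems
```

Collecting every problem instead of raising on the first one makes it easier to fix an authored catalog in a single pass.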

Authored Markdown

An authored catalog folder stores each entry at entries/<id>/guideline.md. The guideline Markdown carries front matter, sections, and section role comments. Build and read commands serialize references into the row-level references array.

Manifest

Every catalog instance needs a MANIFEST.md with these headings:

  • Section Roles
  • Label Families

Each role or label family is a level-three heading with prose underneath. The manifest defines what the current catalog means by a role or family. Validation checks that every used section role and label family is defined there.

## Section roles

### advice

Guidance that states what to do, avoid, prefer, or choose.

## Label families

### chart

Labels for chart families and visual forms, such as `chart:bar`.

Label examples inside a family definition must belong to that family.
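
A manifest laid out like the example above can be read with a few lines of Python. This is a minimal sketch under stated assumptions: parse_manifest and undefined_names are hypothetical helpers, level-two headings are lowercased so that "Section Roles" and "Section roles" compare equal, and the real validator may apply stricter rules.

```python
def parse_manifest(text):
    """Map each level-two heading to the level-three names defined under it."""
    vocabulary = {}
    current = None
    for line in text.splitlines():
        if line.startswith("### "):
            # A role or label family definition under the current heading.
            if current is not None:
                vocabulary[current].append(line[4:].strip())
        elif line.startswith("## "):
            # Assumption: heading case is not significant.
            current = line[3:].strip().lower()
            vocabulary[current] = []
    return vocabulary


def undefined_names(used, defined):
    """Return names that appear in entries but are missing from the manifest."""
    return sorted(set(used) - set(defined))
```

Calling undefined_names with the roles used by entries and the list under "section roles" gives the coverage gaps for section roles; the same call with label families covers the other heading.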

Release metadata and locators

metadata.json names a catalog release and its artifacts. Base readers require manifest and entries. Indexed commands can use lancedb-index.

{
  "version": "0.1.4",
  "digest": "7cfd43ee820be252b8ae9058c4c36109a9c8415c6b3a5ff8a9127117b4a10c19",
  "artifacts": [
    { "kind": "manifest", "path": "MANIFEST.md", "digest": "...", "bytes": 1234 },
    { "kind": "entries", "path": "entries.parquet", "digest": "...", "bytes": 5678, "rows": 781 }
  ]
}

Artifact descriptors include kind, path, digest, bytes, and kind-specific fields such as rows, format, table, and documents. path must be relative and cannot contain "..". Catalog clients resolve paths relative to the metadata URL.
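
The path rule can be illustrated with the standard library. The sketch below is not the chartcoach client: resolve_artifact_url is a hypothetical helper, and the exact rejection rules (absolute paths, embedded schemes, parent-directory segments) are assumptions drawn from the description above.

```python
from urllib.parse import urljoin


def resolve_artifact_url(metadata_url, artifact_path):
    """Resolve a relative artifact path against the metadata URL."""
    segments = artifact_path.split("/")
    if (
        not artifact_path
        or artifact_path.startswith("/")   # absolute path
        or "://" in artifact_path          # full URL instead of a relative path
        or ".." in segments                # parent-directory traversal
    ):
        raise ValueError(f"unsafe artifact path: {artifact_path!r}")
    # urljoin replaces the last URL segment (metadata.json) with the artifact path.
    return urljoin(metadata_url, artifact_path)
```

Rejecting unsafe paths before joining keeps every resolved artifact under the same release directory as its metadata.json.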

Published artifacts live below:

https://artifacts.chartcoach.dev/catalog/releases/<version>/<digest>/

Release selection happens through package constants or explicit metadata URLs. There is no catalog/current object prefix in the cataloging scheme.
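
As a rough illustration of the release layout, the following sketch builds a release root from a version and digest and checks that parsed metadata lists the artifact kinds base readers require. release_root, missing_kinds, and REQUIRED_KINDS are hypothetical names, not part of the chartcoach package.

```python
# Artifact kinds that base readers require, per the metadata description above.
REQUIRED_KINDS = {"manifest", "entries"}


def release_root(base, version, digest):
    """Build <base>/<version>/<digest>/ for a published release."""
    return f"{base.rstrip('/')}/{version}/{digest}/"


def missing_kinds(metadata):
    """Return required artifact kinds absent from parsed metadata.json content."""
    kinds = {artifact["kind"] for artifact in metadata.get("artifacts", [])}
    return sorted(REQUIRED_KINDS - kinds)
```

An empty result from missing_kinds means a base read has what it needs; an optional kind such as lancedb-index is not checked here.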

CLI and Python readers accept these locator forms:

Locator | Behavior
--- | ---
omitted | Use the package-pinned Default Catalog
metadata URL | Read release metadata and resolve artifacts
release root URL | Append metadata.json
bundle directory | Read local MANIFEST.md and entries.parquet
authored folder | Read local MANIFEST.md and entries/*/guideline.md
parquet file | Read serialized rows
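
One way to picture the locator forms is a small classifier. This is a hedged sketch only: classify_locator and metadata_url are hypothetical helpers, the returned names are made up for illustration, and the detection order is an assumption rather than the behavior of the real CLI.

```python
from pathlib import Path
from urllib.parse import urlparse


def classify_locator(source):
    """Name the locator form for a source string, or "default" when omitted."""
    if source is None:
        return "default"
    if urlparse(source).scheme in ("http", "https"):
        if source.endswith("/metadata.json"):
            return "metadata-url"
        return "release-root-url"
    path = Path(source)
    if path.suffix == ".parquet":
        return "parquet-file"
    if (path / "entries.parquet").is_file():
        return "bundle-directory"
    if (path / "entries").is_dir():
        return "authored-folder"
    raise ValueError(f"unrecognized catalog source: {source!r}")


def metadata_url(source):
    """Append metadata.json to a release root URL; leave metadata URLs as-is."""
    if source.endswith("/metadata.json"):
        return source
    return source.rstrip("/") + "/metadata.json"
```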

The JavaScript package accepts artifact bytes. Use DEFAULT_CATALOG, parseCatalogReleaseMetadata(), catalogArtifact(), and catalogArtifactUrl() to locate artifacts, then pass entries and optional manifest data to loadCatalog().

Cache

The Python CLI and package store downloaded release artifacts under artifacts/ inside the user's platform cache directory.

uvx chartcoach@latest catalog cache path

Base reads download MANIFEST.md and entries.parquet. Indexed commands download LanceDB archives when chartcoach[index] is installed and a default index is resolved. A caller-owned --index path stays outside the chartcoach artifact cache.

Set CHARTCOACH_CACHE_DIR for tests or isolated runs that need a separate platform cache root.

Queryable tables

Inspect the live schema before writing SQL:

uvx chartcoach@latest catalog schema --tables --row-counts
uvx chartcoach@latest catalog schema --format jsonl

Table | Contains
--- | ---
guidelines | Guideline rows with ids, titles, labels, body, and nested sections
sections | One row per guideline section
labels | Unique parsed label values
guideline_labels | Guideline-to-label edges
references | Parsed BibTeX reference entries
guideline_references | Guideline-to-reference edges
guideline_sources | Guideline rows joined to source metadata

catalog sql and MCP sql accept one read-only SELECT statement. Use catalog schema TABLE for columns before writing a join. Use catalog export duckdb when another tool needs a database file.
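
The shape of a join through an edge table can be sketched with an in-memory database. This example uses Python's sqlite3 as a stand-in for DuckDB, and the column names (id, title, guideline_id, label) and sample rows are assumptions made for illustration; check the live schema for the real columns.

```python
import sqlite3

# In-memory stand-in with hypothetical columns and made-up rows.
con = sqlite3.connect(":memory:")
con.executescript(
    """
    CREATE TABLE guidelines (id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE guideline_labels (guideline_id TEXT, label TEXT);
    INSERT INTO guidelines VALUES
        ('example-bar', 'Hypothetical bar guideline'),
        ('example-pie', 'Hypothetical pie guideline');
    INSERT INTO guideline_labels VALUES
        ('example-bar', 'chart:bar:use'),
        ('example-pie', 'chart:pie:avoid');
    """
)

# One read-only SELECT that joins guidelines to their labels.
QUERY = """
SELECT DISTINCT g.id, g.title
FROM guidelines AS g
JOIN guideline_labels AS gl ON gl.guideline_id = g.id
WHERE gl.label LIKE 'chart:bar%'
ORDER BY g.id
"""

rows = con.execute(QUERY).fetchall()
```

A query of the same shape could be passed to catalog sql once the column names are confirmed against the schema output.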

Inspect vocabulary before writing filters:

uvx chartcoach@latest catalog values labels --contains chart: --format jsonl
uvx chartcoach@latest catalog values roles --format jsonl
uvx chartcoach@latest catalog values label.family --format jsonl
uvx chartcoach@latest catalog values sections.role --format jsonl

Aliases such as labels, roles, label.family, label.category, and label.modifier map to table columns.
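
The alias names suggest how a label such as chart:bar:use breaks down. The sketch below assumes that the colon-separated parts map, in order, to family, category, and modifier; parse_label and ParsedLabel are hypothetical names, and the real parser may accept other shapes.

```python
from typing import NamedTuple, Optional


class ParsedLabel(NamedTuple):
    family: str
    category: Optional[str]
    modifier: Optional[str]


def parse_label(label):
    """Split a label into family, category, and modifier (assumed positions)."""
    parts = label.split(":")
    if not 1 <= len(parts) <= 3 or any(not part for part in parts):
        raise ValueError(f"unexpected label shape: {label!r}")
    # Pad missing trailing parts with None.
    padded = parts + [None] * (3 - len(parts))
    return ParsedLabel(*padded)
```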

Loader boundary

Use the Python or JavaScript package APIs as the supported data boundary. The generated docs output and public HTML pages are rendering artifacts.
