{"data":{"@kind":"guide","slug":"mirror-the-catalog","title":"Mirror Cambridge TCG's card catalog locally","subtitle":"One request, ~12k cards, CC0.","intro":"If you're building a meta-product (price aggregator, deck builder, search engine), you'll want a local mirror of the catalog so your users don't hit our API for every card view. This guide gets you from zero to a refreshable local copy in one request, plus a polite refresh discipline.","audiences":["mirror","aggregator","scraper"],"prerequisites":["About 6 MB of disk for the JSONL file","A daily cron or scheduled task"],"estimated_minutes":10,"step_count":3,"steps":[{"step_number":1,"title":"Fetch the bulk catalog","instruction":"One request returns the entire catalog as streaming JSONL. The first line is a manifest header (count_expected, retrieved_at, license); the last is a footer (complete, count_emitted); the lines in between are cards in canonical universal-mirror sparse form. Each card carries `@content_hash` for change detection.","curl":"curl --compressed \\\n  https://cambridgetcg.com/data/catalog.jsonl \\\n  > catalog.jsonl","expected_response_shape":"Line 1: { \"@kind\": \"catalog_manifest\", \"count_expected\": 12000, \"license\": \"CC0-1.0\", ... }\nLines 2-N: { \"@kind\": \"card\", \"@content_hash\": \"sha256:...\", \"sku\": \"...\", \"price\": {...}, ... }\nLine N+1: { \"@kind\": \"catalog_footer\", \"complete\": true, \"count_emitted\": 11984 }","what_to_do_with_it":"Parse line by line. Store the manifest header; its `retrieved_at` is your cache key. Index cards by `sku`. On the next refresh, compare each card's `@content_hash` against your stored copy and re-index only the rows that changed. The footer's `complete: true` is the signal that you received the full stream; `truncated: true` means you hit the 50k cap (unlikely today; cursor pagination is future work)."},{"step_number":2,"title":"Schedule a daily refresh","instruction":"The catalog freshness budget is `catalog` (24 hours). 
Pulling once a day at an off-peak hour (e.g. 04:00 UTC) is the polite cadence. Don't pull more often than every 6 hours; the catalog doesn't change that fast, so extra pulls only waste bandwidth.","curl":"# cron entry: 0 4 * * *  curl --compressed -o catalog.jsonl https://cambridgetcg.com/data/catalog.jsonl","what_to_do_with_it":"After each refresh, diff the new `@content_hash` set against your previous one to find changed/added/removed rows. Cards never get hard-deleted, but a card's `@content_hash` changes whenever its latest captured price changes."},{"step_number":3,"title":"Cite Cambridge TCG honestly","instruction":"The data is CC0, so you legally owe no attribution. But *substrate-honest* attribution is encouraged: in your UI, name where the data came from and link back. Reciprocal kindness.","what_to_do_with_it":"Recommended attribution: 'Catalog data from Cambridge TCG (https://cambridgetcg.com) — CC0-1.0.' Or in machine-readable form, attach `provenance: { source: \"cambridge-tcg\", license: \"CC0-1.0\", retrieved_at: \"...\" }` to each row in your downstream product."}],"gotchas":[{"title":"The price chain may include CardRush JP retail","description":"GBP prices are Cambridge TCG's own retail offers (CC0). But the underlying price observation pipeline at our wholesale layer reads from CardRush JP (license: internal-only). The bulk export carries only derived GBP, not raw JPY, so you're fine. But if you later use /api/v1/cards/[sku]/cardrush-history (auth-gated tier-2), the JPY values come with `internal-only` license restrictions: personal-decision use is OK, bulk re-export is not."},{"title":"JSONL parsing: one object per line","description":"Don't parse the whole response as a single JSON document. Read line by line. Each line is a complete JSON object.","symptom":"Your parser errors with 'JSON document has trailing content' or similar.","fix":"In Node: `body.split('\\n').filter(Boolean).map((line) => JSON.parse(line))`. 
In Python (assuming a `requests` response): `[json.loads(line) for line in response.iter_lines() if line]`."},{"title":"The bulk endpoint has a 50k-row cap today","description":"The current catalog is ~12k rows. The bulk endpoint caps at 50k rows per request, well above today's size. When/if the catalog grows past that, we'll add cursor pagination via `?since_sku=`. The footer's `truncated: true` is the signal that you hit the cap."}],"next_guide":{"slug":"track-one-card","title":"Track one card's price over time","url":"/api/v1/guides/track-one-card","html_url":"/agents/guides/track-one-card"},"see_also":[{"label":"Connection doc: the-license-propagation","href":"https://github.com/cambridgetcg/Cambridge-TCG-monorepo/blob/main/docs/connections/the-license-propagation.md"},{"label":"Universal representation spec","href":"/methodology/universal-representation"},{"label":"Bulk endpoint OpenAPI","href":"/api/openapi.json"}],"last_verified":"2026-05-14","feedback":{"kind":"guide-feedback","endpoint":"/api/v1/feedback","body_template":{"kind":"guide-feedback","guide_slug":"mirror-the-catalog","step_number":"<which step had the issue, or null for whole-guide feedback>","observation":"<what you observed>","expected":"<what you expected>","reporter_contact":"<your email>"}},"html_sibling":"/agents/guides/mirror-the-catalog"},"_meta":{"spec_version":"1","endpoint":"/api/v1/guides/[slug]","retrieved_at":"2026-05-13T19:58:51.186Z","as_of":"2026-05-13T19:58:51.186Z","sources":["ctcg-derived"],"freshness_seconds":86400,"license":"CC0-1.0","request_id":"req_0b20dfdd-246","deprecation":null,"next_link":null,"self_reference":{"this_endpoint":"/api/v1/guides/[slug]","contains_self":true},"source_license":["CC0-1.0"]}}