Methodology

About the numbers

Updated 2026-05-20 · all figures derived from /data/summary.json.

The four counts that matter

StageCountDefinition
All US hospitals5,426Rows in CMS Hospital General Information.
CMS-required4,625Acute care, critical access, children's, and rural emergency hospitals — the set 45 CFR § 180 binds.
Live MRF (compliant)3,986CMS-required hospitals where we verified a live, downloadable MRF URL.
Standardized prices3,692Live MRFs we successfully parsed into n > 0 standardized rows. This is the homepage headline.

The 294-hospital gap, explained

Between "live MRF" (3,986) and "standardized prices" (3,692) sits a coverage hole of 294 hospitals. Two failure modes:

We surface only n > 0 entries on the homepage because zero-row outputs aren't useful for comparison. They count toward the parse-pipeline yield, not toward the patient- facing count.

The pipeline

  1. Seed — start with CMS hospital list (5,426) and the CMS HPT seed.
  2. Discover — sitemap walker + email-domain expansion + Wayback fallback to locate each hospital's MRF URL.
  3. Probe — HEAD / partial GET to confirm the URL is alive and serves a non-HTML, non-XML, non-PDF body. 3,986 of 4,625 CMS-required hospitals pass this gate.
  4. Parse → standardize — extract CDM / CPT / DRG / HCPCS rows into a uniform schema. 3,800 hospitals have an index entry; 3,692 of those have at least one usable row.
  5. Publish — write per-hospital JSON to R2 + /data/prices/index.json for cross-hospital comparison.

Closing the gap

The closeout pipeline (scripts/coverage_closeout.py) replays the ingest pass against the 186 not-in-index CCNs and the 108 zero-row CCNs with auto-tuned parallelism. Two open issues:

Re-derive everything yourself

# Live summary
curl -s https://hospitalledger.com/data/summary.json | jq '{compliant, standardized_price_hospitals, standardized_price_index_hospitals, zero_price_index_entries}'

# Price index (full)
curl -s https://hospitalledger.com/data/prices/index.json | jq '.hospitals | length'

# Hospitals with n > 0
curl -s https://hospitalledger.com/data/prices/index.json | jq '[.hospitals[] | select(.n > 0)] | length'

← Back to Hospital Ledger