Your claims data warehouse went live three months ago. The schema looked solid in the design review. Your DBA normalized everything properly. Then finance runs the first reconciliation report against the 835 remittance files.
The counts don't match.
Claims you thought were paid show as denied. Paid amounts differ by thousands of dollars. Some claims appear twice. Others are missing entirely. Your CFO wants answers. Your vendor blames your ETL. Your ETL team blames the source data.
The real problem? You modeled claims as transactions, not as the complex lifecycle events they actually are.
Healthcare claims go through submission → adjudication → payment → adjustment → reprocessing. Each stage generates data. Each adjustment creates a new version. Financial reconciliation requires tracking every version, not just the final state.
Here's how to build claims data models that maintain accuracy through the entire adjudication and payment lifecycle.
The Claims Lifecycle: Why Simple Models Fail
A claim isn't a transaction. It's a state machine with these stages:
- Submission - Provider sends claim via [[[[EDI 837](/terms/EDI%20837)](/terms/EDI%20837)](/terms/EDI%20837)](/terms/EDI%20837)
- Receipt - Payer receives and validates format
- Adjudication - Payer applies benefit rules, determines payment
- Payment - Payer issues payment via [[[[EDI 835](/terms/EDI%20835)](/terms/EDI%20835)](/terms/EDI%20835)](/terms/EDI%20835), sends EOB to member
- Adjustment - Claim gets corrected, voided, or reprocessed
- Appeals - Provider disputes denial, claim re-adjudicated
Each stage generates data. A simple transactional model captures only the final state, losing the audit trail that finance and compliance require.
The Naive Claims Model (Guaranteed to Fail Reconciliation)
CREATE TABLE claim (
claim_id VARCHAR(50) PRIMARY KEY,
member_id VARCHAR(50),
provider_id VARCHAR(50),
service_date DATE,
billed_amount DECIMAL(10,2),
paid_amount DECIMAL(10,2),
claim_status VARCHAR(20),
processed_date DATE
);
🔄Free Tool
Convert this DDL to Snowflake, BigQuery, or PostgreSQL instantly →
Why this breaks in production:
Problem 1: No Version History
Claim gets paid $500 on March 15. On April 3, it's adjusted to $450. What happened?
- If you UPDATE: Lost the original $500 (can't reconcile to March financials)
- If you DELETE+INSERT: Lost the relationship (orphaned claim lines)
- If you add
versioncolumn: Can't query "current state" without MAX(version) everywhere
Problem 2: No Adjustment Tracking
Why was the claim adjusted? Was it:
- Provider submitted corrected claim?
- Member appealed and won?
- Payer discovered duplicate payment?
- Coordination of benefits changed?
Without tracking adjustment type and reason, you can't answer "Why did our paid claims decrease by $2.3M this quarter?"
Problem 3: No Payment Reconciliation
Finance receives an 835 remittance file with 10,000 claims totaling $4.2M. Your database shows 9,847 claims totaling $4.3M.
- Which claims are missing?
- Which claims paid different amounts?
- Which claims are in your database but not in the 835?
Without linking claims to payment batches, reconciliation is manual detective work.
The Production Claims Model: Header + History + Lines
Healthcare claims need three-tier modeling:
-- Tier 1: Claim Header (current state + identifiers)
CREATE TABLE claim_header (
clm_id VARCHAR(50) PRIMARY KEY,
-- Identifiers
internal_clm_nbr VARCHAR(50) UNIQUE,
payer_clm_nbr VARCHAR(50),
external_clm_nbr VARCHAR(50),
-- Member/Provider relationships
mmbr_id VARCHAR(50),
subscriber_id VARCHAR(50),
rendering_prvdr_id VARCHAR(50),
billing_prvdr_id VARCHAR(50),
referring_prvdr_id VARCHAR(50),
facility_prvdr_id VARCHAR(50),
-- Current financial state
current_billed_amt DECIMAL(10,2),
current_allowed_amt DECIMAL(10,2),
current_paid_amt DECIMAL(10,2),
current_patient_resp_amt DECIMAL(10,2),
-- Current status
current_status_cd VARCHAR(20),
current_version_nbr INT DEFAULT 1,
adjustment_count INT DEFAULT 0,
-- Key dates
service_from_dt DATE,
service_to_dt DATE,
received_dt DATE,
first_processed_dt DATE,
last_processed_dt DATE,
paid_dt DATE,
-- Metadata
claim_type_cd VARCHAR(20),
claim_frequency_cd VARCHAR(10),
bill_type_cd VARCHAR(10),
is_current_flag CHAR(1) DEFAULT 'Y',
created_dttm TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
updated_dttm TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
Tracking Adjustments: The Key to Reconciliation
When a claim gets adjusted, you need to answer:
- What changed? (amount, status, diagnosis)
- Why did it change? (correction, appeal, duplicate)
- When did it change? (effective date, process date)
- Who changed it? (payer, provider, member)
Payment Reconciliation: Linking Claims to 835 Files
The 835 remittance file is your source of truth for payments. Your claims warehouse must reconcile to it.
CREATE TABLE payment_batch (
batch_id VARCHAR(50) PRIMARY KEY,
payer_id VARCHAR(50),
check_nbr VARCHAR(50),
check_dt DATE,
check_amt DECIMAL(10,2),
file_name VARCHAR(200),
claim_count INT,
total_paid_amt DECIMAL(10,2),
reconciled_flag CHAR(1)
);
Conclusion
Claims aren't transactions. They're complex lifecycle events that change over time, require complete audit trails, and must reconcile to external payment files.
Model claims with:
- Header for current state
- History for all versions
- Payment links for reconciliation
- Line detail for accuracy
- Code tracking for compliance
Fix the claims model. Fix the financials.
Related Guides
Key Terms in This Article
More in Healthcare Data Modeling
HCC Risk Adjustment Data Model: Building Accurate Risk Score Pipelines in SQL
HCC risk adjustment determines how much revenue a health plan receives for each Medicare Advantage member. The data model behind it — from diagnosis code ingestion to RAF score calculation — is one of the most consequential schemas a payer data engineer will build.
Read moreNPI Number Validation: How to Clean and Enrich Provider Data in Your Database
Invalid NPI numbers in your provider table silently break claims routing, credentialing workflows, and CMS submissions. This guide covers validation approaches, NPPES enrichment patterns, and SQL queries for catching NPI data quality problems before they reach production.
Read moreHEDIS Measure Data Architecture: Building a Compliant Quality Reporting Pipeline
HEDIS measure calculation is deceptively complex. NCQA technical specifications are hundreds of pages long, source data comes from four different systems, and one logic error can affect thousands of members. Here is how to build a pipeline that holds up to NCQA audit.
Read moreReady to improve your data architecture?
Free tools for DDL conversion, SQL analysis, naming standards, and more.