You're building a healthcare data warehouse. Your team has modeled retail systems, banking platforms, even e-commerce giants. Healthcare should be easier, right?
Wrong.
Three months into the project, your carefully normalized member table can't handle mid-month plan switches. Claims get adjusted weeks after payment, breaking your financial reconciliation. Provider network attribution shows Dr. Smith treating patients in three cities simultaneously. Your quality measures calculate incorrectly because you didn't model diagnosis code sequencing.
Healthcare data modeling is fundamentally different. Here's how to build logical models for Members, Claims, Providers, and Compliance that actually work in production.
The Four Pillars of Healthcare Data Models
Every healthcare data warehouse centers on four core entity groups:
- Members - People enrolled in health plans
- Claims - Medical services rendered and billed
- Providers - Doctors, hospitals, clinics delivering care
- Compliance - Regulatory reporting, quality measures, audits
The fatal mistake? Modeling these like transactional retail data. Healthcare requires temporal tracking, version history, source lineage, and regulatory audit trails that traditional normalized models don't support.
Pillar 1: Member Data Models
Members aren't static records. A member can:
- Switch plans mid-month (retroactive coverage changes)
- Have multiple coverage types (medical + dental + vision + Rx)
- Change addresses while claims are processing (where do you route the EOB?)
- Appear in 4 different source systems with 4 different names
The Naive Member Model (Breaks Immediately)
CREATE TABLE member (
mmbr_id VARCHAR(50) PRIMARY KEY,
first_nm VARCHAR(100),
last_nm VARCHAR(100),
date_of_birth DATE,
gender_cd CHAR(1),
ssn VARCHAR(11),
address_line1 VARCHAR(200),
city_nm VARCHAR(100),
state_cd CHAR(2),
zip_cd VARCHAR(10),
plan_id VARCHAR(50),
coverage_start_dt DATE,
coverage_end_dt DATE
);
🔄Free Tool
Convert this DDL to Snowflake, BigQuery, or PostgreSQL instantly →
Why this fails in production:
- Updates overwrite history (can't reproduce last month's metrics)
- Can't handle mid-month plan changes (member had 2 plans in one month)
- No source system tracking (which system is truth for address?)
- No family relationships (subscriber vs. dependent)
The Production Member Model
-- Golden record (current truth)
CREATE TABLE member_master (
mmbr_id VARCHAR(50) PRIMARY KEY,
subscriber_id VARCHAR(50),
dependent_seq_nbr INT,
golden_first_nm VARCHAR(100),
golden_last_nm VARCHAR(100),
golden_dob DATE,
golden_gender_cd CHAR(1),
golden_ssn VARCHAR(11),
mdm_effective_dttm TIMESTAMP,
is_current_flag CHAR(1),
created_dttm TIMESTAMP,
updated_dttm TIMESTAMP
);
-- Demographic history (all changes)
CREATE TABLE member_demographic_history (
demo_history_id VARCHAR(50) PRIMARY KEY,
mmbr_id VARCHAR(50),
version_nbr INT,
first_nm VARCHAR(100),
last_nm VARCHAR(100),
address_line1 VARCHAR(200),
city_nm VARCHAR(100),
state_cd CHAR(2),
zip_cd VARCHAR(10),
effective_dt DATE,
end_dt DATE,
created_dttm TIMESTAMP,
source_system_cd VARCHAR(20),
is_current_flag CHAR(1),
FOREIGN KEY (mmbr_id) REFERENCES member_master(mmbr_id)
);
-- Enrollment/coverage (temporal)
CREATE TABLE member_enrollment (
enrollment_id VARCHAR(50) PRIMARY KEY,
mmbr_id VARCHAR(50),
plan_id VARCHAR(50),
coverage_type_cd VARCHAR(20),
coverage_start_dt DATE,
coverage_end_dt DATE,
enrollment_status_cd VARCHAR(20),
termination_reason_cd VARCHAR(50),
premium_amt DECIMAL(10,2),
effective_dt DATE,
end_dt DATE,
is_current_flag CHAR(1),
FOREIGN KEY (mmbr_id) REFERENCES member_master(mmbr_id)
);
-- Member months (pre-calculated)
CREATE TABLE member_months (
mmbr_month_id VARCHAR(50) PRIMARY KEY,
mmbr_id VARCHAR(50),
plan_id VARCHAR(50),
year_nbr INT,
month_nbr INT,
coverage_days INT,
member_months DECIMAL(5,2),
full_month_flag CHAR(1),
calculation_method VARCHAR(50),
FOREIGN KEY (mmbr_id) REFERENCES member_master(mmbr_id)
);
Why this works:
- ✅ Golden record preserves current state
- ✅ Full demographic history for point-in-time queries
- ✅ Enrollment periods tracked separately
- ✅ Member months pre-calculated (no runtime debates)
- ✅ Source lineage documented
Pillar 2: Claims Data Models
Claims are the most complex healthcare entity:
- Get submitted → adjudicated → paid → adjusted → reprocessed
- Have 50+ attributes (diagnosis codes, procedure codes, modifiers, revenue codes)
- Link to multiple providers (billing, rendering, referring, facility)
- Must reconcile to [EDI 835](/terms/EDI%20835) remittance files
- Require complete audit trail for CMS audits
The Naive Claims Model
CREATE TABLE claim (
clm_id VARCHAR(50) PRIMARY KEY,
mmbr_id VARCHAR(50),
prvdr_id VARCHAR(50),
service_dt DATE,
total_billed_amt DECIMAL(10,2),
total_paid_amt DECIMAL(10,2),
claim_status_cd VARCHAR(20)
);
Why this fails:
- No adjustment tracking (claims change weeks after payment)
- No line-level detail (claims can have 100+ service lines)
- No diagnosis codes (needed for quality measures, risk adjustment)
- No provider role tracking (who billed vs. who rendered?)
- No reconciliation to financial ledger
Free Tool
Calculate RAF scores with our free HCC Calculator →
The Production Claims Model
-- Claim master (current state)
CREATE TABLE claim_master (
clm_id VARCHAR(50) PRIMARY KEY,
mmbr_id VARCHAR(50),
plan_id VARCHAR(50),
original_clm_nbr VARCHAR(50),
payer_clm_nbr VARCHAR(50),
service_from_dt DATE,
service_to_dt DATE,
received_dt DATE,
processed_dt DATE,
paid_dt DATE,
total_billed_amt DECIMAL(10,2),
total_allowed_amt DECIMAL(10,2),
total_paid_amt DECIMAL(10,2),
total_patient_resp_amt DECIMAL(10,2),
claim_status_cd VARCHAR(20),
claim_type_cd VARCHAR(20),
current_version_nbr INT,
adjustment_count INT,
is_current_flag CHAR(1),
created_dttm TIMESTAMP,
updated_dttm TIMESTAMP,
FOREIGN KEY (mmbr_id) REFERENCES member_master(mmbr_id)
);
-- Claim version history
CREATE TABLE claim_version_history (
clm_version_id VARCHAR(50) PRIMARY KEY,
clm_id VARCHAR(50),
version_nbr INT,
total_billed_amt DECIMAL(10,2),
total_paid_amt DECIMAL(10,2),
claim_status_cd VARCHAR(20),
adjustment_type_cd VARCHAR(20),
adjustment_reason_txt TEXT,
effective_dt DATE,
end_dt DATE,
version_created_dttm TIMESTAMP,
is_current_flag CHAR(1),
FOREIGN KEY (clm_id) REFERENCES claim_master(clm_id)
);
-- Claim line detail
CREATE TABLE claim_line (
clm_line_id VARCHAR(50) PRIMARY KEY,
clm_id VARCHAR(50),
line_nbr INT,
service_dt DATE,
place_of_service_cd VARCHAR(10),
revenue_cd VARCHAR(10),
procedure_cd VARCHAR(20),
modifier_cd_1 VARCHAR(10),
units_qty DECIMAL(10,2),
billed_amt DECIMAL(10,2),
allowed_amt DECIMAL(10,2),
paid_amt DECIMAL(10,2),
deductible_amt DECIMAL(10,2),
coinsurance_amt DECIMAL(10,2),
copay_amt DECIMAL(10,2),
rendering_prvdr_id VARCHAR(50),
line_status_cd VARCHAR(20),
denial_reason_cd VARCHAR(50),
FOREIGN KEY (clm_id) REFERENCES claim_master(clm_id)
);
-- Claim diagnosis codes
CREATE TABLE claim_diagnosis (
clm_diag_id VARCHAR(50) PRIMARY KEY,
clm_id VARCHAR(50),
diagnosis_seq_nbr INT,
diagnosis_cd VARCHAR(20),
diagnosis_type_cd VARCHAR(20),
present_on_admit_cd CHAR(1),
FOREIGN KEY (clm_id) REFERENCES claim_master(clm_id)
);
-- Claim provider relationships
CREATE TABLE claim_provider (
clm_prvdr_id VARCHAR(50) PRIMARY KEY,
clm_id VARCHAR(50),
prvdr_id VARCHAR(50),
provider_role_cd VARCHAR(20),
FOREIGN KEY (clm_id) REFERENCES claim_master(clm_id)
);
Why this works:
- ✅ Version history preserves all adjustments
- ✅ Line-level detail captured
- ✅ Diagnosis codes stored with sequence
- ✅ Multiple provider roles documented
- ✅ Complete audit trail for compliance
Pillar 3: Provider Data Models
Providers have:
- Multiple identifiers ([NPI](https://www.mdatool.com/glossary#NPI), state license, DEA, PTAN)
- Taxonomy classifications (specialty codes)
- Practice locations (primary + satellites)
- Network participation (contract dates, reimbursement rates)
- Credentialing status (active, suspended, under review)
The Production Provider Model
-- Provider master
CREATE TABLE provider_master (
prvdr_id VARCHAR(50) PRIMARY KEY,
npi_id VARCHAR(10) UNIQUE,
tax_id VARCHAR(20),
prvdr_type_cd VARCHAR(20),
last_nm VARCHAR(100),
first_nm VARCHAR(100),
organization_nm VARCHAR(200),
primary_taxonomy_cd VARCHAR(20),
primary_specialty_cd VARCHAR(50),
is_active_flag CHAR(1),
created_dttm TIMESTAMP,
updated_dttm TIMESTAMP
);
-- Provider practice locations
CREATE TABLE provider_location (
prvdr_location_id VARCHAR(50) PRIMARY KEY,
prvdr_id VARCHAR(50),
location_type_cd VARCHAR(20),
address_line1 VARCHAR(200),
city_nm VARCHAR(100),
state_cd CHAR(2),
zip_cd VARCHAR(10),
phone_nbr VARCHAR(20),
is_primary_flag CHAR(1),
effective_dt DATE,
end_dt DATE,
FOREIGN KEY (prvdr_id) REFERENCES provider_master(prvdr_id)
);
-- Network participation
CREATE TABLE provider_network (
prvdr_network_id VARCHAR(50) PRIMARY KEY,
prvdr_id VARCHAR(50),
plan_id VARCHAR(50),
network_id VARCHAR(50),
participation_type_cd VARCHAR(20),
contract_start_dt DATE,
contract_end_dt DATE,
reimbursement_type_cd VARCHAR(20),
network_status_cd VARCHAR(20),
termination_reason_cd VARCHAR(50),
is_accepting_new_patients_flag CHAR(1),
FOREIGN KEY (prvdr_id) REFERENCES provider_master(prvdr_id)
);
Pillar 4: Compliance & Quality Models
Healthcare requires regulatory reporting:
- [HEDIS](/terms/HEDIS) measures
- Stars ratings
- Risk adjustment (HCC)
- Prior authorization tracking
-- Quality measure definitions
CREATE TABLE quality_measure (
measure_id VARCHAR(50) PRIMARY KEY,
measure_cd VARCHAR(20),
measure_nm VARCHAR(200),
measure_desc TEXT,
measure_category_cd VARCHAR(50),
measurement_year INT,
numerator_logic TEXT,
denominator_logic TEXT,
exclusion_logic TEXT,
is_active_flag CHAR(1)
);
-- Member measure compliance
CREATE TABLE member_quality_measure (
mmbr_measure_id VARCHAR(50) PRIMARY KEY,
mmbr_id VARCHAR(50),
measure_id VARCHAR(50),
measurement_year INT,
in_denominator_flag CHAR(1),
in_numerator_flag CHAR(1),
is_excluded_flag CHAR(1),
gap_exists_flag CHAR(1),
gap_close_dt DATE,
calculated_dttm TIMESTAMP,
FOREIGN KEY (mmbr_id) REFERENCES member_master(mmbr_id),
FOREIGN KEY (measure_id) REFERENCES quality_measure(measure_id)
);
The Integration Layer
Your model must explicitly connect:
- Member → Claims via mmbr_id
- Claims → Providers via claim_provider
- Claims → Quality via diagnosis/procedure codes
- Providers → Network via provider_network
- Members → Enrollment via member_enrollment
Critical Query: Member Attribution
SELECT
m.mmbr_id,
m.golden_first_nm,
p.npi_id,
p.organization_nm
FROM member_master m
JOIN member_enrollment me ON m.mmbr_id = me.mmbr_id
AND me.is_current_flag = 'Y'
JOIN provider_network pn ON me.plan_id = pn.plan_id
AND pn.participation_type_cd = 'PCP'
AND pn.network_status_cd = 'ACTIVE'
JOIN provider_master p ON pn.prvdr_id = p.prvdr_id
WHERE m.is_current_flag = 'Y';
Compliance Requirements
Every table needs:
- Audit trail: created_by, created_dttm, updated_by, updated_dttm
- Temporal validity: effective_dt, end_dt, is_current_flag
- Source lineage: source_system_cd, source_file_id, load_dttm
Conclusion
Healthcare data models must handle:
- Temporal complexity - Data changes over time
- Regulatory compliance - Audit trails required
- Multi-source integration - Conflicting data
- Reconciliation - Must match external sources
Stop modeling healthcare like retail. Model for healthcare reality.
Checklist
Members:
- Demographic history
- Enrollment periods
- Member months calculation
- Source lineage
Claims:
- Version tracking
- Line-level detail
- Diagnosis codes
- Provider roles
- EDI reconciliation
Providers:
- NPI identifier
- Practice locations
- Network participation
- Credentialing status
Compliance:
- Quality measures
- Audit trails
- Temporal validity
Check every box or your model breaks in production.
Related Guides
More in Healthcare Data Modeling
HCC Risk Adjustment Data Model: Building Accurate Risk Score Pipelines in SQL
HCC risk adjustment determines how much revenue a health plan receives for each Medicare Advantage member. The data model behind it — from diagnosis code ingestion to RAF score calculation — is one of the most consequential schemas a payer data engineer will build.
Read moreNPI Number Validation: How to Clean and Enrich Provider Data in Your Database
Invalid NPI numbers in your provider table silently break claims routing, credentialing workflows, and CMS submissions. This guide covers validation approaches, NPPES enrichment patterns, and SQL queries for catching NPI data quality problems before they reach production.
Read moreHEDIS Measure Data Architecture: Building a Compliant Quality Reporting Pipeline
HEDIS measure calculation is deceptively complex. NCQA technical specifications are hundreds of pages long, source data comes from four different systems, and one logic error can affect thousands of members. Here is how to build a pipeline that holds up to NCQA audit.
Read moreFree Tools
Free HCC RAF Score Calculator
Calculate CMS-HCC V28 RAF scores from ICD-10 codes. No account required.
Try it freeFree NPI Lookup
Search any provider by NPI number, name, or organization instantly.
Try it freeFree DDL Converter
Translate SQL schemas between Snowflake, BigQuery, Oracle, and SQL Server.
Try it freeReady to improve your data architecture?
Free tools for DDL conversion, SQL analysis, naming standards, and more.