Why Your "Perfect" Logical Model Still Produces Bad Metrics
You spent months designing a logical data model. Your ERD is pristine. Every entity has clear relationships. Your business stakeholders signed off. Your DBA implemented it flawlessly in Snowflake. The data warehouse is live.
Then the first quarterly metrics report lands on the CFO's desk. Member attribution is off by 15%. Claim counts don't match finance. Provider network metrics contradict what operations sees in the EHR. Your "perfect" model just produced garbage metrics.
What happened?
The uncomfortable truth: logical models don't fail because of bad design. They fail because they ignore the brutal realities of healthcare data operations. Here's why even the best logical models produce bad metrics—and how to fix it.
The Core Problem: Logical Models Ignore Data Lineage
Healthcare data doesn't arrive clean. It arrives as:
- EDI 837 claims with 50+ segments, loops, and conditional fields
- HL7 ADT feeds with patient demographic updates buried in PID segments
- Eligibility files from carriers with retroactive effective dates
- Provider roster updates that contradict yesterday's NPI registry snapshot
Your logical model says: Member → Claim → Provider. Clean. Simple. Wrong.
Reality says: Member records get updated 6 times during a claim lifecycle. Claims get reversed, reprocessed, and resubmitted. Provider addresses change mid-contract but old addresses still need to route payments.
The gap? Your logical model doesn't account for temporal validity, source system precedence, or reconciliation logic.
Example: The Member Attribution Disaster
The Logical Model (Looks Perfect)
CREATE TABLE member ( mmbr_id VARCHAR(50) PRIMARY KEY, first_nm VARCHAR(100), last_nm VARCHAR(100), date_of_birth DATE, gender_cd CHAR(1), primary_care_prvdr_id VARCHAR(50) );
The Reality (Data Hell)
Your member data comes from 4 sources:
- Eligibility file (daily snapshot from carrier)
- Claims data (member name as billed by provider)
- Provider portal (patient self-registration)
- EHR feed (demographic updates from clinic EMR)
All four disagree on:
- Name spelling ("Mary Smith" vs "Marie Smythe")
- Date of birth (off by one day due to timezone differences)
- PCP assignment (eligibility shows Dr. Jones, claims show Dr. Smith)
Your metrics report:
- Member count: 47,315 (eligibility says 48,200)
- Members with PCP: 39,104 (provider portal says 41,892)
- CFO asks: "Which number is right?"
Your logical model has no answer because it never defined:
- Source system hierarchy (which source wins when there's conflict?)
- Effective dating (when did the PCP assignment change?)
- Match/merge rules (how do we know Mary Smith = Marie Smythe?)
Fix #1: Add Source System Lineage to Your Logical Model
Don't model entities. Model SOURCE RECORDS.
Updated Logical Model
CREATE TABLE member_source ( mmbr_src_id VARCHAR(50) PRIMARY KEY, mmbr_id VARCHAR(50), source_system_cd VARCHAR(20), source_record_id VARCHAR(100), first_nm VARCHAR(100), last_nm VARCHAR(100), date_of_birth DATE, gender_cd CHAR(1), primary_care_prvdr_id VARCHAR(50), effective_dt DATE, end_dt DATE, is_current_flag CHAR(1), load_dttm TIMESTAMP, source_priority INT ); CREATE TABLE member_master ( mmbr_id VARCHAR(50) PRIMARY KEY, golden_first_nm VARCHAR(100), golden_last_nm VARCHAR(100), golden_dob DATE, golden_gender_cd CHAR(1), golden_pcp_id VARCHAR(50), last_updated_dttm TIMESTAMP );
Fix #2: Model Temporal Validity, Not Static Snapshots
Healthcare data changes constantly. Your logical model needs bi-temporal tables.
CREATE TABLE claim_history ( clm_history_id VARCHAR(50) PRIMARY KEY, clm_id VARCHAR(50), version_nbr INT, mmbr_id VARCHAR(50), total_billed_amt DECIMAL(10,2), total_paid_amt DECIMAL(10,2), claim_status_cd VARCHAR(20), effective_dt DATE, end_dt DATE, valid_from_dttm TIMESTAMP, valid_to_dttm TIMESTAMP, is_current_flag CHAR(1) );
Fix #3: Model Data Quality Rules as First-Class Entities
CREATE TABLE data_quality_rule ( rule_id VARCHAR(50) PRIMARY KEY, rule_nm VARCHAR(200), rule_category_cd VARCHAR(50), entity_nm VARCHAR(100), severity_cd VARCHAR(20) ); CREATE TABLE data_quality_violation ( violation_id VARCHAR(50) PRIMARY KEY, rule_id VARCHAR(50), entity_id VARCHAR(100), violation_dttm TIMESTAMP, resolved_flag CHAR(1) );
Fix #4: Model Reconciliation Logic
CREATE TABLE reconciliation_control ( recon_id VARCHAR(50) PRIMARY KEY, recon_nm VARCHAR(200), source_system_cd VARCHAR(50), target_entity_nm VARCHAR(100), tolerance_pct DECIMAL(5,2) );
The Checklist
✅ Source System Lineage
- Every entity tracks which source system it came from
- Conflicting sources have documented precedence rules
✅ Temporal Validity
- Entities with changing values use effective dating
- Historical versions are preserved
✅ Data Quality
- Validation rules are documented
- Violations are tracked and reportable
✅ Reconciliation
- External reconciliation points are identified
- Variance thresholds are defined
Conclusion
The best logical models in healthcare don't just show entities and relationships. They show:
- Where data comes from (source lineage)
- When data changed (temporal validity)
- What makes data valid (quality rules)
- How metrics reconcile (external validation)
Your "perfect" logical model produces bad metrics because it models the business as you wish it worked—not as it actually operates.
Fix the model. Fix the metrics.