Back to Blog
Healthcare Data Modeling

Why Your "Perfect" Logical Model Still Produces Bad Metrics

mdatool TeamMarch 23, 202610 min read
healthcaredata modelingmetricsdata quality

You spent months designing a logical data model. Your ERD is pristine. Every entity has clear relationships. Your business stakeholders signed off. Your DBA implemented it flawlessly in Snowflake. The data warehouse is live.

Then the first quarterly metrics report lands on the CFO's desk. Member attribution is off by 15%. Claim counts don't match finance. Provider network metrics contradict what operations sees in the EHR. Your "perfect" model just produced garbage metrics.

What happened?

The uncomfortable truth: logical models don't fail because of bad design. They fail because they ignore the brutal realities of healthcare data operations. Here's why even the best logical models produce bad metrics—and how to fix it.


The Core Problem: Logical Models Ignore Data Lineage

Healthcare data doesn't arrive clean. It arrives as:

  • EDI 837 claims with 50+ segments, loops, and conditional fields
  • HL7 ADT feeds with patient demographic updates buried in PID segments
  • Eligibility files from carriers with retroactive effective dates
  • Provider roster updates that contradict yesterday's NPI registry snapshot

Your logical model says: Member → Claim → Provider. Clean. Simple. Wrong.

Reality says: Member records get updated 6 times during a claim lifecycle. Claims get reversed, reprocessed, and resubmitted. Provider addresses change mid-contract but old addresses still need to route payments.

The gap? Your logical model doesn't account for temporal validity, source system precedence, or reconciliation logic.


Example: The Member Attribution Disaster

The Logical Model (Looks Perfect)

CREATE TABLE member (
  mmbr_id          VARCHAR(50) PRIMARY KEY,
  first_nm         VARCHAR(100),
  last_nm          VARCHAR(100),
  date_of_birth    DATE,
  gender_cd        CHAR(1),
  primary_care_prvdr_id VARCHAR(50)
);

The Reality (Data Hell)

Your member data comes from 4 sources:

  1. Eligibility file (daily snapshot from carrier)
  2. Claims data (member name as billed by provider)
  3. Provider portal (patient self-registration)
  4. EHR feed (demographic updates from clinic EMR)

All four disagree on:

  • Name spelling ("Mary Smith" vs "Marie Smythe")
  • Date of birth (off by one day due to timezone differences)
  • PCP assignment (eligibility shows Dr. Jones, claims show Dr. Smith)

Your metrics report:

  • Member count: 47,315 (eligibility says 48,200)
  • Members with PCP: 39,104 (provider portal says 41,892)
  • CFO asks: "Which number is right?"

Your logical model has no answer because it never defined:

  • Source system hierarchy (which source wins when there's conflict?)
  • Effective dating (when did the PCP assignment change?)
  • Match/merge rules (how do we know Mary Smith = Marie Smythe?)

Fix #1: Add Source System Lineage to Your Logical Model

Don't model entities. Model SOURCE RECORDS.

Updated Logical Model

CREATE TABLE member_source (
  mmbr_src_id        VARCHAR(50) PRIMARY KEY,
  mmbr_id            VARCHAR(50),
  source_system_cd   VARCHAR(20),
  source_record_id   VARCHAR(100),
  first_nm           VARCHAR(100),
  last_nm            VARCHAR(100),
  date_of_birth      DATE,
  gender_cd          CHAR(1),
  primary_care_prvdr_id VARCHAR(50),
  effective_dt       DATE,
  end_dt             DATE,
  is_current_flag    CHAR(1),
  load_dttm          TIMESTAMP,
  source_priority    INT
);

CREATE TABLE member_master (
  mmbr_id            VARCHAR(50) PRIMARY KEY,
  golden_first_nm    VARCHAR(100),
  golden_last_nm     VARCHAR(100),
  golden_dob         DATE,
  golden_gender_cd   CHAR(1),
  golden_pcp_id      VARCHAR(50),
  last_updated_dttm  TIMESTAMP
);

Fix #2: Model Temporal Validity, Not Static Snapshots

Healthcare data changes constantly. Your logical model needs bi-temporal tables.

CREATE TABLE claim_history (
  clm_history_id       VARCHAR(50) PRIMARY KEY,
  clm_id               VARCHAR(50),
  version_nbr          INT,
  mmbr_id              VARCHAR(50),
  total_billed_amt     DECIMAL(10,2),
  total_paid_amt       DECIMAL(10,2),
  claim_status_cd      VARCHAR(20),
  effective_dt         DATE,
  end_dt               DATE,
  valid_from_dttm      TIMESTAMP,
  valid_to_dttm        TIMESTAMP,
  is_current_flag      CHAR(1)
);

Fix #3: Model Data Quality Rules as First-Class Entities

CREATE TABLE data_quality_rule (
  rule_id          VARCHAR(50) PRIMARY KEY,
  rule_nm          VARCHAR(200),
  rule_category_cd VARCHAR(50),
  entity_nm        VARCHAR(100),
  severity_cd      VARCHAR(20)
);

CREATE TABLE data_quality_violation (
  violation_id     VARCHAR(50) PRIMARY KEY,
  rule_id          VARCHAR(50),
  entity_id        VARCHAR(100),
  violation_dttm   TIMESTAMP,
  resolved_flag    CHAR(1)
);

Fix #4: Model Reconciliation Logic

CREATE TABLE reconciliation_control (
  recon_id            VARCHAR(50) PRIMARY KEY,
  recon_nm            VARCHAR(200),
  source_system_cd    VARCHAR(50),
  target_entity_nm    VARCHAR(100),
  tolerance_pct       DECIMAL(5,2)
);

The Checklist

✅ Source System Lineage

  • Every entity tracks which source system it came from
  • Conflicting sources have documented precedence rules

✅ Temporal Validity

  • Entities with changing values use effective dating
  • Historical versions are preserved

✅ Data Quality

  • Validation rules are documented
  • Violations are tracked and reportable

✅ Reconciliation

  • External reconciliation points are identified
  • Variance thresholds are defined

Conclusion

The best logical models in healthcare don't just show entities and relationships. They show:

  • Where data comes from (source lineage)
  • When data changed (temporal validity)
  • What makes data valid (quality rules)
  • How metrics reconcile (external validation)

Your "perfect" logical model produces bad metrics because it models the business as you wish it worked—not as it actually operates.

Fix the model. Fix the metrics.

Ready to improve your data architecture?

Get started with mdatool's free tools for DDL conversion, SQL analysis, and more.

Get Started Free