Data Models Don’t Break — Assumptions Do
When data models fail in production, teams often blame:
- Bad data
- Broken pipelines
- Performance issues
- Tool limitations
But the real cause is usually simpler—and more dangerous.
Assumptions broke.
Data Models Are Built on Assumptions
Every model encodes beliefs:
- About business behavior
- About process stability
- About timing
- About completeness
- About how users will query the data
Most of these assumptions are undocumented.
They live in the modeler’s head.
The Problem With Invisible Assumptions
Assumptions are fragile because:
- They change over time
- They are rarely validated
- They are not enforced by schema
- They are not visible to downstream users
When an assumption changes, the model doesn’t scream.
It quietly lies.
Common Assumptions That Fail
“This value never changes”
Until it does.
Examples:
- Member status
- Product classification
- Risk tier
- Customer segment
When history is overwritten:
- Trend analysis breaks
- Audits fail
- Metrics become irreproducible
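One way to preserve history instead of overwriting it is effective dating: close the old version and append a new one. A minimal sketch, with hypothetical names (`StatusVersion`, `change_status`) standing in for whatever your model uses:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class StatusVersion:
    member_id: int
    status: str
    valid_from: date
    valid_to: Optional[date]  # None marks the currently open version

def change_status(history: list, member_id: int,
                  new_status: str, as_of: date) -> None:
    """Close the open version and append a new one instead of overwriting."""
    for v in history:
        if v.member_id == member_id and v.valid_to is None:
            v.valid_to = as_of
    history.append(StatusVersion(member_id, new_status, as_of, None))

def status_as_of(history, member_id: int, day: date):
    """Reconstruct the status that was true on any past day."""
    for v in history:
        if (v.member_id == member_id and v.valid_from <= day
                and (v.valid_to is None or day < v.valid_to)):
            return v.status
    return None
```

Because every past state survives, last quarter's trend report returns the same numbers today as it did last quarter.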
“There is only one active record”
Until overlaps appear.
Examples:
- Coverage periods
- Pricing windows
- Assignments
- Contracts
Without explicit temporal modeling:
- Duplicates emerge
- Joins multiply rows
- Aggregates inflate silently
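The "only one active record" belief is checkable. A sketch, assuming each record carries an entity key plus start and end dates: sort each entity's periods and flag any period that starts before the previous one ends.

```python
from datetime import date

def find_overlaps(periods):
    """periods: (entity_id, start, end) tuples; returns overlapping pairs."""
    by_entity = {}
    for entity_id, start, end in periods:
        by_entity.setdefault(entity_id, []).append((start, end))
    overlaps = []
    for entity_id, spans in by_entity.items():
        spans.sort()
        for (s1, e1), (s2, e2) in zip(spans, spans[1:]):
            if s2 < e1:  # next span begins before the previous one ends
                overlaps.append((entity_id, (s1, e1), (s2, e2)))
    return overlaps
```

Run as a load-time check, this turns row-multiplying joins from a silent failure into a loud one.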
“This field is optional”
Until someone depends on it.
Nullable fields create:
- CASE logic everywhere
- Hard-to-debug filters
- BI inconsistencies
- ML feature instability
Optional today becomes mandatory tomorrow.
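Those BI inconsistencies usually come from each consumer inventing its own default for the null. A tiny sketch, with an illustrative field and fallback name: apply the rule once, at the model boundary, instead of in every query.

```python
# One boundary rule replaces scattered per-query CASE/COALESCE logic.
# Field and fallback names here are illustrative.
FALLBACK_SEGMENT = "UNSEGMENTED"

def normalize_segment(raw_segment):
    """Apply the null-handling rule once, in the model, not in every consumer."""
    return FALLBACK_SEGMENT if raw_segment is None else raw_segment
```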
Assumptions About Grain Are the Most Dangerous
Many failures stem from unclear grain: what exactly one row represents.
Questions that should never be ambiguous:
- Is this per entity?
- Per event?
- Per day?
- Per transaction?
If grain is assumed instead of enforced:
- Metrics drift
- GROUP BY logic grows
- DISTINCT becomes common
- Performance collapses
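Grain can be enforced, not assumed. A minimal sketch of a grain check (function name hypothetical): declare the key that should be unique, and fail loudly on duplicates before aggregates inflate.

```python
from collections import Counter

def assert_grain(rows, key_fn):
    """Fail loudly when the declared grain is violated, instead of letting
    joins multiply rows and aggregates inflate silently."""
    counts = Counter(key_fn(r) for r in rows)
    duplicates = [k for k, n in counts.items() if n > 1]
    if duplicates:
        raise ValueError(f"grain violated for keys: {duplicates}")
```

A table tested this way never needs defensive DISTINCT downstream.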
Time Is the Biggest Assumption Leak
Time-related assumptions fail constantly:
- Late-arriving data
- Backdated changes
- Retroactive corrections
- Reprocessed history
If models assume:
- Data arrives in order
- Changes are forward-only
- History is static
They will fail.
Eventually.
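One way to survive out-of-order arrival is to key state on event time rather than arrival time. A sketch, assuming each record carries its own `event_time`: keep the latest event per key, so the result is the same no matter what order the data shows up in.

```python
def apply_events(events):
    """events: (key, value, event_time) tuples in *arrival* order.
    Keeping the record with the latest event_time per key makes the result
    independent of arrival order, so late-arriving or reprocessed data
    cannot corrupt state."""
    state = {}
    for key, value, event_time in events:
        current = state.get(key)
        if current is None or event_time >= current[1]:
            state[key] = (value, event_time)
    return {key: value for key, (value, _) in state.items()}
```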
BI Tools Don’t Protect You From Assumptions
BI tools faithfully execute logic.
They don’t question:
- Join correctness
- Semantic intent
- Business meaning
They amplify model assumptions at scale.
A wrong assumption in the model becomes:
- Hundreds of wrong dashboards
- Thousands of wrong decisions
Assumptions Accumulate Over Time
Models age.
Assumptions that were valid:
- Last year
- Last quarter
- Last release
Slowly become wrong.
Without explicit design:
- No alarms trigger
- No errors surface
- Trust erodes quietly
How Strong Models Handle Assumptions
Resilient models:
- Make assumptions explicit
- Encode them structurally
- Preserve history
- Isolate change
Techniques include:
- Effective dating
- Event-based modeling
- Snapshotting
- Clear grain signaling
- Immutable facts
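Immutable facts make the last technique concrete. A sketch, with illustrative names: store events that never change, and derive state on demand instead of storing and mutating it.

```python
def balance_as_of(events, account, as_of):
    """events: immutable (account, delta, posted_on) facts.
    State is derived, never stored, so any historical figure
    can be reproduced exactly."""
    return sum(delta for acct, delta, posted_on in events
               if acct == account and posted_on <= as_of)
```

Because the event log only grows, re-running last month's report against today's data yields last month's numbers.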
Documenting Assumptions Is Not Enough
Documentation helps—but it’s not sufficient.
Assumptions must be:
- Enforced by schema
- Visible in naming
- Reflected in structure
- Tested continuously
If an assumption matters, design for its failure.
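"Tested continuously" can be as simple as a check that runs on every load. A sketch for one assumption from earlier, with hypothetical names: each key should have at most one open version.

```python
from collections import Counter

def check_single_open_version(rows):
    """rows: (key, valid_from, valid_to) with valid_to None when open.
    Run on every load: the 'one active record' assumption becomes a test
    that fails the pipeline instead of a belief that fails silently."""
    open_counts = Counter(key for key, _, valid_to in rows if valid_to is None)
    return [key for key, n in open_counts.items() if n > 1]
```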
Design for Change, Not Stability
Models designed for stability fail because businesses don't stay stable.
Good data models assume:
- Change is constant
- Exceptions are normal
- History matters
- Context shifts
Flexibility is not overengineering.
It’s realism.
Final Thoughts
Data models rarely break suddenly.
They decay slowly as assumptions drift out of alignment with reality.
The best models don’t eliminate assumptions.
They expect them to fail and survive anyway.
About the Author
Data modeling experts helping enterprises build better databases and data architectures.