Back to Guides
Analytics
Healthcare Analytics Guide
Complete guide to healthcare data warehousing, clinical analytics, population health metrics, and business intelligence for data-driven healthcare organizations.
Healthcare Analytics Overview
Healthcare analytics transforms clinical, operational, and financial data into actionable insights. Modern healthcare organizations use analytics for population health management, quality improvement, cost reduction, and value-based care initiatives.
A healthcare data warehouse integrates data from EHRs, claims systems, labs, pharmacies, and external sources to create a single source of truth for reporting and analytics.
Healthcare Data Warehouse Architecture
-- Dimensional Model (Star Schema) -- Dimension: Patient CREATE TABLE dim_patient ( patient_key SERIAL PRIMARY KEY, patient_id UUID UNIQUE NOT NULL, mrn VARCHAR(50), age INTEGER, age_group VARCHAR(20), -- 0-17, 18-44, 45-64, 65+ gender VARCHAR(20), race VARCHAR(100), ethnicity VARCHAR(100), zip_code VARCHAR(10), county VARCHAR(100), state VARCHAR(2), -- SCD Type 2 fields effective_date DATE NOT NULL, expiration_date DATE, is_current BOOLEAN DEFAULT TRUE, loaded_at TIMESTAMP DEFAULT NOW() ); -- Dimension: Provider CREATE TABLE dim_provider ( provider_key SERIAL PRIMARY KEY, provider_id UUID UNIQUE NOT NULL, npi VARCHAR(10), provider_name VARCHAR(255), specialty VARCHAR(100), specialty_group VARCHAR(100), organization VARCHAR(255), effective_date DATE NOT NULL, expiration_date DATE, is_current BOOLEAN DEFAULT TRUE, loaded_at TIMESTAMP DEFAULT NOW() ); -- Dimension: Diagnosis CREATE TABLE dim_diagnosis ( diagnosis_key SERIAL PRIMARY KEY, icd10_code VARCHAR(10) NOT NULL, icd10_description VARCHAR(500), icd10_category VARCHAR(100), chronic_flag BOOLEAN, loaded_at TIMESTAMP DEFAULT NOW() ); -- Dimension: Date CREATE TABLE dim_date ( date_key INTEGER PRIMARY KEY, -- YYYYMMDD format full_date DATE UNIQUE NOT NULL, year INTEGER, quarter INTEGER, month INTEGER, month_name VARCHAR(20), day INTEGER, day_of_week INTEGER, day_name VARCHAR(20), week_of_year INTEGER, is_weekend BOOLEAN, is_holiday BOOLEAN, fiscal_year INTEGER, fiscal_quarter INTEGER ); -- Fact Table: Encounters CREATE TABLE fact_encounter ( encounter_key SERIAL PRIMARY KEY, -- Foreign Keys (Dimensions) patient_key INTEGER REFERENCES dim_patient(patient_key), provider_key INTEGER REFERENCES dim_provider(provider_key), admission_date_key INTEGER REFERENCES dim_date(date_key), discharge_date_key INTEGER REFERENCES dim_date(date_key), -- Degenerate Dimensions encounter_id UUID NOT NULL, encounter_type VARCHAR(100), visit_type VARCHAR(100), -- Measures length_of_stay INTEGER, total_charges DECIMAL(10,2), readmission_30day BOOLEAN, er_visit BOOLEAN, loaded_at TIMESTAMP DEFAULT NOW() ); CREATE INDEX idx_fact_enc_patient ON fact_encounter(patient_key); CREATE INDEX idx_fact_enc_provider ON fact_encounter(provider_key); CREATE INDEX idx_fact_enc_admit_date ON fact_encounter(admission_date_key); -- Fact Table: Quality Measures CREATE TABLE fact_quality_measure ( measure_key SERIAL PRIMARY KEY, patient_key INTEGER REFERENCES dim_patient(patient_key), provider_key INTEGER REFERENCES dim_provider(provider_key), measurement_date_key INTEGER REFERENCES dim_date(date_key), -- Measure Details measure_name VARCHAR(100), -- HEDIS, CMS Star, etc. measure_category VARCHAR(100), -- Results numerator_flag BOOLEAN, -- Met measure? denominator_flag BOOLEAN, -- In measure population? exclusion_flag BOOLEAN, loaded_at TIMESTAMP DEFAULT NOW() );
Key Healthcare Metrics & KPIs
Clinical Quality Metrics
- • HEDIS Measures: Diabetes HbA1c control, breast cancer screening
- • Hospital-Acquired Conditions: C. diff, CAUTI, CLABSI, falls
- • Readmission Rates: 30-day all-cause, condition-specific
- • Mortality Rates: Risk-adjusted mortality by condition
Population Health Metrics
- • Risk Stratification: HCC risk scores, care gaps
- • Chronic Disease Management: Diabetes, CHF, COPD prevalence & control
- • Preventive Care: Screening rates, immunization coverage
- • Social Determinants: SDOH factors, community health metrics
Operational Metrics
- • Utilization: Bed occupancy, OR utilization, ER wait times
- • Patient Flow: Length of stay, discharge delays, throughput
- • Provider Productivity: RVUs, encounters per day, panel size
- • Patient Experience: HCAHPS scores, patient satisfaction
Analytics SQL Examples
30-Day Readmission Rate
WITH readmissions AS (
SELECT
e1.patient_key,
e1.discharge_date_key,
CASE WHEN EXISTS (
SELECT 1 FROM fact_encounter e2
WHERE e2.patient_key = e1.patient_key
AND e2.admission_date_key BETWEEN e1.discharge_date_key AND e1.discharge_date_key + 30
AND e2.encounter_key != e1.encounter_key
) THEN 1 ELSE 0 END as readmitted_30day
FROM fact_encounter e1
WHERE e1.encounter_type = 'inpatient'
AND e1.discharge_date_key >= 20240101
)
SELECT
COUNT(*) as total_discharges,
SUM(readmitted_30day) as readmissions,
ROUND(100.0 * SUM(readmitted_30day) / COUNT(*), 2) as readmission_rate
FROM readmissions;Diabetes HbA1c Control Rate (HEDIS)
SELECT
dp.specialty,
COUNT(DISTINCT fq.patient_key) as diabetic_patients,
SUM(CASE WHEN fq.numerator_flag THEN 1 ELSE 0 END) as controlled_patients,
ROUND(100.0 * SUM(CASE WHEN fq.numerator_flag THEN 1 ELSE 0 END) /
COUNT(DISTINCT fq.patient_key), 2) as control_rate
FROM fact_quality_measure fq
JOIN dim_provider dp ON fq.provider_key = dp.provider_key
WHERE fq.measure_name = 'Diabetes HbA1c Control'
AND fq.denominator_flag = TRUE
AND dp.is_current = TRUE
GROUP BY dp.specialty
ORDER BY control_rate DESC;High-Risk Patient Cohort Analysis
SELECT p.age_group, COUNT(DISTINCT e.patient_key) as patient_count, AVG(e.length_of_stay) as avg_los, SUM(e.total_charges) as total_charges, AVG(e.total_charges) as avg_charges, SUM(CASE WHEN e.readmission_30day THEN 1 ELSE 0 END) as readmissions FROM fact_encounter e JOIN dim_patient p ON e.patient_key = p.patient_key WHERE p.is_current = TRUE AND e.encounter_type = 'inpatient' AND e.admission_date_key >= 20240101 GROUP BY p.age_group ORDER BY avg_charges DESC;
ETL Best Practices
- •Incremental Loading: Load only changed data using CDC or timestamps
- •Data Quality Checks: Validate completeness, accuracy, consistency
- •SCD Type 2: Track historical changes in dimensions
- •Surrogate Keys: Use integer keys for better join performance
- •Audit Tables: Log all ETL runs with row counts and timestamps