Back to Guides
Analytics

Healthcare Analytics Guide

Complete guide to healthcare data warehousing, clinical analytics, population health metrics, and business intelligence for data-driven healthcare organizations.

Healthcare Analytics Overview

Healthcare analytics transforms clinical, operational, and financial data into actionable insights. Modern healthcare organizations use analytics for population health management, quality improvement, cost reduction, and value-based care initiatives.

A healthcare data warehouse integrates data from EHRs, claims systems, labs, pharmacies, and external sources to create a single source of truth for reporting and analytics.

Healthcare Data Warehouse Architecture

-- Dimensional Model (Star Schema)

-- Dimension: Patient
CREATE TABLE dim_patient (
  patient_key         SERIAL PRIMARY KEY,
  patient_id          UUID UNIQUE NOT NULL,
  mrn                 VARCHAR(50),
  age                 INTEGER,
  age_group           VARCHAR(20),                     -- 0-17, 18-44, 45-64, 65+
  gender              VARCHAR(20),
  race                VARCHAR(100),
  ethnicity           VARCHAR(100),
  zip_code            VARCHAR(10),
  county              VARCHAR(100),
  state               VARCHAR(2),
  
  -- SCD Type 2 fields
  effective_date      DATE NOT NULL,
  expiration_date     DATE,
  is_current          BOOLEAN DEFAULT TRUE,
  
  loaded_at           TIMESTAMP DEFAULT NOW()
);

-- Dimension: Provider
CREATE TABLE dim_provider (
  provider_key        SERIAL PRIMARY KEY,
  provider_id         UUID UNIQUE NOT NULL,
  npi                 VARCHAR(10),
  provider_name       VARCHAR(255),
  specialty           VARCHAR(100),
  specialty_group     VARCHAR(100),
  organization        VARCHAR(255),
  
  effective_date      DATE NOT NULL,
  expiration_date     DATE,
  is_current          BOOLEAN DEFAULT TRUE,
  
  loaded_at           TIMESTAMP DEFAULT NOW()
);

-- Dimension: Diagnosis
CREATE TABLE dim_diagnosis (
  diagnosis_key       SERIAL PRIMARY KEY,
  icd10_code          VARCHAR(10) NOT NULL,
  icd10_description   VARCHAR(500),
  icd10_category      VARCHAR(100),
  chronic_flag        BOOLEAN,
  
  loaded_at           TIMESTAMP DEFAULT NOW()
);

-- Dimension: Date
CREATE TABLE dim_date (
  date_key            INTEGER PRIMARY KEY,             -- YYYYMMDD format
  full_date           DATE UNIQUE NOT NULL,
  year                INTEGER,
  quarter             INTEGER,
  month               INTEGER,
  month_name          VARCHAR(20),
  day                 INTEGER,
  day_of_week         INTEGER,
  day_name            VARCHAR(20),
  week_of_year        INTEGER,
  is_weekend          BOOLEAN,
  is_holiday          BOOLEAN,
  fiscal_year         INTEGER,
  fiscal_quarter      INTEGER
);

-- Fact Table: Encounters
CREATE TABLE fact_encounter (
  encounter_key       SERIAL PRIMARY KEY,
  
  -- Foreign Keys (Dimensions)
  patient_key         INTEGER REFERENCES dim_patient(patient_key),
  provider_key        INTEGER REFERENCES dim_provider(provider_key),
  admission_date_key  INTEGER REFERENCES dim_date(date_key),
  discharge_date_key  INTEGER REFERENCES dim_date(date_key),
  
  -- Degenerate Dimensions
  encounter_id        UUID NOT NULL,
  encounter_type      VARCHAR(100),
  visit_type          VARCHAR(100),
  
  -- Measures
  length_of_stay      INTEGER,
  total_charges       DECIMAL(10,2),
  readmission_30day   BOOLEAN,
  er_visit            BOOLEAN,
  
  loaded_at           TIMESTAMP DEFAULT NOW()
);

CREATE INDEX idx_fact_enc_patient ON fact_encounter(patient_key);
CREATE INDEX idx_fact_enc_provider ON fact_encounter(provider_key);
CREATE INDEX idx_fact_enc_admit_date ON fact_encounter(admission_date_key);

-- Fact Table: Quality Measures
CREATE TABLE fact_quality_measure (
  measure_key         SERIAL PRIMARY KEY,
  
  patient_key         INTEGER REFERENCES dim_patient(patient_key),
  provider_key        INTEGER REFERENCES dim_provider(provider_key),
  measurement_date_key INTEGER REFERENCES dim_date(date_key),
  
  -- Measure Details
  measure_name        VARCHAR(100),                    -- HEDIS, CMS Star, etc.
  measure_category    VARCHAR(100),
  
  -- Results
  numerator_flag      BOOLEAN,                         -- Met measure?
  denominator_flag    BOOLEAN,                         -- In measure population?
  exclusion_flag      BOOLEAN,
  
  loaded_at           TIMESTAMP DEFAULT NOW()
);

Key Healthcare Metrics & KPIs

Clinical Quality Metrics

  • HEDIS Measures: Diabetes HbA1c control, breast cancer screening
  • Hospital-Acquired Conditions: C. diff, CAUTI, CLABSI, falls
  • Readmission Rates: 30-day all-cause, condition-specific
  • Mortality Rates: Risk-adjusted mortality by condition

Population Health Metrics

  • Risk Stratification: HCC risk scores, care gaps
  • Chronic Disease Management: Diabetes, CHF, COPD prevalence & control
  • Preventive Care: Screening rates, immunization coverage
  • Social Determinants: SDOH factors, community health metrics

Operational Metrics

  • Utilization: Bed occupancy, OR utilization, ER wait times
  • Patient Flow: Length of stay, discharge delays, throughput
  • Provider Productivity: RVUs, encounters per day, panel size
  • Patient Experience: HCAHPS scores, patient satisfaction

Analytics SQL Examples

30-Day Readmission Rate

WITH readmissions AS (
  SELECT 
    e1.patient_key,
    e1.discharge_date_key,
    CASE WHEN EXISTS (
      SELECT 1 FROM fact_encounter e2
      WHERE e2.patient_key = e1.patient_key
        AND e2.admission_date_key BETWEEN e1.discharge_date_key AND e1.discharge_date_key + 30
        AND e2.encounter_key != e1.encounter_key
    ) THEN 1 ELSE 0 END as readmitted_30day
  FROM fact_encounter e1
  WHERE e1.encounter_type = 'inpatient'
    AND e1.discharge_date_key >= 20240101
)
SELECT 
  COUNT(*) as total_discharges,
  SUM(readmitted_30day) as readmissions,
  ROUND(100.0 * SUM(readmitted_30day) / COUNT(*), 2) as readmission_rate
FROM readmissions;

Diabetes HbA1c Control Rate (HEDIS)

SELECT 
  dp.specialty,
  COUNT(DISTINCT fq.patient_key) as diabetic_patients,
  SUM(CASE WHEN fq.numerator_flag THEN 1 ELSE 0 END) as controlled_patients,
  ROUND(100.0 * SUM(CASE WHEN fq.numerator_flag THEN 1 ELSE 0 END) / 
    COUNT(DISTINCT fq.patient_key), 2) as control_rate
FROM fact_quality_measure fq
JOIN dim_provider dp ON fq.provider_key = dp.provider_key
WHERE fq.measure_name = 'Diabetes HbA1c Control'
  AND fq.denominator_flag = TRUE
  AND dp.is_current = TRUE
GROUP BY dp.specialty
ORDER BY control_rate DESC;

High-Risk Patient Cohort Analysis

SELECT 
  p.age_group,
  COUNT(DISTINCT e.patient_key) as patient_count,
  AVG(e.length_of_stay) as avg_los,
  SUM(e.total_charges) as total_charges,
  AVG(e.total_charges) as avg_charges,
  SUM(CASE WHEN e.readmission_30day THEN 1 ELSE 0 END) as readmissions
FROM fact_encounter e
JOIN dim_patient p ON e.patient_key = p.patient_key
WHERE p.is_current = TRUE
  AND e.encounter_type = 'inpatient'
  AND e.admission_date_key >= 20240101
GROUP BY p.age_group
ORDER BY avg_charges DESC;

ETL Best Practices

  • Incremental Loading: Load only changed data using CDC or timestamps
  • Data Quality Checks: Validate completeness, accuracy, consistency
  • SCD Type 2: Track historical changes in dimensions
  • Surrogate Keys: Use integer keys for better join performance
  • Audit Tables: Log all ETL runs with row counts and timestamps

Related Healthcare Guides