Data Architecture

Event-Driven Architecture for Healthcare Data Pipelines (Kafka + FHIR)

Real-time clinical data demands event-driven architecture. ADT feeds, lab results, and prior auth events cannot wait for a nightly batch. Here is how to design Kafka-based pipelines for FHIR-native healthcare data.

mdatool Team·April 21, 2026·9 min read
event-driven architecture · Kafka · FHIR · healthcare data pipeline · real-time · HL7

Introduction

Nightly batch is the wrong architecture for real-time clinical data. An ADT (Admit-Discharge-Transfer) event needs to trigger a care management alert within minutes — not appear in tomorrow morning's data warehouse load. A prior authorization approval needs to update the claims adjudication system in real time — not through a file drop that runs at 2 AM. The clinical data that drives care coordination, risk management, and real-time quality monitoring is fundamentally event-driven. Your data pipeline architecture should be too.

This guide covers the design of Kafka-based event-driven pipelines for healthcare, with specific patterns for FHIR R4 resources, HL7 v2 messages, and the schema governance that keeps clinical event streams reliable over time.


Why Event-Driven Matters for Healthcare

Healthcare generates events — not records. A patient encounter is not a row that gets inserted once. It is a stream of events: admit, transfer, discharge, lab ordered, result received, diagnosis assigned, medication prescribed, prior auth requested, prior auth approved, claim submitted, claim adjudicated. Each event has a timestamp, a source, and a downstream consumer.

Batch processing collapses this event stream into a point-in-time snapshot. Useful for analytics. Insufficient for:

  • Care management alerts: A care manager needs to know when a high-risk member is admitted to the ED — not 12 hours later.
  • Real-time eligibility: A provider portal checking eligibility at point of care needs sub-second response, not a nightly sync.
  • Prior authorization: CMS's Interoperability and Prior Authorization Final Rule (CMS-0057-F) requires prior auth decisions within 72 hours for expedited requests and seven calendar days for standard ones. Real-time PA event processing is no longer optional.
  • Fraud detection: Claims fraud patterns are detectable in near-real-time if the data pipeline supports it.

Kafka Topic Design for Healthcare

Kafka topics in healthcare should be designed around FHIR resource types or HL7 message types — not around source systems. A topic per source system (epic-adt, cerner-labs, athena-encounters) creates a consumer nightmare when the same consumer needs all encounters regardless of source.

# FHIR resource-based topics (source-agnostic)
healthcare.fhir.patient
healthcare.fhir.encounter
healthcare.fhir.condition
healthcare.fhir.observation       # lab results, vitals
healthcare.fhir.medicationrequest
healthcare.fhir.claim
healthcare.fhir.explanationofbenefit
healthcare.fhir.coverageeligibilityrequest
healthcare.fhir.task              # prior auth, referrals

# HL7 v2 topics (by message type)
healthcare.hl7v2.adt              # ADT^A01 (admit), ADT^A03 (discharge), ADT^A08 (update)
healthcare.hl7v2.oru              # ORU^R01 lab results
healthcare.hl7v2.orm              # ORM^O01 order messages

# Domain event topics (business-level)
healthcare.events.member-enrolled
healthcare.events.claim-adjudicated
healthcare.events.prior-auth-decision
healthcare.events.risk-gap-identified

Use partitioning by patient ID (or member ID) to ensure that all events for a given patient land on the same partition. This is critical for stateful consumers that need to reconstruct patient-level event sequences.

# Kafka producer with patient-based partitioning
import json
from datetime import datetime, timezone

from confluent_kafka import Producer

producer = Producer({'bootstrap.servers': 'kafka-broker:9092'})

def publish_fhir_event(resource_type: str, fhir_resource: dict):
    topic = f"healthcare.fhir.{resource_type.lower()}"
    patient_id = extract_patient_id(fhir_resource)  # implementation-specific

    producer.produce(
        topic=topic,
        key=patient_id.encode('utf-8'),     # partition key = patient ID
        value=json.dumps(fhir_resource).encode('utf-8'),
        headers={
            'content-type': 'application/fhir+json',
            'fhir-version': 'R4',
            'source-system': 'EPIC',
            'event-time': datetime.now(timezone.utc).isoformat()
        }
    )
    producer.poll(0)  # serve delivery callbacks without blocking

# Call producer.flush() once at shutdown -- flushing after every message
# destroys throughput by defeating Kafka's batching.

Schema Registry for HL7/FHIR

Schema management is the silent failure mode of healthcare event streams. Without a schema registry, a producer changes the FHIR Patient resource structure (adds a field, removes an extension) and downstream consumers break silently.

Use Confluent Schema Registry (or AWS Glue Schema Registry, or Apicurio) with Avro or JSON Schema formats.

FHIR Resource Schema in Avro

{
  "type": "record",
  "name": "FHIRPatient",
  "namespace": "com.mdatool.healthcare.fhir",
  "fields": [
    {"name": "resourceType", "type": "string"},
    {"name": "id", "type": "string"},
    {"name": "meta", "type": {
      "type": "record",
      "name": "Meta",
      "fields": [
        {"name": "versionId", "type": ["null", "string"], "default": null},
        {"name": "lastUpdated", "type": "string"}
      ]
    }},
    {"name": "identifier", "type": {
      "type": "array",
      "items": {
        "type": "record",
        "name": "Identifier",
        "fields": [
          {"name": "system", "type": "string"},
          {"name": "value", "type": "string"}
        ]
      }
    }},
    {"name": "name", "type": {
      "type": "array",
      "items": {
        "type": "record",
        "name": "HumanName",
        "fields": [
          {"name": "family", "type": "string"},
          {"name": "given", "type": {"type": "array", "items": "string"}}
        ]
      }
    }},
    {"name": "birthDate", "type": ["null", "string"], "default": null},
    {"name": "gender", "type": ["null", "string"], "default": null}
  ]
}

Registering this schema under a compatibility mode (BACKWARD is the registry default; FULL is the safest choice for long-lived clinical streams) makes breaking changes fail at registration time: under FULL, fields can only be added or removed if they carry defaults, and renaming a field is treated as a remove plus an add.
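The intuition behind the backward-compatibility rule can be shown with a toy checker. This is a deliberate simplification of Avro's real schema-resolution rules (which also handle type promotion, aliases, and unions): a reader on the new schema can decode old records only if every field it adds carries a default.

```python
import json

def backward_compatible(old_schema: str, new_schema: str) -> bool:
    """Toy check: a reader using new_schema can read data written with
    old_schema only if every field that exists solely in new_schema has
    a default. (Real Avro resolution covers far more cases.)"""
    old_fields = {f["name"] for f in json.loads(old_schema)["fields"]}
    new_fields = json.loads(new_schema)["fields"]
    return all(f["name"] in old_fields or "default" in f for f in new_fields)

v1 = json.dumps({"type": "record", "name": "FHIRPatient", "fields": [
    {"name": "id", "type": "string"}]})

# Adding an optional, defaulted field: still backward compatible.
v2_ok = json.dumps({"type": "record", "name": "FHIRPatient", "fields": [
    {"name": "id", "type": "string"},
    {"name": "birthDate", "type": ["null", "string"], "default": None}]})

# Adding a required field with no default: old records become unreadable.
v2_bad = json.dumps({"type": "record", "name": "FHIRPatient", "fields": [
    {"name": "id", "type": "string"},
    {"name": "birthDate", "type": "string"}]})
```

The registry performs this kind of check on every `register_schema` call, which is exactly what turns a silent consumer break into a loud producer-side failure.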


HL7 v2 to FHIR Transformation Pattern

Most existing healthcare systems still produce HL7 v2 messages, not FHIR. The event pipeline must bridge both.

HL7 v2 Source (MLLP) → Kafka Connect MLLP Source → healthcare.hl7v2.adt topic
                                                              ↓
                                              Kafka Streams transformer
                                           (HL7v2 → FHIR R4 mapping)
                                                              ↓
                                              healthcare.fhir.encounter topic
                                                              ↓
                                    [Consumer: Care Management Alerting]
                                    [Consumer: Data Warehouse (Snowflake sink)]
                                    [Consumer: FHIR Store (GCP Healthcare API)]

The HAPI FHIR library (Java) provides HL7 v2 → FHIR R4 conversion utilities. For Python-based pipelines, use the HL7apy or HL7-FHIR-R4-Core libraries.
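For intuition, the core of the transformation step can be sketched in plain Python. This is a hand-rolled parse of pipe-delimited segments for illustration only, not a substitute for those libraries; the field positions assume a typical ADT^A01 layout, and segments are joined with newlines here for readability (real HL7 v2 uses carriage returns).

```python
def adt_to_fhir_encounter(hl7_message: str) -> dict:
    """Map a minimal ADT^A01 message to a FHIR R4 Encounter (illustrative subset)."""
    segments = {}
    for line in hl7_message.strip().split("\n"):
        fields = line.split("|")
        segments[fields[0]] = fields  # index segments by name: MSH, PID, PV1...

    pid = segments["PID"]
    pv1 = segments["PV1"]
    return {
        "resourceType": "Encounter",
        "status": "in-progress",
        "class": {
            "system": "http://terminology.hl7.org/CodeSystem/v3-ActCode",
            # PV1-2 patient class: I = inpatient, otherwise treat as ambulatory
            "code": "IMP" if pv1[2] == "I" else "AMB",
        },
        # PID-3 patient identifier, first component before the ^ separators
        "subject": {"reference": f"Patient/{pid[3].split('^')[0]}"},
    }

adt = ("MSH|^~\\&|EPIC|HOSP|||20260421||ADT^A01|MSG001|P|2.5\n"
       "PID|1||12345^^^MRN||DOE^JANE\n"
       "PV1|1|I|ICU^01^A")
encounter = adt_to_fhir_encounter(adt)
```

In the real pipeline this logic lives in the Kafka Streams transformer, keyed by the same patient ID as the source topic so per-patient ordering survives the hop.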

Use the HL7 Parser to inspect raw HL7 v2 messages and validate their structure before they enter the Kafka pipeline — catching malformed ADT or ORU messages before they propagate into downstream FHIR stores.


Consumer Patterns

Care Management Alerting Consumer

Stateless consumer on healthcare.fhir.encounter and healthcare.hl7v2.adt. Filters for admit events (ADT^A01) for members flagged as high-risk. Publishes alerts to a separate healthcare.events.care-alert topic consumed by the care management workflow system.
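The filtering core of this consumer is a pure function, which keeps it trivially testable. A sketch follows; the event field names and the high-risk member set are assumptions, and the surrounding Kafka consumer poll loop is omitted.

```python
from typing import Optional

def care_alert(adt_event: dict, high_risk_members: set) -> Optional[dict]:
    """Return a care-alert event for a high-risk member admit, else None."""
    if adt_event.get("message_type") != "ADT^A01":  # admits only
        return None
    member_id = adt_event.get("member_id")
    if member_id not in high_risk_members:
        return None
    return {
        "event_type": "care-alert",
        "member_id": member_id,
        "reason": "high-risk member admitted",
        "source_facility": adt_event.get("facility"),
    }

# In the consumer loop, non-None results are produced to
# healthcare.events.care-alert for the care management workflow system.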

Data Warehouse Sink Consumer

Uses Kafka Connect Snowflake Sink Connector (or BigQuery Sink Connector). Writes FHIR resource events to bronze tables. Schema-on-write using the registered Avro schema. Idempotent writes using the FHIR resource ID as the deduplication key.
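A sketch of the sink configuration, with property names following the Snowflake Kafka connector documentation; the account URL, credentials reference, database/schema names, and buffer sizes are placeholders to adapt to your deployment.

```json
{
  "name": "fhir-snowflake-sink",
  "config": {
    "connector.class": "com.snowflake.kafka.connector.SnowflakeSinkConnector",
    "topics": "healthcare.fhir.encounter,healthcare.fhir.observation",
    "snowflake.url.name": "myaccount.snowflakecomputing.com:443",
    "snowflake.user.name": "KAFKA_CONNECT",
    "snowflake.private.key": "${file:/etc/kafka/secrets:snowflake_key}",
    "snowflake.database.name": "CLINICAL_BRONZE",
    "snowflake.schema.name": "FHIR_EVENTS",
    "key.converter": "org.apache.kafka.connect.storage.StringConverter",
    "value.converter": "com.snowflake.kafka.connector.records.SnowflakeAvroConverter",
    "value.converter.schema.registry.url": "http://schema-registry:8081",
    "buffer.count.records": "10000",
    "buffer.flush.time": "60"
  }
}
```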

FHIR Store Sync Consumer

Consumes the healthcare.fhir.* topics and writes resources to the GCP Cloud Healthcare API or Azure FHIR Service. This keeps the managed FHIR store synchronized with the event stream — providing both event-driven consumers and REST-based FHIR API access from the same source of truth.
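This consumer boils down to an upsert per resource: where the FHIR store allows update-as-create (the GCP Healthcare API does, via its `enableUpdateCreate` setting), a PUT to [base]/[type]/[id] updates the resource or creates it at that ID, so replaying the topic is idempotent. A sketch of the request construction; the base URL is a placeholder and the HTTP call itself is omitted.

```python
def build_fhir_upsert(base_url: str, resource: dict) -> tuple:
    """Build (method, url, body) for an idempotent FHIR upsert.
    PUT on [base]/[type]/[id] replaces the resource or creates it at
    that id, so reprocessing the same Kafka event is safe."""
    rtype, rid = resource["resourceType"], resource["id"]
    return ("PUT", f"{base_url}/{rtype}/{rid}", resource)

method, url, body = build_fhir_upsert(
    "https://healthcare.googleapis.com/v1/projects/p/locations/l"
    "/datasets/d/fhirStores/s/fhir",
    {"resourceType": "Encounter", "id": "enc-123", "status": "finished"},
)
```

Because the topic is partitioned by patient ID, a single consumer instance sees each patient's events in order, so the store never receives an older resource version after a newer one from the same partition.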


Key Takeaways

  • Design Kafka topics by FHIR resource type or HL7 message type, not by source system. Consumers care about data type, not data origin.
  • Partition by patient ID to ensure per-patient event ordering for stateful consumers.
  • A Schema Registry is non-negotiable for production healthcare event pipelines. FHIR structure changes without schema governance break consumers silently.
  • HL7 v2 to FHIR transformation should happen in the Kafka Streams tier, not in every consumer. One transformation, many consumers.
  • Validate HL7 message structure before publishing to Kafka using the HL7 Parser to catch malformed messages at the ingestion boundary.

mdatool Team

The mdatool team builds free engineering tools for healthcare data architects, analysts, and engineers working across payer, provider, and life sciences data.

Ready to improve your data architecture?

Free tools for DDL conversion, SQL analysis, naming standards, and more.

Get Started Free