Domain

Technology

Systems, databases, interfaces and data standards

About Technology Data

The technology domain covers healthcare IT systems, data standards, interfaces, and interoperability frameworks. Key standards include HL7 v2 for clinical messaging, HL7 FHIR for modern API-based exchange, DICOM for medical imaging, and X12 EDI for administrative transactions.

HIPAA Security Rule requirements drive architecture decisions around encryption, access controls, and audit logging. Modern healthcare data platforms include Snowflake, Databricks, BigQuery, and Azure Health Data Services. Healthcare data teams work with integration engines, API gateways, and master patient index systems.

428 technology terms

Centers for Medicare and Medicaid Services (CMS)

The Centers for Medicare and Medicaid Services (CMS) is the federal agency within the U.S. Department of Health and Human Services (HHS) responsible for administering the Medicare, Medicaid, Children's Health Insurance Program (CHIP), and Health Insurance Marketplace programs. CMS covers more than 160 million Americans, making it the largest payer of healthcare services in the United States and one of the largest in the world. In addition to program administration, CMS is the primary regulatory authority for healthcare data standards, interoperability requirements, quality measurement programs, and payment policy — including the Inpatient Prospective Payment System, the Physician Fee Schedule, and Medicare Advantage capitation rates. CMS data programs and regulatory actions directly shape healthcare data engineering work across the industry. The NPPES NPI registry, published monthly by CMS, is the authoritative source for provider identities. CMS publishes the ICD-10-CM and ICD-10-PCS code sets updated annually. The CMS-HCC risk adjustment model and its annual calibration updates define how RAF scores are calculated for Medicare Advantage. The CMS Interoperability and Patient Access Final Rule mandates FHIR R4 API access for Medicare Advantage plans. The Quality Payment Program (QPP) and Merit-based Incentive Payment System (MIPS) govern how physician quality data is collected and used to adjust Medicare fee-for-service payments. Understanding which CMS program a data element belongs to and which CMS regulatory framework governs it is prerequisite knowledge for any healthcare data engineer. 
Healthcare data engineers interact with CMS-published data in multiple ways: loading the NPPES dissemination file for provider master data, downloading the ICD-to-HCC crosswalk for risk adjustment pipelines, using the CMS Physician Fee Schedule RVU files for allowed amount benchmarking, consuming CMS Medicare claims research data files (such as the 100% Medicare Limited Data Set) for population analytics, and building FHIR-compliant APIs to satisfy CMS interoperability mandates. CMS also publishes the Encounter Data Processing System (EDPS) data dictionary and the Risk Adjustment Processing System (RAPS) specifications that govern how Medicare Advantage plans submit encounter and diagnosis data. Key CMS portals for data engineers include the CMS Data Navigator, the Quality Payment Program API, and the NPPES NPI Registry API.
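One check that is often applied when loading NPPES provider data is the NPI check digit, which follows the Luhn formula computed over the 10-digit NPI prefixed with 80840. The sketch below is a minimal illustration of that check only; the function name is hypothetical, and the sample value is a synthetic NPI, not a real provider.

```python
def is_valid_npi(npi: str) -> bool:
    """Return True if a 10-digit NPI passes the Luhn check with the 80840 prefix."""
    if len(npi) != 10 or not npi.isdigit():
        return False
    digits = [int(ch) for ch in "80840" + npi]
    total = 0
    # Walk right to left and double every second digit, starting with the
    # digit immediately to the left of the check digit.
    for position, digit in enumerate(reversed(digits)):
        if position % 2 == 1:
            digit *= 2
            if digit > 9:
                digit -= 9
        total += digit
    return total % 10 == 0


print(is_valid_npi("1234567893"))  # synthetic NPI with a correct check digit
print(is_valid_npi("1234567890"))  # same digits with a wrong check digit
```

A passing check digit only shows that the number is well formed; whether the NPI is actually assigned still has to be confirmed against the NPPES data itself.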

Common Data Model (CDM)

A standardized schema enabling consistent representation of healthcare data across disparate EHR, claims, and pharmacy systems to support federated analytics and research. CDMs such as OMOP, PCORnet, and i2b2 are used in data warehouse implementations to harmonize terminology, patient identifiers, and clinical events for cross-system reporting.

Data Logger (data_logr)

A device or software component used in healthcare settings to automatically record environmental or physiological data such as temperature for medication storage or patient vital signs over time. Data logger output is integrated into EHR, pharmacy cold chain management, and clinical trial data systems for compliance and audit purposes.

Data Loss Prevention (DLP)

A set of policies, tools, and technologies implemented in healthcare data environments to detect and prevent unauthorized transmission or exposure of protected health information across EHR, claims, and PBM systems. DLP solutions enforce HIPAA and organizational data governance requirements by monitoring data flows at endpoints, networks, and cloud platforms.

Data Use Agreement (DUA)

A legally binding contract governing the permitted uses and disclosures of a limited data set that a covered entity shares with a recipient such as a researcher or another institution, satisfying HIPAA Privacy Rule requirements under 45 CFR 164.514(e). In healthcare data engineering, DUAs define field-level restrictions, permissible analytics, and security controls applied to limited data sets used for research, public health, or health care operations. Organizations often apply similar agreements to de-identified data as well, although HIPAA does not require one in that case.

Dimension Table (Dim)

A reference or lookup table in a healthcare data warehouse star or snowflake schema that stores descriptive attributes such as provider, member, diagnosis, or facility details. Used by data engineers to join against fact tables in EHR, claims, and enrollment data models to enable slicing and filtering in downstream reporting and analytics tools.
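As a rough illustration of the fact-to-dimension join described above, the sketch below builds a tiny star schema in an in-memory SQLite database and aggregates claims by a provider attribute. The table names, column names, and rows are all hypothetical.

```python
import sqlite3


def claims_by_specialty() -> list:
    """Join a claims fact table to a provider dimension and aggregate by specialty."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE dim_provider (
            provider_key INTEGER PRIMARY KEY,
            provider_name TEXT,
            specialty TEXT
        );
        CREATE TABLE fact_claim (
            claim_id TEXT,
            provider_key INTEGER REFERENCES dim_provider (provider_key),
            billed_amount REAL
        );
        INSERT INTO dim_provider VALUES
            (1, 'Example Heart Clinic', 'Cardiology'),
            (2, 'Example Family Practice', 'Family Medicine');
        INSERT INTO fact_claim VALUES
            ('C1', 1, 250.0), ('C2', 1, 100.0), ('C3', 2, 80.0);
        """
    )
    # The fact table holds measures and surrogate keys; the descriptive
    # attribute used for slicing (specialty) lives in the dimension.
    rows = conn.execute(
        """
        SELECT d.specialty, COUNT(*) AS claim_count, SUM(f.billed_amount) AS billed
        FROM fact_claim AS f
        JOIN dim_provider AS d ON d.provider_key = f.provider_key
        GROUP BY d.specialty
        ORDER BY d.specialty
        """
    ).fetchall()
    conn.close()
    return rows


print(claims_by_specialty())
```

The same pattern applies to member, diagnosis, or facility dimensions: the fact table keeps only the key, and reporting tools filter on attributes reached through the join.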

Electronic Data Interchange (EDI)

Electronic Data Interchange (EDI) in healthcare refers to the computer-to-computer exchange of standardized administrative and financial transactions between health plans, providers, and clearinghouses using formats mandated by HIPAA. The ASC X12 standards body defines the transaction sets used in healthcare EDI: 837P (professional claims), 837I (institutional claims), 837D (dental claims), 270/271 (eligibility inquiry and response), 276/277 (claim status inquiry and response), 278 (prior authorization), 834 (benefit enrollment), and 835 (remittance advice). All covered entities under HIPAA are required to use these standard EDI formats for electronic administrative transactions, making EDI the backbone of the US healthcare payment infrastructure. Understanding EDI transaction structure is fundamental for healthcare data engineers because raw claims data, eligibility data, and remittance data arrive in these formats before any transformation into structured analytical schemas. An 837P transaction is a hierarchical text file using segment identifiers, element separators, and loop structures to encode the bill for a professional claim — claim header information appears in the CLM segment, diagnosis codes in HI segments, and service line details in SV1 segments. Parsing errors or misinterpretation of EDI segment logic are a common source of data quality problems in raw claims ingestion pipelines, particularly around multi-level loop structures (loop 2000A for billing provider, 2000B for subscriber, 2000C for patient, 2300 for claim, 2400 for service line). Healthcare data engineers build EDI parsing pipelines using specialized EDI parsing libraries or custom parsers that transform raw X12 transaction files into structured staging tables, then apply business logic to normalize and validate the data before loading into the claims data warehouse. 
Key engineering tasks include handling ISA/GS envelope metadata for trading partner identification, parsing NM1 provider segments to extract NPI and TIN values, extracting CLM01 (claim ID), CLM02 (billed amount), and CLM05 (place of service/bill type), and mapping HI diagnosis segment qualifiers to distinguish principal, admitting, and other diagnosis codes. The 835 remittance transaction closes the payment loop, and reconciling 835 CAS adjustment reason codes against 837 billed amounts is essential for revenue cycle analytics and underpayment identification.
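The extraction tasks above can be sketched with plain string splitting. This is a minimal illustration, not a production parser: it assumes the common `*`, `:`, and `~` delimiters (a real pipeline reads them from the ISA envelope), it ignores loop context, and the sample fragment is synthetic and omits the ISA/GS/ST envelope entirely.

```python
from decimal import Decimal

# Assumed delimiters; real files declare them in the ISA segment.
SEGMENT_TERMINATOR = "~"
ELEMENT_SEPARATOR = "*"
COMPONENT_SEPARATOR = ":"

# ICD-10-CM diagnosis qualifiers used in HI segments.
HI_QUALIFIERS = {"ABK": "principal", "ABJ": "admitting", "ABF": "other"}

# Synthetic 837P fragment: billing provider, claim, diagnoses, one service line.
SAMPLE_837P = """NM1*85*2*EXAMPLE CLINIC*****XX*1234567893~
CLM*CLAIM001*150.00***11:B:1*Y*A*Y*Y~
HI*ABK:E119*ABF:I10~
LX*1~
SV1*HC:99213*150.00*UN*1***1~
"""


def parse_claim(raw: str) -> dict:
    """Extract a few claim-level fields from a single-claim X12 fragment."""
    claim = {"diagnoses": [], "service_lines": []}
    for segment in raw.split(SEGMENT_TERMINATOR):
        segment = segment.strip()
        if not segment:
            continue
        el = segment.split(ELEMENT_SEPARATOR)
        tag = el[0]
        if tag == "NM1" and len(el) > 9 and el[1] == "85" and el[8] == "XX":
            claim["billing_npi"] = el[9]  # NM109 when NM108 is XX (NPI)
        elif tag == "CLM":
            claim["claim_id"] = el[1]  # CLM01
            claim["billed_amount"] = Decimal(el[2])  # CLM02
            composite = el[5].split(COMPONENT_SEPARATOR)  # CLM05
            claim["place_of_service"] = composite[0]
            claim["frequency_code"] = composite[2]
        elif tag == "HI":
            for item in el[1:]:
                if item:
                    qualifier, code = item.split(COMPONENT_SEPARATOR)[:2]
                    role = HI_QUALIFIERS.get(qualifier, "unknown")
                    claim["diagnoses"].append({"role": role, "code": code})
        elif tag == "SV1":
            procedure = el[1].split(COMPONENT_SEPARATOR)
            claim["service_lines"].append(
                {"procedure_code": procedure[1], "charge": Decimal(el[2])}
            )
    return claim


print(parse_claim(SAMPLE_837P))
```

Amounts are parsed as `Decimal` rather than `float` so that later reconciliation against 835 payment and adjustment amounts is not affected by binary rounding.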

Extensible Markup Language (XML)

A structured, hierarchical data format widely used in healthcare data systems for exchanging clinical, administrative, and financial information. XML underlies HL7 CDA documents, the XML serialization of FHIR resources, and the SOAP envelopes that some trading partners use to wrap X12 EDI transactions (the X12 transactions themselves are delimited text, not XML). ETL pipelines that process XML payloads for claims, eligibility, and enrollment data need schema validation and parsing logic.

Fast Healthcare Interoperability Resources (FHIR)

Fast Healthcare Interoperability Resources (FHIR) is a modern healthcare data exchange standard developed and maintained by Health Level Seven International (HL7). FHIR defines a collection of modular data objects called "resources" (such as Patient, Observation, Condition, Encounter, and MedicationRequest), each with a standardized JSON or XML schema and a RESTful API access pattern. Unlike earlier HL7 standards, FHIR was designed from the ground up to work with web technologies, making it accessible to developers using standard HTTP clients and JSON parsers without specialized EDI tooling.
FHIR is central to healthcare interoperability regulation in the United States. The CMS Interoperability and Patient Access Final Rule (CMS-9115-F) requires Medicare Advantage, Medicaid, CHIP, and federally-facilitated exchange plans to offer FHIR R4 API access, and the ONC 21st Century Cures Act Final Rule requires certified health IT to expose standardized FHIR R4 APIs. Payers must expose patient claims data, clinical data, and formulary information through FHIR APIs, making FHIR proficiency essential for any data engineer working at a health plan or healthcare IT vendor.
Healthcare data engineers encounter FHIR in two primary contexts: ingesting FHIR-formatted data from EHRs and payer APIs into analytical platforms, and building FHIR-compliant APIs to expose data to authorized applications. In Snowflake or Databricks, FHIR resources typically arrive as semi-structured JSON stored in VARIANT columns, requiring transformation pipelines that flatten nested elements such as coding arrays, extension blocks, and contained resources into normalized analytical tables. Key engineering tasks include mapping FHIR Observation resources to laboratory result fact tables, transforming FHIR Condition resources into diagnosis dimension tables, and handling the FHIR reference pattern (e.g., "Patient/12345") to resolve cross-resource foreign keys.
Related standards include ICD-10-CM for diagnosis coding within FHIR Condition resources, LOINC for FHIR Observation codes, SNOMED CT for clinical terminology, and US Core Implementation Guide profiles that constrain FHIR resources for the US healthcare market.
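The flattening and reference-resolution steps can be sketched in a few lines. The example below is a minimal illustration using a synthetic Condition resource; the output column names are hypothetical, and only relative references of the form "Patient/12345" are handled (absolute URLs and contained "#id" references would need extra logic).

```python
import json

# Synthetic FHIR R4 Condition resource with two codings for the same concept.
CONDITION_JSON = """
{
  "resourceType": "Condition",
  "id": "cond-1",
  "subject": {"reference": "Patient/12345"},
  "code": {
    "coding": [
      {"system": "http://hl7.org/fhir/sid/icd-10-cm", "code": "E11.9",
       "display": "Type 2 diabetes mellitus without complications"},
      {"system": "http://snomed.info/sct", "code": "44054006",
       "display": "Diabetes mellitus type 2"}
    ]
  },
  "onsetDateTime": "2023-04-01"
}
"""


def split_reference(reference: str) -> tuple:
    """Split a relative reference such as 'Patient/12345' into (type, id)."""
    resource_type, _, resource_id = reference.partition("/")
    return resource_type, resource_id


def flatten_condition(resource: dict) -> list:
    """Return one flat row per coding in Condition.code.coding."""
    _, patient_id = split_reference(resource["subject"]["reference"])
    rows = []
    for coding in resource.get("code", {}).get("coding", []):
        rows.append(
            {
                "condition_id": resource["id"],
                "patient_id": patient_id,  # foreign key to a patient table
                "code_system": coding.get("system"),
                "code": coding.get("code"),
                "display": coding.get("display"),
                "onset_date": resource.get("onsetDateTime"),
            }
        )
    return rows


for row in flatten_condition(json.loads(CONDITION_JSON)):
    print(row)
```

Emitting one row per coding keeps every code system available downstream; a pipeline that needs exactly one row per condition would instead pick a preferred system at this step.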

Health Insurance Portability and Accountability Act (HIPAA)

The Health Insurance Portability and Accountability Act of 1996 (HIPAA) is the primary federal law governing the privacy, security, and electronic exchange of protected health information (PHI) in the United States. HIPAA established three main rules that directly shape healthcare data architecture: the Privacy Rule, which defines what constitutes PHI and restricts its use and disclosure; the Security Rule, which mandates administrative, physical, and technical safeguards for electronic PHI (ePHI); and the Transaction and Code Sets Rule, which standardizes the EDI formats for claims, eligibility, remittance, and other administrative transactions. Covered entities (health plans, healthcare clearinghouses, and most providers) and their business associates are subject to HIPAA enforcement by the HHS Office for Civil Rights (OCR).
HIPAA compliance is more than a legal obligation: it fundamentally shapes data architecture decisions. Every healthcare data warehouse, pipeline, and API that handles ePHI must implement access controls, encryption at rest and in transit, audit logging, and breach detection procedures that satisfy the Security Rule. The Privacy Rule's minimum necessary standard requires that data systems only expose the PHI fields required for a specific purpose, which drives row-level security and column masking implementations in platforms like Snowflake, Databricks Unity Catalog, and BigQuery authorized views.
Healthcare data engineers implement HIPAA compliance through several technical mechanisms: encryption of ePHI using AES-256 at rest and TLS 1.2+ in transit, role-based access control that limits PHI access to authorized personnel and applications, immutable audit log tables that capture every PHI access event with user, timestamp, and data element accessed, and de-identification pipelines that apply either Safe Harbor (removing all 18 PHI identifier categories) or Expert Determination methods before sharing data for analytics or research.
HIPAA's 18 PHI identifier categories include names, geographic subdivisions smaller than a state, dates directly related to an individual (birth date, admission date, discharge date), phone numbers, Social Security numbers, and medical record numbers. Violations carry civil penalties of up to $1.9 million per violation category per year (a cap that is adjusted annually for inflation), and criminal penalties apply to knowingly obtaining or disclosing PHI in violation of the law.
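A Safe Harbor style transformation can be sketched as a per-record function. This is a deliberately partial illustration under stated assumptions: the field names are hypothetical, only a handful of the 18 categories are covered, birth dates are not handled, and the set of low-population ZIP prefixes must be supplied by the caller. It is not a compliance implementation.

```python
# Hypothetical field names; a real schema will differ.
DIRECT_IDENTIFIER_FIELDS = {"name", "phone", "email", "ssn", "mrn"}
DATE_FIELDS = {"admission_date", "discharge_date"}


def deidentify(record: dict, restricted_zip3: frozenset = frozenset()) -> dict:
    """Apply a few Safe Harbor style rules to one flat record."""
    out = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIER_FIELDS:
            continue  # drop direct identifiers entirely
        if field in DATE_FIELDS:
            # Keep only the year; assumes ISO 8601 strings such as "2024-03-15".
            out[field.replace("_date", "_year")] = int(value[:4])
        elif field == "zip":
            # Keep the first three digits unless the prefix is flagged as
            # covering too small a population, in which case use "000".
            zip3 = value[:3]
            out["zip3"] = "000" if zip3 in restricted_zip3 else zip3
        elif field == "age":
            # Ages over 89 are aggregated into a single 90+ category.
            out["age"] = "90+" if value >= 90 else value
        else:
            out[field] = value
    return out


sample = {
    "name": "Jane Doe",
    "ssn": "000-00-0000",
    "zip": "02139",
    "admission_date": "2024-03-15",
    "age": 93,
    "diagnosis_code": "E11.9",
}
print(deidentify(sample))
```

An allow-list of known-safe fields is generally a safer design than the deny-list used here, because a newly added identifying column is then excluded by default instead of passing through.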

Health Level Seven (HL7)

Health Level Seven International (HL7) is the global standards development organization responsible for the most widely used healthcare data exchange standards in the world. Founded in 1987, HL7 has produced multiple generations of healthcare interoperability standards: HL7 Version 2 (v2), the message-based standard still used in the vast majority of EHR-to-system interfaces; HL7 Version 3 and Clinical Document Architecture (CDA), the XML-based document exchange standards used for structured clinical documents like CCDs and discharge summaries; and FHIR (Fast Healthcare Interoperability Resources), the modern RESTful API-based standard now mandated by CMS and ONC regulations.
HL7 v2 messaging remains the dominant clinical interface standard in US hospitals despite being over 35 years old, primarily because the installed base of v2 interfaces across EHRs, lab systems, radiology systems, and clinical applications is enormous and the standard works reliably for event-driven point-to-point messaging. An HL7 v2 message is pipe-delimited text made up of a series of segments, normally separated by carriage returns. Every message begins with the MSH segment, which contains message metadata (sending system, receiving system, message type, timestamp, message control ID), and the message type determines the payload: ADT messages (A01-A60) carry patient admission, discharge, and transfer events; ORM messages carry laboratory and radiology orders; ORU messages carry result observations; and DFT messages carry detailed financial transactions.
Healthcare data engineers encounter HL7 v2 messages when building integration pipelines between clinical systems and analytical platforms. The primary engineering challenge is that HL7 v2 is highly configurable: different EHR implementations use non-standard segment structures, custom Z-segments, and local code systems that deviate from the base standard, making generic parsers unreliable without site-specific configuration.
Engineers implement HL7 v2 parsers using libraries such as HAPI (Java), hl7apy (Python), or Mirth Connect integration engines, then transform parsed message data into structured staging tables. Key segments for analytical use include PID (patient demographics), PV1 (visit/encounter data), OBX (observation results with LOINC codes and values), DG1 (diagnosis codes), and IN1/IN2 (insurance information). HL7 FHIR represents the next-generation replacement for v2 in new implementations, though v2 will remain operational across legacy systems for decades.
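To make the segment and field structure concrete, the sketch below parses a synthetic ORU message with plain string splitting instead of one of the libraries named above. It assumes the default `|` and `^` delimiters, ignores repetitions and escape sequences, and uses hypothetical output keys, so it illustrates the layout but is not a substitute for a real parser.

```python
# Synthetic ORU^R01 message; HL7 v2 segments are separated by carriage returns.
SAMPLE_ORU = "\r".join(
    [
        r"MSH|^~\&|LAB|HOSP|EDW|HOSP|20240101120000||ORU^R01|MSG00001|P|2.5.1",
        "PID|1||12345^^^HOSP^MR||DOE^JANE||19800101|F",
        "OBX|1|NM|2345-7^Glucose^LN||95|mg/dL|70-99|N|||F",
    ]
)


def parse_oru(raw: str) -> dict:
    """Extract message metadata, patient ID, and observations from one message."""
    result = {"observations": []}
    for line in raw.replace("\n", "\r").split("\r"):
        if not line:
            continue
        fields = line.split("|")
        segment = fields[0]
        if segment == "MSH":
            # MSH-1 is the field separator itself, so after splitting on "|"
            # list index i holds MSH-(i + 1).
            result["message_type"] = fields[8]  # MSH-9
            result["control_id"] = fields[9]  # MSH-10
        elif segment == "PID":
            result["patient_id"] = fields[3].split("^")[0]  # PID-3, first component
            family, given = fields[5].split("^")[:2]  # PID-5
            result["patient_name"] = f"{given} {family}"
        elif segment == "OBX":
            code, text, system = fields[3].split("^")[:3]  # OBX-3
            result["observations"].append(
                {
                    "code": code,
                    "code_system": system,
                    "name": text,
                    "value": fields[5],  # OBX-5
                    "units": fields[6],  # OBX-6
                    "status": fields[11],  # OBX-11
                }
            )
    return result


print(parse_oru(SAMPLE_ORU))
```

The off-by-one in MSH indexing is a frequent source of bugs in hand-written parsers, which is one reason to prefer an established library once site-specific quirks such as Z-segments are involved.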

Identity and Access Management (IAM)

A framework of policies, technologies, and processes used to control user authentication, authorization, and access to healthcare data systems including EHR platforms, data warehouses, and payer portals. IAM systems enforce role-based access controls, support audit logging and compliance with the HIPAA Security Rule, and are critical components of healthcare data governance and breach prevention strategies.

Master Data Management (MDM)

A discipline and set of processes used in healthcare IT to ensure a single, authoritative source of truth for core entities such as members, providers, facilities, and plans across EHR, claims, PBM, and enrollment systems. MDM eliminates duplicate records and reconciles data conflicts across disparate platforms.

NCPDP Version D.0 (D.0)

The NCPDP Telecommunication Standard Version D.0 is the electronic transaction format mandated for retail pharmacy claims adjudication between pharmacies and PBM or payer systems. It defines field-level specifications for claim submission, eligibility inquiry, and claim reversal transactions processed in real time.

Open Reading Frame (ORF)

A continuous stretch of codons in a DNA or RNA sequence that begins with a start codon and ends with a stop codon, representing a potential protein-coding region. In genomic data systems supporting oncology or precision medicine pipelines, ORF annotations are stored in variant databases and bioinformatics platforms to identify therapeutic targets and interpret somatic mutation data.
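The start-codon-to-stop-codon definition above translates into a short scan over the three forward reading frames. The sketch below is a simplified illustration: it assumes a DNA sequence, uses ATG as the only start codon, and does not scan the reverse strand, all of which a real annotation tool would treat more carefully. The function name and the `min_codons` parameter are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def find_orfs(sequence: str, min_codons: int = 2) -> list:
    """Return (start, end, orf_sequence) tuples for forward-strand ORFs.

    start is the 0-based index of the ATG; end is exclusive and includes
    the stop codon. min_codons counts the start and stop codons.
    """
    seq = sequence.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i : i + 3]
            if start is None and codon == "ATG":
                start = i  # open a candidate ORF at the first start codon
            elif start is not None and codon in STOP_CODONS:
                end = i + 3
                if (end - start) // 3 >= min_codons:
                    orfs.append((start, end, seq[start:end]))
                start = None  # look for the next ORF in this frame
    return orfs


print(find_orfs("CCATGAAATAGGG"))
```

A candidate that reaches the end of the sequence without a stop codon is discarded here, matching the definition that an ORF ends with a stop codon.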

Operational Data Store (ODS)

An integrated, subject-oriented database designed for near-real-time operational reporting, sitting architecturally between transactional source systems such as EHR and claims platforms and the enterprise data warehouse. The ODS consolidates current-state member, clinical, and claims data to support daily operational workflows and analytics without impacting source system performance.

Picture Archiving and Communication System (PACS)

Medical imaging technology used in EHR and radiology information systems to store, retrieve, manage, and transmit digital diagnostic images such as X-rays, MRIs, and CT scans. PACS integrates with HL7 and DICOM standards, enabling data engineers to pipeline imaging metadata into clinical analytics, claims, and care coordination platforms.

Process Analytical Technology (PAT)

A pharmaceutical manufacturing framework used in pharmacy and drug supply chain data systems to monitor and control production quality in real time. PAT data is integrated into PBM and drug formulary pipelines to ensure dispensed products meet regulatory compliance and batch integrity standards tracked by NDC.

Safety Data Sheet (SDS)

A standardized regulatory document detailing chemical hazard, handling, and safety information required under OSHA and GHS standards. In healthcare data systems, SDS records are managed in occupational health and facility management platforms; data engineers integrate SDS data with incident reporting and employee health EHR modules.

Single-chain Variable Fragment (scFv)

A recombinant antibody construct consisting of fused heavy and light chain variable regions used in targeted biologic therapies. In healthcare data systems, scFv identifiers appear in specialty pharmacy, PBM formulary, and clinical trial datasets; data engineers must map these to NDC and biologic product reference tables accurately.
