HIPAA-Compliant GCP Architecture: BigQuery, VPC Service Controls & Cloud Healthcare API (2026 Guide)

Building HIPAA-compliant data infrastructure on Google Cloud requires more than checking a BAA checkbox. Here is the architecture — BigQuery, Cloud Healthcare API, Pub/Sub, Dataflow, and the security controls that make it defensible.

mdatool Team · April 21, 2026 · 9 min read
Tags: GCP, HIPAA, BigQuery, Cloud Healthcare API, data architecture, VPC Service Controls

Introduction

Google Cloud is a legitimate HIPAA-eligible platform. GCP signs Business Associate Agreements, and a specific set of services fall within scope. But HIPAA compliance on GCP is not automatic — it requires deliberate architectural choices around encryption, network isolation, audit logging, and access controls. This guide walks through a production-grade healthcare data architecture on GCP, from raw clinical data ingestion through analytics, with the security controls that make it defensible under HIPAA.

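| GCP Service | HIPAA BAA | PHI Allowed | Key Security Feature |
|---|---|---|---|
| BigQuery | ✅ Yes | ✅ Yes | Column-level security |
| Cloud Healthcare API | ✅ Yes | ✅ Yes | FHIR R4 native |
| Cloud Storage | ✅ Yes | ✅ Yes | CMEK encryption |
| Pub/Sub | ✅ Yes | ✅ Yes | VPC perimeter |
| Dataflow | ✅ Yes | ✅ Yes | Managed pipeline |
| Cloud Functions | ⚠️ Verify | ⚠️ Conditional | Confirm against Google's current covered-products list |
| Firebase | ❌ No | ❌ No | Not BAA covered |
| App Engine | ⚠️ Verify | ⚠️ Conditional | Confirm against Google's current covered-products list |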

GCP Services in Scope for HIPAA

GCP's BAA covers a specific list of services. The most relevant for a healthcare data architecture are:

  • Cloud Healthcare API — FHIR R4, HL7 v2, and DICOM storage
  • BigQuery — Data warehouse and analytics
  • Cloud Storage (GCS) — Object storage for raw files (837, 835, flat files)
  • Pub/Sub — Real-time event streaming
  • Dataflow — Managed Apache Beam for batch and streaming pipelines
  • Cloud Composer — Managed Airflow for orchestration
  • Secret Manager — Credential and secret storage
  • Cloud KMS — Encryption key management
  • VPC Service Controls — Network security perimeter
  • Cloud Audit Logs — Access and activity logging

Services not covered by GCP's BAA should never touch PHI. When in doubt, check GCP's current BAA addendum before using a new service.
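
One way to enforce this rule is a small check that compares the APIs a project has enabled against an allowlist. The Python sketch below is illustrative only: `BAA_COVERED` and `uncovered_services` are hypothetical names, and the allowlist is transcribed from the list in this guide rather than from Google's authoritative BAA documentation, so it would need to be refreshed from the current source before real use.

```python
# Hypothetical allowlist, transcribed from the service list in this guide.
# Assumption: refresh it from Google's current BAA documentation before use.
BAA_COVERED = frozenset({
    "healthcare.googleapis.com",
    "bigquery.googleapis.com",
    "storage.googleapis.com",
    "pubsub.googleapis.com",
    "dataflow.googleapis.com",
    "composer.googleapis.com",
    "secretmanager.googleapis.com",
    "cloudkms.googleapis.com",
})


def uncovered_services(enabled, covered=BAA_COVERED):
    """Return the enabled services that are not on the allowlist, sorted."""
    return sorted(set(enabled) - set(covered))


if __name__ == "__main__":
    # Example input; in practice this list would come from the project's enabled APIs.
    enabled = ["bigquery.googleapis.com", "firebase.googleapis.com"]
    for service in uncovered_services(enabled):
        print(f"Not on the BAA allowlist, keep PHI away: {service}")
```

Running a check like this in CI turns "check the BAA first" from a convention into a gate, at the cost of keeping the allowlist current.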


Architecture Overview

The reference architecture has five layers:

[Source Systems]
  Epic (FHIR R4) → Cloud Healthcare API → BigQuery (streaming sync)
  Clearinghouse (837/835 EDI) → GCS (raw zone) → Dataflow → BigQuery
  Lab vendor (HL7 v2) → Cloud Healthcare API HL7 Store → Pub/Sub → Dataflow

[Ingestion & Processing]
  Dataflow pipelines for transformation and PHI standardization

[Storage]
  BigQuery: bronze (raw), silver (standardized), gold (aggregated)
  GCS: raw file archive (encrypted, lifecycle-managed)

[Access & Governance]
  VPC Service Controls security perimeter
  IAM + Column-level security in BigQuery
  Cloud Audit Logs → BigQuery logging sink

[Analytics & Serving]
  Looker / Looker Studio for operational reporting
  Vertex AI for risk models

Step 1: Establish the Security Perimeter with VPC Service Controls

VPC Service Controls create a security perimeter around your PHI-handling GCP project. Resources inside the perimeter can communicate with each other; data egress to outside the perimeter is blocked by default.

Create a service perimeter that includes all PHI-handling services:

gcloud access-context-manager perimeters create phi-perimeter \
  --title="PHI Data Perimeter" \
  --resources=projects/[YOUR_PROJECT_NUMBER] \
  --restricted-services=bigquery.googleapis.com,healthcare.googleapis.com,storage.googleapis.com,pubsub.googleapis.com,dataflow.googleapis.com \
  --policy=[YOUR_ACCESS_POLICY_ID]

This is your first line of defense against data exfiltration. A service account or user with BigQuery access inside the perimeter cannot export PHI to an external GCS bucket or BigQuery dataset outside the perimeter.
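
Because the restricted-services list tends to grow as the pipeline grows, it can help to generate the command from a single list kept in version control. The Python sketch below only assembles the same `gcloud` arguments shown above; `perimeter_create_args` and `PHI_SERVICES` are hypothetical names, and the project number and policy ID in the example are placeholders.

```python
import shlex

# Services restricted by the perimeter, mirroring the command in this guide.
PHI_SERVICES = [
    "bigquery.googleapis.com",
    "healthcare.googleapis.com",
    "storage.googleapis.com",
    "pubsub.googleapis.com",
    "dataflow.googleapis.com",
]


def perimeter_create_args(name, title, project_number, policy_id, services):
    """Build the argv for `gcloud access-context-manager perimeters create`."""
    if not services:
        raise ValueError("at least one restricted service is required")
    return [
        "gcloud", "access-context-manager", "perimeters", "create", name,
        f"--title={title}",
        f"--resources=projects/{project_number}",
        f"--restricted-services={','.join(services)}",
        f"--policy={policy_id}",
    ]


if __name__ == "__main__":
    # Placeholder identifiers; substitute your own project number and policy ID.
    args = perimeter_create_args(
        "phi-perimeter", "PHI Data Perimeter", "123456789012", "987654321", PHI_SERVICES
    )
    # Print a shell-safe command line for review instead of executing it.
    print(shlex.join(args))
```

The sketch prints the command rather than running it, so the perimeter change can be reviewed before anyone applies it.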


Step 2: Configure Cloud Healthcare API

Create an R4 FHIR store that publishes resource changes to a Pub/Sub topic. BigQuery streaming is configured separately below, and audit logging is covered in Step 5:

gcloud healthcare fhir-stores create ehr-fhir-store \
  --dataset=clinical-dataset \
  --location=us-central1 \
  --version=R4 \
  --enable-update-create \
  --pubsub-topic=projects/[PROJECT_ID]/topics/fhir-mutations

Enable BigQuery streaming sync for analytics access:

{
  "streamConfigs": [{
    "resourceTypes": ["Patient", "Encounter", "Condition", "Observation", "MedicationRequest"],
    "bigqueryDestination": {
      "datasetUri": "bq://[PROJECT_ID].clinical_fhir_bronze",
      "schemaConfig": {
        "schemaType": "ANALYTICS_V2",
        "recursiveStructureDepth": 5
      },
      "writeDisposition": "WRITE_APPEND"
    }
  }]
}

The ANALYTICS_V2 schema flattens FHIR JSON into BigQuery-native columns, making FHIR resources directly queryable without JSON parsing.
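
If several FHIR stores need the same streaming setup, the configuration body can be generated rather than hand-edited. The following Python sketch builds the same JSON structure shown above; `fhir_stream_config` is a hypothetical helper, and the project and dataset names in the example are placeholders.

```python
import json


def fhir_stream_config(project_id, dataset, resource_types, depth=5):
    """Build a streamConfigs body matching the structure shown in this guide."""
    if not resource_types:
        raise ValueError("list at least one FHIR resource type to stream")
    return {
        "streamConfigs": [{
            "resourceTypes": list(resource_types),
            "bigqueryDestination": {
                "datasetUri": f"bq://{project_id}.{dataset}",
                "schemaConfig": {
                    "schemaType": "ANALYTICS_V2",
                    "recursiveStructureDepth": depth,
                },
                "writeDisposition": "WRITE_APPEND",
            },
        }]
    }


if __name__ == "__main__":
    # Placeholder project and dataset names.
    body = fhir_stream_config(
        "my-project", "clinical_fhir_bronze", ["Patient", "Encounter", "Condition"]
    )
    print(json.dumps(body, indent=2))
```

Keeping the resource type list explicit, rather than streaming everything, limits which PHI lands in the analytics dataset.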


Step 3: Encryption with Customer-Managed Keys

By default, GCP encrypts all data at rest using Google-managed keys. For PHI, use Customer-Managed Encryption Keys (CMEK) via Cloud KMS — this gives your organization control over the encryption lifecycle.

# Create a key ring for PHI data
gcloud kms keyrings create phi-keyring \
  --location=us-central1

# Create a symmetric encryption key
gcloud kms keys create phi-data-key \
  --keyring=phi-keyring \
  --location=us-central1 \
  --purpose=encryption \
  --rotation-period=90d \
  --next-rotation-time=$(date -d '+90 days' --iso-8601)

# Apply CMEK to a BigQuery dataset
bq update \
  --default_kms_key=projects/[PROJECT_ID]/locations/us-central1/keyRings/phi-keyring/cryptoKeys/phi-data-key \
  [PROJECT_ID]:phi_data

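The HIPAA Security Rule does not prescribe a specific rotation interval. Rotating every 90 days is a common policy choice for keys protecting ePHI, because it limits how much data any single key version protects.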
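
The `date -d '+90 days'` expression in the key creation command relies on GNU `date` and will not work unchanged on macOS or BSD. As a portable alternative, the next rotation timestamp can be computed in Python. This is a minimal sketch; `next_rotation_time` is a hypothetical helper, and the 90-day default simply mirrors the rotation period used in this guide.

```python
from datetime import datetime, timedelta, timezone


def next_rotation_time(now, rotation_days=90):
    """Return the timestamp `rotation_days` after `now` as an RFC 3339 UTC string."""
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    target = now.astimezone(timezone.utc) + timedelta(days=rotation_days)
    return target.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # Prints a value suitable for passing as the next rotation time.
    print(next_rotation_time(datetime.now(timezone.utc)))
```

For example, a key created at midnight UTC on April 21, 2026 would get a next rotation time of `2026-07-20T00:00:00Z`.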


Step 4: IAM and Column-Level Security in BigQuery

Implement least-privilege access using BigQuery column-level security with policy tags:

-- Tag PHI columns with a policy tag
-- (policy tag must be created in Data Catalog first)
ALTER TABLE phi.member_demographics
  ALTER COLUMN ssn
  SET OPTIONS (policy_tags = '["projects/[PROJECT]/locations/us-central1/taxonomies/[TAXONOMY_ID]/policyTags/[TAG_ID]"]');

-- Grant access to the PHI policy tag for privileged role only
-- Done via Data Catalog IAM, not BigQuery IAM

This ensures that even if a user has BigQuery read access to the table, they cannot read the SSN column unless they have been explicitly granted the Fine-Grained Reader role (roles/datacatalog.categoryFineGrainedReader) on the PHI policy tag.
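
The access rule can be summarized as: a column is readable if it carries no policy tag, or if the caller holds fine-grained read access on the tag it carries. The Python sketch below is a simplified model of that rule for reasoning and testing access matrices, not BigQuery's actual implementation. The function name `readable_columns`, the tag label `phi`, and the column names other than `ssn` are hypothetical.

```python
def readable_columns(column_tags, granted_tags):
    """Return the columns a caller can read under the simplified rule.

    column_tags maps column name -> policy tag (or None if untagged).
    granted_tags is the set of policy tags the caller may read.
    """
    return [
        column
        for column, tag in column_tags.items()
        if tag is None or tag in granted_tags
    ]


if __name__ == "__main__":
    # Hypothetical schema for phi.member_demographics; only `ssn` is tagged.
    schema = {"member_id": None, "birth_year": None, "ssn": "phi"}

    print(readable_columns(schema, granted_tags=set()))     # analyst without tag access
    print(readable_columns(schema, granted_tags={"phi"}))   # privileged role
```

A model like this is useful mainly for reviewing an access matrix before policy tags are applied. BigQuery itself rejects a query that references a tagged column the caller cannot read, rather than silently dropping the column.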


Step 5: Audit Logging

Enable Data Access audit logs for all PHI-handling services:

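# 1. Export the current IAM policy so existing bindings and the etag are preserved
gcloud projects get-iam-policy [PROJECT_ID] --format=yaml > policy.yaml

# 2. Merge this auditConfigs section into policy.yaml (leave the bindings in place)
auditConfigs:
- auditLogConfigs:
  - logType: DATA_READ
  - logType: DATA_WRITE
  - logType: ADMIN_READ
  service: bigquery.googleapis.com
- auditLogConfigs:
  - logType: DATA_READ
  - logType: DATA_WRITE
  service: healthcare.googleapis.com
- auditLogConfigs:
  - logType: DATA_READ
  - logType: DATA_WRITE
  service: storage.googleapis.com

# 3. Apply the edited policy
gcloud projects set-iam-policy [PROJECT_ID] policy.yaml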
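
When audit configuration is applied by automation, the safer pattern is to merge it into the project's existing IAM policy rather than submit a policy that contains only `auditConfigs`. The Python sketch below shows such a merge on a policy already loaded as a dictionary. `with_audit_configs` is a hypothetical helper, and the policy in the example is made-up sample data.

```python
import copy


def with_audit_configs(policy, services, log_types=("DATA_READ", "DATA_WRITE")):
    """Return a copy of `policy` with Data Access logging enabled for `services`.

    Existing bindings, etag, and unrelated audit configs are left untouched.
    """
    updated = copy.deepcopy(policy)
    by_service = {cfg["service"]: cfg for cfg in updated.get("auditConfigs", [])}
    for service in services:
        cfg = by_service.setdefault(service, {"service": service})
        entries = cfg.setdefault("auditLogConfigs", [])
        present = {entry["logType"] for entry in entries}
        for log_type in log_types:
            if log_type not in present:
                entries.append({"logType": log_type})
    updated["auditConfigs"] = list(by_service.values())
    return updated


if __name__ == "__main__":
    # Sample policy; a real one would come from the project's current IAM policy.
    current = {
        "bindings": [{"role": "roles/viewer", "members": ["user:analyst@example.com"]}],
        "etag": "BwXample=",
    }
    merged = with_audit_configs(
        current, ["bigquery.googleapis.com", "healthcare.googleapis.com"]
    )
    print(merged["auditConfigs"])
```

The merge is idempotent, so it can run on every deployment without duplicating log types.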

Route audit logs to BigQuery for long-term retention and queryable audit evidence:

gcloud logging sinks create phi-audit-sink \
  bigquery.googleapis.com/projects/[PROJECT_ID]/datasets/audit_logs \
  --log-filter='logName="projects/[PROJECT_ID]/logs/cloudaudit.googleapis.com%2Fdata_access"'

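Retain audit logs for at least 6 years. HIPAA's documentation retention requirement (45 CFR 164.316(b)(2)(i)) sets a six-year minimum, and it is commonly applied to audit logs. BigQuery's partitioned table storage makes 6-year retention economically practical.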
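
Any cleanup job that expires old audit partitions should compute the retention cutoff explicitly rather than rely on an approximate day count. A minimal Python sketch follows, assuming the six-year minimum described above. `retention_cutoff` and `may_expire` are hypothetical helpers.

```python
from datetime import date

RETENTION_YEARS = 6  # retention minimum used in this guide


def retention_cutoff(today, years=RETENTION_YEARS):
    """Return the earliest log date that must still be retained."""
    try:
        return today.replace(year=today.year - years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; use Feb 28.
        return today.replace(year=today.year - years, day=28)


def may_expire(log_date, today):
    """True only if the log entry is older than the retention cutoff."""
    return log_date < retention_cutoff(today)


if __name__ == "__main__":
    today = date(2026, 4, 21)
    print(retention_cutoff(today))
    print(may_expire(date(2020, 4, 20), today))
    print(may_expire(date(2020, 4, 21), today))
```

Treating the cutoff date itself as still in scope errs on the side of keeping a log one day too long rather than deleting it one day too early.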


Key Takeaways

  • VPC Service Controls are non-negotiable for PHI on GCP: they prevent data exfiltration that IAM alone cannot stop.
  • Cloud Healthcare API's BigQuery streaming sync eliminates the ETL tier for FHIR analytics. Enable it at FHIR store creation time.
  • CMEK with 90-day key rotation supports the HIPAA Security Rule's encryption safeguards by keeping the key lifecycle under your control.
  • BigQuery column-level security with policy tags provides fine-grained PHI access control without view proliferation.
  • Before any SQL query runs against PHI tables, validate it with the SQL Linter to catch unbounded queries and anti-patterns that could expose more PHI than intended.

HIPAA-Compliant vs Non-Compliant GCP Services

Not every GCP service is eligible for PHI. Google's BAA covers a defined list — everything outside it must stay PHI-free.

| GCP Service | HIPAA BAA | PHI Allowed | Key Config Required |
|---|---|---|---|
| BigQuery | ✅ Yes | ✅ Yes | CMEK encryption, VPC Service Controls, column-level security |
| Cloud Healthcare API | ✅ Yes | ✅ Yes | Designed for PHI; FHIR/HL7 native |
| Cloud Storage | ✅ Yes | ✅ Yes | CMEK, uniform bucket-level access, audit logging |
| Cloud Pub/Sub | ✅ Yes | ✅ Yes | VPC Service Controls perimeter required |
| Cloud KMS | ✅ Yes | ✅ Yes | Manages CMEK keys for all PHI services |
| Cloud Audit Logs | ✅ Yes | ✅ Yes | Route to BigQuery; 6-year retention policy |
| Dataflow | ✅ Yes | ✅ Yes | Run inside VPC, no external IPs |
| Cloud Composer | ✅ Yes | ✅ Yes | Private IP environment, VPC-native |
| Cloud Functions | ⚠️ Verify | ⚠️ Conditional | Confirm against Google's current covered-products list before passing PHI through |
| Firebase | ❌ No | ❌ No | Not covered by GCP HIPAA BAA |
| BigQuery ML | ❌ No | ❌ No | Train on de-identified data only |
| Vertex AI | ⚠️ Partial | ⚠️ Conditional | Verify against current BAA addendum |

Always verify: GCP's BAA-covered service list changes over time. Check the GCP HIPAA implementation guide before adding any new service to a PHI pipeline.

Frequently Asked Questions

Is Google Cloud Platform HIPAA compliant?

GCP is HIPAA-eligible: Google signs Business Associate Agreements (BAAs) for a specific set of services, including BigQuery, Cloud Healthcare API, Cloud Storage, Pub/Sub, and Dataflow. However, HIPAA compliance is not automatic. It requires proper architectural controls, including VPC Service Controls, CMEK encryption, audit logging, and IAM configuration.

What GCP services are covered under the HIPAA BAA?

Google's BAA covers BigQuery, Cloud Healthcare API, Cloud Storage, Pub/Sub, Dataflow, Cloud Composer, Secret Manager, Cloud KMS, VPC Service Controls, and Cloud Audit Logs. Services not on this list should never process PHI.

How do I make BigQuery HIPAA compliant?

Making BigQuery HIPAA compliant requires four controls: enable Customer-Managed Encryption Keys (CMEK) via Cloud KMS, implement column-level security with policy tags for PHI fields, enable Data Access audit logs for all read and write operations, and place BigQuery inside a VPC Service Controls perimeter to prevent data exfiltration.

What are VPC Service Controls in GCP?

VPC Service Controls create a security perimeter around your GCP project that prevents data exfiltration even if an IAM account is compromised. Resources inside the perimeter can communicate with each other, but data cannot leave the perimeter to external destinations without explicit access levels configured.

How long must HIPAA audit logs be retained on GCP?

HIPAA requires audit logs to be retained for a minimum of 6 years. On GCP, route Cloud Audit Logs to a BigQuery sink with a partitioned table structure — this makes 6-year retention economically practical and keeps logs queryable for compliance investigations.
