Introduction
Google Cloud is a legitimate HIPAA-eligible platform. GCP signs Business Associate Agreements, and a specific set of services fall within scope. But HIPAA compliance on GCP is not automatic — it requires deliberate architectural choices around encryption, network isolation, audit logging, and access controls. This guide walks through a production-grade healthcare data architecture on GCP, from raw clinical data ingestion through analytics, with the security controls that make it defensible under HIPAA.
| GCP Service | HIPAA BAA | PHI Allowed | Key Security Feature |
|---|---|---|---|
| BigQuery | ✅ Yes | ✅ Yes | Column-level security |
| Cloud Healthcare API | ✅ Yes | ✅ Yes | FHIR R4 native |
| Cloud Storage | ✅ Yes | ✅ Yes | CMEK encryption |
| Pub/Sub | ✅ Yes | ✅ Yes | VPC perimeter |
| Dataflow | ✅ Yes | ✅ Yes | Managed pipeline |
| Cloud Functions | ⚠️ Verify | ⚠️ Conditional | Confirm against current BAA list |
| Firebase | ⚠️ Verify | ⚠️ Conditional | Confirm per Firebase product |
| App Engine | ⚠️ Verify | ⚠️ Conditional | Confirm against current BAA list |
GCP Services in Scope for HIPAA
GCP's BAA covers a specific list of services. The most relevant for a healthcare data architecture are:
- Cloud Healthcare API — FHIR R4, HL7 v2, and DICOM storage
- BigQuery — Data warehouse and analytics
- Cloud Storage (GCS) — Object storage for raw files (837, 835, flat files)
- Pub/Sub — Real-time event streaming
- Dataflow — Managed Apache Beam for batch and streaming pipelines
- Cloud Composer — Managed Airflow for orchestration
- Secret Manager — Credential and secret storage
- Cloud KMS — Encryption key management
- VPC Service Controls — Network security perimeter
- Cloud Audit Logs — Access and activity logging
Services not covered by GCP's BAA should never touch PHI. When in doubt, check GCP's current BAA addendum before using a new service.
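As a minimal sketch of the "when in doubt, check" rule, the snippet below compares the services a pipeline uses against the in-scope list from this section. The `BAA_COVERED` set is copied from the list above and is only illustrative; the function name and the sample pipeline are hypothetical, and Google's current BAA documentation remains the authoritative source.

```python
# Services this article lists as in scope for the BAA.
# Illustrative only: the authoritative list is Google's current BAA documentation.
BAA_COVERED = {
    "Cloud Healthcare API", "BigQuery", "Cloud Storage", "Pub/Sub", "Dataflow",
    "Cloud Composer", "Secret Manager", "Cloud KMS", "VPC Service Controls",
    "Cloud Audit Logs",
}


def uncovered_services(pipeline_services, covered=BAA_COVERED):
    """Return the services used by a pipeline that are not on the covered list."""
    return sorted(set(pipeline_services) - set(covered))


if __name__ == "__main__":
    # Hypothetical pipeline definition, for illustration.
    pipeline = ["Cloud Storage", "Dataflow", "BigQuery", "Example Unlisted Service"]
    print(uncovered_services(pipeline))
```

A check like this could run in CI against a pipeline manifest so that an unlisted service is flagged before it is deployed into a PHI path.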
Architecture Overview
The reference architecture has five layers:
[Source Systems]
Epic (FHIR R4) → Cloud Healthcare API → BigQuery (streaming sync)
Clearinghouse (837/835 EDI) → GCS (raw zone) → Dataflow → BigQuery
Lab vendor (HL7 v2) → Cloud Healthcare API HL7 Store → Pub/Sub → Dataflow
[Ingestion & Processing]
Dataflow pipelines for transformation and PHI standardization
[Storage]
BigQuery: bronze (raw), silver (standardized), gold (aggregated)
GCS: raw file archive (encrypted, lifecycle-managed)
[Access & Governance]
VPC Service Controls security perimeter
IAM + Column-level security in BigQuery
Cloud Audit Logs → BigQuery logging sink
[Analytics & Serving]
Looker / Looker Studio for operational reporting
Vertex AI for risk models
Step 1: Establish the Security Perimeter with VPC Service Controls
VPC Service Controls create a security perimeter around your PHI-handling GCP project. Resources inside the perimeter can communicate with each other; data egress to outside the perimeter is blocked by default.
Create a service perimeter that includes all PHI-handling services:
gcloud access-context-manager perimeters create phi-perimeter --title="PHI Data Perimeter" --resources=projects/[YOUR_PROJECT_NUMBER] --restricted-services=bigquery.googleapis.com,healthcare.googleapis.com,storage.googleapis.com,pubsub.googleapis.com,dataflow.googleapis.com --policy=[YOUR_ACCESS_POLICY_ID]
This is your first line of defense against data exfiltration. A service account or user with BigQuery access inside the perimeter cannot export PHI to an external GCS bucket or BigQuery dataset outside the perimeter.
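The perimeter command above is long enough that hand-editing the service list is error-prone. One possible approach, sketched here, is to assemble the same arguments from structured values. The helper name and the sample project number and policy ID are hypothetical; only the flags already shown in the command above are used.

```python
import shlex


def perimeter_create_args(name, title, project_number, policy_id, services):
    """Build the argument list for the perimeter create command shown above."""
    return [
        "gcloud", "access-context-manager", "perimeters", "create", name,
        f"--title={title}",
        f"--resources=projects/{project_number}",
        "--restricted-services=" + ",".join(services),
        f"--policy={policy_id}",
    ]


if __name__ == "__main__":
    # Placeholder identifiers; substitute your own project number and policy ID.
    args = perimeter_create_args(
        "phi-perimeter",
        "PHI Data Perimeter",
        "123456789",
        "987654321",
        [
            "bigquery.googleapis.com",
            "healthcare.googleapis.com",
            "storage.googleapis.com",
            "pubsub.googleapis.com",
            "dataflow.googleapis.com",
        ],
    )
    # shlex.join quotes the title so the printed command is safe to paste into a shell.
    print(shlex.join(args))
```

Keeping the restricted services in a list makes it easier to review in version control when a new PHI-handling API is added to the perimeter.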
Step 2: Configure Cloud Healthcare API
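Create an R4 FHIR store with update-as-create enabled and Pub/Sub notifications for resource changes (BigQuery streaming and audit logging are configured separately in the steps that follow):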
gcloud healthcare fhir-stores create ehr-fhir-store --dataset=clinical-dataset --location=us-central1 --version=R4 --enable-update-create --pubsub-topic=projects/[PROJECT_ID]/topics/fhir-mutations
Enable BigQuery streaming sync for analytics access:
{
  "streamConfigs": [{
    "resourceTypes": ["Patient", "Encounter", "Condition", "Observation", "MedicationRequest"],
    "bigqueryDestination": {
      "datasetUri": "bq://[PROJECT_ID].clinical_fhir_bronze",
      "schemaConfig": {
        "schemaType": "ANALYTICS_V2",
        "recursiveStructureDepth": 5
      },
      "writeDisposition": "WRITE_APPEND"
    }
  }]
}
The ANALYTICS_V2 schema flattens FHIR JSON into BigQuery-native columns, making FHIR resources directly queryable without JSON parsing.
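If the stream configuration is generated rather than hand-written, a small builder keeps the payload consistent across environments. The sketch below reproduces the JSON shown above; the function name and the `my-project` value are hypothetical, and how the payload is submitted to the FHIR store is left out.

```python
import json


def build_stream_config(project_id, dataset_id, resource_types, depth=5):
    """Return the streamConfigs payload shown above as a Python dict."""
    return {
        "streamConfigs": [{
            "resourceTypes": list(resource_types),
            "bigqueryDestination": {
                "datasetUri": f"bq://{project_id}.{dataset_id}",
                "schemaConfig": {
                    "schemaType": "ANALYTICS_V2",
                    "recursiveStructureDepth": depth,
                },
                "writeDisposition": "WRITE_APPEND",
            },
        }]
    }


if __name__ == "__main__":
    config = build_stream_config(
        "my-project",  # placeholder project ID
        "clinical_fhir_bronze",
        ["Patient", "Encounter", "Condition", "Observation", "MedicationRequest"],
    )
    print(json.dumps(config, indent=2))
```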
Step 3: Encryption with Customer-Managed Keys
By default, GCP encrypts all data at rest using Google-managed keys. For PHI, use Customer-Managed Encryption Keys (CMEK) via Cloud KMS — this gives your organization control over the encryption lifecycle.
# Create a key ring for PHI data
gcloud kms keyrings create phi-keyring --location=us-central1
# Create a symmetric encryption key
gcloud kms keys create phi-data-key --keyring=phi-keyring --location=us-central1 --purpose=encryption --rotation-period=90d --next-rotation-time=$(date -d '+90 days' --iso-8601)
# Apply CMEK to a BigQuery dataset
bq update --default_kms_key=projects/[PROJECT_ID]/locations/us-central1/keyRings/phi-keyring/cryptoKeys/phi-data-key [PROJECT_ID]:phi_data
The HIPAA Security Rule does not prescribe a rotation interval; rotating every 90 days is a common practice for encryption keys protecting ePHI.
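The `date -d '+90 days'` call in the key creation command relies on GNU date and will not run unchanged on macOS or BSD. A portable alternative, sketched below under the assumption that an RFC 3339 UTC timestamp is an acceptable value for `--next-rotation-time`, is to compute the timestamp in Python. The function name is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def next_rotation_time(now=None, period_days=90):
    """Return a UTC timestamp period_days after now, formatted as RFC 3339.

    now should be a timezone-aware datetime; it defaults to the current UTC time.
    """
    now = now or datetime.now(timezone.utc)
    rotation = (now + timedelta(days=period_days)).astimezone(timezone.utc)
    return rotation.strftime("%Y-%m-%dT%H:%M:%SZ")


if __name__ == "__main__":
    # Fixed start date so the result is reproducible.
    start = datetime(2025, 1, 1, tzinfo=timezone.utc)
    print(next_rotation_time(start))  # 2025-04-01T00:00:00Z
```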
Step 4: IAM and Column-Level Security in BigQuery
Implement least-privilege access using BigQuery column-level security with policy tags:
-- Tag PHI columns with a policy tag
-- (policy tag must be created in Data Catalog first)
ALTER TABLE phi.member_demographics
ALTER COLUMN ssn
SET OPTIONS (policy_tags = '["projects/[PROJECT]/locations/us-central1/taxonomies/[TAXONOMY_ID]/policyTags/[TAG_ID]"]');
-- Grant access to the PHI policy tag for privileged role only
-- Done via Data Catalog IAM, not BigQuery IAM
This ensures that even if a user has BigQuery read access to the table, they cannot read the SSN column unless they have been explicitly granted access to the PHI policy tag.
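Column-level security only protects columns that actually carry a policy tag. The sketch below scans a table schema, in the JSON shape that `bq show --schema --format=prettyjson` produces, for PHI-looking columns with no tag. The name hints in `PHI_COLUMN_HINTS` and the sample schema are hypothetical; a real inventory would use your own data dictionary rather than name matching.

```python
# Hypothetical substrings that suggest a column holds PHI.
PHI_COLUMN_HINTS = ("ssn", "dob", "birth", "mrn", "address", "phone")


def untagged_phi_columns(schema, prefix=""):
    """Return dotted paths of PHI-looking leaf columns that carry no policy tag."""
    found = []
    for field in schema:
        path = prefix + field["name"]
        if field.get("fields"):
            # RECORD column: recurse into its nested fields.
            found.extend(untagged_phi_columns(field["fields"], path + "."))
            continue
        looks_like_phi = any(hint in field["name"].lower() for hint in PHI_COLUMN_HINTS)
        has_tag = bool(field.get("policyTags", {}).get("names"))
        if looks_like_phi and not has_tag:
            found.append(path)
    return found


if __name__ == "__main__":
    # Made-up schema for illustration.
    schema = [
        {"name": "member_id", "type": "STRING"},
        {"name": "ssn", "type": "STRING", "policyTags": {"names": [
            "projects/p/locations/us-central1/taxonomies/1/policyTags/2"]}},
        {"name": "date_of_birth", "type": "DATE"},
        {"name": "contact", "type": "RECORD", "fields": [
            {"name": "phone_number", "type": "STRING"}]},
    ]
    print(untagged_phi_columns(schema))
```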
Step 5: Audit Logging
Enable Data Access audit logs for all PHI-handling services:
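# Export the current policy first: set-iam-policy replaces the whole policy,
# so applying a file that contains only auditConfigs would drop existing bindings.
gcloud projects get-iam-policy [PROJECT_ID] --format=yaml > policy.yaml
# Add this auditConfigs section to policy.yaml, keeping the existing bindings and etag:
auditConfigs:
- auditLogConfigs:
  - logType: DATA_READ
  - logType: DATA_WRITE
  - logType: ADMIN_READ
  service: bigquery.googleapis.com
- auditLogConfigs:
  - logType: DATA_READ
  - logType: DATA_WRITE
  service: healthcare.googleapis.com
- auditLogConfigs:
  - logType: DATA_READ
  - logType: DATA_WRITE
  service: storage.googleapis.com
# Apply the edited policy
gcloud projects set-iam-policy [PROJECT_ID] policy.yaml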
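Audit configuration tends to drift as projects evolve, so it can help to verify it programmatically. The sketch below checks a project IAM policy, loaded as a dict (for example from `gcloud projects get-iam-policy --format=json`), against the log types this step enables. The function name and `REQUIRED_LOG_TYPES` mapping are hypothetical, and exempted members are ignored for simplicity.

```python
# Log types this step enables for each PHI-handling service.
REQUIRED_LOG_TYPES = {
    "bigquery.googleapis.com": {"DATA_READ", "DATA_WRITE", "ADMIN_READ"},
    "healthcare.googleapis.com": {"DATA_READ", "DATA_WRITE"},
    "storage.googleapis.com": {"DATA_READ", "DATA_WRITE"},
}


def missing_audit_logs(policy, required=REQUIRED_LOG_TYPES):
    """Return {service: [missing log types]} for services lacking required logs."""
    enabled = {}
    for config in policy.get("auditConfigs", []):
        log_types = {entry["logType"] for entry in config.get("auditLogConfigs", [])}
        enabled.setdefault(config["service"], set()).update(log_types)
    # An "allServices" entry applies to every service in the project.
    project_wide = enabled.get("allServices", set())
    gaps = {}
    for service, needed in required.items():
        missing = needed - enabled.get(service, set()) - project_wide
        if missing:
            gaps[service] = sorted(missing)
    return gaps


if __name__ == "__main__":
    # Made-up policy with an incomplete healthcare entry and no storage entry.
    policy = {"auditConfigs": [
        {"service": "bigquery.googleapis.com", "auditLogConfigs": [
            {"logType": "DATA_READ"}, {"logType": "DATA_WRITE"}, {"logType": "ADMIN_READ"}]},
        {"service": "healthcare.googleapis.com", "auditLogConfigs": [
            {"logType": "DATA_READ"}]},
    ]}
    print(missing_audit_logs(policy))
```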
Route audit logs to BigQuery for long-term retention and queryable audit evidence:
gcloud logging sinks create phi-audit-sink bigquery.googleapis.com/projects/[PROJECT_ID]/datasets/audit_logs --log-filter='logName="projects/[PROJECT_ID]/logs/cloudaudit.googleapis.com%2Fdata_access"'
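After creating the sink, grant its writer identity (shown in the command output) the BigQuery Data Editor role on the audit_logs dataset, or no entries will arrive. Retain audit logs for at least 6 years: HIPAA's documentation retention rule (45 CFR 164.316(b)(2)(i)) sets a six-year minimum, and it is commonly applied to audit records. BigQuery's partitioned table storage makes 6-year retention economically practical.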
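When a lifecycle or cleanup job enforces the retention window, the date arithmetic is worth making explicit. This is a minimal sketch that computes the earliest date a log entry may be deleted under a 6-year window; the function names are hypothetical, and rolling a February 29 entry forward to March 1 is a deliberately conservative assumption.

```python
from datetime import date

RETENTION_YEARS = 6


def earliest_deletion_date(log_date, years=RETENTION_YEARS):
    """Return the first date on which a log entry from log_date may be deleted."""
    try:
        return log_date.replace(year=log_date.year + years)
    except ValueError:
        # Feb 29 has no counterpart in a non-leap target year; roll forward to Mar 1.
        return date(log_date.year + years, 3, 1)


def may_delete(log_date, today, years=RETENTION_YEARS):
    """True once the retention window for log_date has fully elapsed."""
    return today >= earliest_deletion_date(log_date, years)


if __name__ == "__main__":
    print(earliest_deletion_date(date(2025, 6, 15)))      # 2031-06-15
    print(may_delete(date(2025, 6, 15), date(2030, 1, 1)))  # False
```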
Key Takeaways
- VPC Service Controls are non-negotiable for PHI on GCP: they prevent data exfiltration that IAM alone cannot stop.
- Cloud Healthcare API's BigQuery streaming sync eliminates the ETL tier for FHIR analytics. Enable it at FHIR store creation time.
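- CMEK with scheduled key rotation (90 days in this guide) supports the encryption and key management safeguards expected under the HIPAA Security Rule; the rule itself does not prescribe a rotation interval.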
- BigQuery column-level security with policy tags provides fine-grained PHI access control without view proliferation.
- Before any SQL query runs against PHI tables, validate it with the SQL Linter to catch unbounded queries and anti-patterns that could expose more PHI than intended.
HIPAA-Compliant vs Non-Compliant GCP Services
Not every GCP service is eligible for PHI. Google's BAA covers a defined list — everything outside it must stay PHI-free.
| GCP Service | HIPAA BAA | PHI Allowed | Key Config Required |
|---|---|---|---|
| BigQuery | ✅ Yes | ✅ Yes | CMEK encryption, VPC Service Controls, column-level security |
| Cloud Healthcare API | ✅ Yes | ✅ Yes | Designed for PHI — FHIR/HL7 native |
| Cloud Storage | ✅ Yes | ✅ Yes | CMEK, uniform bucket-level access, audit logging |
| Cloud Pub/Sub | ✅ Yes | ✅ Yes | VPC Service Controls perimeter required |
| Cloud KMS | ✅ Yes | ✅ Yes | Manages CMEK keys for all PHI services |
| Cloud Audit Logs | ✅ Yes | ✅ Yes | Route to BigQuery; 6-year retention policy |
| Dataflow | ✅ Yes | ✅ Yes | Run inside VPC, no external IPs |
| Cloud Composer | ✅ Yes | ✅ Yes | Private IP environment, VPC-native |
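| Cloud Functions | ⚠️ Verify | ⚠️ Conditional | Confirm coverage in the current BAA covered-products list before passing PHI through |
| Firebase | ⚠️ Partial | ⚠️ Conditional | Coverage varies by Firebase product; verify each one |
| BigQuery ML | ⚠️ Verify | ⚠️ Conditional | Verify against current BAA addendum; prefer de-identified training data |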
| Vertex AI | ⚠️ Partial | ⚠️ Conditional | Verify against current BAA addendum |
Always verify: GCP's BAA-covered service list changes over time. Check the GCP HIPAA implementation guide before adding any new service to a PHI pipeline.
Frequently Asked Questions
Is Google Cloud Platform HIPAA compliant?
GCP is HIPAA-eligible: Google signs Business Associate Agreements (BAAs) for a specific set of services including BigQuery, Cloud Healthcare API, Cloud Storage, Pub/Sub, and Dataflow. However, HIPAA compliance is not automatic. It requires proper architectural controls including VPC Service Controls, CMEK encryption, audit logging, and IAM configuration.
What GCP services are covered under the HIPAA BAA?
Google's BAA covers BigQuery, Cloud Healthcare API, Cloud Storage, Pub/Sub, Dataflow, Cloud Composer, Secret Manager, Cloud KMS, VPC Service Controls, and Cloud Audit Logs. Services not on this list should never process PHI.
How do I make BigQuery HIPAA compliant?
Making BigQuery HIPAA compliant requires four controls: enable Customer-Managed Encryption Keys (CMEK) via Cloud KMS, implement column-level security with policy tags for PHI fields, enable Data Access audit logs for all read and write operations, and place BigQuery inside a VPC Service Controls perimeter to prevent data exfiltration.
What are VPC Service Controls in GCP?
VPC Service Controls create a security perimeter around your GCP project that prevents data exfiltration even if an IAM account is compromised. Resources inside the perimeter can communicate with each other, but data cannot leave the perimeter to external destinations without explicit access levels configured.
How long must HIPAA audit logs be retained on GCP?
HIPAA's documentation retention rule sets a six-year minimum, which is commonly applied to audit logs, so plan to retain them for at least 6 years. On GCP, route Cloud Audit Logs to a BigQuery sink with a partitioned table structure. This makes 6-year retention economically practical and keeps logs queryable for compliance investigations.
mdatool Team
The mdatool team builds free engineering tools for healthcare data architects, analysts, and engineers working across payer, provider, and life sciences data.