Deploying a Private LLM in a HIPAA-Compliant VPC
A step-by-step DevOps guide for healthcare and biotech companies.
The Challenge
Healthcare and biotech companies face a unique AI challenge: they need to leverage large language models for clinical research, drug discovery, and patient data analysis, while meeting strict HIPAA requirements that prohibit sending Protected Health Information (PHI) to any third-party service that has not signed a Business Associate Agreement (BAA).
The solution is a fully private LLM deployment that runs entirely within your own Virtual Private Cloud (VPC). No data leaves your infrastructure. No third-party API calls. Complete audit trail from query to response.
Step 1: VPC Network Isolation
Start with a dedicated VPC that has no internet gateway. All compute instances live in private subnets with no public IP addresses. Communication with other AWS services happens exclusively through VPC endpoints — the traffic never traverses the public internet.
Key configurations: disable all outbound internet access from private subnets, configure VPC flow logs for all network interfaces, and set up AWS PrivateLink for any required service integrations.
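As a rough sketch of that baseline, the boto3 calls below create the isolated network pieces; the region, CIDRs, and flow-log bucket are placeholders, and in practice the same layout is usually expressed in Terraform or CloudFormation. Note what is missing: no internet gateway or NAT gateway is ever created.

```python
"""Sketch: isolated VPC with flow logs and a KMS interface endpoint (boto3).

Assumes credentials are already configured; the region, CIDRs, and the
flow-log bucket ARN are illustrative placeholders.
"""
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Dedicated VPC; note that we never create or attach an internet gateway.
vpc_id = ec2.create_vpc(CidrBlock="10.20.0.0/16")["Vpc"]["VpcId"]

# Private subnet: instances launched here never receive public IPs.
subnet_id = ec2.create_subnet(
    VpcId=vpc_id, CidrBlock="10.20.1.0/24", AvailabilityZone="us-east-1a"
)["Subnet"]["SubnetId"]
ec2.modify_subnet_attribute(
    SubnetId=subnet_id, MapPublicIpOnLaunch={"Value": False}
)

# Flow logs for every interface in the VPC, delivered to S3 for the SIEM.
ec2.create_flow_logs(
    ResourceIds=[vpc_id],
    ResourceType="VPC",
    TrafficType="ALL",
    LogDestinationType="s3",
    LogDestination="arn:aws:s3:::example-phi-flow-logs",  # placeholder bucket
)

# Interface (PrivateLink) endpoint so calls to KMS stay inside the VPC.
ec2.create_vpc_endpoint(
    VpcId=vpc_id,
    ServiceName="com.amazonaws.us-east-1.kms",
    VpcEndpointType="Interface",
    SubnetIds=[subnet_id],
    PrivateDnsEnabled=True,
)
```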
Step 2: Encryption at Every Layer
HIPAA requires encryption of PHI both at rest and in transit. For the LLM deployment, this means:
- At rest: All EBS volumes encrypted with customer-managed KMS keys. Model weights, vector stores, and query logs — all encrypted.
- In transit: TLS 1.3 enforced on all internal service communication. No plaintext traffic, even within the VPC.
- Key rotation: Automated KMS key rotation every 90 days with full audit logging of key usage (see the sketch after this list).
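A minimal sketch of the at-rest controls, again with boto3: the key alias, volume size, and other names are assumptions, and the RotationPeriodInDays parameter for 90-day rotation requires a recent KMS API and boto3 release (automatic rotation defaults to yearly otherwise).

```python
"""Sketch: customer-managed KMS key with rotation, used for an encrypted EBS volume.

The alias, volume size, and 90-day rotation period are illustrative; the
RotationPeriodInDays parameter needs a recent KMS API / boto3 version.
"""
import boto3

kms = boto3.client("kms", region_name="us-east-1")
ec2 = boto3.client("ec2", region_name="us-east-1")

# Customer-managed key so key policy, usage logging, and rotation stay under our control.
key = kms.create_key(Description="LLM PHI data key", KeyUsage="ENCRYPT_DECRYPT")
key_id = key["KeyMetadata"]["KeyId"]
kms.create_alias(AliasName="alias/llm-phi", TargetKeyId=key_id)

# Automatic rotation every 90 days; every use of the key is recorded in CloudTrail.
kms.enable_key_rotation(KeyId=key_id, RotationPeriodInDays=90)

# Example encrypted volume for model weights or vector-store data.
ec2.create_volume(
    AvailabilityZone="us-east-1a",
    Size=500,  # GiB, placeholder
    VolumeType="gp3",
    Encrypted=True,
    KmsKeyId=key_id,
)
```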
Step 3: Model Serving Infrastructure
Choose GPU-optimized instances (e.g., g5.xlarge or p4d.24xlarge depending on model size) deployed in an Auto Scaling Group behind an internal Application Load Balancer. The model serving layer handles tokenization, inference, and response formatting — all within the private subnet.
For models requiring multiple GPUs, use model parallelism across instances within the same cluster placement group to minimize inter-node latency. Configure health checks that verify both instance health and model readiness before routing traffic.
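One way to expose that readiness signal is a dedicated health endpoint on the serving process, sketched below with FastAPI (the framework, path, and load routine are assumptions; dedicated servers such as vLLM, TGI, or Triton ship their own readiness probes). The internal ALB's health check targets this path, so an instance receives traffic only after its weights are loaded.

```python
"""Sketch: readiness endpoint for the model-serving layer (FastAPI is an assumption).
The ALB health check points at /health and keeps the target out of service
until the model weights have actually finished loading."""
from contextlib import asynccontextmanager

from fastapi import FastAPI, Response

state = {"ready": False}

def load_model():
    # Placeholder for the real weight-loading / warm-up routine.
    return object()

@asynccontextmanager
async def lifespan(app: FastAPI):
    state["model"] = load_model()  # runs once at startup, inside the private subnet
    state["ready"] = True
    yield

app = FastAPI(lifespan=lifespan)

@app.get("/health")
def health(response: Response):
    # Return 503 while loading so the ALB does not route traffic to this instance yet.
    if not state["ready"]:
        response.status_code = 503
        return {"status": "loading"}
    return {"status": "ready"}
```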
Step 4: Audit Logging & BAA Compliance
Every query to the LLM must be logged with: timestamp, user identity, query hash (not the query itself, to avoid storing PHI in logs), response latency, and token count. These application logs flow into CloudWatch Logs and a dedicated SIEM for compliance monitoring, complementing the CloudTrail record of AWS API activity.
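A minimal sketch of one such audit record; the field names and log group are illustrative, not a prescribed schema.

```python
"""Sketch: structured audit record for one inference request. Only a SHA-256
hash of the prompt is stored, never the prompt itself, so no PHI lands in logs.
Field names and the log group are illustrative."""
import hashlib
import json
import time
import uuid

def audit_record(user_arn: str, prompt: str, latency_ms: float, tokens: int) -> str:
    record = {
        "event_id": str(uuid.uuid4()),
        "timestamp": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "user": user_arn,
        # Hash, not the prompt: keeps PHI out of the logging pipeline entirely.
        "query_sha256": hashlib.sha256(prompt.encode("utf-8")).hexdigest(),
        "latency_ms": round(latency_ms, 1),
        "token_count": tokens,
    }
    return json.dumps(record)

# Example shipping path (log group and stream are placeholders): write the JSON
# line to CloudWatch Logs, from which the SIEM ingests it alongside CloudTrail.
# logs = boto3.client("logs")
# logs.put_log_events(
#     logGroupName="/llm/audit", logStreamName="queries",
#     logEvents=[{"timestamp": int(time.time() * 1000),
#                 "message": audit_record(user_arn, prompt, latency_ms, tokens)}],
# )
```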
Ensure your AWS account has a signed Business Associate Agreement (BAA) and that only HIPAA-eligible services are used in the architecture. Regularly run AWS Config rules to detect any drift from the compliant baseline.
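As a sketch of that drift detection, the snippet below enables a few AWS managed Config rules that map onto the baseline and then reads back their compliance state; the rule selection and names are illustrative, not an exhaustive HIPAA rule set.

```python
"""Sketch: enable AWS Config managed rules that encode the baseline, then poll
compliance. The identifiers are AWS managed rules; the chosen set is an example."""
import boto3

config = boto3.client("config", region_name="us-east-1")

BASELINE_RULES = {
    "llm-ebs-encrypted": "ENCRYPTED_VOLUMES",
    "llm-flow-logs-on": "VPC_FLOW_LOGS_ENABLED",
    "llm-no-auto-public-ip": "SUBNET_AUTO_ASSIGN_PUBLIC_IP_DISABLED",
}

for name, identifier in BASELINE_RULES.items():
    config.put_config_rule(
        ConfigRule={
            "ConfigRuleName": name,
            "Source": {"Owner": "AWS", "SourceIdentifier": identifier},
        }
    )

# Surface any drift from the baseline for the compliance dashboard or SIEM alerting.
for result in config.describe_compliance_by_config_rule(
    ConfigRuleNames=list(BASELINE_RULES)
)["ComplianceByConfigRules"]:
    status = result.get("Compliance", {}).get("ComplianceType", "INSUFFICIENT_DATA")
    print(result["ConfigRuleName"], status)
```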
Step 5: Automated Compliance Checks
Build compliance validation into your CI/CD pipeline. Every infrastructure change must pass automated security scans that verify: no public subnets in the VPC, all volumes encrypted, no security groups allowing 0.0.0.0/0 ingress, and all IAM roles follow least-privilege principles.
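A minimal sketch of such a gate, querying the live account with boto3 under read-only credentials; in practice a policy-as-code scanner such as Checkov or tfsec over the Terraform plan covers the same ground before anything is applied. Pagination and the IAM least-privilege check are omitted for brevity.

```python
"""Sketch: pre-deployment gate that fails the pipeline on baseline violations:
open 0.0.0.0/0 ingress, unencrypted volumes, or subnets that auto-assign public IPs.
Intended to run as a CI job with read-only credentials; names are illustrative."""
import sys

import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")
violations = []

# 1. No security group may allow ingress from 0.0.0.0/0.
for sg in ec2.describe_security_groups()["SecurityGroups"]:
    for rule in sg["IpPermissions"]:
        if any(r.get("CidrIp") == "0.0.0.0/0" for r in rule.get("IpRanges", [])):
            violations.append(f"open ingress: {sg['GroupId']}")

# 2. Every EBS volume must be encrypted.
for vol in ec2.describe_volumes()["Volumes"]:
    if not vol["Encrypted"]:
        violations.append(f"unencrypted volume: {vol['VolumeId']}")

# 3. No subnet may auto-assign public IPs (a proxy check for "no public subnets").
for subnet in ec2.describe_subnets()["Subnets"]:
    if subnet["MapPublicIpOnLaunch"]:
        violations.append(f"public subnet: {subnet['SubnetId']}")

if violations:
    print("\n".join(violations))
    sys.exit(1)  # non-zero exit fails the CI/CD stage
print("baseline checks passed")
```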
This isn't a one-time setup — it's a continuous compliance posture that runs on every deployment.