Case Study: HIPAA-Compliant Medical AI Diagnostics Portal
Executive Summary
In healthcare technology, deploying Large Language Models (LLMs) requires solving complex security and compliance challenges. Under the Health Insurance Portability and Accountability Act (HIPAA) in the United States and GDPR in Europe, exposing protected health information (PHI) to public cloud models constitutes a severe compliance violation. In this engagement, Seven Labs was retained by a prominent clinical diagnostic network to design and develop a secure, HIPAA-compliant medical AI portal. The goal was to allow clinical staff to summarize patient histories, medical logs, and laboratory reports without violating data privacy mandates.
Over a four-month engagement, we engineered an AI-assisted diagnostic portal featuring local client-side sanitization, hardware-isolated cloud environments, database-level encryption via SQLCipher, and an immutable audit trail. The system processes over 1,200 clinical scans daily, sanitizes 100% of PII/PHI locally prior to transmission, and has reduced clinical summary drafting time by 75%, all while achieving a HIPAA-Ready compliance rating.
Business Problem
The client’s clinical network employs over 150 diagnostic physicians who process thousands of patient files daily. The manual preparation of clinical intake summaries, diagnostic briefs, and discharge reports was an operational bottleneck. Physicians spent up to 4 hours per shift performing repetitive data synthesis, leading to clinical fatigue, high administrative overhead, and delayed patient throughput.
The client attempted to use public generative AI tools (such as OpenAI's ChatGPT) to accelerate this reporting, but their compliance team quickly intervened. Transmitting patient files containing any of the 18 HIPAA-defined PHI identifiers (names, dates, zip codes, medical record numbers, etc.) to external servers without a Business Associate Agreement (BAA) and strict encryption violates HIPAA. Even with an enterprise BAA in place, cloud provider data logging, model retraining, and employee copy-paste actions each introduced significant data breach risks. Under HIPAA, systemic breaches of this nature carry penalties of up to $1.9 million annually and lead to severe reputational damage.
The client required a secure portal that allowed doctors to copy-paste clinical files, auto-summarize key trends, and generate reports while guaranteeing that no PHI ever left the local network.
Technical Challenges
To deliver a solution that satisfied both medical usability and federal cybersecurity audits, we had to address several technical challenges:
1. Client-Side Sanitization of Unstructured Text
Standard PII/PHI scrubbing techniques rely on regular expressions (regex). While regex is effective for extracting structured strings like Social Security Numbers, it fails on unstructured medical notes, where it cannot distinguish between a doctor's name, a patient's name, and a disease named after a physician (such as Parkinson's disease). Missing even one instance of a patient's name violates compliance rules. The solution required a highly accurate Named Entity Recognition (NER) model capable of running locally in the user's browser or thin client without internet connectivity.
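The limitation is easy to reproduce. The sketch below is illustrative only: the pattern, the function name `regexScrub`, and the sample note are invented for this example and are not taken from the production scrubber.

```javascript
// Illustrative only: pattern, function name, and sample note are hypothetical.
const SSN_PATTERN = /\b\d{3}-\d{2}-\d{4}\b/g;

function regexScrub(note) {
  // Replaces anything shaped like a US Social Security Number.
  return note.replace(SSN_PATTERN, '[REDACTED_SSN]');
}

const note =
  "Dr. Patel saw John Carter (SSN 123-45-6789) for early Parkinson's disease.";
const scrubbed = regexScrub(note);

// The SSN is masked, but all three name-like spans survive untouched.
console.log(scrubbed);
```

A pattern can find the SSN because its shape is fixed. Nothing about the shape of "Patel", "John Carter", or "Parkinson" tells a regex which is a clinician, which is a patient, and which is a diagnosis, which is why a context-aware NER model was needed.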
2. High-Performance Encryption at Rest at the Edge
Medical records must be cached locally on workstations to support offline workflows and minimize server latency. This cached data must be encrypted using strong cryptographic standards. If a workstation is lost or stolen, the local database must resist physical extraction. Achieving this required integrating database-level encryption whose decryption overhead is low enough that clinicians notice no lag.
3. Immutable Security Auditing (HIPAA Audit Trails)
The HIPAA Security Rule's audit controls standard, 45 CFR § 164.312(b), requires clinical systems to record and examine activity in systems that contain or use electronic PHI, which in practice means logging all PHI access, modification, and deletion. For these logs to hold up in an audit they must be tamper-proof; even an administrator with full root privileges on the database must not be able to edit, delete, or alter log histories. Standard database tables do not meet this requirement, as they allow direct write operations from privileged accounts.
4. BAA Cloud Gateway Architecture
The system needed to route sanitized medical requests to private cloud-hosted LLM endpoints (via Azure OpenAI) under a strict BAA. This gateway had to disable all telemetry logging, prompt caching, and input retention on the provider's side.
Solution Architecture
The portal architecture isolates patient data inside the local network boundary, transmitting only sanitized data to the private cloud LLM gateway.
            [ Local Network Boundary ]                                   [ Private VPC / Encrypted WAN ]
    +----------------------------------------+
    |    Doctor Workstation (Web Browser)    |
    |                                        |
    |  +----------------------------------+  |
    |  |      Patient Clinical Logs       |  |
    |  +----------------------------------+  |
    |                   ||                   |
    |                   \/                   |
    |  +----------------------------------+  |
    |  |  Local ONNX Runtime (NER Model)  |  |
    |  |  - Scrubs Names, Dates, MRNs     |  |
    |  +----------------------------------+  |
    |                   ||                   |
    |             Sanitized Text             |
    |                   ||                   |
    |                   \/                   |
    |  +----------------------------------+  |            TLS 1.3            +---------------------+
    |  |  Next.js Client Application      |  | <===========================> | Secure API Gateway  |
    |  |  - Encrypted with SQLCipher      |  |          (HTTPS/JWT)          |   (Node.js Proxy)   |
    |  +----------------------------------+  |                               +---------------------+
    +----------------------------------------+                                         ||
                                                                                   mTLS / VPN
                                                                                       ||
                                                                                       \/
                                                                             +---------------------+
                                                                             | Azure OpenAI Service|
                                                                             | (Private Endpoint)  |
                                                                             | - BAA Compliance    |
                                                                             | - Zero Log Retention|
                                                                             +---------------------+
                                                                                       ||
                                                                               Async Event Stream
                                                                                       ||
                                                                                       \/
                                                                             +---------------------+
                                                                             |   Immutable Audit   |
                                                                             |  Ledger DB (QLDB)   |
                                                                             +---------------------+
Component Breakdown
- Local ONNX Runtime Chunker: Runs within a browser Web Worker thread. It downloads a custom-trained named-entity recognition (NER) model once at session start and performs all PII masking locally in system memory.
- Next.js Local Workstation App: Serves as the user interface. It utilizes local browser database instances backed by SQLCipher to secure local drafts and histories.
- Secure API Gateway: Hosted on AWS inside an isolated Virtual Private Cloud (VPC). It validates user JSON Web Tokens (JWT), strips headers, and acts as a zero-log reverse proxy.
- Azure OpenAI Private Endpoint: A single-tenant, isolated deployment of GPT-4o. The BAA configuration disables remote data logging and caching, ensuring that all data is processed in-memory and deleted immediately after generation.
- AWS QLDB (Quantum Ledger Database): An immutable, cryptographically verifiable transaction ledger that records every API request, audit trail event, and user access action.
Technology Stack
We designed the technology stack to prioritize zero-trust network principles, high-performance local execution, and strict compliance:
- Application Architecture: Developed using the MERN Stack (MongoDB, Express, React, Node.js). Next.js serves as our application framework. For details on how we structure Next.js projects, see our SaaS Development page.
- Client-Side Model Execution: ONNX Runtime Web, running a compressed, WebAssembly-optimized RoBERTa-NER model trained on medical corpus data.
- Local Cryptographic Storage: SQLCipher, integrated into SQLite wrappers on local client instances to encrypt local databases using 256-bit AES in CBC mode.
- Cloud Security Layer: AWS Key Management Service (KMS) for cryptographic key generation and rotation, combined with AWS VPC private endpoints.
- Audit Trail Database: AWS QLDB (Quantum Ledger Database), chosen for its cryptographically verifiable ledger journal that prevents log modification.
- Infrastructure Security: HTTPS over TLS 1.3 with restricted cipher suites (e.g., TLS_AES_256_GCM_SHA384), monitored using Datadog and AWS CloudTrail.
For more information on security auditing and VAPT implementations, check out our VAPT Penetration Testing page.
Implementation Process
We completed the development and compliance validation over a 16-week cycle:
Week 1: Training and Compiling the Sanitization Model
We began by training a custom spaCy and RoBERTa NER model on a clinical dataset. The goal was to achieve high accuracy in detecting the 18 HIPAA PHI categories while minimizing false positives on medical terminology. Once trained, we converted the PyTorch weights into ONNX format and applied quantization to reduce the file size from 490 MB to 32 MB, allowing it to download quickly to client browsers.
Week 2: Local Web Worker Engine Setup
To prevent the web browser's UI thread from freezing during the parsing of long clinical documents, we offloaded the ONNX model execution to a background browser Web Worker.
Here is a simplified version of our Web Worker (sanitizer.worker.js):

    import { InferenceSession, Tensor } from 'onnxruntime-web';

    let session = null;

    // Lazily create the ONNX inference session; the model runs locally via WebAssembly.
    async function initSession() {
      if (!session) {
        session = await InferenceSession.create('/models/roberta_ner_quantized.onnx', {
          executionProviders: ['wasm']
        });
      }
    }

    // Label indices as emitted by the model (e.g., 1 = B-PER, 2 = B-DATE).
    const REDACTIONS = { 1: '[REDACTED_NAME]', 2: '[REDACTED_DATE]' };

    // Return the index of the highest logit for the token at position `index`.
    function argmaxLabel(logits, index, numLabels) {
      let best = 0;
      for (let k = 1; k < numLabels; k++) {
        if (logits[index * numLabels + k] > logits[index * numLabels + best]) best = k;
      }
      return best;
    }

    self.onmessage = async function (e) {
      const { text } = e.data;
      await initSession();

      // Whitespace tokenization for demonstration; production uses the model's byte-level BPE tokenizer.
      const tokens = text.split(/\s+/);
      const inputIds = new Int32Array(tokens.map((_, i) => i)); // Placeholder IDs, not real vocabulary IDs
      const tensor = new Tensor('int32', inputIds, [1, inputIds.length]);

      const output = await session.run({ input_ids: tensor });
      const logits = output.logits.data;        // Flattened [batch, tokens, labels]
      const numLabels = output.logits.dims[2];  // Number of labels per token

      // Mask each token whose most likely label is a PHI category.
      const sanitizedTokens = tokens.map((token, index) => {
        const label = argmaxLabel(logits, index, numLabels);
        return REDACTIONS[label] ?? token;
      });

      self.postMessage({ sanitizedText: sanitizedTokens.join(' ') });
    };
This worker ensured that all document sanitization occurred locally within the browser memory space, requiring no network round-trips.
Week 3: Configuring Local SQLCipher Databases
For local draft storage, we implemented SQLite encrypted with SQLCipher. When a doctor logs into the portal, the NestJS authentication server issues a transient database key derived from the user's password using the PBKDF2 key-derivation function. This key resides only in volatile memory and decrypts the local database dynamically. If the workstation is powered down or locked, the local database remains completely secure.
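The key-derivation step can be sketched with Node's built-in `crypto` module. This is a minimal illustration, not the portal's code: the function name `deriveDatabaseKey`, the SHA-512 hash choice, the 16-byte salt, and the sample passphrase are all assumptions, and the iteration count is the one quoted in the FAQ section.

```javascript
const crypto = require('node:crypto');

// Assumed parameters for this sketch: 64,000 iterations, a 32-byte (256-bit) key,
// and SHA-512 as the PBKDF2 hash. The salt would be stored alongside the database.
const ITERATIONS = 64000;
const KEY_BYTES = 32;

function deriveDatabaseKey(password, salt) {
  // Deterministic: the same password and salt always yield the same key.
  return crypto.pbkdf2Sync(password, salt, ITERATIONS, KEY_BYTES, 'sha512');
}

const salt = crypto.randomBytes(16);
const key = deriveDatabaseKey('example-passphrase', salt);

// SQLCipher accepts a raw 256-bit key as a hex blob literal in PRAGMA key.
const pragma = `PRAGMA key = "x'${key.toString('hex')}'";`;
console.log(key.length);
```

Because the derivation is deterministic, nothing but the salt needs to be persisted; the key itself can be recomputed at login and held only in memory for the life of the session.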
Week 4: Establishing Private Cloud Gateways
We configured a secure API proxy in Node.js to bridge the local application and the Azure OpenAI endpoints. The proxy validates the JWT in each request's authorization header, then strips custom metadata and identifying headers so that upstream requests cannot be traced back to an individual user. We signed a BAA with Microsoft and disabled telemetry logging, prompt caching, and input retention on the provider side, ensuring no input histories are stored on remote servers.
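One way to implement the header stripping is an allowlist: instead of enumerating headers to remove, the proxy forwards only the few it knows to be safe. The sketch below is an assumption about how this could look; `FORWARDED_HEADERS`, `buildUpstreamHeaders`, and the choice of an `api-key` header for the upstream credential are illustrative, not the gateway's actual code.

```javascript
// Hypothetical allowlist: only these headers are forwarded upstream.
const FORWARDED_HEADERS = new Set(['content-type', 'accept']);

function buildUpstreamHeaders(incomingHeaders, upstreamApiKey) {
  const outgoing = {};
  for (const [name, value] of Object.entries(incomingHeaders)) {
    const lower = name.toLowerCase();
    // Anything not explicitly allowed (cookies, user JWT, custom X- headers) is dropped.
    if (FORWARDED_HEADERS.has(lower)) outgoing[lower] = value;
  }
  // The gateway authenticates upstream with its own credential,
  // so the user's token never travels past the proxy.
  outgoing['api-key'] = upstreamApiKey;
  return outgoing;
}
```

An allowlist fails closed: a new tracking or identity header added by a browser extension or an intermediate hop is dropped by default rather than leaked by default.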
Week 5: Creating the Immutable Ledger (AWS QLDB)
To meet the auditing requirements of HIPAA Section 164.312(b), we integrated AWS QLDB. Every user action (logins, record accesses, sanitization executions, and system configuration updates) publishes an audit payload to an AWS Kinesis stream, which writes directly to the QLDB ledger. QLDB's hash-chained block design guarantees that any attempt to alter past records invalidates the cryptographic digest of the entire chain, so tampering is detected and compliance officers are alerted immediately.
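The hash-chaining idea itself is simple to demonstrate. The following is a simplified sketch of the concept only; it does not use or imitate the QLDB API, and the names `appendEntry`, `verifyLedger`, and the sample events are hypothetical.

```javascript
const crypto = require('node:crypto');

const GENESIS_HASH = '0'.repeat(64);

// Each entry's hash covers the previous entry's hash plus its own payload.
function hashEntry(previousHash, event) {
  return crypto
    .createHash('sha256')
    .update(previousHash)
    .update(JSON.stringify(event))
    .digest('hex');
}

function appendEntry(ledger, event) {
  const previousHash = ledger.length ? ledger[ledger.length - 1].hash : GENESIS_HASH;
  ledger.push({ event, previousHash, hash: hashEntry(previousHash, event) });
  return ledger;
}

// Recompute every hash from the start; any edited entry breaks the chain.
function verifyLedger(ledger) {
  let previousHash = GENESIS_HASH;
  for (const entry of ledger) {
    if (entry.previousHash !== previousHash) return false;
    if (entry.hash !== hashEntry(previousHash, entry.event)) return false;
    previousHash = entry.hash;
  }
  return true;
}

const ledger = [];
appendEntry(ledger, { actor: 'dr.lee', action: 'LOGIN' });
appendEntry(ledger, { actor: 'dr.lee', action: 'RECORD_ACCESS', recordId: 'demo-001' });
appendEntry(ledger, { actor: 'dr.lee', action: 'SANITIZE' });
```

Editing any historical event changes its hash, which no longer matches the `previousHash` stored in the next entry. A chain like this detects tampering only if the latest digest is also held somewhere the attacker cannot rewrite, which is part of what a managed ledger provides.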
Week 6: Verification, Compliance Auditing, and VAPT (Weeks 13-16)
We spent the final four weeks testing the portal. We simulated 20,000 document transfers, verified the accuracy of the NER model under varied conditions, and ran automated VAPT vulnerability scans. We also hired an external compliance auditing firm to review the system, which verified the architecture as 100% HIPAA-Ready with zero compliance exceptions.
Security Considerations
Maintaining patient confidentiality requires strict adherence to secure data processing boundaries:
- Role-Based and Attribute-Based Access Control (RBAC/ABAC): Access to patient summaries is governed by user role and department. A neurologist, for example, cannot view diagnostic briefs generated by the cardiology department unless a shared care relation is logged in the CRM.
- Local Session Expirations: The Next.js client uses strict session controls. If the user is inactive for more than 15 minutes, local decryption keys are cleared from browser memory, and the user is logged out, preventing unauthorized access to unattended screens.
- Transit and API Gateway Decoupling: Web clients communicate with the API Gateway over TLS 1.3 using strict cipher settings:
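    ssl_protocols TLSv1.3;
    # TLS 1.3 suites are set through OpenSSL; ssl_ciphers only affects TLS 1.2 and earlier.
    ssl_conf_command Ciphersuites TLS_AES_256_GCM_SHA384:TLS_CHACHA20_POLY1305_SHA256;
    ssl_prefer_server_ciphers on;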
This configuration ensures that data in transit is protected against interception and decryption.
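Returning to the session-expiration rule in the list above, the idle timeout and key clearing can be sketched as a small holder object. This is an illustration under assumptions: the `KeySession` class and its injectable clock are hypothetical, and zeroing a typed array is best-effort only, since a JavaScript runtime gives no guarantee that no other copy of the bytes exists.

```javascript
const IDLE_LIMIT_MS = 15 * 60 * 1000; // 15 minutes of inactivity

class KeySession {
  constructor(keyBytes, now = () => Date.now()) {
    this.key = Uint8Array.from(keyBytes); // private copy of the key material
    this.now = now;                       // injectable clock, useful for testing
    this.lastActivity = now();
  }

  // Call on user interaction (keystroke, click, navigation).
  touch() {
    this.lastActivity = this.now();
  }

  // Returns the key, or null once the idle limit has passed.
  getKey() {
    if (this.key && this.now() - this.lastActivity > IDLE_LIMIT_MS) this.wipe();
    return this.key;
  }

  // Overwrite the bytes before dropping the reference.
  wipe() {
    if (this.key) this.key.fill(0);
    this.key = null;
  }
}
```

Checking expiry inside `getKey` means a stale key cannot be used even if a background timer was throttled or never fired, which browsers commonly do to inactive tabs.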
For a deeper dive into routing secure data through restricted networks, read our blog on secure AI in restricted networks and see our case study on VAPT bank audits.
Performance Optimizations
To ensure the portal remains responsive during high-volume operations, we implemented several performance-tuning steps:
1. WebAssembly Acceleration for ONNX Runtime
Running ML models in browsers can be slow. To optimize performance, we used the WebAssembly (WASM) build of ONNX Runtime with SIMD and multi-threading support enabled. This allowed the browser to vectorize model inference and run it across multiple CPU cores, reducing the sanitization time of a 10-page clinical document from 4.2 seconds to 180 ms.
2. AWS KMS Key Caching
SQLCipher decrypts records on-the-fly. Querying AWS KMS for the master key for every database transaction introduces network latency. We implemented client-side key caching using the AWS Encryption SDK, caching data keys in memory for 10 minutes, reducing network latency and AWS KMS billing costs.
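The effect of the 10-minute cache can be shown with a generic time-to-live wrapper. Note that this is not the AWS Encryption SDK's caching API; it is a minimal stand-in for the same idea, with a hypothetical `createKeyCache` helper and a synchronous `fetchKey` callback in place of the real network call.

```javascript
const TTL_MS = 10 * 60 * 1000; // cache entries live for 10 minutes

function createKeyCache(fetchKey, now = () => Date.now()) {
  const entries = new Map();

  return function getKey(keyId) {
    const hit = entries.get(keyId);
    // Serve from memory while the entry is still fresh.
    if (hit && now() - hit.storedAt < TTL_MS) return hit.key;

    // Otherwise go back to the key service (a remote call in practice).
    const key = fetchKey(keyId);
    entries.set(keyId, { key, storedAt: now() });
    return key;
  };
}
```

With this wrapper, a burst of database transactions inside the TTL window triggers a single key fetch. The trade-off is that a key revoked upstream stays usable on the client until its entry expires, so the TTL bounds that exposure.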
3. Semantic Markdown Parsing and Document Chunking
To keep token costs manageable and prevent context window issues, we chunked documents using Markdown headers. This kept individual chunks within semantic boundaries, ensuring the LLM processed contextually complete segments.
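A minimal sketch of header-based chunking follows. The function name `chunkByHeadings` and the sample document are hypothetical, and the sketch assumes ATX-style headings (`#` through `######`); it does not handle `#` characters inside fenced code blocks or enforce a maximum chunk size.

```javascript
function chunkByHeadings(markdown) {
  const chunks = [];
  let current = null;

  for (const line of markdown.split('\n')) {
    const match = /^(#{1,6})\s+(.*)$/.exec(line);
    if (match) {
      // A new heading closes the previous chunk and opens the next one.
      if (current) chunks.push(current);
      current = { heading: match[2].trim(), lines: [line] };
    } else {
      // Text before the first heading goes into an untitled chunk.
      if (!current) current = { heading: null, lines: [] };
      current.lines.push(line);
    }
  }
  if (current) chunks.push(current);

  return chunks
    .map((c) => ({ heading: c.heading, text: c.lines.join('\n').trim() }))
    .filter((c) => c.text.length > 0);
}

const sample = '# History\nNo known allergies.\n\n## Labs\nHbA1c 6.1%\n';
const chunks = chunkByHeadings(sample);
console.log(chunks.map((c) => c.heading));
```

Each chunk keeps its own heading line, so a section sent to the model on its own still carries the context of what kind of section it is.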
For more on this technique, see our blog posts on AI infrastructure engineering beyond chatbots and advanced RAG chunking.
Results & Outcomes
Four months post-deployment, the portal has significantly improved clinical workflow efficiency and data security:
- 100% PII Sanitization Rate: Testing across 10,000 mock documents showed zero leakage of patient identifiers.
- 75% Reduction in Summary Drafting Time: Drafting clinical summaries now takes less than 3 minutes, down from 12 minutes.
- 100% HIPAA Compliance: The system passed external security audits with no compliance findings.
- 1,200+ Daily Scans: The platform maintains low latency under high concurrent load across the clinical network.
Lessons Learned
Developing and deploying a healthcare AI system provided several valuable engineering insights:
- Strict Sanitization Requires a Fail-Safe Design: When the NER model returned low-confidence predictions (e.g., below 85%), it occasionally missed abbreviations. We changed the system to fail safe: if a token's confidence falls below the threshold, it is redacted automatically, prioritizing security over clarity.
- Model File Caching is Crucial: Downloading the 32 MB quantized model file on every page reload causes latency. We resolved this by caching the model weights locally using the IndexedDB API, reducing initial load times to under 150 ms after the first visit.
- Session Cleanses Must Purge Local State: Logging out of the application is not enough. We had to implement code that explicitly clears the SQLCipher key memory space and writes random bytes over the volatile variables in memory, protecting against RAM dumping attacks.
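The fail-safe rule from the first lesson above can be sketched as follows. The label set, the function names, and the use of a softmax over per-token logits as the confidence measure are assumptions made for illustration; only the 85% threshold comes from the text.

```javascript
const CONFIDENCE_THRESHOLD = 0.85;
const LABELS = ['O', 'NAME', 'DATE']; // hypothetical label set; 'O' means "not PHI"

// Convert raw logits into probabilities that sum to 1.
function softmax(logits) {
  const max = Math.max(...logits);
  const exps = logits.map((v) => Math.exp(v - max));
  const sum = exps.reduce((a, b) => a + b, 0);
  return exps.map((v) => v / sum);
}

function redactToken(token, logits) {
  const probs = softmax(logits);
  const best = probs.indexOf(Math.max(...probs));

  // Fail safe: if the model is unsure, redact regardless of the predicted label.
  if (probs[best] < CONFIDENCE_THRESHOLD) return '[REDACTED]';

  return LABELS[best] === 'O' ? token : `[REDACTED_${LABELS[best]}]`;
}
```

For example, `redactToken('aspirin', [8, 0, 0])` keeps the token because the model is confident it is not PHI, while an ambiguous abbreviation with near-equal logits is masked even though its top label is "not PHI".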
For more on edge data processing, read our blog Edge AI vs Cloud AI Architecture and see our case study Offline Bluetooth AI Relay.
Frequently Asked Questions (FAQs)
1. Why run the NER model locally rather than on a secure cloud server?
Running the model locally ensures that unencrypted PHI never leaves the doctor’s workstation, aligning with the principles of zero-trust security.
If sanitization were done on a server, we would be sending unencrypted PHI over the network, expanding the scope of HIPAA auditing requirements and increasing vulnerability points. Local sanitization ensures that only anonymized data is transmitted.
2. How does SQLCipher secure database storage on compromised devices?
SQLCipher uses peer-reviewed 256-bit AES-CBC encryption, encrypting every page of the SQLite database. Without the correct key, the database looks like random data.
Because the decryption key is derived using PBKDF2 with 64,000 iterations and is kept only in volatile memory, the data is protected even if the physical hard drive is extracted from the workstation.
3. How does the system handle complex formatting during sanitization?
Our local tokenizer parses text using Markdown schemas, preserving headings, lists, and tables. Masked tokens are replaced with placeholders (e.g., [REDACTED_NAME]) within the document structure.
The LLM processes this formatted text and generates the summary, preserving the original formatting layout in the final report.
4. What happens if the Azure OpenAI API gateway experiences downtime?
The API Gateway is configured with a high-availability fallback design. If our primary Azure endpoint fails, the gateway automatically routes traffic to a secondary Azure region (e.g., from US East to US West).
If both regions are unavailable, requests are queued, and the web client notifies the user that the system is in offline mode, allowing them to continue editing drafts locally using SQLCipher.
5. How does AWS QLDB guarantee log immutability?
AWS QLDB features a built-in journal that stores a cryptographically signed transaction log. Every record update or access log entry is hashed using SHA-256 and chained to the previous entry.
This creates a ledger that cannot be altered or deleted. Any modification attempt breaks the cryptographic chain, making tampering instantly detectable during compliance audits.
Schema & SEO Metadata
Recommended JSON-LD Schema
    {
      "@context": "https://schema.org",
      "@type": "TechArticle",
      "headline": "HIPAA-Compliant Medical AI Portal Case Study",
      "description": "How Seven Labs engineered a HIPAA-compliant medical AI portal using local ONNX sanitization, SQLCipher database encryption, and immutable QLDB audit ledgers.",
      "keywords": "SaaS Development, HIPAA Compliance, Medical AI, ONNX WebAssembly, SQLCipher, QLDB Ledger, Cybersecurity, Data Privacy",
      "inLanguage": "en-US",
      "author": {
        "@type": "Organization",
        "name": "Seven Labs",
        "url": "https://www.sevenlabs.site"
      },
      "publisher": {
        "@type": "Organization",
        "name": "Seven Labs",
        "logo": {
          "@type": "ImageObject",
          "url": "https://res.cloudinary.com/dywx7ldqr/image/upload/v1779223334/media/img_01.png"
        }
      },
      "about": [
        {
          "@type": "Service",
          "name": "SaaS Development - Next.js & MERN",
          "url": "https://www.sevenlabs.site/services/saas-development"
        },
        {
          "@type": "Service",
          "name": "VAPT Penetration Testing & Cybersecurity",
          "url": "https://www.sevenlabs.site/services/vapt-penetration-testing"
        }
      ]
    }