Case Study Coming soon

High-Risk Disease Detection — Clinical Document Intelligence

A production system that reads ~3 million inbound clinical documents per month (ER records, specialist notes, labs, faxes) and uses vector embeddings to surface high-risk conditions for clinical review in near real time.

Client ChenMed LLC

Role Principal Architect

Period 2023 – Present

Scale ~3M documents/month · XML, PDF, scanned TIF · 12,000+ recommendations in a two-week period

In value-based care, a diagnosis that arrives from an outside encounter and sits unread is a patient outcome waiting to get worse. This system reads every inbound clinical document and uses vector embeddings to surface semantic matches against ICD-10, HCC, and high-risk condition taxonomies (cancers, diabetes, cardiovascular and renal disease) so care teams learn about new diagnoses in near real time.
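To make the matching idea concrete, here is a minimal sketch of scoring a document against a condition taxonomy by cosine similarity and surfacing only matches above a threshold. It is an illustration under stated assumptions, not the production design: the `TAXONOMY` entries, the `embed` and `surface_matches` functions, and the `threshold` value are all hypothetical, and `embed` is a toy bag-of-words stand-in that only rewards shared words. A real embedding model would be needed to capture the semantic similarity described above.

```python
import math
import re
from collections import Counter

# Hypothetical mini-taxonomy (label -> descriptive text). Illustrative only;
# these are not real ICD-10 or HCC entries.
TAXONOMY = {
    "diabetes": "type 2 diabetes mellitus hyperglycemia insulin",
    "renal disease": "chronic kidney disease renal failure dialysis",
    "cancer": "malignant neoplasm carcinoma tumor metastasis",
}


def embed(text):
    """Toy stand-in for an embedding model: a sparse word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def surface_matches(document, taxonomy=TAXONOMY, threshold=0.2):
    """Return (label, score) pairs at or above threshold, best match first."""
    doc_vec = embed(document)
    scored = [(label, cosine(doc_vec, embed(desc)))
              for label, desc in taxonomy.items()]
    kept = [(label, score) for label, score in scored if score >= threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


note = "ER visit: patient with chronic kidney disease started dialysis"
matches = surface_matches(note)
print(matches)
```

The threshold is the tuning knob in this sketch: raising it surfaces fewer, higher-confidence matches for review, and lowering it surfaces more candidates at the cost of more noise.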

The full write-up will cover:

Vector-embedding document classification across heterogeneous formats (XML, PDF, scanned TIF)
Taxonomy matching against ICD-10, HCC, and high-risk condition sets
The clinical-review workflow: tuning what surfaces and what does not
Operating semantic search at the scale of millions of documents per month
Why manual review cannot scale to this problem, and what that means for patient safety

Full case study coming soon.
