Work On-Prem LLM Inference · proprietary
ayLLM — Natural-Language PACS Search
A fully-local natural-language query interface for the PACS database — fine-tuned 270M Gemma, three-stage NL → JSON → SQL, no cloud API and no equivalent shipping from any commercial vendor.
270M
fine-tuned Gemma · Q4_0 GGUF
10K
synthetic training pairs
0 cloud
fully local · HIPAA-safe
Problem
Finding studies means knowing the query language. Radiology staff want to ask in plain English — “MRIs for Smith last month” — but the data is PHI, so a cloud LLM is off the table. No commercial PACS vendor ships a local natural-language query interface.
Architecture
A three-stage local-inference pipeline in which the model never writes the SQL that runs.
- Stage 1 (NL → JSON): a fine-tuned Gemma 270M (chat-format, near-deterministic) emits a structured intent object, with JSON repair for missing keys.
- Stage 2 (JSON → SQL): field names map through a schema dictionary into parameterized SQL (`%s` placeholders, `ILIKE` for case-insensitive match). The model never writes executable SQL directly, which keeps it safe and validatable.
- Stage 3 (execution): runs through the existing `as_dal` PostgreSQL layer against the live Store.
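The first two stages can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's code: the table name `studies`, the field names, the operator set, the intent shape (`filters` plus `limit`), and the helper names `repair_intent` and `build_query` are all hypothetical. The point is the shape of the seam: model output is parsed and repaired, every field and operator is checked against an allow-list, and values travel only as bound parameters.

```python
from __future__ import annotations

import json

# All names below are illustrative assumptions, not the real schema.
TABLE = "studies"

# Schema dictionary: intent field -> (column name, operators allowed on it).
FIELD_MAP = {
    "patient_name": ("patient_name", {"contains"}),
    "modality": ("modality", {"eq"}),
    "study_date": ("study_date", {"gte", "lte"}),
}
SQL_OPS = {"eq": "=", "contains": "ILIKE", "gte": ">=", "lte": "<="}
DEFAULTS = {"filters": [], "limit": 50}


def repair_intent(raw: str) -> dict:
    """Parse model output; fall back to defaults for anything missing."""
    try:
        parsed = json.loads(raw)
    except json.JSONDecodeError:
        parsed = {}
    if not isinstance(parsed, dict):
        parsed = {}
    intent = {**DEFAULTS, **parsed}
    filters = intent["filters"]
    intent["filters"] = list(filters) if isinstance(filters, list) else []
    return intent


def build_query(intent: dict) -> tuple[str, list]:
    """Turn a validated intent into SQL text plus a separate parameter list."""
    clauses, params = [], []
    for f in intent["filters"]:
        field, op, value = f.get("field"), f.get("op"), f.get("value")
        if field not in FIELD_MAP:
            raise ValueError(f"unknown field: {field!r}")
        column, allowed = FIELD_MAP[field]
        if op not in allowed or value is None:
            raise ValueError(f"invalid filter on {field!r}")
        if op == "contains":
            value = f"%{value}%"  # substring match; ILIKE ignores case
        clauses.append(f"{column} {SQL_OPS[op]} %s")
        params.append(value)
    sql = f"SELECT * FROM {TABLE}"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    sql += " LIMIT %s"
    params.append(int(intent["limit"]))
    return sql, params


raw = '{"filters": [{"field": "patient_name", "op": "contains", "value": "Smith"}]}'
sql, params = build_query(repair_intent(raw))
# sql    -> SELECT * FROM studies WHERE patient_name ILIKE %s LIMIT %s
# params -> ['%Smith%', 50]
```

Only identifiers from the fixed dictionary are ever interpolated into the SQL string. The values stay in `params` and would be handed to the database layer alongside the query text, the way a DB-API driver with `%s` placeholders (psycopg, for example) expects them.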
A 270M model is viable because the domain is narrow — a single table — and the intermediate JSON gives a validation seam. Training data is 10,000 synthetic NL/JSON pairs across basic, multi-field, and aggregation templates.
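A template-based generator for such pairs might look like the sketch below. The vocabularies, phrasings, intent shape, and function names are assumptions made for illustration; only the three template families (basic, multi-field, aggregation) come from the description above.

```python
import json
import random

# Tiny illustrative vocabularies; a real generator would need far larger pools.
SURNAMES = ["Smith", "Garcia", "Chen", "Okafor"]
# Spoken label -> stored modality code (assumed mapping).
MODALITIES = {"MRI": "MR", "CT": "CT", "ultrasound": "US", "X-ray": "CR"}


def basic(rng):
    label, code = rng.choice(list(MODALITIES.items()))
    intent = {"filters": [{"field": "modality", "op": "eq", "value": code}]}
    return f"show all {label} studies", intent


def multi_field(rng):
    label, code = rng.choice(list(MODALITIES.items()))
    name = rng.choice(SURNAMES)
    intent = {"filters": [
        {"field": "modality", "op": "eq", "value": code},
        {"field": "patient_name", "op": "contains", "value": name},
    ]}
    return f"{label} studies for {name}", intent


def aggregation(rng):
    label, code = rng.choice(list(MODALITIES.items()))
    intent = {
        "aggregate": "count",
        "filters": [{"field": "modality", "op": "eq", "value": code}],
    }
    return f"how many {label} studies are there", intent


TEMPLATES = {"basic": basic, "multi_field": multi_field, "aggregation": aggregation}


def generate_pairs(n, seed=0):
    """Cycle through the template families and emit n prompt/completion pairs."""
    rng = random.Random(seed)  # seeded so a dataset can be regenerated exactly
    kinds = list(TEMPLATES)
    pairs = []
    for i in range(n):
        kind = kinds[i % len(kinds)]
        text, intent = TEMPLATES[kind](rng)
        pairs.append({"kind": kind, "prompt": text, "completion": json.dumps(intent)})
    return pairs


pairs = generate_pairs(6)
```

With vocabularies this small, most of 10,000 generated pairs would be duplicates, so a usable generator would also vary phrasing, dates, and field combinations. The sketch only shows the mechanism: each template produces the natural-language text and its target JSON together, so the labels are correct by construction.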
Results
End-to-end natural-language search over the PACS database, quantized to Q4_0 for minimal memory and running entirely on local hardware — zero cloud calls.
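For a sense of scale, the arithmetic behind Q4_0 can be sketched as below. This is an idealized estimate, not a measurement of this model's file: it assumes every one of a nominal 270 million weights is stored in Q4_0 blocks (32 four-bit values plus one 16-bit scale per block). Real GGUF files typically keep some tensors at higher precision, and runtime memory also includes the KV cache and working buffers.

```python
Q4_0_BLOCK_WEIGHTS = 32      # weights per quantization block
Q4_0_BLOCK_BYTES = 2 + 16    # one fp16 scale + 32 packed 4-bit values


def q4_0_bytes(n_params: int) -> int:
    """Bytes needed if all n_params weights were stored as Q4_0 blocks."""
    blocks = -(-n_params // Q4_0_BLOCK_WEIGHTS)  # ceiling division
    return blocks * Q4_0_BLOCK_BYTES


def fp16_bytes(n_params: int) -> int:
    """Bytes needed for the same weights in plain 16-bit floats."""
    return n_params * 2


n = 270_000_000  # nominal parameter count
bits_per_weight = Q4_0_BLOCK_BYTES * 8 / Q4_0_BLOCK_WEIGHTS  # 4.5
q4_mb = q4_0_bytes(n) / 1e6    # about 152 MB
fp16_mb = fp16_bytes(n) / 1e6  # 540 MB
```

Under these assumptions the quantized weights take a little over a quarter of the space of a 16-bit copy.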
Impact
A query interface no vendor offers, built to a hard constraint: patient data never leaves the building.