Work On-Prem LLM Inference · proprietary

ayLLM — Natural-Language PACS Search

A fully-local natural-language query interface for the PACS database — fine-tuned 270M Gemma, three-stage NL → JSON → SQL, no cloud API and no equivalent shipping from any commercial vendor.

270M

fine-tuned Gemma · Q4_0 GGUF

10K

synthetic training pairs

0 cloud

fully local · HIPAA-safe

Problem

Finding studies means knowing the query language. Radiology staff want to ask in plain English — “MRIs for Smith last month” — but the data is PHI, so a cloud LLM is off the table. No commercial PACS vendor ships a local natural-language query interface.

Architecture

A three-stage local-inference pipeline that never emits raw model SQL.

Stage 1 — NL → JSON: a fine-tuned Gemma 270M (chat-format, near-deterministic) emits a structured intent object, with JSON repair for missing keys.
Stage 2 — JSON → SQL: field names map through a schema dictionary into parameterized SQL (%s placeholders, ILIKE for case-insensitive match) — the model never writes executable SQL directly, which keeps it safe and validatable.
Stage 3 — execution: runs through the existing as_dal PostgreSQL layer against the live Store.

A 270M model is viable because the domain is narrow — a single table — and the intermediate JSON gives a validation seam. Training data is 10,000 synthetic NL/JSON pairs across basic, multi-field, and aggregation templates.

Results

End-to-end natural-language search over the PACS database, quantized to Q4_0 for minimal memory and running entirely on local hardware — zero cloud calls.

Impact

A query interface no vendor offers, built to a hard constraint: patient data never leaves the building.