FerriScribe listens to your doctor-patient conversations and generates structured SOAP notes using a local AI model of your choice that runs entirely on your computer. No cloud APIs. No data transmission. No subscriptions. HIPAA-compliant by design, because nothing ever leaves your device.
Fine-tuned on real doctor-patient conversations. Understands medical terminology, clinical reasoning, and standard SOAP note structure out of the box.
All audio processing and note generation happen on your machine. No cloud APIs, no data transmission, no third-party access. HIPAA-compliant by architecture, not by policy.
Records the consultation, transcribes speech to text, and generates a structured SOAP note, ready for your review before the next patient sits down.
Further fine-tune the model to your specialty, your templates, and your documentation style. Your notes, your way.
Audit every line of code. Fork it, contribute, or customize it for your clinic. No black boxes, no vendor lock-in, no future price hikes.
macOS, Linux, and Windows. Runs on any capable machine with 16GB+ RAM, from a MacBook Air to a clinic workstation.
No per-seat licenses. No API fees. No cloud subscriptions. Download it, run it, own it. One less line item on your practice budget.
Optimized with quantized models (Qwen3 30B A3B) for local inference. No GPU server, no special equipment: just a capable laptop or desktop.
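As a rough sketch of what a local-inference configuration might look like, here is a minimal Rust example; the struct, field names, and defaults are illustrative assumptions, not FerriScribe's actual settings.

```rust
// Hypothetical inference settings; field names and defaults are illustrative
// assumptions, not FerriScribe's actual configuration format.

struct InferenceConfig {
    model: String,       // quantized model identifier
    quant_bits: u8,      // e.g. 4-bit weights so the model fits in consumer RAM
    context_tokens: u32, // how much transcript the model can attend to at once
    cpu_threads: usize,  // thread count for CPU-only machines
}

fn default_config() -> InferenceConfig {
    InferenceConfig {
        model: String::from("qwen3-30b-a3b"),
        quant_bits: 4,
        context_tokens: 8192,
        cpu_threads: std::thread::available_parallelism()
            .map(|n| n.get())
            .unwrap_or(4),
    }
}

fn main() {
    let cfg = default_config();
    println!(
        "loading {} ({}-bit) with {} CPU threads, {}-token context",
        cfg.model, cfg.quant_bits, cfg.cpu_threads, cfg.context_tokens
    );
}
```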
Hit record before the consultation. FerriScribe captures audio and transcribes it locally with high-quality speech-to-text.
The local AI model analyzes the transcript on-device, extracting clinical findings, assessments, and plans.
A structured SOAP note appears in seconds. Review, edit, and copy it into your EMR. Done before your next patient.
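To make the three steps above concrete, here is a minimal Rust sketch of the record-transcribe-generate flow; every type and function name is hypothetical, not FerriScribe's actual API.

```rust
// Illustrative sketch of the record -> transcribe -> generate flow described
// above. All types and function names are hypothetical, not FerriScribe's API.

struct Transcript(String);

struct SoapNote {
    subjective: String,
    objective: String,
    assessment: String,
    plan: String,
}

// Capture microphone audio locally (stubbed here).
fn record_audio() -> Vec<f32> {
    Vec::new()
}

// Run local speech-to-text over the captured samples (stubbed here).
fn transcribe(_samples: &[f32]) -> Transcript {
    Transcript(String::from("Patient reports three days of cough..."))
}

// Run the local model over the transcript to draft a SOAP note (stubbed here).
fn generate_note(transcript: &Transcript) -> SoapNote {
    SoapNote {
        subjective: transcript.0.clone(),
        objective: String::new(),
        assessment: String::new(),
        plan: String::new(),
    }
}

fn main() {
    let audio = record_audio();
    let transcript = transcribe(&audio);
    let note = generate_note(&transcript);
    // The clinician reviews and edits before copying into the EMR.
    println!(
        "S: {}\nO: {}\nA: {}\nP: {}",
        note.subjective, note.objective, note.assessment, note.plan
    );
}
```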
FerriScribe runs entirely on your local machine: no cloud APIs, no data transmission, no third-party servers. Patient conversations never leave your device. This isn't a privacy policy you have to trust; it's an architecture you can verify. Every line of code is open source and auditable.
Native Apple Silicon support with MLX optimization. Runs great on MacBook Air, MacBook Pro, and Mac Studio with M-series chips.
Full NVIDIA CUDA and AMD ROCm support. Run it on your workstation, a home server, or a dedicated clinic machine.
DirectML and CUDA support. Works on any capable Windows laptop or desktop with sufficient RAM for local inference.
The only requirement: a computer with enough RAM to run local models (16GB+ recommended).
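For a sense of how per-platform acceleration could be chosen at runtime, the Rust snippet below mirrors the platform support listed above; the enum and selection logic are illustrative assumptions, not FerriScribe's real implementation.

```rust
// Hypothetical backend selection mirroring the platform support above.
// The enum and function are illustrative, not FerriScribe's real API.

#[allow(dead_code)]
#[derive(Debug)]
enum Backend {
    Mlx,      // Apple Silicon (MLX)
    Cuda,     // NVIDIA GPUs
    Rocm,     // AMD GPUs
    DirectMl, // Windows accelerators
    Cpu,      // fallback on any machine with enough RAM
}

fn pick_backend() -> Backend {
    if cfg!(target_os = "macos") {
        Backend::Mlx
    } else if cfg!(target_os = "windows") {
        Backend::DirectMl // or Backend::Cuda on NVIDIA hardware
    } else {
        Backend::Cuda // Linux: CUDA, or Backend::Rocm on AMD GPUs
    }
}

fn main() {
    println!("selected backend: {:?}", pick_backend());
}
```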