We believe healthcare AI should be verifiable. We publish our benchmarks, open-source our evaluation code, and release models under permissive licences.
42 speech-to-text models ranked by Medical Word Error Rate (M-WER) on real-format medical conversations. Google Gemini 3 Pro leads overall (2.65% M-WER); VibeVoice-ASR 9B is the best open-source model (3.16% M-WER).
Read article →

Benchmark
Safety-first benchmark for SOAP note generation. Measures hallucination rates, clinical coverage, and note quality. Omi-SOAP-edge-v1 has the highest Safety and Evidence scores. Open-source evaluation framework.
Read article →

Model
Open-source clinical language model that generates structured SOAP notes from medical dialogues. Fine-tuned from Phi-3 Mini. Higher ROUGE-1 (70) than GPT-4 Turbo (69) on the Omi-Sum test set.
Read article →

Benchmark for evaluating clinical SOAP note generation. Measures safety, grounding, and quality.
GitHub →