Omi-Sum 3B: Open-Source Model for Medical Summaries
We're excited to announce the release of Omi-Sum 3B Small, a compact yet capable language model designed to turn medical dialogues into structured SOAP (Subjective, Objective, Assessment, Plan) summaries. Omi-Sum is openly available on Hugging Face and scores higher on ROUGE-1 than GPT-4 Turbo on our summarization benchmark.
Omi-Sum was fine-tuned on our synthetic medical-dialogue-to-soap-summary dataset of 10,000 examples, using Microsoft's Phi-3-mini-4k-instruct as the base model. The model, dataset, and training code are released under the MIT license to encourage adoption and collaboration.
Benchmark results (ROUGE-1 on our test set, scaled to 0-100)
| Model | ROUGE-1 |
|---|---|
| Omi-Sum 3B Small (Omi Health) | 70 |
| GPT-4 Turbo (OpenAI) | 69 |
| Llama-3 8B Instruct (Meta) | 59 |
| Phi-3 Mini 4k Instruct — base (Microsoft) | 55 |
| GPT-3.5 Turbo (OpenAI) | 54 |
| Phi-2 — base (Microsoft) | 41 |
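For readers unfamiliar with the metric: ROUGE-1 measures unigram overlap between a generated summary and a reference summary. Production evaluations typically use a dedicated package such as `rouge-score` (which adds stemming and more careful tokenization), but the core computation can be sketched in a few lines of plain Python; the function name and simple whitespace tokenization below are illustrative, not our exact evaluation harness.

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """ROUGE-1 F1: unigram overlap between reference and candidate summaries.

    Simplified sketch: lowercase + whitespace tokenization, no stemming.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Clipped count of unigrams shared by both texts.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)

# Identical texts score 1.0; disjoint texts score 0.0.
print(rouge1_f1("patient reports mild headache", "patient reports mild headache"))  # 1.0
```

A score of 70 in the table above therefore means roughly 70% unigram overlap (F1) with the reference SOAP notes, on this 0-100 scale.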
Omi-Sum is designed for research and development of AI-powered medical documentation tools. While it is not yet approved for clinical use, we believe this open-source release is a step towards safer, more transparent AI for healthcare.
Where to find it
- Model weights: huggingface.co/omi-health/sum-small
- Training dataset: huggingface.co/datasets/omi-health/medical-dialogue-to-soap-summary
- Contact: [email protected]
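Since the model follows the Phi-3 chat format of its base, a minimal inference sketch with the `transformers` library looks like the following; the prompt wording and generation settings are assumptions for illustration, so check the model card for the exact recommended template.

```python
# Minimal sketch: loading Omi-Sum from the Hugging Face Hub for inference.
# The instruction text and max_new_tokens are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "omi-health/sum-small"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)

dialogue = "Doctor: What brings you in today? Patient: ..."  # your transcript here
messages = [
    {"role": "user",
     "content": f"Summarize this dialogue into a SOAP note:\n{dialogue}"},
]

# Apply the base model's chat template and generate the summary.
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(inputs, max_new_tokens=512)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```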
We look forward to seeing how the community uses and improves this model.