Updates
Omi-Sum 3B: Open-Source Model for Medical Summaries
Jun 1, 2024

We’re excited to announce the release of Omi-Sum (3B) Small, a compact yet powerful language model designed to turn medical dialogues into structured SOAP summaries. Omi-Sum is openly available on Hugging Face and has already been shown to outperform GPT‑4 and other larger models on our summarization benchmarks.
Omi-Sum was fine-tuned on our synthetic `medical-dialogue-to-soap-summary` dataset (10,000 examples) using Microsoft’s Phi-3-mini-4k-instruct as the base model. The model, dataset, and training code are released under the MIT license to encourage adoption and collaboration.
Benchmark results (ROUGE-1 on test set)
| Model | ROUGE-1 |
|---|---|
| Omi-Sum 3B Small | 70 |
| GPT‑4 Turbo | 69 |
| Llama‑3 8B Instruct | 59 |
| Phi‑3 Mini 4k Instruct (base) | 55 |
| GPT‑3.5 Turbo | 54 |
| Phi‑2 (base) | 41 |
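For readers unfamiliar with the metric: ROUGE-1 measures unigram (single-word) overlap between a generated summary and a reference summary. A minimal sketch of ROUGE-1 F1, assuming simple whitespace tokenization (the actual benchmark likely uses a standard implementation such as the `rouge-score` package, which also applies stemming):

```python
from collections import Counter

def rouge1_f1(reference: str, candidate: str) -> float:
    """Unigram-overlap F1 between a reference and a candidate summary."""
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())
    # Each unigram counts toward the overlap at most as often as it
    # appears in both the reference and the candidate.
    overlap = sum((ref_counts & cand_counts).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand_counts.values())
    recall = overlap / sum(ref_counts.values())
    return 2 * precision * recall / (precision + recall)
```

An identical candidate scores 1.0 (reported above as 100-scale), and a candidate sharing no words scores 0.0; the table values fall between these extremes.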
Where to find it
Model & weights: https://huggingface.co/omi-health/sum-small
Training dataset: https://huggingface.co/datasets/omi-health/medical-dialogue-to-soap-summary
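Since the model ships as standard Hugging Face weights, it can be loaded with the `transformers` text-generation pipeline. A minimal sketch, assuming an illustrative instruction prompt (the exact prompt format used in fine-tuning may differ; consult the model card on Hugging Face):

```python
def build_prompt(dialogue: str) -> str:
    # Illustrative instruction; the prompt used in training may differ.
    return (
        "Summarize the following doctor-patient dialogue into a SOAP note "
        "(Subjective, Objective, Assessment, Plan).\n\n" + dialogue
    )

def summarize(dialogue: str, model_id: str = "omi-health/sum-small") -> str:
    """Generate a SOAP summary; requires `pip install transformers torch`."""
    # Imported lazily so the prompt helper stays dependency-free.
    from transformers import pipeline
    generator = pipeline("text-generation", model=model_id)
    prompt = build_prompt(dialogue)
    return generator(prompt, max_new_tokens=256)[0]["generated_text"]

# Usage (downloads ~3B parameters of weights on first run):
# print(summarize("Doctor: What brings you in?\nPatient: I've had a cough for a week."))
```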
Omi-Sum is designed for research and development of AI-powered medical documentation tools. While it is not yet ready for clinical use, we believe this open-source release is a step towards safer, more transparent AI for healthcare.
We look forward to seeing how the community uses and improves this model. For questions, or to discuss API access, please reach out at [email protected].