Sum Small: A Compact 3B Language Model Designed to Streamline Medical Summaries for Clinicians


Recently, AI professional and health AI entrepreneur Farhang Dehzad introduced Sum Small, a 3B-sized model designed specifically for summarizing medical dialogues. According to its creator, the model surpasses GPT-4 at producing summaries for clinicians, which could save many hours of manual documentation 🔥!


The Sum Small model could potentially be used to fetch data from FHIR servers and generate summarized clinical notes directly from patient health records.
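As a rough illustration of the FHIR idea above, the sketch below flattens a FHIR R4 Bundle into a plain-text prompt that a summarization model could consume. The function names, the prompt wording, and the sample bundle are all hypothetical; the source does not describe the input format Sum Small expects, and no real FHIR server is contacted here.

```python
def describe(resource):
    """Return a one-line description for a few common FHIR R4 resource types.

    Resource types not handled here yield None and are skipped by the caller.
    """
    kind = resource.get("resourceType")
    if kind == "Condition":
        return "Condition: " + resource.get("code", {}).get("text", "unknown")
    if kind == "MedicationRequest":
        med = resource.get("medicationCodeableConcept", {}).get("text", "unknown")
        return "Medication: " + med
    if kind == "Observation":
        name = resource.get("code", {}).get("text", "unknown")
        qty = resource.get("valueQuantity", {})
        value = f"{qty.get('value', '?')} {qty.get('unit', '')}".strip()
        return f"Observation: {name} = {value}"
    return None


def bundle_to_prompt(bundle):
    """Build a hypothetical summarization prompt from a FHIR Bundle dict."""
    lines = []
    for entry in bundle.get("entry", []):
        line = describe(entry.get("resource", {}))
        if line is not None:
            lines.append("- " + line)
    header = "Summarize the following patient record for a clinician:"
    return "\n".join([header] + lines)


# Made-up sample data, shaped like a FHIR R4 search-result Bundle.
SAMPLE_BUNDLE = {
    "resourceType": "Bundle",
    "type": "searchset",
    "entry": [
        {"resource": {"resourceType": "Patient", "id": "example"}},
        {"resource": {"resourceType": "Condition",
                      "code": {"text": "Type 2 diabetes mellitus"}}},
        {"resource": {"resourceType": "MedicationRequest",
                      "medicationCodeableConcept": {"text": "Metformin 500 mg"}}},
        {"resource": {"resourceType": "Observation",
                      "code": {"text": "HbA1c"},
                      "valueQuantity": {"value": 7.2, "unit": "%"}}},
    ],
}

if __name__ == "__main__":
    print(bundle_to_prompt(SAMPLE_BUNDLE))
```

In a real pipeline the bundle would come from a FHIR server's search endpoint, and the resulting text would be passed to the model as its input.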

The AI community has warmly received this new model, appreciating Farhang's commitment to openness, evidenced by releasing the model under the MIT License. This license promotes broader usage, supporting both commercial and non-commercial applications.

Farhang's work on the model began with creating a synthetic dataset using GPT-4 (he has also made the dataset available). Initial tests with the Phi-2 model showed promising results that fell slightly short of GPT-4. After switching to the newer Phi-3 model, Farhang refined the fine-tuning process and reports performance superior to GPT-4, using LoRA with FlashAttention-2 (FA2) over just two epochs.
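To give a sense of why LoRA keeps fine-tuning cheap, the short calculation below compares the trainable parameters of one full weight matrix with those of a rank-r LoRA adapter, which trains two small matrices of shapes (d_out, r) and (r, d_in) instead of the full (d_out, d_in) matrix. The layer size and rank are illustrative assumptions, not the settings used for Sum Small, which the source does not list.

```python
def full_params(d_in, d_out):
    """Parameters in a dense weight matrix of shape (d_out, d_in)."""
    return d_in * d_out


def lora_params(d_in, d_out, rank):
    """Parameters in a LoRA adapter: B is (d_out, rank), A is (rank, d_in)."""
    return rank * (d_in + d_out)


def trainable_fraction(d_in, d_out, rank):
    """Share of the layer's parameters that LoRA actually trains."""
    return lora_params(d_in, d_out, rank) / full_params(d_in, d_out)


if __name__ == "__main__":
    # Illustrative values only (assumed, not taken from the source).
    d, r = 3072, 16
    print("full matrix :", full_params(d, d))
    print("LoRA adapter:", lora_params(d, d, r))
    print("fraction    :", trainable_fraction(d, d, r))
```

With these assumed values the adapter trains roughly 1% of the parameters of the layer it is attached to, which is the main reason a model of this size can be fine-tuned on a single GPU.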

Looking ahead, the Health AI Entrepreneur plans to adapt this model to run locally on devices like the iPhone 14, aiming to establish a seamless Voice-to-Text-to-Summary flow. This would further enhance the utility of the model in clinical settings 🌿.

For those interested in the technical details, Farhang has shared a comprehensive tutorial on Medium about fine-tuning the Phi-3 model. His setup used A100 GPUs with 40GB of memory, although A6000 GPUs could also be used; memory requirements range from 22GB to 40GB depending on the settings. The base context length for Phi-3 is 4k tokens.
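Because of the 4k context limit, a long consultation may not fit into a single request. The sketch below shows one possible way to pack dialogue turns into chunks that stay under a token budget. It assumes "4k" means 4096 tokens and uses a crude estimate of 4 tokens per 3 whitespace-separated words; both the ratio and the function names are assumptions for illustration, and a real pipeline would count tokens with the model's own tokenizer.

```python
CONTEXT_TOKENS = 4096  # "4k" context, assumed here to mean 4096 tokens


def estimate_tokens(text):
    """Crude token estimate: ceil(4/3 * word count). An assumption, not a tokenizer."""
    words = len(text.split())
    return (4 * words + 2) // 3


def chunk_turns(turns, budget):
    """Greedily pack dialogue turns into chunks whose estimated size fits the budget.

    A single turn larger than the budget is kept whole in its own chunk.
    """
    chunks, current, used = [], [], 0
    for turn in turns:
        cost = estimate_tokens(turn)
        if current and used + cost > budget:
            chunks.append(current)
            current, used = [], 0
        current.append(turn)
        used += cost
    if current:
        chunks.append(current)
    return chunks


if __name__ == "__main__":
    dialogue = [
        "Doctor: What brings you in today?",
        "Patient: I have had a cough for two weeks.",
        "Doctor: Any fever or shortness of breath?",
    ]
    # Leave room for the instruction prompt and the generated summary (assumed 1024).
    budget = CONTEXT_TOKENS - 1024
    for i, chunk in enumerate(chunk_turns(dialogue, budget), start=1):
        print(f"chunk {i}: {len(chunk)} turns")
```

Each chunk could then be summarized separately and the partial summaries merged, though whether that matches the author's workflow is not stated in the source.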

Sum Small's creator is eager to connect with clinicians, medical students, developers, and data scientists for feedback and potential collaboration. If you’re interested in this innovative project, you can connect with him on LinkedIn to discuss or follow the developments further.

 
