SmolLM2: The Week's Top-Ranked Compact Language Model
02/11/2024 13:55:33

SmolLM2 1.7B Instruct has claimed the top position in our weekly language model rankings, where we evaluate models on download frequency and user engagement metrics from Hugging Face and LLM Explorer.
The model family has generated significant interest in the AI community, particularly for its impressive performance in a compact form factor. It comes in three variants (135M, 360M, and 1.7B parameters), and users report "shockingly good" performance across the range, with the 1.7B Instruct version drawing special attention from practitioners.
Technical Implementation Details
The development team's training process is particularly noteworthy for anyone interested in reproduction or fine-tuning. The model was trained on 11 trillion tokens drawn from a diverse dataset mix including FineWeb-Edu, DCLM, and The Stack, along with specialized mathematics and coding datasets. For the instruction-tuned version, the team performed supervised fine-tuning on roughly one million instructions, combining newly curated data with established datasets like OpenHermes2.5, MetaMathQA, Numina-CoT, and self-oss-instruct-sc2. The final step was Direct Preference Optimization (DPO) on UltraFeedback for three epochs.
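For readers unfamiliar with that last stage, the DPO objective is compact enough to sketch directly. The snippet below is a minimal PyTorch illustration, not the team's training code; it assumes per-sequence log-probabilities for each chosen/rejected completion pair have already been computed under both the policy being trained and a frozen reference model, and the beta value is illustrative.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps: torch.Tensor,
             policy_rejected_logps: torch.Tensor,
             ref_chosen_logps: torch.Tensor,
             ref_rejected_logps: torch.Tensor,
             beta: float = 0.1) -> torch.Tensor:
    """Direct Preference Optimization loss over a batch of preference pairs."""
    # Implicit reward: how much more likely a completion is under the policy
    # than under the frozen reference model, scaled by beta.
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the chosen completion's implicit reward above the rejected one's.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()
```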
While the initial training utilized 256 H100 GPUs, the development team confirms that LoRA fine-tuning works effectively on far more modest hardware setups - an important consideration for practitioners with limited resources. Multiple users have praised the release of both base and instruct versions under the Apache 2.0 license, which makes the models particularly accessible for further experimentation and deployment.
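As a rough sketch of what such a setup looks like, the snippet below attaches LoRA adapters to the 1.7B base checkpoint using Hugging Face's peft library. The rank, alpha, dropout, and target modules are illustrative defaults, not values confirmed by the SmolLM2 team.

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

# Base (non-instruct) checkpoint; swap in the Instruct variant if preferred.
model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/SmolLM2-1.7B")

lora_config = LoraConfig(
    r=16,                                 # adapter rank (assumed; tune to your budget)
    lora_alpha=32,                        # scaling factor (assumed)
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # attention projections in the Llama-style blocks
    task_type="CAUSAL_LM",
)

model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # adapters are typically well under 1% of the weights
```

Because only the adapter matrices receive gradients, the memory footprint is a fraction of full fine-tuning, which is what makes the modest-hardware setups mentioned above feasible.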
Performance and Practical Applications
Some experienced users have called the 1.7B Instruct version "probably the best under 2B model" they've tested, citing particularly strong coherence and instruction-following capabilities. Users report it performs on par with, and sometimes exceeds, Qwen2.5 on several benchmarks - a notable achievement for its size class.
The 360M variant has emerged as a particular sweet spot in the size/speed/quality balance, with users successfully deploying it on hardware as modest as a Raspberry Pi. The model's compact size makes it suitable for local on-device inference, function calling, and RAG applications - small enough that it could plausibly be shipped inside a mobile app rather than requiring OS-level integration. In discussions of mobile deployment, users pointed to apps like Layla and PocketPal as examples of where small language models already run.
Implementation Considerations
When comparing against Gemma 2B (2.61B parameters), users note several distinctions. While Gemma has 53% more parameters, SmolLM2-1.7B offers uncensored operation out of the box and uses the ChatML format for straightforward integration. Users have flagged one characteristic to watch for during implementation: the model can occasionally get stuck in loops, particularly on creative tasks or topics where it is uncertain. These limitations mainly surface in use cases that typically call for larger models, underscoring the importance of matching model choice to the application.
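Since the instruct model's ChatML template ships with its tokenizer, integration with transformers takes only a few lines. The sketch below is illustrative; the prompt and generation settings are not from the original article.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "HuggingFaceTB/SmolLM2-1.7B-Instruct"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# apply_chat_template wraps the messages in the model's ChatML markup,
# so no manual prompt formatting is needed.
messages = [{"role": "user", "content": "Explain RAG in two sentences."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)

output_ids = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```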
The model's efficient size-to-performance ratio makes it particularly suitable for:
- Local on-device inference (a sketch follows this list)
- Function calling applications
- RAG (Retrieval Augmented Generation) implementations
- Mobile app integration
- Edge computing scenarios
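For the local and edge scenarios above, one common route is a quantized GGUF build served through llama-cpp-python. The file name below is hypothetical (the article does not name a specific quantization), and the 360M variant is chosen to match the Raspberry Pi-class deployments users reported.

```python
# pip install llama-cpp-python
from llama_cpp import Llama

# Hypothetical 8-bit quantized export of the 360M Instruct variant, the size
# users report running on hardware as modest as a Raspberry Pi.
llm = Llama(model_path="smollm2-360m-instruct-q8_0.gguf", n_ctx=2048)

response = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Give one use case for on-device LLMs."}],
    max_tokens=64,
)
print(response["choices"][0]["message"]["content"])
```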
Practitioners should weigh the model's performance characteristics against available compute and the requirements of the intended application. Its ability to run effectively on consumer-grade hardware, combined with strong performance on structured tasks, makes it an attractive option for developers implementing AI capabilities in resource-constrained environments.