SmolLM 135M De By LemiSt: Benchmarks, Features and Detailed Analysis. Insights on SmolLM 135M De.

Base model:finetune:huggingfac... Base model:huggingfacetb/smoll... Dataset:almanach/halvest Dataset:d4ve-r/terra-xplain-cc... Dataset:devngho/culturax-mini-... Dataset:djstrong/oscar-small Dataset:lemist/gutenberg de Dataset:maxidl/finenews-unfilt... Dataset:wikimedia/wikipedia De Endpoints compatible Feature-extraction Llama Region:us Safetensors

Model Card on HF 🤗: https://huggingface.co/LemiSt/SmolLM-135M-de

SmolLM 135M De Benchmarks

LLME Score: 0.22073


SmolLM 135M De Parameters and Internals

Model Type

Large Language Model (Llama architecture)

Use Cases

Areas:

small-scale experimentation

Applications:

benchmarking datasets

Limitations:

Will output blatantly wrong information; may generate inappropriate content; not recommended for production use

Considerations:

Consider further fine-tuning and preference optimization before use

Additional Notes

The model is intended for experimentation and benchmarking rather than production use. It mostly produces correct German.

Supported Languages

German (advanced)

Training Details

Data Sources:

devngho/culturax-mini-nonshuffled, maxidl/FineNews-unfiltered, djstrong/oscar-small, LemiSt/gutenberg_de, almanach/HALvest, wikimedia/wikipedia, D4ve-R/terra-xplain-cc-de

Data Volume:

about 6 billion German-language tokens

Methodology:

Trained with Axolotl, using full fine-tuning

Context Length:

2048

Training Time:

nearly 2 epochs

Model Architecture:

Llama architecture
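
Given the Llama architecture and the 2048-token context length listed above, loading the checkpoint for feature extraction might look like the sketch below. It assumes the `transformers` and `torch` packages are installed and that the repository loads through `AutoModel`. The helper name `embed_text` and the mean-pooling step are illustrative choices, not something the model card prescribes.

```python
MODEL_ID = "LemiSt/SmolLM-135M-de"  # repository listed on this page
MAX_LENGTH = 2048                   # context length listed on this page


def embed_text(text: str):
    """Return a mean-pooled hidden-state vector for `text` (hypothetical helper)."""
    # Imported lazily so that defining the helper does not require the
    # heavy dependencies or a network connection.
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModel.from_pretrained(MODEL_ID)
    model.eval()

    # Truncate the input to the model's context window.
    inputs = tokenizer(
        text, return_tensors="pt", truncation=True, max_length=MAX_LENGTH
    )
    with torch.no_grad():
        outputs = model(**inputs)

    # last_hidden_state has shape (batch, tokens, hidden); average over tokens.
    return outputs.last_hidden_state.mean(dim=1).squeeze(0)
```

Calling `embed_text("Berlin ist die Hauptstadt von Deutschland.")` downloads the weights on first use and returns a single vector. Mean pooling is only one simple way to reduce the per-token states to a single embedding.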

LLM Name	SmolLM 135M De
Repository 🤗	https://huggingface.co/LemiSt/SmolLM-135M-de
Base Model(s)	SmolLM 135M HuggingFaceTB/SmolLM-135M
Model Size	135m
Required VRAM	0.5 GB
Updated	2025-06-01
Maintainer	LemiSt
Model Type	llama
Model Files	0.5 GB
Supported Languages	de
Model Architecture	LlamaModel
License	apache-2.0
Context Length	2048
Model Max Length	2048
Transformers Version	4.44.2
Tokenizer Class	GPT2Tokenizer
Padding Token	<|endoftext|>
Vocabulary Size	49152
Torch Data Type	float32
Errors	replace
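
The 0.5 GB figures in the table are consistent with a back-of-the-envelope estimate: parameter count times bytes per parameter. The sketch below assumes the nominal 135 million parameters implied by the model name (the exact count may differ slightly) and covers weights only, not activations or framework overhead. The function name `weight_memory_gb` is illustrative.

```python
# Bytes used per parameter for common torch data types.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}

# Nominal parameter count taken from the model name (an assumption).
NOMINAL_PARAMS = 135_000_000


def weight_memory_gb(num_params: int, dtype: str = "float32") -> float:
    """Estimate the memory needed for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    for dtype in ("float32", "float16"):
        print(f"{dtype}: {weight_memory_gb(NOMINAL_PARAMS, dtype):.2f} GB")
```

With the float32 data type listed in the table, this gives about 0.54 GB, which matches the reported file size and VRAM requirement after rounding. Loading in half precision would roughly halve it.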

Original data from HuggingFace, OpenCompass and various public git repos.

Release v20241227
