LLM Name | Smol Llama 220M GQA 32K Theta Sft |
Repository 🤗 | https://huggingface.co/Doctor-Shotgun/smol_llama-220M-GQA-32k-theta-sft |
Merged Model | Yes |
Model Size | 220M |
Required VRAM | 0.4 GB |
Updated | 2024-12-21 |
Maintainer | Doctor-Shotgun |
Model Type | llama |
Instruction-Based | Yes |
Supported Languages | en |
Model Architecture | LlamaForCausalLM |
Context Length | 32768 |
Model Max Length | 32768 |
Transformers Version | 4.36.2 |
Tokenizer Class | LlamaTokenizer |
Padding Token | </s> |
Vocabulary Size | 32000 |
Torch Data Type | bfloat16 |
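
The Required VRAM figure above is consistent with a simple weights-only estimate: parameter count times bytes per parameter. The following is a minimal sketch of that arithmetic, not the listing site's actual method. It assumes exactly 220,000,000 parameters (the nominal size from the model name; the true count may differ), and `estimate_weight_memory_gb` is a hypothetical helper name. Activations and the KV cache, which grow with context length, are not included.

```python
# Bytes occupied by one parameter for a few common torch dtypes.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def estimate_weight_memory_gb(num_params: int, dtype: str = "bfloat16") -> float:
    """Approximate memory for the weights alone, in GB (10^9 bytes).

    Excludes activations, KV cache, and framework overhead.
    """
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# Assumption: nominal 220M parameters, stored as bfloat16 per the table.
print(round(estimate_weight_memory_gb(220_000_000), 2))
```

Under these assumptions the estimate is about 0.44 GB, in line with the 0.4 GB listed; loading the same weights as float32 would roughly double it.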
Best Alternatives | Context / RAM | Downloads | Likes |
---|---|---|---|
Smol Llama 220M Open Instruct | 2K / 0.4 GB | 33 | 1 |