Llama 161M 100B by abacaj


Tags: Autotrain compatible, Endpoints compatible, Llama, ONNX, Region: US, Safetensors
Model Card on HF 🤗: https://huggingface.co/abacaj/llama-161M-100B

Llama 161M 100B Benchmarks

nn.n% — how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
Llama 161M 100B (abacaj/llama-161M-100B)

Llama 161M 100B Parameters and Internals

Model Type: pretrained, text generation
Use Cases:
  Primary Use Cases: base pretrained model requiring fine-tuning
  Limitations: requires further fine-tuning to be useful
Additional Notes: This is a base pretrained model and requires further fine-tuning to be useful.
Training Details:
  Data Sources: 80% code, 10% natural language, 10% instruction data
  Data Volume: 100B tokens
  Methodology: WSD (warmup-stable-decay) learning-rate schedule with 10% decay (see the sketch below)
  Training Time: 110 hours
  Hardware Used: 8x NVIDIA RTX 3090
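The training methodology is given only as a WSD schedule with 10% decay; the warmup length and peak learning rate are not listed. Below is a minimal Python sketch of a warmup-stable-decay schedule for reference; the warmup fraction and peak learning rate are illustrative assumptions, not values from the model card.

def wsd_lr(step, total_steps, peak_lr=3e-4, warmup_frac=0.01, decay_frac=0.10):
    """Warmup-stable-decay LR schedule: linear warmup, constant plateau,
    then linear decay over the final decay_frac (here 10%) of steps.
    peak_lr and warmup_frac are illustrative assumptions, not card values."""
    warmup_steps = max(int(total_steps * warmup_frac), 1)
    decay_start = int(total_steps * (1.0 - decay_frac))
    if step < warmup_steps:                 # linear warmup to peak_lr
        return peak_lr * step / warmup_steps
    if step < decay_start:                  # stable (constant) phase
        return peak_lr
    remaining = total_steps - step          # linear decay to zero
    return peak_lr * remaining / max(total_steps - decay_start, 1)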
LLM Name: Llama 161M 100B
Repository 🤗: https://huggingface.co/abacaj/llama-161M-100B
Model Size: 161m
Required VRAM: 0.3 GB
Updated: 2025-01-24
Maintainer: abacaj
Model Type: llama
Model Files: 0.3 GB
Model Architecture: LlamaForCausalLM
License: apache-2.0
Context Length: 1024
Model Max Length: 1024
Transformers Version: 4.40.2
Tokenizer Class: LlamaTokenizer
Padding Token: <unk>
Vocabulary Size: 32000
Torch Data Type: bfloat16
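These settings correspond to a standard Hugging Face Transformers setup: LlamaForCausalLM weights in bfloat16 (161M parameters × 2 bytes ≈ 0.32 GB, matching the listed VRAM) with a 1024-token context. A minimal loading sketch follows; the prompt and sampling settings are illustrative, and since this is a base pretrained model the output will be a raw continuation rather than a chat-style answer.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "abacaj/llama-161M-100B"

# LlamaTokenizer with a 32,000-token vocabulary and <unk> as the padding token
tokenizer = AutoTokenizer.from_pretrained(model_id)

# LlamaForCausalLM weights loaded in bfloat16 (~0.3 GB)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
model = model.to("cuda" if torch.cuda.is_available() else "cpu")

prompt = "def fibonacci(n):"   # the pretraining mix is code-heavy, so a code prompt is a natural test
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,        # keep prompt + output within the 1024-token context window
    do_sample=True,
    temperature=0.8,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))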

Best Alternatives to Llama 161M 100B

Best Alternatives                    Context / RAM      Downloads / Likes
Stockmark 100B                       4K / 191.9 GB      15333
Saily 100b                           4K / 235.5 GB      7427
Plankton 100M                        4K / 0.4 GB        1320
...lisLM 100M Layer Hidden Pruned    2K / 0.2 GB        8200
Reglu 100B                           2K / 2.6 GB        121
...ephyr Smol Llama 100M DPO Full    1K / 0.2 GB        133
...ephyr Smol Llama 100M DPO Full    1K / n/a           231
...yr Smol Llama 100M DPO 1 Epoch    1K / 0.2 GB        100
Babylama Hidden Sizes768             0.3K / 0.7 GB      80
Babyllama 100M 2024                  0.3K / 0.2 GB      24214
Note: a green score (e.g. "73.2") means that the model is better than abacaj/llama-161M-100B.

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227