TeenyTinyLlama 460M by nicholasKluge


  Arxiv:2401.16640   Autotrain compatible   CO2 eq emissions   Dataset:nicholaskluge/pt-corpu...   Endpoints compatible   Instruct   Jax   Llama   Model-index   Pt   Pytorch   Region:us   Safetensors

TeenyTinyLlama 460M Benchmarks

Benchmark percentages indicate how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
TeenyTinyLlama 460M (nicholasKluge/TeenyTinyLlama-460m)

TeenyTinyLlama 460M Parameters and Internals

Model Type 
text-generation
Use Cases 
Areas:
Research
Primary Use Cases:
Developing language models for low-resource languages
Limitations:
Not suitable for human-facing interactions; not intended for deployment; limited to Brazilian Portuguese
Considerations:
Users should conduct risk and bias assessment before any real-world application
Additional Notes 
Pre-trained model released under Apache 2.0; comprehensive evaluations available.
Supported Languages 
Portuguese (high)
Training Details 
Data Sources:
Pt-Corpus Instruct (6.2B tokens)
Data Volume:
6.2B tokens
Methodology:
Transformer-based model pre-trained via causal language modeling
Context Length:
2048
Training Time:
~ 280 hours
Hardware Used:
1 NVIDIA A100-SXM4-40GB
Model Architecture:
Transformer
Input Output 
Input Format:
Plain text, passed through the tokenizer for generation
Accepted Modalities:
text
Output Format:
Generated text
Performance Tips:
Review repetition penalty settings to avoid verbosity and repetition
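The input/output notes above can be sketched in code. The following is a minimal, hedged example of loading the model with Hugging Face `transformers` and generating text with a repetition penalty, as the performance tip recommends; the specific generation values are illustrative, not taken from the model card.

```python
# Sketch: text generation with TeenyTinyLlama-460m via transformers.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "nicholasKluge/TeenyTinyLlama-460m"

# Illustrative generation settings; repetition_penalty > 1.0 discourages
# the verbose, repetitive output the card warns about.
GEN_KWARGS = {
    "max_new_tokens": 64,
    "do_sample": True,
    "top_p": 0.9,
    "repetition_penalty": 1.2,
}

def generate(prompt: str) -> str:
    """Tokenize a (Brazilian Portuguese) prompt, run causal generation, decode."""
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    inputs = tokenizer(prompt, return_tensors="pt")
    outputs = model.generate(**inputs, **GEN_KWARGS)
    return tokenizer.decode(outputs[0], skip_special_tokens=True)
```

Since the card flags repetition as the main failure mode, tuning `repetition_penalty` (and sampling parameters) per use case is the relevant knob here.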
LLM Name: TeenyTinyLlama 460M
Repository 🤗: https://huggingface.co/nicholasKluge/TeenyTinyLlama-460m
Model Size: 460m
Required VRAM: 1.9 GB
Updated: 2024-12-22
Maintainer: nicholasKluge
Model Type: llama
Instruction-Based: Yes
Model Files: 1.9 GB, 3.8 GB, 1.9 GB, 0.0 GB
Supported Languages: pt
Model Architecture: LlamaForCausalLM
License: apache-2.0
Context Length: 2048
Model Max Length: 2048
Transformers Version: 4.35.2
Tokenizer Class: LlamaTokenizer
Padding Token: <pad>
Vocabulary Size: 32000
Torch Data Type: bfloat16
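The VRAM figure above can be sanity-checked with a back-of-envelope calculation: weight memory is roughly parameter count × bytes per parameter (activations and KV cache add more on top). This is a rough estimate, not an official sizing formula.

```python
# Rough weight-memory estimate by dtype (weights only, no activations/KV cache).
BYTES_PER_DTYPE = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

def weight_gb(n_params: float, dtype: str) -> float:
    """Approximate weight memory in GB for a given parameter count and dtype."""
    return n_params * BYTES_PER_DTYPE[dtype] / 1e9

# 460M parameters in float32 is about 1.84 GB, consistent with the ~1.9 GB
# figure listed above; in bfloat16 the weights alone would be about 0.92 GB.
print(round(weight_gb(460e6, "float32"), 2))  # ~1.84
```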

Quantized Models of the TeenyTinyLlama 460M

Model | Likes | Downloads | VRAM
TeenyTinyLlama 460M AWQ | 1 | 21 | 0 GB
TeenyTinyLlama 460M Chat AWQ | 1 | 18 | 0 GB
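AWQ-quantized checkpoints like those listed above can generally be loaded through `transformers` when the `autoawq` package is installed; the quantization scheme is detected from the checkpoint's `quantization_config`. The repository id below is a hypothetical placeholder, not taken from this page; substitute the actual AWQ repository.

```python
# Hedged sketch: loading an AWQ-quantized variant (requires the autoawq package).
from transformers import AutoModelForCausalLM, AutoTokenizer

AWQ_REPO = "nicholasKluge/TeenyTinyLlama-460m-awq"  # hypothetical repo id

def load_awq(repo_id: str = AWQ_REPO):
    """Load an AWQ checkpoint; quantization is read from its quantization_config."""
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(repo_id, device_map="auto")
    return tokenizer, model
```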

Best Alternatives to TeenyTinyLlama 460M

Best Alternatives | Context / RAM | Downloads | Likes
TeenyTinyLlama 460M Chat | 2K / 0 GB | 444 | 3
...60M Experimental Ptbr Instruct | 2K / 0.9 GB | 32 | 3
TeenyTinyLlama 460M AWQ | 2K / 0.3 GB | 21 | 1
TeenyTinyLlama 460M Chat AWQ | 2K / 0.3 GB | 18 | 1

Rank the TeenyTinyLlama 460M Capabilities

🆘 Have you tried this model? Rate its performance. This feedback would greatly assist the ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  

What open-source LLMs or SLMs are you in search of? 40,066 models in total.

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217