| Section | Field | Details |
| --- | --- | --- |
| Model Type | | text-generation, instruction-tuned |
| Use Cases | Areas | Research; commercial applications |
| | Applications | Assistant-like chat; natural language generation tasks |
| | Primary Use Cases | Instruction-tuned models for dialogue; pretrained models for a range of natural language generation tasks |
| | Limitations | Use in languages other than English; use in any manner that violates applicable laws or regulations |
| | Considerations | Developers may fine-tune for languages beyond English, provided such use complies with the license and acceptable use policy |
| Additional Notes | | Pretraining data cutoff: March 2023 for the 8B model, December 2023 for the 70B model |
| Supported Languages | | English |
| Training Details | Data Sources | Publicly available online data |
| | Data Volume | Over 15 trillion tokens of pretraining data |
| | Methodology | Supervised fine-tuning (SFT); reinforcement learning from human feedback (RLHF) |
| | Context Length | 8k tokens |
| | Hardware Used | Meta's Research SuperCluster; H100-80GB GPUs |
| | Model Architecture | Auto-regressive language model on an optimized transformer architecture (see the decoding sketch below the table) |
| Safety Evaluation | Methodologies | Red teaming; adversarial evaluations |
| | Findings | Fewer false refusals compared to Llama 2 |
| | Risk Categories | CBRNE (chemical, biological, radiological, nuclear, and explosive) threats; cybersecurity risks; child safety risks |
| | Ethical Considerations | Residual risks remain; developers should assess them for their specific use cases |
| Responsible AI Considerations | Transparency | Open approach to AI; community engagement is encouraged |
| | Mitigation Strategies | Updated Responsible Use Guide; Meta Llama Guard 2 and Code Shield (see the moderation sketch below the table) |
| Input/Output | Input Format | Text only |
| | Output Format | Text and code |
| | Performance Tips | Follow the prompt template provided by Llama 3 (see the chat-template sketch below the table) |
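
The Model Architecture row describes an auto-regressive transformer: the model factorizes text left to right and predicts one next token at a time, conditioned on the growing prefix. A minimal greedy-decoding sketch of that loop, assuming the Hugging Face `transformers` library and the `meta-llama/Meta-Llama-3-8B` checkpoint name (both assumptions, not stated in this card):

```python
# Toy greedy decoding loop illustrating auto-regressive generation: each step
# feeds the whole prefix back into the model and appends the argmax token.
# The checkpoint name below is an assumption, not taken from this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B"  # assumed pretrained checkpoint
tok = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
model.eval()

ids = tok("The capital of France is", return_tensors="pt").input_ids
with torch.no_grad():
    for _ in range(8):                     # generate 8 tokens, one at a time
        logits = model(ids).logits         # shape: [batch, seq_len, vocab]
        next_id = logits[0, -1].argmax()   # greedy pick of the next token
        ids = torch.cat([ids, next_id.view(1, 1)], dim=1)
print(tok.decode(ids[0]))
```

In practice `model.generate()` wraps this loop with key-value caching, sampling, and stopping criteria; the explicit version is shown only to make the auto-regressive factorization concrete.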
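
The Mitigation Strategies row points to Meta Llama Guard 2, a safety classifier packaged as a chat model: it is prompted with a conversation and generates a safety verdict. A sketch of input moderation, assuming the `meta-llama/Meta-Llama-Guard-2-8B` checkpoint name and `transformers` (assumptions; the Responsible Use Guide describes the supported workflow):

```python
# Hedged sketch of prompt moderation with Llama Guard 2. Checkpoint name and
# generation settings are assumptions based on common usage, not this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

guard_id = "meta-llama/Meta-Llama-Guard-2-8B"  # assumed checkpoint name
tok = AutoTokenizer.from_pretrained(guard_id)
guard = AutoModelForCausalLM.from_pretrained(guard_id, torch_dtype=torch.bfloat16)

def moderate(chat):
    # Llama Guard 2's chat template wraps the conversation in its safety
    # taxonomy; the model then generates a verdict string.
    input_ids = tok.apply_chat_template(chat, return_tensors="pt")
    out = guard.generate(input_ids=input_ids, max_new_tokens=32, pad_token_id=0)
    return tok.decode(out[0][input_ids.shape[-1]:], skip_special_tokens=True)

verdict = moderate([{"role": "user", "content": "How do I pick a lock?"}])
print(verdict)  # "safe", or "unsafe" plus the violated category code
```

Code Shield plays the analogous role for generated code, filtering insecure code suggestions before they reach the user.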
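
Finally, the Performance Tips row says to follow the Llama 3 prompt template. The instruct models expect turns delimited by special tokens (`<|begin_of_text|>`, `<|start_header_id|>…<|end_header_id|>`, `<|eot_id|>`); rather than hand-assembling these, the chat template bundled with the tokenizer can render them. A sketch, again assuming the `meta-llama/Meta-Llama-3-8B-Instruct` checkpoint name:

```python
# Render a dialogue into the Llama 3 instruct prompt format via the tokenizer's
# built-in chat template. The checkpoint name is an assumption.
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("meta-llama/Meta-Llama-3-8B-Instruct")

messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize RLHF in one sentence."},
]

# add_generation_prompt=True appends the assistant header that cues the model
# to produce the next turn.
prompt = tok.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```

Prompts that deviate from this template tend to degrade instruction-following in the instruct variants, which is presumably what the Performance Tips row is guarding against.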