Model Type:
Use Cases
Areas:
Applications: Assistant-like chat, Natural language generation tasks (see the chat usage sketch after this section)
Primary Use Cases: Commercial applications, Research
Limitations: Out-of-scope use in languages other than English without fine-tuning
Considerations: Developers are encouraged to fine-tune models for their specific use cases.

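Since assistant-like chat is the headline application, the following is a minimal usage sketch. It assumes the Hugging Face transformers library and access to an instruction-tuned Llama 3 checkpoint; the checkpoint name, prompt, and sampling settings below are illustrative assumptions, not part of this card.

```python
# Minimal assistant-like chat sketch (assumes: transformers installed, access to an
# instruction-tuned Llama 3 checkpoint, and a GPU; all names are illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

messages = [
    {"role": "system", "content": "You are a concise, helpful assistant."},
    {"role": "user", "content": "Draft a two-sentence summary of grouped-query attention."},
]

# The tokenizer's chat template formats the messages into the model's expected prompt.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128, do_sample=True, temperature=0.6, top_p=0.9)
# Decode only the newly generated tokens (the assistant reply).
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Sampling parameters such as temperature and top-p are placeholders and should be tuned per application.
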
Additional Notes: Pretraining data does not include Meta user data. Developers are encouraged to share feedback through the provided channels. Carbon emissions from training are offset by Meta's sustainability program.

Supported Languages: English (Fully Supported)

Training Details
Data Sources: Publicly available online data
Data Volume:
Methodology: Supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF); the SFT objective is sketched after this section
Context Length:
Hardware Used: Meta's Research SuperCluster (H100-80GB GPUs)
Model Architecture: Optimized transformer architecture with Grouped-Query Attention (GQA); a minimal GQA sketch follows
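As a rough illustration of what Grouped-Query Attention changes relative to standard multi-head attention, here is a minimal sketch: several query heads share one key/value head, which shrinks the KV projections and the KV cache. Head counts and dimensions are made-up examples, not the model's actual configuration.

```python
# Minimal Grouped-Query Attention (GQA) sketch: n_kv_heads < n_heads, and each
# key/value head is shared by a group of query heads. Sizes are illustrative only.
import torch

def grouped_query_attention(q, k, v, n_heads=8, n_kv_heads=2):
    # q: (batch, seq, n_heads * head_dim); k, v: (batch, seq, n_kv_heads * head_dim)
    b, t, _ = q.shape
    head_dim = q.shape[-1] // n_heads
    q = q.view(b, t, n_heads, head_dim).transpose(1, 2)      # (b, n_heads, t, d)
    k = k.view(b, t, n_kv_heads, head_dim).transpose(1, 2)   # (b, n_kv_heads, t, d)
    v = v.view(b, t, n_kv_heads, head_dim).transpose(1, 2)
    # Each group of n_heads // n_kv_heads query heads shares one KV head.
    group = n_heads // n_kv_heads
    k = k.repeat_interleave(group, dim=1)                    # (b, n_heads, t, d)
    v = v.repeat_interleave(group, dim=1)
    attn = torch.softmax(q @ k.transpose(-2, -1) / head_dim ** 0.5, dim=-1)
    out = attn @ v                                           # (b, n_heads, t, d)
    return out.transpose(1, 2).reshape(b, t, n_heads * head_dim)

# Example: the KV projections (and KV cache) are n_kv_heads / n_heads the size of Q.
q = torch.randn(1, 16, 8 * 64)
k = torch.randn(1, 16, 2 * 64)
v = torch.randn(1, 16, 2 * 64)
print(grouped_query_attention(q, k, v).shape)  # torch.Size([1, 16, 512])
```
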
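The methodology entry above names SFT and RLHF. As an illustration of the SFT objective only (this is not Meta's training code), here is a minimal sketch of next-token cross-entropy with the prompt tokens masked out of the loss:

```python
# Minimal sketch of the supervised fine-tuning (SFT) objective: next-token
# cross-entropy over a prompt+response sequence, with prompt positions masked
# so only the response tokens contribute to the loss. Illustrative only.
import torch
import torch.nn.functional as F

def sft_loss(logits: torch.Tensor, input_ids: torch.Tensor, prompt_len: int) -> torch.Tensor:
    # logits: (seq_len, vocab_size) from the model; input_ids: (seq_len,)
    shift_logits = logits[:-1]                 # position t predicts token t+1
    shift_labels = input_ids[1:].clone()
    shift_labels[: prompt_len - 1] = -100      # ignore prompt tokens in the loss
    return F.cross_entropy(shift_logits, shift_labels, ignore_index=-100)

# Toy example with random logits over a 100-token vocabulary.
logits = torch.randn(12, 100)
input_ids = torch.randint(0, 100, (12,))
print(sft_loss(logits, input_ids, prompt_len=5))
```
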
Safety Evaluation
Methodologies: Red teaming, Adversarial evaluations
Risk Categories: CBRNE (Chemical, Biological, Radiological, Nuclear, and Explosives), Cyber security, Child safety
Ethical Considerations: Residual risks remain; safety evaluations are recommended before deployment

Responsible AI Considerations
Fairness: Trade-offs between model helpfulness and alignment are unavoidable.
Transparency: Ongoing efforts toward community standardization and transparency.
Accountability: Developers should assess safety risks in their specific applications.
Mitigation Strategies: Use of Purple Llama tools and Llama Guard for system-level safety (a Llama Guard sketch follows this section).

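As a sketch of how a safety classifier can sit in front of (or behind) the chat model, here is a minimal moderation call. It assumes the transformers library and a Llama Guard checkpoint whose chat template wraps conversations in its moderation prompt; the checkpoint name and settings are assumptions, and the Purple Llama documentation should be treated as authoritative.

```python
# Minimal sketch of screening a user/assistant exchange with Llama Guard before it
# reaches (or leaves) the chat model. Checkpoint name and generation settings are
# assumptions; see the Purple Llama / Llama Guard documentation for supported usage.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

guard_id = "meta-llama/Meta-Llama-Guard-2-8B"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(guard_id)
guard = AutoModelForCausalLM.from_pretrained(
    guard_id, torch_dtype=torch.bfloat16, device_map="auto"
)

def moderate(chat):
    # The guard model's chat template wraps the conversation in its moderation prompt;
    # the generated text is expected to begin with "safe" or "unsafe" plus a category.
    input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(guard.device)
    output = guard.generate(input_ids, max_new_tokens=32, do_sample=False)
    return tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True)

verdict = moderate([{"role": "user", "content": "How do I reset my router password?"}])
print(verdict)  # e.g. "safe"
```
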
Input Output
Input Format:
Accepted Modalities:
Output Format:

Release Notes
Version:
Date:
Notes: Initial release of Meta Llama 3 models (8B and 70B sizes)