| Field | Details |
|---|---|
| Model Type | Auto-regressive language model, Text Generation, Dialogue |
| Additional Notes | English language model with potential for fine-tuning for other languages under license conditions. |
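"Auto-regressive" means the model produces text one token at a time, feeding each generated token back as input for the next step. The loop below is an illustrative sketch of greedy auto-regressive decoding with a toy scoring function standing in for the model (no real weights or tokenizer are assumed here):

```python
def greedy_decode(logits_fn, prompt, max_new_tokens, eos_id):
    """Generic auto-regressive loop: at each step the model scores the
    sequence so far and the highest-scoring next token is appended."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        scores = logits_fn(tokens)            # one score per vocabulary id
        next_id = max(range(len(scores)), key=scores.__getitem__)
        tokens.append(next_id)
        if next_id == eos_id:                 # stop once end-of-sequence is emitted
            break
    return tokens

# Toy "model" (hypothetical): always prefers (last token + 1) mod VOCAB.
VOCAB = 10
def toy_logits(tokens):
    want = (tokens[-1] + 1) % VOCAB
    return [1.0 if i == want else 0.0 for i in range(VOCAB)]

print(greedy_decode(toy_logits, prompt=[3], max_new_tokens=4, eos_id=9))
# → [3, 4, 5, 6, 7]
```

The same loop structure underlies sampling-based decoding; only the `max` step changes.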
Training Details
| Field | Details |
|---|---|
| Data Sources | Publicly available online data |
| Data Volume | |
| Methodology | Supervised fine-tuning (SFT) and Reinforcement Learning from Human Feedback (RLHF) |
| Context Length | |
| Hardware Used | Meta's Research SuperCluster, H100-80GB GPUs |
| Model Architecture | Optimized transformer architecture |
|
|
Safety Evaluation
| Field | Details |
|---|---|
| Methodologies | Red-teaming, adversarial evaluations |
| Findings | Llama 3 is significantly less likely than Llama 2 to falsely refuse to answer prompts. |
| Risk Categories | Misinformation, insecure coding |
| Ethical Considerations | Iterative testing was conducted to assess safety risks related to CBRNE (chemical, biological, radiological, nuclear, and explosive) threats. |
|
|
Responsible AI Considerations
| Field | Details |
|---|---|
| Fairness | The model is designed to be inclusive and helpful across a wide range of use cases. |
| Transparency | Efforts are made to maintain transparency through open community contributions. |
| Accountability | Meta encourages developers to take responsibility for customizing safety for their use case. |
| Mitigation Strategies | Meta Llama Guard 2 and Code Shield for safety. |
|
|
Input Output
| Field | Details |
|---|---|
| Input Format | |
| Accepted Modalities | |
| Output Format | |
|
Release Notes
| Version | Date | Notes |
|---|---|---|
| | | Additional parameters and context length. |
| | | Initial release with Grouped-Query Attention. |
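Grouped-Query Attention (GQA), mentioned in the initial release, lets several query heads share a single key/value head, which shrinks the KV cache at inference time. Below is a minimal NumPy sketch of the mechanism with toy shapes; the head counts and dimensions are illustrative, not the model's actual configuration:

```python
import numpy as np

def grouped_query_attention(q, k, v, num_kv_heads):
    """Causal grouped-query attention.

    q: (num_q_heads, seq_len, head_dim)
    k, v: (num_kv_heads, seq_len, head_dim)
    Each group of num_q_heads // num_kv_heads query heads
    attends against the same shared key/value head.
    """
    num_q_heads, seq_len, head_dim = q.shape
    group_size = num_q_heads // num_kv_heads   # query heads per KV head

    out = np.empty_like(q)
    for h in range(num_q_heads):
        kv = h // group_size                   # KV head shared by this group
        scores = q[h] @ k[kv].T / np.sqrt(head_dim)
        # Causal mask: position i may only attend to positions <= i.
        mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
        # Row-wise softmax over the unmasked positions.
        weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
        weights /= weights.sum(axis=-1, keepdims=True)
        out[h] = weights @ v[kv]
    return out

# Toy shapes: 8 query heads share 2 KV heads (group size 4).
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 5, 16))
k = rng.standard_normal((2, 5, 16))
v = rng.standard_normal((2, 5, 16))
print(grouped_query_attention(q, k, v, num_kv_heads=2).shape)  # → (8, 5, 16)
```

With `num_kv_heads == num_q_heads` this reduces to standard multi-head attention, and with `num_kv_heads == 1` to multi-query attention; GQA sits between the two.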
|
|
|