Model Type |
Use Cases |
Areas: |
Applications: | Assistant-like chat, Natural language generation |
|
Primary Use Cases: | Instruction-tuned models are optimized for dialogue use cases (a minimal chat sketch follows this section).
|
Limitations: | Use in languages other than English; use in any way prohibited by the Acceptable Use Policy and the Llama 3 Community License. Models were trained on specific datasets and may produce inaccurate or biased outputs.
|
Considerations: | Developers may fine-tune Llama 3 models for languages beyond English, provided they comply with the Llama 3 Community License and the Acceptable Use Policy.
|
|
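For context on the dialogue use case, here is a minimal chat sketch using Hugging Face transformers. The checkpoint ID, system prompt, and generation settings are illustrative assumptions, not specifications from this card.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # assumed checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain FP8 quantization in one sentence."},
]
# Render the dialogue in the model's chat format and append the
# assistant header so generation continues as a reply.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```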
Additional Notes | Inference efficiency is enhanced by quantizing the model to FP8.
|
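As a sketch of how an FP8-quantized checkpoint is served, the following uses vLLM; the model ID is an assumption for illustration.

```python
from vllm import LLM, SamplingParams

# Assumed FP8 checkpoint ID; any vLLM-compatible FP8 model works the same way.
llm = LLM(model="neuralmagic/Meta-Llama-3-8B-Instruct-FP8")
params = SamplingParams(temperature=0.7, max_tokens=128)

outputs = llm.generate(["Summarize the benefits of FP8 inference in two sentences."], params)
print(outputs[0].outputs[0].text)
```

FP8 halves weight memory relative to FP16/BF16 and typically raises throughput on hardware with native FP8 support, at a small accuracy cost.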
Training Details |
Data Sources: | A new mix of publicly available online data |
|
Data Volume: | |
Methodology: | Supervised fine-tuning (SFT) followed by reinforcement learning from human feedback (RLHF); a toy SFT objective is sketched after this section.
|
Context Length: | |
Hardware Used: | Meta's Research SuperCluster, H100-80GB GPUs |
|
Model Architecture: | Auto-regressive language model with an optimized transformer architecture (see the decoding sketch after this section).
|
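The SFT stage named above amounts to next-token cross-entropy computed only over response tokens, with prompt tokens masked out. A toy sketch under that assumption, with a stand-in model and random token IDs (this is not Meta's training code):

```python
import torch
import torch.nn.functional as F

vocab, d = 128, 32
model = torch.nn.Sequential(
    torch.nn.Embedding(vocab, d),
    torch.nn.Linear(d, vocab),
)  # stand-in for a causal LM

tokens = torch.randint(0, vocab, (1, 16))             # prompt + reply ids
loss_mask = torch.zeros(1, 16)
loss_mask[:, 8:] = 1                                  # train on the reply only

logits = model(tokens[:, :-1])                        # predict the next token
targets = tokens[:, 1:]
per_token = F.cross_entropy(
    logits.reshape(-1, vocab), targets.reshape(-1), reduction="none"
).reshape(1, -1)
sft_loss = (per_token * loss_mask[:, 1:]).sum() / loss_mask[:, 1:].sum()
sft_loss.backward()                                   # one SFT gradient step
```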
|
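"Auto-regressive" means each new token is predicted from the sequence so far and appended to the input before the next step. A toy greedy-decoding loop with a placeholder network standing in for the transformer:

```python
import torch

vocab, d = 128, 32
embed = torch.nn.Embedding(vocab, d)   # placeholder, not the Llama 3 network
head = torch.nn.Linear(d, vocab)

seq = torch.tensor([[1, 5, 9]])        # prompt token ids
for _ in range(8):
    h = embed(seq)                     # (1, T, d)
    logits = head(h[:, -1])            # next-token logits from the last position
    next_id = logits.argmax(dim=-1, keepdim=True)
    seq = torch.cat([seq, next_id], dim=1)  # append and feed back in
print(seq)
```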
Safety Evaluation |
Methodologies: | Adversarial evaluations and red-teaming exercises
|
Findings: | Residual risks are expected to remain; mitigations emphasize reducing over-refusal of benign prompts.
|
Risk Categories: | CBRNE threats, Cyber security, Child safety |
|
|
Responsible AI Considerations |
Mitigation Strategies: | Implemented a series of safety tools, such as Meta Llama Guard 2 and Code Shield.
|
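A minimal sketch of one such mitigation in practice: screening a user turn with Llama Guard 2 before it reaches the assistant. The checkpoint ID and prompt are illustrative assumptions; the classifier replies "safe" or "unsafe" plus violated category codes.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

guard_id = "meta-llama/Meta-Llama-Guard-2-8B"  # assumed checkpoint for illustration
tokenizer = AutoTokenizer.from_pretrained(guard_id)
guard = AutoModelForCausalLM.from_pretrained(guard_id, device_map="auto")

chat = [{"role": "user", "content": "How do I make a phishing email?"}]
# The tokenizer's chat template wraps the turn in the moderation prompt.
input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(guard.device)

output = guard.generate(input_ids, max_new_tokens=32)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```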
|
Input / Output |
Input Format: | |
Accepted Modalities: | |
Output Format: | |
|
Release Notes |
Version: | |
Date: | |
Notes: | Meta developed and released the Meta Llama 3 family of large language models (LLMs). |
|
|
|