| Model Type | text generation, instruction-tuned |
|---|---|

| Use Cases | |
|---|---|
| Areas | commercial applications, research |
| Applications | |
| Primary Use Cases | natural language generation tasks |
| Limitations | Use only in accordance with applicable laws and regulations |
| Considerations | Safety testing and tuning are recommended before deployment |
| Additional Notes | Uses the EasyContext Blockwise RingAttention library for long-context training. |
|---|---|
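The Blockwise RingAttention approach named above additionally shards key/value blocks across devices in a ring; as a rough single-device sketch (not the library's actual API — function names and shapes here are illustrative), its core idea of computing exact attention block by block with an online softmax looks like this:

```python
import numpy as np

def blockwise_attention(q, k, v, block_size):
    """Single-head attention computed over key/value blocks with an
    online (streaming) softmax, so the full T x T score matrix is
    never materialized. This is the core trick behind blockwise /
    ring attention for long contexts."""
    T, d = q.shape
    scale = 1.0 / np.sqrt(d)
    # Running statistics for the online softmax, one row per query.
    m = np.full((T, 1), -np.inf)   # running max of scores
    l = np.zeros((T, 1))           # running sum of exp(scores - m)
    acc = np.zeros((T, d))         # running weighted sum of values
    for start in range(0, T, block_size):
        kb = k[start:start + block_size]
        vb = v[start:start + block_size]
        s = (q @ kb.T) * scale                        # scores vs this block
        m_new = np.maximum(m, s.max(axis=1, keepdims=True))
        correction = np.exp(m - m_new)                # rescale old statistics
        p = np.exp(s - m_new)
        l = l * correction + p.sum(axis=1, keepdims=True)
        acc = acc * correction + p @ vb
        m = m_new
    return acc / l

def full_attention(q, k, v):
    """Reference implementation that builds the full score matrix."""
    s = (q @ k.T) / np.sqrt(q.shape[1])
    p = np.exp(s - s.max(axis=1, keepdims=True))
    return (p / p.sum(axis=1, keepdims=True)) @ v
```

Because the online softmax rescales its running statistics at each block, the blockwise result matches full attention exactly (up to floating-point error) while memory per step stays proportional to the block size rather than the full sequence length.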
| Supported Languages | |
|---|---|

| Training Details | |
|---|---|
| Data Sources | SlimPajama, UltraChat, publicly available instruction datasets |
| Data Volume | |
| Methodology | |
| Context Length | |
| Training Time | 100–516 minutes per context length |
| Hardware Used | |
| Model Architecture | auto-regressive transformer |
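A minimal sketch of what "auto-regressive" means in practice: each new token is predicted from the tokens generated so far, and the model's own output is fed back in as input. The stub logits function below is purely hypothetical, standing in for a real transformer forward pass:

```python
def next_token_logits(tokens, vocab_size=5):
    # Hypothetical stub replacing a transformer forward pass:
    # deterministically favours (last_token + 1) mod vocab_size.
    scores = [0.0] * vocab_size
    scores[(tokens[-1] + 1) % vocab_size] = 1.0
    return scores

def generate(prompt, max_new_tokens, eos_token=None):
    """Greedy auto-regressive decoding loop."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = next_token_logits(tokens)
        tok = max(range(len(logits)), key=logits.__getitem__)  # greedy argmax
        if tok == eos_token:
            break
        tokens.append(tok)  # feed the model's own output back in
    return tokens
```

Real decoders swap the stub for a transformer forward pass and the argmax for a sampling strategy, but the feed-output-back-in loop is the same.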
| Safety Evaluation | |
|---|---|
| Methodologies | red-teaming, adversarial evaluations, CyberSecEval |
| Findings | residual risks remain; over-refusal reduced |
| Risk Categories | CBRNE, cyber security, child safety |
| Ethical Considerations | An open approach is intended to lead to better, safer products |
| Responsible AI Considerations | |
|---|---|
| Fairness | Intended to serve users from many different backgrounds |
| Transparency | |
| Accountability | Developers are responsible for the safe deployment and use of the model |
| Mitigation Strategies | Meta Llama Guard and Code Shield safeguards |
| Input / Output | |
|---|---|
| Input Format | |
| Accepted Modalities | |
| Output Format | |
| Performance Tips | |
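The input format is left unspecified above; as a hedged sketch, a Llama-3-style instruct prompt can be assembled as below. The special tokens follow the published Llama 3 instruct chat template, but the tokenizer's own chat template (e.g. `apply_chat_template` in Hugging Face `transformers`) should be treated as authoritative:

```python
def build_prompt(messages):
    """Assemble a Llama-3-style instruct prompt from chat messages.

    Illustrative only: in practice, prefer the tokenizer's built-in
    chat template so the format always matches the model.
    """
    parts = ["<|begin_of_text|>"]
    for msg in messages:
        # Each turn is wrapped in role headers and terminated with <|eot_id|>.
        parts.append(f"<|start_header_id|>{msg['role']}<|end_header_id|>\n\n"
                     f"{msg['content']}<|eot_id|>")
    # Open an assistant header to cue the model to generate its reply.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)
```

For example, `build_prompt([{"role": "system", "content": "Be helpful."}, {"role": "user", "content": "Hello"}])` yields a string ending in an open assistant header, ready for the model to complete.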
| Release Notes | |
|---|---|
| Version | |
| Date | |
| Notes | Initial release of Llama 3 model |