Model Type:

Use Cases
Areas:
Applications: assistant-like chat, natural language generation tasks
Primary Use Cases: English language research & applications
Limitations: Use in languages other than English
Considerations: Developers must comply with the Acceptable Use Policy and the Llama 3 Community License.

Additional Notes
Optimized for handling very long contexts with minimal training adjustments.
|
Supported Languages:

Training Details
Data Sources: SlimPajama dataset, UltraChat chat dataset
Data Volume:
Methodology:
Context Length:
Training Time:
Hardware Used: Crusoe Energy high-performance L40S cluster
Model Architecture: auto-regressive optimized transformer with rotary position embeddings (RoPE)
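The rotary position embedding (RoPE) named in the architecture line above can be sketched as follows. This is a minimal illustrative implementation of standard RoPE, not Gradient's actual code; the adjacent-pair rotation convention and the base frequency of 10000 are assumptions (long-context variants raise the base).

```python
import math

def rope_rotate(x, pos, theta_base=10000.0):
    """Apply rotary position embedding to one vector of even length.

    Each adjacent pair (x[2i], x[2i+1]) is rotated by an angle that
    depends on the token position and the pair index, so relative
    positions are encoded in the dot products of rotated vectors.
    """
    d = len(x)
    assert d % 2 == 0, "embedding dimension must be even"
    out = [0.0] * d
    for i in range(d // 2):
        # Lower pair indices rotate faster; higher ones rotate slowly.
        angle = pos / (theta_base ** (2 * i / d))
        c, s = math.cos(angle), math.sin(angle)
        out[2 * i] = x[2 * i] * c - x[2 * i + 1] * s
        out[2 * i + 1] = x[2 * i] * s + x[2 * i + 1] * c
    return out
```

At position 0 the rotation is the identity, and because each step is a pure rotation the vector norm is preserved at every position.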
|
|
Safety Evaluation
Methodologies: red teaming, adversarial evaluations
Risk Categories: cybersecurity, child safety
Ethical Considerations: Residual risks remain; trade-offs between helpfulness and alignment are noted.
|
|
Responsible AI Considerations
Fairness: Efforts were made to reduce biases and ensure model safety.
Transparency: Documentation and methodologies are publicly available.
Accountability: Users are responsible for ensuring their applications comply with use policies.
Mitigation Strategies: Llama Guard and Code Shield safeguards for safe deployments.
|
|
Input Output
Input Format:
Accepted Modalities:
Output Format:
Performance Tips: Use RoPE scaling and appropriate hardware for long-context handling.
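One common recipe for the RoPE scaling mentioned in the tip above is "NTK-aware" base scaling, which raises the RoPE base frequency in proportion to the context-extension ratio. The function and constants below are illustrative assumptions (Llama 3 ships with a base of 500000 and an 8,192-token window); the exact theta schedule used for the 262K checkpoint may differ.

```python
def ntk_scaled_theta(theta_base: float, orig_ctx: int,
                     target_ctx: int, head_dim: int) -> float:
    """NTK-aware RoPE base scaling (illustrative sketch).

    Raises the RoPE base so the slowest rotary frequencies span the
    extended context window instead of wrapping around early.
    """
    scale = target_ctx / orig_ctx
    return theta_base * scale ** (head_dim / (head_dim - 2))

# Hypothetical example: extending an 8,192-token window to 262,144 tokens
# with 128-dimensional attention heads.
new_theta = ntk_scaled_theta(500_000.0, 8_192, 262_144, 128)
```

When the target window equals the original window the base is unchanged; larger extension ratios yield a proportionally larger base.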
|
|
Release Notes
Version:
Date:
Notes: Initial release of Llama-3 70B Instruct Gradient 262K.