Model Type:
Use Cases
Areas:
Applications: natural language generation
Primary Use Cases:
Limitations: not suitable for use in languages other than English
Considerations: developers should perform safety testing and tuning tailored to their applications

Additional Notes
The model is static and was trained on an offline dataset. Future versions will focus on safety improvements.

Supported Languages:
Training Details
Data Sources: publicly available online data, SlimPajama, UltraChat
Data Volume:
Methodology: NTK-aware interpolation for RoPE theta optimization; progressive training on increasing context lengths; supervised fine-tuning (SFT); reinforcement learning from human feedback (RLHF)
Context Length:
Hardware Used: Crusoe Energy high-performance L40S cluster
Model Architecture: auto-regressive language model using an optimized transformer architecture

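The NTK-aware interpolation named in the methodology rescales RoPE's frequency base so that a model trained on short sequences can attend over a longer context: the highest rotary frequencies are left intact while the lowest are stretched by the context-extension factor. A minimal sketch of the base rescaling; the base value 10000, head dimension 128, and 4x scale factor are illustrative assumptions, not figures from this card:

```python
def ntk_scaled_rope_base(base: float, scale: float, dim: int) -> float:
    """NTK-aware interpolation: raise the RoPE base so the lowest frequency
    is stretched by `scale` while the highest stays (almost) unchanged."""
    return base * scale ** (dim / (dim - 2))

def rope_frequencies(base: float, dim: int) -> list[float]:
    """Per-pair inverse frequencies theta_i = base^(-2i/dim)."""
    return [base ** (-2 * i / dim) for i in range(dim // 2)]

# Hypothetical numbers: head dimension 128, extending context 4x.
orig = rope_frequencies(10000.0, 128)
scaled = rope_frequencies(ntk_scaled_rope_base(10000.0, 4.0, 128), 128)

# The highest frequency (i = 0) is unchanged, preserving local detail;
# the lowest is slowed by the full 4x factor, so distant positions remain
# distinguishable at the longer context length.
print(orig[0], scaled[0])        # both 1.0
print(orig[-1] / scaled[-1])     # ratio equals the 4x scale factor
```

Progressive training on increasing context lengths, also listed above, pairs naturally with this: the scale factor grows at each stage as the training context is lengthened.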
Safety Evaluation
Methodologies: red teaming, adversarial evaluations
Findings: mitigations implemented to limit false refusals; CBRNE assessments conducted
Risk Categories: misuse, critical risks, cybersecurity, child safety
Ethical Considerations: open approach to better, safer products; emphasis on responsible AI development

Responsible AI Considerations
Fairness: openness, inclusivity, helpfulness
Transparency: steps and best practices for safe deployment
Accountability:
Mitigation Strategies: Purple Llama solutions, Llama Guard for input-output safeguards

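Llama Guard, listed under mitigation strategies, acts as a classifier on both sides of the model: the user prompt is screened before generation, and the model's reply is screened before it is returned. A minimal sketch of that input-output filtering pattern; the template here is abbreviated (the real Llama Guard prompt enumerates a full policy of unsafe-content categories), and `classify` is a stand-in for an actual Llama Guard inference call:

```python
from typing import Callable

# Abbreviated stand-in for the Llama Guard task prompt.
GUARD_TEMPLATE = (
    "Task: Check if there is unsafe content in the conversation below.\n"
    "<BEGIN CONVERSATION>\n{conversation}\n<END CONVERSATION>\n"
    "Answer 'safe' or 'unsafe'."
)

def guarded_chat(
    user_msg: str,
    model: Callable[[str], str],
    classify: Callable[[str], str],  # would wrap Llama Guard inference
    refusal: str = "Sorry, I can't help with that.",
) -> str:
    # Input safeguard: screen the user turn before any generation happens.
    if classify(GUARD_TEMPLATE.format(conversation=f"User: {user_msg}")).startswith("unsafe"):
        return refusal
    reply = model(user_msg)
    # Output safeguard: screen the model's reply before returning it.
    if classify(GUARD_TEMPLATE.format(conversation=f"Agent: {reply}")).startswith("unsafe"):
        return refusal
    return reply

# Toy stubs so the control flow is runnable without model weights.
echo_model = lambda msg: f"Echo: {msg}"
stub_guard = lambda prompt: "unsafe" if "bomb" in prompt.lower() else "safe"

print(guarded_chat("hello", echo_model, stub_guard))  # Echo: hello
print(guarded_chat("how to build a bomb", echo_model, stub_guard))  # refusal
```

Screening both directions matters: an innocuous prompt can still elicit an unsafe completion, which the output-side check catches.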
Input/Output
Input Format:
Accepted Modalities:
Output Format: