Model Type | text-generation, multilingual |
|
Use Cases |
Areas: | |
Applications: | assistant-like chat, natural language generation tasks, synthetic data generation |
|
Primary Use Cases: | |
Limitations: | not intended for use in languages beyond the 8 supported ones without fine-tuning and compliance with the license terms |
|
Considerations: | ensure safe use when deploying in additional languages |
|
|
Additional Notes | Intended for commercial and research use. Model safety is updated regularly based on community feedback. |
|
Supported Languages | English (full), German (full), French (full), Italian (full), Portuguese (full), Hindi (full), Spanish (full), Thai (full) |
|
Training Details |
Data Sources: | publicly available online data |
|
Data Volume: | |
Methodology: | supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF) |
|
Context Length: | |
Training Time: | |
Hardware Used: | custom-built GPU cluster, H100-80GB |
|
Model Architecture: | optimized transformer architecture |
|
|
Safety Evaluation |
Methodologies: | red teaming, risk assessments, evaluation datasets |
|
Findings: | some safety risks identified and mitigated |
|
Risk Categories: | misinformation, cyber threats, child safety |
|
Ethical Considerations: | engagement with subject-matter experts to assess real-world harms |
|
|
Responsible AI Considerations |
Fairness: | efforts to mitigate bias through a multi-faceted data collection approach |
|
Transparency: | part of an open community for AI safety progress |
|
Accountability: | use of an output reporting mechanism |
|
Mitigation Strategies: | adopting the MLCommons taxonomy, employing numerous safety guardrails |
|
|
Input Output |
Input Format: | |
Accepted Modalities: | |
Output Format: | multilingual text and code |
|
Performance Tips: | Follow the Responsible Use Guide |
|
|
Release Notes |
Version: | |
Date: | |
Notes: | Enhancements for multilingual dialogue, improved benchmark results. |
|
|
|
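The language limitation above (8 supported languages, with fine-tuning and safety checks required for any others) could be enforced in an application with a simple gate. The following is a minimal sketch, not part of the model or its tooling: the ISO 639-1 codes and the function names `is_supported_language` and `check_request_language` are assumptions introduced for illustration.

```python
# Languages listed under "Supported Languages" in this card.
# The ISO 639-1 codes are an assumption added for illustration.
SUPPORTED_LANGUAGES = {
    "en": "English",
    "de": "German",
    "fr": "French",
    "it": "Italian",
    "pt": "Portuguese",
    "hi": "Hindi",
    "es": "Spanish",
    "th": "Thai",
}


def is_supported_language(code: str) -> bool:
    """Return True if the code maps to one of the 8 listed languages."""
    return code.strip().lower() in SUPPORTED_LANGUAGES


def check_request_language(code: str) -> str:
    """Hypothetical gate describing how a request in this language is handled."""
    normalized = code.strip().lower()
    if normalized in SUPPORTED_LANGUAGES:
        return f"{SUPPORTED_LANGUAGES[normalized]}: supported"
    return f"{normalized}: not listed; fine-tuning and extra safety checks needed"


if __name__ == "__main__":
    for lang in ("en", "TH", "ja"):
        print(check_request_language(lang))
```

An application would typically obtain the language code from its own language-detection step, which is outside the scope of this sketch.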