Model Type | auto-regressive, generative text model |
|
Use Cases |
Areas: | Commercial and research applications in English |
|
Applications: | Natural language generation tasks, assistant-like chat
|
Primary Use Cases: | Pretrained models can be adapted for various tasks |
|
Limitations: | Use in languages other than English; any use that violates applicable laws or regulations
|
Considerations: | A specific prompt format must be followed to get the expected features and performance for chat (see the sketch below).
|
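The exact template is not given in this card, so the helper below is a minimal sketch of what "specific formatting" typically means for an instruction-tuned chat model. The `format_chat_prompt` name and the `[INST]`/`<<SYS>>` tags are illustrative assumptions, not the documented template; consult the model's own documentation for the real one.

```python
def format_chat_prompt(system: str, user: str) -> str:
    """Wrap a system instruction and a user message in an
    instruction-style chat template. The tags below are
    illustrative; the model's documentation defines the real ones."""
    return f"[INST] <<SYS>>\n{system}\n<</SYS>>\n\n{user} [/INST]"


# Hypothetical usage: build a prompt for a chat-tuned checkpoint.
prompt = format_chat_prompt(
    system="You are a helpful assistant.",
    user="Summarize this model card in one sentence.",
)
print(prompt)
```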
|
Additional Notes | Carbon footprint of pretraining is offset by Meta's program.
|
Supported Languages | English
|
Training Details |
Data Sources: | A new mix of publicly available online data |
|
Data Volume: | |
Methodology: | Pretrained with an auto-regressive objective, then fine-tuned with supervised learning and reinforcement learning from human feedback (RLHF); a sketch of the pretraining objective follows this section.
|
Context Length: | |
Training Time: | |
Hardware Used: | Meta's Research Super Cluster, third-party cloud compute |
|
Model Architecture: | Optimized transformer architecture |
|
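The methodology row names the auto-regressive objective without showing it. Below is a minimal PyTorch sketch of next-token prediction, the loss behind auto-regressive pretraining; `autoregressive_lm_loss` and the toy model are illustrative assumptions (the toy stands in for a real decoder-only transformer, which it is not).

```python
import torch
import torch.nn.functional as F

def autoregressive_lm_loss(model, token_ids):
    """Next-token prediction: each position is trained to predict
    the token that follows it.

    token_ids: LongTensor of shape (batch, seq_len).
    model:     any callable returning logits of shape
               (batch, seq_len - 1, vocab_size) for the shifted input.
    """
    inputs, targets = token_ids[:, :-1], token_ids[:, 1:]
    logits = model(inputs)
    return F.cross_entropy(
        logits.reshape(-1, logits.size(-1)),
        targets.reshape(-1),
    )

# Shape check with a stand-in model (assumed vocab of 32_000 tokens);
# a real run would use a causal transformer, not this toy stack.
vocab = 32_000
toy = torch.nn.Sequential(torch.nn.Embedding(vocab, 64),
                          torch.nn.Linear(64, vocab))
tokens = torch.randint(0, vocab, (2, 16))
loss = autoregressive_lm_loss(toy, tokens)
```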
|
Safety Evaluation |
Methodologies: | Evaluation on standard academic benchmarks |
|
Findings: | Outperforms open-source chat models on most benchmarks tested; on par with some closed-source models
|
Ethical Considerations: | Developers should perform safety testing tailored to their specific applications before deployment.
|
|
Responsible AI Considerations
Fairness: | Testing has been conducted in English; it cannot cover, nor can it predict, all possible scenarios.
|
Accountability: | Developers should perform safety testing tailored to specific applications. |
|
Mitigation Strategies: | Tuned with reinforcement learning from human feedback (RLHF) for alignment; see the reward-model sketch after this section.
|
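The mitigation row credits RLHF. As a pointer to what that involves, here is a minimal sketch of the pairwise preference loss a reward model is commonly trained with before it is used to steer the policy; the function name and scores are illustrative assumptions, not this model's actual training code.

```python
import torch
import torch.nn.functional as F

def reward_preference_loss(r_chosen: torch.Tensor,
                           r_rejected: torch.Tensor) -> torch.Tensor:
    """Pairwise preference loss, -log(sigmoid(r_chosen - r_rejected)):
    pushes the reward of the human-preferred response above the
    rejected one."""
    return -F.logsigmoid(r_chosen - r_rejected).mean()

# Illustrative scores a reward model might assign to two response pairs.
chosen = torch.tensor([1.3, 0.2])
rejected = torch.tensor([0.5, -0.1])
loss = reward_preference_loss(chosen, rejected)
```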
|
Input/Output
Input Format: | Text
Accepted Modalities: | Text
Output Format: | Text
Performance Tips: | Larger models (70B) use Grouped-Query Attention (GQA) for improved inference scalability; a sketch follows below.
|
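GQA is named above but not explained. The sketch below shows the core idea: many query heads share a smaller set of key/value heads, which shrinks the KV cache at inference time. Shapes and head counts are illustrative assumptions, and the causal mask and linear projections of a full attention layer are omitted for brevity.

```python
import torch
import torch.nn.functional as F

def grouped_query_attention(q, k, v):
    """Grouped-Query Attention without masking or projections.

    q:    (batch, n_q_heads,  seq, head_dim)
    k, v: (batch, n_kv_heads, seq, head_dim), n_kv_heads < n_q_heads
    """
    group = q.size(1) // k.size(1)
    # Broadcast each key/value head across its group of query heads.
    k = k.repeat_interleave(group, dim=1)
    v = v.repeat_interleave(group, dim=1)
    scores = q @ k.transpose(-2, -1) / (q.size(-1) ** 0.5)
    return F.softmax(scores, dim=-1) @ v

# Example: 8 query heads sharing 2 key/value heads (4-to-1 grouping).
q = torch.randn(1, 8, 16, 64)
k = torch.randn(1, 2, 16, 64)
v = torch.randn(1, 2, 16, 64)
out = grouped_query_attention(q, k, v)  # -> (1, 8, 16, 64)
```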
|