| Model Type | Large Language Model, Instruction Following |
| --- | --- |

| Use Cases | |
| --- | --- |
| Areas | |
| Applications | Language generation, Chat assistance, Instruction following |
| Primary Use Cases | Instruction following, Multi-turn dialogue |
| Limitations | Potential to generate offensive or unethical content under adversarial conditions |
| Considerations | Continuous improvement; responsible usage is encouraged |

| Additional Notes | Unofficial checkpoint for research purposes |
| --- | --- |

| Supported Languages | |
| --- | --- |
| Training Details | |
| --- | --- |
| Data Sources | Preference data mix, prompt collection for RLHF training |
| Data Volume | |
| Methodology | Iterative DPO with online RLHF (see the sketch below) |
| Training Time | |
| Model Architecture | Iterative DPO-based training |
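The card names the alignment recipe only as iterative DPO with online RLHF and does not publish the exact objective or hyperparameters. For reference, below is a minimal sketch of the standard DPO loss that such a pipeline would repeat each round; the function name, the beta value, and the toy log-probabilities are illustrative assumptions, not values from this checkpoint.

```python
import torch
import torch.nn.functional as F

def dpo_loss(policy_chosen_logps, policy_rejected_logps,
             ref_chosen_logps, ref_rejected_logps, beta=0.1):
    """Standard DPO loss over summed log-probs of chosen/rejected responses.

    beta=0.1 is an assumed placeholder, not this checkpoint's setting.
    """
    chosen_rewards = beta * (policy_chosen_logps - ref_chosen_logps)
    rejected_rewards = beta * (policy_rejected_logps - ref_rejected_logps)
    # Push the implicit reward of the chosen response above the rejected one.
    return -F.logsigmoid(chosen_rewards - rejected_rewards).mean()

# Toy usage: each tensor holds per-example sequence log-probs for a batch of pairs.
loss = dpo_loss(torch.tensor([-12.3, -9.8]), torch.tensor([-15.1, -11.0]),
                torch.tensor([-13.0, -10.2]), torch.tensor([-14.8, -10.9]))
print(loss.item())
```

In the online iterative variant, each round typically samples fresh responses from the current policy, ranks them with a preference or reward signal, and applies this update to the newly collected pairs before the next round.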
| Safety Evaluation | |
| --- | --- |
| Methodologies | |
| Findings | Potential for offensive content under adversarial conditions |
| Risk Categories | Offensive content, Ethical considerations |
| Ethical Considerations | Safety and ethical considerations are integral to the alignment process |
| Responsible AI Considerations | |
| --- | --- |
| Fairness | |
| Transparency | Technical report available |
| Accountability | Developers and affiliated institution |
| Mitigation Strategies | Continuous improvement in model safety |
| Input / Output | |
| --- | --- |
| Input Format | |
| Accepted Modalities | |
| Output Format | |
| Performance Tips | Best performance on CUDA-enabled devices (see the usage sketch below) |
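The input and output formats are left unspecified above; only CUDA-enabled devices are noted for performance. Below is a minimal usage sketch with Hugging Face transformers, assuming the checkpoint is a standard causal LM with a chat template; the repository ID, dtype, and generation settings are placeholders rather than documented values.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Placeholder repo ID; substitute the actual checkpoint name.
model_id = "org-name/online-iterative-rlhf-checkpoint"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # assumed half-precision setting for GPU inference
    device_map="auto",           # place weights on CUDA when available
)

# Multi-turn chat formatted through the tokenizer's chat template (assumed to exist).
messages = [{"role": "user", "content": "Summarize what iterative DPO does."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))
```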
| Release Notes | |
| --- | --- |
| Version | |
| Notes | Initial release of the unofficial checkpoint showcasing online iterative RLHF |