Model Type: text generation, instruction following

Use Cases
Areas:
Applications: general-purpose AI systems; memory- and compute-constrained environments; latency-bound scenarios; tasks requiring strong reasoning (code, math, logic)
Primary Use Cases: intended for broad commercial and research use
Limitations: not designed or evaluated for all possible downstream purposes; limited language support outside English
Considerations: evaluate and mitigate for accuracy, safety, and fairness before deployment

Supported Languages: primarily English; roughly 10% of the training data is multilingual

Training Details
Data Sources: publicly available documents; newly created synthetic data; high-quality chat-format supervised data
Data Volume:
Methodology: supervised fine-tuning (SFT) and Direct Preference Optimization (DPO)
Context Length:
Training Time:
Hardware Used:
Model Architecture: dense decoder-only Transformer

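To make the DPO step of the methodology above concrete, here is a minimal sketch of the DPO loss for a single preference pair. The log-probability values and the `beta` hyperparameter are hypothetical stand-ins, not values from this model's training run:

```python
import math

def dpo_loss(policy_chosen, ref_chosen, policy_rejected, ref_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    Each argument is the summed log-probability of a response under the
    trainable policy or the frozen reference model; beta scales the
    implicit reward derived from the policy/reference gap.
    """
    # Implicit rewards: how far the policy has moved from the reference
    reward_chosen = beta * (policy_chosen - ref_chosen)
    reward_rejected = beta * (policy_rejected - ref_rejected)
    # -log sigmoid(margin): shrinks as the chosen response is preferred more
    margin = reward_chosen - reward_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical log-probabilities for one preference pair
neutral = dpo_loss(-10.0, -10.0, -12.0, -12.0)   # zero margin -> loss = ln 2
preferred = dpo_loss(-8.0, -10.0, -14.0, -12.0)  # policy favors the chosen response
```

In practice this loss is averaged over batches of preference pairs and minimized with gradient descent on the policy only; the reference model stays frozen.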
Responsible AI Considerations
Fairness: performance may vary across English language varieties; outputs may reinforce representation harms or stereotypes
Transparency: disclose the model's potential behaviors and limitations to end users
Accountability: developers are responsible for their specific use cases and deployments
Mitigation Strategies: follow responsible-AI best practices and implement additional mitigations for sensitive deployment contexts

Input/Output
Input Format:
Accepted Modalities:
Output Format:
Performance Tips: include the specific tokens the model expects in prompts for improved reliability
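One way to read the tip above: chat-tuned models are usually most reliable when prompts are wrapped in the exact special tokens used during training. The `<|user|>`, `<|assistant|>`, and `<|end|>` markers below are purely illustrative assumptions; substitute whatever tokens your model's documentation specifies:

```python
# Hypothetical chat-format tokens; replace with the tokens your model documents.
USER, ASSISTANT, END = "<|user|>", "<|assistant|>", "<|end|>"

def build_prompt(turns):
    """Render (role, text) turns into one chat-formatted prompt string,
    ending with the assistant token so the model continues as the assistant."""
    parts = [
        f"{USER if role == 'user' else ASSISTANT}\n{text}{END}\n"
        for role, text in turns
    ]
    return "".join(parts) + ASSISTANT + "\n"

prompt = build_prompt([("user", "How do I sort a list in Python?")])
```

Hand-building strings like this is fragile; when a tokenizer ships a built-in chat template, prefer that over manual formatting.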
|
|