Model Type | |
Use Cases |
Areas: | Research, Commercial applications |
|
Primary Use Cases: | Memory/compute-constrained environments, latency-bound scenarios, strong reasoning tasks |
|
Limitations: | Not specifically designed or evaluated for all downstream purposes. |
|
Considerations: | Adherence to laws and regulations is required. |
|
|
Additional Notes | This is a static model trained on an offline dataset with a cutoff date of October 2023. Future versions may improve upon it. |
|
Supported Languages | |
Training Details |
Data Sources: | Publicly available documents, newly created synthetic data, high-quality chat-format supervised data |
|
Data Volume: | |
Methodology: | Supervised fine-tuning, Direct Preference Optimization |
|
Context Length: | |
Training Time: | |
Hardware Used: | |
Model Architecture: | Dense decoder-only Transformer |
|
|
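The Direct Preference Optimization step listed under Methodology can be sketched with its per-pair loss. This is a minimal illustration of the standard DPO objective, not code from this model's training pipeline; the `beta` value is an illustrative assumption, since the card does not state hyperparameters.

```python
import math

def dpo_loss(pi_chosen: float, pi_rejected: float,
             ref_chosen: float, ref_rejected: float,
             beta: float = 0.1) -> float:
    """DPO loss for one preference pair.

    Inputs are log-probabilities of the chosen/rejected responses under
    the policy being trained (pi_*) and a frozen reference model (ref_*).
    beta is illustrative; the card does not state training hyperparameters.
    """
    logits = beta * ((pi_chosen - ref_chosen) - (pi_rejected - ref_rejected))
    # -log(sigmoid(logits)): small when the policy prefers the chosen response
    # more strongly (relative to the reference) than the rejected one.
    return -math.log(1.0 / (1.0 + math.exp(-logits)))
```

When the policy matches the reference exactly, the loss is ln 2; it decreases as the policy's margin for the chosen response grows.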
Responsible AI Considerations |
Fairness: | Models can over- or under-represent groups, erase representation of some groups, or reinforce stereotypes. |
|
Transparency: | The model may generate content that is inappropriate or offensive. |
|
Accountability: | Developers need to ensure the model complies with laws and regulations. |
|
Mitigation Strategies: | Use safety classifiers or implement custom safety solutions. |
|
|
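The mitigation strategy above (safety classifiers or custom safety solutions) can be sketched as a gate between the model and the caller. This is only an illustration of the pattern: the blocklist check stands in for a real safety classifier, and the `generate` callable, term list, and fallback message are all hypothetical, not part of this model card.

```python
# Placeholder terms standing in for a real safety classifier's policy.
BLOCKLIST = {"example-unsafe-term"}

def is_safe(text: str) -> bool:
    """Return False if the text contains any blocked term (illustrative check)."""
    lowered = text.lower()
    return not any(term in lowered for term in BLOCKLIST)

def guarded_reply(generate, prompt: str) -> str:
    """Call a (hypothetical) generation function and filter its output."""
    reply = generate(prompt)
    return reply if is_safe(reply) else "[response withheld by safety filter]"
```

In practice the `is_safe` check would be replaced by a dedicated safety classifier applied to both prompts and completions.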
Input Output |
Input Format: | |
Accepted Modalities: | |
Output Format: | Generated text in response to input |
|
Performance Tips: | On GPUs that do not support flash attention, call AutoModelForCausalLM.from_pretrained() with attn_implementation="eager". |
|
|
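The performance tip above can be sketched as a small helper that builds the keyword arguments for `AutoModelForCausalLM.from_pretrained()`. Only `attn_implementation="eager"` comes from the card; the helper name and the commented loading call are illustrative assumptions.

```python
def from_pretrained_kwargs(flash_attention_supported: bool) -> dict:
    """Build kwargs for AutoModelForCausalLM.from_pretrained().

    Only the attn_implementation="eager" fallback comes from the card's
    performance tip; this helper itself is an illustrative sketch.
    """
    kwargs: dict = {}
    if not flash_attention_supported:
        # Fall back to the unfused "eager" attention implementation.
        kwargs["attn_implementation"] = "eager"
    return kwargs

# Usage (model_id is a placeholder, not from this card):
# model = AutoModelForCausalLM.from_pretrained(
#     model_id, **from_pretrained_kwargs(flash_attention_supported=False))
```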
Release Notes |
Version: | |
Notes: | Improvements in long-context understanding, instruction following, and reasoning capability. |
|
|
|