Model Type:
Use Cases:
Areas:
Applications: English and coding language conversations
Primary Use Cases: Chat and dialogue generation
Limitations: May produce biased, inaccurate, or undesirable text; vulnerable to alignment-breaking attacks.
Considerations: Use guardrails to prevent potentially harmful outputs.

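The chat and dialogue use case above can be sketched as a minimal multi-turn loop that keeps the full message history. The `generate_reply` stub and the role/content message shape are illustrative assumptions, not the model's actual API:

```python
# Minimal multi-turn chat loop. `generate_reply` is a stand-in
# (illustrative assumption) for a real call into the model.
def generate_reply(messages):
    # A real implementation would condition on the full history;
    # here we just echo the latest user turn.
    return f"You said: {messages[-1]['content']}"

def chat_turn(history, user_text):
    """Append a user turn, generate a reply, return the updated history."""
    history = history + [{"role": "user", "content": user_text}]
    reply = generate_reply(history)
    return history + [{"role": "assistant", "content": reply}]

history = [{"role": "system", "content": "You are a helpful assistant."}]
history = chat_turn(history, "Hello!")
print(history[-1]["content"])  # → You said: Hello!
```

Keeping the history as an immutable list of role-tagged messages makes each turn easy to log and replay.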
Additional Notes: Optimized to run on a single H100-80GB GPU using a novel Neural Architecture Search (NAS) approach.

Supported Languages: English (proficient); coding languages (supported)
Training Details:
Data Sources: FineWeb, Buzz-V1.2, Dolma
Data Volume:
Methodology:
Context Length:
Training Time:
Model Architecture: Transformer decoder (auto-regressive language model)

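An auto-regressive decoder like the one listed above generates one token at a time, each step conditioned on the tokens produced so far. A minimal greedy-decoding sketch, with a toy bigram table standing in for the transformer's next-token prediction (the table and tokens are invented for illustration):

```python
# Greedy auto-regressive decoding. A toy bigram table stands in for
# the transformer's next-token distribution (illustrative assumption).
BIGRAM = {
    "<s>": "the",
    "the": "model",
    "model": "generates",
    "generates": "text",
    "text": "</s>",
}

def decode(start="<s>", max_tokens=10):
    tokens = [start]
    for _ in range(max_tokens):
        nxt = BIGRAM.get(tokens[-1], "</s>")  # most likely next token
        if nxt == "</s>":                      # stop at end-of-sequence
            break
        tokens.append(nxt)
    return tokens[1:]                          # drop the start token

print(" ".join(decode()))  # → the model generates text
```

The real model replaces the lookup with a forward pass over the whole prefix, which is why context length and throughput matter.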
Safety Evaluation:
Methodologies: Garak, AEGIS, Human Content Red Teaming
Risk Categories: Toxic language, unsafe content, societal biases
Ethical Considerations: Developers are responsible for ensuring the model meets the requirements of their industry and use case.

Responsible AI Considerations:
Fairness: The model was trained on data containing biases and may amplify them.
Accountability: Responsible AI development is a shared responsibility between the model developer and those who deploy it.
Mitigation Strategies: Deploy guardrails to prevent harmful outputs.

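One lightweight form of the guardrails recommended above is a post-generation filter that replaces outputs matching a denylist with a refusal. The denylist terms and refusal text here are illustrative assumptions; production deployments typically use a dedicated safety model rather than keyword matching:

```python
# Post-generation guardrail: refuse outputs containing denylisted terms.
# The denylist and refusal text are illustrative assumptions.
DENYLIST = {"credit card number", "home address"}
REFUSAL = "I can't share that."

def guard(output: str) -> str:
    """Return the model output unless it trips the denylist."""
    lowered = output.lower()
    if any(term in lowered for term in DENYLIST):
        return REFUSAL
    return output

print(guard("The capital of France is Paris."))     # passes through
print(guard("Here is my Credit Card Number: ..."))  # → I can't share that.
```

Running the check after generation keeps the guardrail independent of the model, so it can be updated without retraining.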
Input/Output:
Input Format:
Accepted Modalities:
Output Format:
Performance Tips: The NAS-derived architecture delivers high throughput and efficiency.