Model Type: decoder-only transformer, text generation

Use Cases
  Areas: research, commercial applications
  Applications: text generation, natural language understanding
  Primary Use Cases: translation, code generation
  Limitations: limited proficiency outside the supported languages
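
For the text-generation use cases listed above, a minimal usage sketch may help. It assumes the checkpoint is published on the Hugging Face Hub; "org/model-name" is a placeholder, not the model's actual repository ID.

```python
# A minimal sketch, assuming the checkpoint is published on the Hugging Face
# Hub; "org/model-name" is a hypothetical placeholder for the repository ID.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "org/model-name"  # placeholder; substitute the actual checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# A base model continues text rather than following instructions, so the
# prompt is a plain prefix (Finnish for "The capital of Finland is").
prompt = "Suomen pääkaupunki on"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=30, do_sample=True, top_p=0.9)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```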

Additional Notes: Base model; requires further fine-tuning for specific use cases (see the sketch below).
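
Since this is a base checkpoint, downstream users will typically fine-tune it. Below is a minimal sketch using the transformers Trainer API; the repository ID, the wikitext stand-in corpus, and all hyperparameters are illustrative assumptions, not recommendations from the model's authors.

```python
# A minimal fine-tuning sketch with the transformers Trainer API. The
# repository ID, the wikitext stand-in corpus, and the hyperparameters are
# illustrative assumptions, not the authors' recommendations.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

model_id = "org/model-name"  # placeholder repository ID
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token  # GPT-like tokenizers often lack a pad token

# Any text corpus works; wikitext is only a stand-in here.
data = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
data = data.map(lambda batch: tokenizer(batch["text"], truncation=True, max_length=512),
                batched=True, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="finetuned",
                           per_device_train_batch_size=2,
                           num_train_epochs=1),
    train_dataset=data,
    # mlm=False selects the causal language-modeling objective
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```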

Supported Languages: fi (fluent), en (fluent), da (fluent), sv (fluent), no (fluent), nn (fluent), is (fluent)

Training Details
  Data Sources: cerebras/SlimPajama-627B, bigcode/starcoderdata, mc4
  Data Volume:
  Context Length:
  Training Time:
  Hardware Used:
  Model Architecture: GPT-like with rotary positional embeddings and flash attention
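
The architecture field above names rotary positional embeddings (RoPE) and flash attention. Flash attention is an IO-aware exact attention kernel and is not shown here; below is a minimal sketch of the RoPE rotation itself, under assumed names, shapes, and base value, not the model's actual implementation.

```python
# A minimal sketch of rotary positional embeddings (RoPE). Illustrative only:
# the function name, shapes, and base=10000.0 are assumptions, not the
# model's actual implementation.
import torch

def apply_rope(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Rotate query/key vectors; x has shape (seq_len, dim), dim even."""
    seq_len, dim = x.shape
    # Per-pair frequencies: theta_i = base^(-2i/dim)
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    # Rotation angle for every (position, frequency) pair
    angles = torch.outer(torch.arange(seq_len, dtype=torch.float32), inv_freq)
    cos, sin = angles.cos(), angles.sin()
    x1, x2 = x[:, 0::2], x[:, 1::2]     # consecutive channel pairs
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin  # 2-D rotation of each pair
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Queries and keys are rotated before attention, so their dot products
# depend only on relative positions.
q = torch.randn(16, 64)
print(apply_rope(q).shape)  # torch.Size([16, 64])
```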

Responsible AI Considerations
  Fairness: May produce outputs that are inaccurate, prejudiced, or controversial due to biases in its training data.
  Mitigation Strategies: Users should perform additional evaluation and customization for their use case before deployment.

Input/Output
  Input Format: text
  Accepted Modalities: text
  Output Format: generated text

Release Notes
  Version:
  Date:
  Notes: Initial model release; trained on only part of the full training data.
|
|
|