| Field | Details |
|---|---|
| Model Type | decoder-only transformer |

Use Cases

| Field | Details |
|---|---|
| Areas | research, commercial applications |
| Limitations | No meaningful proficiency in languages other than English, Finnish, Swedish, Norwegian, Danish, Icelandic, and code. |
| Considerations | The model is only partially trained, and ethical considerations should guide its application. |

Supported Languages

| Language | Proficiency |
|---|---|
| Finnish (fi) | fluent |
| English (en) | fluent |
| Danish (da) | fluent |
| Swedish (sv) | fluent |
| Norwegian (no) | fluent |
| Norwegian Nynorsk (nn) | fluent |
| Icelandic (is) | fluent |
|
Training Details

| Field | Details |
|---|---|
| Data Sources | cerebras/SlimPajama-627B, bigcode/starcoderdata, mc4 |
| Data Volume | up to 2000B (2T) tokens, per the checkpoint branches listed under Release Notes |
| Methodology | LLaMA-like decoder-only GPT architecture with rotary positional embeddings and flash attention (see the sketch after this table) |
| Context Length | |
| Hardware Used | |
| Model Architecture | |
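
The Methodology row mentions rotary positional embeddings (RoPE). As a quick illustration of what that means, below is a minimal, self-contained sketch of RoPE applied to one attention head's query matrix. It is illustrative only and makes no claim about Viking's actual implementation; the head dimension and the frequency base of 10000 are assumptions.

```python
import torch

def rotary_embed(x: torch.Tensor, base: float = 10000.0) -> torch.Tensor:
    """Apply rotary positional embeddings to x of shape (seq_len, dim); dim must be even."""
    seq_len, dim = x.shape
    # Per-pair rotation frequencies: theta_i = base^(-2i/dim)
    inv_freq = base ** (-torch.arange(0, dim, 2, dtype=torch.float32) / dim)
    # Rotation angle for each (position, frequency) pair: shape (seq_len, dim/2).
    angles = torch.arange(seq_len, dtype=torch.float32)[:, None] * inv_freq[None, :]
    cos, sin = angles.cos(), angles.sin()
    # Split channels into 2-D pairs and rotate each pair by its position-dependent angle.
    x1, x2 = x[:, 0::2], x[:, 1::2]
    out = torch.empty_like(x)
    out[:, 0::2] = x1 * cos - x2 * sin
    out[:, 1::2] = x1 * sin + x2 * cos
    return out

# Example: rotate query vectors for an 8-token sequence with an assumed head dim of 64.
q = torch.randn(8, 64)
q_rot = rotary_embed(q)
```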
|
Safety Evaluation

| Field | Details |
|---|---|
| Ethical Considerations | Viking may produce outputs that can be considered inaccurate, prejudiced, or controversial. Users should exercise discretion. |
|
|
Responsible AI Considerations

| Field | Details |
|---|---|
| Mitigation Strategies | Outputs should be reviewed before use in downstream applications, given the risks noted under Safety Evaluation. |
|
|
Input Output

| Field | Details |
|---|---|
| Modality | text only: the model takes text input and generates text output (see the usage sketch below) |
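
To make the text-in/text-out interface concrete, here is a minimal generation sketch using the Hugging Face transformers API. The repository id `LumiOpen/Viking-7B` is an assumption used for illustration; substitute the actual model repository.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "LumiOpen/Viking-7B"  # assumed repository id, for illustration only
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
# Optionally pass attn_implementation="flash_attention_2" above if the
# flash-attn package is installed, matching the flash attention noted
# under Training Details.

# Finnish prompt: "The capital of Finland is"
inputs = tokenizer("Suomen pääkaupunki on", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
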
Release Notes

| Field | Details |
|---|---|
| Notes | Training checkpoints are available as branches in the repository, e.g., 100B, 200B, up to 2000B tokens. |
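
Because checkpoints are published as git branches, a specific intermediate checkpoint can be selected with the `revision` argument of `from_pretrained`, which transformers uses to pick a branch or tag. The repository id below is again an assumed placeholder.

```python
from transformers import AutoModelForCausalLM

# Load the checkpoint captured after 1000B training tokens; the branch name
# follows the pattern described in the Release Notes above.
model = AutoModelForCausalLM.from_pretrained(
    "LumiOpen/Viking-7B",  # assumed repository id, for illustration only
    revision="1000B",      # git branch holding the intermediate checkpoint
)
```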
|
|
|