Model Type | Decoder-only transformer language model |
|
Use Cases |
Areas: | Research, Evaluation of Large Language Models in Nordic languages |
|
Limitations: | Bias and safety limitations; possibly inaccurate or irrelevant content; limited generation diversity; potential for generating offensive or inappropriate content |
|
Considerations: | Data diversity is a concern, and a feedback mechanism for affected individuals is needed. |
|
|
Supported Languages | Danish (da), Swedish (sv), Norwegian (no), English (en), Icelandic (is); proficiency level: fluent |
|
Training Details |
Data Sources: | Books from Litteraturbanken, The Pile, Articles from Diva, The Pile: PubMed, The Pile: ArXiv, Code from Code Parrot: Github, Pushshift.io Reddit dataset, English Math dataset, Swedish Math dataset, Summarization data, OPUS, Movie scripts, Natural Instructions, P3, The Norwegian Colossal Corpus, Danish Gigaword, Icelandic Gigaword, The Pile: Stack Exchange, Web Common Crawl, MC4, OSCAR, Open Web Text, Miscellaneous public Swedish websites, Familjeliv Articles, Public Swedish Job Ads, Wikipedia |
|
Data Volume: | |
Methodology: | Pretrained using a causal language modeling objective |
|
Model Architecture: | |
|
Responsible AI Considerations |
Fairness: | The model has limitations regarding bias and safety. |
|
Transparency: | Communication and transparency around usage is encouraged. |
|
Mitigation Strategies: | Controlled pre-release; feedback collection from Nordic NLP ecosystem. |
|
|
Release Notes | |