GPT Sw3 356M by AI-Sweden-Models


Tags: autotrain-compatible, endpoints-compatible, gpt2, pytorch, safetensors, region:us
Languages: da, en, is, no, sv

GPT Sw3 356M Benchmarks

GPT Sw3 356M (AI-Sweden-Models/gpt-sw3-356m)

GPT Sw3 356M Parameters and Internals

Model Type 
Decoder-only transformer language model
Use Cases 
Areas:
Research, Evaluation of Large Language Models in Nordic languages
Limitations:
Bias and safety limitations, Possible content inaccuracies and irrelevance, Generation diversity issues, Potential for generating offensive, inappropriate content
Considerations:
Data diversity is a concern, and a feedback mechanism for affected individuals is required.
Supported Languages 
da, sv, no, en, is (fluent)
Training Details 
Data Sources:
Books from Litteraturbanken, The Pile, Articles from Diva, The Pile: PubMed, The Pile: ArXiv, Code from Code Parrot: Github, Pushshift.io Reddit dataset, English Math dataset, Swedish Math dataset, Summarization data, OPUS, Movie scripts, Natural Instructions, P3, The Norwegian Colossal Corpus, Danish Gigaword, Icelandic Gigaword, The Pile: Stack Exchange, Web Common Crawl, MC4, OSCAR, Open Web Text, Miscellaneous public Swedish websites, Familjeliv Articles, Public Swedish Job Ads, Wikipedia
Data Volume:
1.1TB UTF-8 encoded text
Methodology:
Pretrained using a causal language modeling objective
Model Architecture:
NeMo Megatron GPT
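The causal language modeling objective mentioned above can be illustrated with a minimal, self-contained sketch. Everything here (the toy token ids, vocabulary size, and probabilities) is made up for illustration and does not come from the actual model:

```python
import math

# Causal LM objective: at each position, predict the NEXT token from the
# tokens seen so far. The loss is the average negative log-probability
# the model assigns to the true next token.
tokens = [3, 1, 4, 2]  # a tiny "document" as token ids (hypothetical)

# Hypothetical model outputs: one probability distribution over a
# 5-token vocabulary per position (each row sums to 1).
predicted_probs = [
    [0.10, 0.60, 0.10, 0.10, 0.10],  # after tokens[:1]; true next id is 1
    [0.05, 0.05, 0.05, 0.05, 0.80],  # after tokens[:2]; true next id is 4
    [0.20, 0.20, 0.40, 0.10, 0.10],  # after tokens[:3]; true next id is 2
]

# Targets are simply the input sequence shifted one position left.
targets = tokens[1:]
loss = -sum(math.log(p[t]) for p, t in zip(predicted_probs, targets)) / len(targets)
print(round(loss, 4))  # -> 0.5501
```

During pretraining this cross-entropy is minimized over the 1.1TB corpus; at inference time the same next-token distribution is sampled from to generate text.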
Responsible AI Considerations 
Fairness:
The model has limitations regarding bias and safety.
Transparency:
Communication and transparency around usage is encouraged.
Mitigation Strategies:
Controlled pre-release; feedback collection from Nordic NLP ecosystem.
Release Notes 
Version:
Second generation
Date:
2022-12-20
LLM Name: GPT Sw3 356M
Repository: https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m
Model Size: 356m
Required VRAM: 1.6 GB
Updated: 2025-02-16
Maintainer: AI-Sweden-Models
Model Type: gpt2
Model Files: 1.6 GB
Supported Languages: da sv no en is
Model Architecture: GPT2LMHeadModel
License: other
Transformers Version: 4.25.0.dev0
Tokenizer Class: GPTSw3Tokenizer
Vocabulary Size: 64000
Torch Data Type: float32
Activation Function: gelu
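The 1.6 GB figure is consistent with the float32 data type listed above: a back-of-the-envelope check, using the nominal (approximate) 356M parameter count:

```python
# Rough weight-memory estimate (illustrative arithmetic only; the true
# parameter count is somewhat above the nominal 356M, which is why the
# actual file is 1.6 GB rather than ~1.42 GB).
n_params = 356_000_000          # nominal parameter count from the model name
bytes_per_param = 4             # float32, per the Torch Data Type above
weight_gb = n_params * bytes_per_param / 1e9  # decimal GB
print(f"{weight_gb:.2f} GB")    # -> 1.42 GB
```

Loading in float16 would roughly halve this footprint, at 2 bytes per parameter.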

Best Alternatives to GPT Sw3 356M

Best Alternatives | Context / RAM | Downloads | Likes
GPT Sw3 356M Instruct | 0K / 1.6 GB | 1669 | 0
Note: a green score (e.g. "73.2") means that the model is better than AI-Sweden-Models/gpt-sw3-356m.


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227