GPT Sw3 20B by AI-Sweden-Models

Tags: Autotrain compatible, Da, En, Endpoints compatible, Gpt2, Is, No, Pytorch, Region:us, Safetensors, Sharded, Sv, Tensorflow

GPT Sw3 20B Parameters and Internals

Model Type 
decoder-only, transformer, language model
Use Cases 
Areas:
research, evaluation of LLM capabilities
Applications:
Nordic NLP ecosystem validation and testing
Primary Use Cases:
language generation tasks
Limitations:
bias, safety, generation diversity, hallucination
Additional Notes 
GPT-SW3 follows a modified RAIL license, with a focus on communication and transparency regarding LLMs.
Supported Languages 
da (advanced), sv (advanced), no (advanced), en (advanced), is (advanced)
Training Details 
Data Sources:
Books, Articles, Code, Conversational, Math, Miscellaneous, Web Sources
Data Volume:
320B tokens
Methodology:
NeMo Megatron GPT implementation
Model Architecture:
decoder-only transformer
Release Notes 
Version:
Second Generation
Date:
2022-12-20
Notes:
Release of second generation of GPT-SW3 model.
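
Since the primary use case listed above is language generation, a minimal sketch of prompting the model may help. It assumes the Hugging Face `transformers`, `torch`, and `sentencepiece` packages are installed, that you have access to the repository (it may require accepting the license terms first), and that the machine has enough memory for the full checkpoint. The helper names and the sampling defaults are illustrative and do not come from the listing.

```python
# Repository ID taken from the listing's Hugging Face URL.
MODEL_ID = "AI-Sweden-Models/gpt-sw3-20b"


def build_generation_kwargs(max_new_tokens=50, temperature=0.7, top_p=0.9):
    """Collect sampling settings (hypothetical helper; defaults are illustrative)."""
    if max_new_tokens <= 0:
        raise ValueError("max_new_tokens must be positive")
    return {
        "max_new_tokens": max_new_tokens,
        "do_sample": True,
        "temperature": temperature,
        "top_p": top_p,
    }


def generate_text(prompt, **overrides):
    """Load the model and continue `prompt`; downloads the checkpoint on first use."""
    # Imported lazily so the helper above works without the heavy dependencies.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    # Tokenize the prompt, sample a continuation, and decode it back to text.
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, **build_generation_kwargs(**overrides))
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_text("Träd är fina för att", max_new_tokens=30)` would then return the prompt followed by a sampled continuation. Because this is a base model rather than an instruction-tuned one, plain text continuation prompts are the natural fit.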
LLM Name: GPT Sw3 20B
Repository: https://huggingface.co/AI-Sweden-Models/gpt-sw3-20b
Model Size: 20b
Required VRAM: 83.2 GB
Updated: 2025-02-05
Maintainer: AI-Sweden-Models
Model Type: gpt2
Model Files: 9.5 GB (1-of-9), 9.7 GB (2-of-9), 9.7 GB (3-of-9), 9.7 GB (4-of-9), 9.7 GB (5-of-9), 9.7 GB (6-of-9), 9.7 GB (7-of-9), 9.7 GB (8-of-9), 5.8 GB (9-of-9)
Supported Languages: da, sv, no, en, is
Model Architecture: GPT2LMHeadModel
License: other
Transformers Version: 4.25.0.dev0
Tokenizer Class: GPTSw3Tokenizer
Vocabulary Size: 64000
Torch Data Type: float32
Activation Function: gelu
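
The listing does not say how the Required VRAM figure is derived. One reading that is consistent with the numbers above is that it is simply the sum of one set of nine shard sizes, which in turn sits close to what 20 billion float32 parameters would occupy. The sketch below checks that arithmetic; treating "20b" as exactly 20e9 parameters is an assumption made only for this rough estimate.

```python
# Shard sizes in GB as listed under "Model Files" (one set of nine shards).
SHARD_SIZES_GB = [9.5] + [9.7] * 7 + [5.8]

# Assumption: "20b" is read as exactly 20e9 parameters for a rough estimate.
PARAM_COUNT = 20e9
BYTES_PER_FLOAT32 = 4  # the listed Torch data type is float32


def total_checkpoint_gb(shards):
    """Sum the shard sizes, rounded to the one decimal the listing uses."""
    return round(sum(shards), 1)


def estimated_weights_gb(params, bytes_per_param):
    """Raw weight storage in GB (10^9 bytes) for a given parameter count."""
    return params * bytes_per_param / 1e9


checkpoint_gb = total_checkpoint_gb(SHARD_SIZES_GB)
weights_gb = estimated_weights_gb(PARAM_COUNT, BYTES_PER_FLOAT32)

print(f"Sum of shards:         {checkpoint_gb} GB")
print(f"20e9 float32 weights:  {weights_gb} GB")
```

The shard total matches the listed 83.2 GB, and the float32 estimate lands a few GB below it. Actual memory use during inference would be higher than either figure once activations and framework overhead are included.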

Best Alternatives to GPT Sw3 20B

Best Alternatives   Context / RAM   Downloads, Likes
GPT Sw3 20B Instruct   0K / 83.2 GB   214312
InstructPalmyra 20B   0K / 40.6 GB   139040

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227