GPT Sw3 356M Instruct by AI-Sweden-Models


GPT Sw3 356M Instruct (AI-Sweden-Models/gpt-sw3-356m-instruct)

GPT Sw3 356M Instruct Parameters and Internals

Model Type 
Large, decoder-only, autoregressive transformer
Use Cases 
Areas:
Research; evaluation of the capabilities of large language models.
Primary Use Cases:
Validating the model and collecting feedback on the capabilities of large language models.
Limitations:
Bias, Safety, Generation diversity, Hallucination, Overrepresentation of some viewpoints, Discriminatory language, Inaccurate information generation, Repetitive outputs, Content appropriateness, Stereotyping
Considerations:
Awareness of risks and limitations; providing feedback mechanisms to users.
Supported Languages 
da (Danish), sv (Swedish), en (English), no (Norwegian), is (Icelandic)
Training Details 
Data Sources:
databricks/databricks-dolly-15k, laion/OIG, OpenAssistant/oasst1
Data Volume:
320B tokens
Methodology:
Pretrained with a causal language-modeling objective using the NeMo Megatron GPT implementation. The instruct variants were then finetuned on instruction data in both chat and raw-text formats (a sketch of the chat layout follows below).
Model Architecture:
Large decoder-only pretrained transformer language models.
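
For illustration, here is a minimal sketch of the chat-style prompt layout used by the GPT-SW3 instruct family. The exact delimiter tokens (<|endoftext|>, <s>) and the User:/Bot: role labels are assumptions drawn from the family's model cards; verify them against the repository before relying on them.

```python
# Sketch of the chat-style prompt layout used for instruct finetuning.
# The delimiter tokens (<|endoftext|>, <s>) and the "User:"/"Bot:" role
# labels are assumptions based on the GPT-SW3 instruct model cards.
prompt = (
    "<|endoftext|><s>\n"
    "User:\n"
    "Vad är huvudstaden i Sverige?\n"  # "What is the capital of Sweden?"
    "<s>\n"
    "Bot:\n"
)
# The model is expected to continue the text after "Bot:" with its reply.
```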
Safety Evaluation 
Risk Categories:
bias, safety
Ethical Considerations:
Potential for generating biased, incorrect, or harmful content; overrepresentation of some viewpoints.
Responsible Ai Considerations 
Fairness:
The model may reflect biases present in its training data, over- or under-representing certain viewpoints.
Transparency:
Released under a modified RAIL license, which is intended to increase communication and transparency around model use.
Mitigation Strategies:
Open communication is encouraged, including the collection of feedback from indirect users.
Input Output 
Accepted Modalities:
text
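
Since text is the only accepted modality, the whole input/output path can be exercised with the standard transformers text-generation pipeline. A minimal sketch follows; the prompt and sampling parameters are illustrative.

```python
from transformers import pipeline

# Text in, text out: a text-generation pipeline covers the full I/O path.
generator = pipeline(
    "text-generation",
    model="AI-Sweden-Models/gpt-sw3-356m-instruct",
)

out = generator("Träd är fina för att", max_new_tokens=50, do_sample=True)
print(out[0]["generated_text"])
```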
Release Notes 
Version:
Second generation
Date:
2022-12-20
LLM Name: GPT Sw3 356M Instruct
Repository: https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m-instruct
Base Model(s): GPT Sw3 356M (AI-Sweden-Models/gpt-sw3-356m)
Model Size: 356M
Required VRAM: 1.6 GB
Updated: 2025-01-20
Maintainer: AI-Sweden-Models
Model Type: gpt2
Instruction-Based: Yes
Model Files: 1.6 GB
Supported Languages: da, sv, en, no, is
Model Architecture: GPT2LMHeadModel
License: other
Model Max Length: 2048
Transformers Version: 4.22.1
Tokenizer Class: GPTSw3Tokenizer
Vocabulary Size: 64000
Torch Data Type: float32
Activation Function: gelu
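
The figures in this table can be sanity-checked after loading the checkpoint with the standard transformers auto classes. A minimal sketch, assuming AutoTokenizer resolves to GPTSw3Tokenizer (which additionally requires the sentencepiece package):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "AI-Sweden-Models/gpt-sw3-356m-instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float32)

# Cross-check the spec table above.
print(type(model).__name__)       # GPT2LMHeadModel
print(model.config.vocab_size)    # 64000
print(model.config.n_positions)   # 2048 (model max length)
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")  # ~356M; at 4 bytes per float32
                                            # weight, roughly the 1.6 GB file size
```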




Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227