GPT Sw3 356M Instruct by AI-Sweden-Models


GPT Sw3 356M Instruct (AI-Sweden-Models/gpt-sw3-356m-instruct)

GPT Sw3 356M Instruct Parameters and Internals

Model Type 
Large, decoder-only, autoregressive transformer
Use Cases 
Areas:
Research; evaluation of the capabilities of large language models.
Primary Use Cases:
Validating the model and collecting feedback on the capabilities of large language models.
Limitations:
Bias, Safety, Generation diversity, Hallucination, Overrepresentation of some viewpoints, Discriminatory language, Inaccurate information generation, Repetitive outputs, Content appropriateness, Stereotyping
Considerations:
Awareness of risks and limitations; providing feedback mechanisms to users.
Supported Languages 
da (Danish), sv (Swedish), en (English), no (Norwegian), is (Icelandic)
Training Details 
Data Sources:
databricks/databricks-dolly-15k, laion/OIG, OpenAssistant/oasst1
Data Volume:
320B tokens
Methodology:
Pretrained with a causal language-modeling objective using the NeMo Megatron GPT implementation. The instruct variants were then finetuned on instruction data in both chat and raw-text formats (a sketch of the chat layout follows below).
Model Architecture:
Large decoder-only pretrained transformer language models.
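
For illustration, here is a minimal sketch of the chat-style prompt layout used by the GPT-SW3 instruct family. The exact delimiter tokens (<|endoftext|>, <s>) and the User:/Bot: role labels are assumptions drawn from the family's model cards; verify them against the repository before relying on them.

```python
# Sketch of the chat-style prompt layout used for instruct finetuning.
# The delimiter tokens (<|endoftext|>, <s>) and the "User:"/"Bot:" role
# labels are assumptions based on the GPT-SW3 instruct model cards.
prompt = (
    "<|endoftext|><s>\n"
    "User:\n"
    "Vad är huvudstaden i Sverige?\n"  # "What is the capital of Sweden?"
    "<s>\n"
    "Bot:\n"
)
# The model is expected to continue the text after "Bot:" with its reply.
```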
Safety Evaluation 
Risk Categories:
bias, safety
Ethical Considerations:
Potential for generating biased, incorrect, or harmful content; overrepresentation of some viewpoints.
Responsible Ai Considerations 
Fairness:
The model may reflect biases present in its training data, over- or under-representing certain viewpoints.
Transparency:
Released under a modified RAIL license, which is intended to increase communication and transparency around model use.
Mitigation Strategies:
Open communication is encouraged, including the collection of feedback from indirect users.
Input Output 
Accepted Modalities:
text
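
Since text is the only accepted modality, the whole input/output path can be exercised with the standard transformers text-generation pipeline. A minimal sketch follows; the prompt and sampling parameters are illustrative.

```python
from transformers import pipeline

# Text in, text out: a text-generation pipeline covers the full I/O path.
generator = pipeline(
    "text-generation",
    model="AI-Sweden-Models/gpt-sw3-356m-instruct",
)

out = generator("Träd är fina för att", max_new_tokens=50, do_sample=True)
print(out[0]["generated_text"])
```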
Release Notes 
Version:
Second generation
Date:
2022-12-20
LLM Name: GPT Sw3 356M Instruct
Repository: https://huggingface.co/AI-Sweden-Models/gpt-sw3-356m-instruct
Base Model(s): GPT Sw3 356M (AI-Sweden-Models/gpt-sw3-356m)
Model Size: 356M
Required VRAM: 1.6 GB
Updated: 2025-01-20
Maintainer: AI-Sweden-Models
Model Type: gpt2
Instruction-Based: Yes
Model Files: 1.6 GB
Supported Languages: da, sv, en, no, is
Model Architecture: GPT2LMHeadModel
License: other
Model Max Length: 2048
Transformers Version: 4.22.1
Tokenizer Class: GPTSw3Tokenizer
Vocabulary Size: 64000
Torch Data Type: float32
Activation Function: gelu
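
The figures in this table can be sanity-checked after loading the checkpoint with the standard transformers auto classes. A minimal sketch, assuming AutoTokenizer resolves to GPTSw3Tokenizer (which additionally requires the sentencepiece package):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "AI-Sweden-Models/gpt-sw3-356m-instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.float32)

# Cross-check the spec table above.
print(type(model).__name__)       # GPT2LMHeadModel
print(model.config.vocab_size)    # 64000
print(model.config.n_positions)   # 2048 (model max length)
n_params = sum(p.numel() for p in model.parameters())
print(f"{n_params / 1e6:.0f}M parameters")  # ~356M; at 4 bytes per float32
                                            # weight, roughly the 1.6 GB file size
```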




Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227