Swallow 13B Instruct GPTQ by TheBloke

Tags: 4-bit, Autotrain compatible, Base model:quantized:tokyotech..., Base model:tokyotech-llm/swall..., En, Gptq, Instruct, Ja, Llama, Quantized, Region:us, Safetensors

Swallow 13B Instruct GPTQ Parameters and Internals

Model Type: llama, text-generation
Use Cases
  Areas: research, commercial applications
  Limitations: The models are still in early stages and might not align well with human intent or safety considerations.
Additional Notes: Trained primarily with additional Japanese language data to increase efficiency and representation.
Supported Languages: Japanese (High), English (High)
Training Details
  Data Sources: Japanese Wikipedia, RefinedWeb, Swallow Corpus, The Pile
  Methodology: Continual pre-training with supervised fine-tuning (SFT)
  Context Length: 4096
  Model Architecture: llama2
Input Output
  Accepted Modalities: text
  Output Format: text
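
The 4096-token context length listed above is shared between the prompt and the generated text. A minimal sketch of that budgeting follows; the helper name `max_new_tokens` and the example prompt lengths are hypothetical, and real token counts would have to come from the model's tokenizer.

```python
# Context length taken from the listing above.
CONTEXT_LENGTH = 4096


def max_new_tokens(prompt_tokens: int, context_length: int = CONTEXT_LENGTH) -> int:
    """Return how many tokens can still be generated after the prompt.

    Hypothetical helper: prompt and completion share one context window,
    so the generation budget is whatever the prompt leaves unused.
    """
    if prompt_tokens < 0:
        raise ValueError("prompt_tokens must be non-negative")
    return max(context_length - prompt_tokens, 0)


if __name__ == "__main__":
    # Illustrative prompt lengths, not measurements.
    for n in (512, 4000, 5000):
        print(n, max_new_tokens(n))
```

A prompt that already fills or exceeds the window leaves a budget of zero, so it would need to be truncated or shortened before generation.
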
LLM Name: Swallow 13B Instruct GPTQ
Repository: https://huggingface.co/TheBloke/Swallow-13B-Instruct-GPTQ
Model Name: Swallow 13B Instruct
Model Creator: tokyotech-llm
Base Model(s): Swallow 13B Instruct Hf (tokyotech-llm/Swallow-13b-instruct-hf)
Model Size: 13b
Required VRAM: 7.5 GB
Updated: 2025-03-13
Maintainer: TheBloke
Model Type: llama
Instruction-Based: Yes
Model Files: 7.5 GB
Supported Languages: en, ja
GPTQ Quantization: Yes
Quantization Type: gptq
Model Architecture: LlamaForCausalLM
License: llama2
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.35.2
Tokenizer Class: LlamaTokenizer
Beginning of Sentence Token: <s>
End of Sentence Token: </s>
Unk Token: <unk>
Vocabulary Size: 43176
Torch Data Type: bfloat16
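
The 7.5 GB figure can be sanity-checked with simple arithmetic on the "13b" size and the "4-bit" tag. The sketch below is a rough estimate under stated assumptions: the parameter count is taken as exactly 13 billion (the real count is not given in the listing), sizes are in decimal gigabytes, and the helper name `quantized_weight_gb` is hypothetical.

```python
def quantized_weight_gb(n_params: float, bits_per_weight: float) -> float:
    """Size in decimal GB of n_params weights stored at bits_per_weight bits each."""
    return n_params * bits_per_weight / 8 / 1e9


NOMINAL_PARAMS = 13e9   # assumption: "13b" read as exactly 13 billion parameters
QUANT_BITS = 4          # from the "4-bit" tag in the listing
LISTED_FILE_GB = 7.5    # "Model Files" value in the listing

if __name__ == "__main__":
    packed = quantized_weight_gb(NOMINAL_PARAMS, QUANT_BITS)
    half_precision = quantized_weight_gb(NOMINAL_PARAMS, 16)
    print(f"4-bit packed weights:  {packed:.1f} GB")
    print(f"16-bit weights:        {half_precision:.1f} GB")
    print(f"listed size - packed:  {LISTED_FILE_GB - packed:.1f} GB")
```

Under these assumptions the packed weights account for most of the listed size. The remainder is plausibly quantization metadata and tensors kept at higher precision, though the listing does not break this down. Actual GPU memory use at inference time will also be higher than the file size, since activations and the key-value cache need space too.
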

Best Alternatives to Swallow 13B Instruct GPTQ

Best Alternatives                   Context / RAM   Downloads / Likes
NexusRaven 13B GPTQ                 16K / 7.3 GB    1627
CodeLlama 13B Instruct GPTQ         16K / 7.3 GB    32039
...sianai 13B Chat Bilingual GPTQ   8K / 7.3 GB     1144
Leo Hessianai 13B Chat GPTQ         8K / 7.3 GB     1081
...lama2 13B Orca V2 8K 3166 GPTQ   8K / 7.3 GB     8525
Mythalion 13B GPTQ                  4K / 7.3 GB     26353
Pygmalion 2 13B GPTQ                4K / 7.3 GB     19342
...2 13B Ft Instruct Es Gptq 3bit   4K / 5.7 GB     213
Speechless Llama2 13B GPTQ          4K / 7.3 GB     582
LoKuS 13B GPTQ                      4K / 7.3 GB     232

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227