Swallow 70B GPTQ by TheBloke


Tags: 4-bit · autotrain compatible · base model (quantized): tokyotech-llm/Swallow-70b-hf · en · gptq · ja · llama · quantized · region:us · safetensors

Swallow 70B GPTQ Benchmarks

Scores (nn.n%) show how the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
Swallow 70B GPTQ (TheBloke/Swallow-70B-GPTQ)

Swallow 70B GPTQ Parameters and Internals

Model Type 
llama
Additional Notes 
The Swallow 70B model has been enhanced for Japanese with significant improvements in JCommonsenseQA, JEMHopQA, NIILC, JSQuAD, MGSM, and WMT20 benchmark tasks.
Supported Languages 
Japanese (high), English (high)
Training Details 
Data Sources:
Japanese Wikipedia, RefinedWeb, Swallow Corpus, The Pile
Methodology:
The Swallow model has undergone continual pre-training from the Llama 2 family, primarily with the addition of Japanese language data.
Model Architecture:
Refer to the Llama 2 technical report for details on the model architecture.
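One visible change from base Llama 2 is the tokenizer: the vocabulary size listed further down on this card (43,176) exceeds Llama 2's standard 32,000 tokens (the 32,000 figure is general knowledge about Llama 2, not stated on this card). A quick check of how many tokens Swallow adds on top of the base vocabulary:

```python
# Llama 2's tokenizer has a 32,000-token vocabulary (general knowledge,
# not stated on this card); Swallow's listed vocabulary size is 43,176.
llama2_vocab_size = 32_000
swallow_vocab_size = 43_176

added_tokens = swallow_vocab_size - llama2_vocab_size
print(added_tokens)  # 11176 new tokens, added mainly for Japanese text
```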
Input Output 
Input Format:
{prompt}
Accepted Modalities:
text
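The input format above is a bare `{prompt}` with no chat or instruction template. A minimal sketch of preparing model input from the fields on this card (the BOS token is taken from the tokenizer details below; the Japanese prompt text is a made-up example):

```python
# Sketch based on this card's fields: input format "{prompt}",
# BOS token "<s>". No chat/instruction template is applied.
BOS = "<s>"
PROMPT_TEMPLATE = "{prompt}"

def build_input(prompt: str, add_bos: bool = False) -> str:
    """Fill the bare prompt template; tokenizers usually prepend
    the BOS token themselves, so add_bos defaults to False."""
    text = PROMPT_TEMPLATE.format(prompt=prompt)
    return BOS + text if add_bos else text

print(build_input("日本の首都はどこですか？"))  # prints the prompt unchanged
```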
LLM Name: Swallow 70B GPTQ
Repository 🤗: https://huggingface.co/TheBloke/Swallow-70B-GPTQ
Model Name: Swallow 70B
Model Creator: tokyotech-llm
Base Model(s): Swallow 70B Hf (tokyotech-llm/Swallow-70b-hf)
Model Size: 70b
Required VRAM: 35.7 GB
Updated: 2025-02-22
Maintainer: TheBloke
Model Type: llama
Model Files: 35.7 GB
Supported Languages: en, ja
GPTQ Quantization: Yes
Quantization Type: gptq
Model Architecture: LlamaForCausalLM
License: llama2
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.37.0.dev0
Tokenizer Class: LlamaTokenizer
Beginning of Sentence Token: <s>
End of Sentence Token: </s>
Unk Token: <unk>
Vocabulary Size: 43176
Torch Data Type: bfloat16
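As a sanity check on the Required VRAM figure above, here is a rough back-of-the-envelope estimate of raw 4-bit weight storage for a 70B-parameter model (this ignores GPTQ group scales and zero-points and any layers kept at higher precision, which account for the remaining fraction of the 35.7 GB):

```python
# Rough size estimate for 4-bit GPTQ weights of a ~70B-parameter model.
params = 70e9        # approximate parameter count ("70b" per the card)
bits_per_weight = 4  # 4-bit GPTQ quantization

size_gb = params * bits_per_weight / 8 / 1e9  # bits -> bytes -> GB
print(f"~{size_gb:.1f} GB")  # ~35.0 GB, close to the listed 35.7 GB
```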

Best Alternatives to Swallow 70B GPTQ

Best Alternatives | Context / RAM | Downloads / Likes
Llama3 ChatQA 2 70B | 128K / 141.2 GB | 459
Note: a green score (e.g. "73.2") means that the model is better than TheBloke/Swallow-70B-GPTQ.

Rank the Swallow 70B GPTQ Capabilities

🆘 Have you tried this model? Rate its performance. This feedback would greatly assist the ML community in identifying the most suitable models for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  

What open-source LLMs or SLMs are you in search of? 43,470 are indexed in total.

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227