Swallow 13B Instruct AWQ by TheBloke


Tags: 4-bit, AWQ, quantized, safetensors, llama, instruct, en, ja, autotrain compatible, region:us, base model: tokyotech-llm/Swallow-13b-instruct-hf

Swallow 13B Instruct AWQ Benchmarks

nn.n% — how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
Swallow 13B Instruct AWQ (TheBloke/Swallow-13B-Instruct-AWQ)

Swallow 13B Instruct AWQ Parameters and Internals

Model Type 
text generation, bilingual
Use Cases 
Areas:
Research, Commercial applications
Applications:
Japanese and English text generation, Instruction following
Primary Use Cases:
Text generation, Bilingual interaction
Limitations:
The model is at an early stage of development and has not been tuned for alignment with human intent or safety.
Considerations:
Follow ethical and responsible AI guidelines when using this model.
Additional Notes 
Quantized with the AWQ method for faster inference while largely preserving output quality (see the loading sketch after this section).
Supported Languages 
Japanese (high), English (medium)
Training Details 
Data Sources:
Japanese Wikipedia, RefinedWeb, Swallow Corpus, The Pile, Anthropic HH-RLHF, Databricks Dolly 15k, OpenAssistant Conversations Dataset
Methodology:
Supervised fine-tuning (SFT)
Context Length:
4096
Hardware Used:
Hardware provided by Massed Compute
Model Architecture:
Transformer-based architecture; details follow the Llama 2 technical report.
Input Output 
Input Format:
Japanese or English text instructions
Accepted Modalities:
text
Output Format:
Text generation based on prompts
Performance Tips:
Choose an appropriate quantization level and tune sampling parameters (e.g., temperature, top-p) for better results; see the sketch below.
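A minimal loading and generation sketch, assuming a CUDA GPU with roughly 8 GB of free VRAM and that transformers (>= 4.35), accelerate, and autoawq are installed. The example instruction and sampling values are illustrative, and the Japanese Alpaca-style prompt template is assumed from the upstream Swallow instruct card; verify both against the repository's model card before use.

```python
# Sketch: load the AWQ-quantized checkpoint and generate a short reply.
# Assumes transformers >= 4.35, accelerate, and autoawq are installed,
# and a GPU with ~8 GB of free VRAM (the table below lists 7.5 GB).
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "TheBloke/Swallow-13B-Instruct-AWQ"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")

# Alpaca-style Japanese instruction template (assumed from the upstream
# Swallow instruct card; check the model card for the exact format).
prompt = (
    "以下に、あるタスクを説明する指示があります。"
    "リクエストを適切に完了するための回答を記述してください。\n\n"
    "### 指示:\n東京工業大学の主なキャンパスについて教えてください。\n\n"
    "### 応答:\n"
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,     # sampling parameters worth tuning per the tips above
    temperature=0.7,
    top_p=0.95,
)
# Print only the newly generated tokens, not the echoed prompt.
print(tokenizer.decode(output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
```

Lowering the temperature or disabling sampling entirely gives more deterministic answers for factual instructions.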
LLM Name: Swallow 13B Instruct AWQ
Repository: https://huggingface.co/TheBloke/Swallow-13B-Instruct-AWQ
Model Name: Swallow 13B Instruct
Model Creator: tokyotech-llm
Base Model(s): Swallow 13B Instruct Hf (tokyotech-llm/Swallow-13b-instruct-hf)
Model Size: 13b
Required VRAM: 7.5 GB
Updated: 2025-02-05
Maintainer: TheBloke
Model Type: llama
Instruction-Based: Yes
Model Files: 7.5 GB
Supported Languages: en, ja
AWQ Quantization: Yes
Quantization Type: awq
Model Architecture: LlamaForCausalLM
License: llama2
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.35.2
Tokenizer Class: LlamaTokenizer
Beginning of Sentence Token: <s>
End of Sentence Token: </s>
Unk Token: <unk>
Vocabulary Size: 43176
Torch Data Type: float16
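Because both the context length and model max length are 4096 tokens, it can help to check prompt length with the listed LlamaTokenizer before generating. A small sketch, assuming an illustrative 256-token budget reserved for the reply (the helper fits_in_context is hypothetical):

```python
# Sketch: verify that a prompt fits the 4096-token window, reserving room
# for the reply. The 256-token reservation is an arbitrary example value.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TheBloke/Swallow-13B-Instruct-AWQ")

MAX_CONTEXT = 4096          # "Context Length" / "Model Max Length" above
RESERVED_FOR_REPLY = 256    # illustrative budget for max_new_tokens

def fits_in_context(prompt: str) -> bool:
    n_tokens = len(tokenizer(prompt)["input_ids"])
    return n_tokens + RESERVED_FOR_REPLY <= MAX_CONTEXT

print(fits_in_context("こんにちは、自己紹介をしてください。"))  # True
```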

Best Alternatives to Swallow 13B Instruct AWQ

Best Alternatives | Context / RAM | Downloads | Likes
NexusRaven 13B AWQ | 16K / 7.2 GB | 11 | 4
CodeLlama 13B Instruct AWQ | 16K / 7.2 GB | 57 | 9
...ma 13B Instruct Hf W4 G128 AWQ | 16K / 7.2 GB | 15 | 0
Meta Llama 3 13B Instruct AWQ | 8K / 8.8 GB | 5 | 0
...ssianai 13B Chat Bilingual AWQ | 8K / 7.2 GB | 9 | 1
Leo Hessianai 13B Chat AWQ | 8K / 7.2 GB | 9 | 0
Mythalion 13B AWQ | 4K / 7.2 GB | 3268 | 10
Pygmalion 2 13B AWQ | 4K / 7.2 GB | 317 | 6
Speechless Llama2 13B AWQ | 4K / 7.2 GB | 7 | 1
LoKuS 13B AWQ | 4K / 7.2 GB | 5 | 1
Note: a green score (e.g., "73.2") means that the model is better than TheBloke/Swallow-13B-Instruct-AWQ.

Rank the Swallow 13B Instruct AWQ Capabilities

🆘 Have you tried this model? Rate its performance. This feedback would greatly assist the ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  


Original data from Hugging Face, OpenCompass, and various public Git repositories.
Release v20241227