ELYZA Japanese Llama 2 13B Fast Instruct 4bit Quantized by SushiTokyo


Tags: 4-bit, 4bit, Autotrain compatible, Endpoints compatible, GPTQ, Instruct, Llama, Quantized, Region: us, Safetensors, Sharded, TensorFlow

ELYZA Japanese Llama 2 13B Fast Instruct 4bit Quantized Benchmarks

Benchmark scores (nn.n%) indicate how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), and GPT-4 ("gpt4").
ELYZA Japanese Llama 2 13B Fast Instruct 4bit Quantized (SushiTokyo/ELYZA-japanese-Llama-2-13b-fast-instruct-4bit-quantized)

ELYZA Japanese Llama 2 13B Fast Instruct 4bit Quantized Parameters and Internals

Model Type: text generation, instruction-following
Additional Notes: The model is quantized to 4-bit for faster inference. The original model was slow to respond even on an RTX 4090, but the quantized version produces output in roughly 10 seconds. Accuracy has not been measured. See quantize.py in the repository for the quantization code; a hypothetical sketch is shown below.
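Since quantize.py itself is not reproduced here, the following is a minimal sketch of what a 4-bit GPTQ quantization pass over the original ELYZA checkpoint could look like using the auto-gptq library. The base repository name, calibration text, and GPTQ settings are assumptions; the actual quantize.py may differ.

```python
# Hypothetical sketch of a 4-bit GPTQ quantization pass (not the repo's
# actual quantize.py): quantizes the original ELYZA checkpoint with auto-gptq.
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

base = "elyza/ELYZA-japanese-Llama-2-13b-fast-instruct"  # assumed base model

tokenizer = AutoTokenizer.from_pretrained(base)
quantize_config = BaseQuantizeConfig(bits=4, group_size=128, desc_act=False)

# Load the fp16 model with the quantization config attached.
model = AutoGPTQForCausalLM.from_pretrained(base, quantize_config)

# GPTQ needs calibration samples; a real script would use a larger,
# representative Japanese corpus rather than a single sentence.
examples = [tokenizer("日本の観光名所を教えてください。", return_tensors="pt")]
model.quantize(examples)

model.save_quantized("ELYZA-japanese-Llama-2-13b-fast-instruct-4bit-quantized",
                     use_safetensors=True)
```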
LLM Name: ELYZA Japanese Llama 2 13B Fast Instruct 4bit Quantized
Repository 🤗: https://huggingface.co/SushiTokyo/ELYZA-japanese-Llama-2-13b-fast-instruct-4bit-quantized
Model Size: 13b
Required VRAM: 7.8 GB
Updated: 2025-02-22
Maintainer: SushiTokyo
Model Type: llama
Instruction-Based: Yes
Model Files: 5.0 GB (shard 1 of 2), 2.8 GB (shard 2 of 2)
Quantization Type: 4bit
Model Architecture: LlamaForCausalLM
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.40.1
Tokenizer Class: LlamaTokenizer
Padding Token: </s>
Vocabulary Size: 44581
Torch Data Type: float16
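For illustration, a minimal inference sketch follows. It is not taken from the model card: it assumes the GPTQ weights load through Transformers' AutoModelForCausalLM (which requires a GPTQ backend such as optimum + auto-gptq to be installed) and uses the Llama 2 [INST] prompt format documented for ELYZA's upstream instruct models.

```python
# Minimal inference sketch (assumption, not from the model card): loading the
# 4-bit GPTQ weights through Transformers with a GPTQ backend installed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "SushiTokyo/ELYZA-japanese-Llama-2-13b-fast-instruct-4bit-quantized"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(repo, device_map="auto")

# Llama 2 [INST] prompt format used by ELYZA's upstream instruct models.
prompt = (
    "[INST] <<SYS>>\nあなたは誠実で優秀な日本人のアシスタントです。\n<</SYS>>\n\n"
    "日本の首都はどこですか? [/INST]"
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    output_ids = model.generate(**inputs, max_new_tokens=128)

# Decode only the newly generated tokens.
print(tokenizer.decode(output_ids[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```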

Best Alternatives to ELYZA Japanese Llama 2 13B Fast Instruct 4bit Quantized

Best Alternatives | Context / RAM | Downloads / Likes
CodeLlama 13B Instruct Fp16 | 16K / 26 GB | 348428
...Llama 13B Instruct Hf 4bit MLX | 16K / 7.8 GB | 1002
...13B Instruct Nf4 Fp16 Upscaled | 16K / 26 GB | 1300
Model 007 13b V2 | 4K / 26 GB | 264
...igogne2 Enno 13B Sft Lora 4bit | 4K / 26 GB | 7860
Xwin LM 13B V0.2 EXL2 | 4K / 5.2 GB | 203
Mythalion 13B 2.30bpw H4 EXL2 | 4K / 4.1 GB | 133
...lion Kimiko V2 6.05bpw H8 EXL2 | 4K / 10.1 GB | 71
Law LLM 13B 4.0bpw H6 EXL2 | 2K / 6.8 GB | 51
Finance LLM 13B 6.0bpw H6 EXL2 | 2K / 10 GB | 41


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227