Llama 2 7B NuGPTQ by smpanaro


Tags: Arxiv:2210.17323 | Arxiv:2306.07629 | Autotrain compatible | Base model: meta-llama/Llama-2-7b-hf (quantized) | Custom code | Dataset: wikitext | Endpoints compatible | Llama | Region: us | Safetensors


Llama 2 7B NuGPTQ Parameters and Internals

Model Type: LLM compression
Additional Notes: NuGPTQ combines GPTQ, SqueezeLLM, and output scaling into a single whole-tensor LLM compression method. The model is fake-quantized: weights are stored in float16 but take on only a limited number of unique values.
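
To make the fake-quantization claim concrete, the sketch below counts the unique float16 values in one weight tensor; a fake-quantized tensor should show far fewer than an uncompressed one. The local file name and tensor key are hypothetical placeholders, not taken from the repository.

```python
# Rough check of "fake quantization": count unique float16 values in a tensor.
# Assumptions (not from the repo): a safetensors shard has already been
# downloaded locally, and the file name / tensor key below are placeholders.
import torch
from safetensors.torch import load_file

state = load_file("model.safetensors")          # hypothetical local shard path
key = "model.layers.0.mlp.down_proj.weight"     # illustrative tensor key
w = state[key]

n_unique = torch.unique(w).numel()
print(f"{key}: shape={tuple(w.shape)}, dtype={w.dtype}, unique values={n_unique}")
# A fake-quantized tensor keeps only a small set of distinct values, versus the
# millions typical of an uncompressed float16 weight matrix.
```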
LLM Name: Llama 2 7B NuGPTQ
Repository: https://huggingface.co/smpanaro/Llama-2-7b-NuGPTQ
Base Model(s): Llama 2 7B Hf (meta-llama/Llama-2-7b-hf)
Model Size: 7b
Required VRAM: 13.5 GB
Updated: 2025-02-22
Maintainer: smpanaro
Model Type: llama
Model Files: 13.5 GB
Model Architecture: LLamaNuGPTQForCausalLM
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.38.2
Tokenizer Class: LlamaTokenizer
Beginning of Sentence Token: <s>
End of Sentence Token: </s>
Unk Token: <unk>
Vocabulary Size: 32000
Torch Data Type: float16
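
Because the checkpoint ships a custom model class (LLamaNuGPTQForCausalLM, see the Custom code tag), loading it through transformers requires trust_remote_code=True. A minimal loading sketch, assuming the standard transformers and torch APIs and the repo id listed above:

```python
# Minimal loading sketch (assumes transformers >= 4.38 and enough memory for
# the ~13.5 GB of float16 weights). trust_remote_code=True is needed because
# the repository defines a custom architecture.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "smpanaro/Llama-2-7b-NuGPTQ"
tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,
    trust_remote_code=True,   # required for the custom NuGPTQ model class
)

inputs = tokenizer("The capital of France is", return_tensors="pt")
with torch.no_grad():
    output = model.generate(**inputs, max_new_tokens=16)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Note that because the quantization is simulated rather than packed into lower-bit storage, loading keeps the full 13.5 GB of float16 weights in memory, consistent with the Required VRAM figure above.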


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227