Llama 3 70B Instruct Gradient 262K AWQ By starsy: Benchmarks, Features and Detailed Analysis. Insights on Llama 3 70B Instruct Gradient 262K AWQ.

Arxiv:2305.14233 Arxiv:2309.00071 Arxiv:2310.05209 Arxiv:2402.08268 4-bit Autotrain compatible Awq Conversational En Endpoints compatible Instruct Llama Llama-3 Meta Quantized Region:us Safetensors Sharded Tensorflow

Model Card on HF 🤗: https://huggingface.co/starsy/Llama-3-70B-Instruct-Gradient-262k-AWQ

Llama 3 70B Instruct Gradient 262K AWQ Benchmarks

ARC: 66.81 vs 96.7 (so35)^-30.9%

HellaSwag: 85.46 vs 95.3 (gpt4)^-10.3%

MMLU: 76.37 vs 88.3 (so35)^-13.5%

TruthfulQA: 53.73 vs 59 (gpt4)^-8.9%

WinoGrande: 82.64 vs 87.5 (gpt4)^-5.6%

GSM8K: 78.85 vs 96.4 (so35)^-18.2%

LLME Score: 0.1768

^nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").

What is the LLM Explorer Rank (Score)

Llama 3 70B Instruct Gradient 262K AWQ (starsy/Llama-3-70B-Instruct-Gradient-262k-AWQ)

Llama 3 70B Instruct Gradient 262K AWQ Parameters and Internals

Model Type

Text Generation

Use Cases

Areas:

Commercial, Research

Primary Use Cases:

Chat Assistant

Limitations:

Inappropriate outside stated language use, Potential bias and inaccuracies

Considerations:

Fine-tuning may be required for other languages, ensure compliance with laws

Additional Notes

Model trained on a static offline dataset, future versions might evolve with community feedback.

Supported Languages

en (High)

Training Details

Data Sources:

SlimPajama, UltraChat

Data Volume:

15 trillion tokens pretraining; 105M tokens stage training; 188M tokens all stages

Methodology:

SFT, RLHF

Context Length:

262000

Hardware Used:

NVIDIA L40S, Meta's Research SuperCluster

Model Architecture:

Auto-regressive language model using optimized transformer architecture

Responsible Ai Considerations

Transparency:

Outlined in Responsible Use Guide

Accountability:

Developers are accountable and should implement safety measures

Mitigation Strategies:

Purple Llama solutions and Llama Guard for input/output safety filtering

Input Output

Input Format:

Text

Accepted Modalities:

Text

Output Format:

Text, code

Release Notes

Version:

April 18, 2024

Notes:

Initial release

LLM Name	Llama 3 70B Instruct Gradient 262K AWQ
Repository 🤗	https://huggingface.co/starsy/Llama-3-70B-Instruct-Gradient-262k-AWQ
Base Model(s)	...a 3 70B Instruct Gradient 524K gradientai/Llama-3-70B-Instruct-Gradient-524k
Model Size	70b
Required VRAM	39.9 GB
Updated	2025-05-31
Maintainer	starsy
Model Type	llama
Instruction-Based	Yes
Model Files	5.0 GB: 1-of-9 4.9 GB: 2-of-9 4.9 GB: 3-of-9 4.9 GB: 4-of-9 4.9 GB: 5-of-9 4.9 GB: 6-of-9 4.9 GB: 7-of-9 3.4 GB: 8-of-9 2.1 GB: 9-of-9
Supported Languages	en
AWQ Quantization	Yes
Quantization Type	awq
Model Architecture	LlamaForCausalLM
License	llama3
Context Length	262144
Model Max Length	262144
Transformers Version	4.40.2
Tokenizer Class	PreTrainedTokenizerFast
Vocabulary Size	128256
Torch Data Type	float16

Best Alternatives to Llama 3 70B Instruct Gradient 262K AWQ

Best Alternatives	Context / RAM	Downloads	Likes
...0B Instruct Gradient 1048K AWQ	1024K / 39.9 GB	18	1
Llama 3.3 70B Instruct AWQ	128K / 39.9 GB	93409	5
Llama 3.3 70B Instruct AWQ	128K / 39.9 GB	43621	32
...lama 3.3 70B Instruct AWQ INT4	128K / 39.9 GB	6980	24
... SauerkrautLM 70B Instruct AWQ	128K / 39.9 GB	14	4
Llama 3 70B Instruct AWQ	8K / 39.9 GB	23220	68
...ama 3 70B Instruct AWQ Smashed	8K / 39.9 GB	3245	9
...Typhoon V1.5x 70B Instruct AWQ	8K / 39.9 GB	293	2
Meta Llama 3 70B Instruct AWQ	8K / 39.9 GB	21	1
Llama 3 70B Instruct AWQ	8K / 39.9 GB	12	1

Note: green Score (e.g. "73.2") means that the model is better than starsy/Llama-3-70B-Instruct-Gradient-262k-AWQ.

Rank the Llama 3 70B Instruct Gradient 262K AWQ Capabilities

🆘 Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation
Factuality and Completeness of Knowledge
Censorship and Alignment
Data Analysis and Insight Generation
Text Generation
Text Summarization and Feature Extraction
Code Generation
Multi-Language Support and Translation

What open-source LLMs or SLMs are you in search of? 47753 in total.

Email us: info@extractum.io. Our Privacy Policy | Terms and Conditions | Suggest an improvement.

Our Social Media →

Original data from HuggingFace, OpenCompass and various public git repos.

Release v20241227

Support LLM Explorer

Llama 3 70B Instruct Gradient 262K AWQ by starsy

» All LLMs » starsy » Llama 3 70B Instruct Gradient 262K AWQ URL Share it on

Llama 3 70B Instruct Gradient 262K AWQ Benchmarks

Llama 3 70B Instruct Gradient 262K AWQ Parameters and Internals

Best Alternatives to Llama 3 70B Instruct Gradient 262K AWQ

Rank the Llama 3 70B Instruct Gradient 262K AWQ Capabilities

What open-source LLMs or SLMs are you in search of? 47753 in total.