Bagel DPO 8x7b V0.2 by jondurbin: Benchmarks, Features and Detailed Analysis

Tags: Autotrain compatible, Conversational, Endpoints compatible, Mixtral, Moe, Region:us, Safetensors, Sharded, Tensorflow

Datasets: ai2 arc, allenai/ultrafeedback ..., boolq, cais/mmlu, cakiki/rosetta-code, codeparrot/apps, datasets/winogrande, drop, facebook/belebele, intel/orca dpo pairs, jondurbin/airoboros-3...., jondurbin/cinematika-v..., jondurbin/truthy-dpo-v..., julielab/emobank, kingbri/pippa-sharegpt, ldjnr/capybara, lmsys/lmsys-chat-1m, migtissera/synthia-v1...., muennighoff/natural-in..., nvidia/helpsteer, open-orca/slimorca, openbookqa, piqa, spider, squad v2, squish42/bluemoon-fand..., tiger-lab/mathinstruct, unalignment/toxic-dpo-..., vezora/tested-22k-pyth...

Bagel DPO 8x7b V0.2 Benchmarks

Benchmark	Score	Reference score	Reference model	Difference
ARC	72.1	96.7	so35	-25.4%
HellaSwag	86.41	95.3	gpt4	-9.3%
MMLU	70.27	88.3	so35	-20.4%
TruthfulQA	72.83	59	gpt4	+23.4%
WinoGrande	83.27	87.5	gpt4	-4.8%
GSM8K	50.04	96.4	so35	-48.1%

LLME Score: 0.21165

Difference: how the model's score compares with the reference model's score, in percent. The reference models are Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
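The percentage listed for each benchmark is consistent with a plain relative difference against the reference score. The sketch below assumes that formula (the page does not state it explicitly); the function name `relative_difference` is illustrative, and the score pairs are copied from the benchmark figures above.

```python
def relative_difference(score: float, reference: float) -> float:
    """Percentage by which `score` is above (+) or below (-) `reference`."""
    return (score - reference) / reference * 100.0


# (model score, reference score) pairs copied from the benchmark figures.
benchmarks = {
    "ARC": (72.1, 96.7),
    "HellaSwag": (86.41, 95.3),
    "MMLU": (70.27, 88.3),
    "TruthfulQA": (72.83, 59.0),
    "WinoGrande": (83.27, 87.5),
    "GSM8K": (50.04, 96.4),
}

for name, (score, reference) in benchmarks.items():
    print(f"{name}: {relative_difference(score, reference):+.1f}%")
```

Under this assumption, TruthfulQA is the only benchmark where the model scores above its reference.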


Bagel DPO 8x7b V0.2 Parameters and Internals

LLM Name	Bagel DPO 8x7b V0.2
Repository 🤗	https://huggingface.co/jondurbin/bagel-dpo-8x7b-v0.2
Model Size	46.7b
Required VRAM	93.8 GB
Updated	2024-09-28
Maintainer	jondurbin
Model Type	mixtral
Model Files	4.0 GB: 1-of-24 4.0 GB: 2-of-24 4.0 GB: 3-of-24 3.9 GB: 4-of-24 4.0 GB: 5-of-24 3.9 GB: 6-of-24 4.0 GB: 7-of-24 4.0 GB: 8-of-24 3.9 GB: 9-of-24 4.0 GB: 10-of-24 4.0 GB: 11-of-24 3.9 GB: 12-of-24 4.0 GB: 13-of-24 4.0 GB: 14-of-24 3.9 GB: 15-of-24 4.0 GB: 16-of-24 3.9 GB: 17-of-24 4.0 GB: 18-of-24 4.0 GB: 19-of-24 3.9 GB: 20-of-24 4.0 GB: 21-of-24 4.0 GB: 22-of-24 3.9 GB: 23-of-24 2.6 GB: 24-of-24
Model Architecture	MixtralForCausalLM
License	apache-2.0
Context Length	32768
Model Max Length	32768
Transformers Version	4.37.0.dev0
Tokenizer Class	LlamaTokenizer
Vocabulary Size	32000
Torch Data Type	bfloat16
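As a sanity check on the table, the listed shard sizes add up to the Required VRAM figure, and that figure is close to what 46.7b parameters need at 2 bytes each in bfloat16. The sketch below assumes decimal gigabytes and counts weights only; it ignores activations and the KV cache, so it is a rough lower bound and not the site's documented method.

```python
# Shard sizes in GB, copied in order from the "Model Files" row (24 shards).
shard_sizes_gb = [
    4.0, 4.0, 4.0, 3.9, 4.0, 3.9, 4.0, 4.0,
    3.9, 4.0, 4.0, 3.9, 4.0, 4.0, 3.9, 4.0,
    3.9, 4.0, 4.0, 3.9, 4.0, 4.0, 3.9, 2.6,
]
total_gb = round(sum(shard_sizes_gb), 1)

# Weights-only estimate: bfloat16 stores each parameter in 2 bytes.
# Assumption: "46.7b" means 46.7e9 parameters and 1 GB = 1e9 bytes.
params = 46.7e9
bytes_per_param = 2
bf16_estimate_gb = round(params * bytes_per_param / 1e9, 1)

print(f"Sum of shards: {total_gb} GB")
print(f"bf16 weights estimate: {bf16_estimate_gb} GB")
```

The small gap between the two numbers is expected, since the parameter count is rounded and the shard files also carry metadata.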


Quantized Models of the Bagel DPO 8x7b V0.2

Model	Likes	Downloads	VRAM
Bagel DPO 8x7b V0.2 GGUF	6	67	17 GB
Bagel DPO 8x7b V0.2 GPTQ	2	10	23 GB
Bagel DPO 8x7b V0.2 AWQ	2	10	24 GB
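One rough way to compare the quantized variants is the average number of bits per parameter implied by their VRAM figures. The sketch below assumes the VRAM column approximates the weight size in decimal gigabytes and reuses the 46.7b parameter count from the table above; the helper name `implied_bits_per_param` is illustrative, and the results are ballpark values, not the actual quantization settings of those repositories.

```python
PARAMS = 46.7e9  # assumption: "46.7b" read as 46.7e9 parameters


def implied_bits_per_param(size_gb: float, params: float = PARAMS) -> float:
    """Average bits per parameter if `size_gb` (decimal GB) held only weights."""
    return size_gb * 1e9 * 8 / params


# VRAM figures in GB, copied from the quantized models table.
quantized_vram_gb = {"GGUF": 17, "GPTQ": 23, "AWQ": 24}

for name, size_gb in quantized_vram_gb.items():
    print(f"{name}: ~{implied_bits_per_param(size_gb):.1f} bits per parameter")
```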

Best Alternatives to Bagel DPO 8x7b V0.2

Best Alternatives	Context / RAM	Downloads	Likes
Mixtral 8x7B Instruct V0.1	32K / 93.6 GB	651236	3978
Nous Hermes 2 Mixtral 8x7B DPO	32K / 93.6 GB	6955	412
Mixtral 8x7B V0.1	32K / 93.6 GB	63310	1628
...rkrautLM Mixtral 8x7B Instruct	32K / 93.6 GB	54250	21
GritLM 8x7B KTO	32K / 93.6 GB	2766	3
...enbuddy Mixtral 7bx8 V18.1 32K	32K / 93.7 GB	4514	14
Smaug Mixtral V0.1	32K / 187.7 GB	2856	12
Dolphin 2.5 Mixtral 8x7b	32K / 93.6 GB	4316	1201
Merge Mixtral Prometheus 8x7B	32K / 91.9 GB	8	2
XLAM 8x7b R	32K / 93.6 GB	254	9




Original data from HuggingFace, OpenCompass and various public git repos.

Release v2024072803
