Dolphin 2.9.2 Mixtral 8x22b by cognitivecomputations



Dolphin 2.9.2 Mixtral 8x22b Benchmarks

Benchmark scores (shown as nn.n%) indicate how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
Dolphin 2.9.2 Mixtral 8x22b (cognitivecomputations/dolphin-2.9.2-mixtral-8x22b)

Dolphin 2.9.2 Mixtral 8x22b Parameters and Internals

Additional Notes 
Dolphin 2.9.2 has a variety of instruction-following, conversational, and coding skills. It also has initial agentic abilities and supports function calling.
Supported Languages 
en (English)
Training Details 
Data Sources:
GPT4, cognitivecomputations/Dolphin-2.9.2, cognitivecomputations/SystemChat-2.0, teknium/OpenHermes-2.5, m-a-p/CodeFeedback-Filtered-Instruction, cognitivecomputations/dolphin-coder, cognitivecomputations/samantha-data, HuggingFaceH4/ultrachat_200k, microsoft/orca-math-word-problems-200k, abacusai/SystemChat-1.1, Locutusque/function-calling-chatml, internlm/Agent-FLAN
Methodology:
Full fine-tuning (FFT) on 50% of the parameters, using the ChatML prompt template. The base model has a 64k context window; fine-tuning used a 16k sequence length. A ChatML sketch follows this section.
Context Length:
64000
Training Time:
1 week on 8xH100
Hardware Used:
Crusoe Cloud 8xH100 node
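
As referenced above, here is a minimal sketch of the ChatML prompt format the fine-tune was trained on. The system and user messages are illustrative placeholders, not taken from the training data.

# Minimal sketch of the ChatML template: each turn is wrapped in
# <|im_start|>/<|im_end|> markers. Message contents are placeholders.
def to_chatml(messages):
    """Render a list of {'role': ..., 'content': ...} dicts into ChatML text."""
    turns = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # A trailing open assistant turn cues the model to generate its reply.
    return "\n".join(turns) + "\n<|im_start|>assistant\n"

print(to_chatml([
    {"role": "system", "content": "You are Dolphin, a helpful AI assistant."},
    {"role": "user", "content": "Summarize mixture-of-experts routing in one sentence."},
]))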
LLM Name: Dolphin 2.9.2 Mixtral 8x22b
Repository: https://huggingface.co/cognitivecomputations/dolphin-2.9.2-mixtral-8x22b
Base Model(s): mistral-community/Mixtral-8x22B-v0.1
Model Size: 140.6b
Required VRAM: 207.2 GB
Updated: 2025-02-15
Maintainer: cognitivecomputations
Model Type: mixtral
Model Files: 59 safetensors shards; listed sizes: shard 1: 5.0 GB; shards 2-23: 4.8 GB each; shard 24: 4.9 GB; shards 25-26: 5.0 GB each; shard 27: 4.9 GB; shards 28-43: 4.8 GB each (shards 44-59 not listed in the source data)
Supported Languages: en
Model Architecture: MixtralForCausalLM
License: apache-2.0
Context Length: 65536
Model Max Length: 65536
Transformers Version: 4.40.2
Vocabulary Size: 32002
Torch Data Type: bfloat16
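
Given the repository, architecture, and dtype listed above, a minimal loading sketch with Hugging Face transformers. It assumes the pinned transformers version (4.40.2) or newer, roughly 207 GB of GPU memory across devices, and that the ChatML chat template ships with the tokenizer.

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "cognitivecomputations/dolphin-2.9.2-mixtral-8x22b"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # matches the listed torch data type
    device_map="auto",           # shard the ~207 GB of weights across available GPUs
)

# apply_chat_template renders the tokenizer's built-in (ChatML) template.
inputs = tokenizer.apply_chat_template(
    [{"role": "user", "content": "Hello, Dolphin!"}],
    add_generation_prompt=True,
    return_tensors="pt",
).to(model.device)
out = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))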

Best Alternatives to Dolphin 2.9.2 Mixtral 8x22b

Best Alternatives | Context / RAM | Downloads | Likes
Mixtral 8x22B Instruct V0.1 | 64K / 221.4 GB | 148016 | 713
Zephyr Orpo 141B A35b V0.1 | 64K / 207.2 GB | 616 | 265
Mixtral 8x22B V0.1 | 64K / 212 GB | 4399 | 674
WizardLM 2 8x22B | 64K / 216.8 GB | 7157 | 397
Mixtral 8x22B V0.1 | 64K / 221.6 GB | 8327 | 210
Mixtral 8x22B V0.3 | 64K / 221.4 GB | 52 | 3
XLAM 8x22b R | 64K / 211.8 GB | 2591 | 44
...ixtral 8x22B Instruct V0.1 FP8 | 64K / 140.9 GB | 948 | 2
...igHuggyD Grey WizardLM 2 8x22B | 64K / 216.6 GB | 18 | 4
WizardLM 2 8x22B Beige | 64K / 221.4 GB | 30 | 3


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227