FuseO1 DeepSeekR1 QwQ SkyT1 32B Preview by FuseAI


  Arxiv:2401.10491   Arxiv:2408.07990   Arxiv:2412.03187   Qwen2   Region:us   Safetensors   Sharded   Tensorflow

FuseO1 DeepSeekR1 QwQ SkyT1 32B Preview Benchmarks

nn.n% shows how the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
FuseO1 DeepSeekR1 QwQ SkyT1 32B Preview (FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview)

FuseO1 DeepSeekR1 QwQ SkyT1 32B Preview Parameters and Internals

LLM Name: FuseO1 DeepSeekR1 QwQ SkyT1 32B Preview
Repository: https://huggingface.co/FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview
Model Size: 32b
Required VRAM: 65.8 GB
Updated: 2025-03-12
Maintainer: FuseAI
Model Type: qwen2
Model Files: 5.0 GB: 1-of-14   5.0 GB: 2-of-14   4.9 GB: 3-of-14   4.9 GB: 4-of-14   4.9 GB: 5-of-14   4.9 GB: 6-of-14   4.9 GB: 7-of-14   4.9 GB: 8-of-14   4.9 GB: 9-of-14   4.9 GB: 10-of-14   4.9 GB: 11-of-14   4.9 GB: 12-of-14   4.9 GB: 13-of-14   1.9 GB: 14-of-14
Model Architecture: Qwen2ForCausalLM
License: apache-2.0
Context Length: 131072
Model Max Length: 131072
Transformers Version: 4.43.1
Tokenizer Class: LlamaTokenizerFast
Beginning of Sentence Token: <|begin▁of▁sentence|>
End of Sentence Token: <|end▁of▁sentence|>
Vocabulary Size: 152064
Torch Data Type: bfloat16
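
As a quick sanity check on the figures above, the listed shard sizes can be summed and compared with the Required VRAM value. This is only a sketch: the sizes are the rounded values from the Model Files entry, and the variable names are illustrative.

```python
# Shard sizes in GB, copied from the "Model Files" entry (14 shards).
# Shards 1-2 are 5.0 GB, shards 3-13 are 4.9 GB, shard 14 is 1.9 GB.
shard_sizes_gb = [5.0, 5.0] + [4.9] * 11 + [1.9]

total_gb = sum(shard_sizes_gb)
print(f"{len(shard_sizes_gb)} shards, {total_gb:.1f} GB total")
# prints "14 shards, 65.8 GB total"
```

The total matches the 65.8 GB listed as Required VRAM, which suggests that figure reflects the size of the bfloat16 weights on disk rather than a measured runtime footprint (activations and KV cache would come on top).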

Quantized Models of the FuseO1 DeepSeekR1 QwQ SkyT1 32B Preview

Model   Likes   Downloads   VRAM
...ekR1 QwQ SkyT1 32B Preview AWQ   698419 GB
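
For a rough sense of why a quantized variant needs far less memory, weight storage can be estimated from parameter count and bits per weight. This is a back-of-envelope sketch under two assumptions not stated in the listing: a nominal 32e9 parameters (from the "32b" size label) and 4 bits per weight for AWQ. It ignores quantization scales, any layers left unquantized, activations, and KV cache.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in decimal GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

# Assumption: nominal parameter count taken from the "32b" label, not an exact count.
N_PARAMS = 32e9

bf16_gb = weight_memory_gb(N_PARAMS, 16)  # bfloat16, as in "Torch Data Type"
int4_gb = weight_memory_gb(N_PARAMS, 4)   # assumption: AWQ at 4 bits per weight

print(f"bfloat16: ~{bf16_gb:.0f} GB, 4-bit: ~{int4_gb:.0f} GB")
# prints "bfloat16: ~64 GB, 4-bit: ~16 GB"
```

The bfloat16 estimate lands close to the 65.8 GB listed for the full model; the gap is expected because the real parameter count is somewhat above the nominal 32e9.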

Best Alternatives to FuseO1 DeepSeekR1 QwQ SkyT1 32B Preview

Best Alternatives   Context / RAM   Downloads   Likes
Openbuddy Qwq 32B V24.2 200K   195K / 65.8 GB   823
Openbuddy Qwq 32B V24.1 200K   195K / 65.8 GB   843
...y Qwen2.5coder 32B V24.1q 200K   195K / 65.8 GB   142
QwQ 32B   128K / 65.8 GB   1692341971
DeepSeek R1 Distill Qwen 32B   128K / 65.7 GB   15291401248
TinyR1 32B Preview   128K / 65.6 GB   5213317
Qwen2.5 32B   128K / 65.5 GB   100224123
RomboUltima 32B   128K / 20.7 GB   1542
...k R1 Distill Qwen 32B Japanese   128K / 65.8 GB   7197244
Ultiima 32B   128K / 65.8 GB   3025
Note: green Score (e.g. "73.2") means that the model is better than FuseAI/FuseO1-DeepSeekR1-QwQ-SkyT1-32B-Preview.


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227