Qwen2idae 16x14B V1.0 by hywu


Arxiv: 1902.00751, 2212.05055, 2305.14314, 2401.02731 · Datasets: ise-uiuc/Magicoder-Evol-Instruct-110K, ise-uiuc/Magicoder-OSS-Instruct-75K, meta-math/MetaMathQA, Open-Orca/SlimOrca · Tags: autotrain compatible, conversational, custom code, endpoints compatible, instruct, moe, qwen2idae, region:us, safetensors, sharded, tensorflow · Language: en


Qwen2idae 16x14B V1.0 Parameters and Internals

Model Type: text generation
Additional Notes: Part of the Parameter-Efficient Sparsity Crafting project, built in collaboration with Serp-ai.
Supported Languages: English
Training Details:
  Data Sources: Open-Orca/SlimOrca, ise-uiuc/Magicoder-OSS-Instruct-75K, ise-uiuc/Magicoder-Evol-Instruct-110K, meta-math/MetaMathQA
  Methodology: Parameter-Efficient Sparsity Crafting, instruction tuning, MoE structure, QLoRA, adapter techniques, efficient sparse upcycling
  Training Framework: unsloth support
  Model Architecture: Parameter-Efficient Sparsity Crafting from a dense model to a Mixture-of-Experts (a sketch of the idea follows below)
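The "dense to Mixture-of-Experts" crafting referenced above (arXiv:2401.02731) upcycles a trained dense FFN into several experts that share the frozen dense weights and differ only in small trainable adapters, plus a trainable top-k router. A minimal PyTorch sketch of that idea follows; the expert count, adapter rank, and top_k are illustrative assumptions, not the repository's actual implementation.

# Minimal sketch of parameter-efficient sparse upcycling.
# All names, the adapter rank, and top_k are assumptions, not
# the actual hywu/Qwen2idae code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class AdapterExpert(nn.Module):
    """Shared frozen dense FFN plus a small trainable low-rank adapter."""
    def __init__(self, shared_ffn: nn.Module, d_model: int, rank: int = 16):
        super().__init__()
        self.ffn = shared_ffn
        self.down = nn.Linear(d_model, rank, bias=False)  # trainable
        self.up = nn.Linear(rank, d_model, bias=False)    # trainable
        nn.init.zeros_(self.up.weight)  # adapter starts as a no-op delta

    def forward(self, x):
        return self.ffn(x) + self.up(self.down(x))

class SparseUpcycledFFN(nn.Module):
    """Replaces one dense FFN with top-k routing over adapter experts."""
    def __init__(self, dense_ffn: nn.Module, d_model: int,
                 n_experts: int = 16, top_k: int = 2):
        super().__init__()
        for p in dense_ffn.parameters():
            p.requires_grad_(False)  # the dense weights stay frozen
        self.experts = nn.ModuleList(
            AdapterExpert(dense_ffn, d_model) for _ in range(n_experts))
        self.router = nn.Linear(d_model, n_experts, bias=False)  # trainable
        self.top_k = top_k

    def forward(self, x):  # x: (n_tokens, d_model)
        gates = F.softmax(self.router(x), dim=-1)
        weights, idx = gates.topk(self.top_k, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for slot in range(self.top_k):
            for e, expert in enumerate(self.experts):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot, None] * expert(x[mask])
        return out

Because only the adapters and the router are new, this is consistent with the card's numbers below: a "16x14B" crafted model totals 17.5B parameters, far less than 16 independent 14B experts would require.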
Input Output:
  Input Format: tokenized input with special tokens for context separation
  Accepted Modalities: text
  Output Format: text
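Given the fields on this card (custom code, bfloat16, Qwen2Tokenizer), loading and generation with Hugging Face transformers would look roughly like the sketch below. The plain-string prompt is an assumption; the exact special-token template for context separation is not documented here.

# Generation sketch; trust_remote_code=True is needed because the
# repo ships custom modeling code (the "custom code" tag above).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "hywu/Qwen2idae-16x14B-v1.0"
tokenizer = AutoTokenizer.from_pretrained(repo, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # matches the card's torch data type
    device_map="auto",
    trust_remote_code=True,
)

inputs = tokenizer("Write a Python function that reverses a string.",
                   return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(output[0], skip_special_tokens=True))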
Release Notes:
  Version: v1.0
  Date: 2024-03-12
  Notes: Qwen2idae-16x14B-v1.0 released.
LLM Name: Qwen2idae 16x14B V1.0
Repository 🤗: https://huggingface.co/hywu/Qwen2idae-16x14B-v1.0
Model Size: 17.5B
Required VRAM: 35.1 GB
Updated: 2024-12-21
Maintainer: hywu
Model Type: qwen2idae
Instruction-Based: Yes
Model Files: 5.0 GB (1-of-8), 4.9 GB (2-of-8), 4.9 GB (3-of-8), 5.0 GB (4-of-8), 5.0 GB (5-of-8), 4.9 GB (6-of-8), 3.8 GB (7-of-8), 1.6 GB (8-of-8)
Supported Languages: en
Model Architecture: Qwen2ForCausalLM
License: apache-2.0
Context Length: 32768
Model Max Length: 32768
Transformers Version: 4.37.2
Tokenizer Class: Qwen2Tokenizer
Padding Token: <|endoftext|>
Vocabulary Size: 152064
Torch Data Type: bfloat16
Errors: replace
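A quick consistency check on the table: bfloat16 stores 2 bytes per weight, so 17.5B parameters come to about 35 GB, matching both the 35.1 GB "Required VRAM" figure and the shard-size total (activations and the KV cache at the 32768-token context need additional memory on top).

# Sanity check: bf16 weight memory = params * 2 bytes.
params = 17.5e9                       # "Model Size: 17.5B" from the table
print(f"{params * 2 / 1e9:.1f} GB")   # -> 35.0 GB, vs. 35.1 GB listed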


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217