Jetmoe 8B by jetmoe


Tags: Arxiv:2404.07413, Autotrain compatible, Endpoints compatible, Jetmoe, Region:us, Safetensors, Sharded, Tensorflow
Model Card on HF 🤗: https://huggingface.co/jetmoe/jetmoe-8b


Jetmoe 8B Parameters and Internals

Model Type: text generation, multimodal

Use Cases
  Primary Use Cases: Language model applications
  Limitations: Limited finetuning due to hardware constraints

Training Details
  Data Sources: RefinedWeb, Pile, GitHub
  Data Volume: 1.25T tokens
  Methodology: Two-phase training method
  Model Architecture: 24 blocks with two MoE layers each

Release Notes
  Version: 8B
  Notes: Initial release
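
The architecture entry above (24 blocks with two MoE layers each) can be illustrated with a minimal sketch of a sparse mixture-of-experts layer. The listing does not say how experts are selected, so the top-k routing, the number of experts, and the toy expert functions below are all assumptions made for illustration, not the model's actual implementation.

```python
import math

# From the listing: 24 blocks, two MoE layers per block.
NUM_BLOCKS = 24
MOE_LAYERS_PER_BLOCK = 2
total_moe_layers = NUM_BLOCKS * MOE_LAYERS_PER_BLOCK  # 48


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]


def top_k_route(router_logits, k):
    """Pick the k highest-scoring experts and normalize their weights.

    Assumption: top-k routing; the listing does not specify the router.
    """
    order = sorted(range(len(router_logits)),
                   key=lambda i: router_logits[i], reverse=True)
    top = order[:k]
    weights = softmax([router_logits[i] for i in top])
    return list(zip(top, weights))


def moe_layer(x, experts, router_logits, k):
    """Weighted sum of the outputs of the selected experts only."""
    return sum(w * experts[i](x) for i, w in top_k_route(router_logits, k))


# Hypothetical toy experts: each one just scales its input.
experts = [lambda x, s=s: s * x for s in (1.0, 2.0, 3.0, 4.0)]
y = moe_layer(10.0, experts, router_logits=[0.0, 5.0, 5.0, -1.0], k=2)
```

Only the selected experts are evaluated for a given input, which is why a sparse MoE model can hold more parameters than it uses per token.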
LLM Name: Jetmoe 8B
Repository 🤗: https://huggingface.co/jetmoe/jetmoe-8b
Model Size: 8b
Required VRAM: 17 GB
Updated: 2025-02-22
Maintainer: jetmoe
Model Type: jetmoe
Model Files: 4.9 GB (1-of-4), 4.9 GB (2-of-4), 4.9 GB (3-of-4), 2.3 GB (4-of-4)
Model Architecture: JetMoEForCausalLM
License: apache-2.0
Model Max Length: 4096
Tokenizer Class: LlamaTokenizer
Padding Token: </s>
Vocabulary Size: 32000
Activation Function: silu
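
As a quick consistency check on the figures above, the four shard sizes can be summed and compared with the listed VRAM requirement. The sketch below also derives a rough parameter count; the 2 bytes per parameter (16-bit weights) is an assumption, since the listing does not state the storage precision.

```python
# Shard sizes in GB, as listed under "Model Files".
shard_sizes_gb = [4.9, 4.9, 4.9, 2.3]

# Total checkpoint size, rounded to one decimal place.
total_gb = round(sum(shard_sizes_gb), 1)  # 17.0, matching "Required VRAM: 17 GB"

# Assumption: weights stored at 16-bit precision (2 bytes per parameter).
BYTES_PER_PARAM = 2
approx_params_billion = total_gb / BYTES_PER_PARAM  # 8.5

print(f"total checkpoint size: {total_gb} GB")
print(f"approx. parameters at 16-bit: {approx_params_billion}B")
```

Under that assumption the result is roughly in line with the "8b" model size, and the listed VRAM figure appears to cover the weights alone, without activations or the KV cache.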

Best Alternatives to Jetmoe 8B

Model             Context / RAM   Downloads, Likes
Jetmoe 8B Sft     0K / 17 GB      7246
Jetmoe 8B Chat    0K / 17 GB      9628

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227