Mamba 790M by Q-bert


Tags: Arxiv:2312.00752 · Autotrain compatible · Custom code · En · Endpoints compatible · Mamba · Mamba-hf · PyTorch · Region: US
Model Card on HF 🤗: https://huggingface.co/Q-bert/Mamba-790M

Mamba 790M Benchmarks

Benchmark scores are reported as percentages relative to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o"), and GPT-4 ("gpt4").

Mamba 790M Parameters and Internals

Model Type: Causal LM
Additional Notes: You must use the MambaTrainer class for training and set fp16 to False.
Supported Languages: en (English)
LLM Name: Mamba 790M
Repository 🤗: https://huggingface.co/Q-bert/Mamba-790M
Model Size: 790M
Required VRAM: 3.2 GB
Updated: 2025-03-12
Maintainer: Q-bert
Model Type: mamba
Model Files: 3.2 GB
Supported Languages: en
Model Architecture: MambaForCausalLM
License: apache-2.0
Transformers Version: 4.35.2
Is Biased: 0
Tokenizer Class: GPTNeoXTokenizer
Vocabulary Size: 50280
Torch Data Type: float32
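The Required VRAM figure is consistent with the parameter count and storage dtype listed above: 790M float32 parameters at 4 bytes each come to roughly 3.2 GB of weights. A quick sanity check (plain arithmetic, no model download required):

```python
# Weight-memory estimate: parameter count x bytes per parameter.
# The card lists 790M parameters stored as float32 (4 bytes each).
params = 790_000_000
bytes_per_param = 4  # torch.float32

weights_gb = params * bytes_per_param / 1e9  # decimal gigabytes
print(f"{weights_gb:.1f} GB")  # → 3.2 GB, matching the Required VRAM / Model Files figures
```

Treat this as a floor: actual inference needs some headroom beyond the raw weights for activations and the model's recurrent state.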

Best Alternatives to Mamba 790M

Best Alternatives              Context / RAM   Downloads   Likes
Mamba 790M Hf                  0K / 3.2 GB     2688        3
Mamba Char Japanese 790M       0K / 2.9 GB     31          4
Mamba 790M Chat                0K / 0 GB       18          0

Note: a green score (e.g. "73.2") means that the model is better than Q-bert/Mamba-790M.



Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227