Mamba GPT 3B V2 by CobraMamba

Tags: Autotrain compatible, En, Gpt, Llama, Lora, Pytorch, Region:us, Safetensors

Mamba GPT 3B V2 Parameters and Internals

Model Type 
Large Language Model, Text Generation
Use Cases 
Areas:
Research, Commercial Applications
Additional Notes 
The model uses LLaMA architecture for causal language modeling.
Training Details 
Methodology:
Fine-tuned model surpassing the original on several evaluations.
Model Architecture:
LLaMA (LlamaForCausalLM) with 32 layers of LlamaDecoderLayer each consisting of LlamaAttention (uses Linear, RotaryEmbedding) and LlamaMLP (uses Linear, SiLUActivation).
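
To make the LlamaMLP description concrete, here is a minimal pure-Python sketch of a SiLU-gated feed-forward block of the kind used in LLaMA-style decoder layers. The function names (`silu`, `linear`, `gated_mlp`) and the toy weights are illustrative assumptions, not code from this model; real layers use bias-free `Linear` modules on tensors rather than nested lists.

```python
import math


def silu(x: float) -> float:
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def linear(weights, x):
    """Bias-free linear layer; `weights` is a list of rows, `x` a vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]


def gated_mlp(x, w_gate, w_up, w_down):
    """Gated MLP: down(silu(gate(x)) * up(x)), applied element-wise."""
    gate = [silu(g) for g in linear(w_gate, x)]
    up = linear(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return linear(w_down, hidden)


# Toy weights (hypothetical): input size 2, intermediate size 3.
x = [1.0, -2.0]
w_gate = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w_up = [[0.5, 0.5], [1.0, -1.0], [0.0, 2.0]]
w_down = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]

y = gated_mlp(x, w_gate, w_up, w_down)
print(y)
```

The gate branch passes through SiLU and multiplies the up branch before the down projection, which is why the description lists three `Linear` uses alongside `SiLUActivation`.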
Input Output 
Input Format:
<|prompt|>Your input</s><|answer|>
Output Format:
Generated text corresponding to the prompt.
Performance Tips:
Install compatible versions of transformers, accelerate, and torch before loading the model.
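
The input format and setup tip can be sketched in Python as follows. This is a sketch under assumptions, not the maintainer's reference code: the separator between the input and `<|answer|>` is assumed to be the end-of-sentence token `</s>`, and `build_prompt`, `generate`, and the generation settings are hypothetical helpers chosen for illustration. The heavy imports sit inside `generate` so the prompt helper can be used without downloading any weights.

```python
def build_prompt(user_input: str, sep: str = "</s>") -> str:
    """Wrap user text in the prompt/answer markers.

    `sep` is an assumption: the end-of-sentence token is used as the
    separator before <|answer|>.
    """
    return f"<|prompt|>{user_input}{sep}<|answer|>"


def generate(user_input: str,
             model_id: str = "CobraMamba/mamba-gpt-3b-v2",
             max_new_tokens: int = 128) -> str:
    """Load the model and return only the newly generated text."""
    # Imported lazily: requires torch, transformers, and accelerate.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.float16,  # assumption: half precision to save memory
        device_map="auto",          # placement handled by accelerate
    )
    inputs = tokenizer(build_prompt(user_input), return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the answer is decoded.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)


if __name__ == "__main__":
    print(build_prompt("How are you?"))
```

Calling `generate("How are you?")` downloads the weights on first use, so it is left out of the script's default path.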
LLM Name: Mamba GPT 3B V2
Repository: https://huggingface.co/CobraMamba/mamba-gpt-3b-v2
Model Size: 3b
Required VRAM: 6.8 GB
Updated: 2025-02-22
Maintainer: CobraMamba
Model Files: 6.8 GB, 6.8 GB
Supported Languages: en
Model Architecture: AutoModelForCausalLM
License: apache-2.0
Model Max Length: 2048
Is Biased: none
Tokenizer Class: LlamaTokenizer
Beginning of Sentence Token: <s>
End of Sentence Token: </s>
Unk Token: <unk>
PEFT Type: LORA
LoRA Model: Yes
PEFT Target Modules: q_proj|v_proj
LoRA Alpha: 16
LoRA Dropout: 0.1
R Param: 256
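
As a rough illustration of what the LoRA settings above imply, the sketch below computes the standard LoRA scaling factor (alpha divided by rank) and the number of adapter parameters added to the two target projections. The rank, alpha, target modules, and layer count come from this page; `HIDDEN_SIZE` is a hypothetical value used only for the arithmetic, and the projections are assumed to be square (hidden size by hidden size).

```python
# LoRA settings as listed on this page.
LORA = {
    "r": 256,
    "alpha": 16,
    "dropout": 0.1,
    "target_modules": ["q_proj", "v_proj"],
}

NUM_LAYERS = 32     # from the architecture description on this page
HIDDEN_SIZE = 3200  # hypothetical: not stated on this page


def lora_scaling(alpha: float, r: int) -> float:
    """Standard LoRA scaling applied to the low-rank update: alpha / r."""
    return alpha / r


def lora_added_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in one adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


def total_lora_params(hidden: int, layers: int, r: int, n_targets: int) -> int:
    """Adapter parameters across all layers, assuming square projections."""
    return layers * n_targets * lora_added_params(hidden, hidden, r)


scaling = lora_scaling(LORA["alpha"], LORA["r"])
total = total_lora_params(HIDDEN_SIZE, NUM_LAYERS, LORA["r"], len(LORA["target_modules"]))
print(f"scaling = {scaling}, adapter parameters = {total:,}")
```

With alpha smaller than the rank, the scaling factor is below 1, so the low-rank update is damped relative to the frozen base weights.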

Best Alternatives to Mamba GPT 3B V2

Best Alternatives | Context / RAM | Downloads + Likes (unseparated)
Granite 3B Mup | 4K / 14 GB | 3740
Llama 3.2 3B Mathdaily Chatbot | 0K / 6.5 GB | 1090
MM Alpaca 3B Lora | 0K / 0.2 GB | 70
Qwen2.5 3b Lora Model | 0K / 0.1 GB | 110
...ma 3.2 3B It Ecommerce ChatBot | 0K / 6.5 GB | 1904

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227