Mamba 1.4B Ru by SpirinEgor


Tags: arXiv:2312.00752, Autotrain compatible, En, Endpoints compatible, Mamba, PyTorch, Region:us, Ru, Safetensors


Mamba 1.4B Ru Parameters and Internals

Model Type: Causal Language Model
Supported Languages: en (English), ru (Russian)
Training Details
  Data Sources: SlimPajama, Wikipedia, Reddit
  Data Volume: 1 trillion tokens
  Methodology: Original implementation with FSDP strategy
  Context Length: 2048
  Training Time: 500,000 steps
  Model Architecture: Mamba model architecture with modified vocabulary size
Input Output
  Accepted Modalities: text
  Output Format: text
  Performance Tips: Install optimized kernels for improved performance.
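
The fields above suggest the checkpoint can be loaded as an ordinary causal language model. The following is a minimal sketch, assuming a transformers release with Mamba support (the listing reports 4.39.0.dev0) and PyTorch are installed. The helper name generate and its default of 40 new tokens are illustrative choices, not part of the listing. The "optimized kernels" tip most likely refers to the mamba-ssm and causal-conv1d packages; that reading is an assumption, since the listing does not name them.

```python
MODEL_ID = "SpirinEgor/mamba-1.4b-ru"  # repository named in the listing


def generate(prompt: str, max_new_tokens: int = 40) -> str:
    """Load the checkpoint and greedily continue `prompt` (hypothetical helper)."""
    # Imported lazily so this module can be imported without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID)
    model.eval()

    # Pass only input_ids so no extra tokenizer outputs reach generate().
    input_ids = tokenizer(prompt, return_tensors="pt")["input_ids"]
    with torch.no_grad():
        output_ids = model.generate(
            input_ids, max_new_tokens=max_new_tokens, do_sample=False
        )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (downloads roughly 5.3 GB of weights on first use):
# print(generate("Москва - это"))
```

Without the optimized kernels, the model should still run through a slower pure-PyTorch path, so the sketch does not depend on them.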
LLM Name: Mamba 1.4B Ru
Repository: https://huggingface.co/SpirinEgor/mamba-1.4b-ru
Model Size: 1.4b
Required VRAM: 5.3 GB
Updated: 2025-02-22
Maintainer: SpirinEgor
Model Type: mamba
Model Files: 5.3 GB, 5.3 GB
Supported Languages: ru en
Model Architecture: MambaForCausalLM
License: apache-2.0
Transformers Version: 4.39.0.dev0
Tokenizer Class: LlamaTokenizer
Vocabulary Size: 32768
Torch Data Type: float32
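
The size, VRAM, and data type fields can be cross-checked with simple arithmetic. This is a rough sketch, assuming 1 GB means 10^9 bytes, that the 5.3 GB file holds only weights, and that the VRAM figure excludes activations and inference state. The function names are hypothetical.

```python
# Bytes used to store one parameter in common PyTorch dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def estimate_param_count(size_gb: float, dtype: str = "float32") -> float:
    """Approximate parameter count implied by a checkpoint size in decimal GB."""
    return size_gb * 1e9 / BYTES_PER_PARAM[dtype]


def estimate_size_gb(n_params: float, dtype: str) -> float:
    """Approximate weight size in decimal GB for a given dtype."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


params = estimate_param_count(5.3, "float32")  # about 1.325e9 parameters
half_gb = estimate_size_gb(params, "float16")  # about 2.65 GB

print(f"{params / 1e9:.3f}B parameters")
print(f"{half_gb:.2f} GB in float16")
```

Under these assumptions, the 5.3 GB float32 file corresponds to roughly 1.3 billion parameters, in line with the listed 1.4b model size, and casting the weights to a 16-bit type would roughly halve the memory needed for them.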

Best Alternatives to Mamba 1.4B Ru

Best Alternatives               Context / RAM   Downloads Likes
Mamba 1.4B Hf                   0K / 5.5 GB     444810
Mambamerd                       0K / 5.8 GB     60
Mamba 1.4B Instruct Hf          0K / 5.5 GB     960
Ofm Mamba 1.4B Lambda Hf        0K / 5.5 GB     541
Mamba 1.4B                      0K / 2.8 GB     1260
Mambamerd                       0K / 5.8 GB     00


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227