Mpt 30B by mosaicml


Tags: Arxiv:1909.08053, Arxiv:2010.04245, Arxiv:2108.12409, Arxiv:2205.14135, Arxiv:2302.06675, Arxiv:2302.13971, Autotrain compatible, Composer, Custom code, Dataset:allenai/c4, Dataset:allenai/s2orc, Dataset:bigcode/the-stack-dedu..., Dataset:mc4, Dataset:togethercomputer/redpa..., Llm-foundry, Mosaicml, Mpt, Pytorch, Region:us, Sharded, Streamingdatasets
Model Card on HF: https://huggingface.co/mosaicml/mpt-30b

Mpt 30B Benchmarks

Mpt 30B (mosaicml/mpt-30b)

Mpt 30B Parameters and Internals

Model Type: text generation
Training Details
Data Sources: allenai/c4, mc4, togethercomputer/RedPajama-Data-1T, bigcode/the-stack-dedup, allenai/s2orc
Data Volume: 1T tokens
Context Length: 8192
Hardware Used: 440 A100-40GB GPUs, 216 A100-40GB GPUs, 256 H100-80GB GPUs
Model Architecture: Modified transformer architecture with FlashAttention, ALiBi, no biases
Input Output
Performance Tips: Use FlashAttention for faster training and inference. ALiBi replaces positional embeddings and lets the model handle inputs longer than its training context.
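The ALiBi component named above can be illustrated with a small sketch. It follows the slope schedule described in the ALiBi paper (Arxiv:2108.12409) for a power-of-two head count. The head count and sequence length used here are illustrative assumptions, not the Mpt 30B configuration, and the function names are hypothetical. This is not the model's own implementation.

```python
def alibi_slopes(n_heads):
    """Per-head slopes: a geometric sequence 2^(-8/n), 2^(-16/n), ..., 2^(-8)."""
    if n_heads <= 0 or n_heads & (n_heads - 1):
        raise ValueError("this sketch only handles power-of-two head counts")
    return [2.0 ** (-8.0 * (h + 1) / n_heads) for h in range(n_heads)]


def alibi_bias(n_heads, seq_len):
    """Return bias[h][i][j], the value added to attention logits before softmax.

    For a query at position i and a key at position j <= i the bias is
    -slope_h * (i - j), so more distant keys are penalized linearly.
    Keys in the future (j > i) get -inf here, which folds the causal mask
    into the same tensor for simplicity (an assumption of this sketch).
    """
    neg_inf = float("-inf")
    return [
        [
            [-slope * (i - j) if j <= i else neg_inf for j in range(seq_len)]
            for i in range(seq_len)
        ]
        for slope in alibi_slopes(n_heads)
    ]


# Illustrative sizes only (hypothetical, not the Mpt 30B settings).
slopes = alibi_slopes(8)
bias = alibi_bias(8, 4)
```

Because the penalty depends only on the distance between positions and not on learned position embeddings, the same formula applies unchanged to sequences longer than those seen in training.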
LLM Name: Mpt 30B
Repository: https://huggingface.co/mosaicml/mpt-30b
Model Size: 30b
Required VRAM: 60.1 GB
Updated: 2024-12-22
Maintainer: mosaicml
Model Type: mpt
Model Files: 9.8 GB: 1-of-7, 9.9 GB: 2-of-7, 9.9 GB: 3-of-7, 9.9 GB: 4-of-7, 9.9 GB: 5-of-7, 9.9 GB: 6-of-7, 0.8 GB: 7-of-7
Model Architecture: MPTForCausalLM
License: apache-2.0
Model Max Length: 8192
Transformers Version: 4.28.1
Tokenizer Class: GPTNeoXTokenizer
Vocabulary Size: 50432
Torch Data Type: bfloat16
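The figures in this listing are mutually consistent, which a short calculation can show. The sketch below sums the seven shard sizes and divides by the 2 bytes that a bfloat16 value occupies. It assumes 1 GB means 10^9 bytes and that the shards hold essentially only bfloat16 weights. Both are assumptions, and the helper names are hypothetical.

```python
# Shard sizes in GB, copied from the "Model Files" entry (7 shards).
SHARD_SIZES_GB = [9.8, 9.9, 9.9, 9.9, 9.9, 9.9, 0.8]

# A bfloat16 value occupies 2 bytes.
BYTES_PER_BF16 = 2


def total_weight_size_gb(shard_sizes_gb):
    """Sum the shard sizes, rounded to one decimal like the listing."""
    return round(sum(shard_sizes_gb), 1)


def approx_params_billion(total_gb, bytes_per_param=BYTES_PER_BF16):
    """Rough parameter count in billions, assuming 1 GB = 1e9 bytes."""
    return total_gb / bytes_per_param


total_gb = total_weight_size_gb(SHARD_SIZES_GB)
params_b = approx_params_billion(total_gb)
print(f"weights: {total_gb} GB, roughly {params_b:.2f}B parameters")
```

The total matches the listed Required VRAM of 60.1 GB, which suggests that figure counts the weights only. Activations and the key/value cache at the 8192-token context would need additional memory on top of it.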

Best Alternatives to Mpt 30B

Best Alternatives | Context / RAM | Downloads Likes
Mpt 30B Chat | 0K / 60.1 GB | 1428203
Mpt 30B Instruct | 0K / 60.1 GB | 1263101
Mpt 30B Orca Mini | 0K / 180.5 GB | 171
Mpt 30B V2 | 0K / 60.1 GB | 1310
Mpt 30B V3 | 0K / 60.1 GB | 122
Mpt 30B Qlora Multi GPU | 0K /  GB | 161
Mpt 30B Peft Compatible | 0K / 60.1 GB | 148
...s Mpt 30B Gpt4 1p4 Five Epochs | 0K / 60.1 GB | 147
...t 30B Instruct Peft Compatible | 0K / 60.1 GB | 132
Mpt 30B Qlora Compatible | 0K / 60.1 GB | 1211

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217