BigMistral 11B by athirdpath


Tags: Autotrain compatible, Endpoints compatible, Mistral, Region:us, Safetensors, Sharded, Tensorflow


BigMistral 11B Parameters and Internals

Additional Notes: Base Mistral, but with some minor head trauma. Promising in theory, needs finetuning, but only really outperforms the 14b in size.
Training Details
Methodology: NeverSleep recipe
LLM Name: BigMistral 11B
Repository: https://huggingface.co/athirdpath/BigMistral-11b
Model Size: 11b
Required VRAM: 21.5 GB
Updated: 2025-03-14
Maintainer: athirdpath
Model Type: mistral
Model Files: 10.0 GB (1-of-3), 10.0 GB (2-of-3), 1.5 GB (3-of-3)
Model Architecture: MistralForCausalLM
License: apache-2.0
Context Length: 32768
Model Max Length: 32768
Transformers Version: 4.35.2
Tokenizer Class: LlamaTokenizer
Vocabulary Size: 32000
Torch Data Type: bfloat16
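
The listed figures can be cross-checked with a little arithmetic. The sketch below sums the three shard sizes, compares the total with the listed Required VRAM, and estimates the parameter count from the 2-byte bfloat16 width. It assumes 1 GB means 1e9 bytes and that the shards contain only weights; neither is stated in the listing. The `load_model` helper is a hypothetical convenience wrapper around the standard `transformers` loading calls, and it is defined but not called here because running it downloads roughly 21.5 GB.

```python
# Values copied from the listing above.
SHARD_SIZES_GB = [10.0, 10.0, 1.5]  # "Model Files" row
LISTED_VRAM_GB = 21.5               # "Required VRAM" row
BYTES_PER_PARAM = 2                 # bfloat16 is 16 bits, i.e. 2 bytes


def total_weights_gb(shards):
    """Sum the shard sizes to get the total weight footprint in GB."""
    return sum(shards)


def approx_params_billions(weights_gb, bytes_per_param=BYTES_PER_PARAM):
    """Estimate the parameter count in billions.

    Assumption: 1 GB = 1e9 bytes and the files hold only weights.
    """
    return weights_gb * 1e9 / bytes_per_param / 1e9


def load_model(repo_id="athirdpath/BigMistral-11b"):
    """Hypothetical helper: load tokenizer and model in the listed dtype.

    Imports are local so the arithmetic above runs without torch installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForCausalLM.from_pretrained(
        repo_id, torch_dtype=torch.bfloat16
    )
    return tokenizer, model


weights_gb = total_weights_gb(SHARD_SIZES_GB)
params_b = approx_params_billions(weights_gb)

print(f"Total shard size: {weights_gb} GB (listed VRAM: {LISTED_VRAM_GB} GB)")
print(f"Approximate parameters at 2 bytes each: {params_b} billion")
```

Under these assumptions the shard total equals the listed Required VRAM, which suggests that figure covers the weights alone. Memory for the KV cache and activations at longer contexts would come on top of it.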

Best Alternatives to BigMistral 11B

Model | Context / RAM | Downloads and Likes (digits run together in the source)
Normistral 11B Warm | 1000K / 22.9 GB | 7716
Starling LM 11B Alpha | 32K / 21.4 GB | 177513
...elik V2.3 Instruct MedIT Merge | 32K / 22.3 GB | 1611
CarbonBeagle 11B Truthy | 32K / 21.4 GB | 1166810
...elik V2.3 Instruct Llama Prune | 32K / 15.4 GB | 1380
ConfigurableBeagle 11B | 32K / 21.4 GB | 35303
CarbonBeagle 11B | 32K / 21.4 GB | 34189
Alphacode MALI 11B Slowtest | 32K / 21.8 GB | 19850
Alphacode MALI 11B | 32K / 21.8 GB | 20291
Mistral 11B Miniplatypus | 32K / 21.5 GB | 100

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227