BigMistral 11B by athirdpath

  Autotrain compatible   Endpoints compatible   Mistral   Region:us   Safetensors   Sharded   Tensorflow

BigMistral 11B Parameters and Internals

Additional Notes 
Base Mistral, but with some minor head trauma. Promising in theory, needs finetuning, but only really outperforms the 14b in size.
Training Details 
Methodology: NeverSleep recipe
LLM Name: BigMistral 11B
Repository: https://huggingface.co/athirdpath/BigMistral-11b
Model Size: 11b
Required VRAM: 21.5 GB
Updated: 2024-12-22
Maintainer: athirdpath
Model Type: mistral
Model Files: 10.0 GB (1-of-3), 10.0 GB (2-of-3), 1.5 GB (3-of-3)
Model Architecture: MistralForCausalLM
License: apache-2.0
Context Length: 32768
Model Max Length: 32768
Transformers Version: 4.35.2
Tokenizer Class: LlamaTokenizer
Vocabulary Size: 32000
Torch Data Type: bfloat16
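
The figures in the listing above can be cross-checked with a little arithmetic: the three shard sizes add up to the listed VRAM requirement, and at 2 bytes per bfloat16 weight that size implies a parameter count close to the nominal 11B. The sketch below is illustrative only. It assumes that 1 GB means 10^9 bytes, that the rounded shard sizes are accurate enough for an estimate, and that the listed VRAM covers the weights alone (activations and KV cache would need extra memory). The helper names `weights_size_gb`, `approx_param_count`, and `load_model` are hypothetical; `load_model` uses the standard Hugging Face `transformers` loading calls and is not taken from this page.

```python
# Values copied from the listing; everything else is an illustrative assumption.
MODEL_ID = "athirdpath/BigMistral-11b"
SHARD_SIZES_GB = [10.0, 10.0, 1.5]  # Model Files: 1-of-3, 2-of-3, 3-of-3
LISTED_VRAM_GB = 21.5
BYTES_PER_PARAM_BF16 = 2  # bfloat16 stores each weight in 16 bits


def weights_size_gb(shard_sizes_gb):
    """Total size of the sharded checkpoint, in GB."""
    return sum(shard_sizes_gb)


def approx_param_count(total_gb, bytes_per_param=BYTES_PER_PARAM_BF16):
    """Rough parameter count implied by the checkpoint size (assumes 1 GB = 1e9 bytes)."""
    return total_gb * 1e9 / bytes_per_param


def load_model(model_id=MODEL_ID):
    """Load tokenizer and model in bfloat16 (downloads the full checkpoint on first call)."""
    # Imported lazily so the arithmetic above runs without torch/transformers installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


if __name__ == "__main__":
    total = weights_size_gb(SHARD_SIZES_GB)
    print(f"checkpoint size: {total} GB (listed VRAM: {LISTED_VRAM_GB} GB)")
    print(f"implied parameters: {approx_param_count(total) / 1e9:.2f}B")
```

Calling `load_model()` is left out of the main block on purpose, since it needs roughly the full 21.5 GB of download and memory.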

Best Alternatives to BigMistral 11B

Best Alternatives   Context / RAM   Downloads + Likes (digits fused in source)
Starling LM 11B Alpha   32K / 21.4 GB   111612
...elik V2.3 Instruct MedIT Merge   32K / 22.3 GB   19121
...elik V2.3 Instruct Llama Prune   32K / 15.4 GB   21450
CarbonBeagle 11B Truthy   32K / 21.4 GB   1547510
ConfigurableBeagle 11B   32K / 21.4 GB   76993
CarbonBeagle 11B   32K / 21.4 GB   89159
Alphacode MALI 11B Slowtest   32K / 21.8 GB   47610
Alphacode MALI 11B   32K / 21.8 GB   47831
Mistral 11B Miniplatypus   32K / 21.5 GB   110
MistralLite 11B   32K / 21.4 GB   13372

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217