NinjaMouse2 2.5B V0.1 by trollek



NinjaMouse2 2.5B V0.1 Parameters and Internals

Model Type: text generation, causal language model
Supported Languages: en (English)
Training Details
Data Sources: WhiteRabbitNeo/WRN-Chapter-1, WhiteRabbitNeo/WRN-Chapter-2, LDJnr/Capybara, teknium/openhermes, teknium/GPTeacher-General-Instruct, Weyaxi/sci-datasets, TIGER-Lab/MathInstruct, hiyouga/glaive-function-calling-v2-sharegpt, glaiveai/glaive-code-assistant, m-a-p/CodeFeedback-Filtered-Instruction, m-a-p/Code-Feedback, migtissera/Synthia-v1.3, abacusai/SystemChat, jondurbin/airoboros-3.2, vicgalle/alpaca-gpt4, garage-bAInd/Open-Platypus, trollek/Mouse-Diffusion-Instruct, trollek/Self-Rewarding-Mouse
Data Volume: sources filtered to examples between 2k and 8k tokens
Methodology: step-wise insertion of new decoder blocks following the LLaMA Pro method, with the new data trained in using techniques such as BAdam, DoRA, and QLoRA (sketched below)
Context Length: 8000
Model Architecture: MistralForCausalLM with 34 MistralDecoderLayer blocks
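Two of the notes above lend themselves to short illustrations. First, the token-count filter: a minimal sketch of dropping examples outside the 2k–8k window, assuming the base model's tokenizer and a crude field serialization. The dataset choice and the serialization are placeholders, not trollek's actual pipeline:

```python
# Hypothetical token-count filter (2k-8k tokens); illustrative, not the card author's script.
from datasets import load_dataset
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("h2oai/h2o-danube2-1.8b-chat")
ds = load_dataset("LDJnr/Capybara", split="train")  # one of the listed sources

def within_budget(example, lo=2000, hi=8000):
    # Crude serialization: real sources use varying schemas, so this join is a placeholder.
    text = " ".join(str(v) for v in example.values())
    n = len(tok(text).input_ids)
    return lo <= n <= hi

ds = ds.filter(within_budget)
```

Second, the LLaMA Pro expansion itself: new decoder blocks are cloned from existing ones and their output projections zeroed, so each inserted block starts as an identity mapping and only later learns from the new data. A minimal sketch, assuming the base h2o-danube2-1.8b-chat has 24 decoder layers (10 inserted blocks reach the 34 listed above); the even spacing of inserts is an assumption:

```python
# Sketch of LLaMA Pro-style block expansion (24 -> 34 layers); not trollek's exact script.
import copy
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "h2oai/h2o-danube2-1.8b-chat", torch_dtype=torch.bfloat16
)
layers = model.model.layers  # ModuleList of MistralDecoderLayer

def identity_copy(layer):
    """Clone a decoder block and zero its output projections so the new
    block is a no-op at initialization (the LLaMA Pro zero-init trick)."""
    block = copy.deepcopy(layer)
    torch.nn.init.zeros_(block.self_attn.o_proj.weight)
    torch.nn.init.zeros_(block.mlp.down_proj.weight)
    return block

n_insert, inserted = 10, 0
step = max(1, len(layers) // n_insert)  # spread inserts evenly (assumption)
expanded = []
for i, layer in enumerate(layers):
    expanded.append(layer)
    if (i + 1) % step == 0 and inserted < n_insert:
        expanded.append(identity_copy(layer))
        inserted += 1
model.model.layers = torch.nn.ModuleList(expanded)
model.config.num_hidden_layers = len(expanded)  # 24 + 10 = 34
```

In the LLaMA Pro recipe only the inserted blocks are trained while the original weights stay frozen; BAdam, DoRA, and QLoRA are memory-efficient ways to run that training pass.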
LLM Name: NinjaMouse2 2.5B V0.1
Repository: https://huggingface.co/trollek/NinjaMouse2-2.5B-v0.1
Base Model(s): h2oai/h2o-danube2-1.8b-chat
Model Size: 1.8b
Required VRAM: 5.1 GB
Updated: 2025-02-22
Maintainer: trollek
Model Type: mistral
Model Files: 4.9 GB (1-of-2), 0.2 GB (2-of-2)
Supported Languages: en
Model Architecture: MistralForCausalLM
License: apache-2.0
Context Length: 8192
Model Max Length: 8192
Transformers Version: 4.40.0
Tokenizer Class: LlamaTokenizer
Padding Token: <unk>
Vocabulary Size: 32009
Torch Data Type: bfloat16
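Given the settings above (bfloat16 weights, LlamaTokenizer, 8192-token context), the checkpoint loads with the stock transformers API. A minimal sketch; the chat-template call assumes the repo ships a template with its tokenizer, which this card does not confirm:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "trollek/NinjaMouse2-2.5B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(repo)  # resolves to LlamaTokenizer per the card
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,  # matches the listed torch data type
    device_map="auto",           # needs accelerate; ~5.1 GB VRAM per the card
)

# Assumes a chat template is bundled with the tokenizer (unverified here).
messages = [{"role": "user", "content": "Write a short function that reverses a string."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
out = model.generate(inputs, max_new_tokens=256)
print(tokenizer.decode(out[0][inputs.shape[-1]:], skip_special_tokens=True))
```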

Best Alternatives to NinjaMouse2 2.5B V0.1

Best Alternatives                       Context / RAM   Downloads   Likes
H2o Danube 1.8B Base                    16K / 3.7 GB          396      43
H2o Danube 1.8B Chat                    16K / 3.7 GB          497      54
Cypher Mini 1.8B                        16K / 3.7 GB          167       2
H2o Danube 1.8B Sft                     16K / 3.7 GB          173      11
Cypher CoT 1.8B                         16K / 3.7 GB          152       1
PixieZehirNano                          16K / 3.7 GB           10       0
...1.8B Chat Sft Merge Fourier V1       16K / 7.3 GB           90       1
H2o Danube2 1.8B Chat                   8K / 3.7 GB          2856      61
H2o Danube2 1.8B Base                   8K / 3.7 GB           242      46
H2o Danube2 1.8B Sft                    8K / 3.7 GB           263       6

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227