Ruadapt Qwen2.5 3B Ext U48 Full Lr5e4 Peft Mlp 32 32 Bs256 by RefalMachine


Tags: Adapter, Base model:adapter:refalmachin..., Base model:refalmachine/ruadap..., Finetuned, Generated from trainer, Lora, Peft, Region:us, Safetensors, Tensorboard

Ruadapt Qwen2.5 3B Ext U48 Full Lr5e4 Peft Mlp 32 32 Bs256 Benchmarks

nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
Ruadapt Qwen2.5 3B Ext U48 Full Lr5e4 Peft Mlp 32 32 Bs256 (RefalMachine/ruadapt_qwen2.5_3B_ext_u48_full_lr5e4_peft_mlp_32_32_bs256)

Ruadapt Qwen2.5 3B Ext U48 Full Lr5e4 Peft Mlp 32 32 Bs256 Parameters and Internals

LLM Name: Ruadapt Qwen2.5 3B Ext U48 Full Lr5e4 Peft Mlp 32 32 Bs256
Repository: https://huggingface.co/RefalMachine/ruadapt_qwen2.5_3B_ext_u48_full_lr5e4_peft_mlp_32_32_bs256
Base Model(s): RefalMachine/ruadapt_qwen2.5_3B_ext_u48_mean_init
Model Size: 3b
Required VRAM: 0.7 GB
Updated: 2025-01-06
Maintainer: RefalMachine
Model Files: 0.7 GB, 0.0 GB
Model Architecture: Adapter
Model Max Length: 131072
Is Biased: none
Tokenizer Class: Qwen2Tokenizer
Padding Token: <|endoftext|>
PEFT Type: LORA
LoRA Model: Yes
PEFT Target Modules: gate_proj|down_proj|v_proj|o_proj|up_proj|k_proj|q_proj
LoRA Alpha: 32
LoRA Dropout: 0.05
R Param: 32
Errors: replace
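
The LoRA values listed above (R Param 32, LoRA Alpha 32, LoRA Dropout 0.05, and the seven target modules) can be collected into a small configuration sketch. This is an illustration only, not the maintainer's training script: the dictionary name `lora_kwargs` and the helper `lora_scaling` are hypothetical, the key names are chosen to match the keyword arguments of `peft.LoraConfig`, and the scaling formula assumes standard LoRA scaling (alpha divided by rank) rather than a variant such as rank-stabilized LoRA.

```python
# Values copied from the listing; nothing here is read from the actual repository.
listed_target_modules = "gate_proj|down_proj|v_proj|o_proj|up_proj|k_proj|q_proj"

# Hypothetical container; keys mirror peft.LoraConfig keyword arguments.
lora_kwargs = {
    "r": 32,               # "R Param" in the listing
    "lora_alpha": 32,      # "LoRA Alpha"
    "lora_dropout": 0.05,  # "LoRA Dropout"
    "target_modules": sorted(listed_target_modules.split("|")),
}


def lora_scaling(r: int, lora_alpha: int) -> float:
    """Return the standard LoRA scaling factor, lora_alpha / r.

    Assumption: plain LoRA scaling; other schemes scale differently.
    """
    return lora_alpha / r


scaling = lora_scaling(lora_kwargs["r"], lora_kwargs["lora_alpha"])
print(f"scaling = {scaling}")
print(f"{len(lora_kwargs['target_modules'])} target modules: {lora_kwargs['target_modules']}")
```

With alpha equal to the rank, the low-rank update is applied with a factor of 1.0. If the `peft` package is installed, the same dictionary could be passed as `LoraConfig(**lora_kwargs)`; whether that reproduces the published adapter exactly is not confirmed by this listing.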

Best Alternatives to Ruadapt Qwen2.5 3B Ext U48 Full Lr5e4 Peft Mlp 32 32 Bs256

Best Alternatives
Context / RAM
Downloads
Likes
Art Skynet 3B0K / 6.5 GB226
Xenith 3B0K / 7.6 GB22
Results0K / 0.9 GB60
Accel30K / 0 GB70
Accel20K / 6.8 GB50
Openllama Test10K / 0.1 GB50
Paligemma Hindi0K / 2.2 GB60
Paligemma Vqav20K / 0 GB50
Theory Of Mind 128 StableLM0K / 0.7 GB60
Theory Of Mind RP 128 StableLM0K / 0 GB50
Note: green Score (e.g. "73.2") means that the model is better than RefalMachine/ruadapt_qwen2.5_3B_ext_u48_full_lr5e4_peft_mlp_32_32_bs256.

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227