Batch1 Epochs1 Lr1e 05 Paged Adamw 32bit Cosine Length2048 Warmup 0.05 Max Grad1.0 Grad Accu32 by caisarl76


32bit, Autotrain compatible, Base model:mistralai/mistral-7..., Conversational, Endpoints compatible, Generated from trainer, Instruct, License:apache-2.0, Llama, Quantized, Region:us, Safetensors, Sharded, Tensorflow

Batch1 Epochs1 Lr1e 05 Paged Adamw 32bit Cosine Length2048 Warmup 0.05 Max Grad1.0 Grad Accu32 (caisarl76/batch1_epochs1_lr1e-05_paged_adamw_32bit_cosine_length2048_warmup_0.05_max_grad1.0_grad_accu32)
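
The repository name appears to encode the fine-tuning hyperparameters. As a minimal sketch, assuming each segment of the name maps to one trainer setting, the name can be parsed into a dictionary. The key names below follow field names used by Hugging Face `TrainingArguments`, except `max_seq_length`, which is a hypothetical label for the `length2048` segment; the mapping itself is an assumption and is not stated on this page.

```python
import re

REPO_ID = (
    "caisarl76/batch1_epochs1_lr1e-05_paged_adamw_32bit_cosine_"
    "length2048_warmup_0.05_max_grad1.0_grad_accu32"
)

# Assumed layout of the run name: one named group per hyperparameter.
PATTERN = re.compile(
    r"batch(?P<batch>\d+)_epochs(?P<epochs>\d+)_lr(?P<lr>[\d.e+-]+)"
    r"_(?P<optim>[a-z0-9_]+?)_(?P<scheduler>[a-z]+)"
    r"_length(?P<length>\d+)_warmup_(?P<warmup>[\d.]+)"
    r"_max_grad(?P<max_grad>[\d.]+)_grad_accu(?P<accu>\d+)"
)


def parse_run_name(repo_id: str) -> dict:
    """Split 'owner/run_name' and decode the run name into settings."""
    _owner, _, run_name = repo_id.partition("/")
    match = PATTERN.fullmatch(run_name)
    if match is None:
        raise ValueError(f"unrecognized run name: {run_name!r}")
    g = match.groupdict()
    return {
        "per_device_train_batch_size": int(g["batch"]),
        "num_train_epochs": int(g["epochs"]),
        "learning_rate": float(g["lr"]),
        "optim": g["optim"],
        "lr_scheduler_type": g["scheduler"],
        "max_seq_length": int(g["length"]),  # hypothetical key name
        "warmup_ratio": float(g["warmup"]),
        "max_grad_norm": float(g["max_grad"]),
        "gradient_accumulation_steps": int(g["accu"]),
    }


def effective_batch_size(hp: dict) -> int:
    """Per-device batch size times gradient accumulation steps."""
    return hp["per_device_train_batch_size"] * hp["gradient_accumulation_steps"]


if __name__ == "__main__":
    hp = parse_run_name(REPO_ID)
    for key, value in hp.items():
        print(f"{key}: {value}")
    print("effective batch size per device:", effective_batch_size(hp))
```

Under this reading, a batch size of 1 with 32 gradient accumulation steps gives an effective batch of 32 sequences per device per optimizer step.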

Best Alternatives to Batch1 Epochs1 Lr1e 05 Paged Adamw 32bit Cosine Length2048 Warmup 0.05 Max Grad1.0 Grad Accu32

Best Alternatives | HF Rank | Context/RAM | Downloads | Likes
...p 0.05 Max Grad1.0 Grad Accu32 |  | 32K / 14.4 GB | 8 | 0
...coder S CL 7B 3.0bpw H6 EXL2 2 |  | 16K / 2.8 GB | 3 | 1
...6.7B Instruct 3.0bpw H6 EXL2 2 |  | 16K / 2.8 GB | 2 | 1
...coder S CL 7B 4.0bpw H6 EXL2 2 |  | 16K / 3.6 GB | 3 | 1
CodelLama7B Inst DPO 7K Mlx |  | 16K / 4.2 GB | 7 | 1
...eLlama 7B Instruct Hf 4bit MLX |  | 16K / 4.2 GB | 5 | 1
...B Instruct Hf Bnb 4bit Smashed |  | 16K / 4.2 GB | 24 | 0
...ruct Solidity Bnb 4bit Smashed |  | 16K / 4.2 GB | 18 | 0
...coder S CL 7B 5.0bpw H6 EXL2 2 |  | 16K / 4.4 GB | 3 | 1
...6.7B Instruct 8.0bpw H8 EXL2 2 |  | 16K / 6.8 GB | 2 | 2

Batch1 Epochs1 Lr1e 05 Paged Adamw 32bit Cosine Length2048 Warmup 0.05 Max Grad1.0 Grad Accu32 Parameters and Internals

LLM Name: Batch1 Epochs1 Lr1e 05 Paged Adamw 32bit Cosine Length2048 Warmup 0.05 Max Grad1.0 Grad Accu32
Base Model(s): Mistral 7B Instruct V0.1 (mistralai/Mistral-7B-Instruct-v0.1)
Model Size: 7b
Required VRAM: 14.4 GB
Updated: 2024-07-12
Maintainer: caisarl76
Model Type: llama
Instruction-Based: Yes
Model Files: 4.9 GB (1-of-3), 5.0 GB (2-of-3), 4.5 GB (3-of-3), 0.0 GB
Quantization Type: 32bit
Model Architecture: LlamaForCausalLM
License: apache-2.0
Context Length: 32768
Model Max Length: 32768
Transformers Version: 4.35.2
Tokenizer Class: LlamaTokenizer
Padding Token: [PAD]
Vocabulary Size: 32000
Initializer Range: 0.02
Torch Data Type: bfloat16
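
The listed figures can be cross-checked with simple arithmetic. The sketch below sums the three shard sizes and compares the total with the required VRAM, then estimates the parameter count from the bfloat16 data type at 2 bytes per parameter. It assumes that "GB" means 10^9 bytes and that the listed sizes are rounded, so the result is only approximate; the function names are illustrative.

```python
# Shard sizes in GB as listed under "Model Files" (rounded on the page).
SHARD_SIZES_GB = [4.9, 5.0, 4.5]

# bfloat16 stores each parameter in 2 bytes.
BYTES_PER_PARAM_BF16 = 2


def total_weight_size_gb(shards: list[float]) -> float:
    """Sum the shard sizes, rounded to the page's one-decimal precision."""
    return round(sum(shards), 1)


def approx_param_count(total_gb: float, bytes_per_param: int) -> float:
    """Estimate parameter count, assuming 1 GB = 10**9 bytes."""
    return total_gb * 1e9 / bytes_per_param


if __name__ == "__main__":
    total = total_weight_size_gb(SHARD_SIZES_GB)
    params = approx_param_count(total, BYTES_PER_PARAM_BF16)
    print(f"total weights: {total} GB")
    print(f"approx. parameters: {params / 1e9:.1f}B")
```

The shard total matches the 14.4 GB listed as required VRAM, and the estimate of roughly 7.2 billion parameters is consistent with the listed model size of 7b.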

Original data from HuggingFace, OpenCompass and various public git repos.
Release v2024042801