LWM 7B 1M 1000000ctx AEZAKMI 3 1 1702 by adamo1139


Tags: Autotrain compatible, Endpoints compatible, Llama, Region: us, Safetensors, Sharded, Tensorflow


LWM 7B 1M 1000000ctx AEZAKMI 3 1 1702 Parameters and Internals

Model Type:
text generation
Additional Notes:
The model is fine-tuned on the AEZAKMI v3.1 dataset. Exl2 quants and the base model will be available soon in safetensors format. Most of the base model's long-context capabilities are expected to remain.
Training Details:
Data Sources:
AEZAKMI v3.1 dataset
Methodology:
Fine-tuned with QLoRA (lora_r = 32) and a cosine learning-rate schedule decaying from 0.00015, using Unsloth with FlashAttention 2 (a rough sketch of this setup follows below).
Context Length:
4000
Training Time:
6 hours
Hardware Used:
A local RTX 3090 Ti
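The recipe above maps onto a fairly standard Unsloth + TRL QLoRA run. The sketch below is an approximation, not the author's actual training script: only lora_r = 32, the 0.00015 cosine learning rate, the 4000-token training context, and the Unsloth/FlashAttention 2 stack come from this page; the base-model and dataset identifiers, LoRA alpha and target modules, batch sizes, and epoch count are placeholder assumptions.

```python
# Hedged reconstruction of the QLoRA fine-tuning setup described above.
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

BASE_MODEL = "LargeWorldModel/LWM-Text-1M"   # assumed base model; not stated on this page
DATASET = "adamo1139/AEZAKMI_v3-1"           # assumed dataset id for "AEZAKMI v3.1"

# Load the base model in 4-bit for QLoRA; Unsloth uses FlashAttention 2 when available.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name=BASE_MODEL,
    max_seq_length=4000,          # training context length from the card
    load_in_4bit=True,
)

# Attach LoRA adapters with rank 32 (lora_r from the card).
model = FastLanguageModel.get_peft_model(
    model,
    r=32,
    lora_alpha=32,                                           # assumption
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],    # assumption
    use_gradient_checkpointing=True,
)

dataset = load_dataset(DATASET, split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",    # assumes a pre-formatted text column
    max_seq_length=4000,
    args=TrainingArguments(
        output_dir="lwm-7b-aezakmi-qlora",
        per_device_train_batch_size=1,     # assumption for a single RTX 3090 Ti
        gradient_accumulation_steps=8,     # assumption
        num_train_epochs=1,                # assumption
        learning_rate=0.00015,             # from the card
        lr_scheduler_type="cosine",        # from the card
        fp16=True,
        logging_steps=10,
    ),
)
trainer.train()
```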
LLM Name: LWM 7B 1M 1000000ctx AEZAKMI 3 1 1702
Repository: https://huggingface.co/adamo1139/LWM-7B-1M-1000000ctx-AEZAKMI-3_1-1702
Model Size: 7b
Required VRAM: 13.5 GB
Updated: 2024-11-19
Maintainer: adamo1139
Model Type: llama
Model Files: 1-of-3 (4.9 GB), 2-of-3 (5.0 GB), 3-of-3 (3.6 GB)
Model Architecture: LlamaForCausalLM
License: llama2
Context Length: 1048576
Model Max Length: 1048576
Transformers Version: 4.36.2
Tokenizer Class: LlamaTokenizer
Vocabulary Size: 32000
Torch Data Type: float16
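Given the architecture and file metadata above, the model should load with stock Transformers. The following is a minimal sketch based only on that metadata; the prompt template and generation settings are assumptions not taken from this page, and the full 1,048,576-token context will not fit in consumer VRAM, so this only demonstrates loading the float16 weights (~13.5 GB) and a short generation.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "adamo1139/LWM-7B-1M-1000000ctx-AEZAKMI-3_1-1702"

tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.float16,   # matches the Torch Data Type listed above
    device_map="auto",
)

# Assumed chat-style prompt; the actual prompt format is not documented on this page.
prompt = "A chat.\nUSER: Summarize the idea behind QLoRA in two sentences.\nASSISTANT:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```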

Best Alternatives to LWM 7B 1M 1000000ctx AEZAKMI 3 1 1702

Best Alternatives | Context / RAM | Downloads / Likes
... Qwen2.5llamaify 7B V23.1 200K | 195K / 15.2 GB | 29240
Yarn Llama 2 7B 128K | 128K / 13.5 GB | 407738
LLaMA 7B PoSE YaRN 128K | 128K / 13.5 GB | 233
LLaMA 7B PoSE Linear 96K | 96K / 27 GB | 222
LLaMA 7B PoSE YaRN 96K | 96K / 13.5 GB | 181
Chat Llama2 7B 80K | 80K / 13.8 GB | 280
Llama2 7B 80K | 80K / 13.8 GB | 110
Lloma Step400 | 64K / 13.5 GB | 730
Lloma Step200 | 64K / 13.5 GB | 730
666 | 64K / 13.5 GB | 2670


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241110