Llama 3 8B Instruct 1048K by DevsDoCode

 ยป  All LLMs  ยป  DevsDoCode  ยป  Llama 3 8B Instruct 1048K   URL Share it on

  Arxiv:2309.00071   Arxiv:2402.08268   Autotrain compatible   Conversational   En   Endpoints compatible   Instruct   Llama   Llama-3   Meta   Region:us   Safetensors   Sharded   Tensorflow

Llama 3 8B Instruct 1048K Benchmarks

nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
Llama 3 8B Instruct 1048K (DevsDoCode/Llama-3-8B-Instruct-1048k)

Llama 3 8B Instruct 1048K Parameters and Internals

Model Type 
text generation, instruction tuned
Use Cases 
Areas:
commercial, research
Applications:
assistant-like chat, natural language generation tasks
Primary Use Cases:
English language applications
Limitations:
Out-of-scope usage in languages other than English
Considerations:
Developers may fine-tune for other languages following the Community License
Additional Notes 
Model addresses users across many backgrounds with an emphasis on openness and inclusivity.
Supported Languages 
English (commercial and research use)
Training Details 
Data Sources:
publicly available online data
Data Volume:
<0.01% of Llama-3's original pre-training data
Methodology:
NTK-aware interpolation, RoPE theta optimization, Progressive training
Context Length:
1048000
Hardware Used:
NVIDIA L40S, high performance L40S cluster
Model Architecture:
auto-regressive language model with optimized transformer architecture
Safety Evaluation 
Methodologies:
red teaming, adversarial evaluations
Findings:
significantly less likely to falsely refuse responses than Llama 2
Risk Categories:
CBRNE, cybersecurity, child safety
Ethical Considerations:
Iterative testing, external expert evaluation
Responsible Ai Considerations 
Fairness:
Model intends to serve everyone, designed for inclusivity
Transparency:
Outlined in Responsible Use Guide
Accountability:
Developers should ensure safety benchmarks
Mitigation Strategies:
Meta Llama Guard 2, Code Shield, Responsible Use Guide
Input Output 
Input Format:
text
Output Format:
text
Release Notes 
Version:
8B
Date:
April 18, 2024
Notes:
Part of Llama 3 release, optimized for dialogue use cases.
LLM NameLlama 3 8B Instruct 1048K
Repository ๐Ÿค—https://huggingface.co/DevsDoCode/Llama-3-8B-Instruct-1048k 
Model Size8b
Required VRAM16.1 GB
Updated2024-12-14
MaintainerDevsDoCode
Model Typellama
Instruction-BasedYes
Model Files  5.0 GB: 1-of-4   5.0 GB: 2-of-4   4.9 GB: 3-of-4   1.2 GB: 4-of-4
Supported Languagesen
Model ArchitectureLlamaForCausalLM
Licensellama3
Context Length1048576
Model Max Length1048576
Transformers Version4.39.1
Tokenizer ClassPreTrainedTokenizerFast
Vocabulary Size128256
Torch Data Typebfloat16

Quantized Models of the Llama 3 8B Instruct 1048K

Model
Likes
Downloads
VRAM
Llama 3 8B Instruct 1048K 4bit25104 GB
Llama 3 8B Instruct 1048K 8bit17298 GB

Best Alternatives to Llama 3 8B Instruct 1048K

Best Alternatives
Context / RAM
Downloads
Likes
...a 3 8B Instruct Gradient 1048K1024K / 16.1 GB14046677
Test V0.7z 8B1024K / 16.1 GB760
Test V0.6c 8B1024K / 16.1 GB670
Test V0.6l 8B1024K / 16.1 GB510
Test V0.7i 8B1024K / 16.1 GB470
Test V0.6M 8B1024K / 16.1 GB230
Test V0.7h 8B1024K / 16.1 GB180
Test V0.6n 8B1024K / 16.1 GB140
L3.1 Gradient1024K / 16.1 GB80
...SLERP Gradient1048k OpenBioLLM1024K / 16.1 GB390
Note: green Score (e.g. "73.2") means that the model is better than DevsDoCode/Llama-3-8B-Instruct-1048k.

Rank the Llama 3 8B Instruct 1048K Capabilities

๐Ÿ†˜ Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! ๐ŸŒŸ

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  

What open-source LLMs or SLMs are you in search of? 39237 in total.

Our Social Media →  
Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124