Llama 3 8B Instruct Gradient 1048K by gradientai

 ยป  All LLMs  ยป  gradientai  ยป  Llama 3 8B Instruct Gradient 1048K   URL Share it on

  Arxiv:2305.14233   Arxiv:2309.00071   Arxiv:2402.08268   Autotrain compatible   Conversational   Doi:10.57967/hf/3372   En   Endpoints compatible   Instruct   Llama   Llama-3   Meta   Region:us   Safetensors   Sharded   Tensorflow

Llama 3 8B Instruct Gradient 1048K Benchmarks

Llama 3 8B Instruct Gradient 1048K (gradientai/Llama-3-8B-Instruct-Gradient-1048k)

Llama 3 8B Instruct Gradient 1048K Parameters and Internals

Model Type 
text-generation
Use Cases 
Areas:
commercial, research
Applications:
natural language generation
Primary Use Cases:
assistant-like chat
Limitations:
not suitable for use in languages other than English
Considerations:
developers to perform safety testing and tuning tailored to applications
Additional Notes 
Model is static and trained on an offline dataset. Future versions will focus on safety improvements.
Supported Languages 
en (primary)
Training Details 
Data Sources:
publicly available online data, SlimPajama, UltraChat
Data Volume:
1.4B tokens total
Methodology:
NTK-aware interpolation for RoPE theta optimization, progressive training on increasing context lengths, supervised fine-tuning (SFT), reinforcement learning with human feedback (RLHF)
Context Length:
1048
Hardware Used:
Crusoe Energy high performance L40S cluster
Model Architecture:
auto-regressive language model using an optimized transformer architecture
Safety Evaluation 
Methodologies:
red teaming, adversarial evaluations
Findings:
mitigations implemented to limit false refusals, CBRNE assessments
Risk Categories:
misuse, critical risks, cybersecurity, child safety
Ethical Considerations:
open approach to better, safer products, emphasis on responsible AI development
Responsible Ai Considerations 
Fairness:
openness, inclusivity, helpfulness
Transparency:
steps and best practices for safe deployment
Accountability:
developers
Mitigation Strategies:
Purple Llama solutions, Llama Guard for input-output safeguards
Input Output 
Input Format:
text only
Accepted Modalities:
text
Output Format:
text and code only
LLM NameLlama 3 8B Instruct Gradient 1048K
Repository ๐Ÿค—https://huggingface.co/gradientai/Llama-3-8B-Instruct-Gradient-1048k 
Model Size8b
Required VRAM16.1 GB
Updated2024-12-21
Maintainergradientai
Model Typellama
Instruction-BasedYes
Model Files  5.0 GB: 1-of-4   5.0 GB: 2-of-4   4.9 GB: 3-of-4   1.2 GB: 4-of-4
Supported Languagesen
Model ArchitectureLlamaForCausalLM
Licensellama3
Context Length1048576
Model Max Length1048576
Transformers Version4.41.0.dev0
Tokenizer ClassPreTrainedTokenizerFast
Vocabulary Size128256
Torch Data Typebfloat16

Quantized Models of the Llama 3 8B Instruct Gradient 1048K

Model
Likes
Downloads
VRAM
...8B Instruct Gradient 1048K AWQ0145 GB
...truct Gradient 1048K IMat GGUF63872 GB
...ent 1048K Molecule Q4 K M GGUF0284 GB
...B Instruct Gradient 1048K GGUF31773 GB
...radient 1048K AWQ 4bit Smashed1325 GB

Best Alternatives to Llama 3 8B Instruct Gradient 1048K

Best Alternatives
Context / RAM
Downloads
Likes
MrRoboto ProLong 8B V1a1024K / 16.1 GB1070
MrRoboto ProLong 8B V2a1024K / 16.1 GB1000
MrRoboto ProLong 8B V2f1024K / 16.1 GB510
MrRoboto ProLong 8B V1f1024K / 16.1 GB630
MrRoboto ProLong 8B V1l1024K / 16.1 GB600
8B Unaligned BASE V2b1024K / 16.1 GB930
MrRoboto ProLong 8B V1h1024K / 16.1 GB360
MrRoboto ProLong 8B V1d1024K / 16.1 GB340
MrRoboto ProLong 8B V1m1024K / 16.1 GB280
Test V0.7z 8B1024K / 16.1 GB760

Rank the Llama 3 8B Instruct Gradient 1048K Capabilities

๐Ÿ†˜ Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! ๐ŸŒŸ

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  

What open-source LLMs or SLMs are you in search of? 40013 in total.

Our Social Media →  
Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217