Model Type | |
Use Cases |
Applications: | Chat assistants, Natural language generation |
|
Primary Use Cases: | Assistant-like chat, GPTQ quantized for GPU inference |
|
Limitations: | Testing has been conducted in English; use in other languages is out of scope
|
Considerations: | Compliance with Meta's Acceptable Use Policy |
|
|
Additional Notes | Multiple 4-bit GPTQ quantizations are provided, trading off VRAM requirements against inference quality; supported by AutoGPTQ
|
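As a rough illustration of what 4-bit quantization involves, the sketch below uses generic round-to-nearest quantization with a per-group scale. This is not the actual GPTQ algorithm (which minimizes layer-wise reconstruction error); it only shows the storage idea that AutoGPTQ-style formats rely on: each weight becomes a 4-bit integer plus a shared per-group scale, cutting memory roughly 4x versus fp16.

```python
# Generic 4-bit round-to-nearest quantization sketch with a per-group scale.
# NOTE: this is NOT the GPTQ algorithm (which minimizes layer-wise
# reconstruction error); it only illustrates the storage-format idea:
# each weight becomes a 4-bit signed integer plus a shared per-group scale.

def quantize_4bit(weights, group_size=4):
    """Quantize a flat list of floats to 4-bit signed ints (-8..7) per group."""
    quantized, scales = [], []
    for start in range(0, len(weights), group_size):
        group = weights[start:start + group_size]
        scale = max(abs(w) for w in group) / 7 or 1.0  # avoid zero scale
        scales.append(scale)
        quantized.append([max(-8, min(7, round(w / scale))) for w in group])
    return quantized, scales

def dequantize_4bit(quantized, scales):
    """Reconstruct approximate floats from 4-bit ints and per-group scales."""
    return [q * s for group, s in zip(quantized, scales) for q in group]

weights = [0.12, -0.45, 0.31, 0.07, -0.9, 0.02, 0.55, -0.21]
q, s = quantize_4bit(weights)
approx = dequantize_4bit(q, s)
# Each reconstructed weight lands within half a quantization step of the
# original; larger group sizes save memory but coarsen the approximation.
```

Real GPTQ checkpoints pack these 4-bit integers tightly and choose quantized values to minimize the error of each layer's output, but the per-group scale structure is the same.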
Training Details |
Data Sources: | Publicly available online data, Publicly available instruction datasets, Over one million new human-annotated examples |
|
Data Volume: | 2 trillion tokens for pretraining |
|
Methodology: | Auto-regressive language modeling with a transformer architecture. Fine-tuned with supervised fine-tuning (SFT) and reinforcement learning from human feedback (RLHF).
|
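The auto-regressive setup above can be sketched as a loop that repeatedly feeds the growing sequence back into the model to pick the next token. The scoring function below is a toy bigram table standing in for a real transformer forward pass (which would return logits over a ~32k-token vocabulary); the greedy argmax step is the same in spirit.

```python
# Toy greedy auto-regressive decoding loop. `next_token_scores` is a
# stand-in for a transformer forward pass; a real model would return
# logits over the full vocabulary instead of this tiny bigram table.

BIGRAMS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "</s>": 0.2},
    "cat": {"sat": 0.7, "</s>": 0.3},
    "sat": {"</s>": 1.0},
}

def next_token_scores(tokens):
    """Score candidate next tokens given the sequence generated so far."""
    return BIGRAMS.get(tokens[-1], {"</s>": 1.0})

def generate(max_new_tokens=10):
    tokens = ["<s>"]
    for _ in range(max_new_tokens):
        scores = next_token_scores(tokens)
        best = max(scores, key=scores.get)  # greedy decoding: take the argmax
        if best == "</s>":
            break
        tokens.append(best)
    return tokens[1:]  # drop the start-of-sequence token

print(generate())  # prints ['the', 'cat', 'sat']
```

SFT and RLHF change how the scoring function is trained, not this decoding loop: they shift the model's next-token distribution toward helpful, safe completions.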
Context Length: | |
Training Time: | Llama 2 70B required 1,720,320 GPU-hours
|
Hardware Used: | Meta's Research Super Cluster and production clusters; 3.3M cumulative GPU-hours on A100-80GB GPUs across the Llama 2 family
|
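The 70B figure above can be sanity-checked with a quick conversion to wall-clock time. The 1,024-GPU allocation below is a hypothetical round number chosen for illustration, not a figure reported by Meta.

```python
# Convert the reported 1,720,320 GPU-hours for Llama 2 70B into wall-clock
# days under an assumed allocation size (1,024 GPUs is hypothetical, not a
# number from Meta's documentation).

GPU_HOURS_70B = 1_720_320

def wall_clock_days(gpu_hours, num_gpus):
    """Wall-clock days if the work is spread evenly over num_gpus GPUs."""
    return gpu_hours / num_gpus / 24

print(wall_clock_days(GPU_HOURS_70B, 1024))  # prints 70.0
```

Doubling the hypothetical GPU count halves the wall-clock estimate, which is why GPU-hours, not calendar time, is the standard unit for reporting training cost.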
Model Architecture: | Auto-regressive transformer |
|
|
Safety Evaluation |
Methodologies: | Human evaluations, Internal benchmarks |
|
Findings: | Outperformed open-source chat models on benchmarks; on par with closed-source models such as ChatGPT for helpfulness and safety
|
Risk Categories: | |
Ethical Considerations: | Testing has been conducted in English and has not covered all scenarios; the model may produce inaccurate or biased outputs
|
|
Responsible AI Considerations |
Fairness: | Testing conducted to date indicates the model may produce inaccurate or biased outputs
|
Transparency: | Safety testing and tuning should be performed for specific applications |
|
Accountability: | Developers need to ensure safety before deploying applications |
|
Mitigation Strategies: | Use safety testing and tuning tailored to specific applications |
|
|
Input / Output |
Input Format: | |
Accepted Modalities: | |
Output Format: | |
|