Model Type | Causal decoder-only; quantized (4-bit GPTQ) |
|
Use Cases |
Areas: | |
Applications: | Chatbots, Customer service |
|
Primary Use Cases: | Ready-to-use chat/instruct model
|
Limitations: | Production use without an adequate assessment of risks and mitigations
|
Considerations: | Trained mostly on English data; does not generalize well to other languages
|
Supported Languages | English (high proficiency), French (medium proficiency) |
|
Training Details |
Data Sources: | |
Data Volume: | 150M tokens |
Methodology: | Fine-tuned from Falcon-7B
Context Length: | 2048 tokens
Hardware Used: | |
Model Architecture: | Causal decoder-only, inspired by GPT-3, with rotary positional embeddings, multi-query attention, and FlashAttention (see the sketch below)
|
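To make the multi-query attention note above concrete, here is a minimal PyTorch sketch of the idea, in which all query heads share a single key/value head. Shapes and the function name are illustrative; this is not Falcon's actual implementation, which also adds rotary embeddings and FlashAttention kernels.

```python
import torch

def multi_query_attention(q, k, v):
    # q: (batch, n_heads, seq, head_dim)
    # k, v: (batch, 1, seq, head_dim) -- one shared key/value head
    # The single K/V head broadcasts across all query heads, shrinking the
    # KV cache by a factor of n_heads versus standard multi-head attention.
    scores = (q @ k.transpose(-2, -1)) / (q.shape[-1] ** 0.5)
    return torch.softmax(scores, dim=-1) @ v

# Toy shapes: one sequence of 8 tokens, 4 query heads, head_dim 16.
q = torch.randn(1, 4, 8, 16)
k = torch.randn(1, 1, 8, 16)
v = torch.randn(1, 1, 8, 16)
print(multi_query_attention(q, k, v).shape)  # torch.Size([1, 4, 8, 16])
```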
Safety Evaluation | |
Responsible AI Considerations
Fairness: | Carries the stereotypes and biases of the web corpora it was trained on
Mitigation Strategies: | Develop guardrails and take appropriate precautions for any production use
|
Input Output |
Input Format: | Text prompts using the 'A helpful assistant' template (see the example below)
Accepted Modalities: | Text
Output Format: | Generated text responses |
Performance Tips: | Generation is very slow; expect around 0.7 tokens/s
|
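A minimal inference sketch using `transformers` and `auto_gptq`. The checkpoint path `falcon-7b-instruct-gptq` is a hypothetical placeholder, and the exact wording of the 'A helpful assistant' template is an assumption:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM

model_dir = "falcon-7b-instruct-gptq"  # hypothetical path to this checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoGPTQForCausalLM.from_quantized(
    model_dir,
    device="cuda:0",
    use_safetensors=True,
    trust_remote_code=True,  # Falcon ships custom modeling code
)

# 'A helpful assistant' style template; the exact wording is an assumption.
prompt = "What are the main risks of deploying chatbots in customer service?"
template = f"A helpful assistant.\nUser: {prompt}\nAssistant:"

input_ids = tokenizer(template, return_tensors="pt").input_ids.to("cuda:0")
output = model.generate(
    input_ids=input_ids,
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
print(tokenizer.decode(output[0], skip_special_tokens=True))
```

Given the ~0.7 tokens/s throughput noted above, keep `max_new_tokens` modest when experimenting.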
Release Notes |
Version: | 4-bit GPTQ
Notes: | Quantized to 4 bits using AutoGPTQ (illustrative sketch below)
|
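For reference, this is roughly how a 4-bit AutoGPTQ quantization run looks. The base model ID, group size, and calibration example below are illustrative assumptions, not the exact settings used to produce this release:

```python
from transformers import AutoTokenizer
from auto_gptq import AutoGPTQForCausalLM, BaseQuantizeConfig

base_model = "tiiuae/falcon-7b-instruct"  # assumed base checkpoint

tokenizer = AutoTokenizer.from_pretrained(base_model)
quantize_config = BaseQuantizeConfig(
    bits=4,          # 4-bit weights, matching this release
    group_size=128,  # illustrative; the actual group size is not stated here
    desc_act=False,  # illustrative
)

model = AutoGPTQForCausalLM.from_pretrained(
    base_model, quantize_config, trust_remote_code=True
)

# GPTQ calibrates against sample inputs; a single toy example shown here.
examples = [
    tokenizer("A helpful assistant answers the user's questions.", return_tensors="pt")
]
model.quantize(examples)
model.save_quantized("falcon-7b-instruct-gptq")  # hypothetical output directory
```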
|