Model Type | Causal decoder-only transformer language model |
|
Use Cases |
Areas: | Research, Commercial applications |
|
Applications: | |
Primary Use Cases: | Chat assistant applications |
|
Limitations: | Model outputs may be unpredictable, inaccurate, biased, or objectionable. |
|
Considerations: | Perform application-specific safety testing before deployment. |
|
|
Additional Notes | Embeddings are padded to a multiple of 128 for sharded-inference compatibility. |
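The padding note above can be sketched as a simple round-up of the vocabulary dimension; the function name and the example sizes here are illustrative, not taken from this model card:

```python
# Sketch: round the embedding-matrix row count up to a multiple of 128
# so the vocabulary dimension shards evenly across tensor-parallel devices.
def padded_vocab_size(vocab_size: int, pad_to: int = 128) -> int:
    """Round vocab_size up to the nearest multiple of pad_to."""
    return ((vocab_size + pad_to - 1) // pad_to) * pad_to

# Example: a vocabulary of 32,007 tokens pads up to 32,128 (251 * 128).
print(padded_vocab_size(32007))
```

Rows added by padding correspond to no real token, so they are simply never indexed at inference time.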
|
Supported Languages | en (full), de (limited), es (limited), fr (limited), it (limited), pt (limited), pl (limited), nl (limited), ro (limited), cs (limited), sv (limited) |
|
Training Details |
Data Sources: | rombodawg/LosslessMegaCodeTrainingV2_1m_Evol_Uncensored, OpenAssistant/oasst1, shahules786/orca-best, argilla/databricks-dolly-15k-curated-multilingual |
|
Methodology: | Fine-tuned in two stages: first on synthetic instructions and coding tasks, then on top-ranked human demonstrations. |
|
Context Length: | |
Hardware Used: | Compute provided by EPFL's Machine Learning and Optimization Laboratory and Natural Language Processing Lab |
|
Model Architecture: | Causal decoder-only transformer architecture |
|
|
Responsible AI Considerations |
Fairness: | Testing has been conducted mainly in English; outputs may be unpredictable in other languages or scenarios. |
|
Transparency: | Documented training processes and datasets. |
|
Accountability: | Open-Assistant development team is accountable for model outputs. |
|
Mitigation Strategies: | Developers should perform safety testing and tuning specific to their applications. |
|
|
Input Output |
Input Format: | Prompt dialogue template following OpenAI's chatml format. |
|
Accepted Modalities: | Text |
Output Format: | Text |
Performance Tips: | Use the official Llama2 system message for improved inference. |
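The chatml dialogue template named above can be sketched as follows; the role names follow OpenAI's chatml convention, while the helper name and the system-message text are placeholders, not the official Llama2 system message:

```python
# Sketch of a chatml prompt builder (helper name and messages are illustrative).
def to_chatml(messages):
    """Render a list of {role, content} dicts as a chatml prompt string."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    # A trailing assistant header cues the model to generate its reply.
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful assistant."},  # placeholder system message
    {"role": "user", "content": "Hello!"},
])
print(prompt)
```

Each turn is wrapped in `<|im_start|>role ... <|im_end|>` markers, and generation stops when the model emits the next `<|im_end|>` token.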
|
|