Model Type: text-generation, decoder-only

Use Cases

Areas:
Applications: Text Generation, Prompt-based Evaluation

Primary Use Cases: Text generation with a causal language modeling (CLM) objective (see the example after this section)

Limitations: High likelihood of biased generations, along with quality issues such as hallucination and low output diversity

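A minimal sketch of prompt-based generation via the Hugging Face transformers pipeline; the checkpoint id facebook/opt-125m (the smallest released size) stands in for any OPT checkpoint:

```python
# Prompt-based generation sketch; any released OPT size can be
# substituted for "facebook/opt-125m".
from transformers import pipeline, set_seed

set_seed(32)  # make sampling reproducible
generator = pipeline("text-generation", model="facebook/opt-125m")

# Condition the model on a prompt and sample a continuation.
print(generator("Hello, I am conscious and", do_sample=True, max_new_tokens=30))
```
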
Additional Notes: OPT models aim to enable reproducible and responsible research.

Supported Languages: Predominantly English, with a small amount of non-English text present in the training corpus.

Training Details

Data Sources: BookCorpus, CC-Stories, The Pile, Pushshift.io Reddit dataset, CCNewsV2

Data Volume: roughly 180B tokens of pre-training data (per the OPT paper)
Methodology: Causal Language Modeling (CLM); see the loss sketch after this section

Context Length: 2048 tokens
Training Time:
Hardware Used: 992 80GB NVIDIA A100 GPUs for the 175B model (per the OPT paper)
Model Architecture: Decoder-only transformer, similar to GPT-3; see the config sketch after this section

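As a sketch of the CLM methodology above: in transformers, passing the input ids as labels makes the model shift them internally and return the average next-token cross-entropy, which is the CLM training loss. The checkpoint id is illustrative:

```python
# CLM objective sketch: with labels = input_ids, the model shifts labels
# one position right internally and computes next-token cross-entropy.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-125m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-125m")

batch = tokenizer("OPT was trained to predict the next token.", return_tensors="pt")
outputs = model(**batch, labels=batch["input_ids"])
print(outputs.loss)  # mean negative log-likelihood of each next token
```
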
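And for the architecture row, the published config can be inspected to confirm the decoder-only setup (attribute names follow the transformers OPT implementation):

```python
# Inspect the released configuration of the smallest OPT checkpoint.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("facebook/opt-125m")
print(config.model_type)               # "opt"
print(config.num_hidden_layers)        # number of decoder layers
print(config.num_attention_heads)      # attention heads per layer
print(config.max_position_embeddings)  # 2048-token context window
```
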
Responsible AI Considerations

Mitigation Strategies: The training data is largely unfiltered internet text, so the model can reproduce the biases it contains; the models are released openly so the research community can study these harms and develop mitigations.

Input / Output

Input Format: Sequences of 2048 consecutive tokens, tokenized with the GPT-2 byte-level BPE tokenizer (vocabulary size 50272)

Accepted Modalities: Text
Output Format: Generated text continuing the input prompt

Performance Tips: Call the model's generate() method directly for better performance with large checkpoints (see the sketch below)

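Following the performance tip above, a sketch that tokenizes with the bundled GPT-2 byte-level BPE tokenizer and calls generate() directly instead of going through the pipeline; the checkpoint id is again illustrative:

```python
# Tokenize with OPT's GPT-2 byte-level BPE tokenizer and call generate()
# directly, as recommended for larger checkpoints.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("facebook/opt-350m")
model = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")

inputs = tokenizer("The advantages of openly released models include", return_tensors="pt")
output_ids = model.generate(**inputs, do_sample=True, max_new_tokens=30)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```
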
Release Notes

Version:
Date: May 2022
Notes: Initial release, covering model sizes from 125M to 175B parameters.