| | |
| --- | --- |
| Model Type | Auto-regressive language model with a transformer architecture |
| **Use Cases** | |
| Areas | Research; exploratory NLP tasks |
| Applications | Question answering, reading comprehension, natural language understanding |
| Primary Use Cases | Research on large language models; exploring potential applications |
| Limitations | Has not been trained with human feedback and can therefore generate toxic or offensive content |
| Considerations | This is a foundation model; it should not be used in downstream applications without further risk evaluation and mitigation |
| **Supported Languages** | Primary: English. Others: Spanish, French, German, Dutch, Italian, Portuguese, Russian, Chinese, etc. |
| **Training Details** | |
| Data Sources | CCNet, C4, GitHub, Wikipedia, Books, ArXiv, Stack Exchange |
| Data Volume | Approximately 1T tokens for the smaller models; 1.4T tokens for the larger models |
| Model Architecture | |
| **Responsible AI Considerations** | |
| Fairness | Expected to reflect biases present in its internet-sourced training data; evaluated on RAI datasets for various biases |
| Mitigation Strategies | Web data was filtered for proximity to Wikipedia text using a Kneser-Ney language model and a fastText linear classifier |
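The Wikipedia-proximity filtering described under Mitigation Strategies can be illustrated with a minimal sketch. This is not the original pipeline: scikit-learn's logistic regression stands in for the fastText linear classifier, the toy corpora, the `keep_page` helper, and the 0.5 threshold are all hypothetical, and the Kneser-Ney perplexity component is omitted.

```python
# Hypothetical sketch of classifier-based quality filtering: train a linear
# model to separate reference-quality text from generic web text, then keep
# only pages the classifier scores as reference-like.
# scikit-learn is used here as a stand-in for fastText.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression

# Toy stand-in corpora (a real pipeline would use pages referenced by
# Wikipedia as positives and random crawled pages as negatives).
reference_pages = [
    "The transformer architecture relies on self-attention mechanisms.",
    "Language models assign probabilities to sequences of tokens.",
    "Kneser-Ney smoothing improves n-gram probability estimates.",
]
web_pages = [
    "CLICK HERE to win a FREE prize now limited offer!!!",
    "buy cheap followers best price guaranteed today",
    "you will not believe this one weird trick",
]

# Bag-of-words features and a linear classifier over them.
vectorizer = TfidfVectorizer()
X = vectorizer.fit_transform(reference_pages + web_pages)
y = [1] * len(reference_pages) + [0] * len(web_pages)
clf = LogisticRegression().fit(X, y)

def keep_page(text: str, threshold: float = 0.5) -> bool:
    """Keep a page if the classifier rates it close to reference quality."""
    prob = clf.predict_proba(vectorizer.transform([text]))[0, 1]
    return bool(prob >= threshold)
```

In a production pipeline the classifier score would typically be combined with a perplexity score from the Kneser-Ney language model, and the threshold tuned per data source rather than fixed.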