| Field | Value |
| --- | --- |
| Model Type | Auto-regressive language model based on the transformer architecture |
| **Use Cases** | |
| Primary Use Cases | Research on large language models; exploring applications such as question answering and reading comprehension; evaluating and mitigating biases; determining the capabilities and limitations of models |
| Limitations | The base model is not suitable for downstream applications without further risk evaluation and mitigation |
| **Supported Languages** | en (high proficiency); 20 languages were included in training, but the model mainly supports English |
| **Training Details** | |
| Data Sources | CCNet, C4, GitHub, Wikipedia, Books, ArXiv, Stack Exchange |
| Data Volume | 1T tokens, with different data mixtures for the different model sizes |
| Model Architecture | |
| **Responsible AI Considerations** | |
| Fairness | The model reflects biases present in its web-sourced training data. Evaluated bias dimensions include gender, religion, race, sexual orientation, age, nationality, disability, physical appearance, and socioeconomic status. |
| Transparency | The model was trained on web-sourced data, which may contain biased and harmful content. |
| Accountability | Questions and comments can be raised through the GitHub repository. |
| Mitigation Strategies | Training data was filtered based on its proximity to Wikipedia text, using a Kneser-Ney language model and a fastText linear classifier. |
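The filtering idea in the mitigation strategy can be sketched roughly: score each web page for how "reference-like" (Wikipedia-like) it is with a linear text classifier, and keep only pages above a threshold. The actual pipeline used fastText over CCNet data; the minimal stand-in below uses a hashed bag-of-words logistic regression instead, and every function name, threshold, and training text here is a toy assumption, not the real pipeline.

```python
import re
import zlib
from math import exp

# Illustrative sketch only: the real pipeline scored pages with a fastText
# linear classifier trained to recognize Wikipedia-like text. A tiny hashed
# bag-of-words logistic regression stands in for fastText here; all texts,
# names, and the 0.5 threshold are toy assumptions.

DIM = 2 ** 12  # size of the hashed feature space

def featurize(text):
    """Hashed bag-of-words counts over lowercase word tokens."""
    vec = {}
    for tok in re.findall(r"[a-z']+", text.lower()):
        idx = zlib.crc32(tok.encode()) % DIM  # deterministic hashing trick
        vec[idx] = vec.get(idx, 0.0) + 1.0
    return vec

def train(examples, epochs=20, lr=0.5):
    """SGD logistic regression; examples are (text, label) pairs, where
    label 1 means 'reference-like' (keep) and 0 means 'junk' (drop)."""
    w, b = [0.0] * DIM, 0.0
    for _ in range(epochs):
        for text, y in examples:
            x = featurize(text)
            z = b + sum(w[i] * v for i, v in x.items())
            p = 1.0 / (1.0 + exp(-z))
            g = p - y  # gradient of the log-loss w.r.t. z
            b -= lr * g
            for i, v in x.items():
                w[i] -= lr * g * v
    return w, b

def wiki_likeness(model, text):
    """Probability that a page is reference-like under the toy model."""
    w, b = model
    z = b + sum(w[i] * v for i, v in featurize(text).items())
    return 1.0 / (1.0 + exp(-z))

# Toy training set: Wikipedia-style prose vs. spammy web boilerplate.
labeled = [
    ("the city was founded in the valley and is the capital of the region", 1),
    ("the species is native to the eastern coast and was first described", 1),
    ("click here buy now free free free winner claim your prize", 0),
    ("subscribe like share follow click click click now", 0),
]
model = train(labeled)

pages = [
    "the river flows through the region and was first mapped by surveyors",
    "free prize winner click now claim claim your free prize",
]
kept = [p for p in pages if wiki_likeness(model, p) > 0.5]
```

A linear classifier is a natural choice at this scale: it is cheap enough to score billions of pages, which is why fastText-style models are common for corpus-level quality filtering.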