Model Type | Transformer-based autoregressive language model |
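For context on the "autoregressive" descriptor above: such models factor the probability of a token sequence into per-token conditionals, each conditioned only on the preceding tokens:

$$ p(x_1, \ldots, x_T) = \prod_{t=1}^{T} p\left(x_t \mid x_1, \ldots, x_{t-1}\right) $$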
Use Cases
Areas: | |
Primary Use Cases: | Feature extraction, downstream task learning (see the sketch after this section) |
Limitations: | Not intended for direct deployment; possible biases and inaccuracies |

Additional Notes | The training data was not deduplicated before training, which may affect the integrity of the model's output. |
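As a minimal sketch of the feature-extraction use case, assuming a checkpoint loadable through the Hugging Face `transformers` causal-LM interface (the model identifier below is a placeholder, not taken from this card): the last layer's hidden states are pooled into one feature vector per input, which a downstream classifier or regressor can consume.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder identifier, not from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, output_hidden_states=True)
model.eval()

inputs = tokenizer("An example sentence to embed.", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)

# hidden_states[-1] has shape (batch, seq_len, hidden_size);
# mean-pooling over tokens yields one feature vector per input.
features = outputs.hidden_states[-1].mean(dim=1)
```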
Supported Languages | |
Training Details
Data Sources: | |
Data Volume: | |
Methodology: | Autoregressive (next-token) training using the GPT-NeoX library; a sketch of the objective follows this section |
Context Length: | |
Training Time: | Approx. 150,000 steps at 1538 sequences per step |
Model Architecture: | Similar to GPT-3; specifics are available in the paper. |
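To make the methodology row concrete, here is a minimal PyTorch sketch of the autoregressive (next-token cross-entropy) objective that libraries such as GPT-NeoX implement; it illustrates the general technique, not the library's actual training loop. The comment also works out the sequence count implied by the training schedule above.

```python
import torch
import torch.nn.functional as F

# Reported schedule: ~150,000 steps x 1538 sequences per step,
# i.e. roughly 150_000 * 1538 = 230,700,000 training sequences in total.

def causal_lm_loss(logits: torch.Tensor, input_ids: torch.Tensor) -> torch.Tensor:
    """Next-token cross-entropy: the prediction at position t is scored
    against the actual token at position t + 1."""
    shift_logits = logits[:, :-1, :]  # predictions for positions 0 .. T-2
    shift_labels = input_ids[:, 1:]   # targets: the tokens that come next
    return F.cross_entropy(
        shift_logits.reshape(-1, shift_logits.size(-1)),
        shift_labels.reshape(-1),
    )
```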
Safety Evaluation
Risk Categories: | |
Ethical Considerations: | Various biases related to gender, religion, and race, as discussed in the Pile paper. |
Responsible AI Considerations
Fairness: | See the Pile paper for a discussion of biases in the training data. |
Input/Output
Input Format: | Text prompts |
Accepted Modalities: | Text |
Output Format: | Autoregressive next-token predictions (see the generation sketch after this section) |
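A brief sketch of the input/output contract, again assuming a `transformers`-compatible checkpoint with a placeholder identifier: a text prompt goes in, and tokens come out one at a time, each conditioned on everything generated so far.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-model"  # placeholder identifier, not from this card
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

inputs = tokenizer("The quick brown fox", return_tensors="pt")
# Autoregressive decoding: each new token is predicted from all previous ones.
output_ids = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```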