Model Type | Transformer-style autoregressive language model

Use Cases |
Areas: | research, commercial applications
Limitations: | The model can easily be prompted to generate harmful or biased content, and many generated statements of fact may be inaccurate.

Additional Notes | OLMo models are available with open training and evaluation code (a minimal loading sketch follows below).

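Since the training and evaluation code is open, the weights can be loaded with standard tooling. The snippet below is a minimal generation sketch, assuming the checkpoint is hosted on the Hugging Face Hub and that `transformers` and `torch` are installed; the repository id `allenai/OLMo-7B-hf` is an assumption, so substitute the id of the release you actually use.

```python
# Minimal text-generation sketch with Hugging Face transformers.
# NOTE: the Hub id below is an assumption, not confirmed by this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B-hf"  # assumed repository id
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Encode a prompt, sample a continuation, and decode it back to text.
inputs = tokenizer("Language modeling is ", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64, do_sample=True, top_p=0.95)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
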
Supported Languages |

Training Details |
Data Sources: |
Data Volume: | 2.5 trillion tokens for OLMo 7B
Context Length: |
Hardware Used: | AMD MI250X GPUs on the LUMI supercomputer; NVIDIA A100-40GB GPUs provided by MosaicML
Model Architecture: | OLMo 7B: d_model 4096, 32 attention heads, 32 layers, SwiGLU activation (reported alongside peer models for comparison; a rough parameter estimate is sketched below)

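To make the listed shape concrete, here is a back-of-the-envelope parameter count derived from the hyperparameters above. The feed-forward width and vocabulary size are not stated in this card and are assumptions (SwiGLU blocks are commonly sized near 8/3 × d_model), so the result is only a rough check against the "7B" in the model name.

```python
# Rough parameter estimate from the architecture values listed above.
d_model = 4096      # hidden size (from the card)
num_layers = 32     # transformer blocks (from the card)
ffn_dim = 11008     # ASSUMED SwiGLU inner width (~8/3 * d_model, rounded)
vocab_size = 50304  # ASSUMED vocabulary size (placeholder, not from the card)

attn_params = 4 * d_model * d_model   # Q, K, V, and output projections, no biases
mlp_params = 3 * d_model * ffn_dim    # SwiGLU uses gate, up, and down projections
embed_params = vocab_size * d_model   # token embeddings (assumed tied with the output head)

total = num_layers * (attn_params + mlp_params) + embed_params
print(f"~{total / 1e9:.1f}B parameters")  # prints ~6.7B, in the ballpark of "OLMo 7B"
```
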
Input Output |
Input Format: |
Accepted Modalities: |
Output Format: |

Release Notes |
Version: |
Notes: | Core OLMo 7B release with model details, performance, and usage guidelines.