| Field | Details |
|---|---|
| Model Type | GPT, pre-trained, instruction-based |
| **Use Cases** | |
| Areas | Research, Commercial Applications |
| Applications | General Language Modeling, Domain-Specific Instruction Modeling |
| Primary Use Cases | Domain adaptation in finance and biomedicine; synthesizing instruction-response pairs (see the sketch below) |
| Limitations | Specific finance data is excluded due to ethical concerns |
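The instruction-synthesis use case above can be exercised with a standard Hugging Face `transformers` generation loop. The sketch below is a minimal illustration only: the checkpoint id and the raw-text-in, pairs-out prompt format are assumptions, not the official recipe.

```python
# Minimal sketch: using a causal LM from the instruction-pretrain family to turn
# raw domain text into instruction-response pairs. The checkpoint name and the
# prompt layout below are illustrative assumptions, not the official recipe.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "instruction-pretrain/instruction-synthesizer"  # assumed checkpoint id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

raw_text = "The Basel III framework raises minimum capital requirements for banks ..."

# Assumed usage: feed the raw text and let the model continue with
# instruction-response pairs grounded in that text.
inputs = tokenizer(raw_text + "\n\n", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=False)

# Everything generated after the prompt is the synthesized instruction-response content.
prompt_len = inputs["input_ids"].shape[1]
print(tokenizer.decode(outputs[0][prompt_len:], skip_special_tokens=True))
```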
| Field | Details |
|---|---|
| Additional Notes | Demonstrates the effectiveness of supervised multitask pre-training using instruction-response pairs. |
| Supported Languages | |
| **Training Details** | |
| Data Sources | tiiuae/falcon-refinedweb, instruction-pretrain/ft-instruction-synthesizer-collection, instruction-pretrain/general-instruction-augmented-corpora |
| Data Volume | 100B tokens |
| Methodology | Supervised multitask pre-training using instruction-response pairs (see the sketch below) |
| Model Architecture | Instruction-based GPT model |
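In supervised multitask pre-training, raw corpus documents are augmented with instruction-response pairs and the concatenation is trained with an ordinary causal language-modeling objective. The sketch below shows one plausible way to build such an augmented example; the separator text and field names are assumptions for illustration, not the exact template used for these data sources.

```python
# Minimal sketch of instruction-augmented pre-training data: each raw document
# is followed by instruction-response pairs synthesized from it, and the joined
# string is used as a normal next-token-prediction example. The "Question:" /
# "Answer:" markers are assumed formatting, not the official template.
def build_pretraining_example(raw_text: str, qa_pairs: list) -> str:
    parts = [raw_text]
    for pair in qa_pairs:
        parts.append(f"Question: {pair['instruction']}")
        parts.append(f"Answer: {pair['response']}")
    return "\n\n".join(parts)

example = build_pretraining_example(
    raw_text="Falcon-RefinedWeb is a large filtered and deduplicated web corpus ...",
    qa_pairs=[
        {"instruction": "What is Falcon-RefinedWeb?",
         "response": "A large, filtered and deduplicated web corpus used for pre-training."},
    ],
)
print(example)  # fed to the LM with a standard causal language-modeling loss
```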
| Date | Release Notes |
|---|---|
| 2024-09-20 | Paper accepted at EMNLP 2024 main conference. |
| 2024-09-11 | Updated FAQ on continual pre-training from Llama3. |
| 2024-08-29 | Updated guidelines on domain-specific task evaluation. |
| 2024-07-31 | Scaled up pre-trained tokens to 250B, with 500M instruction-response pairs. |
| 2024-06-21 | Released paper, code, and resources. |