Model Type | GPT, pre-trained, instruction-based |

Use Cases |
Areas: | Research, Commercial Applications
Applications: | General Language Modeling, Domain-Specific Instruction Modeling
Primary Use Cases: | Domain adaptation in finance and biomedicine; synthesizing instruction-response pairs
Limitations: | No specific finance data is used due to ethical concerns

Additional Notes | Demonstrates the effectiveness of supervised multitask pre-training using instruction-response pairs (a minimal formatting sketch follows below). |

Supported Languages | |
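As a rough illustration of the supervised multitask pre-training idea noted above, the sketch below assembles a single training sequence from a raw passage and its instruction-response pairs. The template, field names, and sample text are illustrative assumptions, not the exact format used to train this model.

```python
# Illustrative sketch only: the template and field names below are assumptions,
# not the exact pre-training format used for this model.

def build_pretraining_example(passage: str, pairs: list[dict[str, str]]) -> str:
    """Append instruction-response pairs to the raw passage they were derived from."""
    parts = [passage]
    for pair in pairs:
        parts.append(f"Instruction: {pair['instruction']}\nResponse: {pair['response']}")
    return "\n\n".join(parts)


example = build_pretraining_example(
    "The central bank held interest rates steady for a third consecutive quarter...",
    [
        {
            "instruction": "Summarize the passage in one sentence.",
            "response": "Interest rates were left unchanged for a third straight quarter.",
        }
    ],
)
print(example)
```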
Training Details |
Data Sources: | tiiuae/falcon-refinedweb, instruction-pretrain/ft-instruction-synthesizer-collection, instruction-pretrain/general-instruction-augmented-corpora (see the loading sketch below)
Data Volume: | |
Methodology: | Supervised multitask pre-training using instruction-response pairs
Model Architecture: | Instruction-based GPT model
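The snippet below is a minimal, hedged example of streaming one of the raw-text sources listed above with the Hugging Face `datasets` library. The "content" column follows falcon-refinedweb's published schema; the two instruction-augmented collections define their own fields, so inspect a sample record before relying on specific column names.

```python
# Minimal streaming example for the raw-text corpus listed under Data Sources.
# Assumes the `datasets` library is installed; the "content" column follows
# falcon-refinedweb's schema. The instruction-augmented collections use their
# own schemas, so print a record first to check field names.
from datasets import load_dataset

raw_corpus = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)

for i, record in enumerate(raw_corpus):
    print(record["content"][:200])  # preview the start of each document
    if i == 2:  # stop after a few records
        break
```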
Release Notes |
Version: |
Date: |
Notes: | Paper accepted at the EMNLP 2024 main conference.

Version: |
Date: |
Notes: | Updated the FAQ on continual pre-training from Llama3.

Version: |
Date: |
Notes: | Updated guidelines on domain-specific task evaluation.

Version: |
Date: |
Notes: | Scaled up pre-training to 250B tokens, with 500M instruction-response pairs.

Version: |
Date: |
Notes: | Released the paper, code, and resources.