Model Type | text-generation, instruction-tuned |

Use Cases |
Areas: | |
Applications: | Cross-Lingual Adaptation, Instruction Following |
Primary Use Cases: | Text Generation, Language Translation |
Limitations: | Not fine-tuned to align outputs with human intent and safety considerations |

Additional Notes | Developed by team members of TokyoTech-LLM, with acknowledgements to Meta Research for Llama 2. |

Supported Languages | Japanese (Proficient), English (Proficient) |

Training Details |
Data Sources: | OpenAssistant Conversations Dataset EN top-1 thread, OpenAssistant Conversations Dataset |
Methodology: | Supervised fine-tuning (SFT) |
Model Architecture: | Please refer to the Llama 2 technical report for details on the model architecture. |
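
The card names only the method, so as a loose illustration rather than the authors' recipe, the sketch below shows what supervised fine-tuning on the OpenAssistant data can look like with Hugging Face transformers and datasets. The base checkpoint, the English-language filter (a crude stand-in for the EN top-1 thread subset), and all hyperparameters are assumptions.

```python
# Minimal SFT sketch; checkpoint, filtering, and hyperparameters are
# illustrative assumptions, not the authors' actual training recipe.
from datasets import load_dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

BASE = "meta-llama/Llama-2-7b-hf"  # assumed base checkpoint

tokenizer = AutoTokenizer.from_pretrained(BASE)
tokenizer.pad_token = tokenizer.eos_token  # Llama tokenizers ship without a pad token
model = AutoModelForCausalLM.from_pretrained(BASE)

# oasst1 is the public ID of the OpenAssistant Conversations Dataset; filtering
# on the language tag is a crude stand-in for the "EN top-1 thread" subset.
data = load_dataset("OpenAssistant/oasst1", split="train")
data = data.filter(lambda ex: ex["lang"] == "en")

def tokenize(example):
    # Simplified: trains on individual messages rather than assembled threads.
    return tokenizer(example["text"], truncation=True, max_length=2048)

tokenized = data.map(tokenize, remove_columns=data.column_names)

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="swallow-sft-sketch",
                           per_device_train_batch_size=1,
                           num_train_epochs=1,
                           learning_rate=2e-5),
    train_dataset=tokenized,
    # mlm=False gives standard causal-LM labels (inputs shifted by one).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False),
)
trainer.train()
```
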
Input Output |
Input Format: | [INST] <<SYS>>
{SYSTEM_PROMPT}
<</SYS>>

{USER_MESSAGE} [/INST] |
Accepted Modalities: | |
Output Format: | |
Performance Tips: | Adhere strictly to the instruction format above to maintain performance. |
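
To make the format concrete, here is a short generation sketch that assembles the prompt exactly as documented above. The model ID is an assumption based on the instruct-hf variants named in the release notes, and the generation settings are illustrative.

```python
# Builds the documented prompt and generates a reply. The model ID below is
# assumed from the instruct-hf variants in the release notes; adjust as needed.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "tokyotech-llm/Swallow-7b-instruct-hf"  # assumed variant

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID,
                                             torch_dtype=torch.bfloat16,
                                             device_map="auto")

def build_prompt(system_prompt: str, user_message: str) -> str:
    # Mirrors the input format documented above, token for token.
    return f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"

prompt = build_prompt(
    system_prompt="You are a helpful bilingual assistant.",
    user_message="Translate into Japanese: The weather is nice today.",
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(**inputs, max_new_tokens=128)
# Strip the prompt tokens and print only the newly generated reply.
print(tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:],
                       skip_special_tokens=True))
```
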
Release Notes |
Version: | |
Date: | |
Notes: | Release of enhanced instruction-tuned models as preview versions. |

Version: | |
Date: | |
Notes: | Trained with approximately twice as many Japanese tokens. |

Version: | |
Date: | |
Notes: | Release of a model without vocabulary expansion. |

Version: | |
Date: | |
Notes: | Release of various instruct-hf models as well as models without vocabulary expansion. |

Version: | |
Date: | |
Notes: | Initial release of the Swallow 7b, 13b, and 70b instruct-hf variants. |