Model Type:

Use Cases
- Areas: research, personal experimentation, proof of concept
- Limitations: The model may not operate correctly, as no separate verification was performed.

Additional Notes: This model includes a vocabulary expansion with Korean tokens. It was developed mainly for personal experimentation, not for commercial use.
|
Supported Languages: ko (Korean), en (English)
|
Training Details
- Data Sources: wikimedia/wikipedia, maywell/korean_textbooks, nampdn-ai/tiny-codes, Open-Orca/OpenOrca
- Data Volume:
- Methodology: Continued pre-training with vocabulary expansion
- Hardware Used:
- Model Architecture: SBERT architecture (BBPE)
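The vocabulary-expansion step above can be illustrated with a minimal, self-contained sketch: new Korean tokens are appended to the vocabulary and the embedding table is grown to match, with each new row initialized to the mean of the existing embeddings (a common heuristic before continued pre-training). All names, values, and the toy vocabulary here are hypothetical, not the model's actual tokenizer or weights.

```python
import random

def expand_vocab(vocab, embeddings, new_tokens):
    """Append new tokens to the vocab and grow the embedding table
    to match. New rows start at the mean of the existing rows, a
    common initialization before continued pre-training.
    (Illustrative sketch only, not the model's actual procedure.)"""
    mean = [sum(col) / len(embeddings) for col in zip(*embeddings)]
    for tok in new_tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)          # next free token id
            embeddings.append(list(mean))    # mean-initialized row
    return vocab, embeddings

# Toy base vocabulary with 4-dimensional random embeddings.
random.seed(0)
vocab = {"<unk>": 0, "hello": 1}
emb = [[random.random() for _ in range(4)] for _ in vocab]

vocab, emb = expand_vocab(vocab, emb, ["안녕", "하세요"])
print(len(vocab), len(emb))  # 4 4
```

In practice the same effect is achieved with a tokenizer's token-adding API followed by resizing the model's embedding matrix, after which continued pre-training lets the new rows learn useful representations.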
|
|
Input/Output
- Input Format: Tokenized Korean and English text
- Accepted Modalities:
- Output Format:
- Performance Tips: Consider further fine-tuning, such as instruction tuning or alignment tuning, to suit your use case.
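For the instruction-tuning route mentioned above, a typical first step is rendering each record into a single training string. The template and sample record below are hypothetical choices for illustration, not a format the model requires.

```python
def format_instruction(example):
    """Render one instruction-tuning record into a single training
    string. The "### Instruction / ### Response" template is a
    hypothetical example, not the model's required prompt format."""
    return (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Response:\n{example['output']}"
    )

# Hypothetical Korean instruction/response pair.
sample = {
    "instruction": "서울은 어느 나라의 수도인가요?",
    "output": "서울은 대한민국의 수도입니다.",
}
print(format_instruction(sample))
```

Strings produced this way can be tokenized and fed to a standard supervised fine-tuning loop; alignment tuning would instead require preference data on top of such pairs.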
|
|