| Training Details | |
|---|---|
| Data Sources | Saxo/ko_cn_translation_tech_social_science_linkbricks_single_dataset, Saxo/ko_jp_translation_tech_social_science_linkbricks_single_dataset, Saxo/en_ko_translation_tech_science_linkbricks_single_dataset_with_prompt_text_huggingface, Saxo/en_ko_translation_social_science_linkbricks_single_dataset_with_prompt_text_huggingface, Saxo/ko_aspect_sentiment_sns_mall_sentiment_linkbricks_single_dataset_with_prompt_text_huggingface, Saxo/ko_summarization_linkbricks_single_dataset_with_prompt_text_huggingface, Saxo/OpenOrca_cleaned_kor_linkbricks_single_dataset_with_prompt_text_huggingface, Saxo/ko_government_qa_total_linkbricks_single_dataset_with_prompt_text_huggingface_sampled, Saxo/ko-news-corpus-1, Saxo/ko-news-corpus-2, Saxo/ko-news-corpus-3, Saxo/ko-news-corpus-4, Saxo/ko-news-corpus-5, Saxo/ko-news-corpus-6, Saxo/ko-news-corpus-7, Saxo/ko-news-corpus-8, Saxo/ko-news-corpus-9, maywell/ko_Ultrafeedback_binarized, youjunhyeok/ko-orca-pair-and-ultrafeedback-dpo, lilacai/glaive-function-calling-v2-sharegpt, kuotient/gsm8k-ko (see the loading sketch below the table) |
| Data Volume | 10 million-document Korean news corpus |
| Methodology | CPT (Continued Pre-Training) → SFT → DPO (see the pipeline sketch below) |
| Context Length | |
| Hardware Used | |
| Model Architecture | Uses Mistral-Nemo-Instruct-2407 as the base model; the base model's tokenizer is reused without vocabulary expansion (see the tokenizer check below) |
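
Every dataset listed above is hosted on the Hugging Face Hub, so each can be pulled with the `datasets` library. A minimal loading sketch, assuming a standard `train` split (check each dataset card for its actual splits and column schema):

```python
from datasets import load_dataset

# Assumption: "train" is the split name; the real split and column names
# come from the individual dataset card.
ds = load_dataset("kuotient/gsm8k-ko", split="train")
print(ds[0])  # inspect one record to see the column layout
```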
|
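The CPT → SFT → DPO recipe maps naturally onto the TRL library's trainers. The sketch below is illustrative only: the dataset choices, split names, and hyperparameters are assumptions; stage 1 (CPT) is the same causal-LM objective run over the raw news corpus and is omitted for brevity; and exact trainer signatures vary across TRL versions. It is not the authors' actual configuration.

```python
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer, DPOConfig, DPOTrainer

BASE = "mistralai/Mistral-Nemo-Instruct-2407"

# Stage 2: supervised fine-tuning on instruction data. SFTTrainer expects a
# "text" or "messages" column; format the dataset accordingly.
sft = SFTTrainer(
    model=BASE,  # recent TRL versions accept a model id string
    train_dataset=load_dataset(
        "Saxo/ko_summarization_linkbricks_single_dataset_with_prompt_text_huggingface",
        split="train",  # assumption: check the dataset card
    ),
    args=SFTConfig(output_dir="stage2-sft", max_steps=1000),
)
sft.train()
sft.save_model("stage2-sft")  # hand the SFT checkpoint to the DPO stage

# Stage 3: direct preference optimization on (prompt, chosen, rejected) pairs.
dpo = DPOTrainer(
    model="stage2-sft",
    train_dataset=load_dataset("maywell/ko_Ultrafeedback_binarized", split="train"),
    args=DPOConfig(output_dir="stage3-dpo", max_steps=1000),
)
dpo.train()
```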
|
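Because the tokenizer is inherited unchanged, the fine-tuned model segments Korean text exactly as the Mistral-Nemo-Instruct-2407 stock (Tekken) tokenizer does. A quick check against the base tokenizer:

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("mistralai/Mistral-Nemo-Instruct-2407")
print(len(tok))                        # vocabulary size, unchanged by fine-tuning
print(tok.tokenize("한국어 뉴스 요약"))  # Korean handled by the stock vocabulary
```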