| Training Details | |
| --- | --- |
| Data Sources | ai2_arc, airoboros, apps, belebele, bluemoon, boolq, camel-ai biology, camel-ai chemistry, camel-ai math, camel-ai physics, capybara, cinematika, emobank, evol-instruct, glaive-function-calling-v2, gutenberg, limarp-augmented, lmsys_chat_1m, lollms, mathinstruct, natural_instructions, openbookqa, pippa, piqa, python_alpaca, ropes, rosetta_code, slimorca, sql-create-context, squad_v2, airoboros-summarization, synthia, whiterabbitneo chapter 1, whiterabbitneo chapter 2, winogrande, airoboros 3.2, contextual-dpo, helpsteer, distilabel_orca_dpo_pairs, gutenberg-dpo, py-dpo, toxic-dpo, truthy, ultrafeedback |
| Methodology | Fine-tuned with supervised fine-tuning (SFT) followed by direct preference optimization (DPO), using diverse datasets for instruction tuning and context-obedient question answering. |
| Hardware Used | |
|
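The DPO stage named above optimizes a preference objective over chosen/rejected response pairs. The following is a minimal illustrative sketch of the standard DPO loss for a single pair, not the model's actual training code; the function name and example log-probability values are hypothetical.

```python
import math

def dpo_loss(policy_chosen_logp, policy_rejected_logp,
             ref_chosen_logp, ref_rejected_logp, beta=0.1):
    """Standard DPO loss for one preference pair (illustrative only).

    Each argument is the summed log-probability of the chosen or
    rejected response under the policy or the frozen reference model.
    beta scales the implicit reward margin.
    """
    # Log-ratio of policy vs. reference for each response
    chosen_ratio = policy_chosen_logp - ref_chosen_logp
    rejected_ratio = policy_rejected_logp - ref_rejected_logp
    # Loss is -log sigmoid(beta * (chosen_ratio - rejected_ratio))
    margin = beta * (chosen_ratio - rejected_ratio)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# Hypothetical values: the policy favors the chosen response more than
# the reference does, so the margin is positive and loss drops below log(2).
loss = dpo_loss(-10.0, -14.0, -11.0, -13.0)
```

When the policy and reference agree exactly, the margin is zero and the loss equals log(2); training pushes the policy to widen the chosen/rejected gap relative to the reference.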