Use Cases |
Areas: | General-purpose chat, Assistance with writing and coding, Live chat agent, In-game NPC interactions |
|
Applications: | Conversational agents, Programming assistance, Customer support, Gaming |
|
Primary Use Cases: | Interactive dialogue, Text generation |
|
Limitations: | Potential for biases, Limited context window of 8192 tokens, Possibility of errors |
|
Considerations: | Users should verify critical information and be aware of potential biases. |
|
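The 8192-token context window listed under Limitations means long conversations have to be trimmed before they are sent to the model. The following is a minimal sketch of one way to do that, not the model's own tooling: `trim_history`, `approx_token_count`, and the 512-token reply reserve are hypothetical names and values chosen for illustration, and the whitespace word count is only a stand-in for the model's real tokenizer, which would give different counts.

```python
from typing import Callable, List

CONTEXT_WINDOW = 8192  # token limit stated in the Limitations row


def approx_token_count(text: str) -> int:
    """Hypothetical stand-in for a real tokenizer: counts whitespace-separated words."""
    return len(text.split())


def trim_history(
    messages: List[str],
    reserve_for_reply: int = 512,  # assumed budget left free for the model's answer
    count_tokens: Callable[[str], int] = approx_token_count,
    context_window: int = CONTEXT_WINDOW,
) -> List[str]:
    """Keep the most recent messages that fit in the window minus the reply budget."""
    budget = context_window - reserve_for_reply
    kept: List[str] = []
    used = 0
    # Walk from newest to oldest so the latest turns are preserved first.
    for message in reversed(messages):
        cost = count_tokens(message)
        if used + cost > budget:
            break
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

In practice the `count_tokens` argument would be replaced by a function that calls the tokenizer shipped with the model, since word counts usually underestimate token counts.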
|
Training Details |
Data Sources: | darkcloudai-smallmodel-frontieredition, darkcloudai-webdriver-redditcrawl-2023, darkcloudai-unalignment-truthfulness, darkcloudai-generaldpo, ai2_arc, allenai/ultrafeedback_binarized_cleaned, argilla/distilabel-intel-orca-dpo-pairs, jondurbin/airoboros-3.2, codeparrot/apps, facebook/belebele, bluemoon-fandom-1-1-rp-cleaned, boolq, camel-ai/biology, camel-ai/chemistry, camel-ai/math, camel-ai/physics, jondurbin/contextual-dpo-v0.1, jondurbin/gutenberg-dpo-v0.1, jondurbin/py-dpo-v0.1, jondurbin/truthy-dpo-v0.1, LDJnr/Capybara, jondurbin/cinematika-v0.1, WizardLM/WizardLM_evol_instruct_70k, glaiveai/glaive-function-calling-v2, grimulkan/LimaRP-augmented, lmsys/lmsys-chat-1m, ParisNeo/lollms_aware_dataset, TIGER-Lab/MathInstruct, Muennighoff/natural-instructions, openbookqa, kingbri/PIPPA-shareGPT, piqa, Vezora/Tested-22k-Python-Alpaca, ropes, cakiki/rosetta-code, Open-Orca/SlimOrca, b-mc2/sql-create-context, squad_v2, mattpscott/airoboros-summarization, migtissera/Synthia-v1.3, unalignment/toxic-dpo-v0.2, WhiteRabbitNeo/WRN-Chapter-1, WhiteRabbitNeo/WRN-Chapter-2, winogrande |
|
Methodology: | Supervised Fine-Tuning (SFT) combined with Direct Preference Optimization (DPO) and Reinforcement Learning from AI Feedback (RLAIF) |
|
Context Length: | 8192 tokens |
Model Architecture: | |
|
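The Methodology row names Direct Preference Optimization but gives no training details. As a rough illustration of what DPO optimizes, the sketch below computes the standard DPO loss for a single preference pair from summed log-probabilities. It is not this model's training code: the function name `dpo_loss`, the scalar inputs, and `beta=0.1` are assumptions made for the example.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def dpo_loss(
    policy_chosen_logp: float,
    policy_rejected_logp: float,
    ref_chosen_logp: float,
    ref_rejected_logp: float,
    beta: float = 0.1,  # assumed value; the source does not state one
) -> float:
    """DPO loss for one (chosen, rejected) pair.

    Each argument is the summed log-probability of a full response under
    either the policy being trained or the frozen reference model.
    """
    # How much more the policy likes each response than the reference does.
    chosen_margin = policy_chosen_logp - ref_chosen_logp
    rejected_margin = policy_rejected_logp - ref_rejected_logp
    logits = beta * (chosen_margin - rejected_margin)
    # -log(sigmoid(logits)) is the same as softplus(-logits).
    return softplus(-logits)
```

The loss falls as the policy raises the chosen response's likelihood relative to the rejected one, measured against the reference model. When the policy and reference agree exactly, the loss equals log 2.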