| Field | Details |
| --- | --- |
| Model Type | |
| Use Cases | |
| Considerations | A context window of at least 16k tokens is recommended due to the model's long response lengths (see the loading sketch below this table). |
| Additional Notes | The model produces verbose, long-form reasoning responses, offering detailed step-by-step explanations of topics such as math proofs. |
| Supported Languages | en (English), de (German), fr (French), it (Italian), pt (Portuguese), hi (Hindi), es (Spanish), th (Thai) |
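
A minimal loading sketch honoring the 16k-context recommendation, assuming a GGUF build of the model served with llama-cpp-python; the file path is a placeholder, not a file named by this card.

```python
# Minimal sketch: load a local GGUF build with a 16k context window,
# assuming llama-cpp-python; the model path below is a placeholder.
from llama_cpp import Llama

llm = Llama(
    model_path="./model-q4_k_m.gguf",  # placeholder path to a local GGUF file
    n_ctx=16384,                       # at least 16k tokens, per the recommendation above
)
```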
|
Training Details

| Field | Details |
| --- | --- |
| Data Sources | leafspark/DetailedReflection-Claude-v3_5-Sonnet |
| Data Volume | 81 examples, each approximately 3,000 tokens |
| Methodology | Unsloth fine-tuning with LoRA rank 128, packing enabled, batch size 2, gradient accumulation steps 4, 3 epochs (30 steps); a training sketch follows this table |
| Context Length | |
| Training Time | |
| Hardware Used | |
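
A sketch of that training configuration, assuming the standard Unsloth + TRL workflow; the base model name, LoRA alpha, target modules, and dataset text field are not stated in this card and are illustrative assumptions only.

```python
# Illustrative Unsloth fine-tuning setup matching the hyperparameters above;
# base model, lora_alpha, target_modules, and dataset_text_field are assumptions.
from unsloth import FastLanguageModel
from datasets import load_dataset
from trl import SFTTrainer
from transformers import TrainingArguments

max_seq_length = 16384  # assumption: sized for the long reasoning traces

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Meta-Llama-3.1-8B",  # placeholder base model
    max_seq_length=max_seq_length,
    load_in_4bit=True,
)

model = FastLanguageModel.get_peft_model(
    model,
    r=128,           # LoRA rank, from the card
    lora_alpha=128,  # assumption: alpha is not stated in the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

dataset = load_dataset("leafspark/DetailedReflection-Claude-v3_5-Sonnet", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",  # assumption: depends on the dataset schema
    max_seq_length=max_seq_length,
    packing=True,               # packing enabled, per the card
    args=TrainingArguments(
        per_device_train_batch_size=2,  # batch size 2, per the card
        gradient_accumulation_steps=4,  # per the card
        num_train_epochs=3,             # 3 epochs (about 30 optimizer steps), per the card
        output_dir="outputs",
    ),
)
trainer.train()
```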
|
Input Output

| Field | Details |
| --- | --- |
| Input Format | Prompts should begin with the model's opening reasoning tag and use nested XML tags for the reasoning process and response generation (see the example prompt below this table). |
| Accepted Modalities | |
| Output Format | |
| Performance Tips | Use the recommended sampling parameters (temperature 0.15, min-p 0.2, top-k 50, top-p 1, frequency penalty 0.5, presence penalty 0.1) for coherent responses; a request sketch follows the example prompt. |
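
A hypothetical illustration of the nested-XML prompting style. The tag names below are assumptions drawn from Reflection-style datasets; this card does not name the actual tags.

```python
# Hypothetical prompt setup for the nested-XML reasoning style; the tag names
# (<thinking>, <reflection>, <output>) are assumptions, not confirmed by this card.
system_prompt = (
    "Reason step by step inside <thinking> tags, self-check inside nested "
    "<reflection> tags, then place the final answer inside <output> tags."
)
user_prompt = "Prove that the sum of two even integers is even."
```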
|
|
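A minimal request sketch applying the recommended sampling parameters, assuming the model is served behind an OpenAI-compatible endpoint (e.g., a vLLM or llama.cpp server); the base URL and model name are placeholders, and `min_p`/`top_k` ride in `extra_body` because the OpenAI client does not expose them directly.

```python
# Sketch of a chat request with the recommended sampling parameters, assuming
# an OpenAI-compatible local server; base_url and model name are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8000/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="local-model",  # placeholder model name
    messages=[{"role": "user", "content": "Prove that the sum of two even integers is even."}],
    temperature=0.15,
    top_p=1.0,
    frequency_penalty=0.5,
    presence_penalty=0.1,
    extra_body={"min_p": 0.2, "top_k": 50},  # non-standard params; vLLM-style servers accept these
)
print(response.choices[0].message.content)
```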