| | |
|---|---|
| Model Type | Large Language Model (Llama architecture) |
| Use Cases | |
| Areas | |
| Applications | |
| Limitations | Will output blatantly wrong information at times; may generate inappropriate content; not recommended for production use |
| Considerations | Consider further fine-tuning and preference optimization before use (a preference-optimization sketch follows at the end of this section) |
| Additional Notes | The model is intended primarily for experimentation and benchmarking rather than production use. It primarily produces correct German. |
| Supported Languages | German |
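
Since the card flags the model as intended for experimentation and benchmarking, a quick smoke test of its German output is a natural first step. Below is a minimal `transformers` generation sketch; the model ID is a placeholder (the card does not name the repository) and the sampling parameters are arbitrary defaults.

```python
# Minimal generation sketch for experimentation/benchmarking (not production).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "your-org/your-german-llama"  # placeholder: the card does not name the repo
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)

# A short German prompt, matching the card's note that output is mostly German.
prompt = "Die Hauptstadt von Deutschland ist"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
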
| Training Details | |
|---|---|
| Data Sources | devngho/culturax-mini-nonshuffled, maxidl/FineNews-unfiltered, djstrong/oscar-small, LemiSt/gutenberg_de, almanach/HALvest, wikimedia/wikipedia, D4ve-R/terra-xplain-cc-de |
| Data Volume | About 6 billion German-language tokens |
| Methodology | Full fine-tuning with axolotl |
| Context Length | |
| Training Time | |
| Model Architecture | Llama |
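All of the data sources listed above are Hugging Face dataset repositories, so any of them can be inspected with the `datasets` library. Here is a sketch using streaming so nothing is downloaded in full; the `wikimedia/wikipedia` config name is an assumption (a German snapshot) and should be checked against the dataset page.

```python
# Stream a few examples from one of the listed training sources.
from datasets import load_dataset

# wikimedia/wikipedia requires a config name; "20231101.de" (a German dump)
# is an assumption -- the card does not say which snapshot was used.
wiki_de = load_dataset("wikimedia/wikipedia", "20231101.de", split="train", streaming=True)

for i, example in enumerate(wiki_de):
    print(example["text"][:200].replace("\n", " "))
    if i >= 2:
        break
```
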
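The Considerations row recommends further fine-tuning and preference optimization before use. One common route for the latter is DPO via the `trl` library; the sketch below is a shape, not a recipe: the model and preference-dataset IDs are placeholders, the dataset is assumed to have `prompt`/`chosen`/`rejected` columns, and the trainer's tokenizer argument has been renamed across `trl` versions.

```python
# Hedged sketch of DPO-style preference optimization with TRL.
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "your-org/your-german-llama"  # placeholder
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Assumed pairwise preference set with "prompt", "chosen", "rejected" columns.
prefs = load_dataset("your-org/your-preference-pairs", split="train")  # placeholder

args = DPOConfig(output_dir="dpo-out", beta=0.1, per_device_train_batch_size=2)
trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=prefs,
    processing_class=tokenizer,  # called `tokenizer` in older trl releases
)
trainer.train()
```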