| Model Type | text-generation, multimodal |
| --- | --- |
| Additional Notes | Trained on unfiltered internet data; outputs may contain objectionable content. |
| Supported Languages | en (high), zh (high), ja (high), de (high) |

Training Details

| Data Sources | Synthetic dataset generated using large context windows, retrieval-augmented generation, and knowledge graph integration |
| --- | --- |
| Data Volume | |
| Methodology | Fine-tuning on the synthetic dataset |
| Context Length | |
| Training Time | < 1 day on 16 nodes of 8×A100-80G |
| Hardware Used | 16 nodes of 8×A100-80G GPUs |

Input Output

| Input Format | Accepts text and image modalities. |
| --- | --- |
| Accepted Modalities | text, image |
| Performance Tips | Use a standardized inference implementation to avoid performance degradation. To reduce hallucinations, set top_p=0.8 with temperature=0.3 (or temperature=0.2). |
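The recommended decoding settings above combine nucleus (top-p) sampling with a low temperature. As an illustration of what these two parameters control (not the model's actual inference code), here is a minimal, self-contained sketch of temperature-scaled top-p sampling; the function name `sample_top_p` and the toy logits are hypothetical:

```python
import math
import random

def sample_top_p(logits, top_p=0.8, temperature=0.3, rng=None):
    """Illustrative nucleus sampling: temperature-scale the logits,
    keep the smallest set of tokens whose cumulative probability
    reaches top_p, then sample from that truncated set."""
    rng = rng or random.Random(0)
    # Temperature scaling: values < 1 sharpen the distribution,
    # which is why a low temperature reduces unlikely (hallucinated) picks.
    scaled = [l / temperature for l in logits]
    # Numerically stable softmax over the scaled logits.
    m = max(scaled)
    exps = [math.exp(l - m) for l in scaled]
    total = sum(exps)
    probs = sorted(
        ((i, e / total) for i, e in enumerate(exps)),
        key=lambda pair: pair[1],
        reverse=True,
    )
    # Nucleus truncation: keep the top tokens until their mass >= top_p.
    kept, cum = [], 0.0
    for i, p in probs:
        kept.append((i, p))
        cum += p
        if cum >= top_p:
            break
    # Sample proportionally within the kept nucleus.
    z = sum(p for _, p in kept)
    r = rng.random() * z
    for i, p in kept:
        r -= p
        if r <= 0:
            return i
    return kept[-1][0]
```

With temperature=0.3, a clearly dominant logit ends up carrying nearly all the probability mass, so the nucleus often contains a single token and decoding becomes close to greedy; raising top_p or temperature widens the nucleus and increases diversity.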