Model Type: Transformer (causal language model)
Use Cases
Areas:
Primary Use Cases: Developing language models for low-resource languages
|
Limitations: Not suitable for human-facing interactions; not intended for deployment; limited to Brazilian Portuguese
|
Considerations: Users should conduct a risk and bias assessment before any real-world application
|
|
Additional Notes: Pre-trained model released under the Apache 2.0 license; comprehensive evaluations are available.
|
Supported Languages: Brazilian Portuguese
Training Details
Data Sources: Pt-Corpus Instruct (6.2B tokens)
|
Data Volume: 6.2 billion tokens
Methodology: Transformer-based model pre-trained via causal language modeling
|
Context Length:
Training Time:
Hardware Used:
Model Architecture:
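As an illustration of the causal language modeling objective named in the Methodology field (this is a generic sketch, not the model's released training code, and the function name is hypothetical): each position predicts the next token from its left context only, and training minimizes the average cross-entropy of those predictions.

```python
import math

def causal_lm_loss(token_ids, next_token_logits):
    """Average cross-entropy of next-token prediction (causal LM objective).

    token_ids: list of int token ids, length T.
    next_token_logits: list of T-1 logit vectors; next_token_logits[t]
        scores the token at position t+1 given only tokens 0..t, which
        is what makes the setup "causal".
    """
    total = 0.0
    for t, logits in enumerate(next_token_logits):
        target = token_ids[t + 1]
        # Numerically stable log-softmax of the target token's logit.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(x - m) for x in logits))
        total += log_z - logits[target]
    return total / len(next_token_logits)
```

With uniform logits over a vocabulary of size V, this loss is exactly log(V), the "no information" baseline; pre-training drives it below that.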
|
Input / Output
Input Format: Text, passed through the tokenizer for generation
|
Accepted Modalities:
Output Format:
Performance Tips: Tune the repetition penalty to avoid verbose or repetitive output
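The repetition-penalty tip can be made concrete. A common rule (the CTRL-style penalty used by popular generation libraries) rescales the logits of tokens that have already been generated so they become less likely to repeat; the sketch below illustrates that rule and is not this model's own code.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Discourage tokens that already appeared in the output.

    CTRL-style rule: a previously generated token's logit is divided
    by `penalty` if positive and multiplied by `penalty` if negative,
    so any penalty > 1.0 always lowers that token's probability.
    """
    out = list(logits)
    for tok in set(generated_ids):
        if out[tok] > 0:
            out[tok] /= penalty
        else:
            out[tok] *= penalty
    return out
```

A penalty of 1.0 is a no-op; values slightly above 1.0 (e.g. 1.1-1.3) are a typical starting point, while overly large values can suppress legitimate repeats such as common function words.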
|
|