| Model Type | Transformer decoder (auto-regressive language model) |
|
Use Cases

| Primary Use Cases | Roleplaying, retrieval-augmented generation, function calling |
| Limitations | The model may amplify societal biases, return toxic responses, or produce inaccurate or otherwise unacceptable text. |
| Considerations | Validate that imported packages come from a trusted source to ensure end-to-end security. |
|
|
Training Details

| Methodology | Multi-stage SFT and preference-based alignment with NeMo Aligner |
| Context Length | |
| Model Architecture | Transformer decoder with Grouped-Query Attention (GQA) and Rotary Position Embeddings (RoPE); 40 layers, 32 attention heads |
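The architecture row above mentions Grouped-Query Attention, in which several query heads share a single key/value head to shrink the KV cache. A minimal NumPy sketch of that sharing follows; the card specifies 32 attention heads, but the number of K/V heads (8 below) and all tensor sizes are illustrative assumptions, not values from the card.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def grouped_query_attention(q, k, v):
    """q: (n_heads, seq, d); k, v: (n_kv_heads, seq, d).
    Each group of n_heads // n_kv_heads query heads shares one K/V head."""
    n_heads, seq, d = q.shape
    n_kv_heads = k.shape[0]
    group = n_heads // n_kv_heads
    # Repeat each K/V head so every query head has a matching K/V tensor.
    k = np.repeat(k, group, axis=0)
    v = np.repeat(v, group, axis=0)
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(d)
    return softmax(scores) @ v

# Illustrative sizes only: 32 query heads (per the card), 8 K/V heads (assumed).
rng = np.random.default_rng(0)
q = rng.standard_normal((32, 4, 16))
k = rng.standard_normal((8, 4, 16))
v = rng.standard_normal((8, 4, 16))
out = grouped_query_attention(q, k, v)
print(out.shape)  # one output vector per query head and position
```

With 8 K/V heads the cache holds a quarter of the key/value tensors that full multi-head attention would, which is the usual motivation for GQA.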
|
|
Safety Evaluation

| Methodologies | Garak automated LLM vulnerability scanner, AEGIS content safety evaluation, human content red teaming |
| Ethical Considerations | NVIDIA encourages working with the internal model team to ensure the model meets specific industry and use-case requirements. |
|
|
Input / Output

| Input Format | System {system prompt} User {prompt} Assistant\n |
| Performance Tips | The model may not perform optimally without the recommended prompt template. |
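Since the Performance Tips row warns that the model underperforms without the recommended template, a small helper that assembles the role-tagged prompt from the Input Format row can help avoid mistakes. This is a sketch: it uses only the System/User/Assistant role labels shown above, and any special delimiter tokens the tokenizer or serving stack may require are not specified in this card.

```python
def build_prompt(system_prompt: str, user_prompt: str) -> str:
    """Assemble the role-tagged prompt from the Input Format row.
    Newline placement is an assumption; only the trailing newline
    after 'Assistant' is explicit in the card."""
    return (
        f"System\n{system_prompt}\n"
        f"User\n{user_prompt}\n"
        f"Assistant\n"
    )

prompt = build_prompt("You are a helpful assistant.", "Summarize GQA in one sentence.")
print(prompt)
```

The generation loop should then append the model's output after the final `Assistant\n` line.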
|
|