Model Type | |
Use Cases |
Areas: | research, commercial applications |
Applications: | software development, educational tools |
Primary Use Cases: | code generation, automated coding assistance |
Limitations: | Not guaranteed to generate functioning code; output may contain inefficiencies or bugs |
Additional Notes | The model is not instruction-tuned and may not respond well to explicit commands, such as a request to write a function that computes a square root. |
Supported Languages | English (high proficiency), other languages (limited) |
Training Details |
Data Sources: | GitHub code, arXiv, Wikipedia |
Data Volume: | |
Methodology: | Grouped-query attention, a context window of 16,384 tokens, sliding-window attention over 4,096 tokens, trained with a Fill-in-the-Middle objective |
Hardware Used: | |
Model Architecture: | Transformer decoder with grouped-query attention and sliding-window attention, trained with a Fill-in-the-Middle objective |
Input Output | |
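The Methodology row lists a 16,384-token context window alongside a 4,096-token sliding window. The sketch below illustrates one way these two numbers can relate: under causal sliding-window attention, each token attends only to the most recent tokens inside the window, even though the full sequence may be longer. This is a minimal illustration, not the model's actual implementation. The function names are hypothetical, and the convention that the window counts the current token is an assumption that may differ from the real model.

```python
def visible_positions(query_pos: int, window: int = 4096) -> range:
    """Key positions a token at `query_pos` may attend to.

    Assumption: the window includes the current token, so at most
    `window` positions are visible. Real implementations may differ by one.
    """
    start = max(0, query_pos - window + 1)
    return range(start, query_pos + 1)


def sliding_window_mask(seq_len: int, window: int) -> list[list[bool]]:
    """Boolean mask where mask[q][k] is True if query q may attend to key k.

    Combines the causal constraint (k <= q) with the window constraint
    (k > q - window).
    """
    return [
        [(q - window < k <= q) for k in range(seq_len)]
        for q in range(seq_len)
    ]


if __name__ == "__main__":
    # Small example so the band structure is easy to see.
    for row in sliding_window_mask(seq_len=6, window=3):
        print("".join("x" if allowed else "." for allowed in row))

    # Last token of a full 16,384-token context with a 4,096-token window.
    last = visible_positions(16_383, window=4_096)
    print(len(last), last[0], last[-1])
```

With these assumptions, the final token of a full-length input attends directly to only the last 4,096 positions. Information from earlier tokens can still reach it indirectly through stacked layers.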