| Attribute | Details |
| --- | --- |
| Model Type | text-to-text, text-to-code, decoder-only |
| Use Cases: Areas | code completion, code generation, code conversation, code education |
| Use Cases: Limitations | limitations based on training data, ethical concerns |
| Additional Notes | Supports Responsible AI development |
| Supported Languages | |
| Training Details | |
| --- | --- |
| Data Sources | publicly available code repositories, open-source mathematics datasets, synthetically generated code |
| Data Volume | |
| Methodology | FIM (fill-in-the-middle) pretraining, dependency graph-based packing, unit test-based lexical packing |
| Hardware Used | |
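The methodology above lists dependency graph-based packing. A plausible reading is that a repository's files are concatenated into one training sequence so that each file appears after the files it depends on; the sketch below illustrates that ordering with Python's standard-library `graphlib`. The file names and import graph are hypothetical, not taken from the actual training pipeline.

```python
from graphlib import TopologicalSorter

def pack_files(deps: dict[str, set[str]], sources: dict[str, str]) -> str:
    """Concatenate files in dependency order (dependencies first).

    `deps` maps each file to the set of files it depends on;
    `sources` maps each file to its contents.
    """
    order = TopologicalSorter(deps).static_order()
    return "\n".join(sources[name] for name in order)

# Illustrative repository: train.py imports model.py, which imports utils.py.
deps = {
    "utils.py": set(),
    "model.py": {"utils.py"},
    "train.py": {"model.py", "utils.py"},
}
sources = {name: f"# contents of {name}" for name in deps}

packed = pack_files(deps, sources)
print(packed)
```

Packing in this order keeps a file's dependencies in the model's context before the file itself, which is the stated motivation for graph-based packing over random file concatenation.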
| Safety Evaluation | |
| --- | --- |
| Findings | evaluations within acceptable thresholds for content safety and representational harms |
| Risk Categories | child safety, content safety, representational harms, memorization, large-scale harms |
| Ethical Considerations | tested for autonomous hacking capabilities and potential harms |
| Responsible AI Considerations | |
| --- | --- |
| Mitigation Strategies | safety filtering implemented |
| Input/Output | |
| --- | --- |
| Input Format | code prefix/suffix or natural-language text prompt |
| Output Format | code completions, generated code, or conversational responses |
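The code prefix/suffix input format corresponds to fill-in-the-middle (FIM) prompting: the model receives the code before and after the cursor and generates the missing middle. A minimal sketch of assembling such a prompt is below; the sentinel token strings are assumptions for illustration and should be replaced with the tokens actually defined by the model's tokenizer.

```python
# Assumed FIM sentinel tokens -- check the model's tokenizer for the real ones.
FIM_PREFIX = "<|fim_prefix|>"
FIM_SUFFIX = "<|fim_suffix|>"
FIM_MIDDLE = "<|fim_middle|>"

def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Wrap the code before and after the cursor in FIM sentinels.

    The model is expected to generate the missing middle after the
    final sentinel.
    """
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"

prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(2, 3))",
)
print(prompt)
```

For whole-snippet generation or conversation, a plain natural-language prompt is used instead and no sentinels are needed.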