| Attribute | Details |
| --- | --- |
| Model Type | Text-to-text, text-to-code, decoder-only |
| Use Cases: Areas | Research, commercial applications |
| Use Cases: Applications | Code completion, code generation, code conversation, code education |
| Use Cases: Primary Use Cases | Code completion with an IDE extension; interactive code-learning experiences |
| Use Cases: Limitations | The general limitations of LLMs and their training data; potential representational harms |
| Use Cases: Considerations | See the Gemma model card for comprehensive considerations |
| Additional Notes | The model is built for responsible AI development, with a focus on open code applications |
| Supported Languages | |
| Training Details: Data Sources | Publicly available code repositories, open-source mathematics datasets, synthetically generated code |
| Training Details: Data Volume | |
| Training Details: Methodology | |
| Training Details: Hardware Used | |
| Training Details: Model Architecture | |
| Safety Evaluation: Methodologies | Internal red-teaming, structured evaluations |
| Safety Evaluation: Risk Categories | Human safety, representational harms, cyber-offence capabilities |
| Safety Evaluation: Ethical Considerations | Autonomous hacking capabilities were tested to confirm that potential harms are limited |
| Responsible AI Considerations: Fairness | Human evaluation on prompts covering content safety and representational harms |
| Responsible AI Considerations: Transparency | Discussions and evaluations are detailed in the Gemma model card |
| Responsible AI Considerations: Accountability | Developed by Google, which is accountable for model outputs under its AI Principles |
| Responsible AI Considerations: Mitigation Strategies | Risks are controlled through structured evaluations and internal red-teaming |
| Input/Output: Input Format | For the pretrained model: a code prefix and/or suffix for code completion and generation |
| Input/Output: Accepted Modalities | |
| Input/Output: Output Format | For the instruction-tuned model: code and natural language (see the conversation sketch below) |
| Input/Output: Performance Tips | Ensure correct usage of FIM tokens in prompts (a hedged example follows the table) |
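Below is a minimal sketch of fill-in-the-middle (FIM) prompting with the pretrained model via Hugging Face `transformers`. The checkpoint name and the exact FIM token strings (`<|fim_prefix|>`, `<|fim_suffix|>`, `<|fim_middle|>`) are assumptions in the CodeGemma style, not specifics from this card; verify them against the tokenizer's special tokens before relying on them.

```python
# Sketch only: FIM prompting for code completion, assuming CodeGemma-style
# FIM tokens. The checkpoint name below is illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "google/codegemma-2b"  # assumed checkpoint, for illustration

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

# The pretrained model takes a code prefix and/or suffix; the FIM tokens
# mark where each part sits, and the model generates the missing middle.
prefix = "def fibonacci(n):\n    "
suffix = "\n    return result"
prompt = f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)

# Drop the prompt tokens so only the generated middle remains.
completion = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True
)
print(completion)
```

A common failure mode is reordering or misspelling the FIM tokens, which silently degrades completions; keeping prompt construction in one place makes the layout easy to audit.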
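For code conversation and code education with the instruction-tuned variant, a turn can be formatted with the tokenizer's chat template. The sketch below assumes an instruction-tuned checkpoint and standard `transformers` chat-template usage; the model ID is illustrative, not taken from this card.

```python
# Sketch only: one code-conversation turn with an assumed
# instruction-tuned checkpoint.
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "google/codegemma-7b-it"  # assumed checkpoint, for illustration

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID)

messages = [
    {"role": "user",
     "content": "Explain what this does: [x * x for x in range(10)]"},
]
# apply_chat_template inserts the model's turn markers and the
# generation prompt for the assistant reply.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
outputs = model.generate(input_ids, max_new_tokens=128)
print(tokenizer.decode(outputs[0][input_ids.shape[-1]:], skip_special_tokens=True))
```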