Model Type | |
Use Cases |
Areas: | |
Applications: | general-purpose AI systems, memory/compute-constrained environments, latency-bound scenarios, strong reasoning scenarios
|
Primary Use Cases: | language modeling, multilingual tasks, reasoning tasks |
|
Limitations: | limited factual knowledge storage, susceptible to generating repetitive or inconsistent responses in long sessions |
|
Considerations: | Developers are advised to evaluate and mitigate for accuracy, safety, and fairness before deploying. |
|
|
Additional Notes | Phi-3.5-mini can be used with or without flash attention implementation depending on GPU capabilities. |
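A minimal loading sketch of this note, assuming the Hugging Face Transformers library and a checkpoint id of "microsoft/Phi-3.5-mini-instruct" (the exact id and dtype choices are assumptions, not part of this card); it toggles between flash attention and the default attention implementation based on a coarse GPU check:

```python
# Minimal sketch (not the official recipe): load Phi-3.5-mini with or without
# flash attention depending on GPU capability. The checkpoint id below is an
# assumption; substitute your own.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "microsoft/Phi-3.5-mini-instruct"  # assumed Hugging Face id

# Flash attention requires a compatible GPU and the flash-attn package;
# fall back to the default ("eager") implementation otherwise.
has_gpu = torch.cuda.is_available()  # refine this check for your hardware
attn_impl = "flash_attention_2" if has_gpu else "eager"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16 if has_gpu else torch.float32,
    attn_implementation=attn_impl,
    device_map="auto",
)
```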
|
Supported Languages | Arabic, Chinese, Czech, Danish, Dutch, English, Finnish, French, German, Hebrew, Hungarian, Italian, Japanese, Korean, Norwegian, Polish, Portuguese, Russian, Spanish, Swedish, Thai, Turkish, Ukrainian
|
Training Details |
Data Sources: | publicly available documents, synthetic data, high-quality educational data, code |
|
Data Volume: | |
Methodology: | supervised fine-tuning, proximal policy optimization, direct preference optimization |
|
Context Length: | |
Training Time: | |
Hardware Used: | |
Model Architecture: | dense decoder-only Transformer |
|
|
Safety Evaluation |
Methodologies: | red teaming, adversarial conversation simulations, multilingual safety evaluation benchmark datasets |
|
Findings: | The model may refuse to generate undesirable outputs in English even when the request is made in another language. It is more susceptible to longer multi-turn jailbreak techniques across languages.
|
Risk Categories: | misinformation, offensive content, perpetuation of stereotypes |
|
Ethical Considerations: | The evaluation highlights the need for industry-wide investment in high-quality safety evaluation datasets across multiple languages and risk areas.
|
|
Responsible AI Considerations |
Fairness: | Models may reflect real-world patterns and societal biases. |
|
Transparency: | The model provides AI-generated outputs without additional operational transparency measures.
|
Accountability: | Users are encouraged to pair the model with larger AI systems for better contextual and application-specific outcomes. |
|
Mitigation Strategies: | Fine-tuning with additional safety datasets and building application-level safeguards are recommended. |
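As an illustration of the second recommendation, a minimal, hypothetical application-level safeguard that wraps any text-generation callable with a post-generation check; check_safety is a placeholder, not a real moderation API:

```python
from typing import Callable

def check_safety(text: str) -> bool:
    """Placeholder policy check; swap in a real content-moderation classifier or service."""
    blocked_terms = ("example banned phrase",)  # illustrative only
    return not any(term in text.lower() for term in blocked_terms)

def guarded_generate(generate: Callable[[str], str], prompt: str) -> str:
    """Wrap a text-generation callable with a post-generation safety filter."""
    response = generate(prompt)
    if not check_safety(response):
        return "Response withheld by the application-level safety filter."
    return response
```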
|
|
Input Output |
Input Format: | |
Accepted Modalities: | |
Output Format: | generated text in response to the input |
|
Performance Tips: | To use flash attention, run on GPU hardware that supports it; otherwise load the model with the default attention implementation.
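A usage sketch under the same assumptions as the loading snippet above (chat-formatted text in, generated text out); apply_chat_template builds the prompt and generate produces the completion:

```python
# Usage sketch: chat-formatted text in, generated text out.
# Assumes `model` and `tokenizer` were loaded as in the earlier snippet.
messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Explain flash attention in one paragraph."},
]

# Build the prompt with the model's chat template and generate a completion.
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=256, do_sample=False)

# Decode only the newly generated tokens (the model's response).
print(tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True))
```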
|
|
Release Notes |
Version: | |
Date: | |
Notes: | Updated with post-training data for gains in multilingual and reasoning capability. |
|
|
|