Phi 3.5 Mini Instruct by microsoft

 ยป  All LLMs  ยป  microsoft  ยป  Phi 3.5 Mini Instruct   URL Share it on

  Arxiv:2403.06412   Arxiv:2404.14219   Arxiv:2407.13833   Autotrain compatible   Code   Conversational   Custom code   Endpoints compatible   Instruct   Multilingual   Phi3   Region:us   Safetensors   Sharded   Tensorflow

Phi 3.5 Mini Instruct Benchmarks

nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").

Phi 3.5 Mini Instruct Parameters and Internals

Model Type 
text-generation
Use Cases 
Areas:
research, commercial applications
Applications:
AI systems, natural language processing
Primary Use Cases:
memory/compute constrained environments, latency-bound scenarios, strong reasoning tasks
Limitations:
languages other than English may have worse performance
Considerations:
Developers should consider model limitations and adhere to safety and regulatory guidelines.
Additional Notes 
None
Supported Languages 
Arabic (supported), Chinese (supported), Czech (supported), Danish (supported), Dutch (supported), English (supported), Finnish (supported), French (supported), German (supported), Hebrew (supported), Hungarian (supported), Italian (supported), Japanese (supported), Korean (supported), Norwegian (supported), Polish (supported), Portuguese (supported), Russian (supported), Spanish (supported), Swedish (supported), Thai (supported), Turkish (supported), Ukrainian (supported)
Training Details 
Data Sources:
publicly available documents, textbook-like synthetic data
Data Volume:
3.4T tokens
Methodology:
supervised fine-tuning, proximal policy optimization, and direct preference optimization
Context Length:
128000
Training Time:
10 days
Hardware Used:
512 H100-80G GPUs
Model Architecture:
dense decoder-only Transformer
Safety Evaluation 
Methodologies:
red-teaming, adversarial conversation simulations
Findings:
models may refuse undesirable outputs in English across multiple languages
Risk Categories:
misinformation, bias
Ethical Considerations:
Industry-wide investment in high-quality safety evaluation datasets is needed.
Responsible Ai Considerations 
Fairness:
Models may over- or under-represent groups of people and need fine-tuning for diversity.
Transparency:
Model operation and biases should be understood and communicated to users.
Accountability:
Microsoft accountable for model's outputs.
Mitigation Strategies:
Utilize safety classifiers and fine-tuning based on deployment scenarios.
Input Output 
Input Format:
Text inputs with chat format expected
Accepted Modalities:
text
Output Format:
Generated text
Performance Tips:
Use in-memory or latency-bound scenarios.
Release Notes 
Version:
June 2024
Date:
2024-06
Notes:
Updated with feedback, improved conversation quality in multilingual settings.
LLM NamePhi 3.5 Mini Instruct
Repository ๐Ÿค—https://huggingface.co/microsoft/Phi-3.5-mini-instruct 
Model Size3.8b
Required VRAM7.7 GB
Updated2024-11-21
Maintainermicrosoft
Model Typephi3
Instruction-BasedYes
Model Files  5.0 GB: 1-of-2   2.7 GB: 2-of-2
Model ArchitecturePhi3ForCausalLM
Licensemit
Context Length131072
Model Max Length131072
Transformers Version4.43.3
Tokenizer ClassLlamaTokenizer
Padding Token<|endoftext|>
Vocabulary Size32064
Torch Data Typebfloat16
Phi 3.5 Mini Instruct (microsoft/Phi-3.5-mini-instruct)

Quantized Models of the Phi 3.5 Mini Instruct

Model
Likes
Downloads
VRAM
Phi 3.5 Mini Instruct GGUF518381560 GB
Flow Judge V0.1 AWQ623232 GB

Best Alternatives to Phi 3.5 Mini Instruct

Best Alternatives
Context / RAM
Downloads
Likes
Phi 3 Mini 128K Instruct128K / 7.7 GB7358311605
NuExtract 1.5128K / 7.7 GB107204113
NuExtract V1.5128K / 7.7 GB10851189
ECE EIFFEL 3Bv2128K / 7.7 GB50
Phi 3.5 Mini ITA128K / 7.7 GB788510
Flow Judge V0.1128K / 7.7 GB37945
Borea Phi 3.5 Mini Instruct Jp128K / 7.7 GB5299
Phi 3.5 Mini TitanFusion 0.1128K / 7.7 GB290
Borea Phi 3.5 Mini Instruct Jp128K / 7.7 GB2025
....5 Mini 3.8B ArliAI RPMax V1.1128K / 7.7 GB258

Rank the Phi 3.5 Mini Instruct Capabilities

๐Ÿ†˜ Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! ๐ŸŒŸ

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  

What open-source LLMs or SLMs are you in search of? 38149 in total.

Our Social Media →  
Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241110