Phi 3 Vision 128K Instruct by inventbot

 ยป  All LLMs  ยป  inventbot  ยป  Phi 3 Vision 128K Instruct   URL Share it on

  Autotrain compatible   Code   Conversational   Custom code   Instruct   Multilingual   Phi3 v   Region:us   Safetensors   Sharded   Tensorflow   Vision

Phi 3 Vision 128K Instruct Benchmarks

nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
Phi 3 Vision 128K Instruct (inventbot/Phi-3-vision-128k-instruct)

Phi 3 Vision 128K Instruct Parameters and Internals

Model Type 
text generation, multimodal
Use Cases 
Areas:
research, commercial applications
Applications:
general AI systems, visual and text input capabilities, memory/compute constrained environments, OCR, image understanding
Primary Use Cases:
general image understanding, text generation, language understanding
Limitations:
not evaluated for all downstream purposes, inappropriate for high-risk scenarios without additional safeguards
Considerations:
Developers should ensure accuracy, safety, and fairness for their use cases.
Additional Notes 
Phi-3-Vision-128K-Instruct is designed for use in latency-constrained scenarios and comes with rights for commercial use. Developers should apply responsible AI best practices.
Supported Languages 
multilingual (high quality, reasoning dense)
Training Details 
Data Sources:
publicly available documents, high-quality educational data and code, selected high-quality image-text interleave, synthetic data for teaching math, coding, and reasoning, newly created image data (charts, tables, diagrams), high-quality chat format supervised data
Data Volume:
500B vision and text tokens
Methodology:
supervised fine-tuning and direct preference optimization for instruction adherence
Context Length:
128000
Training Time:
1.5 days
Hardware Used:
512 H100-80G GPUs
Model Architecture:
Includes image encoder, connector, projector, and Phi-3 Mini language model
Safety Evaluation 
Risk Categories:
misinformation, offensive content, bias
Ethical Considerations:
Models may produce inappropriate or offensive content. Developers should implement necessary safeguards.
Responsible Ai Considerations 
Fairness:
Models can over- or under-represent groups of people and reinforce stereotypes.
Transparency:
Developers should inform users they are interacting with AI.
Accountability:
Developers are responsible for ensuring use case compliance with laws.
Mitigation Strategies:
Additional debiasing techniques and RAG for misinformation.
Input Output 
Input Format:
Text and image as inputs using chat template format
Accepted Modalities:
text, image
Output Format:
Generated text in response
LLM NamePhi 3 Vision 128K Instruct
Repository ๐Ÿค—https://huggingface.co/inventbot/Phi-3-vision-128k-instruct 
Model Size4.1b
Required VRAM8.3 GB
Updated2024-12-22
Maintainerinventbot
Model Typephi3_v
Instruction-BasedYes
Model Files  4.9 GB: 1-of-2   3.4 GB: 2-of-2
Model ArchitecturePhi3VForCausalLM
Licensemit
Context Length131072
Model Max Length131072
Transformers Version4.38.1
Tokenizer ClassLlamaTokenizer
Padding Token<|endoftext|>
Vocabulary Size32064
Torch Data Typebfloat16

Best Alternatives to Phi 3 Vision 128K Instruct

Best Alternatives
Context / RAM
Downloads
Likes
Phi 3.5 Vision Instruct128K / 8.3 GB337071619
Phi 3 Vision 128K Instruct128K / 8.3 GB82210940
VLM2Vec Full128K / 8.3 GB2984619
Phi 3.5 Vision Instruct Bf16128K / 8.3 GB942
...hi 3 HornyVision 128K Instruct128K / 8.3 GB10626
...on 128K Instruct No Flash Attn128K / 8.3 GB130
Phi 3 Vision Win Snap128K / 8.3 GB161
Note: green Score (e.g. "73.2") means that the model is better than inventbot/Phi-3-vision-128k-instruct.

Rank the Phi 3 Vision 128K Instruct Capabilities

๐Ÿ†˜ Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! ๐ŸŒŸ

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  

What open-source LLMs or SLMs are you in search of? 40123 in total.

Our Social Media →  
Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217