InternVL Chat ViT 6B Vicuna 13B 448px by OpenGVLab


Tags: Arxiv:2312.14238; Autotrain compatible; Base model:lmsys/vicuna-13b-v1...; Base model:merge:lmsys/vicuna-...; Base model:merge:opengvlab/int...; Base model:opengvlab/internvit...; Llava; Pytorch; Region:us; Sharded; Visual-question-answering

InternVL Chat ViT 6B Vicuna 13B 448px Parameters and Internals

Model Type 
vision, vision-language, foundation model, auto-regressive language model, transformer architecture
Use Cases 
Areas:
research in large multimodal models and chatbots
Applications:
computer vision, natural language processing, machine learning, artificial intelligence
Primary Use Cases:
Multimodal dialogue
Additional Notes 
Model card adapted from LLaVA's model card.
Training Details 
Data Sources:
LAION-en, LAION-multi, LAION-COCO, COYO, Wukong, CC12M, CC3M, SBU
Data Volume:
1.206 million image-text pairs and multimodal instruction-following data
Methodology:
Fine-tuning LLaMA/Vicuna on GPT-generated multimodal instruction-following data
Model Architecture:
Transformer architecture
Input Output 
Input Format:
multimodal instruction-following format
Accepted Modalities:
text, image
Output Format:
dialogue responses in text format
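
As a rough illustration of what a text-plus-image request in an instruction-following format could look like, the sketch below formats a single user turn. The `<image>` placeholder and the `USER:` / `ASSISTANT:` layout are assumptions borrowed from LLaVA/Vicuna-style conventions and are not confirmed by this card; `Turn`, `build_prompt`, and the example file name are hypothetical names used only for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Modalities listed on the card under "Accepted Modalities".
ACCEPTED_MODALITIES = {"text", "image"}

# Assumption: a LLaVA-style placeholder marks where image features go.
IMAGE_PLACEHOLDER = "<image>"


@dataclass
class Turn:
    """One user turn: required text plus an optional image (hypothetical type)."""
    question: str
    image_path: Optional[str] = None  # hypothetical local file path


def build_prompt(turn: Turn) -> str:
    """Format one turn in an assumed Vicuna-style USER/ASSISTANT layout."""
    question = turn.question.strip()
    if not question:
        raise ValueError("question must be non-empty text")
    # Only prepend the placeholder when an image accompanies the text.
    prefix = f"{IMAGE_PLACEHOLDER}\n" if turn.image_path else ""
    return f"USER: {prefix}{question} ASSISTANT:"


if __name__ == "__main__":
    print(build_prompt(Turn("What is shown in this picture?", "example.png")))
```

The model's reply would then be plain text, matching the output format listed above. The actual template should be taken from the model repository rather than from this sketch.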
LLM Name: InternVL Chat ViT 6B Vicuna 13B 448px
Repository: https://huggingface.co/OpenGVLab/InternVL-Chat-ViT-6B-Vicuna-13B-448px
Base Model(s): OpenGVLab/InternViT-6B-448px-V1-0, lmsys/vicuna-13b-v1.5
Model Size: 6b
Required VRAM: 37.9 GB
Updated: 2025-02-18
Maintainer: OpenGVLab
Model Type: llava
Model Files: 0.1 GB, 9.9 GB (1-of-4), 9.9 GB (2-of-4), 10.0 GB (3-of-4), 8.1 GB (4-of-4), 0.0 GB
Model Architecture: LlavaLlamaForCausalLM
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.32.0
Tokenizer Class: LlamaTokenizer
Beginning of Sentence Token: <s>
End of Sentence Token: </s>
Unk Token: <unk>
Vocabulary Size: 32000
Torch Data Type: bfloat16
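
The file sizes, the VRAM figure, and the bfloat16 data type listed above can be cross-checked with a little arithmetic. The sketch below sums the listed file sizes and derives a rough parameter count. It assumes that 1 GB means 10^9 bytes and that essentially all of the stored bytes are bfloat16 weights; neither assumption is stated on the card, and the function names are hypothetical.

```python
# File sizes in GB as listed under "Model Files" on this card.
FILE_SIZES_GB = [0.1, 9.9, 9.9, 10.0, 8.1, 0.0]

# "Required VRAM" as listed on this card.
REPORTED_VRAM_GB = 37.9

# bfloat16 stores each value in 16 bits, i.e. 2 bytes.
BYTES_PER_BF16_PARAM = 2


def total_size_gb(sizes_gb):
    """Sum the listed file sizes, rounded to the card's one-decimal precision."""
    return round(sum(sizes_gb), 1)


def estimated_params_billions(weights_gb, bytes_per_param=BYTES_PER_BF16_PARAM):
    """Rough parameter count in billions.

    Assumption: 1 GB = 1e9 bytes and every stored byte is a model weight.
    """
    return weights_gb / bytes_per_param


if __name__ == "__main__":
    print("sum of listed files (GB):", total_size_gb(FILE_SIZES_GB))
    print("estimated parameters (billions):",
          estimated_params_billions(REPORTED_VRAM_GB))
```

Under these assumptions the listed files add up to 38.0 GB, within rounding of the 37.9 GB figure, and the estimate comes to roughly 19 billion parameters, in line with the 6B vision encoder and 13B language model named in the title.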

Best Alternatives to InternVL Chat ViT 6B Vicuna 13B 448px

Best Alternatives | Context / RAM | Downloads, Likes
Yi VL 6B | 4K / 13.4 GB | 116121
InternVL Chat ViT 6B Vicuna 7B | 4K / 25.4 GB | 889
...nternVL Chat ViT 6B Vicuna 13B | 4K / 37.9 GB | 318
Llava 6B | 4K / 12.7 GB | 150

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227