VideoLLaMA2 7B by DAMO-NLP-SG: Benchmarks, Features and Detailed Analysis

Tags: Arxiv:2306.02858, Arxiv:2406.07476, Autotrain compatible, Dataset:lin-chen/sharegpt4v, Dataset:liuhaotian/llava-instr..., Dataset:opengvlab/videochat2-i..., En, Endpoints compatible, Instruct, Large video-language model, Multimodal large language mode..., Region:us, Safetensors, Sharded, Tensorflow, Videollama2 mistral, Visual-question-answering

Model Card on HF 🤗: https://huggingface.co/DAMO-NLP-SG/VideoLLaMA2-7B

VideoLLaMA2 7B Benchmarks

LLME Score: 0.19368

VideoLLaMA2 7B Parameters and Internals

Model Type

multimodal large language model, large video-language model

Release Notes

Version	Notes
VideoLLaMA2-7B-Base	Base variant with a visual encoder and language decoder. Trained on 8 frames.
VideoLLaMA2-7B	Chat variant with a visual encoder and language decoder. Trained on 8 frames.
VideoLLaMA2-7B-16F-Base	Base variant with 16 training frames.
VideoLLaMA2-7B-16F	Chat variant with 16 training frames.
VideoLLaMA2-8x7B-Base	Base variant with Mixtral-8x7B-Instruct-v0.1 and 8 frames.
VideoLLaMA2-8x7B	Chat variant with Mixtral-8x7B-Instruct-v0.1 and 8 frames.
VideoLLaMA2-72B-Base	Base variant with Qwen2-72B-Instruct and 8 frames.
VideoLLaMA2-72B	Chat variant with Qwen2-72B-Instruct and 8 frames.
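
The variant names in the release notes follow a visible pattern: a size tag, an optional "16F" frame tag, and an optional "-Base" suffix. The sketch below decodes that pattern. The regular expression, the Variant type and parse_variant are hypothetical helpers inferred only from the eight names listed here; they are not part of any VideoLLaMA2 tooling, and the default of 8 frames is taken from the notes, not from the names themselves.

```python
import re
from typing import NamedTuple


class Variant(NamedTuple):
    size: str    # size tag as written in the name, e.g. "7B", "8x7B", "72B"
    frames: int  # number of training frames
    chat: bool   # True for the chat variant, False for a "-Base" variant


# Hypothetical pattern, derived only from the names in the release notes.
_NAME = re.compile(
    r"^VideoLLaMA2-(?P<size>\d+(?:x\d+)?B)(?:-(?P<frames>\d+)F)?(?P<base>-Base)?$"
)


def parse_variant(name: str) -> Variant:
    """Split a release name into size, frame count and base/chat flag."""
    m = _NAME.match(name)
    if m is None:
        raise ValueError(f"unrecognized variant name: {name}")
    # Assumption: names without a frame tag were trained on 8 frames,
    # as the release notes state for every such variant.
    frames = int(m.group("frames")) if m.group("frames") else 8
    return Variant(m.group("size"), frames, m.group("base") is None)


if __name__ == "__main__":
    for n in ("VideoLLaMA2-7B", "VideoLLaMA2-7B-16F-Base", "VideoLLaMA2-8x7B"):
        print(n, parse_variant(n))
```

A name that does not match the pattern raises ValueError, so a future release with a different naming scheme fails loudly instead of being misread.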

LLM Name	VideoLLaMA2 7B
Repository 🤗	https://huggingface.co/DAMO-NLP-SG/VideoLLaMA2-7B
Model Size	7B
Required VRAM	16 GB
Updated	2025-06-01
Maintainer	DAMO-NLP-SG
Model Type	videollama2_mistral
Instruction-Based	Yes
Model Files	4.9 GB (1 of 4), 5.0 GB (2 of 4), 5.0 GB (3 of 4), 1.1 GB (4 of 4)
Supported Languages	en
Model Architecture	Videollama2MistralForCausalLM
License	apache-2.0
Context Length	32768
Model Max Length	32768
Transformers Version	4.37.2
Tokenizer Class	LlamaTokenizer
Padding Token	<unk>
Vocabulary Size	32000
Torch Data Type	bfloat16
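
The figures in this table can be cross-checked with simple arithmetic. The sketch below sums the four shard sizes from the Model Files row and converts the total into an implied weight count at 2 bytes per bfloat16 value. It assumes decimal gigabytes and that every stored tensor is bfloat16; neither is stated in the listing, and the variable and function names are illustrative only.

```python
# Shard sizes in GB, copied from the "Model Files" row.
SHARD_SIZES_GB = [4.9, 5.0, 5.0, 1.1]

# bfloat16 stores each value in 2 bytes.
BYTES_PER_BF16 = 2


def total_checkpoint_gb(shards):
    """Sum the shard sizes, rounded to the one-decimal precision of the listing."""
    return round(sum(shards), 1)


def implied_params_billions(total_gb, bytes_per_param=BYTES_PER_BF16):
    """Weights implied by the checkpoint size, in billions.

    Assumption: 1 GB = 1e9 bytes and all weights share one dtype.
    """
    return total_gb * 1e9 / bytes_per_param / 1e9


if __name__ == "__main__":
    total = total_checkpoint_gb(SHARD_SIZES_GB)
    print(f"checkpoint size: {total} GB")
    print(f"implied weights: {implied_params_billions(total):.1f} billion")
```

Under these assumptions the shards add up to 16.0 GB, the same value as the Required VRAM row, which suggests that row reflects weight storage only. The implied count covers every tensor in the checkpoint, including the visual encoder, so it need not equal the nominal 7B of the language decoder. Inference also needs memory for activations and the key-value cache on top of the weights.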

Best Alternatives to VideoLLaMA2 7B

Best Alternatives	Context / RAM	Downloads	Likes
VideoLLaMA2 7B 16F	32K / 16 GB	210	15

Original data from HuggingFace, OpenCompass and various public git repos.
