GroundingGPT by zwli


  Arxiv:2401.06071   Autotrain compatible   Endpoints compatible   Lego   Pytorch   Region:us   Sharded
Model Card on HF 🤗: https://huggingface.co/zwli/GroundingGPT


GroundingGPT Parameters and Internals

Model Type: Multimodal

Use Cases
Areas: Research applications, multimodal grounding tasks
Applications: Comprehensive grounding tasks across images, audio, and video
Primary Use Cases: Understanding and grounding multimodal inputs

Additional Notes: Model available on Hugging Face

Supported Languages: English (proficient)

Training Details
Data Sources: LLaVA, COCO, GQA, OCR-VQA, TextVQA, VisualGenome, Flickr30K-Entities, Valley, DiDeMo, ActivityNet Captions, Charades-STA, VGGSS, WavCaps, Clotho
Data Volume: Large and diverse multimodal dataset
Methodology: End-to-end multimodal grounding model integrating spatial and temporal information
Hardware Used: GPUs (specifics not stated)
Model Architecture: End-to-end multimodal grounding
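A common way grounding models of this kind integrate spatial information is to serialize bounding boxes as normalized coordinates in the text stream, so the language model can read and emit locations as ordinary tokens. The sketch below illustrates that preprocessing step; the function name, rounding precision, and (x1, y1, x2, y2) convention are illustrative assumptions, not code from the GroundingGPT repository.

```python
def normalize_box(box, width, height, precision=3):
    """Map a pixel-space box (x1, y1, x2, y2) to [0, 1] coordinates,
    rounded so the values serialize compactly as text tokens."""
    x1, y1, x2, y2 = box
    return (
        round(x1 / width, precision),
        round(y1 / height, precision),
        round(x2 / width, precision),
        round(y2 / height, precision),
    )

# A box covering the right half of a 640x480 image:
print(normalize_box((320, 0, 640, 480), 640, 480))  # (0.5, 0.0, 1.0, 1.0)
```

The inverse mapping (scaling predicted [0, 1] coordinates back to pixels) would be applied when rendering the model's grounded outputs.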
Safety Evaluation
Methodologies: Experimental evaluations
Findings: Effective in grounding tasks across various modalities

Input/Output
Input Format: Multimodal inputs including images, audio, and video
Accepted Modalities: Image, Audio, Video
Output Format: Model outputs based on multimodal inputs
LLM Name: GroundingGPT
Repository 🤗: https://huggingface.co/zwli/GroundingGPT
Model Size: 7b
Required VRAM: 18.2 GB
Updated: 2025-02-22
Maintainer: zwli
Model Type: LEGO
Model Files: 10.0 GB (1-of-2), 8.2 GB (2-of-2), 0.0 GB
Model Architecture: LEGOLlamaForCausalLM
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.34.0
Tokenizer Class: LlamaTokenizer
Padding Token: <unk>
Vocabulary Size: 32009
Torch Data Type: bfloat16



Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227