| Model Type |
|---|
| omni-interactive, multimodal, text generation, speech-to-speech |
| Use Cases | |
|---|---|
| Areas | research, interactive applications, voice assistants |
| Applications | multimodal interaction, speech-to-speech conversations |
| Primary Use Cases | real-time speech output; understanding images, audio, and text |
| Additional Notes |
|---|
| Uses Whisper for audio encoding, CLIP for image encoding, SNAC for audio decoding, and CosyVoice for generating synthetic speech; a sketch of how these components could fit together follows below. |
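A minimal sketch of how these components could be wired together at inference time, assuming Whisper and CLIP act as feature encoders feeding a shared backbone, and SNAC turns generated audio codes back into waveform. All module and method names here (`OmniPipeline`, `backbone.generate`, and the constructor arguments) are illustrative placeholders, not the model's actual API.

```python
import torch

class OmniPipeline(torch.nn.Module):
    """Placeholder wiring of the encoders, backbone, and audio decoder."""

    def __init__(self, whisper_enc, clip_enc, backbone, snac_dec):
        super().__init__()
        self.whisper_enc = whisper_enc  # raw audio -> audio features
        self.clip_enc = clip_enc        # image -> visual features
        self.backbone = backbone        # language model over the joint sequence
        self.snac_dec = snac_dec        # discrete audio codes -> waveform

    def forward(self, audio, image, text_embeds):
        audio_feats = self.whisper_enc(audio)   # (B, T_audio, D)
        image_feats = self.clip_enc(image)      # (B, T_image, D)
        # Input format per the table below: concatenated image, audio,
        # and text features along the sequence axis.
        inputs = torch.cat([image_feats, audio_feats, text_embeds], dim=1)
        # Assumed to yield text tokens and discrete audio codes together.
        text_tokens, audio_codes = self.backbone.generate(inputs)
        waveform = self.snac_dec(audio_codes)   # the spoken response
        return text_tokens, waveform
```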
| Supported Languages |
|---|
| |
| Training Details | |
|---|---|
| Data Sources | OpenOrca datasets, MOSS, Whisper |
| Methodology | Three-stage training: encoder adaptation, modal alignment, and multimodal fine-tuning (see the sketch below) |
| Model Architecture | Uses multiple sequences for input and output to perform comprehensive tasks |
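A hedged sketch of how such a staged schedule might be implemented, assuming stage 1 (encoder adaptation) trains only the modality adapters, stage 2 (modal alignment) additionally unfreezes the backbone, and stage 3 (multimodal fine-tuning) trains everything end to end. That breakdown, and every name below, is an assumption for illustration, not the published recipe.

```python
import torch.nn as nn

class TinyOmni(nn.Module):
    """Stand-in model: adapters project encoder outputs into the backbone."""

    def __init__(self, dim=16):
        super().__init__()
        self.audio_adapter = nn.Linear(dim, dim)
        self.image_adapter = nn.Linear(dim, dim)
        self.backbone = nn.Linear(dim, dim)

def set_trainable(module, flag):
    for p in module.parameters():
        p.requires_grad = flag

def configure_stage(model, stage):
    # Freeze everything first, then selectively unfreeze per stage.
    set_trainable(model, False)
    set_trainable(model.audio_adapter, stage >= 1)   # stage 1: adapters only
    set_trainable(model.image_adapter, stage >= 1)
    set_trainable(model.backbone, stage >= 2)        # stage 2: + backbone
    if stage >= 3:
        set_trainable(model, True)                   # stage 3: everything

model = TinyOmni()
for stage in (1, 2, 3):
    configure_stage(model, stage)
    # ... run this stage's training loop here ...
```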
| Input / Output | |
|---|---|
| Input Format | Concatenated image, audio, and text features |
| Accepted Modalities | image, audio, text |
| Output Format | Real-time speech responses guided by text (see the streaming sketch below) |
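To make "real-time speech responses guided by text" concrete, here is a hypothetical streaming loop in which the model is assumed to emit a text token and a batch of audio codes at each decoding step, with SNAC decoding small waveform chunks as they arrive rather than after generation finishes. `step_fn` and `snac_decode` are placeholder callables standing in for the real model.

```python
# Hypothetical streaming loop: speech is decoded incrementally while the
# parallel text stream guides, and eventually terminates, the response.

def stream_speech(step_fn, snac_decode, eos_id, max_steps=512):
    """Yield waveform chunks as soon as each step's audio codes are decoded."""
    for _ in range(max_steps):
        text_token, audio_codes = step_fn()   # one parallel decoding step
        yield snac_decode(audio_codes)        # playable waveform chunk
        if text_token == eos_id:              # text stream signals the end
            break
```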
| Release Notes | |
|---|---|
| Version | |
| Notes | Release of the model, technical report, inference, and chat demo code |