Hymba 1.5B Instruct by nvidia


Arxiv:2411.13676 · Autotrain compatible · Base model:finetune:nvidia/hym... · Base model:nvidia/hymba-1.5b-b... · Conversational · Custom code · Hymba · Instruct · Region:us · Safetensors

Hymba 1.5B Instruct Benchmarks

nn.n%: how the model compares to the reference models Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o"), and GPT-4 ("gpt4").
Hymba 1.5B Instruct (nvidia/Hymba-1.5B-Instruct)

Hymba 1.5B Instruct Parameters and Internals

Model Type 
text-generation
Additional Notes 
The model is susceptible to jailbreak attacks and may generate inaccurate or biased content. Strong output validation controls are recommended.
Training Details 
Data Sources:
open-source instruction datasets and internally collected synthetic datasets
Methodology:
supervised fine-tuning and direct preference optimization
Training Time:
Between September 4 and November 10, 2024.
Model Architecture:
Hybrid-head Architecture with standard attention heads and Mamba heads, Grouped-Query Attention (GQA), Rotary Position Embeddings (RoPE)
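As a quick illustration of the Grouped-Query Attention (GQA) component named above, the sketch below shows how query heads share key/value heads. The head counts here are made up for the example and are not Hymba's actual configuration.

```python
# Illustrative Grouped-Query Attention head mapping (head counts are
# hypothetical for this sketch, NOT Hymba's actual configuration).
num_q_heads = 8
num_kv_heads = 2                        # each KV head serves a group of query heads
group_size = num_q_heads // num_kv_heads

# Query head q reads the key/value cache of KV head q // group_size,
# shrinking the KV cache by a factor of group_size versus full multi-head attention.
kv_for_q = [q // group_size for q in range(num_q_heads)]
print(kv_for_q)  # [0, 0, 0, 0, 1, 1, 1, 1]
```

Sharing each KV head across a group of query heads is what lets GQA cut KV-cache memory while keeping most of the quality of full multi-head attention.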
Responsible AI Considerations 
Mitigation Strategies:
Developers should work with their internal model team to ensure this model meets the requirements of the relevant industry and use case, and to address unforeseen product misuse.
Input Output 
Accepted Modalities:
text
Performance Tips:
During generation, the batch size must be 1, because the current implementation does not fully support padding with meta tokens and sliding-window attention (SWA).
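Given that constraint, one way to handle multiple prompts is to run them sequentially rather than padding them into a single batch. This is only a sketch: `generate_one` is a hypothetical stand-in for the real tokenize-and-`model.generate` call on a single sequence, not Hymba's API.

```python
# Sketch of working around the batch-size-1 constraint: padded batches
# (meta tokens + SWA) are not fully supported, so run prompts one at a time.

def generate_one(prompt: str) -> str:
    # Hypothetical placeholder: in practice this would tokenize `prompt`
    # and call model.generate on a batch of exactly one sequence.
    return f"<response to: {prompt}>"

prompts = ["What is Hymba?", "Summarize GQA in one line."]
outputs = [generate_one(p) for p in prompts]   # one prompt per call, no padding
```

Sequential calls trade throughput for correctness until padded batching is supported.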
LLM Name: Hymba 1.5B Instruct
Repository 🤗: https://huggingface.co/nvidia/Hymba-1.5B-Instruct
Base Model(s): nvidia/Hymba-1.5B-Base
Model Size: 1.5B
Required VRAM: 3 GB
Updated: 2024-12-26
Maintainer: nvidia
Model Type: hymba
Instruction-Based: Yes
Model Files: 3.0 GB
Model Architecture: HymbaForCausalLM
License: other
Context Length: 8192
Model Max Length: 8192
Transformers Version: 4.44.0
Tokenizer Class: LlamaTokenizer
Padding Token: [PAD]
Vocabulary Size: 32001
Torch Data Type: bfloat16

Rank the Hymba 1.5B Instruct Capabilities

🆘 Have you tried this model? Rate its performance. Your feedback helps the ML community identify the most suitable model for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217