Minitron 8B Base by nvidia: Benchmarks, Features and Detailed Analysis

Tags: arXiv:2009.03300, arXiv:2407.14679, NeMo, Nemotron, PyTorch, region:us

Model Card on HF 🤗: https://huggingface.co/nvidia/Minitron-8B-Base

Minitron 8B Base Benchmarks

MMLU Pro: 24.23

GPQA: 3.13

MUSR: 9.09

BBH: 22.04

IFEval: 24.24 vs 88 (so35) ^-72.5%

MATH Lvl 5: 2.34

LLME Score: 0.25443

^nn.n%: how the model's score compares to that of a reference model: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
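
The relative figure can be reproduced from the two scores. The following is a minimal sketch, assuming the percentage is the signed difference relative to the reference score; the function name is illustrative and not part of LLM Explorer.

```python
def relative_difference_pct(model_score: float, reference_score: float) -> float:
    """Signed difference of model_score from reference_score, in percent.

    Assumption: the listing computes (model - reference) / reference.
    """
    return (model_score - reference_score) / reference_score * 100.0


# IFEval: Minitron 8B Base (24.24) against the so35 reference (88)
ifeval_delta = relative_difference_pct(24.24, 88.0)
print(f"{ifeval_delta:.1f}%")  # -72.5%
```

Under this assumption, the IFEval score of 24.24 sits about 72.5% below the reference score of 88, which matches the listed value.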


Minitron 8B Base (nvidia/Minitron-8B-Base)

Minitron 8B Base Parameters and Internals

Model Type

text generation, auto-regressive

Use Cases

Areas:

research, development

Applications:

Multitask language understanding, code generation

Limitations:

Contains biases from internet-crawled data; may emit toxic or biased content

Considerations:

Developers must ensure the model meets their use case requirements and must address model misuse

Additional Notes

Performance improvements over training from scratch were achieved with fewer training tokens per model.

Supported Languages

English and multilingual text, including code

Training Details

Data Sources:

continuous pre-training data corpus from Nemotron-4 15B

Data Volume:

94 billion tokens

Methodology:

Pruning, knowledge distillation, continued training

Training Time:

February 2024 - June 2024

Hardware Used:

NVIDIA A100

Model Architecture:

Transformer Decoder

Responsible AI Considerations

Fairness:

Developers must work to ensure the model meets industry and use case requirements.

Accountability:

Shared responsibility promoted by NVIDIA

Mitigation Strategies:

Establish policies and practices to address product misuse.

Input and Output

Input Format:

String

Output Format:

String
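
Since both input and output are plain strings, text generation can be sketched with the Hugging Face transformers library. This is a hedged sketch, not the maintainer's reference code: it assumes a transformers release that includes the Nemotron architecture (the listing records 4.44.0), enough memory for the bfloat16 weights, and access to the repository named in the model card URL. The `generate` helper and its defaults are illustrative.

```python
MODEL_ID = "nvidia/Minitron-8B-Base"  # repository name taken from the model card URL


def generate(prompt: str, max_new_tokens: int = 32) -> str:
    """Return the prompt plus the model's continuation as a plain string."""
    # Imported lazily so that defining this helper does not require
    # the heavy dependencies to be installed.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    # bfloat16 matches the Torch data type listed for this model.
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, torch_dtype=torch.bfloat16)
    model.eval()

    input_ids = tokenizer.encode(prompt, return_tensors="pt")
    with torch.no_grad():
        output_ids = model.generate(input_ids, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Our solar system is")` downloads the weights on first use and returns a string. Because this is a base model without instruction tuning, the output is a continuation of the prompt rather than an answer to it.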

Release Notes

Version:

Minitron-8B-Base

Date:

February 2024 - June 2024

Notes:

Pruned and distilled from Nemotron-4 15B, achieving compute cost savings and improved performance.

LLM Name	Minitron 8B Base
Repository 🤗	https://huggingface.co/nvidia/Minitron-8B-Base
Model Size	8b
Required VRAM	16.5 GB
Updated	2025-02-05
Maintainer	nvidia
Model Type	nemotron
Model Files	16.5 GB
Model Architecture	NemotronForCausalLM
License	other
Context Length	4096
Model Max Length	4096
Transformers Version	4.44.0
Tokenizer Class	PreTrainedTokenizerFast
Vocabulary Size	256000
Torch Data Type	bfloat16
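
The size and data type figures in this table are roughly consistent with each other. The sketch below is a back-of-the-envelope check, assuming 1 GB = 10^9 bytes, that the model files hold only weights, and that every weight is stored in the listed data type. It ignores activations and the key/value cache, so actual VRAM use during inference will be higher. The helper names are illustrative.

```python
# Bytes used to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def implied_params_billions(size_gb: float, dtype: str) -> float:
    """Parameter count (in billions) implied by a weights-only file size."""
    return size_gb * 1e9 / BYTES_PER_PARAM[dtype] / 1e9


print(weight_memory_gb(8e9, "bfloat16"))          # 16.0
print(implied_params_billions(16.5, "bfloat16"))  # 8.25
```

A nominal 8 billion parameters in bfloat16 come to about 16 GB, and the listed 16.5 GB of model files implies roughly 8.25 billion parameters under these assumptions.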

Best Alternatives to Minitron 8B Base

Best Alternatives	Context / RAM	Downloads	Likes
Nemotron3 8B	4K / 17.1 GB	964	0


Original data from HuggingFace, OpenCompass and various public git repos.
