Model Type | |
Additional Notes | The Instruction Pre-Training framework scalably augments massive raw corpora with instruction-response pairs to pre-train language models. |
|
Training Details |
Data Sources: | tiiuae/falcon-refinedweb, instruction-pretrain/ft-instruction-synthesizer-collection, instruction-pretrain/general-instruction-augmented-corpora |
|
Data Volume: | |
Methodology: | Instruction Pre-Training: supervised multitask pre-training on raw corpora augmented with synthesized instruction-response pairs. |
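To make the Methodology row concrete, here is a minimal sketch of how a raw document and its synthesized instruction-response pairs can be packed into one sequence and trained on with the ordinary next-token objective. The "Question:/Answer:" template, helper name, and sample data are illustrative assumptions, not the framework's actual format or code.

```python
# Illustrative sketch only: the template, helper name, and sample data are
# assumptions, not the framework's actual format or code.
from transformers import AutoTokenizer


def build_instruction_augmented_text(raw_text, qa_pairs):
    """Concatenate a raw document with its synthesized instruction-response pairs."""
    parts = [raw_text]
    for pair in qa_pairs:
        parts.append(f"Question: {pair['instruction']}\nAnswer: {pair['response']}")
    return "\n\n".join(parts)


raw_text = "The aurora borealis appears when charged solar particles collide with Earth's atmosphere."
qa_pairs = [
    {
        "instruction": "What causes the aurora borealis?",
        "response": "Charged particles from the sun colliding with Earth's atmosphere.",
    }
]
augmented = build_instruction_augmented_text(raw_text, qa_pairs)

# The augmented text is trained with the same causal LM (next-token) objective
# used for plain raw text; only the data changes, not the training objective.
tokenizer = AutoTokenizer.from_pretrained("gpt2")
batch = tokenizer(augmented, truncation=True, max_length=512, return_tensors="pt")
print(batch["input_ids"].shape)
```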
|
|
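The data sources listed above are public Hugging Face dataset repositories. The sketch below streams a few records from tiiuae/falcon-refinedweb; the two instruction-pretrain repositories load the same way, though they may require a configuration or data_files argument, so check their dataset cards first.

```python
# Minimal sketch: stream a few records from one of the listed data sources.
# tiiuae/falcon-refinedweb is used here; the instruction-pretrain repositories
# load similarly but may need a config or data_files argument (see their cards).
from datasets import load_dataset

ds = load_dataset("tiiuae/falcon-refinedweb", split="train", streaming=True)

for i, record in enumerate(ds):
    # RefinedWeb stores the page text in the "content" field.
    print(record["content"][:200])
    if i >= 2:
        break
```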
Release Notes |
Date: | |
Notes: | Our paper has been accepted to the EMNLP 2024 main conference. |
|
Date: | |
Notes: | Updated FAQ on continual pre-training from Llama3. |
|
Date: | |
Notes: | Updated guidelines on evaluating any Hugging Face model on domain-specific tasks. |
|
Date: | |
Notes: | Updated pre-training suggestions in the Advanced Usage section of instruction-synthesizer. |
|
Date: | |
Notes: | Scaled up pre-training from 100B to 250B tokens, with the number of synthesized instruction-response pairs reaching 500M. |
|
Date: | |
Notes: | Released the paper, code, and resources. |
|
|
|