Suzume Llama 3 8B Multilingual By lightblue: Benchmarks, Features and Detailed Analysis. Insights on Suzume Llama 3 8B Multilingual.

Arxiv:2405.12612 Autotrain compatible Base model:finetune:meta-llama... Base model:meta-llama/meta-lla... Conversational Endpoints compatible Generated from trainer Instruct Llama Pytorch Region:us Safetensors Sharded Tensorflow

Model Card on HF 🤗: https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual

Suzume Llama 3 8B Multilingual Benchmarks

MMLU Pro: 26.48

GPQA: 4.47

MUSR: 7.84

BBH: 28.9

IFEval: 66.78 vs 88 (so35)^-24.1%

ARC: 60.75 vs 96.7 (so35)^-37.2%

HellaSwag: 79.49 vs 95.3 (gpt4)^-16.6%

MMLU: 66.62 vs 88.3 (so35)^-24.6%

TruthfulQA: 48.7 vs 59 (gpt4)^-17.5%

WinoGrande: 76.16 vs 87.5 (gpt4)^-13%

GSM8K: 61.56 vs 96.4 (so35)^-36.1%

MATH Lvl 5: 9.44

LLME Score: 0.33122

^nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").

What is the LLM Explorer Rank (Score)

Suzume Llama 3 8B Multilingual (lightblue/suzume-llama-3-8B-multilingual)

Suzume Llama 3 8B Multilingual Parameters and Internals

Model Type

multilingual, text generation

Use Cases

Areas:

Research, Commercial applications

Applications:

Multilingual chat applications

Primary Use Cases:

Multilingual conversational AI

Limitations:

Excludes certain problem categories for Russian, Not yet fully evaluated

Considerations:

Ongoing evaluation and feedback encouraged

Additional Notes

Ongoing development with future releases

Supported Languages

languages_supported (multilingual), proficiency_levels (high)

Training Details

Data Sources:

lightblue/tagengo-gpt4, lmsys/lmsys-chat-1m, megagonlabs/instruction_ja, openchat/openchat_sharegpt4_dataset

Data Volume:

90,000 multilingual conversations

Methodology:

finetuning

Context Length:

8192

Training Time:

2.5 hours

Hardware Used:

4 x A100 (80GB) GPUs

Model Architecture:

LlamaForCausalLM

Input Output

Input Format:

Prompt messages should be constructed in a chat format

Accepted Modalities:

text

Output Format:

Text generation in response format

Performance Tips:

Utilize vLLM for optimal inference speed

LLM Name	Suzume Llama 3 8B Multilingual
Repository 🤗	https://huggingface.co/lightblue/suzume-llama-3-8B-multilingual
Base Model(s)	Meta Llama 3 8B Instruct meta-llama/Meta-Llama-3-8B-Instruct
Model Size	8b
Required VRAM	16.1 GB
Updated	2025-02-22
Maintainer	lightblue
Model Type	llama
Instruction-Based	Yes
Model Files	5.0 GB: 1-of-4 5.0 GB: 2-of-4 4.9 GB: 3-of-4 1.2 GB: 4-of-4 16.1 GB
Model Architecture	LlamaForCausalLM
License	other
Context Length	8192
Model Max Length	8192
Transformers Version	4.38.2
Tokenizer Class	PreTrainedTokenizerFast
Padding Token	<\|end_of_text\|>
Vocabulary Size	128256
Torch Data Type	bfloat16

Quantized Models of the Suzume Llama 3 8B Multilingual

Model	Likes	Downloads	VRAM
Suzume Llama 3 8B Multilingual	0	9	4 GB

Best Alternatives to Suzume Llama 3 8B Multilingual

Best Alternatives	Context / RAM	Downloads	Likes
...a 3 8B Instruct Gradient 1048K	1024K / 16.1 GB	3927	680
Mpasila Viking 8B	1024K / 16.1 GB	84	0
Hel V2 8B DARK FICTION	1024K / 16.1 GB	22	0
16	1024K / 16.1 GB	169	0
...di95 LewdStorytellerMix 8B 64K	1024K / 16.1 GB	69	2
Because Im Bored Nsfw1	1024K / 16.1 GB	36	1
12	1024K / 16.1 GB	60	0
MrRoboto ProLong 8B V4b	1024K / 16.1 GB	107	0
MrRoboto ProLong 8B V1a	1024K / 16.1 GB	108	0
MrRoboto ProLong 8B V2a	1024K / 16.1 GB	102	0

Note: green Score (e.g. "73.2") means that the model is better than lightblue/suzume-llama-3-8B-multilingual.

Rank the Suzume Llama 3 8B Multilingual Capabilities

🆘 Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation
Factuality and Completeness of Knowledge
Censorship and Alignment
Data Analysis and Insight Generation
Text Generation
Text Summarization and Feature Extraction
Code Generation
Multi-Language Support and Translation

What open-source LLMs or SLMs are you in search of? 43470 in total.

Email us: info@extractum.io. Our Privacy Policy | Terms and Conditions | Suggest an improvement.

Our Social Media →

Original data from HuggingFace, OpenCompass and various public git repos.

Release v20241227

Support LLM Explorer