Suzume Llama 3 8B Japanese by lightblue: Benchmarks, Features, and Detailed Analysis

Tags: arXiv:2405.12612, AutoTrain compatible, Base model:finetune:meta-llama..., Base model:meta-llama/meta-lla..., Conversational, Endpoints compatible, Generated from trainer, Instruct, Llama, PyTorch, Region:us, Safetensors, Sharded, TensorFlow

Model Card on HF 🤗: https://huggingface.co/lightblue/suzume-llama-3-8B-japanese

Suzume Llama 3 8B Japanese Benchmarks

LLME Score: 0.21234

Suzume Llama 3 8B Japanese Parameters and Internals

Model Type

Fine-tuned LLM, Language Model, Causal Language Model

Use Cases

Areas:

Research, Commercial applications

Applications:

Language processing, Translation

Primary Use Cases:

Japanese language chat, Multilingual tasks

Limitations:

May perform less well in languages other than Japanese and English.

Additional Notes

Fine-tuned for enhanced Japanese conversation ability.

Supported Languages

Japanese (high), English (high)

Training Details

Data Sources:

megagonlabs/instruction_ja, kunishou/hh-rlhf-49k-ja, openchat/openchat_sharegpt4_dataset, lightblue/tagengo-gpt4

Data Volume:

3,000+ conversations

Methodology:

Fine-tuning

Context Length:

8192

Hardware Used:

multi-GPU, 3 devices

Model Architecture:

LlamaForCausalLM

Input Output

Input Format:

Text

Accepted Modalities:

Text

Output Format:

Text

Performance Tips:

Use vLLM for optimal performance.
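
The tip above can be sketched as follows. This is a minimal example, assuming vLLM is installed and a GPU with enough memory is available (this page lists 16.1 GB of required VRAM). The helper name `generate_japanese`, the sampling settings, and the choice to pass raw prompts without a chat template are illustrative assumptions, not values taken from this page.

```python
MODEL_ID = "lightblue/suzume-llama-3-8B-japanese"  # repository listed on this page
MAX_CONTEXT = 8192  # context length listed on this page


def generate_japanese(prompts, max_tokens=100, temperature=0.0):
    """Generate one completion per prompt with vLLM (hypothetical helper)."""
    # Imported lazily so this module can be loaded without vLLM installed.
    from vllm import LLM, SamplingParams

    # Sampling settings are illustrative assumptions, not recommended values.
    sampling = SamplingParams(temperature=temperature, max_tokens=max_tokens)
    llm = LLM(model=MODEL_ID, max_model_len=MAX_CONTEXT)
    outputs = llm.generate(prompts, sampling)
    # Each result holds one or more candidates; keep the first one's text.
    return [out.outputs[0].text for out in outputs]
```

A call such as `generate_japanese(["東京のおすすめの観光スポットを教えて下さい"])` would return a list with one generated string. The helper loads the model on every call for brevity; a long-running service would more likely create the `LLM` object once and reuse it.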

LLM Name	Suzume Llama 3 8B Japanese
Repository 🤗	https://huggingface.co/lightblue/suzume-llama-3-8B-japanese
Base Model(s)	Meta Llama 3 8B Instruct (meta-llama/Meta-Llama-3-8B-Instruct)
Model Size	8B
Required VRAM	16.1 GB
Updated	2025-02-22
Maintainer	lightblue
Model Type	llama
Instruction-Based	Yes
Model Files	4 shards: 5.0 GB (1-of-4), 5.0 GB (2-of-4), 4.9 GB (3-of-4), 1.2 GB (4-of-4); 16.1 GB total
Model Architecture	LlamaForCausalLM
License	other
Context Length	8192
Model Max Length	8192
Transformers Version	4.40.0.dev0
Tokenizer Class	PreTrainedTokenizerFast
Padding Token	<|end_of_text|>
Vocabulary Size	128256
Torch Data Type	bfloat16
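
The Required VRAM figure can be sanity-checked with simple arithmetic. The sketch below sums the shard sizes from the Model Files row and compares the total with a weights-only estimate derived from parameter count and data type. The parameter count of roughly 8.03 billion is an assumption (it is not stated on this page), and the estimate ignores KV cache and activation memory, so real usage during inference will be higher.

```python
# Shard sizes in GB, copied from the "Model Files" row above.
SHARD_SIZES_GB = [5.0, 5.0, 4.9, 1.2]

# Bytes per parameter for common torch data types.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def total_shard_size_gb(shards):
    """Sum shard sizes, rounded to one decimal place like the table."""
    return round(sum(shards), 1)


def weights_size_gb(num_params, dtype):
    """Weights-only size in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


ASSUMED_PARAMS = 8.03e9  # assumption: approximate Llama 3 8B parameter count

print(total_shard_size_gb(SHARD_SIZES_GB))
print(round(weights_size_gb(ASSUMED_PARAMS, "bfloat16"), 1))
```

Under these assumptions both numbers come out at about 16.1 GB, which is consistent with the bfloat16 data type listed in the table.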

Best Alternatives to Suzume Llama 3 8B Japanese

Best Alternatives	Context / RAM	Downloads	Likes
...a 3 8B Instruct Gradient 1048K	1024K / 16.1 GB	3927	680
Mpasila Viking 8B	1024K / 16.1 GB	84	0
Hel V2 8B DARK FICTION	1024K / 16.1 GB	22	0
16	1024K / 16.1 GB	169	0
...di95 LewdStorytellerMix 8B 64K	1024K / 16.1 GB	69	2
Because Im Bored Nsfw1	1024K / 16.1 GB	36	1
12	1024K / 16.1 GB	60	0
MrRoboto ProLong 8B V4b	1024K / 16.1 GB	107	0
MrRoboto ProLong 8B V1a	1024K / 16.1 GB	108	0
MrRoboto ProLong 8B V2a	1024K / 16.1 GB	102	0



Original data from HuggingFace, OpenCompass and various public git repos.
