Llama3 8B Chinese Chat by shenzhi-wang


Tags: Autotrain compatible, Base model:finetune:meta-llama..., Base model:meta-llama/meta-lla..., Conversational, Doi:10.57967/hf/2316, En, Endpoints compatible, Instruct, Llama, Llama-factory, Orpo, Region:us, Safetensors, Sharded, Tensorflow, Zh

Llama3 8B Chinese Chat Benchmarks

Llama3 8B Chinese Chat (shenzhi-wang/Llama3-8B-Chinese-Chat)

Llama3 8B Chinese Chat Parameters and Internals

Model Type 
text generation
Additional Notes 
The model is fine-tuned primarily for Chinese and English users and supports abilities such as roleplaying and tool use. To preserve performance, the model's identity was not fine-tuned.
Supported Languages 
Chinese (high), English (medium)
Training Details 
Data Sources:
mixed Chinese-English dataset
Data Volume:
~100K preference pairs
Methodology:
ORPO (Reference-free Monolithic Preference Optimization with Odds Ratio)
Context Length:
8192
Model Architecture:
Meta-Llama-3
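The ORPO methodology listed above adds an odds-ratio penalty to the usual supervised fine-tuning loss, so no separate reference model is needed. The following is a minimal sketch of the per-pair objective in plain Python, assuming the formulation from the ORPO paper. The function names, the weight `lam`, and its default value are illustrative assumptions and are not taken from this model's training code.

```python
import math


def odds(p: float) -> float:
    """Odds of an event with probability p, i.e. p / (1 - p)."""
    return p / (1.0 - p)


def orpo_loss(p_chosen: float, p_rejected: float,
              nll_chosen: float, lam: float = 0.1) -> float:
    """ORPO-style loss for one preference pair (illustrative sketch).

    p_chosen / p_rejected: length-normalized sequence likelihoods in (0, 1).
    nll_chosen: supervised negative log-likelihood of the chosen response.
    lam: hypothetical weight of the odds-ratio term.
    """
    log_odds_ratio = math.log(odds(p_chosen)) - math.log(odds(p_rejected))
    # -log(sigmoid(x)) written as log(1 + exp(-x))
    odds_ratio_term = math.log1p(math.exp(-log_odds_ratio))
    return nll_chosen + lam * odds_ratio_term
```

The penalty shrinks as the chosen response becomes more likely than the rejected one, which is how a single training stage can cover both instruction tuning and preference alignment.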
Input Output 
Input Format:
instruction-based prompts
Accepted Modalities:
text
Output Format:
text
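To illustrate what an instruction-based prompt looks like for this model, here is a small sketch that renders chat messages by hand. It assumes the fine-tune keeps the chat layout of its base model, Meta-Llama-3-8B-Instruct; the helper name `build_llama3_prompt` is hypothetical, and in practice the tokenizer's own `apply_chat_template` method is the safer way to produce this string.

```python
def build_llama3_prompt(messages):
    """Render chat messages into the Llama 3 Instruct prompt layout.

    Assumption: the model uses the same special tokens as its base model.
    """
    parts = ["<|begin_of_text|>"]
    for message in messages:
        parts.append(
            f"<|start_header_id|>{message['role']}<|end_header_id|>\n\n"
            f"{message['content'].strip()}<|eot_id|>"
        )
    # Open an assistant turn so the model generates the reply next.
    parts.append("<|start_header_id|>assistant<|end_header_id|>\n\n")
    return "".join(parts)


prompt = build_llama3_prompt(
    [{"role": "user", "content": "你好,请介绍一下你自己。"}]
)
print(prompt)
```

Each turn ends with `<|eot_id|>`, which is the token the model emits when it finishes a reply.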
Release Notes 
Version:
v2.1
Date:
May 6, 2024
Notes:
Training dataset is 5x larger than v1's (~100K preference pairs). Improved roleplay, function calling, and math. Less prone to including English words in Chinese responses.
Version:
v2
Date:
Apr. 29, 2024
Notes:
Training data size increased from 20K to 100K preference pairs; improved performance in roleplay, tool use, and math.
Version:
v1
Notes:
Significantly reduces issues of 'Chinese questions with English answers' and the mixing of Chinese and English in responses.
LLM Name: Llama3 8B Chinese Chat
Repository: https://huggingface.co/shenzhi-wang/Llama3-8B-Chinese-Chat
Base Model(s): Meta Llama 3 8B Instruct (meta-llama/Meta-Llama-3-8B-Instruct)
Model Size: 8B
Required VRAM: 16.1 GB
Updated: 2025-01-24
Maintainer: shenzhi-wang
Model Type: llama
Instruction-Based: Yes
Model Files: 5.0 GB (1 of 4), 5.0 GB (2 of 4), 4.9 GB (3 of 4), 1.2 GB (4 of 4)
Supported Languages: en, zh
Model Architecture: LlamaForCausalLM
License: llama3
Context Length: 8192
Model Max Length: 8192
Transformers Version: 4.40.0
Tokenizer Class: PreTrainedTokenizerFast
Padding Token: <|eot_id|>
Vocabulary Size: 128256
Torch Data Type: bfloat16
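The required-VRAM figure can be cross-checked against the shard sizes and the data type listed above. The sketch below sums the four model files and compares the result with a weights-only estimate; the parameter count of about 8.03 billion is an assumption (the figure commonly cited for Llama 3 8B) and does not appear on this page.

```python
# Shard sizes in GB, copied from the "Model Files" entry.
shard_sizes_gb = [5.0, 5.0, 4.9, 1.2]
total_checkpoint_gb = round(sum(shard_sizes_gb), 1)

# Assumption: roughly 8.03 billion parameters, each stored in bfloat16,
# which takes 2 bytes per value.
approx_params = 8.03e9
bytes_per_param = 2
weights_gb = approx_params * bytes_per_param / 1e9

print(f"checkpoint size: {total_checkpoint_gb} GB")
print(f"bfloat16 weights: {weights_gb:.1f} GB")
```

Both numbers round to 16.1 GB, matching the listed requirement. This covers the weights only; activations and the key-value cache need additional memory at inference time, more so near the 8192-token context limit.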

Quantized Models of the Llama3 8B Chinese Chat

Model | Likes | Downloads | VRAM
... Chinese Chat AWQ 4bit Smashed | 0 | 6 | 5 GB

Best Alternatives to Llama3 8B Chinese Chat

Best Alternatives | Context / RAM | Downloads, Likes
...a 3 8B Instruct Gradient 1048K | 1024K / 16.1 GB | 6542680
Because Im Bored Nsfw1 | 1024K / 16.1 GB | 571
16 | 1024K / 16.1 GB | 1690
12 | 1024K / 16.1 GB | 600
MrRoboto ProLong 8B V4b | 1024K / 16.1 GB | 1070
MrRoboto ProLong 8B V1a | 1024K / 16.1 GB | 1080
MrRoboto ProLong 8B V2a | 1024K / 16.1 GB | 1020
MrRoboto ProLong 8B V4c | 1024K / 16.1 GB | 870
8B Unaligned BASE V2b | 1024K / 16.1 GB | 980
...o ProLongBASE Pt6 Unaligned 8B | 1024K / 16.1 GB | 710

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227