Yi 6B by 01-ai


Tags: Arxiv:2311.16502, Arxiv:2401.11944, Arxiv:2403.04652, Autotrain compatible, Endpoints compatible, Llama, PyTorch, Region: US, Safetensors, Sharded, TensorFlow
Model Card on HF 🤗: https://huggingface.co/01-ai/Yi-6B

Yi 6B Benchmarks

[Benchmark chart: Yi 6B (01-ai/Yi-6B)]

Yi 6B Parameters and Internals

Model Type 
text generation, chat
Use Cases 
Areas:
research, commercial applications, personal use
Primary Use Cases:
text and chat generation
Limitations:
May produce hallucinations, non-deterministic outputs across re-generations, and cumulative errors in long generations
Considerations:
Adjust generation parameters for diverse responses
Additional Notes 
Yi uses the Llama architecture but is not a Llama derivative; it was trained independently from scratch on its own datasets.
Supported Languages 
English (high), Chinese (high)
Training Details 
Data Sources:
multilingual corpus, custom datasets developed by Yi
Data Volume:
3T tokens
Methodology:
Supervised Fine-Tuning (SFT) for chat models
Context Length:
200,000 (Yi 200K variants; the base Yi-6B uses a 4096-token context, as listed below)
Training Time:
unknown
Hardware Used:
NVIDIA A800 GPUs
Model Architecture:
Transformer-based, similar to Llama
Responsible AI Considerations 
Fairness:
Not detailed
Transparency:
Open-source distribution under Apache 2.0
Accountability:
Not specified
Mitigation Strategies:
Uses compliance checking algorithms to maximize data compliance
Input Output 
Input Format:
Text input for prompts
Accepted Modalities:
text
Output Format:
Generated text output
Performance Tips:
Use appropriate generation settings (temperature, top_p) for the desired output diversity; see the sketch below
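
A minimal generation sketch illustrating these settings with Hugging Face transformers; the prompt and parameter values are illustrative examples, not recommendations from 01-ai:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("01-ai/Yi-6B")
model = AutoModelForCausalLM.from_pretrained(
    "01-ai/Yi-6B",
    torch_dtype=torch.bfloat16,  # matches the Torch Data Type listed below
    device_map="auto",
)

inputs = tokenizer("There's a place where time stands still.", return_tensors="pt").to(model.device)
outputs = model.generate(
    **inputs,
    max_new_tokens=128,
    do_sample=True,   # sampling must be enabled for temperature/top_p to apply
    temperature=0.7,  # example value; lower values give more deterministic output
    top_p=0.9,        # example value; nucleus sampling cutoff
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```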
Release Notes 
Version:
Yi 1.5
Date:
2024-05-13
Notes:
Improved coding, math, and reasoning abilities
LLM Name: Yi 6B
Repository 🤗: https://huggingface.co/01-ai/Yi-6B
Model Size: 6B
Required VRAM: 12.1 GB
Updated: 2025-02-05
Maintainer: 01-ai
Model Type: llama
Model Files: 9.9 GB (1-of-2), 2.2 GB (2-of-2)
Model Architecture: LlamaForCausalLM
License: apache-2.0
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.34.0
Tokenizer Class: LlamaTokenizer
Padding Token: <unk>
Vocabulary Size: 64000
Torch Data Type: bfloat16
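
As a sanity check on the Required VRAM figure, bfloat16 weights take two bytes per parameter, so the weights alone account for essentially all of the 12.1 GB; a back-of-the-envelope sketch (the ~6.06B parameter count is an assumption based on the published Yi-6B configuration):

```python
# Rough VRAM estimate for Yi 6B's bfloat16 weights alone
# (excludes KV cache, activations, and framework overhead).
n_params = 6.06e9    # assumed parameter count for Yi-6B
bytes_per_param = 2  # bfloat16 stores each parameter in 2 bytes
print(f"~{n_params * bytes_per_param / 1e9:.1f} GB")  # ~12.1 GB
```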

Quantized Models of Yi 6B

Model                              | Likes | Downloads | VRAM
Yi 6B GGUF                         | 14    | 665       | 2 GB
Yi 6B GPTQ                         | 1     | 62        | 3 GB
Yi 6B AWQ                          | 1     | 5         | 3 GB
... Spicyboros 3.1 4.0bpw H6 EXL2  | 3     | 17        | 3 GB
... Spicyboros 3.1 3.0bpw H6 EXL2  | 1     | 17        | 2 GB
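
For the GGUF build listed above, a minimal loading sketch using the third-party llama-cpp-python package; the .gguf file name is hypothetical, so substitute an actual file downloaded from the quantized repo:

```python
from llama_cpp import Llama

# Hypothetical file name; use a real .gguf file from the Yi 6B GGUF repo.
llm = Llama(model_path="yi-6b.Q4_K_M.gguf", n_ctx=4096)
out = llm("Once upon a time", max_tokens=64, temperature=0.7, top_p=0.9)
print(out["choices"][0]["text"])
```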

Best Alternatives to Yi 6B

Best Alternatives          | Context / RAM  | Downloads | Likes
Yi 6B 200K                 | 195K / 12.1 GB | 8396      | 172
Yi 6B 200K AEZAKMI V2      | 195K / 12.1 GB | 1284      | 1
Yi 6B 200K DPO             | 195K / 12.1 GB | 1340      | 0
Wukong Yi 6B 200K          | 195K / 12.1 GB | 14        | 1
Barcenas 6B 200K           | 195K / 12.1 GB | 1323      | 2
Yi 6B 200K Llama           | 195K / 12.1 GB | 8         | 5
Yi 6B 200K Llamafied       | 195K / 12.1 GB | 14        | 11
Llama 3.2 6B AlgoCode      | 128K / 12.7 GB | 692       | 7
Chatglm2 6B Port Llama     | 32K / 12.5 GB  | 8         | 4
Miqu 6B Truthy             | 31K / 11.3 GB  | 119       | 1


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227