Yi 6B 200K by 01-ai


  Arxiv:2311.16502   Arxiv:2401.11944   Arxiv:2403.04652   Autotrain compatible   Endpoints compatible   Llama   Pytorch   Region:us   Safetensors   Sharded   Tensorflow
Model Card on HF 🤗: https://huggingface.co/01-ai/Yi-6B-200K

Yi 6B 200K Parameters and Internals

Model Type 
Chat model, Text generation
Use Cases 
Areas:
Chat applications, Creative content generation
Applications:
Commercial applications, Research, Educational tools
Primary Use Cases:
Chatbots, Virtual assistants, Story generation
Limitations:
Potential for hallucination, May produce inconsistent outputs
Considerations:
Adjust generation parameters for desired output qualities.
Additional Notes 
Yi models follow Llama's architecture but do not use Llama's weights; they were developed independently, with their own datasets and training infrastructure.
Supported Languages 
English (Fluent), Chinese (Fluent)
Training Details 
Data Sources:
Multilingual corpora, 3T tokens
Data Volume:
3T Multilingual Corpus
Methodology:
Transformer-based architecture
Context Length:
200000
Training Time:
Not specified
Hardware Used:
NVIDIA A800 (80GB), 4090 GPU
Model Architecture:
Based on Llama's architecture
Responsible AI Considerations 
Fairness:
Addressed during model development.
Transparency:
Standard Transformer architecture; detailed in tech report.
Accountability:
01.AI
Mitigation Strategies:
Use of Supervised Fine-Tuning for better accuracy.
Input Output 
Input Format:
Interactive prompt conversation
Accepted Modalities:
Text
Output Format:
Text responses or follow-ups
Performance Tips:
Calibrate temperature, top_p, top_k settings for desired response diversity.
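As a rough illustration of what these three settings do, the sketch below applies temperature scaling, top-k filtering, and top-p (nucleus) filtering to a list of raw logits and then samples one token index. It is a minimal, library-free sketch of the general technique, not the sampling code used by Yi or by any particular inference stack; the function name `sample_next_token` and its defaults are hypothetical.

```python
import math
import random


def sample_next_token(logits, temperature=1.0, top_k=0, top_p=1.0, rng=random):
    """Sample one token index from raw logits (illustrative helper, not a library API)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")

    # Temperature: values below 1 sharpen the distribution, values above 1 flatten it.
    scaled = [x / temperature for x in logits]

    # Softmax, subtracting the maximum for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    probs = [e / total for e in exps]

    # Rank token indices from most to least likely.
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)

    # Top-k: keep only the k most likely tokens (0 disables this filter).
    if top_k > 0:
        ranked = ranked[:top_k]

    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for i in ranked:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break

    # Draw one token from the survivors, weighted by their probabilities.
    weights = [probs[i] for i in kept]
    return rng.choices(kept, weights=weights, k=1)[0]
```

Lower temperature, smaller top_k, and smaller top_p all narrow the set of likely candidates and make responses more repeatable; raising them increases diversity. In the Hugging Face transformers library, the same three names are accepted as arguments to `generate` when `do_sample=True`.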
Release Notes 
Version:
1.0
Date:
2023-11-23
Notes:
Initial open-source release of chat model, supporting both 4-bit and 8-bit quantizations.
Version:
2.0
Date:
2023-12-19
Notes:
Improved performance in coding, math, and reasoning with larger context capabilities.
LLM Name: Yi 6B 200K
Repository 🤗: https://huggingface.co/01-ai/Yi-6B-200K
Model Size: 6B
Required VRAM: 12.1 GB
Updated: 2024-12-21
Maintainer: 01-ai
Model Type: llama
Model Files: 9.9 GB: 1-of-2   2.2 GB: 2-of-2   9.9 GB: 1-of-2   2.2 GB: 2-of-2
Model Architecture: LlamaForCausalLM
License: apache-2.0
Context Length: 200000
Model Max Length: 200000
Transformers Version: 4.34.0
Tokenizer Class: LlamaTokenizer
Padding Token: <unk>
Vocabulary Size: 64000
Torch Data Type: bfloat16
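
The Required VRAM figure lines up with the listed shard sizes and data type, which the short calculation below reproduces. It is a back-of-the-envelope sketch under stated assumptions: GB is taken as 10^9 bytes, only the weights are counted (activations and the KV cache, which grows with context length, are excluded), and the 8-bit and 4-bit estimates ignore quantization overhead such as scales and zero-points. The helper name `weight_memory_gb` is hypothetical.

```python
def weight_memory_gb(num_params, bits_per_param):
    """Approximate weight storage in GB (10^9 bytes); illustrative helper only."""
    return num_params * bits_per_param / 8 / 1e9


# Shard sizes taken from the Model Files entry (one set of two shards).
shard_sizes_gb = [9.9, 2.2]
checkpoint_gb = sum(shard_sizes_gb)  # matches the 12.1 GB Required VRAM entry

# bfloat16 stores each parameter in 2 bytes, so the checkpoint size
# implies roughly this many parameters (assumption: weights dominate the file).
bytes_per_param = 2
implied_params = checkpoint_gb * 1e9 / bytes_per_param

print(f"checkpoint: {checkpoint_gb:.1f} GB, implied parameters: {implied_params / 1e9:.2f}B")

# Rough weight footprint at the bit widths mentioned in the release notes.
for bits in (16, 8, 4):
    print(f"{bits}-bit weights: about {weight_memory_gb(implied_params, bits):.2f} GB")
```

The implied parameter count of about 6 billion is consistent with the 6B model size, and halving the bit width roughly halves the weight footprint.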

Quantized Models of the Yi 6B 200K

Model
Likes
Downloads
VRAM
Yi 6B 200K GGUF   0532 GB
Yi 6B 200K GGUF   283172 GB
Yi 6B 200K GPTQ   2683 GB
Yi 6B 200K AWQ   2593 GB

Best Alternatives to Yi 6B 200K

Best Alternatives
Context / RAM
Downloads
Likes
Yi 6B 200K AEZAKMI V2   195K / 12.1 GB   11261
Yi 6B 200K DPO   195K / 12.1 GB   11900
Wukong Yi 6B 200K   195K / 12.1 GB   141
Barcenas 6B 200K   195K / 12.1 GB   12062
Yi 6B 200K Llamafied   195K / 12.1 GB   2111
Yi 6B 200K Llama   195K / 12.1 GB   185
Chatglm2 6B Port Llama   32K / 12.5 GB   134
Miqu 6B Truthy   31K / 11.3 GB   5031
Llama 3 6B V0.1   8K / 12.6 GB   3712
Pruned Llama 3 Oasis 6B   8K / 13 GB   121

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217