DeepSeek Coder V2 Lite Instruct by deepseek-ai


Arxiv:2401.06066 · Autotrain compatible · Codegen · Conversational · Custom code · Deepseek v2 · Endpoints compatible · Instruct · Region:us · Safetensors · Sharded · Tensorflow

DeepSeek Coder V2 Lite Instruct Benchmarks

Benchmark scores are shown as percentages ("nn.n%") relative to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), and GPT-4 ("gpt4").

DeepSeek Coder V2 Lite Instruct Parameters and Internals

Model Type 
Mixture-of-Experts (MoE), code language model
Use Cases 
Areas:
code-specific tasks, math and reasoning, broad programming-language coverage
Applications:
AI code assistance, software development, research in code intelligence
Primary Use Cases:
Code completion, Code insertion (fill-in-the-middle; see the sketch after this list), Chatbot assistance for coding queries
Limitations:
Optimal performance requires the recommended hardware; inference frameworks must support the model's custom code (e.g., Hugging Face Transformers with trust_remote_code, or vLLM)
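Code insertion works through fill-in-the-middle (FIM) prompting. A minimal sketch, assuming the FIM sentinel tokens used across the DeepSeek Coder series (`<｜fim▁begin｜>`, `<｜fim▁hole｜>`, `<｜fim▁end｜>`; verify the exact spellings against the repository's tokenizer config):

```python
# Fill-in-the-middle prompt: the model generates the code that belongs at
# the hole position, conditioned on both the prefix and the suffix.
# The sentinel spellings are an assumption carried over from the DeepSeek
# Coder series -- check tokenizer_config.json before relying on them.
prompt = (
    "<｜fim▁begin｜>def quick_sort(arr):\n"
    "    if len(arr) <= 1:\n"
    "        return arr\n"
    "<｜fim▁hole｜>\n"
    "    return quick_sort(left) + [pivot] + quick_sort(right)<｜fim▁end｜>"
)
# Feed `prompt` to the model as a plain completion (no chat template);
# the generated text is the code that fills the hole.
```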
Additional Notes 
Supported programming languages expanded from 86 to 338. The license allows commercial use.
Supported Languages 
Programming languages: 338 (extended from 86); high proficiency in code-specific tasks
Training Details 
Data Sources:
Further pre-trained from an intermediate checkpoint of DeepSeek-V2 on an additional 6 trillion tokens, built on the DeepSeekMoE framework
Data Volume:
6 trillion tokens
Methodology:
Mixture-of-experts routing for enhanced coding and reasoning (a generic routing sketch follows this list)
Context Length:
128000
Hardware Used:
BF16 inference of the full DeepSeek-Coder-V2 (236B) requires 8x80 GB GPUs; the 15.7B Lite model fits in roughly 31.4 GB of VRAM
Model Architecture:
Mixture-of-Experts, with only about 2.4B of the 15.7B parameters active per token
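To make the mixture-of-experts methodology concrete, below is a generic top-k expert-routing sketch in PyTorch. It is a simplified illustration with placeholder names (`gate`, `experts`, `k`), not DeepSeek's actual DeepSeekMoE layer, which additionally uses shared experts and fine-grained expert segmentation:

```python
import torch
import torch.nn.functional as F

def moe_layer(x, gate, experts, k=2):
    """Generic top-k MoE routing: each token is processed only by its k
    highest-scoring experts, and their outputs are combined by softmax weight."""
    logits = gate(x)                          # (tokens, n_experts) router scores
    weights, idx = logits.topk(k, dim=-1)     # pick top-k experts per token
    weights = F.softmax(weights, dim=-1)      # normalize over the k picks
    out = torch.zeros_like(x)
    for slot in range(k):
        for e, expert in enumerate(experts):
            mask = idx[:, slot] == e          # tokens whose slot-th pick is expert e
            if mask.any():
                out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
    return out

# Tiny usage example with random weights:
d_model, n_experts = 64, 8
gate = torch.nn.Linear(d_model, n_experts)
experts = [torch.nn.Linear(d_model, d_model) for _ in range(n_experts)]
y = moe_layer(torch.randn(10, d_model), gate, experts, k=2)
print(y.shape)  # torch.Size([10, 64])
```

Only k experts run per token, which is why a 15.7B-parameter MoE can decode at the cost of a much smaller dense model.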
Input Output 
Input Format:
Prompt-based input
Accepted Modalities:
text
Output Format:
Model-generated text responses
Performance Tips:
Use the Hugging Face Transformers or vLLM frameworks for optimal inference (see the sketch below).
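A minimal chat-inference sketch with Hugging Face Transformers, assuming a GPU with enough memory for the BF16 weights; the quick-sort request is just an illustrative prompt (`trust_remote_code` is needed because the deepseek_v2 architecture ships as custom code):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"

tokenizer = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    MODEL,
    trust_remote_code=True,
    torch_dtype=torch.bfloat16,  # ~31.4 GB of weights in BF16
    device_map="auto",
)

messages = [{"role": "user", "content": "Write a quick sort algorithm in Python."}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=512, do_sample=False)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
```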
LLM Name: DeepSeek Coder V2 Lite Instruct
Repository: 🤗 https://huggingface.co/deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct
Model Size: 15.7b
Required VRAM: 31.4 GB
Updated: 2024-11-21
Maintainer: deepseek-ai
Model Type: deepseek_v2
Instruction-Based: Yes
Model Files: 8.6 GB (1-of-4), 8.6 GB (2-of-4), 8.6 GB (3-of-4), 5.6 GB (4-of-4)
Generates Code: Yes
Model Architecture: DeepseekV2ForCausalLM
License: other
Context Length: 163840
Model Max Length: 163840
Transformers Version: 4.39.3
Tokenizer Class: LlamaTokenizerFast
Beginning of Sentence Token: <|begin▁of▁sentence|>
End of Sentence Token: <|end▁of▁sentence|>
Vocabulary Size: 102400
Torch Data Type: bfloat16
DeepSeek Coder V2 Lite Instruct (deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct)
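The 163,840-token context length above is the config maximum; serving the full window takes substantial KV-cache memory. A minimal vLLM sketch, with `max_model_len` capped at an illustrative 8,192 tokens so the cache fits comfortably on one GPU:

```python
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams

MODEL = "deepseek-ai/DeepSeek-Coder-V2-Lite-Instruct"

# Cap the context well below the 163,840-token maximum so the KV cache
# fits on a single GPU; raise the cap if memory allows.
llm = LLM(model=MODEL, trust_remote_code=True, max_model_len=8192)

tok = AutoTokenizer.from_pretrained(MODEL, trust_remote_code=True)
messages = [{"role": "user", "content": "Write a binary search in Python."}]
prompt = tok.apply_chat_template(messages, add_generation_prompt=True, tokenize=False)

outputs = llm.generate([prompt], SamplingParams(temperature=0.0, max_tokens=256))
print(outputs[0].outputs[0].text)
```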

Quantized Models of the DeepSeek Coder V2 Lite Instruct

Model | Likes | Downloads | VRAM
...ek Coder V2 Lite Instruct GGUF | 5 | 3086 | 6 GB
...ek Coder V2 Lite Instruct GGUF | 1 | 187 | 6 GB
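The GGUF quantizations above run in about 6 GB of VRAM via llama.cpp. A sketch using llama-cpp-python; the repo id and quantization filename are placeholders, since the table truncates the actual repository names:

```python
from llama_cpp import Llama

# Hypothetical repo id and quant level -- substitute one of the GGUF
# repositories from the table above.
llm = Llama.from_pretrained(
    repo_id="<gguf-repo-id>",
    filename="*Q4_K_M.gguf",   # quantization choice is illustrative
    n_gpu_layers=-1,           # offload all layers to the GPU
    n_ctx=4096,
)
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Write a quick sort in Python."}]
)
print(out["choices"][0]["message"]["content"])
```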

Best Alternatives to DeepSeek Coder V2 Lite Instruct

Best Alternatives | Context / RAM | Downloads | Likes
...2 Lite Instruct FlashAttnPatch | 160K / 31.4 GB | 12 | 0



Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241110