Cerebras GPT 6.7B by cerebras


  Arxiv:2101.00027   Arxiv:2203.15556   Arxiv:2304.03208   Dataset:the pile   En   Gpt2   Pytorch   Region:us

Cerebras GPT 6.7B Benchmarks

Cerebras GPT 6.7B (cerebras/Cerebras-GPT-6.7B)

Cerebras GPT 6.7B Parameters and Internals

Model Type 
Transformer-based Language Model, Causal Language Model
Use Cases 
Areas:
NLP research, applications, ethics, alignment research
Applications:
NLP applications
Primary Use Cases:
Research into large language models, Foundation model for NLP
Limitations:
Not suitable for machine translation, Not tuned for human-facing dialog
Considerations:
Further testing and mitigations are required for safety-related applications.
Additional Notes 
Compatible with Hugging Face pipelines and Cerebras Model Studio for pre-training and fine-tuning. Checkpoints available in Cerebras Model Zoo.
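As a minimal sketch of the Hugging Face pipeline compatibility noted above, the checkpoint can be wrapped in a `text-generation` pipeline from the `transformers` library. The helper name `build_generator`, the prompt, and the generation settings are illustrative assumptions, not taken from this page. Calling the helper downloads the full checkpoint.

```python
MODEL_ID = "cerebras/Cerebras-GPT-6.7B"  # repository name as listed on this page


def build_generator(model_id: str = MODEL_ID):
    """Return a Hugging Face text-generation pipeline for the given checkpoint.

    Hypothetical helper for illustration; not part of any library.
    """
    # Imported lazily so that defining this helper does not require transformers.
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)
    return pipeline("text-generation", model=model, tokenizer=tokenizer)


# Example usage (downloads the weights, so it is left commented out):
# generator = build_generator()
# result = generator("Generative AI is ", max_new_tokens=50, do_sample=False)
# print(result[0]["generated_text"])
```

Because this is a base model that is not tuned for dialog, the prompt is best written as text to be continued rather than as an instruction.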
Supported Languages 
English (full proficiency)
Training Details 
Data Sources:
The Pile
Data Volume:
371B tokens
Methodology:
Training followed Chinchilla scaling laws with 20 tokens per model parameter using the Pile dataset.
Context Length:
2048
Hardware Used:
16 CS-2 wafer-scale systems
Model Architecture:
GPT-3 style model with full attention
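The 20-tokens-per-parameter rule in the methodology can be worked through as simple arithmetic. The sketch below assumes the nominal parameter count of 6.7 billion; the function name `chinchilla_token_budget` is hypothetical. How the result relates to the 371B figure listed under Data Volume is not stated on this page.

```python
TOKENS_PER_PARAM = 20  # ratio stated in the methodology above


def chinchilla_token_budget(n_params: int, tokens_per_param: int = TOKENS_PER_PARAM) -> int:
    """Training-token budget under a fixed tokens-per-parameter ratio."""
    return n_params * tokens_per_param


N_PARAMS = 6_700_000_000  # nominal size; the exact parameter count may differ
budget = chinchilla_token_budget(N_PARAMS)
print(f"{budget / 1e9:.0f}B tokens")  # prints "134B tokens"
```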
Responsible AI Considerations 
Fairness:
The Pile dataset has been analyzed for ethical concerns, including toxicity and gender bias.
Accountability:
Developers and researchers should ensure the appropriateness of model use.
Mitigation Strategies:
Standard Pile dataset pre-processing.
Input Output 
Accepted Modalities:
text
LLM Name: Cerebras GPT 6.7B
Repository: https://huggingface.co/cerebras/Cerebras-GPT-6.7B
Model Size: 6.7B
Required VRAM: 26.8 GB
Updated: 2024-12-22
Maintainer: cerebras
Model Type: gpt2
Model Files: 26.8 GB
Supported Languages: en
Model Architecture: AutoModel
License: apache-2.0
Vocabulary Size: 50257
Activation Function: gelu
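The 26.8 GB figure listed for both Required VRAM and Model Files is consistent with storing 6.7 billion parameters at 4 bytes each. The sketch below makes that arithmetic explicit. It assumes decimal gigabytes and counts weights only (no activations or KV cache); the 32-bit storage format is an inference from the numbers, not something this page states. The helper `weight_memory_gb` is hypothetical.

```python
# Bytes needed to store one parameter in common numeric formats.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "int8": 1}


def weight_memory_gb(n_params: int, dtype: str = "float32") -> float:
    """Approximate weight storage in decimal gigabytes (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


N_PARAMS = 6_700_000_000  # nominal size; the exact parameter count may differ
for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(N_PARAMS, dtype):.1f} GB")
```

Under the same assumptions, half-precision weights would need roughly half that memory.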

Best Alternatives to Cerebras GPT 6.7B

Model | Context / RAM | Downloads, Likes
Alpaca Cerebras 6.7B | 0K / 0 GB | 03
Vigogne Opt 6.7B Instruct | 0K / 0 GB | 02
...pseek Coder 6.7B Instruct GGUF | 0K / 2.8 GB | 5895173
Magicoder S DS 6.7B GGUF | 0K / 2.8 GB | 89876
...ydecompiler 3.7 6.7B V0.9 GGUF | 0K / 2.5 GB | 560
...enCodeInterpreter DS 6.7B GGUF | 0K / 2.5 GB | 622
Deepseek Coder 6.7B Base GGUF | 0K / 2.8 GB | 96711

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217