Cerebras GPT 6.7B by cerebras: Benchmarks, Features and Detailed Analysis

Tags: arXiv:2101.00027, arXiv:2203.15556, arXiv:2304.03208, Dataset: The Pile, en, gpt2, PyTorch, Region: US

Model Card on HF 🤗: https://huggingface.co/cerebras/Cerebras-GPT-6.7B

Cerebras GPT 6.7B Benchmarks

Benchmark	Score	Reference score	Reference model	Difference
ARC	35.07	96.7	so35	-63.7%
HellaSwag	59.36	95.3	gpt4	-37.7%
MMLU	25.93	88.3	so35	-70.6%
TruthfulQA	38.02	59	gpt4	-35.6%
WinoGrande	58.72	87.5	gpt4	-32.9%
GSM8K	0.53	96.4	so35	-99.5%

LLME Score: 0.15645

Difference: how the model compares with the reference model, which is Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
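
The percentage listed next to each reference score is consistent with a plain relative difference, (score - reference) / reference. The page does not state the formula, so treat it as an assumption; the sketch below (with the hypothetical helper `relative_gap`) recomputes the figures from the listed scores.

```python
def relative_gap(score: float, reference: float) -> float:
    """Relative difference of `score` versus `reference`, in percent, to one decimal."""
    return round((score - reference) / reference * 100, 1)


# (model score, reference score) pairs copied from the benchmark list above.
benchmarks = {
    "ARC": (35.07, 96.7),
    "HellaSwag": (59.36, 95.3),
    "MMLU": (25.93, 88.3),
    "TruthfulQA": (38.02, 59.0),
    "WinoGrande": (58.72, 87.5),
    "GSM8K": (0.53, 96.4),
}

# Assumed formula: relative difference against the reference model's score.
gaps = {name: relative_gap(score, ref) for name, (score, ref) in benchmarks.items()}

for name, gap in gaps.items():
    print(f"{name}: {gap}%")
```

A negative value means the model scores below the reference model on that benchmark.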

Cerebras GPT 6.7B (cerebras/Cerebras-GPT-6.7B)

Cerebras GPT 6.7B Parameters and Internals

Model Type

Transformer-based Language Model, Causal Language Model

Use Cases

Areas:

NLP research, applications, ethics, alignment research

Applications:

NLP applications

Primary Use Cases:

Research into large language models, Foundation model for NLP

Limitations:

Not suitable for machine translation, Not tuned for human-facing dialog

Considerations:

Further testing and mitigations are required for safety-related applications.

Additional Notes

Compatible with Hugging Face pipelines and Cerebras Model Studio for pre-training and fine-tuning. Checkpoints available in Cerebras Model Zoo.
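
As a minimal sketch of the Hugging Face pipeline compatibility mentioned above, the model can be loaded through the `transformers` text-generation pipeline. The helper name `build_generator` is hypothetical, and the snippet assumes that `transformers` and PyTorch are installed and that enough memory is available for the checkpoint.

```python
MODEL_ID = "cerebras/Cerebras-GPT-6.7B"  # repository name from the model card URL


def build_generator(model_id: str = MODEL_ID):
    """Return a text-generation pipeline for the given checkpoint.

    The import is done lazily so that defining this helper does not
    require `transformers` to be installed.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(model_id)  # downloads the weights
    return pipeline("text-generation", model=model, tokenizer=tokenizer)


def complete(prompt: str, max_new_tokens: int = 50) -> str:
    """Generate a greedy continuation of `prompt` (illustrative defaults)."""
    generator = build_generator()
    result = generator(prompt, max_new_tokens=max_new_tokens, do_sample=False)
    return result[0]["generated_text"]
```

Calling `complete("Generative AI is ")` would download the checkpoint on first use. Because the model is not tuned for dialog, it is better prompted with text to continue than with instructions.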

Supported Languages

English (full proficiency)

Training Details

Data Sources:

The Pile

Data Volume:

371B tokens (size of the tokenized Pile dataset)

Methodology:

Training followed Chinchilla scaling laws with 20 tokens per model parameter using the Pile dataset.

Context Length:

2048

Hardware Used:

16 CS-2 wafer-scale systems

Model Architecture:

GPT-3 style model with full attention
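
The Methodology and Context Length entries above imply a rough training budget. The sketch below works through that arithmetic, assuming a nominal parameter count of exactly 6.7 billion (the real count may differ slightly) and using a hypothetical helper `compute_optimal_tokens`.

```python
TOKENS_PER_PARAMETER = 20    # Chinchilla-style ratio stated in Methodology
CONTEXT_LENGTH = 2048        # tokens per training sequence
PILE_TOKENS = 371_000_000_000  # data volume listed above


def compute_optimal_tokens(num_parameters: int) -> int:
    """Training tokens implied by the 20-tokens-per-parameter rule."""
    return num_parameters * TOKENS_PER_PARAMETER


nominal_parameters = 6_700_000_000  # assumption: "6.7B" taken at face value

training_tokens = compute_optimal_tokens(nominal_parameters)
full_sequences = training_tokens // CONTEXT_LENGTH
share_of_pile = round(training_tokens / PILE_TOKENS, 2)

print(f"training tokens: {training_tokens:,}")
print(f"full 2048-token sequences: {full_sequences:,}")
print(f"share of the 371B-token dataset: {share_of_pile}")
```

Under these assumptions the budget is about 134B tokens, roughly a third of the listed data volume, so the 371B figure is better read as the dataset size than as the number of tokens the model was trained on.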

Responsible AI Considerations

Fairness:

The Pile dataset has been analyzed from an ethical standpoint, including for toxicity and gender bias.

Accountability:

Developers and researchers should ensure the appropriateness of model use.

Mitigation Strategies:

Standard Pile dataset pre-processing.

Input / Output

Accepted Modalities:

text

LLM Name	Cerebras GPT 6.7B
Repository 🤗	https://huggingface.co/cerebras/Cerebras-GPT-6.7B
Model Size	6.7B
Required VRAM	26.8 GB
Updated	2024-12-22
Maintainer	cerebras
Model Type	gpt2
Model Files	26.8 GB
Supported Languages	en
Model Architecture	AutoModel
License	apache-2.0
Vocabulary Size	50257
Activation Function	gelu
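
The Required VRAM figure in the table matches what storing the weights in 32-bit floats would take. That reading is an assumption, since the page does not say how the number was derived. A small sketch with a hypothetical helper `weight_memory_gb`:

```python
# Bytes needed to store one parameter in common numeric formats.
BYTES_PER_PARAMETER = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_parameters: int, dtype: str = "float32") -> float:
    """Memory for the weights alone, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_parameters * BYTES_PER_PARAMETER[dtype] / 1e9


nominal_parameters = 6_700_000_000  # assumption: "6.7B" taken at face value

fp32_gb = weight_memory_gb(nominal_parameters, "float32")
fp16_gb = weight_memory_gb(nominal_parameters, "float16")

print(f"float32 weights: {fp32_gb:.1f} GB")
print(f"float16 weights: {fp16_gb:.1f} GB")
```

The estimate covers weights only. Activations, the attention cache and framework overhead add to it at inference time, while loading in half precision would roughly halve the weight footprint.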

Best Alternatives to Cerebras GPT 6.7B

Best Alternatives	Context / RAM	Downloads	Likes
Alpaca Cerebras 6.7B	0K / 0 GB	0	3
Vigogne Opt 6.7B Instruct	0K / 0 GB	0	2
...pseek Coder 6.7B Instruct GGUF	0K / 2.8 GB	5895	173
Magicoder S DS 6.7B GGUF	0K / 2.8 GB	898	76
...ydecompiler 3.7 6.7B V0.9 GGUF	0K / 2.5 GB	56	0
...enCodeInterpreter DS 6.7B GGUF	0K / 2.5 GB	62	2
Deepseek Coder 6.7B Base GGUF	0K / 2.8 GB	967	11

Original data from HuggingFace, OpenCompass and various public git repos.

Release v20241217