Dlite V1 124M by aisquared: Benchmarks, Features and Detailed Analysis

Tags: Autotrain compatible, Dataset: tatsu-lab/alpaca, En, Endpoints compatible, Gpt2, Pytorch, Region: us

Model Card on HF 🤗: https://huggingface.co/aisquared/dlite-v1-124m

Dlite V1 124M Benchmarks

ARC: 24.32 vs 96.7 (so35), -74.9%

HellaSwag: 31.16 vs 95.3 (gpt4), -67.3%

MMLU: 25.08 vs 88.3 (so35), -71.6%

TruthfulQA: 36.38 vs 59 (gpt4), -38.3%

WinoGrande: 50.2 vs 87.5 (gpt4), -42.6%

LLME Score: 0.15978

The percentage on each line shows how the model compares to the reference model: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
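The page does not state how the percentages are computed. The listed values are consistent with a simple relative difference, (model - reference) / reference, so the following is a minimal sketch under that assumption. The names `SCORES`, `relative_gap` and `GAPS` are illustrative and not part of any LLM Explorer API.

```python
# Scores copied from the benchmark list: (model score, reference score).
SCORES = {
    "ARC": (24.32, 96.7),        # reference: so35
    "HellaSwag": (31.16, 95.3),  # reference: gpt4
    "MMLU": (25.08, 88.3),       # reference: so35
    "TruthfulQA": (36.38, 59.0), # reference: gpt4
    "WinoGrande": (50.2, 87.5),  # reference: gpt4
}


def relative_gap(model_score: float, reference_score: float) -> float:
    """Percentage difference relative to the reference (assumed formula)."""
    return (model_score - reference_score) / reference_score * 100.0


# Round to one decimal place, matching the precision shown on the page.
GAPS = {name: round(relative_gap(m, r), 1) for name, (m, r) in SCORES.items()}

for name, gap in GAPS.items():
    print(f"{name}: {gap:+.1f}%")
```

Under this assumption the computed gaps reproduce the five percentages listed for the benchmarks.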

Dlite V1 124M Parameters and Internals

Model Type

Large Language Model

Use Cases

Limitations:

Not a state-of-the-art model. Experimental. Can exhibit undesired behaviors such as factual inaccuracies, biases, offensive responses, toxicity, and hallucinations.

Considerations:

Exercise good judgment when applying this technology.

Additional Notes

The model is suitable for research purposes only.

Supported Languages

EN (English)

Training Details

Data Sources:

tatsu-lab/alpaca

Data Volume:

50k records

Methodology:

Fine-tuning of GPT-2 on the tatsu-lab/alpaca dataset

Hardware Used:

Single T4 GPU

Model Architecture:

Derived from GPT-2

Input Output

Performance Tips:

Including `torch_dtype=torch.bfloat16` is generally recommended to reduce memory usage.
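To illustrate why a 16-bit dtype reduces memory, here is a rough back-of-the-envelope sketch. It assumes a parameter count of 124 million (rounded from the model name), counts the weights only (no activations, KV cache, or framework overhead), and uses decimal gigabytes. The helper `weight_memory_gb` is hypothetical and is not part of PyTorch or Transformers.

```python
# Assumption: parameter count rounded from the "124M" in the model name.
NUM_PARAMS = 124_000_000

# Bytes used to store one parameter in each dtype.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2}


def weight_memory_gb(num_params: int, dtype: str) -> float:
    """Approximate size of the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


for dtype in ("float32", "bfloat16"):
    print(f"{dtype}: ~{weight_memory_gb(NUM_PARAMS, dtype):.3f} GB")
```

The float32 estimate of roughly 0.5 GB is in line with the Required VRAM figure listed for this model, and a 16-bit dtype halves the weight footprint. Actual usage at inference time will be higher than the weights-only figure.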

LLM Name	Dlite V1 124M
Repository 🤗	https://huggingface.co/aisquared/dlite-v1-124m
Model Size	124m
Required VRAM	0.5 GB
Updated	2025-02-22
Maintainer	aisquared
Model Type	gpt2
Model Files	0.5 GB, 0.0 GB
Supported Languages	en
Model Architecture	GPT2LMHeadModel
License	apache-2.0
Model Max Length	1024
Transformers Version	4.25.1
Tokenizer Class	GPT2Tokenizer
Vocabulary Size	50257
Torch Data Type	float32
Activation Function	gelu_new

Best Alternatives to Dlite V1 124M

Best Alternatives	Context / RAM	Downloads	Likes
GPT2 124M Poetry RL	0K / 0.5 GB	13	0
Gpt2 Final	0K / 0.5 GB	168	0
Filiberto 124M	0K / 0.5 GB	125	0
GPT2 Nepali 124M	0K / 0.5 GB	12	3
LaMini GPT 124M	0K / 0.5 GB	3243	22
Dlite V2 124M	0K / 0.3 GB	2443	6

Original data from HuggingFace, OpenCompass and various public git repos.

Release v20241227
