Phi 2 Upscaled 4B Instruct V0.1 By daekeun-ml: Benchmarks, Features and Detailed Analysis. Insights on Phi 2 Upscaled 4B Instruct V0.1.

Arxiv:2312.15166 Autotrain compatible Conversational Dataset:intel/orca dpo pairs Dataset:open-orca/openorca Dataset:wikipedia En Instruct Phi Region:us Safetensors Sharded Tensorflow

Model Card on HF 🤗: https://huggingface.co/daekeun-ml/phi-2-upscaled-4B-instruct-v0.1

Phi 2 Upscaled 4B Instruct V0.1 Benchmarks

ARC: 22.95 vs 96.7 (so35)^-76.3%

HellaSwag: 28.68 vs 95.3 (gpt4)^-69.9%

MMLU: 26.8 vs 88.3 (so35)^-69.6%

TruthfulQA: 40.92 vs 59 (gpt4)^-30.6%

WinoGrande: 50.59 vs 87.5 (gpt4)^-42.2%

GSM8K: 0.76 vs 96.4 (so35)^-99.2%

LLME Score: 0.21354

^nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").

What is the LLM Explorer Rank (Score)

Phi 2 Upscaled 4B Instruct V0.1 (daekeun-ml/phi-2-upscaled-4B-instruct-v0.1)

Phi 2 Upscaled 4B Instruct V0.1 Parameters and Internals

Model Type

instruction tuning

Additional Notes

Created as a personal experiment. May not operate correctly without verification.

Training Details

Data Sources:

wikipedia, Intel/orca_dpo_pairs, Open-Orca/OpenOrca

Data Volume:

1.5 million samples after tokenization

Methodology:

Depth Upscaling (DUS), instruction tuning, alignment tuning

Context Length:

1024

Training Time:

3 days for pre-training, 10 hours for tuning

Hardware Used:

AWS ml.g5.48xlarge (NVIDIA A10G GPU x 32), AWS ml.g5.24xlarge (NVIDIA A10G GPU x 4)

Model Architecture:

48 transformer blocks (expanded from 32)

Input Output

Input Format:

Expected format is a chat template with roles like 'system' and 'user'.

Accepted Modalities:

text

Output Format:

Generated responses in text format.

Performance Tips:

Tokenize inputs using chat templates and handle memory constraints for larger context lengths.

LLM Name	Phi 2 Upscaled 4B Instruct V0.1
Repository 🤗	https://huggingface.co/daekeun-ml/phi-2-upscaled-4B-instruct-v0.1
Model Size	4b
Required VRAM	8.1 GB
Updated	2025-02-22
Maintainer	daekeun-ml
Model Type	phi
Instruction-Based	Yes
Model Files	5.0 GB: 1-of-2 3.1 GB: 2-of-2
Supported Languages	en
Model Architecture	PhiForCausalLM
License	apache-2.0
Context Length	2048
Model Max Length	2048
Transformers Version	4.37.2
Tokenizer Class	CodeGenTokenizer
Padding Token	!
Vocabulary Size	51200
Torch Data Type	bfloat16

Best Alternatives to Phi 2 Upscaled 4B Instruct V0.1

Best Alternatives	Context / RAM	Downloads	Likes
Delta 4B Instruct V0.1	2K / 9.4 GB	15	0

Rank the Phi 2 Upscaled 4B Instruct V0.1 Capabilities

🆘 Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation
Factuality and Completeness of Knowledge
Censorship and Alignment
Data Analysis and Insight Generation
Text Generation
Text Summarization and Feature Extraction
Code Generation
Multi-Language Support and Translation

What open-source LLMs or SLMs are you in search of? 43470 in total.

Email us: info@extractum.io. Our Privacy Policy | Terms and Conditions | Suggest an improvement.

Our Social Media →

Original data from HuggingFace, OpenCompass and various public git repos.

Release v20241227

Support LLM Explorer

Phi 2 Upscaled 4B Instruct V0.1 by daekeun-ml

» All LLMs » daekeun-ml » Phi 2 Upscaled 4B Instruct V0.1 URL Share it on

Phi 2 Upscaled 4B Instruct V0.1 Benchmarks

Phi 2 Upscaled 4B Instruct V0.1 Parameters and Internals

Best Alternatives to Phi 2 Upscaled 4B Instruct V0.1

Rank the Phi 2 Upscaled 4B Instruct V0.1 Capabilities

What open-source LLMs or SLMs are you in search of? 43470 in total.