EVA Qwen2.5 14B V0.1 by EVA-UNIT-01


Base model: Qwen/Qwen2.5-14B (finetune). Datasets: allura-org/Celeste-1.x-data-mixture, allura-org/shortstories_synthlabels, anthracite-org/kalo-opus-instruct-22k-no-refusal, Epiculous/SynthRP-Gens-v1.1-Filtered-n-Cleaned, Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned, Gryphe/ChatGPT-4o-Writing-Prompts, Gryphe/Sonnet3.5-Charcard-Roleplay, Gryphe/Sonnet3.5-SlimOrcaDedupCleaned, Nopm/Opus_WritingStruct, nothingiisreal/Reddit-Dirty-And-WritingPrompts. Tags: Instruct, Qwen2, Region:us, Safetensors, Sharded, Tensorflow.

EVA Qwen2.5 14B V0.1 Benchmarks

Scores (shown as percentages) indicate how the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").

EVA Qwen2.5 14B V0.1 Parameters and Internals

Model Type
RP/storywriting specialist
Additional Notes: Using a quantized KV cache with Qwen2.5 is not recommended and can lead to degraded output quality; an f16 KV cache is fine.
Training Details
Data Sources:
anthracite-org/kalo-opus-instruct-22k-no-refusal, Nopm/Opus_WritingStruct, Gryphe/Sonnet3.5-SlimOrcaDedupCleaned, Gryphe/Sonnet3.5-Charcard-Roleplay, Gryphe/ChatGPT-4o-Writing-Prompts, Epiculous/Synthstruct-Gens-v1.1-Filtered-n-Cleaned, Epiculous/SynthRP-Gens-v1.1-Filtered-n-Cleaned, nothingiisreal/Reddit-Dirty-And-WritingPrompts, allura-org/Celeste-1.x-data-mixture, allura-org/shortstories_synthlabels
Methodology: Full-parameter finetune of Qwen2.5-14B on a mixture of synthetic and natural data
Input Output
Input Format: ChatML
Performance Tips: Avoid a quantized KV cache with Qwen2.5. Recommended sampler values: Temperature 0.87, Top-P 0.81, Min-P 0.0025, Repetition Penalty 1.03. Temperature below 1 is recommended, though the model can perform acceptably at higher temperatures when combined with Min-P.
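
These tips translate directly into generation parameters. Below is a minimal sketch (not an official example) of ChatML prompting with the recommended sampler values, assuming the `transformers` library (4.45+, which supports min_p), enough memory for the bf16 weights (~29.7 GB), and an illustrative prompt of our own choosing:

```python
# Minimal sketch: ChatML prompting with the recommended sampler values.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "EVA-UNIT-01/EVA-Qwen2.5-14B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.bfloat16, device_map="auto"
)

# Qwen2.5 models use ChatML; apply_chat_template renders the
# <|im_start|>role ... <|im_end|> structure from plain message dicts.
messages = [
    {"role": "system", "content": "You are a creative writing assistant."},
    {"role": "user", "content": "Write the opening paragraph of a ghost story."},
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

# Recommended sampler values from this card. The KV cache is deliberately
# left at its default precision; the card advises against quantizing it.
output_ids = model.generate(
    input_ids,
    max_new_tokens=512,
    do_sample=True,
    temperature=0.87,
    top_p=0.81,
    min_p=0.0025,
    repetition_penalty=1.03,
)
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Leaving the cache settings untouched keeps the default full-precision KV cache, in line with the note above about avoiding quantized caches.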
Release Notes
Version: 0.1
Notes: The dataset was deduplicated and cleaned relative to version 0.0, and the training sequence length was increased. The model is more stable, and the problems with handling short inputs and min_p sampling appear resolved. This is the epoch 2.7 checkpoint. Known issue: a quantized KV cache degrades output. Recommended sampler values added.
LLM Name: EVA Qwen2.5 14B V0.1
Repository: 🤗 https://huggingface.co/EVA-UNIT-01/EVA-Qwen2.5-14B-v0.1
Base Model(s): Qwen/Qwen2.5-14B
Model Size: 14b
Required VRAM: 29.7 GB
Updated: 2024-11-13
Maintainer: EVA-UNIT-01
Model Type: qwen2
Instruction-Based: Yes
Model Files: 5.0 GB (1-of-6), 5.0 GB (2-of-6), 5.0 GB (3-of-6), 5.0 GB (4-of-6), 5.0 GB (5-of-6), 4.7 GB (6-of-6)
Model Architecture: Qwen2ForCausalLM
License: apache-2.0
Context Length: 131072
Model Max Length: 131072
Transformers Version: 4.45.1
Tokenizer Class: Qwen2Tokenizer
Padding Token: <|endoftext|>
Vocabulary Size: 152064
Torch Data Type: bfloat16
Errors: replace
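
The fields above can be cross-checked against the published config and tokenizer without downloading the ~29.7 GB of weights; a small sketch, assuming `transformers` is installed (the expected values in the comments come from the table above):

```python
# Sketch: verify the listed metadata against the repo's config and tokenizer.
from transformers import AutoConfig, AutoTokenizer

repo = "EVA-UNIT-01/EVA-Qwen2.5-14B-v0.1"
config = AutoConfig.from_pretrained(repo)
tokenizer = AutoTokenizer.from_pretrained(repo)

print(config.model_type)               # qwen2
print(config.architectures)            # ['Qwen2ForCausalLM']
print(config.max_position_embeddings)  # 131072
print(config.vocab_size)               # 152064
print(config.torch_dtype)              # torch.bfloat16
print(tokenizer.pad_token)             # <|endoftext|>
```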

Best Alternatives to EVA Qwen2.5 14B V0.1

Best Alternatives                 Context / RAM    Downloads  Likes
Rombos LLM V2.6 Qwen 14B          128K / 29.7 GB   4289       42
EVA Qwen2.5 14B V0.2              128K / 29.7 GB   69         6
Tsunami 1.0 14B Instruct          128K / 29.7 GB   152        0
EVA Qwen2.5 14B V0.0              128K / 29.7 GB   52         13
Qwen2.5 14B Instruct              32K / 29.6 GB    130877     113
Qwen2.5 Coder 14B Instruct        32K / 29.7 GB    316        27
Qwen2.5 Gutenberg Doppel 14B      32K / 29.7 GB    12         6
Qwen2.5 14B Instruct              32K / 29.7 GB    3547       6
Openthaigpt1.5 14B Instruct       32K / 29.7 GB    1234       3
Lambda Qwen2.5 14B DPO Test       32K / 29.7 GB    2673       7

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241110