Base Eval by deval-core

 ยป  All LLMs  ยป  deval-core  ยป  Base Eval   URL Share it on

  Arxiv:2204.05149 Base model:finetune:meta-llama... Base model:meta-llama/llama-3....   Conversational   De   En   Es   Facebook   Fr   Hi   It   Llama   Llama-3   Meta   Pt   Pytorch   Region:us   Safetensors   Sharded   Tensorflow   Th
Model Card on HF ๐Ÿค—: https://huggingface.co/deval-core/base-eval 

Base Eval Benchmarks

nn.n% — How the model compares to the reference models: Anthropic Sonnet 3.5 ("so35"), GPT-4o ("gpt4o") or GPT-4 ("gpt4").
Base Eval (deval-core/base-eval)

Base Eval Parameters and Internals

Model Type 
text generation, multilingual
Use Cases 
Areas:
Commercial, Research
Applications:
Assistant-like chat, Multilingual dialogue, Synthetic data generation
Primary Use Cases:
Instruction tuning for assistant-like chat
Limitations:
Use in unsupported languages without controls, Violations of applicable laws or the Acceptable Use Policy
Considerations:
Developers should fine-tune Llama 3.1 models for additional languages responsibly.
Additional Notes 
Developers can customize model deployment using available recipes and guidelines
Supported Languages 
en (English), de (German), fr (French), it (Italian), pt (Portuguese), hi (Hindi), es (Spanish), th (Thai)
Training Details 
Data Sources:
Publicly available online data
Data Volume:
~15 trillion tokens
Methodology:
supervised fine-tuning (SFT) and reinforcement learning with human feedback (RLHF)
Context Length:
128000
Training Time:
39.3M GPU hours
Hardware Used:
H100-80GB GPUs
Model Architecture:
Auto-regressive language model using an optimized transformer architecture
Safety Evaluation 
Methodologies:
Safety fine-tuning, Red teaming
Findings:
Model must be deployed with system-level safeguards
Risk Categories:
Misinformation, Bias, Child Safety, Cybersecurity risks
Ethical Considerations:
Avoid using in unsupported languages without fine-tuning and system controls.
Responsible Ai Considerations 
Fairness:
Focus on multilingual safety and fairness across different languages
Transparency:
Clear guidelines and resources provided for deployment
Accountability:
Developers must deploy safeguards when building with the model
Mitigation Strategies:
Incorporation of safety mitigations, domain-specific evaluations
Input Output 
Input Format:
Multilingual text and multilingual text with code
Accepted Modalities:
text
Output Format:
Text, including multilingual text and code
Performance Tips:
Use transformers or llama codebase for generation
Release Notes 
Version:
3.1
Date:
2024-07-23
Notes:
Introduction of multilingual support and longer context window.
LLM NameBase Eval
Repository ๐Ÿค—https://huggingface.co/deval-core/base-eval 
Base Model(s)  meta-llama/Meta-Llama-3.1-8B   meta-llama/Meta-Llama-3.1-8B
Model Size8b
Required VRAM16.1 GB
Updated2024-12-06
Maintainerdeval-core
Model Typellama
Model Files  5.0 GB: 1-of-4   5.0 GB: 2-of-4   4.9 GB: 3-of-4   1.2 GB: 4-of-4
Supported Languagesen de fr it pt hi es th
Model ArchitectureLlamaForCausalLM
Licensellama3.1
Context Length131072
Model Max Length131072
Transformers Version4.42.3
Tokenizer ClassPreTrainedTokenizerFast
Vocabulary Size128256
Torch Data Typebfloat16

Best Alternatives to Base Eval

Best Alternatives
Context / RAM
Downloads
Likes
...a 3 8B Instruct Gradient 1048K1024K / 16.1 GB14743675
Thor V1.3a 8B FANTASY 1024K1024K / 16.1 GB1501
Loki V2.8 8B EROTICA1024K / 16.1 GB192
Odin V1.1 8B FICTION 1024K1024K / 16.1 GB1040
RP Naughty V1.0e 8B1024K / 16.1 GB561
RP Naughty V1.2 8B1024K / 16.1 GB401
...or V1.35 8B DARK FANTASY 1024K1024K / 16.1 GB261
Loki V2.75B 8B EROTICA 1024K1024K / 16.1 GB191
Loki V2.75 8B EROTICA 1024K1024K / 16.1 GB171
8B Base Academic 11024K / 16.1 GB61
Note: green Score (e.g. "73.2") means that the model is better than deval-core/base-eval.

Rank the Base Eval Capabilities

๐Ÿ†˜ Have you tried this model? Rate its performance. This feedback would greatly assist ML community in identifying the most suitable model for their needs. Your contribution really does make a difference! ๐ŸŒŸ

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  

What open-source LLMs or SLMs are you in search of? 38920 in total.

Our Social Media →  
Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124