AquilaDense 7B by BAAI


Tags: arXiv:2202.08906, arXiv:2212.05055, arXiv:2401.09192, Aquiladense, Autotrain compatible, Conversational, Custom code, En, Moe, Pytorch, Region:us, Sharded, Zh
Model Card on HF 🤗: https://huggingface.co/BAAI/AquilaDense-7B


AquilaDense 7B Parameters and Internals

Model Type 
Mixture of Experts (MoE)
Additional Notes 
AquilaMoE is trained with EfficientScale, a training methodology for MoE models that aims for strong performance while reducing data requirements.
Supported Languages 
en (high), zh (high)
Training Details 
Data Sources:
https://huggingface.co/datasets/togethercomputer/RedPajama-Data-V2, https://huggingface.co/datasets/tiiuae/falcon-refinedweb, https://huggingface.co/datasets/allenai/c4, https://huggingface.co/datasets/EleutherAI/pile, https://data.baai.ac.cn/details/WuDaoCorporaText, https://huggingface.co/datasets/CASIA-LM/ChineseWebText
Data Volume:
4TB tokens
Methodology:
EfficientScale; Mixture of Experts (MoE) training; Scale-Up and Scale-Out strategies
Context Length:
4096
Model Architecture:
Mixture of Experts (MoE); Context Length: 4096; QKV Bias: yes; Layers: 40; Hidden Dim: 5120; Intermediate Dim: 20480; KV Group: 8
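As a rough illustration of how a listing like this is typically used, the sketch below loads the repository with the Hugging Face Transformers auto classes. It assumes that the "Custom code" tag means the repository ships its own modeling and tokenizer code, so `trust_remote_code=True` is passed; that flag, the `torch_dtype="auto"` setting, and the helper names `load_kwargs`, `prompt_budget`, `load_model`, and `generate` are illustrative choices, not something the model card confirms.

```python
REPO_ID = "BAAI/AquilaDense-7B"  # repository id taken from the model card URL
MAX_CONTEXT = 4096               # context length listed above


def load_kwargs():
    """Keyword arguments for from_pretrained (assumed, not from the model card)."""
    # trust_remote_code is assumed to be required because of the "Custom code" tag.
    return {"trust_remote_code": True, "torch_dtype": "auto"}


def prompt_budget(max_new_tokens):
    """Tokens left for the prompt once room for the generation is reserved."""
    return MAX_CONTEXT - max_new_tokens


def load_model():
    # Imported lazily so the helpers above work without transformers installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, **load_kwargs())
    return tokenizer, model


def generate(prompt, max_new_tokens=64):
    """Hypothetical helper: truncate the prompt to the budget, then generate."""
    tokenizer, model = load_model()
    inputs = tokenizer(
        prompt,
        return_tensors="pt",
        truncation=True,
        max_length=prompt_budget(max_new_tokens),
    )
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate("Hello")` downloads the full sharded checkpoint on first use, so it is best tried on a machine with enough memory for the weights.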
LLM Name: AquilaDense 7B
Repository 🤗: https://huggingface.co/BAAI/AquilaDense-7B
Model Size: 7b
Required VRAM: 16.3 GB
Updated: 2025-02-22
Maintainer: BAAI
Model Type: aquiladense
Model Files: 2.0 GB (1-of-9), 1.9 GB (2-of-9), 2.0 GB (3-of-9), 1.9 GB (4-of-9), 2.0 GB (5-of-9), 1.9 GB (6-of-9), 2.0 GB (7-of-9), 1.4 GB (8-of-9), 1.2 GB (9-of-9)
Supported Languages: en, zh
Model Architecture: AquilaDenseForCausalLM
License: apache-2.0
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.37.2
Tokenizer Class: QWenTokenizer
Vocabulary Size: 151851
Torch Data Type: bfloat16
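The Required VRAM figure appears to be the sum of the nine shard sizes listed under Model Files. The short check below reproduces that sum and derives a rough parameter count from it, assuming 2 bytes per parameter for bfloat16 and 1 GB = 10^9 bytes. Both assumptions are mine, and the estimate covers the weights only, not activations or the KV cache.

```python
# Shard sizes in GB, copied from the Model Files row above.
shard_sizes_gb = [2.0, 1.9, 2.0, 1.9, 2.0, 1.9, 2.0, 1.4, 1.2]

# Rounded to one decimal to match the precision of the listing.
total_gb = round(sum(shard_sizes_gb), 1)

BYTES_PER_PARAM = 2  # bfloat16 stores each weight in 2 bytes

# Assumes 1 GB = 1e9 bytes, so GB / bytes-per-parameter gives billions of parameters.
approx_params_billion = total_gb / BYTES_PER_PARAM

print(f"total checkpoint size: {total_gb} GB")
print(f"approximate parameters: {approx_params_billion:.2f}B")
```

The total matches the 16.3 GB listed as Required VRAM. The derived count, a little over 8 billion, is only an estimate because the shard sizes are rounded.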

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227