Model Type: text generation, decoder, causal-lm

Use Cases

Areas: Research, Commercial Applications

Applications: Development of chat assistants, Sentiment analysis, Summarization of bilingual documents

Primary Use Cases: Arabic and English NLP tasks, Cultural alignment analysis, Mechanistic interpretability

Additional Notes: Jais models are designed for Arabic and English tasks, not for other languages.

Supported Languages: Arabic (high proficiency), English (strong capabilities)

Training Details

Data Sources: Public web pages, Wikipedia, news articles, social network content, code in various languages, books in Arabic and English, arXiv papers, synthetic translations of high-quality English resources

Data Volume: Up to 1.6 trillion tokens

Methodology: Two-stage training with frozen and unfrozen layers for adapted pre-training; progressive context-length expansion

Context Length:

Hardware Used: Condor Galaxy supercomputer, 64 Cerebras CS-2 WSE-2 units

Model Architecture: Transformer-based, decoder-only architecture with SwiGLU activation and ALiBi/RoPE position encoding
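The two-stage methodology above can be illustrated with a toy sketch: in stage 1 only the adapted parameters (e.g. new embedding rows) are updated while the transformer blocks stay frozen; in stage 2 all layers are unfrozen for continued pre-training. The parameter names and values here are hypothetical, not taken from the actual Jais training setup.

```python
def sgd_step(params, grads, frozen, lr=0.1):
    """One gradient step that updates only parameters not in the frozen set."""
    return {
        name: value if name in frozen else value - lr * grads[name]
        for name, value in params.items()
    }

# Hypothetical toy parameters: one embedding table and two transformer blocks.
params = {"embeddings": 1.0, "block_0": 2.0, "block_1": 3.0}
grads = {name: 1.0 for name in params}

# Stage 1: adapt the embeddings while the transformer blocks stay frozen.
stage1 = sgd_step(params, grads, frozen={"block_0", "block_1"})

# Stage 2: unfreeze all layers and continue training the whole model.
stage2 = sgd_step(stage1, grads, frozen=set())
```

In a real framework the same effect is usually achieved by disabling gradient computation on the frozen modules rather than masking the update step.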
|
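The architecture components named above can be sketched in scalar form: SwiGLU gates the feed-forward "up" projection with a SiLU-activated gate, and ALiBi biases attention scores with a linear distance penalty instead of learned position embeddings. The weights and slope here are illustrative placeholders, not Jais's actual parameters.

```python
import math

def silu(x):
    # SiLU (Swish) activation: x * sigmoid(x)
    return x / (1.0 + math.exp(-x))

def swiglu(x, w_gate=1.0, w_up=1.0):
    # SwiGLU feed-forward unit (toy scalar weights): the up projection
    # is elementwise-modulated by a SiLU-activated gate projection.
    return silu(w_gate * x) * (w_up * x)

def alibi_bias(query_pos, key_pos, slope=0.5):
    # ALiBi: subtract a penalty proportional to query-key distance
    # from the attention score; larger distance -> larger penalty.
    return -slope * (query_pos - key_pos)
```

In the full model these operate on vectors per attention head and feed-forward channel; the scalar version only shows the functional form.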
|
Responsible AI Considerations

Fairness: Bias mitigation techniques employed.

Input/Output

Accepted Modalities: Text
Output Format: Text
|