Model Type | Bilingual large language model, Causal language model, Text generation |
|
Use Cases |
Areas: | Research, Commercial applications |
|
Applications: | Chat applications, sentiment analysis, summarization |
|
Primary Use Cases: | Natural language understanding and generation, Chat development, Cultural analysis |
|
Limitations: | Not suitable for languages other than Arabic and English, Not suitable for decision-making without human oversight |
|
Considerations: | Use responsibly in compliance with legal regulations. |
|
|
Additional Notes | The family includes models pre-trained from scratch and models adapted from Llama-2 with specialized techniques for Arabic. |
|
Supported Languages | Arabic (Advanced), English (Advanced) |
|
Training Details |
Data Sources: | Web, Code, Books, Scientific papers, Synthetic translation |
|
Data Volume: | |
Methodology: | From scratch and adapted pre-training methods |
|
Context Length: | |
Training Time: | |
Hardware Used: | Cerebras CS-2 Wafer-Scale Engines |
|
Model Architecture: | Transformer-based, decoder-only. From-scratch Jais models follow a GPT-3-style design with SwiGLU activation and ALiBi position encoding; adapted models inherit RoPE position encoding and Grouped Query Attention from Llama-2. |
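As a rough illustration of the ALiBi position encoding used by the from-scratch models, the sketch below builds the per-head additive attention bias (a head-specific slope times the key-query distance, added to the attention scores before softmax). Function names are mine, not from the model's code; slopes follow the common 2^(-8i/n) schedule, which is exact when the head count is a power of two.

```python
import numpy as np

def alibi_slopes(n_heads: int) -> np.ndarray:
    """Head-specific slopes 2^(-8*1/n), 2^(-8*2/n), ...
    (exact when n_heads is a power of two)."""
    return np.array([2.0 ** (-8.0 * (i + 1) / n_heads) for i in range(n_heads)])

def alibi_bias(n_heads: int, seq_len: int) -> np.ndarray:
    """Per-head additive bias: slope * (key_pos - query_pos) for causal positions.

    Future positions (key_pos > query_pos) are clipped to 0 here; in practice
    the causal mask removes them anyway.
    """
    pos = np.arange(seq_len)
    dist = np.minimum(pos[None, :] - pos[:, None], 0)  # (seq_len, seq_len), <= 0
    return alibi_slopes(n_heads)[:, None, None] * dist[None, :, :]
```

The resulting `(n_heads, seq_len, seq_len)` tensor is simply added to the raw QK^T attention scores, so no learned position embeddings are needed and the model extrapolates to longer contexts than it was trained on.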
|
|
Safety Evaluation |
Methodologies: | GPT-4-as-a-judge evaluation |
|
Risk Categories: | |
Ethical Considerations: | Contains disclaimers for incorrect, misleading, and/or offensive content generation. |
|
|
Responsible AI Considerations |
Fairness: | Trained on diverse datasets to reduce bias, though biases may still be present. |
|
Transparency: | Open release of model details and license. |
|
Accountability: | Researchers and users should ensure compliance with applicable laws. |
|
Mitigation Strategies: | Use diverse datasets and evaluation with linguist review. |
|
|
Input Output |
Input Format: | |
Accepted Modalities: | |
Output Format: | |
Performance Tips: | Set `trust_remote_code=True` when loading, since the model ships custom model classes. |
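Loading with the Hugging Face `transformers` API can be sketched as follows; this is a minimal illustration, the helper name is mine, and the checkpoint id must be filled in with the actual repository id from the model hub.

```python
def load_jais(model_id: str):
    """Load a Jais-family checkpoint with its tokenizer.

    The repository ships custom model classes, so trust_remote_code=True
    is required. Illustrative sketch; substitute the real checkpoint id.
    """
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        trust_remote_code=True,  # allow the repo's custom model code to run
    )
    return tokenizer, model
```

Because `trust_remote_code=True` executes code from the model repository, it should only be enabled for repositories you trust.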
|
|
Release Notes |
Version: | |
Date: | |
Notes: | Release of the bilingual Jais family models. |
|
|
|