Supported Languages | en (English), zh (Chinese), id (Indonesian), ms (Malay), th (Thai), vi (Vietnamese), fil (Filipino), ta (Tamil), my (Burmese), km (Khmer), lo (Lao) |
|
Training Details |
Data Sources: | RefinedWeb - English, mC4 - Chinese, mC4 - Indonesian, mC4 - Malay, mC4 - Filipino, mC4 - Burmese, mC4 - Vietnamese, mC4 - Thai, WangChanBERTa - Thai, mC4 - Lao, mC4 - Khmer, mC4 - Tamil, The Stack - Python, The Stack - JavaScript, The Stack - Shell, The Stack - SQL, The Stack - Markdown, RedPajama - StackExchange, RedPajama - ArXiv |
|
Data Volume: | |
Context Length: | |
Training Time: | |
Hardware Used: | 32 AWS EC2 p4d.24xlarge instances, 256 NVIDIA A100 40GB GPUs in total |
|
Model Architecture: | MPT architecture with 32 layers, d_model: 4096, head_dim: 32, vocabulary size: 256,000, sequence length: 2,048 |
|
|
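The figures in the hardware and architecture rows can be cross-checked with a little arithmetic. The sketch below is only illustrative: the class and function names (`ArchitectureSummary`, `embedding_parameters`, `total_gpus`) are hypothetical and not part of any released code, and the embedding estimate assumes a single token-embedding matrix of shape vocabulary size × d_model. The 8 GPUs per instance figure is the standard p4d.24xlarge configuration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArchitectureSummary:
    """Hyperparameters copied from the Model Architecture row (field names are illustrative)."""
    n_layers: int = 32
    d_model: int = 4096
    head_dim: int = 32  # recorded as listed in the table; not interpreted further here
    vocab_size: int = 256_000
    max_seq_len: int = 2048


def embedding_parameters(arch: ArchitectureSummary) -> int:
    """Size of one vocab_size x d_model token-embedding matrix (an assumption about layout)."""
    return arch.vocab_size * arch.d_model


def total_gpus(instances: int, gpus_per_instance: int = 8) -> int:
    """GPU count for a cluster; 8 per instance matches the p4d.24xlarge configuration."""
    return instances * gpus_per_instance


arch = ArchitectureSummary()
print(embedding_parameters(arch))  # 1048576000
print(total_gpus(32))              # 256
```

Under these assumptions, the 256,000-entry vocabulary alone accounts for roughly one billion embedding parameters, and 32 instances yield the 256 GPUs listed in the hardware row.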