Model Type | auto-regressive language model |
|
Use Cases |
Areas: | |
Primary Use Cases: | Zero-shot commonsense reasoning tasks |
|
|
Additional Notes | MobileLLM integrates several key techniques, including the SwiGLU activation function, a deep and thin architecture, embedding sharing, and grouped-query attention. |
|
Training Details |
Data Sources: | Publicly available online data. |
|
Data Volume: | |
Context Length: | |
Hardware Used: | |
Model Architecture: | MobileLLM is an auto-regressive language model built on an optimized transformer architecture. Its techniques include the SwiGLU activation function, a deep and thin architecture, embedding sharing, and grouped-query attention. |
|
|
Input Output |
Input Format: | |
Accepted Modalities: | |
Output Format: | |
|
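The architecture techniques named in this card can be illustrated with a minimal, dependency-free sketch. This is not MobileLLM's code: every function name, weight, and dimension below is a hypothetical chosen for illustration, and the sketch assumes the common formulations of each technique (SwiGLU as a SiLU-gated feed-forward layer, grouped-query attention as several query heads sharing one key/value head, and embedding sharing as reusing the input embedding matrix for the output projection).

```python
import math


def silu(x):
    """SiLU (swish) activation: x * sigmoid(x)."""
    return x / (1.0 + math.exp(-x))


def matvec(weights, vector):
    """Multiply a row-major matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in weights]


def swiglu_ffn(x, w_gate, w_up, w_down):
    """SwiGLU feed-forward layer: w_down @ (silu(w_gate @ x) * (w_up @ x))."""
    gate = [silu(g) for g in matvec(w_gate, x)]
    up = matvec(w_up, x)
    hidden = [g * u for g, u in zip(gate, up)]
    return matvec(w_down, hidden)


def kv_head_for_query_head(q_head, n_q_heads, n_kv_heads):
    """Grouped-query attention: map a query head to the KV head it shares."""
    if n_q_heads % n_kv_heads != 0:
        raise ValueError("n_q_heads must be a multiple of n_kv_heads")
    group_size = n_q_heads // n_kv_heads
    return q_head // group_size


def tied_logits(hidden, embedding):
    """Embedding sharing: reuse the (vocab x dim) input embedding matrix
    as the output projection instead of storing a second matrix."""
    return matvec(embedding, hidden)


if __name__ == "__main__":
    # Hypothetical toy sizes: 8 query heads sharing 2 KV heads.
    print([kv_head_for_query_head(h, 8, 2) for h in range(8)])
    # prints [0, 0, 0, 0, 1, 1, 1, 1]

    # Hypothetical 3-token vocabulary with 2-dimensional embeddings.
    embedding = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    print(tied_logits([2.0, 3.0], embedding))
    # prints [2.0, 3.0, 5.0]
```

With grouped-query attention, only `n_kv_heads` key/value projections are stored rather than one per query head, and with embedding sharing the vocabulary matrix is stored once. Both reduce parameter count, which is the usual motivation for using them in small models.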