Mistral 1L Tiny by nilq


Tags: Arxiv:2305.07759, Autotrain compatible, Dataset:roneneldan/tinystories, Endpoints compatible, Generated from trainer, Mistral, Model-index, Region:us, Safetensors
Model Card on HF 🤗: https://huggingface.co/nilq/mistral-1L-tiny


Mistral 1L Tiny Parameters and Internals

Model Type 
Causal Language Modeling, text-generation
Use Cases 
Primary Use Cases:
Analysis of feature dynamics and emergence in real-world language models.
Additional Notes 
Trained on the roneneldan/TinyStories dataset. Consistent English text generation observed.
Training Details 
Data Sources:
roneneldan/TinyStories
Methodology:
Inspired by the 21M-parameter one-layer GPT-Neo model from the TinyStories paper. Trained to reproduce its results and to collect high-frequency checkpoints for further analysis.
Training Time:
~2 hours on a single H100
Hardware Used:
single H100
Model Architecture:
Single-layer Mistral model with hidden size 512 and MLP intermediate size 1024.
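
As a rough check on how these dimensions add up to the listed model size, the sketch below counts the weights of a Mistral-style decoder in plain Python. The hidden size, MLP size, layer count and vocabulary size come from this page. The attention head counts (8 query heads, 4 key/value heads) and the untied input/output embeddings are assumptions made for illustration only; they are not stated here. The function name mistral_param_count is hypothetical.

```python
def mistral_param_count(vocab_size, hidden_size, intermediate_size, num_layers,
                        num_heads, num_kv_heads, tie_embeddings=False):
    """Count weights in a Mistral-style decoder (no biases, RMSNorm weights only)."""
    head_dim = hidden_size // num_heads
    kv_dim = head_dim * num_kv_heads  # grouped-query attention shrinks K and V

    embed = vocab_size * hidden_size
    lm_head = 0 if tie_embeddings else vocab_size * hidden_size

    # q_proj and o_proj are hidden x hidden; k_proj and v_proj are hidden x kv_dim.
    attn = 2 * hidden_size * hidden_size + 2 * hidden_size * kv_dim
    # gate_proj, up_proj and down_proj of the gated MLP.
    mlp = 3 * hidden_size * intermediate_size
    # Two RMSNorm weight vectors per layer, plus one final norm.
    layer_norms = 2 * hidden_size
    final_norm = hidden_size

    return embed + lm_head + num_layers * (attn + mlp + layer_norms) + final_norm


# Values from this page: vocab 32000, hidden 512, MLP 1024, one layer.
# ASSUMED (not given on this page): 8 attention heads, 4 key/value heads,
# untied embeddings.
total = mistral_param_count(32000, 512, 1024, 1,
                            num_heads=8, num_kv_heads=4)
print(total)  # 35128832
```

Under these assumptions the total is about 35.1M, in line with the listed model size, though that agreement does not confirm the head configuration. It does show that the two 32000 x 512 embedding matrices would account for most of the weights, with the single transformer layer contributing only a few million.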
LLM Name: Mistral 1L Tiny
Repository 🤗: https://huggingface.co/nilq/mistral-1L-tiny
Model Size: 35.1M
Required VRAM: 0.1 GB
Updated: 2025-03-14
Maintainer: nilq
Model Type: mistral
Model Files: 0.1 GB, 0.0 GB
Model Architecture: MistralForCausalLM
Context Length: 2048
Model Max Length: 2048
Transformers Version: 4.38.1
Tokenizer Class: PreTrainedTokenizerFast
Vocabulary Size: 32000
Torch Data Type: float32
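
The required VRAM figure is consistent with a simple weights-only estimate: parameter count times bytes per parameter. The sketch below works through that arithmetic. It is a back-of-the-envelope calculation, not the method this listing uses, and it ignores activations, the KV cache and framework overhead. The helper name weight_memory_gb is hypothetical.

```python
# Bytes used to store one parameter in common PyTorch dtypes.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2}


def weight_memory_gb(num_params, dtype="float32"):
    """Approximate memory for the weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


# 35.1M parameters stored as float32, as listed above.
print(f"{weight_memory_gb(35.1e6, 'float32'):.2f} GB")  # 0.14 GB
```

Rounded to one decimal place this gives 0.1 GB, matching the listed value. Loading the same weights in a 16-bit dtype would roughly halve it.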

Best Alternatives to Mistral 1L Tiny

Model: ...Mistral 1L Tiny TinyStories Ft
Context / RAM: 2K / 0.1 GB
Downloads, Likes: 721

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227