Fox 1 1.6B by tensoropera


Tags: Arxiv:2411.05281, Autotrain compatible, Conversational, En, Endpoints compatible, Llama, Model-index, Region:us, Safetensors
Model Card on Hugging Face: https://huggingface.co/tensoropera/Fox-1-1.6B


Fox 1 1.6B Parameters and Internals

Model Type: text-generation
Additional Notes: Fox-1 is a base pretrained model that requires further fine-tuning for most use cases.
Training Details
Data Sources: text, code
Data Volume: 3 trillion tokens
Methodology: 3-stage data curriculum
Context Length: 8000
Hardware Used: 8 H100 GPUs
Model Architecture: decoder-only transformer-based small language model (SLM)
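
Since the listing describes a base pretrained text-generation model published as `tensoropera/Fox-1-1.6B`, a minimal loading sketch with the Hugging Face `transformers` library might look like the following. The helper names `load_fox` and `complete`, the prompt, and the generation settings are illustrative assumptions, not taken from the model card. As a base model, it is expected to continue text rather than follow instructions.

```python
MODEL_ID = "tensoropera/Fox-1-1.6B"  # repository name as listed on this page


def load_fox(model_id: str = MODEL_ID):
    """Download and load the tokenizer and model (hypothetical helper).

    Imports are kept local so this file can be imported without
    pulling in torch and transformers until a load is requested.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_id)
    # bfloat16 matches the "Torch Data Type" reported in the listing.
    model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    return tokenizer, model


def complete(prompt: str, tokenizer, model, max_new_tokens: int = 32) -> str:
    """Generate a plain continuation of `prompt` (hypothetical helper)."""
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example usage (downloads roughly 3.3 GB of weights on first call):
#   tokenizer, model = load_fox()
#   print(complete("The three primary colors are", tokenizer, model))
```

For instruction-style use, the base model would first need fine-tuning, as the note above says.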
LLM Name: Fox 1 1.6B
Repository: https://huggingface.co/tensoropera/Fox-1-1.6B
Model Size: 1.6b
Required VRAM: 3.3 GB
Updated: 2024-12-14
Maintainer: tensoropera
Model Type: llama
Model Files: 3.3 GB
Supported Languages: en
Model Architecture: LlamaForCausalLM
License: apache-2.0
Context Length: 8192
Model Max Length: 8192
Transformers Version: 4.39.3
Tokenizer Class: GemmaTokenizer
Padding Token: <pad>
Vocabulary Size: 256000
Torch Data Type: bfloat16
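
The VRAM figure above is roughly what the parameter count and data type imply. As a rough sketch, assuming weights dominate memory, 1 GB = 10^9 bytes, and the nominal 1.6 billion parameter count (the exact count is not given on this page), the estimate works out as follows. The function name and dtype table are illustrative.

```python
# Bytes needed to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "bfloat16") -> float:
    """Estimate memory for model weights alone, in decimal gigabytes."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


NOMINAL_PARAMS = 1.6e9  # assumption: the "1.6b" model size taken at face value

print(f"bfloat16: {weight_memory_gb(NOMINAL_PARAMS):.1f} GB")
print(f"float32:  {weight_memory_gb(NOMINAL_PARAMS, 'float32'):.1f} GB")
```

The nominal count gives about 3.2 GB in bfloat16, slightly under the listed 3.3 GB, which would be consistent with a true parameter count a little above 1.6 billion. Activations and the KV cache add to this at inference time, so the weight estimate is a lower bound.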

Best Alternatives to Fox 1 1.6B

Best Alternatives          Context / RAM     Downloads / Likes
1.5 Pints 2K V0.1          16K / 3.1 GB      31716
1.5 Pints 16K V0.1         16K / 3.1 GB      14514
Fox 1 1.6B Instruct V0.1   8K / 3.3 GB       17215
Subnet6 001                8K / 16.1 GB      80
Model 6                    4K / 3.3 GB       12450
6 2                        4K / 3.3 GB       8890
Chuxin 1.6B Base           4K / 3.3 GB       1216
Chuxin 1.6B 1M             4K / 3.3 GB       199
Mymodel                    4K / 3.3 GB       80
SN6                        4K / 3.3 GB       70

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124