Sea Lion 3B by aisingapore


Tags: Arxiv:2101.09635, Autotrain compatible, Custom code, En, Endpoints compatible, Id, Km, Lo, Mpt, Ms, My, Region:us, Safetensors, Ta, Th, Tl, Vi, Zh
Model Card on HF: https://huggingface.co/aisingapore/sea-lion-3b
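The "Custom code" tag means the repository ships its own model and tokenizer classes, so loading it through the Hugging Face transformers library requires opting in with trust_remote_code=True. The following is a minimal sketch, assuming transformers and PyTorch are installed and the checkpoint can be downloaded; the helper name generate_text and its default of 20 new tokens are illustrative choices, not part of the listing.

```python
MODEL_ID = "aisingapore/sea-lion-3b"  # repository id taken from the model card URL


def generate_text(prompt: str, max_new_tokens: int = 20) -> str:
    """Load the checkpoint and return a short continuation of `prompt`.

    Hypothetical helper for illustration only; it downloads the weights
    on first use, so expect several GB of traffic.
    """
    # Imported lazily so this module can be imported without the heavy dependency.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # trust_remote_code=True allows transformers to execute the model and
    # tokenizer code stored in the repository ("Custom code" tag).
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID, trust_remote_code=True)
    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, trust_remote_code=True)

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        inputs["input_ids"],
        max_new_tokens=max_new_tokens,
        eos_token_id=tokenizer.eos_token_id,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Example call (commented out because it triggers the download):
# print(generate_text("Sea lion in the sea"))
```

Because trust_remote_code runs code from the repository, it is worth reviewing that code or pinning a revision before using it. Loading without an explicit dtype may use more memory than the listed bfloat16 footprint; passing torch_dtype=torch.bfloat16 to from_pretrained is one option on hardware that supports it.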


Sea Lion 3B Parameters and Internals

Model Type 
Decoder
Additional Notes 
Model has not been aligned for safety. Users should perform their own safety fine-tuning.
Supported Languages 
en (English), zh (Chinese), id (Indonesian), ms (Malay), tl (Filipino), my (Burmese), vi (Vietnamese), th (Thai), lo (Lao), km (Khmer), ta (Tamil)
Training Details 
Data Sources:
RefinedWeb - English, mC4 - Chinese, mC4 - Indonesian, mC4 - Malay, mC4 - Filipino, mC4 - Burmese, mC4 - Vietnamese, mC4 - Thai, WangChanBERTa - Thai, mC4 - Lao, mC4 - Khmer, mC4 - Tamil, the Stack - Python, the Stack - Javascript, the Stack - Shell, the Stack - SQL, the Stack - Markdown, RedPajama - StackExchange, RedPajama - ArXiv
Data Volume:
980B tokens
Methodology:
Pretrained and instruct-tuned for the Southeast Asia (SEA) region
Context Length:
2048
Training Time:
14 days
Hardware Used:
AWS EC2 p4d.24xlarge, Nvidia A100 40GB GPU
Model Architecture:
MPT architecture
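The data volume and training time above imply an aggregate training throughput. The arithmetic below is a rough sketch, assuming the 14 days were continuous wall-clock time spent processing all 980B tokens exactly once; the listing does not state the number of instances or GPUs, so no per-device figure is derived. The function names are illustrative.

```python
TOTAL_TOKENS = 980e9      # "980B tokens" from the listing
TRAINING_DAYS = 14        # "14 days" from the listing
CONTEXT_LENGTH = 2048     # tokens per full-length sequence


def tokens_per_second(total_tokens: float, days: float) -> float:
    """Aggregate tokens processed per second over the whole run."""
    seconds = days * 24 * 60 * 60
    return total_tokens / seconds


def sequences_per_second(total_tokens: float, days: float, context: int) -> float:
    """Full-length sequences per second, assuming every sequence fills the context."""
    return tokens_per_second(total_tokens, days) / context


if __name__ == "__main__":
    tps = tokens_per_second(TOTAL_TOKENS, TRAINING_DAYS)
    sps = sequences_per_second(TOTAL_TOKENS, TRAINING_DAYS, CONTEXT_LENGTH)
    # Roughly 810 thousand tokens/s and just under 400 sequences/s.
    print(f"{tps:,.0f} tokens/s, {sps:,.1f} sequences/s")
```

These are cluster-wide averages under the stated assumptions; restarts, evaluation pauses, or multiple epochs would change the real figure.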
LLM Name: Sea Lion 3B
Repository: https://huggingface.co/aisingapore/sea-lion-3b
Model Size: 3b
Required VRAM: 6.4 GB
Updated: 2024-12-21
Maintainer: aisingapore
Model Type: mpt
Model Files: 6.4 GB
Supported Languages: en zh id ms tl my vi th lo km ta
Model Architecture: MPTForCausalLM
License: mit
Transformers Version: 4.34.1
Tokenizer Class: SEABPETokenizer
Vocabulary Size: 256000
Torch Data Type: bfloat16
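The 6.4 GB figure is consistent with bfloat16 storage at 2 bytes per parameter. The sketch below shows the arithmetic, assuming 1 GB = 10^9 bytes and that the model files consist almost entirely of weights; the exact parameter count is not given in the listing, so the result is only an implied estimate. It covers weights only and excludes activations and the attention cache needed at inference time.

```python
# Bytes needed to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}


def weight_size_gb(n_params: float, dtype: str) -> float:
    """Approximate weight storage in GB (10^9 bytes) for n_params parameters."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


def implied_params(size_gb: float, dtype: str) -> float:
    """Parameter count implied by a file size, assuming the files hold only weights."""
    return size_gb * 1e9 / BYTES_PER_PARAM[dtype]


if __name__ == "__main__":
    # A nominal 3B-parameter model in bfloat16 needs about 6 GB for weights.
    print(f"{weight_size_gb(3e9, 'bfloat16'):.1f} GB")
    # The listed 6.4 GB implies roughly 3.2 billion parameters.
    print(f"{implied_params(6.4, 'bfloat16') / 1e9:.1f} billion parameters")
    # The same weights held in float32 would take about twice the space.
    print(f"{weight_size_gb(3.2e9, 'float32'):.1f} GB")
```

The gap between the nominal "3b" size and the implied count could plausibly come from the large 256000-entry vocabulary, but that is an inference rather than something the listing states.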

Best Alternatives to Sea Lion 3B

Best Alternatives | Context / RAM | Downloads Likes
Replit Code V1.5 3B | 0K / 6.6 GB | 27008289
Replit Code V1 3B | 0K / 10.4 GB | 705724
Code Millenials 3B | 0K / 5.2 GB | 251
Mpt 3B 8K Instruct | 0K / 6.9 GB | 183
Glaive Function Calling V1 | 0K / 10.4 GB | 6167
...aive Function Calling V2 Small | 0K / 10.4 GB | 1613
Evol Replit V1 | 0K / 10.4 GB | 118
Replit V2 CodeInstruct 3B | 0K / 10.4 GB | 1772
Replit CodeInstruct V3 | 0K / 10.4 GB | 92
Replit V1 CodeInstruct 3B | 0K / 10.4 GB | 1636

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217