Falcon 11B by tiiuae


Tags: arXiv:1911.02150, arXiv:2005.14165, arXiv:2104.09864, arXiv:2307.08691, arXiv:2311.16867, arXiv:2407.14885, autotrain-compatible, conversational, custom code, dataset:tiiuae/falcon-refinedw..., falcon, safetensors, sharded, tensorflow, region:us; languages: cs, de, en, es, fr, it, nl, pl, pt, ro
Model Card on HF 🤗: https://huggingface.co/tiiuae/falcon-11B

Falcon 11B Benchmarks

Falcon 11B (tiiuae/falcon-11B)

Falcon 11B Parameters and Internals

Model Type 
causal decoder-only
Use Cases 
Areas:
Research
Applications:
summarization, text generation, chatbot
Limitations:
Not suitable for production use without adequate assessment of risks
Additional Notes 
The model requires further fine-tuning for specific use cases and includes biases representative of web data.
Supported Languages 
en (English), de (German), es (Spanish), fr (French), it (Italian), nl (Dutch), pl (Polish), pt (Portuguese), ro (Romanian), cs (Czech), sv (Swedish)
Training Details 
Data Sources:
RefinedWeb, RefinedWeb-English, RefinedWeb-Europe (cs, de, es, fr, it, nl, pl, pt, ro, sv), high-quality technical data, code data, and conversational data extracted from public sources
Data Volume:
5,000B tokens
Methodology:
four-stage training strategy
Context Length:
8192
Training Time:
approximately two months
Hardware Used:
1024 A100 40GB GPUs
Model Architecture:
Adapted from GPT-3 with rotary positional embeddings, multiquery attention, and FlashAttention2
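The multiquery attention mentioned above can be sketched in a few lines: all query heads share a single key/value head, which shrinks the KV cache by a factor of the head count. The NumPy sketch below is illustrative only (shapes and names are my own, not Falcon's actual implementation):

```python
import numpy as np

def multiquery_attention(x, Wq, Wk, Wv, n_heads):
    """Illustrative multiquery attention: n_heads query heads share one K/V head."""
    seq, d_model = x.shape
    d_head = d_model // n_heads
    q = (x @ Wq).reshape(seq, n_heads, d_head)  # per-head queries: (seq, h, d)
    k = x @ Wk                                  # single shared key head: (seq, d)
    v = x @ Wv                                  # single shared value head: (seq, d)
    scores = np.einsum("shd,td->hst", q, k) / np.sqrt(d_head)
    # softmax over the key axis
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    out = np.einsum("hst,td->shd", weights, v).reshape(seq, d_model)
    return out
```

Because `k` and `v` are computed once rather than per head, a cached-generation loop stores only one key/value tensor per layer instead of `n_heads` of them.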
Input Output 
Input Format:
Token-based input with context length up to 8192 tokens
Accepted Modalities:
text
Output Format:
Token-based output
Performance Tips:
Fine-tuning recommended for specific tasks
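One practical consequence of the 8192-token window described above: prompt tokens plus tokens to be generated must fit inside it, so long prompts are usually trimmed from the left. A minimal sketch (helper name and behavior are my own, not from the model card):

```python
CONTEXT_LENGTH = 8192  # Falcon 11B context window, per the card

def trim_to_context(token_ids, max_new_tokens, context_length=CONTEXT_LENGTH):
    """Keep only the most recent prompt tokens so that
    len(prompt) + max_new_tokens <= context_length."""
    budget = context_length - max_new_tokens
    if budget <= 0:
        raise ValueError("max_new_tokens leaves no room for the prompt")
    return token_ids[-budget:]
```

For example, a 9000-token prompt with `max_new_tokens=100` keeps only the last 8092 tokens.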
LLM Name: Falcon 11B
Repository 🤗: https://huggingface.co/tiiuae/falcon-11B
Model Size: 11b
Required VRAM: 22.1 GB
Updated: 2024-12-22
Maintainer: tiiuae
Model Type: falcon
Model Files: 5.0 GB (1-of-5), 4.9 GB (2-of-5), 4.9 GB (3-of-5), 4.9 GB (4-of-5), 2.4 GB (5-of-5)
Supported Languages: en, de, es, fr, it, nl, pl, pt, ro, cs
Model Architecture: FalconForCausalLM
License: unknown
Context Length: 8192
Model Max Length: 8192
Transformers Version: 4.39.2
Is Biased: 0
Tokenizer Class: PreTrainedTokenizerFast
Padding Token: <|endoftext|>
Vocabulary Size: 65024
Torch Data Type: bfloat16
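The details above translate into a short loading sketch. This is a hedged, minimal example assuming the standard Hugging Face transformers Auto* API; `bfloat16` matches the Torch Data Type listed, and at 2 bytes per parameter an ~11B-parameter model needs roughly 22 GB of weights memory, in line with the 22.1 GB VRAM figure above.

```python
# Minimal loading sketch (assumes the standard transformers Auto* API;
# the download is ~22 GB across 5 shards, so it is guarded behind __main__).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "tiiuae/falcon-11B"  # repository from the card

def load_falcon(model_id: str = MODEL_ID):
    tokenizer = AutoTokenizer.from_pretrained(model_id)
    model = AutoModelForCausalLM.from_pretrained(
        model_id,
        torch_dtype=torch.bfloat16,  # Torch Data Type from the card
        device_map="auto",           # spread the shards across available devices
    )
    return tokenizer, model

if __name__ == "__main__":
    tokenizer, model = load_falcon()
    inputs = tokenizer("The Falcon series of models", return_tensors="pt").to(model.device)
    out = model.generate(**inputs, max_new_tokens=50)
    print(tokenizer.decode(out[0], skip_special_tokens=True))
```

Note that `device_map="auto"` relies on the accelerate package being installed; on a single 24 GB+ GPU a plain `.to("cuda")` after loading works as well.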

Best Alternatives to Falcon 11B

Best Alternatives            Context / RAM    Downloads  Likes
Falcon2 5.5B Multilingual    8K / 10.9 GB     412        4
Falcon2 5.5B Polish          8K / 10.9 GB     2196       1
Falcon2 5.5B German          8K / 10.9 GB     398        0
Falcon2 11B                  8K / 6.6 GB      18         0
Enron Falcon 11B             8K / 7.6 GB      14         1
Falcon2 5.5B Czech           8K / 10.9 GB     42         0
Falcon2 5.5B Portuguese      8K / 10.9 GB     27         0
Falcon2 5.5B Spanish         8K / 10.9 GB     22         0
Falcon2 5.5B Norwegian       8K / 10.9 GB     21         1
Falcon2 5.5B Dutch           8K / 10.9 GB     29         1

Rank the Falcon 11B Capabilities

🆘 Have you tried this model? Rate its performance. Your feedback helps the ML community identify the most suitable model for their needs. Your contribution really does make a difference! 🌟

Instruction Following and Task Automation  
Factuality and Completeness of Knowledge  
Censorship and Alignment  
Data Analysis and Insight Generation  
Text Generation  
Text Summarization and Feature Extraction  
Code Generation  
Multi-Language Support and Translation  


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241217