Kurage Multilingual by lightblue


  Am   Ar   Bg   Bn   Conversational   Cs   Da   De   El   En   Es   Fa   Fi   Fr   Gu   Ha   Hi   Hu   Id   It   Ja   Jv   Kn   Ko   Lt   Mr   Nl   No   Pl   Pt   Qwen2   Rag   Region:us   Ro   Ru   Safetensors   Sharded   Sk   Sv   Sw   Ta   Te   Tensorflow   Th   Tl   Tr   Uk   Ur   Vi   Yo   Zh

Kurage Multilingual Benchmarks

Scores are given as percentages (nn.n%) showing how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
Kurage Multilingual (lightblue/kurage-multilingual)

Kurage Multilingual Parameters and Internals

Model Type 
text generation
Additional Notes 
Kurage is a multipurpose retrieval-augmented generation (RAG) model trained to perform RAG in 44 languages. Its features include multi-chunk RAG, single-chunk RAG, answer extension, multilingual RAG, and Q&A generation.
Supported Languages 
am, ar, bg, bn, cs, da, de, el, en, es, fa, fi, fr, gu, ha, hi, hu, id, it, ja, jv, kn, ko, lt, mr, nl, no, pl, pt, ro, ru, sk, sv, sw, ta, te, th, tl, tr, uk, ur, vi, yo, zh
Training Details 
Data Sources:
MADLAD-400, BAAI/bge-m3
Methodology:
Training on chunks of different token sizes to generate questions and answers, selecting negatives using similarity from dense embeddings.
Hardware Used:
ml.gu7ef.8xlarge-gu100 instance on Platform For AI from Alibaba Cloud
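
The negative-selection step mentioned under Methodology can be illustrated with a minimal sketch. It assumes that "negatives" means the non-answer chunks most similar to the positive chunk (hard negatives); the listing does not say whether the most or least similar chunks were used. The helper names `cosine` and `select_negatives` are hypothetical, and the short toy vectors stand in for real dense embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def select_negatives(chunk_vecs, positive_idx, k=2):
    """Return indices of the k chunks most similar to the positive chunk.

    Assumption: high similarity to the positive chunk marks a useful
    (hard) negative. The positive chunk itself is excluded.
    """
    anchor = chunk_vecs[positive_idx]
    scored = [
        (cosine(anchor, vec), idx)
        for idx, vec in enumerate(chunk_vecs)
        if idx != positive_idx
    ]
    # Sort by descending similarity, breaking ties by index.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [idx for _, idx in scored[:k]]

# Toy embeddings (illustrative only, not produced by any real model).
toy_vecs = [
    [1.0, 0.0, 0.0],  # 0: positive chunk
    [0.9, 0.1, 0.0],  # 1: very close to the positive
    [0.0, 1.0, 0.0],  # 2: unrelated
    [0.7, 0.7, 0.0],  # 3: partially related
]
negatives = select_negatives(toy_vecs, positive_idx=0, k=2)
print(negatives)
```

In a real pipeline the vectors would come from an embedding model, and the choice of `k` and of any similarity threshold would need tuning.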
LLM Name: Kurage Multilingual
Repository: https://huggingface.co/lightblue/kurage-multilingual
Model Size: 7.6b
Required VRAM: 15.2 GB
Updated: 2025-01-22
Maintainer: lightblue
Model Type: qwen2
Model Files: 4.9 GB (1-of-4), 4.9 GB (2-of-4), 4.3 GB (3-of-4), 1.1 GB (4-of-4), 0.0 GB
Supported Languages: am ar bg bn cs da de el en es fa fi fr gu ha hi hu id it ja jv kn ko lt mr nl pl pt ro ru sk sv sw ta te th tl tr uk ur vi yo zh
Model Architecture: Qwen2ForCausalLM
License: apache-2.0
Context Length: 32768
Model Max Length: 32768
Transformers Version: 4.37.0
Tokenizer Class: Qwen2Tokenizer
Padding Token: <|endoftext|>
Vocabulary Size: 151648
Torch Data Type: bfloat16
Errors: replace
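
The listed VRAM figure is consistent with a simple weights-only estimate: parameter count times bytes per parameter. The sketch below assumes that is how the number was derived (the page does not say) and that 1 GB means 10^9 bytes. The function name `weight_memory_gb` is hypothetical. Activations and the KV cache are not included, so real usage will be higher.

```python
# Bytes needed to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "bfloat16": 2, "float16": 2, "int8": 1}

def weight_memory_gb(num_params, dtype):
    """Weights-only memory estimate in GB (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9

# 7.6B parameters stored as bfloat16, as listed above.
estimate = weight_memory_gb(7.6e9, "bfloat16")

# Cross-check against the four listed shard sizes (in GB).
shard_total = round(sum([4.9, 4.9, 4.3, 1.1]), 1)

print(estimate, shard_total)
```

Both values come out at 15.2 GB, matching the Required VRAM entry.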

Best Alternatives to Kurage Multilingual

Best Alternatives | Context / RAM | Downloads / Likes
MawaredT1 | 128K / 15.2 GB | 16311
Exp 3 Q R | 128K / 15.2 GB | 70
T36Model | 128K / 15.2 GB | 1270
T21Model | 128K / 15.2 GB | 1300
Exp 2 Q R | 128K / 15.2 GB | 60
Arcee Agent | 128K / 15.2 GB | 50392
T35Model | 128K / 15.2 GB | 90
Marco O1 | 32K / 15.2 GB | 5436700
T Lite It 1.0 | 32K / 15.2 GB | 669068
Arcee Spark | 32K / 15.2 GB | 472486

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227