Japanese Mistral 300M Base by ce-lery


Tags: Autotrain compatible, Endpoints compatible, F32, Generated from trainer, GGML, GGUF, Mistral, Quantized, Region: US, Safetensors, Tensorboard

Japanese Mistral 300M Base Benchmarks

nn.n% — how the model compares to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), or GPT-4 ("gpt4").
Japanese Mistral 300M Base (ce-lery/japanese-mistral-300m-base)

Japanese Mistral 300M Base Parameters and Internals

Model Type 
text generation
Additional Notes 
Unknown-word generation is suppressed by enabling byte fallback in the SentencePiece tokenizer, which is then converted to the Hugging Face Tokenizers format.
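The idea behind byte fallback can be shown with a small standalone sketch. This is a simplification for illustration only (the real SentencePiece model segments subwords, not single characters): any character missing from the vocabulary is emitted as its UTF-8 byte tokens instead of a single unknown-token placeholder.

```python
def byte_fallback(text, vocab):
    """Tokenize text character by character; characters missing from the
    vocab fall back to their UTF-8 byte tokens (e.g. <0xE3>) instead of
    collapsing into a single <unk> token."""
    tokens = []
    for ch in text:
        if ch in vocab:
            tokens.append(ch)
        else:
            # Emit one token per UTF-8 byte so no information is lost.
            tokens.extend(f"<0x{b:02X}>" for b in ch.encode("utf-8"))
    return tokens

vocab = {"猫", "が", "好", "き"}
print(byte_fallback("猫が好き☃", vocab))
# ['猫', 'が', '好', 'き', '<0xE2>', '<0x98>', '<0x83>']
```

Because every byte sequence is representable, the model can still emit rare characters by producing their byte tokens, which is why the card highlights this as suppressing unknown-word generation.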
Supported Languages 
Japanese (fluent)
Training Details 
Data Sources: wikipedia, cc100
Methodology: pretraining with FlashAttention-2, torch.compile, and DeepSpeed; fine-tuning with databricks-dolly-15k-ja
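As a rough sketch of that training setup: the function and all configuration values below are illustrative assumptions, not the maintainer's published configuration. FlashAttention-2 requires a compatible CUDA GPU and the flash-attn package, and the `attn_implementation` keyword is the name used in recent transformers versions.

```python
# Illustrative sketch of a FlashAttention-2 + torch.compile + DeepSpeed setup.
# Nothing here reproduces the maintainer's actual training run.

def load_model_for_pretraining(name: str = "ce-lery/japanese-mistral-300m-base"):
    """Load the model with FlashAttention-2 enabled and compile it.
    Needs a CUDA GPU and the flash-attn package."""
    import torch
    from transformers import AutoModelForCausalLM

    model = AutoModelForCausalLM.from_pretrained(
        name, attn_implementation="flash_attention_2"
    )
    return torch.compile(model)

# A minimal DeepSpeed config fragment of the kind such a run might use;
# batch-size and accumulation values are assumed, not published.
deepspeed_config = {
    "train_micro_batch_size_per_gpu": 8,   # assumed
    "gradient_accumulation_steps": 16,     # assumed
    "zero_optimization": {"stage": 2},     # shard optimizer state and gradients
    "fp16": {"enabled": False},            # the card lists float32 weights
}
```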
Input Output 
Input Format:
tokenized input
Accepted Modalities:
text
Output Format:
tokenized output
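Concretely, inference follows the usual tokenize → generate → decode loop. A minimal sketch assuming the standard transformers text-generation API (the sampling parameters are arbitrary examples, not the maintainer's recommendation):

```python
def truncate_to_context(token_ids, max_length=4096):
    """Keep only the most recent tokens that fit the 4096-token context window."""
    return token_ids[-max_length:]

def generate(prompt: str, max_new_tokens: int = 64) -> str:
    """Tokenize `prompt`, sample a continuation, and decode it back to text."""
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained("ce-lery/japanese-mistral-300m-base")
    model = AutoModelForCausalLM.from_pretrained("ce-lery/japanese-mistral-300m-base")

    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(
        **inputs, max_new_tokens=max_new_tokens, do_sample=True, top_p=0.95
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Inputs longer than the 4096-token context must be truncated before generation, which is what the helper above illustrates.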
LLM Name: Japanese Mistral 300M Base
Repository: https://huggingface.co/ce-lery/japanese-mistral-300m-base
Base Model(s): None
Model Size: 300M
Required VRAM: 2.8 GB
Updated: 2025-02-22
Maintainer: ce-lery
Model Type: mistral
Model Files: 1.4 GB, 1.4 GB, 0.0 GB
GGML Quantization: Yes
GGUF Quantization: Yes
Quantization Type: ggml|gguf
Model Architecture: MistralForCausalLM
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.35.2
Tokenizer Class: T5Tokenizer
Padding Token: [PAD]
Vocabulary Size: 50257
Torch Data Type: float32

Best Alternatives to Japanese Mistral 300M Base

Best Alternatives                    Context / RAM    Downloads   Likes
Lite Oute 1 300M Instruct            4K / 1.2 GB      485         10
Lite Oute 1 300M                     4K / 1.2 GB      354         7
Mistral 300M                         4K / 0 GB        165         2
...anese Mistral 300M Instruction    4K / 1.4 GB      122         3
Note: a green score (e.g. "73.2") means the model is better than ce-lery/japanese-mistral-300m-base.


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227