Swallow 7B Instruct V0.1 by tokyotech-llm


Tags: Autotrain compatible, Conversational, En, Endpoints compatible, Instruct, Ja, Llama, Region:us, Safetensors, Sharded, Tensorflow

Swallow 7B Instruct V0.1 Parameters and Internals

Model Type 
text-generation, instruction-tuned
Use Cases 
Areas:
Research, Development
Applications:
Cross-Lingual Adaptation, Instruction Following
Primary Use Cases:
Text Generation, Language Translation
Limitations:
Outputs have not been tuned to align with human intent or safety considerations.
Additional Notes 
Developed by multiple team members from TokyoTech-LLM, with acknowledgements to Meta Research for Llama 2.
Supported Languages 
Japanese (Proficient), English (Proficient)
Training Details 
Data Sources:
OpenAssistant Conversations Dataset EN top-1 thread, OpenAssistant Conversations Dataset
Methodology:
Supervised fine-tuning (SFT)
Model Architecture:
Please refer to LLaMA-2 technical report for details on the model architecture.
Input Output 
Input Format:
<s>[INST] <<SYS>>\n{SYSTEM_PROMPT}\n<</SYS>>\n\n{USER_MESSAGE} [/INST]
Accepted Modalities:
text
Output Format:
Strings
Performance Tips:
Adhere strictly to the instruction format; deviations may degrade output quality.
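The input format can also be assembled programmatically. The following is a minimal sketch that assumes the standard Llama 2 chat layout, in which a `<<SYS>>` block wraps the system prompt inside the first `[INST]` turn. `build_prompt` and its parameter names are hypothetical helpers, not part of any library. The `<s>` token is omitted by default on the assumption that the tokenizer prepends it.

```python
def build_prompt(system_prompt: str, user_message: str, add_bos: bool = False) -> str:
    """Format a single-turn prompt in the assumed Llama 2 chat layout."""
    prompt = f"[INST] <<SYS>>\n{system_prompt}\n<</SYS>>\n\n{user_message} [/INST]"
    # Prepend the BOS token only when the tokenizer will not add it itself.
    return "<s>" + prompt if add_bos else prompt


if __name__ == "__main__":
    print(build_prompt("You are a helpful assistant.",
                       "Translate 'good morning' into Japanese."))
```

Keeping the template in one function makes it harder to drift from the expected format when prompts are built in several places.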
Release Notes 
Version:
0.1
Date:
April 26, 2024
Notes:
Release of enhanced instruction-tuned models as preview versions.
Version:
7b-plus
Date:
March 2, 2024
Notes:
Trained with approximately twice as many Japanese tokens.
Version:
13b-NVE-hf
Date:
February 4, 2024
Notes:
Model release with no vocabulary expansion.
Version:
7b-NVE
Date:
January 26, 2024
Notes:
Release of various instruct-hf models as well as no vocabulary expansion models.
Version:
7b
Date:
December 19, 2023
Notes:
Initial release of Swallow 7b, 13b, and 70b in instruct hf variants.
LLM Name: Swallow 7B Instruct V0.1
Repository: https://huggingface.co/tokyotech-llm/Swallow-7b-instruct-v0.1
Model Size: 7b
Required VRAM: 13.7 GB
Updated: 2025-02-16
Maintainer: tokyotech-llm
Model Type: llama
Instruction-Based: Yes
Model Files: 4.9 GB (1 of 3), 5.0 GB (2 of 3), 3.8 GB (3 of 3)
Supported Languages: en, ja
Model Architecture: LlamaForCausalLM
License: llama2
Context Length: 4096
Model Max Length: 4096
Transformers Version: 4.39.0.dev0
Tokenizer Class: LlamaTokenizer
Beginning of Sentence Token: <s>
End of Sentence Token: </s>
Unk Token: <unk>
Vocabulary Size: 43176
Torch Data Type: bfloat16
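
The size figures above are mutually consistent, which a short calculation shows. This sketch assumes that the required VRAM roughly equals the summed shard sizes and that each bfloat16 weight takes 2 bytes. The 4-bit figure counts weights only and ignores quantization overhead. All variable and function names are illustrative.

```python
SHARD_SIZES_GB = [4.9, 5.0, 3.8]  # model files listed above
BYTES_PER_PARAM_BF16 = 2          # bfloat16 stores each weight in 16 bits


def total_weights_gb(shards):
    """Sum the shard sizes, rounded to one decimal place."""
    return round(sum(shards), 1)


def approx_params_billion(weights_gb, bytes_per_param):
    """Estimate the parameter count (in billions) from weight size in GB."""
    return weights_gb / bytes_per_param


weights_gb = total_weights_gb(SHARD_SIZES_GB)
params_b = approx_params_billion(weights_gb, BYTES_PER_PARAM_BF16)
four_bit_gb = params_b * 0.5  # 4 bits = half a byte per weight

print(f"bfloat16 weights: {weights_gb} GB")
print(f"approx. parameters: {params_b:.2f} B")
print(f"4-bit weights only: {four_bit_gb:.1f} GB")
```

The shards sum to the listed 13.7 GB, which implies about 6.85 billion parameters, in line with the 7b model size. Activations and the KV cache add to this at inference time, so the figure is best read as a lower bound.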

Quantized Models of the Swallow 7B Instruct V0.1

Model
Likes
Downloads
VRAM
Swallow 7B Instruct V0.1 4bit1804 GB

Best Alternatives to Swallow 7B Instruct V0.1

Best Alternatives
Context / RAM
Downloads
Likes
... Qwen2.5llamaify 7B V23.1 200K195K / 15.2 GB44583
SuperNeuralDreadDevil 8B128K / 16.1 GB581
Falcon3 7B Instruct32K / 14.8 GB4290250
Falcon3 Jessi V0.4 7B Slerp32K / 14.9 GB4759
Jessi V0.4 Falcon3 7B Instruct32K / 14.8 GB1340
Taurus Opus 7B32K / 14.8 GB8210
Jessi V0.6 Falcon3 7B Instruct32K / 14.8 GB300
Jessi V0.5 Falcon3 7B Instruct32K / 14.8 GB120
Jessi V0.3 Falcon3 7B Instruct32K / 14.8 GB100
Jessi V0.2 Falcon3 7B Instruct32K / 14.8 GB90

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227