L3 8B Stheno V3.2 by Sao10K


Tags: Autotrain compatible, Conversational, Endpoints compatible, Instruct, Llama, Safetensors, Sharded, Tensorflow, Region: us, Language: en
Datasets: gryphe/opus-writingpro..., sao10k/c2-logs-filtere..., sao10k/claude-3-opus-i..., sao10k/short-storygen-...


L3 8B Stheno V3.2 Parameters and Internals

Model Type 
text generation, assistant-style
Use Cases 
Areas:
Storywriting, Narration, Assistant-type tasks, Roleplaying
Primary Use Cases:
Story writing that balances SFW and NSFW content, multi-turn prompts handled with coherency, and assistant-style tasks.
Limitations:
May be slightly less creative than previous versions.
Additional Notes 
Contact the developer on Discord (username: sao10k) for more information.
Supported Languages 
en (English)
Training Details 
Data Sources:
Gryphe/Opus-WritingPrompts, Sao10K/Claude-3-Opus-Instruct-15K, Sao10K/Short-Storygen-v2, Sao10K/c2-Logs-Filtered
Methodology:
Multiple model variations from separate training runs were merged back into the base model at different weights. Hyperparameters were tuned for better performance. Both SFW and NSFW data were used for story writing.
Training Time:
24 hours over multiple runs
Hardware Used:
1x H100 SXM
Model Architecture:
Sixth iteration with improvements from v3.1.
Input Output 
Input Format:
<|begin_of_text|><|start_header_id|>system<|end_header_id|>

{system_prompt}<|eot_id|><|start_header_id|>user<|end_header_id|>

{input}<|eot_id|><|start_header_id|>assistant<|end_header_id|>

{output}<|eot_id|>
Accepted Modalities:
text
Output Format:
Text outputs with adherence to templates and instructions.
Performance Tips:
Use the recommended sampler settings: Temperature 1.12 to 1.22, Min-P 0.075, Top-K 50. Adhere to the stopping strings for best results.
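The prompt format above can be assembled programmatically. A minimal sketch in Python; `build_prompt` is a hypothetical helper name, and the blank line after each header follows the standard Llama-3 chat template convention (the card renders the same special tokens on one line):

```python
def build_prompt(system_prompt: str, user_input: str) -> str:
    """Assemble a Llama-3-style chat prompt for the model.

    Newline placement follows the standard Llama-3 chat template;
    the template ends at the assistant header so the model writes
    the completion.
    """
    return (
        "<|begin_of_text|>"
        "<|start_header_id|>system<|end_header_id|>\n\n"
        f"{system_prompt}<|eot_id|>"
        "<|start_header_id|>user<|end_header_id|>\n\n"
        f"{user_input}<|eot_id|>"
        "<|start_header_id|>assistant<|end_header_id|>\n\n"
    )

prompt = build_prompt("You are a narrator.", "Begin the story.")
print(prompt.startswith("<|begin_of_text|>"))  # True
```

With `transformers`, the same result is usually obtained via the tokenizer's built-in `apply_chat_template`, which reads the template shipped with the model repository.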
Release Notes 
Version:
v3.2
Notes:
Includes mix of SFW and NSFW data, more instruct style data, cleaned roleplay samples, better performance in various tasks.
LLM Name: L3 8B Stheno V3.2
Repository: https://huggingface.co/Sao10K/L3-8B-Stheno-v3.2
Model Size: 8B
Required VRAM: 16.1 GB
Updated: 2025-02-22
Maintainer: Sao10K
Model Type: llama
Instruction-Based: Yes
Model Files: 5.0 GB (1-of-4), 5.0 GB (2-of-4), 4.9 GB (3-of-4), 1.2 GB (4-of-4)
Supported Languages: en
Model Architecture: LlamaForCausalLM
License: cc-by-nc-4.0
Context Length: 8192
Model Max Length: 8192
Transformers Version: 4.41.2
Tokenizer Class: PreTrainedTokenizerFast
Vocabulary Size: 128256
Torch Data Type: bfloat16
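The 16.1 GB VRAM figure is consistent with the dtype above: bfloat16 stores each parameter in 2 bytes, and Llama-3 8B models are commonly cited at roughly 8.03 billion parameters (that count is an assumption, not stated in this card). A quick sanity check:

```python
def weights_size_gb(n_params: float, bytes_per_param: float) -> float:
    """Raw size of the weight tensors in gigabytes (1 GB = 1e9 bytes)."""
    return n_params * bytes_per_param / 1e9

# ~8.03B parameters at 2 bytes each (bfloat16)
print(round(weights_size_gb(8.03e9, 2), 1))  # 16.1
```

The same figure falls out of the shard list: 5.0 + 5.0 + 4.9 + 1.2 = 16.1 GB.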

Quantized Models of the L3 8B Stheno V3.2

Model | Likes | Downloads | VRAM
L3 8B Stheno V3.2 AWQ | 2 | 86 | 5 GB
L3 8B Stheno V3.2 AWQ | 0 | 67 | 5 GB
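The roughly 5 GB listed for the AWQ builds is what 4-bit weight-only quantization predicts. A rough estimate, assuming 4-bit AWQ and the same ~8.03B-parameter count used above (an assumption, not stated in this card):

```python
def quantized_size_gb(n_params: float, bits_per_param: float) -> float:
    """Approximate packed weight size in GB for weight-only quantization."""
    return n_params * bits_per_param / 8 / 1e9

# 4-bit weights alone come to about 4.0 GB
print(round(quantized_size_gb(8.03e9, 4), 1))  # 4.0
```

The gap to the listed 5 GB is plausibly the tensors AWQ leaves in higher precision (for example the 128256-entry embedding table) plus per-group scale factors.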

Best Alternatives to L3 8B Stheno V3.2

Best Alternatives | Context / RAM | Downloads | Likes
...a 3 8B Instruct Gradient 1048K | 1024K / 16.1 GB | 392768 | 0
Mpasila Viking 8B | 1024K / 16.1 GB | 84 | 0
Hel V2 8B DARK FICTION | 1024K / 16.1 GB | 22 | 0
16 | 1024K / 16.1 GB | 169 | 0
...di95 LewdStorytellerMix 8B 64K | 1024K / 16.1 GB | 69 | 2
Because Im Bored Nsfw1 | 1024K / 16.1 GB | 36 | 1
12 | 1024K / 16.1 GB | 60 | 0
MrRoboto ProLong 8B V4b | 1024K / 16.1 GB | 107 | 0
MrRoboto ProLong 8B V1a | 1024K / 16.1 GB | 108 | 0
MrRoboto ProLong 8B V2a | 1024K / 16.1 GB | 102 | 0



Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227