XVERSE 13B 256K by xverse

Tags: Autotrain compatible, Custom code, PyTorch, Region: us, Sharded, Xverse
Model Card on HF: https://huggingface.co/xverse/XVERSE-13B-256K

XVERSE 13B 256K Parameters and Internals

Additional Notes: Model weights are fully open to academia and support free commercial use.

Training Details
Data Volume: 20% of pre-training data
Methodology: Continual pre-training based on ABF and supervised fine-tuning based on NTK
Context Length: 256000
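
The methodology above names ABF and NTK without detail. Assuming these refer to the two common ways of stretching rotary position embeddings (RoPE), namely adjusted base frequency and NTK-aware base scaling, the following is a minimal sketch of what each does to the rotation frequencies. The head dimension, both base values, and the scale factor are hypothetical illustration values, not XVERSE's published settings.

```python
import math

def rope_inv_freq(head_dim: int, base: float) -> list[float]:
    """Per-pair rotation frequencies used by RoPE: base^(-2i/d)."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]

def ntk_scaled_base(base: float, scale: float, head_dim: int) -> float:
    """NTK-aware scaling: enlarge the base so that the lowest frequency
    is divided by `scale` while the highest frequency stays unchanged."""
    return base * scale ** (head_dim / (head_dim - 2))

def longest_wavelength(inv_freq: list[float]) -> float:
    """Number of positions covered by one full turn of the slowest pair."""
    return 2 * math.pi / inv_freq[-1]

HEAD_DIM = 128            # hypothetical head dimension
ORIGINAL_BASE = 10000.0   # common RoPE default, assumed here
ABF_BASE = 500000.0       # hypothetical adjusted base frequency
NTK_SCALE = 8.0           # hypothetical context extension factor

orig_freqs = rope_inv_freq(HEAD_DIM, ORIGINAL_BASE)
abf_freqs = rope_inv_freq(HEAD_DIM, ABF_BASE)
ntk_freqs = rope_inv_freq(
    HEAD_DIM, ntk_scaled_base(ORIGINAL_BASE, NTK_SCALE, HEAD_DIM)
)

for name, freqs in [("original", orig_freqs), ("ABF", abf_freqs), ("NTK", ntk_freqs)]:
    print(f"{name:>8}: longest wavelength = {longest_wavelength(freqs):.0f} positions")
```

In both variants the slowest-rotating pairs take longer to complete a turn, so positions far beyond the original window still map to distinct angles. Under these assumptions, ABF changes the base before further pre-training, while NTK-aware scaling derives a new base from a target extension factor.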

Release Notes
2024-06-28: Updated tokenizers.
2024-01-16: Released the long-sequence model XVERSE-13B-256K. Supports a maximum window length of 256K for tasks like summarization and report analysis.
2023-11-06: Released new versions of XVERSE-13B-2 base model and XVERSE-13B-Chat-2 with extended training from 1.4T to 3.2T.
2023-09-26: Released XVERSE-7B base model and XVERSE-7B-Chat supporting deployment on a single consumer-grade graphics card.
2023-08-22: Released instruct-finetuned XVERSE-13B-Chat model.
2023-08-07: Released XVERSE-13B base model.

LLM Name: XVERSE 13B 256K
Repository: https://huggingface.co/xverse/XVERSE-13B-256K
Model Size: 13b
Required VRAM: 27.4 GB
Updated: 2025-02-22
Maintainer: xverse
Model Type: xverse
Model Files: 15 shards (1.9 GB each for shards 1 to 13, 1.7 GB for shard 14, 1.0 GB for shard 15)
Model Architecture: XverseForCausalLM
License: apache-2.0
Context Length: 32768
Model Max Length: 32768
Transformers Version: 4.28.1
Tokenizer Class: PreTrainedTokenizerFast
Vocabulary Size: 100534
Torch Data Type: bfloat16
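
As a quick sanity check on the figures above, the listed shard sizes can be summed and compared with the Required VRAM entry. The sketch below also gives a rough weights-only estimate from the nominal parameter count. Treating "13b" as exactly 13 billion parameters and bfloat16 as 2 bytes per parameter is an approximation, and neither number includes activations or the KV cache needed at long context.

```python
# Shard sizes in GB as listed under Model Files:
# shards 1 to 13 at 1.9 GB, shard 14 at 1.7 GB, shard 15 at 1.0 GB.
shard_sizes_gb = [1.9] * 13 + [1.7, 1.0]
total_gb = sum(shard_sizes_gb)
print(f"{len(shard_sizes_gb)} shards, {total_gb:.1f} GB on disk")

# Rough weights-only estimate from the nominal model size (assumption:
# exactly 13e9 parameters, 2 bytes each for bfloat16).
NOMINAL_PARAMS = 13e9
BYTES_PER_PARAM = 2
estimate_gb = NOMINAL_PARAMS * BYTES_PER_PARAM / 1e9
print(f"nominal estimate: {estimate_gb:.1f} GB")
```

The shard total matches the 27.4 GB shown for Required VRAM, which suggests that figure reflects the size of the weights alone.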

Best Alternatives to XVERSE 13B 256K

Model: Context / RAM; Downloads and Likes
XVERSE 13B: 8K / 27.6 GB; 291120
XVERSE 13B Chat: 8K / 27.6 GB; 12945
Xverse 13B Int4: 8K / 8.4 GB; 772

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227