35B Beta2ep by CausalLM

Model Card on HF 🤗: https://huggingface.co/CausalLM/35b-beta2ep

35B Beta2ep Benchmarks

35B Beta2ep (CausalLM/35b-beta2ep)

35B Beta2ep Parameters and Internals

Additional Notes 
No LoRAs, no quants, no tricks. Recommended for general tasks, knowledge, and coding.
Supported Languages 
en (full), zh (full), ja (full), de (full)
Training Details 
Data Sources:
JosephusCheung/GuanacoDataset, meta-math/MetaMathQA, jondurbin/airoboros-3.1, WizardLM/WizardLM_evol_instruct_V2_196k, RyokoAI/ShareGPT52K, RyokoAI/Fandom23K, milashkaarshif/MoeGirlPedia_wikitext_raw_archive, wikipedia, wiki_lingua, garage-bAInd/Open-Platypus, LDJnr/Puffin, BAAI/COIG, TigerResearch/tigerbot-zhihu-zh-10k, liwu/MNBVC, teknium/openhermes, CausalLM/Refined-Anime-Text, microsoft/orca-math-word-problems-200k, m-a-p/CodeFeedback-Filtered-Instruction
Methodology:
Fully fine-tuned at 128K+ context on ~30M entries: long web-crawl inputs paired with GPT-4-32k/GPT-3.5-16k outputs (synthetic dataset)
Context Length:
128000
Training Epochs:
1
Input Output 
Input Format:
web crawl input, synthetic dataset
Output Format:
GPT-4-32k/3.5-16k output
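For Cohere-architecture models, prompts are normally built via the tokenizer's chat template rather than by hand. As an illustration of the turn structure, here is a minimal sketch assuming the standard Cohere turn tokens (an assumption; verify against the template shipped in the repository's tokenizer config):

```python
# Build a single-turn prompt in the Cohere turn format.
# Token names here are assumed from the Command-R family; in practice,
# prefer tokenizer.apply_chat_template(), which reads the model's own template.
def cohere_prompt(user_message: str) -> str:
    return (
        "<BOS_TOKEN>"
        "<|START_OF_TURN_TOKEN|><|USER_TOKEN|>" + user_message
        + "<|END_OF_TURN_TOKEN|>"
        "<|START_OF_TURN_TOKEN|><|CHATBOT_TOKEN|>"
    )

print(cohere_prompt("Hello!"))
```

The trailing chatbot-turn token leaves the prompt open for the model to generate the assistant reply.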
LLM Name: 35B Beta2ep
Repository 🤗: https://huggingface.co/CausalLM/35b-beta2ep
Model Size: 35b
Required VRAM: 69.5 GB
Updated: 2025-02-22
Maintainer: CausalLM
Model Type: cohere
Instruction-Based: Yes
Model Files: 15 safetensors shards: 4.7 GB (1-of-15), 4.9 GB each (2-of-15 through 14-of-15), 1.1 GB (15-of-15)
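As a sanity check, the shard sizes listed above sum to the stated 69.5 GB of required VRAM:

```python
# Sum the safetensors shard sizes (GB): one 4.7 GB shard,
# thirteen 4.9 GB shards, and one 1.1 GB shard.
shards = [4.7] + [4.9] * 13 + [1.1]
total = sum(shards)
print(f"{total:.1f} GB across {len(shards)} shards")  # 69.5 GB across 15 shards
```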
Supported Languages: en, zh, ja, de
Model Architecture: CohereForCausalLM
License: gpl-3.0
Context Length: 8192
Model Max Length: 8192
Transformers Version: 4.38.2
Tokenizer Class: LlamaTokenizer
Padding Token: <PAD>
Vocabulary Size: 256000
Torch Data Type: bfloat16
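The 69.5 GB VRAM figure follows directly from the parameter count and dtype: at bfloat16, each of the ~35B parameters occupies 2 bytes. A back-of-the-envelope sketch (weights only, ignoring activations and KV cache):

```python
# Estimate weight memory from parameter count and dtype width.
def weight_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """bfloat16 stores each parameter in 2 bytes."""
    return n_params * bytes_per_param / 1e9

est = weight_gb(35e9)  # 35B parameters in bfloat16
print(f"~{est:.0f} GB of weights")  # ~70 GB, close to the listed 69.5 GB
```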

Best Alternatives to 35B Beta2ep

Best Alternatives                          Context / RAM    Downloads  Likes
35B Beta Long                              8K / 69.5 GB     566        5
...ommand R V01 Japanese Instruct          8K / 69.5 GB     109        4

Note: a green score (e.g. "73.2") means that the model outperforms CausalLM/35b-beta2ep.


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227