KULLM RLHF by Trofish


Arxiv:2303.16634, Autotrain compatible, Endpoints compatible, Gpt neox, Pytorch, Region:us
Model Card on HF: https://huggingface.co/Trofish/KULLM-RLHF

KULLM RLHF Parameters and Internals

Model Type: Conversational AI, Chatbot
Use Cases
  Areas: Research, Chatbot Development
  Applications: Korean language conversational AI
  Primary Use Cases: Friendly and harmless everyday conversations in Korean
Additional Notes: The final model, implemented with RLHF and DeepSpeedChat, aims to produce high-quality conversational responses that are user-friendly and ethical.
Supported Languages: Korean (high)
Training Details
  Data Sources: Self-Instruct using GPT-4, RLHF with human feedback, DeepSpeed optimization
  Methodology: Reinforcement Learning from Human Feedback, Self-Instruct Data Augmentation, DeepSpeed for large-scale distributed deep learning
  Hardware Used: Google Colab A100 40GB GPU
  Model Architecture: Uses KULLM as the baseline model, trained with RLHF and SFT (supervised fine-tuning) techniques
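A minimal sketch of how the checkpoint might be loaded for a Korean chat turn with the Hugging Face `transformers` library, assuming the repository id, float16 data type, and 2048-token context length listed on this page. The helper names `clamp_prompt_tokens` and `generate_reply` are hypothetical, and the prompt is passed as plain text, which is an assumption: the model card may specify a prompt template that should be used instead. Passing `return_token_type_ids=False` is a precaution, since some GPT-NeoX tokenizers emit a field that `generate` does not accept.

```python
from typing import List

REPO_ID = "Trofish/KULLM-RLHF"  # repository id from this page
CONTEXT_LENGTH = 2048           # context length listed on this page


def clamp_prompt_tokens(token_ids: List[int], max_new_tokens: int,
                        context_length: int = CONTEXT_LENGTH) -> List[int]:
    """Keep the most recent prompt tokens so prompt plus reply fits the context."""
    if not 0 < max_new_tokens < context_length:
        raise ValueError("max_new_tokens must be between 1 and context_length - 1")
    budget = context_length - max_new_tokens
    return token_ids[-budget:]


def generate_reply(prompt: str, max_new_tokens: int = 128) -> str:
    """Hypothetical helper: load the checkpoint and generate a single reply."""
    # Imported lazily so the pure helper above works without these packages.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(REPO_ID)
    model = AutoModelForCausalLM.from_pretrained(REPO_ID, torch_dtype=torch.float16)
    model.to("cuda")  # assumes one GPU with enough memory for the float16 weights
    model.eval()

    ids = tokenizer(prompt, return_token_type_ids=False)["input_ids"]
    ids = clamp_prompt_tokens(ids, max_new_tokens)
    input_ids = torch.tensor([ids], device=model.device)
    with torch.no_grad():
        output = model.generate(
            input_ids,
            attention_mask=torch.ones_like(input_ids),
            max_new_tokens=max_new_tokens,
            pad_token_id=tokenizer.pad_token_id,
        )
    # Decode only the newly generated tokens, not the echoed prompt.
    return tokenizer.decode(output[0][len(ids):], skip_special_tokens=True)
```

Truncating from the left keeps the most recent part of a conversation when the history exceeds the context window; other strategies, such as summarizing older turns, are equally possible.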
LLM Name: KULLM RLHF
Repository: https://huggingface.co/Trofish/KULLM-RLHF
Required VRAM: 25.8 GB
Updated: 2025-02-22
Maintainer: Trofish
Model Type: gpt_neox
Model Files: 25.8 GB
Model Architecture: GPTNeoXForCausalLM
Context Length: 2048
Model Max Length: 2048
Transformers Version: 4.31.0
Tokenizer Class: PreTrainedTokenizerFast
Padding Token: <|endoftext|>
Vocabulary Size: 30008
Torch Data Type: float16
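The 25.8 GB figure can be related to model size with simple arithmetic: float16 stores each parameter in 2 bytes, so the weight files imply roughly 12.9 billion parameters. The sketch below assumes 1 GB means 10^9 bytes and that the figure covers weights only; activations and the attention cache need additional memory at inference time. The function names are hypothetical.

```python
# Bytes used to store one parameter in common data types.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def estimate_param_count(weights_gb: float, dtype: str = "float16") -> float:
    """Estimate the parameter count from weight size, treating 1 GB as 1e9 bytes."""
    return weights_gb * 1e9 / BYTES_PER_PARAM[dtype]


def estimate_weights_gb(param_count: float, dtype: str = "float16") -> float:
    """Estimate weight size in GB for a parameter count and data type."""
    return param_count * BYTES_PER_PARAM[dtype] / 1e9


params = estimate_param_count(25.8, "float16")
print(f"about {params / 1e9:.1f}B parameters")
print(f"about {estimate_weights_gb(params, 'int8'):.1f} GB if stored as int8")
```

The same helper shows why lower-precision storage is attractive on smaller GPUs: halving the bytes per parameter halves the weight footprint, although this page lists only the float16 files.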

Best Alternatives to KULLM RLHF

Best Alternatives                     Context / RAM    Downloads and Likes (digits run together in the source)
Catlm                                 8K / 7.8 GB      454
...Prover 14final Checkpoint 5830     4K / 14.9 GB     50
Neox Musenet Untrained                4K / 7.3 GB      60
Stabillm Instruct De                  4K / 31.8 GB     50
Open Calm Large                       2K / 1.8 GB      349510
MonoCoder OMP                         2K / 3.6 GB      2020
ProofGPT V0.1                         2K / 2.9 GB      19743
GPT NeoX Pretrain News                2K / 0.3 GB      3060
Step3 Mk7                             2K / 25.8 GB     180
GPT NeoX Pretrain 1GB                 2K / 0.3 GB      1620

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227