OpenBezoar HH RLHF SFT by SurgeGlobal


Tags: Arxiv:2306.02707, Arxiv:2404.12195, Autotrain compatible, Base model:finetune:surgegloba..., Base model:surgeglobal/openbez..., Dataset:anthropic/hh-rlhf, En, Endpoints compatible, Llama, Pytorch, Region:us, Safetensors


OpenBezoar HH RLHF SFT Parameters and Internals

Model Type: text generation

Use Cases
Limitations:
- The model might not consistently show improved abilities to follow instructions, and it could respond inappropriately or get stuck in loops.
- This model is not aligned to human preferences and therefore it may generate harmful and uncensored content.
- Caution is urged against relying on this model for production or adjacent use-cases.

Supported Languages: en

Training Details
Data Sources: Anthropic HH-RLHF Dataset
Data Volume: First 100K examples
Methodology: Supervised Fine-Tuning (SFT)
Model Architecture: OpenLLaMA 3B v2

Input Output
Input Format: Alpaca prompt template
Performance Tips: Use the Alpaca prompt template to get the best responses on instruction-related tasks.
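
The page names the Alpaca prompt template but does not reproduce it. The sketch below builds a prompt using the wording of the widely used Stanford Alpaca template; that wording is an assumption here, and the model repository may use a variant, so it is worth checking against the model card. The helper name `build_alpaca_prompt` and its `context` parameter are hypothetical.

```python
from typing import Optional

# Template text follows the commonly used Stanford Alpaca format.
# ASSUMPTION: this page does not show the exact wording the model was tuned on.
ALPACA_WITH_INPUT = (
    "Below is an instruction that describes a task, paired with an input that "
    "provides further context. Write a response that appropriately completes "
    "the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n"
)

ALPACA_NO_INPUT = (
    "Below is an instruction that describes a task. Write a response that "
    "appropriately completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Response:\n"
)


def build_alpaca_prompt(instruction: str, context: Optional[str] = None) -> str:
    """Return an Alpaca-style prompt; the Input section is used only when context is non-empty."""
    instruction = instruction.strip()
    if context and context.strip():
        return ALPACA_WITH_INPUT.format(
            instruction=instruction, context=context.strip()
        )
    return ALPACA_NO_INPUT.format(instruction=instruction)


if __name__ == "__main__":
    print(build_alpaca_prompt("List three uses of a paperclip."))
```

The returned string ends with the `### Response:` header, so the model's generated text is expected to continue from that point.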

LLM Name: OpenBezoar HH RLHF SFT
Repository: https://huggingface.co/SurgeGlobal/OpenBezoar-HH-RLHF-SFT
Base Model(s): OpenBezoar SFT (SurgeGlobal/OpenBezoar-SFT)
Model Size: 3B
Required VRAM: 6.8 GB
Updated: 2025-02-22
Maintainer: SurgeGlobal
Model Type: llama
Model Files: 6.8 GB, 6.8 GB
Supported Languages: en
Model Architecture: LlamaForCausalLM
License: cc-by-nc-4.0
Context Length: 2048
Model Max Length: 2048
Transformers Version: 4.33.2
Tokenizer Class: LlamaTokenizer
Vocabulary Size: 32000
Torch Data Type: float16
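
The listed VRAM figure can be related to the model size and data type with a rough calculation: weight memory is approximately the parameter count times the bytes per parameter. The sketch below is an estimate for weights only and ignores activations and the key/value cache. The helper name `weight_memory_gb` is hypothetical, and the parameter count of about 3.4 billion used in the second call is an assumption that is not stated on this page.

```python
# Bytes needed to store one parameter in each data type.
BYTES_PER_PARAM = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def weight_memory_gb(num_params: float, dtype: str = "float16") -> float:
    """Estimate memory for model weights only, in decimal gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


if __name__ == "__main__":
    # Nominal "3B" size from the listing, stored as float16.
    print(f"{weight_memory_gb(3.0e9):.1f} GB")
    # ASSUMPTION: an actual parameter count near 3.4 billion (not given on this page).
    print(f"{weight_memory_gb(3.4e9):.1f} GB")
```

A nominal 3 billion parameters in float16 gives about 6 GB, so the listed 6.8 GB is consistent with a parameter count somewhat above the rounded "3B" label.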

Best Alternatives to OpenBezoar HH RLHF SFT

Best Alternatives                    Context / RAM     Downloads, Likes
Llama 3.2 3B Instruct                128K / 6.5 GB     16030891085
Llama 3.2 3B                         128K / 6.5 GB     292157509
Hermes 3 Llama 3.2 3B                128K / 6.5 GB     21968144
ReasoningCore 3B RE1 V2              128K / 6.5 GB     1160
...penReasoner Llama 3.2 3B Rs1.0    128K / 6.5 GB     1471
Zeitgeist 3B V1                      128K / 6.5 GB     492
...eflection L3.2 JametMiniMix 3B    128K / 6.4 GB     1100
Calme 3.1 Llamaloi 3B                128K / 10.6 GB    29111
Dolphin3.0 Llama3.2 3B               128K / 6.5 GB     1880036
... 3.2 3B Math Instruct RE1 ORPO    128K / 6.5 GB     1350

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227