TC Instruct DPO by tanamettpk


Tags: autotrain-compatible, base model: scb10x/typhoon-7b (finetune), chatml, dpo, rlhf, en, th, endpoints-compatible, finetuned, instruct, mistral, region:us, safetensors, sharded, synthetic data, tensorflow
Datasets: pythainlp/thainer-corpus-v2, pythainlp/thaisum, SuperAI2-Machima/ThaiQA_LST20, thai_toxicity_tweet, thaisum, Thaweewat/alpaca-cleaned-52k-th, Thaweewat/instruct-qa-thai-combined, yahma/alpaca-cleaned


TC Instruct DPO Parameters and Internals

Model Type 
instruct, chatml, DPO, RLHF, synthetic data
Additional Notes 
The model was built primarily as an educational exercise in LLM creation; the author acknowledges initial training difficulties and intends the model for further improvement and experimentation.
Supported Languages 
en, th (proficiency levels not evaluated)
Training Details 
Data Sources:
Thaweewat/alpaca-cleaned-52k-th, yahma/alpaca-cleaned, pythainlp/thaisum, thai_toxicity_tweet, pythainlp/thainer-corpus-v2, Thaweewat/instruct-qa-thai-combined, SuperAI2-Machima/ThaiQA_LST20, thaisum
Methodology:
Trained with QLoRA (rank 32, alpha 64) using a custom Hugging Face script (see the sketch below)
Training Time:
Approximately 21 hours
Hardware Used:
1x H100 PCIe 80 GB rented from vast.ai
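Since the card names DPO and RLHF as the training approach but gives only the adapter rank and alpha, the following is a minimal, hedged sketch of what such a run could look like on the Hugging Face peft/trl stack (TRL >= 0.12 API). Only the base model (scb10x/typhoon-7b) and the rank-32/alpha-64 adapter come from the card; the dataset file `dpo_pairs.jsonl`, the quantization choices, target modules, and batch settings are illustrative assumptions, and the author's actual custom script may differ substantially.

```python
# Hedged QLoRA + DPO sketch. Only LoRA rank (32), alpha (64), and the
# base model scb10x/typhoon-7b come from the card; all other values
# are assumptions for illustration.
import torch
from datasets import load_dataset
from peft import LoraConfig
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from trl import DPOConfig, DPOTrainer

base = "scb10x/typhoon-7b"

# 4-bit quantization so the 7B policy (and the implicit reference) fits in VRAM
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_compute_dtype=torch.float16)
model = AutoModelForCausalLM.from_pretrained(base, quantization_config=bnb, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(base)

# QLoRA adapter: rank 32, alpha 64, as stated on the card
peft_config = LoraConfig(
    r=32,
    lora_alpha=64,
    lora_dropout=0.05,                                        # assumption
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
)

# A preference dataset with prompt/chosen/rejected columns is assumed
train_ds = load_dataset("json", data_files="dpo_pairs.jsonl", split="train")

trainer = DPOTrainer(
    model=model,
    args=DPOConfig(
        output_dir="tc-instruct-dpo",
        per_device_train_batch_size=1,
        gradient_accumulation_steps=8,
        num_train_epochs=1,
    ),
    train_dataset=train_ds,
    processing_class=tokenizer,
    peft_config=peft_config,  # trl attaches this adapter to the quantized base
)
trainer.train()
```

When a PEFT config is supplied and no explicit reference model is given, TRL reuses the frozen base weights (adapters disabled) as the DPO reference, which is what makes a run like this plausible on a single 80 GB H100.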
Input Output 
Output Format:
Follows the instruction-response design of the prompt template (the model is tagged as ChatML; see the example below)
Performance Tips:
Use Axolotl or Unsloth for more cost-efficient training
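The card tags the model as ChatML but does not reproduce the exact template, so the following is a hedged sketch: if the repository's tokenizer config ships a chat template, `apply_chat_template` will use it; the manual string shows the standard ChatML layout assumed here.

```python
# Hedged ChatML prompting example; the exact template shipped with the
# repo may differ from the standard layout shown in `manual` below.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("tanamettpk/TC-instruct-DPO")

messages = [
    {"role": "user", "content": "ช่วยสรุปข่าวนี้ให้หน่อย"},  # Thai: "Please summarize this news"
]

# Uses the repo's chat template if one is defined in tokenizer_config.json
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

# Standard ChatML layout (an assumption, not confirmed by the card):
manual = "<|im_start|>user\nช่วยสรุปข่าวนี้ให้หน่อย<|im_end|>\n<|im_start|>assistant\n"
```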
LLM Name: TC Instruct DPO
Repository: https://huggingface.co/tanamettpk/TC-instruct-DPO
Base Model(s): Typhoon 7B (scb10x/typhoon-7b)
Model Size: 7B
Required VRAM: 14.6 GB
Updated: 2024-12-14
Maintainer: tanamettpk
Model Type: mistral
Instruction-Based: Yes
Model Files: 5.0 GB (1-of-3), 5.0 GB (2-of-3), 4.6 GB (3-of-3)
Supported Languages: en, th
Model Architecture: MistralForCausalLM
License: apache-2.0
Context Length: 32768
Model Max Length: 32768
Transformers Version: 4.37.2
Tokenizer Class: LlamaTokenizer
Padding Token: </s>
Vocabulary Size: 35219
Torch Data Type: float16
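As a quick usage reference matching the fields above (MistralForCausalLM architecture, float16 weights, roughly 14.6 GB of VRAM, 32768-token context), here is a minimal loading sketch using the standard transformers API; the generation settings are illustrative defaults, not values from the card.

```python
# Minimal loading sketch; the dtype matches the card's float16 entry and
# the tokenizer resolves to LlamaTokenizer per the card. Sampling
# settings below are assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "tanamettpk/TC-instruct-DPO"
tokenizer = AutoTokenizer.from_pretrained(repo)
model = AutoModelForCausalLM.from_pretrained(
    repo, torch_dtype=torch.float16, device_map="auto"
)

inputs = tokenizer("สวัสดีครับ", return_tensors="pt").to(model.device)  # Thai: "Hello"
out = model.generate(**inputs, max_new_tokens=128, do_sample=True, temperature=0.7)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```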

Best Alternatives to TC Instruct DPO

Best Alternatives | Context / RAM | Downloads | Likes
...Nemo Instruct 2407 Abliterated | 1000K / 24.5 GB | 2745 | 9
SpydazWeb AI HumanAI RP | 512K / 14.4 GB | 36 | 1
SpydazWeb AI HumanAI 002 | 512K / 14.4 GB | 16 | 1
...daz Web AI ChatML 512K Project | 512K / 14.5 GB | 12 | 0
... Summarize 64K QLoRANET Merged | 128K / 4.1 GB | 5 | 0
...1 Summarize 64K LoRANET Merged | 128K / 14.4 GB | 5 | 0
Mistral 7B Instruct V0.2 | 32K / 14.4 GB | 1370132 | 2593
Mistral 7B Instruct V0.1 | 32K / 14.4 GB | 1180602 | 1537
...ity Instruct 7M Gen Mistral 7B | 32K / 14.4 GB | 5627 | 3
...ty Instruct 3M 0625 Mistral 7B | 32K / 14.4 GB | 5660 | 3

Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124