Mistral 7B Instruct V0.1 Summarize 64K LoRANET Merged by Ayush-1722


Tags: Arxiv:2310.06825 · Big patents · 200k+ context length · 7b · Autotrain compatible · Chat · Conversational · Dataset:facebook/babi qa · Dataset:rmt-team/babilong · Dataset:rmt-team/babilong-1k-s... · Dataset:trelis/big patent 100k... · Endpoints compatible · Finetuned · Instruct · Long context · Lora · Mistral · Norm & embed trained · Question answering · Region:us · Research · Rope · Safetensors · Science · Sharded · Summarize · Tensorflow · Theta scaling

Mistral 7B Instruct V0.1 Summarize 64K LoRANET Merged Benchmarks

nn.n%: how the model scores relative to the reference models: Anthropic Claude 3.5 Sonnet ("so35"), GPT-4o ("gpt4o"), and GPT-4 ("gpt4").
Mistral 7B Instruct V0.1 Summarize 64K LoRANET Merged (Ayush-1722/Mistral-7B-Instruct-v0.1-Summarize-64K-LoRANET-Merged)

Mistral 7B Instruct V0.1 Summarize 64K LoRANET Merged Parameters and Internals

Model Type 
text-generation, summarize, instruct, question-answering, chat, conversational, research, science
Additional Notes 
The model has no built-in moderation mechanisms; the maintainer is awaiting community discussion on how to incorporate guardrails into the deployment environment.
Training Details 
Data Sources:
RMT-team/babilong, RMT-team/babilong-1k-samples, Trelis/big_patent_100k_characters, facebook/babi_qa
Methodology:
Instruction fine-tuning with conversation datasets
Model Architecture:
Transformer based on Mistral-7B-v0.1, with Grouped-Query Attention and Sliding-Window Attention
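The "Rope" and "Theta scaling" tags suggest the context extension beyond the base model's window was achieved by scaling the RoPE base frequency. A minimal, read-only sketch for inspecting the relevant config fields (no specific theta value is claimed, since the listing does not state it):

```python
from transformers import AutoConfig

repo = "Ayush-1722/Mistral-7B-Instruct-v0.1-Summarize-64K-LoRANET-Merged"
config = AutoConfig.from_pretrained(repo)

# Maximum context window recorded in the config (131,072 per this listing).
print(config.max_position_embeddings)

# RoPE base frequency; a larger value stretches positional encodings over
# longer sequences. The exact value used by this checkpoint is not stated here.
print(config.rope_theta)
```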
Input Output 
Input Format:
Surround your prompt with `[INST]` and `[/INST]` tokens.
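A minimal sketch of that format (only the `[INST]`/`[/INST]` wrapping is documented for this model; the spacing and the hypothetical `build_prompt` helper follow the standard Mistral-Instruct convention and are assumptions):

```python
# Hypothetical helper illustrating the documented [INST] wrapping.
# The tokenizer (LlamaTokenizer, see below) normally prepends the
# BOS token <s> itself, so it is not included in the text.
def build_prompt(user_message: str) -> str:
    return f"[INST] {user_message} [/INST]"

prompt = build_prompt("Summarize the following patent in three sentences: ...")
```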
LLM Name: Mistral 7B Instruct V0.1 Summarize 64K LoRANET Merged
Repository 🤗: https://huggingface.co/Ayush-1722/Mistral-7B-Instruct-v0.1-Summarize-64K-LoRANET-Merged
Model Size: 7b
Required VRAM: 14.4 GB
Updated: 2024-12-13
Maintainer: Ayush-1722
Model Type: mistral
Instruction-Based: Yes
Model Files: 9.9 GB (1-of-2), 4.5 GB (2-of-2)
Model Architecture: MistralForCausalLM
License: apache-2.0
Context Length: 131072
Model Max Length: 131072
Transformers Version: 4.40.2
Tokenizer Class: LlamaTokenizer
Padding Token: <unk>
Vocabulary Size: 32000
Torch Data Type: bfloat16
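A minimal loading sketch consistent with the metadata above (sharded safetensors, bfloat16 weights, 131,072-token context); `device_map="auto"` and the generation settings are assumptions about the runtime environment:

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo = "Ayush-1722/Mistral-7B-Instruct-v0.1-Summarize-64K-LoRANET-Merged"

# Resolves to LlamaTokenizer per the config; the pad token is <unk>.
tokenizer = AutoTokenizer.from_pretrained(repo)

# bfloat16 matches the stored dtype; device_map="auto" requires the
# `accelerate` package and is an assumption about available hardware.
model = AutoModelForCausalLM.from_pretrained(
    repo,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

inputs = tokenizer("[INST] Summarize: ... [/INST]", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```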

Best Alternatives to Mistral 7B Instruct V0.1 Summarize 64K LoRANET Merged

Best Alternatives                    Context / RAM      Downloads   Likes
...Nemo Instruct 2407 Abliterated    1000K / 24.5 GB    274         59
SpydazWeb AI HumanAI RP              512K / 14.4 GB     36          1
SpydazWeb AI HumanAI 002             512K / 14.4 GB     16          1
...daz Web AI ChatML 512K Project    512K / 14.5 GB     12          0
... Summarize 64K QLoRANET Merged    128K / 4.1 GB      5           0
Mistral 7B Instruct V0.2             32K / 14.4 GB      1370132     2593
Mistral 7B Instruct V0.1             32K / 14.4 GB      1180602     1537
...ity Instruct 7M Gen Mistral 7B    32K / 14.4 GB      5627        3
...ty Instruct 3M 0625 Mistral 7B    32K / 14.4 GB      5660        3
...ty Instruct 3M 0613 Mistral 7B    32K / 14.4 GB      8081        11


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241124