MistralLite by AWS


  Autotrain compatible   Mistral   PyTorch   Region: us   Sharded
Model Card on HF 🤗: https://huggingface.co/amazon/MistralLite

MistralLite Benchmarks

MistralLite (amazon/MistralLite)

MistralLite Parameters and Internals

Model Type 
Language Model, Text Generation
Use Cases 
Areas:
Research, Commercial applications
Applications:
Long context retrieval, Summarization, Question-answering
Primary Use Cases:
Long context line and topic retrieval, Summarization, Question-answering
Limitations:
Performance may vary based on specific long context tasks and input lengths.
Considerations:
Use prompt templates for effective outcomes.
Additional Notes 
MistralLite supports various deployment methods suitable for different environments. It requires initial setup but offers improved performance for long context tasks.
Supported Languages 
English (Proficient)
Training Details 
Data Sources:
SLiding-Encoder and Decoder (SLED), (Long) Natural Questions (NQ), OpenAssistant Conversations Dataset (OASST1)
Methodology:
Utilized an adapted Rotary Embedding and sliding window during fine-tuning
Context Length:
32000
Model Architecture:
Fine-tuned version of the Mistral-7B-v0.1 model using adaptations for long context handling.
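The "adapted Rotary Embedding" used during fine-tuning works by raising RoPE's inverse-frequency base so that positional rotations wrap more slowly and distant positions stay distinguishable over a long context. A minimal sketch of the effect (the base value 1,000,000 versus Mistral's default 10,000 is the commonly reported MistralLite setting and is treated here as an assumption, not a figure stated on this page):

```python
def rope_inv_freq(head_dim: int, base: float):
    """Inverse frequencies used by rotary position embeddings (RoPE)."""
    return [base ** (-2 * i / head_dim) for i in range(head_dim // 2)]

# Mistral-7B's default RoPE base vs. the larger base reportedly used
# for MistralLite's long-context adaptation (assumed value).
default_freqs = rope_inv_freq(128, 10_000.0)
adapted_freqs = rope_inv_freq(128, 1_000_000.0)

# A larger base shrinks the lowest frequency, so the slowest-rotating
# dimensions take far longer to wrap around, helping at 32K positions.
print(default_freqs[-1], adapted_freqs[-1])
```

The highest-index (slowest) frequency is much smaller with the adapted base, which is what lets attention distinguish tokens tens of thousands of positions apart.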
Input Output 
Input Format:
Prompt templates such as '<|prompter|>What are the main challenges to support a long context for LLM?</s><|assistant|>'
Accepted Modalities:
text
Output Format:
Generated text responses aligned with input prompts
Performance Tips:
Use the prompt template above for optimal model performance.
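The template above can be applied programmatically before sending text to the model. A small sketch of a helper that wraps a question in the prompter/assistant markers (the `</s>` separator between the two tags follows the template shown in the Input Format above):

```python
def format_mistrallite_prompt(question: str) -> str:
    """Wrap a user question in MistralLite's expected prompt template."""
    return f"<|prompter|>{question}</s><|assistant|>"

prompt = format_mistrallite_prompt(
    "What are the main challenges to support a long context for LLM?"
)
print(prompt)
```

The resulting string is what gets tokenized and passed to the model; the generated completion follows the `<|assistant|>` marker.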
LLM Name: MistralLite
Repository 🤗: https://huggingface.co/amazon/MistralLite
Required VRAM: 14.4 GB
Updated: 2025-06-01
Maintainer: AWS
Model Type: mistral
Model Files: 9.9 GB (1-of-2), 4.5 GB (2-of-2)
Model Architecture: MistralForCausalLM
License: apache-2.0
Context Length: 32768
Model Max Length: 32768
Transformers Version: 4.34.0
Tokenizer Class: LlamaTokenizer
Padding Token: [PAD]
Vocabulary Size: 32003
Torch Data Type: bfloat16
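The 14.4 GB VRAM figure is consistent with storing bfloat16 weights at 2 bytes per parameter for a roughly 7.2B-parameter model. A quick back-of-the-envelope check (the parameter count is an approximation for Mistral-7B, not a number stated on this page):

```python
params = 7.2e9       # approximate parameter count of Mistral-7B-v0.1
bytes_per_param = 2  # bfloat16 = 16 bits = 2 bytes per weight
size_gb = params * bytes_per_param / 1e9
print(f"{size_gb:.1f} GB")  # matches the Required VRAM entry above
```

Actual runtime VRAM use will be somewhat higher once the KV cache and activations for a long context are included.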

Quantized Models of the MistralLite

Model                 Likes   Downloads   VRAM
MistralLite 7B GGUF   40      715         3 GB
MistralLite 7B GGUF   1       109         2 GB
MistralLite 7B AWQ    8       31          4 GB
MistralLite 7B GPTQ   3       16          4 GB
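The spread in quantized footprints above roughly tracks bits per weight: file size ≈ parameters × bits / 8. A rough estimator (the bit widths below are typical for these formats and are assumptions, not values taken from the table):

```python
def quantized_size_gb(params: float, bits_per_weight: float) -> float:
    """Rough on-disk/VRAM size of a quantized model in decimal GB."""
    return params * bits_per_weight / 8 / 1e9

params = 7.2e9  # approximate Mistral-7B parameter count (assumed)
for fmt, bits in [("~Q2 GGUF", 2.6), ("4-bit GPTQ/AWQ", 4.0), ("bf16", 16.0)]:
    print(fmt, round(quantized_size_gb(params, bits), 1), "GB")
```

Quantized files also carry scales and some unquantized tensors, so real sizes run slightly above this estimate.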

Best Alternatives to MistralLite

Best Alternatives                   Context / RAM     Downloads   Likes
Krutrim 2 Instruct                  1000K / 49.3 GB   1458        28
Ft V1 Violet                        1000K / 24.5 GB   6           0
Devstral Small 2505 Bf16            128K / 46.9 GB    337         1
Tiny Random MistralForCausalLM      128K / 0 GB       6137        1
Winterreise M7                      32K / 14.4 GB     0           0
Frostwind V2.1 M7                   32K / 14.4 GB     0           0
...ydaz Web AI Reasoner BaseModel   32K / 14.4 GB     0           1
MistralLite                         32K / 14.4 GB     61777       430
Mixtral AI Cyber Child              32K / 14.5 GB     14          1
Kheops Textbook Immo2               32K / 14.5 GB     11          0
Note: a green score (e.g. "73.2") means the model is better than amazon/MistralLite.


Original data from HuggingFace, OpenCompass and various public git repos.
Release v20241227