What open-source LLMs or SLMs are you in search of? 36644 in total.

Quantized models in GGUF Format

GGUF is an extensible binary format for AI models, introduced in August 2023. It is designed for fast loading, flexibility, and single-file convenience, and it is tailored to LLM (Large Language Model) inference tasks such as encoding and decoding text. Compared with its predecessors, it offers better tokenization, support for special tokens, embedded metadata, and room for extension, so that model loading stays fast and simple as the format evolves. Loading a GGUF model takes minimal code and requires no external libraries. The format is supported by tools such as llama.cpp and is already widely adopted in the developer community.
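To illustrate the single-file, metadata-carrying layout described above, here is a minimal sketch of reading the fixed-size header at the start of a GGUF file. It assumes the version 2/3 layout (a 4-byte `GGUF` magic, a 32-bit version, then 64-bit tensor and metadata key-value counts) and little-endian byte order; the names `GGUFHeader` and `read_gguf_header` are hypothetical helpers, not part of llama.cpp or any library.

```python
import io
import struct
from typing import BinaryIO, NamedTuple

GGUF_MAGIC = b"GGUF"  # first four bytes of every GGUF file


class GGUFHeader(NamedTuple):
    version: int
    tensor_count: int
    metadata_kv_count: int


def read_gguf_header(stream: BinaryIO) -> GGUFHeader:
    """Parse the fixed-size GGUF header (assumes v2+ and little-endian)."""
    if stream.read(4) != GGUF_MAGIC:
        raise ValueError("not a GGUF file: bad magic bytes")

    raw_version = stream.read(4)
    if len(raw_version) != 4:
        raise ValueError("truncated header: missing version")
    (version,) = struct.unpack("<I", raw_version)
    if version < 2:
        # Assumption: version 1 stored 32-bit counts; this sketch skips it.
        raise ValueError(f"unsupported GGUF version: {version}")

    raw_counts = stream.read(16)
    if len(raw_counts) != 16:
        raise ValueError("truncated header: missing counts")
    tensor_count, metadata_kv_count = struct.unpack("<QQ", raw_counts)
    return GGUFHeader(version, tensor_count, metadata_kv_count)


# Usage with a synthetic in-memory header (not a real model file).
demo = io.BytesIO(GGUF_MAGIC + struct.pack("<IQQ", 3, 2, 5))
print(read_gguf_header(demo))
```

For a real model, the same function can be called on a file opened with `open(path, "rb")`. The metadata key-value pairs and tensor descriptors follow the header and are not parsed here.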
List of Small and Large Language Models
Model Name | Maintainer | Size | Score | VRAM (GB) | Quantized | License | Context Len | Likes | Downloads | Modified | Languages | Architectures
Legend
  • Large Language Model
  • Adapter
  • Code-Generating Model
  • Listed on LMSys Arena Bot ELO Rating
  • Original Model
  • Merged Model
  • Instruction-Based Model
  • Quantized Model
  • Finetuned Model
  • Mixture-Of-Experts
Table Headers Explained  
  • Name — The title and maintainer account associated with the model.
  • Params — The number of parameters used in the model.
  • Score — The model's score depending on the selected rating (default is the LLM Explorer Score).
  • Likes — The number of "likes" given to the model by users.
  • VRAM — A rough estimate of the VRAM, in GB, required for inference.
  • Downloads — The total number of downloads for the model.
  • Quantized — Specifies whether the model is quantized.
  • CodeGen — Specifies whether the model can understand or generate source code.
  • License — The type of license associated with the model.
  • Languages — The list of languages supported by the model (where specified).
  • Maintainer — The author or maintainer of the model.
  • Architectures — The transformer architecture used in the model.
  • Context Len — The context length, in tokens, supported by the model.
  • Tags — The list of tags specified by the model's maintainer.
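
As a rough illustration of how a VRAM figure like the one in the table could be approximated, the sketch below multiplies the parameter count by the bits stored per weight and applies a flat overhead factor. This is an assumed back-of-the-envelope formula, not the method used to compute the VRAM column here; the function name `estimate_vram_gb` and the default overhead of 1.2 are hypothetical choices.

```python
def estimate_vram_gb(
    n_params: float,
    bits_per_weight: float,
    overhead: float = 1.2,
) -> float:
    """Rough VRAM estimate in GB (1 GB = 1e9 bytes) for holding model weights.

    `overhead` is an assumed multiplier covering activations, the KV cache,
    and runtime buffers; real usage depends on context length and backend.
    """
    if n_params <= 0 or bits_per_weight <= 0:
        raise ValueError("n_params and bits_per_weight must be positive")
    if overhead < 1:
        raise ValueError("overhead must be at least 1")
    weight_bytes = n_params * bits_per_weight / 8
    return weight_bytes * overhead / 1e9


# Compare a 7B-parameter model at 16 bits and at roughly 4.5 bits per weight.
for bits in (16, 4.5):
    print(f"{bits} bits/weight: {estimate_vram_gb(7e9, bits):.1f} GB")
```

The comparison shows why quantized variants are listed separately: fewer bits per weight shrink the weight footprint proportionally, while the overhead term stays a guess.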

Original data from HuggingFace, OpenCompass and various public git repos.
Release v2024072803