What open-source LLMs or SLMs are you searching for? 42324 models in total.

LLMs for 12GB VRAM: Large Language Models That Fit in 12GB VRAM

If your GPU is limited to 12GB of VRAM, this curated directory lists the top-performing LLMs that fit within that budget.
Which 12GB-friendly model ranks highest for your task, and how efficiently does it run? The directory below answers these questions, describing each language model in clear, precise terms.
Use the comparisons and analyses in the list to find the language model best suited to your 12GB setup.
Below is the list of Small and Large Language Models.
Table columns: Model Name | Maintainer | Size | Score | VRAM (GB) | Quantized | License | Context Len | Likes | Downloads | Modified | Languages | Architectures
Model Type Badges
  • Large Language Model
  • Adapter
  • Code-Generating Model
  • Listed on LMSYS ChatBot Arena ELO Rating
  • Original Model
  • Merged Model
  • Instruction-Based Model
  • Quantized Model
  • Finetuned Model
  • Mixture-Of-Experts Model
Table Headers Explained  
  • Name — The title and maintainer account associated with the model.
  • Params — The number of parameters in the model.
  • Score — The model's score depending on the selected rating (default is the Open LLM Leaderboard on HuggingFace).
  • Likes — The number of "likes" given to the model by users.
  • VRAM — The approximate amount of GPU memory, in GB, needed to load the model. It is not the exact amount required for inference, but it serves as a useful reference (see the estimation sketch after this list).
  • Downloads — The total number of downloads for the model.
  • Quantized — Specifies whether the model is quantized.
  • CodeGen — Specifies whether the model can understand or generate source code.
  • License — The type of license associated with the model.
  • Languages — The list of languages supported by the model (where specified).
  • Maintainer — The author or maintainer of the model.
  • Architectures — The transformer architecture used in the model.
  • Context Len — The context length supported by the model.
  • Tags — The list of tags specified by the model's maintainer.
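
How is a VRAM figure like this estimated? A reasonable rule of thumb is that the weights alone occupy roughly the parameter count times the bytes per weight, plus some headroom for the KV cache and runtime buffers. The Python sketch below illustrates that calculation; the estimate_vram_gb helper and the 1.2 overhead factor are illustrative assumptions, not the directory's exact formula.

# Rough VRAM estimate for loading a model, assuming only the parameter
# count and the bits-per-weight of the chosen precision/quantization.
# estimate_vram_gb is a hypothetical helper, not part of the directory.

def estimate_vram_gb(params_billion: float,
                     bits_per_weight: float = 16.0,
                     overhead_factor: float = 1.2) -> float:
    """Approximate GB of VRAM needed to hold the model in memory.

    params_billion  -- model size in billions of parameters
    bits_per_weight -- 16 for fp16/bf16, roughly 4-5 for common GGUF/GPTQ quants
    overhead_factor -- assumed headroom for KV cache and runtime buffers
    """
    weight_bytes = params_billion * 1e9 * bits_per_weight / 8
    return weight_bytes * overhead_factor / 1024**3


if __name__ == "__main__":
    # A 13B model in fp16 vs. a 4-bit quantization:
    print(f"13B @ fp16 : {estimate_vram_gb(13, 16):.1f} GB")  # ~29 GB, too large for 12GB
    print(f"13B @ 4-bit: {estimate_vram_gb(13, 4):.1f} GB")   # ~7 GB, fits in 12GB

By this kind of estimate, 13B-class models generally need 4-bit or 5-bit quantization to fit on a 12GB card, while 7B models fit comfortably even at 8-bit.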

Choose another global filter

  • All Large Language Models
  • LMSYS ChatBot Arena ELO
  • Open LLM LeaderBoard
  • Original & Foundation LLMs
  • OpenCompass LeaderBoard
  • Recently Added Models
  • Trending LLMs & Hot Picks
  • Code Generating Models
  • Instruction-Based LLMs
  • Uncensored LLMs
  • LLMs Fit in 4GB RAM
  • LLMs Fit in 8GB RAM
  • LLMs Fit in 12GB RAM
  • LLMs Fit in 24GB RAM
  • LLMs Fit in 32GB RAM
  • GGUF Quantized Models
  • GPTQ Quantized Models
  • EXL2 Quantized Models
  • Fine-Tuned Models
  • LLMs for Commercial Use
  • TheBloke's Models
  • Context Size >16K Tokens
  • Mixture-Of-Experts Models
  • Apple's MLX LLMs
  • Small Language Models
Original data from HuggingFace, OpenCompass and various public git repos.
Release v20240042001