LLM Explorer Blog

On May 13th, the Technology Innovation Institute (TII) launched Falcon 2, the latest version of its large language model (LLM). It comes in two versions. Falcon 2 11B is efficient and accessible, with 11 billion parameters trained on 5.5 trillion tokens; it outperforms Meta's Llama 3 and...
We at LLM Explorer love following developments in the LLM scene, both in model advancements and LLM benchmarks. And today we're happy to share some great news from TIGER-Lab: they've introduced an upgraded version of the MMLU dataset, called MMLU-Pro, available on Hugging Face. MMLU-Pro is a more...
The development of new LLMs is making headlines daily, and these new models can handle longer contexts and complex memory tasks. For example, DeepSeek-V2 supports context lengths up to 128K tokens with its large parameter count and Multi-head Latent Attention, reducing reliance on...
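The memory pressure that motivates techniques like Multi-head Latent Attention can be sketched with some back-of-the-envelope arithmetic. The model figures below are hypothetical, not DeepSeek-V2's actual configuration:

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim, dtype_bytes=2):
    """Rough size of a transformer's key/value cache for one sequence.

    The factor of 2 accounts for storing both keys and values.
    This is an illustrative estimate only; real models vary in how
    they lay out and compress the cache.
    """
    return 2 * n_layers * n_kv_heads * head_dim * seq_len * dtype_bytes

# A hypothetical 60-layer model with 8 KV heads of dimension 128,
# holding a 128K-token context in fp16 (2 bytes per value):
gib = kv_cache_bytes(128_000, 60, 8, 128, 2) / 2**30
print(f"{gib:.1f} GiB")  # → 29.3 GiB for a single sequence
```

Numbers like this explain why long-context models invest in cache compression: the KV cache grows linearly with context length and can dwarf the activations themselves.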
Recently, AI professional and health-AI entrepreneur Farhang Dehzad introduced a new 3-billion-parameter model, Sum Small, designed specifically for summarizing medical dialogues. This model notably surpasses GPT-4 at efficiently creating summaries for clinicians, potentially saving countless hours...
LLM leaderboards test language models by putting them through standardized benchmarks backed by detailed methods and large databases. They tackle a range of tasks such as text generation, translation, summarization, and understanding, using challenges like question answering and text completion to...
Microsoft recently introduced its new Phi-3 LLMs, which quickly outperformed the Llama 3 models released less than a week earlier. The Phi-3 models, particularly Phi-3-mini, have demonstrated remarkable efficiency. Despite having only 3.8 billion parameters — less than half...
There are many language models available today. But how do you know which one is right for you? Trying out a model can be time-consuming and frustrating. The good news is that there are free online playgrounds and tools that let you test language models without installing them. Let's explore the...
Last week, the AI community was buzzing with the launch of Llama 3, available in 8B and 70B sizes. These models have outperformed many open-source chat models on standard benchmarks. We also joined in on the LLM excitement 😊 with our post "Llama3 License Explained," which covers the license's...
In this post, we've compiled a summary of both proprietary and open-source models that are provided as a service. However, this list does not encompass the full range of available hosting providers. You'll find numerous GPU and serverless LLM hostings listed in our directory. For more details,...
The Meta Llama 3 Community License Agreement seems quite liberal at first glance, offering a breath of fresh air compared to traditional open-source and Creative Commons licenses. But to truly understand its permissiveness, we need to dive into the specifics of what you can and cannot do under this...
Direct Preference Optimization (DPO) is fundamentally a streamlined approach to fine-tuning large language models such as Mixtral 8x7B, Llama 2, and even GPT-4. It's useful because it cuts down on the complexity and resources needed compared to traditional methods like reinforcement learning from human feedback (RLHF). It makes the process...
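The streamlined objective behind DPO can be sketched for a single preference pair. This is a minimal illustration, not a production implementation, and the log-probability values in the usage example are made up:

```python
import math

def dpo_loss(logp_chosen, logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """DPO loss for one (chosen, rejected) preference pair.

    logp_* are summed log-probabilities of the chosen/rejected
    completions under the policy being tuned; ref_logp_* are the same
    quantities under the frozen reference model.
    """
    # Implicit reward margin: how much more the policy prefers the
    # chosen completion than the reference model does.
    margin = beta * ((logp_chosen - ref_logp_chosen)
                     - (logp_rejected - ref_logp_rejected))
    # Negative log-sigmoid of the margin: small when the policy
    # already agrees with the preference, large otherwise.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))

# A policy that agrees with the preference data incurs a lower loss
# than one that disagrees (illustrative values):
agree = dpo_loss(-10.0, -14.0, -12.0, -12.0)
disagree = dpo_loss(-14.0, -10.0, -12.0, -12.0)
print(agree < disagree)  # True
```

The appeal is visible even in this toy form: the loss needs only log-probabilities from two models, with no separate reward model and no reinforcement-learning loop.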
Welcome back to our ongoing series where we spotlight the most trending Large and Small Language Models (LLMs) shaping the current AI landscape. As we enter week #16 of 2024, let’s dive into the roundup of new LLMs that have captured the AI community’s attention. This week, we see a...
The growing artificial intelligence (AI) industry has significantly changed how we interact with data. A key component of this progress is the development of Large Language Models (LLMs), which are capable of generating text that resembles human writing. However, using these models effectively and...
LLMs are valuable for coding, helping to generate and discuss code, making it easier for beginners to advance their projects, and simplifying the start of new tasks. For experienced specialists, they serve as an advanced tool, enhancing code optimization and providing innovative solutions to...
Welcome back to our ongoing series where we spotlight the Large and Small Language Models that are defining the current landscape of artificial intelligence. As of April 9, 2024, we're excited to bring you this week's roundup of the LLMs that have stood out in the AI community. Our list is...
Noticing an increase in queries for 'uncensored models,' we've responded by adding a new Leaderboard focused on evaluating uncensored general intelligence (UGI) to our LLM Leaderboards Catalog. The UGI Leaderboard is hosted on Hugging Face Spaces. It assesses models on their ability to process and...
The development of NSFW (Not Safe for Work) Large Language Models (LLMs) is shaping new possibilities in adult content creation and engagement. Recognizing adult content as a legitimate and natural aspect of human expression, the AI industry is moving towards creating tools that can cater to the...
Last week, Jamba took the lead as the most trending model, beating all other Large Language Models (LLMs) in terms of downloads and likes on platforms like Hugging Face and LLM Explorer. Its popularity remains strong, holding the top spot. Curious about its success, we're taking a closer...
Continuing with our weekly roundup, here's the latest on the AI models making waves in the community as of April 2, 2024. These models have captured widespread interest, as evidenced by their downloads and likes on platforms like Hugging Face and LLM Explorer. Let's dive into this week's standout...
Following our previous week's roundup of Top-trending large language models (LLMs), here's the latest update on the AI models that have captured the community's attention from March 26, 2024. This week, we see new entries and significant updates, showcasing the dynamic and innovative landscape of...
In this article, we'll take a closer look at LLM (Large Language Model) Leaderboards, a key tool for assessing the performance of LLMs for professional use, and discuss the challenges and potential solutions for maintaining their reliability. LLM Leaderboards are simple yet powerful tools that...
In this post, we'd like to share the list of top trending models that caught people's attention in the AI world over the last week. We ranked them by how many times they were downloaded and liked, based on information from Hugging Face and LLM Explorer. 1. C4ai Command-R V01, developed by Cohere...
The emergence of open source large language models (LLMs) has changed the field of artificial intelligence (AI), particularly in natural language processing (NLP). These models are gaining popularity for their cost-effectiveness, high customizability, reduced vendor lock-in, transparent code,...
In recent years, small language models have sparked considerable interest among AI professionals and enthusiasts alike. Marking a significant shift towards more accessible and adaptable generative AI technologies, SLMs have proven to be highly beneficial for both individuals and organizations....
Uncensored models represent a unique class of artificial intelligence that operates without the traditional constraints imposed on most AI systems. Designed to generate and provide information with minimal restrictions, these models offer expansive capabilities for professionals aiming to utilize AI's...
In a surprising turn of events that has captivated the AI community, a leak from Mistral AI, a Paris-based AI powerhouse, has brought to light an advanced Large Language Model known as "Miqu-1 70b". This development was confirmed by Arthur Mensch, the CEO of Mistral, through a humor-laced tweet,...
Mamba represents a new approach in sequence modeling, crucial for understanding patterns in data sequences like language, audio, and more. It's designed as a linear-time sequence modeling method using selective state spaces, setting it apart from models like the Transformer...
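The linear-time recurrence at the heart of state-space models can be illustrated with a toy scalar version. Real Mamba uses input-dependent ("selective") parameters and hardware-aware parallel scans; the coefficients here are arbitrary:

```python
def ssm_scan(xs, a=0.5, b=1.0, c=1.0):
    """Minimal scalar state-space recurrence (illustrative only).

    h_t = a * h_{t-1} + b * x_t   (state update)
    y_t = c * h_t                 (readout)

    Each input is touched exactly once, so the cost is O(sequence
    length) -- unlike attention's O(length^2) pairwise comparisons.
    """
    h = 0.0
    ys = []
    for x in xs:                 # one pass over the sequence
        h = a * h + b * x        # fold the new input into the state
        ys.append(c * h)         # emit an output from the state
    return ys

# A single impulse decays geometrically through the fixed state:
print(ssm_scan([1.0, 0.0, 0.0]))  # [1.0, 0.5, 0.25]
```

The fixed-size state `h` is also why such models need no growing KV cache at inference time: memory stays constant no matter how long the sequence gets.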
This week, the AI community witnessed the arrival of RWKV Eagle LLM v5, a groundbreaking development in machine learning architecture. Unlike its predecessors that rely on the attention mechanism, RWKV Eagle v5 employs a "Linear Transformer" design, integrating aspects of both RNN and...
Original data from HuggingFace, OpenCompass and various public git repos.
Release v2024042801