LLM Explorer Blog
2024-05-28
Last week's news about California’s new AI bill sparked many discussions in the AI community about the future of AI development. The bill aims to ensure responsible AI development but imposes significant restrictions on large language models. Critics argue that it introduces numerous...
2024-05-24
Mistral AI has released the Mistral-7B-Instruct-v0.3 model (license: apache-2.0), and it comes with a significant feature: function calling. This is remarkable because it's available in a medium-sized model with 7 billion parameters, making advanced capabilities more accessible.
Why Function...
2024-05-16
On May 13th, the Technology Innovation Institute (TII) launched Falcon 2, the latest version of its large language model (LLM). There are two versions:
Falcon 2 11B: Efficient and accessible with 11 billion parameters, trained on 5.5 trillion tokens. It outperforms Meta’s Llama 3 and...
2024-05-15
We at LLM Explorer love following developments in the LLM scene, both in model advancements and LLM benchmarks. And today we're happy to share some great news from TIGER-Lab—they've introduced an upgraded version of the MMLU dataset, called MMLU-Pro.
The dataset is here.
MMLU-Pro is a more...
2024-05-14
The development of new LLMs is making headlines daily, and these new models can handle longer contexts and complex memory tasks. For example, DeepSeek-V2 supports context lengths up to 128K tokens with its large parameter count and Multi-head Latent Attention, reducing reliance on...
2024-05-13
Recently, AI professional and health AI entrepreneur Farhang Dehzad introduced a new 3B-parameter model, Sum Small, designed specifically for summarizing medical dialogues. This model notably surpasses GPT-4 in efficiently creating summaries for clinicians, potentially saving countless hours...
2024-05-12
LLM leaderboards test language models by putting them through standardized benchmarks backed by detailed methods and large databases. They tackle a range of tasks such as text generation, translation, summarization, and understanding, using challenges like question answering and text completion to...
2024-04-29
Microsoft recently introduced its new Phi-3 LLMs, which quickly outperformed the Llama 3 models released less than a week earlier. The Phi-3 models, particularly Phi-3-mini, have demonstrated remarkable efficiency. Despite having only 3.8 billion parameters, less than half...
2024-04-27
There are many language models available today. But how do you know which one is right for you? Trying out a model can be time-consuming and frustrating. The good news is that there are free online playgrounds and tools that let you test language models without installing them. Let's explore the...
2024-04-23
Last week, the AI community was buzzing with the launch of Llama 3, available in 8B and 70B sizes. These models have outperformed many open-source chat models on standard benchmarks.
We also joined in on the LLM excitement 😊 with our post "Llama3 License Explained," which covers the license's...
2024-04-20
In this post, we've compiled a summary of both proprietary and open-source models that are provided as a service. However, this list does not encompass the full range of available hosting providers. You'll find numerous GPU and serverless LLM hosting options listed in our directory. For more details,...
2024-04-18
The Meta Llama 3 Community License Agreement seems quite liberal at first glance, offering a breath of fresh air compared to traditional open-source and Creative Commons licenses. But to truly understand its permissiveness, we need to dive into the specifics of what you can and cannot do under this...
2024-04-18
Direct Preference Optimization (DPO) is fundamentally a streamlined approach for fine-tuning large language models such as Mixtral 8x7B, Llama 2, and even GPT-4. It’s useful because it cuts down on the complexity and resources needed compared to traditional methods. It makes the process...
2024-04-16
Welcome back to our ongoing series where we spotlight the most trending Large and Small Language Models (LLMs) shaping the current AI landscape. As we enter week #16 of 2024, let’s dive into the roundup of new LLMs that have captured the AI community’s attention. This week, we see a...
2024-04-15
The growing artificial intelligence (AI) industry has significantly changed how we interact with data. A key component of this progress is the development of Large Language Models (LLMs), which are capable of generating text that resembles human writing. However, using these models effectively and...
Original data from HuggingFace, OpenCompass and various public git repos.