Exploring Permissive Licenses for Large Language Models (LLMs)

The growing artificial intelligence (AI) industry has significantly changed how we interact with data. A key component of this progress is the development of Large Language Models (LLMs), which are capable of generating text that resembles human writing. However, using these models effectively and responsibly involves important licensing considerations.

Licensing LLMs is challenging because of their distinct characteristics. Unlike traditional software, an LLM consists not just of code but also of model weights trained on extensive datasets. Applying traditional open-source licenses, which were written with source code in mind, can therefore be complex.

Permissive licenses are commonly adopted in the AI community to balance open use and protection of the original work. These licenses include Apache-2.0, BSD, BSD-2-Clause, BSD-3-Clause, MIT, and the Creative Commons Attribution (CC-BY) licenses, each with specific requirements.

The differences between these licenses can be hard to keep track of. The comparison table below summarizes each license's requirements for users and authors and highlights potential pitfalls for commercial use, making it easier to compare the licenses and choose the one that meets your needs:

LLM Permissive Licenses Table

Apache-2.0 (Apache License 2.0)

Requirements for the user:

  • Modified files must carry prominent notices stating that they were changed.
  • Include a copy of the Apache License with any redistribution of the LLM.
  • Existing copyright, patent, trademark, and attribution notices must be retained.

Requirements for the author:

  • Provide a copy of the Apache License.
  • Provide a notice file with the above-mentioned information.

Potential pitfalls for commercial use:

  • The Apache License includes an express grant of patent rights from contributors to users. That grant terminates for any party that files patent litigation alleging that the licensed work infringes a patent, so businesses that hold patents or develop proprietary software alongside the LLM should review this clause carefully.
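
As a rough illustration of the requirements listed above, the following sketch checks that a model directory ships a license copy and a notice file before it is redistributed. The file names `LICENSE` and `NOTICE` follow common convention; the function name `check_apache_files` and the flat directory layout are assumptions made for this example. Passing the check does not establish legal compliance.

```python
from pathlib import Path


def check_apache_files(model_dir: str) -> list[str]:
    """Return the expected file names that are missing from model_dir.

    Hypothetical helper: matches on the file stem, so LICENSE.txt or
    notice.md count as present. Only the top-level directory is scanned.
    """
    expected = ["LICENSE", "NOTICE"]
    root = Path(model_dir)
    # Collect upper-cased stems of regular files, e.g. "LICENSE.txt" -> "LICENSE".
    present = {p.stem.upper() for p in root.iterdir() if p.is_file()}
    return [name for name in expected if name not in present]
```

An empty list means both files were found; any names returned are the ones still to be added.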

BSD (Berkeley Software Distribution license)

Requirements for the user:

  • Redistributions of the LLM must retain the original copyright notice, list of conditions, and disclaimer.

Requirements for the author:

  • The license text must be included in the distribution of the LLM.

Potential pitfalls for commercial use:

  • The original (4-clause) BSD License has an “advertising clause” that requires all advertising materials mentioning the LLM to display an acknowledgement of the original authors. This can be burdensome for businesses.

BSD-2-Clause (BSD 2-Clause “Simplified” License)

Requirements for the user:

  • Redistributions of the LLM must retain the original copyright notice, list of conditions, and disclaimer.

Requirements for the author:

  • The license text must be included in the distribution of the LLM.

Potential pitfalls for commercial use:

  • This is a very permissive license, with few conditions, making it straightforward for commercial use.

BSD-3-Clause (BSD 3-Clause “New” or “Revised” License)

Requirements for the user:

  • Redistributions of the LLM must retain the original copyright notice, list of conditions, and disclaimer.
  • The names of the original author or contributors may not be used to endorse or promote products derived from the LLM without prior written permission.

Requirements for the author:

  • The license text must be included in the distribution of the LLM.

Potential pitfalls for commercial use:

  • The restriction on the use of the names of the original author or contributors for promotional purposes can pose a limitation in some commercial contexts.

CC-BY-2.0 (Creative Commons Attribution 2.0)

Requirements for the user:

  • Users must give appropriate credit, provide a link to the license, and indicate if changes were made to the LLM.

Requirements for the author:

  • The license text must be included in the distribution of the LLM.

Potential pitfalls for commercial use:

  • The requirement to provide credit and indicate changes may be burdensome in some commercial contexts, especially for large-scale uses of the LLM.
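
The three user requirements for CC-BY (credit, a link to the license, and an indication of changes) can be folded into a single notice string. The sketch below is one possible layout, not an official Creative Commons format; the function name `build_attribution` and its parameters are hypothetical.

```python
def build_attribution(title: str, author: str, source_url: str,
                      license_name: str, license_url: str,
                      changes: str = "") -> str:
    """Compose a one-line attribution notice for a CC-BY-licensed model."""
    # Credit (title, author, source) plus a link to the license text.
    notice = (f'"{title}" by {author} ({source_url}), '
              f"licensed under {license_name} ({license_url}).")
    # Only mention changes when some were actually made.
    if changes:
        notice += f" Changes: {changes}."
    return notice
```

Such a notice could be placed in a model card or README shipped with the derived model.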

While these licenses are flexible, some have requirements that may impact the use of LLMs in commercial settings. For example, Apache-2.0 requires that modifications to the model be documented, and BSD-3-Clause prohibits using the names of the original authors to promote derived products without permission. Either condition may not suit some businesses.

Furthermore, Creative Commons licenses (CC-BY-2.0, 3.0, 4.0) require users to give appropriate credit, provide a link to the license, and note any changes made. These conditions can be challenging for large-scale commercial uses of LLMs.

On the other hand, MIT and BSD-2-Clause licenses are often favored for commercial use due to their straightforward and minimal requirements. They simply require the retention of copyright and permission notices in the distribution of the model.
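
To make the comparison concrete, the user-side requirements summarized in this article can be encoded as a small lookup table. This is a minimal sketch: the dictionary `USER_OBLIGATIONS` paraphrases the summary above rather than the license texts, the helper `fewest_obligations` is hypothetical, and counting list entries is only a crude proxy for how demanding a license is.

```python
# User-side requirements as summarized in this article (paraphrased,
# not taken from the license texts themselves).
USER_OBLIGATIONS = {
    "Apache-2.0": [
        "document changes",
        "include a copy of the license",
        "keep existing notices",
    ],
    "BSD-2-Clause": ["keep copyright notice, conditions, and disclaimer"],
    "BSD-3-Clause": [
        "keep copyright notice, conditions, and disclaimer",
        "no endorsement using author names without permission",
    ],
    "CC-BY-2.0": [
        "give appropriate credit",
        "link to the license",
        "indicate changes",
    ],
    "MIT": ["keep copyright and permission notices"],
}


def fewest_obligations(table: dict[str, list[str]]) -> list[str]:
    """Return the licenses with the smallest number of listed obligations."""
    minimum = min(len(items) for items in table.values())
    return sorted(name for name, items in table.items() if len(items) == minimum)


print(fewest_obligations(USER_OBLIGATIONS))  # ['BSD-2-Clause', 'MIT']
```

The result matches the observation that MIT and BSD-2-Clause carry the lightest requirements.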

Choosing the right license depends on the project’s nature, scale of use, and the user's need for flexibility. Developers and businesses must understand the implications of each license to fulfill their obligations effectively. A thoughtful licensing choice can enable innovation while respecting the rights of the original creators.

If you’re new to the world of Large Language Models and find the variety of options and licenses overwhelming, the LLM Explorer is here to help. This service simplifies the exploration of thousands of LLMs and lets you filter them by license.

The LLM Explorer helps select the best model for your local inference needs based on license type and other factors. It also enables you to compare similar LLMs, facilitating an informed choice for your project.
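
The same kind of selection can be sketched locally. The example below does not use the LLM Explorer or any real API; the `ModelInfo` record, the `PERMISSIVE` set, the parameter-count threshold, and the sample entries are all hypothetical and exist only to show filtering by license and one other factor.

```python
from dataclasses import dataclass


@dataclass
class ModelInfo:
    name: str        # hypothetical model identifier
    license: str     # license identifier as listed by the model's publisher
    params_b: float  # parameter count in billions


# Licenses treated as permissive in this sketch (an assumption, not a legal list).
PERMISSIVE = {"apache-2.0", "mit", "bsd-2-clause", "bsd-3-clause"}


def filter_permissive(models: list[ModelInfo], max_params_b: float) -> list[ModelInfo]:
    """Keep models with a permissive license that fit the size budget."""
    return [
        m for m in models
        if m.license.lower() in PERMISSIVE and m.params_b <= max_params_b
    ]


# Made-up catalog entries for illustration only.
catalog = [
    ModelInfo("example-7b", "Apache-2.0", 7.0),
    ModelInfo("example-nc-7b", "cc-by-nc-4.0", 7.0),
    ModelInfo("example-70b", "MIT", 70.0),
]
for model in filter_permissive(catalog, max_params_b=13.0):
    print(model.name, model.license)
```

Lower-casing the license string before the lookup keeps the filter tolerant of inconsistent capitalization in catalog data.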

Ultimately, selecting the appropriate license hinges on your project’s specific requirements. Simplify this process with the LLM Explorer, designed to remove the complexities of navigating Large Language Model licenses.

Original data from HuggingFace, OpenCompass and various public git repos.