NuminaMath-7B-TIR: A New Math-Focused Language Model

15/07/2024 11:49:01

Last week, the Numina team released NuminaMath 7B TIR, a new math-focused language model. It quickly reached the top position in our "Top-Trending LLMs over the Last Week" ranking.

For context, our ranking is based on the number of downloads and likes, using data from Hugging Face and LLM Explorer.

$LLMs Most Downloaded Weekly$

To understand what makes this model stand out, let's examine its key characteristics:

Built on: deepseek-ai/deepseek-math-7b-base
Parameters: 6.91 billion
Key feature: Tool-Integrated Reasoning (TIR) with Python REPL
License: Apache 2.0 (open-source)

$License: Apache 2.0 (open-source)$

The model's capabilities are evident in its recent achievement: NuminaMath 7B TIR won the first progress prize at the AI Math Olympiad (AIMO) 🔥, scoring 29/50 on the public and private test sets.

$NuminaMath 7B TIR won the first progress prize at the AI Math Olympiad$

Training Process:

Stage 1: Fine-tuned on a large dataset of math problems with Chain of Thought solutions
Stage 2: Further fine-tuned on synthetic datasets using tool-integrated reasoning

Problem-Solving Approach:

Chain of Thought Reasoning
Python Code Translation
Python REPL Execution
Self-Correction (iterates if initial attempt fails)

Training Details:

Learning rate: 2e-05
Train batch size: 4 (total 32 across 8 GPUs)
Optimizer: Adam
Number of epochs: 4.0

Framework Versions:

Transformers 4.40.1
Pytorch 2.3.1
Datasets 2.18.0
Tokenizers 0.19.1

The AI community has shown interest in this open-source model's performance in mathematics. Users have noted its potential to combine math and coding skills. It's important to remember that this is a specialized math model, not a general-purpose AI.

For developers: The model performs well with AMC 12 level problems but may face challenges with more advanced problems or geometry. It is not intended for general chat applications.

NuminaMath 7B TIR represents a new development in mathematical language models and may be of interest to AI researchers, mathematicians, and those following advancements in machine learning.

And we can't pass by some lighthearted comments from the Reddit community:

- Can someone explain how significant this is?

- People can finally complete their math homework and maybe do some advanced modeling.

***

- Can this model do RP?

- If you like dirty math problems, maybe..

Was this helpful?