On Monday, DeepSeek R1 crashed the stock market once investors trading AI-related stocks realized the Chinese startup had found a way to train AI as capable as ChatGPT o1 without access to the state-of-the-art NVIDIA chips available to OpenAI and other US AI firms. That's why firms building hardware for AI infrastructure suffered the most: NVIDIA shed nearly $600 billion in market cap, while the broader market lost almost $1 trillion.
I said at the time that the reaction might be blown out of proportion. Yes, DeepSeek relied on software optimizations rather than raw hardware power to develop AI as capable as o1. But that doesn't mean NVIDIA's GPUs are suddenly obsolete. It just reshapes the playing field while opening a new avenue for innovation.
I still think that AI firms with access to the latest hardware and top-tier software talent will have an edge over Chinese rivals. All a company like OpenAI or Google has to do is replicate some of DeepSeek's tricks to match the startup's training and inference efficiency, then leapfrog it. The latest AI chips will still matter a great deal here.
It turns out it's not just the big AI firms that might try to copy what DeepSeek has done. A team of developers behind a project called Open-R1 wants to replicate DeepSeek R1's success and create a reasoning AI model that's just as powerful. And there's a twist that AI fans in Western markets will appreciate: Open-R1 should be even more transparent than DeepSeek R1.
DeepSeek's decision to release its AI models openly was brilliant. It meant anyone could download the model and run it on their own computer, giving them a local model as capable as ChatGPT o1. The open route also drove up adoption and testing, and news of R1's capabilities spread rapidly.
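If you're curious what "running R1 locally" might actually involve, here's a minimal sketch using Hugging Face's transformers library. The model ID below is an assumption on my part: it's one of the smaller distilled R1 checkpoints, since the full R1 model is far too large for a consumer machine.

```python
# Minimal local-inference sketch (assumes: pip install transformers accelerate).
# The model ID is an assumption: a small distilled R1 checkpoint, because the
# full R1 model is far too large to run on consumer hardware.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-1.5B",  # assumed checkpoint
    device_map="auto",  # GPU if available, otherwise CPU
)

messages = [{"role": "user", "content": "What is 17 * 24? Think step by step."}]
result = generator(messages, max_new_tokens=512)

# The pipeline returns the whole conversation; the last message is the reply.
print(result[0]["generated_text"][-1]["content"])
```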
But, as the Open-R1 researchers explain on Hugging Face, DeepSeek R1 isn’t fully open-source:
The release of DeepSeek-R1 is an amazing boon for the community, but they didn’t release everything—although the model weights are open, the datasets and code used to train the model are not.
That's where Open-R1 comes in:
The goal of Open-R1 is to build these last missing pieces so that the whole research and industry community can build similar or better models using these recipes and datasets. And by doing this in the open, everybody in the community can contribute!
Specifically, the Open-R1 team wants to answer the following questions about DeepSeek R1 as it builds its replica:
Data collection: How were the reasoning-specific datasets curated?
Model training: No training code was released by DeepSeek, so it is unknown which hyperparameters work best and how they differ across different model families and scales.
Scaling laws: What are the compute and data trade-offs in training reasoning models?
The researchers plan to replicate DeepSeek's development pipeline for R1, fine-tune the resulting model further, and release a truly open-source Open-R1 model that anyone could use.
Interestingly, the Open-R1 researchers want to distill DeepSeek R1 to create a high-quality reasoning dataset. DeepSeek may have done its own distillation: OpenAI claims the Chinese startup used ChatGPT outputs to train earlier versions of its AI, work that might have been critical on the path to DeepSeek R1. It's unclear whether OpenAI can prove these allegations with certainty.
However, the Open-R1 researchers have their own strategy for what comes after distilling R1, and the blog post explains how they plan to proceed.
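To make "distillation" concrete: it means prompting a strong teacher model, saving its full reasoning traces, and then fine-tuning a smaller student model on them. Here's a rough, hypothetical sketch of the first data-collection step. The teacher checkpoint, prompts, and file format are all illustrative assumptions, not Open-R1's actual pipeline; in practice, you'd query the full R1 model rather than a small local stand-in.

```python
# Illustrative distillation sketch, NOT Open-R1's actual pipeline.
# Step 1: collect reasoning traces from a teacher model and store them
# as a dataset for supervised fine-tuning of a smaller student model.
import json
from transformers import pipeline

teacher = pipeline(
    "text-generation",
    model="deepseek-ai/DeepSeek-R1-Distill-Qwen-7B",  # assumed stand-in teacher
    device_map="auto",
)

prompts = [
    "Prove that the sum of two even numbers is even.",
    "Write a Python function that checks whether a string is a palindrome.",
]

with open("reasoning_traces.jsonl", "w") as f:
    for prompt in prompts:
        messages = [{"role": "user", "content": prompt}]
        reply = teacher(messages, max_new_tokens=1024)
        trace = reply[0]["generated_text"][-1]["content"]
        # Each line pairs a prompt with the teacher's full chain of thought,
        # ready to fine-tune a smaller student model on.
        f.write(json.dumps({"prompt": prompt, "completion": trace}) + "\n")
```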
If successful, Open-R1 could be a stepping stone for developing other sophisticated AI models, and anyone could build on it. The advantage is that you wouldn't have to repeat the full training process from scratch. Incidentally, that's what OpenAI says DeepSeek did with ChatGPT: it used some of ChatGPT's outputs to save money on training its own AI.
An open-source reasoning model like the Open-R1 model the researchers propose could be used for other purposes, not just math and coding. The researchers mention medicine, where reasoning AI “could have significant impact.”
That said, it’s unclear how long the project will take and when Open-R1 will be ready for testing. Other AI researchers interested in Open-R1 can check out the project on GitHub.