Open-R1: A Fully Open Reproduction of DeepSeek-R1

Hey there! This article is an intro to the project, not a claim that we’ve reproduced R1 yet. We’re building in the open, so as soon as we have evaluation numbers, we’ll share them. You can follow our progress on Hugging Face and GitHub.

True, but it looks like there’s nothing to be evaluated as of today. I assume the ultimate goal is to train a new reasoning model and then use the same evaluation metrics as o1 and DeepSeek-R1.

Well, there should be at least some sanity check and validation to ensure the model was trained correctly.

Oh yes, if you are talking about the evaluation numbers of DeepSeek’s model, they’re coming very soon!

As mentioned in the article, there is no model called Open-R1 to evaluate at all … not yet, anyway. This is a blog post outlining that Hugging Face will take the DeepSeek R1 model, work out how it was built from the paper and from what they released, and then reproduce that process.

In fact, this is pretty much how science works … A creates a method, discovery or invention, and it is tested by B, C and D to see if it is reproducible. That’s been the foundation of research for a few centuries.

This blog post is not saying they have already done so … it’s outlining an intent to start training a model like R1 and calling it Open-R1.

Also, DeepSeek-R1 was only released last week, and even in their paper they outlined the compute hours needed. While those are low compute hours for a SOTA model, that doesn’t mean you can train said model in a week. I’d personally love to be able to train a transformer model in a week, but we may have to wait a while for that level of compute technology.
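
As a rough sense of scale, here is a sketch using the DeepSeek-V3 tech report’s reported figures (R1’s own totals aren’t published, as noted elsewhere in this thread; the cluster size and GPU-hours below are V3’s):

```python
# Rough arithmetic: why "low GPU-hours for a SOTA model" still isn't
# "train it in a week". Figures are DeepSeek-V3's reported totals.
gpu_hours = 2_788_000   # H800 GPU-hours reported for V3 training
cluster_gpus = 2048     # approximate cluster size described in the report
days = gpu_hours / cluster_gpus / 24
print(f"~{days:.0f} days of wall-clock training")  # -> ~57 days
```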

So there are no benchmarks for a model that has not been built yet, right? As described in the blog post, and again in reply to your question.

But fear not, there is a GitHub repo already with contributors (hell, I may join myself), some prelim work done, and a plan of attack. An excellent starting position.

@edbeeching has already evaluated the released models

(src: https://x.com/edwardbeeching/status/1884273209136275742)

R1 just trained on o1 outputs, so collectively … /s. This is what the new AI czars are saying.

Hi! This post is an intro to the project, not a claim that we’ve reproduced R1 yet. We will definitely share the missing pieces when we have them; you can expect the models and datasets to be uploaded in this Hugging Face org and the code to be in this GitHub repo.

That’s great, and it’s crucial for seeing through this incredible hype that lacks technical understanding and explanation. Science is about reproduction, and if they claim to be open, let them fulfill the open part.

Please do publish the training cost.

We will!

Hi @bojan2501, thanks! We will indeed be working hard to make sure this training recipe can work for small language models on consumer hardware, since not everybody has a cluster of H100s at home :-) The tool we used for the images was Excalidraw! https://excalidraw.com
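
For a flavor of what an R1-style recipe on a small model might look like, here is a minimal sketch using TRL’s GRPOTrainer. This is not the official Open-R1 recipe; the model choice, dataset, and toy reward below are placeholders:

```python
# Minimal sketch of GRPO training on a small model with TRL.
# Placeholder dataset and toy length-based reward; a real R1-style recipe
# would reward verified answers and reasoning format instead.
from datasets import load_dataset
from trl import GRPOConfig, GRPOTrainer

def reward_len(completions, **kwargs):
    # toy reward: prefer completions near 200 characters
    return [-abs(200 - len(c)) for c in completions]

dataset = load_dataset("trl-lib/tldr", split="train")
trainer = GRPOTrainer(
    model="Qwen/Qwen2.5-0.5B-Instruct",  # small enough for consumer GPUs
    reward_funcs=reward_len,
    args=GRPOConfig(output_dir="toy-grpo", logging_steps=10),
    train_dataset=dataset,
)
trainer.train()
```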

5.5M is the number reported in the DeepSeek-V3 tech report (just the training, not the experiments afaik); for R1 it’s hard to estimate tbh, but much less than 5.5M imo.
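
For reference, that figure comes from pricing the reported GPU-hours at an assumed rental rate; a back-of-envelope reconstruction using the V3 tech report’s numbers:

```python
# Back-of-envelope reconstruction of the ~$5.5M figure from the
# DeepSeek-V3 tech report: reported GPU-hours times an assumed rental rate.
gpu_hours = 2_788_000      # total H800 GPU-hours reported for V3
usd_per_gpu_hour = 2.0     # rental price assumed in the report
print(f"${gpu_hours * usd_per_gpu_hour / 1e6:.2f}M")  # -> $5.58M
```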

The code for the models is inside the model repositories, e.g. for V3: https://huggingface.co/deepseek-ai/DeepSeek-V3/blob/main/modeling_deepseek.py
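
Concretely, that custom modeling file is what gets pulled in when you load the checkpoint with remote code enabled; a minimal sketch of the usual transformers flow:

```python
# Loading DeepSeek-V3 so that the repo's own modeling_deepseek.py is used;
# trust_remote_code=True is what opts in to that custom code.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "deepseek-ai/DeepSeek-V3"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
```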

Hello team, I’m Ray Bernard, the author and developer of EQUATOR. My research group will be working on a paper focused on replicating certain components of DeepSeek R1. Our aim is to recreate the cold start and supply your team with a dataset that includes CoT and other techniques to support these efforts. We’d love to contribute our work to help. Please let me know if you find this useful. Best, Ray Bernard https://www.facebook.com/groups/1186310571520299/
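
For readers unfamiliar with the “cold start” stage: the R1 paper describes seeding the model with supervised fine-tuning on long chain-of-thought examples before RL. A hypothetical record for such a dataset might look like this (field names are invented for illustration, not DeepSeek’s actual schema):

```python
# Hypothetical shape of one cold-start SFT record: a problem, a long
# chain-of-thought, and a final answer.
import json

record = {
    "problem": "What is 17 * 24?",
    "chain_of_thought": "17 * 24 = 17 * 20 + 17 * 4 = 340 + 68 = 408.",
    "answer": "408",
}
print(json.dumps(record, indent=2))
```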

Where are the evaluation numbers? Without them you can’t call it a reproduction.

8 replies

That’s rather interesting; I was asking myself why the questions the author raised here are not being asked by others. I think the work they have done is great, but at the same time I wonder why they wouldn’t put these missing pieces up if they are supposed to be fully open.
Why, even without reproduction and understanding of the development, could they impact the market so much in this way?

4 replies

Interesting read, and it is great that we see more effort in this direction: more optimization and less brute force.
Also, I wonder what tool the author used for producing the step diagram.

2 replies

Excalidraw

I’m so happy that initiatives like this already exist; I’m gonna try to contribute :-)

1 reply

Looking forward to it!

Such a racist article.

2 replies

WTF are you talking about?

Should be a joke.

Awesome to have this open reproduction started!

For Step #1, check out https://github.com/open-thoughts/open-thoughts!

https://x.com/ryanmart3n/status/1884284101265612856

Let’s do this thing!

1 reply

It’s really cool to see how the entire open source community comes together!

Does anybody know the real training cost of R1? I can’t find it in the paper or the announcement post. Is the $6M cost reported by the media just the number taken from V3’s training cost?

2 replies

Ops …

Has anybody asked the DeepSeek team to release their training data and code, or at least share them privately with an independent replication project like this? Have they refused such a request?

A faithful replication depends on using the same dataset and hyperparameters. Otherwise, any major discrepancies from the published benchmarks would be hard to pin down: whether they stem from training data differences or from the replication approach itself.

1 reply

Historically, they have never released code or datasets of their LLM training, so I wouldn’t expect this time to be different. If they did release it, that would be amazing, of course!

In the meantime we have to make best guess estimates and see if we can get there ourselves.

You’ve laid out a great replication process for DeepSeek’s reasoning training. I will try something similar to it.

This is really great information. Can we fine-tune it for a particular use case when the code is released?

1 reply

Yes of course!

Please consider removing biased, polluted or unaligned training data, and make an effort to remove copyrighted works from the crawl. This will make the model more usable. If you reuse Anthropic’s curation checks, this may also help; removing obviously biased data will likely add a lot of value. We don’t want another polluted, unaligned open source model, right? And no business would ever use DeepSeek or a model that reuses it, right?
We appreciate your work for the benefit of humanity, we hope.
Miike C from NJ

1 reply

So basically you’re asking to replace existing censorship with another flavour of censorship?

Can’t wait! Hopefully the model will be uncensored, but whatever you can do is fine! Love seeing open source building itself up. I’m not smart enough to actually help, but I can contribute support lol.

Hello guys, I’m just trying to find the code for DeepSeek-V2, in order to fully understand multi-head latent attention. You don’t seem to have code on Hugging Face even for that. Or am I missing something? I don’t see anything in src/transformers/models. MLA is not thoroughly explained in their paper, so it would be important to have code for this.
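
Until official code lands in transformers, here is a toy sketch of MLA’s central idea, low-rank compression of the KV cache. This is a simplification for intuition only: it omits DeepSeek’s decoupled RoPE path, the separate query compression, and the real dimensions:

```python
# Toy multi-head latent attention (MLA): keys and values are reconstructed
# from one small latent vector per token, so a cache would store d_latent
# floats per token instead of 2 * n_heads * d_head. Intuition only.
import torch
import torch.nn as nn
import torch.nn.functional as F

class ToyMLA(nn.Module):
    def __init__(self, d_model=512, n_heads=8, d_latent=64):
        super().__init__()
        self.n_heads, self.d_head = n_heads, d_model // n_heads
        self.q_proj = nn.Linear(d_model, d_model)
        self.kv_down = nn.Linear(d_model, d_latent)  # compress token -> latent
        self.k_up = nn.Linear(d_latent, d_model)     # reconstruct all key heads
        self.v_up = nn.Linear(d_latent, d_model)     # reconstruct all value heads
        self.o_proj = nn.Linear(d_model, d_model)

    def forward(self, x):
        b, t, _ = x.shape
        c_kv = self.kv_down(x)  # (b, t, d_latent): this is what would be cached
        q = self.q_proj(x).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        k = self.k_up(c_kv).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        v = self.v_up(c_kv).view(b, t, self.n_heads, self.d_head).transpose(1, 2)
        out = F.scaled_dot_product_attention(q, k, v, is_causal=True)
        return self.o_proj(out.transpose(1, 2).reshape(b, t, -1))

print(ToyMLA()(torch.randn(2, 16, 512)).shape)  # torch.Size([2, 16, 512])
```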