FlareStart
© 2026 FlareStart. All rights reserved.

Article • Machine Learning

[GRPO Explained] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

via Yannic Kilcher • 1y ago

#deepseek #llm #grpo

GRPO is one of the core advancements used in DeepSeek-R1, but it was already introduced last year in this paper, which combines new RL techniques with iterative data collection to achieve remarkable performance on mathematics benchmarks with just a 7B model.

Paper: https://arxiv.org/abs/2402.03300

Abstract: Mathematical reasoning poses a significant challenge for language models due to its complex and structured nature. In this paper, we introduce DeepSeekMath 7B, which continues pre-training DeepSeek-Coder-Base-v1.5 7B with 120B math-related tokens sourced from Common Crawl, together with natural language and code data. DeepSeekMath 7B has achieved an impressive score of 51.7% on the competition-level MATH benchmark without relying on external toolkits and voting techniques, approaching the performance level of Gemini-Ultra and GPT-4. Self-consistency over 64 samples from DeepSeekMath 7B achieves 60.9% on MATH. The mathematical reasoning capability of DeepS
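The abstract's "self-consistency over 64 samples" figure refers to sampling many completions for the same problem and keeping the most common final answer. Below is a minimal sketch of that voting step, assuming the final answers have already been extracted from each completion as strings. The function name `majority_vote` and the sample data are hypothetical, not taken from the paper or its code.

```python
from collections import Counter


def majority_vote(answers):
    """Return the most frequent final answer among sampled completions.

    `answers` is a list of already-extracted final answers (assumption:
    answer extraction and normalization happen elsewhere).
    Ties are broken in favor of the answer that appeared first.
    """
    if not answers:
        raise ValueError("need at least one sampled answer")
    counts = Counter(answers)
    # most_common(1) returns a list holding one (answer, count) pair
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Hypothetical answers from five sampled completions for one problem
    samples = ["42", "41", "42", "7", "42"]
    print(majority_vote(samples))
```

In practice the answers would likely need normalizing (for example, treating "1/2" and "0.5" as the same) before voting; that step is left out of this sketch.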

