Yesterday, I saw an interesting tweet from @Mahesh. His team introduced Bespoke-Stratos-32B, a model distilled from DeepSeek-R1 using Berkeley NovaSky's Sky-T1 recipe. I quickly read their blog post and reviewed Berkeley's recipe to take some notes.
The team open-sourced everything for the community:
  • 32B Model and 7B Model
  • Reasoning Dataset
  • Data Curation Code

Data Curation

The team used Bespoke Curator with DeepSeek-R1 to create the synthetic reasoning dataset in just 1.5 hours. Compared with the Sky-T1 data pipeline, the key differences are:
  • They used DeepSeek-R1 as the teacher reasoning model instead of QwQ
  • They skipped using gpt-4o-mini for reformatting reasoning traces since DeepSeek-R1's traces were already well-formatted and coherent
  • They opted for gpt-4o-mini instead of Sky-T1's parsing logic (which uses regex and sympy) to filter out incorrect solutions
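To make the contrast with an LLM-based filter concrete, here is a minimal sketch of what a rule-based correctness check can look like. It is not Sky-T1's actual parsing code: it swaps sympy for the standard library's `fractions.Fraction`, and the function names and the `\boxed{...}` answer convention are my assumptions for illustration.

```python
import re
from fractions import Fraction
from typing import Optional

# Assumption: the final answer is wrapped in \boxed{...}.
# This simple pattern does not handle nested braces such as \boxed{\frac{1}{2}}.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")


def extract_final_answer(solution: str) -> Optional[str]:
    """Return the content of the last \\boxed{...} in the text, or None."""
    matches = BOXED.findall(solution)
    return matches[-1].strip() if matches else None


def answers_match(predicted: str, reference: str) -> bool:
    """Compare as exact rationals when both parse, else as whitespace-free strings."""
    try:
        return Fraction(predicted) == Fraction(reference)
    except (ValueError, ZeroDivisionError):
        return predicted.replace(" ", "") == reference.replace(" ", "")


def keep_sample(solution: str, reference: str) -> bool:
    """Keep a sample only if a final answer is found and it matches the reference."""
    answer = extract_final_answer(solution)
    return answer is not None and answers_match(answer, reference)
```

Rules like these are cheap and deterministic, but they reject any answer written in a form the parser does not anticipate, which is one plausible reason to hand the comparison to a model like gpt-4o-mini instead.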
The dataset Bespoke-Stratos-17k contains the following subsets:
  • Numina: 10.5k samples from the math, olympiads, and amc_aime subsets of the difficulty-labeled Numina dataset
  • APPS: ~2.5k samples from the APPS dataset
  • TACO: ~3k samples from the TACO dataset
  • STILL-2: ~1k samples from the STILL-2 dataset
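As a quick sanity check on the mixture, the subset sizes listed above add up to the 17k in the dataset's name. The snippet below just does that arithmetic; the sizes are the approximate figures from the list, in thousands, so the shares are rough.

```python
# Approximate subset sizes in thousands of samples, taken from the list above.
subsets_k = {"Numina": 10.5, "APPS": 2.5, "TACO": 3.0, "STILL-2": 1.0}

total_k = sum(subsets_k.values())
shares = {name: size / total_k for name, size in subsets_k.items()}

print(f"total: ~{total_k:g}k samples")
for name, share in shares.items():
    print(f"{name:8s} {share:6.1%}")
```

Numina makes up a bit over 60% of the data, and the two coding sets (APPS and TACO) together account for roughly a third.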

Performance

Here is the performance of Bespoke-Stratos-32B:
[Image: Bespoke-Stratos-32B performance]
The performance of Bespoke-Stratos-7B is shown below:
[Image: Bespoke-Stratos-7B performance]

Sky-T1

Here is the Sky-T1 blog post: https://novasky-ai.github.io/posts/sky-t1/
The Berkeley team generated 17,000 training samples using Alibaba's QwQ-32B-Preview model, then used gpt-4o-mini to rewrite the reasoning traces and applied rejection sampling (discarding samples whose answers were incorrect) to improve data quality. The process is shown below:
[Image: Sky-T1 data curation process]
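The generate, rewrite, and reject loop can be sketched in a few lines. This is only an illustration under my own assumptions, not Sky-T1's code: `generate`, `rewrite`, and `is_correct` are hypothetical callables standing in for the teacher model, the gpt-4o-mini reformatting step, and the answer checker, and the retry limit `max_attempts` is something I added.

```python
from typing import Callable, Dict, Iterable, List


def build_dataset(
    problems: Iterable[Dict[str, str]],
    generate: Callable[[str], str],          # hypothetical: teacher model call
    rewrite: Callable[[str], str],           # hypothetical: trace reformatting step
    is_correct: Callable[[str, str], bool],  # hypothetical: answer checker
    max_attempts: int = 4,                   # assumption: retry budget per problem
) -> List[Dict[str, str]]:
    """Keep at most one verified trace per problem; drop problems that never pass."""
    kept = []
    for problem in problems:
        for _ in range(max_attempts):
            trace = rewrite(generate(problem["question"]))
            if is_correct(trace, problem["answer"]):
                kept.append({"question": problem["question"], "response": trace})
                break  # accept the first correct sample and move on
    return kept
```

The key property is that nothing unverified reaches the training set: a trace is either confirmed against the reference answer or thrown away.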

Evaluation Results

[Image: Sky-T1 evaluation results]

Findings:

  • Model Size Matters
  • Data Mixture Matters
 