Since May last year, I’ve been recapping my experiences monthly. When 2025 began, I skipped January’s reflection, but I realize now how important it is to document this period. This post reflects on the past two months and captures my thoughts about life and the rapidly evolving AI landscape.
The DeepSeek Revolution
The first two months of 2025 marked a pivotal moment in AI history, particularly with the release of DeepSeek-R1. This groundbreaking model has fundamentally transformed the open-source AI community in several key ways:
- Widespread Impact: Its influence spans both industry and academia, revolutionizing how researchers and practitioners approach AI development.
- Mainstream Recognition: The model's reach extends far beyond technical circles. During my recent visit to Melbourne, even my elderly uncle, who has limited technical background, was eager to discuss DeepSeek.
- Technical Excellence: For a deeper understanding of the model's capabilities and technical specifications, see my detailed notes: Notes on DeepSeek R1
What I find most valuable about DeepSeek-R1 is the team's decision to make the model's thinking process public. This transparency enables extensive data-distillation work, which they have already begun exploring in their technical report. This approach stands in stark contrast to OpenAI, which has deliberately chosen not to release its models' thinking processes. Interestingly, Google initially provided access to the thinking process in its gemini-2.0-flash-thinking-exp-01-21 API, but later disabled this parameter, following OpenAI's lead.
The Open-Source Response
Following DeepSeek-R1's release, numerous projects emerged in the open-source community:
- Some attempted to replicate R1's development journey
- Others focused on data distillation, applying DeepSeek-R1's capabilities to smaller models to enhance their reasoning
- Many explored using GRPO (the same reinforcement learning algorithm powering DeepSeek-R1) on smaller models to achieve similar "aha" moments
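The core of the GRPO recipe these projects use can be sketched in a few lines: score each sampled completion with a simple rule-based reward (straightforward for math, where the final answer is verifiable), then normalize rewards within the sampled group instead of training a separate value model. The sketch below is a minimal illustration under that description, not DeepSeek's actual implementation; the reward rule and function names are my own hypothetical choices.

```python
import re
import statistics

def rule_based_reward(completion: str, gold_answer: str) -> float:
    """Toy verifiable reward: 1.0 if the final \\boxed{...} answer matches
    the reference, else 0.0. Real setups typically also reward formatting."""
    match = re.search(r"\\boxed\{([^}]*)\}", completion)
    if match is None:
        return 0.0
    return 1.0 if match.group(1).strip() == gold_answer.strip() else 0.0

def group_relative_advantages(rewards: list[float]) -> list[float]:
    """GRPO's key idea: sample a group of completions per prompt and use
    each reward's deviation from the group mean (scaled by the group std)
    as the advantage, avoiding a learned value model entirely."""
    mean = statistics.mean(rewards)
    std = statistics.pstdev(rewards)
    if std == 0:
        return [0.0 for _ in rewards]
    return [(r - mean) / std for r in rewards]

# A group of 4 sampled completions for one math prompt:
completions = [
    "Reasoning... \\boxed{42}",
    "Reasoning... \\boxed{41}",
    "Some text with no final answer",
    "Reasoning... \\boxed{42}",
]
rewards = [rule_based_reward(c, "42") for c in completions]
advantages = group_relative_advantages(rewards)
```

Completions with above-average rewards get positive advantages and are reinforced; the rest are pushed down. The simplicity of `rule_based_reward` here is exactly why math dominates this line of work: for open-ended questions no such cheap verifier exists.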
Some of these efforts have successfully demonstrated that smaller models can achieve reasoning performance comparable to o1 and R1. However, most work has concentrated primarily on mathematics rather than expanding to other domains. This limitation likely stems from the relative ease of designing reward rules in mathematics compared to real-world scenarios, where questions are often open-ended without absolute answers. I'm eager to see this research extend into more diverse domains beyond mathematics.
The Broader AI Landscape
The past two months brought numerous other significant developments beyond DeepSeek:
OpenAI released o3-mini, a new reasoning model, in January and followed with gpt-4.5-preview in February. I've tested gpt-4.5-preview and documented my findings here: Notes on GPT 4.5.
Anthropic launched claude-3-7-20250219, offering users the option to enable thinking capabilities or use its general abilities, effectively providing a unified model. My tests revealed impressive performance, detailed here: Notes on Claude 3.7 & Qwen 2.5 Max.
Alibaba continued its steady progress, releasing Qwen2.5-Max and its reasoning-focused variant QwQ-Preview. Despite DeepSeek's prominence overshadowing some of Alibaba's contributions, it's worth noting that many data-distillation projects still choose Qwen as their base model.
My Recent Work
I've dedicated significant time to learning reinforcement learning algorithms. Inspired by DeepSeek's success, I experimented with GRPO on the Unsloth framework to train models and observe the "aha" moment, when models demonstrate reflection, verification, and reasoning. While this approach yields fascinating results, it's extremely computationally intensive: even attempting to train a 3B model on an H100 GPU resulted in out-of-memory errors. Given my limited computational resources, I've pivoted to supervised fine-tuning (SFT). I'm currently training Qwen2.5-32B-Instruct with distillation techniques based on DeepSeek-R1. The model's performance still has room for improvement, which I hope to share in next month's update.
Why I Write
I’m often asked, “Why do you love writing things down?” even though my audience is limited and many topics seem trivial. My reasons are threefold:
- Personal Documentation: Writing is primarily for myself, not others. I document my life, thoughts, and learning to preserve experiences that might otherwise fade from memory.
- Clarity of Thought: Writing reveals my thinking process. When I struggle to express something clearly, it signals that my understanding is incomplete or my thoughts are disorganized.
- Building a Second Brain: Writing forms the foundation of my second brain, a system for organizing thoughts and knowledge. By writing and linking ideas, I discover connections between concepts and build a more structured knowledge base. For example, writing about DeepSeek-R1 allows me to connect it with previous notes on AI models, revealing patterns in AI development I might otherwise miss.
Looking Forward
These reflections capture my experiences, feelings, and thoughts from the beginning of 2025. I commit to continuing these monthly reflections, documenting my journey through this remarkable era of AI advancement.
- Author: Chengsheng Deng
- URL: https://chengshengddeng.com/article/my-jan-feb-story
- Copyright: All articles in this blog, except where specially stated, adopt the BY-NC-SA agreement. Please indicate the source!