AI

August 17, Instruction Data Generation

More researchers are recognizing the significance of instruction data during the Supervised Fine-Tuning (SFT) stage. In June, I wrote a blog post about data generation, but I now find it somewhat superficial and incomplete. Since then, many new methods have emerged, so here I cover more of the papers I've read on instruction data generation and selection.

July 23, DSPy with GPT-4o-mini on MMLU-Pro

DSPy is an optimization framework that improves the prompts and responses of models like GPT-4o-mini. This post shows how to use its optimizers to get better results from that cost-effective model. MMLU-Pro is a harder benchmark with more complex questions and more answer choices per question. The evaluation metric checks whether the model's response matches the true answer.
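As a rough illustration of the kind of metric described above, here is a minimal sketch in plain Python. It follows the `(example, pred, trace=None)` shape that DSPy metrics conventionally take, but the `answer` field name, the normalization rules, and the `accuracy` helper are assumptions made for this example, not code from the post.

```python
from types import SimpleNamespace


def normalize(choice):
    """Reduce an option label such as ' (b). ' to a bare uppercase letter."""
    return choice.strip().strip("().").upper()


def exact_match(example, pred, trace=None):
    """Return True when the predicted option letter equals the gold one.

    `example.answer` and `pred.answer` are hypothetical field names.
    """
    return normalize(example.answer) == normalize(pred.answer)


def accuracy(examples, preds):
    """Fraction of predictions that match their gold answers."""
    if not examples:
        return 0.0
    hits = sum(exact_match(e, p) for e, p in zip(examples, preds))
    return hits / len(examples)


# Stand-in objects; a real run would use dataset examples and model outputs.
gold = [SimpleNamespace(answer="B"), SimpleNamespace(answer="J")]
preds = [SimpleNamespace(answer="(b)"), SimpleNamespace(answer="A")]
score = accuracy(gold, preds)
```

Comparing normalized letters keeps the metric strict about the chosen option while tolerating formatting differences such as parentheses or a trailing period.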

July 16, LLMs Evals Thoughts

Evaluating LLMs is important for understanding their abilities and for solving real business problems. A good evaluation requires enough high-quality data samples, clear judging criteria, meaningful evaluation tasks, and private benchmarks that are run frequently. The process should adapt as LLMs develop over time.

July 5, LLMs Evaluation Benchmarks

As the capabilities of Large Language Models (LLMs) continue to evolve, many traditional evaluation benchmarks may require updates. With the rapid progress of these models, researchers are increasingly introducing new evaluation datasets. However, the specific dimensions these datasets assess are often unclear. In this post, I explore a series of commonly referenced evaluation datasets and highlight the particular model capabilities they were designed to assess, even though I may not cover every available dataset.

🧡July 7, Weekend with Midjourney

Midjourney provides a platform for exploring different artistic styles and techniques. Whether you're a seasoned artist or a beginner, the tool offers a wide array of options to experiment with and refine your artistic vision. Users can blend various elements, adjust parameters, and see real-time changes, giving them a unique and interactive experience.

🎟️June 30, DSPy

DSPy is a framework developed at Stanford for programming Large Language Models (LLMs) rather than hand-writing prompts; it automatically optimizes prompts and weights. DSPy can enhance the reliability of any model, whether it's GPT-4, LLaMA3 or Mistral, for any task you require.

⚗️June 26, TextGrad

TextGrad is an autograd engine tailored for textual gradients. It implements automatic backpropagation using feedback provided by Large Language Models (LLMs), firmly anchored in the gradient metaphor.
Chengsheng Deng
Latest posts
Mar 24, Notes on LightRAG
Mar 24, 2025
Dec 6, Some Tests on o1
Mar 14, 2025
Mar 10, Note on BIG-MATH
Mar 10, 2025
Mar 6, Note on QwQ-32B
Mar 6, 2025
Jan 21, Notes on DeepSeek-R1
Mar 6, 2025
The First Pages of 2025 - My January & February Story
Mar 5, 2025
Announcement
🎉Welcome to my blog🎉 
To find me:
Twitter/X: My X
👏Have fun on my blog👏