2024 | Tags | Singularity Gallery

#2024

😶‍🌫️Dec 16, Notes on DP, Monte Carlo, TD in Reinforcement Learning

Exploration of three key reinforcement learning algorithms: Dynamic Programming (DP) for optimal policies in MDPs, Monte Carlo methods for learning from complete episodes without a model, and Temporal Difference (TD) learning for efficient updates from incomplete episodes using bootstrapping. Each method has unique characteristics and trade-offs essential for understanding advanced concepts in reinforcement learning.

Reinforcement Learning

2024

Statistics

🍉Dec 12, Notes on Gemini-Flash 2.0

Google has announced Gemini 2.0, launching Gemini Flash 2.0 as its first model in this new series.

🩺Dec 6, Some Tests on o1

In this blog, I will show some test examples of the full o1 model and compare it with other deep thinking models. Let's start!

2024

LLM

Dec 4, Recap for November

This document summarizes my experiences in both work and personal life during November.

Thoughts

2024

personal

🏑Nov 29, Notes on “Deep Thinking Model”

Since OpenAI released its "o1-series" model, several teams have developed their own approaches to "deep thinking" models. DeepSeek introduced their o1-like model, DeepSeek-R1-Lite, while Qwen released QwQ-32B-Preview, and Intern launched Intern Thinker.

LLM

2024

🎐Nov 26, Explore DSPy on BootstrapFinetune

While this isn't the first blog about DSPy, I've noticed recent updates to the DSPy documentation and GitHub repository, including a new optimization method called BootstrapFinetune.

LLM

DSPy

2024

🤩Nov 20, Notes on DSPy

This isn't my first blog post on DSPy—I've written several before. However, I've noticed some recent updates to DSPy, and I'd rather not consult the documentation every time I want to build programs. So, I plan to jot down some basic DSPy concepts in this post. Additionally, I intend to use this document as external knowledge for GPT or Claude.

LLM

2024

DSPy

🫥Nov 17, Recap for October

This document summarizes my experiences in both work and personal life during October.

2024

Thoughts

personal

Nov 15, Notes on OPENCODER

In this blog, I share notes on an intriguing paper I recently read:

LLM

2024

Code

🌓Nov 6, Notes on Contextual Retrieval

Contextual Retrieval, a method proposed by Anthropic, significantly enhances the retrieval step in RAG systems.

LLM

RAG

2024

🤩Oct 30, LLMs cannot Play the Snake Game

The blog introduces a novel method for evaluating LLM performance by having them play the Snake game, assessing their decision-making, planning, and strategy skills. The experiment tested several models, revealing that o1-mini performed best with a score of 11, while Claude models outperformed GPT models. The findings suggest that reinforcement learning significantly enhances LLMs' capabilities in dynamic decision-making tasks. Although preliminary, this approach highlights the potential of game-based assessments for deeper insights into LLM competencies, with recommendations for further testing across more models and scenarios.

2024

LLM

Evaluation

Oct 18, Notes on LIGHTRAG

The blog discusses LIGHTRAG, an innovative framework for Retrieval-Augmented Generation (RAG) systems that enhances performance by incorporating graph structures and dual-level retrieval processes. It outlines the challenges faced by traditional RAG systems, such as speed, quality, and understanding limitations, and explains how LightRAG addresses these issues through efficient text indexing and retrieval methods. The framework allows for both specific and abstract queries, improving the ability to handle complex questions and providing tailored responses using a general-purpose LLM.

LLM

RAG

2024

Chengsheng Deng