AI
Dec 12, Notes on Gemini-Flash 2.0
Dec 6, Some Tests on o1
Nov 29, Notes on “Deep Thinking Model”
Nov 26, Explore DSPy on BootstrapFinetune

🤩Nov 20, Notes on DSPy

This isn't my first blog post on DSPy—I've written several before. However, I've noticed some recent updates to DSPy, and I'd rather not consult the documentation every time I want to build programs. So, I plan to jot down some basic DSPy concepts in this post. Additionally, I intend to use this document as external knowledge for GPT or Claude.
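
As a quick reminder of the kind of basics that post covers, here is a minimal DSPy sketch; the model name and the signature string are placeholder choices of mine, not taken from the post, and it assumes an OpenAI API key is configured.

```python
import dspy

# Configure the language model (placeholder model name; assumes OPENAI_API_KEY is set).
lm = dspy.LM("openai/gpt-4o-mini")
dspy.configure(lm=lm)

# A declarative signature: inputs -> outputs, no hand-written prompt.
qa = dspy.Predict("question -> answer")

result = qa(question="What does DSPy optimize instead of hand-written prompts?")
print(result.answer)
```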
Nov 15, Notes on OPENCODER
Nov 6, Notes on Contextual Retrieval

🤩Oct 30, LLMs cannot Play the Snake Game

The blog introduces a novel method for evaluating LLM performance by having them play the Snake game, assessing their decision-making, planning, and strategy skills. The experiment tested several models, revealing that o1-mini performed best with a score of 11, while Claude models outperformed GPT models. The findings suggest that reinforcement learning significantly enhances LLMs' capabilities in dynamic decision-making tasks. Although preliminary, this approach highlights the potential of game-based assessments for deeper insights into LLM competencies, with recommendations for further testing across more models and scenarios.
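
I have not seen the underlying evaluation code, but the setup can be pictured as a loop that serializes the board into text, asks the model for one move at a time, and scores the episode. The sketch below is only a hypothetical illustration of that idea; `query_llm` is a stand-in for whichever chat API the experiment actually used.

```python
from typing import Callable, List, Tuple

def render_board(snake: List[Tuple[int, int]], food: Tuple[int, int], size: int = 10) -> str:
    """Serialize the game state as text so the LLM can 'see' it."""
    rows = []
    for y in range(size):
        row = ""
        for x in range(size):
            if (x, y) == snake[0]:
                row += "H"   # head
            elif (x, y) in snake:
                row += "S"   # body
            elif (x, y) == food:
                row += "F"   # food
            else:
                row += "."
        rows.append(row)
    return "\n".join(rows)

def next_move(board_text: str, query_llm: Callable[[str], str]) -> str:
    """Ask the model for a single move; the prompt wording is illustrative only."""
    prompt = (
        "You are playing Snake on the board below. "
        "Reply with exactly one word: UP, DOWN, LEFT, or RIGHT.\n\n" + board_text
    )
    reply = query_llm(prompt).strip().upper()
    return reply if reply in {"UP", "DOWN", "LEFT", "RIGHT"} else "UP"
```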

Oct 18, Notes on LIGHTRAG

The blog discusses LightRAG, an innovative framework for Retrieval-Augmented Generation (RAG) systems that enhances performance by incorporating graph structures and dual-level retrieval processes. It outlines the challenges faced by traditional RAG systems, such as speed, quality, and understanding limitations, and explains how LightRAG addresses these issues through efficient text indexing and retrieval methods. The framework allows for both specific and abstract queries, improving the ability to handle complex questions and providing tailored responses using a general-purpose LLM.
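
To make the dual-level idea concrete, here is a hypothetical sketch, not the actual LightRAG API: low-level retrieval matches specific entities from a graph-derived index (good for concrete questions), high-level retrieval matches broader themes or relations (good for abstract questions), and the two result sets are merged before generation. All names below are made up for illustration.

```python
from typing import Dict, List

def retrieve_low_level(query: str, entity_index: Dict[str, str]) -> List[str]:
    """Match concrete entities mentioned in the query (specific questions)."""
    return [doc for entity, doc in entity_index.items() if entity.lower() in query.lower()]

def retrieve_high_level(query_keywords: List[str], theme_index: Dict[str, str]) -> List[str]:
    """Match broader themes or relations (abstract questions)."""
    return [doc for theme, doc in theme_index.items() if theme in query_keywords]

def dual_level_retrieve(query: str, query_keywords: List[str],
                        entity_index: Dict[str, str], theme_index: Dict[str, str]) -> List[str]:
    # Merge both levels and de-duplicate before passing the context to the LLM.
    merged = retrieve_low_level(query, entity_index) + retrieve_high_level(query_keywords, theme_index)
    return list(dict.fromkeys(merged))
```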

Oct 12, Notes on Re-Reading & GSM-Symbolic

The blog discusses two contrasting papers on large language models (LLMs): one proposes a "Re-Reading" method to enhance reasoning capabilities, showing consistent improvements in performance, while the other, GSM-Symbolic, critiques LLMs' reasoning abilities, revealing significant performance variance and limitations in mathematical reasoning. The author concludes that it's too early to declare LLMs incapable of reasoning, suggesting that today's limitations may not be permanent as models continue to evolve.
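
For reference, the Re-Reading trick is just a prompting change: the question is stated twice before the model answers. A minimal sketch follows; the prompt wording is paraphrased from my understanding of the method, not copied from the paper.

```python
def re_reading_prompt(question: str) -> str:
    # RE2-style prompt: state the question, repeat it once, then ask for step-by-step reasoning.
    return (
        f"{question}\n"
        f"Read the question again: {question}\n"
        "Let's think step by step."
    )

print(re_reading_prompt("A train travels 60 km in 45 minutes. What is its average speed in km/h?"))
```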
Sep 25, Notes on Gemini models
Sep 19, Notes on Qwen2.5