STEM

Mastering Reasoning Models: Algorithms, Optimization, and Applications

This course explores modern reasoning models, focusing on the algorithmic innovations behind models such as DeepSeek R1, OpenAI o1, and their open-source alternatives. You will study the four key approaches to building reasoning LLMs: inference-time scaling, pure reinforcement learning, supervised fine-tuning combined with reinforcement learning (SFT+RL), and knowledge distillation. Through concrete examples and technical deep dives, you will learn how to implement test-time compute scaling, understand the mechanics of Group Relative Policy Optimization (GRPO), and build efficient inference pipelines for reasoning tasks. By the end of the course, you should have both the theoretical knowledge and the practical skills to apply these techniques in your own applications, whether you’re working with enterprise-scale resources or a more limited computational budget.
