Intermediate Reinforcement Learning Techniques
Chapter 1: Revisiting Reinforcement Learning Fundamentals
The Reinforcement Learning Problem Setup
Markov Decision Processes (MDPs) Recap
Value Functions and Bellman Equations
Tabular Solution Methods: Q-Learning and SARSA
Limitations of Tabular Methods
Quiz for Chapter 1
Chapter 2: Deep Q-Networks (DQN)
Introduction to Function Approximation
Using Neural Networks for Q-Value Approximation
The DQN Algorithm Architecture
Experience Replay Mechanism
Fixed Q-Targets (Target Networks)
Loss Function for DQN Training
Hands-on Practical: Implementing DQN for CartPole
Quiz for Chapter 2
Chapter 3: Improvements and Variants of DQN
The Overestimation Problem in Q-Learning
Double DQN (DDQN)
Dueling Network Architectures
Combining DQN Improvements
Prioritized Experience Replay (Brief Overview)
Practice: Implementing Double DQN
Quiz for Chapter 3
Chapter 4: Policy Gradient Methods
Limitations of Value-Based Methods
Direct Policy Parameterization
The Policy Gradient Theorem
REINFORCE Algorithm
Understanding Variance in Policy Gradients
Baselines for Variance Reduction
Hands-on Practical: Implementing REINFORCE
Quiz for Chapter 4
Chapter 5: Actor-Critic Methods
Combining Policy and Value Estimation
Actor-Critic Architecture Overview
Advantage Actor-Critic (A2C)
Asynchronous Advantage Actor-Critic (A3C)
Implementation Considerations for Actor-Critic
Comparison: REINFORCE vs A2C/A3C
Practice: Developing an A2C Implementation
Quiz for Chapter 5