Q-Learning Algorithm: Off-Policy TD Control