Lecture 16: Reinforcement Learning/Q-learning
Course Videos
L16A Introduction
L16B Markov Decision Process (MDP)
L16C Value Iteration
L16D Policy Iteration
L16E Reinforcement Learning
L16F Model-Free RL based on MC Estimation
L16G Temporal Difference Learning SARSA
L16H Exploration Strategies
L16I Q-Learning
L16J SARSA vs. Q-Learning