AI Simulator


2022-08-14 11:40:45 +0000 - paradite

Deep Q-network (DQN) is a deep reinforcement learning algorithm developed by DeepMind in 2013.

DQN uses a deep convolutional neural network to approximate the Q-value of each action in a given state.
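As a minimal sketch (assuming PyTorch; the layer sizes follow the classic Atari DQN architecture and are illustrative, not taken from AI Simulator), the network maps a stack of game frames to one Q-value per action:

```python
import torch
import torch.nn as nn

class QNetwork(nn.Module):
    """Maps a stack of 84x84 game frames to one Q-value per action."""

    def __init__(self, in_channels: int = 4, num_actions: int = 4):
        super().__init__()
        self.net = nn.Sequential(
            # Convolutional layers extract spatial features from the frames.
            nn.Conv2d(in_channels, 32, kernel_size=8, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=4, stride=2), nn.ReLU(),
            nn.Conv2d(64, 64, kernel_size=3, stride=1), nn.ReLU(),
            nn.Flatten(),
            # Fully connected head outputs one Q-value per action.
            nn.Linear(64 * 7 * 7, 512), nn.ReLU(),
            nn.Linear(512, num_actions),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

q_net = QNetwork()
frames = torch.zeros(1, 4, 84, 84)  # one stack of four 84x84 frames
q_values = q_net(frames)            # shape: (1, num_actions)
```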

Hyperparameters

Several hyperparameters can be tuned to get better results with DQN, such as the learning rate, the discount factor (gamma), the exploration rate (epsilon) and its decay schedule, the replay buffer size, the batch size, and the target network update frequency.
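As a hedged sketch (the values below are common starting points from general DQN practice, not settings used by AI Simulator), a configuration might look like:

```python
# Illustrative DQN hyperparameters (common starting points, not tuned values).
config = {
    "learning_rate": 1e-4,              # optimizer step size
    "gamma": 0.99,                      # discount factor for future rewards
    "epsilon_start": 1.0,               # initial exploration rate
    "epsilon_end": 0.1,                 # exploration rate after decay
    "epsilon_decay_frames": 1_000_000,  # frames over which epsilon anneals
    "replay_buffer_size": 100_000,      # stored transitions for replay
    "batch_size": 32,                   # transitions sampled per update
    "target_update_frames": 10_000,     # how often the target network syncs
}
```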

Performance metrics

We can use the Q value and loss as two metrics to evaluate the performance of a DQN model.

Q value

The Q value measures the expected cumulative discounted reward for performing an action in a given state and following the policy afterwards.
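As a minimal sketch in plain Python (the reward sequence is made up for illustration), the quantity a Q value estimates is the discounted sum of future rewards:

```python
def discounted_return(rewards, gamma=0.99):
    """Sum of rewards, each discounted by how far in the future it arrives."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

# Three steps of reward 1 each: 1 + 0.99 * 1 + 0.99**2 * 1 = 2.9701
print(discounted_return([1.0, 1.0, 1.0]))
```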

Tips for Q value

The expected Q value is affected by various hyperparameters as well as by the reward function.

Here are some common issues with Q values; a sketch of common remedies follows the list:

1. Q value too low (<1)

2. Q value too high (>50)

3. Q value unstable and fluctuates widely
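As a hedged sketch of levers that are commonly adjusted for these issues (general DQN practice, not remedies specific to AI Simulator): scaling rewards up tends to raise Q values that are stuck near zero, clipping rewards bounds Q values that grow too large, and a lower learning rate or a target network tends to reduce fluctuation:

```python
def clip_reward(r, low=-1.0, high=1.0):
    """Clip rewards to a fixed range to bound the scale of Q values.
    With gamma = 0.99, rewards in [-1, 1] keep |Q| below 1 / (1 - 0.99) = 100.
    """
    return max(low, min(high, r))

def scale_reward(r, factor=10.0):
    """Scale rewards up to raise Q values that are stuck near zero."""
    return r * factor
```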

Loss

Loss measures the difference between the predicted and the actual result (how accurate the prediction is). For DQN it is the squared error between the target Q value and the predicted Q value.
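A minimal sketch of this computation (assuming PyTorch and a separate target network; the tensor names and shapes are illustrative):

```python
import torch
import torch.nn.functional as F

def dqn_loss(q_net, target_net, batch, gamma=0.99):
    """Squared error between the target Q value and the predicted Q value."""
    states, actions, rewards, next_states, dones = batch

    # Predicted Q value of the action actually taken in each state.
    q_pred = q_net(states).gather(1, actions.unsqueeze(1)).squeeze(1)

    # Target Q value: observed reward plus the discounted best Q value of
    # the next state, estimated by a frozen target network.
    with torch.no_grad():
        q_next = target_net(next_states).max(dim=1).values
        q_target = rewards + gamma * q_next * (1.0 - dones)

    return F.mse_loss(q_pred, q_target)
```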

Tips for Loss

Here are some common issues with loss; a sketch of common remedies follows the list:

1. Negative loss

2. Loss too high (>10)

3. Loss unstable and fluctuates widely
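As a hedged sketch of common remedies (general DQN practice, not fixes specific to AI Simulator): the Huber loss is less sensitive to outlier TD errors than squared error, and gradient clipping caps the size of each update; both tend to tame a high or fluctuating loss:

```python
import torch
import torch.nn.functional as F

def stable_update(q_net, optimizer, q_pred, q_target, max_grad_norm=10.0):
    """One gradient step with a Huber loss and gradient clipping."""
    # Huber (smooth L1) loss: quadratic for small TD errors, linear for
    # large ones, so a few extreme transitions cannot dominate the update.
    loss = F.smooth_l1_loss(q_pred, q_target)

    optimizer.zero_grad()
    loss.backward()
    # Cap the gradient norm so one bad batch cannot blow up the weights.
    torch.nn.utils.clip_grad_norm_(q_net.parameters(), max_grad_norm)
    optimizer.step()
    return loss.item()
```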

Examples

DQN can be trained to play many single-player games, such as Tetris, Snake, and 2048.

[Screenshot: TensorBoard metrics for training DQN to play AI Simulator: 2048 over 100M frames.]

[Screenshot: TensorBoard metrics for training DQN to play AI Simulator: Robot over 13M frames.]

Further reading

DQN paper: Playing Atari with Deep Reinforcement Learning (Mnih et al., 2013)

Interactive demos