A reinforcement learning implementation for CartPole-v0 using policy optimization.
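To illustrate what a policy-optimization update involves, here is a minimal sketch of a REINFORCE-style gradient estimate for a softmax-linear policy. It is not taken from this repository, whose exact algorithm is not described here. The function names, the softmax-linear policy, and the default `gamma=0.99` are all assumptions made for illustration; the sketch uses only the standard library and does not depend on Gym.

```python
import math


def discounted_returns(rewards, gamma=0.99):
    """Return G_t = r_t + gamma * G_{t+1} for every step of one episode."""
    returns = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        running = rewards[t] + gamma * running
        returns[t] = running
    return returns


def action_probs(theta, obs):
    """Softmax over linear scores; theta is a list of rows, shape (obs_dim, n_actions)."""
    n_actions = len(theta[0])
    logits = [sum(o * row[a] for o, row in zip(obs, theta)) for a in range(n_actions)]
    top = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - top) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]


def reinforce_gradient(theta, observations, actions, rewards, gamma=0.99):
    """Single-episode estimate of sum_t G_t * grad log pi(a_t | s_t)."""
    grad = [[0.0] * len(theta[0]) for _ in theta]
    returns = discounted_returns(rewards, gamma)
    for obs, action, g in zip(observations, actions, returns):
        probs = action_probs(theta, obs)
        for i, o in enumerate(obs):
            for a, p in enumerate(probs):
                # For a softmax-linear policy:
                # d log pi(action | obs) / d theta[i][a] = obs[i] * (1[a == action] - p_a)
                indicator = 1.0 if a == action else 0.0
                grad[i][a] += g * o * (indicator - p)
    return grad
```

For CartPole-v0, `theta` would have 4 rows (one per observation component) and 2 columns (push left, push right). A gradient-ascent step would then add a small multiple of the returned gradient to `theta` after each episode.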

Figure: step plot of the result.

CartPole-v0: https://gym.openai.com/envs/CartPole-v0/
Ilyas, Andrew, et al. "A closer look at deep policy gradients." arXiv preprint arXiv:1811.02553 (2018).
