To understand PyTorch and DeepRL on a deeper level, I decided to implement some algorithms from scratch. These are my implementations of the common reinforcement learning algorithms DQN, A2C, and PPO using PyTorch and Gymnasium. Feel free to use the code as a reference. My PyTorch skills improved considerably with every algorithm, so the older the file, the more likely you are to find bad practices.
I made a presentation about the material to help me digest it: Intro to DeepRL. If you find any problems, please reach out!
DQN Paper
DQN Paper 2
DQN Guide
A2C/A3C Paper
Hugging Face A2C
PPO Paper
Hugging Face PPO
PPO Implementation Details
GAE Value Function Estimation (PPO)
Sutton and Barto: RL Introduction. Chapters 1-7
OpenAI: Spinning Up as a Deep RL Researcher. "The Right Background"