Hi,
I have prior experience working with GPT architectures and neural networks in PyTorch, but I’m new to optimization and completely unfamiliar with CUDA.
I’ve read Chapter 1 of the book and opened the corresponding code/chap1 folder in this repo, but I’m struggling to connect the concepts from the book with the implementation. Even after going through the README.md files, I’m still unclear about how the pieces fit together.
For someone with zero CUDA knowledge, what would be a good way to approach this repo in order to practice and learn the topics? Right now, everything feels overwhelming without references or guidance, and my only option seems to be reading through the code line by line to make sense of it.
Any advice on how to bridge the gap between the book’s explanations and the repo’s code would be greatly appreciated.
Thank you!