A C++/CUDA neural network library inspired by PyTorch, built from scratch.
- Tensors: CPU and GPU support with automatic memory management.
- Layers: Dense (Linear), ReLU, Softmax.
- Loss: Categorical Cross Entropy.
- Optimizer: Stochastic Gradient Descent (SGD).
- Backends: Custom CPU and CUDA kernels.
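
To make the loss and optimizer entries above concrete, here is a minimal CPU-only sketch of softmax, categorical cross-entropy, and one SGD update. The function names (`softmax`, `cross_entropy`, `softmax_ce_grad`, `sgd_step`) and their signatures are illustrative assumptions and are not taken from this library's headers.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Numerically stable softmax over one row of logits (hypothetical helper).
std::vector<float> softmax(const std::vector<float>& logits) {
    assert(!logits.empty());
    const float max_logit = *std::max_element(logits.begin(), logits.end());
    std::vector<float> probs(logits.size());
    float sum = 0.0f;
    for (std::size_t i = 0; i < logits.size(); ++i) {
        probs[i] = std::exp(logits[i] - max_logit);  // shift by the max to avoid overflow
        sum += probs[i];
    }
    for (float& p : probs) p /= sum;
    return probs;
}

// Categorical cross-entropy for a single sample with an integer class label.
float cross_entropy(const std::vector<float>& probs, std::size_t label) {
    assert(label < probs.size());
    const float eps = 1e-7f;  // clamp to avoid log(0)
    return -std::log(std::max(probs[label], eps));
}

// Gradient of softmax followed by cross-entropy w.r.t. the logits:
// probs - one_hot(label).
std::vector<float> softmax_ce_grad(const std::vector<float>& probs, std::size_t label) {
    assert(label < probs.size());
    std::vector<float> grad = probs;
    grad[label] -= 1.0f;
    return grad;
}

// Plain SGD update: param <- param - lr * grad.
void sgd_step(std::vector<float>& params, const std::vector<float>& grads, float lr) {
    assert(params.size() == grads.size());
    for (std::size_t i = 0; i < params.size(); ++i) params[i] -= lr * grads[i];
}
```

Combining softmax and cross-entropy in the backward pass gives the simple `probs - one_hot(label)` gradient, which is a common reason to treat the two as a single step.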
- CMake >= 3.18
- CUDA Toolkit (for GPU support)
- C++17 Compiler
mkdir build
cd build
cmake ..
make

The example trains a neural network on the "spiral data" problem.

CPU Mode:
./train_spiral

GPU Mode:
./train_spiral gpu

- include/: Header files.
- src/: Source code (C++ and CUDA).
- examples/: Example usage scripts.

- Tensor: Handles data storage and device movement (to(Device::GPU)).
- Ops: Contains math operations (matmul, add, relu, etc.) with dispatch logic for CPU/GPU.
- Layers: High-level abstractions (Layer_Dense, Layer_ReLU) that store parameters and implement forward/backward passes.
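
The three components above could fit together roughly as follows. This is a CPU-only sketch under stated assumptions, not the library's actual code: the names `Tensor`, `Device`, `matmul`, and `Layer_Dense` come from the list above, but every member, signature, and the `sketch` namespace are hypothetical. The GPU branch is left as a stub so the example compiles without CUDA.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

namespace sketch {  // hypothetical namespace; signatures below are assumptions

enum class Device { CPU, GPU };

// Row-major 2D tensor. A real GPU tensor would own device memory;
// here the device is only a tag so the example runs without CUDA.
struct Tensor {
    std::size_t rows = 0, cols = 0;
    std::vector<float> data;
    Device device = Device::CPU;

    Tensor() = default;
    Tensor(std::size_t r, std::size_t c, float fill = 0.0f)
        : rows(r), cols(c), data(r * c, fill) {}

    float& at(std::size_t i, std::size_t j) {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }
    float at(std::size_t i, std::size_t j) const {
        assert(i < rows && j < cols);
        return data[i * cols + j];
    }
    // Stand-in for device movement: copies the tensor and retags it.
    Tensor to(Device d) const { Tensor t = *this; t.device = d; return t; }
};

// CPU backend: naive triple-loop matrix multiply.
Tensor matmul_cpu(const Tensor& a, const Tensor& b) {
    Tensor out(a.rows, b.cols);
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t k = 0; k < a.cols; ++k)
            for (std::size_t j = 0; j < b.cols; ++j)
                out.at(i, j) += a.at(i, k) * b.at(k, j);
    return out;
}

// Op with dispatch logic: validate shapes and devices, then pick a backend.
Tensor matmul(const Tensor& a, const Tensor& b) {
    if (a.cols != b.rows) throw std::invalid_argument("matmul: shape mismatch");
    if (a.device != b.device) throw std::invalid_argument("matmul: device mismatch");
    switch (a.device) {
        case Device::CPU: return matmul_cpu(a, b);
        case Device::GPU: throw std::runtime_error("matmul: GPU backend omitted in this sketch");
    }
    throw std::logic_error("matmul: unknown device");
}

Tensor transpose(const Tensor& a) {
    Tensor out(a.cols, a.rows);
    out.device = a.device;
    for (std::size_t i = 0; i < a.rows; ++i)
        for (std::size_t j = 0; j < a.cols; ++j) out.at(j, i) = a.at(i, j);
    return out;
}

// Dense layer: forward computes y = x W + b, backward computes the gradients.
struct Layer_Dense {
    Tensor weights;          // shape (in, out)
    Tensor bias;             // shape (1, out)
    Tensor input;            // cached by forward() for use in backward()
    Tensor dweights, dbias;  // parameter gradients filled in by backward()

    Layer_Dense(std::size_t in, std::size_t out)
        : weights(in, out, 0.01f), bias(1, out, 0.0f) {}

    Tensor forward(const Tensor& x) {
        input = x;
        Tensor y = matmul(x, weights);
        for (std::size_t i = 0; i < y.rows; ++i)
            for (std::size_t j = 0; j < y.cols; ++j) y.at(i, j) += bias.at(0, j);
        return y;
    }

    // dout is dL/dy; returns dL/dx and stores dL/dW and dL/db.
    Tensor backward(const Tensor& dout) {
        dweights = matmul(transpose(input), dout);
        dbias = Tensor(1, dout.cols);
        for (std::size_t i = 0; i < dout.rows; ++i)
            for (std::size_t j = 0; j < dout.cols; ++j) dbias.at(0, j) += dout.at(i, j);
        return matmul(dout, transpose(weights));
    }
};

}  // namespace sketch
```

Keeping the device check inside `matmul` means layers such as `Layer_Dense` stay device-agnostic: they call the op and the op selects the backend.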