Training, Optimization, and Experiment Tracking with Weights & Biases (WandB)
This project implements a fully-connected feedforward neural network (FFNN) from scratch using NumPy only. No deep learning frameworks such as TensorFlow or PyTorch are used. All core components are implemented manually:
- Forward and backward propagation
- Gradient-based optimisation (mini-batch gradient descent / Adam)
- Loss computation with optional L2 regularisation
- Evaluation and experiment tracking with Weights & Biases (WandB)
The overall goal is to understand how modern deep learning libraries work internally by building and experimenting with a NumPy-only implementation.
The main objectives are:
- Design and train a configurable FFNN for classification using NumPy.
- Support image-based datasets such as Fashion-MNIST and CIFAR-10.
- Implement from scratch:
  - Forward pass (matrix multiplications + activation functions)
  - Loss computation (cross-entropy) with L2 regularisation
  - Backward pass (manual derivatives and gradient computation)
  - Training loop with mini-batch gradient descent / Adam
- Evaluate models using accuracy, loss curves, and confusion matrices.
- Track experiments with WandB, including hyperparameter sweeps.
The main experiments are run on two open-source datasets:
- Fashion-MNIST – grayscale clothing images (28×28, 10 classes).
- CIFAR-10 – colour images (32×32×3, 10 classes).
Both datasets are small enough for CPU-based NumPy training but rich enough to demonstrate overfitting, regularisation effects, and the impact of different optimisers and weight initialisations. Additional open-source datasets (e.g. from Kaggle) can be plugged in if desired.
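For reference, both datasets ship in simple binary formats that NumPy can read without any framework. The sketch below is illustrative only: the file names under `data/` and `Dataset/` are assumed to follow the standard dataset layouts, and the notebooks' own loaders may differ.

```python
import gzip
import pickle
import numpy as np

def load_fashion_mnist(images_path, labels_path):
    """Load a Fashion-MNIST split from the raw idx gzip files."""
    with gzip.open(labels_path, "rb") as f:
        # idx1 format: 8-byte header, then one uint8 label per image
        labels = np.frombuffer(f.read(), dtype=np.uint8, offset=8)
    with gzip.open(images_path, "rb") as f:
        # idx3 format: 16-byte header, then 28x28 uint8 pixels per image
        images = np.frombuffer(f.read(), dtype=np.uint8, offset=16)
    images = images.reshape(len(labels), 28 * 28).astype(np.float32) / 255.0
    return images, labels

def load_cifar10_batch(batch_path):
    """Load one CIFAR-10 python batch (a pickled dict with bytes keys)."""
    with open(batch_path, "rb") as f:
        batch = pickle.load(f, encoding="bytes")
    images = batch[b"data"].astype(np.float32) / 255.0   # shape (10000, 3072)
    labels = np.array(batch[b"labels"])
    return images, labels

# Example calls (file names assumed to follow the standard layouts):
# X_train, y_train = load_fashion_mnist("data/train-images-idx3-ubyte.gz",
#                                       "data/train-labels-idx1-ubyte.gz")
# X_batch, y_batch = load_cifar10_batch("Dataset/data_batch_1")
```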
The central component is a flexible FFNN class with the following configurable hyperparameters:
- `num_epochs`
- `num_hidden_layers`
- `n_hidden_units`
- `learning_rate`
- `optimizer` (Adam)
- `batch_size`
- `l2_coeff`
- `weights_init` (Xavier, He)
- `activation` (ReLU, tanh, sigmoid)
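As an illustration, these knobs could be collected in a single configuration dictionary. This is a hypothetical usage sketch: the parameter names mirror the list above, but the actual constructor in `Utilisfunction.py` may expect a different signature.

```python
# Hypothetical example; the real FFNN class in Utilisfunction.py may differ.
config = {
    "num_epochs": 10,
    "num_hidden_layers": 3,
    "n_hidden_units": 128,
    "learning_rate": 1e-3,
    "optimizer": "Adam",
    "batch_size": 64,
    "l2_coeff": 5e-4,
    "weights_init": "Xavier",   # or "He"
    "activation": "ReLU",       # or "tanh", "sigmoid"
}
# model = FFNN(**config)        # hypothetical: constructor signature is assumed
```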
Training proceeds through the following steps (a minimal NumPy sketch follows this list):
- Forward pass – linear layers + activations using NumPy matrix operations.
- Loss computation – cross-entropy with optional L2 penalty on weights.
- Backward pass – manual derivatives for all layers and activations.
- Parameter update – gradient descent / Adam using accumulated gradients.
- Training loop – mini-batch iteration over the dataset with periodic validation.
- Evaluation – compute accuracy and loss curves on the training and validation sets, and a confusion matrix on the test set.
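The snippet below sketches these steps for a single hidden layer (He initialisation, ReLU activation, softmax output). It is a minimal illustration of the mathematics under those assumptions, not a reproduction of the code in `Utilisfunction.py`.

```python
import numpy as np

rng = np.random.default_rng(0)

# He initialisation for a 784 -> 128 -> 10 network (Fashion-MNIST sized)
params = {
    "W1": rng.normal(0.0, np.sqrt(2.0 / 784), (784, 128)),
    "b1": np.zeros(128),
    "W2": rng.normal(0.0, np.sqrt(2.0 / 128), (128, 10)),
    "b2": np.zeros(10),
}

def forward(X, params):
    """Forward pass: hidden layer with ReLU, softmax over 10 classes."""
    z1 = X @ params["W1"] + params["b1"]
    a1 = np.maximum(0.0, z1)                                    # ReLU
    z2 = a1 @ params["W2"] + params["b2"]
    z2 = z2 - z2.max(axis=1, keepdims=True)                     # numerical stability
    probs = np.exp(z2) / np.exp(z2).sum(axis=1, keepdims=True)  # softmax
    return probs, (z1, a1)

def loss(probs, y, params, l2_coeff):
    """Mean cross-entropy plus an L2 penalty on the weight matrices."""
    n = y.shape[0]
    ce = -np.log(probs[np.arange(n), y] + 1e-12).mean()
    l2 = l2_coeff * (np.sum(params["W1"] ** 2) + np.sum(params["W2"] ** 2))
    return ce + l2

def backward(X, y, probs, cache, params, l2_coeff):
    """Manual gradients of the loss with respect to every parameter."""
    z1, a1 = cache
    n = y.shape[0]
    dz2 = probs.copy()
    dz2[np.arange(n), y] -= 1.0
    dz2 /= n                                                    # d(loss)/d(logits)
    grads = {"W2": a1.T @ dz2 + 2 * l2_coeff * params["W2"],
             "b2": dz2.sum(axis=0)}
    dz1 = (dz2 @ params["W2"].T) * (z1 > 0)                     # ReLU derivative
    grads["W1"] = X.T @ dz1 + 2 * l2_coeff * params["W1"]
    grads["b1"] = dz1.sum(axis=0)
    return grads

def sgd_step(params, grads, lr):
    """Plain mini-batch gradient descent; Adam additionally tracks running moments."""
    for k in params:
        params[k] -= lr * grads[k]

def evaluate(X, y, params):
    """Accuracy and a 10x10 confusion matrix on a labelled split."""
    probs, _ = forward(X, params)
    preds = probs.argmax(axis=1)
    cm = np.zeros((10, 10), dtype=int)
    np.add.at(cm, (y, preds), 1)          # rows: true class, columns: prediction
    return (preds == y).mean(), cm
```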
Each training run can be logged to Weights & Biases. Logged artefacts include:
- Learning curves: `train_loss`, `val_loss`, `train_acc`, `val_acc`.
- Parameter histograms and gradient norms over time.
- Hyperparameter sweeps (Bayesian) over:
  - network depth and width
  - activation functions
  - weight initialisation
  - optimisers and learning rates
- Summary tables and bar plots comparing activations and initialisations across runs.
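A minimal sketch of what this tracking might look like is shown below. The project name `numpy-ffnn`, the placeholder metric values, and the exact sweep ranges are assumptions; the notebooks define the real training function and configuration.

```python
import wandb

def train_one_run():
    """One training run; hyperparameters arrive from the sweep via wandb.config."""
    with wandb.init(project="numpy-ffnn") as run:   # project name is an assumption
        cfg = run.config                            # e.g. cfg.learning_rate, cfg.activation
        for epoch in range(5):                      # placeholder epoch count
            # ... train the NumPy FFNN for one epoch, then compute real metrics ...
            wandb.log({"train_loss": 0.0, "val_loss": 0.0,
                       "train_acc": 0.0, "val_acc": 0.0})   # placeholder values
        # Confusion matrices can be logged with, e.g.:
        # wandb.log({"conf_mat": wandb.plot.confusion_matrix(
        #     y_true=y_test, preds=preds, class_names=classes)})

# Bayesian sweep over the hyperparameters listed above (illustrative ranges)
sweep_config = {
    "method": "bayes",
    "metric": {"name": "val_acc", "goal": "maximize"},
    "parameters": {
        "num_hidden_layers": {"values": [2, 3, 4]},
        "n_hidden_units":    {"values": [64, 128, 256]},
        "activation":        {"values": ["ReLU", "tanh", "sigmoid"]},
        "weights_init":      {"values": ["Xavier", "He"]},
        "learning_rate":     {"min": 1e-4, "max": 1e-2},
        "l2_coeff":          {"values": [0.0, 5e-4, 5e-3]},
    },
}

sweep_id = wandb.sweep(sweep_config, project="numpy-ffnn")
wandb.agent(sweep_id, function=train_one_run, count=20)
```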
Repository root (this folder):
Deep-learning-/
│
├── CIFAR-10.ipynb # CIFAR-10 loading, training, sweeps, summaries
├── Fashion-MNIST.ipynb # Fashion-MNIST loading, training, sweeps, summaries
├── Utilisfunction.py # FFNN implementation, training helpers, WandB utilities
├── data/ # Fashion-MNIST raw gzip files (downloaded)
├── Dataset/ # CIFAR-10 python batches (pre-downloaded)
├── wandb/ # Local WandB run logs (auto-created)
└── README.md # Project documentation
Create and activate a Python environment, then install the required packages:
pip install numpy matplotlib pandas wandb

To run the Fashion-MNIST experiments:
- Open `Fashion-MNIST.ipynb` in Jupyter or VS Code.
- Run the setup / data-loading cells.
- Run the training / sweep cells to train the FFNN on Fashion-MNIST.
- Run the final summary cell to compare activation functions and initialisations (local-only analysis).

To run the CIFAR-10 experiments:
- Open `CIFAR-10.ipynb`.
- Run the setup / data-loading cells (these use the files in `Dataset/`).
- Run the training / sweep cells to train the FFNN on CIFAR-10.
- Run the final summary cell to compare activation functions and initialisations (local-only analysis).
Fashion-MNIST and CIFAR-10 are run separately in their own notebooks. Both use the shared implementation in Utilisfunction.py.
Before running sweeps, log in to WandB in a terminal:
wandb login

Then, when you run the sweep cells in each notebook, runs will be tracked in your WandB account (metrics, curves, histograms, confusion matrices, and hyperparameter sweeps).