This is a series of Jupyter notebooks introducing machine learning and deep learning concepts, developed for the workshop Introduction to Machine Learning in Python, offered by the Carpentry @ UCSB Library in Winter 2026.
These notebooks take you from ML fundamentals through convolutional neural networks.
Authors: Tian Qiu, Jose Nino Muriel
Notebook 1 covers the foundations of machine learning, including:
- ML vs rule-based programming; classification, regression, and clustering
- 10-step ML workflow and problem formulation
- Project: Penguin species classification (Seaborn penguins dataset)
- Logistic regression and hyperplane-based decision boundaries
- Confusion matrices and accuracy evaluation
- Introduction to neural networks: single neurons, ReLU/GELU activation functions, one-hot encoding
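The single-neuron and one-hot-encoding ideas above can be sketched in a few lines of NumPy. This is an illustrative sketch, not code from the notebooks; the weights and feature values are made up:

```python
import numpy as np

def relu(x):
    """ReLU activation: elementwise max(0, x)."""
    return np.maximum(0.0, x)

def neuron(x, w, b):
    """A single artificial neuron: weighted sum of inputs plus bias, then ReLU."""
    return relu(np.dot(w, x) + b)

def one_hot(labels, n_classes):
    """Encode integer class labels (e.g. 3 penguin species) as one-hot vectors."""
    out = np.zeros((len(labels), n_classes))
    out[np.arange(len(labels)), labels] = 1.0
    return out

x = np.array([1.0, -2.0, 0.5])   # hypothetical feature vector
w = np.array([0.4, 0.3, -0.2])   # hypothetical learned weights
b = 0.1
print(neuron(x, w, b))           # weighted sum is -0.2, so ReLU outputs 0.0

print(one_hot([0, 2, 1], 3))     # three samples, three classes
```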
Notebook 2 builds a full neural-network classifier using Keras/TensorFlow:
- Project: Penguin species classification (continued)
- Keras Sequential API: `Input`, `Dense`, softmax output layer
- Categorical cross-entropy loss, Adam optimizer
- Training for 100 epochs, loss curve visualization
- Predictions, confusion matrix heatmap
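The softmax output layer and categorical cross-entropy loss used in this notebook can be sketched in plain NumPy; Keras computes the same quantities internally. A minimal sketch with a made-up three-class example:

```python
import numpy as np

def softmax(z):
    """Turn raw scores (logits) into class probabilities that sum to 1."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))  # subtract max for numerical stability
    return e / e.sum(axis=-1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-12):
    """Mean negative log-probability of the true class; y_true is one-hot."""
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=-1))

logits = np.array([[2.0, 1.0, 0.1]])   # one sample, three classes
probs = softmax(logits)
y_true = np.array([[1.0, 0.0, 0.0]])   # true class is class 0
print(probs, categorical_crossentropy(y_true, probs))
```

Training drives this loss down, which pushes probability mass onto the correct class.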
Notebook 3 extends to regression tasks and introduces training best practices:
- Project: Predicting Basel sunshine hours from European weather data (Zenodo dataset, 11 cities, 3654 days)
- 70/15/15 train/validation/test split strategy
- Gradient descent, batch training, and epoch concepts
- RMSE metric, baseline model comparison
- Overfitting detection using validation loss
- `EarlyStopping` callback
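The 70/15/15 split and baseline comparison above can be sketched with NumPy alone. The features and targets here are random placeholders standing in for the weather data; only the splitting and RMSE logic are the point:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 3654                              # days in the weather dataset
X = rng.normal(size=(n, 4))           # placeholder features
y = rng.normal(loc=5.0, size=n)       # placeholder sunshine hours

# Shuffle once, then cut at 70% and 85% for a 70/15/15 train/val/test split.
idx = rng.permutation(n)
i1, i2 = int(0.70 * n), int(0.85 * n)
train, val, test = idx[:i1], idx[i1:i2], idx[i2:]

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

# Baseline model: always predict the training-set mean.
# A trained network should beat this number on the test set.
baseline = y[train].mean()
print(rmse(y[test], baseline))
```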
Notebook 4 introduces convolutional neural networks for image data:
- Project: MNIST handwritten digit classification (60,000 training images, 28×28 grayscale)
- Image normalization and channel dimensions
- `Conv2D` filters, kernel size, stride, padding
- `MaxPooling2D` for dimensionality reduction
- Full CNN architecture: Conv → Pool → Conv → Pool → Flatten → Dense → Softmax
- Visualizing learned filters and intermediate layer outputs
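What a convolution filter and a pooling layer actually compute can be sketched in NumPy; `Conv2D` and `MaxPooling2D` do this over many filters at once. The 4×4 image and the edge filter below are made up for illustration:

```python
import numpy as np

def conv2d(image, kernel):
    """'Valid' 2-D convolution (no padding, stride 1) of one grayscale image."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(x, size=2):
    """Non-overlapping max pooling; trims edges that don't fill a window."""
    h, w = x.shape
    return x[:h - h % size, :w - w % size].reshape(
        h // size, size, w // size, size).max(axis=(1, 3))

img = np.arange(16, dtype=float).reshape(4, 4)     # tiny stand-in "image"
edge = np.array([[1.0, -1.0], [1.0, -1.0]])        # simple vertical-edge filter
fm = conv2d(img, edge)                             # feature map, shape (3, 3)
print(max_pool(fm))
```

The same slide-multiply-sum pattern, stacked over many learned filters and layers, is what the Conv → Pool blocks in the notebook perform.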
| Notebook | Task | Key Concepts |
|---|---|---|
| 1 | Classification | ML fundamentals, neurons, activations |
| 2 | Classification | Keras model building, training loop |
| 3 | Regression | Train/val/test split, overfitting, early stopping |
| 4 | Classification | CNNs, filters, pooling |
- TensorFlow / Keras — model building and training
- scikit-learn — classical ML, metrics, train/test split
- pandas / NumPy — data manipulation
- Matplotlib / Seaborn — visualization