Code for my Machine Learning Final Project
Binary classification of images for person detection using HOG + SVM and Random Forests.
-
Dataset Download
- Go to the BDD100k website: http://bdd-data.berkeley.edu/download.html
- Download both the images (red) and the labels (green).

- Extract the zip files into their respective data directories. If you are unsure where they belong, look for the hidden helper file.
-
Python Environment Setup
- Use your preferred method for environment setup
-
Run the Scripts
- After extracting the images and labels into the correct directories, run the "flatten_dataset.py" script first. It moves all the files up to the parent directory and deletes the leftover empty directories.
- Then run the "parse_labels.py" script to get the count of unique objects in the images.
- Run the "random_sample.py" script. To change which object you classify by, or the number of samples drawn from the total 100k images, edit this file.
- Run the "train_model.py" script to train the models and see results.
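
For readers who want to see what the flattening step does before running it, here is a minimal sketch of the behaviour described above: move every nested file up to the parent directory, then delete the empty directories left behind. It is not the code from "flatten_dataset.py". The function name `flatten_directory` and the choice to skip files whose names collide are assumptions made for illustration.

```python
import shutil
from pathlib import Path


def flatten_directory(root):
    """Move every file nested under `root` into `root` itself.

    Returns the number of files moved. Hypothetical helper, not the
    project's actual implementation.
    """
    root = Path(root)
    moved = 0

    # Build the full list first so moving files does not disturb the walk.
    nested_files = [p for p in root.rglob("*") if p.is_file() and p.parent != root]
    for src in nested_files:
        dest = root / src.name
        if dest.exists():
            # Assumption: leave colliding files in place rather than overwrite.
            continue
        shutil.move(str(src), str(dest))
        moved += 1

    # Remove directories that are now empty, deepest first.
    subdirs = sorted(
        (p for p in root.rglob("*") if p.is_dir()),
        key=lambda p: len(p.parts),
        reverse=True,
    )
    for directory in subdirs:
        if not any(directory.iterdir()):
            directory.rmdir()

    return moved
```

Calling `flatten_directory("data/images")` (a placeholder path) would leave all images directly inside that directory. A directory that still holds a skipped file is kept, so nothing is lost when names collide.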