YORO is a YOLOv8-nano model fine-tuned on a self-curated, diverse Rubik’s Cube image dataset, with annotations created using Label Studio.
The model is designed to robustly detect cube colors and faces across varying orientations and lighting conditions.
Each image in the dataset was annotated with the following classes:
- Blue
- Green
- Orange
- Red
- White
- Yellow
- Cube face
At most 7 classes can be detected per image. (The label set also includes a redundant side_face class, but it never appears in the training set, so the model will never predict it.)
Some images in the dataset contain cubes at an angle, so their bounding boxes are rotated rather than axis-aligned. This required an Oriented Bounding Box (OBB) model instead of standard axis-aligned detection.
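To make the difference from axis-aligned boxes concrete, here is a minimal sketch of how a rotated box can be turned into the four-corner label line that Ultralytics uses for OBB datasets (class index followed by four normalized x, y corner pairs). The helper names obb_corners and obb_label_line, and the center/size/angle input, are illustrative assumptions and are not taken from this repository.

```python
import math


def obb_corners(cx, cy, w, h, angle_rad):
    """Return the four corners of a w x h box centred at (cx, cy),
    rotated by angle_rad around its centre (hypothetical helper)."""
    cos_a, sin_a = math.cos(angle_rad), math.sin(angle_rad)
    half_extents = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    return [
        (cx + dx * cos_a - dy * sin_a, cy + dx * sin_a + dy * cos_a)
        for dx, dy in half_extents
    ]


def obb_label_line(class_index, corners, img_w, img_h):
    """Format corners as 'class x1 y1 x2 y2 x3 y3 x4 y4',
    with coordinates normalized by the image size (hypothetical helper)."""
    coords = []
    for x, y in corners:
        coords += [x / img_w, y / img_h]
    return " ".join([str(class_index)] + [f"{v:.6f}" for v in coords])
```

For example, obb_label_line(6, obb_corners(320, 240, 200, 200, math.radians(30)), 640, 480) would describe a cube face tilted by 30 degrees, assuming class index 6 is the cube face class. The actual index order depends on the dataset configuration.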
Despite limited hardware resources, the model achieved strong performance:
- mAP: ~0.95
- Epochs trained: 72
- Early stopping patience triggered at: 87 epochs
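For readers unfamiliar with early stopping in the ultralytics package, the sketch below shows roughly how an OBB fine-tuning run with a patience setting can be configured. Every value in TRAIN_ARGS (the data.yaml path, the epoch limit, the patience, the image size) is a placeholder assumption and not the configuration used in main.py; only the YOLO class, the yolov8n-obb.pt checkpoint name, and the train() arguments come from the ultralytics package itself.

```python
# Placeholder settings: none of these values come from this repository.
TRAIN_ARGS = {
    "data": "data.yaml",  # hypothetical dataset config listing the classes
    "epochs": 100,        # upper bound on the number of training epochs
    "patience": 20,       # stop early after this many epochs without improvement
    "imgsz": 640,         # training image size
}


def train(weights="yolov8n-obb.pt"):
    """Fine-tune a YOLOv8-nano OBB checkpoint with the settings above."""
    # Imported lazily so this module loads even without ultralytics installed.
    from ultralytics import YOLO

    model = YOLO(weights)
    return model.train(**TRAIN_ARGS)
```

Calling train() starts the run. With a patience value set, training can end before the epoch limit once the validation metrics stop improving.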
⚠️ Important Note
The validation set is extremely limited (5 images) and shares very similar scenery with the training data.
As a result, the reported metrics should be interpreted with caution.
This repository contains two Python scripts:
- main.py
  This script contains a training sequence using the ultralytics package.
- inference.py
  This script lets you run the model in one of three modes:
  - --live
    Opens the device's camera and runs live inference.
  - --image
    Runs inference on an image stored on the device.
  - --video
    Runs frame-by-frame inference on a video file.
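The three modes above could be wired up along the following lines. This is only a sketch of the command-line handling, assuming that --live is a plain switch and that --image and --video each take a file path; the real inference.py may parse its arguments differently. The function names build_parser and select_mode are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser that accepts exactly one of --live, --image, --video."""
    parser = argparse.ArgumentParser(description="Run YORO inference (illustrative sketch).")
    mode = parser.add_mutually_exclusive_group(required=True)
    mode.add_argument("--live", action="store_true", help="run on the device camera")
    mode.add_argument("--image", metavar="PATH", help="run on a single image file")
    mode.add_argument("--video", metavar="PATH", help="run frame by frame on a video file")
    return parser


def select_mode(args):
    """Return a (mode, source) pair describing what to run inference on."""
    if args.live:
        return ("live", None)
    if args.image is not None:
        return ("image", args.image)
    return ("video", args.video)
```

Under these assumptions, an invocation such as python inference.py --image cube.jpg would select the image mode, and passing two modes at once would be rejected by argparse.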



