Object detection with a convolutional neural network, using a pretrained YOLO (You Only Look Once) model.
- Download the trained model (195 MB) from here.
- Save it in the /model_data folder.
For every new image:
- Update size:

      # car_detection_yolo.py (line 96)
      image_shape = (1080., 1440.)  # image_shape = (Height, Width)

- Update name:

      # car_detection_yolo.py (line 126)
      out_scores, out_boxes, out_classes = predict(sess, "sh_taxi.jpg")  # name = "sh_taxi.jpg"

The input image goes through a CNN, resulting in a (19, 19, 5, 85) dimensional output. Run summary() to see the whole network architecture:

    # car_detection_yolo.py (line 99)
    yolo_model.summary()

After flattening the last two dimensions, the output is a volume of shape (19, 19, 425):
- Each cell in a 19x19 grid over the input image gives 425 numbers.
- 425 = 5 x 85 because each cell contains predictions for 5 boxes, corresponding to 5 anchor boxes, as seen in lecture.
- 85 = 5 + 80, where 5 is because (p_c, b_x, b_y, b_h, b_w) has 5 numbers, and 80 is the number of classes we'd like to detect.
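
The arithmetic above can be checked with a small, framework-free sketch. It assumes the 425 numbers of a cell are laid out anchor by anchor (row-major flattening of the last two dimensions) and that each 85-number chunk starts with the five box fields followed by the 80 class probabilities. `split_cell` and `decode_box` are hypothetical helper names used only for illustration; they are not part of car_detection_yolo.py.

```python
GRID_SIZE = 19      # the image is divided into a 19x19 grid
NUM_ANCHORS = 5     # boxes predicted per cell
NUM_CLASSES = 80    # classes to detect
BOX_FIELDS = 5      # p_c, b_x, b_y, b_h, b_w

VALUES_PER_BOX = BOX_FIELDS + NUM_CLASSES        # 85
VALUES_PER_CELL = NUM_ANCHORS * VALUES_PER_BOX   # 425


def split_cell(cell_values):
    """Split one cell's 425 numbers into 5 per-anchor chunks of 85 numbers."""
    if len(cell_values) != VALUES_PER_CELL:
        raise ValueError(
            f"expected {VALUES_PER_CELL} values, got {len(cell_values)}"
        )
    return [
        cell_values[a * VALUES_PER_BOX:(a + 1) * VALUES_PER_BOX]
        for a in range(NUM_ANCHORS)
    ]


def decode_box(box_values):
    """Separate the 5 box fields from the 80 class probabilities (assumed order)."""
    p_c, b_x, b_y, b_h, b_w = box_values[:BOX_FIELDS]
    return {
        "p_c": p_c,
        "box": (b_x, b_y, b_h, b_w),
        "class_probs": box_values[BOX_FIELDS:],
    }


# Dummy data: the numbers 0..424 stand in for one cell's predictions.
cell = list(range(VALUES_PER_CELL))
boxes = split_cell(cell)
second = decode_box(boxes[1])
print(len(boxes), len(boxes[0]), len(second["class_probs"]))  # 5 85 80
```

With real model output the same split would normally be done by reshaping the tensor rather than slicing Python lists; the list version is used here only to make the 5 x 85 bookkeeping explicit.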
You then keep only a few boxes, based on:
- Score-thresholding: throw away boxes that have detected a class with a score less than the threshold
- Non-max suppression: Compute the Intersection over Union and avoid selecting overlapping boxes
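
The two filtering steps above can be sketched in plain Python. This is a minimal illustration, not the code in car_detection_yolo.py: it assumes boxes are given as (x1, y1, x2, y2) corners, that a box's score is p_c multiplied by its highest class probability, and that suppression ignores class labels. The names `box_iou`, `filter_by_score`, and `non_max_suppression` are hypothetical.

```python
def box_iou(box1, box2):
    """Intersection over Union of two (x1, y1, x2, y2) boxes."""
    xi1, yi1 = max(box1[0], box2[0]), max(box1[1], box2[1])
    xi2, yi2 = min(box1[2], box2[2]), min(box1[3], box2[3])
    inter = max(xi2 - xi1, 0.0) * max(yi2 - yi1, 0.0)
    area1 = (box1[2] - box1[0]) * (box1[3] - box1[1])
    area2 = (box2[2] - box2[0]) * (box2[3] - box2[1])
    union = area1 + area2 - inter
    return inter / union if union > 0 else 0.0


def filter_by_score(detections, threshold=0.6):
    """Keep (score, box, class_id) for detections scoring at least `threshold`.

    Each detection is a (p_c, box, class_probs) tuple.
    """
    kept = []
    for p_c, box, class_probs in detections:
        class_id = max(range(len(class_probs)), key=class_probs.__getitem__)
        score = p_c * class_probs[class_id]  # assumed score definition
        if score >= threshold:
            kept.append((score, box, class_id))
    return kept


def non_max_suppression(candidates, max_boxes=10, iou_threshold=0.5):
    """Greedily keep the best-scoring boxes, dropping heavy overlaps."""
    remaining = sorted(candidates, key=lambda c: c[0], reverse=True)
    selected = []
    while remaining and len(selected) < max_boxes:
        best = remaining.pop(0)
        selected.append(best)
        # Discard every box that overlaps the chosen one too much.
        remaining = [
            c for c in remaining if box_iou(best[1], c[1]) <= iou_threshold
        ]
    return selected


# Made-up detections with two classes, for illustration only.
detections = [
    (0.9, (0, 0, 10, 10), [0.1, 0.9]),      # strong detection
    (0.8, (1, 1, 11, 11), [0.05, 0.95]),    # overlaps the first one heavily
    (0.9, (20, 20, 30, 30), [0.8, 0.2]),    # separate object
    (0.3, (40, 40, 50, 50), [0.5, 0.5]),    # low score, thresholded away
]
kept = filter_by_score(detections)
final = non_max_suppression(kept)
print([box for _, box, _ in final])  # [(0, 0, 10, 10), (20, 20, 30, 30)]
```

In this example the second box is removed by non-max suppression because its IoU with the first is 81/119 (about 0.68), above the 0.5 threshold, while the fourth box never reaches suppression because its score of 0.15 is below 0.6.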

    # car_detection_yolo.py
    def yolo_filter_boxes(box_confidence, boxes, box_class_probs, threshold = .6)

    # car_detection_yolo.py
    def iou(box1, box2)
    def yolo_non_max_suppression(scores, boxes, classes, max_boxes = 10, iou_threshold = 0.5)

This gives you YOLO's final output:

