The D-Cashier project is a smart, voice-controlled automated checkout system designed for retail environments such as convenience stores or unmanned kiosks. It integrates object detection, voice interface, and robotic manipulation to streamline the checkout process.
This system enables users to interact entirely through voice, while products are recognized and processed using a YOLOv11n-OBB-based vision module. Unrecognized items are automatically handled via a “Cancel Position,” and restricted goods are verified through OCR and face recognition.
👉 Click the thumbnail above to watch the demo video on YouTube!
🌀 Multi-frame Object Detection + Rotation Estimation
→ Implemented a custom post-processing algorithm for YOLOv11n-OBB
→ Achieved a ±3° yaw error margin

⏱ Voice Interface with Real-Time GUI + TTS Feedback
→ System response time maintained under 1 second

❌ “Cancel Position” Handling for Undetected Items
→ Reduced false detection issues by over 40%
① YOLOv11n-OBB Object Detection
Detects objects and estimates their 3D position and orientation `[x, y, z, yaw]` using oriented bounding boxes.
Polygon vertices are averaged across frames to improve yaw estimation.
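The averaging step above can be sketched as follows. This is a minimal illustration (the function name and corner convention are assumptions, not the repo's actual code), taking each frame's OBB as a `(4, 2)` corner array in a consistent order and deriving yaw from the longer edge of the averaged polygon:

```python
import numpy as np

def average_obb_yaw(polygons):
    """Estimate yaw from OBB corners accumulated over several frames.

    `polygons` is a list of (4, 2) arrays, one per frame, with corners in a
    consistent order. Corners are averaged element-wise across frames, then
    yaw is taken from the longer edge of the mean box.
    """
    mean_poly = np.mean(np.stack(polygons), axis=0)  # (4, 2)
    # Two adjacent edges of the averaged polygon
    e1 = mean_poly[1] - mean_poly[0]
    e2 = mean_poly[2] - mean_poly[1]
    # Use the longer side as the orientation axis
    edge = e1 if np.linalg.norm(e1) >= np.linalg.norm(e2) else e2
    yaw = np.degrees(np.arctan2(edge[1], edge[0]))
    # Normalize to [-90, 90) since a box is symmetric under 180° rotation
    if yaw >= 90:
        yaw -= 180
    elif yaw < -90:
        yaw += 180
    return mean_poly, yaw
```

Averaging the vertices (rather than the per-frame yaw angles) sidesteps angle-wraparound issues and suppresses per-frame corner jitter.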
② Background Subtraction + Cancel Position Handling
If YOLO fails to detect an object, the system compares the current frame with a pre-stored background image to locate unexpected items.
Detected unknown objects are moved to a Cancel Position to prevent false charges.
③ Adult Verification (19+ Restricted Items)
When a restricted item is detected (e.g., alcohol, cigarettes), the system:
- Uses OCR to extract the birth date from a captured ID card
- Matches the face on the ID with the user’s face in front of the camera
- Grants or denies approval based on age + match score
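The approval decision can be sketched as below. The function name and the 0.6 match threshold are illustrative assumptions; only the 19+ age rule comes from the project description:

```python
from datetime import date

def verify_adult(birth_date, match_score,
                 min_age=19, min_match=0.6, today=None):
    """Approve a restricted-item purchase only if the ID holder is of age
    AND the ID photo matches the live face.

    `birth_date` comes from OCR on the captured ID card; `match_score`
    (0.0-1.0) from the face-recognition comparison. The 0.6 threshold is
    a placeholder, not the project's tuned value.
    """
    today = today or date.today()
    # Subtract one year if this year's birthday hasn't happened yet
    age = today.year - birth_date.year - (
        (today.month, today.day) < (birth_date.month, birth_date.day))
    return age >= min_age and match_score >= min_match
```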
④ Voice-Controlled Interface with GUI Feedback
- Wake-up word detection: "Hello Rokey"
- Natural language input via OpenAI Whisper
- Intent parsing via LangChain + GPT-4o
- Real-time GUI update + TTS output using OpenAI voice
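A rough sketch of the front of this pipeline. The wake-word gate operates on the Whisper transcript; the keyword-based `parse_intent` here is a stand-in for the LangChain + GPT-4o parser, shown only to illustrate the kind of intent schema the GUI and robot nodes would consume (all names are hypothetical):

```python
import re

WAKE_WORD = "hello rokey"

def heard_wake_word(transcript: str) -> bool:
    """True if the wake-up word appears in a Whisper transcript."""
    return WAKE_WORD in transcript.lower()

def parse_intent(utterance: str) -> dict:
    """Keyword-based stand-in for the LangChain + GPT-4o intent parser.

    The real system sends the utterance to GPT-4o; this fallback only
    demonstrates the intent dictionary downstream nodes might receive.
    """
    text = utterance.lower()
    if "stop" in text:
        return {"intent": "stop"}
    m = re.search(r"remove (\w+)", text)
    if m:
        return {"intent": "remove_item", "item": m.group(1)}
    if "checkout" in text or "pay" in text:
        return {"intent": "checkout"}
    return {"intent": "unknown"}
```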
Pose Conversion
- Converts YOLO’s `[x, y, z, yaw]` to robot base coordinates `[x, y, z, rx, ry, rz]`
- Uses `T_gripper2camera.npy` and external calibration parameters for an accurate transform
- Adjusts gripper width based on object size (e.g., `min_side × 10 - 50`)
- Pick action is executed with Doosan’s `movel()` API
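The chain of transforms can be sketched with homogeneous matrices. This is an assumption-laden illustration: the function names are hypothetical, and the fixed `rx`/`ry` values stand in for a top-down grasp rather than the project's actual orientation handling; only the hand-eye matrix file and the `min_side × 10 - 50` heuristic come from the repo:

```python
import numpy as np

def yolo_to_base_pose(x, y, z, yaw_deg, T_base2gripper, T_gripper2camera):
    """Convert a camera-frame detection [x, y, z, yaw] into a robot-base
    pose [x, y, z, rx, ry, rz].

    T_base2gripper is the current gripper pose reported by the robot;
    T_gripper2camera is the hand-eye calibration loaded from
    T_gripper2camera.npy. Chaining them maps camera points to the base.
    """
    T_base2camera = T_base2gripper @ T_gripper2camera
    p_base = T_base2camera @ np.array([x, y, z, 1.0])
    # Placeholder top-down orientation: only rz tracks the object yaw
    return [p_base[0], p_base[1], p_base[2], 0.0, 180.0, yaw_deg]

def gripper_width(min_side):
    """Gripper opening from the OBB's shorter side (min_side × 10 − 50)."""
    return min_side * 10 - 50
```

The resulting pose would then be fed to `movel()` for the pick motion.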
Cancel Preemption (Stop & Retry)
- User can say "정지" ("stop") or "Remove [item]" → current goal is canceled
- Robot switches to the cancel pose using a custom CancelObject service
- Uses `MultiThreadedExecutor` to handle cancel requests concurrently with execution
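The preemption pattern can be illustrated without ROS 2. In the real node a `MultiThreadedExecutor` lets the `CancelObject` service callback run while a motion callback is mid-execution; the sketch below replaces the executor and service with plain threads and an event, purely to show the control flow (all names are hypothetical):

```python
import threading
import time

class PickTask:
    """Simplified stand-in for the ROS 2 node: a cancel flag set from one
    thread (the service callback) preempts the motion loop in another."""

    def __init__(self):
        self.cancel_event = threading.Event()
        self.result = None

    def handle_cancel(self):
        # In the real node this is the CancelObject service callback.
        self.cancel_event.set()

    def execute_pick(self, steps=100):
        for _ in range(steps):
            if self.cancel_event.is_set():
                self.result = "moved_to_cancel_pose"  # switch to cancel pose
                return
            time.sleep(0.01)                          # one motion increment
        self.result = "pick_done"
```

Without concurrent callback handling (e.g., a single-threaded executor), the cancel request would queue behind the running motion callback and only take effect after the pick finished.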
Force-Sensitive Grasping
- Grasp failure is detected when the `|Fz|` force remains unchanged after closure
- For fragile items (bottles, cans), compliance control is used:
  - Applies downward force (e.g., 15 N along the Z-axis)
  - Releases when force drops below a threshold (e.g., < 10 N)
- Logs all force values and errors for safety validation
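The force logic above can be sketched as follows. `read_fz` and `release` are hypothetical callbacks standing in for the robot's force-reading and gripper APIs; the 15 N / 10 N values mirror the examples quoted above:

```python
def detect_grasp_failure(fz_before, fz_after, delta=1.0):
    """Grasp failed if |Fz| barely changes after the gripper closes."""
    return abs(abs(fz_after) - abs(fz_before)) < delta

def compliant_place(read_fz, release, target_force=15.0,
                    release_below=10.0, max_steps=1000):
    """Press down until the target contact force is reached, then open the
    gripper once |Fz| drops back below the release threshold (i.e., the
    object's weight has transferred to the surface)."""
    contacted = False
    for _ in range(max_steps):
        fz = abs(read_fz())
        if fz >= target_force:
            contacted = True          # pressing with the target force
        elif contacted and fz < release_below:
            release()                 # load transferred; let go
            return True
    return False                      # never reached a stable place
```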
For a detailed explanation of this project, please refer to the following document:
👉 docs
Thanks to these wonderful people who have contributed to this project:
- weedmo
- jsbae-RL
- DONGHO1206