Because MediaPipe only gives you a 2D estimate based on the distance from your camera, and it only detects the landmarks of the hand, it does not have the information needed to calculate the orientation (roll, pitch, and yaw).
For that, I think it also needs the arm landmarks, so that the two sets of information can be combined to calculate roll, pitch, and yaw.
The current program is not robust or accurate enough to be compared.
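To illustrate the idea of combining arm and hand landmarks, here is a minimal sketch in plain Python. It is not taken from the current program: the function name `estimate_roll_pitch_yaw`, the choice of landmarks (elbow, wrist, index-finger base, pinky base), and the ZYX angle convention are all my own assumptions. It also assumes that all four points are already expressed as (x, y, z) in one common 3D coordinate frame, which is the hard part in practice. The forearm (elbow to wrist) gives the pointing direction for yaw and pitch, and the line across the knuckles gives the roll around it.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    length = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if length == 0:
        raise ValueError("degenerate landmarks: zero-length vector")
    return (v[0] / length, v[1] / length, v[2] / length)

def estimate_roll_pitch_yaw(elbow, wrist, index_mcp, pinky_mcp):
    """Hypothetical helper: returns (roll, pitch, yaw) in degrees.

    Assumes every point is an (x, y, z) tuple in the same 3D frame.
    Angles follow the ZYX convention R = Rz(yaw) * Ry(pitch) * Rx(roll).
    """
    # Forearm direction: the "forward" axis of the hand frame.
    forward = _unit(_sub(wrist, elbow))
    # Vector across the knuckles, from pinky base to index base.
    across = _sub(index_mcp, pinky_mcp)
    # Palm normal is perpendicular to both forward and across.
    normal = _unit(_cross(forward, across))
    # Re-derive the side axis so the three axes are orthonormal.
    side = _cross(normal, forward)

    # The rotation matrix has columns (forward, side, normal).
    yaw = math.atan2(forward[1], forward[0])
    pitch = -math.asin(max(-1.0, min(1.0, forward[2])))
    roll = math.atan2(side[2], normal[2])
    return tuple(math.degrees(a) for a in (roll, pitch, yaw))

# Example: forearm along +x, knuckles along +y, so no rotation at all.
angles = estimate_roll_pitch_yaw(
    elbow=(0.0, 0.0, 0.0),
    wrist=(1.0, 0.0, 0.0),
    index_mcp=(1.5, 0.5, 0.0),
    pinky_mcp=(1.5, -0.5, 0.0),
)
print(angles)
```

With noisy landmarks or an unreliable depth value the angles will jitter, so this only shows the geometry and says nothing about how robust the result would be.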