Description
I am currently trying to factorize a matrix with the MF_Solver using the KL loss function, and I hit NaN values after the first or second training iteration. From small test cases, the problem appears to be large gradients pushing factor entries negative; the sg_update function then converts these to 0, which can leave entire zero rows/columns in my P or Q matrix. This case is not handled: prepare_for_sg_update computes z=0, the resulting NaN values are carried through the calculations, and eventually the whole model is filled with NaNs.
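For illustration, here is a minimal NumPy sketch of the failure mode I believe is happening (the variable names and clipping step are my assumptions about what sg_update does, not the library's actual code):

```python
import numpy as np

# Hypothetical row of the factor matrix P and a large gradient,
# e.g. from a big residual combined with a high learning rate.
P0 = np.array([0.3, 0.8, 0.1, 0.5])
grad = np.array([0.4, 0.9, 0.2, 0.6])
lr = 10.0

P0 = P0 - lr * grad          # every entry is now negative
P0 = np.maximum(P0, 0.0)     # negative values clipped to 0 (as I suspect sg_update does)

# prepare_for_sg_update-style normalizer: z is 0, so dividing by it yields NaN.
z = P0.sum()                 # z == 0.0
with np.errstate(divide="ignore", invalid="ignore"):
    update = P0 / z          # 0/0 -> NaN in every entry
print(np.isnan(update).all())  # True: NaNs now propagate through the model
```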
Can the algorithm be adjusted to handle such cases (for example, when computing 1/z, one might use 1/(z+epsilon) with epsilon>0), or should my parameters (especially the learning rate) be tuned differently?
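Here is a sketch of the two workarounds I have in mind; epsilon is an arbitrary small constant I chose for illustration, not a value the library defines:

```python
import numpy as np

eps = 1e-10  # hypothetical small constant

# Option 1: stabilize the reciprocal of the normalizer.
z = 0.0                     # degenerate value produced by an all-zero row
safe_inv = 1.0 / (z + eps)  # finite (though large) instead of inf/NaN
print(np.isfinite(safe_inv))  # True

# Option 2: clip to eps instead of 0 in the non-negativity step,
# so a row stays strictly positive and can recover in later iterations.
row = np.array([-0.2, 0.0, 0.7])
row = np.maximum(row, eps)
print((row > 0).all())        # True
```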
Do you have any other ideas about what might be causing NaN values during training?