More robust fixes for numerical issues #229
Open
fumoboy007 wants to merge 2 commits into cjlin1:master
Commit 1: Fix inconsistency that could cause the optimization algorithm to oscillate.
Fixes #225.
# Background

The optimization algorithm has three main calculations:

1. Select the working set `{i, j}` that [maximizes](https://github.com/cjlin1/libsvm/blob/35e55962f7f03ce425bada0e6b9db79193e947f8/svm.cpp#L829-L879) the decrease in the objective function.
2. Change `alpha[i]` and `alpha[j]` to [maximize](https://github.com/cjlin1/libsvm/blob/35e55962f7f03ce425bada0e6b9db79193e947f8/svm.cpp#L606-L691) that decrease while respecting the constraints.
3. [Update](https://github.com/cjlin1/libsvm/blob/35e55962f7f03ce425bada0e6b9db79193e947f8/svm.cpp#L698-L701) the gradient of the objective function according to the changes to `alpha[i]` and `alpha[j]`.

All three calculations make use of the matrix `Q`, which is represented by the `QMatrix` [class](https://github.com/cjlin1/libsvm/blob/35e55962f7f03ce425bada0e6b9db79193e947f8/svm.cpp#L198). The `QMatrix` class has two main methods:

- `get_Q`, which returns an array of values for a single column of the matrix; and
- `get_QD`, which returns an array of diagonal values.

# Problem

`Q` values are of type `Qfloat` while `QD` values are of type `double`. `Qfloat` is currently [defined](https://github.com/cjlin1/libsvm/blob/35e55962f7f03ce425bada0e6b9db79193e947f8/svm.cpp#L16) as `float`, so the diagonal values returned by `get_Q` and `get_QD` can disagree. For example, in #225, one of the diagonal values is `181.05748749793070829` as `double` and `180.99411909539512067` as `float`.

The first two calculations of the optimization algorithm read the diagonal values via `get_QD`. The third calculation, however, reads them via `get_Q`. This inconsistency between the minimization calculations and the gradient update can cause the optimization algorithm to oscillate, as demonstrated by #225.

# Solution

We change the type of `QD` values from `double` to `Qfloat`. This guarantees that all calculations use the same values for the diagonal elements, eliminating the inconsistency.

Note that this reverts the past commit 1c80a42. That commit changed the type of `QD` values from `Qfloat` to `double` to address a numerical issue. In a follow-up commit, we will allow `Qfloat` to be defined as `double` at runtime as a more general fix for numerical issues.

# Future Changes
The Java code will be updated similarly in a separate commit.
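To make the Problem section concrete, here is a minimal C++ sketch of a single two-variable step in which the step size reads the diagonal from one source and the gradient update reads it from another. It is a toy model, not code from `svm.cpp`: it assumes equal labels and no box constraints, and `ToyPair`, `gradient_gap_after_step`, and the two helper functions are hypothetical names. The only values taken from this PR are the two diagonal readings reported in #225.

```cpp
#include <cmath>

// Toy 2x2 problem. All names are hypothetical and not taken from svm.cpp.
struct ToyPair {
    double q_ii;   // diagonal entry i as read from a column (get_Q role)
    double q_jj;   // diagonal entry j as read from a column (get_Q role)
    double q_ij;   // off-diagonal entry
    double qd_ii;  // diagonal entry i as read from the diagonal array (get_QD role)
    double qd_jj;  // diagonal entry j as read from the diagonal array (get_QD role)
};

// Takes one unconstrained two-variable step and returns the gradient gap
// G[i] - G[j] left afterwards. With consistent diagonals the gap is ~0.
double gradient_gap_after_step(const ToyPair& p, double g_i, double g_j) {
    // Step size: the curvature term is built from the diagonal array.
    const double quad_coef = p.qd_ii + p.qd_jj - 2.0 * p.q_ij;
    const double delta = (g_i - g_j) / quad_coef;
    const double delta_alpha_i = -delta;
    const double delta_alpha_j = delta;

    // Gradient update: uses column entries, including the diagonal ones.
    g_i += p.q_ii * delta_alpha_i + p.q_ij * delta_alpha_j;
    g_j += p.q_ij * delta_alpha_i + p.q_jj * delta_alpha_j;
    return g_i - g_j;
}

// Diagonal array holds the double reading, column holds the float reading (#225).
double gap_with_mismatched_diagonal() {
    ToyPair p = {180.99411909539512067, 1.0, 0.0, 181.05748749793070829, 1.0};
    return gradient_gap_after_step(p, 1.0, 0.0);
}

// Both accessors return the same reading, as after the change in this commit.
double gap_with_matching_diagonal() {
    ToyPair p = {180.99411909539512067, 1.0, 0.0, 180.99411909539512067, 1.0};
    return gradient_gap_after_step(p, 1.0, 0.0);
}
```

With matching diagonals the step drives `G[i] - G[j]` to zero up to rounding, so the pair is optimal afterwards. With the mismatched readings a gap on the order of `1e-4` remains in this toy setup, which hints at how the same pair could keep being reselected.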
Commit 2: Add a runtime parameter to specify the floating-point precision of kernel values.
This will make it easier for users to try double precision kernel values when they run into numerical issues.
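As a rough illustration of what a runtime precision switch can look like, the sketch below dispatches once on a runtime option to a compile-time storage type. It is not the implementation in this commit: `KernelPrecision`, `store_kernel_value`, and `store_with_precision` are hypothetical names, and the actual parameter added by the PR is not shown here.

```cpp
// Hypothetical runtime option; the parameter name used by the PR may differ.
enum KernelPrecision { KERNEL_SINGLE, KERNEL_DOUBLE };

// Rounds a kernel value to storage type Q, then widens it back to double
// so callers can compare what each precision actually keeps.
template <typename Q>
double store_kernel_value(double value) {
    return static_cast<double>(static_cast<Q>(value));
}

// Dispatches on the runtime option to the matching storage type.
double store_with_precision(KernelPrecision precision, double value) {
    return precision == KERNEL_DOUBLE ? store_kernel_value<double>(value)
                                      : store_kernel_value<float>(value);
}
```

Under this assumption, a value such as `181.1` survives unchanged with `KERNEL_DOUBLE` but is rounded with `KERNEL_SINGLE`, while a value that `float` can represent exactly, such as `0.5`, is the same in both modes.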