Fix inconsistency that could cause the optimization algorithm to oscillate #228
Open
fumoboy007 wants to merge 1 commit into cjlin1:master
…llate. Fixes cjlin1#225.

# Background

The optimization algorithm has three main calculations:

1. Select the working set `{i, j}` that yields the largest decrease in the objective function.
2. Change `alpha[i]` and `alpha[j]` to minimize the objective function over the working set while respecting the constraints.
3. Update the gradient of the objective function according to the changes to `alpha[i]` and `alpha[j]`.

All three calculations use the matrix `Q`, which is represented by the `QMatrix` class. The `QMatrix` class has two main methods:

- `get_Q`, which returns an array of values for a single column of the matrix; and
- `get_QD`, which returns an array of diagonal values.

# Problem

`Q` values are of type `Qfloat` while `QD` values are of type `double`. `Qfloat` is currently defined as `float`, so the diagonal values returned by `get_Q` and `get_QD` can disagree. For example, in cjlin1#225, one of the diagonal values is `181.05748749793070829` as `double` and `180.99411909539512067` as `float`.

The first two calculations of the optimization algorithm read the diagonal values via `get_QD`. The third calculation, however, reads them via `get_Q`. This inconsistency between the minimization calculations and the gradient update can cause the optimization algorithm to oscillate, as demonstrated by cjlin1#225.

# Solution

We change `get_Q` to return a new class called `QColumn` instead of a plain array of values. The `QColumn` class overloads the subscript operator, so accessing individual elements looks the same as before. Internally, though, the `QColumn` class returns the `QD` value when the diagonal element is accessed. This guarantees that all calculations use the same values for the diagonal elements, eliminating the inconsistency.

# Alternatives Considered

Alternatively, we could change `Qfloat` to be defined as `double`. This would also eliminate the inconsistency; however, each cached value would take twice the memory, so the cache capacity would be cut in half.
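
The `QColumn` idea from the Solution section can be illustrated with a minimal sketch. This is not the code from the commit: the constructor signature, the member names, and the choice of `double` as the return type of the subscript operator are assumptions made for illustration. Only the names `QColumn`, `Qfloat`, and `QD` come from the description above.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef float Qfloat;  // per the description, Q values are currently stored as float

// Hypothetical sketch of a column view over cached Q values.
// The real class in the pull request may be laid out differently.
class QColumn {
public:
    // values:   cached Qfloat entries for one column of Q
    // column:   index of this column, which is also the row of its diagonal entry
    // diagonal: the double-precision QD value for this column
    QColumn(const Qfloat* values, int column, double diagonal)
        : values_(values), column_(column), diagonal_(diagonal) {
        assert(values != nullptr);
        assert(column >= 0);
    }

    // Reads look like plain array access. The diagonal entry is served from
    // the QD value; every other entry is the cached Qfloat widened to double.
    double operator[](int row) const {
        return row == column_ ? diagonal_ : static_cast<double>(values_[row]);
    }

private:
    const Qfloat* values_;
    int column_;
    double diagonal_;
};
```

With a wrapper along these lines, a gradient update written as `G[k] += Q_i[k] * delta_alpha_i` would see the same diagonal value that the working set selection and the `alpha` update read through `get_QD`, without widening the cached off-diagonal storage.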

# Future Changes

The Java code will be updated similarly in a separate commit.

fumoboy007 added a commit to fumoboy007/scikit-learn that referenced this pull request on Dec 21, 2024:
…to oscillate. See more details in the upstream pull request: cjlin1/libsvm#228.

fumoboy007 added a commit to fumoboy007/scikit-learn that referenced this pull request on Dec 21, 2024:
…to oscillate. See more details in the upstream pull request: cjlin1/libsvm#228. Fixes scikit-learn#30353.