Fix inconsistency that could cause the optimization algorithm to oscillate by fumoboy007 · Pull Request #228 · cjlin1/libsvm

fumoboy007 · 2024-12-21T02:56:51Z

Fixes #225.

Background

The optimization algorithm has three main calculations:

Select the working set {i, j} that maximizes the decrease in the objective function.
Change alpha[i] and alpha[j] to maximize that decrease while respecting constraints.
Update the gradient of the objective function according to the changes to alpha[i] and alpha[j].

All three calculations make use of the matrix Q, which is represented by the QMatrix class. The QMatrix class has two main methods:

get_Q, which returns an array of values for a single column of the matrix; and
get_QD, which returns an array of diagonal values.
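To make the two access paths concrete, here is a minimal sketch of a matrix that exposes both methods. ToyQMatrix, its constructor, and the simplified get_Q signature are assumptions for illustration only; the real QMatrix class has additional members and different signatures.

```cpp
#include <cassert>
#include <vector>

typedef float Qfloat;  // matches the definition described in this PR

// Hypothetical, simplified stand-in for a QMatrix implementation.
class ToyQMatrix {
public:
    explicit ToyQMatrix(const std::vector<std::vector<double>>& dense) {
        const int n = static_cast<int>(dense.size());
        columns_.resize(n);
        for (int col = 0; col < n; ++col) {
            for (int row = 0; row < n; ++row) {
                // Every column entry, including the diagonal one, is narrowed to Qfloat.
                columns_[col].push_back(static_cast<Qfloat>(dense[row][col]));
            }
            // The diagonal is also kept separately at double precision.
            diagonal_.push_back(dense[col][col]);
        }
    }

    // Returns the values of a single column.
    const Qfloat* get_Q(int column) const { return columns_[column].data(); }

    // Returns the diagonal values.
    const double* get_QD() const { return diagonal_.data(); }

private:
    std::vector<std::vector<Qfloat>> columns_;
    std::vector<double> diagonal_;
};
```

Note that a diagonal element can be read either as get_Q(i)[i] or as get_QD()[i].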

Problem

Q values are of type Qfloat while QD values are of type double. Qfloat is currently defined as float, so there can be inconsistency in the diagonal values returned by get_Q and get_QD. For example, in #225, one of the diagonal values is 181.05748749793070829 as double and 180.99411909539512067 as float.
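As a small illustration of how the two types can disagree, the following sketch measures the error introduced by storing a double as Qfloat. The helper narrowing_error is hypothetical, and the sketch only shows ordinary rounding; it does not attempt to reproduce the specific values reported in #225.

```cpp
#include <cassert>
#include <cmath>

typedef float Qfloat;  // matches the definition described in this PR

// Hypothetical helper: absolute difference between a double and the same
// value after a round trip through Qfloat.
double narrowing_error(double value) {
    const Qfloat narrowed = static_cast<Qfloat>(value);
    return std::fabs(static_cast<double>(narrowed) - value);
}
```

A value such as 0.5 is exactly representable in both types and survives unchanged, while a value such as 0.1 does not.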

The first two calculations of the optimization algorithm access the diagonal values via get_QD. However, the third calculation accesses the diagonal values via get_Q. This inconsistency between the minimization calculations and the gradient update can cause the optimization algorithm to oscillate, as demonstrated by #225.
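The gradient update in the third calculation can be sketched roughly as follows. The function name and the use of std::vector are assumptions for illustration; the point is that when k equals i (or j), the loop reads the diagonal element out of the column data rather than out of QD.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

typedef float Qfloat;  // matches the definition described in this PR

// Hypothetical helper: applies the change in alpha[i] and alpha[j] to the gradient,
// given column i and column j of Q.
void update_gradient(std::vector<double>& gradient,
                     const std::vector<Qfloat>& q_i,
                     const std::vector<Qfloat>& q_j,
                     double delta_alpha_i,
                     double delta_alpha_j) {
    for (std::size_t k = 0; k < gradient.size(); ++k) {
        // For k == i, q_i[k] is the diagonal element Q[i][i] as stored in the column.
        gradient[k] += q_i[k] * delta_alpha_i + q_j[k] * delta_alpha_j;
    }
}
```

If the step size for alpha[i] and alpha[j] was derived from the QD diagonal, but this loop applies a slightly different diagonal, the updated gradient no longer matches the step that was taken.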

Solution

We change get_Q to return a new class called QColumn instead of a plain array of values. The QColumn class overloads the subscript operator, so accessing individual elements is the same as before. Internally though, the QColumn class will return the QD value when the diagonal element is accessed. This guarantees that all calculations are using the same values for the diagonal elements, eliminating the inconsistency.
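A minimal sketch of the idea, assuming a constructor that takes the column data, the index of the diagonal element, and its QD value. These member names and the double return type are assumptions; the class in the actual change may be organized differently.

```cpp
#include <cassert>

typedef float Qfloat;  // matches the definition described in this PR

// Hypothetical sketch of a column view that substitutes the QD value
// whenever the diagonal element is accessed.
class QColumn {
public:
    QColumn(const Qfloat* values, int diagonal_index, double diagonal_value)
        : values_(values),
          diagonal_index_(diagonal_index),
          diagonal_value_(diagonal_value) {}

    // Element access looks the same as with a plain array.
    double operator[](int index) const {
        if (index == diagonal_index_) {
            return diagonal_value_;  // full-precision value from QD
        }
        return values_[index];       // Qfloat value widened to double
    }

private:
    const Qfloat* values_;
    int diagonal_index_;
    double diagonal_value_;
};
```

Callers that previously wrote Q_i[k] keep the same syntax, so the existing loops would not need to change under this design.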

Alternatives Considered

Alternatively, we could change Qfloat to be defined as double. This would also eliminate the inconsistency; however, each cached value would then take twice the memory, so the cache would hold half as many values.
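The cost of that alternative is simple arithmetic. The sketch below assumes 4-byte float and 8-byte double, which is typical but not guaranteed, and uses an illustrative 100 MB budget; cache_capacity is a hypothetical helper.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: number of values of a given size that fit in a byte budget.
constexpr std::size_t cache_capacity(std::size_t budget_bytes, std::size_t element_size) {
    return budget_bytes / element_size;
}

// Illustrative budget of 100 MB.
constexpr std::size_t kBudgetBytes = 100u * 1024u * 1024u;

// With 4-byte elements the budget holds 26214400 values;
// with 8-byte elements it holds 13107200 values.
constexpr std::size_t kFloatCapacity = cache_capacity(kBudgetBytes, 4);
constexpr std::size_t kDoubleCapacity = cache_capacity(kBudgetBytes, 8);
```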

Future Changes

The Java code will be updated similarly in a separate commit.


fumoboy007 mentioned this pull request Dec 21, 2024

Training gets stuck on a specific dataset #225

Closed

fumoboy007 added a commit to fumoboy007/scikit-learn that referenced this pull request Dec 21, 2024

Fix inconsistency that could cause the LIBSVM optimization algorithm …

9f4c16d

…to oscillate. See more details in the upstream pull request: cjlin1/libsvm#228.

fumoboy007 added a commit to fumoboy007/scikit-learn that referenced this pull request Dec 21, 2024

Fix inconsistency that could cause the LIBSVM optimization algorithm …

960c546

…to oscillate. See more details in the upstream pull request: cjlin1/libsvm#228. Fixes scikit-learn#30353.

fumoboy007 wants to merge 1 commit into cjlin1:master from fumoboy007:oscillation_fix
