Conversation


@lijun99 lijun99 commented Dec 11, 2025

This PR improves the deramp procedure in pycuampcor/dense_offset to address the issue of a (linear) phase ramp along azimuth due to the Doppler centroids.

  1. fix the pixelIdx error (thanks to @bhawkins)
  2. use double precision for the phase correction. Using float appears to introduce spectral noise that breaks the band-limited structure and therefore lowers the correlation.
  3. add a derampAxis parameter to select the deramp direction: 0 = azimuth, 1 = range, 2 = both axes (default). In the cfg, use deramp_axis = "azimuth", "range", or None to set this parameter.
  4. move cuDeramp from the oversampler to the chunk processor, making the code more readable

@xhuang-jpl (Contributor)

Thanks @lijun99 and @bhawkins, I will run a test.

@xhuang-jpl (Contributor) left a comment


I have tested this PR on both CPU and GPU; both show a significant improvement of the correlation surface peak after the ramp phase estimation bug fix. Thanks again @lijun99!

@bhawkins (Contributor) left a comment


LGTM. I see the deramp phase calculations are double precision, and the axes are consistent between the estimation and removal steps.

I found a few things that could be tweaked but no blocking issues.

@hfattahi (Contributor)

@lijun99 would you please sync back to the develop branch? There are some GDAL-related failures unrelated to this PR that have hopefully already been fixed in develop.


lijun99 commented Jan 15, 2026

> LGTM. I see the deramp phase calculations are double precision, and the axes are consistent between the estimation and removal steps.
>
> I found a few things that could be tweaked but no blocking issues.

Thank you, @bhawkins!


lijun99 commented Jan 15, 2026

> @lijun99 would you please sync back to the develop branch? There are some GDAL-related failures unrelated to this PR that have hopefully already been fixed in develop.

Are you referring to the GDAL mmap driver crash when the file size is huge (exceeding the memory size)? Maybe we should resolve this in a different PR. My fix was a home-cooked mmap cache manager. Maybe I can take a look at the H5 Raster driver to see whether I can patch it there.

BTW, what is the decision on PyCuAmpcor V2 #76, with the two workflows? I just hope not to diverge the code development.

@hfattahi (Contributor)

> Are you referring to the GDAL mmap driver crash when the file size is huge (exceeding the memory size)? Maybe we should resolve this in a different PR. My fix was a home-cooked mmap cache manager. Maybe I can take a look at the H5 Raster driver to see whether I can patch it there.
>
> BTW, what is the decision on PyCuAmpcor V2 #76, with the two workflows? I just hope not to diverge the code development.

No, I was referring to this PR, https://github.com/isce-framework/isce3/pull/196/files, which is already merged.
For PyCuAmpcor V2, let's discuss separately.


lijun99 commented Jan 15, 2026

> No, I was referring to this PR, https://github.com/isce-framework/isce3/pull/196/files, which is already merged. For PyCuAmpcor V2, let's discuss separately.

I am not sure what types of files are fed into pycuampcor, SLC or H5? @xhuang-jpl, could you give me some details?

@hfattahi (Contributor)

> I am not sure what types of files are fed into pycuampcor, SLC or H5? @xhuang-jpl, could you give me some details?

As I said, it was not related to this PR; geocodeCov was failing. Syncing with develop should hopefully solve it.


lijun99 commented Jan 15, 2026

I see. develop merged.

I have tried Linux with CUDA on my machine and it compiles without issue. However, the CI tests seem to fail, probably due to different results. Could someone help to update them? Thanks! @bhawkins @xhuang-jpl

@hfattahi (Contributor)

@lijun99 would you please take a look at this unit test (https://github.com/isce-framework/isce3/blob/develop/tests/python/packages/isce3/matchtemplate/test_ampcor.py)? With this PR the test is failing with the following message:

```
Processing chunks (10, x) - (10, x) out of 13
Processing chunks (11, x) - (11, x) out of 13
Processing chunks (12, x) - (12, x) out of 13
Processing chunks (13, x) - (13, x) out of 13
comparing covariance
meandiff 6.768423e-10
comparing dense_offsets
got 0.015625 but expected -0.109375 diff is 0.125
at index 18
=========================== short test summary info ============================
FAILED ../../../../../../tests/python/packages/isce3/matchtemplate/test_ampcor.py::test_ampcor - assert False
```

I wonder if the test also fails for you?


lijun99 commented Jan 15, 2026

I guess we need to update the golden data. I did try to update it, and it seems that the CPU and GPU results differ in correlation_peak by a max diff of 0.015. I will do more tests tomorrow to figure out whether this can be reconciled.


Tyler-g-hudson commented Jan 15, 2026

@lijun99 please also rebase onto the latest develop commit to fix the build-and-run test failures you're getting

EDIT: Never mind, I misunderstood the problem

@hfattahi hfattahi removed this from the R05.00.8 milestone Jan 15, 2026
bhawkins and others added 9 commits January 15, 2026 11:06
 1) fix the pixelIdx error (thanks to @bhawkins)
 2) use double precision for phase correction
 3) add derampAxis parameter for options to deramp along certain
    direction
 4) move cuDeramp from oversampler to chunk processor; making the code more readable
from @bhawkins :
  - commenting and value checks on deramp_method and deramp_axis
  - use explicit sincos for potential performance boost

Co-authored-by: Brian Hawkins <1729052+bhawkins@users.noreply.github.com>

lijun99 commented Jan 15, 2026

I did a rebase instead of a merge from the current develop branch, which is what the force-push above is about.

I checked the ampcor test and updated the dense_offset and correlation_peak golden data. The old dense_offset results were obtained from the faulty procedure and deviate from the new results by at most 0.2 pixel. From debug runs, I checked the detailed outputs of each procedure, and I think the new results are more reasonable.

CPU and GPU results are consistent, with only tiny differences: ~0.002 pixel in offset and ~0.002 in peak values. The test also compares the fft and sinc oversamplers (for the correlation surface), and there is a ~0.013 difference in peak values. I think this is within reasonable numerical error between the fft and sinc interpolation methods, and I therefore adjusted the peak_value tolerance to 2e-2. Now the ampcor test passes.

Please double check.

@hfattahi (Contributor)

> I did a rebase instead of a merge from the current develop branch, which is what the force-push above is about.
>
> I checked the ampcor test and updated the dense_offset and correlation_peak golden data. The old dense_offset results were obtained from the faulty procedure and deviate from the new results by at most 0.2 pixel. From debug runs, I checked the detailed outputs of each procedure, and I think the new results are more reasonable.
>
> CPU and GPU results are consistent, with only tiny differences: ~0.002 pixel in offset and ~0.002 in peak values. The test also compares the fft and sinc oversamplers (for the correlation surface), and there is a ~0.013 difference in peak values. I think this is within reasonable numerical error between the fft and sinc interpolation methods, and I therefore adjusted the peak_value tolerance to 2e-2. Now the ampcor test passes.
>
> Please double check.

Thanks @lijun99. Let me run the tests again, and afterwards I will look at your test modifications.


lijun99 commented Jan 15, 2026

It looks like macOS doesn't recognize sincos. Let me work out a solution.

@bhawkins (Contributor)

> It looks like macOS doesn't recognize sincos. Let me work out a solution.

Yeah, sincos isn't part of the C or C++ standards, so I wouldn't recommend including it in the CPU code. (It is part of the CUDA API, so fair game there.) Fortunately most C++ compilers support it as an intrinsic and are smart enough to use it. For example, both GCC and Clang do the transformation when you compile with -ffast-math (godbolt).


lijun99 commented Jan 15, 2026

> > It looks like macOS doesn't recognize sincos. Let me work out a solution.
>
> Yeah, sincos isn't part of the C or C++ standards, so I wouldn't recommend including it in the CPU code. (It is part of the CUDA API, so fair game there.) Fortunately most C++ compilers support it as an intrinsic and are smart enough to use it. For example, both GCC and Clang do the transformation when you compile with -ffast-math (godbolt).

I tried __sincos on macOS and it indeed works. So, which do you prefer: an APPLE check for __sincos, as I just pushed, or separate sin/cos lines?

@bhawkins (Contributor)

> I tried __sincos on macOS and it indeed works. So, which do you prefer: an APPLE check for __sincos, as I just pushed, or separate sin/cos lines?

That's good to know. However, isce3 is currently aiming for C++17 language standard compliance, so I'd prefer to avoid explicit sincos and rely on the compiler to optimize it.

