Dynamic Indicators from Downscaled CMIP6 #652

Joshdpaul · 2025-10-22T21:57:44Z

This PR adds a dynamic indicators endpoint to the API. The endpoint pulls data from the recently developed downscaled CMIP6 coverages, and is based on preliminary work in this repo.

This PR is currently in draft mode, incomplete for the following reasons:

the coverages are still in flux, subject to change
there is a known conflict between the ANSI timestamps in the coverage and the built-in WCPS query iteration over time (JP can explain more about this. BLUF: we might have to reingest the coverages with the timestamp at hour 0 instead of noon 🙃)
we still need the lat/lon "in extent?" validation (via geotiff or bounding box)
the endpoint still needs CSV output
we need HTML documentation for this endpoint, which in turn requires an academic reference that does not yet exist (this one might take a while to iron out since there is USGS involvement...)

Background:

The goal here is to develop dynamically generated indicators that threshold, rank, or otherwise summarize daily CMIP6 data based on user inputs. The processing is largely done server-side, via WCPS queries. We are only implementing indicators that do not rely on sequence or consecutive day aggregations, because our downscaling methodology does not preserve the order of values.

Our list of possible indicators is therefore:

hd: “Hot day” threshold (degrees C) — the highest observed daily maximum 2 m air temperature such that there are 5 other observations equal to or greater than this value.

cd: “Cold day” threshold (degrees C) — the lowest observed daily minimum 2 m air temperature such that there are 5 other observations equal to or less than this value.

rx1day: Maximum 1–day precipitation (mm)

su: "Summer Days" — Annual number of days with maximum 2 m air temperature above 25 degrees C

dw: "Deep Winter days" — Annual number of days with minimum 2 m air temperature below -30 degrees C

r10mm: Number of days with precipitation > 10 mm

wet days per year: Number of days with precipitation > 1 mm

However, when coding this endpoint it became clear that we have 3 general types of queries that could actually be applied to any variable:

Threshold queries: counting days per year above or below a certain value
Statistical queries: getting min, mean, or max value per year
Ranking queries: finding the nth highest or lowest value per year

So rather than code up specific functions for the existing indicators, I created three granular functions that can be applied in order to recreate those indicators, or develop new ones. My reasoning for this is that not only does it require less code, but it takes into account that allowing user input will necessarily change the definitions of the indicators. For example, if you use 0C as the temperature threshold for the "deep winter days" indicator, are you really talking about "deep winter" anymore? Say you wanted to know how many days were above freezing (to approximate a growing season, perhaps) - the term "summer days" is not a very accurate description.
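As a rough illustration of the three query types, here is a minimal sketch in plain Python over toy data. It is not the WCPS implementation in this PR: the function names, the strict above/below comparison, and the sample values are all assumptions made for the example.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def group_by_year(daily):
    """Group (date, value) pairs into {year: [values]}."""
    by_year = defaultdict(list)
    for day, value in daily:
        by_year[day.year].append(value)
    return dict(by_year)


def count_days(values, operator, threshold):
    """Threshold query: days strictly above or below a value (assumed strict)."""
    if operator == "above":
        return sum(1 for v in values if v > threshold)
    if operator == "below":
        return sum(1 for v in values if v < threshold)
    raise ValueError(f"unknown operator: {operator}")


def stat(values, name):
    """Statistical query: min, mean, max, or sum of one year's values."""
    funcs = {"min": min, "mean": mean, "max": max, "sum": sum}
    return funcs[name](values)


def rank(values, n, direction):
    """Ranking query: the nth highest or lowest value (n starts at 1)."""
    ordered = sorted(values, reverse=(direction == "highest"))
    return ordered[n - 1]


# Toy data: a handful of daily maximum temperatures (degrees C).
daily_tasmax = [
    (date(2000, 6, 1), 24.0),
    (date(2000, 6, 2), 26.5),
    (date(2000, 6, 3), 27.0),
    (date(2001, 6, 1), 25.0),
    (date(2001, 6, 2), 30.0),
]
by_year = group_by_year(daily_tasmax)
summer_days = {y: count_days(v, "above", 25) for y, v in by_year.items()}
hottest = {y: stat(v, "max") for y, v in by_year.items()}
second_hottest = {y: rank(v, 2, "highest") for y, v in by_year.items()}
```

With these pieces, "summer days" is just count_days with "above" and 25, and the "hot day" threshold is rank with n=6 and "highest".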

Note that each function can be used for any coverage variable, allows the use of both metric and imperial units, and provides time slicing options (yearly). It's an open question whether or not we need all this functionality. Note also that the endpoint URL construction is unique when compared to our other endpoints. I think there is something nice about being able to "read" the request URL sort of like a sentence, that describes exactly what you are asking for - but again, this approach is open to discussion, and we may want to make the endpoint less granular.
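For the units option, the conversion itself is small. The sketch below assumes the coverages store temperature in degrees C and precipitation in mm, so imperial inputs are converted before querying and results are converted back afterwards. The helper names are hypothetical and are not the functions in this PR.

```python
def to_metric(value, units):
    """Convert a user-supplied value to the metric unit assumed for storage."""
    if units in ("C", "mm"):
        return value
    if units == "F":
        return (value - 32) * 5 / 9
    if units == "in":
        return value * 25.4
    raise ValueError(f"unsupported units: {units}")


def from_metric(value, units):
    """Convert a metric result back to the units the user asked for."""
    if units in ("C", "mm"):
        return value
    if units == "F":
        return value * 9 / 5 + 32
    if units == "in":
        return value / 25.4
    raise ValueError(f"unsupported units: {units}")


# A 77 F threshold is the same query as a 25 C threshold.
threshold_c = to_metric(77, "F")
```

One design question this raises: count-style results (days per year) need no back-conversion, while stat and rank results do.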

To test:

Start the API as usual, and test out some of the following endpoints. Try swapping "in" for "mm", or "F" for "C" and check out the results. Or try making up a completely new indicator!

Threshold queries

the "summer days" indicator: http://127.0.0.1:5000/dynamic_indicators/count_days/above/25/C/tasmax/64.5/-147.5/2000/2030/
the "deep winter days" indicator: http://127.0.0.1:5000/dynamic_indicators/count_days/below/-30/C/tasmin/64.5/-147.5/2000/2030/
the "days above 10mm precip" indicator: http://127.0.0.1:5000/dynamic_indicators/count_days/above/10/mm/pr/64.5/-147.5/2000/2030/
the "wet days" indicator: http://127.0.0.1:5000/dynamic_indicators/count_days/above/1/mm/pr/64.5/-147.5/2000/2030/

Statistical queries

"maximum one day precip" indicator: http://127.0.0.1:5000/dynamic_indicators/stat/max/pr/mm/64.5/-147.5/2000/2030/
coldest day per year: http://127.0.0.1:5000/dynamic_indicators/stat/min/tasmin/C/64.5/-147.5/2000/2030/
hottest day per year: http://127.0.0.1:5000/dynamic_indicators/stat/max/tasmax/C/64.5/-147.5/2000/2030/
total annual precipitation (we use the "sum" stat, but the "summary" section of the return will show the mean of total annual precip over the year range): http://127.0.0.1:5000/dynamic_indicators/stat/sum/pr/mm/64.5/-147.5/2000/2030/
mean daily precipitation (note that this is not a common statistic for precip: here we are talking about the average amount of precip per day over the year, in other words "how much does it rain per day, on average?" I include this to show that this type of WCPS query can produce unexpected results, since we iterate to gather all values for each year before running the statistical function): http://127.0.0.1:5000/dynamic_indicators/stat/mean/pr/mm/64.5/-147.5/2000/2030/

Ranking queries

the "hot day threshold" indicator: http://127.0.0.1:5000/dynamic_indicators/rank/6/highest/tasmax/64.5/-147.5/2000/2030/
the "cold day threshold" indicator: http://127.0.0.1:5000/dynamic_indicators/rank/6/lowest/tasmin/64.5/-147.5/2000/2030/
10th highest precip day: http://127.0.0.1:5000/dynamic_indicators/rank/10/highest/pr/64.5/-147.5/2000/2030/
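To make the segment order of each URL explicit, here is a small sketch of helper functions that assemble the test URLs above. The helpers are hypothetical conveniences for testing, not part of the PR; the path layouts are copied from the example URLs in this description.

```python
BASE = "http://127.0.0.1:5000/dynamic_indicators"


def count_days_url(operator, threshold, units, variable, lat, lon, start, end):
    """count_days/<operator>/<threshold>/<units>/<variable>/<lat>/<lon>/<start>/<end>/"""
    return (
        f"{BASE}/count_days/{operator}/{threshold}/{units}/{variable}"
        f"/{lat}/{lon}/{start}/{end}/"
    )


def stat_url(stat, variable, units, lat, lon, start, end):
    """stat/<stat>/<variable>/<units>/<lat>/<lon>/<start>/<end>/"""
    return f"{BASE}/stat/{stat}/{variable}/{units}/{lat}/{lon}/{start}/{end}/"


def rank_url(n, direction, variable, lat, lon, start, end):
    """rank/<n>/<direction>/<variable>/<lat>/<lon>/<start>/<end>/"""
    return f"{BASE}/rank/{n}/{direction}/{variable}/{lat}/{lon}/{start}/{end}/"


summer_days_url = count_days_url("above", 25, "C", "tasmax", 64.5, -147.5, 2000, 2030)
rx1day_url = stat_url("max", "pr", "mm", 64.5, -147.5, 2000, 2030)
hot_day_url = rank_url(6, "highest", "tasmax", 64.5, -147.5, 2000, 2030)
```

Note that count_days puts the units before the variable while stat puts them after, and the rank examples carry no units segment at all.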


Joshdpaul · 2025-11-07T19:28:31Z

Update:

Added documentation.

Added area queries to this endpoint. Note that these endpoints do not use the fancy WCPS queries that perform the server-side processing of threshold or statistics functions: it seems to be impossible to iterate over years AND grid cells in the same query.

Instead, we simply query for a NetCDF of the daily data in a bbox defined by the polygon, perform zonal statistics to get an area mean for each day, then compute the indicators from those means. Even so, the processing is fairly quick for something like a HUC10.

http://127.0.0.1:5000/dynamic_indicators/count_days/above/25/C/tasmax/area/1908030609/2000/2030/

http://127.0.0.1:5000/dynamic_indicators/stat/max/pr/mm/area/1908030609/2000/2030

http://127.0.0.1:5000/dynamic_indicators/rank/6/lowest/tasmin/area/1908030609/2000/2030/
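The area approach can be sketched roughly as follows. This is plain Python over tiny nested lists to keep it self-contained; the real code works on a NetCDF and its zonal statistics may weight cells differently. The function names, the unweighted mean, and the data layout are assumptions for illustration only.

```python
def area_mean(grid, mask):
    """Unweighted mean of the grid cells that fall inside the polygon mask."""
    cells = [
        value
        for row, mask_row in zip(grid, mask)
        for value, inside in zip(row, mask_row)
        if inside
    ]
    return sum(cells) / len(cells)


def count_days_above_for_area(daily_grids, mask, threshold):
    """Per-year count of days whose area-mean value exceeds the threshold."""
    counts = {}
    for (year, _day_of_year), grid in daily_grids.items():
        counts.setdefault(year, 0)
        if area_mean(grid, mask) > threshold:
            counts[year] += 1
    return counts


# 2x2 bounding box; the lower-left cell lies outside the polygon.
mask = [[True, True], [False, True]]
daily_grids = {
    (2000, 1): [[24.0, 26.0], [99.0, 28.0]],  # area mean 26.0
    (2000, 2): [[20.0, 22.0], [99.0, 24.0]],  # area mean 22.0
    (2001, 1): [[30.0, 30.0], [99.0, 30.0]],  # area mean 30.0
}
counts = count_days_above_for_area(daily_grids, mask, 25)
```

Because the threshold is applied to the daily area mean rather than to each cell, a polygon with a few hot cells and many cool ones can report zero days above the threshold.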

charparr

Wow, this is an awesome use of Rasdaman via the API. Substantial and significant work here Josh, nice job. I've left quite a few comments, but before you proceed in addressing them I'm going to lay out a few broader points for discussion, knowing that some of this will likely get sorted while I'm gone on leave!

Structural

I'm not convinced that dynamic indicators is an endpoint (i.e. a resource) rather than a view / filter of a resource. I'm alerted here because the backend data is the same (well, a subset) of the CMIP6 downscaled endpoint. There are probably some exceptions we can find, but the general model of the API thus far is that the endpoints (paths) correspond to discrete data resources (usually Rasdaman coverages) and then we sort and filter them in various ways (sometimes also with paths, like /point/lat/lon/variable/ or with query parameters). The path vs. query parameter is sort of a toss-up in my mind, but I do think that in this case dynamic indicators (filtering on some value, n) would be better considered a view into a resource or resources because...
We can apply this incredible WCPS-fu over multiple endpoints! The most obvious one would be the ERA5-WRF data. This is why I left a comment suggesting that the dynamic indicator logic might perhaps be better off in a new module where multiple endpoints can talk to it. I could imagine sea ice, fire weather, dynamical CMIP6 in the future, all implementing this. Dunking this somewhere else would also...
Reduce the code footprint. There is a ton of similarity between this endpoint and the CMIP6 downscaled endpoint, which makes sense because they have the same data resources, so the configuration, packaging, validation, postprocessing, etc. will have some degree of similarity.

Request Validation

Request validation needs major revision. Some of this is likely broader than this PR ( XREF #616 ), but there are too many layers here handling the same logic: the Marshmallow QueryParamsScheme in application.py, the imports from validate_request, and the route-specific validation bits here in this PR. Additionally, there are several places where the implementation of the request validation results in rules not being enforced. Here are some examples:

lat-lon guardrails get ignored

latlon_is_numeric_and_in_geodetic_range returns either True or the int 400, and validate_latlon_and_reproject_to_epsg_3338 checks the result with `if not`. Since 400 is "truthy", non-geodetic coordinates are treated as valid and the request proceeds until the downstream WCPS call fails. See this with http://127.0.0.1:5000/dynamic_indicators/count_days/above/25/C/tasmax/point/1000/-147.5/2000/2030/
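A minimal sketch of the failure mode and one possible fix. The function bodies are hypothetical stand-ins, not the code in this PR; only the return convention (True or the int 400) is taken from the description above.

```python
def latlon_check(lat, lon):
    """Stand-in validator using the True-or-400 return convention."""
    try:
        lat, lon = float(lat), float(lon)
    except (TypeError, ValueError):
        return 400
    if -90 <= lat <= 90 and -180 <= lon <= 180:
        return True
    return 400


def buggy_caller(lat, lon):
    # `not 400` evaluates to False, so the rejection branch never runs.
    if not latlon_check(lat, lon):
        return "rejected"
    return "accepted"


def fixed_caller(lat, lon):
    # Compare against True explicitly so any status code counts as a failure.
    if latlon_check(lat, lon) is not True:
        return "rejected"
    return "accepted"


buggy_result = buggy_caller(1000, -147.5)
fixed_result = fixed_caller(1000, -147.5)
```

A cleaner long-term option might be for the validator to return a bool, or to raise, rather than mixing True with status codes.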

operator validation

http://127.0.0.1:5000/dynamic_indicators/count_days/foobar/25/C/tasmax/point/65/-147.5/2000/2030/

This should return a 400, but instead we get a 500. I think this happens because the result of the validation function gets assigned to the operator variable! Operator is then the (template, 400) tuple. I think this same pattern happens with...

place IDs bubble up to 500s

http://127.0.0.1:5000/dynamic_indicators/count_days/above/25/C/tasmax/area/FOOFOOBARBAR/2000/2030/

This 500s, but should cause a 422 “invalid area”. And I think the root cause is similar to the previous bug.
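Here is a sketch of what I suspect is happening, with one way out. Everything in it is a hypothetical stand-in (a plain string replaces the rendered template, and a dict lookup replaces the WCPS query construction); it only illustrates the reassignment pattern.

```python
OPERATORS = {"above": ">", "below": "<"}


def validate_operator(operator):
    """Hypothetical validator: returns the operator, or an error tuple."""
    if operator not in OPERATORS:
        return ("bad request: invalid operator", 400)
    return operator


def build_comparison(operator):
    """Stand-in for downstream query construction."""
    return OPERATORS[operator]


def buggy_route(operator):
    # The validator's return value overwrites the input, so on failure
    # `operator` is now the (message, 400) tuple.
    operator = validate_operator(operator)
    try:
        return (build_comparison(operator), 200)
    except KeyError:
        # In the real app an unhandled exception would surface as a 500.
        return ("internal server error", 500)


def fixed_route(operator):
    result = validate_operator(operator)
    if isinstance(result, tuple):
        return result  # hand the 400 straight back to the client
    return (build_comparison(result), 200)


buggy_status = buggy_route("foobar")[1]
fixed_status = fixed_route("foobar")[1]
```

The same early-return shape would apply to the place ID check, with 422 as the status.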

year range stuff not enforced

validate_year returns True or 400, but nothing is done with that return value, so these requests return empty dicts:
http://127.0.0.1:5000/dynamic_indicators/stat/max/pr/mm/point/64/-147/20520/2000/
http://127.0.0.1:5000/dynamic_indicators/stat/max/pr/mm/point/64/-147/2050/2000/
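One possible shape for an enforced year check, as a sketch. The function name is hypothetical, and the 1950 and 2100 bounds are placeholders rather than the real coverage extent. It returns None on success or a (message, status) tuple that the route can hand straight back.

```python
def year_range_error(start, end, min_year, max_year):
    """Return None if the range is valid, else a (message, status) tuple."""
    try:
        start, end = int(start), int(end)
    except (TypeError, ValueError):
        return ("years must be integers", 400)
    if not (min_year <= start <= max_year and min_year <= end <= max_year):
        return (f"years must be within {min_year}-{max_year}", 400)
    if start > end:
        return ("start year must not be after end year", 400)
    return None


# Placeholder bounds for illustration only.
MIN_YEAR, MAX_YEAR = 1950, 2100

out_of_range = year_range_error("20520", "2000", MIN_YEAR, MAX_YEAR)
reversed_range = year_range_error("2050", "2000", MIN_YEAR, MAX_YEAR)
valid_range = year_range_error("2000", "2030", MIN_YEAR, MAX_YEAR)
```

In the route, the caller would then do `error = year_range_error(...)` followed by `if error is not None: return error`, so the result cannot be silently dropped.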

So, there is quite a bit of validation code in this route, but a significant portion of it does not work as intended. Adding tests for known invalid requests may help develop this.

Alright all that being said, before diving in and making changes I'd suggest getting more input from @brucecrevensten and others on the direction here.

charparr · 2025-11-10T23:33:34Z

routes/dynamic_indicators.py

Probably an annoying comment, so will ask forgiveness ahead of time, but:

Should the dynamic CMIP6 indicators be wrapped into the existing CMIP6 downscaled route? The reason I ask is because these two endpoints are fundamentally querying the same underlying "resource": the downscaled CMIP6 coverages. That'd fit the conventional model in which paths map to specific resources and then query params do the heavy lifting. This might feel like shuffling deck chairs on the Titanic, I dunno, but there is also an opportunity to re-use some of the validation and packaging code, etc, and then push the indicator logic to a different module which could then touch other resources (ERA5-WRF, future dynamically downscaled stuff), and those will all have their own validation and packaging that is different from what is in this endpoint.

templates/documentation/dynamic_indicators.html

generate_requests.py

routes/dynamic_indicators.py

charparr · 2025-11-14T16:11:31Z

@Joshdpaul I like all of these changes! I'm probably not going to be able to dive much further into this before I leave, but am stoked to see where this goes, especially in the context of a potential Garden Helper reboot. I think you and Bruce will figure out the right path forward, but I do feel pretty strongly that we're going to want to be able to access different resources with this logic: namely, CMIP6-Downscaled, and the ERA5-WRF. Two data peas in a pod 🫛 for something like this.

Joshdpaul added 8 commits October 21, 2025 11:35

initial commit, set up new route in app and add new WCPS functions + …

acbe61c

…allowed params

sketch all routes

c17a5b9

begin validation routines

195e54e

add async fetch count_days function

ccb3ff9

working count_days route

cd866e0

finish fetching and post-processing functions

46b6248

replace errors with templates

f0c8762

add more notes

062e484

Joshdpaul requested review from brucecrevensten and charparr October 22, 2025 21:57

Joshdpaul added 15 commits October 22, 2025 14:00

fix typo

e083a77

change coverage IDs to fetch from 6model avg

665472d

add bbox validation and function documentation

764f261

add point to route url

eeaf452

set up area fetching functions

8e95345

bring in zonal stats functions

24f4d7f

finish area query for count days

47a87cd

finish area queries for stat and rank endpoints

df9c7eb

add stat note

2435b8f

start documentation

4552f90

more doc

54575a9

fix rank rounding

621dfde

remove "projected" key in results and replace with scenario; add exam…

20becbf

…ple returns to documentation

add area query documentation

be260b4

finish doc

fd06a70

Joshdpaul marked this pull request as ready for review November 7, 2025 19:35

Joshdpaul added 2 commits November 7, 2025 10:36

Merge branch 'main' into dynamic_indicators_2

a8d94b4

Add validation for 'units' parameter in application.py

b2be9cb

charparr requested changes Nov 11, 2025


Joshdpaul added 6 commits November 11, 2025 09:39

remove "units" and "n" request params from marshmallow validation

cba06b1

rm print statement

5120e66

improve documentation

303a9a9

fix wrf link

a3c52cd

move validation functions

28968e3

refactor validation routines

557d1db

3 participants
