Skip to content

Conversation

@jmpesp
Copy link
Contributor

@jmpesp jmpesp commented Jan 16, 2026

It was a mistake to not consider the rendezvous_local_storage_dataset size_used when computing old_zpool_usage during region allocation! Fix that here. The LEFT JOIN and coalesce are required for two reasons:

  • blueprint execution is currently disabled for our tests, so the rendezvous tables won't appear there

  • deployed systems that update to a Nexus that contains this query may not have had blueprint execution run yet, and it's not a good idea to block crucible disk creation until this occurs.

It was a mistake to not consider the rendezvous_local_storage_dataset
size_used when computing `old_zpool_usage` during region allocation! Fix
that here. The `LEFT JOIN` and `coalesce` are required for two reasons:

- blueprint execution is currently disabled for our tests, so the
  rendezvous tables won't appear there

- deployed systems that update to a Nexus that contains this query may
  not have had blueprint execution run yet, and it's not a good idea to
  block crucible disk creation until this occurs.
@jmpesp jmpesp requested a review from smklein January 16, 2026 18:15
builder.sql(
// Find all old regions associated with a particular volume
"WITH
builder
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We really should have at least one regression test for this!

sum(coalesce(rendezvous_local_storage_dataset.size_used, 0))
) AS size_used
FROM
crucible_dataset LEFT JOIN rendezvous_local_storage_dataset
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By using a "LEFT JOIN" here, we're fully ignoring the local storage dataset in the case where "no crucible datasets have been allocated". That seems wrong.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Also, seems unlikely I suppose? I guess all our current configs have "crucible, and may or may not have the rendezvous_local_storage_dataset -- but this still seems like defensive programming worth doing)

crucible_dataset.size_used IS NOT NULL AND
crucible_dataset.time_deleted IS NULL
GROUP BY crucible_dataset.pool_id),",
);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this query is incorrectly calculating size, overcounting usage, which will inhibit our ability to actually allocate storage on the rack.

For example, if pool A has:

  • 3 crucible_datasets (c1, c2, c3) with sizes 100, 200, 300
  • 2 rendezvous_local_storage_datasets (r1, r2) with sizes 50, 60

The JOIN condition is "match on pool ID" -- every crucible dataset will match every local storage dataset, returning 6 rows:

  • c1, r1 (100 + 50)
  • c1, r2 (100 + 60)
  • c2, r1 (200 + 50)
  • c2, r2 (200 + 60)
  • c3, r1 (300 + 50)
  • c3, r2 (300 + 60)

I think the correct answer for space used should actually be: 100 + 200 + 300 + 50 + 60 = 710.
However, your query as-is more than double counts: I think the sum of all the rows in this example is 1530 instead.

Presumably we want to account for "each crucible dataset, once" and "each local storage dataset, once" within each pool, right?

Maybe try separating the two?

  old_zpool_usage AS (                                                                                                                                                                                                                        
    SELECT                                                                                                                                                                                                                                    
      COALESCE(crucible_usage.pool_id, local_usage.pool_id) AS pool_id,                                                                                                                                                                       
      COALESCE(crucible_usage.size_used, 0) + COALESCE(local_usage.size_used, 0) AS size_used                                                                                                                                                 
    FROM (                                                                                                                                                                                                                                    
      SELECT pool_id, sum(size_used) AS size_used                                                                                                                                                                                             
      FROM crucible_dataset                                                                                                                                                                                                                   
      WHERE size_used IS NOT NULL AND time_deleted IS NULL                                                                                                                                                                                    
      GROUP BY pool_id                                                                                                                                                                                                                        
    ) crucible_usage                                                                                                                                                                                                                          
    FULL OUTER JOIN (                                                                                                                                                                                                                         
      SELECT pool_id, sum(size_used) AS size_used                                                                                                                                                                                             
      FROM rendezvous_local_storage_dataset                                                                                                                                                                                                   
      WHERE time_tombstoned IS NULL                                                                                                                                                                                                           
      GROUP BY pool_id                                     
    ) local_usage                                          
    ON crucible_usage.pool_id = local_usage.pool_id        
  )                                                        

This also should be testable. We should have coverage for cases where we're making an allocation, we need a precise amount of space to fit, and we have...

  • A pool with only crucible datasets
  • A pool with only local storage
  • A pool with both
  • A pool with neither

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

welp, I'm not sure how I managed to overlook this - maybe got too in the weeds on this query - but as you pointed out in chat, we really should have "zero or one" of these datasets on a zpool, not multiple.

So although my complaint is accurate in an academic point-of-view, based on the schema, practically, we shouldn't have "a bunch of crucible datasets" all residing on one zpool

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants