-
Notifications
You must be signed in to change notification settings - Fork 67
Account for local storage during region allocation #9670
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -262,25 +262,47 @@ pub fn allocation_query( | |
|
|
||
| let mut builder = QueryBuilder::new(); | ||
|
|
||
| builder.sql( | ||
| // Find all old regions associated with a particular volume | ||
| "WITH | ||
| builder | ||
| .sql( | ||
| // Find all old regions associated with a particular volume | ||
| "WITH | ||
| old_regions AS ( | ||
| SELECT ").sql(AllColumnsOfRegion::with_prefix("region")).sql(" | ||
| FROM region WHERE (region.volume_id = ").param().sql(")),") | ||
| .bind::<sql_types::Uuid, _>(*volume_id.as_untyped_uuid()) | ||
|
|
||
| // Calculates the old size being used by zpools under consideration as targets for region | ||
| // allocation. | ||
| .sql(" | ||
| SELECT ", | ||
| ) | ||
| .sql(AllColumnsOfRegion::with_prefix("region")) | ||
| .sql( | ||
| " | ||
| FROM region WHERE (region.volume_id = ", | ||
| ) | ||
| .param() | ||
| .sql(")),") | ||
| .bind::<sql_types::Uuid, _>(*volume_id.as_untyped_uuid()) | ||
| // Calculates the old size being used by zpools under consideration as | ||
| // targets for region allocation. | ||
| // | ||
| // Account for the local storage dataset rendezvous tables not having been | ||
| // created yet (or for the integration tests, where blueprint execution is | ||
| // currently disabled) by performing a LEFT JOIN on pool_id and a coalesce | ||
| // for the size_used column. | ||
| .sql( | ||
| " | ||
| old_zpool_usage AS ( | ||
| SELECT | ||
| crucible_dataset.pool_id, | ||
| sum(crucible_dataset.size_used) AS size_used | ||
| FROM crucible_dataset | ||
| ( | ||
| sum(crucible_dataset.size_used) + | ||
| sum(coalesce(rendezvous_local_storage_dataset.size_used, 0)) | ||
| ) AS size_used | ||
| FROM | ||
| crucible_dataset LEFT JOIN rendezvous_local_storage_dataset | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. By using a "LEFT JOIN" here, we're fully ignoring the local storage dataset in the case where "no crucible datasets have been allocated". That seems wrong.
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. (Also, seems unlikely I suppose? I guess all our current configs have "crucible, and may or may not have the |
||
| ON | ||
| crucible_dataset.pool_id = rendezvous_local_storage_dataset.pool_id AND | ||
| rendezvous_local_storage_dataset.time_tombstoned is NULL | ||
| WHERE | ||
| ((crucible_dataset.size_used IS NOT NULL) AND (crucible_dataset.time_deleted IS NULL)) | ||
| GROUP BY crucible_dataset.pool_id),"); | ||
| crucible_dataset.size_used IS NOT NULL AND | ||
| crucible_dataset.time_deleted IS NULL | ||
| GROUP BY crucible_dataset.pool_id),", | ||
| ); | ||
|
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think this query is incorrectly calculating size, overcounting usage, which will inhibit our ability to actually allocate storage on the rack. For example, if pool A has:
The JOIN condition is "match on pool ID" -- every crucible dataset will match every local storage dataset, returning 6 rows:
I think the correct answer for space used should actually be: 100 + 200 + 300 + 50 + 60 = 710. Presumably we want to account for "each crucible dataset, once" and "each local storage dataset, once" within each pool, right? Maybe try separating the two? old_zpool_usage AS (
SELECT
COALESCE(crucible_usage.pool_id, local_usage.pool_id) AS pool_id,
COALESCE(crucible_usage.size_used, 0) + COALESCE(local_usage.size_used, 0) AS size_used
FROM (
SELECT pool_id, sum(size_used) AS size_used
FROM crucible_dataset
WHERE size_used IS NOT NULL AND time_deleted IS NULL
GROUP BY pool_id
) crucible_usage
FULL OUTER JOIN (
SELECT pool_id, sum(size_used) AS size_used
FROM rendezvous_local_storage_dataset
WHERE time_tombstoned IS NULL
GROUP BY pool_id
) local_usage
ON crucible_usage.pool_id = local_usage.pool_id
) This also should be testable. We should have coverage for cases where we're making an allocation, we need a precise amount of space to fit, and we have...
Collaborator
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. welp, I'm not sure how I managed to overlook this - maybe got too in the weeds on this query - but as you pointed out in chat, we really should have "zero or one" of these datasets on a zpool, not multiple. So although my complaint is accurate in an academic point-of-view, based on the schema, practically, we shouldn't have "a bunch of crucible datasets" all residing on one zpool |
||
|
|
||
| if let Some(snapshot_id) = snapshot_id { | ||
| // Any zpool already have this volume's existing regions, or host the | ||
|
|
||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really should have at least one regression test for this!