improvement: rework Store.toSQL function to minimize UPDATE statements #21
);
}

protected _toSQL(props: {
This is the relevant part with the algorithm.
return this._toSQL({
  serializeInsertStatement,
  serializeUpdateStatement,
  getEndStatements,
});
The dialect implementation is now slimmed down to serializing the insert and update statements! (+ providing optional "end statements", in Postgres' case, to fix the sequences).
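A minimal sketch of that split, assuming hypothetical `Row`, `Update`, and `DialectHooks` shapes (none of them are taken from the PR): a shared driver decides the statement order, while the dialect only supplies the three hooks named above. The sequence name in `getEndStatements` is also just an illustration.

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
type Row = { table: string; values: Record<string, unknown> };
type Update = {
  table: string;
  set: Record<string, unknown>;
  where: Record<string, unknown>;
};

interface DialectHooks {
  serializeInsertStatement(row: Row): string;
  serializeUpdateStatement(update: Update): string;
  getEndStatements?(): string[]; // optional, e.g. sequence fixes
}

// Dialect-agnostic driver: inserts first, then updates, then end statements.
function toSQL(rows: Row[], updates: Update[], hooks: DialectHooks): string[] {
  return [
    ...rows.map((row) => hooks.serializeInsertStatement(row)),
    ...updates.map((update) => hooks.serializeUpdateStatement(update)),
    ...(hooks.getEndStatements?.() ?? []),
  ];
}

// Very small literal serializer (numbers, NULL, quoted strings only).
const literal = (value: unknown): string =>
  value === null
    ? "NULL"
    : typeof value === "number"
      ? String(value)
      : `'${String(value).replace(/'/g, "''")}'`;

// "column = value" pairs; assumes non-null values in WHERE clauses.
const assign = (pairs: Record<string, unknown>): string[] =>
  Object.entries(pairs).map(([key, value]) => `"${key}" = ${literal(value)}`);

// A Postgres-flavoured set of hooks (assumed, not the PR's implementation).
const postgresLike: DialectHooks = {
  serializeInsertStatement: ({ table, values }) => {
    const columns = Object.keys(values);
    const columnList = columns.map((c) => `"${c}"`).join(", ");
    const valueList = columns.map((c) => literal(values[c])).join(", ");
    return `INSERT INTO "${table}" (${columnList}) VALUES (${valueList});`;
  },
  serializeUpdateStatement: ({ table, set, where }) =>
    `UPDATE "${table}" SET ${assign(set).join(", ")} WHERE ${assign(where).join(" AND ")};`,
  getEndStatements: () => [
    // Hypothetical sequence name, shown only to illustrate "fixing the sequences".
    `SELECT setval('"user_id_seq"', (SELECT MAX("id") FROM "user"));`,
  ],
};
```

A dialect without sequences could simply omit `getEndStatements`; the driver then emits only the inserts and updates.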
]),
);
});
test("should handle circular references", async () => {
I added a simple case, but I'm sure @avallete will be able to provide more elaborate ones and break all the things! 😂
import { type CodegenContext } from "#core/codegen/codegen.js";
import { type Shape, type TableShapePredictions } from "#trpc/shapes.js";
import { type CodegenContext } from "../../codegen/codegen.js";
import { shouldGenerateFieldValue } from "../../dataModel/shouldGenerateFieldValue.js";
@jgoux bikeshed, but just checking if these .. paths are intentional?
I think so, they live in the same "module" which is #core.
Ah, so convention-wise, we should use ../../ if it's in the same "module" (e.g. code in core/ importing other code in core/), and # if it's in a different module (e.g. code in dialect/ importing code in core/)?
Yes, but it's very informal, so I wouldn't obsess over it. It's just a way for me to keep layers of responsibility separated. If I see a "#" I know it's from another layer of responsibility.
// do we need to keep track of the exact chain of parents?
if (ctx.pending[parent.type].includes(parentRowId)) {
  if (parent.isRequired) {
    throw new Error(
question
Should I try to add test cases to this PR so we address that before merging?
I wonder whether not handling circular deps will degrade the UX for most users. Before, we were just appending some updates and that was it.
We don't handle non-nullable circular deps, which wouldn't be insertable anyway.
I don't think this comment is relevant anymore. I think the way we generate UPDATE statements and replace these nullable FKs with NULL when necessary solves it. But you can try breaking it, of course, maybe I'm missing obvious cases!
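To make the "replace nullable FKs with NULL, then UPDATE" idea concrete, here is a rough sketch under assumed types. `ParentRef`, `Row`, and `planStatements` are hypothetical names, not the PR's code: a parent that is still pending marks a cycle, a required FK in that position throws, and a nullable one is inserted as NULL and patched afterwards.

```typescript
// Hypothetical shapes for illustration; the PR's real context object differs.
type ParentRef = { column: string; parentKey: string; isRequired: boolean };
type Row = { key: string; table: string; id: number; parents: ParentRef[] };

function planStatements(rows: Row[]): string[] {
  const byKey = new Map(rows.map((row) => [row.key, row] as const));
  const pending = new Set<string>(); // rows whose parents are being resolved
  const done = new Set<string>(); // rows whose INSERT was already emitted
  const inserts: string[] = [];
  const updates: string[] = [];

  const visit = (row: Row): void => {
    if (done.has(row.key)) return;
    pending.add(row.key);
    const columns = ['"id"'];
    const values = [String(row.id)];
    for (const ref of row.parents) {
      const parent = byKey.get(ref.parentKey);
      if (!parent) throw new Error(`Unknown parent row: ${ref.parentKey}`);
      columns.push(`"${ref.column}"`);
      if (pending.has(parent.key)) {
        // Cycle: the parent cannot be inserted before this row.
        if (ref.isRequired) {
          throw new Error(`Non-nullable circular reference on ${row.table}.${ref.column}`);
        }
        values.push("NULL"); // insert without the FK now...
        updates.push(
          // ...and set it once every row exists.
          `UPDATE "${row.table}" SET "${ref.column}" = ${parent.id} WHERE "id" = ${row.id};`,
        );
      } else {
        visit(parent); // make sure the parent's INSERT comes first
        values.push(String(parent.id));
      }
    }
    inserts.push(
      `INSERT INTO "${row.table}" (${columns.join(", ")}) VALUES (${values.join(", ")});`,
    );
    pending.delete(row.key);
    done.add(row.key);
  };

  rows.forEach(visit);
  return [...inserts, ...updates];
}
```

With an order that requires a customer and a customer with a nullable `last_order_id` pointing back at that order, this yields two INSERTs (customer first, with NULL) followed by a single UPDATE on the customer.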
serializeUpdateStatement: ctx.serializeUpdateStatement,
});

ctx.insertStatements.push(insertStatement);
Is the idea that this line would be reached for parents first, then their children later? Such that we're always inserting parents before children, to avoid needing to later on update already inserted rows?
const insertStatements: Array<string> = [];
const updateStatements: Array<string> = [];

const sortedModels = sortModels(this.dataModel);
For what reason do we topologically sort the models? To make sure the parents are reached before other ancestors, or something?
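For context on the question, a topological sort over models generally places a model after every model it references. The sketch below is only an assumption about what such a sort could look like; `Model` and `sortModelsSketch` are hypothetical and not the PR's `sortModels`.

```typescript
// Hypothetical model shape: a name plus the names of the models it references.
type Model = { name: string; parents: string[] };

function sortModelsSketch(models: Model[]): string[] {
  // Remaining models mapped to their parents (self-references are ignored).
  const remaining = new Map(
    models.map((m) => [m.name, m.parents.filter((p) => p !== m.name)] as const),
  );
  const sorted: string[] = [];
  while (remaining.size > 0) {
    // A model is ready once none of its parents is still waiting.
    const ready = [...remaining.keys()].filter((name) =>
      (remaining.get(name) ?? []).every((parent) => !remaining.has(parent)),
    );
    if (ready.length === 0) {
      // Only cycles are left: keep declaration order and let the caller cope.
      sorted.push(...remaining.keys());
      break;
    }
    for (const name of ready) {
      sorted.push(name);
      remaining.delete(name);
    }
  }
  return sorted;
}
```

Walking models in this order means that, outside of cycles, a referenced model's rows are reached before the rows that point at them.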
].join(EOL),
);
} else {
updatableParents.push(parent);
Would this line only be reached if there is a cyclic dependency + there is at least one nullable relation in the chain causing the cycle?
foreign key (last_order_id) references "order"(id);

alter table product add constraint fk_first_order
foreign key (first_order_id) references "order"(id);
Is it worth also testing cycles where there are relations in-between? e.g. instead of order->customer->order and order->product->order, something like order->transaction->customer->order?
justinvdm
left a comment
Just left a bunch of questions for my understanding, but looks great! Code makes sense (provided I understood it correctly :D but the questions should indicate that at least). Nice that we also were able to abstract out the bulk of the logic so it is not per-dialect. Great work.
Reconverting the PR to draft as @avallete was able to destroy my implementation with more complicated tests!
Cool, but it was really just me wanting to understand the approach. If the implementation is changing though, then I'll just re-grok and re-ask when the PR is open again, so works for me.
I'd like us to consider using a more strategic approach:
We have a user who encountered a "bug" because we split our insertions between `INSERT` and `UPDATE` statements.
This PR's goal is to use as many `INSERT` statements as we can. `UPDATE` statements should only be used if we encounter a circular dependency.
The strategy here is to follow the row structure in order to generate the statements.
For a given row, we look for parents, and if there are any, we recursively (sorry @avallete 😂) create those parents' statements.
It ensures that we will always create the rows in the right order, so we can get rid of the `UPDATE`.
Context: https://discord.com/channels/788353076129038346/1214886051886268426
I also updated the dependencies and rewrote some imports, I commented on the interesting files so it's easier to follow. 👍
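The parents-first recursion described above can be sketched roughly as follows, assuming rows without circular references. `Row` and `insertionOrder` are hypothetical names for illustration, not the PR's implementation.

```typescript
// Hypothetical row shape: a unique key plus the keys of the rows it references.
type Row = { key: string; parentKeys: string[] };

// Returns row keys in an order where every parent precedes its children.
// Assumes the rows form no cycle; cycles need the UPDATE fallback instead.
function insertionOrder(rows: Row[]): string[] {
  const byKey = new Map(rows.map((row) => [row.key, row] as const));
  const visited = new Set<string>();
  const order: string[] = [];

  const visit = (row: Row): void => {
    if (visited.has(row.key)) return; // already handled via another child
    visited.add(row.key);
    for (const parentKey of row.parentKeys) {
      const parent = byKey.get(parentKey);
      if (!parent) throw new Error(`Unknown parent row: ${parentKey}`);
      visit(parent); // recurse so the parent is emitted first
    }
    order.push(row.key); // all parents are emitted, so this row can be inserted
  };

  rows.forEach(visit);
  return order;
}
```

Because a row is only pushed after all of its parents, each `INSERT` can carry its final foreign key values and no later `UPDATE` is needed in the acyclic case.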