Conversation
|
Thank you for the contribution! The best way to handle this would be to rewrite the
|
Expands multi-argument UNNEST(arr1, arr2, ...) into ROWS FROM(unnest(arr1), unnest(arr2), ...) which properly zips results together with NULL padding for shorter arrays. Changes: - table_expr.go: Detect and expand multi-arg UNNEST to RowsFromExpr - aliased_table_expr.go: Support WITH ORDINALITY via RowsFromExpr - select_clause.go: Update multi-arg UNNEST rewrite to use RowsFromExpr Depends on: - dolthub/vitess#454 - dolthub/go-mysql-server#3412
|
@Hydrocharged Like this? |
|
We need tests for |
- Add comprehensive tests for explicit ROWS FROM syntax: - ROWS FROM with generate_series - ROWS FROM with unnest - ROWS FROM with mixed functions - Fix table_expr.go and aliased_table_expr.go to use RowsFromExpr for all multi-function ROWS FROM cases, not just multi-arg UNNEST Depends on: - dolthub/vitess#454 - dolthub/go-mysql-server#3412
@Hydrocharged done! |
|
I apologize for the delay on the review, but I'm starting on it now. I'll have something for you much later in the day. |
Hydrocharged
left a comment
There was a problem hiding this comment.
What you have here does work for the specific case of ROWS FROM, however we can generalize the changes to enable more broad support.
From looking through the documentation, it appears that ROWS FROM behaves no differently than an expression. Therefore, SELECT * FROM ROWS FROM (...); has the same relative behavior as SELECT * FROM expression();. With that in mind, we have vitess.TableFuncExpr to handle that case, which would apply to all expressions that are positioned after FROM.
With the above information, you can remove the new vitess.RowsFromExpr and replace it with vitess.TableFuncExpr. You'll still need to modify vitess though, as TableFuncExpr allows for a table alias, but not column aliases (for example, SELECT * FROM generate_series(1, 5) AS s(n); fails while removing the (n) works). The new Columns field will require GMS modifications (similar to your existing changes made for RowsFromExpr). We should also leave a comment in vitess that the Columns field is a Doltgres extension, and not used by MySQL.
Lastly, plan.RowsFrom would be moved out of GMS and into Doltgres. From here, we have two routes. The first would be to allow TableFuncExpr to reference expressions rather than strictly functions and table functions. The second would be to implement the expression as a function. The long-term goal is to implement the first one, but that may be a larger lift than just implementing this as a function. If you do go the function route, then you can use a unique name (such as dolt_rowsfrom as we disallow user-functions with the dolt prefix), with a TODO mentioning the above.
Move RowsFrom and RowsFromIter from GMS into server/node as an ExecBuilderNode. Register BuildMultiExprTableFunc factory to handle nameless TableFuncExpr (ROWS FROM pattern). Update AST conversion to emit TableFuncExpr instead of the removed RowsFromExpr.
|
@Hydrocharged changes made. |
Hydrocharged
left a comment
There was a problem hiding this comment.
This is closer, but there's still a bit of work left to do.
| // For non-RowsFromExpr expressions, ordinality is not yet supported | ||
| if node.Ordinality { | ||
| return nil, errors.Errorf("ordinality is only supported for ROWS FROM expressions") | ||
| } |
There was a problem hiding this comment.
This seems false due to the more generalized TableFuncExpr
| return nil, err | ||
| } | ||
|
|
||
| // Wrap in a subquery as the original code did |
There was a problem hiding this comment.
No need to reference the original code. Was this written with AI assistance?
| } | ||
|
|
||
| // Handle RowsFromExpr specially - it can have WITH ORDINALITY and column aliases | ||
| if rowsFrom, ok := node.Expr.(*tree.RowsFromExpr); ok { |
There was a problem hiding this comment.
This needs to be in the switch statement below
| selectExprs := make(vitess.SelectExprs, 0, len(tableFuncExpr.Exprs)) | ||
| for _, argExpr := range tableFuncExpr.Exprs { |
There was a problem hiding this comment.
We don't need to use append here since we're creating selectExprs based on the length of tableFuncExpr.Exprs, and then directly iterating over tableFuncExpr.Exprs. This should use the index variable rather than ignore it.
| // Handle multi-argument UNNEST specially: UNNEST(arr1, arr2, ...) | ||
| // is syntactic sugar for ROWS FROM(unnest(arr1), unnest(arr2), ...) | ||
| if len(node.Items) == 1 { | ||
| if funcExpr, ok := node.Items[0].(*tree.FuncExpr); ok { | ||
| funcName := funcExpr.Func.String() | ||
| if strings.EqualFold(funcName, "unnest") && len(funcExpr.Exprs) > 1 { | ||
| // Expand multi-arg UNNEST into separate unnest calls | ||
| selectExprs := make(vitess.SelectExprs, len(funcExpr.Exprs)) | ||
| for i, arg := range funcExpr.Exprs { | ||
| argExpr, err := nodeExpr(ctx, arg) | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| selectExprs[i] = &vitess.AliasedExpr{ | ||
| Expr: &vitess.FuncExpr{ | ||
| Name: vitess.NewColIdent("unnest"), | ||
| Exprs: vitess.SelectExprs{&vitess.AliasedExpr{Expr: argExpr}}, | ||
| }, | ||
| } | ||
| } | ||
| return &vitess.TableFuncExpr{ | ||
| Exprs: selectExprs, | ||
| }, nil | ||
| } | ||
| } | ||
| } |
There was a problem hiding this comment.
I'm not sure I understand why this is here. UNNEST won't resolve to *tree.RowsFromExpr, unless you mean for the nested case? Because for that, it should be handled in the natural path (probably under *tree.AliasedTableExpr, but definitely not here).
| if r.withOrdinality { | ||
| schema = append(schema, &sql.Column{ | ||
| Name: "ordinality", | ||
| Type: types.Int64, |
There was a problem hiding this comment.
This is a GMS type, not a Doltgres type, which we usually use pgtypes as an alias to make that distinction within a file. All returned types (unless GMS specifically has to use the type, and even then there are usually workarounds) should be Doltgres types. Also, is ordinality an int4 or int8? This should be confirmed via Postgres (generally through documentation so that it can be verified later, or from direct testing if it's not in the docs).
| // DebugString implements the sql.DebugStringer interface. | ||
| func (r *RowsFrom) DebugString() string { | ||
| var sb strings.Builder | ||
| sb.WriteString("RowsFrom(") |
There was a problem hiding this comment.
String writes "ROWS FROM" while this one writes RowsFrom.
| type singleValueIter struct { | ||
| value interface{} | ||
| consumed bool | ||
| } | ||
|
|
||
| func (s *singleValueIter) Next(ctx *sql.Context) (sql.Row, error) { | ||
| if s.consumed { | ||
| return nil, io.EOF | ||
| } | ||
| s.consumed = true | ||
| return sql.Row{s.value}, nil | ||
| } | ||
|
|
||
| func (s *singleValueIter) Close(ctx *sql.Context) error { | ||
| return nil | ||
| } |
There was a problem hiding this comment.
This is unnecessary, we already have sql.RowsToRowIter
| // It executes multiple set-returning functions in parallel and zips their results together. | ||
| // When one function is exhausted before another, NULL is used for its values. | ||
| type RowsFromIter struct { | ||
| functions []sql.Expression |
There was a problem hiding this comment.
Rather than building the iterators here, why not just build them in BuildRowIter and return an error there? Then we don't have to worry about this initialized stuff.
| }, | ||
| }, | ||
| }, | ||
| { |
There was a problem hiding this comment.
We also need tests for more general ROWS FROM behavior. For example, using ordinality and column aliases, not just table aliases. Same goes for other statements that use TableFuncExpr (which you can use generate_series for now).
Summary
Adds native ROWS FROM table function support for multi-argument UNNEST.
Django's bulk_create generates queries like:
Multi-argument
UNNEST(arr1, arr2, ...)is PostgreSQL syntactic sugar forROWS FROM(unnest(arr1), unnest(arr2), ...), which zips results together with NULL padding for shorter arrays.Changes
server/ast/table_expr.go: Detect and expand multi-arg UNNEST to RowsFromExprserver/ast/aliased_table_expr.go: Support WITH ORDINALITY via RowsFromExprserver/ast/select_clause.go: Update multi-arg UNNEST rewrite to use RowsFromExprDependencies
This PR depends on:
Test plan
TestArrayFunctions/multi-argument_unnest- Tests equal/unequal length arrays, NULL padding, aliasesTestInsert/insert_from_unnest- Django bulk_create patternTestCreateFunctionsLanguageSQL/function_returning_multiple_rows- User-defined SRFs still work