-
Notifications
You must be signed in to change notification settings - Fork 2
Description
Hello, this issue was originally reported in the migra repository djrobstep/migra#196 by @alexkreidler but since it is yet to be fixed, I am reporting it here again.
The original message:
I'm using
migrawith tusker, a tool that creates a temporary PG databaseCREATE DATABASE;for both my schema and migrations separately, runs them from my SQL files on disk, and then diffs the output.When I run
tusker diff -r schema migrations, it generates something like this (truncated for brevity):create schema if not exists "app"; create extension if not exists "pgcrypto" with schema "public" version '1.3'; create table "app"."datasets" ( "id" uuid not null default generate_ulid_uuid(), [...] ); [...] CREATE OR REPLACE FUNCTION public.generate_ulid_uuid() RETURNS uuid LANGUAGE plpgsql AS $function$ DECLARE [...] END $function$ ;Generally, this generated diff/migration is good, eg it creates schemas before tables before constraints. But the user defined functions are at the end, and my table as you can see relies on a function, so directly applying this migration without manually reordering it results in an error.
I've looked at some issues that seem related to dependency tracking of DDL statements/db objects: #8 for functions, #142 inherited tables, #7 views.
Based on #8, my impression is that migra currently relies on native Postgres functionality to track the dependencies between DB objects, something like this.
The first bullet from #189:
- Proper dependency resolution (make a graph of all dependencies and topologically sort it)
makes me think it would be great to implement in Python, eg parsing the DDL with a library, applying some rules to the parsed objects to extract possible references to other objects, then creating a DAG and sorting. I have a few thoughts on that:
- it could be great to make the dependency sorter a separate library/CLI tools so people with a bunch of SQL scripts could keep them in whatever order they want and then run it.
- because there are often many valid topological sorts of a given DAG, it could be useful to have some additional rules to sort objects when dependencies are not an issue, like ordering by object type: schemas, tables, constraints, functions, like the existing logic. This "stylistic sort" could make things more consistent and human readable. I don't know how complex it would be to combine those two goals.
I believe that ensuring outputted migrations have a valid DB object creation order, so they can be immediately run would be a serious UX (#25) benefit and hopefully encourage adoption of the tool.
I'd love any feedback anyone has on the next steps towards implementing this. Thanks for this great tool!