-
Notifications
You must be signed in to change notification settings - Fork 41
Add a new treeDefault import feature #6429
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
|
NOTES:
|
|
TODO:
def create_default_trees_view(request):because: logged_in_collection_name = request.user.logincollectionname returns None for me |
Triggered by 441d301 on branch refs/heads/issue-6294
Whoops, should work now 👍 I gave it a quick test to confirm |
Remove print statements
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- See that a list of default Taxon trees is displayed
- Click on a default tree and see you get a notification that it is being created.
- Wait and make sure the default tree is successfully created
- Create a new empty tree and make sure it is created successfully.
- Make sure you can start creating a tree and cancel it with the Cancel button in the popup.
- Make sure you can cancel the tree from the "in progress" notification as well.
- The new secondary Geography tree should be created after a few minutes.
- The new Chronostrat tree should be created after a few minutes.
- I tried uploading all taxon trees, everything uploads successfully except the ornithology tree.
- The root node for geography is just called 'Root' but should be called 'Earth'
- The root node for Chronostrat is called 'Root' but should be called 'Time'
- The chronostratigraphy tree doesn't have an Eon rank. I'm not sure if this is necessary but I wanted to mention it in case it is.
- The
full name separatorfor the root ranks for both chronostrat and geology are set tobut should be set to,.
Hm not sure why its failing, I increased the re-connection delay, so it might work now? |
It's still failing at the end of the tree creation |
Increase retries
|
TODO:
|
Triggered by 11a4d60 on branch refs/heads/issue-6294
|
@emenslin Ready to be tested again 👍 The bird tree has been moved onto the same server as the other trees, so it should now download reliably |
emenslin
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- See that a list of default Taxon trees is displayed
- Click on a default tree and see you get a notification that it is being created.
- Wait and make sure the default tree is successfully created
- Create a new empty tree and make sure it is created successfully.
- Make sure you can start creating a tree and cancel it with the Cancel button in the popup.
- Make sure you can cancel the tree from the "in progress" notification as well.
- The new secondary Geography tree should be created after a few minutes.
- The new Chronostrat tree should be created after a few minutes.
The ornithology tree uploaded successfully, looks good! I did notice that the geography root rank is called "Earth" when I believe it should be "Planet", however, I know it's called Earth in some trees and Planet in others so it's probably not a big deal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- See that a list of default Taxon trees is displayed
- Click on a default tree and see you get a notification that it is being created.
- Wait and make sure the default tree is successfully created
- Create a new empty tree and make sure it is created successfully.
- Make sure you can start creating a tree and cancel it with the Cancel button in the popup.
- Make sure you can cancel the tree from the "in progress" notification as well.
- The a new secondary Geography tree should be created after a few minutes.
- The new Chronostrat tree should be created after a few minutes.
just tested this on Calvert and it looks great! nice job y'all !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- See that a list of default Taxon trees is displayed
There is almost no space between the "Populated Trees" and "Empty Trees" sections, making it difficult to differentiate between the list of trees that are pre-loaded with data and the list of empty trees that can be chosen.
More importantly, the source data we are providing says it is from 2008, which is incredibly out of date (for most disciplines). The defaults we provide in Specify 6 today are from Catalog of Life 2021 in most cases, see https://files.specifysoftware.org/taxonfiles/
https://files.specifysoftware.org/taxonfiles/taxonfiles_2.xml
Collections curators and managers want to use the most up-to-date taxonomic data possible when using default data, so we need to make sure these updated spreadsheets are being used.
- Click on a default tree and see you get a notification that it is being created.
Some suggestions:
- The dialog header "The Default Tree creation process has started." should not have a period in it (to be stylistically consistent with other dialogs in the app) and should either be entirely proper case or 'default tree' should be lower case (as it is not a proper noun).
- Better yet, it should tell the user what tree it is making, as while the 20,000+ records are added, there is no indication in this dialog confirming they made the correct selection. The notifications do include the name of the tree, but this is not visible to the user while this process happens
- There is no "Close" button so the user can resume their work, only a red Cancel button. Since users instinctively click the primary action button, they may accidentally cancel this and not understand it will happen in the background. We could add a Close button after the cancel button to prevent this from happening instinctively.
[Cancel] [Primary Action].- In fairness, we might not even need to show the user progress at all, as long as the worker is functional, everything should just work.
- Wait and make sure the default tree is successfully created
This is quite slow... I chose the 'Ichthyology' tree and began populating it at 1:18 (according to the network requests). It crawled along, finishing populating entirely at 1:47 (29 minutes total) on the test panel.
Thinking of the (expectedly common) case of someone creating a database with several disciplines with a large number of trees, they would be forced to queue and wait for these jobs to finish over a long period of time.
It would make a lot of sense to batch these and make this action more efficient to prevent frustration.
The notification showing "Cancel" persists regardless of the state (success|failure).
- Create a new empty tree and make sure it is created successfully.
That was fast 😄
- Make sure you can start creating a tree and cancel it with the Cancel button in the popup.
- Make sure you can cancel the tree from the "in progress" notification as well.
This technically works, but–
Cancelling the tree still sends 3 notifications:
- The Default Tree creation process has started.
- Default Tree creation in progress.
- Default Tree creation failure.
Aborted
My suggestion:
-
Remove the 'Task ID' collapsible section from all notifications– the users should not see this.
-
Remove the giant red "Cancel" button from the notification. If the user wants to remove this tree, they should simply be able to do so after it is created (#7594)
-
Send only two notifications, one saying the process has started and one saying the tree creation was cancelled. Saying "Default Tree creation failure. Aborted" is overwhelmingly negative and says:
- The software failed unintentionally ("creation failure")
- We use programmer lingo ("aborted" instead of "cancelled")
- Create a Geography tree:
- Click execute and make note of the response you got below.
- The new secondary Geography tree should be created after a few minutes.
- You can track the progress by going to
yourdb.test.specifysystems.org/api/create_default_tree/status/YOUR_TASK_ID/. You should see your task id in the response you got.
- Create a Chronostratigraphy tree:
- Click execute and make note of the response you got below.
- The new Chronostrat tree should be created after a few minutes.
Other Thoughts
The core issue, as I see it, is that the user needs to create a taxon tree populated with our default data.
From my conversation with @alesan99, I understand that a big challenge we face is the time it takes to build these trees. This PR so far is attempting to develop a robust system that will inform users about the current status of the background process, including notifications, status updates, a progress bar, and options to cancel the action, to address this slowdown.
My (perhaps naïve) perspective is that we should re-evaluate how we provide the default trees to make this process nearly instantaneous and eliminate previous complications. If we can download the full tree CSV file and import it all at once, the extra work may be unnecessary. The faster this process completes, the better the user experience. Fewer messages for the user also would improve the experience.
In Specify 6, this took only a few seconds (the file downloaded locally, the tree was populated, the user moved on). Maybe there are lessons there we can use to make this more elegant.
This PR is loading the data row-by-row while Specify 6 used batch database operations, resulting in many more database round-trips.
- We could use bulk_create() instead of individual saves
- Seems pertinent to pre-fetch all existing nodes and ranks
- Batch inserts in groups?
- Maybe we need to consider raw SQL for the import portion if that's feasible...
Add back in close button to tree creation dialog. Add spacing between populated trees and empty trees
|
@grantfitzsimmons Addressed UI issues 👍 I can also update the tree files independently of this PR, so I will be working on updating those soon as well. |
grantfitzsimmons
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One usability concern I have is that if you dismiss the creation dialog, you are still shown the 'Add Tree' dialog, which may lead users to click the tree again instinctively, creating two trees:
Screen.Recording.2026-01-13.at.11.06.52.mov
Solution: Hide the 'Add Tree' dialog once creation begins
Since the tree creation is not fast yet (see #7641), the user is left wondering if the tree they started creating is being made again. I was not sent any notification for either of the trees I made. In the logs for the worker or specify7 container, I don't see an indication either of whether or not the tree is being created.
Until it becomes fast enough that the time between the tree creation being initiated and the time the tree exists, we need some mechanism to show the user this is happening.
I imagine this is just a bug? It looks like the intention was for this notification to be sent @alesan99
"Everything should be as simple as it can be, but not simpler"
Done 👍
It is indeed a bug, a notification should be sent out as soon as the worker receives the task. I could move the notification trigger to the main process, rather than the worker, but that risks lying in cases where the task didn't actually start (which looks like what's happening in this case maybe?). Perhaps I can at least add a failure notification if there is some way to detect if the worker didn't start the task. |
Fixes #6294, #7488, #7558
Adds
tree/create_default_tree/endpoint to create or populate a tree with records from a CSV retrived from a URL.Also fetches and displays a list of available default taxon trees in the Tree Viewer tree creation dialog.
TODO:
Details
/create_default_tree/accept a CSV url and discipline name. (Right now the frontend fetches a list of CSV files and sends a filename in the request, then the backend extracts the discipline from the filename, and then the backend chooses a url for the discipline from its own list.)Checklist
self-explanatory (or properly documented)
Testing instructions
yourdb.test.specifysystems.org/documentation/api/operations/all/yourdb.test.specifysystems.org/api/create_default_tree/status/YOUR_TASK_ID/. You should see your task id in the response you got.