Conversation
|
Hi Joren, Thanks for the PR! Wrapping the functions for Python interaction is a good idea! And catching the pathlib issue is much appreciated, we were aware of that problem but hadn’t had the opportunity to address it. As you suggested, it would be better to move the functions Then update the scripts to import these functions directly from Let me know what you think. If it makes sense and you have the capacity to make the change, we’ll review, approve and merge to main. Thanks for your contribution! Minhuan |
|
Thank you! The PR looks good to me. There are a few leftover code quality issues from our previous implementation—such as using a file open context manager and replacing the Biopython with Biotite if possible—that we’ll address in a future update, as they’re outside the scope of this PR. I’ve approved it and will merge by the end of today if no other issues are found. |
Hi,
thanks for this great package! I have a PR here that modifies scripts/run_msacluster.py and scripts/run_msascore.py. The changes that I propose here don't actually change the way you call these command line scripts. I left that exactly the same, so it should not break anyones code using these scripts.
What I did:
Adding python function for use besides the command line.
Create a "normal" python function with a signature allowing this procedure to be called easily from another python process without having to you over the command line. The main function now just parses the arguments and internally calls that function. As said, the CLI did not change.
Change to pathlib in clustering scripts.
Replace some string editing and os.path manipulations to pathlib. The main reason for this is that I got tripped up a bit when I was trying to follow this example: https://rocket-9.gitbook.io/rocket-docs/latent-state-of-a-human-serpin-with-msa-subsampling. The scoring script kept on not finding the clusters we just created and then I realised that the slash behind "-i serpin_clusters/" was really important here:
The reason was that the paths are string formatted here: https://github.com/rs-station/ROCKET/blob/de3041a08265a7e4933960aa969974e3453f146c/rocket/scripts/run_msascore.py#L298C1-L299C1 . Moving away from string editing prevents these finicky behavior with slashes.
If you want I can also move the new underlying functions inside of new modules directly underneath rocket and import them in the scripts.
have a nice day,
Joren