Update MarkerScan for compatibility with v2 of the NCBI API #14
Merged
prototaxites merged 13 commits into CobiontID:ncbi_fix from Nov 6, 2025
- use lightweight custom class reimplementing required functionality with the NCBI API
- tidy up some logic with counting loops by just returning counts from the API
- use a bit of pathlib to do some path stuff
- re-format modified scripts with Ruff
…ersion of snakemake
The primary purpose of this PR is to update MarkerScan for compatibility with v2 of the NCBI API, replacing the dependency on `ncbi-datasets-pylib`, which is deprecated.

To do this, I have added a small class in `scripts/NcbiApi.py` which replicates the required functionality using the `requests` library. While integrating it into the various scripts, I've also made a few changes to improve efficiency, including determining the number of assemblies directly from the API rather than counting them in a loop.

Currently, you must set the environment variable `NCBI_API_KEY` to access the API. I don't know enough Snakemake to modify the Snakefile, but I've also added a command-line argument (`-k`) to the affected Python scripts so the key can be specified manually.

In trying to get the modified version of the pipeline to run, I have also made a number of small changes:
`map-hifi` option used in the pipeline (is there an alternate set of env files somewhere?)
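
For reviewers, here is a minimal sketch of the two ideas described above: picking up the API key from `-k` or `NCBI_API_KEY`, and reading a count from the API response instead of counting in a loop. It is not the code in `scripts/NcbiApi.py`. The function names `resolve_api_key` and `assembly_count` are hypothetical, the precedence (flag over environment variable) is an assumption, and so is the `total_count` field name in the decoded response.

```python
import argparse
import os


def resolve_api_key(argv=None, environ=None):
    """Return the NCBI API key from -k if given, else from NCBI_API_KEY.

    Assumption: an explicit -k value takes precedence over the environment.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser(description="Hypothetical helper script")
    parser.add_argument(
        "-k",
        dest="api_key",
        default=None,
        help="NCBI API key (falls back to the NCBI_API_KEY environment variable)",
    )
    args = parser.parse_args(argv)
    key = args.api_key or environ.get("NCBI_API_KEY")
    if not key:
        # argparse prints a usage message and exits with status 2.
        parser.error("no NCBI API key: pass -k or set NCBI_API_KEY")
    return key


def assembly_count(report):
    """Read the number of assemblies from an already-decoded JSON response.

    Assumption: the response is a dict with a top-level "total_count" field;
    a missing field is treated as zero assemblies.
    """
    return int(report.get("total_count", 0))
```

Calling `resolve_api_key()` with no arguments reads `sys.argv` and `os.environ`; passing both explicitly, as in `resolve_api_key(["-k", "my-key"], {})`, makes the lookup easy to exercise without touching the real environment.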