feat: use .walk file in complexity command #10

Draft
aryarm wants to merge 12 commits into main from feat/walks-complexity

@aryarm (Member) commented Feb 5, 2025

very draft, rough as sandpaper, thoroughly untested, needs much refactoring

todo:

  • write more tests to check whether loading from .walk file works the same as loading walks from gfa
  • try to get a gfa for a region that is smaller than 500 KB, so that we can use it for testing:
    RHD/RHCE locus: hg38 query: chr1:25240000-25460000
    • this is unfortunately too big (a few MB)
    • I also tried chr1:25368857-25368858, which seems to capture the entire loop but also results in a GFA that's 2.8 MB
    • removing the walks from that GFA brings it down to 476 KB! which is great, but it would also be nice if I could keep the walks in the file...
  • load node indices as integers instead of strings. will require changing some of the test data
  • write tests to check whether it matters that some walks might pass through a node more than once
  • benchmark and consider whether a different data structure for walks might be better
  • move classes in graph_utils and gbz_utils to data module?
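
A rough sketch of what the integer node IDs and repeated-node todo items could look like in a test. It only parses GFA 1.1 W-lines; the layout of the .walk file is not shown here, so it is left out. The names `parse_walk_line` and `repeated_nodes` are hypothetical helpers and not functions from this PR, and the sketch assumes segment names are purely numeric.

```python
import re
from collections import Counter

# One oriented step of a GFA 1.1 walk, e.g. ">12" (forward) or "<7" (reverse)
STEP = re.compile(r"([><])([^><]+)")


def parse_walk_line(line):
    """Parse a GFA 1.1 W-line into (sample, hap_index, [(node_id, is_forward), ...])."""
    fields = line.rstrip("\n").split("\t")
    if fields[0] != "W":
        raise ValueError("not a W-line")
    # W-line columns: W, SampleId, HapIndex, SeqId, SeqStart, SeqEnd, Walk
    sample, hap_index, walk = fields[1], int(fields[2]), fields[6]
    # Assumption: segment names are numeric, so they can be stored as ints
    steps = [(int(node), orient == ">") for orient, node in STEP.findall(walk)]
    return sample, hap_index, steps


def repeated_nodes(steps):
    """Return {node_id: visit_count} for nodes a walk passes through more than once."""
    counts = Counter(node for node, _ in steps)
    return {node: n for node, n in counts.items() if n > 1}


# Made-up walk that visits node 2 twice
line = "W\tsampleA\t1\tchr1\t0\t40\t>1>2<3>2>4"
sample, hap_index, steps = parse_walk_line(line)
print(steps)
print(repeated_nodes(steps))
```

If walks end up stored as sets of nodes, the second visit to node 2 would be lost, which is the case the repeated-node tests would need to cover.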

@aryarm changed the title from "wip: read walks from .walk in complexity - initial implementation" to "feat: use .walk file in complexity command" on Feb 5, 2025