Open
Labels
app: backend (task implementation touches the backend), type: enhancement (enhancement to an existing feature), type: maintenance (maintaining this project)
Description
The public data dump script (src/mavedb/scripts/export_public_data.py) currently exports metadata (main.json), score/count CSVs, and a license file. It does not include mapped variant data (VRS alleles, mapped HGVS, etc.), even though this data is available via GET /api/v1/score-sets/{urn}/mapped-variants.
We should include mapped variant JSON in the data dump so that downstream consumers have access to post-mapped VRS representations without needing to call the live API.
Proposed Changes
- Add mapped variant data to the dump
For each published score set that has completed mapping, export its mapped variant data (the same payload returned by GET /api/v1/score-sets/{urn}/mapped-variants) as a JSON file in the archive, e.g.:
mapped/tmp:00000001-a-1.mapped-variants.json
Each file should contain the current mapped variants for that score set, including pre_mapped and post_mapped VRS allele JSON, HGVS columns, and VRS version metadata.
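A minimal sketch of how the export step might look, assuming the mapped variants have already been loaded into plain dictionaries keyed by score set URN. The function names, the `mapped_by_urn` argument, and the `current` flag used to pick current mapped variants are illustrative assumptions, not the existing script's API:

```python
import json
import zipfile
from typing import Any, List, Mapping, Sequence


def mapped_variants_member_name(urn: str) -> str:
    """Return the archive path for one score set, following the proposed layout."""
    # Assumption: the URN is used verbatim in the file name, as in the example path.
    return f"mapped/{urn}.mapped-variants.json"


def add_mapped_variants(
    archive: zipfile.ZipFile,
    mapped_by_urn: Mapping[str, Sequence[Mapping[str, Any]]],
) -> List[str]:
    """Write one JSON file per score set that has current mapped variants.

    Returns the archive member names that were written.
    """
    written = []
    for urn in sorted(mapped_by_urn):
        # Assumption: each record carries a boolean "current" flag.
        current = [v for v in mapped_by_urn[urn] if v.get("current")]
        if not current:
            # Score sets without current mapped variants get no file.
            continue
        name = mapped_variants_member_name(urn)
        archive.writestr(name, json.dumps(current, indent=2))
        written.append(name)
    return written
```

Skipping score sets with no current mapped variants keeps the archive free of empty files. Whether the real script should instead emit an empty list for them is an open design choice.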
- Add a README to the archive
Add a README.md (or README.txt) to the root of the dump archive that documents:
- What is included in the dump (metadata JSON, score CSVs, count CSVs, mapped variant JSON, license)
- The structure/layout of the archive directory
- A brief description of each file type and its format
- Any caveats (e.g. only CC0-licensed published data is included, only current mapped variants are exported)
- A link back to MaveDB and the API documentation for further reference
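The README could be rendered from a small list of entries so that it stays in sync with what the script writes. The sketch below is one possible shape, not the project's implementation. `build_readme`, its parameters, and the section headings are hypothetical, and the URLs are passed in rather than hard-coded:

```python
from typing import Iterable, Tuple


def build_readme(
    entries: Iterable[Tuple[str, str]],
    caveats: Iterable[str],
    site_url: str,
    api_docs_url: str,
) -> str:
    """Render a Markdown README describing the dump archive.

    entries: (path or path pattern, description) pairs for the archive contents.
    caveats: short statements about what is and is not included.
    """
    lines = ["# MaveDB public data dump", "", "## Contents", ""]
    for path, description in entries:
        lines.append(f"- `{path}`: {description}")
    lines += ["", "## Caveats", ""]
    lines += [f"- {caveat}" for caveat in caveats]
    lines += [
        "",
        "## Further reference",
        "",
        f"- MaveDB: {site_url}",
        f"- API documentation: {api_docs_url}",
        "",  # ensures the text ends with a newline
    ]
    return "\n".join(lines)
```

The returned string can then be added to the archive root with `zipfile.ZipFile.writestr("README.md", text)`.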