Skip to content

Comments

add prefixes to ensure unique col names#146

Open
matthewpeterkort wants to merge 7 commits intodevelopmentfrom
unique-col-names
Open

add prefixes to ensure unique col names#146
matthewpeterkort wants to merge 7 commits intodevelopmentfrom
unique-col-names

Conversation

@matthewpeterkort
Copy link
Collaborator

No description provided.

@quinnwai quinnwai self-requested a review September 23, 2025 18:44
Copy link
Contributor

@quinnwai quinnwai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just some conceptual questions. I'll approve once I review and test the rest (submission, aced_etl)

front_column_names += ["resourceType"]
if "patient" in df.columns:
front_column_names = front_column_names + ["patient"]
front_column_names = []
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why do we even do this reordering thing? Just for legibility of each ndjson record?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also, it makes sense to do it for general things like identifier and resourceType, why are we doing this for patient and stuff? Is there a functional difference to organizing the columns this way?

if "identifier" in df.columns:
front_column_names += ["identifier"]
if "resourceType" in df.columns:
prefix = inflection.underscore(data_type)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why did we need to underscore in the first place? Is this a hard constraint of shared filters?

Copy link
Contributor

@quinnwai quinnwai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change the relevant integration tests. They do validation of the document in elastic. Eg test_end_to_end_workflow.py::test_simple_workflow queries for file when validating elastic.

Complete list targeting calypr-dev...

FAILED tests/integration/test_bucket_import.py::test_bucket_import - AssertionError: g3t --debug --format json ls exit_code: 1, expected: 0
FAILED tests/integration/test_end_to_end_workflow.py::test_simple_workflow - AssertionError: g3t --debug push exit_code: 1, expected: 0
FAILED tests/integration/test_end_to_end_workflow.py::test_push_fails_with_invalid_doc_ref_creation_date - AssertionError: logs/publish.log does not exist.
FAILED tests/integration/test_end_to_end_workflow.py::test_push_fails_with_no_write_permissions - AssertionError: logs/publish.log does not exist.
FAILED tests/integration/test_rm_file.py::test_rm_uncommitted - AssertionError: g3t --debug push exit_code: 1, expected: 0
FAILED tests/integration/test_rm_file.py::test_rm_committed - AssertionError: g3t --debug push exit_code: 1, expected: 0
FAILED tests/integration/test_rm_file.py::test_rm_pushed - AssertionError: g3t --debug push exit_code: 1, expected: 0
FAILED tests/integration/test_rm_file.py::test_rm_commit_all - AssertionError: g3t --debug push exit_code: 1, expected: 0
FAILED tests/integration/test_rm_file.py::test_rm_pushed_links - AssertionError: g3t --debug push exit_code: 1, expected: 0

@matthewpeterkort
Copy link
Collaborator Author

end to end test working

* validate secondary identifiers

* just linting

* bump version
Copy link
Contributor

@quinnwai quinnwai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, tested and works! See comments to change back the end_to_end test. Also, how do we want to share the data-client binary for existing use? We need to provide that somewhere so people can still use g3t as we transition over to git-drs (docs still point to gen3-client setup).

matthewpeterkort and others added 2 commits October 14, 2025 09:47
Co-authored-by: Quinn Wai Wong <54592956+quinnwai@users.noreply.github.com>
Co-authored-by: Quinn Wai Wong <54592956+quinnwai@users.noreply.github.com>
Co-authored-by: Quinn Wai Wong <54592956+quinnwai@users.noreply.github.com>
Copy link
Contributor

@quinnwai quinnwai left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ty

@quinnwai
Copy link
Contributor

quinnwai commented Oct 14, 2025

associated binary here: https://github.com/calypr/data-client/releases/tag/0.0.1.

To use, cp binary to a directory so that it's available via which data-client

@quinnwai quinnwai mentioned this pull request Oct 17, 2025
15 tasks
* get subj of subj (patient) in docref

* linting

* bump

* add patient observations to research subject

* bump

* patch I think

* bumperino
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants