add prefixes to ensure unique col names #146
matthewpeterkort wants to merge 7 commits into development
quinnwai
left a comment
Just some conceptual questions. I'll approve once I review and test the rest (submission, aced_etl)
    front_column_names += ["resourceType"]
if "patient" in df.columns:
    front_column_names = front_column_names + ["patient"]
front_column_names = []
why do we even do this reordering thing? Just for legibility of each ndjson record?
Also, it makes sense to do it for general fields like identifier and resourceType, but why are we doing this for patient and similar columns? Is there a functional difference to organizing the columns this way?
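For context on the question above, here is a minimal sketch of what the reordering amounts to, assuming it only moves a few well-known columns to the front and leaves the rest in their existing order. It works on a plain list of column names rather than a DataFrame; `reorder_columns` and its `front` default are illustrative names, not the code in this PR.

```python
def reorder_columns(columns, front=("identifier", "resourceType", "patient")):
    """Return columns with any present `front` names first, the rest in original order."""
    # Only keep the front names that actually exist, mirroring the `if ... in df.columns` checks.
    front_column_names = [name for name in front if name in columns]
    remaining = [name for name in columns if name not in front_column_names]
    return front_column_names + remaining


columns = ["status", "patient", "resourceType", "identifier"]
print(reorder_columns(columns))
```

With a pandas DataFrame, the same ordering could be applied with `df[reorder_columns(list(df.columns))]`. The order changes only how each record reads, not which keys it contains.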
if "identifier" in df.columns:
    front_column_names += ["identifier"]
if "resourceType" in df.columns:
prefix = inflection.underscore(data_type)
why did we need to underscore in the first place? Is this a hard constraint of shared filters?
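To make the question concrete, here is a minimal sketch of prefixing columns with an underscored resource type so that names from different resource types cannot collide. The regex helper is a stdlib stand-in for `inflection.underscore`. `prefix_columns`, the `shared` tuple, and the `<prefix>_<column>` naming scheme are assumptions for illustration, not necessarily what this PR does.

```python
import re


def underscore(word):
    """Stdlib stand-in for inflection.underscore: CamelCase -> snake_case."""
    word = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", word)
    word = re.sub(r"([a-z\d])([A-Z])", r"\1_\2", word)
    return word.replace("-", "_").lower()


def prefix_columns(columns, data_type, shared=("identifier", "resourceType")):
    """Prefix every non-shared column with the underscored data type (hypothetical scheme)."""
    prefix = underscore(data_type)
    return [c if c in shared else f"{prefix}_{c}" for c in columns]


print(prefix_columns(["identifier", "status", "category"], "DocumentReference"))
```

Under this scheme a `status` column from DocumentReference and one from ResearchSubject would end up as `document_reference_status` and `research_subject_status`, so they stay distinct when flattened side by side.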
Can you update the relevant integration tests? They validate the document in Elastic. E.g., test_end_to_end_workflow.py::test_simple_workflow queries for file when validating Elastic.
Complete list targeting calypr-dev...
FAILED tests/integration/test_bucket_import.py::test_bucket_import - AssertionError: g3t --debug --format json ls exit_code: 1, expected: 0
FAILED tests/integration/test_end_to_end_workflow.py::test_simple_workflow - AssertionError: g3t --debug push exit_code: 1, expected: 0
FAILED tests/integration/test_end_to_end_workflow.py::test_push_fails_with_invalid_doc_ref_creation_date - AssertionError: logs/publish.log does not exist.
FAILED tests/integration/test_end_to_end_workflow.py::test_push_fails_with_no_write_permissions - AssertionError: logs/publish.log does not exist.
FAILED tests/integration/test_rm_file.py::test_rm_uncommitted - AssertionError: g3t --debug push exit_code: 1, expected: 0
FAILED tests/integration/test_rm_file.py::test_rm_committed - AssertionError: g3t --debug push exit_code: 1, expected: 0
FAILED tests/integration/test_rm_file.py::test_rm_pushed - AssertionError: g3t --debug push exit_code: 1, expected: 0
FAILED tests/integration/test_rm_file.py::test_rm_commit_all - AssertionError: g3t --debug push exit_code: 1, expected: 0
FAILED tests/integration/test_rm_file.py::test_rm_pushed_links - AssertionError: g3t --debug push exit_code: 1, expected: 0
end to end test working
* validate secondary identifiers
* just linting
* bump version
quinnwai
left a comment
Thanks, tested and works! See comments to change back the end_to_end test. Also, how do we want to share the data-client binary for existing use? We need to provide that somewhere so people can still use g3t as we transition over to git-drs (docs still point to gen3-client setup).
Co-authored-by: Quinn Wai Wong <54592956+quinnwai@users.noreply.github.com>
associated binary here: https://github.com/calypr/data-client/releases/tag/0.0.1. To use, cp binary to a directory so that it's available via
* get subj of subj (patient) in docref
* linting
* bump
* add patient observations to research subject
* bump
* patch I think
* bumperino
No description provided.