69 datamodule and data import are combined by mikewoodward94 · Pull Request #70 · GSTT-CSC/project-template

mikewoodward94 · 2024-08-13T15:21:38Z

Summary of changes

Added a new file called XNATDataImport.py and adapted train.py and DataModule.py accordingly.

Reason for changes

The current implementation pulls data from XNAT while the PyTorch Lightning DataModule is being initialised, which is messy and confusing. Making the data import a separate step is clearer, and it also allows for simpler unit testing and data validation.

In this PR I've also updated the DataModule to align with best practices (https://lightning.ai/docs/pytorch/stable/data/datamodule.html). This includes removing the get_data() step, which is no longer needed, and renaming prepare_data() to setup().
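
The separation described above could look roughly like the following. This is a dependency-free sketch, not the code in this PR: `import_data`, `validate` and `SketchDataModule` are hypothetical names, the records are made up, and the class only mimics the `setup()` and `train_dataloader()` hook names of a Lightning DataModule rather than inheriting from it.

```python
from typing import Dict, List, Optional

Record = Dict[str, str]


def import_data() -> List[Record]:
    """Stand-in for the separate XNAT import step; returns made-up records."""
    return [
        {"subject_id": f"sub-{i:03d}", "image": f"/tmp/img_{i}.dcm"}
        for i in range(10)
    ]


def validate(records: List[Record]) -> None:
    """Data validation that can be unit tested without building a DataModule."""
    for record in records:
        if not record.get("image"):
            raise ValueError(f"missing image for {record.get('subject_id')}")


class SketchDataModule:
    """Mimics DataModule hook names; receives data that was already imported."""

    def __init__(self, data: List[Record], batch_size: int = 2,
                 train_fraction: float = 0.8) -> None:
        self.data = data
        self.batch_size = batch_size
        self.train_fraction = train_fraction
        self.train_data: List[Record] = []
        self.val_data: List[Record] = []

    def setup(self, stage: Optional[str] = None) -> None:
        # Only split the data here; no network access is needed any more.
        split = int(len(self.data) * self.train_fraction)
        self.train_data = self.data[:split]
        self.val_data = self.data[split:]

    def train_dataloader(self) -> List[List[Record]]:
        # A plain list of batches stands in for a torch DataLoader.
        return [
            self.train_data[i:i + self.batch_size]
            for i in range(0, len(self.train_data), self.batch_size)
        ]


# Roughly what a training script would do: import, validate, then hand over.
records = import_data()
validate(records)
data_module = SketchDataModule(records)
data_module.setup("fit")
```

With the import pulled out like this, `validate` and `import_data` can each be tested on their own, and the DataModule can be tested with small hand-written records.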

sophie22 · 2024-08-14T08:28:39Z

project/DataModule.py

+        self.batch_size = batch_size
+        self.num_workers = num_workers
        self.visualise_training_data = visualise_training_data
+        self.image_series_option = image_series_option


Is this only relevant for image data from XNAT? If so, could that be reflected in a comment?

Leave for now, can always delete or refactor at a later stage.

add comments in the code to prompt future users to delete as appropriate

sophie22 · 2024-08-14T08:32:58Z

project/DataModule.py

        ]

-    def prepare_data(self, *args, **kwargs):
+    def setup(self, *args, **kwargs):


I would like to have an in-depth discussion around setup() versus prepare_data(), potentially at the next TT.

Keep as is for now, but explore in the future.

Add a comment in the code that briefly summarises the decision to change this, and point to resources that list the relevant considerations for future use.
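
As a starting point for that discussion, here is a minimal sketch of how the two hooks differ, based on the Lightning DataModule documentation linked in the PR description. The class name `HookOrderSketch` and its attributes are hypothetical, and it does not inherit from Lightning; it only records the order in which the hooks are called and where state is assigned.

```python
from typing import List, Optional


class HookOrderSketch:
    """Mimics the two DataModule hook names to show where state belongs."""

    def __init__(self, records: List[int]) -> None:
        self.records = records
        self.calls: List[str] = []
        self.train_data: Optional[List[int]] = None

    def prepare_data(self) -> None:
        # In Lightning this hook is meant for one-off work such as downloads.
        # It runs in a single process, so the docs advise against assigning
        # state (self.x = ...) here: other processes would not see it.
        self.calls.append("prepare_data")

    def setup(self, stage: Optional[str] = None) -> None:
        # In Lightning this hook runs on every process, so this is where
        # splits and datasets are assigned to the instance.
        self.calls.append(f"setup:{stage}")
        self.train_data = list(self.records)


# Simulate the order in which a trainer would call the hooks.
module = HookOrderSketch([1, 2, 3])
module.prepare_data()
module.setup("fit")
```

Because the data import now happens before the DataModule is created, nothing is left for `prepare_data()` to do in this sketch, which is consistent with the rename in this PR.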

sophie22 · 2024-08-14T08:35:35Z

project/XNATDataImport.py

+        return(data_builder.dataset)
+
+    @staticmethod
+    def fetch_xr(subject_data: SubjectData = None) -> List[ImageScanData]:


"CR" and "DX" are, I assume, specific to X-rays. I recommend we add several well-documented fetch functions and prompt the developer of each project to choose and adapt the ones that apply to their project.

Create as separate issue!
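
One possible shape for those fetch functions is a single modality filter with thin, documented wrappers per modality. This is only a sketch for the follow-up issue: `Scan`, `fetch_by_modality` and `fetch_mr` are hypothetical names, and the `fetch_xr` signature here is simplified and differs from the one in this PR, which takes `SubjectData` and returns `ImageScanData`. "CR" and "DX" are the DICOM modality codes for computed and digital radiography.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Scan:
    """Hypothetical stand-in for a scan record; not the PR's ImageScanData."""
    scan_id: str
    modality: str


def fetch_by_modality(scans: Iterable[Scan], modalities: Iterable[str]) -> List[Scan]:
    """Return the scans whose modality code is in `modalities` (case-insensitive)."""
    wanted = {m.upper() for m in modalities}
    return [scan for scan in scans if scan.modality.upper() in wanted]


def fetch_xr(scans: Iterable[Scan]) -> List[Scan]:
    """X-ray scans: computed radiography (CR) and digital radiography (DX)."""
    return fetch_by_modality(scans, ("CR", "DX"))


def fetch_mr(scans: Iterable[Scan]) -> List[Scan]:
    """Magnetic resonance scans (MR); delete if not relevant to the project."""
    return fetch_by_modality(scans, ("MR",))


# Made-up scans for illustration.
example_scans = [
    Scan("1", "CR"),
    Scan("2", "MR"),
    Scan("3", "dx"),
    Scan("4", "CT"),
]
xr_scans = fetch_xr(example_scans)
mr_scans = fetch_mr(example_scans)
```

A project developer would then keep only the wrappers they need, or call `fetch_by_modality` directly with their own codes.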

scripts/train.py

sophie22 · 2024-08-14T09:15:30Z

project/XNATDataImport.py

+import pandas as pd
+import numpy as np
+
+from utils.tools import DataBuilderXNAT


Could XNATDataImport be merged/refactored with DataBuilderXNAT? What's the benefit of having them separate?

A merged version is easier to remove or ignore for projects that are not image-based.

Different functionalities so keep separate.

Make plans to move both to the CSC-XNAT package.

sophie22 · 2024-08-14T09:24:40Z

README.md

 Note: The values present in the template config files are examples, you can remove any except those in `[server]` and `[project]` which are necessary for MLOps. Outside of these you are encouraged to add and modify the config files as relevant to your project.

-### 2. `project/Network.py`
+### 2. `project/XNATDataImport.py`


I think this should remain part of the DataModule component.

As it is its own component, I think it should have its own explanation.

Will create an issue to make a code structure/architecture image.

mikewoodward94 added 6 commits August 13, 2024 15:50

Create XnatDataImport.py

acc54ff

Update DataModule.py

2e31754

Update train.py

047d4ef

Update README.md

c18fd28

Update and rename XnatDataImport.py to XNATDataImport.py

4919e25

Update README.md

98414c0

mikewoodward94 linked an issue Aug 13, 2024 that may be closed by this pull request

DataModule and Data Import are combined #69

Open

mikewoodward94 self-assigned this Aug 13, 2024

sophie22 reviewed Aug 14, 2024


2 participants