This is the official implementation of Deep-BCR-Auto.
- Use any foreground segmentation method you like. Here, we use CLAM's implementation. Clone it and go to the `/CLAM` folder.
- Basic run (we changed `sthresh` to 20):
```
python create_patches_fp.py --source <> --save_dir results/TCGA --patch_size 448 --step_size 448 --seg
```
- Tune segmentations. Inspect each mask, tune `process_list_edited.csv`, and set `processed` to 1 for the masks you want to tune:
```
python create_patches_fp.py --source <> --save_dir results/TCGA --patch_size 448 --step_size 448 --seg --process_list process_list_edited.csv
```
- Get tissue patches. Set all `processed` to 1 (a pandas sketch for doing this in bulk follows this step):
```
python create_patches_fp.py --source <> --save_dir results/TCGA --patch_size 448 --step_size 448 --seg --process_list process_list_edited.csv --patch --stitch
```
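If many slides need the same change, flipping the flag programmatically is quicker than editing the CSV by hand. A minimal sketch with pandas, assuming the flag column in your `process_list_edited.csv` is named `processed` as described above (CLAM's auto-generated list names it `process`, so check your header) and adjusting the path to wherever your edited list lives:

```python
# Minimal sketch: flip the process flag for every slide in the process list.
# The column name is an assumption -- check your process_list_edited.csv header
# (CLAM's auto-generated list names this column "process").
import pandas as pd

csv_path = "results/TCGA/process_list_edited.csv"   # adjust to your edited list's location
df = pd.read_csv(csv_path)
df["processed"] = 1                                  # or df["process"] = 1, depending on your CSV
df.to_csv(csv_path, index=False)
```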
- Go to `/tumorbulk`.
- Download pretrained weights from here. Copy 'ctranspath.pth' to `/tumorbulk/TransPath/` and copy 'model_best.pth.tar' to `/tumorbulk/`.
- Generate tumor bulk masks:
```
python tumorbulk.py --datadir <> --ptsdir ../CLAM/results/TCGA/patches --savedir ./results --code TCGA
```
- Create a dataset excel based on the slides that have masks ('../tcga_brca_bulkdetected.csv').
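There is no dedicated command above for this step, so here is a minimal pandas sketch. It assumes the tumor-bulk masks are saved as one image file per slide under the `--savedir`/`--code` folder and that downstream scripts only need a slide identifier column; the directory layout, file extension, and `slide_id` column name are assumptions, so check tumorbulk.py and pts_extraction.py for the exact outputs and required columns.

```python
# Minimal sketch: collect slides that received a tumor-bulk mask and write the dataset CSV.
# Mask directory layout, file extension, and the "slide_id" column name are assumptions;
# check tumorbulk.py / pts_extraction.py for the real requirements.
from pathlib import Path
import pandas as pd

mask_dir = Path("./results/TCGA")                           # assumed --savedir/--code layout
slide_ids = sorted(p.stem for p in mask_dir.glob("*.png"))  # assumed one mask image per slide

pd.DataFrame({"slide_id": slide_ids}).to_csv("../tcga_brca_bulkdetected.csv", index=False)
```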
- Crop patches:
```
python pts_extraction.py -p 896 -s 896 -l 0 --code TCGA --datadf ../tcga_brca_bulkdetected.csv
```
Main args:
- `-p`: patch size
- `-s`: stride
- `-l`: level at which patches are extracted (i.e., openslide downsample level)
- `--code`: experiment code
- `--datadf`: data spreadsheet (i.e., the dataset excel)

**NOTE: pts are now saved in (x, y) format for consistency.**
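If you want to convince yourself of the (x, y) ordering, a quick check against the slide dimensions works. This sketch assumes the saved points for a slide can be loaded as an (N, 2) NumPy array of level-0 coordinates; the file name and loading call are placeholders, since the actual on-disk format is defined by pts_extraction.py.

```python
# Quick sanity check of the (x, y) ordering: on a non-square slide, x should stay within
# the slide width and y within the slide height. File name and array layout are
# placeholders; see pts_extraction.py for the actual output format.
import numpy as np
import openslide

slide = openslide.OpenSlide("/path/to/slide.svs")
pts = np.load("/path/to/slide_pts.npy")     # assumed: (N, 2) array of (x, y) pairs
width, height = slide.dimensions            # OpenSlide reports (width, height) at level 0

assert pts[:, 0].max() <= width and pts[:, 1].max() <= height
```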
- Go to `./casii/TransPath`. Copy 'ctranspath.pth' to `/casii/TransPath/` (see the Google link from the previous step).
- Encode patches:
```
python get_features_CTransPath_stainnorm.py --psize 896 --level 0 --datadf ../../tcga_brca_bulkdetected.csv --save ../data/ctpnormembedding --stainnorm
```
Main args:
- `--psize`: patch size
- `--stride`: stride
- `--level`: openslide level
- `--datadf`: dataset spreadsheet listing all the slides you are using
- `--stainnorm`: perform stain normalization (store true, default: False)
- `--save`: save dir
- *(optional)* Encoding using resnet50:
```
python get_features_resnet_stainnorm.py --psize 896 --level 0 --datadf ../../tcga_brca_bulkdetected.csv --save ../data/resnormembedding --stainnorm
```
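For orientation, here is a conceptual sketch of what such a patch-encoding loop does, using a stock torchvision ResNet-50 as a stand-in encoder. The actual scripts use CTransPath or the repo's ResNet variant and optionally apply stain normalization, which this sketch omits; the patch folder and output file name are assumptions.

```python
# Conceptual sketch of patch encoding with a stand-in torchvision ResNet-50.
# The real pipeline is get_features_CTransPath_stainnorm.py / get_features_resnet_stainnorm.py;
# paths and the saved-feature format here are assumptions, and stain normalization is omitted.
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image
from pathlib import Path

device = "cuda" if torch.cuda.is_available() else "cpu"

# ResNet-50 trunk without the classification head -> one 2048-d feature per patch.
backbone = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V2)
backbone.fc = torch.nn.Identity()
backbone.eval().to(device)

preprocess = T.Compose([
    T.Resize(224),
    T.ToTensor(),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

patch_dir = Path("patches/slide_id")                 # hypothetical folder of cropped patches
feats = []
with torch.no_grad():
    for p in sorted(patch_dir.glob("*.png")):
        img = preprocess(Image.open(p).convert("RGB")).unsqueeze(0).to(device)
        feats.append(backbone(img).squeeze(0).cpu())

torch.save(torch.stack(feats), "slide_id_features.pt")   # hypothetical output name
```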
- Build keyset:
```
python keyset_lrn.py --datadf '../tcga_brca_bulkdetected.csv' --featdir './data/ctpnormembedding/l0p896s896' --task fivefold -t 100 --psize 896
```
Main args:
- `--featdir`: feature dir
- `--task`: task name
- `-t`: maximum # of keys per slide
- `--cur`: run the CUR function (store true)
- `--extract`: run the keyset extraction function (must be run after `--cur`; store true)
- `--psize`: patch size
- `--encoder`: encoder name (default: ctpnorm)
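For intuition about the `--cur` / `--extract` stages, the sketch below shows one standard way to pick at most `t` representative patches per slide via CUR-style leverage scores. It is a generic illustration of the idea, not the exact logic of keyset_lrn.py; the feature dimensions and the `rank` parameter are assumptions.

```python
# Generic illustration of CUR-style key selection: rank a slide's patch features by
# leverage scores from a truncated SVD and keep at most t of them.
# This is not the exact keyset_lrn.py implementation.
import numpy as np

def select_keys(feats: np.ndarray, t: int = 100, rank: int = 32) -> np.ndarray:
    """feats: (num_patches, feat_dim) array; returns indices of up to t key patches."""
    u, _, _ = np.linalg.svd(feats, full_matrices=False)
    lev = (u[:, :rank] ** 2).sum(axis=1)    # row leverage scores
    return np.argsort(lev)[::-1][:t]        # indices of the t highest-leverage patches

# Example: 5000 patches with 768-d CTransPath features, keep at most 100 keys per slide.
keys = select_keys(np.random.randn(5000, 768), t=100)
```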
- Run casii:
```
python train.py --arch CASii_MB --data threefold --code ctpnormTCGAodx_patience5_stopep10_ws_lr1e4 --psize 896 --nfold 3 --weighted-sample --patience 5 --stop_epoch 10 --lr 1e-4
```
Main args:
- `--arch`: model architecture
- `--data`: dataset name in mydatasets
- `--code`: experiment code
- `--psize`: patch size
- `--weighted_sample`: store true, default: False
- `--nfold`: number of folds, default: 5
Pretrained weights can be found here, in folders 42, 43, and 44; 42 is the best-performing fold.
Our splits file can be found at the same link, in the splits folder.
- Test:
```
python eval.py --arch CASii_MB --data threefold --code ctpnormTCGAodx_patience5_stopep10_ws_lr1e4 --psize 896 --nfold 3
```
Main args:
- `--arch`: model architecture
- `--data`: dataset name in mydatasets
- `--code`: experiment code
- `--psize`: patch size