This repository contains the source code for:
On the Generalization of Optical Flow: Quantifying Robustness to Dataset Shifts
ICCV 2025 Workshop DataCV
Katrin Bauer, Andrés Bruhn, and Jenny Schmalfuss
Optical flow models are commonly evaluated by their ability to accurately predict the apparent motion from image sequence data. Though not seen during training, this evaluation data generally shares the training data's characteristics because it stems from the same distribution, i.e., it is in-distribution (ID) with the training data. However, when models are applied in the real world, the test data characteristics may be shifted, i.e. out-of-distribution (OOD), compared to the training data. For optical flow models, the generalization to dataset shifts is much less reported than the typical accuracy on ID data. In this work, we close this gap and systematically investigate the generalization of optical flow models by disentangling accuracy and robustness to dataset shifts with a new effective robustness metric. We evaluate a testbed of 20 models on six established optical flow datasets. Across models and datasets, we find that ID accuracy can be used as a predictor for OOD performance, but certain models generalize better than this trend suggests. While our analysis reveals that model generalization capabilities declined in recent years, we also find that more training data and smart architectural choices can improve generalization. Across tested models, effective robustness to dataset shifts is high for models that avoid attention mechanisms and favor multi-scale designs.
Clone this repository and all its submodules:
git clone thisrepository
git submodule update --init --recursive --remote

Set up a virtual environment and upgrade pip:

python3 -m venv effrob
source effrob/bin/activate
python -m pip install --upgrade pip

Change into the scripts_setup folder and execute the script that installs all required packages via pip. As each package is installed successively, you can later debug errors for specific packages.
cd scripts_setup
bash install_packages.sh
cd ..

Depending on your CUDA version, you may need to install an older PyTorch version.
All models except ptlflow were tested with
# Install PyTorch. The torch version depends on your CUDA version.
# For available versions, see https://pytorch.org/get-started/previous-versions/.
pip install torch==1.12.1+cu113 torchvision==0.13.1+cu113 torchaudio==0.12.1 --extra-index-url https://download.pytorch.org/whl/cu113
pip install numpy==1.26.4

Using ptlflow requires PyTorch version 2. Check models/ptlflow/requirements.txt for the respective requirements.
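Since the ptlflow models and the remaining models therefore need different PyTorch major versions, it can help to verify which environment is active. A minimal sketch (plain PyTorch, no assumptions beyond the versions stated above):

```python
import torch

# Check which model family the active environment is suitable for:
# ptlflow models need PyTorch 2.x, all other models were tested with 1.12.1.
major = int(torch.__version__.split(".")[0])
print(f"PyTorch {torch.__version__} detected")
if major >= 2:
    print("Use this environment for the ptlflow models.")
else:
    print("Use this environment for the non-ptlflow models (tested with torch 1.12.1).")
```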
The module alt_cuda_corr can reduce memory load and is required for MS-RAFT+ and CCMR+.
cd models/ptlflow/ptlflow/utils/external/alt_cuda_corr/
python setup.py install

For MatchFlow, compile the QuadTreeAttention extension:

cd models/MatchFlow/QuadTreeAttention/
python setup.py install

For FlowNet, please refer to the PyTorch documentation on how to compile the channelnorm, correlation and resample2d extensions.
If all else fails, go to the extension folders /models/FlowNet/{channelnorm,correlation,resample2d}_package, manually execute
python3 setup.py install

and potentially replace cxx_args = ['-std=c++11'] by cxx_args = ['-std=c++14'], and the list of nvcc_args by nvcc_args = [] in every setup.py file.
If manually compiling worked, you may need to add the paths to the respective .egg files in the {channelnorm,correlation,resample2d}.py files, e.g. for channelnorm via
sys.path.append("/lib/pythonX.X/site-packages/channelnorm_cuda-0.0.0-py3.6-linux-x86_64.egg")
import channelnorm_cuda

The site-packages folder location varies depending on your operating system and Python version.
If the installation of the spatial-correlation-sampler works and you have a CUDA-capable machine, open helper_functions/config_specs.py and make sure to set the variable "correlationSamplerOnlyCPU" to False. This will speed up computations when using PWCNet.
If the spatial-correlation-sampler does not install, run the following script to install a CPU-only version:
cd scripts
bash install_scs_cpu.sh

When loading gcc and CUDA versions from modules, you need to make sure the versions are compatible, and you may have to adjust GCC_HOME and other variables. See more information in this issue. One solution presented there changes the variables as follows:
## used to compile .cu and for cudnn
export GCC_HOME=/path/to/gcc-5.4.0/
export PATH=$GCC_HOME/bin/:$PATH
export LD_LIBRARY_PATH=$GCC_HOME/lib:$GCC_HOME/lib64:$GCC_HOME/libexec:$LD_LIBRARY_PATH
export CPLUS_INCLUDE="$GCC_HOME/include:$CPLUS_INCLUDE"
export C_INCLUDE="$GCC_HOME/include:$C_INCLUDE"
export CXX=$GCC_HOME/bin/g++
export CC=$GCC_HOME/bin/gcc ## for make
CC=$GCC_HOME/bin/gcc ## for cmake
## compile using nvcc with gcc
export EXTRA_NVCCFLAGS="-Xcompiler -std=c++98"
## pip install
pip install spatial-correlation-sampler

For evaluation, we use the datasets FlyingThings3D, Sintel, KITTI 2015, HD1K, Driving, VIPER, and Spring. The datasets are assumed to be in a similar layout as for training RAFT:
├── datasets
    ├── FlyingChairs_release
        ├── data
    ├── FlyingThings3D
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── optical_flow
    ├── Sintel
        ├── test
        ├── training
    ├── KITTI
        ├── testing
        ├── training
    ├── HD1k
        ├── hd1k_challenge
        ├── hd1k_input
        ├── hd1k_flow_uncertainty
        ├── hd1k_flow_gt
    ├── driving
        ├── frames_cleanpass
        ├── frames_finalpass
        ├── optical_flow
    ├── Viper
        ├── val
    ├── Spring
        ├── train
If you have them already saved somewhere else, you may link to the files with
mkdir datasets
cd datasets
ln -s /path/to/Sintel Sintel
ln -s /path/to/KITTI KITTI

or specify the paths and names directly in helper_functions/config_specs.py.
Evaluating the effective robustness of an optical flow model is done in two steps:
- Evaluate the model on the in-distribution and out-of-distribution datasets. (See below for how to add an external model to this repository.)
python evaluate_accuracy.py \
--net YourModel --custom_weight_path path/to/your/ckpt.pth \
--dataset FlyingThings3D --dataset_stage validation --dataset_pass final
python evaluate_accuracy.py \
--net YourModel --custom_weight_path path/to/your/ckpt.pth \
--dataset Kitti15 --dataset_stage training --dataset_pass ""

- Evaluate the effective robustness as the difference to the baseline.
The baselines from the paper are in
paper_results/baseline_{$ID_DATASET}_{$OOD_DATASET}.json.
python evaluate_effectiverobustness.py \
--net YourModel --custom_weight_path path/to/your/ckpt.pth \
--id_dataset FlyingThings3D --id_dataset_stage validation --id_dataset_pass final \
--ood_dataset Kitti15 --ood_dataset_stage training --ood_dataset_pass "" \
--baseline_file paper_results/baseline_FlyingThings3D_Kitti15.json

To evaluate a model on all datasets used for the paper, you can use the scripts evaluate_model_accuracy.sh and evaluate_model_effective_robustness.sh in scripts_evaluate/.
- Adapt the environment variables NET and WPATH in each file.
- Run them:
sh scripts_evaluate/evaluate_model_accuracy.sh
sh scripts_evaluate/evaluate_model_effective_robustness.sh

The above process assumes you use our scripts to evaluate the accuracy of a model and our folder structure for storing intermediate results. If you use your own evaluation scripts to compute the WAUC, you can evaluate the effective robustness via:
python evaluate_effectiverobustness_wauc.py --id-wauc 0.63 --ood-wauc 0.52 --baseline_file paper_results/baseline_FlyingThings3D_Kitti15.json

We provide the baselines used in the paper in paper_results/baseline_{$ID_DATASET}_{$OOD_DATASET}.json.
If you want to fit a baseline to your own set of models, check out the script scripts_evaluate/fit_baselines.sh.
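For reference, the sketch below illustrates the idea behind these scripts: a baseline is fitted over the (ID WAUC, OOD WAUC) pairs of a set of models, and the effective robustness of a model is its measured OOD WAUC minus the OOD WAUC the baseline predicts from its ID WAUC. The linear fit and the example numbers are illustrative assumptions only; the actual baseline form and values are defined by fit_baselines.py and the paper.

```python
import numpy as np

def fit_linear_baseline(id_waucs, ood_waucs):
    """Fit OOD WAUC as a linear function of ID WAUC over a set of reference models.
    Assumption: the functional form used in the paper may differ."""
    slope, intercept = np.polyfit(id_waucs, ood_waucs, deg=1)
    return {"slope": slope, "intercept": intercept}

def effective_robustness(id_wauc, ood_wauc, baseline):
    """Effective robustness = measured OOD WAUC minus the baseline prediction."""
    predicted_ood = baseline["slope"] * id_wauc + baseline["intercept"]
    return ood_wauc - predicted_ood

if __name__ == "__main__":
    # Hypothetical (ID, OOD) WAUC values for a set of reference models.
    id_waucs = [0.71, 0.66, 0.60, 0.55]
    ood_waucs = [0.58, 0.55, 0.50, 0.44]
    baseline = fit_linear_baseline(id_waucs, ood_waucs)
    # A model lying above the fitted trend has positive effective robustness.
    print(effective_robustness(id_wauc=0.63, ood_wauc=0.52, baseline=baseline))
```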
evaluate_accuracy.py stores the accuracy for each model at the following path:

./experiment_data/accuracy/{$DATASET_NAME}/{$DATASET_PASS}/{$DATASET_STAGE}/{$NET}/{$WEIGHT_NAME}/metrics.json

fit_baselines.py stores the baseline parameters as experiment_data/baseline_{$ID_DATASET}_{$OOD_DATASET}.json.
The model accuracies and baselines from the paper are stored in paper_results.
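To aggregate results programmatically, the accuracy files can be collected with a few lines; note that the metric key used below ("WAUC") is an assumption for illustration, so inspect one metrics.json to see the actual keys.

```python
import json
from pathlib import Path

# Collect all metrics.json files written by evaluate_accuracy.py.
results = {}
for metrics_file in Path("experiment_data/accuracy").rglob("metrics.json"):
    with open(metrics_file) as f:
        metrics = json.load(f)
    # The parent path encodes dataset, pass, stage, network and weight name (see above).
    results[str(metrics_file.parent)] = metrics.get("WAUC")

for run, wauc in sorted(results.items()):
    print(f"{run}: WAUC={wauc}")
```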
The framework is built such that custom (PyTorch) models can be included.
You can add a model either by adding it to the ptlflow framework or by integrating it directly into this framework.
This framework contains ptlflow as submodule in models/ptlflow. Any model included in ptlflow should also work here after adding it to the valid program arguments.
- Follow their excellent documentation on how to add a model.
- In helper_functions/parsing_file.py, add the model to the possible choices for --net, using the naming scheme ptlflow-yourmodelname.
To add your own model, perform the following steps (a minimal dummy model for sanity-checking the integration is sketched after the list):
- Create a directory models/your_model containing all the required files for the model.
- Make sure that all import calls are updated to the correct folder, i.e. change:

      from your_utils import your_functions                    ## old
      ## should be changed to:
      from models.your_model.your_utils import your_functions  ## new
- In helper_functions/ownutilities.py, modify the following functions:

  - import_and_load(): Add the following lines:

        elif net == 'your_model':
            ## mandatory: import your model, i.e.:
            from models.your_model import your_model
            ## optional: you can outsource the configuration of your model,
            ## e.g. as a .json file in models/_config/
            with open("models/_config/your_model_config.json") as file:
                config = json.load(file)
            ## mandatory: initialize the model and load the pretrained weights
            model = your_model(config)
            weights = torch.load(path_weights, map_location=device)
            model.load_state_dict(weights)
  - preprocess_img(): Make sure that the input is adapted to the forward pass of your model. The dataloader provides RGB images with range [0, 255]. The image dimensions differ with the dataset. You can use the padder class to make the spatial dimensions divisible by a certain divisor.

        elif network == 'your_model':
            ## example: normalize rgb range to [0,1]
            images = [(img / 255.) for img in images]
            ## example: initialize padder to make spatial dimensions divisible by 64
            padder = InputPadder(images[0].shape, divisor=64)
            ## example: apply padding
            output = padder.pad(*images)
  - model_takes_unit_input(): Add your model to the respective list if it expects input images in [0,1] rather than [0,255].
  - compute_flow(): Has to return a flow tensor (flow) originating from the forward pass of your model with the input images x1 and x2. If your model needs further preprocessing, like concatenation, perform it here:

        elif network == 'your_model':
            ## optional: concatenate the input images
            model_input = torch.cat((x1, x2), dim=0)
            ## mandatory: perform forward pass
            flow = model(model_input)

  - postprocess_flow(): Rescale the spatial dimensions of the output flow such that they coincide with the original image dimensions. If you used the padder class during preprocessing, it will automatically be reused here.
- Add your model to the possible choices for --net in helper_functions/parsing_file.py (i.e. [... | your_model]).
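To sanity-check the integration end-to-end before porting a real network, you can temporarily drop a trivial dummy model into models/your_model. The sketch below is purely illustrative (it predicts zero flow) and assumes the concatenated input from the compute_flow() example above:

```python
import torch
import torch.nn as nn

class YourModel(nn.Module):
    """Illustrative stand-in model that predicts zero flow with the correct shape.

    Useful to verify the import_and_load / preprocess_img / compute_flow /
    postprocess_flow plumbing described above before porting a real model.
    """

    def __init__(self, config=None):
        super().__init__()
        self.config = config or {}

    def forward(self, model_input):
        # model_input: two images concatenated along the batch dimension,
        # as in the compute_flow() example above.
        x1, x2 = torch.chunk(model_input, 2, dim=0)
        b, _, h, w = x1.shape
        # Optical flow has two channels (u, v displacement) per pixel.
        return torch.zeros(b, 2, h, w, device=x1.device)
```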
The model implementations in models/ are copied from the respective repositories:
- RAFT
- GMA
- FlowFormer
- PWC-Net
- SpyNet
- FlowNetCRobust
- FlowNet2
- IRR-PWC
- SKFlow
- MemFlow
- RPKNet (ptlflow)
- MatchFlow
- GMFlow+ (unimatch)
- SEA-RAFT
- MS-RAFT+
- CCMR+
We thank the original authors for their amazing contributions.
This code base is derived from the repository PCFA.