CauSight: Learning to Supersense for Visual Causal Discovery

[Dataset] [Model] [Paper]

🔥 Highlights

We introduce the task of visual causal discovery, which requires models to infer cause-and-effect relations among visual entities across diverse scenarios rather than merely perceive their presence. We first construct the Visual Causal Graph dataset (VCG-32K), a large-scale collection of over 32,000 images annotated with entity-level causal graphs. We then develop CauSight, a novel vision-language model that performs visual causal discovery through causally aware reasoning. Our training recipe integrates three components: (1) training data curation from VCG-32K, (2) Tree-of-Causal-Thought (ToCT) for synthesizing reasoning trajectories, and (3) reinforcement learning with a designed causal reward to refine the reasoning policy. Experiments show that CauSight outperforms GPT-4.1 on visual causal discovery, with a performance gain of more than threefold.
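
To make "entity-level causal graph" concrete, here is a minimal Python sketch of one way to hold such a graph in memory and compare a predicted graph with an annotated one. Everything in it is an assumption made for illustration: the `Entity` and `CausalGraph` classes, their field names, and the `edge_f1` score are hypothetical and are not the VCG-32K schema, the CauSight evaluation metric, or the causal reward used in training.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Entity:
    """A visual entity. Field names are illustrative, not the VCG-32K schema."""
    name: str
    box: tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels


@dataclass
class CausalGraph:
    """Directed graph: an edge (cause, effect) means `cause` influences `effect`."""
    entities: dict[str, Entity] = field(default_factory=dict)
    edges: set[tuple[str, str]] = field(default_factory=set)

    def add_entity(self, name, box):
        self.entities[name] = Entity(name, box)

    def add_edge(self, cause, effect):
        # Reject edges whose endpoints were never declared as entities.
        if cause not in self.entities or effect not in self.entities:
            raise KeyError("both endpoints must be known entities")
        self.edges.add((cause, effect))


def edge_f1(predicted, gold):
    """F1 over directed edges (hypothetical metric); a reversed edge counts as wrong."""
    hits = len(predicted.edges & gold.edges)
    if hits == 0:
        return 0.0
    precision = hits / len(predicted.edges)
    recall = hits / len(gold.edges)
    return 2 * precision * recall / (precision + recall)


# Toy example: the annotation says the table and the hand both act on the cup.
gold = CausalGraph()
pred = CausalGraph()
for g in (gold, pred):
    g.add_entity("table", (0, 200, 640, 480))
    g.add_entity("cup", (300, 150, 360, 220))
    g.add_entity("hand", (340, 100, 420, 190))
gold.add_edge("table", "cup")
gold.add_edge("hand", "cup")
pred.add_edge("table", "cup")
pred.add_edge("cup", "hand")  # wrong direction, so it does not match
```

Treating edges as ordered pairs matters here: a model that detects the right pair of entities but reverses the direction has perceived the scene without recovering its causal structure.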

🔧 Getting Started

1. Clone the Repository

git clone https://github.com/OpenCausaLab/CauSight.git
cd CauSight

2. Set Up the Environment

We recommend using conda:

conda create -n causight python=3.11
conda activate causight

pip install -r requirements.txt
pip install -e .

3. Download the Dataset (VCG-32K)

mkdir -p VCG-32K
pip install -U huggingface_hub

hf login
hf download OpenCausaLab/VCG-32K \
    --repo-type dataset \
    --local-dir ./VCG-32K
tar -xzf ./VCG-32K/COCO/images.tar.gz -C ./VCG-32K/COCO
tar -xzf ./VCG-32K/365/images.tar.gz -C ./VCG-32K/365
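
If you prefer to script this step, the same download and extraction can be sketched in Python with `huggingface_hub.snapshot_download` and the standard `tarfile` module. The helper names `download_vcg32k` and `extract_images` are our own, and the sketch assumes the archive layout implied by the `tar` commands above (`<root>/COCO/images.tar.gz` and `<root>/365/images.tar.gz`). If the dataset requires authentication, run `hf login` first so the cached token is picked up.

```python
import tarfile
from pathlib import Path


def download_vcg32k(local_dir="./VCG-32K"):
    """Fetch the dataset repo. Needs `pip install huggingface_hub` and network access."""
    # Imported lazily so extract_images() below also works without the package.
    from huggingface_hub import snapshot_download

    return snapshot_download(
        repo_id="OpenCausaLab/VCG-32K",
        repo_type="dataset",
        local_dir=local_dir,
    )


def extract_images(root="./VCG-32K", subsets=("COCO", "365")):
    """Unpack <root>/<subset>/images.tar.gz in place; return the subsets extracted."""
    extracted = []
    for subset in subsets:
        archive = Path(root) / subset / "images.tar.gz"
        if not archive.is_file():
            continue  # archive missing: skip rather than fail
        with tarfile.open(archive, "r:gz") as tar:
            tar.extractall(archive.parent)
        extracted.append(subset)
    return extracted
```

Calling `download_vcg32k()` followed by `extract_images()` mirrors the shell commands above. The list returned by `extract_images` shows which archives were actually found.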

4. Download the CauSight Model

mkdir -p model
hf download OpenCausaLab/CauSight \
    --repo-type model \
    --local-dir ./model

5. Evaluation

Start the model server, then run inference:

bash model_server.sh
python run_inference.py
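
Loading a model can take a while, so inference started too early may fail to connect. If the server runs in the background or in another terminal, a small readiness check like the sketch below can gate the inference step. The function `wait_for_port` is a hypothetical helper that is not part of this repository, and the host and port in the usage note are placeholders: check `model_server.sh` for the address the server actually binds to.

```python
import socket
import time


def wait_for_port(host, port, timeout=300.0, interval=2.0):
    """Return True once a TCP connection to (host, port) succeeds, False on timeout."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect only shows that something is listening.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)  # not up yet: wait and retry
    return False
```

For example, `wait_for_port("127.0.0.1", 8000)` (placeholder address) could be called before launching `run_inference.py`. An open port does not guarantee that the model has finished loading, so treat this as a coarse check.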

6. Tree-of-Causal-Thought

To generate your own SFT data with Tree-of-Causal-Thought, start the model server and run:

bash model_server.sh
python run.py

Citation

@article{zhang2025causight,
  title={CauSight: Learning to Supersense for Visual Causal Discovery},
  author={Zhang, Yize and Chen, Meiqi and Chen, Sirui and Peng, Bo and Zhang, Yanxi and Li, Tianyu and Lu, Chaochao},
  journal={arXiv preprint arXiv:2512.01827},
  year={2025}
}
