worv-ai/D2E

D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI

Suhwan Choi*, Jaeyoon Jung*, Haebin Seong*, Minchan Kim, Minyeong Kim, Yongjun Cho, Yoonshik Kim, Yubeen Park, Youngjae Yu‡, Yunsung Lee‡

project-page arXiv

News

  • [2025/12/18] We release the FHD/QHD versions of the dataset on Hugging Face as open-world-agents/D2E-Original, for training world models and video generation models. We also fix issues in the 480p dataset, open-world-agents/D2E-480p.

  • [2025/12/01] We release the 480p version of the dataset on Hugging Face as open-world-agents/D2E-480p: 267 hours of synchronized video, audio, and input events from 29 PC games across diverse genres (FPS, open-world, sandbox, and more), for training vision-action models and game agents.

  • [2025/10/21] We release part of our source code. The rest is coming soon! The ocap and owa toolkits are already open source, so have a look at these first.

Citation

If you find this work useful, please cite our paper:

@article{choi2025d2e,
  title={D2E: Scaling Vision-Action Pretraining on Desktop Data for Transfer to Embodied AI},
  author={Choi, Suhwan and Jung, Jaeyoon and Seong, Haebin and Kim, Minchan and Kim, Minyeong and Cho, Yongjun and Kim, Yoonshik and Park, Yubeen and Yu, Youngjae and Lee, Yunsung},
  journal={arXiv preprint arXiv:2510.05684},
  year={2025}
}
