Qualitative Comparison on MC-Bench
Qualitative examples from the MC-Bench dataset, processed using our method with SVD and compared against DragAnything, SG-I2V, and MotionPro, all of which are built on the same video model.
Among these, our method and SG-I2V are the only training-free approaches.
For more details, please refer to the paper.
Qualitative Comparison on DL3DV
Qualitative examples from the DL3DV dataset, processed using our method with CogVideoX and compared against the GWTF-trained version of the same model.
For more details, please refer to the paper.
References
- Andreas Blattmann, Tim Dockhorn, Sumith Kulal, Daniel Mendelevitch, Maciej Kilian, Dominik Lorenz, Yam Levi, Zion English, Vikram Voleti, Adam Letts, et al. "Stable video diffusion: Scaling latent video diffusion models to large datasets." arXiv 2023.
- Ryan Burgert, Yuancheng Xu, Wenqi Xian, Oliver Pilarski, Pascal Clausen, Mingming He, Li Ma, Yitong Deng, Lingxiao Li, Mohsen Mousavi, et al. "Go-with-the-flow: Motion-controllable video diffusion models using real-time warped noise." CVPR 2025.
- Koichi Namekata, Sherwin Bahmani, Ziyi Wu, Yash Kant, Igor Gilitschenski, and David B. Lindell. "SG-I2V: Self-guided trajectory control in image-to-video generation." ICLR 2025.
- Lu Ling, Yichen Sheng, Zhi Tu, Wentian Zhao, Cheng Xin, Kun Wan, Lantao Yu, Qianyu Guo, Zixun Yu, Yawen Lu, et al. "DL3DV-10K: A large-scale scene dataset for deep learning-based 3D vision." CVPR 2024.
- Zhuoyi Yang, Jiayan Teng, Wendi Zheng, Ming Ding, Shiyu Huang, Jiazheng Xu, Yuanming Yang, Wenyi Hong, Xiaohan Zhang, Guanyu Feng, et al. "CogVideoX: Text-to-video diffusion models with an expert transformer." ICLR 2025.
- Weijia Wu, Zhuang Li, Yuchao Gu, Rui Zhao, Yefei He, David Junhao Zhang, Mike Zheng Shou, Yan Li, Tingting Gao, and Di Zhang. "DragAnything: Motion control for anything using entity representation." ECCV 2024.
- Zhongwei Zhang, Fuchen Long, Zhaofan Qiu, Yingwei Pan, Wu Liu, Ting Yao, and Tao Mei. "MotionPro: A precise motion controller for image-to-video generation." CVPR 2025.