|
Part 1: Document Description
|
|
Citation |
|
|---|---|
|
Title: |
Video Diffusion Models are Training-free Motion Interpreter and Controller |
|
Identification Number: |
doi:10.21979/N9/HQM313 |
|
Distributor: |
DR-NTU (Data) |
|
Date of Distribution: |
2024-11-07 |
|
Version: |
1 |
|
Bibliographic Citation: |
Xiao, Zeqi; Zhou, Yifan; Yang, Shuai; Pan, Xingang, 2024, "Video Diffusion Models are Training-free Motion Interpreter and Controller", https://doi.org/10.21979/N9/HQM313, DR-NTU (Data), V1 |
|
Citation |
|
|
Title: |
Video Diffusion Models are Training-free Motion Interpreter and Controller |
|
Identification Number: |
doi:10.21979/N9/HQM313 |
|
Authoring Entity: |
Xiao, Zeqi (Nanyang Technological University) |
|
Zhou, Yifan (Nanyang Technological University) |
|
|
Yang, Shuai (Peking University) |
|
|
Pan, Xingang (Nanyang Technological University) |
|
|
Software used in Production: |
Nil |
|
Distributor: |
DR-NTU (Data) |
|
Access Authority: |
Xiao, Zeqi |
|
Depositor: |
Xiao, Zeqi |
|
Date of Deposit: |
2024-10-11 |
|
Holdings Information: |
https://doi.org/10.21979/N9/HQM313 |
|
Study Scope |
|
|
Keywords: |
Computer and Information Science, Video diffusion model |
|
Abstract: |
Video generation primarily aims to model authentic and customized motion across frames, making understanding and controlling the motion a crucial topic. Most diffusion-based studies on video motion focus on motion customization with training-based paradigms, which, however, demand substantial training resources and necessitate retraining for diverse models. Crucially, these approaches do not explore how video diffusion models encode cross-frame motion information in their features, and so lack interpretability and transparency in their effectiveness. To answer this question, this paper introduces a novel perspective to understand, localize, and manipulate motion-aware features in video diffusion models. Through analysis using Principal Component Analysis (PCA), our work discloses that robust motion-aware features already exist in video diffusion models. We present a new MOtion FeaTure (MOFT) by eliminating content correlation information and filtering motion channels. MOFT provides a distinct set of benefits, including the ability to encode comprehensive motion information with clear interpretability, extraction without the need for training, and generalizability across diverse architectures. Leveraging MOFT, we propose a novel training-free video motion control framework. Our method demonstrates competitive performance in generating natural and faithful motion, providing architecture-agnostic insights and applicability in a variety of downstream tasks. |
|
Kind of Data: |
Codes |
|
Methodology and Processing |
|
|
Sources Statement |
|
|
Data Access |
|
|
Other Study Description Materials |
|
|
Related Studies |
|
|
Paper: https://xizaoqu.github.io/moft/ |
|
|
Code: https://github.com/xizaoqu/TrajectoryAttention |
|
|
Related Publications |
|
|
Citation |
|
|
Title: |
Xiao, Z., Zhou, Y., Yang, S., & Pan, X. (2024, December). Video diffusion models are training-free motion interpreter and controller. In Proceedings of the 38th International Conference on Neural Information Processing Systems (pp. 76115-76138). |
|
Identification Number: |
10.5555/3737916.3740339 |
|
Bibliographic Citation: |
Xiao, Z., Zhou, Y., Yang, S., & Pan, X. (2024, December). Video diffusion models are training-free motion interpreter and controller. In Proceedings of the 38th International Conference on Neural Information Processing Systems (pp. 76115-76138). |
|
Citation |
|
|
Title: |
Xiao, Z., Zhou, Y., Yang, S., & Pan, X. (2024). Video diffusion models are training-free motion interpreter and controller. Advances in Neural Information Processing Systems, 37, 76115-76138. |
|
Identification Number: |
10356/201828 |
|
Bibliographic Citation: |
Xiao, Z., Zhou, Y., Yang, S., & Pan, X. (2024). Video diffusion models are training-free motion interpreter and controller. Advances in Neural Information Processing Systems, 37, 76115-76138. |