41 to 50 of 84 Results
Nov 7, 2024
Xiao, Zeqi; Zhou, Yifan; Yang, Shuai; Pan, Xingang, 2024, "Video Diffusion Models are Training-free Motion Interpreter and Controller", https://doi.org/10.21979/N9/HQM313, DR-NTU (Data), V1
Video generation primarily aims to model authentic and customized motion across frames, making understanding and controlling the motion a crucial topic. Most diffusion-based studies on video motion focus on motion customization with training-based paradigms, which, however, deman...
Oct 23, 2024
Jiang, Xueying; Jin, Sheng; Zhang, Xiaoqin; Shao, Ling; Lu, Shijian, 2024, "MonoMAE: Enhancing Monocular 3D Detection through Depth-Aware Masked Autoencoders", https://doi.org/10.21979/N9/5ILJOM, DR-NTU (Data), V1
Monocular 3D object detection aims for precise 3D localization and identification of objects from a single-view image. Despite its recent progress, it often struggles while handling pervasive object occlusions that tend to complicate and degrade the prediction of object dimension...
Oct 8, 2024
Huang, Ziqi; Wu, Tianxing; Jiang, Yuming; Chan, Kelvin C. K.; Liu, Ziwei, 2024, "Replication Data for: ReVersion: Diffusion-Based Relation Inversion from Images", https://doi.org/10.21979/N9/UWSAXU, DR-NTU (Data), V1
A replication of the ReVersion Benchmark, for the paper "ReVersion: Diffusion-Based Relation Inversion from Images".
Oct 8, 2024
Xie, Binzhu; Zhang, Sicheng; Zhou, Zitang; Li, Bo; Zhang, Yuanhan; Hessel, Jack; Yang, Jingkang; Liu, Ziwei, 2024, "FunQA: Towards Surprising Video Comprehension", https://doi.org/10.21979/N9/SMR703, DR-NTU (Data), V1
Surprising videos, e.g., funny clips, creative performances, or visual illusions, attract significant attention. Enjoyment of these videos is not simply a response to visual stimuli; rather, it hinges on the human capacity to understand (and appreciate) commonsense violations dep...
Oct 8, 2024
Yang, Jingkang; Dong, Yuhao; Liu, Shuai; Li, Bo; Wang, Ziyue; Jiang, Chencheng; Tan, Haoran; Kang, Jiamu; Zhang, Yuanhan; Zhou, Kaiyang; Liu, Ziwei, 2024, "Octopus: Embodied Vision-Language Programmer from Environmental Feedback", https://doi.org/10.21979/N9/9EIB8X, DR-NTU (Data), V1
Large vision-language models (VLMs) have achieved substantial progress in multimodal perception and reasoning. Furthermore, when seamlessly integrated into an embodied agent, it signifies a crucial stride towards the creation of autonomous and context-aware systems capable of for...
ZIP Archive - 6.3 GB - MD5: df5972717c2859d34b4fadd702476a70
Oct 7, 2024
Ma, Yubo; Zang, Yuhang; Chan, Liangyu; Chen, Meiqi; Jiao, Yizhu; Li, Xinze; Lu, Xinyuan; Liu, Ziyu; Ma, Yan; Dong, Xiaoyi; Zhang, Pan; Pan, Liangming; Jiang, Yu-Gang; Wang, Jiaqi; Cao, Yixin; Sun, Aixin, 2024, "Replication Data for: MMLongBench-Doc: Benchmarking Long-context Document Understanding with Visualizations", https://doi.org/10.21979/N9/IMVWT4, DR-NTU (Data), V1
Understanding documents with rich layouts and multi-modal components is a long-standing and practical task. Recent Large Vision-Language Models (LVLMs) have made remarkable strides in various tasks, particularly in single-page document understanding (DU). However, their abilities...
Oct 4, 2024
Yue, Zongsheng; Wang, Jianyi; Loy, Chen Change, 2024, "Efficient Diffusion Model for Image Restoration by Residual Shifting", https://doi.org/10.21979/N9/VYPJ0O, DR-NTU (Data), V1
While diffusion-based image restoration (IR) methods have achieved remarkable success, they are still limited by the low inference speed attributed to the necessity of executing hundreds or even thousands of sampling steps. Existing acceleration sampling techniques, though seekin...
Oct 3, 2024
Guo, Zujin; Li, Wei; Loy, Chen Change, 2024, "Generalizable Implicit Motion Modeling for Video Frame Interpolation", https://doi.org/10.21979/N9/EDKWDC, DR-NTU (Data), V1
Motion modeling is critical in flow-based Video Frame Interpolation (VFI). Existing paradigms either consider linear combinations of bidirectional flows or directly predict bilateral flows for given timestamps without exploring favorable motion priors, thus lacking the capability...
Oct 2, 2024
Hu, Runyi; Zhang, Jie; Xu, Ting; Li, Jiwei; Zhang, Tianwei, 2024, "Robust-Wide: Robust Watermarking against Instruction-driven Image Editing", https://doi.org/10.21979/N9/XVTPW9, DR-NTU (Data), V1
Instruction-driven image editing allows users to quickly edit an image according to text instructions in a forward pass. Nevertheless, malicious users can easily exploit this technique to create fake images, which could cause a crisis of trust and harm the rights of the original...