Oct 2, 2024
Xu, Qianxiong; Liu, Xuanyi; Zhu, Lanyun; Lin, Guosheng; Long, Cheng; Li, Ziyue; Zhao, Rui, 2024, "Hybrid Mamba for Few-Shot Segmentation", https://doi.org/10.21979/N9/PHG7NV, DR-NTU (Data), V1
Many few-shot segmentation (FSS) methods use cross attention to fuse support foreground (FG) into query features, despite its quadratic complexity. A recent advance, Mamba, can also capture intra-sequence dependencies well, yet its complexity is only linear. Hence, we aim to...
Oct 1, 2024
Liu, Tianqi; Wang, Guangcong; Hu, Shoukang; Shen, Liao; Ye, Xinyi; Zang, Yuhang; Cao, Zhiguo; Li, Wei; Liu, Ziwei, 2024, "MVSGaussian: Fast Generalizable Gaussian Splatting Reconstruction from Multi-View Stereo", https://doi.org/10.21979/N9/9LDWXG, DR-NTU (Data), V1
We present MVSGaussian, a new generalizable 3D Gaussian representation approach derived from Multi-View Stereo (MVS) that can efficiently reconstruct unseen scenes. Specifically, 1) we leverage MVS to encode geometry-aware Gaussian representations and decode them into Gaussian pa...
Oct 1, 2024 - Chen Change LOY
Loy, Chen Change; Yang, Shuai, 2024, "VToonify", https://doi.org/10.21979/N9/7PGAOA, DR-NTU (Data), V4
Generating high-quality artistic portrait videos is an important and desirable task in computer graphics and vision. Although a series of successful portrait image toonification models built upon the powerful StyleGAN have been proposed, these image-oriented methods have obvious...
Oct 1, 2024 - VToonify
Adobe PDF - 4.6 MB - MD5: 8ac626f4963bc362583fd550a84c0a45
Sep 27, 2024
Wu, Tianxing; Si, Chenyang; Jiang, Yuming; Huang, Ziqi; Liu, Ziwei, 2024, "FreeInit: Bridging Initialization Gap in Video Diffusion Models", https://doi.org/10.21979/N9/JMCW1W, DR-NTU (Data), V1
Though diffusion-based video generation has witnessed rapid progress, the inference results of existing models still exhibit unsatisfactory temporal consistency and unnatural dynamics. In this paper, we delve deep into the noise initialization of video diffusion models, and disco...
Sep 27, 2024
Lan, Yushi; Hong, Fangzhou; Yang, Shuai; Zhou, Shangchen; Dai, Bo; Pan, Xingang; Loy, Chen Change, 2024, "LN3Diff: Scalable Latent Neural Fields Diffusion for Speedy 3D Generation", https://doi.org/10.21979/N9/UZ06ZG, DR-NTU (Data), V1
The field of neural rendering has witnessed significant progress with advancements in generative models and differentiable rendering techniques. Though 2D diffusion has achieved success, a unified 3D diffusion pipeline remains unsettled. This paper introduces a novel framework ca...
Sep 27, 2024
Chen, Yongwei; Wang, Tengfei; Wu, Tong; Pan, Xingang; Jia, Kui; Liu, Ziwei, 2024, "ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance", https://doi.org/10.21979/N9/BAZCX6, DR-NTU (Data), V1
Generating high-quality 3D assets from a given image is highly desirable in various applications such as AR/VR. Recent advances in single-image 3D generation explore feed-forward models that learn to infer the 3D model of an object without optimization. Though promising results h...
Sep 26, 2024
Tang, Jiaxiang; Chen, Zhaoxi; Chen, Xiaokang; Wang, Tengfei; Zeng, Gang; Liu, Ziwei, 2024, "LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation", https://doi.org/10.21979/N9/27JLJB, DR-NTU (Data), V1
3D content creation has achieved significant progress in terms of both quality and speed. Although current feed-forward models can produce 3D objects in seconds, their resolution is constrained by the intensive computation required during training. In this paper, we introduce Lar...
Sep 25, 2024
Lan, Mengcheng; Chen, Chaofeng; Ke, Yiping; Wang, Xinjiang; Feng, Litong; Zhang, Wayne, 2024, "ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation", https://doi.org/10.21979/N9/YY8L5O, DR-NTU (Data), V1
Open-vocabulary semantic segmentation requires models to effectively integrate visual representations with open-vocabulary semantic labels. While Contrastive Language-Image Pre-training (CLIP) models shine in recognizing visual concepts from text, they often struggle with segment...
Adobe PDF - 3.5 MB - MD5: 396eed51abcc20b34e2a977c570d33ee
