Sep 27, 2024 - S-Lab for Advanced Intelligence
Chen, Yongwei; Wang, Tengfei; Wu, Tong; Pan, Xingang; Jia, Kui; Liu, Ziwei, 2024, "ComboVerse: Compositional 3D Assets Creation Using Spatially-Aware Diffusion Guidance", https://doi.org/10.21979/N9/BAZCX6, DR-NTU (Data), V1
Generating high-quality 3D assets from a given image is highly desirable in various applications such as AR/VR. Recent advances in single-image 3D generation explore feed-forward models that learn to infer the 3D model of an object without optimization. Though promising results h...
Sep 26, 2024 - S-Lab for Advanced Intelligence
Tang, Jiaxiang; Chen, Zhaoxi; Chen, Xiaokang; Wang, Tengfei; Zeng, Gang; Liu, Ziwei, 2024, "LGM: Large Multi-View Gaussian Model for High-Resolution 3D Content Creation", https://doi.org/10.21979/N9/27JLJB, DR-NTU (Data), V1
3D content creation has achieved significant progress in terms of both quality and speed. Although current feed-forward models can produce 3D objects in seconds, their resolution is constrained by the intensive computation required during training. In this paper, we introduce Lar...
Sep 25, 2024 - S-Lab for Advanced Intelligence
Lan, Mengcheng; Chen, Chaofeng; Ke, Yiping; Wang, Xinjiang; Feng, Litong; Zhang, Wayne, 2024, "ProxyCLIP: Proxy Attention Improves CLIP for Open-Vocabulary Segmentation", https://doi.org/10.21979/N9/YY8L5O, DR-NTU (Data), V1
Open-vocabulary semantic segmentation requires models to effectively integrate visual representations with open-vocabulary semantic labels. While Contrastive Language-Image Pre-training (CLIP) models shine in recognizing visual concepts from text, they often struggle with segment...
Sep 25, 2024 - S-Lab for Advanced Intelligence
Lan, Mengcheng; Chen, Chaofeng; Ke, Yiping; Wang, Xinjiang; Feng, Litong; Zhang, Wayne, 2024, "ClearCLIP: Decomposing CLIP Representations for Dense Vision-Language Inference", https://doi.org/10.21979/N9/S6NTDJ, DR-NTU (Data), V1
Despite the success of large-scale pretrained Vision-Language Models (VLMs), especially CLIP, in various open-vocabulary tasks, their application to semantic segmentation remains challenging, producing noisy segmentation maps with mis-segmented regions. In this paper, we carefully...
Sep 25, 2024 - S-Lab for Advanced Intelligence
Yuan, Haobo; Li, Xiangtai; Zhou, Chong; Li, Yining; Chen, Kai; Loy, Chen Change, 2024, "Open-Vocabulary SAM: Segment and Recognize Twenty-thousand Classes Interactively", https://doi.org/10.21979/N9/L05ULT, DR-NTU (Data), V1
The CLIP and Segment Anything Model (SAM) are remarkable vision foundation models (VFMs). SAM excels in segmentation tasks across diverse domains, whereas CLIP is renowned for its zero-shot recognition capabilities. This paper presents an in-depth exploration of integrating these...
Sep 25, 2024 - S-Lab for Advanced Intelligence
Wu, Tianhao; Zheng, Chuanxia; Wu, Qianyi; Cham, Tat-Jen, 2024, "ClusteringSDF: Self-Organized Neural Implicit Surfaces for 3D Decomposition", https://doi.org/10.21979/N9/RJUHMC, DR-NTU (Data), V1
3D decomposition/segmentation remains a challenge as large-scale 3D annotated data is not readily available. Existing approaches typically leverage 2D machine-generated segments, integrating them to achieve 3D consistency. In this paper, we propose ClusteringSDF, a novel approach...
Sep 25, 2024 - S-Lab for Advanced Intelligence
Xu, Baixin; Hu, Jiangbei; Hou, Fei; Lin, Kwan-Yee; Wu, Wayne; Qian, Chen; He, Ying, 2024, "Parameterization-driven Neural Surface Reconstruction for Object-oriented Editing in Neural Rendering", https://doi.org/10.21979/N9/0C9BU9, DR-NTU (Data), V1
The advancements in neural rendering have increased the need for techniques that enable intuitive editing of 3D objects represented as neural implicit surfaces. This paper introduces a novel neural algorithm for parameterizing neural implicit surfaces to simple parametric domains...
Sep 25, 2024 - S-Lab for Advanced Intelligence
Xu, Qianxiong; Lin, Guosheng; Loy, Chen Change; Long, Cheng; Li, Ziyue; Zhao, Rui, 2024, "Eliminating Feature Ambiguity for Few-Shot Segmentation", https://doi.org/10.21979/N9/CIOE8Y, DR-NTU (Data), V1
Recent advancements in few-shot segmentation (FSS) have exploited pixel-by-pixel matching between query and support features, typically based on cross attention, which selectively activates query foreground (FG) features that correspond to the same-class support FG features. Howev...
Sep 25, 2024 - S-Lab for Advanced Intelligence
Feng, Ruicheng; Li, Chongyi; Loy, Chen Change, 2024, "Kalman-Inspired Feature Propagation for Video Face Super-Resolution", https://doi.org/10.21979/N9/FMVNYY, DR-NTU (Data), V1
Despite the promising progress of face image super-resolution, video face super-resolution remains relatively under-explored. Existing approaches either adapt general video super-resolution networks to face datasets or apply established face image super-resolution models independ...
Sep 20, 2024 - S-Lab for Advanced Intelligence
Hu, Tao; Hong, Fangzhou; Liu, Ziwei, 2024, "StructLDM: Structured Latent Diffusion for 3D Human Generation", https://doi.org/10.21979/N9/BXUEXV, DR-NTU (Data), V1
Recent 3D human generative models have achieved remarkable progress by learning 3D-aware GANs from 2D images. However, existing 3D human generative methods model humans in a compact 1D latent space, ignoring the articulated structure and semantics of human body topology. In this...
