GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation (doi:10.21979/N9/ZQ85KI)

Document Description
Citation
Title:	GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation
Identification Number:	doi:10.21979/N9/ZQ85KI
Distributor:	DR-NTU (Data)
Date of Distribution:	2025-02-05
Version:	1
Bibliographic Citation:	Lan, Yushi; Zhou, Shangchen; Lyu, Zhaoyang; Hong, Fangzhou; Yang, Shuai; Dai, Bo; Pan, Xingang; Loy, Chen Change, 2025, "GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation", https://doi.org/10.21979/N9/ZQ85KI, DR-NTU (Data), V1
Study Description
Citation
Title:	GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation
Identification Number:	doi:10.21979/N9/ZQ85KI
Authoring Entity:	Lan, Yushi (Nanyang Technological University)
	Zhou, Shangchen (Nanyang Technological University)
	Lyu, Zhaoyang (Shanghai Artificial Intelligence Laboratory)
	Hong, Fangzhou (Nanyang Technological University)
	Yang, Shuai (Peking University)
	Dai, Bo (Shanghai Artificial Intelligence Laboratory)
	Pan, Xingang (Nanyang Technological University)
	Loy, Chen Change (Nanyang Technological University)
Software used in Production:	PyTorch
Software used in Production:	LaTeX
Distributor:	DR-NTU (Data)
Access Authority:	Lan, Yushi
Depositor:	Lan, Yushi
Date of Deposit:	2025-01-24
Holdings Information:	https://doi.org/10.21979/N9/ZQ85KI
Study Scope
Keywords:	Computer and Information Science, 3D Generative Models
Abstract:	While 3D content generation has advanced significantly, existing methods still face challenges with input formats, latent space design, and output representations. This paper introduces a novel 3D generation framework that addresses these challenges, offering scalable, high-quality 3D generation with an interactive Point Cloud-structured Latent space. Our framework employs a Variational Autoencoder (VAE) with multi-view posed RGB-D(epth)-N(ormal) renderings as input, using a unique latent space design that preserves 3D shape information, and incorporates a cascaded latent diffusion model for improved shape-texture disentanglement. The proposed method, GaussianAnything, supports multi-modal conditional 3D generation, allowing for point cloud, caption, and single/multi-view image inputs. Notably, the newly proposed latent space naturally enables geometry-texture disentanglement, thus allowing 3D-aware editing. Experimental results demonstrate the effectiveness of our approach on multiple datasets, outperforming existing methods in both text- and image-conditioned 3D generation.
Kind of Data:	Code and Data
Methodology and Processing
Sources Statement
Data Access
Notes:	S-Lab License 1.0
Copyright 2025 S-Lab

Redistribution and use for non-commercial purpose in source and binary forms, with or without modification, are permitted provided that the following conditions are met:
1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.
2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.
3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

In the event that redistribution and/or use for commercial purpose in source or binary forms, with or without modification is required, please contact the contributor(s) of the work.
Other Study Description Materials
Related Studies
	https://nirvanalan.github.io/projects/GA/
Related Publications
Citation
Identification Number:	arXiv.2411.08033
Bibliographic Citation:	Lan, Y., Zhou, S., Lyu, Z., Hong, F., Yang, S., Dai, B., ... & Loy, C. C. (2024). GaussianAnything: Interactive Point Cloud Latent Diffusion for 3D Generation. arXiv preprint arXiv:2411.08033.