May 22, 2025
Xu, Yuanmu; Hou, Guanli; Hu, Jiangbei; Ren, Tenglong; Wang, Xiaokun; Zhang, Yalan; Ban, Xiaojuan; Qian, Chen; Hou, Fei; He, Ying, 2025, "NeuS: Physics and Geometry-Augmented Neural Implicit Surfaces for Rigid Bodies", https://doi.org/10.21979/N9/LTXKFL, DR-NTU (Data), V1
This paper tackles the challenges of physics-based simulation of rigid bodies in neural rendering, focusing on 3D model representation and collision handling. A synthetic and real-world dataset is also included in the paper.
May 22, 2025
Xu, Qianxiong; Zhu, Lanyun; Liu, Xuanyi; Lin, Guosheng; Long, Cheng; Li, Ziyue; Zhao, Rui, 2025, "Unlocking the Power of SAM 2 for Few-Shot Segmentation", https://doi.org/10.21979/N9/XIDXVT, DR-NTU (Data), V1
Few-Shot Segmentation (FSS) aims to learn class-agnostic segmentation on few classes to segment arbitrary classes, but at the risk of overfitting. To address this, some methods use the well-learned knowledge of foundation models (e.g., SAM) to simplify the learning process. Recen...
May 16, 2025
Liu, Chenxi; Miao, Hao; Xu, Qianxiong; Zhou, Shaowen; Long, Cheng; Zhao, Yan; Li, Ziyue, 2025, "Efficient Multivariate Time Series Forecasting via Calibrated Language Models with Privileged Knowledge Distillation", https://doi.org/10.21979/N9/6WWC6K, DR-NTU (Data), V1
Multivariate time series forecasting (MTSF) endeavors to predict future observations given historical data, playing a crucial role in time series data management systems. With advancements in large language models (LLMs), recent studies employ textual prompt tuning to infuse the...
May 13, 2025
Liu, Chenxi; Zhou, Shaowen; Xu, Qianxiong; Miao, Hao; Long, Cheng; Li, Ziyue; Zhao, Rui, 2025, "Towards Cross-Modality Modeling for Time Series Analytics: A Survey in the LLM Era", https://doi.org/10.21979/N9/I0HOYZ, DR-NTU (Data), V1
The proliferation of edge devices has generated an unprecedented volume of time series data across different domains, motivating various well-customized methods. Recently, Large Language Models (LLMs) have emerged as a new paradigm for time series analytics by leveraging the shar...
May 9, 2025
Dong, Yuhao; Liu, Zuyan; Sun, Hai-Long; Yang, Jingkang; Hu, Winston; Rao, Yongming; Liu, Ziwei, 2025, "Insight-V: Exploring Long-Chain Visual Reasoning with Multimodal Large Language Models", https://doi.org/10.21979/N9/Y0TZUB, DR-NTU (Data), V1
Large Language Models (LLMs) demonstrate enhanced capabilities and reliability by reasoning more, evolving from Chain-of-Thought prompting to product-level solutions like OpenAI o1. Despite various efforts to improve LLM reasoning, high-quality long-chain reasoning data and optim...
May 9, 2025
Huang, Zihao; Hu, Shoukang; Wang, Guangcong; Liu, Tianqi; Zang, Yuhang; Cao, Zhiguo; Li, Wei; Liu, Ziwei, 2025, "WildAvatar: Learning In-the-wild 3D Avatars from the Web", https://doi.org/10.21979/N9/5G18B1, DR-NTU (Data), V1
Existing research on avatar creation is typically limited to laboratory datasets, which require high costs against scalability and exhibit insufficient representation of the real world. On the other hand, the web abounds with off-the-shelf real-world human videos, but these video...
Apr 29, 2025
Wang, Yiran; Li, Jiaqi; Hong, Chaoyi; Li, Ruibo; Sun, Liusheng; Song, Xiao; Wang, Zhe; Cao, Zhiguo; Lin, Guosheng, 2025, "TacoDepth: Towards Efficient Radar-Camera Depth Estimation with One-stage Fusion", https://doi.org/10.21979/N9/Q57ZYR, DR-NTU (Data), V1
Radar-Camera depth estimation aims to predict dense and accurate metric depth by fusing input images and Radar data. Model efficiency is crucial for this task in pursuit of real-time processing on autonomous vehicles and robotic platforms. However, due to the sparsity of Radar re...
Apr 9, 2025
Chen, Yiwen; He, Tong; Di, Huang; Ye, Weicai; Chen, Sijin; Tang, Jiaxiang; Chen, Xin; Cai, Zhongang; Yang, Lei; Yu, Gang; Lin, Guosheng; Zhang, Chi, 2025, "MeshAnything: Artist-Created Mesh Generation with Autoregressive Transformers", https://doi.org/10.21979/N9/BJPHCP, DR-NTU (Data), V1
Recently, 3D assets created via reconstruction and generation have matched the quality of manually crafted assets, highlighting their potential for replacement. However, this potential is largely unrealized because these assets always need to be converted to meshes for 3D industr...
Apr 9, 2025
Chen, Yongwei; Lan, Yushi; Zhou, Shangchen; Wang, Tengfei; Pan, Xingang, 2025, "SAR3D: Autoregressive 3D Object Generation and Understanding via Multi-scale 3D VQVAE", https://doi.org/10.21979/N9/OK5SSQ, DR-NTU (Data), V1
Autoregressive models have demonstrated remarkable success across various fields, from large language models (LLMs) to large multimodal models (LMMs) and 2D content generation, moving closer to artificial general intelligence (AGI). Despite these advances, applying autoregressive...
Apr 9, 2025
Lan, Mengcheng; Chen, Chaofeng; Zhou, Yue; Ke, Yiping; Wang, Xinjiang; Feng, Litong; Zhang, Wayne, 2025, "Text4Seg: Reimagining Image Segmentation as Text Generation", https://doi.org/10.21979/N9/FF4YJY, DR-NTU (Data), V1
Multimodal Large Language Models (MLLMs) have shown exceptional capabilities in vision-language tasks; however, effectively integrating image segmentation into these models remains a significant challenge. In this paper, we introduce Text4Seg, a novel text-as-mask paradigm that c...
