Biography
I am a third-year PhD candidate in Artificial Intelligence at The Hong Kong University of Science and Technology (Guangzhou) and a member of the ENVISION Lab, advised by Ying-Cong Chen.
Previously, I received an M.S. degree in Optical Engineering from Beijing Institute of Technology in 2023, advised by Prof. Jianan Li and Prof. Tingfa Xu, and a B.E. degree in Measurement and Control Technology and Instruments from Beijing Institute of Technology in 2020.
News
Education
| PhD Candidate, Artificial Intelligence, The Hong Kong University of Science and Technology (Guangzhou) | Aug. 2023 -- Present |
| M.S., Optical Engineering, Beijing Institute of Technology | Aug. 2020 -- June 2023 |
| B.E., Measurement and Control Technology and Instruments, Beijing Institute of Technology | Aug. 2016 -- June 2020 |
Work Experience
| HUAWEI Noah's Ark Lab | Beijing, China |
| Research Intern | Algorithm Engineer | Oct. 2022 – Jun. 2023 |
| • Masked Autoencoders for Pre-training Large-scale Point Clouds |
| • FusionMAE: A Multi-Task Multi-Modality Pretraining Framework |
| NIO-AD (Autonomous Driving) | Beijing, China |
| Research Intern | Algorithm Engineer | Oct. 2021 – Aug. 2022 |
| • 3D, TSR, and shape autolabeling: models deployed online |
| Meitu Imaging & Vision Lab (MTLAB) | Beijing, China |
| Research Intern | Computer Vision Algorithm Engineer | Jul. 2021 – Sept. 2021 |
| • Depth estimation: model acceleration on iPhone and Android (~10× speedup), deployed in the app |
| DXD (Robotics Startup) | Beijing, China |
| Founder & CEO | Founder | May 2018 – Jun. 2020 |
| • Indoor navigation robot: hardware design and SLAM |
| • Awards: Beijing Outstanding Student Start-up Team; Zhongguancun High-tech Enterprise |
Research Interests
My research interests include vision-language models (VLMs), agentic multi-modal systems, and AI for creation.
Selected Honors
| 2020-2023, First-class Academic Scholarship, Beijing Institute of Technology |
| 2020-2023, Outstanding Student, Beijing Institute of Technology |
| 2020, MIIT Scholarship (40,000 RMB, Ministry of Industry and Information Technology, 工信部) |
| 2020, Outstanding Graduate, Beijing Institute of Technology |
* indicates equal contribution; † indicates project lead/corresponding author.
Preprints
- VideoMemory: Toward Consistent Video Generation via Memory Integration
  Jinsong Zhou*, Yihua Du*, Xinli Xu*, et al.
  arXiv preprint, 2026.
- PresentCoach: Dual-Agent Presentation Coaching through Exemplars and Interactive Feedback
  Sirui Chen*, Jinsong Zhou*, Xinli Xu*, Xiaoyu Yang, Litao Guo, Ying-Cong Chen
  arXiv preprint, 2025.
Journal Papers
Conference Papers
- GaussianProperty: Integrating Physical Properties to 3D Gaussians with LMMs
  Xinli Xu*, Wenhang Ge*, Dicong Qiu*, et al.
  IEEE/CVF International Conference on Computer Vision (ICCV), 2025.
  [Project Page]
- FlexGen: Flexible Multi-View Generation from Text and Image Inputs
  Xinli Xu*, Wenhang Ge*, Jiantao Lin*, et al.
  IEEE/CVF International Conference on Computer Vision (ICCV), 2025.
  [Project Page]
- ComfyMind: Toward General-Purpose Generation via Tree-Based Planning and Reactive Feedback
  Litao Guo*, Xinli Xu*, Luozhou Wang, et al.
  The Thirty-Ninth Annual Conference on Neural Information Processing Systems (NeurIPS), 2025.
  [Project Page]
- Long-Video Audio Synthesis with Multi-Agent Collaboration
  Yehang Zhang*, Xinli Xu*, Xiaojie Xu*, Li Liu, Yingcong Chen
  The Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025.
  [Project Page]
- PreGenie: An Agentic Framework for High-quality Visual Presentation Generation
  Xiaojie Xu, Xinli Xu, Sirui Chen, et al.
  The Conference on Empirical Methods in Natural Language Processing (EMNLP), 2025.
- FH-Net: A Fast Hierarchical Network for Scene Flow Estimation on Real-world Point Clouds
  Lihe Ding*, Shaocong Dong*, Tingfa Xu, Xinli Xu, et al.
  European Conference on Computer Vision (ECCV), 2022. (Oral, Acceptance Rate: 2.7%)
  [Code]
- MsSVT: Mixed-scale Sparse Voxel Transformer for 3D Object Detection on Point Clouds
  Shaocong Dong*, Lihe Ding*, Haiyang Wang, Tingfa Xu, Xinli Xu, et al.
  Advances in Neural Information Processing Systems (NeurIPS), 2022.
  [Code]
Invited Talks
- Invited talk on ComfyMind, hosted by Tencent Hunyuan, July 2025.
Teaching Assistant
| 2025 Fall | UCMP 6050 | Project Design Thinking |
| 2024 Fall | AIAA 2025 | Introduction to Artificial Intelligence |
© Xinli Xu | Last updated: April 2026