I am a third-year PhD student at HKUST, advised by
Chi-Keung Tang
and Yu-Wing Tai.
I graduated from ShanghaiTech University with a Bachelor's degree in 2022,
where I was fortunate to join the group of Jingyi Yu.
Outside of academics, I am a basketball enthusiast.
Trace Anything: Representing Any Video in 4D via Trajectory Fields
Xinhang Liu, Yuxi Xiao, Donny Y. Chen, Jiashi Feng, Yu-Wing Tai, Chi-Keung Tang, Bingyi Kang
Preprint
project page /
arXiv /
video /
code
We propose a 4D video representation, trajectory field, which maps each pixel across frames to a continuous, parametric 3D trajectory.
With a single forward pass, the Trace Anything model efficiently estimates such trajectory fields for any video, image pair, or unstructured image set.
Martian World Models: Controllable Video Synthesis with Physically Accurate 3D Reconstructions
Longfei Li, Zhiwen Fan, Wenyan Cong, Xinhang Liu, Yuyang Yin, Matt Foutter, Panwang Pan, Chenyu You, Yue Wang, Zhangyang Wang, Yao Zhao, Marco Pavone, Yunchao Wei
NeurIPS, 2025
project page /
arXiv /
dataset /
code
A framework for generating realistic Martian landscape videos for mission rehearsal and robotic simulation, including a data pipeline and a video generator.
CryoFastAR: Fast Cryo-EM Ab Initio Reconstruction Made Easy
Jiakai Zhang, Shouchen Zhou, Haizhao Dai, Xinhang Liu, Peihao Wang, Zhiwen Fan, Yuan Pei, Jingyi Yu
ICCV, 2025
project page /
arXiv /
code
We introduce CryoFastAR, the first geometric foundation model that directly predicts poses from noisy cryo-EM images for fast ab initio reconstruction.
Motion-Agent: A Conversational Framework for Human Motion Generation with LLMs
Qi Wu, Yubo Zhao, Yifan Wang, Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang
ICLR, 2025
project page /
arXiv /
code
We introduce Motion-Agent, an efficient conversational framework designed for general human motion generation, editing, and understanding.
ChatCam: Empowering Camera Control through Conversational AI
Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang
NeurIPS, 2024
project page /
arXiv /
code
We introduce ChatCam, a system that navigates camera movements through conversations with users, mimicking a professional cinematographer's workflow.
Deceptive-NeRF/3DGS: Diffusion-Generated Pseudo-Observations for High-Quality Sparse-View Reconstruction
Xinhang Liu, Jiaben Chen, Shiu-hong Kao, Yu-Wing Tai, Chi-Keung Tang
ECCV, 2024 (Also presented at ECCV'24 Wild3D Workshop)
project page /
arXiv /
code
We enhance sparse-view reconstruction by leveraging a diffusion model pre-trained on multi-view datasets to synthesize pseudo-observations.
Xinhang Liu*, Yan Zeng*, Yifan Qin, Hao Li, Jiakai Zhang, Lan Xu, Jingyi Yu
ECCV'24 NFBCC Workshop (Highlight)
project page /
arXiv
With hundreds of thousands of noisy images as input, our method reconstructs the heterogeneous structures of proteins and highlights their local, flexible motions.
Gear-NeRF: Free-Viewpoint Rendering and Tracking with Motion-aware Spatio-Temporal Sampling
Xinhang Liu, Yu-Wing Tai, Chi-Keung Tang, Pedro Miraldo, Suhas Lohit, Moitreya Chatterjee
CVPR, 2024 (Highlight, top 2.8% of 11,532 submissions)
project page /
arXiv /
video /
code
We learn a 4D semantic embedding and introduce the concept of gears for stratified modeling of dynamic regions, achieving photo-realistic dynamic novel view synthesis and free-viewpoint tracking.
Unsupervised Multi-View Object Segmentation Using Radiance Field Propagation
Xinhang Liu, Jiaben Chen, Huai Yu, Yu-Wing Tai, Chi-Keung Tang
NeurIPS, 2022
project page /
arXiv /
code /
data
We study segmenting objects in 3D during reconstruction given only unlabeled multi-view images of a scene, from an information-theoretic perspective.
Fourier PlenOctrees for Dynamic Radiance Field Rendering in Real-time
Liao Wang*, Jiakai Zhang*, Xinhang Liu, Fuqiang Zhao, Yanshun Zhang, Yingliang Zhang, Minye Wu, Lan Xu, Jingyi Yu
CVPR, 2022
project page /
arXiv
A novel technique to tackle efficient neural modeling and real-time rendering of dynamic scenes captured under the free-view video (FVV) setting.
Editable Free-Viewpoint Video using a Layered Neural Representation
Jiakai Zhang, Xinhang Liu, Xinyi Ye, Fuqiang Zhao, Yanshun Zhang, Minye Wu, Yingliang Zhang, Lan Xu, Jingyi Yu
SIGGRAPH, 2021
project page /
arXiv /
code /
Two Minute Papers
The first approach for editable photo-realistic free-viewpoint video generation of large-scale dynamic scenes, using only 16 sparse cameras.
Experience
Mitsubishi Electric Research Laboratories (MERL), Cambridge, MA: Research Intern (Summer 2023).