Xiaoyang Lyu (吕晓阳)

I am currently pursuing a PhD in the CVMI Lab at the University of Hong Kong, supervised by Prof. Xiaojuan Qi. Before that, I obtained my Master's degree from the College of Control Science and Engineering at Zhejiang University, supervised by Prof. Yong Liu, and my Bachelor of Engineering degree from the Harbin Institute of Technology.

I have a keen interest in computer vision and robotics. My current focus is 3D scene reconstruction, including neural rendering and depth estimation. My long-term goal is to build a simulator that can seamlessly transfer real-world environments into the virtual world, accelerating the development of robotics, augmented reality (AR), and virtual reality (VR) applications.


Email / Google Scholar / Github

Recent News
  • [2024.04] I am delighted to announce that our paper Total-Decom has been selected as a highlight poster (acceptance rate 2.8%) and EscherNet as an oral presentation (acceptance rate 0.78%) at CVPR 2024.
  • [2024.02] We have three papers accepted to CVPR 2024, one as first author. All code and demos will be open-sourced.
  • [2024.02] We have released the code and demos for EscherNet, a multi-view conditioned diffusion model for generative view synthesis. Check it out!
  • [2023.12] We have released demos of SC-GS, a controllable dynamic Gaussian splatting representation. Check it out!
  • [2023.07] Three papers accepted to ICCV 2023, one as first author.
  • [2023.03] One paper accepted to CVPR 2023.
  • [2023.01] One paper accepted to ICRA 2023.
Publications

Total-Decom: Decomposed 3D Scene Reconstruction with Minimal Interaction
Xiaoyang Lyu*, Chirui Chang*, Peng Dai, Yang-Tian Sun, Xiaojuan Qi
Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Seattle, WA, USA.
Paper / Project Page / Code (Coming Soon)

Highlight (acceptance rate 2.8%)

Indoor scenes consist of complex compositions of objects and backgrounds. Our proposed method, Total-Decom, (a) performs 3D reconstruction from posed multi-view images and (b) decomposes the reconstructed mesh into high-quality meshes for individual objects and the background with minimal human annotation. This facilitates applications such as (c) object re-texturing and (d) scene reconfiguration.


EscherNet: A Generative Model for Scalable View Synthesis
Xin Kong*, Shikun Liu*, Xiaoyang Lyu, Marwan Taher, Xiaojuan Qi, Andrew J. Davison
Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Seattle, WA, USA.
Paper / Project Page / Code

Oral (acceptance rate 0.78%)

EscherNet is a multi-view conditioned diffusion model for view synthesis. EscherNet learns implicit and generative 3D representations coupled with the camera positional encoding (CaPE), allowing continuous relative camera control between an arbitrary number of reference and target views.


SC-GS: Sparse-Controlled Gaussian Splatting for Editable Dynamic Scenes
Yi-Hua Huang*, Yang-Tian Sun*, Ziyi Yang*, Xiaoyang Lyu, Yan-Pei Cao, Xiaojuan Qi
Conference on Computer Vision and Pattern Recognition (CVPR), 2024. Seattle, WA, USA.
Paper / Project Page / Code

We propose a new representation that explicitly decomposes the motion and appearance of dynamic scenes into sparse control points and dense Gaussians, respectively. Our key idea is to use sparse control points, significantly fewer in number than the Gaussians, to learn compact 6-DoF transformation bases, which can be locally interpolated through learned interpolation weights to yield the motion field of the 3D Gaussians. Please visit the project page for more demos.
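The interpolation idea above can be illustrated with a toy sketch. Note the simplifications: this uses translations only (the paper interpolates full 6-DoF transformations) and fixed Gaussian RBF weights (the paper learns the interpolation weights); the function name and `sigma` parameter are illustrative, not from the released code.

```python
import numpy as np

def interpolate_motion(points, ctrl_pts, ctrl_transl, sigma=1.0):
    """Blend sparse control-point translations onto dense points.

    points:      (N, 3) dense Gaussian centers
    ctrl_pts:    (K, 3) sparse control-point positions
    ctrl_transl: (K, 3) translation per control point (rotation
                 omitted in this toy version)
    Returns (N, 3) per-point motion.
    """
    # Pairwise squared distances between dense points and control points.
    d2 = ((points[:, None, :] - ctrl_pts[None, :, :]) ** 2).sum(-1)  # (N, K)
    # Gaussian (RBF) weights, normalized over the control points.
    w = np.exp(-d2 / (2.0 * sigma ** 2))
    w /= w.sum(axis=1, keepdims=True)
    # Weighted blend of control-point translations -> dense motion field.
    return w @ ctrl_transl

# A point sitting on a control point inherits (almost exactly) its motion.
pts = np.array([[0.0, 0.0, 0.0]])
ctrl = np.array([[0.0, 0.0, 0.0], [10.0, 0.0, 0.0]])
transl = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
motion = interpolate_motion(pts, ctrl, transl)  # ~ [[1, 0, 0]]
```

Keeping the motion on a few control points and interpolating, rather than storing a transform per Gaussian, is what makes the motion field compact and editable.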


Learning a Room with the Occ-SDF Hybrid: Signed Distance Function Mingled with Occupancy Aids Scene Representation
Xiaoyang Lyu, Peng Dai, Zizhang Li, et al.
International Conference on Computer Vision (ICCV), 2023. Paris, France.
arXiv / Project Page / Code

We analyze the constraints of current neural scene representation techniques with geometry priors and identify why they fail to reconstruct detailed structures: optimization is biased toward high color intensities, and the SDF distribution is complex. We therefore develop a feature rendering scheme that balances color regions and adopt a hybrid occupancy-SDF representation to overcome the limitations of the SDF distribution.


RICO: Regularizing the Unobservable for Indoor Compositional Reconstruction
Zizhang Li, Xiaoyang Lyu, et al.
International Conference on Computer Vision (ICCV), 2023. Paris, France.
arXiv / Code

We present RICO, a novel approach for compositional reconstruction of indoor scenes. Our key motivation is to regularize the unobservable regions of partially observed objects in indoor scenes. We exploit the geometry smoothness of the occluded background and then adopt the improved background as a prior to regularize the objects' geometry.


Speech2Lip: High-fidelity Speech to Lip Generation by Learning from a Short Video
Xiuzhe Wu, Pengfei Hu, Yang Wu, Xiaoyang Lyu, et al.
International Conference on Computer Vision (ICCV), 2023. Paris, France.
arXiv / Code

We propose a novel decomposition synthesis-composition framework called Speech2Lip for high-fidelity talking head video synthesis, which disentangles speech-sensitive and speech-insensitive motions/appearances.


Hybrid Neural Rendering for Large-Scale Scenes with Motion Blur
Peng Dai, Yinda Zhang, Xin Yu, Xiaoyang Lyu, Xiaojuan Qi
Conference on Computer Vision and Pattern Recognition (CVPR), 2023. Vancouver, Canada.
Project Page / arXiv / Code

We develop a hybrid neural rendering model that makes image-based representation and neural 3D representation join forces to render high-quality and view-consistent images.


Efficient Implicit Neural Reconstruction Using LiDAR
Dongyu Yan, Xiaoyang Lyu, Jieqi Shi, Yi Lin
IEEE International Conference on Robotics and Automation (ICRA), 2023. London, UK.
Project Page / arXiv / Code

We propose a new method that uses sparse LiDAR point clouds and rough odometry to reconstruct a fine-grained implicit occupancy field efficiently, within a few minutes. We introduce a new loss function that supervises directly in 3D space without 2D rendering, avoiding information loss.


HR-Depth: High Resolution Self-Supervised Monocular Depth Estimation
Xiaoyang Lyu, Liang Liu, Mengmeng Wang, Xin Kong, et al.
The 35th AAAI Conference on Artificial Intelligence (AAAI), 2021. Virtual.
arXiv / Code (training code and more coming soon)

Based on theoretical and empirical evidence, we present HR-Depth for high-resolution self-supervised monocular depth estimation.


FCFR-Net: Feature Fusion based Coarse-to-Fine Residual Learning for Depth Completion
Lina Liu, Xibin Song, Xiaoyang Lyu, Junwei Diao, et al.
The 35th AAAI Conference on Artificial Intelligence (AAAI), 2021. Virtual.
arXiv / Code

We propose a novel end-to-end residual learning framework, which formulates depth completion as a two-stage learning task.

Competitions

ICRA 2018 DJI RoboMaster AI Challenge
Team: I Hiter. Xingguang Zhong, Xin Kong, Xiaoyang Lyu, Le Qi, Hao Huang, Linrui Tian, Songwei Li
IEEE International Conference on Robotics and Automation (ICRA), 2018. Brisbane, Australia.
Global Champion / Ranking: 1st/21 / Certificate / Video / Rules

Our team built two fully autonomous robots, covering the mechanics, circuits, control, and algorithms. I was responsible for visual servoing, target detection, target localization, and robot decision-making.


2017, 2018, 2019 RoboMaster Robotics Competition
Team: I Hiter. Wei Chen, Xin Kong, Xiaoyang Lyu, et al.
China University Robot Competition (全国大学生机器人大赛), 2017, 2018, 2019. Shenzhen, China.
First Prize / Ranking: 4th/200+ in 2017 and 2018, 6th/200+ in 2019 / Certificate / Highlights

Our team built more than ten complex autonomous or semi-autonomous robots every year. In 2017, I was mainly responsible for building and operating the Engineering Robot. In 2018, I was responsible for visual servoing, which involved computer vision and machine learning. In 2019, I became the leader of the computer vision group and the coach of our team.

Honors

Apr. 2022, Hong Kong PhD Fellowship Scheme - Research Grants Council (RGC) of Hong Kong

Nov. 2019, Academic Scholarship - Zhejiang University

Jun. 2019, Outstanding Graduate - Harbin Institute of Technology

Jun. 2019, Top 100 Excellent Graduation Thesis - Harbin Institute of Technology

Jan. 2019, Top 10 College Students - Harbin Institute of Technology

Mar. 2018, Outstanding Student of Heilongjiang Province - Harbin Institute of Technology

Oct. 2017, SMC Scholarship - Harbin Institute of Technology

Oct. 2016, National Scholarship - Harbin Institute of Technology

About Me

Skills: Python / C / C++ / MATLAB, PyTorch, Linux, ROS, OpenCV


Last update: 2024.04.06. Thanks.