profile photo

Qi Zhang (张琦)

I am currently a Lead Researcher in the Imaging Algorithm Center of Vivo. Our group, as the core algorithm team, applies 3D and AIGC-based technologies to improve the photography quality and user experience of smartphones. My research lies in 3D computer vision and AI-generated content (AIGC), with a focus on neural rendering, 3D Gaussian Splatting (3DGS), 3D reconstruction and modeling, and visual diffusion models.

From Jun. 2021 to 2024, I was a Researcher at Tencent AI Lab. Before that, I received my Ph.D. degree from the School of Computer Science of Northwestern Polytechnical University in 2021, supervised by Prof. Qing Wang. I was a visiting student at the Australian National University (ANU) from Jul. 2019 to Aug. 2020, supervised by Prof. Hongdong Li. I was a nominee for the Outstanding Doctoral Dissertation Award of the China Computer Federation (CCF) in 2021, and won the 2021 ACM Xi'an Doctoral Dissertation Award and the 2023 NWPU Doctoral Dissertation Award.

!!! Vivo (Xi'an & Hangzhou) is hiring multiple researchers and interns for projects on 3D reconstruction (e.g., NeRF, 3DGS, and NVS) and AIGC-based generation (including image and video diffusion models). Please feel free to contact me via e-mail or WeChat.

Email  /  CV  /  Bio  /  Google Scholar  /  WeChat  /  Github

News

🎉🎉2025.04: Serving as an Area Chair for NeurIPS 2025!
🎉🎉2025.03: 3 papers accepted to CVPR 2025!
🎉🎉2024.08: LTM-NeRF accepted to TPAMI!
🎉🎉2024.07: 4 papers accepted to ECCV 2024, including one oral paper!
🎉🎉2024.03: 5 papers accepted to CVPR 2024!
🎉🎉2024.02: DINER extended to TPAMI!
🎉🎉2023.12: NeIF accepted to AAAI 2024
🎉🎉2023.08: LoD-NeuS accepted to SIGGRAPH Asia 2023
🎉🎉2023.07: Pyramid NeRF accepted to IJCV 2023
🎉🎉2023.03: 7 papers (with 1 highlight paper) accepted to CVPR 2023!
🎉🎉2022.08: 1 paper (journal track) accepted to SIGGRAPH Asia 2022
🎉🎉2022.03: 4 papers accepted to CVPR 2022
🎉🎉2021.12: CCF Outstanding Doctoral Dissertation Award Nominee

Tech transfer

4D Tencent Avatar: 4D Content Generation (全息表演捕捉及人物建模技术)
Tencent Meeting: 3D Chat on Tencent Meeting (沉浸式交流的腾讯裸眼3D实时会议系统; an immersive glasses-free 3D real-time meeting system)

Research

Please find below a complete list of my publications, with representative papers highlighted. IEEE TPAMI and IJCV are top journals in the fields of computer vision and computational photography. CVPR is the premier conference of the computer vision research community.

arXiv 2025
GUS-IR: Gaussian Splatting with Unified Shading for Inverse Rendering
Zhihao Liang, Hongdong Li, Kui Jia, Kailing Guo, Qi Zhang
arXiv, 2025
Project Page / arXiv / Code

In this paper, we present GUS-IR, a novel framework designed to address the inverse rendering problem for complicated scenes featuring rough and glossy surfaces.

CVPR 2025
Mitigating Ambiguities in 3D Classification with Gaussian Splatting
Ruiqi Zhang*, Hao Zhu*, Jingyi Zhao, Qi Zhang, Xun Cao, Zhan Ma
CVPR, 2025
arXiv

This paper proposes 3D classification based on Gaussian Splatting (GS) point clouds, finding that the scale and rotation coefficients of the GS point cloud help characterize surface types.

CVPR 2025
Mani-GS: Gaussian Splatting Manipulation with Triangular Mesh
Xiangjun Gao, Xiaoyu Li, Yiyu Zhuang, Qi Zhang, Wenbo Hu, Chaopeng Zhang, Yao Yao, Ying Shan, Long Quan
CVPR, 2025
Project Page / arXiv

This approach reduces the need to design various algorithms for different types of Gaussian manipulation. By utilizing a triangle shape-aware Gaussian binding and adapting method, we can achieve 3DGS manipulation and preserve high-fidelity rendering after manipulation.

CVPR 2025
TSD-SR: One-Step Diffusion with Target Score Distillation for Real-World Image Super-Resolution
Linwei Dong*, Qingnan Fan*, Yihong Guo, Zhonghao Wang, Qi Zhang, Jinwei Chen, Yawei Luo, Changqing Zou
CVPR, 2025
Project Page / arXiv

This paper proposes TSD-SR, a novel distillation framework specifically designed for real-world image super-resolution, aiming to construct an efficient and effective one-step model.

TVCG 2025
H2O-NeRF: Radiance Fields Reconstruction for Two-Hand-Held Objects
Xinxin Liu, Qi Zhang, Xin Huang, Ying Feng, Guoqing Zhou, Qing Wang
IEEE Transactions on Visualization and Computer Graphics (TVCG), 2025
Project Page

In this paper, we propose a novel neural representation-based framework to recover radiance fields of the two-hand-held object, named H2O-NeRF.

arXiv 2025
Hero-SR: One-Step Diffusion for Super-Resolution with Human Perception Priors
Jiangang Wang, Qingnan Fan, Qi Zhang, Haigen Liu, Yuhang Yu, Jinwei Chen, Wenqi Ren
arXiv, 2025
Project Page / arXiv / Code

Hero-SR consists of two novel modules: the Dynamic Time-Step Module (DTSM), which adaptively selects optimal diffusion steps for flexibly meeting human perceptual standards, and the Open-World Multi-modality Supervision (OWMS), which integrates guidance from both image and text domains through CLIP to improve semantic consistency and perceptual naturalness.

arXiv 2025
CoMPaSS: Enhancing Spatial Understanding in Text-to-Image Diffusion Models
Gaoyang Zhang, Bingtao Fu, Qingnan Fan, Qi Zhang, Runxing Liu, Hong Gu, Huaqi Zhang, Xinguo Liu
arXiv, 2025
Project Page / arXiv / Code

CoMPaSS significantly enhances spatial understanding in text-to-image diffusion models while preserving their text-only input nature, requiring zero additional parameters and negligible computational overhead during both training and inference, and seamlessly integrating with any model architecture or dataset.

SPL 2025
IR-Pro: Baking Probes to Model Indirect Illumination for Inverse Rendering of Scenes
Zhihao Liang, Qi Zhang, Yirui Guan, Ying Feng, Kui Jia
IEEE Signal Processing Letters, 2025
arXiv

IR-Pro bakes lighting probes to model indirect illumination, enabling efficient inverse rendering of scenes.

TPAMI 2024
LTM-NeRF: Embedding 3D Local Tone Mapping in HDR Neural Radiance Field
Xin Huang, Qi Zhang, Ying Feng, Hongdong Li, Qing Wang
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Project Page / arXiv

Our LTM-NeRF, which incorporates the Camera Response Function (CRF) module and the Neural Exposure Field, collaborates seamlessly with NeRF.

ECCV 2024 (Oral)
Analytic-Splatting Anti-Aliased 3D Gaussian Splatting via Analytic Integration
Zhihao Liang, Qi Zhang, Wenbo Hu, Lei Zhu, Ying Feng, Kui Jia
ECCV, 2024
Project Page / arXiv / Code / Viewer

In this paper, we derive an analytical solution to address the aliasing caused by discrete sampling in 3DGS.
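The core idea can be illustrated in 1D: instead of evaluating a Gaussian density at the pixel center (a single discrete sample, which aliases), integrate it in closed form over the pixel footprint via the error function. A minimal sketch under that simplification, not the paper's code; the function names are illustrative:

```python
import math

def gaussian_point_sample(x, sigma):
    # Discrete sample of a zero-mean 1D Gaussian density at x (aliasing-prone).
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def gaussian_pixel_integral(lo, hi, sigma):
    # Closed-form mass of the same Gaussian over the pixel window [lo, hi],
    # computed as CDF(hi) - CDF(lo) using the error function.
    def cdf(x):
        return 0.5 * (1.0 + math.erf(x / (sigma * math.sqrt(2.0))))
    return cdf(hi) - cdf(lo)

# For a pixel [0, 1] that is wide relative to sigma, center sampling times
# pixel width misestimates the true mass; the analytic integral does not.
sigma = 0.3
approx = gaussian_point_sample(0.5, sigma) * 1.0
exact = gaussian_pixel_integral(0.0, 1.0, sigma)
```

The gap between `approx` and `exact` grows as the pixel footprint widens relative to the Gaussian, which is exactly the regime where discrete sampling aliases.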

ECCV 2024
Neural Poisson Solver: A Universal and Continuous Framework for Natural Signal Blending
Delong Wu, Hao Zhu, Qi Zhang, You Li, Xun Cao, Zhan Ma
ECCV, 2024
Project Page / arXiv / Code

In this paper, we propose the Neural Poisson Solver, a universal and continuous framework for blending natural signals represented by implicit neural representations.

ECCV 2024
Physically Plausible Color Correction for Neural Radiance Fields
Qi Zhang, Ying Feng, Hongdong Li
ECCV, 2024
Project Page / arXiv / Code

In this paper, we address this problem by proposing a novel color correction module that simulates the physical color processing in cameras to be embedded in NeRF, enabling the unified color NeRF reconstruction.

ECCV 2024
Head360: Learning a Parametric 3D Full-Head for Free-View Synthesis in 360°
Yuxiao He, Yiyu Zhuang, Yanwen Wang, Yao Yao, Siyu Zhu, Xiaoyu Li, Qi Zhang, Xun Cao, Hao Zhu
ECCV, 2024
Project Page / arXiv / Code / Data

Our model is represented by a neural radiance field with hex-planes, conditioned on a generative neural texture and a parametric 3D mesh model.

arXiv 2024
Advances in 3D Generation: A Survey
Xiaoyu Li*, Qi Zhang*, Di Kang, Weihao Cheng, Yiming Gao, Jingbo Zhang, Zhihao Liang, Jing Liao, Yanpei Cao, Ying Shan
arXiv, 2024
Project Page / arXiv / Code

In this survey, we aim to introduce the fundamental methodologies of 3D generation methods and establish a structured roadmap, encompassing 3D representation, generation methods, datasets, and corresponding applications.

Knowledge-Based Systems 2025
UV Gaussians: Joint Learning of Mesh Deformation and Gaussian Textures for Human Avatar Modeling
Yujiao Jiang, Qingmin Liao, Xiaoyu Li, Li Ma, Qi Zhang, Chaopeng Zhang, Zongqing Lu, Ying Shan
Knowledge-Based Systems, 2025
Project Page / arXiv / Code

In this paper, we introduce UV Gaussians, which jointly learn mesh deformations and 2D UV-space Gaussian textures for human avatar modeling.

CVPR 2024
GS-IR: 3D Gaussian Splatting for Inverse Rendering
Zhihao Liang*, Qi Zhang*, Ying Feng, Ying Shan, Kui Jia
CVPR, 2024
Project Page / arXiv / Code

We propose GS-IR, a novel inverse rendering approach based on 3D Gaussian Splatting (GS) that leverages forward mapping volume rendering to achieve photorealistic novel view synthesis and relighting results.
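Forward-mapping volume rendering composites depth-sorted splats front-to-back: C = Σᵢ cᵢ αᵢ Πⱼ₍ⱼ₌₁..ᵢ₋₁₎ (1 − αⱼ). A minimal scalar sketch of that compositing rule (illustrative only, not the GS-IR implementation):

```python
def composite(colors, alphas):
    """Front-to-back alpha compositing along one ray:
    C = sum_i c_i * alpha_i * prod_{j<i} (1 - alpha_j).
    colors/alphas must already be sorted near-to-far."""
    color = 0.0
    transmittance = 1.0  # fraction of light not yet absorbed by nearer splats
    for c, a in zip(colors, alphas):
        color += transmittance * a * c
        transmittance *= (1.0 - a)  # each splat occludes what lies behind it
    return color, transmittance

# An opaque far splat behind a half-transparent near one:
c, t = composite([1.0, 0.5], [0.5, 1.0])  # c = 0.5*1.0 + 0.5*1.0*0.5 = 0.75
```

In a real splatting renderer the same loop runs per pixel over the Gaussians whose 2D projections cover it, with `c` and `a` derived from each Gaussian's color, opacity, and projected footprint.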

CVPR 2024
HumanNorm: Learning Normal Diffusion Model for High-quality and Realistic 3D Human Generation
Xin Huang*, Ruizhi Shao*, Qi Zhang, Hongwen Zhang, Ying Feng, Yebin Liu, Qing Wang
CVPR, 2024
Project Page / arXiv / Code

We propose HumanNorm, a novel approach for high-quality and realistic 3D human generation by learning the normal diffusion model including a normal-adapted diffusion model and a normal-aligned diffusion model.

CVPR 2024
FINER: Flexible spectral-bias tuning in Implicit NEural Representation by Variable-periodic Activation Functions
Zhen Liu*, Hao Zhu*, Qi Zhang, Jingde Fu, Weibing Deng, Zhan Ma, Yanwen Guo, Xun Cao
CVPR, 2024
Project Page / PDF / Code

We have identified that this frequency-related problem can be greatly alleviated by introducing variable-periodic activation functions, for which we propose FINER.

CVPR 2024
HumanRef: Single Image to 3D Human Generation via Reference-Guided Diffusion
Jingbo Zhang, Xiaoyu Li, Qi Zhang, Yanpei Cao, Ying Shan, Jing Liao
CVPR, 2024
Extended as HumanRef-GS in TCSVT, 2025
Project Page / arXiv / Code

HumanRef, a reference-guided 3D human generation framework, is capable of generating 3D clothed human with realistic, view-consistent texture and geometry from a single image input with the help of stable diffusion model.

CVPR 2024
ConTex-Human: Free-View Rendering of Human from a Single Image with Texture-Consistent Synthesis
Xiangjun Gao, Xiaoyu Li, Chaopeng Zhang, Qi Zhang, Yanpei Cao, Ying Shan, Long Quan
CVPR, 2024
Project Page / arXiv / Code

In this paper, we introduce a texture-consistent back view synthesis module that could transfer the reference image content to the back view through depth and text-guided attention injection with the help of stable diffusion model.

AAAI 2024
NeIF: A Pre-convolved Representation for Plug-and-Play Neural Illumination Fields
Yiyu Zhuang*, Qi Zhang*, Xuan Wang, Hao Zhu, Ying Feng, Xiaoyu Li, Ying Shan, Xun Cao
AAAI, 2024
Project Page / arXiv / Code

We propose a fully differentiable framework named NeIF that uses Neural Radiance Fields (NeRF) as a lighting model to handle complex lighting in a physically based way.

TPAMI 2024
Disorder-invariant Implicit Neural Representation
Hao Zhu*, Shaowen Xie*, Zhen Liu*, Fengyi Liu, Qi Zhang, You Zhou, Yi Lin, Zhan Ma, Xun Cao
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2024
Project Page / arXiv / Code

In this paper, we find that such a frequency-related problem could be largely solved by re-arranging the coordinates of the input signal, for which we propose the disorder-invariant implicit neural representation (DINER) by augmenting a hash-table to a traditional INR backbone.

SIGGRAPH Asia 2023
Anti-Aliased Neural Implicit Surfaces with Encoding Level of Detail
Yiyu Zhuang*, Qi Zhang*, Ying Feng, Hao Zhu, Yao Yao, Xiaoyu Li, Yanpei Cao, Ying Shan, Xun Cao
SIGGRAPH Asia, 2023
Project Page / arXiv / Code

Our method, called LoD-NeuS, adaptively encodes Level of Detail (LoD) features derived from the multi-scale and multi-convoluted tri-plane representation. By optimizing a neural Signed Distance Field (SDF), our method is capable of reconstructing high-fidelity geometry.

IJCV 2023
Pyramid NeRF: Frequency Guided Fast Radiance Field Optimization
Junyu Zhu*, Hao Zhu*, Qi Zhang, Fang Zhu, Zhan Ma, Xun Cao
International Journal of Computer Vision (IJCV), 2023
Project Page / PDF / Code

In this paper, we propose Pyramid NeRF, which guides NeRF training in a "low-frequency first, high-frequency second" style using image pyramids, improving training and inference speed by 15x and 805x, respectively.
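The "low-frequency first" schedule relies on the fact that downsampled images retain only low frequencies. A minimal sketch of such a pyramid via 2x2 average pooling (illustrative; the paper's actual filtering and schedule may differ):

```python
import numpy as np

def build_pyramid(image, levels):
    """Build an image pyramid by repeated 2x2 average pooling.
    image: H x W x C array. Returns levels arrays in coarse-to-fine order,
    so training can iterate over them from low to high frequency."""
    pyramid = [image]
    for _ in range(levels - 1):
        h, w = pyramid[-1].shape[:2]
        crop = pyramid[-1][:h - h % 2, :w - w % 2]  # make dims even
        # Group pixels into 2x2 blocks and average each block.
        coarse = crop.reshape(h // 2, 2, w // 2, 2, -1).mean(axis=(1, 3))
        pyramid.append(coarse)
    return pyramid[::-1]  # coarse (low-frequency) levels first
```

Supervising first on the coarse levels fits the smooth structure of the scene before the fine levels introduce high-frequency detail.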

CVPR 2023
Wide-angle Rectification via Content-aware Conformal Mapping
Qi Zhang, Hongdong Li, Qing Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Project Page / arXiv

We propose a new content-aware optimization framework to preserve both local conformal shape (e.g. face or salient regions) and global linear structures (straight lines).

CVPR 2023
Inverting the Imaging Process by Learning an Implicit Camera Model
Xin Huang, Qi Zhang, Ying Feng, Hongdong Li, Qing Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Project Page / arXiv / Code

This paper proposes a novel implicit camera model which represents the physical imaging process of a camera as a deep neural network. We demonstrate the power of this new implicit camera model on two inverse imaging tasks: i) generating all-in-focus photos, and ii) HDR imaging.

CVPR 2023
Local Implicit Ray Function for Generalizable Radiance Field Representation
Xin Huang, Qi Zhang, Ying Feng, Xiaoyu Li, Xuan Wang, Qing Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Project Page / arXiv / Code

We propose LIRF (Local Implicit Ray Function), a generalizable neural rendering approach for novel view rendering. Given 3D positions within conical frustums, LIRF takes 3D coordinates and the features of conical frustums as inputs and predicts a local volumetric radiance field.

CVPR 2023 (Highlight)
DINER: Disorder-Invariant Implicit Neural Representation
Shaowen Xie*, Hao Zhu*, Zhen Liu*, Qi Zhang, You Zhou, Xun Cao, Zhan Ma
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Project Page / arXiv / Code

In this paper, we find that such a frequency-related problem could be largely solved by re-arranging the coordinates of the input signal, for which we propose the disorder-invariant implicit neural representation (DINER) by augmenting a hash-table to a traditional INR backbone.

CVPR 2023
Local-to-Global Registration for Bundle-Adjusting Neural Radiance Fields
Yue Chen, Xingyu Chen, Xuan Wang, Qi Zhang, Yu Guo, Ying Shan, Fei Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Project Page / arXiv / Code

We propose L2G-NeRF, a Local-to-Global registration method for bundle-adjusting Neural Radiance Fields, including the pixel-wise local alignment and the frame-wise global alignment.

CVPR 2023
UV Volumes for Real-time Rendering of Editable Free-view Human Performance
Yue Chen, Xuan Wang*, Xingyu Chen, Qi Zhang, Xiaoyu Li, Yu Guo, Jue Wang, Fei Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Project Page / arXiv / Code / Video

We propose UV Volumes, a new approach that achieves real-time rendering and editing of free-view human performance by decomposing a dynamic human into 3D UV Volumes and a 2D appearance texture.

CVPR 2023
Fine-Grained Face Swapping via Regional GAN Inversion
Zhian Liu*, Maomao Li*, Yong Zhang*, Cairong Wang, Qi Zhang, Jue Wang, Yongwei Nie
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2023
Project Page / arXiv / Code

We present a novel paradigm for high-fidelity face swapping that faithfully preserves the desired subtle geometry and texture details.

SIGGRAPH 2022
Neural Parameterization for Dynamic Human Head Editing
Li Ma, Xiaoyu Li, Jing Liao, Xuan Wang, Qi Zhang, Jue Wang, Pedro V. Sander
ACM Transactions on Graphics, 2022
Project Page / arXiv / Code

We introduce explicit parameters into implicit dynamic NeRF representations to enable editing of 3D human heads.

arXiv 2022
Stereo Unstructured Magnification: Multiple Homography Image for View Synthesis
Qi Zhang*, Xin Huang*, Ying Feng, Xue Wang, Hongdong Li, Qing Wang
arXiv, 2022
arXiv

We propose a novel multiple homography image (MHI) representation, comprising a set of scene planes with fixed normals and distances, for view synthesis from stereo images.

CVPR 2022
HDR-NeRF: High Dynamic Range Neural Radiance Fields
Xin Huang, Qi Zhang, Ying Feng, Hongdong Li, Xuan Wang, Qing Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Project Page / arXiv / Code / Dataset / video

We present High Dynamic Range Neural Radiance Fields (HDR-NeRF) to recover an HDR radiance field from a set of low dynamic range (LDR) views with different exposures.
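The underlying image-formation model maps scene radiance E and exposure time dt through a monotonic camera response function (CRF) to an LDR value. A toy sketch with a gamma curve standing in for the CRF (an assumption for illustration; HDR-NeRF learns the response rather than fixing it):

```python
def crf_gamma(radiance, exposure, gamma=2.2):
    """Toy camera response: LDR = clip((E * dt) ** (1/gamma)) in [0, 1].
    The gamma curve is a stand-in for a learned CRF."""
    return min(1.0, max(0.0, (radiance * exposure) ** (1.0 / gamma)))

def recover_radiance(ldr, exposure, gamma=2.2):
    """Invert the toy CRF for unsaturated pixels: E = LDR**gamma / dt."""
    return (ldr ** gamma) / exposure

# The same scene radiance observed at two exposure times yields two
# different LDR values that invert back to the same HDR radiance.
E = 0.4
ldr_short = crf_gamma(E, 0.5)
ldr_long = crf_gamma(E, 2.0)
```

Because multiple exposures of one scene constrain both the radiance field and the response curve, the HDR field can be recovered from LDR inputs alone (up to saturation and a global scale).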

CVPR 2022
Hallucinated Neural Radiance Fields in the Wild
Xingyu Chen, Qi Zhang, Xiaoyu Li, Yue Chen, Ying Feng, Xuan Wang, Jue Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Project Page / arXiv

This paper studies the problem of hallucinated NeRF: i.e. recovering a realistic NeRF at a different time of day from a group of tourism images.

CVPR 2022
Deblur-NeRF: Neural Radiance Fields from Blurry Images
Li Ma, Xiaoyu Li, Jing Liao, Qi Zhang, Xuan Wang, Jue Wang, Pedro V. Sander
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Project Page / arXiv

In this paper, we propose Deblur-NeRF, the first method that can recover a sharp NeRF from blurry input. A novel Deformable Sparse Kernel (DSK) module is presented for both camera motion blur and defocus blur.

CVPR 2022
FENeRF: Face Editing in Neural Radiance Fields
Jingxiang Sun, Xuan Wang, Yong Zhang, Xiaoyu Li, Qi Zhang, Yebin Liu, Jue Wang
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2022
Project Page / arXiv

A 3D-aware generator (FENeRF) is proposed to produce view-consistent and locally-editable portrait images.

TPAMI 2022
Ray-Space Epipolar Geometry for Light Field Cameras
Qi Zhang, Qing Wang, Hongdong Li, Jingyi Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2022
PDF / bibtex

This paper fills in this gap by developing a novel ray-space epipolar geometry which intrinsically encapsulates the complete projective relationship between two light fields. Ray-space fundamental matrix and its properties are then derived to constrain ray-ray correspondences for general and special motions.

IJCV 2021
3D Scene Reconstruction with an Un-calibrated Light Field Camera
Qi Zhang, Hongdong Li, Xue Wang, Qing Wang
International Journal of Computer Vision (IJCV), 2021
PDF / bibtex

This paper is concerned with the problem of multi-view 3D reconstruction with an un-calibrated micro-lens array based light field camera.

TCI 2019
Full View Optical Flow Estimation Leveraged From Light Field Superpixel
Hao Zhu, Xiaoming Sun, Qi Zhang, Qing Wang, Antonio Robles-Kelly, Hongdong Li, Shaodi You
IEEE Transactions on Computational Imaging, 2019
PDF / bibtex

Our method employs the structure delivered by the four-dimensional light field over multiple views, making use of superpixels for full-view optical flow estimation.

CVPR 2019
Ray-Space Projection Model for Light Field Camera
Qi Zhang, Jinbo Ling, Qing Wang, Jingyi Yu
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2019
PDF / Supp / bibtex

In this paper, we propose a novel ray-space projection model to transform sets of rays captured by multiple light field cameras in terms of Plücker coordinates.
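In Plücker form a ray is the pair (d, m) with direction d and moment m = o × d, and two lines meet exactly when their reciprocal product d₁·m₂ + d₂·m₁ vanishes. A small self-contained sketch of that representation (illustrative function names, not the paper's code):

```python
def pluecker(origin, direction):
    """Plücker coordinates (d, m) of a ray: direction d and moment m = o x d."""
    ox, oy, oz = origin
    dx, dy, dz = direction
    m = (oy * dz - oz * dy, oz * dx - ox * dz, ox * dy - oy * dx)
    return (dx, dy, dz), m

def reciprocal_product(r1, r2):
    """d1 . m2 + d2 . m1; zero iff the two lines are coplanar (they intersect
    or are parallel). This is the algebraic test behind ray-ray constraints."""
    (d1, m1), (d2, m2) = r1, r2
    dot = lambda a, b: sum(x * y for x, y in zip(a, b))
    return dot(d1, m2) + dot(d2, m1)

# Two rays through the common point (1, 1, 1) have zero reciprocal product:
r1 = pluecker((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
r2 = pluecker((1.0, 0.0, 1.0), (0.0, 1.0, 0.0))
```

The same bilinear form is what a ray-space fundamental matrix constrains across two light fields, analogous to how the point-space fundamental matrix constrains pixel correspondences.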

TIP 2019
4D Light Field Superpixel and Segmentation
Hao Zhu, Qi Zhang, Qing Wang, Hongdong Li
IEEE Transactions on Image Processing (TIP), 2019
IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), 2017
PDF / bibtex

The light field superpixel (LFSP) is first defined mathematically and then a refocus-invariant metric named LFSP self-similarity is proposed to evaluate the segmentation performance.

TPAMI 2019
A Generic Multi-Projection-Center Model and Calibration Method for Light Field Camera
Qi Zhang, Chunping Zhang, Jinbo Ling, Qing Wang, Jingyi Yu
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 2019
arXiv / PDF / Code / bibtex

The MPC model can generally parameterize light field in different imaging formations, including conventional and focused light field cameras.

Other Publications

Qi Zhang, Qing Wang. Common self-polar triangle of separate circles for light field camera calibration[J]. Journal of Northwestern Polytechnical University, 2021.
Yaning Li, Qi Zhang, Xue Wang, Qing Wang. Light Field SLAM based on Ray-Space Projection Model[C]. Optoelectronic Imaging and Multimedia Technology VI, 2019.
Qi Zhang, Xue Wang, Qing Wang. Light Field Planar Homography and Its Application[C]. Optoelectronic Imaging and Multimedia Technology VI, 2019.
Zhao Ren, Qi Zhang, Hao Zhu, Qing Wang. Extending the FOV from disparity and color consistencies in multiview light fields[C]. IEEE International Conference on Image Processing (ICIP), 2017.


This template is a modification to Jon Barron's website. Feel free to clone it for your own use while attributing the original author Jon Barron.