Phong Nguyen-Ha

Research Scientist · CV / 3D / Generative AI

Building perceptual systems for the physical world.

I am a Senior Researcher at Qualcomm AI Research Viet Nam, where I currently lead a team working on world models and robot learning. I received my PhD in Computer Science from the University of Oulu, Finland.

Research focus

My work spans 3D reconstruction, neural scene representations, depth estimation, and controllable image generation. I am especially interested in systems that are both physically grounded and computationally efficient.

Current themes

  • Generalizable novel view synthesis
  • 3D content generation and editing
  • Geometry-aware image and video modeling
  • Efficient training and inference for vision systems

Selected work

Research portfolio

ECCV 2026

Cross-Space Distillation

Teaching one-step students with modern diffusion teachers using a latent-space bridge for cross-space distillation.

ECCV 2026

PixGS

Direct 3D Gaussian splat generation with pixel-space diffusion for high-quality, single-stage 3D asset synthesis.

CVPR 2026 Highlight

SwiftTailor

Efficient 3D garment generation with geometry-image representation and pattern-aware synthesis.

CVPR 2026 Highlight

PixelRush

Training-free, ultra-fast high-resolution image generation with a one-step diffusion pipeline.

CVPR 2026

Ar2Can

Multi-human generation with a spatial planner and identity-preserving diffusion renderer.

In submission

ModeDreamer

Guiding score distillation with reference-image prompts for more stable and diverse text-to-3D generation.

CVPR 2025

SharpDepth

Sharpening metric depth predictions using diffusion distillation for accurate and crisp geometry.

AAAI 2025

Semi-supervised 3D Semantic Scene Completion

Using 2D vision foundation models to reduce the need for costly voxel-wise labels in 3D occupancy prediction.

ECCV 2024

DiverseDream

Diversifying text-to-3D synthesis by augmenting prompts and modeling joint generation.

TPAMI 2024

CG-NeRF

Generalizable neural radiance fields for fast novel-view synthesis with a coarse-to-fine renderer.

PhD Thesis 2023

Neural Scene Representations for Learning-Based View Synthesis

Introducing neural scene representations for efficient and compact learning-based novel view synthesis.

ECCV 2022

Free-Viewpoint RGB-D Human Performance Capture (HVS-Net)

Learning dense features in novel views and inpainting occluded regions for photorealistic human rendering.

3DV 2021

RGBD-Net

Predicting color and depth images for novel-view synthesis through a depth-aware generator.

WACV 2021

LiDNAS

Efficient neural architecture search for lightweight monocular depth estimation.

3DV 2021

Monocular Depth with Saliency and Hessian Loss (FuSaNet)

Improving monocular depth estimation with salient point priors and a Hessian-based loss.

ICCV 2021

Boosting Monocular Depth with 3D Point Fusion

Guiding depth networks with extremely sparse point clouds for more robust geometry estimation.

ACCV 2020

Sequential View Synthesis with Transformer (T-GQN)

Using multi-view attention and a sequential decoder for consistent scene synthesis without retraining.

ECCV 2020

Depth-Attention Volume

Using a non-local coplanarity prior to guide monocular depth estimation with planar structure awareness.

SCIA 2019

Predicting Novel Views Using Generative Adversarial Query Network

Combining GQN with adversarial training for generative novel view synthesis.

Sensors 2018

LightDenseYOLO

A fast and accurate marker tracker for autonomous UAV landing using visible-light camera input.

Sensors 2017

Remote Marker-Based Tracking for UAV Landing

Reliable visible-light marker-based tracking for drone landing without GPS guidance.