Duc Vu

Research

My research interests include physics-aware and object-interaction video generation, with the goal of developing generative models that can better understand, capture, and simulate real-world physical dynamics. Previously, my work focused on diffusion-based generative models, with an emphasis on diffusion distillation and efficient one-step generation techniques for practical applications such as video enhancement and image inpainting. I am particularly interested in methods that maintain high generation quality while reducing computational cost, making generative AI more scalable, efficient, and accessible. I have also worked on adversarial defenses to support the safe and responsible deployment of generative video models.

Physics-Aware Video Gen · One/Few-Step Diffusion · Diffusion Distillation · Video Enhancement · Image Inpainting · Adversarial Defense

Selected Publications

* denotes equal contribution

Cross-Space Distillation: Teaching One-Step Students with Modern Diffusion Teachers.

Anh Nguyen*, Ngan Nguyen*, Duc Vu*, Trung Dao, Viet Nguyen, Quan Dao, Kien Nguyen, Chi Tran, Phong Nguyen, Khoi Nguyen, Cuong Pham, Dimitris N. Metaxas, Vishal M. Patel, Anh Tran

ECCV 2026

We formalize Cross-Space Distillation and introduce Bridge, a lightweight latent-space interface that makes standard one-step distillation possible across mismatched resolutions, VAEs, architectures, and diffusion/flow paradigms.

Anti-I2V: Safeguarding your photos from malicious image-to-video generation.

Duc Vu, Anh Nguyen, Chi Tran, Anh Tran

CVPR 2026

Website ArXiv

A memory-efficient adversarial attack against diverse image-to-video diffusion models, enabled by robust dual-space perturbation optimization.

InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting.

Duc Vu*, Kien Nguyen*, Trong-Tung Nguyen*, Ngan Nguyen, Phong Nguyen, Khoi Nguyen, Cuong Pham, Anh Tran

CVPR 2026

Website ArXiv

A highly efficient one-step inversion diffusion network for high-quality few-step image inpainting.

Improved Training Technique for Shortcut Models.

Anh Nguyen*, Viet Nguyen*, Duc Vu, Trung Dao, Chi Tran, Toan Tran, Anh Tran

NeurIPS 2025

ArXiv

iSM resolves five major shortcut-model flaws with dynamic guidance, wavelet loss, sOT, and Twin EMA, yielding markedly better image generation.

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher.

Trung Dao, Thuan Hoang Nguyen*, Thanh Le*, Duc Vu*, Khoi Nguyen, Cuong Pham, Anh Tran

ECCV 2024

Website ArXiv

An improved version of SwiftBrush that makes the one-step diffusion student beat its multi-step teacher.

EFHQ: Multi-purpose ExtremePose-Face-HQ dataset.

Trung Dao*, Duc Vu*, Cuong Pham, Anh Tran

CVPR 2024

Website ArXiv

A high-quality dataset centered on extreme pose faces, supporting face synthesis, reenactment, recognition benchmarking, and more.

Selected Preprints

* denotes equal contribution

VideoDrift: Plug-and-Play Video Refinement for Diffusion Models via KV-Anchored Attention.

Ngan Nguyen, Duc Vu, Trong-Tung Nguyen, Phuc Hong Lai, Cuong Pham, Anh Tran

Under Review 2026

A plug-and-play, backbone-agnostic video enhancer with low compute: each frame requires only inversion plus one generator pass, making it easy to deploy on outputs from diverse T2V systems.