Duc Vu

🫶 Hi! I'm Duc. Nice to meet y'all! 🫶 I am an AI research resident at Qualcomm Research in Vietnam, where I am fortunate to be supervised by Dr. Anh Tran. Prior to this, I received my Bachelor of Science in Data Science & Statistics and Bachelor of Arts in Economics from Miami University - Oxford in the U.S. (2024).

I am applying to PhD programs in Computer Science for Fall 2026 and am excited to pursue impactful research. 🚀📚

Email  /  CV  /  Scholar  /  Github


Research

My research focuses on advancing generative AI by leveraging efficient one-step generation for real-world applications such as video enhancement and image inpainting. I am particularly interested in diffusion-based methods that achieve high-quality results with minimal computational cost, making generative models more practical and accessible. In addition to efficiency, I am also working on adversarial defenses to ensure the responsible and safe deployment of generative video models.

I aim to expand my research toward multimodal and video generation, with a focus on improving fine-grained temporal consistency and efficiency, while incorporating physically aware conditioning into these models.

Selected Preprints

* denotes equal contribution

Anti-I2V: Safeguarding your photos from malicious image-to-video generation.
Duc Vu, Anh Nguyen, Chi Tran, Anh Tran,
Under Review, 2025  

A memory-efficient adversarial attack against diverse image-to-video diffusion models, enabled by robust dual-space perturbation optimization.

InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting.
Duc Vu*, Kien Nguyen*, Trong-Tung Nguyen, Ngan Nguyen, Phong Nguyen, Khoi Nguyen, Cuong Pham, Anh Tran,
Under Review, 2025  

A highly efficient one-step inversion diffusion network for high-quality few-step image inpainting.

VideoDrift: Plug-and-Play Video Refinement for Diffusion Models via KV-Anchored Attention.
Ngan Nguyen, Duc Vu, Trong-Tung Nguyen, Phuc Hong Lai, Cuong Pham, Anh Tran,
Under Review, 2025  

A plug-and-play, backbone-agnostic video enhancer with low compute: each frame requires only inversion plus one generator pass, making it easy to deploy on outputs from diverse T2V systems.

Selected Publications

* denotes equal contribution

Improved Training Technique for Shortcut Models.
Anh Nguyen*, Viet Nguyen*, Duc Vu, Trung Dao, Chi Tran, Toan Tran, Anh Tran,
NeurIPS, 2025  
ArXiv

iSM resolves five major flaws in shortcut models using dynamic guidance, a wavelet loss, sOT, and Twin EMA, yielding markedly better image generation.

SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher.
Trung Dao, Thuan Hoang Nguyen*, Thanh Le*, Duc Vu*, Khoi Nguyen, Cuong Pham, Anh Tran,
ECCV, 2024  
Project page / ArXiv

An improved SwiftBrush version that enables the one-step diffusion student to outperform its multi-step teacher.

EFHQ: Multi-purpose ExtremePose-Face-HQ dataset.
Trung Dao*, Duc Vu*, Cuong Pham, Anh Tran,
CVPR, 2024  
Project page / ArXiv

A high-quality dataset centered on extreme pose faces, supporting face synthesis, reenactment, recognition benchmarking, and more.