Research

My research focuses on advancing generative AI by leveraging efficient one-step generation for real-world applications such as video enhancement and image inpainting. I am particularly interested in diffusion-based methods that achieve high-quality results with minimal computational cost, making generative models more practical and accessible. In addition to efficiency, I work on adversarial defenses to ensure the responsible and safe deployment of generative video models. I aim to expand my research toward multimodal and video generation, with a focus on improving fine-grained temporal consistency and efficiency, while incorporating physically aware conditioning into these models.
Selected Preprints
* denotes equal contribution
|
Anti-I2V: Safeguarding Your Photos from Malicious Image-to-Video Generation.
Duc Vu, Anh Nguyen, Chi Tran, Anh Tran, Under Review, 2025
A memory-efficient adversarial attack against diverse image-to-video diffusion models, enabled by robust dual-space perturbation optimization.
|
InverFill: One-Step Inversion for Enhanced Few-Step Diffusion Inpainting.
Duc Vu*, Kien Nguyen*, Trong-Tung Nguyen, Ngan Nguyen, Phong Nguyen, Khoi Nguyen, Cuong Pham, Anh Tran, Under Review, 2025
A highly efficient one-step inversion diffusion network for high-quality few-step image inpainting.
|
VideoDrift: Plug-and-Play Video Refinement for Diffusion Models via KV-Anchored Attention.
Ngan Nguyen, Duc Vu, Trong-Tung Nguyen, Phuc Hong Lai, Cuong Pham, Anh Tran, Under Review, 2025
A plug-and-play, backbone-agnostic video enhancer with low compute: each frame requires only an inversion and a single generator pass, making it easy to deploy on outputs from diverse T2V systems.
Selected Publications
* denotes equal contribution
|
Improved Training Technique for Shortcut Models.
Anh Nguyen*, Viet Nguyen*, Duc Vu, Trung Dao, Chi Tran, Toan Tran, Anh Tran, NeurIPS, 2025
ArXiv
iSM resolves five major shortcut-model flaws with dynamic guidance, wavelet loss, sOT, and Twin EMA, yielding markedly better image generation.
|
SwiftBrush v2: Make Your One-step Diffusion Model Better Than Its Teacher.
Trung Dao, Thuan Hoang Nguyen*, Thanh Le*, Duc Vu*, Khoi Nguyen, Cuong Pham, Anh Tran, ECCV, 2024
Project page / ArXiv
An improved SwiftBrush variant that enables the one-step diffusion student to beat its multi-step teacher.
|
EFHQ: Multi-purpose ExtremePose-Face-HQ dataset.
Trung Dao*, Duc Vu*, Cuong Pham, Anh Tran, CVPR, 2024
Project page / ArXiv
A high-quality dataset centered on extreme-pose faces, supporting face synthesis, reenactment, recognition benchmarking, and more.