GenTron: Diffusion Transformers for Image and Video Generation

New! We’re actively recruiting Postdocs, PhDs, and RAs. Please email me at pluo.lhi@gmail.com.
2024-03: Two papers will be presented at SIGGRAPH'24, eight papers at ICLR'24 (two spotlights), and eight papers at CVPR'24.
2024-01: Received the prestigious HKU Outstanding Young Researcher Award 2023. Enjoy the video!
2023-10: DiffusionDet was nominated for the Best Paper Final List (17/8260, 0.2%) at ICCV 2023. PVT v2 received the Best Paper Runner-up of the Year 2023 at the Computational Visual Media Journal (CVMJ). SegFormer and PVT v1 received the Outstanding Young Paper Awards at the World AI Conference (WAIC) 2023.
2023-06: Ten papers will be presented at CVPR'23, eleven at ICCV'23, three at ICLR'23, three at ICML'23, and six at NeurIPS'23.
2022-05: Our paper “Compression of Generative Pre-trained Language Models via Quantization” received the ACL 2022 Outstanding Paper Award. 5 papers were presented at ICLR 2022 (CycleMLP was an oral presentation, acceptance rate 1.6%) and 7 papers at CVPR 2022 (2 oral presentations); 3 papers will be presented at ICML 2022.

My research aims to (1) develop differentiable, meta-, and reinforcement learning algorithms that enable machines and devices to solve complex tasks with greater autonomy, (2) understand the foundations of deep learning algorithms, and (3) enable applications in machine vision and artificial intelligence such as text-to-image/video generation, 3D vision, scene and video understanding, and medical image analysis.

Biography

Ping Luo is an Associate Professor in the Department of Computer Science at the University of Hong Kong, an Associate Director of the HKU Musketeers Foundation Institute of Data Science (HKU IDS), and a Deputy Director of the Joint Research Lab of HKU and Shanghai AI Lab. He obtained his Ph.D. in Information Engineering from the Chinese University of Hong Kong in 2014, under the supervision of Prof. Xiaoou Tang (founder of SenseTime) and Prof. Xiaogang Wang. Before joining HKU in 2019, he was a Research Director at SenseTime. He has published 100+ papers in international conferences and journals such as TPAMI, ICML, ICLR, NeurIPS, and CVPR, with over 50,000 citations on Google Scholar. His honors include the 2015 AAAI Easily Accessible Paper award, a nomination for the 2022 Computational Visual Media Journal Best Paper of the Year, the 2022 ACL Outstanding Paper Award, the 2023 World Artificial Intelligence Conference (WAIC) Outstanding Paper Awards, and a Best Paper candidacy at ICCV’23. He was recognized as one of the innovators under 35 in the Asia-Pacific region by the MIT Technology Review (MIT TR35) in 2020. He has mentored 30 Ph.D. students, many of whom have received significant awards such as the Nvidia Fellowship, the Baidu Fellowship, and the WAIC Yunfan Award.

Recent Publications

Qiushan Guo, Shalini De Mello, Hongxu Yin, Wonmin Byeon, Ka Chun Cheung, Yizhou Yu, Ping Luo, Sifei Liu (2024). RegionGPT: Towards Region Understanding Vision Language Model. Computer Vision and Pattern Recognition (CVPR) 2024.

PDF

Zhixuan Liang, Yao Mu, Hengbo Ma, Masayoshi Tomizuka, Mingyu Ding, Ping Luo (2024). SkillDiffuser: Interpretable Hierarchical Planning via Skill Abstractions in Diffusion-Based Task Execution. Computer Vision and Pattern Recognition (CVPR) 2024.

PDF

Shoufa Chen, Mengmeng Xu, Jiawei Ren, Yuren Cong, Sen He, Yanping Xie, Animesh Sinha, Ping Luo, Tao Xiang, Juan-Manuel Perez-Rua (2024). GenTron: Diffusion Transformers for Image and Video Generation. Computer Vision and Pattern Recognition (CVPR) 2024.

PDF

Zhouxia Wang, Ziyang Yuan, Xintao Wang, Tianshui Chen, Menghan Xia, Ping Luo, Ying Shan (2024). MotionCtrl: A Unified and Flexible Motion Controller for Video Generation. SIGGRAPH 2024.

PDF Code

Anran Liu, Cheng Lin, Yuan Liu, Xiaoxiao Long, Zhiyang Dou, Hao-Xiang Guo, Ping Luo, Wenping Wang (2024). Part123: Part-aware 3D Reconstruction from a Single-view Image. SIGGRAPH 2024.

Junsong Chen, Jincheng YU, Chongjian GE, Lewei Yao, Enze Xie, Zhongdao Wang, James Kwok, Ping Luo, Huchuan Lu, Zhenguo Li (2024). PixArt-alpha: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis. International Conference on Learning Representations (ICLR) 2024.

PDF

Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding (2024). VDT: General-purpose Video Diffusion Transformers via Mask Modeling. International Conference on Learning Representations (ICLR) 2024.

PDF

Wenqi Shao, Mengzhao Chen, Zhaoyang Zhang, Peng Xu, Lirui Zhao, Zhiqian Li, Kaipeng Zhang, Peng Gao, Yu Qiao, Ping Luo (2024). OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models. International Conference on Learning Representations (ICLR) 2024.

PDF

Wenhai Wang, Zhe Chen, Xiaokang Chen, Jiannan Wu, Xizhou Zhu, Gang Zeng, Ping Luo, Tong Lu, Jie Zhou, Yu Qiao, Jifeng Dai (2023). VisionLLM: Large Language Model is also an Open-Ended Decoder for Vision-Centric Tasks. Thirty-seventh Annual Conference on Neural Information Processing Systems (NeurIPS) 2023.

PDF

Yao Mu, Qinglong Zhang, Mengkang Hu, Wenhai Wang, Mingyu Ding, Jun Jin, Bin Wang, Jifeng Dai, Yu Qiao, Ping Luo (2023). EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought. Thirty-seventh Annual Conference on Neural Information Processing Systems (NeurIPS) 2023.

PDF

News&Talks

Recruitment and Opportunities

We are recruiting Postdocs, RAs, full-time and part-time PhD students, interns, and Research Scientists. Please email me at pluo.lhi@gmail.com.

Feb 18, 2020 3:38 PM

Ping Luo

Understanding Normalization in Deep Learning

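A convolutional layer, a normalization layer, and a nonlinear activation function together form the "atomic" structure of a deep convolutional neural network (ConvNet). Stacking this basic structure yields many widely used neural networks. Normalization methods are one of the key components of these networks. This talk centers on normalization methods in deep learning and the regularization they bring to neural networks …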

Feb 18, 2019 6:28 PM

Ping Luo

PDF Code Project Video

Learning-to-Learn-to-Normalize: Algorithms, Applications and Theory

Introduce a new family of normalization methods in Deep Learning.

Dec 18, 2018 4:37 PM

Ping Luo

Project

A Brief Discussion of Deep Learning: Regularization and Generalization in Normalization (浅谈深度学习：归一化中的正则与泛化)

Understanding batch normalization in Deep Learning. This blog was written in Chinese.

Oct 18, 2018 4:37 PM

Ping Luo

Project

WIDER Face and Pedestrian Challenge 2018

We organize a new challenge in conjunction with ECCV 2018. The challenge centers around the problem of precise localization of human …

Jun 18, 2018 6:28 PM

Chen Change Loy, Dahua Lin, Wanli Ouyang, Yuanjun Xiong, Shuo Yang, Qingqiu Huang, Dongzhan Zhou, Wei Xia, Quanquan Li, Ping Luo, Junjie Yan

Project

Principal Investigator

Ping Luo

Associate Professor, Computer Science, The University of Hong Kong

Advisory Committee

Wenping Wang

Professor, IEEE Fellow

Xiaoou Tang

In Forever Memory of Professor Sean Tang

PhD Candidates

Anran Liu

PhD, since 2019 (HKPFS), co-supervised with Prof. Wenping Wang

Low-Level Vision, Deep Learning

Chaofan Tao

PhD, since 2020. webpage Co-supervised with Prof. Ngai Wong

Model Compression and Acceleration, Hardware-efficient AI

Chengyue Wu

PhD (HKPFS), 2023-, webpage

Multimodality

Chonghao Si Ma

PhD, 2023-

Autonomous Driving

Chongjian GE

PhD, since 2020 (HKPFS). webpage

Object Detection, Visual Question Answering, Deep Learning

Fanqing Meng

PhD, 2023-, Shanghai AI Lab Joint PhD Program

Text-to-Image, LLM

Haibao Yu

PhD, since 2022. webpage

V2X, Autonomous Driving, Computer Vision, Efficient AI

Jiahao Wang

PhD, 2023-, webpage

Fast Neural Architecture Design

Jiannan Wu

PhD, since 2020 (HKPFS). webpage

Math Exercise Representation, Visual Question Answering, Deep Learning

Jin Wang

PhD, 2023-, webpage

Deepfake Detection, Explainable AI

Li Chen

PhD, 2023-, webpage

Autonomous Driving

Mengkang Hu

PhD, 2023-, webpage

NLP, Multimodality, Robotics Learning

Peize Sun

PhD, since 2020 (HKPFS). webpage

Computer Vision

Peng Xu

PhD, since 2021 (HKU-SUSTech Joint PhD Programme). Co-supervised with Prof. Fengwei An

Computer Vision, Edge Computing

Qiushan Guo

PhD, since 2020. Co-supervised with Prof. Yizhou Yu

Knowledge Distillation, Object Detection, Deep Learning

Runjian Chen

PhD, since 2021 (HKPFS). webpage

Representation Learning, Deep Learning, Autonomous Driving, 3D Computer Vision

Sheng Jin

PhD, since 2020 (HKPFS). webpage

Human Pose Estimation, Deep Learning

Shilong Zhang

PhD, 2023-, webpage

Computer Vision

Shoufa Chen

PhD, since 2021 (HKPFS). webpage

Video Understanding, Deep Learning

Teng Wang

PhD, since 2020 (HKU-SUSTech Joint PhD Programme). Co-supervised with Prof. Feng Zheng

Neural Architecture Search, Deep Learning

Tianqi Wang

PhD, since 2020 (HKU-PS). webpage

Autonomous Driving, 3D Object Detection

Yao Lai

PhD, since 2021 (HKPFS). webpage

AI Security, Electronic Design Automation, High Performance Computing

Yao Mu

PhD, since 2021 (HKPFS). webpage

Unsupervised Representation Learning, Reinforcement Learning

Yizhuo Li

PhD, since 2022. webpage

Video Understanding, Self-supervised Learning

Yuanfeng Ji

PhD, since 2020. webpage

Medical Image Analysis, Deep Learning

Yue Yang

PhD, 2022-, Shanghai AI Lab Joint PhD Program

Text-to-Image, LLM

Yuheng Lei

PhD (HKPFS), 2023-, webpage

Embodied AI, Reinforcement Learning, Robotics, Autonomous Driving

Zeyue Xue

PhD, since 2022.

Large-scale Deep Learning, Computer Vision

Zhanglin Peng

PhD, since 2020 (University Fellowship UPF). webpage Co-supervised with Prof. Wenping Wang

Normalization Methods, Image Recognition, Object Detection and Semantic Segmentation, Image Demosaicing and Denoising, Deep Learning

Zhixuan Liang

PhD, since 2022 (HKPFS). webpage

Active Learning and Incremental Learning, Open World Detection, Autonomous Driving

Alumni

Enze Xie

PhD, 2019-2022. webpage

Instance-level Detection and Segmentation, Text Understanding, Deep Learning

Jiaming Xie

PhD, 2017-2023, co-supervised with Prof. Wenping Wang

Medical Image, VR/AR

Mingyu Ding

PhD, 2019-2023. webpage

3D Vision, Autonomous Driving, Deep Learning

Nenglun Chen

PhD, 2017-2023. webpage Co-supervised with Prof. Wenping Wang

Geometric Deep Learning, Multimodal Learning

Qiang Zhai

Visitor, 2021-2022. webpage

Autonomous Driving, Robotics

Wenhai Wang

RA, 2019-2020. webpage

Text Understanding, Instance-level Detection and Segmentation, Deep Learning

Wenqi Shao

PhD, since 2018. webpage Co-supervised with Prof. Xiaogang Wang

Normalization Methods, Efficient Neural Nets, Deep Learning

Xingang Pan

PhD, 2017-2021. webpage Co-supervised with Prof. Xiaoou Tang

Generative Models, Deep Learning

Yangyang Xu

Postdoc Fellow, 2021-2023. webpage

Generative Models, Image Editing, Transfer Learning

Yutao Hu

Postdoc Fellow, 2022-2023. webpage

AI for Healthcare, Computer Vision

Yuying Ge

PhD, 2019-2023. webpage

Fashion AI, Deep Learning

Zhaoyang Zhang

PhD, 2019-2023. webpage Co-supervised with Prof. Xiaogang Wang

Efficient Algorithm Design, Optimization, Computer Vision

Zhouxia Wang

PhD, 2020-2023. webpage Co-supervised with Prof. Wenping Wang

Exposure Bracketing Selection, Multi-exposure Fusion and Image Denoising, Image Recognition and Object Detection, Deep Learning

GenTron: Diffusion Transformers for Image and Video Generation

MotionCtrl: A Unified and Flexible Motion Controller for Video Generation

VDT: General-purpose Video Diffusion Transformers via Mask Modeling

PIXART-α: Fast Training of Diffusion Transformer for Photorealistic Text-to-Image Synthesis

PIXART-Σ: Weak-to-Strong Training of Diffusion Transformer for 4K Text-to-Image Generation

OmniQuant: Omnidirectionally Calibrated Quantization for Large Language Models

Multi-Modality Arena: an evaluation platform for large multi-modality models

EmbodiedGPT: Vision-Language Pre-Training via Embodied Chain of Thought

RegionGPT: Towards Region Understanding Vision Language Model

RoboCodeX: Multimodal Code Generation for Robotic Behavior Synthesis

FlashFace: Human Image Personalization with High-fidelity Identity Preservation
