要人心之自由,胸襟开放。要拿全世界人类曾经走过的路,都要算是我走过的路之一; 要有一个远见,能超越你未见。要想办法设想,我没见到的地方,那个世界还有可能什么样。 —— 许倬云
Keep your heart free and your mind open. Treat every path that humanity has ever walked as one of the paths you have walked yourself. Have the foresight to look beyond what you have already seen, and try to imagine what the world might be like in the places you have not seen. – Prof. Xu Zhuoyun
About me
I am currently a fourth-year Ph.D. candidate in the School of Data Science at The Chinese University of Hong Kong, Shenzhen (CUHKSZ), advised by Assistant Prof. Baoxiang Wang and Prof. Hongyuan Zha. My research focuses on multi-agent reinforcement learning, LLM post-training, multi-agent systems, and social welfare. My Ph.D. topic is “How to maintain cooperation among multiple AI agents in an autonomous AI decision-making society”. For more details, please refer to my Google Scholar profile.
Before that, I received my M.Eng. and B.Eng. degrees in Automotive Engineering from the School of Transportation Science and Engineering, Beihang University, in 2021 and 2018 respectively, advised by Associate Prof. Zhaoxia Peng. In 2022, I worked as a research assistant with Prof. Junge Zhang at the Institute of Automation, Chinese Academy of Sciences (CASIA). I have also been fortunate to collaborate with Assistant Prof. Wenhao Li at Tongji University, Dr. Binbin Chen at ByteDance Inc., and Dr. Lei Song and Dr. Jiang Bian at Microsoft Research Asia. I am grateful to all my collaborators and mentors for their passion for research and their selfless support throughout my academic journey.
Education
- 2022 - Present, Ph.D. Candidate - Computer Science, The Chinese University of Hong Kong, Shenzhen, China
- 2018 - 2021, M.Eng. - Automotive Engineering, Beihang University, China
- 2014 - 2018, B.Eng. - Automotive Engineering, Beihang University, China
Research Interests
My research interests include:
- Multi-agent Reinforcement Learning
- Sequential Social Dilemmas
- Diffusion Models
- Large Language Models & Agents
- Learning Mechanism Design for Social Welfare
Pre-prints
D. Qiao, W. Li, S. Yang, H. Zha, B. Wang*, "Offline Multi-Agent Reinforcement Learning via Sequential Score Decomposition". Preprint arXiv:2505.05968, 2025.
D. Qiao, J. Zhang*, Y. Zhang, S. Xiao, H. Chen, "Privacy-preserved Fully Decentralized Multi-agent Reinforcement Learning for Networked Social Systems". Chinese Patent BZ-PT235864-P2023xxxx, 2022.
D. Qiao, Z. Peng*, G. Wen, T. Huang, "Novel Saturated Nussbaum-type Function based Adaptive Distributed Consensus Control of Multi-agent Systems with Unknown Arbitrary Control Directions". Preprint arXiv:2201.09453, 2022. [pdf]
Publications
Z. Li†, D. Qiao†, A. Rahman, Y. Du, S. Leonardos*, S. V. Albrecht, "STAR-MARL: LLM-based Sub-task Curricula Design". LAMAS Workshop at AAAI-2026, 2025.
W. Li, D. Qiao, B. Wang, X. Wang, B. Jin, H. Zha*, "Multi-Agent Credit Assignment with Pretrained Language Models". AISTATS, 2025. [pdf]
S. Yang, Y. Hua, D. Qiao, Y. Lian, Y. Pan*, Y. He, "A coupled electrochemical-thermal-mechanical degradation modelling approach for lifetime assessment of lithium-ion batteries". Electrochimica Acta, Vol. 326, Dec. 2019, 134928. [pdf]
Misc
Feel free to follow me on Zhihu and Bilibili.
Contact
Office: Floor 4, Zhixin Building, CUHKSZ, Shenzhen, 518172
