Tech Expert at Beijing Xiaomi Mobile Software Co., Ltd, leading the Agent Research team. Previously Tech Expert and AI Agent Team Lead at Huawei London Research Centre, working with Prof. Jianye Hao and Prof. Jun Wang. Ph.D. from the Institute of Automation, Chinese Academy of Sciences (advisor: Prof. Dongbin Zhao).
Research focus: Agentic AI (harness, memory, reinforcement learning, and multi-agent systems), with production deployments on smartphones, automotive cockpits, IoT devices, smart glasses, and smartwatches. 50+ publications (Nature MI, ICLR, CVPR, NeurIPS, ICML), 3,000+ citations.
Research Highlights
2x Benchmark #1: AndroidWorld (GUI Agent, 2025) & GAIA (Deep Research Agent, June 2025)
Nature MI 2026: LLM-based robot OS framework for embodied AI
HarnessX: Open-source agent harness & self-evolution system at Xiaomi
2,000+ GitHub Stars on Deep Research agent toolkit
10+ Corresponding-Author Papers at ICLR / CVPR / NeurIPS / ICML (2024-2026)
10+ Patents across GUI Agent, Agentic RL, game AI, autonomous driving; 1 monograph
Work Experience
Beijing Xiaomi Mobile Software Co., Ltd. Feb. 2026 – Present
Tech Expert & Team Lead · Beijing
Leading the Agent Research team, driving agent architecture design and product deployment for Xiaomi's Xiaoai assistant across mobile, automotive, IoT, and wearable platforms.
(1) HarnessX — Compose. Adapt. Evolve. Open-sourced a composable agent orchestration framework. Nine orthogonal dimensions are independently configurable, replaceable, and optimizable, supporting meta-optimization and RL-based agent evolution.
(2) Agent Memory System — Led research and production deployment of agentic memory for Xiaoai. Built full-stack architecture (Multimodal Memory Encoding, Cross-Device Memory Persistence, Self-Evolving Lifecycle Management). Deployed on Xiaomi's intelligent automotive cockpit, smart glasses and other products.
(3) Agentic AI Research — Led multi-agent and agentic world-model research; explored sample-efficient test-time agentic RL.
Huawei London Research Centre July 2022 – Jan. 2026
Tech Expert & Team Lead · London
Led research on LLM/VLM-based AI Agents, covering agentic frameworks, reinforcement fine-tuning, planning & reasoning, tool-use, and memory.
(1) GUI Agent for Mobile Phone Control — Built Huawei's GUI Agent from scratch. Key innovations: scalable data-generation pipelines; lightweight action models (AppVLM, Lightweight Neural App Control — ICLR Spotlight 2025); agentic RL (DistRL — ICLR 2025); visual world models (ViMo — ICLR 2026); action semantics learning (CVPR 2026). Achieved 1st place on AndroidWorld. 10+ papers as corresponding author.
(2) Deep Research Agent — Pioneered general-purpose agentic systems for complex, multi-step tasks. Achieved 1st place on GAIA benchmark (June 2024). Open-source toolkit: 2,000+ GitHub stars.
(3) London–HQ Cooperation Owner — Owned strategic alignment between the London Research Centre and HQ on key landing projects.
Huawei Noah's Ark Lab July 2019 – July 2022
Principal Research Scientist · Beijing
(1) Population-based RL & Game AI — Diverse policy generation; production game AI and autonomous-driving simulation (SMARTS — Best System Paper, CoRL 2020).
(2) Multi-Agent RL — Credit assignment, knowledge transfer, opponent modeling (NeurIPS 2022, ICML 2020, ICRA 2022).
(3) Large-Scale Optimization — RL and MARL solutions for EDA and recommendation domains.
Education
Institute of Automation, Chinese Academy of Sciences (CASIA) Sept. 2014 – Jun. 2019
Ph.D. in Artificial Intelligence
Beijing Jiaotong University (BJTU) Sept. 2010 – Jul. 2014
B.Eng. in Automation
Selected Publications
Full list: Google Scholar · 50+ papers · 3,000+ citations
2026
- C.E. Mower, ..., K. Shao, et al. A robot operating system framework for using LLMs in embodied AI. Nature Machine Intelligence, 8: 313–325, 2026.
- D. Luo, B. Tang, ..., J. Hao, J. Wang, K. Shao. ViMo: A Generative Visual GUI World Model for App Agent. ICLR 2026. Corresponding Author
- B. Tang, D. Luo, ..., J. Hao, J. Wang, K. Shao. Beyond Syntax: Action Semantics Learning for App Agents. CVPR 2026. Corresponding Author
- Z. Wu, D. Mo, ..., K. Li, K. Shao, J. Hao. K²-Agent: Co-Evolving Know-What and Know-How for Hierarchical Mobile Device Control. ICLR 2026.
- Z. Zhou, Z. Liu, ..., K. Shao, D. Jin, F. Xu. ResMAS: Resilience Optimization in LLM-based Multi-agent Systems. AAAI 2026.
- Y. Li, L. Li, ..., J. Hao, K. Shao, F. Xu. AgentSwift: Efficient LLM Agent Design via Value-guided Hierarchical Search. AAAI 2026.
- H. Liang, J. Hao, ..., K. Shao, et al. AFE-Master: Enhancing LLM-Driven Autonomous Feature Engineering. WWW 2026.
2025
- F. Christianos, G. Papoudakis, T. Coste, J. Hao, J. Wang, K. Shao. Lightweight Neural App Control. ICLR 2025 (Spotlight). Corresponding Author
- J. Chen, D. Yuen, ..., K. Shao. SPA-Bench: A Comprehensive Benchmark for Smartphone Agent Evaluation. ICLR 2025 (Spotlight). Corresponding Author
- T. Wang, Z. Wu, J. Liu, J. Hao, J. Wang, K. Shao. DistRL: Asynchronous Distributed RL for On-Device Control Agents. ICLR 2025. Corresponding Author
- G. Papoudakis, T. Coste, J. Hao, J. Wang, K. Shao. Succeed or Learn Slowly: Sample Efficient Off-Policy RL for Mobile App Control. NeurIPS 2025. Corresponding Author
- J. Hu, Z. Cheng, S. Gong, ..., J. Hao, J. Wang, K. Shao. Uncertainty-quantified Rollout Policy Adaptation for Cross-domain Temporal Grounding. NeurIPS 2025. Corresponding Author
- S. Huang, L. Yang, ..., K. Shao, et al. ThinkBench: Dynamic Out-of-Distribution Evaluation for Robust LLM Reasoning. NeurIPS 2025.
- G. Li, N. Tsagkas, ..., K. Shao, L. Sevilla-Lara. Learning Precise Affordances from Egocentric Videos for Robotic Manipulation. ICCV 2025. Corresponding Author
- T. Jafferjee, J. Ziomek, ..., K. Shao, J. Wang. Taming Multi-Agent RL with Estimator Variance Reduction. AAMAS 2025.
2024
- Z. Xiong, R. Vuorio, J. Beck, M. Zimmer, K. Shao, S. Whiteson. Distilling Morphology-Conditioned Hypernetworks for Efficient Universal Morphology Control. ICML 2024.
- H. Li, W. Huang, ..., K. Shao, J. Wang, X. Deng. A Survey on Algorithms for Nash Equilibria in Finite Normal-Form Games. Computer Science Review, vol. 51, 2024.
2023
- X. Feng, Y. Luo, ..., K. Shao, D. Mguni, Y. Du, J. Wang. ChessGPT: Bridging Policy Learning and Language Modeling. NeurIPS 2023.
- H. Chen, J. Wang, K. Shao, et al. Traj-MAE: Masked Autoencoders for Trajectory Prediction. ICCV 2023.
- D. Mguni, A. Sootla, ..., K. Shao, J. Wang. Timing is Everything: Learning to Act Selectively with Costly Actions. ICLR 2023.
- D. Mguni, T. Jafferjee, ..., M. Taylor, K. Shao. Learning to Shape Rewards Using a Game of Two Partners. AAAI 2023.
- T. Zhou, F. Zhang, K. Shao, et al. Cooperative Multiagent Transfer Learning with Coalition Pattern Decomposition. IEEE Trans. on Games, 2023.
2022
- W. Huang, K. Li, K. Shao, et al. Multiagent Q-Learning with Sub-Team Coordination. NeurIPS 2022.
- J. Miao, T. Zhou, K. Shao, et al. Promoting Quality and Diversity in Population-Based RL. ICRA 2022.
- Z. Dai, T. Zhou, K. Shao, D.H. Mguni. Socially-Attentive Policy Optimization in Multi-Agent Self-Driving. CoRL 2022.
Book & Patents
Book: Artificial Intelligence Methods in Games — Dongbin Zhao, Yuanheng Zhu, Zhentao Tang, Kun Shao. Science Press, 2024.
Patents (10+ granted/applied): GUI World Model for App Agent (2025); Hierarchical RL for AppAgent (2025); Nash Equilibrium via Conditional Gradient (2024); Game AI Framework on ModelArts (2021); Diverse Scenarios for Intelligent Driving (2020).
Invited Talks
- The 20th Chinese Conference on Machine Learning (CCML), Taiyuan, 2025
- Beijing Institute for General Artificial Intelligence, Beijing, 2025
- National University of Singapore (NUS), Singapore, 2025
- The 1st International Workshop on AI Agent Reasoning and Decision-Making, online, 2025
- Robotics and AI, Department of Computer Science, UCL, London, 2025
- Shanghai AI Lab, Shanghai, 2025
- Department of Electronic Engineering, Tsinghua University, Beijing, 2024
- RLChina, Guangzhou, 2024
Honors & Awards
- Team Gold Medal Award of Huawei 2012 Lab Dec. 2025
- Excellent Contributor of Huawei European Research Institute Nov. 2025
- President's Award — Significant Business Contribution of Huawei Europe Sep. 2024
- 2nd Prize, Beijing Science and Technology Progress Award Oct. 2023
- Outstanding Individual Contribution Award of Huawei LRC Dec. 2023
- President's Team Award of Huawei CRI Dec. 2021
- Commended Student Award of CASIA Jul. 2018
- Outstanding Paper Award — IEEE TETCI 2018
- Outstanding Paper Award — Control Theory & Applications 2016
