Zhaolin Gao
I’m an incoming Member of Technical Staff at Microsoft Superintelligence.
My research includes reinforcement learning (RL) and its applications in LLM post-training. I received my Ph.D. from Cornell, where I was fortunate to work with Thorsten Joachims and Wen Sun. My Ph.D. was supported by a LinkedIn fellowship.
I completed my bachelor’s degree at the University of Toronto, working with Baochun Li, Scott Sanner, and Maksims Volkovs.
I am also a part-time content creator with more than 50,000 followers and 12 million views on Bilibili and Douyin.
News
| Date | Update |
|---|---|
| Apr 13, 2026 | Our paper “p1: Better Prompt Optimization with Fewer Prompts” is now on arXiv. |
| Jan 26, 2026 | PCL is accepted to ICLR 2026! |
| Oct 1, 2025 | Our paper “Prompt Curriculum Learning for Efficient LLM Post-Training” is now on arXiv. |
| Sep 18, 2025 | A*-PO, Q#, VGS, and LLM-HMM are accepted to NeurIPS 2025! |
| Sep 3, 2025 | Q# and A*-PO are accepted (poster and oral) to the New York Reinforcement Learning Workshop 2025. |
| Aug 6, 2025 | LangPTune is accepted at CIKM 2025! |
| Jun 11, 2025 | Our paper “Pre-trained Large Language Models Learn Hidden Markov Models In-context” is now on arXiv. |
| May 27, 2025 | Our paper “Accelerating RL for LLM Reasoning with Optimal Advantage Regression” is now on arXiv. |