Zhaolin Gao
I’m a final-year Computer Science Ph.D. student at Cornell University, working with Thorsten Joachims and Wen Sun. My research is supported by a LinkedIn fellowship and focuses on reinforcement learning (RL) and its applications in LLM post-training. My work has been published at NeurIPS, ICLR, CVPR, WWW, SIGIR, CIKM, RecSys, and INFOCOM.
I completed my bachelor’s degree in Computer Engineering at the University of Toronto, where I had the privilege of working with Baochun Li, Scott Sanner, and Maksims Volkovs.
I am also a part-time content creator with more than 50,000 followers and 10 million views on Bilibili, Douyin, and YouTube.
News
| Oct 1, 2025 | Our paper “Prompt Curriculum Learning for Efficient LLM Post-Training” is now on arXiv. |
|---|---|
| Sep 18, 2025 | A*-PO, Q#, VGS, and LLM-HMM are accepted to NeurIPS 2025! |
| Sep 3, 2025 | Q# and A*-PO are accepted (poster and oral) to the New York Reinforcement Learning Workshop 2025. |
| Aug 6, 2025 | LangPTune is accepted at CIKM 2025! |
| Jun 11, 2025 | Our paper “Pre-trained Large Language Models Learn Hidden Markov Models In-context” is now on arXiv. |
| May 27, 2025 | Our paper “Accelerating RL for LLM Reasoning with Optimal Advantage Regression” is now on arXiv. |
| May 23, 2025 | Our paper “Value-Guided Search for Efficient Chain-of-Thought Reasoning” is now on arXiv. |
| Mar 1, 2025 | Our paper “Q#: Provably Optimal Distributional RL for LLM Post-Training” is now on arXiv. |