Zhaolin Gao
I’m a third-year Computer Science Ph.D. student at Cornell University, where I am advised by Thorsten Joachims and Wen Sun, and a part-time Researcher at Meta Superintelligence. My research focuses on reinforcement learning (RL) and its applications in LLM post-training. My work has been published at NeurIPS, ICLR, CVPR, WWW, SIGIR, CIKM, RecSys, and INFOCOM.
I completed my bachelor’s degree in Computer Engineering at the University of Toronto, where I had the privilege of working with Baochun Li, Scott Sanner, and Maksims Volkovs.
I am also a part-time content creator with more than 50,000 followers and 10 million views on Bilibili, Douyin, and YouTube.
News
| Oct 1, 2025 | Our paper “Prompt Curriculum Learning for Efficient LLM Post-Training” is now on arXiv. |
|---|---|
| Sep 18, 2025 | A*-PO, Q#, VGS, and LLM-HMM are accepted to NeurIPS 2025! |
| Sep 3, 2025 | Q# and A*-PO are accepted (poster and oral) to the New York Reinforcement Learning Workshop 2025. |
| Aug 6, 2025 | LangPTune is accepted to CIKM 2025! |
| Jun 11, 2025 | Our paper “Pre-trained Large Language Models Learn Hidden Markov Models In-context” is now on arXiv. |
| May 27, 2025 | Our paper “Accelerating RL for LLM Reasoning with Optimal Advantage Regression” is now on arXiv. |
| May 23, 2025 | Our paper “Value-Guided Search for Efficient Chain-of-Thought Reasoning” is now on arXiv. |
| Mar 1, 2025 | Our paper “Q#: Provably Optimal Distributional RL for LLM Post-Training” is now on arXiv. |