Lau Luk Peter, Chen
Department of Mathematics, Columbia University

I work on theoretically grounded LLM post-training, using optimization theory and reinforcement learning to advance LLM reasoning and alignment. Apart from that, I enjoy working on optimal transport and geometry processing.

I am an undergraduate student at Columbia, advised by Prof. Andrew Blumberg and Prof. Tianyi Lin. I am now working closely with Prof. Mengdi Wang and her group at Princeton.


Education
  • Columbia College, Columbia University
    B.A. in Mathematics, Computer Science
    May 2026
Experience
  • Princeton University
    Research Intern; Hosted by Mengdi Wang
    Feb. 2025
  • HKU Musketeers Foundation Institute of Data Science
    Research Intern; Hosted by Yue Xie, Qingpeng Zhang
    May 2024
Teaching & Service
  • TA for Analysis & Optimization (Sp 24/Fa 24/Sp 25/Fa 25/Sp 26)
  • Reviewer for NeurIPS, ICLR, ICML, AAAI
Selected Publications
ComPO: Preference Alignment via Comparison Oracles

Peter Chen, Xi Chen, Wotao Yin, Tianyi Lin

Advances in Neural Information Processing Systems 38 (NeurIPS 2025)

Beyond Contamination: Rethinking RLVR Learning Dynamics through Clipping, Entropy, and Random Rewards

Peter Chen, Xiaopeng Li, Ziniu Li, Wotao Yin, Xi Chen, Tianyi Lin

ICLR 2026, under review

Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO

Peter Chen, Xiaopeng Li, Ziniu Li, Xi Chen, Tianyi Lin

ICLR 2026, under review
