2026

Reward-free Alignment for Conflicting Objectives

Peter Chen, Xiaopeng Li, Xi Chen, Tianyi Lin

ICML 2026, under review

2025

ComPO: Preference Alignment via Comparison Oracles

Peter Chen, Xi Chen, Wotao Yin, Tianyi Lin

Advances in Neural Information Processing Systems 38 (NeurIPS 2025)

Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward

Peter Chen, Xiaopeng Li, Ziniu Li, Wotao Yin, Xi Chen, Tianyi Lin

Proceedings of the International Conference on Learning Representations (ICLR 2026)

GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators

Jiacheng Guo$^*$, Ling Yang$^*$, Peter Chen$^*$, Qixin Xiao$^*$, Yinjie Wang, Xinzhe Juan, Jiahao Qiu, Ke Shen, Mengdi Wang

arXiv:2512.19682

3D Cell Oversegmentation Correction via Geo-Wasserstein Divergence

Peter Chen, Bryan Chang, Olivia Annette Creasey, Julie Beth Sneddon, Zev Gartner, Yining Liu

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2026)

Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO

Peter Chen, Xiaopeng Li, Ziniu Li, Xi Chen, Tianyi Lin

ICLR 2026, under review

Displacement-Sparse Neural Optimal Transport

Peter Chen, Yue Xie, Qingpeng Zhang

arXiv:2502.01889

2024

SICNN: Sparsity-induced Input Convex Neural Network

Peter Chen, Yue Xie, Qingpeng Zhang

NeurIPS 2024 Optimization for Machine Learning
