2025

ComPO: Preference Alignment via Comparison Oracles

Peter Chen, Xi Chen, Wotao Yin, Tianyi Lin

Advances in Neural Information Processing Systems 38 (NeurIPS 2025)

Exploration v.s. Exploitation: Rethinking RLVR through Clipping, Entropy, and Spurious Reward

Peter Chen, Xiaopeng Li, Ziniu Li, Wotao Yin, Xi Chen, Tianyi Lin

ICLR 2026, under review

GenEnv: Difficulty-Aligned Co-Evolution Between LLM Agents and Environment Simulators

Jiacheng Guo$^*$, Ling Yang$^*$, Peter Chen$^*$, Qixin Xiao$^*$, Yinjie Wang, Xinzhe Juan, Jiahao Qiu, Ke Shen, Mengdi Wang

arXiv:2512.19682

3D Cell Oversegmentation Correction via Geo-Wasserstein Divergence

Peter Chen, Bryan Chang, Olivia Annette Creasey, Julie Beth Sneddon, Zev Gartner, Yining Liu

Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV 2026)

Stepwise Guided Policy Optimization: Coloring your Incorrect Reasoning in GRPO

Peter Chen, Xiaopeng Li, Ziniu Li, Xi Chen, Tianyi Lin

ICLR 2026, under review

Displacement-Sparse Neural Optimal Transport

Peter Chen, Yue Xie, Qingpeng Zhang

arXiv:2502.01889

2024

SICNN: Sparsity-induced Input Convex Neural Network

Peter Chen, Yue Xie, Qingpeng Zhang

NeurIPS 2024 Optimization for Machine Learning
