Shange Tang

 

I am a fifth-year Ph.D. student in the Department of Operations Research and Financial Engineering at Princeton University under the supervision of Professor Jianqing Fan and Professor Chi Jin. Before coming to Princeton, I received my bachelor's degree from the School of Mathematical Sciences at Peking University in 2021.
I am interested in the theory and applications of statistics and machine learning.
E-mail: shangetang [@] princeton [DOT] edu
Google Scholar

Research

My recent research interests include

  • Automated theorem proving with LLMs

  • Out-of-Distribution (OOD) generalization

  • Factor models

Selected Works

  1. Yong Lin, Shange Tang, Bohan Lyu, Ziran Yang, Jui-Hui Chung, Haoyu Zhao, Lai Jiang, Yihan Geng, Jiawei Ge, Jingruo Sun, Jiayun Wu, Jiri Gesi, Ximing Lu, David Acuna, Kaiyu Yang, Hongzhou Lin, Yejin Choi, Danqi Chen, Sanjeev Arora, Chi Jin. “Goedel-Prover-V2: Scaling Formal Theorem Proving with Scaffolded Data Synthesis and Self-Correction.” arXiv preprint arXiv:2508.03613 (2025). [arXiv]

  2. Yong Lin*, Shange Tang*, Bohan Lyu, Jiayun Wu, Hongzhou Lin, Kaiyu Yang, Jia Li, Mengzhou Xia, Danqi Chen, Sanjeev Arora, Chi Jin. “Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving.” Conference on Language Modeling (COLM) 2025. [arXiv]

  3. Shange Tang*, Jiayun Wu*, Jianqing Fan, Chi Jin. “Benign Overfitting in Out-of-Distribution Generalization of Linear Models.” International Conference on Learning Representations (ICLR) 2025. [arXiv]

  4. Shange Tang, Soham Jana, Jianqing Fan. “Factor Adjusted Spectral Clustering for Mixture Models.” arXiv preprint arXiv:2408.12564 (2024). [arXiv]

  5. Jiawei Ge*, Shange Tang*, Jianqing Fan, Cong Ma, Chi Jin. “Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift.” International Conference on Learning Representations (ICLR) 2024. [arXiv]

  6. Jiawei Ge*, Shange Tang*, Jianqing Fan, Chi Jin. “On the Provable Advantage of Unsupervised Pretraining.” International Conference on Learning Representations (ICLR) 2024, spotlight. [arXiv]

* denotes equal contribution.

Invited Talks

  1. “Goedel-Prover-V2: State-of-the-art performance in Automated Mathematical Theorem Proving”, Centaur AI Institute, Neural-Symbolic AI Summer School, Aug 2025. [YouTube]

  2. “Goedel-Prover: A Frontier Model for Open-Source Automated Theorem Proving”, TTIC Machine Learning Seminar, Mar 2025.

  3. “Benign Overfitting in Out-of-Distribution Generalization of Linear Models”, Simons Institute for the Theory of Computing, Domain Adaptation and Related Areas workshop poster session, Nov 2024.

  4. “Maximum Likelihood Estimation is All You Need for Well-Specified Covariate Shift”, UIUC Machine Learning Seminar, Mar 2024.


A brief CV.