Papers and Preprints

Layerwise Change of Knowledge in Neural Networks [PDF]
Xu Cheng*, Lei Cheng*, Zhaoran Peng, Yang Xu, Tian Han, Quanshi Zhang.
Proceedings of the 41st International Conference on Machine Learning (ICML), PMLR 235:8038-8059, 2024.
Towards the Dynamics of a DNN Learning Symbolic Interactions [PDF]
Qihan Ren*, Junpeng Zhang*, Yang Xu, Yue Xin, Dongrui Liu, Quanshi Zhang.
Advances in Neural Information Processing Systems (NeurIPS), 2024.
Tracking the Feature Dynamics in LLM Training: A Mechanistic Study [PDF]
Yang Xu, Yi Wang, Hengguan Huang, Hao Wang.
arXiv preprint arXiv:2412.17626, 2024.
Simulating and Understanding Deceptive Behaviors in Long-Horizon Interactions [PDF]
Yang Xu*, Xuanming Zhang*, Samuel Yeh, Jwala Dhamala, Ousmane Dia, Rahul Gupta, Sharon Li.
arXiv preprint arXiv:2510.03999, 2025. Accepted at ICLR 2026.
You can find my articles on my Google Scholar profile.