I build architectures that learn structure, with work spanning distributed systems, graph neural networks, molecular dynamics, and large language models. The thread connecting my work is a fascination with how computation discovers patterns, whether in proteins or in language.
Highlights along the way: the Pre-LN Transformer architecture adopted by the GPT series, Graphormer for molecular property prediction (NeurIPS 2021, KDD Cup Champion), and BioEmu for biomolecular simulations (Science cover, 2025). 7,000+ citations.
Today I split my time between foundational LLM research and translating AI-for-Science breakthroughs into real-world impact at the Zhongguancun Institute of AI in Beijing.