Tuesday, March 17, 2026 - 12:00pm to 1:30pm ET

Speaker

Weijie Su, Associate Professor, Wharton Statistics and Data Science Department and, by courtesy, Department of Biostatistics, Epidemiology, and Informatics, University of Pennsylvania

Moderator

Whitney Huang, Associate Professor of Statistics at Clemson University

Abstract

Title: Recent Advances in the Statistical Foundations of Large Language Models

Abstract: In this talk, we advocate for the development of rigorous statistical foundations for large language models (LLMs). We begin by elaborating two key features that motivate statistical perspectives for LLMs: (1) the probabilistic, autoregressive nature of next-token prediction, and (2) the complexity and black box nature of Transformer architectures. To illustrate how statistical insights can directly benefit LLM development and applications, we present two concrete examples. First, we introduce a novel statistical framework to analyze the efficiency of watermarking schemes, with a focus on a watermarking scheme developed by OpenAI for which we derive optimal detection rules that outperform existing ones. Second, we demonstrate statistical inconsistencies and biases arising from the current approach to aligning LLMs with human preference. We propose a regularization term for aligning LLMs that is both necessary and sufficient to ensure consistent alignment. Collectively, these findings showcase how statistical insights can address pressing challenges in LLMs while simultaneously illuminating new research avenues for the broader statistical community to advance responsible generative AI research. This talk is based on arXiv:2404.01245, 2405.16455, 2503.10990, and 2510.22007.

About the Speaker

Weijie Su is an Associate Professor in the Wharton Statistics and Data Science Department and, by courtesy, in the Department of Biostatistics, Epidemiology, and Informatics at the University of Pennsylvania. He serves as a Co-Director of the Penn Research in Machine Learning Center. He received his Ph.D. in Statistics from Stanford University in 2016 and his Bachelor's degree in Mathematics from Peking University in 2011. His research interests span the statistical foundations of generative AI, high-dimensional statistics, privacy-preserving machine learning, and optimization. He is a founding Co-Editor of the journal Statistical Learning and Data Science and serves as an Associate Editor for JASA, AOAS, OPRE, JMLR, FnT in Statistics, and Harvard Data Science Review. He currently serves on the Organizing Committee of ICML 2026 as Scientific Integrity Chair, where his isotonic mechanism will be deployed to enhance peer review. His work has been recognized with many honors, including the Stanford Theodore Anderson Dissertation Award, NSF CAREER Award, Sloan Research Fellowship, IMS Peter Hall Prize, SIAM Early Career Prize in Data Science, ASA Noether Early Career Award, ICBS Frontiers of Science Award in Mathematics, and IMS Medallion Lectureship. He has authored two discussion papers in JRSSB and JASA and is a Fellow of the IMS.

About the Moderator

Whitney Huang is an Associate Professor of Statistics at Clemson University, where he has served since August 2019. Prior to joining Clemson, he was a Canadian Statistical Sciences Institute (CANSSI) and Statistical and Applied Mathematical Sciences Institute (SAMSI) postdoctoral fellow at the University of Victoria (UVic), affiliated with the Pacific Climate Impacts Consortium and the School of Earth and Ocean Sciences, working with Dr. Francis Zwiers and Prof. Adam Monahan. Before his time at UVic, he held a SAMSI/University of North Carolina postdoctoral position under the supervision of Prof. Richard Smith. He received his Ph.D. in Statistics from Purdue University in August 2017, advised by Prof. Hao Zhang. During his doctoral studies, he was actively involved in the Research Network for Statistical Methods for Atmospheric and Oceanic Sciences (STATMOS) and the Center for Robust Decision Making on Climate and Energy Policy (RDCEP), collaborating with Michael Stein and Elisabeth Moyer at the University of Chicago and Doug Nychka at the National Center for Atmospheric Research. Before pursuing his doctorate at Purdue, he earned a Master’s degree in Statistics from the University of Akron and a Bachelor’s degree in Mechanical Engineering from National Cheng Kung University in Taiwan. His research interests include statistics of extremes, spatio-temporal statistics, surrogate modeling for computer experiments, time-frequency analysis, multiscale statistical modeling, spatial point processes, environmental applications, and high-frequency physiological data analysis.

About AI, Statistics and Data Science in Practice

The NISS AI, Statistics and Data Science in Practice series is a monthly event series that brings together leading experts from industry and academia to discuss the latest advances and practical applications in AI, data science, and statistics. Each session features a keynote presentation on a cutting-edge topic, and attendees can engage with speakers on the challenges and opportunities of applying these technologies in real-world scenarios. The series is intended for professionals, researchers, and students interested in the intersection of AI, data science, and statistics, and offers insights into how these fields are shaping various industries. It is designed to give participants exposure to and an understanding of how modern data analytic methods are applied across industries, combining theoretical insights, practical examples, and discussion of issues.

During Spring 2026 (January through May), the series will focus on large language models (LLMs) and the statistical and methodological foundations required to develop, evaluate, and deploy them responsibly and effectively. As LLMs become increasingly central to scientific research, industry workflows, and societal decision-making, careful attention to data generation, model training, evaluation, and inference is essential to ensure reliability, robustness, and transparency. In particular, rigorous attention to how training data are constructed, curated, and sampled is critical for understanding model behavior and limitations. The series will highlight methodological considerations in model training and fine-tuning, including sources of bias, variability, and uncertainty, as well as principled approaches to benchmarking and evaluation that move beyond surface-level performance metrics. Emphasis will be placed on transparent and reproducible evaluation frameworks that support meaningful comparisons across models and use cases, and on statistical perspectives that help clarify what LLM outputs do and do not represent. By grounding discussions of LLM development and deployment in sound statistical reasoning, the series aims to promote more reliable, interpretable, and trustworthy language models in practice.


NISS AI, Statistics & Data Science in Practice Webinar: Recent Advances in the Statistical Foundations of Large Language Models
