Speaker
Dr. Hongtu Zhu, Kenan Distinguished Professor of Biostatistics, Statistics, Radiology, Computer Science and Genetics at the University of North Carolina at Chapel Hill
Moderator
Abstract
About the Speaker
Dr. Hongtu Zhu is the Kenan Distinguished Professor of Biostatistics, Statistics, Radiology, Computer Science and Genetics at the University of North Carolina at Chapel Hill. He was a DiDi Fellow and Chief Scientist of Statistics at DiDi Chuxing between 2018 and 2020 and held the Endowed Bao-Shan Jing Professorship in Diagnostic Imaging at MD Anderson Cancer Center between 2016 and 2018. He is an internationally recognized expert in statistical learning, medical image analysis, precision medicine, biostatistics, artificial intelligence, and big data analytics. He received an established investigator award from the Cancer Prevention Research Institute of Texas in 2016, the INFORMS Daniel H. Wagner Prize for Excellence in Operations Research Practice in 2019, the IMS 2027 Medallion Award and Lecture, and the COPSS 2025 Snedecor Award. He has published more than 345 papers in top journals, including Nature, Science, Cell, Nature Genetics, Nature Communications, PNAS, AOS, JASA, Biometrika, and JRSSB, and has presented 58+ papers at top conferences, including NeurIPS, ICLR, ICML, AAAI, and KDD. He is the coordinating editor of JASA and the editor of JASA ACS.
About the Moderator
Dr. Hongyuan Cao is an Associate Professor in the Department of Statistics at Florida State University. Dr. Cao earned her Ph.D. from the University of North Carolina–Chapel Hill in 2010. Her research focuses on statistical methods development with applications in social, medical and biological sciences. She has extensive experience in analyzing longitudinal data, survival data and omics data arising from various study designs.
About AI, StAtIstics and Data Science in Practice
The NISS AI, Statistics and Data Science in Practice series is a monthly event series that brings together leading experts from industry and academia to discuss the latest advances and practical applications in AI, data science, and statistics. Each session features a keynote presentation on a cutting-edge topic, and attendees can engage with speakers on the challenges and opportunities of applying these technologies in real-world scenarios. The series is intended for professionals, researchers, and students interested in the intersection of AI, data science, and statistics. It is designed to show participants how modern data analytic methods are being applied across various industries, offering theoretical insights, practical examples, and discussion of open issues.
During Spring 2026, from January through May 2026, the series will focus on large language models (LLMs) and the statistical and methodological foundations required to develop, evaluate, and deploy them responsibly and effectively. As LLMs become central to a wide range of scientific, industrial, and societal applications, careful attention to data generation, model training, evaluation, and inference is essential to ensure reliability, robustness, and transparency. In particular, rigorous attention to how training data are constructed, curated, and sampled is critical for understanding model behavior and limitations. The series will highlight methodological considerations in model training and fine-tuning, including sources of bias, variability, and uncertainty, as well as principled approaches to benchmarking and evaluation that move beyond surface-level performance metrics. Emphasis will be placed on transparent and reproducible evaluation frameworks that support meaningful comparisons across models and use cases, and on statistical perspectives that help clarify what LLM outputs do and do not represent. By grounding discussions of LLM development and deployment in sound statistical reasoning, the series aims to promote more reliable, interpretable, and trustworthy language models in practice.
Featured Topics:
- Veridical Data Science - Speaker: Bin Yu, October 15, 2024
- Random Forests: Why they Work and Why that’s a Problem - Speaker: Lucas Mentch, November 19, 2024
- Causal AI in Business Practices - Speakers: Victor Lo and Victor Chen, January 24, 2025
- Large Language Models: Transforming AI Architectures and Operational Paradigms - Speaker: Frank Wei, February 18, 2025
- Machine Learning for Airborne Biological Hazard Detection - Speaker: Jared Schuetter, March 11, 2025
- Trustworthy AI in Weather, Climate, and Coastal Oceanography - Speaker: Dr. Amy McGovern, May 13, 2025
- Sequential Causal Inference in Experimental or Observational Settings - Speaker: Aaditya Ramdas, August 26, 2025
- Covariate Adjustment, Intro to Resampling, and Surprises - Speaker: Tim Hesterberg, October 3, 2025
- Bayesian Geospatial Approaches for Prediction of Opioid Overdose Deaths Utilizing the Real-Time Urine Drug Test - Speaker: Joanne Kim, November 18, 2025
- COVID-19 Focused Cost-benefit Analysis of Public Health Emergency Preparedness and Crisis Response Programs - Speaker: Nancy McMillan, December 11, 2025
- LabOS: The AI-XR Co-Scientist That Reasons, Sees and Works With Humans - Speaker: Mengdi Wang, January 20, 2026
- From LLMs to World Foundation Models & Robotics: The Next Frontier of Artificial Intelligence - Speaker: Robert Clark, February 24, 2026
- Recent Advances in the Statistical Foundations of Large Language Models - Speaker: Weijie Su, March 17, 2026
- AI, Statistics & Data Science in Practice Webinar - Speaker: Anastasios N. Angelopoulos, April 17, 2026
- Causal Generalist Medical AI - Speaker: Hongtu Zhu, May 19, 2026
Event Type
- NISS Hosted
