
(Recording Coming Soon!) Date: Tuesday, February 17, 2026 - 12:00pm to 1:00pm ET
This webinar focused on analytic design theory, a framework for evaluating and improving data analyses developed by this session's panelists: Dr. Roger Peng (Professor, Department of Statistics and Data Sciences, University of Texas at Austin) and Stephanie C. Hicks, PhD, MA (Associate Professor, Biomedical Engineering and Biostatistics, Johns Hopkins University). They discussed the origins of the theory, which emerged from the challenges of teaching data analysis in large online courses and the need for a formal framework for assessing analyses. Our speakers explored six design principles that characterize variation in data analyses, including data matching, reproducibility, and clarity, and how these principles interact with resource allocation. They also touched on the role of artificial intelligence in potentially transforming data analysis education and the importance of capturing the iterative process of analysis. The discussion concluded with thoughts on future directions for analytic design theory, including questions about the completeness of analyses and building trust in data findings. This session was led by our Moderator: Lucy D'Agostino McGowan (Associate Professor, Department of Statistical Sciences, Wake Forest University).
Analytic Design Theory Development
Our speakers highlighted analytic design theory, focusing on its development and key principles. Roger explained how the theory emerged from the challenges of grading massive open online courses and teaching advanced data science, noting the lack of formal frameworks for evaluating data analyses and the overemphasis on programming skill as a surrogate for data analysis skill. Stephanie emphasized the importance of design thinking in data analysis, comparing it to apprenticeship models and introducing axes of variation for analyses, such as the balance between pursuing main questions and exploratory rabbit holes. They also discussed the need for a tangible example to illustrate how analysts and consumers might interpret the same results differently, underscoring the human element in data analysis. Lucy concluded by asking what drove the development of the initial design principles.
Data Analysis Axes and Reproducibility
Stephanie and Roger discussed six fundamental axes of variation in data analysis, which they had developed over several years. They explained that these axes are not necessarily indicators of good or bad analysis, but rather describe different types of analyses. Stephanie described data matching, where analysts may use surrogate data when unable to measure what they truly want to analyze. Roger discussed reproducibility, noting that the effort spent on making an analysis reproducible can be traded off against other factors depending on the analysis's consequences. They also touched on clarity, emphasizing the importance of being able to illustrate the core features of an analysis in a simple and understandable way.
Practical Reproducibility in Data Analysis
Lucy and Stephanie discussed the challenges of teaching reproducibility in data analysis, highlighting the need to consider practical constraints such as time, monetary costs, and data hosting. They emphasized the importance of helping students and practitioners understand when full reproducibility is necessary and when simpler methods are adequate. Roger added that design principles in statistics often involve making trade-offs due to finite resources, encouraging a focus on practical solutions rather than idealized approaches.
AI in Data Analysis Education
The panel discussed data analysis education and the role of AI in teaching statistics. Lucy shared her experience teaching a capstone class where students learn data analysis principles through client projects. Stephanie described her work on the Open Case Studies project, which provides high-quality data analyses with contextual information for students. The group explored how AI and large language models could potentially help teach data analysis more efficiently, though challenges remain in ensuring accuracy and understanding. They also discussed the need to better formalize and communicate the iterative process of data analysis. The panelists agreed that future work in analytic design theory could focus on defining what makes a complete analysis, improving teaching methods, and developing ways to increase trust in data analyses.
Thanks and Acknowledgement
We extend our sincere gratitude to our distinguished panelists—Dr. Roger Peng and Dr. Stephanie C. Hicks—for sharing their expertise and insights on analytic design theory and its growing role in statistical education and practice. Their thoughtful discussion provided valuable perspectives on the development of the framework, its guiding design principles, and its future applications in teaching and evaluating data analysis.
We also thank our moderator, Dr. Lucy D’Agostino McGowan, for skillfully guiding the conversation and bringing forward engaging questions that deepened our understanding of the theory and its practical implications.
We would like to thank COPSS for organizing this webinar and bringing together this session’s panelists. Special appreciation goes to all attendees who joined us for this webinar and contributed to a dynamic and thought‑provoking session. We are grateful for your continued interest and participation in the COPSS‑NISS Leadership Webinar Series, and we look forward to future opportunities to explore emerging ideas in statistics and data science together.
Resource links shared during the webinar:
Lucy D'Agostino McGowan:
Full article: Design Principles for Data Analysis
Quantifying the Alignment of a Data Analysis Between Analyst and Audience
Stephanie C. Hicks, PhD, MA:
Bloomberg American Health Open Case Studies Project
Modeling Data Analytic Iteration With Probabilistic Outcome Sets (arXiv:2309.08494)
Roger Peng, PhD:
The Analytic Fluency Scale: Measuring the Skill of a Data Analyst (OSF)
