NISS/FCSM AI in Federal Government - Text Analysis: Methods and Use Cases

Monday, February 12, 2024 at 3:00 pm - 4:30 pm ET


Brandon Kopp, US Bureau of Labor Statistics (BLS)

"Clustering Federal Register Comments for Efficient, Manual Review"

Benjamin Rogers, National Center for Health Statistics (NCHS/CDC)

"Text Analytics: Background on Large Language Models and their usage"

Kenneth Haase, US Census Bureau

"Why Meaning Matters: Grounding for Text, data, and statistical products"

Tala Fakhouri, US Food and Drug Administration (FDA)

"Using AI-based technologies to increase operational efficiencies at the FDA"


Ed Strocko, US Bureau of Transportation Statistics (BTS)


Brandon Kopp

Clustering Federal Register Comments for Efficient, Manual Review

Posting Federal Register notices (FRNs) is a regular part of government work. Most of these notices receive only a handful of comments, but some receive a great many. Brandon will talk about his recent experience on an interagency team tasked with reviewing public comments on an FRN about possible revisions to race and ethnicity measurement standards, which received over 20,000 comments. He will also discuss how he downloaded the comments daily during the comment period, used clustering techniques to identify letter-writing campaigns (exact and near-duplicate comments) and unique comments, and shared this information with reviewers and stakeholders through a dashboard. Time permitting, he will also discuss subsequent extensions of this work to summarize longer, multi-page comments submitted to FRNs as PDFs.
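
For readers unfamiliar with near-duplicate detection, the following is a minimal sketch of one common approach, not the method used in the talk. It compares comments by the overlap (Jaccard similarity) of their three-word shingles and groups any comments that are linked by a similarity above a threshold. The function names, the shingle size, and the 0.5 threshold are all illustrative assumptions.

```python
import re
from itertools import combinations


def shingles(text, k=3):
    """Return the set of k-word shingles of a lowercased text, ignoring punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < k:
        return {" ".join(words)}
    return {" ".join(words[i:i + k]) for i in range(len(words) - k + 1)}


def jaccard(a, b):
    """Jaccard similarity of two sets: size of intersection over size of union."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def cluster_comments(comments, threshold=0.5, k=3):
    """Group comment indices whose shingle sets are linked by similarity >= threshold.

    Hypothetical helper: the threshold and k values are assumptions, not
    settings taken from the talk.
    """
    parent = list(range(len(comments)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    sets = [shingles(c, k) for c in comments]
    # Compare every pair of comments and merge the groups of similar pairs.
    for i, j in combinations(range(len(comments)), 2):
        if jaccard(sets[i], sets[j]) >= threshold:
            ri, rj = find(i), find(j)
            if ri != rj:
                parent[max(ri, rj)] = min(ri, rj)

    groups = {}
    for i in range(len(comments)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


if __name__ == "__main__":
    example = [
        "I strongly support adding a new category to the standards.",
        "I strongly support adding a new category to the standards!",
        "I strongly support adding a new category to the standards. Thank you.",
        "Please keep the current questions unchanged for comparability over time.",
    ]
    for group in cluster_comments(example):
        print(group)
```

Groups with more than one member are candidate letter-writing campaigns, and single-member groups are candidate unique comments. Because this sketch compares every pair of comments, it would be slow at the scale of tens of thousands of comments; a faster approximate technique such as MinHash with locality-sensitive hashing is a common substitute in that setting.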

Benjamin Rogers

Text Analytics: Background on Large Language Models and their usage

Benjamin Rogers will delve into the foundational aspects of large language models and their practical applications in text analytics. Providing a comprehensive background, this talk will explore the significance of large language models at the National Center for Health Statistics. He will discuss how these models are leveraged to extract valuable insights from vast textual datasets, contributing to the advancement of text analytics methodologies.

Kenneth Haase

Why Meaning Matters: Grounding for Text, data, and statistical products

Kenneth Haase will emphasize the paramount importance of meaning in the realms of text, data, and statistical products. This talk will delve into the critical role of grounding information in meaning to ensure the accuracy and reliability of results, particularly within the context of the US Census Bureau's work. He will share insights into the implications of prioritizing meaning and its impact on producing trustworthy and meaningful statistical products.


Tala Fakhouri

Using AI-based technologies to increase operational efficiencies at the FDA

Tala Fakhouri will explore the transformative power of AI-based technologies in enhancing operational efficiencies at the US Food and Drug Administration. This talk highlights strategic implementations of artificial intelligence, showcasing how these technologies contribute to the streamlining of processes within the FDA. Tala will provide a comprehensive understanding of how AI advancements are employed to increase productivity and effectiveness in regulatory operations.

About the Speakers

Brandon Kopp is a seasoned research psychologist with a rich career that commenced in 2010 at the Bureau of Labor Statistics (BLS) in Washington, DC. Within the BLS, an independent agency housed in the Department of Labor (DoL), Brandon serves as a survey research advisor within the Behavioral Science Research Center in the Office of Survey Methods Research. His primary focus revolves around labor economics, where he plays a vital role in designing questions for longstanding surveys administered by the BLS. Committed to enhancing the reliability of data across time, Brandon engages in basic research to improve various aspects of the survey process. This includes refining user experience through website usability projects, where he collaborates with website users to identify potential enhancements. Several years into his tenure at BLS, Brandon enjoys a rewarding mix of independent and team-driven projects, showcasing his proficiency and passion for programming and data science in his diverse and successful professional journey.

Benjamin Rogers is a Data Scientist with the Division of Research and Methodology at the National Center for Health Statistics at CDC. He is experienced in integrating AI technologies into public health initiatives, specializing in AI and machine learning. Rogers has played a crucial role in assessing the utility and risks of conversational AI technologies like ChatGPT, demonstrating his commitment to responsible AI adoption at the CDC. He contributed to the training of the publicly released NCHS AI model, the Semi-Automated Non-response Detection for Surveys. In 2022, Rogers presented at the Federal Committee on Statistical Methodology on using Natural Language Processing for survey response analysis. In 2023, he participated in AAPOR, showcasing collaborative NLP efforts in survey text analysis. Rogers holds a Master of Science in Data Science from the University of Virginia.

Kenneth Haase, Census Bureau

Kenneth Haase has a diverse academic background, spending a significant portion of his academic career at MIT. Beginning in mathematics and transitioning through philosophy, he earned his Master's and PhD degrees in computer science at the Artificial Intelligence Laboratory. As a professor at the MIT Media Laboratory, he focused on semantically rich metadata and case-based representations, contributing to areas such as knowledge bases, linguistic parsing, knowledge-based information retrieval, and analogical matching algorithms. Notably, he co-developed FramerD, an open-source object database designed for large pointer-intensive datasets. 

Professionally, Kenneth has held various roles, including software developer, designer of physical and virtual interfaces at Atari, researcher, research manager, professor, entrepreneur, and laboratory director. In 2001, he founded Beingmeta, Inc., a company dedicated to creating technologies that enhance human access to information, leveraging his work at MIT. The company has concentrated on strengthening semantic and linguistic capabilities and has engaged in pilot projects with several companies in search and knowledge management applications. Beingmeta's technology portfolio encompasses high-performance knowledge systems, natural language analysis tools, and a platform for consistent, interlingual, inter-individual metadata creation.

Tala Fakhouri, PhD, MPH is the Associate Director for Policy Analysis in the Office of Medical Policy Initiatives (OMPI), Center for Drug Evaluation and Research (CDER), Food and Drug Administration (FDA). Dr. Fakhouri manages a team tasked with developing, coordinating, and implementing medical policy with a focus on the use of Artificial Intelligence (AI) and Machine Learning (ML) in drug development. These efforts include overseeing an AI policy group, as well as engaging external stakeholders and advancing the development of regulatory science around the use of AI in drug development. Recently, she led the development and publication of a discussion paper titled “Using Artificial Intelligence and Machine Learning in the Development of Drug and Biological Products.” She also contributes to the development of medical policy related to real-world evidence (RWE) and the use of digital health technologies for medical product development. In 2023, Dr. Fakhouri was selected by the Office of Management and Budget (OMB) to serve on the Federal Committee on Statistical Methodology (FCSM) for her expertise in statistical methods.

About the Moderator

Ed Strocko is the Director of the Office of Spatial Analysis and Visualization at the USDOT Bureau of Transportation Statistics, or BTS, a principal federal statistical agency. Ed leads the development of geospatial information and visualization tools including the National Transportation Atlas Database, and directs spatial and network analyses that facilitate the understanding of major issues and inform investment and policy decisions. He employs high-quality cartography and innovative web applications to produce relevant, timely, comparable, complete, and accessible geospatial products and statistical visualizations.

Prior to coming to BTS, Ed worked in the Federal Highway Administration’s Office of Freight Management and Operations, where he was the Team Leader for Research and Analysis. Prior to joining FHWA, Ed was the manager of Multi-modal Studies for the Maryland Department of Transportation. Ed also worked as a land use and community planner for a number of jurisdictions.

About the NISS/FCSM AI in Federal Government Series

The National Institute of Statistical Sciences (NISS) and the Federal Committee on Statistical Methodology (FCSM) are collaborating on a series of webinars on Artificial Intelligence (AI). The initial webinar, AI in Federal Government: Uses, Potential Applications, and Issues, took place on October 31, 2023. This series aims to benefit federal practitioners and managers by providing behind-the-scenes information on uses of AI in federal agencies and insights on how agencies meet organizational, managerial, and ethical challenges in harnessing the power of AI. Participation by researchers and managers in the webinars can help streamline current efforts to adopt AI and inspire new endeavors. The NISS/FCSM webinar series creates unique opportunities not easily available through other forums or venues. Thank you to the American Statistical Association (ASA) for becoming a sponsor of the series.



National Institute of Statistical Sciences
FCSM | Federal Committee on Statistical Methodology


ASA | American Statistical Association
NORC at the University of Chicago



