We are pleased to announce the UCLA Synthetic Data Workshop. This two-day workshop is hosted by the UCLA Department of Statistics and co-sponsored by IDRE and the UCLA-Amazon Science Hub. The workshop will be held at the UCLA Faculty Club.
Synthetic data generation is a rapidly growing and highly interdisciplinary area that draws considerable attention from both academia and industry. Synthetic data has become a valuable resource for developing algorithmic procedures for fraud detection and spam identification, as well as for constructing AI-driven models in industries like manufacturing and supply chain management. Its advantages include cost savings, increased speed, agility, increased intelligence, and cutting-edge privacy. According to a Gartner report, synthetic data will overshadow real data in training machine/deep learning models by 2030; see the figure below. Additionally, the MIT Technology Review named synthetic data as one of its top 10 game-changing innovations for 2022 earlier this year. Therefore, it is believed that synthetic data generation will be an indispensable part of the next-generation machine learning workflow.
Despite numerous successful applications of synthetic data, its scientific foundation, e.g., the tradeoff among fidelity, utility, and privacy, is still missing. Additionally, industrial standards for generating and utilizing synthetic data are not fully developed. Furthermore, privacy law concerning synthetic data has not been fully developed. This workshop therefore aims to form a community of synthetic data researchers (from statistics, machine learning, and mathematics), policymakers, and industrial partners, and to bring them together to collaborate on the development of the theory, methodology, and algorithms needed to produce synthetic benchmark datasets and algorithms.
The speaker list, program, and poster schedule will be available soon on the workshop website!