Explore the full two-day session lineup, and stay tuned for additional exciting sessions and speakers.

Day 1 - June 07

11:15 AM PT
2:15 PM ET

Generating Synthetic Tabular Data That’s Differentially Private

Lipika Ramaswamy

Senior Applied Scientist
Gretel AI

Generative models can produce synthetic datasets that preserve the statistical properties of their training data without identifying any particular training record, but most generative models to date offer no mathematical privacy guarantee that could support sharing or publishing the data. Without such a guarantee, each adversarial attack on these models and the synthetic data they generate has to be thwarted reactively, and there is no way to anticipate attacks that only become feasible in the future. Differential privacy (DP) addresses exactly this problem by bounding how much any single record can change the probability of any output, and therefore of any compromising event. Because calibrated noise is introduced into the algorithm itself, the guarantee holds regardless of the attack method or the attacker’s side information, including attacks that have not yet been devised. In this session, we’ll explore approaches to applying differential privacy, including one that measures low-dimensional distributions in a dataset and learns a graphical model representation from them. We’ll end with a preview of Gretel’s new generative model, which applies this method to create high-quality synthetic tabular data that is differentially private.
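As a rough illustration of what "calibrated noise" on a low-dimensional distribution can look like, here is a minimal Python sketch of the Laplace mechanism applied to a two-column contingency table. It is not Gretel's model and it covers only the measurement step; fitting a graphical model to the noisy measurements is not shown. The function name `noisy_marginal`, the toy records, and the column domains are hypothetical, and the sketch assumes add-or-remove-one-record neighboring datasets (so each count has sensitivity 1) and that the column domains are public.

```python
import random
from collections import Counter
from itertools import product


def noisy_marginal(records, columns, domains, epsilon, rng=None):
    """Return a noisy contingency table over `columns` (hypothetical helper).

    Assumes neighboring datasets differ by adding or removing one record,
    so the L1 sensitivity of the full table of counts is 1.
    """
    rng = rng or random.Random()
    counts = Counter(tuple(r[c] for c in columns) for r in records)
    noisy = {}
    # Iterate over every cell of the (assumed public) domain, including
    # empty cells; skipping zero counts would leak which cells are empty.
    for cell in product(*(domains[c] for c in columns)):
        # The difference of two Exp(rate=epsilon) draws is Laplace
        # distributed with scale 1/epsilon = sensitivity / epsilon.
        noise = rng.expovariate(epsilon) - rng.expovariate(epsilon)
        noisy[cell] = counts.get(cell, 0) + noise
    return noisy


# Toy data, made up for illustration only.
records = [
    {"age": "18-39", "smoker": "no"},
    {"age": "18-39", "smoker": "yes"},
    {"age": "40+", "smoker": "no"},
    {"age": "40+", "smoker": "no"},
]
domains = {"age": ["18-39", "40+"], "smoker": ["no", "yes"]}

table = noisy_marginal(records, ["age", "smoker"], domains,
                       epsilon=1.0, rng=random.Random(0))
for cell, value in sorted(table.items()):
    print(cell, round(value, 2))
```

A smaller `epsilon` gives a stronger guarantee and noisier counts. Floating-point Laplace sampling has known subtleties, so this sketch is for intuition only and should not be treated as a production-grade mechanism.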


Watch on demand

Watch all of the live sessions on demand and discover the latest developments in data-centric AI.