DPMLBench: Holistic Evaluation of Differentially Private Machine Learning

C. Wei, M. Zhao, Z. Zhang, M. Chen, W. Meng, B. Liu, Y. Fan, W. Chen

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › Academic › peer-review

Abstract

Differential privacy (DP), a rigorous mathematical definition quantifying privacy leakage, has become a well-accepted standard for privacy protection. Combined with powerful machine learning (ML) techniques, differentially private machine learning (DPML) is increasingly important. As the most classic DPML algorithm, DP-SGD incurs a significant loss of utility, which hinders DPML's deployment in practice. Many studies have recently proposed improved algorithms based on DP-SGD to mitigate this utility loss. However, these studies are isolated and do not comprehensively measure the performance of the improvements they propose. More importantly, there is a lack of comprehensive research comparing these improved DPML algorithms across utility, defensive capability, and generalizability.
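For context on where DP-SGD's utility loss comes from, the sketch below illustrates one DP-SGD update in the style of Abadi et al. (2016): per-sample gradients are clipped to bound sensitivity, then Gaussian noise calibrated to the clipping norm is added. This is a minimal PyTorch sketch with an explicit per-sample loop; the function and parameter names (dp_sgd_step, clip_norm, noise_multiplier) are illustrative and do not reflect DPMLBench's actual API.

```python
# Minimal sketch of one DP-SGD update (Abadi et al., 2016).
# Names are illustrative, not DPMLBench's API.
import torch

def dp_sgd_step(model, loss_fn, xs, ys, lr=0.1, clip_norm=1.0, noise_multiplier=1.0):
    """Apply one differentially private SGD step to `model` on the batch (xs, ys)."""
    params = [p for p in model.parameters() if p.requires_grad]
    summed = [torch.zeros_like(p) for p in params]

    for x, y in zip(xs, ys):
        # Per-sample gradient: forward/backward on a single example.
        loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0))
        grads = torch.autograd.grad(loss, params)

        # Sensitivity bounding: clip each per-sample gradient to L2 norm <= clip_norm.
        total_norm = torch.sqrt(sum(g.pow(2).sum() for g in grads))
        scale = min(1.0, clip_norm / (total_norm.item() + 1e-6))
        for s, g in zip(summed, grads):
            s.add_(g, alpha=scale)

    with torch.no_grad():
        for p, s in zip(params, summed):
            # Gaussian noise calibrated to the clipping norm privatizes the sum.
            noise = torch.normal(0.0, noise_multiplier * clip_norm,
                                 size=p.shape, device=p.device)
            p.add_(-(lr / len(xs)) * (s + noise))
```

In practice, libraries such as Opacus vectorize the per-sample gradient computation; the explicit loop here is only for clarity. Both the clipping (biased gradients) and the added noise contribute to the utility loss the improved algorithms try to mitigate.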
We fill this gap by performing a holistic measurement of improved DPML algorithms on utility and defense capability against membership inference attacks (MIAs) on image classification tasks. We first present a taxonomy of where improvements are located in the ML life cycle. Based on this taxonomy, we jointly perform an extensive measurement study of the improved DPML algorithms, covering twelve algorithms, four model architectures, four datasets, two attacks, and various privacy budget configurations. We also cover state-of-the-art label differential privacy (Label DP) algorithms in the evaluation. According to our empirical results, DP can effectively defend against MIAs, and sensitivity-bounding techniques such as per-sample gradient clipping play an important role in the defense. We also identify improvements that maintain model utility while defending against MIAs more effectively. Experiments show that Label DP algorithms incur less utility loss but are vulnerable to MIAs. ML practitioners may benefit from these evaluations when selecting appropriate algorithms. To support our evaluation, we implement DPMLBench, a modular, reusable software tool, which we open-source at https://github.com/DmsKinson/DPMLBench; it enables sensitive data owners to deploy DPML algorithms and serves as a benchmark tool for researchers and practitioners.
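The abstract does not detail the two attacks evaluated; as a minimal illustration of the attack family being defended against, the following sketch shows a simple loss-threshold membership inference test in the style of Yeom et al. (2018). All names here are hypothetical and are not DPMLBench's API.

```python
# Minimal sketch of a loss-threshold membership inference attack
# (Yeom et al., 2018): examples with unusually low loss under the
# target model are guessed to be training members. Names are hypothetical.
import torch

@torch.no_grad()
def infer_membership(model, loss_fn, x, y, threshold):
    """Predict True (training member) when the per-example loss falls below threshold."""
    model.eval()
    loss = loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).item()
    return loss < threshold
```

A common heuristic sets `threshold` to the target model's average training loss. Because DP training (clipping plus noise) narrows the gap between training and test losses, such attacks become less reliable, which is one intuition for the defensive role of sensitivity bounding noted above.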
Original language: English
Title of host publication: CCS 2023
Subtitle of host publication: Proceedings of the 2023 ACM SIGSAC Conference on Computer and Communications Security
Publisher: Association for Computing Machinery
Pages: 2621-2635
Number of pages: 15
ISBN (Electronic): 9798400700507
ISBN (Print): 9798400700507
DOIs
Publication status: Published - 2023

Funding

We would like to thank the anonymous reviewers for their insightful comments. This work is supported in part by the National Natural Science Foundation of China (NSFC) under Grant No. 62302441, the Funding for Postdoctoral Scientific Research Projects in Zhejiang Province (ZJ2022072), the ZJU – DAS-Security Joint Research Institute of Frontier Technologies, the Helmholtz Association within the project "Trustworthy Federated Data Analytics" (TFDA) (No. ZT-I-OO1 4), and the CISPA-Stanford Center for Cybersecurity (FKZ: 13N1S0762).

Funders and funder numbers:
Funding for Postdoctoral Scientific Research Projects in Zhejiang Province: ZJ2022072
TFDA
National Natural Science Foundation of China: 62302441
Zhejiang University
Helmholtz Association
CISPA-Stanford Center for Cybersecurity: 13N1S0762
