|Abstract:||How can multiple distributed entities train a shared deep net on their private data while protecting data privacy? This paper introduces InstaHide, a simple encryption of training images. Encrypted images can be used in standard deep learning pipelines (PyTorch, Federated Learning, etc.) with no additional setup or infrastructure. The encryption has a minor effect on test accuracy (unlike differential privacy). Encryption consists of mixing the image with a set of other images (in the sense of the Mixup data augmentation technique (Zhang et al., 2018)) followed by applying a random pixel-wise mask to the mixed image. Other contributions of this paper are: (a) Use of a large public dataset of images (e.g. ImageNet) for mixing during encryption; this improves security. (b) Experiments demonstrating effectiveness in protecting privacy against known attacks while preserving model accuracy. (c) Theoretical analysis showing that successfully attacking privacy requires attackers to solve a difficult computational problem. (d) Demonstration, by showing some efficient attacks, that Mixup alone is insecure (contrary to recent proposals). (e) Release of a challenge dataset to allow design of new attacks.|
|Citation:||Huang, Yangsibo, Zhao Song, Kai Li, and Sanjeev Arora. "InstaHide: Instance-hiding Schemes for Private Distributed Learning." In Proceedings of the 37th International Conference on Machine Learning (2020): pp. 4507-4518.|
|Pages:||4507 - 4518|
|Type of Material:||Conference Article|
|Journal/Proceeding Title:||Proceedings of the 37th International Conference on Machine Learning|
|Version:||Final published version. Article is made available in OAR by the publisher's permission or policy.|
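The two-step encryption described in the abstract (Mixup-style mixing followed by a random pixel-wise mask) can be illustrated with a minimal sketch. This is not the authors' released implementation: the function name `instahide_encrypt`, the Dirichlet draw for the mixing coefficients, and the choice of a random ±1 sign flip as the mask are assumptions made for illustration only.

```python
import numpy as np


def instahide_encrypt(private_image, other_images, rng):
    """Mix one private image with other images, then apply a random pixel-wise mask.

    Hypothetical helper: coefficient distribution and mask type are assumptions.
    """
    # Stack the private image with the images it will be mixed with: shape (k, H, W, C).
    images = np.stack([private_image, *other_images]).astype(np.float64)
    k = images.shape[0]

    # Random non-negative mixing coefficients summing to 1 (assumed Dirichlet draw).
    lam = rng.dirichlet(np.ones(k))

    # Mixup-style convex combination of the k images.
    mixed = np.tensordot(lam, images, axes=1)

    # Pixel-wise mask, assumed here to be an independent random sign flip per entry.
    mask = rng.choice([-1.0, 1.0], size=mixed.shape)

    return mask * mixed, lam, mask


rng = np.random.default_rng(0)
private = rng.uniform(-1.0, 1.0, size=(4, 4, 3))                     # stand-in for a private image
public = [rng.uniform(-1.0, 1.0, size=(4, 4, 3)) for _ in range(3)]  # stand-ins for public images
encrypted, lam, mask = instahide_encrypt(private, public, rng)
```

In this sketch the encrypted array has the same shape as the input image, so it could in principle be fed to an unmodified training pipeline, which is the property the abstract highlights. The toy arrays above are placeholders, not data from the paper.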
Items in OAR@Princeton are protected by copyright, with all rights reserved, unless otherwise indicated.