Deep Learning with Differential Privacy
https://arxiv.org/pdf/1607.00133.pdf
This paper proposes a new algorithm that makes it possible to train a deep neural network under a modest privacy budget. The guarantee holds even against a strong adversary with full knowledge of the training mechanism and access to the model's parameters.
Note that:
We say that two training sets are adjacent if they differ in a single entry, that is, if one image-label pair is present in one set and absent in the other.
The algorithm is very similar to the traditional SGD algorithm, with a few exceptions:
To guarantee our model is differentially private, we need to bound the influence of each individual example on the model. Thus, we clip each per-example gradient in $\ell_2$ norm (see the code sketch after this list).
The algorithm adds noise at the lot level. Lots are similar to batches, but to limit memory consumption we may set the batch size much smaller than the lot size: the computation is performed in batches, and several batches are grouped into a lot before noise is added.
The algorithm tracks the overall privacy cost of the training run (via the paper's moments accountant).
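Putting the first two points together, here is a minimal sketch of one noisy-SGD step, assuming the parameters form a single flat NumPy array; `per_example_grad_fn` and every other name here is a hypothetical placeholder, not from the paper. In a real implementation the loop over examples would itself be chunked into batches for memory, with noise still added once per lot.

```python
import numpy as np

def dp_sgd_step(params, per_example_grad_fn, lot, clip_norm,
                noise_multiplier, lr, rng):
    """One noisy-SGD step on a single lot (a minimal sketch).

    `per_example_grad_fn(params, example)` is a hypothetical helper
    that returns the loss gradient for one example.
    """
    clipped = []
    for example in lot:
        g = per_example_grad_fn(params, example)
        # Bound each example's influence: scale the gradient down
        # so its l2 norm is at most `clip_norm`.
        g = g / max(1.0, np.linalg.norm(g) / clip_norm)
        clipped.append(g)
    # Noise is added once per lot, with standard deviation
    # proportional to the clipping norm.
    noise = rng.normal(0.0, noise_multiplier * clip_norm,
                       size=params.shape)
    noisy_grad = (np.sum(clipped, axis=0) + noise) / len(lot)
    # Descend along the noisy average gradient.
    return params - lr * noisy_grad
```

Dividing the noisy sum by the lot size matches the averaged update, so each example's contribution to the step is bounded by the clipping norm regardless of how large its raw gradient is.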
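For the third point, the paper develops the moments accountant to track the privacy cost across steps. Implementing it is beyond these notes, but as a simpler and looser stand-in, the generic strong composition theorem bounds the total loss after $k$ steps of an $(\varepsilon, \delta)$-DP mechanism. The function below is my own sketch of that bound, not the paper's accountant.

```python
import math

def strong_composition(eps_step, delta_step, k, delta_slack):
    """(eps_total, delta_total) after k adaptive uses of an
    (eps_step, delta_step)-DP mechanism, via the strong composition
    theorem of Dwork, Rothblum, and Vadhan. The paper's moments
    accountant yields a substantially tighter bound for DP-SGD."""
    eps_total = (eps_step * math.sqrt(2 * k * math.log(1 / delta_slack))
                 + k * eps_step * (math.exp(eps_step) - 1))
    return eps_total, k * delta_step + delta_slack
```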
Recall that a mechanism is $\varepsilon$-differentially private if, for any two adjacent training sets, the probability of every outcome changes by a factor of at most $e^{\varepsilon}$. Here $e^{\varepsilon}$ is the exponential function applied to the privacy parameter $\varepsilon$: if $\varepsilon$ is very close to 0, then $e^{\varepsilon}$ is very close to 1, so the probabilities are very similar. The bigger $\varepsilon$ is, the more the probabilities can differ.
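To make the scale concrete (my own numbers, not from the paper):

$$
e^{0.1} \approx 1.11, \qquad e^{1} \approx 2.72, \qquad e^{8} \approx 2981,
$$

so $\varepsilon = 0.1$ lets the probability of any outcome change by at most about 11%, while $\varepsilon = 8$ permits a factor of nearly 3000.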
This paper uses the $(\varepsilon, \delta)$ variant, which allows for the possibility that plain $\varepsilon$-differential privacy is broken with probability $\delta$.
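Spelled out, a mechanism $\mathcal{M}$ is $(\varepsilon, \delta)$-differentially private if, for all adjacent training sets $d$ and $d'$ and every set of outcomes $S$:

$$
\Pr[\mathcal{M}(d) \in S] \le e^{\varepsilon} \, \Pr[\mathcal{M}(d') \in S] + \delta.
$$

Setting $\delta = 0$ recovers plain $\varepsilon$-differential privacy.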