Federated learning is a framework for distributed data analysis that does not require a centralized datacenter. A central server owning a machine learning model coordinates learning across distributed clients: the server distributes the model, and each client sends updates back to the server for aggregation.
Federated learning faces two problems: privacy and compression.
Usually privacy is achieved by Local Differential Privacy (LDP): clients perturb their own data locally to conceal private information before it leaves the device.
Compression is achieved by quantization: mapping a value in a given range to one of a small set of representative values covering the same range. For example, if the range is [0, 1] and we quantize to 5 different values, this can be achieved by mapping each value to the nearest point in {0, 0.25, 0.5, 0.75, 1}; a value of 0.3 maps to 0.25. Sending a continuous value through a communication channel requires several bytes (float, double), whereas the above example only requires 3 bits, since 3 bits suffice to index 5 levels. There are various quantization techniques that reduce the size while preserving the value as much as possible.
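As a concrete sketch of this kind of uniform quantization (the range, level count, and function name here are illustrative, not from the paper):

```python
import math

def uniform_quantize(x, lo=0.0, hi=1.0, levels=5):
    """Map x in [lo, hi] to the nearest of `levels` evenly spaced points."""
    step = (hi - lo) / (levels - 1)
    k = round((x - lo) / step)        # index of the nearest level
    return lo + k * step

print(uniform_quantize(0.3))          # -> 0.25
print(math.ceil(math.log2(5)))        # -> 3: bits needed to index 5 levels
```

Only the level index needs to be transmitted, which is where the savings over a multi-byte float come from.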
Differential privacy and compression have traditionally been achieved independently. The common approach to addressing both was sequential: first apply differential privacy to the data or weights, then compress the result before communication. This sequential approach has an adverse effect on utility, since each step degrades utility without accounting for the other. In this work, the authors propose a privacy mechanism designed in conjunction with the compression.
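To see the adverse interaction, here is a small illustrative simulation (the parameter choices are mine, not the paper's): adding zero-mean Laplace noise is unbiased on its own, but deterministically rounding the noised value afterwards biases the estimate.

```python
import math
import random

random.seed(0)

def laplace(scale):
    """Sample zero-mean Laplace noise via inverse-CDF sampling."""
    u = random.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1 - 2 * abs(u))

x, scale, n = 0.3, 0.1, 200_000
# DP-then-compress: privatize with Laplace noise, then round deterministically.
est = sum(round(x + laplace(scale)) for _ in range(n)) / n
print(est)  # well below 0.3: rounding after the noise pulls the mean toward 0
```

Averaging many such reports does not recover 0.3, which is exactly the kind of compounded utility loss a jointly designed mechanism avoids.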
The goal of the paper is an algorithm that is both differentially private and compressed. Formally speaking, there are three requirements: each client's mechanism M_i must be locally differentially private; its output must be representable in a small number of bits; and the server's aggregate A(M_1(x_1), ..., M_n(x_n)) must be a good approximation of f(x_1, ..., x_n). Here x_1, ..., x_n are the sensitive data held by the respective clients, A is an aggregation mechanism, M_i is a local differential privacy mechanism, and f(x_1, ..., x_n) is an aggregate statistic.
There is no further explanation regarding f. This is because f is not tied to a particular task: it can be a statistic of the data, or a likelihood of the data. The point is that whatever f is, the aggregate A(M_1(x_1), ..., M_n(x_n)) is a good approximation of it. For example, consider a federated learning scenario where the i-th client trains a classifier on data x_i using privacy mechanism M_i, and the server aggregates all of the updates using A. The target f, which the protocol approximates, is an ideal classifier trained from all the data combined.
This is somewhat magical. How can quantized, differentially private federated learning be a good approximator of an imaginary monolithic classifier trained on all the data combined? Of course, it is not a free lunch. The authors define a property called asymptotic consistency.
Definition 4
We say that a private federated analytics protocol is asymptotically consistent if the output of the server's aggregation algorithm approaches the target statistic as the number of clients n goes to infinity. In other words, for any ε > 0 and δ > 0, there exists an N such that for all n ≥ N, we have:

Pr[ |A(M_1(x_1), ..., M_n(x_n)) − f(x_1, ..., x_n)| > ε ] < δ
There is an additional requirement specific to federated learning. Usually in federated learning, the aggregation function A averages the updates from the clients. This poses a requirement on each M_i.
Lemma 5
If M_i(x_i) is unbiased for all i, i.e. E[M_i(x_i)] = x_i, and has bounded variance, and if A computes the average of the client responses, then the federated analytics solution is asymptotically consistent.
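A quick numerical check of the lemma, with zero-mean uniform noise standing in for an unbiased privacy mechanism (a real protocol would calibrate the noise to a privacy budget; the names here are mine):

```python
import random

random.seed(0)

def noisy_report(x, width=4.0):
    """Stand-in unbiased mechanism: x plus zero-mean bounded noise."""
    return x + random.uniform(-width / 2, width / 2)

data = [random.random() for _ in range(100_000)]   # clients' private values
true_mean = sum(data) / len(data)
server_avg = sum(noisy_report(x) for x in data) / len(data)  # A = averaging
print(abs(server_avg - true_mean))  # small, and shrinks as n grows
```

Each individual report is very noisy, but because every report is unbiased with bounded variance, the server's average concentrates around the true mean as n grows.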
As long as our privacy mechanism is unbiased, we are good. Typical mechanisms such as the Gaussian mechanism add zero-mean noise and are therefore unbiased, but their outputs are continuous, and quantizing them naively reintroduces bias. The goal now is to find a differentially private mechanism that preserves unbiasedness under compression.
A stochastic algorithm M is unbiased if E[M(x)] = x for every input x.
An unbiased quantization algorithm is dithered quantization.
If M(x) = floor(x + u) where u ~ Uniform[0, 1), then E[M(x)] = x.
With this probabilistic mapping, it is possible to keep the algorithm unbiased while still emitting a quantized value.
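A minimal check of dithered rounding to the integer grid (the function name is mine):

```python
import math
import random

random.seed(0)

def dithered_round(x):
    """M(x) = floor(x + u), u ~ Uniform[0, 1); unbiased: E[M(x)] = x."""
    return math.floor(x + random.random())

x, n = 0.3, 200_000
avg = sum(dithered_round(x) for _ in range(n)) / n
print(avg)  # close to 0.3, even though each output is just 0 or 1
```

Each output is a single integer, so it is cheap to communicate, yet averaging many of them recovers the original value, which is exactly the property Lemma 5 asks of the client mechanism.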