Differentially private analysis of graphs

{{Short description|Method of graph analysis}}

Differentially private analysis of graphs{{cite journal

|author1=Sofya Raskhodnikova|author1-link= Sofya Raskhodnikova |author2=Adam Smith

|title=Differentially Private Analysis of Graphs

|date=2015

|journal=Kao MY. (Eds) Encyclopedia of Algorithms. Springer, Berlin, Heidelberg

|doi=10.1007/978-3-642-27848-8_549-1

}} studies algorithms for computing accurate graph statistics while preserving differential privacy. Such algorithms are used for data represented in the form of a graph where nodes correspond to individuals and edges correspond to relationships between them. For examples, edges could correspond to friendships, sexual relationships, or communication patterns.

A party that collected sensitive graph data can process it using a differentially private algorithm and publish the output of the algorithm. The goal of differentially private analysis of graphs is to design algorithms that compute accurate global information about graphs while preserving privacy of individuals whose data is stored in the graph.

Variants

Differential privacy imposes a restriction on the algorithm. Intuitively, it requires that the algorithm has roughly the same output distribution on neighboring inputs. If the input is a graph, there are two natural notions of neighboring inputs, edge neighbors and node neighbors, which yield two natural variants of differential privacy for graph data.

Let ε be a positive real number and \mathcal{A} be a randomized algorithm that takes a graph as input and returns an output from a set \mathcal{O}.

The algorithm \mathcal{A} is \epsilon-differentially private if, for all neighboring graphs G_1 and G_2 and all subsets S of \mathcal{O},

\Pr[\mathcal{A}(G_1) \in S] \leq e^{\epsilon} \times \Pr[\mathcal{A}(G_2) \in S],

where the probability is taken over the randomness used by the algorithm.

=Edge differential privacy=

Two graphs are edge neighbors if they differ in one edge. An algorithm is \epsilon-edge-differentially private if, in the definition above, the notion of edge neighbors is used. Intuitively, an edge differentially private algorithm has similar output distributions on any pair of graphs that differ in one edge, thus protecting changes to graph edges.

=Node differential privacy=

Two graphs are node neighbors if one can be obtained from the other by deleting a node and its adjacent edges. An algorithm is \epsilon-node-differentially private if, in the definition above, the notion of node neighbors is used. Intuitively, a node differentially private algorithm has similar output distributions on any pair of graphs that differ in one one nodes and edges adjacent to it, thus protecting information pertaining to each individual. Node differential privacy give a stronger privacy protection than edge differential privacy.

Research history

The first edge differentially private algorithm was designed by Nissim, Raskhodnikova, and Smith.{{Cite book|last1=Nissim|first1=Kobbi|last2=Raskhodnikova|first2=Sofya|author2-link= Sofya Raskhodnikova |last3=Smith|first3=Adam|title=Proceedings of the thirty-ninth annual ACM symposium on Theory of computing |chapter=Smooth sensitivity and sampling in private data analysis |date=2007|pages=75–84|location=New York, New York, USA|publisher=ACM Press|doi=10.1145/1250790.1250803|isbn=9781595936318|s2cid=5642529}} The distinction between edge and node differential privacy was first discussed by Hay, Miklau, and Jensen.{{Cite book|last1=Hay|first1=Michael|last2=Li|first2=Chao|last3=Miklau|first3=Gerome|last4=Jensen|first4=David|title=2009 Ninth IEEE International Conference on Data Mining |chapter=Accurate Estimation of the Degree Distribution of Private Networks |date=2009|publisher=IEEE|pages=169–178|doi=10.1109/icdm.2009.11|isbn=9781424452422|s2cid=2572996}} However, it took several years before first node differentially private algorithms were published in Blocki et al.,{{Cite book|last1=Blocki|first1=Jeremiah|last2=Blum|first2=Avrim|last3=Datta|first3=Anupam|last4=Sheffet|first4=Or|title=2012 IEEE 53rd Annual Symposium on Foundations of Computer Science |chapter=The Johnson-Lindenstrauss Transform Itself Preserves Differential Privacy |date=2012|pages=410–419|doi=10.1109/focs.2012.67|arxiv=1204.2136|isbn=978-0-7695-4874-6|bibcode=2012arXiv1204.2136B|s2cid=349368}} Kasiviswanathan et al.,{{Citation|last1=Kasiviswanathan|first1=Shiva Prasad|title=Analyzing Graphs with Node Differential Privacy|date=2013|work=Theory of Cryptography|pages=457–476|publisher=Springer Berlin Heidelberg|isbn=9783642365935|last2=Nissim|first2=Kobbi|last3=Raskhodnikova|first3=Sofya|author3-link= Sofya Raskhodnikova |last4=Smith|first4=Adam|doi=10.1007/978-3-642-36594-2_26|doi-access=free}} and Chen and Zhou.{{Cite book|last1=Chen|first1=Shixi|last2=Zhou|first2=Shuigeng|title=Proceedings of the 2013 ACM SIGMOD International Conference on Management of Data |chapter=Recursive mechanism |date=2013|pages=653–664|location=New York, New York, USA|publisher=ACM Press|doi=10.1145/2463676.2465304|isbn=9781450320375|s2cid=16257197}} In all three papers, the algorithms are for releasing a single statistic, like a triangle count or counts of other subgraphs. Raskhodnikova and Smith gave the first node differentially private algorithm for releasing a vector, specifically, the degree count and the degree distribution.{{Cite book|last1=Raskhodnikova|first1=Sofya|author1-link= Sofya Raskhodnikova |last2=Smith|first2=Adam|title=2016 IEEE 57th Annual Symposium on Foundations of Computer Science (FOCS) |chapter=Lipschitz Extensions for Node-Private Graph Statistics and the Generalized Exponential Mechanism |date=2016|publisher=IEEE|pages=495–504|doi=10.1109/focs.2016.60|isbn=9781509039333|s2cid=7310416}}

References