In addition, with the rapid development of Internet technology, the scale of online
social network data has shown a trend of quantitative changes. Faced with such a large
scale of social network data, traditional anonymity technology cannot meet the actual
needs, the use of parallel algorithms is an effective way to improve the efficiency of
execution. At present, the social network anonymous parallel processing technology
is mainly divided into two categories, one is based on the Secure Multi Party
(SMC) model [9], and the other takes use of the MapReduce which is a data processing
framework [10, 11]. However, these two types of parallel processing technologies are
aimed at relational data and do not consider the graph properties of individuals such as
the degree of nodes, neighbor subgraphs in social networks, which cannot protect
private information in social networks. The graph modeling and parallel processing of
social networks are effective solutions to the problem.
The research work of this paper is aimed at large-scale social network graph G. Its
content is to generate the anonymous graph G
*
quickly and efficiently while main-
taining the reachability of nodes. The main work and contributions are as follows.
(1) We propose a distributed Random Neighborhood Search (DRNS) algorithm. This
algorithm generates a Random Neighbor Table (RNT) for nodes and implem ents a
fast lookup of random neighbor sets based on the message passing mechanism of
GraphX.
(2) We propose two different distributed graph perturbation algorithms, Distribution
Neighborhood Randomization (DNR) and Reachability Preserving Distribution
Perturbation (PRDP). Based on the DRNS and the graph construction operations
in GraphX, DNR implements fast edge perturbation of large-scale social net-
works. RPDP proposes a “probe” mechanism. It is possible to maintain reacha-
bility node in the rapid edge perturbance.
2 Related Work
The reachability query is to query whether a node can reach another node in the
directed graph [5]. In order to protect the link relationship in the social network, the
researchers propose to protect the sensitive link through the random perturbation
technology [3]. The technology randomly modifies the social network graph by edge
probability, so the attacker cannot accurately guess the real data in the original social
network.
An edge perturbation technique [12] based on subgraph structure is proposed which
divides the original graph into several subgraphs, and then adds/deletes m edges ran-
domly in the subgraph. However, this increases the degree of some nodes in the
anonymous graph and the probability that such nodes will be identified in the subgra ph.
In order to solve this problem, a random neighbor edge perturbation technique [13]is
proposed. The edge <u, v> in the graph is reserved with a certain probability p
(0 p 1). If <u, v> needs to be deleted, the destination node v is replaced with the
r-hop (r 2) neighbor node w of the node u. Paper [14] proposed to use secure
grouping to protect the link relationship of interactive social networks. The idea is to
abstract the network into bipartite graphs, then group the network nodes, and the nodes
196 X. Zhang et al.