unlabeled classification algorithms, puMGL contains two iterative processes: (1) selecting subgraph features and (2) deriving a discriminative model. The problem setting of puMGL differs from ours: it assumes that the training datasets contain a small number of positive and a large number of unlabeled multi-graphs, whereas in our work the training datasets contain a small number of labeled multi-graphs for every category and a large number of unlabeled multi-graphs. Thus, the puMGL framework cannot be directly applied to solve our problem.
2.2. Subgraph feature selection
Existing subgraph feature selection algorithms can be divided
into three types: unsupervised algorithms, semi-supervised algo-
rithms and supervised algorithms.
The first type treats frequent subgraphs as subgraph features; for example, the top-$m$ frequent subgraphs are used as subgraph features. The second type first mines frequent subgraphs and then selects subgraph features from them using both labeled and unlabeled graph information. For example, Kong and Yu [3] propose the semi-supervised subgraph feature selection algorithm gSSC. Because gSSC considers not only labeled graphs but also unlabeled graphs when selecting subgraph features, it can derive more valuable subgraph features than other algorithms; meanwhile, an upper bound pruning strategy is proposed to improve its efficiency. The last type mines subgraph features, directly or indirectly, from a large number of labeled graphs. For example, Yan et al. [7] propose a novel mining framework, LEAP (Descending Leap Mine), to identify feature subgraphs. Moreover, two new techniques, structural proximity pruning and frequency-descending mining, are proposed to support leap search in the graph pattern space. In addition, to speed up the mining of subgraph features and to improve their quality, many other techniques have been adopted, such as similarity-based partitioning [8], fast probing based on search history [9], and diversity techniques [10].
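To make the first type concrete, here is a minimal Python sketch of top-$m$ frequent subgraph selection with binary feature vectors. It is illustrative only: `enumerate_subgraphs` and `contains` are hypothetical helpers (in practice a miner such as gSpan would supply them), not part of any cited algorithm.

```python
from collections import Counter

def top_m_frequent_features(graphs, enumerate_subgraphs, contains, m):
    """Unsupervised feature selection: rank candidate subgraphs by
    frequency and keep the m most frequent as features.

    enumerate_subgraphs(g) -> iterable of canonical subgraph codes
    contains(g, sg)        -> True if graph g contains subgraph sg
    (both are assumed helpers, e.g. backed by a gSpan-style miner)
    """
    counts = Counter()
    for g in graphs:
        # Count each candidate once per graph (document frequency).
        counts.update(set(enumerate_subgraphs(g)))
    features = [sg for sg, _ in counts.most_common(m)]

    # Map every graph to a binary feature vector over the selected set.
    vectors = [[1 if contains(g, sg) else 0 for sg in features]
               for g in graphs]
    return features, vectors
```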
Although the first type of algorithms does not require the labels of the graphs in the datasets, experimental results show that its accuracy is far from satisfactory: informative subgraph features may be infrequent as well as frequent, so frequency alone is not an appropriate criterion for selecting subgraph features. Although the second type of algorithms only requires graph datasets containing a small number of labeled graphs, they cannot be directly used to solve the semi-supervised multi-graph classification problem because of the differences between a multi-graph and a graph: compared with a graph, a multi-graph not only maintains the containment relationship between the multi-graph and its member graphs, but also carries mutual constraints between the graph labels and the multi-graph label. The third type of algorithms is likewise unsuitable for directly solving the semi-supervised multi-graph classification problem because it demands training datasets with a large number of labeled multi-graphs. In this paper, we propose a novel subgraph feature selection algorithm that is specially designed for the multi-graph setting and that considers the constraints at both the graph level and the multi-graph level.
2.3. Semi-supervised extreme learning machine
Huang et al. propose ELM for single hidden-layer feedforward neural networks (SLFNs) and then extend it to the “generalized” SLFNs [11–25]. ELM has better generalization performance, faster learning speed, and higher training precision than traditional feedforward neural networks. The ELM algorithm has a wide range of applications, such as protein secondary structure prediction [26], classification in P2P networks [27], XML document classification [28], graph classification [29], and multi-graph classification [30].
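For reference, here is a minimal NumPy sketch of basic ELM training in the standard formulation: the hidden layer is random and fixed, and only the output weights are solved in closed form via the Moore–Penrose pseudo-inverse. Variable names and the sigmoid activation are illustrative choices.

```python
import numpy as np

def elm_train(X, Y, n_hidden, rng=np.random.default_rng(0)):
    """Basic ELM: random input weights, closed-form output weights.
    X: (n_samples, n_features); Y: (n_samples, n_classes), one-hot."""
    W = rng.standard_normal((X.shape[1], n_hidden))  # random input weights
    b = rng.standard_normal(n_hidden)                # random biases
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))           # sigmoid hidden layer
    beta = np.linalg.pinv(H) @ Y                     # Moore-Penrose solution
    return W, b, beta

def elm_predict(X, W, b, beta):
    H = 1.0 / (1.0 + np.exp(-(X @ W + b)))
    return H @ beta                                  # class scores
```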
As a promising algorithm, semi-supervised ELM has begun to attract more researchers' attention. Liu et al. [31] introduce the manifold regularization framework into the ELM model to solve the semi-supervised binary classification problem; however, their algorithm is inefficient when the number of hidden neurons is larger than the number of training patterns. Li et al. [32] design a co-training method that repeatedly trains ELMs in a semi-supervised setting, and this strategy of repeated training makes the algorithm inefficient. Huang et al. [33] extend ELMs to semi-supervised tasks based on manifold regularization; the proposed algorithm, SS-ELM (semi-supervised ELM), can handle multi-class classification. Liu et al. [6] propose ESELM (extended semi-supervised ELM), which considers the empirical risk and the structural risk at the same time. ESELM fits both low-dimensional and high-dimensional datasets, and extensive experimental results show its effectiveness. ESELM leans toward the graph structure, while SS-ELM characterizes the general form of the semi-supervised method [6]. To the best of our knowledge, ESELM is the latest semi-supervised ELM algorithm, and we therefore choose ESELM to build our prediction model.
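For orientation only, the manifold-regularization idea shared by these methods can be sketched as follows; this is a generic form (up to constant factors and the exact penalty weighting), not the precise objective of SS-ELM or ESELM. With hidden-layer output matrix $H$, diagonal penalty matrix $C$ (nonzero only for labeled samples), graph Laplacian $L$ of the data adjacency, target matrix $\tilde{Y}$ (labels for labeled samples, zeros otherwise), and trade-off $\lambda$:
\[
\min_{\beta}\; \|\beta\|^2
+ \operatorname{Tr}\!\big((H\beta - \tilde{Y})^{\top} C (H\beta - \tilde{Y})\big)
+ \lambda \operatorname{Tr}\!\big(\beta^{\top} H^{\top} L H \beta\big),
\]
whose closed-form minimizer is
\[
\beta^{*} = \big(I + H^{\top} C H + \lambda H^{\top} L H\big)^{-1} H^{\top} C\, \tilde{Y}.
\]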
3. Problem definition
In this section, we introduce related concepts and formulate our
problem.
Definition 1. (Connected graph)
A graph $G$ is described by a quadruple $\langle V, E, \Sigma, f \rangle$, where $V$ and $E$ denote the sets of vertices and edges, respectively, $\Sigma$ indicates the label range of vertices and edges, and $f$ is a label function which assigns a label to every vertex and edge. A connected graph is a graph in which there exists at least one path between any two vertices. When no ambiguity arises, a connected graph is called a graph for short in this paper. $v_i$ denotes the $i$th vertex of $G$, and $e(v_i, v_j)$ indicates the edge between $v_i$ and $v_j$. The label of vertex $v_i$ is represented as $f(v_i)$, the label of edge $e(v_i, v_j)$ is represented as $f(e(v_i, v_j))$, and $l(G)$ denotes the class label of graph $G$.
Definition 2. (Subgraph)
Graph $G'$ is a subgraph of graph $G$ if $G'$ and $G$ satisfy two conditions: (1) for any vertex $v_i \in V$ of $G'$, $f(v_i) = f(v_j)$, where $v_j \in V$ of $G$; (2) $f(e(v_i, v_m)) = f(e(v_j, v_n))$, where $e(v_i, v_m) \in E$ of $G'$, $e(v_j, v_n) \in E$ of $G$, and $f(v_i) = f(v_j)$, $f(v_m) = f(v_n)$. $G' \subseteq G$ denotes that $G'$ is a subgraph of $G$.
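As an executable illustration of Definition 2, the sketch below tests label-preserving containment $G' \subseteq G$ with networkx's graph matcher. It is one plausible realization, not the paper's implementation; the attribute name "label" is an assumption.

```python
import networkx as nx
from networkx.algorithms import isomorphism

def is_subgraph(g_prime, g):
    """True if g contains a label-preserving embedding of g_prime,
    i.e. g_prime is a subgraph of g in the sense of Definition 2."""
    matcher = isomorphism.GraphMatcher(
        g, g_prime,
        node_match=isomorphism.categorical_node_match("label", None),
        edge_match=isomorphism.categorical_edge_match("label", None))
    return matcher.subgraph_is_monomorphic()

# Example: a labeled triangle contains a labeled edge A-B.
G = nx.Graph()
G.add_nodes_from([(1, {"label": "A"}), (2, {"label": "B"}),
                  (3, {"label": "C"})])
G.add_edges_from([(1, 2, {"label": "x"}), (2, 3, {"label": "y"}),
                  (1, 3, {"label": "z"})])
Gp = nx.Graph()
Gp.add_nodes_from([(1, {"label": "A"}), (2, {"label": "B"})])
Gp.add_edge(1, 2, label="x")
assert is_subgraph(Gp, G)
```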
Definition 3. (Labeled multi-graph)
A multi-graph is a bag of graphs $MG = \{G_1, \ldots, G_i, \ldots, G_{|MG|}\}$ $(1 < i < |MG|)$. A labeled multi-graph is a multi-graph with a binary class label $l(MG) \in \{positive\,(+), negative\,(-)\}$. If the class label of one graph of a multi-graph is tagged as positive, the multi-graph is tagged as positive, i.e., $\exists G \in MG$ and $l(G) = positive \Rightarrow l(MG) = positive$. Otherwise, the multi-graph is tagged as negative, i.e., $\forall G \in MG$ and $l(G) = negative \Rightarrow l(MG) = negative$.
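This is the standard multi-instance labeling rule; as a one-line sketch (names illustrative):

```python
def multigraph_label(graph_labels):
    """A bag is positive iff at least one member graph is positive."""
    return "positive" if any(l == "positive" for l in graph_labels) else "negative"
```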
Definition 4. (Optimal subgraph feature selection)
Given a multi-graph set $MG_S = \{MG_1, \ldots, MG_i, \ldots, MG_{|MG_S|}\}$, $MG_S$'s graph set $G_S = \{G \mid G \in MG_i, MG_i \in MG_S\}$, and $G_S$'s subgraph set $SG = \{Sg \mid Sg \subseteq G, G \in G_S\}$, optimal subgraph feature selection aims to find the most valuable subgraph feature set $F \subseteq SG$, i.e., the set most useful for distinguishing multi-graphs:
\[
F_S = \arg\max_{F} S(F), \quad \text{s.t. } |F| = m, \tag{1}
\]
where $S(F)$ denotes an evaluation criterion that estimates the usefulness of $F$, $|F|$ denotes the cardinality of $F$, and $m$ is the maximum number of selected features.
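Maximizing $S(F)$ over all size-$m$ subsets of $SG$ is combinatorial, so implementations typically approximate Eq. (1); one common simplification (assumed here, not the paper's algorithm) scores each candidate independently and keeps the top $m$, with `score` standing in for $S$:

```python
def select_features(candidates, score, m):
    """Greedy surrogate for Eq. (1): rank candidate subgraphs by an
    individual usefulness score and keep the m highest-scoring ones."""
    return sorted(candidates, key=score, reverse=True)[:m]
```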