Molecular Genetics and Genomics (2018) 293:983–995
model, Restricted Boltzmann Machine for Multiple types of
MiRNA–Disease Association prediction (RBMMMDA), based on
known miRNA–disease associations. Although RBMMMDA performed
well in predicting both miRNA–disease associations and
miRNA–disease association types, the choice of parameters in
the model remains unsolved. Li et al. (2017) developed a model
called Matrix Completion for MiRNA–Disease Association
prediction (MCMDA), which uses the singular value thresholding
(SVT) algorithm to complete the miRNA–disease association
matrix. However, this model cannot be applied to diseases with
no known related miRNAs. Chen et al. further developed a model
called Ranking-based KNN for MiRNA–Disease Association
prediction (RKNNMDA). In this model, a KNN-based ranking
method is applied first. To reduce the bias caused by the
drawbacks of KNN, an SVM is then introduced to re-rank the
previously ranked neighbors. Even with the SVM re-ranking,
bias might still exist in the final scores. Besides, an ideal
way to combine KNN, SVM and weighted voting is still needed
(Chen et al. 2017).
To further exploit the potential associations between miRNAs
and diseases, researchers have proposed deep learning methods
(Chen et al. 2017; Fu and Peng 2017). Chen et al. introduced
DRMDA (Deep Representations-based MiRNA–Disease Association
Prediction), which uses a stacked auto-encoder to obtain
abstract representations of the raw data. An SVM classifier is
stacked on top of the auto-encoder. However, because it needs
negative data, the SVM classifier cannot perform as well as
expected. The parameters in DRMDA are also not easy to
optimize. Fu and Peng (2017) also proposed an
auto-encoder-based method to predict miRNA–disease
associations. They fed the miRNA–miRNA similarity network and
the disease–disease similarity network into separate stacked
auto-encoders to extract features from each similarity
network. The extracted features were then concatenated and fed
into a three-layer fully connected network to calculate the
probability that the miRNA and the disease are associated.
In this study, we propose a network integration approach
called Heterogeneous Network-based MiRNA–Disease Association
prediction (HNMDA) to predict potential miRNA–disease
associations. To obtain high accuracy, we combined the miRNA
similarity and the disease semantic similarity with Gaussian
interaction profile kernel similarities. We first built the
miRNA and disease similarity networks from the respective
similarity data. Then we applied a network diffusion algorithm
called Random Walk with Restart (RWR), which takes the global
structure of the networks into consideration. Finally, we
sought an optimal projection from the miRNA space onto the
disease space, which enables the prediction of potential
miRNA–disease associations according to the geometric
proximity of the mapped vectors. To obtain the optimal
projection, we formulated it as an alternating minimization
problem and applied an inductive matrix completion method to
solve it (Natarajan and Dhillon 2014). This method works well
for diseases without any known related miRNAs. Furthermore,
leave-one-out cross-validation (LOOCV) was used to evaluate
our model, giving an AUC of 0.8394. Moreover, we evaluated
HNMDA with three kinds of case studies. In the first case, we
tested our model on breast neoplasms, esophageal neoplasms and
kidney neoplasms, and 41, 38 and 42 of the top 50 predicted
miRNAs, respectively, were confirmed by experiments. In the
second case, we applied HNMDA to test diseases whose known
associations with miRNAs were all treated as unknown. As a
result, 49 of the top 50 miRNAs predicted to be associated
with hepatocellular carcinoma were experimentally verified. In
the third case, to test the robustness of HNMDA, we ran our
model on the HMDD V1.0 database, and 40 of the top 50
potential lymphoma-related miRNAs were experimentally
confirmed. These results indicate that HNMDA is an accurate
and effective method for predicting miRNA–disease
associations.
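The similarity and diffusion steps described above can be illustrated with a minimal Python sketch. It assumes the commonly used form of the Gaussian interaction profile kernel, exp(−γ‖IP(i) − IP(j)‖²) with γ normalised by the mean squared profile norm, and an RWR update p ← (1 − r)Wp + r·e_i on a column-normalised similarity matrix W. The function names, the toy association matrix and the restart probability r = 0.5 are illustrative assumptions, not settings taken from HNMDA.

```python
import numpy as np


def gip_kernel(profiles, gamma_prime=1.0):
    """Gaussian interaction profile kernel between the rows of `profiles`.

    Each row is the binary association profile of one miRNA (or disease).
    The bandwidth normalisation below is an assumed, commonly used convention.
    """
    profiles = np.asarray(profiles, dtype=float)
    sq_norms = (profiles ** 2).sum(axis=1)
    mean_sq = sq_norms.mean()
    gamma = gamma_prime / mean_sq if mean_sq > 0 else gamma_prime
    # Pairwise squared Euclidean distances between profiles.
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * profiles @ profiles.T
    return np.exp(-gamma * np.maximum(sq_dists, 0.0))


def random_walk_with_restart(similarity, restart_prob=0.5, tol=1e-10, max_iter=1000):
    """Run RWR from every node; row i of the result is the walk restarted at node i."""
    similarity = np.asarray(similarity, dtype=float)
    n = similarity.shape[0]
    col_sums = similarity.sum(axis=0, keepdims=True)
    col_sums[col_sums == 0] = 1.0          # avoid division by zero for isolated nodes
    transition = similarity / col_sums     # column-stochastic transition matrix W
    restart = np.eye(n)                    # column i is the restart vector e_i
    state = restart.copy()
    for _ in range(max_iter):
        new_state = (1.0 - restart_prob) * transition @ state + restart_prob * restart
        converged = np.abs(new_state - state).max() < tol
        state = new_state
        if converged:
            break
    return state.T


# Toy example (hypothetical data): 3 miRNAs x 3 diseases.
toy_associations = np.array([[1, 0, 1],
                             [1, 0, 1],
                             [0, 1, 0]], dtype=float)
mirna_similarity = gip_kernel(toy_associations)
mirna_diffusion = random_walk_with_restart(mirna_similarity, restart_prob=0.5)
```

Each row of `mirna_diffusion` is a probability distribution over all miRNAs, so it reflects the global structure of the network rather than only direct neighbours. In a full pipeline, such diffusion profiles would serve as the features from which the projection between the miRNA and disease spaces is learned.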
Results
Performance evaluation
LOOCV was used to evaluate the accuracy of HNMDA. In LOOCV,
each known miRNA–disease association was left out in turn as
the test sample, and the other known miRNA–disease
associations were used for training. The miRNA–disease pairs
without confirmed associations were taken as candidate pairs.
We compared the score of each test sample with the scores of
all the candidate pairs; if its rank was above a threshold
given in advance, it was considered a successful prediction.
To further evaluate HNMDA, we drew the receiver operating
characteristic (ROC) curve by plotting the true-positive rate
(TPR, sensitivity) versus the false-positive rate (FPR,
1−specificity) at different thresholds. Sensitivity is the
proportion of positive samples that are correctly predicted
among all positive samples, and specificity is the proportion
of negative samples that are correctly predicted among all
negative samples. The area under the ROC curve (AUC) was
calculated to indicate the prediction ability of HNMDA.
AUC = 0.5 indicates that the model performs no better than
random, while AUC = 1 indicates that the model performs
perfectly in predicting miRNA–disease associations. Compared
with RKNNMDA and WBSMDA, whose AUCs are 0.7159 and 0.8030,
respectively, HNMDA achieved an AUC of 0.8394, an improvement
in the accuracy of predicting potential miRNA–disease
associations (see Fig. 1).
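The ROC construction described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions: the scores of the left-out test samples and of the candidate pairs are pooled into two arrays, and the model is not retrained for each left-out association as it would be in the full LOOCV procedure. The function names and the toy scores are hypothetical.

```python
import numpy as np


def roc_points(test_scores, candidate_scores):
    """Return (FPR, TPR) arrays obtained by sweeping a threshold over all scores.

    test_scores      -- scores of left-out known associations (positives)
    candidate_scores -- scores of unconfirmed pairs (treated as negatives)
    """
    pos = np.asarray(test_scores, dtype=float)
    neg = np.asarray(candidate_scores, dtype=float)
    # Distinct score values, from the strictest threshold to the loosest.
    thresholds = np.unique(np.concatenate([pos, neg]))[::-1]
    fpr, tpr = [0.0], [0.0]
    for t in thresholds:
        tpr.append(float((pos >= t).mean()))   # sensitivity
        fpr.append(float((neg >= t).mean()))   # 1 - specificity
    return np.array(fpr), np.array(tpr)


def auc_from_roc(fpr, tpr):
    """Area under the ROC curve by the trapezoidal rule."""
    widths = fpr[1:] - fpr[:-1]
    heights = (tpr[1:] + tpr[:-1]) / 2.0
    return float(np.sum(widths * heights))


# Toy scores (hypothetical): three left-out associations, four candidate pairs.
toy_fpr, toy_tpr = roc_points([0.9, 0.8, 0.4], [0.7, 0.3, 0.2, 0.1])
toy_auc = auc_from_roc(toy_fpr, toy_tpr)
```

With these definitions, the AUC equals the fraction of (test sample, candidate pair) combinations in which the test sample receives the higher score, with ties counted as one half. This is why a value of 0.5 corresponds to random ranking and 1 to perfect ranking.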