A Markov Chain Method Using a BP Neural Network: Detecting Android Malware from System Call Sequences

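This article discusses a novel method for detecting Android malware: the back-propagation neural network on Markov chains from system call sequences (BMSCS). As the Android platform has become widespread, the threat from malware has grown increasingly serious, and traditional detection techniques based on system call subsequences have limitations. These methods rely on common system call subsequences to identify malware, but choosing a suitable subsequence length is difficult: subsequences that are too short carry too little information to describe an application's behaviour, while subsequences that are too long tend to overfit.

BMSCS addresses this problem as follows. First, it treats a single system call sequence as a homogeneous stationary Markov chain. The chain models the statistical regularity of the sequence through the local dependency between adjacent calls, which reduces the analysis of the whole sequence to its state transition probabilities. A back-propagation neural network is then used to learn the patterns in these transition probabilities. The network adjusts its weights on training data so that it can distinguish the behaviour of benign applications from that of malware: forward propagation computes the output for an input vector, and back-propagation then updates the weights to minimise the error between the prediction and the true label. Because the transition probabilities are estimated over the entire sequence, the method uses information from the whole sequence as well as the local dependencies.

The authors train the BMSCS model on a large number of system call sequences and apply it to real Android malware detection. According to this summary, the experimental results show that, compared with traditional methods, BMSCS lowers the false positive rate while maintaining a high detection rate. The article also discusses the robustness and adaptability of the model and possible directions for improvement, such as integrating other contextual information or adopting deep learning techniques. In short, the paper presents a detection framework that combines Markov chains with a back-propagation neural network and avoids the subsequence-length selection problem of traditional methods.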

IET Information Security

Research Article

Back-propagation neural network on Markov chains from system call sequences: a new approach for detecting Android malware with system call sequences

ISSN 1751-8709

Received on 10th December 2014

Revised 29th November 2015

Accepted on 28th December 2015

E-First on 15th March 2016

doi: 10.1049/iet-ifs.2015.0211

www.ietdl.org

Xi Xiao, Zhenlong Wang, Qing Li, Shutao Xia, Yong Jiang

Graduate School at Shenzhen, Tsinghua University, 518055 Shenzhen, People's Republic of China

E-mail: li.qing@sz.tsinghua.edu.cn

Abstract: Android has become the most prevalent mobile system, but meanwhile malware on this platform is widespread. System call sequences are studied to detect malware. However, malware detection with these approaches relies on common system-call subsequences. This is inefficient because it is difficult to decide the appropriate length of the common subsequences. To address this issue, the authors propose a new approach, back-propagation neural network on Markov chains from system call sequences (BMSCS). It treats one system call sequence as a homogeneous stationary Markov chain and applies a back-propagation neural network (BPNN) to detect malware by comparing transition probabilities in the chain. Since transition probabilities from one system call to another in malware are significantly different from those in benign applications, BMSCS can efficiently detect malware by capturing the anomaly in state transitions with the help of the BPNN. The authors evaluate the performance of BMSCS by experiments with real application samples. The experimental results show that the F-score of BMSCS reaches up to 0.982773, which is higher than that of the other methods in the literature.

1 Introduction

Computer security has always been a serious problem. With mobile terminals becoming more and more prevalent, mobile security is increasingly prominent [1–4]. Owing to its growing popularity and openness, Android has attracted the most attention from malicious actors, and a hacker can easily write malicious code and spread it. Malware aimed specifically at Android devices has increased at an alarming rate [5]. Furthermore, Android has unique properties and specific limitations due to its mobile nature, which makes it more difficult to detect malware with conventional techniques. Therefore, it is important to develop a new and efficient approach to detecting Android malware.

Researchers have explored two types of methods to detect Android malware. The first type is static analysis, which aims to recognise signatures of malicious applications without actually executing them [6–9]. Many binary forensic techniques can be used in static analysis, including de-compilation, decryption, pattern matching and so on. Yet these methods cannot detect unknown malware, as any application can have distinct signatures by means of encryption and obfuscation [10]. Therefore, the second type of method, dynamic analysis, has been proposed [11–17]. These approaches can monitor an application's behaviours such as network access, phone calling and message sending at run time.

The dynamic behaviours of an application are ultimately carried out through system call sequences. Therefore, researchers can leverage system call sequences in dynamic analysis [11–14]. Previous mechanisms that use system call sequences to detect malicious applications usually consist of the following steps: first, generating the common subsequences of the system call sequences of malware; second, filtering out the common subsequences that also appear in the system call sequences of benign applications. If any of the remaining common subsequences exists in an application's system call sequence, the application is identified as malware. Nevertheless, these methods are inefficient and cannot achieve a desirable detection rate. The critical limiting factor is the length of the common subsequence. When the common subsequence is too short, it carries too little information to describe the action of an application, yet the action is the key characteristic in identifying malicious applications. When the common subsequence is too long, for instance, longer than 45 system calls, it tends to overfit [14, 18]. Furthermore, it usually takes too much time to obtain the common system-call subsequences. The longer the common subsequence is, the more time it takes (even weeks) [18].
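The subsequence-based pipeline described above can be sketched roughly as follows. This is only an illustration, not the implementation used in [11–14]: it assumes fixed-length contiguous subsequences (n-grams), takes "common" to mean present in every malware trace, and assumes at least one malware trace is supplied. The function names and the toy traces are hypothetical.

```python
from typing import List, Set, Tuple

Subseq = Tuple[str, ...]


def ngrams(seq: List[str], n: int) -> Set[Subseq]:
    """Return the set of contiguous length-n subsequences of seq."""
    return {tuple(seq[i:i + n]) for i in range(len(seq) - n + 1)}


def build_signatures(malware: List[List[str]],
                     benign: List[List[str]],
                     n: int) -> Set[Subseq]:
    """Keep subsequences shared by all malware traces and absent from benign ones."""
    # Step 1: subsequences common to every malware trace (assumed definition).
    common = set.intersection(*(ngrams(s, n) for s in malware))
    # Step 2: filter out anything that also occurs in a benign trace.
    for s in benign:
        common -= ngrams(s, n)
    return common


def is_malware(seq: List[str], signatures: Set[Subseq], n: int) -> bool:
    """Flag a trace if it contains any remaining common subsequence."""
    return bool(ngrams(seq, n) & signatures)


# Toy traces with made-up system call names.
malware_traces = [["open", "read", "socket", "sendto", "close"],
                  ["open", "socket", "sendto", "write"]]
benign_traces = [["open", "read", "write", "close"],
                 ["socket", "connect", "close"]]

signatures = build_signatures(malware_traces, benign_traces, n=2)
suspicious = is_malware(["open", "socket", "sendto", "close"], signatures, n=2)
harmless = is_malware(["open", "read", "write", "close"], signatures, n=2)
```

The parameter `n` is exactly the length the text identifies as hard to choose: a small `n` leaves few distinctive signatures after filtering, while a large `n` yields signatures that match only the training traces.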

To overcome the above shortcoming, in this paper we put forward a new approach for Android malware detection, back-propagation neural network on Markov chains from system call sequences (BMSCS). The Markov chain has been employed in the field of network security [19, 20] and the Markov logic network has been adopted in Android malware detection [21]. Inspired by Xiao et al. [22], who applied homogeneous stationary Markov chains to masquerade detection, we introduce this model into mobile malware detection. Based on the fact that there are specific correlations between adjacent system calls (e.g. first memory access, second screen display, then user input requirement), we treat the system call sequence activated by one application as one Markov chain. To keep the time complexity low, we only take two-state dependency into consideration, that is, the dependency between a system call and the one immediately before it. Each distinct system call corresponds to one unique state in the chain. There are 196 system calls in Android 4.0.4, thus the number of states is 196. In [19, 20], the numbers of states in the Markov chains are relatively small. However, there are 196 states in our method. Hence, the problem cannot be solved only by directly analysing each element in the matrices. Some classifiers are required to further process these matrices.
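As a rough illustration of the model described above, the transition probabilities of such a chain can be estimated by counting adjacent pairs of system calls and normalising each row. The sketch below is not the authors' code. It assumes that system calls are already encoded as integers in `range(n_states)`, that a state never observed as a source gets an all-zero row, and that the matrix is flattened row by row; the function names are hypothetical.

```python
from typing import List


def transition_matrix(seq: List[int], n_states: int) -> List[List[float]]:
    """Estimate P[a][b] = count(a followed by b) / count(a followed by anything)."""
    counts = [[0] * n_states for _ in range(n_states)]
    # Only adjacent pairs are counted: each call depends on its predecessor alone.
    for a, b in zip(seq, seq[1:]):
        counts[a][b] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States that never occur as a source keep an all-zero row (assumption).
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


def flatten(matrix: List[List[float]]) -> List[float]:
    """Concatenate the rows into one feature vector of n_states * n_states entries."""
    return [p for row in matrix for p in row]


# Toy trace over 3 states instead of 196.
P = transition_matrix([0, 1, 2, 0, 1, 1], n_states=3)
# P == [[0.0, 1.0, 0.0], [0.0, 0.5, 0.5], [1.0, 0.0, 0.0]]
vector = flatten(P)
```

With 196 states the flattened vector has 196 × 196 = 38,416 entries, which is why inspecting individual matrix elements by hand is impractical and a classifier is needed.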

In our scheme, we first calculate the transition probability matrices by statistical methods and then convert them into vectors of 196 × 196 dimensions. Our key assumption is that the probabilities of transition from one system call to another are significantly different between malicious applications and benign ones. According to this assumption, the above vectors are fed to a classifier, the artificial neural network (ANN), to discriminate malware from benign applications on Android. The classification process consists of a training phase and a detection phase. During the training phase, the ANNs (neural networks, for short) are trained by the back-propagation algorithm and are therefore called back-propagation neural networks (BPNNs). Finally, we conduct experiments on the malware from [23] and benign applications downloaded from Google. The results indicate that the F-score of our method reaches up to 0.982773, higher than those of [8, 9, 14].
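The training and detection phases can be illustrated with a very small back-propagation network. The sketch below is not the network evaluated in the paper. The single hidden layer, its size, the sigmoid units, the squared-error loss, the learning rate and the toy 4-dimensional inputs (standing in for the 196 × 196 vectors) are all assumptions made for illustration. The excerpt does not define the F-score, so the usual harmonic mean of precision and recall is assumed.

```python
import math
import random
from typing import List, Tuple


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class BPNN:
    """One-hidden-layer network trained by stochastic back-propagation."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        # The last entry of every weight row is the bias weight.
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                   for _ in range(n_hidden)]
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]

    def forward(self, x: List[float]) -> Tuple[List[float], float]:
        h = [sigmoid(sum(w * v for w, v in zip(row, x + [1.0])))
             for row in self.w1]
        y = sigmoid(sum(w * v for w, v in zip(self.w2, h + [1.0])))
        return h, y

    def train(self, data: List[Tuple[List[float], int]],
              epochs: int = 2000, lr: float = 0.5) -> None:
        for _ in range(epochs):
            for x, t in data:
                h, y = self.forward(x)
                # Error signal at the output (squared error, sigmoid unit).
                d_out = (y - t) * y * (1.0 - y)
                # Propagate the error back through the pre-update output weights.
                d_hid = [d_out * self.w2[j] * h[j] * (1.0 - h[j])
                         for j in range(len(h))]
                for j, v in enumerate(h + [1.0]):
                    self.w2[j] -= lr * d_out * v
                for j, row in enumerate(self.w1):
                    for i, v in enumerate(x + [1.0]):
                        row[i] -= lr * d_hid[j] * v

    def predict(self, x: List[float]) -> int:
        """Detection phase: 1 means malware, 0 means benign."""
        return 1 if self.forward(x)[1] >= 0.5 else 0


def f_score(truth: List[int], pred: List[int]) -> float:
    """Harmonic mean of precision and recall for the positive class."""
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    if tp == 0:
        return 0.0
    precision, recall = tp / (tp + fp), tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Toy flattened 2 x 2 transition matrices; label 1 = malware, 0 = benign.
data = [([0.9, 0.1, 0.8, 0.2], 0), ([0.8, 0.2, 0.9, 0.1], 0),
        ([0.1, 0.9, 0.2, 0.8], 1), ([0.2, 0.8, 0.1, 0.9], 1)]
net = BPNN(n_in=4, n_hidden=3)
net.train(data)
preds = [net.predict(x) for x, _ in data]
score = f_score([t for _, t in data], preds)
```

The toy classes differ only in their transition probabilities, mirroring the key assumption of the scheme. Scoring on the training samples, as done here, only checks that the sketch runs; a real evaluation would use held-out applications.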

IET Inf. Secur., 2017, Vol. 11 Iss. 1, pp. 8-15
