Deep Learning in Encrypted Traffic Classification: A Survey


"这篇PDF文件名为《Deep Learning for Encrypted Traffic Classification An Overview》，作者是Shahbaz Rezaei，文章提交给了IEEE通信杂志。本文主要探讨了深度学习在加密流量分类中的应用，并提供了概述。文章指出，传统的流量分类方法如基于端口、数据包检查和经典机器学习的方法在面对互联网流量的显著变化，尤其是加密流量增加的情况下，其准确性已经下降。随着深度学习技术的发展，研究人员开始探索这些方法在流量分类任务中的应用，并取得了高精度的结果。本文提出了一种通用的深度学习为基础的流量分类框架，并讨论了相关的开放问题、挑战以及未来的机会。关键词包括流量分类、深度学习、加密流量等。" 以下是详细的知识点： 1. **流量分类**：流量分类是网络管理中的关键任务，用于识别网络流量的类型，例如，区分VoIP、视频流、网页浏览等。它对服务质量（QoS）提供、计费、网络安全等方面至关重要。 2. **传统方法的局限性**：传统的流量分类方法，如基于端口的分类、数据包检查（payload inspection）和经典机器学习算法，由于互联网流量的复杂性和加密趋势，其性能已经逐渐降低。特别是随着HTTPS和其他加密协议的广泛使用，这些方法无法解析加密内容，导致分类准确性下降。 3. **深度学习的应用**：深度学习因其强大的模式识别能力，被引入到流量分类中。通过神经网络模型，深度学习可以学习和理解复杂的流量特征，即使在数据加密的情况下也能实现高精度的分类。 4. **深度学习方法**：常见的深度学习方法在流量分类中的应用包括卷积神经网络（CNN）、循环神经网络（RNN）、长短时记忆网络（LSTM）、自注意力机制（Self-Attention）等，这些模型能够处理时间序列数据并提取高级特征。 5. **深度学习框架**：文章提出了一个通用的深度学习框架，该框架可能包括数据预处理、模型选择、训练和评估等步骤，为研究人员提供了指导，以便他们在流量分类任务中有效利用深度学习。 6. **开放问题与挑战**：尽管深度学习带来了高精度，但依然存在一些挑战，如模型解释性、数据隐私保护、模型泛化能力和实时性能优化等。此外，加密流量的动态性和多样性也增加了分类难度。 7. **未来机遇**：尽管面临挑战，但深度学习在流量分类中的应用仍有巨大的潜力。例如，通过改进模型架构、引入新型学习策略或结合其他网络分析技术，可以进一步提高分类效率和准确性。 8. **索引术语**：文章的关键术语包括“流量分类”，强调了本文关注的核心主题；“深度学习”表示所使用的主要技术手段；“加密流量”则指出了当前网络环境中的主要挑战。总结来说，这篇论文深入研究了深度学习如何在加密流量环境下进行有效的流量分类，同时也指出了一系列的未来研究方向和待解决的问题。这对于网络管理和安全领域的研究人员具有重要的参考价值。

SUBMITTED TO IEEE COMMUNICATIONS MAGAZINE

Deep Learning for Encrypted Traffic Classification: An Overview

Shahbaz Rezaei, Member, IEEE, and Xin Liu, Senior Member, IEEE

Abstract—Traffic classification has been studied for two decades and applied to a wide range of applications, from QoS provisioning and billing in ISPs to security-related applications in firewalls and intrusion detection systems. Port-based, data packet inspection, and classical machine learning methods have been used extensively in the past, but their accuracy has declined due to dramatic changes in Internet traffic, particularly the increase in encrypted traffic. With the proliferation of deep learning methods, researchers have recently investigated these methods for the traffic classification task and reported high accuracy. In this article, we introduce a general framework for deep-learning-based traffic classification. We present commonly used deep learning methods and their application in traffic classification tasks. Then, we discuss open problems and their challenges, as well as opportunities for traffic classification.

Index Terms—Traffic classification, deep learning, machine learning.

I. INTRODUCTION

Traffic classification, the categorization of network traffic into appropriate classes, is important to many applications, such as quality of service (QoS) control, pricing, resource usage planning, malware detection, and intrusion detection. Because of its importance, many different approaches have been developed over the years to accommodate the diverse and changing needs of different application scenarios. In particular, new advances in communications, including encryption and port obfuscation, pose additional challenges for traffic classification.

S. Rezaei and X. Liu are with the Computer Science Department, University of California, Davis, USA (e-mails: srezaei@ucdavis.edu and liu@cs.ucdavis.edu).

Manuscript received April 19, 2005; revised August 26, 2015.

Traffic classification techniques have evolved significantly over time. The first and easiest approach is to use port numbers. However, its accuracy has been decreasing because newer applications either use well-known port numbers to disguise their traffic or do not use standard registered port numbers. Despite its inaccuracy, the port number is still widely used in practice, either alone or in tandem with other features. The next generation of traffic classifiers, relying on payload or data packet inspection (DPI), focuses on finding patterns or keywords in data packets. These methods are only applicable to unencrypted traffic and have high computational overhead. As a result, a new generation of methods, based on flow statistics, emerged. These methods rely on statistical or time-series features, which enable them to handle both encrypted and unencrypted traffic. They usually employ classical machine learning (ML) algorithms, such as random forest (RF) and k-nearest neighbor (KNN). However, their performance depends heavily on human-engineered features, which limits their generalizability.
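To make the flow-statistics approach concrete, the following is a minimal sketch, not the method of any particular paper. It assumes a flow is available as a list of packet sizes and a list of arrival timestamps; the feature set, the function names `flow_features` and `knn_predict`, and the toy flows are all illustrative assumptions. Each flow is summarized with a few hand-picked statistics, and a new flow is labeled by majority vote among its k nearest neighbors.

```python
import math
import statistics
from collections import Counter


def flow_features(packet_sizes, timestamps):
    """Summarize one flow as a small vector of hand-picked statistics."""
    # Inter-arrival times between consecutive packets.
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return [
        statistics.mean(packet_sizes),           # mean packet size (bytes)
        statistics.pstdev(packet_sizes),         # spread of packet sizes
        statistics.mean(gaps) if gaps else 0.0,  # mean inter-arrival time (s)
        float(len(packet_sizes)),                # number of packets
    ]


def knn_predict(train, query, k=3):
    """Label `query` by majority vote among its k nearest training flows.

    `train` is a list of (feature_vector, label) pairs; distance is Euclidean.
    """
    nearest = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Toy, made-up flows: (packet sizes in bytes, arrival times in seconds, label).
labelled_flows = [
    ([160, 160, 160, 160], [0.00, 0.02, 0.04, 0.06], "voip"),
    ([172, 172, 172, 172, 172], [0.00, 0.02, 0.04, 0.06, 0.08], "voip"),
    ([1500, 1500, 1500, 1460], [0.000, 0.001, 0.002, 0.003], "download"),
    ([1500, 1480, 1500, 1500, 1500], [0.000, 0.001, 0.002, 0.003, 0.004], "download"),
]
train = [(flow_features(sizes, times), label) for sizes, times, label in labelled_flows]

# Classify a flow whose contents are never inspected, only its statistics.
unknown = flow_features([165, 165, 165, 165], [0.00, 0.02, 0.04, 0.06])
print(knn_predict(train, unknown))
```

The features are left unscaled to keep the sketch short; in practice they would normally be normalized so that no single feature dominates the distance. The hand-picked feature list is exactly the human-engineered part that the text identifies as limiting generalizability.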

Deep learning obviates the need for a domain expert to select features because it learns features automatically through training. This characteristic makes deep learning a highly desirable approach for traffic classification, especially when new classes constantly emerge and the patterns of old classes evolve. Another important characteristic of deep learning is that it has a considerably higher learning capacity than traditional ML methods, and thus can learn highly complicated patterns. Combining these two characteristics, deep learning, as an end-to-end approach, is capable of learning the non-linear relationship between the raw input and the corresponding output without the need to break the problem into the smaller sub-problems of feature selection and classification.
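As an illustration of what "raw input" can look like, the sketch below turns the leading bytes of a packet payload into a fixed-length numeric vector that a neural network could consume. This is an assumption made for illustration rather than a representation prescribed by the article: the function name `bytes_to_input`, the default length of 16 bytes, and the zero-padding choice are all hypothetical.

```python
from typing import List


def bytes_to_input(payload: bytes, length: int = 16) -> List[float]:
    """Truncate or zero-pad `payload` to `length` bytes, scaled to [0, 1]."""
    clipped = payload[:length]
    padded = clipped + bytes(length - len(clipped))  # bytes(n) is n zero bytes
    return [byte / 255 for byte in padded]


# Made-up payload bytes, used only to show the shape of the result.
vector = bytes_to_input(b"\x16\x03\x01\x02\x00", length=8)
print(len(vector), vector[:3])
```

No statistics are computed by hand here: the model receives the bytes themselves, and any useful features would have to be learned during training.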

Recent work has demonstrated the efficacy of deep learning (DL) methods in traffic classification, in particular for encrypted traffic. To achieve this, DL requires sufficient labeled data and adequate computation power. In this article, we give an overview of a general framework for the (encrypted) traffic classification task. We provide general guidelines for classification tasks, including data collection and cleaning, feature selection, and model selection. Moreover, we discuss deep learning techniques and how they have been applied to the traffic classification task. Finally, we discuss open problems and future directions.

II. OVERVIEW OF CLASSIFICATION PROBLEMS ON COMPUTER NETWORK

Fig. 1 illustrates a general framework for traffic classification, comprising seven steps. Most existing work adopts all or part of the framework. We discuss the first four steps in this section and the last three in the next section, with a focus on deep-learning-based approaches.

A. Problem Definition

The first step in building a network traffic classifier is to clearly define the goal of classification. Typical goals include QoS provisioning, resource usage planning, billing system customization, intrusion detection, and malware detection. Depending on the goal, traffic classes can be defined by 1) protocol (e.g., UDP, TCP, FTP, or HTTP), 2) application (e.g., Skype, WeChat, or Torrent), 3) traffic type (e.g., browsing, downloading, or video chat), 4) website, 5) user action (e.g., posting a comment or sending a voice message), 6) operating system, 7) browser, and so on. Hence, the goal is to label each flow with the corresponding traffic classes. A flow is usually

arXiv:1810.07906v2 [cs.NI] 26 Jan 2019
