flow feature selection architecture was composed of a flow feature manager model and a flow feature selector model. The
former was used to obtain additional flow features, whereas the latter was responsible for analyzing and selecting flow features.
As described in the introduction, the distinguishing features of SDN are helpful for application classification.
Some researchers [9-11] have considered utilizing the SDN architecture for network application classification.
Prasad and Kataoka [9] designed an application-aware multipath packet forwarding mechanism by combining machine
learning and SDN. To achieve application awareness, the C4.5 decision tree algorithm was used to build the application
classifier, and the machine learning–based trainer and classifier were integrated in the controller because of its global view
and logically centralized control capability. Amaral et al [10] presented a novel machine learning–based data collection
and traffic classification architecture, which could be applied to both legacy networks and SDN networks. In this architecture,
the controller collected the flow statistics from the switches and then used a supervised machine learning algorithm to
classify traffic. Unlike most research works on application classification, Wang et al [11] devised a QoS-aware traffic
classification framework in SDN that classifies traffic and satisfies the QoS requirements of network applications at the
same time. In this framework, deep packet inspection technology was used to detect elephant flows, and a
semisupervised machine learning algorithm was used to achieve QoS-aware traffic classification through a mapping
function. Specifically, the mapping function mapped each application flow to a predefined QoS class according to its
flow features.
Although the aforementioned research works take the SDN architecture into account for application classification,
their classification methods all rely on traditional shallow learning networks, which cannot efficiently deal with massive
data because of their limited feature learning ability. Compared with shallow learning networks, deep learning networks
have more powerful feature learning ability and can extract deeper features from data through training [25]. Deep
neural networks have been widely used for intrusion detection, traffic flow prediction, and application classification [26-30].
Salama et al [26] proposed a hybrid intelligent anomaly intrusion detection scheme by jointly using an RBM-based deep
belief network (DBN) and SVM. In this intrusion detection scheme, the deep belief network was used to reduce the feature
dimensionality, and SVM was used to determine whether the network traffic is normal. Similarly, Fiore et al [27]
directly applied an RBM classifier to network anomaly detection by using a semisupervised learning method. To deal
with massive network traffic, Lv et al [28] proposed a novel traffic flow prediction method on the basis of deep learning. In
this method, a stacked autoencoder model was used to obtain flow features because it could extract inherent features from
the data set layer by layer, from the lowest level to the highest level. To improve the forecasting accuracy of network traffic
and solve the congestion problem, Yang et al [29] presented a stacked autoencoder Levenberg-Marquardt model. The
proposed deep network was used to design an optimized structure and obtain flow features by means of the greedy layerwise
training algorithm. Huang et al [30] designed a novel deep network architecture for traffic flow prediction in transportation
research. The deep neural network was composed of a DBN and a multitask regression layer. The former was designed to
learn effective features with the greedy layerwise training algorithm, whereas the latter was used to predict the traffic flow.
Inspired by the aforementioned research works, in this paper, we propose an application classification framework by
combining SDN and deep learning. Taking advantage of its powerful computing capability, we use the controller to deal
with the massive network traffic and flow statistics. We construct a hybrid deep learning network model, which is composed
of a stacked autoencoder and a softmax regression layer, to extract flow features and build the application classifier.
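To make the structure of such a hybrid model concrete, the following minimal sketch shows only the inference (forward) pass of a stack of encoder layers followed by a softmax layer. It is an illustration under stated assumptions, not the implementation used in this paper: the sigmoid activation, the layer sizes, the random placeholder weights, and all function names (`init_layer`, `classify`, and so on) are hypothetical, and training is omitted entirely.

```python
import math
import random


def sigmoid(x):
    """Logistic activation, written to avoid overflow for large |x|."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def dense(inputs, weights, biases):
    """Affine map: one output per row of `weights`."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def init_layer(n_in, n_out, rng):
    """Small random weights and zero biases (placeholders for trained values)."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def classify(features, encoder_layers, softmax_layer):
    """Pass flow features through the stacked encoders, then the softmax layer.

    Returns the index of the most probable class and the class probabilities.
    """
    hidden = features
    for weights, biases in encoder_layers:
        hidden = [sigmoid(z) for z in dense(hidden, weights, biases)]
    probs = softmax(dense(hidden, *softmax_layer))
    return probs.index(max(probs)), probs
```

For example, with six input features, two encoder layers of sizes 4 and 3, and five application classes, `classify(features, [init_layer(6, 4, rng), init_layer(4, 3, rng)], init_layer(3, 5, rng))` returns a class index and a probability vector that sums to 1. With untrained weights the prediction itself is meaningless; the sketch only shows how the encoder stack feeds the softmax layer.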
3 FRAMEWORK DESIGN
In this section, we propose a novel framework to collect network traffic, process flow statistics, and classify network
applications by using SDN and deep learning technology. Its main goal is to collect and process flow statistics information
and build an application classifier that divides different types of application flows into different categories.
A flow can be defined in several ways. In this paper, we define a flow at the socket level by a quintuple, that is, ⟨source
IP address, destination IP address, transport protocol (eg, Transmission Control Protocol (TCP) and User Datagram
Protocol (UDP)), source port number, destination port number⟩.
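The quintuple definition can be sketched as a simple key type that groups packet records into flows. This is a minimal illustration only: the field names, the dictionary-based packet record with a `length` field, and the function `group_by_flow` are assumptions introduced here, not part of the proposed framework.

```python
from collections import defaultdict
from typing import NamedTuple


class FlowKey(NamedTuple):
    """Quintuple identifying a flow (field names are illustrative)."""
    src_ip: str
    dst_ip: str
    protocol: str  # transport protocol, e.g. "TCP" or "UDP"
    src_port: int
    dst_port: int


def group_by_flow(packets):
    """Group packet records by quintuple and accumulate simple statistics.

    Each packet is assumed to be a dict holding the five key fields plus a
    'length' field in bytes (a hypothetical record layout).
    """
    flows = defaultdict(lambda: {"packets": 0, "bytes": 0})
    for pkt in packets:
        key = FlowKey(pkt["src_ip"], pkt["dst_ip"], pkt["protocol"],
                      pkt["src_port"], pkt["dst_port"])
        flows[key]["packets"] += 1
        flows[key]["bytes"] += pkt["length"]
    return dict(flows)
```

In this sketch the key is direction sensitive, so the two directions of one connection appear as two separate flows. Whether both directions should be merged is a design choice that the definition above leaves open.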
Our proposed application classification framework is shown in Figure 1. As depicted in Figure 1, the control plane
consists of four important function modules (ie, network monitoring module, flow statistic module, data processing module,
and deep learning classifier module) and a database, which stores flow feature information. The four function modules are
described in detail as follows.
Network monitoring module: It is responsible for gathering and analyzing network traffic information from the underlying
switches to obtain useful flow information (eg, protocol type, source port, and destination port number of the flow).