collect data related to the performance of the network links on every interface.
Our model’s predictions of bandwidth usage in 15 seconds rarely exceed an error
rate of 3%.
1.3 Thesis Organization
The remainder of the thesis is organized as follows:
Chapter 2 introduces concepts essential to intrusion detection. It defines what a
computer attack is, what an intrusion detection system is, and provides a historical
perspective of the field. Different types of IDSs are detailed, with their strengths and
weaknesses. The contributions of machine learning in this field are explained and the
different specific datasets discussed in this dissertation are presented. Finally, the most
commonly used metrics are defined.
Chapter 3 presents existing work related to intrusion detection using machine learning
algorithms. The different models are detailed with a short presentation of how they
work. Approaches and results are successively presented, then compared in a common
section to determine the best techniques. Finally, the problems identified are listed
along with ideas on how to improve these different points. The insights gathered in this
chapter are used in the design of the IDSs in the following chapters.
Chapter 4 presents two solutions using machine learning models to classify attacks
on the two most popular datasets in intrusion detection. Data augmentation is used to
rebalance these datasets and to improve detection of the rarest attacks. Different models
are then trained and optimized to obtain the best quality of detection. Finally, they are
combined using a specific rule to improve their accuracy.
Chapter 5 describes two methods to improve two aspects of intrusion detection.
First, the signature databases of misuse-based IDSs can be updated more effectively
by generating signatures from anomalies. A hybrid IDS could then self-populate
its own signature database. Secondly, networks where IDSs are deployed rarely provide
labeled datasets containing attacks. Transfer learning is studied to train models on
labeled datasets and then transfer these models to real-life networks that do not contain
attacks.
Chapter 6 presents a method of intrusion detection without the need for a labeled
dataset (unsupervised learning). This technique performs anomaly detection by learning
the behavior of the protocol headers of the monitored network. The scores obtained by
the different protocols in a single packet are aggregated to produce the packet anomaly
score. A succession of abnormal packets is considered an indicator of an attack.
Chapter 7 focuses on denial of service attacks, and more generally on network
congestion problems. Models are trained to predict the bandwidth consumption between
different links in a simulated network. This method works in real time in combination
with Software-Defined Networking (SDN), allowing congestion problems to be corrected
before they occur.
Chapter 8 concludes the thesis by summarizing the main points of the dissertation.
The relevance of machine learning for intrusion detection and future work are discussed.
Maxime LABONNE 3