Journal of Applied and Physical Sciences
Details
Journal ISSN: 2414-3103
Article DOI: https://doi.org/10.20474/japs-5.1.4
Received: 12 January 2019
Accepted: 12 February 2019
Published: 28 February 2019
A distributed intrusion detection system based on Apache Spark and scikit-learn library


Mohamed Seghire Othman Djediden, Hicham Reguieg, Zoulikha Mekkakia Maaza

Article first published online: 2019

Abstract

With the explosion of data generated in computer networks, the main task of Intrusion Detection Systems (IDS) has become more complicated. Most existing IDS are deployed on a single server and do not support distributed processing. These systems encounter several problems as soon as the volume of data to be analysed becomes larger and more varied. The main goal of this paper is to create an intrusion detection system that can analyse massive data quickly and with high precision while supporting distributed data processing. This type of data processing makes our system more available and fault-tolerant. In our work, we combine the Apache Spark framework with well-known feature selection methods and machine learning algorithms from Sk-dist, an improved version of the Scikit-learn library. The UNSW-NB15 dataset was used to assess the performance of our system. Comparisons with other existing work show that our approach performs much better in terms of accuracy, feature reduction and, above all, fault tolerance.