Author Identifier

Priya Naran Rabadia

Date of Award


Document Type



Edith Cowan University

Degree Name

Doctor of Philosophy


School of Science

First Supervisor

Professor Craig Valli

Second Supervisor

Dr Zubair Baig

Third Supervisor

Professor Andrew Woodward

Fourth Supervisor

Dr Peter Hannay


This thesis investigates a precise and efficient pattern-based intrusion detection approach by extracting patterns from sequential adversarial commands. As organisations are further placing assets within the cyber domain, mitigating the potential exposure of these assets is becoming increasingly imperative. Machine learning is the application of learning algorithms to extract knowledge from data to determine patterns between data points and make predictions. Machine learning algorithms have been used to extract patterns from sequences of commands to precisely and efficiently detect adversaries using the Secure Shell (SSH) protocol. Seeing as SSH is one of the most predominant methods of accessing systems it is also a prime target for cyber criminal activities.

For this study, deep packet inspection was applied to data acquired from three medium interaction honeypots emulating the SSH service. Feature selection was used to enhance the performance of the selected machine learning algorithms. A pre-processing procedure was developed to organise the acquired datasets to present the sequences of adversary commands per unique SSH session. The preprocessing phase also included generating a reduced version of each dataset that evenly and coherently represents their respective full dataset. This study focused on whether the machine learning algorithms can extract more precise patterns efficiently extracted from the reduced sequence of commands datasets compared to their respective full datasets. Since a reduced sequence of commands dataset requires less storage space compared to the relative full dataset. Machine learning algorithms selected for this study were the Naïve Bayes, Markov chain, Apriori and Eclat algorithms

The results show the machine learning algorithms applied to the reduced datasets could extract additional patterns that are more precise, compared to their respective full datasets. It was also determined the Naïve Bayes and Markov chain algorithms are more efficient at processing the reduced datasets compared to their respective full datasets. The best performing algorithm was the Markov chain algorithm at extracting more precise patterns efficiently from the reduced datasets. The greatest improvement in processing a reduced dataset was 97.711%. This study has contributed to the domain of pattern-based intrusion detection by providing an approach that can precisely and efficiently detect adversaries utilising SSH communications to gain unauthorised access to a system.