Machine Learning-Driven Network Anomaly Detection

An Empirical Study Using Isolation Forest and Random Forest on the UNSW-NB15 Benchmark

Authors

Keywords:

network intrusion detection, machine learning, Isolation Forest, Random Forest, SMOTE, UNSW-NB15, anomaly detection, class imbalance, MrBipinShrestha

Abstract

Detecting malicious activity within high-volume network traffic is a persistent challenge in operational cybersecurity. Rule-based intrusion detection systems are ineffective against attacks that fall outside catalogued signatures, while purely statistical methods tend to produce false alarm rates that undermine analyst efficiency. This study presents a reproducible, seven-step machine learning pipeline designed to address both shortcomings through complementary detection layers. The pipeline was evaluated on the UNSW-NB15 benchmark dataset, comprising 175,341 labelled training records and 82,332 test records across 45 network flow features. After categorical encoding, z-score normalisation, and Synthetic Minority Over-sampling Technique (SMOTE) resampling to correct a 2.13:1 Attack/Normal class imbalance in the training set, an Isolation Forest model was applied for label-free anomaly screening. A Random Forest binary classifier trained on the resampled data served as the supervised detection layer. On the held-out test partition, the classifier achieved 88.3% overall accuracy, an Attack recall of 0.98, and an area under the Receiver Operating Characteristic curve (AUC-ROC) of 0.9794. Feature importance analysis identified connection-state TTL features (ct_state_ttl, sttl) and flow-rate statistics (rate, sload, dload) as the principal discriminators between benign and malicious traffic. The findings indicate that a two-stage hybrid detection architecture, when paired with appropriate class balancing, delivers operationally relevant performance, particularly on the Attack recall metric that matters most for intrusion detection deployments.
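The preprocessing steps named in the abstract (z-score normalisation followed by SMOTE resampling of the under-represented Normal class) can be illustrated with a minimal standard-library sketch. It is not the study's implementation: the abstract does not say which library or parameters were used, the helper names zscore_columns and smote are hypothetical, and the two-column toy rows are invented values standing in for features such as sttl and rate. The sketch also computes the scaling statistics over all rows for brevity; a real pipeline would normally fit them on the training partition only.

```python
import math
import random
from statistics import mean, pstdev


def zscore_columns(rows):
    """Standardise each column to zero mean and unit (population) variance."""
    cols = list(zip(*rows))
    mus = [mean(c) for c in cols]
    sigmas = [pstdev(c) or 1.0 for c in cols]  # constant column: avoid dividing by zero
    return [[(v - m) / s for v, m, s in zip(row, mus, sigmas)] for row in rows]


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority neighbours (the core SMOTE idea)."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: math.dist(base, p))[:k]
        target = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from base to target
        synthetic.append([b + gap * (t - b) for b, t in zip(base, target)])
    return synthetic


# Invented toy rows (hypothetical values), roughly 2:1 Attack to Normal.
attack = [[254, 9000.0], [254, 12000.0], [62, 8000.0],
          [254, 15000.0], [254, 11000.0], [62, 9500.0]]
normal = [[31, 40.0], [31, 75.0], [29, 20.0]]

scaled = zscore_columns(attack + normal)
attack_s, normal_s = scaled[:len(attack)], scaled[len(attack):]

# Oversample Normal until both classes have the same number of rows.
synthetic = smote(normal_s, n_new=len(attack_s) - len(normal_s), k=2)
normal_balanced = normal_s + synthetic

print(len(attack_s), len(normal_balanced))  # 6 6
```

Because every synthetic row is a convex combination of two existing Normal rows, the resampled class stays inside the region already occupied by real Normal traffic. The balanced set would then be passed to the supervised classifier, while the test partition is left unresampled.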

Published

2026-06-30

How to Cite

Shrestha, B. (2026). Machine Learning-Driven Network Anomaly Detection: An Empirical Study Using Isolation Forest and Random Forest on the UNSW-NB15 Benchmark. Australian Journal of Wireless Technologies, Mobility and Security, 15(1). Retrieved from https://ausjournal.com/index.php/j/article/view/92