AE, VAE and PCA anomaly detection in operational technology infrastructure datasets

This repository contains the Jupyter notebooks and Python code used for my Master's thesis, "Anomaly Detection in Operational Technology Infrastructures Using Artificial Neural Networks". The models are implemented with TensorFlow and Keras. The thesis compared the following models:

  • Auto-Encoders (AE)
  • Variational Auto-Encoders (VAE)
  • Principal Component Analysis (PCA)

The thesis concluded that the original Auto-Encoder model performed best at anomaly detection on operational technology infrastructure datasets.
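For readers new to the topic, models such as the three above are commonly used for anomaly detection by scoring each sample with its reconstruction error and flagging samples whose error exceeds a threshold. The sketch below illustrates only that scoring step in plain Python. It is not taken from the notebooks: the function names, the mean-plus-`k`-standard-deviations threshold rule, and the error values are all assumptions made for illustration, and the thesis may have chosen its thresholds differently.

```python
import statistics


def reconstruction_error(original, reconstructed):
    """Mean squared error between one sample and its reconstruction."""
    return sum((o - r) ** 2 for o, r in zip(original, reconstructed)) / len(original)


def fit_threshold(normal_errors, k=3.0):
    """Assumed rule: mean + k * std of the errors seen on attack-free data."""
    return statistics.mean(normal_errors) + k * statistics.pstdev(normal_errors)


def flag_anomalies(errors, threshold):
    """Return True for every sample whose error exceeds the threshold."""
    return [e > threshold for e in errors]


# Hypothetical reconstruction errors, as any of the models might produce them.
normal_errors = [0.010, 0.012, 0.009, 0.011, 0.013, 0.010]
threshold = fit_threshold(normal_errors)

test_errors = [0.011, 0.250, 0.012]
flags = flag_anomalies(test_errors, threshold)
print(flags)
```

In practice the errors would come from comparing each input window with the output of a trained AE, VAE, or PCA projection, and `k` would be tuned on validation data.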

The models were evaluated on the following datasets:

  • Water Distribution (WADI)
  • Secure Water Treatment (SWaT)
  • Battle of Attack Detection Algorithms (BATADAL)

These datasets can be requested from iTrust, Centre for Research in Cyber Security, at the following URL: https://itrust.sutd.edu.sg/itrust-labs_datasets/dataset_info/

The credit card fraud dataset was used for initial testing, to ensure that the different models worked correctly. This dataset is available at: https://www.kaggle.com/mlg-ulb/creditcardfraud

Although this repository contains almost all of the code used for my research, I did not test all of it before pushing it to GitHub: I wrote my thesis two years ago and no longer have an up-to-date machine learning development environment. Most of the code should still work and can serve as a starting point for developing your own models.