This project detects cyberattacks in network traffic using Machine Learning (Random Forest/XGBoost) and automates response actions using Reinforcement Learning (DQN).
- data/raw/: Directory for storing the CICIDS2017 CSV datasets.
- src/: Source code for the pipeline.
  - preprocessing.py: Data cleaning, encoding, normalization, and balancing.
  - detection.py: Training and evaluation of the ML detection model.
  - response.py: RL environment and DQN agent for automated response.
- models/: Directory where trained models (rf_model.pkl, dqn_model.pth) are saved.
- app.py: Streamlit web application for demonstration.
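
As a rough illustration of the cleaning and normalization steps attributed to preprocessing.py, the sketch below drops rows containing missing or infinite values and min-max scales each feature column. It uses only the standard library. The function names `drop_invalid_rows` and `min_max_scale` and the sample rows are hypothetical and not taken from the project's source, which may well rely on pandas or scikit-learn instead.

```python
import math


def drop_invalid_rows(rows):
    """Keep only rows whose values are all finite numbers."""
    clean = []
    for row in rows:
        if all(isinstance(v, (int, float)) and math.isfinite(v) for v in row):
            clean.append(list(row))
    return clean


def min_max_scale(rows):
    """Scale every column to [0, 1]; constant columns become 0.0."""
    if not rows:
        return []
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (v - lo) / (hi - lo) if hi > lo else 0.0
            for v, lo, hi in zip(row, lows, highs)
        ])
    return scaled


if __name__ == "__main__":
    # Hypothetical feature rows: (flow duration, packets per second).
    raw = [
        [10.0, 200.0],
        [30.0, float("inf")],  # dropped: infinite value
        [20.0, float("nan")],  # dropped: missing value
        [50.0, 400.0],
    ]
    print(min_max_scale(drop_invalid_rows(raw)))
```

Encoding and class balancing are left out here because the README does not say which methods the project uses for them.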
- Install Dependencies:
    pip install -r requirements.txt
- Data Setup:
  - Download the CICIDS2017 dataset (CSVs).
  - Place the CSV files (e.g., Monday-WorkingHours.pcap_ISCX.csv) inside the data/raw/ folder.
  - Note: The system currently includes a dummy dataset for testing purposes.
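
To check that the files landed where the pipeline expects them, something like the following sketch can read every CSV under data/raw/ with the standard library. The helper name `load_raw_csvs` is hypothetical, and the whitespace stripping is a precaution on the assumption that column names in these CSVs can carry stray leading spaces; the project's own loader may work differently.

```python
import csv
from pathlib import Path


def load_raw_csvs(raw_dir="data/raw"):
    """Read every *.csv file in raw_dir into a list of dict rows."""
    rows = []
    for csv_path in sorted(Path(raw_dir).glob("*.csv")):
        # errors="replace" keeps going if a file contains odd bytes.
        with open(csv_path, newline="", encoding="utf-8", errors="replace") as handle:
            for record in csv.DictReader(handle):
                # Strip stray whitespace from header names and values;
                # skip overflow fields, which DictReader stores under None.
                rows.append({
                    key.strip(): (value or "").strip()
                    for key, value in record.items()
                    if key is not None
                })
    return rows


if __name__ == "__main__":
    records = load_raw_csvs()
    print(f"Loaded {len(records)} rows from data/raw/")
```

All values come back as strings, so numeric columns would still need converting before training.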
Run the detection script to process data and train the Random Forest model:
    python src/detection.py
This will save the model to models/rf_model.pkl.
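
The README does not say which metrics the evaluation step reports. As one plausible illustration, the sketch below computes accuracy, precision, recall, and F1 for the Normal/Attack labels from two lists of predictions. `binary_metrics` is a hypothetical helper, not a function from detection.py, and the sample labels are made up.

```python
def binary_metrics(y_true, y_pred, positive="Attack"):
    """Return accuracy, precision, recall and F1 for a two-class problem."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1  # attack correctly flagged
            else:
                fp += 1  # normal traffic flagged as attack
        else:
            if truth == positive:
                fn += 1  # attack missed
            else:
                tn += 1  # normal traffic correctly passed
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    # Made-up labels, for illustration only.
    truth = ["Attack", "Attack", "Normal", "Normal", "Attack"]
    preds = ["Attack", "Normal", "Normal", "Attack", "Attack"]
    print(binary_metrics(truth, preds))
```

For intrusion detection, recall on the Attack class is often the number to watch, since a missed attack usually costs more than a false alarm.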
Run the response script to train the Deep Q-Network (DQN) agent:
    python src/response.py
This will save the agent to models/dqn_model.pth.
Launch the Streamlit dashboard to interact with the system:
    streamlit run app.py
- Select "Use Sample Data" to test with the generated dummy data.
- Select "Upload CSV" to test with real network traffic files.
- Detection: Classifies traffic as Normal or Attack.
- Response: RL Agent chooses to Allow, Block, Alert, or Isolate based on the threat.
- Visualization: View detection statistics and agent decisions in the UI.
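
To make the response step more concrete, here is a minimal sketch of the four-action space with epsilon-greedy selection over a list of Q-values. No neural network is involved: the Q-values are hand-written stand-ins for what a trained DQN would output, and the `Action` enum, `toy_reward` values, and `epsilon_greedy` helper are all hypothetical rather than taken from response.py.

```python
import random
from enum import IntEnum


class Action(IntEnum):
    ALLOW = 0
    BLOCK = 1
    ALERT = 2
    ISOLATE = 3


def toy_reward(is_attack, action):
    """Hypothetical reward table: strong responses pay off only for attacks."""
    if is_attack:
        table = {Action.ALLOW: -10.0, Action.ALERT: 2.0,
                 Action.BLOCK: 8.0, Action.ISOLATE: 10.0}
    else:
        table = {Action.ALLOW: 5.0, Action.ALERT: -1.0,
                 Action.BLOCK: -5.0, Action.ISOLATE: -8.0}
    return table[action]


def epsilon_greedy(q_values, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the best-valued one."""
    if rng.random() < epsilon:
        return Action(rng.randrange(len(q_values)))
    best_index = max(range(len(q_values)), key=lambda i: q_values[i])
    return Action(best_index)


if __name__ == "__main__":
    rng = random.Random(0)
    q_values = [0.1, 0.7, 0.3, 0.2]  # made-up estimates for one state
    action = epsilon_greedy(q_values, epsilon=0.1, rng=rng)
    print(action.name, toy_reward(is_attack=True, action=action))
```

During training epsilon would typically start high and decay so the agent explores early and exploits later; at demo time it can be set to 0 for deterministic decisions.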