This project aims to build a machine learning model to detect fraudulent credit card transactions. It uses a Random Forest classifier trained on a dataset of credit card transactions.
Directory structure:

```
├── README.md
├── requirements.txt
├── data/
│   └── creditcard.csv
├── models/
│   └── random_forest_model.pkl
├── notebooks/
│   ├── 01_data_preprocessing.ipynb
│   └── 02_model_training.ipynb
└── outputs/
    ├── evaluation_report.txt
    └── roc_curve.png
```
- **Data Preprocessing** (`notebooks/01_data_preprocessing.ipynb`):
  - Loads the raw dataset (`data/creditcard.csv`).
  - Performs basic data exploration (info, description, null checks).
  - Visualizes the class distribution and feature correlations.
  - Applies `StandardScaler` to the `Amount` and `Time` features.
  - Drops the original `Time` and `Amount` columns.
  - Saves the processed data to `data/processed_data.csv`.
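The preprocessing steps above can be sketched as follows. A tiny synthetic frame stands in for `data/creditcard.csv` so the snippet is self-contained; in the notebook the frame would come from `pd.read_csv("data/creditcard.csv")`, with columns `V1`–`V28`, `Time`, `Amount`, and `Class`.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Stand-in for pd.read_csv("data/creditcard.csv") — illustrative rows only.
df = pd.DataFrame({
    "Time":   [0.0, 1.0, 2.0, 3.0],
    "V1":     [0.1, -0.2, 0.3, -0.4],
    "Amount": [100.0, 5.0, 250.0, 12.0],
    "Class":  [0, 0, 1, 0],
})

# Scale 'Amount' and 'Time' into new columns, then drop the originals.
scaler = StandardScaler()
df["scaled_amount"] = scaler.fit_transform(df[["Amount"]]).ravel()
df["scaled_time"] = scaler.fit_transform(df[["Time"]]).ravel()
df = df.drop(columns=["Time", "Amount"])

print(list(df.columns))  # scaled columns replace 'Time'/'Amount'
```

After scaling, each new column has zero mean and unit variance, which keeps the raw monetary scale of `Amount` from dominating distance-sensitive diagnostics.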
- **Model Training** (`notebooks/02_model_training.ipynb`):
  - Loads the processed data (`data/processed_data.csv`).
  - Splits the data into training and testing sets.
  - Trains a `RandomForestClassifier` model.
  - Evaluates the model using a classification report, confusion matrix, and ROC AUC score.
  - Saves the trained model to `models/random_forest_model.pkl`.
  - Saves the evaluation results to `outputs/evaluation_report.txt`.
  - Generates and saves the ROC curve plot to `outputs/roc_curve.png`.
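A minimal sketch of the training and evaluation flow. Synthetic imbalanced data from `make_classification` stands in for `data/processed_data.csv`, and the hyperparameters shown are illustrative assumptions, not the notebook's exact settings.

```python
import joblib
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report, confusion_matrix, roc_auc_score
from sklearn.model_selection import train_test_split

# Imbalanced stand-in for the processed fraud data (~5% positives).
X, y = make_classification(n_samples=2000, n_features=10,
                           weights=[0.95], random_state=42)

# Stratified split so the rare class appears in both partitions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

y_pred = model.predict(X_test)
auc = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])
print(classification_report(y_test, y_pred))
print(confusion_matrix(y_test, y_pred))
print(f"ROC AUC: {auc:.3f}")

# The notebook persists the fitted model (to models/random_forest_model.pkl).
joblib.dump(model, "random_forest_model.pkl")
```

Stratifying the split and reporting ROC AUC both matter here because accuracy alone is misleading on heavily imbalanced fraud data.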
- Clone the repository:

  ```bash
  git clone <your-repository-url>
  cd <repository-directory>
  ```

- Create a virtual environment (recommended):

  Linux/macOS:

  ```bash
  python -m venv venv
  source venv/bin/activate
  ```

  Windows:

  ```bash
  python -m venv venv
  venv\Scripts\activate
  ```

- Install dependencies:

  Ensure the `requirements.txt` file lists all necessary packages. Based on the notebooks, you will likely need `pandas`, `numpy`, `matplotlib`, `seaborn`, and `scikit-learn`. Install them with pip:

  ```bash
  pip install -r requirements.txt
  ```
- Add data: place the raw dataset file (`creditcard.csv`) into the `data/` directory. (Note: this dataset is commonly available on Kaggle.)
- Ensure the `creditcard.csv` file is in the `data/` directory.
- Run the data preprocessing notebook: execute the cells in `notebooks/01_data_preprocessing.ipynb`. This will generate `data/processed_data.csv`.
- Run the model training notebook: execute the cells in `notebooks/02_model_training.ipynb`. This will train the model, save it to `models/`, and generate the evaluation files in `outputs/`.
The model performance metrics (precision, recall, F1-score, confusion matrix, and ROC AUC score) are written to `outputs/evaluation_report.txt`. The ROC curve visualization is saved as `outputs/roc_curve.png`.
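To reuse the trained model outside the notebooks (e.g. for scoring new transactions), it can be re-loaded with `joblib`, assuming it was serialized with `joblib.dump` as described above. A toy model stands in for `models/random_forest_model.pkl` so the round trip is self-contained:

```python
import joblib
import numpy as np
from sklearn.ensemble import RandomForestClassifier

# Toy stand-in for the real fitted model saved by the training notebook.
rng = np.random.RandomState(0)
X = rng.rand(100, 5)
y = (X[:, 0] > 0.5).astype(int)
trained = RandomForestClassifier(n_estimators=10, random_state=0).fit(X, y)
joblib.dump(trained, "random_forest_model.pkl")

# Later (e.g. in a scoring script), reload and score new rows. The input
# must use the same feature columns, in the same order, as during training.
model = joblib.load("random_forest_model.pkl")
probs = model.predict_proba(X[:3])[:, 1]  # estimated fraud probability per row
print(probs.shape)
```

Because pickled models are tied to the scikit-learn version that produced them, load the file with a matching (or compatible) version of the library.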