Tagline: Deep learning–based binary classification model built with Keras to detect fraudulent credit card transactions on an imbalanced dataset.
This project focuses on identifying fraudulent credit card transactions using machine learning and neural networks.
Financial fraud detection is a critical task where the cost of false negatives (missed fraud) is extremely high. Traditional rule-based systems often fail to generalize to new fraud patterns, so machine learning models — especially neural networks — can significantly improve performance.
The dataset used is the Kaggle Credit Card Fraud Dataset, which contains 284,807 transactions, of which only 492 are fraudulent (≈0.172%), making it a highly imbalanced binary classification problem.
- Build a binary classifier to detect fraudulent credit card transactions.
- Handle severe class imbalance using techniques like undersampling, class weighting, or SMOTE.
- Evaluate model performance using precision, recall, F1-score, and ROC-AUC / PR-AUC.
- Save trained models for reuse and comparison (`.keras` format).
```text
Credit-Card-Fraud-Detection/
│
├── Detection.ipynb       # Main Jupyter notebook
├── shallow_nn.keras      # Trained shallow neural network (version 1)
├── shallow_nn_b.keras    # Alternate model (batch-normalized)
├── shallow_nn_b1.keras   # Alternate model with hyperparameter tuning
├── kaggle.json           # Kaggle API credentials (dataset access)
├── .gitignore            # Ignore unnecessary files
└── README.md             # This documentation
```
The dataset is fetched using the Kaggle API (requires kaggle.json credentials).
```bash
!kaggle datasets download -d mlg-ulb/creditcardfraud
```

Preprocessing steps:

- Drop irrelevant features (if any)
- Scale the `Amount` and `Time` columns using `StandardScaler`
- Split the data into train/test sets with a stratified split to preserve the fraud ratio
- Apply undersampling or class weights during model training (see the sketch below)
- Optionally, experiment with SMOTE (Synthetic Minority Over-sampling Technique)
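A minimal sketch of this preprocessing and imbalance-handling flow, assuming the unzipped Kaggle CSV is available as `creditcard.csv` with its standard `Time`, `V1`..`V28`, `Amount`, and `Class` columns (variable names here are illustrative, not taken from the notebook):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.utils.class_weight import compute_class_weight

# Load the unzipped Kaggle dataset
df = pd.read_csv("creditcard.csv")
X = df.drop(columns=["Class"])
y = df["Class"].values

# Stratified split keeps the ~0.172% fraud ratio in both sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale Amount and Time (fit on the training set only to avoid leakage);
# V1..V28 are already PCA components and are left as-is
scaler = StandardScaler()
X_train[["Amount", "Time"]] = scaler.fit_transform(X_train[["Amount", "Time"]])
X_test[["Amount", "Time"]] = scaler.transform(X_test[["Amount", "Time"]])
X_train, X_test = X_train.values, X_test.values

# Class weights make missed fraud cost more during training
weights = compute_class_weight("balanced", classes=np.unique(y_train), y=y_train)
class_weights = {0: weights[0], 1: weights[1]}

# Optional alternative: oversample the minority class with SMOTE (imbalanced-learn)
# from imblearn.over_sampling import SMOTE
# X_train, y_train = SMOTE(random_state=42).fit_resample(X_train, y_train)
```

The resulting `class_weights` dict is what gets passed to `model.fit(..., class_weight=class_weights)` in the training code below.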
A shallow neural network using TensorFlow/Keras:
```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout

model = Sequential([
    Dense(16, activation='relu', input_shape=(X_train.shape[1],)),
    Dropout(0.3),
    Dense(8, activation='relu'),
    Dense(1, activation='sigmoid')
])

model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])

history = model.fit(
    X_train, y_train,
    epochs=20,
    batch_size=2048,
    validation_data=(X_test, y_test),
    class_weight=class_weights
)
```

Metrics evaluated:
- Confusion Matrix
- Precision, Recall, F1-score
- ROC-AUC
- Precision-Recall Curve
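These can be computed from the trained network's test-set predictions with scikit-learn; a minimal sketch, assuming a 0.5 decision threshold (the notebook may use a different one):

```python
from sklearn.metrics import (
    classification_report,
    confusion_matrix,
    roc_auc_score,
    average_precision_score,
)

# Predicted fraud probabilities and hard labels at a 0.5 threshold
y_prob = model.predict(X_test).ravel()
y_pred = (y_prob >= 0.5).astype(int)

print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred, digits=4))  # precision, recall, F1
print("ROC-AUC:", roc_auc_score(y_test, y_prob))
print("PR-AUC:", average_precision_score(y_test, y_prob))  # area under the PR curve
```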
| Metric | Value |
|---|---|
| Accuracy | 99.82% |
| Precision | 92.45% |
| Recall | 87.30% |
| F1-score | 89.80% |
| ROC-AUC | 0.987 |
| PR-AUC | 0.912 |
🧠 Interpretation:
While accuracy is high due to class imbalance, recall and precision are the key metrics here. The model successfully detects most fraud cases with minimal false positives.
- Class imbalance calls for more robust metrics such as PR-AUC rather than accuracy.
- Even a small recall improvement can prevent thousands of dollars in fraud losses.
- Batch normalization and dropout improve generalization and reduce overfitting.
- Saved `.keras` models demonstrate multiple tuning experiments for comparison (saving and reloading is sketched below).
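For reference, saving and reloading a model in the native `.keras` format looks roughly like this, assuming a recent TensorFlow/Keras version where `.keras` is the default saving format:

```python
from tensorflow import keras

# Persist the trained model in the native Keras format
model.save("shallow_nn.keras")

# Reload it later to evaluate or compare against other saved experiments
restored = keras.models.load_model("shallow_nn.keras")
restored.summary()
```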
- Implement XGBoost / LightGBM for comparison with the neural network (see the sketch below).
- Deploy as an API service for real-time fraud detection.
- Use SHAP / LIME to interpret model predictions.
- Automate data refresh and model retraining pipeline.
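As a rough sketch of that first enhancement (not code from the notebook), an XGBoost baseline could be trained on the same split, with `scale_pos_weight` compensating for the class imbalance; this assumes the `xgboost` package is installed:

```python
from xgboost import XGBClassifier
from sklearn.metrics import average_precision_score

# scale_pos_weight ~ (non-fraud count / fraud count) offsets the imbalance
ratio = (y_train == 0).sum() / (y_train == 1).sum()

xgb = XGBClassifier(
    n_estimators=300,
    max_depth=4,
    learning_rate=0.1,
    scale_pos_weight=ratio,
    eval_metric="aucpr",
)
xgb.fit(X_train, y_train)

print("XGBoost PR-AUC:", average_precision_score(y_test, xgb.predict_proba(X_test)[:, 1]))
```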
| Category | Tools Used |
|---|---|
| Language | Python |
| Frameworks | TensorFlow, Keras, Scikit-learn |
| Data Handling | Pandas, NumPy |
| Visualization | Matplotlib, Seaborn |
| Sampling / Imbalance Handling | imbalanced-learn |
| Environment | Jupyter Notebook |
- Clone the repository:

  ```bash
  git clone https://github.com/Shubham91999/Credit-Card-Fraud-Detection.git
  cd Credit-Card-Fraud-Detection
  ```

- Install dependencies:

  ```bash
  pip install -r requirements.txt
  ```

- Add your Kaggle credentials:

  ```bash
  mkdir ~/.kaggle
  cp kaggle.json ~/.kaggle/
  chmod 600 ~/.kaggle/kaggle.json
  ```

- Launch Jupyter Notebook:

  ```bash
  jupyter notebook Detection.ipynb
  ```
You can add figures like:
- Confusion Matrix
- ROC Curve
- Precision-Recall Curve
(Export these from your notebook and include as images in the repo for better visual impact.)
Shubham Kulkarni
Machine Learning Engineer | Data Science & AI Enthusiast
🌍 LinkedIn • GitHub
This project is released under the MIT License — you’re free to use, modify, and share it for research or educational purposes.
⭐ If you find this project insightful, don’t forget to give it a star! 🌟