kapil2020/global-PM-2.5-next-day-forecasting

🌍 Global PM2.5 Next-Day Forecasting

Predicting next-day fine particulate matter (PM₂.₅) concentrations using machine learning techniques.

This repository contains the code, data workflow, and visualizations from a Kaggle notebook for global air quality prediction.
It uses meteorological and environmental datasets to build models that forecast PM₂.₅ concentrations across multiple regions worldwide.


🚀 Project Overview

Air pollution remains one of the most critical global challenges.
This project develops a data-driven machine learning framework for next-day PM₂.₅ forecasting using publicly available datasets.

Key Features:

  • Data preprocessing and feature engineering from multi-source air quality datasets
  • Model training using tree-based ensemble methods (e.g., XGBoost, LightGBM)
  • Performance evaluation with RMSE and R² metrics
  • SHAP interpretability for feature impact analysis
  • Visual dashboards and plots for spatial-temporal understanding
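
As an illustration of what "next-day" means for feature engineering, the sketch below pairs each day's observations with the PM₂.₅ value measured one calendar day later at the same location. It is a minimal example under stated assumptions, not the notebook's actual code: the record keys (`location`, `date`, `pm25`, `temp`) and the function name `make_next_day_pairs` are hypothetical.

```python
from datetime import date, timedelta


def make_next_day_pairs(records):
    """Pair each (location, day) row with the PM2.5 value observed one day later.

    `records` is a list of dicts with hypothetical keys 'location',
    'date' (a datetime.date), 'pm25', plus any weather features.
    Rows whose following calendar day is missing are dropped rather
    than paired with a later observation.
    """
    # Index rows by (location, date) so the next day can be looked up directly.
    by_key = {(r["location"], r["date"]): r for r in records}
    pairs = []
    for (loc, day), row in sorted(by_key.items(), key=lambda kv: kv[0]):
        next_row = by_key.get((loc, day + timedelta(days=1)))
        if next_row is None:
            continue  # no observation for the following day
        # Everything except the identifiers is treated as a model input.
        features = {k: v for k, v in row.items() if k not in ("location", "date")}
        pairs.append(
            {
                "location": loc,
                "date": day,
                "features": features,
                "target_pm25": next_row["pm25"],
            }
        )
    return pairs


# Toy data (made up for illustration); location A has a gap on 2024-01-03.
records = [
    {"location": "A", "date": date(2024, 1, 1), "pm25": 10.0, "temp": 5.0},
    {"location": "A", "date": date(2024, 1, 2), "pm25": 12.0, "temp": 6.0},
    {"location": "A", "date": date(2024, 1, 4), "pm25": 20.0, "temp": 4.0},
    {"location": "B", "date": date(2024, 1, 1), "pm25": 30.0, "temp": 15.0},
    {"location": "B", "date": date(2024, 1, 2), "pm25": 28.0, "temp": 16.0},
]
pairs = make_next_day_pairs(records)
for p in pairs:
    print(p["location"], p["date"], p["features"], "->", p["target_pm25"])
```

Looking up the exact following date, instead of simply taking the next row, keeps a gap in the record from silently turning a two-day-ahead value into a "next-day" target.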

🧠 Methodology

  1. Data Aggregation:
    Merging global PM₂.₅ datasets with weather parameters such as temperature, wind speed, humidity, and pressure.

  2. Preprocessing & Cleaning:
    Handling missing values, scaling features, and temporal alignment.

  3. Model Development:
    Training machine learning regressors like:

    • XGBoost
    • Random Forest
    • LightGBM
    • Linear Regression
  4. Evaluation:

    • RMSE, MAE, and R²
    • SHAP-based feature importance visualization
  5. Forecast Generation:
    Producing next-day PM₂.₅ predictions for multiple global locations.
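
The three metrics named in the evaluation step can be computed directly from observed and predicted values. The sketch below uses only the standard library and made-up numbers to show the definitions. The function names and sample values are illustrative assumptions, not results from this project.

```python
import math


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def mae(y_true, y_pred):
    """Mean absolute error: average size of the error, in the units of the target."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up observed and predicted PM2.5 values, for illustration only.
y_true = [10.0, 20.0, 30.0, 40.0]
y_pred = [12.0, 18.0, 33.0, 37.0]

print(f"RMSE: {rmse(y_true, y_pred):.3f}")
print(f"MAE:  {mae(y_true, y_pred):.3f}")
print(f"R2:   {r2(y_true, y_pred):.3f}")
```

RMSE and MAE are in the same units as the PM₂.₅ values, while R² is unitless, so reporting them together gives both an absolute and a relative view of model error.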


📊 Results & Visualization

You can explore the interactive visualizations and full notebook here:
🔗 View the Project Dashboard


🧩 Repository Structure

│
├── data/                     # Raw and processed datasets (not pushed due to size)
├── docs/                     # HTML outputs for GitHub Pages
│   ├── index.html
│   └── notebook.html
│
├── global-analysis-next-day-pm2-5-ml.ipynb   # Kaggle notebook
├── requirements.txt           # Environment dependencies
└── README.md                  # Project documentation

