A smart network monitoring application that automatically tracks your web browsing habits and classifies websites based on productivity. It provides real-time insights into your digital behavior through visual analytics and productivity scoring.

Rudra-P11/NetSense

📡 NetSense: Real-Time Network Analyzer for Productivity Tracking

A smart, privacy-focused application that passively monitors your network activity and translates it into real-time insights about your digital productivity.


🏗️ Technical Architecture

🔍 Packet Capture

  • Uses Scapy with Npcap to monitor traffic on ports:
    • 53 (DNS)
    • 80 (HTTP)
    • 443 (HTTPS), plus 8080 and 8443 (common alternate HTTP and HTTPS ports)
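
As a rough sketch of how a capture limited to these ports might be set up with Scapy, the snippet below builds a BPF filter string and passes it to `sniff`. The names `MONITORED_PORTS`, `build_bpf_filter`, and `start_capture` are hypothetical and are not taken from dns_sniffer.py.

```python
from typing import Callable, Iterable, Optional

# Ports listed in this README (DNS, HTTP, HTTPS and common alternates).
MONITORED_PORTS = (53, 80, 443, 8080, 8443)


def build_bpf_filter(ports: Iterable[int] = MONITORED_PORTS) -> str:
    """Return a BPF expression that matches traffic on any of the given ports."""
    return " or ".join(f"port {p}" for p in ports)


def start_capture(handle_packet: Callable, iface: Optional[str] = None) -> None:
    """Sniff matching packets and hand each one to handle_packet.

    Requires Scapy, Npcap (on Windows), and administrator rights.
    """
    # Imported lazily so build_bpf_filter stays usable without Scapy installed.
    from scapy.all import sniff

    sniff(filter=build_bpf_filter(), prn=handle_packet, store=False, iface=iface)
```

Passing `store=False` keeps Scapy from accumulating every packet in memory, which matters for a long-running monitor.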

🔎 Multi-Layer Analysis

  • DNS query monitoring
  • SSL/TLS SNI extraction (for HTTPS sites)
  • HTTP header inspection
  • IP-to-domain mapping via DNS responses
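
To illustrate the SNI step, here is a minimal sketch that reads the server name from the raw bytes of a TLS ClientHello record. It assumes the whole ClientHello arrives in a single record and is not necessarily how NetSense parses it; `extract_sni` is a hypothetical name.

```python
import struct
from typing import Optional


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from a TLS ClientHello record, or None."""
    try:
        # Record type 0x16 = handshake, handshake type 0x01 = ClientHello.
        if record[0] != 0x16 or record[5] != 0x01:
            return None
        pos = 5 + 4                      # record header + handshake header
        pos += 2 + 32                    # client_version + random
        pos += 1 + record[pos]           # session_id (length-prefixed)
        (suites_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + suites_len            # cipher_suites
        pos += 1 + record[pos]           # compression_methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = min(pos + ext_total, len(record))
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:            # server_name extension
                # Layout: list_len(2) name_type(1) name_len(2) name
                if record[pos + 2] != 0:  # 0 = host_name
                    return None
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None
    return None
```

Malformed or truncated input returns `None` rather than raising, so a packet callback can call this on every TCP payload without extra guards.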

🧠 Smart Filtering

  • Filters out:
    • CDNs (Cloudflare, Akamai, etc.)
    • Cloud service domains
    • Reverse DNS noise
  • Only captures actual website domains
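
One simple way to approximate this filtering is a suffix blocklist. The sketch below is illustrative only: the suffix list is a small hypothetical sample, and `is_infrastructure` and `keep_site_domains` are not names from the project.

```python
from typing import Iterable, List

# Hypothetical sample of CDN, cloud, and reverse-DNS suffixes to ignore.
INFRA_SUFFIXES = (
    "cloudflare.com",
    "akamai.net",
    "akamaiedge.net",
    "cloudfront.net",
    "amazonaws.com",
    "in-addr.arpa",
    "ip6.arpa",
)


def is_infrastructure(domain: str) -> bool:
    """True if the domain equals, or is a subdomain of, a blocked suffix."""
    name = domain.lower().rstrip(".")
    return any(name == s or name.endswith("." + s) for s in INFRA_SUFFIXES)


def keep_site_domains(domains: Iterable[str]) -> List[str]:
    """Normalize, drop infrastructure noise, and de-duplicate in first-seen order."""
    seen = set()
    kept = []
    for domain in domains:
        name = domain.lower().rstrip(".")
        if name and name not in seen and not is_infrastructure(name):
            seen.add(name)
            kept.append(name)
    return kept
```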

⚡ Real-Time Classification

  • Instantly categorizes each domain, based on customizable rules, as:
    • ✅ Productive
    • ❌ Unproductive
    • ⚪ Neutral
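
A rule table of this kind could look like the sketch below. The rule entries and the `classify` function are hypothetical examples, not the project's actual defaults; unknown domains fall back to Neutral.

```python
from typing import Dict

# Hypothetical default rules; a user could edit or replace this mapping.
DEFAULT_RULES: Dict[str, str] = {
    "github.com": "Productive",
    "stackoverflow.com": "Productive",
    "youtube.com": "Unproductive",
    "netflix.com": "Unproductive",
}


def classify(domain: str, rules: Dict[str, str] = DEFAULT_RULES) -> str:
    """Return Productive, Unproductive, or Neutral for a domain."""
    labels = domain.lower().rstrip(".").split(".")
    # Try the full name first, then each parent domain,
    # so "gist.github.com" picks up the rule for "github.com".
    for i in range(len(labels) - 1):
        candidate = ".".join(labels[i:])
        if candidate in rules:
            return rules[candidate]
    return "Neutral"
```

Customizing is then a matter of passing a modified mapping, for example `classify("youtube.com", {**DEFAULT_RULES, "youtube.com": "Productive"})`.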

🧩 Core Components

  • DNS Sniffer (dns_sniffer.py): Captures and processes raw network packets
  • Data Collector (data_collector.py): Aggregates network data and extracts ML-ready features
  • Website Classifier (website_classifier.py): scikit-learn based ML model for domain categorization (Productive/Unproductive/Neutral)
  • Content Analyzer (content_analyzer.py): NLTK-powered text analysis for enhanced classification
  • Anomaly Detector (anomaly_detector.py): Real-time anomaly detection for suspicious network activity
  • Auto Trainer (auto_trainer.py): Continuous model retraining with user feedback
  • Model Storage (model_storage.py): Persistent model serialization and versioning
  • Network Monitor (network_usage.py): Tracks bandwidth usage and connections
  • Productivity Predictor (productivity_predictor.py): Forecasts productivity trends
  • Dashboard (dashboard.py): Tkinter-based GUI with real-time visualizations and anomaly alerts

🌟 Key Features

🔍 Detection & Analysis

  • ✅ Real-time tracking of websites (e.g. YouTube, Netflix, Instagram, GitHub)
  • ✅ Productivity scoring algorithm (0-100%)
  • ✅ Time spent analysis per category
  • ✅ Live bandwidth monitoring
  • ✅ Multi-layer packet analysis (DNS, HTTP, HTTPS/SNI)
  • ✅ Automatic IP-to-domain mapping with reverse DNS validation
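
The README does not spell out the scoring formula. One plausible reading, offered purely as an assumption, is the share of rated time (productive plus unproductive) that was productive, with neutral time ignored. `productivity_score` is a hypothetical name.

```python
from typing import Mapping, Optional


def productivity_score(seconds_by_category: Mapping[str, float]) -> Optional[float]:
    """Return a 0-100 score, or None when no rated time has been recorded.

    Assumed formula: productive / (productive + unproductive) * 100.
    """
    productive = seconds_by_category.get("Productive", 0.0)
    unproductive = seconds_by_category.get("Unproductive", 0.0)
    rated = productive + unproductive
    if rated <= 0:
        return None
    return 100.0 * productive / rated
```

For example, 30 minutes on productive sites and 10 minutes on unproductive ones gives a score of 75.0 under this assumption, regardless of how much neutral time was logged.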

📊 Visual Analytics

  • 📈 Real-time usage graphs
  • 📊 Interactive pie charts showing productivity distribution
  • 🎯 Color-coded activity logs
  • 📱 Multi-tab GUI dashboard with 6 specialized views:
    • Overview (live stats)
    • Websites (detailed access log)
    • Timeline (productivity trends)
    • Productivity (scoring breakdown)
    • Anomalies (security alerts)
    • Insights (recommendations)

🧠 Smart Insights

  • 💡 Personalized productivity recommendations
  • ⚡ Instant statistics and real-time alerts
  • 📝 Historical activity tracking
  • 🎮 Customizable website classifications
  • 🤖 ML-powered website categorization (scikit-learn)
  • 📚 NLP analysis for enhanced accuracy (NLTK)

🔐 Anomaly Detection & Security

  • 🚨 Real-time anomaly detection for suspicious network patterns:
    • Unusual packet rates (high/low thresholds)
    • Unexpected bandwidth spikes
    • Abnormal connection patterns
  • 🎯 Smart alerting with confidence scoring (0.0-1.0)
  • 📊 Historical anomaly tracking and trend analysis
  • ⚠️ Live anomaly feed in dashboard with timestamps

🤖 Machine Learning & Auto-Training

  • 🧠 Continuous model improvement with user feedback
  • 📈 scikit-learn RandomForest classifier
  • 🔄 Automatic retraining triggered by domain misclassifications
  • 💾 Persistent model storage with versioning
  • 📊 Feature engineering: TLD analysis, domain length, keyword matching
  • 🎯 Adaptive learning from user corrections

🚀 Why This Project?

💢 Problem Solved

Most users don't realize how much time they waste online.
This tool brings digital self-awareness by providing clear, actionable data about browsing habits.

🌍 Real-World Applications

  • 🧑‍💻 Personal Productivity: Focus tracking & time management
  • 🏢 Workplace Monitoring: Employee analytics (with consent)
  • 🎓 Education: Helps students monitor study vs distraction
  • 👨‍👩‍👧 Parental Control: Track kids' online activity trends

🧪 Technical Innovation

  • Overcomes modern web complexity:
    • CDNs
    • HTTPS encryption
    • Dynamic IP/domain mapping
  • Combines multi-layer packet analysis with smart local filtering
  • Processes packets as they are captured, so new activity typically shows up in the dashboard within seconds
  • Local-only processing, ensuring privacy
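
The dynamic IP/domain mapping mentioned above can be pictured as a small lookup table that is refreshed from every DNS answer and consulted when a later packet carries only an IP address. The `DnsMap` class below is a hypothetical sketch of that idea, not code from the project.

```python
from typing import Dict, Iterable, Optional


class DnsMap:
    """Remember which domain each IP address was most recently resolved for."""

    def __init__(self) -> None:
        self._ip_to_domain: Dict[str, str] = {}

    def record_answer(self, domain: str, ips: Iterable[str]) -> None:
        """Store the addresses returned for a DNS query; newer answers overwrite older ones."""
        name = domain.lower().rstrip(".")
        for ip in ips:
            self._ip_to_domain[ip] = name

    def lookup(self, ip: str) -> Optional[str]:
        """Return the domain last seen for this IP, or None if unknown."""
        return self._ip_to_domain.get(ip)
```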

🧠 Key Technical Achievements

  • ✅ Bypassed Modern Web Obstacles:
    • Handles HTTPS and CDN-heavy traffic
    • Accurately detects actual domains (e.g. youtube.com, not ytcdn.net)
    • Robust port handling for IP:port combinations in ML pipeline
  • ✅ Intelligent Filtering:
    • Eliminates infrastructure noise
    • Domain normalization and deduplication
  • ✅ High Performance:
    • Real-time packet processing + GUI updates
    • Non-blocking background threads for ML inference
    • Efficient feature extraction and caching
  • ✅ Machine Learning Integration:
    • Custom feature engineering pipeline
    • Thread-safe model training and inference
    • Persistent model storage with metadata
  • ✅ Anomaly Detection:
    • Statistical threshold-based detection
    • Deque-based alert history with automatic pruning
    • Real-time scoring with confidence metrics
  • ✅ User-Friendly Interface:
    • Converts complex network data into readable insights
    • 6 specialized dashboard tabs with rich visualizations
    • Live anomaly alerts with severity indicators

🔧 Recent Improvements (Latest Release)

  • ✅ TensorFlow Optimization: Suppressed oneDNN warnings for cleaner startup
  • ✅ ML Pipeline Fix: Added port stripping in domain feature extraction to handle IP:port strings
  • ✅ Anomaly Display Implementation:
    • Real-time anomaly alerts now display in Anomalies tab
    • Shows timestamp, alert message, and confidence score
    • Reverse chronological ordering (newest first)
    • Thread-safe deque access pattern
  • ✅ Enhanced Error Handling: Improved robustness in network data processing
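
The port-stripping fix is described only briefly, so the following is a guess at its shape rather than the actual patch: a helper that removes a trailing `:port` before the string reaches feature extraction. `strip_port` is a hypothetical name.

```python
def strip_port(host: str) -> str:
    """Drop a trailing :port from 'host:port' or '[ipv6]:port'; leave other input unchanged."""
    if host.startswith("["):  # bracketed IPv6 such as "[::1]:443"
        end = host.find("]")
        return host[1:end] if end != -1 else host
    name, sep, port = host.rpartition(":")
    # Only strip when there is exactly one colon and the tail is numeric,
    # so bare IPv6 addresses like "::1" are left alone.
    if sep and port.isdigit() and ":" not in name:
        return name
    return host
```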

🚀 Installation & Setup

Prerequisites

  • Python 3.8+ (tested on 3.10+)
  • Windows OS (with Npcap driver for packet capture)
  • Administrator privileges (required for packet sniffing)

Quick Start

  1. Clone the repository

    git clone https://github.com/Rudra-P11/NetSense.git
    cd NetSense
  2. Create virtual environment

    python -m venv .venv
    .venv\Scripts\activate
  3. Install dependencies

    pip install -r requirements.txt
  4. Run the application

    python main.py

Required Dependencies

  • scapy: Packet capture and analysis
  • scikit-learn: ML model training and inference
  • pandas: Data processing
  • matplotlib/seaborn: Visualization
  • nltk: NLP analysis
  • psutil: System monitoring
  • tkinter: GUI framework (ships with standard Python installers; not installed through pip)

See requirements.txt for complete list with versions.

Platform Requirements

  • Npcap: Required for packet capture on Windows
  • Admin privileges: Application must run as administrator
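
Since capture fails without elevated rights, a startup check can give a clearer message than a raw permission error. This is a sketch of one way to do it; `is_admin` is a hypothetical helper, and the non-Windows branch is included only so the snippet runs elsewhere.

```python
import ctypes
import os


def is_admin() -> bool:
    """Return True when the current process has administrator (or root) rights."""
    if os.name == "nt":
        try:
            # Windows shell API: non-zero when the user is an administrator.
            return bool(ctypes.windll.shell32.IsUserAnAdmin())
        except (AttributeError, OSError):
            return False
    return os.geteuid() == 0


if __name__ == "__main__":
    if not is_admin():
        print("Packet capture needs administrator privileges; please restart elevated.")
```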

📁 Project Structure

NetSense/
├── main.py                       # Entry point
├── dns_sniffer.py                # Packet capture engine
├── data_collector.py             # Network data aggregation
├── website_classifier.py         # ML-based domain classification
├── content_analyzer.py           # NLTK-powered text analysis
├── anomaly_detector.py           # Real-time anomaly detection
├── auto_trainer.py               # Continuous model retraining
├── model_storage.py              # Model persistence
├── network_usage.py              # Bandwidth monitoring
├── productivity_predictor.py     # Trend forecasting
├── dashboard.py                  # GUI dashboard (Tkinter)
├── requirements.txt              # Production dependencies
├── ml_data/                      # ML training data cache
│   ├── training_data.json        # Historical training samples
│   └── feature_cache.json        # Cached feature vectors
├── ml_models/                    # Trained model storage
│   ├── website_classifier_*.pkl  # Classification model
│   ├── feature_scaler_*.pkl      # Feature normalization
│   └── models_metadata.json      # Model metadata
└── README.md                     # This file

🎮 Usage Guide

Dashboard Overview

The application displays real-time analytics across 6 main tabs:

📊 Overview Tab

  • Live statistics: total productivity score, uptime, active connections
  • Real-time bandwidth usage graphs
  • Current network activity status

🌐 Websites Tab

  • Detailed log of all visited websites
  • Classification status (Productive/Unproductive/Neutral)
  • Visit timestamps and duration
  • Right-click to mark as misclassified for model retraining

📈 Timeline Tab

  • Productivity trends over time
  • Hourly/daily productivity breakdown
  • Visual patterns of your work habits
  • Export trends for analysis

🎯 Productivity Tab

  • Category-wise productivity breakdown
  • Pie charts showing time distribution
  • Top productive and unproductive websites
  • Customizable category definitions

⚠️ Anomalies Tab (NEW)

  • Real-time security alerts
  • Suspicious network patterns detected:
    • High packet rate bursts
    • Unusual bandwidth spikes
    • Abnormal connection patterns
  • Alert severity with confidence scoring
  • Timestamp and detailed anomaly messages
  • Historical anomaly tracking

💡 Insights Tab

  • AI-generated recommendations
  • Productivity improvement suggestions
  • Time management tips
  • Personalized insights based on your patterns

Feature Controls

  • Start/Stop Capture: Begin/end network monitoring
  • Clear Data: Reset current session statistics
  • Custom Classifications: Edit website categorization rules
  • Model Training: View model performance metrics
  • Settings: Configure sensitivity, thresholds, alerts

🔬 Machine Learning Details

Classification Model

  • Algorithm: scikit-learn RandomForestClassifier
  • Features:
    • Domain name characteristics (TLD, length, structure)
    • Keyword matching (productivity keywords vs distractions)
    • Historical classification data
    • User feedback corrections
  • Training: Automatic on domain misclassification correction
  • Inference: Real-time classification during packet analysis
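
To make the feature list concrete, the sketch below turns a domain name into a small numeric feature dictionary covering TLD, length, structure, and keyword matches. The keyword lists, the TLD table, and the function name `domain_features` are illustrative assumptions rather than the project's actual pipeline.

```python
from typing import Dict

# Hypothetical keyword and TLD tables used only for this illustration.
PRODUCTIVE_KEYWORDS = ("docs", "learn", "git", "stack")
DISTRACTION_KEYWORDS = ("video", "game", "tube", "flix")
KNOWN_TLDS = ("com", "org", "net", "edu", "io")


def domain_features(domain: str) -> Dict[str, float]:
    """Extract numeric features from a domain name for a tree-based classifier."""
    name = domain.lower().rstrip(".")
    labels = name.split(".")
    tld = labels[-1] if len(labels) > 1 else ""
    return {
        "length": float(len(name)),
        "num_labels": float(len(labels)),
        # Index into KNOWN_TLDS, or -1 for anything else.
        "tld_id": float(KNOWN_TLDS.index(tld)) if tld in KNOWN_TLDS else -1.0,
        "productive_hits": float(sum(k in name for k in PRODUCTIVE_KEYWORDS)),
        "distraction_hits": float(sum(k in name for k in DISTRACTION_KEYWORDS)),
    }
```

Rows built this way, one list of values per domain, are the kind of input that scikit-learn's `RandomForestClassifier.fit` accepts alongside a list of labels.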

Anomaly Detection

  • Method: Statistical threshold-based detection
  • Metrics Monitored:
    • Packet rate (packets per minute)
    • Bandwidth usage (MB per timeframe)
    • Connection count spikes
    • Unusual protocol distributions
  • Scoring: Confidence scores (0.0-1.0) for each alert
  • History: Last 20 anomalies retained with full metadata
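
A threshold detector with a bounded, thread-safe history could be sketched as follows. Only the packet-rate metric is shown; the default threshold and the confidence formula (overshoot relative to the threshold, capped at 1.0) are assumptions, and `ThresholdDetector` and `Alert` are hypothetical names.

```python
import threading
import time
from collections import deque
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Alert:
    timestamp: float
    message: str
    confidence: float  # 0.0-1.0


class ThresholdDetector:
    """Flag packet rates above a fixed threshold and keep the latest alerts."""

    def __init__(self, max_packets_per_min: float = 6000.0, history: int = 20) -> None:
        self.max_rate = max_packets_per_min
        self._alerts: deque = deque(maxlen=history)  # oldest entries drop off automatically
        self._lock = threading.Lock()

    def observe(self, packets_per_min: float, now: Optional[float] = None) -> Optional[Alert]:
        """Record one measurement; return an Alert if it exceeds the threshold."""
        if packets_per_min <= self.max_rate:
            return None
        # Assumed scoring: confidence reaches 1.0 at twice the threshold.
        confidence = min(1.0, (packets_per_min - self.max_rate) / self.max_rate)
        alert = Alert(
            timestamp=time.time() if now is None else now,
            message=f"High packet rate: {packets_per_min:.0f}/min",
            confidence=round(confidence, 2),
        )
        with self._lock:
            self._alerts.append(alert)
        return alert

    def recent(self) -> List[Alert]:
        """Return a snapshot of stored alerts, newest first."""
        with self._lock:
            return list(reversed(self._alerts))
```

Returning a copied list from `recent()` lets a GUI thread render alerts while the capture thread keeps appending.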

Auto-Training System

  • Triggered when user corrects domain classification
  • Retrains model with expanded feature set
  • Versions models with metadata timestamps
  • Graceful fallback to previous model if training fails
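
The versioning and fallback behaviour might look roughly like the sketch below. The file names follow the project structure listed in this README, but the metadata layout and the functions `save_versioned` and `retrain_with_fallback` are assumptions made for illustration.

```python
import json
import pickle
import time
from pathlib import Path
from typing import Any, Callable, Optional, Sequence


def save_versioned(model: Any, model_dir: Path, now: Optional[float] = None) -> Path:
    """Pickle the model under a timestamped name and log it in models_metadata.json."""
    model_dir.mkdir(parents=True, exist_ok=True)
    version = time.strftime("%Y%m%d_%H%M%S", time.localtime(now))
    path = model_dir / f"website_classifier_{version}.pkl"
    path.write_bytes(pickle.dumps(model))

    meta_path = model_dir / "models_metadata.json"
    meta = json.loads(meta_path.read_text()) if meta_path.exists() else {"versions": []}
    meta["versions"].append({"file": path.name, "saved_at": version})
    meta_path.write_text(json.dumps(meta, indent=2))
    return path


def retrain_with_fallback(
    train: Callable[[Sequence], Any], samples: Sequence, previous_model: Any
) -> Any:
    """Return a freshly trained model, or the previous one if training fails."""
    try:
        return train(samples)
    except Exception:
        # Keep serving predictions with the last known-good model.
        return previous_model
```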

🛡️ Privacy & Ethics

  • 🔒 Local Processing Only: Captured data is analyzed on your machine and never leaves it
  • 🚫 No Remote Storage: Training samples and models are kept only in the local ml_data/ and ml_models/ folders
  • 🔍 Transparent: Open-source for full code visibility
  • ✅ Consent-based Design: Intended for personal, ethical use
  • 🔐 Metadata-Focused Inspection: Only DNS queries, TLS SNI, and HTTP headers are analyzed
  • 🌐 No External API Calls: Completely offline operation

📈 Impact & Value

This project demonstrates how network programming, behavioral analytics, machine learning, and data visualization can power real-world, high-impact applications.

It transforms raw packet data into real insights that help users:

  • Stay productive: Understand where time is spent
  • Detect threats: Real-time anomaly alerts for suspicious activity
  • Optimize workflows: Data-driven productivity recommendations
  • Take control: Machine learning adapts to your definitions

It bridges the gap between low-level networking and practical productivity tools with intelligent automation.


🐛 Troubleshooting

Common Issues

Issue: "Permission Denied" or "Admin required"

  • Solution: Run the application as Administrator (right-click → Run as administrator)

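Issue: "Npcap not found" or packet capture not working

  • Solution: Install the Npcap driver (required for packet capture on Windows), then restart the application as Administrator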

Issue: "ModuleNotFoundError" for dependencies

  • Solution: Run pip install -r requirements.txt in activated virtual environment

Issue: No websites appearing in dashboard

  • Solution:
    • Ensure capture is running (Start button)
    • Generate network traffic (visit websites, use browser)
    • Check that your network interface is selected
    • May take 5-10 seconds for first entries to appear

Issue: ML model crashes with "could not convert string to float"

  • Solution: Ensure you're running the latest version with port-handling fixes

Issue: Anomalies not showing in Anomalies tab

  • Solution:
    • Check that capture is active
    • Anomalies may need 30+ seconds of capture for baseline
    • Ensure thresholds are appropriately configured
    • Check application logs for errors

Issue: Dashboard freezes during packet capture

  • Solution:
    • Wait briefly: ML inference runs in background threads, so short pauses should clear on their own
    • On first run, allow about 30 seconds for initial model training
    • Let the application settle for 1-2 minutes after starting

📚 Documentation

  • QUICKSTART.md: Step-by-step setup guide
  • START_HERE.md: First-time user guide
  • TODO.md: Planned features and improvements
  • IMPROVEMENTS.md: Recent enhancements summary
  • CHANGES_SUMMARY.md: Version history

🤝 Contributing

Contributions are welcome! Areas for improvement:

  • Feature Requests: New analysis metrics or visualizations
  • Bug Fixes: Edge cases in packet capture or ML pipeline
  • Performance: Optimization for large datasets
  • Documentation: Clearer guides and examples
  • UI/UX: Enhanced dashboard design and usability

Development Setup

# Clone and setup
git clone <repo>
cd NetSense
python -m venv .venv
.venv\Scripts\activate
pip install -r requirements.txt

# Run tests (if available)
pytest

# Make changes and submit PR

📞 Support & Contact

  • Issues: GitHub Issues page
  • Discussions: GitHub Discussions for questions
  • Email: (contact information)

📄 License

This project is licensed under the MIT License - see LICENSE file for details.


🙏 Acknowledgments

  • Scapy: For powerful packet manipulation
  • scikit-learn: For machine learning capabilities
  • Tkinter: For the cross-platform GUI
  • NLTK: For natural language processing
  • Matplotlib: For beautiful visualizations

🎉 Digital awareness starts with understanding your traffic. NetSense makes it possible.

Built with ❤️ for productivity, powered by data. 🚀
