carlospolop/MalwareWorld

MalwareWorld

MalwareWorld aggregates public threat-intel blacklists into a static dataset (IPs, domains and IP ranges) and publishes it as a website on GitHub Pages.

  • Website: https://malwareworld.com/
  • HackTricks tools: https://tools.hacktricks.wiki/
  • Blacklists used (URLs): https://malwareworld.com/data/blacklists.txt

What this repo contains

  • docs/: the GitHub Pages web UI (no backend).
  • scripts/: a generator that downloads blacklists and produces:
    • text lists to download (suspiciousIPs.txt, suspiciousDomains.txt, suspiciousRanges.txt, …)
    • per-category downloads (type_<Category>_domains.txt, type_<Category>_ips.txt)
    • sharded JSON for exact lookups from the UI (IP/domain/range)
    • maps + stats per category
    • monthly archive artifacts (optional; see below)

Generation uses SQLite on disk to keep memory usage low.
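
As a rough illustration of why an on-disk SQLite database keeps memory usage low, the sketch below streams entries into a table and lets a PRIMARY KEY do the deduplication. It is not the generator's actual code: the function name dedupe_to_sqlite, the items table and the lowercase normalization are all assumptions made for this example.

```python
import os
import sqlite3
import tempfile


def dedupe_to_sqlite(lines, db_path):
    """Stream entries into an on-disk table; return the number of unique entries."""
    conn = sqlite3.connect(db_path)
    try:
        # Hypothetical schema: one row per unique value.
        conn.execute("CREATE TABLE IF NOT EXISTS items (value TEXT PRIMARY KEY)")
        # executemany consumes the generator lazily, so the full list never
        # has to be held in memory at once; duplicates are dropped by SQLite.
        conn.executemany(
            "INSERT OR IGNORE INTO items (value) VALUES (?)",
            ((line.strip().lower(),) for line in lines if line.strip()),
        )
        conn.commit()
        return conn.execute("SELECT COUNT(*) FROM items").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    sample = ["evil.example\n", "EVIL.example\n", "\n", "203.0.113.7\n"]
    with tempfile.TemporaryDirectory() as tmp:
        print(dedupe_to_sqlite(sample, os.path.join(tmp, "items.sqlite")))
```

The same pattern works when lines is an open file handle or a streamed HTTP response, which is where the memory saving matters.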

GitHub Pages setup

  1. Repo → Settings → Pages
  2. Source: GitHub Actions
  3. Run the workflow .github/workflows/publish-release-assets.yml once (or wait for the schedule).

The workflows publish a GitHub Release with the generated artifacts and also deploy the same artifacts under the Pages site at /data/ so the UI can fetch them without CORS issues.

Run locally (web UI + data generation)

  1. Install:

npm install

  2. Generate the site data:

npm run generate:site

Useful env vars:

MW_LIMIT_BLACKLISTS=10 MW_CONCURRENCY=10 MW_OUTPUT_DIR=release-assets npm run generate:site

For flaky sources (common on CI):

MW_CONCURRENCY=12 MW_RETRY_COUNT=3 MW_RETRY_DELAY_MS=90000 npm run generate:site
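
To make the effect of the retry variables concrete, here is a minimal sketch of a fixed-delay retry loop. It assumes that MW_RETRY_COUNT is the number of retries after the first attempt and that MW_RETRY_DELAY_MS is a constant wait between attempts; the generator may interpret them differently. fetch_with_retry and its parameters are hypothetical names.

```python
import time


def fetch_with_retry(fetch, url, retry_count=3, retry_delay_ms=90000, sleep=time.sleep):
    """Call fetch(url); on failure, wait and retry up to retry_count more times."""
    last_error = None
    for attempt in range(retry_count + 1):
        try:
            return fetch(url)
        except Exception as exc:  # a real downloader would catch narrower errors
            last_error = exc
            if attempt < retry_count:
                sleep(retry_delay_ms / 1000)  # convert milliseconds to seconds
    raise last_error


if __name__ == "__main__":
    attempts = []

    def flaky(url):
        # Simulated source that fails twice before succeeding.
        attempts.append(url)
        if len(attempts) < 3:
            raise OSError("temporary failure")
        return "payload"

    # sleep is replaced with a no-op so the demo finishes immediately.
    print(fetch_with_retry(flaky, "https://example.invalid/list.txt", sleep=lambda s: None))
    print(len(attempts))
```

With the values shown above (3 retries, 90 seconds apart), a source that stays down costs about four and a half minutes before the run gives up on it.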

Monthly archive (union of all non-whitelist items seen during the month):

MW_MONTHLY=1 MW_MONTHLY_DB_PATH=release-assets/monthly-YYYY-MM-archive.sqlite npm run generate:site

Optional seed to merge the current run into an existing monthly archive:

MW_MONTHLY_SEED_PATH=path/to/monthly-YYYY-MM-archive.sqlite

  3. Serve the UI:

python3 -m http.server 4173 --directory docs

Option A (recommended for local development): generate into docs/data/ so the UI auto-detects it:

MW_OUTPUT_DIR=docs/data npm run generate:site

Open http://127.0.0.1:4173/

Option B: serve release-assets/ separately and point the UI to it:

python3 -m http.server 4174 --directory release-assets

Open http://127.0.0.1:4173/?releaseBase=http://127.0.0.1:4174/

  4. Quick end-to-end smoke test (generates a small subset and verifies outputs + HTTP fetches):

npm test

CLI lookup tools (Node + Python)

The same sharded JSON layout used by the UI can be queried from the command line. Scripts live in tools/lookup/.

Local data (uses docs/data/ if present):

node tools/lookup/lookup.js example.com

python3 tools/lookup/lookup.py 1.1.1.1

Remote release assets:

node tools/lookup/lookup.js example.com --base https://github.com/<owner>/<repo>/releases/latest/download/

python3 tools/lookup/lookup.py 1.1.1.1 --base https://malwareworld.com/data/
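
The idea behind sharded exact lookups can be sketched as follows: the client derives a shard name from the queried value and loads only that one small file. The actual shard layout is not described here, so everything in this example is hypothetical, including the SHA-256 prefix rule, the domains/ directory, the file naming and the JSON shape.

```python
import hashlib
import json
import os
import tempfile


def shard_name(value, prefix_len=2):
    """Hypothetical sharding rule: first hex chars of the SHA-256 of the value."""
    digest = hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()
    return digest[:prefix_len]


def lookup(base_dir, kind, value):
    """Load only the shard that could contain value; return its entry or None."""
    path = os.path.join(base_dir, kind, shard_name(value) + ".json")
    if not os.path.exists(path):
        return None
    with open(path, encoding="utf-8") as fh:
        shard = json.load(fh)
    return shard.get(value.strip().lower())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as base:
        # Write a single demo shard containing one made-up entry.
        os.makedirs(os.path.join(base, "domains"))
        shard_path = os.path.join(base, "domains", shard_name("evil.example") + ".json")
        with open(shard_path, "w", encoding="utf-8") as fh:
            json.dump({"evil.example": ["Malware"]}, fh)
        print(lookup(base, "domains", "Evil.example"))
        print(lookup(base, "domains", "example.com"))
```

Because the shard is computed from the query itself, a static host such as GitHub Pages can answer exact lookups without any backend.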

Note about removals

This repo previously exposed a Node.js “library” API (including external intelligence lookups). That code path has been removed: MalwareWorld now focuses on static data generation and GitHub Pages.

Monthly archives (how it works)

MalwareWorld can keep a monthly union of all malicious IPs/domains/ranges/URLs with their categories and IP geolocation. When enabled:

  • scripts/generate-site-data.js attaches/creates a monthly SQLite archive: monthly-YYYY-MM-archive.sqlite.
  • Each run merges the current dataset into the archive (non-whitelist only).
  • It also generates monthly map and stats assets: monthly-YYYY-MM-map_{Type}.geojson and monthly-YYYY-MM-stats_{Type}.json.
  • The workflow .github/workflows/publish-release-assets.yml publishes those files as a release and updates docs/data/monthly/index.json with the month → release URL mapping.
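
The attach-and-merge step in the list above can be sketched with SQLite's ATTACH DATABASE and INSERT OR IGNORE. This is only an illustration in Python, not the logic of scripts/generate-site-data.js: the items table, its value/category/whitelisted columns and both function names are assumptions.

```python
import os
import sqlite3
import tempfile


def make_current_db(path, rows):
    """Create a stand-in for one run's dataset (hypothetical schema)."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE items (value TEXT, category TEXT, whitelisted INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?, ?)", rows)
    conn.commit()
    conn.close()


def merge_into_monthly(current_db, monthly_db):
    """Union the non-whitelist rows of current_db into monthly_db; return archive size."""
    conn = sqlite3.connect(current_db)
    try:
        # ATTACH creates the archive file if it does not exist yet.
        conn.execute("ATTACH DATABASE ? AS monthly", (monthly_db,))
        conn.execute(
            "CREATE TABLE IF NOT EXISTS monthly.items ("
            "value TEXT, category TEXT, PRIMARY KEY (value, category))"
        )
        # The primary key turns repeated merges into a set union.
        conn.execute(
            "INSERT OR IGNORE INTO monthly.items (value, category) "
            "SELECT value, category FROM main.items WHERE whitelisted = 0"
        )
        conn.commit()
        return conn.execute("SELECT COUNT(*) FROM monthly.items").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        monthly = os.path.join(tmp, "monthly-archive.sqlite")
        run1 = os.path.join(tmp, "run1.sqlite")
        run2 = os.path.join(tmp, "run2.sqlite")
        make_current_db(run1, [("evil.example", "Malware", 0), ("good.example", "Whitelist", 1)])
        make_current_db(run2, [("evil.example", "Malware", 0), ("203.0.113.7", "Botnet", 0)])
        print(merge_into_monthly(run1, monthly))
        print(merge_into_monthly(run2, monthly))
```

An item that appears in several runs during the month is stored once, and whitelisted rows never reach the archive.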

The UI reads docs/data/monthly/index.json and shows a Month selector in the Maps section. When a month is selected, the UI loads the monthly map/stats files from the release URL. Because GitHub Releases do not provide CORS headers, the UI fetches monthly files through a configurable CORS proxy (default: https://api.allorigins.win/raw?url=). Override via:

?corsProxy=https://your-proxy/?url= or disable with ?corsProxy=none.
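
A small sketch of how a monthly file URL might be routed through the proxy. The file-name pattern and the default proxy prefix come from the text above; the percent-encoding of the target URL, the helper names, the "Malware" type and the example release URL are assumptions, and the UI's real logic may differ.

```python
from urllib.parse import quote

DEFAULT_PROXY = "https://api.allorigins.win/raw?url="


def monthly_asset_name(month, kind, data_type):
    """Build a name such as monthly-2024-05-map_Malware.geojson."""
    ext = "geojson" if kind == "map" else "json"
    return f"monthly-{month}-{kind}_{data_type}.{ext}"


def proxied_url(target_url, cors_proxy=DEFAULT_PROXY):
    """Prefix target_url with the proxy, or return it unchanged for 'none'."""
    if cors_proxy == "none":
        return target_url
    # Assumption: the proxy expects the target as a percent-encoded parameter.
    return cors_proxy + quote(target_url, safe="")


if __name__ == "__main__":
    # Hypothetical release base URL; real ones come from docs/data/monthly/index.json.
    release_base = "https://example.invalid/releases/download/monthly-2024-05/"
    target = release_base + monthly_asset_name("2024-05", "map", "Malware")
    print(proxied_url(target))
    print(proxied_url(target, cors_proxy="none"))
```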

Downloadable files (from the web UI)

These files are published under /data/ on GitHub Pages and also as release assets:

Monthly archive artifacts (not in Pages; only in Releases):

Monthly archive workflow notes

  • The Pages site always serves the latest dataset under /data/.
  • Monthly archives and monthly map/stats are kept in Releases (not in Pages) to avoid unbounded Pages growth.
  • The monthly index file (docs/data/monthly/index.json) is committed by the workflow so the UI can discover months without calling the GitHub API.

License

MIT