A Python script that aggregates, cleans, and verifies DC++ hub lists from multiple sources.
- Multi-source aggregation: Downloads hub lists from various online sources
- Smart deduplication: Identifies and merges duplicate hubs based on addresses, failovers, and metadata
- Hub verification: Optional ping verification using DCPing tool
- Smart filtering: Removes full hubs and explicitly offline hubs; keeps hubs that fail for other reasons, marked as offline
- Concurrent processing: Multi-threaded ping operations for faster verification
- Output formats: Generates both XML and compressed BZ2 versions
- Configurable: Command-line options for timeout, workers, output files, and more
- Python 3.6 or higher
- DCPing tool (optional, for hub verification)
DCPing is required for hub verification. You can build it from source:
# Clone the repository
git clone https://github.com/direct-connect/go-dcpp.git
cd go-dcpp
# Build the dcping binary
go build ./cmd/dcping

The script is compatible with DCPing version v0.26.0 or later.
python3 hublist.py

This will download hub lists, deduplicate entries, and generate hublist.xml and hublist.xml.bz2 without ping verification.
python3 hublist.py --ping-tool /path/to/dcping

python3 hublist.py --help

Available options:
- --ping-tool PATH: Path to DCPing executable for hub verification
- --no-ping: Skip hub verification even if ping tool is available
- --timeout SECONDS: Network timeout in seconds (default: 10)
- --output FILE: Output XML filename (default: hublist.xml)
- --max-ping-workers NUM: Maximum concurrent ping workers (default: 5)
- --ping-timeout SECONDS: Ping timeout per hub in seconds (default: 15)
- --verbose, -v: Enable verbose output
- --help: Show help message
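The options above map naturally onto `argparse`. The following is a minimal sketch of such a parser, not the actual code in hublist.py: the function name `build_parser` and the help strings are assumptions, while the flags and defaults are taken from the list above.

```python
import argparse


def build_parser():
    # Hypothetical helper: mirrors the documented options, but the real
    # hublist.py may declare them differently.
    parser = argparse.ArgumentParser(
        prog="hublist.py",
        description="Aggregate, clean, and verify DC++ hub lists.",
    )
    parser.add_argument("--ping-tool", metavar="PATH",
                        help="path to the DCPing executable")
    parser.add_argument("--no-ping", action="store_true",
                        help="skip hub verification")
    parser.add_argument("--timeout", type=int, default=10, metavar="SECONDS",
                        help="network timeout in seconds")
    parser.add_argument("--output", default="hublist.xml", metavar="FILE",
                        help="output XML filename")
    parser.add_argument("--max-ping-workers", type=int, default=5, metavar="NUM",
                        help="maximum concurrent ping workers")
    parser.add_argument("--ping-timeout", type=int, default=15, metavar="SECONDS",
                        help="ping timeout per hub in seconds")
    parser.add_argument("--verbose", "-v", action="store_true",
                        help="enable verbose output")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(["--ping-tool", "./dcping", "-v"])
print(args.ping_tool, args.timeout, args.max_ping_workers, args.verbose)
```

`--help` is not declared explicitly because `argparse` adds it automatically.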
The script downloads hub lists from configured sources.
- Parsing: Extracts hub information from XML/BZ2 files
- Normalization: Completes missing protocols and ports
- Filtering: Removes hubs with unsupported encodings or protocols
- Deduplication: Identifies and merges duplicate hubs using:
  - Direct address comparison
  - Failover address matching
  - Name/Description/Encoding matching (for NMDC hubs)
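The normalization and deduplication steps can be illustrated with a short sketch. It is not the script's actual implementation: the function names, the `dchub` default scheme, and the default port 411 for plain NMDC are assumptions made for the example, and only the address and failover criteria are shown (the name/description/encoding match is left out). Hubs are plain dicts keyed by the attribute names used in the hub list.

```python
from urllib.parse import urlsplit

DEFAULT_SCHEME = "dchub"                     # assumption: bare addresses are NMDC
DEFAULT_PORTS = {"dchub": 411, "nmdc": 411}  # assumption: 411 for plain NMDC only


def normalize_address(address):
    """Return scheme://host[:port] in lowercase, filling in missing parts."""
    address = address.strip()
    if "://" not in address:
        address = f"{DEFAULT_SCHEME}://{address}"
    parts = urlsplit(address)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    port = parts.port or DEFAULT_PORTS.get(scheme)
    return f"{scheme}://{host}:{port}" if port else f"{scheme}://{host}"


def hub_keys(hub):
    """Normalized address plus failover address, if the hub has one."""
    keys = {normalize_address(hub["Address"])}
    if hub.get("Failover"):
        keys.add(normalize_address(hub["Failover"]))
    return keys


def deduplicate(hubs):
    """Merge hubs that share an address or failover with an earlier hub."""
    merged = []
    index = {}  # normalized address -> position in merged
    for hub in hubs:
        keys = hub_keys(hub)
        pos = next((index[k] for k in keys if k in index), None)
        if pos is None:
            pos = len(merged)
            merged.append(dict(hub))
        else:
            # Keep the first entry; copy only attributes it does not have yet.
            for field, value in hub.items():
                merged[pos].setdefault(field, value)
        for k in keys:
            index.setdefault(k, pos)
    return merged


hubs = [
    {"Address": "example.org", "Name": "A",
     "Failover": "adcs://backup.example.net:1511"},
    {"Address": "dchub://EXAMPLE.org:411", "Name": "A (dup)", "Country": "DE"},
    {"Address": "adcs://backup.example.net:1511"},
    {"Address": "adc://other.example.com:1511"},
]
print(len(deduplicate(hubs)))
```

This single pass does not merge two groups that only turn out to be related through a later entry; a union-find structure would be one way to handle that case.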
If DCPing is provided:
- Pings each hub to verify online status
- Removes full hubs (ErrCode 226) and explicitly offline hubs
- Keeps hubs with other errors (timeout, network issues) with 'Offline' status
- Updates hub statistics (users, shared data, etc.)
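The verification rules above can be sketched as follows. This example does not invoke DCPing: `ping` is a hypothetical callable that returns a result dict, and the keys `errcode`, `offline`, `error`, and `users`, as well as the `Online` status value, are assumptions made for illustration. Only the ErrCode 226 rule and the `Offline` status come from the description above.

```python
from concurrent.futures import ThreadPoolExecutor

FULL_HUB_ERRCODE = 226  # from the text: a hub reporting this code is full


def verify_hubs(hubs, ping, max_workers=5):
    """Ping hubs concurrently and apply the keep/remove rules.

    `ping` is a hypothetical callable: address -> result dict.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves input order, so results line up with hubs.
        results = list(pool.map(lambda hub: ping(hub["Address"]), hubs))

    kept = []
    for hub, result in zip(hubs, results):
        if result.get("errcode") == FULL_HUB_ERRCODE or result.get("offline"):
            continue  # full or explicitly offline: remove
        hub = dict(hub)
        if result.get("error"):
            hub["Status"] = "Offline"  # timeout or network issue: keep, mark
        else:
            hub["Status"] = "Online"   # assumed status value
            hub["Users"] = result.get("users", hub.get("Users"))
        kept.append(hub)
    return kept


# Canned results standing in for real ping output.
FAKE_RESULTS = {
    "adc://a.example:1511": {"users": 42},
    "adc://b.example:1511": {"errcode": 226, "error": "hub full"},
    "adc://c.example:1511": {"offline": True},
    "adc://d.example:1511": {"error": "timeout"},
}


def fake_ping(address):
    return FAKE_RESULTS[address]


verified = verify_hubs([{"Address": a} for a in FAKE_RESULTS], fake_ping)
for hub in verified:
    print(hub["Address"], hub["Status"])
```

Threads suit this workload because each ping spends most of its time waiting on the network, which is presumably why the script exposes `--max-ping-workers`.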
Creates two files:
- hublist.xml: Clean, deduplicated hub list in XML format
- hublist.xml.bz2: Compressed version for efficient distribution
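As an illustration of the output step, the sketch below writes a list of hub dicts to XML and then compresses the same bytes with BZ2. The function name `write_hublist` and the `Hublist`/`Hubs`/`Hub` element layout are assumptions for the example; the real script may add further attributes or metadata.

```python
import bz2
import os
import tempfile
import xml.etree.ElementTree as ET


def write_hublist(hubs, xml_path="hublist.xml"):
    """Write hubs as XML and a .bz2 copy; return both paths."""
    root = ET.Element("Hublist")           # assumed root element
    hubs_el = ET.SubElement(root, "Hubs")  # assumed container element
    for hub in hubs:
        # Each hub becomes one <Hub .../> element with string attributes.
        ET.SubElement(hubs_el, "Hub", {k: str(v) for k, v in hub.items()})
    ET.ElementTree(root).write(xml_path, encoding="utf-8", xml_declaration=True)

    bz2_path = xml_path + ".bz2"
    with open(xml_path, "rb") as src, bz2.open(bz2_path, "wb") as dst:
        dst.write(src.read())
    return xml_path, bz2_path


# Demonstrate in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    sample = [{"Address": "adc://hub.example.org:1511", "Name": "Example", "Users": 42}]
    xml_file, bz2_file = write_hublist(sample, os.path.join(tmp, "hublist.xml"))
    with open(xml_file, "rb") as f:
        raw = f.read()
    with bz2.open(bz2_file, "rb") as f:
        assert f.read() == raw  # the archive decompresses to the same XML
    print(len(raw), "bytes of XML written")
```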
Edit the following variables in hublist.py:
OWN_HUBLIST = "https://dcnf.github.io/Hublist/ownDataHublist.xml"
INTERNET_HUBLISTS = [
"https://www.te-home.net/?do=hublist&get=hublist.xml",
"https://dchublist.org/hublist.xml.bz2",
# ... more sources
]
LOCAL_HUBLISTS = [] # Add local file paths here

The script supports:
- Protocols: ADC, ADCS, DCHUB, DCHUBS, NMDC, NMDCS
- NMDC Encodings: UTF-8, CP1250-1257, GB18030
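A filter based on these two lists might look like the sketch below. It is an illustration only: the function name, the lowercase `cp125x` spelling of encoding names, and the choice to accept NMDC hubs that declare no encoding are assumptions, and addresses are expected to carry a scheme already.

```python
SUPPORTED_PROTOCOLS = {"adc", "adcs", "dchub", "dchubs", "nmdc", "nmdcs"}
# UTF-8, CP1250 through CP1257, and GB18030, compared in lowercase.
SUPPORTED_ENCODINGS = {"utf-8", "gb18030"} | {f"cp{n}" for n in range(1250, 1258)}


def is_supported(hub):
    """Return True if the hub's protocol and encoding are in the lists."""
    scheme = hub["Address"].split("://", 1)[0].lower()
    if scheme not in SUPPORTED_PROTOCOLS:
        return False
    if scheme.startswith("adc"):
        return True  # the encoding list applies to NMDC hubs only
    encoding = hub.get("Encoding", "").strip().lower()
    # Assumption: an NMDC hub without a declared encoding is accepted.
    return encoding == "" or encoding in SUPPORTED_ENCODINGS


hubs = [
    {"Address": "adcs://hub.example.org:1511"},
    {"Address": "dchub://hub.example.net:411", "Encoding": "CP1252"},
    {"Address": "dchub://hub.example.com:411", "Encoding": "KOI8-R"},
    {"Address": "http://hub.example.org"},
]
print([is_supported(h) for h in hubs])
```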
The script processes these hub attributes:
- Address, Name, Description, Users, Country
- Shared, Minshare, Minslots, Maxhubs, Maxusers
- Reliability, Rating, Encoding, Software, Website
- Email, ASN, Operators, Bots, Infected, Status, Failover
- DC++: dcpp/FavoriteManager.cpp#l322
- AirDC++: modules/HublistManager.cpp#L58
- EiskaltDC++: dcpp/FavoriteManager.cpp#L255
- FlyLinkDC: windows/PublicHubsFrm.cpp#L53
Use verbose output to see detailed processing:
python3 hublist.py --ping-tool ./dcping --verbose

This project is licensed under the GPLv2 or later. See the LICENSE file for details.