GitHub - booringreader/web-crawler: a non-recursive crawler. It visits only the child URLs listed in urls.csv and never adds new ones, so this little crawler doesn't go out and map the entire internet :)

    git clone https://github.com/booringreader/web-crawler.git
    cd web-crawler

If the existing virtual environment doesn't work, remove the venv directory, create a new one, and install the dependency:

    python -m venv venv # macOS/Linux
    pip install beautifulsoup4

Run urls.py first and enter the root URL (the first page, the entry point); this populates urls.csv.
Then run mails.py.
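
The two-step flow above can be sketched roughly as follows. This is only an illustration, not the code in urls.py or mails.py: the function names are hypothetical, the one-URL-per-row layout for urls.csv is an assumption, and the sketch uses the standard library's html.parser instead of BeautifulSoup so it runs without extra installs. It also assumes mails.py collects e-mail addresses, which the README does not state. Pages are passed in as strings; fetching them is left out.

```python
import csv
import io
import re
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin


class LinkCollector(HTMLParser):
    """Collect the href of every <a> tag on a single page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the root URL and drop #fragments.
        absolute, _fragment = urldefrag(urljoin(self.base_url, href))
        # Keep only web links (skips mailto:, javascript:, ...) and avoid duplicates.
        if absolute.startswith(("http://", "https://")) and absolute not in self.links:
            self.links.append(absolute)


def collect_child_urls(root_html, root_url):
    """Step 1 (hypothetical): list the links on the root page only, no recursion."""
    parser = LinkCollector(root_url)
    parser.feed(root_html)
    return parser.links


def write_urls_csv(urls, handle):
    """Write one URL per row, the layout assumed here for urls.csv."""
    writer = csv.writer(handle)
    for url in urls:
        writer.writerow([url])


# A deliberately simple pattern; it will miss unusual but valid addresses.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def extract_emails(page_html):
    """Step 2 (hypothetical): pull e-mail addresses out of one fetched page."""
    return sorted(set(EMAIL_RE.findall(page_html)))


if __name__ == "__main__":
    sample = '<a href="/about">About</a> <a href="https://example.com/team#staff">Team</a>'
    buffer = io.StringIO()  # stands in for the real urls.csv file
    write_urls_csv(collect_child_urls(sample, "https://example.com/"), buffer)
    print(buffer.getvalue())
    print(extract_emails("Contact: info@example.com"))
```

Because the second step only reads the URLs already written to the file and never follows links found on those pages, the crawl stays one level deep, which is the non-recursive behaviour described at the top.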
