Soodud is a web app that scrapes product data from online stores with Python, then uses hierarchical cluster analysis implemented in C++ to group comparable products across stores. The results are stored in PostgreSQL, served by a Django REST API, and presented by a React and TailwindCSS frontend.
CI/CD is implemented with GitHub Actions and Docker Compose. Nginx and fail2ban are used to compress, cache, and serve static files, provide rate limiting, and detect malicious bots. All commits are run through flake8 and other pre-commit hooks.
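To illustrate the idea behind the product matching step, here is a minimal pure-Python sketch of agglomerative (bottom-up hierarchical) clustering over product names. It is not the project's C++ implementation: the token-based Jaccard distance, the average-linkage rule, the `threshold` value, and all function names are assumptions chosen for illustration only.

```python
from itertools import combinations


def tokens(name: str) -> frozenset[str]:
    """Lowercase a product name and split it into a set of word tokens."""
    return frozenset(name.lower().split())


def jaccard_distance(a: frozenset[str], b: frozenset[str]) -> float:
    """Return 1 - |a & b| / |a | b|; 0.0 means identical token sets."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster_distance(c1: list[int], c2: list[int], names: list[str]) -> float:
    """Average linkage: mean pairwise distance between members of two clusters."""
    total = sum(
        jaccard_distance(tokens(names[i]), tokens(names[j]))
        for i in c1
        for j in c2
    )
    return total / (len(c1) * len(c2))


def agglomerate(names: list[str], threshold: float = 0.5) -> list[list[int]]:
    """Repeatedly merge the two closest clusters until none are within threshold.

    Returns clusters as lists of indices into `names`.
    """
    clusters = [[i] for i in range(len(names))]
    while len(clusters) > 1:
        # Find the closest pair of clusters (x < y by construction).
        (x, y), dist = min(
            (
                ((a, b), cluster_distance(clusters[a], clusters[b], names))
                for a, b in combinations(range(len(clusters)), 2)
            ),
            key=lambda item: item[1],
        )
        if dist > threshold:
            break  # remaining clusters are too dissimilar to merge
        clusters[x] = clusters[x] + clusters[y]
        del clusters[y]
    return clusters


# Hypothetical product names, as if scraped from two different stores.
names = [
    "Milk 2.5% 1l",
    "milk 1l 2.5%",
    "Rye bread 500g",
    "Bread rye 500g",
    "Dark chocolate 100g",
]
print(agglomerate(names))  # [[0, 1], [2, 3], [4]]
```

Each resulting cluster is a group of listings that can be treated as the same product across stores. This naive version recomputes every pairwise distance on each pass, which is only practical for small inputs.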
- Clone the project.
- Create a valid `.env` file based on `.env.example`.
- In order to contribute, first install the required git commit hooks with `cd django && pipenv run pre-commit install`.
- Install dependencies using `cd client && npm install --dev` and `cd django && pipenv install`.
- Build the C++ project and move `clustering/out/clustering.(so|pyd)` into the `django/data/stores/` directory.
- Start the webpack dev server using `cd client && npm run server`.
- Start the Python virtual environment with `cd django && pipenv shell`.
- Start the Django dev server with `tools/start_server.sh`.
- To scrape new product data and form updated product clusters, run `tools/run_service.sh launch` and `tools/run_service.sh match` respectively.
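The step that places the built clustering module in `django/data/stores/` can be scripted. The following is a minimal sketch, not part of the repository: the function name `install_clustering_module` is hypothetical, and it copies the file rather than moving it so the build output stays in place.

```python
import shutil
from pathlib import Path


def install_clustering_module(repo_root: Path) -> Path:
    """Copy the built clustering extension into django/data/stores/.

    Hypothetical helper; paths follow the setup steps in this README.
    """
    out_dir = repo_root / "clustering" / "out"
    dest_dir = repo_root / "django" / "data" / "stores"

    # Python extension modules end in .so on Linux/macOS and .pyd on Windows.
    candidates = [out_dir / f"clustering{ext}" for ext in (".so", ".pyd")]
    built = [path for path in candidates if path.is_file()]
    if not built:
        raise FileNotFoundError(f"no clustering.(so|pyd) found in {out_dir}")

    dest_dir.mkdir(parents=True, exist_ok=True)
    # copy2 preserves file metadata and returns the destination file path.
    return Path(shutil.copy2(built[0], dest_dir))
```

Calling `install_clustering_module(Path("."))` from the repository root after building the C++ project would then put the module where the Django code expects it.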
- If this is your initial configuration, temporarily disable HTTPS in `nginx/nginx.conf` by commenting out the `include`.
- Run Docker Compose with `tools/compose.sh`.
- Create a new cronjob with `tools/cron.txt` as a reference. This will ensure that the product database is updated once a day.
