Scrape short-term listing providers (currently Airbnb).
Given a search query, e.g. "San Diego, CA" or "Rome, Italy", search Airbnb inventory and collect data on listings. Save results to a CSV file or Elasticsearch.
Airbnb's API may change at any moment, which would break this scraper; it has already changed several times in the past. I have not used this scraper in several months, so it may need updating to work with the latest version of the Airbnb website.
Run with Python venv:

```shell
# activate the virtual env
. .venv/bin/activate

# run the script
./stl.py search "Madrid, Spain"
```

Or run with Docker Compose:

```shell
# spin up the containers
docker compose up -d

# ensure dependencies are installed
docker compose exec jupyter-scipy-notebook conda install --yes --file work/requirements.txt

# run the script
docker compose exec jupyter-scipy-notebook /opt/conda/bin/python work/stl.py search -v "Madrid, Spain"
```

Full usage:

```
Short-Term Listings (STL) Scraper

Usage:
    stl.py search <query> [--checkin=<checkin> --checkout=<checkout>
                           [--priceMin=<priceMin>] [--priceMax=<priceMax>]]
                          [--roomTypes=<roomTypes>] [--storage=<storage>] [-v|--verbose]
    stl.py calendar (<listingId> | --all)
    stl.py pricing <listingId> --checkin=<checkin> --checkout=<checkout>
    stl.py data <listingId>

Arguments:
    <query>      The query string to search (e.g. "San Diego, CA")
    <listingId>  The listing id

Options:
    --checkin=<checkin>    Check-in date, e.g. "2023-06-01"
    --checkout=<checkout>  Check-out date, e.g. "2023-06-30"
    --priceMin=<priceMin>  Minimum nightly or monthly price
    --priceMax=<priceMax>  Maximum nightly or monthly price
    --all                  Update calendar for all listings (requires Elasticsearch backend)

Global Options:
    --currency=<currency>  "USD", "EUR", etc. [default: USD]
    --source=<source>      Only allows "airbnb" for now. [default: airbnb]
```
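A few example invocations of the subcommands above (the listing id here is a hypothetical placeholder):

```shell
# search Rome for a June 2023 stay, capped at 150 per night
./stl.py search "Rome, Italy" \
    --checkin=2023-06-01 --checkout=2023-06-30 --priceMax=150

# quote a specific listing for the same dates (hypothetical listing id)
./stl.py pricing 12345678 --checkin=2023-06-01 --checkout=2023-06-30

# refresh calendars for every stored listing (requires the Elasticsearch backend)
./stl.py calendar --all
```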
Requirements:

- Python >= 3.10, or Docker Compose

Option 1: Python venv

This option assumes you have Python >= 3.10 installed and that you will manage dependencies with the python venv module and pip. You can connect to your own instance of Elasticsearch, but Elasticsearch is not required.
```shell
# create the config file
cp .env.dist .env

# create the virtual env
python3 -m venv .venv

# activate the virtual env
. .venv/bin/activate

# install dependencies in the virtual env
pip install -r requirements.txt
```
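If you do point the scraper at your own Elasticsearch instance, the connection details belong in `.env`. The variable names below are a hypothetical sketch; use the ones actually defined in `.env.dist`:

```shell
# hypothetical .env entries -- check .env.dist for the real variable names
ELASTICSEARCH_URL=https://localhost:9200
ELASTICSEARCH_USERNAME=elastic
ELASTICSEARCH_PASSWORD=abc123
```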
Option 2: Docker Compose

This option uses docker compose to build:

- jupyter-scipy-notebook: jupyter scipy notebook, python, conda
- setup: temporary container that configures elasticsearch & kibana security
- es01: elasticsearch container
- kibana: kibana container
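Once the containers are up (see the commands below), you can confirm the services are running:

```shell
# list service state; es01, kibana, and jupyter-scipy-notebook should be running,
# while the one-shot setup container exits after configuring security
docker compose ps
```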
```shell
# create the config file
cp .env.dist .env

# create the containers
docker compose up -d

# install project requirements
docker compose exec jupyter-scipy-notebook conda install --yes --file work/requirements.txt
```

You can view records in Elasticsearch directly by using Kibana:
- Scrape some listings using the commands above
- Browse to http://localhost:5601/app/management/kibana/dataViews (u: elastic / p: abc123)
- Click "Create new data view" at the top right
- Use `short-term-listings` as both the name and the index pattern, then click "Save data view to Kibana"
- Click "Analytics > Discover" in the main menu, select the `short-term-listings` data view, and see the JSON records
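Alternatively, a quick sanity check from the shell, assuming Elasticsearch listens on the default port 9200 with the credentials above (adjust the URL scheme and add `--cacert` if your setup enforces TLS):

```shell
# count documents in the short-term-listings index
curl -u elastic:abc123 "http://localhost:9200/short-term-listings/_count?pretty"

# fetch a single record to inspect its fields
curl -u elastic:abc123 "http://localhost:9200/short-term-listings/_search?size=1&pretty"
```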
- If using Elasticsearch, you may need to run `sudo sysctl -w vm.max_map_count=262144` on your host machine (see https://www.elastic.co/guide/en/elasticsearch/reference/current/vm-max-map-count.html)
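That sysctl setting does not survive a reboot. To make it persistent, the standard Linux approach (not something this project configures for you) is:

```shell
# persist the setting across reboots, then apply it immediately
echo "vm.max_map_count=262144" | sudo tee -a /etc/sysctl.conf
sudo sysctl -p
```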