Open
Description
While checking the Python code in the get_noaa_smoke demo for coding style compliance with pep8online.com, I found several PEP 8 issues.
Modified code that passes the PEP 8 checks follows:
import re
import os.path

import scrapy
from scrapy.http import Request
from scrapy.crawler import CrawlerProcess

# Module-level so the list comprehension below can see it on Python 3.
DOMAIN = "satepsanone.nesdis.noaa.gov"


class get_hms_shapefiles(scrapy.Spider):
    """Get daily SMOKE files from May through Sept. for the years 2008-2017."""
    name = "get_hms_shapefiles"
    domain = DOMAIN
    allowed_domains = [domain]
    # range() excludes its stop value, so 2018 is needed to include 2017.
    start_urls = ["http://%s/pub/volcano/FIRE/HMS_ARCHIVE/%s/GIS/SMOKE/" %
                  (DOMAIN, year) for year in range(2008, 2018)]

    def parse(self, response):
        # Follow only links to May-September shapefile components.
        for href in response.xpath('//a/@href').extract():
            regexp = r'hms_smoke[0-9]{4}0[5-9]{1}[0-9]{2}\.(dbf|shp|shx)\.gz$'
            if re.match(regexp, href):
                yield Request(url=response.urljoin(href),
                              callback=self.save_file)

    def save_file(self, response):
        # Save under the last URL path component; skip files already on disk.
        path = response.url.split('/')[-1]
        if not os.path.exists(path):
            with open(path, 'wb') as f:
                f.write(response.body)


process = CrawlerProcess()
process.crawl(get_hms_shapefiles)
process.start()  # blocks until the crawl finishes
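
As a quick sanity check of the link filter, the pattern used in parse() can be exercised on its own without running a crawl. The sketch below is only illustrative: the helper name is_wanted and the sample file names are made up for this example and are not taken from the NOAA archive listing.

```python
import re

# Same pattern as in the spider's parse() method.
HMS_SMOKE_RE = re.compile(
    r'hms_smoke[0-9]{4}0[5-9]{1}[0-9]{2}\.(dbf|shp|shx)\.gz$')


def is_wanted(href):
    """Return True if href names a May-September smoke shapefile part."""
    # re.match anchors at the start of the string, so a path prefix
    # in front of "hms_smoke" would also be rejected.
    return HMS_SMOKE_RE.match(href) is not None


# Hypothetical file names, for illustration only.
samples = [
    "hms_smoke20100715.shp.gz",  # July: accepted
    "hms_smoke20100415.shp.gz",  # April: month outside 05-09
    "hms_smoke20100715.prj.gz",  # extension not dbf, shp or shx
    "hms_smoke20100715.shp",     # not gzipped
]
for name in samples:
    print(name, is_wanted(name))
```

Only the first sample is accepted; the other three are rejected for the reasons given in the comments.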