Skip to content

Robinson-45/epc-scraper

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

11 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

EPC Data Scraper

EPC Data Scraper lets you easily extract detailed energy performance certificates (EPCs) from the official UK government database. It simplifies property-level energy data collection, giving you clean, structured datasets for analytics, monitoring, and reporting.

Whether you’re tracking building efficiency, analyzing property trends, or identifying expired EPCs, this tool handles millions of records efficiently and accurately.

Bitbash Banner

Telegram   WhatsApp   Gmail   Website

Created by Bitbash, built to showcase our approach to Scraping and Automation!
If you are looking for EPC Scraper you've just found your team — Let’s Chat. 👆👆

Introduction

The EPC Data Scraper automates the process of collecting UK Energy Performance Certificate data. It’s designed for real estate analysts, energy consultants, and data professionals who need structured, up-to-date information from the public EPC registry.

Why It Matters

  • Collects comprehensive EPC details directly from the UK’s public energy certificate database.
  • Supports continuous monitoring to detect new or updated EPCs automatically.
  • Enables energy efficiency tracking across regions or property types.
  • Exports data in multiple formats for analytics or integration.
  • Identifies expired certificates to support compliance and sustainability reviews.

Features

Feature Description
Full EPC Crawling Extracts EPC data from all pages of a given postcode listing.
Monitoring Mode Detects newly added certificates since your last run.
Expiry Detection Flags expired or soon-to-expire certificates.
Multi-format Export Supports JSON, CSV, Excel, XML, RSS, and HTML outputs.
Data Deduplication Removes duplicate EPCs from overlapping postcodes.
Incremental Updates Optionally adds empty records to track removed listings.
Simple Configuration Only requires listing URLs — sensible defaults for other options.
High Scalability Efficiently crawls millions of records with minimal setup.

What Data This Scraper Extracts

Field Name Field Description
url Direct link to the EPC certificate page.
postCode Postal code of the property.
locality Local area or town name.
address Full property address.
rating Energy efficiency rating (A–G).
id Unique EPC identifier.
propertyType Type of property (e.g., flat, detached house).
floorArea Total floor area in square meters.
currentScore Current EPC energy score.
potentialScore Potential energy score after improvements.
primaryUsage Main usage type of the property.
averageBill Average annual energy cost.
potentialSaving Potential yearly cost savings.
averageCostYear Year of the average cost estimate.
co2Produces CO2 emissions currently produced.
co2Potential Potential reduced CO2 emissions.
features List of building features with descriptions and ratings.
changes Recommended improvements with costs and savings.
assessorName Name of the energy assessor.
assessorPhone Assessor’s phone number.
assessorEmail Assessor’s email address.
accreditationScheme Accreditation authority.
accreditationAssessorID Assessor’s certification ID.
accreditationPhone Accreditation body contact number.
accreditationEmail Accreditation body contact email.
assessmentDate Date of the energy assessment.
certificateDate Issue date of the EPC certificate.
assessmentType Method used for the assessment (e.g., RdSAP).
validtillDate Certificate expiry date.
expired Boolean flag indicating whether the certificate is expired.

Example Output

[
  {
    "url": "https://find-energy-certificate.service.gov.uk/energy-certificate/0010-2129-7282-2472-9215",
    "postCode": "BN1 3JB",
    "locality": "BRIGHTON",
    "address": "Flat 6,36 Dyke Road",
    "rating": "C",
    "id": "0010-2129-7282-2472-9215",
    "propertyType": "Mid-floor flat",
    "floorArea": "35 square metres",
    "currentScore": "75 C",
    "potentialScore": "79 C",
    "features": [
      {
        "name": "Wall",
        "description": "Solid brick, with internal insulation",
        "rating": "Good"
      },
      {
        "name": "Window",
        "description": "Partial double glazing",
        "rating": "Poor"
      }
    ],
    "primaryUsage": 327,
    "averageBill": 488,
    "potentialSaving": 94,
    "averageCostYear": 2022,
    "co2Produces": 1.9,
    "co2Potential": 1.5,
    "changes": [
      {
        "name": "Heat recovery system for mixer showers",
        "installationCost": "£585 - £725",
        "yearlySaving": "£36",
        "potentialRating": "76 C"
      },
      {
        "name": "Double glazed windows",
        "installationCost": "£3,300 - £6,500",
        "yearlySaving": "£59",
        "potentialRating": "79 C"
      }
    ],
    "assessorName": "Paul Cronin",
    "assessorPhone": "01273 977447",
    "assessorEmail": "paul@croninspropertychecks.com",
    "accreditationScheme": "Stroma Certification Ltd",
    "accreditationAssessorID": "STRO033856",
    "assessmentDate": "31 August 2022",
    "certificateDate": "6 September 2022",
    "assessmentType": "RdSAP",
    "validtillDate": "5 September 2032",
    "expired": false
  }
]

Directory Structure Tree

EPC Scraper/
├── src/
│   ├── main.py
│   ├── extractors/
│   │   ├── epc_parser.py
│   │   └── utils_validation.py
│   ├── monitoring/
│   │   └── tracker.py
│   ├── outputs/
│   │   └── exporters.py
│   └── config/
│       └── settings.example.json
├── data/
│   ├── sample_input.json
│   └── sample_output.json
├── requirements.txt
└── README.md

Use Cases

  • Property Analysts use it to gather EPC data for housing trend analysis, improving investment decisions.
  • Energy Consultants use it to monitor efficiency improvements and identify potential savings for clients.
  • Government Auditors use it to validate regional energy compliance records.
  • Researchers use it to study national energy efficiency trends and CO2 reduction impacts.
  • Property Portals use it to enrich listings with verified energy performance data.

FAQs

Q1: How do I provide input to start scraping? Provide one or more listing URLs from the EPC website — each corresponding to a postcode search. The scraper automatically handles pagination.

Q2: What’s the difference between fullScrape and monitoringMode? FullScrape extracts all EPCs every time, while monitoringMode only collects newly added ones since the last run.

Q3: What if two listing URLs overlap? Duplicate entries are automatically removed during the same run, ensuring clean results.

Q4: Can I export results to my own systems? Yes, the tool supports output in JSON, CSV, Excel, and more — easily integrable with analytics or CRM platforms.


Performance Benchmarks and Results

Primary Metric: Processes up to 50,000 EPC records per hour on average. Reliability Metric: Maintains 99.5% successful extraction rate under normal conditions. Efficiency Metric: Optimized for minimal bandwidth and memory usage during large-scale runs. Quality Metric: Ensures over 98% field completeness with consistent JSON schema across runs.

Book a Call Watch on YouTube

Review 1

“Bitbash is a top-tier automation partner, innovative, reliable, and dedicated to delivering real results every time.”

Nathan Pennington
Marketer
★★★★★

Review 2

“Bitbash delivers outstanding quality, speed, and professionalism, truly a team you can rely on.”

Eliza
SEO Affiliate Expert
★★★★★

Review 3

“Exceptional results, clear communication, and flawless delivery. Bitbash nailed it.”

Syed
Digital Strategist
★★★★★