Dollargeneral Proxy Parser Spider Scraper collects detailed customer reviews from Dollar General product pages and converts them into clean, structured data. It helps teams analyze ratings, feedback, and buyer sentiment at scale using a reliable Dollar General review scraper workflow.
Created by Bitbash to showcase our approach to scraping and automation.
This project extracts customer review data from Dollar General product pages and structures it into analysis-ready records. It removes the need to collect and normalize large volumes of product feedback by hand. It is designed for analysts, researchers, and product teams who rely on accurate review data.
- Processes multiple product URLs in a single execution
- Captures rich review metadata including ratings and verification flags
- Produces structured records ready for analytics pipelines
- Scales efficiently for large product catalogs
| Feature | Description |
|---|---|
| Review Content Extraction | Collects headlines, comments, and ratings from product reviews. |
| User Metadata Capture | Extracts nicknames, locations, and verification status. |
| Feedback Metrics | Includes helpful and not helpful vote counts per review. |
| Scalable URL Processing | Handles multiple product pages in one run without manual effort. |
| Structured Output | Delivers consistent JSON-ready records for easy integration. |
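
The multi-URL behaviour described above can be sketched as a small loop. This is an illustrative sketch only: `run`, `fake_fetch`, and the `example.com` URLs are hypothetical names, and the real fetching logic in `src/runner.py` is not shown in this README. The sketch assumes that duplicate URLs should be skipped and that one failing page should not abort the whole run.

```python
from typing import Callable, Iterable

Record = dict
Fetcher = Callable[[str], list[Record]]


def run(urls: Iterable[str], fetch_reviews: Fetcher) -> tuple[list[Record], dict[str, str]]:
    """Fetch reviews for each unique URL; collect failures instead of raising."""
    records: list[Record] = []
    failures: dict[str, str] = {}
    seen: set[str] = set()
    for url in urls:
        url = url.strip()
        if not url or url in seen:
            continue  # skip blanks and duplicates
        seen.add(url)
        try:
            records.extend(fetch_reviews(url))
        except Exception as exc:  # isolate per-URL errors
            failures[url] = str(exc)
    return records, failures


def fake_fetch(url: str) -> list[Record]:
    """Stand-in for the real fetcher (hypothetical), used only for this demo."""
    if url.endswith("/broken"):
        raise RuntimeError("page unavailable")
    return [{"product_page_id": url.rsplit("/", 1)[-1], "rating": 5}]


records, failures = run(
    [
        "https://example.com/p/00802701",
        "https://example.com/p/00802701",  # duplicate, processed once
        "https://example.com/p/broken",
    ],
    fake_fetch,
)
print(len(records), failures)
```

Passing the fetcher in as a callable keeps the loop testable without network access.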
| Field Name | Field Description |
|---|---|
| review_id | Unique identifier of the review. |
| headline | Short title summarizing the review. |
| comments | Full text of the customer review. |
| rating | Star rating given by the reviewer. |
| helpful_votes | Number of users who found the review helpful. |
| not_helpful_votes | Number of users who found the review not helpful. |
| nickname | Public nickname of the reviewer. |
| location | Location provided by the reviewer. |
| created_date | Time the review was submitted, as a Unix timestamp in milliseconds. |
| is_verified_buyer | Indicates whether the reviewer is a verified purchaser of the product. |
| is_verified_reviewer | Indicates reviewer authenticity status. |
| bottom_line | Reviewer's overall recommendation (for example, "Yes"). |
| product_page_id | Internal product identifier, stored as a string so leading zeros are preserved. |
```json
[
  {
    "review_id": 522263006,
    "headline": "Yummy",
    "comments": "I rate this a 5 because they are so good and they don't taste like any hot sauce.",
    "rating": 5,
    "helpful_votes": 0,
    "not_helpful_votes": 0,
    "nickname": "Aubrey_Queen",
    "location": "Atlanta GA",
    "created_date": 1716852857000,
    "is_verified_buyer": false,
    "is_verified_reviewer": true,
    "bottom_line": "Yes",
    "product_page_id": "00802701"
  }
]
```
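
As a rough illustration of consuming this output, the sketch below loads the sample record into a typed structure and converts `created_date` to a timezone-aware datetime. The `Review` class and `load_reviews` helper are hypothetical names, not part of the project, and the conversion assumes the timestamp is Unix epoch milliseconds in UTC, which the 13-digit sample value suggests but the README does not state.

```python
import json
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class Review:
    review_id: int
    headline: str
    comments: str
    rating: int
    helpful_votes: int
    not_helpful_votes: int
    nickname: str
    location: str
    created_date: int  # assumed: Unix epoch milliseconds
    is_verified_buyer: bool
    is_verified_reviewer: bool
    bottom_line: str
    product_page_id: str  # string, so leading zeros survive

    @property
    def created_at(self) -> datetime:
        """Convert the millisecond timestamp to an aware UTC datetime."""
        return datetime.fromtimestamp(self.created_date / 1000, tz=timezone.utc)


def load_reviews(raw: str) -> list[Review]:
    """Parse a JSON array of review objects into Review instances."""
    return [Review(**item) for item in json.loads(raw)]


SAMPLE = """
[{"review_id": 522263006, "headline": "Yummy",
  "comments": "I rate this a 5 because they are so good and they don't taste like any hot sauce.",
  "rating": 5, "helpful_votes": 0, "not_helpful_votes": 0,
  "nickname": "Aubrey_Queen", "location": "Atlanta GA",
  "created_date": 1716852857000, "is_verified_buyer": false,
  "is_verified_reviewer": true, "bottom_line": "Yes",
  "product_page_id": "00802701"}]
"""

reviews = load_reviews(SAMPLE)
print(reviews[0].created_at.isoformat())
```

Because `Review(**item)` raises `TypeError` on missing or unexpected keys, this loader also acts as a coarse schema check.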
```
Dollargeneral Proxy Parser Spider/
├── src/
│   ├── runner.py
│   ├── parsers/
│   │   ├── review_parser.py
│   │   └── validators.py
│   ├── network/
│   │   └── proxy_manager.py
│   └── utils/
│       └── date_utils.py
├── data/
│   ├── inputs.sample.json
│   └── outputs.sample.json
├── requirements.txt
└── README.md
```
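
The tree lists `src/parsers/validators.py`, but its contents are not shown here. The following is a minimal sketch of what record validation could look like, not the project's actual code. `validate_review` is a hypothetical name, the expected types are inferred from the sample output, and the 1 to 5 star range is an assumption.

```python
from typing import Any

# Field types inferred from the sample output (an assumption).
EXPECTED_TYPES: dict[str, type] = {
    "review_id": int, "headline": str, "comments": str, "rating": int,
    "helpful_votes": int, "not_helpful_votes": int, "nickname": str,
    "location": str, "created_date": int, "is_verified_buyer": bool,
    "is_verified_reviewer": bool, "bottom_line": str, "product_page_id": str,
}
VOTE_FIELDS = ("helpful_votes", "not_helpful_votes")


def _is_int(value: Any) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def validate_review(record: dict[str, Any]) -> list[str]:
    """Return a list of problems; an empty list means the record looks valid."""
    problems: list[str] = []
    for name, expected in EXPECTED_TYPES.items():
        if name not in record:
            problems.append(f"missing field: {name}")
        elif expected is int and not _is_int(record[name]):
            problems.append(f"{name} should be an integer")
        elif expected is not int and not isinstance(record[name], expected):
            problems.append(f"{name} should be {expected.__name__}")
    rating = record.get("rating")
    if _is_int(rating) and not 1 <= rating <= 5:  # assumed star scale
        problems.append("rating out of range 1-5")
    for name in VOTE_FIELDS:
        votes = record.get(name)
        if _is_int(votes) and votes < 0:
            problems.append(f"{name} must not be negative")
    return problems


sample = {
    "review_id": 522263006, "headline": "Yummy", "comments": "So good.",
    "rating": 5, "helpful_votes": 0, "not_helpful_votes": 0,
    "nickname": "Aubrey_Queen", "location": "Atlanta GA",
    "created_date": 1716852857000, "is_verified_buyer": False,
    "is_verified_reviewer": True, "bottom_line": "Yes",
    "product_page_id": "00802701",
}
print(validate_review(sample))
print(validate_review({**sample, "rating": 7}))
```

Returning a list of problems rather than raising on the first error makes it easier to log every issue with a record in one pass.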
- Market researchers use it to analyze product sentiment, so they can identify trends and preferences.
- Retail analysts use it to monitor ratings over time, so they can measure product performance.
- Brand teams use it to track customer feedback, so they can improve product quality.
- Competitive analysts use it to compare review metrics, so they can benchmark against similar products.
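
For the benchmarking and monitoring use cases above, a per-product summary is often the first step. The sketch below is one possible approach using only the standard library; `summarize` is a hypothetical helper, and the records other than the first are invented purely for illustration.

```python
from collections import defaultdict
from statistics import mean


def summarize(reviews: list[dict]) -> dict[str, dict]:
    """Group reviews by product and compute simple per-product metrics."""
    by_product: dict[str, list[dict]] = defaultdict(list)
    for review in reviews:
        by_product[review["product_page_id"]].append(review)

    summary: dict[str, dict] = {}
    for product_id, items in by_product.items():
        summary[product_id] = {
            "review_count": len(items),
            "mean_rating": mean(r["rating"] for r in items),
            # True counts as 1 when summed, so this is the verified fraction.
            "verified_buyer_share": sum(r["is_verified_buyer"] for r in items) / len(items),
        }
    return summary


# Illustrative records; only the fields used by summarize() are included.
demo_reviews = [
    {"product_page_id": "00802701", "rating": 5, "is_verified_buyer": False},
    {"product_page_id": "00802701", "rating": 2, "is_verified_buyer": True},
    {"product_page_id": "00000001", "rating": 4, "is_verified_buyer": True},
]

for product_id, stats in summarize(demo_reviews).items():
    print(product_id, stats)
```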
Can this handle multiple product pages at once? Yes, the scraper is designed to process multiple Dollar General product URLs in a single run while maintaining consistent output structure.
Does it include both positive and negative reviews? All available reviews are collected regardless of rating, so the dataset is not filtered by sentiment.
Is reviewer authenticity information included? Yes, verification indicators are extracted to help distinguish verified buyers and reviewers.
What output format is produced? Results are delivered as JSON records, one object per review, using the fields listed in the output table, so they can be loaded directly into analytics or storage systems.
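
If a downstream tool expects tabular input, the structured records can be flattened to CSV with the standard library. This is a sketch under assumptions: `to_csv` and `FIELDS` are hypothetical names, and the column order simply follows the field table above.

```python
import csv
import io

FIELDS = [
    "review_id", "headline", "comments", "rating", "helpful_votes",
    "not_helpful_votes", "nickname", "location", "created_date",
    "is_verified_buyer", "is_verified_reviewer", "bottom_line",
    "product_page_id",
]


def to_csv(records: list[dict]) -> str:
    """Serialize review records to CSV text with a fixed column order."""
    buffer = io.StringIO()
    # Unknown keys are ignored; missing keys are written as empty cells.
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = to_csv([
    {
        "review_id": 522263006, "headline": "Yummy",
        "comments": "So good, and not like any hot sauce.",
        "rating": 5, "helpful_votes": 0, "not_helpful_votes": 0,
        "nickname": "Aubrey_Queen", "location": "Atlanta GA",
        "created_date": 1716852857000, "is_verified_buyer": False,
        "is_verified_reviewer": True, "bottom_line": "Yes",
        "product_page_id": "00802701",
    }
])
print(csv_text)
```

When reading the CSV back, load `product_page_id` as text rather than a number so the leading zeros are not lost.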
Primary Metric: Processes hundreds of reviews per product page with consistent extraction speed.
Reliability Metric: Maintains high success rates across repeated runs on active product listings.
Efficiency Metric: Optimized parsing minimizes unnecessary requests and resource usage.
Quality Metric: High data completeness with accurate capture of review text, ratings, and metadata.
