Security: BasicFist/scrapox

SECURITY.md

Security Policy

Supported Versions

| Version | Supported | Status |
| --- | --- | --- |
| 2.0.x | Yes | Active |
| 1.0.x | No | End of Life |

Security Features

Built-in Protections

Input Validation

  • URL validation before HTTP requests
  • Code length limits (3-20 characters)
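  • Alphanumeric-only code validation (uppercase letters and digits)
  • Tolerant HTML parsing via BeautifulSoup (a parser, not a sanitizer; extracted values are still validated)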
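
As an illustration of the URL check listed above, here is a minimal sketch that assumes only absolute `https` URLs with a host are accepted. The function name `is_safe_url` is hypothetical and is not taken from the Scrapox codebase.

```python
from urllib.parse import urlparse


def is_safe_url(url: str) -> bool:
    """Return True only for absolute HTTPS URLs that name a host."""
    try:
        parts = urlparse(url)
        # Reject other schemes (http, file, ftp, ...) and URLs without a host.
        return parts.scheme == "https" and bool(parts.hostname)
    except ValueError:
        # urlparse raises on some malformed inputs, e.g. a broken IPv6 literal.
        return False


for candidate in ("https://example.com/codes",
                  "http://example.com/codes",
                  "file:///etc/passwd"):
    print(candidate, is_safe_url(candidate))
```

A check like this would run before any HTTP request is issued, so a rejected URL never reaches the network layer.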

SQL Injection Prevention

  • Parameterized queries throughout
  • No string concatenation in SQL
  • Context managers for transactions
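
The three points above can be combined in a short sketch using the standard `sqlite3` module. The table layout and the helper name `insert_code` are assumptions made for illustration, not the project's actual schema.

```python
import sqlite3

# In-memory database so the sketch has no side effects; the schema is assumed.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codes (code TEXT PRIMARY KEY, source TEXT)")


def insert_code(conn: sqlite3.Connection, code: str, source: str) -> bool:
    """Insert a code inside a transaction; return False if it already exists."""
    try:
        # `with conn` commits on success and rolls back if an exception escapes.
        with conn:
            conn.execute(
                "INSERT INTO codes (code, source) VALUES (?, ?)",
                (code, source),
            )
        return True
    except sqlite3.IntegrityError:
        return False


print(insert_code(conn, "ABC123", "example"))  # first insert succeeds
print(insert_code(conn, "ABC123", "example"))  # duplicate is rejected

# A hostile value is bound as data, so it cannot change the query structure.
hostile = "x' OR '1'='1"
rows = conn.execute("SELECT * FROM codes WHERE code = ?", (hostile,)).fetchall()
print(rows)
```

Because the value is passed as a bound parameter, the hostile string is compared literally against the `code` column and matches nothing.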

Secrets Management

  • Environment variables for sensitive data
  • No secrets in configuration files
  • No hardcoded credentials
  • Config files safe to commit to Git

Network Security

  • HTTPS enforcement for all requests
  • Request timeout protection (30s default)
  • Rate limiting prevents abuse
  • User-Agent mimics legitimate browsers

Dependency Security

  • Pinned versions in requirements.txt
  • Regular dependency updates
  • No known vulnerabilities in the pinned versions as of 2025-11-08
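
To make the pinning requirement checkable, the following sketch flags requirement lines that are not pinned with `==`. It is a deliberately simple check, not a full requirements parser (lines with environment markers, for example, would be flagged), and the helper name `unpinned_requirements` is hypothetical.

```python
import re

# Matches "name==version", optionally with extras, e.g. "aiohttp==3.13.1".
PINNED = re.compile(r"^[A-Za-z0-9._-]+(\[[^\]]+\])?==[^=\s]+$")


def unpinned_requirements(text: str) -> list[str]:
    """Return the requirement lines that are not pinned with '=='."""
    bad = []
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and whitespace
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        if not PINNED.match(line):
            bad.append(line)
    return bad


sample = """\
aiohttp==3.13.1
beautifulsoup4==4.14.2
lxml>=6.0   # range, not a pin
requests
"""
print(unpinned_requirements(sample))
```

A check of this kind could run in CI so that an unpinned dependency fails the build before it reaches a release.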

Reporting a Vulnerability

Please DO NOT report security vulnerabilities via public GitHub issues.

Instead, please report security vulnerabilities by emailing: security@scrapox.example.com (replace with actual email)

What to Include

Please include the following information:

  • Type of vulnerability
  • Location of the affected source code
  • Step-by-step instructions to reproduce
  • Proof-of-concept or exploit code (if applicable)
  • Impact assessment
  • Suggested fix (if available)

Response Timeline

  • Initial Response: Within 48 hours
  • Triage: Within 1 week
  • Fix Development: Depends on severity
  • Disclosure: After fix is released

Severity Levels

Critical (CVSS 9.0-10.0)

  • Remote code execution
  • Authentication bypass
  • Data breach potential
  • Response: Immediate (same day)

High (CVSS 7.0-8.9)

  • Privilege escalation
  • SQL injection
  • XSS vulnerabilities
  • Response: 1-3 days

Medium (CVSS 4.0-6.9)

  • Information disclosure
  • DoS vulnerabilities
  • Configuration issues
  • Response: 1-2 weeks

Low (CVSS 0.1-3.9)

  • Minor information leaks
  • Best practice violations
  • Response: Next release cycle
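
The severity bands above translate directly into a lookup. This sketch maps a CVSS score to the level and response target listed in this policy. The function name `classify` is hypothetical, and the handling of a score of 0.0 is an assumption, since the policy does not list that case.

```python
def classify(cvss: float) -> tuple[str, str]:
    """Map a CVSS base score to (severity level, response target)."""
    if not 0.0 <= cvss <= 10.0:
        raise ValueError(f"CVSS score out of range: {cvss}")
    if cvss >= 9.0:
        return ("Critical", "Immediate (same day)")
    if cvss >= 7.0:
        return ("High", "1-3 days")
    if cvss >= 4.0:
        return ("Medium", "1-2 weeks")
    if cvss >= 0.1:
        return ("Low", "Next release cycle")
    # Assumption: a score of 0.0 is treated as no severity; not covered above.
    return ("None", "No response target defined")


for score in (9.8, 7.5, 5.3, 2.0):
    print(score, classify(score))
```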

Security Best Practices

For Users

Environment Variables

# ✅ GOOD: Use environment variables
export SCRAPOX_DISCORD_WEBHOOK="https://discord.com/api/webhooks/..."

# ❌ BAD: Don't hardcode in config.yaml
webhook_url: https://discord.com/api/webhooks/...  # DON'T DO THIS

Configuration Files

# ✅ GOOD: Reference environment variables
notifications:
  discord:
    webhook_url: ${SCRAPOX_DISCORD_WEBHOOK}

# ❌ BAD: Hardcode secrets
notifications:
  discord:
    webhook_url: https://discord.com/api/webhooks/123/abc  # DON'T DO THIS
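
How the `${SCRAPOX_DISCORD_WEBHOOK}` placeholder is resolved is not shown in this policy. One way a config loader could expand such references is sketched below; the function `expand_env` is a hypothetical helper, not the actual Scrapox loader, and it assumes that a missing variable should fail loudly rather than resolve to an empty string.

```python
import os
import re

# Matches ${NAME} where NAME is a conventional environment variable name.
_VAR = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*)\}")


def expand_env(value: str, env=None) -> str:
    """Replace ${NAME} placeholders with values from the environment."""
    env = os.environ if env is None else env

    def replace(match: re.Match) -> str:
        name = match.group(1)
        if name not in env:
            # Failing here avoids silently sending to an empty webhook URL.
            raise KeyError(f"environment variable {name} is not set")
        return env[name]

    return _VAR.sub(replace, value)


# A dictionary stands in for the real environment in this demonstration.
fake_env = {"SCRAPOX_DISCORD_WEBHOOK": "https://example.invalid/hook"}
print(expand_env("${SCRAPOX_DISCORD_WEBHOOK}", fake_env))
```

Passing `env` explicitly keeps the function easy to test; in normal use it would be called without the second argument and read from `os.environ`.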

Webhook Security

  • Rotate webhooks regularly
  • Never commit webhooks to Git
  • Use separate webhooks for dev/prod
  • Monitor webhook usage in Discord/Telegram
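
To help enforce the "never commit webhooks" rule, a pre-commit style check could scan text for Discord webhook URLs. The sketch below assumes the `https://discord.com/api/webhooks/<id>/<token>` shape used in the examples in this document; the helper name `lines_with_webhooks` is hypothetical, and it reports line numbers rather than the URLs so the secret is not echoed into logs.

```python
import re

# Discord webhook URLs: numeric ID followed by a token of URL-safe characters.
WEBHOOK = re.compile(
    r"https://discord(?:app)?\.com/api/webhooks/\d+/[A-Za-z0-9_-]+"
)


def lines_with_webhooks(text: str) -> list[int]:
    """Return 1-based line numbers that contain a hardcoded webhook URL."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if WEBHOOK.search(line)
    ]


config = """\
notifications:
  discord:
    webhook_url: https://discord.com/api/webhooks/123/abc
"""
print(lines_with_webhooks(config))
```

An environment-variable reference such as `${SCRAPOX_DISCORD_WEBHOOK}` does not match the pattern, so correctly written config files pass the check.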

Database Security

  • Set appropriate file permissions: chmod 600 data/scrapox.db
  • Regular backups to secure location
  • Don't expose database to network
  • Keep database in protected directory
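
The `chmod 600` recommendation can also be applied and verified from Python on POSIX systems. This is a minimal sketch: the helper names are hypothetical, and the demonstration uses a temporary file instead of a real database.

```python
import os
import stat
import tempfile


def restrict_to_owner(path: str) -> None:
    """Equivalent of `chmod 600`: read/write for the owner, nothing for others."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


def is_owner_only(path: str) -> bool:
    """True if neither group nor others hold any permission bits."""
    return stat.S_IMODE(os.stat(path).st_mode) & 0o077 == 0


# Demonstrate on a throwaway file rather than a real database.
fd, demo_path = tempfile.mkstemp(suffix=".db")
os.close(fd)
os.chmod(demo_path, 0o644)        # simulate an overly permissive file
print(is_owner_only(demo_path))   # group/others can still read
restrict_to_owner(demo_path)
print(is_owner_only(demo_path))   # now limited to the owner
os.remove(demo_path)
```

A startup check based on `is_owner_only` could warn when the database file is readable by other users.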

For Developers

Code Security

# ✅ GOOD: Parameterized queries
cursor.execute("SELECT * FROM codes WHERE code=?", (code,))

# ❌ BAD: String concatenation
cursor.execute(f"SELECT * FROM codes WHERE code='{code}'")  # SQL injection risk

Input Validation

import re

# ✅ GOOD: Validate before use
# fullmatch rejects a trailing newline, which re.match with `$` would accept
if re.fullmatch(r'[A-Z0-9]{3,20}', code):
    save_code(code)

# ❌ BAD: Trust user input
save_code(code)  # No validation

Secret Management

import os

# ✅ GOOD: Environment variables (os.getenv returns None if the variable is unset)
webhook = os.getenv("SCRAPOX_DISCORD_WEBHOOK")

# ❌ BAD: Hardcoded secrets
webhook = "https://discord.com/api/webhooks/..."  # DON'T DO THIS

Known Security Considerations

Rate Limiting

  • Issue: Aggressive scraping can trigger IP bans
  • Mitigation: Built-in rate limiting (10s default per source)
  • Configuration: Adjust in config.yaml if needed
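
A per-source minimum interval, as described above, can be sketched as follows. The class name `SourceRateLimiter` and its methods are hypothetical and do not reflect the actual Scrapox implementation; the clock is injectable so the behavior can be demonstrated without sleeping.

```python
import time


class SourceRateLimiter:
    """Track the last request time per source and enforce a minimum gap."""

    def __init__(self, min_interval: float = 10.0, clock=time.monotonic):
        self.min_interval = min_interval
        self._clock = clock
        self._last: dict[str, float] = {}

    def wait_time(self, source: str) -> float:
        """Seconds to wait before `source` may be requested again."""
        last = self._last.get(source)
        if last is None:
            return 0.0  # never requested, so no wait is needed
        return max(0.0, self.min_interval - (self._clock() - last))

    def record(self, source: str) -> None:
        """Note that a request to `source` was just made."""
        self._last[source] = self._clock()


# Demonstration with a fake clock so the example runs instantly.
now = [100.0]
limiter = SourceRateLimiter(min_interval=10.0, clock=lambda: now[0])
limiter.record("source-a")
now[0] = 104.0                        # four seconds later
print(limiter.wait_time("source-a"))  # remaining wait for the same source
print(limiter.wait_time("source-b"))  # other sources are unaffected
```

A scraper loop would sleep for `wait_time(source)` before each request and call `record(source)` afterwards.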

HTML Parsing

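  • Issue: Malicious or malformed HTML could cause parsing errors
  • Mitigation: BeautifulSoup parses malformed markup tolerantly; it does not sanitize content, so extracted codes are still validated before use
  • Best Practice: Use trusted sources only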

Database Locking

  • Issue: SQLite doesn't handle high write concurrency
  • Mitigation: Sequential writes per source
  • Recommendation: Use PostgreSQL for high-traffic deployments

Notification Webhooks

  • Issue: Webhooks can be compromised if leaked
  • Mitigation: Use environment variables, never commit
  • Best Practice: Rotate webhooks periodically

Security Audit History

| Date | Type | Findings | Status |
| --- | --- | --- | --- |
| 2025-11-08 | Internal Audit | None | ✅ Clear |

Third-Party Security

Dependency Auditing

Run regular security audits:

# Using pip-audit
pip install pip-audit
pip-audit

# Using safety (Safety 3.x deprecates `safety check` in favor of `safety scan`)
pip install safety
safety check

Current Dependencies Security Status

All dependencies verified secure as of 2025-11-08:

  • aiohttp 3.13.1 - Latest stable, no known CVEs
  • beautifulsoup4 4.14.2 - Latest stable, no known CVEs
  • lxml 6.0.2 - Latest stable, no known CVEs
  • All other dependencies current and secure

Compliance

OWASP Top 10 (2021)

| Risk | Status | Mitigation |
| --- | --- | --- |
| A01: Broken Access Control | ✅ N/A | No authentication system |
| A02: Cryptographic Failures | ✅ Mitigated | Env vars for secrets |
| A03: Injection | ✅ Mitigated | Parameterized queries |
| A04: Insecure Design | ✅ Mitigated | Security by design |
| A05: Security Misconfiguration | ✅ Mitigated | Secure defaults |
| A06: Vulnerable Components | ✅ Mitigated | Pinned versions |
| A07: Auth Failures | ✅ N/A | No authentication |
| A08: Software/Data Integrity | ✅ Mitigated | Checksums available |
| A09: Security Logging | ✅ Implemented | Structured logging |
| A10: Server-Side Request Forgery | ✅ Mitigated | URL validation |

Data Privacy

  • Personal Data: Scrapox does NOT collect or store personal information
  • Analytics: No telemetry or usage tracking
  • Third-Party: Only connects to user-configured sources
  • GDPR Compliance: Not applicable (no personal data)

Security Checklist for Deployment

Pre-Deployment

  • All secrets in environment variables
  • No secrets in config files
  • Dependencies up to date
  • Security audit passed
  • File permissions set correctly
  • Database backups configured
  • Logging configured and monitored
  • Rate limiting configured appropriately

Post-Deployment

  • Monitor logs for errors
  • Check webhook security
  • Verify rate limiting working
  • Test notification delivery
  • Review database permissions
  • Confirm no secrets exposed
  • Monitor resource usage
  • Test disaster recovery

Contact

Acknowledgments

We appreciate responsible disclosure. Security researchers who report valid vulnerabilities will be:

  • Credited in release notes (if desired)
  • Mentioned in SECURITY.md
  • Given early notification of fix

Last Updated: 2025-11-08
Next Review: 2025-12-08
Policy Version: 1.0
