A full-stack web crawling platform that analyzes websites through a modern web interface. Built with a Go backend and a Next.js frontend, it offers real-time crawling and detailed analytics.
⚡ Rapid Development Achievement: This entire full-stack application was built after learning Go in just one day! It showcases the power of modern development tools, clean architecture patterns, and the effectiveness of well-structured frameworks for building robust applications quickly.
Sykell is a multi-tenant web crawling platform that combines:
- Powerful Backend: High-performance Go API with clean architecture
- Modern Frontend: React-based dashboard with real-time updates
- Scalable Infrastructure: Docker-containerized deployment ready for production
- Real-time Features: WebSocket integration for live status updates
- Comprehensive Analytics: Detailed website analysis and reporting
┌─────────────────────────────────────────────────────────────┐
│                     Frontend (Next.js)                      │
│  ┌───────────────┐  ┌───────────────┐  ┌─────────────────┐  │
│  │   Dashboard   │  │   Real-time   │  │ Authentication  │  │
│  │      UI       │  │    Updates    │  │       UI        │  │
│  └───────────────┘  └───────────────┘  └─────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                               │
                        HTTP/WebSocket
                               │
┌─────────────────────────────────────────────────────────────┐
│                      Backend API (Go)                       │
│  ┌───────────────┐  ┌───────────────┐  ┌─────────────────┐  │
│  │    RESTful    │  │   WebSocket   │  │ Authentication  │  │
│  │      API      │  │      Hub      │  │ & Authorization │  │
│  └───────────────┘  └───────────────┘  └─────────────────┘  │
│  ┌───────────────┐  ┌───────────────┐  ┌─────────────────┐  │
│  │    Crawler    │  │   Business    │  │   Data Access   │  │
│  │    Engine     │  │     Logic     │  │      Layer      │  │
│  └───────────────┘  └───────────────┘  └─────────────────┘  │
└─────────────────────────────────────────────────────────────┘
                               │
┌─────────────────────────────────────────────────────────────┐
│                      Database (MySQL)                       │
│  ┌───────────────┐  ┌───────────────┐  ┌─────────────────┐  │
│  │     Users     │  │    Crawls     │  │   Audit Logs    │  │
│  │    Tables     │  │    Results    │  │   & Sessions    │  │
│  └───────────────┘  └───────────────┘  └─────────────────┘  │
└─────────────────────────────────────────────────────────────┘
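One way to picture the WebSocket Hub box in the diagram is a per-user fan-out of crawl status messages. The following is a minimal sketch under stated assumptions, not the project's actual implementation: the names `Hub`, `Subscribe`, `Publish`, and `statusMessage` are hypothetical, and plain Go channels stand in for real WebSocket connections.

```go
package main

import (
	"fmt"
	"sync"
)

// Hub is a hypothetical fan-out registry: each user ID maps to the
// channels of that user's connected clients.
type Hub struct {
	mu   sync.Mutex
	subs map[string][]chan string
}

// NewHub returns an empty hub ready for subscriptions.
func NewHub() *Hub {
	return &Hub{subs: make(map[string][]chan string)}
}

// Subscribe registers a buffered channel for userID and returns it.
func (h *Hub) Subscribe(userID string) <-chan string {
	ch := make(chan string, 8)
	h.mu.Lock()
	h.subs[userID] = append(h.subs[userID], ch)
	h.mu.Unlock()
	return ch
}

// Publish delivers msg to every subscriber of userID and reports how many
// received it. A subscriber whose buffer is full is skipped, so a slow
// client never blocks the caller.
func (h *Hub) Publish(userID, msg string) int {
	h.mu.Lock()
	defer h.mu.Unlock()
	delivered := 0
	for _, ch := range h.subs[userID] {
		select {
		case ch <- msg:
			delivered++
		default: // buffer full: drop the message for this subscriber
		}
	}
	return delivered
}

// statusMessage formats a crawl status update (the format is an assumption).
func statusMessage(crawlID int, status string) string {
	return fmt.Sprintf("crawl %d: %s", crawlID, status)
}

func main() {
	hub := NewHub()
	alice := hub.Subscribe("alice")
	hub.Subscribe("bob") // bob must not see alice's updates

	n := hub.Publish("alice", statusMessage(42, "running"))
	fmt.Println(n, <-alice)
}
```

Keying subscriptions by user ID is what keeps updates tenant-scoped in this sketch: a message published for one user is never written to another user's channel.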
- Comprehensive Analysis: HTML version detection, title extraction, heading structure analysis
- Link Analysis: Internal vs. external link classification with broken link detection
- Form Detection: Identifies whether a page contains a login form
- Performance Metrics: Processing time tracking and optimization insights
- Real-time Processing: Background job processing with live status updates
- Multi-tenant System: Complete user registration and authentication
- Modern Dashboard: Responsive design with dark mode support
- Real-time Updates: WebSocket integration for live crawl notifications
- Data Visualization: Interactive tables with advanced filtering and sorting
- Bulk Operations: Manage multiple crawls efficiently
- Clean Architecture: Domain-driven design with clear separation of concerns
- Type Safety: Full TypeScript coverage across the frontend
- Security: JWT authentication with secure session management
- Performance: Optimized concurrent processing and caching systems
- Scalability: Docker containerization ready for production deployment
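The internal vs. external link classification listed among the crawler features can be illustrated with the standard library alone. This is a hedged sketch, not the crawler's real code: `classifyLink` is a hypothetical helper, and treating only an exact host match as "internal" is an assumption (a real crawler might count subdomains as internal too).

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// classifyLink resolves href against the page URL and reports "internal"
// when the resolved host equals the page's host, "external" otherwise.
func classifyLink(pageURL, href string) (string, error) {
	base, err := url.Parse(pageURL)
	if err != nil {
		return "", fmt.Errorf("parse page URL: %w", err)
	}
	ref, err := url.Parse(href)
	if err != nil {
		return "", fmt.Errorf("parse href: %w", err)
	}
	// ResolveReference turns relative links such as "/about" into absolute URLs.
	target := base.ResolveReference(ref)
	if strings.EqualFold(target.Hostname(), base.Hostname()) {
		return "internal", nil
	}
	return "external", nil
}

func main() {
	page := "https://example.com/blog/"
	for _, href := range []string{"/about", "contact.html", "https://example.org/x"} {
		kind, err := classifyLink(page, href)
		if err != nil {
			fmt.Println("skip:", err)
			continue
		}
		fmt.Println(href, "->", kind)
	}
}
```

Broken link detection would then be a separate step, for example issuing a request to each resolved URL and recording error status codes.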
sykell-crawler/
├── client/                    # Next.js Frontend Application
│   ├── app/                   # Next.js App Router
│   ├── components/            # React Components
│   ├── store/                 # State Management (Zustand)
│   ├── hooks/                 # Custom React Hooks
│   ├── lib/                   # Utility Libraries
│   ├── types/                 # TypeScript Definitions
│   ├── tests/                 # E2E Tests (Playwright)
│   ├── Dockerfile             # Frontend Container Config
│   └── README.md              # Frontend Documentation
├── server/                    # Go Backend API
│   ├── cmd/api/               # Application Entry Point
│   ├── internal/              # Private Application Code
│   │   ├── application/       # Business Logic Services
│   │   ├── domain/            # Domain Entities & Interfaces
│   │   ├── infrastructure/    # External Integrations
│   │   └── presentation/      # HTTP/WebSocket Handlers
│   ├── tests/                 # API Tests & Test Utilities
│   ├── Dockerfile             # Backend Container Config
│   └── README.md              # Backend Documentation
├── docker-compose.yml         # Multi-service Orchestration
├── .env.example               # Environment Configuration Template
├── LICENSE                    # MIT License
└── README.md                  # This File
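The split of `server/internal/` into `domain`, `application`, and `infrastructure` suggests the usual dependency direction of a clean architecture: business logic depends on domain interfaces, and storage plugs in behind them. The single-file sketch below illustrates that idea only. Every type and method name in it (`Crawl`, `CrawlRepository`, `CrawlService`, `memoryRepo`) is hypothetical, and an in-memory slice stands in for MySQL.

```go
package main

import (
	"errors"
	"fmt"
)

// --- domain: entities and interfaces, no knowledge of storage or HTTP ---

type Crawl struct {
	ID     int
	UserID int
	URL    string
	Status string
}

type CrawlRepository interface {
	Save(c Crawl) (Crawl, error)
}

// --- application: business logic written against the domain interface ---

type CrawlService struct {
	repo CrawlRepository
}

// Start validates the request and persists a new crawl in the "queued" state.
func (s CrawlService) Start(userID int, url string) (Crawl, error) {
	if url == "" {
		return Crawl{}, errors.New("url is required")
	}
	return s.repo.Save(Crawl{UserID: userID, URL: url, Status: "queued"})
}

// --- infrastructure: an in-memory stand-in for a database-backed repository ---

type memoryRepo struct {
	rows []Crawl
}

// Save assigns the next sequential ID and stores the crawl.
func (m *memoryRepo) Save(c Crawl) (Crawl, error) {
	c.ID = len(m.rows) + 1
	m.rows = append(m.rows, c)
	return c, nil
}

func main() {
	svc := CrawlService{repo: &memoryRepo{}}
	c, err := svc.Start(7, "https://example.com")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(c.ID, c.Status)
}
```

Because `CrawlService` only sees the interface, tests can run against the in-memory repository while production wiring supplies a database-backed one.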
- Docker & Docker Compose (Recommended)
- Go 1.24.2+ (for local development)
- Node.js 20+ (for local development)
- MySQL 8.0 (if running locally)
1. Clone the repository

       git clone https://github.com/diabahmed/sykell-crawler.git
       cd sykell-crawler

2. Configure environment

       cp .env.example .env

   Edit .env with your configuration:

       # Database Configuration
       DB_PASSWORD=your_secure_password
       DB_NAME=web_crawler_db
       DB_SOURCE="root:your_secure_password@tcp(db:3306)/web_crawler_db?charset=utf8mb4&parseTime=True&loc=Local"

       # Frontend Configuration
       NEXT_PUBLIC_API_BASE_URL=http://localhost:8088/api/v1
       NEXT_PUBLIC_WS_BASE_URL=ws://localhost:8088/api/v1/ws

       # JWT Configuration
       TOKEN_SYMMETRIC_KEY="your_32_character_secret_key_here"
       ACCESS_TOKEN_DURATION="24h"

3. Launch the platform

       docker-compose up --build -d

4. Access the application
   - Frontend: http://localhost:3000
   - Backend API: http://localhost:8088
   - Database: localhost:3306
For detailed local development instructions, refer to the component-specific READMEs:
- Backend Development Guide - Go API setup, testing, and development
- Frontend Development Guide - Next.js setup, components, and testing
- Backend API Documentation
  - Architecture overview
  - API endpoints
  - Database schema
  - Configuration options
  - Development guide
- Frontend Documentation
  - Component architecture
  - State management
  - UI components
  - Testing strategy
  - Performance optimizations
- API Endpoints: Detailed in Backend README
- JWT-based authentication with HTTP-only cookies
- Multi-tenant user isolation
- Secure password hashing with bcrypt
- Session management and automatic logout
- Input validation and sanitization
- CORS configuration
- Rate limiting capabilities
- SQL injection prevention via ORM
- Container security best practices
- Secure environment variable handling
- Network isolation with Docker
- Development: Local development with hot reload
- Staging: Production-like environment for testing
- Production: Optimized for performance and security
This project is licensed under the MIT License - see the LICENSE file for details.
For detailed component documentation, please refer to: