Specialized in building robust, containerized data pipelines and data warehousing solutions, with a focus on scalability, clean architecture, and automated orchestration.
| Domain | Technologies |
|---|---|
| Orchestration & Containerization | |
| Data Processing & Analytics | |
| Storage, Warehouse & Cloud | |
| Data Quality & Web Development | |
- Architecture: Architected a production-grade Modern Data Stack (MDS) simulation enforcing SCD Type 2 (History Tracking) and strict data contracts using DuckDB for high-performance columnar processing.
- Key Tech: Implemented a fully automated CI/CD pipeline via GitHub Actions (with robust environment-agnostic path injection), integrated Soda Core for quality guardrails, and orchestrated modular transformations using Dagster Assets.
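
The SCD Type 2 history tracking mentioned above can be illustrated with a minimal sketch. It uses plain Python rather than DuckDB or Dagster, and the `DimRow` class, the `apply_scd2` helper, and the `city` attribute are hypothetical names chosen for illustration, not the project's actual schema.

```python
from __future__ import annotations

from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class DimRow:
    key: str                          # business key, e.g. a customer id (hypothetical)
    attrs: dict                       # tracked attributes
    valid_from: date
    valid_to: Optional[date] = None   # None marks the current version

    @property
    def is_current(self) -> bool:
        return self.valid_to is None


def apply_scd2(dim: list[DimRow], key: str, attrs: dict, as_of: date) -> None:
    """Apply one incoming record to the dimension while keeping full history."""
    current = next((r for r in dim if r.key == key and r.is_current), None)
    if current is not None:
        if current.attrs == attrs:
            return                    # nothing changed, keep the current version open
        current.valid_to = as_of      # close the old version instead of overwriting it
    dim.append(DimRow(key, dict(attrs), valid_from=as_of))


dim: list[DimRow] = []
apply_scd2(dim, "c1", {"city": "Jakarta"}, date(2024, 1, 1))
apply_scd2(dim, "c1", {"city": "Jakarta"}, date(2024, 2, 1))  # unchanged, no new row
apply_scd2(dim, "c1", {"city": "Bandung"}, date(2024, 3, 1))  # change, new version
```

After the three calls the dimension holds two rows for `c1`: the closed Jakarta version and the open Bandung version.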
Apache Spark | Apache Airflow | MinIO (S3) | PostgreSQL | Great Expectations
- Architecture: Engineered a scalable Lakehouse-style pipeline to process 2.9M+ raw records from NYC Open Data.
- Key Tech: Implemented distributed data processing using a Spark Master-Worker cluster and integrated automated data quality guardrails with Great Expectations to ensure 99% data integrity before warehouse loading.
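
A quality guardrail of this kind can be sketched as a gate that blocks the warehouse load when too many rows fail validation. The sketch below is plain Python and does not use the Great Expectations API; the `id` and `amount` checks are hypothetical, and reading the 99% figure as a minimum pass rate is an assumption.

```python
def passes_checks(record: dict) -> bool:
    """Row-level expectations (hypothetical): id present, amount non-negative."""
    amount = record.get("amount")
    return (
        record.get("id") is not None
        and isinstance(amount, (int, float))
        and amount >= 0
    )


def quality_gate(records: list, min_pass_rate: float = 0.99) -> list:
    """Return the valid rows, or raise if the pass rate is below the threshold."""
    if not records:
        raise ValueError("no records to validate")
    valid = [r for r in records if passes_checks(r)]
    pass_rate = len(valid) / len(records)
    if pass_rate < min_pass_rate:
        raise ValueError(f"pass rate {pass_rate:.2%} is below {min_pass_rate:.0%}")
    return valid


# 199 good rows and 1 bad row: 99.5% pass, so the load proceeds with the valid rows.
batch = [{"id": i, "amount": 10.0} for i in range(199)] + [{"id": None, "amount": -1}]
loadable = quality_gate(batch)
```

Raising on a low pass rate makes the pipeline stop before loading, rather than silently writing a degraded batch.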
🇧🇷 Olist E-Commerce Data Warehouse
dbt Core | PostgreSQL | Docker | Dimensional Modeling
- Architecture: Built an end-to-end ELT pipeline transforming raw CSV data into a business-ready Star Schema Data Warehouse.
- Key Tech: Implemented dbt (data build tool) for modular SQL transformations, automated testing (schema & referential integrity), and data lineage documentation.
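
The idea behind a referential integrity test on a star schema can be shown in a few lines: every foreign key in the fact table should point at an existing dimension row. This is a plain-Python sketch, not dbt code, and the table contents and column names (`customer_id`, `order_id`) are hypothetical.

```python
def orphaned_keys(fact_rows: list, dim_rows: list, fk: str, pk: str) -> set:
    """Foreign-key values in the fact table with no matching dimension row.

    Null foreign keys are skipped; only non-null values are checked.
    """
    known = {row[pk] for row in dim_rows}
    return {
        row[fk]
        for row in fact_rows
        if row[fk] is not None and row[fk] not in known
    }


dim_customer = [{"customer_id": 1}, {"customer_id": 2}]
fact_orders = [
    {"order_id": "a", "customer_id": 1},
    {"order_id": "b", "customer_id": 3},     # points at a missing customer
    {"order_id": "c", "customer_id": None},  # null key, not checked
]
orphans = orphaned_keys(fact_orders, dim_customer, fk="customer_id", pk="customer_id")
```

An empty result means the relationship holds; here `orphans` contains the single unmatched key `3`.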
Docker | Apache Airflow | PostgreSQL | Python
- Architecture: Designed a fault-tolerant ETL pipeline to ingest real-time financial data.
- Key Tech: Implemented custom Airflow DAGs for hourly scheduling with automated retries and containerized the entire environment using Docker Compose for portability.
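
The automated-retry behavior can be sketched without Airflow as a small helper that re-runs a failing task a fixed number of times. The `run_with_retries` helper and the `flaky_fetch` task are hypothetical stand-ins; in the real pipeline the scheduler handles retries, not application code.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def run_with_retries(task: Callable[[], T], retries: int = 3, delay_s: float = 0.0) -> T:
    """Run `task`, retrying up to `retries` times after a failure."""
    for attempt in range(retries + 1):
        try:
            return task()
        except Exception:
            if attempt == retries:
                raise              # out of retries, surface the last error
            time.sleep(delay_s)    # wait before the next attempt
    raise AssertionError("unreachable")


calls = {"n": 0}


def flaky_fetch() -> str:
    """Simulated source that fails twice before succeeding."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise ConnectionError("temporary outage")
    return "ok"


result = run_with_retries(flaky_fetch, retries=3)
```

The task succeeds on its third attempt, so the run completes without manual intervention.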
Azure VM | SSIS | SQL Server | Kimball Dimensional Modeling
- Role: Principal Data Engineer & Team Lead. Led a team of 4 to build an end-to-end Data Warehouse for ITERA's Quality Assurance Institute.
- Key Tech: Designed a Star Schema for Intellectual Property tracking and engineered complex SSIS packages for ETL orchestration on Azure cloud infrastructure.