Building multi-agent systems, RL swarms, and practical AI products.
- Multi-Agent RL + Swarm Intelligence (coverage, search, coordination, energy-aware policies)
- Agentic systems (debate + memory + distillation → “reasoning that improves over time”)
- Applied AI products (automation, NLP classification, developer tooling)
Graph-attention + Soft Actor-Critic + multi-critic training for cooperative drone missions
- Coverage/search phases, custom metrics (coverage, redundancy, energy-per-area, efficiency)
- Designed for research-grade ablations + paper-ready plots
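
A minimal sketch of how mission metrics like these could be computed on a grid world. The function name `mission_metrics`, the grid-cell representation, and the formulas themselves are illustrative assumptions, not the project's actual definitions.

```python
from dataclasses import dataclass


@dataclass
class MissionMetrics:
    coverage: float         # fraction of grid cells visited at least once
    redundancy: float       # fraction of visits that hit an already-visited cell
    energy_per_area: float  # total energy spent per unique cell covered


def mission_metrics(visited_per_drone, total_cells, energy_per_drone):
    """Compute hypothetical coverage metrics for one episode.

    visited_per_drone: list of per-drone lists of visited (row, col) cells
    total_cells:       number of cells in the mission area
    energy_per_drone:  list of energy totals, one per drone
    """
    all_visits = [cell for cells in visited_per_drone for cell in cells]
    unique_cells = set(all_visits)

    coverage = len(unique_cells) / total_cells
    # Visits beyond the first one to each cell count as redundant (assumed definition).
    redundancy = (
        (len(all_visits) - len(unique_cells)) / len(all_visits) if all_visits else 0.0
    )
    energy_per_area = (
        sum(energy_per_drone) / len(unique_cells) if unique_cells else float("inf")
    )
    return MissionMetrics(coverage, redundancy, energy_per_area)


# Two drones on a 2x2 grid; both visit cell (0, 1).
metrics = mission_metrics(
    visited_per_drone=[[(0, 0), (0, 1)], [(0, 1), (1, 1)]],
    total_cells=4,
    energy_per_drone=[2.0, 4.0],
)
print(metrics)
```

Keeping the metrics in one pure function makes it easy to log them per episode and to compare ablations on identical definitions.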
- Classroom/assignment automation pipelines
- YouTube script generator (web + scraping + structured outputs)
Classifies support messages into categories (fraud/chargeback/legal threat/refund…) using classical ML / lightweight DL (no transformers)
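
As a rough illustration of the classical-ML approach, here is a small bag-of-words Naive Bayes classifier in pure Python. The class name, the whitespace tokenizer, and the toy training messages are assumptions made up for this sketch; the actual project may use a different model and features.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    # Simplistic whitespace tokenizer (assumption; a real pipeline would normalize more).
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        # Words never seen in training carry no signal, so they are skipped.
        tokens = [w for w in tokenize(text) if w in self.vocab]
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)
            for word in tokens:
                score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy, hand-written examples; labels mirror a few of the categories named above.
texts = [
    "i want my money back refund please",
    "refund the order now",
    "someone used my card without permission",
    "unauthorized charge on my card fraud",
    "my lawyer will contact you",
    "i will sue you in court",
]
labels = ["refund", "refund", "fraud", "fraud", "legal_threat", "legal_threat"]

clf = NaiveBayesClassifier().fit(texts, labels)
print(clf.predict("please refund my order"))
print(clf.predict("unauthorized charge on card"))
```

A model this small trains in milliseconds and is easy to inspect, which suits a lightweight, no-transformers setup.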
Core: Python • PyTorch • TensorFlow • NumPy • Pandas
ML/RL: Optax • Gym/Gymnasium • RL tooling
Backend: FastAPI • Node.js • REST APIs
Frontend: Next.js • React • Tailwind
Automation: n8n • Webhooks • Supabase
Dev: Git • Docker • Linux • CI (GitHub Actions)
- Research-grade clarity: clean experiments, ablations, reproducibility
- Engineering discipline: tests, readable diffs, minimal magic
- Systems thinking: metrics, failure modes, memory/latency constraints
- Ship mindset: demos that real users can try
If you’re working on multi-agent RL, agent memory/distillation, JAX tooling, or AI automation products, I’m open to collaborations.