
👋 Hi, I'm Annthomy GILLES

🌎 Location: Montréal, Québec, Canada
🔗 LinkedIn: https://www.linkedin.com/in/annthomygilles
📖 AI Engineering (∼20%)

📖 Designing Data-Intensive Applications (∼20%)

📚 About Me

Senior Data Scientist and AI Engineer with a multidisciplinary background spanning computational biology and advanced analytics. I specialize in developing data-driven solutions that transform complex challenges into actionable insights across industries including finance, healthcare, automotive, and government.

With expertise in both technical implementation and strategic product development, I excel at bridging the gap between cutting-edge technology and business value. My experience includes:

  • AI/ML Engineering: Designing and implementing machine learning models and workflows with a focus on NLP, generative AI, and network analysis
  • Product Development: Leading full-lifecycle product creation from ideation to market deployment, including an Anti-Money Laundering application
  • Data Architecture: Architecting robust data pipelines and governance frameworks for complex, multi-source environments
  • Cross-functional Leadership: Collaborating effectively with stakeholders from technical and non-technical backgrounds to deliver impactful solutions

My Articles

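ChatGPT: A Reflection of Corporate Bullshit (original title: ChatGPT: Reflet du bullshit en entreprise)
Understanding Data Jobs Over a Coffee Break (original title: Comprendre les métiers de la Data le temps d'une pause café.)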
Is your organization TRULY data-driven? 12 questions to find out!
Time Heals Everything. Except Bad Code. (original title: Le Temps Guérit Tout. Excepté Le Mauvais Code.)

🚀 Featured Project: MarketPulseAI

📊 Overview

MarketPulseAI is my most ambitious side project to date - an advanced real-time analytics platform that combines traditional stock market data analysis with social media sentiment to provide holistic market insights. The system processes millions of data points per minute to detect market patterns and sentiment shifts that often precede price movements, giving users a potential edge in understanding market dynamics.

🔑 Core Value Proposition

MarketPulseAI stands apart through its dual-analysis approach:

  1. Traditional Price Data Analysis: Deep learning models process market metrics to predict potential price movements
  2. Social Media Sentiment Analysis: NLP algorithms capture market mood and emotional drivers of price action

This combination delivers a more complete picture of what's driving stock prices, integrating both quantitative factors and human sentiment that influences market behavior.

💻 Technology Stack

Data Infrastructure

  • Ingestion: Apache Kafka + Kafka Connect
  • Processing: Apache Spark (Stream Processing)
  • Storage:
    • Redis (real-time features/online store)
    • Cassandra (historical market data/offline store)
    • Elasticsearch (social media content/text data)

Machine Learning & Analytics

  • Market Analysis: Custom deep learning models
  • Text Processing: Advanced NLP for sentiment analysis
  • Signal Integration: Weighted ensemble models

Deployment & Operations

  • Containerization: Docker
  • Orchestration: Kubernetes
  • Monitoring: Prometheus + Grafana
  • API Layer: FastAPI
  • Visualization: Streamlit dashboards
  • Real-Time Updates: WebSockets

🧠 Key Insights From Development

  • Social sentiment shifts sometimes predict price movements 1-3 hours before they appear in market data
  • The relative importance of technical vs. sentiment features varies dramatically based on market conditions
  • Quality and consistency of input data proved far more important than model sophistication for prediction accuracy
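
One simple way to check whether sentiment leads price is a lagged correlation: shift the sentiment series forward by a few steps and see which shift lines up best with returns. The sketch below uses only the standard library and synthetic data; the function names `pearson` and `best_lead` and the sample values are assumptions for illustration, not the method used in the project.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def best_lead(sentiment, returns, max_lag):
    """Return (lag, correlation) where sentiment[t] best matches returns[t + lag]."""
    best = (0, pearson(sentiment, returns))
    for lag in range(1, max_lag + 1):
        # Drop the last `lag` sentiment points and the first `lag` returns
        # so that each sentiment value is paired with a later return.
        r = pearson(sentiment[:-lag], returns[lag:])
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Synthetic hourly series: returns repeat sentiment with a 2-step delay.
sentiment = [0.1, 0.5, -0.3, 0.8, -0.6, 0.2, 0.9, -0.4, 0.3, -0.7]
returns = [0.0, 0.0] + sentiment[:-2]

lag, r = best_lead(sentiment, returns, max_lag=3)
print(lag, round(r, 3))
```

Because the synthetic returns are built as a two-step delayed copy of sentiment, the search recovers a lag of 2. On real data the correlation would be much weaker and would need out-of-sample validation before being read as predictive.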

📈 Current Status & Next Steps

MarketPulseAI is still in active development. I'm currently working on:

  • Expanding data sources (options flow, institutional trading patterns)
  • Optimizing the pipeline for improved throughput and reduced latency
  • Developing enhanced visualization dashboards
  • Preparing for cloud deployment with managed services

Disclaimer: MarketPulseAI is an EDUCATIONAL exploration, not an investment tool.

💼 Other Side Projects

Dashboard Project - Private 📊💻

  • Building a dashboard connected to a database using Flask, MySQL, and web scraping
  • Implemented automatic notifications sent to Discord and Telegram

WhatsApp Integration - Private 📱💬

  • Building a Python pipeline connected to a database using Flask, MongoDB, and Docker
  • Implemented API integration with WhatsApp for automated messaging

Weather Data Aggregation with Kafka - Public ☁️🌡️

  • Building a project to scrape weather data from different APIs
  • Experimenting with Kafka to aggregate the data
  • Integrating Spark for data analysis and processing
  • Project is focused on learning Kafka and expanding knowledge of big data technologies

💼 Professional Experience

🇨🇦 KPMG Canada (Oct 2022 - Present): Senior Consultant, Information Management & Data Analytics

🇧🇪 Belgian Government (Mar 2021 - Oct 2022): Data Scientist

  • Worked on a graph-based modelling project for COVID-19 infection spread and management
  • Gained experience with Neo4j, Elasticsearch, PostgreSQL, MongoDB, Prefect, Dask, Python, Apache Airflow, unit testing, CI/CD, JIRA, Agile, APIs, Pandas, and scikit-learn

🇧🇪 Toyota Motor Corporation (Feb 2020 - Nov 2020): Data Scientist Consultant

  • Worked on a DataOps project to clean and prepare data from car sensors for R&D use cases
  • Gained experience with AWS services, Dask, Python, multi Unit testing, CI/CD, JIRA, Agile, API, Pandas, and scikit-learn

🇧🇪 Positive Thinking Company (Oct 2019 - Oct 2022): Data Scientist

  • Developed an automated tool for resume classification and summarization using NLP techniques
  • Gained experience with Python, R, Shiny, MongoDB, TF-IDF, word2vec, doc2vec, Random Forest, XGBoost, and Docker

🇫🇷 Devoteam (Feb 2019 - May 2019): Data Consultant

  • Built a comprehensive web app dashboard for employee management and tracking
  • Gained experience with Google Cloud Platform (GCP), Docker, web development, Firebase, R, JavaScript, MongoDB, Git, HTML, and CSS

🇫🇷 bioMérieux (Sept. 2016 - Sept. 2018): Data Scientist

  • Worked on a decision support system to improve doctors' prescribing behavior for infectious diseases
  • Gained experience with Python, R, inferential statistics, machine learning, dimensionality reduction, business intelligence, metagenomics, differential abundance analysis, nanopore technology, and SQL

🇩🇪 Max Planck Institute (Mar. 2016 - Aug. 2016): Computational Biologist

  • Developed a differential gene expression analysis workflow using Python, shell, and R languages
  • Gained experience with Tuxedo suite, DESeq2, MEME suite, GATK, Picard tools, StringTie, GO enrichment, variant calling, and differential expression

🇫🇷 Merial, a Sanofi Company (Mar. 2015 - Sept. 2015): Biological Engineer

  • Characterized virulence factors and vaccine targets of a bacterial canine pathogen
  • Gained experience with cell culture techniques, flow cytometry, genetic engineering, northern and western blotting, fluorescent and confocal microscopy, and PCR

Skills

| Category | Skills |
| --- | --- |
| Programming | 🐍 Python, R, 💻 Shell/Bash/Command line |
| Databases | 🍃 MongoDB, 🗃️ SQL, 🔗 Neo4j, 📊 Cassandra, 🔍 Elasticsearch |
| Big Data | 🔄 Apache Kafka, ⚡ Apache Spark, 🗄️ Redis, 📈 Stream Processing |
| Statistics & Machine Learning | 🔬 Inferential Statistics, 📈 Hypothesis testing, 📊 Regression methods, 🔄 Correlation, 📉 Descriptive Statistics, 🚦 Markov model, 🌐 Dimensionality reduction, 🧩 Clustering, 🌳 Decision tree, 🧠 KNN, 🎄 SVM, 🌱 Random forest |
| Tools | 🧰 Git, 📊 Matplotlib, 🔢 NumPy, 🐼 Pandas, 🍃 PyMongo, 🔬 SciPy, 🤖 Scikit-learn, 🌊 Seaborn, 🔗 SQLAlchemy |
| Web Development | 🌐 HTML5/CSS3, 💻 JavaScript, TypeScript, NestJS, Prisma, 🌶️ Flask |
| DevOps & Cloud | 🐳 Docker, ☸️ Kubernetes, 🔍 Prometheus, 📊 Grafana, ☁️ AWS |
| Environment | 💻 High Performance Computing, 🐧 Linux |
| Data Science | 🛠️ Data Engineering, 🧑‍💼 Data Governance, 📈📉📊 Big Data, 🤖 Machine Learning, 📊 Data Analytics, 🍃 MongoDB, 🐳 Docker, 🗃️ PostgreSQL, ☁️ Amazon Web Services (AWS), 📈 JIRA, 🌐 Web Development, 🧑‍🔬 NLP |

🏫 Education

University of Rouen Normandie

Master's Degree in Bioinformatics and Statistics (2015 - 2018)

  • Three-year Research & Professional Master's Degree in Bioinformatics, Statistics and Mathematics
  • Curriculum covers management, processing, and analysis of sequences and massive data
  • Data science: supervised learning (Regression, Decision Tree, Random Forests, Markov Chains, SVM, KNN, Neural Network) and unsupervised learning (K-means, hierarchical agglomerative clustering)

University of Poitiers

Master's Degree in Bioengineering and Biomedical Engineering (2013 - 2015)

  • Interdisciplinary program in biomedical research and engineering, drawing on bioengineering, cell and molecular biology, oncology, pharmacology, genetics, and microbiology

University of the French West Indies and Guiana

Bachelor's Degree (Licence) in Biochemistry and Biology (2010 - 2013)

  • Curriculum covers biochemistry, cellular & molecular biology, immunology, physiology, biological statistics, organic chemistry

Pinned

  1. MarketPulseAI (Public)

    A real-time stock market analysis system combining price prediction and sentiment analysis

    Python 1 1

  2. LLMs-in-Production (Public)

    Building an end-to-end production-ready LLM & RAG system using LLMOps best practices

    Python 1

  3. AK_whatsapp_chatbot (Public)

    Python

  4. FomoTools-Web-App (Public)

    Make FOMO great again

    CSS

  5. RAG-for-Production (Public)

    Python 1

  6. mlops-project (Public)

    Python