GILLES Annthomy AnnthomyGILLES

👋 Hi, I'm Annthomy GILLES

🌎 Location: Montréal, Québec, Canada
🔗 LinkedIn: https://www.linkedin.com/in/annthomygilles
📖 AI Engineering (∼20%)

📖 Designing Data-Intensive Applications (∼20%)

📚 About Me

Senior Data Scientist and AI Engineer with a multidisciplinary background spanning computational biology and advanced analytics. I specialize in developing data-driven solutions that transform complex challenges into actionable insights across industries including finance, healthcare, automotive, and government.

With expertise in both technical implementation and strategic product development, I excel at bridging the gap between cutting-edge technology and business value. My experience includes:

AI/ML Engineering: Designing and implementing machine learning models and workflows with a focus on NLP, generative AI, and network analysis
Product Development: Leading full-lifecycle product creation from ideation to market deployment, including an Anti-Money Laundering application
Data Architecture: Architecting robust data pipelines and governance frameworks for complex, multi-source environments
Cross-functional Leadership: Collaborating effectively with stakeholders from technical and non-technical backgrounds to deliver impactful solutions

My Articles

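ChatGPT: Reflet du bullshit en entreprise (ChatGPT: A Mirror of Corporate Bullshit)
Comprendre les métiers de la Data le temps d'une pause café. (Understanding Data Jobs Over a Coffee Break)
Is your organization TRULY data-driven? 12 questions to find out!
Le Temps Guérit Tout. Excepté Le Mauvais Code. (Time Heals Everything. Except Bad Code.)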

🚀 Featured Project: MarketPulseAI

📊 Overview

MarketPulseAI is my most ambitious side project to date: an advanced real-time analytics platform that combines traditional stock market data analysis with social media sentiment to provide holistic market insights. The system processes millions of data points per minute to detect market patterns and sentiment shifts that often precede price movements, giving users a potential edge in understanding market dynamics.

🔑 Core Value Proposition

MarketPulseAI stands apart through its dual-analysis approach:

Traditional Price Data Analysis: Deep learning models process market metrics to predict potential price movements
Social Media Sentiment Analysis: NLP algorithms capture market mood and emotional drivers of price action

This combination delivers a more complete picture of what's driving stock prices, integrating both quantitative factors and human sentiment that influences market behavior.

💻 Technology Stack

Data Infrastructure

Ingestion: Apache Kafka + Kafka Connect
Processing: Apache Spark (Stream Processing)
Storage:
- Redis (real-time features/online store)
- Cassandra (historical market data/offline store)
- Elasticsearch (social media content/text data)

Machine Learning & Analytics

Market Analysis: Custom deep learning models
Text Processing: Advanced NLP for sentiment analysis
Signal Integration: Weighted ensemble models
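
As a rough illustration of the signal integration step, here is a minimal sketch of a confidence-scaled weighted average over two model outputs. It is not the project's actual ensemble: the `Signal` class, the model names, the score range, and the weights are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signal:
    """One model's view of a ticker (hypothetical schema).

    score is assumed to lie in [-1, 1] (bearish to bullish),
    confidence in [0, 1].
    """
    name: str
    score: float
    confidence: float


def combine_signals(signals: list[Signal], weights: dict[str, float]) -> float:
    """Return the confidence-scaled weighted average of the signal scores.

    Signals without a configured weight are ignored. The result is 0.0
    (neutral) when no signal carries any effective weight.
    """
    numerator = 0.0
    denominator = 0.0
    for s in signals:
        effective_weight = weights.get(s.name, 0.0) * s.confidence
        numerator += effective_weight * s.score
        denominator += effective_weight
    return numerator / denominator if denominator > 0 else 0.0


# Illustrative values only; real weights would be tuned or learned.
price = Signal("price_model", score=0.4, confidence=0.8)
sentiment = Signal("sentiment_model", score=-0.2, confidence=0.5)
combined = combine_signals(
    [price, sentiment],
    {"price_model": 0.6, "sentiment_model": 0.4},
)
print(round(combined, 3))
```

Scaling each weight by the model's confidence lets a low-confidence signal fade out without being switched off entirely; weights that change with market conditions would be one possible extension.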

Deployment & Operations

Containerization: Docker
Orchestration: Kubernetes
Monitoring: Prometheus + Grafana
API Layer: FastAPI
Visualization: Streamlit dashboards
Real-Time Updates: WebSockets

🧠 Key Insights From Development

Social sentiment shifts sometimes predict price movements 1-3 hours before they appear in market data
The relative importance of technical vs. sentiment features varies dramatically based on market conditions
Quality and consistency of input data proved far more important than model sophistication for prediction accuracy
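
One way to probe the first insight is a lagged correlation between a sentiment series and a return series. The sketch below assumes hourly, equally long series and uses synthetic data in which returns simply echo sentiment two steps later; the function names and the data are hypothetical, and a high lagged correlation on real data would still need out-of-sample validation.

```python
from math import sqrt


def pearson(xs: list[float], ys: list[float]) -> float:
    """Pearson correlation of two equal-length series (0.0 if either is constant)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / sqrt(var_x * var_y)


def best_lead(sentiment: list[float], returns: list[float], max_lag: int) -> tuple[int, float]:
    """Find the lag (in steps) at which sentiment[t] best correlates with returns[t + lag]."""
    best = (0, pearson(sentiment, returns))
    for lag in range(1, max_lag + 1):
        # Pair each sentiment value with the return observed `lag` steps later.
        r = pearson(sentiment[:-lag], returns[lag:])
        if r > best[1]:
            best = (lag, r)
    return best


# Synthetic hourly data: returns repeat sentiment with a two-step delay.
sentiment = [0.1, 0.5, -0.3, 0.8, -0.6, 0.2, 0.4, -0.1, 0.7, -0.5]
returns = [0.0, 0.0] + sentiment[:-2]

lag, corr = best_lead(sentiment, returns, max_lag=3)
print(f"best lag: {lag} step(s), correlation: {corr:.3f}")
```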

📈 Current Status & Next Steps

MarketPulseAI is still in active development. I'm currently working on:

Expanding data sources (options flow, institutional trading patterns)
Optimizing the pipeline for improved throughput and reduced latency
Developing enhanced visualization dashboards
Preparing for cloud deployment with managed services

Disclaimer: MarketPulseAI is an EDUCATIONAL exploration, not an investment tool.

💼 Other Side Projects

Dashboard Project - Private 📊💻

Building a dashboard connected to a database using Flask, MySQL, and web scraping
Implemented automatic notifications sent to Discord and Telegram

WhatsApp Integration - Private 📱💬

Building a Python pipeline connected to a database using Flask, MongoDB, and Docker
Implemented API integration with WhatsApp for automated messaging

Weather Data Aggregation with Kafka - Public ☁️🌡️

Building a project to scrape weather data from different APIs
Experimenting with Kafka to aggregate the data
Integrating Spark for data analysis and processing
Project is focused on learning Kafka and expanding knowledge of big data technologies
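
The aggregation idea can be sketched without a running Kafka cluster. The example below assumes two hypothetical providers with different payload shapes and units, normalizes each message to a common schema, and averages temperatures per city. The provider names and field names are invented for illustration; in the real pipeline the messages would arrive from Kafka topics rather than from a Python list.

```python
from collections import defaultdict
from statistics import mean


def normalize(provider: str, payload: dict) -> dict:
    """Map a provider-specific payload to a common {city, temp_c} record."""
    if provider == "provider_a":
        # Hypothetical provider reporting Kelvin under "temp_k".
        return {"city": payload["city"], "temp_c": payload["temp_k"] - 273.15}
    if provider == "provider_b":
        # Hypothetical provider reporting Celsius under "temperature".
        return {"city": payload["location"], "temp_c": payload["temperature"]}
    raise ValueError(f"unknown provider: {provider}")


def aggregate(messages: list[tuple[str, dict]]) -> dict[str, float]:
    """Average the normalized temperature per city, rounded to 2 decimals."""
    by_city: dict[str, list[float]] = defaultdict(list)
    for provider, payload in messages:
        record = normalize(provider, payload)
        by_city[record["city"]].append(record["temp_c"])
    return {city: round(mean(temps), 2) for city, temps in by_city.items()}


# Stand-in for messages consumed from a topic.
messages = [
    ("provider_a", {"city": "Montréal", "temp_k": 293.15}),
    ("provider_b", {"location": "Montréal", "temperature": 22.0}),
]
averages = aggregate(messages)
print(averages)
```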

💼 Professional Experience

🇨🇦 KPMG Canada (Oct 2022 - Present): Senior Consultant, Information Management & Data Analytics

🇧🇪 Belgian Government (Mar 2021 - Oct 2022): Data Scientist

Worked on a graph-based modelling project for COVID-19 infection spread and management
Gained experience with Neo4j, Elasticsearch, PostgreSQL, MongoDB, Prefect, Dask, Python, Apache Airflow, unit testing, CI/CD, JIRA, Agile, API, Pandas, and scikit-learn

🇧🇪 Toyota Motor Corporation (Feb 2020 - Nov 2020): Data Scientist Consultant

Worked on a DataOps project to clean and prepare data from car sensors for R&D use cases
Gained experience with AWS services, Dask, Python, unit testing, CI/CD, JIRA, Agile, API, Pandas, and scikit-learn

🇧🇪 Positive Thinking Company (Oct 2019 - Oct 2022): Data Scientist

Developed an automated tool for resume classification and summarization using NLP techniques
Gained experience with Python, R, Shiny, MongoDB, TF-IDF, word2vec, doc2vec, Random Forest, XGBoost, and Docker

🇫🇷 Devoteam (Feb 2019 - May 2019): Data Consultant

Built a comprehensive web app dashboard for employee management and tracking
Gained experience with Google Cloud Platform (GCP), Docker, web development, Firebase, R, JavaScript, MongoDB, Git, HTML, and CSS

🇫🇷 bioMérieux (Sept. 2016 - Sept. 2018): Data Scientist

Worked on a decision support system for improving doctors' prescribing behavior for infectious diseases
Gained experience with Python, R, inferential statistics, machine learning, dimensionality reduction, business intelligence, metagenomics, differential abundance analysis, nanopore technology, and SQL

🇩🇪 Max Planck Institute (Mar. 2016 - Aug. 2016): Computational Biologist

Developed a differential gene expression analysis workflow using Python, shell, and R languages
Gained experience with Tuxedo suite, DESeq2, MEME suite, GATK, Picard tools, StringTie, GO enrichment, variant calling, and differential expression

🇫🇷 Merial, a Sanofi Company (Mar. 2015 - Sept. 2015): Biological Engineer

Characterized virulence factors and vaccine targets of a bacterial canine pathogen
Gained experience with cell culture techniques, flow cytometry, genetic engineering, northern and western blotting, fluorescent and confocal microscopy, and PCR

Skills

Category	Skills
Programming	🐍 Python, R, 💻 Shell/Bash/Command line
Databases	🍃 MongoDB, 🗃️ SQL, 🔗 Neo4j, 📊 Cassandra, 🔍 Elasticsearch
Big Data	🔄 Apache Kafka, ⚡ Apache Spark, 🗄️ Redis, 📈 Stream Processing
Statistics & Machine Learning	🔬 Inferential Statistics, 📈 Hypothesis testing, 📊 Regression methods, 🔄 Correlation, 📉 Descriptive Statistics, 🚦 Markov model, 🌐 Dimensionality reduction, 🧩 Clustering, 🌳 Decision tree, 🧠 KNN, 🎄 SVM, 🌱 Random forest
Tools	🧰 Git, 📊 Matplotlib, 🔢 NumPy, 🐼 Pandas, 🍃 PyMongo, 🔬 SciPy, 🤖 Scikit-learn, 🌊 Seaborn, 🔗 SQLAlchemy
Web Development	🌐 HTML5/CSS3, 💻 JavaScript, TypeScript, NestJS, Prisma, 🌶️ Flask
DevOps & Cloud	🐳 Docker, ☸️ Kubernetes, 🔍 Prometheus, 📊 Grafana, ☁️ AWS
Environment	💻 High Performance Computing, 🐧 Linux
Data Science	🛠️ Data Engineering, 🧑‍💼 Data Governance, 📈📉📊 Big Data, 🤖 Machine Learning, 📊 Data Analytics, 🍃 MongoDB, 🐳 Docker, 🗃️ PostgreSQL, ☁️ Amazon Web Services (AWS), 📈 JIRA, 🌐 Web Development, 🧑‍🔬 NLP

🏫 Education

University of Rouen Normandie

Master in Bioinformatics and Statistics (2015 - 2018)

Three-year Research & Professional Master's Degree in Bioinformatics, Statistics and Mathematics
Curriculum covers management, processing, and analysis of sequences and massive data
Data science: supervised learning (Regression, Decision Tree, Random Forests, Markov Chains, SVM, KNN, Neural Network) and unsupervised learning (K-means, hierarchical agglomerative clustering)

University of Poitiers

Master's Degree in Bioengineering and Biomedical Engineering (2013 - 2015)

Interdisciplinary program in biomedical research and engineering, drawing on bioengineering, cell and molecular biology, oncology, pharmacology, genetics, and microbiology

University of the French West Indies and Guiana

Bachelor's Degree (Licence) in Biochemistry and Biology (2010 - 2013)

Curriculum covers biochemistry, cellular & molecular biology, immunology, physiology, biological statistics, organic chemistry
