This repo holds my code for general data science and statistics techniques. I hope to turn it into a training course one day, but for now it serves as a reference for my own learning, development and consolidation.
Note that this is a work in progress, and the chapters in each section are not an exhaustive treatment of the subject. I'll add more chapters and sections as I go.
The repo is made up of Jupyter Notebooks and is split into sections as follows:
A00. Introduction to Statistics
A01. Basic Statistics
A02. Probability
A03. Hypothesis & Inference
A00. Introduction to Data Engineering
Axx. Database Types
Axx. Airflow Pipelines
Axx. Architecture
Axx. Big Data Architecture
Axx. Hadoop / Spark etc.
D00. Introduction to Machine Learning
D01. Linear Algebra
D02. Gradient Descent
D03. A First Machine Learning Project
D00. Introduction to Deep Learning
E00. Introduction to Natural Language Processing (NLP)
F00. Introduction to Miscellaneous
F01. GitHub
F02. Cloud Computing
F03. Testing