gomesfernanda/personal-development

This is the beginning of my trajectory to become an awesome data scientist.

I have no degree in computer science and I'm not a "formal" software engineer. However, I have a bachelor's degree in Physics, so my math background is pretty solid.

I already know Python (thanks to an MIT course on edX), but I have A LOT to improve, and I plan to learn Go too.

This is what I have figured out so far for my development:

deeplearning.ai's Deep Learning Specialization

  • Course 1 - Neural networks and deep learning
  • Course 2 - Improving deep neural networks
  • Course 3 - Structuring machine learning projects
  • Course 4 - Convolutional neural networks
  • Course 5 - Sequence models

Book "Feature Engineering for Machine Learning"

Summary here

  • Chapter 1 - The Machine Learning Pipeline
  • Chapter 2 - Fancy Tricks with Simple Numbers
  • Chapter 3 - Text Data: Flattening, Filtering, and Chunking
  • Chapter 4 - The Effects of Feature Scaling: From Bag-of-Words to Tf-Idf
  • Chapter 5 - Categorical Variables: Counting Eggs in the Age of Robotic Chickens
  • Chapter 6 - Dimensionality Reduction: Squashing the Data Pancake with PCA
  • Chapter 7 - Nonlinear Featurization via K-Means Model Stacking
  • Chapter 8 - Automating the Featurizer: Image Feature Extraction and Deep Learning
  • Chapter 9 - Back to the Feature: Building an Academic Paper Recommender

PGA Study

There's a repo for that

  • Some descriptive analytics for the PGA index file
  • Descriptive analytics for siva files on PGA according to some criteria
    • Download siva files
    • Examine siva files
    • Use gitbase to query siva files

Data Science from Scratch

Following O'Reilly book

  • Chapter 01 - Introduction
  • Chapter 02 - A Crash Course in Python
  • Chapter 03 - Visualizing Data
  • Chapter 04 - Linear Algebra
  • Chapter 05 - Statistics
  • Chapter 06 - Probability
  • Chapter 07 - Hypothesis and Inference
  • Chapter 08 - Gradient Descent
  • Chapter 09 - Getting Data
  • Chapter 10 - Working with Data
  • Chapter 11 - Machine Learning
  • Chapter 12 - k-Nearest Neighbors
  • Chapter 13 - Naive Bayes
  • Chapter 14 - Simple Linear Regression
  • Chapter 15 - Multiple Regression
  • Chapter 16 - Logistic Regression
  • Chapter 17 - Decision Trees
  • Chapter 18 - Neural Networks
  • Chapter 19 - Clustering
  • Chapter 20 - Natural Language Processing
  • Chapter 21 - Network Analysis
  • Chapter 22 - Recommender Systems
  • Chapter 23 - Databases and SQL
  • Chapter 24 - MapReduce
  • Chapter 25 - Go Forth and Do Data Science

Using source{d} stack

Following "Introduction to Code As Data & Machine Learning On Code"

  • Getting started with Babelfish
  • Analyzing Git Repositories
  • Getting started with gitbase & gitbase web
  • MLonCode Pre-trained Models
  • Training MLonCode Models

Playing with Kaggle's Titanic dataset

There's a repo for that

  • Acquire data
  • Analyze by describing data
  • Analyze by pivoting features
  • Analyze by visualizing data
  • Wrangle data
  • Model, predict and solve
    • Logistic Regression
    • KNN or k-Nearest Neighbors
    • Support Vector Machines
    • Naive Bayes classifier
    • Decision Tree
    • Random Forest
    • Perceptron
    • Artificial neural network
    • RVM or Relevance Vector Machine

Lean Analytics Book

Following the book

  • To be developed

Natural Language Processing with Python

Following O'Reilly book

  • To be developed

Introducing Go

Following the book by Caleb Doxsey

  • Chapter 01 - Getting Started
  • Chapter 02 - Types
  • Chapter 03 - Variables
  • Chapter 04 - Control Structures
  • Chapter 05 - Arrays, Slices and Maps
  • Chapter 06 - Functions
  • Chapter 07 - Structs and Interfaces
  • Chapter 08 - Packages
  • Chapter 09 - Testing
  • Chapter 10 - Concurrency
  • Chapter 11 - Next Steps

Misc

Concepts that I have no clue about and have to study/practice

  • reflog
  • git bisect
  • binaries
  • packfile
  • namespace
  • xpath
  • testing
  • SDK
  • debug
  • protobuf
  • rpc
