This site publishes the material I teach in my Urban Data Analytics course and GIS courses.

mehdiheris/UrbanDataAnalytics

Urban Data Analysis Course

In this course, we will learn how to find data sources and datasets and how to learn from them. The course situates data analysis tasks and their interpretation in the context of urban planning and research methods, and it uses advanced quantitative and statistical methods to analyze urban issues. Before starting, students are expected to have basic computer literacy, including file management and cloud-based data backups (e.g. Dropbox, Google Drive, MS OneDrive). Students are also expected to have basic mathematical knowledge and experience with data manipulation in MS Excel.

This course uses the Python programming language as the main platform for data manipulation and statistical analysis. Students will learn basic data manipulation in Python and how to measure relationships between variables using statistical methods. We will start by downloading data from NYC’s open data catalog and reading it into Jupyter Notebooks. After basic data visualizations, we will explore the data using descriptive statistics. Then, the fundamentals of inferential statistics will be taught. The coursework will culminate in a final project in which students choose from a pool of datasets to answer independent research questions.

This summer course is fast-paced, and students are expected to learn how to troubleshoot programming problems independently. We will use generative AI in this course for developing basic code. Students who have concerns about using AI need to talk to the instructor in advance.

We will use MS Teams as our communication platform. Students need to be on Teams and be active participants. Being an active participant means you need to ask questions and respond to questions brought by others.

Learning Objectives: The objectives of this course are:

  • To learn about online datasets, open data catalogs, and other relevant resources;
  • To learn basic programming skills (with Python) for data preparation and analysis;
  • To learn descriptive statistics;
  • To learn how to measure relationships between planning-related variables in urban areas and how to use inferential statistics for hypothesis testing;
  • To learn how to use data analysis for critically evaluating existing planning policies and building future alternatives.
Session 1
  • Introduction to the course, our policies, and our resources.
  • If you are not on Teams, you are not in our team.
  • How to leverage AI.
  • Let’s set up Google Colab as your programming environment.
  • We will mount your Google Drive in Colab.
  • We are using this notebook: https://github.com/mehdiheris/UrbanDataAnalytics/blob/main/notebooks/Start_Python_Session_1.ipynb
  • Let’s do Python: What are variables?
  • Let’s do Python: What are data types?
  • Let’s do Python: Create Lists.
  • Let’s do Python: What are For loops?
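The four Python topics above can be sketched in a few lines. This is a minimal illustration rather than the content of the course notebook; the city name, numbers, and district names are made up for the example.

```python
# Variables: names that refer to values (all values here are hypothetical).
city = "Example City"      # str: text
population = 250_000       # int: whole number
area_km2 = 125.5           # float: number with a decimal part
has_subway = True          # bool: True or False

# Data types: type() reports what kind of value a variable holds.
print(type(city), type(population), type(area_km2), type(has_subway))

# Variables can be combined in expressions; int / float gives a float.
density = population / area_km2

# Lists: ordered collections that can grow.
districts = ["North", "South", "East", "West"]
districts.append("Central")

# For loops: repeat the same steps for every item in a list.
name_lengths = []
for district in districts:
    name_lengths.append(len(district))

print(density)
print(districts)
print(name_lengths)
```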
Session 2

(1) Let’s learn stats:

What are the mean, median, and standard deviation of a column/variable?

  • Explanation of mean, median, and standard deviation
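As a small worked example of these three statistics, the sketch below uses Python's built-in `statistics` module on a made-up list of commute times. It assumes the sample standard deviation (dividing by n - 1) is the one intended; the values are illustrative only.

```python
import statistics

# Hypothetical commute times in minutes; the last value is an outlier.
commute_minutes = [20, 25, 30, 35, 90]

# Mean: the sum of the values divided by how many there are.
mean_value = statistics.mean(commute_minutes)

# Median: the middle value once the data are sorted.
median_value = statistics.median(commute_minutes)

# Sample standard deviation: typical distance of values from the mean.
std_value = statistics.stdev(commute_minutes)

print(mean_value, median_value, round(std_value, 2))
```

The single long commute pulls the mean above the median, which is why the median is often preferred for skewed variables such as income or rent.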

(2) Let’s do Python:

Importing libraries
  • import pandas as pd
  • import matplotlib.pyplot as plt
  • import seaborn as sns
Read a CSV table.
  • Start with Dataframes in Pandas
What are columns?
  • Explanation of columns
  • Learn how to get mean, median, min, and max for each column
What is a histogram plot?
  • Explanation of histogram plots
What is a scatter plot?
  • Explanation of scatter plots
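The Session 2 steps can be sketched end to end as follows. To keep the example self-contained, it reads a tiny CSV from a string instead of a downloaded file; the column names and values are invented, and seaborn is left out so the sketch depends only on pandas and matplotlib.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# A tiny made-up CSV; with a real file you would pass its path to read_csv.
csv_text = """neighborhood,median_rent,median_income
A,1500,52000
B,2100,71000
C,1800,64000
D,2600,90000
"""

# Read the CSV table into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Columns are the named variables of the table.
print(df.columns.tolist())

# Mean, median, min, and max of one column.
summary = df["median_rent"].agg(["mean", "median", "min", "max"])
print(summary)

# Histogram: how the values of one variable are distributed.
# Scatter plot: how two variables relate to each other.
fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(10, 4))
ax_hist.hist(df["median_rent"], bins=4)
ax_hist.set_xlabel("median_rent")
ax_scatter.scatter(df["median_income"], df["median_rent"])
ax_scatter.set_xlabel("median_income")
ax_scatter.set_ylabel("median_rent")
```

In a notebook the figure is displayed automatically at the end of the cell; in a plain script, `plt.show()` would be needed.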
Session 3

(1) Let’s learn stats:

- Research hypothesis and null hypothesis.
- What is a categorical variable and what is a numerical variable?

(2) Let’s do Python:

We are using these notebooks:

- Get the stats of a column
- Get the unique values of a column
- Understand datetime objects.
- Create line plots
- What is a method() and what is an argument?
- What are quantiles?
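A minimal sketch of the Session 3 Python topics, assuming a small made-up table of daily complaint counts (the column names `date`, `complaint_type`, and `complaints` are hypothetical and not taken from the course notebooks):

```python
import pandas as pd

# Hypothetical daily complaint counts.
df = pd.DataFrame({
    "date": ["2024-01-01", "2024-01-02", "2024-01-03", "2024-01-04"],
    "complaint_type": ["Noise", "Heat", "Noise", "Parking"],
    "complaints": [10, 20, 30, 40],
})

# Stats of a column: count, mean, std, min, quartiles, max.
column_stats = df["complaints"].describe()

# Unique values of a (categorical) column.
complaint_types = df["complaint_type"].unique()

# Datetime objects: convert text to dates, then use the .dt accessor.
df["date"] = pd.to_datetime(df["date"])
weekday = df["date"].dt.day_name()

# Method and argument: quantile() is the method, 0.5 is the argument.
median_complaints = df["complaints"].quantile(0.5)

# Quantiles: values below which a given share of the data falls.
quartiles = df["complaints"].quantile([0.25, 0.5, 0.75])

# Line plot of the counts over time.
ax = df.plot(x="date", y="complaints", kind="line")

print(column_stats)
print(complaint_types)
print(quartiles)
```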
Session 4

(1) Let’s do Python:

- import numpy as np
- A few things in Numpy
- Create an array
- Calculate mean, median, sum.
- What is a quantile?
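The NumPy steps listed above might look like this in practice. The trip counts are invented for illustration.

```python
import numpy as np

# Create an array from a list of hypothetical daily trip counts.
trips = np.array([3, 5, 8, 12, 22])

# Basic summaries.
mean_trips = trips.mean()
median_trips = np.median(trips)
total_trips = trips.sum()

# Quantiles: the 0.25 quantile is the value below which 25% of the data fall.
q25 = np.quantile(trips, 0.25)
q75 = np.quantile(trips, 0.75)

print(mean_trips, median_trips, total_trips, q25, q75)
```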

(2) Let’s learn stats:

- We will learn what sampling methods are and whether samples are representative of the population.
- What is the margin of error?
- Are polls reliable?
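To make the margin of error concrete, here is a sketch for a poll proportion. It assumes a simple random sample and the usual normal approximation with z = 1.96 for 95% confidence; the function name `margin_of_error` and the poll numbers are hypothetical.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    Assumes a simple random sample and the normal approximation;
    z = 1.96 corresponds to a 95% confidence level.
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical poll: 50% support among 1,000 respondents.
moe_1000 = margin_of_error(0.5, 1000)

# Same result with four times as many respondents.
moe_4000 = margin_of_error(0.5, 4000)

print(f"n=1000: +/- {moe_1000:.3f}")
print(f"n=4000: +/- {moe_4000:.3f}")
```

Quadrupling the sample size only halves the margin of error, and the formula says nothing about bias: a large but unrepresentative sample can still be far off.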
Session 5

(1) Let’s do Python:

- import statsmodels as st
- Get a subset of columns of a DataFrame (df)
- Get a subset of rows of a df
- Calculate new columns
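The three DataFrame operations above can be sketched as follows, using a made-up table of census tracts (the column names are assumptions for illustration):

```python
import pandas as pd

# Hypothetical census tracts.
df = pd.DataFrame({
    "tract": ["T1", "T2", "T3", "T4"],
    "population": [4000, 2500, 6000, 1500],
    "area_km2": [2.0, 1.0, 4.0, 0.5],
})

# Subset of columns: pass a list of column names.
tract_population = df[["tract", "population"]]

# Subset of rows: filter with a True/False condition.
large_tracts = df[df["population"] > 3000]

# New column: computed row by row from existing columns.
df["density"] = df["population"] / df["area_km2"]

print(tract_population)
print(large_tracts)
print(df)
```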

(2) Let’s learn stats:

- We will start learning about inferential statistics.
- What in the world is a P value?
- Let’s unwrap the t statistic, confidence intervals, and hypothesis testing!
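As one possible illustration of these ideas, the sketch below runs a one-sample t-test with SciPy. The choice of `scipy.stats` is an assumption made for the example (the session imports statsmodels), and the rents and the benchmark value are made up.

```python
import numpy as np
from scipy import stats

# Hypothetical sample of monthly rents and a benchmark to test against.
rents = np.array([1850, 2100, 1950, 2200, 2050, 1900, 2150, 2000])
benchmark = 1900

# Null hypothesis: the true mean rent equals the benchmark.
# The t statistic measures how many standard errors the sample mean
# lies from the benchmark; the p-value is the probability of a result
# at least this extreme if the null hypothesis were true.
t_stat, p_value = stats.ttest_1samp(rents, popmean=benchmark)

# 95% confidence interval for the mean, from the t distribution.
n = len(rents)
sample_mean = rents.mean()
standard_error = stats.sem(rents)
ci_low, ci_high = stats.t.interval(0.95, n - 1, loc=sample_mean, scale=standard_error)

# Decision at the conventional 5% significance level.
reject_null = p_value < 0.05

print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
print(f"95% CI: ({ci_low:.1f}, {ci_high:.1f})")
print("Reject the null hypothesis:", reject_null)
```

The test and the interval agree: when the benchmark falls outside the 95% confidence interval, the two-sided p-value is below 0.05.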
Session 6

(1) Let’s do Python:

- Let’s learn Groupby
- Let’s master .loc[] and .iloc[]
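A short sketch of `groupby`, `.loc[]`, and `.iloc[]` on a made-up table of street-tree counts. The row labels `a` to `e` are chosen only to make the difference between label-based and position-based selection visible.

```python
import pandas as pd

# Hypothetical tree counts, with letters as row labels.
df = pd.DataFrame(
    {
        "borough": ["Bronx", "Brooklyn", "Bronx", "Queens", "Brooklyn"],
        "trees": [10, 20, 30, 40, 60],
    },
    index=["a", "b", "c", "d", "e"],
)

# Groupby: split rows by borough, then average each group.
mean_trees = df.groupby("borough")["trees"].mean()

# .loc[] selects by label: row "b", column "trees".
by_label = df.loc["b", "trees"]

# .iloc[] selects by integer position: second row, second column.
by_position = df.iloc[1, 1]

# Label slices include the end label; position slices exclude the end.
rows_a_to_c = df.loc["a":"c"]
first_two_rows = df.iloc[0:2]

# .loc[] also accepts a True/False condition for the rows.
bronx_trees = df.loc[df["borough"] == "Bronx", "trees"]

print(mean_trees)
print(by_label, by_position)
print(len(rows_a_to_c), len(first_two_rows))
```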

(2) Let’s learn stats:

- Are these two variables correlated?
- We will learn about correlation and how to measure and visualize it!
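Measuring and visualizing correlation can be sketched as below. The example assumes Pearson's correlation coefficient, which is the default of pandas `corr()`, and uses invented transit data.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical neighborhoods: number of transit stops and daily ridership.
df = pd.DataFrame({
    "transit_stops": [2, 4, 6, 8, 10],
    "ridership": [15, 32, 41, 63, 70],
})

# Pearson correlation between two columns: ranges from -1 to 1,
# where values near 1 mean the two variables rise together.
r = df["transit_stops"].corr(df["ridership"])

# Correlation matrix for all numeric columns at once.
corr_matrix = df.corr()

# Visualize the relationship with a scatter plot.
fig, ax = plt.subplots()
ax.scatter(df["transit_stops"], df["ridership"])
ax.set_xlabel("transit_stops")
ax.set_ylabel("ridership")
ax.set_title(f"Pearson r = {r:.2f}")

print(round(r, 3))
print(corr_matrix)
```

A high correlation shows that the two variables move together in this sample; it does not by itself show that one causes the other.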
