Urban Data Analysis Course

In this course, we will learn how to find data sources and datasets and how to learn from them. The course situates data analysis tasks and interpretations in the context of urban planning and research methods, and it uses advanced quantitative and statistical methods to analyze urban issues. To start this course, students are expected to have basic computer literacy, including file management and cloud-based data backups (e.g., Dropbox, Google Drive, MS OneDrive). Students are also expected to have basic mathematical knowledge and experience with data manipulation in MS Excel.

This course uses the Python programming language as the main platform for data manipulation and statistical analysis. Students will learn basic data manipulation using Python and measurement of relationships using statistical methods. We will start by downloading data from NYC’s open data catalog and reading them in Jupyter Notebooks. After basic data visualizations, we will start exploring the data using descriptive statistics. Then, the fundamentals of inferential statistics will be taught. The course work will culminate with a final project that students will carry out by choosing among a pool of datasets to answer independent research questions.

This summer course is fast-paced, and students are expected to learn how to troubleshoot programming problems independently. We will use generative AI in this course to develop basic code. Students who have concerns about using AI need to talk to the instructor in advance.

We will use MS Teams as our communication platform. Students need to be on Teams and be active participants. Being an active participant means you need to ask questions and respond to questions brought by others.

Learning Objectives:

To learn about online datasets, open data catalogs, and other relevant resources;
To learn basic programming skills (with Python) for data preparation and analysis;
To learn descriptive statistics;
To learn how to measure relationships between planning-related variables in urban areas and how to use inferential statistics for hypothesis testing;
To learn how to use data analysis for critically evaluating existing planning policies and building future alternatives.

Session 1

Introduction to the course, our policies, and our resources.
If you are not on Teams, you are not in our team.
How to leverage AI.
Let’s set up Google Colab as your programming environment.
We will mount your Google Drive in Colab.
We are using this notebook: https://github.com/mehdiheris/UrbanDataAnalytics/blob/main/notebooks/Start_Python_Session_1.ipynb
Let’s do Python: What are variables?
Let’s do Python: What are data types?
Let’s do Python: Create Lists.
Let’s do Python: What are For loops?
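
The four Python topics above can be tried together in one cell. This is a minimal sketch, not taken from the course notebook; the variable names and the numbers are illustrative assumptions.

```python
# Variables: a name bound to a value. All values here are illustrative.
city = "New York"        # text is a str
population = 8_000_000   # whole numbers are int
area_sq_mi = 300.5       # decimal numbers are float
is_coastal = True        # True/False values are bool

# type() tells you the data type of any value.
types = [
    type(city).__name__,
    type(population).__name__,
    type(area_sq_mi).__name__,
    type(is_coastal).__name__,
]
print(types)

# A list holds several values in order.
boroughs = ["Bronx", "Brooklyn", "Manhattan", "Queens", "Staten Island"]

# A for loop runs the indented block once per item in the list.
name_lengths = []
for borough in boroughs:
    name_lengths.append(len(borough))
    print(borough, len(borough))
```

Changing a value and re-running the cell is a quick way to see how the type and the loop output respond.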

Session 2

(1) Let’s learn stats:

What are the mean, median, and standard deviation of a column/variable?

Explanation of mean, median, and standard deviation
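
As a rough illustration of these three statistics, the sketch below uses Python's built-in `statistics` module on a small, made-up list of commute times. The data are hypothetical, and the sample standard deviation (dividing by n - 1) is assumed here.

```python
import statistics

# Hypothetical commute times in minutes; one long commute acts as an outlier.
commute_minutes = [20, 25, 30, 35, 90]

# Mean: the sum divided by the number of values.
mean_value = statistics.mean(commute_minutes)

# Median: the middle value once the data are sorted.
median_value = statistics.median(commute_minutes)

# Sample standard deviation: typical distance of values from the mean.
std_value = statistics.stdev(commute_minutes)

print(mean_value, median_value, round(std_value, 2))
```

The outlier pulls the mean well above the median, which is one reason both are usually reported together.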

(2) Let’s do Python:

We are using the following notebooks:
- 1 https://github.com/mehdiheris/UrbanDataAnalytics/blob/main/notebooks/import_packages.ipynb
- 2 https://github.com/mehdiheris/UrbanDataAnalytics/blob/main/notebooks/Pandas_Study_Guide_Session_2.ipynb

Importing libraries

import pandas as pd

import matplotlib.pyplot as plt

import seaborn as sns

Read a CSV table.

Start with Dataframes in Pandas

What are columns?

Explanation of columns

Learn how to get mean, median, min, and max for each column

What is a histogram plot?

Explanation of histogram plots

What is a scatter plot?

Explanation of scatter plots
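
The Session 2 steps can be sketched end to end as follows. This is not the course notebook; the CSV text, column names, and values are hypothetical stand-ins so the example runs without downloading a file. With a real dataset you would pass the file path to `pd.read_csv` instead.

```python
import io

import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical CSV content standing in for a downloaded file.
csv_text = """neighborhood,median_rent,trees_per_block
A,1800,4
B,2400,9
C,2100,6
D,3000,12
"""

# Read the CSV into a DataFrame (a table with named columns).
df = pd.read_csv(io.StringIO(csv_text))
print(df.columns.tolist())

# Mean, median, min, and max for each numeric column.
numeric = df[["median_rent", "trees_per_block"]]
summary = numeric.agg(["mean", "median", "min", "max"])
print(summary)

# Histogram: how the values of ONE column are distributed.
# Scatter plot: how TWO columns vary together, one point per row.
fig, (ax_hist, ax_scatter) = plt.subplots(1, 2, figsize=(8, 3))
ax_hist.hist(df["median_rent"], bins=3)
ax_hist.set_xlabel("median_rent")
ax_hist.set_ylabel("count")
ax_scatter.scatter(df["trees_per_block"], df["median_rent"])
ax_scatter.set_xlabel("trees_per_block")
ax_scatter.set_ylabel("median_rent")
fig.tight_layout()  # in a notebook the figure renders below the cell
```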

Session 3

(1) Let’s learn stats:

- Research hypothesis and null hypothesis.

- What is a categorical variable and what is a numerical variable?

(2) Let’s do Python:

We are using these notebooks:

- Get the stats of a column

- Get the unique values of a column

- Understand datetime objects.

- Create line plots

- What is a method() and what is an argument?

- What are quantiles?
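
A small sketch of the Session 3 Python items, assuming a made-up table of service requests (the column names and values are hypothetical, not from the course datasets). It also shows the difference between a categorical and a numerical column.

```python
import pandas as pd

# Hypothetical service-request records.
df = pd.DataFrame({
    "created": ["2024-01-05", "2024-01-20", "2024-02-03", "2024-02-17"],
    "complaint": ["Noise", "Heat", "Noise", "Rodent"],  # categorical variable
    "response_hours": [2.0, 10.0, 4.0, 8.0],            # numerical variable
})

# describe() is a method: a function attached to an object, called with ().
stats = df["response_hours"].describe()
print(stats)

# Unique values of a categorical column, in order of first appearance.
unique_complaints = df["complaint"].unique().tolist()

# Convert text to datetime objects, then pull out parts such as the month.
df["created"] = pd.to_datetime(df["created"])
months = df["created"].dt.month.tolist()

# quantile() is a method; q=0.25 is an argument passed to it.
q25 = df["response_hours"].quantile(q=0.25)
q50 = df["response_hours"].quantile(q=0.5)  # the 0.5 quantile is the median

print(unique_complaints, months, q25, q50)
```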

Session 4

(1) Let’s do Python:

- import numpy as np

- A few things in Numpy

- Create an array

- Calculate mean, median, sum.

- What is a quantile?
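
The NumPy items above might look like this in practice. The bike counts are hypothetical values chosen only for illustration.

```python
import numpy as np

# Hypothetical daily bike counts stored in a NumPy array.
counts = np.array([120, 150, 90, 200, 140])

total = counts.sum()              # sum of all values
mean_count = counts.mean()        # arithmetic mean
median_count = np.median(counts)  # middle value of the sorted data

# The 0.75 quantile: the value below which about 75% of the data fall.
q75 = np.quantile(counts, 0.75)

print(total, mean_count, median_count, q75)
```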

(2) Let’s learn stats:

- We will learn what sampling methods are and whether samples are representative of the population.

- What is the margin of error?

- Are polls reliable?
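
One common way to compute a poll's margin of error is sketched below. It assumes simple random sampling and the normal approximation at a 95% confidence level (z = 1.96); real polls often use weighting that this sketch ignores. The function name `margin_of_error` and the example numbers are hypothetical.

```python
import math


def margin_of_error(p_hat, n, z=1.96):
    """Approximate margin of error for a sample proportion.

    p_hat: proportion observed in the sample (between 0 and 1)
    n: sample size
    z: critical value; 1.96 corresponds to 95% confidence
    """
    return z * math.sqrt(p_hat * (1 - p_hat) / n)


# Hypothetical poll: 50% support among 1,000 respondents.
moe_1000 = margin_of_error(0.5, 1000)

# Quadrupling the sample size halves the margin of error.
moe_4000 = margin_of_error(0.5, 4000)

print(round(moe_1000, 3), round(moe_4000, 3))
```

The formula only captures sampling error. A sample that is not representative of the population can be badly wrong even with a small margin of error.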

Session 5

(1) Let’s do Python:

- import statsmodels.api as sm

- Get a subset of columns of a df

- Get a subset of rows of a df

- Calculate new columns
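
The three DataFrame operations above can be sketched as follows, using a hypothetical table of census tracts (names and numbers are made up for illustration).

```python
import pandas as pd

# Hypothetical census-tract table.
df = pd.DataFrame({
    "tract": ["T1", "T2", "T3", "T4"],
    "population": [4000, 2500, 6000, 3500],
    "area_sq_km": [2.0, 1.0, 4.0, 0.5],
})

# Subset of columns: pass a list of column names.
cols = df[["tract", "population"]]

# Subset of rows: keep rows where a condition is True.
large = df[df["population"] > 3000]

# New column calculated from existing columns.
df["density"] = df["population"] / df["area_sq_km"]

print(cols.columns.tolist())
print(large["tract"].tolist())
print(df["density"].tolist())
```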

(2) Let’s learn stats:

- We will start learning about inferential statistics.

- What in the world is a P value?

- Let’s unwrap the T statistic; confidence intervals; and hypothesis testing!
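
To make the t statistic, p-value, and confidence interval concrete, here is a minimal one-sample t-test sketch. It uses `scipy.stats` rather than statsmodels, which is a substitution made for brevity and not necessarily what the course notebooks use. The sample values and the hypothesized mean `mu0` are hypothetical.

```python
import math
import statistics

from scipy import stats

# Hypothetical sample of commute times (minutes).
sample = [28, 32, 35, 30, 40, 33, 29, 37]
mu0 = 30  # null hypothesis: the population mean is 30 minutes

n = len(sample)
mean = statistics.mean(sample)
sd = statistics.stdev(sample)  # sample standard deviation
se = sd / math.sqrt(n)         # standard error of the mean

# t statistic: how many standard errors the sample mean is from mu0.
t_manual = (mean - mu0) / se

# SciPy computes the same statistic plus a two-sided p-value.
result = stats.ttest_1samp(sample, popmean=mu0)

# 95% confidence interval for the mean, using the t distribution.
t_crit = stats.t.ppf(0.975, df=n - 1)
ci = (mean - t_crit * se, mean + t_crit * se)

print(t_manual, result.statistic, result.pvalue, ci)
```

A p-value below 0.05 goes hand in hand with `mu0` falling outside the 95% confidence interval; both are two views of the same test.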

Session 6

(1) Let’s do Python:

- Let’s learn Groupby

- Let’s master .loc[] and .iloc[]
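
A short sketch of `groupby`, `.loc[]`, and `.iloc[]` on a tiny table. The rent values and the row labels are hypothetical.

```python
import pandas as pd

# Hypothetical rents; the index labels "a" to "d" are arbitrary row names.
df = pd.DataFrame(
    {
        "borough": ["Bronx", "Bronx", "Queens", "Queens"],
        "rent": [1500, 1700, 2000, 2200],
    },
    index=["a", "b", "c", "d"],
)

# groupby: split rows by borough, then average rent within each group.
mean_rent = df.groupby("borough")["rent"].mean()

# .loc[] selects by LABEL: the row labeled "c", the column named "rent".
by_label = df.loc["c", "rent"]

# .iloc[] selects by POSITION: first row (0), second column (1).
by_position = df.iloc[0, 1]

# .loc[] also accepts a True/False condition to filter rows.
queens_rows = df.loc[df["borough"] == "Queens"]

print(mean_rent)
print(by_label, by_position, len(queens_rows))
```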

(2) Let’s learn stats:

- Are these two correlated?

- We will learn about correlation and how to measure and visualize it!
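
Correlation between two numeric columns can be measured with pandas as sketched below. The column names and values are hypothetical, and the Pearson coefficient (the pandas default) is assumed.

```python
import pandas as pd

# Hypothetical data: transit stops and ridership for five areas.
df = pd.DataFrame({
    "transit_stops": [1, 2, 3, 4, 5],
    "ridership": [10, 25, 28, 45, 50],
})

# Pearson correlation between two columns: ranges from -1 to 1.
r = df["transit_stops"].corr(df["ridership"])

# Correlation matrix for every pair of numeric columns.
matrix = df.corr()

print(round(r, 3))
print(matrix)
```

A scatter plot of the two columns is the usual way to visualize this relationship. Keep in mind that a high correlation does not by itself show that one variable causes the other.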
