The INTerventions, Equity, Research, and Action in Cities Team (INTERACT) is a pan-Canadian collaboration of scientists, urban planners, public health practitioners, community partners, and members of the public, uncovering how the design of our cities is shaping the health and well-being of all Canadians. Since 2017, INTERACT has collected data from cohorts in four Canadian cities: Victoria, Vancouver, Saskatoon, and Montreal.
Participants provide data via these sources:
- Linkage
  - Linkage: Participation metadata. For each data source, participants are assigned an ID; these IDs are tracked through the city's linkage file (one linkage file per city, per wave).
- Surveys (Polygon):
  - Health: The health questionnaire features questions on INTERACT’s key health outcomes (physical activity, social connectedness, and well-being), mobility, socio-demographic data, and neighbourhood.
  - VERITAS (Visualisation, Evaluation and Recording of Itineraries and Activity Spaces): A map-based survey that collects data to help understand the complex interactions between daily mobility, social networks, and urban environments. These questions are asked across all sites.
- SenseDoc
  - SenseDoc: The SenseDoc is a research-grade multisensor device used for mobility (GPS) and physical activity (accelerometer) tracking. These data are collected continuously and allow us to measure location-based physical activity and infer transportation mode.
- Ethica
  - Ethica: Ethica is a research-grade smartphone app used for mobility and physical activity tracking. These data are collected in a 1-in-5 duty cycle (1 minute active, 4 minutes idle). It collects:
    - GPS
    - WiFi signals
    - Activity recognition
    - Pedometer
    - Battery
    - Accelerometer data
- EMA (Ethica)
  - Ecological momentary assessment (EMA) responses collected through Ethica. The questions differ from city to city. EMA data have their own folder because their format differs from the rest of the Ethica data and requires bespoke data wrangling.
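To make the Ethica 1-in-5 duty cycle concrete, here is a minimal Python sketch of how much recording it yields. It assumes, purely for illustration, that every 5-minute cycle starts with its active minute; the actual phase of the cycle on a given phone is not specified here, and the function names are hypothetical.

```python
CYCLE_SECONDS = 5 * 60   # one full cycle: 1 minute active + 4 minutes idle
ACTIVE_SECONDS = 60      # length of the active part of each cycle


def is_active(seconds_since_start: int) -> bool:
    """Return True if this offset falls in the active minute of its cycle.

    Assumption: each cycle begins with the active minute.
    """
    return seconds_since_start % CYCLE_SECONDS < ACTIVE_SECONDS


def active_seconds(duration_seconds: int) -> int:
    """Count the seconds expected to be recorded within a given duration."""
    return sum(is_active(t) for t in range(duration_seconds))


one_hour = 60 * 60
print(active_seconds(one_hour))             # seconds recorded in one hour
print(active_seconds(one_hour) / one_hour)  # fraction of the hour recorded
```

Under these assumptions, one hour of participation contains 12 cycles and therefore 720 seconds of recorded data, one fifth of the elapsed time.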
In addition to the data sets listed above, two other data sets are produced:
- Essence table: computed from the Health and VERITAS sources, it presents a selection of core and frequently used variables. It also provides a series of derived metrics based on Camille Perchoux's spatial toolbox and Alexandre Naud's social toolbox.
- Table of Power (ToP): computed from GPS and accelerometer data from both SenseDoc and Ethica, it provides an aggregated data set combining location (GPS) and physical activity level (AXL) at various epochs (1 second / 1 minute / 5 minutes). NB: because of the specifics of Ethica's data collection pattern, the Ethica ToP does not include physical activity (PA) levels.
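The idea of aggregating records into epochs can be sketched as follows. This is an illustration only, not the INTERACT implementation: the field names (`utcdate`, `lat`, `lon`, `activity_count`) and the aggregation rules (mean position, summed activity counts) are assumptions chosen for the example.

```python
from collections import defaultdict
from datetime import datetime, timezone
from statistics import mean


def aggregate_to_epochs(records, epoch_seconds):
    """Group records into fixed-length epochs and summarise each epoch.

    Each record is a dict with hypothetical keys: 'utcdate' (timezone-aware
    datetime), 'lat' and 'lon' (floats), and 'activity_count' (number).
    """
    buckets = defaultdict(list)
    for rec in records:
        ts = int(rec["utcdate"].timestamp())
        epoch_start = ts - ts % epoch_seconds  # floor to the epoch boundary
        buckets[epoch_start].append(rec)

    rows = []
    for epoch_start in sorted(buckets):
        group = buckets[epoch_start]
        rows.append({
            "epoch_start": datetime.fromtimestamp(epoch_start, tz=timezone.utc),
            "lat": mean(r["lat"] for r in group),    # assumed rule: mean position
            "lon": mean(r["lon"] for r in group),
            "activity_count": sum(r["activity_count"] for r in group),
            "n_records": len(group),
        })
    return rows


# Made-up sample records, for illustration only.
sample_records = [
    {"utcdate": datetime(2021, 6, 1, 12, 0, 10, tzinfo=timezone.utc),
     "lat": 45.50, "lon": -73.57, "activity_count": 3},
    {"utcdate": datetime(2021, 6, 1, 12, 0, 40, tzinfo=timezone.utc),
     "lat": 45.52, "lon": -73.59, "activity_count": 5},
    {"utcdate": datetime(2021, 6, 1, 12, 1, 5, tzinfo=timezone.utc),
     "lat": 45.53, "lon": -73.60, "activity_count": 2},
]

for row in aggregate_to_epochs(sample_records, epoch_seconds=60):
    print(row["epoch_start"].isoformat(), row["activity_count"], row["n_records"])
```

Calling the same function with `epoch_seconds=300` would collapse the three sample records into a single 5-minute epoch.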
See Folders for each data source.
- Collect: Data is collected from participants
- Extract: Data is imported from original data source and moved to Compute Canada
- BACKUP
- Validate: Data is verified to ensure correct format and metadata. Any data issue is flagged to the research team and fixed at this step before the data continues through the pipeline. If corrections are made, the data is backed up to nearline again, replacing the previous backup.
- Load: Data is loaded by developer.
- Transform: Data is cleaned or transformed as needed.
- OUTPUT: Elite files (flat tables of intermediary files for expert members of the team)
- Produce: Data is output as flat tables used by the larger research team. For SenseDoc (SD) and Ethica, these are the Tables of Power. For survey data, this is the Essence table.
- OUTPUT: Create CSVs of Polygon surveys, or of the ToP for SenseDoc or Ethica
- Describe: Data is described with summary statistics to help check data is as expected.
Each data source follows these pipeline steps. Details can be found in their respective folders.
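As a rough illustration of what a Validate step might check, the sketch below flags rows of a CSV file that lack a value in a required field. The column names (`interact_id`, `city`, `wave`), the sample rows, and the function name are hypothetical; the actual checks for each data source are described in its own folder.

```python
import csv
import io


def validate_rows(rows, required_fields):
    """Return a list of (row_number, message) issues; empty if all rows pass."""
    issues = []
    for number, row in enumerate(rows, start=1):
        for field in required_fields:
            value = row.get(field)
            if value is None or str(value).strip() == "":
                issues.append((number, f"missing value for '{field}'"))
    return issues


# Made-up linkage-style content, for illustration only.
sample_csv = """interact_id,city,wave
101,victoria,1
,montreal,1
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
issues = validate_rows(rows, required_fields=["interact_id", "city", "wave"])
for row_number, message in issues:
    print(f"row {row_number}: {message}")
```

Returning the issues as a list, rather than raising on the first problem, lets all of them be flagged to the research team in one pass before the data moves on to the Load step.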
