Code and data for Temperature shapes language sonority: Revalidation from a large dataset.
The following 5 steps can be run separately, because the output of each step is already provided in this repository (see Data below). Steps 1 and 2 require local copies of the ASJP dataset and the FLDAS dataset; you can skip these two steps if you do not want to download the full datasets.
Run python get_sonority.py [raw_path], where [raw_path] is the path to the raw folder of the local ASJP dataset (e.g. python get_sonority.py C:/ASJP/raw/). Results will be saved as sonorities.csv, phones.csv, word_structures.csv, and word_lengths.csv in the data folder.
Run python get_temperature.py [FLDAS_path] to extract monthly temperature data for all doculects in sonorities.csv, where [FLDAS_path] is the path to the FLDAS_NOAH01_C_GL_M.001 folder of the local FLDAS dataset (e.g. python get_temperature.py C:/FLDAS/FLDAS_NOAH01_C_GL_M.001/). Results will be saved as data/temperatures.csv.
Run python get_temperature_global.py [FLDAS_path] to extract global monthly mean temperature data. The result will be saved as temperature_global.csv.
Run python plot_global.py. Plot will be saved as figure/global.png.
Run python process.py. Results will be saved as data.csv, data_genus.csv, data_family.csv, and data_macroarea.csv in the data folder.
Run the corresponding code blocks in process.r in R.
Run python test_vowel_length_solutions.py [raw_path]. Results will be saved as data/vowel_length_solutions.csv. Then run the “Plot correlations between vowel length solutions” code block in process.r to plot the correlations.
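The Python steps above can be chained from a small driver. The sketch below only builds the command lines; build_commands is a hypothetical helper that is not part of this repository, and it assumes the scripts are run from the repository root. Scripts that need a local dataset are skipped when the corresponding path is not given, and the R code blocks in process.r still have to be run separately.

```python
import sys


def build_commands(raw_path=None, fldas_path=None):
    """Return the Python command lines in the order listed above.

    Hypothetical helper: scripts that need the ASJP raw folder or the
    FLDAS folder are left out when the corresponding path is None.
    """
    py = sys.executable
    commands = []
    if raw_path is not None:
        commands.append([py, "get_sonority.py", raw_path])
    if fldas_path is not None:
        commands.append([py, "get_temperature.py", fldas_path])
        commands.append([py, "get_temperature_global.py", fldas_path])
    # These two scripts only read files already provided in the repository.
    commands.append([py, "plot_global.py"])
    commands.append([py, "process.py"])
    if raw_path is not None:
        commands.append([py, "test_vowel_length_solutions.py", raw_path])
    return commands
```

To execute the commands, pass each one to subprocess.run(cmd, check=True) so that a failing script stops the chain.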
All extracted data files are in the data folder.
temperatures.csv: Monthly temperature (1982–2022) for each filtered doculect
temperature_global.csv: Global mean annual temperature over 41 years (180° W–180° E, 60° S–90° N)
sonorities.csv: Mean sonority index (MSI) of each filtered doculect. We adapted 5 methods to calculate MSI from ASJP codes:
  index0: Parker’s scale, from “Sonority” in The Blackwell Companion to Phonology
  index1: Fought’s scale, from “Sonority and climate in a world sample of languages: Findings and prospects”
  index2: List’s scale, from Sequence Comparison in Historical Linguistics
  index3: Clements’s scale, from “The role of the sonority cycle in core syllabification” in Papers in Laboratory Phonology
  index4: Sonorant index (here obstruent = 1; sonorant = 2)
  index5: Vowel index (here consonant = 1; semivowel = 2; vowel = 3)
  index6: List’s scale, calculated using LingPy tokens2class()
phones.csv: Extracted phones from all doculects
word_structures.csv: Word structures statistics of all doculects, characterized by C (= consonant) and V (= vowel) symbols
word_structures_grouped.csv: Word lengths statistics of all doculects
vowel_length_solutions.csv: MSI results under three vowel length solutions
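As a rough illustration of the vowel index (index5) and of the C/V word structures, here is a minimal sketch. It is not the code in get_sonority.py. It assumes that the ASJPcode vowels are i, e, E, 3, a, u, o and the semivowels are w and y, that MSI is the arithmetic mean of the scores of all segments pooled over a word list, and that semivowels count as C in word structures. ASJP modifier symbols are not handled, and the function names are hypothetical.

```python
# Assumed ASJPcode symbol classes (an assumption, not read from the repository).
VOWELS = set("ieE3auo")
SEMIVOWELS = set("wy")


def vowel_index(segment):
    """Score one segment: consonant = 1, semivowel = 2, vowel = 3."""
    if segment in VOWELS:
        return 3
    if segment in SEMIVOWELS:
        return 2
    return 1


def mean_sonority(words):
    """Mean score over all segments of all words (pooling is an assumption)."""
    scores = [vowel_index(seg) for word in words for seg in word]
    return sum(scores) / len(scores)


def cv_structure(word):
    """Collapse a word to C/V symbols; semivowels are treated as C here."""
    return "".join("V" if seg in VOWELS else "C" for seg in word)
```

For example, mean_sonority(["mama"]) averages the scores 1, 3, 1, 3, and cv_structure("kat") gives the pattern CVC.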
data.csv: Data for each filtered doculect, with temperature data and linguistic data combined
  WL: Mean word length
  Index0 to Index6: MSIs from the 7 methods
  T: Mean annual temperature
  T_max: Max of 41-year mean monthly temperatures
  T_min: Min of 41-year mean monthly temperatures
  T_sd: Standard deviation of monthly temperatures over 41 years
  T_diff: Mean annual range of temperature
  Index0_trans, etc.: Transformed versions of the data above
data_genus.csv: Data for each language “genus” classified by WALS
data_family.csv: Data for each language family classified by WALS
data_macroarea.csv: Data for each macroarea (North America, South America, Eurasia, Africa, Greater New Guinea, and Australia)
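The temperature columns of data.csv can be read as summaries of a years × 12 table of monthly temperatures. The sketch below shows one plausible reading of those definitions, not the code in process.py. In particular, it assumes that T_sd is the population standard deviation over all months, and that T_diff is the mean of each year's warmest-minus-coldest month. temperature_summary is a hypothetical name.

```python
from statistics import mean, pstdev


def temperature_summary(monthly):
    """Summarise monthly temperatures given as a list of years,
    each year being a list of 12 monthly values."""
    all_months = [t for year in monthly for t in year]
    # Mean of each calendar month across all years (12 values).
    month_means = [mean(year[m] for year in monthly) for m in range(12)]
    return {
        "T": mean(all_months),                    # mean annual temperature
        "T_max": max(month_means),                # warmest mean month
        "T_min": min(month_means),                # coldest mean month
        "T_sd": pstdev(all_months),               # assumed: population SD
        "T_diff": mean(max(year) - min(year) for year in monthly),  # assumed
    }
```

With 41 years of data, monthly would hold 41 lists of 12 values each.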
All saved figure files are in the figures folder.
global.png (also converted into global.pdf): Global distribution of MATs and MSIs
distribution.pdf: Distribution of MATs and MSIs grouped by macroarea
correlation.pdf: Relationship between MSI and MAT
correlation_by_family.pdf: Relationship between MSI and MAT of the top 25 largest families
word_length.pdf: Relationship between mean word length and MSI or MAT
word_length_by_family.pdf: Relationship between MSI and mean word length of the top 25 largest families
range.pdf: Relationship between mean annual range and MAT
vowel_length_solutions.pdf: Relationship between vowel length solutions