- Add the developers of your organization/project to be affiliated in `./developers_affiliations.txt` in the proper format, for example:

  ```
  developer1: email1@xyz, email2@abc, ...
  company1
  company2 until YYYY-MM-DD
  developer2: email3@xyz, email4@pqr, ...
  company3
  company4 until YYYY-MM-DD
  ```

  Then `cd src/`, generate a new email map using `./import_affs.sh`, and move it into place: `mv email-map cncf-config/email-map`.
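For a concrete illustration, an entry in the format above might be appended like this. The developer, emails, and companies here are all hypothetical; the `until` clause marks a time-bounded affiliation:

```shell
# Hypothetical example entry; the name, emails, and companies are made up.
cat >> developers_affiliations.txt <<'EOF'
Jane Doe: jane@example.com, jdoe@corp.example
Example Corp
NewCo until 2020-01-01
EOF
```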
- Clone all repositories of the project at `~/dev/project_name/`. To get the repository list you can either use the `cncf/velocity` project (writing an SQL query in its `BigQuery` folder) or create a new shell script named `clone_project_name.sh` in `~/dev/cncf/gitdm/` with the following contents:

  ```shell
  #!/bin/bash
  mkdir ~/dev/project_name/ 2>/dev/null
  cd ~/dev/project_name || exit 1
  git clone github_repo_clone_url_for_your_project1 || exit 1
  git clone github_repo_clone_url_for_your_project2 || exit 1
  ...
  echo "All project_name repos cloned"
  ```

  Paste all of the repositories' clone URLs manually. Save the file, make it executable with `chmod +x ./clone_project_name.sh`, then run it: `./clone_project_name.sh`. This clones all repos into `~/dev/project_name/`. Note: replace `project_name` with your GitHub organization name.
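As an alternative to pasting clone URLs by hand, a script of the shape shown above can be generated from a plain list of URLs. This is only a sketch: `repos.txt` and the URLs in it are hypothetical stand-ins for your real repository list.

```shell
#!/bin/bash
# Sketch: build clone_project_name.sh from a list of clone URLs.
# repos.txt (one URL per line) and the URLs below are hypothetical.
printf '%s\n' \
  "https://github.com/project_name/repo1.git" \
  "https://github.com/project_name/repo2.git" > repos.txt

{
  echo '#!/bin/bash'
  echo 'mkdir ~/dev/project_name/ 2>/dev/null'
  echo 'cd ~/dev/project_name || exit 1'
  while read -r url; do
    # One guarded clone line per repository, as in the manual script.
    echo "git clone $url || exit 1"
  done < repos.txt
  echo 'echo "All project_name repos cloned"'
} > clone_project_name.sh
chmod +x clone_project_name.sh
```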
- To generate the `git.log` file, use `./all_repos_log.sh ~/dev/project_name/*`. Then deduplicate it (make it `uniq`).
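One way to make `git.log` unique, assuming the ordering of log lines is not significant. A tiny sample file stands in here for the real output of `all_repos_log.sh`:

```shell
# Sample stand-in for the git.log produced by all_repos_log.sh.
printf 'commit-a\ncommit-b\ncommit-a\n' > git.log
# Deduplicate the lines "in place" via a temporary file.
sort -u git.log > git.log.uniq && mv git.log.uniq git.log
```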
- To run `cncf/gitdm` on the generated `git.log` file:

  ```shell
  ~/dev/cncf/gitdm/cncfdm.py -i git.log -r "^vendor/|/vendor/|^Godeps/" -R -n -b ./src/ -t -z -d -D -U -u -o all.txt -x all.csv -a all_affs.csv > all.out
  ```
- To generate human-readable text affiliation files: `SKIP_COMPANIES="(Unknown)" ./gen_aff_files.sh`
- If updating via `ghusers.sh` or `ghusers_cached.sh` (step 6), update the `repos` array in `./ghusers.rb` with your org/project repository list, then run `generate_actors.sh` as well. Before doing so, make sure devstats is set up, and update `./generate_actors.sh` after its first line with:

  ```shell
  sudo -u postgres psql -tA your_pg_database_name < ~/dev/go/src/devstats/util_sql/all_actors.sql > actors.txt
  ```

  Now run `./generate_actors.sh`.
- Consider `./ghusers_cached.sh` or `./ghusers.sh` (if you run the latter, copy the resulting JSON somewhere and take the 0-committers from the previous version to save GitHub API points). Sometimes you should just run `./ghusers.sh` without the cache.
- `ghusers_partially_cached.sh` will refetch repository metadata and commits but take user data from `github_users.json`, so you can save a lot of API points.
- To update (enhance) `github_users.json` with new affiliations, run `./enchance_json.sh`.
- To merge data for multiple GitHub logins (for example, to propagate a known affiliation to unknown or not-found entries for the same GitHub login), run `./merge_github_logins.sh`.
- Because this can find new affiliations, you can now use `./import_from_github_users.sh` to import back from `github_users.json`, and then restart from step 3.
- Run `./correlation.sh` and examine its output `correlations.txt` to normalize company names: remove common suffixes like Ltd. and Corp., and reconcile upper/lowercase differences.
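A minimal sketch of the kind of normalization this step is after. The suffix list and the sample company names are illustrative only, and GNU sed is assumed for the `\L` (lowercase) replacement escape:

```shell
# Hypothetical company-name variants, as they might appear in correlations.txt.
printf 'Acme Ltd.\nACME Corp.\nacme\n' > companies.txt
# Strip common corporate suffixes, lowercase everything, then deduplicate.
sed -E 's/[[:space:]]+(Ltd\.?|Corp\.?|Inc\.?)$//; s/.*/\L&/' companies.txt | sort -u
```

All three variants collapse to a single normalized name, which is the effect you want before feeding affiliations back into the config.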
- Run `./lookup_json.sh` and examine its output JSONs. Those GitHub profiles have some useful data directly available, which will save you some manual research work.
- ALWAYS run `./handle_forbidden_data.sh` before any commit to GitHub to remove any forbidden affiliations; please also see `FORBIDDEN_DATA.md`.
- You can use `./clear_affiliations_in_json.sh` to clear all affiliations in a generated `github_users.json`.
- You can create a smaller final JSON for `cncf/devstats` using `./strip_json.sh github_users.json stripped.json; cp stripped.json ~/dev/go/src/devstats/github_users.json`.