This is the code for Project 1 of Computer System Performance
This code implements a concurrent and an independent data partitioning algorithm
Compile, test and run via control.sh
You can pass in:
- run: runs ./run.sh, here you have to specify a second arg
- test: runs out.o with default args
- build: compiles the code
./control.sh <function>
./control.sh build
Compiled: out.o
To run one of the experiments you can write
- hyper or hyper_threading
- one or numa_0
- multi or multi_numa
./control.sh hyper
./control.sh one
./control.sh multi
If you want to specify a specific affinity file, you can use run as the first argument and the file name as the second argument
./control.sh run <affinity_file_name>
E.g.
./control.sh run linear
Run the bash script and regenerate the results in the results folder
run.sh takes the file name of an affinity file (see the affinity section)
./run.sh <affinity_file_name>
The affinity_file_name is processed in run.sh
./run.sh linear
This is going to look in the affinity/ folder, find linear.txt, and pass affinity/linear.txt to out.o
The affinity files are .txt files that specify how to set the affinity.
The files are located in the affinity/ folder.
The different affinity files
- hyper_threading
  - Note: Assigns pairs of threads to the same core to achieve hyper-threading
  - Pattern:
    0 0 1 1 ...
- multi_numa
  - Note: Splits the work up evenly between the NUMA nodes
  - Pattern:
    0 8 1 9 ...
- numa_0
  - Note: Runs only on NUMA node 0
  - Pattern:
    0 1 2 3 ...
- numa_1
  - Note: Runs only on NUMA node 1
  - Pattern:
    8 9 10 11 ...
Not used at the moment:
- even
- odd
- linear
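The four patterns listed above can be reproduced with a few lines of Python. This is only a sketch: the function names are hypothetical, `cores_per_node=8` is an assumption inferred from the `0 8 1 9 ...` and `8 9 10 11 ...` patterns, and writing the ids as one space-separated line is a guess at the file layout, not something the affinity files are confirmed to use.

```python
def affinity_pattern(name, num_threads, cores_per_node=8):
    """Return one core id per thread for the named pattern.

    cores_per_node=8 is an assumption inferred from the documented
    patterns; adjust it for a different machine.
    """
    if name == "hyper_threading":
        # Two consecutive threads share a core id: 0 0 1 1 ...
        return [i // 2 for i in range(num_threads)]
    if name == "multi_numa":
        # Alternate between node 0 and node 1: 0 8 1 9 ...
        return [(i % 2) * cores_per_node + i // 2 for i in range(num_threads)]
    if name == "numa_0":
        # Fill node 0 only: 0 1 2 3 ...
        return list(range(num_threads))
    if name == "numa_1":
        # Fill node 1 only: 8 9 10 11 ...
        return [cores_per_node + i for i in range(num_threads)]
    raise ValueError(f"unknown pattern: {name}")


def format_pattern(core_ids):
    """Join core ids with spaces (assumed file layout)."""
    return " ".join(str(c) for c in core_ids)


if __name__ == "__main__":
    for name in ("hyper_threading", "multi_numa", "numa_0", "numa_1"):
        print(name, "->", format_pattern(affinity_pattern(name, 4)))
```

With four threads this yields the same leading values as the patterns documented above, which makes it a quick way to sanity-check a hand-written affinity file.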
Compile the main script (remember to use -pthread)
g++ -o out.o main.cpp abstract_method.cpp concurrent_method.cpp independent_method.cpp -pthread
Compile the generate script
g++ -o generate.o generate.cpp
Arguments
- hashbits=4
- number of threads=4
- verbose=0 {0,1,2}
  0 = no printing
  1 = show progress
  2 = additional information
- method {0,1}
- path_to_affinity_file
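To give a feel for what the hashbits argument might control, here is a minimal Python sketch. It assumes that hashbits selects 2^hashbits output partitions and that a tuple's partition is taken from the lowest hashbits bits of its key. Both are assumptions about the C++ code, which may use a different hash function, and the function names are hypothetical.

```python
def partition_index(key, hashbits):
    """Keep the lowest `hashbits` bits of the key (assumed hash function)."""
    return key & ((1 << hashbits) - 1)


def partition(tuples, hashbits):
    """Distribute (key, value) tuples over 2**hashbits partitions."""
    partitions = [[] for _ in range(1 << hashbits)]
    for key, value in tuples:
        partitions[partition_index(key, hashbits)].append((key, value))
    return partitions


if __name__ == "__main__":
    parts = partition([(0, "a"), (1, "b"), (17, "c")], hashbits=4)
    print(len(parts))   # number of partitions
    print(parts[1])     # keys 1 and 17 share their lowest 4 bits
```

Under these assumptions, raising hashbits by one doubles the number of partitions.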
./out.o <hashbits> <number_of_threads> <verbose> <method> <affinity_file_name>
This will run out.o with 4 hashbits and 8 threads without printing
./out.o 4 8 0 1 affinity/numa_0.txt
Arguments for generate.o
- Number of key value pairs to generate
./generate.o <number>
Note that you can only generate data whose size is a power of 2
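If you are unsure whether a size is valid, the power-of-2 requirement can be checked up front. The helper below is a small Python sketch with a hypothetical name; it is not part of generate.cpp.

```python
def is_power_of_two(n):
    """True when n is a positive integer with exactly one bit set."""
    return n > 0 and (n & (n - 1)) == 0


if __name__ == "__main__":
    for n in (6, 8, 1024):
        print(n, is_power_of_two(n))
```

The bit trick works because subtracting 1 from a power of 2 flips its single set bit and sets every lower bit, so the bitwise AND of the two values is zero.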
For example:
./generate.o 8
Create a new tmux session with
tmux new-session -s latencylegends12
Once inside the session, run the shell script
./run.sh
To detach from an active session, press Ctrl + B, then release both keys, then press D.
To list active sessions, write
tmux ls
To reattach to a running/active session, write
tmux attach -t latencylegends12
You can kill the session from outside with
tmux kill-session -t latencylegends12
or alternatively, write 'exit' inside the session.
To copy the results to your Desktop
scp -r group12@dionysos.itu.dk:/home/group12/csp/results ~/Desktop/
Create an environment with conda or Python, then install the dependencies
pip install -r requirements.txt
The easiest way to regenerate requirements.txt is to use pipreqs. Install it with pip install pipreqs
pipreqs . --force
Note:
--force overwrites the current requirements.txt file.
. (the dot in the command) is the path to the project, so "." is the current folder.
Run graph.py to generate the two graphs concurrent_fig.png and independent_fig.png
Note: it is assumed that ./run.sh has been run before generating the graphs
python graph.py
This will run the block inside if __name__ == '__main__'
To run with perf:
perf stat -e cycles,instructions,L1-icache-load-misses,L1-dcache-load-misses,LLC-load-misses,cache-misses,uops_retired.stall_cycles,branch-misses,iTLB-load-misses,dTLB-load-misses -o perf_results.txt ./run.sh
The output will be saved in the file specified (perf_results.txt).
In the output, cpu_core refers to P-cores (performance), cpu_atom refers to E-cores (efficiency).
The results are saved in the folder results/
Files are formatted: <methods>_<#threads>.csv
Each file contains
- hash_bits
- mil_tup_per_sec
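A result file can be loaded for inspection with the standard csv module. This is a sketch under stated assumptions: it expects a comma-separated file whose header row uses the names hash_bits and mil_tup_per_sec, and both the function name and the example path are hypothetical. graph.py may read the files differently.

```python
import csv


def load_results(path):
    """Read one results file into a list of (hash_bits, mil_tup_per_sec) pairs.

    Assumes a comma-separated file with a header row containing the
    column names hash_bits and mil_tup_per_sec.
    """
    with open(path, newline="") as f:
        reader = csv.DictReader(f)
        return [
            (int(row["hash_bits"]), float(row["mil_tup_per_sec"]))
            for row in reader
        ]


if __name__ == "__main__":
    # Hypothetical file name following the <methods>_<#threads>.csv scheme.
    for hash_bits, throughput in load_results("results/concurrent_4.csv"):
        print(hash_bits, throughput)
```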