parallel computing
[[TOC]]
PROPTI is able to make use of multiple CPUs, or cores, to speed up the overall process. Depending on the task, there are different options. When utilising the SPOTPY package via PROPTI, one can use the parallelisation capabilities implemented in SPOTPY. For instance, the shuffled complex evolutionary algorithm (SCEUA) splits the parameter sets of a generation into multiple complexes, in an attempt to avoid getting stuck at a local optimum and to find the global optimum instead. Those complexes are treated as different tasks, which can be processed by different computing cores in parallel.
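As a rough illustration of this idea (not SPOTPY's actual implementation), the complex-based parallelism can be sketched as follows; `evolve_complex` and its "fitness" measure are placeholders made up for this sketch:

```python
from multiprocessing import Pool

def split_into_complexes(generation, ngs):
    """Deal the generation's parameter sets round-robin into ngs complexes."""
    return [generation[i::ngs] for i in range(ngs)]

def evolve_complex(complex_sets):
    """Placeholder for one complex-evolution step: pick the 'best' set."""
    return min(complex_sets, key=sum)  # pretend a lower sum means better fitness

if __name__ == "__main__":
    # Six parameter sets of one generation, split into two complexes,
    # processed by two worker processes in parallel.
    generation = [[3, 1], [2, 2], [0, 1], [5, 0], [1, 1], [4, 4]]
    complexes = split_into_complexes(generation, ngs=2)
    with Pool(processes=2) as pool:
        best_per_complex = pool.map(evolve_complex, complexes)
    print(best_per_complex)  # each complex was handled on its own core
```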
Another possibility is to use each parameter set in different simulation setups. A starting point for this could be that one has performed different experiments on the same material. Then, aiming for a more robust parameter set, it would be helpful to run a simulation of each experiment. In that case, a parameter set from a complex of the SCEUA would be implemented into an individual simulation setup, replicating specific experimental conditions. Therefore, each task from a complex would start multiple sub-processes, one for each experimental condition to be incorporated.
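The sub-process idea can be sketched in plain Python; `run_simulation` is a hypothetical stand-in for a full solver run, and the cost function is made up purely for illustration:

```python
from multiprocessing import Pool

HEATING_RATES = [5, 10, 15]  # K/s, one per experimental condition

def run_simulation(args):
    """Hypothetical stand-in for one simulation under a given heating rate.

    A real implementation would write the simulation input file, execute
    the solver and read back the simulation response.
    """
    params, heating_rate = args
    cost = sum(params) * heating_rate  # placeholder objective contribution
    return cost

def evaluate(params):
    """One optimiser task: spawn a sub-process per experimental condition."""
    with Pool(processes=len(HEATING_RATES)) as pool:
        costs = pool.map(run_simulation,
                         [(params, hr) for hr in HEATING_RATES])
    # Combine the per-experiment costs into a single fitness value.
    return sum(costs)

if __name__ == "__main__":
    print(evaluate([0.1, 0.2]))
```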
This tutorial uses the files provided in the example propti\examples\tga_analysis_02 and is based on the tga_analysis.fds example from the Fire Dynamics Simulator (FDS). The experimental data used as the target here is fabricated: it is the result of a simulation of the original FDS example file with adjusted heating rates.
The idea here is that one has performed TGA experiments. The same material was subjected to three different heating rates (5, 10 and 15 K/s). The experimental results have been stored in three different files (tga_5K_exp.csv, tga_10K_exp.csv, tga_15K_exp.csv). Simulations to replicate the experiments are conducted with FDS. An FDS template file, tga_analysis_02.fds, is provided, as well as an input file for PROPTI (input.py).
The general concept is very similar to the tga_analysis_01 example in "Getting Started". For each simulation setup, different parameter sets need to be defined: those that describe the optimisation parameters and those that describe the experimental conditions. To make the input definition slightly easier, the capabilities of Python are used to create the definitions procedurally.
Again, some general information is defined first, like the character ID used to name the files, or the template file name.
```python
# Imports needed by this input file.
import numpy as np
import propti as pr

# Set the character ID.
CHID = 'TGA_analysis_02'

# Define the end of the simulation data.
TEND = 9360

# Define template file name.
template_file = "tga_analysis_02.fds"
```

Now, a couple of lists are defined that hold information on the different heating rates. One list contains the values of the heating rates, the other holds the file names of the respective experimental data files. With this setup, simple loops can be utilised to create the three different simulation setups. They only differ in the actual values of the parameters (5 K/s or 10 K/s) and not in the parameters themselves (heating rate).
First the heating rates:
```python
# Define heating rates.
HeatingRatesTGA = [5, 10, 15]
```

Then the list with the file names of the experimental data:
```python
# Define the file names,
# for the target information for the optimisation.
experimental_data_file_list = ['tga_5K_exp.csv',
                               'tga_10K_exp.csv',
                               'tga_15K_exp.csv']
```

Now the optimisation parameters need to be defined, as described in "Getting Started".
```python
# Define the optimisation parameters.
op1 = pr.Parameter(name='ref_temp_comp_01',
                   place_holder='rtc01',
                   min_value=200, max_value=400)
op2 = pr.Parameter(name='ref_rate_comp_01',
                   place_holder='rrc01',
                   min_value=0.001, max_value=0.01)
op3 = pr.Parameter(name='ref_temp_comp_02',
                   place_holder='rtc02',
                   min_value=300, max_value=600)
op4 = pr.Parameter(name='ref_rate_comp_02',
                   place_holder='rrc02',
                   min_value=0.001, max_value=0.01)

# Collect all the defined parameters from above, just for convenience.
set_of_parameters = [op1, op2, op3, op4]

# Definition of parameters, which is used by propti_prepare.py later on.
# It has no further meaning here.
ops = pr.ParameterSet(params=set_of_parameters)
```

A for-loop iterates over the list of desired heating rates and creates an individual parameter set for each element. In each parameter set, the optimisation parameters are merged with different model parameters to account for all experimental setups. Furthermore, an individual name, 'chid', is created for each parameter set, based on the experimental conditions.
This is also an example of how the HeatingRatesTGA list is used to control the construction of the information for the inverse modelling process (IMP).
```python
# Initialise the list for the parameter sets (objects) that describe the
# experiments.
model_parameter_setups = []

# A loop to dynamically create the different parameter sets based on the
# experimental conditions. It adds the model parameters, that describe the
# experimental conditions, to the optimisation parameters.
for i in HeatingRatesTGA:
    # Provide optimisation parameters for the simulation setups.
    ps = pr.ParameterSet(params=set_of_parameters)

    # Add different heating rates (5, 10, 15) - the model parameters.
    ps.append(pr.Parameter(name='heating_rate_{}K'.format(i),
                           place_holder='hr',
                           value=i))

    # Add individual character ID to distinguish the simulation data.
    ps.append(pr.Parameter(name='character id',
                           place_holder='CHID',
                           value='{}_{}K'.format(CHID, i)))

    model_parameter_setups.append(ps)
```

As before, in "Getting Started", we need to define the relationships between the experimental data and the simulation response. Another for-loop is set up to generate the appropriate links. Again, this loop is controlled by the HeatingRatesTGA list.
As a side note, basic computation on the data to be read in is possible, for both experiment and simulation. With factor, a multiplier can be provided by which all the data points (y values) are multiplied. This could be handy when the simulation provides only the mass loss rate, but the user is interested in the mass loss rate per unit area. Then one could write relation.model.factor = 1/area (where area is assumed to be a useful value). Similarly, offset allows a value to be added to, or subtracted from, all the data points (y values), which shifts the data. Here, the default values are provided, which have no effect. They are mentioned as a hint for users who need to make use of this functionality.
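To make the arithmetic concrete, here is a hypothetical illustration of that transformation (assuming it amounts to y * factor + offset); the helper function and the area value are made up for this sketch:

```python
import numpy as np

def apply_factor_offset(y, factor=1.0, offset=0.0):
    """Scale and shift a data series, analogous to factor/offset above."""
    return np.asarray(y, dtype=float) * factor + offset

# Convert a mass loss rate (kg/s) into a rate per unit area (kg/(s m^2))
# for a hypothetical sample surface area of 0.01 m^2:
area = 0.01
mlr = [0.002, 0.004]
mlrpua = apply_factor_offset(mlr, factor=1 / area)
```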
```python
# Create a list of relations between experimental and model (simulation) data,
# for each experimental data series. (Could also be nested, if there were
# multiple repetitions for each experiment.)
r = []
for i in range(len(HeatingRatesTGA)):
    # Initialise a relation.
    relation = pr.Relation()

    # Information on simulation data.
    relation.model.file_name = "{}_{}K_tga.csv".format(CHID,
                                                       str(HeatingRatesTGA[i]))
    relation.model.label_x = 'Time'
    relation.model.label_y = 'MLR'
    relation.model.header_line = 1
    # The following parameters are the default values, which have no effect.
    relation.model.factor = 1  # Allows basic computation, could be used to
    # calculate heat release rate per unit area.
    relation.model.offset = 0  # Add or subtract a number to shift the data.

    # Information on experimental data.
    relation.experiment.file_name = experimental_data_file_list[i]
    relation.experiment.label_x = 'Time'
    relation.experiment.label_y = 'MassLossRate'
    relation.experiment.header_line = 0
    # The following parameters are the default values, which have no effect.
    relation.experiment.factor = 1  # Allows basic computation, could be used
    # to calculate heat release rate per unit area.
    relation.experiment.offset = 0  # Add or subtract a number to shift the data.

    # Define definition set for data comparison. Basically providing the
    # amount and position of data points on the x-axis, by determining the
    # range (from 0. to TEND) and providing a delta between the points (12).
    relation.x_def = np.arange(0., TEND, 12)

    # Collect the different relations.
    r.append(relation)
```

A set of simulation setups is created, to be used as a container for the different simulation setups.
```python
# Initialise an empty simulation setup set.
setups = pr.SimulationSetupSet()
```

All the information from above is now collected in a SimulationSetupSet, and further information is added as well. For each experimental setup, a simulation setup is created to replicate it. Each of them gets an individual name for its working directory. The FDS template is specified, together with the different model parameters. Information on which simulation software is to be used needs to be provided (note that this command will be forwarded to the terminal, or command line; thus, the operating system needs to understand it!). The relations are needed here as well.
Remember: the model parameters also contain information about the different experimental conditions. Therefore, it is not necessary to use different simulation input file templates. However, it is possible, if desired, for instance when experiments with different apparatuses have been conducted.
Again, this process is controlled by the HeatingRatesTGA list.
```python
# Create simulation setups by joining all the necessary information:
# parameters, working directory, template file, relations and simulation
# software executable.
for i in range(len(HeatingRatesTGA)):
    s = pr.SimulationSetup(name='tga_analysis_02',
                           work_dir="{}_{}K".format(CHID,
                                                    str(HeatingRatesTGA[i])),
                           model_template=template_file,
                           model_parameter=model_parameter_setups[i],
                           model_executable='fds653',
                           relations=r[i])
    setups.append(s)

print('** setups generated')
```

Finally, some information needs to be provided to the optimiser. An optimisation algorithm needs to be specified, here the shuffled complex evolutionary algorithm (SCEUA). The repetitions define how many parameter sets may be generated before the process is stopped, if the algorithm could not converge. Furthermore, PROPTI can spawn sub-processes (num_subprocesses), which are used to run the three different simulations per parameter set in parallel. The mpi parameter can be used for further parallelisation, if the optimisation algorithm allows for it.
```python
# Provide values for optimiser.
optimiser = pr.OptimiserProperties(algorithm='sceua',
                                   repetitions=150,
                                   ngs=len(set_of_parameters),
                                   # Sub-processes are used here for multiple
                                   # experimental conditions.
                                   num_subprocesses=len(HeatingRatesTGA),
                                   mpi=True)

print('** input file processed')
```

When using the restart functionality, where the batch script writes markers into the database file (csv), a cleaning function is available to clean the database file. It is focused on the SCEUA. It will create two files: one with only the markers removed, and a second with the markers removed as well as the partly completed generations that might be left over after a restart following a system crash.
It is rather simple, in that it just counts the lines between the restart markers, keeps only the first n generations times m individuals, and stops when no full generation is left.
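A minimal sketch of that logic, with a made-up marker string and generation size (this is not the actual PROPTI cleaning script):

```python
def clean_database(lines, marker="#restart", generation_size=6):
    """Remove restart markers and drop a trailing, incomplete generation.

    `lines` is the csv database content as a list of rows, header first.
    Assumes marker rows start with `marker` and that a complete generation
    consists of `generation_size` data rows (n complexes times m individuals).
    """
    header, *rows = lines
    data = [row for row in rows if not row.startswith(marker)]
    # Keep only whole generations; a partial one may be left after a crash.
    complete = len(data) - len(data) % generation_size
    return [header] + data[:complete]

rows = ["header"] + ["row{}".format(i) for i in range(8)]
rows.insert(4, "#restart")  # marker written by the batch script on restart
cleaned = clean_database(rows)  # header + the first 6 data rows
```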
Now, the input file is prepared and propti_prepare.py can be executed.