This is the code to make turn-ons and efficiencies.
Config file here. Copy and paste the whole sheet into a text file, and feed that text file into the fill histogram executable to run things.
The code is done in two parts. The first part is an executable that generates all the histograms. The second part takes the histograms and makes efficiencies, turn-ons, and scalings.
The first step can take a while (a few hours) because of the file I/O needed to write out all the histograms. If you are pressed for time, consider making a smaller config file with only the few relevant lines. Then after the rush, or on the side, we can launch the full thing (it's always a good idea to have all objects ready for all the files).
Note:
- Instructions on the config file sheet are coming
- Instructions for the new tree version are coming
- Set up a recent CMSSW environment. For example, if you want to use CMSSW_10_4_0:
cmsrel CMSSW_10_4_0
cd CMSSW_10_4_0/src
cmsenv
The code is not tied to any specific CMSSW version.
- Clone this repository (no restrictions on where)
git clone https://github.com/FHead/L1UpgradeStudies.git AwesomeCode
- Make sure you have root and fastjet in PATH
- Do make to compile everything
cd AwesomeCode
make
- Do a test run by typing
make TestRun
Take a look at the TestRun target in the makefile for an example of how to run everything.
If fastjet is not already available, a quick installation can be done like this, for example:
mkdir -p fastjet/tarfiles
cd fastjet/tarfiles
curl -O http://fastjet.fr/repo/fastjet-3.3.2.tar.gz
tar xvfz fastjet-3.3.2.tar.gz
mv fastjet-3.3.2 ../3.3.2
cd ../3.3.2/
./configure --prefix=$PWD/../
make -j 10
make install
export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:<your path>/fastjet/lib
export PATH=${PATH}:<your path>/fastjet/bin/
You can test whether the setup is correct by running fastjet-config and checking that it prints something.
The fill histogram executable makes all the histograms needed for later steps. Input parameters are as follows:
- input: comma-separated list of all root files
- output: output file name (.root)
- StoredGen: true/false. Whether to use the gen jet info stored in the tree, or recluster on the fly
- config: the config file to use
This executable makes a plot with efficiencies or turn-ons, simple distributions, or even cumulative distributions, so it is very versatile. Input parameters are as follows:
- label: comma-separated list of histogram labels (to be used in legends)
- file: comma-separated list of files that contain the histograms
- numerators: comma-separated list of histograms to be used as numerators
- denominators: comma-separated list of histograms to be used as denominators. Several possibilities here:
  - histogram name. Takes the histogram from the file
  - "auto". Guesses the denominator name by adding "NoMatch"
  - "simple". Don't divide the numerator histogram by anything. Just plot the distribution
  - "cumulative". Don't divide the numerator histogram, but instead draw the cumulative version of it (useful for isolation derivation)
- output: output filename.
- title: string to be passed into the histogram constructor. For example: "title;x;y"
- xmin, xmax, ymin, ymax: range of axes. y range defaults to (0.0, 1.1) if omitted
- color: comma-separated list of integers to be used as colors (see root TColor for the list) for each curve
- line: comma-separated list of doubles. Each one will draw a horizontal dashed line on the plot
- grid: true/false. Whether to enable grid. Defaults to false.
- logy: true/false. Whether to do log y. Defaults to false.
- legendx, legendy: location of the upper-left corner of the legend. Defaults to (0.35, 0.20)
- rebin: integer, defaults to 1. If not 1, the histograms will be rebinned using this number.
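To make the "cumulative" option a bit more concrete, here is a minimal C++ sketch of turning bin contents into a normalized running sum. It is not taken from the plotting executable: the function name CumulativeFraction, the plain vector input, and the choice to accumulate from the lowest bin upward are all assumptions for illustration. The real code may accumulate in the other direction or treat underflow and overflow differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the repository): converts bin contents
// into a normalized running sum, so entry i is the fraction of all entries
// that fall in bins 0..i. Accumulating from the low edge is an assumption.
std::vector<double> CumulativeFraction(const std::vector<double> &Bins)
{
   double Total = 0;
   for(double B : Bins)
      Total = Total + B;

   std::vector<double> Result(Bins.size(), 0.0);
   if(Total <= 0)
      return Result;   // empty histogram: return all zeros

   double Running = 0;
   for(std::size_t i = 0; i < Bins.size(); i++)
   {
      Running = Running + Bins[i];
      Result[i] = Running / Total;
   }
   return Result;
}
```

Under the low-edge assumption, the first bin of an isolation histogram where this value passes, say, 0.9 would mark the cut that keeps 90% of the entries.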
This executable fits the turn-ons to get the scalings, and writes the results into a data helper (DH) file, in addition to producing a pdf for inspection. Input parameters are:
- input: the root file from the first step containing all the histograms
- output: the output pdf file name. Has to be pdf!
- curves: the output DH file name.
- reference: where to take as the reference point. We typically use 95%
- prefix: additional prefix to distinguish stuff in the DH file
- Do*: a lot of booleans, all default to false. The * can be {STAMuon, STADisplacedMuon, TkMuon, TkMuonStub, TkMuonStubS12, EG, EGExtended, EGTrack, Electron, ZElectron, IsoElectorn, Photon, PhotonPV, ElectronPV, PuppiJet, PuppiJetForMET, PuppiJetMin25, PuppiHT, PuppiMET, PFTau, PFIsoTau, CaloJet, CaloHT, TrackerJet, TrackerHT, TrackerMHT, TrackerMET, TkTau, CaloTkTau, TkEGTau, NNTauLoose, NNTauTight, CaloTau}. Though it's best to look in the source code to see what is there
The main workhorse here is the ProcessFile(...) function, which fits and produces one scaling line. If we need to fit new things, we have to add calls to this function in the code, behind one of the Do* switches if possible, so that things don't litter around too much. The function is declared as
void ProcessFile(PdfFileHelper &PdfFile, string FileName, string OutputFileName,
string Prefix, vector<double> Thresholds,
double Target, string Tag, string Name = "PT", int Type = TYPE_SMOOTH_SUPERTIGHT,
int Scaling = LINEAR)
Here is the meaning of each argument:
- PdfFileHelper &PdfFile: this is one of the Yi helper classes that makes multiple-page pdfs a breeze. It makes the final pdf output file
- string FileName: the file that contains all the histograms
- string OutputFileName: the DH file name.
- string Prefix: the directory to use in the histogram file
- vector<double> Thresholds: what thresholds to use in the scan
- double Target: the famous 98%, or some other number you like. We pass it from the command line
- string Tag: the tag to use to store the result in the DH file
- string Name: the middle part of the histogram name to use (for example the PT in TkElectron_PT_000000)
- int Type: what kind of fit to perform. Several possibilities are coded:
  - TYPE_FITFIX: fits with the classic function f(x) we've been using for ages, with three parameters: lambda, mu, sigma
  - TYPE_FITFIX2: let the baseline float by modifying the function as f(x) * ([3]-[4]) + [4], but fix [3] to 1.0
  - TYPE_FIT: same modification as before, but fix [4] to 0 and let [3] float
  - TYPE_FITFLOAT: let the baseline and the plateau float by modifying the function as f(x) * ([3]-[4]) + [4]
  - TYPE_FITTANH: fits the turn-on with a tanh() function
  - TYPE_SMOOTH_LOOSE: a string model that attempts to go through all the points with a loose tension.
  - TYPE_SMOOTH_TIGHT: same as above, with a bit higher tension
  - TYPE_SMOOTH_SUPERTIGHT: similarly, with even higher tension
  - TYPE_SMOOTH_ULTRATIGHT: very tight strings!
- int Scaling: what kind of scaling to fit in the end. 99.9% of the time we use LINEAR. There is also QUADRATIC, which fits a quadratic curve of x = a2 y^2 + a1 y + a0 (note the swap between x and y)
Note 1: The classic function is this one
f(x) = (ROOT::Math::normal_cdf([0]*(x-[1]), [0]*[2], 0) - exp(-[0]*(x-[1])+[0]*[0]*[2]*[2]/2)*ROOT::Math::normal_cdf([0]*(x-[1]), [0]*[2], [0]*[0]*[2]*[2]))
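For anyone who wants to evaluate the classic function outside of ROOT, here is a minimal standalone C++ sketch of the expression above. It assumes that [0], [1], [2] correspond to lambda, mu, sigma in that order, and it replaces ROOT::Math::normal_cdf(x, sigma, x0) with an equivalent written in terms of std::erfc. The function names NormalCDF and ClassicTurnOn are made up for this illustration and are not in the repository.

```cpp
#include <cmath>

// Gaussian CDF with mean X0 and width Sigma, written with std::erfc.
// The argument order (x, sigma, x0) mirrors the one used in the expression.
double NormalCDF(double X, double Sigma, double X0)
{
   return 0.5 * std::erfc(-(X - X0) / (Sigma * std::sqrt(2.0)));
}

// Classic turn-on function, assuming Lambda = [0], Mu = [1], Sigma = [2].
double ClassicTurnOn(double X, double Lambda, double Mu, double Sigma)
{
   double U = Lambda * (X - Mu);   // corresponds to [0]*(x-[1])
   double V = Lambda * Sigma;      // corresponds to [0]*[2]
   return NormalCDF(U, V, 0) - std::exp(-U + V * V / 2) * NormalCDF(U, V, V * V);
}
```

With lambda = 0.5, mu = 20, sigma = 2 this rises from about 0 well below 20 to about 1 well above it. Very far below the turn-on the exponential can overflow while the CDF underflows, so a more careful version would need to guard that regime.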
Note 2: Since we only care about the point where the turn-on passes x% (usually 95%), the string model is a fine thing to use. Sometimes the fit just won't converge for some strange reason, and the string model is much more stable and flexible. For example, if the detail of the turn-on curve around the 10-30% range is described well by neither the classic curve nor the tanh(), then rather than hunting for the best curve to fit, we can use the string to go through the points and extract the 95% point with good confidence.
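As a rough illustration of why going through the points is enough to read off the 95% point, here is a C++ sketch that finds the crossing by straight-line interpolation between neighbouring points. This is a deliberately simpler stand-in, not the tension-based string model used in the code, and the function name FindCrossing and its vector-based interface are hypothetical.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the repository). Given turn-on points
// (X[i], Y[i]) sorted by increasing X, return the X where the curve first
// reaches Target, using straight-line interpolation between neighbours.
// Returns -1 if the inputs are unusable or the target is never reached.
double FindCrossing(const std::vector<double> &X, const std::vector<double> &Y, double Target)
{
   if(X.size() != Y.size() || X.empty())
      return -1;
   if(Y[0] >= Target)
      return X[0];   // already above target at the first point

   for(std::size_t i = 1; i < X.size(); i++)
   {
      if(Y[i] >= Target)   // here Y[i-1] < Target <= Y[i]
      {
         double Fraction = (Target - Y[i-1]) / (Y[i] - Y[i-1]);
         return X[i-1] + Fraction * (X[i] - X[i-1]);
      }
   }
   return -1;
}
```

For points (10, 0.5), (20, 0.9), (30, 1.0) and a target of 0.95, the crossing sits halfway between 20 and 30. Only the points next to the crossing matter, which is why a poor description of the 10-30% region does not spoil the result.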
This executable makes the text file to interface with the rate part of menu code.
There are two input arguments:
- input: the DH file name that contains all the turn-on fit results
- output: text file name to store the output
For this we usually need to go into the code and change what is exported. Passing everything on the command line seems silly, and exporting all the unnecessary things from the DH file is not helpful either. If you open the source code, you can see three blocks:
vector<pair<string, string>> GName =
{
pair<string, string>("StandaloneMuonIsoTanh", "StandaloneMuon"),
...
};
vector<pair<string, string>> TwoPartName =
{
pair<string, string>("EG", "StandalonePhoton"),
...
};
vector<pair<string, string>> QuadraticName =
{
pair<string, string>("TrackerMHT5METFit", "TrackerMHTQuadratic")
};
Each block contains a list of pairs. The first item in the pair is the identifier in the DH file. The second item is the name you want it to appear under in the final text file.
The first block is for simple linear scaling. The second one will look for barrel and endcap ("EGBarrel", "EGEndcap") and add an if statement in the final text file. The last one is for quadratic cases, which are not really used so far.
Each object is in its own directory. Within each directory, there are many, many histograms. Let's take TkElectron as an example. The naming convention is
- TkElectronNoMatch_*_000000: The distribution without gen-match
- TkElectron_*_000000: the distribution with gen-match, but no L1 PT requirement
- TkElectron_*_00XX00: the distribution with gen-match, and with L1 PT > XX. 30 GeV = 003000, 10.5 GeV = 001050, etc. The list is set by the "preset" column in the config file.
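The threshold suffix can be sketched in C++ as follows. Judging from the two examples above (30 GeV = 003000, 10.5 GeV = 001050), the suffix looks like the threshold in GeV times 100, zero-padded to six digits; that reading is an inference, and the helper name HistogramName is hypothetical rather than something in the repository.

```cpp
#include <cmath>
#include <cstdio>
#include <string>

// Hypothetical helper (not part of the repository): builds a name such as
// TkElectron_PT_003000 from the object, the middle field, and the L1 PT
// threshold in GeV. The suffix is assumed to be GeV x 100, padded to 6 digits.
std::string HistogramName(const std::string &Object, const std::string &Field, double ThresholdGeV)
{
   char Suffix[32];
   long Code = std::lround(ThresholdGeV * 100);   // 10.5 GeV -> 1050
   std::snprintf(Suffix, sizeof(Suffix), "%06ld", Code);
   return Object + "_" + Field + "_" + Suffix;
}
```

For example, HistogramName("TkElectron", "PT", 10.5) gives TkElectron_PT_001050, and a threshold of 0 gives the 000000 histogram with no L1 PT requirement.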
There are a number of distributions in the middle field
- PT: PT distribution without any eta restriction
- PTEta15: PT in barrel
- PTEtaLarge: PT outside barrel
- Response: L1PT/GenPT
- ResponseEta15: response in barrel
- ResponseEtaLarge: response outside barrel
- ResponsePTx: response with PT > x, x = 10, 50, 100, 150, 200
- ResponsePT10Eta15: response with PT > 10, barrel
- ResponsePT10EtaLarge: response with PT > 10, outside barrel
- Eta: eta distribution
- EtaPT3tox: eta with PT = 3-x, x = 5, 6, 10, 15
- EtaPTx: eta with PT > x, x = 15, 20, 25, 30, 100, 200
- EtaDXYx: eta with DXY > x, x = 20, 50, 80
- TkIso: isolation
- TkIsoPTx: isolation with PT > x, x = 10, 20, 30, 40
- TkIsoEta15: isolation within barrel
- TkIsoEtaLarge: isolation outside barrel
- TkIsoPT10Eta15: isolation within barrel, and PT > 10
- TkIsoPT10EtaLarge: isolation outside barrel, and PT > 10
- DR: matching DR distribution
- DRPTx: matching DR, PT > x, x = 10, 20, 50
- DREta15: matching DR inside barrel
- DRPT10Eta15: matching DR inside barrel, PT > 10
- DRPT20Eta15: matching DR inside barrel, PT > 20
- DREtaLarge: matching DR outside barrel
- DRPT10EtaLarge: matching DR outside barrel, PT > 10
- DRPT20EtaLarge: matching DR outside barrel, PT > 20
- DXY: DXY distribution
- DXYPTx: DXY, PT > x, x = 15, 20, 30
So... If you want...
- Matching efficiency vs PT: TkElectron_PT_000000 / TkElectronNoMatch_PT_000000
- Matching efficiency vs eta: TkElectron_Eta_000000 / TkElectronNoMatch_Eta_000000
- Turn-on with threshold 15: TkElectron_PT_001500 / TkElectron_PT_000000
- Isolation distribution: TkElectron_TkIso_000000
For things involving isolation, sometimes you need to get it from different folders. For example
- Matching efficiency vs PT for TkIsoElectron: TkIsoElectron_PT_000000 / TkElectronIsoNoMatch_PT_000000
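Each "A / B" recipe above is a bin-by-bin ratio of two histograms. As a minimal C++ sketch of the central values only, with plain vectors standing in for histogram bin contents: the name BinByBinRatio and the convention of returning 0 for empty denominator bins are assumptions made for this illustration, and the plotting executable may handle the division and the uncertainties differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of the repository): bin-by-bin ratio of a
// numerator histogram to a denominator histogram, both given as bin contents.
// Bins with an empty denominator are set to 0 instead of dividing by zero.
std::vector<double> BinByBinRatio(const std::vector<double> &Numerator,
   const std::vector<double> &Denominator)
{
   std::size_t N = Numerator.size();
   if(Denominator.size() < N)
      N = Denominator.size();   // only use bins present in both inputs

   std::vector<double> Ratio(N, 0.0);
   for(std::size_t i = 0; i < N; i++)
   {
      if(Denominator[i] > 0)
         Ratio[i] = Numerator[i] / Denominator[i];
   }
   return Ratio;
}
```

For the matching efficiency vs PT, the numerator would hold the bins of TkElectron_PT_000000 and the denominator those of TkElectronNoMatch_PT_000000.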