In CMS, the SM processes are generated and simulated centrally, so we usually don't have to worry about them. But for a specific BSM search, we have to take care of producing the BSM samples ourselves, unless they are already being handled by someone else. For this section, I am using vector-like leptons as a reference BSM signal (arXiv:1510.03556). First, we need to integrate the new physics model into the Standard Model implementation of an event generator like MadGraph. Typically, theorists share their BSM model in the Universal FeynRules Output (UFO) format, a set of Python modules encoding the new particles, parameters and vertices, which automated matrix-element generators can read. These modules can be easily included as an extension to MadGraph, which then lets us play with the new particles and the Feynman rules associated with them. I have taken the latest VLL UFO files from the repository: github.com/prudhvibhattiprolu.
This is the order of installation for this setup.
Here are some important packages required before installation. I am listing the versions that I have in my setup, but any recent versions should also work. I recommend using conda to install everything (a minimal sketch is given after the table). It's important that ROOT is built with the same Python version that is used here. For detailed instructions on how to handle Python environments and install ROOT properly, visit here.
Packages | Versions |
---|---|
python | 3.10.9 |
cmake | 3.22.1 |
git | 2.34.1 |
g++, gcc | 11.4.0 |
ROOT | 6.26/10 |
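If you are setting this up from scratch, a minimal sketch of such a conda environment (the environment name mcgen is just an example) could look like this:

conda create -n mcgen python=3.10 cmake git   # isolated environment with the basic build tools
conda activate mcgen
conda install -c conda-forge root             # ROOT from conda-forge, built against this environment's python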
I would also recommend keeping all of these MC-generation tools inside the same directory. For me, the working directory is /home/phazarik/mcgeneration.
MadGraph can be downloaded from the official website. Building is not needed; the executable is available at mg5amcnlo/bin/mg5_aMC. I have given the full path to MadGraph in my .bashrc file as follows.
alias mg5="/home/phazarik/mcgeneration/mg5amcnlo/bin/mg5_aMC"
If you are not importing any BSM model, the MadGraph setup is done at this point. In the case of a BSM model (like the VLLs in my case), the model files are unpacked into the MadGraph models directory as follows.
/home/phazarik/mcgeneration/mg5amcnlo/models/VLLS_NLO
/home/phazarik/mcgeneration/mg5amcnlo/models/VLLD_NLO
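For example, if the model comes as a tarball (the file name below is hypothetical; adjust it to however the UFO is actually distributed), the unpacking could look like this:

cd /home/phazarik/mcgeneration/mg5amcnlo/models
tar xvfz /path/to/VLLD_NLO.tar.gz   # hypothetical tarball; copying or cloning the UFO directory works just as well
ls VLLD_NLO                         # a UFO model contains particles.py, parameters.py, vertices.py, couplings.py, etc.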
The model files in my example were written for MadGraph v2, which was built on python2, but they can also be used with the latest MadGraph v3, which uses python3. For this, MadGraph is asked to automatically convert the model to python3 while importing it, as follows.
shell>> mg5                        # Takes me into the MadGraph interface.
MG5_aMC> set auto_convert_model T  # For compatibility with python3.
MG5_aMC> import model VLLD_NLO
If output like the following pops up without errors, then the setup is ready.
INFO: Change particles name to pass to MG5 convention
the definition of 'j' and 'p' to 5 flavour scheme.
definitions of multiparticles l+ / l- / vl / vl~ unchanged
multiparticle all = g ghg ghg~ u c d s b u~ c~ d~ s~ b~ a gha gha~ ve vm vt e- mu- ta- ve~ vm~ vt~ e+ mu+ ta+ t t~ z w+ ghz ghwp ghwm h w- ghz~ ghwp~ ghwm~ taup nup taup~ nup~
INFO: Change particles name to pass to MG5 convention
definitions of multiparticles p / j / l+ / l- / vl / vl~ unchanged
multiparticle all = g ghg ghg~ u c d s b u~ c~ d~ s~ b~ a gha gha~ ve vm vt e- mu- ta- ve~ vm~ vt~ e+ mu+ ta+ t t~ z w+ ghz ghwp ghwm h w- ghz~ ghwp~ ghwm~ taup nup taup~ nup~
HepMC is widely used for exchanging event information between event generators and detector simulation tools. For this exercise, HepMC3 can be downloaded from GitLab. Some usage instructions are available here.
HepMC3 can be cloned from GitLab and built as follows.
git clone https://gitlab.cern.ch/hepmc/HepMC3.git   # This will create a source directory called 'HepMC3'.
mkdir hepmc3_build hepmc3_install                   # This will create two directories where hepmc3 is built and installed.
cd hepmc3_build                                     # This is where building hepmc3 happens.
cmake -DCMAKE_INSTALL_PREFIX=../hepmc3_install -Dmomentum:STRING=GEV -Dlength:STRING=MM ../HepMC3
make          # This requires the C++ compilers (as checked by the previous command), and takes some time.
make install  # This will transfer files to the install directory.
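As a quick sanity check, the install directory should now contain the headers and shared libraries (a sketch; depending on the system, the libraries may land in lib or lib64):

ls ../hepmc3_install/include/HepMC3   # HepMC3 headers
ls ../hepmc3_install/lib*             # libHepMC3.so and friends (lib or lib64 depending on the system)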
PYTHIA is a program for simulating particle interactions as well as hadronization. It can be downloaded from the official website, where installation instructions are also provided. The following steps install PYTHIA with HepMC3 support; make sure to give the full path to HepMC3 during configuration. I am using version 8.312, but these instructions should work for other recent versions as well.
wget https://www.pythia.org/download/pythia83/pythia8312.tgz   # Downloading pythia.
tar xvfz pythia8312.tgz                                        # Unzipping the tarball.
cd pythia8312
In the configuration, I am including the HepMC3 library. For this, I need to put the full path to HepMC3 along with the ./configure command as follows.
./configure --with-hepmc3=/home/phazarik/mcgeneration/hepmc3_install

# Alternative build commands:
#./configure --with-hepmc3=/home/phazarik/mcgeneration/hepmc3_install \
#            --with-python-include=/home/phazarik/miniconda3_backup_2024_10_09/envs/analysis/include/python3.10 \
#            --with-python-bin=/home/phazarik/miniconda3_backup_2024_10_09/envs/analysis/bin
#./configure
#./configure --with-python

#If everything goes right, the following report should pop up.
#---------------------------------------------------------------------
#|                 PYTHIA Configuration Summary                      |
#---------------------------------------------------------------------
#  Architecture                = LINUX
#  C++ compiler     CXX        = g++
#  C++ dynamic tags CXX_DTAGS  = -Wl,--disable-new-dtags
#  C++ flags        CXX_COMMON = -O2 -std=c++11 -pedantic -W -Wall -Wshadow -fPIC -pthread
#  C++ shared flag  CXX_SHARED = -shared
#  Further options             =
#The following optional external packages will be used:
#+ HEPMC3 (-I/home/phazarik/mcgeneration/hepmc3_install/include)

make clean   # Removes temporary files from previous attempts, if any.
make         # This takes a couple of minutes.
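To confirm that HepMC3 was actually picked up, note that ./configure writes its choices into Makefile.inc, so a quick check is:

grep -i hepmc3 Makefile.inc   # the HepMC3 paths given above should appear here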
Hadronization happens in the pythiaXXXX/examples directory. That's why I have built PYTHIA in my work-area for convenience. It can also be kept along with the other tools, with the output files exported to the work-area for the next steps. In any case, once PYTHIA is built, I added the following paths to my .bashrc file, which are needed for C++ compilation of the hadronizer codes.
export PYTHIA8=/mnt/d/work/temp/mcgeneration/pythia8312
export PYTHIA8DATA=$PYTHIA8/share/Pythia8/xmldoc
export PATH=$PYTHIA8/bin:$PATH
export LD_LIBRARY_PATH=$PYTHIA8/lib:$LD_LIBRARY_PATH
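After editing .bashrc, reload it and make sure the variables are visible:

source ~/.bashrc
echo $PYTHIA8   # should print the pythia installation path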
In some of the examples, I have also seen PYTHIA used through a Python-based interface. For this, one can easily install PYTHIA and HepMC3 using conda-forge. But this is not needed for the toy example that I have shared in the next section, and for a new user I would not recommend it, because managing multiple versions of the same tools gets messy.
conda install -c conda-forge pythia8   # not important
conda install -c conda-forge hepmc3    # not important
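If you do go the conda route, a quick way to check that the Python bindings are importable (just a sketch, not needed for the rest of this page) is:

python -c "import pythia8; print(pythia8.__file__)"   # prints the location of the conda-installed module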
I also found a nice YouTube video on this Python-based PYTHIA interface, which is part of an HSF workshop. This video broadly covers the basics of a lot of things that can be done with PYTHIA, starting from event generation.
Delphes is a fast-simulation framework for high-energy physics detectors. Delphes outputs are roughly equivalent to NanoAOD in content, but the information is structured differently in the ROOT files. It can be downloaded from the GitHub repository, where installation instructions are available as well. Building Delphes is pretty straightforward.
git clone https://github.com/delphes/delphes.git
cd delphes
make   # This takes a couple of minutes
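As a quick check that the build succeeded, the main executable can be run without arguments; it should simply print a usage message:

./DelphesHepMC3   # with no arguments, it just prints the expected usage instead of running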
In some examples, PYTHIA, HepMC3 etc. can be used from the MadGraph interface itself (I'm still exploring this, not an expert yet). For this, the paths to PYTHIA and Delphes should be included in the MadGraph configuration. This has to be done only once. I also added my fastjet setup, in case I need it later.
set pythia8_path /home/phazarik/mcgeneration/pythia8312
set delphes_path /home/phazarik/mcgeneration/delphes
set fastjet /home/phazarik/fastjet-3.4.2/bin/fastjet-config   #optional
These paths are needed only when the hadronization etc. is done inside MadGraph itself, by importing these modules. In the toy example below, I am doing each step of the sample generation individually. These paths can also be added manually by editing the input/mg5_configuration.txt file in the MadGraph directory. Also, MadGraph may ask for something called lhapdf, which is a standard tool for evaluating parton distribution functions (PDFs). This can be installed using conda as follows.
conda install -c conda-forge lhapdf
After these installations, the following is the list of MC-generation tools in my setup.
Packages | Versions | Sources |
---|---|---|
mg5amcnlo | 3.5.4 | GitHub |
hepmc3 | 3.3.0 | GitLab |
pythia | 8.312 | pythia.org |
delphes | 3.5.0 | GitHub |
lhapdf | 6.5.4 | conda-forge |
Note that MadGraph can also install some of these tools by itself from its prompt, using install pythia8 (for PYTHIA and HepMC together) and install Delphes (for Delphes alone). These are kept in mg5amcnlo/HEPTools. But I don't trust it yet, and there is no control over the versions, so I prefer to do the installations manually.
The flowchart below illustrates the event simulation workflow I am going to use. The process begins with generation using MadGraph, where parton-level events are produced in the LHE format. The output LHE file is then fed into PYTHIA for hadronization, where partons are converted into physical hadrons, resulting in a DAT file. Finally, detector simulation is done using Delphes with a CMS card, which simulates how CMS would record these events, producing a ROOT file that can be used for further analysis.
In the generation phase, tools like MadGraph are used to simulate hard scattering events based on quantum field theory. MadGraph takes in a process definition and a set of parameters from the run_card and param_card. The input file typically consists of these cards and configuration files, which define the physics process and the collider setup. The output of this phase is in the LHE format (Les Houches Event), a standardized text file that contains detailed information about the parton-level event, such as particle IDs, momenta, and event weights. The LHE file serves as the bridge to further event processing.
Let's try to generate a simple Drell-Yan process: pp → Z → ll. For this, be in your work area, go to the MadGraph prompt, and define the process.
mg5   # my shortcut for opening the MadGraph prompt.

#Now inside the MadGraph prompt.
display particles        # displays all the available individual particles.
display multiparticles   # displays all the labels for groups of particles.
generate p p > z > l+ l-
output ppToZToLL
exit
The output line creates a new directory containing the pp → Z → ll process, including the run-card, param-card, etc. In principle, all of this can be done in a single step in the MadGraph prompt, but it's convenient to customize the parameters later. The cards are kept at ppToZToLL/Cards. Let's edit the run_card.dat file and change/add some important parameters as follows, keeping everything else as it is.
#Edited line:
  10 = nevents ! Number of unweighted events requested, changed to 10
#Newly added lines:
  # -- customize according to your need --
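The same edit can also be done non-interactively; a sketch with sed, run from the work area and assuming the default run_card formatting ("value = nevents ! comment"), would be:

sed -i 's/^.*= nevents.*$/  10 = nevents ! Number of unweighted events requested/' ppToZToLL/Cards/run_card.dat
grep "nevents" ppToZToLL/Cards/run_card.dat   # confirm the change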
The ppToZToLL/bin directory contains the binaries used to run the event generator, and ppToZToLL/Events contains the outputs for each run. Let's run a test and see what it does.
# Be inside the ppToZToLL directory.
./bin/generate_events testrun -f > /dev/null 2>&1
# testrun = name of the directory to be generated inside ppToZToLL/Events
# -f = suppresses the MadGraph CLI prompts and takes the parameters from the run_card.
# > /dev/null 2>&1 = suppresses any GUI related issues (in WSL)
This creates a ppToZToLL/Events/testrun directory containing a .gz file, which can be unzipped and used for the later steps.
cd Events/testrun
gunzip unweighted_events.lhe.gz
This unzipped LHE file, unweighted_events.lhe, contains all the generated parton-level information, including particle IDs, momenta, event weights, and other metadata.
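A quick sanity check is to count the event blocks, which should match the nevents set in the run_card:

grep -c "<event>" unweighted_events.lhe   # each generated event is wrapped in an <event> ... </event> block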
The hadronization step happens in the pythiaXXXX/examples directory. For this, we need a code that reads the LHE file and hadronizes the events, and we need to edit pythiaXXXX/examples/Makefile to include this code so that it is compiled along with the rest of the stuff.
The hadronizer code is hadlhe.cc, and can be found here. The modified Makefile can be found here, and should replace the old one.
These files can also be directly downloaded into the correct directory as follows.
cd [path to pythia]/examples
wget -O hadlhe.cc https://raw.githubusercontent.com/phazarik/phazarik.github.io/main/mypages/files/codes/hadlhe.cc
wget -O Makefile https://raw.githubusercontent.com/phazarik/phazarik.github.io/main/mypages/files/codes/Makefile_for_pythia.txt
The hadlhe.cc file should be edited to give the correct path to the input LHE file that was generated using MadGraph, the number of events in the input file, and a desired output path for the dat file that contains the hadronized output. This dat file is later imported into Delphes for detector simulation. The hadronizer is compiled and executed as follows.
#inside the examples directory
make clean    # to get rid of any previously compiled files
make hadlhe   # this looks for hadlhe.cc, compiles it, and creates an executable
./hadlhe      # execution
Delphes simulation can be run using the executable DelphesHepMC3, followed by the arguments: card name, output file, and input file. The first two arguments are the same in every case, so I added them in my .bashrc as follows.
alias delphes="/home/phazarik/mcgeneration/delphes/DelphesHepMC3 /home/phazarik/mcgeneration/delphes/cards/delphes_card_CMS.tcl"
For this toy example, I ran the following from my work-directory to produce a delphes tree.
delphes testroot.root pythia8312/examples/ppToZToLL_10.dat
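To take a quick look at the output (a sketch; the tree stored by Delphes inside the file is called Delphes):

root -l -q -e 'TFile f("testroot.root"); ((TTree*)f.Get("Delphes"))->Print()'   # lists the available branches (Electron, Muon, Jet, MissingET, etc.)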
So far, the whole MC production chain has been done step by step: first, event generation at the LHE level; then producing the hadronized gen-level information using PYTHIA; and then detector simulation using Delphes. This whole exercise is done for understanding what happens at the back-end. The hadronization step produces large dat files, which need to be handled by Delphes before being useful. It turns out that this extra step can be avoided by merging the work of Delphes and PYTHIA, as illustrated in the flowchart below.
For this, the following things need to be done. First, make sure that the PYTHIA paths are exported in the .bashrc file.
export PYTHIA8=[path to pythia8312]   #Give the full path here
export PYTHIA8DATA=$PYTHIA8/share/Pythia8/xmldoc
export PATH=$PYTHIA8/bin:$PATH
export LD_LIBRARY_PATH=$PYTHIA8/lib:$LD_LIBRARY_PATH
Then, rebuild Delphes with PYTHIA8 support.

cd /home/phazarik/mcgeneration/delphes
make clean
make HAS_PYTHIA8=true   # This will take a while
Next, create a PYTHIA8 configuration file, for example pythia8_ppToZToLL.cfg, and keep it in the process directory. This contains a specific set of instructions needed during hadronization, including the path to the input files, how many events to process, etc. This file is fed as an input to Delphes. If the number of events specified in the configuration does not match the number of events available in the LHE file, PYTHIA will still proceed with whatever number of events it can read.
! Pythia8 configuration for the pp -> Z -> LL process
Main:numberOfEvents = 10      ! Number of events to simulate
Main:timesAllowErrors = 10    ! How many errors Pythia will allow before stopping

! Set up beam parameters
Beams:idA = 2212              ! Proton beam
Beams:idB = 2212              ! Proton beam
Beams:eCM = 13000.0           ! Center-of-mass energy in GeV

! Load the LHEF events generated by MadGraph
Beams:frameType = 4           ! Tell Pythia to read beams and events from an LHE file
Beams:LHEF = /mnt/d/work/temp/mcgeneration/ppToZToLL/Events/testrun/unweighted_events.lhe

! Internal process generation and Z-decay settings are not needed here, since the
! hard process and the Z -> l+ l- decay already come from the LHE file.
! WeakSingleBoson:ffbar2gmZ = on  ! Enable Z boson production (only for standalone Pythia generation)
! 23:onMode = off                 ! Turn off all Z decays
! 23:onIfAny = 11 13              ! Allow only decays to leptons (e+ e- and mu+ mu-)

! Hadronization and jet clustering
HadronLevel:all = on
Finally, instead of DelphesHepMC3, run DelphesPythia8 along with the CMS card. For this, I created an alias in the .bashrc file for my convenience.
alias delphespythia="/home/phazarik/mcgeneration/delphes/DelphesPythia8 /home/phazarik/mcgeneration/delphes/cards/delphes_card_CMS.tcl"
Once these changes are made, the following can be run from the work-directory.
delphespythia ppToZToLL/pythia8_ppToZToLL.cfg test_output.root
I find this convenient because I don't have to go to the pythia8312/examples directory, create a mess there with all the temporary folders, and deal with the DAT files. However, this method complains about some missing libraries related to ExRootAnalysis at the beginning, and ROOT prints some warnings about the same while loading the output files. But as long as we are not using those features, it's fine.
To summarize, the combined workflow is the following.
- Create the process directory in the MadGraph prompt using output processname.
- Edit run_card.dat and specify the number of events.
- Generate events with ./bin/generate_events testrun.
- Unzip the LHE file produced in Events/testrun.
- Run delphes/DelphesPythia8 by providing the CMS card, the PYTHIA config file, and the output file as arguments.
For analyzing Delphes trees and writing histograms, I created a MakeSelector() based setup, with instructions available in this GitHub repository.
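For reference, the selector skeleton can be generated directly from a Delphes tree using ROOT's TTree::MakeSelector; a minimal sketch (the class name DelphesAna is just a placeholder, and the actual setup in the repository may differ):

root -l -q -e 'TFile f("testroot.root"); ((TTree*)f.Get("Delphes"))->MakeSelector("DelphesAna")'
# This writes DelphesAna.h and DelphesAna.C skeletons, which can then be filled with histogramming code.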
First, set up a suitable environment in lxplus. For this example, I am going with slc7_amd64_gcc700, which is compatible with CMSSW_12_4_8. This can be checked by running the following and finding the right CMSSW version in the list.
scram --arch slc7_amd64_gcc700 list CMSSW
Unfortunately, in this particular case, lxplus7 is no longer maintained. So we need to use a Singularity image of the OS by running the following right after logging into lxplus. More on this can be found here.
cmssw-el7
echo $SCRAM_ARCH                      # This should display some slc7 architecture.
export SCRAM_ARCH=slc7_amd64_gcc700   # This will pick the right architecture.
echo $SCRAM_ARCH                      # Make sure that you have the right architecture.
Now, go to a desired work-area (preferably in eos), create a CMSSW work area and select the environment.
cmsrel CMSSW_12_4_8
cd CMSSW_12_4_8/src
cmsenv
Riya is working on this. I will update this part later.
I am going to demonstrate this using CMSSW_12_4_8. I logged into lxplus8 (with the el8_amd64_gcc10 architecture). Inside src, I copied a gen-fragment template for Run2 Ultra Legacy, edited the gridpack location to point to the locally available gridpack (produced with the same CMSSW release), and compiled the setup.
wget --no-check-certificate --content-disposition --retry-connrefused --tries=3 -P Configuration/GenProduction/python/ https://cms-pdmv.cern.ch/mcm/public/restapi/requests/get_fragment/EXO-RunIISummer20UL18wmLHEGEN-01288
mv Configuration/GenProduction/python/EXO-RunIISummer20UL18wmLHEGEN-01288 Configuration/GenProduction/python/EXO-RunIISummer20UL18wmLHEGEN-01288.py
#Make sure the file extension is py and edit the gridpack location.
scram b -j8
cmsDriver.py Configuration/GenProduction/python/EXO-RunIISummer20UL18wmLHEGEN-01288.py \
    --python_filename EXO-RunIISummer20UL18wmGEN-01288_cfg.py \
    --eventcontent RAWSIM,LHE --customise Configuration/DataProcessing/Utils.addMonitoring \
    --datatier GEN,LHE --fileout file:EXO-RunIISummer20UL18wmGEN-01288.root \
    --conditions 106X_upgrade2018_realistic_v4 --beamspot Realistic25ns13TeVEarly2018Collision \
    --step LHE,GEN --geometry DB:Extended --era Run2_2018 --no_exec --mc -n 10
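Since --no_exec only writes out the configuration file (with the name given to --python_filename), the next step is to actually run it with cmsRun:

cmsRun EXO-RunIISummer20UL18wmGEN-01288_cfg.py   # produces EXO-RunIISummer20UL18wmGEN-01288.root with the 10 requested events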