---
title: "Full Chain Analysis"
teaching: 15
exercises: 10
questions:
- "How do I bring all of this together if I'm starting from scratch?"
objectives:
- "Become familiar with the full analysis chain"
keypoints:
- "There are a few steps to go through before we get to the file we analysed previously."
- "Good for testing, but use simulation campaign output where possible!"
---

In this short session, we'll briefly run through how we actually end up with a file like the one we ran our script on before. There are 5 basic steps, which we'll look at individually and then combine:

1. Generate an input file (typically hepmc, though other formats are usable). This usually comes from some external event generator.
2. Afterburn the file and apply beam effects (this might be skipped in some cases).
3. Process the input through the simulation, DD4hep.
4. Reconstruct the DD4hep output with EICrecon.
5. Analyse the EICrecon output with an analysis script.

Note that for low-level analyses, you could also directly analyse the DD4hep output from step 3. You may also wish to consult [Holly's slides from the April 2024 software meeting](https://indico.cern.ch/event/1343984/contributions/5927492/attachments/2843633/4971409/tutorial_overview.pdf) for an overview of each of these steps and how they fit into this production chain.

## Event Generator Input Files

I won't say too much on this, since it depends strongly on the channel you want to simulate and analyse. The files here will likely come from an external event generator, for example -

- PYTHIA
- BeAGLE
- DJANGOH
- MILOU
- eSTARlight
- LaGER
- DEMPgen
- elSPectro

... and many others. However, regardless of what you use, the output is likely some form of .hepmc file with event-by-event particle/vertex info. For example -

> Example HEPMC Event:
> An example event from a HEPMC file is shown below. In this example event, we have an input 5 GeV electron on a 41 GeV proton. We have one vertex and three outgoing particles, a scattered electron, a pion, and a neutron. In our header, we also have an event weight included.
>
> E 1 1 5
> U GEV MM
> A 0 weight 4.813926604168258e-07
> P 1 0 11 6.123233963758798e-16 0.000000000000000e+00 -4.999999973888007e+00 5.000000000000000e+00 5.109989488070365e-04 4
> P 2 0 2212 -0.000000000000000e+00 -0.000000000000000e+00 4.100000000000000e+01 4.101073462535657e+01 9.382720881600054e-01 4
> V -1 0 [1,2]
> P 3 -1 11 -6.872312444834133e-01 1.924351128807063e+00 -4.281657822517654e+00 4.744260534644128e+00 5.109989488070365e-04 1
> P 4 -1 211 1.042011265882083e+00 -1.600831989262599e+00 1.404460452649878e+00 2.374960954263115e+00 1.395701800000037e-01 1
> P 5 -1 2112 -3.547800213986697e-01 -3.235191395444645e-01 3.887719739597977e+01 3.889151313644933e+01 9.395654204998098e-01 1
{: .callout}

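Before feeding a file like this into the simulation, it can be handy to check how many events and final-state particles it contains. The following is a minimal sketch, assuming the ASCII layout of the example event above: event records start with `E`, particle records start with `P`, and the last field of a particle record is its status code (1 for final-state particles, as for the three outgoing particles in the example). The helper name `hepmc_summary` is made up for illustration and is not part of any EIC tool.

```shell
# Hypothetical helper: summarise an ASCII hepmc file.
# Assumes "E" starts an event record, "P" starts a particle record,
# and the last field of a "P" record is the status code (1 = final state).
hepmc_summary() {
    awk '
        $1 == "E" { events++ }                 # count event records
        $1 == "P" && $NF == 1 { final++ }      # count final-state particles
        END { printf "events=%d final_state=%d\n", events, final }
    ' "$1"
}
```

Run on a file containing only the example event above, `hepmc_summary` reports one event and three final-state particles.
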
Typically, we also need to incorporate beam effects. This is done with the afterburner.

## Applying Beam Effects - Afterburner

Afterburner applies beam effects to an existing hepmc file. These include effects due to the crabbing of the beam bunches and the crossing angle. Afterburner is pre-installed in eic-shell. We can run it via -

```console
abconv
```

However, we'll need an input file to do anything. We can quickly check the other options with -

```console
abconv -h
```

Note that when we run Afterburner, it will try to pick up the input beam energies and apply the relevant configuration. We can force a different configuration if we want (see the options in the help printout). For example, we could run -

```console
abconv $File -o $OutputFilename
```

where $File is our input hepmc file from our generator, and $OutputFilename is whatever we want our output to be called.

Regardless of whether we want or need to afterburn the file, we can feed our hepmc file into DD4hep and process our events through the detector simulation.

## Simulation

To process our events through the simulation, we need to get the detector geometry. The simplest way is to source the nightly build within eic-shell -

```console
./eic-shell
source /opt/detector/epic-main/bin/thisepic.sh
```

We can check this worked as intended by checking that the DETECTOR_PATH variable is now set. Do so via -

```console
ls $DETECTOR_PATH
```

If we do this without sourcing thisepic.sh, DETECTOR_PATH may be unset, in which case `ls` just lists the current directory instead of the detector files. Now, we should see a range of .xml files (which outline various detector configurations). We could also compile our own version of the detector within eic-shell. You might want to do so if you are actively iterating on the design of a specific detector, for example. See the [GitHub page](https://github.com/eic/epic) for instructions.

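The same check can be put into a script, so that a missing geometry is caught before a long simulation starts. This is a minimal sketch: the function name `check_detector_config` and its messages are made up for illustration, and the only thing it relies on is that the configuration .xml files sit directly in $DETECTOR_PATH, as the `ls` above shows.

```shell
# Hypothetical guard: confirm the geometry environment is usable.
# Usage: check_detector_config epic_craterlake.xml
check_detector_config() {
    local config="$1"
    # DETECTOR_PATH is set by sourcing thisepic.sh
    if [ -z "${DETECTOR_PATH:-}" ]; then
        echo "DETECTOR_PATH is not set; source thisepic.sh first" >&2
        return 1
    fi
    # The requested configuration file must exist in that directory
    if [ ! -f "${DETECTOR_PATH}/${config}" ]; then
        echo "No ${config} found in ${DETECTOR_PATH}" >&2
        return 2
    fi
    echo "Using ${DETECTOR_PATH}/${config}"
}
```

Calling `check_detector_config epic_craterlake.xml` at the top of a script and stopping on a non-zero return code avoids starting a job with the wrong or missing geometry.
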
We can now run a simulation. Be aware that this may take some time, so to test it, try processing a small number of events first. Check the options we can provide via -

```console
npsim -h
```

A typical simulation command might look something like -

```console
npsim --compactFile $DETECTOR_PATH/epic_craterlake.xml --numberOfEvents 1000 --inputFiles input.hepmc --outputFile output.edm4hep.root
```

Most of the arguments are pretty self-explanatory. As a quick demo, I'll run -

```console
npsim --compactFile $DETECTOR_PATH/epic_craterlake_5x41.xml --numberOfEvents 10 --inputFiles eic_DEMPgen_5on41_ip6_pi+_1B_1.hepmc --outputFile DEMPgen_5on41_pi+_10_TestOutput.edm4hep.root
```

When we run this, we'll get lots of printouts to screen. We can suppress these by adding the -v5 argument too (only errors will be printed). Once we have our simulation output, we can reconstruct our events.

## Reconstruction

We can run eicrecon pretty straightforwardly. Within eic-shell, try -

```console
eicrecon -h
```

which should, again, print out the various options we have available. An example command to run the reconstruction on a file might look like this -

```console
eicrecon -Ppodio:output_file=eicrecon_out.root -Pjana:nevents=1000 -Pdd4hep:xml_files=epic_craterlake.xml sim_output.edm4hep.root
```

Again, this might take a long time, so test a small sample of events first. Following up on my simulation demo, I'll run -

```console
eicrecon -Ppodio:output_file=DEMPgen_5on41_pi+_10_TestReconOutput.edm4hep.root -Pjana:nevents=10 -Pdd4hep:xml_files=epic_craterlake_5x41.xml DEMPgen_5on41_pi+_10_TestOutput.edm4hep.root
```

eicrecon will look for the detector .xml file in $DETECTOR_PATH, so make sure the detector geometry is sourced before running eicrecon.

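Since both the simulation and the reconstruction can be slow, a short test run can also be used to get a rough idea of how long a full sample would take. The sketch below simply scales the measured time linearly with the number of events. That is an assumption: it ignores fixed start-up costs such as loading the geometry, so treat the result as a rough guide only. The function name `estimate_seconds` is hypothetical.

```shell
# Hypothetical helper: scale the wall time of a short test run up to a full sample.
# Assumes the time grows linearly with the number of events (start-up cost ignored).
# Usage: estimate_seconds <test_events> <test_seconds> <target_events>
estimate_seconds() {
    local test_events="$1" test_seconds="$2" target_events="$3"
    if [ "${test_events}" -le 0 ]; then
        echo "test_events must be positive" >&2
        return 1
    fi
    # Integer arithmetic, rounded down
    echo $(( test_seconds * target_events / test_events ))
}
```

As an illustration with made-up numbers, if 10 events took 120 seconds, `estimate_seconds 10 120 1000` prints 12000, so 1000 events would need a little over three hours under this assumption.
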
## Combining Everything

Ok, great. We now have a file we could run our earlier analysis script on. But what if we wanted to do all of this from scratch? Well, the easiest way might be to put all of this in a shell script. So, pulling all of our commands together -

```console
#! /bin/bash

# Set up the detector geometry (this sets DETECTOR_PATH)
source /opt/detector/epic-main/bin/thisepic.sh
# Run the simulation
npsim --compactFile $DETECTOR_PATH/epic_craterlake.xml --numberOfEvents 1000 --inputFiles input.hepmc --outputFile output.edm4hep.root
sleep 3
# Reconstruct the simulation output (the input file must match the npsim --outputFile above)
eicrecon -Ppodio:output_file=eicrecon_out.root -Pjana:nevents=1000 -Pdd4hep:xml_files=epic_craterlake.xml output.edm4hep.root

exit 0
```

I've premade a version of this with the commands I ran earlier, so we can run it and see what happens.

## Farming

Ok, great. We can do (almost) all of the processes we need in one command. But as we've seen, the processing can take a while. Realistically, we're probably going to want to parallelise this in some way. With access to the JLab iFarm or the BNL systems (Condor), we can create and submit compute jobs for this purpose. This is getting a bit beyond the scope of this tutorial, but some things to consider -

- Our job needs to either access the container, or start eic-shell within the job (more on this below).
- The job itself should be as simple as possible, just executing a command with some arguments. Our script above is a good candidate (with some work).
- As is, our script is fairly inflexible. We should probably make things like the input and output file names variables that are set based upon arguments we provide.
- We need to consider the resource usage of our job carefully.
- Pathing can be tricky. We need to make sure the farm/compute node picks up the correct paths, such as $DETECTOR_PATH (this is a common job error).
- As always, TEST first. Run a small job that runs quickly interactively, THEN submit it as a small compute job. Compare the outputs.

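One way to address the inflexibility point is to take the input file, number of events, and detector configuration as arguments and derive the output names from them. The sketch below reuses only the npsim and eicrecon options shown earlier. The function name `run_chain`, the `DRY_RUN` switch, and the output naming scheme are all assumptions made for illustration, not an established convention.

```shell
# Hypothetical wrapper around the npsim + eicrecon commands shown earlier.
# Usage: run_chain <input.hepmc> [n_events] [detector_config.xml]
# Set DRY_RUN=1 to print the commands instead of running them.
run_chain() {
    local input="$1"
    local nevents="${2:-10}"                  # small default, for quick tests
    local config="${3:-epic_craterlake.xml}"
    local base
    base="$(basename "${input}" .hepmc)"      # e.g. input.hepmc -> input
    local sim_out="${base}_${nevents}_sim.edm4hep.root"     # assumed naming
    local rec_out="${base}_${nevents}_recon.edm4hep.root"   # assumed naming

    # Catch the common pathing error before doing any work
    if [ -z "${DETECTOR_PATH:-}" ]; then
        echo "DETECTOR_PATH is not set; source thisepic.sh first" >&2
        return 1
    fi

    # In dry-run mode, prefix each command with "echo"
    local run=""
    if [ "${DRY_RUN:-0}" = "1" ]; then run="echo"; fi

    ${run} npsim --compactFile "${DETECTOR_PATH}/${config}" \
        --numberOfEvents "${nevents}" \
        --inputFiles "${input}" --outputFile "${sim_out}" || return 1

    ${run} eicrecon -Ppodio:output_file="${rec_out}" \
        -Pjana:nevents="${nevents}" \
        -Pdd4hep:xml_files="${config}" "${sim_out}" || return 1
}
```

Running `DRY_RUN=1 run_chain input.hepmc 10` in bash prints the two commands without executing them, which is a cheap way to check the paths and file names before submitting anything.
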
For our first point, one easy (and not recommended for Condor jobs!) way to do this is via an EOF line -

```console
#! /bin/bash

cat <<EOF | eic-shell
./Basic_Bash.sh
EOF

exit 0
```

This just starts eic-shell and runs our earlier script. We can run this WITHOUT running eic-shell first. Note that calling a script inside the EOF block, rather than writing the commands there directly, is a bit of a cheat to address a pathing issue. If the commands were written directly in the EOF block, $DETECTOR_PATH would be expanded by the outer script BEFORE the block is passed to eic-shell, so our variable would be mis-set. We get around this by running a script instead. Ideally, for our compute job, we should probably also explicitly set our paths to eic-shell and the bash script in some way.

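This expansion behaviour is standard shell handling of EOF blocks (here-documents), so it can be checked without eic-shell at all. In the sketch below, `cat` stands in for eic-shell and `/outer/path` is a made-up value. The only difference between the two blocks is whether the `EOF` delimiter is quoted. Quoting the delimiter is a general shell feature, not something the scripts in this tutorial rely on.

```shell
# Made-up value standing in for whatever the submitting shell has set
DETECTOR_PATH=/outer/path

# Unquoted delimiter: the outer shell expands $DETECTOR_PATH before
# the text reaches the receiving command (here just cat).
expanded=$(cat <<EOF
ls $DETECTOR_PATH
EOF
)

# Quoted delimiter: the text is passed through literally, so the
# receiving shell would be the one to expand the variable.
literal=$(cat <<'EOF'
ls $DETECTOR_PATH
EOF
)

echo "unquoted: ${expanded}"
echo "quoted:   ${literal}"
```

The first line printed contains `/outer/path`, while the second still contains the literal text `$DETECTOR_PATH`.
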
With changes like this made, we could then make a quick job and submit it. This is a bit beyond the tutorials, but for some farm examples, see [this job script](https://github.com/sjdkay/ePIC_PairSpec_Sim/blob/main/Farm_Bash_Scripts/PairSpec_Sim_Job.sh) and this [job submission script](https://github.com/sjdkay/ePIC_PairSpec_Sim/blob/main/Farm_Bash_Scripts/PairSpec_Sim_Job.sh) I use as a template. Feel free to use these as a template for your own jobs, but please read through them thoroughly and understand them before submitting a huge number of jobs. Keep the comments above in mind too.

Also, a quick disclaimer: my experience in running jobs is limited to systems I know (which do not include the BNL systems). As such, I can't advise much on BNL/Condor beyond general Slurm-style job questions. I'm also aware that using EOF in scripts was not encouraged in BNL jobs, see [this discussion on Mattermost](https://chat.epic-eic.org/main/pl/fo9954siwigyjckasrnd4xufxw) for more.

## Warnings

Finally, a major disclaimer. A lot of the time, you should NOT be starting from scratch and processing through the simulation and reconstruction yourself. There are numerous reasons -

- Computing time intensive
- Versioning errors/mismatch
- Not as reproducible (if you find an error, people will need to try and reproduce it from your environment)

Where possible, use files from official simulation campaigns (bringing us full circle, see the first lesson for using a simulation campaign file in a script!). That being said, for testing and iterating rapidly on a design change, running small jobs yourself may be the way to go. It may also help you to understand the full process by seeing the steps involved.