Back to home page

EIC code displayed by LXR

 
 

    


Warning, /tutorial-analysis/_episodes/03-analysis.md is written in an unsupported language. File is not indexed.

0001 ---
0002 title: "Analyzing the Reconstruction Output"
0003 teaching: 20
0004 exercises: 40
0005 questions:
0006 - "How does one utilize the reconstruction output trees to do an analysis?"
0007 objectives:
0008 - "Become familiar with methods for reading the trees"
0009 - "Understand how to access truth/particle information"
0010 - "Find track efficiency and resolution"
0011 keypoints:
0012 - "Flat tree structure provides flexibility in analysis."
0013 - "The ReconstructedChargedParticles branch holds information on reconstructed tracks."
0014 ---
0015 
0016 So far, we have only looked at (and plotted) some information from our file interactively. This is very useful and can help us identify the variables we want to deal with. However, we can't really use these techniques to conduct a full analysis of the data. To do so, we typically use a script or macro. In this part of the tutorial, we will create a script that we can use to do a relatively straightforward analysis of our file.
0017 
0018 > Note:
0019 > - The [branch dictionary]({{ page.root }}{% link _extras/branch_dictionary.md %}) outlines all of the branches we will need to utilise in this section.
0020 > - If you want, you can prune the branches you don't need from the input file using the [TreePrune.C script]({{ page.root }}{% link _extras/tree_pruning_script.md %}).
0021 {: .callout}
0022 
0023 ## Reading the Output Trees
0024 
0025 The simulation output trees are "flat" in the sense that there is no event class structure embedded within the tree and no additional libraries are needed to handle the output. Therefore, the end user can simply read the values stored in each branch using whatever method/workflow they are most comfortable with. Examples of several common methods for reading the trees are provided below. We will see a ROOT TTreeReader based example using a ROOT macro and a python/uproot based version. There is also an example using the (relatively) new RDataFrame class of ROOT. During the tutorial, you should try the exercise using whichever language you feel most comfortable with.
0026 
0027 ## Sample Analysis with ROOT TTreeReader: Track Efficiency and Resolution
0028 
0029 As a sample exercise to become familiar with the simulation output and how to use it in a realistic setting, we will find the tracking eficiency and resolution. We will need to access the reconstructed track information and the truth particle information and we will have to associate the individual tracks and particles to one another. 
0030 
0031 Before we begin, we should create a skeleton macro to handle file I/O. For the `TTreeReader` example, we will use a simple ROOT macro. Using your favorite text editor, create a file with a name like `trackAnalysis.C` or something similar and copy in the following code: 
0032 
0033 ```c++
0034 void trackAnalysis(TString infile="path_to_your_simu_file")
0035   {
0036     // Set output file for the histograms
0037     TFile *ofile = TFile::Open("out.hist.root","RECREATE");
0038 
0039     // Analysis code will go here
0040 
0041     ofile->Write(); // Write histograms to file
0042     ofile->Close(); // Close output file
0043   }
0044 ```
0045 
0046 We will need momentum, generator status, and particle species information for the truth particles and momentum information for the reconstructed tracks. The reconstructed track information can be accessed from two different branches: CentralCKFTrackParameters and ReconstructedChargedParticles. We can access these branches using a TTreeReaderArray.
0047 
0048 > ROOT TTreeReaderArrays:
0049 >
0050 >TTreeReader and the associated TTreeReaderArray is a simple interface for reading data from a TTree. The class description and examples can be seen [here](https://root.cern/doc/v630/classTTreeReader.html). To instantiate the reader and access values from a given branch (e.g. the MCParticles branch), one would use the following calls:
0051 >
0052 > ```c++
0053 > // Set up input file chain
0054 > TChain *mychain = new TChain("events");
0055 > mychain->Add(infile);
0056 >
0057 > // Initialize reader
0058 > TTreeReader tree_reader(mychain);
0059 >
0060 > // Access whatever data-members you need
0061 > TTreeReaderArray<int> partGenStat(tree_reader, "MCParticles.generatorStatus");
0062 > TTreeReaderArray<float> partMomX(tree_reader, "MCParticles.momentum.x");
0063 > ...
0064 > ```
0065 >
0066 > The branches and their members can be viewed by opening a file with TBrowser (`new TBrowser()`) from within ROOT. Once you have defined the `TTreeReaderArray` objects for the data-members you want to look at, you can loop over the events and the members within that event:
0067 >
0068 > ```c++
0069 > while(tree_reader.Next()) { // Loop over events
0070 >  for(unsigned int i=0; i<partGenStat.GetSize(); i++) // Loop through particles in the event
0071 >    {
0072 >      int particleStatus = partGenStat[i]; // Access data-members as you would an array
0073 >      float particleXMomentum = partMomX[i]; // partMomX should have same number of entries as partGenStat because they are in the same branch
0074 >      ...
0075 >    }
0076 > }
0077 > ```
0078 > All members of the same branch should have the same number of entries, so it is sufficient to use any member of the branch to set the limit of your loop.
0079 {: .callout}
0080 
0081 We will proceed using the ReconstructedChargedParticles branch as this will give us a chance to practice using associations, copy the following lines into your analysis macro.
0082 
0083 ```c++
0084 // Set up input file chain
0085 TChain *mychain = new TChain("events");
0086 mychain->Add(infile);
0087 
0088 // Initialize reader
0089 TTreeReader tree_reader(mychain);
0090 
0091 // Get Particle Information
0092 TTreeReaderArray<int> partGenStat(tree_reader, "MCParticles.generatorStatus");
0093 TTreeReaderArray<float> partMomX(tree_reader, "MCParticles.momentum.x");
0094 TTreeReaderArray<float> partMomY(tree_reader, "MCParticles.momentum.y");
0095 TTreeReaderArray<float> partMomZ(tree_reader, "MCParticles.momentum.z");
0096 TTreeReaderArray<int> partPdg(tree_reader, "MCParticles.PDG");
0097 
0098 // Get Reconstructed Track Information
0099 TTreeReaderArray<float> trackMomX(tree_reader, "ReconstructedChargedParticles.momentum.x");
0100 TTreeReaderArray<float> trackMomY(tree_reader, "ReconstructedChargedParticles.momentum.y");
0101 TTreeReaderArray<float> trackMomZ(tree_reader, "ReconstructedChargedParticles.momentum.z");
0102 
0103 // Get Associations Between MCParticles and ReconstructedChargedParticles
0104 TTreeReaderArray<unsigned int> recoAssoc(tree_reader, "ReconstructedChargedParticleAssociations.recID");
0105 TTreeReaderArray<unsigned int> simuAssoc(tree_reader, "ReconstructedChargedParticleAssociations.simID");
0106 ```
0107 
0108 The last two lines encode the association between a ReconstructedChargedParticle and a MCParticle where the matching is determined in the [ParticlesWithPID](https://github.com/eic/EICrecon/blob/main/src/algorithms/pid/ParticlesWithPID.cc) algorithm which generates the ReconstructedChargedParticle objects.
0109 
0110 > Compiling ROOT Macros:
0111 > - If you are analysing a large number of events, you may wish to compile your macro to increase throughput. An example of how you can create and compile a root macro is included in the [Exercise Scripts section](https://eic.github.io/tutorial-analysis/exercise_scripts/index.html#compiled-root-scripts)
0112 {: .callout}
0113 
0114 ### Efficiency Analysis
0115 
0116 > Hint:
0117 > Refer to [the script template](https://eic.github.io/tutorial-analysis/exercise_scripts/index.html#efficiencyanalysisc) if you're having trouble putting things in the right place.
0118 {: .callout}
0119 
0120 Now that we have access to the data we need we will begin constructing our efficiency plots, starting with efficiency as a function of the true particle pseudorapidity. The basic strategy is outlined below:
0121 
0122 1. Loop over all events in the file
0123 2. Within each event, loop over all stable charged particles
0124 3. Identify the ReconstructedChargedParticle (if any) associated with the truth particle we are looking at
0125 4. Create and fill the necessary histograms
0126 
0127 Here is the code to implement these steps:
0128 
0129 ```c++
0130 // Define Histograms
0131 TH1D *partEta = new TH1D("partEta","Eta of Thrown Charged Particles;Eta",100,-5.,5.);
0132 TH1D *matchedPartEta = new TH1D("matchedPartEta","Eta of Thrown Charged Particles That Have Matching Track",100,-5.,5.);
0133 
0134 TH1D *matchedPartTrackDeltaR = new TH1D("matchedPartTrackDeltaR","Delta R Between Matching Thrown and Reconstructed Charged Particle",5000,0.,5.);
0135 
0136 while(tree_reader.Next()) { // Loop over events
0137 
0138   for(unsigned int i=0; i<partGenStat.GetSize(); i++) // Loop over thrown particles
0139     {
0140             if(partGenStat[i] == 1) // Select stable thrown particles
0141               {
0142                 int pdg = TMath::Abs(partPdg[i]);
0143 
0144                 if(pdg == 11 || pdg == 13 || pdg == 211 || pdg == 321 || pdg == 2212) // Look at charged particles (electrons, muons, pions, kaons, protons)
0145                   {
0146                           TVector3 trueMom(partMomX[i],partMomY[i],partMomZ[i]);
0147 
0148                           float trueEta = trueMom.PseudoRapidity();
0149                           float truePhi = trueMom.Phi();
0150             
0151                           partEta->Fill(trueEta);
0152 
0153               // Loop over associations to find matching ReconstructedChargedParticle
0154                           for(unsigned int j=0; j<simuAssoc.GetSize(); j++)
0155                             {
0156                               if(simuAssoc[j] == i) // Find association index matching the index of the thrown particle we are looking at
0157                                 {
0158                                         TVector3 recMom(trackMomX[recoAssoc[j]],trackMomY[recoAssoc[j]],trackMomZ[recoAssoc[j]]); // recoAssoc[j] is the index of the matched ReconstructedChargedParticle
0159 
0160                       // Check the distance between the thrown and reconstructed particle
0161                                         float deltaEta = trueEta - recMom.PseudoRapidity();
0162                                         float deltaPhi = TVector2::Phi_mpi_pi(truePhi - recMom.Phi());
0163                                         float deltaR = TMath::Sqrt(deltaEta*deltaEta + deltaPhi*deltaPhi);
0164 
0165                                         matchedPartTrackDeltaR->Fill(deltaR);
0166 
0167                                         matchedPartEta->Fill(trueEta); // Plot the thrown eta if a matched ReconstructedChargedParticle was found
0168                     }
0169                 }
0170             }            
0171         }
0172     }
0173 }
0174 ```
0175 
0176 We should now have everything we need to find the track efficiency as a function of pseudorapidity. To run the macro and produce an output file containing the histograms we defined, simply type `root -l -q trackAnalysis.C`. After the macro runs, you can open the root file to inspect the histograms. The efficiency can be found by taking the ratio of matchedPartEta over partEta.
0177 
0178 > Question:
0179 > - Do the histogram ranges make sense?
0180 > - We plot the distance between thrown and reconstructed charged partices, does this distribution look reasonable?
0181 > - When filling the matchedPartEta histogram (the numerator in our efficiency), why do we use again the true thrown eta instead of the associated reconstructed eta?
0182 {: .callout}
0183 
0184 > Exercise:
0185 > - Find the efficiency as a function of particle momentum. Are there cuts on any other quantities you should place to get a sensible result?
0186 > - Find the efficiency for some 2-D correlations: momentum vs eta; phi vs eta
0187 > - Plot some kinematic distributions (momentum, eta, etc) for all ReconstructedChargedParticles, not just those that are associated with a thrown particle
0188 {: .challenge}
0189 
0190 ### Resolution Analysis
0191 
0192 > Hint:
0193 > Refer to [the script template](https://eic.github.io/tutorial-analysis/exercise_scripts/index.html#resolutionanalysisc) if you're having trouble putting things in the right place.
0194 {: .callout}
0195 
0196 Next, we will look at track momentum resolution, that is, how well the momentum of the reconstructed track matches that of the thrown particle. We should have all of the "infrastructure" we need in place to do the analysis, we just need to define the appropriate quantities and make the histograms. It only makes sense to define the resolution for tracks and particles which are associated with one another, so we will work within the loop over associations. Define the resolution expression and fill a simple histogram:
0197 
0198 ```c++
0199 TH1D *trackMomentumRes = new TH1D("trackMomentumRes","Track Momentum Resolution",2000,-10.,10.);
0200 ...
0201 // Loop over associations to find matching ReconstructedChargedParticle
0202 for(unsigned int j=0; j<simuAssoc.GetSize(); j++)
0203   {
0204     if(simuAssoc[j] == i) // Find association index matching the index of the thrown particle we are looking at
0205       {
0206         ...
0207         double momRes = (recMom.Mag() - trueMom.Mag())/trueMom.Mag();
0208 
0209         trackMomentumRes->Fill(momRes);
0210       }
0211   }  
0212 ```
0213 
0214 While this plot will give us a sense of what the tracking resolution is, we don't expect the resolution to be constant for all momenta or eta. We can get a more complete picture by plotting the resolution as a function of different kinematic quantities. 
0215 
0216 > Exercise:
0217 > - Make 2-D plots of resolution vs true momentum and vs true pseudorapidity.
0218 > - Break resolution plots down by particle species.
0219 {: .challenge}
0220 
0221 > Question:
0222 > - Will the histogram ranges for each particle species be the same?
0223 > - Could we present the resolution values in a more understandable way?
0224 {: .callout}
0225 
0226 ## Sample Analysis with Python/uproot: Track Efficiency and Resolution
0227 
0228 > Comment:
0229 > Despite using python/uproot, I have written these in a very "ROOT"/C way. Uproot converts our branches to arrays, so you can manipulate them in various fun ways using more pythonic methods if you want.
0230 > Shujie Li utilises a more pythonic approach in the [Understanding the Simulation Output](https://eic.github.io/tutorial-understanding-sim-output/) tutorial. You can see some examples of this in the [Jupyter Notebook](https://colab.research.google.com/drive/1Wn9guq1aIJ8RUW36HHTkeR7-iPPcoBOw?usp=sharing) Shujie set up for that tutorial. 
0231 {: .callout}
0232 
0233 If you are more familiar with python than you are with C/C++, you might find that using a python based root macro is easier for you. Outlined below are sample blocks of code for creating and running a python based analysis script.
0234 
0235 With python, some tasks become easier, e.g. string manipulation and writing to (non ROOT) files.
0236 
0237 Before we begin, we should create a skeleton macro to handle file I/O. For this example, we will make a simple python script. Using your favorite editor, create a file with a name like `trackAnalysis.py` or something similar and copy in the following code: 
0238 
0239 ```python
0240 #! /usr/bin/python
0241          
0242 #Import relevant packages
0243 import ROOT, math, array                                
0244 from ROOT import TH1F, TH2F, TMath, TTree, TVector3, TVector2
0245 import uproot as up
0246 
0247 # Define and open files
0248 infile="PATH_TO_FILE" 
0249 ofile=ROOT.TFile.Open("TrackAnalysis_OutPy.root", "RECREATE")
0250 
0251 # Open input file and define branches we want to look at with uproot
0252 events_tree = up.open(infile)["events"]
0253 
0254 # Define histograms below
0255 
0256 # Add main analysis loop(s) below
0257 
0258 # Write output histograms to file below                
0259 
0260 # Close files
0261 ofile.Close()                    
0262 ```
0263 Note that we are using the module uproot to access the data here. See [further documentation here](https://masonproffitt.github.io/uproot-tutorial/03-trees/index.html). You may also need some of the other included packages too.
0264 
0265 > We will use uproot a little bit like we use the TTreeReader in the other example. We can define the branches we want and assign them to arrays with uproot.
0266 > We can do this via:
0267 >  ```python
0268 > # Open input file and define branches we want to look at with uproot
0269 > events_tree = up.open(infile)["events"]
0270 > # Get particle information# Get particle information
0271 > partGenStat = events_tree["MCParticles.generatorStatus"].array()
0272 > partMomX = events_tree["MCP articles.momentum.x"].array() 
0273 > partMomY = events_tree["MCParticles.momentum.y"].array()
0274 > partMomZ = events_tree["MCParticles.momentum.z"].array()
0275 > partPdg = events_tree["MCParticles.PDG"].array()
0276 >
0277 > # Get reconstructed track information
0278 > trackMomX = events_tree["ReconstructedChargedParticles.momentum.x"].array()
0279 > trackMomY = events_tree["ReconstructedChargedParticles.momentum.y"].array()
0280 > trackMomZ = events_tree["ReconstructedChargedParticles.momentum.z"].array()
0281 >  ...
0282 > ```
0283 >  We can then access them as an array in a loop -
0284 > ```python
0285 > # Add main analysis loop(s) below
0286 > for i in range(0, len(partGenStat)): # Loop over all events
0287 >     for j in range(0, len(partGenStat[i])): # Loop over all thrown particles
0288 >         if partGenStat[i][j] == 1: # Select stable particles
0289 >             pdg = abs(partPdg[i][j]) # Get PDG for each stable particle
0290 >             ...
0291 > ```
0292 > Uproot effectively takes the information in the tree, and turns it into an array. We can then acces and manipulate this array in the same way that we can with any array in python.
0293 >
0294 > Note that if you are using an older version of uproot (v2.x.x), you will need to access the branches slightly differently via -
0295 > ```python
0296 > partGenStat = events_tree.array("MCParticles.generatorStatus")
0297 > ```
0298 {: .callout}
0299 
0300 You can run this file with ``python3 trackAnalysis.py``. It should open your file and create an empty output root file as specified. We will add histograms to this script and fill them in the next step.
0301 
0302 Note that depending upon your setup, ``python trackAnalysis.py`` may work too.
0303 
0304 ### Efficiency Analysis
0305 
0306 > Hint:
0307 > Refer to [the script template](https://eic.github.io/tutorial-analysis/exercise_scripts/index.html#efficiencyanalysispy) if you're having trouble putting things in the right place.
0308 {: .callout}
0309 
0310 As with the ROOT TTreeReader example, we will find the tracking eficiency and resolution. We will need to access the reconstructed track information and the truth particle information and we will have to associate the individual tracks and particles to one another.
0311 
0312 The basic strategy is the same:
0313 
0314 1. Loop over all events in the file
0315 2. Within each event, loop over all stable charged particles
0316 3. Identify the ReconstructedChargedParticle (if any) associated with the truth particle we are looking at
0317 4. Create and fill the necessary histograms
0318 
0319 Here is the sample code to implement these steps:
0320 
0321 ```python
0322 # Get assocations between MCParticles and ReconstructedChargedParticles
0323 recoAssoc = events_tree["ReconstructedChargedParticleAssociations.recID"].array()
0324 simuAssoc = events_tree["ReconstructedChargedParticleAssociations.simID"].array()
0325 
0326 # Define histograms below
0327 partEta = ROOT.TH1D("partEta","Eta of Thrown Charged Particles;Eta",100, -5 ,5 )
0328 matchedPartEta = ROOT.TH1D("matchedPartEta","Eta of Thrown Charged Particles That Have Matching Track", 100, -5 ,5);
0329 matchedPartTrackDeltaR = ROOT.TH1D("matchedPartTrackDeltaR","Delta R Between Matching Thrown and Reconstructed Charge Particle", 5000, 0, 5);
0330 
0331 # Add main analysis loop(s) below
0332 for i in range(0, len(partGenStat)): # Loop over all events
0333     for j in range(0, len(partGenStat[i])): # Loop over all thrown particles
0334         if partGenStat[i][j] == 1: # Select stable particles
0335             pdg = abs(partPdg[i][j]) # Get PDG for each stable particle
0336             if(pdg == 11 or pdg == 13 or pdg == 211 or pdg == 321 or pdg == 2212):
0337                 trueMom = ROOT.TVector3(partMomX[i][j], partMomY[i][j], partMomZ[i][j])
0338                 trueEta = trueMom.PseudoRapidity()
0339                 truePhi = trueMom.Phi()
0340                 partEta.Fill(trueEta)
0341                 for k in range(0,len(simuAssoc[i])): # Loop over associations to find matching ReconstructedChargedParticle
0342                     if (simuAssoc[i][k] == j):
0343                         recMom = ROOT.TVector3(trackMomX[i][recoAssoc[i][k]], trackMomY[i][recoAssoc[i][k]], trackMomZ[i][recoAssoc[i][k]])
0344                         deltaEta = trueEta - recMom.PseudoRapidity()
0345                         deltaPhi = TVector2. Phi_mpi_pi(truePhi - recMom.Phi())
0346                         deltaR = math.sqrt((deltaEta*deltaEta) + (deltaPhi*deltaPhi))
0347                         matchedPartEta.Fill(trueEta)
0348                         matchedPartTrackDeltaR.Fill(deltaR)
0349                         
0350 # Write output histograms to file below
0351 partEta.Write()
0352 matchedPartEta.Write()
0353 matchedPartTrackDeltaR.Write()
0354 
0355 # Close files
0356 ofile.Close()
0357 ```
0358 Insert this block of code appropriately. We should now have everything we need to find the track efficiency as a function of pseudorapidity. Run the script with `python3 trackAnalysis.py``. This should produce a root file with a few histograms in place. The efficiency can be found by taking the ratio of matchedPartEta over partEta.
0359 
0360 > Question:
0361 > - Do the hisotgram ranges make sense?
0362 > - We plot the distance between thrown and reconstructed charged partices, does this distribution look reasonable?
0363 > - When filling the matchedPartEta histogram (the numerator in our efficiency), why do we use again the true thrown eta instead of the associated reconstructed eta?
0364 {: .callout}
0365 
0366 > Exercise:
0367 > - Find the efficiency as a function of particle momentum. Are there cuts on any other quantities you should place to get a sensible result?
0368 > - Find the efficiency for some 2-D correlations: momentum vs eta; phi vs eta
0369 > - Plot some kinematic distributions (momentum, eta, etc) for all ReconstructedChargedParticles, not just those that are associated with a thrown particle
0370 {: .challenge}
0371 
0372 ### Resolution Analysis
0373 
0374 > Hint:
0375 > Refer to [the script template](https://eic.github.io/tutorial-analysis/exercise_scripts/index.html#resolutionanalysispy) if you're having trouble putting things in the right place.
0376 {: .callout}
0377 
0378 Next, we will look at track momentum resolution, that is, how well the momentum of the reconstructed track matches that of the thrown particle. We should have all of the "infrastructure" we need in place to do the analysis, we just need to define the appropriate quantities and make the histograms. It only makes sense to define the resolution for tracks and particles which are associated with one another, so we will work within the loop over associations. Define the resolution expression and fill a simple histogram by inserting this block of code appropriately:
0379 
0380 ```python
0381 trackMomentumRes = ROOT.TH1D("trackMomentumRes","Track Momentum Resolution",2000,-10.,10.);
0382 ...
0383                 for k in range(0,len(simuAssoc[i])): # Loop over associations to find matching ReconstructedChargedParticle
0384                     if (simuAssoc[i][k] == j):
0385                         recMom = ROOT.TVector3(trackMomX[i][recoAssoc[i][k]], trackMomY[i][recoAssoc[i][k]], trackMomZ[i][recoAssoc[i][k]])
0386                         momRes = (recMom.Mag() - trueMom.Mag())/trueMom.Mag()
0387 
0388                         trackMomentumRes.Fill(momRes)
0389 ```
0390 
0391 Remember to write this histogram to the output file too! While this plot will give us a sense of what the tracking resolution is, we don't expect the resolution to be constant for all momenta or eta. We can get a more complete picture by plotting the resolution as a function of different kinematic quantities. 
0392 
0393 > Exercise:
0394 > - Make 2-D plots of resolution vs true momentum and vs true pseudorapidity.
0395 > - Break resolution plots down by particle species.
0396 {: .challenge}
0397 
0398 > Question:
0399 > - Will the histogram ranges for each particle species be the same?
0400 > - Could we present the resolution values in a more understandable way?
0401 {: .callout}
0402 
0403 ## ROOT RDataFrames
0404 
0405 > Note:
0406 > - This method does actually need you to be within eic-shell (or somewhere else with the correct EDM4hep/EDM4eic libraries installed).
0407 {: .callout}
0408 
0409 Newer versions of root, such as the version in eic-shell, have access to a relatively new class, [RDataFrames](https://root.cern/doc/master/classROOT_1_1RDataFrame.html). These are similar to pythonic data frame style strucutres that you may be familiar with. Some people are moving towards utilising RDataFrames in their analysis. If you are more familiar with working with data frames, you may wish to investigate these further.
0410 
0411 Included below is a quick script from [Simon Gardner](https://github.com/simonge/EIC_Analysis/blob/main/Analysis-Tutorial/EfficiencyAnalysisRDF.C) that utilises RDataFrames to analyse a data file. Copy the following into a new file called `EfficiencyAnalysisRDF.C` -
0412 
0413 ```c++
0414 #include <edm4hep/utils/vector_utils.h>
0415 #include <edm4hep/MCParticle.h>
0416 #include <edm4eic/ReconstructedParticle.h>
0417 #include <ROOT/RDataFrame.hxx>
0418 #include <ROOT/RVec.hxx>
0419 #include <TFile.h>
0420 
0421 // Define aliases for the data types 
0422 using MCP = edm4hep::MCParticleData;
0423 using RecoP = edm4eic::ReconstructedParticleData;
0424 
0425 // Define function to vectorize the edm4hep::utils methods
0426 template <typename T>
0427 auto getEta = [](ROOT::VecOps::RVec<T> momenta) {
0428   return ROOT::VecOps::Map(momenta, [](const T& p) { return edm4hep::utils::eta(p.momentum); });
0429 };
0430 
0431 template <typename T>
0432 auto getPhi = [](ROOT::VecOps::RVec<T> momenta) {
0433   return ROOT::VecOps::Map(momenta, [](const T& p) { return edm4hep::utils::angleAzimuthal(p.momentum); });
0434 };
0435 
0436 // Define the function to perform the efficiency analysis
0437 void EfficiencyAnalysisRDF(TString infile="PATH_TO_FILE"){
0438    
0439   // Set up input file 
0440   ROOT::RDataFrame df("events", infile);
0441 
0442   // Define new dataframe node with additional columns
0443   auto df1 =  df.Define("statusFilter",  "MCParticles.generatorStatus == 1"    )
0444                 .Define("absPDG",        "abs(MCParticles.PDG)"                )
0445                 .Define("pdgFilter",     "absPDG == 11 || absPDG == 13 || absPDG == 211 || absPDG == 321 || absPDG == 2212")
0446                 .Define("particleFilter","statusFilter && pdgFilter"           )
0447                 .Define("filtMCParts",   "MCParticles[particleFilter]"         )
0448                 .Define("assoFilter",    "Take(particleFilter,ReconstructedChargedParticleAssociations.simID)") // Incase any of the associated particles happen to not be charged
0449                 .Define("assoMCParts",   "Take(MCParticles,ReconstructedChargedParticleAssociations.simID)[assoFilter]")
0450                 .Define("assoRecParts",  "Take(ReconstructedChargedParticles,ReconstructedChargedParticleAssociations.recID)[assoFilter]")
0451                 .Define("filtMCEta",     getEta<MCP>   , {"filtMCParts"} )
0452                 .Define("filtMCPhi",     getPhi<MCP>   , {"filtMCParts"} )
0453                 .Define("accoMCEta",     getEta<MCP>   , {"assoMCParts"} )
0454                 .Define("accoMCPhi",     getPhi<MCP>   , {"assoMCParts"} )
0455                 .Define("assoRecEta",    getEta<RecoP> , {"assoRecParts"})
0456                 .Define("assoRecPhi",    getPhi<RecoP> , {"assoRecParts"})
0457                 .Define("deltaR",        "ROOT::VecOps::DeltaR(assoRecEta, accoMCEta, assoRecPhi, accoMCPhi)");
0458 
0459   // Define histograms
0460   auto partEta                = df1.Histo1D({"partEta","Eta of Thrown Charged Particles;Eta",100,-5.,5.},"filtMCEta");
0461   auto matchedPartEta         = df1.Histo1D({"matchedPartEta","Eta of Thrown Charged Particles That Have Matching Track",100,-5.,5.},"accoMCEta");
0462   auto matchedPartTrackDeltaR = df1.Histo1D({"matchedPartTrackDeltaR","Delta R Between Matching Thrown and Reconstructed Charged Particle",5000,0.,5.},"deltaR");
0463 
0464   // Write histograms to file
0465   TFile *ofile = TFile::Open("EfficiencyAnalysis_Out_RDF.root","RECREATE");
0466 
0467   // Booked Define and Histo1D lazy actions are only performed here
0468   partEta->Write();
0469   matchedPartEta->Write();
0470   matchedPartTrackDeltaR->Write();
0471       
0472   ofile->Close(); // Close output file
0473 }
0474 ```
0475 > Note:
0476 > - You will need to run this script with the command `root -l -q EfficiencyAnalysisRDF.C++`, within eic-shell (or somewhere else with the correct EDM4hep/EDM4eic libraries installed).
0477 > - Remember to put in the correct file path.
0478 {: .callout}
0479 
0480 If you like, you can try completing the exercises using this example to start from.
0481 
0482 ## PODIO - Direct Analysis
0483 
0484 If you want to avoid ROOT entirely, you can analyse the PODIO files directly in a variety of ways.
0485 
0486 See [Wouter's example use cases](https://indico.cern.ch/event/1343984/contributions/5908856/attachments/2842958/4970156/2024-04-23%20-%20Examples%20for%20Data%20Model%20Usage.pdf) from 23/04/24. Wouter shows a few ways in which the PODIO file can be accssed and analysed directly.