Back to home page

EIC code displayed by LXR

 
 

    


Warning, /tutorial-analysis/_episodes/03-analysis.md is written in an unsupported language. File is not indexed.

0001 ---
0002 title: "Analyzing the Reconstruction Output"
0003 teaching: 20
0004 exercises: 40
0005 questions:
0006 - "How does one utilize the reconstruction output trees to do an analysis?"
0007 objectives:
0008 - "Become familiar with methods for reading the trees"
0009 - "Understand how to access truth/particle information"
0010 - "Find track efficiency and resolution"
0011 keypoints:
0012 - "Flat tree structure provides flexibility in analysis."
0013 - "The ReconstructedChargedParticles branch holds information on reconstructed tracks."
0014 ---
0015 
0016 So far, we have only looked at (and plotted) some information from our file interactively. This is very useful and can help us identify the variables we want to deal with. However, we can't really use these techniques to conduct a full analysis of the data. To do so, we typically use a script or macro. In this part of the tutorial, we will create a script that we can use to do a relatively straightforward analysis of our file.
0017 
0018 ## Reading the Output Trees
0019 
0020 The simulation output trees are "flat" in the sense that there is no event class structure embedded within the tree and no additional libraries are needed to handle the output. Therefore, the end user can simply read the values stored in each branch using whatever method/workflow they are most comfortable with. Examples of several common methods for reading the trees are provided below. We will see a ROOT TTreeReader based example using a ROOT macro and a python/uproot based version. During the tutorial, you should try the exercise using whichever language you feel most comfortable with.
0021 
0022 ## Sample Analysis with ROOT TTreeReader: Track Efficiency and Resolution
0023 
0024 As a sample exercise to become familiar with the simulation output and how to use it in a realistic setting, we will find the tracking eficiency and resolution. We will need to access the reconstructed track information and the truth particle information and we will have to associate the individual tracks and particles to one another. 
0025 
0026 Before we begin, we should create a skeleton macro to handle file I/O. For the `TTreeReader` example, we will use a simple ROOT macro. Using your favorite editor, create a file with a name like `trackAnalysis.C` or something similar and copy in the following code: 
0027 
0028 ```c++
0029 void trackAnalysis(TString infile="path_to_your_simu_file")
0030   {
0031     // Set output file for the histograms
0032     TFile *ofile = TFile::Open("out.hist.root","RECREATE");
0033 
0034     // Analysis code will go here
0035 
0036     ofile->Write(); // Write histograms to file
0037     ofile->Close(); // Close output file
0038   }
0039 ```
0040 
0041 We will need momentum, generator status, and particle species information for the truth particles and momentum information for the reconstructed tracks. The reconstructed track information can be accessed from two different branches: CentralCKFTrackParameters and ReconstructedChargedParticles. We can access these branches using a TTreeReaderArray.
0042 
0043 > ROOT TTreeReaderArrays:
0044 >
0045 >TTreeReader and the associated TTreeReaderArray is a simple interface for reading data from a TTree. The class description and examples can be seen [here](https://root.cern/doc/v630/classTTreeReader.html). To instantiate the reader and access values from a given branch (e.g. the MCParticles branch), one would use the following calls:
0046 >
0047 > ```c++
0048 > // Set up input file chain
0049 > TChain *mychain = new TChain("events");
0050 > mychain->Add(infile);
0051 >
0052 > // Initialize reader
0053 > TTreeReader tree_reader(mychain);
0054 >
0055 > // Access whatever data-members you need
0056 > TTreeReaderArray<int> partGenStat(tree_reader, "MCParticles.generatorStatus");
0057 > TTreeReaderArray<float> partMomX(tree_reader, "MCParticles.momentum.x");
0058 > ...
0059 > ```
0060 >
0061 > The branches and their members can be viewed by opening a file with TBrowser (`new TBrowser()`) from within ROOT. Once you have defined the `TTreeReaderArray` objects for the data-members you want to look at, you can loop over the events and the members within that event:
0062 >
0063 > ```c++
0064 > while(tree_reader.Next()) { // Loop over events
0065 >  for(unsigned int i=0; i<partGenStat.GetSize(); i++) // Loop through particles in the event
0066 >    {
0067 >      int particleStatus = partGenStat[i]; // Access data-members as you would an array
0068 >      float particleXMomentum = partMomX[i]; // partMomX should have same number of entries as partGenStat because they are in the same branch
0069 >      ...
0070 >    }
0071 > }
0072 > ```
0073 > All members of the same branch should have the same number of entries, so it is sufficient to use any member of the branch to set the limit of your loop.
0074 {: .callout}
0075 
0076 We will proceed using the ReconstructedChargedParticles branch as this will give us a chance to practice using associations, copy the following lines into your analysis macro.
0077 
0078 ```c++
0079 // Set up input file chain
0080 TChain *mychain = new TChain("events");
0081 mychain->Add(infile);
0082 
0083 // Initialize reader
0084 TTreeReader tree_reader(mychain);
0085 
0086 // Get Particle Information
0087 TTreeReaderArray<int> partGenStat(tree_reader, "MCParticles.generatorStatus");
0088 TTreeReaderArray<float> partMomX(tree_reader, "MCParticles.momentum.x");
0089 TTreeReaderArray<float> partMomY(tree_reader, "MCParticles.momentum.y");
0090 TTreeReaderArray<float> partMomZ(tree_reader, "MCParticles.momentum.z");
0091 TTreeReaderArray<int> partPdg(tree_reader, "MCParticles.PDG");
0092 
0093 // Get Reconstructed Track Information
0094 TTreeReaderArray<float> trackMomX(tree_reader, "ReconstructedChargedParticles.momentum.x");
0095 TTreeReaderArray<float> trackMomY(tree_reader, "ReconstructedChargedParticles.momentum.y");
0096 TTreeReaderArray<float> trackMomZ(tree_reader, "ReconstructedChargedParticles.momentum.z");
0097 
0098 // Get Associations Between MCParticles and ReconstructedChargedParticles
0099 TTreeReaderArray<unsigned int> recoAssoc(tree_reader, "ReconstructedChargedParticleAssociations.recID");
0100 TTreeReaderArray<unsigned int> simuAssoc(tree_reader, "ReconstructedChargedParticleAssociations.simID");
0101 ```
0102 
0103 The last two lines encode the association between a ReconstructedChargedParticle and a MCParticle where the matching is determined in the [ParticlesWithPID](https://github.com/eic/EICrecon/blob/main/src/algorithms/pid/ParticlesWithPID.cc) algorithm which generates the ReconstructedChargedParticle objects.
0104 
0105 > Compiling ROOT Macros:
0106 > - If you are analysing a large number of events, you may wish to compile your macro to increase throughput. An example of how you can create and compile a root macro is included in the [Exercise Scripts section](https://eic.github.io/tutorial-analysis/exercise_scripts/index.html#compiled-root-scripts)
0107 {: .callout}
0108 
0109 
0110 ### Efficiency Analysis
0111 
0112 > Hint:
0113 > Refer to [the script template](https://eic.github.io/tutorial-analysis/exercise_scripts/index.html#efficiencyanalysisc) if you're having trouble putting things in the right place.
0114 {: .callout}
0115 
0116 Now that we have access to the data we need we will begin constructing our efficiency plots, starting with efficiency as a function of the true particle pseudorapidity. The basic strategy is outlined below:
0117 
0118 1. Loop over all events in the file
0119 2. Within each event, loop over all stable charged particles
0120 3. Identify the ReconstructedChargedParticle (if any) associated with the truth particle we are looking at
0121 4. Create and fill the necessary histograms
0122 
0123 Here is the code to implement these steps:
0124 
0125 ```c++
0126 // Define Histograms
0127 TH1D *partEta = new TH1D("partEta","Eta of Thrown Charged Particles;Eta",100,-5.,5.);
0128 TH1D *matchedPartEta = new TH1D("matchedPartEta","Eta of Thrown Charged Particles That Have Matching Track",100,-5.,5.);
0129 
0130 TH1D *matchedPartTrackDeltaR = new TH1D("matchedPartTrackDeltaR","Delta R Between Matching Thrown and Reconstructed Charged Particle",5000,0.,5.);
0131 
0132 while(tree_reader.Next()) { // Loop over events
0133 
0134   for(unsigned int i=0; i<partGenStat.GetSize(); i++) // Loop over thrown particles
0135     {
0136             if(partGenStat[i] == 1) // Select stable thrown particles
0137               {
0138                 int pdg = TMath::Abs(partPdg[i]);
0139 
0140                 if(pdg == 11 || pdg == 13 || pdg == 211 || pdg == 321 || pdg == 2212) // Look at charged particles (electrons, muons, pions, kaons, protons)
0141                   {
0142                           TVector3 trueMom(partMomX[i],partMomY[i],partMomZ[i]);
0143 
0144                           float trueEta = trueMom.PseudoRapidity();
0145                           float truePhi = trueMom.Phi();
0146             
0147                           partEta->Fill(trueEta);
0148 
0149               // Loop over associations to find matching ReconstructedChargedParticle
0150                           for(unsigned int j=0; j<simuAssoc.GetSize(); j++)
0151                             {
0152                               if(simuAssoc[j] == i) // Find association index matching the index of the thrown particle we are looking at
0153                                 {
0154                                         TVector3 recMom(trackMomX[recoAssoc[j]],trackMomY[recoAssoc[j]],trackMomZ[recoAssoc[j]]); // recoAssoc[j] is the index of the matched ReconstructedChargedParticle
0155 
0156                       // Check the distance between the thrown and reconstructed particle
0157                                         float deltaEta = trueEta - recMom.PseudoRapidity();
0158                                         float deltaPhi = TVector2::Phi_mpi_pi(truePhi - recMom.Phi());
0159                                         float deltaR = TMath::Sqrt(deltaEta*deltaEta + deltaPhi*deltaPhi);
0160 
0161                                         matchedPartTrackDeltaR->Fill(deltaR);
0162 
0163                                         matchedPartEta->Fill(trueEta); // Plot the thrown eta if a matched ReconstructedChargedParticle was found
0164                     }
0165                 }
0166             }            
0167         }
0168     }
0169 }
0170 ```
0171 
0172 We should now have everything we need to find the track efficiency as a function of pseudorapidity. To run the macro and produce an output file containing the histograms we defined, simply type `root -l -q trackAnalysis.C`. After the macro runs, you can open the root file to inspect the histograms. The efficiency can be found by taking the ratio of matchedPartEta over partEta.
0173 
0174 > Question:
0175 > - Do the histogram ranges make sense?
0176 > - We plot the distance between thrown and reconstructed charged partices, does this distribution look reasonable?
0177 > - When filling the matchedPartEta histogram (the numerator in our efficiency), why do we use again the true thrown eta instead of the associated reconstructed eta?
0178 {: .callout}
0179 
0180 > Exercise:
0181 > - Find the efficiency as a function of particle momentum. Are there cuts on any other quantities you should place to get a sensible result?
0182 > - Find the efficiency for some 2-D correlations: momentum vs eta; phi vs eta
0183 > - Plot some kinematic distributions (momentum, eta, etc) for all ReconstructedChargedParticles, not just those that are associated with a thrown particle
0184 {: .challenge}
0185 
0186 ### Resolution Analysis
0187 
0188 > Hint:
0189 > Refer to [the script template](https://eic.github.io/tutorial-analysis/exercise_scripts/index.html#resolutionanalysisc) if you're having trouble putting things in the right place.
0190 {: .callout}
0191 
0192 Next, we will look at track momentum resolution, that is, how well the momentum of the reconstructed track matches that of the thrown particle. We should have all of the "infrastructure" we need in place to do the analysis, we just need to define the appropriate quantities and make the histograms. It only makes sense to define the resolution for tracks and particles which are associated with one another, so we will work within the loop over associations. Define the resolution expression and fill a simple histogram:
0193 
0194 ```c++
0195 TH1D *trackMomentumRes = new TH1D("trackMomentumRes","Track Momentum Resolution",2000,-10.,10.);
0196 ...
0197 // Loop over associations to find matching ReconstructedChargedParticle
0198 for(unsigned int j=0; j<simuAssoc.GetSize(); j++)
0199   {
0200     if(simuAssoc[j] == i) // Find association index matching the index of the thrown particle we are looking at
0201       {
0202         ...
0203         double momRes = (recMom.Mag() - trueMom.Mag())/trueMom.Mag();
0204 
0205         trackMomentumRes->Fill(momRes);
0206       }
0207   }  
0208 ```
0209 
0210 While this plot will give us a sense of what the tracking resolution is, we don't expect the resolution to be constant for all momenta or eta. We can get a more complete picture by plotting the resolution as a function of different kinematic quantities. 
0211 
0212 > Exercise:
0213 > - Make 2-D plots of resolution vs true momentum and vs true pseudorapidity.
0214 > - Break resolution plots down by particle species.
0215 {: .challenge}
0216 
0217 > Question:
0218 > - Will the histogram ranges for each particle species be the same?
0219 > - Could we present the resolution values in a more understandable way?
0220 {: .callout}
0221 
0222 ## Sample Analysis with Python/uproot: Track Efficiency and Resolution
0223 
0224 If you are more familiar with python than you are with C/C++, you might find that using a python based root macro is easier for you. Outlined below are sample blocks of code for creating and running a python based analysis script.
0225 
0226 With python, some tasks become easier, e.g. string manipulation and writing to (non ROOT) files.
0227 
0228 Before we begin, we should create a skeleton macro to handle file I/O. For this example, we will make a simple python script. Using your favorite editor, create a file with a name like `trackAnalysis.py` or something similar and copy in the following code: 
0229 
0230 ```python
0231 #! /usr/bin/python
0232          
0233 #Import relevant packages
0234 import ROOT, math, array                                
0235 from ROOT import TH1F, TH2F, TMath, TTree, TVector3, TVector2
0236 import uproot as up
0237 
0238 # Define and open files
0239 infile="PATH_TO_FILE" 
0240 ofile=ROOT.TFile.Open("TrackAnalysis_OutPy.root", "RECREATE")
0241 
0242 # Open input file and define branches we want to look at with uproot
0243 events_tree = up.open(infile)["events"]
0244 
0245 # Define histograms below
0246 
0247 # Add main analysis loop(s) below
0248 
0249 # Write output histograms to file below                
0250 
0251 # Close files
0252 ofile.Close()                    
0253 ```
0254 Note that we are using the module uproot to access the data here. See [further documentation here](https://masonproffitt.github.io/uproot-tutorial/03-trees/index.html). You may also need some of the other included packages too.
0255 
0256 > We will use uproot a little bit like we use the TTreeReader in the other example. We can define the branches we want and assign them to arrays with uproot.
0257 > We can do this via:
0258 >  ```python
0259 > # Open input file and define branches we want to look at with uproot
0260 > events_tree = up.open(infile)["events"]
0261 > # Get particle information# Get particle information
0262 > partGenStat = events_tree["MCParticles.generatorStatus"].array()
0263 > partMomX = events_tree["MCP articles.momentum.x"].array() 
0264 > partMomY = events_tree["MCParticles.momentum.y"].array()
0265 > partMomZ = events_tree["MCParticles.momentum.z"].array()
0266 > partPdg = events_tree["MCParticles.PDG"].array()
0267 >
0268 > # Get reconstructed track information
0269 > trackMomX = events_tree["ReconstructedChargedParticles.momentum.x"].array()
0270 > trackMomY = events_tree["ReconstructedChargedParticles.momentum.y"].array()
0271 > trackMomZ = events_tree["ReconstructedChargedParticles.momentum.z"].array()
0272 >  ...
0273 > ```
0274 >  We can then access them as an array in a loop -
0275 > ```python
0276 > # Add main analysis loop(s) below
0277 > for i in range(0, len(events_tree)): # Loop over all events
0278 >     for j in range(0, len(partGenStat[i])): # Loop over all thrown particles
0279 >         if partGenStat[i][j] == 1: # Select stable particles
0280 >             pdg = abs(partPdg[i][j]) # Get PDG for each stable particle
0281 >             ...
0282 > ```
0283 > Uproot effectively takes the information in the tree, and turns it into an array. We can then acces and manipulate this array in the same way that we can with any array in python.
0284 >
0285 > Note that if you are using an older version of uproot (v2.x.x), you will need to access the branches slightly differently via -
0286 > ```python
0287 > partGenStat = events_tree.array("MCParticles.generatorStatus")
0288 > ```
0289 {: .callout}
0290 
0291 You can run this file with ``python3 trackAnalysis.py``. It should open your file and create an empty output root file as specified. We will add histograms to this script and fill them in the next step.
0292 
0293 Note that depending upon your setup, ``python trackAnalysis.py`` may work too.
0294 
0295 ### Efficiency Analysis
0296 
0297 > Hint:
0298 > Refer to [the script template](https://eic.github.io/tutorial-analysis/exercise_scripts/index.html#efficiencyanalysispy) if you're having trouble putting things in the right place.
0299 {: .callout}
0300 
0301 As with the ROOT TTreeReader example, we will find the tracking eficiency and resolution. We will need to access the reconstructed track information and the truth particle information and we will have to associate the individual tracks and particles to one another.
0302 
0303 The basic strategy is the same:
0304 
0305 1. Loop over all events in the file
0306 2. Within each event, loop over all stable charged particles
0307 3. Identify the ReconstructedChargedParticle (if any) associated with the truth particle we are looking at
0308 4. Create and fill the necessary histograms
0309 
0310 Here is the sample code to implement these steps:
0311 
0312 ```python
0313 # Get assocations between MCParticles and ReconstructedChargedParticles
0314 recoAssoc = events_tree["ReconstructedChargedParticleAssociations.recID"].array()
0315 simuAssoc = events_tree["ReconstructedChargedParticleAssociations.simID"].array()
0316 
0317 # Define histograms below
0318 partEta = ROOT.TH1D("partEta","Eta of Thrown Charged Particles;Eta",100, -5 ,5 )
0319 matchedPartEta = ROOT.TH1D("matchedPartEta","Eta of Thrown Charged Particles That Have Matching Track", 100, -5 ,5);
0320 matchedPartTrackDeltaR = ROOT.TH1D("matchedPartTrackDeltaR","Delta R Between Matching Thrown and Reconstructed Charge Particle", 5000, 0, 5);
0321 
0322 # Add main analysis loop(s) below
0323 for i in range(0, len(events_tree)): # Loop over all events
0324     for j in range(0, len(partGenStat[i])): # Loop over all thrown particles
0325         if partGenStat[i][j] == 1: # Select stable particles
0326             pdg = abs(partPdg[i][j]) # Get PDG for each stable particle
0327             if(pdg == 11 or pdg == 13 or pdg == 211 or pdg == 321 or pdg == 2212):
0328                 trueMom = ROOT.TVector3(partMomX[i][j], partMomY[i][j], partMomZ[i][j])
0329                 trueEta = trueMom.PseudoRapidity()
0330                 truePhi = trueMom.Phi()
0331                 
0332                 partEta.Fill(trueEta)
0333                 for k in range(0,len(simuAssoc[i])): # Loop over associations to find matching ReconstructedChargedParticle
0334                     if (simuAssoc[i][k] == j):
0335                         recMom = ROOT.TVector3(trackMomX[i][recoAssoc[i][k]], trackMomY[i][recoAssoc[i][k]], trackMomZ[i][recoAssoc[i][k]])
0336                         deltaEta = trueEta - recMom.PseudoRapidity()
0337                         deltaPhi = TVector2. Phi_mpi_pi(truePhi - recMom.Phi())
0338                         deltaR = math.sqrt((deltaEta*deltaEta) + (deltaPhi*deltaPhi))
0339 
0340                         matchedPartEta.Fill(trueEta)
0341                         matchedPartTrackDeltaR.Fill(deltaR)
0342                         
0343 # Write output histograms to file below
0344 partEta.Write()
0345 matchedPartEta.Write()
0346 matchedPartTrackDeltaR.Write()
0347 
0348 # Close files
0349 ofile.Close()
0350 ```
0351 Insert this block of code appropriately. We should now have everything we need to find the track efficiency as a function of pseudorapidity. Run the script with `python3 trackAnalysis.py``. This should produce a root file with a few histograms in place. The efficiency can be found by taking the ratio of matchedPartEta over partEta.
0352 
0353 > Note:
0354 > - More recent simulation files (May 2024 or later) seem to have some issue or conflict when processed via Uproot (issue slicing into arrays) - Investigating further (10/09/24)
0355 {: .callout}
0356 
0357 > Question:
0358 > - Do the hisotgram ranges make sense?
0359 > - We plot the distance between thrown and reconstructed charged partices, does this distribution look reasonable?
0360 > - When filling the matchedPartEta histogram (the numerator in our efficiency), why do we use again the true thrown eta instead of the associated reconstructed eta?
0361 {: .callout}
0362 
0363 > Exercise:
0364 > - Find the efficiency as a function of particle momentum. Are there cuts on any other quantities you should place to get a sensible result?
0365 > - Find the efficiency for some 2-D correlations: momentum vs eta; phi vs eta
0366 > - Plot some kinematic distributions (momentum, eta, etc) for all ReconstructedChargedParticles, not just those that are associated with a thrown particle
0367 {: .challenge}
0368 
0369 ### Resolution Analysis
0370 
0371 > Hint:
0372 > Refer to [the script template](https://eic.github.io/tutorial-analysis/exercise_scripts/index.html#resolutionanalysispy) if you're having trouble putting things in the right place.
0373 {: .callout}
0374 
0375 Next, we will look at track momentum resolution, that is, how well the momentum of the reconstructed track matches that of the thrown particle. We should have all of the "infrastructure" we need in place to do the analysis, we just need to define the appropriate quantities and make the histograms. It only makes sense to define the resolution for tracks and particles which are associated with one another, so we will work within the loop over associations. Define the resolution expression and fill a simple histogram by inserting this block of code appropriately:
0376 
0377 ```python
0378 trackMomentumRes = ROOT.TH1D("trackMomentumRes","Track Momentum Resolution",2000,-10.,10.);
0379 ...
0380                 for k in range(0,len(simuAssoc[i])): # Loop over associations to find matching ReconstructedChargedParticle
0381                     if (simuAssoc[i][k] == j):
0382                         recMom = ROOT.TVector3(trackMomX[i][recoAssoc[i][k]], trackMomY[i][recoAssoc[i][k]], trackMomZ[i][recoAssoc[i][k]])
0383                         momRes = (recMom.Mag() - trueMom.Mag())/trueMom.Mag()
0384 
0385                         trackMomentumRes.Fill(momRes)
0386 ```
0387 
0388 Remember to write this histogram to the output file too! While this plot will give us a sense of what the tracking resolution is, we don't expect the resolution to be constant for all momenta or eta. We can get a more complete picture by plotting the resolution as a function of different kinematic quantities. 
0389 
0390 > Exercise:
0391 > - Make 2-D plots of resolution vs true momentum and vs true pseudorapidity.
0392 > - Break resolution plots down by particle species.
0393 {: .challenge}
0394 
0395 > Question:
0396 > - Will the histogram ranges for each particle species be the same?
0397 > - Could we present the resolution values in a more understandable way?
0398 {: .callout}
0399 
0400 ## ROOT RDataFrames
0401 
0402 Newer versions of root, such as the version in eic-shell, have access to a relatively new class, [RDataFrames](https://root.cern/doc/master/classROOT_1_1RDataFrame.html). These are similar to pythonic data frame style strucutres that you may be familiar with. Some people are moving towards utilising RDataFrames in their analysis. If you are more familiar with working with data frames, you may wish to investigate these further.
0403 
0404 Included below is a quick script from [Simon Gardner](https://github.com/simonge/EIC_Analysis/blob/main/Analysis-Tutorial/EfficiencyAnalysisRDF.C) that utilises RDataFrames to analyse a data file. Copy the following into a new file called `EfficiencyAnalysisRDF.C` -
0405 
0406 ```c++
0407 #include <edm4hep/utils/vector_utils.h>
0408 #include <edm4hep/MCParticle.h>
0409 #include <edm4eic/ReconstructedParticle.h>
0410 #include <ROOT/RDataFrame.hxx>
0411 #include <ROOT/RVec.hxx>
0412 #include <TFile.h>
0413 
0414 // Define aliases for the data types 
0415 using MCP = edm4hep::MCParticleData;
0416 using RecoP = edm4eic::ReconstructedParticleData;
0417 
0418 // Define function to vectorize the edm4hep::utils methods
0419 template <typename T>
0420 auto getEta = [](ROOT::VecOps::RVec<T> momenta) {
0421   return ROOT::VecOps::Map(momenta, [](const T& p) { return edm4hep::utils::eta(p.momentum); });
0422 };
0423 
0424 template <typename T>
0425 auto getPhi = [](ROOT::VecOps::RVec<T> momenta) {
0426   return ROOT::VecOps::Map(momenta, [](const T& p) { return edm4hep::utils::angleAzimuthal(p.momentum); });
0427 };
0428 
0429 // Define the function to perform the efficiency analysis
0430 void EfficiencyAnalysisRDF(TString infile="PATH_TO_FILE"){
0431    
0432   // Set up input file 
0433   ROOT::RDataFrame df("events", infile);
0434 
0435   // Define new dataframe node with additional columns
0436   auto df1 =  df.Define("statusFilter",  "MCParticles.generatorStatus == 1"    )
0437                 .Define("absPDG",        "abs(MCParticles.PDG)"                )
0438                 .Define("pdgFilter",     "absPDG == 11 || absPDG == 13 || absPDG == 211 || absPDG == 321 || absPDG == 2212")
0439                 .Define("particleFilter","statusFilter && pdgFilter"           )
0440                 .Define("filtMCParts",   "MCParticles[particleFilter]"         )
0441                 .Define("assoFilter",    "Take(particleFilter,ReconstructedChargedParticleAssociations.simID)") // Incase any of the associated particles happen to not be charged
0442                 .Define("assoMCParts",   "Take(MCParticles,ReconstructedChargedParticleAssociations.simID)[assoFilter]")
0443                 .Define("assoRecParts",  "Take(ReconstructedChargedParticles,ReconstructedChargedParticleAssociations.recID)[assoFilter]")
0444                 .Define("filtMCEta",     getEta<MCP>   , {"filtMCParts"} )
0445                 .Define("filtMCPhi",     getPhi<MCP>   , {"filtMCParts"} )
0446                 .Define("accoMCEta",     getEta<MCP>   , {"assoMCParts"} )
0447                 .Define("accoMCPhi",     getPhi<MCP>   , {"assoMCParts"} )
0448                 .Define("assoRecEta",    getEta<RecoP> , {"assoRecParts"})
0449                 .Define("assoRecPhi",    getPhi<RecoP> , {"assoRecParts"})
0450                 .Define("deltaR",        "ROOT::VecOps::DeltaR(assoRecEta, accoMCEta, assoRecPhi, accoMCPhi)");
0451 
0452   // Define histograms
0453   auto partEta                = df1.Histo1D({"partEta","Eta of Thrown Charged Particles;Eta",100,-5.,5.},"filtMCEta");
0454   auto matchedPartEta         = df1.Histo1D({"matchedPartEta","Eta of Thrown Charged Particles That Have Matching Track",100,-5.,5.},"accoMCEta");
0455   auto matchedPartTrackDeltaR = df1.Histo1D({"matchedPartTrackDeltaR","Delta R Between Matching Thrown and Reconstructed Charged Particle",5000,0.,5.},"deltaR");
0456 
0457   // Write histograms to file
0458   TFile *ofile = TFile::Open("EfficiencyAnalysis_Out_RDF.root","RECREATE");
0459 
0460   // Booked Define and Histo1D lazy actions are only performed here
0461   partEta->Write();
0462   matchedPartEta->Write();
0463   matchedPartTrackDeltaR->Write();
0464       
0465   ofile->Close(); // Close output file
0466 }
0467 ```
0468 > Note:
0469 > - You will need to run this script with the command `root -l -q EfficiencyAnalysisRDF.C++`, within eic-shell (or somewhere else with the correct EDM4hep/EDM4eic libraries installed).
0470 > - Remember to put in the correct file path.
0471 {: .callout}
0472 
0473 If you like, you can try completing the exercises using this example to start from.
0474 
0475 ## PODIO - Direct Analysis
0476 
0477 If you want to avoid ROOT entirely, you can analyse the PODIO files directly in a variety of ways.
0478 
0479 See [Wouter's example use cases](https://indico.cern.ch/event/1343984/contributions/5908856/attachments/2842958/4970156/2024-04-23%20-%20Examples%20for%20Data%20Model%20Usage.pdf) from 23/04/24. Wouter shows a few ways in which the PODIO file can be accssed and analysed directly.