Tutorial 4: Reconstruction Code Part I
======================================

- [TUTORIAL RECORDING (mirror 1)](https://drive.google.com/file/d/1AtZWrKtmR5mimze8WsnN1wVJ72U_Ewdu/view?usp=sharing)
  - [CHAT](https://drive.google.com/file/d/1YHoNTOO6zhZ0K9nBs2oYEI3M0Z-5rJVe/view?usp=sharing)
- [TUTORIAL RECORDING (mirror 2)](https://duke.zoom.us/rec/share/eJqawysxW9cb376hZE5Yth6hSyfWh_gvTGezb1ggnngZf84fm8oBW63So1dIr0lU.FDHkKc0ETS8TCYFO)
- [Return to Tutorial Landing Page](README.md)

## Introduction

We now turn to how the reconstruction code works. The reconstruction framework is [EICrecon](https://github.com/eic/EICrecon), which is based on [JANA2](https://github.com/JeffersonLab/JANA2). This version of the tutorial assumes that the reconstruction algorithms are part of EICrecon (there is a plan to separate the algorithms from the framework).

For an introduction to EICrecon and JANA2, see [the general tutorial](https://indico.bnl.gov/event/16833/).

### Data Model

To exchange data between the various ePIC software repositories, all of them must agree on a data model, which specifies how the data are stored. We use the following data models, called Event Data Models (EDMs):

- [EDM4hep](https://github.com/key4hep/EDM4hep): a general data model used by many HEP experiments, including ePIC
- [EDM4eic](https://github.com/eic/EDM4eic): an extension of EDM4hep, with datatypes and components specific to ePIC

Each of these repositories specifies its datatypes and components in a YAML file. Before looking at these files, here are some definitions you should know:

- **component**: an object that is used by, or is part of, a **datatype**, such as a 3D vector or a covariance matrix
- **datatype**: the definition of some object, such as a sensor hit or a PID hypothesis; a **datatype** may have:
  - **Members**: scalar objects, typically integers, floating-point numbers, or **components**
  - **VectorMembers**: vector objects, each a list of scalars with a common type
  - **OneToOneRelations**: a one-to-one connection between objects; for example, a relation between a PID hypothesis and the charged particle it applies to
  - **OneToManyRelations**: a one-to-many connection between objects; for example, a relation between a reconstructed particle and the corresponding set of PID hypotheses
- **Associations**: these are special **datatypes**, typically used to relate MC-truth objects to reconstructed objects; their only members are typically **OneToOneRelations** and **OneToManyRelations**. For example, an association between an MC particle and a reconstructed particle contains two **OneToOneRelations**: one to the MC particle and the other to the reconstructed particle

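To make the vocabulary above concrete, here is a minimal C++ sketch of how the pieces relate to one another. Every type below (`Vector3f`, `ChargedParticle`, `PIDHypothesis`, `RecoParticle`, `TruthParticle`, `TruthRecoAssociation`) and the helper `bestHypothesis` are hypothetical names invented for illustration. They are not the PODIO-generated EDM4hep or EDM4eic classes, whose interface differs; relations are modeled here as plain pointers only to keep the sketch self-contained.

```cpp
#include <vector>

// Hypothetical stand-ins for illustration; NOT the PODIO-generated classes.

// component: a small building block reused inside datatypes
struct Vector3f {
  float x = 0, y = 0, z = 0;
};

// datatype whose Members are a scalar and a component
struct ChargedParticle {
  float charge = 0;   // Member (scalar)
  Vector3f momentum;  // Member (component)
};

// datatype with Members, a VectorMember, and a OneToOneRelation
struct PIDHypothesis {
  int pdg = 0;                                // Member
  float likelihood = 0;                       // Member
  std::vector<float> parameters;              // VectorMember
  const ChargedParticle* particle = nullptr;  // OneToOneRelation
};

// datatype with a OneToManyRelation
struct RecoParticle {
  Vector3f momentum;                             // Member (component)
  std::vector<const PIDHypothesis*> hypotheses;  // OneToManyRelation
};

// MC-truth datatype
struct TruthParticle {
  int pdg = 0;
  Vector3f momentum;
};

// Association: relates one MC-truth object to one reconstructed object
struct TruthRecoAssociation {
  float weight = 1.0f;
  const TruthParticle* sim = nullptr;  // OneToOneRelation
  const RecoParticle* rec = nullptr;   // OneToOneRelation
};

// Follow the OneToManyRelation and return the hypothesis with the largest
// likelihood, or nullptr if the particle has no hypotheses.
inline const PIDHypothesis* bestHypothesis(const RecoParticle& p) {
  const PIDHypothesis* best = nullptr;
  for (const PIDHypothesis* h : p.hypotheses) {
    if (best == nullptr || h->likelihood > best->likelihood) best = h;
  }
  return best;
}
```

With this layout, comparing a reconstructed particle to its MC truth amounts to following the two relations of an association, for example comparing `assoc.sim->pdg` with `bestHypothesis(*assoc.rec)->pdg`.
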
Now take a look at the YAML files. Here are links to them, along with the **datatypes** that are relevant for the dRICH:

- [edm4hep.yaml](https://github.com/key4hep/EDM4hep/blob/master/edm4hep.yaml)
  - `edm4hep::SimTrackerHit`: an MC-truth dRICH sensor hit (recall that the dRICH is an (optical) "tracker" in DD4hep)
  - `edm4hep::MCParticle`: an MC-truth particle
  - `edm4hep::ParticleID`: a PID hypothesis result; these are the user-level PID objects that the dRICH PID algorithms produce
  - `edm4hep::ReconstructedParticle`: a reconstructed particle (we do not yet use this in ePIC; instead we use a modified version in EDM4eic)
- [edm4eic.yaml](https://github.com/eic/EDM4eic/blob/main/edm4eic.yaml)
  - `edm4eic::RawTrackerHit`: a digitized dRICH sensor hit
  - `edm4eic::TrackSegment`: a segment of a full track, consisting of a list of `edm4eic::TrackPoint` component objects
  - `edm4eic::CherenkovParticleID`: expert-level PID object for Cherenkov detectors; it contains a list of hypotheses, `edm4eic::CherenkovParticleIDHypothesis` component objects, one for each mass hypothesis
  - `edm4eic::ReconstructedParticle`: a reconstructed particle
  - `edm4eic::MCRecoParticleAssociation`: an association between a reconstructed particle and the corresponding MC-truth particle
  - `edm4eic::MCRecoTrackerHitAssociation`: an association between a digitized hit and the corresponding set of MC-truth hits

Because it is written in YAML, this specification of the data model is independent of any programming language. Code generation may then be used to produce an API for the data model in a preferred language. In ePIC, we use [PODIO](https://github.com/AIDASoft/podio) to generate C++ classes from these YAML files. You can find full documentation linked in the EDM GitHub repositories; alternatively, browse the C++ header files in `eic-shell`, found in:
```bash
/opt/local/include/edm4hep
/opt/local/include/edm4eic
```

See also [Thomas Madlener's CHEP 2023 talk](https://indico.jlab.org/event/459/contributions/11578/) for another resource on PODIO.

One last definition before discussing algorithms: for each **datatype**, you will find a **Collection**, which is a set of objects of that particular **datatype**. These **Collections** are the primary objects passed between reconstruction algorithms and other ePIC software packages.

### Algorithms

An algorithm is a transformation from one set of EDM Collections to another; for example, a digitizer transforms a collection of MC-truth sensor hits into a collection of raw, digitized hits (and, in our dRICH usage, an additional collection of associations between the raw hits and the MC hits).

IMPORTANT: these algorithms are supposed to be _as independent as possible_ from the reconstruction framework; their primary dependence should be on the data model. Eventually we may fully decouple the algorithms from EICrecon, but for now they live in the EICrecon repository. See [Sylvester's CHEP 2023 talk](https://indico.jlab.org/event/459/contributions/11419/) for more details.

Algorithms should be:
- Configurable, allowing external configuration to tune for specific subsystems or use cases
- Focused, not trying to do too many things
- Shareable, since some algorithms may be useful for multiple subsystems
- Modular, independent from other algorithms and from EICrecon

An algorithm is typically a class with the following methods:
- `AlgorithmInit`: run once, before all events
- `AlgorithmProcess`: run on each event, returning a set of output collections given a set of input collections

An additional class (or `struct`) holds the set of configuration parameters for an algorithm. The algorithm (or its base class) typically owns an instance of its configuration class.

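The pattern described above (a configuration `struct`, an `AlgorithmInit` method, and an `AlgorithmProcess` method) can be sketched as a toy digitizer. This is an illustration under simplifying assumptions, not EICrecon's `PhotoMultiplierHitDigi`: the hit types, the `ToyDigi` class, its `applyConfig` method, and the `threshold` and `gain` parameters are all hypothetical, collections are modeled as `std::vector`, and each surviving MC hit produces exactly one raw hit.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-ins; these are NOT the EDM or EICrecon classes.
struct SimHit { std::uint64_t cellID = 0; double edep = 0; };
struct RawHit { std::uint64_t cellID = 0; std::int32_t adc = 0; };
struct HitAssociation { std::size_t rawIndex = 0; std::size_t simIndex = 0; };

// Toy "Collections": one container per datatype
using SimHitCollection = std::vector<SimHit>;
using RawHitCollection = std::vector<RawHit>;
using HitAssociationCollection = std::vector<HitAssociation>;

// Configuration parameters live in their own struct (assumed parameters)
struct ToyDigiConfig {
  double threshold = 0.5;  // minimum energy deposit needed to keep a hit
  double gain = 100.0;     // ADC counts per unit of deposited energy
};

// Output collections of one AlgorithmProcess call
struct ToyDigiOutput {
  RawHitCollection rawHits;
  HitAssociationCollection associations;
};

class ToyDigi {
public:
  // Store an externally supplied configuration
  void applyConfig(const ToyDigiConfig& cfg) { m_cfg = cfg; }

  // Run once, before all events
  void AlgorithmInit() { m_initialized = true; }

  // Run on each event: input collection in, output collections out
  ToyDigiOutput AlgorithmProcess(const SimHitCollection& simHits) const {
    ToyDigiOutput out;
    for (std::size_t i = 0; i < simHits.size(); ++i) {
      const SimHit& sim = simHits[i];
      if (sim.edep < m_cfg.threshold) continue;  // below threshold: dropped
      RawHit raw;
      raw.cellID = sim.cellID;
      raw.adc = static_cast<std::int32_t>(sim.edep * m_cfg.gain);
      out.rawHits.push_back(raw);
      HitAssociation assoc;  // remember which MC hit produced this raw hit
      assoc.rawIndex = out.rawHits.size() - 1;
      assoc.simIndex = i;
      out.associations.push_back(assoc);
    }
    return out;
  }

  bool initialized() const { return m_initialized; }

private:
  ToyDigiConfig m_cfg;  // the algorithm owns an instance of its configuration
  bool m_initialized = false;
};
```

Note that nothing in this class refers to the reconstruction framework: it depends only on the (toy) data model and its own configuration, which is the kind of independence the guidelines above ask for.
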
We are now ready to discuss [the dRICH reconstruction flowchart of algorithms](https://github.com/eic/EICrecon/blob/main/src/detectors/DRICH/README.md).

See also [slides on dRICH algorithms](https://indico.bnl.gov/event/19683/attachments/48044/82398/slides.pdf) for a general idea of what each algorithm does.

## EICrecon

### Definitions

[EICrecon](https://github.com/eic/EICrecon) is used to run the reconstruction algorithms. It is there that the connections between collections and algorithms are defined. To proceed, we need a few more definitions:

- **Factory**: a factory uses an algorithm to produce a set of output collections, given a set of input collections
  - this is effectively the EICrecon-dependent part of an algorithm, though it is a _separate_ class, since we prefer the algorithm _itself_ to be EICrecon-independent
  - interfaces algorithm configuration with the user (`eicrecon` command)
  - initializes an algorithm, given the configuration
  - handles the input and output data of an algorithm, running the algorithm's `AlgorithmProcess` method
- **Service**: globally common features, including:
  - access to the detector geometry
    - NOTE: the RICH detectors have an extended geometry service, called `richgeo`, which serves the dRICH geometry in the form of ACTS surfaces for track propagation and IRT optical surfaces for the IRT PID, along with additional geometry-related features
  - logging and log levels
  - file I/O
- **Plugin**:
  - runs the factories
  - effectively expresses the "wiring" between algorithms and collections

Here is a visual representation:

![rec-eicrecon](img/rec-eicrecon.png)

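As a rough illustration of how the roles defined above fit together, here is a toy C++ sketch. None of these classes are JANA2 or EICrecon classes: `ToyThreshold`, `ToyParameterService`, `ToyThresholdFactory`, `ToyPlugin`, the `EventStore` map, and the `"<output>:threshold"` parameter naming are all hypothetical, and the real framework is considerably more elaborate. The sketch only shows the division of labor: the algorithm knows nothing about the framework, the factory configures and runs it, a service supplies shared information, and the plugin wires collection names to factories.

```cpp
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-ins; NOT JANA2 or EICrecon classes.
using Collection = std::vector<int>;                   // toy Collection
using EventStore = std::map<std::string, Collection>;  // collections by name

// Framework-independent algorithm with its configuration struct
struct ToyThresholdConfig { int threshold = 0; };
class ToyThreshold {
public:
  void applyConfig(const ToyThresholdConfig& cfg) { m_cfg = cfg; }
  void AlgorithmInit() {}
  Collection AlgorithmProcess(const Collection& in) const {
    Collection out;
    for (int v : in) {
      if (v >= m_cfg.threshold) out.push_back(v);
    }
    return out;
  }
private:
  ToyThresholdConfig m_cfg;
};

// Service: a globally shared feature, here a table of user parameters
struct ToyParameterService {
  std::map<std::string, int> params;
  int get(const std::string& key, int fallback) const {
    std::map<std::string, int>::const_iterator it = params.find(key);
    return it == params.end() ? fallback : it->second;
  }
};

// Factory: the framework-facing wrapper around one algorithm
class ToyThresholdFactory {
public:
  ToyThresholdFactory(std::string input, std::string output)
      : m_input(std::move(input)), m_output(std::move(output)) {}

  // Interface the configuration with the user, then initialize the algorithm
  void Init(const ToyParameterService& svc) {
    ToyThresholdConfig cfg;
    cfg.threshold = svc.get(m_output + ":threshold", cfg.threshold);
    m_algo.applyConfig(cfg);
    m_algo.AlgorithmInit();
  }

  // Fetch the input collection, run the algorithm, store the output collection
  void Process(EventStore& event) const {
    Collection result = m_algo.AlgorithmProcess(event.at(m_input));
    event[m_output] = result;
  }

private:
  std::string m_input, m_output;
  ToyThreshold m_algo;
};

// Plugin: expresses the wiring and runs the factories (here simply in order)
struct ToyPlugin {
  std::vector<ToyThresholdFactory> factories;
  void Init(const ToyParameterService& svc) {
    for (ToyThresholdFactory& f : factories) f.Init(svc);
  }
  void ProcessEvent(EventStore& event) const {
    for (const ToyThresholdFactory& f : factories) f.Process(event);
  }
};
```

In this sketch, the wiring is just the list of `(input, output)` collection names given to each factory; chaining two factories so that the output name of the first is the input name of the second mimics how one algorithm's collection feeds the next.
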
### Code Organization

The part of the source code tree relevant for the dRICH, at the time of writing, is given below. The algorithm and factory names also appear in the dRICH algorithm flowchart; here we show the full file tree so that you can more easily find the relevant files.
```
src
├── algorithms  // EICrecon-independent algorithms (cf. factories in global/ below)
│   │
│   ├── digi
│   │   ├── PhotoMultiplierHitDigi.cc  // digitizer
│   │   ├── PhotoMultiplierHitDigi.h
│   │   └── PhotoMultiplierHitDigiConfig.h
│   │
│   ├── tracking
│   │   ├── TrackPropagation.cc  // propagate tracks to surfaces
│   │   └── TrackPropagation.h
│   │
│   └── pid
│       │
│       ├── MergeTracks.cc  // combine propagated track segments
│       ├── MergeTracks.h
│       │
│       ├── IrtCherenkovParticleID.cc  // run the underlying Indirect Ray Tracing (IRT) algorithm
│       ├── IrtCherenkovParticleID.h
│       ├── IrtCherenkovParticleIDConfig.h
│       │
│       ├── MergeParticleID.cc  // combine ParticleID objects
│       ├── MergeParticleID.h
│       ├── MergeParticleIDConfig.h
│       │
│       ├── ParticlesWithPID.cc  // link reconstructed particles to ParticleID objects
│       ├── ParticlesWithPID.h
│       ├── ParticlesWithPIDConfig.h
│       │
│       ├── ConvertParticleID.h  // conversions between different ParticleID datatypes
│       └── Tools.h  // common methods and constants
│
│
├── detectors  // plugins for each detector subsystem
│   └── DRICH
│       ├── DRICH.cc  // DRICH plugin: "wiring" of dRICH reconstruction algorithms, factories, and collections
│       └── README.md  // primary documentation for dRICH reconstruction
│
│
├── global  // EICrecon factories (each corresponds to an algorithm above)
│   │
│   ├── digi
│   │   ├── PhotoMultiplierHitDigi_factory.cc
│   │   └── PhotoMultiplierHitDigi_factory.h
│   │
│   └── pid
│       ├── pid.cc  // the PID plugin (uses DRICH plugin results, and eventually will use other PID subsystem results too)
│       │
│       ├── IrtCherenkovParticleID_factory.cc
│       ├── IrtCherenkovParticleID_factory.h
│       │
│       ├── MergeCherenkovParticleID_factory.cc
│       ├── MergeCherenkovParticleID_factory.h
│       │
│       ├── MergeTrack_factory.cc
│       ├── MergeTrack_factory.h
│       │
│       ├── ParticlesWithPID_factory.cc
│       ├── ParticlesWithPID_factory.h
│       │
│       ├── RichTrackConfig.h  // TODO: this should be converted to an algorithm configuration...
│       ├── RichTrack_factory.cc
│       └── RichTrack_factory.h
│
│
├── services  // EICrecon services
│   └── geometry
│       └── richgeo  // RICH geometry bindings
│           │
│           ├── richgeo.cc  // plugin and service implementation
│           ├── RichGeo_service.cc
│           ├── RichGeo_service.h
│           │
│           ├── ActsGeo.cc  // bindings to ACTS, for track propagation
│           ├── ActsGeo.h
│           │
│           ├── IrtGeo.cc  // bindings to IRT, for PID optics
│           ├── IrtGeo.h
│           ├── IrtGeoDRICH.cc
│           ├── IrtGeoDRICH.h
│           ├── IrtGeoPFRICH.cc
│           ├── IrtGeoPFRICH.h
│           │
│           ├── ReadoutGeo.cc  // information about the readout geometry and pixels
│           ├── ReadoutGeo.h
│           │
│           └── RichGeo.h  // common objects and functions
│
│
└── tests
    └── algorithms_test  // unit tests
        ├── pid_MergeTracks.cc
        └── pid_MergeParticleID.cc
```

We will now give a tour of the code; the remainder of this tutorial is contained in the tutorial recording, linked at the top of this page.