tutorial-jana2/_episodes/04-factory.md

0001 ---
0002 title: "Creating or modifying a JANA factory in order to implement a reconstruction algorithm"
0003 teaching: 15
0004 exercises: 20
0005 questions:
0006 - "How to write a reconstruction algorithm in EICrecon?"
0007 objectives:
0008 - "Learn how to create a new factory in EICrecon that supplies a reconstruction algorithm for all to use."
0009 - "Understand the directory structure for where the factory should be placed in the source tree."
0010 - "Understand how to use a generic algorithm in a JANA factory."
0011 keypoints:
0012 - "Create a factory for reconstructing single subdetector data or for global reconstruction."
0013 ---
0014 > Note: The following episode presents a somewhat outdated view, and some commands may not function.
0015 > If you are only interested in analyzing already-reconstructed data, then there is no requirement
0016 > to use a plugin as described below; the output ROOT file works too.
0017 {: .callout}
0018
0019 ## Introduction
0020
0021 Now that you've learned about JANA plugins and JEventProcessors, let's talk about JFactories. JFactories are another essential JANA component just like JEventProcessors and JEventSources. While JEventProcessors are used for _aggregating_ results from each event into a structured output such as a histogram or a file, JFactories are used for computing those results in an organized way.
0022
0023 ### When do I use a JFactory?
0024
0025 - If you have an input file and need to read data model objects from it, use a JEventSource.
0026 - If you have an output file (or histogram) and wish to write data model objects to it, use a JEventProcessor.
0027 - If you have some data model objects and wish to produce a new data model object, use a JFactory.
0028
0029
0030 ### Why should I prefer writing a JFactory?
0031
0032 1. They make your code reusable. Different people can use your results later without having to understand the specifics of what you did.
0033
0034 2. If you are consuming some data which doesn't look right to you, JFactories make it extremely easy to pinpoint exactly which code produced this data.
0035
0036 3. EICrecon needs to run multithreaded, and using JFactories can help steer you away from introducing thorny parallelism bugs.
0037
0038 4. You can simply ask for the results you need and the JFactory will provide it. If nobody needs the results from the JFactory, it won't be run. If the results were already in the input file, it won't be run. If there are multiple consumers, the results are only computed once and then cached. If the JFactory relies on results from other JFactories, it will call them transparently and recursively.
0039
0040
0041 ### When do I create my own plugin?
0042
0043 - If you are doing a one-off prototype, it's fine to just use a ROOT macro.
0044 - If you are writing code you'll probably return to, we recommend putting the code in a standalone (i.e. outside of the EICrecon source tree) plugin.
0045 - If you are writing code other people will probably want to run, we recommend adding your plugin to the EICrecon source tree.
0046 - If you are writing a JFactory, we recommend adding it to the EICrecon source tree, either to an existing plugin or to a new one.
0047
0048
0049 ## Algorithms vs Factories
0050
0051 In general, a Factory is a programming pattern for constructing objects in an abstract way. Oftentimes, the Factory is calling an algorithm under the hood.
0052 This algorithm may be very generic. For instance, we may have a Factory that produces Cluster objects for a barrel calorimeter, and it calls a clustering algorithm
0053 that doesn't care at all about barrel calorimeters, just the position and energy of energy of each CalorimeterHit object. Perhaps multiple factories for creating clusters
0054 for completely different detectors are all using the same algorithm.
0055
0056 Note that Gaudi provides an abstraction called "Algorithm" which is essentially its own version of a JFactory. In EICrecon, we have been separating out _generic algorithms_ from the old Gaudi and new JANA code so that these can be developed and tested independently. To see an example of how a generic algorithm is being implemented, look at these examples:
0057
0058 src/detectors/EEMC/RawCalorimeterHit_factory_EcalEndcapNRawHits.h
0059 src/algorithms/calorimetry/CalorimeterHitDigi.h
0060 src/algorithms/calorimetry/CalorimeterHitDigi.cc
0061
0062 Using generic algorithms makes things slightly more complex. However, the generic algorithms can be recycled for use in multiple detector systems which adds some simplification.
0063
0064
0065 ## Parallelism considerations
0066
0067
0068 JEventProcessors observe the entire _event stream_, and require a _critical section_ where only one thread is allowed to modify a shared resource (such as a histogram) at any time.
0069 JFactories, on the other hand, only observe a single event at a time, and work on each event independently. Each worker thread is given an independent event with its own set of factories. This means that for a given JFactory instance, there will be only one thread working on one event at any time. You get the benefits of multithreading _without_ having to make each JFactory thread-safe.
0070
0071
0072 You can write JFactories in an almost-functional style, but you can also cache some data on the JFactory that will stick around from event-to-event. This is useful for things like conditions and geometry data, where for performance reasons you don't want to be doing a deep lookup on every event. Instead, you can write callbacks such as `BeginRun()`, where you can update your cached values when the run number changes.
0073
0074
0075 Note that just because the JFactory _can_ be called in parallel doesn't mean it always will. If you call event->Get() from inside `JEventProcessor::ProcessSequential`, in particular, the factory will run single-threaded and slow everything down. However, if you call it using `Prefetch` instead, it will run in parallel and you may get a speed boost.
0076
0077
0078 ## How do I use an existing JFactory? ##
0079
0080 Using an existing JFactory is extremely easy! Any time you are someplace where you have access to a `JEvent` object, do this:
0081
0082
0083 ~~~ c++
0084
0085 auto clusters = event->Get<edm4eic::Cluster>("EcalEndcapNIslandClusters");
0086
0087 for (auto c : clusters) {
0088   // ... do something with a cluster
0089 }
0090
0091 ~~~
0092
0093 As you can see, it doesn't matter whether the `Cluster` objects were calculated from some simpler objects, or were simply loaded from a file. This is a very powerful concept.
0094
0095 One thing we might want to do is to swap one factory for another, possibly even at runtime. This is easy to do if you just make the factory tag be a parameter:
0096
0097
0098 ~~~ c++
0099
0100 std::string my_cluster_source = "EcalEndcapNIslandClusters";  // Make this be a parameter
0101 app->SetDefaultParameter("MyPlugin:MyAnalysis:my_cluster_source", my_cluster_source, "Cluster source for MyAnalysis");
0102 auto clusters = event->Get<edm4eic::Cluster>(my_cluster_source);
0103 ~~~
0104
0105 ## How do I create a new JFactory?
0106
0107 We are going to add a new JFactory inside EICrecon.
0108
0109 `src/detectors/EEMC/Cluster_factory_EcalEndcapNIslandClusters.h`:
0110 ~~~ c++
0111 #pragma once
0112
0113 #include <edm4eic/Cluster.h>
0114 #include <JANA/JFactoryT.h>
0115 #include <services/log/Log_service.h>
0116
0117 class Cluster_factory_EcalEndcapNIslandClusters : public JFactoryT<edm4eic::Cluster> {
0118 public:
0119
0120     Cluster_factory_EcalEndcapNIslandClusters(); // Constructor
0121
0122     void Init() override;
0123     // Gets called exactly once at the beginning of the JFactory's life
0124
0125     void ChangeRun(const std::shared_ptr<const JEvent> &event) override {};
0126     // Gets called on events where the run number has changed (before Process())
0127
0128     void Process(const std::shared_ptr<const JEvent> &event) override;
0129     // Gets called on every event
0130
0131     void Finish() override {};
0132     // Gets called exactly once at the end of the JFactory's life
0133
0134 private:
0135     float m_scaleFactor;
0136     std::shared_ptr<spdlog::logger> m_log;
0137
0138 };
0139
0140 ~~~
0141
0142 `src/detectors/EEMC/Cluster_factory_EcalEndcapNIslandClusters.cc`:
0143 ~~~ c++
0144 #include "Cluster_factory_EcalEndcapNIslandClusters.h"
0145
0146 #include <edm4eic/ProtoCluster.h>
0147 #include <JANA/JEvent.h>
0148
0149
0150 Cluster_factory_EcalEndcapNIslandClusters::Cluster_factory_EcalEndcapNIslandClusters() {
0151
0152     SetTag("EcalEndcapNIslandClusters");
0153 }
0154
0155
0156 void Cluster_factory_EcalEndcapNIslandClusters::Init() {
0157     auto app = GetApplication();
0158
0159     // This is an example of how to declare a configuration parameter that
0160     // can be set at run time. e.g. with -PEEMC:EcalEndcapNIslandClusters:scaleFactor=0.97
0161     m_scaleFactor =0.98;
0162     app->SetDefaultParameter("EEMC:EcalEndcapNIslandClusters:scaleFactor", m_scaleFactor, "Energy scale factor");
0163
0164     // This is how you access shared resources using the JService interface
0165     m_log = app->GetService<Log_service>()->logger("EcalEndcapNIslandClusters");
0166 }
0167
0168
0169 void Cluster_factory_EcalEndcapNIslandClusters::Process(const std::shared_ptr<const JEvent> &event) {
0170
0171     m_log->info("Processing event {}", event->GetEventNumber());
0172
0173     // Grab inputs
0174     auto protoclusters = event->Get<edm4eic::ProtoCluster>("EcalEndcapNIslandProtoClusters");
0175
0176     // Loop over protoclusters and turn each into a cluster
0177     std::vector<edm4eic::Cluster*> outputClusters;
0178     for( auto proto : protoclusters ) {
0179
0180         // ======================
0181         // Algorithm goes here!
0182         // ======================
0183
0184         auto cluster = new edm4eic::Cluster(
0185             0, // type
0186             energy * m_scaleFactor,
0187             sqrt(energyError_squared),
0188             time,
0189             timeError,
0190             proto->hits_size(),
0191             position,
0192             edm4eic::Cov3f(), // positionError,
0193             0.0,              // intrinsicTheta,
0194             0.0,              // intrinsicPhi,
0195             edm4eic::Cov2f()  // intrinsicDirectionError
0196             );
0197
0198         outputClusters.push_back( cluster );
0199     }
0200
0201     // Hand ownership of algorithm objects over to JANA
0202     Set(outputClusters);
0203 }
0204
0205
0206 ~~~
0207
0208 We can now fill in the algorithm with anything we like!
0209
0210
0211 ~~~ c++
0212         // Grab inputs
0213         auto protoclusters = event->Get<edm4eic::ProtoCluster>("EcalEndcapNIslandProtoClusters");
0214
0215         // Loop over protoclusters and turn each into a cluster
0216         std::vector<edm4eic::Cluster*> outputClusters;
0217         for( auto proto : protoclusters ) {
0218
0219             // Fill cumulative values by looping over all hits in proto cluster
0220             float energy = 0;
0221             double energyError_squared = 0.0;
0222             float time = 1.0E8;
0223             float timeError;
0224             edm4hep::Vector3f position;
0225             double sum_weights = 0.0;
0226             for( uint32_t ihit=0; ihit<proto->hits_size() ; ihit++){
0227                 auto const &hit = proto->getHits(ihit);
0228                 auto weight = proto->getWeights(ihit);
0229                 energy += hit.getEnergy();
0230                 energyError_squared += std::pow(hit.getEnergyError(), 2.0);
0231                 if( hit.getTime() < time ){
0232                     time = hit.getTime();            // use earliest time
0233                     timeError = hit.getTimeError();  // use error of earliest time
0234                 }
0235                 auto &p = hit.getPosition();
0236                 position.x += p.x*weight;
0237                 position.y += p.y*weight;
0238                 position.z += p.z*weight;
0239                 sum_weights += weight;
0240             }
0241
0242             // Normalize position
0243             position.x /= sum_weights;
0244             position.y /= sum_weights;
0245             position.z /= sum_weights;
0246
0247             // Create a cluster object from values accumulated from hits above
0248             auto cluster = new edm4eic::Cluster(
0249                 0, // type (?))
0250                 energy * m_scaleFactor,
0251                 sqrt(energyError_squared),
0252                 time,
0253                 timeError,
0254                 proto->hits_size(),
0255                 position,
0256
0257                 // Not sure how to calculate these last few
0258                 edm4eic::Cov3f(), // positionError,
0259                 0.0, // intrinsicTheta,
0260                 0.0, // intrinsicPhi,
0261                 edm4eic::Cov2f() // intrinsicDirectionError
0262                 );
0263
0264             outputClusters.push_back( cluster );
0265         }
0266
0267         // Hand ownership of algorithm objects over to JANA
0268         Set(outputClusters);
0269
0270 ~~~
0271
0272
0273 You can't pass JANA a JFactory directly (because it needs to create an arbitrary number of them on the fly). Instead you register a `JFactoryGenerator` object:
0274
0275 `src/detectors/EEMC/EEMC.cc`
0276 ~~~ c++
0277
0278 // In your plugin's init
0279
0280 #include <JANA/JFactoryGenerator.h>
0281 // ...
0282 #include "Cluster_factory_EcalEndcapNIslandClusters.h"
0283
0284 extern "C" {
0285     void InitPlugin(JApplication *app) {
0286         InitJANAPlugin(app);
0287         // ...
0288
0289         app->Add(new JFactoryGeneratorT<Cluster_factory_EcalEndcapNIslandClusters>());
0290      }
0291
0292 ~~~
0293
0294 Finally, we go ahead and trigger the factory (remember, factories won't do anything unless activated by a JEventProcessor). You can open the
0295
0296 ~~~ bash
0297 eicrecon in.root -Ppodio:output_file=out.root -Ppodio:output_collections=EcalEndcapNIslandClusters -Pjana:nevents=10
0298 ~~~
0299
0300
0301 Your exercise is to get this JFactory working! You can tweak the algorithm, add log messages, add additional config parameters, etc.
0302
0303
0304
0305 {% include links.md %}