Back to home page

EIC code displayed by LXR

 
 

    


Warning, /tutorial-jana2/_episodes/04-factory.md is written in an unsupported language. File is not indexed.

0001 ---
0002 title: "Creating or modifying a JANA factory in order to implement a reconstruction algorithm"
0003 teaching: 15
0004 exercises: 20
0005 questions:
0006 - "How to write a reconstruction algorithm in EICrecon?"
0007 objectives:
0008 - "Learn how to create a new factory in EICrecon that supplies a reconstruction algorithm for all to use."
0009 - "Understand the directory structure for where the factory should be placed in the source tree."
0010 - "Understand how to use a generic algorithm in a JANA factory."
0011 keypoints:
0012 - "Create a factory for reconstructing single subdetector data or for global reconstruction."
0013 ---
0014 > Note: The following episode presents a somewhat outdated view, and some commands may not function.
0015 > If you are only interested in analyzing already-reconstructed data, then there is no requirement
0016 > to use a plugin as described below; the output ROOT file works too.
0017 {: .callout}
0018 
0019 ## Introduction
0020 
0021 Now that you've learned about JANA plugins and JEventProcessors, let's talk about JFactories. JFactories are another essential JANA component just like JEventProcessors and JEventSources. While JEventProcessors are used for _aggregating_ results from each event into a structured output such as a histogram or a file, JFactories are used for computing those results in an organized way. 
0022 
0023 ### When do I use a JFactory?
0024 
0025 - If you have an input file and need to read data model objects from it, use a JEventSource.
0026 - If you have an output file (or histogram) and wish to write data model objects to it, use a JEventProcessor.
0027 - If you have some data model objects and wish to produce a new data model object, use a JFactory.
0028 
0029 
0030 ### Why should I prefer writing a JFactory?
0031 
0032 1. They make your code reusable. Different people can use your results later without having to understand the specifics of what you did.
0033 
0034 2. If you are consuming some data which doesn't look right to you, JFactories make it extremely easy to pinpoint exactly which code produced this data.
0035 
0036 3. EICrecon needs to run multithreaded, and using JFactories can help steer you away from introducing thorny parallelism bugs.
0037 
0038 4. You can simply ask for the results you need and the JFactory will provide it. If nobody needs the results from the JFactory, it won't be run. If the results were already in the input file, it won't be run. If there are multiple consumers, the results are only computed once and then cached. If the JFactory relies on results from other JFactories, it will call them transparently and recursively. 
0039 
0040 
0041 ### When do I create my own plugin?
0042 
0043 - If you are doing a one-off prototype, it's fine to just use a ROOT macro.
0044 - If you are writing code you'll probably return to, we recommend putting the code in a standalone (i.e. outside of the EICrecon source tree) plugin.
0045 - If you are writing code other people will probably want to run, we recommend adding your plugin to the EICrecon source tree.
0046 - If you are writing a JFactory, we recommend adding it to the EICrecon source tree, either to an existing plugin or to a new one.
0047 
0048 
0049 ## Algorithms vs Factories 
0050 
0051 In general, a Factory is a programming pattern for constructing objects in an abstract way. Oftentimes, the Factory is calling an algorithm under the hood. 
0052 This algorithm may be very generic. For instance, we may have a Factory that produces Cluster objects for a barrel calorimeter, and it calls a clustering algorithm 
0053 that doesn't care at all about barrel calorimeters, just the position and energy of energy of each CalorimeterHit object. Perhaps multiple factories for creating clusters 
0054 for completely different detectors are all using the same algorithm. 
0055 
0056 Note that Gaudi provides an abstraction called "Algorithm" which is essentially its own version of a JFactory. In EICrecon, we have been separating out _generic algorithms_ from the old Gaudi and new JANA code so that these can be developed and tested independently. To see an example of how a generic algorithm is being implemented, look at these examples:
0057 
0058 src/detectors/EEMC/RawCalorimeterHit_factory_EcalEndcapNRawHits.h
0059 src/algorithms/calorimetry/CalorimeterHitDigi.h
0060 src/algorithms/calorimetry/CalorimeterHitDigi.cc
0061 
0062 Using generic algorithms makes things slightly more complex. However, the generic algorithms can be recycled for use in multiple detector systems which adds some simplification.
0063 
0064 
0065 ## Parallelism considerations
0066 
0067 
0068 JEventProcessors observe the entire _event stream_, and require a _critical section_ where only one thread is allowed to modify a shared resource (such as a histogram) at any time. 
0069 JFactories, on the other hand, only observe a single event at a time, and work on each event independently. Each worker thread is given an independent event with its own set of factories. This means that for a given JFactory instance, there will be only one thread working on one event at any time. You get the benefits of multithreading _without_ having to make each JFactory thread-safe.
0070 
0071 
0072 You can write JFactories in an almost-functional style, but you can also cache some data on the JFactory that will stick around from event-to-event. This is useful for things like conditions and geometry data, where for performance reasons you don't want to be doing a deep lookup on every event. Instead, you can write callbacks such as `BeginRun()`, where you can update your cached values when the run number changes. 
0073 
0074 
0075 Note that just because the JFactory _can_ be called in parallel doesn't mean it always will. If you call event->Get() from inside `JEventProcessor::ProcessSequential`, in particular, the factory will run single-threaded and slow everything down. However, if you call it using `Prefetch` instead, it will run in parallel and you may get a speed boost.
0076 
0077 
0078 ## How do I use an existing JFactory? ##
0079 
0080 Using an existing JFactory is extremely easy! Any time you are someplace where you have access to a `JEvent` object, do this:
0081 
0082 
0083 ~~~ c++
0084 
0085 auto clusters = event->Get<edm4eic::Cluster>("EcalEndcapNIslandClusters");
0086 
0087 for (auto c : clusters) {
0088   // ... do something with a cluster
0089 }
0090 
0091 ~~~
0092 
0093 As you can see, it doesn't matter whether the `Cluster` objects were calculated from some simpler objects, or were simply loaded from a file. This is a very powerful concept. 
0094 
0095 One thing we might want to do is to swap one factory for another, possibly even at runtime. This is easy to do if you just make the factory tag be a parameter:
0096 
0097 
0098 ~~~ c++
0099 
0100 std::string my_cluster_source = "EcalEndcapNIslandClusters";  // Make this be a parameter
0101 app->SetDefaultParameter("MyPlugin:MyAnalysis:my_cluster_source", my_cluster_source, "Cluster source for MyAnalysis");
0102 auto clusters = event->Get<edm4eic::Cluster>(my_cluster_source);
0103 ~~~
0104 
0105 ## How do I create a new JFactory?
0106 
0107 We are going to add a new JFactory inside EICrecon. 
0108 
0109 `src/detectors/EEMC/Cluster_factory_EcalEndcapNIslandClusters.h`:
0110 ~~~ c++
0111 #pragma once
0112 
0113 #include <edm4eic/Cluster.h>
0114 #include <JANA/JFactoryT.h>
0115 #include <services/log/Log_service.h>
0116 
0117 class Cluster_factory_EcalEndcapNIslandClusters : public JFactoryT<edm4eic::Cluster> {
0118 public:
0119 
0120     Cluster_factory_EcalEndcapNIslandClusters(); // Constructor
0121 
0122     void Init() override;  
0123     // Gets called exactly once at the beginning of the JFactory's life
0124 
0125     void ChangeRun(const std::shared_ptr<const JEvent> &event) override {};
0126     // Gets called on events where the run number has changed (before Process())
0127 
0128     void Process(const std::shared_ptr<const JEvent> &event) override;
0129     // Gets called on every event
0130 
0131     void Finish() override {};
0132     // Gets called exactly once at the end of the JFactory's life
0133 
0134 private:
0135     float m_scaleFactor;
0136     std::shared_ptr<spdlog::logger> m_log;
0137 
0138 };
0139 
0140 ~~~
0141 
0142 `src/detectors/EEMC/Cluster_factory_EcalEndcapNIslandClusters.cc`:
0143 ~~~ c++
0144 #include "Cluster_factory_EcalEndcapNIslandClusters.h"
0145 
0146 #include <edm4eic/ProtoCluster.h>
0147 #include <JANA/JEvent.h>
0148 
0149 
0150 Cluster_factory_EcalEndcapNIslandClusters::Cluster_factory_EcalEndcapNIslandClusters() {
0151 
0152     SetTag("EcalEndcapNIslandClusters");
0153 }
0154 
0155 
0156 void Cluster_factory_EcalEndcapNIslandClusters::Init() {
0157     auto app = GetApplication();
0158 
0159     // This is an example of how to declare a configuration parameter that
0160     // can be set at run time. e.g. with -PEEMC:EcalEndcapNIslandClusters:scaleFactor=0.97
0161     m_scaleFactor =0.98;
0162     app->SetDefaultParameter("EEMC:EcalEndcapNIslandClusters:scaleFactor", m_scaleFactor, "Energy scale factor");
0163 
0164     // This is how you access shared resources using the JService interface
0165     m_log = app->GetService<Log_service>()->logger("EcalEndcapNIslandClusters");
0166 }
0167 
0168 
0169 void Cluster_factory_EcalEndcapNIslandClusters::Process(const std::shared_ptr<const JEvent> &event) {
0170 
0171     m_log->info("Processing event {}", event->GetEventNumber());
0172 
0173     // Grab inputs
0174     auto protoclusters = event->Get<edm4eic::ProtoCluster>("EcalEndcapNIslandProtoClusters");
0175 
0176     // Loop over protoclusters and turn each into a cluster
0177     std::vector<edm4eic::Cluster*> outputClusters;
0178     for( auto proto : protoclusters ) {
0179 
0180         // ======================
0181         // Algorithm goes here!
0182         // ======================
0183 
0184         auto cluster = new edm4eic::Cluster(
0185             0, // type 
0186             energy * m_scaleFactor,
0187             sqrt(energyError_squared),
0188             time,
0189             timeError,
0190             proto->hits_size(),
0191             position,
0192             edm4eic::Cov3f(), // positionError,
0193             0.0,              // intrinsicTheta,
0194             0.0,              // intrinsicPhi,
0195             edm4eic::Cov2f()  // intrinsicDirectionError
0196             );
0197 
0198         outputClusters.push_back( cluster );
0199     }
0200 
0201     // Hand ownership of algorithm objects over to JANA
0202     Set(outputClusters);
0203 }
0204 
0205 
0206 ~~~
0207 
0208 We can now fill in the algorithm with anything we like!
0209 
0210 
0211 ~~~ c++
0212         // Grab inputs
0213         auto protoclusters = event->Get<edm4eic::ProtoCluster>("EcalEndcapNIslandProtoClusters");
0214 
0215         // Loop over protoclusters and turn each into a cluster
0216         std::vector<edm4eic::Cluster*> outputClusters;
0217         for( auto proto : protoclusters ) {
0218 
0219             // Fill cumulative values by looping over all hits in proto cluster
0220             float energy = 0;
0221             double energyError_squared = 0.0;
0222             float time = 1.0E8;
0223             float timeError;
0224             edm4hep::Vector3f position;
0225             double sum_weights = 0.0;
0226             for( uint32_t ihit=0; ihit<proto->hits_size() ; ihit++){
0227                 auto const &hit = proto->getHits(ihit);
0228                 auto weight = proto->getWeights(ihit);
0229                 energy += hit.getEnergy();
0230                 energyError_squared += std::pow(hit.getEnergyError(), 2.0);
0231                 if( hit.getTime() < time ){
0232                     time = hit.getTime();            // use earliest time
0233                     timeError = hit.getTimeError();  // use error of earliest time
0234                 }
0235                 auto &p = hit.getPosition();
0236                 position.x += p.x*weight;
0237                 position.y += p.y*weight;
0238                 position.z += p.z*weight;
0239                 sum_weights += weight;
0240             }
0241             
0242             // Normalize position
0243             position.x /= sum_weights;
0244             position.y /= sum_weights;
0245             position.z /= sum_weights;
0246 
0247             // Create a cluster object from values accumulated from hits above
0248             auto cluster = new edm4eic::Cluster(
0249                 0, // type (?))
0250                 energy * m_scaleFactor,
0251                 sqrt(energyError_squared),
0252                 time,
0253                 timeError,
0254                 proto->hits_size(),
0255                 position,
0256 
0257                 // Not sure how to calculate these last few
0258                 edm4eic::Cov3f(), // positionError,
0259                 0.0, // intrinsicTheta,
0260                 0.0, // intrinsicPhi,
0261                 edm4eic::Cov2f() // intrinsicDirectionError
0262                 );
0263 
0264             outputClusters.push_back( cluster );
0265         }
0266 
0267         // Hand ownership of algorithm objects over to JANA
0268         Set(outputClusters);
0269 
0270 ~~~
0271 
0272 
0273 You can't pass JANA a JFactory directly (because it needs to create an arbitrary number of them on the fly). Instead you register a `JFactoryGenerator` object: 
0274 
0275 `src/detectors/EEMC/EEMC.cc`
0276 ~~~ c++
0277 
0278 // In your plugin's init
0279 
0280 #include <JANA/JFactoryGenerator.h>
0281 // ...
0282 #include "Cluster_factory_EcalEndcapNIslandClusters.h"
0283 
0284 extern "C" {
0285     void InitPlugin(JApplication *app) {
0286         InitJANAPlugin(app);
0287         // ...
0288 
0289         app->Add(new JFactoryGeneratorT<Cluster_factory_EcalEndcapNIslandClusters>());
0290      }
0291 
0292 ~~~
0293 
0294 Finally, we go ahead and trigger the factory (remember, factories won't do anything unless activated by a JEventProcessor). You can open the 
0295 
0296 ~~~ bash
0297 eicrecon in.root -Ppodio:output_file=out.root -Ppodio:output_collections=EcalEndcapNIslandClusters -Pjana:nevents=10
0298 ~~~
0299 
0300 
0301 Your exercise is to get this JFactory working! You can tweak the algorithm, add log messages, add additional config parameters, etc.
0302 
0303 
0304 
0305 {% include links.md %}