eic-opticks/docs/performance-and-debugging.md

0001 # Performance and debugging
0002
0003 This guide covers the existing performance-study workflow and the
0004 `optiphy/ana/photon_history_summary.py` analysis script.
0005
0006 ## Performance studies
0007
0008 To quantify the speed-up achieved by Simphony compared to Geant4, the
0009 repository provides Python tooling that runs the same Geant4 simulation with
0010 and without tracking optical photons in G4. The difference between those runs
0011 approximates the time required to simulate photons in Geant4, while the same
0012 photons are simulated on the GPU with Simphony and the GPU simulation time is
0013 saved.
0014
0015 ```shell
0016 mkdir -p /tmp/out/dev
0017 mkdir -p /tmp/out/rel
0018
0019 docker build -t simphony:perf-dev --target=develop
0020 docker run --rm -t -v /tmp/out:/tmp/out simphony:perf-dev run-performance -o /tmp/out/dev
0021
0022 docker build -t simphony:perf-rel --target=release
0023 docker run --rm -t -v /tmp/out:/tmp/out simphony:perf-rel run-performance -o /tmp/out/rel
0024 ```
0025
0026 ## Debug analysis with `optiphy/ana/photon_history_summary.py`
0027
0028 The script analyzes GPU optical photon simulation output to debug where
0029 photons went and why: which were detected, absorbed, scattered, or trapped
0030 bouncing forever, and whether the physics, such as wavelength shifting and
0031 energy conservation, worked correctly. Without it, the simulation is a black
0032 box that only reports a hit count.
0033
0034 ### Prerequisites
0035
0036 The simulation must be run with `OPTICKS_EVENT_MODE` set so that output arrays
0037 are saved to disk. The default mode, `Minimal`, only gathers hits into memory
0038 and does not write `.npy` files.
0039
0040 ```bash
0041 export OPTICKS_EVENT_MODE=HitPhoton   # saves photon, hit, inphoton, record, seq
0042 ```
0043
0044 Valid modes for this script: `HitPhoton`, `DebugLite`, `DebugHeavy`.
0045
0046 ### Running a simulation with output saving
0047
0048 ```bash
0049 OPTICKS_EVENT_MODE=HitPhoton GPUPhotonSourceMinimal -g tests/geom/wls_test.gdml -c wls_test -m tests/run.mac -s 42
0050 ```
0051
0052 ### Output file location
0053
0054 Opticks writes `.npy` arrays to:
0055
0056 ```text
0057 $TMP/GEOM/$GEOM/<ExecutableName>/ALL0_no_opticks_event_name/A000/
0058 ```
0059
0060 | Variable | Default | Override |
0061 |----------|---------|----------|
0062 | `$TMP` | `/tmp/$USER/opticks` | `export TMP=/path/to/tmp` |
0063 | `$GEOM` | `GEOM` | `export GEOM=mygeom` |
0064
0065 For example, with defaults:
0066
0067 ```text
0068 /tmp/$USER/opticks/GEOM/GEOM/GPUPhotonSourceMinimal/ALL0_no_opticks_event_name/A000/
0069 |-- photon.npy      # (N, 4, 4) float32 - final state of all photons
0070 |-- hit.npy         # (H, 4, 4) float32 - detected photons only
0071 |-- inphoton.npy    # (N, 4, 4) float32 - input photons before simulation
0072 |-- record.npy      # (N, 32, 4, 4) float32 - step-by-step history (up to 32 steps)
0073 |-- seq.npy         # (N, 2, 2) uint64 - compressed step sequence per photon
0074 |-- genstep.npy     # generation step parameters
0075 |-- domain.npy      # domain compression parameters
0076 ```
0077
0078 ### Running the analysis
0079
0080 ```bash
0081 # Basic summary tables:
0082 python optiphy/ana/photon_history_summary.py <event_folder>
0083
0084 # Auto-resolves A000 subfolder:
0085 python optiphy/ana/photon_history_summary.py /tmp/$USER/opticks/GEOM/GEOM/GPUPhotonSourceMinimal/ALL0_no_opticks_event_name
0086
0087 # Show step-by-step trace for specific photons:
0088 python optiphy/ana/photon_history_summary.py <path> --trace 0,227,235
0089
0090 # Show all non-detected (lost) photons with full traces:
0091 python optiphy/ana/photon_history_summary.py <path> --lost
0092
0093 # Filter by terminal flag:
0094 python optiphy/ana/photon_history_summary.py <path> --flag BULK_ABSORB
0095
0096 # Show per-photon detail for first 20 photons:
0097 python optiphy/ana/photon_history_summary.py <path> --detail 20
0098 ```
0099
0100 ### Output tables
0101
0102 The script prints six summary tables by default: photon outcomes by terminal
0103 flag with hit rate and MaxBounce truncation count, cumulative flagmask history
0104 distribution, wavelength statistics with energy conservation check,
0105 position/time stats, ranked step sequence histories decoded from `seq.npy`,
0106 and step count distribution flagging truncated photons. The `--lost`,
0107 `--trace`, and `--flag` options drill into specific photons with full
0108 step-by-step position, wavelength, and flag traces for debugging exactly where
0109 and why a photon died.
0110
0111 | Table | Description |
0112 |-------|-------------|
0113 | **Photon Outcomes** | Counts by terminal flag, such as `SURFACE_DETECT` and `BULK_ABSORB`, with boundary indices, hitmask, and MaxBounce truncation count |
0114 | **Photon Histories** | Unique cumulative flagmask combinations, such as `TO|RE|BT|SD` |
0115 | **Wavelength Analysis** | Input vs output wavelengths, shift count, energy conservation check |
0116 | **Position / Time** | Radius and time statistics |
0117 | **Sequence Histories** | Top N step-by-step sequences from `seq.npy`, such as `TO RE BT SD` |
0118 | **Step Counts** | Distribution of steps per photon, flagging truncated ones at the maximum |
0119
0120 ### Photon data layout (`sphoton.h`)
0121
0122 Each photon is a `(4, 4)` float32 matrix: four quads of four floats.
0123 `photon.npy` holds the final state, `hit.npy` the detected subset,
0124 `inphoton.npy` the initial state, and `record.npy` stores the full
0125 step-by-step history, up to 32 bounces per photon using the same quad layout
0126 per step.
0127
0128 | Quad | .x | .y | .z | .w |
0129 |------|----|----|----|----|
0130 | q0 | pos.x | pos.y | pos.z | time |
0131 | q1 | mom.x | mom.y | mom.z | iindex |
0132 | q2 | pol.x | pol.y | pol.z | wavelength (nm) |
0133 | q3 | orient_boundary_flag | identity | index | flagmask |
0134
0135 ### q3 bit packing
0136
0137 `q3` is stored as float32 but interpreted as uint32 via `.view(np.uint32)`.
0138 This is where the physics outcome lives.
0139
0140 `q3.x` (`orient_boundary_flag`) packs three fields into 32 bits:
0141
0142 ```text
0143 bit 31          : orient sign (1 = inward-facing surface normal)
0144 bits 30-16      : boundary index (which pair of materials the photon is at)
0145 bits 15-0       : terminal flag (what happened last)
0146 ```
0147
0148 Python extraction:
0149
0150 ```python
0151 q3    = photon[:, 3, :].view(np.uint32)
0152 flag  = q3[:, 0] & 0xFFFF             # terminal flag
0153 bnd   = (q3[:, 0] >> 16) & 0x7FFF     # boundary index
0154 ```
0155
0156 `q3.w` (`flagmask`) is the cumulative OR of every flag set during the
0157 photon's lifetime. Each call to `set_flag(f)` does both `flag = f` and
0158 `flagmask |= f`. A single integer therefore captures the full history. For
0159 example, `0x0854` = `TO|RE|SD|BT` means torch generation, WLS re-emission,
0160 boundary transmit, and surface detect.
0161
0162 ### Sequence history encoding (`sseq.h`)
0163
0164 `seq.npy` shape is `(N, 2, 2)` uint64. Each photon gets two pairs of uint64:
0165 `seqhis[2]` for flag history and `seqbnd[2]` for boundary history.
0166
0167 Each uint64 holds 16 slots of 4-bit nibbles, for a total of 32 slots across
0168 the pair. Flag nibbles store the find-first-set value of the flag bit.
0169 `TORCH` (bit 2) stores as nibble 3, and `SURFACE_DETECT` (bit 6) stores as 7.
0170
0171 ```python
0172 seqhis = seq[:, 0, :]   # two uint64 per photon
0173 nibble = (seqhis[0] >> (4 * slot)) & 0xF   # for slot 0-15
0174 flag   = 1 << (nibble - 1)                  # reconstruct flag
0175 ```
0176
0177 This gives a compact chronological per-step history in 16 bytes per photon.
0178
0179 ### Hit determination and MaxBounce
0180
0181 A photon is a hit when `(flagmask & hitmask) == hitmask`. The default hitmask
0182 is `SD` (`SURFACE_DETECT = 0x40`), but for PMT efficiency tagging it may be
0183 `EC` (`EFFICIENCY_COLLECT = 0x2000`). The script reads the actual hitmask from
0184 `NPFold_meta.txt` in the event folder.
0185
0186 Photons that exhaust the bounce limit, `MaxBounce` by default 31 and therefore
0187 32 steps, receive no special flag. The propagation loop simply exits and the
0188 photon keeps whatever terminal flag it had at its last step. These are
0189 typically photons trapped by total internal reflection. The script detects
0190 them by checking `step_count == max_steps` and reports the count in the
0191 outcome table.
0192
0193 ### Flag definitions (`OpticksPhoton.h`)
0194
0195 Flags are a power-of-two enum where each GPU physics process gets one bit:
0196
0197 | Flag | Hex | Abbrev | Description |
0198 |------|-----|--------|-------------|
0199 | CERENKOV | 0x0001 | CK | Cerenkov generation |
0200 | SCINTILLATION | 0x0002 | SI | Scintillation generation |
0201 | TORCH | 0x0004 | TO | Torch source |
0202 | BULK_ABSORB | 0x0008 | AB | Absorbed in bulk material |
0203 | BULK_REEMIT | 0x0010 | RE | Re-emitted, scintillation or WLS |
0204 | BULK_SCATTER | 0x0020 | SC | Rayleigh scattered |
0205 | SURFACE_DETECT | 0x0040 | SD | Detected at surface |
0206 | SURFACE_ABSORB | 0x0080 | SA | Absorbed at surface |
0207 | SURFACE_DREFLECT | 0x0100 | DR | Diffuse reflection at surface |
0208 | SURFACE_SREFLECT | 0x0200 | SR | Specular reflection at surface |
0209 | BOUNDARY_REFLECT | 0x0400 | BR | Fresnel reflection at boundary |
0210 | BOUNDARY_TRANSMIT | 0x0800 | BT | Transmitted through boundary |
0211 | NAN_ABORT | 0x1000 | NA | Aborted due to NaN, usually geometry error |
0212 | EFFICIENCY_COLLECT | 0x2000 | EC | Collected by PMT efficiency |
0213 | EFFICIENCY_CULL | 0x4000 | EL | Culled by PMT efficiency |
0214 | MISS | 0x8000 | MI | Missed all geometry |