# sphenixprod
Production Toolchain for the sPHENIX experiment

Originally based on https://github.com/klendathu2k/slurp with the goal of streamlining and scaling to keep O(100k) farm nodes occupied.

## Installation
All dependencies should be covered by `requirements.txt`:
```sh
pip install -r requirements.txt
```

## Chunking Support

For large run lists, processing all runs in a single pass can make submission time-consuming. The `--chunk-size` parameter processes runs in smaller chunks, enabling faster feedback and more incremental progress.

### Usage

```bash
# Process all runs at once (default behavior)
create_submission.py --config config.yaml --rulename RULE --runs 1000 2000

# Process runs in chunks of 50
create_submission.py --config config.yaml --rulename RULE --runs 1000 2000 --chunk-size 50

# Process runs from a runlist file in chunks of 100
create_submission.py --config config.yaml --rulename RULE --runlist runs.txt --chunk-size 100
```

### How It Works

- Each chunk goes through the complete pipeline: matching → file creation → DB updates → optional submission
- Runs are processed newest-first within each chunk
- With `--andgo`, jobs are submitted after each chunk completes
- Default: `--chunk-size 0` processes all runs at once (backward compatible)
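
The chunking behavior above can be sketched roughly as follows. This is an illustrative sketch, not the actual implementation; the function name `chunked` and the assumption that runs are plain integers sorted by run number are mine:

```python
def chunked(runs, chunk_size):
    """Yield runs in chunks of chunk_size, newest (highest run number) first.

    A chunk_size of 0 yields everything as one chunk,
    matching the backward-compatible default.
    """
    ordered = sorted(runs, reverse=True)  # newest-first ordering
    if chunk_size <= 0:
        yield ordered
        return
    for i in range(0, len(ordered), chunk_size):
        yield ordered[i:i + chunk_size]

# Each chunk would then go through the full pipeline:
# matching -> file creation -> DB updates -> optional submission.
for chunk in chunked([1000, 1001, 1002, 1003, 1004], chunk_size=2):
    print(chunk)  # e.g. [1004, 1003], then [1002, 1001], then [1000]
```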

### Benefits

1. **Faster Time to First Submission**: Start submitting jobs sooner rather than waiting for all runs to be processed
2. **Better Resource Management**: Spread processing over time to avoid overwhelming resources
3. **Incremental Progress**: See results from earlier chunks while later chunks are still processing
4. **Easier Debugging**: Smaller chunks make it easier to isolate and fix issues
0043