# AI-enabled WFMS proposal — draft diagrams

Mermaid prototypes to support the "Why us?" section. Finished versions will
be redrawn as hand-authored SVG in the style of `swf-testbed/docs/images/`.

Open this file's Markdown preview (`Ctrl+Shift+V`) to render the diagrams.

---

## Diagram 1 — Three Contexts

The thesis picture: three LLM-integrated systems running today, ordered
left-to-right by increasing LLM autonomy. A shared MCP ecosystem feeds all
three. The top banner carries the 6-month claim.

```mermaid
flowchart TB
    classDef llm fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#000
    classDef human fill:#fff3e0,stroke:#e65100,color:#000
    classDef tool fill:#f1f8e9,stroke:#33691e,color:#000
    classDef delta fill:#fce4ec,stroke:#ad1457,stroke-dasharray:4 3,color:#000

    Thesis["<b>Today</b>: LLMs inform humans ━━ 6-month step ━━▶ <b>Tomorrow</b>: LLMs act within workflows"]:::delta

    subgraph P1["① Real-time bot — Mattermost"]
        direction TB
        U1["ePIC users"]:::human
        B1["AI bot<br/>Haiku · cross-session memory<br/>context harness"]:::llm
        O1["Q&A, diagnostics, on-the-fly analysis"]
        U1 --> B1 --> O1 --> U1
    end

    subgraph P2["② Research orchestrator — corun-ai"]
        direction TB
        U2["Expert evaluators<br/>(production, user learning)"]:::human
        S2["Scheduler<br/>model × sysprompt × MCP set<br/>config compare & annotate"]
        B2["Long-latency worker<br/>Opus / Sonnet / Gemini / Gemma<br/>minutes–tens of minutes"]:::llm
        O2["Deep research entry<br/>(e.g. Perlmutter performance)"]
        U2 --> S2 --> B2 --> O2 --> U2
    end

    subgraph P3["③ Active workflow orchestrator — swf-testbed"]
        direction TB
        U3["Testbed users"]:::human
        B3["LLM orchestrator<br/>launch · run · monitor<br/>assess · summarize"]:::llm
        W3["Hybrid workflow<br/>LLM steps ⇄ deterministic agents<br/>DAQ sim → PanDA workers"]
        O3["Completed run + summary"]
        U3 --> B3 --> W3 --> B3
        W3 --> O3 --> U3
    end

    subgraph MCP["Shared MCP tool ecosystem"]
        direction LR
        IH["<b>In-house</b><br/>AskPanDA · PanDA Monitor · Streaming Workflow"]:::tool
        AD["<b>Adopted</b><br/>Rucio · XRootD · uproot · LXR · GitHub · Zenodo"]:::tool
    end

    Thesis -.-> P1
    Thesis -.-> P2
    Thesis -.-> P3
    P1 --> MCP
    P2 --> MCP
    P3 --> MCP
```

Legend: blue = LLM, orange = human, green = MCP/tool surface, pink-dashed = thesis / 6-month delta.

---

## Diagram 3 — Hybrid Workflow Anatomy

One real swf-testbed streaming run as a pipeline of alternating LLM and
deterministic steps, with the MCP tools each LLM step actually calls.
Human-in-loop gate between ⑨ and ⑩ is where the 6-month scope lands.

```mermaid
flowchart TB
    classDef llm fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#000
    classDef det fill:#eeeeee,stroke:#555,color:#000
    classDef hitl fill:#fff3e0,stroke:#e65100,stroke-dasharray:5 3,color:#000
    classDef mcp fill:#f1f8e9,stroke:#33691e,font-size:11px,color:#000

    U["User prompt<br/>'run a fast-processing test<br/>and summarize results'"]:::hitl

    L1["① Prepare<br/>select config, check prior runs"]:::llm
    L1t["swf_list_workflow_executions<br/>pcs_list_tags · swf_get_system_state"]:::mcp

    L2["② Start testbed"]:::llm
    L2t["swf_start_user_testbed"]:::mcp

    L3["③ Start workflow"]:::llm
    L3t["swf_start_workflow<br/>(stf_count, config, …)"]:::mcp

    D1["④ DAQ simulator<br/>emits STF files"]:::det
    D2["⑤ Data agent<br/>STF registration"]:::det
    D3["⑥ FastMon agent<br/>samples Time Frames"]:::det
    D4["⑦ Fast processing agent<br/>TF slices → PanDA"]:::det
    D5["⑧ PanDA workers<br/>EICrecon reconstruction"]:::det

    L4["⑨ Monitor in-flight<br/>errors, throughput, stragglers"]:::llm
    L4t["swf_list_logs(level='ERROR')<br/>swf_list_workflow_executions<br/>panda_get_activity"]:::mcp

    G1{"human-in-loop<br/>gate<br/>(scope of 6-mo work)"}:::hitl

    L5["⑩ Assess & summarize<br/>narrative run report,<br/>anomaly notes,<br/>comparison to prior runs"]:::llm
    L5t["swf_get_workflow_execution<br/>panda_study_job · lxr_ident"]:::mcp

    O["Run entry + summary<br/>annotated, searchable"]:::hitl

    U --> L1 --> L2 --> L3 --> D1 --> D2 --> D3 --> D4 --> D5 --> L4
    L4 --> G1 --> L5 --> O

    L1 -.- L1t
    L2 -.- L2t
    L3 -.- L3t
    L4 -.- L4t
    L5 -.- L5t
```

Legend: blue = LLM step, grey = deterministic agent, dashed orange = human-in-loop / user edge, green captions = MCP tool calls.
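
As a reading aid, the pipeline above can also be written down as plain data. The Python below is a minimal sketch, not the testbed's implementation: the step labels and MCP tool names are copied from the diagram, while `Step`, `PIPELINE`, `run_pipeline`, and the `approve` callback are hypothetical names introduced here. No MCP call is made; the function only records which steps would run and halts at the gate unless a human approves.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Step:
    label: str
    kind: str  # "llm", "det" (deterministic agent), or "gate" (human-in-loop)
    tools: Tuple[str, ...] = ()  # MCP tools an LLM step calls, per the diagram


# Steps 1-10 plus the gate, in diagram order.
PIPELINE: List[Step] = [
    Step("Prepare", "llm",
         ("swf_list_workflow_executions", "pcs_list_tags", "swf_get_system_state")),
    Step("Start testbed", "llm", ("swf_start_user_testbed",)),
    Step("Start workflow", "llm", ("swf_start_workflow",)),
    Step("DAQ simulator", "det"),
    Step("Data agent", "det"),
    Step("FastMon agent", "det"),
    Step("Fast processing agent", "det"),
    Step("PanDA workers", "det"),
    Step("Monitor in-flight", "llm",
         ("swf_list_logs", "swf_list_workflow_executions", "panda_get_activity")),
    Step("Human-in-loop gate", "gate"),
    Step("Assess & summarize", "llm",
         ("swf_get_workflow_execution", "panda_study_job", "lxr_ident")),
]


def run_pipeline(steps: List[Step], approve: Callable[[Step], bool]) -> List[str]:
    """Return the labels of the steps passed, halting at an unapproved gate."""
    executed: List[str] = []
    for step in steps:
        if step.kind == "gate" and not approve(step):
            break  # a human declined: nothing downstream runs
        executed.append(step.label)
    return executed


# Usage: with approval withheld, the run stops after monitoring.
halted = run_pipeline(PIPELINE, approve=lambda step: False)
print(halted[-1])  # Monitor in-flight
```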

---

## Diagram 2 — MCP Tool Ecosystem

One LLM reaches into the experiment's operational stack through a two-tier
tool set. Counters "everyone has MCP now" by showing depth into production
systems.

```mermaid
flowchart TB
    classDef llm fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#000
    classDef ih fill:#f1f8e9,stroke:#33691e,color:#000
    classDef ad fill:#fff8e1,stroke:#f57f17,color:#000
    classDef sys fill:#fafafa,stroke:#888,color:#444

    LLM["<b>LLM</b><br/>Opus · Sonnet · Haiku · Gemini · Gemma<br/>sysprompt · effort level · context harness"]:::llm

    subgraph IH["In-house — purpose-built on our production WFMS"]
        direction LR
        T1["AskPanDA<br/><i>job diagnostics</i>"]:::ih
        T2["PanDA Monitor MCP<br/><i>operational state</i>"]:::ih
        T3["Streaming Workflow MCP<br/><i>active testbed control</i>"]:::ih
    end

    subgraph AD["3rd-party MCP — 6+ community/standard tools"]
        direction LR
        T4["Rucio MCP"]:::ad
        T5["XRootD MCP"]:::ad
        T6["uproot MCP"]:::ad
        T7["LXR XREF MCP"]:::ad
        T8["GitHub MCP"]:::ad
        T9["Zenodo MCP"]:::ad
    end

    subgraph SYS["Reaches into"]
        direction LR
        S1["PanDA DB<br/>monitor · testbed"]:::sys
        S2["Rucio<br/>data catalogs"]:::sys
        S3["XRootD<br/>remote I/O"]:::sys
        S4["ePIC codebase<br/>(55+ repos)"]:::sys
        S5["Zenodo<br/>official repo"]:::sys
    end

    LLM --> IH
    LLM --> AD
    IH --> SYS
    AD --> SYS
```

---

## Diagram 4 — 6-month Delta (before / after)

Same boxes, one arrow moves, one audit loop added. Makes the project
feel like a bounded increment on an operational system, not a research
leap.

```mermaid
flowchart LR
    classDef llm fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#000
    classDef human fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000
    classDef wfms fill:#f3e5f5,stroke:#6a1b9a,stroke-width:2px,color:#000
    classDef audit fill:#fffde7,stroke:#f9a825,stroke-dasharray:4 3,color:#000
    classDef delta fill:#fce4ec,stroke:#ad1457,color:#000

    subgraph TODAY["<b>Today</b> — LLM informs, human decides"]
        direction TB
        T_LLM["LLM"]:::llm
        T_MCP["MCP tools<br/><i>reads · analyzes</i>"]:::llm
        T_HUM["<b>Human decides</b>"]:::human
        T_WFMS["WFMS acts"]:::wfms
        T_LLM --> T_MCP --> T_HUM --> T_WFMS
    end

    subgraph NEXT["<b>Proposed (6 months)</b> — LLM decides on pre-defined classes"]
        direction TB
        N_LLM["LLM"]:::llm
        N_MCP["MCP tools<br/><i>reads · analyzes · <b>decides</b></i>"]:::llm
        N_WFMS["WFMS acts"]:::wfms
        N_AUD["HITL audit trail<br/><i>async human review</i>"]:::audit
        N_LLM --> N_MCP --> N_WFMS
        N_WFMS -.-> N_AUD
        N_AUD -.-> N_LLM
    end

    DELTA["<b>Delta:</b><br/>• 'human decides' → 'LLM decides'<br/>• HITL audit loop added<br/>• Scope gated by decision-class allowlist"]:::delta

    TODAY -.-> DELTA -.-> NEXT
```
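
The "decision-class allowlist" and "HITL audit trail" named in the delta box can be made concrete with a short sketch. The Python below is an illustration under stated assumptions, not the proposed implementation: the decision-class names, `AuditRecord`, and `gate_decision` are all hypothetical. It shows the intended shape only: an LLM decision is acted on only if its class is on the allowlist, and every decision, executed or not, is appended to a log for asynchronous human review.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import FrozenSet, List

# Hypothetical decision classes; the real allowlist is not defined in this document.
ALLOWED_DECISION_CLASSES: FrozenSet[str] = frozenset(
    {"retry_failed_job", "pause_workflow"}
)


@dataclass(frozen=True)
class AuditRecord:
    timestamp: str       # UTC, ISO 8601
    decision_class: str
    action: str
    executed: bool       # False means the decision falls back to a human


def gate_decision(
    decision_class: str,
    action: str,
    audit_log: List[AuditRecord],
    allowlist: FrozenSet[str] = ALLOWED_DECISION_CLASSES,
) -> bool:
    """Allow the action only for allowlisted classes; always record it."""
    executed = decision_class in allowlist
    audit_log.append(
        AuditRecord(
            timestamp=datetime.now(timezone.utc).isoformat(),
            decision_class=decision_class,
            action=action,
            executed=executed,
        )
    )
    return executed


# Usage: one allowlisted decision, one that must go back to a human.
audit_log: List[AuditRecord] = []
gate_decision("retry_failed_job", "resubmit failed job", audit_log)
gate_decision("delete_dataset", "remove output dataset", audit_log)
print([record.executed for record in audit_log])  # [True, False]
```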

---

## Diagram 5 — PanDA Scale Provenance

The "why 6 months is plausible" anchor: we're layering on a production
WFMS with two decades of operational history (in operation since 2005),
not starting from zero.

```mermaid
flowchart BT
    classDef app fill:#e3f2fd,stroke:#1565c0,stroke-width:2px,color:#000
    classDef ai fill:#f1f8e9,stroke:#33691e,stroke-width:2px,color:#000
    classDef panda fill:#fff3e0,stroke:#e65100,stroke-width:2px,color:#000

    subgraph L1["<b>Foundation</b> — PanDA production WFMS (operational since 2005)"]
        direction LR
        P1["ATLAS @ LHC<br/>O(million) jobs/day<br/>200+ institutions"]:::panda
        P2["PanDA monitor<br/>deep drill-down<br/>refined 15+ years"]:::panda
        P3["ePIC production<br/>(monthly campaigns, OSG, HPC)"]:::panda
        P4["ePIC streaming<br/>workflow testbed<br/>(this team, 2025+)"]:::panda
    end

    subgraph L2["AI instrumentation today — this team (2024–)"]
        direction LR
        I1["AskPanDA MCP"]:::ai
        I2["PanDA Monitor MCP"]:::ai
        I3["VectorDB RAG"]:::ai
        I4["Streaming Workflow MCP"]:::ai
        I5["3rd-party MCP (6+)"]:::ai
    end

    subgraph L3["<b>New application layer</b> — LLM-driven orchestration (proposed, 6 months)"]
        direction LR
        A1["LLM workflow<br/>orchestrator"]:::app
        A2["Hybrid workflows<br/>LLM + deterministic"]:::app
        A3["Harnessed autonomous<br/>LLM action"]:::app
        A4["LLM research assistant<br/><i>evolution of Mattermost<br/>bot + codoc-ai</i>"]:::app
    end

    L1 --> L2
    L2 --> L3
```

---

## Diagram 6 — corun-ai Research Loop

Shows corun-ai as an orchestrated research system, not a chatbot.
Config-compare in annotation threads is the R&D-testbed feature.

```mermaid
sequenceDiagram
    autonumber
    participant U as Expert evaluator
    participant S as Scheduler
    participant W as Worker LLM
    participant M as MCP tools
    participant E as Research entry

    U->>S: submit research prompt<br/>+ config (model · sysprompt · MCP set)
    S->>W: spawn worker with config
    loop deep analysis — minutes to tens of minutes
        W->>M: tool call (PanDA / LXR / Rucio / ...)
        M-->>W: results
        W->>W: reason · refine · iterate
    end
    W-->>S: completed analysis
    S->>E: write research entry
    E-->>U: notify + surface result
    U->>E: annotate · thread comments
    Note over U,E: config variants compared<br/>side-by-side in threads —<br/>an R&D testbed, not a product
```
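
The first two messages carry a config made of a model, a system prompt, and an MCP tool set, and the closing note says variants are compared side by side. One way to enumerate such variants is a plain Cartesian product, sketched below in Python. This is an assumption about how a comparison matrix could be built, not a description of the corun-ai scheduler: `expand_configs` and the system-prompt labels are hypothetical, while the model and tool names are taken from the diagrams.

```python
from itertools import product
from typing import Dict, List, Sequence

Config = Dict[str, object]


def expand_configs(
    models: Sequence[str],
    sysprompts: Sequence[str],
    mcp_sets: Sequence[Sequence[str]],
) -> List[Config]:
    """Build one config per (model, sysprompt, MCP set) combination."""
    return [
        {"model": model, "sysprompt": sysprompt, "mcp_set": tuple(mcp_set)}
        for model, sysprompt, mcp_set in product(models, sysprompts, mcp_sets)
    ]


# Usage: 2 models x 2 system prompts x 2 tool sets gives 8 variants to compare.
configs = expand_configs(
    models=["Opus", "Sonnet"],
    sysprompts=["terse", "detailed"],  # hypothetical labels
    mcp_sets=[("PanDA",), ("PanDA", "LXR", "Rucio")],
)
print(len(configs))  # 8
```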

---

Fill in concrete numbers (PanDA jobs/day, testbed run count, corun-ai prompt count, etc.) before these go into proposal figures.