# Architecture and Design Choices

This document records the reasoning behind key architectural and design decisions
for the `swf-testbed` project. Its goal is to provide context for new
contributors and for future architectural reviews.

## Shared Code Strategy: Dedicated Package (`swf-common-lib`)

- **The Choice:** All code intended for use by more than one component will
  reside in a dedicated, versioned, and installable Python package named
  `swf-common-lib`.

- **Rationale:** This approach was chosen to prevent code duplication and
  divergence across the various `swf-` repositories. It establishes a single
  source of truth for common utilities, ensuring that bug fixes and improvements
  are propagated consistently to all dependent components. It also enables clear
  versioning and dependency management, allowing components to depend on specific
  versions of the shared library.

## Process Management: `supervisord`

- **The Choice:** The testbed's agent processes will be managed by `supervisord`.

- **Rationale:** `supervisord` was chosen for its simplicity, reliability, and
  cross-platform compatibility (it works on both Linux and macOS). As a
  Python-based tool, it fits well within the project's ecosystem and can be
  bundled as a dependency of the main `swf-testbed` package. It provides all
  necessary features (auto-restart, log management, and the `supervisorctl`
  control interface) with a straightforward configuration file.

- **Alternatives Considered:**
  - **`systemd`:** A powerful alternative on Linux, but it is not cross-platform
    and would prevent the testbed from running easily on macOS.
  - **Docker Compose:** Excellent for managing multi-container services. While
    this is a powerful pattern, the primary distribution goal is to package the
    Python code itself, not necessarily to mandate a container-based runtime
    (see `docs/packaging_and_distribution.md`).
  - **Manual Scripts:** Running agents in separate terminals is feasible for
    development but is not a robust or scalable solution for a deployed
    testbed.

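To make the "straightforward configuration file" point concrete, here is a minimal sketch that renders one `[program:x]` section for `supervisord` from Python. The function name `render_program_section`, the agent name `example_agent`, its command, and the log directory are all hypothetical; the option names (`command`, `autostart`, `autorestart`, `stdout_logfile`, `redirect_stderr`) are standard `supervisord` program settings. This is not the project's actual configuration.

```python
import configparser
import io


def render_program_section(name: str, command: str, log_dir: str) -> str:
    """Render one supervisord [program:<name>] section as INI text."""
    # interpolation=None keeps '%' characters literal, since supervisord
    # applies its own %(...)s expansions when it reads the file.
    config = configparser.ConfigParser(interpolation=None)
    config[f"program:{name}"] = {
        "command": command,
        "autostart": "true",        # start when supervisord starts
        "autorestart": "true",      # restart the agent whenever it exits
        "stdout_logfile": f"{log_dir}/{name}.log",
        "redirect_stderr": "true",  # fold stderr into the stdout log
    }
    buffer = io.StringIO()
    config.write(buffer)
    return buffer.getvalue()


if __name__ == "__main__":
    # Hypothetical agent name, command, and log directory.
    print(render_program_section(
        "example_agent", "python -m example_agent", "/tmp/swf-logs"))
```

Once `supervisord` is running with such a file, `supervisorctl status` lists the managed processes.
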
## ActiveMQ Connection and Messaging Patterns

*Notes from Wen Guan (ActiveMQ/Artemis expert), January 2026.*

### Separate vs Shared Connections

- **The Choice:** Agents should use separate connections for publishing and
  subscribing rather than sharing a single connection for both.

- **Rationale:** Shared connections add complexity and introduce failure coupling.
  When a publisher sends messages with errors, the broker may send `REMOTE_DISCONNECT`
  to terminate the connection, which also kills the subscriber if both share
  that connection. Since we have no limit on connection count, separate
  connections provide better isolation with minimal overhead.

- **Current Implementation:** `BaseAgent` currently uses a single connection that
  handles both send and receive. This works because stomp.py handles concurrent
  operations, but if issues arise, splitting into separate connections is the
  recommended fix.

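The separate-connection pattern can be sketched as a small wrapper that owns two connection objects. This is an illustration only, not the `BaseAgent` implementation: the class name `IsolatedMessaging` and the `connection_factory` parameter are hypothetical. The methods it calls (`connect`, `set_listener`, `subscribe`, `send`, `disconnect`) follow the stomp.py connection API.

```python
from typing import Any, Callable


class IsolatedMessaging:
    """Hold one connection for sending and a separate one for receiving."""

    def __init__(self, connection_factory: Callable[[], Any]) -> None:
        # Two independent connections: if the broker drops the publisher
        # after a bad message, the subscriber connection is unaffected.
        self.publisher = connection_factory()
        self.subscriber = connection_factory()

    def connect(self, username: str, passcode: str) -> None:
        for conn in (self.publisher, self.subscriber):
            conn.connect(username, passcode, wait=True)

    def subscribe(self, destination: str, subscription_id: str,
                  listener: Any) -> None:
        # Only the subscriber connection ever receives messages.
        self.subscriber.set_listener("agent", listener)
        self.subscriber.subscribe(
            destination=destination, id=subscription_id, ack="auto")

    def send(self, destination: str, body: str) -> None:
        # Only the publisher connection ever sends messages.
        self.publisher.send(destination=destination, body=body)

    def disconnect(self) -> None:
        for conn in (self.publisher, self.subscriber):
            conn.disconnect()
```

With stomp.py installed, the factory could be something like `lambda: stomp.Connection([("localhost", 61613)])`, where the host and port are placeholders for the real broker address.
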
### Messaging Semantics: Topic vs Queue vs Durable Subscription

| Pattern | Persistence | Consumers | Use Case |
|---------|-------------|-----------|----------|
| **Topic** | None; messages are lost if the subscriber is offline | Broadcast to all subscribers | Real-time events, heartbeats |
| **Queue** | Messages kept until consumed | One consumer per message | Work distribution, guaranteed delivery |
| **Durable Subscription on Topic** | Creates a per-subscriber queue fed from the topic | One subscriber per durable subscription | Persistent broadcast (with caveats) |

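The three rows of the table can be mimicked with a toy in-memory model. This is purely illustrative and has nothing to do with how a broker is implemented; every class name here (`ToyTopic`, `ToyQueue`, `ToyDurableTopic`) is made up for the example.

```python
from collections import deque


class ToyTopic:
    """Broadcast: each attached subscriber gets every message; nothing is stored."""

    def __init__(self):
        self.subscribers = []  # each subscriber is a plain list used as an inbox

    def publish(self, message):
        # With no subscribers attached, the message is simply dropped.
        for inbox in self.subscribers:
            inbox.append(message)


class ToyQueue:
    """Point-to-point: messages wait until one consumer takes each of them."""

    def __init__(self):
        self.pending = deque()

    def publish(self, message):
        self.pending.append(message)

    def consume(self):
        # Each message is handed out once; None means the queue is empty.
        return self.pending.popleft() if self.pending else None


class ToyDurableTopic(ToyTopic):
    """Topic that also feeds one named queue per durable subscription."""

    def __init__(self):
        super().__init__()
        self.durable = {}  # subscription name -> ToyQueue

    def subscribe_durable(self, name):
        return self.durable.setdefault(name, ToyQueue())

    def publish(self, message):
        super().publish(message)
        # Durable queues keep growing until someone consumes from them.
        for queue in self.durable.values():
            queue.publish(message)
```

Publishing to a `ToyDurableTopic` while nobody consumes shows the backlog growing in `pending`, which is the behavior the table's "with caveats" note refers to.
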
### Durable Subscription Warnings

Durable subscriptions create an output queue from an input topic. Important caveats:

1. **Exclusive access:** The output queue can only be used by one subscriber at a time.
   Additional subscribers attaching to it will get an "in use" error.

2. **Must unsubscribe when done:** Unconsumed messages accumulate on disk. A topic
   with high message volume can fill disk space quickly if durable subscriptions
   are not properly cleaned up.

3. **Why some systems disable this:** Managing durable subscriptions requires
   responsible ownership and monitoring. Production systems often disable this
   feature or require explicit approval with designated owners who can be
   contacted when queues grow too large.

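One way to honor the "must unsubscribe when done" caveat is to tie the subscription's lifetime to a context manager, so cleanup runs even when the consuming code raises. The sketch below assumes a stomp.py-style connection with `subscribe` and `unsubscribe` methods. The helper name `durable_subscription` is hypothetical, and the `durable-subscription-name` header is an assumption based on the Artemis STOMP documentation (which also expects a `client-id` header on CONNECT); verify both against the broker version in use.

```python
from contextlib import contextmanager

# Assumption: header name as described in the Artemis STOMP documentation.
DURABLE_HEADER = "durable-subscription-name"


@contextmanager
def durable_subscription(conn, destination, subscription_id, name):
    """Subscribe durably and always unsubscribe on exit."""
    headers = {DURABLE_HEADER: name}
    conn.subscribe(destination=destination, id=subscription_id,
                   ack="auto", headers=headers)
    try:
        yield
    finally:
        # Sending the same header on UNSUBSCRIBE is expected to tell the
        # broker to drop the durable subscription rather than keep
        # accumulating messages for it on disk.
        conn.unsubscribe(id=subscription_id, headers=headers)
```

A caller would wrap its consume loop in `with durable_subscription(conn, "/topic/epictopic", "sub-1", "example-monitor"):`, where the subscription id and name are placeholders.
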
### Destination Naming

ActiveMQ Artemis requires explicit destination type prefixes:

- Topics: `/topic/name` (e.g., `/topic/epictopic`)
- Queues: `/queue/name` (e.g., `/queue/stf_processing`)

Never use bare destination names like `epictopic`; always include the prefix.
0092 Never use bare destination names like `epictopic` - always include the prefix.