Coordination Patterns
How agents in a swarm communicate and coordinate.
Pattern 1: Orchestrator-Worker
A central orchestrator (Lima) directs worker agents.
flowchart TB
Orch[Lima Orchestrator]
Orch --> A[Agent A]
Orch --> B[Agent B]
Orch --> C[Agent C]
A --> Orch
B --> Orch
C --> Orch
Use Case: Most common pattern in Crella Engine.
Pros:
- Clear command structure
- Easy to debug
- Centralized state
Cons:
- Orchestrator is a single point of failure
- Can bottleneck at scale
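A minimal sketch of this pattern, assuming each worker exposes an async `process(task)` method. The `Orchestrator` class and its `register` and `dispatch` methods are hypothetical names chosen for illustration; they are not the Lima API.

```javascript
// Hypothetical sketch: names are illustrative, not the Lima API.
class Orchestrator {
  constructor() {
    this.workers = new Map(); // worker name -> worker object
    this.state = {};          // centralized state: worker name -> last outcome
  }

  register(name, worker) {
    this.workers.set(name, worker);
  }

  // Send one task to one named worker and record the outcome centrally.
  async dispatch(name, task) {
    const worker = this.workers.get(name);
    if (!worker) throw new Error(`Unknown worker: ${name}`);
    try {
      const result = await worker.process(task);
      this.state[name] = { status: 'success', result };
    } catch (err) {
      this.state[name] = { status: 'failed', error: err.message };
    }
    return this.state[name];
  }
}
```

Because every outcome lands in `state`, debugging means inspecting one object; it also shows why the orchestrator becomes the bottleneck, since every call passes through `dispatch`.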
Pattern 2: Pipeline (Sequential)
Agents process in sequence, each passing its output to the next.
flowchart LR
A[Alpha] --> B[Bravo] --> C[Charlie] --> D[Delta]
Use Case: Linear workflows like document processing.
Pros:
- Simple to understand
- Clear dependencies
- Easy error tracking
Cons:
- Slowest pattern: no parallelism, so total latency is the sum of every stage
- One failure stops the pipeline
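A sequential pipeline can be sketched as a loop over stages. The stage shape (`{ name, run }`) and the `runPipeline` function are assumptions made for this example.

```javascript
// Run stages in order; each stage receives the previous stage's output.
// The { name, run } stage shape is an assumption made for this sketch.
async function runPipeline(stages, input) {
  let value = input;
  for (const stage of stages) {
    try {
      value = await stage.run(value);
    } catch (err) {
      // One failure stops the pipeline; report which stage failed.
      throw new Error(`Pipeline stopped at ${stage.name}: ${err.message}`);
    }
  }
  return value;
}
```

Naming the failed stage in the error is what makes error tracking easy in this pattern.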
Pattern 3: Fan-Out / Fan-In
One agent distributes work to parallel workers, and another collects their results.
flowchart TB
Distribute[Distribute Task]
Distribute --> A[Worker A]
Distribute --> B[Worker B]
Distribute --> C[Worker C]
A --> Collect[Collect Results]
B --> Collect
C --> Collect
Use Case: Parallel research, batch processing.
Pros:
- Maximum parallelism
- Fastest for independent tasks
Cons:
- Requires aggregation logic
- Complex error handling
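One way to sketch fan-out / fan-in is with the standard `Promise.allSettled`, which waits for every worker and keeps failures separate from successes. The `fanOutFanIn` function and the `process(task)` worker method are illustrative assumptions.

```javascript
// Fan out: start every worker call at once. Fan in: wait for all to settle.
async function fanOutFanIn(workers, task) {
  // The async arrow turns synchronous throws into rejections as well.
  const settled = await Promise.allSettled(
    workers.map(async (w) => w.process(task))
  );
  const results = [];
  const errors = [];
  settled.forEach((outcome, index) => {
    if (outcome.status === 'fulfilled') {
      results.push(outcome.value);
    } else {
      const reason = outcome.reason instanceof Error
        ? outcome.reason.message
        : String(outcome.reason);
      errors.push({ worker: index, reason });
    }
  });
  return { results, errors };
}
```

The caller still has to decide what a partial result means (retry the failed workers, or accept what came back), which is the aggregation and error-handling cost listed above.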
Pattern 4: Peer-to-Peer
Agents communicate directly with each other.
flowchart LR
A[Agent A] <--> B[Agent B]
B <--> C[Agent C]
A <--> C
Use Case: Collaborative tasks, real-time adjustments.
Pros:
- Flexible
- No single point of failure
- Low latency
Cons:
- Complex coordination
- Harder to debug
- State management is challenging
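A minimal in-process sketch of direct peer messaging. The `Peer` class is hypothetical; in a real deployment the direct reference would be a network connection, which this example does not model.

```javascript
// Hypothetical peer: each agent keeps direct references to the peers it knows.
class Peer {
  constructor(name) {
    this.name = name;
    this.peers = new Map(); // peer name -> Peer
    this.inbox = [];        // messages received from other peers
  }

  // Links are bidirectional, matching the <--> edges in the diagram.
  connect(other) {
    this.peers.set(other.name, other);
    other.peers.set(this.name, this);
  }

  // Deliver straight to the target peer; no orchestrator in between.
  send(to, payload) {
    const peer = this.peers.get(to);
    if (!peer) throw new Error(`${this.name} is not connected to ${to}`);
    peer.inbox.push({ from: this.name, payload });
  }
}
```

Note that each peer only sees its own `inbox`; no single object holds the whole conversation, which is where the debugging and state-management costs come from.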
Pattern 5: Event-Driven
Agents respond to events, not direct commands.
flowchart TB
Event[Event Bus]
Event --> |document.uploaded| Charlie[Charlie]
Event --> |lead.qualified| Echo[Echo]
Event --> |response.received| India[India]
Charlie --> |document.processed| Event
Echo --> |lead.enriched| Event
India --> |escalation.created| Event
Use Case: Decoupled systems, async processing.
Pros:
- Highly scalable
- Loose coupling
- Easy to add agents
Cons:
- Eventually consistent
- Debugging complexity
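The flow in the diagram can be sketched with a small in-memory bus. The `EventBus` class is a hypothetical stand-in, and it delivers events synchronously for simplicity; a production bus would typically deliver asynchronously. The event names come from the diagram; the payload fields are assumptions.

```javascript
// Hypothetical in-memory event bus (synchronous delivery for simplicity).
class EventBus {
  constructor() {
    this.handlers = new Map(); // event name -> array of handler functions
  }

  subscribe(eventName, handler) {
    if (!this.handlers.has(eventName)) this.handlers.set(eventName, []);
    this.handlers.get(eventName).push(handler);
  }

  publish(eventName, payload) {
    for (const handler of this.handlers.get(eventName) ?? []) {
      handler(payload);
    }
  }
}

const bus = new EventBus();

// Charlie reacts to uploads and emits a follow-up event.
// The publisher of document.uploaded never references Charlie directly.
bus.subscribe('document.uploaded', (doc) => {
  bus.publish('document.processed', { id: doc.id, status: 'processed' });
});
```

Adding a new agent is one more `subscribe` call; no existing publisher needs to change.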
Communication Protocols
Synchronous (Request-Response)
// Agent A calls Agent B directly
const result = await agentB.process(data);
When to use: Need immediate result, simple operations.
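A direct call blocks the caller for as long as the callee takes, so it is usually worth bounding the wait. The sketch below uses only standard `Promise.race` and `setTimeout`; `withTimeout` is a hypothetical helper name, and the 5000 ms value in the usage comment is an arbitrary example.

```javascript
// Reject if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  // Whichever settles first wins; always clear the timer afterwards.
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Usage (agentB and data as in the snippet above):
//   const result = await withTimeout(agentB.process(data), 5000);
```

This only stops the caller from waiting; it does not cancel the work already started on the callee.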
Asynchronous (Message Queue)
// Agent A publishes to queue
await queue.publish('charlie.tasks', { document: data });
// Charlie subscribes and processes
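queue.subscribe('charlie.tasks', async (msg) => {
  // processDocument is a placeholder for Charlie's own handler; a bare
  // `process(...)` call would collide with Node.js's global `process` object.
  const result = await processDocument(msg.document);
  await queue.publish('charlie.results', result);
});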
When to use: High volume, fire-and-forget, resilience needed.
State Management
Centralized State (Redis)
flowchart TB
subgraph agents [Agents]
A[Alpha]
B[Bravo]
C[Charlie]
end
Redis[(Redis State Store)]
A --> Redis
B --> Redis
C --> Redis
Use: Shared counters, session data, rate limits.
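As an illustration of a shared rate limit, here is a fixed-window counter. `InMemoryStore` is an in-memory stand-in, not a Redis client; its `incr` method only mirrors the semantics of the Redis `INCR` command (missing keys start at 0, and the new value is returned). `allowRequest` and the key format are assumptions.

```javascript
// In-memory stand-in for the shared store (NOT a Redis client).
class InMemoryStore {
  constructor() {
    this.data = new Map();
  }

  // Mirrors Redis INCR semantics: missing keys start at 0; returns the new value.
  incr(key) {
    const next = (this.data.get(key) ?? 0) + 1;
    this.data.set(key, next);
    return next;
  }
}

// Fixed-window rate limit shared by every agent that uses the same store.
// Old window keys are never deleted here; a real store would expire them.
function allowRequest(store, agentId, limit, windowMs, now = Date.now()) {
  const windowIndex = Math.floor(now / windowMs);
  const count = store.incr(`rate:${agentId}:${windowIndex}`);
  return count <= limit;
}
```

Because the counter lives in the shared store rather than in any one agent, the limit holds across all agents that share it.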
Distributed State (Per-Agent)
flowchart LR
A[Alpha + State A]
B[Bravo + State B]
C[Charlie + State C]
Use: Agent-specific state, no sharing needed.
Workflow State (Lima)
{
"workflow_id": "wf-123",
"current_step": 3,
"state": {
"lead_validated": true,
"enrichment_complete": true,
"email_generated": false
},
"history": [
{"agent": "ALPHA001", "result": "success"},
{"agent": "ECHO001", "result": "success"}
]
}
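A sketch of how an orchestrator might update this structure after each step. The field names follow the JSON above; `recordStep`, its parameters, and the rule that `current_step` only advances on success are assumptions.

```javascript
// Return a NEW workflow object with one more step recorded (no mutation).
// `flag` is the key under `state` that this step is responsible for.
function recordStep(workflow, agent, flag, result) {
  const succeeded = result === 'success';
  return {
    ...workflow,
    // Assumption: only a successful step advances the workflow.
    current_step: succeeded ? workflow.current_step + 1 : workflow.current_step,
    state: { ...workflow.state, [flag]: succeeded },
    history: [...workflow.history, { agent, result }],
  };
}
```

Returning a new object instead of mutating keeps earlier snapshots intact, which helps when replaying or auditing a workflow.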
Error Handling
Retry Pattern
flowchart TB
Task[Execute Task] --> Check{Success?}
Check -->|Yes| Done[Complete]
Check -->|No| Retry{Attempts < Max?}
Retry -->|Yes| Wait[Backoff Wait]
Wait --> Task
Retry -->|No| Escalate[Escalate/Fail]
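The retry loop in the diagram can be sketched as follows, here with exponential backoff as one possible wait strategy. `withRetry`, its option names, and the default values are assumptions.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Run `task` until it succeeds or maxAttempts is reached.
async function withRetry(task, { maxAttempts = 3, baseDelayMs = 100 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task(attempt); // Success: complete
    } catch (err) {
      lastError = err;
      if (attempt < maxAttempts) {
        // Backoff wait doubles each time: baseDelayMs, 2x, 4x, ...
        await sleep(baseDelayMs * 2 ** (attempt - 1));
      }
    }
  }
  // Attempts exhausted: escalate to the caller.
  throw new Error(`Failed after ${maxAttempts} attempts: ${lastError.message}`);
}
```

Retrying is only safe when the task is idempotent; otherwise a retry after a partial success can apply the same effect twice.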
Circuit Breaker
stateDiagram-v2
[*] --> Closed
Closed --> Open: Failures > Threshold
Open --> HalfOpen: Timeout Elapsed
HalfOpen --> Closed: Success
HalfOpen --> Open: Failure
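A minimal sketch of the state machine above. `CircuitBreaker`, its option names, and its defaults are assumptions; the sketch counts consecutive failures (a success resets the count) and does not limit concurrent trial calls in the HalfOpen state.

```javascript
class CircuitBreaker {
  constructor({ failureThreshold = 3, resetTimeoutMs = 10000, now = Date.now } = {}) {
    this.failureThreshold = failureThreshold;
    this.resetTimeoutMs = resetTimeoutMs;
    this.now = now;        // injectable clock, handy for tests
    this.state = 'Closed';
    this.failures = 0;     // consecutive failures since the last success
    this.openedAt = 0;
  }

  async call(fn) {
    if (this.state === 'Open') {
      if (this.now() - this.openedAt < this.resetTimeoutMs) {
        throw new Error('Circuit open: call rejected');
      }
      this.state = 'HalfOpen'; // Timeout elapsed: let a trial call through
    }
    try {
      const result = await fn();
      this.state = 'Closed';   // HalfOpen -> Closed on success
      this.failures = 0;
      return result;
    } catch (err) {
      this.failures += 1;
      // HalfOpen -> Open on any failure; Closed -> Open past the threshold.
      if (this.state === 'HalfOpen' || this.failures > this.failureThreshold) {
        this.state = 'Open';
        this.openedAt = this.now();
      }
      throw err;
    }
  }
}
```

While the circuit is open, calls fail immediately instead of piling up behind an agent that is already struggling.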
Dead Letter Queue
Failed messages go to a DLQ for manual review:
Main Queue → Agent → Success ✓
               ↓
     Failure (after retries)
               ↓
     Dead Letter Queue → Human Review
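The flow above can be sketched with a plain array standing in for the dead letter queue. `consume`, the shape of the DLQ entry, and the default of three attempts are assumptions.

```javascript
// Try the handler up to maxAttempts times; park the message in the DLQ
// if every attempt fails. `deadLetterQueue` is a plain array in this sketch.
async function consume(message, handler, deadLetterQueue, maxAttempts = 3) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await handler(message);
    } catch (err) {
      lastError = err;
    }
  }
  // Keep the original message plus enough context for human review.
  deadLetterQueue.push({
    message,
    error: lastError.message,
    attempts: maxAttempts,
    failedAt: new Date().toISOString(),
  });
  return undefined;
}
```

Storing the original message unchanged means it can be replayed through the main queue once the underlying problem is fixed.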
Best Practices
- Start with Orchestrator-Worker: it is the simplest pattern, and you can optimize later
- Use async for high volume: don't block on slow operations
- Implement idempotency: every operation should be safe to retry
- Log everything: propagate a correlation ID across agents so one request can be traced end to end
- Set timeouts: don't let agents hang forever
- Design for failure: assume every call can fail
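Idempotency is the practice that makes retries safe, so here is one way to sketch it: deduplicate by message ID. `makeIdempotent` and the `message.id` field are assumptions, and the in-memory `Map` would need to be a shared store if several agent instances handle the same messages.

```javascript
// Wrap a handler so a message ID that was already handled is not processed twice.
function makeIdempotent(handler, seen = new Map()) {
  return (message) => {
    if (!seen.has(message.id)) {
      // Cache the promise itself so concurrent duplicates share one run.
      const pending = Promise.resolve()
        .then(() => handler(message))
        .catch((err) => {
          seen.delete(message.id); // failed work may be retried later
          throw err;
        });
      seen.set(message.id, pending);
    }
    return seen.get(message.id);
  };
}
```

Using the correlation ID as `message.id` ties this to the logging practice above: the same identifier both traces a request and deduplicates it.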