Agent Chains & Cascading Delegation

Cascading agent delegation enables multi-step workflows where completing one agent can automatically trigger the next agent in a defined chain. This pattern supports pipelines like triage → developer → reviewer without manual intervention.

Overview

When a job completes successfully, the system can automatically queue a child job with the next agent in the chain. The child job receives context from the parent's execution, enabling each agent to build on previous work.

Initial event
    ↓
[Triage Agent] → analyzes issue
    ↓
[Developer Agent] ← receives triage output + issue
    ↓
[Reviewer Agent] ← receives dev output + issue
    ↓
Completion

Defining Chains

Chains are defined in an agent's eventTriggers array as part of the trigger configuration. Each trigger can define the next agent to run and conditions for progression.

ChainStep Structure

interface ChainStep {
  agentTriggerName: string  // Trigger name of next agent (e.g., "ai-developer")
  condition: "always"       // Progression condition
           | "on-success"   // Only if parent exits with code 0
           | "on-failure"   // Only if parent exits with non-zero code
  failurePolicy: "stop"     // Stop the chain on failure
               | "skip"     // Skip this agent and continue with next
               | "notify"   // Continue but send notification
}

Example Agent Configuration

apiVersion: labrats.work/v1alpha1
kind: AiAgent
metadata:
  name: triage
  namespace: ai-agents
spec:
  name: Issue Triage
  triggerName: ai-triage
  defaultModel: haiku
  defaultEffort: low
  enabled: true
  eventTriggers:
    - source: github
      events: ["issues.opened"]
      next:
        agentTriggerName: ai-developer
        condition: on-success
        failurePolicy: stop

Chain Tracking

Each job carries metadata about its position in a chain:

Field	Type	Purpose
`chainId`	string	Unique identifier for the entire chain (shared by all jobs in it)
`chainDepth`	number	Position in chain (0 = initiator, 1 = first delegated, etc.)
`parentJobId`	string	ID of job that triggered this one (or null if initiator)
`chainPath`	string[]	Ordered list of agent trigger names traversed (loop detection)

These fields are exposed in the job API for filtering and tracking multi-step workflows:

# Get all jobs in a chain
curl https://ai-agents.hcl.labrats.work/api/jobs?chainId=abc123 \
  -H "Authorization: Bearer <token>"

# Response includes chain metadata
{
  "id": "job-2",
  "chainId": "abc123",
  "chainDepth": 1,
  "parentJobId": "job-1",
  "chainPath": ["ai-triage", "ai-developer"],
  ...
}

Context Propagation

When the next agent is queued, the parent job's output is captured and passed as context. This enables the child agent to build on previous work:

Parent Output Capture

Capture size: Up to 4,000 characters of parent's stdout
Format: Wrapped in <chain-context> block
Placement: Prepended to child's original prompt

Example

Parent (Triage) stdout:

Issue Analysis:
- Type: Bug in authentication flow
- Priority: High
- Affected components: LoginForm, AuthService

Child (Developer) receives:

<chain-context>
Previous agent output:
Issue Analysis:
- Type: Bug in authentication flow
- Priority: High
- Affected components: LoginForm, AuthService
</chain-context>

[Original developer prompt...]

Safety Limits

Chains have built-in safeguards to prevent infinite loops and resource exhaustion:

Limit	Value	Purpose
Max chain depth	10	Prevents runaway delegation chains
Output capture	4000 chars	Bounds context size
Loop detection	chainPath dedup	Prevents agent A → B → A cycles

If a chain exceeds max depth or would create a loop, the job fails with a clear error message.

Failure Handling

Each chain step can define how to handle failures from the previous agent:

Stop (default)

Chain terminates immediately on failure. Useful for strict dependency chains.

failurePolicy: stop

Skip

If agent fails, move to next agent in chain (if defined). Useful for optional enrichment steps.

failurePolicy: skip

Notify

Continue chain but emit an event for monitoring/alerting.

failurePolicy: notify

Progression Conditions

Control when the next agent runs based on parent's exit status:

Always

Run next agent regardless of parent outcome. Useful for notifications or cleanup agents.

condition: always

On Success

Run next agent only if parent exits with code 0. Most common for dependent workflows.

condition: on-success

On Failure

Run next agent only if parent exits with non-zero code. Useful for error handling/escalation.

condition: on-failure

Monitoring Chains

View Chain in Dashboard

The job detail page shows:

Full chain path (all agents traversed)
Chain ID for grouping related jobs
Parent job ID for tracing context
Current depth in chain

Query Chains via API

# All jobs in a chain
GET /api/jobs?chainId=abc123

# Jobs by agent
GET /api/jobs?agentRole=ai-developer

# Single job with chain metadata
GET /api/jobs/job-id

Response includes:

{
  "id": "job-2",
  "type": "implement-issue",
  "agentRole": "ai-developer",
  "chainId": "abc123",
  "chainDepth": 1,
  "parentJobId": "job-1",
  "chainPath": ["ai-triage", "ai-developer"],
  "status": "completed",
  "result": "...",
  ...
}

Troubleshooting

Chain stuck at depth 0

Problem: Next agent never queued despite on-success condition.

Debug steps:

Check parent job's exit code (should be 0 for on-success)
Verify next agent's trigger name exists
Check next agent is enabled (enabled: true)

Loop detected / Chain depth exceeded

Problem: Job failed with "Loop detected" or "Max chain depth exceeded"

Causes:

Agent configured to call itself (directly or indirectly)
Chain too long (>10 steps)

Resolution:

Review next.agentTriggerName in all agents' eventTriggers
Simplify chain design (break into multiple separate workflows)

Child job missing parent context

Problem: Child doesn't have parent output in <chain-context> block.

Causes:

Parent output was empty or shorter than 4000 chars (should still appear)
Pipeline jobs (e.g., pdf-sections) don't use chains — they use different execution path

Check:

View parent job logs in API detail endpoint
Verify parent executed successfully

Best Practices

Keep chains shallow — Prefer 2-3 agents per chain. Longer chains are harder to debug and troubleshoot.
Use descriptive trigger names — Names like ai-triage, ai-developer, ai-reviewer are clearer than agent-1, agent-2.
Plan failure modes — Decide upfront whether a failure in step 1 should block step 2 (use stop) or proceed anyway (use skip).
Monitor chain metrics — Track how often chains complete successfully, where they fail, and adjust conditions/policies accordingly.
Test chains in dev — Use the API directly to test a new chain before deploying to production.
Document the workflow — Add comments to agent manifests explaining the chain and its purpose.

Example:

# Triage agent for initial assessment
# ↓ on-success → Developer agent (implements fix)
# ↓ on-success → Reviewer agent (code review)

Examples

Issue → Fix → Review

---
# 1. Triage agent analyzes issue
apiVersion: labrats.work/v1alpha1
kind: AiAgent
metadata:
  name: issue-triage
spec:
  name: Issue Triage
  triggerName: ai-triage
  defaultModel: haiku
  eventTriggers:
    - source: github
      events: ["issues.opened"]
      next:
        agentTriggerName: ai-developer
        condition: on-success

---
# 2. Developer implements fix
apiVersion: labrats.work/v1alpha1
kind: AiAgent
metadata:
  name: developer
spec:
  name: Developer
  triggerName: ai-developer
  defaultModel: sonnet
  eventTriggers:
    - source: manual
      next:
        agentTriggerName: ai-reviewer
        condition: on-success

---
# 3. Reviewer reviews code
apiVersion: labrats.work/v1alpha1
kind: AiAgent
metadata:
  name: reviewer
spec:
  name: Code Reviewer
  triggerName: ai-reviewer
  defaultModel: sonnet

Error Escalation

Run escalation agent only on failures:

apiVersion: labrats.work/v1alpha1
kind: AiAgent
metadata:
  name: primary-agent
spec:
  name: Primary Agent
  triggerName: ai-primary
  eventTriggers:
    - source: github
      events: ["issues.labeled"]
      filters:
        label: "help-wanted"
      next:
        agentTriggerName: ai-escalate
        condition: on-failure
        failurePolicy: stop