ORCHESTRATION-SPEC: Multi-Agent Orchestration Specification

Status: live · Version: 1.0.0 · Camada: 9 (AI Infrastructure)
Data source: docs/reference/registries/_orchestrators-registry.yaml (12 orchestrators)
Sibling: docs/reference/templates/AGENT-MANIFEST.md (per-agent deep spec)

Purpose

Authoritative specification for how requests flow through the 5-layer orchestration hierarchy. Where the registry is the lean index, this spec is the prose explaining WHY the hierarchy exists and HOW transitions happen.

Layer responsibilities

Layer 1: Cortex (global orchestrator)

One instance, always-on
Reads: ROUTING-COMPASS, CONSTITUTION, COST-POLICY
Writes: capsules to Layer 2
Never invokes a skill directly
Failure mode: cortex_fallback (a simpler classifier)

Layer 2: Domain orchestrators (10)

frontend_orch, backend_orch, ai_ml_orch, devops_orch, security_orch, qa_orch, finance_trading_orch, integrations_orch, iot_orch, meta_orch
Always-on; one per coherent macro-domain
Reads: domain-specific skills + ADRs
Writes: capsules to Layer 3 (task orchestrators)
Veto power: security_orch and qa_orch can block releases unilaterally
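
The veto rule can be sketched as a small release gate. This is a minimal illustration only: the `release_allowed` helper and the `"approve"` / `"veto"` verdict strings are assumptions made for the example, not names taken from the registry.

```python
# Hypothetical sketch: only security_orch and qa_orch hold a unilateral veto.
VETO_HOLDERS = frozenset({"security_orch", "qa_orch"})


def release_allowed(verdicts: dict[str, str]) -> tuple[bool, list[str]]:
    """Return (allowed, blockers) for a release.

    `verdicts` maps orchestrator ID -> "approve" | "veto" (assumed vocabulary).
    A single veto from a veto holder blocks the release; a "veto" from any
    other orchestrator is ignored by this gate.
    """
    blockers = sorted(
        orch for orch, verdict in verdicts.items()
        if verdict == "veto" and orch in VETO_HOLDERS
    )
    return (not blockers, blockers)
```

For example, `release_allowed({"security_orch": "veto", "frontend_orch": "approve"})` reports the release as blocked by `security_orch`, regardless of what the other domain orchestrators say.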

Layer 3: Task orchestrators (ephemeral)

Spawned per-story, reaped on completion
TTL: 1800s max (enforced)
Reads: parent capsule + memory pointers
Writes: capsules to Layer 4 (specialists)
Emits: checkpoints per the parent capsule's checkpoint_policy
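
One way the spawn/reap lifecycle and the 1800 s ceiling could be enforced is sketched below. The `TaskOrchestrator` class, the `reap` helper, and the choice to clamp (rather than reject) an oversized TTL are all assumptions for illustration; the spec only states that the maximum is enforced.

```python
import time
from dataclasses import dataclass, field
from typing import Optional

MAX_TTL_SECONDS = 1800  # upper bound stated in this spec


@dataclass
class TaskOrchestrator:
    """Hypothetical record for one ephemeral Layer 3 orchestrator."""
    story_id: str
    ttl_seconds: int = MAX_TTL_SECONDS
    spawned_at: float = field(default_factory=time.monotonic)

    def __post_init__(self) -> None:
        # Assumption: an oversized TTL is clamped to the maximum, not rejected.
        self.ttl_seconds = min(self.ttl_seconds, MAX_TTL_SECONDS)

    def expired(self, now: Optional[float] = None) -> bool:
        now = time.monotonic() if now is None else now
        return now - self.spawned_at >= self.ttl_seconds


def reap(active: dict[str, TaskOrchestrator], now: Optional[float] = None) -> list[str]:
    """Remove expired task orchestrators in place and return their story IDs."""
    dead = [story_id for story_id, task in active.items() if task.expired(now)]
    for story_id in dead:
        del active[story_id]
    return dead
```

Passing `now` explicitly keeps the expiry check deterministic in tests; in normal use both helpers fall back to the monotonic clock.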

Layer 4: Specialists (20)

Stateless; one invocation = one task
Reads: the inbound capsule + its own memory namespace
Writes: artifact + outbound capsule (back to L3)
Skills: bound by preferred_skills ∪ skills_allowed from capsule
Quality threshold from capsule (default 0.85)
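
The skill binding and quality gate might look like the following sketch. It assumes the capsule is available as a plain dict and that the threshold lives under a `quality_threshold` key; the authoritative field names are in the capsule schema, not here.

```python
DEFAULT_QUALITY_THRESHOLD = 0.85  # default stated in this spec


def bound_skills(capsule: dict) -> set[str]:
    """Union of preferred_skills and skills_allowed; a missing field counts as empty."""
    return set(capsule.get("preferred_skills", [])) | set(capsule.get("skills_allowed", []))


def accept(capsule: dict, skill: str, score: float) -> bool:
    """Accept a result only if the skill is bound and the score meets the threshold.

    `quality_threshold` is an assumed key name for this example.
    """
    threshold = capsule.get("quality_threshold", DEFAULT_QUALITY_THRESHOLD)
    return skill in bound_skills(capsule) and score >= threshold
```

With `{"preferred_skills": ["a"], "skills_allowed": ["b", "a"]}` the bound set is `{"a", "b"}`, and a result from skill `"b"` is accepted at a score of 0.85 but not at 0.84.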

Layer 5: Workers (5)

Atomic; one capability per worker
code-writer-worker, file-operator-worker, api-caller-worker, test-runner-worker, git-worker
Reads: explicit single-action instruction
Writes: result blob
No autonomy; obey the parent specialist

Routing transitions

User intent
   │
   ▼
Cortex (L1) ──► intent_classification ──► routing_rule lookup ──► capsule_to(L2)
                                                                       │
                                              ┌────────────────────────┘
                                              ▼
                                         L2 domain orch ──► decompose into stories
                                              │
                                              ▼
                                         spawn L3 task_orch_<story_id>
                                              │
                                              ▼
                                         sequence L4 specialists
                                              │     ▲
                                              ▼     │
                                         delegate to L5 workers as needed
                                              │
                                              ▼
                                         L3 aggregates result, returns to L2
                                              │
                                              ▼
                                         L2 returns to L1 (or to next L2 if cross-domain)
                                              │
                                              ▼
                                         L1 returns to user
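
The first two hops of the diagram (L1 to L2, then L2 to L3) can be sketched as capsule hand-offs. Everything here is illustrative: the `Capsule` fields, the intent labels in `ROUTING_RULES`, and the use of a running index as the story ID are assumptions, and the real capsule shape is defined by the handoff-capsule schema.

```python
from dataclasses import dataclass, field


@dataclass
class Capsule:
    """Hypothetical, heavily simplified capsule."""
    layer: int       # layer that receives this capsule
    target: str      # receiver ID
    payload: str
    trail: list[str] = field(default_factory=list)  # IDs that handled the request so far


# Assumed routing rules: intent label -> Layer 2 domain orchestrator.
ROUTING_RULES = {"ui_change": "frontend_orch", "api_change": "backend_orch"}


def cortex_route(intent: str, request: str) -> Capsule:
    """L1: look up the routing rule and hand a capsule to L2. No skill is invoked."""
    return Capsule(layer=2, target=ROUTING_RULES[intent], payload=request, trail=["cortex"])


def domain_decompose(capsule: Capsule, stories: list[str]) -> list[Capsule]:
    """L2: emit one capsule per story, each addressed to a fresh L3 task orchestrator."""
    return [
        Capsule(3, f"task_orch_{i}", story, capsule.trail + [capsule.target])
        for i, story in enumerate(stories, start=1)
    ]
```

The `trail` list is only there to make the path visible in the example; results would travel back up the same chain in reverse.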

Cross-domain orchestration

When an intent spans multiple Layer-2 domains (e.g., "build a fullstack feature"):

- Cortex selects a cross_domain_pattern from ROUTING-COMPASS
- Pattern declares a sequence: e.g., backend_orch.design_api → frontend_orch.implement_ui → qa_orch.tests → devops_orch.deploy
- Each step waits for the previous (or runs in parallel if pattern allows)
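
A cross-domain pattern can be modeled as an ordered list of stages, where each stage holds the steps that are allowed to run together. The sketch below uses the step names from the example sequence; the list-of-lists layout, the `run_pattern` helper, and the stop-on-first-failure behavior are assumptions, since the actual pattern format lives in ROUTING-COMPASS.

```python
from typing import Callable

# Step names come from the example sequence; the data layout is assumed.
FULLSTACK_PATTERN = [
    ["backend_orch.design_api"],
    ["frontend_orch.implement_ui"],
    ["qa_orch.tests"],
    ["devops_orch.deploy"],
]


def run_pattern(pattern: list[list[str]], run_step: Callable[[str], bool]) -> list[str]:
    """Run stages in order and return the steps from fully successful stages.

    Steps inside one stage are the ones a pattern would allow in parallel;
    they run sequentially here to keep the sketch simple. Execution stops at
    the first stage that contains a failed step.
    """
    done: list[str] = []
    for stage in pattern:
        results = [(step, run_step(step)) for step in stage]
        if not all(ok for _, ok in results):
            break
        done.extend(step for step, _ in results)
    return done
```

If the `qa_orch.tests` step fails, only the backend and frontend steps are reported as done and `devops_orch.deploy` never starts.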

Fallback chains

Every orchestrator declares a fallback_chain field. On failure:

1. First fallback (often a simpler version of itself)
2. Then cortex (global fallback)
3. Then human_escalation (last resort)

Fallback entries are intentional values and need not be orchestrator IDs; the audit allowlist that permits them is documented in _audit_semantic.py.
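
Walking a fallback chain could be sketched as follows. The `run_with_fallbacks` helper, the `attempt` callback, and the convention that `human_escalation` is returned rather than invoked are assumptions for the example; `backend_orch_simple` in the usage note is a made-up name.

```python
from typing import Callable, Optional


def run_with_fallbacks(
    primary: str,
    fallback_chain: list[str],
    attempt: Callable[[str], Optional[str]],
) -> tuple[str, Optional[str]]:
    """Try the primary handler, then each fallback in order.

    `attempt` returns a result string, or None on failure. The value
    "human_escalation" is treated as terminal: it is returned with no result
    instead of being invoked. An exhausted chain also ends in escalation.
    """
    for handler in [primary, *fallback_chain]:
        if handler == "human_escalation":
            return handler, None
        result = attempt(handler)
        if result is not None:
            return handler, result
    return "human_escalation", None
```

With the chain `["backend_orch_simple", "cortex", "human_escalation"]`, a request that fails on the primary and on the simpler variant is answered by `cortex`; if `cortex` fails too, the call returns `("human_escalation", None)`.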

Failure semantics

Failure type	Handled by	Action
Capsule schema invalid	Receiver	Reject; return error capsule
Budget exceeded	CIRCUIT-BREAKER (Camada 9, future)	HALT; emit cost alert
Constitutional violation	Auditor-haiku	HALT; emit INCIDENT
Skill unavailable	Specialist	Try fallback skill; if none → escalate to L3
Deadline missed	Task orchestrator	Renegotiate or terminate story
Veto from security/qa	Veto-holder	HALT; require ADR to proceed
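
The table above can be mirrored as a lookup so that callers do not hard-code handlers. The handler and action strings are transcribed from the table; the `Failure` enum, its member names, and the `halts` helper are assumptions for illustration.

```python
from enum import Enum


class Failure(Enum):
    """Hypothetical identifiers for the failure types in the table."""
    CAPSULE_SCHEMA_INVALID = "capsule_schema_invalid"
    BUDGET_EXCEEDED = "budget_exceeded"
    CONSTITUTIONAL_VIOLATION = "constitutional_violation"
    SKILL_UNAVAILABLE = "skill_unavailable"
    DEADLINE_MISSED = "deadline_missed"
    VETO = "veto"


# failure -> (handled by, action), transcribed from the table
FAILURE_POLICY = {
    Failure.CAPSULE_SCHEMA_INVALID: ("Receiver", "Reject; return error capsule"),
    Failure.BUDGET_EXCEEDED: ("CIRCUIT-BREAKER", "HALT; emit cost alert"),
    Failure.CONSTITUTIONAL_VIOLATION: ("Auditor-haiku", "HALT; emit INCIDENT"),
    Failure.SKILL_UNAVAILABLE: ("Specialist", "Try fallback skill; if none, escalate to L3"),
    Failure.DEADLINE_MISSED: ("Task orchestrator", "Renegotiate or terminate story"),
    Failure.VETO: ("Veto-holder", "HALT; require ADR to proceed"),
}


def halts(failure: Failure) -> bool:
    """True when the table's action for this failure begins with HALT."""
    return FAILURE_POLICY[failure][1].startswith("HALT")
```

Three of the six failure types halt the flow: budget exceeded, constitutional violation, and a security/qa veto.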

SLOs at the orchestration layer

(Will land in AGENT-METRICS — Camada 9, planned.)

Indicative targets:

- Cortex classification latency: p95 < 200 ms
- Domain orchestrator decomposition: p95 < 1 s
- Task orchestrator overhead per specialist call: p95 < 50 ms
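
Until AGENT-METRICS exists, the indicative targets could be checked against raw latency samples with a sketch like this one. The key names in `SLO_P95_MS` and the nearest-rank percentile method are assumptions; the spec does not say how p95 is to be computed.

```python
# Targets in milliseconds, from the indicative list; the key names are made up.
SLO_P95_MS = {
    "cortex_classification": 200,
    "domain_decomposition": 1000,
    "task_overhead_per_call": 50,
}


def p95(samples_ms: list[float]) -> float:
    """Nearest-rank 95th percentile of a non-empty sample list."""
    if not samples_ms:
        raise ValueError("p95 needs at least one sample")
    ordered = sorted(samples_ms)
    # ceil(0.95 * n) in integer arithmetic, to avoid floating-point rounding.
    rank = (95 * len(ordered) + 99) // 100
    return ordered[rank - 1]


def meets_slo(name: str, samples_ms: list[float]) -> bool:
    """True when the observed p95 is strictly below the target."""
    return p95(samples_ms) < SLO_P95_MS[name]
```

Nearest-rank is used because it always returns an observed sample and needs no interpolation; a metrics backend may well use a different estimator.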

What this spec does NOT cover

Per-agent prompt content → docs/reference/templates/AGENT-MANIFEST.md (template) + per-agent manifests (instances)
Skill catalog → _skills-registry.yaml
Routing rules in detail → ROUTING-COMPASS.yaml
Capsule schema → docs/reference/contracts/handoff-capsule.schema.json
Checkpoint schema → docs/reference/contracts/checkpoint.schema.json

Evolution

Version	Date	Change
1.0.0	2026-05-23	Initial spec; matches `_orchestrators-registry.yaml` v1.0.1
(future)	—	Add Camada 13 autonomy integration (WORLD-STATE + EVENT-BUS)
(future)	—	Add Camada 18 cognitive layer hooks (mirror learning between L4 specialists)