How-to: connect the Anthropic API¶

Plug the real Anthropic API into the runtime so pipelines invoke claude-haiku-4-5, claude-sonnet-4-6, or claude-opus-4-7 instead of the mock invoker.

Install¶

pip install -e ".[llm]"     # adds anthropic>=0.50

The anthropic SDK is an optional dependency. The pipeline works without it (using the mock invoker); install it only when you want real LLM calls.

Set the API key¶

export ANTHROPIC_API_KEY="sk-ant-..."          # Linux/macOS
$env:ANTHROPIC_API_KEY="sk-ant-..."            # PowerShell

Or pass api_key=... explicitly to AnthropicInvoker(api_key=...).

Quickstart — one call¶

from src.ai import ModelManager

mm, invoker = ModelManager.with_anthropic()
result = mm.call("Explain RAG in one sentence.", tier="simple", invoker=invoker)

print(result["response"])           # actual text from claude-haiku-4-5
print(result["tokens_in"], "in /", result["tokens_out"], "out")
print(result["model"])              # claude-haiku-4-5

tier="simple" routes to Haiku first, falling back to Sonnet if Haiku's circuit opens. Other tiers: standard (Haiku→Sonnet→Opus), complex (Sonnet→Opus), critical (Opus→Sonnet).

Plug into a full pipeline run¶

from src.ai import AgentMetrics, ModelManager
from src.pipeline import (
    AgentRef, BudgetTracker, Capsule, CheckpointStore, Constraints,
    ContinuationStore, ExpectedOutput, LockManager, PipelineOrchestrator,
    TaskSpec, VerificationAgent,
)
from src.privacy import AuditChain

# 1. Real Anthropic-backed model manager
mm, anthropic_invoker = ModelManager.with_anthropic(max_tokens=4096)

# 2. Executor that turns the capsule into a prompt and runs it
def llm_executor(state):
    prompt = f"Task: {state.capsule.task.description}\n\n"
    prompt += "Acceptance criteria:\n"
    for ac in state.capsule.task.acceptance_criteria:
        prompt += f"- {ac}\n"
    result = mm.call(prompt, tier="complex", invoker=anthropic_invoker)
    state.budget.record_call(
        model=result["model"],
        tokens_in=result["tokens_in"],
        tokens_out=result["tokens_out"],
    )
    return {"artifact_main": result["response"]}

# 3. Wire orchestrator with full governance
orch = PipelineOrchestrator(
    budget=BudgetTracker(capsule_id="...", limit_tokens=50_000, limit_dollars=2.0),
    locks=LockManager(),
    verifier=VerificationAgent(),
    audit_chain=AuditChain(),
    metrics=AgentMetrics(),
    checkpoint_store=CheckpointStore(),
    continuation_store=ContinuationStore(),
)

# 4. Run
result = orch.run(capsule, executor=llm_executor)

How retry + fallback compose¶

Two layers protect every call:

Layer	What it handles	Configured by
SDK auto-retry (per-model)	Transient 429 / 408 / 409 / 5xx with exponential backoff (default 3 retries)	`AnthropicInvoker(max_retries=N)`
ModelManager fallback ladder (across models)	Persistent failure on a model → next route in the ladder; trips `CircuitBreaker` on repeated failures	`ModelManager.default()` / `with_anthropic()`

If Haiku's API returns 429 once → SDK retries with backoff → call succeeds. If Haiku is consistently failing → CircuitBreaker opens → ModelManager.select() returns Sonnet instead.

Cost tracking¶

Every call's tokens are captured. To estimate cost:

from src.pipeline.token_budget import estimate_cost

cost = estimate_cost(
    model=result["model"],
    tokens_input=result["tokens_in"],
    tokens_output=result["tokens_out"],
)
print(f"${cost:.6f}")

Pricing per million tokens (current rate card):

Model	Input	Output
`claude-haiku-4-5`	$1.00	$5.00
`claude-sonnet-4-6`	$3.00	$15.00
`claude-opus-4-7`	$5.00	$25.00

BudgetTracker.record_call() raises BudgetExceededError when the per-capsule cap is hit — wire this into state.budget.record_call(...) inside your executor for hard caps.

Network smoke tests¶

Real API call test lives in tests/smoke/test_llm_roundtrip.py (marked @pytest.mark.network). It's skipped automatically when ANTHROPIC_API_KEY is not set.

To run it:

export ANTHROPIC_API_KEY="sk-ant-..."
python -m pytest tests/smoke/test_llm_roundtrip.py -m network

Expected cost: < $0.001 per run (single Haiku call, 50–100 tokens).

Configure `AnthropicInvoker` in detail¶

Parameter	Default	Notes
`api_key`	`None` → reads `ANTHROPIC_API_KEY` env	Pass explicit for multi-tenant scenarios
`max_tokens`	`4096`	Per-call output cap
`max_retries`	`3`	SDK auto-retry on 429/408/409/5xx with exp backoff
`timeout_seconds`	`60.0`	Per-request timeout (SDK default is 600)
`system_prompt`	`None`	Optional system prompt prepended every call

Troubleshooting¶

Symptom	Likely cause	Fix
`AnthropicNotInstalledError`	`anthropic` package missing	`pip install -e ".[llm]"`
`ValueError: ANTHROPIC_API_KEY not set`	No env var, no explicit key	`export ANTHROPIC_API_KEY=...` or pass `api_key=`
`anthropic.AuthenticationError`	Invalid key	Verify the key in Console
`anthropic.RateLimitError` (after retries)	Tier limits exhausted	Add a slower tier in the ladder, or upgrade plan
`FallbackError: all models on 'X' unavailable`	All routes have OPEN circuits	Wait for cooldown (default 60s) or check upstream