How-to: connect the Anthropic API¶
Plug the real Anthropic API into the runtime so pipelines invoke claude-haiku-4-5, claude-sonnet-4-6, or claude-opus-4-7 instead of the mock invoker.
Install¶
The anthropic SDK is an optional dependency. The pipeline works without it (using the mock invoker); install it only when you want real LLM calls.
Set the API key¶
export ANTHROPIC_API_KEY="sk-ant-..." # Linux/macOS
$env:ANTHROPIC_API_KEY="sk-ant-..." # PowerShell
Or pass api_key=... explicitly to AnthropicInvoker(api_key=...).
Quickstart — one call¶
from src.ai import ModelManager
mm, invoker = ModelManager.with_anthropic()
result = mm.call("Explain RAG in one sentence.", tier="simple", invoker=invoker)
print(result["response"]) # actual text from claude-haiku-4-5
print(result["tokens_in"], "in /", result["tokens_out"], "out")
print(result["model"]) # claude-haiku-4-5
tier="simple" routes to Haiku first, falling back to Sonnet if Haiku's circuit opens. Other tiers: standard (Haiku→Sonnet→Opus), complex (Sonnet→Opus), critical (Opus→Sonnet).
Plug into a full pipeline run¶
from src.ai import AgentMetrics, ModelManager
from src.pipeline import (
AgentRef, BudgetTracker, Capsule, CheckpointStore, Constraints,
ContinuationStore, ExpectedOutput, LockManager, PipelineOrchestrator,
TaskSpec, VerificationAgent,
)
from src.privacy import AuditChain
# 1. Real Anthropic-backed model manager
mm, anthropic_invoker = ModelManager.with_anthropic(max_tokens=4096)
# 2. Executor that turns the capsule into a prompt and runs it
def llm_executor(state):
prompt = f"Task: {state.capsule.task.description}\n\n"
prompt += "Acceptance criteria:\n"
for ac in state.capsule.task.acceptance_criteria:
prompt += f"- {ac}\n"
result = mm.call(prompt, tier="complex", invoker=anthropic_invoker)
state.budget.record_call(
model=result["model"],
tokens_in=result["tokens_in"],
tokens_out=result["tokens_out"],
)
return {"artifact_main": result["response"]}
# 3. Wire orchestrator with full governance
orch = PipelineOrchestrator(
budget=BudgetTracker(capsule_id="...", limit_tokens=50_000, limit_dollars=2.0),
locks=LockManager(),
verifier=VerificationAgent(),
audit_chain=AuditChain(),
metrics=AgentMetrics(),
checkpoint_store=CheckpointStore(),
continuation_store=ContinuationStore(),
)
# 4. Run
result = orch.run(capsule, executor=llm_executor)
How retry + fallback compose¶
Two layers protect every call:
| Layer | What it handles | Configured by |
|---|---|---|
| SDK auto-retry (per-model) | Transient 429 / 408 / 409 / 5xx with exponential backoff (default 3 retries) | AnthropicInvoker(max_retries=N) |
| ModelManager fallback ladder (across models) | Persistent failure on a model → next route in the ladder; trips CircuitBreaker on repeated failures |
ModelManager.default() / with_anthropic() |
If Haiku's API returns 429 once → SDK retries with backoff → call succeeds. If Haiku is consistently failing → CircuitBreaker opens → ModelManager.select() returns Sonnet instead.
Cost tracking¶
Every call's tokens are captured. To estimate cost:
from src.pipeline.token_budget import estimate_cost
cost = estimate_cost(
model=result["model"],
tokens_input=result["tokens_in"],
tokens_output=result["tokens_out"],
)
print(f"${cost:.6f}")
Pricing per million tokens (current rate card):
| Model | Input | Output |
|---|---|---|
claude-haiku-4-5 |
$1.00 | $5.00 |
claude-sonnet-4-6 |
$3.00 | $15.00 |
claude-opus-4-7 |
$5.00 | $25.00 |
BudgetTracker.record_call() raises BudgetExceededError when the per-capsule cap is hit — wire this into state.budget.record_call(...) inside your executor for hard caps.
Network smoke tests¶
Real API call test lives in tests/smoke/test_llm_roundtrip.py (marked @pytest.mark.network). It's skipped automatically when ANTHROPIC_API_KEY is not set.
To run it:
Expected cost: < $0.001 per run (single Haiku call, 50–100 tokens).
Configure AnthropicInvoker in detail¶
| Parameter | Default | Notes |
|---|---|---|
api_key |
None → reads ANTHROPIC_API_KEY env |
Pass explicit for multi-tenant scenarios |
max_tokens |
4096 |
Per-call output cap |
max_retries |
3 |
SDK auto-retry on 429/408/409/5xx with exp backoff |
timeout_seconds |
60.0 |
Per-request timeout (SDK default is 600) |
system_prompt |
None |
Optional system prompt prepended every call |
Troubleshooting¶
| Symptom | Likely cause | Fix |
|---|---|---|
AnthropicNotInstalledError |
anthropic package missing |
pip install -e ".[llm]" |
ValueError: ANTHROPIC_API_KEY not set |
No env var, no explicit key | export ANTHROPIC_API_KEY=... or pass api_key= |
anthropic.AuthenticationError |
Invalid key | Verify the key in Console |
anthropic.RateLimitError (after retries) |
Tier limits exhausted | Add a slower tier in the ladder, or upgrade plan |
FallbackError: all models on 'X' unavailable |
All routes have OPEN circuits | Wait for cooldown (default 60s) or check upstream |
See also¶
- Tutorial: first pipeline run — uses the mock invoker
- MODEL-MANAGEMENT spec
- CIRCUIT-BREAKER spec
- TOKEN-BUDGET-PROTOCOL spec