Add Managed Agents outcomes, multiagent, and webhooks to claude-api skill (#1096)

This commit is contained in:
Lance Martin
2026-05-06 09:05:49 -07:00
committed by GitHub
parent d230a6dd6e
commit d211d43744
11 changed files with 374 additions and 12 deletions

View File

@@ -234,7 +234,7 @@ For placement patterns, architectural guidance, and the silent-invalidator audit
|---|---| |---|---|
| `managed-agents-onboard` | Walk the user through setting up a Managed Agent from scratch. **Read `shared/managed-agents-onboarding.md` immediately** and follow its interview script: mental model → know-or-explore branch → template config → session setup → emit code. Do not summarize — run the interview. | | `managed-agents-onboard` | Walk the user through setting up a Managed Agent from scratch. **Read `shared/managed-agents-onboarding.md` immediately** and follow its interview script: mental model → know-or-explore branch → template config → session setup → emit code. Do not summarize — run the interview. |
**Reading guide:** Start with `shared/managed-agents-overview.md`, then the topical `shared/managed-agents-*.md` files (core, environments, tools, events, memory, client-patterns, onboarding, api-reference). For Python, TypeScript, Go, Ruby, PHP, and Java, read `{lang}/managed-agents/README.md` for code examples. For cURL, read `curl/managed-agents.md`. **Agents are persistent — create once, reference by ID.** Store the agent ID returned by `agents.create` and pass it to every subsequent `sessions.create`; do not call `agents.create` in the request path. The Anthropic CLI is one convenient way to create agents and environments from version-controlled YAML (URL in `shared/live-sources.md`). If a binding you need isn't shown in the language README, WebFetch the relevant entry from `shared/live-sources.md` rather than guess. C# does not currently have Managed Agents support; use raw HTTP from `curl/managed-agents.md` as a reference. **Reading guide:** Start with `shared/managed-agents-overview.md`, then the topical `shared/managed-agents-*.md` files (core, environments, tools, events, outcomes, multiagent, webhooks, memory, client-patterns, onboarding, api-reference). For Python, TypeScript, Go, Ruby, PHP, and Java, read `{lang}/managed-agents/README.md` for code examples. For cURL, read `curl/managed-agents.md`. **Agents are persistent — create once, reference by ID.** Store the agent ID returned by `agents.create` and pass it to every subsequent `sessions.create`; do not call `agents.create` in the request path. The Anthropic CLI is one convenient way to create agents and environments from version-controlled YAML (URL in `shared/live-sources.md`). If a binding you need isn't shown in the language README, WebFetch the relevant entry from `shared/live-sources.md` rather than guess. C# does not currently have Managed Agents support; use raw HTTP from `curl/managed-agents.md` as a reference.
**When the user wants to set up a Managed Agent from scratch** (e.g. "how do I get started", "walk me through creating one", "set up a new agent"): read `shared/managed-agents-onboarding.md` and run its interview — same flow as the `managed-agents-onboard` subcommand. **When the user wants to set up a Managed Agent from scratch** (e.g. "how do I get started", "walk me through creating one", "set up a new agent"): read `shared/managed-agents-onboarding.md` and run its interview — same flow as the `managed-agents-onboard` subcommand.

View File

@@ -88,6 +88,7 @@ Use these when a managed-agents binding, behavior, or wire-level detail isn't co
| Permission Policies | `https://platform.claude.com/docs/en/managed-agents/permission-policies.md` | "Extract permission policy types (allow/deny/confirm) and per-tool config" | | Permission Policies | `https://platform.claude.com/docs/en/managed-agents/permission-policies.md` | "Extract permission policy types (allow/deny/confirm) and per-tool config" |
| Multi-Agent | `https://platform.claude.com/docs/en/managed-agents/multi-agent.md` | "Extract multi-agent composition patterns, sub-agent invocation, and result handoff" | | Multi-Agent | `https://platform.claude.com/docs/en/managed-agents/multi-agent.md` | "Extract multi-agent composition patterns, sub-agent invocation, and result handoff" |
| Observability | `https://platform.claude.com/docs/en/managed-agents/observability.md` | "Extract logging, tracing, and usage telemetry exposed by managed agents" | | Observability | `https://platform.claude.com/docs/en/managed-agents/observability.md` | "Extract logging, tracing, and usage telemetry exposed by managed agents" |
| Webhooks | `https://platform.claude.com/docs/en/managed-agents/webhooks.md` | "Extract webhook endpoint registration, HMAC signature verification, supported event types, and delivery semantics" |
| GitHub | `https://platform.claude.com/docs/en/managed-agents/github.md` | "Extract github_repository resource shape, multi-repo mounting, and token rotation" | | GitHub | `https://platform.claude.com/docs/en/managed-agents/github.md` | "Extract github_repository resource shape, multi-repo mounting, and token rotation" |
| MCP Connector | `https://platform.claude.com/docs/en/managed-agents/mcp-connector.md` | "Extract MCP server declaration on agents and vault-based credential injection at session" | | MCP Connector | `https://platform.claude.com/docs/en/managed-agents/mcp-connector.md` | "Extract MCP server declaration on agents and vault-based credential injection at session" |
| Vaults | `https://platform.claude.com/docs/en/managed-agents/vaults.md` | "Extract vault create, credential add/rotate, OAuth refresh shape, and archive" | | Vaults | `https://platform.claude.com/docs/en/managed-agents/vaults.md` | "Extract vault create, credential add/rotate, OAuth refresh shape, and archive" |

View File

@@ -23,15 +23,16 @@ All resources are under the `beta` namespace. Python and TypeScript share identi
| Environments | `environments.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Environments.New` / `Get` / `Update` / `List` / `Delete` / `Archive` | | Environments | `environments.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Environments.New` / `Get` / `Update` / `List` / `Delete` / `Archive` |
| Sessions | `sessions.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Sessions.New` / `Get` / `Update` / `List` / `Delete` / `Archive` | | Sessions | `sessions.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Sessions.New` / `Get` / `Update` / `List` / `Delete` / `Archive` |
| Session Events | `sessions.events.list` / `send` / `stream` | `Sessions.Events.List` / `Send` / `StreamEvents` | | Session Events | `sessions.events.list` / `send` / `stream` | `Sessions.Events.List` / `Send` / `StreamEvents` |
| Session Threads | `sessions.threads.list` / `retrieve` / `archive`; `sessions.threads.events.list` / `stream` | `Sessions.Threads.List` / `Get` / `Archive`; `Sessions.Threads.Events.List` / `StreamEvents` |
| Session Resources | `sessions.resources.add` / `retrieve` / `update` / `list` / `delete` | `Sessions.Resources.Add` / `Get` / `Update` / `List` / `Delete` | | Session Resources | `sessions.resources.add` / `retrieve` / `update` / `list` / `delete` | `Sessions.Resources.Add` / `Get` / `Update` / `List` / `Delete` |
| Vaults | `vaults.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Vaults.New` / `Get` / `Update` / `List` / `Delete` / `Archive` | | Vaults | `vaults.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Vaults.New` / `Get` / `Update` / `List` / `Delete` / `Archive` |
| Credentials | `vaults.credentials.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Vaults.Credentials.New` / `Get` / `Update` / `List` / `Delete` / `Archive` | | Credentials | `vaults.credentials.create` / `retrieve` / `update` / `list` / `delete` / `archive` / `mcp_oauth_validate` | `Vaults.Credentials.New` / `Get` / `Update` / `List` / `Delete` / `Archive` / `McpOauthValidate` |
| Memory Stores | `memory_stores.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `MemoryStores.New` / `Get` / `Update` / `List` / `Delete` / `Archive` | | Memory Stores | `memory_stores.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `MemoryStores.New` / `Get` / `Update` / `List` / `Delete` / `Archive` |
| Memories | `memory_stores.memories.create` / `retrieve` / `update` / `list` / `delete` | `MemoryStores.Memories.New` / `Get` / `Update` / `List` / `Delete` | | Memories | `memory_stores.memories.create` / `retrieve` / `update` / `list` / `delete` | `MemoryStores.Memories.New` / `Get` / `Update` / `List` / `Delete` |
| Memory Versions | `memory_stores.memory_versions.list` / `retrieve` / `redact` | `MemoryStores.MemoryVersions.List` / `Get` / `Redact` | | Memory Versions | `memory_stores.memory_versions.list` / `retrieve` / `redact` | `MemoryStores.MemoryVersions.List` / `Get` / `Redact` |
**Naming quirks to watch for:** **Naming quirks to watch for:**
- Agents have **no delete** — only `archive`. Archive is **permanent**: the agent becomes read-only, new sessions cannot reference it, and there is no unarchive. Confirm with the user before archiving a production agent. Environments, Sessions, Vaults, Credentials, and Memory Stores have both `delete` and `archive`; Session Resources, Files, Skills, and Memories are `delete`-only; Memory Versions have neither — only `redact`. - Agents and Session Threads have **no delete** — only `archive`. Archive is **permanent**: the agent becomes read-only, new sessions cannot reference it, and there is no unarchive. Confirm with the user before archiving a production agent. Environments, Sessions, Vaults, Credentials, and Memory Stores have both `delete` and `archive`; Session Resources, Files, Skills, and Memories are `delete`-only; Memory Versions have neither — only `redact`.
- Session resources use `add` (not `create`). - Session resources use `add` (not `create`).
- Go's event stream is `StreamEvents` (not `Stream`). - Go's event stream is `StreamEvents` (not `Stream`).
@@ -73,6 +74,18 @@ All resources are under the `beta` namespace. Python and TypeScript share identi
| `POST` | `/v1/sessions/{session_id}/events` | SendEvents | Send events (user message, tool result) | | `POST` | `/v1/sessions/{session_id}/events` | SendEvents | Send events (user message, tool result) |
| `GET` | `/v1/sessions/{session_id}/events/stream` | StreamEvents | Stream events via SSE | | `GET` | `/v1/sessions/{session_id}/events/stream` | StreamEvents | Stream events via SSE |
## Session Threads
Per-subagent event streams in multiagent sessions. See `shared/managed-agents-multiagent.md`.
| Method | Path | Operation | Description |
| -------- | ------------------------------------------------ | ---------------- | ---------------------------------------- |
| `GET` | `/v1/sessions/{session_id}/threads` | ListThreads | List threads (paginated) |
| `GET` | `/v1/sessions/{session_id}/threads/{thread_id}` | GetThread | Retrieve one thread (carries `agent` snapshot, `status`, `parent_thread_id`, `stats`, `usage`) |
| `POST` | `/v1/sessions/{session_id}/threads/{thread_id}/archive` | ArchiveThread | Archive a thread |
| `GET` | `/v1/sessions/{session_id}/threads/{thread_id}/events` | ListThreadEvents | List past events for one thread (paginated) |
| `GET` | `/v1/sessions/{session_id}/threads/{thread_id}/stream` | StreamThreadEvents | Stream one thread via SSE (SDK: `threads.events.stream`) |
## Session Resources ## Session Resources
| Method | Path | Operation | Description | | Method | Path | Operation | Description |
@@ -119,6 +132,7 @@ Credentials are individual secrets stored inside a vault.
| `POST` | `/v1/vaults/{vault_id}/credentials/{credential_id}` | UpdateCredential | Update credential | | `POST` | `/v1/vaults/{vault_id}/credentials/{credential_id}` | UpdateCredential | Update credential |
| `DELETE` | `/v1/vaults/{vault_id}/credentials/{credential_id}` | DeleteCredential | Delete credential | | `DELETE` | `/v1/vaults/{vault_id}/credentials/{credential_id}` | DeleteCredential | Delete credential |
| `POST` | `/v1/vaults/{vault_id}/credentials/{credential_id}/archive` | ArchiveCredential | Archive credential | | `POST` | `/v1/vaults/{vault_id}/credentials/{credential_id}/archive` | ArchiveCredential | Archive credential |
| `POST` | `/v1/vaults/{vault_id}/credentials/{credential_id}/mcp_oauth_validate` | McpOauthValidate | Validate an MCP OAuth credential |
## Memory Stores ## Memory Stores
@@ -206,13 +220,21 @@ Immutable per-mutation snapshots (`memver_...`) — the audit and rollback surfa
"url": "https://api.githubcopilot.com/mcp/" "url": "https://api.githubcopilot.com/mcp/"
} }
], ],
"multiagent": {
"type": "coordinator",
"agents": [
"agent_abc123",
{ "type": "agent", "id": "agent_def456", "version": 4 },
{ "type": "self" }
]
},
"metadata": { "metadata": {
"key": "value (max 16 pairs, keys ≤64 chars, values ≤512 chars)" "key": "value (max 16 pairs, keys ≤64 chars, values ≤512 chars)"
} }
} }
``` ```
> Limits: `tools` max 50, `skills` max 64, `mcp_servers` max 20 (unique names). > Limits: `tools` max 128, `skills` max 20, `mcp_servers` max 20 (unique names). `multiagent.agents` 120 entries (string ID | `{type:"agent",id,version?}` | `{type:"self"}`) — see `shared/managed-agents-multiagent.md`.
### CreateSession Request Body ### CreateSession Request Body
@@ -276,6 +298,19 @@ Immutable per-mutation snapshots (`memver_...`) — the audit and rollback surfa
} }
``` ```
### Define Outcome Event
```json
{
"type": "user.define_outcome",
"description": "Build a DCF model for Costco in .xlsx",
"rubric": { "type": "file", "file_id": "file_01..." },
"max_iterations": 5
}
```
> `rubric` is required: `{type: "text", content}` or `{type: "file", file_id}`. `max_iterations` default 3, max 20. Echoed back with `outcome_id` + `processed_at`. See `shared/managed-agents-outcomes.md`.
### Tool Result Event ### Tool Result Event
```json ```json

View File

@@ -132,8 +132,9 @@ const session = await client.beta.sessions.create(
| `system` | string | No | System prompt — defines the agent's behavior (up to 100K chars) | | `system` | string | No | System prompt — defines the agent's behavior (up to 100K chars) |
| `tools` | array | No | Encompasses three kinds: (1) pre-built Claude Agent tools (`agent_toolset_20260401`), (2) MCP tools (`mcp_toolset`), and (3) custom client-side tools. Max 128. | | `tools` | array | No | Encompasses three kinds: (1) pre-built Claude Agent tools (`agent_toolset_20260401`), (2) MCP tools (`mcp_toolset`), and (3) custom client-side tools. Max 128. |
| `mcp_servers` | array | No | MCP server connections — standardized third-party capabilities (e.g. GitHub, Asana). Max 20, unique names. See `shared/managed-agents-tools.md` → MCP Servers. | | `mcp_servers` | array | No | MCP server connections — standardized third-party capabilities (e.g. GitHub, Asana). Max 20, unique names. See `shared/managed-agents-tools.md` → MCP Servers. |
| `skills` | array | No | Customized "best-practices" context with progressive disclosure. Max 64. See `shared/managed-agents-tools.md` → Skills. | | `skills` | array | No | Customized "best-practices" context with progressive disclosure. Max 20. See `shared/managed-agents-tools.md` → Skills. |
| `description` | string | No | Description of the agent (up to 2048 chars) | | `description` | string | No | Description of the agent (up to 2048 chars) |
| `multiagent` | object | No | `{type: "coordinator", agents: [...]}` — roster this agent may delegate to. See `shared/managed-agents-multiagent.md`. |
| `metadata` | object | No | Arbitrary key-value pairs (max 16, keys ≤64 chars, values ≤512 chars) | | `metadata` | object | No | Arbitrary key-value pairs (max 16, keys ≤64 chars, values ≤512 chars) |
--- ---
@@ -153,8 +154,9 @@ The API is **flat** — `model`, `system`, `tools` etc. are top-level fields, no
| `system` | string | No | System prompt | | `system` | string | No | System prompt |
| `tools` | array | No | Agent toolset / MCP toolset / custom tools | | `tools` | array | No | Agent toolset / MCP toolset / custom tools |
| `mcp_servers` | array | No | MCP server connections | | `mcp_servers` | array | No | MCP server connections |
| `skills` | array | No | Skill references (max 64) | | `skills` | array | No | Skill references (max 20) |
| `description` | string | No | Description of the agent | | `description` | string | No | Description of the agent |
| `multiagent` | object | No | Coordinator roster — see `shared/managed-agents-multiagent.md` |
| `metadata` | object | No | Arbitrary key-value pairs | | `metadata` | object | No | Arbitrary key-value pairs |
### Lifecycle: create once, run many, update in place ### Lifecycle: create once, run many, update in place

View File

@@ -12,13 +12,15 @@ Send events to a session via `POST /v1/sessions/{id}/events`.
| `user.interrupt` | Interrupt the agent while it's running | | `user.interrupt` | Interrupt the agent while it's running |
| `user.tool_confirmation` | Approve/deny a tool call (when `always_ask` policy) | | `user.tool_confirmation` | Approve/deny a tool call (when `always_ask` policy) |
| `user.custom_tool_result` | Provide result for a custom tool call | | `user.custom_tool_result` | Provide result for a custom tool call |
| `user.define_outcome` | Start a rubric-graded iterate loop — see `shared/managed-agents-outcomes.md` |
### Receiving Events ### Receiving Events
Two methods: Three methods:
1. **Streaming (SSE)**: `GET /v1/sessions/{id}/events/stream` — real-time Server-Sent Events. **Long-lived** — the server sends periodic heartbeats to keep the connection alive. 1. **Streaming (SSE)**: `GET /v1/sessions/{id}/events/stream` — real-time Server-Sent Events. **Long-lived** — the server sends periodic heartbeats to keep the connection alive.
2. **Polling**: `GET /v1/sessions/{id}/events` — paginated event list (query params: `limit` default 1000, `page`). **Returns immediately** — this is a plain paginated GET, not a long-poll. 2. **Polling**: `GET /v1/sessions/{id}/events` — paginated event list (query params: `limit` default 1000, `page`). **Returns immediately** — this is a plain paginated GET, not a long-poll.
3. **Webhooks**: Anthropic POSTs session state transitions to your HTTPS endpoint — thin payloads (IDs only), HMAC-signed, Console-registered. See `shared/managed-agents-webhooks.md`.
All received events carry `id`, `type`, and `processed_at` (ISO 8601; `null` if not yet processed by the agent). All received events carry `id`, `type`, and `processed_at` (ISO 8601; `null` if not yet processed by the agent).
@@ -47,8 +49,12 @@ Event types use dot notation, grouped by namespace:
| `session.error` | Error occurred during processing | | `session.error` | Error occurred during processing |
| `span.model_request_start` | Model inference started | | `span.model_request_start` | Model inference started |
| `span.model_request_end` | Model inference completed | | `span.model_request_end` | Model inference completed |
| `span.outcome_evaluation_start` / `_ongoing` / `_end` | Grader progress for outcome-oriented sessions — see `shared/managed-agents-outcomes.md` |
| `session.thread_created` | Subagent thread spawned (multiagent) — see `shared/managed-agents-multiagent.md` |
| `session.thread_status_running` / `_idle` / `_rescheduled` / `_terminated` | Subagent thread status transitions (multiagent). `_idle` carries `stop_reason`. |
| `agent.thread_message_sent` / `_received` | Cross-thread message, carries `to_session_thread_id` / `from_session_thread_id` (multiagent) |
The stream also echoes back user-sent events (`user.message`, `user.interrupt`, `user.tool_confirmation`, `user.custom_tool_result`). The stream also echoes back user-sent events (`user.message`, `user.interrupt`, `user.tool_confirmation`, `user.custom_tool_result`, `user.define_outcome`).
--- ---
@@ -125,7 +131,7 @@ await client.beta.sessions.events.send(sessionId, {
}); });
``` ```
The agent stops mid-task. It does not see the interrupt as a message — it just halts. Send a follow-up `user` event to explain what to do instead. The agent stops mid-task. It does not see the interrupt as a message — it just halts. Send a follow-up `user` event to explain what to do instead. If an outcome is active, the interrupt also marks `span.outcome_evaluation_end.result: "interrupted"` (see `shared/managed-agents-outcomes.md`).
> **Note**: Interrupt events may have empty IDs in the current implementation. When troubleshooting, use the `processed_at` timestamp along with surrounding event IDs. > **Note**: Interrupt events may have empty IDs in the current implementation. When troubleshooting, use the `processed_at` timestamp along with surrounding event IDs.

View File

@@ -0,0 +1,99 @@
# Managed Agents — Multiagent Sessions
A coordinator agent can delegate to other agents within one session. All agents **share the container and filesystem**; each runs in its own **thread** — a context-isolated event stream with its own conversation history, model, system prompt, tools, MCP servers, and skills (from that agent's own config). Threads are persistent: the coordinator can send a follow-up to a subagent it called earlier and that subagent retains its prior turns.
The SDK sets the `managed-agents-2026-04-01` beta header automatically on all `client.beta.{agents,sessions}.*` calls; no additional header is required for multiagent.
---
## Declare the roster on the coordinator
`multiagent` is a **top-level field** on `agents.create()` / `agents.update()`**not** a `tools[]` entry. `agents` lists 120 roster entries. Nothing changes on `sessions.create()` — the roster is resolved from the coordinator's config.
```python
orchestrator = client.beta.agents.create(
name="Engineering Lead",
model="{{OPUS_ID}}",
system="You coordinate engineering work. Delegate code review to the reviewer and test writing to the test agent.",
tools=[{"type": "agent_toolset_20260401"}],
multiagent={
"type": "coordinator",
"agents": [
reviewer.id, # bare string — latest version
{"type": "agent", "id": test_writer.id, "version": 4}, # pinned version
{"type": "self"}, # the coordinator itself
],
},
)
session = client.beta.sessions.create(agent=orchestrator.id, environment_id=env.id)
```
| Roster entry | Shape | Notes |
|---|---|---|
| String shorthand | `"agent_abc123"` | References the latest version of a stored agent. |
| Agent reference | `{type: "agent", id, version?}` | Omit `version` to pin the latest at coordinator save time. |
| Self | `{type: "self"}` | The coordinator can spawn copies of itself. |
Up to **20 unique agents** in the roster; the coordinator may spawn **multiple copies** of each. **One level of delegation only** — depth > 1 is ignored.
---
## Threads
The session-level event stream is the **primary thread** — it shows the coordinator's trace plus a condensed view of subagent activity (thread status transitions and cross-thread messages, not every subagent tool call). Drill into a specific subagent via the per-thread endpoints:
| Operation | HTTP | SDK (`client.beta.sessions.threads.*`) |
|---|---|---|
| List threads | `GET /v1/sessions/{sid}/threads` | `.list(session_id)` |
| Retrieve one | `GET /v1/sessions/{sid}/threads/{tid}` | `.retrieve(thread_id, session_id=...)` |
| Archive | `POST /v1/sessions/{sid}/threads/{tid}/archive` | `.archive(thread_id, session_id=...)` |
| List thread events | `GET /v1/sessions/{sid}/threads/{tid}/events` | `.events.list(thread_id, session_id=...)` |
| Stream thread events | `GET /v1/sessions/{sid}/threads/{tid}/stream` | `.events.stream(thread_id, session_id=...)` |
Each `SessionThread` carries `id`, `status` (`running` | `idle` | `rescheduling` | `terminated`), `agent` (a resolved snapshot of the agent config — `id`, `name`, `model`, `system`, `tools`, `skills`, `mcp_servers`, `version`), `parent_thread_id` (null for the primary thread, which is included in the list), `archived_at`, and optional `stats`/`usage`. **Session status aggregates thread statuses** — if any thread is `running`, `session.status` is `running`. Max **25 concurrent threads**. When draining a per-thread stream, break on `session.thread_status_idle` (and check its `stop_reason` as you would for the session-level idle).
---
## Multiagent events (on the session stream)
| Event | Payload highlights | Meaning |
|---|---|---|
| `session.thread_created` | `session_thread_id`, `agent_name` | A new thread was created. |
| `session.thread_status_running` | `session_thread_id`, `agent_name` | Thread started activity. |
| `session.thread_status_idle` | `session_thread_id`, `agent_name`, **`stop_reason`** | Thread is awaiting input. Inspect `stop_reason` (same shape as `session.status_idle.stop_reason`). |
| `session.thread_status_rescheduled` | `session_thread_id`, `agent_name` | Thread is rescheduling after a retryable error. |
| `session.thread_status_terminated` | `session_thread_id`, `agent_name` | Thread was archived or hit a terminal error. |
| `agent.thread_message_sent` | `to_session_thread_id`, `to_agent_name`, `content` | Coordinator sent a follow-up to another thread. |
| `agent.thread_message_received` | `from_session_thread_id`, `from_agent_name`, `content` | An agent delivered its result to the coordinator. |
---
## Tool permissions and custom tools from subagent threads
When a subagent needs your client (an `always_ask` confirmation, or a custom tool result), the request is **cross-posted to the primary thread** with `session_thread_id` identifying the originating thread — so you only need to watch the session stream. Reply with `user.tool_confirmation` (carrying `tool_use_id`) or `user.custom_tool_result` (carrying `custom_tool_use_id`), and **echo the `session_thread_id` from the originating event** (the SDK param type and docstring expect it). The server also routes by the tool-use ID, so the echo is belt-and-suspenders rather than load-bearing — but include it.
```python
for event_id in stop.event_ids:
pending = events_by_id[event_id]
confirmation = {
"type": "user.tool_confirmation",
"tool_use_id": event_id,
"result": "allow",
}
if pending.session_thread_id is not None:
confirmation["session_thread_id"] = pending.session_thread_id
client.beta.sessions.events.send(session.id, events=[confirmation])
```
The same pattern applies to `user.custom_tool_result`.
---
## Pitfalls
- **Don't put the roster on `sessions.create()` or in `tools[]`.** `multiagent` is a top-level agent field; update the coordinator, then start a session that references it.
- **Don't assume shared context.** Threads share the filesystem but not conversation history or tools. If the coordinator needs a subagent to act on something, it must say so in the delegated message (or write it to disk).
- **Depth > 1 is ignored.** A subagent's own `multiagent` roster (if any) doesn't cascade — only the session's coordinator delegates.
For per-language bindings beyond Python, WebFetch `https://platform.claude.com/docs/en/managed-agents/multi-agent.md` (see `shared/live-sources.md`).

View File

@@ -51,7 +51,7 @@ Three rounds. Batch the questions in each round; don't ask them one at a time.
**Round B — Skills, files, and repos.** What the agent has on hand when it starts. **Round B — Skills, files, and repos.** What the agent has on hand when it starts.
*Skills* — two types; both work the same way — Claude auto-uses them when relevant. Max 64 per agent. *Skills* — two types; both work the same way — Claude auto-uses them when relevant. Max 20 per agent.
- [ ] **Pre-built Agent Skills**: `xlsx`, `docx`, `pptx`, `pdf`. Reference by name. - [ ] **Pre-built Agent Skills**: `xlsx`, `docx`, `pptx`, `pdf`. Reference by name.
- [ ] **Custom Skills**: skills uploaded to the user's org via the Skills API. Reference by `skill_id` + optional `version`. If the skill doesn't exist yet, walk the user through `POST /v1/skills` + `POST /v1/skills/{id}/versions` (beta header `skills-2025-10-02`). Full detail: `shared/managed-agents-tools.md` → Skills + Skills API. - [ ] **Custom Skills**: skills uploaded to the user's org via the Skills API. Reference by `skill_id` + optional `version`. If the skill doesn't exist yet, walk the user through `POST /v1/skills` + `POST /v1/skills/{id}/versions` (beta header `skills-2025-10-02`). Full detail: `shared/managed-agents-tools.md` → Skills + Skills API.

View File

@@ -0,0 +1,106 @@
# Managed Agents — Outcomes
An **outcome** elevates a session from *conversation* to *work*: you state what "done" looks like, and the harness runs an iterate → grade → revise loop until the artifact meets the rubric, hits `max_iterations`, or is interrupted. A separate **grader** (independent context window) scores each iteration against your rubric and feeds per-criterion gaps back to the agent.
The SDK sets the `managed-agents-2026-04-01` beta header automatically on all `client.beta.sessions.*` calls; no additional header is required for outcomes.
---
## The `user.define_outcome` event
Outcomes are not a field on `sessions.create()`. You create a normal session, then send a `user.define_outcome` event. The agent starts working on receipt — **do not also send a `user.message`** to kick it off.
```python
session = client.beta.sessions.create(
agent=AGENT_ID,
environment_id=ENVIRONMENT_ID,
title="Financial analysis on Costco",
)
client.beta.sessions.events.send(
session_id=session.id,
events=[
{
"type": "user.define_outcome",
"description": "Build a DCF model for Costco in .xlsx",
"rubric": {"type": "text", "content": RUBRIC_MD},
# or: "rubric": {"type": "file", "file_id": rubric.id}
"max_iterations": 5, # optional; default 3, max 20
}
],
)
```
| Field | Type | Notes |
|---|---|---|
| `type` | `"user.define_outcome"` | |
| `description` | string | The task. This is what the agent works toward — no separate `user.message` needed. |
| `rubric` | `{type: "text", content}` \| `{type: "file", file_id}` | **Required.** Markdown with explicit, independently gradeable criteria. Upload once via `client.beta.files.upload(...)` (beta `files-api-2025-04-14`) to reuse across sessions. |
| `max_iterations` | int | Optional. Default **3**, max **20**. |
The event is echoed back on the stream with a server-assigned `outcome_id` and `processed_at`.
> **Writing rubrics.** Use explicit, gradeable criteria ("CSV has a numeric `price` column"), not vibes ("data looks good") — the grader scores each criterion independently, so vague criteria produce noisy loops. If you don't have a rubric, have Claude analyze a known-good artifact and turn that analysis into one.
---
## Outcome-specific events
These appear on the standard event stream (`sessions.events.stream` / `.list`) alongside the usual `agent.*` / `session.*` events.
| Event | Payload highlights | Meaning |
|---|---|---|
| `span.outcome_evaluation_start` | `outcome_id`, `iteration` (0-indexed) | Grader began scoring iteration *N*. |
| `span.outcome_evaluation_ongoing` | `outcome_id` | Heartbeat while the grader runs. Grader reasoning is opaque — you see *that* it's working, not *what* it's thinking. |
| `span.outcome_evaluation_end` | `outcome_evaluation_start_id`, `outcome_id`, `iteration`, `result`, `explanation`, `usage` | Grader finished one iteration. `result` drives what happens next (table below). |
### `span.outcome_evaluation_end.result`
| `result` | Next |
|---|---|
| `satisfied` | Session → `idle`. Terminal for this outcome. |
| `needs_revision` | Agent starts another iteration. |
| `max_iterations_reached` | No further grader cycles. Agent may run one final revision, then session → `idle`. |
| `failed` | Session → `idle`. Rubric fundamentally doesn't match the task (e.g. description and rubric contradict). |
| `interrupted` | Only emitted if `_start` had already fired before a `user.interrupt` arrived. |
```json
{
"type": "span.outcome_evaluation_end",
"id": "sevt_01jkl...",
"outcome_evaluation_start_id": "sevt_01def...",
"outcome_id": "outc_01a...",
"result": "satisfied",
"explanation": "All 12 criteria met: revenue projections use 5 years of historical data, ...",
"iteration": 0,
"usage": { "input_tokens": 2400, "output_tokens": 350, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 1800 },
"processed_at": "2026-03-25T14:03:00Z"
}
```
---
## Checking status & retrieving deliverables
**Status** — either watch the stream for `span.outcome_evaluation_end`, or poll the session and read `outcome_evaluations`:
```python
session = client.beta.sessions.retrieve(session.id)
for ev in session.outcome_evaluations:
print(f"{ev.outcome_id}: {ev.result}") # outc_01a...: satisfied
```
**Deliverables** — the agent writes to `/mnt/session/outputs/`. Once idle, fetch via the Files API with `scope_id=session.id`. This is the same session-outputs mechanism documented in `shared/managed-agents-environments.md` → Session outputs (including the dual-beta-header requirement on `files.list`).
---
## Interaction rules & pitfalls
- **One outcome at a time.** Chain by sending the next `user.define_outcome` only after the previous one's terminal `span.outcome_evaluation_end` (`satisfied` / `max_iterations_reached` / `failed` / `interrupted`). The session retains history across chained outcomes.
- **Steering is allowed but optional.** You *may* send `user.message` events mid-outcome to nudge direction, but the agent already knows to keep working until terminal — don't send "keep going" prompts.
- **`user.interrupt` pauses the current outcome** — it marks `result: "interrupted"` and leaves the session `idle`, ready for a new outcome or conversational turn.
- **After terminal, the session is reusable** — continue conversationally or define a new outcome.
- **Outcome ≠ session-create field.** Don't put `outcome`, `rubric`, or `description` on `sessions.create()` — outcomes are always sent as a `user.define_outcome` event.
- **Idle-break gate is unchanged.** In your drain loop, keep using `event.type === 'session.status_idle' && event.stop_reason?.type !== 'requires_action'` — do **not** gate on `span.outcome_evaluation_end` alone (on `needs_revision` the session keeps running). See `shared/managed-agents-client-patterns.md` Pattern 5.
For the raw HTTP shapes and per-language SDK bindings beyond Python, WebFetch `https://platform.claude.com/docs/en/managed-agents/define-outcomes.md` (see `shared/live-sources.md`).

View File

@@ -25,7 +25,7 @@ Managed Agents is in beta. The SDK sets required beta headers automatically:
| Beta Header | What it enables | | Beta Header | What it enables |
| ------------------------------ | ---------------------------------------------------- | | ------------------------------ | ---------------------------------------------------- |
| `managed-agents-2026-04-01` | Agents, Environments, Sessions, Events, Session Resources, Vaults, Credentials, Memory Stores | | `managed-agents-2026-04-01` | Agents, Environments, Sessions, Events, Session Resources, Session Threads, Outcomes, Multiagent, Vaults, Credentials, Memory Stores |
| `skills-2025-10-02` | Skills API (for managing custom skill definitions) | | `skills-2025-10-02` | Skills API (for managing custom skill definitions) |
| `files-api-2025-04-14` | Files API for file uploads | | `files-api-2025-04-14` | Files API for file uploads |
@@ -45,6 +45,9 @@ Managed Agents is in beta. The SDK sets required beta headers automatically:
| Configure tools and permissions | `shared/managed-agents-tools.md` | | Configure tools and permissions | `shared/managed-agents-tools.md` |
| Set up MCP servers | `shared/managed-agents-tools.md` (MCP Servers section) | | Set up MCP servers | `shared/managed-agents-tools.md` (MCP Servers section) |
| Stream events / handle tool_use | `shared/managed-agents-events.md` + language file | | Stream events / handle tool_use | `shared/managed-agents-events.md` + language file |
| Get notified of session state changes via webhook (no polling) | `shared/managed-agents-webhooks.md` — Console-registered endpoint, HMAC verify, thin payload + fetch |
| Define an outcome / rubric-graded iterate loop | `shared/managed-agents-outcomes.md``user.define_outcome` event, grader, `span.outcome_evaluation_*` events |
| Coordinate multiple agents / subagents / threads | `shared/managed-agents-multiagent.md``multiagent: {type: "coordinator", agents: [...]}` on the agent, session threads, cross-posted tool confirmations |
| Set up environments | `shared/managed-agents-environments.md` + language file | | Set up environments | `shared/managed-agents-environments.md` + language file |
| Upload files / attach repos | `shared/managed-agents-environments.md` (Resources) | | Upload files / attach repos | `shared/managed-agents-environments.md` (Resources) |
| Give agents persistent memory across sessions | `shared/managed-agents-memory.md` — memory stores, `memory_store` session resource, preconditions, versions/redact | | Give agents persistent memory across sessions | `shared/managed-agents-memory.md` — memory stores, `memory_store` session resource, preconditions, versions/redact |

View File

@@ -258,7 +258,7 @@ Two types — both work the same way; the agent automatically uses them when rel
| **Pre-built Anthropic skills** | Common document tasks (PowerPoint, Excel, Word, PDF). Reference by name (e.g. `xlsx`). | | **Pre-built Anthropic skills** | Common document tasks (PowerPoint, Excel, Word, PDF). Reference by name (e.g. `xlsx`). |
| **Custom skills** | Skills you've created in your organization via the Skills API. Reference by `skill_id` + optional `version`. | | **Custom skills** | Skills you've created in your organization via the Skills API. Reference by `skill_id` + optional `version`. |
**Max 64 skills per agent.** Agent creation uses `managed-agents-2026-04-01`; the separate Skills API (for managing custom skill definitions) uses `skills-2025-10-02`. **Max 20 skills per agent.** Agent creation uses `managed-agents-2026-04-01`; the separate Skills API (for managing custom skill definitions) uses `skills-2025-10-02`.
### Enabling skills on a session ### Enabling skills on a session

View File

@@ -0,0 +1,110 @@
# Managed Agents — Webhooks
Anthropic can POST to your HTTPS endpoint when a Managed Agents resource changes state — an alternative to holding an SSE stream or polling. Payloads are **thin** (event type + resource IDs only); on receipt, fetch the resource for current state. Every delivery is HMAC-signed.
> **Direction matters.** This page covers *Anthropic → you* notifications about session/vault state. It does **not** cover *third-party → you* webhooks that *trigger* a session (e.g. a GitHub push handler that calls `sessions.create()`) — that's ordinary application code on your side with no Anthropic-specific wire format.
---
## Register an endpoint (Console only)
Console → **Manage → Webhooks**. There is no programmatic endpoint-management API yet. Secret rotation is supported from the same page.
| Field | Constraint |
|---|---|
| URL | HTTPS on port 443, publicly resolvable hostname |
| Event types | Subscribe per `data.type` — you only receive subscribed types (plus test events) |
| Signing secret | `whsec_`-prefixed, 32 bytes, **shown once at creation** — store it |
---
## Verify the signature
Every delivery is HMAC-signed. **Use the SDK's `client.beta.webhooks.unwrap()`** — it verifies the signature, rejects payloads more than ~5 minutes old, and returns the parsed event. It reads the `whsec_` secret from `ANTHROPIC_WEBHOOK_SIGNING_KEY`.
```python
import anthropic
from flask import Flask, request
client = anthropic.Anthropic() # reads ANTHROPIC_WEBHOOK_SIGNING_KEY from env
app = Flask(__name__)
@app.route("/webhook", methods=["POST"])
def webhook():
try:
event = client.beta.webhooks.unwrap(
request.get_data(as_text=True),
headers=dict(request.headers),
)
except Exception:
return "invalid signature", 400
if event.id in seen_event_ids: # dedupe retries — id is per-event, not per-delivery
return "", 204
seen_event_ids.add(event.id)
match event.data.type:
case "session.status_idled":
session = client.beta.sessions.retrieve(event.data.id)
notify_user(session)
case "vault_credential.refresh_failed":
alert_oncall(event.data.id)
return "", 204
```
Pass the **raw request body** to `unwrap()` — frameworks that re-serialize JSON (Express `.json()`, Flask `.get_json()`) change the bytes and break the MAC. For other languages, look up the `beta.webhooks.unwrap` binding in the SDK repo (`shared/live-sources.md`); don't hand-roll verification.
---
## Payload envelope
```json
{
"type": "event",
"id": "event_01ABC...",
"created_at": "2026-03-18T14:05:22Z",
"data": {
"type": "session.status_idled",
"id": "session_01XYZ...",
"organization_id": "8a3d2f1e-...",
"workspace_id": "c7b0e4d9-..."
}
}
```
Switch on `data.type`, fetch the resource by `data.id`, return any **2xx** to acknowledge. `created_at` is when the *state transition* happened, not when the webhook fired.
---
## Supported `data.type` values
| `data.type` | Fires when |
|---|---|
| `session.status_scheduled` | Session created and ready to accept events |
| `session.status_run_started` | Agent execution kicked off (every transition to `running`) |
| `session.status_idled` | Agent awaiting input (tool approval, custom tool result, or next message) |
| `session.status_terminated` | Session hit a terminal error |
| `session.thread_created` | Multiagent: coordinator opened a new subagent thread |
| `session.thread_idled` | Multiagent: a subagent thread is waiting for input |
| `session.outcome_evaluation_ended` | Outcome grader finished one iteration |
| `vault.archived` | Vault was archived |
| `vault.created` | Vault was created |
| `vault.deleted` | Vault was deleted |
| `vault_credential.archived` | Vault credential was archived |
| `vault_credential.created` | Vault credential was created |
| `vault_credential.deleted` | Vault credential was deleted |
| `vault_credential.refresh_failed` | MCP OAuth vault credential failed to refresh |
> These are **webhook** `data.type` values — a separate namespace from SSE event types (`session.status_idle`, `span.outcome_evaluation_end`, etc. in `shared/managed-agents-events.md`). Don't reuse SSE constants in webhook handlers.
---
## Delivery behavior & pitfalls
- **No ordering guarantee.** `session.status_idled` may arrive before `session.outcome_evaluation_ended` even if the evaluation finished first. Sort by envelope `created_at` if order matters.
- **Retries carry the same `event.id`.** At least one retry on non-2xx. Dedupe on `event.id`.
- **3xx is failure.** Redirects are not followed — update the URL in Console if your endpoint moves.
- **Auto-disable** after ~20 consecutive failed deliveries, or immediately if the hostname resolves to a private IP or returns a redirect. Re-enable manually in Console.
- **Thin payload is intentional.** Don't expect `stop_reason`, `outcome_evaluations`, credential secrets, etc. on the webhook body — fetch the resource.