From d211d437443a7b2496a3dad9575e7dddd724c585 Mon Sep 17 00:00:00 2001
From: Lance Martin <122662504+rlancemartin@users.noreply.github.com>
Date: Wed, 6 May 2026 09:05:49 -0700
Subject: [PATCH] Add Managed Agents outcomes, multiagent, and webhooks to
 claude-api skill (#1096)

---
 skills/claude-api/SKILL.md                    |   2 +-
 skills/claude-api/shared/live-sources.md      |   1 +
 .../shared/managed-agents-api-reference.md    |  41 ++++++-
 .../claude-api/shared/managed-agents-core.md  |   6 +-
 .../shared/managed-agents-events.md           |  12 +-
 .../shared/managed-agents-multiagent.md       |  99 ++++++++++++++++
 .../shared/managed-agents-onboarding.md       |   2 +-
 .../shared/managed-agents-outcomes.md         | 106 +++++++++++++++++
 .../shared/managed-agents-overview.md         |   5 +-
 .../claude-api/shared/managed-agents-tools.md |   2 +-
 .../shared/managed-agents-webhooks.md         | 110 ++++++++++++++++++
 11 files changed, 374 insertions(+), 12 deletions(-)
 create mode 100644 skills/claude-api/shared/managed-agents-multiagent.md
 create mode 100644 skills/claude-api/shared/managed-agents-outcomes.md
 create mode 100644 skills/claude-api/shared/managed-agents-webhooks.md

diff --git a/skills/claude-api/SKILL.md b/skills/claude-api/SKILL.md
index ebff1c796..9412f082c 100644
--- a/skills/claude-api/SKILL.md
+++ b/skills/claude-api/SKILL.md
@@ -234,7 +234,7 @@ For placement patterns, architectural guidance, and the silent-invalidator audit
 |---|---|
 | `managed-agents-onboard` | Walk the user through setting up a Managed Agent from scratch. **Read `shared/managed-agents-onboarding.md` immediately** and follow its interview script: mental model → know-or-explore branch → template config → session setup → emit code. Do not summarize — run the interview. |
 
-**Reading guide:** Start with `shared/managed-agents-overview.md`, then the topical `shared/managed-agents-*.md` files (core, environments, tools, events, memory, client-patterns, onboarding, api-reference). For Python, TypeScript, Go, Ruby, PHP, and Java, read `{lang}/managed-agents/README.md` for code examples. For cURL, read `curl/managed-agents.md`. **Agents are persistent — create once, reference by ID.** Store the agent ID returned by `agents.create` and pass it to every subsequent `sessions.create`; do not call `agents.create` in the request path. The Anthropic CLI is one convenient way to create agents and environments from version-controlled YAML (URL in `shared/live-sources.md`). If a binding you need isn't shown in the language README, WebFetch the relevant entry from `shared/live-sources.md` rather than guess. C# does not currently have Managed Agents support; use raw HTTP from `curl/managed-agents.md` as a reference.
+**Reading guide:** Start with `shared/managed-agents-overview.md`, then the topical `shared/managed-agents-*.md` files (core, environments, tools, events, outcomes, multiagent, webhooks, memory, client-patterns, onboarding, api-reference). For Python, TypeScript, Go, Ruby, PHP, and Java, read `{lang}/managed-agents/README.md` for code examples. For cURL, read `curl/managed-agents.md`. **Agents are persistent — create once, reference by ID.** Store the agent ID returned by `agents.create` and pass it to every subsequent `sessions.create`; do not call `agents.create` in the request path. The Anthropic CLI is one convenient way to create agents and environments from version-controlled YAML (URL in `shared/live-sources.md`). If a binding you need isn't shown in the language README, WebFetch the relevant entry from `shared/live-sources.md` rather than guess. C# does not currently have Managed Agents support; use raw HTTP from `curl/managed-agents.md` as a reference.
 
 **When the user wants to set up a Managed Agent from scratch** (e.g. "how do I get started", "walk me through creating one", "set up a new agent"): read `shared/managed-agents-onboarding.md` and run its interview — same flow as the `managed-agents-onboard` subcommand.
 
diff --git a/skills/claude-api/shared/live-sources.md b/skills/claude-api/shared/live-sources.md
index 53a8bbec2..d2f835519 100644
--- a/skills/claude-api/shared/live-sources.md
+++ b/skills/claude-api/shared/live-sources.md
@@ -88,6 +88,7 @@ Use these when a managed-agents binding, behavior, or wire-level detail isn't co
 | Permission Policies   | `https://platform.claude.com/docs/en/managed-agents/permission-policies.md`      | "Extract permission policy types (allow/deny/confirm) and per-tool config"                     |
 | Multi-Agent           | `https://platform.claude.com/docs/en/managed-agents/multi-agent.md`              | "Extract multi-agent composition patterns, sub-agent invocation, and result handoff"            |
 | Observability         | `https://platform.claude.com/docs/en/managed-agents/observability.md`            | "Extract logging, tracing, and usage telemetry exposed by managed agents"                       |
+| Webhooks              | `https://platform.claude.com/docs/en/managed-agents/webhooks.md`                 | "Extract webhook endpoint registration, HMAC signature verification, supported event types, and delivery semantics" |
 | GitHub                | `https://platform.claude.com/docs/en/managed-agents/github.md`                   | "Extract github_repository resource shape, multi-repo mounting, and token rotation"             |
 | MCP Connector         | `https://platform.claude.com/docs/en/managed-agents/mcp-connector.md`            | "Extract MCP server declaration on agents and vault-based credential injection at session"     |
 | Vaults                | `https://platform.claude.com/docs/en/managed-agents/vaults.md`                   | "Extract vault create, credential add/rotate, OAuth refresh shape, and archive"                 |
diff --git a/skills/claude-api/shared/managed-agents-api-reference.md b/skills/claude-api/shared/managed-agents-api-reference.md
index 8e7b3a03b..16b1c5b8d 100644
--- a/skills/claude-api/shared/managed-agents-api-reference.md
+++ b/skills/claude-api/shared/managed-agents-api-reference.md
@@ -23,15 +23,16 @@ All resources are under the `beta` namespace. Python and TypeScript share identi
 | Environments | `environments.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Environments.New` / `Get` / `Update` / `List` / `Delete` / `Archive` |
 | Sessions | `sessions.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Sessions.New` / `Get` / `Update` / `List` / `Delete` / `Archive` |
 | Session Events | `sessions.events.list` / `send` / `stream` | `Sessions.Events.List` / `Send` / `StreamEvents` |
+| Session Threads | `sessions.threads.list` / `retrieve` / `archive`; `sessions.threads.events.list` / `stream` | `Sessions.Threads.List` / `Get` / `Archive`; `Sessions.Threads.Events.List` / `StreamEvents` |
 | Session Resources | `sessions.resources.add` / `retrieve` / `update` / `list` / `delete` | `Sessions.Resources.Add` / `Get` / `Update` / `List` / `Delete` |
 | Vaults | `vaults.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Vaults.New` / `Get` / `Update` / `List` / `Delete` / `Archive` |
-| Credentials | `vaults.credentials.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `Vaults.Credentials.New` / `Get` / `Update` / `List` / `Delete` / `Archive` |
+| Credentials | `vaults.credentials.create` / `retrieve` / `update` / `list` / `delete` / `archive` / `mcp_oauth_validate` | `Vaults.Credentials.New` / `Get` / `Update` / `List` / `Delete` / `Archive` / `McpOauthValidate` |
 | Memory Stores | `memory_stores.create` / `retrieve` / `update` / `list` / `delete` / `archive` | `MemoryStores.New` / `Get` / `Update` / `List` / `Delete` / `Archive` |
 | Memories | `memory_stores.memories.create` / `retrieve` / `update` / `list` / `delete` | `MemoryStores.Memories.New` / `Get` / `Update` / `List` / `Delete` |
 | Memory Versions | `memory_stores.memory_versions.list` / `retrieve` / `redact` | `MemoryStores.MemoryVersions.List` / `Get` / `Redact` |
 
 **Naming quirks to watch for:**
-- Agents have **no delete** — only `archive`. Archive is **permanent**: the agent becomes read-only, new sessions cannot reference it, and there is no unarchive. Confirm with the user before archiving a production agent. Environments, Sessions, Vaults, Credentials, and Memory Stores have both `delete` and `archive`; Session Resources, Files, Skills, and Memories are `delete`-only; Memory Versions have neither — only `redact`.
+- Agents and Session Threads have **no delete** — only `archive`. Archive is **permanent**: the agent becomes read-only, new sessions cannot reference it, and there is no unarchive. Confirm with the user before archiving a production agent. Environments, Sessions, Vaults, Credentials, and Memory Stores have both `delete` and `archive`; Session Resources, Files, Skills, and Memories are `delete`-only; Memory Versions have neither — only `redact`.
 - Session resources use `add` (not `create`).
 - Go's event stream is `StreamEvents` (not `Stream`).
 
@@ -73,6 +74,18 @@ All resources are under the `beta` namespace. Python and TypeScript share identi
 | `POST` | `/v1/sessions/{session_id}/events` | SendEvents | Send events (user message, tool result) |
 | `GET` | `/v1/sessions/{session_id}/events/stream` | StreamEvents | Stream events via SSE |
 
+## Session Threads
+
+Per-subagent event streams in multiagent sessions. See `shared/managed-agents-multiagent.md`.
+
+| Method   | Path                                             | Operation        | Description                              |
+| -------- | ------------------------------------------------ | ---------------- | ---------------------------------------- |
+| `GET` | `/v1/sessions/{session_id}/threads` | ListThreads | List threads (paginated) |
+| `GET` | `/v1/sessions/{session_id}/threads/{thread_id}` | GetThread | Retrieve one thread (carries `agent` snapshot, `status`, `parent_thread_id`, `stats`, `usage`) |
+| `POST` | `/v1/sessions/{session_id}/threads/{thread_id}/archive` | ArchiveThread | Archive a thread |
+| `GET` | `/v1/sessions/{session_id}/threads/{thread_id}/events` | ListThreadEvents | List past events for one thread (paginated) |
+| `GET` | `/v1/sessions/{session_id}/threads/{thread_id}/stream` | StreamThreadEvents | Stream one thread via SSE (SDK: `threads.events.stream`) |
+
 ## Session Resources
 
 | Method   | Path                                                    | Operation        | Description                              |
@@ -119,6 +132,7 @@ Credentials are individual secrets stored inside a vault.
 | `POST`   | `/v1/vaults/{vault_id}/credentials/{credential_id}`               | UpdateCredential   | Update credential            |
 | `DELETE` | `/v1/vaults/{vault_id}/credentials/{credential_id}`               | DeleteCredential   | Delete credential            |
 | `POST`   | `/v1/vaults/{vault_id}/credentials/{credential_id}/archive`       | ArchiveCredential  | Archive credential           |
+| `POST`   | `/v1/vaults/{vault_id}/credentials/{credential_id}/mcp_oauth_validate` | McpOauthValidate | Validate an MCP OAuth credential |
 
 ## Memory Stores
 
@@ -206,13 +220,21 @@ Immutable per-mutation snapshots (`memver_...`) — the audit and rollback surfa
       "url": "https://api.githubcopilot.com/mcp/"
     }
   ],
+  "multiagent": {
+    "type": "coordinator",
+    "agents": [
+      "agent_abc123",
+      { "type": "agent", "id": "agent_def456", "version": 4 },
+      { "type": "self" }
+    ]
+  },
   "metadata": {
     "key": "value (max 16 pairs, keys ≤64 chars, values ≤512 chars)"
   }
 }
 ```
 
-> Limits: `tools` max 50, `skills` max 64, `mcp_servers` max 20 (unique names).
+> Limits: `tools` max 128, `skills` max 20, `mcp_servers` max 20 (unique names). `multiagent.agents` 1–20 entries (string ID | `{type:"agent",id,version?}` | `{type:"self"}`) — see `shared/managed-agents-multiagent.md`.
 
 ### CreateSession Request Body
 
@@ -276,6 +298,19 @@ Immutable per-mutation snapshots (`memver_...`) — the audit and rollback surfa
 }
 ```
 
+### Define Outcome Event
+
+```json
+{
+  "type": "user.define_outcome",
+  "description": "Build a DCF model for Costco in .xlsx",
+  "rubric": { "type": "file", "file_id": "file_01..." },
+  "max_iterations": 5
+}
+```
+
+> `rubric` is required: `{type: "text", content}` or `{type: "file", file_id}`. `max_iterations` default 3, max 20. Echoed back with `outcome_id` + `processed_at`. See `shared/managed-agents-outcomes.md`.
+
 ### Tool Result Event
 
 ```json
diff --git a/skills/claude-api/shared/managed-agents-core.md b/skills/claude-api/shared/managed-agents-core.md
index ef45ab8f3..f5e0127e1 100644
--- a/skills/claude-api/shared/managed-agents-core.md
+++ b/skills/claude-api/shared/managed-agents-core.md
@@ -132,8 +132,9 @@ const session = await client.beta.sessions.create(
 | `system`      | string   | No       | System prompt — defines the agent's behavior (up to 100K chars) |
 | `tools`       | array    | No       | Encompasses three kinds: (1) pre-built Claude Agent tools (`agent_toolset_20260401`), (2) MCP tools (`mcp_toolset`), and (3) custom client-side tools. Max 128. |
 | `mcp_servers` | array    | No       | MCP server connections — standardized third-party capabilities (e.g. GitHub, Asana). Max 20, unique names. See `shared/managed-agents-tools.md` → MCP Servers. |
-| `skills`      | array    | No       | Customized "best-practices" context with progressive disclosure. Max 64. See `shared/managed-agents-tools.md` → Skills. |
+| `skills`      | array    | No       | Customized "best-practices" context with progressive disclosure. Max 20. See `shared/managed-agents-tools.md` → Skills. |
 | `description` | string   | No       | Description of the agent (up to 2048 chars)    |
+| `multiagent`  | object   | No       | `{type: "coordinator", agents: [...]}` — roster this agent may delegate to. See `shared/managed-agents-multiagent.md`. |
 | `metadata`    | object   | No       | Arbitrary key-value pairs (max 16, keys ≤64 chars, values ≤512 chars) |
 
 ---
@@ -153,8 +154,9 @@ The API is **flat** — `model`, `system`, `tools` etc. are top-level fields, no
 | `system`           | string   | No       | System prompt                                      |
 | `tools`            | array    | No       | Agent toolset / MCP toolset / custom tools         |
 | `mcp_servers`      | array    | No       | MCP server connections                             |
-| `skills`           | array    | No       | Skill references (max 64)                          |
+| `skills`           | array    | No       | Skill references (max 20)                          |
 | `description`      | string   | No       | Description of the agent                           |
+| `multiagent`       | object   | No       | Coordinator roster — see `shared/managed-agents-multiagent.md` |
 | `metadata`         | object   | No       | Arbitrary key-value pairs                          |
 
 ### Lifecycle: create once, run many, update in place
diff --git a/skills/claude-api/shared/managed-agents-events.md b/skills/claude-api/shared/managed-agents-events.md
index 4ee3084d4..28e3fbcb1 100644
--- a/skills/claude-api/shared/managed-agents-events.md
+++ b/skills/claude-api/shared/managed-agents-events.md
@@ -12,13 +12,15 @@ Send events to a session via `POST /v1/sessions/{id}/events`.
 | `user.interrupt`          | Interrupt the agent while it's running |
 | `user.tool_confirmation`  | Approve/deny a tool call (when `always_ask` policy) |
 | `user.custom_tool_result` | Provide result for a custom tool call |
+| `user.define_outcome`     | Start a rubric-graded iterate loop — see `shared/managed-agents-outcomes.md` |
 
 ### Receiving Events
 
-Two methods:
+Three methods:
 
 1. **Streaming (SSE)**: `GET /v1/sessions/{id}/events/stream` — real-time Server-Sent Events. **Long-lived** — the server sends periodic heartbeats to keep the connection alive.
 2. **Polling**: `GET /v1/sessions/{id}/events` — paginated event list (query params: `limit` default 1000, `page`). **Returns immediately** — this is a plain paginated GET, not a long-poll.
+3. **Webhooks**: Anthropic POSTs session state transitions to your HTTPS endpoint — thin payloads (IDs only), HMAC-signed, Console-registered. See `shared/managed-agents-webhooks.md`.
 
 All received events carry `id`, `type`, and `processed_at` (ISO 8601; `null` if not yet processed by the agent).
 
@@ -47,8 +49,12 @@ Event types use dot notation, grouped by namespace:
 | `session.error` | Error occurred during processing |
 | `span.model_request_start` | Model inference started |
 | `span.model_request_end` | Model inference completed |
+| `span.outcome_evaluation_start` / `_ongoing` / `_end` | Grader progress for outcome-oriented sessions — see `shared/managed-agents-outcomes.md` |
+| `session.thread_created` | Subagent thread spawned (multiagent) — see `shared/managed-agents-multiagent.md` |
+| `session.thread_status_running` / `_idle` / `_rescheduled` / `_terminated` | Subagent thread status transitions (multiagent). `_idle` carries `stop_reason`. |
+| `agent.thread_message_sent` / `_received` | Cross-thread message, carries `to_session_thread_id` / `from_session_thread_id` (multiagent) |
 
-The stream also echoes back user-sent events (`user.message`, `user.interrupt`, `user.tool_confirmation`, `user.custom_tool_result`).
+The stream also echoes back user-sent events (`user.message`, `user.interrupt`, `user.tool_confirmation`, `user.custom_tool_result`, `user.define_outcome`).
 
 ---
 
@@ -125,7 +131,7 @@ await client.beta.sessions.events.send(sessionId, {
 });
 ```
 
-The agent stops mid-task. It does not see the interrupt as a message — it just halts. Send a follow-up `user` event to explain what to do instead.
+The agent stops mid-task. It does not see the interrupt as a message — it just halts. Send a follow-up `user.message` event to explain what to do instead. If an outcome is active and its grader evaluation has already started, the interrupt also marks `span.outcome_evaluation_end.result: "interrupted"` (see `shared/managed-agents-outcomes.md`).
 
 > **Note**: Interrupt events may have empty IDs in the current implementation. When troubleshooting, use the `processed_at` timestamp along with surrounding event IDs.
 
diff --git a/skills/claude-api/shared/managed-agents-multiagent.md b/skills/claude-api/shared/managed-agents-multiagent.md
new file mode 100644
index 000000000..1d5872c9b
--- /dev/null
+++ b/skills/claude-api/shared/managed-agents-multiagent.md
@@ -0,0 +1,99 @@
+# Managed Agents — Multiagent Sessions
+
+A coordinator agent can delegate to other agents within one session. All agents **share the container and filesystem**; each runs in its own **thread** — a context-isolated event stream with its own conversation history, model, system prompt, tools, MCP servers, and skills (from that agent's own config). Threads are persistent: the coordinator can send a follow-up to a subagent it called earlier and that subagent retains its prior turns.
+
+The SDK sets the `managed-agents-2026-04-01` beta header automatically on all `client.beta.{agents,sessions}.*` calls; no additional header is required for multiagent.
+
+---
+
+## Declare the roster on the coordinator
+
+`multiagent` is a **top-level field** on `agents.create()` / `agents.update()` — **not** a `tools[]` entry. `agents` lists 1–20 roster entries. Nothing changes on `sessions.create()` — the roster is resolved from the coordinator's config.
+
+```python
+orchestrator = client.beta.agents.create(
+    name="Engineering Lead",
+    model="{{OPUS_ID}}",
+    system="You coordinate engineering work. Delegate code review to the reviewer and test writing to the test agent.",
+    tools=[{"type": "agent_toolset_20260401"}],
+    multiagent={
+        "type": "coordinator",
+        "agents": [
+            reviewer.id,                                            # bare string — latest version
+            {"type": "agent", "id": test_writer.id, "version": 4},  # pinned version
+            {"type": "self"},                                       # the coordinator itself
+        ],
+    },
+)
+
+session = client.beta.sessions.create(agent=orchestrator.id, environment_id=env.id)
+```
+
+| Roster entry | Shape | Notes |
+|---|---|---|
+| String shorthand | `"agent_abc123"` | References the latest version of a stored agent. |
+| Agent reference | `{type: "agent", id, version?}` | Omit `version` to pin the latest at coordinator save time. |
+| Self | `{type: "self"}` | The coordinator can spawn copies of itself. |
+
+Up to **20 unique agents** in the roster; the coordinator may spawn **multiple copies** of each. **One level of delegation only** — depth > 1 is ignored.
+
+---
+
+## Threads
+
+The session-level event stream is the **primary thread** — it shows the coordinator's trace plus a condensed view of subagent activity (thread status transitions and cross-thread messages, not every subagent tool call). Drill into a specific subagent via the per-thread endpoints:
+
+| Operation | HTTP | SDK (`client.beta.sessions.threads.*`) |
+|---|---|---|
+| List threads | `GET /v1/sessions/{sid}/threads` | `.list(session_id)` |
+| Retrieve one | `GET /v1/sessions/{sid}/threads/{tid}` | `.retrieve(thread_id, session_id=...)` |
+| Archive | `POST /v1/sessions/{sid}/threads/{tid}/archive` | `.archive(thread_id, session_id=...)` |
+| List thread events | `GET /v1/sessions/{sid}/threads/{tid}/events` | `.events.list(thread_id, session_id=...)` |
+| Stream thread events | `GET /v1/sessions/{sid}/threads/{tid}/stream` | `.events.stream(thread_id, session_id=...)` |
+
+Each `SessionThread` carries `id`, `status` (`running` | `idle` | `rescheduling` | `terminated`), `agent` (a resolved snapshot of the agent config — `id`, `name`, `model`, `system`, `tools`, `skills`, `mcp_servers`, `version`), `parent_thread_id` (null for the primary thread, which is included in the list), `archived_at`, and optional `stats`/`usage`. **Session status aggregates thread statuses** — if any thread is `running`, `session.status` is `running`. Max **25 concurrent threads**. When draining a per-thread stream, break on `session.thread_status_idle` (and check its `stop_reason` as you would for the session-level idle).
+
+---
+
+## Multiagent events (on the session stream)
+
+| Event | Payload highlights | Meaning |
+|---|---|---|
+| `session.thread_created` | `session_thread_id`, `agent_name` | A new thread was created. |
+| `session.thread_status_running` | `session_thread_id`, `agent_name` | Thread started activity. |
+| `session.thread_status_idle` | `session_thread_id`, `agent_name`, **`stop_reason`** | Thread is awaiting input. Inspect `stop_reason` (same shape as `session.status_idle.stop_reason`). |
+| `session.thread_status_rescheduled` | `session_thread_id`, `agent_name` | Thread is rescheduling after a retryable error. |
+| `session.thread_status_terminated` | `session_thread_id`, `agent_name` | Thread was archived or hit a terminal error. |
+| `agent.thread_message_sent` | `to_session_thread_id`, `to_agent_name`, `content` | Coordinator sent a follow-up to another thread. |
+| `agent.thread_message_received` | `from_session_thread_id`, `from_agent_name`, `content` | An agent delivered its result to the coordinator. |
+
+---
+
+## Tool permissions and custom tools from subagent threads
+
+When a subagent needs your client (an `always_ask` confirmation, or a custom tool result), the request is **cross-posted to the primary thread** with `session_thread_id` identifying the originating thread — so you only need to watch the session stream. Reply with `user.tool_confirmation` (carrying `tool_use_id`) or `user.custom_tool_result` (carrying `custom_tool_use_id`), and **echo the `session_thread_id` from the originating event** (the SDK param type and docstring expect it). The server also routes by the tool-use ID, so the echo is belt-and-suspenders rather than load-bearing — but include it.
+
+```python
+for event_id in stop.event_ids:
+    pending = events_by_id[event_id]
+    confirmation = {
+        "type": "user.tool_confirmation",
+        "tool_use_id": event_id,
+        "result": "allow",
+    }
+    if pending.session_thread_id is not None:
+        confirmation["session_thread_id"] = pending.session_thread_id
+    client.beta.sessions.events.send(session.id, events=[confirmation])
+```
+
+The same pattern applies to `user.custom_tool_result`.
+
+---
+
+## Pitfalls
+
+- **Don't put the roster on `sessions.create()` or in `tools[]`.** `multiagent` is a top-level agent field; update the coordinator, then start a session that references it.
+- **Don't assume shared context.** Threads share the filesystem but not conversation history or tools. If the coordinator needs a subagent to act on something, it must say so in the delegated message (or write it to disk).
+- **Depth > 1 is ignored.** A subagent's own `multiagent` roster (if any) doesn't cascade — only the session's coordinator delegates.
+
+For per-language bindings beyond Python, WebFetch `https://platform.claude.com/docs/en/managed-agents/multi-agent.md` (see `shared/live-sources.md`).
diff --git a/skills/claude-api/shared/managed-agents-onboarding.md b/skills/claude-api/shared/managed-agents-onboarding.md
index 912a8cec6..e6bc3416d 100644
--- a/skills/claude-api/shared/managed-agents-onboarding.md
+++ b/skills/claude-api/shared/managed-agents-onboarding.md
@@ -51,7 +51,7 @@ Three rounds. Batch the questions in each round; don't ask them one at a time.
 
 **Round B — Skills, files, and repos.** What the agent has on hand when it starts.
 
-*Skills* — two types; both work the same way — Claude auto-uses them when relevant. Max 64 per agent.
+*Skills* — two types; both work the same way — Claude auto-uses them when relevant. Max 20 per agent.
 - [ ] **Pre-built Agent Skills**: `xlsx`, `docx`, `pptx`, `pdf`. Reference by name.
 - [ ] **Custom Skills**: skills uploaded to the user's org via the Skills API. Reference by `skill_id` + optional `version`. If the skill doesn't exist yet, walk the user through `POST /v1/skills` + `POST /v1/skills/{id}/versions` (beta header `skills-2025-10-02`). Full detail: `shared/managed-agents-tools.md` → Skills + Skills API.
 
diff --git a/skills/claude-api/shared/managed-agents-outcomes.md b/skills/claude-api/shared/managed-agents-outcomes.md
new file mode 100644
index 000000000..aee3f4e3f
--- /dev/null
+++ b/skills/claude-api/shared/managed-agents-outcomes.md
@@ -0,0 +1,106 @@
+# Managed Agents — Outcomes
+
+An **outcome** elevates a session from *conversation* to *work*: you state what "done" looks like, and the harness runs an iterate → grade → revise loop until the artifact meets the rubric, hits `max_iterations`, or is interrupted. A separate **grader** (independent context window) scores each iteration against your rubric and feeds per-criterion gaps back to the agent.
+
+The SDK sets the `managed-agents-2026-04-01` beta header automatically on all `client.beta.sessions.*` calls; no additional header is required for outcomes.
+
+---
+
+## The `user.define_outcome` event
+
+Outcomes are not a field on `sessions.create()`. You create a normal session, then send a `user.define_outcome` event. The agent starts working on receipt — **do not also send a `user.message`** to kick it off.
+
+```python
+session = client.beta.sessions.create(
+    agent=AGENT_ID,
+    environment_id=ENVIRONMENT_ID,
+    title="Financial analysis on Costco",
+)
+
+client.beta.sessions.events.send(
+    session_id=session.id,
+    events=[
+        {
+            "type": "user.define_outcome",
+            "description": "Build a DCF model for Costco in .xlsx",
+            "rubric": {"type": "text", "content": RUBRIC_MD},
+            # or: "rubric": {"type": "file", "file_id": rubric.id}
+            "max_iterations": 5,  # optional; default 3, max 20
+        }
+    ],
+)
+```
+
+| Field | Type | Notes |
+|---|---|---|
+| `type` | `"user.define_outcome"` | |
+| `description` | string | The task. This is what the agent works toward — no separate `user.message` needed. |
+| `rubric` | `{type: "text", content}` \| `{type: "file", file_id}` | **Required.** Markdown with explicit, independently gradeable criteria. Upload once via `client.beta.files.upload(...)` (beta `files-api-2025-04-14`) to reuse across sessions. |
+| `max_iterations` | int | Optional. Default **3**, max **20**. |
+
+The event is echoed back on the stream with a server-assigned `outcome_id` and `processed_at`.
+
+> **Writing rubrics.** Use explicit, gradeable criteria ("CSV has a numeric `price` column"), not vibes ("data looks good") — the grader scores each criterion independently, so vague criteria produce noisy loops. If you don't have a rubric, have Claude analyze a known-good artifact and turn that analysis into one.
+
+---
+
+## Outcome-specific events
+
+These appear on the standard event stream (`sessions.events.stream` / `.list`) alongside the usual `agent.*` / `session.*` events.
+
+| Event | Payload highlights | Meaning |
+|---|---|---|
+| `span.outcome_evaluation_start` | `outcome_id`, `iteration` (0-indexed) | Grader began scoring iteration *N*. |
+| `span.outcome_evaluation_ongoing` | `outcome_id` | Heartbeat while the grader runs. Grader reasoning is opaque — you see *that* it's working, not *what* it's thinking. |
+| `span.outcome_evaluation_end` | `outcome_evaluation_start_id`, `outcome_id`, `iteration`, `result`, `explanation`, `usage` | Grader finished one iteration. `result` drives what happens next (table below). |
+
+### `span.outcome_evaluation_end.result`
+
+| `result` | Next |
+|---|---|
+| `satisfied` | Session → `idle`. Terminal for this outcome. |
+| `needs_revision` | Agent starts another iteration. |
+| `max_iterations_reached` | No further grader cycles. Agent may run one final revision, then session → `idle`. |
+| `failed` | Session → `idle`. Rubric fundamentally doesn't match the task (e.g. description and rubric contradict). |
+| `interrupted` | Only emitted if `_start` had already fired before a `user.interrupt` arrived. |
+
+```json
+{
+  "type": "span.outcome_evaluation_end",
+  "id": "sevt_01jkl...",
+  "outcome_evaluation_start_id": "sevt_01def...",
+  "outcome_id": "outc_01a...",
+  "result": "satisfied",
+  "explanation": "All 12 criteria met: revenue projections use 5 years of historical data, ...",
+  "iteration": 0,
+  "usage": { "input_tokens": 2400, "output_tokens": 350, "cache_creation_input_tokens": 0, "cache_read_input_tokens": 1800 },
+  "processed_at": "2026-03-25T14:03:00Z"
+}
+```
+
+---
+
+## Checking status & retrieving deliverables
+
+**Status** — either watch the stream for `span.outcome_evaluation_end`, or poll the session and read `outcome_evaluations`:
+
+```python
+session = client.beta.sessions.retrieve(session.id)
+for ev in session.outcome_evaluations:
+    print(f"{ev.outcome_id}: {ev.result}")  # outc_01a...: satisfied
+```
+
+**Deliverables** — the agent writes to `/mnt/session/outputs/`. Once idle, fetch via the Files API with `scope_id=session.id`. This is the same session-outputs mechanism documented in `shared/managed-agents-environments.md` → Session outputs (including the dual-beta-header requirement on `files.list`).
+
+---
+
+## Interaction rules & pitfalls
+
+- **One outcome at a time.** Chain by sending the next `user.define_outcome` only after the previous one's terminal `span.outcome_evaluation_end` (`satisfied` / `max_iterations_reached` / `failed` / `interrupted`). The session retains history across chained outcomes.
+- **Steering is allowed but optional.** You *may* send `user.message` events mid-outcome to nudge direction, but the agent already knows to keep working until terminal — don't send "keep going" prompts.
+- **`user.interrupt` pauses the current outcome** — it marks `result: "interrupted"` (emitted only if `span.outcome_evaluation_start` had already fired) and leaves the session `idle`, ready for a new outcome or conversational turn.
+- **After terminal, the session is reusable** — continue conversationally or define a new outcome.
+- **Outcome ≠ session-create field.** Don't put `outcome`, `rubric`, or `description` on `sessions.create()` — outcomes are always sent as a `user.define_outcome` event.
+- **Idle-break gate is unchanged.** In your drain loop, keep using `event.type === 'session.status_idle' && event.stop_reason?.type !== 'requires_action'` — do **not** gate on `span.outcome_evaluation_end` alone (on `needs_revision` the session keeps running). See `shared/managed-agents-client-patterns.md` Pattern 5.
+
+For the raw HTTP shapes and per-language SDK bindings beyond Python, WebFetch `https://platform.claude.com/docs/en/managed-agents/define-outcomes.md` (see `shared/live-sources.md`).
diff --git a/skills/claude-api/shared/managed-agents-overview.md b/skills/claude-api/shared/managed-agents-overview.md
index 2c55d2f15..689f510df 100644
--- a/skills/claude-api/shared/managed-agents-overview.md
+++ b/skills/claude-api/shared/managed-agents-overview.md
@@ -25,7 +25,7 @@ Managed Agents is in beta. The SDK sets required beta headers automatically:
 
 | Beta Header                    | What it enables                                      |
 | ------------------------------ | ---------------------------------------------------- |
-| `managed-agents-2026-04-01`    | Agents, Environments, Sessions, Events, Session Resources, Vaults, Credentials, Memory Stores |
+| `managed-agents-2026-04-01`    | Agents, Environments, Sessions, Events, Session Resources, Session Threads, Outcomes, Multiagent, Vaults, Credentials, Memory Stores |
 | `skills-2025-10-02`            | Skills API (for managing custom skill definitions)   |
 | `files-api-2025-04-14`         | Files API for file uploads                           |
 
@@ -45,6 +45,9 @@ Managed Agents is in beta. The SDK sets required beta headers automatically:
 | Configure tools and permissions        | `shared/managed-agents-tools.md`                        |
 | Set up MCP servers                     | `shared/managed-agents-tools.md` (MCP Servers section)  |
 | Stream events / handle tool_use        | `shared/managed-agents-events.md` + language file       |
+| Get notified of session state changes via webhook (no polling) | `shared/managed-agents-webhooks.md` — Console-registered endpoint, HMAC verify, thin payload + fetch |
+| Define an outcome / rubric-graded iterate loop | `shared/managed-agents-outcomes.md` — `user.define_outcome` event, grader, `span.outcome_evaluation_*` events |
+| Coordinate multiple agents / subagents / threads | `shared/managed-agents-multiagent.md` — `multiagent: {type: "coordinator", agents: [...]}` on the agent, session threads, cross-posted tool confirmations |
 | Set up environments                    | `shared/managed-agents-environments.md` + language file |
 | Upload files / attach repos            | `shared/managed-agents-environments.md` (Resources)     |
 | Give agents persistent memory across sessions | `shared/managed-agents-memory.md` — memory stores, `memory_store` session resource, preconditions, versions/redact |
diff --git a/skills/claude-api/shared/managed-agents-tools.md b/skills/claude-api/shared/managed-agents-tools.md
index de1dabb73..3a7247f6a 100644
--- a/skills/claude-api/shared/managed-agents-tools.md
+++ b/skills/claude-api/shared/managed-agents-tools.md
@@ -258,7 +258,7 @@ Two types — both work the same way; the agent automatically uses them when rel
 | **Pre-built Anthropic skills** | Common document tasks (PowerPoint, Excel, Word, PDF). Reference by name (e.g. `xlsx`). |
 | **Custom skills** | Skills you've created in your organization via the Skills API. Reference by `skill_id` + optional `version`. |
 
-**Max 64 skills per agent.** Agent creation uses `managed-agents-2026-04-01`; the separate Skills API (for managing custom skill definitions) uses `skills-2025-10-02`.
+**Max 20 skills per agent.** Agent creation uses `managed-agents-2026-04-01`; the separate Skills API (for managing custom skill definitions) uses `skills-2025-10-02`.
 
 ### Enabling skills on a session
 
diff --git a/skills/claude-api/shared/managed-agents-webhooks.md b/skills/claude-api/shared/managed-agents-webhooks.md
new file mode 100644
index 000000000..4d2e5e15b
--- /dev/null
+++ b/skills/claude-api/shared/managed-agents-webhooks.md
@@ -0,0 +1,110 @@
+# Managed Agents — Webhooks
+
+Anthropic can POST to your HTTPS endpoint when a Managed Agents resource changes state — an alternative to holding an SSE stream or polling. Payloads are **thin** (event type + resource IDs only); on receipt, fetch the resource for current state. Every delivery is HMAC-signed.
+
+> **Direction matters.** This page covers *Anthropic → you* notifications about session/vault state. It does **not** cover *third-party → you* webhooks that *trigger* a session (e.g. a GitHub push handler that calls `sessions.create()`) — that's ordinary application code on your side with no Anthropic-specific wire format.
+
+---
+
+## Register an endpoint (Console only)
+
+Console → **Manage → Webhooks**. There is no programmatic endpoint-management API yet. Secret rotation is supported from the same page.
+
+| Field | Constraint |
+|---|---|
+| URL | HTTPS on port 443, publicly resolvable hostname |
+| Event types | Subscribe per `data.type` — you only receive subscribed types (plus test events) |
+| Signing secret | `whsec_`-prefixed, 32 bytes, **shown once at creation** — store it |
+
+---
+
+## Verify the signature
+
+Every delivery is HMAC-signed. **Use the SDK's `client.beta.webhooks.unwrap()`** — it verifies the signature, rejects payloads more than ~5 minutes old, and returns the parsed event. It reads the `whsec_` secret from `ANTHROPIC_WEBHOOK_SIGNING_KEY`.
+
+```python
+import anthropic
+from flask import Flask, request
+
+client = anthropic.Anthropic()  # reads ANTHROPIC_WEBHOOK_SIGNING_KEY from env
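+app = Flask(__name__)
+seen_event_ids: set[str] = set()  # in-memory dedupe store (illustrative; not shared across processes)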
+
+@app.route("/webhook", methods=["POST"])
+def webhook():
+    try:
+        event = client.beta.webhooks.unwrap(
+            request.get_data(as_text=True),
+            headers=dict(request.headers),
+        )
+    except Exception:
+        return "invalid signature", 400
+
+    if event.id in seen_event_ids:  # dedupe retries — id is per-event, not per-delivery
+        return "", 204
+    seen_event_ids.add(event.id)
+
+    match event.data.type:
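+        case "session.status_idled":
+            session = client.beta.sessions.retrieve(event.data.id)
+            notify_user(session)  # hypothetical app-side helper, not part of the SDK
+        case "vault_credential.refresh_failed":
+            alert_oncall(event.data.id)  # hypothetical app-side helper, not part of the SDK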
+
+    return "", 204
+```
+
+Pass the **raw request body** to `unwrap()` — frameworks that re-serialize JSON (Express `.json()`, Flask `.get_json()`) change the bytes and break the MAC. For other languages, look up the `beta.webhooks.unwrap` binding in the SDK repo (`shared/live-sources.md`); don't hand-roll verification.
+
+---
+
+## Payload envelope
+
+```json
+{
+  "type": "event",
+  "id": "event_01ABC...",
+  "created_at": "2026-03-18T14:05:22Z",
+  "data": {
+    "type": "session.status_idled",
+    "id": "session_01XYZ...",
+    "organization_id": "8a3d2f1e-...",
+    "workspace_id": "c7b0e4d9-..."
+  }
+}
+```
+
+Switch on `data.type`, fetch the resource by `data.id`, return any **2xx** to acknowledge. `created_at` is when the *state transition* happened, not when the webhook fired.
+
+---
+
+## Supported `data.type` values
+
+| `data.type` | Fires when |
+|---|---|
+| `session.status_scheduled` | Session created and ready to accept events |
+| `session.status_run_started` | Agent execution kicked off (every transition to `running`) |
+| `session.status_idled` | Agent awaiting input (tool approval, custom tool result, or next message) |
+| `session.status_terminated` | Session hit a terminal error |
+| `session.thread_created` | Multiagent: coordinator opened a new subagent thread |
+| `session.thread_idled` | Multiagent: a subagent thread is waiting for input |
+| `session.outcome_evaluation_ended` | Outcome grader finished one iteration |
+| `vault.archived` | Vault was archived |
+| `vault.created` | Vault was created |
+| `vault.deleted` | Vault was deleted |
+| `vault_credential.archived` | Vault credential was archived |
+| `vault_credential.created` | Vault credential was created |
+| `vault_credential.deleted` | Vault credential was deleted |
+| `vault_credential.refresh_failed` | MCP OAuth vault credential failed to refresh |
+
+> These are **webhook** `data.type` values — a separate namespace from SSE event types (`session.status_idle`, `span.outcome_evaluation_end`, etc. in `shared/managed-agents-events.md`). Don't reuse SSE constants in webhook handlers.
+
+---
+
+## Delivery behavior & pitfalls
+
+- **No ordering guarantee.** `session.status_idled` may arrive before `session.outcome_evaluation_ended` even if the evaluation finished first. Sort by envelope `created_at` if order matters.
+- **Retries carry the same `event.id`.** At least one retry on non-2xx. Dedupe on `event.id`.
+- **3xx is failure.** Redirects are not followed — update the URL in Console if your endpoint moves.
+- **Auto-disable** after ~20 consecutive failed deliveries, or immediately if the hostname resolves to a private IP or the endpoint returns a redirect. Re-enable manually in Console.
+- **Thin payload is intentional.** Don't expect `stop_reason`, `outcome_evaluations`, credential secrets, etc. on the webhook body — fetch the resource.