docs: record AgentShield fleet routing evidence

2026-07-01 20:41:26 +08:00 · 2026-05-16 01:24:20 -04:00
parent 1c5c5d2389
commit cc83a85eb8
7 changed files with 38 additions and 19 deletions
@@ -37,8 +37,9 @@ As of 2026-05-16:
  CI install hardening, GitHub Actions cache purge, AgentShield #85
  registry-signature verification, AgentShield #86 evidence-pack CI provenance,
  AgentShield #87 plugin-cache runtime-confidence classification, AgentShield
-  #88 evidence-pack inspect/readback, ECC-Tools #75 billing-gate tightening,
-  PR #1947 supply-chain protection, and May 16 release-evidence refresh.
+  #88 evidence-pack inspect/readback, AgentShield #89 evidence-pack fleet
+  routing, ECC-Tools #75 billing-gate tightening, PR #1947 supply-chain
+  protection, and May 16 release-evidence refresh.
 - `npm run harness:audit -- --format json` reports 70/70 on current `main`.
 - `npm run observability:ready` reports 21/21 readiness on current `main`,
  including the GitHub/Linear/handoff/roadmap progress-sync contract.
@@ -100,6 +101,12 @@ As of 2026-05-16:
  finding counts, runtime confidence, policy, baseline, supply-chain, CI
  context, remediation phases, and malformed artifact errors without manually
  opening every bundle file.
+- AgentShield PR #89 merged as `521ada9091bb6d818511ab8589ae675b920c106a`
+  and added `agentshield evidence-pack fleet <dirs...> [--json]` for
+  downstream fleet routing. Multiple verified evidence packs now aggregate into
+  ready, security-blocker, policy-review, baseline-regression,
+  supply-chain-review, and invalid routes with finding, policy, baseline,
+  supply-chain, and remediation totals.
 - JARVIS PR #13 merged as `127efabbfb5033ae53d7a53e1546aa3c33d6f962`
  and hardened CI/deploy workflows with npm registry signature verification,
  disabled persisted checkout credentials in write-permission jobs, and pinned
@@ -520,11 +527,11 @@ is not complete unless the evidence column exists and has been freshly verified.
 | Naming and rename readiness | Naming matrix across package/plugin/docs/social surfaces | `docs/releases/2.0.0-rc.1/naming-and-publication-matrix.md` records current package, repo, Claude plugin, Codex plugin, OpenCode, and npm availability evidence | Complete for rc.1; post-rc rename remains future work |
 | Claude and Codex plugin publication | Contact/submission path with required artifacts and status | Publication readiness, naming matrix, and May 12 dry-run evidence document plugin validation, clean-checkout Claude tag/install smoke, and Codex marketplace CLI shape | Needs explicit approval for real tag/push and marketplace submission |
 | Articles, tweets, and announcements | X thread, LinkedIn copy, GitHub release copy, push checklist | Draft launch collateral exists under rc.1 release docs | Needs URL-backed refresh |
-| AgentShield enterprise iteration | Policy gates, SARIF, packs, provenance, corpus, HTML reports, exception lifecycle audit, baseline drift Action/CLI surfaces, evidence-pack redaction, harness adapter registry, enterprise research roadmap, supply-chain hardened release path, CI-safe baseline fingerprints, corpus accuracy recommendations, remediation workflow phases, env proxy hijack corpus coverage, Mini Shai-Hulud full-campaign package IOCs, CI-provenance evidence packs, plugin-cache runtime-confidence triage, and evidence-pack consumer readback | PRs #53, #55-#64, #67-#69, and #78-#88 landed with test evidence; native PDF export deferred in favor of self-contained HTML plus print-to-PDF until explicit enterprise demand appears; `docs/architecture/agentshield-enterprise-research-roadmap.md` now has baseline drift, evidence-pack bundle, redaction, adapter-registry, supply-chain hardening, hashed baseline fingerprints, corpus accuracy recommendation, remediation workflow, env proxy hijack corpus, Mini Shai-Hulud full-campaign package-table, `ci-context.json` provenance, `plugin-cache` confidence, and `evidence-pack inspect` readback slices landed | Next cross-harness depth and fleet routing |
+| AgentShield enterprise iteration | Policy gates, SARIF, packs, provenance, corpus, HTML reports, exception lifecycle audit, baseline drift Action/CLI surfaces, evidence-pack redaction, harness adapter registry, enterprise research roadmap, supply-chain hardened release path, CI-safe baseline fingerprints, corpus accuracy recommendations, remediation workflow phases, env proxy hijack corpus coverage, Mini Shai-Hulud full-campaign package IOCs, CI-provenance evidence packs, plugin-cache runtime-confidence triage, evidence-pack consumer readback, and fleet-level evidence-pack routing | PRs #53, #55-#64, #67-#69, and #78-#89 landed with test evidence; native PDF export deferred in favor of self-contained HTML plus print-to-PDF until explicit enterprise demand appears; `docs/architecture/agentshield-enterprise-research-roadmap.md` now has baseline drift, evidence-pack bundle, redaction, adapter-registry, supply-chain hardening, hashed baseline fingerprints, corpus accuracy recommendation, remediation workflow, env proxy hijack corpus, Mini Shai-Hulud full-campaign package-table, `ci-context.json` provenance, `plugin-cache` confidence, `evidence-pack inspect` readback, and `evidence-pack fleet` routing slices landed | Next ECC-Tools fleet-summary consumption and cross-harness policy integration |
 | ECC Tools next-level app | Billing audit, PR checks, deep analyzer, sync backlog, evaluator/RAG corpus, analysis-depth readiness, hosted execution planning, hosted CI diagnostics, hosted security evidence review, hosted harness compatibility audit, hosted reference-set evaluation, hosted AI routing/cost review, hosted team backlog routing, hosted depth-plan check-run, PR-comment hosted job dispatch, hosted job result history/check-runs, hosted result status command, status-aware depth-plan recommendations, hosted promotion readiness, hosted promotion output scoring, hosted promotion retrieval planning, hosted promotion judge contract, gated hosted promotion judge execution, payment-announcement readiness | PRs #26-#43 plus #53-#74 landed with test evidence, including AgentShield evidence-pack gap routing, canonical bundle recognition, supply-chain signature gates, PR draft follow-up Linear tracking, evidence-backed/deep-ready repository classification, the `/api/analysis/depth-plan` hosted job plan, `/api/analysis/jobs/ci-diagnostics`, `/api/analysis/jobs/security-evidence-review`, `/api/analysis/jobs/harness-compatibility-audit`, `/api/analysis/jobs/reference-set-evaluation`, `/api/analysis/jobs/ai-routing-cost-review`, `/api/analysis/jobs/team-backlog-routing`, the `ECC Tools / Hosted Depth Plan` check-run, `/ecc-tools analyze --job ...` PR-comment dispatch, non-blocking per-hosted-job result check-runs backed by 30-day result cache records, `/ecc-tools analyze --job status` cache lookup, cache-aware next-job recommendations in the depth-plan check-run, the `ECC Tools / Hosted Promotion Readiness` corpus-backed PR check-run, deterministic hosted-output scoring against cached completed job artifacts/findings, ranked retrieval/model-prompt planning, the fail-closed `hosted-promotion-judge.v1` request contract, opt-in live model-judge execution behind hosted evidence, entitlement, budget, provider, executor, strict JSON, and citation gates, a fail-closed `/api/billing/readiness` `announcementGate` for native GitHub payments claims, and `npm run billing:announcement-gate` as the non-secret operator verifier | Next work is hosted promotion telemetry, operator review UX, and live Marketplace test-account readback |
 | GitGuardian/Dependabot/CodeRabbit-style checks | Non-blocking taxonomy, deterministic follow-up checks, and local supply-chain gates | ECC-Tools risk taxonomy check plus follow-up signals landed, including Skill Quality, Deep Analyzer Evidence, Analyzer Corpus Evidence, RAG/Evaluator Evidence, PR Review/Salvage Evidence, and AgentShield evidence-pack evidence; #1846 added npm registry signature gates; #1848 added the supply-chain incident-response playbook and `pull_request_target` cache-poisoning validator guard; #1851 added the privileged checkout credential-persistence guard; AgentShield #78, JARVIS #13, and ECC-Tools #53 applied the same hardening outside trunk | Current supply-chain gate complete; deeper hosted review features remain future |
 | Harness-agnostic learning system | Audit, adapter matrix, observability, traces, promotion loop | Audit/adapters/observability gates plus `docs/architecture/evaluator-rag-prototype.md`, `examples/evaluator-rag-prototype/`, and ECC-Tools PR #40 define read-only stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison scenarios with trace, report, playbook, verifier, and predictive-check artifacts; ECC-Tools PRs #68-#72 now turn that corpus into a deterministic PR check-run gate with cached hosted-output scoring, ranked retrieval candidates, a model prompt seed, a fail-closed hosted model-judge request contract, and opt-in live model execution behind strict hosted-evidence gates | Deterministic hosted PR check, cached output scoring, retrieval planning, judge contract, and gated model execution integrated |
-| Linear roadmap is detailed | Linear project status plus repo mirror | Repo mirror exists; issue creation was retried on 2026-05-12 and remains blocked by the workspace free issue limit; this May 16 sync adds ECC #1860, AgentShield #78-#88, JARVIS #13, ECC-Tools #53-#74, resolved queue/discussion counts, and a generated `operator:dashboard` prompt-to-artifact audit for recurring status updates | Needs recurring status updates after each significant merge batch |
+| Linear roadmap is detailed | Linear project status plus repo mirror | Repo mirror exists; issue creation was retried on 2026-05-12 and remains blocked by the workspace free issue limit; this May 16 sync adds ECC #1860, AgentShield #78-#89, JARVIS #13, ECC-Tools #53-#74, resolved queue/discussion counts, and a generated `operator:dashboard` prompt-to-artifact audit for recurring status updates | Needs recurring status updates after each significant merge batch |
 | Flow separation and progress tracking | Flow lanes with owner artifacts and update cadence | This roadmap defines lanes below and `docs/architecture/progress-sync-contract.md` makes GitHub/Linear/handoff/roadmap sync part of the readiness gate | Active |
 | Realtime Linear sync | Project updates while issue limit is blocked; issues later | ECC-Tools #39 implements opt-in Linear API sync for deferred follow-up backlog items, and ECC-Tools #54 adds copy-ready PR drafts to that backlog when draft PR shells are not opened; `docs/architecture/progress-sync-contract.md` defines the local file-backed realtime boundary while issue capacity is blocked | Needs workspace capacity/config rollout |
 | Observability for self-use | Local readiness gate, traces, status snapshots, HUD/status contract, risk ledger, progress-sync contract | `npm run observability:ready` reports 21/21 | Complete for local gate |
@@ -544,7 +551,7 @@ repo evidence and merge commits.
 | Release and publication | rc.1 release docs, publication readiness doc | Naming matrix and plugin submission/contact checklist | Before any tag |
 | Harness OS core | Audit, adapter matrix, observability docs, `ecc2/` | HUD/session-control acceptance spec | Weekly until GA |
 | Evaluation and RAG | Reference-set validation, harness audit, traces, ECC-Tools corpus | Read-only evaluator/RAG prototype plus stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison fixtures; ECC-Tools #68 publishes the corpus as a hosted promotion readiness check-run, #69 scores cached hosted job outputs against the same corpus, #70 emits ranked retrieval candidates plus a model prompt seed, #71 adds a fail-closed hosted model-judge request contract, and #72 executes that judge only when explicitly enabled and backed by hosted retrieval citations | Hosted promotion telemetry and operator review UX |
-| AgentShield enterprise | AgentShield PR evidence and roadmap notes | Cross-harness depth and fleet routing after evidence-pack inspect/readback shipped in #88 | Next implementation batch |
+| AgentShield enterprise | AgentShield PR evidence and roadmap notes | Fleet routing landed in #89 after evidence-pack inspect/readback shipped in #88 | ECC-Tools fleet-summary consumption and cross-harness policy integration |
 | ECC Tools app | ECC-Tools PR evidence, billing audit, risk taxonomy, evaluator/RAG corpus | ECC-Tools #53 published the supply-chain workflow hardening branch, #54 tracks copy-ready PR drafts in the Linear/project backlog, #55 classifies analysis-depth readiness, #56 exposes the hosted execution plan, #57 executes the first hosted CI diagnostics job, #58 executes the hosted security evidence review job, #59 executes the hosted harness compatibility audit, #60 executes the hosted reference-set evaluation, #61 executes the hosted AI routing/cost review, #62 executes hosted team backlog routing, #63 publishes the hosted depth-plan check-run, #64 dispatches hosted jobs from PR comments, #65 persists hosted result history/check-runs, #66 exposes hosted job status from PR comments, #67 makes depth-plan recommendations cache-aware, #68 publishes hosted promotion readiness from the evaluator/RAG corpus, #69 scores cached hosted job outputs against that corpus, #70 emits ranked retrieval candidates plus a model prompt seed, #71 emits the gated `hosted-promotion-judge.v1` contract without live model calls, #72 adds opt-in live model-judge execution behind hosted-evidence and strict JSON/citation gates, #73 adds a fail-closed native-payments `announcementGate` to billing readiness, and #74 adds `npm run billing:announcement-gate` for operator verification | Live Marketplace test-account readback and hosted promotion telemetry |
 | Linear progress | Linear project status updates, `docs/architecture/progress-sync-contract.md`, generated `operator:dashboard` output, and this mirror | Status update with queue/evidence/missing gates | Every significant merge batch |

@@ -766,8 +773,10 @@ Acceptance:
   packs; PR #87 classified installed Claude plugin caches separately from
   active top-level runtime config, including cached hook implementations; PR
   #88 added `agentshield evidence-pack inspect` JSON/text readback for
-   downstream consumers; and ECC-Tools PRs #42/#43 now route and recognize
-   evidence packs. The next slice is cross-harness depth and fleet routing.
+   downstream consumers; PR #89 added `agentshield evidence-pack fleet`
+   summary/routing across multiple inspected bundles; and ECC-Tools PRs #42/#43
+   now route and recognize evidence packs. The next slice is ECC-Tools
+   fleet-summary consumption and cross-harness policy integration.
 2. Run ECC-Tools `/api/billing/readiness` against a Marketplace-managed test
   account and require `announcementGate.ready === true` before any native
   GitHub payments announcement.
@@ -1,6 +1,6 @@
 # AgentShield Enterprise Research Roadmap

-Generated: 2026-05-12; refreshed with May 16 AgentShield PR #87 and #88 evidence.
+Generated: 2026-05-12; refreshed with May 16 AgentShield PR #87, #88, and #89 evidence.

 This is a planning artifact for the next AgentShield enterprise iteration. It
 does not modify AgentShield code. The goal is to turn the current scanner,
@@ -91,6 +91,10 @@ AgentShield is already more than a static lint tool:
  JSON/text summaries for report score, finding counts, runtime confidence,
  policy, baseline, supply-chain, CI context, remediation, and malformed
  artifact errors.
+- Fleet-level evidence-pack consumption now has a local routing primitive:
+  `agentshield evidence-pack fleet <dirs...> [--json]` aggregates multiple
+  inspected bundles into ready, security-blocker, policy-review,
+  baseline-regression, supply-chain-review, and invalid routes.

 May 16 update: AgentShield PR #87 merged as
 `26bb44650663816d07180e0d20c1895e431a326c`. It classifies installed Claude
@@ -103,10 +107,15 @@ AgentShield PR #88 merged as
 `agentshield evidence-pack inspect <dir> [--json]`, validates the bundle before
 readback, summarizes every consumer-facing evidence artifact, and keeps
 malformed-but-valid JSON artifacts from crashing inspection.
+AgentShield PR #89 merged as
+`521ada9091bb6d818511ab8589ae675b920c106a`. It adds
+`agentshield evidence-pack fleet <dirs...> [--json]`, verifies each pack through
+the inspect path, aggregates finding, policy, baseline, supply-chain, and
+remediation totals, and assigns each pack to a deterministic fleet route.

-The next iteration should not be "add more regex rules" by default. The higher
-leverage move is to make AgentShield remember, compare, route, and enforce
-security posture across time, repos, teams, and harnesses.
+The next iteration after fleet routing should not be "add more regex rules" by
+default. The higher leverage move is to wire the new fleet summaries into
+ECC-Tools follow-up routing and cross-harness policy integration.

 ## Enterprise Gaps

@@ -21,7 +21,7 @@ surfaces, or posting announcements.
 | `docs/releases/2.0.0-rc.1/launch-checklist.md` | Operator launch checklist | Must remain approval-gated for release, package, plugin, and announcement actions |
 | `docs/releases/2.0.0-rc.1/publication-readiness.md` | Release gate | Requires fresh evidence from the exact release commit |
 | `docs/releases/2.0.0-rc.1/publication-evidence-2026-05-15.md` | Current May 15 queue, roadmap, security, supply-chain watch, no-lifecycle CI install hardening, AgentShield #86 evidence-pack provenance, ECC Tools billing-gate, Actions cache purge, and `ecc2` test evidence through PR #1941 | Must be superseded by a final clean-checkout evidence file before real publication |
-| `docs/releases/2.0.0-rc.1/publication-evidence-2026-05-16.md` | Current May 16 queue cleanup, recsys skill merge, GateGuard triage, PR #1947 supply-chain protection, AgentShield #87 plugin-cache confidence evidence, AgentShield #88 evidence-pack inspect/readback, dashboard refresh, and combined Node/Rust/release-surface gate evidence through `6bced468` | Must still be repeated from a strict clean checkout before real publication |
+| `docs/releases/2.0.0-rc.1/publication-evidence-2026-05-16.md` | Current May 16 queue cleanup, recsys skill merge, GateGuard triage, PR #1947 supply-chain protection, AgentShield #87 plugin-cache confidence evidence, AgentShield #88 evidence-pack inspect/readback, AgentShield #89 evidence-pack fleet routing, dashboard refresh, and combined Node/Rust/release-surface gate evidence through `6bced468` | Must still be repeated from a strict clean checkout before real publication |
 | `docs/releases/2.0.0-rc.1/naming-and-publication-matrix.md` | Naming, slug, and publication-path decision record | Keeps `Everything Claude Code / ECC`, npm `ecc-universal`, and plugin slug `ecc` for rc.1 |
 | `docs/releases/2.0.0-rc.1/x-thread.md` | X launch draft | Must replace placeholders with live URLs after release/package/plugin publication |
 | `docs/releases/2.0.0-rc.1/linkedin-post.md` | LinkedIn launch draft | Must replace placeholders with live URLs after release/package/plugin publication |
@@ -9,7 +9,7 @@ npm publication, plugin tag, marketplace submission, or announcement post.
 | --- | --- |
 | Upstream main | `6bced468d76b269243a6f0bd28472853aa78e0e4` |
 | Git remote | `https://github.com/affaan-m/everything-claude-code.git` |
-| Evidence scope | Current `main` after PR #1944, PR #1945, issue #1946 triage, PR #1947 supply-chain protection, AgentShield PR #87, AgentShield PR #88, ITO-57 sync, and operator dashboard refresh |
+| Evidence scope | Current `main` after PR #1944, PR #1945, issue #1946 triage, PR #1947 supply-chain protection, AgentShield PR #87, AgentShield PR #88, AgentShield PR #89, ITO-57 sync, and operator dashboard refresh |
 | Local status caveat | `git status --short --branch` showed `## main...origin/main` plus unrelated untracked `docs/drafts/` |

 The actual release operator should repeat all publish-facing checks from the
@@ -34,8 +34,9 @@ final release commit with a strictly clean checkout before publishing.
 | PR #1947 | Merged scheduled supply-chain watch/advisory-source evidence as `4093d1bb7a14db1b4d4ea5bd00f2073baf94bfb0`; trunk now has the TanStack/Mini Shai-Hulud/node-ipc IOC scan plus advisory-source report surfaces wired into scheduled watch evidence |
 | AgentShield PR #87 | Merged plugin-cache runtime-confidence classification as `26bb44650663816d07180e0d20c1895e431a326c`; installed Claude plugin cache findings now emit `runtimeConfidence: plugin-cache`, `plugins/cache` only maps to Claude cache under `.claude`, and cached hook implementations are no longer mislabeled as active `hook-code` |
 | AgentShield PR #88 | Merged evidence-pack inspect/readback as `65ed6e2a87545dc99d962b58413f49096a4d70ec`; `agentshield evidence-pack inspect` now emits verified JSON/text summaries for report, policy, baseline, supply-chain, CI context, remediation, and malformed artifact errors |
+| AgentShield PR #89 | Merged evidence-pack fleet routing as `521ada9091bb6d818511ab8589ae675b920c106a`; `agentshield evidence-pack fleet <dirs...> [--json]` now aggregates multiple verified bundles into ready, security-blocker, policy-review, baseline-regression, supply-chain-review, and invalid routes with finding, policy, baseline, supply-chain, and remediation totals |
 | ITO-57 | Updated with PR #1947 advisory-source evidence, post-merge source refresh, IOC scan, npm audit/signature checks, and OpenAI app update caveat |
-| ITO-49 | Updated with AgentShield PR #87 and #88 merge evidence, local test evidence, CI status, live `~/.claude` scan classification counts, and local Mini Shai-Hulud protection scan results |
+| ITO-49 | Updated with AgentShield PR #87, #88, and #89 merge evidence, local test evidence, CI status, live `~/.claude` scan classification counts, and local Mini Shai-Hulud protection scan results |
 | ITO-44 | Updated with queue cleanup, dashboard refresh, and remaining macro gaps |

 ## Release Gate Commands
@@ -22,8 +22,8 @@ refresh through PR #1941, see
 [`publication-evidence-2026-05-15.md`](publication-evidence-2026-05-15.md).
 For the May 16 queue cleanup, recsys skill merge, GateGuard issue triage,
 AgentShield #87 plugin-cache runtime-confidence evidence, AgentShield #88
-evidence-pack inspect/readback, operator dashboard refresh, and combined
-final-gate rerun on current `main`, see
+evidence-pack inspect/readback, AgentShield #89 evidence-pack fleet routing,
+operator dashboard refresh, and combined final-gate rerun on current `main`, see
 [`publication-evidence-2026-05-16.md`](publication-evidence-2026-05-16.md).
 For the operator-facing prompt-to-artifact readiness dashboard from the same
 May 16 pass, see