Compare commits

..

21 Commits

Author SHA1 Message Date
richm-spp c802c33abc fix(hooks): emit suggest-compact via hookSpecificOutput stdout
The threshold and interval suggestions in suggest-compact.js are written
to stderr via log(). Per the Claude Code hooks guide, non-blocking
PreToolUse stderr (exit code 0) is only captured in the debug log — it
does not reach the model. As shipped, the script's nudge to /compact is
silent on Claude Code 2.1.x.

Fix: alongside the existing log() call (kept for debug-log capture),
emit the same suggestion as structured JSON on stdout:

  { hookSpecificOutput: {
      hookEventName: "PreToolUse",
      additionalContext: msg
  } }

This is the documented mechanism for a PreToolUse hook to inject
context into the next model turn without blocking the tool call.

Verified end-to-end on Claude Code 2.1.142 (VSCode native extension,
Windows 11) — the additionalContext now surfaces in the next turn as
a <system-reminder> block. Counter increment and exit code behavior
unchanged.

Tests: 4 new cases in tests/hooks/suggest-compact.test.js covering
stdout JSON at threshold, stdout JSON at +25 interval, silence below
threshold, and stderr-retention for the debug log. Suite goes from
19/19 -> 23/23 (suggest-compact) and full run-all stays clean for
the unaffected suites (the 4 pre-existing Windows broken-symlink
failures in ci/validators, lib/session-manager, and lib/utils are
unrelated to this change).
2026-05-15 09:48:28 +10:00
Affaan Mustafa 0e66c838c7 docs: sync ECC Tools judge execution (#1901) 2026-05-14 17:38:03 -04:00
Affaan Mustafa cb9702ca99 docs: sync ECC Tools judge contract (#1900) 2026-05-14 17:15:54 -04:00
Affaan Mustafa f9384427b8 docs: sync ECC Tools retrieval planning (#1892) 2026-05-14 16:54:30 -04:00
Affaan Mustafa 4423f10cfb docs: sync ECC Tools hosted output scoring (#1891) 2026-05-13 23:02:23 -04:00
Affaan Mustafa 3b12fb273f docs: sync ECC Tools hosted promotion readiness (#1890) 2026-05-13 22:39:01 -04:00
Affaan Mustafa 4fb80d8861 Sync ECC Tools status-aware depth plan roadmap (#1887) 2026-05-13 22:12:11 -04:00
Affaan Mustafa a27831c13e Sync ECC Tools hosted status roadmap (#1886) 2026-05-13 21:49:42 -04:00
Affaan Mustafa b24d762caa Sync ECC Tools hosted result history roadmap (#1885) 2026-05-13 21:31:08 -04:00
Affaan Mustafa f94478e524 docs: sync roadmap after ECC-Tools hosted dispatch 2026-05-13 20:30:48 -04:00
Affaan Mustafa 6cdac19764 docs: sync roadmap after ECC-Tools depth-plan check 2026-05-13 20:10:38 -04:00
Affaan Mustafa af3a206412 docs: sync roadmap after ECC-Tools team backlog job (#1880) 2026-05-13 19:44:49 -04:00
Affaan Mustafa 20f00c1410 docs: sync roadmap after ECC-Tools AI cost job (#1878) 2026-05-13 19:26:48 -04:00
Affaan Mustafa e7a6f137e5 docs: sync roadmap after ECC-Tools reference-set job (#1877) 2026-05-13 19:09:35 -04:00
Affaan Mustafa 7596502092 docs: sync roadmap after ECC-Tools harness job (#1876) 2026-05-13 18:50:45 -04:00
Affaan Mustafa c04baa8c25 docs: sync roadmap after ECC-Tools security evidence job (#1875) 2026-05-13 18:32:06 -04:00
Affaan Mustafa 9082bdedac docs: sync roadmap after ECC-Tools CI diagnostics (#1874) 2026-05-13 18:12:31 -04:00
Affaan Mustafa 3243a1c5d3 docs: sync roadmap after ECC-Tools hosted planning (#1872) 2026-05-13 12:48:50 -04:00
Affaan Mustafa 69401b28b3 docs: sync roadmap after ECC-Tools depth readiness (#1871) 2026-05-13 12:26:32 -04:00
Affaan Mustafa 9a5ed3223a docs: sync roadmap after AgentShield corpus expansion
Records AgentShield PR #82 and moves the next AgentShield roadmap slice to hosted evidence-pack workflow depth.
2026-05-13 09:04:34 -04:00
Affaan Mustafa d844bd6bfc docs: sync roadmap after AgentShield remediation workflows
Records AgentShield PR #81 and advances the next AgentShield roadmap slice after remediation workflow phases landed.
2026-05-13 08:46:07 -04:00
4 changed files with 269 additions and 20 deletions
+191 -14
View File
@@ -61,6 +61,14 @@ As of 2026-05-13:
and added prioritized corpus accuracy recommendations to failed corpus gates, and added prioritized corpus accuracy recommendations to failed corpus gates,
mapping misses by category, missing rule, and config ID so enterprise mapping misses by category, missing rule, and config ID so enterprise
scanner-regression work has an actionable improvement plan. scanner-regression work has an actionable improvement plan.
- AgentShield PR #81 merged as `6583884e74ba2e896942113e1ce3146230e6fb76`
and added ordered remediation workflow phases to remediation plans, routing
safe auto-fixes, manual review, and verification through stable finding
fingerprints without copying raw evidence.
- AgentShield PR #82 merged as `51336ba074ad5e9fed2c0aa3237422be22147e76`
and expanded the built-in attack corpus with an env proxy hijack scenario
covering proxy/runtime mutation, env-token exfiltration, DNS exfiltration,
credential-store access, and clipboard access.
- JARVIS PR #13 merged as `127efabbfb5033ae53d7a53e1546aa3c33d6f962` - JARVIS PR #13 merged as `127efabbfb5033ae53d7a53e1546aa3c33d6f962`
and hardened CI/deploy workflows with npm registry signature verification, and hardened CI/deploy workflows with npm registry signature verification,
disabled persisted checkout credentials in write-permission jobs, and pinned disabled persisted checkout credentials in write-permission jobs, and pinned
@@ -72,6 +80,115 @@ As of 2026-05-13:
and made `/ecc-tools followups sync-linear` track copy-ready PR drafts in and made `/ecc-tools followups sync-linear` track copy-ready PR drafts in
the Linear/project backlog when `open-pr-drafts` is not used, preserving the Linear/project backlog when `open-pr-drafts` is not used, preserving
useful stale-PR salvage work without opening extra PR shells. useful stale-PR salvage work without opening extra PR shells.
- ECC-Tools PR #55 merged as `5d8c112cce4794cfa089d5b0ea661ba87a178be1`
and added analysis-depth readiness to `/ecc-tools analyze` comments,
separating commit-history-only repos from evidence-backed and deep-ready repos
using CI/CD, security, harness, reference/eval, AI routing/cost-control, and
team handoff evidence.
- ECC-Tools PR #56 merged as `5b729c88641eafe80f65364bab3fc74d0270f57b`
and added the authenticated `/api/analysis/depth-plan` contract that maps
analysis-depth readiness into concrete hosted jobs for CI diagnostics,
security evidence review, harness compatibility, reference-set evaluation,
AI routing/cost review, and team backlog routing.
- ECC-Tools PR #57 merged as `4cc61112a4cc9feec7b07af09321f360e34af6a4`
and added the first executable hosted analysis job:
`/api/analysis/jobs/ci-diagnostics` now gates on CI/CD readiness, inspects
workflow/test-runner/failure-evidence artifacts, returns CI hardening
findings and next actions, and charges usage only after successful execution.
- ECC-Tools PR #58 merged as `ce09dd8d9b46f65c6b88dc4f48cfb6b6227ae0bf`
and added the second executable hosted analysis job:
`/api/analysis/jobs/security-evidence-review` now gates on security-evidence
readiness, inspects capped AgentShield evidence-pack, policy, baseline,
SBOM, SARIF, and security-scan artifacts, returns supply-chain evidence
findings and next actions, and charges usage only after successful execution.
- ECC-Tools PR #59 merged as `505b372dbd8f75f996d9e2ed079effd30cec5ba5`
and added the third executable hosted analysis job:
`/api/analysis/jobs/harness-compatibility-audit` now gates on harness-config
readiness, inspects capped Claude, Codex, OpenCode, MCP, plugin, and
cross-harness documentation artifacts, excludes local secret-bearing config
paths from fetches, returns portability findings and next actions, and
charges usage only after successful execution.
- ECC-Tools PR #60 merged as `b75e0a49ba5672b1ec9a2a4880ddcfa2d07dc557`
and added the fourth executable hosted analysis job:
`/api/analysis/jobs/reference-set-evaluation` now gates on reference-evidence
readiness, evaluates analyzer corpus, RAG/evaluator, PR salvage/review,
harness, security, and CI failure-mode evidence, excludes obvious
secret-bearing fixture paths from fetches, returns reference coverage
findings and next actions, and charges usage only after successful execution.
- ECC-Tools PR #61 merged as `7b01b67cae0b80774b311cb515b7eca0aa038c65`
and added the fifth executable hosted analysis job:
`/api/analysis/jobs/ai-routing-cost-review` now gates on AI routing/cost
readiness, evaluates model routing, token budget, usage-limit, rate-limit,
billing/entitlement, cost-regression, and cost-policy evidence, excludes
obvious secret-bearing paths from fetches, returns cost-control findings and
next actions, and charges usage only after successful execution.
- ECC-Tools PR #62 merged as `781d6733e56f7556edb43fb96bdfb00b1f0a3aa6`
and added the sixth executable hosted analysis job:
`/api/analysis/jobs/team-backlog-routing` now gates on team handoff/project
tracking readiness, evaluates roadmap, runbook, handoff, release-plan,
issue-template, ownership, project-tracker, backlog, and follow-up evidence,
excludes obvious secret-bearing paths from fetches, returns team-routing
findings and next actions, and charges usage only after successful execution.
- ECC-Tools PR #63 merged as `fb9e4c5ceb9ccde50da74c7a69c3fa4bd321fc07`
and made the hosted execution plan operator-visible on queued PR analysis:
the queue now publishes a non-blocking `ECC Tools / Hosted Depth Plan`
check-run on the PR head SHA with ready/blocked hosted executor commands
and next action text, while keeping check-run publication best-effort so
bundle generation and analysis comments are not blocked.
- ECC-Tools PR #64 merged as `72020ef94db94840812977ea7ac37e9344036668`
and added PR-facing hosted job dispatch controls:
`/ecc-tools analyze --job ...` comments now queue hosted jobs against the
PR head SHA, execute them through the existing hosted readiness/evidence
gates, post artifacts/findings/next actions back to the PR, and scope
idempotency keys by job id so hosted jobs do not collide with bundle
analysis.
- ECC-Tools PR #65 merged as `bacd4adf6a3a629e8d403865456d15f127baaf4e`
and added hosted job result history/check-run summaries:
queued hosted jobs now cache both the latest result and immutable run records
for completed or blocked runs, then publish a non-blocking per-job check-run
on the PR head SHA with artifacts, findings, readiness blockers, and next
actions.
- ECC-Tools PR #66 merged as `4e1db48252d068ea5dcf4308b0bc11b0dfe0c9ce`
and added a read-only hosted status command:
`/ecc-tools analyze --job status` now reads the #65 latest-result cache for
the current PR head and posts a compact completed/blocked/not-run table with
the next hosted job command, without queueing work or billing usage.
- ECC-Tools PR #67 merged as `f20e6bec2b0bf49e4cc36e08b7285c795973b73d`
and made the hosted depth-plan check-run status-aware:
queued PR analysis now reads the #65/#66 latest-result cache when publishing
`ECC Tools / Hosted Depth Plan`, includes the latest hosted run status in
the plan table, and recommends the next unrun ready job before reruns.
- ECC-Tools PR #68 merged as `2cde524b5ef8f34ab7bb1af973248fe4be4359f8`
and added deterministic hosted promotion readiness:
opened/synchronized PRs now publish a non-blocking
`ECC Tools / Hosted Promotion Readiness` check-run that compares changed
files against the checked-in evaluator/RAG corpus, warns on missing
hosted-job promotion evidence, and can be disabled with
`PR_HOSTED_PROMOTION_READINESS_CHECK_MODE=off`.
- ECC-Tools PR #69 merged as `d0112dac7cef807ae27def41f057682ef0772cce`
and extended hosted promotion readiness with deterministic output scoring:
the check now reads cached completed hosted job results for the current PR
head, scores their artifacts and findings against evaluator/RAG corpus
expectations, and treats matching hosted artifacts as promotion evidence
before reporting a gap.
- ECC-Tools PR #70 merged as `7001d805ac981fe220b4575159f469fbea9dbb76`
and added retrieval planning for hosted promotion:
the check now emits ranked retrieval candidates from cached hosted artifacts,
hosted findings, expected evidence paths, and changed source paths, plus a
model prompt seed that tells the later hosted judge not to promote from
changed paths alone.
- ECC-Tools PR #71 merged as `d41e59ff00fe1bd0b0c96386e56bc5269d7b9c15`
and added the first model-backed hosted promotion judge contract:
the check now emits a provider-neutral `hosted-promotion-judge.v1` request
contract and fails closed unless hosted retrieval evidence, entitlement,
remaining budget, and provider configuration are present. It still does not
make live model calls.
- ECC-Tools PR #72 merged as `973bc51e5436dd279ae5a890cce9811485eef0b5`
and executes the hosted promotion model judge behind explicit gates:
`PR_HOSTED_PROMOTION_MODEL_JUDGE_MODE=execute` now calls the configured
provider only after hosted retrieval evidence, entitlement, budget, provider,
and executor gates pass; the check remains non-blocking, strict-JSON-only,
and rejects uncited or non-hosted model output without echoing raw responses.
- Handoff `ecc-supply-chain-audit-20260513-0645.md` under - Handoff `ecc-supply-chain-audit-20260513-0645.md` under
`~/.cluster-swarm/handoffs/` `~/.cluster-swarm/handoffs/`
records the May 13 supply-chain sweep: no active lockfile/manifest hit for records the May 13 supply-chain sweep: no active lockfile/manifest hit for
@@ -255,6 +372,66 @@ As of 2026-05-13:
artifact contract so canonical bundle files now satisfy the taxonomy and artifact contract so canonical bundle files now satisfy the taxonomy and
generated follow-up PRs point maintainers at generated follow-up PRs point maintainers at
`agentshield scan --evidence-pack <dir>`. `agentshield scan --evidence-pack <dir>`.
- ECC-Tools PR #55 added the first hosted/deeper-analysis readiness signal:
analysis comments now classify a repo as commit-history-only,
evidence-backed, or deep-ready before routing work into CI, AgentShield,
harness, reference-set, RAG/evaluator, AI-routing, cost-control, and
Linear/project-tracking lanes.
- ECC-Tools PR #56 turned that signal into a hosted execution-plan contract:
`/api/analysis/depth-plan` returns ready/blocked jobs and next action text
without charging analysis usage or creating bundle PRs.
- ECC-Tools PR #57 implemented the first job-specific hosted executor:
`/api/analysis/jobs/ci-diagnostics` reuses the depth-readiness gate, internal
API auth, installation ownership, repo-access billing checks, capped workflow
file reads, and usage accounting to return concrete CI hardening findings.
- ECC-Tools PR #58 implemented the second job-specific hosted executor:
`/api/analysis/jobs/security-evidence-review` applies the same hosted gates
to AgentShield evidence-pack, policy, baseline, SBOM, SARIF, and security
scanner artifacts.
- ECC-Tools PR #59 implemented the third job-specific hosted executor:
`/api/analysis/jobs/harness-compatibility-audit` applies the same hosted
gates to Claude, Codex, OpenCode, MCP, plugin, and cross-harness evidence
while avoiding local secret-bearing harness config fetches.
- ECC-Tools PR #60 implemented the fourth job-specific hosted executor:
`/api/analysis/jobs/reference-set-evaluation` applies the same hosted gates
to analyzer corpus, RAG/evaluator, PR salvage, harness, security, and CI
failure-mode reference evidence while avoiding obvious secret-bearing fixture
fetches.
- ECC-Tools PR #61 implemented the fifth job-specific hosted executor:
`/api/analysis/jobs/ai-routing-cost-review` applies the same hosted gates to
model-routing, token-budget, usage-limit, rate-limit, billing/entitlement,
cost-regression, and cost-policy evidence while avoiding obvious
secret-bearing path fetches.
- ECC-Tools PR #62 implemented the sixth job-specific hosted executor:
`/api/analysis/jobs/team-backlog-routing` applies the same hosted gates to
roadmap, runbook, handoff, release-plan, issue-template, ownership,
project-tracker, backlog, and follow-up evidence while avoiding obvious
secret-bearing path fetches.
- ECC-Tools PR #63 publishes the hosted depth-plan check-run after queued PR
analysis completes, making the six hosted executor commands visible on the
PR head SHA without turning the check into a merge blocker.
- ECC-Tools PR #64 wires those commands into the queue: maintainers can comment
`/ecc-tools analyze --job ci-diagnostics`, `security-evidence`,
`harness-compatibility`, `reference-set-evaluation`, `ai-routing-cost`, or
`team-backlog` on a PR and receive hosted job results in a PR comment.
- ECC-Tools PR #65 persists completed and blocked hosted job results to the
analysis cache for 30 days and publishes non-blocking `ECC Tools / Hosted
Job: ...` check-runs so maintainers can scan hosted outcomes from the PR
checks surface instead of rereading older comments.
- ECC-Tools PR #66 exposes the cached results from PR comments with
`/ecc-tools analyze --job status`, summarizing completed, blocked, and
not-yet-run hosted jobs for the PR head and recommending the next hosted job
command.
- ECC-Tools PR #67 feeds those cached results back into the hosted depth-plan
check-run so queued analysis recommends the next unrun ready hosted job from
cache state instead of repeating the static readiness order.
- ECC-Tools PR #68 adds the first evaluator-backed hosted promotion gate:
opened/synchronized PRs get a non-blocking Hosted Promotion Readiness
check-run that turns the evaluator/RAG corpus into warnings when changed
files match fixture scenarios without their expected evidence artifacts.
- ECC-Tools PR #69 extends that gate to score cached completed hosted job
outputs for the current PR head, so hosted artifacts can satisfy corpus
evidence expectations before the check reports a promotion gap.
- ECC PR #1803 landed the contributor Quarkus handling branch after maintainer - ECC PR #1803 landed the contributor Quarkus handling branch after maintainer
cleanup, current-`main` alignment, full local validation, and preservation of cleanup, current-`main` alignment, full local validation, and preservation of
the author's removal of incomplete ja-JP and zh-CN Quarkus translations. the author's removal of incomplete ja-JP and zh-CN Quarkus translations.
@@ -307,11 +484,11 @@ is not complete unless the evidence column exists and has been freshly verified.
| Naming and rename readiness | Naming matrix across package/plugin/docs/social surfaces | `docs/releases/2.0.0-rc.1/naming-and-publication-matrix.md` records current package, repo, Claude plugin, Codex plugin, OpenCode, and npm availability evidence | Complete for rc.1; post-rc rename remains future work | | Naming and rename readiness | Naming matrix across package/plugin/docs/social surfaces | `docs/releases/2.0.0-rc.1/naming-and-publication-matrix.md` records current package, repo, Claude plugin, Codex plugin, OpenCode, and npm availability evidence | Complete for rc.1; post-rc rename remains future work |
| Claude and Codex plugin publication | Contact/submission path with required artifacts and status | Publication readiness, naming matrix, and May 12 dry-run evidence document plugin validation, clean-checkout Claude tag/install smoke, and Codex marketplace CLI shape | Needs explicit approval for real tag/push and marketplace submission | | Claude and Codex plugin publication | Contact/submission path with required artifacts and status | Publication readiness, naming matrix, and May 12 dry-run evidence document plugin validation, clean-checkout Claude tag/install smoke, and Codex marketplace CLI shape | Needs explicit approval for real tag/push and marketplace submission |
| Articles, tweets, and announcements | X thread, LinkedIn copy, GitHub release copy, push checklist | Draft launch collateral exists under rc.1 release docs | Needs URL-backed refresh | | Articles, tweets, and announcements | X thread, LinkedIn copy, GitHub release copy, push checklist | Draft launch collateral exists under rc.1 release docs | Needs URL-backed refresh |
| AgentShield enterprise iteration | Policy gates, SARIF, packs, provenance, corpus, HTML reports, exception lifecycle audit, baseline drift Action/CLI surfaces, evidence-pack redaction, harness adapter registry, enterprise research roadmap, supply-chain hardened release path, CI-safe baseline fingerprints, corpus accuracy recommendations | PRs #53, #55-#64, #67-#69, and #78-#80 landed with test evidence; native PDF export deferred in favor of self-contained HTML plus print-to-PDF until explicit enterprise demand appears; `docs/architecture/agentshield-enterprise-research-roadmap.md` now has baseline drift, evidence-pack bundle, redaction, adapter-registry, supply-chain hardening, hashed baseline fingerprints, and corpus accuracy recommendation slices landed | Next remediation workflow depth or corpus expansion | | AgentShield enterprise iteration | Policy gates, SARIF, packs, provenance, corpus, HTML reports, exception lifecycle audit, baseline drift Action/CLI surfaces, evidence-pack redaction, harness adapter registry, enterprise research roadmap, supply-chain hardened release path, CI-safe baseline fingerprints, corpus accuracy recommendations, remediation workflow phases, env proxy hijack corpus coverage | PRs #53, #55-#64, #67-#69, and #78-#82 landed with test evidence; native PDF export deferred in favor of self-contained HTML plus print-to-PDF until explicit enterprise demand appears; `docs/architecture/agentshield-enterprise-research-roadmap.md` now has baseline drift, evidence-pack bundle, redaction, adapter-registry, supply-chain hardening, hashed baseline fingerprints, corpus accuracy recommendation, remediation workflow, and env proxy hijack corpus slices landed | Next hosted evidence-pack workflow depth |
| ECC Tools next-level app | Billing audit, PR checks, deep analyzer, sync backlog, evaluator/RAG corpus | PRs #26-#43 plus #53/#54 landed with test evidence, including AgentShield evidence-pack gap routing, canonical bundle recognition, supply-chain signature gates, and PR draft follow-up Linear tracking | Needs hosted/deeper analysis follow-up | | ECC Tools next-level app | Billing audit, PR checks, deep analyzer, sync backlog, evaluator/RAG corpus, analysis-depth readiness, hosted execution planning, hosted CI diagnostics, hosted security evidence review, hosted harness compatibility audit, hosted reference-set evaluation, hosted AI routing/cost review, hosted team backlog routing, hosted depth-plan check-run, PR-comment hosted job dispatch, hosted job result history/check-runs, hosted result status command, status-aware depth-plan recommendations, hosted promotion readiness, hosted promotion output scoring, hosted promotion retrieval planning, hosted promotion judge contract, gated hosted promotion judge execution | PRs #26-#43 plus #53-#72 landed with test evidence, including AgentShield evidence-pack gap routing, canonical bundle recognition, supply-chain signature gates, PR draft follow-up Linear tracking, evidence-backed/deep-ready repository classification, the `/api/analysis/depth-plan` hosted job plan, `/api/analysis/jobs/ci-diagnostics`, `/api/analysis/jobs/security-evidence-review`, `/api/analysis/jobs/harness-compatibility-audit`, `/api/analysis/jobs/reference-set-evaluation`, `/api/analysis/jobs/ai-routing-cost-review`, `/api/analysis/jobs/team-backlog-routing`, the `ECC Tools / Hosted Depth Plan` check-run, `/ecc-tools analyze --job ...` PR-comment dispatch, non-blocking per-hosted-job result check-runs backed by 30-day result cache records, `/ecc-tools analyze --job status` cache lookup, cache-aware next-job recommendations in the depth-plan check-run, the `ECC Tools / Hosted Promotion Readiness` corpus-backed PR check-run, deterministic hosted-output scoring against cached completed job artifacts/findings, ranked retrieval/model-prompt planning, the fail-closed `hosted-promotion-judge.v1` request contract, and opt-in live model-judge execution behind hosted evidence, entitlement, budget, provider, executor, strict JSON, and citation gates | Next work is hosted promotion telemetry and operator review UX |
| GitGuardian/Dependabot/CodeRabbit-style checks | Non-blocking taxonomy, deterministic follow-up checks, and local supply-chain gates | ECC-Tools risk taxonomy check plus follow-up signals landed, including Skill Quality, Deep Analyzer Evidence, Analyzer Corpus Evidence, RAG/Evaluator Evidence, PR Review/Salvage Evidence, and AgentShield evidence-pack evidence; #1846 added npm registry signature gates; #1848 added the supply-chain incident-response playbook and `pull_request_target` cache-poisoning validator guard; #1851 added the privileged checkout credential-persistence guard; AgentShield #78, JARVIS #13, and ECC-Tools #53 applied the same hardening outside trunk | Current supply-chain gate complete; deeper hosted review features remain future | | GitGuardian/Dependabot/CodeRabbit-style checks | Non-blocking taxonomy, deterministic follow-up checks, and local supply-chain gates | ECC-Tools risk taxonomy check plus follow-up signals landed, including Skill Quality, Deep Analyzer Evidence, Analyzer Corpus Evidence, RAG/Evaluator Evidence, PR Review/Salvage Evidence, and AgentShield evidence-pack evidence; #1846 added npm registry signature gates; #1848 added the supply-chain incident-response playbook and `pull_request_target` cache-poisoning validator guard; #1851 added the privileged checkout credential-persistence guard; AgentShield #78, JARVIS #13, and ECC-Tools #53 applied the same hardening outside trunk | Current supply-chain gate complete; deeper hosted review features remain future |
| Harness-agnostic learning system | Audit, adapter matrix, observability, traces, promotion loop | Audit/adapters/observability gates plus `docs/architecture/evaluator-rag-prototype.md`, `examples/evaluator-rag-prototype/`, and ECC-Tools PR #40 define read-only stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison scenarios with trace, report, playbook, verifier, and predictive-check artifacts | Local corpus complete; hosted integration remains future | | Harness-agnostic learning system | Audit, adapter matrix, observability, traces, promotion loop | Audit/adapters/observability gates plus `docs/architecture/evaluator-rag-prototype.md`, `examples/evaluator-rag-prototype/`, and ECC-Tools PR #40 define read-only stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison scenarios with trace, report, playbook, verifier, and predictive-check artifacts; ECC-Tools PRs #68-#72 now turn that corpus into a deterministic PR check-run gate with cached hosted-output scoring, ranked retrieval candidates, a model prompt seed, a fail-closed hosted model-judge request contract, and opt-in live model execution behind strict hosted-evidence gates | Deterministic hosted PR check, cached output scoring, retrieval planning, judge contract, and gated model execution integrated |
| Linear roadmap is detailed | Linear project status plus repo mirror | Repo mirror exists; issue creation was retried on 2026-05-12 and remains blocked by the workspace free issue limit; this May 13 sync adds ECC #1860, AgentShield #78/#79, JARVIS #13, ECC-Tools #53/#54, resolved queue/discussion counts, and Linear project status updates `59f630eb`/`c7ea6daf` | Needs recurring status updates after each merge batch | | Linear roadmap is detailed | Linear project status plus repo mirror | Repo mirror exists; issue creation was retried on 2026-05-12 and remains blocked by the workspace free issue limit; this May 13 sync adds ECC #1860, AgentShield #78-#82, JARVIS #13, ECC-Tools #53-#72, resolved queue/discussion counts, and notes that Linear connector status updates after ECC-Tools #68 remain blocked by a connector secret-owner error | Needs recurring status updates after connector recovery |
| Flow separation and progress tracking | Flow lanes with owner artifacts and update cadence | This roadmap defines lanes below and `docs/architecture/progress-sync-contract.md` makes GitHub/Linear/handoff/roadmap sync part of the readiness gate | Active | | Flow separation and progress tracking | Flow lanes with owner artifacts and update cadence | This roadmap defines lanes below and `docs/architecture/progress-sync-contract.md` makes GitHub/Linear/handoff/roadmap sync part of the readiness gate | Active |
| Realtime Linear sync | Project updates while issue limit is blocked; issues later | ECC-Tools #39 implements opt-in Linear API sync for deferred follow-up backlog items, and ECC-Tools #54 adds copy-ready PR drafts to that backlog when draft PR shells are not opened; `docs/architecture/progress-sync-contract.md` defines the local file-backed realtime boundary while issue capacity is blocked | Needs workspace capacity/config rollout | | Realtime Linear sync | Project updates while issue limit is blocked; issues later | ECC-Tools #39 implements opt-in Linear API sync for deferred follow-up backlog items, and ECC-Tools #54 adds copy-ready PR drafts to that backlog when draft PR shells are not opened; `docs/architecture/progress-sync-contract.md` defines the local file-backed realtime boundary while issue capacity is blocked | Needs workspace capacity/config rollout |
| Observability for self-use | Local readiness gate, traces, status snapshots, HUD/status contract, risk ledger, progress-sync contract | `npm run observability:ready` reports 21/21 | Complete for local gate | | Observability for self-use | Local readiness gate, traces, status snapshots, HUD/status contract, risk ledger, progress-sync contract | `npm run observability:ready` reports 21/21 | Complete for local gate |
@@ -330,9 +507,9 @@ repo evidence and merge commits.
| Queue hygiene and salvage | GitHub PR/issue state, salvage ledger | Append ledger entries for any future stale closures | Every cleanup batch | | Queue hygiene and salvage | GitHub PR/issue state, salvage ledger | Append ledger entries for any future stale closures | Every cleanup batch |
| Release and publication | rc.1 release docs, publication readiness doc | Naming matrix and plugin submission/contact checklist | Before any tag | | Release and publication | rc.1 release docs, publication readiness doc | Naming matrix and plugin submission/contact checklist | Before any tag |
| Harness OS core | Audit, adapter matrix, observability docs, `ecc2/` | HUD/session-control acceptance spec | Weekly until GA | | Harness OS core | Audit, adapter matrix, observability docs, `ecc2/` | HUD/session-control acceptance spec | Weekly until GA |
| Evaluation and RAG | Reference-set validation, harness audit, traces, ECC-Tools corpus | Read-only evaluator/RAG prototype plus stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison fixtures | Hosted retrieval/check-run automation plan | | Evaluation and RAG | Reference-set validation, harness audit, traces, ECC-Tools corpus | Read-only evaluator/RAG prototype plus stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison fixtures; ECC-Tools #68 publishes the corpus as a hosted promotion readiness check-run, #69 scores cached hosted job outputs against the same corpus, #70 emits ranked retrieval candidates plus a model prompt seed, #71 adds a fail-closed hosted model-judge request contract, and #72 executes that judge only when explicitly enabled and backed by hosted retrieval citations | Hosted promotion telemetry and operator review UX |
| AgentShield enterprise | AgentShield PR evidence and roadmap notes | Remediation workflow depth or corpus expansion follow-up | Next implementation batch | | AgentShield enterprise | AgentShield PR evidence and roadmap notes | Remediation workflow depth or corpus expansion follow-up | Next implementation batch |
| ECC Tools app | ECC-Tools PR evidence, billing audit, risk taxonomy, evaluator/RAG corpus | ECC-Tools #53 published the supply-chain workflow hardening branch and #54 tracks copy-ready PR drafts in the Linear/project backlog; next work is hosted/deeper analysis follow-up | Next implementation batch | | ECC Tools app | ECC-Tools PR evidence, billing audit, risk taxonomy, evaluator/RAG corpus | ECC-Tools #53 published the supply-chain workflow hardening branch, #54 tracks copy-ready PR drafts in the Linear/project backlog, #55 classifies analysis-depth readiness, #56 exposes the hosted execution plan, #57 executes the first hosted CI diagnostics job, #58 executes the hosted security evidence review job, #59 executes the hosted harness compatibility audit, #60 executes the hosted reference-set evaluation, #61 executes the hosted AI routing/cost review, #62 executes hosted team backlog routing, #63 publishes the hosted depth-plan check-run, #64 dispatches hosted jobs from PR comments, #65 persists hosted result history/check-runs, #66 exposes hosted job status from PR comments, #67 makes depth-plan recommendations cache-aware, #68 publishes hosted promotion readiness from the evaluator/RAG corpus, #69 scores cached hosted job outputs against that corpus, #70 emits ranked retrieval candidates plus a model prompt seed, #71 emits the gated `hosted-promotion-judge.v1` contract without live model calls, and #72 adds opt-in live model-judge execution behind hosted-evidence and strict JSON/citation gates | Next implementation batch |
| Linear progress | Linear project status updates, `docs/architecture/progress-sync-contract.md`, and this mirror | Status update with queue/evidence/missing gates | Every significant merge batch | | Linear progress | Linear project status updates, `docs/architecture/progress-sync-contract.md`, and this mirror | Status update with queue/evidence/missing gates | Every significant merge batch |
The project status update should always include: The project status update should always include:
@@ -545,16 +722,16 @@ Acceptance:
supply-chain incident class; PR #79 moved baseline/watch/remediation supply-chain incident class; PR #79 moved baseline/watch/remediation
fingerprints to hashed evidence and stopped writing raw evidence into new fingerprints to hashed evidence and stopped writing raw evidence into new
baselines; PR #80 added prioritized corpus accuracy recommendations for baselines; PR #80 added prioritized corpus accuracy recommendations for
failed regression gates; and ECC-Tools PRs #42/#43 now route and recognize failed regression gates; PR #81 added ordered remediation workflow phases;
evidence packs. The next slice is remediation workflow depth or corpus PR #82 expanded corpus coverage for env proxy hijacks and out-of-band
expansion. exfiltration; and ECC-Tools PRs #42/#43 now route and recognize evidence
2. Keep ECC-Tools #53's supply-chain workflow gate and #54's PR-draft backlog packs. The next slice is hosted evidence-pack workflow depth.
tracking in the recurring queue evidence, and use the org-scoped GitHub auth 2. Add hosted promotion telemetry and operator review UX on top of the #72
path for future ECC-Tools maintenance while the narrow environment token gated model execution path so live judgments can be audited before any
remains active. promotion policy becomes enforceable.
3. Enable/configure the merged Linear backlog sync path after workspace issue 3. Enable/configure the merged Linear backlog sync path after workspace issue
capacity clears or the Linear workspace is upgraded, then verify PR-draft capacity clears or the Linear workspace is upgraded, then verify PR-draft
salvage items land in the expected project. salvage items land in the expected project.
4. Use the ECC-Tools evaluator/RAG corpus as the promotion gate before adding 4. Use the ECC-Tools evaluator/RAG corpus as the promotion gate before adding
hosted retrieval, vector storage, model-backed judging, or automated hosted retrieval, vector storage, live model-backed judging, or automated
check-run promotion. check-run promotion.
+16 -4
View File
@@ -19,7 +19,8 @@ const {
getTempDir, getTempDir,
writeFile, writeFile,
readStdinJson, readStdinJson,
log log,
output
} = require('../lib/utils'); } = require('../lib/utils');
async function resolveSessionId() { async function resolveSessionId() {
@@ -77,14 +78,25 @@ async function main() {
writeFile(counterFile, String(count)); writeFile(counterFile, String(count));
} }
// Suggest compact after threshold tool calls // Suggest compact after threshold tool calls.
//
// log() writes to stderr (debug log). Per the Claude Code hooks guide,
// non-blocking PreToolUse stderr (exit 0) is only written to the debug log;
// it does not reach the model. To inject a user-facing suggestion without
// blocking the tool call, emit structured JSON to stdout with
// hookSpecificOutput.additionalContext — the documented mechanism for
// PreToolUse hooks to add context to the next model turn.
if (count === threshold) { if (count === threshold) {
log(`[StrategicCompact] ${threshold} tool calls reached - consider /compact if transitioning phases`); const msg = `[StrategicCompact] ${threshold} tool calls reached - consider /compact if transitioning phases`;
log(msg);
output({ hookSpecificOutput: { hookEventName: 'PreToolUse', additionalContext: msg } });
} }
// Suggest at regular intervals after threshold (every 25 calls from threshold) // Suggest at regular intervals after threshold (every 25 calls from threshold)
if (count > threshold && (count - threshold) % 25 === 0) { if (count > threshold && (count - threshold) % 25 === 0) {
log(`[StrategicCompact] ${count} tool calls - good checkpoint for /compact if context is stale`); const msg = `[StrategicCompact] ${count} tool calls - good checkpoint for /compact if context is stale`;
log(msg);
output({ hookSpecificOutput: { hookEventName: 'PreToolUse', additionalContext: msg } });
} }
process.exit(0); process.exit(0);
+2 -2
View File
@@ -130,12 +130,12 @@ test('candidate playbook preserves stale-salvage operating rules', () => {
} }
}); });
test('roadmap points to the evaluator RAG prototype and keeps hosted integration open', () => { test('roadmap points to the evaluator RAG prototype and hosted PR check', () => {
const roadmap = read('docs/ECC-2.0-GA-ROADMAP.md'); const roadmap = read('docs/ECC-2.0-GA-ROADMAP.md');
assert.ok(roadmap.includes('docs/architecture/evaluator-rag-prototype.md')); assert.ok(roadmap.includes('docs/architecture/evaluator-rag-prototype.md'));
assert.ok(roadmap.includes('examples/evaluator-rag-prototype/')); assert.ok(roadmap.includes('examples/evaluator-rag-prototype/'));
assert.ok(roadmap.includes('Local corpus complete; hosted integration remains future')); assert.ok(roadmap.includes('Deterministic hosted PR check, cached output scoring, retrieval planning, judge contract, and gated model execution integrated'));
}); });
test('billing readiness scenario rejects launch copy overclaims', () => { test('billing readiness scenario rejects launch copy overclaims', () => {
+60
View File
@@ -366,6 +366,66 @@ function runTests() {
})) passed++; })) passed++;
else failed++; else failed++;
// ── hookSpecificOutput JSON on stdout ──
// Claude Code 2.1+ drops non-blocking PreToolUse stderr; the suggestion has
// to ride on stdout as { hookSpecificOutput: { additionalContext } } to reach
// the model. These tests pin that contract.
console.log('\nhookSpecificOutput stdout JSON:');
if (test('emits hookSpecificOutput.additionalContext on stdout at threshold', () => {
const { sessionId, counterFile, cleanup } = createCounterContext();
cleanup();
fs.writeFileSync(counterFile, '49');
const result = runCompact({ CLAUDE_SESSION_ID: sessionId });
assert.strictEqual(result.code, 0, 'Should exit 0');
assert.ok(result.stdout.trim().length > 0, `Expected stdout payload at threshold. Got: "${result.stdout}"`);
const parsed = JSON.parse(result.stdout);
assert.strictEqual(parsed.hookSpecificOutput.hookEventName, 'PreToolUse',
`hookEventName should be PreToolUse. Got: ${JSON.stringify(parsed)}`);
assert.ok(parsed.hookSpecificOutput.additionalContext.includes('50 tool calls reached'),
`additionalContext should include threshold text. Got: ${parsed.hookSpecificOutput.additionalContext}`);
cleanup();
})) passed++;
else failed++;
if (test('emits hookSpecificOutput.additionalContext on stdout at +25 interval', () => {
const { sessionId, counterFile, cleanup } = createCounterContext();
cleanup();
// threshold=3, set counter to 27 → next run = 28 → 28-3=25 → interval hit
fs.writeFileSync(counterFile, '27');
const result = runCompact({ CLAUDE_SESSION_ID: sessionId, COMPACT_THRESHOLD: '3' });
assert.strictEqual(result.code, 0, 'Should exit 0');
assert.ok(result.stdout.trim().length > 0, `Expected stdout payload at interval. Got: "${result.stdout}"`);
const parsed = JSON.parse(result.stdout);
assert.strictEqual(parsed.hookSpecificOutput.hookEventName, 'PreToolUse');
assert.ok(parsed.hookSpecificOutput.additionalContext.includes('28 tool calls'),
`additionalContext should include count. Got: ${parsed.hookSpecificOutput.additionalContext}`);
cleanup();
})) passed++;
else failed++;
if (test('emits no stdout below threshold (silent)', () => {
const { sessionId, cleanup } = createCounterContext();
cleanup();
const result = runCompact({ CLAUDE_SESSION_ID: sessionId, COMPACT_THRESHOLD: '5' });
assert.strictEqual(result.code, 0);
assert.strictEqual(result.stdout.trim(), '',
`Expected empty stdout below threshold. Got: "${result.stdout}"`);
cleanup();
})) passed++;
else failed++;
if (test('still writes [StrategicCompact] to stderr (debug log retained)', () => {
const { sessionId, counterFile, cleanup } = createCounterContext();
cleanup();
fs.writeFileSync(counterFile, '49');
const result = runCompact({ CLAUDE_SESSION_ID: sessionId });
assert.ok(result.stderr.includes('[StrategicCompact]'),
`stderr should retain [StrategicCompact] for debug log capture. Got: "${result.stderr}"`);
cleanup();
})) passed++;
else failed++;
// ── Round 64: default session ID fallback ── // ── Round 64: default session ID fallback ──
console.log('\nDefault session ID fallback (Round 64):'); console.log('\nDefault session ID fallback (Round 64):');