fix(hooks): emit suggest-compact via hookSpecificOutput stdout

The threshold and interval suggestions in suggest-compact.js are written to stderr via log(). Per the Claude Code hooks guide, non-blocking PreToolUse stderr (exit code 0) is only captured in the debug log; it does not reach the model. As shipped, the script's nudge to /compact is silent on Claude Code 2.1.x.

Fix: alongside the existing log() call (kept for debug-log capture), emit the same suggestion as structured JSON on stdout:

  { hookSpecificOutput: { hookEventName: "PreToolUse", additionalContext: msg } }

This is the documented mechanism for a PreToolUse hook to inject context into the next model turn without blocking the tool call.

Verified end-to-end on Claude Code 2.1.142 (VSCode native extension, Windows 11): the additionalContext now surfaces in the next turn as a <system-reminder> block. Counter increment and exit code behavior unchanged.

Tests: 4 new cases in tests/hooks/suggest-compact.test.js covering stdout JSON at threshold, stdout JSON at +25 interval, silence below threshold, and stderr-retention for the debug log. Suite goes from 19/19 -> 23/23 (suggest-compact) and full run-all stays clean for the unaffected suites (the 4 pre-existing Windows broken-symlink failures in ci/validators, lib/session-manager, and lib/utils are unrelated to this change).
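
For readers who want the mechanism in isolation, here is a minimal sketch of the trigger logic and the stdout payload described in the commit message. The helper names (suggestionFor, toHookOutput, emit) are hypothetical and exist only for this illustration. The real script uses log() and output() from ../lib/utils, whose implementations are not shown here, so console.error and process.stdout.write stand in for them as assumptions.

```javascript
// Sketch only: helper names are hypothetical, not part of suggest-compact.js.
// The JSON shape and the message strings are taken from the commit itself.
const INTERVAL = 25; // the diff hard-codes 25 calls between repeat suggestions

// Return the suggestion text for this tool-call count, or null when silent.
function suggestionFor(count, threshold, interval = INTERVAL) {
  if (count === threshold) {
    return `[StrategicCompact] ${threshold} tool calls reached - consider /compact if transitioning phases`;
  }
  if (count > threshold && (count - threshold) % interval === 0) {
    return `[StrategicCompact] ${count} tool calls - good checkpoint for /compact if context is stale`;
  }
  return null;
}

// Wrap a message in the PreToolUse payload shape quoted in the commit message.
function toHookOutput(msg) {
  return { hookSpecificOutput: { hookEventName: 'PreToolUse', additionalContext: msg } };
}

// Write the message to stderr (debug log) and the JSON payload to stdout.
// Assumption: stand-ins for log() and output() from ../lib/utils.
function emit(count, threshold) {
  const msg = suggestionFor(count, threshold);
  if (msg === null) return false;
  console.error(msg);
  process.stdout.write(JSON.stringify(toHookOutput(msg)) + '\n');
  return true;
}

emit(50, 50); // threshold hit: one stderr line plus one JSON line on stdout
```

With a threshold of 3, a count of 28 also triggers a suggestion because 28 - 3 = 25, which is the case the "+25 interval" test exercises. Counts that are neither the threshold nor a multiple of 25 past it produce nothing on either stream.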
docs: sync ECC Tools judge execution (#1901)
4 changed files with 269 additions and 20 deletions
--- a/docs/ECC-2.0-GA-ROADMAP.md
+++ b/docs/ECC-2.0-GA-ROADMAP.md
@@ -61,6 +61,14 @@ As of 2026-05-13:
  and added prioritized corpus accuracy recommendations to failed corpus gates,
  mapping misses by category, missing rule, and config ID so enterprise
  scanner-regression work has an actionable improvement plan.
+- AgentShield PR #81 merged as `6583884e74ba2e896942113e1ce3146230e6fb76`
+  and added ordered remediation workflow phases to remediation plans, routing
+  safe auto-fixes, manual review, and verification through stable finding
+  fingerprints without copying raw evidence.
+- AgentShield PR #82 merged as `51336ba074ad5e9fed2c0aa3237422be22147e76`
+  and expanded the built-in attack corpus with an env proxy hijack scenario
+  covering proxy/runtime mutation, env-token exfiltration, DNS exfiltration,
+  credential-store access, and clipboard access.
 - JARVIS PR #13 merged as `127efabbfb5033ae53d7a53e1546aa3c33d6f962`
  and hardened CI/deploy workflows with npm registry signature verification,
  disabled persisted checkout credentials in write-permission jobs, and pinned
@@ -72,6 +80,115 @@ As of 2026-05-13:
  and made `/ecc-tools followups sync-linear` track copy-ready PR drafts in
  the Linear/project backlog when `open-pr-drafts` is not used, preserving
  useful stale-PR salvage work without opening extra PR shells.
+- ECC-Tools PR #55 merged as `5d8c112cce4794cfa089d5b0ea661ba87a178be1`
+  and added analysis-depth readiness to `/ecc-tools analyze` comments,
+  separating commit-history-only repos from evidence-backed and deep-ready repos
+  using CI/CD, security, harness, reference/eval, AI routing/cost-control, and
+  team handoff evidence.
+- ECC-Tools PR #56 merged as `5b729c88641eafe80f65364bab3fc74d0270f57b`
+  and added the authenticated `/api/analysis/depth-plan` contract that maps
+  analysis-depth readiness into concrete hosted jobs for CI diagnostics,
+  security evidence review, harness compatibility, reference-set evaluation,
+  AI routing/cost review, and team backlog routing.
+- ECC-Tools PR #57 merged as `4cc61112a4cc9feec7b07af09321f360e34af6a4`
+  and added the first executable hosted analysis job:
+  `/api/analysis/jobs/ci-diagnostics` now gates on CI/CD readiness, inspects
+  workflow/test-runner/failure-evidence artifacts, returns CI hardening
+  findings and next actions, and charges usage only after successful execution.
+- ECC-Tools PR #58 merged as `ce09dd8d9b46f65c6b88dc4f48cfb6b6227ae0bf`
+  and added the second executable hosted analysis job:
+  `/api/analysis/jobs/security-evidence-review` now gates on security-evidence
+  readiness, inspects capped AgentShield evidence-pack, policy, baseline,
+  SBOM, SARIF, and security-scan artifacts, returns supply-chain evidence
+  findings and next actions, and charges usage only after successful execution.
+- ECC-Tools PR #59 merged as `505b372dbd8f75f996d9e2ed079effd30cec5ba5`
+  and added the third executable hosted analysis job:
+  `/api/analysis/jobs/harness-compatibility-audit` now gates on harness-config
+  readiness, inspects capped Claude, Codex, OpenCode, MCP, plugin, and
+  cross-harness documentation artifacts, excludes local secret-bearing config
+  paths from fetches, returns portability findings and next actions, and
+  charges usage only after successful execution.
+- ECC-Tools PR #60 merged as `b75e0a49ba5672b1ec9a2a4880ddcfa2d07dc557`
+  and added the fourth executable hosted analysis job:
+  `/api/analysis/jobs/reference-set-evaluation` now gates on reference-evidence
+  readiness, evaluates analyzer corpus, RAG/evaluator, PR salvage/review,
+  harness, security, and CI failure-mode evidence, excludes obvious
+  secret-bearing fixture paths from fetches, returns reference coverage
+  findings and next actions, and charges usage only after successful execution.
+- ECC-Tools PR #61 merged as `7b01b67cae0b80774b311cb515b7eca0aa038c65`
+  and added the fifth executable hosted analysis job:
+  `/api/analysis/jobs/ai-routing-cost-review` now gates on AI routing/cost
+  readiness, evaluates model routing, token budget, usage-limit, rate-limit,
+  billing/entitlement, cost-regression, and cost-policy evidence, excludes
+  obvious secret-bearing paths from fetches, returns cost-control findings and
+  next actions, and charges usage only after successful execution.
+- ECC-Tools PR #62 merged as `781d6733e56f7556edb43fb96bdfb00b1f0a3aa6`
+  and added the sixth executable hosted analysis job:
+  `/api/analysis/jobs/team-backlog-routing` now gates on team handoff/project
+  tracking readiness, evaluates roadmap, runbook, handoff, release-plan,
+  issue-template, ownership, project-tracker, backlog, and follow-up evidence,
+  excludes obvious secret-bearing paths from fetches, returns team-routing
+  findings and next actions, and charges usage only after successful execution.
+- ECC-Tools PR #63 merged as `fb9e4c5ceb9ccde50da74c7a69c3fa4bd321fc07`
+  and made the hosted execution plan operator-visible on queued PR analysis:
+  the queue now publishes a non-blocking `ECC Tools / Hosted Depth Plan`
+  check-run on the PR head SHA with ready/blocked hosted executor commands
+  and next action text, while keeping check-run publication best-effort so
+  bundle generation and analysis comments are not blocked.
+- ECC-Tools PR #64 merged as `72020ef94db94840812977ea7ac37e9344036668`
+  and added PR-facing hosted job dispatch controls:
+  `/ecc-tools analyze --job ...` comments now queue hosted jobs against the
+  PR head SHA, execute them through the existing hosted readiness/evidence
+  gates, post artifacts/findings/next actions back to the PR, and scope
+  idempotency keys by job id so hosted jobs do not collide with bundle
+  analysis.
+- ECC-Tools PR #65 merged as `bacd4adf6a3a629e8d403865456d15f127baaf4e`
+  and added hosted job result history/check-run summaries:
+  queued hosted jobs now cache both the latest result and immutable run records
+  for completed or blocked runs, then publish a non-blocking per-job check-run
+  on the PR head SHA with artifacts, findings, readiness blockers, and next
+  actions.
+- ECC-Tools PR #66 merged as `4e1db48252d068ea5dcf4308b0bc11b0dfe0c9ce`
+  and added a read-only hosted status command:
+  `/ecc-tools analyze --job status` now reads the #65 latest-result cache for
+  the current PR head and posts a compact completed/blocked/not-run table with
+  the next hosted job command, without queueing work or billing usage.
+- ECC-Tools PR #67 merged as `f20e6bec2b0bf49e4cc36e08b7285c795973b73d`
+  and made the hosted depth-plan check-run status-aware:
+  queued PR analysis now reads the #65/#66 latest-result cache when publishing
+  `ECC Tools / Hosted Depth Plan`, includes the latest hosted run status in
+  the plan table, and recommends the next unrun ready job before reruns.
+- ECC-Tools PR #68 merged as `2cde524b5ef8f34ab7bb1af973248fe4be4359f8`
+  and added deterministic hosted promotion readiness:
+  opened/synchronized PRs now publish a non-blocking
+  `ECC Tools / Hosted Promotion Readiness` check-run that compares changed
+  files against the checked-in evaluator/RAG corpus, warns on missing
+  hosted-job promotion evidence, and can be disabled with
+  `PR_HOSTED_PROMOTION_READINESS_CHECK_MODE=off`.
+- ECC-Tools PR #69 merged as `d0112dac7cef807ae27def41f057682ef0772cce`
+  and extended hosted promotion readiness with deterministic output scoring:
+  the check now reads cached completed hosted job results for the current PR
+  head, scores their artifacts and findings against evaluator/RAG corpus
+  expectations, and treats matching hosted artifacts as promotion evidence
+  before reporting a gap.
+- ECC-Tools PR #70 merged as `7001d805ac981fe220b4575159f469fbea9dbb76`
+  and added retrieval planning for hosted promotion:
+  the check now emits ranked retrieval candidates from cached hosted artifacts,
+  hosted findings, expected evidence paths, and changed source paths, plus a
+  model prompt seed that tells the later hosted judge not to promote from
+  changed paths alone.
+- ECC-Tools PR #71 merged as `d41e59ff00fe1bd0b0c96386e56bc5269d7b9c15`
+  and added the first model-backed hosted promotion judge contract:
+  the check now emits a provider-neutral `hosted-promotion-judge.v1` request
+  contract and fails closed unless hosted retrieval evidence, entitlement,
+  remaining budget, and provider configuration are present. It still does not
+  make live model calls.
+- ECC-Tools PR #72 merged as `973bc51e5436dd279ae5a890cce9811485eef0b5`
+  and executes the hosted promotion model judge behind explicit gates:
+  `PR_HOSTED_PROMOTION_MODEL_JUDGE_MODE=execute` now calls the configured
+  provider only after hosted retrieval evidence, entitlement, budget, provider,
+  and executor gates pass; the check remains non-blocking, strict-JSON-only,
+  and rejects uncited or non-hosted model output without echoing raw responses.
 - Handoff `ecc-supply-chain-audit-20260513-0645.md` under
  `~/.cluster-swarm/handoffs/`
  records the May 13 supply-chain sweep: no active lockfile/manifest hit for
@@ -255,6 +372,66 @@ As of 2026-05-13:
  artifact contract so canonical bundle files now satisfy the taxonomy and
  generated follow-up PRs point maintainers at
  `agentshield scan --evidence-pack <dir>`.
+- ECC-Tools PR #55 added the first hosted/deeper-analysis readiness signal:
+  analysis comments now classify a repo as commit-history-only,
+  evidence-backed, or deep-ready before routing work into CI, AgentShield,
+  harness, reference-set, RAG/evaluator, AI-routing, cost-control, and
+  Linear/project-tracking lanes.
+- ECC-Tools PR #56 turned that signal into a hosted execution-plan contract:
+  `/api/analysis/depth-plan` returns ready/blocked jobs and next action text
+  without charging analysis usage or creating bundle PRs.
+- ECC-Tools PR #57 implemented the first job-specific hosted executor:
+  `/api/analysis/jobs/ci-diagnostics` reuses the depth-readiness gate, internal
+  API auth, installation ownership, repo-access billing checks, capped workflow
+  file reads, and usage accounting to return concrete CI hardening findings.
+- ECC-Tools PR #58 implemented the second job-specific hosted executor:
+  `/api/analysis/jobs/security-evidence-review` applies the same hosted gates
+  to AgentShield evidence-pack, policy, baseline, SBOM, SARIF, and security
+  scanner artifacts.
+- ECC-Tools PR #59 implemented the third job-specific hosted executor:
+  `/api/analysis/jobs/harness-compatibility-audit` applies the same hosted
+  gates to Claude, Codex, OpenCode, MCP, plugin, and cross-harness evidence
+  while avoiding local secret-bearing harness config fetches.
+- ECC-Tools PR #60 implemented the fourth job-specific hosted executor:
+  `/api/analysis/jobs/reference-set-evaluation` applies the same hosted gates
+  to analyzer corpus, RAG/evaluator, PR salvage, harness, security, and CI
+  failure-mode reference evidence while avoiding obvious secret-bearing fixture
+  fetches.
+- ECC-Tools PR #61 implemented the fifth job-specific hosted executor:
+  `/api/analysis/jobs/ai-routing-cost-review` applies the same hosted gates to
+  model-routing, token-budget, usage-limit, rate-limit, billing/entitlement,
+  cost-regression, and cost-policy evidence while avoiding obvious
+  secret-bearing path fetches.
+- ECC-Tools PR #62 implemented the sixth job-specific hosted executor:
+  `/api/analysis/jobs/team-backlog-routing` applies the same hosted gates to
+  roadmap, runbook, handoff, release-plan, issue-template, ownership,
+  project-tracker, backlog, and follow-up evidence while avoiding obvious
+  secret-bearing path fetches.
+- ECC-Tools PR #63 publishes the hosted depth-plan check-run after queued PR
+  analysis completes, making the six hosted executor commands visible on the
+  PR head SHA without turning the check into a merge blocker.
+- ECC-Tools PR #64 wires those commands into the queue: maintainers can comment
+  `/ecc-tools analyze --job ci-diagnostics`, `security-evidence`,
+  `harness-compatibility`, `reference-set-evaluation`, `ai-routing-cost`, or
+  `team-backlog` on a PR and receive hosted job results in a PR comment.
+- ECC-Tools PR #65 persists completed and blocked hosted job results to the
+  analysis cache for 30 days and publishes non-blocking `ECC Tools / Hosted
+  Job: ...` check-runs so maintainers can scan hosted outcomes from the PR
+  checks surface instead of rereading older comments.
+- ECC-Tools PR #66 exposes the cached results from PR comments with
+  `/ecc-tools analyze --job status`, summarizing completed, blocked, and
+  not-yet-run hosted jobs for the PR head and recommending the next hosted job
+  command.
+- ECC-Tools PR #67 feeds those cached results back into the hosted depth-plan
+  check-run so queued analysis recommends the next unrun ready hosted job from
+  cache state instead of repeating the static readiness order.
+- ECC-Tools PR #68 adds the first evaluator-backed hosted promotion gate:
+  opened/synchronized PRs get a non-blocking Hosted Promotion Readiness
+  check-run that turns the evaluator/RAG corpus into warnings when changed
+  files match fixture scenarios without their expected evidence artifacts.
+- ECC-Tools PR #69 extends that gate to score cached completed hosted job
+  outputs for the current PR head, so hosted artifacts can satisfy corpus
+  evidence expectations before the check reports a promotion gap.
 - ECC PR #1803 landed the contributor Quarkus handling branch after maintainer
  cleanup, current-`main` alignment, full local validation, and preservation of
  the author's removal of incomplete ja-JP and zh-CN Quarkus translations.
@@ -307,11 +484,11 @@ is not complete unless the evidence column exists and has been freshly verified.
 | Naming and rename readiness | Naming matrix across package/plugin/docs/social surfaces | `docs/releases/2.0.0-rc.1/naming-and-publication-matrix.md` records current package, repo, Claude plugin, Codex plugin, OpenCode, and npm availability evidence | Complete for rc.1; post-rc rename remains future work |
 | Claude and Codex plugin publication | Contact/submission path with required artifacts and status | Publication readiness, naming matrix, and May 12 dry-run evidence document plugin validation, clean-checkout Claude tag/install smoke, and Codex marketplace CLI shape | Needs explicit approval for real tag/push and marketplace submission |
 | Articles, tweets, and announcements | X thread, LinkedIn copy, GitHub release copy, push checklist | Draft launch collateral exists under rc.1 release docs | Needs URL-backed refresh |
-| AgentShield enterprise iteration | Policy gates, SARIF, packs, provenance, corpus, HTML reports, exception lifecycle audit, baseline drift Action/CLI surfaces, evidence-pack redaction, harness adapter registry, enterprise research roadmap, supply-chain hardened release path, CI-safe baseline fingerprints, corpus accuracy recommendations | PRs #53, #55-#64, #67-#69, and #78-#80 landed with test evidence; native PDF export deferred in favor of self-contained HTML plus print-to-PDF until explicit enterprise demand appears; `docs/architecture/agentshield-enterprise-research-roadmap.md` now has baseline drift, evidence-pack bundle, redaction, adapter-registry, supply-chain hardening, hashed baseline fingerprints, and corpus accuracy recommendation slices landed | Next remediation workflow depth or corpus expansion |
-| ECC Tools next-level app | Billing audit, PR checks, deep analyzer, sync backlog, evaluator/RAG corpus | PRs #26-#43 plus #53/#54 landed with test evidence, including AgentShield evidence-pack gap routing, canonical bundle recognition, supply-chain signature gates, and PR draft follow-up Linear tracking | Needs hosted/deeper analysis follow-up |
+| AgentShield enterprise iteration | Policy gates, SARIF, packs, provenance, corpus, HTML reports, exception lifecycle audit, baseline drift Action/CLI surfaces, evidence-pack redaction, harness adapter registry, enterprise research roadmap, supply-chain hardened release path, CI-safe baseline fingerprints, corpus accuracy recommendations, remediation workflow phases, env proxy hijack corpus coverage | PRs #53, #55-#64, #67-#69, and #78-#82 landed with test evidence; native PDF export deferred in favor of self-contained HTML plus print-to-PDF until explicit enterprise demand appears; `docs/architecture/agentshield-enterprise-research-roadmap.md` now has baseline drift, evidence-pack bundle, redaction, adapter-registry, supply-chain hardening, hashed baseline fingerprints, corpus accuracy recommendation, remediation workflow, and env proxy hijack corpus slices landed | Next hosted evidence-pack workflow depth |
+| ECC Tools next-level app | Billing audit, PR checks, deep analyzer, sync backlog, evaluator/RAG corpus, analysis-depth readiness, hosted execution planning, hosted CI diagnostics, hosted security evidence review, hosted harness compatibility audit, hosted reference-set evaluation, hosted AI routing/cost review, hosted team backlog routing, hosted depth-plan check-run, PR-comment hosted job dispatch, hosted job result history/check-runs, hosted result status command, status-aware depth-plan recommendations, hosted promotion readiness, hosted promotion output scoring, hosted promotion retrieval planning, hosted promotion judge contract, gated hosted promotion judge execution | PRs #26-#43 plus #53-#72 landed with test evidence, including AgentShield evidence-pack gap routing, canonical bundle recognition, supply-chain signature gates, PR draft follow-up Linear tracking, evidence-backed/deep-ready repository classification, the `/api/analysis/depth-plan` hosted job plan, `/api/analysis/jobs/ci-diagnostics`, `/api/analysis/jobs/security-evidence-review`, `/api/analysis/jobs/harness-compatibility-audit`, `/api/analysis/jobs/reference-set-evaluation`, `/api/analysis/jobs/ai-routing-cost-review`, `/api/analysis/jobs/team-backlog-routing`, the `ECC Tools / Hosted Depth Plan` check-run, `/ecc-tools analyze --job ...` PR-comment dispatch, non-blocking per-hosted-job result check-runs backed by 30-day result cache records, `/ecc-tools analyze --job status` cache lookup, cache-aware next-job recommendations in the depth-plan check-run, the `ECC Tools / Hosted Promotion Readiness` corpus-backed PR check-run, deterministic hosted-output scoring against cached completed job artifacts/findings, ranked retrieval/model-prompt planning, the fail-closed `hosted-promotion-judge.v1` request contract, and opt-in live model-judge execution behind hosted evidence, entitlement, budget, provider, executor, strict JSON, and citation gates | Next work is hosted promotion telemetry and operator review UX |
 | GitGuardian/Dependabot/CodeRabbit-style checks | Non-blocking taxonomy, deterministic follow-up checks, and local supply-chain gates | ECC-Tools risk taxonomy check plus follow-up signals landed, including Skill Quality, Deep Analyzer Evidence, Analyzer Corpus Evidence, RAG/Evaluator Evidence, PR Review/Salvage Evidence, and AgentShield evidence-pack evidence; #1846 added npm registry signature gates; #1848 added the supply-chain incident-response playbook and `pull_request_target` cache-poisoning validator guard; #1851 added the privileged checkout credential-persistence guard; AgentShield #78, JARVIS #13, and ECC-Tools #53 applied the same hardening outside trunk | Current supply-chain gate complete; deeper hosted review features remain future |
-| Harness-agnostic learning system | Audit, adapter matrix, observability, traces, promotion loop | Audit/adapters/observability gates plus `docs/architecture/evaluator-rag-prototype.md`, `examples/evaluator-rag-prototype/`, and ECC-Tools PR #40 define read-only stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison scenarios with trace, report, playbook, verifier, and predictive-check artifacts | Local corpus complete; hosted integration remains future |
-| Linear roadmap is detailed | Linear project status plus repo mirror | Repo mirror exists; issue creation was retried on 2026-05-12 and remains blocked by the workspace free issue limit; this May 13 sync adds ECC #1860, AgentShield #78/#79, JARVIS #13, ECC-Tools #53/#54, resolved queue/discussion counts, and Linear project status updates `59f630eb`/`c7ea6daf` | Needs recurring status updates after each merge batch |
+| Harness-agnostic learning system | Audit, adapter matrix, observability, traces, promotion loop | Audit/adapters/observability gates plus `docs/architecture/evaluator-rag-prototype.md`, `examples/evaluator-rag-prototype/`, and ECC-Tools PR #40 define read-only stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison scenarios with trace, report, playbook, verifier, and predictive-check artifacts; ECC-Tools PRs #68-#72 now turn that corpus into a deterministic PR check-run gate with cached hosted-output scoring, ranked retrieval candidates, a model prompt seed, a fail-closed hosted model-judge request contract, and opt-in live model execution behind strict hosted-evidence gates | Deterministic hosted PR check, cached output scoring, retrieval planning, judge contract, and gated model execution integrated |
+| Linear roadmap is detailed | Linear project status plus repo mirror | Repo mirror exists; issue creation was retried on 2026-05-12 and remains blocked by the workspace free issue limit; this May 13 sync adds ECC #1860, AgentShield #78-#82, JARVIS #13, ECC-Tools #53-#72, resolved queue/discussion counts, and notes that Linear connector status updates after ECC-Tools #68 remain blocked by a connector secret-owner error | Needs recurring status updates after connector recovery |
 | Flow separation and progress tracking | Flow lanes with owner artifacts and update cadence | This roadmap defines lanes below and `docs/architecture/progress-sync-contract.md` makes GitHub/Linear/handoff/roadmap sync part of the readiness gate | Active |
 | Realtime Linear sync | Project updates while issue limit is blocked; issues later | ECC-Tools #39 implements opt-in Linear API sync for deferred follow-up backlog items, and ECC-Tools #54 adds copy-ready PR drafts to that backlog when draft PR shells are not opened; `docs/architecture/progress-sync-contract.md` defines the local file-backed realtime boundary while issue capacity is blocked | Needs workspace capacity/config rollout |
 | Observability for self-use | Local readiness gate, traces, status snapshots, HUD/status contract, risk ledger, progress-sync contract | `npm run observability:ready` reports 21/21 | Complete for local gate |
@@ -330,9 +507,9 @@ repo evidence and merge commits.
 | Queue hygiene and salvage | GitHub PR/issue state, salvage ledger | Append ledger entries for any future stale closures | Every cleanup batch |
 | Release and publication | rc.1 release docs, publication readiness doc | Naming matrix and plugin submission/contact checklist | Before any tag |
 | Harness OS core | Audit, adapter matrix, observability docs, `ecc2/` | HUD/session-control acceptance spec | Weekly until GA |
-| Evaluation and RAG | Reference-set validation, harness audit, traces, ECC-Tools corpus | Read-only evaluator/RAG prototype plus stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison fixtures | Hosted retrieval/check-run automation plan |
+| Evaluation and RAG | Reference-set validation, harness audit, traces, ECC-Tools corpus | Read-only evaluator/RAG prototype plus stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison fixtures; ECC-Tools #68 publishes the corpus as a hosted promotion readiness check-run, #69 scores cached hosted job outputs against the same corpus, #70 emits ranked retrieval candidates plus a model prompt seed, #71 adds a fail-closed hosted model-judge request contract, and #72 executes that judge only when explicitly enabled and backed by hosted retrieval citations | Hosted promotion telemetry and operator review UX |
 | AgentShield enterprise | AgentShield PR evidence and roadmap notes | Remediation workflow depth or corpus expansion follow-up | Next implementation batch |
-| ECC Tools app | ECC-Tools PR evidence, billing audit, risk taxonomy, evaluator/RAG corpus | ECC-Tools #53 published the supply-chain workflow hardening branch and #54 tracks copy-ready PR drafts in the Linear/project backlog; next work is hosted/deeper analysis follow-up | Next implementation batch |
+| ECC Tools app | ECC-Tools PR evidence, billing audit, risk taxonomy, evaluator/RAG corpus | ECC-Tools #53 published the supply-chain workflow hardening branch, #54 tracks copy-ready PR drafts in the Linear/project backlog, #55 classifies analysis-depth readiness, #56 exposes the hosted execution plan, #57 executes the first hosted CI diagnostics job, #58 executes the hosted security evidence review job, #59 executes the hosted harness compatibility audit, #60 executes the hosted reference-set evaluation, #61 executes the hosted AI routing/cost review, #62 executes hosted team backlog routing, #63 publishes the hosted depth-plan check-run, #64 dispatches hosted jobs from PR comments, #65 persists hosted result history/check-runs, #66 exposes hosted job status from PR comments, #67 makes depth-plan recommendations cache-aware, #68 publishes hosted promotion readiness from the evaluator/RAG corpus, #69 scores cached hosted job outputs against that corpus, #70 emits ranked retrieval candidates plus a model prompt seed, #71 emits the gated `hosted-promotion-judge.v1` contract without live model calls, and #72 adds opt-in live model-judge execution behind hosted-evidence and strict JSON/citation gates | Next implementation batch |
 | Linear progress | Linear project status updates, `docs/architecture/progress-sync-contract.md`, and this mirror | Status update with queue/evidence/missing gates | Every significant merge batch |

 The project status update should always include:
@@ -545,16 +722,16 @@ Acceptance:
   supply-chain incident class; PR #79 moved baseline/watch/remediation
   fingerprints to hashed evidence and stopped writing raw evidence into new
   baselines; PR #80 added prioritized corpus accuracy recommendations for
-   failed regression gates; and ECC-Tools PRs #42/#43 now route and recognize
-   evidence packs. The next slice is remediation workflow depth or corpus
-   expansion.
-2. Keep ECC-Tools #53's supply-chain workflow gate and #54's PR-draft backlog
-   tracking in the recurring queue evidence, and use the org-scoped GitHub auth
-   path for future ECC-Tools maintenance while the narrow environment token
-   remains active.
+   failed regression gates; PR #81 added ordered remediation workflow phases;
+   PR #82 expanded corpus coverage for env proxy hijacks and out-of-band
+   exfiltration; and ECC-Tools PRs #42/#43 now route and recognize evidence
+   packs. The next slice is hosted evidence-pack workflow depth.
+2. Add hosted promotion telemetry and operator review UX on top of the #72
+   gated model execution path so live judgments can be audited before any
+   promotion policy becomes enforceable.
 3. Enable/configure the merged Linear backlog sync path after workspace issue
   capacity clears or the Linear workspace is upgraded, then verify PR-draft
   salvage items land in the expected project.
 4. Use the ECC-Tools evaluator/RAG corpus as the promotion gate before adding
-   hosted retrieval, vector storage, model-backed judging, or automated
+   hosted retrieval, vector storage, live model-backed judging, or automated
   check-run promotion.
--- a/scripts/hooks/suggest-compact.js
+++ b/scripts/hooks/suggest-compact.js
@@ -19,7 +19,8 @@ const {
  getTempDir,
  writeFile,
  readStdinJson,
-  log
+  log,
+  output
 } = require('../lib/utils');

 async function resolveSessionId() {
@@ -77,14 +78,25 @@ async function main() {
    writeFile(counterFile, String(count));
  }

-  // Suggest compact after threshold tool calls
+  // Suggest compact after threshold tool calls.
+  //
+  // log() writes to stderr (debug log). Per the Claude Code hooks guide,
+  // non-blocking PreToolUse stderr (exit 0) is only written to the debug log;
+  // it does not reach the model. To inject a user-facing suggestion without
+  // blocking the tool call, emit structured JSON to stdout with
+  // hookSpecificOutput.additionalContext — the documented mechanism for
+  // PreToolUse hooks to add context to the next model turn.
  if (count === threshold) {
-    log(`[StrategicCompact] ${threshold} tool calls reached - consider /compact if transitioning phases`);
+    const msg = `[StrategicCompact] ${threshold} tool calls reached - consider /compact if transitioning phases`;
+    log(msg);
+    output({ hookSpecificOutput: { hookEventName: 'PreToolUse', additionalContext: msg } });
  }

  // Suggest at regular intervals after threshold (every 25 calls from threshold)
  if (count > threshold && (count - threshold) % 25 === 0) {
-    log(`[StrategicCompact] ${count} tool calls - good checkpoint for /compact if context is stale`);
+    const msg = `[StrategicCompact] ${count} tool calls - good checkpoint for /compact if context is stale`;
+    log(msg);
+    output({ hookSpecificOutput: { hookEventName: 'PreToolUse', additionalContext: msg } });
  }

  process.exit(0);
--- a/tests/docs/evaluator-rag-prototype.test.js
+++ b/tests/docs/evaluator-rag-prototype.test.js
@@ -130,12 +130,12 @@ test('candidate playbook preserves stale-salvage operating rules', () => {
  }
 });

-test('roadmap points to the evaluator RAG prototype and keeps hosted integration open', () => {
+test('roadmap points to the evaluator RAG prototype and hosted PR check', () => {
  const roadmap = read('docs/ECC-2.0-GA-ROADMAP.md');

  assert.ok(roadmap.includes('docs/architecture/evaluator-rag-prototype.md'));
  assert.ok(roadmap.includes('examples/evaluator-rag-prototype/'));
-  assert.ok(roadmap.includes('Local corpus complete; hosted integration remains future'));
+  assert.ok(roadmap.includes('Deterministic hosted PR check, cached output scoring, retrieval planning, judge contract, and gated model execution integrated'));
 });

 test('billing readiness scenario rejects launch copy overclaims', () => {
--- a/tests/hooks/suggest-compact.test.js
+++ b/tests/hooks/suggest-compact.test.js
@@ -366,6 +366,66 @@ function runTests() {
  })) passed++;
  else failed++;

+  // ── hookSpecificOutput JSON on stdout ──
+  // Claude Code 2.1+ only debug-logs non-blocking PreToolUse stderr; the suggestion
+  // has to ride on stdout as { hookSpecificOutput: { additionalContext } } to reach
+  // the model. These tests pin that contract.
+  console.log('\nhookSpecificOutput stdout JSON:');
+
+  if (test('emits hookSpecificOutput.additionalContext on stdout at threshold', () => {
+    const { sessionId, counterFile, cleanup } = createCounterContext();
+    cleanup();
+    fs.writeFileSync(counterFile, '49');
+    const result = runCompact({ CLAUDE_SESSION_ID: sessionId });
+    assert.strictEqual(result.code, 0, 'Should exit 0');
+    assert.ok(result.stdout.trim().length > 0, `Expected stdout payload at threshold. Got: "${result.stdout}"`);
+    const parsed = JSON.parse(result.stdout);
+    assert.strictEqual(parsed.hookSpecificOutput.hookEventName, 'PreToolUse',
+      `hookEventName should be PreToolUse. Got: ${JSON.stringify(parsed)}`);
+    assert.ok(parsed.hookSpecificOutput.additionalContext.includes('50 tool calls reached'),
+      `additionalContext should include threshold text. Got: ${parsed.hookSpecificOutput.additionalContext}`);
+    cleanup();
+  })) passed++;
+  else failed++;
+
+  if (test('emits hookSpecificOutput.additionalContext on stdout at +25 interval', () => {
+    const { sessionId, counterFile, cleanup } = createCounterContext();
+    cleanup();
+    // threshold=3, set counter to 27 → next run = 28 → 28-3=25 → interval hit
+    fs.writeFileSync(counterFile, '27');
+    const result = runCompact({ CLAUDE_SESSION_ID: sessionId, COMPACT_THRESHOLD: '3' });
+    assert.strictEqual(result.code, 0, 'Should exit 0');
+    assert.ok(result.stdout.trim().length > 0, `Expected stdout payload at interval. Got: "${result.stdout}"`);
+    const parsed = JSON.parse(result.stdout);
+    assert.strictEqual(parsed.hookSpecificOutput.hookEventName, 'PreToolUse');
+    assert.ok(parsed.hookSpecificOutput.additionalContext.includes('28 tool calls'),
+      `additionalContext should include count. Got: ${parsed.hookSpecificOutput.additionalContext}`);
+    cleanup();
+  })) passed++;
+  else failed++;
+
+  if (test('emits no stdout below threshold (silent)', () => {
+    const { sessionId, cleanup } = createCounterContext();
+    cleanup();
+    const result = runCompact({ CLAUDE_SESSION_ID: sessionId, COMPACT_THRESHOLD: '5' });
+    assert.strictEqual(result.code, 0);
+    assert.strictEqual(result.stdout.trim(), '',
+      `Expected empty stdout below threshold. Got: "${result.stdout}"`);
+    cleanup();
+  })) passed++;
+  else failed++;
+
+  if (test('still writes [StrategicCompact] to stderr (debug log retained)', () => {
+    const { sessionId, counterFile, cleanup } = createCounterContext();
+    cleanup();
+    fs.writeFileSync(counterFile, '49');
+    const result = runCompact({ CLAUDE_SESSION_ID: sessionId });
+    assert.ok(result.stderr.includes('[StrategicCompact]'),
+      `stderr should retain [StrategicCompact] for debug log capture. Got: "${result.stderr}"`);
+    cleanup();
+  })) passed++;
+  else failed++;
+
  // ── Round 64: default session ID fallback ──
  console.log('\nDefault session ID fallback (Round 64):');
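
The decision logic and payload shape exercised by the diff above can be sketched as a pair of pure functions. This is a minimal illustration, not the shipped script: the helper names `suggestionFor`, `toHookOutput`, and `render` are hypothetical, and only the message text, the 25-call interval, and the `hookSpecificOutput` shape are taken from the change itself.

```javascript
// Hypothetical helpers for illustration; suggest-compact.js inlines this logic
// and uses its own log()/output() utilities instead.
const INTERVAL = 25; // interval size used by the hook after the threshold

// Returns the suggestion text for this tool-call count, or null when the hook
// should stay silent.
function suggestionFor(count, threshold) {
  if (count === threshold) {
    return `[StrategicCompact] ${threshold} tool calls reached - consider /compact if transitioning phases`;
  }
  if (count > threshold && (count - threshold) % INTERVAL === 0) {
    return `[StrategicCompact] ${count} tool calls - good checkpoint for /compact if context is stale`;
  }
  return null;
}

// Wraps a message in the structured payload a PreToolUse hook prints on stdout.
function toHookOutput(msg) {
  return { hookSpecificOutput: { hookEventName: 'PreToolUse', additionalContext: msg } };
}

// Produces the exact stdout string: JSON when a suggestion applies, '' otherwise.
function render(count, threshold) {
  const msg = suggestionFor(count, threshold);
  return msg === null ? '' : JSON.stringify(toHookOutput(msg));
}

console.log(render(50, 50)); // threshold hit
console.log(render(28, 3));  // 28 - 3 = 25, interval hit
console.log(JSON.stringify(render(4, 5))); // below threshold: empty string
```

Keeping the decision separate from the I/O means the threshold and interval cases can be checked without spawning the hook process, while the spawned tests above still pin the real stdout and stderr contract.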
Author	SHA1	Message	Date
richm-spp	c802c33abc	fix(hooks): emit suggest-compact via hookSpecificOutput stdout	2026-05-15 09:48:28 +10:00
Affaan Mustafa	0e66c838c7	docs: sync ECC Tools judge execution (#1901)	2026-05-14 17:38:03 -04:00
Affaan Mustafa	cb9702ca99	docs: sync ECC Tools judge contract (#1900)	2026-05-14 17:15:54 -04:00
Affaan Mustafa	f9384427b8	docs: sync ECC Tools retrieval planning (#1892)	2026-05-14 16:54:30 -04:00
Affaan Mustafa	4423f10cfb	docs: sync ECC Tools hosted output scoring (#1891)	2026-05-13 23:02:23 -04:00
Affaan Mustafa	3b12fb273f	docs: sync ECC Tools hosted promotion readiness (#1890)	2026-05-13 22:39:01 -04:00
Affaan Mustafa	4fb80d8861	Sync ECC Tools status-aware depth plan roadmap (#1887)	2026-05-13 22:12:11 -04:00
Affaan Mustafa	a27831c13e	Sync ECC Tools hosted status roadmap (#1886)	2026-05-13 21:49:42 -04:00
Affaan Mustafa	b24d762caa	Sync ECC Tools hosted result history roadmap (#1885)	2026-05-13 21:31:08 -04:00
Affaan Mustafa	f94478e524	docs: sync roadmap after ECC-Tools hosted dispatch	2026-05-13 20:30:48 -04:00
Affaan Mustafa	6cdac19764	docs: sync roadmap after ECC-Tools depth-plan check	2026-05-13 20:10:38 -04:00
Affaan Mustafa	af3a206412	docs: sync roadmap after ECC-Tools team backlog job (#1880)	2026-05-13 19:44:49 -04:00
Affaan Mustafa	20f00c1410	docs: sync roadmap after ECC-Tools AI cost job (#1878)	2026-05-13 19:26:48 -04:00
Affaan Mustafa	e7a6f137e5	docs: sync roadmap after ECC-Tools reference-set job (#1877)	2026-05-13 19:09:35 -04:00
Affaan Mustafa	7596502092	docs: sync roadmap after ECC-Tools harness job (#1876)	2026-05-13 18:50:45 -04:00
Affaan Mustafa	c04baa8c25	docs: sync roadmap after ECC-Tools security evidence job (#1875)	2026-05-13 18:32:06 -04:00
Affaan Mustafa	9082bdedac	docs: sync roadmap after ECC-Tools CI diagnostics (#1874)	2026-05-13 18:12:31 -04:00
Affaan Mustafa	3243a1c5d3	docs: sync roadmap after ECC-Tools hosted planning (#1872)	2026-05-13 12:48:50 -04:00
Affaan Mustafa	69401b28b3	docs: sync roadmap after ECC-Tools depth readiness (#1871)	2026-05-13 12:26:32 -04:00
Affaan Mustafa	9a5ed3223a	docs: sync roadmap after AgentShield corpus expansion Records AgentShield PR #82 and moves the next AgentShield roadmap slice to hosted evidence-pack workflow depth.	2026-05-13 09:04:34 -04:00
Affaan Mustafa	d844bd6bfc	docs: sync roadmap after AgentShield remediation workflows Records AgentShield PR #81 and advances the next AgentShield roadmap slice after remediation workflow phases landed.	2026-05-13 08:46:07 -04:00