Compare commits

...

21 Commits

Author SHA1 Message Date
Jamkris
85d33748e0 test(hooks): regression coverage for round 1 review fixes
9 new cases locking in the behavior added by the previous two
commits. Each was verified to fail before the fix and pass after.

Greptile — quote-aware depth counting:
  - blocks $(echo ")"; (npm run dev))
  - blocks (echo ")"; npm run dev)
  - allows $(echo "(npm run dev)") — () inside double-quoted body is literal

Greptile — brace groups:
  - blocks { npm run dev; }
  - blocks echo hi && { npm run dev; }
  - allows {npm run dev} — bash brace-group syntax requires a space after {

CodeRabbit — missing package-manager variants:
  - blocks yarn run dev (yarn 1.x convention)
  - blocks bun dev (bun bare form)

CodeRabbit nitpick — symmetric quote test:
  - blocks echo "$(npm run dev)" — double-quoted substitution still substitutes

The `{npm run dev}` allow case is intentional: bash treats `{` as
a reserved word only when followed by whitespace. The pre-fix code
already passed this through, but until now we never asserted it,
so a future change to brace handling could silently start blocking
literal `{npm` tokens.
2026-05-14 12:24:45 +09:00
Jamkris
e2eaf4ac2f fix(hooks): cover brace groups + yarn-run/bun-bare dev variants
Two false-negatives surfaced in PR #1889 review:

1. Brace-group bypass (Greptile).
   `{ npm run dev; }` evaluates the dev command in the *current*
   shell — semantically distinct from `( ... )` but with the same
   effect for this hook. `splitShellSegments` correctly cleaves the
   group at `;` into `["{ npm run dev", "}"]`, but the first segment's
   leading token under `readToken` is the bare `{`, which was not in
   `DEV_COMMAND_WORDS`, so the dev-pattern check was skipped.

   Fix: treat `{` and `}` as no-op tokens in `getLeadingCommandWord`
   so we keep walking to the real command word. Matches how shell
   itself parses brace groups (the braces are reserved words, not
   commands). Bash requires a space after `{` and a terminator before
   `}` for an actual group, so `{npm run dev}` correctly remains
   allowed (single token `{npm`, not in `DEV_COMMAND_WORDS`).

2. Missing yarn-run / bun-bare variants (CodeRabbit).
   Both `yarn dev` *and* `yarn run dev` are valid (the latter is what
   `package.json` actually wires `dev` to under yarn 1.x). The same
   `(run )?` symmetry applies to bun. The previous `DEV_PATTERN` only
   matched `yarn\s+dev` and `bun\s+run\s+dev`, allowing the cross
   forms to pass through silently.

   Fix: `yarn(?:\s+run)?\s+dev` and `bun(?:\s+run)?\s+dev` — same
   shape `pnpm(?:\s+run)?\s+dev` was already using.

Verified after this commit (every form now exits 2):

  { npm run dev; }
  { npm run dev ; }
  echo hi && { npm run dev; }
  ({ npm run dev; })
  $( { npm run dev; } )
  yarn run dev
  bun dev

Verified still allowed (no regression):

  echo "{ npm run dev; }"   # literal inside double quotes
  {npm run dev}             # not a brace group per bash syntax
2026-05-14 12:23:55 +09:00
Jamkris
70b86d81c4 fix(lib): track quote state inside command-substitution depth counters
Greptile flagged a bypass in PR #1889: `$(echo ")"; (npm run dev))`
threaded the depth-counting loops in `extractCommandSubstitutions`
and `extractSubshellGroups` to terminate early, because a literal `)`
inside double quotes was treated as a real closing paren. The
truncated body then ended in a dangling `"` that toggled `inDouble`
in the outer scan, masking the subsequent `(npm run dev)` group from
extraction.

Reproduced (before this commit) by piping the synthetic PreToolUse
payload `{"tool_input":{"command":"$(echo \")\"; (npm run dev))"}}`
into `scripts/hooks/pre-bash-dev-server-block.js` and observing
exit 0 (allow) where the dev pattern is clearly present.

Fix: each `$(...)` and `(...)` body loop now tracks its own
single/double quote state and only treats `(` / `)` as depth
delimiters when outside quotes. The quoted `)` no longer closes
the group early, the body now extends to the real closing paren,
and the outer scan's quote state remains untouched.

After this commit:
  $ echo '{"tool_input":{"command":"$(echo \")\"; (npm run dev))"}}' \
      | node scripts/hooks/pre-bash-dev-server-block.js; echo $?
  2

The symmetric form `$(echo "(npm run dev)")` correctly remains
allowed (bash does not honor `(...)` inside double quotes).
2026-05-14 12:22:22 +09:00
Jamkris
4f4654bf21 test(hooks): regression coverage for dev-server-block subshell bypass
Lock in the behavior added by the previous commit. Each new case was
verified to fail before the fix and pass after.

Bypasses now blocked (exit 2):
- \$(npm run dev)              command substitution
- \`npm run dev\`              backtick substitution
- echo \$(npm run dev)         substitution inside an argument
- (npm run dev)               plain subshell group
- \$(echo a; npm run dev)      substitution containing a sequenced segment
- (pnpm dev)                  plain subshell group, alt package manager

Allow cases — explicitly proven NOT to regress so the fix doesn't
over-block legitimate uses:
- (tmux new-session -d -s dev "npm run dev")   tmux launcher inside ()
- git commit -m '(npm run dev)'                literal in single quotes
- echo "(npm run dev)"                         literal in double quotes
  (bash does NOT subshell () inside double quotes)
- git commit -m '\$(npm run dev) fix'          literal in single quotes

Single- and double-quote allow cases are important: they distinguish a
real subshell construct from one that's just text inside a string,
which is what `extractSubshellGroups` / `extractCommandSubstitutions`
quote-awareness is for.
2026-05-14 11:22:44 +09:00
Jamkris
a7e51e8046 fix(hooks): close subshell bypass in pre-bash-dev-server-block
Before this commit the dev-server-block hook ran the leading-command
and dev-pattern check only against the top-level segments returned by
`splitShellSegments`, which doesn't split on `$(...)`, backticks, or
plain `(...)`. That left the policy bypassable by wrapping a dev
command in any of those constructs:

  $(npm run dev)
  `npm run dev`
  echo $(npm run dev)
  (npm run dev)

Each verified by piping a synthetic PreToolUse payload into the hook
on this branch: every form above returned exit 0 (allow) where a plain
`npm run dev` correctly returned exit 2 (block).

Fix: expand the check space before running the leading-command rule.
A small BFS walks the raw command, harvesting bodies from
`extractCommandSubstitutions` (`$(...)` and backticks) and from
`extractSubshellGroups` (plain `(...)`), then splits each harvested
body through `splitShellSegments` and feeds the result into the
existing `isBlockedDevSegment` check.

This preserves every existing allow case (`tmux new-session -d -s dev
"npm run dev"`, quoted-string mentions like `git commit -m "npm run
dev fix"`, `echo hi`) because the leading-command rule is unchanged —
only the set of segments it runs against grew.

Known limitation, not fixed here: `eval "$(echo npm run dev)"` still
slips through because the substitution body's leading command is
`echo`, and statically modeling echo's output to recover the executed
command is out of scope. The same class affects `gateguard-fact-force`
(via `eval "$(echo rm -rf /)"` etc.) and is best addressed in both
hooks together as a follow-up rather than as a one-off here.
2026-05-14 11:21:41 +09:00
Jamkris
04dc03c3af feat(lib): add extractSubshellGroups for plain (...) subshells
`extractCommandSubstitutions` only walks `$(...)` and backticks — the two
shell constructs whose bodies are captured as strings. Bash also has
plain `(...)` subshells (e.g. `(npm run dev)`), where the body executes
in a child shell but is not value-captured. Our PreToolUse hooks need
to peer inside those too, because a `(...)` group bypasses the
top-level segment splitter just like `$(...)` does.

This commit adds a sibling extractor with the same conventions as
`extractCommandSubstitutions`:

- single quotes literal — `'(npm run dev)'` is a string, ignored
- double quotes literal for parens — `"(npm run dev)"` is a string
  (bash only honors `$(...)`, not bare `(...)`, inside double quotes)
- skips `$(...)` and backtick spans so we don't double-extract
  bodies the other helper already handles
- recurses into its own bodies for nested groups

No consumer yet; the next commit wires both extractors into
`scripts/hooks/pre-bash-dev-server-block.js` to close the subshell
bypass surface.
2026-05-14 11:17:46 +09:00
Jamkris
0a380c3e85 feat(lib): extract shell command-substitution parser to shared lib
Extract the `extractCommandSubstitutions` function originally
introduced in scripts/hooks/gateguard-fact-force.js (PR #1853
round 2) into scripts/lib/shell-substitution.js so other PreToolUse
hooks can reuse the same single-quote-aware, double-quote-aware,
nested-subshell-aware parser without duplicating it.

No behavior change in this commit — the function body is copied
verbatim and exposed via `module.exports`. The next commit wires it
into scripts/hooks/pre-bash-dev-server-block.js to close that hook's
own subshell-bypass holes.

gateguard-fact-force.js still defines its own private copy of the
function; consolidating both call sites onto this shared lib is a
follow-up worth doing once this PR lands, but is intentionally out
of scope here to keep the diff focused on the dev-server-block fix.
2026-05-14 11:10:40 +09:00
Affaan Mustafa
a27831c13e Sync ECC Tools hosted status roadmap (#1886) 2026-05-13 21:49:42 -04:00
Affaan Mustafa
b24d762caa Sync ECC Tools hosted result history roadmap (#1885) 2026-05-13 21:31:08 -04:00
Affaan Mustafa
f94478e524 docs: sync roadmap after ECC-Tools hosted dispatch 2026-05-13 20:30:48 -04:00
Affaan Mustafa
6cdac19764 docs: sync roadmap after ECC-Tools depth-plan check 2026-05-13 20:10:38 -04:00
Affaan Mustafa
af3a206412 docs: sync roadmap after ECC-Tools team backlog job (#1880) 2026-05-13 19:44:49 -04:00
Affaan Mustafa
20f00c1410 docs: sync roadmap after ECC-Tools AI cost job (#1878) 2026-05-13 19:26:48 -04:00
Affaan Mustafa
e7a6f137e5 docs: sync roadmap after ECC-Tools reference-set job (#1877) 2026-05-13 19:09:35 -04:00
Affaan Mustafa
7596502092 docs: sync roadmap after ECC-Tools harness job (#1876) 2026-05-13 18:50:45 -04:00
Affaan Mustafa
c04baa8c25 docs: sync roadmap after ECC-Tools security evidence job (#1875) 2026-05-13 18:32:06 -04:00
Affaan Mustafa
9082bdedac docs: sync roadmap after ECC-Tools CI diagnostics (#1874) 2026-05-13 18:12:31 -04:00
Affaan Mustafa
3243a1c5d3 docs: sync roadmap after ECC-Tools hosted planning (#1872) 2026-05-13 12:48:50 -04:00
Affaan Mustafa
69401b28b3 docs: sync roadmap after ECC-Tools depth readiness (#1871) 2026-05-13 12:26:32 -04:00
Affaan Mustafa
9a5ed3223a docs: sync roadmap after AgentShield corpus expansion
Records AgentShield PR #82 and moves the next AgentShield roadmap slice to hosted evidence-pack workflow depth.
2026-05-13 09:04:34 -04:00
Affaan Mustafa
d844bd6bfc docs: sync roadmap after AgentShield remediation workflows
Records AgentShield PR #81 and advances the next AgentShield roadmap slice after remediation workflow phases landed.
2026-05-13 08:46:07 -04:00
4 changed files with 541 additions and 22 deletions

View File

@@ -61,6 +61,14 @@ As of 2026-05-13:
and added prioritized corpus accuracy recommendations to failed corpus gates,
mapping misses by category, missing rule, and config ID so enterprise
scanner-regression work has an actionable improvement plan.
- AgentShield PR #81 merged as `6583884e74ba2e896942113e1ce3146230e6fb76`
and added ordered remediation workflow phases to remediation plans, routing
safe auto-fixes, manual review, and verification through stable finding
fingerprints without copying raw evidence.
- AgentShield PR #82 merged as `51336ba074ad5e9fed2c0aa3237422be22147e76`
and expanded the built-in attack corpus with an env proxy hijack scenario
covering proxy/runtime mutation, env-token exfiltration, DNS exfiltration,
credential-store access, and clipboard access.
- JARVIS PR #13 merged as `127efabbfb5033ae53d7a53e1546aa3c33d6f962`
and hardened CI/deploy workflows with npm registry signature verification,
disabled persisted checkout credentials in write-permission jobs, and pinned
@@ -72,6 +80,79 @@ As of 2026-05-13:
and made `/ecc-tools followups sync-linear` track copy-ready PR drafts in
the Linear/project backlog when `open-pr-drafts` is not used, preserving
useful stale-PR salvage work without opening extra PR shells.
- ECC-Tools PR #55 merged as `5d8c112cce4794cfa089d5b0ea661ba87a178be1`
and added analysis-depth readiness to `/ecc-tools analyze` comments,
separating commit-history-only repos from evidence-backed and deep-ready repos
using CI/CD, security, harness, reference/eval, AI routing/cost-control, and
team handoff evidence.
- ECC-Tools PR #56 merged as `5b729c88641eafe80f65364bab3fc74d0270f57b`
and added the authenticated `/api/analysis/depth-plan` contract that maps
analysis-depth readiness into concrete hosted jobs for CI diagnostics,
security evidence review, harness compatibility, reference-set evaluation,
AI routing/cost review, and team backlog routing.
- ECC-Tools PR #57 merged as `4cc61112a4cc9feec7b07af09321f360e34af6a4`
and added the first executable hosted analysis job:
`/api/analysis/jobs/ci-diagnostics` now gates on CI/CD readiness, inspects
workflow/test-runner/failure-evidence artifacts, returns CI hardening
findings and next actions, and charges usage only after successful execution.
- ECC-Tools PR #58 merged as `ce09dd8d9b46f65c6b88dc4f48cfb6b6227ae0bf`
and added the second executable hosted analysis job:
`/api/analysis/jobs/security-evidence-review` now gates on security-evidence
readiness, inspects capped AgentShield evidence-pack, policy, baseline,
SBOM, SARIF, and security-scan artifacts, returns supply-chain evidence
findings and next actions, and charges usage only after successful execution.
- ECC-Tools PR #59 merged as `505b372dbd8f75f996d9e2ed079effd30cec5ba5`
and added the third executable hosted analysis job:
`/api/analysis/jobs/harness-compatibility-audit` now gates on harness-config
readiness, inspects capped Claude, Codex, OpenCode, MCP, plugin, and
cross-harness documentation artifacts, excludes local secret-bearing config
paths from fetches, returns portability findings and next actions, and
charges usage only after successful execution.
- ECC-Tools PR #60 merged as `b75e0a49ba5672b1ec9a2a4880ddcfa2d07dc557`
and added the fourth executable hosted analysis job:
`/api/analysis/jobs/reference-set-evaluation` now gates on reference-evidence
readiness, evaluates analyzer corpus, RAG/evaluator, PR salvage/review,
harness, security, and CI failure-mode evidence, excludes obvious
secret-bearing fixture paths from fetches, returns reference coverage
findings and next actions, and charges usage only after successful execution.
- ECC-Tools PR #61 merged as `7b01b67cae0b80774b311cb515b7eca0aa038c65`
and added the fifth executable hosted analysis job:
`/api/analysis/jobs/ai-routing-cost-review` now gates on AI routing/cost
readiness, evaluates model routing, token budget, usage-limit, rate-limit,
billing/entitlement, cost-regression, and cost-policy evidence, excludes
obvious secret-bearing paths from fetches, returns cost-control findings and
next actions, and charges usage only after successful execution.
- ECC-Tools PR #62 merged as `781d6733e56f7556edb43fb96bdfb00b1f0a3aa6`
and added the sixth executable hosted analysis job:
`/api/analysis/jobs/team-backlog-routing` now gates on team handoff/project
tracking readiness, evaluates roadmap, runbook, handoff, release-plan,
issue-template, ownership, project-tracker, backlog, and follow-up evidence,
excludes obvious secret-bearing paths from fetches, returns team-routing
findings and next actions, and charges usage only after successful execution.
- ECC-Tools PR #63 merged as `fb9e4c5ceb9ccde50da74c7a69c3fa4bd321fc07`
and made the hosted execution plan operator-visible on queued PR analysis:
the queue now publishes a non-blocking `ECC Tools / Hosted Depth Plan`
check-run on the PR head SHA with ready/blocked hosted executor commands
and next action text, while keeping check-run publication best-effort so
bundle generation and analysis comments are not blocked.
- ECC-Tools PR #64 merged as `72020ef94db94840812977ea7ac37e9344036668`
and added PR-facing hosted job dispatch controls:
`/ecc-tools analyze --job ...` comments now queue hosted jobs against the
PR head SHA, execute them through the existing hosted readiness/evidence
gates, post artifacts/findings/next actions back to the PR, and scope
idempotency keys by job id so hosted jobs do not collide with bundle
analysis.
- ECC-Tools PR #65 merged as `bacd4adf6a3a629e8d403865456d15f127baaf4e`
and added hosted job result history/check-run summaries:
queued hosted jobs now cache both the latest result and immutable run records
for completed or blocked runs, then publish a non-blocking per-job check-run
on the PR head SHA with artifacts, findings, readiness blockers, and next
actions.
- ECC-Tools PR #66 merged as `4e1db48252d068ea5dcf4308b0bc11b0dfe0c9ce`
and added a read-only hosted status command:
`/ecc-tools analyze --job status` now reads the #65 latest-result cache for
the current PR head and posts a compact completed/blocked/not-run table with
the next hosted job command, without queueing work or billing usage.
- Handoff `ecc-supply-chain-audit-20260513-0645.md` under
`~/.cluster-swarm/handoffs/`
records the May 13 supply-chain sweep: no active lockfile/manifest hit for
@@ -255,6 +336,56 @@ As of 2026-05-13:
artifact contract so canonical bundle files now satisfy the taxonomy and
generated follow-up PRs point maintainers at
`agentshield scan --evidence-pack <dir>`.
- ECC-Tools PR #55 added the first hosted/deeper-analysis readiness signal:
analysis comments now classify a repo as commit-history-only,
evidence-backed, or deep-ready before routing work into CI, AgentShield,
harness, reference-set, RAG/evaluator, AI-routing, cost-control, and
Linear/project-tracking lanes.
- ECC-Tools PR #56 turned that signal into a hosted execution-plan contract:
`/api/analysis/depth-plan` returns ready/blocked jobs and next action text
without charging analysis usage or creating bundle PRs.
- ECC-Tools PR #57 implemented the first job-specific hosted executor:
`/api/analysis/jobs/ci-diagnostics` reuses the depth-readiness gate, internal
API auth, installation ownership, repo-access billing checks, capped workflow
file reads, and usage accounting to return concrete CI hardening findings.
- ECC-Tools PR #58 implemented the second job-specific hosted executor:
`/api/analysis/jobs/security-evidence-review` applies the same hosted gates
to AgentShield evidence-pack, policy, baseline, SBOM, SARIF, and security
scanner artifacts.
- ECC-Tools PR #59 implemented the third job-specific hosted executor:
`/api/analysis/jobs/harness-compatibility-audit` applies the same hosted
gates to Claude, Codex, OpenCode, MCP, plugin, and cross-harness evidence
while avoiding local secret-bearing harness config fetches.
- ECC-Tools PR #60 implemented the fourth job-specific hosted executor:
`/api/analysis/jobs/reference-set-evaluation` applies the same hosted gates
to analyzer corpus, RAG/evaluator, PR salvage, harness, security, and CI
failure-mode reference evidence while avoiding obvious secret-bearing fixture
fetches.
- ECC-Tools PR #61 implemented the fifth job-specific hosted executor:
`/api/analysis/jobs/ai-routing-cost-review` applies the same hosted gates to
model-routing, token-budget, usage-limit, rate-limit, billing/entitlement,
cost-regression, and cost-policy evidence while avoiding obvious
secret-bearing path fetches.
- ECC-Tools PR #62 implemented the sixth job-specific hosted executor:
`/api/analysis/jobs/team-backlog-routing` applies the same hosted gates to
roadmap, runbook, handoff, release-plan, issue-template, ownership,
project-tracker, backlog, and follow-up evidence while avoiding obvious
secret-bearing path fetches.
- ECC-Tools PR #63 publishes the hosted depth-plan check-run after queued PR
analysis completes, making the six hosted executor commands visible on the
PR head SHA without turning the check into a merge blocker.
- ECC-Tools PR #64 wires those commands into the queue: maintainers can comment
`/ecc-tools analyze --job ci-diagnostics`, `security-evidence`,
`harness-compatibility`, `reference-set-evaluation`, `ai-routing-cost`, or
`team-backlog` on a PR and receive hosted job results in a PR comment.
- ECC-Tools PR #65 persists completed and blocked hosted job results to the
analysis cache for 30 days and publishes non-blocking `ECC Tools / Hosted
Job: ...` check-runs so maintainers can scan hosted outcomes from the PR
checks surface instead of rereading older comments.
- ECC-Tools PR #66 exposes the cached results from PR comments with
`/ecc-tools analyze --job status`, summarizing completed, blocked, and
not-yet-run hosted jobs for the PR head and recommending the next hosted job
command.
- ECC PR #1803 landed the contributor Quarkus handling branch after maintainer
cleanup, current-`main` alignment, full local validation, and preservation of
the author's removal of incomplete ja-JP and zh-CN Quarkus translations.
@@ -307,11 +438,11 @@ is not complete unless the evidence column exists and has been freshly verified.
| Naming and rename readiness | Naming matrix across package/plugin/docs/social surfaces | `docs/releases/2.0.0-rc.1/naming-and-publication-matrix.md` records current package, repo, Claude plugin, Codex plugin, OpenCode, and npm availability evidence | Complete for rc.1; post-rc rename remains future work |
| Claude and Codex plugin publication | Contact/submission path with required artifacts and status | Publication readiness, naming matrix, and May 12 dry-run evidence document plugin validation, clean-checkout Claude tag/install smoke, and Codex marketplace CLI shape | Needs explicit approval for real tag/push and marketplace submission |
| Articles, tweets, and announcements | X thread, LinkedIn copy, GitHub release copy, push checklist | Draft launch collateral exists under rc.1 release docs | Needs URL-backed refresh |
| AgentShield enterprise iteration | Policy gates, SARIF, packs, provenance, corpus, HTML reports, exception lifecycle audit, baseline drift Action/CLI surfaces, evidence-pack redaction, harness adapter registry, enterprise research roadmap, supply-chain hardened release path, CI-safe baseline fingerprints, corpus accuracy recommendations | PRs #53, #55-#64, #67-#69, and #78-#80 landed with test evidence; native PDF export deferred in favor of self-contained HTML plus print-to-PDF until explicit enterprise demand appears; `docs/architecture/agentshield-enterprise-research-roadmap.md` now has baseline drift, evidence-pack bundle, redaction, adapter-registry, supply-chain hardening, hashed baseline fingerprints, and corpus accuracy recommendation slices landed | Next remediation workflow depth or corpus expansion |
| ECC Tools next-level app | Billing audit, PR checks, deep analyzer, sync backlog, evaluator/RAG corpus | PRs #26-#43 plus #53/#54 landed with test evidence, including AgentShield evidence-pack gap routing, canonical bundle recognition, supply-chain signature gates, and PR draft follow-up Linear tracking | Needs hosted/deeper analysis follow-up |
| AgentShield enterprise iteration | Policy gates, SARIF, packs, provenance, corpus, HTML reports, exception lifecycle audit, baseline drift Action/CLI surfaces, evidence-pack redaction, harness adapter registry, enterprise research roadmap, supply-chain hardened release path, CI-safe baseline fingerprints, corpus accuracy recommendations, remediation workflow phases, env proxy hijack corpus coverage | PRs #53, #55-#64, #67-#69, and #78-#82 landed with test evidence; native PDF export deferred in favor of self-contained HTML plus print-to-PDF until explicit enterprise demand appears; `docs/architecture/agentshield-enterprise-research-roadmap.md` now has baseline drift, evidence-pack bundle, redaction, adapter-registry, supply-chain hardening, hashed baseline fingerprints, corpus accuracy recommendation, remediation workflow, and env proxy hijack corpus slices landed | Next hosted evidence-pack workflow depth |
| ECC Tools next-level app | Billing audit, PR checks, deep analyzer, sync backlog, evaluator/RAG corpus, analysis-depth readiness, hosted execution planning, hosted CI diagnostics, hosted security evidence review, hosted harness compatibility audit, hosted reference-set evaluation, hosted AI routing/cost review, hosted team backlog routing, hosted depth-plan check-run, PR-comment hosted job dispatch, hosted job result history/check-runs, hosted result status command | PRs #26-#43 plus #53-#66 landed with test evidence, including AgentShield evidence-pack gap routing, canonical bundle recognition, supply-chain signature gates, PR draft follow-up Linear tracking, evidence-backed/deep-ready repository classification, the `/api/analysis/depth-plan` hosted job plan, `/api/analysis/jobs/ci-diagnostics`, `/api/analysis/jobs/security-evidence-review`, `/api/analysis/jobs/harness-compatibility-audit`, `/api/analysis/jobs/reference-set-evaluation`, `/api/analysis/jobs/ai-routing-cost-review`, `/api/analysis/jobs/team-backlog-routing`, the `ECC Tools / Hosted Depth Plan` check-run, `/ecc-tools analyze --job ...` PR-comment dispatch, non-blocking per-hosted-job result check-runs backed by 30-day result cache records, and `/ecc-tools analyze --job status` cache lookup | Next work is evaluator-backed hosted promotion and status-aware depth-plan recommendations |
| GitGuardian/Dependabot/CodeRabbit-style checks | Non-blocking taxonomy, deterministic follow-up checks, and local supply-chain gates | ECC-Tools risk taxonomy check plus follow-up signals landed, including Skill Quality, Deep Analyzer Evidence, Analyzer Corpus Evidence, RAG/Evaluator Evidence, PR Review/Salvage Evidence, and AgentShield evidence-pack evidence; #1846 added npm registry signature gates; #1848 added the supply-chain incident-response playbook and `pull_request_target` cache-poisoning validator guard; #1851 added the privileged checkout credential-persistence guard; AgentShield #78, JARVIS #13, and ECC-Tools #53 applied the same hardening outside trunk | Current supply-chain gate complete; deeper hosted review features remain future |
| Harness-agnostic learning system | Audit, adapter matrix, observability, traces, promotion loop | Audit/adapters/observability gates plus `docs/architecture/evaluator-rag-prototype.md`, `examples/evaluator-rag-prototype/`, and ECC-Tools PR #40 define read-only stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison scenarios with trace, report, playbook, verifier, and predictive-check artifacts | Local corpus complete; hosted integration remains future |
| Linear roadmap is detailed | Linear project status plus repo mirror | Repo mirror exists; issue creation was retried on 2026-05-12 and remains blocked by the workspace free issue limit; this May 13 sync adds ECC #1860, AgentShield #78/#79, JARVIS #13, ECC-Tools #53/#54, resolved queue/discussion counts, and Linear project status updates `59f630eb`/`c7ea6daf` | Needs recurring status updates after each merge batch |
| Linear roadmap is detailed | Linear project status plus repo mirror | Repo mirror exists; issue creation was retried on 2026-05-12 and remains blocked by the workspace free issue limit; this May 13 sync adds ECC #1860, AgentShield #78-#82, JARVIS #13, ECC-Tools #53-#66, resolved queue/discussion counts, and Linear project status updates through ECC-Tools #66 | Needs recurring status updates after each merge batch |
| Flow separation and progress tracking | Flow lanes with owner artifacts and update cadence | This roadmap defines lanes below and `docs/architecture/progress-sync-contract.md` makes GitHub/Linear/handoff/roadmap sync part of the readiness gate | Active |
| Realtime Linear sync | Project updates while issue limit is blocked; issues later | ECC-Tools #39 implements opt-in Linear API sync for deferred follow-up backlog items, and ECC-Tools #54 adds copy-ready PR drafts to that backlog when draft PR shells are not opened; `docs/architecture/progress-sync-contract.md` defines the local file-backed realtime boundary while issue capacity is blocked | Needs workspace capacity/config rollout |
| Observability for self-use | Local readiness gate, traces, status snapshots, HUD/status contract, risk ledger, progress-sync contract | `npm run observability:ready` reports 21/21 | Complete for local gate |
@@ -332,7 +463,7 @@ repo evidence and merge commits.
| Harness OS core | Audit, adapter matrix, observability docs, `ecc2/` | HUD/session-control acceptance spec | Weekly until GA |
| Evaluation and RAG | Reference-set validation, harness audit, traces, ECC-Tools corpus | Read-only evaluator/RAG prototype plus stale-salvage, billing-readiness, CI-failure-diagnosis, harness-config-quality, AgentShield policy-exception, skill-quality evidence, deep-analyzer evidence, and RAG/evaluator comparison fixtures | Hosted retrieval/check-run automation plan |
| AgentShield enterprise | AgentShield PR evidence and roadmap notes | Remediation workflow depth or corpus expansion follow-up | Next implementation batch |
| ECC Tools app | ECC-Tools PR evidence, billing audit, risk taxonomy, evaluator/RAG corpus | ECC-Tools #53 published the supply-chain workflow hardening branch and #54 tracks copy-ready PR drafts in the Linear/project backlog; next work is hosted/deeper analysis follow-up | Next implementation batch |
| ECC Tools app | ECC-Tools PR evidence, billing audit, risk taxonomy, evaluator/RAG corpus | ECC-Tools #53 published the supply-chain workflow hardening branch, #54 tracks copy-ready PR drafts in the Linear/project backlog, #55 classifies analysis-depth readiness, #56 exposes the hosted execution plan, #57 executes the first hosted CI diagnostics job, #58 executes the hosted security evidence review job, #59 executes the hosted harness compatibility audit, #60 executes the hosted reference-set evaluation, #61 executes the hosted AI routing/cost review, #62 executes hosted team backlog routing, #63 publishes the hosted depth-plan check-run, and #64 dispatches hosted jobs from PR comments; next work is hosted result history/check-run summaries | Next implementation batch |
| Linear progress | Linear project status updates, `docs/architecture/progress-sync-contract.md`, and this mirror | Status update with queue/evidence/missing gates | Every significant merge batch |
The project status update should always include:
@@ -545,13 +676,13 @@ Acceptance:
supply-chain incident class; PR #79 moved baseline/watch/remediation
fingerprints to hashed evidence and stopped writing raw evidence into new
baselines; PR #80 added prioritized corpus accuracy recommendations for
failed regression gates; and ECC-Tools PRs #42/#43 now route and recognize
evidence packs. The next slice is remediation workflow depth or corpus
expansion.
2. Keep ECC-Tools #53's supply-chain workflow gate and #54's PR-draft backlog
tracking in the recurring queue evidence, and use the org-scoped GitHub auth
path for future ECC-Tools maintenance while the narrow environment token
remains active.
failed regression gates; PR #81 added ordered remediation workflow phases;
PR #82 expanded corpus coverage for env proxy hijacks and out-of-band
exfiltration; and ECC-Tools PRs #42/#43 now route and recognize evidence
packs. The next slice is hosted evidence-pack workflow depth.
2. Feed the #66 status surface back into hosted depth-plan recommendations so
queued analysis can suggest the next unrun or newly blocked hosted job from
cached outcomes, not only static readiness.
3. Enable/configure the merged Linear backlog sync path after workspace issue
capacity clears or the Linear workspace is upgraded, then verify PR-draft
salvage items land in the expected project.

View File

@@ -4,6 +4,10 @@
const MAX_STDIN = 1024 * 1024;
const path = require('path');
const { splitShellSegments } = require('../lib/shell-split');
const {
extractCommandSubstitutions,
extractSubshellGroups
} = require('../lib/shell-substitution');
const DEV_COMMAND_WORDS = new Set([
'npm',
@@ -123,6 +127,8 @@ function getLeadingCommandWord(segment) {
continue;
}
if (token === '{' || token === '}') continue;
if (/^[A-Za-z_][A-Za-z0-9_]*=.*/.test(token)) continue;
const normalizedToken = normalizeCommandWord(token);
@@ -154,23 +160,55 @@ process.stdin.on('data', chunk => {
}
});
const TMUX_LAUNCHER = /^\s*tmux\s+(new|new-session|new-window|split-window)\b/;
const DEV_PATTERN = /\b(npm\s+run\s+dev|pnpm(?:\s+run)?\s+dev|yarn(?:\s+run)?\s+dev|bun(?:\s+run)?\s+dev)\b/;
/**
* Collect every command-line segment we should evaluate. Returns the top-level
* segments first, then segments harvested from `$(...)` / backtick command
* substitutions and plain `(...)` subshell groups, recursively.
*
* Without this expansion the leading-command and dev-pattern check below only
* sees the outermost command, so wrappers like `$(npm run dev)` and
* `(npm run dev)` (which still spawn a dev server) sneak past.
*/
function collectCheckSegments(cmd) {
const segments = [...splitShellSegments(cmd)];
const queue = [cmd];
const seen = new Set();
while (queue.length) {
const current = queue.shift();
if (seen.has(current)) continue;
seen.add(current);
for (const body of extractCommandSubstitutions(current)) {
for (const seg of splitShellSegments(body)) segments.push(seg);
queue.push(body);
}
for (const body of extractSubshellGroups(current)) {
for (const seg of splitShellSegments(body)) segments.push(seg);
queue.push(body);
}
}
return segments;
}
function isBlockedDevSegment(segment) {
const commandWord = getLeadingCommandWord(segment);
if (!commandWord || !DEV_COMMAND_WORDS.has(commandWord)) return false;
return DEV_PATTERN.test(segment) && !TMUX_LAUNCHER.test(segment);
}
process.stdin.on('end', () => {
try {
const input = JSON.parse(raw);
const cmd = String(input.tool_input?.command || '');
if (process.platform !== 'win32') {
const segments = splitShellSegments(cmd);
const tmuxLauncher = /^\s*tmux\s+(new|new-session|new-window|split-window)\b/;
const devPattern = /\b(npm\s+run\s+dev|pnpm(?:\s+run)?\s+dev|yarn\s+dev|bun\s+run\s+dev)\b/;
const hasBlockedDev = segments.some(segment => {
const commandWord = getLeadingCommandWord(segment);
if (!commandWord || !DEV_COMMAND_WORDS.has(commandWord)) {
return false;
}
return devPattern.test(segment) && !tmuxLauncher.test(segment);
});
const segments = collectCheckSegments(cmd);
const hasBlockedDev = segments.some(isBlockedDevSegment);
if (hasBlockedDev) {
console.error('[Hook] BLOCKED: Dev server must run in tmux for log access');

View File

@@ -0,0 +1,246 @@
'use strict';
/**
* Extract executable command-substitution bodies from a shell line.
*
* Single quotes are literal, so substitutions inside them are ignored;
* double quotes still permit substitutions, so those bodies are scanned
* before quoted text is stripped. Returns each substitution body plus
* any nested substitutions discovered recursively.
*
* Originally introduced in scripts/hooks/gateguard-fact-force.js
* (PR #1853 round 2). Extracted to a shared lib so other PreToolUse
* hooks that need the same "scan inside `$(...)` and backticks"
* behavior can reuse it without duplicating the parser.
*
* @param {string} input
* @returns {string[]}
*/
function extractCommandSubstitutions(input) {
const source = String(input || '');
const substitutions = [];
let inSingle = false;
let inDouble = false;
for (let i = 0; i < source.length; i++) {
const ch = source[i];
const prev = source[i - 1];
if (ch === '\\' && !inSingle) {
i += 1;
continue;
}
if (ch === "'" && !inDouble && prev !== '\\') {
inSingle = !inSingle;
continue;
}
if (ch === '"' && !inSingle && prev !== '\\') {
inDouble = !inDouble;
continue;
}
if (inSingle) {
continue;
}
if (ch === '`') {
let body = '';
i += 1;
while (i < source.length) {
const inner = source[i];
if (inner === '\\') {
body += inner;
if (i + 1 < source.length) {
body += source[i + 1];
i += 2;
continue;
}
}
if (inner === '`') {
break;
}
body += inner;
i += 1;
}
if (body.trim()) {
substitutions.push(body);
substitutions.push(...extractCommandSubstitutions(body));
}
continue;
}
if (ch === '$' && source[i + 1] === '(') {
let depth = 1;
let body = '';
let bodyInSingle = false;
let bodyInDouble = false;
i += 2;
while (i < source.length && depth > 0) {
const inner = source[i];
const innerPrev = source[i - 1];
if (inner === '\\' && !bodyInSingle) {
body += inner;
if (i + 1 < source.length) {
body += source[i + 1];
i += 2;
continue;
}
}
if (inner === "'" && !bodyInDouble && innerPrev !== '\\') {
bodyInSingle = !bodyInSingle;
} else if (inner === '"' && !bodyInSingle && innerPrev !== '\\') {
bodyInDouble = !bodyInDouble;
} else if (!bodyInSingle && !bodyInDouble) {
if (inner === '(') {
depth += 1;
} else if (inner === ')') {
depth -= 1;
if (depth === 0) {
break;
}
}
}
body += inner;
i += 1;
}
if (body.trim()) {
substitutions.push(body);
substitutions.push(...extractCommandSubstitutions(body));
}
}
}
return substitutions;
}
/**
* Extract bodies of plain `(...)` subshell groups.
*
* Bash treats `(npm run dev)` as a subshell that executes its contents, but
* the regex-light segment splitters used by our PreToolUse hooks don't peer
* inside those parens. This helper finds top-level `(...)` groups (skipping
* `$(...)` command substitutions and backticks, which `extractCommandSubstitutions`
* already covers) and returns each body, recursing for nested groups.
*
* Quote semantics:
* - Single quotes are literal: `'( ... )'` is a string, not a subshell.
* - Double quotes are literal *for parens*: `"( ... )"` is a string too —
* bash only honors `$( )` inside double quotes, not bare `( )`.
*
* @param {string} input
* @returns {string[]}
*/
function extractSubshellGroups(input) {
const source = String(input || '');
const groups = [];
let inSingle = false;
let inDouble = false;
for (let i = 0; i < source.length; i++) {
const ch = source[i];
const prev = source[i - 1];
if (ch === '\\' && !inSingle) {
i += 1;
continue;
}
if (ch === "'" && !inDouble && prev !== '\\') {
inSingle = !inSingle;
continue;
}
if (ch === '"' && !inSingle && prev !== '\\') {
inDouble = !inDouble;
continue;
}
if (inSingle || inDouble) {
continue;
}
if (ch === '$' && source[i + 1] === '(') {
let depth = 1;
let skipInSingle = false;
let skipInDouble = false;
i += 2;
while (i < source.length && depth > 0) {
const inner = source[i];
const innerPrev = source[i - 1];
if (inner === '\\' && !skipInSingle) {
i += 2;
continue;
}
if (inner === "'" && !skipInDouble && innerPrev !== '\\') {
skipInSingle = !skipInSingle;
} else if (inner === '"' && !skipInSingle && innerPrev !== '\\') {
skipInDouble = !skipInDouble;
} else if (!skipInSingle && !skipInDouble) {
if (inner === '(') depth += 1;
else if (inner === ')') depth -= 1;
}
i += 1;
}
i -= 1;
continue;
}
if (ch === '`') {
i += 1;
while (i < source.length && source[i] !== '`') {
if (source[i] === '\\' && i + 1 < source.length) {
i += 2;
continue;
}
i += 1;
}
continue;
}
if (ch === '(') {
let depth = 1;
let body = '';
let bodyInSingle = false;
let bodyInDouble = false;
i += 1;
while (i < source.length && depth > 0) {
const inner = source[i];
const innerPrev = source[i - 1];
if (inner === '\\' && !bodyInSingle) {
body += inner;
if (i + 1 < source.length) {
body += source[i + 1];
i += 2;
continue;
}
}
if (inner === "'" && !bodyInDouble && innerPrev !== '\\') {
bodyInSingle = !bodyInSingle;
} else if (inner === '"' && !bodyInSingle && innerPrev !== '\\') {
bodyInDouble = !bodyInDouble;
} else if (!bodyInSingle && !bodyInDouble) {
if (inner === '(') {
depth += 1;
} else if (inner === ')') {
depth -= 1;
if (depth === 0) {
break;
}
}
}
body += inner;
i += 1;
}
if (body.trim()) {
groups.push(body);
groups.push(...extractSubshellGroups(body));
}
}
}
return groups;
}
module.exports = { extractCommandSubstitutions, extractSubshellGroups };

View File

@@ -89,6 +89,110 @@ function runTests() {
assert.strictEqual(result.code, 0, `Expected exit code 0, got ${result.code}`);
}) ? passed++ : failed++);
// --- Subshell bypass regression (issue: dev server slipped past via $(), ``, ()) ---
if (!isWindows) {
(test('blocks $(npm run dev) — command substitution', () => {
const result = runScript('$(npm run dev)');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
assert.ok(result.stderr.includes('BLOCKED'), 'expected BLOCKED in stderr');
}) ? passed++ : failed++);
(test('blocks `npm run dev` — backtick substitution', () => {
const result = runScript('`npm run dev`');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('blocks echo $(npm run dev) — substitution nested in argument', () => {
const result = runScript('echo $(npm run dev)');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('blocks (npm run dev) — plain subshell group', () => {
const result = runScript('(npm run dev)');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('blocks $(echo a; npm run dev) — substitution with sequenced segments', () => {
const result = runScript('$(echo a; npm run dev)');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('blocks (pnpm dev) — plain subshell group with pnpm', () => {
const result = runScript('(pnpm dev)');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('allows tmux launcher inside subshell wrapping (exit code 0)', () => {
const result = runScript('(tmux new-session -d -s dev "npm run dev")');
assert.strictEqual(result.code, 0, `Expected exit code 0, got ${result.code}`);
}) ? passed++ : failed++);
(test('allows single-quoted "(npm run dev)" — literal string, not a subshell', () => {
const result = runScript("git commit -m '(npm run dev)'");
assert.strictEqual(result.code, 0, `Expected exit code 0, got ${result.code}`);
}) ? passed++ : failed++);
(test('allows double-quoted "(npm run dev)" — literal in double quotes (bash does not subshell)', () => {
const result = runScript('echo "(npm run dev)"');
assert.strictEqual(result.code, 0, `Expected exit code 0, got ${result.code}`);
}) ? passed++ : failed++);
(test("allows single-quoted '$(npm run dev)' — literal string, no substitution", () => {
const result = runScript("git commit -m '$(npm run dev) fix'");
assert.strictEqual(result.code, 0, `Expected exit code 0, got ${result.code}`);
}) ? passed++ : failed++);
}
// --- Round 1 review fixes (Greptile + CodeRabbit on PR #1889) ---
if (!isWindows) {
(test('blocks $(echo ")"; (npm run dev)) — quoted ) does not terminate $() early', () => {
const result = runScript('$(echo ")"; (npm run dev))');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('blocks (echo ")"; npm run dev) — quoted ) does not terminate (...) early', () => {
const result = runScript('(echo ")"; npm run dev)');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('allows $(echo "(npm run dev)") — () inside double-quoted substitution body is literal', () => {
const result = runScript('$(echo "(npm run dev)")');
assert.strictEqual(result.code, 0, `Expected exit code 0, got ${result.code}`);
}) ? passed++ : failed++);
(test('blocks { npm run dev; } — brace group runs in current shell', () => {
const result = runScript('{ npm run dev; }');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('blocks echo hi && { npm run dev; } — brace group after &&', () => {
const result = runScript('echo hi && { npm run dev; }');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('allows {npm run dev} — bash requires space after { to form a group', () => {
const result = runScript('{npm run dev}');
assert.strictEqual(result.code, 0, `Expected exit code 0, got ${result.code}`);
}) ? passed++ : failed++);
(test('blocks yarn run dev — yarn 1.x convention', () => {
const result = runScript('yarn run dev');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('blocks bun dev — bun bare form', () => {
const result = runScript('bun dev');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
(test('blocks "$(npm run dev)" — double-quoted substitution still substitutes', () => {
const result = runScript('echo "$(npm run dev)"');
assert.strictEqual(result.code, 2, `Expected exit code 2, got ${result.code}`);
}) ? passed++ : failed++);
}
// --- Edge cases ---
(test('empty/invalid input passes through (exit code 0)', () => {