Two false-negatives surfaced in PR #1889 review:
1. Brace-group bypass (Greptile).
`{ npm run dev; }` evaluates the dev command in the *current*
shell — semantically distinct from `( ... )` but with the same
effect for this hook. `splitShellSegments` correctly cleaves the
group at `;` into `["{ npm run dev", "}"]`, but the first segment's
leading token under `readToken` is the bare `{`, which was not in
`DEV_COMMAND_WORDS`, so the dev-pattern check was skipped.
Fix: treat `{` and `}` as no-op tokens in `getLeadingCommandWord`
so we keep walking to the real command word. Matches how shell
itself parses brace groups (the braces are reserved words, not
commands). Bash requires a space after `{` and a terminator before
`}` for an actual group, so `{npm run dev}` correctly remains
allowed (single token `{npm`, not in `DEV_COMMAND_WORDS`).
2. Missing yarn-run / bun-bare variants (CodeRabbit).
Both `yarn dev` *and* `yarn run dev` are valid (the latter is what
`package.json` actually wires `dev` to under yarn 1.x). The same
`(run )?` symmetry applies to bun. The previous `DEV_PATTERN` only
matched `yarn\s+dev` and `bun\s+run\s+dev`, allowing the cross
forms to pass through silently.
Fix: `yarn(?:\s+run)?\s+dev` and `bun(?:\s+run)?\s+dev` — same
shape `pnpm(?:\s+run)?\s+dev` was already using.
Verified after this commit (every form now exits 2):
{ npm run dev; }
{ npm run dev ; }
echo hi && { npm run dev; }
({ npm run dev; })
$( { npm run dev; } )
yarn run dev
bun dev
Verified still allowed (no regression):
echo "{ npm run dev; }" # literal inside double quotes
{npm run dev} # not a brace group per bash syntax
Before this commit the dev-server-block hook ran the leading-command
and dev-pattern check only against the top-level segments returned by
`splitShellSegments`, which doesn't split on `$(...)`, backticks, or
plain `(...)`. That left the policy bypassable by wrapping a dev
command in any of those constructs:
$(npm run dev)
`npm run dev`
echo $(npm run dev)
(npm run dev)
Each verified by piping a synthetic PreToolUse payload into the hook
on this branch: every form above returned exit 0 (allow) where a plain
`npm run dev` correctly returned exit 2 (block).
Fix: expand the check space before running the leading-command rule.
A small BFS walks the raw command, harvesting bodies from
`extractCommandSubstitutions` (`$(...)` and backticks) and from
`extractSubshellGroups` (plain `(...)`), then splits each harvested
body through `splitShellSegments` and feeds the result into the
existing `isBlockedDevSegment` check.
This preserves every existing allow case (`tmux new-session -d -s dev
"npm run dev"`, quoted-string mentions like `git commit -m "npm run
dev fix"`, `echo hi`) because the leading-command rule is unchanged —
only the set of segments it runs against grew.
Known limitation, not fixed here: `eval "$(echo npm run dev)"` still
slips through because the substitution body's leading command is
`echo`, and statically modeling echo's output to recover the executed
command is out of scope. The same class affects `gateguard-fact-force`
(via `eval "$(echo rm -rf /)"` etc.) and is best addressed in both
hooks together as a follow-up rather than as a one-off here.
Backport Jamkris's fix for case-insensitive core.hooksPath overrides and the git commit -tn template-path false positive. Verified locally on current main with 25/25 block-no-verify tests and node tests/run-all.js passing 2369/2369.
Salvages the useful statusline/context monitor work from stale PR #1504 while preserving the current continuous-learning hook runner wiring.
Adds the metrics bridge, context monitor, statusline script, shared cost/session bridge utilities, and tests. Fixes the reviewed false loop-detection hash collision for non-file tools, avoids default-session cost inflation, sanitizes statusline task lookup, and records hook payload session IDs in cost-tracker.
* fix(hooks): resolve MCP health-check spawn ENOENT on Windows
On Windows, commands like 'npx' are batch files (npx.cmd) that require
shell expansion to resolve via PATH. Without shell: true, Node.js
spawn() fails with ENOENT.
However, absolute paths (e.g. C:\Program Files\nodejs\node.exe) must
NOT use shell mode because cmd.exe misparses paths containing spaces.
Fix: enable shell mode only for non-absolute commands on Windows, using
path.isAbsolute() to distinguish. This matches how attemptReconnect()
already handles the shell option.
Fixes#1455
* fix(hooks): harden Windows shell spawn — validate command for metacharacters
Addresses bot review feedback on PR #1456:
- Add UNSAFE_SHELL_CHARS regex to guard against shell injection when
needsShell=true: cmd.exe operators (&, |, <, >, ^, %, !, (), ;,
whitespace) are rejected before shell mode is enabled
- Add typeof command === 'string' check so path.isAbsolute() cannot
throw on malformed non-string command values
- Rename test to 'via PATH resolution' (not Windows-only; runs all platforms)
- Fix misleading test comment: 'node' resolves via PATH like npx.cmd but
does not itself use .cmd; comment now accurately reflects the intent
* fix(hooks): kill full process tree on Windows when shell mode is used
When needsShell=true, the spawned child is cmd.exe. Calling child.kill()
only terminates the shell, leaving the real server process orphaned.
Use taskkill /PID <pid> /T /F on Windows+shell to kill the entire
process tree rooted at cmd.exe. Fall back to SIGTERM+SIGKILL on all
other platforms or when shell mode is not active.
* fix(hooks): fall back to child.kill() when taskkill fails
Windows taskkill can fail if it's not on PATH, the process already
exited, or permissions are denied. Previously the failure was silently
ignored and no kill signal reached the child.
Now: capture the spawnSync result and fall back to child.kill('SIGKILL')
on any taskkill error or non-zero status. This still may leak a
detached server process but at least guarantees the cmd.exe shell is
signaled.
The SessionStart hook injects the most recent *-session.tmp as
additionalContext labelled only with 'Previous session summary:'.
After a /compact boundary, the model frequently re-executes stale
slash-skill invocations it finds inside that summary, re-running
ARGUMENTS-bearing skills (e.g. /fw-task-new, /fw-raise-pr) with the
last ARGUMENTS they saw.
Observed on claude-opus-4-7 with ECC v1.9.0 on a firmware project:
after compaction resume, the model spontaneously re-enters the prior
skill with stale ARGUMENTS, duplicating GitHub issues, Notion tasks,
and branches for work that is already merged.
ECC cannot fix Claude Code's skill-state replay across compactions,
but it can stop amplifying it. Wrap the injected summary in an
explicit HISTORICAL REFERENCE ONLY preamble with a STALE-BY-DEFAULT
contract and delimit the block with BEGIN/END markers so the model
treats everything inside as frozen reference material.
Tests: update the two hooks.test.js cases that asserted on the old
'Previous session summary' literal to assert on the new guard
preamble, the STALE-BY-DEFAULT contract, and both delimiters. 219/219
tests pass locally.
Tracked at: #1534
* fix(gateguard): rewrite routineBashMsg to use fact-presentation pattern
The imperative 'Quote user's instruction verbatim. Then retry.' phrasing
triggers Claude Code's runtime anti-prompt-injection filter, deadlocking
the first Bash call of every session. The sibling gates (edit, write,
destructive) use multi-point fact-list framing that the runtime accepts.
Align routineBashMsg with that pattern to restore the gate's intended
behavior without changing run(), state schema, or any public API.
Closes#1530
* docs(gateguard): sync SKILL.md routine gate spec with new message format
CodeRabbit flagged that skills/gateguard/SKILL.md still described the
pre-fix imperative message. Update the Routine Bash Gate section to
match the numbered fact-list format used by the new routineBashMsg().
Previously the env fallback ran only when JSON.parse threw. If stdin was valid
JSON but omitted transcript_path or provided a non-string/empty value, the
script dropped to the getSessionIdShort() fallback path, re-introducing the
collision this PR targets.
Validate the parsed transcript_path and apply the env-var fallback for any
unusable value, not just malformed JSON. Matches coderabbit's outside-diff
suggestion and keeps both input-source paths equivalent.
Refs #1494
- Route the transcript-derived shortId through sanitizeSessionId so the
fallback and transcript branches remain byte-for-byte equivalent for any
non-UUID session IDs that still land in CLAUDE_SESSION_ID (greptile P1).
- Clarify the inline comment in the first regression test: clearing
CLAUDE_SESSION_ID exercises the transcript_path branch, not the
getSessionIdShort() fallback (coderabbit P2).
Refs #1494
- Use last-8 chars of transcript UUID instead of first-8, matching
getSessionIdShort()'s .slice(-8) convention. Same session now produces the
same filename whether shortId comes from CLAUDE_SESSION_ID or transcript_path,
so existing .tmp files are not orphaned on upgrade.
- Normalize extracted hex prefix to lowercase to avoid case-driven filename
divergence from sanitizeSessionId()'s lowercase output.
- Explicitly clear CLAUDE_SESSION_ID in the first regression test so the env
leak from parent test runs cannot hide the fallback path.
- Add regression tests for the lowercase-normalization path and for the case
where CLAUDE_SESSION_ID and transcript_path refer to the same UUID (backward
compat guarantee).
Refs #1494
When session-end.js runs and CLAUDE_SESSION_ID is unset, getSessionIdShort()
falls back to the project/worktree name. If any other Stop-hook in the chain
spawns a claude subprocess (e.g. an AI-summary generator using 'claude -p'),
the subprocess also fires the full Stop chain and writes to the same project-
name-based filename, clobbering the parent's valid session summary with a
summary of the summarization prompt itself.
Fix: when stdin JSON (or CLAUDE_TRANSCRIPT_PATH) provides a transcript_path,
extract the first 8 hex chars of the session UUID from the filename and use
that as shortId. Falls back to the original getSessionIdShort() when no
transcript_path is available, so existing behavior is preserved for all
callers that do not set it.
Adds a regression test in tests/hooks/hooks.test.js.
Refs #1494
P1: Gate message asked for raw production data records — changed to
"redacted or synthetic values" to prevent sensitive data exfiltration
P2: SKILL.md description now includes MultiEdit (was missing after
MultiEdit gate was added in previous commit)
P2: Session key pruning now caps __prefixed keys at 50 to prevent
unbounded growth even in theoretical edge cases
9/9 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- isChecked() no longer calls saveState() — read-only operation
should not write to disk (was causing 3x writes per tool call)
- Test cleanup uses fs.rmSync(recursive) instead of fs.rmdirSync
which failed with ENOTEMPTY when .tmp files remained
9/9 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
P1 (cubic-dev-ai): Test process PID differs from spawned hook PID,
so test was seeding/clearing wrong state file. Fix: pass fixed
CLAUDE_SESSION_ID='gateguard-test-session' to spawned hooks.
P2 (cubic-dev-ai): Pruning checked array could evict __bash_session__
and other session keys, causing gates to re-fire mid-session. Fix:
preserve __prefixed keys during pruning, only evict file-path entries.
9/9 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
P1 bug reported by greptile-apps: MultiEdit uses toolInput.edits[].file_path,
not toolInput.file_path. The gate was silently allowing all MultiEdit calls.
Fix: separate MultiEdit into its own branch that iterates edits array
and gates on the first unchecked file_path.
9/9 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Addresses reviewer feedback from @affaan-m:
1. State keyed by CLAUDE_SESSION_ID / ECC_SESSION_ID
- Falls back to pid-based isolation when env vars absent
- State file: state-{sessionId}.json (was .session_state.json)
2. Atomic write+rename semantics
- Write to temp file, then fs.renameSync to final path
- Prevents partial reads from concurrent hooks
3. Bounded checked list (MAX_CHECKED_ENTRIES = 500)
- Prunes to last 500 entries when cap exceeded
- Stale session files auto-deleted after 1 hour
9/9 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>