Commit Graph

201 Commits

Author SHA1 Message Date
Jamkris
086e44c964 fix(hooks): log fail-open breadcrumb on parse/read errors in metrics bridge
coderabbitai flagged: the two `catch` blocks in `readSessionCost`
silently swallowed every failure mode. A malformed `costs.jsonl`
row, a permission error opening the file, or any other unexpected
I/O failure would silently return zero cost — masking real
problems and feeding stale or zero numbers into
`ecc-context-monitor.js` (which then injects them as
`additionalContext` into the live model turn).

Fix two things, both fail-open-preserving:

1. **Inner JSON.parse catch** — count malformed lines and write
   one aggregated breadcrumb per call:

     [ecc-metrics-bridge] skipped N malformed line(s) in <path>

   Aggregating (rather than per-line) keeps a log-flooded
   `costs.jsonl` diagnosable without overwhelming stderr.

2. **Outer fs.readFileSync catch** — write a breadcrumb on real
   errors, but stay silent on `ENOENT`. The "no costs.jsonl yet"
   case is genuinely normal (no Stop event has fired this session)
   and producing noise on every PreToolUse before the first Stop
   would be reviewer-visible spam. All other error codes
   (`EACCES`, `EISDIR`, `EMFILE`, …) get:

     [ecc-metrics-bridge] failing open after <name> reading <path>: <msg>

In both cases the function still returns the zero-cost fallback
so the bridge never breaks tool execution — only the
diagnosability changes.

Two new regression tests in
`tests/hooks/ecc-metrics-bridge.test.js`:

  ✓ readSessionCost writes a stderr breadcrumb when malformed
    lines are skipped — feeds 4 rows (2 valid, 2 malformed),
    asserts the last valid row still wins AND captured stderr
    contains "skipped 2 malformed line(s)".

  ✓ readSessionCost stays silent when costs.jsonl does not exist
    (ENOENT) — uses a fresh tmp HOME with no metrics dir, asserts
    zero return AND empty stderr.

Test count: 16 → 18; `npm test` green; `yarn lint` clean.
2026-05-17 21:41:24 -04:00
Jamkris
63c9788f50 fix(hooks): scan full costs.jsonl when locating session row
`readSessionCost` read only the trailing 8 KiB of
`~/.claude/metrics/costs.jsonl` to "avoid scanning entire file".
That ceiling is the opposite-sign sibling of the double-count bug
fixed in the previous commit: once a session's most recent
cumulative row gets pushed past the 8 KiB window by newer rows
from other sessions, the bridge silently reports `totalCost: 0`,
`totalIn: 0`, `totalOut: 0` for that session — same false signal
to `ecc-context-monitor.js`, same wrong number injected into the
live model turn as `additionalContext`.

`cost-tracker.js` has no rotation policy, so on any non-trivial
workstation costs.jsonl grows past 8 KiB within minutes of normal
use. For users who keep multiple concurrent sessions, this means
the second-and-later sessions silently report zero almost
immediately.

Reproduced before this commit:

  $ HOME=/tmp/eccc node -e '
      const fs = require("fs");
      const m = require("./scripts/hooks/ecc-metrics-bridge.js");
      // S1 row at file start, then 200 rows of OTHER-session noise (~16 KiB).
      // S1 is the row we want, but it sits past the 8 KiB tail.
      const s1 = `{"session_id":"S1","estimated_cost_usd":0.5,"input_tokens":500,"output_tokens":250}`;
      const other = `{"session_id":"OTHER","estimated_cost_usd":1,"input_tokens":100,"output_tokens":50}`;
      fs.mkdirSync("/tmp/eccc/.claude/metrics", { recursive: true });
      fs.writeFileSync("/tmp/eccc/.claude/metrics/costs.jsonl",
        [s1, ...Array(200).fill(other)].join("\\n") + "\\n");
      console.log(JSON.stringify(m.readSessionCost("S1")));'
  {"totalCost":0,"totalIn":0,"totalOut":0}

Expected: `{"totalCost":0.5, "totalIn":500, "totalOut":250}` (the
S1 row that exists in the file).
Actual: zero — the row is past the 8 KiB tail.

Fix: drop the `fs.openSync` + bounded `fs.readSync` + position
arithmetic in favour of `fs.readFileSync(costsPath, 'utf8')` and
iterate every line. Each row is ~150 bytes; even 100k rows is
~15 MB and a single sync read on PreToolUse is in the low ms.
If file rotation lands in `cost-tracker.js` later, this scan
becomes proportionally cheaper.

After this commit the reproduction above returns
`{"totalCost":0.5, "totalIn":500, "totalOut":250}`.

Regression test in `tests/hooks/ecc-metrics-bridge.test.js`:
`readSessionCost finds session row beyond the old 8 KiB tail
boundary`. The test asserts the costs.jsonl fixture is > 8 KiB
before reading so any reintroduction of a bounded tail would
re-fail the test (i.e. the assertion is the contract, not the
specific number 8192).

Together with the previous commit, both directions of the
metrics-bridge cost-reporting bug are closed.
2026-05-17 21:41:24 -04:00
Jamkris
4f21ed2acf fix(hooks): use last cumulative row for session cost in metrics bridge
`ecc-metrics-bridge.js#readSessionCost` summed the
`estimated_cost_usd`, `input_tokens`, and `output_tokens` of
every matching row in `~/.claude/metrics/costs.jsonl`. That breaks
the documented contract of `scripts/hooks/cost-tracker.js`, which
explicitly states (in its module docblock):

  Cumulative behavior: Stop fires per assistant response, not
  per session. Each row therefore represents the cumulative
  session total up to that point. To get per-session cost, take
  the last row per session_id.

Summing N cumulative rows over-counts by roughly (N+1)/2 ×. For a
session with 3 rows at 0.01, 0.02, 0.03 USD (true running total
0.03), the bridge today reports 0.06 USD. The over-counted value
feeds `ecc-context-monitor.js`, which then trips its
COST_NOTICE_USD / COST_WARNING_USD / COST_CRITICAL_USD thresholds
on phantom spend AND injects the inflated number as
`additionalContext` into the live model turn — so the agent
itself is told a wrong cost.

Reproduced on `main` before this commit:

  $ cat > /tmp/eccc/.claude/metrics/costs.jsonl <<EOF
  {"session_id":"S1","estimated_cost_usd":0.01,"input_tokens":333,"output_tokens":166}
  {"session_id":"S1","estimated_cost_usd":0.02,"input_tokens":666,"output_tokens":333}
  {"session_id":"S1","estimated_cost_usd":0.03,"input_tokens":1000,"output_tokens":500}
  EOF

  $ HOME=/tmp/eccc node -e 'const m = require("./scripts/hooks/ecc-metrics-bridge.js"); \
      console.log(JSON.stringify(m.readSessionCost("S1")))'
  {"totalCost":0.06,"totalIn":1999,"totalOut":999}

Expected: `{"totalCost":0.03,"totalIn":1000,"totalOut":500}` (the
last cumulative row).
Actual: 2× over-count.

Fix: replace `+=` with `=` in the matching branch so the assigned
values reflect the most recent row encountered. The iteration
order is file order, which is also event time order, so the last
assignment wins — exactly the contract cost-tracker writes
against.

After this commit the reproduction above returns
`{"totalCost":0.03,"totalIn":1000,"totalOut":500}`.

Regression test in `tests/hooks/ecc-metrics-bridge.test.js`:
`readSessionCost returns the LAST cumulative row, not the sum
(cost-tracker contract)`. The existing
`readSessionCost does not include unrelated default-session rows`
test happened to pass even with the bug because it only had one
target-session row — single-row sessions are coincidentally
correct under both formulas. The new test uses three rows so the
two formulas diverge.

A second issue in the same function — the 8 KiB tail-only read
silently drops older rows once a session's recent cumulative
totals scroll past that window — is fixed in the next commit.
2026-05-17 21:41:24 -04:00
Affaan Mustafa
b47dfa95a3 fix: add context monitor cost warning opt-out 2026-05-17 01:53:57 -04:00
Affaan Mustafa
6d130cfcd5 fix: reduce observer hook scanner signatures 2026-05-16 15:26:25 -04:00
Kris Pahel
50ac061f9e chore: update statusline ANSI color palette
- Replace blinking red (5;31m) with bold red (1;31m) for critical context bar
- Replace cyan metrics (36m) with sky blue (38;5;117m)
- Replace plain bold task (1m) with bold bright white (1;97m)
- Update test assertion to match new bold red code
2026-05-15 23:18:01 -04:00
SeungHyun
8cfadfea28 fix(hooks): close grouped command bypasses in gateguard (#1912)
Inspect executable bodies inside plain subshells and brace groups before applying destructive command classifiers.\n\nCo-authored-by: Jamkris <82251632+Jamkris@users.noreply.github.com>
2026-05-15 01:39:15 -04:00
Affaan Mustafa
375d750b4c fix: integrate recent hook and docs PRs (#1905)
Integrates useful changes from #1882, #1884, #1889, #1893, #1898, #1899, and #1903:
- fix rule install docs to preserve language directories
- correct Ruby security command examples
- harden dev-server hook command-substitution parsing
- add Prisma patterns skill and catalog/package surfaces
- allow first-time protected config creation while blocking existing configs
- read cost metrics from Stop hook transcripts
- emit suggest-compact additionalContext on stdout

Co-authored-by: Jamkris <dltmdgus1412@gmail.com>
Co-authored-by: Levi-Evan <levishantz@gmail.com>
Co-authored-by: gaurav0107 <gauravdubey0107@gmail.com>
Co-authored-by: richm-spp <richard.millar@salarypackagingplus.com.au>
Co-authored-by: zomia <zomians@outlook.jp>
Co-authored-by: donghyeun02 <donghyeun02@gmail.com>
2026-05-14 21:37:28 -04:00
SeungHyun
0e169fecbc fix: harden GateGuard destructive bash tokenizer
Co-authored-by: Jamkris <dltmdgus1412@gmail.com>
2026-05-13 02:43:04 -04:00
SeungHyun
6be241a463 fix: close block-no-verify bypass holes
Backport Jamkris's fix for case-insensitive core.hooksPath overrides and the git commit -tn template-path false positive. Verified locally on current main with 25/25 block-no-verify tests and node tests/run-all.js passing 2369/2369.
2026-05-12 22:28:12 -04:00
Affaan Mustafa
22aabf7d4f test: harden InsAIts wrapper fake Python shim 2026-05-12 01:13:01 -04:00
Affaan Mustafa
901e41997b test: stabilize MCP stderr probe timeout 2026-05-12 01:13:01 -04:00
Affaan Mustafa
940135ea47 feat: add ECC statusline observability hooks
Salvages the useful statusline/context monitor work from stale PR #1504 while preserving the current continuous-learning hook runner wiring.

Adds the metrics bridge, context monitor, statusline script, shared cost/session bridge utilities, and tests. Fixes the reviewed false loop-detection hash collision for non-file tools, avoids default-session cost inflation, sanitizes statusline task lookup, and records hook payload session IDs in cost-tracker.
2026-05-11 23:44:06 -04:00
Affaan Mustafa
e9c8845833 feat: add Astraflow provider support 2026-05-11 23:21:46 -04:00
Affaan Mustafa
03108bea62 fix: scope SessionStart context injection 2026-05-11 22:56:29 -04:00
Affaan Mustafa
67a8b914ee test: harden mcp health port readiness 2026-05-11 22:40:19 -04:00
Affaan Mustafa
4220f1b064 test: relax InsAIts monitor timeout 2026-05-11 19:38:21 -04:00
Affaan Mustafa
c45aeee57f fix: salvage remaining stale queue fixes (#1754) 2026-05-11 16:41:08 -04:00
Affaan Mustafa
f442bac8c9 fix: port Windows hook safety fixes (#1719) 2026-05-11 03:56:51 -04:00
Affaan Mustafa
12e1bc424d fix: port continuous-learning observer fixes
Ports continuous-learning observer signal, storage, remote normalization, and v1 deprecation fixes onto current main.
2026-05-11 03:35:42 -04:00
Affaan Mustafa
1abc3fb381 fix: port hook session and dashboard safety fixes
Ports suggest-compact session_id isolation and dashboard terminal/document launch safety onto current main.
2026-05-11 02:53:28 -04:00
Affaan Mustafa
7b964402ee fix: bypass GateGuard file gates in subagents (#1710) 2026-05-11 01:51:24 -04:00
Michael
600072ebd8 fix(hooks): resolve MCP health-check spawn ENOENT on Windows (#1456)
* fix(hooks): resolve MCP health-check spawn ENOENT on Windows

On Windows, commands like 'npx' are batch files (npx.cmd) that require
shell expansion to resolve via PATH. Without shell: true, Node.js
spawn() fails with ENOENT.

However, absolute paths (e.g. C:\Program Files\nodejs\node.exe) must
NOT use shell mode because cmd.exe misparses paths containing spaces.

Fix: enable shell mode only for non-absolute commands on Windows, using
path.isAbsolute() to distinguish. This matches how attemptReconnect()
already handles the shell option.

Fixes #1455

* fix(hooks): harden Windows shell spawn — validate command for metacharacters

Addresses bot review feedback on PR #1456:

- Add UNSAFE_SHELL_CHARS regex to guard against shell injection when
  needsShell=true: cmd.exe operators (&, |, <, >, ^, %, !, (), ;,
  whitespace) are rejected before shell mode is enabled
- Add typeof command === 'string' check so path.isAbsolute() cannot
  throw on malformed non-string command values
- Rename test to 'via PATH resolution' (not Windows-only; runs all platforms)
- Fix misleading test comment: 'node' resolves via PATH like npx.cmd but
  does not itself use .cmd; comment now accurately reflects the intent

* fix(hooks): kill full process tree on Windows when shell mode is used

When needsShell=true, the spawned child is cmd.exe. Calling child.kill()
only terminates the shell, leaving the real server process orphaned.

Use taskkill /PID <pid> /T /F on Windows+shell to kill the entire
process tree rooted at cmd.exe. Fall back to SIGTERM+SIGKILL on all
other platforms or when shell mode is not active.

* fix(hooks): fall back to child.kill() when taskkill fails

Windows taskkill can fail if it's not on PATH, the process already
exited, or permissions are denied. Previously the failure was silently
ignored and no kill signal reached the child.

Now: capture the spawnSync result and fall back to child.kill('SIGKILL')
on any taskkill error or non-zero status. This still may leak a
detached server process but at least guarantees the cmd.exe shell is
signaled.
2026-05-11 01:13:37 -04:00
Affaan Mustafa
bb40978e31 fix: show correct gateguard hook recovery id 2026-04-30 11:26:15 -04:00
Affaan Mustafa
7c5452f4fa fix: keep gateguard destructive gate strict 2026-04-30 11:26:15 -04:00
Affaan Mustafa
cfe770a735 fix: add gateguard recovery escape hatch 2026-04-30 11:26:15 -04:00
Affaan Mustafa
b1456bd954 fix: cap session-start context injection 2026-04-30 08:41:52 -04:00
Affaan Mustafa
95bef977c1 fix: fail open on gateguard state write errors 2026-04-30 08:15:27 -04:00
Affaan Mustafa
d26d66fd3b fix: inject learned skills at session start 2026-04-30 01:31:41 -04:00
Affaan Mustafa
9627c201c7 test: harden mcp health http probe fixture 2026-04-29 23:05:17 -04:00
Affaan Mustafa
1188aeafc4 fix: refine gateguard destructive git detection 2026-04-29 22:41:22 -04:00
Affaan Mustafa
0dcde13384 fix: parse block-no-verify flags by shell words 2026-04-29 21:59:12 -04:00
Affaan Mustafa
3fadc37802 fix: route continuous learning observe hooks through node 2026-04-29 21:28:59 -04:00
Affaan Mustafa
c3ea7a1e5e fix: preserve gateguard concurrent state writes (#1623) 2026-04-29 19:31:11 -04:00
Affaan Mustafa
468c755abd test: extend insaits monitor subprocess timeout 2026-04-29 19:25:18 -04:00
Affaan Mustafa
8c7e6611e0 test: cover gateguard edge paths 2026-04-29 19:08:47 -04:00
Affaan Mustafa
b5bdd9352f fix: run pre-bash linters through windows wrappers 2026-04-29 18:59:10 -04:00
Affaan Mustafa
ebf0d4322b test: support windows pre-bash linter shims 2026-04-29 18:36:33 -04:00
Affaan Mustafa
015b00b8fc test: stabilize mcp health crash probes 2026-04-29 18:29:02 -04:00
Affaan Mustafa
51511461f6 test: cover pre-bash commit quality edges 2026-04-29 18:28:56 -04:00
Affaan Mustafa
33edfd3bb3 test: cover session activity tracker edge paths 2026-04-29 18:15:51 -04:00
Affaan Mustafa
f92dc544c4 test: cover mcp health edge paths 2026-04-29 18:08:45 -04:00
Affaan Mustafa
1c2d5dd389 fix: fail open on insaits monitor errors 2026-04-29 18:03:33 -04:00
Affaan Mustafa
63485a26bf fix: support windows insaits python shims 2026-04-29 17:53:07 -04:00
Affaan Mustafa
fe40a3d27b test: cover hook bootstrap and InsAIts monitor 2026-04-29 17:45:22 -04:00
Affaan Mustafa
fd95cf6b29 fix: retry observer wait after signal 2026-04-28 22:11:47 -04:00
Affaan Mustafa
601c626b03 Merge pull request #1495 from ratorin/fix/session-end-transcript-path-isolation
fix(hooks): isolate session-end.js filename using transcript_path UUID (#1494)
2026-04-21 18:14:23 -04:00
Vishnu Pradeep
b27551897d fix(hooks): wrap SessionStart summary with stale-replay guard (#1536)
The SessionStart hook injects the most recent *-session.tmp as
additionalContext labelled only with 'Previous session summary:'.
After a /compact boundary, the model frequently re-executes stale
slash-skill invocations it finds inside that summary, re-running
ARGUMENTS-bearing skills (e.g. /fw-task-new, /fw-raise-pr) with the
last ARGUMENTS they saw.

Observed on claude-opus-4-7 with ECC v1.9.0 on a firmware project:
after compaction resume, the model spontaneously re-enters the prior
skill with stale ARGUMENTS, duplicating GitHub issues, Notion tasks,
and branches for work that is already merged.

ECC cannot fix Claude Code's skill-state replay across compactions,
but it can stop amplifying it. Wrap the injected summary in an
explicit HISTORICAL REFERENCE ONLY preamble with a STALE-BY-DEFAULT
contract and delimit the block with BEGIN/END markers so the model
treats everything inside as frozen reference material.

Tests: update the two hooks.test.js cases that asserted on the old
'Previous session summary' literal to assert on the new guard
preamble, the STALE-BY-DEFAULT contract, and both delimiters. 219/219
tests pass locally.

Tracked at: #1534
2026-04-21 18:02:19 -04:00
Taro Kawakami
01d816781e review: apply sanitizeSessionId to UUID shortId, fix test comment
- Route the transcript-derived shortId through sanitizeSessionId so the
  fallback and transcript branches remain byte-for-byte equivalent for any
  non-UUID session IDs that still land in CLAUDE_SESSION_ID (greptile P1).
- Clarify the inline comment in the first regression test: clearing
  CLAUDE_SESSION_ID exercises the transcript_path branch, not the
  getSessionIdShort() fallback (coderabbit P2).

Refs #1494
2026-04-19 14:30:00 +09:00
Taro Kawakami
93cd5f4cff review: address P1/P2 bot feedback on shortId derivation
- Use last-8 chars of transcript UUID instead of first-8, matching
  getSessionIdShort()'s .slice(-8) convention. Same session now produces the
  same filename whether shortId comes from CLAUDE_SESSION_ID or transcript_path,
  so existing .tmp files are not orphaned on upgrade.
- Normalize extracted hex prefix to lowercase to avoid case-driven filename
  divergence from sanitizeSessionId()'s lowercase output.
- Explicitly clear CLAUDE_SESSION_ID in the first regression test so the env
  leak from parent test runs cannot hide the fallback path.
- Add regression tests for the lowercase-normalization path and for the case
  where CLAUDE_SESSION_ID and transcript_path refer to the same UUID (backward
  compat guarantee).

Refs #1494
2026-04-19 14:19:29 +09:00