Compare commits

...

6 Commits

Author SHA1 Message Date
Affaan Mustafa
f6d7067ff5 fix(hooks): persist metrics warning dedup 2026-05-17 21:32:08 -04:00
Affaan Mustafa
ef13690fc3 fix(hooks): suppress repeated metrics warning breadcrumbs 2026-05-17 21:09:47 -04:00
Jamkris
8cb9dac8ec docs(hooks): correct PreToolUse → PostToolUse in readSessionCost docblock
greptile P2 nitpick: the previous commit's docblock said "on every
PreToolUse hook" but the module header (and the actual hook wiring
in `hooks/hooks.json`) identifies this script as a PostToolUse
hook — it runs *after* each tool invocation to update the running
session aggregate. One-word typo, no behavior change.
2026-05-18 10:03:42 +09:00
Jamkris
cd176504d3 fix(hooks): log fail-open breadcrumb on parse/read errors in metrics bridge
coderabbitai flagged: the two `catch` blocks in `readSessionCost`
silently swallowed every failure mode. A malformed `costs.jsonl`
row, a permission error opening the file, or any other unexpected
I/O failure would silently return zero cost — masking real
problems and feeding stale or zero numbers into
`ecc-context-monitor.js` (which then injects them as
`additionalContext` into the live model turn).

Fix two things, both fail-open-preserving:

1. **Inner JSON.parse catch** — count malformed lines and write
   one aggregated breadcrumb per call:

     [ecc-metrics-bridge] skipped N malformed line(s) in <path>

   Aggregating (rather than per-line) keeps a log-flooded
   `costs.jsonl` diagnosable without overwhelming stderr.

2. **Outer fs.readFileSync catch** — write a breadcrumb on real
   errors, but stay silent on `ENOENT`. The "no costs.jsonl yet"
   case is genuinely normal (no Stop event has fired this session)
   and producing noise on every PreToolUse before the first Stop
   would be reviewer-visible spam. All other error codes
   (`EACCES`, `EISDIR`, `EMFILE`, …) get:

     [ecc-metrics-bridge] failing open after <name> reading <path>: <msg>

In both cases the function still returns the zero-cost fallback
so the bridge never breaks tool execution — only the
diagnosability changes.

Two new regression tests in
`tests/hooks/ecc-metrics-bridge.test.js`:

  ✓ readSessionCost writes a stderr breadcrumb when malformed
    lines are skipped — feeds 4 rows (2 valid, 2 malformed),
    asserts the last valid row still wins AND captured stderr
    contains "skipped 2 malformed line(s)".

  ✓ readSessionCost stays silent when costs.jsonl does not exist
    (ENOENT) — uses a fresh tmp HOME with no metrics dir, asserts
    zero return AND empty stderr.

Test count: 16 → 18; `npm test` green; `yarn lint` clean.
2026-05-18 10:03:21 +09:00
Jamkris
44e13541fa fix(hooks): scan full costs.jsonl when locating session row
`readSessionCost` read only the trailing 8 KiB of
`~/.claude/metrics/costs.jsonl` to "avoid scanning entire file".
That ceiling is the opposite-sign sibling of the double-count bug
fixed in the previous commit: once a session's most recent
cumulative row gets pushed past the 8 KiB window by newer rows
from other sessions, the bridge silently reports `totalCost: 0`,
`totalIn: 0`, `totalOut: 0` for that session — same false signal
to `ecc-context-monitor.js`, same wrong number injected into the
live model turn as `additionalContext`.

`cost-tracker.js` has no rotation policy, so on any non-trivial
workstation costs.jsonl grows past 8 KiB within minutes of normal
use. For users who keep multiple concurrent sessions, this means
the second-and-later sessions silently report zero almost
immediately.

Reproduced before this commit:

  $ HOME=/tmp/eccc node -e '
      const fs = require("fs");
      const m = require("./scripts/hooks/ecc-metrics-bridge.js");
      // S1 row at file start, then 200 rows of OTHER-session noise (~16 KiB).
      // S1 is the row we want, but it sits past the 8 KiB tail.
      const s1 = `{"session_id":"S1","estimated_cost_usd":0.5,"input_tokens":500,"output_tokens":250}`;
      const other = `{"session_id":"OTHER","estimated_cost_usd":1,"input_tokens":100,"output_tokens":50}`;
      fs.mkdirSync("/tmp/eccc/.claude/metrics", { recursive: true });
      fs.writeFileSync("/tmp/eccc/.claude/metrics/costs.jsonl",
        [s1, ...Array(200).fill(other)].join("\\n") + "\\n");
      console.log(JSON.stringify(m.readSessionCost("S1")));'
  {"totalCost":0,"totalIn":0,"totalOut":0}

Expected: `{"totalCost":0.5, "totalIn":500, "totalOut":250}` (the
S1 row that exists in the file).
Actual: zero — the row is past the 8 KiB tail.

Fix: drop the `fs.openSync` + bounded `fs.readSync` + position
arithmetic in favour of `fs.readFileSync(costsPath, 'utf8')` and
iterate every line. Each row is ~150 bytes; even 100k rows is
~15 MB and a single sync read on PreToolUse is in the low ms.
If file rotation lands in `cost-tracker.js` later, this scan
becomes proportionally cheaper.

After this commit the reproduction above returns
`{"totalCost":0.5, "totalIn":500, "totalOut":250}`.

Regression test in `tests/hooks/ecc-metrics-bridge.test.js`:
`readSessionCost finds session row beyond the old 8 KiB tail
boundary`. The test asserts the costs.jsonl fixture is > 8 KiB
before reading so any reintroduction of a bounded tail would
re-fail the test (i.e. the assertion is the contract, not the
specific number 8192).

Together with the previous commit, both directions of the
metrics-bridge cost-reporting bug are closed.
2026-05-18 08:45:48 +09:00
Jamkris
e61bb043ed fix(hooks): use last cumulative row for session cost in metrics bridge
`ecc-metrics-bridge.js#readSessionCost` summed the
`estimated_cost_usd`, `input_tokens`, and `output_tokens` of
every matching row in `~/.claude/metrics/costs.jsonl`. That breaks
the documented contract of `scripts/hooks/cost-tracker.js`, which
explicitly states (in its module docblock):

  Cumulative behavior: Stop fires per assistant response, not
  per session. Each row therefore represents the cumulative
  session total up to that point. To get per-session cost, take
  the last row per session_id.

Summing N cumulative rows over-counts by roughly (N+1)/2 ×. For a
session with 3 rows at 0.01, 0.02, 0.03 USD (true running total
0.03), the bridge today reports 0.06 USD. The over-counted value
feeds `ecc-context-monitor.js`, which then trips its
COST_NOTICE_USD / COST_WARNING_USD / COST_CRITICAL_USD thresholds
on phantom spend AND injects the inflated number as
`additionalContext` into the live model turn — so the agent
itself is told a wrong cost.

Reproduced on `main` before this commit:

  $ cat > /tmp/eccc/.claude/metrics/costs.jsonl <<EOF
  {"session_id":"S1","estimated_cost_usd":0.01,"input_tokens":333,"output_tokens":166}
  {"session_id":"S1","estimated_cost_usd":0.02,"input_tokens":666,"output_tokens":333}
  {"session_id":"S1","estimated_cost_usd":0.03,"input_tokens":1000,"output_tokens":500}
  EOF

  $ HOME=/tmp/eccc node -e 'const m = require("./scripts/hooks/ecc-metrics-bridge.js"); \
      console.log(JSON.stringify(m.readSessionCost("S1")))'
  {"totalCost":0.06,"totalIn":1999,"totalOut":999}

Expected: `{"totalCost":0.03,"totalIn":1000,"totalOut":500}` (the
last cumulative row).
Actual: 2× over-count.

Fix: replace `+=` with `=` in the matching branch so the assigned
values reflect the most recent row encountered. The iteration
order is file order, which is also event time order, so the last
assignment wins — exactly the contract cost-tracker writes
against.

After this commit the reproduction above returns
`{"totalCost":0.03,"totalIn":1000,"totalOut":500}`.

Regression test in `tests/hooks/ecc-metrics-bridge.test.js`:
`readSessionCost returns the LAST cumulative row, not the sum
(cost-tracker contract)`. The existing
`readSessionCost does not include unrelated default-session rows`
test happened to pass even with the bug because it only had one
target-session row — single-row sessions are coincidentally
correct under both formulas. The new test uses three rows so the
two formulas diverge.

A second issue in the same function — the 8 KiB tail-only read
silently drops older rows once a session's recent cumulative
totals scroll past that window — is fixed in the next commit.
2026-05-18 08:45:48 +09:00
2 changed files with 303 additions and 27 deletions

View File

@@ -11,6 +11,7 @@
const crypto = require('crypto');
const fs = require('fs');
const os = require('os');
const path = require('path');
const { sanitizeSessionId, readBridge, writeBridgeAtomic } = require('../lib/session-bridge');
const { getClaudeDir } = require('../lib/utils');
@@ -19,6 +20,7 @@ const MAX_STDIN = 1024 * 1024;
const MAX_FILES_TRACKED = 200;
const RECENT_TOOLS_SIZE = 5;
const HASH_INPUT_LIMIT = 2048;
const WARNING_CACHE_PREFIX = 'ecc-metrics-cost-warnings-';
function toNumber(value) {
const n = Number(value);
@@ -76,41 +78,104 @@ function extractFilePaths(toolName, toolInput) {
return paths;
}
function getCostWarningCachePath(costsPath) {
const hash = crypto.createHash('sha256').update(costsPath).digest('hex').slice(0, 16);
return path.join(os.tmpdir(), `${WARNING_CACHE_PREFIX}${hash}.json`);
}
function readCostWarningCache(cachePath) {
try {
const parsed = JSON.parse(fs.readFileSync(cachePath, 'utf8'));
return parsed && typeof parsed === 'object' && !Array.isArray(parsed) ? parsed : {};
} catch {
return {};
}
}
function writeCostWarningIfChanged(kind, costsPath, signature, message) {
const cachePath = getCostWarningCachePath(costsPath);
const cache = readCostWarningCache(cachePath);
if (cache[kind] === signature) return;
process.stderr.write(message);
try {
const next = { ...cache, [kind]: signature };
const tmp = `${cachePath}.${process.pid}.tmp`;
fs.writeFileSync(tmp, JSON.stringify(next), 'utf8');
fs.renameSync(tmp, cachePath);
} catch {
// Warning-cache persistence is best effort; never block hook execution.
}
}
/**
* Read cumulative cost for a session from the tail of costs.jsonl.
* Reads last 8KB to avoid scanning entire file.
* Read cumulative cost for a session from costs.jsonl.
*
* Scans the full file because each row is a cumulative session total
* (see cost-tracker.js docblock) and the row we need is the last one
* matching `sessionId`. The previous implementation read only the
* trailing 8 KiB; any session whose latest cumulative row was pushed
* past that window by newer rows from other sessions silently dropped
* to zero — the opposite sign of the double-count bug fixed in the
* previous commit.
*
* costs.jsonl is append-only and unbounded today (no rotation in
* cost-tracker.js). At a typical ~150 bytes per row, even 100k rows
* is ~15 MB and a single sync read on every PostToolUse hook is in
* the low milliseconds. If rotation lands later, this scan becomes
* even cheaper.
*/
function readSessionCost(sessionId) {
let costsPath = path.join('metrics', 'costs.jsonl');
try {
const costsPath = path.join(getClaudeDir(), 'metrics', 'costs.jsonl');
const stat = fs.statSync(costsPath);
const readSize = Math.min(stat.size, 8192);
const fd = fs.openSync(costsPath, 'r');
try {
const buf = Buffer.alloc(readSize);
fs.readSync(fd, buf, 0, readSize, Math.max(0, stat.size - readSize));
const lines = buf.toString('utf8').split('\n').filter(Boolean);
costsPath = path.join(getClaudeDir(), 'metrics', 'costs.jsonl');
const content = fs.readFileSync(costsPath, 'utf8');
const lines = content.split('\n').filter(Boolean);
let totalCost = 0;
let totalIn = 0;
let totalOut = 0;
for (const line of lines) {
try {
const row = JSON.parse(line);
if (row.session_id === sessionId) {
totalCost += toNumber(row.estimated_cost_usd);
totalIn += toNumber(row.input_tokens);
totalOut += toNumber(row.output_tokens);
}
} catch {
/* skip malformed lines */
let totalCost = 0;
let totalIn = 0;
let totalOut = 0;
let malformed = 0;
const malformedHasher = crypto.createHash('sha256');
for (const line of lines) {
try {
const row = JSON.parse(line);
if (row.session_id === sessionId) {
totalCost = toNumber(row.estimated_cost_usd);
totalIn = toNumber(row.input_tokens);
totalOut = toNumber(row.output_tokens);
}
} catch {
malformed += 1;
malformedHasher.update(line).update('\0');
}
return { totalCost, totalIn, totalOut };
} finally {
fs.closeSync(fd);
}
} catch {
// One aggregated breadcrumb per call rather than one per bad row, so a
// log-flooded costs.jsonl stays diagnosable without overwhelming stderr.
// Suppress repeats for the same malformed-line signature across hook
// subprocesses, so a persistent bad row should not spam stderr.
if (malformed > 0) {
writeCostWarningIfChanged(
'malformed',
costsPath,
`${malformed}:${malformedHasher.digest('hex').slice(0, 16)}`,
`[ecc-metrics-bridge] skipped ${malformed} malformed line(s) in ${costsPath}\n`
);
}
return { totalCost, totalIn, totalOut };
} catch (err) {
// ENOENT is the common case (no Stop event has fired yet this session)
// and is not actually a failure — stay silent on it. Anything else
// (permission, EISDIR, malformed read) deserves a breadcrumb because
// the bridge will silently report zero cost otherwise.
if (err && err.code !== 'ENOENT') {
writeCostWarningIfChanged(
'read-error',
costsPath,
`${err.code || err.name || 'error'}:${err.message || String(err)}`,
`[ecc-metrics-bridge] failing open after ${err.name || 'error'} reading ${costsPath}: ${err.message || String(err)}\n`
);
}
return { totalCost: 0, totalIn: 0, totalOut: 0 };
}
}

View File

@@ -5,6 +5,7 @@
*/
const assert = require('assert');
const { spawnSync } = require('child_process');
const fs = require('fs');
const os = require('os');
const path = require('path');
@@ -145,6 +146,216 @@ function runTests() {
passed++;
else failed++;
if (
test('readSessionCost returns the LAST cumulative row, not the sum (cost-tracker contract)', () => {
// cost-tracker.js writes one row per Stop event; each row is already
// a cumulative session total ("To get per-session cost, take the
// last row per session_id."). Summing across rows over-counts:
// 0.01 + 0.02 + 0.03 = 0.06, but the correct answer is 0.03.
const tmpHome = makeTempHome();
const originalHome = process.env.HOME;
const originalUserProfile = process.env.USERPROFILE;
try {
process.env.HOME = tmpHome;
process.env.USERPROFILE = tmpHome;
const metricsDir = path.join(tmpHome, '.claude', 'metrics');
fs.mkdirSync(metricsDir, { recursive: true });
fs.writeFileSync(
path.join(metricsDir, 'costs.jsonl'),
[
JSON.stringify({ session_id: 'S1', estimated_cost_usd: 0.01, input_tokens: 333, output_tokens: 166 }),
JSON.stringify({ session_id: 'S1', estimated_cost_usd: 0.02, input_tokens: 666, output_tokens: 333 }),
JSON.stringify({ session_id: 'S1', estimated_cost_usd: 0.03, input_tokens: 1000, output_tokens: 500 })
].join('\n') + '\n',
'utf8'
);
const result = readSessionCost('S1');
assert.strictEqual(result.totalCost, 0.03, `expected last-row 0.03, got ${result.totalCost} (was the bug: 0.06)`);
assert.strictEqual(result.totalIn, 1000);
assert.strictEqual(result.totalOut, 500);
} finally {
if (originalHome === undefined) delete process.env.HOME;
else process.env.HOME = originalHome;
if (originalUserProfile === undefined) delete process.env.USERPROFILE;
else process.env.USERPROFILE = originalUserProfile;
fs.rmSync(tmpHome, { recursive: true, force: true });
}
})
)
passed++;
else failed++;
if (
test('readSessionCost finds session row beyond the old 8 KiB tail boundary', () => {
// The previous implementation read only the trailing 8 KiB of
// costs.jsonl. A long-running deployment where the target session's
// most recent cumulative row sat further back than that — e.g.
// pushed past by many rows from OTHER sessions — silently saw
// cost=0. This test wedges the S1 row at the file start, fills
// ~16 KiB of OTHER-session noise after it, and asserts the S1 row
// is still found.
const tmpHome = makeTempHome();
const originalHome = process.env.HOME;
const originalUserProfile = process.env.USERPROFILE;
try {
process.env.HOME = tmpHome;
process.env.USERPROFILE = tmpHome;
const metricsDir = path.join(tmpHome, '.claude', 'metrics');
fs.mkdirSync(metricsDir, { recursive: true });
const otherRow = JSON.stringify({ session_id: 'OTHER', estimated_cost_usd: 1, input_tokens: 100, output_tokens: 50 });
const s1Row = JSON.stringify({ session_id: 'S1', estimated_cost_usd: 0.5, input_tokens: 500, output_tokens: 250 });
const rows = [s1Row, ...Array(200).fill(otherRow)];
fs.writeFileSync(path.join(metricsDir, 'costs.jsonl'), rows.join('\n') + '\n', 'utf8');
// Confirm we're actually past the old 8 KiB ceiling so the test
// would have failed under the previous implementation.
const size = fs.statSync(path.join(metricsDir, 'costs.jsonl')).size;
assert.ok(size > 8192, `setup: expected costs.jsonl > 8 KiB, got ${size} bytes`);
const result = readSessionCost('S1');
assert.strictEqual(result.totalCost, 0.5);
assert.strictEqual(result.totalIn, 500);
assert.strictEqual(result.totalOut, 250);
} finally {
if (originalHome === undefined) delete process.env.HOME;
else process.env.HOME = originalHome;
if (originalUserProfile === undefined) delete process.env.USERPROFILE;
else process.env.USERPROFILE = originalUserProfile;
fs.rmSync(tmpHome, { recursive: true, force: true });
}
})
)
passed++;
else failed++;
if (
test('readSessionCost writes one stderr breadcrumb when malformed lines persist across calls', () => {
// Reviewer (coderabbitai) asked for diagnosability when the inner
// catch silently skips malformed JSON rows. Verify the aggregated
// "skipped N malformed line(s)" breadcrumb appears on stderr while
// the function still recovers the last valid matching row. Because
// this hook runs after every tool invocation, the same bad rows should
// not emit the same warning on every call.
const tmpHome = makeTempHome();
const originalHome = process.env.HOME;
const originalUserProfile = process.env.USERPROFILE;
const originalStderrWrite = process.stderr.write.bind(process.stderr);
let captured = '';
process.stderr.write = chunk => {
captured += String(chunk);
return true;
};
try {
process.env.HOME = tmpHome;
process.env.USERPROFILE = tmpHome;
const metricsDir = path.join(tmpHome, '.claude', 'metrics');
fs.mkdirSync(metricsDir, { recursive: true });
fs.writeFileSync(
path.join(metricsDir, 'costs.jsonl'),
[
JSON.stringify({ session_id: 'S1', estimated_cost_usd: 0.5, input_tokens: 500, output_tokens: 250 }),
'NOT_JSON',
'{"truncated":',
JSON.stringify({ session_id: 'S1', estimated_cost_usd: 0.7, input_tokens: 700, output_tokens: 350 }),
].join('\n') + '\n',
'utf8'
);
const result = readSessionCost('S1');
assert.strictEqual(result.totalCost, 0.7, 'last valid row should still win');
const secondResult = readSessionCost('S1');
assert.deepStrictEqual(secondResult, result);
const matches = captured.match(/skipped 2 malformed line\(s\)/g) || [];
assert.strictEqual(matches.length, 1,
`expected one aggregated malformed-line breadcrumb on stderr, got: ${captured}`);
} finally {
process.stderr.write = originalStderrWrite;
if (originalHome === undefined) delete process.env.HOME;
else process.env.HOME = originalHome;
if (originalUserProfile === undefined) delete process.env.USERPROFILE;
else process.env.USERPROFILE = originalUserProfile;
fs.rmSync(tmpHome, { recursive: true, force: true });
}
})
)
passed++;
else failed++;
if (
test('readSessionCost suppresses repeated malformed breadcrumbs across hook subprocesses', () => {
const tmpHome = makeTempHome();
const originalHome = process.env.HOME;
const originalUserProfile = process.env.USERPROFILE;
try {
process.env.HOME = tmpHome;
process.env.USERPROFILE = tmpHome;
const metricsDir = path.join(tmpHome, '.claude', 'metrics');
fs.mkdirSync(metricsDir, { recursive: true });
fs.writeFileSync(
path.join(metricsDir, 'costs.jsonl'),
[
JSON.stringify({ session_id: 'S1', estimated_cost_usd: 0.7, input_tokens: 700, output_tokens: 350 }),
'NOT_JSON',
'{"truncated":'
].join('\n') + '\n',
'utf8'
);
const bridgePath = path.resolve(__dirname, '../../scripts/hooks/ecc-metrics-bridge');
const code = "const { readSessionCost } = require(process.argv[1]); readSessionCost('S1');";
const env = { ...process.env, HOME: tmpHome, USERPROFILE: tmpHome };
const first = spawnSync(process.execPath, ['-e', code, bridgePath], { env, encoding: 'utf8' });
const second = spawnSync(process.execPath, ['-e', code, bridgePath], { env, encoding: 'utf8' });
assert.strictEqual(first.status, 0, first.stderr || first.stdout);
assert.strictEqual(second.status, 0, second.stderr || second.stdout);
assert.match(first.stderr, /skipped 2 malformed line\(s\)/);
assert.strictEqual(second.stderr, '', `expected repeat subprocess warning suppression, got: ${second.stderr}`);
} finally {
if (originalHome === undefined) delete process.env.HOME;
else process.env.HOME = originalHome;
if (originalUserProfile === undefined) delete process.env.USERPROFILE;
else process.env.USERPROFILE = originalUserProfile;
fs.rmSync(tmpHome, { recursive: true, force: true });
}
})
)
passed++;
else failed++;
if (
test('readSessionCost stays silent when costs.jsonl does not exist (ENOENT)', () => {
// ENOENT is the common case before any Stop event has fired — it is
// not a failure and should not produce stderr noise. Other errors
// (permission, EISDIR, etc.) DO produce a breadcrumb, covered by the
// malformed-line test above's surrounding harness.
const tmpHome = makeTempHome();
const originalHome = process.env.HOME;
const originalUserProfile = process.env.USERPROFILE;
const originalStderrWrite = process.stderr.write.bind(process.stderr);
let captured = '';
process.stderr.write = chunk => {
captured += String(chunk);
return true;
};
try {
process.env.HOME = tmpHome;
process.env.USERPROFILE = tmpHome;
// Do NOT create the metrics dir or file — readSessionCost should
// hit ENOENT and return zeros silently.
const result = readSessionCost('S1');
assert.deepStrictEqual(result, { totalCost: 0, totalIn: 0, totalOut: 0 });
assert.strictEqual(captured, '', `expected no stderr on ENOENT, got: ${captured}`);
} finally {
process.stderr.write = originalStderrWrite;
if (originalHome === undefined) delete process.env.HOME;
else process.env.HOME = originalHome;
if (originalUserProfile === undefined) delete process.env.USERPROFILE;
else process.env.USERPROFILE = originalUserProfile;
fs.rmSync(tmpHome, { recursive: true, force: true });
}
})
)
passed++;
else failed++;
if (
test('readSessionCost does not include unrelated default-session rows', () => {
const tmpHome = makeTempHome();