everything-claude-code/tests/hooks/run-with-flags-truncation.test.js
Affaan Mustafa 6319c7d309 fix: stability batch — hook stdin truncation, Codex exa TOML, Stop hook JSON, GateGuard repetition (#2227)
* fix(hooks): fail open on oversized stdin instead of echoing truncated JSON (#2222)

run-with-flags.js capped stdin at 1MB but every fallthrough path still
echoed the truncated string to stdout. The harness parses hook stdout as
JSON, got a document cut mid-stream, and blocked the tool call — so any
Edit/Write with a >1MB hook payload was permanently blocked by every
registered pre-write hook, before ECC_HOOK_PROFILE / ECC_DISABLED_HOOKS
gating could run.

- Exit 0 with empty stdout (no opinion) when the stdin cap trips, before
  any echo or gating logic.
- Flush stdout via write callback before process.exit: exiting right
  after stdout.write() dropped everything past the ~64KB pipe buffer,
  cutting even sub-cap pass-through payloads mid-JSON.

Regression tests cover the enabled, disabled, and missing-arg paths for
oversized payloads plus full echo of sub-cap >64KB payloads.
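
The two fixes above can be sketched as follows. This is a minimal illustration, not the code in run-with-flags.js: the helper names (collectCapped, passThroughOutput, writeThenExit) are hypothetical, and the cap is counted in string characters here for simplicity.

```javascript
// Hypothetical cap; the commit message describes a 1MB stdin limit.
const MAX_STDIN = 1024 * 1024;

// Accumulate chunks up to the cap and record whether anything was dropped.
function collectCapped(chunks, max = MAX_STDIN) {
  let data = '';
  let truncated = false;
  for (const chunk of chunks) {
    const remaining = max - data.length;
    if (chunk.length > remaining) {
      data += chunk.slice(0, Math.max(remaining, 0));
      truncated = true;
      break;
    }
    data += chunk;
  }
  return { data, truncated };
}

// Fail open: a truncated document is never echoed, so stdout stays empty.
function passThroughOutput({ data, truncated }) {
  return truncated ? '' : data;
}

// Exit only from the write callback, so a piped stdout is flushed first.
// `exit` is injectable purely so the sketch can be exercised without exiting.
function writeThenExit(stream, output, code, exit = process.exit) {
  if (output.length === 0) {
    exit(code);
    return;
  }
  stream.write(output, () => exit(code));
}
```

In a real runner the last call would look like writeThenExit(process.stdout, output, 0). Calling process.exit() directly after stdout.write() is what dropped the tail of large payloads on a pipe.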

* fix(codex): stop emitting invalid exa url entry, align merge with connector policy (#2224)

The Codex MCP merge declared exa with a url key, but Codex's
[mcp_servers.*] TOML schema is stdio-only — the url key makes the
entire config.toml fail to load, bricking both the codex CLI and the
desktop app. Every install/update re-injected the line because the
urlEntry branch treated the broken entry as present.

- ECC_SERVERS now emits only the current default set per
  docs/MCP-CONNECTOR-POLICY.md: chrome-devtools (stdio, command/args).
  Retired servers (supabase, playwright, context7, exa, github, memory,
  sequential-thinking) are never re-emitted; existing user-managed
  entries are untouched.
- The merge now repairs the exact ECC-emitted broken form (url-only
  exa entry) on every run so re-running the installer fixes broken
  configs instead of preserving them. User stdio exa entries
  (command + mcp-remote) are left alone.
- check-codex-global-state.sh requires chrome-devtools instead of the
  retired set, and flags url-only exa entries with a repair hint.

Tests cover repair, re-run idempotence, stdio-entry preservation, and
no-retired-server emission in add, update, dry-run, and disabled modes.
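
A rough sketch of the repair rule, assuming the config has already been parsed into a plain object keyed by server name. The function names are hypothetical, and dropping the broken entry is an assumption about what "repair" means here; the actual merge script may rewrite the TOML text differently.

```javascript
// An entry counts as the broken ECC-emitted form when it has a url
// and no stdio command. User-managed stdio entries never match this.
function isUrlOnlyEntry(entry) {
  return (
    typeof entry === 'object' &&
    entry !== null &&
    typeof entry.url === 'string' &&
    !('command' in entry)
  );
}

// Return a copy of the server map with a url-only exa entry removed.
// Every other entry is passed through untouched.
function repairServers(servers) {
  const repaired = { ...servers };
  if (isUrlOnlyEntry(repaired.exa)) {
    delete repaired.exa;
  }
  return repaired;
}
```

Because the check only matches the url-only shape, running it a second time on its own output changes nothing, which is the idempotence property the tests describe.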

* fix(hooks): never echo truncated stdin from Stop hooks (#2090)

Stop hooks follow the ECC pass-through convention (echo stdin on
stdout), but every echoing Stop hook capped stdin and echoed the capped
string. The Stop payload carries last_assistant_message, so a long
final assistant message produced a JSON document cut mid-stream on
stdout, which the harness reports as 'Stop hook error: JSON validation
failed' across the whole Stop chain.

Reproduced: a Stop payload with a >64KB last_assistant_message run
through run-with-flags + cost-tracker emitted exactly 65536 bytes of
invalid JSON (cost-tracker capped stdin at 64KB — far below realistic
Stop payloads).

- cost-tracker: raise the cap to 1MB (matching all other hooks) and
  suppress the pass-through echo when stdin was truncated.
- check-console-log, stop-format-typecheck, desktop-notify: suppress
  the echo when stdin was truncated; flush stdout before process.exit
  so sub-cap payloads are not cut at the ~64KB pipe buffer.
- All hooks keep exiting 0 (fail-open); diagnostics go to stderr.

New stop-hooks-stdout test asserts the contract for every registered
Stop hook: stdout is empty or valid JSON, exit code 0 — for realistic
100KB payloads and oversized >1MB payloads, via the production runner
and via direct invocation. Updated the old hooks.test.js case that
codified the truncated-echo behavior.
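
The contract that test asserts can be expressed as a small predicate. This is a sketch with a hypothetical helper name (checkStopHookResult); it takes the status and stdout fields that child_process.spawnSync returns.

```javascript
// A Stop hook result is acceptable when the exit code is 0 and stdout is
// either empty or a complete JSON document.
function checkStopHookResult({ status, stdout }) {
  if (status !== 0) {
    return { ok: false, reason: `exit code ${status}` };
  }
  if (stdout === '') {
    return { ok: true, reason: 'empty stdout' };
  }
  try {
    JSON.parse(stdout);
    return { ok: true, reason: 'valid JSON' };
  } catch (error) {
    return { ok: false, reason: 'invalid JSON on stdout' };
  }
}
```

A payload cut mid-stream, such as the 65536-byte case reproduced above, fails the JSON.parse branch, while a suppressed echo passes as empty stdout.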

* fix(hooks): dampen GateGuard fact-force repetition in long sessions (#2142)

In long autonomous sessions the fact-force gate produced 10+
near-identical 'state facts -> blocked -> restate -> retry' blocks in
one context window, which measurably raises the odds of the model
collapsing into a degenerate single-token repetition loop.

- Track a per-session fact_force_denials counter in GateGuard state
  (merged max across concurrent writers, reset with the session, robust
  to malformed on-disk values).
- The first GATEGUARD_FACT_FORCE_FULL_DENIALS denials (default 3) keep
  the full four-fact block; later denials emit a condensed single-line
  message that carries the denial ordinal, so consecutive denials are
  structurally different and never textually identical.
- True retries of the same target remain allowed without re-prompting
  (unchanged). Destructive-Bash and routine-Bash gates are unchanged,
  as are the ECC_GATEGUARD=off / ECC_DISABLED_HOOKS escape hatches.

Eight new tests cover budget counting, condensed format, ordinal
advancement, retry pass-through, env tuning, malformed state, MultiEdit
dampening, and destructive-gate exemption.
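
The denial budget can be sketched like this. Only fact_force_denials, GATEGUARD_FACT_FORCE_FULL_DENIALS, and the default of 3 come from the description above; the function names and message wording are hypothetical.

```javascript
const DEFAULT_FULL_DENIALS = 3;

// Treat anything that is not a non-negative integer as zero, so a
// malformed on-disk value cannot break the counter.
function sanitizeCount(value) {
  return Number.isInteger(value) && value >= 0 ? value : 0;
}

// Concurrent writers are reconciled by keeping the larger count.
function mergeDenials(a, b) {
  return Math.max(sanitizeCount(a), sanitizeCount(b));
}

// Read the full-block budget from the environment, falling back to 3.
function fullBudgetFromEnv(env = process.env) {
  const parsed = Number.parseInt(env.GATEGUARD_FACT_FORCE_FULL_DENIALS, 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : DEFAULT_FULL_DENIALS;
}

// Advance the counter and pick the full or condensed message.
function nextDenial(state, fullBudget = DEFAULT_FULL_DENIALS) {
  const ordinal = sanitizeCount(state.fact_force_denials) + 1;
  const message =
    ordinal <= fullBudget
      ? 'Blocked: state the four required facts, then retry.' // placeholder for the full block
      : `Blocked (denial #${ordinal}): required facts still missing.`;
  return { state: { ...state, fact_force_denials: ordinal }, message };
}
```

Because the condensed message embeds the ordinal, two consecutive condensed denials can never be textually identical.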

* fix(hooks): keep security hooks able to block on oversized stdin (#2222)

Refine the truncation fail-open: instead of skipping the hook entirely,
the runner now suppresses only its own raw-echo when stdin was
truncated. The hook still executes and receives the truncated flag
(run() context / ECC_HOOK_INPUT_TRUNCATED), so config-protection keeps
blocking truncated protected-config payloads (its test requires exit 2)
while pass-through hooks fail open with empty stdout as before.
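
How a hook might react to the truncated flag, as a sketch. The context.truncated field and the '1' value for ECC_HOOK_INPUT_TRUNCATED are assumptions; the commit message only says that the flag reaches the hook through the run() context and that environment variable.

```javascript
// Assumed flag shape: context.truncated === true or the env var set to '1'.
function isTruncated(context = {}, env = process.env) {
  return context.truncated === true || env.ECC_HOOK_INPUT_TRUNCATED === '1';
}

// A pass-through hook fails open: exit 0 and no echo of the cut document.
function passThroughHook(rawInput, context, env) {
  if (isTruncated(context, env)) {
    return { exitCode: 0, stdout: '', stderr: 'input truncated, fail-open' };
  }
  return { exitCode: 0, stdout: rawInput, stderr: '' };
}

// A protective hook refuses to fail open: it cannot inspect the full
// payload, so it blocks with exit 2 and still echoes nothing.
function protectiveHook(rawInput, context, env) {
  if (isTruncated(context, env)) {
    return { exitCode: 2, stdout: '', stderr: 'input truncated, blocking' };
  }
  return { exitCode: 0, stdout: rawInput, stderr: '' };
}
```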

* style: apply repo formatter to touched hook files
2026-06-11 00:31:33 -04:00

155 lines
5.7 KiB
JavaScript

/**
 * Regression tests for #2222: run-with-flags.js must fail open on >1MB stdin.
 *
 * Before the fix, every fallthrough path echoed the truncated payload to
 * stdout. The harness parses hook stdout as JSON, got a document cut
 * mid-stream, and treated the hook as failed, blocking every Edit/Write
 * whose hook payload exceeded the 1MB cap.
 */
'use strict';
const assert = require('assert');
const path = require('path');
const { spawnSync } = require('child_process');
const repoRoot = path.join(__dirname, '..', '..');
const runner = path.join(repoRoot, 'scripts', 'hooks', 'run-with-flags.js');
const MAX_STDIN = 1024 * 1024;
// Run one test case and report the outcome; returns true on pass.
function test(name, fn) {
  try {
    fn();
    console.log(`  ✓ ${name}`);
    return true;
  } catch (error) {
    console.log(`  ✗ ${name}`);
    console.log(`    Error: ${error.message}`);
    return false;
  }
}
// Spawn the production runner with the given args, stdin payload, and env overrides.
function runRunner(args, input, env = {}) {
  return spawnSync('node', [runner, ...args], {
    input,
    encoding: 'utf8',
    cwd: repoRoot,
    env: { ...process.env, ...env },
    timeout: 30000,
    maxBuffer: 16 * 1024 * 1024,
    stdio: ['pipe', 'pipe', 'pipe']
  });
}
function oversizedPayload() {
  // JSON document that exceeds MAX_STDIN so the runner's stdin cap trips.
  return JSON.stringify({
    tool_name: 'Write',
    tool_input: { file_path: '/tmp/big.md', content: 'x'.repeat(MAX_STDIN + 64 * 1024) }
  });
}
console.log('\nrun-with-flags truncation (fail-open) tests:');
let passed = 0;
let failed = 0;
if (
  test('oversized payload exits 0 with empty stdout for an enabled hook', () => {
    const result = runRunner(['pre:write:doc-file-warning', 'scripts/hooks/doc-file-warning.js', 'standard,strict'], oversizedPayload());
    assert.strictEqual(result.status, 0, `expected exit 0, got ${result.status}: ${result.stderr}`);
    assert.strictEqual(result.stdout, '', `stdout must be empty, got: ${result.stdout.slice(0, 120)}...`);
    assert.match(result.stderr, /stdin exceeded \d+ bytes for pre:write:doc-file-warning/);
    assert.match(result.stderr, /fail-open/);
  })
)
  passed++;
else failed++;
if (
  test('oversized payload never echoes truncated stdin when hook args are missing', () => {
    const result = runRunner([], oversizedPayload());
    assert.strictEqual(result.status, 0);
    assert.strictEqual(result.stdout, '', 'missing-args path must not echo truncated stdin');
  })
)
  passed++;
else failed++;
if (
  test('oversized payload never echoes truncated stdin for a disabled hook', () => {
    const result = runRunner(['pre:write:doc-file-warning', 'scripts/hooks/doc-file-warning.js', 'standard,strict'], oversizedPayload(), { ECC_DISABLED_HOOKS: 'pre:write:doc-file-warning' });
    assert.strictEqual(result.status, 0);
    assert.strictEqual(result.stdout, '', 'disabled-hook path must not echo truncated stdin');
  })
)
  passed++;
else failed++;
if (
  test('normal-sized payload still passes through unchanged', () => {
    const payload = JSON.stringify({
      tool_name: 'Write',
      tool_input: { file_path: '/tmp/small.js', content: 'const x = 1;\n' }
    });
    const result = runRunner(['pre:write:doc-file-warning', 'scripts/hooks/doc-file-warning.js', 'standard,strict'], payload);
    assert.strictEqual(result.status, 0, `expected exit 0, got ${result.status}: ${result.stderr}`);
    assert.ok(result.stdout.length > 0, 'normal payloads keep the pass-through behavior');
    JSON.parse(result.stdout); // stdout must remain valid JSON
  })
)
  passed++;
else failed++;
if (
  test('a security hook can still block on an oversized payload (no blanket skip)', () => {
    // config-protection refuses to fail open on truncated payloads. The
    // runner must still execute the hook and forward its verdict; only the
    // runner's own raw-echo is suppressed.
    const payload = JSON.stringify({
      tool_name: 'Write',
      tool_input: { file_path: '.eslintrc.js', content: 'x'.repeat(MAX_STDIN + 2048) }
    });
    const result = runRunner(['pre:config-protection', 'scripts/hooks/config-protection.js', 'standard,strict'], payload);
    assert.strictEqual(result.status, 2, `expected block exit 2, got ${result.status}: ${result.stderr}`);
    assert.strictEqual(result.stdout, '', 'blocked truncated payload must not echo raw input');
  })
)
  passed++;
else failed++;
if (
  test('payload just under the cap echoes through completely (no 64KB pipe cut)', () => {
    // process.exit() right after stdout.write() used to drop everything past
    // the ~64KB pipe buffer, cutting the echoed JSON mid-stream.
    const content = 'y'.repeat(MAX_STDIN - 1024);
    const payload = JSON.stringify({ tool_name: 'Write', tool_input: { file_path: '/tmp/edge.md', content } });
    assert.ok(payload.length < MAX_STDIN, 'fixture must stay under the stdin cap');
    const result = runRunner([], payload);
    assert.strictEqual(result.status, 0);
    assert.strictEqual(result.stdout.length, payload.length, 'echo must not be cut at the pipe buffer');
    assert.strictEqual(result.stdout, payload, 'sub-cap payloads still echo through fallthrough paths');
  })
)
  passed++;
else failed++;
if (
  test('disabled-hook passthrough of a >64KB payload stays valid JSON', () => {
    const payload = JSON.stringify({
      tool_name: 'Write',
      tool_input: { file_path: '/tmp/medium.md', content: 'z'.repeat(256 * 1024) }
    });
    const result = runRunner(['pre:write:doc-file-warning', 'scripts/hooks/doc-file-warning.js', 'standard,strict'], payload, { ECC_DISABLED_HOOKS: 'pre:write:doc-file-warning' });
    assert.strictEqual(result.status, 0);
    assert.strictEqual(result.stdout, payload);
    JSON.parse(result.stdout);
  })
)
  passed++;
else failed++;
console.log(`\n ${passed} passed, ${failed} failed\n`);
process.exit(failed > 0 ? 1 : 0);