mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-06-12 19:23:07 +08:00
fix: stability batch — hook stdin truncation, Codex exa TOML, Stop hook JSON, GateGuard repetition (#2227)
* fix(hooks): fail open on oversized stdin instead of echoing truncated JSON (#2222) run-with-flags.js capped stdin at 1MB but every fallthrough path still echoed the truncated string to stdout. The harness parses hook stdout as JSON, got a document cut mid-stream, and blocked the tool call — so any Edit/Write with a >1MB hook payload was permanently blocked by every registered pre-write hook, before ECC_HOOK_PROFILE / ECC_DISABLED_HOOKS gating could run. - Exit 0 with empty stdout (no opinion) when the stdin cap trips, before any echo or gating logic. - Flush stdout via write callback before process.exit: exiting right after stdout.write() dropped everything past the ~64KB pipe buffer, cutting even sub-cap pass-through payloads mid-JSON. Regression tests cover the enabled, disabled, and missing-arg paths for oversized payloads plus full echo of sub-cap >64KB payloads. * fix(codex): stop emitting invalid exa url entry, align merge with connector policy (#2224) The Codex MCP merge declared exa with a url key, but Codex's [mcp_servers.*] TOML schema is stdio-only — the url key makes the entire config.toml fail to load, bricking both the codex CLI and the desktop app. Every install/update re-injected the line because the urlEntry branch treated the broken entry as present. - ECC_SERVERS now emits only the current default set per docs/MCP-CONNECTOR-POLICY.md: chrome-devtools (stdio, command/args). Retired servers (supabase, playwright, context7, exa, github, memory, sequential-thinking) are never re-emitted; existing user-managed entries are untouched. - The merge now repairs the exact ECC-emitted broken form (url-only exa entry) on every run so re-running the installer fixes broken configs instead of preserving them. User stdio exa entries (command + mcp-remote) are left alone. - check-codex-global-state.sh requires chrome-devtools instead of the retired set, and flags url-only exa entries with a repair hint. Tests cover repair, re-run idempotence, stdio-entry preservation, and no-retired-server emission in add, update, dry-run, and disabled modes. * fix(hooks): never echo truncated stdin from Stop hooks (#2090) Stop hooks follow the ECC pass-through convention (echo stdin on stdout), but every echoing Stop hook capped stdin and echoed the capped string. The Stop payload carries last_assistant_message, so a long final assistant message produced a JSON document cut mid-stream on stdout, which the harness reports as 'Stop hook error: JSON validation failed' across the whole Stop chain. Reproduced: a Stop payload with a >64KB last_assistant_message run through run-with-flags + cost-tracker emitted exactly 65536 bytes of invalid JSON (cost-tracker capped stdin at 64KB — far below realistic Stop payloads). - cost-tracker: raise the cap to 1MB (matching all other hooks) and suppress the pass-through echo when stdin was truncated. - check-console-log, stop-format-typecheck, desktop-notify: suppress the echo when stdin was truncated; flush stdout before process.exit so sub-cap payloads are not cut at the ~64KB pipe buffer. - All hooks keep exiting 0 (fail-open); diagnostics go to stderr. New stop-hooks-stdout test asserts the contract for every registered Stop hook: stdout is empty or valid JSON, exit code 0 — for realistic 100KB payloads and oversized >1MB payloads, via the production runner and via direct invocation. Updated the old hooks.test.js case that codified the truncated-echo behavior. * fix(hooks): dampen GateGuard fact-force repetition in long sessions (#2142) In long autonomous sessions the fact-force gate produced 10+ near-identical 'state facts -> blocked -> restate -> retry' blocks in one context window, which measurably raises the odds of the model collapsing into a degenerate single-token repetition loop. - Track a per-session fact_force_denials counter in GateGuard state (merged max across concurrent writers, reset with the session, robust to malformed on-disk values). - The first GATEGUARD_FACT_FORCE_FULL_DENIALS denials (default 3) keep the full four-fact block; later denials emit a condensed single-line message that carries the denial ordinal, so consecutive denials are structurally different and never textually identical. - True retries of the same target remain allowed without re-prompting (unchanged). Destructive-Bash and routine-Bash gates are unchanged, as are the ECC_GATEGUARD=off / ECC_DISABLED_HOOKS escape hatches. Eight new tests cover budget counting, condensed format, ordinal advancement, retry pass-through, env tuning, malformed state, MultiEdit dampening, and destructive-gate exemption. * fix(hooks): keep security hooks able to block on oversized stdin (#2222) Refine the truncation fail-open: instead of skipping the hook entirely, the runner now suppresses only its own raw-echo when stdin was truncated. The hook still executes and receives the truncated flag (run() context / ECC_HOOK_INPUT_TRUNCATED), so config-protection keeps blocking truncated protected-config payloads (its test requires exit 2) while pass-through hooks fail open with empty stdout as before. * style: apply repo formatter to touched hook files
This commit is contained in:
@@ -1637,6 +1637,114 @@ function runTests() {
|
||||
}
|
||||
})) passed++; else failed++;
|
||||
|
||||
// --- Fact-force denial dampening (#2142) ---
|
||||
|
||||
console.log('\n Fact-force denial dampening (#2142):');
|
||||
|
||||
clearState();
|
||||
if (test('first denials use the full four-fact block and count toward the budget', () => {
|
||||
const result = runHook({ tool_name: 'Edit', tool_input: { file_path: '/src/damp-one.js' } });
|
||||
const output = parseOutput(result.stdout);
|
||||
assert.strictEqual(output.hookSpecificOutput.permissionDecision, 'deny');
|
||||
assert.ok(output.hookSpecificOutput.permissionDecisionReason.includes('present these facts'),
|
||||
'first denial should use the full block');
|
||||
const state = JSON.parse(fs.readFileSync(stateFile, 'utf8'));
|
||||
assert.strictEqual(state.fact_force_denials, 1, 'denial counter should persist in session state');
|
||||
})) passed++; else failed++;
|
||||
|
||||
clearState();
|
||||
if (test('emits a condensed single-line denial once the full-block budget is spent', () => {
|
||||
writeState({ checked: [], last_active: Date.now(), fact_force_denials: 3 });
|
||||
const result = runHook({ tool_name: 'Edit', tool_input: { file_path: '/src/damp-two.js' } });
|
||||
const output = parseOutput(result.stdout);
|
||||
assert.strictEqual(result.code, 0);
|
||||
assert.strictEqual(output.hookSpecificOutput.permissionDecision, 'deny', 'still denies first touch');
|
||||
const reason = output.hookSpecificOutput.permissionDecisionReason;
|
||||
assert.ok(reason.includes('[Fact-Forcing Gate]'), 'condensed message keeps the gate marker');
|
||||
assert.ok(reason.includes('denial #4'), 'condensed message carries the denial ordinal');
|
||||
assert.ok(reason.includes('/src/damp-two.js'), 'condensed message names the target');
|
||||
assert.ok(!reason.includes('present these facts'), 'no repeated four-fact block');
|
||||
assert.ok(!reason.includes('\n'), 'condensed message is a single line');
|
||||
assert.ok(reason.includes('ECC_GATEGUARD=off'), 'condensed message keeps a recovery hint');
|
||||
})) passed++; else failed++;
|
||||
|
||||
clearState();
|
||||
if (test('consecutive condensed denials are textually different (ordinal advances)', () => {
|
||||
writeState({ checked: [], last_active: Date.now(), fact_force_denials: 5 });
|
||||
const first = parseOutput(runHook({ tool_name: 'Write', tool_input: { file_path: '/src/damp-a.js', content: 'x' } }).stdout);
|
||||
const second = parseOutput(runHook({ tool_name: 'Write', tool_input: { file_path: '/src/damp-b.js', content: 'x' } }).stdout);
|
||||
const firstReason = first.hookSpecificOutput.permissionDecisionReason;
|
||||
const secondReason = second.hookSpecificOutput.permissionDecisionReason;
|
||||
assert.ok(firstReason.includes('denial #6'), `expected ordinal 6, got: ${firstReason}`);
|
||||
assert.ok(secondReason.includes('denial #7'), `expected ordinal 7, got: ${secondReason}`);
|
||||
assert.notStrictEqual(firstReason, secondReason, 'successive denials must differ so they cannot compound verbatim');
|
||||
})) passed++; else failed++;
|
||||
|
||||
clearState();
|
||||
if (test('retry of the same target is still allowed after a condensed denial', () => {
|
||||
writeState({ checked: [], last_active: Date.now(), fact_force_denials: 9 });
|
||||
const input = { tool_name: 'Edit', tool_input: { file_path: '/src/damp-retry.js' } };
|
||||
const denied = parseOutput(runHook(input).stdout);
|
||||
assert.strictEqual(denied.hookSpecificOutput.permissionDecision, 'deny');
|
||||
const retryOutput = parseOutput(runHook(input).stdout);
|
||||
assert.ok(!retryOutput || !retryOutput.hookSpecificOutput, 'retry passes through (no second deny, no re-prompt)');
|
||||
})) passed++; else failed++;
|
||||
|
||||
clearState();
|
||||
if (test('GATEGUARD_FACT_FORCE_FULL_DENIALS tunes the full-block budget', () => {
|
||||
// Budget 0: condensed from the very first denial.
|
||||
const zero = parseOutput(runHook(
|
||||
{ tool_name: 'Edit', tool_input: { file_path: '/src/damp-zero.js' } },
|
||||
{ GATEGUARD_FACT_FORCE_FULL_DENIALS: '0' }
|
||||
).stdout);
|
||||
assert.ok(zero.hookSpecificOutput.permissionDecisionReason.includes('denial #1'));
|
||||
assert.ok(!zero.hookSpecificOutput.permissionDecisionReason.includes('present these facts'));
|
||||
|
||||
// Large budget: full block well past the default threshold.
|
||||
clearState();
|
||||
writeState({ checked: [], last_active: Date.now(), fact_force_denials: 7 });
|
||||
const big = parseOutput(runHook(
|
||||
{ tool_name: 'Edit', tool_input: { file_path: '/src/damp-big.js' } },
|
||||
{ GATEGUARD_FACT_FORCE_FULL_DENIALS: '20' }
|
||||
).stdout);
|
||||
assert.ok(big.hookSpecificOutput.permissionDecisionReason.includes('present these facts'),
|
||||
'budget of 20 keeps the full block at denial 8');
|
||||
})) passed++; else failed++;
|
||||
|
||||
clearState();
|
||||
if (test('malformed denial counter in state is treated as zero (full block, no crash)', () => {
|
||||
writeState({ checked: [], last_active: Date.now(), fact_force_denials: 'garbage' });
|
||||
const result = runHook({ tool_name: 'Edit', tool_input: { file_path: '/src/damp-malformed.js' } });
|
||||
assert.strictEqual(result.code, 0);
|
||||
const output = parseOutput(result.stdout);
|
||||
assert.strictEqual(output.hookSpecificOutput.permissionDecision, 'deny');
|
||||
assert.ok(output.hookSpecificOutput.permissionDecisionReason.includes('present these facts'),
|
||||
'malformed counter resets to the full block');
|
||||
})) passed++; else failed++;
|
||||
|
||||
clearState();
|
||||
if (test('MultiEdit denials are dampened past the budget', () => {
|
||||
writeState({ checked: [], last_active: Date.now(), fact_force_denials: 4 });
|
||||
const result = runHook({
|
||||
tool_name: 'MultiEdit',
|
||||
tool_input: { edits: [{ file_path: '/src/damp-multi.js', old_string: 'a', new_string: 'b' }] }
|
||||
});
|
||||
const output = parseOutput(result.stdout);
|
||||
assert.strictEqual(output.hookSpecificOutput.permissionDecision, 'deny');
|
||||
assert.ok(output.hookSpecificOutput.permissionDecisionReason.includes('denial #5'));
|
||||
assert.ok(!output.hookSpecificOutput.permissionDecisionReason.includes('present these facts'));
|
||||
})) passed++; else failed++;
|
||||
|
||||
clearState();
|
||||
if (test('destructive Bash gate keeps the full message regardless of denial count', () => {
|
||||
writeState({ checked: ['__bash_session__'], last_active: Date.now(), fact_force_denials: 50 });
|
||||
const result = runBashHook({ tool_name: 'Bash', tool_input: { command: 'rm -rf /tmp/damp-target' } });
|
||||
const output = parseOutput(result.stdout);
|
||||
assert.strictEqual(output.hookSpecificOutput.permissionDecision, 'deny');
|
||||
assert.ok(output.hookSpecificOutput.permissionDecisionReason.includes('rollback'),
|
||||
'destructive gate is exempt from dampening');
|
||||
})) passed++; else failed++;
|
||||
|
||||
// Cleanup only the temp directory created by this test file.
|
||||
try {
|
||||
if (fs.existsSync(stateDir)) {
|
||||
|
||||
@@ -5285,17 +5285,15 @@ async function runTests() {
|
||||
console.log('\nRound 59: check-console-log.js (stdin exceeding 1MB — truncation):');
|
||||
|
||||
if (
|
||||
await asyncTest('truncates stdin at 1MB limit and still passes through data', async () => {
|
||||
// Send 1.2MB of data — exceeds the 1MB MAX_STDIN limit
|
||||
await asyncTest('suppresses pass-through for oversized stdin (fail-open, #2090)', async () => {
|
||||
// Send 1.2MB of data — exceeds the 1MB MAX_STDIN limit. Echoing the
|
||||
// truncated string would emit a JSON document cut mid-stream, which the
|
||||
// harness reports as a Stop hook JSON validation failure.
|
||||
const payload = 'x'.repeat(1024 * 1024 + 200000);
|
||||
const result = await runScript(path.join(scriptsDir, 'check-console-log.js'), payload);
|
||||
|
||||
assert.strictEqual(result.code, 0, 'Should exit 0 even with oversized stdin');
|
||||
// Output should be truncated — significantly less than input
|
||||
assert.ok(result.stdout.length < payload.length, `stdout (${result.stdout.length}) should be shorter than input (${payload.length})`);
|
||||
// Output should be approximately 1MB (last accepted chunk may push slightly over)
|
||||
assert.ok(result.stdout.length <= 1024 * 1024 + 65536, `stdout (${result.stdout.length}) should be near 1MB, not unbounded`);
|
||||
assert.ok(result.stdout.length > 0, 'Should still pass through truncated data');
|
||||
assert.strictEqual(result.stdout, '', 'Truncated stdin must not be echoed (empty stdout = no opinion)');
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
|
||||
154
tests/hooks/run-with-flags-truncation.test.js
Normal file
154
tests/hooks/run-with-flags-truncation.test.js
Normal file
@@ -0,0 +1,154 @@
|
||||
/**
|
||||
* Regression tests for #2222: run-with-flags.js must fail open on >1MB stdin.
|
||||
*
|
||||
* Before the fix, every fallthrough path echoed the truncated payload to
|
||||
* stdout. The harness parses hook stdout as JSON, got a document cut
|
||||
* mid-stream, and treated the hook as failed — blocking every Edit/Write
|
||||
* whose hook payload exceeded the 1MB cap.
|
||||
*/
|
||||
|
||||
'use strict';
|
||||
|
||||
const assert = require('assert');
|
||||
const path = require('path');
|
||||
const { spawnSync } = require('child_process');
|
||||
|
||||
const repoRoot = path.join(__dirname, '..', '..');
|
||||
const runner = path.join(repoRoot, 'scripts', 'hooks', 'run-with-flags.js');
|
||||
|
||||
const MAX_STDIN = 1024 * 1024;
|
||||
|
||||
function test(name, fn) {
|
||||
try {
|
||||
fn();
|
||||
console.log(` ✓ ${name}`);
|
||||
return true;
|
||||
} catch (error) {
|
||||
console.log(` ✗ ${name}`);
|
||||
console.log(` Error: ${error.message}`);
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
function runRunner(args, input, env = {}) {
|
||||
return spawnSync('node', [runner, ...args], {
|
||||
input,
|
||||
encoding: 'utf8',
|
||||
cwd: repoRoot,
|
||||
env: { ...process.env, ...env },
|
||||
timeout: 30000,
|
||||
maxBuffer: 16 * 1024 * 1024,
|
||||
stdio: ['pipe', 'pipe', 'pipe']
|
||||
});
|
||||
}
|
||||
|
||||
function oversizedPayload() {
|
||||
// JSON document that exceeds MAX_STDIN so the runner's stdin cap trips.
|
||||
return JSON.stringify({
|
||||
tool_name: 'Write',
|
||||
tool_input: { file_path: '/tmp/big.md', content: 'x'.repeat(MAX_STDIN + 64 * 1024) }
|
||||
});
|
||||
}
|
||||
|
||||
console.log('\nrun-with-flags truncation (fail-open) tests:');
|
||||
|
||||
let passed = 0;
|
||||
let failed = 0;
|
||||
|
||||
if (
|
||||
test('oversized payload exits 0 with empty stdout for an enabled hook', () => {
|
||||
const result = runRunner(['pre:write:doc-file-warning', 'scripts/hooks/doc-file-warning.js', 'standard,strict'], oversizedPayload());
|
||||
assert.strictEqual(result.status, 0, `expected exit 0, got ${result.status}: ${result.stderr}`);
|
||||
assert.strictEqual(result.stdout, '', `stdout must be empty, got: ${result.stdout.slice(0, 120)}...`);
|
||||
assert.match(result.stderr, /stdin exceeded \d+ bytes for pre:write:doc-file-warning/);
|
||||
assert.match(result.stderr, /fail-open/);
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('oversized payload never echoes truncated stdin when hook args are missing', () => {
|
||||
const result = runRunner([], oversizedPayload());
|
||||
assert.strictEqual(result.status, 0);
|
||||
assert.strictEqual(result.stdout, '', 'missing-args path must not echo truncated stdin');
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('oversized payload never echoes truncated stdin for a disabled hook', () => {
|
||||
const result = runRunner(['pre:write:doc-file-warning', 'scripts/hooks/doc-file-warning.js', 'standard,strict'], oversizedPayload(), { ECC_DISABLED_HOOKS: 'pre:write:doc-file-warning' });
|
||||
assert.strictEqual(result.status, 0);
|
||||
assert.strictEqual(result.stdout, '', 'disabled-hook path must not echo truncated stdin');
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('normal-sized payload still passes through unchanged', () => {
|
||||
const payload = JSON.stringify({
|
||||
tool_name: 'Write',
|
||||
tool_input: { file_path: '/tmp/small.js', content: 'const x = 1;\n' }
|
||||
});
|
||||
const result = runRunner(['pre:write:doc-file-warning', 'scripts/hooks/doc-file-warning.js', 'standard,strict'], payload);
|
||||
assert.strictEqual(result.status, 0, `expected exit 0, got ${result.status}: ${result.stderr}`);
|
||||
assert.ok(result.stdout.length > 0, 'normal payloads keep the pass-through behavior');
|
||||
JSON.parse(result.stdout); // stdout must remain valid JSON
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('a security hook can still block on an oversized payload (no blanket skip)', () => {
|
||||
// config-protection refuses to fail open on truncated payloads. The
|
||||
// runner must still execute the hook and forward its verdict — only the
|
||||
// runner's own raw-echo is suppressed.
|
||||
const payload = JSON.stringify({
|
||||
tool_name: 'Write',
|
||||
tool_input: { file_path: '.eslintrc.js', content: 'x'.repeat(MAX_STDIN + 2048) }
|
||||
});
|
||||
const result = runRunner(['pre:config-protection', 'scripts/hooks/config-protection.js', 'standard,strict'], payload);
|
||||
assert.strictEqual(result.status, 2, `expected block exit 2, got ${result.status}: ${result.stderr}`);
|
||||
assert.strictEqual(result.stdout, '', 'blocked truncated payload must not echo raw input');
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('payload just under the cap echoes through completely (no 64KB pipe cut)', () => {
|
||||
// process.exit() right after stdout.write() used to drop everything past
|
||||
// the ~64KB pipe buffer, cutting the echoed JSON mid-stream.
|
||||
const content = 'y'.repeat(MAX_STDIN - 1024);
|
||||
const payload = JSON.stringify({ tool_name: 'Write', tool_input: { file_path: '/tmp/edge.md', content } });
|
||||
assert.ok(payload.length < MAX_STDIN, 'fixture must stay under the stdin cap');
|
||||
const result = runRunner([], payload);
|
||||
assert.strictEqual(result.status, 0);
|
||||
assert.strictEqual(result.stdout.length, payload.length, 'echo must not be cut at the pipe buffer');
|
||||
assert.strictEqual(result.stdout, payload, 'sub-cap payloads still echo through fallthrough paths');
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('disabled-hook passthrough of a >64KB payload stays valid JSON', () => {
|
||||
const payload = JSON.stringify({
|
||||
tool_name: 'Write',
|
||||
tool_input: { file_path: '/tmp/medium.md', content: 'z'.repeat(256 * 1024) }
|
||||
});
|
||||
const result = runRunner(['pre:write:doc-file-warning', 'scripts/hooks/doc-file-warning.js', 'standard,strict'], payload, { ECC_DISABLED_HOOKS: 'pre:write:doc-file-warning' });
|
||||
assert.strictEqual(result.status, 0);
|
||||
assert.strictEqual(result.stdout, payload);
|
||||
JSON.parse(result.stdout);
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
console.log(`\n ${passed} passed, ${failed} failed\n`);
|
||||
process.exit(failed > 0 ? 1 : 0);
|
||||
203
tests/hooks/stop-hooks-stdout.test.js
Normal file
203
tests/hooks/stop-hooks-stdout.test.js
Normal file
@@ -0,0 +1,203 @@
|
||||
/**
|
||||
* Regression tests for #2090: "Stop hook error: JSON validation failed".
|
||||
*
|
||||
* Stop hooks follow the ECC pass-through convention (echo stdin on stdout).
|
||||
* The Stop payload carries `last_assistant_message`, which can be large; any
|
||||
* hook that caps stdin and echoes the capped string emits a JSON document cut
|
||||
* mid-stream, which the harness reports as a Stop hook JSON validation
|
||||
* failure. Worst offender: cost-tracker capped stdin at 64KB, so any Stop
|
||||
* payload with a >64KB final assistant message broke the whole Stop chain.
|
||||
*
|
||||
* Contract under test: for every Stop hook, stdout is either empty or valid
|
||||
* JSON, and the exit code is 0 — for realistic large payloads and for
|
||||
* oversized (>1MB) payloads, via the production runner and via direct
|
||||
* invocation.
|
||||
*/
|
||||
|
||||
'use strict';
|
||||
|
||||
const assert = require('assert');
|
||||
const fs = require('fs');
|
||||
const os = require('os');
|
||||
const path = require('path');
|
||||
const { spawnSync } = require('child_process');
|
||||
|
||||
const repoRoot = path.join(__dirname, '..', '..');
|
||||
const runner = path.join(repoRoot, 'scripts', 'hooks', 'run-with-flags.js');
|
||||
|
||||
const MAX_STDIN = 1024 * 1024;
|
||||
|
||||
const workDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ecc-stop-stdout-')); // non-git cwd
|
||||
const dataHome = fs.mkdtempSync(path.join(os.tmpdir(), 'ecc-stop-data-'));
|
||||
|
||||
function test(name, fn) {
|
||||
try {
|
||||
fn();
|
||||
console.log(` ✓ ${name}`);
|
||||
return true;
|
||||
} catch (error) {
|
||||
console.log(` ✗ ${name}`);
|
||||
console.log(` Error: ${error.message}`);
|
||||
return false;
|
||||
}
|
||||
}
|
||||
|
||||
function stopPayload(messageBytes) {
|
||||
return JSON.stringify({
|
||||
session_id: `stop-stdout-test-${process.pid}`,
|
||||
transcript_path: path.join(workDir, 'missing-transcript.jsonl'),
|
||||
cwd: workDir,
|
||||
hook_event_name: 'Stop',
|
||||
stop_hook_active: false,
|
||||
last_assistant_message: 'm'.repeat(messageBytes)
|
||||
});
|
||||
}
|
||||
|
||||
function hookEnv() {
|
||||
const env = {
|
||||
...process.env,
|
||||
ECC_HOOK_PROFILE: 'standard',
|
||||
ECC_AGENT_DATA_HOME: dataHome,
|
||||
CLAUDE_SESSION_ID: `stop-stdout-test-${process.pid}`
|
||||
};
|
||||
delete env.ECC_GATEGUARD;
|
||||
delete env.ECC_DISABLED_HOOKS;
|
||||
return env;
|
||||
}
|
||||
|
||||
function runViaRunner(hookId, script, input) {
|
||||
return spawnSync('node', [runner, hookId, script, 'minimal,standard,strict'], {
|
||||
input,
|
||||
encoding: 'utf8',
|
||||
cwd: workDir,
|
||||
env: hookEnv(),
|
||||
timeout: 60000,
|
||||
maxBuffer: 16 * 1024 * 1024,
|
||||
stdio: ['pipe', 'pipe', 'pipe']
|
||||
});
|
||||
}
|
||||
|
||||
function runDirect(script, input) {
|
||||
return spawnSync('node', [path.join(repoRoot, script)], {
|
||||
input,
|
||||
encoding: 'utf8',
|
||||
cwd: workDir,
|
||||
env: hookEnv(),
|
||||
timeout: 60000,
|
||||
maxBuffer: 16 * 1024 * 1024,
|
||||
stdio: ['pipe', 'pipe', 'pipe']
|
||||
});
|
||||
}
|
||||
|
||||
function assertStdoutContract(result, label) {
|
||||
assert.strictEqual(result.status, 0, `${label}: expected exit 0, got ${result.status}: ${result.stderr}`);
|
||||
if (result.stdout.length > 0) {
|
||||
try {
|
||||
JSON.parse(result.stdout);
|
||||
} catch (err) {
|
||||
assert.fail(`${label}: stdout is non-empty but not valid JSON (${err.message}); first 120 chars: ${result.stdout.slice(0, 120)}`);
|
||||
}
|
||||
}
|
||||
}
|
||||
|
||||
// All registered Stop hooks (hooks/hooks.json).
|
||||
const STOP_HOOKS = [
|
||||
['stop:format-typecheck', 'scripts/hooks/stop-format-typecheck.js'],
|
||||
['stop:check-console-log', 'scripts/hooks/check-console-log.js'],
|
||||
['stop:session-end', 'scripts/hooks/session-end.js'],
|
||||
['stop:evaluate-session', 'scripts/hooks/evaluate-session.js'],
|
||||
['stop:cost-tracker', 'scripts/hooks/cost-tracker.js']
|
||||
// stop:desktop-notify is excluded from the valid-payload run because a
|
||||
// successful run() fires a real OS notification; its truncation path is
|
||||
// covered separately below (run() bails on JSON.parse before notifying).
|
||||
];
|
||||
|
||||
// Direct-invocation legacy paths that echo stdin.
|
||||
const ECHOING_STOP_HOOKS = [
|
||||
'scripts/hooks/stop-format-typecheck.js',
|
||||
'scripts/hooks/check-console-log.js',
|
||||
'scripts/hooks/cost-tracker.js',
|
||||
'scripts/hooks/desktop-notify.js'
|
||||
];
|
||||
|
||||
console.log('\nStop hook stdout contract tests (#2090):');
|
||||
|
||||
let passed = 0;
|
||||
let failed = 0;
|
||||
|
||||
// A 100KB last_assistant_message is a realistic long-session Stop payload.
|
||||
// Before the fix, cost-tracker echoed it cut at 64KB through the production
|
||||
// runner path, making the harness report "JSON validation failed".
|
||||
const realisticPayload = stopPayload(100 * 1024);
|
||||
|
||||
for (const [hookId, script] of STOP_HOOKS) {
|
||||
if (
|
||||
test(`${hookId} via runner keeps stdout valid for a 100KB Stop payload`, () => {
|
||||
const result = runViaRunner(hookId, script, realisticPayload);
|
||||
assertStdoutContract(result, hookId);
|
||||
if (result.stdout.length > 0) {
|
||||
assert.strictEqual(result.stdout, realisticPayload, `${hookId}: pass-through must echo the payload uncut`);
|
||||
}
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
}
|
||||
|
||||
const oversizedPayload = stopPayload(MAX_STDIN + 64 * 1024);
|
||||
|
||||
for (const [hookId, script] of [...STOP_HOOKS, ['stop:desktop-notify', 'scripts/hooks/desktop-notify.js']]) {
|
||||
if (
|
||||
test(`${hookId} via runner fails open on a >1MB Stop payload`, () => {
|
||||
const result = runViaRunner(hookId, script, oversizedPayload);
|
||||
assert.strictEqual(result.status, 0, `${hookId}: expected exit 0, got ${result.status}: ${result.stderr}`);
|
||||
assert.strictEqual(result.stdout, '', `${hookId}: oversized payloads must not be echoed`);
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
}
|
||||
|
||||
for (const script of ECHOING_STOP_HOOKS) {
|
||||
if (
|
||||
test(`${path.basename(script)} invoked directly never echoes truncated stdin`, () => {
|
||||
const result = runDirect(script, oversizedPayload);
|
||||
assert.strictEqual(result.status, 0, `${script}: expected exit 0, got ${result.status}: ${result.stderr}`);
|
||||
assert.strictEqual(result.stdout, '', `${script}: truncated stdin must not be echoed`);
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
}
|
||||
|
||||
if (
|
||||
test('check-console-log invoked directly echoes a sub-cap >64KB payload uncut', () => {
|
||||
const result = runDirect('scripts/hooks/check-console-log.js', realisticPayload);
|
||||
assert.strictEqual(result.status, 0);
|
||||
assert.strictEqual(result.stdout, realisticPayload, 'pass-through must not be cut at the pipe buffer');
|
||||
JSON.parse(result.stdout);
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('cost-tracker invoked directly echoes a sub-cap >64KB payload uncut', () => {
|
||||
const result = runDirect('scripts/hooks/cost-tracker.js', realisticPayload);
|
||||
assert.strictEqual(result.status, 0);
|
||||
assert.strictEqual(result.stdout, realisticPayload, 'the old 64KB cap must not cut realistic Stop payloads');
|
||||
JSON.parse(result.stdout);
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
try {
|
||||
fs.rmSync(workDir, { recursive: true, force: true });
|
||||
fs.rmSync(dataHome, { recursive: true, force: true });
|
||||
} catch {
|
||||
/* best-effort cleanup */
|
||||
}
|
||||
|
||||
console.log(`\n ${passed} passed, ${failed} failed\n`);
|
||||
process.exit(failed > 0 ? 1 : 0);
|
||||
@@ -261,7 +261,7 @@ if (
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('merge-mcp-config dry-run appends all recommended servers without mutating target', () => {
|
||||
test('merge-mcp-config dry-run appends the current default set without mutating target', () => {
|
||||
const tempDir = createTempDir('mcp-merge-dry-run-');
|
||||
const configPath = path.join(tempDir, 'config.toml');
|
||||
const original = '';
|
||||
@@ -272,9 +272,12 @@ if (
|
||||
|
||||
assert.strictEqual(result.status, 0, `${result.stdout}\n${result.stderr}`);
|
||||
assert.match(result.stdout, /Package manager: npm \(exec: npx\)/);
|
||||
assert.match(result.stdout, /\[add\] mcp_servers\.supabase/);
|
||||
assert.match(result.stdout, /\[mcp_servers\.github\]/);
|
||||
assert.match(result.stdout, /\[add\] mcp_servers\.chrome-devtools/);
|
||||
assert.match(result.stdout, /\[mcp_servers\.chrome-devtools\]/);
|
||||
assert.match(result.stdout, /Dry run/);
|
||||
// Retired defaults (June 2026 connector policy) must not be emitted.
|
||||
assert.doesNotMatch(result.stdout, /mcp_servers\.(supabase|playwright|context7|exa|github|memory|sequential-thinking)\b/);
|
||||
assert.doesNotMatch(result.stdout, /url = /);
|
||||
assert.strictEqual(fs.readFileSync(configPath, 'utf8'), original);
|
||||
} finally {
|
||||
cleanup(tempDir);
|
||||
@@ -296,14 +299,17 @@ if (
|
||||
|
||||
const merged = fs.readFileSync(configPath, 'utf8');
|
||||
const parsed = TOML.parse(merged);
|
||||
assert.strictEqual(parsed.mcp_servers.exa.url, 'https://mcp.exa.ai/mcp');
|
||||
assert.strictEqual(parsed.mcp_servers.github.command, 'bash');
|
||||
assert.deepStrictEqual(parsed.mcp_servers.memory.args, ['@modelcontextprotocol/server-memory']);
|
||||
assert.strictEqual(parsed.mcp_servers.supabase.tool_timeout_sec, 120);
|
||||
assert.strictEqual(parsed.mcp_servers['chrome-devtools'].command, 'npx');
|
||||
assert.deepStrictEqual(parsed.mcp_servers['chrome-devtools'].args, ['chrome-devtools-mcp@latest']);
|
||||
assert.strictEqual(parsed.mcp_servers['chrome-devtools'].startup_timeout_sec, 30);
|
||||
// No retired server may be (re-)emitted — exa's url form broke Codex (#2224).
|
||||
assert.strictEqual(parsed.mcp_servers.exa, undefined);
|
||||
assert.strictEqual(parsed.mcp_servers.github, undefined);
|
||||
assert.strictEqual(parsed.mcp_servers.supabase, undefined);
|
||||
|
||||
const second = runNode(mergeMcpConfigScript, [configPath], deterministicPackageEnv);
|
||||
assert.strictEqual(second.status, 0, `${second.stdout}\n${second.stderr}`);
|
||||
assert.match(second.stdout, /\[ok\] mcp_servers\.github/);
|
||||
assert.match(second.stdout, /\[ok\] mcp_servers\.chrome-devtools/);
|
||||
assert.match(second.stdout, /All ECC MCP servers already present/);
|
||||
assert.strictEqual(fs.readFileSync(configPath, 'utf8'), merged);
|
||||
} finally {
|
||||
@@ -315,24 +321,88 @@ if (
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('merge-mcp-config update dry-run reports canonical and legacy section refreshes', () => {
|
||||
test('merge-mcp-config repairs the invalid exa url entry from earlier ECC versions (#2224)', () => {
|
||||
const tempDir = createTempDir('mcp-merge-exa-repair-');
|
||||
const configPath = path.join(tempDir, 'config.toml');
|
||||
const original = [
|
||||
'[mcp_servers.github]',
|
||||
'command = "npx"',
|
||||
'args = ["-y", "@modelcontextprotocol/server-github"]',
|
||||
'',
|
||||
'[mcp_servers.exa]',
|
||||
'url = "https://mcp.exa.ai/mcp"',
|
||||
'',
|
||||
].join('\n');
|
||||
|
||||
try {
|
||||
fs.writeFileSync(configPath, original);
|
||||
const result = runNode(mergeMcpConfigScript, [configPath], deterministicPackageEnv);
|
||||
|
||||
assert.strictEqual(result.status, 0, `${result.stdout}\n${result.stderr}`);
|
||||
assert.match(result.stdout, /\[repair\] mcp_servers\.exa/);
|
||||
|
||||
const updated = fs.readFileSync(configPath, 'utf8');
|
||||
const parsed = TOML.parse(updated);
|
||||
assert.strictEqual(parsed.mcp_servers.exa, undefined, 'invalid exa url entry must be removed');
|
||||
assert.doesNotMatch(updated, /url = "https:\/\/mcp\.exa\.ai\/mcp"/);
|
||||
// User-managed servers are untouched; current default is added.
|
||||
assert.strictEqual(parsed.mcp_servers.github.command, 'npx');
|
||||
assert.strictEqual(parsed.mcp_servers['chrome-devtools'].command, 'npx');
|
||||
|
||||
// Re-running must not re-introduce the invalid entry.
|
||||
const second = runNode(mergeMcpConfigScript, [configPath], deterministicPackageEnv);
|
||||
assert.strictEqual(second.status, 0, `${second.stdout}\n${second.stderr}`);
|
||||
assert.doesNotMatch(fs.readFileSync(configPath, 'utf8'), /mcp_servers\.exa/);
|
||||
} finally {
|
||||
cleanup(tempDir);
|
||||
}
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('merge-mcp-config leaves a user-managed stdio exa entry untouched', () => {
|
||||
const tempDir = createTempDir('mcp-merge-exa-stdio-');
|
||||
const configPath = path.join(tempDir, 'config.toml');
|
||||
const original = [
|
||||
'[mcp_servers.exa]',
|
||||
'command = "npx"',
|
||||
'args = ["-y", "mcp-remote", "https://mcp.exa.ai/mcp"]',
|
||||
'startup_timeout_sec = 30',
|
||||
'',
|
||||
].join('\n');
|
||||
|
||||
try {
|
||||
fs.writeFileSync(configPath, original);
|
||||
const result = runNode(mergeMcpConfigScript, [configPath], deterministicPackageEnv);
|
||||
|
||||
assert.strictEqual(result.status, 0, `${result.stdout}\n${result.stderr}`);
|
||||
assert.doesNotMatch(result.stdout, /\[repair\]/);
|
||||
|
||||
const parsed = TOML.parse(fs.readFileSync(configPath, 'utf8'));
|
||||
assert.strictEqual(parsed.mcp_servers.exa.command, 'npx');
|
||||
assert.deepStrictEqual(parsed.mcp_servers.exa.args, ['-y', 'mcp-remote', 'https://mcp.exa.ai/mcp']);
|
||||
} finally {
|
||||
cleanup(tempDir);
|
||||
}
|
||||
})
|
||||
)
|
||||
passed++;
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('merge-mcp-config update dry-run refreshes managed sections and leaves user servers alone', () => {
|
||||
const tempDir = createTempDir('mcp-merge-update-dry-run-');
|
||||
const configPath = path.join(tempDir, 'config.toml');
|
||||
const original = [
|
||||
'[mcp_servers.chrome-devtools]',
|
||||
'command = "custom"',
|
||||
'args = ["old"]',
|
||||
'',
|
||||
'[mcp_servers.context7]',
|
||||
'command = "custom"',
|
||||
'args = ["old"]',
|
||||
'',
|
||||
'[mcp_servers.context7-mcp]',
|
||||
'command = "npx"',
|
||||
'args = ["legacy"]',
|
||||
'',
|
||||
'[mcp_servers.supabase]',
|
||||
'command = "custom"',
|
||||
'args = ["old"]',
|
||||
'',
|
||||
'[mcp_servers.supabase.env]',
|
||||
'SUPABASE_ACCESS_TOKEN = "token"',
|
||||
'args = ["-y", "@upstash/context7-mcp@latest"]',
|
||||
'',
|
||||
].join('\n');
|
||||
|
||||
@@ -341,11 +411,10 @@ if (
|
||||
const result = runNode(mergeMcpConfigScript, [configPath, '--update-mcp', '--dry-run'], deterministicPackageEnv);
|
||||
|
||||
assert.strictEqual(result.status, 0, `${result.stdout}\n${result.stderr}`);
|
||||
assert.match(result.stdout, /\[remove\] mcp_servers\.context7/);
|
||||
assert.match(result.stdout, /\[remove\] mcp_servers\.context7-mcp/);
|
||||
assert.match(result.stdout, /\[remove\] mcp_servers\.supabase/);
|
||||
assert.match(result.stdout, /\[mcp_servers\.supabase\]/);
|
||||
assert.match(result.stdout, /\[mcp_servers\.context7\]/);
|
||||
assert.match(result.stdout, /\[remove\] mcp_servers\.chrome-devtools/);
|
||||
assert.match(result.stdout, /\[mcp_servers\.chrome-devtools\]/);
|
||||
// Retired servers are no longer ECC-managed: never removed or re-added.
|
||||
assert.doesNotMatch(result.stdout, /\[remove\] mcp_servers\.context7/);
|
||||
assert.strictEqual(fs.readFileSync(configPath, 'utf8'), original);
|
||||
} finally {
|
||||
cleanup(tempDir);
|
||||
@@ -356,38 +425,31 @@ if (
|
||||
else failed++;
|
||||
|
||||
if (
|
||||
test('merge-mcp-config removes disabled legacy servers without appending replacements', () => {
|
||||
test('merge-mcp-config removes disabled servers without appending replacements', () => {
|
||||
const tempDir = createTempDir('mcp-merge-disabled-');
|
||||
const configPath = path.join(tempDir, 'config.toml');
|
||||
const original = [
|
||||
'[mcp_servers.context7-mcp]',
|
||||
'[mcp_servers.chrome-devtools]',
|
||||
'command = "npx"',
|
||||
'args = ["legacy"]',
|
||||
'',
|
||||
'[mcp_servers.exa]',
|
||||
'url = "https://mcp.exa.ai/mcp"',
|
||||
'args = ["chrome-devtools-mcp@latest"]',
|
||||
'',
|
||||
].join('\n');
|
||||
const allServersDisabled = 'supabase,playwright,context7,exa,github,memory,sequential-thinking';
|
||||
|
||||
try {
|
||||
fs.writeFileSync(configPath, original);
|
||||
const result = runNode(mergeMcpConfigScript, [configPath], {
|
||||
...deterministicPackageEnv,
|
||||
ECC_DISABLED_MCPS: allServersDisabled,
|
||||
ECC_DISABLED_MCPS: 'chrome-devtools',
|
||||
});
|
||||
|
||||
assert.strictEqual(result.status, 0, `${result.stdout}\n${result.stderr}`);
|
||||
assert.match(result.stdout, /Disabled via ECC_DISABLED_MCPS/);
|
||||
assert.match(result.stdout, /\[skip\] mcp_servers\.context7 \(disabled\)/);
|
||||
assert.match(result.stdout, /\[skip\] mcp_servers\.exa \(disabled\)/);
|
||||
assert.match(result.stdout, /\[update\] mcp_servers\.context7-mcp \(disabled\)/);
|
||||
assert.match(result.stdout, /\[update\] mcp_servers\.exa \(disabled\)/);
|
||||
assert.match(result.stdout, /Done\. Removed 2 disabled server\(s\)\./);
|
||||
assert.match(result.stdout, /\[skip\] mcp_servers\.chrome-devtools \(disabled\)/);
|
||||
assert.match(result.stdout, /\[update\] mcp_servers\.chrome-devtools \(disabled\)/);
|
||||
assert.match(result.stdout, /Done\. Removed 1 server section\(s\)\./);
|
||||
|
||||
const updated = fs.readFileSync(configPath, 'utf8');
|
||||
assert.doesNotMatch(updated, /context7-mcp/);
|
||||
assert.doesNotMatch(updated, /mcp_servers\.exa/);
|
||||
assert.doesNotMatch(updated, /chrome-devtools/);
|
||||
} finally {
|
||||
cleanup(tempDir);
|
||||
}
|
||||
@@ -454,7 +516,10 @@ if (
|
||||
assert.strictEqual(parsedConfig.agents.explorer.config_file, 'agents/explorer.toml');
|
||||
assert.strictEqual(parsedConfig.agents.reviewer.config_file, 'agents/reviewer.toml');
|
||||
assert.strictEqual(parsedConfig.agents.docs_researcher.config_file, 'agents/docs-researcher.toml');
|
||||
assert.ok(parsedConfig.mcp_servers.exa);
|
||||
// Current default connector is added; retired servers are not emitted,
|
||||
// and pre-existing user-managed entries are preserved untouched.
|
||||
assert.ok(parsedConfig.mcp_servers['chrome-devtools']);
|
||||
assert.strictEqual(parsedConfig.mcp_servers.exa, undefined);
|
||||
assert.ok(parsedConfig.mcp_servers.github);
|
||||
assert.ok(parsedConfig.mcp_servers.memory);
|
||||
assert.ok(parsedConfig.mcp_servers['sequential-thinking']);
|
||||
|
||||
Reference in New Issue
Block a user