mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-06-12 11:13:11 +08:00
* fix(hooks): fail open on oversized stdin instead of echoing truncated JSON (#2222)

run-with-flags.js capped stdin at 1MB but every fallthrough path still echoed the truncated string to stdout. The harness parses hook stdout as JSON, got a document cut mid-stream, and blocked the tool call, so any Edit/Write with a >1MB hook payload was permanently blocked by every registered pre-write hook, before ECC_HOOK_PROFILE / ECC_DISABLED_HOOKS gating could run.

- Exit 0 with empty stdout (no opinion) when the stdin cap trips, before any echo or gating logic.
- Flush stdout via write callback before process.exit: exiting right after stdout.write() dropped everything past the ~64KB pipe buffer, cutting even sub-cap pass-through payloads mid-JSON.

Regression tests cover the enabled, disabled, and missing-arg paths for oversized payloads plus full echo of sub-cap >64KB payloads.

* fix(codex): stop emitting invalid exa url entry, align merge with connector policy (#2224)

The Codex MCP merge declared exa with a url key, but Codex's [mcp_servers.*] TOML schema is stdio-only: the url key makes the entire config.toml fail to load, bricking both the codex CLI and the desktop app. Every install/update re-injected the line because the urlEntry branch treated the broken entry as present.

- ECC_SERVERS now emits only the current default set per docs/MCP-CONNECTOR-POLICY.md: chrome-devtools (stdio, command/args). Retired servers (supabase, playwright, context7, exa, github, memory, sequential-thinking) are never re-emitted; existing user-managed entries are untouched.
- The merge now repairs the exact ECC-emitted broken form (url-only exa entry) on every run so re-running the installer fixes broken configs instead of preserving them. User stdio exa entries (command + mcp-remote) are left alone.
- check-codex-global-state.sh requires chrome-devtools instead of the retired set, and flags url-only exa entries with a repair hint.

Tests cover repair, re-run idempotence, stdio-entry preservation, and no-retired-server emission in add, update, dry-run, and disabled modes.

* fix(hooks): never echo truncated stdin from Stop hooks (#2090)

Stop hooks follow the ECC pass-through convention (echo stdin on stdout), but every echoing Stop hook capped stdin and echoed the capped string. The Stop payload carries last_assistant_message, so a long final assistant message produced a JSON document cut mid-stream on stdout, which the harness reports as 'Stop hook error: JSON validation failed' across the whole Stop chain.

Reproduced: a Stop payload with a >64KB last_assistant_message run through run-with-flags + cost-tracker emitted exactly 65536 bytes of invalid JSON (cost-tracker capped stdin at 64KB, far below realistic Stop payloads).

- cost-tracker: raise the cap to 1MB (matching all other hooks) and suppress the pass-through echo when stdin was truncated.
- check-console-log, stop-format-typecheck, desktop-notify: suppress the echo when stdin was truncated; flush stdout before process.exit so sub-cap payloads are not cut at the ~64KB pipe buffer.
- All hooks keep exiting 0 (fail-open); diagnostics go to stderr.

New stop-hooks-stdout test asserts the contract for every registered Stop hook: stdout is empty or valid JSON, exit code 0, for realistic 100KB payloads and oversized >1MB payloads, via the production runner and via direct invocation. Updated the old hooks.test.js case that codified the truncated-echo behavior.

* fix(hooks): dampen GateGuard fact-force repetition in long sessions (#2142)

In long autonomous sessions the fact-force gate produced 10+ near-identical 'state facts -> blocked -> restate -> retry' blocks in one context window, which measurably raises the odds of the model collapsing into a degenerate single-token repetition loop.

- Track a per-session fact_force_denials counter in GateGuard state (merged max across concurrent writers, reset with the session, robust to malformed on-disk values).
- The first GATEGUARD_FACT_FORCE_FULL_DENIALS denials (default 3) keep the full four-fact block; later denials emit a condensed single-line message that carries the denial ordinal, so consecutive denials are structurally different and never textually identical.
- True retries of the same target remain allowed without re-prompting (unchanged).

Destructive-Bash and routine-Bash gates are unchanged, as are the ECC_GATEGUARD=off / ECC_DISABLED_HOOKS escape hatches. Eight new tests cover budget counting, condensed format, ordinal advancement, retry pass-through, env tuning, malformed state, MultiEdit dampening, and destructive-gate exemption.

* fix(hooks): keep security hooks able to block on oversized stdin (#2222)

Refine the truncation fail-open: instead of skipping the hook entirely, the runner now suppresses only its own raw-echo when stdin was truncated. The hook still executes and receives the truncated flag (run() context / ECC_HOOK_INPUT_TRUNCATED), so config-protection keeps blocking truncated protected-config payloads (its test requires exit 2) while pass-through hooks fail open with empty stdout as before.

* style: apply repo formatter to touched hook files
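The pass-through behavior described in the commit message above (cap stdin, never echo a capped string, flush stdout before exiting) can be sketched roughly as follows. This is a minimal illustration, not the code in run-with-flags.js or any of the hooks: the helper names `capInput`, `passThroughOutput`, and `writeThenExit` are hypothetical, and the cap here counts string characters rather than bytes for simplicity.

```javascript
'use strict';

// 1MB cap, as stated in the commit message. The constant name is an assumption.
const MAX_STDIN = 1024 * 1024;

/**
 * Hypothetical helper: accumulate input chunks up to `cap` characters and
 * record whether anything had to be dropped.
 */
function capInput(chunks, cap = MAX_STDIN) {
  let data = '';
  let truncated = false;
  for (const chunk of chunks) {
    const remaining = cap - data.length;
    if (chunk.length > remaining) {
      data += chunk.slice(0, Math.max(remaining, 0));
      truncated = true;
    } else {
      data += chunk;
    }
  }
  return { data, truncated };
}

/**
 * Hypothetical helper: decide what a pass-through hook writes to stdout.
 * A truncated string is a JSON document cut mid-stream, so the hook emits
 * nothing (no opinion) instead of echoing it.
 */
function passThroughOutput(raw, truncated) {
  return truncated ? '' : raw;
}

/**
 * Hypothetical helper: exit only after the write callback fires, so output
 * larger than the pipe buffer is not cut off by an immediate exit.
 * `stream` and `exit` are injectable to keep the helper testable.
 */
function writeThenExit(output, code, stream = process.stdout, exit = process.exit) {
  if (output.length === 0) {
    exit(code);
    return;
  }
  stream.write(output, () => exit(code));
}
```

A hook built this way would call `capInput` on the chunks read from stdin, then `writeThenExit(passThroughOutput(data, truncated), 0)`, keeping exit code 0 (fail-open) in both the truncated and the untruncated case.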
204 lines
6.7 KiB
JavaScript
/**
 * Regression tests for #2090: "Stop hook error: JSON validation failed".
 *
 * Stop hooks follow the ECC pass-through convention (echo stdin on stdout).
 * The Stop payload carries `last_assistant_message`, which can be large; any
 * hook that caps stdin and echoes the capped string emits a JSON document cut
 * mid-stream, which the harness reports as a Stop hook JSON validation
 * failure. Worst offender: cost-tracker capped stdin at 64KB, so any Stop
 * payload with a >64KB final assistant message broke the whole Stop chain.
 *
 * Contract under test: for every Stop hook, stdout is either empty or valid
 * JSON, and the exit code is 0, for realistic large payloads and for
 * oversized (>1MB) payloads, via the production runner and via direct
 * invocation.
 */

'use strict';

const assert = require('assert');
const fs = require('fs');
const os = require('os');
const path = require('path');
const { spawnSync } = require('child_process');

const repoRoot = path.join(__dirname, '..', '..');
const runner = path.join(repoRoot, 'scripts', 'hooks', 'run-with-flags.js');

const MAX_STDIN = 1024 * 1024;

const workDir = fs.mkdtempSync(path.join(os.tmpdir(), 'ecc-stop-stdout-')); // non-git cwd
const dataHome = fs.mkdtempSync(path.join(os.tmpdir(), 'ecc-stop-data-'));

function test(name, fn) {
  try {
    fn();
    console.log(` ✓ ${name}`);
    return true;
  } catch (error) {
    console.log(` ✗ ${name}`);
    console.log(` Error: ${error.message}`);
    return false;
  }
}

function stopPayload(messageBytes) {
  return JSON.stringify({
    session_id: `stop-stdout-test-${process.pid}`,
    transcript_path: path.join(workDir, 'missing-transcript.jsonl'),
    cwd: workDir,
    hook_event_name: 'Stop',
    stop_hook_active: false,
    last_assistant_message: 'm'.repeat(messageBytes)
  });
}

function hookEnv() {
  const env = {
    ...process.env,
    ECC_HOOK_PROFILE: 'standard',
    ECC_AGENT_DATA_HOME: dataHome,
    CLAUDE_SESSION_ID: `stop-stdout-test-${process.pid}`
  };
  delete env.ECC_GATEGUARD;
  delete env.ECC_DISABLED_HOOKS;
  return env;
}

function runViaRunner(hookId, script, input) {
  return spawnSync('node', [runner, hookId, script, 'minimal,standard,strict'], {
    input,
    encoding: 'utf8',
    cwd: workDir,
    env: hookEnv(),
    timeout: 60000,
    maxBuffer: 16 * 1024 * 1024,
    stdio: ['pipe', 'pipe', 'pipe']
  });
}

function runDirect(script, input) {
  return spawnSync('node', [path.join(repoRoot, script)], {
    input,
    encoding: 'utf8',
    cwd: workDir,
    env: hookEnv(),
    timeout: 60000,
    maxBuffer: 16 * 1024 * 1024,
    stdio: ['pipe', 'pipe', 'pipe']
  });
}

function assertStdoutContract(result, label) {
  assert.strictEqual(result.status, 0, `${label}: expected exit 0, got ${result.status}: ${result.stderr}`);
  if (result.stdout.length > 0) {
    try {
      JSON.parse(result.stdout);
    } catch (err) {
      assert.fail(`${label}: stdout is non-empty but not valid JSON (${err.message}); first 120 chars: ${result.stdout.slice(0, 120)}`);
    }
  }
}

// All registered Stop hooks (hooks/hooks.json).
const STOP_HOOKS = [
  ['stop:format-typecheck', 'scripts/hooks/stop-format-typecheck.js'],
  ['stop:check-console-log', 'scripts/hooks/check-console-log.js'],
  ['stop:session-end', 'scripts/hooks/session-end.js'],
  ['stop:evaluate-session', 'scripts/hooks/evaluate-session.js'],
  ['stop:cost-tracker', 'scripts/hooks/cost-tracker.js']
  // stop:desktop-notify is excluded from the valid-payload run because a
  // successful run() fires a real OS notification; its truncation path is
  // covered separately below (run() bails on JSON.parse before notifying).
];

// Direct-invocation legacy paths that echo stdin.
const ECHOING_STOP_HOOKS = [
  'scripts/hooks/stop-format-typecheck.js',
  'scripts/hooks/check-console-log.js',
  'scripts/hooks/cost-tracker.js',
  'scripts/hooks/desktop-notify.js'
];

console.log('\nStop hook stdout contract tests (#2090):');

let passed = 0;
let failed = 0;

// A 100KB last_assistant_message is a realistic long-session Stop payload.
// Before the fix, cost-tracker echoed it cut at 64KB through the production
// runner path, making the harness report "JSON validation failed".
const realisticPayload = stopPayload(100 * 1024);

for (const [hookId, script] of STOP_HOOKS) {
  if (
    test(`${hookId} via runner keeps stdout valid for a 100KB Stop payload`, () => {
      const result = runViaRunner(hookId, script, realisticPayload);
      assertStdoutContract(result, hookId);
      if (result.stdout.length > 0) {
        assert.strictEqual(result.stdout, realisticPayload, `${hookId}: pass-through must echo the payload uncut`);
      }
    })
  )
    passed++;
  else failed++;
}

const oversizedPayload = stopPayload(MAX_STDIN + 64 * 1024);

for (const [hookId, script] of [...STOP_HOOKS, ['stop:desktop-notify', 'scripts/hooks/desktop-notify.js']]) {
  if (
    test(`${hookId} via runner fails open on a >1MB Stop payload`, () => {
      const result = runViaRunner(hookId, script, oversizedPayload);
      assert.strictEqual(result.status, 0, `${hookId}: expected exit 0, got ${result.status}: ${result.stderr}`);
      assert.strictEqual(result.stdout, '', `${hookId}: oversized payloads must not be echoed`);
    })
  )
    passed++;
  else failed++;
}

for (const script of ECHOING_STOP_HOOKS) {
  if (
    test(`${path.basename(script)} invoked directly never echoes truncated stdin`, () => {
      const result = runDirect(script, oversizedPayload);
      assert.strictEqual(result.status, 0, `${script}: expected exit 0, got ${result.status}: ${result.stderr}`);
      assert.strictEqual(result.stdout, '', `${script}: truncated stdin must not be echoed`);
    })
  )
    passed++;
  else failed++;
}

if (
  test('check-console-log invoked directly echoes a sub-cap >64KB payload uncut', () => {
    const result = runDirect('scripts/hooks/check-console-log.js', realisticPayload);
    assert.strictEqual(result.status, 0);
    assert.strictEqual(result.stdout, realisticPayload, 'pass-through must not be cut at the pipe buffer');
    JSON.parse(result.stdout);
  })
)
  passed++;
else failed++;

if (
  test('cost-tracker invoked directly echoes a sub-cap >64KB payload uncut', () => {
    const result = runDirect('scripts/hooks/cost-tracker.js', realisticPayload);
    assert.strictEqual(result.status, 0);
    assert.strictEqual(result.stdout, realisticPayload, 'the old 64KB cap must not cut realistic Stop payloads');
    JSON.parse(result.stdout);
  })
)
  passed++;
else failed++;

try {
  fs.rmSync(workDir, { recursive: true, force: true });
  fs.rmSync(dataHome, { recursive: true, force: true });
} catch {
  /* best-effort cleanup */
}

console.log(`\n ${passed} passed, ${failed} failed\n`);
process.exit(failed > 0 ? 1 : 0);