fix: stability batch — hook stdin truncation, Codex exa TOML, Stop hook JSON, GateGuard repetition (#2227)

* fix(hooks): fail open on oversized stdin instead of echoing truncated JSON (#2222)

run-with-flags.js capped stdin at 1MB but every fallthrough path still
echoed the truncated string to stdout. The harness parses hook stdout as
JSON, got a document cut mid-stream, and blocked the tool call — so any
Edit/Write with a >1MB hook payload was permanently blocked by every
registered pre-write hook, before ECC_HOOK_PROFILE / ECC_DISABLED_HOOKS
gating could run.

- Exit 0 with empty stdout (no opinion) when the stdin cap trips, before
  any echo or gating logic.
- Flush stdout via write callback before process.exit: exiting right
  after stdout.write() dropped everything past the ~64KB pipe buffer,
  cutting even sub-cap pass-through payloads mid-JSON.

Regression tests cover the enabled, disabled, and missing-arg paths for
oversized payloads plus full echo of sub-cap >64KB payloads.
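
A minimal sketch of the two fixes above, not the actual run-with-flags.js code. The names collectCapped, passThroughOutput, and writeThenExit are hypothetical, and the cap is measured in string length rather than bytes for brevity.

```javascript
// Hypothetical helper names; only the 1MB cap comes from the description above.
const MAX_STDIN = 1024 * 1024;

// Accumulate chunks up to the cap and record whether anything was dropped.
function collectCapped(chunks, max = MAX_STDIN) {
  let raw = '';
  let truncated = false;
  for (const chunk of chunks) {
    const remaining = max - raw.length;
    if (chunk.length > remaining) {
      raw += chunk.slice(0, Math.max(remaining, 0));
      truncated = true;
      break;
    }
    raw += chunk;
  }
  return { raw, truncated };
}

// Pass-through output: echo only complete input. Empty stdout means "no opinion".
function passThroughOutput({ raw, truncated }) {
  return truncated ? '' : raw;
}

// Exit only from the write callback, so a large payload is not cut off
// by exiting while data is still queued for the pipe.
function writeThenExit(output, code = 0) {
  if (!output) {
    process.exit(code);
  }
  process.stdout.write(output, () => process.exit(code));
}
```

With this shape, an oversized payload yields exit 0 and empty stdout, while a sub-cap payload is written in full before the process exits.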

* fix(codex): stop emitting invalid exa url entry, align merge with connector policy (#2224)

The Codex MCP merge declared exa with a url key, but Codex's
[mcp_servers.*] TOML schema is stdio-only — the url key makes the
entire config.toml fail to load, bricking both the codex CLI and the
desktop app. Every install/update re-injected the line because the
urlEntry branch treated the broken entry as present.

- ECC_SERVERS now emits only the current default set per
  docs/MCP-CONNECTOR-POLICY.md: chrome-devtools (stdio, command/args).
  Retired servers (supabase, playwright, context7, exa, github, memory,
  sequential-thinking) are never re-emitted; existing user-managed
  entries are untouched.
- The merge now repairs the exact ECC-emitted broken form (url-only
  exa entry) on every run so re-running the installer fixes broken
  configs instead of preserving them. User stdio exa entries
  (command + mcp-remote) are left alone.
- check-codex-global-state.sh requires chrome-devtools instead of the
  retired set, and flags url-only exa entries with a repair hint.

Tests cover repair, re-run idempotence, stdio-entry preservation, and
no-retired-server emission in add, update, dry-run, and disabled modes.
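
The repair condition can be sketched as a pure check that mirrors the isBrokenEccForm test in the merge script. The function name findBrokenEntries and the sample entries are hypothetical.

```javascript
// The one url-only form that earlier ECC versions emitted (#2224).
const RETIRED_INVALID_URL_SERVERS = {
  exa: 'https://mcp.exa.ai/mcp'
};

// Return the names of parsed [mcp_servers.*] entries that match the broken
// ECC-emitted form exactly: the known url and no stdio command.
function findBrokenEntries(existing) {
  return Object.entries(RETIRED_INVALID_URL_SERVERS)
    .filter(([name, invalidUrl]) => {
      const entry = existing[name];
      return Boolean(entry) && entry.url === invalidUrl && typeof entry.command !== 'string';
    })
    .map(([name]) => name);
}
```

Matching on the exact url and the absence of command keeps the repair narrow: a user-managed stdio exa entry (command + mcp-remote) never matches, and a second run finds nothing left to remove.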

* fix(hooks): never echo truncated stdin from Stop hooks (#2090)

Stop hooks follow the ECC pass-through convention (echo stdin on
stdout), but every echoing Stop hook capped stdin and echoed the capped
string. The Stop payload carries last_assistant_message, so a long
final assistant message produced a JSON document cut mid-stream on
stdout, which the harness reports as 'Stop hook error: JSON validation
failed' across the whole Stop chain.

Reproduced: a Stop payload with a >64KB last_assistant_message run
through run-with-flags + cost-tracker emitted exactly 65536 bytes of
invalid JSON (cost-tracker capped stdin at 64KB — far below realistic
Stop payloads).

- cost-tracker: raise the cap to 1MB (matching all other hooks) and
  suppress the pass-through echo when stdin was truncated.
- check-console-log, stop-format-typecheck, desktop-notify: suppress
  the echo when stdin was truncated; flush stdout before process.exit
  so sub-cap payloads are not cut at the ~64KB pipe buffer.
- All hooks keep exiting 0 (fail-open); diagnostics go to stderr.

New stop-hooks-stdout test asserts the contract for every registered
Stop hook: stdout is empty or valid JSON, exit code 0 — for realistic
100KB payloads and oversized >1MB payloads, via the production runner
and via direct invocation. Updated the old hooks.test.js case that
codified the truncated-echo behavior.
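
The contract the new test asserts can be expressed as a small predicate. This is a sketch only; the name meetsStopHookContract and the shape of its argument are assumptions, not the code in the stop-hooks-stdout test.

```javascript
// A Stop hook satisfies the contract when it exits 0 and its stdout is
// either empty or a complete JSON document.
function meetsStopHookContract({ status, stdout }) {
  if (status !== 0) return false;
  if (stdout.trim() === '') return true;
  try {
    JSON.parse(stdout);
    return true;
  } catch {
    return false;
  }
}
```

A payload cut at 65536 bytes fails this check because the JSON string is never closed, which is the failure the harness reported as 'JSON validation failed'.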

* fix(hooks): dampen GateGuard fact-force repetition in long sessions (#2142)

In long autonomous sessions the fact-force gate produced 10+
near-identical 'state facts -> blocked -> restate -> retry' blocks in
one context window, which measurably raises the odds of the model
collapsing into a degenerate single-token repetition loop.

- Track a per-session fact_force_denials counter in GateGuard state
  (merged max across concurrent writers, reset with the session, robust
  to malformed on-disk values).
- The first GATEGUARD_FACT_FORCE_FULL_DENIALS denials (default 3) keep
  the full four-fact block; later denials emit a condensed single-line
  message that carries the denial ordinal, so consecutive denials are
  structurally different and never textually identical.
- True retries of the same target remain allowed without re-prompting
  (unchanged). Destructive-Bash and routine-Bash gates are unchanged,
  as are the ECC_GATEGUARD=off / ECC_DISABLED_HOOKS escape hatches.

Eight new tests cover budget counting, condensed format, ordinal
advancement, retry pass-through, env tuning, malformed state, MultiEdit
dampening, and destructive-gate exemption.
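
A minimal sketch of the dampening rule, assuming the counter is stored as a plain integer. Only GATEGUARD_FACT_FORCE_FULL_DENIALS and its default of 3 come from the change; the function names and message wording are hypothetical placeholders.

```javascript
const DEFAULT_FULL_DENIALS = 3;

// Treat anything that is not a non-negative integer as 0 (malformed on-disk state).
function sanitizeCount(value) {
  return Number.isInteger(value) && value >= 0 ? value : 0;
}

// Concurrent writers merge by taking the larger counter.
function mergeCounts(a, b) {
  return Math.max(sanitizeCount(a), sanitizeCount(b));
}

// Read the full-block budget from the environment, falling back to the default.
function fullDenialBudget(env) {
  const parsed = Number.parseInt(env.GATEGUARD_FACT_FORCE_FULL_DENIALS, 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : DEFAULT_FULL_DENIALS;
}

// Build the message for the next denial: the full block while within budget,
// then a single line carrying the ordinal so no two denials are identical.
function denialMessage(previousDenials, target, env = {}) {
  const ordinal = sanitizeCount(previousDenials) + 1;
  const text = ordinal <= fullDenialBudget(env)
    ? [
        `Blocked: state these facts before editing ${target}:`,
        '1. <fact one>',
        '2. <fact two>',
        '3. <fact three>',
        '4. <fact four>'
      ].join('\n')
    : `Blocked (denial #${ordinal}): restate the facts for ${target}, then retry.`;
  return { ordinal, text };
}
```

Because the ordinal is part of the condensed line, the fourth and fifth denials differ textually even when the target is the same.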

* fix(hooks): keep security hooks able to block on oversized stdin (#2222)

Refine the truncation fail-open: instead of skipping the hook entirely,
the runner now suppresses only its own raw-echo when stdin was
truncated. The hook still executes and receives the truncated flag
(run() context / ECC_HOOK_INPUT_TRUNCATED), so config-protection keeps
blocking truncated protected-config payloads (its test requires exit 2)
while pass-through hooks fail open with empty stdout as before.
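
The refined behavior can be sketched as follows. This is an illustration under assumptions: runHook, protectConfig, passThrough, and the file name protected.config.json are hypothetical stand-ins, and the real runner's interface may differ.

```javascript
// The hook always runs and is told whether stdin was truncated.
// Only the runner's own raw echo is suppressed on truncation.
function runHook(hook, raw, truncated) {
  const result = hook(raw, { truncated }) || {};
  if (result.exitCode === 2) {
    // A blocking decision wins, even for truncated input.
    return { exitCode: 2, stdout: '', stderr: result.stderr || '' };
  }
  return { exitCode: 0, stdout: truncated ? '' : raw, stderr: result.stderr || '' };
}

// Stand-in for a security hook: blocks when the payload names a protected file.
function protectConfig(raw) {
  return raw.includes('protected.config.json')
    ? { exitCode: 2, stderr: 'blocked: protected config' }
    : { exitCode: 0 };
}

// Stand-in for a pass-through hook with no opinion.
function passThrough() {
  return { exitCode: 0 };
}
```

The security hook can still return exit 2 on a truncated payload, while a pass-through hook on the same payload exits 0 with empty stdout.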

* style: apply repo formatter to touched hook files
Affaan Mustafa
2026-06-11 00:31:33 -04:00
committed by GitHub
parent 3bdb4a5e12
commit 6319c7d309
14 changed files with 846 additions and 151 deletions

@@ -107,11 +107,11 @@ if [[ -f "$CONFIG_FILE" ]]; then
 check_config_pattern '^\[profiles\.strict\]' "profiles.strict exists"
 check_config_pattern '^\[profiles\.yolo\]' "profiles.yolo exists"
+# Current default connector set (docs/MCP-CONNECTOR-POLICY.md): exactly
+# one connector. Former defaults (github, memory, sequential-thinking,
+# context7, exa, ...) are opt-in user choices, so they are not required.
 for section in \
-'mcp_servers.github' \
-'mcp_servers.memory' \
-'mcp_servers.sequential-thinking' \
-'mcp_servers.context7'
+'mcp_servers.chrome-devtools'
 do
 if search_file "^\[$section\]" "$CONFIG_FILE"; then
 ok "MCP section [$section] exists"
@@ -120,25 +120,17 @@ if [[ -f "$CONFIG_FILE" ]]; then
 fi
 done
-has_context7_legacy=0
-has_context7_current=0
-if search_file '^\[mcp_servers\.context7\]' "$CONFIG_FILE"; then
-has_context7_legacy=1
-fi
-if search_file '^\[mcp_servers\.context7-mcp\]' "$CONFIG_FILE"; then
-has_context7_current=1
-fi
-if [[ "$has_context7_legacy" -eq 1 || "$has_context7_current" -eq 1 ]]; then
-ok "MCP section [mcp_servers.context7] or [mcp_servers.context7-mcp] exists"
-else
-fail "MCP section [mcp_servers.context7] or [mcp_servers.context7-mcp] missing"
-fi
-if [[ "$has_context7_legacy" -eq 1 && "$has_context7_current" -eq 1 ]]; then
-warn "Both [mcp_servers.context7] and [mcp_servers.context7-mcp] exist; prefer one name"
+# ECC <= 2.0.0 emitted a url-only exa entry that Codex's stdio-only
+# schema rejects, breaking the whole config (#2224). Flag it so users
+# re-run the sync (which repairs it) or remove it manually.
+if search_file '^\[mcp_servers\.exa\]' "$CONFIG_FILE"; then
+exa_block="$(awk '/^\[mcp_servers\.exa\]/{flag=1;next}/^\[/{flag=0}flag' "$CONFIG_FILE")"
+if printf '%s\n' "$exa_block" | grep -Eq '^[[:space:]]*url[[:space:]]*=' \
+&& ! printf '%s\n' "$exa_block" | grep -Eq '^[[:space:]]*command[[:space:]]*='; then
+fail "MCP section [mcp_servers.exa] uses a url key, which Codex rejects for stdio servers — re-run ecc-sync-codex to repair (#2224)"
+else
+ok "MCP section [mcp_servers.exa] uses the stdio form"
+fi
 fi
 fi

@@ -65,14 +65,14 @@ const PM_EXEC_PARTS = PM_EXEC.split(/\s+/); // ["pnpm", "dlx"] or ["npx"] or ["b
 // ECC-recommended MCP servers
 // ---------------------------------------------------------------------------
-// GitHub bootstrap uses bash for token forwarding — this is intentionally
-// shell-based regardless of package manager, since Codex runs on macOS/Linux.
-const GH_BOOTSTRAP = `token=$(gh auth token 2>/dev/null || true); if [ -n "$token" ]; then export GITHUB_PERSONAL_ACCESS_TOKEN="$token"; fi; exec ${PM_EXEC} @modelcontextprotocol/server-github`;
 /**
 * Build a server spec with the detected package manager.
 * Returns { fields, toml } where fields is for drift detection and
 * toml is the raw text appended to the file.
+*
+* Codex's [mcp_servers.*] TOML schema is stdio-only (command/args) —
+* never emit a `url` key here. The http/url form is valid only for
+* Claude Code's .mcp.json (#2224).
 */
 function dlxServer(name, pkg, extraFields, extraToml) {
 const args = [...PM_EXEC_PARTS.slice(1), pkg];
@@ -87,31 +87,29 @@ function dlxServer(name, pkg, extraFields, extraToml) {
 const DEFAULT_MCP_STARTUP_TIMEOUT_SEC = 30;
 const DEFAULT_MCP_STARTUP_TIMEOUT_TOML = `startup_timeout_sec = ${DEFAULT_MCP_STARTUP_TIMEOUT_SEC}`;
+// Current default connector set (docs/MCP-CONNECTOR-POLICY.md): exactly one
+// connector. The former defaults (supabase, playwright, context7, exa,
+// github, memory, sequential-thinking) were retired in the June 2026 audit
+// and must not be re-emitted; they remain opt-in via
+// mcp-configs/mcp-servers.json. Existing user-managed entries are never
+// touched by the merge (add-only), except the known-invalid repair below.
 const ECC_SERVERS = {
-supabase: dlxServer('supabase', '@supabase/mcp-server-supabase@latest', { startup_timeout_sec: 20.0, tool_timeout_sec: 120.0 }, 'startup_timeout_sec = 20.0\ntool_timeout_sec = 120.0'),
-playwright: dlxServer('playwright', '@playwright/mcp@latest', { startup_timeout_sec: DEFAULT_MCP_STARTUP_TIMEOUT_SEC }, DEFAULT_MCP_STARTUP_TIMEOUT_TOML),
-context7: dlxServer('context7', '@upstash/context7-mcp@latest', { startup_timeout_sec: DEFAULT_MCP_STARTUP_TIMEOUT_SEC }, DEFAULT_MCP_STARTUP_TIMEOUT_TOML),
-exa: {
-fields: { url: 'https://mcp.exa.ai/mcp' },
-toml: `[mcp_servers.exa]\nurl = "https://mcp.exa.ai/mcp"`
-},
-github: {
-fields: { command: 'bash', args: ['-lc', GH_BOOTSTRAP], startup_timeout_sec: DEFAULT_MCP_STARTUP_TIMEOUT_SEC },
-toml: `[mcp_servers.github]\ncommand = "bash"\nargs = ["-lc", ${JSON.stringify(GH_BOOTSTRAP)}]\n${DEFAULT_MCP_STARTUP_TIMEOUT_TOML}`
-},
-memory: dlxServer('memory', '@modelcontextprotocol/server-memory', { startup_timeout_sec: DEFAULT_MCP_STARTUP_TIMEOUT_SEC }, DEFAULT_MCP_STARTUP_TIMEOUT_TOML),
-'sequential-thinking': dlxServer('sequential-thinking', '@modelcontextprotocol/server-sequential-thinking', { startup_timeout_sec: DEFAULT_MCP_STARTUP_TIMEOUT_SEC }, DEFAULT_MCP_STARTUP_TIMEOUT_TOML)
+'chrome-devtools': dlxServer('chrome-devtools', 'chrome-devtools-mcp@latest', { startup_timeout_sec: DEFAULT_MCP_STARTUP_TIMEOUT_SEC }, DEFAULT_MCP_STARTUP_TIMEOUT_TOML)
 };
-// Append --features arg for supabase after dlxServer builds the base
-ECC_SERVERS.supabase.fields.args.push('--features=account,docs,database,debugging,development,functions,storage,branching');
-ECC_SERVERS.supabase.toml = ECC_SERVERS.supabase.toml.replace(/^(args = \[.*)\]$/m, '$1, "--features=account,docs,database,debugging,development,functions,storage,branching"]');
+// ECC <= 2.0.0 emitted [mcp_servers.exa] with a `url` key. Codex rejects
+// `url` for stdio servers, which makes the *entire* config.toml fail to
+// load (#2224). Repair exactly that ECC-emitted form on every merge so
+// re-running the installer fixes broken configs instead of preserving
+// them. A user-managed stdio exa entry (command/args) is left untouched.
+const RETIRED_INVALID_URL_SERVERS = {
+exa: 'https://mcp.exa.ai/mcp'
+};
 // Legacy section names that should be treated as an existing ECC server.
-// e.g. older configs shipped [mcp_servers.context7-mcp] instead of [mcp_servers.context7].
-const LEGACY_ALIASES = {
-context7: ['context7-mcp']
-};
+// e.g. older configs shipped [mcp_servers.context7-mcp] instead of
+// [mcp_servers.context7]. Empty since the June 2026 default-set reduction.
+const LEGACY_ALIASES = {};
 // ---------------------------------------------------------------------------
 // Helpers
@@ -241,6 +239,21 @@ function main() {
 const toAppend = [];
 const toRemoveLog = [];
+// Repair schema-invalid entries emitted by earlier ECC versions (#2224).
+for (const [name, invalidUrl] of Object.entries(RETIRED_INVALID_URL_SERVERS)) {
+const entry = existing[name];
+const isBrokenEccForm =
+entry &&
+typeof entry.url === 'string' &&
+entry.url === invalidUrl &&
+typeof entry.command !== 'string';
+if (isBrokenEccForm) {
+toRemoveLog.push(`mcp_servers.${name} (invalid url entry from earlier ECC versions)`);
+raw = removeServerFromText(raw, name, existing);
+log(` [repair] mcp_servers.${name} — url is not valid for Codex stdio servers, removing`);
+}
+}
 for (const [name, spec] of Object.entries(ECC_SERVERS)) {
 const entry = existing[name];
 const aliases = LEGACY_ALIASES[name] || [];
@@ -249,7 +262,9 @@ function main() {
 // Prefer canonical entry over legacy alias
 const hasCanonical = entry && typeof entry.command === 'string';
 const resolvedEntry = hasCanonical ? entry : legacyName ? existing[legacyName] : null;
-// For URL-based servers (exa), check for url field instead of command
+// Recognize url-form entries as existing so they are never duplicated.
+// (Codex itself rejects url-form stdio servers; ECC only ever emits
+// command/args, but a user-managed entry must still count as present.)
 const urlEntry = !resolvedEntry && entry && typeof entry.url === 'string' ? entry : null;
 const finalEntry = resolvedEntry || urlEntry;
 const resolvedLabel = hasCanonical ? name : legacyName || name;
@@ -306,11 +321,13 @@ function main() {
 if (dryRun) {
 if (toRemoveLog.length > 0) {
-log('Dry run — would remove and re-add:');
+log('Dry run — would remove:');
 for (const label of toRemoveLog) log(` [remove] ${label}`);
 }
-log('Dry run — would append:');
-console.log(appendText);
+if (toAppend.length > 0) {
+log('Dry run — would append:');
+console.log(appendText);
+}
 return;
 }
@@ -325,7 +342,7 @@ function main() {
 }
 if (hasRemovals && toAppend.length === 0) {
-log(`Done. Removed ${toRemoveLog.length} disabled server(s).`);
+log(`Done. Removed ${toRemoveLog.length} server section(s).`);
 return;
 }