fix: stability batch — hook stdin truncation, Codex exa TOML, Stop hook JSON, GateGuard repetition (#2227)

* fix(hooks): fail open on oversized stdin instead of echoing truncated JSON (#2222)

run-with-flags.js capped stdin at 1MB but every fallthrough path still echoed the truncated string to stdout. The harness parses hook stdout as JSON, got a document cut mid-stream, and blocked the tool call, so any Edit/Write with a >1MB hook payload was permanently blocked by every registered pre-write hook, before ECC_HOOK_PROFILE / ECC_DISABLED_HOOKS gating could run.

- Exit 0 with empty stdout (no opinion) when the stdin cap trips, before any echo or gating logic.
- Flush stdout via write callback before process.exit: exiting right after stdout.write() dropped everything past the ~64KB pipe buffer, cutting even sub-cap pass-through payloads mid-JSON.

Regression tests cover the enabled, disabled, and missing-arg paths for oversized payloads plus full echo of sub-cap >64KB payloads.

* fix(codex): stop emitting invalid exa url entry, align merge with connector policy (#2224)

The Codex MCP merge declared exa with a url key, but Codex's [mcp_servers.*] TOML schema is stdio-only: the url key makes the entire config.toml fail to load, bricking both the codex CLI and the desktop app. Every install/update re-injected the line because the urlEntry branch treated the broken entry as present.

- ECC_SERVERS now emits only the current default set per docs/MCP-CONNECTOR-POLICY.md: chrome-devtools (stdio, command/args). Retired servers (supabase, playwright, context7, exa, github, memory, sequential-thinking) are never re-emitted; existing user-managed entries are untouched.
- The merge now repairs the exact ECC-emitted broken form (url-only exa entry) on every run so re-running the installer fixes broken configs instead of preserving them. User stdio exa entries (command + mcp-remote) are left alone.
- check-codex-global-state.sh requires chrome-devtools instead of the retired set, and flags url-only exa entries with a repair hint.

Tests cover repair, re-run idempotence, stdio-entry preservation, and no-retired-server emission in add, update, dry-run, and disabled modes.

* fix(hooks): never echo truncated stdin from Stop hooks (#2090)

Stop hooks follow the ECC pass-through convention (echo stdin on stdout), but every echoing Stop hook capped stdin and echoed the capped string. The Stop payload carries last_assistant_message, so a long final assistant message produced a JSON document cut mid-stream on stdout, which the harness reports as 'Stop hook error: JSON validation failed' across the whole Stop chain.

Reproduced: a Stop payload with a >64KB last_assistant_message run through run-with-flags + cost-tracker emitted exactly 65536 bytes of invalid JSON (cost-tracker capped stdin at 64KB, far below realistic Stop payloads).

- cost-tracker: raise the cap to 1MB (matching all other hooks) and suppress the pass-through echo when stdin was truncated.
- check-console-log, stop-format-typecheck, desktop-notify: suppress the echo when stdin was truncated; flush stdout before process.exit so sub-cap payloads are not cut at the ~64KB pipe buffer.
- All hooks keep exiting 0 (fail-open); diagnostics go to stderr.

New stop-hooks-stdout test asserts the contract for every registered Stop hook: stdout is empty or valid JSON, exit code 0, for realistic 100KB payloads and oversized >1MB payloads, via the production runner and via direct invocation. Updated the old hooks.test.js case that codified the truncated-echo behavior.

* fix(hooks): dampen GateGuard fact-force repetition in long sessions (#2142)

In long autonomous sessions the fact-force gate produced 10+ near-identical 'state facts -> blocked -> restate -> retry' blocks in one context window, which measurably raises the odds of the model collapsing into a degenerate single-token repetition loop.

- Track a per-session fact_force_denials counter in GateGuard state (merged max across concurrent writers, reset with the session, robust to malformed on-disk values).
- The first GATEGUARD_FACT_FORCE_FULL_DENIALS denials (default 3) keep the full four-fact block; later denials emit a condensed single-line message that carries the denial ordinal, so consecutive denials are structurally different and never textually identical.
- True retries of the same target remain allowed without re-prompting (unchanged).

Destructive-Bash and routine-Bash gates are unchanged, as are the ECC_GATEGUARD=off / ECC_DISABLED_HOOKS escape hatches.

Eight new tests cover budget counting, condensed format, ordinal advancement, retry pass-through, env tuning, malformed state, MultiEdit dampening, and destructive-gate exemption.

* fix(hooks): keep security hooks able to block on oversized stdin (#2222)

Refine the truncation fail-open: instead of skipping the hook entirely, the runner now suppresses only its own raw-echo when stdin was truncated. The hook still executes and receives the truncated flag (run() context / ECC_HOOK_INPUT_TRUNCATED), so config-protection keeps blocking truncated protected-config payloads (its test requires exit 2) while pass-through hooks fail open with empty stdout as before.

* style: apply repo formatter to touched hook files
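
The pass-through pattern behind #2222 and #2090 can be sketched roughly as follows. This is an illustration only, not the code in run-with-flags.js: MAX_STDIN, capInput, passThroughOutput, and finish are hypothetical names, and only the 1MB cap, the empty-stdout fail-open, and the write-callback flush come from the commit message.

```javascript
'use strict';

// Hypothetical constant; the commit message says hooks cap stdin at 1MB.
const MAX_STDIN = 1024 * 1024;

// Keep at most `limit` characters and record whether anything was dropped.
function capInput(raw, limit = MAX_STDIN) {
  const truncated = raw.length > limit;
  return { data: truncated ? raw.slice(0, limit) : raw, truncated };
}

// Pass-through hooks echo stdin unchanged. A truncated document is no longer
// valid JSON, so this sketch emits nothing ("no opinion") in that case.
function passThroughOutput(input) {
  return input.truncated ? '' : input.data;
}

// Exit only after the write callback fires, so output larger than the pipe
// buffer is not cut off by an early process.exit().
function finish(output, code = 0) {
  if (!output) {
    process.exit(code);
  }
  process.stdout.write(output, () => process.exit(code));
}

// Demo on in-memory payloads (finish is intentionally not called here).
const small = capInput(JSON.stringify({ tool: 'Edit' }));
const big = capInput('x'.repeat(MAX_STDIN + 1));
console.log(passThroughOutput(small));      // prints {"tool":"Edit"}
console.log(passThroughOutput(big).length); // prints 0
```

A hook built this way would end with something like finish(passThroughOutput(input), 0) once stdin has been read, and would write its truncation diagnostic to stderr rather than stdout.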
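
The GateGuard dampening in #2142 might look roughly like the sketch below. Only the fact_force_denials counter, the GATEGUARD_FACT_FORCE_FULL_DENIALS variable, and its default of 3 come from the commit message; the function names, the message wording, and the placeholder fact lines are assumptions made for illustration.

```javascript
'use strict';

// Default budget taken from the commit message.
const DEFAULT_FULL_DENIALS = 3;

// Treat anything that is not a non-negative integer as 0, so a malformed
// on-disk value cannot break the gate.
function sanitizeCount(value) {
  return Number.isInteger(value) && value >= 0 ? value : 0;
}

// Concurrent writers may hold stale copies; keeping the larger count means
// a merge never loses denials.
function mergeCounts(a, b) {
  return Math.max(sanitizeCount(a), sanitizeCount(b));
}

// Read the budget from the environment, falling back to the default.
function fullDenialBudget(env = process.env) {
  const parsed = Number.parseInt(env.GATEGUARD_FACT_FORCE_FULL_DENIALS, 10);
  return Number.isInteger(parsed) && parsed >= 0 ? parsed : DEFAULT_FULL_DENIALS;
}

// Build the text for the next denial and return the updated state.
// The fact lines are placeholders, not the real GateGuard wording.
function denyMessage(state, target, budget = DEFAULT_FULL_DENIALS) {
  const ordinal = sanitizeCount(state.fact_force_denials) + 1;
  const message = ordinal <= budget
    ? [
        `[gate] Before editing ${target}, state these facts:`,
        '1. (placeholder fact)',
        '2. (placeholder fact)',
        '3. (placeholder fact)',
        '4. (placeholder fact)'
      ].join('\n')
    : `[gate] denial #${ordinal}: state the facts for ${target}, then retry.`;
  return { state: { ...state, fact_force_denials: ordinal }, message };
}

// Demo: five denials in a row, starting from a malformed counter.
let state = { fact_force_denials: 'oops' };
for (let i = 0; i < 5; i += 1) {
  const result = denyMessage(state, 'src/app.js');
  state = result.state;
  console.log(result.message.split('\n').length); // prints 5, 5, 5, 1, 1
}
```

Because the condensed form embeds the ordinal, two consecutive condensed denials in this sketch never produce the same string, which is the property the commit describes.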
2026-06-12 19:23:07 +08:00 · 2026-06-11 00:31:33 -04:00
parent 3bdb4a5e12
commit 6319c7d309
14 changed files with 846 additions and 151 deletions
--- a/scripts/codex/check-codex-global-state.sh
+++ b/scripts/codex/check-codex-global-state.sh
@@ -107,11 +107,11 @@ if [[ -f "$CONFIG_FILE" ]]; then
  check_config_pattern '^\[profiles\.strict\]' "profiles.strict exists"
  check_config_pattern '^\[profiles\.yolo\]' "profiles.yolo exists"

+  # Current default connector set (docs/MCP-CONNECTOR-POLICY.md): exactly
+  # one connector. Former defaults (github, memory, sequential-thinking,
+  # context7, exa, ...) are opt-in user choices, so they are not required.
  for section in \
-    'mcp_servers.github' \
-    'mcp_servers.memory' \
-    'mcp_servers.sequential-thinking' \
-    'mcp_servers.context7'
+    'mcp_servers.chrome-devtools'
  do
    if search_file "^\[$section\]" "$CONFIG_FILE"; then
      ok "MCP section [$section] exists"
@@ -120,25 +120,17 @@ if [[ -f "$CONFIG_FILE" ]]; then
    fi
  done

-  has_context7_legacy=0
-  has_context7_current=0
-
-  if search_file '^\[mcp_servers\.context7\]' "$CONFIG_FILE"; then
-    has_context7_legacy=1
-  fi
-
-  if search_file '^\[mcp_servers\.context7-mcp\]' "$CONFIG_FILE"; then
-    has_context7_current=1
-  fi
-
-  if [[ "$has_context7_legacy" -eq 1 || "$has_context7_current" -eq 1 ]]; then
-    ok "MCP section [mcp_servers.context7] or [mcp_servers.context7-mcp] exists"
-  else
-    fail "MCP section [mcp_servers.context7] or [mcp_servers.context7-mcp] missing"
-  fi
-
-  if [[ "$has_context7_legacy" -eq 1 && "$has_context7_current" -eq 1 ]]; then
-    warn "Both [mcp_servers.context7] and [mcp_servers.context7-mcp] exist; prefer one name"
+  # ECC <= 2.0.0 emitted a url-only exa entry that Codex's stdio-only
+  # schema rejects, breaking the whole config (#2224). Flag it so users
+  # re-run the sync (which repairs it) or remove it manually.
+  if search_file '^\[mcp_servers\.exa\]' "$CONFIG_FILE"; then
+    exa_block="$(awk '/^\[mcp_servers\.exa\]/{flag=1;next}/^\[/{flag=0}flag' "$CONFIG_FILE")"
+    if printf '%s\n' "$exa_block" | grep -Eq '^[[:space:]]*url[[:space:]]*=' \
+      && ! printf '%s\n' "$exa_block" | grep -Eq '^[[:space:]]*command[[:space:]]*='; then
+      fail "MCP section [mcp_servers.exa] uses a url key, which Codex rejects for stdio servers — re-run ecc-sync-codex to repair (#2224)"
+    else
+      ok "MCP section [mcp_servers.exa] uses the stdio form"
+    fi
  fi
 fi