Commit Graph

12 Commits

Author SHA1 Message Date
Affaan Mustafa
6319c7d309 fix: stability batch — hook stdin truncation, Codex exa TOML, Stop hook JSON, GateGuard repetition (#2227)
* fix(hooks): fail open on oversized stdin instead of echoing truncated JSON (#2222)

run-with-flags.js capped stdin at 1MB but every fallthrough path still
echoed the truncated string to stdout. The harness parses hook stdout as
JSON, got a document cut mid-stream, and blocked the tool call — so any
Edit/Write with a >1MB hook payload was permanently blocked by every
registered pre-write hook, before ECC_HOOK_PROFILE / ECC_DISABLED_HOOKS
gating could run.

- Exit 0 with empty stdout (no opinion) when the stdin cap trips, before
  any echo or gating logic.
- Flush stdout via write callback before process.exit: exiting right
  after stdout.write() dropped everything past the ~64KB pipe buffer,
  cutting even sub-cap pass-through payloads mid-JSON.

Regression tests cover the enabled, disabled, and missing-arg paths for
oversized payloads plus full echo of sub-cap >64KB payloads.

* fix(codex): stop emitting invalid exa url entry, align merge with connector policy (#2224)

The Codex MCP merge declared exa with a url key, but Codex's
[mcp_servers.*] TOML schema is stdio-only — the url key makes the
entire config.toml fail to load, bricking both the codex CLI and the
desktop app. Every install/update re-injected the line because the
urlEntry branch treated the broken entry as present.

- ECC_SERVERS now emits only the current default set per
  docs/MCP-CONNECTOR-POLICY.md: chrome-devtools (stdio, command/args).
  Retired servers (supabase, playwright, context7, exa, github, memory,
  sequential-thinking) are never re-emitted; existing user-managed
  entries are untouched.
- The merge now repairs the exact ECC-emitted broken form (url-only
  exa entry) on every run so re-running the installer fixes broken
  configs instead of preserving them. User stdio exa entries
  (command + mcp-remote) are left alone.
- check-codex-global-state.sh requires chrome-devtools instead of the
  retired set, and flags url-only exa entries with a repair hint.

Tests cover repair, re-run idempotence, stdio-entry preservation, and
no-retired-server emission in add, update, dry-run, and disabled modes.

* fix(hooks): never echo truncated stdin from Stop hooks (#2090)

Stop hooks follow the ECC pass-through convention (echo stdin on
stdout), but every echoing Stop hook capped stdin and echoed the capped
string. The Stop payload carries last_assistant_message, so a long
final assistant message produced a JSON document cut mid-stream on
stdout, which the harness reports as 'Stop hook error: JSON validation
failed' across the whole Stop chain.

Reproduced: a Stop payload with a >64KB last_assistant_message run
through run-with-flags + cost-tracker emitted exactly 65536 bytes of
invalid JSON (cost-tracker capped stdin at 64KB — far below realistic
Stop payloads).

- cost-tracker: raise the cap to 1MB (matching all other hooks) and
  suppress the pass-through echo when stdin was truncated.
- check-console-log, stop-format-typecheck, desktop-notify: suppress
  the echo when stdin was truncated; flush stdout before process.exit
  so sub-cap payloads are not cut at the ~64KB pipe buffer.
- All hooks keep exiting 0 (fail-open); diagnostics go to stderr.

New stop-hooks-stdout test asserts the contract for every registered
Stop hook: stdout is empty or valid JSON, exit code 0 — for realistic
100KB payloads and oversized >1MB payloads, via the production runner
and via direct invocation. Updated the old hooks.test.js case that
codified the truncated-echo behavior.

* fix(hooks): dampen GateGuard fact-force repetition in long sessions (#2142)

In long autonomous sessions the fact-force gate produced 10+
near-identical 'state facts -> blocked -> restate -> retry' blocks in
one context window, which measurably raises the odds of the model
collapsing into a degenerate single-token repetition loop.

- Track a per-session fact_force_denials counter in GateGuard state
  (merged max across concurrent writers, reset with the session, robust
  to malformed on-disk values).
- The first GATEGUARD_FACT_FORCE_FULL_DENIALS denials (default 3) keep
  the full four-fact block; later denials emit a condensed single-line
  message that carries the denial ordinal, so consecutive denials are
  structurally different and never textually identical.
- True retries of the same target remain allowed without re-prompting
  (unchanged). Destructive-Bash and routine-Bash gates are unchanged,
  as are the ECC_GATEGUARD=off / ECC_DISABLED_HOOKS escape hatches.

Eight new tests cover budget counting, condensed format, ordinal
advancement, retry pass-through, env tuning, malformed state, MultiEdit
dampening, and destructive-gate exemption.

* fix(hooks): keep security hooks able to block on oversized stdin (#2222)

Refine the truncation fail-open: instead of skipping the hook entirely,
the runner now suppresses only its own raw-echo when stdin was
truncated. The hook still executes and receives the truncated flag
(run() context / ECC_HOOK_INPUT_TRUNCATED), so config-protection keeps
blocking truncated protected-config payloads (its test requires exit 2)
while pass-through hooks fail open with empty stdout as before.

* style: apply repo formatter to touched hook files
2026-06-11 00:31:33 -04:00
Affaan Mustafa
ae02b26cf9 test: cover mcp config merge edges 2026-04-29 18:57:55 -04:00
Affaan Mustafa
cc89c40751 test: cover codex config merge edges 2026-04-29 18:51:56 -04:00
Affaan Mustafa
9a6080f2e1 feat: sync the codex baseline and agent roles 2026-04-01 16:08:03 -07:00
Affaan Mustafa
6cc85ef2ed fix: CI fixes, security audit, remotion skill, lead-intelligence, npm audit (#1039)
* fix(ci): resolve cross-platform test failures

- Sanity check script (check-codex-global-state.sh) now falls back to
  grep -E when ripgrep is not available, fixing the codex-hooks sync
  test on all CI platforms. Patterns converted to POSIX ERE for
  portability.
- Unicode safety test accepts both / and \ path separators so the
  executable-file assertion passes on Windows.
- Gacha test sets PYTHONUTF8=1 so Python uses UTF-8 stdout encoding on
  Windows instead of cp1252, preventing UnicodeEncodeError on box-drawing
  characters.
- Quoted-hook-path test skipped on Windows where NTFS disallows
  double-quote characters in filenames.

* feat: port remotion-video-creation skill (29 rules), restore missing files

New skill:
- remotion-video-creation: 29 domain-specific Remotion rules covering 3D/Three.js,
  animations, audio, captions, charts, compositions, fonts, GIFs, Lottie,
  measuring, sequencing, tailwind, text animations, timing, transitions,
  trimming, and video embedding. Ported from personal skills.

Restored:
- autonomous-agent-harness/SKILL.md (was in commit but missing from worktree)
- lead-intelligence/ (full directory restored from branch commit)

Updated:
- manifests/install-modules.json: added remotion-video-creation to media-generation
- README.md + AGENTS.md: synced counts to 139 skills

Catalog validates: 30 agents, 60 commands, 139 skills.

* fix(security): pin MCP server versions, add dependabot, pin github-script SHA

Critical:
- Pin all npx -y MCP server packages to specific versions in .mcp.json
  to prevent supply chain attacks via version hijacking:
  - @modelcontextprotocol/server-github@2025.4.8
  - @modelcontextprotocol/server-memory@2026.1.26
  - @modelcontextprotocol/server-sequential-thinking@2025.12.18
  - @playwright/mcp@0.0.69 (was 0.0.68)

Medium:
- Add .github/dependabot.yml for weekly npm + github-actions updates
  with grouped minor/patch PRs
- Pin actions/github-script to SHA (was @v7 tag, now pinned to commit)

* feat: add social-graph-ranker skill — weighted network proximity scoring

New skill: social-graph-ranker
- Weighted social graph traversal with exponential decay across hops
- Bridge Score: B(m) = Σ w(t) · λ^(d(m,t)-1) ranks mutuals by target proximity
- Extended Score incorporates 2nd-order network (mutual-of-mutual connections)
- Final ranking includes engagement bonus for responsive connections
- Runs in parallel with lead-intelligence skill for combined warm+cold outreach
- Supports X API + LinkedIn CSV for graph harvesting
- Outputs tiered action list: warm intros, direct outreach, network gap analysis

Added to business-content install module. Catalog validates: 30/60/140.

* fix(security): npm audit fix — resolve all dependency vulnerabilities

Applied npm audit fix --force to resolve:
- minimatch ReDoS (3 vulnerabilities, HIGH)
- smol-toml DoS (MODERATE)
- brace-expansion memory exhaustion (MODERATE)
- markdownlint-cli upgraded from 0.47.0 to 0.48.0

npm audit now reports 0 vulnerabilities.

* fix: resolve markdown lint and yarn lockfile sync

- MD047: ensure single trailing newline on all remotion rule files
- MD012: remove consecutive blank lines in lottie, measuring-dom-nodes, trimming
- MD034: wrap bare URLs in angle brackets (tailwind, transcribe-captions)
- yarn.lock: regenerated to sync with npm audit changes in package.json

* fix: replace unicode arrows in lead-intelligence (CI unicode safety check)
2026-03-31 15:08:55 -04:00
Affaan Mustafa
e68233cd5d fix(ci): harden codex hook regression test (#1028) 2026-03-30 14:21:40 -04:00
Affaan Mustafa
7253d0ca98 test: isolate codex hook sync env (#1023) 2026-03-30 04:31:09 -04:00
Affaan Mustafa
8846210ca2 fix: unblock unicode safety CI lint (#1017)
* fix: unblock unicode safety CI lint

* fix: unblock shared CI regressions
2026-03-30 01:50:17 -04:00
Affaan Mustafa
0ebcfc368e fix(codex): broaden context7 config checks 2026-03-29 00:26:16 -04:00
Affaan Mustafa
b19b4c6b5e fix: finish blocker lane hook and install regressions 2026-03-25 04:00:50 -04:00
Affaan Mustafa
9c5ca92e6e fix: finish hook fallback and canonical session follow-ups 2026-03-25 03:44:03 -04:00
Affaan Mustafa
1d0aa5ac2a fix: fold session manager blockers into one candidate 2026-03-24 23:08:27 -04:00