Codex CLI requires YAML frontmatter (---) in SKILL.md files.
6 skills were missing frontmatter entirely; laravel-verification had
a bare colon in its description causing an invalid YAML parse error.
- Remove prompt_file immediately after shell expansion into -p arg,
avoiding stale temp files during long analysis windows (greptile feedback)
- Update test assertion to check analysis_relpath instead of analysis_file,
matching the cross-platform relative path change from earlier commits
Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
The cd "$PROJECT_DIR" failure path returned without removing prompt_file
and analysis_file, leaving stale temp files in .observer-tmp/.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
Address reviewer feedback: under set +e, a failing cd would silently
leave CWD unchanged, causing the relative analysis path to break.
Add || return with a diagnostic log entry.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
Reviewers correctly identified that the relative analysis_relpath
(.observer-tmp/<file>) only resolves when CWD equals PROJECT_DIR.
Without an explicit cd, non-Windows users launching the observer from
a different directory would fail to read the analysis file.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
Address remaining issues from #842 after PR #903 moved temp files to
PROJECT_DIR/.observer-tmp:
Bug A (path resolution): Use relative paths (.observer-tmp/filename)
in the prompt instead of absolute paths from mktemp. On Windows
Git Bash/MSYS2, absolute paths use MSYS-style prefixes (/c/Users/...)
that the spawned Claude subprocess may fail to resolve.
Bug B (asks for permission): Add explicit IMPORTANT instruction block
at the prompt start telling the Haiku agent it is in non-interactive
--print mode and must use the Write tool directly without asking for
confirmation.
Additional improvements:
- Pass prompt via -p flag instead of stdin redirect for Windows compat
- Add .observer-tmp/ to .gitignore to prevent accidental commits
Fixes#842
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Lidang-Jiang <lidangjiang@gmail.com>
- Fix read-after-write in session-start.mjs: read prevSession BEFORE
overwriting current-session.json so unsaved-session detection fires
- Fix shell injection in resume.mjs: replace execSync shell string with
fs.existsSync for directory existence check
- Fix shell injection in shared.mjs gitSummary: replace nested \$(git ...)
subshell with a separate runGit() call to get rev count
- Fix displayName never shown: render functions now use ctx.displayName
?? ctx.name so user-supplied names show instead of the slug
- Fix renderListTable: uses context.displayName ?? entry.name
- Fix init.mjs: use path.basename() instead of cwd.split('/').pop()
- Fix save.mjs confirmation: show original name, not contextDir slug
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds the ck (Context Keeper) skill — deterministic Node.js scripts
that give Claude Code persistent, per-project memory across sessions.
Architecture:
- commands/ — 8 Node.js scripts handle all command logic (init, save,
resume, info, list, forget, migrate, shared). Claude calls scripts
and displays output — no LLM interpretation of command logic.
- hooks/session-start.mjs — injects ~100 token compact summary on
session start (not kilobytes). Detects unsaved sessions, git
activity since last save, goal mismatch vs CLAUDE.md.
- context.json as source of truth — CONTEXT.md is generated from it.
Full session history, session IDs, git activity per save.
Commands: /ck:init /ck:save /ck:resume /ck:info /ck:list /ck:forget /ck:migrate
Source: https://github.com/sreedhargs89/context-keeper
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Greptile fixes:
- Removed non-standard YAML frontmatter fields (observe, feedback, rollback) from all 4 skills — only name, description, origin, version per CONTRIBUTING.md
- Added null guard to checkInteractions implementation (was missing despite test)
- CI: replaced 2>/dev/null with 2>&1 (was silencing safety-critical errors)
- CI: quoted $RESULT variable (was breaking jq on JSON with spaces)
- CI: added division-by-zero guard when test suite is empty
- CI: added note that Jest is reference implementation, thresholds are framework-agnostic
CodeRabbit fixes (6 comments):
- All 4 skills: renamed 'When to Activate' → 'When to Use', added 'How It Works' and 'Examples' sections
- CDSS: DoseValidationResult.suggestedRange now typed as '| null'
- PHI: hyphenated 'Non-patient-sensitive'
Cubic fixes (7 issues):
- P1: CDSS weight-based check now BLOCKS when weight missing (was false-negative pass)
- P1: EMR medication safety clarified — critical = hard block, override requires documented reason
- P1: PHI logging guidance clarified — use opaque UUIDs only, not medical record numbers
- P2: CDSS validateDose now uses age and renal function params (ageAdjusted, renalAdjusted rules)
- P2: Eval CI example now enforces 95% threshold with jq + bc calculation
- P2: Eval CI example now includes --coverage --coverageThreshold on CDSS suite
- P2: CDSS suggestedRange null type fixed (same as CodeRabbit)
- Add laraplugins MCP server to mcp-configs/mcp-servers.json
- Create laravel-plugin-discovery skill for Laravel package discovery
- Supports searching by keyword, health score, Laravel/PHP version
- No API key required - free for Laravel community
* feat(skills): add santa-method
Multi-agent adversarial verification with convergence loop. Two independent review agents evaluate output against a shared rubric. Both must pass before shipping. Includes architecture diagram, implementation patterns (subagent, inline, batch sampling), domain-specific rubric extensions, failure mode mitigations, and integration guidance with existing ECC skills.
* Enhance SKILL.md with detailed Santa Method documentation
Expanded the SKILL.md documentation for the Santa Method, detailing architecture, phases, implementation patterns, failure modes, integration with other skills, metrics, and cost analysis.
* feat: add pending instinct TTL pruning and /prune command
Pending instincts generated by the observer accumulate indefinitely
with no cleanup mechanism. This adds lifecycle management:
- `instinct-cli.py prune` — delete pending instincts older than 30 days
(configurable via --max-age). Supports --dry-run and --quiet flags.
- Enhanced `status` command — shows pending count, warns at 5+,
highlights instincts expiring within 7 days.
- `observer-loop.sh` — runs prune before each analysis cycle.
- `/prune` slash command — user-facing command for manual pruning.
Design rationale: council consensus (4/4) rejected auto-promote in
favor of TTL-based garbage collection. Frequency of observation does
not establish correctness. Unreviewed pending instincts auto-delete
after 30 days; if the pattern is real, the observer will regenerate it.
Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
* fix: remove duplicate functions, broaden extension filter, fix prune output
- Remove duplicate _collect_pending_dirs and _parse_created_date defs
- Use ALLOWED_INSTINCT_EXTENSIONS (.md/.yaml/.yml) instead of .md-only
- Track actually-deleted items separately from expired for accurate output
- Update README.md and AGENTS.md command counts: 59 → 60
Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
* fix: address Copilot and CodeRabbit review findings
- Use is_dir() instead of exists() for pending path checks
- Change > to >= for --max-age boundary (--max-age 0 now prunes all)
- Use CLV2_PYTHON_CMD env var in observer-loop.sh prune call
- Remove unused source_dupes variable
- Remove extraneous f-string prefix on static string
Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
* fix: update AGENTS.md project structure command count 59 → 60
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: address cubic and coderabbit review findings
- Fix status early return skipping pending instinct warnings (cubic #1)
- Exclude already-expired items from expiring-soon filter (cubic #2)
- Warn on unparseable pending instinct age instead of silent skip (cubic #4)
- Log prune failures to observer.log instead of silencing (cubic #5)
Co-Authored-By: Claude <noreply@anthropic.com>
* fix: YAML single-quote unescaping, f-string cleanup, add /prune to README
- Fix single-quoted YAML unescaping: use '' doubling (YAML spec) not
backslash escaping which only applies to double-quoted strings (greptile P1)
- Remove extraneous f-string prefix on static string (coderabbit)
- Add /prune to README command catalog and file tree (cubic)
Co-Authored-By: Claude <noreply@anthropic.com>
---------
Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Happy <yesreply@happy.engineering>
New debugging skill that traces every button/touchpoint through its full
state change sequence. Catches bugs where functions individually work but
cancel each other out via shared state side effects.
Covers 6 bug patterns:
1. Sequential Undo — call B resets what call A just set
2. Async Race — double-click bypasses state-based loading guards
3. Stale Closure — useCallback captures old value
4. Missing State Transition — handler doesn't do what label says
5. Conditional Dead Path — condition always false, action unreachable
6. useEffect Interference — effect undoes button action
Battle-tested: found 48 bugs in a production React+Zustand app that
systematic debugging (54 bugs found separately) completely missed.
* feat(skills): add Kysely migration patterns to database-migrations
Add Kysely section covering kysely-ctl CLI workflow, migration file
structure (up/down with Kysely<any>), and programmatic Migrator setup
with FileMigrationProvider and allowUnorderedMigrations option.
* fix(skills): address PR review feedback for Kysely migration patterns
- Replace redundant email index with avatar_url index (unique already creates index)
- Add ESM-compatible __dirname using import.meta.url
- Comment out allowUnorderedMigrations with production safety warning
- Add clarifying comment for db variable
* fix(skills): fix migration filename mismatch and clarify ESM-only pattern
- Rename migration file to create_user_profile to match actual content
- Restructure ESM import pattern to be clearly ESM-only with CJS note
Library-agnostic Flutter/Dart code reviewer that adapts to the project's
chosen state management solution (BLoC, Riverpod, Provider, GetX, MobX,
Signals) and architecture pattern (Clean Architecture, MVVM, feature-first).
Co-authored-by: Maciej Starosielec <maciej@code-snap.com>
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Replace node -e with temp file execution in validator tests to avoid
Windows shebang parsing failures (node -e cannot handle scripts that
originally contained #!/usr/bin/env node shebangs)
- Remove duplicate blank line in skills/rust-patterns/SKILL.md (MD012)
* feat(skills): add architecture-decision-records skill
Adds a skill that captures architectural decisions made during coding
sessions as structured ADR documents (Michael Nygard format).
Features:
- Auto-detects decision moments from conversation signals
- Records context, alternatives considered with pros/cons, and consequences
- Maintains numbered ADR files in docs/adr/ with an index
- Supports ADR lifecycle (proposed → accepted → deprecated/superseded)
- Categorizes decisions worth recording vs trivial ones to skip
- Integrates with planner, code-reviewer, and codebase-onboarding skills
Includes Antigravity support via .agents/skills/ and openai.yaml.
* fix: address review feedback on ADR skill
- Add missing "why did we choose X?" read-ADR trigger to .agents/ copy
- Add canonical-reference link to .agents/ SKILL.md pointing to full version
- Remove integration reference to non-existent codebase-onboarding skill
* fix: add initialization step and sync .agents/ trigger
- Add Step 1 to workflow: initialize docs/adr/ directory, README.md
index, and template.md on first use when directory doesn't exist
- Add "API design" to .agents/ alternatives trigger to match canonical
version
* fix: address ADR workflow gaps and implicit signal safety
- Init step: seed README.md with index table header so Step 8 can
append rows correctly on first ADR
- Add read-path workflow: graceful handling when docs/adr/ is empty
or absent ("No ADRs found, would you like to start?")
- Implicit signals: add "do not auto-create without user confirmation"
guard, tighten triggers to require conclusion/rationale not just
discussion, remove overly broad "testing strategy" trigger
* fix: require user confirmation before creating files
- Canonical SKILL.md: init step now asks user before creating docs/adr/
- .agents/ condensed version: add confirmation gate for implicit signals
and explicit consent step before any file writes
* fix: require user approval before writing ADR file, add refusal path
* fix: remove .agents/ duplicate, keep canonical in skills/
---------
Co-authored-by: vazidmansuri005 <vazidmansuri005@users.noreply.github.com>
* feat(commands): add /context-budget optimizer command
Adds a command that audits context window token consumption across
agents, skills, rules, MCP servers, and CLAUDE.md files.
Detects bloated agent descriptions, redundant components, MCP
over-subscription, and CLAUDE.md bloat. Produces a prioritized
report with specific token savings per optimization.
Directly relevant to #434 (agent descriptions too verbose, ~26k
tokens causing performance warnings).
* fix: address review feedback on context-budget command
- Add $ARGUMENTS to enable --verbose flag passthrough
- Fix MCP token estimate: 45 tools × ~500 tokens = ~22,500 (was ~2,200)
- Fix heavy agents example: all 3 now exceed 200-line threshold
- Fix description threshold: warning at >30 words, fail at >50 words
- Add Step 4 instructions (was empty)
- Fix audit cadence: "quarterly" → "regularly" + "monthly" consistently
- Fix Output Format heading level under Step 4
- Replace "Antigravity" with generic "harness versions"
- Recalculate total overhead to match corrected MCP numbers
* fix: correct MCP tool count and savings percentage in sample output
- Fix MCP tool count: table now shows 87 tools matching the issues
section (was 45 in table vs 87 in issues)
- Fix savings percentage: 5,100 / 66,400 = 7.7% (was 20.6%)
- Recalculate total overhead and effective context to match
* fix: correct sample output arithmetic
- Fix total overhead: 66,400 → 66,100 to match component table sum
(12,400 + 6,200 + 2,800 + 43,500 + 1,200 = 66,100)
- Fix MCP savings: ~1,500 → ~27,500 tokens (55 tools × 500 tokens/tool)
to match the per-tool formula defined in Step 1
- Reorder optimizations by savings (MCP removal is now #1)
- Fix total savings and percentage (31,100 / 66,100 = 47.0%)
* fix: distinguish always-on vs on-demand agent overhead
Agent descriptions are always loaded into Task tool routing context,
but the full agent body is only loaded when invoked. The audit now
measures both: description-only tokens as always-on overhead and
full-file tokens as worst-case overhead. This resolves the
contradiction between Step 1 (counting full files) and Tip 1 (saying
only descriptions are loaded per session).
* fix: simplify agent accounting and resolve inconsistencies
- Revert to single agent overhead metric (full file tokens) — simpler
and matches what the report actually displays
- Add back 200-line threshold for heavy agents in Step 1
- Fix heavy agents action to match issue type (split/trim, not
description-only)
- Remove .agents/skills/ scan path (doesn't exist in ECC repo)
- Consolidate description threshold to single 30-word check
* fix: add model assumption and verbose mode activation
- Step 4: assume 200K context window by default (Claude has no way to
introspect its model at runtime)
- Step 4: add explicit instruction to check $ARGUMENTS for --verbose
flag and include additional output when present
* fix: handle .agents/skills/ duplicates in skill scan
Skills scan now checks .agents/skills/ for Codex harness copies and
skips identical duplicates to avoid double-counting overhead.
* fix: add savings estimate to heavy agents action for consistency
* feat(skills): add context-budget backing skill, slim command to delegator
* fix: use structurally detectable classification criteria instead of session frequency
---------
Co-authored-by: vazidmansuri005 <vazidmansuri005@users.noreply.github.com>
* feat(skills): add codebase-onboarding skill
Adds a skill that systematically analyzes an unfamiliar codebase and
produces two artifacts: a structured onboarding guide and a starter
CLAUDE.md tailored to the project's conventions.
Four-phase workflow:
1. Reconnaissance — parallel detection of manifests, frameworks, entry
points, directory structure, tooling, and test setup
2. Architecture mapping — tech stack, patterns, key directories, request
lifecycle tracing
3. Convention detection — naming, error handling, async patterns, git
workflow from recent history
4. Artifact generation — scannable onboarding guide + project-specific
CLAUDE.md
Includes Antigravity support via .agents/skills/ and openai.yaml.
* fix: address review feedback on codebase-onboarding skill
- Rename headings to match skill format: When to Activate → When to Use,
Onboarding Workflow → How It Works
- Add Examples section with 3 usage scenarios
- Mark Phase 4 Next.js paths as example with HTML comments
- Fix CLAUDE.md generation to read/enhance existing file first
- Replace abbreviated .agents/ SKILL.md with full copy per repo convention
* fix: add example marker to Common Tasks template section
Adds <!-- Example for a Node.js project --> comment to Common Tasks,
matching the markers already on Key Entry Points and Where to Look.
Syncs .agents/ copy.
* fix: add missing example markers and shorten default_prompt
- Add example comment to Tech Stack table in Phase 4 template
- Add example comment to Key Directories block in Phase 2
- Shorten openai.yaml default_prompt to match repo convention (~60 chars)
- Sync .agents/ SKILL.md copy
* fix: add empty-repo fallback and remove hardcoded output path
- Phase 3: add fallback for repos with no git history
- Example 1: remove hardcoded docs/ path assumption, output to
conversation or project root instead
- Sync .agents/ copy
* fix: remove .agents/ duplicate, keep canonical in skills/
* fix: clarify Example 1 output destination
* fix: add shallow-clone fallback to git conventions detection
---------
Co-authored-by: vazidmansuri005 <vazidmansuri005@users.noreply.github.com>
* feat(skills): add agent-eval for head-to-head coding agent comparison
* fix(skills): address PR #540 review feedback for agent-eval skill
- Remove duplicate "When to Use" section (kept "When to Activate")
- Add Installation section with pip install instructions
- Change origin from "community" to "ECC" per repo convention
- Add commit field to YAML task example for reproducibility
- Fix pass@k mislabeling to "pass rate across repeated runs"
- Soften worktree isolation language to "reproducibility isolation"
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Pin agent-eval install to specific commit hash
Address PR review feedback: pin the VCS install to commit
6d062a2 to avoid supply-chain risk from unpinned external deps.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
---------
Co-authored-by: Joaquin Hui Gomez <joaquinhui1995@gmail.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
In git worktrees, .git is a file (not a directory) containing a gitdir
pointer. The -d test fails for worktree checkouts, causing project
detection to fall through to the "global" fallback. Changing to -e
(exists) handles both regular repos and worktrees correctly.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
The observer's Haiku subprocess cannot access files outside the project
sandbox (/tmp/ for observations, ~/.claude/homunculus/ for instincts).
Adding --allowedTools "Read,Write" grants the necessary file access
while keeping the subprocess constrained by --max-turns and timeout.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>