Commit Graph

19 Commits

Author SHA1 Message Date
Affaan Mustafa
b2407ab3f5 fix(ecc2): sync catalog counts for scaffold CI 2026-03-24 03:43:48 -07:00
Shimo
a2e465c74d feat(skills): add skill-comply — automated behavioral compliance measurement (#724)
* feat(skills): add skill-comply — automated behavioral compliance measurement

Automated compliance measurement for skills, rules, and agent definitions.
Generates behavioral specs, runs scenarios at 3 strictness levels,
classifies tool calls via LLM, and produces self-contained reports.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

* fix(skill-comply): address bot review feedback

- AGENTS.md: fix stale skill count (115 → 117) in project structure
- run.py: replace remaining print() with logger, add zero-division guard,
  create parent dirs for --output path
- runner.py: add returncode check for claude subprocess, clarify
  relative_to path traversal validation
- parser.py: use is_file() instead of exists(), catch KeyError for
  missing trace fields, add file check in parse_spec
- classifier.py: log warnings on malformed classification output,
  guard against non-dict JSON responses
- grader.py: filter negative indices from LLM classification

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-03-22 21:51:49 -07:00
Chris Yau
0e733753e0 feat: pending instinct TTL pruning and /prune command (#725)
* feat: add pending instinct TTL pruning and /prune command

Pending instincts generated by the observer accumulate indefinitely
with no cleanup mechanism. This adds lifecycle management:

- `instinct-cli.py prune` — delete pending instincts older than 30 days
  (configurable via --max-age). Supports --dry-run and --quiet flags.
- Enhanced `status` command — shows pending count, warns at 5+,
  highlights instincts expiring within 7 days.
- `observer-loop.sh` — runs prune before each analysis cycle.
- `/prune` slash command — user-facing command for manual pruning.

Design rationale: council consensus (4/4) rejected auto-promote in
favor of TTL-based garbage collection. Frequency of observation does
not establish correctness. Unreviewed pending instincts auto-delete
after 30 days; if the pattern is real, the observer will regenerate it.

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>

* fix: remove duplicate functions, broaden extension filter, fix prune output

- Remove duplicate _collect_pending_dirs and _parse_created_date defs
- Use ALLOWED_INSTINCT_EXTENSIONS (.md/.yaml/.yml) instead of .md-only
- Track actually-deleted items separately from expired for accurate output
- Update README.md and AGENTS.md command counts: 59 → 60

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>

* fix: address Copilot and CodeRabbit review findings

- Use is_dir() instead of exists() for pending path checks
- Change > to >= for --max-age boundary (--max-age 0 now prunes all)
- Use CLV2_PYTHON_CMD env var in observer-loop.sh prune call
- Remove unused source_dupes variable
- Remove extraneous f-string prefix on static string

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>

* fix: update AGENTS.md project structure command count 59 → 60

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: address cubic and coderabbit review findings

- Fix status early return skipping pending instinct warnings (cubic #1)
- Exclude already-expired items from expiring-soon filter (cubic #2)
- Warn on unparseable pending instinct age instead of silent skip (cubic #4)
- Log prune failures to observer.log instead of silencing (cubic #5)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: YAML single-quote unescaping, f-string cleanup, add /prune to README

- Fix single-quoted YAML unescaping: use '' doubling (YAML spec) not
  backslash escaping which only applies to double-quoted strings (greptile P1)
- Remove extraneous f-string prefix on static string (coderabbit)
- Add /prune to README command catalog and file tree (cubic)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Happy <yesreply@happy.engineering>
2026-03-22 15:40:58 -07:00
Affaan Mustafa
2643e0c72f fix: update catalog counts for flutter-reviewer (28 agents, 116 skills) 2026-03-20 07:11:16 -07:00
Affaan Mustafa
b77f49569b feat: add nuxt 4 patterns skill (#702) 2026-03-20 04:44:31 -07:00
Affaan Mustafa
8511d84042 feat(skills): add rules-distill skill (rebased #561) (#678)
* feat(skills): add rules-distill — extract cross-cutting principles from skills into rules

Applies the skill-stocktake pattern to rules maintenance:
scan skills → extract shared principles → propose rule changes.

Key design decisions:
- Deterministic collection (scan scripts) + LLM judgment (cross-read & verdict)
- 6 verdict types: Append, Revise, New Section, New File, Already Covered, Too Specific
- Anti-abstraction safeguard: 2+ skills evidence, actionable behavior test, violation risk
- Rules full text passed to LLM (no grep pre-filter) for accurate matching
- Never modifies rules automatically — always requires user approval

* fix(skills): address review feedback for rules-distill

Fixes raised by CodeRabbit, Greptile, and cubic:

- Add Prerequisites section documenting skill-stocktake dependency
- Add fallback command when skill-stocktake is not installed
- Fix shell quoting: add IFS= and -r to while-read loops
- Replace hardcoded paths with env var placeholders ($CLAUDE_RULES_DIR, $SKILL_STOCKTAKE_DIR)
- Add json language identifier to code blocks
- Add "How It Works" parent heading for Phase 1/2/3
- Add "Example" section with end-to-end run output
- Add revision.reason/before/after fields to output schema for Revise verdict
- Document timestamp format (date -u +%Y-%m-%dT%H:%M:%SZ)
- Document candidate-id format (kebab-case from principle)
- Use concrete examples in results.json schema

* fix(skills): remove skill-stocktake dependency, add self-contained scripts

Address P1 review feedback:
- Add scan-skills.sh and scan-rules.sh directly in rules-distill/scripts/
  (no external dependency on skill-stocktake)
- Remove Prerequisites section (no longer needed)
- Add cross-batch merge step to prevent 2+ skills requirement
  from being silently broken across batch boundaries
- Fix nested triple-backtick fences (use quadruple backticks)
- Remove head -100 cap (silent truncation)
- Rename "When to Activate" → "When to Use" (ECC standard)
- Remove unnecessary env var placeholders (SKILL.md is a prompt, not a script)

* fix: update skill/command counts in README.md and AGENTS.md

rules-distill added 1 skill + 1 command:
- skills: 108 → 109
- commands: 57 → 58

Updates all count references to pass CI catalog validation.

* fix(skills): address Servitor review feedback for rules-distill

1. Rename SKILL_STOCKTAKE_* env vars to RULES_DISTILL_* for consistency
2. Remove unnecessary observation counting (use_7d/use_30d) from scan-skills.sh
3. Fix header comment: scan.sh → scan-skills.sh
4. Use jq for JSON construction in scan-rules.sh to properly escape
   headings containing special characters (", \)

* fix(skills): address CodeRabbit review — portability and scan scope

1. scan-rules.sh: use jq for error JSON output (proper escaping)
2. scan-rules.sh: replace GNU-only sort -z with portable sort (BSD compat)
3. scan-rules.sh: fix pipefail crash on files without H2 headings
4. scan-skills.sh: scan only SKILL.md files (skip learned/*.md and
   auxiliary docs that lack frontmatter)
5. scan-skills.sh: add portable get_mtime helper (GNU stat/date
   fallback to BSD stat/date)

* fix: sync catalog counts with filesystem (27 agents, 114 skills, 59 commands)

---------

Co-authored-by: Tatsuya Shimomoto <shimo4228@gmail.com>
2026-03-20 01:44:55 -07:00
Affaan Mustafa
52e949a85b fix: sync catalog counts with filesystem (27 agents, 113 skills, 58 commands) (#693) 2026-03-20 01:19:36 -07:00
Affaan Mustafa
29277ac273 chore: prepare v1.9.0 release (#666)
- Bump version to 1.9.0 in package.json, package-lock.json, .opencode/package.json
- Add v1.9.0 changelog with 212 commits covering selective install architecture,
  6 new agents, 15+ new skills, session/state infrastructure, observer fixes,
  12 language ecosystems, and community contributions
- Update README with v1.9.0 release notes and complete agents tree (27 agents)
- Add pytorch-build-resolver to AGENTS.md agent table
- Update documentation counts to 27 agents, 109 skills, 57 commands
- Update version references in zh-CN README
- All 1421 tests passing, catalog counts verified
2026-03-20 00:29:20 -07:00
Affaan Mustafa
fec871e1cb fix: update catalog counts and resolve lint error
- Update agent count 26→27 in README.md (quick-start + comparison table) and AGENTS.md (summary + project structure)
- Update skill count 108→109 in README.md (quick-start + comparison table) and AGENTS.md (summary)
- Rename unused variable provenance → _provenance in tests/lib/skill-dashboard.test.js
2026-03-19 22:47:46 -07:00
teee32
90c3486e03 feat(agents): add typescript-reviewer agent (#647)
Adds typescript-reviewer agent following the established agent format, covering type safety, async correctness, security, and React/Next.js patterns.
2026-03-19 20:49:23 -07:00
Affaan Mustafa
fce4513d58 fix: sync documentation counts with catalog (25 agents, 108 skills, 57 commands) 2026-03-17 00:42:09 -07:00
Yashwardhan
7cf07cac17 feat(agents): add java-build-resolver for Maven/Gradle (#538) 2026-03-16 14:32:25 -07:00
Yashwardhan
10879da823 feat(agents): add java-reviewer agent (#528)
* Add java-reviewer agent for Java and Spring Boot code review

* Fix java-reviewer: update tools format, git diff scope, diagnostic commands, AGENTS.md registration

* Fix: correct skill reference, add command injection check, update agent count to 17

* Fix: report-only disclaimer, path traversal, split ScriptEngine, escalation note, agent count 19
2026-03-16 14:01:38 -07:00
Justin Philpott
01ed1b3b03 fix(ci): enforce catalog count integrity (#525)
* fix(ci): enforce catalog count integrity

* test: harden catalog structure parsing
2026-03-16 13:37:51 -07:00
Ronaldo Martins
ebd8c8c6fa feat(agents): add Rust language support (#523)
* feat(agents): add Rust language support — reviewer, build resolver, patterns, and testing

Add Rust-specific agents and skills following the established Go/Kotlin pattern:
- agents/rust-reviewer.md: ownership, lifetimes, unsafe audit, clippy, error handling
- agents/rust-build-resolver.md: cargo build errors, borrow checker, dependency resolution
- skills/rust-patterns/SKILL.md: idiomatic Rust patterns and best practices
- skills/rust-testing/SKILL.md: TDD, unit/integration/async/property-based testing

* fix(agents): correct Rust examples for accuracy and consistency

- unsafe fn: add inner unsafe {} block for Rust 2024 edition compliance
- edition: update from 2021 to 2024 as current default
- rstest: add missing fixture import
- mockall: add missing predicate::eq import
- concurrency: use sync_channel (bounded) and expect() over unwrap()
  to align with rust-reviewer's HIGH-priority review checks

* fix(skills): correct compilation issues in Rust code examples

- collect: add .copied() for &str iterator into String
- tokio import: remove unused sleep, keep Duration
- async test: add missing Duration import

* fix(skills): move --no-fail-fast before test-binary args

--no-fail-fast is a Cargo option, not a test binary flag.
Placing it after -- forwards it to the test harness where it is
unrecognized.

* fix(agents): distinguish missing cargo-audit from real audit failures

Check if cargo-audit is installed before running it, so actual
vulnerability findings are not suppressed by the fallback message.

* fix: address automated review findings across all Rust files

- build-resolver: prefer scoped cargo update over full refresh
- testing: add Cargo.toml bench config with harness = false for criterion
- testing: condense TDD example to stay under 500-line limit
- patterns: use expect() over unwrap() on JoinHandle for consistency
- patterns: add explicit lifetime to unsafe FFI return reference
- reviewer: replace misleading "string interpolation" with concrete alternatives

* fix: align with CONTRIBUTING.md conventions

- skills: rename "When to Activate" to "When to Use" per template
- reviewer: add cargo check gate before starting review

* fix(agents): guard cargo-audit and cargo-deny with availability checks

Match the pattern used in rust-build-resolver to avoid command-not-found
errors when optional tools are not installed.

* fix: address second round of automated review findings

- testing: split TDD example into separate code blocks to avoid
  duplicate fn definition in single block
- build-resolver/reviewer: use if/then/else instead of && ... ||
  chaining for cargo-audit/deny to avoid masking real failures
- build-resolver: add MSRV caveat to edition upgrade guidance

* feat: add Rust slash commands for build, review, and test

Add commands/rust-build.md, commands/rust-review.md, and
commands/rust-test.md to provide consistent user entrypoints
matching the existing Go and Kotlin command patterns.

* fix(commands): improve rust-build accuracy and tone

- Restructure-first borrow fix example instead of clone-first
- Realistic cargo test output format (per-test lines, not per-file)
- Align "Parse Errors" step with actual resolver behavior
- Prefer restructuring over cloning in common errors table

* fix: address cubic-dev-ai review findings on commands

- Gate review on all automated checks, not just cargo check
- Use git diff HEAD~1 / git diff main...HEAD for PR file selection
- Fix #[must_use] guidance: Result is already must_use by type
- Remove error-masking fallback on cargo tree --duplicates

* fix: address remaining review findings across all bots

- Add rust-reviewer and rust-build-resolver to AGENTS.md registry
- Update agent count from 16 to 18
- Mark parse_config doctest as no_run (body is todo!())
- Add "How It Works" section to both Rust skills
- Replace cargo install with taiki-e/install-action in CI snippet
- Trim tarpaulin section to stay under 500-line limit

* fix(agents): align rust-reviewer invocation with command spec

- Use git diff HEAD~1 / main...HEAD instead of bare git diff
- Add cargo test as explicit step before review begins

* fix(skills): address cubic review on patterns and testing

- Remove Tokio-specific language from How It Works summary
- Add cargo-llvm-cov install note in coverage section
- Revert no_run on doctest examples (illustrative code, not compiled)

* fix(skills): use expect on thread join for consistency

Replace handle.join().unwrap() with .expect("worker thread panicked")
to match the .expect("mutex poisoned") pattern used above.

* fix(agents): gate review on all automated checks, not just cargo check

Consolidate check/clippy/fmt/test into a single gate step that
stops and reports if any fail, matching the command spec.

* fix(skills): replace unwrap with expect in channel example

Use .expect("receiver disconnected") on tx.send() for consistency
with the .expect() convention used in all other concurrency examples.

* fix: address final review round — OpenCode mirrors, counts, examples

- Add .opencode/commands/rust-{build,review,test}.md mirrors
- Add .opencode/prompts/agents/rust-{build-resolver,reviewer}.txt mirrors
- Fix AGENTS.md count to 20 (add missing kotlin agents to table)
- Fix review example: all checks pass (consistent with gate policy)
- Replace should_panic doctest with is_err() (consistent with best practices)
- Trim testing commands to stay at 500-line limit

* fix: address cubic and greptile review on OpenCode files and agents

- Fix crate::module import guidance (internal path, not Cargo.toml)
- Add cargo fmt --check to verification steps
- Fix TDD GREEN example to handle error path (validate(input)?)
- Scope .context() guidance to anyhow/eyre application code
- Update command count from 40 to 51
- Add tokio channel variants to unbounded channel warning
- Preserve JoinError context in spawned task panic message

* fix: stale command count, channel guidance, cargo tree fallback

- Fix stale command count in Project Structure section (40→51)
- Clarify unbounded channel rule: context-appropriate bounded alternatives
- Remove dead cargo tree fallback (exits 0 even with no duplicates)
- Sync OpenCode reviewer mirror with tokio channel coverage
2026-03-16 13:34:25 -07:00
Affaan Mustafa
7705051910 fix: align architecture tooling with current hooks docs 2026-03-10 19:36:57 -07:00
kinshukdutta
a50349181a feat: architecture improvements — test discovery, hooks schema, catalog, command map, coverage, cross-harness docs
- AGENTS.md: sync skills count to 65+
- tests/run-all.js: glob-based test discovery for *.test.js
- scripts/ci/validate-hooks.js: validate hooks.json with ajv + schemas/hooks.schema.json
- schemas/hooks.schema.json: hookItem.type enum command|notification
- scripts/ci/catalog.js: catalog agents, commands, skills (--json | --md)
- docs/COMMAND-AGENT-MAP.md: command → agent/skill map
- docs/ARCHITECTURE-IMPROVEMENTS.md: improvement recommendations
- package.json: ajv, c8 devDeps; npm run coverage
- CONTRIBUTING.md: Cross-Harness and Translations section
- .gitignore: coverage/

Made-with: Cursor
2026-03-10 19:36:57 -07:00
Frank
b994a076c2 docs: add guidance for project documentation capture (#355) 2026-03-07 14:48:11 -08:00
Affaan Mustafa
d70bab85e3 feat: add Cursor, Codex, and OpenCode harnesses — maximize every AI coding tool
- AGENTS.md: universal cross-tool file read by Claude Code, Cursor, Codex, and OpenCode
- .cursor/: 15 hook events via hooks.json, 16 hook scripts with DRY adapter pattern,
  29 rules (9 common + 20 language-specific) with Cursor YAML frontmatter
- .codex/: reference config.toml, Codex-specific AGENTS.md supplement,
  10 skills ported to .agents/skills/ with openai.yaml metadata
- .opencode/: 3 new tools (format-code, lint-check, git-summary), 3 new hooks
  (shell.env, experimental.session.compacting, permission.ask), expanded instructions,
  version bumped to 1.6.0
- README: fixed Cursor section, added Codex section, added cross-tool parity table
- install.sh: now copies hooks.json + hooks/ for --target cursor
2026-02-25 10:45:29 -08:00