feat: architecture improvements — test discovery, hooks schema, catalog, command map, coverage, cross-harness docs

- AGENTS.md: sync skills count to 65+
- tests/run-all.js: glob-based test discovery for *.test.js
- scripts/ci/validate-hooks.js: validate hooks.json with ajv + schemas/hooks.schema.json
- schemas/hooks.schema.json: hookItem.type enum command|notification
- scripts/ci/catalog.js: catalog agents, commands, skills (--json | --md)
- docs/COMMAND-AGENT-MAP.md: command → agent/skill map
- docs/ARCHITECTURE-IMPROVEMENTS.md: improvement recommendations
- package.json: ajv, c8 devDeps; npm run coverage
- CONTRIBUTING.md: Cross-Harness and Translations section
- .gitignore: coverage/

Made-with: Cursor
This commit is contained in:
kinshukdutta
2026-03-09 15:05:17 -04:00
committed by Affaan Mustafa
parent c883289abb
commit a50349181a
11 changed files with 1247 additions and 43 deletions

View File

@@ -0,0 +1,146 @@
# Architecture Improvement Recommendations
This document captures architect-level improvements for the Everything Claude Code (ECC) project. It is written from the perspective of a Claude Code coding architect aiming to improve maintainability, consistency, and long-term quality.
---
## 1. Documentation and Single Source of Truth
### 1.1 Agent / Command / Skill Count Sync
**Issue:** AGENTS.md states "13 specialized agents, 50+ skills, 33 commands" while the repo has **16 agents**, **65+ skills**, and **40 commands**. README and other docs also vary. This causes confusion for contributors and users.
**Recommendation:**
- **Single source of truth:** Derive counts (and optionally tables) from the filesystem or a small manifest. Options:
- **Option A:** Add a script (e.g. `scripts/ci/catalog.js`) that scans `agents/*.md`, `commands/*.md`, and `skills/*/SKILL.md` and outputs JSON/Markdown. CI and docs can consume this.
- **Option B:** Maintain one `docs/catalog.json` (or YAML) that lists agents, commands, and skills with metadata; scripts and docs read from it. Requires discipline to update on add/remove.
- **Short-term:** Manually sync AGENTS.md, README.md, and CLAUDE.md with actual counts and list any new agents (e.g. chief-of-staff, loop-operator, harness-optimizer) in the agent table.
**Impact:** High — affects first impression and contributor trust.
---
### 1.2 Command → Agent / Skill Map
**Issue:** There is no single machine- or human-readable map of "which command uses which agent(s) or skill(s)." This lives in README tables and individual command `.md` files, which can drift.
**Recommendation:**
- Add a **command registry** (e.g. in `docs/` or as frontmatter in command files) that lists for each command: name, description, primary agent(s), skills referenced. Can be generated from command file content or maintained by hand.
- Expose a "map" in docs (e.g. `docs/COMMAND-AGENT-MAP.md`) or in the generated catalog for discoverability and for tooling (e.g. "which commands use tdd-guide?").
**Impact:** Medium — improves discoverability and refactoring safety.
---
## 2. Testing and Quality
### 2.1 Test Discovery vs Hardcoded List
**Issue:** `tests/run-all.js` uses a **hardcoded list** of test files. New test files are not run unless someone updates `run-all.js`, so coverage can be incomplete by omission.
**Recommendation:**
- **Glob-based discovery:** Discover test files by pattern (e.g. `**/*.test.js` under `tests/`) and run them, with an optional allowlist/denylist for special cases. This makes new tests automatically part of the suite.
- Keep a single entry point (`tests/run-all.js`) that runs discovered tests and aggregates results.
**Impact:** High — prevents regression where new tests exist but are never executed.
---
### 2.2 Test Coverage Metrics
**Issue:** There is no coverage tool (e.g. nyc/c8/istanbul). The project cannot assert "80%+ coverage" for its own scripts; coverage is implicit.
**Recommendation:**
- Introduce a coverage tool for Node scripts (e.g. `c8` or `nyc`) and run it in CI. Start with a baseline (e.g. 60%) and raise over time; or at least report coverage in CI without failing so the team can see trends.
- Focus on `scripts/` (lib + hooks + ci) as the primary target; exclude one-off scripts if needed.
**Impact:** Medium — aligns the project with its own AGENTS.md guidance (80%+ coverage) and surfaces untested paths.
---
## 3. Schema and Validation
### 3.1 Use Hooks JSON Schema in CI
**Issue:** `schemas/hooks.schema.json` exists and defines the hook configuration shape, but `scripts/ci/validate-hooks.js` does **not** use it. Validation is duplicated (VALID_EVENTS, structure) and can drift from the schema.
**Recommendation:**
- Use a JSON Schema validator (e.g. `ajv`) in `validate-hooks.js` to validate `hooks/hooks.json` against `schemas/hooks.schema.json`. Keep the validator as the single source of truth for structure; retain only hook-specific checks (e.g. inline JS syntax) in the script.
- Ensures schema and validator stay in sync and allows IDE/editor validation via `$schema` in hooks.json.
**Impact:** Medium — reduces drift and improves contributor experience when editing hooks.
---
## 4. Cross-Harness and i18n
### 4.1 Skill/Agent Subset Sync (.agents/skills, .cursor/skills)
**Issue:** `.agents/skills/` (Codex) and `.cursor/skills/` are subsets of `skills/`. Adding or removing a skill in the main repo requires manually updating these subsets, which can be forgotten.
**Recommendation:**
- Document in CONTRIBUTING.md that adding a skill may require updating `.agents/skills` and `.cursor/skills` (and how to do it).
- Optionally: a CI check or script that compares `skills/` to the subsets and fails or warns if a skill is in one set but not the other when it should be (e.g. by convention or by a small manifest).
**Impact:** LowMedium — reduces cross-harness drift.
---
### 4.2 Translation Drift (docs/ zh-CN, zh-TW, ja-JP)
**Issue:** Translations in `docs/` duplicate agents, commands, skills. As the English source evolves, translations can become outdated without clear process or tooling.
**Recommendation:**
- Document a **translation process:** when to update (e.g. on release), who owns each locale, and how to detect stale content (e.g. diff file lists or key sections).
- Consider: translation status file (e.g. `docs/i18n-status.md`) or CI that checks translation file existence/timestamps and warns if English was updated more recently than a translation.
- Long-term: consider extraction/placeholder format (e.g. i18n keys) so translations reference the same structure as the English source.
**Impact:** Medium — improves experience for non-English users and reduces confusion from outdated translations.
---
## 5. Hooks and Scripts
### 5.1 Hook Runtime Consistency
**Issue:** Most hooks invoke Node scripts via `run-with-flags.js`; one path uses `run-with-flags-shell.sh` + `observe.sh`. The mixed runtime is documented but could be simplified over time.
**Recommendation:**
- Prefer Node for new hooks when possible (cross-platform, single runtime). If shell is required, document why and keep the surface small.
- Ensure `ECC_HOOK_PROFILE` and `ECC_DISABLED_HOOKS` are respected in all code paths (including shell) so behavior is consistent.
**Impact:** Low — maintains current design; improves if more hooks migrate to Node.
---
## 6. Summary Table
| Area | Improvement | Priority | Effort |
|-------------------|--------------------------------------|----------|---------|
| Doc sync | Sync AGENTS.md/README counts & table | High | Low |
| Single source | Catalog script or manifest | High | Medium |
| Test discovery | Glob-based test runner | High | Low |
| Coverage | Add c8/nyc and CI coverage | Medium | Medium |
| Hook schema in CI | Validate hooks.json via schema | Medium | Low |
| Command map | Command → agent/skill registry | Medium | Medium |
| Subset sync | Document/CI for .agents/.cursor | LowMed | LowMed |
| Translations | Process + stale detection | Medium | Medium |
| Hook runtime | Prefer Node; document shell use | Low | Low |
---
## 7. Quick Wins (Immediate)
1. **Update AGENTS.md:** Set agent count to 16; add chief-of-staff, loop-operator, harness-optimizer to the agent table; align skill/command counts with repo.
2. **Test discovery:** Change `run-all.js` to discover `**/*.test.js` under `tests/` (with optional allowlist) so new tests are always run.
3. **Wire hooks schema:** In `validate-hooks.js`, validate `hooks/hooks.json` against `schemas/hooks.schema.json` using ajv (or similar) and keep only hook-specific checks in the script.
These three can be done in one or two sessions and materially improve consistency and reliability.

61
docs/COMMAND-AGENT-MAP.md Normal file
View File

@@ -0,0 +1,61 @@
# Command → Agent / Skill Map
This document lists each slash command and the primary agent(s) or skills it invokes. Use it to discover which commands use which agents and to keep refactoring consistent.
| Command | Primary agent(s) | Notes |
|---------|------------------|--------|
| `/plan` | planner | Implementation planning before code |
| `/tdd` | tdd-guide | Test-driven development |
| `/code-review` | code-reviewer | Quality and security review |
| `/build-fix` | build-error-resolver | Fix build/type errors |
| `/e2e` | e2e-runner | Playwright E2E tests |
| `/refactor-clean` | refactor-cleaner | Dead code removal |
| `/update-docs` | doc-updater | Documentation sync |
| `/update-codemaps` | doc-updater | Codemaps / architecture docs |
| `/go-review` | go-reviewer | Go code review |
| `/go-test` | tdd-guide | Go TDD workflow |
| `/go-build` | go-build-resolver | Fix Go build errors |
| `/python-review` | python-reviewer | Python code review |
| `/harness-audit` | — | Harness scorecard (no single agent) |
| `/loop-start` | loop-operator | Start autonomous loop |
| `/loop-status` | loop-operator | Inspect loop status |
| `/quality-gate` | — | Quality pipeline (hook-like) |
| `/model-route` | — | Model recommendation (no agent) |
| `/orchestrate` | planner, tdd-guide, code-reviewer, security-reviewer, architect | Multi-agent handoff |
| `/multi-plan` | architect (Codex/Gemini prompts) | Multi-model planning |
| `/multi-execute` | architect / frontend prompts | Multi-model execution |
| `/multi-backend` | architect | Backend multi-service |
| `/multi-frontend` | architect | Frontend multi-service |
| `/multi-workflow` | architect | General multi-service |
| `/learn` | — | continuous-learning skill, instincts |
| `/learn-eval` | — | continuous-learning-v2, evaluate then save |
| `/instinct-status` | — | continuous-learning-v2 |
| `/instinct-import` | — | continuous-learning-v2 |
| `/instinct-export` | — | continuous-learning-v2 |
| `/evolve` | — | continuous-learning-v2, cluster instincts |
| `/promote` | — | continuous-learning-v2 |
| `/projects` | — | continuous-learning-v2 |
| `/skill-create` | — | skill-create-output script, git history |
| `/checkpoint` | — | verification-loop skill |
| `/verify` | — | verification-loop skill |
| `/eval` | — | eval-harness skill |
| `/test-coverage` | — | Coverage analysis |
| `/sessions` | — | Session history |
| `/setup-pm` | — | Package manager setup script |
| `/claw` | — | NanoClaw CLI (scripts/claw.js) |
| `/pm2` | — | PM2 service lifecycle |
| `/security-scan` | security-reviewer (skill) | AgentShield via security-scan skill |
## Skills referenced by commands
- **continuous-learning**, **continuous-learning-v2**: `/learn`, `/learn-eval`, `/instinct-*`, `/evolve`, `/promote`, `/projects`
- **verification-loop**: `/checkpoint`, `/verify`
- **eval-harness**: `/eval`
- **security-scan**: `/security-scan` (runs AgentShield)
- **strategic-compact**: suggested at compaction points (hooks)
## How to use this map
- **Discoverability:** Find which command triggers which agent (e.g. “use `/code-review` for code-reviewer”).
- **Refactoring:** When renaming or removing an agent, search this doc and the command files for references.
- **CI/docs:** The catalog script (`node scripts/ci/catalog.js`) outputs agent/command/skill counts; this map complements it with commandagent relationships.