feat: deliver v1.8.0 harness reliability and parity updates

2026-05-17 14:23:13 +08:00 · 2026-03-04 14:48:06 -08:00
parent 32e9c293f0
commit 48b883d741
84 changed files with 2990 additions and 725 deletions
--- a/commands/claw.md
+++ b/commands/claw.md
@@ -1,10 +1,10 @@
 ---
-description: Start the NanoClaw agent REPL — a persistent, session-aware AI assistant powered by the claude CLI.
+description: Start NanoClaw v2 — ECC's persistent, zero-dependency REPL with model routing, skill hot-load, branching, compaction, export, and metrics.
 ---

 # Claw Command

-Start an interactive AI agent session that persists conversation history to disk and optionally loads ECC skill context.
+Start an interactive AI agent session with persistent markdown history and operational controls.

 ## Usage

@@ -23,57 +23,29 @@ npm run claw
 | Variable | Default | Description |
 |----------|---------|-------------|
 | `CLAW_SESSION` | `default` | Session name (alphanumeric + hyphens) |
-| `CLAW_SKILLS` | *(empty)* | Comma-separated skill names to load as system context |
+| `CLAW_SKILLS` | *(empty)* | Comma-separated skills loaded at startup |
+| `CLAW_MODEL` | `sonnet` | Default model for the session |

 ## REPL Commands

-Inside the REPL, type these commands directly at the prompt:
-
-```
-/clear      Clear current session history
-/history    Print full conversation history
-/sessions   List all saved sessions
-/help       Show available commands
-exit        Quit the REPL
+```text
+/help                          Show help
+/clear                         Clear current session history
+/history                       Print full conversation history
+/sessions                      List saved sessions
+/model [name]                  Show/set model
+/load <skill-name>             Hot-load a skill into context
+/branch <session-name>         Branch current session
+/search <query>                Search query across sessions
+/compact                       Compact old turns, keep recent context
+/export <md|json|txt> [path]   Export session
+/metrics                       Show session metrics
+exit                           Quit
 ```

-## How It Works
+## Notes

-1. Reads `CLAW_SESSION` env var to select a named session (default: `default`)
-2. Loads conversation history from `~/.claude/claw/{session}.md`
-3. Optionally loads ECC skill context from `CLAW_SKILLS` env var
-4. Enters a blocking prompt loop — each user message is sent to `claude -p` with full history
-5. Responses are appended to the session file for persistence across restarts
-
-## Session Storage
-
-Sessions are stored as Markdown files in `~/.claude/claw/`:
-
-```
-~/.claude/claw/default.md
-~/.claude/claw/my-project.md
-```
-
-Each turn is formatted as:
-
-```markdown
-### [2025-01-15T10:30:00.000Z] User
-What does this function do?
---
-### [2025-01-15T10:30:05.000Z] Assistant
-This function calculates...
---
-```
-
-## Examples
-
-```bash
-# Start default session
-node scripts/claw.js
-
-# Named session
-CLAW_SESSION=my-project node scripts/claw.js
-
-# With skill context
-CLAW_SKILLS=tdd-workflow,security-review node scripts/claw.js
-```
+- NanoClaw remains zero-dependency.
+- Sessions are stored at `~/.claude/claw/<session>.md`.
+- Compaction keeps the most recent turns and writes a compaction header.
+- Export supports markdown, JSON turns, and plain text.
--- a/commands/harness-audit.md
+++ b/commands/harness-audit.md
@@ -0,0 +1,58 @@
+# Harness Audit Command
+
+Audit the current repository's agent harness setup and return a prioritized scorecard.
+
+## Usage
+
+`/harness-audit [scope] [--format text|json]`
+
+- `scope` (optional): `repo` (default), `hooks`, `skills`, `commands`, `agents`
+- `--format`: output style (`text` default, `json` for automation)
+
+## What to Evaluate
+
+Score each category from `0` to `10`:
+
+1. Tool Coverage
+2. Context Efficiency
+3. Quality Gates
+4. Memory Persistence
+5. Eval Coverage
+6. Security Guardrails
+7. Cost Efficiency
+
+## Output Contract
+
+Return:
+
+1. `overall_score` out of 70
+2. Category scores and concrete findings
+3. Top 3 actions with exact file paths
+4. Suggested ECC skills to apply next
+
+## Checklist
+
+- Inspect `hooks/hooks.json`, `scripts/hooks/`, and hook tests.
+- Inspect `skills/`, command coverage, and agent coverage.
+- Verify cross-harness parity for `.cursor/`, `.opencode/`, `.codex/`.
+- Flag broken or stale references.
+
+## Example Result
+
+```text
+Harness Audit (repo): 52/70
+- Quality Gates: 9/10
+- Eval Coverage: 6/10
+- Cost Efficiency: 4/10
+
+Top 3 Actions:
+1) Add cost tracking hook in scripts/hooks/cost-tracker.js
+2) Add pass@k docs and templates in skills/eval-harness/SKILL.md
+3) Add command parity for /harness-audit in .opencode/commands/
+```
+
+## Arguments
+
+$ARGUMENTS:
+- `repo|hooks|skills|commands|agents` (optional scope)
+- `--format text|json` (optional output format)
--- a/commands/loop-start.md
+++ b/commands/loop-start.md
@@ -0,0 +1,32 @@
+# Loop Start Command
+
+Start a managed autonomous loop pattern with safety defaults.
+
+## Usage
+
+`/loop-start [pattern] [--mode safe|fast]`
+
+- `pattern`: `sequential`, `continuous-pr`, `rfc-dag`, `infinite`
+- `--mode`:
+  - `safe` (default): strict quality gates and checkpoints
+  - `fast`: reduced gates for speed
+
+## Flow
+
+1. Confirm repository state and branch strategy.
+2. Select loop pattern and model tier strategy.
+3. Enable required hooks/profile for the chosen mode.
+4. Create loop plan and write runbook under `.claude/plans/`.
+5. Print commands to start and monitor the loop.
+
+## Required Safety Checks
+
+- Verify tests pass before first loop iteration.
+- Ensure `ECC_HOOK_PROFILE` is not disabled globally.
+- Ensure loop has explicit stop condition.
+
+## Arguments
+
+$ARGUMENTS:
+- `<pattern>` optional (`sequential|continuous-pr|rfc-dag|infinite`)
+- `--mode safe|fast` optional
--- a/commands/loop-status.md
+++ b/commands/loop-status.md
@@ -0,0 +1,24 @@
+# Loop Status Command
+
+Inspect active loop state, progress, and failure signals.
+
+## Usage
+
+`/loop-status [--watch]`
+
+## What to Report
+
+- active loop pattern
+- current phase and last successful checkpoint
+- failing checks (if any)
+- estimated time/cost drift
+- recommended intervention (continue/pause/stop)
+
+## Watch Mode
+
+When `--watch` is present, refresh status periodically and surface state changes.
+
+## Arguments
+
+$ARGUMENTS:
+- `--watch` optional
--- a/commands/model-route.md
+++ b/commands/model-route.md
@@ -0,0 +1,26 @@
+# Model Route Command
+
+Recommend the best model tier for the current task by complexity and budget.
+
+## Usage
+
+`/model-route [task-description] [--budget low|med|high]`
+
+## Routing Heuristic
+
+- `haiku`: deterministic, low-risk mechanical changes
+- `sonnet`: default for implementation and refactors
+- `opus`: architecture, deep review, ambiguous requirements
+
+## Required Output
+
+- recommended model
+- confidence level
+- why this model fits
+- fallback model if first attempt fails
+
+## Arguments
+
+$ARGUMENTS:
+- `[task-description]` optional free-text
+- `--budget low|med|high` optional
--- a/commands/quality-gate.md
+++ b/commands/quality-gate.md
@@ -0,0 +1,29 @@
+# Quality Gate Command
+
+Run the ECC quality pipeline on demand for a file or project scope.
+
+## Usage
+
+`/quality-gate [path|.] [--fix] [--strict]`
+
+- default target: current directory (`.`)
+- `--fix`: allow auto-format/fix where configured
+- `--strict`: fail on warnings where supported
+
+## Pipeline
+
+1. Detect language/tooling for target.
+2. Run formatter checks.
+3. Run lint/type checks when available.
+4. Produce a concise remediation list.
+
+## Notes
+
+This command mirrors hook behavior but is operator-invoked.
+
+## Arguments
+
+$ARGUMENTS:
+- `[path|.]` optional target path
+- `--fix` optional
+- `--strict` optional