feat: deliver v1.8.0 harness reliability and parity updates

Affaan Mustafa
2026-03-04 14:48:06 -08:00
parent 32e9c293f0
commit 48b883d741
84 changed files with 2990 additions and 725 deletions

@@ -1,10 +1,10 @@
---
description: Start the NanoClaw agent REPL — a persistent, session-aware AI assistant powered by the claude CLI.
description: Start NanoClaw v2 — ECC's persistent, zero-dependency REPL with model routing, skill hot-load, branching, compaction, export, and metrics.
---
# Claw Command
Start an interactive AI agent session that persists conversation history to disk and optionally loads ECC skill context.
Start an interactive AI agent session with persistent markdown history and operational controls.
## Usage
@@ -23,57 +23,29 @@ npm run claw
| Variable | Default | Description |
|----------|---------|-------------|
| `CLAW_SESSION` | `default` | Session name (alphanumeric + hyphens) |
| `CLAW_SKILLS` | *(empty)* | Comma-separated skill names to load as system context |
| `CLAW_SKILLS` | *(empty)* | Comma-separated skills loaded at startup |
| `CLAW_MODEL` | `sonnet` | Default model for the session |
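
As an illustration of the variables above, here is a minimal sketch of how they might be resolved into a config object. `resolveClawConfig` is a hypothetical helper, not part of NanoClaw, and the validation regex is an assumption based on the "alphanumeric + hyphens" description.

```javascript
// Hypothetical helper: resolve NanoClaw settings from an environment object.
// Defaults mirror the table above; the session-name check is an assumption.
function resolveClawConfig(env) {
  const session = env.CLAW_SESSION || 'default';
  if (!/^[A-Za-z0-9-]+$/.test(session)) {
    throw new Error(`Invalid session name: ${session}`);
  }
  // Split the comma-separated skill list, dropping blanks and stray spaces.
  const skills = (env.CLAW_SKILLS || '')
    .split(',')
    .map((name) => name.trim())
    .filter(Boolean);
  const model = env.CLAW_MODEL || 'sonnet';
  return { session, skills, model };
}

console.log(resolveClawConfig({ CLAW_SKILLS: 'tdd-workflow, security-review' }));
```

Rejecting names outside the allowed character set also keeps a session name from escaping the session directory, which is one plausible reason for the restriction.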
## REPL Commands
Inside the REPL, type these commands directly at the prompt:
```
/clear Clear current session history
/history Print full conversation history
/sessions List all saved sessions
/help Show available commands
exit Quit the REPL
```text
/help Show help
/clear Clear current session history
/history Print full conversation history
/sessions List saved sessions
/model [name] Show/set model
/load <skill-name> Hot-load a skill into context
/branch <session-name> Branch current session
/search <query> Search across sessions for a query
/compact Compact old turns, keep recent context
/export <md|json|txt> [path] Export session
/metrics Show session metrics
exit Quit
```
## How It Works
## Notes
1. Reads `CLAW_SESSION` env var to select a named session (default: `default`)
2. Loads conversation history from `~/.claude/claw/{session}.md`
3. Optionally loads ECC skill context from `CLAW_SKILLS` env var
4. Enters a blocking prompt loop — each user message is sent to `claude -p` with full history
5. Responses are appended to the session file for persistence across restarts
## Session Storage
Sessions are stored as Markdown files in `~/.claude/claw/`:
```
~/.claude/claw/default.md
~/.claude/claw/my-project.md
```
Each turn is formatted as:
```markdown
### [2025-01-15T10:30:00.000Z] User
What does this function do?
---
### [2025-01-15T10:30:05.000Z] Assistant
This function calculates...
---
```
## Examples
```bash
# Start default session
node scripts/claw.js
# Named session
CLAW_SESSION=my-project node scripts/claw.js
# With skill context
CLAW_SKILLS=tdd-workflow,security-review node scripts/claw.js
```
- NanoClaw remains zero-dependency.
- Sessions are stored at `~/.claude/claw/<session>.md`.
- Compaction keeps the most recent turns and writes a compaction header.
- Export supports markdown, JSON turns, and plain text.
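
The compaction note can be pictured with a small sketch. It assumes turns are held as an array of objects and that the caller chooses how many to keep; `compactTurns`, the turn shape, and the header wording are all hypothetical and may differ from what NanoClaw writes.

```javascript
// Hypothetical sketch of compaction: keep the newest `keepRecent` turns and
// produce a header recording how many older turns were dropped.
// The header text is illustrative, not NanoClaw's actual format.
function compactTurns(turns, keepRecent, now = new Date()) {
  if (!Number.isInteger(keepRecent) || keepRecent < 0) {
    throw new RangeError('keepRecent must be a non-negative integer');
  }
  // Nothing to compact: return a copy and no header.
  if (turns.length <= keepRecent) {
    return { header: null, turns: turns.slice() };
  }
  const dropped = turns.length - keepRecent;
  const kept = turns.slice(turns.length - keepRecent);
  const header = `> Compacted ${dropped} older turn(s) at ${now.toISOString()}`;
  return { header, turns: kept };
}

console.log(compactTurns([{ text: 'a' }, { text: 'b' }, { text: 'c' }], 2).turns);
```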

58
commands/harness-audit.md Normal file
@@ -0,0 +1,58 @@
# Harness Audit Command
Audit the current repository's agent harness setup and return a prioritized scorecard.
## Usage
`/harness-audit [scope] [--format text|json]`
- `scope` (optional): `repo` (default), `hooks`, `skills`, `commands`, `agents`
- `--format`: output style (`text` default, `json` for automation)
## What to Evaluate
Score each category from `0` to `10`:
1. Tool Coverage
2. Context Efficiency
3. Quality Gates
4. Memory Persistence
5. Eval Coverage
6. Security Guardrails
7. Cost Efficiency
## Output Contract
Return:
1. `overall_score` out of 70
2. Category scores and concrete findings
3. Top 3 actions with exact file paths
4. Suggested ECC skills to apply next
## Checklist
- Inspect `hooks/hooks.json`, `scripts/hooks/`, and hook tests.
- Inspect `skills/`, command coverage, and agent coverage.
- Verify cross-harness parity for `.cursor/`, `.opencode/`, `.codex/`.
- Flag broken or stale references.
## Example Result
```text
Harness Audit (repo): 52/70
- Quality Gates: 9/10
- Eval Coverage: 6/10
- Cost Efficiency: 4/10
Top 3 Actions:
1) Add cost tracking hook in scripts/hooks/cost-tracker.js
2) Add pass@k docs and templates in skills/eval-harness/SKILL.md
3) Add command parity for /harness-audit in .opencode/commands/
```
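
A minimal sketch of how the seven category scores could roll up into `overall_score`. `buildScorecard` and the `weakest` field are hypothetical, and the four scores not shown in the example result are made up so that the total matches 52.

```javascript
// The seven categories from "What to Evaluate", each scored 0 to 10.
const CATEGORIES = [
  'Tool Coverage', 'Context Efficiency', 'Quality Gates', 'Memory Persistence',
  'Eval Coverage', 'Security Guardrails', 'Cost Efficiency',
];

// Hypothetical helper: validate scores, sum them, and list the three weakest
// categories as candidates for the "Top 3 actions".
function buildScorecard(scores) {
  let overall = 0;
  for (const name of CATEGORIES) {
    const value = scores[name];
    if (!Number.isFinite(value) || value < 0 || value > 10) {
      throw new RangeError(`${name} must be a number from 0 to 10`);
    }
    overall += value;
  }
  const weakest = CATEGORIES.slice()
    .sort((a, b) => scores[a] - scores[b])
    .slice(0, 3);
  return { overall_score: overall, max_score: CATEGORIES.length * 10, weakest };
}

console.log(buildScorecard({
  'Tool Coverage': 8, 'Context Efficiency': 7, 'Quality Gates': 9,
  'Memory Persistence': 9, 'Eval Coverage': 6, 'Security Guardrails': 9,
  'Cost Efficiency': 4,
}));
```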
## Arguments
$ARGUMENTS:
- `repo|hooks|skills|commands|agents` (optional scope)
- `--format text|json` (optional output format)

32
commands/loop-start.md Normal file
@@ -0,0 +1,32 @@
# Loop Start Command
Start a managed autonomous loop pattern with safety defaults.
## Usage
`/loop-start [pattern] [--mode safe|fast]`
- `pattern`: `sequential`, `continuous-pr`, `rfc-dag`, `infinite`
- `--mode`:
- `safe` (default): strict quality gates and checkpoints
- `fast`: reduced gates for speed
## Flow
1. Confirm repository state and branch strategy.
2. Select loop pattern and model tier strategy.
3. Enable required hooks/profile for the chosen mode.
4. Create loop plan and write runbook under `.claude/plans/`.
5. Print commands to start and monitor the loop.
## Required Safety Checks
- Verify tests pass before first loop iteration.
- Ensure `ECC_HOOK_PROFILE` is not disabled globally.
- Ensure the loop has an explicit stop condition.
## Arguments
$ARGUMENTS:
- `<pattern>` optional (`sequential|continuous-pr|rfc-dag|infinite`)
- `--mode safe|fast` optional
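
The argument grammar above could be parsed roughly as follows. `parseLoopStartArgs` is a hypothetical helper. Because no default pattern is documented, the sketch leaves `pattern` as `null` when it is omitted.

```javascript
const PATTERNS = ['sequential', 'continuous-pr', 'rfc-dag', 'infinite'];
const MODES = ['safe', 'fast'];

// Hypothetical parser for `/loop-start [pattern] [--mode safe|fast]`.
// `argv` is the already-split argument list, e.g. ['rfc-dag', '--mode', 'fast'].
function parseLoopStartArgs(argv) {
  const result = { pattern: null, mode: 'safe' }; // `safe` is the documented default
  for (let i = 0; i < argv.length; i += 1) {
    const arg = argv[i];
    if (arg === '--mode') {
      const value = argv[i + 1];
      if (!MODES.includes(value)) {
        throw new Error(`--mode expects one of: ${MODES.join(', ')}`);
      }
      result.mode = value;
      i += 1; // skip the consumed value
    } else if (PATTERNS.includes(arg) && result.pattern === null) {
      result.pattern = arg;
    } else {
      throw new Error(`Unrecognized argument: ${arg}`);
    }
  }
  return result;
}

console.log(parseLoopStartArgs(['rfc-dag', '--mode', 'fast']));
```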

24
commands/loop-status.md Normal file
@@ -0,0 +1,24 @@
# Loop Status Command
Inspect active loop state, progress, and failure signals.
## Usage
`/loop-status [--watch]`
## What to Report
- active loop pattern
- current phase and last successful checkpoint
- failing checks (if any)
- estimated time/cost drift
- recommended intervention (continue/pause/stop)
## Watch Mode
When `--watch` is present, refresh status periodically and surface state changes.
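
One way to surface state changes between refreshes is to diff two status snapshots. This is only a sketch: `diffStatus` and the snapshot fields (`phase`, `failingChecks`) are hypothetical, since the status shape is not specified here.

```javascript
// Hypothetical helper: compare two flat status snapshots and report which
// fields changed. Assumes values are primitives (strings, numbers, booleans).
function diffStatus(previous, current) {
  const changes = [];
  const keys = new Set([...Object.keys(previous), ...Object.keys(current)]);
  for (const key of keys) {
    if (previous[key] !== current[key]) {
      changes.push({ field: key, from: previous[key], to: current[key] });
    }
  }
  return changes;
}

console.log(diffStatus(
  { phase: 'plan', failingChecks: 0 },
  { phase: 'implement', failingChecks: 0 },
));
```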
## Arguments
$ARGUMENTS:
- `--watch` optional

26
commands/model-route.md Normal file
@@ -0,0 +1,26 @@
# Model Route Command
Recommend the best model tier for the current task based on its complexity and budget.
## Usage
`/model-route [task-description] [--budget low|med|high]`
## Routing Heuristic
- `haiku`: deterministic, low-risk mechanical changes
- `sonnet`: default for implementation and refactors
- `opus`: architecture, deep review, ambiguous requirements
## Required Output
- recommended model
- confidence level
- why this model fits
- fallback model if first attempt fails
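
The heuristic and output fields above might be sketched like this. The task flags (`mechanical`, `lowRisk`, `ambiguous`, `architectural`), the confidence labels, and the fallback choices are all assumptions made for illustration. The sketch ignores `--budget`, because its effect on routing is not defined here.

```javascript
// Hypothetical router: maps coarse task traits to a model tier and returns
// the four fields listed under "Required Output".
function routeModel(task = {}) {
  const {
    mechanical = false, lowRisk = false, ambiguous = false, architectural = false,
  } = task;
  if (architectural || ambiguous) {
    return {
      model: 'opus',
      confidence: 'medium',
      reason: 'architecture, deep review, or ambiguous requirements',
      fallback: 'sonnet',
    };
  }
  if (mechanical && lowRisk) {
    return {
      model: 'haiku',
      confidence: 'high',
      reason: 'deterministic, low-risk mechanical change',
      fallback: 'sonnet',
    };
  }
  return {
    model: 'sonnet',
    confidence: 'medium',
    reason: 'default for implementation and refactors',
    fallback: 'opus',
  };
}

console.log(routeModel({ mechanical: true, lowRisk: true }));
```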
## Arguments
$ARGUMENTS:
- `[task-description]` optional free-text
- `--budget low|med|high` optional

29
commands/quality-gate.md Normal file
View File

@@ -0,0 +1,29 @@
# Quality Gate Command
Run the ECC quality pipeline on demand for a file or project scope.
## Usage
`/quality-gate [path|.] [--fix] [--strict]`
- default target: current directory (`.`)
- `--fix`: allow auto-format/fix where configured
- `--strict`: fail on warnings where supported
## Pipeline
1. Detect language/tooling for target.
2. Run formatter checks.
3. Run lint/type checks when available.
4. Produce a concise remediation list.
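
The last pipeline step, and the effect of `--strict`, can be sketched as a small reducer over tool findings. `summarizeGate` and the finding shape (`file`, `tool`, `severity`, `message`) are hypothetical. Real formatter and linter output would have to be normalized into this shape first.

```javascript
// Hypothetical summary step: errors always block; warnings block only in
// strict mode. Returns a pass/fail flag and a concise remediation list.
function summarizeGate(findings, { strict = false } = {}) {
  const blocking = findings.filter(
    (f) => f.severity === 'error' || (strict && f.severity === 'warning'),
  );
  const remediation = blocking.map((f) => `${f.file}: ${f.message} (${f.tool})`);
  return { passed: blocking.length === 0, remediation };
}

console.log(summarizeGate(
  [{ file: 'a.js', tool: 'eslint', severity: 'warning', message: 'unused variable' }],
  { strict: true },
));
```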
## Notes
This command mirrors hook behavior but is operator-invoked.
## Arguments
$ARGUMENTS:
- `[path|.]` optional target path
- `--fix` optional
- `--strict` optional