mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-04-01 22:53:27 +08:00
feat: complete OpenCode plugin support with hooks, tools, and commands
Major OpenCode integration overhaul:
- llms.txt: Comprehensive OpenCode documentation for LLMs (642 lines)
- .opencode/plugins/ecc-hooks.ts: All Claude Code hooks translated to OpenCode's plugin system
- .opencode/tools/*.ts: 3 custom tools (run-tests, check-coverage, security-audit)
- .opencode/commands/*.md: All 24 commands in OpenCode format
- .opencode/package.json: npm package structure for opencode-ecc
- .opencode/index.ts: Main plugin entry point
- Delete incorrect LIMITATIONS.md (hooks ARE supported via plugins)
- Rewrite MIGRATION.md with correct hook event mapping
- Update README.md OpenCode section to show full feature parity

OpenCode has 20+ events vs Claude Code's 3 phases:
- PreToolUse → tool.execute.before
- PostToolUse → tool.execute.after
- Stop → session.idle
- SessionStart → session.created
- SessionEnd → session.deleted
- Plus: file.edited, file.watcher.updated, permission.asked, todo.updated

- 12 agents: Full parity
- 24 commands: Full parity (+1 from original 23)
- 16 skills: Full parity
- Hooks: OpenCode has MORE (20+ events vs 3 phases)
- Custom Tools: 3 native OpenCode tools

The OpenCode configuration can now be:
1. Used directly: cd everything-claude-code && opencode
2. Installed via npm: npm install opencode-ecc
88
.opencode/commands/eval.md
Normal file
@@ -0,0 +1,88 @@
---
description: Run evaluation against acceptance criteria
agent: build
---

# Eval Command

Evaluate implementation against acceptance criteria: $ARGUMENTS

## Your Task

Run structured evaluation to verify the implementation meets requirements.

## Evaluation Framework

### Grader Types

1. **Binary Grader** - Pass/Fail
- Does it work? Yes/No
- Good for: feature completion, bug fixes

2. **Scalar Grader** - Score 0-100
- How well does it work?
- Good for: performance, quality metrics

3. **Rubric Grader** - Category scores
- Multiple dimensions evaluated
- Good for: comprehensive review
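
The three grader types can be pictured as functions that all return the same result shape. The following is a minimal sketch, not part of the command itself: the names `GradeResult`, `binary_grade`, `scalar_grade`, and `rubric_grade` are hypothetical, and both the default pass mark of 70 and the unweighted rubric average are assumptions.

```python
from dataclasses import dataclass


@dataclass
class GradeResult:
    """Outcome of one grader run; `score` is normalized to 0-100."""
    score: float
    passed: bool
    detail: str = ""


def binary_grade(works: bool) -> GradeResult:
    """Binary grader: pass/fail, mapped to a score of 100 or 0."""
    return GradeResult(score=100.0 if works else 0.0, passed=works)


def scalar_grade(score: float, threshold: float = 70.0) -> GradeResult:
    """Scalar grader: a 0-100 score; `threshold` is an assumed pass mark."""
    if not 0.0 <= score <= 100.0:
        raise ValueError("score must be between 0 and 100")
    return GradeResult(score=score, passed=score >= threshold)


def rubric_grade(categories: dict[str, float], threshold: float = 70.0) -> GradeResult:
    """Rubric grader: unweighted mean of per-category 0-100 scores (an assumption)."""
    if not categories:
        raise ValueError("rubric needs at least one category")
    mean = sum(categories.values()) / len(categories)
    detail = ", ".join(f"{name}={value:g}" for name, value in categories.items())
    return GradeResult(score=mean, passed=mean >= threshold, detail=detail)
```

Normalizing every grader to a 0-100 score is one way to let binary, scalar, and rubric results feed the same weighted total.
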

## Evaluation Process

### Step 1: Define Criteria

```
Acceptance Criteria:
1. [Criterion 1] - [weight]
2. [Criterion 2] - [weight]
3. [Criterion 3] - [weight]
```

### Step 2: Run Tests

For each criterion:
- Execute relevant test
- Collect evidence
- Score result

### Step 3: Calculate Score

```
Final Score = Σ (criterion_score × weight) / total_weight
```
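
As a worked instance of the formula, here is a small sketch under stated assumptions: the function name `final_score` is hypothetical, criterion scores are taken on a 0-100 scale, and the example scores are invented for illustration. The 30/40/30 weights mirror the report template.

```python
def final_score(results: list[tuple[float, float]]) -> float:
    """Weighted mean of (criterion_score, weight) pairs."""
    total_weight = sum(weight for _, weight in results)
    if total_weight <= 0:
        raise ValueError("total weight must be positive")
    return sum(score * weight for score, weight in results) / total_weight


# Hypothetical criterion scores (0-100) with weights of 30, 40, and 30.
example = [(80.0, 30.0), (90.0, 40.0), (70.0, 30.0)]
print(final_score(example))  # (2400 + 3600 + 2100) / 100 = 81.0
```

Dividing by `total_weight` means the weights do not have to sum to 100; they only need to be positive in total.
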

### Step 4: Report

## Evaluation Report

### Overall: [PASS/FAIL] (Score: X/100)

### Criterion Breakdown

| Criterion | Score | Weight | Weighted |
|-----------|-------|--------|----------|
| [Criterion 1] | X/10 | 30% | X |
| [Criterion 2] | X/10 | 40% | X |
| [Criterion 3] | X/10 | 30% | X |

### Evidence

**Criterion 1: [Name]**
- Test: [what was tested]
- Result: [outcome]
- Evidence: [screenshot, log, output]

### Recommendations

[If not passing, what needs to change]

## Pass@K Metrics

For non-deterministic evaluations:
- Run K times
- Calculate pass rate
- Report: "Pass@K = X/K"
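
The three steps above can be sketched as follows. The helper names `pass_rate` and `format_pass_at_k` are hypothetical, and `run_eval` stands in for whatever check is being repeated. This computes the plain pass rate over K runs as described here, which is a different quantity from the "at least one of k samples passes" estimator that some code-generation benchmarks also call pass@k.

```python
from typing import Callable


def pass_rate(run_eval: Callable[[], bool], k: int) -> tuple[int, int]:
    """Run `run_eval` k times and return (number of passes, k)."""
    if k <= 0:
        raise ValueError("k must be positive")
    passes = sum(1 for _ in range(k) if run_eval())
    return passes, k


def format_pass_at_k(passes: int, k: int) -> str:
    """Render the result in the 'Pass@K = X/K' form used in the report."""
    return f"Pass@{k} = {passes}/{k}"
```

For example, `format_pass_at_k(*pass_rate(my_check, 5))` would report how many of five runs of a hypothetical `my_check` function passed.
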

---

**TIP**: Use eval for acceptance testing before marking features complete.