mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-04-13 05:03:28 +08:00
docs(zh-CN): sync Chinese docs with latest upstream changes (#304)
* docs(zh-CN): sync Chinese docs with latest upstream changes
* update

---------

Co-authored-by: neo <neo.dowithless@gmail.com>
@@ -1,6 +1,7 @@
---
name: eval-harness
description: Formal evaluation framework for Claude Code sessions implementing eval-driven development (EDD) principles
origin: ECC
tools: Read, Write, Edit, Bash, Grep, Glob
---
@@ -8,6 +9,14 @@ tools: Read, Write, Edit, Bash, Grep, Glob
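A formal evaluation framework for Claude Code sessions that implements eval-driven development (EDD) principles.

## When to Activate

* Setting up eval-driven development (EDD) for AI-assisted workflows
* Defining pass/fail criteria for Claude Code task completion
* Measuring agent reliability with pass@k metrics
* Creating regression test suites for prompt or agent changes
* Benchmarking agent performance across model versions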
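
The pass@k metric mentioned in the list above can be sketched as follows. This is a minimal illustration, not the skill's own implementation: it assumes the common definition of pass@k as the probability that at least one of `k` attempts succeeds, estimated from `n` recorded trials of which `c` passed. The function name `pass_at_k` and its parameters are hypothetical.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n trials with c successes (assumed definition).

    Computes 1 - C(n - c, k) / C(n, k): one minus the probability that a
    random sample of k trials contains only failures.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # math.comb returns 0 when fewer than k failures exist,
    # so the estimate becomes exactly 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


if __name__ == "__main__":
    # Hypothetical numbers: 10 recorded runs of one task, 3 of them passed.
    for k in (1, 3, 5):
        print(f"pass@{k} = {pass_at_k(10, 3, k):.3f}")
```

Under this definition the estimate rises with `k`, so a reliability target should state which `k` it refers to.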

## Philosophy

Eval-driven development treats evals as the "unit tests of AI development":