docs(zh-CN): sync Chinese docs with latest upstream changes (#304)

* docs(zh-CN): sync Chinese docs with latest upstream changes
* update

---------

Co-authored-by: neo <neo.dowithless@gmail.com>
2026-06-15 04:31:27 +08:00 · 2026-03-03 14:28:27 +08:00
parent adc0f67008
commit ada4cd75a3
114 changed files with 11161 additions and 4790 deletions
@@ -1,6 +1,7 @@
 ---
 name: eval-harness
 description: Formal evaluation framework for Claude Code sessions, implementing eval-driven development (EDD) principles
+origin: ECC
 tools: Read, Write, Edit, Bash, Grep, Glob
 ---

@@ -8,6 +9,14 @@ tools: Read, Write, Edit, Bash, Grep, Glob

 A formal evaluation framework for Claude Code sessions that implements eval-driven development (EDD) principles.

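+## When to Activate
+
+* Setting up eval-driven development (EDD) for AI-assisted workflows
+* Defining pass/fail criteria for Claude Code task completion
+* Measuring agent reliability with pass@k metrics
+* Creating regression test suites for prompt or agent changes
+* Benchmarking agent performance across model versions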
+
 ## Philosophy

 Eval-driven development treats evals as the "unit tests of AI development":