mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-04-06 17:23:28 +08:00
feat: add agent introspection debugging skill
153
.agents/skills/agent-introspection-debugging/SKILL.md
Normal file
@@ -0,0 +1,153 @@
---
name: agent-introspection-debugging
description: Structured self-debugging workflow for AI agent failures using capture, diagnosis, contained recovery, and introspection reports.
origin: ECC
---

# Agent Introspection Debugging

Use this skill when an agent run is failing repeatedly, consuming tokens without progress, looping on the same tools, or drifting away from the intended task.

This is a workflow skill, not a hidden runtime. It teaches the agent to debug itself systematically before escalating to a human.

## When to Activate

- Maximum tool call / loop-limit failures
- Repeated retries with no forward progress
- Context growth or prompt drift that starts degrading output quality
- File-system or environment state mismatch between expectation and reality
- Tool failures that are likely recoverable with diagnosis and a smaller corrective action

## Scope Boundaries

Activate this skill for:
- capturing failure state before retrying blindly
- diagnosing common agent-specific failure patterns
- applying contained recovery actions
- producing a structured human-readable debug report

Do not use this skill as the primary source for:
- feature verification after code changes; use `verification-loop`
- framework-specific debugging when a narrower ECC skill already exists
- runtime promises the current harness cannot enforce automatically

## Four-Phase Loop

### Phase 1: Failure Capture

Before trying to recover, record the failure precisely.

Capture:
- error type, message, and stack trace when available
- last meaningful tool call sequence
- what the agent was trying to do
- current context pressure: repeated prompts, oversized pasted logs, duplicated plans, or runaway notes
- current environment assumptions: cwd, branch, relevant service state, expected files

Minimum capture template:

```markdown
## Failure Capture
- Session / task:
- Goal in progress:
- Error:
- Last successful step:
- Last failed tool / command:
- Repeated pattern seen:
- Environment assumptions to verify:
```
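
If the capture needs to be produced programmatically, the template fields could be held in a small record and rendered back to the same bullet layout. The sketch below is only an illustration: the `FailureCapture` class, its field names, and its default values are hypothetical and not part of the skill.

```python
from dataclasses import dataclass, field


@dataclass
class FailureCapture:
    """Hypothetical container mirroring the capture template fields."""

    session: str
    goal: str
    error: str
    last_successful_step: str = "unknown"
    last_failed_command: str = "unknown"
    repeated_pattern: str = "none observed"
    assumptions_to_verify: list[str] = field(default_factory=list)

    def to_markdown(self) -> str:
        """Render the capture in the same bullet layout as the template."""
        assumptions = "; ".join(self.assumptions_to_verify) or "none listed"
        lines = [
            "## Failure Capture",
            f"- Session / task: {self.session}",
            f"- Goal in progress: {self.goal}",
            f"- Error: {self.error}",
            f"- Last successful step: {self.last_successful_step}",
            f"- Last failed tool / command: {self.last_failed_command}",
            f"- Repeated pattern seen: {self.repeated_pattern}",
            f"- Environment assumptions to verify: {assumptions}",
        ]
        return "\n".join(lines)
```

Defaulting the optional fields to explicit placeholders such as `unknown` keeps a gap visible in the capture instead of silently dropping the bullet.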

### Phase 2: Root-Cause Diagnosis

Match the failure to a known pattern before changing anything.

| Pattern | Likely Cause | Check |
| --- | --- | --- |
| Maximum tool calls / repeated same command | loop or no-exit observer path | inspect the last N tool calls for repetition |
| Context overflow / degraded reasoning | unbounded notes, repeated plans, oversized logs | inspect recent context for duplication and low-signal bulk |
| `ECONNREFUSED` / timeout | service unavailable or wrong port | verify service health, URL, and port assumptions |
| `429` / quota exhaustion | retry storm or missing backoff | count repeated calls and inspect retry spacing |
| file missing after write / stale diff | race, wrong cwd, or branch drift | re-check path, cwd, git status, and actual file existence |
| tests still failing after “fix” | wrong hypothesis | isolate the exact failing test and re-derive the bug |
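
The first row's check, inspecting the last N tool calls for repetition, can be approximated with a simple counter. This is a minimal sketch under stated assumptions: the history is assumed to be a list of `(tool_name, arguments)` string pairs, and `find_repeated_calls`, `window`, and `threshold` are hypothetical names with illustrative defaults.

```python
from collections import Counter


def find_repeated_calls(tool_calls, window=10, threshold=3):
    """Return calls seen at least `threshold` times within the last `window` calls.

    `tool_calls` is assumed to be a list of (tool_name, arguments) string
    pairs, oldest first. The default values are illustrative, not tuned.
    """
    recent = tool_calls[-window:]
    counts = Counter(recent)
    return {call: n for call, n in counts.items() if n >= threshold}
```

An empty result does not prove the run is healthy; it only means that exact repeats were not found in the inspected window.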

Diagnosis questions:
- is this a logic failure, state failure, environment failure, or policy failure?
- did the agent lose the real objective and start optimizing the wrong subtask?
- is the failure deterministic or transient?
- what is the smallest reversible action that would validate the diagnosis?
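
As a crude first pass at the classification question, raw error text could be mapped onto a few of the patterns from the table. The sketch below is an assumption-laden illustration: the `classify_error` function, the label names, and every regular expression (including the `ENOENT` match for missing files) are hypothetical guesses, and anything unmatched still needs a human-style diagnosis.

```python
import re

# (label, regex) pairs checked in order; the first match wins.
# Labels and regexes are illustrative guesses, not part of the skill.
PATTERNS = [
    ("connection", re.compile(r"ECONNREFUSED|timed? ?out", re.IGNORECASE)),
    ("rate-limit", re.compile(r"\b429\b|quota|rate limit", re.IGNORECASE)),
    ("loop-limit", re.compile(r"max(imum)? tool calls|loop limit", re.IGNORECASE)),
    ("missing-file", re.compile(r"ENOENT|no such file", re.IGNORECASE)),
]


def classify_error(message: str) -> str:
    """Return the first matching pattern label, or 'unclassified'."""
    for label, regex in PATTERNS:
        if regex.search(message):
            return label
    return "unclassified"
```

A label from this function is a hypothesis to check, not a diagnosis; the table's "Check" column still applies.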

### Phase 3: Contained Recovery

Recover with the smallest action that changes the diagnosis surface.

Safe recovery actions:
- stop repeated retries and restate the hypothesis
- trim low-signal context and keep only the active goal, blockers, and evidence
- re-check the actual filesystem / branch / process state
- narrow the task to one failing command, one file, or one test
- switch from speculative reasoning to direct observation
- escalate to a human when the failure is high-risk or externally blocked

Do not claim unsupported auto-healing actions like “reset agent state” or “update harness config” unless you are actually doing them through real tools in the current environment.

Contained recovery checklist:

```markdown
## Recovery Action
- Diagnosis chosen:
- Smallest action taken:
- Why this is safe:
- What evidence would prove the fix worked:
```
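
Re-checking the actual filesystem and branch state is a read-only action, so it fits the "smallest action" rule well. The sketch below shows one possible shape, assuming `git` is on the `PATH`; `check_expected_files` and `current_branch` are hypothetical helper names, not functions provided by ECC.

```python
import subprocess
from pathlib import Path


def check_expected_files(root, expected):
    """Map each expected relative path to whether it exists under `root`."""
    base = Path(root)
    return {rel: (base / rel).exists() for rel in expected}


def current_branch(repo_dir):
    """Return the checked-out branch name, or None if git cannot report one.

    Uses `git rev-parse --abbrev-ref HEAD`, which prints "HEAD" when the
    repository is in a detached-HEAD state.
    """
    try:
        result = subprocess.run(
            ["git", "rev-parse", "--abbrev-ref", "HEAD"],
            cwd=repo_dir,
            capture_output=True,
            text=True,
            check=False,
        )
    except OSError:
        # git is not installed, or repo_dir does not exist.
        return None
    if result.returncode != 0:
        return None
    return result.stdout.strip()
```

Both helpers only observe state, so their output can go straight into the "What evidence would prove the fix worked" line of the checklist.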

### Phase 4: Introspection Report

End with a report that makes the recovery legible to the next agent or human.

```markdown
## Agent Self-Debug Report
- Session / task:
- Failure:
- Root cause:
- Recovery action:
- Result: success | partial | blocked
- Token / time burn risk:
- Follow-up needed:
- Preventive change to encode later:
```
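
A report generator can enforce the one constrained field in the template, the `Result` value. The following is a sketch only: `render_report`, its parameter names, and its default strings are hypothetical, and the three allowed values are taken from the template itself.

```python
ALLOWED_RESULTS = ("success", "partial", "blocked")


def render_report(session, failure, root_cause, recovery_action, result,
                  burn_risk="not assessed", follow_up="none",
                  preventive_change="none"):
    """Render the self-debug report, rejecting results outside the template's three values."""
    if result not in ALLOWED_RESULTS:
        raise ValueError(f"result must be one of {ALLOWED_RESULTS}, got {result!r}")
    return "\n".join([
        "## Agent Self-Debug Report",
        f"- Session / task: {session}",
        f"- Failure: {failure}",
        f"- Root cause: {root_cause}",
        f"- Recovery action: {recovery_action}",
        f"- Result: {result}",
        f"- Token / time burn risk: {burn_risk}",
        f"- Follow-up needed: {follow_up}",
        f"- Preventive change to encode later: {preventive_change}",
    ])
```

Raising on an unknown result keeps vague outcomes such as "fixed" or "probably fine" out of the report.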

## Recovery Heuristics

Prefer these interventions in order:

1. Restate the real objective in one sentence.
2. Verify the world state instead of trusting memory.
3. Shrink the failing scope.
4. Run one discriminating check.
5. Only then retry.

Bad pattern:
- retrying the same action three times with slightly different wording

Good pattern:
- capture failure
- classify the pattern
- run one direct check
- change the plan only if the check supports it
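
The ordered interventions listed earlier in this section can be treated as a gate in which a retry becomes available only after the four cheaper steps. The sketch below assumes the agent tracks completed steps as a set of labels; `INTERVENTIONS`, the short labels, and `next_intervention` are hypothetical names chosen for illustration.

```python
# Ordered as in the numbered list; the short labels are illustrative.
INTERVENTIONS = (
    "restate objective",
    "verify world state",
    "shrink scope",
    "run discriminating check",
    "retry",
)


def next_intervention(completed):
    """Return the earliest intervention not yet in `completed`, or None when all are done."""
    for step in INTERVENTIONS:
        if step not in completed:
            return step
    return None
```

Because the function always returns the earliest missing step, marking `retry` as done without the earlier steps still sends the agent back to restating the objective.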

## Integration with ECC

- Use `verification-loop` after recovery if code was changed.
- Use `continuous-learning-v2` when the failure pattern is worth turning into an instinct or later skill.
- Use `council` when the issue is not technical failure but decision ambiguity.
- Use `workspace-surface-audit` if the failure came from conflicting local state or repo drift.

## Output Standard

When this skill is active, do not end with “I fixed it” alone.

Always provide:
- the failure pattern
- the root-cause hypothesis
- the recovery action
- the evidence that the situation is now better or still blocked
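
The four required items can be checked mechanically before a final answer is sent. This is a minimal sketch, assuming the summary is available as a dictionary; the key names in `REQUIRED_FIELDS` and the `missing_fields` function are hypothetical and not defined by the skill.

```python
# Hypothetical keys for the four items the Output Standard asks for.
REQUIRED_FIELDS = (
    "failure_pattern",
    "root_cause_hypothesis",
    "recovery_action",
    "evidence",
)


def missing_fields(summary):
    """Return the required keys that are absent, non-string, or blank in `summary`."""
    missing = []
    for name in REQUIRED_FIELDS:
        value = summary.get(name)
        if not (isinstance(value, str) and value.strip()):
            missing.append(name)
    return missing
```

A bare "I fixed it" corresponds to an empty dictionary here, which reports all four fields as missing.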

@@ -1,6 +1,6 @@
 # Everything Claude Code (ECC) — Agent Instructions

-This is a **production-ready AI coding plugin** providing 47 specialized agents, 180 skills, 79 commands, and automated hook workflows for software development.
+This is a **production-ready AI coding plugin** providing 47 specialized agents, 181 skills, 79 commands, and automated hook workflows for software development.

 **Version:** 1.10.0

@@ -146,7 +146,7 @@ Troubleshoot failures: check test isolation → verify mocks → fix implementat

 ```
 agents/ — 47 specialized subagents
-skills/ — 180 workflow skills and domain knowledge
+skills/ — 181 workflow skills and domain knowledge
 commands/ — 79 slash commands
 hooks/ — Trigger-based automations
 rules/ — Always-follow guidelines (common + per-language)

@@ -236,7 +236,7 @@ For manual install instructions see the README in the `rules/` folder. When copy
 /plugin list ecc@ecc
 ```

-**That's it!** You now have access to 47 agents, 180 skills, and 79 legacy command shims.
+**That's it!** You now have access to 47 agents, 181 skills, and 79 legacy command shims.

 ### Multi-model commands require additional setup

@@ -1154,7 +1154,7 @@ The configuration is automatically detected from `.opencode/opencode.json`.
 |---------|-------------|----------|--------|
 | Agents | PASS: 47 agents | PASS: 12 agents | **Claude Code leads** |
 | Commands | PASS: 79 commands | PASS: 31 commands | **Claude Code leads** |
-| Skills | PASS: 180 skills | PASS: 37 skills | **Claude Code leads** |
+| Skills | PASS: 181 skills | PASS: 37 skills | **Claude Code leads** |
 | Hooks | PASS: 8 event types | PASS: 11 events | **OpenCode has more!** |
 | Rules | PASS: 29 rules | PASS: 13 instructions | **Claude Code leads** |
 | MCP Servers | PASS: 14 servers | PASS: Full | **Full parity** |
@@ -1263,7 +1263,7 @@ ECC is the **first plugin to maximize every major AI coding tool**. Here's how e
 |---------|------------|------------|-----------|----------|
 | **Agents** | 47 | Shared (AGENTS.md) | Shared (AGENTS.md) | 12 |
 | **Commands** | 79 | Shared | Instruction-based | 31 |
-| **Skills** | 180 | Shared | 10 (native format) | 37 |
+| **Skills** | 181 | Shared | 10 (native format) | 37 |
 | **Hook Events** | 8 types | 15 types | None yet | 11 types |
 | **Hook Scripts** | 20+ scripts | 16 scripts (DRY adapter) | N/A | Plugin hooks |
 | **Rules** | 34 (common + lang) | 34 (YAML frontmatter) | Instruction-based | 13 instructions |

@@ -106,7 +106,7 @@ cp -r everything-claude-code/rules/perl ~/.claude/rules/
 /plugin list ecc@ecc
 ```

-**Done!** You can now use 47 agents, 180 skills, and 79 commands.
+**Done!** You can now use 47 agents, 181 skills, and 79 commands.

 ### multi-* commands require additional configuration

@@ -92,6 +92,7 @@ Keep this file detailed for only the current sprint, blockers, and next actions.

 - 2026-04-05: Continued `#1213` overlap cleanup by narrowing `coding-standards` into the baseline cross-project conventions layer instead of deleting it. The skill now explicitly points detailed React/UI guidance to `frontend-patterns`, backend/API structure to `backend-patterns` / `api-design`, and keeps only reusable naming, readability, immutability, and code-quality expectations.
 - 2026-04-05: Added a packaging regression guard for the OpenCode release path after `#1287` showed the published `v1.10.0` artifact was still stale. `tests/scripts/build-opencode.test.js` now asserts the `npm pack --dry-run` tarball includes `.opencode/dist/index.js` plus compiled plugin/tool entrypoints, so future releases cannot silently omit the built OpenCode payload.
+- 2026-04-05: Landed `skills/agent-introspection-debugging` for `#829` as an ECC-native self-debugging framework. It is intentionally guidance-first rather than fake runtime automation: capture failure state, classify the pattern, apply the smallest contained recovery action, then emit a structured introspection report and hand off to `verification-loop` / `continuous-learning-v2` when appropriate.
 - 2026-04-05: Fixed the `main` npm CI break after the latest direct ports. `package-lock.json` had drifted behind `package.json` on the `globals` devDependency (`^17.1.0` vs `^17.4.0`), which caused all npm-based GitHub Actions jobs to fail at `npm ci`. Refreshed the lockfile only, verified `npm ci --ignore-scripts`, and kept the mixed-lock workspace otherwise untouched.
 - 2026-04-05: Direct-ported the useful discoverability part of `#1221` without duplicating a second healthcare compliance system. Added `skills/hipaa-compliance/SKILL.md` as a thin HIPAA-specific entrypoint that points into the canonical `healthcare-phi-compliance` / `healthcare-reviewer` lane, and wired both healthcare privacy skills into the `security` install module for selective installs.
 - 2026-04-05: Direct-ported the audited blockchain/web3 security lane from `#1222` into `main` as four self-contained skills: `defi-amm-security`, `evm-token-decimals`, `llm-trading-agent-security`, and `nodejs-keccak256`. These are now part of the `security` install module instead of living as an unmerged fork PR.

@@ -1,6 +1,6 @@
 # Everything Claude Code (ECC) — Agent Instructions

-This is a **production-ready AI coding plugin** providing 47 specialized agents, 180 skills, 79 commands, and automated hook workflows for software development.
+This is a **production-ready AI coding plugin** providing 47 specialized agents, 181 skills, 79 commands, and automated hook workflows for software development.

 **Version:** 1.10.0

@@ -147,7 +147,7 @@

 ```
 agents/ — 47 specialized subagents
-skills/ — 180 workflow skills and domain knowledge
+skills/ — 181 workflow skills and domain knowledge
 commands/ — 79 slash commands
 hooks/ — Trigger-based automations
 rules/ — Always-follow guidelines (common + per-language)

@@ -209,7 +209,7 @@ npx ecc-install typescript
 /plugin list ecc@ecc
 ```

-**All set!** You can now use 47 agents, 180 skills, and 79 commands.
+**All set!** You can now use 47 agents, 181 skills, and 79 commands.

 ***

@@ -1096,7 +1096,7 @@ opencode
 |---------|-------------|----------|--------|
 | Agents | PASS: 47 | PASS: 12 | **Claude Code leads** |
 | Commands | PASS: 79 | PASS: 31 | **Claude Code leads** |
-| Skills | PASS: 180 | PASS: 37 | **Claude Code leads** |
+| Skills | PASS: 181 | PASS: 37 | **Claude Code leads** |
 | Hooks | PASS: 8 event types | PASS: 11 events | **OpenCode has more!** |
 | Rules | PASS: 29 | PASS: 13 instructions | **Claude Code leads** |
 | MCP Servers | PASS: 14 | PASS: Full | **Full parity** |
@@ -1208,7 +1208,7 @@ ECC is the **first plugin to maximize every major AI coding tool**.
 |---------|------------|------------|-----------|----------|
 | **Agents** | 47 | Shared (AGENTS.md) | Shared (AGENTS.md) | 12 |
 | **Commands** | 79 | Shared | Instruction-based | 31 |
-| **Skills** | 180 | Shared | 10 (native format) | 37 |
+| **Skills** | 181 | Shared | 10 (native format) | 37 |
 | **Hook Events** | 8 types | 15 types | None yet | 11 types |
 | **Hook Scripts** | 20+ scripts | 16 scripts (DRY adapter) | N/A | Plugin hooks |
 | **Rules** | 34 (common + lang) | 34 (YAML frontmatter) | Instruction-based | 13 instructions |

@@ -200,6 +200,7 @@
 "description": "Evaluation, TDD, verification, learning, and compaction skills.",
 "paths": [
 "skills/agent-sort",
+"skills/agent-introspection-debugging",
 "skills/ai-regression-testing",
 "skills/configure-ecc",
 "skills/code-tour",

153
skills/agent-introspection-debugging/SKILL.md
Normal file
@@ -0,0 +1,153 @@