From e09c548edfc5f6943d2077b42250e9b81b6113b4 Mon Sep 17 00:00:00 2001
From: Affaan Mustafa <affaan@dcube.ai>
Date: Sun, 5 Apr 2026 20:10:54 -0700
Subject: [PATCH] feat: add agent introspection debugging skill

---
 .../agent-introspection-debugging/SKILL.md    | 153 ++++++++++++++++++
 AGENTS.md                                     |   4 +-
 README.md                                     |   6 +-
 README.zh-CN.md                               |   2 +-
 WORKING-CONTEXT.md                            |   1 +
 docs/zh-CN/AGENTS.md                          |   4 +-
 docs/zh-CN/README.md                          |   6 +-
 manifests/install-modules.json                |   1 +
 skills/agent-introspection-debugging/SKILL.md | 153 ++++++++++++++++++
 9 files changed, 319 insertions(+), 11 deletions(-)
 create mode 100644 .agents/skills/agent-introspection-debugging/SKILL.md
 create mode 100644 skills/agent-introspection-debugging/SKILL.md

diff --git a/.agents/skills/agent-introspection-debugging/SKILL.md b/.agents/skills/agent-introspection-debugging/SKILL.md
new file mode 100644
index 00000000..ea5a2c58
--- /dev/null
+++ b/.agents/skills/agent-introspection-debugging/SKILL.md
@@ -0,0 +1,153 @@
+---
+name: agent-introspection-debugging
+description: Structured self-debugging workflow for AI agent failures using capture, diagnosis, contained recovery, and introspection reports.
+origin: ECC
+---
+
+# Agent Introspection Debugging
+
+Use this skill when an agent run is failing repeatedly, consuming tokens without progress, looping on the same tools, or drifting away from the intended task.
+
+This is a workflow skill, not a hidden runtime. It teaches the agent to debug itself systematically before escalating to a human.
+
+## When to Activate
+
+- Maximum tool call / loop-limit failures
+- Repeated retries with no forward progress
+- Context growth or prompt drift that starts degrading output quality
+- File-system or environment state mismatch between expectation and reality
+- Tool failures that are likely recoverable with diagnosis and a smaller corrective action
+
+## Scope Boundaries
+
+Activate this skill for:
+- capturing failure state before retrying blindly
+- diagnosing common agent-specific failure patterns
+- applying contained recovery actions
+- producing a structured human-readable debug report
+
+Do not use this skill as the primary source for:
+- feature verification after code changes; use `verification-loop`
+- framework-specific debugging when a narrower ECC skill already exists
+- runtime promises the current harness cannot enforce automatically
+
+## Four-Phase Loop
+
+### Phase 1: Failure Capture
+
+Before trying to recover, record the failure precisely.
+
+Capture:
+- error type, message, and stack trace when available
+- last meaningful tool call sequence
+- what the agent was trying to do
+- current context pressure: repeated prompts, oversized pasted logs, duplicated plans, or runaway notes
+- current environment assumptions: cwd, branch, relevant service state, expected files
+
+Minimum capture template:
+
+```markdown
+## Failure Capture
+- Session / task:
+- Goal in progress:
+- Error:
+- Last successful step:
+- Last failed tool / command:
+- Repeated pattern seen:
+- Environment assumptions to verify:
+```
+
+### Phase 2: Root-Cause Diagnosis
+
+Match the failure to a known pattern before changing anything.
+
+| Pattern | Likely Cause | Check |
+| --- | --- | --- |
+| Maximum tool calls / repeated same command | loop or no-exit observer path | inspect the last N tool calls for repetition |
+| Context overflow / degraded reasoning | unbounded notes, repeated plans, oversized logs | inspect recent context for duplication and low-signal bulk |
+| `ECONNREFUSED` / timeout | service unavailable or wrong port | verify service health, URL, and port assumptions |
+| `429` / quota exhaustion | retry storm or missing backoff | count repeated calls and inspect retry spacing |
+| file missing after write / stale diff | race, wrong cwd, or branch drift | re-check path, cwd, git status, and actual file existence |
+| tests still failing after “fix” | wrong hypothesis | isolate the exact failing test and re-derive the bug |
+
+Diagnosis questions:
+- is this a logic failure, state failure, environment failure, or policy failure?
+- did the agent lose the real objective and start optimizing the wrong subtask?
+- is the failure deterministic or transient?
+- what is the smallest reversible action that would validate the diagnosis?
+
+### Phase 3: Contained Recovery
+
+Recover with the smallest action that can confirm or refute the diagnosis.
+
+Safe recovery actions:
+- stop repeated retries and restate the hypothesis
+- trim low-signal context and keep only the active goal, blockers, and evidence
+- re-check the actual filesystem / branch / process state
+- narrow the task to one failing command, one file, or one test
+- switch from speculative reasoning to direct observation
+- escalate to a human when the failure is high-risk or externally blocked
+
+Do not claim unsupported auto-healing actions like “reset agent state” or “update harness config” unless you are actually doing them through real tools in the current environment.
+
+Contained recovery checklist:
+
+```markdown
+## Recovery Action
+- Diagnosis chosen:
+- Smallest action taken:
+- Why this is safe:
+- What evidence would prove the fix worked:
+```
+
+### Phase 4: Introspection Report
+
+End with a report that makes the recovery legible to the next agent or human.
+
+```markdown
+## Agent Self-Debug Report
+- Session / task:
+- Failure:
+- Root cause:
+- Recovery action:
+- Result: success | partial | blocked
+- Token / time burn risk:
+- Follow-up needed:
+- Preventive change to encode later:
+```
+
+## Recovery Heuristics
+
+Prefer these interventions in order:
+
+1. Restate the real objective in one sentence.
+2. Verify the world state instead of trusting memory.
+3. Shrink the failing scope.
+4. Run one discriminating check.
+5. Only then retry.
+
+Bad pattern:
+- retrying the same action three times with slightly different wording
+
+Good pattern:
+- capture failure
+- classify the pattern
+- run one direct check
+- change the plan only if the check supports it
+
+## Integration with ECC
+
+- Use `verification-loop` after recovery if code was changed.
+- Use `continuous-learning-v2` when the failure pattern is worth turning into an instinct or later skill.
+- Use `council` when the issue is not technical failure but decision ambiguity.
+- Use `workspace-surface-audit` if the failure came from conflicting local state or repo drift.
+
+## Output Standard
+
+When this skill is active, do not end with “I fixed it” alone.
+
+Always provide:
+- the failure pattern
+- the root-cause hypothesis
+- the recovery action
+- the evidence that the situation is now better or still blocked
diff --git a/AGENTS.md b/AGENTS.md
index 9aad5972..3412f269 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,6 +1,6 @@
 # Everything Claude Code (ECC) — Agent Instructions
 
-This is a **production-ready AI coding plugin** providing 47 specialized agents, 180 skills, 79 commands, and automated hook workflows for software development.
+This is a **production-ready AI coding plugin** providing 47 specialized agents, 181 skills, 79 commands, and automated hook workflows for software development.
 
 **Version:** 1.10.0
 
@@ -146,7 +146,7 @@ Troubleshoot failures: check test isolation → verify mocks → fix implementat
 
 ```
 agents/          — 47 specialized subagents
-skills/          — 180 workflow skills and domain knowledge
+skills/          — 181 workflow skills and domain knowledge
 commands/        — 79 slash commands
 hooks/           — Trigger-based automations
 rules/           — Always-follow guidelines (common + per-language)
diff --git a/README.md b/README.md
index eb856afa..8d63889d 100644
--- a/README.md
+++ b/README.md
@@ -236,7 +236,7 @@ For manual install instructions see the README in the `rules/` folder. When copy
 /plugin list ecc@ecc
 ```
 
-**That's it!** You now have access to 47 agents, 180 skills, and 79 legacy command shims.
+**That's it!** You now have access to 47 agents, 181 skills, and 79 legacy command shims.
 
 ### Multi-model commands require additional setup
 
@@ -1154,7 +1154,7 @@ The configuration is automatically detected from `.opencode/opencode.json`.
 |---------|-------------|----------|--------|
 | Agents | PASS: 47 agents | PASS: 12 agents | **Claude Code leads** |
 | Commands | PASS: 79 commands | PASS: 31 commands | **Claude Code leads** |
-| Skills | PASS: 180 skills | PASS: 37 skills | **Claude Code leads** |
+| Skills | PASS: 181 skills | PASS: 37 skills | **Claude Code leads** |
 | Hooks | PASS: 8 event types | PASS: 11 events | **OpenCode has more!** |
 | Rules | PASS: 29 rules | PASS: 13 instructions | **Claude Code leads** |
 | MCP Servers | PASS: 14 servers | PASS: Full | **Full parity** |
@@ -1263,7 +1263,7 @@ ECC is the **first plugin to maximize every major AI coding tool**. Here's how e
 |---------|------------|------------|-----------|----------|
 | **Agents** | 47 | Shared (AGENTS.md) | Shared (AGENTS.md) | 12 |
 | **Commands** | 79 | Shared | Instruction-based | 31 |
-| **Skills** | 180 | Shared | 10 (native format) | 37 |
+| **Skills** | 181 | Shared | 10 (native format) | 37 |
 | **Hook Events** | 8 types | 15 types | None yet | 11 types |
 | **Hook Scripts** | 20+ scripts | 16 scripts (DRY adapter) | N/A | Plugin hooks |
 | **Rules** | 34 (common + lang) | 34 (YAML frontmatter) | Instruction-based | 13 instructions |
diff --git a/README.zh-CN.md b/README.zh-CN.md
index 86849294..e56b323f 100644
--- a/README.zh-CN.md
+++ b/README.zh-CN.md
@@ -106,7 +106,7 @@ cp -r everything-claude-code/rules/perl ~/.claude/rules/
 /plugin list ecc@ecc
 ```
 
-**完成！** 你现在可以使用 47 个代理、180 个技能和 79 个命令。
+**完成！** 你现在可以使用 47 个代理、181 个技能和 79 个命令。
 
 ### multi-* 命令需要额外配置
 
diff --git a/WORKING-CONTEXT.md b/WORKING-CONTEXT.md
index 2f6aa614..82ae07ba 100644
--- a/WORKING-CONTEXT.md
+++ b/WORKING-CONTEXT.md
@@ -92,6 +92,7 @@ Keep this file detailed for only the current sprint, blockers, and next actions.
 
 - 2026-04-05: Continued `#1213` overlap cleanup by narrowing `coding-standards` into the baseline cross-project conventions layer instead of deleting it. The skill now explicitly points detailed React/UI guidance to `frontend-patterns`, backend/API structure to `backend-patterns` / `api-design`, and keeps only reusable naming, readability, immutability, and code-quality expectations.
 - 2026-04-05: Added a packaging regression guard for the OpenCode release path after `#1287` showed the published `v1.10.0` artifact was still stale. `tests/scripts/build-opencode.test.js` now asserts the `npm pack --dry-run` tarball includes `.opencode/dist/index.js` plus compiled plugin/tool entrypoints, so future releases cannot silently omit the built OpenCode payload.
+- 2026-04-05: Landed `skills/agent-introspection-debugging` for `#829` as an ECC-native self-debugging framework. It is intentionally guidance-first rather than fake runtime automation: capture failure state, classify the pattern, apply the smallest contained recovery action, then emit a structured introspection report and hand off to `verification-loop` / `continuous-learning-v2` when appropriate.
 - 2026-04-05: Fixed the `main` npm CI break after the latest direct ports. `package-lock.json` had drifted behind `package.json` on the `globals` devDependency (`^17.1.0` vs `^17.4.0`), which caused all npm-based GitHub Actions jobs to fail at `npm ci`. Refreshed the lockfile only, verified `npm ci --ignore-scripts`, and kept the mixed-lock workspace otherwise untouched.
 - 2026-04-05: Direct-ported the useful discoverability part of `#1221` without duplicating a second healthcare compliance system. Added `skills/hipaa-compliance/SKILL.md` as a thin HIPAA-specific entrypoint that points into the canonical `healthcare-phi-compliance` / `healthcare-reviewer` lane, and wired both healthcare privacy skills into the `security` install module for selective installs.
 - 2026-04-05: Direct-ported the audited blockchain/web3 security lane from `#1222` into `main` as four self-contained skills: `defi-amm-security`, `evm-token-decimals`, `llm-trading-agent-security`, and `nodejs-keccak256`. These are now part of the `security` install module instead of living as an unmerged fork PR.
diff --git a/docs/zh-CN/AGENTS.md b/docs/zh-CN/AGENTS.md
index c22bd5d4..0bad9c1c 100644
--- a/docs/zh-CN/AGENTS.md
+++ b/docs/zh-CN/AGENTS.md
@@ -1,6 +1,6 @@
 # Everything Claude Code (ECC) — 智能体指令
 
-这是一个**生产就绪的 AI 编码插件**，提供 47 个专业代理、180 项技能、79 条命令以及自动化钩子工作流，用于软件开发。
+这是一个**生产就绪的 AI 编码插件**，提供 47 个专业代理、181 项技能、79 条命令以及自动化钩子工作流，用于软件开发。
 
 **版本:** 1.10.0
 
@@ -147,7 +147,7 @@
 
 ```
 agents/          — 47 个专业子代理
-skills/          — 180 个工作流技能和领域知识
+skills/          — 181 个工作流技能和领域知识
 commands/        — 79 个斜杠命令
 hooks/           — 基于触发的自动化
 rules/           — 始终遵循的指导方针（通用 + 每种语言）
diff --git a/docs/zh-CN/README.md b/docs/zh-CN/README.md
index e81b8c3b..3ea2d83a 100644
--- a/docs/zh-CN/README.md
+++ b/docs/zh-CN/README.md
@@ -209,7 +209,7 @@ npx ecc-install typescript
 /plugin list ecc@ecc
 ```
 
-**搞定！** 你现在可以使用 47 个智能体、180 项技能和 79 个命令了。
+**搞定！** 你现在可以使用 47 个智能体、181 项技能和 79 个命令了。
 
 ***
 
@@ -1096,7 +1096,7 @@ opencode
 |---------|-------------|----------|--------|
 | 智能体 | PASS: 47 个 | PASS: 12 个 | **Claude Code 领先** |
 | 命令 | PASS: 79 个 | PASS: 31 个 | **Claude Code 领先** |
-| 技能 | PASS: 180 项 | PASS: 37 项 | **Claude Code 领先** |
+| 技能 | PASS: 181 项 | PASS: 37 项 | **Claude Code 领先** |
 | 钩子 | PASS: 8 种事件类型 | PASS: 11 种事件 | **OpenCode 更多！** |
 | 规则 | PASS: 29 条 | PASS: 13 条指令 | **Claude Code 领先** |
 | MCP 服务器 | PASS: 14 个 | PASS: 完整 | **完全对等** |
@@ -1208,7 +1208,7 @@ ECC 是**第一个最大化利用每个主要 AI 编码工具的插件**。以
 |---------|------------|------------|-----------|----------|
 | **智能体** | 47 | 共享 (AGENTS.md) | 共享 (AGENTS.md) | 12 |
 | **命令** | 79 | 共享 | 基于指令 | 31 |
-| **技能** | 180 | 共享 | 10 (原生格式) | 37 |
+| **技能** | 181 | 共享 | 10 (原生格式) | 37 |
 | **钩子事件** | 8 种类型 | 15 种类型 | 暂无 | 11 种类型 |
 | **钩子脚本** | 20+ 个脚本 | 16 个脚本 (DRY 适配器) | N/A | 插件钩子 |
 | **规则** | 34 (通用 + 语言) | 34 (YAML 前页) | 基于指令 | 13 条指令 |
diff --git a/manifests/install-modules.json b/manifests/install-modules.json
index dc00316c..29c4b841 100644
--- a/manifests/install-modules.json
+++ b/manifests/install-modules.json
@@ -200,6 +200,7 @@
       "description": "Evaluation, TDD, verification, learning, and compaction skills.",
       "paths": [
         "skills/agent-sort",
+        "skills/agent-introspection-debugging",
         "skills/ai-regression-testing",
         "skills/configure-ecc",
         "skills/code-tour",
diff --git a/skills/agent-introspection-debugging/SKILL.md b/skills/agent-introspection-debugging/SKILL.md
new file mode 100644
index 00000000..ea5a2c58
--- /dev/null
+++ b/skills/agent-introspection-debugging/SKILL.md
@@ -0,0 +1,153 @@
+---
+name: agent-introspection-debugging
+description: Structured self-debugging workflow for AI agent failures using capture, diagnosis, contained recovery, and introspection reports.
+origin: ECC
+---
+
+# Agent Introspection Debugging
+
+Use this skill when an agent run is failing repeatedly, consuming tokens without progress, looping on the same tools, or drifting away from the intended task.
+
+This is a workflow skill, not a hidden runtime. It teaches the agent to debug itself systematically before escalating to a human.
+
+## When to Activate
+
+- Maximum tool call / loop-limit failures
+- Repeated retries with no forward progress
+- Context growth or prompt drift that starts degrading output quality
+- File-system or environment state mismatch between expectation and reality
+- Tool failures that are likely recoverable with diagnosis and a smaller corrective action
+
+## Scope Boundaries
+
+Activate this skill for:
+- capturing failure state before retrying blindly
+- diagnosing common agent-specific failure patterns
+- applying contained recovery actions
+- producing a structured human-readable debug report
+
+Do not use this skill as the primary source for:
+- feature verification after code changes; use `verification-loop`
+- framework-specific debugging when a narrower ECC skill already exists
+- runtime promises the current harness cannot enforce automatically
+
+## Four-Phase Loop
+
+### Phase 1: Failure Capture
+
+Before trying to recover, record the failure precisely.
+
+Capture:
+- error type, message, and stack trace when available
+- last meaningful tool call sequence
+- what the agent was trying to do
+- current context pressure: repeated prompts, oversized pasted logs, duplicated plans, or runaway notes
+- current environment assumptions: cwd, branch, relevant service state, expected files
+
+Minimum capture template:
+
+```markdown
+## Failure Capture
+- Session / task:
+- Goal in progress:
+- Error:
+- Last successful step:
+- Last failed tool / command:
+- Repeated pattern seen:
+- Environment assumptions to verify:
+```
+
+### Phase 2: Root-Cause Diagnosis
+
+Match the failure to a known pattern before changing anything.
+
+| Pattern | Likely Cause | Check |
+| --- | --- | --- |
+| Maximum tool calls / repeated same command | loop or no-exit observer path | inspect the last N tool calls for repetition |
+| Context overflow / degraded reasoning | unbounded notes, repeated plans, oversized logs | inspect recent context for duplication and low-signal bulk |
+| `ECONNREFUSED` / timeout | service unavailable or wrong port | verify service health, URL, and port assumptions |
+| `429` / quota exhaustion | retry storm or missing backoff | count repeated calls and inspect retry spacing |
+| file missing after write / stale diff | race, wrong cwd, or branch drift | re-check path, cwd, git status, and actual file existence |
+| tests still failing after “fix” | wrong hypothesis | isolate the exact failing test and re-derive the bug |
+
+Diagnosis questions:
+- is this a logic failure, state failure, environment failure, or policy failure?
+- did the agent lose the real objective and start optimizing the wrong subtask?
+- is the failure deterministic or transient?
+- what is the smallest reversible action that would validate the diagnosis?
+
+### Phase 3: Contained Recovery
+
+Recover with the smallest action that can confirm or refute the diagnosis.
+
+Safe recovery actions:
+- stop repeated retries and restate the hypothesis
+- trim low-signal context and keep only the active goal, blockers, and evidence
+- re-check the actual filesystem / branch / process state
+- narrow the task to one failing command, one file, or one test
+- switch from speculative reasoning to direct observation
+- escalate to a human when the failure is high-risk or externally blocked
+
+Do not claim unsupported auto-healing actions like “reset agent state” or “update harness config” unless you are actually doing them through real tools in the current environment.
+
+Contained recovery checklist:
+
+```markdown
+## Recovery Action
+- Diagnosis chosen:
+- Smallest action taken:
+- Why this is safe:
+- What evidence would prove the fix worked:
+```
+
+### Phase 4: Introspection Report
+
+End with a report that makes the recovery legible to the next agent or human.
+
+```markdown
+## Agent Self-Debug Report
+- Session / task:
+- Failure:
+- Root cause:
+- Recovery action:
+- Result: success | partial | blocked
+- Token / time burn risk:
+- Follow-up needed:
+- Preventive change to encode later:
+```
+
+## Recovery Heuristics
+
+Prefer these interventions in order:
+
+1. Restate the real objective in one sentence.
+2. Verify the world state instead of trusting memory.
+3. Shrink the failing scope.
+4. Run one discriminating check.
+5. Only then retry.
+
+Bad pattern:
+- retrying the same action three times with slightly different wording
+
+Good pattern:
+- capture failure
+- classify the pattern
+- run one direct check
+- change the plan only if the check supports it
+
+## Integration with ECC
+
+- Use `verification-loop` after recovery if code was changed.
+- Use `continuous-learning-v2` when the failure pattern is worth turning into an instinct or later skill.
+- Use `council` when the issue is not technical failure but decision ambiguity.
+- Use `workspace-surface-audit` if the failure came from conflicting local state or repo drift.
+
+## Output Standard
+
+When this skill is active, do not end with “I fixed it” alone.
+
+Always provide:
+- the failure pattern
+- the root-cause hypothesis
+- the recovery action
+- the evidence that the situation is now better or still blocked