mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-04-08 02:03:34 +08:00
feat: add agent introspection debugging skill
This commit is contained in:
153
.agents/skills/agent-introspection-debugging/SKILL.md
Normal file
153
.agents/skills/agent-introspection-debugging/SKILL.md
Normal file
@@ -0,0 +1,153 @@
|
||||
---
|
||||
name: agent-introspection-debugging
|
||||
description: Structured self-debugging workflow for AI agent failures using capture, diagnosis, contained recovery, and introspection reports.
|
||||
origin: ECC
|
||||
---
|
||||
|
||||
# Agent Introspection Debugging
|
||||
|
||||
Use this skill when an agent run is failing repeatedly, consuming tokens without progress, looping on the same tools, or drifting away from the intended task.
|
||||
|
||||
This is a workflow skill, not a hidden runtime. It teaches the agent to debug itself systematically before escalating to a human.
|
||||
|
||||
## When to Activate
|
||||
|
||||
- Maximum tool call / loop-limit failures
|
||||
- Repeated retries with no forward progress
|
||||
- Context growth or prompt drift that starts degrading output quality
|
||||
- File-system or environment state mismatch between expectation and reality
|
||||
- Tool failures that are likely recoverable with diagnosis and a smaller corrective action
|
||||
|
||||
## Scope Boundaries
|
||||
|
||||
Activate this skill for:
|
||||
- capturing failure state before retrying blindly
|
||||
- diagnosing common agent-specific failure patterns
|
||||
- applying contained recovery actions
|
||||
- producing a structured human-readable debug report
|
||||
|
||||
Do not use this skill as the primary source for:
|
||||
- feature verification after code changes; use `verification-loop`
|
||||
- framework-specific debugging when a narrower ECC skill already exists
|
||||
- runtime promises the current harness cannot enforce automatically
|
||||
|
||||
## Four-Phase Loop
|
||||
|
||||
### Phase 1: Failure Capture
|
||||
|
||||
Before trying to recover, record the failure precisely.
|
||||
|
||||
Capture:
|
||||
- error type, message, and stack trace when available
|
||||
- last meaningful tool call sequence
|
||||
- what the agent was trying to do
|
||||
- current context pressure: repeated prompts, oversized pasted logs, duplicated plans, or runaway notes
|
||||
- current environment assumptions: cwd, branch, relevant service state, expected files
|
||||
|
||||
Minimum capture template:
|
||||
|
||||
```markdown
|
||||
## Failure Capture
|
||||
- Session / task:
|
||||
- Goal in progress:
|
||||
- Error:
|
||||
- Last successful step:
|
||||
- Last failed tool / command:
|
||||
- Repeated pattern seen:
|
||||
- Environment assumptions to verify:
|
||||
```
|
||||
|
||||
### Phase 2: Root-Cause Diagnosis
|
||||
|
||||
Match the failure to a known pattern before changing anything.
|
||||
|
||||
| Pattern | Likely Cause | Check |
|
||||
| --- | --- | --- |
|
||||
| Maximum tool calls / repeated same command | loop or no-exit observer path | inspect the last N tool calls for repetition |
|
||||
| Context overflow / degraded reasoning | unbounded notes, repeated plans, oversized logs | inspect recent context for duplication and low-signal bulk |
|
||||
| `ECONNREFUSED` / timeout | service unavailable or wrong port | verify service health, URL, and port assumptions |
|
||||
| `429` / quota exhaustion | retry storm or missing backoff | count repeated calls and inspect retry spacing |
|
||||
| file missing after write / stale diff | race, wrong cwd, or branch drift | re-check path, cwd, git status, and actual file existence |
|
||||
| tests still failing after “fix” | wrong hypothesis | isolate the exact failing test and re-derive the bug |
|
||||
|
||||
Diagnosis questions:
|
||||
- is this a logic failure, state failure, environment failure, or policy failure?
|
||||
- did the agent lose the real objective and start optimizing the wrong subtask?
|
||||
- is the failure deterministic or transient?
|
||||
- what is the smallest reversible action that would validate the diagnosis?
|
||||
|
||||
### Phase 3: Contained Recovery
|
||||
|
||||
Recover with the smallest action that changes the diagnosis surface.
|
||||
|
||||
Safe recovery actions:
|
||||
- stop repeated retries and restate the hypothesis
|
||||
- trim low-signal context and keep only the active goal, blockers, and evidence
|
||||
- re-check the actual filesystem / branch / process state
|
||||
- narrow the task to one failing command, one file, or one test
|
||||
- switch from speculative reasoning to direct observation
|
||||
- escalate to a human when the failure is high-risk or externally blocked
|
||||
|
||||
Do not claim unsupported auto-healing actions like “reset agent state” or “update harness config” unless you are actually doing them through real tools in the current environment.
|
||||
|
||||
Contained recovery checklist:
|
||||
|
||||
```markdown
|
||||
## Recovery Action
|
||||
- Diagnosis chosen:
|
||||
- Smallest action taken:
|
||||
- Why this is safe:
|
||||
- What evidence would prove the fix worked:
|
||||
```
|
||||
|
||||
### Phase 4: Introspection Report
|
||||
|
||||
End with a report that makes the recovery legible to the next agent or human.
|
||||
|
||||
```markdown
|
||||
## Agent Self-Debug Report
|
||||
- Session / task:
|
||||
- Failure:
|
||||
- Root cause:
|
||||
- Recovery action:
|
||||
- Result: success | partial | blocked
|
||||
- Token / time burn risk:
|
||||
- Follow-up needed:
|
||||
- Preventive change to encode later:
|
||||
```
|
||||
|
||||
## Recovery Heuristics
|
||||
|
||||
Prefer these interventions in order:
|
||||
|
||||
1. Restate the real objective in one sentence.
|
||||
2. Verify the world state instead of trusting memory.
|
||||
3. Shrink the failing scope.
|
||||
4. Run one discriminating check.
|
||||
5. Only then retry.
|
||||
|
||||
Bad pattern:
|
||||
- retrying the same action three times with slightly different wording
|
||||
|
||||
Good pattern:
|
||||
- capture failure
|
||||
- classify the pattern
|
||||
- run one direct check
|
||||
- change the plan only if the check supports it
|
||||
|
||||
## Integration with ECC
|
||||
|
||||
- Use `verification-loop` after recovery if code was changed.
|
||||
- Use `continuous-learning-v2` when the failure pattern is worth turning into an instinct or later skill.
|
||||
- Use `council` when the issue is not technical failure but decision ambiguity.
|
||||
- Use `workspace-surface-audit` if the failure came from conflicting local state or repo drift.
|
||||
|
||||
## Output Standard
|
||||
|
||||
When this skill is active, do not end with “I fixed it” alone.
|
||||
|
||||
Always provide:
|
||||
- the failure pattern
|
||||
- the root-cause hypothesis
|
||||
- the recovery action
|
||||
- the evidence that the situation is now better or still blocked
|
||||
Reference in New Issue
Block a user