fix: use general-purpose agent instead of Explore for skill-stocktake evaluation

The Explore agent is a "Fast agent" optimized for codebase exploration,
not deep reasoning. The skill-stocktake V4 design requires holistic AI
judgment (actionability, scope fit, uniqueness, currency) which needs
the full reasoning capability of the conversation's main model.

Additionally, the Agent tool has no `model` parameter — specifying
`model: opus` was silently ignored, causing the evaluation to run on
the lightweight Explore model. This resulted in all skills receiving
"Keep" verdicts without genuine critical analysis.

Changing to `general-purpose` agent ensures evaluation runs on the
conversation's main model (e.g., Opus 4.6), enabling the holistic
judgment that V4 was designed for.
Tatsuya Shimomoto
2026-03-08 21:02:14 +09:00
committed by Affaan Mustafa
parent 973be02aa6
commit 02d754ba67

@@ -74,7 +74,7 @@ Scanning:
### Phase 2 — Quality Evaluation
-Launch a Task tool subagent (**Explore agent, model: opus**) with the full inventory and checklist.
+Launch an Agent tool subagent (**general-purpose agent**) with the full inventory and checklist.
The subagent reads each skill, applies the checklist, and returns per-skill JSON:
`{ "verdict": "Keep"|"Improve"|"Update"|"Retire"|"Merge into [X]", "reason": "..." }`
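The per-skill JSON shape above can be checked mechanically before the results are trusted. The following is a minimal sketch, not part of the skill itself: `parse_verdict` and `all_keep` are hypothetical helper names, and the rule that `reason` must be non-empty is an assumption rather than something the checklist states. The `all_keep` check simply flags the symptom described in the commit message, where every skill came back as "Keep".

```python
import json
import re

# Verdict values copied from the schema line in the diff.
FIXED_VERDICTS = {"Keep", "Improve", "Update", "Retire"}
# "Merge into [X]" carries a target skill name after the prefix.
MERGE_PATTERN = re.compile(r"^Merge into (?P<target>\S.*)$")


def parse_verdict(raw: str) -> dict:
    """Parse one per-skill JSON object and check its shape (hypothetical helper)."""
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    verdict = data.get("verdict")
    reason = data.get("reason")
    if not isinstance(verdict, str) or not isinstance(reason, str):
        raise ValueError("'verdict' and 'reason' must both be strings")
    # Assumption: an empty reason is treated as invalid.
    if not reason.strip():
        raise ValueError("'reason' must not be empty")
    merge = MERGE_PATTERN.match(verdict)
    if verdict not in FIXED_VERDICTS and merge is None:
        raise ValueError(f"unknown verdict: {verdict!r}")
    return {
        "verdict": verdict,
        "reason": reason,
        "merge_target": merge.group("target") if merge else None,
    }


def all_keep(results: list) -> bool:
    """Return True when a non-empty batch contains only 'Keep' verdicts."""
    return bool(results) and all(r["verdict"] == "Keep" for r in results)
```

A uniform batch of "Keep" verdicts is not necessarily wrong, so `all_keep` is best read as a prompt for a manual spot check rather than a hard failure.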