mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-04-01 22:53:27 +08:00
* feat(skills): add evalview-agent-testing skill and MCP server

Add EvalView as a regression testing skill for AI agents. EvalView
snapshots agent behavior (tool calls, parameters, output), then diffs
against baselines after every change, catching regressions before they
ship.

Skill covers:
- CLI workflow (init → snapshot → check → monitor)
- Python API (gate() / gate_async() for autonomous loops)
- Quick mode (no LLM judge, $0, sub-second)
- CI/CD integration (GitHub Actions with PR comments)
- MCP integration (8 tools for Claude Code)
- Multi-turn test cases
- OpenClaw integration for autonomous agents

Also adds the evalview MCP server to mcp-servers.json.

* fix(skills): pin action SHA and remove unvetted external links

- Pin hidai25/eval-view action to commit SHA instead of @main
- Replace external GitHub links with PyPI package link (vetted registry)

Addresses cubic-dev-ai review feedback.

* fix(skills): replace third-party action with pip install + CLI

Use plain pip install + evalview CLI instead of a third-party GitHub
Action. No external actions, no secrets passed to unvetted code.

Addresses cubic-dev-ai supply-chain review feedback.

* fix(skills): add destructive revert warning for gate_or_revert

Add a prominent warning that gate_or_revert runs git checkout,
discarding uncommitted changes. Document the revert_cmd override for
safer alternatives like git stash.

Addresses cubic-dev-ai review feedback.

* fix(skills): pin pip version range and document fail-on tradeoffs

- Pin evalview to >=0.5,<1 to prevent breaking CI on major upgrades
- Document the --fail-on REGRESSION vs --strict tradeoff so users
  understand what gates and what passes through

Addresses greptile-apps review feedback.

* fix: use python3 -m evalview for venv compatibility in MCP config

Follows the same pattern as the insaits entry. Resolves correctly even
when evalview is installed in a virtual environment that isn't on the
system PATH.

* fix: align MCP install command with mcp-servers.json pattern

Use python3 -m evalview mcp serve consistently across both the skill
docs and the MCP config catalog.

* fix: use evalview CLI entry point for MCP command

pip install evalview installs the evalview binary to PATH, so using it
directly is consistent with the install docs and avoids python3 version
mismatch issues.

* fix: pin install version to match CI section

* fix: pin all pip install references consistently

* fix: add API key placeholder and pin install version in MCP config

Add an OPENAI_API_KEY env placeholder matching other entries. Note that
the key is optional: deterministic checks work without it. Pin install
version to match skill docs.

* fix: guard score_delta format for non-scored statuses

---------

Co-authored-by: Affaan Mustafa <me@affaanmustafa.com>
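
The final fix in the list, guarding the score_delta format for non-scored statuses, could look roughly like the sketch below. It is an illustration only, not the code from this change: the helper name format_status_line, the PASSED status string, and the use of None to mean "no score" are all assumptions. Only the score_delta name and the REGRESSION status appear in the message above.

```python
from typing import Optional


def format_status_line(status: str, score_delta: Optional[float]) -> str:
    """Render one result line (hypothetical helper, not EvalView's API)."""
    # Assumption: a non-scored status carries score_delta=None.
    # Applying a float format spec to None raises TypeError, so the
    # delta is only formatted when a numeric value is present.
    if score_delta is None:
        return status
    return f"{status} ({score_delta:+.2f})"


print(format_status_line("REGRESSION", -0.25))
print(format_status_line("PASSED", None))
```

Without the None check, the second call would fail inside the f-string instead of printing the bare status.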