From a77c8c3f85ac06ccdf2813f50eb32230061bf687 Mon Sep 17 00:00:00 2001
From: Affaan Mustafa
Date: Sun, 5 Apr 2026 15:19:56 -0700
Subject: [PATCH] feat: add ecc tools cost audit workflow

---
 AGENTS.md                            |   4 +-
 README.md                            |   8 +-
 README.zh-CN.md                      |   2 +-
 WORKING-CONTEXT.md                   |   2 +
 docs/zh-CN/AGENTS.md                 |   4 +-
 docs/zh-CN/README.md                 |   6 +-
 manifests/install-modules.json       |   1 +
 skills/ecc-tools-cost-audit/SKILL.md | 160 +++++++++++++++++++++++++++
 8 files changed, 175 insertions(+), 12 deletions(-)
 create mode 100644 skills/ecc-tools-cost-audit/SKILL.md

diff --git a/AGENTS.md b/AGENTS.md
index f40f2f4e..cbf7762e 100644
--- a/AGENTS.md
+++ b/AGENTS.md
@@ -1,6 +1,6 @@
 # Everything Claude Code (ECC) — Agent Instructions
 
-This is a **production-ready AI coding plugin** providing 38 specialized agents, 158 skills, 72 commands, and automated hook workflows for software development.
+This is a **production-ready AI coding plugin** providing 38 specialized agents, 159 skills, 72 commands, and automated hook workflows for software development.
 
 **Version:** 1.10.0
 
@@ -146,7 +146,7 @@ Troubleshoot failures: check test isolation → verify mocks → fix implementat
 
 ```
 agents/ — 38 specialized subagents
-skills/ — 158 workflow skills and domain knowledge
+skills/ — 159 workflow skills and domain knowledge
 commands/ — 72 slash commands
 hooks/ — Trigger-based automations
 rules/ — Always-follow guidelines (common + per-language)
diff --git a/README.md b/README.md
index a2502541..f14bd895 100644
--- a/README.md
+++ b/README.md
@@ -85,7 +85,7 @@ This repo is the raw code only. The guides explain everything.
 ### v1.10.0 — Surface Refresh, Operator Workflows, and ECC 2.0 Alpha (Apr 2026)
 
 - **Public surface synced to the live repo** — metadata, catalog counts, plugin manifests, and install-facing docs now match the actual OSS surface: 38 agents, 156 skills, and 72 legacy command shims.
-- **Operator and outbound workflow expansion** — `brand-voice`, `social-graph-ranker`, `connections-optimizer`, `customer-billing-ops`, `google-workspace-ops`, `project-flow-ops`, and `workspace-surface-audit` round out the operator lane.
+- **Operator and outbound workflow expansion** — `brand-voice`, `social-graph-ranker`, `connections-optimizer`, `customer-billing-ops`, `ecc-tools-cost-audit`, `google-workspace-ops`, `project-flow-ops`, and `workspace-surface-audit` round out the operator lane.
 - **Media and launch tooling** — `manim-video`, `remotion-video-creation`, and upgraded social publishing surfaces make technical explainers and launch content part of the same system.
 - **Framework and product surface growth** — `nestjs-patterns`, richer Codex/OpenCode install surfaces, and expanded cross-harness packaging keep the repo usable beyond Claude Code alone.
 - **ECC 2.0 alpha is in-tree** — the Rust control-plane prototype in `ecc2/` now builds locally and exposes `dashboard`, `start`, `sessions`, `status`, `stop`, `resume`, and `daemon` commands. It is usable as an alpha, not yet a general release.
@@ -236,7 +236,7 @@ For manual install instructions see the README in the `rules/` folder. When copy
 /plugin list ecc@ecc
 ```
 
-**That's it!** You now have access to 38 agents, 158 skills, and 72 legacy command shims.
+**That's it!** You now have access to 38 agents, 159 skills, and 72 legacy command shims.
 
 ### Multi-model commands require additional setup
 
@@ -1154,7 +1154,7 @@ The configuration is automatically detected from `.opencode/opencode.json`.
 |---------|-------------|----------|--------|
 | Agents | PASS: 38 agents | PASS: 12 agents | **Claude Code leads** |
 | Commands | PASS: 72 commands | PASS: 31 commands | **Claude Code leads** |
-| Skills | PASS: 158 skills | PASS: 37 skills | **Claude Code leads** |
+| Skills | PASS: 159 skills | PASS: 37 skills | **Claude Code leads** |
 | Hooks | PASS: 8 event types | PASS: 11 events | **OpenCode has more!** |
 | Rules | PASS: 29 rules | PASS: 13 instructions | **Claude Code leads** |
 | MCP Servers | PASS: 14 servers | PASS: Full | **Full parity** |
@@ -1263,7 +1263,7 @@ ECC is the **first plugin to maximize every major AI coding tool**. Here's how e
 |---------|------------|------------|-----------|----------|
 | **Agents** | 38 | Shared (AGENTS.md) | Shared (AGENTS.md) | 12 |
 | **Commands** | 72 | Shared | Instruction-based | 31 |
-| **Skills** | 158 | Shared | 10 (native format) | 37 |
+| **Skills** | 159 | Shared | 10 (native format) | 37 |
 | **Hook Events** | 8 types | 15 types | None yet | 11 types |
 | **Hook Scripts** | 20+ scripts | 16 scripts (DRY adapter) | N/A | Plugin hooks |
 | **Rules** | 34 (common + lang) | 34 (YAML frontmatter) | Instruction-based | 13 instructions |
diff --git a/README.zh-CN.md b/README.zh-CN.md
index 549c8c25..df7ef956 100644
--- a/README.zh-CN.md
+++ b/README.zh-CN.md
@@ -106,7 +106,7 @@ cp -r everything-claude-code/rules/perl ~/.claude/rules/
 /plugin list ecc@ecc
 ```
 
-**完成!** 你现在可以使用 38 个代理、158 个技能和 72 个命令。
+**完成!** 你现在可以使用 38 个代理、159 个技能和 72 个命令。
 
 ### multi-* 命令需要额外配置
 
diff --git a/WORKING-CONTEXT.md b/WORKING-CONTEXT.md
index b38d1bfc..b8d1d60a 100644
--- a/WORKING-CONTEXT.md
+++ b/WORKING-CONTEXT.md
@@ -138,3 +138,5 @@ Keep this file detailed for only the current sprint, blockers, and next actions.
 - 2026-04-05: Shipped `846ffb7` (`chore: ship v1.10.0 release surface refresh`). This updated README/plugin metadata/package versions, synced the explicit plugin agent inventory, bumped stale star/fork/contributor counts, created `docs/releases/1.10.0/*`, tagged and released `v1.10.0`, and posted the announcement discussion at `#1272`.
 - 2026-04-05: Salvaged the reusable Hermes-branch operator skills in `6eba30f` without replaying the full branch. Added `skills/github-ops`, `skills/knowledge-ops`, and `skills/hookify-rules`, wired them into install modules, and re-synced the repo to `159` skills. `knowledge-ops` was explicitly adapted to the current workspace model: live code in cloned repos, active truth in GitHub/Linear, broader non-code context in the KB/archive layers.
 - 2026-04-05: Fixed the remaining OpenCode npm-publish gap in `db6d52e`. The root package now builds `.opencode/dist` during `prepack`, includes the compiled OpenCode plugin assets in the published tarball, and carries a dedicated regression test (`tests/scripts/build-opencode.test.js`) so the package no longer ships only raw TypeScript source for that surface.
+- 2026-04-05: Fixed the stale-row bug in `.github/workflows/monthly-metrics.yml` with `bf5961e`. The workflow now refreshes the current month row in issue `#1087` instead of early-returning when the month already exists, and the dispatched run updated the April snapshot to the current star/fork/release counts.
+- 2026-04-05: Recovered the useful cost-control workflow from the divergent Hermes branch as a small ECC-native operator skill instead of replaying the branch. `skills/ecc-tools-cost-audit/SKILL.md` is now wired into `operator-workflows` and focused on webhook -> queue -> worker tracing, burn containment, quota bypass, premium-model leakage, and retry fanout in the sibling `ECC-Tools` repo.
diff --git a/docs/zh-CN/AGENTS.md b/docs/zh-CN/AGENTS.md
index cf82e07c..fe986c2a 100644
--- a/docs/zh-CN/AGENTS.md
+++ b/docs/zh-CN/AGENTS.md
@@ -1,6 +1,6 @@
 # Everything Claude Code (ECC) — 智能体指令
 
-这是一个**生产就绪的 AI 编码插件**,提供 38 个专业代理、158 项技能、72 条命令以及自动化钩子工作流,用于软件开发。
+这是一个**生产就绪的 AI 编码插件**,提供 38 个专业代理、159 项技能、72 条命令以及自动化钩子工作流,用于软件开发。
 
 **版本:** 1.10.0
 
@@ -147,7 +147,7 @@
 
 ```
 agents/ — 38 个专业子代理
-skills/ — 158 个工作流技能和领域知识
+skills/ — 159 个工作流技能和领域知识
 commands/ — 72 个斜杠命令
 hooks/ — 基于触发的自动化
 rules/ — 始终遵循的指导方针(通用 + 每种语言)
diff --git a/docs/zh-CN/README.md b/docs/zh-CN/README.md
index da71ab6a..6e7841c0 100644
--- a/docs/zh-CN/README.md
+++ b/docs/zh-CN/README.md
@@ -209,7 +209,7 @@ npx ecc-install typescript
 /plugin list ecc@ecc
 ```
 
-**搞定!** 你现在可以使用 38 个智能体、158 项技能和 72 个命令了。
+**搞定!** 你现在可以使用 38 个智能体、159 项技能和 72 个命令了。
 
 ***
 
@@ -1096,7 +1096,7 @@ opencode
 |---------|-------------|----------|--------|
 | 智能体 | PASS: 38 个 | PASS: 12 个 | **Claude Code 领先** |
 | 命令 | PASS: 72 个 | PASS: 31 个 | **Claude Code 领先** |
-| 技能 | PASS: 158 项 | PASS: 37 项 | **Claude Code 领先** |
+| 技能 | PASS: 159 项 | PASS: 37 项 | **Claude Code 领先** |
 | 钩子 | PASS: 8 种事件类型 | PASS: 11 种事件 | **OpenCode 更多!** |
 | 规则 | PASS: 29 条 | PASS: 13 条指令 | **Claude Code 领先** |
 | MCP 服务器 | PASS: 14 个 | PASS: 完整 | **完全对等** |
@@ -1208,7 +1208,7 @@ ECC 是**第一个最大化利用每个主要 AI 编码工具的插件**。以
 |---------|------------|------------|-----------|----------|
 | **智能体** | 38 | 共享 (AGENTS.md) | 共享 (AGENTS.md) | 12 |
 | **命令** | 72 | 共享 | 基于指令 | 31 |
-| **技能** | 158 | 共享 | 10 (原生格式) | 37 |
+| **技能** | 159 | 共享 | 10 (原生格式) | 37 |
 | **钩子事件** | 8 种类型 | 15 种类型 | 暂无 | 11 种类型 |
 | **钩子脚本** | 20+ 个脚本 | 16 个脚本 (DRY 适配器) | N/A | 插件钩子 |
 | **规则** | 34 (通用 + 语言) | 34 (YAML 前页) | 基于指令 | 13 条指令 |
diff --git a/manifests/install-modules.json b/manifests/install-modules.json
index 6a68c30c..7b866ba5 100644
--- a/manifests/install-modules.json
+++ b/manifests/install-modules.json
@@ -315,6 +315,7 @@
       "paths": [
         "skills/connections-optimizer",
         "skills/customer-billing-ops",
+        "skills/ecc-tools-cost-audit",
         "skills/github-ops",
         "skills/google-workspace-ops",
         "skills/jira-integration",
diff --git a/skills/ecc-tools-cost-audit/SKILL.md b/skills/ecc-tools-cost-audit/SKILL.md
new file mode 100644
index 00000000..05169631
--- /dev/null
+++ b/skills/ecc-tools-cost-audit/SKILL.md
@@ -0,0 +1,160 @@
+---
+name: ecc-tools-cost-audit
+description: Evidence-first ECC Tools burn and billing audit workflow. Use when investigating runaway PR creation, quota bypass, premium-model leakage, duplicate jobs, or GitHub App cost spikes in the ECC Tools repo.
+origin: ECC
+---
+
+# ECC Tools Cost Audit
+
+Use this skill when the user suspects the ECC Tools GitHub App is burning cost, over-creating PRs, bypassing usage limits, or routing free users into premium analysis paths.
+
+This is a focused operator workflow for the sibling [ECC-Tools](../../ECC-Tools) repo. It is not a generic billing skill and it is not a repo-wide code review pass.
+
+## Skill Stack
+
+Pull these ECC-native skills into the workflow when relevant:
+
+- `autonomous-loops` for bounded multi-step audits that cross webhooks, queues, billing, and retries
+- `agentic-engineering` for tracing the request path into discrete, provable units
+- `customer-billing-ops` when repo behavior and customer-impact math must be separated cleanly
+- `search-first` before inventing helpers or re-implementing repo-local utilities
+- `security-review` when auth, usage gates, entitlements, or secrets are touched
+- `verification-loop` for proving rerun safety and exact post-fix state
+- `tdd-workflow` when the fix needs regression coverage in the worker, router, or billing paths
+
+## When To Use
+
+- user says ECC Tools burn rate, PR recursion, over-created PRs, usage-limit bypass, or premium-model leakage
+- the task is in the sibling `ECC-Tools` repo and depends on webhook handlers, queue workers, usage reservation, PR creation logic, or paid-gate enforcement
+- a customer report says the app created too many PRs, billed incorrectly, or analyzed code without producing a usable result
+
+## Scope Guardrails
+
+- work in the sibling `ECC-Tools` repo, not in `everything-claude-code`
+- start read-only unless the user clearly asked for a fix
+- do not mutate unrelated billing, checkout, or UI flows while tracing analysis burn
+- treat app-generated branches and app-generated PRs as red-flag recursion paths until proved otherwise
+- separate three things explicitly:
+  - repo-side burn root cause
+  - customer-facing billing impact
+  - product or entitlement gaps that need backlog follow-up
+
+## Workflow
+
+### 1. Freeze repo scope
+
+- switch into the sibling `ECC-Tools` repo
+- check branch and local diff first
+- identify the exact surface under audit:
+  - webhook router
+  - queue producer
+  - queue consumer
+  - PR creation path
+  - usage reservation / billing path
+  - model routing path
+
+### 2. Trace ingress before theorizing
+
+- inspect `src/index.*` or the main entrypoint first
+- map every enqueue path before suggesting a fix
+- confirm which GitHub events share a queue type
+- confirm whether push, pull_request, synchronize, comment, or manual re-run events can converge on the same expensive path
+
+### 3. Trace the worker and side effects
+
+- inspect the queue consumer or scheduled worker that handles analysis
+- confirm whether a queued analysis always ends in:
+  - PR creation
+  - branch creation
+  - file updates
+  - premium model calls
+  - usage increments
+- if analysis can spend tokens and then fail before output is persisted, classify it as burn-with-broken-output
+
+### 4. Audit the high-signal burn paths
+
+#### PR multiplication
+
+- inspect PR helpers and branch naming
+- check dedupe, synchronize-event handling, and existing-PR reuse
+- if app-generated branches can re-enter analysis, treat that as a priority-0 recursion risk
+
+#### Quota bypass
+
+- inspect where quota is checked versus where usage is reserved or incremented
+- if quota is checked before enqueue but usage is charged only inside the worker, treat concurrent front-door passes as a real race
+
+#### Premium-model leakage
+
+- inspect model selection, tier branching, and provider routing
+- verify whether free or capped users can still hit premium analyzers when premium keys are present
+
+#### Retry burn
+
+- inspect retry loops, duplicate queue jobs, and deterministic failure reruns
+- if the same non-transient error can spend analysis repeatedly, fix that before quality improvements
+
+### 5. Fix in burn order
+
+If the user asked for code changes, prioritize fixes in this order:
+
+1. stop automatic PR multiplication
+2. stop quota bypass
+3. stop premium leakage
+4. stop duplicate-job fanout and pointless retries
+5. close rerun/update safety gaps
+
+Keep the pass bounded to one to three direct fixes unless the same root cause clearly spans multiple files.
+
+### 6. Verify with the smallest proving steps
+
+- rerun only the targeted tests or integration slices that cover the changed path
+- verify whether the burn path is now:
+  - blocked
+  - deduped
+  - downgraded to cheaper analysis
+  - or rejected early
+- state the final status exactly:
+  - changed locally
+  - verified locally
+  - pushed
+  - deployed
+  - still blocked
+
+## High-Signal Failure Patterns
+
+### 1. One queue type for all triggers
+
+If pushes, PR syncs, and manual audits all enqueue the same job and the worker always creates a PR, analysis equals PR spam.
+
+### 2. Post-enqueue usage reservation
+
+If usage is checked at the front door but only incremented in the worker, concurrent requests can all pass the gate and exceed quota.
+
+### 3. Free tier on premium path
+
+If free queued jobs can still route into Anthropic or another premium provider when keys exist, that is real spend leakage even if the user never sees the premium result.
+
+### 4. App-generated branches re-enter the webhook
+
+If `pull_request.synchronize`, branch pushes, or comment-triggered runs fire on app-owned branches, the app can recursively analyze its own output.
+
+### 5. Expensive work before persistence safety
+
+If the system can spend tokens and then fail on PR creation, file update, or branch collision, it is burning cost without shipping value.
+
+## Pitfalls
+
+- do not begin with broad repo wandering; settle webhook -> queue -> worker first
+- do not mix customer billing inference with code-backed product truth
+- do not fix lower-value quality issues before the highest-burn path is contained
+- do not claim burn is fixed until the narrow proving step was rerun
+- do not push or deploy unless the user asked
+- do not touch unrelated repo-local changes if they are already in progress
+
+## Verification
+
+- root causes cite exact file paths and code areas
+- fixes are ordered by burn impact, not code neatness
+- proving commands are named
+- final status distinguishes local change, verification, push, and deployment
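
Reviewer note (illustration only, not part of the patch): the "Post-enqueue usage reservation" pattern in the skill describes a check-then-act race, where quota is checked at the front door but usage is only charged later in the worker. A minimal sketch of one possible containment follows, assuming quota check and reservation can be merged into a single atomic step before enqueue. Every name here (`UsageLedger`, `try_reserve`, `handle_webhook`, `simulate`) is hypothetical and does not come from the `ECC-Tools` repo, and the in-memory lock stands in for whatever transaction or atomic counter the real billing store provides.

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from queue import Queue


class UsageLedger:
    """Hypothetical in-memory quota ledger (assumption, not ECC-Tools code)."""

    def __init__(self, quota: int) -> None:
        self.quota = quota
        self.used = 0
        self._lock = threading.Lock()

    def try_reserve(self, units: int = 1) -> bool:
        """Check the quota and reserve usage in one atomic step."""
        with self._lock:
            if self.used + units > self.quota:
                return False
            self.used += units
            return True


def handle_webhook(ledger: UsageLedger, jobs: Queue, job_id: str) -> bool:
    """Reserve usage before enqueueing so concurrent requests cannot all pass the gate."""
    if not ledger.try_reserve():
        return False  # rejected early, before any analysis spend
    jobs.put(job_id)
    return True


def simulate(quota: int, requests: int) -> tuple[int, int, int]:
    """Fire concurrent webhook calls; return (accepted, enqueued, used)."""
    ledger = UsageLedger(quota)
    jobs: Queue = Queue()
    with ThreadPoolExecutor(max_workers=8) as pool:
        results = list(
            pool.map(lambda i: handle_webhook(ledger, jobs, f"job-{i}"), range(requests))
        )
    return sum(results), jobs.qsize(), ledger.used
```

In this sketch, `simulate(5, 50)` accepts exactly five jobs however the threads interleave, because the check and the increment share one lock. A real fix would probably also need to release the reservation when a job fails before producing output, which ties into the "Expensive work before persistence safety" pattern.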