everything-claude-code/skills/hermes-generated/ecc-tools-cost-audit/SKILL.md
2026-04-02 15:14:20 -07:00

---
name: ecc-tools-cost-audit
description: Evidence-first ECC Tools burn and billing audit workflow. Use when investigating runaway PR creation, quota bypass, premium-model leakage, or GitHub App cost spikes in the ECC Tools repo.
origin: ECC
---
# ECC Tools Cost Audit
Use this when the user suspects the ECC Tools GitHub App is burning cost, over-creating PRs, bypassing usage limits, or routing free users into premium analysis paths.
## Skill Stack
Pull these imported skills into the workflow when relevant:
- `continuous-agent-loop` for scope freezes, recovery gates, and cost-aware tracing when audits are long or failure signatures repeat
- `terminal-ops` for repo-local inspection, narrow edits, and proving commands
- `finance-billing-ops` when customer-impact math has to be separated from repo behavior
- `agentic-engineering` for tracing entrypoints, queue paths, and fix sequencing
- `plankton-code-quality` for safer code changes and rerun behavior
- `eval-harness` mindset for exact root-cause evidence and post-fix verification
- `search-first` before inventing a new helper or abstraction
- `security-review` when auth, secrets, usage gates, or entitlement paths are touched
## When To Use
- the user mentions ECC Tools burn rate, PR recursion, over-created PRs, usage-limit bypass, or expensive model routing
- the task is an audit or fix in `$PRIMARY_REPOS_ROOT/ECC/skill-creator-app`
- the answer depends on webhook handlers, queue workers, usage reservation, PR creation logic, or paid-gate enforcement
## Workflow
1. Freeze repo scope first:
- use `$PRIMARY_REPOS_ROOT/ECC/skill-creator-app`
- check branch and local diff before changing anything
2. Freeze audit mode before tracing:
- if the user asked for `report only`, `audit only`, `review only`, or explicitly said `do not modify code`, keep the pass read-only until the user changes scope
- gather evidence with reads, searches, git status/diff, and other non-writing proving commands first
- do not patch `src/index.ts`, run generators, install dependencies, or stage changes during an audit-only pass
3. Trace ingress before suggesting fixes:
- inspect webhook entrypoints in `src/index.ts`
- search every `ANALYSIS_QUEUE.send(...)` or equivalent enqueue
- map which triggers share a job type
4. Trace the queue consumer and its side effects:
- inspect `handleAnalysisQueue(...)` or the equivalent worker
- confirm whether queued analysis always ends in PR creation, file writes, or premium model calls
5. Audit PR multiplication:
- inspect PR helpers and branch naming
- check dedupe, branch skip logic, synchronize-event handling, and reuse of existing PRs
- treat app-generated branches such as `ecc-tools/*` or timestamped branches as red-flag evidence paths
6. Audit usage and billing truth:
- inspect rate-limit check and increment paths
- if quota is checked before enqueue but incremented only in the worker, mark it as a real race: concurrent requests can all pass the check before any increment lands
- separate overrun risk, customer billing impact, and entitlement truth
7. Audit model routing:
- inspect analyzer `fastMode` or equivalent flags, free-vs-paid tier branching, and actual provider/model calls
- confirm whether free queued work can still hit Anthropic or another premium path when keys exist
8. Audit rerun safety and file updates:
- inspect file update helpers for existing-file `sha` handling or equivalent optimistic concurrency
- if reruns can spend analysis cost and then fail on PR or file creation, mark it as burn-with-broken-output
9. Fix in priority order only if the user asked for code changes:
- stop automatic PR multiplication first
- stop quota bypass second
- stop premium leakage third
- then close rerun/update safety gaps and missing entitlement gates
10. Answer status interrupts before more tracing:
- if the user asks `did you do it?`, `are you working?`, or the session is near the tool budget, reply from the current verified repo state before more searching
- lead with the current state: whether root causes are `found`, and whether fixes are `changed locally`, `verified locally`, `pushed`, or still `blocked`
- if the asked burn path is still unresolved, say that before side findings or lower-priority issues
11. Verify with the smallest proving commands:
- rerun only the focused tests or typecheck that cover changed paths
- report `changed locally`, `verified locally`, `pushed`, `deployed`, or `blocked` exactly
## High-Signal Failure Patterns
### 1. One queue type for all triggers
If pushes, PR syncs, and manual audits all enqueue the same analyze job and the worker always calls the PR-creation path, every analysis run becomes a PR and the result is PR spam.
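One way to break that coupling is to make the job kind explicit at enqueue time, so the worker decides on PR creation from the job itself rather than from the fact that analysis ran. The following is a minimal sketch, not the repo's actual code: the `Trigger` and `AnalysisJob` types, both function names, and the choice that only manual audits may open a PR are all assumptions for illustration.

```typescript
// Hypothetical trigger names; the real webhook handlers may use different ones.
type Trigger = "push" | "pull_request.synchronize" | "manual_audit";

// Hypothetical job shape: the kind says whether a PR is allowed at all.
type AnalysisJob =
  | { kind: "analyze-only"; trigger: Trigger; repo: string }
  | { kind: "analyze-and-pr"; trigger: Trigger; repo: string };

// Assumption for this sketch: only an explicit manual audit may open a PR.
function jobForTrigger(trigger: Trigger, repo: string): AnalysisJob {
  if (trigger === "manual_audit") {
    return { kind: "analyze-and-pr", trigger, repo };
  }
  return { kind: "analyze-only", trigger, repo };
}

// The worker consults the job kind instead of always calling the PR path.
function shouldCreatePr(job: AnalysisJob): boolean {
  return job.kind === "analyze-and-pr";
}
```

During an audit, the useful question is whether anything equivalent to `shouldCreatePr` exists in the worker, or whether every dequeued job reaches PR creation unconditionally.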
### 2. Post-enqueue usage increment
If usage is reserved only inside the worker, concurrent requests can all pass the front-door check and exceed quotas.
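The safer shape is to reserve usage in the same step as the check, before the enqueue, and release it if the enqueue fails. Below is a minimal sketch under stated assumptions: `UsageLedger` and `enqueueIfAllowed` are hypothetical names, the `send` callback stands in for a call such as `ANALYSIS_QUEUE.send(...)`, and the in-memory map is only atomic within a single process. A real deployment would need a store with atomic or transactional increments.

```typescript
// Hypothetical in-memory ledger; illustrative only, not the repo's usage store.
class UsageLedger {
  private readonly limit: number;
  private readonly used = new Map<string, number>();

  constructor(limit: number) {
    this.limit = limit;
  }

  // Check and increment in one synchronous step, so a second caller
  // cannot pass the check against a stale count.
  tryReserve(account: string): boolean {
    const current = this.used.get(account) ?? 0;
    if (current >= this.limit) return false;
    this.used.set(account, current + 1);
    return true;
  }

  // Return the unit if the job never made it onto the queue.
  release(account: string): void {
    const current = this.used.get(account) ?? 0;
    if (current > 0) this.used.set(account, current - 1);
  }
}

// Reserve first, then enqueue; `send` is a stand-in for the real enqueue call.
async function enqueueIfAllowed(
  ledger: UsageLedger,
  account: string,
  send: () => Promise<void>,
): Promise<"queued" | "over_quota"> {
  if (!ledger.tryReserve(account)) return "over_quota";
  try {
    await send();
    return "queued";
  } catch (err) {
    ledger.release(account);
    throw err;
  }
}
```

With this ordering the worker no longer owns the increment, so it cannot be the only place where quota is enforced.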
### 3. Free tier on premium model path
If free queued jobs still set `fastMode: false` or equivalent while premium provider keys exist, free users can burn premium spend.
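A routing function that takes the tier as an explicit input makes this leak easy to spot and to test. The sketch below is an illustration only: `fastMode` comes from the text above, while `Tier`, `RoutingInput`, `AnalyzerOptions`, `routeAnalysis`, and the `"fast"`/`"premium"` provider labels are hypothetical names.

```typescript
type Tier = "free" | "paid";

// Hypothetical inputs: the caller's entitlement and whether a premium key is configured.
interface RoutingInput {
  tier: Tier;
  premiumKeyPresent: boolean;
}

// Hypothetical options handed to the analyzer.
interface AnalyzerOptions {
  fastMode: boolean;
  provider: "fast" | "premium";
}

function routeAnalysis(input: RoutingInput): AnalyzerOptions {
  // Entitlement decides first; key presence only says whether premium is possible.
  if (input.tier === "paid" && input.premiumKeyPresent) {
    return { fastMode: false, provider: "premium" };
  }
  // Free users, and paid users without a configured key, stay on the fast path.
  return { fastMode: true, provider: "fast" };
}
```

The failure pattern corresponds to code where the second condition alone (a key exists) selects the premium path.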
### 4. App-generated branches re-entering the webhook
If `pull_request.synchronize` or similar runs on `ecc-tools/*` branches, the app can analyze its own output and recurse.
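A guard at the webhook boundary can stop this loop before anything is enqueued. The sketch below assumes a trimmed subset of the GitHub `pull_request` webhook payload (`action`, `pull_request.head.ref`, `sender.type`). The function names and the prefix list are hypothetical, and skipping every `Bot` sender is an assumption that would also skip other bots.

```typescript
// Trimmed subset of a GitHub pull_request webhook payload.
interface PullRequestWebhook {
  action: string;
  pull_request: { head: { ref: string } };
  sender: { type: string };
}

// Prefix taken from the text above; extend if the app uses other branch schemes.
const APP_BRANCH_PREFIXES = ["ecc-tools/"];

function isAppGeneratedBranch(ref: string): boolean {
  return APP_BRANCH_PREFIXES.some((prefix) => ref.startsWith(prefix));
}

function shouldEnqueueAnalysis(event: PullRequestWebhook): boolean {
  // Never analyze branches the app itself produced.
  if (isAppGeneratedBranch(event.pull_request.head.ref)) return false;
  // Assumption: events sent by bot accounts are ignored entirely.
  if (event.sender.type === "Bot") return false;
  return event.action === "opened" || event.action === "synchronize";
}
```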
### 5. Update-without-sha reruns
If generated files are updated without passing the existing file `sha`, reruns can fail after the expensive work already happened.
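For context, the GitHub REST "create or update file contents" endpoint expects the current blob `sha` in the request body when the file already exists. A minimal sketch of building that body follows. `PutContentsBody` and `buildPutContentsBody` are hypothetical names, and the caller is assumed to have looked up the existing `sha` (or `null` for a new file) before any expensive analysis starts.

```typescript
// Body fields for PUT /repos/{owner}/{repo}/contents/{path}.
interface PutContentsBody {
  message: string;
  content: string; // base64-encoded file content
  branch: string;
  sha?: string; // required by GitHub when updating an existing file
}

// Hypothetical helper: include `sha` only when the file already exists.
function buildPutContentsBody(args: {
  message: string;
  contentBase64: string;
  branch: string;
  existingSha: string | null;
}): PutContentsBody {
  const body: PutContentsBody = {
    message: args.message,
    content: args.contentBase64,
    branch: args.branch,
  };
  if (args.existingSha !== null) {
    body.sha = args.existingSha;
  }
  return body;
}
```

Resolving `existingSha` before the analysis runs means a rerun that cannot write its output fails cheaply instead of after the spend.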
## Pitfalls
- do not start with broad repo wandering; settle the webhook -> queue -> worker path first
- do not mix customer billing inference with code-backed product truth
- do not mutate the repo during an audit-only or `do not modify code` pass
- do not claim burn is fixed until the narrow proving command has been rerun
- do not push or deploy unless the user asked
- do not ignore existing local changes in the repo; work around them, or stop if they conflict
- do not keep tracing lower-priority repo paths after a budget warning or status interrupt when the main root-cause state is already known
## Verification
- root causes cite exact file paths and code areas
- fixes are ordered by burn impact, not code neatness
- proving commands are named
- final status distinguishes local change, verification, push, and deployment