mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-04-10 03:13:29 +08:00
---
name: ecc-tools-cost-audit
description: Evidence-first ECC Tools burn and billing audit workflow. Use when investigating runaway PR creation, quota bypass, premium-model leakage, duplicate jobs, or GitHub App cost spikes in the ECC Tools repo.
origin: ECC
---

# ECC Tools Cost Audit

Use this skill when the user suspects the ECC Tools GitHub App is burning cost, over-creating PRs, bypassing usage limits, or routing free users into premium analysis paths.

This is a focused operator workflow for the sibling [ECC-Tools](../../ECC-Tools) repo. It is not a generic billing skill and it is not a repo-wide code review pass.

## Skill Stack

Pull these ECC-native skills into the workflow when relevant:

- `autonomous-loops` for bounded multi-step audits that cross webhooks, queues, billing, and retries
- `agentic-engineering` for breaking the request path into discrete, provable units
- `customer-billing-ops` when repo behavior and customer-impact math must be separated cleanly
- `search-first` before inventing helpers or re-implementing repo-local utilities
- `security-review` when auth, usage gates, entitlements, or secrets are touched
- `verification-loop` for proving rerun safety and exact post-fix state
- `tdd-workflow` when the fix needs regression coverage in the worker, router, or billing paths

## When To Use

- the user mentions ECC Tools burn rate, PR recursion, over-created PRs, usage-limit bypass, or premium-model leakage
- the task is in the sibling `ECC-Tools` repo and depends on webhook handlers, queue workers, usage reservation, PR creation logic, or paid-gate enforcement
- a customer report says the app created too many PRs, billed incorrectly, or analyzed code without producing a usable result

## Scope Guardrails

- work in the sibling `ECC-Tools` repo, not in `everything-claude-code`
- start read-only unless the user clearly asked for a fix
- do not mutate unrelated billing, checkout, or UI flows while tracing analysis burn
- treat app-generated branches and app-generated PRs as red-flag recursion paths until proved otherwise
- separate three things explicitly:
  - repo-side burn root cause
  - customer-facing billing impact
  - product or entitlement gaps that need backlog follow-up

## Workflow

### 1. Freeze repo scope

- switch into the sibling `ECC-Tools` repo
- check branch and local diff first
- identify the exact surface under audit:
  - webhook router
  - queue producer
  - queue consumer
  - PR creation path
  - usage reservation / billing path
  - model routing path

### 2. Trace ingress before theorizing

- inspect `src/index.*` or the main entrypoint first
- map every enqueue path before suggesting a fix
- confirm which GitHub events share a queue type
- confirm whether push, pull_request, synchronize, comment, or manual re-run events can converge on the same expensive path

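One way to record the ingress map while reading the entrypoint is a plain event-to-job-type table. This is a minimal sketch, not code from the ECC-Tools repo: the event labels, the job-type names, and the `converging_events` helper are all illustrative assumptions to be replaced with what the router actually does.

```python
from collections import defaultdict

# Hypothetical ingress map, filled in by hand while reading the entrypoint.
# Keys are trigger labels, values are the queue job type each one enqueues.
ENQUEUE_PATHS = {
    "push": "analysis",
    "pull_request.opened": "analysis",
    "pull_request.synchronize": "analysis",
    "issue_comment.created": "analysis",
    "manual_rerun": "manual-audit",
}


def converging_events(paths: dict[str, str]) -> dict[str, list[str]]:
    """Group triggers by job type and keep only job types fed by several triggers."""
    by_job: dict[str, list[str]] = defaultdict(list)
    for event, job_type in paths.items():
        by_job[job_type].append(event)
    return {job: sorted(events) for job, events in by_job.items() if len(events) > 1}


shared = converging_events(ENQUEUE_PATHS)
for job_type, events in shared.items():
    print(f"{job_type}: fed by {len(events)} triggers -> {', '.join(events)}")
```

Any job type that shows up in `shared` is a candidate for the expensive converged path and is worth tracing into the worker first.
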
### 3. Trace the worker and side effects

- inspect the queue consumer or scheduled worker that handles analysis
- confirm whether a queued analysis always ends in:
  - PR creation
  - branch creation
  - file updates
  - premium model calls
  - usage increments
- if analysis can spend tokens and then fail before output is persisted, classify it as burn-with-broken-output

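The burn-with-broken-output classification can be made mechanical once each job's spend and persistence outcome are known. The sketch below assumes a hypothetical `JobOutcome` record; the field names and the other two labels are illustrative, not taken from the repo.

```python
from dataclasses import dataclass


@dataclass
class JobOutcome:
    """Hypothetical per-job record reconstructed from logs or worker traces."""
    tokens_spent: int        # tokens consumed by model calls for this job
    output_persisted: bool   # True once the PR, branch, or file update landed


def classify(outcome: JobOutcome) -> str:
    """Label a job by whether it spent tokens and whether that spend shipped output."""
    if outcome.tokens_spent == 0:
        return "no-spend"
    if outcome.output_persisted:
        return "spend-with-output"
    return "burn-with-broken-output"


print(classify(JobOutcome(tokens_spent=1200, output_persisted=False)))
```
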
### 4. Audit the high-signal burn paths

#### PR multiplication

- inspect PR helpers and branch naming
- check dedupe, synchronize-event handling, and existing-PR reuse
- if app-generated branches can re-enter analysis, treat that as a priority-0 recursion risk

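As a reference point for what dedupe and existing-PR reuse can look like, here is a minimal sketch. The key shape, the `open_prs` lookup table, and both function names are assumptions for illustration; the real helpers in the repo may key on different fields.

```python
from typing import Optional


def dedupe_key(repo: str, base_branch: str, analysis_kind: str) -> str:
    """Build one key per (repo, base branch, kind) so repeated triggers map to one PR."""
    return f"{repo}:{base_branch}:{analysis_kind}"


def plan_pr_action(key: str, open_prs: dict[str, int]) -> tuple[str, Optional[int]]:
    """Return ("update", pr_number) when a PR already exists for the key, else ("create", None)."""
    existing = open_prs.get(key)
    if existing is not None:
        return ("update", existing)
    return ("create", None)


# Hypothetical state: one app PR is already open for this repo and base branch.
open_prs = {dedupe_key("acme/api", "main", "analysis"): 17}
print(plan_pr_action(dedupe_key("acme/api", "main", "analysis"), open_prs))
print(plan_pr_action(dedupe_key("acme/web", "main", "analysis"), open_prs))
```

If the audited code has no equivalent of the "update" branch, every trigger that reaches the worker can open a fresh PR.
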
#### Quota bypass

- inspect where quota is checked versus where usage is reserved or incremented
- if quota is checked before enqueue but usage is charged only inside the worker, treat concurrent front-door passes as a real race

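The race closes when the check and the increment happen as one atomic step at the front door. The sketch below shows that shape with an in-memory `UsageLedger` guarded by a lock. The class and method names are hypothetical, and a real deployment would need the equivalent atomic operation in whatever shared store holds usage, since a process-local lock does not protect across instances.

```python
import threading


class UsageLedger:
    """In-memory stand-in for the store that holds per-account usage (assumption)."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0
        self._lock = threading.Lock()

    def try_reserve(self) -> bool:
        """Check and increment inside one critical section; False means over quota."""
        with self._lock:
            if self.used >= self.limit:
                return False
            self.used += 1
            return True


ledger = UsageLedger(limit=3)
results: list[bool] = []
results_lock = threading.Lock()


def front_door() -> None:
    """Simulate one incoming request reserving usage before it is enqueued."""
    ok = ledger.try_reserve()
    with results_lock:
        results.append(ok)


threads = [threading.Thread(target=front_door) for _ in range(10)]
for t in threads:
    t.start()
for t in threads:
    t.join()

accepted = sum(results)
print(f"accepted {accepted} of {len(results)} concurrent requests")
```

With a check-only gate and a worker-side increment, all ten requests could pass; with reservation at the gate, only as many as the limit allows are accepted.
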
#### Premium-model leakage

- inspect model selection, tier branching, and provider routing
- verify whether free or capped users can still hit premium analyzers when premium keys are present

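A leak-free selector decides on entitlement first and treats key presence only as a reason to downgrade. This is a minimal sketch under assumed names: the tier names, the `"premium"` and `"basic"` labels, and `select_model` itself are illustrative, not the repo's routing code.

```python
# Hypothetical tier names; replace with the entitlements the repo actually defines.
PREMIUM_TIERS = {"pro", "enterprise"}


def select_model(tier: str, premium_key_present: bool) -> str:
    """Entitlement decides first; a configured key can never upgrade a free user."""
    if tier in PREMIUM_TIERS and premium_key_present:
        return "premium"
    return "basic"


print(select_model("free", premium_key_present=True))
print(select_model("pro", premium_key_present=True))
```

The pattern to look for during the audit is the inverse: a branch that picks the premium path whenever a key exists and only consults the tier afterward, or not at all.
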
#### Retry burn

- inspect retry loops, duplicate queue jobs, and deterministic failure reruns
- if the same non-transient error can spend analysis repeatedly, fix that before quality improvements

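The distinction that matters is whether the retry loop can tell a transient failure from a deterministic one. The sketch below assumes two hypothetical exception classes and a bounded loop; how the real worker classifies errors has to be read from the repo.

```python
from typing import Callable


class TransientError(Exception):
    """Hypothetical marker for failures worth retrying, such as timeouts."""


class DeterministicError(Exception):
    """Hypothetical marker for failures that will repeat on every attempt."""


def run_with_retries(job: Callable[[], None], max_attempts: int = 3) -> tuple[str, int]:
    """Run job, retrying only transient failures; return (status, attempts used)."""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        try:
            job()
            return ("ok", attempts)
        except DeterministicError:
            # Retrying would spend analysis again and fail the same way.
            return ("failed-no-retry", attempts)
        except TransientError:
            continue
    return ("failed-retries-exhausted", attempts)
```

A worker that retries every exception the same way pays for each attempt of a job that can never succeed.
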
### 5. Fix in burn order

If the user asked for code changes, prioritize fixes in this order:

1. stop automatic PR multiplication
2. stop quota bypass
3. stop premium leakage
4. stop duplicate-job fanout and pointless retries
5. close rerun/update safety gaps

Keep the pass bounded to one to three direct fixes unless the same root cause clearly spans multiple files.

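The ordering and the one-to-three bound can be applied to a list of findings mechanically. In this sketch the category slugs and `pick_fixes` are hypothetical labels for the five priorities above; unknown categories sort last instead of raising.

```python
# Hypothetical slugs for the five priorities, highest burn first.
BURN_ORDER = [
    "pr-multiplication",
    "quota-bypass",
    "premium-leakage",
    "duplicate-jobs-and-retries",
    "rerun-update-safety",
]


def pick_fixes(findings: list[str], max_fixes: int = 3) -> list[str]:
    """Deduplicate findings, rank them by burn order, and cap the pass size."""
    def rank(finding: str) -> tuple[int, str]:
        position = BURN_ORDER.index(finding) if finding in BURN_ORDER else len(BURN_ORDER)
        return (position, finding)

    return sorted(set(findings), key=rank)[:max_fixes]


print(pick_fixes(["premium-leakage", "rerun-update-safety", "pr-multiplication", "quota-bypass"]))
```
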
### 6. Verify with the smallest proving steps

- rerun only the targeted tests or integration slices that cover the changed path
- verify whether the burn path is now:
  - blocked
  - deduped
  - downgraded to cheaper analysis
  - or rejected early
- state the final status exactly:
  - changed locally
  - verified locally
  - pushed
  - deployed
  - still blocked

## High-Signal Failure Patterns

### 1. One queue type for all triggers

If pushes, PR syncs, and manual audits all enqueue the same job and the worker always creates a PR, analysis equals PR spam.

### 2. Post-enqueue usage reservation

If usage is checked at the front door but only incremented in the worker, concurrent requests can all pass the gate and exceed quota.

### 3. Free tier on premium path

If free queued jobs can still route into Anthropic or another premium provider when keys exist, that is real spend leakage even if the user never sees the premium result.

### 4. App-generated branches re-enter the webhook

If `pull_request.synchronize`, branch pushes, or comment-triggered runs fire on app-owned branches, the app can recursively analyze its own output.

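A guard against this re-entry usually sits at the top of the webhook handler, before anything is enqueued. The sketch below assumes a hypothetical branch prefix and bot login; the real values depend on how ECC Tools names its branches and which account authors its commits.

```python
# Both constants are assumptions for illustration, not values from the repo.
APP_BRANCH_PREFIX = "ecc-tools/"
APP_BOT_LOGIN = "ecc-tools[bot]"


def should_analyze(branch: str, sender_login: str) -> bool:
    """Reject events that originate from the app's own branches or its own bot account."""
    if branch.startswith(APP_BRANCH_PREFIX):
        return False
    if sender_login == APP_BOT_LOGIN:
        return False
    return True


print(should_analyze("feature/login", "octocat"))
print(should_analyze("ecc-tools/analysis-123", "octocat"))
```

Checking both the branch and the sender covers the case where a human pushes to an app-owned branch as well as the case where the bot pushes to a branch with an unexpected name.
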
### 5. Expensive work before persistence safety

If the system can spend tokens and then fail on PR creation, file update, or branch collision, it is burning cost without shipping value.

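One mitigation is to run the cheap persistence preconditions before the expensive call. The sketch below is an assumption about shape only: `run_analysis_job`, the check names, and the callables are hypothetical, and preflight checks reduce this failure mode without eliminating it, because persistence can still fail after they pass.

```python
from typing import Callable


def run_analysis_job(
    preflight_checks: list[tuple[str, Callable[[], bool]]],
    analyze: Callable[[], str],
    persist: Callable[[str], None],
) -> str:
    """Run cheap checks first so a known-bad job is rejected before tokens are spent."""
    for name, check in preflight_checks:
        if not check():
            return f"rejected-early:{name}"
    result = analyze()   # the expensive step: model calls happen here
    persist(result)      # PR creation, file update, or branch write
    return "persisted"


status = run_analysis_job(
    preflight_checks=[("branch-name-free", lambda: False)],
    analyze=lambda: "analysis result",
    persist=lambda result: None,
)
print(status)
```
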
## Pitfalls

- do not begin with broad repo wandering; settle webhook -> queue -> worker first
- do not mix customer billing inference with code-backed product truth
- do not fix lower-value quality issues before the highest-burn path is contained
- do not claim burn is fixed until the narrow proving step has been rerun
- do not push or deploy unless the user asked
- do not touch unrelated repo-local changes if they are already in progress

## Verification

- root causes cite exact file paths and code areas
- fixes are ordered by burn impact, not code neatness
- proving commands are named
- final status distinguishes local change, verification, push, and deployment