mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-03-30 13:43:26 +08:00
feat: add 6 gap-closing skills — browser QA, design system, product lens, canary watch, benchmark, safety guard
Closes competitive gaps with gstack:

- browser-qa: automated visual testing via browser MCP
- design-system: generate, audit, and detect AI slop in UI
- product-lens: product diagnostic, founder review, feature prioritization
- canary-watch: post-deploy monitoring with alert thresholds
- benchmark: performance baseline and regression detection
- safety-guard: prevent destructive operations in autonomous sessions
87
skills/benchmark/SKILL.md
Normal file
@@ -0,0 +1,87 @@

# Benchmark — Performance Baseline & Regression Detection

## When to Use

- Before and after a PR to measure performance impact
- Setting up performance baselines for a project
- When users report "it feels slow"
- Before a launch — ensure you meet performance targets
- Comparing your stack against alternatives

## How It Works

### Mode 1: Page Performance

Measures real browser metrics via browser MCP:

```
1. Navigate to each target URL
2. Measure Core Web Vitals:
   - LCP (Largest Contentful Paint) — target < 2.5s
   - CLS (Cumulative Layout Shift) — target < 0.1
   - INP (Interaction to Next Paint) — target < 200ms
   - FCP (First Contentful Paint) — target < 1.8s
   - TTFB (Time to First Byte) — target < 800ms
3. Measure resource sizes:
   - Total page weight (target < 1MB)
   - JS bundle size (target < 200KB gzipped)
   - CSS size
   - Image weight
   - Third-party script weight
4. Count network requests
5. Check for render-blocking resources
```

### Mode 2: API Performance

Benchmarks API endpoints:

```
1. Hit each endpoint 100 times
2. Measure: p50, p95, p99 latency
3. Track: response size, status codes
4. Test under load: 10 concurrent requests
5. Compare against SLA targets
```
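
The latency measurement in steps 1 and 2 can be sketched as follows. This is a minimal sketch: `measure_latency` times any callable (for example, a closure doing an HTTP GET against the endpoint), and the nearest-rank percentile method is an implementation assumption, not something the skill prescribes.

```python
import math
import time

def percentiles(samples):
    """Nearest-rank p50/p95/p99 over a list of millisecond samples."""
    xs = sorted(samples)

    def pct(p):
        return xs[max(0, math.ceil(p / 100 * len(xs)) - 1)]

    return {"p50": pct(50), "p95": pct(95), "p99": pct(99)}

def measure_latency(call, n=100):
    """Time n invocations of `call` and summarize latency in milliseconds."""
    samples = []
    for _ in range(n):
        start = time.perf_counter()
        call()  # e.g. lambda: requests.get(endpoint)
        samples.append((time.perf_counter() - start) * 1000)
    return percentiles(samples)
```

For step 4, the same `call` can be submitted to a `ThreadPoolExecutor` with 10 workers to approximate concurrent load.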

### Mode 3: Build Performance

Measures the development feedback loop:

```
1. Cold build time
2. Hot reload time (HMR)
3. Test suite duration
4. TypeScript check time
5. Lint time
6. Docker build time
```

### Mode 4: Before/After Comparison

Run before and after a change to measure impact:

```
/benchmark baseline   # saves current metrics
# ... make changes ...
/benchmark compare    # compares against baseline
```

Output:

```
| Metric | Before | After | Delta  | Verdict  |
|--------|--------|-------|--------|----------|
| LCP    | 1.2s   | 1.4s  | +200ms | ⚠ WARN   |
| Bundle | 180KB  | 175KB | -5KB   | ✓ BETTER |
| Build  | 12s    | 14s   | +2s    | ⚠ WARN   |
```
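
A minimal sketch of the baseline/compare mechanics, assuming numeric metrics stored as JSON under `.ecc/benchmarks/`; the 10% warn threshold is an illustrative assumption, not the skill's actual policy:

```python
import json
from pathlib import Path

BASE_DIR = Path(".ecc/benchmarks")  # git-tracked baseline directory

def save_baseline(name: str, metrics: dict) -> None:
    """Persist the current metrics as the shared baseline."""
    BASE_DIR.mkdir(parents=True, exist_ok=True)
    (BASE_DIR / f"{name}.json").write_text(json.dumps(metrics, indent=2))

def compare(name: str, current: dict, warn_pct: float = 10.0) -> dict:
    """Diff current metrics against the saved baseline.

    Returns {metric: (before, after, verdict)}; a regression worse than
    warn_pct percent is flagged WARN, improvements are flagged BETTER.
    """
    baseline = json.loads((BASE_DIR / f"{name}.json").read_text())
    report = {}
    for key, before in baseline.items():
        after = current.get(key)
        if after is None:
            continue
        if after < before:
            verdict = "BETTER"
        elif after == before:
            verdict = "SAME"
        elif (after - before) / before * 100 > warn_pct:
            verdict = "WARN"
        else:
            verdict = "OK"
        report[key] = (before, after, verdict)
    return report
```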

## Output

Stores baselines in `.ecc/benchmarks/` as JSON. Git-tracked so the team shares baselines.

## Integration

- CI: run `/benchmark compare` on every PR
- Pair with `/canary-watch` for post-deploy monitoring
- Pair with `/browser-qa` for a full pre-ship checklist
81
skills/browser-qa/SKILL.md
Normal file
@@ -0,0 +1,81 @@

# Browser QA — Automated Visual Testing & Interaction

## When to Use

- After deploying a feature to staging/preview
- When you need to verify UI behavior across pages
- Before shipping — confirm layouts, forms, and interactions actually work
- When reviewing PRs that touch frontend code
- Accessibility audits and responsive testing

## How It Works

Uses the browser automation MCP (claude-in-chrome, Playwright, or Puppeteer) to interact with live pages like a real user.

### Phase 1: Smoke Test

```
1. Navigate to the target URL
2. Check for console errors (filter noise: analytics, third-party)
3. Verify no 4xx/5xx in network requests
4. Screenshot above-the-fold on desktop + mobile viewports
5. Check Core Web Vitals: LCP < 2.5s, CLS < 0.1, INP < 200ms
```
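
Step 2's noise filtering can be sketched as a plain classifier over captured console messages. The `(level, source, text)` tuple shape and the noise host list are assumptions for illustration; the real capture comes from whichever browser MCP is driving the page.

```python
# Hosts whose console errors count as third-party noise (assumed list).
NOISE_SOURCES = ("googletagmanager.com", "google-analytics.com", "doubleclick.net")

def triage_console(messages):
    """Split captured console messages into ship-blocking errors vs noise.

    messages: iterable of (level, source_url, text) tuples.
    Only non-noise error-level messages count against the smoke test.
    """
    critical, noise = [], []
    for level, source, text in messages:
        if level != "error":
            continue
        bucket = noise if any(host in source for host in NOISE_SOURCES) else critical
        bucket.append(text)
    return {"critical": critical, "noise": noise}
```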

### Phase 2: Interaction Test

```
1. Click every nav link — verify no dead links
2. Submit forms with valid data — verify success state
3. Submit forms with invalid data — verify error state
4. Test auth flow: login → protected page → logout
5. Test critical user journeys (checkout, onboarding, search)
```

### Phase 3: Visual Regression

```
1. Screenshot key pages at 3 breakpoints (375px, 768px, 1440px)
2. Compare against baseline screenshots (if stored)
3. Flag layout shifts > 5px, missing elements, overflow
4. Check dark mode if applicable
```
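
Step 3's comparison can be sketched over element positions extracted from the two runs. The `{name: (x, y)}` shape is a simplifying assumption (extraction itself is left to the browser tooling); the 5px tolerance follows the list above.

```python
def flag_layout_shifts(baseline, current, tol=5):
    """Flag elements that moved more than tol pixels or disappeared.

    baseline/current: {element_name: (x, y)} top-left positions in pixels.
    """
    issues = []
    for name, (x0, y0) in baseline.items():
        if name not in current:
            issues.append(f"{name}: missing")
            continue
        x1, y1 = current[name]
        if abs(x1 - x0) > tol or abs(y1 - y0) > tol:
            issues.append(f"{name}: shifted ({x1 - x0:+d}px, {y1 - y0:+d}px)")
    return issues
```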

### Phase 4: Accessibility

```
1. Run axe-core or equivalent on each page
2. Flag WCAG AA violations (contrast, labels, focus order)
3. Verify keyboard navigation works end-to-end
4. Check screen reader landmarks
```

## Output Format

```markdown
## QA Report — [URL] — [timestamp]

### Smoke Test
- Console errors: 0 critical, 2 warnings (analytics noise)
- Network: all 200/304, no failures
- Core Web Vitals: LCP 1.2s ✓, CLS 0.02 ✓, INP 89ms ✓

### Interactions
- [✓] Nav links: 12/12 working
- [✗] Contact form: missing error state for invalid email
- [✓] Auth flow: login/logout working

### Visual
- [✗] Hero section overflows on 375px viewport
- [✓] Dark mode: all pages consistent

### Accessibility
- 2 AA violations: missing alt text on hero image, low contrast on footer links

### Verdict: SHIP WITH FIXES (2 issues, 0 blockers)
```

## Integration

Works with any browser MCP:
- `mcp__claude-in-chrome__*` tools (preferred — uses your actual Chrome)
- Playwright via `mcp__browserbase__*`
- Direct Puppeteer scripts

Pair with `/canary-watch` for post-deploy monitoring.
93
skills/canary-watch/SKILL.md
Normal file
@@ -0,0 +1,93 @@

# Canary Watch — Post-Deploy Monitoring

## When to Use

- After deploying to production or staging
- After merging a risky PR
- When you want to verify a fix actually fixed it
- Continuous monitoring during a launch window
- After dependency upgrades

## How It Works

Monitors a deployed URL for regressions. Runs in a loop until stopped or until the watch window expires.

### What It Watches

```
1. HTTP Status — is the page returning 200?
2. Console Errors — new errors that weren't there before?
3. Network Failures — failed API calls, 5xx responses?
4. Performance — LCP/CLS/INP regression vs baseline?
5. Content — did key elements disappear? (h1, nav, footer, CTA)
6. API Health — are critical endpoints responding within SLA?
```

### Watch Modes

**Quick check** (default): single pass, report results
```
/canary-watch https://myapp.com
```

**Sustained watch**: check every N minutes for M hours
```
/canary-watch https://myapp.com --interval 5m --duration 2h
```

**Diff mode**: compare staging vs production
```
/canary-watch --compare https://staging.myapp.com https://myapp.com
```
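
The sustained-watch loop reduces to a timed polling loop. A minimal sketch, where `check` stands in for one full canary pass and returns a list of alert strings:

```python
import time

def canary_loop(check, interval_s=300, duration_s=7200):
    """Run one canary pass every interval_s seconds until duration_s expires.

    check: callable performing a single pass and returning a list of alerts.
    Returns every alert collected over the watch window.
    """
    deadline = time.monotonic() + duration_s
    alerts = []
    while True:
        alerts.extend(check())
        if time.monotonic() + interval_s > deadline:
            return alerts  # next pass would land past the window
        time.sleep(interval_s)
```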

### Alert Thresholds

```yaml
critical:   # immediate alert
  - HTTP status != 200
  - Console error count > 5 (new errors only)
  - LCP > 4s
  - API endpoint returns 5xx

warning:    # flag in report
  - LCP increased > 500ms from baseline
  - CLS > 0.1
  - New console warnings
  - Response time > 2x baseline

info:       # log only
  - Minor performance variance
  - New network requests (third-party scripts added?)
```
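
A sketch of how one measurement maps onto these tiers. The flat dict shape of `check` is an assumption for illustration; the keys mirror the thresholds above.

```python
def classify(check: dict) -> str:
    """Map one canary measurement onto the alert tiers above.

    Assumed keys: status, new_console_errors, lcp_ms, baseline_lcp_ms,
    api_5xx, cls, response_ratio (current / baseline response time).
    """
    if (check.get("status", 200) != 200
            or check.get("new_console_errors", 0) > 5
            or check.get("lcp_ms", 0) > 4000
            or check.get("api_5xx", False)):
        return "critical"
    lcp = check.get("lcp_ms", 0)
    if (lcp - check.get("baseline_lcp_ms", lcp) > 500
            or check.get("cls", 0) > 0.1
            or check.get("response_ratio", 1.0) > 2.0):
        return "warning"
    return "info"
```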

### Notifications

When a critical threshold is crossed:
- Desktop notification (macOS/Linux)
- Optional: Slack/Discord webhook
- Log to `~/.claude/canary-watch.log`

## Output

```markdown
## Canary Report — myapp.com — 2026-03-23 03:15 PST

### Status: HEALTHY ✓

| Check          | Result  | Baseline | Delta  |
|----------------|---------|----------|--------|
| HTTP           | 200 ✓   | 200      | —      |
| Console errors | 0 ✓     | 0        | —      |
| LCP            | 1.8s ✓  | 1.6s     | +200ms |
| CLS            | 0.01 ✓  | 0.01     | —      |
| API /health    | 145ms ✓ | 120ms    | +25ms  |

### No regressions detected. Deploy is clean.
```

## Integration

Pair with:
- `/browser-qa` for pre-deploy verification
- Hooks: add as a PostToolUse hook on `git push` to auto-check after deploys
- CI: run in GitHub Actions after the deploy step
76
skills/design-system/SKILL.md
Normal file
@@ -0,0 +1,76 @@

# Design System — Generate & Audit Visual Systems

## When to Use

- Starting a new project that needs a design system
- Auditing an existing codebase for visual consistency
- Before a redesign — understand what you have
- When the UI looks "off" but you can't pinpoint why
- Reviewing PRs that touch styling

## How It Works

### Mode 1: Generate Design System

Analyzes your codebase and generates a cohesive design system:

```
1. Scan CSS/Tailwind/styled-components for existing patterns
2. Extract: colors, typography, spacing, border-radius, shadows, breakpoints
3. Research 3 competitor sites for inspiration (via browser MCP)
4. Propose a design token set (JSON + CSS custom properties)
5. Generate DESIGN.md with rationale for each decision
6. Create an interactive HTML preview page (self-contained, no deps)
```

Output: `DESIGN.md` + `design-tokens.json` + `design-preview.html`
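
Step 4's "JSON + CSS custom properties" pairing can be sketched as a small transform from the token tree to a `:root` block. The token values here are placeholders, not a recommended palette:

```python
# Hypothetical token set; real values come from the generated design-tokens.json.
tokens = {
    "color": {"primary": "#1a4731", "surface": "#faf7f2"},
    "space": {"sm": "8px", "md": "16px", "lg": "32px"},
    "radius": {"card": "6px"},
}

def to_css_custom_properties(tokens: dict) -> str:
    """Flatten a two-level token tree into a :root block of custom properties."""
    lines = [":root {"]
    for group, values in tokens.items():
        for name, value in values.items():
            lines.append(f"  --{group}-{name}: {value};")
    lines.append("}")
    return "\n".join(lines)
```

Components then reference `var(--color-primary)` and friends, so changing a token updates the whole UI.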

### Mode 2: Visual Audit

Scores your UI across 10 dimensions (0-10 each):

```
1. Color consistency — are you using your palette or random hex values?
2. Typography hierarchy — clear h1 > h2 > h3 > body > caption?
3. Spacing rhythm — consistent scale (4px/8px/16px) or arbitrary?
4. Component consistency — do similar elements look similar?
5. Responsive behavior — fluid or broken at breakpoints?
6. Dark mode — complete or half-done?
7. Animation — purposeful or gratuitous?
8. Accessibility — contrast ratios, focus states, touch targets
9. Information density — cluttered or clean?
10. Polish — hover states, transitions, loading states, empty states
```

Each dimension gets a score, specific examples, and a fix with an exact file:line.

### Mode 3: AI Slop Detection

Identifies generic AI-generated design patterns:

```
- Gratuitous gradients on everything
- Purple-to-blue defaults
- "Glass morphism" cards with no purpose
- Rounded corners on things that shouldn't be rounded
- Excessive animations on scroll
- Generic hero with centered text over a stock gradient
- Sans-serif font stack with no personality
```

## Examples

**Generate for a SaaS app:**
```
/design-system generate --style minimal --palette earth-tones
```

**Audit existing UI:**
```
/design-system audit --url http://localhost:3000 --pages / /pricing /docs
```

**Check for AI slop:**
```
/design-system slop-check
```
79
skills/product-lens/SKILL.md
Normal file
@@ -0,0 +1,79 @@

# Product Lens — Think Before You Build

## When to Use

- Before starting any feature — validate the "why"
- Weekly product review — are we building the right thing?
- When stuck choosing between features
- Before a launch — sanity check the user journey
- When converting a vague idea into a spec

## How It Works

### Mode 1: Product Diagnostic

Like YC office hours but automated. Asks the hard questions:

```
1. Who is this for? (specific person, not "developers")
2. What's the pain? (quantify: how often, how bad, what do they do today?)
3. Why now? (what changed that makes this possible/necessary?)
4. What's the 10-star version? (if money/time were unlimited)
5. What's the MVP? (smallest thing that proves the thesis)
6. What's the anti-goal? (what are you explicitly NOT building?)
7. How do you know it's working? (metric, not vibes)
```

Output: a `PRODUCT-BRIEF.md` with answers, risks, and a go/no-go recommendation.

### Mode 2: Founder Review

Reviews your current project through a founder lens:

```
1. Read README, CLAUDE.md, package.json, recent commits
2. Infer: what is this trying to be?
3. Score: product-market fit signals (0-10)
   - Usage growth trajectory
   - Retention indicators (repeat contributors, return users)
   - Revenue signals (pricing page, billing code, Stripe integration)
   - Competitive moat (what's hard to copy?)
4. Identify: the one thing that would 10x this
5. Flag: things you're building that don't matter
```

### Mode 3: User Journey Audit

Maps the actual user experience:

```
1. Clone/install the product as a new user
2. Document every friction point (confusing steps, errors, missing docs)
3. Time each step
4. Compare to competitor onboarding
5. Score: time-to-value (how long until the user gets their first win?)
6. Recommend: top 3 fixes for onboarding
```

### Mode 4: Feature Prioritization

When you have 10 ideas and need to pick 2:

```
1. List all candidate features
2. Score each on: impact (1-5) × confidence (1-5) ÷ effort (1-5)
3. Rank by ICE score
4. Apply constraints: runway, team size, dependencies
5. Output: prioritized roadmap with rationale
```
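
The scoring in steps 2 and 3 is plain arithmetic and can be sketched as:

```python
def ice_score(impact: int, confidence: int, effort: int) -> float:
    """ICE as defined above: impact (1-5) x confidence (1-5) / effort (1-5)."""
    return impact * confidence / effort

def prioritize(features):
    """Rank (name, impact, confidence, effort) tuples, highest ICE first."""
    return sorted(features, key=lambda f: ice_score(*f[1:]), reverse=True)
```

The constraints in step 4 are applied after ranking, as a manual filter over the sorted list.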

## Output

All modes output actionable docs, not essays. Every recommendation has a specific next step.

## Integration

Pair with:
- `/browser-qa` to verify the user journey audit findings
- `/design-system audit` for visual polish assessment
- `/canary-watch` for post-launch monitoring
69
skills/safety-guard/SKILL.md
Normal file
@@ -0,0 +1,69 @@

# Safety Guard — Prevent Destructive Operations

## When to Use

- When working on production systems
- When agents are running autonomously (full-auto mode)
- When you want to restrict edits to a specific directory
- During sensitive operations (migrations, deploys, data changes)

## How It Works

Three modes of protection:

### Mode 1: Careful Mode

Intercepts destructive commands before execution and warns:

```
Watched patterns:
- rm -rf (especially /, ~, or the project root)
- git push --force
- git reset --hard
- git checkout . (discards all changes)
- DROP TABLE / DROP DATABASE
- docker system prune
- kubectl delete
- chmod 777
- sudo rm
- npm publish (accidental publishes)
- Any command with --no-verify
```

When detected: shows what the command does, asks for confirmation, and suggests a safer alternative.
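
The interception check can be sketched as a pattern table. This is a minimal sketch covering a few of the watched patterns above; the real hook also explains the command and prompts before blocking, and the regexes here are illustrative rather than exhaustive.

```python
import re

# (pattern, why it is dangerous) -- a subset of the watched list above.
DESTRUCTIVE = [
    (r"\brm\s+-[a-z]*r[a-z]*f", "recursive force delete"),
    (r"\bgit\s+push\b.*--force", "force push rewrites remote history"),
    (r"\bgit\s+reset\s+--hard", "discards local commits and changes"),
    (r"\bdrop\s+(table|database)\b", "irreversible schema deletion"),
    (r"--no-verify\b", "skips commit/push hooks"),
]

def check_command(cmd: str):
    """Return the reason a command is dangerous, or None if it looks safe."""
    for pattern, reason in DESTRUCTIVE:
        if re.search(pattern, cmd, re.IGNORECASE):
            return reason
    return None
```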

### Mode 2: Freeze Mode

Locks file edits to a specific directory tree:

```
/safety-guard freeze src/components/
```

Any Write/Edit outside `src/components/` is blocked with an explanation. Useful when you want an agent to focus on one area without touching unrelated code.
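
The freeze check itself reduces to a path-containment test. A sketch, assuming enforcement happens inside the PreToolUse hook described under Implementation:

```python
from pathlib import Path

def is_write_allowed(target: str, frozen_root: str) -> bool:
    """Freeze Mode: a write is allowed only inside the frozen directory tree."""
    t = Path(target).resolve()
    root = Path(frozen_root).resolve()
    return t == root or root in t.parents
```

Resolving both paths first prevents `../` tricks from escaping the frozen tree.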

### Mode 3: Guard Mode (Careful + Freeze combined)

Both protections active. Maximum safety for autonomous agents.

```
/safety-guard guard --dir src/api/ --allow-read-all
```

Agents can read anything but only write to `src/api/`. Destructive commands are blocked everywhere.

### Unlock

```
/safety-guard off
```

## Implementation

Uses PreToolUse hooks to intercept Bash, Write, Edit, and MultiEdit tool calls. Checks the command/path against the active rules before allowing execution.

## Integration

- Enable by default for `codex -a never` sessions
- Pair with observability risk scoring in ECC 2.0
- Logs all blocked actions to `~/.claude/safety-guard.log`