mirror of https://github.com/affaan-m/everything-claude-code.git synced 2026-03-30 13:43:26 +08:00


Himanshu Sharma 535120d6b1 Add Kiro skills (18 SKILL.md files) (#811)

Co-authored-by: Sungmin Hong <hsungmin@amazon.com>

2026-03-22 21:55:45 -07:00

3.9 KiB


name: agentic-engineering
description: Operate as an agentic engineer using eval-first execution, decomposition, and cost-aware model routing. Use when AI agents perform most implementation work and humans enforce quality and risk controls.
metadata:
  origin: ECC

Agentic Engineering

Use this skill for engineering workflows where AI agents perform most implementation work and humans enforce quality and risk controls.

Operating Principles

Define completion criteria before execution.
Decompose work into agent-sized units.
Route model tiers by task complexity.
Measure with evals and regression checks.

Eval-First Loop

Define capability eval and regression eval.
Run baseline and capture failure signatures.
Execute implementation.
Re-run evals and compare deltas.

Example workflow:

1. Write a test that captures the desired behavior (the eval)
2. Run the test → capture baseline failures
3. Implement the feature
4. Re-run the test → verify improvements
5. Check for regressions in the other tests
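
The comparison in steps 4 and 5 can be sketched as a diff between two eval runs. This is a minimal illustration, assuming each run is recorded as a mapping from test name to pass/fail; the names `EvalDelta` and `compare_runs` and the test names are hypothetical, not part of this skill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EvalDelta:
    fixed: frozenset          # failing at baseline, passing now
    regressed: frozenset      # passing at baseline, failing now
    still_failing: frozenset  # failing in both runs


def compare_runs(baseline: dict[str, bool], current: dict[str, bool]) -> EvalDelta:
    """Compare two eval runs keyed by test name (True means the test passed)."""
    fixed = {n for n, ok in current.items() if ok and baseline.get(n) is False}
    regressed = {n for n, ok in current.items() if not ok and baseline.get(n) is True}
    still_failing = {n for n, ok in current.items() if not ok and baseline.get(n) is False}
    return EvalDelta(frozenset(fixed), frozenset(regressed), frozenset(still_failing))


# Hypothetical results: the capability eval now passes, but an unrelated test broke.
baseline = {"test_login_ok": False, "test_logout": True, "test_signup": True}
current = {"test_login_ok": True, "test_logout": True, "test_signup": False}

delta = compare_runs(baseline, current)
print(sorted(delta.fixed))      # ['test_login_ok']
print(sorted(delta.regressed))  # ['test_signup']
```

A non-empty `regressed` set is the signal to stop and investigate before moving on to the next unit.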

Task Decomposition

Apply the 15-minute unit rule: size each unit at roughly 15 minutes of agent work, and check it against three criteria:

Each unit should be independently verifiable
Each unit should have a single dominant risk
Each unit should expose a clear done condition

Good decomposition:

Task: Add user authentication
├─ Unit 1: Add password hashing (15 min, security risk)
├─ Unit 2: Create login endpoint (15 min, API contract risk)
├─ Unit 3: Add session management (15 min, state risk)
└─ Unit 4: Protect routes with middleware (15 min, auth logic risk)

Bad decomposition:

Task: Add user authentication (2 hours, multiple risks)
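
The three criteria and the time budget can be checked mechanically before work is handed to an agent. The sketch below is one possible encoding, assuming a unit is described by an estimate, a list of risks, and a done condition; `WorkUnit` and `check_unit` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

MAX_UNIT_MINUTES = 15  # budget taken from the 15-minute unit rule


@dataclass(frozen=True)
class WorkUnit:
    name: str
    estimate_minutes: int
    risks: tuple[str, ...]  # ideally exactly one dominant risk
    done_condition: str     # e.g. a test name or an observable behavior


def check_unit(unit: WorkUnit) -> list[str]:
    """Return the rule violations for a unit; an empty list means it is well-sized."""
    problems = []
    if unit.estimate_minutes > MAX_UNIT_MINUTES:
        problems.append(f"estimate {unit.estimate_minutes} min exceeds {MAX_UNIT_MINUTES}")
    if len(unit.risks) != 1:
        problems.append(f"expected 1 dominant risk, found {len(unit.risks)}")
    if not unit.done_condition.strip():
        problems.append("missing done condition")
    return problems


good = WorkUnit("Add password hashing", 15, ("security",), "hash/verify round-trip test passes")
bad = WorkUnit("Add user authentication", 120, ("security", "API contract", "state"), "")

print(check_unit(good))  # []
print(check_unit(bad))   # three violations: size, risk count, done condition
```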

Model Routing

Choose model tier based on task complexity:

Haiku: Classification, boilerplate transforms, narrow edits
- Example: Rename variable, add type annotation, format code
Sonnet: Implementation and refactors
- Example: Implement feature, refactor module, write tests
Opus: Architecture, root-cause analysis, multi-file invariants
- Example: Design system, debug complex issue, review architecture

Cost discipline: Escalate to a higher model tier only when the lower tier fails with a clear reasoning gap.
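
The routing table and the escalation rule could be expressed as follows. This is a sketch under stated assumptions: the task-kind strings are hypothetical labels for the categories listed above, defaulting unknown kinds to the middle tier is an illustrative choice, and no real model API is called.

```python
from enum import IntEnum


class Tier(IntEnum):
    HAIKU = 1
    SONNET = 2
    OPUS = 3


# Hypothetical task-kind labels mapped to the tiers named in the text.
DEFAULT_TIER = {
    "classification": Tier.HAIKU,
    "boilerplate": Tier.HAIKU,
    "narrow_edit": Tier.HAIKU,
    "implementation": Tier.SONNET,
    "refactor": Tier.SONNET,
    "architecture": Tier.OPUS,
    "root_cause": Tier.OPUS,
    "multi_file_invariant": Tier.OPUS,
}


def route(task_kind: str) -> Tier:
    """Pick the starting tier; unknown kinds fall back to the middle tier (an assumption)."""
    return DEFAULT_TIER.get(task_kind, Tier.SONNET)


def escalate(current: Tier, failed: bool, reasoning_gap: bool) -> Tier:
    """Move up one tier only after a failure that shows a clear reasoning gap."""
    if failed and reasoning_gap and current < Tier.OPUS:
        return Tier(current + 1)
    return current


tier = route("narrow_edit")
print(tier.name)                                                   # HAIKU
print(escalate(tier, failed=True, reasoning_gap=False).name)       # HAIKU
print(escalate(tier, failed=True, reasoning_gap=True).name)        # SONNET
```

A failure without a reasoning gap (for example a flaky test or missing context) is retried at the same tier, which keeps escalation from becoming the default response to any error.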

Session Strategy

Continue the session for closely coupled units
- Example: Implementing related functions in the same module
Start a fresh session after major phase transitions
- Example: Moving from implementation to testing
Compact after milestone completion, not during active debugging
- Example: After a feature is complete, before starting the next feature
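
These three rules can be read as a small decision function. The sketch below is illustrative only: the function name, the flag names, and the precedence between rules (active debugging wins, then a phase change, then a milestone) are assumptions, since the text does not say how to resolve conflicts.

```python
def next_session_action(*, phase_changed: bool, milestone_done: bool, debugging: bool) -> str:
    """Return "continue", "fresh", or "compact" for the next unit of work."""
    if debugging:
        return "continue"  # never compact or reset in the middle of active debugging
    if phase_changed:
        return "fresh"     # e.g. moving from implementation to testing
    if milestone_done:
        return "compact"   # e.g. feature complete, next feature not yet started
    return "continue"      # closely coupled follow-up work


print(next_session_action(phase_changed=False, milestone_done=True, debugging=False))  # compact
print(next_session_action(phase_changed=False, milestone_done=True, debugging=True))   # continue
```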

Review Focus for AI-Generated Code

Prioritize:

Invariants and edge cases
Error boundaries
Security and auth assumptions
Hidden coupling and rollout risk

Do not spend review cycles on style-only disagreements when automated formatting and linting already enforce style.

Review checklist:

Edge cases handled (null, empty, boundary values)
Error handling comprehensive
Security assumptions validated
No hidden coupling between modules
Rollout risk assessed (breaking changes, migrations)

Cost Discipline

Track per task:

Model tier used
Token estimate
Retries needed
Wall-clock time
Success/failure outcome

Example tracking:

Task: Implement user login
Model: Sonnet
Tokens: ~5k input, ~2k output
Retries: 1 (initial implementation had auth bug)
Time: 8 minutes
Outcome: Success
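
A record like the one above can be kept as structured data so that totals per model tier are easy to compute. This is a minimal sketch, assuming one record per task; `TaskRecord` and `summarize` are hypothetical names, and the single record reuses the figures from the example tracking entry.

```python
from dataclasses import dataclass


@dataclass
class TaskRecord:
    task: str
    model: str
    input_tokens: int
    output_tokens: int
    retries: int
    minutes: float
    success: bool


def summarize(records: list[TaskRecord]) -> dict[str, dict[str, float]]:
    """Aggregate per model tier: task count, total tokens, retries, and success rate."""
    summary: dict[str, dict[str, float]] = {}
    for r in records:
        s = summary.setdefault(
            r.model, {"tasks": 0, "tokens": 0, "retries": 0, "successes": 0}
        )
        s["tasks"] += 1
        s["tokens"] += r.input_tokens + r.output_tokens
        s["retries"] += r.retries
        s["successes"] += int(r.success)
    for s in summary.values():
        s["success_rate"] = s["successes"] / s["tasks"]
    return summary


records = [
    TaskRecord("Implement user login", "Sonnet", 5000, 2000, 1, 8, True),
]

report = summarize(records)
print(report["Sonnet"]["tokens"])   # 7000
print(report["Sonnet"]["retries"])  # 1
```

Comparing retries and success rate across tiers over time shows whether a cheaper tier is adequate for a class of tasks.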

When to Use This Skill

Managing AI-driven development workflows
Planning agent task decomposition
Optimizing model tier selection
Implementing eval-first development
Reviewing AI-generated code
Tracking development costs

Integration with Other Skills

tdd-workflow: Combine with eval-first loop for test-driven development
verification-loop: Use for continuous validation during implementation
search-first: Apply before implementation to find existing solutions
coding-standards: Reference during code review phase
