Files
everything-claude-code/.kiro/skills/agentic-engineering/SKILL.md
Himanshu Sharma 535120d6b1 Add Kiro skills (18 SKILL.md files) (#811)
Co-authored-by: Sungmin Hong <hsungmin@amazon.com>
2026-03-22 21:55:45 -07:00

3.9 KiB

name, description, metadata
name description metadata
agentic-engineering Operate as an agentic engineer using eval-first execution, decomposition, and cost-aware model routing. Use when AI agents perform most implementation work and humans enforce quality and risk controls.
origin
ECC

Agentic Engineering

Use this skill for engineering workflows where AI agents perform most implementation work and humans enforce quality and risk controls.

Operating Principles

  1. Define completion criteria before execution.
  2. Decompose work into agent-sized units.
  3. Route model tiers by task complexity.
  4. Measure with evals and regression checks.

Eval-First Loop

  1. Define capability eval and regression eval.
  2. Run baseline and capture failure signatures.
  3. Execute implementation.
  4. Re-run evals and compare deltas.

Example workflow:

1. Write test that captures desired behavior (eval)
2. Run test → capture baseline failures
3. Implement feature
4. Re-run test → verify improvements
5. Check for regressions in other tests

Task Decomposition

Apply the 15-minute unit rule:

  • Each unit should be independently verifiable
  • Each unit should have a single dominant risk
  • Each unit should expose a clear done condition

Good decomposition:

Task: Add user authentication
├─ Unit 1: Add password hashing (15 min, security risk)
├─ Unit 2: Create login endpoint (15 min, API contract risk)
├─ Unit 3: Add session management (15 min, state risk)
└─ Unit 4: Protect routes with middleware (15 min, auth logic risk)

Bad decomposition:

Task: Add user authentication (2 hours, multiple risks)

Model Routing

Choose model tier based on task complexity:

  • Haiku: Classification, boilerplate transforms, narrow edits

    • Example: Rename variable, add type annotation, format code
  • Sonnet: Implementation and refactors

    • Example: Implement feature, refactor module, write tests
  • Opus: Architecture, root-cause analysis, multi-file invariants

    • Example: Design system, debug complex issue, review architecture

Cost discipline: Escalate model tier only when lower tier fails with a clear reasoning gap.

Session Strategy

  • Continue session for closely-coupled units

    • Example: Implementing related functions in same module
  • Start fresh session after major phase transitions

    • Example: Moving from implementation to testing
  • Compact after milestone completion, not during active debugging

    • Example: After feature complete, before starting next feature

Review Focus for AI-Generated Code

Prioritize:

  • Invariants and edge cases
  • Error boundaries
  • Security and auth assumptions
  • Hidden coupling and rollout risk

Do not waste review cycles on style-only disagreements when automated format/lint already enforce style.

Review checklist:

  • Edge cases handled (null, empty, boundary values)
  • Error handling comprehensive
  • Security assumptions validated
  • No hidden coupling between modules
  • Rollout risk assessed (breaking changes, migrations)

Cost Discipline

Track per task:

  • Model tier used
  • Token estimate
  • Retries needed
  • Wall-clock time
  • Success/failure outcome

Example tracking:

Task: Implement user login
Model: Sonnet
Tokens: ~5k input, ~2k output
Retries: 1 (initial implementation had auth bug)
Time: 8 minutes
Outcome: Success

When to Use This Skill

  • Managing AI-driven development workflows
  • Planning agent task decomposition
  • Optimizing model tier selection
  • Implementing eval-first development
  • Reviewing AI-generated code
  • Tracking development costs

Integration with Other Skills

  • tdd-workflow: Combine with eval-first loop for test-driven development
  • verification-loop: Use for continuous validation during implementation
  • search-first: Apply before implementation to find existing solutions
  • coding-standards: Reference during code review phase