mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-04-13 05:03:28 +08:00
Add Kiro IDE support (.kiro/) (#548)
Co-authored-by: Sungmin Hong <hsungmin@amazon.com>
This commit is contained in:
135
.kiro/skills/agentic-engineering/SKILL.md
Normal file
135
.kiro/skills/agentic-engineering/SKILL.md
Normal file
@@ -0,0 +1,135 @@
|
||||
---
|
||||
name: agentic-engineering
|
||||
description: >
|
||||
Operate as an agentic engineer using eval-first execution, decomposition,
|
||||
and cost-aware model routing. Use when AI agents perform most implementation
|
||||
work and humans enforce quality and risk controls.
|
||||
metadata:
|
||||
origin: ECC
|
||||
---
|
||||
|
||||
# Agentic Engineering
|
||||
|
||||
Use this skill for engineering workflows where AI agents perform most implementation work and humans enforce quality and risk controls.
|
||||
|
||||
## Operating Principles
|
||||
|
||||
1. Define completion criteria before execution.
|
||||
2. Decompose work into agent-sized units.
|
||||
3. Route model tiers by task complexity.
|
||||
4. Measure with evals and regression checks.
|
||||
|
||||
## Eval-First Loop
|
||||
|
||||
1. Define capability eval and regression eval.
|
||||
2. Run baseline and capture failure signatures.
|
||||
3. Execute implementation.
|
||||
4. Re-run evals and compare deltas.
|
||||
|
||||
**Example workflow:**
|
||||
```
|
||||
1. Write test that captures desired behavior (eval)
|
||||
2. Run test → capture baseline failures
|
||||
3. Implement feature
|
||||
4. Re-run test → verify improvements
|
||||
5. Check for regressions in other tests
|
||||
```
|
||||
|
||||
## Task Decomposition
|
||||
|
||||
Apply the 15-minute unit rule:
|
||||
- Each unit should be independently verifiable
|
||||
- Each unit should have a single dominant risk
|
||||
- Each unit should expose a clear done condition
|
||||
|
||||
**Good decomposition:**
|
||||
```
|
||||
Task: Add user authentication
|
||||
├─ Unit 1: Add password hashing (15 min, security risk)
|
||||
├─ Unit 2: Create login endpoint (15 min, API contract risk)
|
||||
├─ Unit 3: Add session management (15 min, state risk)
|
||||
└─ Unit 4: Protect routes with middleware (15 min, auth logic risk)
|
||||
```
|
||||
|
||||
**Bad decomposition:**
|
||||
```
|
||||
Task: Add user authentication (2 hours, multiple risks)
|
||||
```
|
||||
|
||||
## Model Routing
|
||||
|
||||
Choose model tier based on task complexity:
|
||||
|
||||
- **Haiku**: Classification, boilerplate transforms, narrow edits
|
||||
- Example: Rename variable, add type annotation, format code
|
||||
|
||||
- **Sonnet**: Implementation and refactors
|
||||
- Example: Implement feature, refactor module, write tests
|
||||
|
||||
- **Opus**: Architecture, root-cause analysis, multi-file invariants
|
||||
- Example: Design system, debug complex issue, review architecture
|
||||
|
||||
**Cost discipline:** Escalate model tier only when lower tier fails with a clear reasoning gap.
|
||||
|
||||
## Session Strategy
|
||||
|
||||
- **Continue session** for closely-coupled units
|
||||
- Example: Implementing related functions in same module
|
||||
|
||||
- **Start fresh session** after major phase transitions
|
||||
- Example: Moving from implementation to testing
|
||||
|
||||
- **Compact after milestone completion**, not during active debugging
|
||||
- Example: After feature complete, before starting next feature
|
||||
|
||||
## Review Focus for AI-Generated Code
|
||||
|
||||
Prioritize:
|
||||
- Invariants and edge cases
|
||||
- Error boundaries
|
||||
- Security and auth assumptions
|
||||
- Hidden coupling and rollout risk
|
||||
|
||||
Do not waste review cycles on style-only disagreements when automated format/lint already enforce style.
|
||||
|
||||
**Review checklist:**
|
||||
- [ ] Edge cases handled (null, empty, boundary values)
|
||||
- [ ] Error handling comprehensive
|
||||
- [ ] Security assumptions validated
|
||||
- [ ] No hidden coupling between modules
|
||||
- [ ] Rollout risk assessed (breaking changes, migrations)
|
||||
|
||||
## Cost Discipline
|
||||
|
||||
Track per task:
|
||||
- Model tier used
|
||||
- Token estimate
|
||||
- Retries needed
|
||||
- Wall-clock time
|
||||
- Success/failure outcome
|
||||
|
||||
**Example tracking:**
|
||||
```
|
||||
Task: Implement user login
|
||||
Model: Sonnet
|
||||
Tokens: ~5k input, ~2k output
|
||||
Retries: 1 (initial implementation had auth bug)
|
||||
Time: 8 minutes
|
||||
Outcome: Success
|
||||
```
|
||||
|
||||
## When to Use This Skill
|
||||
|
||||
- Managing AI-driven development workflows
|
||||
- Planning agent task decomposition
|
||||
- Optimizing model tier selection
|
||||
- Implementing eval-first development
|
||||
- Reviewing AI-generated code
|
||||
- Tracking development costs
|
||||
|
||||
## Integration with Other Skills
|
||||
|
||||
- **tdd-workflow**: Combine with eval-first loop for test-driven development
|
||||
- **verification-loop**: Use for continuous validation during implementation
|
||||
- **search-first**: Apply before implementation to find existing solutions
|
||||
- **coding-standards**: Reference during code review phase
|
||||
Reference in New Issue
Block a user