Files
Affaan Mustafa 8511d84042 feat(skills): add rules-distill skill (rebased #561) (#678)
* feat(skills): add rules-distill — extract cross-cutting principles from skills into rules

Applies the skill-stocktake pattern to rules maintenance:
scan skills → extract shared principles → propose rule changes.

Key design decisions:
- Deterministic collection (scan scripts) + LLM judgment (cross-read & verdict)
- 6 verdict types: Append, Revise, New Section, New File, Already Covered, Too Specific
- Anti-abstraction safeguard: 2+ skills evidence, actionable behavior test, violation risk
- Rules full text passed to LLM (no grep pre-filter) for accurate matching
- Never modifies rules automatically — always requires user approval

* fix(skills): address review feedback for rules-distill

Fixes raised by CodeRabbit, Greptile, and cubic:

- Add Prerequisites section documenting skill-stocktake dependency
- Add fallback command when skill-stocktake is not installed
- Fix shell quoting: add IFS= and -r to while-read loops
- Replace hardcoded paths with env var placeholders ($CLAUDE_RULES_DIR, $SKILL_STOCKTAKE_DIR)
- Add json language identifier to code blocks
- Add "How It Works" parent heading for Phase 1/2/3
- Add "Example" section with end-to-end run output
- Add revision.reason/before/after fields to output schema for Revise verdict
- Document timestamp format (date -u +%Y-%m-%dT%H:%M:%SZ)
- Document candidate-id format (kebab-case from principle)
- Use concrete examples in results.json schema

* fix(skills): remove skill-stocktake dependency, add self-contained scripts

Address P1 review feedback:
- Add scan-skills.sh and scan-rules.sh directly in rules-distill/scripts/
  (no external dependency on skill-stocktake)
- Remove Prerequisites section (no longer needed)
- Add cross-batch merge step to prevent 2+ skills requirement
  from being silently broken across batch boundaries
- Fix nested triple-backtick fences (use quadruple backticks)
- Remove head -100 cap (silent truncation)
- Rename "When to Activate" → "When to Use" (ECC standard)
- Remove unnecessary env var placeholders (SKILL.md is a prompt, not a script)

* fix: update skill/command counts in README.md and AGENTS.md

rules-distill added 1 skill + 1 command:
- skills: 108 → 109
- commands: 57 → 58

Updates all count references to pass CI catalog validation.

* fix(skills): address Servitor review feedback for rules-distill

1. Rename SKILL_STOCKTAKE_* env vars to RULES_DISTILL_* for consistency
2. Remove unnecessary observation counting (use_7d/use_30d) from scan-skills.sh
3. Fix header comment: scan.sh → scan-skills.sh
4. Use jq for JSON construction in scan-rules.sh to properly escape
   headings containing special characters (", \)

* fix(skills): address CodeRabbit review — portability and scan scope

1. scan-rules.sh: use jq for error JSON output (proper escaping)
2. scan-rules.sh: replace GNU-only sort -z with portable sort (BSD compat)
3. scan-rules.sh: fix pipefail crash on files without H2 headings
4. scan-skills.sh: scan only SKILL.md files (skip learned/*.md and
   auxiliary docs that lack frontmatter)
5. scan-skills.sh: add portable get_mtime helper (GNU stat/date
   fallback to BSD stat/date)

* fix: sync catalog counts with filesystem (27 agents, 114 skills, 59 commands)

---------

Co-authored-by: Tatsuya Shimomoto <shimo4228@gmail.com>
2026-03-20 01:44:55 -07:00

130 lines
3.8 KiB
Bash
Executable File

#!/usr/bin/env bash
# scan-skills.sh — enumerate skill files, extract frontmatter and UTC mtime
# Usage: scan-skills.sh [CWD_SKILLS_DIR]
# Output: JSON to stdout
#
# When CWD_SKILLS_DIR is omitted, defaults to $PWD/.claude/skills so the
# script always picks up project-level skills without relying on the caller.
#
# Environment:
# RULES_DISTILL_GLOBAL_DIR Override ~/.claude/skills (for testing only;
# do not set in production — intended for bats tests)
# RULES_DISTILL_PROJECT_DIR Override project dir detection (for testing only)
set -euo pipefail
GLOBAL_DIR="${RULES_DISTILL_GLOBAL_DIR:-$HOME/.claude/skills}"
CWD_SKILLS_DIR="${RULES_DISTILL_PROJECT_DIR:-${1:-$PWD/.claude/skills}}"
# Validate CWD_SKILLS_DIR looks like a .claude/skills path (defense-in-depth).
# Only warn when the path exists — a nonexistent path poses no traversal risk.
if [[ -n "$CWD_SKILLS_DIR" && -d "$CWD_SKILLS_DIR" && "$CWD_SKILLS_DIR" != */.claude/skills* ]]; then
echo "Warning: CWD_SKILLS_DIR does not look like a .claude/skills path: $CWD_SKILLS_DIR" >&2
fi
# Extract a frontmatter field (handles both quoted and unquoted single-line values).
# Does NOT support multi-line YAML blocks (| or >) or nested YAML keys.
extract_field() {
local file="$1" field="$2"
awk -v f="$field" '
BEGIN { fm=0 }
/^---$/ { fm++; next }
fm==1 {
n = length(f) + 2
if (substr($0, 1, n) == f ": ") {
val = substr($0, n+1)
gsub(/^"/, "", val)
gsub(/"$/, "", val)
print val
exit
}
}
fm>=2 { exit }
' "$file"
}
# Get file mtime in UTC ISO8601 (portable: GNU and BSD)
get_mtime() {
local file="$1"
local secs
secs=$(stat -c %Y "$file" 2>/dev/null || stat -f %m "$file" 2>/dev/null) || return 1
date -u -d "@$secs" +%Y-%m-%dT%H:%M:%SZ 2>/dev/null ||
date -u -r "$secs" +%Y-%m-%dT%H:%M:%SZ
}
# Scan a directory and produce a JSON array of skill objects
scan_dir_to_json() {
local dir="$1"
local tmpdir
tmpdir=$(mktemp -d)
local _scan_tmpdir="$tmpdir"
_scan_cleanup() { rm -rf "$_scan_tmpdir"; }
trap _scan_cleanup RETURN
local i=0
while IFS= read -r file; do
local name desc mtime dp
name=$(extract_field "$file" "name")
desc=$(extract_field "$file" "description")
mtime=$(get_mtime "$file")
dp="${file/#$HOME/~}"
jq -n \
--arg path "$dp" \
--arg name "$name" \
--arg description "$desc" \
--arg mtime "$mtime" \
'{path:$path,name:$name,description:$description,mtime:$mtime}' \
> "$tmpdir/$i.json"
i=$((i+1))
done < <(find "$dir" -name "SKILL.md" -type f 2>/dev/null | sort)
if [[ $i -eq 0 ]]; then
echo "[]"
else
jq -s '.' "$tmpdir"/*.json
fi
}
# --- Main ---
global_found="false"
global_count=0
global_skills="[]"
if [[ -d "$GLOBAL_DIR" ]]; then
global_found="true"
global_skills=$(scan_dir_to_json "$GLOBAL_DIR")
global_count=$(echo "$global_skills" | jq 'length')
fi
project_found="false"
project_path=""
project_count=0
project_skills="[]"
if [[ -n "$CWD_SKILLS_DIR" && -d "$CWD_SKILLS_DIR" ]]; then
project_found="true"
project_path="$CWD_SKILLS_DIR"
project_skills=$(scan_dir_to_json "$CWD_SKILLS_DIR")
project_count=$(echo "$project_skills" | jq 'length')
fi
# Merge global + project skills into one array
all_skills=$(jq -s 'add' <(echo "$global_skills") <(echo "$project_skills"))
jq -n \
--arg global_found "$global_found" \
--argjson global_count "$global_count" \
--arg project_found "$project_found" \
--arg project_path "$project_path" \
--argjson project_count "$project_count" \
--argjson skills "$all_skills" \
'{
scan_summary: {
global: { found: ($global_found == "true"), count: $global_count },
project: { found: ($project_found == "true"), path: $project_path, count: $project_count }
},
skills: $skills
}'