Files
everything-claude-code/skills/continuous-learning-v2/agents/observer-loop.sh
Chris Yau 0e733753e0 feat: pending instinct TTL pruning and /prune command (#725)
* feat: add pending instinct TTL pruning and /prune command

Pending instincts generated by the observer accumulate indefinitely
with no cleanup mechanism. This adds lifecycle management:

- `instinct-cli.py prune` — delete pending instincts older than 30 days
  (configurable via --max-age). Supports --dry-run and --quiet flags.
- Enhanced `status` command — shows pending count, warns at 5+,
  highlights instincts expiring within 7 days.
- `observer-loop.sh` — runs prune before each analysis cycle.
- `/prune` slash command — user-facing command for manual pruning.

Design rationale: council consensus (4/4) rejected auto-promote in
favor of TTL-based garbage collection. Frequency of observation does
not establish correctness. Unreviewed pending instincts auto-delete
after 30 days; if the pattern is real, the observer will regenerate it.

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>

* fix: remove duplicate functions, broaden extension filter, fix prune output

- Remove duplicate _collect_pending_dirs and _parse_created_date defs
- Use ALLOWED_INSTINCT_EXTENSIONS (.md/.yaml/.yml) instead of .md-only
- Track actually-deleted items separately from expired for accurate output
- Update README.md and AGENTS.md command counts: 59 → 60

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>

* fix: address Copilot and CodeRabbit review findings

- Use is_dir() instead of exists() for pending path checks
- Change > to >= for --max-age boundary (--max-age 0 now prunes all)
- Use CLV2_PYTHON_CMD env var in observer-loop.sh prune call
- Remove unused source_dupes variable
- Remove extraneous f-string prefix on static string

Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)

Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>

* fix: update AGENTS.md project structure command count 59 → 60

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: address cubic and coderabbit review findings

- Fix status early return skipping pending instinct warnings (cubic #1)
- Exclude already-expired items from expiring-soon filter (cubic #2)
- Warn on unparseable pending instinct age instead of silent skip (cubic #4)
- Log prune failures to observer.log instead of silencing (cubic #5)

Co-Authored-By: Claude <noreply@anthropic.com>

* fix: YAML single-quote unescaping, f-string cleanup, add /prune to README

- Fix single-quoted YAML unescaping: use '' doubling (YAML spec) not
  backslash escaping which only applies to double-quoted strings (greptile P1)
- Remove extraneous f-string prefix on static string (coderabbit)
- Add /prune to README command catalog and file tree (cubic)

Co-Authored-By: Claude <noreply@anthropic.com>

---------

Co-authored-by: Claude <noreply@anthropic.com>
Co-authored-by: Happy <yesreply@happy.engineering>
2026-03-22 15:40:58 -07:00

192 lines
6.1 KiB
Bash
Executable File

#!/usr/bin/env bash
# Continuous Learning v2 - Observer background loop
#
# Fix for #521: Added re-entrancy guard, cooldown throttle, and
# tail-based sampling to prevent memory explosion from runaway
# parallel Claude analysis processes.
set +e
unset CLAUDECODE
SLEEP_PID=""
USR1_FIRED=0
ANALYZING=0
LAST_ANALYSIS_EPOCH=0
# Minimum seconds between analyses (prevents rapid re-triggering)
ANALYSIS_COOLDOWN="${ECC_OBSERVER_ANALYSIS_COOLDOWN:-60}"
cleanup() {
[ -n "$SLEEP_PID" ] && kill "$SLEEP_PID" 2>/dev/null
if [ -f "$PID_FILE" ] && [ "$(cat "$PID_FILE" 2>/dev/null)" = "$$" ]; then
rm -f "$PID_FILE"
fi
exit 0
}
trap cleanup TERM INT
analyze_observations() {
if [ ! -f "$OBSERVATIONS_FILE" ]; then
return
fi
obs_count=$(wc -l < "$OBSERVATIONS_FILE" 2>/dev/null || echo 0)
if [ "$obs_count" -lt "$MIN_OBSERVATIONS" ]; then
return
fi
echo "[$(date)] Analyzing $obs_count observations for project ${PROJECT_NAME}..." >> "$LOG_FILE"
if [ "${CLV2_IS_WINDOWS:-false}" = "true" ] && [ "${ECC_OBSERVER_ALLOW_WINDOWS:-false}" != "true" ]; then
echo "[$(date)] Skipping claude analysis on Windows due to known non-interactive hang issue (#295). Set ECC_OBSERVER_ALLOW_WINDOWS=true to override." >> "$LOG_FILE"
return
fi
if ! command -v claude >/dev/null 2>&1; then
echo "[$(date)] claude CLI not found, skipping analysis" >> "$LOG_FILE"
return
fi
# session-guardian: gate observer cycle (active hours, cooldown, idle detection)
if ! bash "$(dirname "$0")/session-guardian.sh"; then
echo "[$(date)] Observer cycle skipped by session-guardian" >> "$LOG_FILE"
return
fi
# Sample recent observations instead of loading the entire file (#521).
# This prevents multi-MB payloads from being passed to the LLM.
MAX_ANALYSIS_LINES="${ECC_OBSERVER_MAX_ANALYSIS_LINES:-500}"
analysis_file="$(mktemp "${TMPDIR:-/tmp}/ecc-observer-analysis.XXXXXX.jsonl")"
tail -n "$MAX_ANALYSIS_LINES" "$OBSERVATIONS_FILE" > "$analysis_file"
analysis_count=$(wc -l < "$analysis_file" 2>/dev/null || echo 0)
echo "[$(date)] Using last $analysis_count of $obs_count observations for analysis" >> "$LOG_FILE"
prompt_file="$(mktemp "${TMPDIR:-/tmp}/ecc-observer-prompt.XXXXXX")"
cat > "$prompt_file" <<PROMPT
Read ${analysis_file} and identify patterns for the project ${PROJECT_NAME} (user corrections, error resolutions, repeated workflows, tool preferences).
If you find 3+ occurrences of the same pattern, create an instinct file in ${INSTINCTS_DIR}/<id>.md.
CRITICAL: Every instinct file MUST use this exact format:
---
id: kebab-case-name
trigger: when <specific condition>
confidence: <0.3-0.85 based on frequency: 3-5 times=0.5, 6-10=0.7, 11+=0.85>
domain: <one of: code-style, testing, git, debugging, workflow, file-patterns>
source: session-observation
scope: project
project_id: ${PROJECT_ID}
project_name: ${PROJECT_NAME}
---
# Title
## Action
<what to do, one clear sentence>
## Evidence
- Observed N times in session <id>
- Pattern: <description>
- Last observed: <date>
Rules:
- Be conservative, only clear patterns with 3+ observations
- Use narrow, specific triggers
- Never include actual code snippets, only describe patterns
- If a similar instinct already exists in ${INSTINCTS_DIR}/, update it instead of creating a duplicate
- The YAML frontmatter (between --- markers) with id field is MANDATORY
- If a pattern seems universal (not project-specific), set scope to global instead of project
- Examples of global patterns: always validate user input, prefer explicit error handling
- Examples of project patterns: use React functional components, follow Django REST framework conventions
PROMPT
timeout_seconds="${ECC_OBSERVER_TIMEOUT_SECONDS:-120}"
max_turns="${ECC_OBSERVER_MAX_TURNS:-10}"
exit_code=0
case "$max_turns" in
''|*[!0-9]*)
max_turns=10
;;
esac
if [ "$max_turns" -lt 4 ]; then
max_turns=10
fi
# Prevent observe.sh from recording this automated Haiku session as observations
ECC_SKIP_OBSERVE=1 ECC_HOOK_PROFILE=minimal claude --model haiku --max-turns "$max_turns" --print \
--allowedTools "Read,Write" \
< "$prompt_file" >> "$LOG_FILE" 2>&1 &
claude_pid=$!
(
sleep "$timeout_seconds"
if kill -0 "$claude_pid" 2>/dev/null; then
echo "[$(date)] Claude analysis timed out after ${timeout_seconds}s; terminating process" >> "$LOG_FILE"
kill "$claude_pid" 2>/dev/null || true
fi
) &
watchdog_pid=$!
wait "$claude_pid"
exit_code=$?
kill "$watchdog_pid" 2>/dev/null || true
rm -f "$prompt_file" "$analysis_file"
if [ "$exit_code" -ne 0 ]; then
echo "[$(date)] Claude analysis failed (exit $exit_code)" >> "$LOG_FILE"
fi
if [ -f "$OBSERVATIONS_FILE" ]; then
archive_dir="${PROJECT_DIR}/observations.archive"
mkdir -p "$archive_dir"
mv "$OBSERVATIONS_FILE" "$archive_dir/processed-$(date +%Y%m%d-%H%M%S)-$$.jsonl" 2>/dev/null || true
fi
}
on_usr1() {
[ -n "$SLEEP_PID" ] && kill "$SLEEP_PID" 2>/dev/null
SLEEP_PID=""
USR1_FIRED=1
# Re-entrancy guard: skip if analysis is already running (#521)
if [ "$ANALYZING" -eq 1 ]; then
echo "[$(date)] Analysis already in progress, skipping signal" >> "$LOG_FILE"
return
fi
# Cooldown: skip if last analysis was too recent (#521)
now_epoch=$(date +%s)
elapsed=$(( now_epoch - LAST_ANALYSIS_EPOCH ))
if [ "$elapsed" -lt "$ANALYSIS_COOLDOWN" ]; then
echo "[$(date)] Analysis cooldown active (${elapsed}s < ${ANALYSIS_COOLDOWN}s), skipping" >> "$LOG_FILE"
return
fi
ANALYZING=1
analyze_observations
LAST_ANALYSIS_EPOCH=$(date +%s)
ANALYZING=0
}
trap on_usr1 USR1
echo "$$" > "$PID_FILE"
echo "[$(date)] Observer started for ${PROJECT_NAME} (PID: $$)" >> "$LOG_FILE"
# Prune expired pending instincts before analysis
SCRIPT_DIR="$(cd "$(dirname "$0")" && pwd)"
"${CLV2_PYTHON_CMD:-python3}" "${SCRIPT_DIR}/../scripts/instinct-cli.py" prune --quiet >> "$LOG_FILE" 2>&1 || echo "[$(date)] Warning: instinct prune failed (non-fatal)" >> "$LOG_FILE"
while true; do
sleep "$OBSERVER_INTERVAL_SECONDS" &
SLEEP_PID=$!
wait "$SLEEP_PID" 2>/dev/null
SLEEP_PID=""
if [ "$USR1_FIRED" -eq 1 ]; then
USR1_FIRED=0
else
analyze_observations
fi
done