feat: orchestration harness, selective install, observer improvements

2026-07-01 20:41:26 +08:00 · 2026-03-14 12:55:25 -07:00
parent 424f3b3729
commit 4e028bd2d2
76 changed files with 11050 additions and 340 deletions
@@ -82,7 +82,7 @@ If the user chooses niche or core + niche, continue to category selection below

 ### 2b: Choose Skill Categories

-There are 35 skills organized into 7 categories. Use `AskUserQuestion` with `multiSelect: true`:
+There are 7 selectable category groups below. The detailed confirmation lists that follow cover 41 skills across 8 categories, plus 1 standalone template. Use `AskUserQuestion` with `multiSelect: true`:

 ```
 Question: "Which skill categories do you want to install?"
@@ -91,7 +91,8 @@ PROMPT
    max_turns=10
  fi

-  claude --model haiku --max-turns "$max_turns" --print < "$prompt_file" >> "$LOG_FILE" 2>&1 &
+  # Prevent observe.sh from recording this automated Haiku session as observations
+  ECC_SKIP_OBSERVE=1 ECC_HOOK_PROFILE=minimal claude --model haiku --max-turns "$max_turns" --print < "$prompt_file" >> "$LOG_FILE" 2>&1 &
  claude_pid=$!

  (
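Review note on the hunk above: the two variables are written as an assignment prefix on the `claude` command, so they should reach only that one child process and leave the launching script's own environment untouched. The snippet below is a minimal sketch of that shell behavior, assuming a POSIX `sh` is available; the `child_view` and `parent_view` names are illustrative and do not appear in the commit.

```sh
# Start from a clean slate so the demonstration is deterministic.
unset ECC_SKIP_OBSERVE

# The assignment prefix is exported to this one child process only.
child_view=$(ECC_SKIP_OBSERVE=1 sh -c 'printf %s "${ECC_SKIP_OBSERVE:-0}"')

# The calling shell itself never sees the variable.
parent_view=${ECC_SKIP_OBSERVE:-0}

echo "child=$child_view parent=$parent_view"
```

This is why the Layer 3 check in observe.sh can treat `ECC_SKIP_OBSERVE=1` as a signal that the hook is running inside the automated observer session rather than in the user's interactive one.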
@@ -73,6 +73,50 @@ if [ -n "$STDIN_CWD" ] && [ -d "$STDIN_CWD" ]; then
  export CLAUDE_PROJECT_DIR="$STDIN_CWD"
 fi

+# ─────────────────────────────────────────────
+# Lightweight config and automated session guards
+# ─────────────────────────────────────────────
+
+CONFIG_DIR="${HOME}/.claude/homunculus"
+
+# Skip if disabled
+if [ -f "$CONFIG_DIR/disabled" ]; then
+  exit 0
+fi
+
+# Prevent observe.sh from firing on non-human sessions to avoid:
+#   - ECC observing its own Haiku observer sessions (self-loop)
+#   - ECC observing other tools' automated sessions
+#   - automated sessions creating project-scoped homunculus metadata
+
+# Layer 1: entrypoint. Only interactive terminal sessions should continue.
+case "${CLAUDE_CODE_ENTRYPOINT:-cli}" in
+  cli) ;;
+  *) exit 0 ;;
+esac
+
+# Layer 2: minimal hook profile suppresses non-essential hooks.
+[ "${ECC_HOOK_PROFILE:-standard}" = "minimal" ] && exit 0
+
+# Layer 3: cooperative skip env var for automated sessions.
+[ "${ECC_SKIP_OBSERVE:-0}" = "1" ] && exit 0
+
+# Layer 4: subagent sessions are automated by definition.
+_ECC_AGENT_ID=$(echo "$INPUT_JSON" | "$PYTHON_CMD" -c "import json,sys; print(json.load(sys.stdin).get('agent_id',''))" 2>/dev/null || true)
+[ -n "$_ECC_AGENT_ID" ] && exit 0
+
+# Layer 5: known observer-session path exclusions.
+_ECC_SKIP_PATHS="${ECC_OBSERVE_SKIP_PATHS:-observer-sessions,.claude-mem}"
+if [ -n "$STDIN_CWD" ]; then
+  IFS=',' read -ra _ECC_SKIP_ARRAY <<< "$_ECC_SKIP_PATHS"
+  for _pattern in "${_ECC_SKIP_ARRAY[@]}"; do
+    _pattern="${_pattern#"${_pattern%%[![:space:]]*}"}"
+    _pattern="${_pattern%"${_pattern##*[![:space:]]}"}"
+    [ -z "$_pattern" ] && continue
+    case "$STDIN_CWD" in *"$_pattern"*) exit 0 ;; esac
+  done
+fi
+
 # ─────────────────────────────────────────────
 # Project detection
 # ─────────────────────────────────────────────
@@ -89,15 +133,9 @@ PYTHON_CMD="${CLV2_PYTHON_CMD:-$PYTHON_CMD}"
 # Configuration
 # ─────────────────────────────────────────────

-CONFIG_DIR="${HOME}/.claude/homunculus"
 OBSERVATIONS_FILE="${PROJECT_DIR}/observations.jsonl"
 MAX_FILE_SIZE_MB=10

-# Skip if disabled
-if [ -f "$CONFIG_DIR/disabled" ]; then
-  exit 0
-fi
-
 # Auto-purge observation files older than 30 days (runs once per session)
 PURGE_MARKER="${PROJECT_DIR}/.last-purge"
 if [ ! -f "$PURGE_MARKER" ] || [ "$(find "$PURGE_MARKER" -mtime +1 2>/dev/null)" ]; then
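Review note on the Layer 5 guard added earlier in this file: the path-exclusion loop is easier to exercise in isolation when it is wrapped in a function. The sketch below is one way to do that, not the commit's implementation. The `path_is_excluded` name is hypothetical, and it swaps the hook's bash-only `read -ra` array for POSIX word splitting inside a subshell so it also runs under plain `sh`. The default pattern list and the whitespace trimming are copied from the diff.

```sh
# Hypothetical helper mirroring Layer 5: exit status 0 means "skip observing".
# $1 = working directory, $2 = optional comma-separated substring patterns.
# The ( ... ) body runs in a subshell, so IFS and set -f do not leak out.
path_is_excluded() (
  cwd=$1
  patterns=${2:-observer-sessions,.claude-mem}

  # An empty working directory never matches, as in the hook.
  [ -n "$cwd" ] || exit 1

  set -f   # disable globbing while the pattern list is split
  IFS=,
  for pattern in $patterns; do
    # Trim leading and trailing whitespace around each pattern.
    pattern="${pattern#"${pattern%%[![:space:]]*}"}"
    pattern="${pattern%"${pattern##*[![:space:]]}"}"
    [ -z "$pattern" ] && continue
    case "$cwd" in
      *"$pattern"*) exit 0 ;;
    esac
  done
  exit 1
)

path_is_excluded "/home/dev/.claude-mem/run-1" && echo "skip"
path_is_excluded "/home/dev/projects/app" "tmp-agents, observer-sessions" || echo "observe"
```

Matching is by plain substring, so a short pattern such as `mem` would also exclude unrelated directories like `/home/dev/memory-game`. Longer, more specific entries in `ECC_OBSERVE_SKIP_PATHS` are the safer choice.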
@@ -161,8 +161,10 @@ resp = requests.post(
            "linkedin": {"text": linkedin_version},
            "threads": {"text": threads_version}
        }
-    }
+    },
+    timeout=30,
 )
+resp.raise_for_status()
 ```

 ### Manual Posting
@@ -24,17 +24,14 @@ Exa MCP server must be configured. Add to `~/.claude.json`:
 ```json
 "exa-web-search": {
  "command": "npx",
-  "args": [
-    "-y",
-    "exa-mcp-server",
-    "tools=web_search_exa,get_code_context_exa,crawling_exa,company_research_exa,linkedin_search_exa,deep_researcher_start,deep_researcher_check"
-  ],
+  "args": ["-y", "exa-mcp-server"],
  "env": { "EXA_API_KEY": "YOUR_EXA_API_KEY_HERE" }
 }
 ```

 Get an API key at [exa.ai](https://exa.ai).
-If you omit the `tools=...` argument, only a smaller default tool set may be enabled.
+This repo's current Exa setup documents the tool surface exposed here: `web_search_exa` and `get_code_context_exa`.
+If your Exa server exposes additional tools, verify their exact names before depending on them in docs or prompts.

 ## Core Tools

@@ -51,29 +48,9 @@ web_search_exa(query: "latest AI developments 2026", numResults: 5)
 |-------|------|---------|-------|
 | `query` | string | required | Search query |
 | `numResults` | number | 8 | Number of results |
-
-### web_search_advanced_exa
-Filtered search with domain and date constraints.
-
-```
-web_search_advanced_exa(
-  query: "React Server Components best practices",
-  numResults: 5,
-  includeDomains: ["github.com", "react.dev"],
-  startPublishedDate: "2025-01-01"
-)
-```
-
-**Parameters:**
-
-| Param | Type | Default | Notes |
-|-------|------|---------|-------|
-| `query` | string | required | Search query |
-| `numResults` | number | 8 | Number of results |
-| `includeDomains` | string[] | none | Limit to specific domains |
-| `excludeDomains` | string[] | none | Exclude specific domains |
-| `startPublishedDate` | string | none | ISO date filter (start) |
-| `endPublishedDate` | string | none | ISO date filter (end) |
+| `type` | string | `auto` | Search mode |
+| `livecrawl` | string | `fallback` | Prefer live crawling when needed |
+| `category` | string | none | Optional focus such as `company` or `research paper` |

 ### get_code_context_exa
 Find code examples and documentation from GitHub, Stack Overflow, and docs sites.
@@ -89,52 +66,6 @@ get_code_context_exa(query: "Python asyncio patterns", tokensNum: 3000)
 | `query` | string | required | Code or API search query |
 | `tokensNum` | number | 5000 | Content tokens (1000-50000) |

-### company_research_exa
-Research companies for business intelligence and news.
-
-```
-company_research_exa(companyName: "Anthropic", numResults: 5)
-```
-
-**Parameters:**
-
-| Param | Type | Default | Notes |
-|-------|------|---------|-------|
-| `companyName` | string | required | Company name |
-| `numResults` | number | 5 | Number of results |
-
-### linkedin_search_exa
-Find professional profiles and company-adjacent people research.
-
-```
-linkedin_search_exa(query: "AI safety researchers at Anthropic", numResults: 5)
-```
-
-### crawling_exa
-Extract full page content from a URL.
-
-```
-crawling_exa(url: "https://example.com/article", tokensNum: 5000)
-```
-
-**Parameters:**
-
-| Param | Type | Default | Notes |
-|-------|------|---------|-------|
-| `url` | string | required | URL to extract |
-| `tokensNum` | number | 5000 | Content tokens |
-
-### deep_researcher_start / deep_researcher_check
-Start an AI research agent that runs asynchronously.
-
-```
-# Start research
-deep_researcher_start(query: "comprehensive analysis of AI code editors in 2026")
-
-# Check status (returns results when complete)
-deep_researcher_check(researchId: "<id from start>")
-```
-
 ## Usage Patterns

 ### Quick Lookup
@@ -147,27 +78,24 @@ web_search_exa(query: "Node.js 22 new features", numResults: 3)
 get_code_context_exa(query: "Rust error handling patterns Result type", tokensNum: 3000)
 ```

-### Company Due Diligence
+### Company or People Research
 ```
-company_research_exa(companyName: "Vercel", numResults: 5)
-web_search_advanced_exa(query: "Vercel funding valuation 2026", numResults: 3)
+web_search_exa(query: "Vercel funding valuation 2026", numResults: 3, category: "company")
+web_search_exa(query: "site:linkedin.com/in AI safety researchers Anthropic", numResults: 5)
 ```

 ### Technical Deep Dive
 ```
-# Start async research
-deep_researcher_start(query: "WebAssembly component model status and adoption")
-# ... do other work ...
-deep_researcher_check(researchId: "<id>")
+web_search_exa(query: "WebAssembly component model status and adoption", numResults: 5)
+get_code_context_exa(query: "WebAssembly component model examples", tokensNum: 4000)
 ```

 ## Tips

-- Use `web_search_exa` for broad queries, `web_search_advanced_exa` for filtered results
+- Use `web_search_exa` for current information, company lookups, and broad discovery
+- Use search operators like `site:`, quoted phrases, and `intitle:` to narrow results
 - Lower `tokensNum` (1000-2000) for focused code snippets, higher (5000+) for comprehensive context
-- Combine `company_research_exa` with `web_search_advanced_exa` for thorough company analysis
-- Use `crawling_exa` to get full content from specific URLs found in search results
-- `deep_researcher_start` is best for comprehensive topics that benefit from AI synthesis
+- Use `get_code_context_exa` when you need API usage or code examples rather than general web pages

 ## Related Skills

@@ -253,7 +253,7 @@ estimate_cost(
  estimate_type: "unit_price",
  endpoints: {
    "fal-ai/nano-banana-pro": {
-      "num_images": 1
+      "unit_quantity": 1
    }
  }
 )