Capture Guide
Overview
VideoDB Capture enables real-time screen and audio recording with AI processing. Desktop capture currently supports macOS only.
For code-level details (SDK methods, event structures, AI pipelines), see capture-reference.md.
Quick Start
- Start WebSocket listener: `python scripts/ws_listener.py --clear &`
- Run capture code (see Complete Capture Workflow below)
- Events written to: `/tmp/videodb_events.jsonl`
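Because events land in a plain JSONL file, other scripts can consume them incrementally. The following is a minimal sketch, assuming the listener writes one JSON object per line. The helper name `read_new_events` is hypothetical and not part of the VideoDB SDK.

```python
import json


def read_new_events(path, offset=0):
    """Return (events, new_offset) for complete lines appended after `offset`."""
    events = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            if not line.endswith(b"\n"):
                # EOF, or a line that is still being written: retry on the next call.
                break
            offset = f.tell()
            if line.strip():
                events.append(json.loads(line))
    return events, offset
```

Calling it in a loop with a short sleep, and passing the returned offset back in, yields each event once without re-reading the whole file.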
Complete Capture Workflow
No webhooks or polling required. WebSocket delivers all events including session lifecycle.
CRITICAL: The `CaptureClient` must remain running for the entire duration of the capture. It runs the local recorder binary that streams screen/audio data to VideoDB. If the Python process that created the `CaptureClient` exits, the recorder binary is killed and capture stops silently. Always run the capture code as a long-lived background process (e.g. `nohup python capture_script.py &`) and use signal handling (`asyncio.Event` + `SIGINT`/`SIGTERM`) to keep it alive until you explicitly stop it.
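The keep-alive pattern described above can be sketched as follows. This is an illustration under stated assumptions, not the SDK's own code: `run_until_signalled` and the `on_stop` hook are hypothetical names, and the hook is only a placeholder for wherever your script stops the capture and shuts the client down (see capture-reference.md for the real method signatures). `loop.add_signal_handler` is a Unix-only asyncio API, which matches the macOS-only desktop capture.

```python
import asyncio
import os
import signal
from pathlib import Path


async def run_until_signalled(pid_file, on_stop=None):
    """Write a PID file, block until SIGINT/SIGTERM, then await the cleanup hook."""
    loop = asyncio.get_running_loop()
    stop = asyncio.Event()
    # Overwrite the PID file on every run so it always names the live process.
    Path(pid_file).write_text(str(os.getpid()))
    signals = (signal.SIGINT, signal.SIGTERM)
    for sig in signals:
        loop.add_signal_handler(sig, stop.set)
    try:
        await stop.wait()  # keeps the process (and the recorder it owns) alive
    finally:
        for sig in signals:
            loop.remove_signal_handler(sig)
        if on_stop is not None:
            # Hypothetical hook: stop the capture and shut the client down here.
            await on_stop()
```

A capture script would typically end with something like `asyncio.run(run_until_signalled("/tmp/videodb_capture_pid", on_stop=cleanup))`, where `cleanup` is your own coroutine.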
- Start WebSocket listener in background with the `--clear` flag to clear old events. Wait for it to create the WebSocket ID file.
- Read the WebSocket ID. This ID is required for the capture session and AI pipelines.
- Create a capture session and generate a client token for the desktop client.
- Initialize `CaptureClient` with the token. Request permissions for microphone and screen capture.
- List and select channels (mic, display, system_audio). Set `store = True` on channels you want to persist as a video.
- Start the session with selected channels.
- Wait for session active by reading events until you see `capture_session.active`. This event contains the `rtstreams` array. Save session info (session ID, RTStream IDs) to a file (e.g. `/tmp/videodb_capture_info.json`) so other scripts can read it.
- Keep the process alive. Use `asyncio.Event` with signal handlers for `SIGINT`/`SIGTERM` to block until explicitly stopped. Write a PID file (e.g. `/tmp/videodb_capture_pid`) so the process can be stopped later with `kill $(cat /tmp/videodb_capture_pid)`. The PID file should be overwritten on every run so reruns always have the correct PID.
- Start AI pipelines (in a separate command/script) on each RTStream for audio indexing and visual indexing. Read the RTStream IDs from the saved session info file.
- Write custom event processing logic (in a separate command/script) to read real-time events based on your use case. Examples:
  - Log Slack activity when `visual_index` mentions "Slack"
  - Summarize discussions when `audio_index` events arrive
  - Trigger alerts when specific keywords appear in `transcript`
  - Track application usage from screen descriptions
- Stop capture when done: send SIGTERM to the capture process. It should call `client.stop_capture()` and `client.shutdown()` in its signal handler.
- Wait for export by reading events until you see `capture_session.exported`. This event contains `exported_video_id`, `stream_url`, and `player_url`. This may take several seconds after stopping capture.
- Stop WebSocket listener after receiving the export event. Use `kill $(cat /tmp/videodb_ws_pid)` to cleanly terminate it.
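The two "wait for event" steps in the workflow above can share one helper. The sketch below assumes each JSONL line is a JSON object that carries its event name under an `"event"` key. That key name is an assumption, so check capture-reference.md for the actual event structure and adjust `key` accordingly. `wait_for_event` and `save_session_info` are hypothetical helper names.

```python
import json
import time
from pathlib import Path


def wait_for_event(events_file, name, key="event", timeout=60.0, poll=0.5):
    """Poll a JSONL file until an event whose `key` field equals `name` appears."""
    deadline = time.monotonic() + timeout
    path = Path(events_file)
    while True:
        if path.exists():
            for line in path.read_text().splitlines():
                try:
                    event = json.loads(line)
                except json.JSONDecodeError:
                    continue  # blank or partially written line
                if isinstance(event, dict) and event.get(key) == name:
                    return event
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{name} not seen within {timeout}s")
        time.sleep(poll)


def save_session_info(event, info_file):
    """Persist the whole event so other scripts can read the IDs they need."""
    Path(info_file).write_text(json.dumps(event, indent=2))
```

The helper returns the first matching line in the file, so a stale event from an earlier session would match too. Starting the listener with `--clear` avoids this.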
Shutdown Sequence
Proper shutdown order is important to ensure all events are captured:
- Stop the capture session: `client.stop_capture()` then `client.shutdown()`
- Wait for export event: poll `/tmp/videodb_events.jsonl` for `capture_session.exported`
- Stop the WebSocket listener: `kill $(cat /tmp/videodb_ws_pid)`
Do NOT kill the WebSocket listener before receiving the export event, or you will miss the final video URLs.
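The ordering above can be driven from a separate controller script. This is a sketch under assumptions: the capture process handles SIGTERM by stopping the capture and shutting the client down, and each event line exposes its name under an `"event"` key (an assumed field name). `shutdown_in_order` and its `kill` parameter are hypothetical; `kill` defaults to `os.kill` and exists only so the function can be exercised without real processes.

```python
import json
import os
import signal
import time
from pathlib import Path


def _read_pid(pid_file):
    return int(Path(pid_file).read_text().strip())


def _has_event(events_file, name, key):
    path = Path(events_file)
    if not path.exists():
        return False
    for line in path.read_text().splitlines():
        try:
            event = json.loads(line)
        except json.JSONDecodeError:
            continue  # blank or partially written line
        if isinstance(event, dict) and event.get(key) == name:
            return True
    return False


def shutdown_in_order(capture_pid_file, ws_pid_file, events_file,
                      key="event", timeout=120.0, poll=0.5, kill=os.kill):
    """Stop capture, wait for the export event, and only then stop the listener."""
    # 1. Ask the capture process to stop; its signal handler does the cleanup.
    kill(_read_pid(capture_pid_file), signal.SIGTERM)
    # 2. Wait for the export event while the listener is still recording events.
    deadline = time.monotonic() + timeout
    while not _has_event(events_file, "capture_session.exported", key):
        if time.monotonic() >= deadline:
            raise TimeoutError("export event not seen; listener left running")
        time.sleep(poll)
    # 3. The final URLs are on disk, so the listener can go.
    kill(_read_pid(ws_pid_file), signal.SIGTERM)
```

On timeout the listener is deliberately left running, so a late export event can still be captured.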
Scripts
| Script | Description |
|---|---|
| `scripts/ws_listener.py` | WebSocket event listener (dumps to JSONL) |
ws_listener.py Usage
# Start listener in background (append to existing events)
python scripts/ws_listener.py &
# Start listener with clear (new session, clears old events)
python scripts/ws_listener.py --clear &
# Custom output directory
python scripts/ws_listener.py --clear /path/to/events &
# Stop the listener
kill $(cat /tmp/videodb_ws_pid)
Options:
--clear: Clear the events file before starting. Use when starting a new capture session.
Output files:
- `videodb_events.jsonl` - All WebSocket events
- `videodb_ws_id` - WebSocket connection ID (for `ws_connection_id` parameter)
- `videodb_ws_pid` - Process ID (for stopping the listener)
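A capture script needs the connection ID before it can create a session, and the listener may take a moment to write it. The sketch below waits for the ID file to appear and be non-empty. The helper name `wait_for_ws_id` and the timeout values are assumptions, and the default directory `/tmp` is inferred from the paths used elsewhere in this guide.

```python
import time
from pathlib import Path


def wait_for_ws_id(output_dir="/tmp", timeout=15.0, poll=0.2):
    """Return the WebSocket connection ID once the listener has written it."""
    id_file = Path(output_dir) / "videodb_ws_id"
    deadline = time.monotonic() + timeout
    while True:
        if id_file.exists():
            ws_id = id_file.read_text().strip()
            if ws_id:  # ignore a file that exists but is still empty
                return ws_id
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{id_file} not written within {timeout}s")
        time.sleep(poll)
```

A file left over from an earlier run would satisfy this check immediately. This guide does not say whether `--clear` removes it, so deleting the old ID file before starting the listener is the cautious choice.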
Features:
- Auto-reconnect with exponential backoff on connection drops
- Graceful shutdown on SIGINT/SIGTERM
- PID file for easy process management
- Connection status logging
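For readers unfamiliar with the reconnect strategy named above, exponential backoff with a cap can be sketched as a delay generator. This illustrates the general technique only. The base, factor, and cap values are assumptions, not the settings `ws_listener.py` actually uses.

```python
import itertools


def backoff_delays(base=1.0, factor=2.0, cap=30.0):
    """Yield reconnect delays that grow geometrically up to `cap` seconds."""
    delay = base
    while True:
        yield min(delay, cap)
        delay = min(delay * factor, cap)


# First few delays with the illustrative defaults.
first_delays = list(itertools.islice(backoff_delays(), 7))
```

A reconnect loop would sleep for the next delay after each failed attempt and create a fresh generator once a connection succeeds.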