mirror of
https://github.com/affaan-m/everything-claude-code.git
synced 2026-04-01 22:53:27 +08:00
568 lines
15 KiB
Markdown
568 lines
15 KiB
Markdown
# RTStream 参考
|
||
|
||
RTStream 操作的代码级详情。工作流程指南请参阅 [rtstream.md](rtstream.md)。
|
||
有关使用指导和流程选择,请从 [../SKILL.md](../SKILL.md) 开始。
|
||
|
||
基于 [docs.videodb.io](https://docs.videodb.io/pages/ingest/live-streams/realtime-apis.md)。
|
||
|
||
***
|
||
|
||
## Collection RTStream 方法
|
||
|
||
`Collection` 上用于管理 RTStream 的方法:
|
||
|
||
| 方法 | 返回 | 描述 |
|
||
|--------|---------|-------------|
|
||
| `coll.connect_rtstream(url, name, ...)` | `RTStream` | 从 RTSP/RTMP URL 创建新的 RTStream |
|
||
| `coll.get_rtstream(id)` | `RTStream` | 通过 ID 获取现有的 RTStream |
|
||
| `coll.list_rtstreams(limit, offset, status, name, ordering)` | `List[RTStream]` | 列出集合中的所有 RTStream |
|
||
| `coll.search(query, namespace="rtstream")` | `RTStreamSearchResult` | 在所有 RTStream 中搜索 |
|
||
|
||
### 连接 RTStream
|
||
|
||
```python
|
||
import videodb
|
||
|
||
conn = videodb.connect()
|
||
coll = conn.get_collection()
|
||
|
||
rtstream = coll.connect_rtstream(
|
||
url="rtmp://your-stream-server/live/stream-key",
|
||
name="My Live Stream",
|
||
media_types=["video"], # or ["audio", "video"]
|
||
sample_rate=30, # optional
|
||
store=True, # enable recording storage for export
|
||
enable_transcript=True, # optional
|
||
ws_connection_id=ws_id, # optional, for real-time events
|
||
)
|
||
```
|
||
|
||
### 获取现有 RTStream
|
||
|
||
```python
|
||
rtstream = coll.get_rtstream("rts-xxx")
|
||
```
|
||
|
||
### 列出 RTStream
|
||
|
||
```python
|
||
rtstreams = coll.list_rtstreams(
|
||
limit=10,
|
||
offset=0,
|
||
status="connected", # optional filter
|
||
name="meeting", # optional filter
|
||
ordering="-created_at",
|
||
)
|
||
|
||
for rts in rtstreams:
|
||
print(f"{rts.id}: {rts.name} - {rts.status}")
|
||
```
|
||
|
||
### 从捕获会话获取
|
||
|
||
捕获会话激活后,检索 RTStream 对象:
|
||
|
||
```python
|
||
session = conn.get_capture_session(session_id)
|
||
|
||
mics = session.get_rtstream("mic")
|
||
displays = session.get_rtstream("screen")
|
||
system_audios = session.get_rtstream("system_audio")
|
||
```
|
||
|
||
或使用 `capture_session.active` WebSocket 事件中的 `rtstreams` 数据:
|
||
|
||
```python
|
||
for rts in rtstreams:
|
||
rtstream = coll.get_rtstream(rts["rtstream_id"])
|
||
```
|
||
|
||
***
|
||
|
||
## RTStream 方法
|
||
|
||
| 方法 | 返回 | 描述 |
|
||
|--------|---------|-------------|
|
||
| `rtstream.start()` | `None` | 开始摄取 |
|
||
| `rtstream.stop()` | `None` | 停止摄取 |
|
||
| `rtstream.generate_stream(start, end)` | `str` | 流式传输录制的片段(Unix 时间戳) |
|
||
| `rtstream.export(name=None)` | `RTStreamExportResult` | 导出为永久视频 |
|
||
| `rtstream.index_visuals(prompt, ...)` | `RTStreamSceneIndex` | 创建带 AI 分析的视觉索引 |
|
||
| `rtstream.index_audio(prompt, ...)` | `RTStreamSceneIndex` | 创建带 LLM 摘要的音频索引 |
|
||
| `rtstream.list_scene_indexes()` | `List[RTStreamSceneIndex]` | 列出流上的所有场景索引 |
|
||
| `rtstream.get_scene_index(index_id)` | `RTStreamSceneIndex` | 获取特定场景索引 |
|
||
| `rtstream.search(query, ...)` | `RTStreamSearchResult` | 搜索索引内容 |
|
||
| `rtstream.start_transcript(ws_connection_id, engine)` | `dict` | 开始实时转录 |
|
||
| `rtstream.get_transcript(page, page_size, start, end, since)` | `dict` | 获取转录页面 |
|
||
| `rtstream.stop_transcript(engine)` | `dict` | 停止转录 |
|
||
|
||
***
|
||
|
||
## 启动和停止
|
||
|
||
```python
|
||
# Begin ingestion
|
||
rtstream.start()
|
||
|
||
# ... stream is being recorded ...
|
||
|
||
# Stop ingestion
|
||
rtstream.stop()
|
||
```
|
||
|
||
***
|
||
|
||
## 生成流
|
||
|
||
使用 Unix 时间戳(而非秒数偏移)从录制内容生成播放流:
|
||
|
||
```python
|
||
import time
|
||
|
||
start_ts = time.time()
|
||
rtstream.start()
|
||
|
||
# Let it record for a while...
|
||
time.sleep(60)
|
||
|
||
end_ts = time.time()
|
||
rtstream.stop()
|
||
|
||
# Generate a stream URL for the recorded segment
|
||
stream_url = rtstream.generate_stream(start=start_ts, end=end_ts)
|
||
print(f"Recorded stream: {stream_url}")
|
||
```
|
||
|
||
***
|
||
|
||
## 导出为视频
|
||
|
||
将录制的流导出为集合中的永久视频:
|
||
|
||
```python
|
||
export_result = rtstream.export(name="Meeting Recording 2024-01-15")
|
||
|
||
print(f"Video ID: {export_result.video_id}")
|
||
print(f"Stream URL: {export_result.stream_url}")
|
||
print(f"Player URL: {export_result.player_url}")
|
||
print(f"Duration: {export_result.duration}s")
|
||
```
|
||
|
||
### RTStreamExportResult 属性
|
||
|
||
| 属性 | 类型 | 描述 |
|
||
|----------|------|-------------|
|
||
| `video_id` | `str` | 导出视频的 ID |
|
||
| `stream_url` | `str` | HLS 流 URL |
|
||
| `player_url` | `str` | Web 播放器 URL |
|
||
| `name` | `str` | 视频名称 |
|
||
| `duration` | `float` | 时长(秒) |
|
||
|
||
***
|
||
|
||
## AI 管道
|
||
|
||
AI 管道处理实时流并通过 WebSocket 发送结果。
|
||
|
||
### RTStream AI 管道方法
|
||
|
||
| 方法 | 返回 | 描述 |
|
||
|--------|---------|-------------|
|
||
| `rtstream.index_audio(prompt, batch_config, ...)` | `RTStreamSceneIndex` | 开始带 LLM 摘要的音频索引 |
|
||
| `rtstream.index_visuals(prompt, batch_config, ...)` | `RTStreamSceneIndex` | 开始屏幕内容的视觉索引 |
|
||
|
||
### 音频索引
|
||
|
||
以一定间隔生成音频内容的 LLM 摘要:
|
||
|
||
```python
|
||
audio_index = rtstream.index_audio(
|
||
prompt="Summarize what is being discussed",
|
||
batch_config={"type": "word", "value": 50},
|
||
model_name=None, # optional
|
||
name="meeting_audio", # optional
|
||
ws_connection_id=ws_id,
|
||
)
|
||
```
|
||
|
||
**音频 batch\_config 选项:**
|
||
|
||
| 类型 | 值 | 描述 |
|
||
|------|-------|-------------|
|
||
| `"word"` | count | 每 N 个词分段 |
|
||
| `"sentence"` | count | 每 N 个句子分段 |
|
||
| `"time"` | seconds | 每 N 秒分段 |
|
||
|
||
示例:
|
||
|
||
```python
|
||
{"type": "word", "value": 50} # every 50 words
|
||
{"type": "sentence", "value": 5} # every 5 sentences
|
||
{"type": "time", "value": 30} # every 30 seconds
|
||
```
|
||
|
||
结果通过 `audio_index` WebSocket 通道送达。
|
||
|
||
### 视觉索引
|
||
|
||
生成视觉内容的 AI 描述:
|
||
|
||
```python
|
||
scene_index = rtstream.index_visuals(
|
||
prompt="Describe what is happening on screen",
|
||
batch_config={"type": "time", "value": 2, "frame_count": 5},
|
||
model_name="basic",
|
||
name="screen_monitor", # optional
|
||
ws_connection_id=ws_id,
|
||
)
|
||
```
|
||
|
||
**参数:**
|
||
|
||
| 参数 | 类型 | 描述 |
|
||
|-----------|------|-------------|
|
||
| `prompt` | `str` | AI 模型的指令(支持结构化 JSON 输出) |
|
||
| `batch_config` | `dict` | 控制帧采样(见下文) |
|
||
| `model_name` | `str` | 模型层级:`"mini"`、`"basic"`、`"pro"`、`"ultra"` |
|
||
| `name` | `str` | 索引名称(可选) |
|
||
| `ws_connection_id` | `str` | 用于接收结果的 WebSocket 连接 ID |
|
||
|
||
**视觉 batch\_config:**
|
||
|
||
| 键 | 类型 | 描述 |
|
||
|-----|------|-------------|
|
||
| `type` | `str` | 仅 `"time"` 支持视觉索引 |
|
||
| `value` | `int` | 窗口大小(秒) |
|
||
| `frame_count` | `int` | 每个窗口提取的帧数 |
|
||
|
||
示例:`{"type": "time", "value": 2, "frame_count": 5}` 每 2 秒采样 5 帧并将其发送到模型。
|
||
|
||
**结构化 JSON 输出:**
|
||
|
||
使用请求 JSON 格式的提示语以获得结构化响应:
|
||
|
||
```python
|
||
scene_index = rtstream.index_visuals(
|
||
prompt="""Analyze the screen and return a JSON object with:
|
||
{
|
||
"app_name": "name of the active application",
|
||
"activity": "what the user is doing",
|
||
"ui_elements": ["list of visible UI elements"],
|
||
"contains_text": true/false,
|
||
"dominant_colors": ["list of main colors"]
|
||
}
|
||
Return only valid JSON.""",
|
||
batch_config={"type": "time", "value": 3, "frame_count": 3},
|
||
model_name="pro",
|
||
ws_connection_id=ws_id,
|
||
)
|
||
```
|
||
|
||
结果通过 `scene_index` WebSocket 通道送达。
|
||
|
||
***
|
||
|
||
## 批处理配置摘要
|
||
|
||
| 索引类型 | `type` 选项 | `value` | 额外键 |
|
||
|---------------|----------------|---------|------------|
|
||
| **音频** | `"word"`、`"sentence"`、`"time"` | words/sentences/seconds | - |
|
||
| **视觉** | 仅 `"time"` | seconds | `frame_count` |
|
||
|
||
示例:
|
||
|
||
```python
|
||
# Audio: every 50 words
|
||
{"type": "word", "value": 50}
|
||
|
||
# Audio: every 30 seconds
|
||
{"type": "time", "value": 30}
|
||
|
||
# Visual: 5 frames every 2 seconds
|
||
{"type": "time", "value": 2, "frame_count": 5}
|
||
```
|
||
|
||
***
|
||
|
||
## 转录
|
||
|
||
通过 WebSocket 进行实时转录:
|
||
|
||
```python
|
||
# Start live transcription
|
||
rtstream.start_transcript(
|
||
ws_connection_id=ws_id,
|
||
engine=None, # optional, defaults to "assemblyai"
|
||
)
|
||
|
||
# Get transcript pages (with optional filters)
|
||
transcript = rtstream.get_transcript(
|
||
page=1,
|
||
page_size=100,
|
||
start=None, # optional: start timestamp filter
|
||
end=None, # optional: end timestamp filter
|
||
since=None, # optional: for polling, get transcripts after this timestamp
|
||
engine=None,
|
||
)
|
||
|
||
# Stop transcription
|
||
rtstream.stop_transcript(engine=None)
|
||
```
|
||
|
||
转录结果通过 `transcript` WebSocket 通道送达。
|
||
|
||
***
|
||
|
||
## RTStreamSceneIndex
|
||
|
||
当您调用 `index_audio()` 或 `index_visuals()` 时,该方法返回一个 `RTStreamSceneIndex` 对象。此对象表示正在运行的索引,并提供用于管理场景和警报的方法。
|
||
|
||
```python
|
||
# index_visuals returns an RTStreamSceneIndex
|
||
scene_index = rtstream.index_visuals(
|
||
prompt="Describe what is on screen",
|
||
ws_connection_id=ws_id,
|
||
)
|
||
|
||
# index_audio also returns an RTStreamSceneIndex
|
||
audio_index = rtstream.index_audio(
|
||
prompt="Summarize the discussion",
|
||
ws_connection_id=ws_id,
|
||
)
|
||
```
|
||
|
||
### RTStreamSceneIndex 属性
|
||
|
||
| 属性 | 类型 | 描述 |
|
||
|----------|------|-------------|
|
||
| `rtstream_index_id` | `str` | 索引的唯一 ID |
|
||
| `rtstream_id` | `str` | 父 RTStream 的 ID |
|
||
| `extraction_type` | `str` | 提取类型(`time` 或 `transcript`) |
|
||
| `extraction_config` | `dict` | 提取配置 |
|
||
| `prompt` | `str` | 用于分析的提示语 |
|
||
| `name` | `str` | 索引名称 |
|
||
| `status` | `str` | 状态(`connected`、`stopped`) |
|
||
|
||
### RTStreamSceneIndex 方法
|
||
|
||
| 方法 | 返回 | 描述 |
|
||
|--------|---------|-------------|
|
||
| `index.get_scenes(start, end, page, page_size)` | `dict` | 获取已索引的场景 |
|
||
| `index.start()` | `None` | 启动/恢复索引 |
|
||
| `index.stop()` | `None` | 停止索引 |
|
||
| `index.create_alert(event_id, callback_url, ws_connection_id)` | `str` | 创建事件检测警报 |
|
||
| `index.list_alerts()` | `list` | 列出此索引上的所有警报 |
|
||
| `index.enable_alert(alert_id)` | `None` | 启用警报 |
|
||
| `index.disable_alert(alert_id)` | `None` | 禁用警报 |
|
||
|
||
### 获取场景
|
||
|
||
从索引轮询已索引的场景:
|
||
|
||
```python
|
||
result = scene_index.get_scenes(
|
||
start=None, # optional: start timestamp
|
||
end=None, # optional: end timestamp
|
||
page=1,
|
||
page_size=100,
|
||
)
|
||
|
||
for scene in result["scenes"]:
|
||
print(f"[{scene['start']}-{scene['end']}] {scene['text']}")
|
||
|
||
if result["next_page"]:
|
||
# fetch next page
|
||
pass
|
||
```
|
||
|
||
### 管理场景索引
|
||
|
||
```python
|
||
# List all indexes on the stream
|
||
indexes = rtstream.list_scene_indexes()
|
||
|
||
# Get a specific index by ID
|
||
scene_index = rtstream.get_scene_index(index_id)
|
||
|
||
# Stop an index
|
||
scene_index.stop()
|
||
|
||
# Restart an index
|
||
scene_index.start()
|
||
```
|
||
|
||
***
|
||
|
||
## 事件
|
||
|
||
事件是可重用的检测规则。创建一次,即可通过警报附加到任何索引。
|
||
|
||
### 连接事件方法
|
||
|
||
| 方法 | 返回 | 描述 |
|
||
|--------|---------|-------------|
|
||
| `conn.create_event(event_prompt, label)` | `str` (event\_id) | 创建检测事件 |
|
||
| `conn.list_events()` | `list` | 列出所有事件 |
|
||
|
||
### 创建事件
|
||
|
||
```python
|
||
event_id = conn.create_event(
|
||
event_prompt="User opened Slack application",
|
||
label="slack_opened",
|
||
)
|
||
```
|
||
|
||
### 列出事件
|
||
|
||
```python
|
||
events = conn.list_events()
|
||
for event in events:
|
||
print(f"{event['event_id']}: {event['label']}")
|
||
```
|
||
|
||
***
|
||
|
||
## 警报
|
||
|
||
警报将事件连接到索引以实现实时通知。当 AI 检测到与事件描述匹配的内容时,会发送警报。
|
||
|
||
### 创建警报
|
||
|
||
```python
|
||
# Get the RTStreamSceneIndex from index_visuals
|
||
scene_index = rtstream.index_visuals(
|
||
prompt="Describe what application is open on screen",
|
||
ws_connection_id=ws_id,
|
||
)
|
||
|
||
# Create an alert on the index
|
||
alert_id = scene_index.create_alert(
|
||
event_id=event_id,
|
||
callback_url="https://your-backend.com/alerts", # for webhook delivery
|
||
ws_connection_id=ws_id, # for WebSocket delivery (optional)
|
||
)
|
||
```
|
||
|
||
**注意:** `callback_url` 是必需的。如果仅使用 WebSocket 交付,请传递空字符串 `""`。
|
||
|
||
### 管理警报
|
||
|
||
```python
|
||
# List all alerts on an index
|
||
alerts = scene_index.list_alerts()
|
||
|
||
# Enable/disable alerts
|
||
scene_index.disable_alert(alert_id)
|
||
scene_index.enable_alert(alert_id)
|
||
```
|
||
|
||
### 警报交付
|
||
|
||
| 方法 | 延迟 | 使用场景 |
|
||
|--------|---------|----------|
|
||
| WebSocket | 实时 | 仪表板、实时 UI |
|
||
| Webhook | < 1 秒 | 服务器到服务器、自动化 |
|
||
|
||
### WebSocket 警报事件
|
||
|
||
```json
|
||
{
|
||
"channel": "alert",
|
||
"rtstream_id": "rts-xxx",
|
||
"data": {
|
||
"event_label": "slack_opened",
|
||
"timestamp": 1710000012340,
|
||
"text": "User opened Slack application"
|
||
}
|
||
}
|
||
```
|
||
|
||
### Webhook 负载
|
||
|
||
```json
|
||
{
|
||
"event_id": "event-xxx",
|
||
"label": "slack_opened",
|
||
"confidence": 0.95,
|
||
"explanation": "User opened the Slack application",
|
||
"timestamp": "2024-01-15T10:30:45Z",
|
||
"start_time": 1234.5,
|
||
"end_time": 1238.0,
|
||
"stream_url": "https://stream.videodb.io/v3/...",
|
||
"player_url": "https://console.videodb.io/player?url=..."
|
||
}
|
||
```
|
||
|
||
***
|
||
|
||
## WebSocket 集成
|
||
|
||
所有实时 AI 结果均通过 WebSocket 交付。将 `ws_connection_id` 传递给:
|
||
|
||
* `rtstream.start_transcript()`
|
||
* `rtstream.index_audio()`
|
||
* `rtstream.index_visuals()`
|
||
* `scene_index.create_alert()`
|
||
|
||
### WebSocket 通道
|
||
|
||
| 通道 | 来源 | 内容 |
|
||
|---------|--------|---------|
|
||
| `transcript` | `start_transcript()` | 实时语音转文本 |
|
||
| `scene_index` | `index_visuals()` | 视觉分析结果 |
|
||
| `audio_index` | `index_audio()` | 音频分析结果 |
|
||
| `alert` | `create_alert()` | 警报通知 |
|
||
|
||
有关 WebSocket 事件结构和 ws\_listener 用法,请参阅 [capture-reference.md](capture-reference.md)。
|
||
|
||
***
|
||
|
||
## 完整工作流程
|
||
|
||
```python
|
||
import time
|
||
import videodb
|
||
from videodb.exceptions import InvalidRequestError
|
||
|
||
conn = videodb.connect()
|
||
coll = conn.get_collection()
|
||
|
||
# 1. Connect and start recording
|
||
rtstream = coll.connect_rtstream(
|
||
url="rtmp://your-stream-server/live/stream-key",
|
||
name="Weekly Standup",
|
||
store=True,
|
||
)
|
||
rtstream.start()
|
||
|
||
# 2. Record for the duration of the meeting
|
||
start_ts = time.time()
|
||
time.sleep(1800) # 30 minutes
|
||
end_ts = time.time()
|
||
rtstream.stop()
|
||
|
||
# Generate an immediate playback URL for the captured window
|
||
stream_url = rtstream.generate_stream(start=start_ts, end=end_ts)
|
||
print(f"Recorded stream: {stream_url}")
|
||
|
||
# 3. Export to a permanent video
|
||
export_result = rtstream.export(name="Weekly Standup Recording")
|
||
print(f"Exported video: {export_result.video_id}")
|
||
|
||
# 4. Index the exported video for search
|
||
video = coll.get_video(export_result.video_id)
|
||
video.index_spoken_words(force=True)
|
||
|
||
# 5. Search for action items
|
||
try:
|
||
results = video.search("action items and next steps")
|
||
stream_url = results.compile()
|
||
print(f"Action items clip: {stream_url}")
|
||
except InvalidRequestError as exc:
|
||
if "No results found" in str(exc):
|
||
print("No action items were detected in the recording.")
|
||
else:
|
||
raise
|
||
```
|