Search & Indexing Guide
Search allows you to find specific moments inside videos using natural language queries, exact keywords, or visual scene descriptions.
Prerequisites
Videos must be indexed before they can be searched. Indexing is a one-time operation per video per index type.
Indexing
Spoken Word Index
Index the transcribed speech content of a video for semantic and keyword search:
video = coll.get_video(video_id)
# force=True makes indexing idempotent — skips if already indexed
video.index_spoken_words(force=True)
This transcribes the audio track and builds a searchable index over the spoken content. Required for semantic search and keyword search.
Parameters:
| Parameter | Type | Default | Description |
|---|---|---|---|
| `language_code` | `str \| None` | `None` | Language code of the video |
| `segmentation_type` | `SegmentationType` | `SegmentationType.sentence` | Segmentation type (`sentence` or `llm`) |
| `force` | `bool` | `False` | Set to `True` to skip if already indexed (avoids "already exists" error) |
| `callback_url` | `str \| None` | `None` | Webhook URL for async notification |
Scene Index
Index visual content by generating AI descriptions of scenes. Like spoken word indexing, this raises an error if a scene index already exists. Extract the existing scene_index_id from the error message.
import re
from videodb import SceneExtractionType

try:
    scene_index_id = video.index_scenes(
        extraction_type=SceneExtractionType.shot_based,
        prompt="Describe the visual content, objects, actions, and setting in this scene.",
    )
except Exception as e:
    match = re.search(r"id\s+([a-f0-9]+)", str(e))
    if match:
        scene_index_id = match.group(1)
    else:
        raise
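The try/except pattern above can be wrapped in a small reusable helper. The following is a minimal sketch, assuming the error message names the existing index as `id <hex>` (which is what the regex above implies); `extract_scene_index_id` and `get_or_create_scene_index` are hypothetical names, not part of the SDK.

```python
import re
from typing import Any, Optional

# Same pattern as the example above; the exact error wording is an assumption.
SCENE_INDEX_ID_PATTERN = re.compile(r"id\s+([a-f0-9]+)")


def extract_scene_index_id(message: str) -> Optional[str]:
    """Return the hex ID found in an error message, or None if there is none."""
    match = SCENE_INDEX_ID_PATTERN.search(message)
    return match.group(1) if match else None


def get_or_create_scene_index(video: Any, **index_kwargs: Any) -> str:
    """Call video.index_scenes(); if it fails, reuse the ID named in the error."""
    try:
        return video.index_scenes(**index_kwargs)
    except Exception as exc:
        existing_id = extract_scene_index_id(str(exc))
        if existing_id is None:
            raise  # The error does not name an index ID, so re-raise it.
        return existing_id
```

Because the helper catches any exception whose message matches the pattern, it may be worth narrowing the `except` clause once you know which exception type the SDK raises in your version.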
Extraction types:
| Type | Description | Best For |
|---|---|---|
| `SceneExtractionType.shot_based` | Splits on visual shot boundaries | General purpose, action content |
| `SceneExtractionType.time_based` | Splits at fixed intervals | Uniform sampling, long static content |
| `SceneExtractionType.transcript` | Splits based on transcript segments | Speech-driven scene boundaries |
Parameters for time_based:
video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 5, "select_frames": ["first", "last"]},
    prompt="Describe what is happening in this scene.",
)
Search Types
Semantic Search
Natural language queries matched against spoken content:
from videodb import SearchType
results = video.search(
    query="explaining the benefits of machine learning",
    search_type=SearchType.semantic,
)
Returns ranked segments where the spoken content semantically matches the query.
Keyword Search
Exact term matching in transcribed speech:
results = video.search(
    query="artificial intelligence",
    search_type=SearchType.keyword,
)
Returns segments containing the exact keyword or phrase.
Scene Search
Visual content queries matched against indexed scene descriptions. Requires a prior index_scenes() call.
index_scenes() returns a scene_index_id. Pass it to video.search() to target a specific scene index (especially important when a video has multiple scene indexes):
from videodb import SearchType, IndexType
from videodb.exceptions import InvalidRequestError
# Search using semantic search against the scene index.
# Use score_threshold to filter low-relevance noise (recommended: 0.3+).
try:
    results = video.search(
        query="person writing on a whiteboard",
        search_type=SearchType.semantic,
        index_type=IndexType.scene,
        scene_index_id=scene_index_id,
        score_threshold=0.3,
    )
    shots = results.get_shots()
except InvalidRequestError as e:
    if "No results found" in str(e):
        shots = []
    else:
        raise
Important notes:
- Use `SearchType.semantic` with `index_type=IndexType.scene`. This is the most reliable combination and works on all plans.
- `SearchType.scene` exists but may not be available on all plans (e.g. Free tier). Prefer `SearchType.semantic` with `IndexType.scene`.
- The `scene_index_id` parameter is optional. If omitted, the search runs against all scene indexes on the video. Pass it to target a specific index.
- You can create multiple scene indexes per video (with different prompts or extraction types) and search them independently using `scene_index_id`.
Scene Search with Metadata Filtering
When indexing scenes with custom metadata, you can combine semantic search with metadata filters:
from videodb import SearchType, IndexType
results = video.search(
    query="a skillful chasing scene",
    search_type=SearchType.semantic,
    index_type=IndexType.scene,
    scene_index_id=scene_index_id,
    filter=[{"camera_view": "road_ahead"}, {"action_type": "chasing"}],
)
See the scene_level_metadata_indexing cookbook for a full example of custom metadata indexing and filtered search.
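If your filter criteria live in a plain dictionary, a small helper can produce the list-of-single-key-dicts shape used in the example above. This is only a sketch: `build_metadata_filter` is a hypothetical name, and it assumes the list shape shown above is all that `filter` requires. How multiple entries are combined is not stated here, so check the cookbook before relying on it.

```python
from typing import Any, Dict, List


def build_metadata_filter(criteria: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Turn {"key": value, ...} into a [{"key": value}, ...] list, keeping key order."""
    return [{key: value} for key, value in criteria.items()]


# Reproduces the filter from the example above.
chase_filter = build_metadata_filter(
    {"camera_view": "road_ahead", "action_type": "chasing"}
)
```

The result can then be passed as `filter=chase_filter` in the `video.search()` call.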
Working with Results
Get Shots
Access individual result segments:
results = video.search("your query")
for shot in results.get_shots():
    print(f"Video: {shot.video_id}")
    print(f"Start: {shot.start:.2f}s")
    print(f"End: {shot.end:.2f}s")
    print(f"Text: {shot.text}")
    print("---")
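For logs or reports, shot boundaries are often easier to read as timestamps. The sketch below relies only on the `start`, `end`, and `text` attributes used above; `format_timestamp` and `summarize_shots` are hypothetical helper names, not SDK functions.

```python
from typing import Any, Iterable, List


def format_timestamp(seconds: float) -> str:
    """Format a non-negative offset in seconds as HH:MM:SS."""
    total = int(seconds)  # Drop the fractional part.
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def summarize_shots(shots: Iterable[Any]) -> List[str]:
    """Return one summary line per shot, ordered by start time."""
    ordered = sorted(shots, key=lambda shot: shot.start)
    return [
        f"{format_timestamp(shot.start)}-{format_timestamp(shot.end)} "
        f"({shot.end - shot.start:.1f}s): {shot.text}"
        for shot in ordered
    ]
```

A typical call would be `print("\n".join(summarize_shots(results.get_shots())))`.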
Play Compiled Results
Stream all matching segments as a single compiled video:
results = video.search("your query")
stream_url = results.compile()
results.play() # opens compiled stream in browser
Extract Clips
Generate a stream URL for each result segment:
for shot in results.get_shots():
    stream_url = shot.generate_stream()
    print(f"Clip: {stream_url}")
Cross-Collection Search
Search across all videos in a collection:
coll = conn.get_collection()
# Search across all videos in the collection
results = coll.search(
    query="product demo",
    search_type=SearchType.semantic,
)
for shot in results.get_shots():
    print(f"Video: {shot.video_id} [{shot.start:.1f}s - {shot.end:.1f}s]")
Note: Collection-level search only supports `SearchType.semantic`. Using `SearchType.keyword` or `SearchType.scene` with `coll.search()` will raise `NotImplementedError`. For keyword or scene search, use `video.search()` on individual videos instead.
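One way to follow that advice is to loop over the videos and group the matching shots by video. The sketch below assumes `coll.get_videos()` returns the collection's video objects; `search_each_video` is a hypothetical helper name. It catches a broad `Exception` so the snippet runs without the SDK installed; in real code you would likely narrow this to `InvalidRequestError`.

```python
from collections import defaultdict
from typing import Any, Dict, List


def search_each_video(coll: Any, query: str, **search_kwargs: Any) -> Dict[str, List[Any]]:
    """Run video.search() on every video in a collection; group shots by video ID.

    Assumes coll.get_videos() returns the collection's video objects.
    """
    shots_by_video: Dict[str, List[Any]] = defaultdict(list)
    for video in coll.get_videos():
        try:
            results = video.search(query=query, **search_kwargs)
        except Exception as exc:
            # video.search() raises when nothing matches; skip that video.
            if "No results found" in str(exc):
                continue
            raise
        for shot in results.get_shots():
            shots_by_video[shot.video_id].append(shot)
    return dict(shots_by_video)
```

For example, `search_each_video(coll, "artificial intelligence", search_type=SearchType.keyword)` would run a keyword search per video. Videos without matches are simply absent from the returned dictionary.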
Search + Compile
Index, search, and compile matching segments into a single playable stream:
video.index_spoken_words(force=True)
results = video.search(query="your query", search_type=SearchType.semantic)
stream_url = results.compile()
print(stream_url)
Tips
- Index once, search many times: Indexing is the expensive operation. Once indexed, searches are fast.
- Combine index types: Index both spoken words and scenes to enable all search types on the same video.
- Refine queries: Semantic search works best with descriptive, natural language phrases rather than single keywords.
- Use keyword search for precision: When you need exact term matches, keyword search avoids semantic drift.
- Handle "No results found": `video.search()` raises `InvalidRequestError` when no results match. Always wrap search calls in try/except and treat `"No results found"` as an empty result set.
- Filter scene search noise: Semantic scene search can return low-relevance results for vague queries. Use `score_threshold=0.3` (or higher) to filter noise.
- Idempotent indexing: Use `index_spoken_words(force=True)` so that repeated calls are safe; it skips videos that are already indexed. `index_scenes()` has no `force` parameter, so wrap it in try/except and extract the existing `scene_index_id` from the error message with `re.search(r"id\s+([a-f0-9]+)", str(e))`.