Files
everything-claude-code/skills/videodb/reference/search.md

7.7 KiB

Search & Indexing Guide

Search allows you to find specific moments inside videos using natural language queries, exact keywords, or visual scene descriptions.

Prerequisites

Videos must be indexed before they can be searched. Indexing is a one-time operation per video per index type.

Indexing

Spoken Word Index

Index the transcribed speech content of a video for semantic and keyword search:

video = coll.get_video(video_id)

# force=True makes indexing idempotent — skips if already indexed
video.index_spoken_words(force=True)

This transcribes the audio track and builds a searchable index over the spoken content. Required for semantic search and keyword search.

Parameters:

Parameter Type Default Description
language_code str|None None Language code of the video
segmentation_type SegmentationType SegmentationType.sentence Segmentation type (sentence or llm)
force bool False Set to True to skip if already indexed (avoids "already exists" error)
callback_url str|None None Webhook URL for async notification

Scene Index

Index visual content by generating AI descriptions of scenes. Like spoken word indexing, this raises an error if a scene index already exists. Extract the existing scene_index_id from the error message.

import re
from videodb import SceneExtractionType

try:
    scene_index_id = video.index_scenes(
        extraction_type=SceneExtractionType.shot_based,
        prompt="Describe the visual content, objects, actions, and setting in this scene.",
    )
except Exception as e:
    match = re.search(r"id\s+([a-f0-9]+)", str(e))
    if match:
        scene_index_id = match.group(1)
    else:
        raise

Extraction types:

Type Description Best For
SceneExtractionType.shot_based Splits on visual shot boundaries General purpose, action content
SceneExtractionType.time_based Splits at fixed intervals Uniform sampling, long static content
SceneExtractionType.transcript Splits based on transcript segments Speech-driven scene boundaries

Parameters for time_based:

video.index_scenes(
    extraction_type=SceneExtractionType.time_based,
    extraction_config={"time": 5, "select_frames": ["first", "last"]},
    prompt="Describe what is happening in this scene.",
)

Search Types

Natural language queries matched against spoken content:

from videodb import SearchType

results = video.search(
    query="explaining the benefits of machine learning",
    search_type=SearchType.semantic,
)

Returns ranked segments where the spoken content semantically matches the query.

Exact term matching in transcribed speech:

results = video.search(
    query="artificial intelligence",
    search_type=SearchType.keyword,
)

Returns segments containing the exact keyword or phrase.

Visual content queries matched against indexed scene descriptions. Requires a prior index_scenes() call.

index_scenes() returns a scene_index_id. Pass it to video.search() to target a specific scene index (especially important when a video has multiple scene indexes):

from videodb import SearchType, IndexType
from videodb.exceptions import InvalidRequestError

# Search using semantic search against the scene index.
# Use score_threshold to filter low-relevance noise (recommended: 0.3+).
try:
    results = video.search(
        query="person writing on a whiteboard",
        search_type=SearchType.semantic,
        index_type=IndexType.scene,
        scene_index_id=scene_index_id,
        score_threshold=0.3,
    )
    shots = results.get_shots()
except InvalidRequestError as e:
    if "No results found" in str(e):
        shots = []
    else:
        raise

Important notes:

  • Use SearchType.semantic with index_type=IndexType.scene — this is the most reliable combination and works on all plans.
  • SearchType.scene exists but may not be available on all plans (e.g. Free tier). Prefer SearchType.semantic with IndexType.scene.
  • The scene_index_id parameter is optional. If omitted, the search runs against all scene indexes on the video. Pass it to target a specific index.
  • You can create multiple scene indexes per video (with different prompts or extraction types) and search them independently using scene_index_id.

Scene Search with Metadata Filtering

When indexing scenes with custom metadata, you can combine semantic search with metadata filters:

from videodb import SearchType, IndexType

results = video.search(
    query="a skillful chasing scene",
    search_type=SearchType.semantic,
    index_type=IndexType.scene,
    scene_index_id=scene_index_id,
    filter=[{"camera_view": "road_ahead"}, {"action_type": "chasing"}],
)

See the scene_level_metadata_indexing cookbook for a full example of custom metadata indexing and filtered search.

Working with Results

Get Shots

Access individual result segments:

results = video.search("your query")

for shot in results.get_shots():
    print(f"Video: {shot.video_id}")
    print(f"Start: {shot.start:.2f}s")
    print(f"End: {shot.end:.2f}s")
    print(f"Text: {shot.text}")
    print("---")

Play Compiled Results

Stream all matching segments as a single compiled video:

results = video.search("your query")
stream_url = results.compile()
results.play()  # opens compiled stream in browser

Extract Clips

Download or stream specific result segments:

for shot in results.get_shots():
    stream_url = shot.generate_stream()
    print(f"Clip: {stream_url}")

Search across all videos in a collection:

coll = conn.get_collection()

# Search across all videos in the collection
results = coll.search(
    query="product demo",
    search_type=SearchType.semantic,
)

for shot in results.get_shots():
    print(f"Video: {shot.video_id} [{shot.start:.1f}s - {shot.end:.1f}s]")

Note: Collection-level search only supports SearchType.semantic. Using SearchType.keyword or SearchType.scene with coll.search() will raise NotImplementedError. For keyword or scene search, use video.search() on individual videos instead.

Search + Compile

Index, search, and compile matching segments into a single playable stream:

video.index_spoken_words(force=True)
results = video.search(query="your query", search_type=SearchType.semantic)
stream_url = results.compile()
print(stream_url)

Tips

  • Index once, search many times: Indexing is the expensive operation. Once indexed, searches are fast.
  • Combine index types: Index both spoken words and scenes to enable all search types on the same video.
  • Refine queries: Semantic search works best with descriptive, natural language phrases rather than single keywords.
  • Use keyword search for precision: When you need exact term matches, keyword search avoids semantic drift.
  • Handle "No results found": video.search() raises InvalidRequestError when no results match. Always wrap search calls in try/except and treat "No results found" as an empty result set.
  • Filter scene search noise: Semantic scene search can return low-relevance results for vague queries. Use score_threshold=0.3 (or higher) to filter noise.
  • Idempotent indexing: Use index_spoken_words(force=True) to safely re-index. index_scenes() has no force parameter — wrap it in try/except and extract the existing scene_index_id from the error message with re.search(r"id\s+([a-f0-9]+)", str(e)).