Tutorial: Index & Search Videos

Overview

This tutorial provides a step-by-step guide on how to index and search videos using the Aana API. The workflow involves creating a collection, indexing videos into it, and then searching for specific content within that collection.

The indexing process is asynchronous: once you submit a video for indexing, the API returns a task ID and processes the video in the background. You can use the task ID to check the status of the indexing. If you want to be notified when the indexing is complete, consider setting up a webhook.

Step 1: Create a Collection

Videos are organized into collections. You can create a new collection by sending a request to POST /collections.

import requests

url = "https://v1.api.aana.ai/collections"

# collection_id is the identifier you reference later when indexing and searching.
payload = {
    "collection_id": "my_collection",
    "name": "My Awesome Collection",
    "description": "A collection of educational videos",
}

headers = {
    "Content-Type": "application/json",
    "x-api-key": "YOUR_API_KEY_HERE",
}

# Send the payload as a JSON request body.
response = requests.post(url, json=payload, headers=headers)

data = response.json()
print(data)

Collection Management

The collection_id is a human-readable identifier (e.g., "my_collection").
See Collection Management for more detail on handling these resources (listing, updating, deleting).

Step 2: Index a Video

To index a video, send a request to POST /index. This endpoint adds or updates video content in a user-defined collection, making the content searchable.

import json

import requests

# The target collection and the video to index.
payload = {
    "collection": "my_collection",
    "reindex_if_exists": False,
    "video": {"media_id": "12345", "url": "https://example.com/video_12345.mp4"},
}

headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "x-api-key": "YOUR_API_KEY_HERE",
}

# The payload is sent as a JSON string in a form field named "body".
response = requests.post(
    "https://v1.api.aana.ai/index",
    headers=headers,
    data={"body": json.dumps(payload)},
)

data = response.json()
print(data)

The response will contain a task_id that you can use to check the status of the indexing process.

{"task_id": "175e1c97-1e18-4afe-906b-9e4897450c6e"}

Reindexing

The reindex_if_exists parameter allows you to control whether to reindex a video if it already exists in the collection. Set it to True to overwrite existing data.
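
To make the effect of the flag concrete, here is a minimal sketch that builds the form fields for POST /index with and without reindexing. The helper name build_index_form is hypothetical; the field names are taken from the example request in this step.

```python
import json


def build_index_form(collection, media_id, video_url, reindex=False):
    """Build the form fields for POST /index (hypothetical helper)."""
    payload = {
        "collection": collection,
        "reindex_if_exists": reindex,
        "video": {"media_id": media_id, "url": video_url},
    }
    # As in the example request, the payload is a JSON string
    # carried in a single form field named "body".
    return {"body": json.dumps(payload)}


video_url = "https://example.com/video_12345.mp4"

# First submission: leave existing data alone if the video is already indexed.
first_upload = build_index_form("my_collection", "12345", video_url)

# Later submission: ask for the existing data to be overwritten.
forced_refresh = build_index_form("my_collection", "12345", video_url, reindex=True)
```

Either dictionary can be passed as the data argument of requests.post, exactly as in the request shown in this step.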

Step 3: Retrieve Task Status

When you call POST /index, a new task is created and runs asynchronously. To check the status of this task:

  1. Use the /tasks/{task_id} endpoint if you have the task ID.
  2. Use webhooks to get notified when the task is completed. See Webhooks for more details.

import requests

task_id = "YOUR_TASK_ID"
url = f"https://v1.api.aana.ai/tasks/{task_id}"
headers = {"x-api-key": "YOUR_API_KEY_HERE"}

response = requests.get(url, headers=headers)
print(response.json()) 

Example response:

{
  "id": "175e1c97-1e18-4afe-906b-9e4897450c6e",
  "endpoint": "/index",
  "data": {"video": {"media_id": "12345", "url": "https://example.com/video_12345.mp4"}, "collection": "my_collection"},
  "status": "running",
  "results": null
}

While the status is running, you can poll this endpoint periodically until the task is completed or failed.

Once the task is completed, the results will be available in the results field of the task object.
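
The polling described in this step can be sketched as a small loop. This is an illustration under assumptions: the helper names make_task_fetcher and wait_for_task are hypothetical, the terminal status strings "completed" and "failed" are inferred from the wording of this step rather than confirmed, and the interval and timeout defaults are arbitrary.

```python
import time


def make_task_fetcher(task_id, api_key):
    """Return a zero-argument function that fetches the task object once."""
    import requests  # third-party; imported here so the loop below needs only the stdlib

    url = f"https://v1.api.aana.ai/tasks/{task_id}"
    headers = {"x-api-key": api_key}

    def fetch():
        response = requests.get(url, headers=headers, timeout=30)
        response.raise_for_status()
        return response.json()

    return fetch


def wait_for_task(fetch, interval=5.0, timeout=600.0, sleep=time.sleep):
    """Call fetch() until the task reaches a terminal status or the timeout expires."""
    deadline = time.monotonic() + timeout
    while True:
        task = fetch()
        # Assumed terminal status values; adjust to what the API actually returns.
        if task.get("status") in ("completed", "failed"):
            return task
        if time.monotonic() >= deadline:
            raise TimeoutError(f"task still {task.get('status')!r} after {timeout} seconds")
        sleep(interval)
```

A call such as wait_for_task(make_task_fetcher("YOUR_TASK_ID", "YOUR_API_KEY_HERE")) would then return the final task object. Passing the fetch function in as a parameter keeps the loop independent of the HTTP client, which makes it easy to exercise with a stub.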

Step 4: Search Within the Collection

After the video indexing is complete, you can search by sending a request to POST /search. This endpoint allows you to find relevant videos or segments based on a text query.

Search Endpoints

Check out Search for more details on the available search endpoints.

Is it possible to search across multiple collections?

At the moment, you can only search within a single collection. If you need to search across multiple collections, either create a new collection that includes all the videos you want to search, or perform one search per collection and combine the results in your application.
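
The second workaround, one search per collection with the results combined afterwards, could look like the sketch below. The names search_many and search_one are hypothetical; search_one stands for any function that performs the POST /search request from this step for a single collection and returns the parsed JSON. The sketch also assumes that a higher score means a better match and that scores from different collections are comparable, neither of which is confirmed here.

```python
def search_many(collections, query, search_one):
    """Search each collection separately and merge the hits by score."""
    merged = []
    for name in collections:
        response = search_one(name, query)
        for hit in response.get("results", []):
            # Remember which collection each hit came from.
            merged.append({**hit, "collection": name})
    # Assumption: a higher score means a more relevant hit.
    merged.sort(key=lambda hit: hit["score"], reverse=True)
    return merged
```

If the scores turn out not to be comparable across collections, keeping the per-collection result lists separate is the safer choice.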

import json

import requests

# The collection to search and the text query.
payload = {
    "collection": "my_collection",
    "query": "data science",
}

headers = {
    "Content-Type": "application/x-www-form-urlencoded",
    "x-api-key": "YOUR_API_KEY_HERE",
}

# As with POST /index, the payload is a JSON string in a form field named "body".
response = requests.post(
    "https://v1.api.aana.ai/search",
    headers=headers,
    data={"body": json.dumps(payload)},
)

data = response.json()
print(data)

The response will contain the segments that match your query, each with a relevance score.

{
  "results": [
    {
      "audio": [],
      "computer_vision": [
        {
          "end": 7.574233333333333,
          "start": 0,
          "summary": "The video features a close-up shot..."
        }
      ],
      "content": "The video features a close-up shot...",
      "episode_id": 0,
      "image_url": "...",
      "media_id": "ud2QnFTRPds",
      "score": 0.31464624404907227,
      "src": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
      "start_time": 0
    }
  ]
}
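
To show how such a response might be consumed, the sketch below turns each result into a one-line summary. The helper name summarize_hits is hypothetical, only fields visible in the example response are used, and start_time is assumed to be in seconds.

```python
def summarize_hits(response):
    """Return one readable line per search result."""
    lines = []
    for hit in response.get("results", []):
        lines.append(
            f"{hit['media_id']} @ {hit['start_time']}s (score {hit['score']:.3f})"
        )
    return lines


# Abbreviated copy of the example response above.
example = {
    "results": [
        {"media_id": "ud2QnFTRPds", "score": 0.31464624404907227, "start_time": 0}
    ]
}

for line in summarize_hits(example):
    print(line)  # ud2QnFTRPds @ 0s (score 0.315)
```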