├── .dockerignore
├── .env.sample
├── .gitignore
├── LICENSE
├── README.md
├── app.py
├── openapi.yaml
├── prompts.txt
├── requirements.txt
└── upload.py


/.dockerignore:
--------------------------------------------------------------------------------
1 | venv
2 | .git


--------------------------------------------------------------------------------
/.env.sample:
--------------------------------------------------------------------------------
1 | VIDEO_DB_API_KEY=
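
The key above is read from the environment by `connect()` after `load_dotenv()` runs. As a minimal sketch (not part of the repository), a fail-fast check could surface a missing key before any request is made. The helper name `require_api_key` is hypothetical.

```python
import os


def require_api_key(env=os.environ):
    """Return VIDEO_DB_API_KEY, or raise a clear error if it is unset or empty.

    Hypothetical helper: the repository itself relies on videodb.connect()
    to pick the key up from the environment.
    """
    key = env.get("VIDEO_DB_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "VIDEO_DB_API_KEY is not set; copy .env.sample to .env and fill it in"
        )
    return key
```

Calling `require_api_key()` right after `load_dotenv()` would turn a silent misconfiguration into an immediate, readable error.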


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
 1 | *.pyc
 2 | *.log
 3 | ngrok
 4 | !lib/README.md
 5 | .DS_Store
 6 | google/
 7 | lib/
 8 | archive
 9 | temp/
10 | transcribe/
11 | .idea/
12 | ideas/
13 | .ipynb_checkpoints
14 | log/
15 | model_data/
16 | data/
17 | nohup.out
18 | dump.rdb
19 | *.out
20 | *.zip
21 | .idea/*
22 | .env
23 | venv/
24 | test.json
25 | __pycache__
26 | __pycache__/*
27 | */__pycache__
28 | zappa_settings.py
29 | .vscode


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | The MIT License
 2 | 
 3 | Copyright (c) Ashutosh Trivedi
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in
13 | all copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21 | THE SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
  1 | <!-- PROJECT SHIELDS -->
  2 | <!--
  3 | *** Reference links are enclosed in brackets [ ] instead of parentheses ( ).
  4 | *** https://www.markdownguide.org/basic-syntax/#reference-style-links
  5 | -->
  6 | [![PyPI version][pypi-shield]][pypi-url]
  7 | [![Stargazers][stars-shield]][stars-url]
  8 | [![Issues][issues-shield]][issues-url]
  9 | [![Website][website-shield]][website-url]
 10 | [![Discord][discord-shield]][discord-url]
 11 | 
 12 | 
 13 | <!-- PROJECT LOGO -->
 14 | <br />
 15 | <p align="center">
 16 |   <a href="https://videodb.io/">
 17 |     <img src="https://codaio.imgix.net/docs/_s5lUnUCIU/blobs/bl-RgjcFrrJjj/d3cbc44f8584ecd42f2a97d981a144dce6a66d83ddd5864f723b7808c7d1dfbc25034f2f25e1b2188e78f78f37bcb79d3c34ca937cbb08ca8b3da1526c29da9a897ab38eb39d084fd715028b7cc60eb595c68ecfa6fa0bb125ec2b09da65664a4f172c2f" alt="Logo" width="300" height="">
 18 |   </a>
 19 | 
 20 | <h3 align="center">StreamRAG 🎥</h3>
 21 | 
 22 |   <p align="center">
 23 |     Video Search Agent for ChatGPT 🕵️‍♂️
 24 |     <br />
 25 |     <a href="https://console.videodb.io/player?url=https://stream.videodb.io/v3/published/manifests/90cb6cf2-d6ce-4a23-9d90-442c7cc357b8.m3u8"> 📺Watch Demo Video</a>  
 26 |     ·
 27 |     <a href="https://github.com/video-db/streamRAG/issues">🐞Report a Bug</a> 
 28 |     ·
 29 |     <a href="https://github.com/video-db/streamRAG/issues">💡Suggest a Feature</a> 
 30 |   </p>
 31 | </p>
 32 | 
 33 | <!-- ABOUT THE PROJECT -->
 34 | 
 35 | # StreamRAG: GPT-Powered Video Retrieval & Streaming 🚀
 36 | 
 37 | 
 38 | https://github.com/video-db/StreamRAG/assets/5406975/b768bb6e-08b8-451e-9117-1cf04488c02c
 39 | 
 40 | 
 41 | 
 42 | 
 43 | ## What does it do? 🤔
 44 | 
 45 | It enables developers to:
 46 | * 📚 Upload multiple videos to create a library or collection.
 47 | * 🔍 Search across these videos and get real-time video responses or compilations.
 48 | * 🛒 Publish your searchable collection on the ChatGPT store.
 49 | * 📝 Receive summarized text answers (RAG).
 50 | * 🌟 Gain key insights from specific videos (e.g. "_Top points from episode 31_").
 51 | 
 52 | ## How do I use it? 🛠️
 53 | [📺 Watch: Code walkthrough](https://console.videodb.io/player?url=https://stream.videodb.io/v3/published/manifests/b79a91d7-9553-4b4f-9d02-a47b9e168148.m3u8)
 54 | 
 55 | - **Get your API key:** Sign up on [VideoDB console](https://console.videodb.io) (Free for the first 50 uploads, no
 56 |   credit card required). 🆓
 57 | - **Set `VIDEO_DB_API_KEY`:** Copy `.env.sample` to `.env` and enter your key there.
 58 | - **Install dependencies:** Run `pip install -r requirements.txt` in your terminal.
 59 | - **Upload your collection to VideoDB:** Add your links in `upload.py`.
 60 | - **Run locally:** Start the Flask server with `python app.py`.
 61 | 
 62 | ## Publishing on ChatGPT Store 🏪
 63 | [📺 Watch: Create New GPT](https://console.videodb.io/player?url=https://stream.videodb.io/v3/published/manifests/b4b01b80-f38b-47f7-a238-09e53d844792.m3u8)
 64 | 
 65 | 1. Deploy your Flask server and note your server's `url`.
 66 | 2. In `openapi.yaml`, update the `url` field under `servers`.
 67 | 3. Visit the GPT builder at https://chat.openai.com/gpts/editor
 68 | 4. In the configure tab, add your GPT's `Name` and `Description`.
 69 | 5. Copy the prompt from `prompts.txt` into the `Instructions` field. Feel free to modify it as needed. ✏️
 70 | 6. Click on `Create new Action`.
 71 | 7. Copy the OpenAPI details from `openapi.yaml`. Don't forget to update the `url` field.
 72 | 8. Save your GPT for personal use and give it a test run! 🧪
 73 | 
 74 | ---
 75 | <!-- ROADMAP -->
 76 | 
 77 | ## Roadmap 🛣️
 78 | 
 79 | 1. Add support for popular backend deployment CD pipelines like `Heroku`, `Replit`, etc.
 80 | 2. Integrate with other data sources like `Dropbox`, `Google Drive`.
 81 | 3. Connect with meeting recorder APIs such as `Zoom`, `Teams`, and `Recall.ai`.
 82 | 
 83 | ---
 84 | <!-- CONTRIBUTING -->
 85 | 
 86 | ## Contributing 🤝
 87 | 
 88 | Your contributions make the open-source community an incredible place for learning, inspiration, and creativity. We
 89 | welcome and appreciate your input! Here's how you can contribute:
 90 | 
 91 | - Open issues to share your use cases.
 92 | - Participate in brainstorming solutions for our roadmap.
 93 | - Suggest improvements to the codebase.
 94 | 
 95 | ### Contribution Steps
 96 | 
 97 | 1. Fork the Project 🍴
 98 | 2. Create your Feature Branch (`git checkout -b feature/AmazingFeature`)
 99 | 3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
100 | 4. Push to the Branch (`git push origin feature/AmazingFeature`)
101 | 5. Open a Pull Request 📬
102 | 
103 | ---
104 | 
105 | <!-- MARKDOWN LINKS & IMAGES -->
106 | <!-- https://www.markdownguide.org/basic-syntax/#reference-style-links -->
107 | 
108 | [pypi-shield]: https://img.shields.io/pypi/v/videodb?style=for-the-badge
109 | 
110 | [pypi-url]: https://pypi.org/project/videodb/
111 | 
112 | [python-shield]:https://img.shields.io/pypi/pyversions/videodb?style=for-the-badge
113 | 
114 | [stars-shield]: https://img.shields.io/github/stars/video-db/streamRAG.svg?style=for-the-badge
115 | 
116 | [stars-url]: https://github.com/video-db/streamRAG/stargazers
117 | 
118 | [issues-shield]: https://img.shields.io/github/issues/video-db/streamRAG.svg?style=for-the-badge
119 | 
120 | [issues-url]: https://github.com/video-db/streamRAG/issues
121 | 
122 | [website-shield]: https://img.shields.io/website?url=https%3A%2F%2Fvideodb.io%2F&style=for-the-badge&label=videodb.io
123 | 
124 | [website-url]: https://videodb.io/
125 | 
126 | [discord-shield]: https://img.shields.io/discord/1189572299851051169?style=for-the-badge&logo=discord&label=Discord
127 | 
128 | [discord-url]: https://discord.gg/py9P639jGz
129 | 
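
Once the server from the README's "Run locally" step is up, the `/search` endpoint can be exercised from Python. The following is a minimal sketch under stated assumptions, not part of the repository: it assumes the default `http://localhost:8080` address used in `app.py`, and the helper names `build_search_request` and `search` are hypothetical.

```python
import json
import urllib.request

# Assumption: app.py is running locally on its default port (8080).
BASE_URL = "http://localhost:8080"


def build_search_request(query, base_url=BASE_URL):
    """Build a POST request for the /search endpoint (hypothetical helper)."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/search",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def search(query, base_url=BASE_URL):
    """Send the request and return the decoded JSON response."""
    with urllib.request.urlopen(build_search_request(query, base_url)) as resp:
        return json.load(resp)
```

With the server running, `search("your question")["compilationVideo"]` would give the playable URL. When nothing matches, the server answers 404 and `urlopen` raises `urllib.error.HTTPError`.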


--------------------------------------------------------------------------------
/app.py:
--------------------------------------------------------------------------------
 1 | import os
 2 | 
 3 | from dotenv import load_dotenv
 4 | from flask import Flask, request
 5 | from flask_cors import CORS
 6 | from videodb import connect, SearchError
 7 | 
 8 | load_dotenv()
 9 | 
10 | # Flask config
11 | app = Flask(__name__)
12 | app.secret_key = os.getenv("SECRET_KEY")
13 | app.url_map.strict_slashes = False
14 | CORS(app)
15 | 
16 | 
17 | def get_connection():
18 |     conn = connect()
19 |     return conn
20 | 
21 | 
22 | @app.route("/")
23 | def hello():
24 |     return "StreamRAG: Your Go-To Video Search Agent"
25 | 
26 | 
27 | @app.route("/videos", methods=["GET"])
28 | def list_videos():
29 |     """
30 |     Get a list of all videos in the database of your default collection.
31 |     """
32 |     conn = get_connection()
33 |     all_videos = conn.get_collection().get_videos()
34 |     all_videos = [
35 |         {
36 |             "id": vid.id,
37 |             "title": vid.name,
38 |             "url": vid.stream_url,
39 |             "length": round(float(vid.length)),
40 |         }
41 |         for vid in all_videos
42 |     ]
43 |     response = {"videos": all_videos}
44 |     return response
45 | 
46 | 
47 | @app.route("/video/<id>", methods=["GET"])
48 | def get_video(id):
49 |     """
50 |     Get a single video by id from default collection
51 |     """
52 |     conn = get_connection()
53 |     all_videos = conn.get_collection().get_videos()
54 | 
55 |     vid = next((vid for vid in all_videos if vid.id == id), None)
56 |     if vid is None:  # unknown id: answer with the 400 documented in openapi.yaml
57 |         return "Video not found", 400
58 |     vid.get_transcript()
59 |     transcript_text = vid.transcript_text
60 | 
61 |     response = {
62 |         "video": {
63 |             "id": vid.id,
64 |             "title": vid.name,
65 |             "url": vid.stream_url,
66 |             "length": round(float(vid.length)),
67 |             "transcript": transcript_text,
68 |         }
69 |     }
70 |     return response
71 | 
72 | 
73 | @app.route("/search", methods=["POST"])
74 | def search_videos():
75 |     """
76 |     Search across videos in the database in default collection
77 |     """
78 |     query = (request.get_json(silent=True) or {}).get("query")
79 |     if not query:
80 |         return "Missing 'query' in request body", 400
81 |     try:
82 |         coll = get_connection().get_collection()
83 |         search_results = coll.search(query)
84 |         search_results.compile()
85 |         compilation_vid = search_results.player_url
86 |     except SearchError:
87 |         return "No Search Results found", 404
88 | 
89 |     shots = [
90 |         {"text": shot.text, "video": shot.stream_url}
91 |         for shot in search_results.get_shots()
92 |     ]
93 |     response = {"compilationVideo": compilation_vid, "chunks": shots}
94 |     return response
95 | 
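96 | # Keep debug opt-in: the Werkzeug debugger allows code execution and the server binds to 0.0.0.0
97 | if __name__ == "__main__":
98 |     app.run(host="0.0.0.0", port=8080, debug=os.getenv("FLASK_DEBUG") == "1")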
99 | 
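
Both `list_videos` and `get_video` in `app.py` build the same four fields by hand. As a sketch of one way to share that logic (not the repository's code), the helper below serializes any object exposing `id`, `name`, `stream_url`, and `length`. `video_to_dict` and the stand-in `FakeVideo` class are hypothetical names, and `length` being a numeric string is an assumption drawn from the `float(...)` call in `app.py`.

```python
from dataclasses import dataclass


@dataclass
class FakeVideo:
    """Stand-in for a VideoDB video; only the attributes app.py reads."""
    id: str
    name: str
    stream_url: str
    length: str  # assumption: numeric string, as implied by float(vid.length)


def video_to_dict(vid, transcript=None):
    """Build the JSON-ready dict used by /videos and /video/<id>."""
    data = {
        "id": vid.id,
        "title": vid.name,
        "url": vid.stream_url,
        "length": round(float(vid.length)),
    }
    # Only the single-video endpoint includes a transcript.
    if transcript is not None:
        data["transcript"] = transcript
    return data
```

`list_videos` could then return `{"videos": [video_to_dict(v) for v in all_videos]}`, keeping the two endpoints' field names in sync with `openapi.yaml`.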


--------------------------------------------------------------------------------
/openapi.yaml:
--------------------------------------------------------------------------------
  1 | openapi: 3.0.0
  2 | info:
  3 |   title: Video Search API
  4 |   description: This API allows users to search a collection of videos and get details of individual videos.
  5 |   version: 1.9.0
  6 | servers:
  7 |   - url: <paste your server url>
  8 |     description: Main API server
  9 | paths:
 10 |   /videos:
 11 |     get:
 12 |       operationId: listVideos
 13 |       summary: Get list of all videos in the library.
 14 |       responses:
 15 |         "200":
 16 |           description: List details of all the videos
 17 |           content:
 18 |             application/json:
 19 |               schema:
 20 |                 $ref: "#/components/schemas/ListVideosResponse"
 21 |         "400":
 22 |           description: Invalid request
 23 |         "default":
 24 |           description: Unexpected error
 25 |   /video/{id}:
 26 |     get:
 27 |       operationId: getVideo
 28 |       summary: Get data (transcript, length, etc.) of a video given its id.
 29 |       parameters:
 30 |         - name: id
 31 |           in: path
 32 |           required: true
 33 |           description: The unique identifier of the video.
 34 |           schema:
 35 |             type: string
 36 |       responses:
 37 |         "200":
 38 |           description: Video Data of the requested video
 39 |           content:
 40 |             application/json:
 41 |               schema:
 42 |                 $ref: "#/components/schemas/GetVideoResponse"
 43 |         "400":
 44 |           description: Invalid request due to incorrect or missing video id.
 45 |         "default":
 46 |           description: Unexpected error
 47 |   /search:
 48 |     post:
 49 |       operationId: searchVideos
 50 |       summary: Search for videos.
 51 |       requestBody:
 52 |         required: true
 53 |         content:
 54 |           application/json:
 55 |             schema:
 56 |               $ref: "#/components/schemas/SearchRequest"
 57 |       responses:
 58 |         "200":
 59 |           description: Search results
 60 |           content:
 61 |             application/json:
 62 |               schema:
 63 |                 $ref: "#/components/schemas/SearchResponse"
 64 |         "404":
 65 |           description: No videos found
 66 |         "400":
 67 |           description: Invalid request
 68 |         "default":
 69 |           description: Unexpected error
 70 | components:
 71 |   schemas:
 72 |     SearchRequest:
 73 |       type: object
 74 |       properties:
 75 |         query:
 76 |           type: string
 77 |           description: Search query for finding videos
 78 |     SearchResponse:
 79 |       type: object
 80 |       properties:
 81 |         compilationVideo:
 82 |           type: string
 83 |           format: uri
 84 |           description: Playable URL of the compilation video
 85 |         chunks:
 86 |           type: array
 87 |           items:
 88 |             type: object
 89 |             properties:
 90 |               text:
 91 |                 type: string
 92 |                 description: Text content of the matched video segment
 93 |               video:
 94 |                 type: string
 95 |                 format: uri
 96 |                 description: Playable URL of the video segment
 97 |     GetVideoResponse:
 98 |       type: object
 99 |       properties:
100 |         video:
101 |           type: object
102 |           properties:
103 |             id:
104 |               type: string
105 |               description: Unique id of the video
106 |             title:
107 |               type: string
108 |               description: Title of the video
109 |             url:
110 |               description: Playable URL of the video
111 |               format: uri
112 |               type: string
113 |             length:
114 |               description: Length of the video in seconds
115 |               type: number
116 |             transcript:
117 |               description: Transcript of the video
118 |               type: string
119 |     ListVideosResponse:
120 |       type: object
121 |       properties:
122 |         videos:
123 |           type: array
124 |           items:
125 |             type: object
126 |             properties:
127 |               title:
128 |                 type: string
129 |                 description: Title of the video
130 |               id:
131 |                 type: string
132 |                 description: Unique id of the video
133 |               url:
134 |                 description: Playable URL of the video
135 |                 format: uri
136 |                 type: string
137 |               length:
138 |                 description: Length of the video in seconds
139 |                 type: number
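
The `SearchResponse` schema above can be checked in plain Python when testing the server. This is a minimal sketch with a hypothetical helper name, `is_valid_search_response`. It is stricter than the schema, which does not mark any property as required.

```python
def is_valid_search_response(payload):
    """Return True if payload has the SearchResponse shape from openapi.yaml.

    Assumption: every field is treated as required, which is stricter than
    the schema itself.
    """
    if not isinstance(payload, dict):
        return False
    if not isinstance(payload.get("compilationVideo"), str):
        return False
    chunks = payload.get("chunks")
    if not isinstance(chunks, list):
        return False
    # Each chunk needs a text snippet and a playable segment URL.
    return all(
        isinstance(chunk, dict)
        and isinstance(chunk.get("text"), str)
        and isinstance(chunk.get("video"), str)
        for chunk in chunks
    )
```

A check like this can catch drift between `app.py` and the schema before the spec is pasted into the GPT builder.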


--------------------------------------------------------------------------------
/prompts.txt:
--------------------------------------------------------------------------------
 1 | You are a video search assistant, adept at handling video-related tasks in a casual tone. The step-by-step approach below ensures
 2 | a comprehensive and user-friendly response to video search requests, combining visual and textual information effectively.
 3 | 
 4 | When a user asks you to search or find information, your first step is to identify whether the request contains a search query. If it does,
 5 | call the action `search` with that query. The action will return a `compilationVideo` and a list of related segments
 6 | (`chunks`) from the library. Each segment has the fields `text` and `video`.
 7 | If the user's request is for a video clip, show the `compilationVideo` with a short, casual summary of the result.
 8 | 
 9 | Analyze the user's query and use the related `text` chunks to summarize the results as follows:
10 | 1. Return a concise, bullet-pointed response.
11 | 2. The response should include relevant information about the topic based on the media.
12 | 3. If the response includes a lot of details, return only a short text answer.
13 | 4. If there are enough accurate reference videos, include them as links in a separate bullet-pointed list titled 'Reference Videos:'.
14 | Limit these to the top 5 reference videos.
15 | 5. If not much relevant information is found across the videos, return a message stating that no relevant information was found in the content.
16 | 
17 | To complete other tasks:
18 | - You can get a list of all videos by calling the action `videos` and show the videos the user needs. The user can pick one of them.
19 | - You can get data for an individual video by calling the action `video/{id}` to fetch more details about it, for example its transcript
20 | and length.
21 | If you don’t know which video id the user is referring to, get all videos first, confirm the video
22 | with the user, and then follow the instructions above.
23 | 


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
 1 | backoff==2.2.1
 2 | blinker==1.7.0
 3 | certifi==2023.11.17
 4 | charset-normalizer==3.3.2
 5 | click==8.1.7
 6 | Flask==3.0.0
 7 | Flask-Cors==4.0.0
 8 | gunicorn==20.0.4
 9 | idna==3.6
10 | importlib-metadata==7.0.1
11 | itsdangerous==2.1.2
12 | Jinja2==3.1.3
13 | MarkupSafe==2.1.3
14 | python-dotenv==1.0.0
15 | pytube==15.0.0
16 | requests==2.31.0
17 | urllib3==2.1.0
18 | videodb==0.0.2
19 | Werkzeug==3.0.1
20 | zipp==3.17.0
21 | 


--------------------------------------------------------------------------------
/upload.py:
--------------------------------------------------------------------------------
 1 | from pytube import Playlist
 2 | from videodb import connect
 3 | 
 4 | from dotenv import load_dotenv
 5 | 
 6 | load_dotenv()
 7 | 
 8 | 
 9 | def get_youtube_playlist_video_urls(playlist_url):
10 |     # TODO: Error and exception handling
11 |     playlist = Playlist(playlist_url)
12 |     urls = list(playlist)
13 |     return urls
14 | 
15 | 
16 | def bulk_upload(urls):
17 |     # Read VideoDB API key from env and create a connection
18 |     conn = connect()
19 |     # Get a collection
20 |     coll = conn.get_collection()
21 |     for url in urls:
22 |         # Upload the video to the collection; see https://docs.videodb.io for more upload functions
23 |         print(f"Uploading {url}")
24 |         video = coll.upload(url=url)
25 |         print(f"Uploaded {video.name}")
26 |         print(f"Indexing {video.name}")
27 |         video.index_spoken_words()
28 |         print(f"Indexed {video.name}")
29 |         print("-----")
30 | 
31 | # run bulk upload fn on list of videos
32 | """
33 | urls = [
34 |     "https://www.youtube.com/watch?v=lsODSDmY4CY",
35 |     "https://www.youtube.com/watch?v=vZ4kOr38JhY",
36 |     "https://www.youtube.com/watch?v=uak_dXHh6s4",
37 | ]
38 | bulk_upload(urls)
39 | """
40 | 
41 | # run bulk upload fn on YouTube playlist
42 | """
43 | playlist_url = "https://www.youtube.com/watch?v=jSMZoLjB9JE&list=PLoaVOjvkzQtwcMfopT02bXWzjmnnF5olS"
44 | urls = get_youtube_playlist_video_urls(playlist_url)
45 | bulk_upload(urls)
46 | """
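
`bulk_upload` stops at the first failing URL, and the TODO in `get_youtube_playlist_video_urls` notes that error handling is still missing. One possible approach, sketched here with a hypothetical helper `bulk_upload_safe`, is to collect failures and keep going. The `upload_fn` parameter is an assumption introduced so the loop can be tested without a VideoDB connection.

```python
def bulk_upload_safe(urls, upload_fn):
    """Upload each URL, collecting failures instead of aborting the batch.

    upload_fn is any callable taking a URL; in upload.py it could wrap
    coll.upload(url=url) followed by video.index_spoken_words().
    """
    succeeded, failed = [], []
    for url in urls:
        try:
            upload_fn(url)
        except Exception as exc:  # broad on purpose: keep the batch going
            failed.append((url, str(exc)))
        else:
            succeeded.append(url)
    return succeeded, failed
```

The returned `failed` list pairs each URL with its error message, so a later run can retry only those entries.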


--------------------------------------------------------------------------------