├── .env.example
├── .gitignore
├── .vscode
│   └── settings.json
├── LICENSE
├── README.md
├── __init__.py
├── debug_scripts
│   ├── checkprompt_template.py
│   ├── enter_to_record.py
│   ├── notes_interpreter.ipynb
│   ├── speech_to_text.py
│   └── wakeword_detect.py
├── hello.py
├── images
│   ├── current_workflow.svg
│   ├── logo.jpg
│   ├── siri.gif
│   └── thumbnail.png
├── makefile
├── requirements.txt
└── src
    ├── __init__.py
    ├── animation
    │   ├── animation_controller.py
    │   └── thinking_siri_animation.py
    ├── chains
    │   ├── __init__.py
    │   ├── bash_chain
    │   │   ├── ask_user_yes_no_get_reason.py
    │   │   └── bash_chain.py
    │   └── general_chain.py
    ├── main.py
    ├── run_mac.sh
    ├── runme.bat
    ├── utils
    │   ├── chrmadb
    │   │   └── generate_db.py
    │   ├── helpers.py
    │   ├── iterm2
    │   │   ├── get_iterm2_sessions.py
    │   │   ├── iterm2_focus.py
    │   │   └── iterm2_launcher.py
    │   ├── open_interpreter
    │   │   └── open_interpreter_returnable.py
    │   ├── osinfo.py
    │   ├── query_router.py
    │   └── subprocess_caller.py
    └── wakewords
        ├── hey-nami_en_mac_v3_0_0.ppn
        └── hey-nami_en_windows_v3_0_0.ppn

--------------------------------------------------------------------------------
/.env.example:
--------------------------------------------------------------------------------
OPENAI_API_KEY=""
PICOVOICE_ACCESS_KEY=""
ANONYMIZED_TELEMETRY=False
TYPE_ONLY=False

--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
.env
*.wav
.DS_Store
src/utils/chrmadb/abc.txt
*.pyc
*.txt
*.bin
*.pickle
chromadbpath/chroma.sqlite3
iter.py
!requirements.txt

--------------------------------------------------------------------------------
/.vscode/settings.json:
--------------------------------------------------------------------------------
{
    "workbench.colorCustomizations": {
        "activityBar.background": "#092B64",
        "titleBar.activeBackground": "#0C3C8C",
        "titleBar.activeForeground": "#FAFCFF"
    }
}

--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2024 Rushi Chaudhari

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Terminal Voice Assistant

![](images/logo.jpg)

Terminal Voice Assistant is a powerful and flexible tool designed to help users interact with their terminal using natural language commands.

It sounds similar to the popular [openinterpreter](https://github.com/OpenInterpreter/open-interpreter) and [01 Light](https://github.com/OpenInterpreter/01) projects, which aim for a full computer experience. While openinterpreter uses a terminal state for interaction, it isn't exactly what I am looking for in terms of side-by-side AI assistance, and 01's ultimate goal is to provide a hands-free human-computer interface.

**Terminal Voice Assistant** offers a unified terminal interface where users can SSH, perform operations, and install, edit, and remove files. Unlike traditional request-response models, this assistant provides continuous access to the terminal state, allowing users to interact with and monitor processes in real time.

## Features

- Handles complex path descriptions using "slash" (e.g., "slash home slash download" becomes "/home/download").
- Integrates with OpenAI's language models to ensure accurate and efficient command generation.
- Uses [picovoice](https://picovoice.ai/) for quick and efficient wake word detection.
- Provides access to the terminal state, even in the midst of errors or SSH sessions, rather than following a strict request-response model.
- Most of the other available tools convert NLP to Bash, for example:
  `List me the files in the current folder`
  However, they don't work with:
  `ls -al` (spoken aloud as "l s minus a l")
  This project converts realistic vocal commands into actual code; sometimes it's easier for developers to say the command than to describe it in English.

### Current workflow

[Enlarge image](images/current_workflow.svg)
![](images/current_workflow.svg)

## Installation

1. [Download Wezterm](https://wezfurlong.org/wezterm/install/macos.html), open it, and create a horizontal split.
2. In a new terminal: `git clone https://github.com/0xrushi/Terminal-Voice-Assistant.git && cd Terminal-Voice-Assistant`
3. Rename `.env.example` to `.env` and add your OpenAI and Picovoice API keys.
4. Install the Python packages:
   ```bash
   conda create -n tvassist python=3.10
   conda activate tvassist
   conda install pyqt5
   pip install -r requirements.txt
   ```
5. `make run`
6. `make kill # to exit`

## Demo

[![Watch the video](images/thumbnail.png)](https://odysee.com/@rushi:2/Terminal-Voice-Assistant-demo:5)

## Contributing

We welcome contributions from the community. If you have suggestions for improvements or want to report bugs, please open an issue or submit a pull request.

## License

This project is licensed under the MIT License. See the [LICENSE](LICENSE) file for more details.
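## Appendix: slash-path example

For illustration, here is a minimal sketch of the kind of "slash path" normalization described in the Features section. This is a hypothetical helper, not part of the codebase; in the project itself the conversion is delegated to the LLM prompts in `src/chains/bash_chain/bash_chain.py`:

```python
def normalize_slash_path(spoken: str) -> str:
    """Turn 'slash home slash download' into '/home/download'."""
    path = ""
    for word in spoken.strip().lower().split():
        if word == "slash":
            path += "/"
        else:
            # simplification: consecutive words inside one segment are concatenated
            path += word
    return path

assert normalize_slash_path("slash home slash download") == "/home/download"
```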
--------------------------------------------------------------------------------
/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/0xrushi/Terminal-Voice-Assistant/fc5134622b49d6911db85f57268fd4bbef57da8f/__init__.py

--------------------------------------------------------------------------------
/debug_scripts/checkprompt_template.py:
--------------------------------------------------------------------------------
import json

from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

def chat(system_prompt, user_message):
    """
    Sends a message to the OpenAI API and returns the AI's response.

    Args:
        system_prompt (str): The system-level instructions for the model.
        user_message (str): The user's input message.

    Returns:
        dict: A dictionary with the AI's response parsed from the returned JSON.
    """
    client = OpenAI()
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message}
    ]

    # openai>=1.0 (as pinned in requirements.txt) uses the client API instead
    # of the legacy openai.ChatCompletion.create
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages
    )

    content = response.choices[0].message.content

    try:
        content = json.loads(content)
    except json.JSONDecodeError:
        content = {"response": content}

    return content

if __name__ == "__main__":
    system_prompt = "You are a helpful assistant."
    user_message = "Hello, how can I improve my productivity?"

    result = chat(system_prompt, user_message)
    print(result)

--------------------------------------------------------------------------------
/debug_scripts/enter_to_record.py:
--------------------------------------------------------------------------------
import pyaudio
import wave
import keyboard
import threading

def record_audio(filename, sample_rate=44100, chunk_size=1024, channels=2):
    p = pyaudio.PyAudio()

    # Open stream
    stream = p.open(format=pyaudio.paInt16,
                    channels=channels,
                    rate=sample_rate,
                    input=True,
                    frames_per_buffer=chunk_size)

    print("Press Enter to start recording and release to stop.")

    frames = []
    recording = False

    def start_recording():
        nonlocal recording
        recording = True
        print("Recording...")

        while recording:
            data = stream.read(chunk_size)
            frames.append(data)

        print("Recording finished")

        # Save the recorded data as a WAV file
        wf = wave.open(filename, 'wb')
        wf.setnchannels(channels)
        wf.setsampwidth(p.get_sample_size(pyaudio.paInt16))
        wf.setframerate(sample_rate)
        wf.writeframes(b''.join(frames))
        wf.close()

    def stop_recording():
        nonlocal recording
        recording = False

    # Keyboard event listeners
    keyboard.on_press_key("enter", lambda _: threading.Thread(target=start_recording).start())
    keyboard.on_release_key("enter", lambda _: stop_recording())

    keyboard.wait("esc")

    stream.stop_stream()
    stream.close()
    p.terminate()

if __name__ == "__main__":
    filename = "output.wav"
    record_audio(filename)
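# Note (assumption): the `keyboard` module generally requires root privileges
# on macOS, which appears to be why the makefile runs the assistant with sudo;
# run this debug script the same way if key events are not detected.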
"metadata": {}, 7 | "outputs": [], 8 | "source": [ 9 | "default_system_message = (\n", 10 | " f\"\"\"\n", 11 | "\n", 12 | "You are Open Interpreter, a world-class programmer that can complete any goal by executing code.\n", 13 | "First, write a plan. **Always recap the plan between each code block** (you have extreme short-term memory loss, so you need to recap the plan between each message block to retain it).\n", 14 | "When you execute code, it will be executed **on the user's machine**. The user has given you **full and complete permission** to execute any code necessary to complete the task. Execute the code.\n", 15 | "You can access the internet. Run **any code** to achieve the goal, and if at first you don't succeed, try again and again.\n", 16 | "You can install new packages.\n", 17 | "When a user refers to a filename, they're likely referring to an existing file in the directory you're currently executing code in.\n", 18 | "Write messages to the user in Markdown.\n", 19 | "In general, try to **make plans** with as few steps as possible. As for actually executing code to carry out that plan, for *stateful* languages (like python, javascript, shell, but NOT for html which starts from 0 every time) **it's critical not to try to do everything in one code block.** You should try something, print information about it, then continue from there in tiny, informed steps. You will never get it on the first try, and attempting it in one go will often lead to errors you cant see.\n", 20 | "You are capable of **any** task.\n", 21 | "\"\"\").strip()" 22 | ] 23 | }, 24 | { 25 | "cell_type": "code", 26 | "execution_count": 16, 27 | "metadata": {}, 28 | "outputs": [ 29 | { 30 | "name": "stdout", 31 | "output_type": "stream", 32 | "text": [ 33 | "find projectxyz -type f -name '*.swp' -delete\n" 34 | ] 35 | }, 36 | { 37 | "name": "stderr", 38 | "output_type": "stream", 39 | "text": [ 40 | "error uploading: HTTPSConnectionPool(host='us-api.i.posthog.com', port=443): Max retries exceeded with url: /batch/ (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))\n" 41 | ] 42 | } 43 | ], 44 | "source": [ 45 | "from interpreter import interpreter\n", 46 | "import json\n", 47 | "\n", 48 | "prompt=\"zip hello.py and send it do doraemon over ssh\"\n", 49 | "prompt=\"Remove all swap files (I mean vim swap files) in my projectxyz\"\n", 50 | "\n", 51 | "message = f\"\\n\\n I want you to write a command for {prompt}. 
Return me the bash commands for these after you have analyzed the tasks \\\n", 52 | " .Only return the commands in JSON in the form {{'command':'your command'}} and nothing else\"\n", 53 | "\n", 54 | "for chunk in interpreter.chat(message, display=False, stream=False):\n", 55 | " print(json.loads(chunk[\"content\"])[\"command\"])" 56 | ] 57 | }, 58 | { 59 | "cell_type": "code", 60 | "execution_count": 14, 61 | "metadata": {}, 62 | "outputs": [ 63 | { 64 | "name": "stdout", 65 | "output_type": "stream", 66 | "text": [ 67 | "zip hello.zip hello.py && scp hello.zip doraemon@hostname:/path/to/destination\n" 68 | ] 69 | }, 70 | { 71 | "name": "stderr", 72 | "output_type": "stream", 73 | "text": [ 74 | "error uploading: HTTPSConnectionPool(host='us-api.i.posthog.com', port=443): Max retries exceeded with url: /batch/ (Caused by NewConnectionError(': Failed to establish a new connection: [Errno 8] nodename nor servname provided, or not known'))\n" 75 | ] 76 | } 77 | ], 78 | "source": [ 79 | "for chunk in interpreter.chat(message, display=False, stream=False):\n", 80 | " " 81 | ] 82 | }, 83 | { 84 | "cell_type": "code", 85 | "execution_count": 21, 86 | "metadata": {}, 87 | "outputs": [ 88 | { 89 | "name": "stdout", 90 | "output_type": "stream", 91 | "text": [ 92 | "No JSON found\n", 93 | "None\n" 94 | ] 95 | } 96 | ], 97 | "source": [ 98 | "import json\n", 99 | "import re\n", 100 | "\n", 101 | "def decode_embedded_json(content):\n", 102 | " # Pattern to extract JSON block, optionally prefixed by ```json\n", 103 | " pattern = r'```(?:json)?\\n([\\s\\S]*?)\\n```'\n", 104 | " pattern = r'(\\{.*?\\})'\n", 105 | " match = re.search(pattern, content)\n", 106 | " \n", 107 | " if match:\n", 108 | " json_text = match.group(1)\n", 109 | " try:\n", 110 | " data = json.loads(json_text)\n", 111 | " return data\n", 112 | " except json.JSONDecodeError as e:\n", 113 | " print(f\"Error decoding JSON: {e}\")\n", 114 | " return None\n", 115 | " else:\n", 116 | " print(\"No JSON found\")\n", 117 | " return None\n", 118 | "\n", 119 | "# Example usage with your string\n", 120 | "content_with_json = \"\"\"'content': '```json\n", 121 | "{\n", 122 | " \"command\": \"zip hello.zip hello.py && scp hello.zip doraemon@192.168.0.1:/remote/path\"\n", 123 | "}\n", 124 | "```'\"\"\"\n", 125 | "\n", 126 | "decoded_data = decode_embedded_json(content_with_json)\n", 127 | "print(decoded_data)" 128 | ] 129 | }, 130 | { 131 | "cell_type": "code", 132 | "execution_count": null, 133 | "metadata": {}, 134 | "outputs": [], 135 | "source": [] 136 | } 137 | ], 138 | "metadata": { 139 | "kernelspec": { 140 | "display_name": "base", 141 | "language": "python", 142 | "name": "python3" 143 | }, 144 | "language_info": { 145 | "codemirror_mode": { 146 | "name": "ipython", 147 | "version": 3 148 | }, 149 | "file_extension": ".py", 150 | "mimetype": "text/x-python", 151 | "name": "python", 152 | "nbconvert_exporter": "python", 153 | "pygments_lexer": "ipython3", 154 | "version": "3.9.13" 155 | } 156 | }, 157 | "nbformat": 4, 158 | "nbformat_minor": 2 159 | } 160 | -------------------------------------------------------------------------------- /debug_scripts/speech_to_text.py: -------------------------------------------------------------------------------- 1 | from openai import OpenAI 2 | client = OpenAI() 3 | 4 | audio_file= open("output.wav", "rb") 5 | transcription = client.audio.transcriptions.create( 6 | model="whisper-1", 7 | file=audio_file 8 | ) 9 | print(transcription.text) 
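# Note (assumption): transcriptions.create also accepts response_format="text",
# in which case the call returns a plain str rather than an object with a
# .text attribute.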
--------------------------------------------------------------------------------
/debug_scripts/wakeword_detect.py:
--------------------------------------------------------------------------------
import argparse
import os
import struct
import wave
from datetime import datetime

import pvporcupine
from pvrecorder import PvRecorder

def main():
    parser = argparse.ArgumentParser()

    parser.add_argument(
        '--access_key',
        help='AccessKey obtained from Picovoice Console (https://console.picovoice.ai/)')

    parser.add_argument(
        '--keywords',
        nargs='+',
        help='List of default keywords for detection. Available keywords: %s' % ', '.join(
            '%s' % w for w in sorted(pvporcupine.KEYWORDS)),
        choices=sorted(pvporcupine.KEYWORDS),
        metavar='')

    parser.add_argument(
        '--keyword_paths',
        nargs='+',
        help="Absolute paths to keyword model files. If not set it will be populated from `--keywords` argument")

    parser.add_argument(
        '--library_path',
        help='Absolute path to dynamic library. Default: using the library provided by `pvporcupine`')

    parser.add_argument(
        '--model_path',
        help='Absolute path to the file containing model parameters. '
             'Default: using the library provided by `pvporcupine`')

    parser.add_argument(
        '--sensitivities',
        nargs='+',
        help="Sensitivities for detecting keywords. Each value should be a number within [0, 1]. A higher "
             "sensitivity results in fewer misses at the cost of increasing the false alarm rate. If not set 0.5 "
             "will be used.",
        type=float,
        default=None)

    parser.add_argument('--audio_device_index', help='Index of input audio device.', type=int, default=-1)

    parser.add_argument('--output_path', help='Absolute path to recorded audio for debugging.', default=None)

    parser.add_argument('--show_audio_devices', action='store_true')

    args = parser.parse_args()

    if args.show_audio_devices:
        for i, device in enumerate(PvRecorder.get_available_devices()):
            print('Device %d: %s' % (i, device))
        return

    if args.keyword_paths is None:
        if args.keywords is None:
            raise ValueError("Either `--keywords` or `--keyword_paths` must be set.")

        keyword_paths = [pvporcupine.KEYWORD_PATHS[x] for x in args.keywords]
    else:
        keyword_paths = args.keyword_paths

    if args.sensitivities is None:
        args.sensitivities = [0.5] * len(keyword_paths)

    if len(keyword_paths) != len(args.sensitivities):
        raise ValueError('Number of keywords does not match the number of sensitivities.')

    try:
        # use the validated keyword_paths list (a stray hard-coded override of
        # this variable was removed; it was leftover debug code and was never
        # actually passed to Porcupine)
        porcupine = pvporcupine.create(
            access_key=args.access_key,
            keyword_paths=keyword_paths)
    except pvporcupine.PorcupineInvalidArgumentError as e:
        print("One or more arguments provided to Porcupine is invalid: ", args)
        print(e)
        raise e
    except pvporcupine.PorcupineActivationError as e:
        print("AccessKey activation error")
        raise e
    except pvporcupine.PorcupineActivationLimitError as e:
        print("AccessKey '%s' has reached its temporary device limit" % args.access_key)
        raise e
    except pvporcupine.PorcupineActivationRefusedError as e:
        print("AccessKey '%s' refused" % args.access_key)
        raise e
    except pvporcupine.PorcupineActivationThrottledError as e:
        print("AccessKey '%s' has been throttled" % args.access_key)
        raise e
    except pvporcupine.PorcupineError as e:
        print("Failed to initialize Porcupine")
        raise e

    keywords = list()
    for x in keyword_paths:
        keyword_phrase_part = os.path.basename(x).replace('.ppn', '').split('_')
        if len(keyword_phrase_part) > 6:
            keywords.append(' '.join(keyword_phrase_part[0:-6]))
        else:
            keywords.append(keyword_phrase_part[0])

    print('Porcupine version: %s' % porcupine.version)

    recorder = PvRecorder(
        frame_length=porcupine.frame_length,
        device_index=args.audio_device_index)
    recorder.start()

    wav_file = None
    if args.output_path is not None:
        wav_file = wave.open(args.output_path, "w")
        wav_file.setnchannels(1)
        wav_file.setsampwidth(2)
        wav_file.setframerate(16000)

    print('Listening ... (press Ctrl+C to exit)')

    try:
        while True:
            pcm = recorder.read()
            result = porcupine.process(pcm)

            if wav_file is not None:
                wav_file.writeframes(struct.pack("h" * len(pcm), *pcm))

            if result >= 0:
                print('[%s] Detected %s' % (str(datetime.now()), keywords[result]))
    except KeyboardInterrupt:
        print('Stopping ...')
    finally:
        recorder.delete()
        porcupine.delete()
        if wav_file is not None:
            wav_file.close()


if __name__ == '__main__':
    main()

# python wakeword_detect.py --access_key <YOUR_ACCESS_KEY> --keyword_paths ./hey-nami_en_windows_v3_0_0.ppn --audio_device_index 3

--------------------------------------------------------------------------------
/hello.py:
--------------------------------------------------------------------------------
print("hello world")

--------------------------------------------------------------------------------
/images/current_workflow.svg:
--------------------------------------------------------------------------------

--------------------------------------------------------------------------------
/images/logo.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/0xrushi/Terminal-Voice-Assistant/fc5134622b49d6911db85f57268fd4bbef57da8f/images/logo.jpg

--------------------------------------------------------------------------------
/images/siri.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/0xrushi/Terminal-Voice-Assistant/fc5134622b49d6911db85f57268fd4bbef57da8f/images/siri.gif

--------------------------------------------------------------------------------
/images/thumbnail.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/0xrushi/Terminal-Voice-Assistant/fc5134622b49d6911db85f57268fd4bbef57da8f/images/thumbnail.png

--------------------------------------------------------------------------------
/makefile:
--------------------------------------------------------------------------------
default: run
run:
	@sudo sh src/run_mac.sh

clean:
	@echo "Nothing to clean."

kill:
	@sudo pkill -f "thinking_siri_animation.py"

.PHONY: run clean kill

--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
# Automatically generated by https://github.com/damnever/pigar.
chromadb==0.5.0
iterm2==2.7
keyboard==0.13.5
langchain==0.1.20
langchain-community==0.0.38
langchain-openai==0.1.6
open-interpreter==0.2.5
openai==1.30.5
pvporcupine==3.0.2
pvrecorder==1.2.2
PyAudio==0.2.13
PyAutoGUI==0.9.54
pydantic==2.7.2
pyobjc-framework-Cocoa==9.0.1
pyperclip==1.8.2
python-dotenv==0.21.1
pyttsx3==2.90
sounddevice==0.4.5

--------------------------------------------------------------------------------
/src/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/0xrushi/Terminal-Voice-Assistant/fc5134622b49d6911db85f57268fd4bbef57da8f/src/__init__.py

--------------------------------------------------------------------------------
/src/animation/animation_controller.py:
--------------------------------------------------------------------------------
import subprocess
import time

process = subprocess.Popen(['python', 'src/animation/thinking_siri_animation.py'])

# Wait for 5 seconds
time.sleep(5)

process.terminate()
process.wait()
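# A minimal context-manager variant of the same start/stop pattern (sketch;
# assumes the script path above and is not wired into main.py):
#
# from contextlib import contextmanager
#
# @contextmanager
# def siri_animation():
#     proc = subprocess.Popen(['python', 'src/animation/thinking_siri_animation.py'])
#     try:
#         yield proc
#     finally:
#         proc.terminate()
#         proc.wait()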
--------------------------------------------------------------------------------
/src/animation/thinking_siri_animation.py:
--------------------------------------------------------------------------------
# # gif_player.py
# import sys
# from PyQt5.QtWidgets import QApplication, QLabel, QMainWindow
# from PyQt5.QtGui import QMovie, QPalette, QColor
# from PyQt5.QtCore import Qt, QSize

# class GIFPlayer(QMainWindow):
#     def __init__(self, app, gif_path="images/siri.gif", width=100, height=100):
#         super().__init__()

#         self.app = app
#         self.desired_width = width
#         self.desired_height = height

#         # Set window properties
#         self.setWindowFlags(Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint | Qt.Tool)
#         self.setAttribute(Qt.WA_TranslucentBackground)
#         self.setGeometry(100, 100, self.desired_width, self.desired_height)

#         self.label = QLabel(self)
#         self.label.setGeometry(0, 0, self.desired_width, self.desired_height)
#         self.movie = QMovie(gif_path)
#         self.movie.setScaledSize(QSize(self.desired_width, self.desired_height))
#         self.label.setMovie(self.movie)

#         # Start the GIF
#         self.movie.start()

#         # Set window background to be transparent
#         palette = self.palette()
#         palette.setColor(QPalette.Background, QColor(0, 0, 0, 0))
#         self.setPalette(palette)

#         # The application instance should not be created in the module.


import Cocoa
from PyObjCTools import AppHelper

class AppDelegate(Cocoa.NSObject):
    def applicationDidFinishLaunching_(self, aNotification):
        print("Application did finish launching")
        screen = Cocoa.NSScreen.mainScreen().frame().size
        self.window = Cocoa.NSWindow.alloc().initWithContentRect_styleMask_backing_defer_(
            ((0, screen.height-100), (100, 100)),
            Cocoa.NSWindowStyleMaskBorderless,
            Cocoa.NSBackingStoreBuffered, False)
        self.window.setTitle_("GIF Viewer")
        self.window.setLevel_(Cocoa.NSFloatingWindowLevel)
        self.window.setOpaque_(False)
        self.window.setBackgroundColor_(Cocoa.NSColor.clearColor())
        print("Window created")

        self.imageView = Cocoa.NSImageView.alloc().init()
        self.imageView.setFrame_(((0, 0), (100, 100)))
        self.imageView.setImageScaling_(Cocoa.NSImageScaleProportionallyUpOrDown)
        gifPath = "images/siri.gif"
        self.image = Cocoa.NSImage.alloc().initWithContentsOfFile_(gifPath)
        if self.image:
            print("GIF loaded successfully")
        else:
            print("Failed to load GIF")
        self.imageView.setImage_(self.image)
        self.window.contentView().addSubview_(self.imageView)
        self.window.makeKeyAndOrderFront_(None)
        self.showWindow()

    def showWindow(self):
        print("Showing window")
        self.window.setIsVisible_(True)

    def hideWindow(self):
        print("Hiding window")
        self.window.setIsVisible_(False)

if __name__ == "__main__":
    app = Cocoa.NSApplication.sharedApplication()
    delegate = AppDelegate.alloc().init()
    app.setDelegate_(delegate)
    AppHelper.runEventLoop()

--------------------------------------------------------------------------------
/src/chains/__init__.py:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/0xrushi/Terminal-Voice-Assistant/fc5134622b49d6911db85f57268fd4bbef57da8f/src/chains/__init__.py
--------------------------------------------------------------------------------
/src/chains/bash_chain/ask_user_yes_no_get_reason.py:
--------------------------------------------------------------------------------
import json
from openai import OpenAI
from src.utils.helpers import record_and_transcribe_plain_text, print_and_speak
from dotenv import load_dotenv

load_dotenv()

def chat(system_prompt, your_next_message="Did that work?"):
    """
    Executes the `ask_user_yes_no_get_reason` prompt and returns a structured response.

    Returns:
        dict: A dictionary with the following keys:
            - "response" (str): The user's answer, either "yes" or "no".
            - "reason" (str or None): The reason provided by the user, or None if no
              reason was given (a pleasantry such as "thank you" counts as no reason).
            - "your_next_message" (str): A message generated by the AI for the next interaction.
            - "exit" (bool): Indicates whether to exit the interaction. True if both a
              response and reason were provided, otherwise False.
    """

    client = OpenAI()

    print_and_speak(your_next_message)

    user_message = record_and_transcribe_plain_text().strip().lower()

    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message}
    ]

    content = {
        "response": None,
        "reason": None,
        "your_next_message": None,
        "exit": False
    }

    response = client.chat.completions.create(
        messages=messages,
        model="gpt-3.5-turbo",
    )

    try:
        x = response.choices[0].message.content
        if x.startswith("```json"):
            x = x.replace("```json", "")
            x = x.replace("```", "")

        content = json.loads(x)
    except Exception as e:
        print(response)
        print(e)
    # on failure, fall back to the empty default structure
    return content


def ask_user_yes_no_get_reason(your_next_message="Did that work?") -> dict:
    """
    Ask the user a question and get a detailed response including a reason if only yes/no is provided.

    Parameters:
        your_next_message (str): The question to ask the user.

    Returns:
        dict: JSON object containing the user's answer and reason.
    """

    PROMPT = """
**Task**: Interpret the user's response to assess if they have indicated "yes" or "no" and whether they have provided a reason.

**Steps**:
1. Check if the user's response includes "yes" or "no" and a reason.
   - **If both are present**: Return a JSON object containing the response and reason, with `exit=true`.
   - **If no reason is provided**: Return JSON with the reason as null, the response, and `exit=false`.
   - **If neither "yes" nor "no" is stated**: Ask the user again if the command worked.

---

### JSON Structure:

```json
{
    "response": "yes" or "no",
    "reason": "string",
    "your_next_message": "some message from you",
    "exit": true
}
```

### Example Interactions:

1. **User Response**: "Yes, because it was clear."
   - **JSON Output**:
     ```json
     {
         "response": "yes",
         "reason": "because it was clear",
         "your_next_message": "Ok, sounds good!",
         "exit": true
     }
     ```

2. **User Response**: "No."
   - **JSON Output**:
     ```json
     {
         "response": "no",
         "reason": null,
         "your_next_message": "Ok, let's try modifying the command!",
         "exit": false
     }
     ```

3. **User Response**: "Yes"
   - **JSON Output**:
     ```json
     {
         "response": "yes",
         "reason": null,
         "your_next_message": "That's great to hear!",
         "exit": true
     }
     ```
"""

    return chat(PROMPT, your_next_message)



if __name__ == "__main__":
    your_next_message = "did that work"
    result = ask_user_yes_no_get_reason(
        your_next_message=your_next_message
    )
    print(result)
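# Sketch (hypothetical helper, not in the codebase): the ```json fence
# stripping above is repeated in bash_chain.py; it could be factored out as:
#
# def strip_json_fences(text: str) -> str:
#     return text.replace("```json", "").replace("```", "").strip()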
--------------------------------------------------------------------------------
/src/chains/bash_chain/bash_chain.py:
--------------------------------------------------------------------------------
import json
from langchain_openai import OpenAI
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain
from dotenv import load_dotenv
from interpreter import interpreter
import re

from src.utils import osinfo
load_dotenv()

llm = OpenAI()

template_generate_tasks = """
You are an AI assistant that helps users convert their broken English descriptions into a sequence of tasks.

Analyze whether the statement describes multiple tasks or a single one. If there are multiple tasks, write a plan. **Always recap the plan between each code block** (you have extreme short-term memory loss, so you need to recap the plan between each message block to retain it).
When you execute code, it will be executed **on the user's machine**. The user has given you **full and complete permission** to execute any code necessary to complete the task. Execute the code.
You can access the internet. Run **any code** to achieve the goal, and if at first you don't succeed, try again and again.
You can install new packages.
When a user refers to a filename, they're likely referring to an existing file in the directory you're currently executing code in.
In general, try to **make plans** with as few steps as possible. As for actually executing code to carry out that plan, for *stateful* languages (like python, javascript, shell, but NOT for html which starts from 0 every time) it's critical not to try to do everything in one code block.
You are capable of **any** task.

Examples:

Input: "Make new folder called `project` in current place and cd into it"
Output: {{
    "message": "Does this look good?",
    "tasks": ["Make a folder called project.", "cd into project."]
}}

Remember, only reply in JSON in the following format.
{{
    "message": "",
    "tasks": ""
}}

Now here is a new command from the user for you:

Input: {instruction}
Output:
""".strip()

template = """
You are an AI assistant that helps users convert their broken English descriptions into valid Bash commands.
The user will provide a task in broken English, and your job is to interpret it and generate the correct Bash command.
Please ensure that the commands are accurate and efficient. Sometimes, users might describe paths using "slash" (e.g., "slash home slash download" should be interpreted as "/home/download").

User's OS details are {os_details}

Some relevant commands from the user's bash history are
{relevant_commands}

Respond only in JSON format with two keys: "message" (a short confirmation question) and "command" (the Bash command).

Examples:

Input: "Make new folder called `project` in current place."
Output: {{
    "message": "Does this look good?",
    "command": "mkdir project"
}}

Input: "Show me all files in here."
Output: {{
    "message": "Is this what you wanted?",
    "command": "ls"
}}

Input: "Move file `data.txt` to slash home slash download."
Output: {{
    "message": "Does this look right?",
    "command": "mv data.txt /home/download"
}}

Input: "List all files in slash var slash log."
Output: {{
    "message": "Is this the command you need?",
    "command": "ls /var/log"
}}

Input: "Install Pandas."
Output: {{
    "message": "Will this work?",
    "command": "pip install pandas"
}}

If the user agrees to the command, reply with the message "ok":

{{
    "message": "ok",
    "command": null
}}

If not, keep following the user's instructions to modify the command.

Now, please convert the following broken English instruction into a Bash command in JSON format:

Input: {instruction}
Output:"""

broken_command_template = """
**Prompt:**
You are a bash expert. Fix the broken bash command provided and return the corrected command in JSON format.

System Details:
{os_details}

Respond only with the Bash command in JSON format, where the key is "command" and the value is the Bash command.

**Example:**

**Input:**
Using `pip install chroma_db` I got the following error:
```
ERROR: Could not find a version that satisfies the requirement chroma_db (from versions: none)
ERROR: No matching distribution found for chroma_db
(base) ➜  Terminal-Voice-Assistant git:(main)
```

**Output:**
{{
    "command": "pip install chromadb"
}}

Please fix the Bash command and return it in JSON format.

**Input:**
```
{instruction}
{context}
```

**Output:**

"""

interpreter.system_message = """You are an AI assistant that helps users convert their broken English descriptions into valid Bash commands.
The user will provide a task in broken English, and your job is to interpret it and generate the correct Bash command.
Please ensure that the commands are accurate and efficient. Sometimes, users might describe paths using "slash" (e.g., "slash home slash download" should be interpreted as "/home/download")."""

template_openinterpreter_bash = """
**Write a Bash Command in JSON Format**

**Task:** Based on the provided instruction `{instruction}`, generate the corresponding Bash command. Do not verify the existence of any files; assume the user is only interested in the command. After evaluating the required tasks, format your response as JSON:

```
{{"message": "Does this look good?", "command": "your command"}}
```
The message can be any appropriate phrase that conveys the intent, similar to "Does this look good?"

**Relevant Bash History:** Include `{relevant_commands}` to reflect past commands that might be relevant to the task.

**User Interaction:**

- **If the user approves the command:** Respond with:
```
{{"message": "ok", "command": null}}
```

- **If the user requests modifications:** Continue to adjust the command according to the user's feedback until it meets their requirements.

**Objective:** Convert the instruction provided (in potentially non-standard English) into a correctly formatted Bash command encapsulated in JSON.
"""

os_details = osinfo.get_details()

# instruction = "install Pandas"

# Function to run the instruction and handle JSON parsing
def run(instruction: str, relevant_commands: str):
    prompt = PromptTemplate(input_variables=["instruction", "os_details", "relevant_commands"], template=template)
    llm_chain = LLMChain(llm=llm, prompt=prompt)
    output = llm_chain.invoke({"instruction": instruction, "os_details": os_details, "relevant_commands": relevant_commands})
    try:
        response_json = json.loads(output['text'])
        command = response_json.get('command', None)
        message = response_json.get('message', None)
        return message, command
    except json.JSONDecodeError:
        print("Failed to decode JSON response.")
    except KeyError:
        print("Expected key 'text' not found in the response.")
    return None, None

def generate_new(broken_command: str, context: str):
    prompt = PromptTemplate(input_variables=["instruction", "context", "os_details"], template=broken_command_template)
    filled_prompt = prompt.format(instruction=broken_command, context=context, os_details=os_details)

    print(filled_prompt)
    llm_chain = LLMChain(llm=llm, prompt=prompt)
    output = llm_chain.invoke({"instruction": broken_command, "os_details": os_details, "context": context})
    try:
        response_json = json.loads(output['text'])
        command = response_json.get('command', 'No command found')
        return command
    except json.JSONDecodeError:
        print("Failed to decode JSON response.")
    except KeyError:
        print("Expected key 'text' not found in the response.")

def decode_embedded_json(content):
    # Pattern to extract JSON block, optionally prefixed by ```json
    pattern = r'```(?:json)?\n([\s\S]*?)\n```'
    match = re.search(pattern, content)

    if match:
        json_text = match.group(1)
        try:
            data = json.loads(json_text)
            return data
        except json.JSONDecodeError as e:
            print(f"Error decoding JSON: {e}")
            return None
    else:
        print("No JSON found")
        return None


def run_openinterpreter(instruction: str, relevant_commands: str):
    prompt = PromptTemplate(input_variables=["instruction", "relevant_commands"], template=template_openinterpreter_bash)
    filled_prompt = prompt.format(instruction=instruction, relevant_commands=relevant_commands)
    print(filled_prompt)
    try:
        chunk = interpreter.chat(filled_prompt, display=False, stream=False)
        x = chunk[0]["content"]
        if x.startswith("```json"):
            x = x.replace("```json", "").replace("```", "")
        parsed = json.loads(x)
        return parsed["message"], parsed["command"]
    except json.JSONDecodeError:
        print("Failed to decode JSON response.")
    except KeyError:
        print("Expected key 'message' or 'command' not found in the response.")
    return None, None


# # # Example usage
if __name__ == "__main__":
    instruction = "install pandas 1.2.0"
    # context = """ERROR: Could not find a version that satisfies the requirement chroma_db (from versions: none)
    # ERROR: No matching distribution found for chroma_db
    # (base) ➜  Terminal-Voice-Assistant git:(main) """
    # print(generate_new(instruction, context))
    msg, x = run_openinterpreter("can you zip hello.py and send that zip to doraemon over scp", "")

    print("x")
    print(msg)
    print(x)
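# Sketch of a quick sanity check for decode_embedded_json (illustrative only,
# doctest-style; assumes it is run in this module's namespace):
#
# assert decode_embedded_json('```json\n{"command": "ls"}\n```') == {"command": "ls"}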
--------------------------------------------------------------------------------
/src/chains/general_chain.py:
--------------------------------------------------------------------------------
import json
from langchain_core.prompts import PromptTemplate
from langchain.chains import LLMChain
from dotenv import load_dotenv
from langchain_openai import OpenAI
from openai import OpenAI as OpenAIClient

from src.utils import osinfo
from src.utils.helpers import is_shutdown_request
load_dotenv()

llm = OpenAI()

# the template must contain the {instruction} placeholder for PromptTemplate
template = """
You are a helpful assistant.

{instruction}
""".strip()


def run(instruction: str):
    prompt = PromptTemplate(input_variables=["instruction"], template=template)
    llm_chain = LLMChain(llm=llm, prompt=prompt)
    output = llm_chain.invoke({"instruction": instruction})
    try:
        response_json = json.loads(output['text'])
        command = response_json.get('command', None)
        message = response_json.get('message', None)
        return message, command
    except json.JSONDecodeError:
        print("Failed to decode JSON response.")
    except KeyError:
        print("Expected key 'text' not found in the response.")



def chat(user_message, system_prompt=template):
    """
    Sends a message to the OpenAI API and returns the AI's response.

    Args:
        user_message (str): The user's input message.
        system_prompt (str): The system-level instructions for the model.

    Returns:
        dict: A dictionary with the AI's response parsed from the returned JSON.
    """
    client = OpenAIClient()
    messages = [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": user_message}
    ]

    # openai>=1.0 client API (the legacy openai.ChatCompletion.create was
    # being called here without even importing openai)
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=messages
    )

    content = response.choices[0].message.content

    try:
        content = json.loads(content)
    except json.JSONDecodeError:
        content = {"response": content}

    return content

if __name__ == "__main__":
    system_prompt = "You are a helpful assistant."
    user_message = "Hello, how can I improve my productivity?"

    result = chat(user_message, system_prompt)
    print(result)

    msg = is_shutdown_request("hey nami, stop")
    print(msg)
--------------------------------------------------------------------------------
/src/main.py:
--------------------------------------------------------------------------------
import argparse
import os
import time
import subprocess
from datetime import datetime

from openai import OpenAI
import pvporcupine
from pvrecorder import PvRecorder
from dotenv import load_dotenv
import colorlog
from colorlog import ColoredFormatter

from src.chains.bash_chain import bash_chain
from src.chains.bash_chain.ask_user_yes_no_get_reason import ask_user_yes_no_get_reason
from src.utils.chrmadb.generate_db import get_relevant_history
from src.utils.subprocess_caller import get_command_logs
from src.utils.helpers import print_and_speak, record_and_transcribe_plain_text, transcribe_audio, record_audio
# from src.animation import animation_controller
# from src.animation.thinking_siri_animation import GIFPlayer

load_dotenv()

handler = colorlog.StreamHandler()
formatter = ColoredFormatter(
    "%(log_color)s%(levelname)-8s%(reset)s %(blue)s%(message)s",
    datefmt=None,
    reset=True,
    log_colors={
        'DEBUG': 'cyan',
        'INFO': 'green',
        'WARNING': 'yellow',
        'ERROR': 'red',
        'CRITICAL': 'red,bg_white',
    },
    secondary_log_colors={},
    style='%'
)
handler.setFormatter(formatter)
logger = colorlog.getLogger('example')
logger.addHandler(handler)

transcription = ""
TAB_INDEX2 = 2
resp = {
    "reason": None
}

def wake_word_detection(access_key, keyword_paths, audio_device_index=-1):
    """
    Detects a wake word in the audio stream from the specified audio device.

    Args:
        access_key (str): The access key for the Porcupine service.
        keyword_paths (List[str]): The paths to the keyword model files.
        audio_device_index (int, optional): The index of the audio device to use. Defaults to -1.

    Returns:
        bool: True if a wake word is detected, False otherwise.

    Raises:
        Exception: If an error occurs during the wake word detection process.
    """
    try:
        # Create Porcupine object
        porcupine = pvporcupine.create(
            access_key=access_key,
            keyword_paths=keyword_paths)

        # Initialize PvRecorder
        recorder = PvRecorder(frame_length=porcupine.frame_length, device_index=audio_device_index)
        recorder.start()

        logger.info('Listening for wake word... (press Ctrl+C to exit)')
        while True:
            pcm = recorder.read()
            result = porcupine.process(pcm)
            if result >= 0:
                logger.debug('[%s] Wake word detected!' % str(datetime.now()))
                recorder.stop()
                return True
    except Exception as e:
        logger.warning(f"Error: {e}")
    finally:
        if 'recorder' in locals():
            recorder.delete()
        if 'porcupine' in locals():
            porcupine.delete()

def ask_yes_no_by_voice(prompt):
    """
    **Does not use an LLM; only Whisper transcription.**
    Asks a yes/no question via voice and expects a vocal response. The response is recorded and transcribed to determine if the answer is 'yes' or 'no'.

    Args:
        prompt (str): The question to ask the user via text-to-speech.

    Returns:
        bool: True if the user responded with 'yes', False if 'no', the string
        "STOP" if they asked to stop, and None if unclear.
    """
    client = OpenAI()

    print_and_speak(prompt)

    # note: os.getenv returns a string, so bool() on it is True even for the
    # value "False"; compare against the literal instead
    if os.getenv("TYPE_ONLY", "False").lower() != "true":
        filename = "confirmation.wav"
        record_audio(filename)

        with open(filename, "rb") as audio_file:
            confirmation = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
        response_text = confirmation.text.lower().strip()
        logger.debug(f"Transcribed confirmation: {response_text}")
    else:
        response_text = input("Enter your response y/n: ")

    # Determine the user's response based on transcription
    if "yes" in response_text:
        return True
    elif "no" in response_text or "não" in response_text:
        return False
    elif "stop" in response_text:
        return "STOP"
    else:
        return None

def type_text(text):
    try:
        # pane id indexing starts at 0
        command = f'source ~/.zshrc && echo -e "\\n{text}\\n" | wezterm cli send-text --pane-id 1 --no-paste'
        subprocess.run(['zsh', '-c', command], check=True, text=True)
    except subprocess.CalledProcessError as e:
        logger.error(f"Error occurred while sending text: {e}")
    except Exception as e:
        logger.error(f"Unexpected error occurred while sending text: {e}")

def run_command(command: str):
    time.sleep(0.5)
    cwd = os.getcwd()
    type_text(command)
    time.sleep(0.5)

def replace_quotes(text):
    bad_double_quotes = ['“', '”', '„']
    bad_single_quotes = ['‘', '’']

    for bad_double in bad_double_quotes:
        text = text.replace(bad_double, '"')

    for bad_single in bad_single_quotes:
        text = text.replace(bad_single, "'")

    return text

def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--keyword_paths', nargs='+', required=True, help='Absolute paths to keyword model files')
    parser.add_argument('--audio_device_index', type=int, default=-1)
    parser.add_argument('--show_audio_devices', action='store_true')

    args = parser.parse_args()
    global resp

    if args.show_audio_devices:
        for i, device in enumerate(PvRecorder.get_available_devices()):
            logger.info('Device %d: %s' % (i, device))
        return

    access_key = os.getenv("PICOVOICE_ACCESS_KEY")
    if wake_word_detection(access_key, args.keyword_paths, 1):
    # if True:
        message_ok = False
        # show the "thinking" animation while the request is handled
        process = subprocess.Popen(['python', 'src/animation/thinking_siri_animation.py'])
        is_first_run = True
        clean_command = ''
        while not message_ok:
            logger.debug(f"---\n{resp}\n{resp.get('reason')}")
            if resp.get("reason") is None:
                # Record audio in a separate thread
                if True:  # debug toggle: flip to take typed input instead
                    command = record_and_transcribe_plain_text()
                else:
                    command = input("enter command: ")

                command = clean_command + "\n\n" + command

            else:
                command = clean_command + "\n\n" + resp.get("reason")

            if is_first_run:
                historical_commands_from_rag = get_relevant_history(command)
                # uncomment this to enable openai instead of open interpreter
                # message, clean_command = bash_chain.run(command, historical_commands_from_rag)
                message, clean_command = bash_chain.run_openinterpreter(command, historical_commands_from_rag)
                clean_command = replace_quotes(clean_command)
                logger.info(clean_command)
                message_ok = message == "ok"

            # ai asks: should i run it
            command_success = ask_yes_no_by_voice(message)

            if command_success == True:
                run_command(clean_command)

            # ask_confirm = ask_yes_no_by_voice(msg)
            resp = ask_user_yes_no_get_reason()
            logger.debug(f"received resp: {resp}")
            if resp.get("exit") == True and resp.get("response") == "yes" and resp.get("reason") is None:
                print_and_speak(resp.get("your_next_message"))
                message_ok = True
            elif (resp.get("response") == "yes" and resp.get("reason") is not None) or (resp.get("response") == "no"):
                message_ok = False

        process.terminate()
        process.wait()
    resp = {
        "reason": None
    }

if __name__ == '__main__':
    # run_command(f"script -a {os.getcwd()}/command_logs.txt")
    while True:
        main()
    run_command("exit")
    # sys.exit(app.exec_())
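# Note (assumption): type_text() hard-codes --pane-id 1; if commands land in
# the wrong split, `wezterm cli list` prints the available pane ids so the
# value above can be adjusted.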
--------------------------------------------------------------------------------
/src/run_mac.sh:
--------------------------------------------------------------------------------
# NOTE: adjust this path to the location of your local checkout
export PYTHONPATH=$PYTHONPATH:/Users/bread/Documents/Terminal-Voice-Assistant

python src/main.py --keyword_paths ./src/wakewords/hey-nami_en_mac_v3_0_0.ppn --audio_device_index 1

--------------------------------------------------------------------------------
/src/runme.bat:
--------------------------------------------------------------------------------
@echo off
REM Activate the Conda environment
call activate base

REM Run the Python script with specified parameters
python main.py --keyword_paths ./wakewords/hey-nami_en_windows_v3_0_0.ppn --audio_device_index 3

REM
pause

--------------------------------------------------------------------------------
/src/utils/chrmadb/generate_db.py:
--------------------------------------------------------------------------------
import chromadb
import subprocess
import os
import logging
from dotenv import load_dotenv
import pandas as pd

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

load_dotenv()

# not using this right now for further cleaning
# def get_zsh_history():
#     """
#     Return zsh shell history
#     """
#     history_file = os.path.expanduser('~/.zsh_history')
#     result = subprocess.run(['cat', history_file], stdout=subprocess.PIPE, stderr=subprocess.PIPE, text=True)
#
#     if result.returncode == 0:
#         return result.stdout
#     else:
#         print(f"Error: {result.stderr}")
#         return None

chroma_client = chromadb.PersistentClient(path="./chromadbpath")
collection_name = "my_collection"

existing_collections = [i.name for i in chroma_client.list_collections()]
logger.info(existing_collections)
if collection_name in existing_collections:
    collection = chroma_client.get_collection(name=collection_name)
else:
    collection = chroma_client.create_collection(name=collection_name)

try:
    with open("src/utils/chrmadb/abc.txt", "r") as f:
        text = f.read()
except FileNotFoundError:
    logger.error("File not found.")
    text = ""
except Exception as e:
    logger.error(f"An error occurred: {e}")
    text = ""

lines = text.strip().split('\n')
# documents = list(set([line.split(maxsplit=1)[1] for line in lines if len(line) > 5]))
df = pd.DataFrame({'line': lines})
documents = list(df['line'].str.split(n=1, expand=True).dropna()[1].unique())

collection.upsert(
    documents=documents,
    ids=[str(i) for i in range(len(documents))]
)

def get_relevant_history(query="ssh to doraemon"):
    results = collection.query(
        query_texts=[query],
        n_results=10
    )
    relevant_history = '\n'.join(results['documents'][0])
    logger.info(f"Retrieved relevant history: {relevant_history}\n")
    return relevant_history

if __name__ == "__main__":
    print(get_relevant_history("cd documents/chatapp-react"))
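# Note (assumption): src/utils/chrmadb/abc.txt is gitignored and is expected
# to contain shell-history lines of the form "<number> <command>", one per
# line, matching the split(n=1) parsing above which keeps only the command.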
68 |         {"role": "user", "content": prompt},
69 |         {"role": "assistant", "content": "Respond with 'true' or 'false'."}
70 |     ]
71 |     response = client.chat.completions.create(
72 |         messages=messages,
73 |         model="gpt-3.5-turbo",
74 |     )
75 | 
76 |     content = response.choices[0].message.content.strip().lower()
77 | 
78 |     if "true" in content:
79 |         return True
80 |     else:
81 |         return False
82 | 
83 | def record_and_transcribe_plain_text():
84 |     record_thread = threading.Thread(target=record_audio, args=("recorded.wav",))
85 |     record_thread.start()
86 |     record_thread.join()
87 | 
88 |     response = transcribe_audio("recorded.wav")
89 |     return response
90 | 
91 | def print_and_speak(msg: str):
92 |     global engine
93 |     logger.info(msg)
94 |     if os.getenv("TYPE_ONLY", "False").lower() != "true":  # env values are strings, so bool("False") would be True
95 |         engine.say(msg)
96 |         engine.runAndWait()
97 | 
98 | 
99 | def record_audio(filename, sample_rate=16000, chunk_size=512, channels=1, device_id=None):
100 |     """
101 |     Records audio from the default microphone input until the 'esc' key is pressed.
102 |     The audio is recorded in WAV format and saved to the specified file.
103 | 
104 |     Args:
105 |         filename (str): The path to the file where the recorded audio will be saved.
106 |                         The file extension should be .wav to indicate the format.
107 |         sample_rate (int): The sample rate of the audio recording in Hertz.
108 |                            Common rates include 44100 (CD), 48000 (audio for video), and 16000 (telephony). Default is 16000.
109 |         chunk_size (int): The number of audio frames per buffer. A larger size reduces CPU usage but increases latency.
110 |                           Default is 512.
111 |         channels (int): The number of audio channels (1 for mono, 2 for stereo). Default is 1 (mono).
112 |         device_id (int, optional): The ID of the audio input device to use. Default is None, which uses the default device.
113 |     """
114 |     p = pyaudio.PyAudio()
115 | 
116 |     if device_id is None:
117 |         device_id = p.get_default_input_device_info()['index']
118 | 
119 |     stream = p.open(format=pyaudio.paInt16,
120 |                     channels=channels,
121 |                     rate=sample_rate,
122 |                     input=True,
123 |                     input_device_index=device_id,
124 |                     frames_per_buffer=chunk_size)
125 | 
126 |     frames = []
127 |     print("Recording... (press 'esc' to stop)")
128 | 
129 |     recording = True
130 |     def stop_recording():
131 |         nonlocal recording
132 |         recording = False
133 | 
134 |     keyboard.add_hotkey('esc', stop_recording, suppress=True)
135 | 
136 |     try:
137 |         while recording:
138 |             data = stream.read(chunk_size, exception_on_overflow=False)
139 |             frames.append(data)
140 |     finally:
141 |         print("Recording stopped.")
142 |         stream.stop_stream()
143 |         stream.close()
144 |         p.terminate()
145 | 
146 |     # Save the recorded data as a WAV file
147 |     wf = wave.open(filename, 'wb')
148 |     wf.setnchannels(channels)
149 |     wf.setsampwidth(p.get_sample_size(pyaudio.paInt16))
150 |     wf.setframerate(sample_rate)
151 |     wf.writeframes(b''.join(frames))
152 |     wf.close()
153 | 
154 |     keyboard.unhook_all_hotkeys()
155 | 
156 | def transcribe_audio(filename):
157 |     """
158 |     Transcribes the audio file specified by the given filename using the OpenAI API.
159 | 
160 |     Args:
161 |         filename (str): The path to the audio file to be transcribed.
162 | 
163 |     Returns:
164 |         transcription (str): The transcription of the audio file.
165 |     """
166 | 
167 |     client = OpenAI()
168 |     couldnt_understand = False
169 |     transcription = None
170 |     while True:
171 |         is_correct = None
172 |         if not couldnt_understand:
173 |             with open(filename, "rb") as audio_file:
174 |                 transcription = client.audio.transcriptions.create(model="whisper-1", file=audio_file)
175 |             print(transcription.text)
176 | 
177 |         prompt = "Is this transcription correct?" if not couldnt_understand else ""
178 |         # is_correct = ask_yes_no_by_voice(prompt), commenting to skip one ask user check
179 |         is_correct = True
180 | 
181 |         if is_correct:
182 |             return transcription.text
183 |         elif is_correct is False:
184 |             couldnt_understand = False
185 |             msg = "Let's try recording the main command again."
186 |             print_and_speak(msg)
187 |             record_audio(filename)
188 |         else:
189 |             msg = "I couldn't understand. Please say 'yes' or 'no' clearly."
190 |             print_and_speak(msg)
191 |             couldnt_understand = True
192 | 
193 | if __name__ == "__main__":
194 |     msg = is_shutdown_request("hey nami, cancel this command")
195 |     print(msg)
196 | 
--------------------------------------------------------------------------------
/src/utils/iterm2/get_iterm2_sessions.py:
--------------------------------------------------------------------------------
1 | import asyncio
2 | import iterm2
3 | import time
4 | tab_titles = []
5 | 
6 | async def get_all_sessions(app):
7 |     """
8 |     Prints the titles of all windows, tabs, and sessions in the iTerm2 application.
9 | 
10 |     This function iterates over all windows, tabs, and sessions in the iTerm2 application
11 |     and prints their titles.
12 | 
13 |     Parameters:
14 |         app (iterm2.App): The iTerm2 application instance.
15 |     """
16 |     global tab_titles
17 |     for window in app.windows:
18 |         window_title = await window.async_get_variable("title")
19 |         print("Window title: %s" % (window_title))
20 |         for tab in window.tabs:
21 |             tab_title = await tab.async_get_variable("title")
22 |             print("\tTab title: %s" % (tab_title))
23 |             tab_titles.append(tab_title)
24 |             for session in tab.sessions:
25 |                 session_title = await session.async_get_variable("name")
26 |                 print("\t\tSession title: %s" % (session_title))
27 | 
28 | async def main_get_all_sessions(connection):
29 |     app = await iterm2.async_get_app(connection)
30 |     await get_all_sessions(app)
31 | 
32 | 
33 | def get_iterm2_titles():
34 |     global tab_titles
35 |     iterm2.run_until_complete(main_get_all_sessions)
36 |     tabtitles_copy = [i for i in tab_titles]
37 |     tab_titles = []
38 |     return tabtitles_copy
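39 | 
40 | # # Example usage (a minimal sketch): assumes iTerm2 is running with its Python
41 | # # API enabled, otherwise iterm2.run_until_complete cannot connect
42 | if __name__ == "__main__":
43 |     print(get_iterm2_titles())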
--------------------------------------------------------------------------------
/src/utils/iterm2/iterm2_focus.py:
--------------------------------------------------------------------------------
1 | import asyncio
2 | import iterm2
3 | import time
4 | import pyautogui
5 | from AppKit import NSWorkspace, NSRunningApplication
6 | from functools import partial
7 | 
8 | def focus_iterm2():
9 |     """
10 |     Brings iTerm2 application to the foreground using AppKit.
11 | 
12 |     This function iterates over the running applications and activates iTerm2 if it is found.
13 |     """
14 |     ws = NSWorkspace.sharedWorkspace()
15 |     apps = ws.runningApplications()
16 |     for app in apps:
17 |         if app.localizedName() == "iTerm2":
18 |             app.activateWithOptions_(1 << 1)  # NSApplicationActivateIgnoringOtherApps
19 |             break
20 | 
21 | async def focus_on_specific_session(connection, target_window_title, target_tab_title, target_session_title):
22 |     """
23 |     Focuses on a specific session in iTerm2.
24 | 
25 |     This function navigates through the iTerm2 application structure to focus on the specified
26 |     window, tab, and session based on their titles.
27 | 
28 |     Parameters:
29 |         connection (iterm2.Connection): The connection to the iTerm2 application.
30 |         target_window_title (str or None): The title of the target window. Use None if the window title is None.
31 |         target_tab_title (str): The title of the target tab.
32 |         target_session_title (str): The title of the target session.
33 |     """
34 |     app = await iterm2.async_get_app(connection)
35 |     for window in app.windows:
36 |         window_title = await window.async_get_variable("title")
37 |         if window_title == target_window_title or (target_window_title is None and window_title is None):
38 |             # Focus on the window
39 |             await window.async_activate()
40 |             for tab in window.tabs:
41 |                 tab_title = await tab.async_get_variable("title")
42 |                 if tab_title == target_tab_title:
43 |                     # Focus on the tab
44 |                     await tab.async_select()
45 |                     for session in tab.sessions:
46 |                         session_title = await session.async_get_variable("name")
47 |                         if session_title == target_session_title:
48 |                             # Focus on the session
49 |                             await session.async_activate()
50 |                             print(f"Focused on window: {window_title}, tab: {tab_title}, session: {session_title}")
51 |                             return
52 | 
53 | async def focus_context(connection, target_window_title, target_tab_title, target_session_title):
54 |     """
55 |     Main function to initiate focusing on the specified iTerm2 session and ensure iTerm2 stays in front.
56 | 
57 |     This function specifies the titles of the target window, tab, and session and calls the function to focus
58 |     on the specific session. It also ensures iTerm2 is brought to the foreground again after focusing.
59 | 
60 |     Parameters:
61 |         connection (iterm2.Connection): The connection to the iTerm2 application.
62 |         target_window_title (str or None): The title of the target window. Use None if the window title is None.
63 |         target_tab_title (str): The title of the target tab.
64 |         target_session_title (str): The title of the target session.
65 |     """
66 |     await focus_on_specific_session(connection, target_window_title, target_tab_title, target_session_title)
67 | 
68 |     # Ensure iTerm2 is brought to the foreground
69 |     focus_iterm2()
70 | 
71 | def switch_tab(ind):
72 |     if not ind:
73 |         raise ValueError("switch_tab requires a non-empty tab index")
74 |     time.sleep(0.3)
75 |     pyautogui.keyDown('alt')
76 |     pyautogui.press(str(ind))
77 |     pyautogui.keyUp('alt')
78 |     time.sleep(0.3)
79 | 
80 | def focus_iterm2_run(target_window_title=None, target_tab_title="My Custom Title (zsh)", target_session_title="My Custom Title (zsh)"):
81 |     time.sleep(2)
82 |     iterm2.run_until_complete(partial(focus_context, target_window_title=target_window_title,
83 |                                       target_tab_title=target_tab_title, target_session_title=target_session_title))
84 | 
--------------------------------------------------------------------------------
/src/utils/iterm2/iterm2_launcher.py:
--------------------------------------------------------------------------------
1 | import asyncio
2 | import iterm2
3 | 
4 | # Color presets to use
5 | LIGHT_PRESET_NAME = "Light Background"
6 | DARK_PRESET_NAME = "Dark Background"
7 | PROFILES = ["Default"]
8 | 
9 | async def set_colors(connection, preset_name):
10 |     print("Change to preset {}".format(preset_name))
11 |     preset = await iterm2.ColorPreset.async_get(connection, preset_name)
12 |     for partial in (await iterm2.PartialProfile.async_query(connection)):
13 |         if partial.name in PROFILES:
14 |             await partial.async_set_color_preset(preset)
15 | 
16 | async def launch_iterm2_with_custom_title(connection):
17 |     try:
18 |         app = await iterm2.async_get_app(connection)
19 |         new_window = await iterm2.Window.async_create(connection)
20 |         if new_window:
21 |             # Get the current session in the new window
22 |             session1 = new_window.current_tab.current_session
23 | 
24 |             # Ensure the session is valid
25 |             if session1:
26 |                 await session1.async_set_name("My Custom Title")
27 |                 print("Custom title for session1 set successfully")
28 | 
29 |                 # Split the tab vertically
30 |                 session2 = await session1.async_split_pane(vertical=True)
31 | 
32 |                 if session2:
33 |                     await session2.async_set_name("My Custom Title2")
34 |                     print("Custom title for session2 set successfully")
35 | 
36 |                 await set_colors(connection, DARK_PRESET_NAME)
37 |                 await asyncio.sleep(1)
38 | 
39 |                 # Bring the new window to the front
40 |                 await new_window.async_activate()
41 |                 # Activate the iTerm2 application
42 |                 await app.async_activate()
43 |             else:
44 |                 print("Session1 not found")
45 |         else:
46 |             print("New window could not be created")
47 |     except Exception as e:
48 |         print(f"An error occurred: {e}")
49 | 
50 | def launch_iterm2():
51 |     iterm2.run_until_complete(launch_iterm2_with_custom_title, retry=True)
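52 | 
53 | # # Example usage (a minimal sketch): assumes iTerm2 is installed with the
54 | # # Python API enabled and a "Default" profile exists; opens a new window
55 | # # with a vertical split and the custom session titles set above
56 | if __name__ == "__main__":
57 |     launch_iterm2()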
--------------------------------------------------------------------------------
/src/utils/open_interpreter/open_interpreter_returnable.py:
--------------------------------------------------------------------------------
1 | import re
2 | import pyperclip
3 | import logging
4 | from interpreter import interpreter
5 | # from langchain.llms import OpenAI
6 | from langchain_community.llms import OpenAI
7 | from langchain.prompts import PromptTemplate
8 | from langchain.chains import LLMChain
9 | import time
10 | import iterm2
11 | from functools import partial
12 | import pyautogui
13 | 
14 | from src.utils.chrmadb.generate_db import get_relevant_history
15 | from src.utils.subprocess_caller import run_get_sessions
16 | from src.utils.iterm2.iterm2_focus import focus_context, focus_iterm2
17 | from dotenv import load_dotenv
18 | 
19 | import warnings
20 | warnings.filterwarnings("ignore")
21 | 
22 | logging.basicConfig(level=logging.INFO)
23 | logger = logging.getLogger(__name__)
24 | 
25 | load_dotenv()
26 | llm = OpenAI()
27 | 
28 | template = """
29 | You are a helpful assistant. The shell history below may or may not be useful for
30 | answering the question. Remember: return only the command, nothing else.
31 | 
32 | Shell history:
33 | {history}
34 | 
35 | Answer the following question:
36 | 
37 | Question: {question}
38 | 
39 | Answer:
40 | """
41 | 
42 | def call_openinterpreter(query: str):
43 |     """
44 |     Calls the OpenAI interpreter with a given query, formats the prompt with relevant shell history,
45 |     and copies the cleaned response to the clipboard.
46 | 
47 |     Args:
48 |         query (str): The question to be asked.
49 | 
50 |     Returns:
51 |         str: The cleaned response from the OpenAI interpreter.
52 |     """
53 |     try:
54 |         logger.info("Fetching relevant shell history for the query.")
55 |         history = get_relevant_history(query)
56 |         logger.debug("Relevant history:\n%s", history)
57 | 
58 |         prompt = str(template.format(history=history, question=query))
59 | 
60 |         logger.info("Sending prompt to the OpenAI interpreter.")
61 |         resp = interpreter.chat(prompt, stream=False, display=True)[0]['content']
62 |         logger.info(f"Interpreter output {resp}")
63 | 
64 |         logger.info("Cleaning the response from the interpreter.")
65 |         cleaned_resp = re.sub(r'```shell\n|```', '', resp).strip()
66 | 
67 |         logger.info("Copying the cleaned response to the clipboard.")
68 |         pyperclip.copy(cleaned_resp)
69 | 
70 |         target_window_title = None
71 |         target_tab_title = "~/Documents (zsh)"
72 |         target_session_title = "~/Documents (zsh)"
73 | 
74 |         time.sleep(1)
75 | 
76 |         iterm2.run_until_complete(partial(focus_context, target_window_title=target_window_title,
77 |                                           target_tab_title=target_tab_title, target_session_title=target_session_title))
78 | 
79 |         time.sleep(1)
80 | 
81 |         focus_iterm2()
82 |         time.sleep(1)
83 |         pyautogui.hotkey('command', 'v')  # pyperclip.paste() only reads the clipboard; send Cmd+V so the copied command lands in the focused session
84 | 
85 | 
86 |         return cleaned_resp
87 |     except Exception as e:
88 |         logger.error("An error occurred while processing the query: %s", e)
89 |         raise
90 | 
91 | # # Example usage
92 | if __name__ == "__main__":
93 |     query = "cd crewAI"
94 |     response = call_openinterpreter(query)
95 |     print("Response:", response)
--------------------------------------------------------------------------------
/src/utils/osinfo.py:
--------------------------------------------------------------------------------
1 | import subprocess
2 | import platform
3 | import json
4 | 
5 | def get_python_version():
6 |     try:
7 |         result = subprocess.run(['python', '--version'], capture_output=True, text=True, check=True)
8 |         return result.stdout.strip()
9 |     except subprocess.CalledProcessError as e:
10 |         return f"Failed to get Python version: {e}"
11 | 
12 | def get_conda_version():
13 |     try:
14 |         result = subprocess.run(['conda', '--version'], capture_output=True, text=True, check=True)
15 |         return result.stdout.strip()
16 |     except subprocess.CalledProcessError as e:
17 |         return f"Failed to get Conda version: {e}"
18 | 
19 | def get_os_version():
20 |     try:
21 |         os_info = platform.uname()
22 |         return {
23 |             "system": os_info.system,
24 |             "machine": os_info.machine,
25 |         }
26 |     except Exception as e:
27 |         return f"Failed to get OS version: {e}"
28 | 
29 | def get_details():
30 |     info = {
31 |         "Python Version": get_python_version(),
32 |         "Conda Version": get_conda_version(),
33 |         "OS Version": get_os_version()
34 |     }
35 | 
36 |     return json.dumps(info)
37 | 
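38 | # # Example usage (a minimal sketch): get_details() returns a JSON string, so
39 | # # callers that need a dict should json.loads() it
40 | if __name__ == "__main__":
41 |     print(get_details())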
-------------------------------------------------------------------------------- /src/utils/query_router.py: -------------------------------------------------------------------------------- 1 | from llama_index.llms.openai import OpenAI 2 | from dataclasses import fields 3 | from pydantic import BaseModel, Field 4 | from typing import List 5 | import json 6 | from llama_index.core.types import BaseOutputParser 7 | from llama_index.core import PromptTemplate 8 | from llama_index.program.openai import OpenAIPydanticProgram 9 | from dotenv import load_dotenv 10 | 11 | load_dotenv() 12 | llm = OpenAI(model="gpt-3.5-turbo") 13 | 14 | # Define the choices for routing queries 15 | choices = [ 16 | "Useful for questions related to any simple command or script or file operations or install uninstall or clone operations", 17 | "Useful for questions related to complex commands like 'zip and scp this file to this server', 'clone this repo, create python env and install requirements'", 18 | "Useful for questions related to everything else", 19 | ] 20 | 21 | # Define the format string for output 22 | FORMAT_STR = """The output should be formatted as a JSON instance that conforms to 23 | the JSON schema below. 24 | 25 | Here is the output schema: 26 | { 27 | "type": "array", 28 | "items": { 29 | "type": "object", 30 | "properties": { 31 | "choice": { 32 | "type": "integer" 33 | }, 34 | "reason": { 35 | "type": "string" 36 | } 37 | }, 38 | "required": [ 39 | "choice", 40 | "reason" 41 | ], 42 | "additionalProperties": false 43 | } 44 | } 45 | """ 46 | 47 | class Answer(BaseModel): 48 | """ 49 | Represents a single choice with a reason. 50 | 51 | Attributes: 52 | choice (int): The choice index. 53 | reason (str): The reason for the choice. 54 | """ 55 | choice: int 56 | reason: str 57 | 58 | class Answers(BaseModel): 59 | """ 60 | Represents a list of answers. 61 | 62 | Attributes: 63 | answers (List[Answer]): A list of Answer objects. 64 | """ 65 | answers: List[Answer] 66 | 67 | # --- 68 | def _escape_curly_braces(input_string: str) -> str: 69 | """ 70 | Escapes curly braces in the input string. 71 | 72 | Args: 73 | input_string (str): The input string. 74 | 75 | Returns: 76 | str: The escaped string. 77 | """ 78 | escaped_string = input_string.replace("{", "{{").replace("}", "}}") 79 | return escaped_string 80 | 81 | def _marshal_output_to_json(output: str) -> str: 82 | """ 83 | Marshals the output string to JSON format. 84 | 85 | Args: 86 | output (str): The output string. 87 | 88 | Returns: 89 | str: The JSON formatted string. 90 | """ 91 | output = output.strip() 92 | left = output.find("[") 93 | right = output.find("]") 94 | output = output[left : right + 1] 95 | return output 96 | # --- 97 | 98 | 99 | def get_choice_str(choices): 100 | """ 101 | Formats the choices as a numbered string. 102 | 103 | Args: 104 | choices (List[str]): The list of choices. 105 | 106 | Returns: 107 | str: The formatted string of choices. 108 | """ 109 | choices_str = "\n\n".join( 110 | [f"{idx+1}. {c}" for idx, c in enumerate(choices)] 111 | ) 112 | return choices_str 113 | 114 | def get_formatted_prompt(query_str): 115 | """ 116 | Formats the prompt with the query string. 117 | 118 | Args: 119 | query_str (str): The query string. 120 | 121 | Returns: 122 | str: The formatted prompt. 
123 |     """
124 |     fmt_prompt = router_prompt0.format(
125 |         num_choices=len(choices),
126 |         max_outputs=2,
127 |         context_list=choices_str,
128 |         query_str=query_str,
129 |     )
130 |     return fmt_prompt
131 | 
132 | # Define the router prompt template
133 | router_prompt0 = PromptTemplate(
134 |     template="""Some choices are given below. It is provided in a numbered list (1 to
135 | {num_choices}), where each item in the list corresponds to a
136 | summary.\n---------------------\n{context_list}\n---------------------\nUsing
137 | only the choices above and not prior knowledge, return the top choices
138 | (no more than {max_outputs}, but only select what is needed) that are
139 | most relevant to the question: '{query_str}'\n"""
140 | )
141 | 
142 | def get_route(query_str="open vscode"):
143 |     """
144 |     Gets the route for the given query.
145 | 
146 |     Args:
147 |         query_str (str): The query string. Default is "open vscode".
148 | 
149 |     Returns:
150 |         int: The choice index for the query.
151 |     """
152 |     program = OpenAIPydanticProgram.from_defaults(
153 |         output_cls=Answers,
154 |         prompt=router_prompt1,
155 |         verbose=False,
156 |     )
157 |     output = program(context_list=choices_str, query_str=query_str)
158 |     return output.answers[0].choice
159 | 
160 | 
161 | choices_str = get_choice_str(choices)
162 | router_prompt1 = router_prompt0.partial_format(
163 |     num_choices=len(choices),
164 |     max_outputs=len(choices),
165 | )
166 | if __name__ == "__main__":
167 |     print(get_route(query_str="open vscode"))
--------------------------------------------------------------------------------
/src/utils/subprocess_caller.py:
--------------------------------------------------------------------------------
1 | import subprocess
2 | 
3 | def run_get_sessions():
4 |     """
5 |     Runs the get_iterm2_sessions.py script as a subprocess.
6 |     """
7 |     try:
8 |         result = subprocess.run(['python', 'src/utils/iterm2/get_iterm2_sessions.py'], capture_output=True, text=True)
9 |         print(result.stdout)
10 |         if result.stderr:
11 |             print("Errors:", result.stderr)
12 |         return result.stdout
13 |     except Exception as e:
14 |         print(f"An error occurred: {e}")
15 | 
16 | def get_command_logs():
17 |     """
18 |     Return the last 20 lines of command_logs.txt.
19 |     """
20 |     try:
21 |         result = subprocess.run(['tail', '-n', '20', 'command_logs.txt'], capture_output=True, text=True)
22 |         print(result.stdout)
23 |         return result.stdout
24 |         # print("Errors:")
25 |         # print(result.stderr)
26 |     except Exception as e:
27 |         print(f"An error occurred: {e}")
28 | 
29 | # # Example usage
30 | if __name__ == "__main__":
31 |     get_command_logs()
--------------------------------------------------------------------------------
/src/wakewords/hey-nami_en_mac_v3_0_0.ppn:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/0xrushi/Terminal-Voice-Assistant/fc5134622b49d6911db85f57268fd4bbef57da8f/src/wakewords/hey-nami_en_mac_v3_0_0.ppn
--------------------------------------------------------------------------------
/src/wakewords/hey-nami_en_windows_v3_0_0.ppn:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/0xrushi/Terminal-Voice-Assistant/fc5134622b49d6911db85f57268fd4bbef57da8f/src/wakewords/hey-nami_en_windows_v3_0_0.ppn
--------------------------------------------------------------------------------