├── requirements-core.txt
├── requirements-agent.txt
├── LICENSE
├── README.md
├── matlab
│   ├── real_time_clap_detection.m
│   ├── detect_clap_sound.m
│   └── record
│       └── record_wav_file.py
├── two_claps_open.py
├── .gitignore
└── agent_on_clap.py

/requirements-core.txt:
--------------------------------------------------------------------------------
# Core dependencies for two-claps-open
numpy
pyaudio
scipy
pywin32  # Windows-only; used by the optional PowerPoint launcher
--------------------------------------------------------------------------------
/requirements-agent.txt:
--------------------------------------------------------------------------------
# Optional dependencies for the voice agent extension
PyAudio
pygame
SpeechRecognition
python-dotenv
gTTS
langchain
langchain-core
langchain-google-genai
scipy
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2025 Yutarop

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# `two_claps_open` 👏
Open Chrome (or any file/app) by simply clapping twice — just like [Tony Stark](https://www.youtube.com/watch?v=OT2b5KzMoC0&t=101s).

#### 🔸Open a presentation slide and Chrome by clapping twice (two_claps_open.py)

https://github.com/user-attachments/assets/15849e57-b662-4096-8f67-f8937ae61711

#### 🔸Activate a voice assistant agent by clapping twice (agent_on_clap.py)

https://github.com/user-attachments/assets/3b0ef232-7229-4ca4-9d57-f2fc325295bc

## Clap Detection
To figure out what a clap "looks like" in terms of sound, I first recorded a sample clap and ran a Fourier transform on it to check its frequency content.
The analysis showed that most of the clap's energy is concentrated between 1.4 kHz and 1.8 kHz.
Based on this, I set up a bandpass filter to isolate that range and ignore irrelevant noise.
After filtering, I used peak detection to recognize when a clap happens in real time (see the figure below).
Once two peaks (claps) are detected with some minimum spacing, the system launches Chrome (or any command you define).

![fig](https://github.com/user-attachments/assets/fa15cd8d-8690-4a86-b878-273dbac2f241)
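If you want to reproduce the frequency analysis in Python rather than MATLAB, here is a minimal sketch (not part of the repo; it assumes a mono 16-bit WAV, and `clap.wav` is a placeholder for a recording made with `matlab/record/record_wav_file.py`):

```python
# spectrum_check.py (hypothetical helper) — offline frequency check of a recorded clap.
import numpy as np
from scipy.io import wavfile

rate, samples = wavfile.read("clap.wav")        # placeholder filename
samples = samples.astype(np.float32) / 32768.0  # normalize 16-bit PCM to [-1, 1]

spectrum = np.abs(np.fft.rfft(samples))         # one-sided amplitude spectrum
spectrum[0] = 0.0                               # ignore the DC component
freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)

print(f"Dominant frequency: {freqs[np.argmax(spectrum)]:.0f} Hz")
```

For the claps measured here, the dominant energy fell in the 1.4–1.8 kHz band used by the filters below.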
## Dependencies
##### 🗂️ requirements-core.txt (for `two_claps_open.py`)
```text
numpy
pyaudio
scipy
pywin32
```

##### 🗂️ requirements-agent.txt (for `agent_on_clap.py`)
```text
PyAudio
pygame
SpeechRecognition
python-dotenv
gTTS
langchain
langchain-core
langchain-google-genai
scipy
```
> 💡 Don’t forget to set your Google API key as `GOOGLE_API_KEY` in a `.env` file. Google offers a free tier for API usage.
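To confirm the key is picked up before launching the agent, a quick check (a sketch, not part of the repo; it assumes `python-dotenv` is installed and the key is stored under `GOOGLE_API_KEY`, the variable read by `langchain-google-genai`):

```python
# check_env.py (hypothetical helper) — verify the .env file is found and the key loads.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
print("GOOGLE_API_KEY set:", bool(os.getenv("GOOGLE_API_KEY")))
```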
## For Windows users (optional)
Launch the script with `Ctrl + Alt + P` by combining a batch file with an AutoHotkey binding:
#### .bat
```bat
@echo off
cd /d C:\your\file\path
pipenv run python two_claps_open.py
```

#### .ahk
```ahk
^!p::Run "C:\your\file\path\run_python.bat"
```
--------------------------------------------------------------------------------
/matlab/real_time_clap_detection.m:
--------------------------------------------------------------------------------
% Real-time audio processing (detecting two claps, no graphs)

% Setting up the audio input device
Fs = 44100; % Sampling frequency (e.g., 44.1kHz)
frameDuration = 0.02; % Frame length (20ms)
frameSize = round(Fs * frameDuration); % Frame size (number of samples)
audioIn = audioDeviceReader('SampleRate', Fs, 'SamplesPerFrame', frameSize);

% Designing the bandpass filter (1.4kHz to 1.8kHz)
f_low = 1400; % Lower cutoff frequency (Hz)
f_high = 1800; % Upper cutoff frequency (Hz)
order = 2; % Filter order
[b, a] = butter(order, [f_low, f_high]/(Fs/2), 'bandpass');

% Setting up the window function
window = hann(frameSize); % Hann window

% Settings for peak detection
threshold = 0.2; % Amplitude threshold (needs adjustment)
minPeakDistanceSec = 0.2; % Minimum peak interval (seconds)
clapCount = 0; % Number of detected claps
clapTimes = []; % Times of claps
lastPeakTime = -Inf; % Time of the last peak

% Real-time processing loop (detecting two claps)
disp('Starting real-time processing. Please clap twice.');
startTime = tic;
while clapCount < 2 % Exit after detecting 2 claps
    % Getting audio frame
    frame = audioIn() .* window; % Apply window function

    % Applying the bandpass filter
    frame_filtered = filter(b, a, frame);

    % Peak detection (managing intervals between frames)
    [peaks, ~] = findpeaks(abs(frame_filtered), 'MinPeakHeight', threshold);
    currentTime = toc(startTime);
    if ~isempty(peaks)
        % Check peak interval (at least 0.2 seconds) between frames
        timeDiff = currentTime - lastPeakTime;
        if timeDiff >= minPeakDistanceSec
            clapCount = clapCount + 1;
            clapTimes = [clapTimes; currentTime];
            disp(['Detected clap: ' num2str(currentTime) ' seconds']);
            lastPeakTime = currentTime;
        end
    end
end

% Releasing resources
release(audioIn);

% Displaying results and comments
disp(['Sampling frequency: ' num2str(Fs) ' Hz']);
disp(['Number of detected claps: ' num2str(clapCount)]);
disp('Two claps have been detected!');
if ~isempty(clapTimes)
    disp('Times of claps (seconds):');
    disp(clapTimes);
end
--------------------------------------------------------------------------------
/matlab/detect_clap_sound.m:
--------------------------------------------------------------------------------
% Detection of clap sound (bandpass filter and peak detection)

% Reading the WAV file
[y, Fs] = audioread('rec3_with_noise.wav'); % Change to your recorded filename

% Creating the time axis
t = (0:length(y)-1)/Fs;

% Designing the bandpass filter (1.4kHz to 1.8kHz)
f_low = 1400; % Lower cutoff frequency (Hz)
f_high = 1800; % Upper cutoff frequency (Hz)
order = 4; % Filter order (Butterworth)
[b, a] = butter(order, [f_low, f_high]/(Fs/2), 'bandpass'); % Butterworth bandpass filter

% Applying the filter
y_filtered = filter(b, a, y);

% Fourier transform (to check the frequency spectrum after filtering)
Y_filtered = fft(y_filtered);
N = length(y);
f = (0:N-1)*(Fs/N);
amplitude_filtered = abs(Y_filtered)/N;
half_N = floor(N/2);
f = f(1:half_N);
amplitude_filtered = amplitude_filtered(1:half_N);

% Peak detection (detecting clap sound)
% threshold = 0.3 * max(abs(y_filtered)); % Threshold (30% of maximum amplitude)
threshold = 0.25;
disp(['threshold: ' num2str(threshold)]);
[peaks, locs] = findpeaks(abs(y_filtered), 'MinPeakHeight', threshold, 'MinPeakDistance', round(Fs*0.2)); % Detect peaks with a minimum interval of 0.2 seconds

% Display the number and amplitudes of detected peaks
disp(['Number of detected peaks: ' num2str(length(peaks))]);
disp('Amplitudes of detected peaks:');
disp(peaks);
% Display the sample indices of detected peaks
disp('Sample indices of detected peaks:');
disp(locs);
clap_times = t(locs); % Time positions of the peaks

% Plotting the graphs
figure;

% Subplot 1: Original time waveform
subplot(3,1,1);
plot(t, y);
title('Original audio signal (clap sound)');
xlabel('Time (seconds)');
ylabel('Amplitude');
grid on;

% Subplot 2: Time waveform after filtering and peak detection
subplot(3,1,2);
plot(t, y_filtered);
hold on;
plot(clap_times, y_filtered(locs), 'ro', 'MarkerSize', 8, 'LineWidth', 2); % Display peaks as red circles
title('After applying bandpass filter (1.4kHz to 1.8kHz) and peak detection');
xlabel('Time (seconds)');
ylabel('Amplitude');
grid on;
hold off;

% Subplot 3: Frequency spectrum after filtering
subplot(3,1,3);
plot(f, amplitude_filtered);
title('Frequency spectrum after filtering');
xlabel('Frequency (Hz)');
ylabel('Amplitude');
grid on;

% Overall title for the graphs
sgtitle('Detection of clap sound (1.4kHz to 1.8kHz)');

% Displaying the sampling frequency and detection results
disp(['Sampling frequency: ' num2str(Fs) ' Hz']);
disp(['Number of detected claps: ' num2str(length(locs))]);
disp('Times of claps (seconds):');
disp(clap_times');
--------------------------------------------------------------------------------
/matlab/record/record_wav_file.py:
--------------------------------------------------------------------------------
import os
import threading
import wave

import keyboard
import pyaudio


class AudioRecorder:
    def __init__(self):
        # Audio settings
        self.chunk = 1024  # Buffer size
        self.format = pyaudio.paInt16  # 16bit
        self.channels = 1  # Mono
        self.rate = 44100  # Sampling rate

        self.frames = []
        self.recording = False
        self.audio = pyaudio.PyAudio()

    def start_recording(self):
        """Start recording"""
        self.recording = True
        self.frames = []

        # Open audio stream
        stream = self.audio.open(
            format=self.format,
            channels=self.channels,
            rate=self.rate,
            input=True,
            frames_per_buffer=self.chunk,
        )

        print("Recording... Press Enter to stop")

        # Recording loop
        while self.recording:
            # exception_on_overflow=False keeps a slow consumer from crashing the loop
            data = stream.read(self.chunk, exception_on_overflow=False)
            self.frames.append(data)

        # Close stream
        stream.stop_stream()
        stream.close()

        print("Recording stopped")

    def stop_recording(self):
        """Stop recording"""
        self.recording = False

    def save_recording(self, filename):
        """Save recording data as WAV file"""
        if not self.frames:
            print("No recording data available")
            return

        # Create wav_files directory if it doesn't exist
        wav_dir = "wav_files"
        if not os.path.exists(wav_dir):
            os.makedirs(wav_dir)

        # Full path for the file
        filepath = os.path.join(wav_dir, filename)

        # Save as WAV file
        wf = wave.open(filepath, "wb")
        wf.setnchannels(self.channels)
        wf.setsampwidth(self.audio.get_sample_size(self.format))
        wf.setframerate(self.rate)
        wf.writeframes(b"".join(self.frames))
        wf.close()

        print(f"Recording saved as {filepath}")

    def cleanup(self):
        """Clean up resources"""
        self.audio.terminate()


def main():
    recorder = AudioRecorder()

    try:
        # Get filename from user input
        filename = input(
            "Enter filename for recording (without .wav extension): "
        ).strip()
        if not filename:
            filename = "recording"

        # Add .wav extension if not present
        if not filename.endswith(".wav"):
            filename += ".wav"

        print(f"Filename set to: {filename}")
        print("Press Enter to start recording...")
        input()  # Wait for Enter to start

        # Start recording in a separate thread
        recording_thread = threading.Thread(target=recorder.start_recording)
        recording_thread.start()

        # Wait for Enter key to stop recording
        keyboard.wait("enter")

        # Stop recording
        recorder.stop_recording()
        recording_thread.join()

        # Save the recording
        recorder.save_recording(filename)

    except KeyboardInterrupt:
        print("\nRecording interrupted")
        recorder.stop_recording()

    finally:
        recorder.cleanup()


if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------
/two_claps_open.py:
--------------------------------------------------------------------------------
import time
import webbrowser

import numpy as np
import pyaudio
from scipy import signal

# import os
# import win32com.client

# Audio configuration
SAMPLING_RATE = 44100  # Hz
FRAME_DURATION = 0.02  # seconds (20ms)
FRAME_SIZE = int(SAMPLING_RATE * FRAME_DURATION)

# Filter configuration
FILTER_ORDER = 2
FREQ_LOW = 1400  # Hz
FREQ_HIGH = 1800  # Hz

# Peak detection configuration
AMPLITUDE_THRESHOLD = 0.2
MIN_PEAK_INTERVAL = 0.2  # seconds

# FILE_PATH = "/path/to/your_file/presentation.pptx"


def initialize_audio_stream():
    """Initialize PyAudio stream for audio input."""
    p = pyaudio.PyAudio()
    return p, p.open(
        format=pyaudio.paFloat32,
        channels=1,
        rate=SAMPLING_RATE,
        input=True,
        frames_per_buffer=FRAME_SIZE,
    )


def create_bandpass_filter():
    """Create bandpass filter for 1.4kHz to 1.8kHz range."""
    return signal.butter(
        FILTER_ORDER,
        [FREQ_LOW, FREQ_HIGH],
        btype="bandpass",
        fs=SAMPLING_RATE,
        output="sos",
    )


def main():
    # Initialize audio processing
    p, stream = initialize_audio_stream()
    sos = create_bandpass_filter()
    window = signal.windows.hann(FRAME_SIZE)

    # Initialize state variables
    clap_count = 0
    clap_times = []
    last_peak_time = -float("inf")
    filter_state = np.zeros((sos.shape[0], 2))

    print("Starting real-time processing. Please clap twice.")
    start_time = time.time()

    try:
        while clap_count < 2:
            # Read and process audio frame
            frame = np.frombuffer(
                stream.read(FRAME_SIZE, exception_on_overflow=False), dtype=np.float32
            )
            frame = frame * window

            # Apply bandpass filter
            frame_filtered, filter_state = signal.sosfilt(sos, frame, zi=filter_state)

            # Detect peaks
            peaks, _ = signal.find_peaks(
                np.abs(frame_filtered), height=AMPLITUDE_THRESHOLD
            )
            current_time = time.time() - start_time

            if peaks.size > 0 and (current_time - last_peak_time) >= MIN_PEAK_INTERVAL:
                clap_count += 1
                clap_times.append(current_time)
                print(f"Clap detected: {current_time:.2f} seconds")
                last_peak_time = current_time

                if clap_count == 2:
                    webbrowser.open("https://www.netflix.com/")
                    # This is for PowerPoint (Windows only; uncomment the imports above)
                    # if os.path.exists(FILE_PATH):
                    #     try:
                    #         # Start the PowerPoint application
                    #         powerpoint = win32com.client.Dispatch("PowerPoint.Application")
                    #         powerpoint.Visible = True  # Show PowerPoint
                    #         # Open the presentation
                    #         presentation = powerpoint.Presentations.Open(os.path.abspath(FILE_PATH))
                    #         # Start the slideshow
                    #         presentation.SlideShowSettings.Run()
                    #     except Exception as e:
                    #         print(f"PowerPoint operation error: {e}")
                    # else:
                    #     print(f"Error: PowerPoint file {FILE_PATH} not found")

    finally:
        stream.stop_stream()
        stream.close()
        p.terminate()

    print("\nResults:")
    print(f"Sampling frequency: {SAMPLING_RATE} Hz")
    print(f"Number of detected claps: {clap_count}")
    print("Two claps have been detected!")

    if clap_times:
        print("Clap times (seconds):")
        for t in clap_times:
            print(f"{t:.4f}")


if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
Pipfile
Pipfile.lock
.env
.idea/
ex.py
run_python.ahk
run_python.bat

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# UV
# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
#uv.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
#poetry.toml

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# Abstra
# Abstra is an AI-powered process automation framework.
# Ignore directories containing user credentials, local state, and settings.
# Learn more at https://abstra.io/docs
.abstra/

# Visual Studio Code
# Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
# that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
# and can be added to the global gitignore or merged into this file. However, if you prefer,
# you could uncomment the following to ignore the entire vscode folder
# .vscode/

# Ruff stuff:
.ruff_cache/

# PyPI configuration file
.pypirc

# Cursor
# Cursor is an AI-powered code editor. `.cursorignore` specifies files/directories to
# exclude from AI features like autocomplete and code analysis. Recommended for sensitive data
# refer to https://docs.cursor.com/context/ignore-files
.cursorignore
.cursorindexingignore
--------------------------------------------------------------------------------
/agent_on_clap.py:
--------------------------------------------------------------------------------
import io
import time
import webbrowser

import numpy as np
import pyaudio
import pygame
import speech_recognition as sr
from dotenv import load_dotenv
from gtts import gTTS
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.tools import tool
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from scipy import signal

load_dotenv()


class VoiceActivatedAI:
    def __init__(self):
        # Audio settings
        self.Fs = 44100
        self.frame_duration = 0.02
        self.frame_size = int(self.Fs * self.frame_duration)

        # Initialize PyAudio
        self.p = pyaudio.PyAudio()
        self.stream = None

        # Bandpass filter (1.4kHz-1.8kHz)
        self.f_low = 1400
        self.f_high = 1800
        self.order = 2
        self.sos = signal.butter(
            self.order,
            [self.f_low, self.f_high],
            btype="bandpass",
            fs=self.Fs,
            output="sos",
        )

        # Window function
        self.window = signal.windows.hann(self.frame_size)

        # Peak detection settings
        self.threshold = 0.2
        self.min_peak_distance_sec = 0.2

        # Initialize speech recognition
        self.recognizer = sr.Recognizer()
        self.microphone = sr.Microphone()

        # Initialize pygame mixer
        pygame.mixer.init()

        # Initialize AI agent
        self.setup_ai_agent()

    def setup_ai_agent(self):
        @tool
        def open_browser(url: str) -> str:
            """Opens the given URL in the default web browser."""
            webbrowser.open(url)
            return f"Opened browser to {url}"

        self.llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)
        self.tools = [open_browser]
        self.prompt = ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You are a helpful voice assistant. You can open URLs in the default web browser. "
                    "Be concise in your responses as they will be converted to speech.",
                ),
                ("human", "{input}"),
                ("placeholder", "{agent_scratchpad}"),
            ]
        )
        agent = create_tool_calling_agent(prompt=self.prompt, tools=self.tools, llm=self.llm)
        self.agent_executor = AgentExecutor(agent=agent, tools=self.tools, verbose=True)

    def detect_claps(self):
        print("Listening for claps... Please clap twice to activate the AI assistant.")
        self.stream = self.p.open(
            format=pyaudio.paFloat32,
            channels=1,
            rate=self.Fs,
            input=True,
            frames_per_buffer=self.frame_size,
        )

        clap_count = 0
        last_peak_time = -float("inf")
        zi = np.zeros((self.sos.shape[0], 2))
        start_time = time.time()

        while clap_count < 2:
            try:
                frame = np.frombuffer(
                    self.stream.read(self.frame_size, exception_on_overflow=False),
                    dtype=np.float32,
                )
                frame = frame * self.window
                frame_filtered, zi = signal.sosfilt(self.sos, frame, zi=zi)
                peaks, _ = signal.find_peaks(np.abs(frame_filtered), height=self.threshold)
                current_time = time.time() - start_time

                if len(peaks) > 0 and (current_time - last_peak_time) >= self.min_peak_distance_sec:
                    clap_count += 1
                    print(f"Clap detected: {current_time:.2f} seconds (Count: {clap_count}/2)")
                    last_peak_time = current_time

            except Exception as e:
                print(f"Error in clap detection: {e}")
                continue

        self.stream.stop_stream()
        self.stream.close()
        self.stream = None  # mark the stream as closed so cleanup doesn't touch it again
        print("Two claps detected! Activating AI assistant...")
        return True

    def play_intro_sound(self):
        try:
            tts = gTTS(text="Hi, how can I help you?", lang="en")
            audio_buffer = io.BytesIO()
            tts.write_to_fp(audio_buffer)
            audio_buffer.seek(0)
            pygame.mixer.music.load(audio_buffer)
            pygame.mixer.music.play()
            while pygame.mixer.music.get_busy():
                time.sleep(0.1)
        except Exception as e:
            print(f"Error playing intro sound: {e}")
            print("Hi, how can I help you?")

    def listen_for_speech(self, timeout=5):
        print(f"Listening for your voice input for {timeout} seconds...")
        try:
            with self.microphone as source:
                self.recognizer.adjust_for_ambient_noise(source, duration=1)
                audio = self.recognizer.listen(source, timeout=timeout, phrase_time_limit=timeout)
            text = self.recognizer.recognize_google(audio)
            print(f"You said: {text}")
            return text
        except sr.WaitTimeoutError:
            print("No speech detected within the timeout period.")
            return None
        except sr.UnknownValueError:
            print("Could not understand the audio.")
            return None
        except sr.RequestError as e:
            print(f"Error with speech recognition service: {e}")
            return None

    def process_with_ai_agent(self, user_input):
        try:
            print(f"Processing request: {user_input}")
            response = self.agent_executor.invoke(input={"input": user_input})
            output_text = response.get("output", "I apologize, but I could not process your request.")
            print(f"AI Response: {output_text}")
            return output_text
        except Exception as e:
            error_message = f"Sorry, I encountered an error while processing your request: {str(e)}"
            print(error_message)
            return error_message

    def text_to_speech(self, text):
        try:
            print(f"Converting to speech: {text}")
            tts = gTTS(text=text, lang="en")
            audio_buffer = io.BytesIO()
            tts.write_to_fp(audio_buffer)
            audio_buffer.seek(0)
            pygame.mixer.music.load(audio_buffer)
            pygame.mixer.music.play()
            while pygame.mixer.music.get_busy():
                time.sleep(0.1)
        except Exception as e:
            print(f"Error in text-to-speech conversion: {e}")
            print(f"AI says: {text}")
    def run(self):
        try:
            while True:
                if self.detect_claps():
                    self.play_intro_sound()
                    user_speech = self.listen_for_speech(timeout=5)
                    if user_speech:
                        ai_response = self.process_with_ai_agent(user_speech)
                        self.text_to_speech(ai_response)
                    else:
                        self.text_to_speech("I didn't hear anything. Please try again by clapping twice.")
                    print("\n" + "=" * 50)
                    print("Ready for next activation. Clap twice to continue...")
                    print("=" * 50 + "\n")
        except KeyboardInterrupt:
            print("\nShutting down voice-activated AI assistant...")
        except Exception as e:
            print(f"Unexpected error: {e}")
        finally:
            if self.stream and not self.stream.is_stopped():
                self.stream.stop_stream()
                self.stream.close()
            self.p.terminate()
            pygame.mixer.quit()


def main():
    print("- Clap twice to activate")
    assistant = VoiceActivatedAI()
    assistant.run()


if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------