├── requirements-core.txt
├── requirements-agent.txt
├── LICENSE
├── README.md
├── matlab
│   ├── real_time_clap_detection.m
│   ├── detect_clap_sound.m
│   └── record
│       └── record_wav_file.py
├── two_claps_open.py
├── .gitignore
└── agent_on_clap.py

/requirements-core.txt:
--------------------------------------------------------------------------------
# Core dependencies for two-claps-open
numpy
pyaudio
scipy
pywin32  # Windows-only; used by the optional PowerPoint launcher
--------------------------------------------------------------------------------
/requirements-agent.txt:
--------------------------------------------------------------------------------
# Optional dependencies for the voice agent extension
PyAudio
pygame
SpeechRecognition
python-dotenv
gTTS
langchain
langchain-core
langchain-google-genai
scipy
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2025 Yutarop

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# `two_claps_open` 👏
Open Chrome (or any file/app) by simply clapping twice — just like [Tony Stark](https://www.youtube.com/watch?v=OT2b5KzMoC0&t=101s).

#### 🔸Open a presentation slide and Chrome by clapping twice (two_claps_open.py)

https://github.com/user-attachments/assets/15849e57-b662-4096-8f67-f8937ae61711

#### 🔸Activate a voice assistant agent by clapping twice (agent_on_clap.py)

https://github.com/user-attachments/assets/3b0ef232-7229-4ca4-9d57-f2fc325295bc

## Clap Detection
To figure out what a clap "looks like" in terms of sound, I first recorded a sample clap and ran a Fourier transform on it to check its frequency content.
The analysis showed that most of the clap's energy is concentrated between 1.4 kHz and 1.8 kHz.
Based on this, I set up a bandpass filter to isolate that range and ignore irrelevant noise.
After filtering, I used peak detection to recognize when a clap happens in real time (see the figure below).
Once two peaks (claps) are detected with some minimum spacing, the system launches Chrome (or any command you define).

![fig](https://github.com/user-attachments/assets/fa15cd8d-8690-4a86-b878-273dbac2f241)
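If you want to reproduce the frequency analysis in Python rather than MATLAB, here is a minimal sketch (not part of the repo; it assumes a mono 16-bit WAV, and `clap.wav` is a placeholder for a recording made with `matlab/record/record_wav_file.py`):

```python
# spectrum_check.py (hypothetical helper) — offline frequency check of a recorded clap.
import numpy as np
from scipy.io import wavfile

rate, samples = wavfile.read("clap.wav")        # placeholder filename
samples = samples.astype(np.float32) / 32768.0  # normalize 16-bit PCM to [-1, 1]

spectrum = np.abs(np.fft.rfft(samples))         # one-sided amplitude spectrum
spectrum[0] = 0.0                               # ignore the DC component
freqs = np.fft.rfftfreq(len(samples), d=1.0 / rate)

print(f"Dominant frequency: {freqs[np.argmax(spectrum)]:.0f} Hz")
```

For the claps measured here, the dominant energy fell in the 1.4–1.8 kHz band used by the filters below.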
## Dependencies
##### 🗂️ requirements-core.txt (for `two_claps_open.py`)
```text
numpy
pyaudio
scipy
pywin32
```

##### 🗂️ requirements-agent.txt (for `agent_on_clap.py`)
```text
PyAudio
pygame
SpeechRecognition
python-dotenv
gTTS
langchain
langchain-core
langchain-google-genai
scipy
```
> 💡 Don’t forget to set your Google API key as `GOOGLE_API_KEY` in a `.env` file. Google offers a free tier for API usage.
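To confirm the key is picked up before launching the agent, a quick check (a sketch, not part of the repo; it assumes `python-dotenv` is installed and the key is stored under `GOOGLE_API_KEY`, the variable read by `langchain-google-genai`):

```python
# check_env.py (hypothetical helper) — verify the .env file is found and the key loads.
import os

from dotenv import load_dotenv

load_dotenv()  # reads .env from the current working directory
print("GOOGLE_API_KEY set:", bool(os.getenv("GOOGLE_API_KEY")))
```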
## For Windows users (optional)
Launch the script with `Ctrl + Alt + P` by combining a batch file with an AutoHotkey binding:
#### .bat
```bat
@echo off
cd /d C:\your\file\path
pipenv run python two_claps_open.py
```

#### .ahk
```ahk
^!p::Run "C:\your\file\path\run_python.bat"
```
--------------------------------------------------------------------------------
/matlab/real_time_clap_detection.m:
--------------------------------------------------------------------------------
% Real-time audio processing (detecting two claps, no graphs)

% Setting up the audio input device
Fs = 44100; % Sampling frequency (e.g., 44.1kHz)
frameDuration = 0.02; % Frame length (20ms)
frameSize = round(Fs * frameDuration); % Frame size (number of samples)
audioIn = audioDeviceReader('SampleRate', Fs, 'SamplesPerFrame', frameSize);

% Designing the bandpass filter (1.4kHz to 1.8kHz)
f_low = 1400; % Lower cutoff frequency (Hz)
f_high = 1800; % Upper cutoff frequency (Hz)
order = 2; % Filter order
[b, a] = butter(order, [f_low, f_high]/(Fs/2), 'bandpass');

% Setting up the window function
window = hann(frameSize); % Hann window

% Settings for peak detection
threshold = 0.2; % Amplitude threshold (needs adjustment)
minPeakDistanceSec = 0.2; % Minimum peak interval (seconds)
clapCount = 0; % Number of detected claps
clapTimes = []; % Times of claps
lastPeakTime = -Inf; % Time of the last peak

% Real-time processing loop (detecting two claps)
disp('Starting real-time processing. Please clap twice.');
startTime = tic;
while clapCount < 2 % Exit after detecting 2 claps
    % Getting audio frame
    frame = audioIn() .* window; % Apply window function

    % Applying the bandpass filter
    frame_filtered = filter(b, a, frame);

    % Peak detection (managing intervals between frames)
    [peaks, ~] = findpeaks(abs(frame_filtered), 'MinPeakHeight', threshold);
    currentTime = toc(startTime);
    if ~isempty(peaks)
        % Check peak interval (at least 0.2 seconds) between frames
        timeDiff = currentTime - lastPeakTime;
        if timeDiff >= minPeakDistanceSec
            clapCount = clapCount + 1;
            clapTimes = [clapTimes; currentTime];
            disp(['Detected clap: ' num2str(currentTime) ' seconds']);
            lastPeakTime = currentTime;
        end
    end
end

% Releasing resources
release(audioIn);

% Displaying results and comments
disp(['Sampling frequency: ' num2str(Fs) ' Hz']);
disp(['Number of detected claps: ' num2str(clapCount)]);
disp('Two claps have been detected!');
if ~isempty(clapTimes)
    disp('Times of claps (seconds):');
    disp(clapTimes);
end
--------------------------------------------------------------------------------
/matlab/detect_clap_sound.m:
--------------------------------------------------------------------------------
% Detection of clap sound (bandpass filter and peak detection)

% Reading the WAV file
[y, Fs] = audioread('rec3_with_noise.wav'); % Change to your recorded filename

% Creating the time axis
t = (0:length(y)-1)/Fs;

% Designing the bandpass filter (1.4kHz to 1.8kHz)
f_low = 1400; % Lower cutoff frequency (Hz)
f_high = 1800; % Upper cutoff frequency (Hz)
order = 4; % Filter order (Butterworth)
[b, a] = butter(order, [f_low, f_high]/(Fs/2), 'bandpass'); % Butterworth bandpass filter

% Applying the filter
y_filtered = filter(b, a, y);

% Fourier transform (to check the frequency spectrum after filtering)
Y_filtered = fft(y_filtered);
N = length(y);
f = (0:N-1)*(Fs/N);
amplitude_filtered = abs(Y_filtered)/N;
half_N = floor(N/2);
f = f(1:half_N);
amplitude_filtered = amplitude_filtered(1:half_N);

% Peak detection (detecting clap sound)
% threshold = 0.3 * max(abs(y_filtered)); % Threshold (30% of maximum amplitude)
threshold = 0.25;
disp(['threshold: ' num2str(threshold)]);
[peaks, locs] = findpeaks(abs(y_filtered), 'MinPeakHeight', threshold, 'MinPeakDistance', round(Fs*0.2)); % Detect peaks with a minimum interval of 0.2 seconds

% Display the number and amplitudes of detected peaks
disp(['Number of detected peaks: ' num2str(length(peaks))]);
disp('Amplitudes of detected peaks:');
disp(peaks);
% Display the sample indices of detected peaks
disp('Sample indices of detected peaks:');
disp(locs);
clap_times = t(locs); % Time positions of the peaks

% Plotting the graphs
figure;

% Subplot 1: Original time waveform
subplot(3,1,1);
plot(t, y);
title('Original audio signal (clap sound)');
xlabel('Time (seconds)');
ylabel('Amplitude');
grid on;

% Subplot 2: Time waveform after filtering and peak detection
subplot(3,1,2);
plot(t, y_filtered);
hold on;
plot(clap_times, y_filtered(locs), 'ro', 'MarkerSize', 8, 'LineWidth', 2); % Display peaks as red circles
title('After applying bandpass filter (1.4kHz to 1.8kHz) and peak detection');
xlabel('Time (seconds)');
ylabel('Amplitude');
grid on;
hold off;

% Subplot 3: Frequency spectrum after filtering
subplot(3,1,3);
plot(f, amplitude_filtered);
title('Frequency spectrum after filtering');
xlabel('Frequency (Hz)');
ylabel('Amplitude');
grid on;

% Overall title for the graphs
sgtitle('Detection of clap sound (1.4kHz to 1.8kHz)');

% Displaying the sampling frequency and detection results
disp(['Sampling frequency: ' num2str(Fs) ' Hz']);
disp(['Number of detected claps: ' num2str(length(locs))]);
disp('Times of claps (seconds):');
disp(clap_times');
--------------------------------------------------------------------------------
/matlab/record/record_wav_file.py:
--------------------------------------------------------------------------------
import os
import threading
import wave

import keyboard
import pyaudio


class AudioRecorder:
    def __init__(self):
        # Audio settings
        self.chunk = 1024  # Buffer size
        self.format = pyaudio.paInt16  # 16bit
        self.channels = 1  # Mono
        self.rate = 44100  # Sampling rate

        self.frames = []
        self.recording = False
        self.audio = pyaudio.PyAudio()

    def start_recording(self):
        """Start recording"""
        self.recording = True
        self.frames = []

        # Open audio stream
        stream = self.audio.open(
            format=self.format,
            channels=self.channels,
            rate=self.rate,
            input=True,
            frames_per_buffer=self.chunk,
        )

        print("Recording... Press Enter to stop")

        # Recording loop
        while self.recording:
            # exception_on_overflow=False keeps a slow consumer from crashing the loop
            data = stream.read(self.chunk, exception_on_overflow=False)
            self.frames.append(data)

        # Close stream
        stream.stop_stream()
        stream.close()

        print("Recording stopped")

    def stop_recording(self):
        """Stop recording"""
        self.recording = False

    def save_recording(self, filename):
        """Save recording data as WAV file"""
        if not self.frames:
            print("No recording data available")
            return

        # Create wav_files directory if it doesn't exist
        wav_dir = "wav_files"
        if not os.path.exists(wav_dir):
            os.makedirs(wav_dir)

        # Full path for the file
        filepath = os.path.join(wav_dir, filename)

        # Save as WAV file
        wf = wave.open(filepath, "wb")
        wf.setnchannels(self.channels)
        wf.setsampwidth(self.audio.get_sample_size(self.format))
        wf.setframerate(self.rate)
        wf.writeframes(b"".join(self.frames))
        wf.close()

        print(f"Recording saved as {filepath}")

    def cleanup(self):
        """Clean up resources"""
        self.audio.terminate()


def main():
    recorder = AudioRecorder()

    try:
        # Get filename from user input
        filename = input(
            "Enter filename for recording (without .wav extension): "
        ).strip()
        if not filename:
            filename = "recording"

        # Add .wav extension if not present
        if not filename.endswith(".wav"):
            filename += ".wav"

        print(f"Filename set to: {filename}")
        print("Press Enter to start recording...")
        input()  # Wait for Enter to start

        # Start recording in a separate thread
        recording_thread = threading.Thread(target=recorder.start_recording)
        recording_thread.start()

        # Wait for Enter key to stop recording
        keyboard.wait("enter")

        # Stop recording
        recorder.stop_recording()
        recording_thread.join()

        # Save the recording
        recorder.save_recording(filename)

    except KeyboardInterrupt:
        print("\nRecording interrupted")
        recorder.stop_recording()

    finally:
        recorder.cleanup()


if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------
/two_claps_open.py:
--------------------------------------------------------------------------------
import time
import webbrowser

import numpy as np
import pyaudio
from scipy import signal

# import os
# import win32com.client

# Audio configuration
SAMPLING_RATE = 44100  # Hz
FRAME_DURATION = 0.02  # seconds (20ms)
FRAME_SIZE = int(SAMPLING_RATE * FRAME_DURATION)

# Filter configuration
FILTER_ORDER = 2
FREQ_LOW = 1400  # Hz
FREQ_HIGH = 1800  # Hz

# Peak detection configuration
AMPLITUDE_THRESHOLD = 0.2
MIN_PEAK_INTERVAL = 0.2  # seconds

# FILE_PATH = "/path/to/your_file/presentation.pptx"


def initialize_audio_stream():
    """Initialize PyAudio stream for audio input."""
    p = pyaudio.PyAudio()
    return p, p.open(
        format=pyaudio.paFloat32,
        channels=1,
        rate=SAMPLING_RATE,
        input=True,
        frames_per_buffer=FRAME_SIZE,
    )


def create_bandpass_filter():
    """Create bandpass filter for 1.4kHz to 1.8kHz range."""
    return signal.butter(
        FILTER_ORDER,
        [FREQ_LOW, FREQ_HIGH],
        btype="bandpass",
        fs=SAMPLING_RATE,
        output="sos",
    )


def main():
    # Initialize audio processing
    p, stream = initialize_audio_stream()
    sos = create_bandpass_filter()
    window = signal.windows.hann(FRAME_SIZE)

    # Initialize state variables
    clap_count = 0
    clap_times = []
    last_peak_time = -float("inf")
    filter_state = np.zeros((sos.shape[0], 2))

    print("Starting real-time processing. Please clap twice.")
    start_time = time.time()

    try:
        while clap_count < 2:
            # Read and process audio frame
            frame = np.frombuffer(
                stream.read(FRAME_SIZE, exception_on_overflow=False), dtype=np.float32
            )
            frame = frame * window

            # Apply bandpass filter
            frame_filtered, filter_state = signal.sosfilt(sos, frame, zi=filter_state)

            # Detect peaks
            peaks, _ = signal.find_peaks(
                np.abs(frame_filtered), height=AMPLITUDE_THRESHOLD
            )
            current_time = time.time() - start_time

            if peaks.size > 0 and (current_time - last_peak_time) >= MIN_PEAK_INTERVAL:
                clap_count += 1
                clap_times.append(current_time)
                print(f"Clap detected: {current_time:.2f} seconds")
                last_peak_time = current_time

                if clap_count == 2:
                    webbrowser.open("https://www.netflix.com/")
                    # This is for PowerPoint (Windows only; uncomment the imports above)
                    # if os.path.exists(FILE_PATH):
                    #     try:
                    #         # Start the PowerPoint application
                    #         powerpoint = win32com.client.Dispatch("PowerPoint.Application")
                    #         powerpoint.Visible = True  # Show PowerPoint
                    #         # Open the presentation
                    #         presentation = powerpoint.Presentations.Open(os.path.abspath(FILE_PATH))
                    #         # Start the slideshow
                    #         presentation.SlideShowSettings.Run()
                    #     except Exception as e:
                    #         print(f"PowerPoint operation error: {e}")
                    # else:
                    #     print(f"Error: PowerPoint file {FILE_PATH} not found")

    finally:
        stream.stop_stream()
        stream.close()
        p.terminate()

    print("\nResults:")
    print(f"Sampling frequency: {SAMPLING_RATE} Hz")
    print(f"Number of detected claps: {clap_count}")
    print("Two claps have been detected!")

    if clap_times:
        print("Clap times (seconds):")
        for t in clap_times:
            print(f"{t:.4f}")


if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
Pipfile
Pipfile.lock
.env
.idea/
ex.py
run_python.ahk
run_python.bat

# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
*.py,cover
.hypothesis/
.pytest_cache/
cover/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3
db.sqlite3-journal

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
.pybuilder/
target/

# Jupyter Notebook
.ipynb_checkpoints

# IPython
profile_default/
ipython_config.py

# pyenv
# For a library or package, you might want to ignore these files since the code is
# intended to run in multiple environments; otherwise, check them in:
# .python-version

# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don't work, or not
# install all needed dependencies.
#Pipfile.lock

# UV
# Similar to Pipfile.lock, it is generally recommended to include uv.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
#uv.lock

# poetry
# Similar to Pipfile.lock, it is generally recommended to include poetry.lock in version control.
# This is especially recommended for binary packages to ensure reproducibility, and is more
# commonly ignored for libraries.
# https://python-poetry.org/docs/basic-usage/#commit-your-poetrylock-file-to-version-control
#poetry.lock
#poetry.toml

# pdm
# Similar to Pipfile.lock, it is generally recommended to include pdm.lock in version control.
#pdm.lock
# pdm stores project-wide configurations in .pdm.toml, but it is recommended to not include it
# in version control.
# https://pdm.fming.dev/latest/usage/project/#working-with-version-control
.pdm.toml
.pdm-python
.pdm-build/

# PEP 582; used by e.g. github.com/David-OConnor/pyflow and github.com/pdm-project/pdm
__pypackages__/

# Celery stuff
celerybeat-schedule
celerybeat.pid

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/
.dmypy.json
dmypy.json

# Pyre type checker
.pyre/

# pytype static type analyzer
.pytype/

# Cython debug symbols
cython_debug/

# PyCharm
# JetBrains specific template is maintained in a separate JetBrains.gitignore that can
# be found at https://github.com/github/gitignore/blob/main/Global/JetBrains.gitignore
# and can be added to the global gitignore or merged into this file. For a more nuclear
# option (not recommended) you can uncomment the following to ignore the entire idea folder.
#.idea/

# Abstra
# Abstra is an AI-powered process automation framework.
# Ignore directories containing user credentials, local state, and settings.
# Learn more at https://abstra.io/docs
.abstra/

# Visual Studio Code
# Visual Studio Code specific template is maintained in a separate VisualStudioCode.gitignore
# that can be found at https://github.com/github/gitignore/blob/main/Global/VisualStudioCode.gitignore
# and can be added to the global gitignore or merged into this file. However, if you prefer,
# you could uncomment the following to ignore the entire vscode folder
# .vscode/

# Ruff stuff:
.ruff_cache/

# PyPI configuration file
.pypirc

# Cursor
# Cursor is an AI-powered code editor. `.cursorignore` specifies files/directories to
# exclude from AI features like autocomplete and code analysis. Recommended for sensitive data
# refer to https://docs.cursor.com/context/ignore-files
.cursorignore
.cursorindexingignore
--------------------------------------------------------------------------------
/agent_on_clap.py:
--------------------------------------------------------------------------------
import io
import time
import webbrowser

import numpy as np
import pyaudio
import pygame
import speech_recognition as sr
from dotenv import load_dotenv
from gtts import gTTS
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain.tools import tool
from langchain_core.prompts import ChatPromptTemplate
from langchain_google_genai import ChatGoogleGenerativeAI
from scipy import signal

load_dotenv()


class VoiceActivatedAI:
    def __init__(self):
        # Audio settings
        self.Fs = 44100
        self.frame_duration = 0.02
        self.frame_size = int(self.Fs * self.frame_duration)

        # Initialize PyAudio
        self.p = pyaudio.PyAudio()
        self.stream = None

        # Bandpass filter (1.4kHz-1.8kHz)
        self.f_low = 1400
        self.f_high = 1800
        self.order = 2
        self.sos = signal.butter(
            self.order,
            [self.f_low, self.f_high],
            btype="bandpass",
            fs=self.Fs,
            output="sos",
        )

        # Window function
        self.window = signal.windows.hann(self.frame_size)

        # Peak detection settings
        self.threshold = 0.2
        self.min_peak_distance_sec = 0.2

        # Initialize speech recognition
        self.recognizer = sr.Recognizer()
        self.microphone = sr.Microphone()

        # Initialize pygame mixer
        pygame.mixer.init()

        # Initialize AI agent
        self.setup_ai_agent()

    def setup_ai_agent(self):
        @tool
        def open_browser(url: str) -> str:
            """Opens the given URL in the default web browser."""
            webbrowser.open(url)
            return f"Opened browser to {url}"

        self.llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash", temperature=0)
        self.tools = [open_browser]
        self.prompt = ChatPromptTemplate.from_messages(
            [
                (
                    "system",
                    "You are a helpful voice assistant. You can open URLs in the default web browser. "
                    "Be concise in your responses as they will be converted to speech.",
                ),
                ("human", "{input}"),
                ("placeholder", "{agent_scratchpad}"),
            ]
        )
        agent = create_tool_calling_agent(prompt=self.prompt, tools=self.tools, llm=self.llm)
        self.agent_executor = AgentExecutor(agent=agent, tools=self.tools, verbose=True)

    def detect_claps(self):
        print("Listening for claps... Please clap twice to activate the AI assistant.")
        self.stream = self.p.open(
            format=pyaudio.paFloat32,
            channels=1,
            rate=self.Fs,
            input=True,
            frames_per_buffer=self.frame_size,
        )

        clap_count = 0
        last_peak_time = -float("inf")
        zi = np.zeros((self.sos.shape[0], 2))
        start_time = time.time()

        while clap_count < 2:
            try:
                frame = np.frombuffer(
                    self.stream.read(self.frame_size, exception_on_overflow=False),
                    dtype=np.float32,
                )
                frame = frame * self.window
                frame_filtered, zi = signal.sosfilt(self.sos, frame, zi=zi)
                peaks, _ = signal.find_peaks(np.abs(frame_filtered), height=self.threshold)
                current_time = time.time() - start_time

                if len(peaks) > 0 and (current_time - last_peak_time) >= self.min_peak_distance_sec:
                    clap_count += 1
                    print(f"Clap detected: {current_time:.2f} seconds (Count: {clap_count}/2)")
                    last_peak_time = current_time

            except Exception as e:
                print(f"Error in clap detection: {e}")
                continue

        self.stream.stop_stream()
        self.stream.close()
        self.stream = None  # mark the stream as closed so cleanup doesn't touch it again
        print("Two claps detected! Activating AI assistant...")
        return True

    def play_intro_sound(self):
        try:
            tts = gTTS(text="Hi, how can I help you?", lang="en")
            audio_buffer = io.BytesIO()
            tts.write_to_fp(audio_buffer)
            audio_buffer.seek(0)
            pygame.mixer.music.load(audio_buffer)
            pygame.mixer.music.play()
            while pygame.mixer.music.get_busy():
                time.sleep(0.1)
        except Exception as e:
            print(f"Error playing intro sound: {e}")
            print("Hi, how can I help you?")

    def listen_for_speech(self, timeout=5):
        print(f"Listening for your voice input for {timeout} seconds...")
        try:
            with self.microphone as source:
                self.recognizer.adjust_for_ambient_noise(source, duration=1)
                audio = self.recognizer.listen(source, timeout=timeout, phrase_time_limit=timeout)
            text = self.recognizer.recognize_google(audio)
            print(f"You said: {text}")
            return text
        except sr.WaitTimeoutError:
            print("No speech detected within the timeout period.")
            return None
        except sr.UnknownValueError:
            print("Could not understand the audio.")
            return None
        except sr.RequestError as e:
            print(f"Error with speech recognition service: {e}")
            return None

    def process_with_ai_agent(self, user_input):
        try:
            print(f"Processing request: {user_input}")
            response = self.agent_executor.invoke(input={"input": user_input})
            output_text = response.get("output", "I apologize, but I could not process your request.")
            print(f"AI Response: {output_text}")
            return output_text
        except Exception as e:
            error_message = f"Sorry, I encountered an error while processing your request: {str(e)}"
            print(error_message)
            return error_message

    def text_to_speech(self, text):
        try:
            print(f"Converting to speech: {text}")
            tts = gTTS(text=text, lang="en")
            audio_buffer = io.BytesIO()
            tts.write_to_fp(audio_buffer)
            audio_buffer.seek(0)
            pygame.mixer.music.load(audio_buffer)
            pygame.mixer.music.play()
            while pygame.mixer.music.get_busy():
                time.sleep(0.1)
        except Exception as e:
            print(f"Error in text-to-speech conversion: {e}")
            print(f"AI says: {text}")
    def run(self):
        try:
            while True:
                if self.detect_claps():
                    self.play_intro_sound()
                    user_speech = self.listen_for_speech(timeout=5)
                    if user_speech:
                        ai_response = self.process_with_ai_agent(user_speech)
                        self.text_to_speech(ai_response)
                    else:
                        self.text_to_speech("I didn't hear anything. Please try again by clapping twice.")
                    print("\n" + "=" * 50)
                    print("Ready for next activation. Clap twice to continue...")
                    print("=" * 50 + "\n")
        except KeyboardInterrupt:
            print("\nShutting down voice-activated AI assistant...")
        except Exception as e:
            print(f"Unexpected error: {e}")
        finally:
            if self.stream and not self.stream.is_stopped():
                self.stream.stop_stream()
                self.stream.close()
            self.p.terminate()
            pygame.mixer.quit()


def main():
    print("- Clap twice to activate")
    assistant = VoiceActivatedAI()
    assistant.run()


if __name__ == "__main__":
    main()
--------------------------------------------------------------------------------