├── requirements.txt
├── README.md
└── main.py

/requirements.txt:
--------------------------------------------------------------------------------
PyQt5==5.15.11
PyAutoGUI==0.9.54
Pillow==10.0.1
requests==2.31.0

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Screen Analysis Overlay

This application provides a transparent overlay for real-time image analysis using KoboldCPP or Ollama. It captures screenshots of a selected region or the entire screen and analyzes them using AI, providing descriptions and alerts based on user-defined conditions. It has so far been tested only on Windows, but may also work on Linux.

https://github.com/user-attachments/assets/53d47ec5-704a-4ff2-a21c-796f739a1c5e

https://github.com/user-attachments/assets/240f12f5-2197-4cf4-88d1-ba273b509393

## Features

- Transparent overlay that stays on top of other windows
- Customizable capture region selection
- Real-time screen analysis using KoboldCPP or Ollama
- Customizable system prompts for analysis
- Pause/Resume functionality
- Alert system for specific conditions
- Ability to save analysis results
- Resizable overlay
- Hide/show buttons by double-clicking the overlay
- Toggle overlay visibility during screenshots
- Saves analysis history to an SQLite database
- Search and view analysis history
- Export analysis history to a JSON or CSV file
- Set analysis start and end times
- Switch between KoboldCPP and Ollama backends
- Choose the Ollama model used for analysis

## Requirements

- Python 3.8+
- PyQt5
- pyautogui
- Pillow
- requests

## Installation

1. Clone this repository:
   ```
   git clone https://github.com/PasiKoodaa/Screen-Analysis-Overlay
   cd Screen-Analysis-Overlay
   ```

2. Set up a Python environment:

### Option 1: Using Conda

```
conda create -n screen-analysis python=3.9
conda activate screen-analysis
pip install -r requirements.txt
```

### Option 2: Using venv and pip

```
python -m venv venv
venv\Scripts\activate
pip install -r requirements.txt
```

The activation command above is for Windows. On Linux, use `source venv/bin/activate` instead.

3. Ensure you have KoboldCPP running locally on `http://localhost:5001` or Ollama running on `http://localhost:11434`. Adjust the `KOBOLDCPP_URL` in the script if your setup is different.

## Usage

1. Run the application:
   ```
   python main.py
   ```

2. Use the buttons or the right-click context menu to:
   - View history and search history
   - Export history to JSON or CSV
   - Select a capture region
   - Update the analysis prompt
   - Pause/Resume analysis
   - Set alert conditions
   - Save analysis results
   - Resize the overlay
   - Toggle overlay visibility during screenshots
   - Select backend (KoboldCPP or Ollama)
   - Choose Ollama model (when using Ollama backend)

3. The overlay will continuously capture and analyze the selected region, displaying results in real time.

## Configuration
![kobo](https://github.com/user-attachments/assets/c8781ff4-b7c5-47a4-b72e-84da4a5e3ea2)

- Adjust the `KOBOLDCPP_URL` variable in the script if your KoboldCPP server is running on a different address. The Ollama URL is hard-coded in `analyze_image_with_ollama`.
- Modify the `system_prompt` attribute set in `TransparentOverlay.__init__` to change the default analysis prompt.

## Using Ollama Backend

1. Ensure Ollama is installed and running on your system.
2. In the application, click the "Select Backend" button.
3. Choose "Ollama" as the backend.
4. Enter the desired Ollama model name (e.g., "llava" or "minicpm-v").
5. Click "OK" to confirm the selection.

--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------

import sys
import time
import pyautogui
from PIL import Image, ImageGrab
import io
import requests
import base64
import logging
import os
import json
import csv
import sqlite3
from datetime import datetime, time as dt_time
from PyQt5.QtWidgets import (QApplication, QMainWindow, QLabel, QPushButton, QVBoxLayout, QWidget, QMenu,
                             QHBoxLayout, QFileDialog, QInputDialog, QMessageBox, QSizePolicy, QLayout,
                             QStyle, QDialog, QLineEdit, QListWidget, QScrollArea, QTextEdit, QTimeEdit,
                             QDialogButtonBox, QRadioButton)
from PyQt5.QtCore import Qt, QTimer, QPoint, QRect, QThread, QObject, pyqtSignal, pyqtSlot, QSize, QTime
from PyQt5.QtGui import QFont, QPainter, QPen, QPixmap, QCursor, QColor
from queue import Queue


# KoboldCPP server settings
KOBOLDCPP_URL = "http://localhost:5001/api/v1/generate"

logging.basicConfig(filename='app.log', level=logging.INFO,
                    format='%(asctime)s - %(levelname)s - %(message)s')


def resize_image(image):
    """
    Resize the image if it exceeds 1.8 million pixels while maintaining aspect ratio.
    :param image: PIL Image object
    :return: Resized PIL Image object if necessary, otherwise the original image
    """
    MAX_PIXELS = 1_800_000  # 1.8 million pixels

    # Calculate current number of pixels
    current_pixels = image.width * image.height

    # If the image is already small enough, return it as is
    if current_pixels <= MAX_PIXELS:
        return image

    # Calculate the scale factor needed to reduce to 1.8 million pixels
    scale_factor = (MAX_PIXELS / current_pixels) ** 0.5

    # Calculate new dimensions, ensuring we round down
    new_width = int(image.width * scale_factor)
    new_height = int(image.height * scale_factor)

    # Resize the image using LANCZOS resampling
    return image.resize((new_width, new_height), Image.LANCZOS)


def encode_image_to_base64(image):
    """Encode a PIL image as a base64 JPEG string."""
    # JPEG has no alpha channel, so convert RGBA/paletted images first
    if image.mode != "RGB":
        image = image.convert("RGB")
    buffered = io.BytesIO()
    image.save(buffered, format="JPEG")
    return base64.b64encode(buffered.getvalue()).decode('utf-8')


class HistoryManager:
    """Stores analysis results in a local SQLite database."""

    def __init__(self, db_path='analysis_history.db'):
        self.db_path = db_path
        self.init_db()

    def init_db(self):
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('''
            CREATE TABLE IF NOT EXISTS analysis_history (
                id INTEGER PRIMARY KEY AUTOINCREMENT,
                timestamp TEXT,
                analysis_text TEXT,
                prompt TEXT
            )
        ''')
        conn.commit()
        conn.close()

    def add_analysis(self, analysis_text, prompt):
        timestamp = datetime.now().isoformat()
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('INSERT INTO analysis_history (timestamp, analysis_text, prompt) VALUES (?, ?, ?)',
                       (timestamp, analysis_text, prompt))
        conn.commit()
        conn.close()

    def get_history(self, limit=100):
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        if limit is None:
            cursor.execute('SELECT * FROM analysis_history ORDER BY timestamp DESC')
        else:
            cursor.execute('SELECT * FROM analysis_history ORDER BY timestamp DESC LIMIT ?', (limit,))
        history = cursor.fetchall()
        conn.close()
        return history

    def search_history(self, query):
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('SELECT * FROM analysis_history WHERE analysis_text LIKE ? OR prompt LIKE ? ORDER BY timestamp DESC',
                       (f'%{query}%', f'%{query}%'))
        results = cursor.fetchall()
        conn.close()
        return results

    def export_to_json(self, filename):
        history = self.get_history(limit=None)
        data = [{'id': item[0], 'timestamp': item[1], 'analysis_text': item[2], 'prompt': item[3]} for item in history]
        with open(filename, 'w', encoding='utf-8') as f:
            json.dump(data, f, indent=2, ensure_ascii=False)

    def export_to_csv(self, filename):
        history = self.get_history(limit=None)
        with open(filename, 'w', newline='', encoding='utf-8') as f:
            writer = csv.writer(f)
            writer.writerow(['ID', 'Timestamp', 'Analysis Text', 'Prompt'])
            writer.writerows(history)

    def get_analysis_by_timestamp(self, timestamp):
        conn = sqlite3.connect(self.db_path)
        cursor = conn.cursor()
        cursor.execute('SELECT * FROM analysis_history WHERE timestamp = ?', (timestamp,))
        analysis = cursor.fetchone()
        conn.close()
        return analysis


class AnalysisWorker(QObject):
    analysis_complete = pyqtSignal(str)
    alert_triggered = pyqtSignal(str, str)
    error_occurred = pyqtSignal(str)
    request_screenshot = pyqtSignal()
    screenshot_taken = pyqtSignal()  # Changed to no-argument signal

    def __init__(self):
        super().__init__()
        self.running = True
        self.queue = Queue()
        self.overlay = None

    def set_overlay(self, overlay):
        self.overlay = overlay

    @pyqtSlot()
    def run_analysis(self):
        while self.running:
            try:
                current_time = datetime.now().time()
                if (self.overlay and
                        not self.overlay.is_paused and
                        not self.overlay.analysis_paused and
                        (not self.overlay.timer_start or
                         (self.overlay.timer_start <= current_time < self.overlay.timer_end))):

                    self.request_screenshot.emit()
                    # Wait for the screenshot to be taken
                    timeout = 5  # 5 seconds timeout
                    start_time = time.time()
                    while (not hasattr(self.overlay, 'current_image') or
                           self.overlay.current_image is None or
                           self.overlay.current_image.getbbox() is None):
                        if time.time() - start_time > timeout:
                            logging.warning("Timeout waiting for valid screenshot")
                            break
                        time.sleep(0.1)

                    if self.overlay.current_image and self.overlay.current_image.getbbox() is not None:
                        # Choose the appropriate backend for analysis
                        if self.overlay.backend == "koboldcpp":
                            description = analyze_image_with_koboldcpp(self.overlay.current_image, self.overlay.system_prompt)
                        else:  # Ollama
                            description = analyze_image_with_ollama(self.overlay.current_image, self.overlay.system_prompt, self.overlay.ollama_model)

                        self.analysis_complete.emit(description)

                        if self.overlay.alert_active:
                            self.check_alert_condition(self.overlay.current_image, description)
                    else:
                        logging.warning("Skipping analysis due to invalid screenshot")

                # Process any pending UI updates
                while not self.queue.empty():
                    func, args = self.queue.get()
                    func(*args)

            except Exception as e:
                self.error_occurred.emit(str(e))

            time.sleep(5)  # Wait for 5 seconds before the next analysis cycle

    def check_alert_condition(self, image, analysis_text):
        check_prompt = f"Based on the image and the following analysis, determine if the condition '{self.overlay.alert_prompt}' is met. Respond with only 'Yes' or 'No'.\n\nImage analysis: {analysis_text}"

        # NOTE: the alert check always goes through KoboldCPP, even when the Ollama backend is selected
        response = analyze_image_with_koboldcpp(image, check_prompt)

        if response.strip().lower() == 'yes':
            self.alert_triggered.emit(self.overlay.alert_prompt, analysis_text)

    def stop(self):
        self.running = False

    def queue_function(self, func, *args):
        self.queue.put((func, args))


def analyze_image_with_koboldcpp(image, prompt):
    if image is None:
        # Use a blank 1x1 pixel image when no image is provided
        blank_image = Image.new('RGB', (1, 1), color='white')
        image_base64 = encode_image_to_base64(blank_image)
    else:
        image_base64 = encode_image_to_base64(image)

    payload = {
        "n": 1,
        "max_context_length": 8192,
        "max_length": 100,
        "rep_pen": 1.15,
        "temperature": 0.3,
        "top_p": 1,
        "top_k": 0,
        "top_a": 0,
        "typical": 1,
        "tfs": 1,
        "rep_pen_range": 320,
        "rep_pen_slope": 0.7,
        "sampler_order": [6,0,1,3,4,2,5],  #[6, 5, 0, 1, 3, 4, 2],
        # NOTE: this is a plain string, not an f-string, so "{prompt}" is sent literally
        "memory": "<|start_header_id|>system<|end_header_id|>\n\n <|begin_of_sentence|>{prompt}\n\n",
        "trim_stop": True,
        "images": [image_base64],
        "genkey": "KCPP4535",
        "min_p": 0.1,
        "dynatemp_range": 0,
        "dynatemp_exponent": 1,
        "smoothing_factor": 0,
        "banned_tokens": [],
        "render_special": False,
        "presence_penalty": 0,
        "logit_bias": {},
        "prompt": f"\n(Attached Image)\n<|eot_id|><|start_header_id|>user<|end_header_id|>\n\n{prompt}<|eot_id|><|start_header_id|>assistant<|end_header_id|>\n\n",
        "quiet": True,
        "stop_sequence": ["<|eot_id|><|start_header_id|>user<|end_header_id|>", "<|eot_id|><|start_header_id|>assistant<|end_header_id|>"],
        "use_default_badwordsids": False,
        "bypass_eos": False
    }

    try:
        response = requests.post(KOBOLDCPP_URL, json=payload)
        response.raise_for_status()
        result = response.json()
        return result['results'][0]['text'].strip()
    except requests.RequestException as e:
        print(f"Error communicating with KoboldCPP: {e}")
        return "Unable to analyze image at this time."


def analyze_image_with_ollama(image, prompt, model="llava"):
    image_base64 = encode_image_to_base64(image)

    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,  # Change this to True if you want to handle streaming
        "images": [image_base64]
    }

    try:
        response = requests.post("http://localhost:11434/api/generate", json=payload, stream=True)
        response.raise_for_status()

        full_response = ""
        for line in response.iter_lines():
            if line:
                decoded_line = line.decode('utf-8')
                response_data = json.loads(decoded_line)
                if 'response' in response_data:
                    full_response += response_data['response']
                if 'done' in response_data and response_data['done']:
                    break

        return full_response.strip()
    except requests.RequestException as e:
        print(f"Error communicating with Ollama: {e}")
        return "Unable to analyze image at this time."
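

# ---------------------------------------------------------------------------
# Illustrative sketch, not part of the original script.
# analyze_image_with_ollama() above reads the HTTP body line by line and
# concatenates the "response" field of each JSON object until one reports
# "done": true. The helper below isolates that parsing step so it can be
# tried without a running Ollama server. The name collect_ollama_response
# and its byte-lines argument are assumptions made for this example only.
# ---------------------------------------------------------------------------
import json  # already imported at the top of main.py; repeated so the sketch runs standalone


def collect_ollama_response(lines):
    """Join the "response" fields of newline-delimited JSON byte lines."""
    parts = []
    for line in lines:
        if not line:
            continue  # skip empty lines, as the loop above does
        data = json.loads(line.decode('utf-8'))
        parts.append(data.get('response', ''))
        if data.get('done'):
            break  # stop at the first object flagged as done
    return ''.join(parts).strip()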


class TransparentOverlay(QMainWindow):
    def __init__(self):
        super().__init__()
        # State is set up first; the UI is built once, further down, after all attributes exist
        self.capture_region = None
        self.is_capturing = False
        self.origin = None
        self.current = None
        self.system_prompt = "describe the image"
        self.is_paused = False
        self.alert_prompt = ""
        self.alert_active = False
        self.current_image = None
        self.analysis_results = []
        self.is_selecting_region = False  # New flag to track region selection state
        self.analysis_paused = False  # New flag to control analysis
        self.start_point = None
        self.end_point = None
        self.buttons_visible = True  # New attribute to track button visibility
        self.hide_during_screenshot = True  # New attribute to control overlay visibility during screenshots
        self.history_manager = HistoryManager()
        # Add new attributes for timer functionality
        self.timer_start = None
        self.timer_end = None
        self.timer = QTimer(self)
        self.timer.timeout.connect(self.check_timer)
        self.timer.start(60000)  # Check every minute
        self.backend = "koboldcpp"  # Default backend
        self.ollama_model = "minicpm-v"  # Default Ollama model
        self.initUI()

        # Create a directory for saved screenshots
        self.screenshot_dir = "saved_screenshots"
        os.makedirs(self.screenshot_dir, exist_ok=True)

        self.analysis_thread = QThread()
        self.analysis_worker = AnalysisWorker()
        self.analysis_worker.moveToThread(self.analysis_thread)
        self.analysis_thread.started.connect(self.analysis_worker.run_analysis)
        self.analysis_worker.analysis_complete.connect(self.update_text)
        self.analysis_worker.alert_triggered.connect(self.trigger_alert)
        self.analysis_worker.error_occurred.connect(self.handle_error)
        self.analysis_worker.request_screenshot.connect(self.take_screenshot)

        self.analysis_worker.set_overlay(self)
        self.analysis_thread.start()

    def initUI(self):
        self.setWindowFlags(Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint | Qt.Tool)
        self.setAttribute(Qt.WA_TranslucentBackground)

        # Set a fixed initial size for the overlay
        self.setFixedSize(1500, 800)

        central_widget = QWidget(self)
        self.setCentralWidget(central_widget)
        main_layout = QVBoxLayout(central_widget)

        self.label = QLabel(self)
        self.label.setStyleSheet("color: white; background-color: rgba(0, 0, 0, 128); padding: 10px;")
        self.label.setAlignment(Qt.AlignTop | Qt.AlignLeft)
        self.label.setFont(QFont('Arial', 12))
        self.label.setWordWrap(True)
        main_layout.addWidget(self.label)

        self.button_widget = QWidget(self)
        self.button_layout = QFlowLayout(self.button_widget)
        self.button_layout.setSpacing(5)
        self.button_layout.setContentsMargins(5, 5, 5, 5)

        # Add new buttons for timer functionality
        set_timer_button = QPushButton("Set Timer", self)
        set_timer_button.clicked.connect(self.show_timer_dialog)
        self.button_layout.addWidget(set_timer_button)

        clear_timer_button = QPushButton("Clear Timer", self)
        clear_timer_button.clicked.connect(self.clear_timer)
        self.button_layout.addWidget(clear_timer_button)

        # Add new button for backend selection
        select_backend_button = QPushButton("Select Backend", self)
        select_backend_button.clicked.connect(self.show_backend_dialog)
        self.button_layout.addWidget(select_backend_button)

        buttons = [
            ("Select Region", self.select_region),
            ("Update Prompt", self.show_prompt_dialog),
            ("Pause", self.toggle_pause_resume),
            ("Save Results", self.save_results),
            ("Set Alert", self.set_alert_prompt),
            ("Resize Overlay", self.resize_overlay),
            ("Toggle Hide", self.toggle_hide_during_screenshot)  # New button
        ]

        # Add new buttons
        view_history_button = QPushButton("View History", self)
        view_history_button.clicked.connect(self.show_history_dialog)
        self.button_layout.addWidget(view_history_button)

        export_button = QPushButton("Export History", self)
        export_button.clicked.connect(self.show_export_dialog)
        self.button_layout.addWidget(export_button)

        for text, slot in buttons:
            button = QPushButton(text, self)
            button.clicked.connect(slot)
            button.setSizePolicy(QSizePolicy.Preferred, QSizePolicy.Fixed)
            if text == "Pause":
                self.pause_resume_button = button  # Store reference to Pause/Resume button
            self.button_layout.addWidget(button)

        main_layout.addWidget(self.button_widget)

        # Set layout margins and spacing
        main_layout.setContentsMargins(10, 10, 10, 10)
        main_layout.setSpacing(10)

        self.show()

    def mouseDoubleClickEvent(self, event):
        if event.button() == Qt.LeftButton:
            self.toggle_buttons_visibility()

    def toggle_buttons_visibility(self):
        self.buttons_visible = not self.buttons_visible
        self.button_widget.setVisible(self.buttons_visible)

    def resize_overlay(self):
        new_width, ok1 = QInputDialog.getInt(self, 'Resize Overlay', 'Enter new width:', self.width(), 100, 2000)
        if ok1:
            new_height, ok2 = QInputDialog.getInt(self, 'Resize Overlay', 'Enter new height:', self.height(), 100, 2000)
            if ok2:
                self.setFixedSize(new_width, new_height)
                self.update_text(f"Overlay resized to {new_width}x{new_height}")

    def resizeEvent(self, event):
        super().resizeEvent(event)
        self.button_widget.setFixedWidth(self.width() - 10)  # Adjust for margins

    def update_text(self, text):
        self.label.setText(text)
        self.analysis_results.append(text)
        # Automatically save the analysis to history
        self.history_manager.add_analysis(text, self.system_prompt)

    def show_history_dialog(self):
        dialog = QDialog(self)
        dialog.setWindowTitle("Analysis History")
        dialog.setMinimumSize(600, 400)  # Set a minimum size for better usability
        layout = QVBoxLayout(dialog)

        # Add search box
        search_box = QLineEdit(dialog)
        search_box.setPlaceholderText("Search history...")
        layout.addWidget(search_box)

        # Add list widget to display history
        list_widget = QListWidget(dialog)
        layout.addWidget(list_widget)

        # Function to update the list widget
        def update_list(query=''):
            list_widget.clear()
            if query:
                history = self.history_manager.search_history(query)
            else:
                history = self.history_manager.get_history()
            for item in history:
                list_widget.addItem(f"{item[1]}: {item[2][:50]}...")

        # Connect search box to update function
        search_box.textChanged.connect(update_list)

        # Function to open selected analysis
        def open_analysis(item):
            selected_text = item.text()
            # Each entry is "<ISO timestamp>: <preview>"; the timestamp itself contains no ": "
            timestamp = selected_text.split(": ", 1)[0]
            full_analysis = self.history_manager.get_analysis_by_timestamp(timestamp)
            if full_analysis:
                self.show_analysis_detail(full_analysis)

        # Connect list widget item click to open_analysis function
        list_widget.itemClicked.connect(open_analysis)

        # Initial population of the list
        update_list()

        dialog.exec_()

    def show_analysis_detail(self, analysis):
        detail_dialog = QDialog(self)
        detail_dialog.setWindowTitle(f"Analysis Detail - {analysis[1]}")
        detail_dialog.setMinimumSize(800, 600)  # Set a minimum size for better readability
        layout = QVBoxLayout(detail_dialog)

        # Create a scroll area for the text
        scroll_area = QScrollArea(detail_dialog)
        scroll_area.setWidgetResizable(True)
        layout.addWidget(scroll_area)

        # Create a widget to hold the text
        content_widget = QWidget()
        scroll_area.setWidget(content_widget)
        content_layout = QVBoxLayout(content_widget)

        # Add timestamp
        timestamp_label = QLabel(f"Timestamp: {analysis[1]}")
        timestamp_label.setStyleSheet("font-weight: bold;")
        content_layout.addWidget(timestamp_label)

        # Add prompt
        prompt_label = QLabel(f"Prompt: {analysis[3]}")
        prompt_label.setStyleSheet("font-weight: bold;")
        content_layout.addWidget(prompt_label)

        # Add analysis text
        analysis_text = QTextEdit()
        analysis_text.setPlainText(analysis[2])
        analysis_text.setReadOnly(True)
        content_layout.addWidget(analysis_text)

        detail_dialog.exec_()

    def show_export_dialog(self):
        dialog = QDialog(self)
        dialog.setWindowTitle("Export History")
        layout = QVBoxLayout(dialog)

        json_button = QPushButton("Export as JSON", dialog)
        csv_button = QPushButton("Export as CSV", dialog)

        layout.addWidget(json_button)
        layout.addWidget(csv_button)

        def export_json():
            filename, _ = QFileDialog.getSaveFileName(self, "Save JSON", "", "JSON Files (*.json)")
            if filename:
                self.history_manager.export_to_json(filename)
                QMessageBox.information(self, "Export Successful", f"Data exported to {filename}")

        def export_csv():
            filename, _ = QFileDialog.getSaveFileName(self, "Save CSV", "", "CSV Files (*.csv)")
            if filename:
                self.history_manager.export_to_csv(filename)
                QMessageBox.information(self, "Export Successful", f"Data exported to {filename}")

        json_button.clicked.connect(export_json)
        csv_button.clicked.connect(export_csv)

        dialog.exec_()

    @pyqtSlot(str, str)
    def trigger_alert(self, alert_prompt, analysis_text):
        QTimer.singleShot(0, lambda: self._show_alert(alert_prompt, analysis_text))

    def _show_alert(self, alert_prompt, analysis_text):
        alert = QMessageBox(self)
        alert.setIcon(QMessageBox.Warning)
        alert.setText("Alert Condition Met!")
        alert.setInformativeText(f"The condition '{alert_prompt}' was detected.")
        alert.setDetailedText(analysis_text)
        alert.setWindowTitle("Image Analysis Alert")
        alert.show()

    @pyqtSlot(str)
    def handle_error(self, error_message):
        logging.error(f"Error in analysis thread: {error_message}")
        self.update_text(f"An error occurred: {error_message}")

    def closeEvent(self, event):
        self.analysis_worker.stop()
        self.analysis_thread.quit()
        self.analysis_thread.wait()
        super().closeEvent(event)

    def toggle_pause_resume(self):
        self.is_paused = not self.is_paused
        button_text = "Resume" if self.is_paused else "Pause"
        self.pause_resume_button.setText(button_text)  # Update button text
        status = "paused" if self.is_paused else "resumed"
        self.update_text(f"Capture and analysis {status}")

        if not self.is_paused:
            self.is_selecting_region = False

    def toggle_hide_during_screenshot(self):
        self.hide_during_screenshot = not self.hide_during_screenshot
        status = "hidden" if self.hide_during_screenshot else "visible"
        self.update_text(f"Overlay will be {status} during screenshots")

    def save_results(self):
        if not self.analysis_results:
            self.update_text("No results to save.")
            return

        file_path, _ = QFileDialog.getSaveFileName(self, "Save Analysis Results", "", "Text Files (*.txt);;All Files (*)")
        if file_path:
            try:
                with open(file_path, 'w', encoding='utf-8') as file:
                    for result in self.analysis_results:
                        file.write(result + "\n\n")
                self.update_text(f"Results saved to {file_path}")
            except Exception as e:
                self.update_text(f"Error saving results: {str(e)}")

    def mousePressEvent(self, event):
        if event.button() == Qt.LeftButton:
            self.offset = event.pos()
        elif event.button() == Qt.RightButton:
            self.show_context_menu(event.pos())

    def mouseMoveEvent(self, event):
        if event.buttons() & Qt.LeftButton:
            self.move(self.mapToGlobal(event.pos() - self.offset))

    def show_context_menu(self, pos):
        context_menu = QMenu(self)
        hide_buttons = context_menu.addAction("Hide Buttons")
        view_history = context_menu.addAction("View History")
        export_history = context_menu.addAction("Export History")
        update_prompt_action = context_menu.addAction("Update Prompt")
        toggle_pause_action = context_menu.addAction("Pause/Resume")
        save_results_action = context_menu.addAction("Save Results")
        set_alert_action = context_menu.addAction("Set Alert Condition")
        clear_alert_action = context_menu.addAction("Clear Alert")
        resize_action = context_menu.addAction("Resize Overlay")
        toggle_hide = context_menu.addAction("Toggle Hide")
        exit_action = context_menu.addAction("Exit Application")

        action = context_menu.exec_(self.mapToGlobal(pos))
        if action == hide_buttons:
            self.toggle_buttons_visibility()
        elif action == update_prompt_action:
            self.show_prompt_dialog()
        elif action == toggle_pause_action:
            self.toggle_pause_resume()
        elif action == view_history:
            self.show_history_dialog()
        elif action == export_history:
            self.show_export_dialog()
        elif action == save_results_action:
            self.save_results()
        elif action == set_alert_action:
            self.set_alert_prompt()
        elif action == clear_alert_action:
            self.clear_alert()
        elif action == resize_action:
            self.resize_overlay()
        elif action == toggle_hide:
self.toggle_hide_during_screenshot() 650 | elif action == exit_action: 651 | QApplication.quit() 652 | 653 | def select_region(self): 654 | self.is_selecting_region = True 655 | self.analysis_paused = True 656 | self.hide() 657 | self.start_point = None 658 | self.end_point = None 659 | QTimer.singleShot(100, self.start_region_selection) 660 | 661 | def start_region_selection(self): 662 | screen = QApplication.primaryScreen() 663 | self.original_screenshot = screen.grabWindow(0) 664 | self.select_window = QMainWindow() 665 | self.select_window.setWindowFlags(Qt.FramelessWindowHint | Qt.WindowStaysOnTopHint) 666 | self.select_window.setGeometry(screen.geometry()) 667 | self.select_window.setAttribute(Qt.WA_TranslucentBackground) 668 | self.select_window.show() 669 | self.select_window.setMouseTracking(True) 670 | self.select_window.mousePressEvent = self.region_select_press 671 | self.select_window.mouseMoveEvent = self.region_select_move 672 | self.select_window.mouseReleaseEvent = self.region_select_release 673 | self.select_window.paintEvent = self.region_select_paint 674 | 675 | 676 | 677 | 678 | def region_select_press(self, event): 679 | self.start_point = event.pos() 680 | 681 | def region_select_move(self, event): 682 | if self.start_point: 683 | self.end_point = event.pos() 684 | self.select_window.update() 685 | 686 | def region_select_release(self, event): 687 | self.end_point = event.pos() 688 | if self.start_point and self.end_point: 689 | self.capture_region = QRect(self.start_point, self.end_point).normalized() 690 | logging.info(f"Region selected: {self.capture_region}") 691 | self.update_text(f"Region selected: {self.capture_region}") 692 | self.is_selecting_region = False 693 | self.analysis_paused = False 694 | self.select_window.close() 695 | self.show() 696 | self.trigger_analysis() 697 | 698 | 699 | def trigger_analysis(self): 700 | if hasattr(self, 'analysis_worker'): 701 | self.analysis_worker.request_screenshot.emit() 702 | 703 | def 
region_select_paint(self, event): 704 | painter = QPainter(self.select_window) 705 | painter.drawPixmap(self.select_window.rect(), self.original_screenshot) 706 | 707 | if self.start_point and self.end_point: 708 | painter.setPen(QPen(Qt.red, 2, Qt.SolidLine)) 709 | painter.setBrush(QColor(255, 0, 0, 50)) # Semi-transparent red 710 | painter.drawRect(QRect(self.start_point, self.end_point).normalized()) 711 | 712 | # Draw instructions 713 | painter.setPen(Qt.white) 714 | painter.setFont(QFont('Arial', 14)) 715 | painter.drawText(10, 30, "Click and drag to select a region. Press Esc to cancel.") 716 | 717 | def update_system_prompt(self, new_prompt): 718 | self.system_prompt = new_prompt 719 | print(f"System prompt updated to: {self.system_prompt}") 720 | 721 | 722 | def show_prompt_dialog(self): 723 | new_prompt, ok = QInputDialog.getText(self, 'Update System Prompt', 724 | 'Enter new system prompt:', 725 | text=self.system_prompt) 726 | if ok: 727 | self.system_prompt = new_prompt 728 | self.update_text(f"System prompt updated to: {self.system_prompt}") 729 | 730 | def set_alert_prompt(self): 731 | prompt, ok = QInputDialog.getText(self, 'Set Alert Prompt', 732 | 'Enter alert condition (e.g., "Can you see birds?"):') 733 | if ok and prompt: 734 | self.alert_prompt = prompt 735 | self.alert_active = True 736 | self.update_text(f"Alert set for condition: {self.alert_prompt}") 737 | elif ok: 738 | self.alert_prompt = "" 739 | self.alert_active = False 740 | self.update_text("Alert cleared") 741 | 742 | def clear_alert(self): 743 | self.alert_prompt = "" 744 | self.alert_active = False 745 | self.update_text("Alert condition cleared") 746 | 747 | def check_alert_condition(self, image, analysis_text): 748 | check_prompt = f"Based on the image and the following analysis, determine if the condition '{self.alert_prompt}' is met. 
Respond with only 'Yes' or 'No'.\n\nImage analysis: {analysis_text}" 749 | 750 | response = analyze_image_with_koboldcpp(image, check_prompt) 751 | 752 | if response and response.strip().lower().startswith('yes'): # Also accept replies such as "Yes." 753 | self.trigger_alert(analysis_text) 754 | 755 | 756 | def show_timer_dialog(self): 757 | dialog = QDialog(self) 758 | dialog.setWindowTitle("Set Analysis Timer") 759 | layout = QVBoxLayout(dialog) 760 | 761 | start_time_edit = QTimeEdit(dialog) 762 | start_time_edit.setDisplayFormat("HH:mm") 763 | layout.addWidget(QLabel("Start Time:")) 764 | layout.addWidget(start_time_edit) 765 | 766 | end_time_edit = QTimeEdit(dialog) 767 | end_time_edit.setDisplayFormat("HH:mm") 768 | layout.addWidget(QLabel("End Time:")) 769 | layout.addWidget(end_time_edit) 770 | 771 | button_box = QHBoxLayout() 772 | ok_button = QPushButton("OK", dialog) 773 | cancel_button = QPushButton("Cancel", dialog) 774 | button_box.addWidget(ok_button) 775 | button_box.addWidget(cancel_button) 776 | layout.addLayout(button_box) 777 | 778 | ok_button.clicked.connect(dialog.accept) 779 | cancel_button.clicked.connect(dialog.reject) 780 | 781 | if dialog.exec_() == QDialog.Accepted: 782 | self.timer_start = start_time_edit.time().toPyTime() 783 | self.timer_end = end_time_edit.time().toPyTime() 784 | self.update_text(f"Timer set: {self.timer_start.strftime('%H:%M')} - {self.timer_end.strftime('%H:%M')}") 785 | 786 | self.check_timer() # Immediately check if we should start/stop analysis 787 | 788 | def clear_timer(self): 789 | self.timer_start = None 790 | self.timer_end = None 791 | self.update_text("Timer cleared") 792 | self.check_timer() 793 | 794 | def check_timer(self): 795 | current_time = datetime.now().time() 796 | if self.timer_start and self.timer_end: 797 | if self.timer_start <= current_time < self.timer_end: 798 | if self.is_paused: 799 | self.toggle_pause_resume() 800 | self.update_text("Analysis started due to timer") 801 | else: 802 | if not self.is_paused: 803 | self.toggle_pause_resume() 804 |
self.update_text("Analysis paused due to timer") 805 | 806 | 807 | 808 | @pyqtSlot() 809 | def take_screenshot(self): 810 | if self.is_selecting_region: 811 | logging.info("Region selection in progress, skipping screenshot") 812 | return 813 | 814 | if self.hide_during_screenshot: 815 | self.hide() # Hide the overlay only if hide_during_screenshot is True 816 | QApplication.processEvents() # Ensure the hide takes effect 817 | 818 | try: 819 | if self.capture_region and not self.is_selecting_region: 820 | # Convert QRect to screen coordinates 821 | screen = QApplication.primaryScreen() 822 | screen_geometry = screen.geometry() 823 | left = self.capture_region.left() + screen_geometry.left() 824 | top = self.capture_region.top() + screen_geometry.top() 825 | right = left + self.capture_region.width() # QRect.right() is inclusive (left + width - 1); the bbox edge is exclusive 826 | bottom = top + self.capture_region.height() # Same reasoning as for the right edge 827 | 828 | # Use ImageGrab for screen capture 829 | img = ImageGrab.grab(bbox=(left, top, right, bottom)) 830 | logging.info(f"Screenshot taken of selected region: {left},{top},{right},{bottom}") 831 | else: 832 | img = ImageGrab.grab() 833 | logging.info("Full screen screenshot taken") 834 | 835 | if self.hide_during_screenshot: 836 | self.show() # Show the overlay immediately after taking the screenshot 837 | 838 | # Save the full-size screenshot 839 | timestamp = datetime.now().strftime("%Y%m%d_%H%M%S") 840 | filename = f"screenshot_{timestamp}.png" 841 | filepath = os.path.join(self.screenshot_dir, filename) 842 | img.save(filepath) 843 | logging.info(f"Screenshot saved: {filepath}") 844 | 845 | self.current_image = resize_image(img) 846 | 847 | if self.current_image.getbbox() is None: 848 | logging.warning("Captured image is empty") 849 | else: 850 | logging.info(f"Captured image size: {self.current_image.size}") 851 | 852 | except Exception as e: 853 | logging.error(f"Error taking screenshot: {str(e)}") 854 | self.current_image = None 855 | finally: 856 | if
self.hide_during_screenshot: 857 | self.show() # Show the overlay again only if it was hidden 858 | self.analysis_worker.screenshot_taken.emit() 859 | 860 | def show_backend_dialog(self): 861 | dialog = QDialog(self) 862 | dialog.setWindowTitle("Select Backend") 863 | layout = QVBoxLayout(dialog) 864 | 865 | koboldcpp_radio = QRadioButton("KoboldCPP", dialog) 866 | ollama_radio = QRadioButton("Ollama", dialog) 867 | 868 | if self.backend == "koboldcpp": 869 | koboldcpp_radio.setChecked(True) 870 | else: 871 | ollama_radio.setChecked(True) 872 | 873 | layout.addWidget(koboldcpp_radio) 874 | layout.addWidget(ollama_radio) 875 | 876 | # Add Ollama model selection 877 | ollama_model_label = QLabel("Ollama Model:") 878 | ollama_model_input = QLineEdit(self.ollama_model) 879 | layout.addWidget(ollama_model_label) 880 | layout.addWidget(ollama_model_input) 881 | 882 | button_box = QDialogButtonBox(QDialogButtonBox.Ok | QDialogButtonBox.Cancel) 883 | button_box.accepted.connect(dialog.accept) 884 | button_box.rejected.connect(dialog.reject) 885 | layout.addWidget(button_box) 886 | 887 | if dialog.exec_() == QDialog.Accepted: 888 | if koboldcpp_radio.isChecked(): 889 | self.backend = "koboldcpp" 890 | else: 891 | self.backend = "ollama" 892 | self.ollama_model = ollama_model_input.text() 893 | self.update_text(f"Backend set to: {self.backend}") 894 | if self.backend == "ollama": 895 | self.update_text(f"Ollama model set to: {self.ollama_model}") 896 | 897 | 898 | class QFlowLayout(QLayout): 899 | def __init__(self, parent=None, margin=0, spacing=-1): 900 | super(QFlowLayout, self).__init__(parent) 901 | self.itemList = [] 902 | self.m_hSpace = spacing 903 | self.m_vSpace = spacing 904 | self.setContentsMargins(margin, margin, margin, margin) 905 | 906 | def __del__(self): 907 | item = self.takeAt(0) 908 | while item: 909 | item = self.takeAt(0) 910 | 911 | def addItem(self, item): 912 | self.itemList.append(item) 913 | 914 | def horizontalSpacing(self): 915 | if 
self.m_hSpace >= 0: 916 | return self.m_hSpace 917 | else: 918 | return self.smartSpacing(QStyle.PM_LayoutHorizontalSpacing) 919 | 920 | def verticalSpacing(self): 921 | if self.m_vSpace >= 0: 922 | return self.m_vSpace 923 | else: 924 | return self.smartSpacing(QStyle.PM_LayoutVerticalSpacing) 925 | 926 | def count(self): 927 | return len(self.itemList) 928 | 929 | def itemAt(self, index): 930 | if 0 <= index < len(self.itemList): 931 | return self.itemList[index] 932 | return None 933 | 934 | def takeAt(self, index): 935 | if 0 <= index < len(self.itemList): 936 | return self.itemList.pop(index) 937 | return None 938 | 939 | def expandingDirections(self): 940 | return Qt.Orientations(Qt.Orientation(0)) 941 | 942 | def hasHeightForWidth(self): 943 | return True 944 | 945 | def heightForWidth(self, width): 946 | height = self.doLayout(QRect(0, 0, width, 0), True) 947 | return height 948 | 949 | def setGeometry(self, rect): 950 | super(QFlowLayout, self).setGeometry(rect) 951 | self.doLayout(rect, False) 952 | 953 | def sizeHint(self): 954 | return self.minimumSize() 955 | 956 | def minimumSize(self): 957 | size = QSize() 958 | for item in self.itemList: 959 | size = size.expandedTo(item.minimumSize()) 960 | size += QSize(self.contentsMargins().left() + self.contentsMargins().right(), self.contentsMargins().top() + self.contentsMargins().bottom()) # Add horizontal and vertical margins separately 961 | return size 962 | 963 | def doLayout(self, rect, testOnly): 964 | x = rect.x() 965 | y = rect.y() 966 | lineHeight = 0 967 | 968 | for item in self.itemList: 969 | wid = item.widget() 970 | spaceX = self.horizontalSpacing() 971 | if spaceX == -1: 972 | spaceX = wid.style().layoutSpacing( 973 | QSizePolicy.PushButton, QSizePolicy.PushButton, Qt.Horizontal) 974 | spaceY = self.verticalSpacing() 975 | if spaceY == -1: 976 | spaceY = wid.style().layoutSpacing( 977 | QSizePolicy.PushButton, QSizePolicy.PushButton, Qt.Vertical) 978 | 979 | nextX = x + item.sizeHint().width() + spaceX 980 | if nextX - spaceX > rect.right() and lineHeight > 0: 981 | x = rect.x() 982 | y = y +
lineHeight + spaceY 983 | nextX = x + item.sizeHint().width() + spaceX 984 | lineHeight = 0 985 | 986 | if not testOnly: 987 | item.setGeometry(QRect(QPoint(x, y), item.sizeHint())) 988 | 989 | x = nextX 990 | lineHeight = max(lineHeight, item.sizeHint().height()) 991 | 992 | return y + lineHeight - rect.y() 993 | 994 | def smartSpacing(self, pm): 995 | parent = self.parent() 996 | if not parent: 997 | return -1 998 | elif parent.isWidgetType(): 999 | return parent.style().pixelMetric(pm, None, parent) 1000 | else: 1001 | return parent.spacing() 1002 | 1003 | 1004 | 1005 | def capture_and_analyze(overlay): 1006 | while True: 1007 | if not overlay.is_paused: 1008 | if overlay.capture_region and not overlay.is_capturing: 1009 | screenshot = pyautogui.screenshot(region=( 1010 | overlay.capture_region.x(), 1011 | overlay.capture_region.y(), 1012 | overlay.capture_region.width(), 1013 | overlay.capture_region.height() 1014 | )) 1015 | else: 1016 | screenshot = pyautogui.screenshot() 1017 | 1018 | resized_image = resize_image(screenshot) 1019 | overlay.current_image = resized_image # Store the current image 1020 | 1021 | description = analyze_image_with_koboldcpp(resized_image, overlay.system_prompt) 1022 | overlay.update_text(description) 1023 | time.sleep(5) 1024 | 1025 | 1026 | 1027 | 1028 | def main(): 1029 | app = QApplication(sys.argv) 1030 | overlay = TransparentOverlay() 1031 | overlay.show() 1032 | 1033 | # Ensure the analysis worker is properly connected 1034 | overlay.analysis_worker.set_overlay(overlay) 1035 | 1036 | sys.exit(app.exec_()) 1037 | 1038 | if __name__ == "__main__": 1039 | main() 1040 | --------------------------------------------------------------------------------
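Note on the analysis timer: `check_timer` in `main.py` compares `timer_start <= current_time < timer_end`, which only describes a window that starts and ends on the same day. A window that crosses midnight, such as 22:00 to 06:00, never satisfies that comparison. The sketch below shows one way such a check could be written; `in_analysis_window` is a hypothetical helper name and is not part of the repository, and treating `start == end` as an empty window is an assumption.

```python
from datetime import time


def in_analysis_window(now: time, start: time, end: time) -> bool:
    """Return True if `now` lies in the half-open window [start, end).

    Hypothetical helper, not part of main.py. Assumes that start == end
    means an empty window rather than a full 24-hour one.
    """
    if start <= end:
        # Same-day window, e.g. 09:00 to 17:00.
        return start <= now < end
    # Window crosses midnight, e.g. 22:00 to 06:00: it covers the late
    # evening (now >= start) and the early morning (now < end).
    return now >= start or now < end


if __name__ == "__main__":
    # 23:30 falls inside a 22:00 to 06:00 window.
    print(in_analysis_window(time(23, 30), time(22, 0), time(6, 0)))
```

If adopted, `check_timer` could pass `datetime.now().time()`, `self.timer_start` and `self.timer_end` to this helper in place of the chained comparison, leaving the pause/resume logic unchanged.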