├── .github └── FUNDING.yml ├── .gitignore ├── LICENSE ├── README.md ├── custom_components └── deepstack_object │ ├── __init__.py │ ├── image_processing.py │ ├── manifest.json │ └── tests.py ├── docs ├── media_player.png ├── object_detail.png └── object_usage.png └── requirements-dev.txt /.github/FUNDING.yml: -------------------------------------------------------------------------------- 1 | # These are supported funding model platforms 2 | 3 | github: robmarkcole 4 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Build and Release Folders 2 | bin-debug/ 3 | bin-release/ 4 | [Oo]bj/ 5 | [Bb]in/ 6 | 7 | # Other files and folders 8 | .settings/ 9 | 10 | # Executables 11 | *.swf 12 | *.air 13 | *.ipa 14 | *.apk 15 | 16 | # Project files, i.e. `.project`, `.actionScriptProperties` and `.flexProperties` 17 | # should NOT be excluded as they contain compiler settings and other important 18 | # information for Eclipse / Flash Builder. 19 | *.pyc 20 | venv* 21 | .vscode* -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2019 Robin 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # HASS-Deepstack-object 2 | [Home Assistant](https://www.home-assistant.io/) custom component for Deepstack object detection. [Deepstack](https://docs.deepstack.cc/) is a service which runs in a docker container and exposes various computer vision models via a REST API. Deepstack [object detection](https://docs.deepstack.cc/object-detection/index.html) can identify 80 different kinds of objects (listed at bottom of this readme), including people (`person`), vehicles and animals. Alternatively a custom object detection model can be used. There is no cost for using Deepstack and it is [fully open source](https://github.com/johnolafenwa/DeepStack). To run Deepstack you will need a machine with 8 GB RAM, or an NVIDIA Jetson. 
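If you prefer `docker-compose`, the `docker run` command shown below maps to a compose file like this (a sketch: the image, volume name, API key and host port simply mirror that command, so adjust them to your own setup):

```yaml
version: "3"
services:
  deepstack:
    image: deepquestai/deepstack
    restart: unless-stopped
    ports:
      - "80:5000"                 # expose the detection API on host port 80
    environment:
      - VISION-DETECTION=True     # enable the object detection service
      - API-KEY=mysecretkey
    volumes:
      - localstorage:/datastore

volumes:
  localstorage:
```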
3 | 4 | On your machine with docker, run Deepstack with the object detection service active on port `80`: 5 | ``` 6 | docker run -e VISION-DETECTION=True -e API-KEY="mysecretkey" -v localstorage:/datastore -p 80:5000 deepquestai/deepstack 7 | ``` 8 | 9 | ## Usage of this component 10 | The `deepstack_object` component adds an `image_processing` entity where the state of the entity is the total count of target objects that are above a `confidence` threshold which has a default value of 80%. You can have a single target object class, or multiple. The time of the last detection of any target object is in the `last target detection` attribute. The type and number of objects (of any confidence) is listed in the `summary` attributes. Optionally a region of interest (ROI) can be configured, and only objects with their center (represented by a `x`) within the ROI will be included in the state count. The ROI will be displayed as a green box, and objects with their center in the ROI have a red box. 11 | 12 | Also optionally the processed image can be saved to disk, with bounding boxes showing the location of detected objects. If `save_file_folder` is configured, an image with filename of format `deepstack_object_{source name}_latest.jpg` is over-written on each new detection of a target. Optionally this image can also be saved with a timestamp in the filename, if `save_timestamped_file` is configured as `True`. An event `deepstack.object_detected` is fired for each object detected that is in the targets list, and meets the confidence and ROI criteria. If you are a power user with advanced needs such as zoning detections or you want to track multiple object types, you will need to use the `deepstack.object_detected` events. 13 | 14 | **Note** that by default the component will **not** automatically scan images, but requires you to call the `image_processing.scan` service e.g. using an automation triggered by motion. 15 | 16 | ## Home Assistant setup 17 | Place the `custom_components` folder in your configuration directory (or add its contents to an existing `custom_components` folder). Then configure object detection. **Important:** It is necessary to configure only a single camera per `deepstack_object` entity. If you want to process multiple cameras, you will therefore need multiple `deepstack_object` `image_processing` entities. 18 | 19 | The component can optionally save snapshots of the processed images. If you would like to use this option, you need to create a folder where the snapshots will be stored. The folder should be in the same folder where your `configuration.yaml` file is located. In the example below, we have named the folder `snapshots`. 20 | 21 | Add to your Home-Assistant config: 22 | 23 | ```yaml 24 | image_processing: 25 | - platform: deepstack_object 26 | ip_address: localhost 27 | port: 80 28 | api_key: mysecretkey 29 | # custom_model: mask 30 | # confidence: 80 31 | save_file_folder: /config/snapshots/ 32 | save_file_format: png 33 | save_timestamped_file: True 34 | always_save_latest_file: True 35 | scale: 0.75 36 | # roi_x_min: 0.35 37 | roi_x_max: 0.8 38 | #roi_y_min: 0.4 39 | roi_y_max: 0.8 40 | crop_to_roi: True 41 | targets: 42 | - target: person 43 | - target: vehicle 44 | confidence: 60 45 | - target: car 46 | confidence: 40 47 | source: 48 | - entity_id: camera.local_file 49 | ``` 50 | 51 | Configuration variables: 52 | - **ip_address**: the ip address of your deepstack instance. 53 | - **port**: the port of your deepstack instance. 
54 | - **api_key**: (Optional) Any API key you have set. 55 | - **timeout**: (Optional, default 10 seconds) The timeout for requests to deepstack. 56 | - **custom_model**: (Optional) The name of a custom model if you are using one. Don't forget to add the targets from the custom model below 57 | - **confidence**: (Optional) The confidence (in %) above which detected targets are counted in the sensor state. Default value: 80 58 | - **save_file_folder**: (Optional) The folder to save processed images to. Note that folder path should be added to [whitelist_external_dirs](https://www.home-assistant.io/docs/configuration/basic/) 59 | - **save_file_format**: (Optional, default `jpg`, alternatively `png`) The file format to save images as. `png` generally results in easier to read annotations. 60 | - **save_timestamped_file**: (Optional, default `False`, requires `save_file_folder` to be configured) Save the processed image with the time of detection in the filename. 61 | - **always_save_latest_file**: (Optional, default `False`, requires `save_file_folder` to be configured) Always save the last processed image, even if there were no detections. 62 | - **scale**: (optional, default 1.0), range 0.1-1.0, applies a scaling factor to the images that are saved. This reduces the disk space used by saved images, and is especially beneficial when using high resolution cameras. 63 | - **show_boxes**: (optional, default `True`), if `False` bounding boxes are not shown on saved images 64 | - **roi_x_min**: (optional, default 0), range 0-1, must be less than roi_x_max 65 | - **roi_x_max**: (optional, default 1), range 0-1, must be more than roi_x_min 66 | - **roi_y_min**: (optional, default 0), range 0-1, must be less than roi_y_max 67 | - **roi_y_max**: (optional, default 1), range 0-1, must be more than roi_y_min 68 | - **crop_to_roi**: (optional, default False), crops the image to the specified roi. May improve object detection accuracy when a region-of-interest is applied 69 | - **source**: Must be a camera. 70 | - **targets**: The list of target object names and/or `object_type`, default `person`. Optionally a `confidence` can be set for this target, if not the default confidence is used. Note the minimum possible confidence is 10%. 71 | 72 | For the ROI, the (x=0,y=0) position is the top left pixel of the image, and the (x=1,y=1) position is the bottom right pixel of the image. It might seem a bit odd to have y running from top to bottom of the image, but that is the coordinate system used by pillow. 73 | 74 | I created an app for exploring the config parameters at [https://github.com/robmarkcole/deepstack-ui](https://github.com/robmarkcole/deepstack-ui) 75 | 76 |
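If you use a custom model (for example the `mask` model hinted at by the commented-out `custom_model` line in the example above), remember to list that model's labels as targets, since they are not part of the default 80 classes. A minimal sketch, assuming the custom model produces a `mask` label (the model name and label here are purely illustrative):

```yaml
image_processing:
  - platform: deepstack_object
    ip_address: localhost
    port: 80
    custom_model: mask        # the name the model was registered with in Deepstack
    targets:
      - target: mask          # a label produced by the custom model
        confidence: 70
    source:
      - entity_id: camera.local_file
```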

*(Example screenshots of the component in use and of the object detail view are in the `docs` folder: `object_usage.png` and `object_detail.png`.)*
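As noted earlier, the component does not scan images automatically, so a common pattern is an automation that calls the `image_processing.scan` service when motion is detected. A minimal sketch (the motion sensor entity is illustrative; the `image_processing` entity name follows the `deepstack_object_{source name}` convention this component uses):

```yaml
- alias: Deepstack scan on motion
  trigger:
    - platform: state
      entity_id: binary_sensor.front_door_motion   # your motion sensor
      to: "on"
  action:
    - service: image_processing.scan
      entity_id: image_processing.deepstack_object_local_file
```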

83 | 84 | #### Event `deepstack.object_detected` 85 | An event `deepstack.object_detected` is fired for each object detected above the configured `confidence` threshold. This is the recommended way to check the confidence of detections, and to keep track of objects that are not configured as the `target` (use `Developer tools -> EVENTS -> :Listen to events`, to monitor these events). 86 | 87 | An example use case for event is to get an alert when some rarely appearing object is detected, or to increment a [counter](https://www.home-assistant.io/components/counter/). The `deepstack.object_detected` event payload includes: 88 | 89 | - `entity_id` : the entity id responsible for the event 90 | - `name` : the name of the object detected 91 | - `object_type` : the type of the object, from `person`, `vehicle`, `animal` or `other` 92 | - `confidence` : the confidence in detection in the range 0 - 100% 93 | - `box` : the bounding box of the object 94 | - `centroid` : the centre point of the object 95 | - `saved_file` : the path to the saved annotated image, which is the timestamped file if `save_timestamped_file` is True, or the default saved image if False 96 | 97 | An example automation using the `deepstack.object_detected` event is given below: 98 | 99 | ```yaml 100 | - action: 101 | - data_template: 102 | caption: "New person detection with confidence {{ trigger.event.data.confidence }}" 103 | file: "{{ trigger.event.data.saved_file }}" 104 | service: telegram_bot.send_photo 105 | alias: Object detection automation 106 | condition: [] 107 | id: "1120092824622" 108 | trigger: 109 | - platform: event 110 | event_type: deepstack.object_detected 111 | event_data: 112 | name: person 113 | ``` 114 | 115 | ## Displaying the deepstack latest jpg file 116 | It easy to display the `deepstack_object_{source name}_latest.jpg` image with a [local_file](https://www.home-assistant.io/components/local_file/) camera. An example configuration is: 117 | ```yaml 118 | camera: 119 | - platform: local_file 120 | file_path: /config/snapshots/deepstack_object_local_file_latest.jpg 121 | name: deepstack_latest_person 122 | ``` 123 | 124 | ## Info on box 125 | The `box` coordinates and the box center (`centroid`) can be used to determine whether an object falls within a defined region-of-interest (ROI). This can be useful to include/exclude objects by their location in the image. 126 | 127 | * The `box` is defined by the tuple `(y_min, x_min, y_max, x_max)` (equivalent to image top, left, bottom, right) where the coordinates are floats in the range `[0.0, 1.0]` and relative to the width and height of the image. 128 | * The centroid is in `(x,y)` coordinates where `(0,0)` is the top left hand corner of the image and `(1,1)` is the bottom right corner of the image. 129 | 130 | ## Browsing saved images in HA 131 | I highly recommend using the Home Assistant Media Player Browser to browse and preview processed images. Add to your config something like: 132 | ```yaml 133 | homeassistant: 134 | . 135 | . 136 | whitelist_external_dirs: 137 | - /config/images/ 138 | media_dirs: 139 | local: /config/images/ 140 | 141 | media_source: 142 | ``` 143 | And configure Deepstack to use the above directory for `save_file_folder`, then saved images can be browsed from the HA front end like below: 144 | 145 |

*(See `docs/media_player.png` for an example of browsing the saved images with the media browser.)*
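The `box` and `centroid` fields in the `deepstack.object_detected` event can also be used for simple zoning directly in an automation, without configuring an ROI. For example, to be notified only when a detected person's centre lies in the right-hand half of the image (a sketch; `notify.notify` stands in for whichever notification service you actually use):

```yaml
- alias: Person detected on the right-hand side
  trigger:
    - platform: event
      event_type: deepstack.object_detected
      event_data:
        name: person
  condition:
    - condition: template
      value_template: "{{ trigger.event.data.centroid.x > 0.5 }}"
  action:
    - service: notify.notify
      data_template:
        message: "Person at x={{ trigger.event.data.centroid.x }} with confidence {{ trigger.event.data.confidence }}%"
```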

148 | 149 | ## Face recognition 150 | For face recognition with Deepstack use https://github.com/robmarkcole/HASS-Deepstack-face 151 | 152 | ### Support 153 | For code related issues such as suspected bugs, please open an issue on this repo. For general chat or to discuss Home Assistant specific issues related to configuration or use cases, please [use this thread on the Home Assistant forums](https://community.home-assistant.io/t/face-and-person-detection-with-deepstack-local-and-free/92041). 154 | 155 | ### Docker tips 156 | Add the `-d` flag to run the container in background 157 | 158 | ### FAQ 159 | Q1: I get the following warning, is this normal? 160 | ``` 161 | 2019-01-15 06:37:52 WARNING (MainThread) [homeassistant.loader] You are using a custom component for image_processing.deepstack_face which has not been tested by Home Assistant. This component might cause stability problems, be sure to disable it if you do experience issues with Home Assistant. 162 | ``` 163 | A1: Yes this is normal 164 | 165 | ------ 166 | 167 | Q4: What are the minimum hardware requirements for running Deepstack? 168 | 169 | A4. Based on my experience, I would allow 0.5 GB RAM per model. 170 | 171 | ------ 172 | 173 | Q5: Can object detection be configured to detect car/car colour? 174 | 175 | A5: The list of detected object classes is at the end of the page [here](https://deepstackpython.readthedocs.io/en/latest/objectdetection.html). There is no support for detecting the colour of an object. 176 | 177 | ------ 178 | 179 | Q6: I am getting an error from Home Assistant: `Platform error: image_processing - Integration deepstack_object not found` 180 | 181 | A6: This can happen when you are running in Docker/Hassio, and indicates that one of the dependencies isn't installed. It is necessary to reboot your Hassio device, or rebuild your Docker container. Note that just restarting Home Assistant will not resolve this. 182 | 183 | ------ 184 | 185 | ## Objects 186 | The following lists all valid target object names: 187 | ``` 188 | person, bicycle, car, motorcycle, airplane, 189 | bus, train, truck, boat, traffic light, fire hydrant, stop_sign, 190 | parking meter, bench, bird, cat, dog, horse, sheep, cow, elephant, 191 | bear, zebra, giraffe, backpack, umbrella, handbag, tie, suitcase, 192 | frisbee, skis, snowboard, sports ball, kite, baseball bat, baseball glove, 193 | skateboard, surfboard, tennis racket, bottle, wine glass, cup, fork, 194 | knife, spoon, bowl, banana, apple, sandwich, orange, broccoli, carrot, 195 | hot dog, pizza, donut, cake, chair, couch, potted plant, bed, dining table, 196 | toilet, tv, laptop, mouse, remote, keyboard, cell phone, microwave, 197 | oven, toaster, sink, refrigerator, book, clock, vase, scissors, teddy bear, 198 | hair dryer, toothbrush. 199 | ``` 200 | Objects are grouped by the following `object_type`: 201 | - **person**: person 202 | - **animal**: bird, cat, dog, horse, sheep, cow, elephant, bear, zebra, giraffe 203 | - **vehicle**: bicycle, car, motorcycle, airplane, bus, train, truck 204 | - **other**: any object that is not in `person`, `animal` or `vehicle` 205 | 206 | ## Development 207 | Currently only the helper functions are tested, using pytest. 
208 | * `python3 -m venv venv` 209 | * `source venv/bin/activate` 210 | * `pip install -r requirements-dev.txt` 211 | * `venv/bin/py.test custom_components/deepstack_object/tests.py -vv -p no:warnings` 212 | 213 | ## Videos of usage 214 | Checkout this excellent video of usage from [Everything Smart Home](https://www.youtube.com/channel/UCrVLgIniVg6jW38uVqDRIiQ) 215 | 216 | [![](http://img.youtube.com/vi/vMdpLiAB9dI/0.jpg)](http://www.youtube.com/watch?v=vMdpLiAB9dI "") 217 | 218 | Also see the video of a presentation I did to the [IceVision](https://airctic.com/) community on deploying Deepstack on a Jetson nano. 219 | 220 | [![](http://img.youtube.com/vi/1O0mCaA22fE/0.jpg)](http://www.youtube.com/watch?v=1O0mCaA22fE "") -------------------------------------------------------------------------------- /custom_components/deepstack_object/__init__.py: -------------------------------------------------------------------------------- 1 | """The deepstack_object component.""" 2 | -------------------------------------------------------------------------------- /custom_components/deepstack_object/image_processing.py: -------------------------------------------------------------------------------- 1 | """ 2 | Component that will perform object detection and identification via deepstack. 3 | 4 | For more details about this platform, please refer to the documentation at 5 | https://home-assistant.io/components/image_processing.deepstack_object 6 | """ 7 | from collections import namedtuple, Counter 8 | import datetime 9 | import io 10 | import logging 11 | import os 12 | import re 13 | from datetime import timedelta 14 | from typing import Tuple, Dict, List 15 | from pathlib import Path 16 | 17 | from PIL import Image, ImageDraw 18 | 19 | import deepstack.core as ds 20 | import homeassistant.helpers.config_validation as cv 21 | import homeassistant.util.dt as dt_util 22 | import voluptuous as vol 23 | from homeassistant.util.pil import draw_box 24 | from homeassistant.components.image_processing import ( 25 | ATTR_CONFIDENCE, 26 | CONF_CONFIDENCE, 27 | CONF_ENTITY_ID, 28 | CONF_NAME, 29 | CONF_SOURCE, 30 | DEFAULT_CONFIDENCE, 31 | DOMAIN, 32 | PLATFORM_SCHEMA, 33 | ImageProcessingEntity, 34 | ) 35 | from homeassistant.const import ( 36 | ATTR_ENTITY_ID, 37 | ATTR_NAME, 38 | CONF_IP_ADDRESS, 39 | CONF_PORT, 40 | ) 41 | from homeassistant.core import split_entity_id 42 | 43 | _LOGGER = logging.getLogger(__name__) 44 | 45 | ANIMAL = "animal" 46 | ANIMALS = [ 47 | "bird", 48 | "cat", 49 | "dog", 50 | "horse", 51 | "sheep", 52 | "cow", 53 | "elephant", 54 | "bear", 55 | "zebra", 56 | "giraffe", 57 | ] 58 | OTHER = "other" 59 | PERSON = "person" 60 | VEHICLE = "vehicle" 61 | VEHICLES = ["bicycle", "car", "motorcycle", "airplane", "bus", "train", "truck"] 62 | OBJECT_TYPES = [ANIMAL, OTHER, PERSON, VEHICLE] 63 | 64 | 65 | CONF_API_KEY = "api_key" 66 | CONF_TARGET = "target" 67 | CONF_TARGETS = "targets" 68 | CONF_TIMEOUT = "timeout" 69 | CONF_SAVE_FILE_FORMAT = "save_file_format" 70 | CONF_SAVE_FILE_FOLDER = "save_file_folder" 71 | CONF_SAVE_TIMESTAMPTED_FILE = "save_timestamped_file" 72 | CONF_ALWAYS_SAVE_LATEST_FILE = "always_save_latest_file" 73 | CONF_SHOW_BOXES = "show_boxes" 74 | CONF_ROI_Y_MIN = "roi_y_min" 75 | CONF_ROI_X_MIN = "roi_x_min" 76 | CONF_ROI_Y_MAX = "roi_y_max" 77 | CONF_ROI_X_MAX = "roi_x_max" 78 | CONF_SCALE = "scale" 79 | CONF_CUSTOM_MODEL = "custom_model" 80 | CONF_CROP_ROI = "crop_to_roi" 81 | 82 | DATETIME_FORMAT = "%Y-%m-%d_%H-%M-%S-%f" 83 | DEFAULT_API_KEY = "" 84 | 
DEFAULT_TARGETS = [{CONF_TARGET: PERSON}] 85 | DEFAULT_TIMEOUT = 10 86 | DEFAULT_ROI_Y_MIN = 0.0 87 | DEFAULT_ROI_Y_MAX = 1.0 88 | DEFAULT_ROI_X_MIN = 0.0 89 | DEFAULT_ROI_X_MAX = 1.0 90 | DEAULT_SCALE = 1.0 91 | DEFAULT_ROI = ( 92 | DEFAULT_ROI_Y_MIN, 93 | DEFAULT_ROI_X_MIN, 94 | DEFAULT_ROI_Y_MAX, 95 | DEFAULT_ROI_X_MAX, 96 | ) 97 | 98 | EVENT_OBJECT_DETECTED = "deepstack.object_detected" 99 | BOX = "box" 100 | FILE = "file" 101 | OBJECT = "object" 102 | SAVED_FILE = "saved_file" 103 | MIN_CONFIDENCE = 0.1 104 | JPG = "jpg" 105 | PNG = "png" 106 | 107 | # rgb(red, green, blue) 108 | RED = (255, 0, 0) # For objects within the ROI 109 | GREEN = (0, 255, 0) # For ROI box 110 | YELLOW = (255, 255, 0) # Unused 111 | 112 | TARGETS_SCHEMA = { 113 | vol.Required(CONF_TARGET): cv.string, 114 | vol.Optional(CONF_CONFIDENCE): vol.All( 115 | vol.Coerce(float), vol.Range(min=10, max=100) 116 | ), 117 | } 118 | 119 | 120 | PLATFORM_SCHEMA = PLATFORM_SCHEMA.extend( 121 | { 122 | vol.Required(CONF_IP_ADDRESS): cv.string, 123 | vol.Required(CONF_PORT): cv.port, 124 | vol.Optional(CONF_API_KEY, default=DEFAULT_API_KEY): cv.string, 125 | vol.Optional(CONF_TIMEOUT, default=DEFAULT_TIMEOUT): cv.positive_int, 126 | vol.Optional(CONF_CUSTOM_MODEL, default=""): cv.string, 127 | vol.Optional(CONF_TARGETS, default=DEFAULT_TARGETS): vol.All( 128 | cv.ensure_list, [vol.Schema(TARGETS_SCHEMA)] 129 | ), 130 | vol.Optional(CONF_ROI_Y_MIN, default=DEFAULT_ROI_Y_MIN): cv.small_float, 131 | vol.Optional(CONF_ROI_X_MIN, default=DEFAULT_ROI_X_MIN): cv.small_float, 132 | vol.Optional(CONF_ROI_Y_MAX, default=DEFAULT_ROI_Y_MAX): cv.small_float, 133 | vol.Optional(CONF_ROI_X_MAX, default=DEFAULT_ROI_X_MAX): cv.small_float, 134 | vol.Optional(CONF_SCALE, default=DEAULT_SCALE): vol.All( 135 | vol.Coerce(float, vol.Range(min=0.1, max=1)) 136 | ), 137 | vol.Optional(CONF_SAVE_FILE_FOLDER): cv.isdir, 138 | vol.Optional(CONF_SAVE_FILE_FORMAT, default=JPG): vol.In([JPG, PNG]), 139 | vol.Optional(CONF_SAVE_TIMESTAMPTED_FILE, default=False): cv.boolean, 140 | vol.Optional(CONF_ALWAYS_SAVE_LATEST_FILE, default=False): cv.boolean, 141 | vol.Optional(CONF_SHOW_BOXES, default=True): cv.boolean, 142 | vol.Optional(CONF_CROP_ROI, default=False): cv.boolean, 143 | } 144 | ) 145 | 146 | Box = namedtuple("Box", "y_min x_min y_max x_max") 147 | Point = namedtuple("Point", "y x") 148 | 149 | 150 | def point_in_box(box: Box, point: Point) -> bool: 151 | """Return true if point lies in box""" 152 | if (box.x_min <= point.x <= box.x_max) and (box.y_min <= point.y <= box.y_max): 153 | return True 154 | return False 155 | 156 | 157 | def object_in_roi(roi: dict, centroid: dict) -> bool: 158 | """Convenience to convert dicts to the Point and Box.""" 159 | target_center_point = Point(centroid["y"], centroid["x"]) 160 | roi_box = Box(roi["y_min"], roi["x_min"], roi["y_max"], roi["x_max"]) 161 | return point_in_box(roi_box, target_center_point) 162 | 163 | 164 | def get_valid_filename(name: str) -> str: 165 | return re.sub(r"(?u)[^-\w.]", "", str(name).strip().replace(" ", "_")) 166 | 167 | 168 | def get_object_type(object_name: str) -> str: 169 | if object_name == PERSON: 170 | return PERSON 171 | elif object_name in ANIMALS: 172 | return ANIMAL 173 | elif object_name in VEHICLES: 174 | return VEHICLE 175 | else: 176 | return OTHER 177 | 178 | 179 | def get_objects(predictions: list, img_width: int, img_height: int) -> List[Dict]: 180 | """Return objects with formatting and extra info.""" 181 | objects = [] 182 | decimal_places = 3 183 | for pred in 
predictions: 184 | box_width = pred["x_max"] - pred["x_min"] 185 | box_height = pred["y_max"] - pred["y_min"] 186 | box = { 187 | "height": round(box_height / img_height, decimal_places), 188 | "width": round(box_width / img_width, decimal_places), 189 | "y_min": round(pred["y_min"] / img_height, decimal_places), 190 | "x_min": round(pred["x_min"] / img_width, decimal_places), 191 | "y_max": round(pred["y_max"] / img_height, decimal_places), 192 | "x_max": round(pred["x_max"] / img_width, decimal_places), 193 | } 194 | box_area = round(box["height"] * box["width"], decimal_places) 195 | centroid = { 196 | "x": round(box["x_min"] + (box["width"] / 2), decimal_places), 197 | "y": round(box["y_min"] + (box["height"] / 2), decimal_places), 198 | } 199 | name = pred["label"] 200 | object_type = get_object_type(name) 201 | confidence = round(pred["confidence"] * 100, decimal_places) 202 | 203 | objects.append( 204 | { 205 | "bounding_box": box, 206 | "box_area": box_area, 207 | "centroid": centroid, 208 | "name": name, 209 | "object_type": object_type, 210 | "confidence": confidence, 211 | } 212 | ) 213 | return objects 214 | 215 | 216 | def setup_platform(hass, config, add_devices, discovery_info=None): 217 | """Set up the classifier.""" 218 | save_file_folder = config.get(CONF_SAVE_FILE_FOLDER) 219 | if save_file_folder: 220 | save_file_folder = Path(save_file_folder) 221 | 222 | entities = [] 223 | for camera in config[CONF_SOURCE]: 224 | object_entity = ObjectClassifyEntity( 225 | ip_address=config.get(CONF_IP_ADDRESS), 226 | port=config.get(CONF_PORT), 227 | api_key=config.get(CONF_API_KEY), 228 | timeout=config.get(CONF_TIMEOUT), 229 | custom_model=config.get(CONF_CUSTOM_MODEL), 230 | targets=config.get(CONF_TARGETS), 231 | confidence=config.get(CONF_CONFIDENCE), 232 | roi_y_min=config[CONF_ROI_Y_MIN], 233 | roi_x_min=config[CONF_ROI_X_MIN], 234 | roi_y_max=config[CONF_ROI_Y_MAX], 235 | roi_x_max=config[CONF_ROI_X_MAX], 236 | scale=config[CONF_SCALE], 237 | show_boxes=config[CONF_SHOW_BOXES], 238 | save_file_folder=save_file_folder, 239 | save_file_format=config[CONF_SAVE_FILE_FORMAT], 240 | save_timestamped_file=config.get(CONF_SAVE_TIMESTAMPTED_FILE), 241 | always_save_latest_file=config.get(CONF_ALWAYS_SAVE_LATEST_FILE), 242 | crop_roi=config[CONF_CROP_ROI], 243 | camera_entity=camera.get(CONF_ENTITY_ID), 244 | name=camera.get(CONF_NAME), 245 | ) 246 | entities.append(object_entity) 247 | add_devices(entities) 248 | 249 | 250 | class ObjectClassifyEntity(ImageProcessingEntity): 251 | """Perform a object classification.""" 252 | 253 | def __init__( 254 | self, 255 | ip_address, 256 | port, 257 | api_key, 258 | timeout, 259 | custom_model, 260 | targets, 261 | confidence, 262 | roi_y_min, 263 | roi_x_min, 264 | roi_y_max, 265 | roi_x_max, 266 | scale, 267 | show_boxes, 268 | save_file_folder, 269 | save_file_format, 270 | save_timestamped_file, 271 | always_save_latest_file, 272 | crop_roi, 273 | camera_entity, 274 | name=None, 275 | ): 276 | """Init with the API key and model id.""" 277 | super().__init__() 278 | self._dsobject = ds.DeepstackObject( 279 | ip=ip_address, 280 | port=port, 281 | api_key=api_key, 282 | timeout=timeout, 283 | min_confidence=MIN_CONFIDENCE, 284 | custom_model=custom_model, 285 | ) 286 | self._custom_model = custom_model 287 | self._confidence = confidence 288 | self._summary = {} 289 | self._targets = targets 290 | for target in self._targets: 291 | if CONF_CONFIDENCE not in target.keys(): 292 | target.update({CONF_CONFIDENCE: self._confidence}) 293 | 
self._targets_names = [ 294 | target[CONF_TARGET] for target in targets 295 | ] # can be a name or a type 296 | self._camera = camera_entity 297 | if name: 298 | self._name = name 299 | else: 300 | camera_name = split_entity_id(camera_entity)[1] 301 | self._name = "deepstack_object_{}".format(camera_name) 302 | 303 | self._state = None 304 | self._objects = [] # The parsed raw data 305 | self._targets_found = [] 306 | self._last_detection = None 307 | 308 | self._roi_dict = { 309 | "y_min": roi_y_min, 310 | "x_min": roi_x_min, 311 | "y_max": roi_y_max, 312 | "x_max": roi_x_max, 313 | } 314 | self._crop_roi = crop_roi 315 | self._scale = scale 316 | self._show_boxes = show_boxes 317 | self._image_width = None 318 | self._image_height = None 319 | self._save_file_folder = save_file_folder 320 | self._save_file_format = save_file_format 321 | self._always_save_latest_file = always_save_latest_file 322 | self._save_timestamped_file = save_timestamped_file 323 | self._always_save_latest_file = always_save_latest_file 324 | self._image = None 325 | 326 | def process_image(self, image): 327 | """Process an image.""" 328 | self._image = Image.open(io.BytesIO(bytearray(image))) 329 | self._image_width, self._image_height = self._image.size 330 | # scale to roi 331 | if self._crop_roi: 332 | roi = ( 333 | self._image_width * self._roi_dict["x_min"], 334 | self._image_height * self._roi_dict["y_min"], 335 | self._image_width * (self._roi_dict["x_max"]), 336 | self._image_height * (self._roi_dict["y_max"]) 337 | ) 338 | self._image = self._image.crop(roi) 339 | self._image_width, self._image_height = self._image.size 340 | with io.BytesIO() as output: 341 | self._image.save(output, format="JPEG") 342 | image = output.getvalue() 343 | _LOGGER.debug( 344 | ( 345 | f"Image cropped with : {self._roi_dict} W={self._image_width} H={self._image_height}" 346 | ) 347 | ) 348 | # resize image if different then default 349 | if self._scale != DEAULT_SCALE: 350 | newsize = (self._image_width * self._scale, self._image_width * self._scale) 351 | self._image.thumbnail(newsize, Image.ANTIALIAS) 352 | self._image_width, self._image_height = self._image.size 353 | with io.BytesIO() as output: 354 | self._image.save(output, format="JPEG") 355 | image = output.getvalue() 356 | _LOGGER.debug( 357 | ( 358 | f"Image scaled with : {self._scale} W={self._image_width} H={self._image_height}" 359 | ) 360 | ) 361 | 362 | self._state = None 363 | self._objects = [] # The parsed raw data 364 | self._targets_found = [] 365 | self._summary = {} 366 | saved_image_path = None 367 | 368 | try: 369 | predictions = self._dsobject.detect(image) 370 | except ds.DeepstackException as exc: 371 | _LOGGER.error("Deepstack error : %s", exc) 372 | return 373 | 374 | self._objects = get_objects(predictions, self._image_width, self._image_height) 375 | self._targets_found = [] 376 | 377 | for obj in self._objects: 378 | if not ( 379 | (obj["name"] in self._targets_names) 380 | or (obj["object_type"] in self._targets_names) 381 | ): 382 | continue 383 | ## Then check if the type has a configured confidence, if yes assign 384 | ## Then if a confidence for a named object, this takes precedence over type confidence 385 | confidence = None 386 | for target in self._targets: 387 | if obj["object_type"] == target[CONF_TARGET]: 388 | confidence = target[CONF_CONFIDENCE] 389 | for target in self._targets: 390 | if obj["name"] == target[CONF_TARGET]: 391 | confidence = target[CONF_CONFIDENCE] 392 | if obj["confidence"] > confidence: 393 | if not 
self._crop_roi and not object_in_roi(self._roi_dict, obj["centroid"]): 394 | continue 395 | self._targets_found.append(obj) 396 | 397 | self._state = len(self._targets_found) 398 | if self._state > 0: 399 | self._last_detection = dt_util.now().strftime(DATETIME_FORMAT) 400 | 401 | targets_found = [ 402 | obj["name"] for obj in self._targets_found 403 | ] # Just the list of target names, e.g. [car, car, person] 404 | self._summary = dict(Counter(targets_found)) # e.g. {'car':2, 'person':1} 405 | 406 | if self._save_file_folder: 407 | if self._state > 0 or self._always_save_latest_file: 408 | saved_image_path = self.save_image( 409 | self._targets_found, 410 | self._save_file_folder, 411 | ) 412 | 413 | # Fire events 414 | for target in self._targets_found: 415 | target_event_data = target.copy() 416 | target_event_data[ATTR_ENTITY_ID] = self.entity_id 417 | if saved_image_path: 418 | target_event_data[SAVED_FILE] = saved_image_path 419 | self.hass.bus.fire(EVENT_OBJECT_DETECTED, target_event_data) 420 | 421 | @property 422 | def camera_entity(self): 423 | """Return camera entity id from process pictures.""" 424 | return self._camera 425 | 426 | @property 427 | def state(self): 428 | """Return the state of the entity.""" 429 | return self._state 430 | 431 | @property 432 | def name(self): 433 | """Return the name of the sensor.""" 434 | return self._name 435 | 436 | @property 437 | def unit_of_measurement(self): 438 | """Return the unit of measurement.""" 439 | return "targets" 440 | 441 | @property 442 | def should_poll(self): 443 | """Return the polling state.""" 444 | return False 445 | 446 | @property 447 | def extra_state_attributes(self) -> Dict: 448 | """Return device specific state attributes.""" 449 | attr = {} 450 | attr["targets"] = self._targets 451 | attr["targets_found"] = [ 452 | {obj["name"]: obj["confidence"]} for obj in self._targets_found 453 | ] 454 | attr["summary"] = self._summary 455 | if self._last_detection: 456 | attr["last_target_detection"] = self._last_detection 457 | if self._custom_model: 458 | attr["custom_model"] = self._custom_model 459 | attr["all_objects"] = [ 460 | {obj["name"]: obj["confidence"]} for obj in self._objects 461 | ] 462 | if self._save_file_folder: 463 | attr[CONF_SAVE_FILE_FOLDER] = str(self._save_file_folder) 464 | attr[CONF_SAVE_FILE_FORMAT] = self._save_file_format 465 | attr[CONF_SAVE_TIMESTAMPTED_FILE] = self._save_timestamped_file 466 | attr[CONF_ALWAYS_SAVE_LATEST_FILE] = self._always_save_latest_file 467 | return attr 468 | 469 | def save_image(self, targets, directory) -> str: 470 | """Draws the actual bounding box of the detected objects. 471 | 472 | Returns: saved_image_path, which is the path to the saved timestamped file if configured, else the default saved image. 
473 | """ 474 | try: 475 | img = self._image.convert("RGB") 476 | except UnidentifiedImageError: 477 | _LOGGER.warning("Deepstack unable to process image, bad data") 478 | return 479 | draw = ImageDraw.Draw(img) 480 | 481 | roi_tuple = tuple(self._roi_dict.values()) 482 | if roi_tuple != DEFAULT_ROI and self._show_boxes and not self._crop_roi: 483 | draw_box( 484 | draw, 485 | roi_tuple, 486 | img.width, 487 | img.height, 488 | text="ROI", 489 | color=GREEN, 490 | ) 491 | 492 | for obj in targets: 493 | if not self._show_boxes: 494 | break 495 | name = obj["name"] 496 | confidence = obj["confidence"] 497 | box = obj["bounding_box"] 498 | centroid = obj["centroid"] 499 | box_label = f"{name}: {confidence:.1f}%" 500 | 501 | draw_box( 502 | draw, 503 | (box["y_min"], box["x_min"], box["y_max"], box["x_max"]), 504 | img.width, 505 | img.height, 506 | text=box_label, 507 | color=RED, 508 | ) 509 | 510 | # draw bullseye 511 | draw.text( 512 | (centroid["x"] * img.width, centroid["y"] * img.height), 513 | text="X", 514 | fill=RED, 515 | ) 516 | 517 | # Save images, returning the path of saved image as str 518 | latest_save_path = ( 519 | directory 520 | / f"{get_valid_filename(self._name).lower()}_latest.{self._save_file_format}" 521 | ) 522 | img.save(latest_save_path) 523 | _LOGGER.info("Deepstack saved file %s", latest_save_path) 524 | saved_image_path = latest_save_path 525 | 526 | if self._save_timestamped_file: 527 | timestamp_save_path = ( 528 | directory 529 | / f"{self._name}_{self._last_detection}.{self._save_file_format}" 530 | ) 531 | img.save(timestamp_save_path) 532 | _LOGGER.info("Deepstack saved file %s", timestamp_save_path) 533 | saved_image_path = timestamp_save_path 534 | return str(saved_image_path) 535 | -------------------------------------------------------------------------------- /custom_components/deepstack_object/manifest.json: -------------------------------------------------------------------------------- 1 | { 2 | "domain": "deepstack_object", 3 | "name": "deepstack object custom integration", 4 | "documentation": "https://github.com/robmarkcole/HASS-Deepstack-object", 5 | "version": "4.6.0", 6 | "requirements": [ 7 | "pillow", 8 | "deepstack-python==0.8" 9 | ], 10 | "dependencies": [], 11 | "codeowners": [ 12 | "@robmarkcole" 13 | ] 14 | } 15 | -------------------------------------------------------------------------------- /custom_components/deepstack_object/tests.py: -------------------------------------------------------------------------------- 1 | """The tests for the Deepstack object component.""" 2 | from .image_processing import get_objects 3 | 4 | TARGET = "person" 5 | IMG_WIDTH = 960 6 | IMG_HEIGHT = 640 7 | 8 | MOCK_PREDICTIONS = [ 9 | { 10 | "confidence": 0.9995428, 11 | "label": "person", 12 | "y_min": 95, 13 | "x_min": 295, 14 | "y_max": 523, 15 | "x_max": 451, 16 | }, 17 | { 18 | "confidence": 0.9994912, 19 | "label": "person", 20 | "y_min": 99, 21 | "x_min": 440, 22 | "y_max": 531, 23 | "x_max": 608, 24 | }, 25 | { 26 | "confidence": 0.9990447, 27 | "label": "dog", 28 | "y_min": 358, 29 | "x_min": 647, 30 | "y_max": 539, 31 | "x_max": 797, 32 | }, 33 | ] 34 | 35 | PARSED_PREDICTIONS = [ 36 | { 37 | "bounding_box": { 38 | "height": 0.669, 39 | "width": 0.163, 40 | "y_min": 0.148, 41 | "x_min": 0.307, 42 | "y_max": 0.817, 43 | "x_max": 0.47, 44 | }, 45 | "box_area": 0.109, 46 | "centroid": {"x": 0.389, "y": 0.483}, 47 | "name": "person", 48 | "confidence": 99.954, 49 | }, 50 | { 51 | "bounding_box": { 52 | "height": 0.675, 53 | "width": 0.175, 54 | 
"y_min": 0.155, 55 | "x_min": 0.458, 56 | "y_max": 0.83, 57 | "x_max": 0.633, 58 | }, 59 | "box_area": 0.118, 60 | "centroid": {"x": 0.545, "y": 0.493}, 61 | "name": "person", 62 | "confidence": 99.949, 63 | }, 64 | { 65 | "bounding_box": { 66 | "height": 0.283, 67 | "width": 0.156, 68 | "y_min": 0.559, 69 | "x_min": 0.674, 70 | "y_max": 0.842, 71 | "x_max": 0.83, 72 | }, 73 | "box_area": 0.044, 74 | "centroid": {"x": 0.752, "y": 0.701}, 75 | "name": "dog", 76 | "confidence": 99.904, 77 | }, 78 | ] 79 | 80 | 81 | def test_get_objects(): 82 | objects = get_objects(MOCK_PREDICTIONS, IMG_WIDTH, IMG_HEIGHT) 83 | assert len(objects) == 3 84 | assert objects[0] == PARSED_PREDICTIONS[0] 85 | -------------------------------------------------------------------------------- /docs/media_player.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robmarkcole/HASS-Deepstack-object/e59ac294108fe9015886e93b8ea17d4539f60f7c/docs/media_player.png -------------------------------------------------------------------------------- /docs/object_detail.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robmarkcole/HASS-Deepstack-object/e59ac294108fe9015886e93b8ea17d4539f60f7c/docs/object_detail.png -------------------------------------------------------------------------------- /docs/object_usage.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/robmarkcole/HASS-Deepstack-object/e59ac294108fe9015886e93b8ea17d4539f60f7c/docs/object_usage.png -------------------------------------------------------------------------------- /requirements-dev.txt: -------------------------------------------------------------------------------- 1 | pytest 2 | pillow==8.2.0 3 | homeassistant 4 | deepstack-python==0.4 --------------------------------------------------------------------------------