├── .gitignore
├── CHANGELOG.md
├── README.md
├── annotation-tool
│   ├── README.md
│   ├── TaTo.py
│   ├── config.json
│   ├── config_distraction.json
│   ├── config_drowsiness.json
│   ├── config_gaze.json
│   ├── config_hands.json
│   ├── config_statics.json
│   ├── setUp.py
│   └── vcd4parser.py
├── docs
│   ├── imgs
│   │   ├── annotation_tool_info.png
│   │   ├── block_annotation.png
│   │   ├── colorized_depth.png
│   │   ├── level_panel.png
│   │   └── mobaxterm_config.png
│   ├── issue_bug_template.md
│   ├── issue_feature_template.md
│   ├── readme-assets
│   │   ├── cameras.png
│   │   ├── dmdStructure.png
│   │   ├── environments.png
│   │   ├── gazeRegions.png
│   │   ├── mosaic.png
│   │   └── participants.png
│   ├── setup_linux.md
│   └── setup_windows.md
└── exploreMaterial-tool
    ├── DExTool.py
    ├── README.md
    ├── Tutorial_DEx_(dataset_explorer_tool).ipynb
    ├── accessDMDAnn.py
    ├── config_DEx.json
    ├── group_split_material.py
    ├── statistics.py
    └── vcd4reader.py
/.gitignore:
--------------------------------------------------------------------------------
1 | dmd/
2 | dmd_rgb/
--------------------------------------------------------------------------------
/CHANGELOG.md:
--------------------------------------------------------------------------------
1 | # Changelog
2 |
3 | All notable changes to the DMD repository will be documented in this file.
4 |
5 | The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
6 | and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
7 |
8 | ## [Unreleased]
9 |
10 | ## [1.0.0] - 2020-07-22
11 |
12 | ### Added
13 |
14 | - First version of annotation tool (TaTo).
15 | - New Readme files and steps to run the annotation tool.
16 | - The wiki includes the DMD file structure and annotation instructions for distraction related actions.
17 |
18 | [unreleased]: https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/compare/v1.0.0...HEAD
19 | [1.0.0]: https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/releases/tag/v1.0.0
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # Driver Monitoring Dataset (DMD)
2 | The [Driver Monitoring Dataset](http://dmd.vicomtech.org/) is the largest visual dataset for real driving actions, with footage from multiple synchronized cameras (body, face, hands) and multiple streams (RGB, Depth, IR) recorded in two scenarios (real car, driving simulator). Different annotated labels related to distraction, fatigue and gaze-head pose can be used to train Deep Learning models for Driver Monitoring Systems.
3 |
4 | This project includes a tool to annotate the dataset, inspect the annotated data and export training sets. Output annotations are formatted using [OpenLABEL](https://www.asam.net/standards/detail/openlabel/) language [VCD (Video Content Description)](https://vcd.vicomtech.org/).
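As a quick illustration, a minimal sketch of reading one of these annotation files with the VCD Python library; the calls are the same ones used by the tools in this repository (see annotation-tool/vcd4parser.py), and the file name is only a placeholder.

```python
import vcd.core as core

# Load a TaTo/DEx annotation file (OpenLABEL/VCD JSON); replace the placeholder path.
ann = core.VCD()
ann.load_from_file(file_name="gA_1_s1_2019-03-08T09;31;15+01;00_rgb_ann_distraction.json")

# Print the annotated frame intervals contained in the file.
print(ann.get_frame_intervals().fis_dict)
```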
5 |
6 | ## Dataset details
7 | More details of the recording and video material of DMD can be found at the [official website](http://dmd.vicomtech.org/)
8 |
9 | In addition, this repository [wiki](https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/wiki) has useful information about the DMD dataset and the annotation process.
10 |
11 | ## Available tools:
12 | - Temporal Annotation Tool (TaTo) - (more info [here](annotation-tool/README.md))
13 | - Dataset Explorer Tool (DEx) - (more info [here](exploreMaterial-tool/README.md))
14 | ### Annotation Instructions
15 | Depending on the annotation problem, different annotation criteria should be defined to guarantee that all the annotators produce the same output annotations.
16 |
17 | We have defined the following criteria to be used with the tool to produce consistent annotations:
18 |
19 | - [DMD Distraction-related actions](https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/wiki/DMD-distraction-related-action-annotation-criteria) annotation
20 |
21 | ## Known Issues
22 | - The version of OpenLABEL used in the annotation files and in the tools in this repository has been updated to VCD>=5.0. Make sure you download the annotation files again and update the tools.
23 | - There was an error when uploading IR videos: they have to be in .mp4 format, but they were uploaded as .avi. This is fixed now, but it requires the user to download them again.
24 |
25 | ## Credits
26 | Development of DMD was supported and funded by the European Commission (EC) Horizon 2020 programme (project [VI-DAS](http://www.vi-das.eu/), grant agreement 690772)
27 |
28 | Developed with :blue_heart: by:
29 |
30 | * Paola Cañas (pncanas@vicomtech.org)
31 | * Juan Diego Ortega (jdortega@vicomtech.org)
32 |
33 | Contributions of ideas and comments: Marcos Nieto, Mikel Garcia, Gonzalo Pierola, Itziar Sagastiberri, Itziar Urbieta, Eneritz Etxaniz, Orti Senderos.
34 |
35 | ## License
36 | Copyright :copyright: 2024 Vicomtech
37 |
38 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:
39 |
40 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.
41 |
42 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
43 |
--------------------------------------------------------------------------------
/annotation-tool/README.md:
--------------------------------------------------------------------------------
1 | # Temporal Annotation Tool (TaTo)
2 | We have acquired a large amount of high-quality driver material in the DMD (Driver Monitoring Dataset) with the purpose of developing computer vision algorithms for driver monitoring. But what would a dataset be without its corresponding annotations?
3 |
4 | We developed the TaTo tool to annotate temporal events and actions performed by the drivers in the video sequences. The tool was used to annotate distraction-related actions. However, through configuration, other labels can be annotated.
5 |
6 | ## Content
7 | - [Temporal Annotation Tool (TaTo)](#temporal-annotation-tool-tato)
8 | - [Content](#content)
9 | - [Setup and Launching](#setup-and-launching)
10 | - [TaTo characteristics](#tato-characteristics)
11 | - [Usage Instructions](#usage-instructions)
12 | - [General functionality](#general-functionality)
13 | - [TaTo Window description](#tato-window-description)
14 | - [Annotating with TaTo](#annotating-with-tato)
15 | - [Select the annotation level](#select-the-annotation-level)
16 | - [Special labels](#special-labels)
17 | - [Annotation Modes](#annotation-modes)
18 | - [Frame-by-frame annotation](#frame-by-frame-annotation)
19 | - [Block annotation](#block-annotation)
20 | - [Automatic annotation](#automatic-annotation)
21 | - [Keyboard Interaction](#keyboard-interaction)
22 | - [General keys](#general-keys)
23 | - [Video Navigation](#video-navigation)
24 | - [Playback keys](#playback-keys)
25 | - [Annotation keys](#annotation-keys)
26 | - [Saving annotations](#saving-annotations)
27 | - [Annotation criteria](#annotation-criteria)
28 | - [Changelog](#changelog)
29 | - [FAQs](#faqs)
30 | - [Known Issues](#known-issues)
31 | - [License](#license)
32 |
33 | ## Setup and Launching
34 | The TaTo tool has been tested using the following system configuration:
35 |
36 | **OS:** Ubuntu 18.04, Windows 10
37 | **Dependencies:** Python 3.8, OpenCV-Python 4.2.0, VCD>=6.0.3
38 |
39 | For a detailed description on how to configure the environment and launch the tool, check: [Linux](../docs/setup_linux.md) / [Windows](../docs/setup_windows.md)
40 |
41 | ## TaTo characteristics
42 | TaTo is a Python keyboard-based application to create and modify temporal annotations in [VCD 5.0 format](https://vcd.vicomtech.org/). It was originally designed to manage annotations of [DMD](http://dmd.vicomtech.org/) videos, but it has been modified to be compatible with videos in general. It supports multi-label annotation and has an intuitive interface to visualize temporal annotations through timelines. It offers the following features:
43 |
44 | - Allows the annotation of temporal events in videos (not only from the DMD).
45 | - The annotations can be divided into up to 7 different annotation levels (groups of labels). Within each group the labels are **mutually exclusive.**
46 | - Levels and labels of annotation can be defined in a **configuration file** and TaTo can work with many of these files (one at a time).
47 | - Annotations are represented by colors in two timelines. Each label has its own color.
48 | - The labels can be input either frame-by-frame or by frame-block interval.
49 | - There is a **validation property** for each frame that indicates how it was annotated: frame-by-frame or frame-block interval.
50 | - The output annotations are saved in VCD format in a **JSON file**.
51 | - For DMD default annotation modes, you can apply **automatic annotations**. That is, applying logically related annotations among levels.
52 | - To avoid loss of progress, the tool **autosaves** annotations. This autosaving is done in TXT files so as not to affect the tool's performance.
53 |
54 | ## Usage Instructions
55 |
56 | ### General functionality
57 | The tool's operation depends on a general configuration file, **“config.json”**. The main options to configure are in the **“tatoConfig”** object:
58 |
59 | #### Annotation mode
60 | DMD has predefined annotation modes. These annotation modes are defined by a group of labels related to an analysis dimension of the dataset. For example, there is the distraction annotation mode which includes labels like texting-left, drinking, etc.
61 | You can indicate the annotation mode in the “annotation_mode” field. To define a new annotation mode you must create a new configuration file and name it “config_$annModeName.json”. To create an annotation mode configuration file, we recommend duplicating and modifying an existing one to avoid incompatibilities. The file must contain the following JSON objects (a minimal example is sketched after this list):
62 |
63 | - **level_names:** this object lists the names of the levels of annotation. A level is a category of labels. Each level must be identified by a number as a string (e.g. “0”: "driver_actions").
64 | - **level_types:** This object specifies the nature of each level in the VCD (action, object, stream_properties). There must be the same number of items as in level_names, and they must share the same ids (e.g. “0”, “1”).
65 | - **$level_id:** For each level, there is an object that lists its corresponding labels. The name of the object must match the id from level_names and level_types (e.g. “0”, “1”). In the same way, the labels within each level must be identified by a number as a string (e.g. “0”: “safe_drive”). There are two standard labels with fixed ids: “99”: “--”, which represents the absence of a label in a frame, and “100”: “NAN”, which means there is no frame information.
66 | - **level_defaults:** In this object, the default label for each level is defined. When the VCD is initially created from scratch, all frames are annotated with the default label. This label should be the most recurrent one in the video, so the annotation effort is reduced. There must be the same number of items as in level_names, and they must share the same ids (e.g. “0”, “1”).
67 | - **camera_dependencies:** The DMD annotation is done with a mosaic video that includes 3 perspective views. Some labels are camera-dependent, so when there is no frame from one perspective, there shouldn’t be an annotation for those perspective-dependent labels. To indicate these dependencies, there must be an array of the dependent level ids for each camera perspective (e.g. "face": [1,2]).
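A minimal sketch of such a file, assuming a hypothetical "config_myMode.json" with only two levels; the special ids ("99": "--", "100": "NAN") and the overall structure follow the shipped config_*.json files, and the full seven-level version can be seen in config_distraction.json.

```python
# Hypothetical example: build and write a reduced annotation-mode config.
import json

my_mode_config = {
    "level_names": {"0": "occlusion", "1": "driver_actions"},
    "level_types": {"0": "stream_properties", "1": "action"},
    "0": {"0": "face_camera", "1": "body_camera", "2": "hands_camera", "99": "--", "100": "NAN"},
    "1": {"0": "safe_drive", "1": "texting_right", "100": "NAN"},
    "level_defaults": {"0": 99, "1": 0},
    # Level 1 (driver_actions) labels depend on the body camera view.
    "camera_dependencies": {"face": [], "body": [1], "hands": []},
}

with open("config_myMode.json", "w") as f:
    json.dump(my_mode_config, f, indent=2)
```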
68 |
69 | #### Dataset
70 | This option identifies whether you are working with the DMD, so that TaTo loads the predefined configuration and validations for this dataset. If a different dataset is indicated, a general configuration is loaded and any video can be annotated with VCD>=5.0.
71 |
72 | #### Pre_annotate
73 | We created automatic pre-annotations for the DMD; the DMD also has metadata and static annotations that had to be included inside the VCD. To do this, they are loaded before creating a VCD, and this configuration option activates that functionality. You must leave it with a value of 0, since pre-annotation only works internally.
74 |
75 | ### TaTo Window description
76 | The annotation tool TaTo opens with three windows:
77 |
78 | ![TaTo windows](../docs/imgs/annotation_tool_info.png)
79 |
80 | The main window will display:
81 | - The **mosaic video** consisting of three camera streams (these streams were previously synchronized).
82 | - Frame information (current frame). The **current frame** is the mosaic frame.
83 | - The last time you saved your annotation (It is recommended to save your progress frequently).
84 | - The [**annotation info**](#select-the-annotation-level) panel shows a list of levels (annotation groups) and the current label for each level at the current frame. **The level you are currently annotating is indicated with the “->” sign and is written in bold.**
85 |
86 | A second window (with dark background) will show:
87 | - The **frame offsets** between video streams
88 | - The **video path** you are annotating.
89 | - A list of the available **labels** of the current selected annotation level. Each label has a colored box and a key to press for annotation in brackets [ ]. Check [annotation instructions](#annotating-with-tato) to know how to input annotation for each level.
90 | - An informative list of possible frame validation states.
91 |
92 | In the third window there are **two timelines** which show video annotations in colors, depending on the level:
93 | - The first is a full timeline from frame 0 to the total length of the video.
94 | - The second is a zoomed timeline around the current frame, always keeping it in the center of the window. This timeline helps the visualization of individual frame labels.
95 |
96 | The colors in the timeline representation are the same as the colors in the label description panel. The colors and labels depend on the selected annotation level.
97 |
98 | When there are 5 frames left, a “last frames” text appears, and a “LAST FRAME!” text appears at the last frame.
99 |
100 | ### Annotating with TaTo
101 | The interaction with the tool is meant to be done using the keyboard. We have a thorough list of keys available for interaction and annotation. Once the tool is open, you can press P to open instructions.
102 |
103 | #### Select the annotation level
104 | An annotation level is a group of labels which are mutually exclusive (two or more labels of the same level cannot be assigned to the same frame). The definition of these levels and their corresponding labels is taken from a [config file](../annotation-tool/config_distraction.json).
105 |
106 | There are some annotation levels which require all frames to have a label. However, other levels can admit frames with no label. This depends on the nature of the annotation level. Annotation levels with empty labels will include this option in the config file.
107 |
108 | The first step to start annotating a video is to **select the annotation level**. You can see the current selected level in the main window's annotation info panel (see figure below). The current selected level will display an arrow "->" symbol and the current label will be highlighted in bold text. To change between levels use the Tab key.
109 |
110 | ![Annotation level panel](../docs/imgs/level_panel.png)
111 |
112 | ##### Special labels
113 | All the annotations levels will have a group of text labels which are used to annotate the temporal events. However, you will see two special labels displayed in the [annotation info panel](../docs/imgs/level_panel.png):
114 |
115 | - **"NAN"**: This means there is not a frame from the camera stream used to annotate that level. This label is automatically set by the tool, it is not possible to change it manually.
116 | - **"--"**: This means absence of label. Some levels don't require to have continuos annotation labels so this "empty" label should be present. For those levels which requires continuos annotations, this "empty" label should not be present. Please consult the annotation criteria to know level characteristics.
117 |
118 | #### Annotation Modes
119 | The annotation tool has two modes of annotation: **frame-by-frame** and **block** annotation.
120 |
121 | ##### Frame-by-frame annotation
122 | This is the basic annotation mode. To annotate frame-by-frame you should first [select the annotation level](#select-the-annotation-level) you want to annotate. Then, press the corresponding [label key](#annotation-keys) according to the list of available labels located in the window with dark background. The key to press will be in brackets [ ] on the left of each label. Once you press the key, the frame color in the timeline will change.
123 |
124 | ##### Block annotation
125 | This is a handy way to annotate a frame interval. For this, you should first select the frame interval to be annotated. To do so, press the Z key to select the starting frame. Then, move forward or backward with the [navigation keys](#video-navigation). You will see in the timeline window your selection of the frame interval in green. After selecting the desired frame interval, press the corresponding [label key](#annotation-keys) to annotate all frames in the interval with that label.
126 |
127 | To unselect a frame interval, press the Z key twice.
128 |
129 | ![Block annotation](../docs/imgs/block_annotation.png)
130 |
131 | ##### Automatic annotation
132 | In the case of DMD, there are logical relationships between the different levels of annotation. For example, if the driver is performing the activity “Texting left”, then the annotation in the hands_using_wheel level should be “Only Right”. There is a function that applies these logical annotations and changes the related labels in the other levels depending on the driver_actions level.
133 |
134 | It is important to use this function only after you have completed the driver_actions annotations; it then performs the automatic annotation changes on the rest of the levels. The key to do this in the tool is X, and it is a one-time operation. More info [here](https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/wiki/DMD-distraction-related-action-annotation-criteria#apply-automatic-annotation-interpolation-warning).
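As an illustration, a minimal sketch of this kind of level dependency; the mapping below is a hypothetical excerpt, not the exact rule set applied by TaTo (see the annotation criteria for the full rules).

```python
# Hypothetical excerpt of driver_actions -> hands_using_wheel implications.
IMPLIED_HANDS_ON_WHEEL = {
    "texting_left": "only_right",
    "phonecall_left": "only_right",
    "texting_right": "only_left",
    "phonecall_right": "only_left",
}

def implied_hands_label(driver_action):
    # Return the hands_using_wheel label implied by a driver action, or None.
    return IMPLIED_HANDS_ON_WHEEL.get(driver_action)

print(implied_hands_label("texting_left"))  # -> "only_right"
```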
135 |
136 | #### Keyboard Interaction
137 | ##### General keys
138 | | Keys | Function |
139 | | :--------------: | :----------------------------------------------- |
140 | | Esc | **Close** the tool, saving the current progress |
141 | | Enter | **Save** the current annotation progress |
142 | | P | Open a help window showing the available keys |
143 |
144 | ##### Video Navigation
145 | | Keys | Function |
146 | | :----------------: | :------------------------------------------------------- |
147 | | Any Key | Besides function specific keys, move forward **1 frame** |
148 | | E | Move **forward 50 frames** |
149 | | R | Move **forward 300 frames** |
150 | | W | Move **backwards 50 frames** |
151 | | Q | Move **backwards 300 frames** |
152 | | Space | Move **backwards 1 frame** |
153 | | S | **Jump forward** to nearest label change |
154 | | A | **Jump backwards** to nearest label change |
155 |
156 | ##### Playback keys
157 | This is a functionality where you can play and visualize the video along with the annotations from all levels. It can be used to check annotations or to navigate the timeline.
158 |
159 | | Keys | Function |
160 | | :------------------: | :------------------------------------------------------------------------------------------------------------ |
161 | | Backspace | Opens the **playback of the video** in a new window. |
162 | | Enter | In the playback window, closes the playback window returning to the main window at the **last frame played** |
163 | | Esc | In the playback window, closes the playback window returning to the main window at the **first frame played** |
164 |
165 | ##### Annotation keys
166 | | Keys | Function |
167 | | :--------------: | :------------------------------------------------------------------------ |
168 | | Tab | **Switch** between annotation levels |
169 | | Z | **Select/Unselect** the starting frame for block annotation (key-frame) |
170 | | 0 ...9 , / , * , - , + | **Annotate** the frame or frame interval with the corresponding label. |
171 | | . | **Remove** the current label of the frame or frame interval |
172 | | X | Apply **automatic annotations** to other levels. :warning: **Caution: This is a destructive action, apply carefully.** |
173 |
174 | ## Saving annotations
175 | You can save the progress by pressing the Enter key. The tool also saves the progress when you exit the tool using the Esc key.
176 |
177 | **The tool saves the annotations in VCD (OpenLABEL) format in a JSON file.**
178 |
179 | The tool has an autosave functionality that creates two TXT files with the annotation information. If a VCD file is successfully saved, the autosave TXT files are deleted automatically. The names of these files are:
180 | - [video_name]_autoSaveAnn-A.txt
181 | - [video_name]_autoSaveAnn-B.txt
182 |
183 | **In case something occurs and the tool exits without you manually saving the progress, you can recover your progress from these temporary TXT files.**
184 |
185 | If there was a failure in saving the VCD, you will have **both** JSON and TXT files in the video directory. If you try to run the tool again, you should receive an error telling you to keep the most recent file. If that is the case, **delete** the VCD (JSON) file and start TaTo again.
186 |
187 | You can tell which file TaTo loads during start-up; depending on the file the tool is reading the annotations from, it prints:
188 |
189 | - *"Loading VCD ..."* if the annotations are taken from a VCD json file.
190 | - *"Loading provisional txt annotation files ..."* if the autoSave txt files are loaded.
191 |
192 | ## Annotation criteria
193 | Depending on the annotation problem, different annotation criteria should be defined to guarantee all the annotators produce the same output annotations.
194 |
195 | We have defined the following criteria to be used with the tool to produce consistent annotations:
196 |
197 | - [DMD Distraction-related actions](https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/wiki/DMD-distraction-related-action-annotation-criteria) annotation
198 |
199 | ## Changelog
200 | For a complete list of changes check the [CHANGELOG.md](../CHANGELOG.md) file
201 |
202 | ## FAQs
203 |
204 | - **How can I change the label names?**
205 |
206 | The number of labels per level and their names are specified in the [config file](../annotation-tool/config_distraction.json). You can change them there and restart the tool. You can also add more labels. If you delete labels, some compatibility problems might appear. You can create your own config file and define your own annotation levels and labels.
207 |
208 | - **What if I forgot to save?**
209 |
210 | Relax, it happens. The tool saves progress when you press Esc and exit the tool. If you forgot to save and there was a sudden problem with the tool, there are autosave files that can help you recover your unsaved progress. Go to: [Saving annotations](#saving-annotations)
211 |
212 | - **The tool gives the error: Incompatible file name. Please check your file name or folder structure.**
213 |
214 | This error appears when the path or the folder structure of the video you entered is not valid or does not follow the correct DMD nomenclature ([DMD File Structure](https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/wiki/DMD-file-struct)). Also, make sure that the path you enter is the one from the **mosaic video** and not from the annotations.
215 |
216 | - **I have the error: There are two annotation files: VCD and txt. Please, keep only the most recent one. You can delete '..ann.json' file or '..autoSaveAnnA.txt and ..autoSaveAnnB.txt' files.**
217 |
218 | That error appears when there are two kinds of annotation files: the VCD file and the autosave TXT files. This means that the VCD file was not saved successfully for some reason and there is a backup of your unsaved progress in the TXT files. You can check whether there are annotations inside the TXT files and then go ahead and delete the VCD. When you open the tool again, it will create a VCD from the TXT files and everything will be fine again :)
219 |
220 | - **I'm pressing the keys to navigate through the timeline but it does not move**
221 |
222 | Check if any other support window apart from the 3 main windows is open, such as the instructions window. If so, you can close it with Esc or by pressing the "x" directly on the window.
223 |
224 | - **If the main window goes dark while there is a "Saving..." sign, is it still saving?**
225 |
226 | Yes. Depending on the VCD file size, it can take a while to save the progress, and the main window turns dark. We understand that it seems like there has been a problem and the tool is not responding, but don't worry, it is normal :)
227 |
228 | - **What are the static annotations shown in the console?**
229 |
230 | Besides temporal annotations, each video has context annotations, driver info and properties that we call static annotations. These are stored within the VCD file, since this format allows including all kinds of annotations. You can access the VCD file to use those annotations.
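For instance, a minimal sketch of inspecting a saved annotation file with plain Python; the file name is a placeholder and the exact top-level keys depend on the OpenLABEL/VCD version.

```python
import json

# Open a TaTo output file (OpenLABEL/VCD JSON); replace the placeholder path.
with open("gA_1_s1_2019-03-08T09;31;15+01;00_rgb_ann_distraction.json") as f:
    data = json.load(f)

# The static annotations (driver info, recording context, metadata) are stored
# alongside the temporal annotations; inspect the top-level structure first.
print(list(data.keys()))
```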
231 |
232 |
233 | ## Known Issues
234 |
235 | | Issue | Solution |
236 | | ------------------- | :------------------------------------------------------- |
237 | | When Alt Gr is pressed, the tool exits abruptly | This is caused by a bug in the capture system dependency used in the tool. Please **don't press** Alt Gr |
238 |
239 |
240 | :warning: If you find any bug with the tool or have ideas of new features please open a new issue using the [bug report template](../docs/issue_bug_template.md) or the [feature request template](../docs/issue_feature_template.md) :warning:
241 |
--------------------------------------------------------------------------------
/annotation-tool/config.json:
--------------------------------------------------------------------------------
1 | {
2 | "tatoConfig": {
3 | "annotation_mode": "distraction",
4 | "dataset": "dmd",
5 | "pre_annotate": 0,
6 | "calculate_time":0
7 | },
8 | "interfaceText": {
9 | "mainLevelDependency":{
10 | "distraction":"depending on Driver Actions level!",
11 | "drowsiness":"depending on Eye State (Right) level!"
12 | },
13 | "levelCompletedToAnnotate":{
14 | "distraction": "Only do this when you have completed the Driver_Actions Annotations",
15 | "drowsiness": "Only do this when you have completed the Eye State (Right) Annotations"
16 | }
17 | },
18 | "consoleText": {
19 | "video_path_dmd": {
20 | "True": "PATH of the video (/dmd/.../_.._mosaic.avi): ",
21 | "False": "PATH of the video: "
22 | }
23 | },
24 | "dimensions":{
25 | "total-width": 1280,
26 | "total-height":720
27 |
28 | },
29 | "colors": {
30 | "textColorMain": [
31 | 60,
32 | 60,
33 | 60
34 | ],
35 | "textColorLabels": [
36 | 255,
37 | 255,
38 | 255
39 | ],
40 | "textColorInstructions": [
41 | 60,
42 | 60,
43 | 60
44 | ],
45 | "backgroundColorMain": 255,
46 | "backgroundColorLabels": 40,
47 | "backgroundColorInstructions": 230,
48 | "keyFrameColor": [
49 | 131,
50 | 255,
51 | 167
52 | ],
53 | "colorDict": {
54 | "0": [
55 | 223,
56 | 215,
57 | 195
58 | ],
59 | "1": [
60 | 105,
61 | 237,
62 | 249
63 | ],
64 | "2": [
65 | 201,
66 | 193,
67 | 63
68 | ],
69 | "3": [
70 | 6,
71 | 214,
72 | 160
73 | ],
74 | "4": [
75 | 233,
76 | 187,
77 | 202
78 | ],
79 | "5": [
80 | 133,
81 | 81,
82 | 252
83 | ],
84 | "6": [
85 | 0,
86 | 154,
87 | 255
88 | ],
89 | "7": [
90 | 181,
91 | 107,
92 | 69
93 | ],
94 | "8": [
95 | 137,
96 | 171,
97 | 31
98 | ],
99 | "9": [
100 | 224,
101 | 119,
102 | 125
103 | ],
104 | "10": [
105 | 153,
106 | 153,
107 | 255
108 | ],
109 | "11": [
110 | 83,
111 | 73,
112 | 193
113 | ],
114 | "12": [
115 | 107,
116 | 79,
117 | 54
118 | ],
119 | "13": [
120 | 106,
121 | 107,
122 | 131
123 | ],
124 | "99": [
125 | 245,
126 | 245,
127 | 245
128 | ],
129 | "100": [
130 | 80,
131 | 80,
132 | 80
133 | ],
134 | "val_0": [
135 | 245,
136 | 245,
137 | 245
138 | ],
139 | "val_1": [
140 | 223,
141 | 187,
142 | 185
143 | ],
144 | "val_2": [
145 | 250,
146 | 210,
147 | 170
148 | ]
149 | }
150 | }
151 | }
--------------------------------------------------------------------------------
/annotation-tool/config_distraction.json:
--------------------------------------------------------------------------------
1 | {
2 | "6": {
3 | "0": "safe_drive",
4 | "1": "texting_right",
5 | "2": "phonecall_right",
6 | "3": "texting_left",
7 | "4": "phonecall_left",
8 | "5": "radio",
9 | "6": "drinking",
10 | "7": "reach_side",
11 | "8": "hair_and_makeup",
12 | "9": "talking_to_passenger",
13 | "10": "reach_backseat",
14 | "11": "change_gear",
15 | "12": "standstill_or_waiting",
16 | "13": "unclassified",
17 | "100": "NAN"
18 | },
19 | "5": {
20 | "0": "cellphone",
21 | "1": "hair_comb",
22 | "2": "bottle",
23 | "99": "--",
24 | "100": "NAN"
25 | },
26 | "4": {
27 | "0": "hand_on_gear",
28 | "99": "--",
29 | "100": "NAN"
30 | },
31 | "3": {
32 | "0": "both",
33 | "1": "only_right",
34 | "2": "only_left",
35 | "3": "none",
36 | "100": "NAN"
37 | },
38 | "2": {
39 | "0": "talking",
40 | "99": "--",
41 | "100": "NAN"
42 | },
43 | "1": {
44 | "0": "looking_road",
45 | "1": "not_looking_road",
46 | "100": "NAN"
47 | },
48 | "0": {
49 | "0": "face_camera",
50 | "1": "body_camera",
51 | "2": "hands_camera",
52 | "99": "--",
53 | "100": "NAN"
54 | },
55 | "level_names": {
56 | "0": "occlusion",
57 | "1": "gaze_on_road",
58 | "2": "talking",
59 | "3": "hands_using_wheel",
60 | "4": "hand_on_gear",
61 | "5": "objects_in_scene",
62 | "6": "driver_actions"
63 | },
64 | "level_types": {
65 | "0": "stream_properties",
66 | "1": "action",
67 | "2": "action",
68 | "3": "action",
69 | "4": "action",
70 | "5": "object",
71 | "6": "action"
72 | },
73 | "level_defaults": {
74 | "0": 99,
75 | "1": 0,
76 | "2": 99,
77 | "3": 0,
78 | "4": 99,
79 | "5": 99,
80 | "6": 0
81 | },
82 | "camera_dependencies":{
83 | "face": [1,2],
84 | "body": [5,6],
85 | "hands": [3,4]
86 | }
87 | }
88 |
--------------------------------------------------------------------------------
/annotation-tool/config_drowsiness.json:
--------------------------------------------------------------------------------
1 | {
2 | "0": {
3 | "0": "face_camera",
4 | "1": "body_camera",
5 | "2": "hands_camera",
6 | "99": "--",
7 | "100": "NAN"
8 | },
9 | "1":{
10 | "0": "open",
11 | "1": "close",
12 | "2": "opening",
13 | "3": "closing",
14 | "4": "undefined",
15 | "100": "NAN"
16 | },
17 | "2": {
18 | "0": "blinking",
19 | "99": "--",
20 | "100": "NAN"
21 | },
22 | "3": {
23 | "0": "Yawning with hand",
24 | "1": "Yawning without hand",
25 | "99": "--",
26 | "100": "NAN"
27 | },
28 | "level_names": {
29 | "0": "occlusion",
30 | "1": "eyes_state",
31 | "2": "blinks",
32 | "3": "yawning"
33 | },
34 | "level_types": {
35 | "0": "stream_properties",
36 | "1": "action",
37 | "2": "action",
38 | "3": "action"
39 | },
40 | "level_defaults": {
41 | "0": 99,
42 | "1": 0,
43 | "2": 99,
44 | "3": 99
45 |
46 | },
47 | "camera_dependencies": {
48 | "face": [1,2,3],
49 | "body": [],
50 | "hands": []
51 | }
52 | }
53 |
--------------------------------------------------------------------------------
/annotation-tool/config_gaze.json:
--------------------------------------------------------------------------------
1 | {
2 | "2": {
3 | "0": "blinking",
4 | "99": "--",
5 | "100": "NAN"
6 | },
7 | "1": {
8 | "0": "left_mirror",
9 | "1": "left",
10 | "2": "front",
11 | "3": "center_mirror",
12 | "4": "front_right",
13 | "5": "right_mirror",
14 | "6": "right",
15 | "7": "infotainment",
16 | "8": "steering_wheel",
17 | "9": "not_valid",
18 | "100": "NAN"
19 | },
20 | "0": {
21 | "0": "face_camera",
22 | "1": "body_camera",
23 | "2": "hands_camera",
24 | "99": "--",
25 | "100": "NAN"
26 | },
27 | "level_names": {
28 | "0": "occlusion",
29 | "1": "gaze_zone",
30 | "2": "blinks"
31 | },
32 | "level_types": {
33 | "0": "stream_properties",
34 | "1": "action",
35 | "2": "action"
36 | },
37 | "level_defaults": {
38 | "0": 99,
39 | "1": 2,
40 | "2": 99
41 | },
42 | "camera_dependencies": {
43 | "face": [
44 | 1,
45 | 2
46 | ],
47 | "body": [],
48 | "hands": []
49 | }
50 | }
--------------------------------------------------------------------------------
/annotation-tool/config_hands.json:
--------------------------------------------------------------------------------
1 | {
2 | "0": {
3 | "0": "face_camera",
4 | "1": "body_camera",
5 | "2": "hands_camera",
6 | "99": "--",
7 | "100": "NAN"
8 | },
9 | "1": {
10 | "0": "both_hands",
11 | "1": "only_right",
12 | "2": "only_left",
13 | "3": "none",
14 | "100": "NAN"
15 | },
16 | "2": {
17 | "0": "taking control",
18 | "1": "giving control",
19 | "99": "--",
20 | "100": "NAN"
21 | },
22 | "3": {
23 | "0": "moving",
24 | "1": "not_moving",
25 | "100": "NAN"
26 | },
27 | "level_names": {
28 | "0": "occlusion",
29 | "1": "hands_on_wheel",
30 | "2": "transition",
31 | "3": "moving_hands"
32 | },
33 | "level_types": {
34 | "0": "stream_properties",
35 | "1": "action",
36 | "2": "action",
37 | "3": "action"
38 | },
39 | "level_defaults": {
40 | "0": 99,
41 | "1": 0,
42 | "2": 99,
43 | "3": 1
44 | },
45 | "camera_dependencies": {
46 | "face": [],
47 | "body": [
48 | 1,
49 | 2,
50 | 3
51 | ],
52 | "hands": []
53 | }
54 | }
--------------------------------------------------------------------------------
/annotation-tool/config_statics.json:
--------------------------------------------------------------------------------
1 | {
2 | "static_dict": {
3 | "0": {
4 | "name": "age",
5 | "text": "Subject Age",
6 | "type": "num",
7 | "parent": {
8 | "element": "object",
9 | "type": "driver"
10 | }
11 | },
12 | "1": {
13 | "name": "gender",
14 | "text": "Subject Gender",
15 | "type": "text",
16 | "options": {
17 | "0": "Male",
18 | "1": "Female"
19 | },
20 | "parent": {
21 | "element": "object",
22 | "type": "driver"
23 | }
24 | },
25 | "2": {
26 | "name": "glasses",
27 | "text": "Is wearing glasses?",
28 | "type": "boolean",
29 | "options": {
30 | "0": "No",
31 | "1": "Yes"
32 | },
33 | "parent": {
34 | "element": "object",
35 | "type": "driver"
36 | }
37 | },
38 | "3": {
39 | "name": "drive_freq",
40 | "text": "Driving Frecuency",
41 | "type": "text",
42 | "options": {
43 | "0": "Once a week or less",
44 | "1": "Between 2 or 5 times a week",
45 | "2": "Everyday"
46 | },
47 | "parent": {
48 | "element": "object",
49 | "type": "driver"
50 | }
51 | },
52 | "4": {
53 | "name": "experience",
54 | "text": "Driving Experience",
55 | "type": "text",
56 | "options": {
57 | "0": "Less than 1 year",
58 | "1": "Between 1 and 3 years",
59 | "2": "More than 3 years"
60 | },
61 | "parent": {
62 | "element": "object",
63 | "type": "driver"
64 | }
65 | },
66 | "5": {
67 | "name": "weather",
68 | "text": "Weather",
69 | "type": "text",
70 | "options": {
71 | "0": "Sunny",
72 | "1": "Rainy",
73 | "2": "Cloudy"
74 | },
75 | "parent": {
76 | "element": "context",
77 | "type": "recording_context"
78 | }
79 | },
80 | "6": {
81 | "name": "setup",
82 | "text": "Setup",
83 | "type": "text",
84 | "options": {
85 | "0": "Car Moving",
86 | "1": "Car Stopped",
87 | "2": "Simulator"
88 | },
89 | "parent": {
90 | "element": "context",
91 | "type": "recording_context"
92 | }
93 | },
94 | "7": {
95 | "name": "annotatorID",
96 | "text": "Annotator ID",
97 | "type": "num",
98 | "parent": {
99 | "element": "metadata",
100 | "type": "annotator"
101 | }
102 | }
103 | }
104 | }
--------------------------------------------------------------------------------
/annotation-tool/setUp.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import json
3 | import re
4 | from pathlib import Path # To handle paths independent of OS
5 | import datetime
6 |
7 |
8 | def is_string_int(s):
9 | try:
10 | int(s)
11 | return True
12 | except ValueError:
13 | return False
14 |
15 |
16 | # Function to transform int keys to integer if possible
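# Used as json.load(..., object_pairs_hook=keys_to_int), e.g.
#   keys_to_int([("0", "safe_drive"), ("level_names", "x")])
#   returns {0: "safe_drive", "level_names": "x"}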
17 | def keys_to_int(x):
18 | r = {int(k) if is_string_int(k) else k: v for k, v in x}
19 | return r
20 |
21 |
22 | class ConfigTato:
23 |
24 | def __init__(self):
25 | """ Most of the paths specified are necessary for internal compatibility
26 | and for the construction of pre-annotations.
27 | If it is running in an external structure, the .json and .avi video paths
28 | should be enough.
29 | Run the tool through the ./annotate.sh script to avoid confusion
30 | with paths.
31 | """
32 |
33 | # ----GLOBAL VARIABLES----
34 | default_annotation_modes = ["distraction", "drowsiness", "gaze"]
35 | # @_external_struct: flag to know if it is running in an internal or external
36 | # structure
37 | self._external_struct = True
38 |
39 | # ----LOAD CONFIG FROM JSON----
40 |
41 | # Config dictionary path
42 | self._config_json = "config.json"
43 | # From json to python dictionaries
44 | with open(self._config_json) as config_file:
45 | config_dict = json.load(config_file, object_pairs_hook=keys_to_int)
46 | tatoConfig = config_dict["tatoConfig"]
47 | self._interfaceTexts = config_dict["interfaceText"]
48 | self._consoleTexts = config_dict["consoleText"]
49 | self._colorConfig = config_dict["colors"]
50 | self._dimensions = config_dict["dimensions"]
51 | # Config variables
52 | self._pre_annotate = bool(tatoConfig["pre_annotate"])
53 | self._annotation_mode = tatoConfig["annotation_mode"]
54 | self._annotation_dataset = tatoConfig["dataset"]
55 | self._calculate_time = bool(tatoConfig["calculate_time"])
56 | self._default_annotation_mode = self._annotation_mode in default_annotation_modes
57 | self._dataset_dmd = self._annotation_dataset == "dmd"
58 |
59 |
60 | #----GET CONSOLE INPUTS----
61 | print("Welcome :)")
62 | #Capture video PATH
63 | self._video_file_path = Path(input(self._consoleTexts["video_path_dmd"][str(self._dataset_dmd)]))
64 |
65 | #Check if video exists
66 | if not self._video_file_path.exists():
67 | raise RuntimeError("Video file doesn't exist: " +
68 | str(self._video_file_path.resolve()))
69 | else:
70 | print("Video from " + self._annotation_dataset +
71 | " loaded: " + self._video_file_path.name)
72 |
73 | #Check if config of annotation exists
74 | self._annConfig_file_path = Path("./config_"+self._annotation_mode+".json")
75 | if not self._annConfig_file_path.exists():
76 | raise RuntimeError("Annotation config file doesn't exist: " +
77 | str(self._annConfig_file_path.resolve()) + " Please, define a config file for "+self._annotation_mode+" or change 'annotation_mode' option in "+self._config_json)
78 | else:
79 | print("TaTo is in "+self._annotation_mode+" annotation mode with " +
80 | self._annConfig_file_path.name+" annotation config file.")
81 |
82 | #----DEFINE FILES PATHS----
83 | root_path = self._video_file_path.parent
84 | #If annotating dmd
85 | if self._dataset_dmd:
86 | # Build a regular expression for the mosaic name to be satisfied by the input mosaic file name
87 | regex_internal = '(?P<subject>[1-9]|[1-2][0-9]|[3][0-7])_(?P<session>[a-z]{1,}|[a-z]{1,}[2])_'\
88 | '(?P<stream>mosaic|body|face|hands)_(?P<date>(?P<month>0[1-9]|1[012])-(?P<day>0[1-9]|[12][0-9]|3[01]))'
89 | regex_external = '(?P<group>g[A-z]{1,})_(?P<subject>[1-9]|[1-2][0-9]|[3][0-7])_'\
90 | '(?P<session>s[1-9]{1,})_(?P<date>(?P<year>\d{4})-(?P<month>0[1-9]|1[012])-'\
91 | '(?P<day>0[1-9]|[12][0-9]|3[01]))T(?P<timestamp>(?P<hour>\d{1,2});(?P<min>\d{1,2});'\
92 | '(?P<sec>\d{1,2}))\+\d{1,2};\d{1,2}_(?P<channel>rgb|depth|ir)_(?P<stream>mosaic|body|face|hands)'
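# Example of an external-structure name these patterns are meant to match
# (illustrative values): gA_1_s1_2019-03-08T09;31;15+01;00_rgb_mosaic.avi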
93 | regex_internal = re.compile(regex_internal)
94 | regex_external = re.compile(regex_external)
95 | match_internal = regex_internal.search(str(self._video_file_path))
96 | match_external = regex_external.search(str(self._video_file_path))
97 |
98 | if match_internal:
99 | print("Video in internal structure")
100 | self._external_struct = False
101 | match = match_internal
102 | elif match_external:
103 | print("Video in external structure")
104 | self._external_struct = True
105 | match = match_external
106 | else:
107 | raise RuntimeError(
108 | "Incompatible mosaic name format: " + str(self._video_file_path) + ". Please check file structure or change 'dataset' option in "+self._config_json)
109 |
110 | #Get video info from path
111 | base_name = match.group()
112 | #Get GROUP
113 | self._group = self._video_file_path.parts[-4]
114 | #Get SUBJECT
115 | self._subject = match.group("subject")
116 | #Get SESSION
117 | self._session = match.group("session")
118 | #Get DATE
119 | self._date = match.group("date")
120 | #Get STREAM
121 | self._stream = match.group("stream")
122 |
123 | #check if video is according to mode
124 | distractionRelated = ['attm', 's1', 'atts', 's2', 'reach', 's3', 'attc', 's4',
125 | 'attm2', 'atts2', 'reach2', 'attc2']
126 | drowsinessRelated = ['drow2', 's5', 'drow']
127 | gazeRelated = ['gaze', 's6', 'gazec', 's7', 'gaze2', 'gazec2']
128 | if self._annotation_mode == "distraction" and self._session not in distractionRelated or self._annotation_mode == "drowsiness" and self._session not in drowsinessRelated or self._annotation_mode == "gaze" and self._session not in gazeRelated:
129 | print("---!!WARNING!!: the annotation mode does not match the type of video session.---")
130 |
131 | #Get OpenLABEL, AutoSaveAnn and TIME paths
132 | if self._external_struct:
133 | self._vcd_file_name = (base_name.replace(match.group(
134 | "stream"), 'ann') + '_' + self._annotation_mode + ".json")
135 | #To save progress in annotations in txt
136 | self._autoSave_file_name = (base_name.replace(
137 | match.group("stream"), 'autoSaveAnn-A') + ".txt")
138 | # To read and write the time expended in annotation
139 | self._annTime_file_name = (base_name + '_annTime.txt')
140 |
141 | #Get TIMESTAMP
142 | self._timestamp = match.group("timestamp")
143 | #Get STREAM
144 | self._channel = match.group("channel")
145 |
146 | else:
147 | self._vcd_file_name = (base_name.replace(match.group(
148 | "stream") + '_', '') + '_ann_' + self._annotation_mode + ".json")
149 | #To save progress in annotations in txt
150 | self._autoSave_file_name = (base_name.replace(
151 | match.group("stream") + '_', '') + '_autoSaveAnn-A'+".txt")
152 | # To read and write the time expended in annotation
153 | self._annTime_file_name = (base_name + '_annTime.txt')
154 |
155 | #Define paths for PREANNOTATE annotation mode
156 | if self._pre_annotate:
157 | base_name_body = base_name.replace('mosaic', 'body')
158 | # To read the pre-annotations of the mosaic
159 | self._preAnn_file_path = root_path / (base_name_body + "_ann.txt")
160 | # To read the intel annotations of the mosaic
161 | self._intelAnn_file_path = root_path / \
162 | (base_name_body + "_ann_intel.txt")
163 | # To keep compatibility with this ann format
164 | self._oldAnn_file_path = root_path / \
165 | (base_name_body + "_manualAnn.txt")
166 | # To read the shifts of body, face and hands videos
167 | self._shifts_file_path = (
168 | self._video_file_path.parents[3] / "logs-sync" / ("shifts-" + self._group + ".txt")).resolve()
169 | # To read the metadata of video session (driver info, frame numbers..etc)
170 | self._metadata_file_path = (
171 | self._video_file_path.parents[3] / "metadata" / ("all_"+ self._group+"_bag_metadata.txt")).resolve()
172 | #Check if shifts and metadata files exist
173 | if not self._shifts_file_path.exists():
174 | raise RuntimeError("Shift file doesn't exist: " +
175 | str(self._shifts_file_path.resolve()))
176 | if not self._metadata_file_path.exists():
177 | raise RuntimeError("Metadata file doesn't exist: " +
178 | str(self._metadata_file_path.resolve()))
179 | else:
180 | if self._pre_annotate:
181 | print("---!!WARNING!!: pre_annotate option is on (1). This is not compatible with other datasets. This option will be change to (0) ---")
182 | self._pre_annotate = False
183 |
184 | self._group = "0"
185 | #Get SUBJECT
186 | self._subject = "0"
187 | #Get SESSION
188 | self._session = "default"
189 | #Get DATE
190 | self._date = str(datetime.datetime.now().date())
191 | #Get STREAM
192 | self._stream = "general"
193 | #Working with other dataset
194 | base_name = self._video_file_path.stem
195 | #Get OpenLABEL, AutoSaveAnn and TIME paths
196 | self._vcd_file_name = (base_name + '_ann_' +
197 | self._annotation_mode + ".json")
198 | #To save progress in annotations in txt
199 | self._autoSave_file_name = (base_name + '_autoSaveAnn-A.txt')
200 | # To read and write the time expended in annotation
201 | self._annTime_file_name = (base_name + '_annTime.txt')
202 |
203 | self._vcd_file_path = root_path / self._vcd_file_name
204 | self._autoSave_file_path = root_path / self._autoSave_file_name
205 | self._annTime_file_path = root_path / self._annTime_file_name
206 |
207 | def get_annotation_config(self):
208 | with open(self._annConfig_file_path) as config_file:
209 | config_dict = json.load(config_file, object_pairs_hook=keys_to_int)
210 |
211 | #Complete Dictionaries
212 | self._config_dict = config_dict
213 | #Levels names
214 | self._level_names = config_dict["level_names"]
215 | #Levels defaults
216 | self._level_defaults = config_dict["level_defaults"]
217 | #Levels types
218 | self._level_types = config_dict["level_types"]
219 | #Labels
220 | self._level_labels = []
221 | for ide,name in self._level_names.items():
222 | self._level_labels.append(config_dict[ide])
223 | #Camera dependencies
224 | self._camera_dependencies = config_dict["camera_dependencies"]
225 | #Number of levels
226 | self._num_levels = len(self._level_labels)
227 |
228 | return self._config_dict, self._level_names, self._level_defaults, \
229 | self._level_types, self._level_labels, self._camera_dependencies,\
230 | self._num_levels
231 |
232 | def get_video_path_info(self):
233 | if self._external_struct:
234 | return self._group, self._subject, self._session, self._date, self._stream, self._timestamp, self._channel
235 | else:
236 | return self._group, self._subject, self._session, self._date, self._stream
237 |
238 | def get_statics_dict(self):
239 | with open("./config_statics.json") as config_file:
240 | config_dict = json.load(config_file, object_pairs_hook=keys_to_int)
241 | return config_dict["static_dict"]
242 |
243 |
--------------------------------------------------------------------------------
/annotation-tool/vcd4parser.py:
--------------------------------------------------------------------------------
1 | import warnings
2 | from pathlib import Path
3 |
4 | import numpy as np
5 |
6 | import vcd.core as core
7 | import vcd.types as types
8 | # Import local class to get tato configuration and paths
9 | from setUp import ConfigTato
10 |
11 | # dict for changes in structures
12 | dmd_struct = {
13 | 'groups': {
14 | 'grupo1A': 'gA',
15 | 'grupo2A': 'gB',
16 | 'grupo2M': 'gC',
17 | 'grupo3B': 'gD',
18 | 'grupoE': 'gE',
19 | 'grupo4B': 'gF',
20 | 'grupoZ': 'gZ'
21 | },
22 | 'sessions': {
23 | 'attm': 's1',
24 | 'atts': 's2',
25 | 'reach': 's3',
26 | 'attc': 's4',
27 | 'gaze': 's5',
28 | 'gazec': 's6',
29 | 'drow': 's7',
30 | 'attm2': 's1',
31 | 'atts2': 's2',
32 | 'reach2': 's3',
33 | 'attc2': 's4',
34 | 'gaze2': 's5',
35 | 'gazec2': 's6',
36 | 'drow2': 's7'
37 | }
38 | }
39 |
40 | # Type of annotation
41 | annotate_dict = {
42 | 0: 'unchanged',
43 | 1: 'manual',
44 | 2: 'interval'
45 | }
46 |
47 |
48 | def keys_exist(element: dict, *keys):
49 | """
50 | Check if *keys (nested) exists in `element` (dict).
51 | """
52 | if not isinstance(element, dict):
53 | raise AttributeError('keys_exists() expects dict as first argument.')
54 | if len(keys) == 0:
55 | raise AttributeError(
56 | 'keys_exists() expects at least two arguments, one given.')
57 |
58 | _element = element
59 | for key in keys:
60 | try:
61 | _element = _element[key]
62 | except KeyError:
63 | return False
64 | return True
65 |
66 |
67 | class VcdHandler(object):
68 | def __init__(self, setUpManager: ConfigTato):
69 | self._setUpManager = setUpManager
70 |
71 | # Get TaTo annotation mode
72 | self._annotation_mode = self._setUpManager._annotation_mode
73 |
74 | # Internal Variables
75 | self._vcd = None
76 | self._vcd_file = self._setUpManager._vcd_file_path
77 | self._vcd_loaded = False
78 |
79 | # Get dictionary information
80 | self._dict_file = self._setUpManager._config_json
81 |
82 | self._dicts, self._annotation_levels, \
83 | self._default_levels, self._annotation_types, \
84 | self._level_labels, self._camera_dependencies, \
85 | self._total_levels = self._setUpManager.get_annotation_config()
86 |
87 | # Dictionary that contains the static annotation definitions (from config_statics.json)
88 | self._statics_dict = self._setUpManager.get_statics_dict()
89 |
90 | self._annotation_levels = self._annotation_levels.items()
91 | self._default_levels = self._default_levels.items()
92 | self._annotation_types = self._annotation_types.items()
93 |
94 | # If OpenLABEL_file exists then load data from file
95 | if self._vcd_file.exists():
96 | print("OpenLABEL exists")
97 | # Create a OpenLABEL instance and load file
98 |
99 | self._vcd = core.VCD()
100 | self._vcd.load_from_file(file_name=self._vcd_file)
101 | self._vcd_loaded = True
102 | else:
103 | # Create Empty OpenLABEL
104 | self._vcd = core.VCD()
105 | self._vcd_loaded = False
106 |
107 |
108 | # This function adds the annotations and validation vectors to the provided
109 | # OpenLABEL object.
110 | # IMPORTANT: Call to this function should be done after defining all the
111 | # available streams for annotation, as this function could write
112 | # stream_properties
113 | def add_annotations(self, vcd: core.VCD, annotations, validations, ontology_uid: int):
114 |
115 | # Loop over all annotation levels to add the elements present in
116 | # annotation vector
117 | for level_code, level_type in zip(self._annotation_levels,
118 | self._annotation_types):
119 | level_idx = int(level_code[0])
120 | level_name = level_code[1]
121 | level_type_idx = int(level_type[0])
122 | level_type_name = level_type[1]
123 |
124 | assert (level_idx == level_type_idx)
125 | assert (len(self._level_labels) > 0)
126 |
127 | level_labels = self._level_labels[level_idx]
128 |
129 | for label_idx, label_name in level_labels.items():
130 | # Do not save NaN and Empty annotations
131 | if label_idx == 100 or label_idx == 99:
132 | continue
133 |
134 | annotations = np.array(annotations)
135 | validations = np.array(validations)
136 | # Compute frame number of all occurrences of label_idx
137 | f_list = np.where(annotations[:, level_idx] == label_idx)[0]
138 | v_list = validations[f_list, level_idx]
139 |
140 | #From frames with label_idx, select frames with validation 0, 1 and 2
141 | v_list_0 = f_list[np.where(v_list==0)]
142 | v_list_1 = f_list[np.where(v_list==1)]
143 | v_list_2 = f_list[np.where(v_list==2)]
144 |
145 | #If there are no annotated frames, then all validations are 0 (unchanged)
146 | if len(f_list)==0:
147 | v_list_0 = validations[f_list, level_idx]
148 |
149 | # Make intervals of frames
150 | f_interv = []
151 | f_interv = list(self.interval_extract(f_list))
152 |
153 | #Make intervals of validation
154 | v_0_intervals=list(self.interval_extract(v_list_0))
155 | v_1_intervals=list(self.interval_extract(v_list_1))
156 | v_2_intervals =list(self.interval_extract(v_list_2))
157 |
158 |
159 | # ## Add the elements
160 | # Add an action
161 | if level_type_name == 'action':
162 | action_type = level_name + '/' + label_name
163 | if len(f_interv)>0:
164 | el_uid = vcd.add_action("", semantic_type=action_type,
165 | frame_value=f_interv,
166 | ont_uid=ontology_uid)
167 |
168 |
169 | # Add how the annotation was done
170 | if len(v_0_intervals)>0:
171 | #Intervals with validation 0
172 | validation_data = types.text(name='annotated', val=annotate_dict[0])
173 | vcd.add_action_data(uid=el_uid,
174 | action_data=validation_data,
175 | frame_value=v_0_intervals)
176 | if len(v_1_intervals)>0:
177 | #Intervals with validation 1
178 | validation_data = types.text(name='annotated', val=annotate_dict[1])
179 | vcd.add_action_data(uid=el_uid,
180 | action_data=validation_data,
181 | frame_value=v_1_intervals)
182 | if len(v_2_intervals)>0:
183 | #Intervals with validation 2
184 | validation_data = types.text(name='annotated', val=annotate_dict[2])
185 | vcd.add_action_data(uid=el_uid,
186 | action_data=validation_data,
187 | frame_value=v_2_intervals)
188 |
189 | # Add an object
190 | elif level_type_name == 'object':
191 | object_type = label_name
192 | if len(f_interv)>0:
193 | el_uid = vcd.add_object("", semantic_type=object_type,
194 | frame_value=f_interv,
195 | ont_uid=ontology_uid)
196 | # Add how the annotation was done
197 | #Intervals with validation 0
198 | validation_data = types.text(name='annotated',
199 | val=annotate_dict[0])
200 | vcd.add_object_data(uid=el_uid,
201 | object_data=validation_data,
202 | frame_value=v_0_intervals)
203 | #Intervals with validation 1
204 | validation_data = types.text(name='annotated',
205 | val=annotate_dict[1])
206 | vcd.add_object_data(uid=el_uid,
207 | object_data=validation_data,
208 | frame_value=v_1_intervals)
209 | #Intervals with validation 2
210 | validation_data = types.text(name='annotated',
211 | val=annotate_dict[2])
212 | vcd.add_object_data(uid=el_uid,
213 | object_data=validation_data,
214 | frame_value=v_2_intervals)
215 | # Add stream properties
216 | elif level_type_name == 'stream_properties':
217 | # When a level is defined as stream_properties, the annotations
218 | # will always be considered as boolean, since TaTo only allows
219 | # the presence or absence of that property.
220 | # E.g. occlusion can only be True or False
221 | if len(f_interv)>0:
222 | for i, frame_num in enumerate(f_list):
223 |
224 | stream = label_name
225 | if stream == "--":
226 | continue
227 | property_dict = {
228 | level_name: {
229 | 'val': True,
230 | 'annotated': annotate_dict[int(v_list[i])]
231 | }
232 | }
233 | vcd.add_stream_properties(stream_name=stream,
234 | stream_sync=types.StreamSync(
235 | frame_vcd=int(frame_num)),
236 | properties=property_dict)
237 | else:
238 | raise RuntimeError(
239 | 'Invalid group type: ' + level_type_name)
240 | return vcd
241 |
242 | #This function gets a list of numbers (frames) and makes intervals.
243 | # Useful for add_annotations() function
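# e.g. interval_extract([0, 1, 2, 5, 6, 9]) yields [0, 2], [5, 6], [9, 9]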
244 | def interval_extract(self, list):
245 | if len(list) <= 0:
246 | return []
247 | #list = sorted(set(list))
248 | range_start = previous_number = list[0]
249 | for number in list[1:]:
250 | if number == previous_number + 1:
251 | previous_number = number
252 | else:
253 | yield [int(range_start), int(previous_number)]
254 | range_start = previous_number = number
255 | yield [int(range_start), int(previous_number)]
256 |
257 | # Return flag that indicate if OpenLABEL was loaded from file
258 | def file_loaded(self):
259 | return self._vcd_loaded
260 |
261 | # This function only saves the stored OpenLABEL object in the external file
262 | def save_vcd(self, pretty=False):
263 | # Save into file
264 | self._vcd.save(self._vcd_file, pretty=pretty)
265 |
266 | def update_vcd(self, annotations, validations):
267 | """ From an empty OpenLABEL, the annotation and validation vectors are
268 | parsed to OpenLABEL format.
269 |
270 | """
271 | # Get total number of lines which is equivalent to total number of
272 | # frames input video
273 | assert (len(annotations) == len(validations))
274 | total_frames = len(annotations)
275 |
276 | # 1.- Create OpenLABEL only with annotations and validations
277 | new_vcd = core.VCD()
278 |
279 | # 2.- OpenLABEL Name
280 | vcd_name = Path(self._vcd_file).stem
281 | new_vcd.add_name(str(vcd_name))
282 |
283 | # 3.- Camera
284 | # Build Uri to video files
285 | general_uri = self._setUpManager._video_file_path
286 | general_video_descr = 'Unique general camera'
287 | new_vcd.add_stream('general_camera', str(general_uri), general_video_descr,
288 | core.StreamType.camera)
289 | # 4.- Stream Properties
290 | # Real Intrinsics of camera
291 | new_vcd.add_stream_properties(stream_name='general_camera',
292 | properties={
293 | 'total_frames': total_frames,
294 | })
295 | # 5.- Add annotations and validations
296 | vcd = self.add_annotations(new_vcd,annotations, validations,0)
297 |
298 | # Update current OpenLABEL with newly created OpenLABEL
299 | self._vcd = vcd
300 |
301 | # This function is handy to perform simultaneously the updating and saving
302 | # of the OpenLABEL object
303 | # @annotations: annotations array
304 | # @validations: validations array
305 | # @pretty: save the OpenLABEL file with indentation for readability
308 | def update_save_vcd(self, annotations, validations, pretty=False):
309 | # Update OpenLABEL
310 | self.update_vcd(annotations, validations)
311 |
312 | # Save OpenLABEL
313 | self.save_vcd(pretty)
314 | self._vcd_loaded = True
315 |
316 | # This function extracts the annotation information from the OpenLABEL object
317 | # Returns:
318 | # @annotations: A matrix consisting of the annotation labels for each of
319 | # the levels in dict
320 | # @validations: A matrix consisting of the validation method while
321 | # annotating
322 | def get_annotation_vectors(self):
323 | # Get a copy of OpenLABEL object
324 | vcd = self._vcd
325 |
326 | if vcd is None:
327 | raise RuntimeError("Couldn't get OpenLABEL data")
328 |
329 | # Create annotation and validation vectors
330 | frame_interval = vcd.get_frame_intervals().fis_dict
331 | total_frames = frame_interval[-1]['frame_end'] + 1
332 | # Fill with initial default values
333 | annotations = np.array([[val for x, val in self._default_levels]
334 | for _ in range(total_frames)])
335 | validations = np.array([[0 for _ in range(self._total_levels)]
336 | for _ in range(total_frames)])
337 |
338 | # Fill Data of annotation and validation vectors
339 | # Loop over all the annotation levels searching for annotated frame
340 | # intervals
341 | for level_code, level_type in zip(self._annotation_levels,
342 | self._annotation_types):
343 | level_idx = level_code[0]
344 | level_name = level_code[1]
345 | level_type_idx = level_type[0]
346 | level_type_name = level_type[1]
347 |
348 | assert (level_idx == level_type_idx)
349 | assert (len(self._level_labels) > 0)
350 |
351 | # Loop over each level labels
352 | levels_list = self._level_labels[level_idx].items()
353 | for label_id, label_str in levels_list:
354 | if label_id == 99 or label_id == 100:
355 | continue
356 | if level_type[1] == 'action' or level_type[1] == 'object':
357 | ann_type = None
358 | elem_type = None
359 | if level_type[1] == 'action':
360 | ann_type = level_name + '/' + label_str
361 | elem_type = core.ElementType.action
362 | elif level_type[1] == 'object':
363 | ann_type = label_str
364 | elem_type = core.ElementType.object
365 | l_uids = vcd.get_elements_of_type(element_type=elem_type,
366 | semantic_type=ann_type)
367 | # Only allowed 0 or 1 action type in OpenLABEL
368 | assert (0 <= len(l_uids) <= 1)
369 | # Loop over all elements of specific type
370 | if len(l_uids) == 0:
371 | continue
372 | uid = l_uids[0]
373 | l_fi = vcd.get_element_frame_intervals(elem_type, uid=uid)
374 | # Loop over frame intervals
375 | for fi in l_fi.fis_dict:
376 | start = fi['frame_start']
377 | end = fi['frame_end'] + 1
378 | annotations[start:end, level_idx] = label_id
379 | # Get annotation method (validations)
380 | for f_num in range(start, end):
381 | d = vcd.get_element_data(elem_type, uid,
382 | 'annotated', f_num)
383 | val_id = [k for k, v in annotate_dict.items()
384 | if v == d['val']]
385 | validations[f_num, level_idx] = val_id[0]
386 | elif level_type[1] == 'stream_properties':
387 | # Only way is to read frame by frame the stream_properties
388 | for f_num in range(total_frames):
389 | f = vcd.get_frame(frame_num=f_num)
390 | if keys_exist(f, 'frame_properties', 'streams',
391 | label_str, 'stream_properties',
392 | level_name):
393 | a = f['frame_properties']['streams'][label_str][
394 | 'stream_properties']
395 | ann_str = a[level_name]['annotated']
396 | val_id = [k for k, v in annotate_dict.items()
397 | if
398 | v == ann_str]
399 | assert (len(val_id) == 1)
400 | annotations[f_num, level_idx] = label_id
401 | validations[f_num, level_idx] = val_id[0]
402 |
403 | return annotations, validations
404 |
405 | # Class to handle specific fields in OpenLABEL when the DMD dataset is used
406 |
407 |
408 | class DMDVcdHandler(VcdHandler):
409 |
410 | def __init__(self, setUpManager):
411 | super().__init__(setUpManager)
412 |
413 | # Ontology
414 | self.ont_uid = 0
415 |
416 | # OpenLABEL metadata
417 | self.group = self._setUpManager._group
418 | self.subject = self._setUpManager._subject
419 | self.session = self._setUpManager._session
420 | self.date = self._setUpManager._date
421 | if self._setUpManager._external_struct:
422 | self.date = self._setUpManager._timestamp
423 |
424 | # Stream metadata
425 | self._bf_shift = None
426 | self._hb_shift = None
427 | self._hf_shift = None
428 |
429 | self._b_frames = None
430 | self._f_frames = None
431 | self._h_frames = None
432 |
433 | self._f_intrinsics = np.zeros(12).tolist()
434 | self._b_intrinsics = np.zeros(12).tolist()
435 | self._h_intrinsics = np.zeros(12).tolist()
436 |
437 | # Driver statics
438 | self.uid_driver = None
439 | self.age = None
440 | self.gender = None
441 | self.glasses = None
442 | self.drive_freq = None
443 | self.experience = None
444 |
445 | # Recording context
446 | self.weather = None
447 | self.setup = None
448 | self.timestamp = None
449 |
450 | # Other metadata
451 | self.annotatorID = -1
452 |
453 |         # If an OpenLABEL file was loaded,
454 |         # try to extract metadata and statics
455 | if self._vcd_loaded:
456 | # Get values of shifts from loaded OpenLABEL
457 | self._bf_shift, self._hf_shift, self._hb_shift = self.get_shifts()
458 |
459 | # Get values of stream frame number from loaded OpenLABEL
460 | self._f_frames, self._b_frames, self._h_frames = self.get_frames()
461 |
462 | # Get stream intrinsics from loaded OpenLABEL
463 | self._f_intrinsics, self._b_intrinsics, self._h_intrinsics = \
464 | self.get_intrinsics(self._vcd)
465 |
466 | def add_annotationsx(self, vcd: core.VCD, annotations, validations, ontology_uid: int):
467 | return super().add_annotations(vcd, annotations, validations, ontology_uid)
468 |
469 | def save_vcd_dmd(self, pretty=False):
470 | # Save into file
471 | self._vcd.save(self._vcd_file, pretty=pretty)
472 |
473 | def update_vcd(self, annotations, validations, statics=None, metadata=None):
474 | """ Convert annotations into OpenLABEL format
475 | """
476 | # But, if there are already static annotations in OpenLABEL, take and keep
477 | # them for the next OpenLABEL
478 | areStatics = bool(statics)
479 | isMetadata = bool(metadata)
480 |
481 | if isMetadata:
482 | # @metadata: [face_meta, body_meta,hands_meta]
483 | # @face_meta (5): [rgb_video_frames,mat]
484 | # @body_meta (6): [date_time,rgb_video_frames,mat]
485 | # @hands_meta (7): [rgb_video_frames,mat]
486 | self._f_frames = int(metadata[0][0])
487 | self._f_intrinsics = metadata[0][1]
488 |
489 |             self.timeStamp = str(metadata[1][0])
490 |             # Change ":" symbol to ";" for correct visualization on Windows
491 |             self.timeStamp = self.timeStamp.replace(":", ";")
492 |
493 | self._b_frames = int(metadata[1][1])
494 | self._b_intrinsics = metadata[1][2]
495 |
496 | self._h_frames = int(metadata[2][0])
497 | self._h_intrinsics = metadata[2][1]
498 |
499 | if areStatics:
500 | # Driver Data
501 | age = int(statics[0]["val"])
502 | gender = statics[1]["val"]
503 | glasses = bool(statics[2]["val"])
504 | drive_freq = statics[3]["val"]
505 | experience = statics[4]["val"]
506 |
507 | # Context Data
508 | weather = statics[5]["val"]
509 | setup = statics[6]["val"]
510 |
511 | # Annotator
512 | annotatorID = str(statics[7]["val"])
513 |
514 | if self._bf_shift is None or self._hb_shift is None or \
515 | self._hf_shift is None:
516 | raise RuntimeError(
517 | "Shift values have not been set. Run set_shifts() function "
518 | "before")
519 | body_face_shift = self._bf_shift
520 | # hands_body_shift = self.__hb_shift
521 | hands_face_shift = self._hf_shift
522 |
523 | # Get total number of lines which is equivalent to total number of
524 | # frames of mosaic
525 | assert (len(annotations) == len(validations))
526 | total_frames = len(annotations)
527 |
528 | # 1.- Create a OpenLABEL instance
529 | vcd = core.VCD()
530 |
531 | # 2.- Add Object for Subject
532 | self.uid_driver = vcd.add_object(self.subject, "driver", ont_uid=0,
533 | frame_value=(0, total_frames - 1))
534 |
535 | # 3.- OpenLABEL Name
536 | vcd.add_name(self.group + '_' + self.subject + '_' +
537 | self.session + '_' + self.date + '_' +
538 | self._annotation_mode)
539 |
540 | # 4.- Annotator
541 | if areStatics:
542 | vcd.add_annotator(annotatorID)
543 |
544 | # 5- Ontology
545 | vcd.add_ontology('http://dmd.vicomtech.org/ontology')
546 |
547 | # 6.- Cameras
548 | # Build Uri to video files
549 | if self._setUpManager._external_struct:
550 | video_root_path = Path() / self.group / self.subject / self.session
551 | face_uri = video_root_path / (self.group + '_' + self.subject + '_' +
552 | self.session + '_' + self.date +
553 | '_rgb_face.mp4')
554 | body_uri = video_root_path / (self.group + '_' + self.subject + '_' +
555 | self.session + '_' + self.date +
556 | '_rgb_body.mp4')
557 | hands_uri = video_root_path / (self.group + '_' + self.subject + '_' +
558 | self.session + '_' + self.date +
559 | '_rgb_hands.mp4')
560 | else:
561 | video_root_path = Path() / self.group / self.date / self.subject
562 | face_uri = video_root_path / (self.subject + '_' +self.session + '_' + 'face' + '_' + self.date +'.mp4')
563 | body_uri = video_root_path / (self.subject + '_' +self.session + '_' + 'body' + '_' + self.date +'.mp4')
564 | hands_uri = video_root_path / (self.subject + '_' +self.session + '_' + 'hands' + '_' + self.date +'.mp4')
565 |
566 | face_video_descr = 'Frontal face looking camera'
567 | body_video_descr = 'Side body looking camera'
568 | hands_video_descr = 'Hands and wheel looking camera'
569 | vcd.add_stream('face_camera', str(face_uri), face_video_descr,
570 | core.StreamType.camera)
571 | vcd.add_stream('body_camera', str(body_uri), body_video_descr,
572 | core.StreamType.camera)
573 | vcd.add_stream('hands_camera', str(hands_uri), hands_video_descr,
574 | core.StreamType.camera)
575 |
576 | # 7.- Stream Properties
577 | # Real Intrinsics of cameras
578 | vcd.add_stream_properties(stream_name='face_camera',
579 | properties={
580 | 'cam_module': 'Intel RealSense D415',
581 | 'total_frames': self._f_frames,
582 | },
583 | stream_sync=types.StreamSync(frame_shift=0),
584 | intrinsics=types.IntrinsicsPinhole(
585 | width_px=1280, height_px=720,
586 | camera_matrix_3x4=self._f_intrinsics)
587 | )
588 | vcd.add_stream_properties(stream_name='body_camera',
589 | properties={
590 | 'camera_module': 'Intel RealSense D435',
591 | 'total_frames': self._b_frames,
592 | },
593 | stream_sync=types.StreamSync(
594 | frame_shift=body_face_shift),
595 | intrinsics=types.IntrinsicsPinhole(
596 | width_px=1280, height_px=720,
597 | camera_matrix_3x4=self._b_intrinsics)
598 | )
599 | vcd.add_stream_properties(stream_name='hands_camera',
600 | properties={
601 | 'camera_module': 'Intel RealSense D415',
602 | 'total_frames': self._h_frames,
603 | },
604 | stream_sync=types.StreamSync(
605 | frame_shift=hands_face_shift),
606 | intrinsics=types.IntrinsicsPinhole(
607 | width_px=1280, height_px=720,
608 | camera_matrix_3x4=self._h_intrinsics)
609 | )
610 |
611 | if areStatics or isMetadata:
612 | # 8.- Add Context of Recording session
613 | last_frame = total_frames - 1
614 | ctx_txt = 'recording_context'
615 | rec_context_uid = vcd.add_context(name='', semantic_type=ctx_txt,
616 | frame_value=(0, last_frame))
617 |
618 | if areStatics:
619 | vcd.add_context_data(rec_context_uid,
620 | types.text(name='weather', val=weather))
621 | vcd.add_context_data(rec_context_uid,
622 | types.text(name='setup', val=setup))
623 |
624 | # 9.- Add Driver static properties
625 | vcd.add_object_data(self.uid_driver,
626 | types.num(name='age', val=age))
627 | vcd.add_object_data(self.uid_driver,
628 | types.text(name='gender', val=gender))
629 | vcd.add_object_data(self.uid_driver,
630 | types.boolean(name='glasses', val=glasses))
631 | vcd.add_object_data(self.uid_driver,
632 | types.text(name='experience', val=experience))
633 | vcd.add_object_data(self.uid_driver,
634 | types.text(name='drive_freq', val=drive_freq))
635 | if isMetadata:
636 | vcd.add_context_data(rec_context_uid,
637 | types.text(name='recordTime',
638 | val=self.timeStamp))
639 |
640 | # 10.- Save annotation and validation vectors in OpenLABEL format
641 | # Perform general update
642 | new_vcd = self.add_annotationsx(vcd, annotations, validations, self.ont_uid)
643 |
644 | # Update class variable __vcd with newly created object
645 | self._vcd = new_vcd
646 | return True
647 |
648 | # This function is handy to perform simultaneously the updating and saving
649 | # of the OpenLABEL object
650 | # @annotations: annotations array
651 | # @validations: validations array
652 | # @statics: dict with values
653 | # @metadata: array with values from metadata file
654 | # @pretty
655 | def update_save_vcd(self, annotations, validations, statics=None,
656 | metadata=None, pretty=False):
657 | # Update OpenLABEL
658 | self.update_vcd(annotations, validations, statics, metadata)
659 |
660 | # Save OpenLABEL
661 | self.save_vcd_dmd(pretty)
662 | self._vcd_loaded = True
663 |
664 | # Return flag that indicates if the OpenLABEL was loaded from file
665 | def file_loaded(self):
666 | return self._vcd_loaded
667 |
668 | # Function to check the existence of total_frames field in OpenLABEL given a
669 | # stream name
670 | def stream_frames_exist(self,_vcd, stream_name: str):
671 | frame_exists = False
672 | if _vcd.has_stream(stream_name):
673 | stream = _vcd.get_stream(stream_name)
674 | frame_exists = keys_exist(stream, 'stream_properties',
675 | 'total_frames')
676 | else:
677 | warnings.warn('WARNING: stream ' + stream_name + ' is not present '
678 | 'in input OpenLABEL')
679 | return frame_exists
680 |
681 | # Function to check the existence of frame_shift field in OpenLABEL given a
682 | # stream name
683 | def shift_exist(self, _vcd,stream_name: str):
684 | shift_exists = False
685 | if _vcd.has_stream(stream_name):
686 | stream = _vcd.get_stream(stream_name)
687 | shift_exists = keys_exist(stream, 'stream_properties', 'sync',
688 | 'frame_shift')
689 | else:
690 | warnings.warn('WARNING: stream ' + stream_name + ' is not present '
691 | 'in input OpenLABEL')
692 | return shift_exists
693 |
694 | # Function to check the existence of camera_matrix field in OpenLABEL given a
695 | # stream name
696 | def cam_matrix_exist(self,_vcd, stream_name: str):
697 | matrix_exist = False
698 | if _vcd.has_stream(stream_name):
699 | stream = _vcd.get_stream(stream_name)
700 | matrix_exist = keys_exist(stream, 'stream_properties',
701 | 'intrinsics_pinhole',
702 | 'camera_matrix_3x4')
703 | else:
704 | warnings.warn('WARNING: stream ' + stream_name + ' is not present '
705 | 'in input OpenLABEL')
706 | return matrix_exist
707 |
708 | # Function to check the existence of static fields of driver in OpenLABEL
709 | def driver_statics_exist(self):
710 | # Check in OpenLABEL the existence of the following statics variables
711 | driver_uid = self._vcd.get_object_uid_by_name(str(self.subject))
712 | elem_type = core.ElementType.object
713 | driver_list = self._vcd.get_elements_of_type(elem_type, "driver")
714 |
715 | # check if uid exists in list of objects of type 'driver'
716 | if driver_uid in driver_list:
717 | age_data = self._vcd.get_object_data(driver_uid, 'age')
718 |
719 | statics_exist = True
720 | else:
721 | statics_exist = False
722 |
723 | return statics_exist
724 |
725 |
726 |
727 | # This function reads the number of frames of a stream from the OpenLABEL object
728 | # Returns:
729 | # stream_frames: number of frames of the requested stream
730 | def get_stream_frames_in_vcd(self, _vcd, stream_name: str):
731 | if self.stream_frames_exist(_vcd,stream_name):
732 | stream = _vcd.get_stream(stream_name)
733 | stream_frames = stream['stream_properties']['total_frames']
734 | return stream_frames
735 | else:
736 | raise RuntimeError("OpenLABEL: doesn't have frame number information for "
737 | "stream: " + stream_name)
738 |
739 | # This function reads the shift of a stream from the OpenLABEL object
740 | # Returns:
741 | # shift_in_vcd: shift of the given stream
742 | def get_shift_in_vcd(self, _vcd,stream_name: str):
743 | if self.shift_exist(_vcd,stream_name):
744 | stream = _vcd.get_stream(stream_name)
745 | shift_in_vcd = stream['stream_properties']['sync']['frame_shift']
746 | return shift_in_vcd
747 | else:
748 | raise RuntimeError("OpenLABEL: doesn't have shift information for "
749 | "stream: " + stream_name)
750 |
751 | # This function reads the camera matrix of a stream from the OpenLABEL object
752 | # Returns:
753 | # matrix_in_vcd: camera matrix of the given stream
754 | def get_cam_matrix_in_vcd(self, _vcd,stream_name: str):
755 | if self.cam_matrix_exist(_vcd,stream_name):
756 | stream = _vcd.get_stream(stream_name)
757 | matrix_in_vcd = stream['stream_properties']['intrinsics_pinhole'][
758 | 'camera_matrix_3x4']
759 | return matrix_in_vcd
760 | else:
761 |             raise RuntimeError("OpenLABEL: doesn't have camera matrix "
762 |                                "information for stream: " + stream_name)
763 |
764 | # This function returns all three stream frame numbers
765 | # If an OpenLABEL is loaded, this function will get the numbers from the OpenLABEL object
766 | # If no OpenLABEL is loaded, this function will return the internal values.
767 | # Returns:
768 | # face_frames: number of frames in face stream
769 | # body_frames: number of frames in body stream
770 | # hands_frames: number of frames in hands stream
771 | def get_frames(self):
772 | if self._vcd_loaded:
773 | face_frames = self.get_stream_frames_in_vcd(self._vcd,'face_camera')
774 | body_frames = self.get_stream_frames_in_vcd(self._vcd,'body_camera')
775 | hands_frames = self.get_stream_frames_in_vcd(self._vcd,'hands_camera')
776 | else:
777 | face_frames = self._f_frames
778 | body_frames = self._b_frames
779 | hands_frames = self._h_frames
780 |
781 | return face_frames, body_frames, hands_frames
782 |
783 | # With this function the shifts of all three streams can be retrieved.
784 | # If an OpenLABEL is loaded, this function will get the numbers from the OpenLABEL object
785 | # If no OpenLABEL is loaded, this function will return the internal values.
786 | # Returns:
787 | # body_face_sh: shift of body stream with respect to face stream
788 | # hands_face_sh: shift of hands stream with respect to face stream
789 | # hands_body_sh: shift of hands stream with respect to body stream
790 | def get_shifts(self):
791 | if self._vcd_loaded:
792 | body_face_sh = self.get_shift_in_vcd(self._vcd,"body_camera")
793 | hands_face_sh = self.get_shift_in_vcd(self._vcd,"hands_camera")
794 | hands_body_sh = hands_face_sh - body_face_sh
795 | else:
796 | body_face_sh = self._bf_shift
797 | hands_face_sh = self._hf_shift
798 | hands_body_sh = self._hb_shift
799 | return body_face_sh, hands_face_sh, hands_body_sh
800 |
801 | # With this function the camera matrices of all three streams can be
802 | # retrieved.
803 | # If an OpenLABEL is loaded, this function will get the matrices from the OpenLABEL object
804 | # If no OpenLABEL is loaded, this function will return the internal values.
805 | # Returns:
806 | # face_cam_matrix: camera matrix of face camera
807 | # body_cam_matrix: camera matrix of body camera
808 | # hands_cam_matrix: camera matrix of hands camera
809 | def get_intrinsics(self, _vcd):
810 | if self._vcd_loaded:
811 | face_cam_matrix = self.get_cam_matrix_in_vcd(_vcd,"face_camera")
812 | body_cam_matrix = self.get_cam_matrix_in_vcd(_vcd,"body_camera")
813 | hands_cam_matrix = self.get_cam_matrix_in_vcd(_vcd,"hands_camera")
814 | else:
815 | face_cam_matrix = self._f_intrinsics
816 | body_cam_matrix = self._b_intrinsics
817 | hands_cam_matrix = self._h_intrinsics
818 | return face_cam_matrix, body_cam_matrix, hands_cam_matrix
819 |
820 | # This function sets the number of frames of the body video in the OpenLABEL
821 | # @body_frames: number of frames
822 | def set_body_frames(self, body_frames):
823 | self._b_frames = int(body_frames)
824 |
825 | # This function sets the number of frames of the face video in the OpenLABEL
826 | # @face_frames: number of frames
827 | def set_face_frames(self, face_frames):
828 | self._f_frames = int(face_frames)
829 |
830 | # This function sets the number of frames of the hands video in the OpenLABEL
831 | # @hands_frames: number of frames
832 | def set_hands_frames(self, hands_frames):
833 | self._h_frames = int(hands_frames)
834 |
835 | # This function allows setting the stream shifts and storing them in the internal
836 | # variables to be used when saving the OpenLABEL file
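# Illustrative (hypothetical) example: set_shifts(body_face_shift=5, hands_face_shift=8)
# keeps _bf_shift=5 and _hf_shift=8 and derives the missing _hb_shift as 8 - 5 = 3.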
837 | def set_shifts(self, body_face_shift=None, hands_face_shift=None,
838 | hands_body_shift=None, ):
839 | if (body_face_shift is None and hands_face_shift is None) or \
840 | (body_face_shift is None and hands_body_shift is None) or \
841 | (hands_face_shift is None and hands_body_shift is None):
842 | raise RuntimeError('At least two shifts values must be passed')
843 |
844 | self._bf_shift = body_face_shift
845 | self._hf_shift = hands_face_shift
846 | self._hb_shift = hands_body_shift
847 | if body_face_shift is None:
848 | self._bf_shift = self._hf_shift - self._hb_shift
849 | if hands_face_shift is None:
850 | self._hf_shift = self._hb_shift + self._bf_shift
851 | if hands_body_shift is None:
852 | self._hb_shift = self._hf_shift - self._bf_shift
853 |
854 | # This function gets the annotation labels and includes shifts between
855 | # streams.
856 | # Returns:
857 | # @annotations: A matrix consisting of the annotation labels for each of
858 | # the levels in dict
859 | # @validations: A matrix consisting of the validation method while
860 | # annotating
861 | def get_annotation_vectors(self):
862 | # Perform general extraction of annotation and verification vectors
863 | annotations, validations = super().get_annotation_vectors()
864 |
865 | vcd = self._vcd
866 | frame_interval = vcd.get_frame_intervals().fis_dict
867 | total_frames = frame_interval[-1]['frame_end'] + 1
868 |
869 | # Get some handy variables
870 | body_face_shift = self._bf_shift
871 | hands_face_shift = self._hf_shift
872 |
873 | if body_face_shift is None or hands_face_shift is None:
874 | raise RuntimeError("Couldn't get OpenLABEL data")
875 |
876 | face_shift = 0
877 | body_shift = body_face_shift
878 | hands_shift = hands_face_shift
879 | # START
880 | face_dependant = self._camera_dependencies["face"]
881 | body_dependant = self._camera_dependencies["body"]
882 | hands_dependant = self._camera_dependencies["hands"]
883 | # Fill with NAN codes those levels where the corresponding reference
884 | # video has no frames
885 | if body_face_shift > 0 and hands_face_shift > 0:
886 | # Face starts first , then body or hands
887 | for level in body_dependant:
888 | annotations[0:body_face_shift, level] = 100
889 | for level in hands_dependant:
890 | annotations[0:hands_face_shift, level] = 100
891 | elif body_face_shift < hands_face_shift:
892 | # Body starts first
893 | face_shift = abs(body_face_shift)
894 | body_shift = 0
895 | hands_shift = face_shift + hands_face_shift
896 | for level in face_dependant:
897 | annotations[0:face_shift, level] = 100
898 | for level in hands_dependant:
899 | annotations[0:hands_shift, level] = 100
900 | elif hands_face_shift < body_face_shift:
901 | # Hands starts first
902 | face_shift = abs(hands_face_shift)
903 | body_shift = face_shift + body_face_shift
904 | hands_shift = 0
905 | for level in face_dependant:
906 | annotations[0:face_shift, level] = 100
907 | for level in body_dependant:
908 | annotations[0:body_shift, level] = 100
909 | elif hands_face_shift == body_face_shift:
910 | # Hands and Body start at the same time
911 | face_shift = abs(hands_face_shift)
912 | body_shift = 0
913 | hands_shift = 0
914 | for level in face_dependant:
915 | annotations[0:face_shift, level] = 100
916 | # END
917 |
918 | # Fill end of vectors with NAN values - body related
919 | if self._b_frames is not None and self._b_frames > 0:
920 | body_end = self._b_frames + body_shift
921 | if body_end < total_frames:
922 | for level in body_dependant:
923 | annotations[body_end:total_frames, level] = 100
924 | else:
925 | warnings.warn('WARNING: Body frame number hasn\'t been set in OpenLABEL')
926 | # Fill end of vectors with NAN values - hands related
927 | if self._h_frames is not None and self._h_frames > 0:
928 | hands_end = self._h_frames + hands_shift
929 | if hands_end < total_frames:
930 | for level in hands_dependant:
931 | annotations[hands_end:total_frames, level] = 100
932 | else:
933 | warnings.warn(
934 | 'WARNING: Hands frame number hasn\'t been set in OpenLABEL')
935 | # Fill end of vectors with NAN values - face related
936 | if self._f_frames is not None and self._f_frames > 0:
937 | face_end = self._f_frames + face_shift
938 | if face_end < total_frames:
939 | for level in face_dependant:
940 | annotations[face_end:total_frames, level] = 100
941 | else:
942 | warnings.warn('WARNING: Face frame number hasn\'t been set in OpenLABEL')
943 |
944 | return annotations, validations
945 |
946 | # This function checks if the OpenLABEL has the metadata fields and that the values
947 | # are valid
948 | def verify_metadata(self, ctx_id):
949 | valid_metadata = True
950 | # @metadata: [face_meta, body_meta,
951 | # hands_meta]
952 | # @face_meta: [rgb_video_frames,mat]
953 | # @body_meta: [date_time,rgb_video_frames,mat]
954 | # @hands_meta: [rgb_video_frames,mat]
955 | # Number of frames
956 | face = self.get_stream_frames_in_vcd(self._vcd,'face_camera')
957 | body = self.get_stream_frames_in_vcd(self._vcd,'body_camera')
958 | hands = self.get_stream_frames_in_vcd(self._vcd,'hands_camera')
959 |
960 | if face == 0 or body == 0 or hands == 0:
961 | valid_metadata = False
962 |
963 | if self._vcd.get_context(ctx_id) == None:
964 | valid_metadata = False
965 |
966 | return valid_metadata
967 |
968 | # --- TEMP FEATURE ---#
969 | # This function checks if the OpenLABEL has the fields of static annotations
970 | # and the numbers of frames registered are not 0. If true, static
971 | # annotations exist
972 | def verify_statics(self, staticDict, obj_id):
973 | exist = True
974 | vcd_object = self._vcd.get_object(obj_id)
975 | for att in staticDict:
976 | att_exist = keys_exist(vcd_object, 'object_data',
977 | str(staticDict[att]["type"]))
978 | if not att_exist:
979 | exist = False
980 | break
981 | return exist
982 |
983 | # This function gets different values from the OpenLABEL to keep consistency when
984 | # the user saves/creates a new OpenLABEL
985 | # @staticDict: dict of static annotations to get its values from OpenLABEL
986 | # @ctx_id: id of the context (in this case 0)
987 | def getStaticVector(self, staticDict, ctx_id):
988 | return self.get_static_in_vcd(self._vcd,staticDict,ctx_id)
989 |
990 | def get_static_in_vcd(self,_vcd,staticDict,ctx_id):
991 | for x in range(5):
992 | att = staticDict[x]
993 | # Get each of the static annotations of the directory from the OpenLABEL
994 | object_vcd = dict(_vcd.get_object_data(0, att["name"]))
995 | att.update({"val": object_vcd["val"]})
996 | # context
997 | context = dict(_vcd.get_context(ctx_id))["context_data"]["text"]
998 | staticDict[5].update({"val": context[0]["val"]})
999 | staticDict[6].update({"val": context[1]["val"]})
1000 | # record_time = context[2]["val"]
1001 | # Annotator id
1002 | meta_data = dict(_vcd.get_metadata())
1003 | annotator = meta_data["annotator"]
1004 | staticDict[7].update({"val": annotator})
1005 | # returns:
1006 | # @staticDict: the dict with the values taken from the OpenLABEL
1007 | return staticDict
1008 |
1009 | # This function gets different values from the OpenLABEL to keep consistency when
1010 | # the user saves/creates a new OpenLABEL
1011 | # @ctx_id: id of the object (in this case 0)
1012 | def getMetadataVector(self, ctx_id):
1013 | return self.get_metadata_in_vcd(self._vcd,ctx_id)
1014 |
1015 | def get_metadata_in_vcd(self, _vcd, ctx_id):
1016 | # context
1017 | record_time = 0
1018 | if not _vcd.get_context_data(ctx_id, "recordTime") ==None:
1019 | record_time =_vcd.get_context_data(ctx_id, "recordTime")["val"]
1020 | # frames
1021 | face = self.get_stream_frames_in_vcd(_vcd,'face_camera')
1022 | body = self.get_stream_frames_in_vcd(_vcd,'body_camera')
1023 | hands = self.get_stream_frames_in_vcd(_vcd,'hands_camera')
1024 | #intrinsics matrix
1025 | face_mat, body_mat, hands_mat = self.get_intrinsics(_vcd)
1026 | # returns:
1027 | # @metadata: [face_meta, body_meta,
1028 | # hands_meta]
1029 | # @face_meta: [rgb_video_frames,mat]
1030 | # @body_meta: [date_time,rgb_video_frames,mat]
1031 | # @hands_meta: [rgb_video_frames,mat]
1032 | return [face, face_mat], [record_time, body, body_mat], \
1033 | [hands, hands_mat]
1034 |
1035 | # Function to extract shifts, metadata and maybe statics info to create a
1036 | # new OpenLABEL
1037 | def get_info_from_VCD(self, vcd_file_copy, staticDict, ctx_id):
1038 | #load OpenLABEL
1039 |
1040 | copy_vcd = core.VCD()
1041 | copy_vcd.load_from_file(file_name=vcd_file_copy)
1042 |
1043 | self._vcd_loaded = True
1044 | #get shifts
1045 | body_face_sh = self.get_shift_in_vcd(copy_vcd,"body_camera")
1046 | hands_face_sh = self.get_shift_in_vcd(copy_vcd,"hands_camera")
1047 | hands_body_sh = hands_face_sh - body_face_sh
1048 | #get statics
1049 | static = self.get_static_in_vcd(copy_vcd,staticDict,ctx_id)
1050 | #get metadata
1051 | metadata= self.get_metadata_in_vcd(copy_vcd,ctx_id)
1052 | #Turn to false to save
1053 | self._vcd_loaded = False
1054 | return body_face_sh, hands_body_sh, static, metadata
1055 |
--------------------------------------------------------------------------------
/docs/imgs/annotation_tool_info.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/imgs/annotation_tool_info.png
--------------------------------------------------------------------------------
/docs/imgs/block_annotation.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/imgs/block_annotation.png
--------------------------------------------------------------------------------
/docs/imgs/colorized_depth.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/imgs/colorized_depth.png
--------------------------------------------------------------------------------
/docs/imgs/level_panel.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/imgs/level_panel.png
--------------------------------------------------------------------------------
/docs/imgs/mobaxterm_config.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/imgs/mobaxterm_config.png
--------------------------------------------------------------------------------
/docs/issue_bug_template.md:
--------------------------------------------------------------------------------
1 | # Opening an Issue
2 | If you find any problem with the temporal annotation tool, you are free to open a new issue. Please check the [README file](README.md) and other [issues](https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/issues) first to see if your issue is already solved.
3 |
4 | ## Bug Issue Template
5 |
6 | ### Expected Behavior
7 | What should have happened.
8 |
9 | ### Actual Behavior
10 | A clear and concise description of the behavior.
11 |
12 | ### Steps to Reproduce the Problem
13 |
14 | 1.
15 | 1.
16 | 1.
17 |
18 | ### Environment
19 |
20 | - Tool Version:
21 | - Dependencies Version(OpenCV, Numpy, VCD):
22 | - OS:
23 | - Using WSL?:
24 |
25 | ### Additional information / Any Screenshot?
26 |
27 |
28 |
29 |
30 |
--------------------------------------------------------------------------------
/docs/issue_feature_template.md:
--------------------------------------------------------------------------------
1 | # Opening an Issue
2 | If you find any problem with the temporal annotation tool, you are free to open a new issue. Please check the [README file](README.md) and other issues first to see if your issue is already solved.
3 |
4 | ## Feature Request
5 |
6 | ### Is your feature request related to a problem? Please describe.
7 | A clear and concise description of what the problem is. Ex. I have an issue when [...]
8 |
9 | ## Describe the solution you'd like
10 | A clear and concise description of what you want to happen. Add any considered drawbacks.
11 |
12 | ## Describe alternatives you've considered
13 | A clear and concise description of any alternative solutions or features you've considered.
14 |
15 | ## Teachability, Documentation, Adoption, Migration Strategy
16 | If you can, explain how users will be able to use this and possibly write out a version of the docs.
17 | Maybe a screenshot or design?
18 |
19 |
20 |
21 |
22 |
--------------------------------------------------------------------------------
/docs/readme-assets/cameras.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/readme-assets/cameras.png
--------------------------------------------------------------------------------
/docs/readme-assets/dmdStructure.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/readme-assets/dmdStructure.png
--------------------------------------------------------------------------------
/docs/readme-assets/environments.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/readme-assets/environments.png
--------------------------------------------------------------------------------
/docs/readme-assets/gazeRegions.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/readme-assets/gazeRegions.png
--------------------------------------------------------------------------------
/docs/readme-assets/mosaic.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/readme-assets/mosaic.png
--------------------------------------------------------------------------------
/docs/readme-assets/participants.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/Vicomtech/DMD-Driver-Monitoring-Dataset/3da110ef48b1c2f9460d11e392a858314a1be119/docs/readme-assets/participants.png
--------------------------------------------------------------------------------
/docs/setup_linux.md:
--------------------------------------------------------------------------------
1 | # Setting up the Temporal Annotation Tool (TaTo) - Linux
2 |
3 | ## Dependencies
4 | The TaTo tool has been tested using the following system configuration:
5 |
6 | **OS:** Ubuntu 18.04, Windows 10
7 | **Dependencies:** Python 3.8, OpenCV-Python 4.2.0, VCD 4.3.0. Additionally, FFMPEG and ffmpeg-python are required for DExTool.
8 |
9 | ## Environment for Ubuntu
10 | - Please make sure you have **Python 3** installed in your system
11 | - Verify pip is installed; if not, install it:
12 | ```bash
13 | sudo apt-get install python3 python3-pip
14 | ```
15 | ```bash
16 | pip3 install --upgrade pip
17 | ```
18 | - (Optional) It is recommended to create a virtual environment in Python 3 (more info [here](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)):
19 | - Configure a new virtual environment:
20 | ```bash
21 | mkdir anntool_py
22 | ```
23 | ```bash
24 | python3 -m venv anntool_py
25 | ```
26 | - Activate the virtual environment:
27 | ```bash
28 | source anntool_py/bin/activate
29 | ```
30 | - Install the dependencies
31 | ``` bash
32 | pip3 install opencv-python numpy vcd
33 | ```
34 | - For DExTool, install the dependencies:
35 | ``` bash
36 | pip3 install --upgrade setuptools
37 | sudo apt update
38 | sudo apt install ffmpeg
39 | pip3 install ffmpeg-python
40 | ```
41 | - Go to [directory](../annotation-tool) that contains the tool scripts.
42 |
43 | ## Launching TaTo
44 | In a terminal window within the folder [annotation_tool](../annotation-tool) run:
45 |
46 | ```bash
47 | python TaTo.py
48 | ```
49 |
50 | The tool will ask you to input the **path** of the video you want to annotate. Please insert the path following the [DMD file structure](../docs/dmd_file_struct.md).
51 |
52 | The annotation tool TaTo opens with three windows.
53 |
54 | ## Launching DEx
55 | In a terminal window, within the folder [exploreMaterial-tool](../exploreMaterial-tool) run
56 |
57 | ```bash
58 | python DExTool.py
59 | ```
60 | The tool will ask you to input the **task** you wish to perform.
61 |
--------------------------------------------------------------------------------
/docs/setup_windows.md:
--------------------------------------------------------------------------------
1 | # Setting up the Temporal Annotation Tool (TaTo) - Windows
2 |
3 | ## Dependencies
4 | The TaTo tool has been tested using the following system configuration:
5 |
6 | **OS:** Windows 10
7 | **Dependencies:** Python 3.8, OpenCV-Python 4.2.0, VCD 4.3.0. Additionally, FFMPEG and ffmpeg-python are required for DExTool.
8 |
9 | ## Environment in Windows
10 |
11 | - Verify pip is installed; if not, install it.
12 |
13 | - (Optional) It is recommended to create a virtual environment in Python 3 (more info [here](https://packaging.python.org/guides/installing-using-pip-and-virtual-environments/)):
14 |
15 | - Install the dependencies, for example as shown below:
16 |   - opencv-python, numpy, vcd
17 |   - FFMPEG and ffmpeg-python for DExTool.
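
  For example (a sketch only, assuming Python 3 and pip are already on your PATH; match the versions listed above if you run into compatibility issues):

  ```bash
  pip install --upgrade pip
  pip install opencv-python numpy vcd
  pip install ffmpeg-python
  ```

  FFMPEG itself is a separate program and has to be installed on the system (and be reachable from the command line) for DExTool.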
18 |
19 | - Go to [directory](../annotation-tool) that contains the tool scripts.
20 |
21 | ## Launching TaTo
22 | In a terminal window, within the folder [annotation_tool](../annotation-tool) run
23 |
24 | ```bash
25 | python TaTo.py
26 | ```
27 |
28 | The tool will ask you to input the **path** of the mosaic video you want to annotate. Please insert the path following the [DMD file structure](../docs/dmd_file_struct.md).
29 |
30 | The annotation tool TaTo opens with three windows.
31 |
32 | ## Launching DEx
33 | In a terminal window, within the folder [exploreMaterial-tool](../exploreMaterial-tool) run
34 |
36 | ```bash
36 | python DExTool.py
37 | ```
38 |
39 | The tool will ask you to input the **task** you wish to perform.
40 |
--------------------------------------------------------------------------------
/exploreMaterial-tool/DExTool.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | # Created by Paola Cañas with <3
3 | import glob
4 | import os
5 | import re
6 | from pathlib import Path
7 | from accessDMDAnn import exportClass
8 | from group_split_material import splitClass, groupClass
9 | from statistics import get_statistics
10 |
11 | print("Welcome :)")
12 | opt = int(input("What do you wish to do?: export material for training:[0] group exported material by classes:[1] create train and test split:[2] get statistics:[3] : "))
13 |
14 | if opt == 0:
15 | # export material for training
16 | print("To change export settings go to config_DEx.json and change control variables.")
17 | destination_path = input("Enter destination path: ")
18 | selec = input("How do you want to read annotations, by: Group:[g] Sessions:[f] One OpenLABEL:[v] : ")
19 |
20 | if selec == "g":
21 | #By group
22 | folder_path = input("Enter DMD group's path (../dmd/g#): ")
23 | #e.g /home/pncanas/Desktop/consumption/dmd/gA
24 | selec_session = input("Enter the session you wish to export in this group: all:[0] S1:[1] S2:[2] S3[3] S4[4] S5[5] S6[6] : ")
25 |
26 | subject_paths = glob.glob(folder_path + '/*')
27 | subject_paths.sort()
28 |
29 | for subject in subject_paths:
30 | print(subject)
31 | session_path = glob.glob(subject + '/*')
32 | session_path.sort()
33 |
34 | for session in session_path:
35 | if "s"+str(selec_session) in session or selec_session == "0":
36 | print(session)
37 | annotation_paths = glob.glob(session + '/*.json')
38 | annotation_paths.sort()
39 |
40 | for annotation in annotation_paths:
41 | print(annotation)
42 | dmd_folder=Path(annotation).parents[3]
43 |
44 | exportClass(annotation,str(dmd_folder),destination_path)
45 |
46 | print("Oki :) ----------------------------------------")
47 |
48 | elif selec == "f":
49 | #By session
50 | folder_path = input("Enter root dmd folder path(../dmd): ")
51 | #e.g /home/pncanas/Desktop/dmd/
52 | selec_session = input("Enter the session you wish to export in this group: all:[0] S1:[1] S2:[2] S3[3] S4[4] S5[5] S6[6] : ")
53 |
54 | group_paths = glob.glob(folder_path + '/*')
55 | group_paths.sort()
56 |
57 | for group in group_paths:
58 | print(group)
59 | subject_paths = glob.glob(group + '/*')
60 | subject_paths.sort()
61 |
62 | for subject in subject_paths:
63 | print(subject)
64 | session_path = glob.glob(subject + '/*')
65 | session_path.sort()
66 |
67 | for session in session_path:
68 | if "s"+str(selec_session) in session or selec_session == "0":
69 | print(session)
70 | annotation_paths = glob.glob(session + '/*.json')
71 | annotation_paths.sort()
72 |
73 | for annotation in annotation_paths:
74 | print(annotation)
75 | dmd_folder=Path(annotation).parents[3]
76 |
77 | exportClass(annotation,str(dmd_folder),destination_path)
78 |
79 | print("Oki :) ----------------------------------------")
80 |
81 | elif selec == "v":
82 |
83 | vcd_path = input("Paste the OpenLABEL file path (..._ann.json): ")
84 | # e.g: /Desktop/consumption/dmd/gA/1/s2/gA_1_s2_2019-03-08T09;21;03+01;00_rgb_ann.json
85 | regex_internal = '(?P[1-9]|[1-2][0-9]|[3][0-7])_(?P[a-z]{1,})_'\
86 | '(?P(?P0[1-9]|1[012])-(?P0[1-9]|[12][0-9]|3[01]))'
87 | regex_external = '(?Pg[A-z]{1,})_(?P[1-9]|[1-2][0-9]|[3][0-7])_'\
88 | '(?Ps[1-9]{1,})_(?P(?P(?P\d{4})-(?P0[1-9]|1[012])-'\
89 | '(?P0[1-9]|[12][0-9]|3[01]))T(?P(?P\d{1,2});(?P\d{1,2});'\
90 | '(?P\d{1,2}))\+\d{1,2};\d{1,2})_(?Prgb|depth|ir)_(?Pann)'
91 | regex_internal = re.compile(regex_internal)
92 | regex_external = re.compile(regex_external)
93 | match_internal = regex_internal.search(str(vcd_path))
94 | match_external = regex_external.search(str(vcd_path))
95 |
96 | if match_internal or match_external:
97 | #dmd annotation
98 | dmd_folder=Path(vcd_path).parents[3]
99 | datasetDMD = True
100 | else:
101 | #not a dmd annotation
102 | dmd_folder=Path(vcd_path).parents[1]
103 | datasetDMD = False
104 |
105 | exportClass(vcd_path,str(dmd_folder),destination_path, datasetDMD)
106 |
107 | print("Oki :) ----------------------------------------")
108 |
109 | else:
110 | print("__Please, select a valid option__")
111 |
112 |
113 | elif opt == 1:
114 | # group exported material by classes
115 | material_path = input("Enter exported DMD material path (inside must be sessions folders(s#) e.g:../dmd_rgb/): ")
116 | groupClass(material_path)
117 |
118 | print("Oki :) ----------------------------------------")
119 |
120 | elif opt == 2:
121 | # create train and test split
122 | print("This function only works with dmd material structure when exporting with DEx tool.")
123 | material_path = input("Enter exported material path (inside must be classes folders e.g.: /safe_driving/*.jpg): ")
124 | destination_path = input("Enter destination path (a new folder to store train and test splits): ")
125 | test_proportion = input("Enter test proportion for split [0-1] (e.g. 0.20): ")
126 |
127 | splitClass(material_path,destination_path,test_proportion)
128 |
129 | print("Oki :) ----------------------------------------")
130 |
131 | elif opt == 3:
132 | # Get statistics
133 | print("This function only works with dmd material structure when exporting with DEx tool.")
134 | destination_path = input("Enter filename for a report file (e.g. report.txt): ")
135 | selec = input("How do you want to read annotations, by: Group:[g] Sessions:[f] One OpenLABEL:[v] : ")
136 |
137 | #Delete destination_path to avoid redundancies
138 | if os.path.exists(destination_path.replace(".txt","-actions.txt")):
139 | os.remove(destination_path.replace(".txt","-actions.txt"))
140 | if os.path.exists(destination_path.replace(".txt","-frames.txt")):
141 | os.remove(destination_path.replace(".txt","-frames.txt"))
142 |
143 | if selec == "g":
144 | #By group
145 | folder_path = input("Enter DMD group's path (../dmd/g#): ")
146 | #e.g /home/pncanas/Desktop/consumption/dmd/gA
147 | selec_session = input("Enter the session you wish to export in this group: all:[0] S1:[1] S2:[2] S3[3] S4[4] S5[5] S6[6] : ")
148 |
149 | subject_paths = glob.glob(folder_path + '/*')
150 | subject_paths.sort()
151 |
152 | for subject in subject_paths:
153 | print(subject)
154 | session_path = glob.glob(subject + '/*')
155 | session_path.sort()
156 |
157 | for session in session_path:
158 | if "s"+str(selec_session) in session or selec_session == "0":
159 | print(session)
160 | annotation_paths = glob.glob(session + '/*.json')
161 | annotation_paths.sort()
162 | for annotation in annotation_paths:
163 | print(annotation)
164 |
165 | get_statistics(annotation,destination_path)
166 |
167 | print("Oki :) ----------------------------------------")
168 |
169 | elif selec == "f":
170 | #By session
171 | folder_path = input("Enter root dmd folder path(../dmd): ")
172 | #e.g /home/pncanas/Desktop/dmd/
173 | selec_session = input("Enter the session you wish to export in this group: all:[0] S1:[1] S2:[2] S3[3] S4[4] S5[5] S6[6] : ")
174 |
175 | group_paths = glob.glob(folder_path + '/*')
176 | group_paths.sort()
177 |
178 | for group in group_paths:
179 | print(group)
180 | subject_paths = glob.glob(group + '/*')
181 | subject_paths.sort()
182 |
183 | for subject in subject_paths:
184 | print(subject)
185 | session_path = glob.glob(subject + '/*')
186 | session_path.sort()
187 |
188 | for session in session_path:
189 | if "s"+str(selec_session) in session or selec_session == "0":
190 | print(session)
191 | annotation_paths = glob.glob(session + '/*.json')
192 | annotation_paths.sort()
193 |
194 | for annotation in annotation_paths:
195 | print(annotation)
196 | dmd_folder=Path(annotation).parents[3]
197 |
198 | get_statistics(annotation,destination_path)
199 |
200 | print("Oki :) ----------------------------------------")
201 |
202 | elif selec == "v":
203 |
204 | vcd_path = input("Paste the OpenLABEL file path (..._ann.json): ")
205 | # e.g: /Desktop/consumption/dmd/gA/1/s2/gA_1_s2_2019-03-08T09;21;03+01;00_rgb_ann.json
206 | regex_internal = '(?P[1-9]|[1-2][0-9]|[3][0-7])_(?P[a-z]{1,})_'\
207 | '(?P(?P0[1-9]|1[012])-(?P0[1-9]|[12][0-9]|3[01]))'
208 | regex_external = '(?Pg[A-z]{1,})_(?P[1-9]|[1-2][0-9]|[3][0-7])_'\
209 | '(?Ps[1-9]{1,})_(?P(?P(?P\d{4})-(?P0[1-9]|1[012])-'\
210 | '(?P0[1-9]|[12][0-9]|3[01]))T(?P(?P\d{1,2});(?P\d{1,2});'\
211 | '(?P\d{1,2}))\+\d{1,2};\d{1,2})_(?Prgb|depth|ir)_(?Pann)'
212 | regex_internal = re.compile(regex_internal)
213 | regex_external = re.compile(regex_external)
214 | match_internal = regex_internal.search(str(vcd_path))
215 | match_external = regex_external.search(str(vcd_path))
216 |
217 | if match_internal or match_external:
218 | #dmd annotation
219 | dmd_folder=Path(vcd_path).parents[3]
220 | else:
221 | #not a dmd annotation
222 | dmd_folder=Path(vcd_path).parents[1]
223 |
224 | get_statistics(vcd_path,destination_path)
225 |
226 | print("Oki :) ----------------------------------------")
227 |
228 | else:
229 | print("__Please, select a valid option__")
230 | else:
231 | print("__Please, put a valid option.__")
232 |
--------------------------------------------------------------------------------
/exploreMaterial-tool/README.md:
--------------------------------------------------------------------------------
1 | # Dataset Explorer Tool (DEx)
2 | The DMD annotations come in [OpenLABEL](https://www.asam.net/standards/detail/openlabel/) format, produced with [VCD](https://vcd.vicomtech.org/), and are compatible with the ASAM OpenLABEL annotation standard.
3 | This language is defined with JSON schemas and supports different types of annotations, being ideal for describing any kind of scenes.
4 | The DMD has spatial and temporal annotations (e.g. bounding boxes and time intervals), as well as context and static annotations (e.g. driver and environmental info); with OpenLABEL, it is possible to have all these annotations in one file. VCD is also an API: you can use the library to create and update OpenLABEL files.
5 |
6 | We have developed the DEx tool to help access those annotations easily. The main functionality of DEx at the moment is to access the OpenLABEL files, read annotations and prepare DMD material for training.
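
For example, an annotation file can be opened directly with the VCD library (a minimal sketch: the file name is hypothetical and the `from vcd import core` import is assumed):

```python
from vcd import core

# Load an OpenLABEL annotation file (hypothetical file name)
openlabel = core.VCD()
openlabel.load_from_file(file_name="gA_1_s1_rgb_ann_distraction.json")

# Total number of annotated frames in the file
frame_intervals = openlabel.get_frame_intervals().fis_dict
total_frames = frame_intervals[-1]["frame_end"] + 1
print(total_frames)
```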
7 |
8 | ## Content
9 | - [Dataset Explorer Tool (DEx)](#dataset-explorer-tool-dex)
10 | - [Content](#content)
11 | - [DEx characteristics](#dex-characteristics)
12 | - [Usage Instructions](#usage-instructions)
13 | - [DEx initialization](#dex-initialization)
14 | - [DEx export configuration](#dex-export-configuration)
15 | - [Changelog](#changelog)
16 |
17 | ## Setup and Launching
18 | DEx tool has been tested using the following system configuration:
19 |
20 | **OS:** Ubuntu 18.04, Windows 10
21 | **Dependencies:** Python 3.8, OpenCV-Python 4.2.0, VCD 6.0, FFMPEG and [ffmpeg-python](https://github.com/kkroening/ffmpeg-python)
22 |
23 | For a detailed description on how to configure the environment and launch the tool, check: [Linux](../docs/setup_linux.md) / [Windows](../docs/setup_windows.md)
24 |
25 | ## DEx characteristics
26 | DEx is a Python-based tool to access OpenLABEL annotations more easily. You can prepare the DMD material for training by using DEx. The main functionalities of DEx are: exporting material as images or videos by frame intervals from the annotations, grouping the resulting material into folders organized by classes (only available for DMD), and, once the material is organized by classes, generating a training and a testing split.
27 |
28 | - Get a **list of frame intervals** of a specific activity (or label) from OpenLABEL.
29 | - Take a list of frame intervals and **divide** them into **subintervals** of the desired size. This can be done starting from the first frame or from the last frame and going backwards.
30 | - **Export** those frame intervals as **video clips** or **images**. The material can be exported from the 3 camera perspective videos (only available for DMD). You can also export images or videos in any size, like 224x224.
31 | - **Export** intervals from **IR**, **RGB** or **DEPTH** material. Each material type will be in a different folder: dmd-ir, dmd-rgb, dmd-depth.
32 | - You can choose what material to export: a group's material, a session material or just the material from a specific OpenLABEL annotation.
33 | - If you are working with the DMD, the exported material will be organized in a similar way as the DMD structure: by groups, sessions and subjects. With DEx, you can **group** this material by **classes**. This is only possible with DMD material.
34 | - After you have the data organized by classes, you can **split** the material into a **training** and a **testing** split. You must provide the testing **ratio or proportion** (e.g: 0.20, 0.25). If the testing ratio is 0.20, the result is a folder named “train” with 80% of the data and a folder named “test” with the 20% of the data.
35 | - Get **statistics** of data. This means, get the number of frames per class and the total number of frames from data of a group, session or a single OpenLABEL.
36 |
37 | ## Usage Instructions
38 | ### DEx initialization
39 | You can initialize the tool by executing the Python script [DExTool.py](./DExTool.py). This script will guide you through preparing the DMD material.
40 |
41 | If you need something more specific, you can directly use the functions from [accessDMDAnn.py](./accessDMDAnn.py), [vcd4reader.py](./vcd4reader.py) and [group_split_material.py](./group_split_material.py).
42 |
43 | ### DEx export configuration
44 | There are some export settings you can change in the `__init__()` function of the file [accessDMDAnn.py](./accessDMDAnn.py), under the “-- CONTROL VARIABLES --” comment; a sketch of possible values is shown after this list.
45 | - To define the **data format** you wish to export, add “image” and/or “video” to the **@material** variable as a list.
46 | - The list of **camera perspectives** to export material from can be defined in the **@streams** variable; these are the "face", "body" or "hands" cameras. If it is a video from another dataset, it must be "general".
47 | - To choose the channel of information, **RGB**, **IR** or **DEPTH**, you must specify it with the **@channels** variable. You can define a list of channels: ["ir","rgb","depth"]. For videos from other datasets, it must be only ["rgb"].
48 | - You can choose the final image/video **size**. Set it as "original" or a tuple with a smaller size than the original (width, height), e.g. (224,224).
49 | - You can make a list of the **classes** you want to get the frame intervals of (e.g. [“safe_drive”,"drinking"]) and assign it to the **@annotations** variable. Objects (cellphone, hair comb and bottle) have to be prefixed with the 'object_in_scene/__' label. The variable @self.actionList will get all the classes available in the OpenLABEL.
50 | - If you want to export and create/write material in a **destination folder**, you must set the **@write** variable to True.
51 | - If you wish to **cut** the frame intervals into subintervals, the **size** of the final subintervals can be set in the **@intervalChunk** variable.
52 | - Sometimes not all frame intervals can be cut because they are smaller than @intervalChunk. To **ignore** and not export these **smaller frame intervals**, set **@ignoreSmall** to True.
53 | - To decide where to start cutting the frame intervals, change the **@asc** variable. True to start from the **first frame** and False to start from the **last frame** and go backwards.
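
As an illustration, the control variables could be set along these lines (a sketch only: the exact attribute names and defaults live in your copy of [accessDMDAnn.py](./accessDMDAnn.py), and the values below are hypothetical):

```python
# -- CONTROL VARIABLES -- (illustrative values, adapt to your use case)
self.material = ["image"]                      # export frames as images ("image" and/or "video")
self.streams = ["face", "body", "hands"]       # camera perspectives ("general" for non-DMD videos)
self.channels = ["rgb"]                        # information channels: "rgb", "ir" and/or "depth"
self.size = (224, 224)                         # "original" or a smaller (width, height) tuple
self.annotations = ["safe_drive", "drinking"]  # classes whose frame intervals will be exported
self.write = True                              # write the exported material to the destination folder
self.intervalChunk = 60                        # subinterval size in frames, if cutting is desired
self.ignoreSmall = True                        # skip intervals smaller than intervalChunk
self.asc = True                                # cut starting from the first frame (False: from the last)
```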
54 |
55 | You can read more details about depth data and how to export it on the [DMD-Depth-Material](https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset/wiki/DMD-Depth-Material) page of the wiki.
56 |
57 | ## Changelog
58 | For a complete list of changes check the [CHANGELOG.md](../CHANGELOG.md) file
59 |
60 | :warning: If you find any bug with the tool or have ideas of new features please open a new issue using the [bug report template](../docs/issue_bug_template.md) or the [feature request template](../docs/issue_feature_template.md) :warning:
61 |
--------------------------------------------------------------------------------
/exploreMaterial-tool/Tutorial_DEx_(dataset_explorer_tool).ipynb:
--------------------------------------------------------------------------------
1 | {
2 | "cells": [
3 | {
4 | "cell_type": "markdown",
5 | "metadata": {},
6 | "source": [
7 | "# DEx Basics Notebook\n",
8 | "Welcome to this notebook! Here, we will explore the functionalities of the tool we are building. Although some parts of this notebook may not be executable directly due to data availability (as the data may reside locally), this notebook aims to provide a comprehensive guide on how to configure and use the different options available in our tool.\n",
9 | "\n",
10 | "This notebook will guide you through the process of setting up the `config.json` for the **DEx tool**, allowing you to export datasets according to your specific requirements. We will demonstrate how to manipulate various settings to obtain data structures and formats that best suit your needs.\n",
11 | "\n",
12 | "Please note that while some sections may require adjustments based on your local setup, the overall process and configurations remain the same. We hope this notebook serves as a valuable resource in understanding and utilizing the DEx tool effectively."
13 | ]
14 | },
15 | {
16 | "cell_type": "markdown",
17 | "metadata": {
18 | "id": "INFr4cb2UeCb"
19 | },
20 | "source": [
21 | "## Prepare enviroment\n",
22 | "\n",
23 | "First, we need the project code. It has to be downloaded from the GitHub repository:"
24 | ]
25 | },
26 | {
27 | "cell_type": "code",
28 | "execution_count": null,
29 | "metadata": {
30 | "colab": {
31 | "base_uri": "https://localhost:8080/"
32 | },
33 | "id": "JQ-M1pRLRla1",
34 | "outputId": "6a6800ea-04de-4d40-d879-f57ef659a7f5"
35 | },
36 | "outputs": [],
37 | "source": [
38 | "!git clone https://github.com/Vicomtech/DMD-Driver-Monitoring-Dataset.git"
39 | ]
40 | },
41 | {
42 | "cell_type": "markdown",
43 | "metadata": {},
44 | "source": [
45 | "Before we start, it's important to ensure that we have all the necessary libraries installed. Here's a list of the libraries that we'll be using:"
46 | ]
47 | },
48 | {
49 | "cell_type": "code",
50 | "execution_count": null,
51 | "metadata": {},
52 | "outputs": [],
53 | "source": [
54 | "!pip install --upgrade opencv-python==4.2.0\n",
55 | "!pip install --upgrade vcd==6.0\n",
56 | "!pip install ffmpeg\n",
57 | "!pip install ffmpeg-python"
58 | ]
59 | },
60 | {
61 | "cell_type": "markdown",
62 | "metadata": {},
63 | "source": [
64 | "**Note**: It’s a good practice to use a virtual environment for your projects to avoid conflicts between package versions.\n"
65 | ]
66 | },
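  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For example, on Linux/macOS you could run the following commands in a terminal before installing the packages (illustrative only; adapt the environment name and activation command to your OS and shell):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Illustrative commands, meant for a terminal rather than this notebook:\n",
    "# python -m venv .venv\n",
    "# source .venv/bin/activate     # on Windows: .venv\\\\Scripts\\\\activate\n",
    "# pip install --upgrade pip"
   ]
  },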
67 | {
68 | "cell_type": "markdown",
69 | "metadata": {},
70 | "source": [
71 | "## Structure of the DMD\n",
72 | "The DMD is structured into various groups of persons, namely gA, gB, gC, gE, gF, and gZ. Each group typically consists of five people, each represented by a numbered subfolder, with the exception of group gZ, which contains seven individuals. The dataset is further divided into different sessions based on the type of data:\n",
73 | "\n",
74 | "- The distraction dataset consists of four sessions: s1, s2, s3, and s4.\n",
75 | "- The drowsiness dataset consists of one session: s5.\n",
76 | "- The gaze dataset consists of one session: s6.\n",
77 | "\n",
78 | "Each session contains at least one video in RGB, IR, and depth format for each of the cameras (face, body, and hands). Lastly, there is a .json file that contains the annotations in OpenLABEL format. These annotations are used for DEx to use its functionalities.\n",
79 | "\n",
80 | "Next, you can see a representation of the structure of the DMD:\n"
81 | ]
82 | },
83 | {
84 | "cell_type": "code",
85 | "execution_count": null,
86 | "metadata": {},
87 | "outputs": [],
88 | "source": [
89 | "'''\n",
90 | "dmd\n",
91 | "├───gA\n",
92 | "│ ├───1\n",
93 | "│ │ ├───s1\n",
94 | "│ │ │ ├───gA_1_s1_2019-03-08T09;31;15+01;00_rgb_ann_distraction.json\n",
95 | "│ │ │ ├───gA_1_s1_2019-03-08T09;31;15+01;00_rgb_face.mp4\n",
96 | "│ │ │ ├───*.mp4\n",
97 | "│ │ │ └───*.avi\n",
98 | "│ │ ├───s2\n",
99 | "│ │ │ └───...\n",
100 | "│ │ └───...\n",
101 | "│ ├───2\n",
102 | "│ │ └───...\n",
103 | "│ └───...\n",
104 | "├───gB\n",
105 | "│ └───...\n",
106 | "└───...\n",
107 | "'''"
108 | ]
109 | },
110 | {
111 | "cell_type": "markdown",
112 | "metadata": {},
113 | "source": [
114 | "**Note**: The .json file has to be in the same folder as its corresponding video. DEx tool can be used to export material that does not belong to the DMD and has been annotated with TaTo. "
115 | ]
116 | },
117 | {
118 | "cell_type": "markdown",
119 | "metadata": {},
120 | "source": [
121 | "## Configuring DEx tool\n",
122 | "The DEx tool is highly customizable and allows you to export data according to your specific requirements. This customization is achieved through the `config_DEx.json` file, which contains various variables that you can modify. Here's an example of what it might look like:"
123 | ]
124 | },
125 | {
126 | "cell_type": "code",
127 | "execution_count": null,
128 | "metadata": {},
129 | "outputs": [],
130 | "source": [
131 | "{\n",
132 | " \"material\": [\"image\"],\n",
133 | " \"streams\" : [\"face\"],\n",
134 | " \"channels\" : [\"rgb\"],\n",
135 | " \"annotations\" : [\"driver_actions/safe_drive\", \"driver_actions/texting_right\"],\n",
136 | " \"write\" : true,\n",
137 | " \"size\" : [224, 224],\n",
138 | " \"intervalChunk\" : 0,\n",
139 | " \"ignoreSmall\" : false,\n",
140 | " \"asc\" : true\n",
141 | "}"
142 | ]
143 | },
144 | {
145 | "cell_type": "markdown",
146 | "metadata": {},
147 | "source": [
148 | "Here is an explanation of all of the available options in the configDEx.json:\n",
149 | "\n",
150 | "- @material: list of data format you wish to export. Possible values: \"image\", \"video\".\n",
151 | "\n",
152 | "- @streams: list of camera names to export material from. Possible values: \"face\", \"body\", \"hands\", and \"general\" (\"general\" if not DMD material).\n",
153 | "\n",
154 | "- @channels: list of channels of information you wish to export. Possible values: \"RGB\", \"IR\", and \"depth\".\n",
155 | "\n",
156 | "- @annotations: list of classes you wish to export (e.g. [\"safe_drive\", \"drinking\"], or \"all\"). Possible values: all label names or only all. The name written can be the type name like in OpenLABEL (e.g.,\"driver_actions/safe_drive\"), or just the label name (e.g.,\"safe_drive\"). You can find the specific name of the name in OpenLABEL format in the .json file or using the mode to get statistics of the DMD. Except for objects (cell phone, hair comb, and bottle) that have to be with the \"object_in_scene/__\" label. Also, the name of the action can be its uid number (e.g. [0,1,2]), but be aware that uid might not be the same in all OpenLabel.\n",
157 | "\n",
158 | "- @write: Flag to create/write material in the destination folder (True) or just get the intervals (False). Possible values: True, False.\n",
159 | "\n",
160 | "The following options are not necessary to be in the config file, if you don't put them, they will take the default value assigned:\n",
161 | "\n",
162 | "- @size: the size of the final output (images or videos). Set it as \"original\" or a tuple with a smaller size than the original (width, height). e.g.(224,224). The default value is (224, 244).\n",
163 | "\n",
164 | "- @intervalChunk: the size of divisions you wish to do to the frame intervals (in case you want videos of x frames each). Possible values: Number greater than 1. The default value is 0, meaning this option is deactivated.\n",
165 | "\n",
166 | "- @ignoreSmall: True to ignore intervals that cannot be cut because they are smaller than @intervalChunk. Possible values: True or False. The default value is deactivated.\n",
167 | "\n",
168 | "- @asc: When cutting interval chunks, the value should be true to create the intervals going in ascendant order (in a video of 105 frames taking chunks of 50 frames DEx creates [0-49, 50-99, 100-104] intervals). The value should be false to go in descendent order (With the 105 frames video taking chunks of 50 frames the intervals created will be [55-104, 5-54, 0-4]). Possible values: True or False. The default value is true, acting as ascendant order.\n"
169 | ]
170 | },
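  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The following cell is a minimal, standalone sketch of the chunking behaviour described above (it is **not** the DEx implementation, just an illustration of the interval arithmetic; the function name `chunk_interval` is ours):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def chunk_interval(start, end, chunk, asc=True, ignore_small=False):\n",
    "    \"\"\"Split the inclusive frame interval [start, end] into sub-intervals of `chunk` frames.\"\"\"\n",
    "    length = end - start + 1\n",
    "    full = length // chunk  # number of complete chunks that fit in the interval\n",
    "    if full == 0 and ignore_small:\n",
    "        return []  # an interval smaller than the chunk size is skipped entirely\n",
    "    pieces = []\n",
    "    if asc:\n",
    "        s = start\n",
    "        for _ in range(full):\n",
    "            pieces.append([s, s + chunk - 1])\n",
    "            s += chunk\n",
    "        if not ignore_small and s <= end:\n",
    "            pieces.append([s, end])  # leftover frames at the end\n",
    "    else:\n",
    "        e = end\n",
    "        for _ in range(full):\n",
    "            pieces.append([e - chunk + 1, e])\n",
    "            e -= chunk\n",
    "        if not ignore_small and e >= start:\n",
    "            pieces.append([start, e])  # leftover frames at the beginning\n",
    "    return pieces\n",
    "\n",
    "print(chunk_interval(0, 104, 50, asc=True))   # [[0, 49], [50, 99], [100, 104]]\n",
    "print(chunk_interval(0, 104, 50, asc=False))  # [[55, 104], [5, 54], [0, 4]]"
   ]
  },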
171 | {
172 | "cell_type": "markdown",
173 | "metadata": {},
174 | "source": [
175 | "## Excecuting DEx Tool\n",
176 | "\n",
177 | "To execute the tool, do it like this: \n",
178 | "\n",
179 | "**NOTE**: You must execute the DExTool inside `./DMD-Driver-Monitoring-Dataset/exploreMaterial-tool/`, otherwise it will not find the `config_DEx.json`.\n"
180 | ]
181 | },
182 | {
183 | "cell_type": "code",
184 | "execution_count": null,
185 | "metadata": {},
186 | "outputs": [],
187 | "source": [
188 | "python DExTool.py"
189 | ]
190 | },
191 | {
192 | "cell_type": "markdown",
193 | "metadata": {},
194 | "source": [
195 | "After excecuting DEx, it will show you 4 functionalities. \n",
196 | "- 0: export material for training\n",
197 | "- 1: group exported material by classes\n",
198 | "- 2: create train and test split\n",
199 | "- 3: get statistics\n",
200 | "\n",
201 | "Press the corresponding number on the keyboard to select the option.\n",
202 | "\n",
203 | "**NOTE:** Option 0, 1 and 2, must be excecuted in that order"
204 | ]
205 | },
206 | {
207 | "cell_type": "code",
208 | "execution_count": null,
209 | "metadata": {},
210 | "outputs": [],
211 | "source": [
212 | "'''\n",
213 | "Welcome :)\n",
214 | "What do you whish to do?: export material for training:[0] group exported material by classes:[1] create train and test split:[2] get statistics:[3] : 0\n",
215 | "'''"
216 | ]
217 | },
218 | {
219 | "cell_type": "markdown",
220 | "metadata": {},
221 | "source": [
222 | "## Example of some use cases\n",
223 | "Next, there are some use cases to illustrate how to configure the `config_DEx.json` file for different scenarios. Whether you're interested in exporting RGB images for safe driving actions or IR videos for drinking actions.\n",
224 | "\n"
225 | ]
226 | },
227 | {
228 | "cell_type": "markdown",
229 | "metadata": {},
230 | "source": [
231 | "### 1. Exporting IR images of face camera of safe_drive action\n",
232 | "In this use case, DEx is going to get the images of the distraction dataset in DMD from the face camera when the driver is driving safely. Here you can see how config_DEx.json should be (The options not listed take the default value):"
233 | ]
234 | },
235 | {
236 | "cell_type": "code",
237 | "execution_count": null,
238 | "metadata": {},
239 | "outputs": [],
240 | "source": [
241 | "{\n",
242 | " \"material\": [\"image\"],\n",
243 | " \"streams\" : [\"face\"],\n",
244 | " \"channels\" : [\"ir\"],\n",
245 | " \"annotations\" : [\"driver_actions/safe_drive\"],\n",
246 | " \"write\" : true\n",
247 | "}"
248 | ]
249 | },
250 | {
251 | "cell_type": "markdown",
252 | "metadata": {},
253 | "source": [
254 | "Once you pressed the option \"0\". DEx will ask you for a **destination path** to output the images. \n",
255 | "\n",
256 | "Also, you need to specify how you want to **read the annotations**: e.g., Only an OpenLabel, or you want to export the material from a whole group. \n",
257 | "\n",
258 | "In this case, we will export the material from Group C. So we put the path to this folder: "
259 | ]
260 | },
261 | {
262 | "cell_type": "code",
263 | "execution_count": null,
264 | "metadata": {},
265 | "outputs": [],
266 | "source": [
267 | "'''\n",
268 | "Welcome :)\n",
269 | "What do you whish to do?: export material for training:[0] group exported material by classes:[1] create train and test split:[2] get statistics:[3] : 0\n",
270 | "To change export settings go to config_DEx.json and change control variables.\n",
271 | "Enter destination path: C:\\Users\\example\\Documents\\Output\n",
272 | "How do you want to read annotations, by: Group:[g] Sessions:[f] One OpenLABEL:[v] : g\n",
273 | "Enter DMD group's path (../dmd/g#): C:\\Users\\example\\Documents\\dmd\\gC\n",
274 | "Enter the session you wish to export in this group: all:[0] S1:[1] S2:[2] S3[3] S4[4] : 1\n",
275 | "C:\\Users\\example\\Documents\\dmd\\gC\\11\n",
276 | "C:\\Users\\example\\Documents\\dmd\\gC\\11\\s1\n",
277 | "C:\\Users\\example\\Documents\\dmd\\gC\\11\\s1\\gC_11_s1_2019-03-04T09;33;18+01;00_rgb_ann_distraction.json\n",
278 | "There are 13 actions in this OpenLABEL\n",
279 | "\n",
280 | "\n",
281 | "-- Getting data of rgb channel --\n",
282 | "\n",
283 | "\n",
284 | "-- Creating image of action: driver_actions/safe_drive --\n",
285 | "rgb face stream loaded: gC_11_s1_2019-03-04T09;33;18+01;00_rgb_face.mp4\n",
286 | "Total frame loss: 0 of total: 3798\n",
287 | "Resulting number of intervals: 91 from initial number: 28\n",
288 | "Writing...\n",
289 | "Directory safe_drive created\n",
290 | "Oki :) ----------------------------------------\n",
291 | "'''"
292 | ]
293 | },
294 | {
295 | "cell_type": "markdown",
296 | "metadata": {},
297 | "source": [
298 | "The output will have a structure similar to this one:"
299 | ]
300 | },
301 | {
302 | "cell_type": "code",
303 | "execution_count": null,
304 | "metadata": {},
305 | "outputs": [],
306 | "source": [
307 | "'''\n",
308 | "Output\n",
309 | "└───dmd_ir\n",
310 | " └───s1\n",
311 | " └───driver_actions\n",
312 | " └───safe_drive\n",
313 | " ├───face_2019-03-08-09;31;15_1_0_15.jpg\n",
314 | " └───...\n",
315 | "'''"
316 | ]
317 | },
318 | {
319 | "cell_type": "markdown",
320 | "metadata": {},
321 | "source": [
322 | "### 2. Exporting videos with a minimum of 30 frames of drivers talking\n",
323 | "In this use case, DEx is going to get videos with a minimum of 30 frames from the distraction dataset in DMD. These videos will correspond to the talking_to_passenger annotation. The config_DEx.json will include every stream and channel to get all the possible videos and the ignoreSmall option will be activated to discard the videos smaller than 30 frames. Here you can see how config_DEx.json should be (The options not listed take the default value):"
324 | ]
325 | },
326 | {
327 | "cell_type": "code",
328 | "execution_count": null,
329 | "metadata": {},
330 | "outputs": [],
331 | "source": [
332 | "{\n",
333 | " \"material\": [\"video\"],\n",
334 | " \"streams\" : [\"face\",\"body\",\"hands\"],\n",
335 | " \"channels\" : [\"rgb\",\"ir\",\"depth\"],\n",
336 | " \"annotations\" : [\"driver_actions/talking_to_passenger\"],\n",
337 | " \"write\" : true,\n",
338 | " \"intervalChunk\" : 30,\n",
339 | " \"ignoreSmall\" : true\n",
340 | "}"
341 | ]
342 | },
343 | {
344 | "cell_type": "markdown",
345 | "metadata": {},
346 | "source": [
347 | "Here you can see an example of execution taking only gA_1_s1_2019-03-08T09;31;15+01;00_rgb_face.mp4:"
348 | ]
349 | },
350 | {
351 | "cell_type": "code",
352 | "execution_count": null,
353 | "metadata": {},
354 | "outputs": [],
355 | "source": [
356 | "'''\n",
357 | "Welcome :)\n",
358 | "What do you whish to do?: export material for training:[0] group exported material by classes:[1] create train and test split:[2] get statistics:[3] : 0\n",
359 | "To change export settings go to config_DEx.json and change control variables.\n",
360 | "Enter destination path: C:\\Users\\example\\Documents\\Output\n",
361 | "How do you want to read annotations, by: Group:[g] Sessions:[f] One OpenLABEL:[v] : g\n",
362 | "Enter DMD group's path (../dmd/g#): C:\\Users\\example\\Documents\\dmd\\gA\n",
363 | "Enter the session you wish to export in this group: all:[0] S1:[1] S2:[2] S3[3] S4[4] S5[5] S6[6] : 1\n",
364 | "C:\\Users\\example\\Documents\\dmd\\gA\\1\n",
365 | "C:\\Users\\example\\Documents\\dmd\\gA\\1\\s1\n",
366 | "C:\\Users\\example\\Documents\\dmd\\gA\\1\\s1\\gA_1_s1_2019-03-08T09;31;15+01;00_rgb_ann_distraction.json\n",
367 | "There are 13 actions in this OpenLABEL\n",
368 | "\n",
369 | "\n",
370 | "-- Getting data of rgb channel --\n",
371 | "\n",
372 | "\n",
373 | "-- Creating video of action: driver_actions/talking_to_passenger --\n",
374 | "rgb face stream loaded: gA_1_s1_2019-03-08T09;31;15+01;00_rgb_face.mp4\n",
375 | "WARNING: the interval chunk length chosen is too small, some intervals are too small to be cutted by 30 frames. To ignore small intervals, set True to ignoreSmall argument.\n",
376 | "WARNING: Skipped interval [2128, 2154] for being too small :(\n",
377 | "WARNING: the interval chunk length chosen is too small, some intervals are too small to be cutted by 30 frames. To ignore small intervals, set True to ignoreSmall argument.\n",
378 | "WARNING: Skipped interval [2257, 2274] for being too small :(\n",
379 | "WARNING: the interval chunk length chosen is too small, some intervals are too small to be cutted by 30 frames. To ignore small intervals, set True to ignoreSmall argument.\n",
380 | "WARNING: Skipped interval [2408, 2431] for being too small :(\n",
381 | "WARNING: the interval chunk length chosen is too small, some intervals are too small to be cutted by 30 frames. To ignore small intervals, set True to ignoreSmall argument.\n",
382 | "WARNING: Skipped interval [2560, 2587] for being too small :(\n",
383 | "WARNING: the interval chunk length chosen is too small, some intervals are too small to be cutted by 30 frames. To ignore small intervals, set True to ignoreSmall argument.\n",
384 | "WARNING: Skipped interval [4995, 5021] for being too small :(\n",
385 | "WARNING: the interval chunk length chosen is too small, some intervals are too small to be cutted by 30 frames. To ignore small intervals, set True to ignoreSmall argument.\n",
386 | "WARNING: Skipped interval [5203, 5228] for being too small :(\n",
387 | "WARNING: the interval chunk length chosen is too small, some intervals are too small to be cutted by 30 frames. To ignore small intervals, set True to ignoreSmall argument.\n",
388 | "WARNING: Skipped interval [5281, 5306] for being too small :(\n",
389 | "WARNING: the interval chunk length chosen is too small, some intervals are too small to be cutted by 30 frames. To ignore small intervals, set True to ignoreSmall argument.\n",
390 | "WARNING: Skipped interval [5331, 5357] for being too small :(\n",
391 | "WARNING: the interval chunk length chosen is too small, some intervals are too small to be cutted by 30 frames. To ignore small intervals, set True to ignoreSmall argument.\n",
392 | "WARNING: Skipped interval [5478, 5504] for being too small :(\n",
393 | "Total frame loss: 49 of total: 421\n",
394 | "Resulting number of intervals: 5 from initial number: 14\n",
395 | "Writing...\n",
396 | "'''"
397 | ]
398 | },
399 | {
400 | "cell_type": "markdown",
401 | "metadata": {},
402 | "source": [
403 | "The output of the execution should have a structure similar to this one:"
404 | ]
405 | },
406 | {
407 | "cell_type": "code",
408 | "execution_count": null,
409 | "metadata": {},
410 | "outputs": [],
411 | "source": [
412 | "'''\n",
413 | "Output\n",
414 | "├───dmd_depth\n",
415 | "│ └───s1\n",
416 | "│ └───driver_actions\n",
417 | "│ └───talking_to_passenger\n",
418 | "│ ├───body_2019-03-08-09;31;15_1_0.avi\n",
419 | "│ ├───face_2019-03-08-09;31;15_1_0.avi\n",
420 | "│ ├───hands_2019-03-08-09;31;15_1_0.avi\n",
421 | "│ └───...\n",
422 | "├───dmd_ir\n",
423 | "│ └───s1\n",
424 | "│ └───driver_actions\n",
425 | "│ └───talking_to_passenger\n",
426 | "│ ├───body_2019-03-08-09;31;15_1_0.avi\n",
427 | "│ ├───face_2019-03-08-09;31;15_1_0.avi\n",
428 | "│ ├───hands_2019-03-08-09;31;15_1_0.avi\n",
429 | "│ └───...\n",
430 | "└───dmd_rgb\n",
431 | " └───s1\n",
432 | " └───driver_actions\n",
433 | " └───talking_to_passenger\n",
434 | " ├───body_2019-03-08-09;31;15_1_0.avi\n",
435 | " ├───face_2019-03-08-09;31;15_1_0.avi\n",
436 | " ├───hands_2019-03-08-09;31;15_1_0.avi\n",
437 | " └───...\n",
438 | "'''"
439 | ]
440 | },
441 | {
442 | "cell_type": "markdown",
443 | "metadata": {},
444 | "source": [
445 | "### 3. Exporting Depth material in video for gaze zone estimation\n",
446 | "In this use case, DEx is going to get videos from the gaze dataset in DMD. As we mentioned before, the gaze dataset only contains the session 6 recorded, for this reason when asked by the DEx tool which session to export you have to choose s6 or the all option. In case you choose another session the DEx tool isn't going to return anything because it isn't going to find the videos. Here you can see how config_DEx.json should be (The options not listed take the default value):"
447 | ]
448 | },
449 | {
450 | "cell_type": "code",
451 | "execution_count": null,
452 | "metadata": {},
453 | "outputs": [],
454 | "source": [
455 | "{\n",
456 | " \"material\": [\"video\"],\n",
457 | " \"streams\" : [\"face\"],\n",
458 | " \"channels\" : [\"depth\"],\n",
459 | " \"annotations\" : \"all\",\n",
460 | " \"write\" : true,\n",
461 | " \"size\" : \"original\"\n",
462 | "}"
463 | ]
464 | },
465 | {
466 | "cell_type": "markdown",
467 | "metadata": {},
468 | "source": [
469 | "Here you can see an example of execution with these options:"
470 | ]
471 | },
472 | {
473 | "cell_type": "code",
474 | "execution_count": null,
475 | "metadata": {},
476 | "outputs": [],
477 | "source": [
478 | "'''\n",
479 | "Welcome :)\n",
480 | "What do you whish to do?: export material for training:[0] group exported material by classes:[1] create train and test split:[2] get statistics:[3] : 0\n",
481 | "To change export settings go to config_DEx.json and change control variables.\n",
482 | "Enter destination path: C:\\Users\\example\\Documents\\Output\n",
483 | "How do you want to read annotations, by: Group:[g] Sessions:[f] One OpenLABEL:[v] : g\n",
484 | "Enter DMD group's path (../dmd/g#): C:\\Users\\example\\Documents\\dmd\\gA\n",
485 | "Enter the session you wish to export in this group: all:[0] S1:[1] S2:[2] S3[3] S4[4] S5[5] S6[6] : 6\n",
486 | "C:\\Users\\example\\Documents\\dmd\\gA\\1\n",
487 | "C:\\Users\\example\\Documents\\dmd\\gA\\1\\s6\n",
488 | "C:\\Users\\example\\Documents\\dmd\\gA\\1\\s6\\gA_1_s6_2019-03-08T09;15;15+01;00_rgb_ann_gaze.json\n",
489 | "There are 11 actions in this OpenLABEL\n",
490 | "\n",
491 | "\n",
492 | "-- Getting data of depth channel --\n",
493 | "\n",
494 | "\n",
495 | "-- Creating video of action: gaze_zone/left_mirror --\n",
496 | "depth face stream loaded: gA_1_s6_2019-03-08T09;15;15+01;00_depth_face.avi\n",
497 | "Writing...\n",
498 | "Directory left_mirror created\n",
499 | "ffmpeg version 6.1 Copyright (c) 2000-2023 the FFmpeg developers\n",
500 | " built with clang version 17.0.4\n",
501 | " configuration: --prefix=/d/bld/ffmpeg_1699837986739/_h_env/Library --cc=clang.exe --cxx=clang++.exe --nm=llvm-nm --ar=llvm-ar --disable-doc --disable-openssl --enable-demuxer=dash --enable-hardcoded-tables --enable-libfreetype --enable-libfontconfig --enable-libopenh264 --enable-libdav1d --ld=lld-link --target-os=win64 --enable-cross-compile --toolchain=msvc --host-cc=clang.exe --extra-libs=ucrt.lib --extra-libs=vcruntime.lib --extra-libs=oldnames.lib --strip=llvm-strip --disable-stripping --host-extralibs= --enable-gpl --enable-libx264 --enable-libx265 --enable-libaom --enable-libsvtav1 --enable-libxml2 --enable-pic --enable-shared --disable-static --enable-version3 --enable-zlib --enable-libopus --pkg-config=/d/bld/ffmpeg_1699837986739/_build_env/Library/bin/pkg-config\n",
502 | " libavutil 58. 29.100 / 58. 29.100\n",
503 | " libavcodec 60. 31.102 / 60. 31.102\n",
504 | " libavformat 60. 16.100 / 60. 16.100\n",
505 | " libavdevice 60. 3.100 / 60. 3.100\n",
506 | " libavfilter 9. 12.100 / 9. 12.100\n",
507 | " libswscale 7. 5.100 / 7. 5.100\n",
508 | " libswresample 4. 12.100 / 4. 12.100\n",
509 | " libpostproc 57. 3.100 / 57. 3.100\n",
510 | "Input #0, avi, from '\\\\gpfs-cluster\\activos\\DMD\\gaze-final\\dmd\\gA\\1\\s6\\gA_1_s6_2019-03-08T09;15;15+01;00_depth_face.avi':\n",
511 | " Metadata:\n",
512 | " software : Lavf58.29.100\n",
513 | " Duration: 00:02:59.44, start: 0.000000, bitrate: 50707 kb/s\n",
514 | " Stream #0:0: Video: ffv1 (FFV1 / 0x31564646), gray16le, 1280x720, 50708 kb/s, 29.76 fps, 29.76 tbr, 29.76 tbn\n",
515 | "[out#0/avi @ 0000026BA49CA9C0] Codec AVOption crf (Select the quality for constant quality mode) has not been used for any stream. The most likely reason is either wrong type (e.g. a video option with no video streams) or that it is a private option of some encoder which was not actually used for any stream.\n",
516 | "Stream mapping:\n",
517 | " Stream #0:0 (ffv1) -> trim:default\n",
518 | " setpts:default -> Stream #0:0 (ffv1)\n",
519 | "'''"
520 | ]
521 | },
522 | {
523 | "cell_type": "markdown",
524 | "metadata": {},
525 | "source": [
526 | "The expected structure of the output of the execution should be similar to this one:"
527 | ]
528 | },
529 | {
530 | "cell_type": "code",
531 | "execution_count": null,
532 | "metadata": {},
533 | "outputs": [],
534 | "source": [
535 | "'''\n",
536 | "Output\n",
537 | "└───s6\n",
538 | " ├───blinks\n",
539 | " │ ├───blinking\n",
540 | " │ │ ├───face_2019-03-08-09;15;15_1_0.avi\n",
541 | " │ │ └───...\n",
542 | " │ └───...\n",
543 | " ├───gaze_zone\n",
544 | " │ ├───center_mirror\n",
545 | " │ │ └───...\n",
546 | " │ └───...\n",
547 | " └───...\n",
548 | "'''"
549 | ]
550 | },
551 | {
552 | "cell_type": "markdown",
553 | "metadata": {},
554 | "source": [
555 | "### 4. Joining the exported material by classes\n",
556 | "\n",
557 | "In case you export more than one session of the DMD using the mode \"export material for training:[0]\", the tool extracts the different actions in folders divided by the sessions. Now we're going to explain the next option that combines the subfolders of different sessions in only one folder to be used in training. Having the following structure of subfolders:"
558 | ]
559 | },
560 | {
561 | "cell_type": "code",
562 | "execution_count": null,
563 | "metadata": {},
564 | "outputs": [],
565 | "source": [
566 | "'''\n",
567 | "dmd_rgb\n",
568 | "├───s1\n",
569 | "│ └───driver_actions\n",
570 | "│ ├───drinking\n",
571 | "│ └───radio\n",
572 | "└───s4\n",
573 | " └───driver_actions\n",
574 | " ├───drinking\n",
575 | "│ └───radio\n",
576 | "'''\n"
577 | ]
578 | },
579 | {
580 | "cell_type": "markdown",
581 | "metadata": {},
582 | "source": [
583 | "The execution should go like this:"
584 | ]
585 | },
586 | {
587 | "cell_type": "code",
588 | "execution_count": null,
589 | "metadata": {},
590 | "outputs": [],
591 | "source": [
592 | "'''\n",
593 | "Welcome :)\n",
594 | "What do you whish to do?: export material for training:[0] group exported material by classes:[1] create train and test split:[2] get statistics:[3] : 1\n",
595 | "Enter exported DMD material path (inside must be sessions folders(s#) e.g:../dmd_rgb/): C:\\Users\\example\\Documents\\Output\\dmd_rgb\n",
596 | "dir True\n",
597 | "Moving drinking to C:\\Users\\example\\Documents\\Output\\dmd_rgb\\driver_actions\\drinking\n",
598 | "Moving radio to C:\\Users\\example\\Documents\\Output\\dmd_rgb\\driver_actions\\radio\n",
599 | "dir True\n",
600 | "Moving drinking to C:\\Users\\example\\Documents\\Output\\dmd_rgb\\driver_actions\\drinking\n",
601 | "Oki :) ----------------------------------------\n",
602 | "'''"
603 | ]
604 | },
605 | {
606 | "cell_type": "markdown",
607 | "metadata": {},
608 | "source": [
609 | "Finally, the structure of folders should be transformed to this one:"
610 | ]
611 | },
612 | {
613 | "cell_type": "code",
614 | "execution_count": null,
615 | "metadata": {},
616 | "outputs": [],
617 | "source": [
618 | "'''\n",
619 | "dmd_rgb\n",
620 | "└───driver_actions\n",
621 | " ├───drinking\n",
622 | " └───radio\n",
623 | "'''"
624 | ]
625 | },
626 | {
627 | "cell_type": "markdown",
628 | "metadata": {},
629 | "source": [
630 | "### 5. Dividing exported material into train and test splits\n",
631 | "\n",
632 | "Once the annotations are separated in folders, you can split the dataset in training and test. To do this, let's assume that we have the structure of folders as shown in last section. With this in mind, the execution should be similar to this one:"
633 | ]
634 | },
635 | {
636 | "cell_type": "code",
637 | "execution_count": null,
638 | "metadata": {},
639 | "outputs": [],
640 | "source": [
641 | "'''\n",
642 | "Welcome :)\n",
643 | "What do you whish to do?: export material for training:[0] group exported material by classes:[1] create train and test split:[2] get statistics:[3] : 2\n",
644 | "This function only works with dmd material structure when exporting with DEx tool.\n",
645 | "Enter exported material path (inside must be classes folders e.g.: /safe_driving/*.jpg): C:\\Users\\example\\Documents\\OMS\\DEx\\TutorialNotebook\\Salida\\dmd_rgb\\driver_actions\n",
646 | "Enter destination path (a new folder to store train and test splits): C:\\Users\\example\\Documents\\OMS\\DEx\\TutorialNotebook\\Salida\\split\n",
647 | "Enter test proportion for split [0-1] (e.g. 0.20): 0.20\n",
648 | "folders: ['C:\\Users\\example\\Documents\\Output\\dmd_rgb\\driver_actions\\drinking', 'C:\\Users\\example\\Documents\\Output\\dmd_rgb\\driver_actions\\radio']\n",
649 | "Moving 164 files: 131 for training and 33 for testing.\n",
650 | "Moving 126 files: 101 for training and 25 for testing.\n",
651 | "Oki :) ----------------------------------------\n",
652 | "'''"
653 | ]
654 | },
655 | {
656 | "cell_type": "markdown",
657 | "metadata": {},
658 | "source": [
659 | "As you can see in the execution, you have to put the path to the folder containing all the annotations. The structure of the folder should be something similar to this one:"
660 | ]
661 | },
662 | {
663 | "cell_type": "code",
664 | "execution_count": null,
665 | "metadata": {},
666 | "outputs": [],
667 | "source": [
668 | "'''\n",
669 | "split\n",
670 | "├───test\n",
671 | "│ ├───0\n",
672 | "│ └───1\n",
673 | "└───train\n",
674 | " ├───0\n",
675 | " └───1\n",
676 | "'''"
677 | ]
678 | },
679 | {
680 | "cell_type": "markdown",
681 | "metadata": {},
682 | "source": [
683 | "In this case, the folders 0 contains videos corresponding with the drinking annotation and the folders 1 contains videos corresponding with the radio annotation.\n",
684 | "\n",
685 | "### 6. Getting statistics of annotations\n",
686 | "\n",
687 | "This mode gets the statistics of a group, session or one OpenLABEL of the DMD. The result is two files called example-frames.txt and example-actions.txt, being example the name you input to the tool. The example-frames.txt contains the total number of frames, and the example-actions.txt contains the number of frames in which an annotation appears. Let's see an example:"
688 | ]
689 | },
690 | {
691 | "cell_type": "code",
692 | "execution_count": null,
693 | "metadata": {},
694 | "outputs": [],
695 | "source": [
696 | "'''\n",
697 | "Welcome :)\n",
698 | "What do you whish to do?: export material for training:[0] group exported material by classes:[1] create train and test split:[2] get statistics:[3] : 3\n",
699 | "This function only works with dmd material structure when exporting with DEx tool.\n",
700 | "Enter filename for a report file (e.g. report.txt): statisticsDMDc.txt\n",
701 | "How do you want to read annotations, by: Group:[g] Sessions:[f] One OpenLABEL:[v] : g\n",
702 | "Enter DMD group's path (../dmd/g#): C:\\Users\\example\\Documents\\dmd\\gC\n",
703 | "Enter the session you wish to export in this group: all:[0] S1:[1] S2:[2] S3[3] S4[4] S5[5] S6[6] : 1\n",
704 | "C:\\Users\\example\\Documents\\dmd\\gC\\11\n",
705 | "C:\\Users\\example\\Documents\\dmd\\gC\\11\\s1\n",
706 | "C:\\Users\\example\\Documents\\dmd\\gC\\11\\s1\\gC_11_s1_2019-03-04T09;33;18+01;00_rgb_ann_distraction.json\n",
707 | "There are 13 actions in this OpenLABEL\n",
708 | "Oki :) ----------------------------------------\n",
709 | "C:\\Users\\example\\Documents\\dmd\\gC\\12\n",
710 | "C:\\Users\\example\\Documents\\dmd\\gC\\12\\s1\n",
711 | "C:\\Users\\example\\Documents\\dmd\\gC\\12\\s1\\gC_12_s1_2019-03-13T10;23;45+01;00_rgb_ann_distraction.json\n",
712 | "There are 13 actions in this OpenLABEL\n",
713 | "Oki :) ----------------------------------------\n",
714 | "C:\\Users\\example\\Documents\\dmd\\gC\\13\n",
715 | "C:\\Users\\example\\Documents\\dmd\\gC\\13\\s1\n",
716 | "C:\\Users\\example\\Documents\\dmd\\gC\\13\\s1\\gC_13_s1_2019-03-04T10;26;12+01;00_rgb_ann_distraction.json\n",
717 | "There are 13 actions in this OpenLABEL\n",
718 | "Oki :) ----------------------------------------\n",
719 | "C:\\Users\\example\\Documents\\dmd\\gC\\14\n",
720 | "C:\\Users\\example\\Documents\\dmd\\gC\\14\\s1\n",
721 | "C:\\Users\\example\\Documents\\dmd\\gC\\14\\s1\\gC_14_s1_2019-03-04T11;56;20+01;00_rgb_ann_distraction.json\n",
722 | "There are 15 actions in this OpenLABEL\n",
723 | "Oki :) ----------------------------------------\n",
724 | "C:\\Users\\example\\Documents\\dmd\\gC\\15\n",
725 | "C:\\Users\\example\\Documents\\dmd\\gC\\15\\s1\n",
726 | "C:\\Users\\example\\Documents\\dmd\\gC\\15\\s1\\gC_15_s1_2019-03-04T11;24;57+01;00_rgb_ann_distraction.json\n",
727 | "There are 12 actions in this OpenLABEL\n",
728 | "Oki :) ----------------------------------------\n",
729 | "'''"
730 | ]
731 | },
732 | {
733 | "cell_type": "markdown",
734 | "metadata": {},
735 | "source": [
736 | "As you can see by the name we have given to the tool, the files that has been created are called statisticsDMDc-frames.txt and statisticsDMDc-actions.txt. An important detail to have in mind is that the name given to the report file has to contain .txt suffix. Here is the content in statisticsDMDc-frames.txt:"
737 | ]
738 | },
739 | {
740 | "cell_type": "code",
741 | "execution_count": null,
742 | "metadata": {},
743 | "outputs": [],
744 | "source": [
745 | "'''\n",
746 | "total_frames:34636\n",
747 | "'''"
748 | ]
749 | },
750 | {
751 | "cell_type": "markdown",
752 | "metadata": {},
753 | "source": [
754 | "Content in statisticsDMDc-actions.txt:"
755 | ]
756 | },
757 | {
758 | "cell_type": "code",
759 | "execution_count": null,
760 | "metadata": {},
761 | "outputs": [],
762 | "source": [
763 | "'''\n",
764 | "gaze_on_road/looking_road:28683\n",
765 | "gaze_on_road/not_looking_road:5477\n",
766 | "talking/talking:5330\n",
767 | "hands_using_wheel/both:20487\n",
768 | "hands_using_wheel/only_left:13510\n",
769 | "hand_on_gear/hand_on_gear:100\n",
770 | "driver_actions/safe_drive:19344\n",
771 | "driver_actions/radio:4145\n",
772 | "driver_actions/drinking:3409\n",
773 | "driver_actions/reach_side:2730\n",
774 | "driver_actions/talking_to_passenger:2078\n",
775 | "driver_actions/unclassified:2562\n",
776 | "objects_in_scene/bottle:6202\n",
777 | "hands_using_wheel/none:265\n",
778 | "hands_using_wheel/only_right:103\n",
779 | "'''"
780 | ]
781 | }
782 | ],
783 | "metadata": {
784 | "colab": {
785 | "provenance": []
786 | },
787 | "kernelspec": {
788 | "display_name": "Python 3",
789 | "name": "python3"
790 | },
791 | "language_info": {
792 | "codemirror_mode": {
793 | "name": "ipython",
794 | "version": 3
795 | },
796 | "file_extension": ".py",
797 | "mimetype": "text/x-python",
798 | "name": "python",
799 | "nbconvert_exporter": "python",
800 | "pygments_lexer": "ipython3",
801 | "version": "3.8.18"
802 | }
803 | },
804 | "nbformat": 4,
805 | "nbformat_minor": 0
806 | }
807 |
--------------------------------------------------------------------------------
/exploreMaterial-tool/accessDMDAnn.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import sys
3 | import os
4 | import cv2
5 | from pathlib import Path # To handle paths independent of OS
6 | import numpy as np
7 | import math
8 | import time
9 | import json
10 |
11 | # Import local class to parse OpenLABEL content
12 | from vcd4reader import VcdHandler
13 | from vcd4reader import VcdDMDHandler
14 |
15 | import ffmpeg
16 | # Written by Paola Cañas and David Galvañ with <3
17 |
18 | # Python and Opencv script to prepare/export material of DMD for training.
19 | # Reads annotations in OpenLABEL and the 3 stream videos.
20 | # To change export settings, go to __init__ and change control variables.
21 |
22 | # Run it through python script DExTool.py
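# Illustrative usage only (normally the class is driven through DExTool.py, which asks for
# the paths interactively); the paths below are hypothetical examples:
#   exportClass(vcdFile="../dmd/gA/1/s1/gA_1_s1_2019-03-08T09;31;15+01;00_rgb_ann_distraction.json",
#               rootDmd="../dmd", destinationPath="./exported", datasetDMD=True)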
23 | # -----
24 | class exportClass():
25 |
26 | def __init__(self, vcdFile, rootDmd, destinationPath, datasetDMD=True):
27 | # ------ GLOBAL VARIABLES ------
28 | # - Args -
29 | self.vcdFile = vcdFile
30 | self.rootDmd = rootDmd+"/"
31 | self.destinationPath = destinationPath
32 | self.datasetDMD = datasetDMD
33 |
34 | if self.datasetDMD:
35 |
36 | # Create VCD Reader object
37 | self.vcd_handler = VcdDMDHandler(vcd_file=Path(self.vcdFile))
38 | # @self.info: [group, subject, session, date]
39 | self.info = self.vcd_handler.get_basic_metadata()
40 | # Just get day and hour from the full timestamp
41 | self.dateDayHour = self.info[3]
42 | # @shifts: [body_face_sh, hands_face_sh, hands_body_sh]
43 | self.shift_bf, self.shift_hf, self.shift_hb = self.vcd_handler.get_shifts()
44 | else:
45 | # Create VCD Reader object
46 | self.vcd_handler = VcdHandler(vcd_file=Path(self.vcdFile))
47 | self.info = ["g1","0","s1"]
48 | named_tuple = time.localtime() # get struct_time
49 | self.dateDayHour = time.strftime("%Y-%m-%d-%H;%M;%S", named_tuple)
50 | self.shift_bf, self.shift_hf, self.shift_hb = 0, 0, 0
51 |
52 |
53 | # @self.actionList: ["driver_actions/safe_drive", "gaze_on_road/looking_road",.. , ..]
54 | self.actionList = self.vcd_handler.get_action_type_list()
55 | #Get object list
56 | self.objectList = self.vcd_handler.get_object_type_list()
57 | # Add annotated objects to the action list, skipping the "driver" object
58 | for object in self.objectList:
59 | # Append objects to actionList
60 | if "driver" in object:
61 | #Dont add driver object
62 | continue
63 | self.actionList.append("objects_in_scene/"+object)
64 |
65 | self.frameNum = 0
66 |
67 |
68 | # -- CONTROL VARIABLES --
69 |
70 | """
71 | args:
72 | These values are assigned in config_DEx.json
73 |
74 | @material: list of data format you wish to export.
75 | Possible values: "image","video"
76 |
77 | @streams: list of camera names to export material from.
78 | Possible values: "face", "body", "hands", "general"
79 |
80 | @channels: list of channels of information you wish to export .
81 | Possible values: "rgb", "ir", "depth"
82 |
83 | @annotations: list of classes you wish to export (e.g. ["safe_drive","drinking"], "all")
84 | Possible values: all labels names or only "all"
85 | it can be the type name like in OpenLABEL ("driver_actions/safe_drive") or just the label name ("safe_drive"). Except for objects.
86 | Objects (cellphone, hair comb and bottle) have to be with the "object_in_scene/__" label before.
87 | It can also be the action uid as a number (e.g. [0,1,2]) but be aware that uids might not be the same in all OpenLABEL files.
88 | If you put the value "all" in the config file, the system loads all the classes available in OpenLABEL from the var @self.actionList.
89 |
90 | @write: Flag to create/write material in destination folder (True) or just get the intervals (False)
91 | Possible values: True, False
92 |
93 | Optional args:
94 |
95 | @size: size of the final output (images or videos). Set it as "original" or a tuple with a smaller size than the original (width, height). e.g.(224,224).
96 |
97 | @intervalChunk: size of divisions you wish to do to the frame intervals (in case you want videos of x frames each)
98 | Possible values: Number greater than 1
99 |
100 | @ignoreSmall: True to ignore intervals that cannot be cutted because they are smaller than @intervalChunk
101 | Possible values: True or False
102 |
103 | @asc: When cutting interval chunks, the value should be true to create the intervals going in ascendant order (in a video of 105 frames taking chunks of 50 frames DEx creates
104 | [0-49, 50-99, 100-104] intervals). The value should be false to go in descendent order (With the 105 frames video taking chunks of 50 frames the intervals created will be
105 | [55-104, 5-54, 0-4]). Possible values: True or False
106 | """
107 | # ----LOAD CONFIG FROM JSON----
108 | # Config dictionary path
109 | self._config_json = "config_DEx.json"
110 | # From json to python dictionaries
111 | with open(self._config_json) as config_file:
112 | config_dict = json.load(config_file)
113 | self.material = config_dict["material"] #list: ["image"]
114 | if not isinstance(self.material,list):
115 | self.material = [self.material]
116 | self.streams = config_dict["streams"]#list: ["face","hands","body"] #must be "general" if not DMD dataset
117 | if not isinstance(self.streams,list):
118 | self.streams = [self.streams]
119 | self.channels = config_dict["channels"] #List: ["rgb", "ir", "depth"]. Include "depth" to export Depth information too. It must be only "rgb" if not DMD dataset
120 | if not isinstance(self.channels,list):
121 | self.channels = [self.channels]
122 | if type(config_dict["annotations"]) == type("all") and config_dict["annotations"].lower() == "all":
123 | self.annotations = self.actionList
124 | else:
125 | self.annotations = config_dict["annotations"]
126 | if not isinstance(self.annotations,list):
127 | self.annotations = [self.annotations]
128 |
129 | self.write = config_dict["write"]
130 | if "size" in config_dict:
131 | self.size = config_dict["size"] #[224,224] #"original" # or (width, height) e.g.[224,224]
132 | else:
133 | self.size = (224,224)
134 | if "intervalChunk" in config_dict:
135 | self.intervalChunk = config_dict["intervalChunk"]
136 | else:
137 | self.intervalChunk = 0
138 | if "ignoreSmall" in config_dict:
139 | self.ignoreSmall = config_dict["ignoreSmall"]
140 | else:
141 | self.ignoreSmall = False
142 | if "asc" in config_dict:
143 | self.asc = config_dict["asc"]
144 | else:
145 | self.asc = True
146 |
147 | #validations
148 | if not self.datasetDMD and (self.streams[0] != "general" or len(self.streams)> 1):
149 | raise RuntimeError(
150 | "WARNING: stream option for other datasets must be only 'general'")
151 | if not self.datasetDMD and len(self.channels)> 1:
152 | raise RuntimeError(
153 | "WARNING: channles option for other datasets must be only 'rgb'")
154 | #exec
155 | self.exportMaterial()
156 |
157 | def exportMaterial(self):
158 | for annotation in self.annotations:
159 | for channel in self.channels:
160 | for stream in self.streams:
161 | for mat in self.material:
162 | validAnnotation = False
163 | # Check if annotation exists in OpenLABEL
164 | if isinstance(annotation, str):
165 | # if annotation is string, check with self.vcd_handler if it is in OpenLABEL
166 | if self.vcd_handler.is_action_type_get_uid(annotation)[0] or self.vcd_handler.is_object_type_get_uid(annotation)[0]:
167 | validAnnotation = True
168 | elif isinstance(annotation, int):
169 | # if annotation is int, it is a uid and has to be less than self.actionList length
170 | if annotation < len(self.actionList):
171 | validAnnotation = True
172 | else:
173 | raise RuntimeError(
174 | "WARNING: Annotation argument must be string or int")
175 | if validAnnotation:
176 | print("\n\n-- Getting data of %s channel --" % (channel))
177 | intervals = self.getIntervals(mat, channel, stream, annotation)
178 | else:
179 | print("WARNING: annotation %s is not in this OpenLABEL." % str(annotation))
180 |
181 | # Function to get intervals of @annotation from OpenLABEL and if @write, exports the @material of @stream indicated to @self.destinationPath
182 | def getIntervals(self, material, channel, stream, annotation):
183 | # get name of action if uid is given
184 | if isinstance(annotation, int):
185 | annotation = self.actionList[annotation]
186 |
187 | print("\n\n-- Creating %s of action: %s --" % (material, str(annotation)))
188 |
189 | # Check and load valid video
190 | streamVideoPath = str(self.getStreamVideo(channel, stream))
191 | capVideo = cv2.VideoCapture(streamVideoPath)
192 |
193 | #Validation to check if given new size is smaller than original
194 | if(self.size!="original"):
195 | if (capVideo.get(cv2.CAP_PROP_FRAME_HEIGHT) 1:
210 | fullIntervalsAsList = self.cutIntervals(fullIntervalsAsList)
211 |
212 | if self.write:
213 | print("Writing...")
214 | """If annotation string is the action type from OpenLABEL, it will create a folder
215 | for each label inside their level folder because of the "/" in the name.
216 | If annotation is a number, then a folder will be created for each label with its uid as name """
217 |
218 | if self.datasetDMD:
219 | # create folder per annotation and per session
220 | dirName = Path(self.destinationPath +"/dmd_"+channel+ "/"+self.info[2] + "/" + str(annotation))
221 | else:
222 | dirName = Path(self.destinationPath+ "/" +str(annotation))
223 | if not dirName.exists():
224 | os.makedirs(str(dirName), exist_ok=True)
225 | print("Directory", dirName.name, "created")
226 |
227 | #@depthVideoArray: numpy array with all frames of video containing depth information
228 | depthVideoArray = []
229 | if channel == "depth" and (material == "image" or material == "images" ):
230 | depthVideoArray = self.getDepthVideoArray(streamVideoPath, capVideo)
231 |
232 | for count, interval in enumerate(fullIntervalsAsList):
233 |
234 | # Check if frames are available in stream. Find corresponding frames in stream; right now it is the mosaic frame
235 | valid, startFrame, endFrame = self.checkFrameInStream(
236 | stream, interval[0], interval[1])
237 |
238 | mosaicStartFrame = interval[0]
239 |
240 | if valid:
241 | print('Exporting interval %d \r' % count, end="")
242 | # Name with stream, date, subject and interval id to not overwrite
243 | fileName = str(dirName) + "/" + stream + "_" + \
244 | self.dateDayHour.replace(":",";") + "_"+self.info[1] + "_"+str(count)
245 | if material == "image" or material == "images" or material == "img" or material == "imgs":
246 | if channel == "depth":
247 | self.depthFrameIntervalToImages(startFrame, endFrame, mosaicStartFrame, depthVideoArray, capVideo, fileName)
248 | else:
249 | self.frameIntervalToImages(startFrame, endFrame, mosaicStartFrame, capVideo, fileName)
250 | else:
251 | if channel == "depth":
252 | self.depthFrameIntervalToVideo(startFrame, endFrame, streamVideoPath, fileName)
253 | else:
254 | self.frameIntervalToVideo(startFrame, endFrame, capVideo, fileName)
255 | else:
256 | print(
257 | "WARNING: Skipped interval %i, because some of its frames do not exist in stream %s" %(count,stream))
258 | return fullIntervalsAsList
259 |
260 | # Function to get intervals as dictionaries and return them as a python list
261 | def dictToList(self,intervalsDict):
262 | assert (len(intervalsDict) > 0)
263 | intervalsList = []
264 | for interDict in intervalsDict:
265 | intervalsList.append(
266 | [interDict["frame_start"], interDict["frame_end"]])
267 | return intervalsList
268 |
269 | # Function to take list of intervals and cut them into sub intervals of desired size
270 | # @intervals: list of intervals to cut
271 | # @intervalChunk: chunks size
272 | # @ignoreSmall: True to ignore intervals that cannot be cut because they are smaller than @intervalChunk
273 | # @asc: flag to start cutting from start of interval (ascendant) or from the end of interval (descendant)
274 | def cutIntervals(self, intervals):
275 | assert (len(intervals) > 0)
276 | assert (self.intervalChunk > 1)
277 | intervalsCutted = []
278 | framesSum = 0
279 | framesLostSum = 0
280 |
281 | for interval in intervals:
282 | framesLost = 0
283 | init = interval[0]
284 | end = interval[1]
285 | if end == self.frameNum:
286 | end = end-1
287 | dist = end - init
288 | framesSum = framesSum + dist
289 | count = init if self.asc else end
290 |
291 | # calculate how many chunks will result per interval
292 | numOfChunks = math.floor(dist / self.intervalChunk)
293 | # If the interval cannot be divided and small intervals are set to be ignored, skip it
294 | if numOfChunks <= 0 and self.ignoreSmall:
295 | print("WARNING: the interval chunk length chosen is too small, some intervals are too small to be cutted by",
296 | self.intervalChunk, "frames. To ignore small intervals, set True to ignoreSmall argument.")
297 | print("WARNING: Skipped interval", interval, "for being too small :(")
298 | else:
299 | # if the division of interval is possible
300 | if numOfChunks > 0:
301 | for x in range(numOfChunks):
302 | # if Ascendant, take the initial limit of interval and start dividing chunks from there adding chunk size
303 | if self.asc:
304 | intervalsCutted.append([count, count + self.intervalChunk - 1])
305 | count = count + self.intervalChunk
306 | # if descendant, take the final limit of interval and start dividing chunks from there substracting chunk size
307 | else:
308 | intervalsCutted.append([count, count - self.intervalChunk + 1])
309 | count = count - self.intervalChunk
310 | framesLost = abs(
311 | count - end) if self.asc else abs(count - init)
312 | if not self.ignoreSmall:
313 | framesLost = abs(
314 | count - end) if self.asc else abs(count - init)
315 | if self.asc:
316 | intervalsCutted.append([count, count + framesLost])
317 | else:
318 | intervalsCutted.append([count, count - framesLost])
319 | framesLost = 0
320 |
321 | framesLostSum = framesLostSum + framesLost
322 | print("Total frame loss:", framesLostSum, "of total:", framesSum + 1)
323 | print("Resulting number of intervals:", len(intervalsCutted),
324 | "from initial number:", len(intervals))
325 |
326 | return intervalsCutted
327 |
328 | # Function to create a sub video called @name.avi from @frameStart to @frameEnd of stream video @capVideo
329 | # saves video in @self.destinationPath
330 | # @capVideo: is video loaded in opencv, not path
331 | # @name of file with no extension
332 | def frameIntervalToVideo(self, frameStart, frameEnd, capVideo, name):
333 | fourcc = cv2.VideoWriter_fourcc(*'XVID')
334 | if self.size =="original":
335 | width = int(capVideo.get(cv2.CAP_PROP_FRAME_WIDTH))
336 | height = int(capVideo.get(cv2.CAP_PROP_FRAME_HEIGHT))
337 | else:
338 | width = self.size[0]
339 | height = self.size[1]
340 | intervalVideo = cv2.VideoWriter(name + ".avi", fourcc, 29.76, (width, height))
341 | success = True
342 | capVideo.set(cv2.CAP_PROP_POS_FRAMES, frameStart)
343 | while success and capVideo.get(cv2.CAP_PROP_POS_FRAMES) <= frameEnd:
344 | success, image = capVideo.read()
345 | if self.size != "original":
346 | image = cv2.resize(image, self.size, interpolation=cv2.INTER_LANCZOS4)
347 | intervalVideo.write(image)
348 | intervalVideo.release()
349 |
350 | # Function to create a sub video called @name.avi from @frameStart to @frameEnd of stream video @streamVideoPath from DEPTH channel.
351 | # Uses ffmpeg-python to properly cut the video
352 | # Cut must be made with time, not with frame number. That is why it is frameNumber/frameRate
353 | # saves video in @self.destinationPath
354 | # @streamVideoPath: is the depth video path, not video
355 | # @name of file with no extension
356 | def depthFrameIntervalToVideo(self, frameStart, frameEnd, streamVideoPath, name):
357 |
358 | if self.size!="original":
359 | vid = (
360 | ffmpeg.input(streamVideoPath)
361 | .trim(start=int(frameStart/29.76), end=int(frameEnd/29.76))
362 | .setpts('PTS-STARTPTS')
363 | .filter_('scale', w=self.size[0], h=self.size[1], sws_flags="neighbor")
364 | .output(name+".avi", vcodec='ffv1', pix_fmt='gray16le', crf=0)
365 | .run()
366 | )
367 | else:
368 | vid = (
369 | ffmpeg.input(streamVideoPath)
370 | .trim(start=int(frameStart/29.76), end=int(frameEnd/29.76))
371 | .setpts('PTS-STARTPTS')
372 | .output(name+".avi", vcodec='ffv1', pix_fmt='gray16le', crf=0)
373 | .run()
374 | )
375 |
376 | # Function to get images from @frameStart to @frameEnd of stream video @capVideo
377 | # saves in @self.destinationPath
378 | # @mosaicFrameStart is the initial frame number of the mosaic, this is to name the image with the frame number of the mosaic instead of the individual video and help synchronization afterwards
379 | # @capVideo: is video loaded in opencv, not path
380 | # @name of images with no extension
381 | def frameIntervalToImages(self,frameStart, frameEnd, mosaicFrameStart, capVideo, name):
382 | frameCount = mosaicFrameStart
383 | success = True
384 | capVideo.set(cv2.CAP_PROP_POS_FRAMES, frameStart)
385 | while success and capVideo.get(cv2.CAP_PROP_POS_FRAMES) <= frameEnd:
386 | success, image = capVideo.read()
387 | if not success:
388 | break
389 | if self.size != "original":
390 | image = cv2.resize(image, self.size, interpolation=cv2.INTER_LANCZOS4)
391 |
392 | cv2.imwrite(name+"_"+str(frameCount)+".jpg", image)
393 | frameCount += 1
394 |
395 | # Function to get images from @frameStart to @frameEnd of stream DEPTH info array @depthVideoArray
396 | # saves in @self.destinationPath
397 | # @mosaicFrameStart is the initial frame number of the mosaic, this is to name the image with the frame number of the mosaic
398 | # instead of the individual video and help synchronization afterwards
399 | # @name of images with no extension
400 | def depthFrameIntervalToImages(self, frameStart, frameEnd, mosaicFrameStart, depthVideoArray, capVideo, name):
401 | frameCount = mosaicFrameStart
402 | for i in range(frameStart,frameEnd+1):
403 | if i < self.frameNum:
404 | cv2.imwrite(name+"_"+str(frameCount)+".tif",depthVideoArray[i])
405 | else:
406 | #write a black image
407 | if self.size != "original":
408 | cv2.imwrite(name+"_"+str(frameCount)+".tif",np.zeros((self.size[1],self.size[0]),dtype=np.uint16))
409 | cv2.imwrite(name+"_"+str(frameCount)+".tif",np.zeros((int(capVideo.get(cv2.CAP_PROP_FRAME_HEIGHT)),
410 | int(capVideo.get(cv2.CAP_PROP_FRAME_WIDTH))),dtype=np.uint16))
411 | frameCount +=1
412 |
413 | # Function to get depth information of all frames from the depth video in a uint16 array. Array shape should be [self.frameNum, height, width]
414 | # Uses ffmpeg-python to properly extract info from video in gray16le pixelformat
415 | def getDepthVideoArray(self, streamVideoPath, capVideo):
416 | if self.size!="original":
417 | out, _ = (
418 | ffmpeg
419 | .input(streamVideoPath)
420 | .filter_('scale', width=self.size[0], height=self.size[1], sws_flags="neighbor")
421 | .output('pipe:', format='rawvideo', pix_fmt='gray16le')
422 | .run(capture_stdout=True)
423 | )
424 | array_imgs = (
425 | np
426 | .frombuffer(out, np.uint16)
427 | .reshape([-1, self.size[1], self.size[0]])
428 | )
429 | else:
430 | out, _ = (
431 | ffmpeg
432 | .input(streamVideoPath)
433 | .output('pipe:', format='rawvideo', pix_fmt='gray16le')
434 | .run(capture_stdout=True)
435 | )
436 | array_imgs = (
437 | np
438 | .frombuffer(out, np.uint16)
439 | .reshape([-1, int(capVideo.get(cv2.CAP_PROP_FRAME_HEIGHT)), int(capVideo.get(cv2.CAP_PROP_FRAME_WIDTH))])
440 | )
441 | return array_imgs
442 |
443 | # Function to get uri of the @videoStream video from OpenLABEL and check if video frame count matches with OpenLABEL.
444 | # Returns @videoPath: path of @videoStream in OpenLABEL
445 | def getStreamVideo(self,videoChannel,videoStream):
446 | # load Uri and frame count
447 | # uri e.g.: gA/1/s1/gA_1_s1_2019-03-08T09;31;15+01;00_rgb_face.mp4
448 | if self.datasetDMD:
449 | if videoStream == "face":
450 | videoPath = self.vcd_handler.get_videos_uris()[0]
451 | self.frameNum = self.vcd_handler.get_frame_numbers()[0]
452 | elif videoStream == "body":
453 | videoPath = self.vcd_handler.get_videos_uris()[1]
454 | self.frameNum = self.vcd_handler.get_frame_numbers()[1]
455 | elif videoStream == "hands":
456 | videoPath = self.vcd_handler.get_videos_uris()[2]
457 | self.frameNum = self.vcd_handler.get_frame_numbers()[2]
458 | else:
459 | raise RuntimeWarning(
460 | videoStream, ": Not a valid video stream. Must be: 'face', 'body' or 'hands'.")
461 | # change to desired channel video path
462 | videoPath = videoPath.replace("rgb", videoChannel)
463 | if videoChannel == "depth":
464 | videoPath = videoPath.replace("mp4", "avi")
465 |
466 | videoPath = Path(self.rootDmd + videoPath)
467 |
468 | if not videoPath.exists():
469 | videoPath = self.vcdFile
470 | videoPath = videoPath.replace("ann_gaze.json",videoStream+".mp4")
471 | videoPath = videoPath.replace("ann_distraction.json",videoStream+".mp4")
472 | videoPath = videoPath.replace("ann_drowsiness.json",videoStream+".mp4")
473 | # change to desired channel video path
474 | videoPath = videoPath.replace("rgb", videoChannel)
475 |
476 | if videoChannel == "depth":
477 | videoPath = videoPath.replace("mp4", "avi")
478 | print("URI inside OpenLABEL not found. Possible path is considered:",videoPath)
479 | videoPath = Path(videoPath)
480 |
481 | else:
482 | if videoStream =="general":
483 | videoPath = self.vcd_handler.get_videos_uri()
484 | self.frameNum = self.vcd_handler.get_frames_number()
485 | videoPath = Path(videoPath)
486 | else:
487 | raise RuntimeWarning(videoStream,": Not a valid video stream. Must be 'general'")
488 |
489 | if not videoPath.exists():
490 | videoPath = self.vcdFile
491 | videoPath = videoPath.split("_ann")[0]
492 | videoPath = videoPath+".mp4"
493 | print("URI inside OpenLABEL not found. Possible path is considered:",videoPath)
494 | videoPath = Path(videoPath)
495 | if not videoPath.exists():
496 | videoPath = str(videoPath).replace(".mp4",".avi")
497 | print("URI inside OpenLABEL not found. Possible path is considered:",videoPath)
498 |
499 | videoPath = Path(videoPath)
500 |
501 | # Check video frame count and OpenLABEL's frame count
502 | if videoPath.exists():
503 | cap = cv2.VideoCapture(str(videoPath))
504 | length = int(cap.get(cv2.CAP_PROP_FRAME_COUNT))
505 | if length != self.frameNum:
506 | #Some depth videos are missing 1 frame
507 | if videoChannel == "depth":
508 | self.frameNum = self.frameNum - 1
509 | else:
510 | raise RuntimeWarning(
511 | "OpenLABEL's and real video frame count don't match. OpenLABEL: %s video: %s",(self.frameNum,length))
512 | else:
513 | print(videoChannel, videoStream, "stream loaded:", videoPath.name)
514 | else:
515 | raise RuntimeError(
516 | videoPath, "video not found. Video Uri in OpenLABEL is wrong or video does not exist")
517 |
518 | return videoPath
519 |
520 | #Function to check if the mosaic-count frame is available in stream requested. Then calculate corresponding frame position in stream-count
521 | def checkFrameInStream(self, stream, frameStart, frameEnd):
522 |
523 | if not self.datasetDMD:
524 | return True, frameStart, frameEnd
525 |
526 | faceLen, bodyLen, handsLen = self.vcd_handler.get_frame_numbers()
527 |
528 | # Check the starting order
529 | if self.shift_bf >= 0 and self.shift_hf >= 0:
530 | # Face starts first
531 | if stream == "face":
532 | if frameEnd < faceLen:
533 | return True, frameStart, frameEnd
534 | elif stream == "body":
535 | if frameStart - self.shift_bf >= 0 and frameEnd - self.shift_bf <= bodyLen:
536 | return True, frameStart - self.shift_bf, frameEnd - self.shift_bf
537 | elif stream == "hands":
538 | if frameStart - self.shift_hf >= 0 and frameEnd - self.shift_hf <= handsLen:
539 | return True, frameStart - self.shift_hf, frameEnd - self.shift_hf
540 |
541 | elif self.shift_bf <= 0 and self.shift_hb >= 0:
542 | # Body starts first
543 | #shifts = [-self.shift_bf, self.shift_hb]
544 |
545 | if stream == "body":
546 | if frameEnd < bodyLen:
547 | return True, frameStart, frameEnd
548 | elif stream == "face":
549 | if frameStart + self.shift_bf >= 0 and frameEnd + self.shift_bf <= faceLen:
550 | return True, frameStart + self.shift_bf, frameEnd + self.shift_bf
551 | elif stream == "hands":
552 | if frameStart - self.shift_hb >= 0 and frameEnd - self.shift_hb <= handsLen:
553 | return True, frameStart - self.shift_hb, frameEnd - self.shift_hb
554 |
555 | elif self.shift_hb <= 0 and self.shift_hf <= 0:
556 | # Hands starts first
557 | #shifts = [-self.shift_hf, -self.shift_hb]
558 | if stream == "hands":
559 | if frameEnd < handsLen:
560 | return True, frameStart, frameEnd
561 | elif stream == "body":
562 | if frameStart + self.shift_hb >= 0 and frameEnd + self.shift_hb <= bodyLen:
563 | return True, frameStart + self.shift_hb, frameEnd + self.shift_hb
564 | elif stream == "face":
565 | if frameStart + self.shift_hf >= 0 and frameEnd + self.shift_hf <= faceLen:
566 | return True, frameStart + self.shift_hf, frameEnd + self.shift_hf
567 | else:
568 | raise RuntimeError("Error: Unknown order")
569 |
570 | return False, 0, 0
571 |
572 |
573 | def is_string_int(self,s):
574 | try:
575 | int(s)
576 | return True
577 | except ValueError:
578 | return False
579 |
580 |
581 | # Function to transform int keys to integer if possible
582 | def keys_to_int(self,x):
583 | return {int(k) if self.is_string_int(k) else k: v for k, v in x}
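584 | 
585 |     # Note: keys_to_int iterates over (key, value) pairs, so it can be used as an
586 |     # object_pairs_hook when parsing OpenLABEL JSON. Hypothetical sketch (assumes the
587 |     # standard json module is imported; names are illustrative only):
588 |     #   with open(self.vcdFile) as f:
589 |     #       data = json.load(f, object_pairs_hook=self.keys_to_int)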
--------------------------------------------------------------------------------
/exploreMaterial-tool/config_DEx.json:
--------------------------------------------------------------------------------
1 | {
2 | "material": ["videos"],
3 | "streams" : ["body"],
4 | "channels" : ["rgb"],
5 | "annotations" : ["driver_actions/safe_drive", "driver_actions/texting_right", "driver_actions/phonecall_right", "driver_actions/texting_left","driver_actions/phonecall_left","driver_actions/reach_side","driver_actions/radio","driver_actions/drinking"],
6 | "write" : true,
7 | "size" : [224, 224],
8 | "intervalChunk" : 50,
9 | "ignoreSmall" : false,
10 | "asc" : true
11 | }
--------------------------------------------------------------------------------
/exploreMaterial-tool/group_split_material.py:
--------------------------------------------------------------------------------
1 | import random
2 | import shutil
3 | import os
4 | import glob
5 | import sys
6 | from pathlib import Path
7 | # Written by Paola Cañas with <3
8 |
9 | # groupClass(): group dataset by classes ("radio", "drinking"...)
10 | # The dataset folder must be organized by sessions (s1, s2, s3...)
11 |
12 | class groupClass():
13 |
14 | def __init__(self,materialPath):
15 |
16 | self.materialPath = materialPath
17 | #e.g /mymaterialpath/dmd_rgb/
18 | if not Path(self.materialPath).exists():
19 | raise RuntimeError("Material path does not exist")
20 |
21 | #list all sessions folders
22 | session_paths = glob.glob(self.materialPath + '/*')
23 | session_paths.sort()
24 |
25 | for session in session_paths:
26 | # e.g /mymaterialpath/dmd_rgb/s1/
27 |             # For each session folder, list all class folders
28 | class_paths = glob.glob(session + '/*')
29 | class_paths.sort()
30 |
31 | for classF in class_paths:
32 | #e.g /mymaterialpath/dmd_rgb/s1/driver_actions
33 | # or /mymaterialpath/dmd_rgb/s1/safe_drive
34 | subClass = glob.glob(classF + '/*')
35 | subClass.sort()
36 |                 has_subfolders = Path(subClass[0]).is_dir()
37 |                 print("has_subfolders:", has_subfolders)
38 |                 # If there's one more level of folders
39 |                 if has_subfolders:
40 | for subClassF in subClass:
41 | #e.g /mymaterialpath/dmd_rgb/s1/driver_actions/safe_drive
42 | class_name = Path(classF).name
43 | name = Path(subClassF).name
44 | dest = Path(self.materialPath+"/"+class_name+"/"+name)
45 | #e.g /mymaterialpath/dmd_rgb/driver_actions/safe_drive
46 | os.makedirs(str(dest), exist_ok=True)
47 | print("Moving",name, "to", dest)
48 | shutil.copytree(subClassF, str(dest),dirs_exist_ok=True)
49 |
50 | else:
51 | #For each class folder, get the name and make a folder in destination
52 | name = Path(classF).name
53 | dest = Path(self.materialPath+"/"+name)
54 | #e.g /mymaterialpath/dmd_rgb/safe_drive
55 | os.makedirs(str(dest), exist_ok=True)
56 | print("Moving",name)
57 | shutil.copytree(classF, str(dest),dirs_exist_ok=True)
58 |
59 | #Delete session folder
60 | shutil.rmtree(session)
61 |
62 |
63 |
64 | # splitClass(): split dataset into train and test splits.
65 | # The dataset must be organized by classes: one folder per class ("radio", "drinking"...)
66 | class splitClass():
67 |
68 | def __init__(self,materialPath,destination,testPercent):
69 |
70 |         # @self.materialPath: Path of the dmd dataset (inside must be the class folders)
71 |         # @self.destination: Path where the dataset will be split into train and test folders
72 |         # @self.testPercent: Portion of the material desired for the test split (e.g. 0.20)
73 | self.materialPath = materialPath
74 | self.destination = destination
75 | self.testPercent = float(testPercent)
76 | if not Path(self.materialPath).exists():
77 | raise RuntimeError("Material path does not exist")
78 |
79 | if self.testPercent>1.0 or self.testPercent<=0:
80 |             raise RuntimeError("Invalid percent for test split. Must be a number greater than 0.0 and at most 1.0")
81 |
82 | #Create train and test folders in destination
83 | os.makedirs( self.destination + "/train", exist_ok=True)
84 | os.makedirs( self.destination + "/test", exist_ok=True)
85 |
86 |         # List all class folders
87 | label_paths = glob.glob(self.materialPath + '/*')
88 | label_paths.sort()
89 | print("folders: ",label_paths)
90 | for count,cl in enumerate(label_paths):
91 | #For each class folder, list all files
92 | files = glob.glob(str(cl) + '/*')
93 | #split file list in two by @testPercent
94 | train, test = self.partitionFiles(files)
95 | print("Moving ", len(files), " files: ",len(train)," for training and ",len(test)," for testing.")
96 |
97 | #Create class folder in train and test folders
98 | os.makedirs( self.destination + "/train/"+ str(count)+"/", exist_ok=True)
99 | os.makedirs(self.destination + "/test/" + str(count) + "/", exist_ok=True)
100 |
101 |             # Move files from each partition to their corresponding folder
102 | for f in train:
103 | shutil.move(f, self.destination + "/train/" + str(count)+"/")
104 | for f in test:
105 | shutil.move(f, self.destination + "/test/" + str(count)+"/")
106 |
107 | def partitionFiles(self,files_list):
108 | #Calculate number of files for test partition
109 | howManyNumbers = int(round(self.testPercent * len(files_list)))
110 | shuffled = files_list[:]
111 | random.seed(123)
112 | #shuffle list of files
113 | random.shuffle(shuffled)
114 | #return partitions
115 | return shuffled[howManyNumbers:], shuffled[:howManyNumbers]
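116 | 
117 | # Minimal usage sketch (the paths below are placeholders, not real dataset paths):
118 | # groupClass("/mymaterialpath/dmd_rgb")
119 | # splitClass("/mymaterialpath/dmd_rgb", "/mymaterialpath/dmd_splits", 0.20)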
--------------------------------------------------------------------------------
/exploreMaterial-tool/statistics.py:
--------------------------------------------------------------------------------
1 | # -*- coding: utf-8 -*-
2 | import os
3 | from pathlib import Path # To handle paths independent of OS
4 |
5 | # Import local class to parse OpenLABEL content
6 | from vcd4reader import VcdHandler
7 |
8 | #Written by Paola Cañas with <3
9 |
10 | #Script to get statistics of the data (# of frames per class and total # of frames)
11 | class get_statistics():
12 |
13 | def __init__(self, vcdFile, destinationFile):
14 |
15 | self.vcdFile = vcdFile
16 | self.vcd_handler = VcdHandler(vcd_file=Path(self.vcdFile))
17 |
18 | self.actionPath = destinationFile.replace(".txt","-actions.txt")
19 | self.framesPath = destinationFile.replace(".txt","-frames.txt")
20 |
21 | # @self.actionList: ["driver_actions/safe_drive", "gaze_on_road/looking_road",.. , ..]
22 | self.actionList = self.vcd_handler.get_action_type_list()
23 | #Get object list
24 | self.objectList = self.vcd_handler.get_object_type_list()
25 |         # Add the objects to actionList as well
26 |         for object in self.objectList:
27 |             # Skip the "driver" object
28 |             if "driver" in object:
29 |                 # Don't count the driver object
30 |                 continue
31 | self.actionList.append("objects_in_scene/"+object)
32 |
33 | self.countActions()
34 | self.countFrames()
35 |
36 | def countActions(self):
37 | string_txt = []
38 |
39 | if os.path.exists(self.actionPath):
40 | with open(self.actionPath, "r") as f:
41 | lines = f.readlines()
42 | for line in lines:
43 | string_txt.append(line.split(":"))
44 |
45 | #Delete to avoid redundancy
46 | os.remove(self.actionPath)
47 |
48 | for annotation in self.actionList:
49 | sum = 0
50 | # Check if annotation is an object or an action
51 | if "object" in annotation:
52 | # get object intervals from OpenLABEL
53 | fullIntervals = self.vcd_handler.get_frames_intervals_of_object(annotation)
54 | else:
55 | # get action intervals from OpenLABEL
56 | fullIntervals = self.vcd_handler.get_frames_intervals_of_action(annotation)
57 |
58 |             # Sum all frames in the intervals (frame intervals are inclusive, hence the +1)
59 |             for interval in fullIntervals:
60 |                 sum = sum + int(interval["frame_end"]) - int(interval["frame_start"]) + 1
61 |
62 | found = False
63 | #replace the sum for the new one if annotation found in txt
64 | for num, line in enumerate(string_txt):
65 | if annotation in line[0]:
66 | found = True
67 | string_txt[num][1] = str(sum+int(string_txt[num][1]))+"\n"
68 |
69 | #if not found, add the sum
70 | if not found:
71 | string_txt.append([annotation,str(sum)+"\n"])
72 |
73 | #write
74 | file = open(self.actionPath, "a+")
75 | for line in string_txt:
76 | file.write(line[0]+":"+line[1])
77 | file.close()
78 |
79 | def countFrames(self):
80 | sum = 0
81 | if os.path.exists(self.framesPath):
82 | with open(self.framesPath, "r") as f:
83 | lines = f.read()
84 | sum = int(lines.split(":")[1])
85 |
86 | #Delete to avoid redundancy
87 | os.remove(self.framesPath)
88 |
89 | sum = sum + self.vcd_handler.get_frames_number()
90 |
91 | #write
92 | file = open(self.framesPath, "a+")
93 | file.write("total_frames"+":"+str(sum))
94 | file.close()
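95 | 
96 | # Minimal usage sketch (file names below are placeholders):
97 | # get_statistics("gA_1_s1_<timestamp>_rgb_ann_distraction.json", "stats.txt")
98 | # This creates or updates "stats-actions.txt" and "stats-frames.txt".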
--------------------------------------------------------------------------------
/exploreMaterial-tool/vcd4reader.py:
--------------------------------------------------------------------------------
1 | import warnings
2 | from pathlib import Path
3 | import json
4 | import numpy as np
5 | import vcd.core as core
6 | import vcd.types as types
7 | import time
8 |
9 |
10 |
11 | # TODO: get actions and objects per frame
12 |
13 | # TODO: function to get frame intervals per action or object presence
14 | # TODO: delete unnecessary code
15 |
16 | # dict for changes in structures
17 | # data manipulation is only for external structures
18 | dmd_struct = {
19 | "groups": {
20 | "grupo1A": "gA",
21 | "grupo2A": "gB",
22 | "grupo2M": "gC",
23 | "grupo3B": "gD",
24 | "grupoE": "gE",
25 | "grupo4B": "gF",
26 | "grupoZ": "gZ",
27 | },
28 | "sessions": {
29 | "attm": "s1",
30 | "atts": "s2",
31 | "reach": "s3",
32 | "attc": "s4",
33 | "gaze": "s5",
34 | "gazec": "s6",
35 | "drow": "s7",
36 | "attm2": "s1",
37 | "atts2": "s2",
38 | "reach2": "s3",
39 | "attc2": "s4",
40 | "gaze2": "s5",
41 | "gazec2": "s6",
42 | "drow2": "s7",
43 | },
44 | }
45 |
46 | # Type of annotation
47 | annotate_dict = {0: "unchanged", 1: "manual", 2: "interval"}
48 |
49 |
50 | def is_string_int(s):
51 | try:
52 | int(s)
53 | return True
54 | except ValueError:
55 | return False
56 |
57 |
58 | def keys_exists(element, *keys):
59 | """
60 | Check if *keys (nested) exists in `element` (dict).
61 | """
62 | if not isinstance(element, dict):
63 | raise AttributeError("keys_exists() expects dict as first argument.")
64 | if len(keys) == 0:
65 | raise AttributeError(
66 | "keys_exists() expects at least two arguments, one given.")
67 |
68 | _element = element
69 | for key in keys:
70 | try:
71 | _element = _element[key]
72 | except KeyError:
73 | return False
74 | return True
75 |
76 |
77 | class VcdHandler():
78 |
79 | def __init__(self, vcd_file: Path):
80 |
81 | # vcd variables
82 | self._vcd = None
83 | self._vcd_file = str(vcd_file)
84 | self.__vcd_loaded = False
85 |
86 | # If vcd_file exists then load data into vcd object
87 | if vcd_file.exists():
88 |
89 | # -- Load OpenLABEL from file --
90 |
91 | # Create a VCD instance and load file
92 | # OpenLABEL json is in self._vcd.data
93 | self._vcd = core.VCD()
94 | self._vcd.load_from_file(file_name=self._vcd_file)
95 |
96 | #Number of frames in video
97 | self.__full_mosaic_frames= int(self._vcd.get_frame_intervals().get_dict()[0]["frame_end"]) + 1
98 |
99 | #Number of actions in OpenLABEL
100 | self.__num_actions = self._vcd.get_num_actions()
101 |
102 | #Number of objects in OpenLABEL, including the driver
103 | self.__num_objects = self._vcd.get_num_objects()
104 |
105 | print("There are %s actions in this OpenLABEL" % (self.__num_actions+self.__num_objects-1)) #minus 1 for "driver" object
106 |
107 | self.__vcd_loaded = True
108 |
109 | else:
110 | raise RuntimeError("OpenLABEL file not found.")
111 |
112 |
113 |
114 | #function to get intervals from specific action, providing its name or its uid
115 | def get_frames_intervals_of_action(self, uid):
116 | if isinstance(uid, str):
117 | uid = self.is_action_type_get_uid(uid)[1]
118 | if uid >=0:
119 | intervals = self._vcd.get_action(str(uid))["frame_intervals"]
120 | return intervals
121 | else:
122 | raise RuntimeError("WARNING: OpenLABEL does not have action with uid",uid)
123 |
124 | #function to get intervals from specific object, providing its name or its uid
125 | def get_frames_intervals_of_object(self, uid):
126 | if isinstance(uid, str):
127 | uid = self.is_object_type_get_uid(uid)[1]
128 | if uid >=0:
129 | intervals = self._vcd.get_object(str(uid))["frame_intervals"]
130 | return intervals
131 | else:
132 | raise RuntimeError("WARNING: OpenLABEL does not have an object with uid",uid)
133 |
134 |
135 |     # Function to check whether a given action name (label) is an action type name. Useful because type names are composed as level_name/label_name
136 |     # Also returns the uid of the action (e.g. "only_left" will return 8)
137 | def is_action_type_get_uid(self, action_string):
138 | for uid, action_type in enumerate(self.get_action_type_list()):
139 | if action_string == action_type.split("/")[1] or action_string == action_type:
140 | return True, uid
141 | return False, -1
142 |
143 |     # Function to check whether a given object type name (label) exists in the OpenLABEL.
144 | #Also returns uid of object (e.g "driver" will return 0)
145 | def is_object_type_get_uid(self, object_string):
146 | for uid, object_type in enumerate(self.get_object_type_list()):
147 | #If class name comes with type/classname
148 | if len(object_string.split("/"))>1:
149 | if object_string.split("/")[1] == object_type:
150 | return True, uid
151 | else:
152 | if object_string == object_type:
153 | return True, uid
154 | return False, -1
155 |
156 |     # Function to go through the OpenLABEL and get the "type" val of all available objects
157 | def get_object_type_list(self):
158 | object_type_list = []
159 | if self._vcd_file:
160 | for uid in range(self.__num_objects):
161 | object_type_list.append(self._vcd.get_object(str(uid)).get('type'))
162 | return object_type_list
163 |
164 |     # Function to go through the OpenLABEL and get the "type" val of all available actions
165 | def get_action_type_list(self):
166 | action_type_list = []
167 | if self._vcd_file:
168 | for uid in range(self.__num_actions):
169 | action_type_list.append(self._vcd.get_action(str(uid)).get('type'))
170 | return action_type_list
171 |
172 |     # Return flag that indicates whether the OpenLABEL was loaded from file
173 | def fileLoaded(self):
174 | return self.__vcd_loaded
175 |
176 |
177 |
178 |     # This function reads the general camera video uri from the OpenLABEL
179 | def get_videos_uri(self):
180 |
181 | streams_data = self._vcd.get_streams()
182 | general = str(streams_data["general_camera"]["uri"])
183 |
184 | return general
185 |
186 | def get_frames_number(self):
187 | return int(self._vcd.get_frame_intervals().get_dict()[0]["frame_end"]) + 1
188 |
189 | """def get_frames_with_action_data_name(self, uid, data_name):
190 | frames = []
191 | if uid in self.data['vcd']['actions'] and uid in self.__object_data_names:
192 | object_ = self.data['vcd']['actions'][uid]
193 | if data_name in self.__object_data_names[uid]:
194 | # Now look into Frames
195 | fis = object_['frame_intervals']
196 | for fi in fis:
197 | fi_tuple = (fi['frame_start'], fi['frame_end'])
198 | for frame_num in range(fi_tuple[0], fi_tuple[1]+1):
199 | if self.has_frame_object_data_name(frame_num, data_name, uid):
200 | frames.append(frame_num)
201 | return frames
202 |
203 | def get_frames_with_action(self, action_uid):
204 | frames = []
205 | if uid_action in self.data['vcd']['actions']: #and uid in self.__object_data_names:
206 | action_ = self.data['vcd']['actions'][uid]
207 | if data_name in self.__object_data_names[uid]:
208 | # Now look into Frames
209 | fis = action_['frame_intervals']
210 | for fi in fis:
211 | fi_tuple = (fi['frame_start'], fi['frame_end'])
212 | for frame_num in range(fi_tuple[0], fi_tuple[1]+1):
213 | if self.has_frame_object_data_name(frame_num, data_name, uid):
214 | frames.append(frame_num)
215 | return frames
216 |
217 | def has_frame_action_data_name(self, frame_num, data_name, uid_=-1):
218 | if frame_num in self.data['vcd']['frames']:
219 | for uid, obj in self.data['vcd']['frames'][frame_num]['actions'].items():
220 | if uid_ == -1 or uid == uid_: # if uid == -1 means we want to loop over all objects
221 | for valArray in obj['action_data'].values():
222 | for val in valArray:
223 | if val['name'] == data_name:
224 | return True
225 | return False"""
226 |
227 | class VcdDMDHandler(VcdHandler):
228 | def __init__(self, vcd_file: Path):
229 | super().__init__(vcd_file)
230 |
231 | # Internal Variables initialization
232 | self.__uid_driver = None
233 | self.ont_uid = 0
234 |
235 | self.__group = None
236 | self.__subject = None
237 | self.__session = None
238 | self.__date = None
239 |
240 | self.__bf_shift = None
241 | self.__hb_shift = None
242 | self.__hf_shift = None
243 |
244 | self.__face_frames = None
245 | self.__body_frames = None
246 | self.__hands_frames = None
247 |
248 | self.__face_uri = None
249 | self.__body_uri = None
250 | self.__hands_uri = None
251 |
252 | # Check required essential fields inside to be considered loaded
253 | #vcd_metadata = self._vcd.data["vcd"]
254 | vcd_streams = self._vcd.get_streams()
255 | body_sh_exist = keys_exists(vcd_streams,"body_camera","stream_properties","sync","frame_shift")
256 | hands_sh_exist = keys_exists(vcd_streams,"hands_camera","stream_properties","sync","frame_shift")
257 |
258 | # If shifts fields exist then consider the OpenLABEL loaded was valid
259 | if body_sh_exist and hands_sh_exist:
260 | self.__vcd_loaded = True
261 | else:
262 | raise RuntimeError(
263 |                 "OpenLABEL doesn't have all necessary information. Not valid."
264 | )
265 |
266 | # -- Get video info --
267 |
268 | # Get video basic metadata
269 | self.__group, self.__subject, self.__session, self.__date = self.get_basic_metadata()
270 |
271 | # Get stream shifts
272 | self.__bf_shift, self.__hf_shift, self.__hb_shift = self.get_shifts()
273 |
274 | #Get video uri's
275 | self.__face_uri, self.__body_uri, self.__hands_uri = self.get_videos_uris()
276 |
277 | # Get frame numbers
278 | self.__face_frames, self.__body_frames, self.__hands_frames = self.get_frame_numbers()
279 |
280 | def get_basic_metadata(self):
281 | if self._vcd_file:
282 |
283 | if dict(self._vcd.get_metadata())["name"]:
284 | # e.g: gA_1_s1_2019-03-08T09;31;15+01;00
285 | name = str(dict(self._vcd.get_metadata())["name"]).split("_")
286 | group = name[0]
287 | subject = name[1]
288 | session = name[2]
289 |             if self._vcd.get_context_data(0, "recordTime") is not None:
290 | record_time =self._vcd.get_context_data(0, "recordTime")["val"]
291 | record_time = record_time.replace(";", ":")
292 | date = record_time.split("T")
293 | # Just get day and hour from the full timestamp
294 | date = date[0]+"-"+date[1].split("+")[0]
295 | else:
296 | #current date
297 | named_tuple = time.localtime() # get struct_time
298 | date = time.strftime("%Y-%m-%d-%H;%M;%S", named_tuple)
299 | return group, subject, session, date
300 | else:
301 | raise RuntimeError("WARNING: OpenLABEL does not have a name")
302 | else:
303 | return self.__group, self.__subject, self.__session, self.__date
304 |
305 |
306 |     # This function gets the stream shifts directly from a valid and
307 |     # loaded OpenLABEL file
308 | # Returns:
309 | # @body_face_shift
310 | # @hands_face_shift
311 | # @hands_body_shift
312 | def get_shifts(self):
313 | if self.__vcd_loaded:
314 | stream = self._vcd.get_stream("body_camera")
315 | body_face_sh = stream['stream_properties']['sync']['frame_shift']
316 |
317 | stream = self._vcd.get_stream("hands_camera")
318 | hands_face_sh = stream['stream_properties']['sync']['frame_shift']
319 |
320 | hands_body_sh = hands_face_sh - body_face_sh
321 | else:
322 | body_face_sh = self.__bf_shift
323 | hands_face_sh = self.__hf_shift
324 | hands_body_sh = self.__hb_shift
325 | return body_face_sh, hands_face_sh, hands_body_sh
326 |
327 | # This function reads each stream video uri from the OpenLABEL
328 | def get_videos_uris(self):
329 | if self.__vcd_loaded:
330 | stream = self._vcd.get_stream("face_camera")
331 | face = str(stream["uri"])
332 | stream = self._vcd.get_stream("body_camera")
333 | body = str(stream["uri"])
334 | stream = self._vcd.get_stream("hands_camera")
335 | hands = str(stream["uri"])
336 | else:
337 | face = self.__face_uri
338 | body = self.__body_uri
339 | hands = self.__hands_uri
340 | return face, body, hands
341 |
342 |     # This function reads the number of frames of each stream video from the OpenLABEL
343 | def get_frame_numbers(self):
344 | if self.__vcd_loaded:
345 | stream = self._vcd.get_stream("face_camera")
346 | face = int(stream["stream_properties"]["total_frames"])
347 | stream = self._vcd.get_stream("body_camera")
348 | body = int(stream["stream_properties"]["total_frames"])
349 | stream = self._vcd.get_stream("hands_camera")
350 | hands = int(stream["stream_properties"]["total_frames"])
351 | else:
352 | face = self.__face_frames
353 | body = self.__body_frames
354 | hands = self.__hands_frames
355 | return face, body, hands
356 |
357 | def get_intrinsics(self):
358 | stream = self._vcd.get_stream("face_camera")
359 | face = stream['stream_properties']['intrinsics_pinhole'][
360 | 'camera_matrix_3x4']
361 | stream = self._vcd.get_stream("body_camera")
362 | body = stream['stream_properties']['intrinsics_pinhole'][
363 | 'camera_matrix_3x4']
364 | stream = self._vcd.get_stream("hands_camera")
365 | hands = stream['stream_properties']['intrinsics_pinhole'][
366 | 'camera_matrix_3x4']
367 | return face, body, hands
368 |
369 |
370 | def isNumberOfFrames(self):
371 | exist = True
372 | face, body, hands = self.get_frame_numbers()
373 | if face == 0 or hands == 0 or body == 0:
374 | exist = False
375 | return exist
376 |
377 |
378 |     # This function checks if the OpenLABEL has the fields for static annotations
379 |     # and the registered frame counts are not 0. If true, static
380 |     # annotations exist
381 | def isStaticAnnotation(self, staticDict, obj_id):
382 | exist = True
383 | vcd_object = self._vcd.get_object(obj_id)
384 | for att in staticDict:
385 | att_exist = keys_exists(
386 | vcd_object, "object_data", str(att["type"]))
387 | if not att_exist:
388 | exist = False
389 | break
390 | frames = self.isNumberOfFrames()
391 | if not (frames and exist):
392 | exist = False
393 | return exist
394 |
395 |     # This function gets different values from the OpenLABEL to keep consistency when
396 |     # the user saves/creates a new OpenLABEL
397 |     # @staticDict: dict of static annotations whose values are taken from the OpenLABEL
398 | # @ctx_id: id of the context (in this case 0)
399 | def getStaticVector(self, staticDict, ctx_id):
400 | for x in range(5):
401 | att = staticDict[x]
402 | # Get each of the static annotations of the directory from the OpenLABEL
403 | object_vcd = dict(self._vcd.get_object_data(0, att["name"]))
404 | att.update({"val": object_vcd["val"]})
405 | # context
406 | context = dict(self._vcd.get_context(ctx_id))["context_data"]["text"]
407 | staticDict[5].update({"val": context[0]["val"]})
408 | staticDict[6].update({"val": context[1]["val"]})
409 | # record_time = context[2]["val"]
410 | # Annotator id
411 | meta_data = dict(self._vcd.get_metadata())
412 | annotator = meta_data["annotator"]
413 | staticDict[7].update({"val": annotator})
414 | # returns:
415 | # @staticDict: the dict with the values taken from the OpenLABEL
416 | return staticDict
417 |
418 |     # This function gets different values from the OpenLABEL to keep consistency when
419 |     # the user saves/creates a new OpenLABEL
420 |     # @ctx_id: id of the context (in this case 0)
421 | def getMetadataVector(self, ctx_id):
422 | # context
423 | record_time = 0
424 |         if self._vcd.get_context_data(ctx_id, "recordTime") is not None:
425 | record_time =self._vcd.get_context_data(ctx_id, "recordTime")["val"]
426 | # frames
427 | face,body,hands = self.get_frame_numbers()
428 | #intrinsics
429 | face_mat, body_mat, hands_mat = self.get_intrinsics()
430 | # returns:
431 | # @face_meta: [rgb_video_frames,mat]
432 | # @body_meta: [date_time,rgb_video_frames,mat]
433 |         # @hands_meta: [rgb_video_frames,mat]
434 | return [face, face_mat], [record_time, body, body_mat], [hands, hands_mat]
435 |
436 |
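437 | # Minimal usage sketch (the OpenLABEL file name is a placeholder):
438 | # handler = VcdDMDHandler(Path("gA_1_s1_<timestamp>_rgb_ann_distraction.json"))
439 | # face_uri, body_uri, hands_uri = handler.get_videos_uris()
440 | # safe_drive_intervals = handler.get_frames_intervals_of_action("driver_actions/safe_drive")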
--------------------------------------------------------------------------------