├── header.py
├── example.py
├── LICENSE
├── get_fixation.py
├── data
│   └── README.md
├── saldat_saliency.py
├── head_orientation_lib.py
├── README.md
├── saldat_eval.py
├── Quaternion.py
├── saldat_head_orientation.py
└── sal_eval.ipynb

/header.py:
--------------------------------------------------------------------------------
dirpath1 = u'./data/head-orientation/dataset1'
dirpath2 = u'./data/head-orientation/dataset2/Experiment_1'
dirpath3 = u'./data/head-orientation/dataset3/sensory/orientation'
ext1 = '.txt'
ext2 = '.csv'
ext3 = '.csv'
--------------------------------------------------------------------------------
/example.py:
--------------------------------------------------------------------------------
import pickle

if __name__ == "__main__":
    # A simple example that loads a saliency dataset file and prints the
    # values of the first record. Each record includes: timestamp,
    # fixation_list, and saliency_map.
    # Note: this script assumes the dataset has been downloaded from the
    # link provided in the ./data folder.

    # Load the dataset file named `saliency_ds1_topicparis` (ds=1, video=paris),
    # assuming the file is in the ./data folder. The file is a binary pickle,
    # so open it in 'rb' mode.
    data = pickle.load(open('./data/saliency_ds1_topicparis', 'rb'))

    # Access the first record.
    timestamp, fixation_list, saliency_map = data[0]

    # Print the values of the fields in the first record.
    print timestamp
    print fixation_list
    print saliency_map
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
MIT License

Copyright (c) 2019 Anh Phan Nguyen

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.
--------------------------------------------------------------------------------
/get_fixation.py:
--------------------------------------------------------------------------------
import numpy as np
import pickle
import sys

import header
import head_orientation_lib
import saldat_head_orientation
import saldat_saliency

if __name__ == "__main__":
    # Usage: python get_fixation.py <dataset index> <topic>
    # Specify the dataset index and video (topic) name to extract.
    # For the very long video '6', process the video in parts: shrink the
    # vlength loop range below and add a '_part' suffix to the output file name.
    TOPIC = sys.argv[2]
    DELTA = 0.06  # time step (seconds) between consecutive saliency maps

    dataset = int(sys.argv[1])
    topic = TOPIC
    # dataset 1: 'paris', 'roller', 'venise', 'diving', 'timelapse'
    # dataset 2: '0', '1', '2', '3', '4', '5', '6', '7', '8'
    # dataset 3: 'coaster2_', 'coaster_', 'diving2', 'drive', 'game',
    #            'landscape', 'pacman', 'panel', 'ride', 'sport'
    # The output saliency maps are stored under ./data (see pickle.dump below).

    # Initialize the head-orientation loader with the log paths from header.py.
    print ("generating saliency maps for ds={}, topic={}".format(dataset, TOPIC))
    dirpath1 = header.dirpath1  # u'./data/head-orientation/dataset1'
    dirpath2 = header.dirpath2  # u'./data/head-orientation/dataset2/Experiment_1'
    dirpath3 = header.dirpath3  # u'./data/head-orientation/dataset3/sensory/orientation'
    ext1 = header.ext1
    ext2 = header.ext2
    ext3 = header.ext3
    headoren = saldat_head_orientation.HeadOrientation(dirpath1, dirpath2, dirpath3, ext1, ext2, ext3)
    # Initialize the saliency generator with the Gaussian variance.
    var = 20
    salsal = saldat_saliency.Fixation(var)

    dirpath, filename_list, f_parse, f_extract_direction = headoren.load_filename_list(dataset, topic)
    series_ds = headoren.load_series_ds(filename_list, f_parse)
    vector_ds = headoren.headpos_to_headvec(series_ds, f_extract_direction)
    vector_ds = headoren.cutoff_vel_acc(vector_ds, dataset=dataset)

    _, vlength, _, _ = head_orientation_lib.topic_info_dict[topic]
    saliency_ds = []
    # For long videos, the range can be split, e.g.:
    #   for t in np.arange(vlength/2, vlength, DELTA):
    for t in np.arange(1, vlength, DELTA):
        try:
            fixation_list = headoren.get_fixation(vector_ds, t)
            v_list = [item[1] for item in fixation_list]

            print (t, len(fixation_list))
            fmap0 = headoren.create_fixation_map(fixation_list, dataset)  # fixation map; computed but not stored in the output
            heat_map0 = salsal.create_saliency(fixation_list, dataset)
            saliency_ds.append([t, v_list, heat_map0])

        except Exception:
            # Skip timestamps for which fixations cannot be extracted.
            continue
    pickle.dump(saliency_ds, open('./data/saliency_ds{}_topic{}'.format(dataset, topic), 'wb'))
--------------------------------------------------------------------------------
/data/README.md:
--------------------------------------------------------------------------------
Saliency maps can be directly accessed from this [link](https://zenodo.org/record/2641282#.XLYYGkMpDAg).


The following table describes the videos associated with each saliency map file. Note that each file is identified by the dataset index (ds) and the video name.
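
For example, the saliency maps of the Paris video from dataset 1 are stored in the file `saliency_ds1_topicparis`; in general, `get_fixation.py` writes each file following the pattern `saliency_ds<dataset index>_topic<topic>`, where `<topic>` is the lowercase topic key used in the code.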

| No | Video description | Video name | Dataset Index | From (seconds) | To (seconds) | Original file location (YouTube ID) |
|----|-------------------|------------|---------------|----------------|--------------|-------------------------------------|
| 1 | Paris | Paris | 1 | 1 | 244 | sJxiPiAaB4k |
| 2 | Roller Coaster | Roller | 1 | 66 | 134 | 8lsB-P8nGSM |
| 3 | Diving | Diving | 1 | 41 | 412 | 2OzlksZBTiA |
| 4 | Newyork | Timelapse | 1 | 1 | 91 | CIw8R8thnm8 |
| 5 | Venise | Venise | 1 | 1 | 175 | s-AJRFQuAtE |
| 6 | Conan 1 | Conan 1 | 2 | 1 | 164 | FiClYLgxJ5s |
| 7 | Skiing | Skiing | 2 | 1 | 201 | 0wC3x_bnnps |
| 8 | Alien | Alien | 2 | 1 | 293 | G-XZhKqQAHU |
| 9 | Conan 2 | Conan 2 | 2 | 1 | 172 | 39MfLCMXGj0 |
| 10 | Surfing | Surfing | 2 | 1 | 205 | MKWWhf8RAV8 |
| 11 | War | War | 2 | 1 | 655 | _Ar0UkmID6s |
| 12 | Cooking | Cooking | 2 | 1 | 451 | JpAdLz3iDPE |
| 13 | Football | Football | 2 | 1 | 164 | lvH89OkkKQ8 |
| 14 | Rhinos | Rhinos | 2 | 1 | 292 | AXG96ECE4hA |
| 15 | Roller Coaster | Coaster_ | 3 | 21 | 140 | 8lsB-P8nGSM |
| 16 | Mega Coaster | Coaster2_ | 3 | 91 | 150 | -xNN-bJQ4vI |
| 17 | Shark Shipwreck | Diving2 | 3 | 31 | 90 | aQd41nbQM-U |
| 18 | Driving with | Drive | 3 | 49 | 108 | LKWXHKFCMO8 |
| 19 | Game Hog Rider | Game | 3 | 1 | 60 | yVLfEHXQk08 |
| 20 | Kangaroo Island | Landscape | 3 | 2 | 61 | MXlHCTXtcNs |
| 21 | Pacman | Pacman | 3 | 11 | 70 | p9h3ZqJa1iA |
| 22 | Perils Panel | Panel | 3 | 11 | 70 | kiP5vWqPryY |
| 23 | Chariot Race | Ride | 3 | 3 | 62 | jMyDqZe0z7M |
| 24 | SFR Sport | Sport | 3 | 17 | 76 | lo5N90TlzwU |
--------------------------------------------------------------------------------
/saldat_saliency.py:
--------------------------------------------------------------------------------
from scipy import stats
import numpy as np
from Quaternion import Quat
from pyquaternion import Quaternion
import head_orientation_lib


class Fixation:
    _DATASET1 = 1
    _DATASET2 = 2
    _DATASET3 = 3

    _gaussian_dict = {}
    _vec_map = None

    def __init__(self, var):
        # Precompute the Gaussian weight of every angular distance in
        # [0, 180) degrees, quantized to steps of 0.1 degree.
        self._gaussian_dict = {np.around(_d, 1): stats.multivariate_normal.pdf(_d, mean=0, cov=var) for _d in np.arange(0.0, 180, .1)}
        self._vec_map = self.create_pixel_vecmap()

    def f_extract_direction(self, q):
        return head_orientation_lib.extract_direction_dataset1(q)

    def gaussian_from_distance(self, _d):
        temp = np.around(_d, 1)
        return self._gaussian_dict[temp] if temp in self._gaussian_dict else 0.0

    def create_pixel_vecmap(self):
        # Map every pixel of the H x W equirectangular grid to the unit
        # vector pointing at it on the viewing sphere.
        vec_map = np.zeros((head_orientation_lib.H, head_orientation_lib.W)).tolist()
        for i in range(head_orientation_lib.H):
            for j in range(head_orientation_lib.W):
                theta, phi = head_orientation_lib.pixel_to_ang(i, j, head_orientation_lib.H, head_orientation_lib.W)
                # Build a quaternion from the angles with Quat, then reorder
                # its components for pyquaternion's [w, x, y, z] convention.
                t = Quat([0.0, theta, phi]).q
                q = Quaternion([t[3], t[2], -t[1], t[0]])
                vec_map[i][j] = self.f_extract_direction(q)
        return vec_map

    def create_saliency(self, fixation_list, dataset):
        # Accumulate the Gaussian-weighted contribution of every fixation
        # vector at every pixel of the equirectangular map.
        heat_map = np.zeros((head_orientation_lib.H,
head_orientation_lib.W))
        for i in range(heat_map.shape[0]):
            for j in range(heat_map.shape[1]):
                qxy = self._vec_map[i][j]
                for fixation in fixation_list:
                    q0 = fixation[1]
                    d = head_orientation_lib.degree_distance(q0, qxy)
                    heat_map[i, j] += 1.0 * self.gaussian_from_distance(d)

        # Flip/shift the raw map so that it is aligned with the original
        # video frame; the required adjustment differs per dataset.
        if dataset == self._DATASET2:
            heat_map1 = np.fliplr(heat_map)
            heat_map1 = np.flipud(heat_map1)
            pos = int(head_orientation_lib.W/2)
            temp = np.copy(heat_map1[:, pos:])
            heat_map1[:, pos:] = heat_map1[:, :pos]
            heat_map1[:, :pos] = temp
        elif dataset == self._DATASET1:
            heat_map1 = np.flipud(heat_map)
            pos = int(head_orientation_lib.W/2)
            temp = np.copy(heat_map1[:, pos:])
            heat_map1[:, pos:] = heat_map1[:, :pos]
            heat_map1[:, :pos] = temp
        elif dataset == self._DATASET3:
            heat_map1 = np.fliplr(heat_map)
            heat_map1 = np.flipud(heat_map1)
            pos = int(head_orientation_lib.W/4)
            npos = int(head_orientation_lib.W/4*3)
            temp = np.copy(heat_map1[:, :npos])
            heat_map1[:, :pos] = heat_map1[:, npos:]
            heat_map1[:, pos:] = temp
        else:
            # Guard against an unknown dataset index, which would otherwise
            # leave heat_map1 undefined.
            raise ValueError("unknown dataset index: {}".format(dataset))

        return heat_map1
--------------------------------------------------------------------------------
/head_orientation_lib.py:
--------------------------------------------------------------------------------
import numpy as np
from pyquaternion import Quaternion

H = 90
W = 160
# Per topic: [video file name, video length (seconds), frame width, frame height].
# Dataset 3 entries carry placeholder values ('', 60.0, -1, -1).
topic_info_dict = {'paris': ['paris.mp4', 244.06047, 3840, 2048],
                   'timelapse': ['newyork.webm', 91.03333, 3840, 2048],
                   '3': ['conan2.mp4', 172.5724, 2560, 1440],
                   '1': ['skiing.mp4', 201.13426, 2560, 1440],
                   '0': ['conan1.mp4', 164.1973, 2560, 1440],
                   'venise': ['venise.webm', 175.04, 3840, 2160],
                   '2': ['alien.mp4', 293.2333, 2560, 1440],
                   '5': ['war.mp4', 655.0544, 2160, 1080],
                   '4': ['surfing.mp4', 205.7055, 2560, 1440],
                   '7': ['football.mp4', 164.8, 2560, 1440],
                   '6': ['cooking.mp4', 451.12, 2560, 1440],
                   'diving': ['ocean40.webm', 372.23853, 3840, 2048],
                   'roller': ['roller65.webm', 69.0, 3840, 2048],
                   '8': ['rhinos.mp4', 292.0584, 2560, 1440],
                   # Dataset 3 topics. The 'Diving2' video is keyed 'diving2':
                   # a second 'diving' key would silently overwrite the
                   # dataset-1 entry above.
                   'coaster2_': ['', 60.0, -1, -1],
                   'coaster_': ['', 60.0, -1, -1],
                   'diving2': ['', 60.0, -1, -1],
                   'drive': ['', 60.0, -1, -1],
                   'game': ['', 60.0, -1, -1],
                   'landscape': ['', 60.0, -1, -1],
                   'pacman': ['', 60.0, -1, -1],
                   'panel': ['', 60.0, -1, -1],
                   'ride': ['', 60.0, -1, -1],
                   'sport': ['', 60.0, -1, -1]}


def extract_direction_dataset1(q):
    # q is a quaternion; rotate the reference vector v0 to obtain the
    # head-orientation direction.
    v0 = [1, 0, 0]
    q = Quaternion([q[3], q[2], q[1], q[0]])
    return q.rotate(v0)

def extract_direction_dataset2(q):
    # q is a quaternion; dataset 2 uses a different axis convention.
    v0 = [0, 0, 1]
    q = Quaternion([q[3], -q[2], q[1], -q[0]])
    return q.rotate(v0)

def extract_direction_dataset3(q):
    # q is a quaternion; dataset 3 shares dataset 2's axis convention.
    v0 = [0, 0, 1]
    q = Quaternion([q[3], -q[2], q[1], -q[0]])
    return q.rotate(v0)

def pixel_to_ang(_x, _y, _geo_h, _geo_w):
    # Convert pixel (row _x, column _y) of a _geo_h x _geo_w equirectangular
    # image to angles (theta, phi) in degrees.
    phi = geoy_to_phi(_x, _geo_h)
    theta = -(_y * 1.0 / _geo_w) * 360
    if theta < -180: theta = 360 + theta
    return theta, phi

def 
geoy_to_phi(_geoy, _height):
    # Convert a row index to a latitude angle phi in [-90, 90] degrees.
    d = (_height/2 - _geoy) * 1.0 / (_height/2)
    s = -1 if d < 0 else 1
    return s * np.arcsin(np.abs(d)) / np.pi * 180

def unit_vector(vector):
    return vector / np.linalg.norm(vector)

def degree_distance(v1, v2):
    # Angular distance between two vectors, in degrees.
    v1_u = unit_vector(v1)
    v2_u = unit_vector(v2)
    return np.arccos(np.clip(np.dot(v1_u, v2_u), -1.0, 1.0)) / np.pi * 180

def angle_between(v1, v2):
    # Identical to degree_distance; kept because both names are used.
    return degree_distance(v1, v2)


# Helpers to create the fixation map.

def vector_to_ang(_v):
    # Convert a 3D head-orientation unit vector to angles (theta, phi).
    _v = np.array(_v)
    alpha = degree_distance(_v, [0, 1, 0])  # degrees between _v and [0, 1, 0]
    phi = 90.0 - alpha
    proj1 = [0, np.cos(alpha/180.0 * np.pi), 0]  # projection of _v onto the [0, 1, 0] axis
    proj2 = _v - proj1  # projection of _v onto the ([1, 0, 0], [0, 0, 1]) plane
    theta = degree_distance(proj2, [1, 0, 0])  # degrees between the projected vector and [1, 0, 0]
    sign = -1.0 if degree_distance(_v, [0, 0, -1]) > 90 else 1.0
    theta = sign * theta
    return theta, phi


def ang_to_geoxy(_theta, _phi, _h, _w):
    # Convert angles (theta, phi) to pixel coordinates in an _h x _w
    # equirectangular image.
    x = _h/2.0 - (_h/2.0) * np.sin(_phi/180.0 * np.pi)
    temp = _theta
    if temp < 0: temp = 180 + temp + 180
    temp = 360 - temp
    y = (temp * 1.0/360 * _w)
    return x, y  # (row index hi, column index wi)


def adjust_pixel_dataset3(hi, wi, H, W):
    wi = W - wi
    wi = wi - W/4
    if wi < 0:
        wi = wi + W
    return hi, wi

def adjust_pixel_dataset2(hi, wi, H, W):
    wi = W - wi
    if wi < 0:
        wi = wi + W
    return hi, wi

def adjust_pixel_dataset1(hi, wi, H, W):
    hi = H - hi
    if hi < 0:
        hi = hi + H
    return hi, wi


def adjust_pixellist_dataset(dataset, pixel_list, H, W):
    # Apply the per-dataset pixel adjustment to a list of (hi, wi) pairs.
    rhi_list = []
    rwi_list = []
    for hi, wi in pixel_list:
        if dataset == 1:
            hi, wi = adjust_pixel_dataset1(hi, wi, H, W)
        elif dataset == 2:
            hi, wi = adjust_pixel_dataset2(hi, wi, H, W)
        elif dataset == 3:
            hi, wi = adjust_pixel_dataset3(hi, wi, H, W)
        rhi_list.append(hi)
        rwi_list.append(wi)
    return zip(rhi_list, rwi_list)
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# A Saliency Dataset for 360-Degree Videos
This README contains instructions for using our 360-degree saliency dataset and for reproducing the saliency maps discussed in the paper:

Anh Nguyen and Zhisheng Yan. A saliency dataset for 360-degree videos. In Proceedings of the 10th ACM Multimedia Systems Conference (MMSys'19), 2019.

The data and source code are distributed under the terms of the MIT license. Our contributions in this project are:
- A 360-degree saliency dataset with 50,654 saliency maps from 24 diverse videos (the original YouTube links of these videos are also provided).
- Open-source software to create 360-degree saliency maps from head tracking logs.

To cite our paper, use this BibTeX entry:
```
@inproceedings{anguyen139,
  AUTHOR = {Nguyen, Anh and Yan, Zhisheng},
  title = {A Saliency Dataset for 360-Degree Videos},
  booktitle = {Proceedings of the 10th ACM Multimedia Systems Conference (MMSys'19)},
  year = {2019}
}
```

# Paper Abstract
Despite the increasing popularity, realizing 360-degree videos in everyday applications is still challenging. Considering the unique viewing behavior in head-mounted display (HMD), understanding the saliency of 360-degree videos becomes the key to various 360-degree video research. Unfortunately, existing saliency datasets are either irrelevant to 360-degree videos or too small to support saliency modeling. In this paper, we introduce a large saliency dataset for 360-degree videos with 50,654 saliency maps from 24 diverse videos. The dataset is created by a new methodology supported by psychology studies in HMD viewing. Evaluation of the dataset shows that the generated saliency is highly correlated with the actual user fixation and that the saliency data can provide useful insight on user attention in 360-degree video viewing. The dataset and the program used to extract saliency are both made publicly available to facilitate future research.

# 360-Degree Saliency Dataset
The dataset includes 50,654 saliency maps from 24 videos. The saliency maps for each video are stored together in one file. The data in each file is organized into records. Each record has three fields: `timestamp`, `fixation`, and `saliency map`. The first field is the relative video time in seconds for the saliency map. The second field is a list of fixation points; each fixation point is a unit vector representing the head orientation in three-dimensional space. The third field is the saliency map, where each pixel is a float number representing the saliency level at that position in the original video frame.

To access the dataset, please follow the link provided inside the `./data` folder.

# Program
## Program structure
`/data` contains the [link](https://zenodo.org/record/2641282#.XLYYGkMpDAg) to Zenodo.org where the saliency maps are stored.
`/data/head-orientation` is the folder where the input head tracking logs are expected to reside; alternative log locations can be specified in the `/header.py` script.
`/get_fixation.py` is the main entry point to create 360-degree saliency maps from head tracking logs.
`/example.py` is an example Python script that retrieves the saliency maps from the files in the `data` folder.

## Requirement & Installation
1. Download Python 2
The program is developed in Python 2.7. It is recommended that the [Anaconda2](https://www.anaconda.com/distribution/) distribution be used.
2. Install [pyquaternion](http://kieranwynn.github.io/pyquaternion/).
```sh
pip install pyquaternion
```
3. Collect Head Tracking Logs
The program can receive head tracking logs in either quaternion or Euler-angle format and outputs saliency maps. Currently, the head tracking logs from [Wu](https://wuchlei-thu.github.io/), [Corbillon](http://dash.ipv6.enstb.fr/headMovements/), and [Lo](https://nmsl.cs.nthu.edu.tw/360video/) are supported.

## Dataset Collection Program

To access our generated saliency maps, refer to the example in the file `./example.py`. The saliency maps are stored in Python pickle format and need to be extracted by a Python program.
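
The snippet below is a minimal sketch of that extraction, mirroring `./example.py`; it assumes the file `saliency_ds1_topicparis` has already been downloaded into `./data`:
```python
import pickle

# Load one saliency dataset file (dataset 1, video 'paris'); the file is a
# binary pickle, so open it in 'rb' mode.
data = pickle.load(open('./data/saliency_ds1_topicparis', 'rb'))

# Each record holds a timestamp, a list of fixation unit vectors, and a
# 90 x 160 equirectangular saliency map (a NumPy array).
timestamp, fixation_list, saliency_map = data[0]
print timestamp             # relative video time in seconds
print len(fixation_list)    # number of fixation points at this timestamp
print saliency_map.shape    # (90, 160)
```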

To generate saliency maps from head tracking logs, refer to the file `./get_fixation.py`. The program assumes the input head tracking logs have been downloaded and their file paths have been set in `header.py`. To run the program, execute this command from a terminal:
```sh
python get_fixation.py <dataset index> <video name>
```
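
Here `<dataset index>` is 1, 2, or 3 and `<video name>` is the topic key listed in `get_fixation.py`. For example, `python get_fixation.py 1 paris` processes the Paris video of dataset 1 and pickles the resulting saliency maps to `./data/saliency_ds1_topicparis`.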