├── .github └── workflows │ └── unit-test.yml ├── .gitignore ├── LICENSE ├── README.md ├── custom_data ├── labels │ ├── EE-bNr36nyA.json │ ├── St Maarten Landing.json │ └── iVt07TCkFM0.json └── videos │ ├── EE-bNr36nyA.mp4 │ ├── St Maarten Landing.mp4 │ └── iVt07TCkFM0.mp4 ├── docs └── framework.jpg ├── requirements.txt ├── splits ├── summe.yml ├── summe_aug.yml ├── summe_trans.yml ├── tvsum.yml ├── tvsum_aug.yml └── tvsum_trans.yml ├── src ├── anchor_based │ ├── anchor_helper.py │ ├── dsnet.py │ ├── losses.py │ └── train.py ├── anchor_free │ ├── anchor_free_helper.py │ ├── dsnet_af.py │ ├── losses.py │ └── train.py ├── evaluate.py ├── helpers │ ├── bbox_helper.py │ ├── data_helper.py │ ├── init_helper.py │ ├── video_helper.py │ └── vsumm_helper.py ├── infer.py ├── kts │ ├── LICENSE │ ├── README.md │ ├── cpd_auto.py │ ├── cpd_nonlin.py │ └── demo.py ├── make_dataset.py ├── make_shots.py ├── make_split.py ├── modules │ ├── model_zoo.py │ └── models.py └── train.py └── tests ├── __init__.py ├── anchor_based ├── __init__.py ├── test_ab_losses.py └── test_anchor_helper.py ├── anchor_free ├── __init__.py ├── test_af_losses.py └── test_anchor_free_helper.py ├── helpers ├── __init__.py ├── test_bbox_helper.py ├── test_data_helper.py └── test_vsumm_helper.py ├── mock_run.sh ├── modules ├── __init__.py └── test_models.py └── test_train.py /.github/workflows/unit-test.yml: -------------------------------------------------------------------------------- 1 | # This workflow will install Python dependencies, run tests and lint with a single version of Python 2 | # For more information see: https://help.github.com/actions/language-and-framework-guides/using-python-with-github-actions 3 | 4 | name: UnitTest 5 | 6 | on: 7 | push: 8 | branches: [ '**' ] 9 | pull_request: 10 | branches: [ '**' ] 11 | 12 | jobs: 13 | build: 14 | 15 | runs-on: ubuntu-latest 16 | 17 | steps: 18 | - uses: actions/checkout@v2 19 | - name: Set up Python 3.8 20 | uses: actions/setup-python@v2 21 | with: 22 | python-version: 3.8 23 | - name: Install dependencies 24 | run: | 25 | python -m pip install --upgrade pip setuptools wheel 26 | pip install h5py ortools pyyaml numpy torch==1.7.0 27 | pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-1.7.0+cpu.html 28 | pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-1.7.0+cpu.html 29 | pip install torch-geometric 30 | pip install pytest 31 | - name: Test with pytest 32 | run: | 33 | pytest tests/ 34 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # ide 2 | .vscode/ 3 | .idea/ 4 | 5 | # macos 6 | .DS_Store 7 | __MACOSX/ 8 | 9 | # python 10 | __pycache__/ 11 | .pytest_cache/ 12 | .coverage 13 | 14 | # project 15 | /models/ 16 | /datasets/ 17 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Jiahao Li 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 
| The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # DSNet: A Flexible Detect-to-Summarize Network for Video Summarization [[paper]](https://ieeexplore.ieee.org/document/9275314)
2 | 
3 | [Build Status](https://github.com/li-plus/DSNet/actions)
4 | [MIT License](https://github.com/li-plus/DSNet/blob/main/LICENSE)
5 | 
6 | ![framework](docs/framework.jpg)
7 | 
8 | A PyTorch implementation of our paper [DSNet: A Flexible Detect-to-Summarize Network for Video Summarization](https://ieeexplore.ieee.org/document/9275314) by [Wencheng Zhu](https://woshiwencheng.github.io/), [Jiwen Lu](http://ivg.au.tsinghua.edu.cn/Jiwen_Lu/), [Jiahao Li](https://github.com/li-plus), and [Jie Zhou](http://www.au.tsinghua.edu.cn/info/1078/1635.htm). Published in [IEEE Transactions on Image Processing](https://ieeexplore.ieee.org/xpl/RecentIssue.jsp?punumber=83).
9 | 
10 | ## Getting Started
11 | 
12 | This project was developed on Ubuntu 16.04 with CUDA 9.0.176.
13 | 
14 | First, clone this project to your local environment.
15 | 
16 | ```sh
17 | git clone https://github.com/li-plus/DSNet.git
18 | ```
19 | 
20 | Create a virtual environment with Python 3.6, preferably using [Anaconda](https://www.anaconda.com/).
21 | 
22 | ```sh
23 | conda create --name dsnet python=3.6
24 | conda activate dsnet
25 | ```
26 | 
27 | Install the Python dependencies.
28 | 
29 | ```sh
30 | pip install -r requirements.txt
31 | ```
32 | 
33 | ## Dataset Preparation
34 | 
35 | Download the pre-processed datasets into the `datasets/` folder, including [TVSum](https://github.com/yalesong/tvsum), [SumMe](https://gyglim.github.io/me/vsum/index.html), [OVP](https://sites.google.com/site/vsummsite/download), and [YouTube](https://sites.google.com/site/vsummsite/download).
36 | 
37 | ```sh
38 | mkdir -p datasets/ && cd datasets/
39 | wget https://www.dropbox.com/s/tdknvkpz1jp6iuz/dsnet_datasets.zip
40 | unzip dsnet_datasets.zip
41 | ```
42 | 
43 | If the Dropbox link is unavailable to you, try downloading from the links below.
44 | 
45 | + (Baidu Cloud) Link: https://pan.baidu.com/s/1LUK2aZzLvgNwbK07BUAQRQ Extraction Code: x09b
46 | + (Google Drive) https://drive.google.com/file/d/11ulsvk1MZI7iDqymw9cfL7csAYS0cDYH/view?usp=sharing
47 | 
48 | Now the dataset structure should look like
49 | 
50 | ```
51 | DSNet
52 | └── datasets/
53 |     ├── eccv16_dataset_ovp_google_pool5.h5
54 |     ├── eccv16_dataset_summe_google_pool5.h5
55 |     ├── eccv16_dataset_tvsum_google_pool5.h5
56 |     ├── eccv16_dataset_youtube_google_pool5.h5
57 |     └── readme.txt
58 | ```
59 | 
60 | ## Pre-trained Models
61 | 
62 | Our pre-trained models are now available online. You may download them for evaluation, or skip this section and train a new model from scratch.
63 | 
64 | ```sh
65 | mkdir -p models && cd models
66 | # anchor-based model
67 | wget https://www.dropbox.com/s/0jwn4c1ccjjysrz/pretrain_ab_basic.zip
68 | unzip pretrain_ab_basic.zip
69 | # anchor-free model
70 | wget https://www.dropbox.com/s/2hjngmb0f97nxj0/pretrain_af_basic.zip
71 | unzip pretrain_af_basic.zip
72 | ```
73 | 
74 | To evaluate our pre-trained models, type:
75 | 
76 | ```sh
77 | # evaluate anchor-based model
78 | python evaluate.py anchor-based --model-dir ../models/pretrain_ab_basic/ --splits ../splits/tvsum.yml ../splits/summe.yml
79 | # evaluate anchor-free model
80 | python evaluate.py anchor-free --model-dir ../models/pretrain_af_basic/ --splits ../splits/tvsum.yml ../splits/summe.yml --nms-thresh 0.4
81 | ```
82 | 
83 | If everything works fine, you will get F-score results similar to the following.
84 | 
85 | | F-score (%)  | TVSum | SumMe |
86 | | ------------ | ----- | ----- |
87 | | Anchor-based | 62.05 | 50.19 |
88 | | Anchor-free  | 61.86 | 51.18 |
89 | 
90 | ## Training
91 | 
92 | ### Anchor-based
93 | 
94 | To train the anchor-based attention model on the TVSum and SumMe datasets with canonical settings, run
95 | 
96 | ```sh
97 | python train.py anchor-based --model-dir ../models/ab_basic --splits ../splits/tvsum.yml ../splits/summe.yml
98 | ```
99 | 
100 | To train on the augmented and transfer datasets, run
101 | 
102 | ```sh
103 | python train.py anchor-based --model-dir ../models/ab_tvsum_aug/ --splits ../splits/tvsum_aug.yml
104 | python train.py anchor-based --model-dir ../models/ab_summe_aug/ --splits ../splits/summe_aug.yml
105 | python train.py anchor-based --model-dir ../models/ab_tvsum_trans/ --splits ../splits/tvsum_trans.yml
106 | python train.py anchor-based --model-dir ../models/ab_summe_trans/ --splits ../splits/summe_trans.yml
107 | ```
108 | 
109 | To train with an LSTM, Bi-LSTM, or GCN feature extractor, specify the `--base-model` argument as `lstm`, `bilstm`, or `gcn`. For example,
110 | 
111 | ```sh
112 | python train.py anchor-based --model-dir ../models/ab_basic --splits ../splits/tvsum.yml ../splits/summe.yml --base-model lstm
113 | ```
114 | 
115 | ### Anchor-free
116 | 
117 | Similar to the anchor-based models, to train on canonical TVSum and SumMe, run
118 | 
119 | ```sh
120 | python train.py anchor-free --model-dir ../models/af_basic --splits ../splits/tvsum.yml ../splits/summe.yml --nms-thresh 0.4
121 | ```
122 | 
123 | Note that the NMS threshold is set to 0.4 for anchor-free models.
124 | 
125 | ## Evaluation
126 | 
127 | To evaluate your anchor-based models, run
128 | 
129 | ```sh
130 | python evaluate.py anchor-based --model-dir ../models/ab_basic/ --splits ../splits/tvsum.yml ../splits/summe.yml
131 | ```
132 | 
133 | For anchor-free models, remember to set the NMS threshold to 0.4.
134 | 
135 | ```sh
136 | python evaluate.py anchor-free --model-dir ../models/af_basic/ --splits ../splits/tvsum.yml ../splits/summe.yml --nms-thresh 0.4
137 | ```
138 | 
139 | ## Generating Shots with KTS
140 | 
141 | Based on the public datasets provided by [DR-DSN](https://github.com/KaiyangZhou/pytorch-vsumm-reinforce), we apply the [KTS](https://github.com/pathak22/videoseg/tree/master/lib/kts) algorithm to generate video shots for the OVP and YouTube datasets. Note that the pre-processed datasets already contain these video shots.
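For reference, KTS (Kernel Temporal Segmentation) detects shot boundaries as change points in the frame-level feature sequence, operating on a kernel matrix of pairwise frame similarities. The snippet below is a minimal sketch of driving the bundled implementation in `src/kts/`; the random features, the change-point upper bound, and the import path are illustrative assumptions, and `make_shots.py` remains the authoritative pipeline.

```python
import numpy as np

from kts.cpd_auto import cpd_auto  # bundled KTS code under src/kts/

# Toy stand-in for the GoogLeNet pool5 frame features stored in the h5 datasets.
n_frames, n_dims = 200, 1024
features = np.random.rand(n_frames, n_dims).astype(np.float32)

# KTS works on a kernel matrix of pairwise frame similarities.
kernel = np.matmul(features, features.T)

# Allow up to n_frames - 1 change points; cpd_auto selects the best number.
change_points, _ = cpd_auto(kernel, n_frames - 1, 1)

# Turn the detected change points into [first_frame, last_frame] shot boundaries.
boundaries = np.hstack((0, change_points, n_frames))
shots = np.vstack((boundaries[:-1], boundaries[1:] - 1)).T
print(shots)  # one row per shot
```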
To re-generate video shots, run
142 | 
143 | ```sh
144 | python make_shots.py --dataset ../datasets/eccv16_dataset_ovp_google_pool5.h5
145 | python make_shots.py --dataset ../datasets/eccv16_dataset_youtube_google_pool5.h5
146 | ```
147 | 
148 | ## Using Custom Videos
149 | 
150 | ### Training & Validation
151 | 
152 | We provide scripts to pre-process custom video data, such as the raw videos in the `custom_data` folder.
153 | 
154 | First, create an h5 dataset. Here `--video-dir` contains several MP4 videos, and `--label-dir` contains the ground truth user summaries for each video. The user summary of a video is a UxN binary matrix, where U denotes the number of annotators and N denotes the number of frames in the original video.
155 | 
156 | ```sh
157 | python make_dataset.py --video-dir ../custom_data/videos --label-dir ../custom_data/labels \
158 |   --save-path ../custom_data/custom_dataset.h5 --sample-rate 15
159 | ```
160 | 
161 | Then split the dataset into training and validation sets, and generate a split file to index them.
162 | 
163 | ```sh
164 | python make_split.py --dataset ../custom_data/custom_dataset.h5 \
165 |   --train-ratio 0.67 --save-path ../custom_data/custom.yml
166 | ```
167 | 
168 | Now you may train and evaluate on your custom videos using the split file.
169 | 
170 | ```sh
171 | python train.py anchor-based --model-dir ../models/custom --splits ../custom_data/custom.yml
172 | python evaluate.py anchor-based --model-dir ../models/custom --splits ../custom_data/custom.yml
173 | ```
174 | 
175 | ### Inference
176 | 
177 | To predict the summary of a raw video, use `infer.py`. For example, run
178 | 
179 | ```sh
180 | python infer.py anchor-based --ckpt-path ../models/custom/checkpoint/custom.yml.0.pt \
181 |   --source ../custom_data/videos/EE-bNr36nyA.mp4 --save-path ./output.mp4
182 | ```
183 | 
184 | ## Acknowledgments
185 | 
186 | We gratefully thank the open-source repositories below, which greatly boosted our research.
187 | 
188 | + Thanks to [KTS](https://github.com/pathak22/videoseg/tree/master/lib/kts) for the effective shot generation algorithm.
189 | + Thanks to [DR-DSN](https://github.com/KaiyangZhou/pytorch-vsumm-reinforce) for the pre-processed public datasets.
190 | + Thanks to [VASNet](https://github.com/ok1zjf/VASNet) for the training and evaluation pipeline.
191 | 
192 | ## Citation
193 | 
194 | If you find our code or paper helpful, please consider citing.
195 | 196 | ``` 197 | @article{zhu2020dsnet, 198 | title={DSNet: A Flexible Detect-to-Summarize Network for Video Summarization}, 199 | author={Zhu, Wencheng and Lu, Jiwen and Li, Jiahao and Zhou, Jie}, 200 | journal={IEEE Transactions on Image Processing}, 201 | volume={30}, 202 | pages={948--962}, 203 | year={2020} 204 | } 205 | ``` 206 | -------------------------------------------------------------------------------- /custom_data/videos/EE-bNr36nyA.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/li-plus/DSNet/1804176e2e8b57846beb063667448982273fca89/custom_data/videos/EE-bNr36nyA.mp4 -------------------------------------------------------------------------------- /custom_data/videos/St Maarten Landing.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/li-plus/DSNet/1804176e2e8b57846beb063667448982273fca89/custom_data/videos/St Maarten Landing.mp4 -------------------------------------------------------------------------------- /custom_data/videos/iVt07TCkFM0.mp4: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/li-plus/DSNet/1804176e2e8b57846beb063667448982273fca89/custom_data/videos/iVt07TCkFM0.mp4 -------------------------------------------------------------------------------- /docs/framework.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/li-plus/DSNet/1804176e2e8b57846beb063667448982273fca89/docs/framework.jpg -------------------------------------------------------------------------------- /requirements.txt: -------------------------------------------------------------------------------- 1 | tqdm==4.61.0 2 | h5py==3.1.0 3 | ortools==8.0.8283 4 | pyyaml==5.4.1 5 | numpy==1.19.5 6 | opencv-python==4.5.2.54 7 | torch==1.1.0 8 | torchvision==0.3.0 9 | torch-scatter==1.4.0 10 | torch-sparse==0.4.3 11 | torch-cluster==1.4.5 12 | torch-geometric==1.3.2 13 | -------------------------------------------------------------------------------- /splits/summe.yml: -------------------------------------------------------------------------------- 1 | - test_keys: 2 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_22 3 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_7 4 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_6 5 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_11 6 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_1 7 | train_keys: 8 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_23 9 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_21 10 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_10 11 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_12 12 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_4 13 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_24 14 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_5 15 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_14 16 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_25 17 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_13 18 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_20 19 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_17 20 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_2 21 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_3 22 | - 
../datasets/eccv16_dataset_summe_google_pool5.h5/video_9 23 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_15 24 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_16 25 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_18 26 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_19 27 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_8 28 | - test_keys: 29 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_2 30 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_11 31 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_20 32 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_22 33 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_17 34 | train_keys: 35 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_12 36 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_3 37 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_18 38 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_15 39 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_5 40 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_16 41 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_1 42 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_8 43 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_13 44 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_6 45 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_9 46 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_19 47 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_14 48 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_4 49 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_7 50 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_23 51 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_24 52 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_21 53 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_25 54 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_10 55 | - test_keys: 56 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_17 57 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_13 58 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_12 59 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_25 60 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_1 61 | train_keys: 62 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_4 63 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_10 64 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_5 65 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_15 66 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_23 67 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_16 68 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_19 69 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_22 70 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_24 71 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_2 72 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_21 73 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_9 74 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_11 75 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_18 76 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_3 77 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_7 78 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_6 79 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_8 
80 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_14 81 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_20 82 | - test_keys: 83 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_1 84 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_13 85 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_25 86 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_16 87 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_9 88 | train_keys: 89 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_4 90 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_23 91 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_10 92 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_14 93 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_21 94 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_2 95 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_18 96 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_20 97 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_19 98 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_3 99 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_7 100 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_12 101 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_5 102 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_15 103 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_17 104 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_24 105 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_11 106 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_8 107 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_22 108 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_6 109 | - test_keys: 110 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_3 111 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_17 112 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_21 113 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_22 114 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_24 115 | train_keys: 116 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_12 117 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_7 118 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_2 119 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_19 120 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_15 121 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_20 122 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_10 123 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_4 124 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_14 125 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_1 126 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_13 127 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_11 128 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_16 129 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_6 130 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_25 131 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_5 132 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_9 133 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_18 134 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_23 135 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_8 136 | -------------------------------------------------------------------------------- 
/splits/summe_aug.yml: -------------------------------------------------------------------------------- 1 | - test_keys: 2 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_22 3 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_7 4 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_6 5 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_11 6 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_1 7 | train_keys: 8 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_24 9 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_1 10 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_19 11 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_27 12 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_47 13 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_4 14 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_17 15 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_48 16 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_17 17 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_50 18 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_34 19 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_19 20 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_42 21 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_3 22 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_39 23 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_45 24 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_37 25 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_45 26 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_12 27 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_40 28 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_28 29 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_15 30 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_7 31 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_18 32 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_4 33 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_2 34 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_26 35 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_20 36 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_28 37 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_25 38 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_31 39 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_20 40 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_46 41 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_2 42 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_22 43 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_7 44 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_27 45 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_43 46 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_2 47 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_33 48 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_16 49 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_27 50 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_48 51 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_20 52 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_37 53 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_48 54 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_10 55 | - 
../datasets/eccv16_dataset_youtube_google_pool5.h5/video_11 56 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_34 57 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_34 58 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_18 59 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_44 60 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_44 61 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_29 62 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_31 63 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_38 64 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_49 65 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_42 66 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_50 67 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_9 68 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_31 69 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_3 70 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_37 71 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_13 72 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_12 73 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_14 74 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_41 75 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_41 76 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_44 77 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_33 78 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_45 79 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_35 80 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_36 81 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_12 82 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_10 83 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_5 84 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_6 85 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_30 86 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_16 87 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_21 88 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_30 89 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_28 90 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_19 91 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_40 92 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_21 93 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_47 94 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_13 95 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_25 96 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_32 97 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_36 98 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_49 99 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_9 100 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_14 101 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_10 102 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_26 103 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_21 104 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_39 105 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_18 106 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_17 107 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_13 108 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_38 109 | - 
../datasets/eccv16_dataset_summe_google_pool5.h5/video_15 110 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_6 111 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_15 112 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_33 113 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_42 114 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_23 115 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_35 116 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_19 117 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_18 118 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_8 119 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_26 120 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_16 121 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_41 122 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_29 123 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_5 124 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_11 125 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_43 126 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_24 127 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_8 128 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_15 129 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_16 130 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_46 131 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_3 132 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_14 133 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_50 134 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_32 135 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_30 136 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_47 137 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_25 138 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_22 139 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_46 140 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_25 141 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_23 142 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_36 143 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_5 144 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_49 145 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_1 146 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_11 147 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_35 148 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_39 149 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_23 150 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_40 151 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_23 152 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_38 153 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_21 154 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_29 155 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_43 156 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_32 157 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_24 158 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_9 159 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_20 160 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_13 161 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_8 162 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_24 163 | - 
../datasets/eccv16_dataset_ovp_google_pool5.h5/video_12 164 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_4 165 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_17 166 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_14 167 | - test_keys: 168 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_2 169 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_11 170 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_20 171 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_22 172 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_17 173 | train_keys: 174 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_18 175 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_34 176 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_24 177 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_16 178 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_11 179 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_6 180 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_5 181 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_16 182 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_12 183 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_27 184 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_27 185 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_12 186 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_6 187 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_21 188 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_20 189 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_46 190 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_39 191 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_48 192 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_1 193 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_14 194 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_38 195 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_25 196 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_30 197 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_50 198 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_29 199 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_45 200 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_36 201 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_38 202 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_48 203 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_16 204 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_23 205 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_14 206 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_35 207 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_27 208 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_15 209 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_7 210 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_19 211 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_32 212 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_8 213 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_26 214 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_41 215 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_25 216 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_28 217 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_28 218 | - 
../datasets/eccv16_dataset_youtube_google_pool5.h5/video_37 219 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_23 220 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_12 221 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_17 222 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_45 223 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_31 224 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_4 225 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_30 226 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_47 227 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_48 228 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_33 229 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_22 230 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_36 231 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_46 232 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_13 233 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_50 234 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_31 235 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_29 236 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_3 237 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_17 238 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_4 239 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_32 240 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_24 241 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_18 242 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_21 243 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_2 244 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_20 245 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_1 246 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_21 247 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_28 248 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_14 249 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_47 250 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_44 251 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_40 252 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_23 253 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_10 254 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_36 255 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_22 256 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_33 257 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_2 258 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_19 259 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_49 260 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_45 261 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_34 262 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_34 263 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_8 264 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_35 265 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_7 266 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_50 267 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_43 268 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_18 269 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_24 270 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_25 271 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_39 272 | - 
../datasets/eccv16_dataset_ovp_google_pool5.h5/video_42 273 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_43 274 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_31 275 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_3 276 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_11 277 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_26 278 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_49 279 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_8 280 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_15 281 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_25 282 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_5 283 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_11 284 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_10 285 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_20 286 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_33 287 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_29 288 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_17 289 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_9 290 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_26 291 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_4 292 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_7 293 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_49 294 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_19 295 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_23 296 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_44 297 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_16 298 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_35 299 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_14 300 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_44 301 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_40 302 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_19 303 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_13 304 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_30 305 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_6 306 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_37 307 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_24 308 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_46 309 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_39 310 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_1 311 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_15 312 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_41 313 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_32 314 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_10 315 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_12 316 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_9 317 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_41 318 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_15 319 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_13 320 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_38 321 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_13 322 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_18 323 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_21 324 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_42 325 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_5 326 | - 
../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_3 327 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_37 328 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_47 329 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_40 330 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_9 331 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_42 332 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_43 333 | - test_keys: 334 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_17 335 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_13 336 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_12 337 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_25 338 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_1 339 | train_keys: 340 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_48 341 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_32 342 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_44 343 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_20 344 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_30 345 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_13 346 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_31 347 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_23 348 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_24 349 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_38 350 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_29 351 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_2 352 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_37 353 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_23 354 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_35 355 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_16 356 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_24 357 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_35 358 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_16 359 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_21 360 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_19 361 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_15 362 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_1 363 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_28 364 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_29 365 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_33 366 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_22 367 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_16 368 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_11 369 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_9 370 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_50 371 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_7 372 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_4 373 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_19 374 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_15 375 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_8 376 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_3 377 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_7 378 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_3 379 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_47 380 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_21 381 | - 
../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_22 382 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_21 383 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_17 384 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_20 385 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_30 386 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_26 387 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_4 388 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_23 389 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_18 390 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_31 391 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_27 392 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_32 393 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_37 394 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_5 395 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_28 396 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_49 397 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_44 398 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_17 399 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_42 400 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_34 401 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_24 402 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_21 403 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_6 404 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_46 405 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_24 406 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_16 407 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_25 408 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_5 409 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_15 410 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_42 411 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_43 412 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_41 413 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_23 414 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_45 415 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_41 416 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_8 417 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_40 418 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_39 419 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_34 420 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_4 421 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_40 422 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_45 423 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_41 424 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_6 425 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_14 426 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_49 427 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_36 428 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_12 429 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_48 430 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_43 431 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_7 432 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_50 433 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_8 434 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_27 435 | - 
../datasets/eccv16_dataset_youtube_google_pool5.h5/video_32 436 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_33 437 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_35 438 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_47 439 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_45 440 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_26 441 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_9 442 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_26 443 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_9 444 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_18 445 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_39 446 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_17 447 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_5 448 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_28 449 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_34 450 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_11 451 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_22 452 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_18 453 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_40 454 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_19 455 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_20 456 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_33 457 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_19 458 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_48 459 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_38 460 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_27 461 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_11 462 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_25 463 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_10 464 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_30 465 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_14 466 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_3 467 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_13 468 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_10 469 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_44 470 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_46 471 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_47 472 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_29 473 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_36 474 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_38 475 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_43 476 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_10 477 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_18 478 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_14 479 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_12 480 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_1 481 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_50 482 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_2 483 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_46 484 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_39 485 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_6 486 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_2 487 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_42 488 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_25 489 | - 
../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_20 490 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_49 491 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_13 492 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_15 493 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_36 494 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_14 495 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_31 496 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_37 497 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_12 498 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_11 499 | - test_keys: 500 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_1 501 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_13 502 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_25 503 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_16 504 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_9 505 | train_keys: 506 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_15 507 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_23 508 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_18 509 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_35 510 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_39 511 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_14 512 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_13 513 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_4 514 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_22 515 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_13 516 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_23 517 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_21 518 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_20 519 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_49 520 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_15 521 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_18 522 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_49 523 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_50 524 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_17 525 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_12 526 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_20 527 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_5 528 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_14 529 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_17 530 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_7 531 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_40 532 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_39 533 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_29 534 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_31 535 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_43 536 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_22 537 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_6 538 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_37 539 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_43 540 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_24 541 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_50 542 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_7 543 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_20 544 | - 
../datasets/eccv16_dataset_youtube_google_pool5.h5/video_16 545 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_27 546 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_15 547 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_19 548 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_11 549 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_12 550 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_12 551 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_31 552 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_25 553 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_28 554 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_45 555 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_16 556 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_18 557 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_8 558 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_45 559 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_9 560 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_19 561 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_50 562 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_34 563 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_15 564 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_31 565 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_29 566 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_47 567 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_2 568 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_34 569 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_42 570 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_14 571 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_36 572 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_40 573 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_5 574 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_33 575 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_26 576 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_7 577 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_19 578 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_13 579 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_10 580 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_8 581 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_30 582 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_4 583 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_21 584 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_27 585 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_30 586 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_21 587 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_20 588 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_23 589 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_36 590 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_10 591 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_36 592 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_26 593 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_28 594 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_30 595 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_29 596 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_34 597 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_38 598 | 
- ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_44 599 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_39 600 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_24 601 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_41 602 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_14 603 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_46 604 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_21 605 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_37 606 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_6 607 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_45 608 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_49 609 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_19 610 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_6 611 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_12 612 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_47 613 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_1 614 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_40 615 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_44 616 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_32 617 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_5 618 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_10 619 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_16 620 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_32 621 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_9 622 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_17 623 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_44 624 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_47 625 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_46 626 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_38 627 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_24 628 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_8 629 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_48 630 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_41 631 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_48 632 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_42 633 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_23 634 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_35 635 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_48 636 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_4 637 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_27 638 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_28 639 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_37 640 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_11 641 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_25 642 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_32 643 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_43 644 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_11 645 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_18 646 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_3 647 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_24 648 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_2 649 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_11 650 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_42 651 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_33 652 | - 
../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_46 653 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_1 654 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_41 655 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_26 656 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_33 657 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_25 658 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_35 659 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_17 660 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_38 661 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_3 662 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_2 663 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_3 664 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_22 665 | - test_keys: 666 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_3 667 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_17 668 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_21 669 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_22 670 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_24 671 | train_keys: 672 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_14 673 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_19 674 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_20 675 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_11 676 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_5 677 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_47 678 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_27 679 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_1 680 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_47 681 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_19 682 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_28 683 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_39 684 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_27 685 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_16 686 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_20 687 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_41 688 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_28 689 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_4 690 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_33 691 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_44 692 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_48 693 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_9 694 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_20 695 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_46 696 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_2 697 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_4 698 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_50 699 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_12 700 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_33 701 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_32 702 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_25 703 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_15 704 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_29 705 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_33 706 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_37 707 | - 
../datasets/eccv16_dataset_ovp_google_pool5.h5/video_21 708 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_7 709 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_16 710 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_1 711 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_12 712 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_24 713 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_7 714 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_32 715 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_35 716 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_39 717 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_23 718 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_45 719 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_32 720 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_39 721 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_30 722 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_7 723 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_26 724 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_31 725 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_22 726 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_24 727 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_14 728 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_21 729 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_40 730 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_42 731 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_9 732 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_48 733 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_14 734 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_18 735 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_13 736 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_6 737 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_15 738 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_44 739 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_34 740 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_49 741 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_46 742 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_11 743 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_36 744 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_23 745 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_8 746 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_36 747 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_50 748 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_42 749 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_40 750 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_17 751 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_43 752 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_29 753 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_28 754 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_10 755 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_18 756 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_43 757 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_41 758 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_23 759 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_25 760 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_11 761 | - 
../datasets/eccv16_dataset_youtube_google_pool5.h5/video_50 762 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_31 763 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_35 764 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_37 765 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_5 766 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_38 767 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_16 768 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_15 769 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_27 770 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_10 771 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_20 772 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_44 773 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_30 774 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_2 775 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_26 776 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_35 777 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_12 778 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_2 779 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_13 780 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_22 781 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_45 782 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_37 783 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_1 784 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_8 785 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_10 786 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_49 787 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_25 788 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_9 789 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_31 790 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_14 791 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_6 792 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_17 793 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_19 794 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_34 795 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_8 796 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_49 797 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_45 798 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_16 799 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_38 800 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_19 801 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_17 802 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_25 803 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_12 804 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_30 805 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_5 806 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_4 807 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_18 808 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_40 809 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_3 810 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_15 811 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_23 812 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_47 813 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_13 814 | - ../datasets/eccv16_dataset_summe_google_pool5.h5/video_6 815 | - 
../datasets/eccv16_dataset_youtube_google_pool5.h5/video_21 816 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_13 817 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_3 818 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_18 819 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_48 820 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_11 821 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_42 822 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_34 823 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_43 824 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_36 825 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_38 826 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_46 827 | - ../datasets/eccv16_dataset_ovp_google_pool5.h5/video_26 828 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_41 829 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_24 830 | - ../datasets/eccv16_dataset_youtube_google_pool5.h5/video_29 831 | -------------------------------------------------------------------------------- /splits/tvsum.yml: -------------------------------------------------------------------------------- 1 | - test_keys: 2 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_35 3 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_43 4 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_21 5 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_16 6 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_38 7 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_31 8 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_33 9 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_34 10 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_1 11 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_15 12 | train_keys: 13 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_3 14 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_17 15 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_44 16 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_7 17 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_50 18 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_26 19 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_25 20 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_47 21 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_28 22 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_11 23 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_12 24 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_18 25 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_10 26 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_49 27 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_6 28 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_41 29 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_4 30 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_48 31 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_36 32 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_29 33 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_37 34 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_23 35 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_14 36 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_20 37 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_19 38 | - 
../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_5 39 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_2 40 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_45 41 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_9 42 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_13 43 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_30 44 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_8 45 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_39 46 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_46 47 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_32 48 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_24 49 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_27 50 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_40 51 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_42 52 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_22 53 | - test_keys: 54 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_34 55 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_50 56 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_11 57 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_19 58 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_10 59 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_27 60 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_16 61 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_1 62 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_37 63 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_15 64 | train_keys: 65 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_20 66 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_22 67 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_23 68 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_18 69 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_9 70 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_8 71 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_6 72 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_30 73 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_32 74 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_49 75 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_17 76 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_31 77 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_39 78 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_21 79 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_45 80 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_24 81 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_42 82 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_5 83 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_14 84 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_41 85 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_43 86 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_28 87 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_12 88 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_13 89 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_47 90 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_26 91 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_35 92 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_44 93 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_2 94 | - 
../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_46 95 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_4 96 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_3 97 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_40 98 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_33 99 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_36 100 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_7 101 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_29 102 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_25 103 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_48 104 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_38 105 | - test_keys: 106 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_26 107 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_24 108 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_23 109 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_41 110 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_16 111 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_43 112 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_1 113 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_36 114 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_50 115 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_18 116 | train_keys: 117 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_4 118 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_28 119 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_47 120 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_29 121 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_44 122 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_35 123 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_39 124 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_37 125 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_22 126 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_6 127 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_3 128 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_17 129 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_19 130 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_27 131 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_34 132 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_49 133 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_40 134 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_8 135 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_46 136 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_38 137 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_31 138 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_48 139 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_13 140 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_2 141 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_20 142 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_5 143 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_21 144 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_33 145 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_15 146 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_45 147 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_25 148 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_7 149 | - 
../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_30 150 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_42 151 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_32 152 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_14 153 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_10 154 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_12 155 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_9 156 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_11 157 | - test_keys: 158 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_19 159 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_26 160 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_42 161 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_11 162 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_40 163 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_46 164 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_29 165 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_18 166 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_20 167 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_44 168 | train_keys: 169 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_28 170 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_27 171 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_5 172 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_49 173 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_1 174 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_30 175 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_7 176 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_39 177 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_38 178 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_34 179 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_22 180 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_24 181 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_43 182 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_15 183 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_25 184 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_10 185 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_14 186 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_37 187 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_47 188 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_6 189 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_21 190 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_17 191 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_2 192 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_4 193 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_48 194 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_36 195 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_13 196 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_9 197 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_23 198 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_3 199 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_32 200 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_45 201 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_50 202 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_8 203 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_12 204 | - 
../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_16 205 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_33 206 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_31 207 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_35 208 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_41 209 | - test_keys: 210 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_35 211 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_41 212 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_50 213 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_44 214 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_15 215 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_22 216 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_16 217 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_28 218 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_48 219 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_17 220 | train_keys: 221 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_7 222 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_9 223 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_21 224 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_34 225 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_20 226 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_37 227 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_3 228 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_31 229 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_30 230 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_46 231 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_11 232 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_12 233 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_18 234 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_14 235 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_24 236 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_1 237 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_47 238 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_4 239 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_13 240 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_25 241 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_5 242 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_38 243 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_36 244 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_2 245 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_23 246 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_29 247 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_33 248 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_43 249 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_10 250 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_40 251 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_8 252 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_45 253 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_19 254 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_39 255 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_27 256 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_6 257 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_26 258 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_32 259 | - 
../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_49 260 | - ../datasets/eccv16_dataset_tvsum_google_pool5.h5/video_42 261 | -------------------------------------------------------------------------------- /src/anchor_based/anchor_helper.py: -------------------------------------------------------------------------------- 1 | from typing import List, Tuple 2 | 3 | import numpy as np 4 | 5 | from helpers import bbox_helper 6 | 7 | 8 | def get_anchors(seq_len: int, scales: List[int]) -> np.ndarray: 9 | """Generate all multi-scale anchors for a sequence in center-width format. 10 | 11 | :param seq_len: Sequence length. 12 | :param scales: List of bounding box widths. 13 | :return: All anchors in center-width format. 14 | """ 15 | anchors = np.zeros((seq_len, len(scales), 2), dtype=np.int32) 16 | for pos in range(seq_len): 17 | for scale_idx, scale in enumerate(scales): 18 | anchors[pos][scale_idx] = [pos, scale] 19 | return anchors 20 | 21 | 22 | def get_pos_label(anchors: np.ndarray, 23 | targets: np.ndarray, 24 | iou_thresh: float 25 | ) -> Tuple[np.ndarray, np.ndarray]: 26 | """Generate positive samples for training. 27 | 28 | :param anchors: List of CW anchors 29 | :param targets: List of CW target bounding boxes 30 | :param iou_thresh: If IoU between a target bounding box and any anchor is 31 | higher than this threshold, the target is regarded as a positive sample. 32 | :return: Class and location offset labels 33 | """ 34 | seq_len, num_scales, _ = anchors.shape 35 | anchors = np.reshape(anchors, (seq_len * num_scales, 2)) 36 | 37 | loc_label = np.zeros((seq_len * num_scales, 2)) 38 | cls_label = np.zeros(seq_len * num_scales, dtype=np.int32) 39 | 40 | for target in targets: 41 | target = np.tile(target, (seq_len * num_scales, 1)) 42 | iou = bbox_helper.iou_cw(anchors, target) 43 | pos_idx = np.where(iou > iou_thresh) 44 | cls_label[pos_idx] = 1 45 | loc_label[pos_idx] = bbox2offset(target[pos_idx], anchors[pos_idx]) 46 | 47 | loc_label = loc_label.reshape((seq_len, num_scales, 2)) 48 | cls_label = cls_label.reshape((seq_len, num_scales)) 49 | 50 | return cls_label, loc_label 51 | 52 | 53 | def get_neg_label(cls_label: np.ndarray, num_neg: int) -> np.ndarray: 54 | """Generate random negative samples. 55 | 56 | :param cls_label: Class labels including only positive samples. 57 | :param num_neg: Number of negative samples. 58 | :return: Label with original positive samples (marked by 1), negative 59 | samples (marked by -1), and ignored samples (marked by 0) 60 | """ 61 | seq_len, num_scales = cls_label.shape 62 | cls_label = cls_label.copy().reshape(-1) 63 | cls_label[cls_label < 0] = 0 # reset negative samples 64 | 65 | neg_idx, = np.where(cls_label == 0) 66 | np.random.shuffle(neg_idx) 67 | neg_idx = neg_idx[:num_neg] 68 | 69 | cls_label[neg_idx] = -1 70 | cls_label = np.reshape(cls_label, (seq_len, num_scales)) 71 | return cls_label 72 | 73 | 74 | def offset2bbox(offsets: np.ndarray, anchors: np.ndarray) -> np.ndarray: 75 | """Convert predicted offsets to CW bounding boxes. 76 | 77 | :param offsets: Predicted offsets. 78 | :param anchors: Sequence anchors. 79 | :return: Predicted bounding boxes. 
80 | """ 81 | offsets = offsets.reshape(-1, 2) 82 | anchors = anchors.reshape(-1, 2) 83 | 84 | offset_center, offset_width = offsets[:, 0], offsets[:, 1] 85 | anchor_center, anchor_width = anchors[:, 0], anchors[:, 1] 86 | 87 | # Tc = Oc * Aw + Ac 88 | bbox_center = offset_center * anchor_width + anchor_center 89 | # Tw = exp(Ow) * Aw 90 | bbox_width = np.exp(offset_width) * anchor_width 91 | 92 | bbox = np.vstack((bbox_center, bbox_width)).T 93 | return bbox 94 | 95 | 96 | def bbox2offset(bboxes: np.ndarray, anchors: np.ndarray) -> np.ndarray: 97 | """Convert bounding boxes to offset labels. 98 | 99 | :param bboxes: List of CW bounding boxes. 100 | :param anchors: List of CW anchors. 101 | :return: Offsets labels for training. 102 | """ 103 | bbox_center, bbox_width = bboxes[:, 0], bboxes[:, 1] 104 | anchor_center, anchor_width = anchors[:, 0], anchors[:, 1] 105 | 106 | # Oc = (Tc - Ac) / Aw 107 | offset_center = (bbox_center - anchor_center) / anchor_width 108 | # Ow = ln(Tw / Aw) 109 | offset_width = np.log(bbox_width / anchor_width) 110 | 111 | offset = np.vstack((offset_center, offset_width)).T 112 | return offset 113 | -------------------------------------------------------------------------------- /src/anchor_based/dsnet.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch import nn 3 | 4 | from anchor_based import anchor_helper 5 | from helpers import bbox_helper 6 | from modules.models import build_base_model 7 | 8 | 9 | class DSNet(nn.Module): 10 | def __init__(self, base_model, num_feature, num_hidden, anchor_scales, 11 | num_head): 12 | super().__init__() 13 | self.anchor_scales = anchor_scales 14 | self.num_scales = len(anchor_scales) 15 | self.base_model = build_base_model(base_model, num_feature, num_head) 16 | 17 | self.roi_poolings = [nn.AvgPool1d(scale, stride=1, padding=scale // 2) 18 | for scale in anchor_scales] 19 | 20 | self.layer_norm = nn.LayerNorm(num_feature) 21 | self.fc1 = nn.Sequential( 22 | nn.Linear(num_feature, num_hidden), 23 | nn.Tanh(), 24 | nn.Dropout(0.5), 25 | nn.LayerNorm(num_hidden) 26 | ) 27 | self.fc_cls = nn.Linear(num_hidden, 1) 28 | self.fc_loc = nn.Linear(num_hidden, 2) 29 | 30 | def forward(self, x): 31 | _, seq_len, _ = x.shape 32 | out = self.base_model(x) 33 | out = out + x 34 | out = self.layer_norm(out) 35 | 36 | out = out.transpose(2, 1) 37 | pool_results = [roi_pooling(out) for roi_pooling in self.roi_poolings] 38 | out = torch.cat(pool_results, dim=0).permute(2, 0, 1)[:-1] 39 | 40 | out = self.fc1(out) 41 | 42 | pred_cls = self.fc_cls(out).sigmoid().view(seq_len, self.num_scales) 43 | pred_loc = self.fc_loc(out).view(seq_len, self.num_scales, 2) 44 | 45 | return pred_cls, pred_loc 46 | 47 | def predict(self, seq): 48 | seq_len = seq.shape[1] 49 | pred_cls, pred_loc = self(seq) 50 | 51 | pred_cls = pred_cls.cpu().numpy().reshape(-1) 52 | pred_loc = pred_loc.cpu().numpy().reshape((-1, 2)) 53 | 54 | anchors = anchor_helper.get_anchors(seq_len, self.anchor_scales) 55 | anchors = anchors.reshape((-1, 2)) 56 | 57 | pred_bboxes = anchor_helper.offset2bbox(pred_loc, anchors) 58 | pred_bboxes = bbox_helper.cw2lr(pred_bboxes) 59 | 60 | return pred_cls, pred_bboxes 61 | -------------------------------------------------------------------------------- /src/anchor_based/losses.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.nn import functional as F 3 | 4 | 5 | def calc_loc_loss(pred_loc: torch.Tensor, 6 | test_loc: 
torch.Tensor, 7 | cls_label: torch.Tensor, 8 | use_smooth: bool = True 9 | ) -> torch.Tensor: 10 | """Compute location regression loss only on positive samples. 11 | 12 | :param pred_loc: Predicted bbox offsets. Sized [N, S, 2]. 13 | :param test_loc: Ground truth bbox offsets. Sized [N, S, 2]. 14 | :param cls_label: Class labels where 1 marks the positive samples. Sized 15 | [N, S]. 16 | :param use_smooth: If True, use smooth L1 loss. Otherwise, use L1 loss. 17 | :return: Scalar loss value. 18 | """ 19 | pos_idx = cls_label.eq(1).unsqueeze(-1).repeat((1, 1, 2))  # positive-anchor mask, expanded to both offset dims 20 | 21 | pred_loc = pred_loc[pos_idx] 22 | test_loc = test_loc[pos_idx] 23 | 24 | if use_smooth: 25 | loc_loss = F.smooth_l1_loss(pred_loc, test_loc) 26 | else: 27 | loc_loss = (pred_loc - test_loc).abs().mean() 28 | 29 | return loc_loss 30 | 31 | 32 | def calc_cls_loss(pred: torch.Tensor, test: torch.Tensor) -> torch.Tensor: 33 | """Compute classification loss. 34 | 35 | :param pred: Predicted confidence (0-1). Sized [N, S]. 36 | :param test: Class label where 1 marks positive, -1 marks negative, and 0 37 | marks ignored. Sized [N, S]. 38 | :return: Scalar loss value. 39 | """ 40 | pred = pred.view(-1) 41 | test = test.view(-1) 42 | 43 | pos_idx = test.eq(1).nonzero().squeeze(-1) 44 | pred_pos = pred[pos_idx].unsqueeze(-1) 45 | pred_pos = torch.cat([1 - pred_pos, pred_pos], dim=-1) 46 | gt_pos = torch.ones(pred_pos.shape[0], dtype=torch.long, device=pred.device) 47 | loss_pos = F.nll_loss(pred_pos.log(), gt_pos) 48 | 49 | neg_idx = test.eq(-1).nonzero().squeeze(-1) 50 | pred_neg = pred[neg_idx].unsqueeze(-1) 51 | pred_neg = torch.cat([1 - pred_neg, pred_neg], dim=-1) 52 | gt_neg = torch.zeros(pred_neg.shape[0], dtype=torch.long, 53 | device=pred.device) 54 | loss_neg = F.nll_loss(pred_neg.log(), gt_neg) 55 | 56 | loss = (loss_pos + loss_neg) * 0.5  # weight positive and negative terms equally 57 | return loss 58 | -------------------------------------------------------------------------------- /src/anchor_based/train.py: -------------------------------------------------------------------------------- 1 | import logging 2 | 3 | import numpy as np 4 | import torch 5 | from torch import nn 6 | 7 | from anchor_based import anchor_helper 8 | from anchor_based.dsnet import DSNet 9 | from anchor_based.losses import calc_cls_loss, calc_loc_loss 10 | from evaluate import evaluate 11 | from helpers import data_helper, vsumm_helper, bbox_helper 12 | 13 | logger = logging.getLogger() 14 | 15 | 16 | def xavier_init(module): 17 | cls_name = module.__class__.__name__ 18 | if 'Linear' in cls_name or 'Conv' in cls_name: 19 | nn.init.xavier_uniform_(module.weight, gain=np.sqrt(2.0)) 20 | if module.bias is not None: 21 | nn.init.constant_(module.bias, 0.1) 22 | 23 | 24 | def train(args, split, save_path): 25 | model = DSNet(base_model=args.base_model, num_feature=args.num_feature, 26 | num_hidden=args.num_hidden, anchor_scales=args.anchor_scales, 27 | num_head=args.num_head) 28 | model = model.to(args.device) 29 | 30 | model.apply(xavier_init) 31 | 32 | parameters = [p for p in model.parameters() if p.requires_grad] 33 | optimizer = torch.optim.Adam(parameters, lr=args.lr, 34 | weight_decay=args.weight_decay) 35 | 36 | max_val_fscore = -1 37 | 38 | train_set = data_helper.VideoDataset(split['train_keys']) 39 | train_loader = data_helper.DataLoader(train_set, shuffle=True) 40 | 41 | val_set = data_helper.VideoDataset(split['test_keys']) 42 | val_loader = data_helper.DataLoader(val_set, shuffle=False) 43 | 44 | for epoch in range(args.max_epoch): 45 | model.train() 46 | stats = 
data_helper.AverageMeter('loss', 'cls_loss', 'loc_loss') 47 | 48 | for _, seq, gtscore, cps, n_frames, nfps, picks, _ in train_loader: 49 | keyshot_summ = vsumm_helper.get_keyshot_summ( 50 | gtscore, cps, n_frames, nfps, picks) 51 | target = vsumm_helper.downsample_summ(keyshot_summ) 52 | 53 | if not target.any(): 54 | continue 55 | 56 | target_bboxes = bbox_helper.seq2bbox(target) 57 | target_bboxes = bbox_helper.lr2cw(target_bboxes) 58 | anchors = anchor_helper.get_anchors(target.size, args.anchor_scales) 59 | # Get class and location label for positive samples 60 | cls_label, loc_label = anchor_helper.get_pos_label( 61 | anchors, target_bboxes, args.pos_iou_thresh) 62 | 63 | # Get negative samples 64 | num_pos = cls_label.sum() 65 | cls_label_neg, _ = anchor_helper.get_pos_label( 66 | anchors, target_bboxes, args.neg_iou_thresh) 67 | cls_label_neg = anchor_helper.get_neg_label( 68 | cls_label_neg, int(args.neg_sample_ratio * num_pos)) 69 | 70 | # Get incomplete samples 71 | cls_label_incomplete, _ = anchor_helper.get_pos_label( 72 | anchors, target_bboxes, args.incomplete_iou_thresh) 73 | cls_label_incomplete[cls_label_neg != 1] = 1 74 | cls_label_incomplete = anchor_helper.get_neg_label( 75 | cls_label_incomplete, 76 | int(args.incomplete_sample_ratio * num_pos)) 77 | 78 | cls_label[cls_label_neg == -1] = -1 79 | cls_label[cls_label_incomplete == -1] = -1 80 | 81 | cls_label = torch.tensor(cls_label, dtype=torch.float32).to(args.device) 82 | loc_label = torch.tensor(loc_label, dtype=torch.float32).to(args.device) 83 | 84 | seq = torch.tensor(seq, dtype=torch.float32).unsqueeze(0).to(args.device) 85 | 86 | pred_cls, pred_loc = model(seq) 87 | 88 | loc_loss = calc_loc_loss(pred_loc, loc_label, cls_label) 89 | cls_loss = calc_cls_loss(pred_cls, cls_label) 90 | 91 | loss = cls_loss + args.lambda_reg * loc_loss 92 | 93 | optimizer.zero_grad() 94 | loss.backward() 95 | optimizer.step() 96 | 97 | stats.update(loss=loss.item(), cls_loss=cls_loss.item(), 98 | loc_loss=loc_loss.item()) 99 | 100 | val_fscore, _ = evaluate(model, val_loader, args.nms_thresh, args.device) 101 | 102 | if max_val_fscore < val_fscore: 103 | max_val_fscore = val_fscore 104 | torch.save(model.state_dict(), str(save_path)) 105 | 106 | logger.info(f'Epoch: {epoch}/{args.max_epoch} ' 107 | f'Loss: {stats.cls_loss:.4f}/{stats.loc_loss:.4f}/{stats.loss:.4f} ' 108 | f'F-score cur/max: {val_fscore:.4f}/{max_val_fscore:.4f}') 109 | 110 | return max_val_fscore 111 | -------------------------------------------------------------------------------- /src/anchor_free/anchor_free_helper.py: -------------------------------------------------------------------------------- 1 | import numpy as np 2 | 3 | from helpers import bbox_helper 4 | 5 | 6 | def get_loc_label(target: np.ndarray) -> np.ndarray: 7 | """Generate location offset label from ground truth summary. 8 | 9 | :param target: Ground truth summary. Sized [N]. 10 | :return: Location offset label in LR format. Sized [N, 2]. 11 | """ 12 | seq_len, = target.shape 13 | 14 | bboxes = bbox_helper.seq2bbox(target) 15 | offsets = bbox2offset(bboxes, seq_len) 16 | 17 | return offsets 18 | 19 | 20 | def get_ctr_label(target: np.ndarray, 21 | offset: np.ndarray, 22 | eps: float = 1e-8 23 | ) -> np.ndarray: 24 | """Generate centerness label for ground truth summary. 25 | 26 | :param target: Ground truth summary. Sized [N]. 27 | :param offset: LR offset corresponding to target. Sized [N, 2]. 28 | :param eps: Small floating value to prevent division by zero. 29 | :return: Centerness label. 
Sized [N]. 30 | """ 31 | target = np.asarray(target, dtype=bool)  # np.bool was removed in NumPy 1.24; builtin bool is equivalent 32 | ctr_label = np.zeros(target.shape, dtype=np.float32) 33 | 34 | offset_left, offset_right = offset[target, 0], offset[target, 1] 35 | ctr_label[target] = np.minimum(offset_left, offset_right) / ( 36 | np.maximum(offset_left, offset_right) + eps)  # FCOS-style centerness in (0, 1] 37 | 38 | return ctr_label 39 | 40 | 41 | def bbox2offset(bboxes: np.ndarray, seq_len: int) -> np.ndarray: 42 | """Convert LR bounding boxes to LR offsets. 43 | 44 | :param bboxes: LR bounding boxes. 45 | :param seq_len: Sequence length N. 46 | :return: LR offsets. Sized [N, 2]. 47 | """ 48 | pos_idx = np.arange(seq_len, dtype=np.float32) 49 | offsets = np.zeros((seq_len, 2), dtype=np.float32) 50 | 51 | for lo, hi in bboxes: 52 | bbox_pos = pos_idx[lo:hi] 53 | offsets[lo:hi] = np.vstack((bbox_pos - lo, hi - 1 - bbox_pos)).T 54 | 55 | return offsets 56 | 57 | 58 | def offset2bbox(offsets: np.ndarray) -> np.ndarray: 59 | """Convert LR offsets to bounding boxes. 60 | 61 | :param offsets: LR offsets. Sized [N, 2]. 62 | :return: Bounding boxes corresponding to offsets. Sized [N, 2]. 63 | """ 64 | offset_left, offset_right = offsets[:, 0], offsets[:, 1] 65 | seq_len, _ = offsets.shape 66 | indices = np.arange(seq_len) 67 | bbox_left = indices - offset_left 68 | bbox_right = indices + offset_right + 1 69 | bboxes = np.vstack((bbox_left, bbox_right)).T 70 | return bboxes 71 | -------------------------------------------------------------------------------- /src/anchor_free/dsnet_af.py: -------------------------------------------------------------------------------- 1 | from torch import nn 2 | 3 | from anchor_free import anchor_free_helper 4 | from modules.models import build_base_model 5 | 6 | 7 | class DSNetAF(nn.Module): 8 | def __init__(self, base_model, num_feature, num_hidden, num_head): 9 | super().__init__() 10 | self.base_model = build_base_model(base_model, num_feature, num_head) 11 | self.layer_norm = nn.LayerNorm(num_feature) 12 | 13 | self.fc1 = nn.Sequential( 14 | nn.Linear(num_feature, num_hidden), 15 | nn.ReLU(inplace=True), 16 | nn.Dropout(0.5), 17 | nn.LayerNorm(num_hidden) 18 | ) 19 | self.fc_cls = nn.Linear(num_hidden, 1) 20 | self.fc_loc = nn.Linear(num_hidden, 2) 21 | self.fc_ctr = nn.Linear(num_hidden, 1) 22 | 23 | def forward(self, x): 24 | _, seq_len, _ = x.shape 25 | out = self.base_model(x) 26 | 27 | out = out + x 28 | out = self.layer_norm(out) 29 | 30 | out = self.fc1(out) 31 | 32 | pred_cls = self.fc_cls(out).sigmoid().view(seq_len) 33 | pred_loc = self.fc_loc(out).exp().view(seq_len, 2)  # exp keeps predicted offsets positive 34 | 35 | pred_ctr = self.fc_ctr(out).sigmoid().view(seq_len) 36 | 37 | return pred_cls, pred_loc, pred_ctr 38 | 39 | def predict(self, seq): 40 | pred_cls, pred_loc, pred_ctr = self(seq) 41 | 42 | pred_cls *= pred_ctr 43 | pred_cls /= pred_cls.max() + 1e-8 44 | 45 | pred_cls = pred_cls.cpu().numpy() 46 | pred_loc = pred_loc.cpu().numpy() 47 | 48 | pred_bboxes = anchor_free_helper.offset2bbox(pred_loc) 49 | return pred_cls, pred_bboxes 50 | -------------------------------------------------------------------------------- /src/anchor_free/losses.py: -------------------------------------------------------------------------------- 1 | import torch 2 | from torch.nn import functional as F 3 | 4 | 5 | def calc_cls_loss(pred: torch.Tensor, 6 | test: torch.Tensor, 7 | kind: str = 'focal' 8 | ) -> torch.Tensor: 9 | """Compute classification loss on both positive and negative samples. 10 | 11 | :param pred: Predicted confidence (0-1). Sized [N, S]. 
12 | :param test: Class label where 1 marks positive and 0 marks negative (this 13 | loss assumes binary labels). Sized [N, S]. 14 | :param kind: Loss type. Choose from (focal, cross-entropy). 15 | :return: Scalar loss value. 16 | """ 17 | test = test.type(torch.long) 18 | num_pos = test.sum() 19 | 20 | pred = pred.unsqueeze(-1) 21 | pred = torch.cat([1 - pred, pred], dim=-1) 22 | 23 | if kind == 'focal': 24 | loss = focal_loss(pred, test, reduction='sum') 25 | elif kind == 'cross-entropy': 26 | loss = F.nll_loss(pred.log(), test) 27 | else: 28 | raise ValueError(f'Invalid loss type {kind}') 29 | 30 | loss = loss / num_pos  # normalize by the number of positive samples 31 | return loss 32 | 33 | 34 | def iou_offset(offset_a: torch.Tensor, 35 | offset_b: torch.Tensor, 36 | eps: float = 1e-8 37 | ) -> torch.Tensor: 38 | """Compute IoU between multiple pairs of LR offsets. 39 | 40 | :param offset_a: Offsets of N positions. Sized [N, 2]. 41 | :param offset_b: Offsets of N positions. Sized [N, 2]. 42 | :param eps: Small floating value to prevent division by zero. 43 | :return: IoU values of N positions. Sized [N]. 44 | """ 45 | left_a, right_a = offset_a[:, 0], offset_a[:, 1] 46 | left_b, right_b = offset_b[:, 0], offset_b[:, 1] 47 | 48 | length_a = left_a + right_a 49 | length_b = left_b + right_b 50 | 51 | intersect = torch.min(left_a, left_b) + torch.min(right_a, right_b) 52 | intersect[intersect < 0] = 0 53 | union = length_a + length_b - intersect 54 | union[union <= 0] = eps 55 | 56 | iou = intersect / union 57 | return iou 58 | 59 | 60 | def calc_loc_loss(pred_loc: torch.Tensor, 61 | test_loc: torch.Tensor, 62 | cls_label: torch.Tensor, 63 | kind: str = 'soft-iou', 64 | eps: float = 1e-8 65 | ) -> torch.Tensor: 66 | """Compute soft IoU loss for regression only on positive samples. 67 | 68 | :param pred_loc: Predicted offsets. Sized [N, 2]. 69 | :param test_loc: Ground truth offsets. Sized [N, 2]. 70 | :param cls_label: Class label specifying positive samples. 71 | :param kind: Loss type. Choose from (soft-iou, smooth-l1). 72 | :param eps: Small floating value to prevent division by zero. 73 | :return: Scalar loss value. 74 | """ 75 | cls_label = cls_label.type(torch.bool) 76 | pred_loc = pred_loc[cls_label] 77 | test_loc = test_loc[cls_label] 78 | 79 | if kind == 'soft-iou': 80 | iou = iou_offset(pred_loc, test_loc) 81 | loss = -torch.log(iou + eps).mean() 82 | elif kind == 'smooth-l1': 83 | loss = F.smooth_l1_loss(pred_loc, test_loc) 84 | else: 85 | raise ValueError(f'Invalid loss type {kind}') 86 | 87 | return loss 88 | 89 | 90 | def calc_ctr_loss(pred, test, pos_mask):  # BCE between predicted and target centerness on positive positions 91 | pos_mask = pos_mask.type(torch.bool) 92 | 93 | pred = pred[pos_mask] 94 | test = test[pos_mask] 95 | 96 | loss = F.binary_cross_entropy(pred, test) 97 | return loss 98 | 99 | 100 | def one_hot_embedding(labels: torch.Tensor, num_classes: int) -> torch.Tensor: 101 | """Embed labels in one-hot form. 102 | 103 | :param labels: Class labels. Sized [N]. 104 | :param num_classes: Number of classes. 105 | :return: One-hot encoded labels. Sized [N, #classes]. 106 | """ 107 | eye = torch.eye(num_classes, device=labels.device) 108 | return eye[labels] 109 | 110 | 111 | def focal_loss(x: torch.Tensor, 112 | y: torch.Tensor, 113 | alpha: float = 0.25, 114 | gamma: float = 2, 115 | reduction: str = 'sum' 116 | ) -> torch.Tensor: 117 | """Compute focal loss for binary classification. 118 | FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t) 119 | 120 | :param x: Predicted confidence. Sized [N, D]. 121 | :param y: Ground truth label. Sized [N]. 
122 | :param alpha: Alpha parameter in focal loss. 123 | :param gamma: Gamma parameter in focal loss. 124 | :param reduction: Aggregation type. Choose from (sum, mean, none). 125 | :return: Scalar loss value. 126 | """ 127 | _, num_classes = x.shape 128 | 129 | t = one_hot_embedding(y, num_classes) 130 | 131 | # p_t = p if t > 0 else 1-p 132 | p_t = x * t + (1 - x) * (1 - t) 133 | # alpha_t = alpha if t > 0 else 1-alpha 134 | alpha_t = alpha * t + (1 - alpha) * (1 - t) 135 | # FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t) 136 | fl = -alpha_t * (1 - p_t).pow(gamma) * p_t.log() 137 | 138 | if reduction == 'sum': 139 | fl = fl.sum() 140 | elif reduction == 'mean': 141 | fl = fl.mean() 142 | elif reduction == 'none': 143 | pass 144 | else: 145 | raise ValueError(f'Invalid reduction mode {reduction}') 146 | 147 | return fl 148 | 149 | 150 | def focal_loss_with_logits(x, y, reduction='sum'): 151 | """Compute focal loss with logits input""" 152 | return focal_loss(x.sigmoid(), y, reduction=reduction) 153 | -------------------------------------------------------------------------------- /src/anchor_free/train.py: -------------------------------------------------------------------------------- 1 | import logging 2 | 3 | import torch 4 | 5 | from anchor_free import anchor_free_helper 6 | from anchor_free.dsnet_af import DSNetAF 7 | from anchor_free.losses import calc_ctr_loss, calc_cls_loss, calc_loc_loss 8 | from evaluate import evaluate 9 | from helpers import data_helper, vsumm_helper 10 | 11 | logger = logging.getLogger() 12 | 13 | 14 | def train(args, split, save_path): 15 | model = DSNetAF(base_model=args.base_model, num_feature=args.num_feature, 16 | num_hidden=args.num_hidden, num_head=args.num_head) 17 | model = model.to(args.device) 18 | 19 | model.train() 20 | 21 | parameters = [p for p in model.parameters() if p.requires_grad] 22 | optimizer = torch.optim.Adam(parameters, lr=args.lr, 23 | weight_decay=args.weight_decay) 24 | 25 | max_val_fscore = -1 26 | 27 | train_set = data_helper.VideoDataset(split['train_keys']) 28 | train_loader = data_helper.DataLoader(train_set, shuffle=True) 29 | 30 | val_set = data_helper.VideoDataset(split['test_keys']) 31 | val_loader = data_helper.DataLoader(val_set, shuffle=False) 32 | 33 | for epoch in range(args.max_epoch): 34 | model.train() 35 | stats = data_helper.AverageMeter('loss', 'cls_loss', 'loc_loss', 36 | 'ctr_loss') 37 | 38 | for _, seq, gtscore, change_points, n_frames, nfps, picks, _ in train_loader: 39 | keyshot_summ = vsumm_helper.get_keyshot_summ( 40 | gtscore, change_points, n_frames, nfps, picks) 41 | target = vsumm_helper.downsample_summ(keyshot_summ) 42 | 43 | if not target.any(): 44 | continue 45 | 46 | seq = torch.tensor(seq, dtype=torch.float32).unsqueeze(0).to(args.device) 47 | 48 | cls_label = target 49 | loc_label = anchor_free_helper.get_loc_label(target) 50 | ctr_label = anchor_free_helper.get_ctr_label(target, loc_label) 51 | 52 | pred_cls, pred_loc, pred_ctr = model(seq) 53 | 54 | cls_label = torch.tensor(cls_label, dtype=torch.float32).to(args.device) 55 | loc_label = torch.tensor(loc_label, dtype=torch.float32).to(args.device) 56 | ctr_label = torch.tensor(ctr_label, dtype=torch.float32).to(args.device) 57 | 58 | cls_loss = calc_cls_loss(pred_cls, cls_label, args.cls_loss) 59 | loc_loss = calc_loc_loss(pred_loc, loc_label, cls_label, 60 | args.reg_loss) 61 | ctr_loss = calc_ctr_loss(pred_ctr, ctr_label, cls_label) 62 | 63 | loss = cls_loss + args.lambda_reg * loc_loss + args.lambda_ctr * ctr_loss 64 | 65 | 
optimizer.zero_grad() 66 | loss.backward() 67 | optimizer.step() 68 | 69 | stats.update(loss=loss.item(), cls_loss=cls_loss.item(), 70 | loc_loss=loc_loss.item(), ctr_loss=ctr_loss.item()) 71 | 72 | val_fscore, _ = evaluate(model, val_loader, args.nms_thresh, args.device) 73 | 74 | if max_val_fscore < val_fscore: 75 | max_val_fscore = val_fscore 76 | torch.save(model.state_dict(), str(save_path)) 77 | 78 | logger.info(f'Epoch: {epoch}/{args.max_epoch} ' 79 | f'Loss: {stats.cls_loss:.4f}/{stats.loc_loss:.4f}/{stats.ctr_loss:.4f}/{stats.loss:.4f} ' 80 | f'F-score cur/max: {val_fscore:.4f}/{max_val_fscore:.4f}') 81 | 82 | return max_val_fscore 83 | -------------------------------------------------------------------------------- /src/evaluate.py: -------------------------------------------------------------------------------- 1 | import logging 2 | from pathlib import Path 3 | 4 | import numpy as np 5 | import torch 6 | 7 | from helpers import init_helper, data_helper, vsumm_helper, bbox_helper 8 | from modules.model_zoo import get_model 9 | 10 | logger = logging.getLogger() 11 | 12 | 13 | def evaluate(model, val_loader, nms_thresh, device): 14 | model.eval() 15 | stats = data_helper.AverageMeter('fscore', 'diversity') 16 | 17 | with torch.no_grad(): 18 | for test_key, seq, _, cps, n_frames, nfps, picks, user_summary in val_loader: 19 | seq_len = len(seq) 20 | seq_torch = torch.from_numpy(seq).unsqueeze(0).to(device) 21 | 22 | pred_cls, pred_bboxes = model.predict(seq_torch) 23 | 24 | pred_bboxes = np.clip(pred_bboxes, 0, seq_len).round().astype(np.int32) 25 | 26 | pred_cls, pred_bboxes = bbox_helper.nms(pred_cls, pred_bboxes, nms_thresh) 27 | pred_summ = vsumm_helper.bbox2summary( 28 | seq_len, pred_cls, pred_bboxes, cps, n_frames, nfps, picks) 29 | 30 | eval_metric = 'avg' if 'tvsum' in test_key else 'max' 31 | fscore = vsumm_helper.get_summ_f1score( 32 | pred_summ, user_summary, eval_metric) 33 | 34 | pred_summ = vsumm_helper.downsample_summ(pred_summ) 35 | diversity = vsumm_helper.get_summ_diversity(pred_summ, seq) 36 | stats.update(fscore=fscore, diversity=diversity) 37 | 38 | return stats.fscore, stats.diversity 39 | 40 | 41 | def main(): 42 | args = init_helper.get_arguments() 43 | 44 | init_helper.init_logger(args.model_dir, args.log_file) 45 | init_helper.set_random_seed(args.seed) 46 | 47 | logger.info(vars(args)) 48 | model = get_model(args.model, **vars(args)) 49 | model = model.eval().to(args.device) 50 | 51 | for split_path in args.splits: 52 | split_path = Path(split_path) 53 | splits = data_helper.load_yaml(split_path) 54 | 55 | stats = data_helper.AverageMeter('fscore', 'diversity') 56 | 57 | for split_idx, split in enumerate(splits): 58 | ckpt_path = data_helper.get_ckpt_path(args.model_dir, split_path, split_idx) 59 | state_dict = torch.load(str(ckpt_path), 60 | map_location=lambda storage, loc: storage) 61 | model.load_state_dict(state_dict) 62 | 63 | val_set = data_helper.VideoDataset(split['test_keys']) 64 | val_loader = data_helper.DataLoader(val_set, shuffle=False) 65 | 66 | fscore, diversity = evaluate(model, val_loader, args.nms_thresh, args.device) 67 | stats.update(fscore=fscore, diversity=diversity) 68 | 69 | logger.info(f'{split_path.stem} split {split_idx}: diversity: ' 70 | f'{diversity:.4f}, F-score: {fscore:.4f}') 71 | 72 | logger.info(f'{split_path.stem}: diversity: {stats.diversity:.4f}, ' 73 | f'F-score: {stats.fscore:.4f}') 74 | 75 | 76 | if __name__ == '__main__': 77 | main() 78 | 
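A quick usage sketch for this evaluation entry point (the model directory below is a hypothetical path, not something shipped with the repository; the flags come from `src/helpers/init_helper.py` and the checkpoint layout from `get_ckpt_path` in `src/helpers/data_helper.py`):

```python
# Hypothetical evaluation run over two canonical split files. Checkpoints
# saved by train.py are expected under <model-dir>/checkpoint/, named
# <split-file>.<fold>.pt (e.g. tvsum.yml.0.pt), per get_ckpt_path():
#
#   python evaluate.py anchor-based \
#       --model-dir ../models/ab-basic \
#       --splits ../splits/tvsum.yml ../splits/summe.yml
#
# For each split file, main() loads one checkpoint per cross-validation
# fold, runs evaluate() on that fold's test keys, and logs per-fold and
# averaged F-score and diversity.
```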
--------------------------------------------------------------------------------
/src/helpers/bbox_helper.py:
--------------------------------------------------------------------------------
1 | from itertools import groupby
2 | from operator import itemgetter
3 | from typing import Tuple
4 | 
5 | import numpy as np
6 | 
7 | 
8 | def lr2cw(bbox_lr: np.ndarray) -> np.ndarray:
9 |     """Convert bounding boxes from left-right (LR) to center-width (CW) format.
10 | 
11 |     :param bbox_lr: LR bounding boxes. Sized [N, 2].
12 |     :return: CW bounding boxes. Sized [N, 2].
13 |     """
14 |     bbox_lr = np.asarray(bbox_lr, dtype=np.float32).reshape((-1, 2))
15 |     center = (bbox_lr[:, 0] + bbox_lr[:, 1]) / 2
16 |     width = bbox_lr[:, 1] - bbox_lr[:, 0]
17 |     bbox_cw = np.vstack((center, width)).T
18 |     return bbox_cw
19 | 
20 | 
21 | def cw2lr(bbox_cw: np.ndarray) -> np.ndarray:
22 |     """Convert bounding boxes from center-width (CW) to left-right (LR) format.
23 | 
24 |     :param bbox_cw: CW bounding boxes. Sized [N, 2].
25 |     :return: LR bounding boxes. Sized [N, 2].
26 |     """
27 |     bbox_cw = np.asarray(bbox_cw, dtype=np.float32).reshape((-1, 2))
28 |     left = bbox_cw[:, 0] - bbox_cw[:, 1] / 2
29 |     right = bbox_cw[:, 0] + bbox_cw[:, 1] / 2
30 |     bbox_lr = np.vstack((left, right)).T
31 |     return bbox_lr
32 | 
33 | 
34 | def seq2bbox(sequence: np.ndarray) -> np.ndarray:
35 |     """Generate LR bboxes from a binary sequence mask."""
36 |     sequence = np.asarray(sequence, dtype=np.bool)
37 |     selected_indices, = np.where(sequence == 1)
38 | 
39 |     bboxes_lr = []
40 |     for k, g in groupby(enumerate(selected_indices), lambda x: x[0] - x[1]):
41 |         segment = list(map(itemgetter(1), g))
42 |         start_frame, end_frame = segment[0], segment[-1] + 1
43 |         bboxes_lr.append([start_frame, end_frame])
44 | 
45 |     bboxes_lr = np.asarray(bboxes_lr, dtype=np.int32)
46 |     return bboxes_lr
47 | 
48 | 
49 | def iou_lr(anchor_bbox: np.ndarray, target_bbox: np.ndarray) -> np.ndarray:
50 |     """Compute IoU between multiple LR bbox pairs.
51 | 
52 |     :param anchor_bbox: LR anchor bboxes. Sized [N, 2].
53 |     :param target_bbox: LR target bboxes. Sized [N, 2].
54 |     :return: IoU between each bbox pair. Sized [N].
55 |     """
56 |     anchor_left, anchor_right = anchor_bbox[:, 0], anchor_bbox[:, 1]
57 |     target_left, target_right = target_bbox[:, 0], target_bbox[:, 1]
58 | 
59 |     inter_left = np.maximum(anchor_left, target_left)
60 |     inter_right = np.minimum(anchor_right, target_right)
61 |     union_left = np.minimum(anchor_left, target_left)
62 |     union_right = np.maximum(anchor_right, target_right)
63 | 
64 |     intersect = inter_right - inter_left
65 |     intersect[intersect < 0] = 0
66 |     union = union_right - union_left
67 |     union[union <= 0] = 1e-6
68 | 
69 |     iou = intersect / union
70 |     return iou
71 | 
72 | 
73 | def iou_cw(anchor_bbox: np.ndarray, target_bbox: np.ndarray) -> np.ndarray:
74 |     """Compute IoU between multiple CW bbox pairs. See ``iou_lr``."""
75 |     anchor_bbox_lr = cw2lr(anchor_bbox)
76 |     target_bbox_lr = cw2lr(target_bbox)
77 |     return iou_lr(anchor_bbox_lr, target_bbox_lr)
78 | 
79 | 
80 | def nms(scores: np.ndarray,
81 |         bboxes: np.ndarray,
82 |         thresh: float) -> Tuple[np.ndarray, np.ndarray]:
83 |     """Non-Maximum Suppression (NMS) algorithm on 1D bboxes.
84 | 
85 |     :param scores: List of confidence scores for bboxes. Sized [N].
86 |     :param bboxes: List of LR bboxes. Sized [N, 2].
87 |     :param thresh: IoU threshold. Overlapping bboxes with IoU higher than this
88 |         threshold are filtered out.
89 |     :return: Remaining bboxes and their scores.
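
        Illustrative example (hypothetical values, not part of the original
        docstring): with LR bboxes [[0, 10], [2, 12], [20, 30]], scores
        [0.9, 0.8, 0.3] and thresh=0.5, the box [0, 10] is kept first;
        [2, 12] is suppressed since its IoU with [0, 10] is 8/12 ~= 0.67 >
        0.5; [20, 30] does not overlap [0, 10] and is kept, so the call
        returns scores [0.9, 0.3] and the two surviving bboxes.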
90 | """ 91 | valid_idx = bboxes[:, 0] < bboxes[:, 1] 92 | scores = scores[valid_idx] 93 | bboxes = bboxes[valid_idx] 94 | 95 | arg_desc = scores.argsort()[::-1] 96 | 97 | scores_remain = scores[arg_desc] 98 | bboxes_remain = bboxes[arg_desc] 99 | 100 | keep_bboxes = [] 101 | keep_scores = [] 102 | 103 | while bboxes_remain.size > 0: 104 | bbox = bboxes_remain[0] 105 | score = scores_remain[0] 106 | keep_bboxes.append(bbox) 107 | keep_scores.append(score) 108 | 109 | iou = iou_lr(bboxes_remain, np.expand_dims(bbox, axis=0)) 110 | 111 | keep_indices = (iou < thresh) 112 | bboxes_remain = bboxes_remain[keep_indices] 113 | scores_remain = scores_remain[keep_indices] 114 | 115 | keep_bboxes = np.asarray(keep_bboxes, dtype=bboxes.dtype) 116 | keep_scores = np.asarray(keep_scores, dtype=scores.dtype) 117 | 118 | return keep_scores, keep_bboxes 119 | -------------------------------------------------------------------------------- /src/helpers/data_helper.py: -------------------------------------------------------------------------------- 1 | import random 2 | from os import PathLike 3 | from pathlib import Path 4 | from typing import Any, List, Dict 5 | 6 | import h5py 7 | import numpy as np 8 | import yaml 9 | 10 | 11 | class VideoDataset(object): 12 | def __init__(self, keys: List[str]): 13 | self.keys = keys 14 | self.datasets = self.get_datasets(keys) 15 | 16 | def __getitem__(self, index): 17 | key = self.keys[index] 18 | video_path = Path(key) 19 | dataset_name = str(video_path.parent) 20 | video_name = video_path.name 21 | video_file = self.datasets[dataset_name][video_name] 22 | 23 | seq = video_file['features'][...].astype(np.float32) 24 | gtscore = video_file['gtscore'][...].astype(np.float32) 25 | cps = video_file['change_points'][...].astype(np.int32) 26 | n_frames = video_file['n_frames'][...].astype(np.int32) 27 | nfps = video_file['n_frame_per_seg'][...].astype(np.int32) 28 | picks = video_file['picks'][...].astype(np.int32) 29 | user_summary = None 30 | if 'user_summary' in video_file: 31 | user_summary = video_file['user_summary'][...].astype(np.float32) 32 | 33 | gtscore -= gtscore.min() 34 | gtscore /= gtscore.max() 35 | 36 | return key, seq, gtscore, cps, n_frames, nfps, picks, user_summary 37 | 38 | def __len__(self): 39 | return len(self.keys) 40 | 41 | @staticmethod 42 | def get_datasets(keys: List[str]) -> Dict[str, h5py.File]: 43 | dataset_paths = {str(Path(key).parent) for key in keys} 44 | datasets = {path: h5py.File(path, 'r') for path in dataset_paths} 45 | return datasets 46 | 47 | 48 | class DataLoader(object): 49 | def __init__(self, dataset: VideoDataset, shuffle: bool): 50 | self.dataset = dataset 51 | self.shuffle = shuffle 52 | self.data_idx = list(range(len(self.dataset))) 53 | 54 | def __iter__(self): 55 | self.iter_idx = 0 56 | if self.shuffle: 57 | random.shuffle(self.data_idx) 58 | return self 59 | 60 | def __next__(self): 61 | if self.iter_idx == len(self.dataset): 62 | raise StopIteration 63 | curr_idx = self.data_idx[self.iter_idx] 64 | batch = self.dataset[curr_idx] 65 | self.iter_idx += 1 66 | return batch 67 | 68 | 69 | class AverageMeter(object): 70 | def __init__(self, *keys: str): 71 | self.totals = {key: 0.0 for key in keys} 72 | self.counts = {key: 0 for key in keys} 73 | 74 | def update(self, **kwargs: float) -> None: 75 | for key, value in kwargs.items(): 76 | self._check_attr(key) 77 | self.totals[key] += value 78 | self.counts[key] += 1 79 | 80 | def __getattr__(self, attr: str) -> float: 81 | self._check_attr(attr) 82 | total = 
self.totals[attr] 83 | count = self.counts[attr] 84 | return total / count if count else 0.0 85 | 86 | def _check_attr(self, attr: str) -> None: 87 | assert attr in self.totals and attr in self.counts 88 | 89 | 90 | def get_ckpt_dir(model_dir: PathLike) -> Path: 91 | return Path(model_dir) / 'checkpoint' 92 | 93 | 94 | def get_ckpt_path(model_dir: PathLike, 95 | split_path: PathLike, 96 | split_index: int) -> Path: 97 | split_path = Path(split_path) 98 | return get_ckpt_dir(model_dir) / f'{split_path.name}.{split_index}.pt' 99 | 100 | 101 | def load_yaml(path: PathLike) -> Any: 102 | with open(path) as f: 103 | obj = yaml.safe_load(f) 104 | return obj 105 | 106 | 107 | def dump_yaml(obj: Any, path: PathLike) -> None: 108 | with open(path, 'w') as f: 109 | yaml.dump(obj, f) 110 | -------------------------------------------------------------------------------- /src/helpers/init_helper.py: -------------------------------------------------------------------------------- 1 | import argparse 2 | import logging 3 | import random 4 | from pathlib import Path 5 | 6 | import numpy as np 7 | import torch 8 | 9 | 10 | def set_random_seed(seed: int) -> None: 11 | random.seed(seed) 12 | np.random.seed(seed) 13 | torch.manual_seed(seed) 14 | 15 | 16 | def init_logger(log_dir: str, log_file: str) -> None: 17 | logger = logging.getLogger() 18 | format_str = r'[%(asctime)s] %(message)s' 19 | logging.basicConfig( 20 | level=logging.INFO, 21 | datefmt=r'%Y/%m/%d %H:%M:%S', 22 | format=format_str 23 | ) 24 | log_dir = Path(log_dir) 25 | log_dir.mkdir(parents=True, exist_ok=True) 26 | fh = logging.FileHandler(str(log_dir / log_file)) 27 | fh.setFormatter(logging.Formatter(format_str)) 28 | logger.addHandler(fh) 29 | 30 | 31 | def get_parser() -> argparse.ArgumentParser: 32 | parser = argparse.ArgumentParser() 33 | 34 | # model type 35 | parser.add_argument('model', type=str, 36 | choices=('anchor-based', 'anchor-free')) 37 | 38 | # training & evaluation 39 | parser.add_argument('--device', type=str, default='cuda', 40 | choices=('cuda', 'cpu')) 41 | parser.add_argument('--seed', type=int, default=12345) 42 | parser.add_argument('--splits', type=str, nargs='+', default=[]) 43 | parser.add_argument('--max-epoch', type=int, default=300) 44 | parser.add_argument('--model-dir', type=str, default='../models/model') 45 | parser.add_argument('--log-file', type=str, default='log.txt') 46 | parser.add_argument('--lr', type=float, default=5e-5) 47 | parser.add_argument('--weight-decay', type=float, default=1e-5) 48 | parser.add_argument('--lambda-reg', type=float, default=1.0) 49 | parser.add_argument('--nms-thresh', type=float, default=0.5) 50 | 51 | # inference 52 | parser.add_argument('--ckpt-path', type=str, default=None) 53 | parser.add_argument('--sample-rate', type=int, default=15) 54 | parser.add_argument('--source', type=str, default=None) 55 | parser.add_argument('--save-path', type=str, default=None) 56 | 57 | # common model config 58 | parser.add_argument('--base-model', type=str, default='attention', 59 | choices=['attention', 'lstm', 'linear', 'bilstm', 60 | 'gcn']) 61 | parser.add_argument('--num-head', type=int, default=8) 62 | parser.add_argument('--num-feature', type=int, default=1024) 63 | parser.add_argument('--num-hidden', type=int, default=128) 64 | 65 | # anchor based 66 | parser.add_argument('--neg-sample-ratio', type=float, default=2.0) 67 | parser.add_argument('--incomplete-sample-ratio', type=float, default=1.0) 68 | parser.add_argument('--pos-iou-thresh', type=float, default=0.6) 69 | 
    parser.add_argument('--neg-iou-thresh', type=float, default=0.0)
70 |     parser.add_argument('--incomplete-iou-thresh', type=float, default=0.3)
71 |     parser.add_argument('--anchor-scales', type=int, nargs='+',
72 |                         default=[4, 8, 16, 32])
73 | 
74 |     # anchor free
75 |     parser.add_argument('--lambda-ctr', type=float, default=1.0)
76 |     parser.add_argument('--cls-loss', type=str, default='focal',
77 |                         choices=['focal', 'cross-entropy'])
78 |     parser.add_argument('--reg-loss', type=str, default='soft-iou',
79 |                         choices=['soft-iou', 'smooth-l1'])
80 | 
81 |     return parser
82 | 
83 | 
84 | def get_arguments() -> argparse.Namespace:
85 |     parser = get_parser()
86 |     args = parser.parse_args()
87 |     return args
88 | 
--------------------------------------------------------------------------------
/src/helpers/video_helper.py:
--------------------------------------------------------------------------------
1 | from os import PathLike
2 | from pathlib import Path
3 | 
4 | import cv2
5 | import numpy as np
6 | import torch
7 | from PIL import Image
8 | from numpy import linalg
9 | from torch import nn
10 | from torchvision import transforms, models
11 | 
12 | from kts.cpd_auto import cpd_auto
13 | 
14 | 
15 | class FeatureExtractor(object):
16 |     def __init__(self):
17 |         self.preprocess = transforms.Compose([
18 |             transforms.Resize(256),
19 |             transforms.CenterCrop(224),
20 |             transforms.ToTensor(),
21 |             transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
22 |         ])
23 |         self.model = models.googlenet(pretrained=True)
24 |         self.model = nn.Sequential(*list(self.model.children())[:-2])
25 |         self.model = self.model.cuda().eval()
26 | 
27 |     def run(self, img: np.ndarray) -> np.ndarray:
28 |         img = Image.fromarray(img)
29 |         img = self.preprocess(img)
30 |         batch = img.unsqueeze(0)
31 |         with torch.no_grad():
32 |             feat = self.model(batch.cuda())
33 |             feat = feat.squeeze().cpu().numpy()
34 | 
35 |         assert feat.shape == (1024,), f'Invalid feature shape {feat.shape}: expected 1024'
36 |         # normalize frame features
37 |         feat /= linalg.norm(feat) + 1e-10
38 |         return feat
39 | 
40 | 
41 | class VideoPreprocessor(object):
42 |     def __init__(self, sample_rate: int) -> None:
43 |         self.model = FeatureExtractor()
44 |         self.sample_rate = sample_rate
45 | 
46 |     def get_features(self, video_path: PathLike):
47 |         video_path = Path(video_path)
48 |         cap = cv2.VideoCapture(str(video_path))
49 |         assert cap.isOpened(), f'Cannot open video: {video_path}'
50 | 
51 |         features = []
52 |         n_frames = 0
53 | 
54 |         while True:
55 |             ret, frame = cap.read()
56 |             if not ret:
57 |                 break
58 | 
59 |             if n_frames % self.sample_rate == 0:
60 |                 frame = cv2.cvtColor(frame, cv2.COLOR_BGR2RGB)
61 |                 feat = self.model.run(frame)
62 |                 features.append(feat)
63 | 
64 |             n_frames += 1
65 | 
66 |         cap.release()
67 | 
68 |         features = np.array(features)
69 |         return n_frames, features
70 | 
71 |     def kts(self, n_frames, features):
72 |         seq_len = len(features)
73 |         picks = np.arange(0, seq_len) * self.sample_rate
74 | 
75 |         # compute change points using KTS
76 |         kernel = np.matmul(features, features.T)
77 |         change_points, _ = cpd_auto(kernel, seq_len - 1, 1, verbose=False)
78 |         change_points *= self.sample_rate
79 |         change_points = np.hstack((0, change_points, n_frames))
80 |         begin_frames = change_points[:-1]
81 |         end_frames = change_points[1:]
82 |         change_points = np.vstack((begin_frames, end_frames - 1)).T
83 | 
84 |         n_frame_per_seg = end_frames - begin_frames
85 |         return change_points, n_frame_per_seg, picks
86 | 
87 |     def run(self, video_path: PathLike):
88 |         n_frames, features = self.get_features(video_path)
89 |         cps, nfps, picks = self.kts(n_frames, features)
90 |         return n_frames, features, cps, nfps, picks
91 | 
--------------------------------------------------------------------------------
/src/helpers/vsumm_helper.py:
--------------------------------------------------------------------------------
1 | from typing import Iterable, List
2 | 
3 | import numpy as np
4 | from ortools.algorithms.pywrapknapsack_solver import KnapsackSolver
5 | 
6 | 
7 | def f1_score(pred: np.ndarray, test: np.ndarray) -> float:
8 |     """Compute F1-score on a binary classification task.
9 | 
10 |     :param pred: Predicted binary label. Sized [N].
11 |     :param test: Ground truth binary label. Sized [N].
12 |     :return: F1-score value.
13 |     """
14 |     assert pred.shape == test.shape
15 |     pred = np.asarray(pred, dtype=np.bool)
16 |     test = np.asarray(test, dtype=np.bool)
17 |     overlap = (pred & test).sum()
18 |     if overlap == 0:
19 |         return 0.0
20 |     precision = overlap / pred.sum()
21 |     recall = overlap / test.sum()
22 |     f1 = 2 * precision * recall / (precision + recall)
23 |     return float(f1)
24 | 
25 | 
26 | def knapsack(values: Iterable[int],
27 |              weights: Iterable[int],
28 |              capacity: int
29 |              ) -> List[int]:
30 |     """Solve the 0/1 knapsack problem using dynamic programming.
31 | 
32 |     :param values: Values of each item. Sized [N].
33 |     :param weights: Weights of each item. Sized [N].
34 |     :param capacity: Total capacity of the knapsack.
35 |     :return: List of packed item indices.
36 |     """
37 |     knapsack_solver = KnapsackSolver(
38 |         KnapsackSolver.KNAPSACK_DYNAMIC_PROGRAMMING_SOLVER, 'test'
39 |     )
40 | 
41 |     values = list(values)
42 |     weights = list(weights)
43 |     capacity = int(capacity)
44 | 
45 |     knapsack_solver.Init(values, [weights], [capacity])
46 |     knapsack_solver.Solve()
47 |     packed_items = [x for x in range(0, len(weights))
48 |                     if knapsack_solver.BestSolutionContains(x)]
49 | 
50 |     return packed_items
51 | 
52 | 
53 | def downsample_summ(summ: np.ndarray) -> np.ndarray:
54 |     """Down-sample the summary by a factor of 15."""
55 |     return summ[::15]
56 | 
57 | 
58 | def get_keyshot_summ(pred: np.ndarray,
59 |                      cps: np.ndarray,
60 |                      n_frames: int,
61 |                      nfps: np.ndarray,
62 |                      picks: np.ndarray,
63 |                      proportion: float = 0.15
64 |                      ) -> np.ndarray:
65 |     """Generate a keyshot-based video summary, i.e. a binary vector.
66 | 
67 |     :param pred: Predicted importance scores.
68 |     :param cps: Change points, 2D matrix, each row contains a segment.
69 |     :param n_frames: Original number of frames.
70 |     :param nfps: Number of frames per segment.
71 |     :param picks: Positions of subsampled frames in the original video.
72 |     :param proportion: Max length of the summary relative to the original video.
73 |     :return: Generated keyshot-based summary.
74 |     """
75 |     assert pred.shape == picks.shape
76 |     picks = np.asarray(picks, dtype=np.int32)
77 | 
78 |     # Get original frame scores from downsampled sequence
79 |     frame_scores = np.zeros(n_frames, dtype=np.float32)
80 |     for i in range(len(picks)):
81 |         pos_lo = picks[i]
82 |         pos_hi = picks[i + 1] if i + 1 < len(picks) else n_frames
83 |         frame_scores[pos_lo:pos_hi] = pred[i]
84 | 
85 |     # Assign scores to video shots as the average of the frames.
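    # (Added sketch, values illustrative: a shot covering frames [first, last]
    # gets int(1000 * mean(frame_scores[first:last + 1])); the knapsack step
    # below then selects the shots maximizing total score under the budget
    # sum(nfps[selected]) <= proportion * n_frames, e.g. at most 450 summary
    # frames for a 3000-frame video with the default proportion of 0.15.)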
86 |     seg_scores = np.zeros(len(cps), dtype=np.int32)
87 |     for seg_idx, (first, last) in enumerate(cps):
88 |         scores = frame_scores[first:last + 1]
89 |         seg_scores[seg_idx] = int(1000 * scores.mean())
90 | 
91 |     # Apply knapsack algorithm to find the best shots
92 |     limits = int(n_frames * proportion)
93 |     packed = knapsack(seg_scores, nfps, limits)
94 | 
95 |     # Get key-shot based summary
96 |     summary = np.zeros(n_frames, dtype=np.bool)
97 |     for seg_idx in packed:
98 |         first, last = cps[seg_idx]
99 |         summary[first:last + 1] = True
100 | 
101 |     return summary
102 | 
103 | 
104 | def bbox2summary(seq_len: int,
105 |                  pred_cls: np.ndarray,
106 |                  pred_bboxes: np.ndarray,
107 |                  change_points: np.ndarray,
108 |                  n_frames: int,
109 |                  nfps: np.ndarray,
110 |                  picks: np.ndarray
111 |                  ) -> np.ndarray:
112 |     """Convert predicted bounding boxes to a keyshot summary."""
113 |     score = np.zeros(seq_len, dtype=np.float32)
114 |     for bbox_idx in range(len(pred_bboxes)):
115 |         lo, hi = pred_bboxes[bbox_idx, 0], pred_bboxes[bbox_idx, 1]
116 |         score[lo:hi] = np.maximum(score[lo:hi], [pred_cls[bbox_idx]])
117 | 
118 |     pred_summ = get_keyshot_summ(score, change_points, n_frames, nfps, picks)
119 |     return pred_summ
120 | 
121 | 
122 | def get_summ_diversity(pred_summ: np.ndarray,
123 |                        features: np.ndarray
124 |                        ) -> float:
125 |     """Evaluate diversity of the generated summary.
126 | 
127 |     :param pred_summ: Predicted down-sampled summary. Sized [N].
128 |     :param features: Normalized down-sampled video features. Sized [N, F].
129 |     :return: Diversity value.
130 |     """
131 |     assert len(pred_summ) == len(features)
132 |     pred_summ = np.asarray(pred_summ, dtype=np.bool)
133 |     pos_features = features[pred_summ]
134 | 
135 |     if len(pos_features) < 2:
136 |         return 0.0
137 | 
138 |     diversity = 0.0
139 |     for feat in pos_features:
140 |         diversity += (feat * pos_features).sum() - (feat * feat).sum()
141 | 
142 |     diversity /= len(pos_features) * (len(pos_features) - 1)
143 |     return diversity
144 | 
145 | 
146 | def get_summ_f1score(pred_summ: np.ndarray,
147 |                      test_summ: np.ndarray,
148 |                      eval_metric: str = 'avg'
149 |                      ) -> float:
150 |     """Compare predicted summary with ground truth summary (keyshot-based).
151 | 
152 |     :param pred_summ: Predicted binary label of N frames. Sized [N].
153 |     :param test_summ: Ground truth binary labels of U users. Sized [U, N].
154 |     :param eval_metric: Evaluation method. Choose from (max, avg).
155 |     :return: F1-score value.
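
        Illustrative example (hypothetical values, not part of the original
        docstring): for pred_summ [1, 1, 0, 0] and user summaries
        [[1, 0, 0, 0], [1, 1, 1, 0]], the per-user F1-scores are ~0.67 and
        0.8, so 'avg' (the TVSum convention in evaluate.py) returns ~0.73
        and 'max' (the SumMe convention) returns 0.8.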
156 | """ 157 | pred_summ = np.asarray(pred_summ, dtype=np.bool) 158 | test_summ = np.asarray(test_summ, dtype=np.bool) 159 | _, n_frames = test_summ.shape 160 | 161 | if pred_summ.size > n_frames: 162 | pred_summ = pred_summ[:n_frames] 163 | elif pred_summ.size < n_frames: 164 | pred_summ = np.pad(pred_summ, (0, n_frames - pred_summ.size)) 165 | 166 | f1s = [f1_score(user_summ, pred_summ) for user_summ in test_summ] 167 | 168 | if eval_metric == 'avg': 169 | final_f1 = np.mean(f1s) 170 | elif eval_metric == 'max': 171 | final_f1 = np.max(f1s) 172 | else: 173 | raise ValueError(f'Invalid eval metric {eval_metric}') 174 | 175 | return float(final_f1) 176 | -------------------------------------------------------------------------------- /src/infer.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import numpy as np 3 | import torch 4 | 5 | from helpers import init_helper, vsumm_helper, bbox_helper, video_helper 6 | from modules.model_zoo import get_model 7 | 8 | 9 | def main(): 10 | args = init_helper.get_arguments() 11 | 12 | # load model 13 | print('Loading DSNet model ...') 14 | model = get_model(args.model, **vars(args)) 15 | model = model.eval().to(args.device) 16 | state_dict = torch.load(args.ckpt_path, 17 | map_location=lambda storage, loc: storage) 18 | model.load_state_dict(state_dict) 19 | 20 | # load video 21 | print('Preprocessing source video ...') 22 | video_proc = video_helper.VideoPreprocessor(args.sample_rate) 23 | n_frames, seq, cps, nfps, picks = video_proc.run(args.source) 24 | seq_len = len(seq) 25 | 26 | print('Predicting summary ...') 27 | with torch.no_grad(): 28 | seq_torch = torch.from_numpy(seq).unsqueeze(0).to(args.device) 29 | 30 | pred_cls, pred_bboxes = model.predict(seq_torch) 31 | 32 | pred_bboxes = np.clip(pred_bboxes, 0, seq_len).round().astype(np.int32) 33 | 34 | pred_cls, pred_bboxes = bbox_helper.nms(pred_cls, pred_bboxes, args.nms_thresh) 35 | pred_summ = vsumm_helper.bbox2summary( 36 | seq_len, pred_cls, pred_bboxes, cps, n_frames, nfps, picks) 37 | 38 | print('Writing summary video ...') 39 | 40 | # load original video 41 | cap = cv2.VideoCapture(args.source) 42 | width = int(cap.get(cv2.CAP_PROP_FRAME_WIDTH)) 43 | height = int(cap.get(cv2.CAP_PROP_FRAME_HEIGHT)) 44 | fps = cap.get(cv2.CAP_PROP_FPS) 45 | 46 | # create summary video writer 47 | fourcc = cv2.VideoWriter_fourcc(*'mp4v') 48 | out = cv2.VideoWriter(args.save_path, fourcc, fps, (width, height)) 49 | 50 | frame_idx = 0 51 | while True: 52 | ret, frame = cap.read() 53 | if not ret: 54 | break 55 | 56 | if pred_summ[frame_idx]: 57 | out.write(frame) 58 | 59 | frame_idx += 1 60 | 61 | out.release() 62 | cap.release() 63 | 64 | 65 | if __name__ == '__main__': 66 | main() 67 | -------------------------------------------------------------------------------- /src/kts/LICENSE: -------------------------------------------------------------------------------- 1 | #### DISCLAIMER 2 | This kts library is downloaded from: 3 | - http://lear.inrialpes.fr/software 4 | - http://pascal.inrialpes.fr/data2/potapov/med_summaries/kts_ver1.1.tar.gz 5 | 6 | I just modified the original code to remove weave dependecy. Please follow the 7 | original LICENSE from LEAR if you are using kts. 8 | -------------------------------------------------------------------------------- /src/kts/README.md: -------------------------------------------------------------------------------- 1 | Kernel temporal segmentation 2 | ============================ 3 | 4 |