├── .gitignore
├── LICENSE
├── MANIFEST.in
├── README.md
├── bin
│   └── thingscoop
├── resources
│   ├── clockwork_orange.png
│   ├── filter.png
│   ├── header.gif
│   └── preview.jpg
├── setup.py
└── thingscoop
    ├── __init__.py
    ├── classifier.py
    ├── models.py
    ├── preview.py
    ├── query.py
    ├── search.py
    └── utils.py

/.gitignore:
--------------------------------------------------------------------------------
1 | *pyc
2 | *#*
3 | .DS_Store
4 | dist
5 | *.egg-info
6 | 
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 | 
3 | Copyright (c) 2015 Anastasis Germanidis
4 | 
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in
13 | all copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21 | THE SOFTWARE.
22 | 
23 | 
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | include README.md
2 | include LICENSE
3 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | 
2 | 
3 | ## Thingscoop: Utility for searching and filtering videos based on their content
4 | 
5 | ### Description
6 | 
7 | Thingscoop is a command-line utility for analyzing videos semantically - that means searching, filtering, and describing videos based on objects, places, and other things that appear in them.
8 | 
9 | When you first run thingscoop on a video file, it uses a [convolutional neural network](https://en.wikipedia.org/wiki/Convolutional_neural_network) to create an "index" of what's contained in every second of the input by repeatedly performing image classification on a frame-by-frame basis. Once an index for a video file has been created, you can search (i.e. get the start and end times of the regions in the video matching the query) and filter (i.e. create a [supercut](https://en.wikipedia.org/wiki/Supercut) of the matching regions) the input using arbitrary queries. Thingscoop uses a very basic query language that lets you compose queries that test for the presence or absence of labels with the logical operators `!` (not), `||` (or) and `&&` (and). For example, to search a video for the presence of the sky *and* the absence of the ocean: `thingscoop search 'sky && !ocean' <file>`.
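A couple of further illustrative queries (the labels used here are only examples and must exist in the current model's label set, and `some_video.mp4` is a placeholder filename):

```
# match seconds that contain a road or a highway, but no car
$ thingscoop search '(road || highway) && !car' some_video.mp4

# build a supercut of the seconds that contain both the ocean and the sky
$ thingscoop filter 'ocean && sky' some_video.mp4
```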
10 | 
11 | Right now two models are supported by thingscoop: `vgg_imagenet` uses the architecture described in ["Very Deep Convolutional Networks for Large-Scale Image Recognition"](http://arxiv.org/abs/1409.1556) to recognize objects from the [ImageNet](http://www.image-net.org/) database, and `googlenet_places` uses the architecture described in ["Going Deeper with Convolutions"](http://arxiv.org/abs/1409.4842) to recognize settings and places from the [MIT Places](http://places.csail.mit.edu/) database. You can specify which model you'd like to use by running `thingscoop models use <model>`, where `<model>` is either `vgg_imagenet` or `googlenet_places`. More models will be added soon.
12 | 
13 | Thingscoop is based on [Caffe](http://caffe.berkeleyvision.org/), an open-source deep learning framework.
14 | 
15 | ### Installation
16 | 
17 | 1. Install ffmpeg, imagemagick, and ghostscript: `brew install ffmpeg imagemagick ghostscript` (Mac OS X) or `apt-get install ffmpeg imagemagick ghostscript` (Ubuntu).
18 | 2. Follow the installation instructions on the [Caffe Installation page](http://caffe.berkeleyvision.org/installation.html).
19 | 3. Make sure you build the Python bindings by running `make pycaffe` (in Caffe's directory).
20 | 4. Set the environment variable CAFFE_ROOT to point to Caffe's directory: `export CAFFE_ROOT=[Caffe's directory]`.
21 | 5. Install thingscoop: `easy_install thingscoop` or `pip install thingscoop`.
22 | 
23 | ### Usage
24 | 
25 | #### `thingscoop search <query> <file>`
26 | 
27 | Print the start and end times (in seconds) of the regions in `<file>` that match `<query>`. Creates an index for `<file>` using the current model if it does not exist.
28 | 
29 | Example output:
30 | 
31 | ```
32 | $ thingscoop search violin waking_life.mp4
33 | /Users/anastasis/Downloads/waking_life.mp4 148.000000 162.000000
34 | /Users/anastasis/Downloads/waking_life.mp4 176.000000 179.000000
35 | /Users/anastasis/Downloads/waking_life.mp4 180.000000 186.000000
36 | /Users/anastasis/Downloads/waking_life.mp4 189.000000 190.000000
37 | /Users/anastasis/Downloads/waking_life.mp4 192.000000 200.000000
38 | /Users/anastasis/Downloads/waking_life.mp4 211.000000 212.000000
39 | /Users/anastasis/Downloads/waking_life.mp4 222.000000 223.000000
40 | /Users/anastasis/Downloads/waking_life.mp4 235.000000 243.000000
41 | /Users/anastasis/Downloads/waking_life.mp4 247.000000 249.000000
42 | /Users/anastasis/Downloads/waking_life.mp4 251.000000 253.000000
43 | /Users/anastasis/Downloads/waking_life.mp4 254.000000 258.000000
44 | ```
45 | 
46 | #### `thingscoop filter <query> <file>`
47 | 
48 | Generate a video compilation of the regions in `<file>` that match `<query>`. Creates an index for `<file>` using the current model if it does not exist.
49 | 
50 | Example output:
51 | 
52 | 
53 | 
54 | #### `thingscoop sort <file>`
55 | 
56 | Create a compilation video showing examples for every label recognized in the video (in alphabetical order). Creates an index for `<file>` using the current model if it does not exist.
57 | 
58 | Example output:
59 | 
60 | 
61 | 
62 | #### `thingscoop describe <file>`
63 | 
64 | Print every label that appears in `<file>` along with the number of times it appears. Creates an index for `<file>` using the current model if it does not exist.
65 | 
66 | #### `thingscoop preview <file>`
67 | 
68 | Create a window that plays the input video `<file>` while also displaying the labels the model recognizes on every frame.
69 | 
70 | ```
71 | $ thingscoop describe koyaanisqatsi.mp4 -m googlenet_places
72 | sky 405
73 | skyscraper 363
74 | canyon 141
75 | office_building 130
76 | highway 78
77 | lighthouse 66
78 | hospital 64
79 | desert 59
80 | shower 49
81 | volcano 45
82 | underwater 44
83 | airport_terminal 43
84 | fountain 39
85 | runway 36
86 | assembly_line 35
87 | aquarium 34
88 | fire_escape 34
89 | music_studio 32
90 | bar 28
91 | amusement_park 28
92 | stage 26
93 | wheat_field 25
94 | butchers_shop 25
95 | engine_room 24
96 | slum 20
97 | butte 20
98 | igloo 20
99 | ...etc
100 | ```
101 | 
102 | #### `thingscoop index <file>`
103 | 
104 | Create an index for `<file>` using the current model if it does not exist.
105 | 
106 | #### `thingscoop models list`
107 | 
108 | List all models currently available in Thingscoop.
109 | 
110 | ```
111 | $ thingscoop models list
112 | googlenet_imagenet Model described in the paper "Going Deeper with Convolutions" trained on the ImageNet database
113 | googlenet_places Model described in the paper "Going Deeper with Convolutions" trained on the MIT Places database
114 | vgg_imagenet 16-layer model described in the paper "Return of the Devil in the Details: Delving Deep into Convolutional Nets" trained on the ImageNet database
115 | ```
116 | 
117 | #### `thingscoop models info <model>`
118 | 
119 | Print more detailed information about `<model>`.
120 | 
121 | ```
122 | $ thingscoop models info googlenet_places
123 | Name: googlenet_places
124 | Description: Model described in the paper "Going Deeper with Convolutions" trained on the MIT Places database
125 | Dataset: MIT Places
126 | ```
127 | 
128 | #### `thingscoop models freeze`
129 | 
130 | List all models that have already been downloaded.
131 | 
132 | ```
133 | $ thingscoop models freeze
134 | googlenet_places
135 | vgg_imagenet
136 | ```
137 | 
138 | #### `thingscoop models current`
139 | 
140 | Print the model that is currently in use.
141 | 
142 | ```
143 | $ thingscoop models current
144 | googlenet_places
145 | ```
146 | 
147 | #### `thingscoop models use <model>`
148 | 
149 | Set the current model to `<model>`. Downloads that model locally if it hasn't been downloaded already.
150 | 
151 | #### `thingscoop models download <model>`
152 | 
153 | Download the model `<model>` locally.
154 | 
155 | #### `thingscoop models remove <model>`
156 | 
157 | Remove the model `<model>` locally.
158 | 
159 | #### `thingscoop models clear`
160 | 
161 | Remove all models stored locally.
162 | 
163 | #### `thingscoop labels list`
164 | 
165 | Print all the labels used by the current model.
166 | 
167 | ```
168 | $ thingscoop labels list
169 | abacus
170 | abaya
171 | abstraction
172 | academic gown
173 | accessory
174 | accordion
175 | acorn
176 | acorn squash
177 | acoustic guitar
178 | act
179 | actinic radiation
180 | action
181 | activity
182 | adhesive bandage
183 | adjudicator
184 | administrative district
185 | admiral
186 | adornment
187 | adventurer
188 | advocate
189 | ...
190 | ```
191 | 
192 | #### `thingscoop labels search <regexp>`
193 | 
194 | Print all the labels supported by the current model that match the regular expression `<regexp>`.
195 | 
196 | ```
197 | $ thingscoop labels search instrument$
198 | beating-reed instrument
199 | bowed stringed instrument
200 | double-reed instrument
201 | free-reed instrument
202 | instrument
203 | keyboard instrument
204 | measuring instrument
205 | medical instrument
206 | musical instrument
207 | navigational instrument
208 | negotiable instrument
209 | optical instrument
210 | percussion instrument
211 | scientific instrument
212 | stringed instrument
213 | surveying instrument
214 | wind instrument
215 | ...
216 | 
217 | ```
218 | 
219 | ### Full usage options
220 | 
221 | ```
222 | thingscoop - Command-line utility for searching and filtering videos based on their content
223 | 
224 | Usage:
225 | thingscoop filter <query> <files>... [-o <output_path>] [-m <model>] [-s <sr>] [-c <mc>] [--recreate-index] [--gpu-mode] [--open]
226 | thingscoop search <query> <files>... [-o <output_path>] [-m <model>] [-s <sr>] [-c <mc>] [--recreate-index] [--gpu-mode]
227 | thingscoop describe <file> [-n <words>] [-m <model>] [--recreate-index] [--gpu-mode] [-c <mc>]
228 | thingscoop index <files> [-m <model>] [-s <sr>] [-c <mc>] [-r <ocr>] [--recreate-index] [--gpu-mode]
229 | thingscoop sort <file> [-m <model>] [--gpu-mode] [--min-confidence <ct>] [--max-section-length <ms>] [-i <ignore>] [--open]
230 | thingscoop preview <file> [-m <model>] [--gpu-mode] [--min-confidence <ct>]
231 | thingscoop labels list [-m <model>]
232 | thingscoop labels search <regexp> [-m <model>]
233 | thingscoop models list
234 | thingscoop models info <model>
235 | thingscoop models freeze
236 | thingscoop models current
237 | thingscoop models use <model>
238 | thingscoop models download <model>
239 | thingscoop models remove <model>
240 | thingscoop models clear
241 | 
242 | Options:
243 | --version Show version.
244 | -h --help Show this screen.
245 | -o --output <dst> Output file for supercut
246 | -s --sample-rate <sr> How many frames to classify per second (default = 1)
247 | -c --min-confidence <mc> Minimum prediction confidence required to consider a label (default depends on model)
248 | -m --model <model> Model to use (use 'thingscoop models list' to see all available models)
249 | -n --number-of-words <words> Number of words to describe the video with (default = 5)
250 | -t --max-section-length <ms> Max number of seconds to show examples of a label in the sorted video (default = 5)
251 | -r --min-occurrences <ocr> Minimum number of occurrences of a label in video required for it to be shown in the sorted video (default = 2)
252 | -i --ignore-labels <labels> Labels to ignore when creating the sorted video
253 | --title <title> Title to show at the beginning of the video (sort mode only)
254 | --gpu-mode Enable GPU mode
255 | --recreate-index Recreate object index for file if it already exists
256 | --open Open filtered video after creating it (OS X only)
257 | ```
258 | 
259 | ### CHANGELOG
260 | 
261 | #### 0.2 (8/16/2015)
262 | 
263 | * Added `sort` option for creating a video compilation of all labels appearing in a video
264 | * Now using JSON for the index files
265 | 
266 | #### 0.1 (8/5/2015)
267 | 
268 | * Conception
269 | 
270 | ### License
271 | 
272 | MIT
273 | 
--------------------------------------------------------------------------------
/bin/thingscoop:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | """
3 | thingscoop - Command-line utility for searching and filtering videos based on their content
4 | 
5 | Usage:
6 | thingscoop filter <query> <files>... [-o <output_path>] [-m <model>] [-s <sr>] [-c <mc>] [--recreate-index] [--gpu-mode] [--open]
7 | thingscoop search <query> <files>...
[-o <output_path>] [-m <model>] [-s <sr>] [-c <mc>] [--recreate-index] [--gpu-mode] 8 | thingscoop describe <file> [-n <words>] [-m <model>] [--recreate-index] [--gpu-mode] [-c <mc>] 9 | thingscoop index <files> [-m <model>] [-s <sr>] [-c <mc>] [-r <ocr>] [--recreate-index] [--gpu-mode] 10 | thingscoop sort <file> [-m <model>] [--gpu-mode] [--min-confidence <ct>] [--max-section-length <ms>] [-i <ignore>] [--open] 11 | thingscoop preview <file> [-m <model>] [--gpu-mode] [--min-confidence <ct>] 12 | thingscoop labels list [-m <model>] 13 | thingscoop labels search <regexp> [-m <model>] 14 | thingscoop models list 15 | thingscoop models info <model> 16 | thingscoop models freeze 17 | thingscoop models current 18 | thingscoop models use <model> 19 | thingscoop models download <model> 20 | thingscoop models remove <model> 21 | thingscoop models clear 22 | 23 | Options: 24 | --version Show version. 25 | -h --help Show this screen. 26 | -o --output <dst> Output file for supercut 27 | -s --sample-rate <sr> How many frames to classify per second (default = 1) 28 | -c --min-confidence <mc> Minimum prediction confidence required to consider a label (default depends on model) 29 | -m --model <model> Model to use (use 'thingscoop models list' to see all available models) 30 | -n --number-of-words <words> Number of words to describe the video with (default = 5) 31 | -t --max-section-length <ms> Max number of seconds to show examples of a label in the sorted video (default = 5) 32 | -r --min-occurrences <ocr> Minimum number of occurrences of a label in video required for it to be shown in the sorted video (default = 2) 33 | -i --ignore-labels <labels> Labels to ignore when creating the sorted video video 34 | --title <title> Title to show at the beginning of the video (sort mode only) 35 | --gpu-mode Enable GPU mode 36 | --recreate-index Recreate object index for file if it already exists 37 | --open Open filtered video after creating it (OS X only) 38 | """ 39 | 40 | import sys 41 | import os 42 | from docopt import docopt 43 | 44 | if __name__ == '__main__': 45 | if 'CAFFE_ROOT' not in os.environ: 46 | print "You need to set your CAFFE_ROOT environment variable to point to your Caffe directory" 47 | sys.exit(1) 48 | sys.path.append(os.path.join(os.environ['CAFFE_ROOT'], 'python')) 49 | os.environ['GLOG_minloglevel'] = '3' 50 | import thingscoop 51 | args = docopt(__doc__, version="Thingscoop 0.1") 52 | sys.exit(thingscoop.main(args)) 53 | 54 | -------------------------------------------------------------------------------- /resources/clockwork_orange.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/agermanidis/thingscoop/9c61079d4469a2011e232b950a5e17c4dfaf4ef1/resources/clockwork_orange.png -------------------------------------------------------------------------------- /resources/filter.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/agermanidis/thingscoop/9c61079d4469a2011e232b950a5e17c4dfaf4ef1/resources/filter.png -------------------------------------------------------------------------------- /resources/header.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/agermanidis/thingscoop/9c61079d4469a2011e232b950a5e17c4dfaf4ef1/resources/header.gif -------------------------------------------------------------------------------- /resources/preview.jpg: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/agermanidis/thingscoop/9c61079d4469a2011e232b950a5e17c4dfaf4ef1/resources/preview.jpg -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python 2 | 3 | try: 4 | from setuptools import setup 5 | except ImportError: 6 | from distutils.core import setup 7 | 8 | setup( 9 | name='thingscoop', 10 | version='0.2', 11 | description='Command-line utility for searching and filtering videos based on objects that appear in them using convolutional neural networks', 12 | author='Anastasis Germanidis', 13 | author_email='agermanidis@gmail.com', 14 | url='https://github.com/agermanidis/thingscoop', 15 | packages=['thingscoop'], 16 | scripts=['bin/thingscoop'], 17 | install_requires=[ 18 | 'pyPEG2>=2.15.1', 19 | 'requests>=2.7.0', 20 | 'moviepy>=0.2.2.11', 21 | 'docopt>=0.6.2', 22 | 'progressbar>=2.3', 23 | 'numpy>=1.9.2', 24 | 'pattern>=2.6', 25 | 'termcolor>=1.1.0' 26 | ], 27 | license="MIT" 28 | ) 29 | -------------------------------------------------------------------------------- /thingscoop/__init__.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import glob 3 | import multiprocessing 4 | import os 5 | import pydoc 6 | import subprocess 7 | import tempfile 8 | from collections import defaultdict 9 | from operator import itemgetter 10 | 11 | from progressbar import ProgressBar, Percentage, Bar, ETA 12 | 13 | from .classifier import ImageClassifier 14 | from .models import clear_models 15 | from .models import download_model 16 | from .models import get_active_model 17 | from .models import get_all_models 18 | from .models import get_downloaded_models 19 | from .models import info_model 20 | from .models import read_model 21 | from .models import remove_model 22 | from .models import use_model 23 | from .preview import preview 24 | from .query import filename_for_query 25 | from .query import validate_query 26 | from .search import label_videos 27 | from .search import filter_out_labels 28 | from .search import reverse_index 29 | from .search import search_videos 30 | from .search import threshold_labels 31 | from .utils import create_compilation 32 | from .utils import create_supercut 33 | from .utils import merge_values 34 | from .utils import search_labels 35 | 36 | def main(args): 37 | model_name = args['--model'] or args['<model>'] or get_active_model() 38 | 39 | if args['models']: 40 | if args['list']: 41 | models = get_all_models() 42 | for model in models: 43 | print "{0}{1}".format(model['name'].ljust(30), model['description']) 44 | if args['freeze']: 45 | models = get_downloaded_models() 46 | for model_name in models: 47 | print model_name 48 | elif args['current']: 49 | print get_active_model() 50 | elif args['use']: 51 | use_model(model_name) 52 | elif args['download']: 53 | download_model(model_name) 54 | elif args['info']: 55 | try: 56 | model_info = info_model(model_name) 57 | for key in ['name', 'description', 'dataset']: 58 | print "{0}: {1}".format(key.capitalize(), model_info[key]) 59 | except: 60 | print "Model not found" 61 | elif args['remove']: 62 | remove_model(args['<model>']) 63 | elif args['clear']: 64 | clear_models() 65 | return 0 66 | 67 | if args['labels']: 68 | download_model(model_name) 69 | model = read_model(model_name) 70 | labels = 
sorted(set(model.labels(with_hypernyms=True)), key=lambda l: l.lower()) 71 | if args['list']: 72 | pydoc.pager('\n'.join(labels)) 73 | elif args['search']: 74 | search_labels(args['<regexp>'], labels) 75 | return 0 76 | 77 | sample_rate = float(args['--sample-rate'] or 1) 78 | min_confidence = float(args['--min-confidence'] or 0.25) 79 | number_of_words = int(args['--number-of-words'] or 5) 80 | max_section_length = float(args['--max-section-length'] or 5) 81 | min_occurrences = int(args['--min-occurrences'] or 2) 82 | ignore_labels = args['--ignore-labels'] 83 | recreate_index = args['--recreate-index'] or False 84 | gpu_mode = args['--gpu-mode'] or False 85 | if ignore_labels: 86 | ignore_list = ignore_labels.split(',') 87 | else: 88 | ignore_list = [] 89 | 90 | download_model(model_name) 91 | model = read_model(model_name) 92 | classifier = ImageClassifier(model, gpu_mode=gpu_mode) 93 | 94 | if args['preview']: 95 | preview(args['<file>'], classifier) 96 | return 0 97 | 98 | filenames = args['<files>'] or [args['<file>']] 99 | 100 | labels_by_filename = label_videos( 101 | filenames, 102 | classifier, 103 | sample_rate=sample_rate, 104 | recreate_index=recreate_index 105 | ) 106 | 107 | if args['describe']: 108 | freq_dist = defaultdict(lambda: 0) 109 | for (t, labels) in threshold_labels(merge_values(labels_by_filename), min_confidence): 110 | for label in map(itemgetter(0), labels): 111 | freq_dist[label] += 1 112 | sorted_labels = sorted(freq_dist.iteritems(), key=itemgetter(1), reverse=True) 113 | print '\n'.join(map(lambda (k, v): "{0} {1}".format(k, v), sorted_labels)) 114 | return 0 115 | 116 | if args['search'] or args['filter']: 117 | query = args['<query>'] 118 | validate_query(query, model.labels(with_hypernyms=True)) 119 | 120 | matching_time_regions = search_videos( 121 | labels_by_filename, 122 | args['<query>'], 123 | classifier, 124 | sample_rate=sample_rate, 125 | recreate_index=recreate_index, 126 | min_confidence=min_confidence 127 | ) 128 | 129 | if not matching_time_regions: 130 | return 0 131 | 132 | if args['search']: 133 | for filename, region in matching_time_regions: 134 | start, end = region 135 | print "%s %f %f" % (filename, start, end) 136 | return 0 137 | 138 | if args['filter']: 139 | supercut = create_supercut(matching_time_regions) 140 | 141 | dst = args.get('<dst>') 142 | if not dst: 143 | base, ext = os.path.splitext(args['<files>'][0]) 144 | dst = "{0}_filtered_{1}.mp4".format(base, filename_for_query(query)) 145 | 146 | supercut.to_videofile( 147 | dst, 148 | codec="libx264", 149 | temp_audiofile="temp.m4a", 150 | remove_temp=True, 151 | audio_codec="aac", 152 | ) 153 | 154 | if args['--open']: 155 | subprocess.check_output(['open', dst]) 156 | 157 | if args['sort']: 158 | timed_labels = labels_by_filename[args['<file>']] 159 | timed_labels = threshold_labels(timed_labels, min_confidence) 160 | timed_labels = filter_out_labels(timed_labels, ignore_list) 161 | 162 | idx = reverse_index( 163 | timed_labels, 164 | min_occurrences=min_occurrences, 165 | max_length_per_label=max_section_length 166 | ) 167 | compilation = create_compilation(args['<file>'], idx) 168 | 169 | dst = args.get('<dst>') 170 | if not dst: 171 | base, ext = os.path.splitext(args['<file>']) 172 | dst = "{0}_sorted.mp4".format(base) 173 | 174 | compilation.to_videofile( 175 | dst, 176 | codec="libx264", 177 | temp_audiofile="temp.m4a", 178 | remove_temp=True, 179 | audio_codec="aac", 180 | ) 181 | 182 | if args['--open']: 183 | subprocess.check_output(['open', dst]) 184 | 185 | 
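The `main` function in `__init__.py` above is driven entirely by docopt arguments from `bin/thingscoop`, but the same pipeline can be exercised directly from Python. A minimal, illustrative sketch (not part of the package): it assumes Caffe's Python bindings are importable, that the named model can be fetched from the configured model repo, and `example.mp4` is just a placeholder path.

```
# Illustrative only: drive thingscoop's indexing/search pipeline without the CLI.
import sys, os
sys.path.append(os.path.join(os.environ['CAFFE_ROOT'], 'python'))  # make caffe importable

from thingscoop.models import download_model, read_model
from thingscoop.classifier import ImageClassifier
from thingscoop.search import label_video, search_labels

download_model("googlenet_places")          # no-op if the model is already cached
model = read_model("googlenet_places")
classifier = ImageClassifier(model, gpu_mode=False)

# Build (or load from disk) the per-second label index for a video...
timed_labels = label_video("example.mp4", classifier, sample_rate=1)

# ...then evaluate a query against it; each region is a (start, end) pair in seconds.
for start, end in search_labels(timed_labels, "sky && !ocean", classifier):
    print "%f %f" % (start, end)
```

The heavy lifting (frame extraction, classification, caching the JSON index next to the video) happens inside `label_video`, which is why repeated queries against the same file are fast.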
-------------------------------------------------------------------------------- /thingscoop/classifier.py: -------------------------------------------------------------------------------- 1 | import cPickle 2 | import caffe 3 | import cv2 4 | import glob 5 | import logging 6 | import numpy 7 | import os 8 | 9 | class ImageClassifier(object): 10 | def __init__(self, model, gpu_mode=False): 11 | self.model = model 12 | 13 | kwargs = {} 14 | 15 | if self.model.get("image_dims"): 16 | kwargs['image_dims'] = tuple(self.model.get("image_dims")) 17 | 18 | if self.model.get("channel_swap"): 19 | kwargs['channel_swap'] = tuple(self.model.get("channel_swap")) 20 | 21 | if self.model.get("raw_scale"): 22 | kwargs['raw_scale'] = float(self.model.get("raw_scale")) 23 | 24 | if self.model.get("mean"): 25 | kwargs['mean'] = numpy.array(self.model.get("mean")) 26 | 27 | self.net = caffe.Classifier( 28 | model.deploy_path(), 29 | model.model_path(), 30 | **kwargs 31 | ) 32 | 33 | self.confidence_threshold = 0.1 34 | 35 | if gpu_mode: 36 | caffe.set_mode_gpu() 37 | else: 38 | caffe.set_mode_cpu() 39 | 40 | self.labels = numpy.array(model.labels()) 41 | 42 | if self.model.bet_path(): 43 | self.bet = cPickle.load(open(self.model.bet_path())) 44 | self.bet['words'] = map(lambda w: w.replace(' ', '_'), self.bet['words']) 45 | else: 46 | self.bet = None 47 | 48 | self.net.forward() 49 | 50 | def classify_image(self, filename): 51 | image = caffe.io.load_image(open(filename)) 52 | scores = self.net.predict([image], oversample=True).flatten() 53 | 54 | if self.bet: 55 | expected_infogain = numpy.dot(self.bet['probmat'], scores[self.bet['idmapping']]) 56 | expected_infogain *= self.bet['infogain'] 57 | infogain_sort = expected_infogain.argsort()[::-1] 58 | results = [ 59 | (self.bet['words'][v], float(expected_infogain[v])) 60 | for v in infogain_sort 61 | if expected_infogain[v] > self.confidence_threshold 62 | ] 63 | 64 | else: 65 | indices = (-scores).argsort() 66 | predictions = self.labels[indices] 67 | results = [ 68 | (p, float(scores[i])) 69 | for i, p in zip(indices, predictions) 70 | if scores[i] > self.confidence_threshold 71 | ] 72 | 73 | return results 74 | 75 | -------------------------------------------------------------------------------- /thingscoop/models.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import os 3 | import requests 4 | import shutil 5 | import tempfile 6 | import urlparse 7 | import yaml 8 | import zipfile 9 | import urllib 10 | from pattern.en import wordnet 11 | from progressbar import ProgressBar, Percentage, Bar, ETA, FileTransferSpeed 12 | 13 | from .utils import get_hypernyms 14 | 15 | THINGSCOOP_DIR = os.path.join(os.path.expanduser("~"), ".thingscoop") 16 | CONFIG_PATH = os.path.join(THINGSCOOP_DIR, "config.yml") 17 | 18 | DEFAULT_CONFIG = { 19 | 'repo_url': 'https://s3.amazonaws.com/haystack-models/', 20 | 'active_model': 'googlenet_imagenet' 21 | } 22 | 23 | class CouldNotFindModel(Exception): 24 | pass 25 | 26 | class Model(object): 27 | def __init__(self, name, model_dir): 28 | self.name = name 29 | self.model_dir = model_dir 30 | self.info = yaml.load(open(os.path.join(model_dir, "info.yml"))) 31 | 32 | def get(self, k): 33 | return self.info.get(k) 34 | 35 | def model_path(self): 36 | return os.path.join(self.model_dir, self.info['pretrained_model_file']) 37 | 38 | def deploy_path(self): 39 | return os.path.join(self.model_dir, self.info['deploy_file']) 40 | 41 | def label_path(self): 42 | return 
os.path.join(self.model_dir, self.info['labels_file']) 43 | 44 | def bet_path(self): 45 | if 'bet_file' not in self.info: return None 46 | return os.path.join(self.model_dir, self.info['bet_file']) 47 | 48 | def labels(self, with_hypernyms=False): 49 | ret = map(str.strip, open(self.label_path()).readlines()) 50 | if with_hypernyms: 51 | for label in list(ret): 52 | ret.extend(get_hypernyms(label)) 53 | return ret 54 | 55 | def read_config(): 56 | if not os.path.exists(CONFIG_PATH): 57 | write_config(DEFAULT_CONFIG) 58 | return yaml.load(open(CONFIG_PATH)) 59 | 60 | def write_config(config): 61 | if not os.path.exists(THINGSCOOP_DIR): 62 | os.makedirs(THINGSCOOP_DIR) 63 | yaml.dump(config, open(CONFIG_PATH, 'wb')) 64 | 65 | def get_repo_url(): 66 | return read_config()['repo_url'] 67 | 68 | def get_active_model(): 69 | return read_config()['active_model'] 70 | 71 | def get_model_url(model): 72 | return get_repo_url() + model + ".zip" 73 | 74 | def get_model_local_path(model): 75 | return os.path.join(THINGSCOOP_DIR, "models", model) 76 | 77 | def set_config(k, v): 78 | config = read_config() 79 | config[k] = v 80 | write_config(config) 81 | 82 | def model_in_cache(model): 83 | return os.path.exists(get_model_local_path(model)) 84 | 85 | def get_models_path(): 86 | return os.path.join(THINGSCOOP_DIR, "models") 87 | 88 | def get_all_models(): 89 | return yaml.load(requests.get(urlparse.urljoin(get_repo_url(), "info.yml")).text) 90 | 91 | def info_model(model): 92 | models = get_all_models() 93 | for model_info in models: 94 | if model_info['name'] == model: 95 | return model_info 96 | 97 | def use_model(model): 98 | download_model(model) 99 | set_config("active_model", model) 100 | 101 | def remove_model(model): 102 | path = get_model_local_path(model) 103 | shutil.rmtree(path) 104 | 105 | def clear_models(): 106 | for model in get_downloaded_models(): 107 | remove_model(model) 108 | 109 | def read_model(model_name): 110 | if not model_in_cache(model_name): 111 | raise CouldNotFindModel, "Could not find model {}".format(model_name) 112 | return Model(model_name, get_model_local_path(model_name)) 113 | 114 | def get_downloaded_models(): 115 | return map(os.path.basename, glob.glob(os.path.join(get_models_path(), "*"))) 116 | 117 | progress_bar = None 118 | def download_model(model): 119 | if model_in_cache(model): return 120 | model_url = get_model_url(model) 121 | tmp_zip = tempfile.NamedTemporaryFile(suffix=".zip") 122 | prompt = "Downloading model {}".format(model) 123 | def cb(count, block_size, total_size): 124 | global progress_bar 125 | if not progress_bar: 126 | widgets = [prompt, Percentage(), ' ', Bar(), ' ', FileTransferSpeed(), ' ', ETA()] 127 | progress_bar = ProgressBar(widgets=widgets, maxval=int(total_size)).start() 128 | progress_bar.update(min(total_size, count * block_size)) 129 | urllib.urlretrieve(model_url, tmp_zip.name, cb) 130 | z = zipfile.ZipFile(tmp_zip) 131 | out_path = get_model_local_path(model) 132 | try: 133 | os.mkdir(out_path) 134 | except: 135 | pass 136 | for name in z.namelist(): 137 | if name.startswith("_"): continue 138 | z.extract(name, out_path) 139 | 140 | -------------------------------------------------------------------------------- /thingscoop/preview.py: -------------------------------------------------------------------------------- 1 | import cv2 2 | import datetime 3 | import math 4 | import numpy 5 | import random 6 | import re 7 | import subprocess 8 | import sys 9 | import caffe 10 | import tempfile 11 | 12 | def 
duration_string_to_timedelta(s): 13 | [hours, minutes, seconds] = map(int, s.split(':')) 14 | seconds = seconds + minutes * 60 + hours * 3600 15 | return datetime.timedelta(seconds=seconds) 16 | 17 | def get_video_duration(path): 18 | result = subprocess.Popen(["ffprobe", path], stdout=subprocess.PIPE, stderr=subprocess.STDOUT) 19 | matches = [x for x in result.stdout.readlines() if "Duration" in x] 20 | duration_string = re.findall(r'Duration: ([0-9:]*)', matches[0])[0] 21 | return math.ceil(duration_string_to_timedelta(duration_string).seconds) 22 | 23 | def get_current_position(c): 24 | return int(c.get(cv2.cv.CV_CAP_PROP_POS_MSEC)/1000) 25 | 26 | def add_text_to_frame(frame, text): 27 | ret, _ = cv2.getTextSize(text, cv2.FONT_HERSHEY_PLAIN, 1, 1) 28 | ret = (ret[0] + 20, ret[1] + 20) 29 | cv2.rectangle(frame, (0,0), ret, (0, 0, 0), cv2.cv.CV_FILLED) 30 | cv2.putText(frame, text, (5, 20), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255)) 31 | 32 | def format_classification(result): 33 | str_list = [] 34 | for label, confidence in result: 35 | str_list.append("{0} ({1})".format(label, confidence)) 36 | return ', '.join(str_list) 37 | 38 | def preview(filename, classifier): 39 | cv2.namedWindow('video') 40 | 41 | duration = int(get_video_duration(filename)) 42 | 43 | def trackbar_change(t): 44 | cap.set(cv2.cv.CV_CAP_PROP_POS_MSEC, t*1000) 45 | 46 | trackbar_prompt = 'Current position:' 47 | cv2.createTrackbar(trackbar_prompt, 'video', 0, duration, trackbar_change) 48 | 49 | cap = cv2.VideoCapture(filename) 50 | 51 | classification_result = None 52 | previous_time_in_seconds = None 53 | current_pos = 0 54 | 55 | tmp = tempfile.NamedTemporaryFile(suffix=".png") 56 | 57 | while cap.isOpened(): 58 | ret, frame = cap.read() 59 | 60 | cv2.imwrite(tmp.name, frame) 61 | 62 | if ret: 63 | current_pos = get_current_position(cap) 64 | 65 | if current_pos != previous_time_in_seconds: 66 | previous_time_in_seconds = current_pos 67 | classification_result = classifier.classify_image(tmp.name) 68 | 69 | if classification_result: 70 | add_text_to_frame(frame, format_classification(classification_result)) 71 | 72 | cv2.imshow('video', frame) 73 | 74 | cv2.setTrackbarPos(trackbar_prompt, 'video', current_pos) 75 | 76 | k = cv2.waitKey(1) & 0xFF 77 | if k == 27: 78 | break 79 | 80 | cap.release() 81 | cv2.destroyAllWindows() 82 | 83 | -------------------------------------------------------------------------------- /thingscoop/query.py: -------------------------------------------------------------------------------- 1 | import re 2 | from pypeg2 import * 3 | from pattern.en import wordnet 4 | 5 | from .utils import get_hypernyms 6 | 7 | class NoSuchLabelError(Exception): 8 | pass 9 | 10 | class Q(object): 11 | def evaluate(self, labels): 12 | return self.content.evaluate(labels) 13 | 14 | class Label(object): 15 | grammar = attr("name", re.compile(r"[A-z][A-z\- ]+[A-z]")) 16 | 17 | def evaluate(self, labels): 18 | for label in labels: 19 | if label == self.name or self.name in get_hypernyms(label): 20 | return True 21 | return False 22 | 23 | class UnaryOp(object): 24 | grammar = (attr("operator", re.compile("!")), 25 | attr("content", Label)) 26 | 27 | def evaluate(self, labels): 28 | return not self.content.evaluate(labels) 29 | 30 | class ParentheticalQ(List): 31 | grammar = (re.compile("\("), 32 | attr("content", Q), 33 | re.compile("\)")) 34 | 35 | def evaluate(self, labels): 36 | return self.content.evaluate(labels) 37 | 38 | class BinaryOp(object): 39 | grammar = (attr("lh", [ParentheticalQ, UnaryOp, 
Label]), 40 | attr("op", re.compile("(&&|\|\|)")), 41 | attr("rh", Q)) 42 | 43 | def evaluate(self, labels): 44 | lv = self.lh.evaluate(labels) 45 | rv = self.rh.evaluate(labels) 46 | 47 | if self.op == '||': return lv or rv 48 | else: return lv and rv 49 | 50 | Q.grammar = ( 51 | maybe_some(whitespace), 52 | attr("content", [BinaryOp, ParentheticalQ, UnaryOp, Label]), 53 | maybe_some(whitespace) 54 | ) 55 | 56 | def get_labels(parsed_q): 57 | if type(parsed_q) == Label: 58 | return [parsed_q.name] 59 | if type(parsed_q) == BinaryOp: 60 | return get_labels(parsed_q.lh) + get_labels(parsed_q.rh) 61 | return get_labels(parsed_q.content) 62 | 63 | def validate_query(q, all_labels): 64 | parsed_q = parse(q, Q) 65 | for label in get_labels(parsed_q): 66 | if label not in all_labels: 67 | raise NoSuchLabelError, "Label \"{}\" does not exist.".format(label) 68 | 69 | def eval_query_with_labels(q, labels): 70 | parsed_q = parse(q, Q) 71 | return parsed_q.evaluate(labels) 72 | 73 | def filename_for_query(q): 74 | parsed_q = parse(q, Q) 75 | labels = get_labels(parsed_q) 76 | return '_'.join(labels).replace(' ', '_') 77 | 78 | -------------------------------------------------------------------------------- /thingscoop/search.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import os 3 | import shutil 4 | import subprocess 5 | import sys 6 | import tempfile 7 | from operator import itemgetter 8 | 9 | from progressbar import ProgressBar, Percentage, Bar, ETA 10 | 11 | from .query import eval_query_with_labels 12 | from .utils import extract_frames 13 | from .utils import generate_index_path 14 | from .utils import read_index_from_path 15 | from .utils import save_index_to_path 16 | 17 | def times_to_regions(times, max_total_length=None): 18 | if not times: 19 | return [] 20 | regions = [] 21 | current_start = times[0] 22 | for t1, t2 in zip(times, times[1:]): 23 | if t2 - t1 > 1: 24 | regions.append((current_start, t1 + 1)) 25 | current_start = t2 26 | regions.append((current_start, times[-1] + 1)) 27 | if not max_total_length: 28 | return regions 29 | accum = 0 30 | ret = [] 31 | for index, (t1, t2) in enumerate(regions): 32 | if accum + (t2-t1) > max_total_length: 33 | ret.append((t1, t1 + max_total_length-accum)) 34 | return ret 35 | else: 36 | ret.append((t1, t2)) 37 | accum += t2 - t1 38 | return ret 39 | 40 | def unique_labels(timed_labels): 41 | ret = set() 42 | for t, labels_list in timed_labels: 43 | ret.update(map(itemgetter(0), labels_list)) 44 | return ret 45 | 46 | def reverse_index(timed_labels, min_occurrences=2, max_length_per_label=8): 47 | ret = {} 48 | for label in unique_labels(timed_labels): 49 | times = [] 50 | for t, labels_list in timed_labels: 51 | raw_labels = map(lambda (l, c): l, labels_list) 52 | if eval_query_with_labels(label, raw_labels): 53 | times.append(t) 54 | if len(times) >= min_occurrences: 55 | ret[label] = times_to_regions(times, max_total_length=max_length_per_label) 56 | return ret 57 | 58 | def threshold_labels(timed_labels, min_confidence): 59 | ret = [] 60 | for t, label_list in timed_labels: 61 | filtered = filter(lambda (l, c): c > min_confidence, label_list) 62 | if filtered: 63 | ret.append((t, filtered)) 64 | return ret 65 | 66 | def filter_out_labels(timed_labels, ignore_list): 67 | ret = [] 68 | for t, label_list in timed_labels: 69 | filtered = filter(lambda (l, c): l not in ignore_list, label_list) 70 | if filtered: 71 | ret.append((t, filtered)) 72 | return ret 73 | 74 | def 
label_video(filename, classifier, sample_rate=1, recreate_index=False): 75 | index_filename = generate_index_path(filename, classifier.model) 76 | 77 | if os.path.exists(index_filename) and not recreate_index: 78 | return read_index_from_path(index_filename) 79 | 80 | temp_frame_dir, frames = extract_frames(filename, sample_rate=sample_rate) 81 | 82 | timed_labels = [] 83 | 84 | widgets=["Labeling {}: ".format(filename), Percentage(), ' ', Bar(), ' ', ETA()] 85 | pbar = ProgressBar(widgets=widgets, maxval=len(frames)).start() 86 | 87 | for index, frame in enumerate(frames): 88 | pbar.update(index) 89 | labels = classifier.classify_image(frame) 90 | if not len(labels): 91 | continue 92 | t = (1./sample_rate) * index 93 | timed_labels.append((t, labels)) 94 | 95 | shutil.rmtree(temp_frame_dir) 96 | save_index_to_path(index_filename, timed_labels) 97 | 98 | return timed_labels 99 | 100 | def label_videos(filenames, classifier, sample_rate=1, recreate_index=False): 101 | ret = {} 102 | for filename in filenames: 103 | ret[filename] = label_video( 104 | filename, 105 | classifier, 106 | sample_rate=sample_rate, 107 | recreate_index=recreate_index 108 | ) 109 | return ret 110 | 111 | def search_labels(timed_labels, query, classifier, sample_rate=1, recreate_index=False, min_confidence=0.3): 112 | timed_labels = threshold_labels(timed_labels, min_confidence) 113 | 114 | times = [] 115 | for t, labels_list in timed_labels: 116 | raw_labels = map(lambda (l, c): l, labels_list) 117 | if eval_query_with_labels(query, raw_labels): 118 | times.append(t) 119 | 120 | return times_to_regions(times) 121 | 122 | def search_videos(labels_by_filename, query, classifier, sample_rate=1, recreate_index=False, min_confidence=0.3): 123 | ret = [] 124 | for filename, timed_labels in labels_by_filename.items(): 125 | ret += map(lambda r: (filename, r), search_labels( 126 | timed_labels, 127 | query, 128 | classifier, 129 | sample_rate=sample_rate, 130 | recreate_index=recreate_index, 131 | min_confidence=min_confidence 132 | )) 133 | return ret 134 | 135 | -------------------------------------------------------------------------------- /thingscoop/utils.py: -------------------------------------------------------------------------------- 1 | import glob 2 | import json 3 | import os 4 | import re 5 | import subprocess 6 | import tempfile 7 | import textwrap 8 | from moviepy.editor import VideoFileClip, TextClip, ImageClip, concatenate_videoclips 9 | from pattern.en import wordnet 10 | from termcolor import colored 11 | from PIL import Image, ImageDraw, ImageFont 12 | 13 | def create_title_frame(title, dimensions, fontsize=60): 14 | para = textwrap.wrap(title, width=30) 15 | im = Image.new('RGB', dimensions, (0, 0, 0, 0)) 16 | draw = ImageDraw.Draw(im) 17 | font = ImageFont.truetype('resources/Helvetica.ttc', fontsize) 18 | total_height = sum(map(lambda l: draw.textsize(l, font=font)[1], para)) 19 | current_h, pad = (dimensions[1]/2-total_height/2), 10 20 | for line in para: 21 | w, h = draw.textsize(line, font=font) 22 | draw.text(((dimensions[0] - w) / 2, current_h), line, font=font) 23 | current_h += h + pad 24 | f = tempfile.NamedTemporaryFile(suffix=".png", delete=False) 25 | im.save(f.name) 26 | return f.name 27 | 28 | def get_video_dimensions(filename): 29 | p = subprocess.Popen(['ffprobe', filename], stdout=subprocess.PIPE, stderr=subprocess.PIPE) 30 | _, out = p.communicate() 31 | for line in out.split('\n'): 32 | if re.search('Video: ', line): 33 | match = re.findall('[1-9][0-9]*x[1-9][0-9]*', line)[0] 34 | 
return tuple(map(int, match.split('x'))) 35 | 36 | def extract_frames(filename, sample_rate=1): 37 | dest_dir = tempfile.mkdtemp() 38 | dest = os.path.join(dest_dir, "%10d.png") 39 | subprocess.check_output(["ffmpeg", "-i", filename, "-vf", "fps="+str(sample_rate), dest]) 40 | glob_pattern = os.path.join(dest_dir, "*.png") 41 | return dest_dir, glob.glob(glob_pattern) 42 | 43 | def generate_index_path(filename, model): 44 | name, ext = os.path.splitext(filename) 45 | return "{name}_{model_name}.json".format(name=name, model_name=model.name) 46 | 47 | def read_index_from_path(filename): 48 | return json.load(open(filename)) 49 | 50 | def save_index_to_path(filename, timed_labels): 51 | json.dump(timed_labels, open(filename, 'w'), indent=4) 52 | 53 | def create_supercut(regions): 54 | subclips = [] 55 | filenames = set(map(lambda (filename, _): filename, regions)) 56 | video_files = {filename: VideoFileClip(filename) for filename in filenames} 57 | for filename, region in regions: 58 | subclip = video_files[filename].subclip(*region) 59 | subclips.append(subclip) 60 | if not subclips: return None 61 | return concatenate_videoclips(subclips) 62 | 63 | def label_as_title(label): 64 | return label.replace('_', ' ').upper() 65 | 66 | def create_compilation(filename, index): 67 | dims = get_video_dimensions(filename) 68 | subclips = [] 69 | video_file = VideoFileClip(filename) 70 | for label in sorted(index.keys()): 71 | label_img_filename = create_title_frame(label_as_title(label), dims) 72 | label_clip = ImageClip(label_img_filename, duration=2) 73 | os.remove(label_img_filename) 74 | subclips.append(label_clip) 75 | for region in index[label]: 76 | subclip = video_file.subclip(*region) 77 | subclips.append(subclip) 78 | if not subclips: return None 79 | return concatenate_videoclips(subclips) 80 | 81 | def search_labels(r, labels): 82 | r = re.compile(r) 83 | for label in labels: 84 | if not r.search(label): 85 | continue 86 | current_i = 0 87 | ret = '' 88 | for m in r.finditer(label): 89 | ret += label[current_i:m.start()] 90 | ret += colored(label[m.start():m.end()], 'red', attrs=['bold']) 91 | current_i = m.end() 92 | ret += label[m.end():] 93 | print ret 94 | 95 | def get_hypernyms(label): 96 | synsets = wordnet.synsets(label) 97 | if not synsets: return [] 98 | return map(lambda s: s.synonyms[0], synsets[0].hypernyms(True)) 99 | 100 | def merge_values(d): 101 | ret = [] 102 | for lst in d.values(): 103 | ret += lst 104 | return ret 105 | 106 | --------------------------------------------------------------------------------
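One detail worth calling out in `utils.py` and `query.py`: `get_hypernyms` is what makes queries hierarchical, because `Label.evaluate` also accepts a frame label whose WordNet hypernyms include the query term. A small illustrative sketch; it assumes `pattern`'s WordNet data is available, that Caffe's Python bindings are importable (importing the `thingscoop` package pulls in the Caffe-backed classifier), and the exact hypernym strings depend on WordNet, so treat the first result as indicative.

```
# Illustrative sketch: hypernym expansion lets broad query terms match specific labels.
import sys, os
sys.path.append(os.path.join(os.environ['CAFFE_ROOT'], 'python'))  # thingscoop imports caffe at package import time

from thingscoop.utils import get_hypernyms
from thingscoop.query import eval_query_with_labels

# Walks up WordNet from "violin"; the list typically includes terms such as
# "stringed instrument" and "musical instrument" (exact strings depend on WordNet data).
print get_hypernyms("violin")

# A frame labeled "violin" can therefore satisfy a query for the broader term...
print eval_query_with_labels("musical instrument", ["violin"])
# ...and the boolean operators combine such per-label tests for each frame.
print eval_query_with_labels("violin && !piano", ["violin"])   # True
print eval_query_with_labels("piano || cello", ["violin"])     # False
```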