├── .gitignore
├── LICENSE
├── MANIFEST.in
├── README.md
├── bin
│   └── thingscoop
├── resources
│   ├── clockwork_orange.png
│   ├── filter.png
│   ├── header.gif
│   └── preview.jpg
├── setup.py
└── thingscoop
    ├── __init__.py
    ├── classifier.py
    ├── models.py
    ├── preview.py
    ├── query.py
    ├── search.py
    └── utils.py
/.gitignore:
--------------------------------------------------------------------------------
1 | *pyc
2 | *#*
3 | .DS_Store
4 | dist
5 | *.egg-info
6 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | The MIT License (MIT)
2 |
3 | Copyright (c) 2015 Anastasis Germanidis
4 |
5 | Permission is hereby granted, free of charge, to any person obtaining a copy
6 | of this software and associated documentation files (the "Software"), to deal
7 | in the Software without restriction, including without limitation the rights
8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 |
12 | The above copyright notice and this permission notice shall be included in
13 | all copies or substantial portions of the Software.
14 |
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
21 | THE SOFTWARE.
22 |
23 |
--------------------------------------------------------------------------------
/MANIFEST.in:
--------------------------------------------------------------------------------
1 | include README.md
2 | include LICENSE
3 |
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | ![](resources/header.gif)
2 |
3 | ## Thingscoop: Utility for searching and filtering videos based on their content
4 |
5 | ### Description
6 |
7 | Thingscoop is a command-line utility for analyzing videos semantically - that means searching, filtering, and describing videos based on objects, places, and other things that appear in them.
8 |
9 | When you first run thingscoop on a video file, it uses a [convolutional neural network](https://en.wikipedia.org/wiki/Convolutional_neural_network) to create an "index" of what's contained in every second of the input by repeatedly performing image classification on a frame-by-frame basis. Once an index for a video file has been created, you can search (i.e. get the start and end times of the regions in the video matching the query) and filter (i.e. create a [supercut](https://en.wikipedia.org/wiki/Supercut) of the matching regions) the input using arbitrary queries. Thingscoop uses a very basic query language that lets you compose queries testing for the presence or absence of labels with the logical operators `!` (not), `||` (or) and `&&` (and). For example, to search a video for the presence of the sky *and* the absence of the ocean: `thingscoop search 'sky && !ocean' <video_file>`.
10 |
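11 | A few more hypothetical invocations showing how the operators compose (the file names and labels here are placeholders; parenthesized grouping is also supported by the query grammar in `thingscoop/query.py`):
12 |
13 | ```
14 | $ thingscoop search 'violin || piano' concert.mp4
15 | $ thingscoop filter '(person && dog) || beach' vacation.mp4
16 | $ thingscoop search '!face' interview.mp4
17 | ```
18 |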
11 | Right now two models are supported by thingscoop: `vgg_imagenet` uses the architecture described in ["Very Deep Convolutional Networks for Large-Scale Image Recognition"](http://arxiv.org/abs/1409.1556) to recognize objects from the [ImageNet](http://www.image-net.org/) database, and `googlenet_places` uses the architecture described in ["Going Deeper with Convolutions"](http://arxiv.org/abs/1409.4842) to recognize settings and places from the [MIT Places](http://places.csail.mit.edu/) database. You can specify which model you'd like to use by running `thingscoop models use <model_name>`, where `<model_name>` is either `vgg_imagenet` or `googlenet_places`. More models will be added soon.
12 |
13 | Thingscoop is based on [Caffe](http://caffe.berkeleyvision.org/), an open-source deep learning framework.
14 |
15 | ### Installation
16 |
17 | 1. Install ffmpeg, imagemagick, and ghostscript: `brew install ffmpeg imagemagick ghostscript` (Mac OS X) or `apt-get install ffmpeg imagemagick ghostscript` (Ubuntu).
18 | 2. Follow the installation instructions on the [Caffe Installation page](http://caffe.berkeleyvision.org/installation.html).
19 | 3. Make sure you build the Python bindings by running `make pycaffe` (in Caffe's directory).
20 | 4. Set the environment variable `CAFFE_ROOT` to point to Caffe's directory: `export CAFFE_ROOT=[Caffe's directory]`.
21 | 5. Install thingscoop: `easy_install thingscoop` or `pip install thingscoop`.
22 |
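23 | If everything is wired up correctly, a quick smoke test like the following should work (the Caffe path below is a placeholder for wherever you built it):
24 |
25 | ```
26 | $ export CAFFE_ROOT=~/caffe
27 | $ thingscoop models use googlenet_places
28 | $ thingscoop labels list
29 | ```
30 |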
23 | ### Usage
24 |
25 | #### `thingscoop search <query> <file>...`
26 |
27 | Print the start and end times (in seconds) of the regions in `<file>` that match `<query>`. Creates an index for `<file>` using the current model if it does not exist.
28 |
29 | Example output:
30 |
31 | ```
32 | $ thingscoop search violin waking_life.mp4
33 | /Users/anastasis/Downloads/waking_life.mp4 148.000000 162.000000
34 | /Users/anastasis/Downloads/waking_life.mp4 176.000000 179.000000
35 | /Users/anastasis/Downloads/waking_life.mp4 180.000000 186.000000
36 | /Users/anastasis/Downloads/waking_life.mp4 189.000000 190.000000
37 | /Users/anastasis/Downloads/waking_life.mp4 192.000000 200.000000
38 | /Users/anastasis/Downloads/waking_life.mp4 211.000000 212.000000
39 | /Users/anastasis/Downloads/waking_life.mp4 222.000000 223.000000
40 | /Users/anastasis/Downloads/waking_life.mp4 235.000000 243.000000
41 | /Users/anastasis/Downloads/waking_life.mp4 247.000000 249.000000
42 | /Users/anastasis/Downloads/waking_life.mp4 251.000000 253.000000
43 | /Users/anastasis/Downloads/waking_life.mp4 254.000000 258.000000
44 | ```
45 |
46 | #### `thingscoop filter <query> <file>...`
47 |
48 | Generate a video compilation of the regions in `<file>` that match `<query>`. Creates an index for `<file>` using the current model if it does not exist.
49 |
50 | Example output:
51 |
52 | ![](resources/filter.png)
53 |
54 | #### `thingscoop sort <file>`
55 |
56 | Create a compilation video showing examples of every label recognized in the video (in alphabetical order). Creates an index for `<file>` using the current model if it does not exist.
57 |
58 | Example output:
59 |
60 | ![](resources/clockwork_orange.png)
61 |
62 | #### `thingscoop describe <file>`
63 |
64 | Print every label that appears in `<file>` along with the number of times it appears. Creates an index for `<file>` using the current model if it does not exist.
65 |
66 | ```
67 | $ thingscoop describe koyaanisqatsi.mp4 -m googlenet_places
68 | sky 405
69 | skyscraper 363
70 | canyon 141
71 | office_building 130
72 | highway 78
73 | lighthouse 66
74 | hospital 64
75 | desert 59
76 | shower 49
77 | volcano 45
78 | underwater 44
79 | airport_terminal 43
80 | fountain 39
81 | runway 36
82 | assembly_line 35
83 | aquarium 34
84 | fire_escape 34
85 | music_studio 32
86 | bar 28
87 | amusement_park 28
88 | stage 26
89 | wheat_field 25
90 | butchers_shop 25
91 | engine_room 24
92 | slum 20
93 | butte 20
94 | igloo 20
95 | ...etc
96 | ```
97 |
98 | #### `thingscoop preview <file>`
99 |
100 | Create a window that plays the input video `<file>` while also displaying the labels the model recognizes on every frame.
101 |
102 | ![](resources/preview.jpg)
101 |
102 | #### `thingscoop index <file>`
103 |
104 | Create an index for `<file>` using the current model if it does not exist.
105 |
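106 | The index is written next to the video as `{filename}_{model}.json` (see `generate_index_path` in `thingscoop/utils.py`) and is simply a JSON list of `[time_in_seconds, [[label, confidence], ...]]` entries, one per sampled frame. An illustrative, made-up excerpt:
107 |
108 | ```
109 | [
110 |   [148.0, [["violin", 0.64], ["musical instrument", 0.21]]],
111 |   [149.0, [["violin", 0.58], ["person", 0.12]]]
112 | ]
113 | ```
114 |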
106 | #### `thingscoop models list`
107 |
108 | List all models currently available in Thingscoop.
109 |
110 | ```
111 | $ thingscoop models list
112 | googlenet_imagenet Model described in the paper "Going Deeper with Convolutions" trained on the ImageNet database
113 | googlenet_places Model described in the paper "Going Deeper with Convolutions" trained on the MIT Places database
114 | vgg_imagenet 16-layer model described in the paper "Return of the Devil in the Details: Delving Deep into Convolutional Nets" trained on the ImageNet database
115 | ```
116 |
117 | #### `thingscoop models info <model_name>`
118 |
119 | Print more detailed information about `<model_name>`.
120 |
121 | ```
122 | $ thingscoop models info googlenet_places
123 | Name: googlenet_places
124 | Description: Model described in the paper "Going Deeper with Convolutions" trained on the MIT Places database
125 | Dataset: MIT Places
126 | ```
127 |
128 | #### `thingscoop models freeze`
129 |
130 | List all models that have already been downloaded.
131 |
132 | ```
133 | $ thingscoop models freeze
134 | googlenet_places
135 | vgg_imagenet
136 | ```
137 |
138 | #### `thingscoop models current`
139 |
140 | Print the model that is currently in use.
141 |
142 | ```
143 | $ thingscoop models current
144 | googlenet_places
145 | ```
146 |
147 | #### `thingscoop models use <model_name>`
148 |
149 | Set the current model to `<model_name>`. Downloads that model locally if it hasn't been downloaded already.
150 |
151 | #### `thingscoop models download <model_name>`
152 |
153 | Download the model `<model_name>` locally.
154 |
155 | #### `thingscoop models remove <model_name>`
156 |
157 | Remove the locally downloaded model `<model_name>`.
158 |
159 | #### `thingscoop models clear`
160 |
161 | Remove all models stored locally.
162 |
163 | #### `thingscoop labels list`
164 |
165 | Print all the labels used by the current model.
166 |
167 | ```
168 | $ thingscoop labels list
169 | abacus
170 | abaya
171 | abstraction
172 | academic gown
173 | accessory
174 | accordion
175 | acorn
176 | acorn squash
177 | acoustic guitar
178 | act
179 | actinic radiation
180 | action
181 | activity
182 | adhesive bandage
183 | adjudicator
184 | administrative district
185 | admiral
186 | adornment
187 | adventurer
188 | advocate
189 | ...
190 | ```
191 |
192 | #### `thingscoop labels search <regexp>`
193 |
194 | Print all the labels supported by the current model that match the regular expression `<regexp>`.
195 |
196 | ```
197 | $ thingscoop labels search instrument$
198 | beating-reed instrument
199 | bowed stringed instrument
200 | double-reed instrument
201 | free-reed instrument
202 | instrument
203 | keyboard instrument
204 | measuring instrument
205 | medical instrument
206 | musical instrument
207 | navigational instrument
208 | negotiable instrument
209 | optical instrument
210 | percussion instrument
211 | scientific instrument
212 | stringed instrument
213 | surveying instrument
214 | wind instrument
215 | ...
216 |
217 | ```
218 |
219 | ### Full usage options
220 |
221 | ```
222 | thingscoop - Command-line utility for searching and filtering videos based on their content
223 |
224 | Usage:
225 |   thingscoop filter <query> <files>... [-o <output_path>] [-m <model_name>] [-s <sample_rate>] [-c <min_confidence>] [--recreate-index] [--gpu-mode] [--open]
226 |   thingscoop search <query> <files>... [-o <output_path>] [-m <model_name>] [-s <sample_rate>] [-c <min_confidence>] [--recreate-index] [--gpu-mode]
227 |   thingscoop describe <file> [-n <number_of_words>] [-m <model_name>] [--recreate-index] [--gpu-mode] [-c <min_confidence>]
228 |   thingscoop index <file> [-m <model_name>] [-s <sample_rate>] [-c <min_confidence>] [-r <min_occurrences>] [--recreate-index] [--gpu-mode]
229 |   thingscoop sort <file> [-m <model_name>] [--gpu-mode] [--min-confidence <min_confidence>] [--max-section-length <max_section_length>] [-i <ignore_labels>] [--open]
230 |   thingscoop preview <file> [-m <model_name>] [--gpu-mode] [--min-confidence <min_confidence>]
231 |   thingscoop labels list [-m <model_name>]
232 |   thingscoop labels search <regexp> [-m <model_name>]
233 |   thingscoop models list
234 |   thingscoop models info <model_name>
235 |   thingscoop models freeze
236 |   thingscoop models current
237 |   thingscoop models use <model_name>
238 |   thingscoop models download <model_name>
239 |   thingscoop models remove <model_name>
240 |   thingscoop models clear
241 |
242 | Options:
243 |   --version                                      Show version.
244 |   -h --help                                      Show this screen.
245 |   -o --output                                    Output file for supercut
246 |   -s --sample-rate <sample_rate>                 How many frames to classify per second (default = 1)
247 |   -c --min-confidence <min_confidence>           Minimum prediction confidence required to consider a label (default depends on model)
248 |   -m --model <model_name>                        Model to use (use 'thingscoop models list' to see all available models)
249 |   -n --number-of-words <number_of_words>         Number of words to describe the video with (default = 5)
250 |   -t --max-section-length <max_section_length>   Max number of seconds to show examples of a label in the sorted video (default = 5)
251 |   -r --min-occurrences <min_occurrences>         Minimum number of occurrences of a label in a video required for it to be shown in the sorted video (default = 2)
252 |   -i --ignore-labels <ignore_labels>             Labels to ignore when creating the sorted video
253 |   --title <title>                                Title to show at the beginning of the video (sort mode only)
254 |   --gpu-mode                                     Enable GPU mode
255 |   --recreate-index                               Recreate object index for file if it already exists
256 |   --open                                         Open filtered video after creating it (OS X only)
257 | ```
258 |
259 | ### CHANGELOG
260 |
261 | #### 0.2 (8/16/2015)
262 |
263 | * Added `sort` option for creating a video compilation of all labels appearing in a video
264 | * Now using JSON for the index files
265 |
266 | #### 0.1 (8/5/2015)
267 |
268 | * Conception
269 |
270 | ### License
271 |
272 | MIT
273 |
--------------------------------------------------------------------------------
/bin/thingscoop:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 | """
3 | thingscoop - Command-line utility for searching and filtering videos based on their content
4 |
5 | Usage:
6 |   thingscoop filter <query> <files>... [-o <output_path>] [-m <model_name>] [-s <sample_rate>] [-c <min_confidence>] [--recreate-index] [--gpu-mode] [--open]
7 |   thingscoop search <query> <files>... [-o <output_path>] [-m <model_name>] [-s <sample_rate>] [-c <min_confidence>] [--recreate-index] [--gpu-mode]
8 |   thingscoop describe <file> [-n <number_of_words>] [-m <model_name>] [--recreate-index] [--gpu-mode] [-c <min_confidence>]
9 |   thingscoop index <file> [-m <model_name>] [-s <sample_rate>] [-c <min_confidence>] [-r <min_occurrences>] [--recreate-index] [--gpu-mode]
10 |   thingscoop sort <file> [-m <model_name>] [--gpu-mode] [--min-confidence <min_confidence>] [--max-section-length <max_section_length>] [-i <ignore_labels>] [--open]
11 |   thingscoop preview <file> [-m <model_name>] [--gpu-mode] [--min-confidence <min_confidence>]
12 |   thingscoop labels list [-m <model_name>]
13 |   thingscoop labels search <regexp> [-m <model_name>]
14 |   thingscoop models list
15 |   thingscoop models info <model_name>
16 |   thingscoop models freeze
17 |   thingscoop models current
18 |   thingscoop models use <model_name>
19 |   thingscoop models download <model_name>
20 |   thingscoop models remove <model_name>
21 |   thingscoop models clear
22 |
23 | Options:
24 |   --version                                      Show version.
25 |   -h --help                                      Show this screen.
26 |   -o --output                                    Output file for supercut
27 |   -s --sample-rate <sample_rate>                 How many frames to classify per second (default = 1)
28 |   -c --min-confidence <min_confidence>           Minimum prediction confidence required to consider a label (default depends on model)
29 |   -m --model <model_name>                        Model to use (use 'thingscoop models list' to see all available models)
30 |   -n --number-of-words <number_of_words>         Number of words to describe the video with (default = 5)
31 |   -t --max-section-length <max_section_length>   Max number of seconds to show examples of a label in the sorted video (default = 5)
32 |   -r --min-occurrences <min_occurrences>         Minimum number of occurrences of a label in a video required for it to be shown in the sorted video (default = 2)
33 |   -i --ignore-labels <ignore_labels>             Labels to ignore when creating the sorted video
34 |   --title <title>                                Title to show at the beginning of the video (sort mode only)
35 |   --gpu-mode                                     Enable GPU mode
36 |   --recreate-index                               Recreate object index for file if it already exists
37 |   --open                                         Open filtered video after creating it (OS X only)
38 | """
39 |
40 | import sys
41 | import os
42 | from docopt import docopt
43 |
44 | if __name__ == '__main__':
45 | if 'CAFFE_ROOT' not in os.environ:
46 | print "You need to set your CAFFE_ROOT environment variable to point to your Caffe directory"
47 | sys.exit(1)
48 | sys.path.append(os.path.join(os.environ['CAFFE_ROOT'], 'python'))
49 | os.environ['GLOG_minloglevel'] = '3'
50 | import thingscoop
51 |     args = docopt(__doc__, version="Thingscoop 0.2")
52 | sys.exit(thingscoop.main(args))
53 |
54 |
--------------------------------------------------------------------------------
/resources/clockwork_orange.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/agermanidis/thingscoop/9c61079d4469a2011e232b950a5e17c4dfaf4ef1/resources/clockwork_orange.png
--------------------------------------------------------------------------------
/resources/filter.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/agermanidis/thingscoop/9c61079d4469a2011e232b950a5e17c4dfaf4ef1/resources/filter.png
--------------------------------------------------------------------------------
/resources/header.gif:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/agermanidis/thingscoop/9c61079d4469a2011e232b950a5e17c4dfaf4ef1/resources/header.gif
--------------------------------------------------------------------------------
/resources/preview.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/agermanidis/thingscoop/9c61079d4469a2011e232b950a5e17c4dfaf4ef1/resources/preview.jpg
--------------------------------------------------------------------------------
/setup.py:
--------------------------------------------------------------------------------
1 | #!/usr/bin/env python
2 |
3 | try:
4 | from setuptools import setup
5 | except ImportError:
6 | from distutils.core import setup
7 |
8 | setup(
9 | name='thingscoop',
10 | version='0.2',
11 | description='Command-line utility for searching and filtering videos based on objects that appear in them using convolutional neural networks',
12 | author='Anastasis Germanidis',
13 | author_email='agermanidis@gmail.com',
14 | url='https://github.com/agermanidis/thingscoop',
15 | packages=['thingscoop'],
16 | scripts=['bin/thingscoop'],
17 | install_requires=[
18 | 'pyPEG2>=2.15.1',
19 | 'requests>=2.7.0',
20 | 'moviepy>=0.2.2.11',
21 | 'docopt>=0.6.2',
22 | 'progressbar>=2.3',
23 | 'numpy>=1.9.2',
24 | 'pattern>=2.6',
25 | 'termcolor>=1.1.0'
26 | ],
27 | license="MIT"
28 | )
29 |
--------------------------------------------------------------------------------
/thingscoop/__init__.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import glob
3 | import multiprocessing
4 | import os
5 | import pydoc
6 | import subprocess
7 | import tempfile
8 | from collections import defaultdict
9 | from operator import itemgetter
10 |
11 | from progressbar import ProgressBar, Percentage, Bar, ETA
12 |
13 | from .classifier import ImageClassifier
14 | from .models import clear_models
15 | from .models import download_model
16 | from .models import get_active_model
17 | from .models import get_all_models
18 | from .models import get_downloaded_models
19 | from .models import info_model
20 | from .models import read_model
21 | from .models import remove_model
22 | from .models import use_model
23 | from .preview import preview
24 | from .query import filename_for_query
25 | from .query import validate_query
26 | from .search import label_videos
27 | from .search import filter_out_labels
28 | from .search import reverse_index
29 | from .search import search_videos
30 | from .search import threshold_labels
31 | from .utils import create_compilation
32 | from .utils import create_supercut
33 | from .utils import merge_values
34 | from .utils import search_labels
35 |
36 | def main(args):
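37 |     # Dispatch on the docopt-parsed arguments: the model/label management
38 |     # commands return early; the video commands build (or load) an index first.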
37 |     model_name = args['--model'] or args['<model_name>'] or get_active_model()
38 |
39 | if args['models']:
40 | if args['list']:
41 | models = get_all_models()
42 | for model in models:
43 | print "{0}{1}".format(model['name'].ljust(30), model['description'])
44 |         elif args['freeze']:
45 | models = get_downloaded_models()
46 | for model_name in models:
47 | print model_name
48 | elif args['current']:
49 | print get_active_model()
50 | elif args['use']:
51 | use_model(model_name)
52 | elif args['download']:
53 | download_model(model_name)
54 | elif args['info']:
55 | try:
56 | model_info = info_model(model_name)
57 | for key in ['name', 'description', 'dataset']:
58 | print "{0}: {1}".format(key.capitalize(), model_info[key])
59 |             except Exception:
60 | print "Model not found"
61 | elif args['remove']:
62 |             remove_model(args['<model_name>'])
63 | elif args['clear']:
64 | clear_models()
65 | return 0
66 |
67 | if args['labels']:
68 | download_model(model_name)
69 | model = read_model(model_name)
70 | labels = sorted(set(model.labels(with_hypernyms=True)), key=lambda l: l.lower())
71 | if args['list']:
72 | pydoc.pager('\n'.join(labels))
73 | elif args['search']:
74 |             search_labels(args['<regexp>'], labels)
75 | return 0
76 |
77 | sample_rate = float(args['--sample-rate'] or 1)
78 | min_confidence = float(args['--min-confidence'] or 0.25)
79 | number_of_words = int(args['--number-of-words'] or 5)
80 | max_section_length = float(args['--max-section-length'] or 5)
81 | min_occurrences = int(args['--min-occurrences'] or 2)
82 | ignore_labels = args['--ignore-labels']
83 | recreate_index = args['--recreate-index'] or False
84 | gpu_mode = args['--gpu-mode'] or False
85 | if ignore_labels:
86 | ignore_list = ignore_labels.split(',')
87 | else:
88 | ignore_list = []
89 |
90 | download_model(model_name)
91 | model = read_model(model_name)
92 | classifier = ImageClassifier(model, gpu_mode=gpu_mode)
93 |
94 | if args['preview']:
95 |         preview(args['<file>'], classifier)
96 | return 0
97 |
98 |     filenames = args['<files>'] or [args['<file>']]
99 |
100 | labels_by_filename = label_videos(
101 | filenames,
102 | classifier,
103 | sample_rate=sample_rate,
104 | recreate_index=recreate_index
105 | )
106 |
107 | if args['describe']:
108 | freq_dist = defaultdict(lambda: 0)
109 | for (t, labels) in threshold_labels(merge_values(labels_by_filename), min_confidence):
110 | for label in map(itemgetter(0), labels):
111 | freq_dist[label] += 1
112 | sorted_labels = sorted(freq_dist.iteritems(), key=itemgetter(1), reverse=True)
113 | print '\n'.join(map(lambda (k, v): "{0} {1}".format(k, v), sorted_labels))
114 | return 0
115 |
116 | if args['search'] or args['filter']:
117 |         query = args['<query>']
118 | validate_query(query, model.labels(with_hypernyms=True))
119 |
120 | matching_time_regions = search_videos(
121 | labels_by_filename,
122 |             args['<query>'],
123 | classifier,
124 | sample_rate=sample_rate,
125 | recreate_index=recreate_index,
126 | min_confidence=min_confidence
127 | )
128 |
129 | if not matching_time_regions:
130 | return 0
131 |
132 | if args['search']:
133 | for filename, region in matching_time_regions:
134 | start, end = region
135 | print "%s %f %f" % (filename, start, end)
136 | return 0
137 |
138 | if args['filter']:
139 | supercut = create_supercut(matching_time_regions)
140 |
141 |             dst = args.get('<output_path>')
142 | if not dst:
143 |                 base, ext = os.path.splitext(args['<files>'][0])
144 | dst = "{0}_filtered_{1}.mp4".format(base, filename_for_query(query))
145 |
146 | supercut.to_videofile(
147 | dst,
148 | codec="libx264",
149 | temp_audiofile="temp.m4a",
150 | remove_temp=True,
151 | audio_codec="aac",
152 | )
153 |
154 | if args['--open']:
155 | subprocess.check_output(['open', dst])
156 |
157 | if args['sort']:
158 |         timed_labels = labels_by_filename[args['<file>']]
159 | timed_labels = threshold_labels(timed_labels, min_confidence)
160 | timed_labels = filter_out_labels(timed_labels, ignore_list)
161 |
162 | idx = reverse_index(
163 | timed_labels,
164 | min_occurrences=min_occurrences,
165 | max_length_per_label=max_section_length
166 | )
167 |         compilation = create_compilation(args['<file>'], idx)
168 |
169 |         dst = args.get('<output_path>')
170 | if not dst:
171 |             base, ext = os.path.splitext(args['<file>'])
172 | dst = "{0}_sorted.mp4".format(base)
173 |
174 | compilation.to_videofile(
175 | dst,
176 | codec="libx264",
177 | temp_audiofile="temp.m4a",
178 | remove_temp=True,
179 | audio_codec="aac",
180 | )
181 |
182 | if args['--open']:
183 | subprocess.check_output(['open', dst])
184 |
185 |
--------------------------------------------------------------------------------
/thingscoop/classifier.py:
--------------------------------------------------------------------------------
1 | import cPickle
2 | import caffe
3 | import cv2
4 | import glob
5 | import logging
6 | import numpy
7 | import os
8 |
9 | class ImageClassifier(object):
10 | def __init__(self, model, gpu_mode=False):
11 | self.model = model
12 |
13 | kwargs = {}
14 |
15 | if self.model.get("image_dims"):
16 | kwargs['image_dims'] = tuple(self.model.get("image_dims"))
17 |
18 | if self.model.get("channel_swap"):
19 | kwargs['channel_swap'] = tuple(self.model.get("channel_swap"))
20 |
21 | if self.model.get("raw_scale"):
22 | kwargs['raw_scale'] = float(self.model.get("raw_scale"))
23 |
24 | if self.model.get("mean"):
25 | kwargs['mean'] = numpy.array(self.model.get("mean"))
26 |
27 | self.net = caffe.Classifier(
28 | model.deploy_path(),
29 | model.model_path(),
30 | **kwargs
31 | )
32 |
33 | self.confidence_threshold = 0.1
34 |
35 | if gpu_mode:
36 | caffe.set_mode_gpu()
37 | else:
38 | caffe.set_mode_cpu()
39 |
40 | self.labels = numpy.array(model.labels())
41 |
42 | if self.model.bet_path():
43 | self.bet = cPickle.load(open(self.model.bet_path()))
44 | self.bet['words'] = map(lambda w: w.replace(' ', '_'), self.bet['words'])
45 | else:
46 | self.bet = None
47 |
48 | self.net.forward()
49 |
50 | def classify_image(self, filename):
51 | image = caffe.io.load_image(open(filename))
52 | scores = self.net.predict([image], oversample=True).flatten()
53 |
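54 |         # If the model ships with a "bet" file (a precomputed label hierarchy,
55 |         # as in Caffe's web demo), fold leaf-class scores into hypernym scores
56 |         # by expected information gain instead of using raw class probabilities.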
54 | if self.bet:
55 | expected_infogain = numpy.dot(self.bet['probmat'], scores[self.bet['idmapping']])
56 | expected_infogain *= self.bet['infogain']
57 | infogain_sort = expected_infogain.argsort()[::-1]
58 | results = [
59 | (self.bet['words'][v], float(expected_infogain[v]))
60 | for v in infogain_sort
61 | if expected_infogain[v] > self.confidence_threshold
62 | ]
63 |
64 | else:
65 | indices = (-scores).argsort()
66 | predictions = self.labels[indices]
67 | results = [
68 | (p, float(scores[i]))
69 | for i, p in zip(indices, predictions)
70 | if scores[i] > self.confidence_threshold
71 | ]
72 |
73 | return results
74 |
75 |
--------------------------------------------------------------------------------
/thingscoop/models.py:
--------------------------------------------------------------------------------
1 | import glob
2 | import os
3 | import requests
4 | import shutil
5 | import tempfile
6 | import urlparse
7 | import yaml
8 | import zipfile
9 | import urllib
10 | from pattern.en import wordnet
11 | from progressbar import ProgressBar, Percentage, Bar, ETA, FileTransferSpeed
12 |
13 | from .utils import get_hypernyms
14 |
15 | THINGSCOOP_DIR = os.path.join(os.path.expanduser("~"), ".thingscoop")
16 | CONFIG_PATH = os.path.join(THINGSCOOP_DIR, "config.yml")
17 |
18 | DEFAULT_CONFIG = {
19 | 'repo_url': 'https://s3.amazonaws.com/haystack-models/',
20 | 'active_model': 'googlenet_imagenet'
21 | }
22 |
23 | class CouldNotFindModel(Exception):
24 | pass
25 |
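26 | # A model bundle is a directory containing an info.yml plus the files it
27 | # names. A sketch of the fields this module reads (values illustrative):
28 | #
29 | #   pretrained_model_file: model.caffemodel
30 | #   deploy_file: deploy.prototxt
31 | #   labels_file: labels.txt
32 | #   bet_file: bet.pickle          # optional
33 | #   image_dims: [256, 256]        # optional
34 | #   channel_swap: [2, 1, 0]       # optional
35 | #   raw_scale: 255.0              # optional
36 | #   mean: [104.0, 117.0, 123.0]   # optional
37 |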
26 | class Model(object):
27 | def __init__(self, name, model_dir):
28 | self.name = name
29 | self.model_dir = model_dir
30 | self.info = yaml.load(open(os.path.join(model_dir, "info.yml")))
31 |
32 | def get(self, k):
33 | return self.info.get(k)
34 |
35 | def model_path(self):
36 | return os.path.join(self.model_dir, self.info['pretrained_model_file'])
37 |
38 | def deploy_path(self):
39 | return os.path.join(self.model_dir, self.info['deploy_file'])
40 |
41 | def label_path(self):
42 | return os.path.join(self.model_dir, self.info['labels_file'])
43 |
44 | def bet_path(self):
45 | if 'bet_file' not in self.info: return None
46 | return os.path.join(self.model_dir, self.info['bet_file'])
47 |
48 | def labels(self, with_hypernyms=False):
49 | ret = map(str.strip, open(self.label_path()).readlines())
50 | if with_hypernyms:
51 | for label in list(ret):
52 | ret.extend(get_hypernyms(label))
53 | return ret
54 |
55 | def read_config():
56 | if not os.path.exists(CONFIG_PATH):
57 | write_config(DEFAULT_CONFIG)
58 | return yaml.load(open(CONFIG_PATH))
59 |
60 | def write_config(config):
61 | if not os.path.exists(THINGSCOOP_DIR):
62 | os.makedirs(THINGSCOOP_DIR)
63 | yaml.dump(config, open(CONFIG_PATH, 'wb'))
64 |
65 | def get_repo_url():
66 | return read_config()['repo_url']
67 |
68 | def get_active_model():
69 | return read_config()['active_model']
70 |
71 | def get_model_url(model):
72 | return get_repo_url() + model + ".zip"
73 |
74 | def get_model_local_path(model):
75 | return os.path.join(THINGSCOOP_DIR, "models", model)
76 |
77 | def set_config(k, v):
78 | config = read_config()
79 | config[k] = v
80 | write_config(config)
81 |
82 | def model_in_cache(model):
83 | return os.path.exists(get_model_local_path(model))
84 |
85 | def get_models_path():
86 | return os.path.join(THINGSCOOP_DIR, "models")
87 |
88 | def get_all_models():
89 | return yaml.load(requests.get(urlparse.urljoin(get_repo_url(), "info.yml")).text)
90 |
91 | def info_model(model):
92 | models = get_all_models()
93 | for model_info in models:
94 | if model_info['name'] == model:
95 | return model_info
96 |
97 | def use_model(model):
98 | download_model(model)
99 | set_config("active_model", model)
100 |
101 | def remove_model(model):
102 | path = get_model_local_path(model)
103 | shutil.rmtree(path)
104 |
105 | def clear_models():
106 | for model in get_downloaded_models():
107 | remove_model(model)
108 |
109 | def read_model(model_name):
110 | if not model_in_cache(model_name):
111 | raise CouldNotFindModel, "Could not find model {}".format(model_name)
112 | return Model(model_name, get_model_local_path(model_name))
113 |
114 | def get_downloaded_models():
115 | return map(os.path.basename, glob.glob(os.path.join(get_models_path(), "*")))
116 |
117 | progress_bar = None
118 | def download_model(model):
119 |     global progress_bar
120 |     if model_in_cache(model): return
121 |     model_url = get_model_url(model)
122 |     tmp_zip = tempfile.NamedTemporaryFile(suffix=".zip")
123 |     prompt = "Downloading model {}".format(model)
124 |     def cb(count, block_size, total_size):
125 |         global progress_bar
126 |         if not progress_bar:
127 |             widgets = [prompt, Percentage(), ' ', Bar(), ' ', FileTransferSpeed(), ' ', ETA()]
128 |             progress_bar = ProgressBar(widgets=widgets, maxval=int(total_size)).start()
129 |         progress_bar.update(min(total_size, count * block_size))
130 |     urllib.urlretrieve(model_url, tmp_zip.name, cb)
131 |     # Finish and reset the module-level bar so a later download starts fresh
132 |     if progress_bar:
133 |         progress_bar.finish()
134 |         progress_bar = None
130 | z = zipfile.ZipFile(tmp_zip)
131 | out_path = get_model_local_path(model)
132 |     if not os.path.exists(out_path):
133 |         os.makedirs(out_path)
136 | for name in z.namelist():
137 | if name.startswith("_"): continue
138 | z.extract(name, out_path)
139 |
140 |
--------------------------------------------------------------------------------
/thingscoop/preview.py:
--------------------------------------------------------------------------------
1 | import cv2
2 | import datetime
3 | import math
4 | import numpy
5 | import random
6 | import re
7 | import subprocess
8 | import sys
9 | import caffe
10 | import tempfile
11 |
12 | def duration_string_to_timedelta(s):
13 | [hours, minutes, seconds] = map(int, s.split(':'))
14 | seconds = seconds + minutes * 60 + hours * 3600
15 | return datetime.timedelta(seconds=seconds)
16 |
17 | def get_video_duration(path):
18 | result = subprocess.Popen(["ffprobe", path], stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
19 | matches = [x for x in result.stdout.readlines() if "Duration" in x]
20 | duration_string = re.findall(r'Duration: ([0-9:]*)', matches[0])[0]
21 | return math.ceil(duration_string_to_timedelta(duration_string).seconds)
22 |
23 | def get_current_position(c):
24 | return int(c.get(cv2.cv.CV_CAP_PROP_POS_MSEC)/1000)
25 |
26 | def add_text_to_frame(frame, text):
27 | ret, _ = cv2.getTextSize(text, cv2.FONT_HERSHEY_PLAIN, 1, 1)
28 | ret = (ret[0] + 20, ret[1] + 20)
29 | cv2.rectangle(frame, (0,0), ret, (0, 0, 0), cv2.cv.CV_FILLED)
30 | cv2.putText(frame, text, (5, 20), cv2.FONT_HERSHEY_PLAIN, 1, (255, 255, 255))
31 |
32 | def format_classification(result):
33 | str_list = []
34 | for label, confidence in result:
35 | str_list.append("{0} ({1})".format(label, confidence))
36 | return ', '.join(str_list)
37 |
38 | def preview(filename, classifier):
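39 |     # Play the video in an OpenCV window, classify roughly one frame per
40 |     # second, and overlay the most recent labels on each frame; Esc exits.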
39 | cv2.namedWindow('video')
40 |
41 | duration = int(get_video_duration(filename))
42 |
43 | def trackbar_change(t):
44 | cap.set(cv2.cv.CV_CAP_PROP_POS_MSEC, t*1000)
45 |
46 | trackbar_prompt = 'Current position:'
47 | cv2.createTrackbar(trackbar_prompt, 'video', 0, duration, trackbar_change)
48 |
49 | cap = cv2.VideoCapture(filename)
50 |
51 | classification_result = None
52 | previous_time_in_seconds = None
53 | current_pos = 0
54 |
55 | tmp = tempfile.NamedTemporaryFile(suffix=".png")
56 |
57 |     while cap.isOpened():
58 |         ret, frame = cap.read()
59 |
60 |         # Stop when the video ends or a frame cannot be decoded
61 |         if not ret:
62 |             break
63 |
64 |         cv2.imwrite(tmp.name, frame)
65 |
66 |         current_pos = get_current_position(cap)
67 |
68 |         # Only reclassify when playback crosses a one-second boundary
69 |         if current_pos != previous_time_in_seconds:
70 |             previous_time_in_seconds = current_pos
71 |             classification_result = classifier.classify_image(tmp.name)
72 |
73 |         if classification_result:
74 |             add_text_to_frame(frame, format_classification(classification_result))
75 |
76 |         cv2.imshow('video', frame)
77 |
78 |         cv2.setTrackbarPos(trackbar_prompt, 'video', current_pos)
79 |
80 |         k = cv2.waitKey(1) & 0xFF
81 |         if k == 27:  # Esc
82 |             break
79 |
80 | cap.release()
81 | cv2.destroyAllWindows()
82 |
83 |
--------------------------------------------------------------------------------
/thingscoop/query.py:
--------------------------------------------------------------------------------
1 | import re
2 | from pypeg2 import *
3 | from pattern.en import wordnet
4 |
5 | from .utils import get_hypernyms
6 |
7 | class NoSuchLabelError(Exception):
8 | pass
9 |
10 | class Q(object):
11 | def evaluate(self, labels):
12 | return self.content.evaluate(labels)
13 |
14 | class Label(object):
15 |     grammar = attr("name", re.compile(r"[A-Za-z][A-Za-z_\- ]+[A-Za-z]"))  # letters, underscores, hyphens, spaces
16 |
17 | def evaluate(self, labels):
18 | for label in labels:
19 | if label == self.name or self.name in get_hypernyms(label):
20 | return True
21 | return False
22 |
23 | class UnaryOp(object):
24 | grammar = (attr("operator", re.compile("!")),
25 | attr("content", Label))
26 |
27 | def evaluate(self, labels):
28 | return not self.content.evaluate(labels)
29 |
30 | class ParentheticalQ(List):
31 | grammar = (re.compile("\("),
32 | attr("content", Q),
33 | re.compile("\)"))
34 |
35 | def evaluate(self, labels):
36 | return self.content.evaluate(labels)
37 |
38 | class BinaryOp(object):
39 | grammar = (attr("lh", [ParentheticalQ, UnaryOp, Label]),
40 | attr("op", re.compile("(&&|\|\|)")),
41 | attr("rh", Q))
42 |
43 | def evaluate(self, labels):
44 | lv = self.lh.evaluate(labels)
45 | rv = self.rh.evaluate(labels)
46 |
47 | if self.op == '||': return lv or rv
48 | else: return lv and rv
49 |
50 | Q.grammar = (
51 | maybe_some(whitespace),
52 | attr("content", [BinaryOp, ParentheticalQ, UnaryOp, Label]),
53 | maybe_some(whitespace)
54 | )
55 |
56 | def get_labels(parsed_q):
57 | if type(parsed_q) == Label:
58 | return [parsed_q.name]
59 | if type(parsed_q) == BinaryOp:
60 | return get_labels(parsed_q.lh) + get_labels(parsed_q.rh)
61 | return get_labels(parsed_q.content)
62 |
63 | def validate_query(q, all_labels):
64 | parsed_q = parse(q, Q)
65 | for label in get_labels(parsed_q):
66 | if label not in all_labels:
67 | raise NoSuchLabelError, "Label \"{}\" does not exist.".format(label)
68 |
69 | def eval_query_with_labels(q, labels):
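70 |     # e.g. eval_query_with_labels('sky && !ocean', ['sky', 'tree']) -> True,
71 |     # since 'sky' is present and 'ocean' matches no label (or hypernym of one)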
70 | parsed_q = parse(q, Q)
71 | return parsed_q.evaluate(labels)
72 |
73 | def filename_for_query(q):
74 | parsed_q = parse(q, Q)
75 | labels = get_labels(parsed_q)
76 | return '_'.join(labels).replace(' ', '_')
77 |
78 |
--------------------------------------------------------------------------------
/thingscoop/search.py:
--------------------------------------------------------------------------------
1 | import glob
2 | import os
3 | import shutil
4 | import subprocess
5 | import sys
6 | import tempfile
7 | from operator import itemgetter
8 |
9 | from progressbar import ProgressBar, Percentage, Bar, ETA
10 |
11 | from .query import eval_query_with_labels
12 | from .utils import extract_frames
13 | from .utils import generate_index_path
14 | from .utils import read_index_from_path
15 | from .utils import save_index_to_path
16 |
17 | def times_to_regions(times, max_total_length=None):
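18 |     # Collapse a sorted list of per-second match timestamps into contiguous
19 |     # (start, end) regions, splitting where consecutive hits are more than a
20 |     # second apart; optionally truncate so the regions total at most
21 |     # max_total_length seconds.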
18 | if not times:
19 | return []
20 | regions = []
21 | current_start = times[0]
22 | for t1, t2 in zip(times, times[1:]):
23 | if t2 - t1 > 1:
24 | regions.append((current_start, t1 + 1))
25 | current_start = t2
26 | regions.append((current_start, times[-1] + 1))
27 | if not max_total_length:
28 | return regions
29 | accum = 0
30 | ret = []
31 |     for t1, t2 in regions:
32 | if accum + (t2-t1) > max_total_length:
33 | ret.append((t1, t1 + max_total_length-accum))
34 | return ret
35 | else:
36 | ret.append((t1, t2))
37 | accum += t2 - t1
38 | return ret
39 |
40 | def unique_labels(timed_labels):
41 | ret = set()
42 | for t, labels_list in timed_labels:
43 | ret.update(map(itemgetter(0), labels_list))
44 | return ret
45 |
46 | def reverse_index(timed_labels, min_occurrences=2, max_length_per_label=8):
47 | ret = {}
48 | for label in unique_labels(timed_labels):
49 | times = []
50 | for t, labels_list in timed_labels:
51 | raw_labels = map(lambda (l, c): l, labels_list)
52 | if eval_query_with_labels(label, raw_labels):
53 | times.append(t)
54 | if len(times) >= min_occurrences:
55 | ret[label] = times_to_regions(times, max_total_length=max_length_per_label)
56 | return ret
57 |
58 | def threshold_labels(timed_labels, min_confidence):
59 | ret = []
60 | for t, label_list in timed_labels:
61 | filtered = filter(lambda (l, c): c > min_confidence, label_list)
62 | if filtered:
63 | ret.append((t, filtered))
64 | return ret
65 |
66 | def filter_out_labels(timed_labels, ignore_list):
67 | ret = []
68 | for t, label_list in timed_labels:
69 | filtered = filter(lambda (l, c): l not in ignore_list, label_list)
70 | if filtered:
71 | ret.append((t, filtered))
72 | return ret
73 |
74 | def label_video(filename, classifier, sample_rate=1, recreate_index=False):
75 | index_filename = generate_index_path(filename, classifier.model)
76 |
77 | if os.path.exists(index_filename) and not recreate_index:
78 | return read_index_from_path(index_filename)
79 |
80 | temp_frame_dir, frames = extract_frames(filename, sample_rate=sample_rate)
81 |
82 | timed_labels = []
83 |
84 | widgets=["Labeling {}: ".format(filename), Percentage(), ' ', Bar(), ' ', ETA()]
85 | pbar = ProgressBar(widgets=widgets, maxval=len(frames)).start()
86 |
87 | for index, frame in enumerate(frames):
88 | pbar.update(index)
89 | labels = classifier.classify_image(frame)
90 | if not len(labels):
91 | continue
92 | t = (1./sample_rate) * index
93 | timed_labels.append((t, labels))
94 |
95 | shutil.rmtree(temp_frame_dir)
96 | save_index_to_path(index_filename, timed_labels)
97 |
98 | return timed_labels
99 |
100 | def label_videos(filenames, classifier, sample_rate=1, recreate_index=False):
101 | ret = {}
102 | for filename in filenames:
103 | ret[filename] = label_video(
104 | filename,
105 | classifier,
106 | sample_rate=sample_rate,
107 | recreate_index=recreate_index
108 | )
109 | return ret
110 |
111 | def search_labels(timed_labels, query, classifier, sample_rate=1, recreate_index=False, min_confidence=0.3):
112 | timed_labels = threshold_labels(timed_labels, min_confidence)
113 |
114 | times = []
115 | for t, labels_list in timed_labels:
116 | raw_labels = map(lambda (l, c): l, labels_list)
117 | if eval_query_with_labels(query, raw_labels):
118 | times.append(t)
119 |
120 | return times_to_regions(times)
121 |
122 | def search_videos(labels_by_filename, query, classifier, sample_rate=1, recreate_index=False, min_confidence=0.3):
123 | ret = []
124 | for filename, timed_labels in labels_by_filename.items():
125 | ret += map(lambda r: (filename, r), search_labels(
126 | timed_labels,
127 | query,
128 | classifier,
129 | sample_rate=sample_rate,
130 | recreate_index=recreate_index,
131 | min_confidence=min_confidence
132 | ))
133 | return ret
134 |
135 |
--------------------------------------------------------------------------------
/thingscoop/utils.py:
--------------------------------------------------------------------------------
1 | import glob
2 | import json
3 | import os
4 | import re
5 | import subprocess
6 | import tempfile
7 | import textwrap
8 | from moviepy.editor import VideoFileClip, TextClip, ImageClip, concatenate_videoclips
9 | from pattern.en import wordnet
10 | from termcolor import colored
11 | from PIL import Image, ImageDraw, ImageFont
12 |
13 | def create_title_frame(title, dimensions, fontsize=60):
14 | para = textwrap.wrap(title, width=30)
15 |     im = Image.new('RGB', dimensions, (0, 0, 0))
16 | draw = ImageDraw.Draw(im)
17 | font = ImageFont.truetype('resources/Helvetica.ttc', fontsize)
18 | total_height = sum(map(lambda l: draw.textsize(l, font=font)[1], para))
19 | current_h, pad = (dimensions[1]/2-total_height/2), 10
20 | for line in para:
21 | w, h = draw.textsize(line, font=font)
22 | draw.text(((dimensions[0] - w) / 2, current_h), line, font=font)
23 | current_h += h + pad
24 | f = tempfile.NamedTemporaryFile(suffix=".png", delete=False)
25 | im.save(f.name)
26 | return f.name
27 |
28 | def get_video_dimensions(filename):
29 | p = subprocess.Popen(['ffprobe', filename], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
30 |     _, out = p.communicate()  # ffprobe prints stream info on stderr
31 | for line in out.split('\n'):
32 | if re.search('Video: ', line):
33 | match = re.findall('[1-9][0-9]*x[1-9][0-9]*', line)[0]
34 | return tuple(map(int, match.split('x')))
35 |
36 | def extract_frames(filename, sample_rate=1):
37 | dest_dir = tempfile.mkdtemp()
38 | dest = os.path.join(dest_dir, "%10d.png")
39 | subprocess.check_output(["ffmpeg", "-i", filename, "-vf", "fps="+str(sample_rate), dest])
40 | glob_pattern = os.path.join(dest_dir, "*.png")
41 | return dest_dir, glob.glob(glob_pattern)
42 |
43 | def generate_index_path(filename, model):
44 | name, ext = os.path.splitext(filename)
45 | return "{name}_{model_name}.json".format(name=name, model_name=model.name)
46 |
47 | def read_index_from_path(filename):
48 | return json.load(open(filename))
49 |
50 | def save_index_to_path(filename, timed_labels):
51 | json.dump(timed_labels, open(filename, 'w'), indent=4)
52 |
53 | def create_supercut(regions):
54 | subclips = []
55 | filenames = set(map(lambda (filename, _): filename, regions))
56 | video_files = {filename: VideoFileClip(filename) for filename in filenames}
57 | for filename, region in regions:
58 | subclip = video_files[filename].subclip(*region)
59 | subclips.append(subclip)
60 | if not subclips: return None
61 | return concatenate_videoclips(subclips)
62 |
63 | def label_as_title(label):
64 | return label.replace('_', ' ').upper()
65 |
66 | def create_compilation(filename, index):
67 | dims = get_video_dimensions(filename)
68 | subclips = []
69 | video_file = VideoFileClip(filename)
70 | for label in sorted(index.keys()):
71 | label_img_filename = create_title_frame(label_as_title(label), dims)
72 | label_clip = ImageClip(label_img_filename, duration=2)
73 | os.remove(label_img_filename)
74 | subclips.append(label_clip)
75 | for region in index[label]:
76 | subclip = video_file.subclip(*region)
77 | subclips.append(subclip)
78 | if not subclips: return None
79 | return concatenate_videoclips(subclips)
80 |
81 | def search_labels(r, labels):
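82 |     # Print every label matching the regular expression r, highlighting the
83 |     # matched substrings in bold red via termcolor.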
82 | r = re.compile(r)
83 | for label in labels:
84 | if not r.search(label):
85 | continue
86 | current_i = 0
87 | ret = ''
88 | for m in r.finditer(label):
89 | ret += label[current_i:m.start()]
90 | ret += colored(label[m.start():m.end()], 'red', attrs=['bold'])
91 | current_i = m.end()
92 | ret += label[m.end():]
93 | print ret
94 |
95 | def get_hypernyms(label):
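96 |     # Walk WordNet (via pattern.en) upward from the label's first synset so
97 |     # broader queries (e.g. "musical instrument") can match frames labeled
98 |     # with more specific terms (e.g. "violin").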
96 | synsets = wordnet.synsets(label)
97 | if not synsets: return []
98 | return map(lambda s: s.synonyms[0], synsets[0].hypernyms(True))
99 |
100 | def merge_values(d):
101 | ret = []
102 | for lst in d.values():
103 | ret += lst
104 | return ret
105 |
106 |
--------------------------------------------------------------------------------