├── .gitignore
├── LICENSE
├── README.md
├── ffmpeg_reader.py
├── py_ops.py
└── usage_example.py


/.gitignore:
--------------------------------------------------------------------------------
 1 | # Byte-compiled / optimized / DLL files
 2 | __pycache__/
 3 | *.py[cod]
 4 | *$py.class
 5 | 
 6 | # C extensions
 7 | *.so
 8 | 
 9 | # Distribution / packaging
10 | .Python
11 | env/
12 | build/
13 | develop-eggs/
14 | dist/
15 | downloads/
16 | eggs/
17 | .eggs/
18 | lib/
19 | lib64/
20 | parts/
21 | sdist/
22 | var/
23 | *.egg-info/
24 | .installed.cfg
25 | *.egg
26 | 
27 | # PyInstaller
28 | #  Usually these files are written by a python script from a template
29 | #  before PyInstaller builds the exe, so as to inject date/other infos into it.
30 | *.manifest
31 | *.spec
32 | 
33 | # Installer logs
34 | pip-log.txt
35 | pip-delete-this-directory.txt
36 | 
37 | # Unit test / coverage reports
38 | htmlcov/
39 | .tox/
40 | .coverage
41 | .coverage.*
42 | .cache
43 | nosetests.xml
44 | coverage.xml
45 | *,cover
46 | .hypothesis/
47 | 
48 | # Translations
49 | *.mo
50 | *.pot
51 | 
52 | # Django stuff:
53 | *.log
54 | local_settings.py
55 | 
56 | # Flask stuff:
57 | instance/
58 | .webassets-cache
59 | 
60 | # Scrapy stuff:
61 | .scrapy
62 | 
63 | # Sphinx documentation
64 | docs/_build/
65 | 
66 | # PyBuilder
67 | target/
68 | 
69 | # IPython Notebook
70 | .ipynb_checkpoints
71 | 
72 | # pyenv
73 | .python-version
74 | 
75 | # celery beat schedule file
76 | celerybeat-schedule
77 | 
78 | # dotenv
79 | .env
80 | 
81 | # virtualenv
82 | venv/
83 | ENV/
84 | 
85 | # Spyder project settings
86 | .spyderproject
87 | 
88 | # Rope project settings
89 | .ropeproject
90 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2017 Víctor Campos
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # Loading videos in TensorFlow with FFmpeg
 2 | 
 3 | TensorFlow does not offer a native operation for decoding videos inside the graph. The usual workaround is to extract and store the video frames as individual images and then load them with TensorFlow's native image ops. Such an approach, however, does not scale and becomes prohibitive for large video collections. This repository contains a Python op that wraps FFmpeg and makes it possible to decode and load videos directly inside the graph.
 4 | 
 5 | 
 6 | ## Requirements
 7 | 
 8 | This code has been tested with [TensorFlow](https://www.tensorflow.org) 1.0. [FFmpeg](https://ffmpeg.org) is required in order to perform video decoding.
 9 | 
10 | 
11 | ## Usage
12 | 
13 | Videos can be decoded through the *decode_video()* Python op, similarly to how an image would be decoded through *[tf.image.decode_jpeg()](https://www.tensorflow.org/api_docs/python/tf/image/decode_jpeg)*. While images are decoded into 3D tensors with shape (H, W, D), videos are decoded into 4D tensors with shape (T, H, W, D), where T is the number of frames.
14 | 
15 | Please see `usage_example.py` for an example of how to use the custom op. If FFmpeg is not found, make sure that the proper path to the binary is set, either by modifying the *FFMPEG_BIN* variable in `ffmpeg_reader.py` or by calling *set_ffmpeg_bin()* from that module.
16 | 
17 | 
18 | ## References
19 | 
20 | The code for decoding videos using FFmpeg within Python is a modification of the one in the [moviepy project](https://github.com/Zulko/moviepy).
21 | 
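
The (T, H, W, D) layout described in the Usage section can be illustrated without TensorFlow or FFmpeg. The following is a minimal sketch, assuming raw `rgb24` output in which every frame occupies exactly `height * width * 3` bytes. `split_raw_frames` is a hypothetical helper written for this illustration and is not part of the repository.

```python
import io


def split_raw_frames(stream, width, height, depth=3):
    """Split a raw byte stream into frames of height x width x depth bytes.

    Hypothetical helper for illustration: it mimics how a reader consumes
    raw video output one fixed-size frame at a time.
    """
    nbytes = width * height * depth
    frames = []
    while True:
        chunk = stream.read(nbytes)
        if len(chunk) != nbytes:  # end of stream or truncated last frame
            break
        # Nested lists indexed as frame[row][col][channel]
        frame = [
            [list(chunk[(r * width + c) * depth:(r * width + c + 1) * depth])
             for c in range(width)]
            for r in range(height)
        ]
        frames.append(frame)
    return frames


# Two frames of width 2 and height 1: 2 * 1 * 3 = 6 bytes each
raw = bytes(range(12))
video = split_raw_frames(io.BytesIO(raw), width=2, height=1)
shape = (len(video), len(video[0]), len(video[0][0]), len(video[0][0][0]))
print(shape)  # (T, H, W, D)
```

The repository itself reshapes each frame with NumPy instead of building nested lists; the lists are used here only to keep the sketch free of dependencies.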


--------------------------------------------------------------------------------
/ffmpeg_reader.py:
--------------------------------------------------------------------------------
  1 | """
  2 | This file is a derivative of a file from the moviepy project (https://github.com/Zulko/moviepy), 
  3 | released under the MIT licence (Copyright Zulko 2017)
  4 | 
  5 | The original file can be found in the moviepy repository:
  6 |     https://github.com/Zulko/moviepy/blob/master/moviepy/video/io/ffmpeg_reader.py
  7 | 
  8 | --------------------------------------------------------------------------------------------------
  9 | 
 10 | This module implements all the functions to read a video or a picture
 11 | using ffmpeg. It is quite ugly, as there are many pitfalls to avoid
 12 | """
 13 | 
 14 | from __future__ import division
 15 | 
 16 | 
 17 | import os
 18 | import re
 19 | import logging
 20 | import warnings
 21 | import numpy as np
 22 | import subprocess as sp
 23 | 
 24 | logging.captureWarnings(True)
 25 | 
 26 | try:
 27 |     from subprocess import DEVNULL  # py3k
 28 | except ImportError:
 29 |     DEVNULL = open(os.devnull, 'wb')
 30 | 
 31 | 
 32 | # Default path to ffmpeg binary
 33 | # Note: you may need to set the absolute path to the ffmpeg binary depending on your installation
 34 | # either by modifying FFMPEG_BIN or calling set_ffmpeg_bin() at the beginning of your script
 35 | FFMPEG_BIN = 'ffmpeg'
 36 | 
 37 | 
 38 | def get_ffmpeg_bin():
 39 |     """Get path to FFmpeg binary."""
 40 |     return FFMPEG_BIN
 41 | 
 42 | 
 43 | def set_ffmpeg_bin(path):
 44 |     """Set path to FFmpeg binary."""
 45 |     global FFMPEG_BIN
 46 |     FFMPEG_BIN = path
 47 | 
 48 | 
 49 | def is_string(obj):
 50 |     """ Returns True if obj is a string or string-like object,
 51 |     compatible with Python 2 and Python 3."""
 52 |     try:
 53 |         return isinstance(obj, basestring)
 54 |     except NameError:
 55 |         return isinstance(obj, str)
 56 | 
 57 | 
 58 | def cvsecs(time):
 59 |     """ Will convert any time into seconds.
 60 |     Here are the accepted formats:
 61 |     >>> cvsecs(15.4) -> 15.4 # seconds
 62 |     >>> cvsecs( (1,21.5) ) -> 81.5 # (min,sec)
 63 |     >>> cvsecs( (1,1,2) ) -> 3662 # (hr, min, sec)
 64 |     >>> cvsecs('01:01:33.5') -> 3693.5  #(hr,min,sec)
 65 |     >>> cvsecs('01:01:33.045') -> 3693.045
 66 |     >>> cvsecs('01:01:33,5') # a comma works too
 67 |     """
 68 | 
 69 |     if is_string(time):
 70 |         if (',' not in time) and ('.' not in time):
 71 |             time = time + '.0'
 72 |         expr = r"(\d+):(\d+):(\d+)[,.](\d+)"
 73 |         finds = re.findall(expr, time)[0]
 74 |         nums = list( map(float, finds) )
 75 |         return ( 3600*int(finds[0])
 76 |                 + 60*int(finds[1])
 77 |                 + int(finds[2])
 78 |                 + nums[3]/(10**len(finds[3])))
 79 | 
 80 |     elif isinstance(time, tuple):
 81 |         if len(time)== 3:
 82 |             hr, mn, sec = time
 83 |         elif len(time)== 2:
 84 |             hr, mn, sec = 0, time[0], time[1]
 85 |         return 3600*hr + 60*mn + sec
 86 | 
 87 |     else:
 88 |         return time
 89 | 
 90 | 
 91 | class FFMPEG_VideoReader:
 92 |     def __init__(self, filename, print_infos=False, bufsize=None,
 93 |                  pix_fmt="rgb24", check_duration=True, target_fps=-1):
 94 | 
 95 |         self.filename = filename
 96 |         infos = ffmpeg_parse_infos(filename, print_infos, check_duration)
 97 |         self.fps = infos['video_fps']
 98 |         self.size = infos['video_size']
 99 |         self.duration = infos['video_duration']
100 |         self.ffmpeg_duration = infos['duration']
101 |         self.nframes = infos['video_nframes']
102 | 
103 |         self.infos = infos
104 | 
105 |         self.pix_fmt = pix_fmt
106 |         if pix_fmt == 'rgba':
107 |             self.depth = 4
108 |         else:
109 |             self.depth = 3
110 | 
111 |         if bufsize is None:
112 |             w, h = self.size
113 |             bufsize = self.depth * w * h + 100
114 | 
115 |         self.target_fps = target_fps
116 | 
117 |         self.bufsize = bufsize
118 |         self.initialize()
119 | 
120 |         self.pos = 1
121 |         self.lastread = self.read_frame()
122 | 
123 |     def initialize(self, starttime=0):
124 |         """Opens the file, creates the pipe. """
125 | 
126 |         self.close()  # if any
127 | 
128 |         if starttime != 0:
129 |             offset = min(1, starttime)
130 |             i_arg = ['-ss', "%.06f" % (starttime - offset),
131 |                      '-i', self.filename,
132 |                      '-ss', "%.06f" % offset]
133 |         else:
134 |             i_arg = ['-i', self.filename]
135 | 
136 |         if self.target_fps > 0:
137 |             cmd = ([get_ffmpeg_bin()] + i_arg +
138 |                    ['-loglevel', 'error',
139 |                     '-f', 'image2pipe',
140 |                     '-vf', 'fps=%d' % self.target_fps,
141 |                     "-pix_fmt", self.pix_fmt,
142 |                     '-vcodec', 'rawvideo', '-'])
143 |         else:
144 |             cmd = ([get_ffmpeg_bin()] + i_arg +
145 |                    ['-loglevel', 'error',
146 |                     '-f', 'image2pipe',
147 |                     "-pix_fmt", self.pix_fmt,
148 |                     '-vcodec', 'rawvideo', '-'])
149 | 
150 |         popen_params = {"bufsize": self.bufsize,
151 |                         "stdout": sp.PIPE,
152 |                         "stderr": sp.PIPE,
153 |                         "stdin": DEVNULL}
154 | 
155 |         if os.name == "nt":
156 |             popen_params["creationflags"] = 0x08000000
157 | 
158 |         self.proc = sp.Popen(cmd, **popen_params)
159 | 
160 |     def skip_frames(self, n=1):
161 |         """Reads and throws away n frames """
162 |         w, h = self.size
163 |         for i in range(n):
164 |             self.proc.stdout.read(self.depth * w * h)
165 |             # self.proc.stdout.flush()
166 |         self.pos += n
167 | 
168 |     def read_frame(self):
169 |         w, h = self.size
170 |         nbytes = self.depth * w * h
171 | 
172 |         s = self.proc.stdout.read(nbytes)
173 |         if len(s) != nbytes:
174 | 
175 |             warnings.warn("Warning: in file %s, " % (self.filename) +
176 |                           "%d bytes wanted but %d bytes read," % (nbytes, len(s)) +
177 |                           "at frame %d/%d, at time %.02f/%.02f sec. " % (
178 |                               self.pos, self.nframes,
179 |                               1.0 * self.pos / self.fps,
180 |                               self.duration) +
181 |                           "Using the last valid frame instead.",
182 |                           UserWarning)
183 | 
184 |             if not hasattr(self, 'lastread'):
185 |                 raise IOError(("FFMPEG_VideoReader error: failed to read the first frame of "
186 |                                "video file %s. That might mean that the file is "
187 |                                "corrupted. That may also mean that you are using "
188 |                                "a deprecated version of FFMPEG. On Ubuntu/Debian "
189 |                                "for instance the version in the repos is deprecated. "
190 |                                "Please update to a recent version from the website.") % (
191 |                                   self.filename))
192 | 
193 |             result = self.lastread
194 | 
195 |         else:
196 | 
197 |             result = np.frombuffer(s, dtype='uint8')  # np.fromstring is deprecated for binary data
198 |             result = result.reshape((h, w, len(s) // (w * h)))
199 |             self.lastread = result
200 | 
201 |         return result
202 | 
203 |     def get_frame(self, t, fps=None):
204 |         """ Read a file video frame at time t.
205 |         Note for coders: getting an arbitrary frame in the video with
206 |         ffmpeg can be painfully slow if some decoding has to be done.
207 |         This function tries to avoid fetching arbitrary frames
208 |         whenever possible, by moving between adjacent frames.
209 |         """
210 | 
211 |         # these definitely need to be rechecked sometime. Seems to work.
212 | 
213 |         # I use that horrible '+0.00001' hack because sometimes due to numerical
214 |         # imprecisions a 3.0 can become a 2.99999999... which makes the int()
215 |         # go to the previous integer. This makes the fetching more robust in the
216 |         # case where you get the nth frame by writing get_frame(n/fps).
217 | 
218 |         if fps is None:
219 |             fps = self.fps
220 | 
221 |         pos = int(fps * t + 0.00001) + 1
222 | 
223 |         if pos == self.pos:
224 |             return self.lastread
225 |         else:
226 |             if (pos < self.pos) or (pos > self.pos + 100):
227 |                 self.initialize(t)
228 |                 self.pos = pos
229 |             else:
230 |                 self.skip_frames(pos - self.pos - 1)
231 |             result = self.read_frame()
232 |             self.pos = pos
233 |             return result
234 | 
235 |     def close(self):
236 |         if hasattr(self, 'proc'):
237 |             self.proc.terminate()
238 |             self.proc.stdout.close()
239 |             self.proc.stderr.close()
240 |             del self.proc
241 | 
242 |     def __del__(self):
243 |         self.close()
244 |         if hasattr(self, 'lastread'):
245 |             del self.lastread
246 | 
247 | 
248 | def ffmpeg_read_image(filename, with_mask=True):
249 |     """ Read an image file (PNG, BMP, JPEG...).
250 |     Wraps FFMPEG_VideoReader to read just one image.
251 |     Returns the decoded image as a numpy array of shape (h, w, depth).
252 |     In MoviePy, where this function originates, ImageClip is the
253 |     intended way to make clips out of image files.
254 |     Parameters
255 |     -----------
256 |     filename
257 |       Name of the image file. Can be of any format supported by ffmpeg.
258 |     with_mask
259 |       If the image has a transparency layer, ``with_mask=True`` decodes
260 |       the image as 'rgba' so that this layer is kept as a fourth channel.
261 |     """
262 |     if with_mask:
263 |         pix_fmt = 'rgba'
264 |     else:
265 |         pix_fmt = "rgb24"
266 |     reader = FFMPEG_VideoReader(filename, pix_fmt=pix_fmt, check_duration=False)
267 |     im = reader.lastread
268 |     del reader
269 |     return im
270 | 
271 | 
272 | def ffmpeg_parse_infos(filename, print_infos=False, check_duration=True):
273 |     """Get file infos using ffmpeg.
274 |     Returns a dictionary with the fields:
275 |     "duration", "video_found", "video_size", "video_fps",
276 |     "video_nframes", "video_duration", "audio_found", "audio_fps".
277 |     The video_* and audio_fps fields are only set when the matching
278 |     stream is found; "video_duration" currently equals "duration".
279 |     """
280 | 
281 |     # open the file in a pipe, provoke an error, read output
282 |     is_GIF = filename.endswith('.gif')
283 |     cmd = [get_ffmpeg_bin(), "-i", filename]
284 |     if is_GIF:
285 |         cmd += ["-f", "null", "/dev/null"]
286 | 
287 |     popen_params = {"bufsize": 10 ** 5,
288 |                     "stdout": sp.PIPE,
289 |                     "stderr": sp.PIPE,
290 |                     "stdin": DEVNULL}
291 | 
292 |     if os.name == "nt":
293 |         popen_params["creationflags"] = 0x08000000
294 | 
295 |     proc = sp.Popen(cmd, **popen_params)
296 | 
297 |     proc.stdout.readline()
298 |     proc.terminate()
299 |     infos = proc.stderr.read().decode('utf8')
300 |     del proc
301 | 
302 |     if print_infos:
303 |         # print the whole info text returned by FFMPEG
304 |         print(infos)
305 | 
306 |     lines = infos.splitlines()
307 |     if "No such file or directory" in lines[-1]:
308 |         raise IOError(("MoviePy error: the file %s could not be found !\n"
309 |                        "Please check that you entered the correct "
310 |                        "path.") % filename)
311 | 
312 |     result = dict()
313 | 
314 |     # get duration (in seconds)
315 |     result['duration'] = None
316 | 
317 |     if check_duration:
318 |         try:
319 |             keyword = ('frame=' if is_GIF else 'Duration: ')
320 |             line = [l for l in lines if keyword in l][0]
321 |             match = re.findall(r"([0-9][0-9]:[0-9][0-9]:[0-9][0-9]\.[0-9][0-9])", line)[0]
322 |             result['duration'] = cvsecs(match)
323 |         except:
324 |             raise IOError(("MoviePy error: failed to read the duration of file %s.\n"
325 |                            "Here are the file infos returned by ffmpeg:\n\n%s") % (
326 |                               filename, infos))
327 | 
328 |     # get the output line that speaks about video
329 |     lines_video = [l for l in lines if ' Video: ' in l and re.search(r'\d+x\d+', l)]
330 | 
331 |     result['video_found'] = (lines_video != [])
332 | 
333 |     if result['video_found']:
334 | 
335 |         try:
336 |             line = lines_video[0]
337 | 
338 |             # get the size, of the form 460x320 (w x h)
339 |             match = re.search(" [0-9]*x[0-9]*(,| )", line)
340 |             s = list(map(int, line[match.start():match.end() - 1].split('x')))
341 |             result['video_size'] = s
342 |         except:
343 |             raise IOError(("MoviePy error: failed to read video dimensions in file %s.\n"
344 |                            "Here are the file infos returned by ffmpeg:\n\n%s") % (
345 |                               filename, infos))
346 | 
347 |         # get the frame rate. Sometimes it's 'tbr', sometimes 'fps', sometimes
348 |         # tbc, and sometimes tbc/2...
349 |         # Current policy: Trust tbr first, then fps. If the result is close to x*1000/1001
350 |         # where x is 23, 24, 25, 30 or 50, replace it by x*1000/1001 (a very common case for the fps).
351 | 
352 |         try:
353 |             match = re.search("( [0-9]*.| )[0-9]* tbr", line)
354 |             tbr = float(line[match.start():match.end()].split(' ')[1])
355 |             result['video_fps'] = tbr
356 | 
357 |         except:
358 |             match = re.search("( [0-9]*.| )[0-9]* fps", line)
359 |             result['video_fps'] = float(line[match.start():match.end()].split(' ')[1])
360 | 
361 |         # It is known that a fps of 24 is often written as 24000/1001
362 |         # but then ffmpeg nicely rounds it to 23.98, which we hate.
363 |         coef = 1000.0 / 1001.0
364 |         fps = result['video_fps']
365 |         for x in [23, 24, 25, 30, 50]:
366 |             if (fps != x) and abs(fps - x * coef) < .01:
367 |                 result['video_fps'] = x * coef
368 | 
369 |         if check_duration:
370 |             result['video_nframes'] = int(result['duration'] * result['video_fps']) + 1
371 |             result['video_duration'] = result['duration']
372 |         else:
373 |             result['video_nframes'] = 1
374 |             result['video_duration'] = None
375 |             # We could have also recomputed the duration from the number
376 |             # of frames, as follows:
377 |             # >>> result['video_duration'] = result['video_nframes'] / result['video_fps']
378 | 
379 |     lines_audio = [l for l in lines if ' Audio: ' in l]
380 | 
381 |     result['audio_found'] = lines_audio != []
382 | 
383 |     if result['audio_found']:
384 |         line = lines_audio[0]
385 |         try:
386 |             match = re.search(" [0-9]* Hz", line)
387 |             result['audio_fps'] = int(line[match.start() + 1:match.end()])
388 |         except:
389 |             result['audio_fps'] = 'unknown'
390 | 
391 |     return result
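
The parsing strategy used by `ffmpeg_parse_infos` can be tried in isolation, without launching FFmpeg. The sketch below is a simplified illustration and not the module's implementation: the `SAMPLE` text is a hand-written approximation of what FFmpeg prints to stderr (an assumption, not a captured log), and `to_seconds`, `snap_ntsc_fps` and `parse_infos` are hypothetical names.

```python
import re

# Hand-written sample resembling the lines FFmpeg prints to stderr
# (an assumption for illustration, not a captured log).
SAMPLE = """\
  Duration: 00:01:02.50, start: 0.000000, bitrate: 1205 kb/s
    Stream #0:0: Video: h264, yuv420p, 640x360, 1100 kb/s, 29.97 fps, 29.97 tbr, 90k tbn
    Stream #0:1: Audio: aac, 44100 Hz, stereo, fltp, 96 kb/s
"""


def to_seconds(stamp):
    """Convert 'HH:MM:SS.xx' into seconds."""
    hours, minutes, seconds = stamp.split(':')
    return 3600 * int(hours) + 60 * int(minutes) + float(seconds)


def snap_ntsc_fps(fps, tol=0.01):
    """Replace rounded rates such as 29.97 by the exact x * 1000 / 1001."""
    for base in (23, 24, 25, 30, 50):
        exact = base * 1000.0 / 1001.0
        if fps != base and abs(fps - exact) < tol:
            return exact
    return fps


def parse_infos(text):
    """Extract duration, frame size and frame rate from FFmpeg-like text."""
    infos = {}
    duration = re.search(r"Duration: (\d\d:\d\d:\d\d\.\d\d)", text)
    if duration:
        infos['duration'] = to_seconds(duration.group(1))
    size = re.search(r" Video: .* (\d+)x(\d+)[, ]", text)
    if size:
        infos['video_size'] = [int(size.group(1)), int(size.group(2))]
    # Trust tbr first, then fall back to fps
    rate = re.search(r"([\d.]+) tbr", text) or re.search(r"([\d.]+) fps", text)
    if rate:
        infos['video_fps'] = snap_ntsc_fps(float(rate.group(1)))
    return infos


infos = parse_infos(SAMPLE)
print(infos)
```

Snapping 29.97 to 30000/1001 matters when frame counts are derived from the duration, since the rounded value slowly drifts away from the true frame index on long videos.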


--------------------------------------------------------------------------------
/py_ops.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Wrap FFmpeg video decoding with a TensorFlow python op.
 3 | 
 4 | See usage_example.py for details on how to use the python op.
 5 | """
 6 | 
 7 | from __future__ import absolute_import
 8 | 
 9 | import random
10 | import numpy as np
11 | import tensorflow as tf
12 | 
13 | from ffmpeg_reader import FFMPEG_VideoReader
14 | 
15 | 
16 | def _load_video_ffmpeg(filename, n_frames, target_fps, random_chunk):
17 |     """
18 |     Load a video as a numpy array using FFmpeg in [0, 255] RGB format.
19 |     :param filename: path to the video file
20 |     :param n_frames: number of frames to decode
21 |     :param target_fps: framerate at which the video will be decoded
22 |     :param random_chunk: grab frames starting from a random position
23 |     :return: (video, height, width, length) tuple
24 |         video: (n_frames, h, w, 3) numpy array containing video frames, as RGB in range [0, 255]
25 |         height: frame height
26 |         width: frame width
27 |         length: number of non-zero frames loaded from the video (the rest of the sequence is zero-padded)
28 |     """
29 |     # Make sure that the types are correct
30 |     if isinstance(filename, bytes):
31 |         filename = filename.decode('utf-8')
32 |     if isinstance(n_frames, np.int32):
33 |         n_frames = int(n_frames)
34 | 
35 |     # Get video params
36 |     video_reader = FFMPEG_VideoReader(filename, target_fps=target_fps)
37 |     w, h = video_reader.size
38 |     fps = video_reader.fps
39 |     if target_fps <= 0:
40 |         target_fps = fps
41 |     video_length = int(video_reader.nframes * target_fps / fps)  # corrected number of frames
42 |     tensor_shape = [n_frames, h, w, 3]
43 | 
44 |     # Determine starting and ending positions
45 |     if n_frames <= 0:
46 |         n_frames = video_length
47 |         tensor_shape[0] = video_length
48 |     elif video_length < n_frames:
49 |         n_frames = video_length
50 |     elif random_chunk:  # start from a random position
51 |         start_pos = random.randint(0, max(0, video_length - n_frames - 1))  # randint is inclusive; clamp for video_length == n_frames
52 |         video_reader.get_frame(1. * start_pos / target_fps, fps=target_fps)
53 | 
54 |     # Load video chunk as numpy array
55 |     video = np.zeros(tensor_shape, dtype=np.float32)
56 |     for idx in range(n_frames):
57 |         video[idx, :, :, :] = video_reader.read_frame()[:, :, :3].astype(np.float32)
58 | 
59 |     video_reader.close()
60 | 
61 |     return video, np.int64(h), np.int64(w), np.int64(n_frames)  # match the tf.int64 dtypes declared in decode_video
62 | 
63 | 
64 | def decode_video(filename, n_frames=0, fps=-1, random_chunk=False):
65 |     """
66 |     Decode frames from a video. Returns frames in [0, 255] RGB format.
67 |     :param filename: string tensor, e.g. dequeue() op from a filenames queue
68 |     :return:
69 |         video: 4-D tensor containing frames of a video: [time, height, width, channel]
70 |         height: frame height
71 |         width: frame width
72 |         length: number of non-zero frames loaded from the video (the rest of the sequence is zero-padded)
73 |     """
74 |     params = [filename, n_frames, fps, random_chunk]
75 |     dtypes = [tf.float32, tf.int64, tf.int64, tf.int64]
76 |     return tf.py_func(_load_video_ffmpeg, params, dtypes, name='decode_video')
77 | 
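
The way `_load_video_ffmpeg` chooses which frames to read, and how many zero-padded frames remain, can be summarized in a small pure-Python sketch. `select_chunk` is a hypothetical helper that only mirrors the branching on `n_frames`, `video_length` and `random_chunk`; it does not decode anything, and its upper bound is clamped at zero so that a video with exactly `n_frames` frames is valid input.

```python
import random


def select_chunk(video_length, n_frames, random_chunk=False, rng=random):
    """Return (start, length, padded_length) for a clip request.

    Hypothetical helper for illustration:
    - n_frames <= 0 requests the whole video,
    - a video shorter than n_frames is read fully and zero-padded,
    - otherwise n_frames are read, optionally from a random start.
    """
    if n_frames <= 0:
        return 0, video_length, video_length
    if video_length < n_frames:
        return 0, video_length, n_frames
    start = 0
    if random_chunk:
        # randint is inclusive on both ends, so clamp the upper bound at 0
        start = rng.randint(0, max(0, video_length - n_frames - 1))
    return start, n_frames, n_frames


print(select_chunk(100, 0))   # whole video
print(select_chunk(20, 30))   # short video, padded to 30 frames
start, length, padded = select_chunk(100, 30, random_chunk=True)
print(start, length, padded)
```

In the second call only 20 of the 30 requested frames carry data, which corresponds to the `length` value that `decode_video` returns alongside the padded tensor.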


--------------------------------------------------------------------------------
/usage_example.py:
--------------------------------------------------------------------------------
 1 | """
 2 | Usage example for the decode_video python op.
 3 | """
 4 | 
 5 | 
 6 | from __future__ import absolute_import
 7 | from __future__ import print_function
 8 | 
 9 | import time
10 | import argparse
11 | import numpy as np
12 | import tensorflow as tf
13 | 
14 | from py_ops import decode_video
15 | 
16 | 
17 | def _parse_arguments():
18 |     parser = argparse.ArgumentParser(description='Test decode_video python op.')
19 |     parser.add_argument('--input_file', required=True, help='Path to the video file.')
20 |     parser.add_argument('--output_file', default=None,
21 |                         help='(Optional) Path to the .npy file where the decoded frames will be stored.')
22 |     parser.add_argument('--play_video', default=False, action='store_true',
23 |                         help='Play the extracted frames.')
24 |     parser.add_argument('--num_frames', default=30, type=int,
25 |                         help='Number of frames per video (sequence length). Set to 0 for full video.')
26 |     parser.add_argument('--fps', default=-1, type=int,
27 |                         help='Framerate to which the input videos are converted. Use -1 for the original framerate.')
28 |     parser.add_argument('--random_chunks', default=False, action='store_true',
29 |                         help='Grab video frames starting from a random position.')
30 |     return parser.parse_args()
31 | 
32 | 
33 | def _show_video(video, fps=10):
34 |     # Import matplotlib/pylab only if needed
35 |     import matplotlib
36 |     matplotlib.use('TkAgg')
37 |     import matplotlib.pylab as pl
38 |     pl.style.use('ggplot')
39 |     pl.axis('off')
40 | 
41 |     if fps < 0:
42 |         fps = 25
43 |     video /= 255.  # Pylab works in [0, 1] range
44 |     img = None
45 |     pause_length = 1. / fps
46 |     try:
47 |         for f in range(video.shape[0]):
48 |             im = video[f, :, :, :]
49 |             if img is None:
50 |                 img = pl.imshow(im)
51 |             else:
52 |                 img.set_data(im)
53 |             pl.pause(pause_length)
54 |             pl.draw()
55 |     except:
56 |         pass
57 | 
58 | 
59 | if __name__ == '__main__':
60 |     args = _parse_arguments()
61 |     sess = tf.Session()
62 |     f = tf.placeholder(tf.string)
63 |     video, h, w, seq_length = decode_video(f, args.num_frames, args.fps, args.random_chunks)
64 |     start_time = time.time()
65 |     frames, seq_length_val = sess.run([video, seq_length], feed_dict={f: args.input_file})
66 |     total_time = time.time() - start_time
67 |     print('\nSuccessfully loaded video!\n'
68 |           '\tDimensions: %s\n'
69 |           '\tTime: %.3fs\n'
70 |           '\tLoaded frames: %d\n' %
71 |           (str(frames.shape), total_time, seq_length_val))
72 |     if args.output_file:
73 |         np.save(args.output_file, frames)
74 |         print("Stored frames to %s" % args.output_file)
75 |     if args.play_video:
76 |         _show_video(frames, args.fps)
77 | 


--------------------------------------------------------------------------------