├── LICENSE.txt
├── README.md
└── aniconvert.py


/LICENSE.txt:
--------------------------------------------------------------------------------
 1 | The MIT License (MIT)
 2 | 
 3 | Copyright (c) 2015 Andrew Sun (@crossbowffs)
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # AniConvert
 2 | 
 3 | Yet another batch file converter for [HandBrake](https://handbrake.fr/)
 4 | 
 5 | ## Features
 6 | 
 7 | - Convert an entire folder of videos with just one command
 8 | - Recursive video searching: perfect for TV shows with multiple seasons
 9 | - Automatically choose an audio and subtitle track based on your language preferences
10 | - Smart "destination file already exists" handling - no more accidental overwriting
11 | - No annoying dependencies, everything is in one portable script
12 | - Works on Windows, Mac OS X, and Linux
13 | 
14 | ## Requirements
15 | 
16 | - [HandBrake (command-line version)](https://handbrake.fr/downloads2.php)
17 | - [Python](https://www.python.org/downloads/) 2.7 or above (Python 3 is supported)
18 | - A folder full of videos to convert!
19 | 
20 | ## Usage notes
21 | 
22 | If HandBrakeCLI is not in your PATH, you may place it in the same directory as
23 | the script, or specify the path manually.
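
The lookup order described above can be sketched as follows. This is a simplified illustration, not the script's exact code: `candidate_paths` and the example directories are hypothetical names, and the real script additionally checks that each candidate exists and is executable. To specify the path manually, set `HANDBRAKE_EXE` near the top of `aniconvert.py`.

```python
import os


def candidate_paths(name, script_dir, path_env):
    """List where HandBrakeCLI would be looked for, in priority order."""
    # The script's own directory is checked before the PATH entries.
    search_dirs = [script_dir] + path_env.split(os.pathsep)
    return [os.path.join(dir_path, name) for dir_path in search_dirs]


# Hypothetical directories, for illustration only.
example_path = os.pathsep.join(["/usr/local/bin", "/usr/bin"])
candidates = candidate_paths("HandBrakeCLI", "/opt/aniconvert", example_path)
```

The first entry in `candidates` is the copy next to the script, so a binary placed there takes precedence over one found through PATH.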
24 | 
25 | ## Example usage
26 | 
27 | - Convert a folder of videos using default settings: `aniconvert.py path/to/folder`
28 | - Also look in subdirectories: `aniconvert.py -r ...`
29 | - Automatically select Japanese audio and English subtitles: `aniconvert.py -a jpn -s eng ...`
30 | - Skip files that have already been converted: `aniconvert.py -w skip ...`
31 | - Any combination of the above, and more! See the source code for full documentation.
32 | 
33 | ## License
34 | 
35 | Distributed under the [MIT License](http://opensource.org/licenses/MIT).
36 | 
37 | ## FAQ
38 | 
39 | ### How do I pronounce the name?
40 | 
41 | "AnyConvert". The "Ani" is also short for "anime", which is what this script
42 | was designed for. Of course, it also works great with just about any show
43 | series, from Game of Thrones to My Little Pony.
44 | 
45 | ### How does it work? Is FFmpeg/Libav required?
46 | 
47 | All of this script's information comes from parsing the output that
 48 | HandBrake produces. If HandBrake works, this script will too. The script
 49 | itself uses no external libraries, though HandBrake may require some.
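
As a rough sketch of what "parsing the output" means here: the script looks for a `+ audio tracks:` header in the scan output and reads the indented entries below it. The regular expression mirrors the one in `aniconvert.py`; the sample text and the `parse_audio_tracks` helper are assumptions made up for illustration, and real HandBrake output may differ between versions.

```python
import re

# Same shape as the script's audio pattern:
# "<index>, <description> (iso639-2: <code>)"
AUDIO_PATTERN = re.compile(r"(\d+), (.+) \(iso639-2: ([a-z]{3})\)")

# Hypothetical scan excerpt, written to match the layout the script expects.
SAMPLE_SCAN = """\
  + audio tracks:
    + 1, Japanese (AAC) (2.0 ch) (iso639-2: jpn), 48000Hz, 128000bps
    + 2, English (AAC) (2.0 ch) (iso639-2: eng), 48000Hz, 128000bps
  + subtitle tracks:
    + 1, English (iso639-2: eng) (Text)(SSA)
"""

ENTRY_PREFIX = "    + "


def parse_audio_tracks(scan_text):
    """Return (index, description, language_code) for each audio track."""
    lines = scan_text.splitlines()
    tracks = []
    try:
        start = lines.index("  + audio tracks:")
    except ValueError:
        return tracks  # no audio section in this output
    for line in lines[start + 1:]:
        if not line.startswith(ENTRY_PREFIX):
            break  # the next section has started
        match = AUDIO_PATTERN.match(line[len(ENTRY_PREFIX):])
        if match:
            tracks.append((int(match.group(1)), match.group(2), match.group(3)))
    return tracks


audio_tracks = parse_audio_tracks(SAMPLE_SCAN)
```

Because everything is derived from text that HandBrake prints, a change in that text format is the most likely way for parsing to break.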
50 | 
51 | ### Why would I need this?
52 | 
53 | If you are watching your videos on a powerful computer, you probably don't.
54 | However, if you are using an older device, or want to save some disk space,
55 | then converting your videos using HandBrake is a good idea. Your videos will
56 | be smaller (200-300MB for a typical episode of anime at 720p), and you will
57 | be able to utilize H.264 hardware acceleration on devices that support it.
58 | 
59 | ### How is this better than the official HandBrake GUI?
60 | 
61 | The official HandBrake app requires that you apply your audio and subtitle
62 | preferences to each video file individually, which is annoying if you have
63 | a folder of videos that you know are in the same format. This script aims to
64 | solve that problem, while also providing extra automation such as language
65 | priority for your audio and subtitle tracks.
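
Language priority can be pictured with the simplified sketch below. It is an assumption-laden reduction of the script's logic: `pick_by_language` is a hypothetical helper, tracks are plain `(index, language_code)` pairs, and the special handling of `und` (undefined) tracks is left out.

```python
def pick_by_language(tracks, preferred_languages):
    """Return the tracks in the first preferred language that has a match.

    Returns None when the special value "none" is reached (discard the
    track), or an empty list when no preferred language matches.
    """
    for code in preferred_languages:
        code = code.lower()
        if code == "none":
            return None
        matches = [track for track in tracks if track[1] == code]
        if matches:
            return matches
    return []


tracks = [(1, "eng"), (2, "jpn"), (3, "eng")]
japanese_first = pick_by_language(tracks, ["jpn", "eng"])
english_first = pick_by_language(tracks, ["eng", "jpn"])
dropped = pick_by_language(tracks, ["fra", "none"])
```

With `-a jpn,eng` the single Japanese track is chosen automatically. With `-a eng,jpn` two English tracks match, which is the situation in which the script prompts you to choose one.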
66 | 
67 | ### Why are my subtitles burned into the video?
68 | 
69 | Again, this script was written with anime in mind, where subtitles tend to
70 | be highly stylized. HandBrake does not handle these subtitles well, and the
71 | only way to maintain their styling is to burn them into the video. Read
72 | [the HandBrake wiki](https://trac.handbrake.fr/wiki/Subtitles#wikipage)
73 | for more details.
74 | 
75 | ### Why do I get the error `AssertionError: Track count mismatch`?
76 | 
77 | This commonly occurs if your copy of HandBrakeCLI is linked against FFmpeg
78 | instead of Libav, and your video contains ASS format subtitles. If possible,
79 | use a pre-built copy of HandBrakeCLI downloaded from the
 80 | [official site](https://handbrake.fr/downloads2.php). If no pre-built copy is
 81 | available for your system, you will have to compile HandBrakeCLI yourself.
82 | 
 83 | In more detail: FFmpeg uses distinct constants to represent
84 | SSA (`AV_CODEC_ID_SSA`) and ASS (`AV_CODEC_ID_ASS`), while Libav only uses
85 | one constant for both, `AV_CODEC_ID_SSA`. HandBrake in turn only checks for
86 | `AV_CODEC_ID_SSA`. Thus, if your file contains ASS format subtitles, FFmpeg
87 | will return `AV_CODEC_ID_ASS`, which HandBrake will ignore, causing this error.
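
The explanation above can be modeled with a toy example. The codec IDs are plain strings standing in for the C enum constants, and `RECOGNIZED_BY_HANDBRAKE` and `count_mismatch` are hypothetical names. Real HandBrake recognizes other subtitle codecs as well, so this only illustrates the SSA/ASS case.

```python
# Toy stand-in for the check described above: only the SSA constant is accepted.
RECOGNIZED_BY_HANDBRAKE = {"AV_CODEC_ID_SSA"}


def count_mismatch(stream_codec_ids):
    """Return how many subtitle streams are reported but not kept."""
    kept = [c for c in stream_codec_ids if c in RECOGNIZED_BY_HANDBRAKE]
    return len(stream_codec_ids) - len(kept)


# Libav reports ASS subtitles under the SSA constant, so nothing is dropped.
libav_mismatch = count_mismatch(["AV_CODEC_ID_SSA"])
# FFmpeg reports them under the ASS constant, which the check filters out.
ffmpeg_mismatch = count_mismatch(["AV_CODEC_ID_ASS"])
```

A non-zero difference corresponds to the case where the stream list and HandBrake's own track list have different lengths, which is what trips the `Track count mismatch` assertion.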
88 | 


--------------------------------------------------------------------------------
/aniconvert.py:
--------------------------------------------------------------------------------
  1 | #!/usr/bin/env python
  2 | ###############################################################
  3 | # AniConvert: Batch convert directories of videos using
  4 | # HandBrake. Intended to be used on anime and TV series,
  5 | # where files downloaded as a batch tend to have the same
  6 | # track layout. Can also automatically select a single audio
  7 | # and subtitle track based on language preference.
  8 | #
  9 | # Copyright (c) 2015 Andrew Sun (@crossbowffs)
 10 | # Distributed under the MIT license
 11 | ###############################################################
 12 | 
 13 | from __future__ import print_function
 14 | import argparse
 15 | import collections
 16 | import errno
 17 | import logging
 18 | import os
 19 | import re
 20 | import subprocess
 21 | import sys
 22 | 
 23 | ###############################################################
 24 | # Configuration values, no corresponding command-line args
 25 | ###############################################################
 26 | 
 27 | # Name of the HandBrake CLI binary. Set this to the full path
 28 | # of the binary if the script cannot find it automatically.
 29 | HANDBRAKE_EXE = "HandBrakeCLI"
 30 | 
 31 | # The format string for logging messages
 32 | LOGGING_FORMAT = "[%(levelname)s] %(message)s"
 33 | 
 34 | # If no output directory is explicitly specified, the output
 35 | # files will be placed in a directory with this value appended
 36 | # to the name of the input directory.
 37 | DEFAULT_OUTPUT_SUFFIX = "-converted"
 38 | 
 39 | # Define the arguments to pass to HandBrake.
 40 | # Do not define any of the following:
 41 | #   -i <input>
 42 | #   -o <output>
 43 | #   -a <audio track>
 44 | #   -s <subtitle track>
 45 | #   -w <width>
 46 | #   -l <height>
 47 | # Obviously, do not define anything that would cause HandBrake
 48 | # to not convert the video file either.
 49 | HANDBRAKE_ARGS = """
 50 | -E ffaac
 51 | -B 160
 52 | -6 dpl2
 53 | -R Auto
 54 | -e x264
 55 | -q 20.0
 56 | --vfr
 57 | --audio-copy-mask aac,ac3,dtshd,dts,mp3
 58 | --audio-fallback ffaac
 59 | --loose-anamorphic
 60 | --modulus 2
 61 | --x264-preset medium
 62 | --h264-profile high
 63 | --h264-level 3.1
 64 | --subtitle-burned
 65 | """
 66 | 
 67 | ###############################################################
 68 | # Default values and explanations for command-line args
 69 | ###############################################################
 70 | 
 71 | # List of video formats to process. Other file formats in the
 72 | # input directory will be ignored. On the command line, specify
 73 | # as "-i mkv,mp4"
 74 | INPUT_VIDEO_FORMATS = ["mkv", "mp4"]
 75 | 
 76 | # The format to convert the videos to. Only "mp4", "mkv", and
 77 | # "m4v" are accepted, because those are the only formats that
 78 | # HandBrake can write. On the command line, specify as "-j mp4"
 79 | OUTPUT_VIDEO_FORMAT = "mp4"
 80 | 
 81 | # A list of preferred audio languages, ordered from most
 82 | # to least preferable. If there is only one audio track in the
 83 | # most preferable language, it will be automatically selected.
 84 | # If more than one track is in the most preferable language,
 85 | # you will be prompted to select one. If no tracks are
 86 | # in the most preferable language, the program will check
 87 | # the second most preferable language, and so on. This value
 88 | # should use the ISO 639-2 (three-letter) language code format.
 89 | # You may also specify "none" as one of the items in this list.
 90 | # If it is reached, the track will be discarded. For example,
 91 | # "-a eng,none" will use English audio if it is available, or
 92 | # remove the audio track otherwise. On the command line,
 93 | # specify as "-a jpn,eng"
 94 | AUDIO_LANGUAGES = ["jpn", "eng"]
 95 | 
 96 | # This is the same as the preferred audio languages, but
 97 | # for subtitles. On the command line, specify as "-s eng"
 98 | SUBTITLE_LANGUAGES = ["eng"]
 99 | 
100 | # What to do when the destination file already exists. Can be
101 | # one of:
102 | #    "prompt": Ask the user what to do
103 | #    "skip": Skip the file and proceed to the next one
104 | #    "overwrite": Overwrite the destination file
105 | # On the command line, specify as "-w skip"
106 | DUPLICATE_ACTION = "skip"
107 | 
108 | # The width and height of the output video, in the format
109 | # "1280x720". "1080p" and "720p" are common values and
110 | # translate to 1920x1080 and 1280x720, respectively.
111 | # A value of "auto" is also accepted, and will preserve
112 | # the input video dimensions. On the command line, specify
113 | # as "-d 1280x720", "-d 720p", or "-d auto"
114 | OUTPUT_DIMENSIONS = "auto"
115 | 
116 | # The minimum severity for an event to be logged. Levels
117 | # from least severe to most severe are "debug", "info",
118 | # "warning", "error", and "critical". On the command line,
119 | # specify as "-l info"
120 | LOGGING_LEVEL = "info"
121 | 
122 | # By default, if there is only a single track, and it has
123 | # language code "und" (undefined), it will automatically be
124 | # selected. If you do not want this behavior, set this flag
125 | # to true. On the command line, specify as "-u"
126 | MANUAL_UND = False
127 | 
128 | # Set this to true to search sub-directories within the input
129 | # directory. Files will be output in the correspondingly named
130 | # folder in the destination directory. On the command line,
131 | # specify as "-r"
132 | RECURSIVE_SEARCH = False
133 | 
134 | ###############################################################
135 | # End of configuration values, code begins here
136 | ###############################################################
137 | 
138 | try:
139 |     input = raw_input
140 | except NameError:
141 |     pass
142 | 
143 | 
144 | class TrackInfo(object):
145 |     def __init__(self, audio_track, subtitle_track):
146 |         self.audio_track = audio_track
147 |         self.subtitle_track = subtitle_track
148 | 
149 | 
150 | class BatchInfo(object):
151 |     def __init__(self, dir_path, track_map):
152 |         self.dir_path = dir_path
153 |         self.track_map = track_map
154 | 
155 | 
156 | class FFmpegStreamInfo(object):
157 |     def __init__(self, stream_index, codec_type, codec_name, language_code, metadata):
158 |         self.stream_index = stream_index
159 |         self.codec_type = codec_type
160 |         self.codec_name = codec_name
161 |         self.language_code = language_code
162 |         self.metadata = metadata
163 | 
164 | 
165 | class HandBrakeAudioInfo(object):
166 |     pattern1 = re.compile(r"(\d+), (.+) \(iso639-2: ([a-z]{3})\)")
167 |     pattern2 = re.compile(r"(\d+), (.+) \(iso639-2: ([a-z]{3})\), (\d+)Hz, (\d+)bps")
168 | 
169 |     def __init__(self, info_str):
170 |         match = self.pattern1.match(info_str)
171 |         if not match:
172 |             raise ValueError("Unknown audio track info format: " + repr(info_str))
173 |         self.index = int(match.group(1))
174 |         self.description = match.group(2)
175 |         self.language_code = match.group(3)
176 |         match = self.pattern2.match(info_str)
177 |         if match:
178 |             self.sample_rate = int(match.group(4))
179 |             self.bit_rate = int(match.group(5))
180 |         else:
181 |             self.sample_rate = None
182 |             self.bit_rate = None
183 |         self.title = None
184 | 
185 |     def __str__(self):
186 |         format_str = (
187 |             "Description: {description}\n"
188 |             "Language code: {language_code}"
189 |         )
190 |         if self.sample_rate:
191 |             format_str += "\nSample rate: {sample_rate}Hz"
192 |         if self.bit_rate:
193 |             format_str += "\nBit rate: {bit_rate}bps"
194 |         return format_str.format(**self.__dict__)
195 | 
196 |     def __hash__(self):
197 |         return hash((
198 |             self.index,
199 |             self.description,
200 |             self.language_code,
201 |             self.sample_rate,
202 |             self.bit_rate,
203 |             self.title
204 |         ))
205 | 
206 |     def __eq__(self, other):
207 |         if not isinstance(other, HandBrakeAudioInfo):
208 |             return False
209 |         return (
210 |             self.index == other.index and
211 |             self.description == other.description and
212 |             self.language_code == other.language_code and
213 |             self.sample_rate == other.sample_rate and
214 |             self.bit_rate == other.bit_rate and
215 |             self.title == other.title
216 |         )
217 | 
218 | 
219 | class HandBrakeSubtitleInfo(object):
220 |     pattern = re.compile(r"(\d+), (.+) \(iso639-2: ([a-z]{3})\) \((\S+)\)\((\S+)\)")
221 | 
222 |     def __init__(self, info_str):
223 |         match = self.pattern.match(info_str)
224 |         if not match:
225 |             raise ValueError("Unknown subtitle track info format: " + repr(info_str))
226 |         self.index = int(match.group(1))
227 |         self.language = match.group(2)
228 |         self.language_code = match.group(3)
229 |         self.format = match.group(4)
230 |         self.source = match.group(5)
231 |         self.title = None
232 | 
233 |     def __str__(self):
234 |         format_str = (
235 |             "Language: {language}\n"
236 |             "Language code: {language_code}\n"
237 |             "Format: {format}\n"
238 |             "Source: {source}"
239 |         )
240 |         return format_str.format(**self.__dict__)
241 | 
242 |     def __hash__(self):
243 |         return hash((
244 |             self.index,
245 |             self.language,
246 |             self.language_code,
247 |             self.format,
248 |             self.source,
249 |             self.title
250 |         ))
251 | 
252 |     def __eq__(self, other):
253 |         if not isinstance(other, HandBrakeSubtitleInfo):
254 |             return False
255 |         return (
256 |             self.index == other.index and
257 |             self.language == other.language and
258 |             self.language_code == other.language_code and
259 |             self.format == other.format and
260 |             self.source == other.source and
261 |             self.title == other.title
262 |         )
263 | 
264 | 
265 | def print_err(message="", end="\n", flush=False):
266 |     print(message, end=end, file=sys.stderr)
267 |     if flush:
268 |         sys.stderr.flush()
269 | 
270 | 
271 | def indent_text(text, prefix):
272 |     if isinstance(prefix, int):
273 |         prefix = " " * prefix
274 |     lines = text.splitlines()
275 |     return "\n".join(prefix + line for line in lines)
276 | 
277 | 
278 | def on_walk_error(exception):
279 |     logging.error("Cannot read directory: '%s'", exception.filename)
280 | 
281 | 
282 | def get_files_in_dir(path, extensions, recursive):
283 |     extensions = {e.lower() for e in extensions}
284 |     for (dir_path, subdir_names, file_names) in os.walk(path, onerror=on_walk_error):
285 |         filtered_files = []
286 |         for file_name in file_names:
287 |             extension = os.path.splitext(file_name)[1][1:]
288 |             if extension.lower() in extensions:
289 |                 filtered_files.append(file_name)
290 |         if len(filtered_files) > 0:
291 |             filtered_files.sort()
292 |             yield (dir_path, filtered_files)
293 |         if recursive:
294 |             subdir_names.sort()
295 |         else:
296 |             del subdir_names[:]
297 | 
298 | 
299 | def get_output_dir(base_output_dir, base_input_dir, dir_path):
300 |     relative_path = os.path.relpath(dir_path, base_input_dir)
301 |     if relative_path == ".":
302 |         return base_output_dir
303 |     return os.path.join(base_output_dir, relative_path)
304 | 
305 | 
306 | def replace_extension(file_name, new_extension):
307 |     new_file_name = os.path.splitext(file_name)[0] + "." + new_extension
308 |     return new_file_name
309 | 
310 | 
311 | def get_simplified_path(base_dir_path, full_path):
312 |     base_parent_dir_path = os.path.dirname(base_dir_path)
313 |     return os.path.relpath(full_path, base_parent_dir_path)
314 | 
315 | 
316 | def get_output_path(base_output_dir, base_input_dir, input_path, output_format):
317 |     relative_path = os.path.relpath(input_path, base_input_dir)
318 |     temp_path = os.path.join(base_output_dir, relative_path)
319 |     out_path = os.path.splitext(temp_path)[0] + "." + output_format
320 |     return out_path
321 | 
322 | 
323 | def try_create_directory(path):
324 |     try:
325 |         os.makedirs(path, 0o755)
326 |     except OSError as e:
327 |         if e.errno != errno.EEXIST:
328 |             raise
329 | 
330 | 
331 | def try_delete_file(path):
332 |     try:
333 |         os.remove(path)
334 |     except OSError as e:
335 |         if e.errno != errno.ENOENT:
336 |             raise
337 | 
338 | 
339 | def run_handbrake_scan(handbrake_path, input_path):
340 |     output = subprocess.check_output([
341 |         handbrake_path,
342 |         "-i", input_path,
343 |         "--scan"
344 |     ], stderr=subprocess.STDOUT)
345 |     return output.decode("utf-8")
346 | 
347 | 
348 | def parse_handbrake_track_info(output_lines, start_index, info_cls):
349 |     prefix = "    + "
350 |     prefix_len = len(prefix)
351 |     tracks = []
352 |     i = start_index + 1
353 |     while i < len(output_lines) and output_lines[i].startswith(prefix):
354 |         info_str = output_lines[i][prefix_len:]
355 |         info = info_cls(info_str)
356 |         tracks.append(info)
357 |         i += 1
358 |     return (i, tracks)
359 | 
360 | 
361 | def parse_ffmpeg_stream_metadata(output_lines, start_index, metadata_pattern):
362 |     metadata = {}
363 |     i = start_index + 1
364 |     while i < len(output_lines):
365 |         match = metadata_pattern.match(output_lines[i])
366 |         if not match:
367 |             break
368 |         metadata[match.group(1)] = match.group(2)
369 |         i += 1
370 |     return (i, metadata)
371 | 
372 | 
373 | def parse_ffmpeg_stream_info(output_lines, start_index):
374 |     stream_pattern = re.compile(r"\s{4}Stream #0\.(\d+)(\(([a-z]{3})\))?: (\S+): ([^\s,]+)")
375 |     metadata_pattern = re.compile(r"\s{6}(\S+)\s*: (.+)")
376 |     audio_streams = []
377 |     subtitle_streams = []
378 |     i = start_index + 1
379 |     while i < len(output_lines) and output_lines[i].startswith("  "):
380 |         match = stream_pattern.match(output_lines[i])
381 |         if not match:
382 |             i += 1
383 |             continue
384 |         stream_index = match.group(1)
385 |         language_code = match.group(3) or "und"
386 |         codec_type = match.group(4)
387 |         codec_name = match.group(5)
388 |         i += 1
389 |         if codec_type == "Audio":
390 |             current_stream = audio_streams
391 |         elif codec_type == "Subtitle":
392 |             current_stream = subtitle_streams
393 |         else:
394 |             continue
395 |         if i < len(output_lines) and output_lines[i].startswith("    Metadata:"):
396 |             i, metadata = parse_ffmpeg_stream_metadata(output_lines, i, metadata_pattern)
397 |         else:
398 |             metadata = {}
399 |         info = FFmpegStreamInfo(stream_index, codec_type, codec_name, language_code, metadata)
400 |         current_stream.append(info)
401 |     return (i, audio_streams, subtitle_streams)
402 | 
403 | 
404 | def merge_track_titles(hb_tracks, ff_streams):
405 |     if not ff_streams:
406 |         return
407 |     assert len(hb_tracks) == len(ff_streams), "Track count mismatch"
408 |     for hb_track, ff_stream in zip(hb_tracks, ff_streams):
409 |         assert hb_track.language_code == ff_stream.language_code, "Track language code mismatch"
410 |         hb_track.title = ff_stream.metadata.get("title")
411 | 
412 | 
413 | def parse_handbrake_scan_output(output):
414 |     lines = output.splitlines()
415 |     hb_audio_tracks = None
416 |     hb_subtitle_tracks = None
417 |     ff_audio_streams = None
418 |     ff_subtitle_streams = None
419 |     i = 0
420 |     while i < len(lines):
421 |         if lines[i].startswith("Input #0, "):
422 |             logging.debug("Found FFmpeg stream info")
423 |             i, ff_audio_streams, ff_subtitle_streams = parse_ffmpeg_stream_info(lines, i)
424 |             message_format = "FFmpeg: %d audio track(s), %d subtitle track(s)"
425 |             logging.debug(message_format, len(ff_audio_streams), len(ff_subtitle_streams))
426 |             continue
427 |         if lines[i] == "  + audio tracks:":
428 |             logging.debug("Found HandBrake audio track info")
429 |             i, hb_audio_tracks = parse_handbrake_track_info(lines, i, HandBrakeAudioInfo)
430 |             logging.debug("HandBrake: %d audio track(s)", len(hb_audio_tracks))
431 |             continue
432 |         if lines[i] == "  + subtitle tracks:":
433 |             logging.debug("Found HandBrake subtitle track info")
434 |             i, hb_subtitle_tracks = parse_handbrake_track_info(lines, i, HandBrakeSubtitleInfo)
435 |             logging.debug("HandBrake: %d subtitle track(s)", len(hb_subtitle_tracks))
436 |             continue
437 |         i += 1
438 |     merge_track_titles(hb_audio_tracks, ff_audio_streams)
439 |     merge_track_titles(hb_subtitle_tracks, ff_subtitle_streams)
440 |     return (hb_audio_tracks, hb_subtitle_tracks)
441 | 
442 | 
443 | def get_track_info(handbrake_path, input_path):
444 |     scan_output = run_handbrake_scan(handbrake_path, input_path)
445 |     return parse_handbrake_scan_output(scan_output)
446 | 
447 | 
448 | def get_track_by_index(track_list, track_index):
449 |     for track in track_list:
450 |         if track.index == track_index:
451 |             return track
452 |     raise IndexError("Invalid track index: " + str(track_index))
453 | 
454 | 
455 | def filter_tracks_by_language(track_list, preferred_languages, manual_und):
456 |     for preferred_language_code in preferred_languages:
457 |         preferred_language_code = preferred_language_code.lower()
458 |         if preferred_language_code == "none":
459 |             return None
460 |         und_count = 0
461 |         filtered_tracks = []
462 |         for track in track_list:
463 |             if track.language_code == preferred_language_code:
464 |                 filtered_tracks.append(track)
465 |             elif track.language_code == "und":
466 |                 und_count += 1
467 |                 filtered_tracks.append(track)
468 |         if len(filtered_tracks) - und_count >= 1:
469 |             return filtered_tracks
470 |         elif len(track_list) == und_count:
471 |             if und_count == 1 and not manual_und:
472 |                 return track_list
473 |             return []
474 |     return []
475 | 
476 | 
477 | def print_track_list(track_list, file_name, track_type):
478 |     track_type = track_type.capitalize()
479 |     print_err("+ Video: '{0}'".format(file_name))
480 |     for track in track_list:
481 |         message_format = "   + [{1}] {0} track: {2}"
482 |         print_err(message_format.format(track_type, track.index, track.title or ""))
483 |         print_err(indent_text(str(track), "      + "))
484 | 
485 | 
486 | def prompt_select_track(track_list, filtered_track_list, file_name, track_type):
487 |     print_err("Please select {0} track:".format(track_type))
488 |     print_track_list(filtered_track_list, file_name, track_type)
489 |     prompt_format = "Choose a {0} track # (type 'all' to view all choices): "
490 |     alt_prompt_format = "Choose a {0} track # (type 'none' for no track): "
491 |     if len(track_list) == len(filtered_track_list):
492 |         prompt_format = alt_prompt_format
493 |     while True:
494 |         print_err(prompt_format.format(track_type), end="")
495 |         try:
496 |             input_str = input().lower()
497 |         except KeyboardInterrupt:
498 |             print_err(flush=True)
499 |             raise
500 |         if input_str == "all":
501 |             print_track_list(track_list, file_name, track_type)
502 |             prompt_format = alt_prompt_format
503 |             continue
504 |         if input_str == "none":
505 |             return None
506 |         try:
507 |             track_index = int(input_str)
508 |         except ValueError:
509 |             print_err("Enter a valid number!")
510 |             continue
511 |         try:
512 |             return get_track_by_index(track_list, track_index)
513 |         except IndexError:
514 |             print_err("Enter a valid index!")
515 | 
516 | 
517 | def prompt_overwrite_file(file_name):
518 |     print_err("The destination file already exists: '{0}'".format(file_name))
519 |     while True:
520 |         print_err("Do you want to overwrite it? (y/n): ", end="")
521 |         try:
522 |             input_str = input().lower()
523 |         except KeyboardInterrupt:
524 |             print_err(flush=True)
525 |             raise
526 |         if input_str == "y":
527 |             return True
528 |         elif input_str == "n":
529 |             return False
530 |         else:
531 |             print_err("Enter either 'y' or 'n'!")
532 | 
533 | 
534 | def select_best_track(track_list, preferred_languages, manual_und,
535 |         file_name, track_type):
536 |     if len(track_list) == 0:
537 |         logging.info("No %s tracks found", track_type)
538 |         return None
539 |     filtered_tracks = filter_tracks_by_language(track_list,
540 |         preferred_languages, manual_und)
541 |     if filtered_tracks is None:
542 |         logging.info("Matched 'none' language, discarding %s track", track_type)
543 |         return None
544 |     if len(filtered_tracks) == 1:
545 |         track = filtered_tracks[0]
546 |         message_format = "Automatically selected %s track #%d with language '%s'"
547 |         logging.info(message_format, track_type, track.index, track.language_code)
548 |         return track
549 |     else:
550 |         if len(filtered_tracks) == 0:
551 |             filtered_tracks = track_list
552 |             message_format = "Failed to find any %s tracks that match language list: %s"
553 |         else:
554 |             message_format = "More than one %s track matches language list: %s"
555 |         logging.info(message_format, track_type, preferred_languages)
556 |         track = prompt_select_track(track_list, filtered_tracks, file_name, track_type)
557 |         if track:
558 |             message_format = "User selected %s track #%d with language '%s'"
559 |             logging.info(message_format, track_type, track.index, track.language_code)
560 |         else:
561 |             logging.info("User discarded %s track", track_type)
562 |         return track
563 | 
564 | 
565 | def select_best_track_cached(selected_track_map, track_list,
566 |         preferred_languages, manual_und, file_name, track_type):
567 |     track_set = tuple(track_list)
568 |     try:
569 |         track = selected_track_map[track_set]
570 |     except KeyError:
571 |         track = select_best_track(track_list, preferred_languages,
572 |             manual_und, file_name, track_type)
573 |         selected_track_map[track_set] = track
574 |     else:
575 |         track_type = track_type.capitalize()
576 |         message_format = "%s track layout already seen, "
577 |         if track:
578 |             message_format += "selected #%d with language '%s'"
579 |             logging.debug(message_format, track_type, track.index, track.language_code)
580 |         else:
581 |             message_format += "no track selected"
582 |             logging.debug(message_format, track_type)
583 |     return track
584 | 
585 | 
586 | def process_handbrake_output(process):
587 |     pattern1 = re.compile(r"Encoding: task \d+ of \d+, (\d+\.\d\d) %")
588 |     pattern2 = re.compile(
589 |         r"Encoding: task \d+ of \d+, (\d+\.\d\d) % "
590 |         r"\((\d+\.\d\d) fps, avg (\d+\.\d\d) fps, ETA (\d\dh\d\dm\d\ds)\)")
591 |     percent_complete = None
592 |     current_fps = None
593 |     average_fps = None
594 |     estimated_time = None
595 |     prev_message = ""
596 |     format_str = "Progress: {percent:.2f}% done"
597 |     long_format_str = format_str + " (FPS: {fps:.2f}, average FPS: {avg_fps:.2f}, ETA: {eta})"
598 |     try:
599 |         while True:
600 |             output = process.stdout.readline()
601 |             if len(output) == 0:
602 |                 break
603 |             output = output.rstrip()
604 |             match = pattern1.match(output)
605 |             if not match:
606 |                 continue
607 |             percent_complete = float(match.group(1))
608 |             match = pattern2.match(output)
609 |             if match:
610 |                 format_str = long_format_str
611 |                 current_fps = float(match.group(2))
612 |                 average_fps = float(match.group(3))
613 |                 estimated_time = match.group(4)
614 |             message = format_str.format(
615 |                 percent=percent_complete,
616 |                 fps=current_fps,
617 |                 avg_fps=average_fps,
618 |                 eta=estimated_time)
619 |             print_err(message, end="")
620 |             blank_count = max(len(prev_message) - len(message), 0)
621 |             print_err(" " * blank_count, end="\r")
622 |             prev_message = message
623 |     finally:
624 |         print_err(flush=True)
625 | 
626 | 
627 | def run_handbrake(arg_list):
628 |     logging.debug("HandBrake args: '%s'", subprocess.list2cmdline(arg_list))
629 |     process = subprocess.Popen(
630 |         arg_list,
631 |         stdout=subprocess.PIPE,
632 |         stderr=subprocess.STDOUT,
633 |         universal_newlines=True)
634 |     try:
635 |         process_handbrake_output(process)
636 |     except BaseException:  # includes KeyboardInterrupt: stop HandBrake, then re-raise
637 |         process.kill()
638 |         process.wait()
639 |         raise
640 |     retcode = process.wait()
641 |     if retcode != 0:
642 |         raise subprocess.CalledProcessError(retcode, arg_list)
643 | 
644 | 
645 | def get_handbrake_args(handbrake_path, input_path, output_path,
646 |         audio_track, subtitle_track, video_dimensions):
647 |     args = HANDBRAKE_ARGS.replace("\n", " ").strip().split()
648 |     args += ["-i", input_path]
649 |     args += ["-o", output_path]
650 |     if audio_track:
651 |         args += ["-a", str(audio_track.index)]
652 |     else:
653 |         args += ["-a", "none"]
654 |     if subtitle_track:
655 |         args += ["-s", str(subtitle_track.index)]
656 |     if video_dimensions != "auto":
657 |         args += ["-w", str(video_dimensions[0])]
658 |         args += ["-l", str(video_dimensions[1])]
659 |     return [handbrake_path] + args
660 | 
661 | 
662 | def check_handbrake_executable(file_path):
663 |     if not os.path.isfile(file_path):
664 |         return False
665 |     message_format = "Found HandBrakeCLI binary at '%s'"
666 |     if not os.access(file_path, os.X_OK):
667 |         message_format += ", but it is not executable"
668 |         logging.warning(message_format, file_path)
669 |         return False
670 |     logging.info(message_format, file_path)
671 |     return True
672 | 
673 | 
674 | def find_handbrake_executable_in_path(name):
675 |     if os.name == "nt" and not name.lower().endswith(".exe"):
676 |         name += ".exe"
677 |     path_env = os.environ.get("PATH", os.defpath)
678 |     path_env_split = path_env.split(os.pathsep)
679 |     path_env_split.insert(0, os.path.abspath(os.path.dirname(__file__)))
680 |     for dir_path in path_env_split:
681 |         file_path = os.path.join(dir_path, name)
682 |         if check_handbrake_executable(file_path):
683 |             return file_path
684 |     return None
685 | 
686 | 
687 | def find_handbrake_executable():
688 |     name = HANDBRAKE_EXE
689 |     if os.path.dirname(name):
690 |         logging.info("Full path to HandBrakeCLI binary specified, ignoring PATH")
691 |         if check_handbrake_executable(name):
692 |             return name
693 |     else:
694 |         handbrake_path = find_handbrake_executable_in_path(name)
695 |         if handbrake_path:
696 |             return handbrake_path
697 |     logging.error("Could not find executable HandBrakeCLI binary")
698 |     return None
699 | 
700 | 
701 | def check_output_path(args, output_path):
702 |     simp_output_path = get_simplified_path(args.output_dir, output_path)
703 |     if not os.path.exists(output_path):
704 |         return True
705 |     if os.path.isdir(output_path):
706 |         logging.error("Output path '%s' is a directory, skipping file", simp_output_path)
707 |         return False
708 |     if args.duplicate_action == "prompt":
709 |         return prompt_overwrite_file(simp_output_path)
710 |     elif args.duplicate_action == "skip":
711 |         logging.info("Destination file '%s' already exists, skipping", simp_output_path)
712 |         return False
713 |     elif args.duplicate_action == "overwrite":
714 |         logging.info("Destination file '%s' already exists, overwriting", simp_output_path)
715 |         return True
716 | 
717 | 
718 | def filter_convertible_files(args, dir_path, file_names):
719 |     output_dir = get_output_dir(args.output_dir, args.input_dir, dir_path)
720 |     try:
721 |         try_create_directory(output_dir)
722 |     except OSError as e:
723 |         logging.error("Cannot create output directory '%s': %s", output_dir, e)
724 |         return []
725 |     convertible_files = []
726 |     for file_name in file_names:
727 |         output_file_name = replace_extension(file_name, args.output_format)
728 |         output_path = os.path.join(output_dir, output_file_name)
729 |         if not check_output_path(args, output_path):
730 |             continue
731 |         convertible_files.append(file_name)
732 |     return convertible_files
733 | 
734 | 
735 | def get_track_map(args, dir_path, file_names):
736 |     selected_audio_track_map = {}
737 |     selected_subtitle_track_map = {}
738 |     track_map = collections.OrderedDict()
739 |     for file_name in file_names:
740 |         logging.info("Scanning '%s'", file_name)
741 |         file_path = os.path.join(dir_path, file_name)
742 |         try:
743 |             audio_tracks, subtitle_tracks = get_track_info(
744 |                 args.handbrake_path, file_path)
745 |         except subprocess.CalledProcessError as e:
746 |             logging.error("Error occurred while scanning '%s': %s", file_name, e)
747 |             continue
748 |         selected_audio_track = select_best_track_cached(
749 |             selected_audio_track_map, audio_tracks,
750 |             args.audio_languages, args.manual_und,
751 |             file_name, "audio")
752 |         selected_subtitle_track = select_best_track_cached(
753 |             selected_subtitle_track_map, subtitle_tracks,
754 |             args.subtitle_languages, args.manual_und,
755 |             file_name, "subtitle")
756 |         track_map[file_name] = TrackInfo(
757 |             selected_audio_track, selected_subtitle_track)
758 |     return track_map
759 | 
760 | 
761 | def generate_batch(args, dir_path, file_names):
762 |     simp_dir_path = get_simplified_path(args.input_dir, dir_path)
763 |     logging.info("Scanning videos in '%s'", simp_dir_path)
764 |     convertible_files = filter_convertible_files(args, dir_path, file_names)
765 |     track_map = get_track_map(args, dir_path, convertible_files)
766 |     if len(track_map) == 0:
767 |         logging.warning("No videos in '%s' can be converted", simp_dir_path)
768 |         return None
769 |     return BatchInfo(dir_path, track_map)
770 | 
771 | 
772 | def generate_batches(args):
773 |     dir_list = get_files_in_dir(args.input_dir, args.input_formats, args.recursive_search)
774 |     batch_list = []
775 |     found = False
776 |     for dir_path, file_names in dir_list:
777 |         found = True
778 |         batch = generate_batch(args, dir_path, file_names)
779 |         if batch:
780 |             batch_list.append(batch)
781 |     if not found:
782 |         message = "No videos found in input directory"
783 |         if not args.recursive_search:
784 |             message += ", for recursive search specify '-r'"
785 |         logging.info(message)
786 |     return batch_list
787 | 
788 | 
789 | def execute_batch(args, batch):
790 |     output_dir = get_output_dir(args.output_dir, args.input_dir, batch.dir_path)
791 |     try_create_directory(output_dir)
792 |     for file_name, track_info in batch.track_map.items():
793 |         output_file_name = replace_extension(file_name, args.output_format)
794 |         input_path = os.path.join(batch.dir_path, file_name)
795 |         output_path = os.path.join(output_dir, output_file_name)
796 |         simp_input_path = get_simplified_path(args.input_dir, input_path)
797 |         handbrake_args = get_handbrake_args(args.handbrake_path,
798 |             input_path, output_path, track_info.audio_track,
799 |             track_info.subtitle_track, args.output_dimensions)
800 |         logging.info("Converting '%s'", simp_input_path)
801 |         try:
802 |             run_handbrake(handbrake_args)
803 |         except subprocess.CalledProcessError as e:
804 |             logging.error("Error occurred while converting '%s': %s", simp_input_path, e)
805 |             try_delete_file(output_path)
806 |         except BaseException:  # e.g. KeyboardInterrupt: remove the partial output, then re-raise
807 |             logging.info("Conversion aborted, cleaning up temporary files")
808 |             try_delete_file(output_path)
809 |             raise
810 | 
811 | 
812 | def sanitize_and_validate_args(args):
813 |     args.input_dir = os.path.abspath(args.input_dir)
814 |     if not args.output_dir:
815 |         args.output_dir = args.input_dir + DEFAULT_OUTPUT_SUFFIX
816 |     args.output_dir = os.path.abspath(args.output_dir)
817 |     if not os.path.exists(args.input_dir):
818 |         logging.error("Input directory does not exist: '%s'", args.input_dir)
819 |         return False
820 |     if os.path.isfile(args.input_dir):
821 |         logging.error("Input directory is a file: '%s'", args.input_dir)
822 |         return False
823 |     if not os.access(args.input_dir, os.R_OK | os.X_OK):
824 |         logging.error("Cannot read from input directory: '%s'", args.input_dir)
825 |         return False
826 |     if os.path.isfile(args.output_dir):
827 |         logging.error("Output directory is a file: '%s'", args.output_dir)
828 |         return False
829 |     if os.path.isdir(args.output_dir) and not os.access(args.output_dir, os.W_OK | os.X_OK):
830 |         logging.error("Cannot write to output directory: '%s'", args.output_dir)
831 |         return False
832 |     if args.input_dir == args.output_dir:
833 |         logging.error("Input and output directories are the same: '%s'", args.input_dir)
834 |         return False
835 |     if args.handbrake_path:
836 |         args.handbrake_path = os.path.abspath(args.handbrake_path)
837 |         if not os.path.isfile(args.handbrake_path):
838 |             logging.error("HandBrakeCLI binary not found: '%s'", args.handbrake_path)
839 |             return False
840 |         if not os.access(args.handbrake_path, os.X_OK):
841 |             logging.error("HandBrakeCLI binary is not executable: '%s'", args.handbrake_path)
842 |             return False
843 |     else:
844 |         args.handbrake_path = find_handbrake_executable()
845 |         if not args.handbrake_path:
846 |             return False
847 |     return True
848 | 
849 | 
850 | def arg_error(message):
851 |     raise argparse.ArgumentTypeError(message)
852 | 
853 | 
854 | def parse_output_dimensions(value):
855 |     value_lower = value.lower()
856 |     if value_lower == "auto":
857 |         return value_lower
858 |     if value_lower == "1080p":
859 |         return (1920, 1080)
860 |     if value_lower == "720p":
861 |         return (1280, 720)
862 |     match = re.match(r"^(\d+)x(\d+)$", value_lower)
863 |     if not match:
864 |         arg_error("Invalid video dimensions: " + repr(value))
865 |     width = int(match.group(1))
866 |     height = int(match.group(2))
867 |     return (width, height)
868 | 
869 | 
870 | def parse_duplicate_action(value):
871 |     value_lower = value.lower()
872 |     if value_lower not in {"prompt", "skip", "overwrite"}:
873 |         arg_error("Invalid duplicate action: " + repr(value))
874 |     return value_lower
875 | 
876 | 
877 | def parse_language_list(value):
878 |     # Lowercase up front so the returned list is normalized, not just the loop variable
879 |     language_list = [language.lower() for language in value.split(",")]
880 |     for language in language_list:
881 |         if language == "none":
882 |             continue
883 |         elif language == "und":
884 |             arg_error("Do not specify 'und' language, use '-u' flag instead")
885 |         elif not language.isalpha() or len(language) != 3:
886 |             arg_error("Invalid iso639-2 code: " + repr(language))
887 |     return language_list
888 | 
889 | 
890 | def parse_logging_level(value):
891 |     level = getattr(logging, value.upper(), None)
892 |     if not isinstance(level, int):  # reject non-level attributes such as BASIC_FORMAT
893 |         arg_error("Invalid logging level: " + repr(value))
894 |     return level
895 | 
896 | 
897 | def parse_input_formats(value):
898 |     format_list = value.split(",")
899 |     for input_format in format_list:
900 |         if input_format.startswith("."):
901 |             arg_error("Do not specify the leading '.' on input formats")
902 |         if not input_format.isalnum():
903 |             arg_error("Invalid input format: " + repr(input_format))
904 |     return format_list
905 | 
906 | 
907 | def parse_output_format(value):
908 |     if value.startswith("."):
909 |         arg_error("Do not specify the leading '.' on output format")
910 |     if value.lower() not in {"mp4", "mkv", "m4v"}:
911 |         arg_error("Invalid output format (only mp4, mkv, and m4v are supported): " + repr(value))
912 |     return value
913 | 
914 | 
915 | def parse_args():
916 |     parser = argparse.ArgumentParser()
917 |     parser.add_argument("input_dir")
918 |     parser.add_argument("-o", "--output-dir")
919 |     parser.add_argument("-x", "--handbrake-path")
920 |     parser.add_argument("-r", "--recursive-search",
921 |         action="store_true", default=RECURSIVE_SEARCH)
922 |     parser.add_argument("-u", "--manual-und",
923 |         action="store_true", default=MANUAL_UND)
924 |     parser.add_argument("-i", "--input-formats",
925 |         type=parse_input_formats, default=INPUT_VIDEO_FORMATS)
926 |     parser.add_argument("-j", "--output-format",
927 |         type=parse_output_format, default=OUTPUT_VIDEO_FORMAT)
928 |     parser.add_argument("-l", "--logging-level",
929 |         type=parse_logging_level, default=LOGGING_LEVEL)
930 |     parser.add_argument("-w", "--duplicate-action",
931 |         type=parse_duplicate_action, default=DUPLICATE_ACTION)
932 |     parser.add_argument("-d", "--output-dimensions",
933 |         type=parse_output_dimensions, default=OUTPUT_DIMENSIONS)
934 |     parser.add_argument("-a", "--audio-languages",
935 |         type=parse_language_list, default=AUDIO_LANGUAGES)
936 |     parser.add_argument("-s", "--subtitle-languages",
937 |         type=parse_language_list, default=SUBTITLE_LANGUAGES)
938 |     return parser.parse_args()
939 | 
940 | 
941 | def main():
942 |     args = parse_args()
943 |     logging.basicConfig(format=LOGGING_FORMAT, level=args.logging_level, stream=sys.stdout)
944 |     if not sanitize_and_validate_args(args):
945 |         return
946 |     batches = generate_batches(args)
947 |     for batch in batches:
948 |         execute_batch(args, batch)
949 |     logging.info("Done!")
950 | 
951 | 
952 | if __name__ == "__main__":
953 |     try:
954 |         main()
955 |     except KeyboardInterrupt:
956 |         pass
957 | 


--------------------------------------------------------------------------------