├── .gitattributes ├── .gitignore ├── LICENSE ├── README.md ├── Vagrantfile ├── supervisor-config.conf └── transcoder.py /.gitattributes: -------------------------------------------------------------------------------- 1 | *.py text eol=lf 2 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .vagrant 2 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | The MIT License (MIT) 2 | 3 | Copyright (c) 2015 Andy McCurdy 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 
22 | 23 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # tested-transcoder 2 | ## --- THIS IS A BETA RELEASE --- 3 | Please post feedback on GitHub or at http://www.tested.com/forums/general-discussion/495076-transcoder-feedback/ 4 | 5 | Thanks for helping us test this out! We'll have more refined instructions once we're sure the transcoder is ready for release. 6 | 7 | This is a Vagrant script that creates a VirtualBox virtual machine that serves as a black box for transcoding and repackaging Blu-rays and DVDs ripped using MakeMKV into iTunes-quality video files suitable for streaming using Plex or XBMC. It uses Don Melton's video transcoding scripts (https://github.com/donmelton/video-transcoding-scripts) to transcode individual files, but handles a lot of the tedious stuff involved in movie transcoding for you, including adding all audio tracks, selecting the proper subtitle track for non-English dialogue in English language films (think Greedo's conversation with Han in Star Wars), handling the movie crop, etc. 8 | 9 | To rip discs, first use MakeMKV to rip only the movie, audio tracks, and subtitles you want. The title with the most chapters and the largest size is typically the one you want. I typically tell MakeMKV to grab all the English language subtitles and audio tracks, which is generally a good strategy. You can set this as the default in View > Preferences > Language > Preferred Language. The process may take a long time, depending on your computer and the resources you give the black box. 10 | 11 | ## Prerequisites 12 | 13 | * VirtualBox - https://www.virtualbox.org/wiki/Downloads 14 | * Vagrant - http://www.vagrantup.com/downloads 15 | * Git - http://git-scm.com/downloads 16 | * MakeMKV - http://www.makemkv.com/download/ 17 | 18 | ## Installation Instructions 19 | 20 | 1. Install the prerequisites. 21 | 2. 
Verify that CPU Virtualization is turned on in your BIOS. (See below for a simple test.) 22 | 3. Navigate to your Documents folder in the terminal/command line and type `git clone https://github.com/andymccurdy/tested-transcoder/` 23 | 4. Switch to the 'tested-transcoder' folder and run `vagrant up`. 24 | 5. Create a folder on the host machine where you will copy source videos to and collect transcoded videos from. 25 | 6. Use the VirtualBox UI on the host machine to share this new folder with the VM. 26 | 1. Click the VM named "Tested Transcoder." 27 | 2. (Optional) Stop the machine by using `vagrant halt` and adjust CPU / memory settings to suit. Once you have saved the changes, start the machine again with `vagrant up`. 28 | 3. Click Shared folders. 29 | 4. Click the add button. 30 | 5. Find the folder you created in the "Folder Path" field. 31 | 6. The "Folder Name" *must* be "transcoder" or the script will not work. 32 | 7. Check the "Make Permanent" checkbox (may not be visible). Verify all other checkboxes are unchecked. 33 | 8. Click OK twice to return to the VM selection screen. 34 | 7. Wait approximately a minute for 'input', 'output', 'work', and 'completed-originals' folders to be created in the folder on your host machine. 35 | 36 | ## Usage 37 | 38 | 1. While the VM is running, starting your encodes is as easy as dragging a video from MakeMKV into the 'input' folder. 39 | 2. While the encode is in progress, you can check in on its progress by looking at the end of the log in the 'work' folder. 40 | 3. When the encode is complete, the new, better-compressed video will be in the 'output' folder and the original source MKV will be in the 'completed-originals' folder. After you've confirmed subtitles and audio tracks are correct, you can safely delete the large original file. 41 | 4. Enjoy your new, much smaller MKV in your favorite media player.
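If you would rather not open the log by hand (step 2 above), a small host-side helper can print its last few lines. This is only a sketch, not part of the transcoder: the function name `newest_log_tail` and the example folder path are made up for illustration, and it assumes the log in the 'work' folder ends in `.log`, which is how `transcoder.py` names it (`<output>.mkv.log`).

```python
import glob
import os


def newest_log_tail(work_dir, lines=5):
    """Return the last `lines` lines of the most recently modified *.log
    file in work_dir, or an empty list if no log exists yet."""
    logs = glob.glob(os.path.join(work_dir, '*.log'))
    if not logs:
        return []
    # the encode in progress is the log that was written to most recently
    newest = max(logs, key=os.path.getmtime)
    with open(newest, 'r', errors='replace') as handle:
        return handle.read().splitlines()[-lines:]
```

For example, `print('\n'.join(newest_log_tail('/path/to/your/shared/folder/work')))` run on the host shows the tail of the current encode's log; replace the path with the folder you shared with the VM.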
42 | 43 | --- 44 | #### Verify CPU Virtualization is on 45 | There may be better ways to do this, but this seems to be a reasonable way. 46 | 47 | 1. Open VirtualBox Manager. 48 | 2. Select New. 49 | 3. Name: test 50 | 4. Next. 51 | 5. Next. 52 | 6. Do not add a virtual hard drive. 53 | 7. Create. 54 | 8. Click on the test VM and look under System for "Acceleration: VT-x/AMD-V, Nested Paging." 55 | 9. If you see this message, you should be good; otherwise you will need to search for how to turn it on for your specific motherboard. 56 | 10. Once you are finished, delete the test VM (Right Click > Remove). 57 | -------------------------------------------------------------------------------- /Vagrantfile: -------------------------------------------------------------------------------- 1 | # -*- mode: ruby -*- 2 | # vi: set ft=ruby : 3 | 4 | # All Vagrant configuration is done below. The "2" in Vagrant.configure 5 | # configures the configuration version (we support older styles for 6 | # backwards compatibility). Please don't change it unless you know what 7 | # you're doing. 8 | Vagrant.configure(2) do |config| 9 | # ubuntu 14.04 64bit image 10 | config.vm.box = "ubuntu/trusty64" 11 | 12 | # Provider-specific configuration so you can fine-tune various 13 | # backing providers for Vagrant. These expose provider-specific options. 
14 | # Example for VirtualBox: 15 | # 16 | config.vm.provider "virtualbox" do |vb| 17 | vb.name = "Tested Transcoder" 18 | vb.memory = ENV["VM_MEMORY"] || 4096 19 | vb.cpus = ENV["VM_CPUS"] || 4 20 | end 21 | 22 | # bootstrap the ubuntu machine 23 | config.vm.provision "shell", inline: <<-SHELL 24 | add-apt-repository -y ppa:stebbins/handbrake-releases 25 | add-apt-repository -y ppa:mc3man/trusty-media 26 | apt-get update 27 | apt-get install -y make git mkvtoolnix handbrake-cli mplayer ffmpeg mp4v2-utils linux-headers-generic build-essential dkms virtualbox-guest-utils virtualbox-guest-dkms supervisor 28 | 29 | git clone https://github.com/donmelton/video-transcoding-scripts 30 | mv video-transcoding-scripts/*.sh /usr/local/bin/ 31 | rm -rf video-transcoding-scripts 32 | 33 | # transcoder root. this is where the transcoder directory will be mounted 34 | mkdir -p /media/transcoder 35 | 36 | # install the transcoder's supervisor config file and reload supervisor 37 | cp /vagrant/supervisor-config.conf /etc/supervisor/conf.d/transcoder.conf 38 | cp /vagrant/transcoder.py /usr/local/bin 39 | chmod +x /usr/local/bin/transcoder.py 40 | supervisorctl reload 41 | SHELL 42 | 43 | # copy the transcoder.py script in to place. always run this provisioner to 44 | # get the most recent copy of the script. 
45 | config.vm.provision "shell", run: "always", inline: <<-SHELL 46 | cp /vagrant/transcoder.py /usr/local/bin 47 | chmod +x /usr/local/bin/transcoder.py 48 | supervisorctl reload 49 | SHELL 50 | 51 | 52 | if ENV["TRANSCODER_ROOT"] 53 | config.vm.synced_folder ENV["TRANSCODER_ROOT"], "/media/transcoder" 54 | end 55 | end 56 | -------------------------------------------------------------------------------- /supervisor-config.conf: -------------------------------------------------------------------------------- 1 | [program:transcoder] 2 | command=/usr/local/bin/transcoder.py 3 | user=vagrant 4 | autorestart=true 5 | -------------------------------------------------------------------------------- /transcoder.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | 3 | import logging 4 | import os 5 | import re 6 | import shlex 7 | import shutil 8 | import signal 9 | import subprocess 10 | import sys 11 | import time 12 | 13 | 14 | def non_zero_min(values): 15 | "Return the min value but always prefer non-zero values if they exist" 16 | if len(values) == 0: 17 | raise TypeError('non_zero_min expected 1 arguments, got 0') 18 | non_zero_values = [i for i in values if i != 0] 19 | if non_zero_values: 20 | return min(non_zero_values) 21 | return 0 22 | 23 | 24 | class Transcoder(object): 25 | 26 | # name of the share defined in virtualbox that will contain input/output video 27 | VBOX_SHARE_NAME = 'transcoder' 28 | # path to mount the virtual box share 29 | TRANSCODER_ROOT = "/media/transcoder" 30 | # directory containing new video to transcode 31 | INPUT_DIRECTORY = TRANSCODER_ROOT + '/input' 32 | # directory where handbrake will save the output to. 
this is a temporary 33 | # location and the file is moved to OUTPUT_DIRECTORY after completion 34 | WORK_DIRECTORY = TRANSCODER_ROOT + '/work' 35 | # directory containing the original inputs after they've been transcoded 36 | COMPLETED_DIRECTORY = TRANSCODER_ROOT + '/completed-originals' 37 | # directory containing the compressed outputs 38 | OUTPUT_DIRECTORY = TRANSCODER_ROOT + '/output' 39 | # standard options for the transcode-video script 40 | TRANSCODE_OPTIONS = '--mkv --slow --allow-dts --allow-ac3 --find-forced add --copy-all-ac3' 41 | # number of seconds a file must remain unmodified in the INPUT_DIRECTORY 42 | # before it is considered done copying. increase this value for more 43 | # tolerance on bad network connections. 44 | WRITE_THRESHOLD = 30 45 | # path to logfile 46 | LOGFILE = TRANSCODER_ROOT + '/transcoder.log' 47 | 48 | def __init__(self): 49 | self.running = False 50 | self.logger = None 51 | self.current_command = None 52 | self._default_handlers = {} 53 | 54 | def setup_signal_handlers(self): 55 | "Set up graceful shutdown and cleanup when sent a signal" 56 | def handler(signum, frame): 57 | self.stop() 58 | 59 | for sig in (signal.SIGTERM, signal.SIGHUP, signal.SIGINT): 60 | self._default_handlers[sig] = signal.signal(sig, handler) 61 | 62 | def restore_signal_handlers(self): 63 | "Restore the default handlers" 64 | for sig, handler in self._default_handlers.items(): 65 | signal.signal(sig, handler) 66 | self._default_handlers = {} 67 | 68 | def execute(self, command): 69 | # TODO: use Popen and assign to current_command so we can terminate 70 | args = shlex.split(command) 71 | out = subprocess.check_output(args=args, stderr=subprocess.STDOUT) 72 | return out 73 | 74 | def mount_share(self): 75 | """ 76 | Mount the VBox share if it's not already mounted. 77 | Returns True if mounted, otherwise False. 
78 | """ 79 | out = self.execute('mount') 80 | if '%s type vboxsf' % self.TRANSCODER_ROOT in out: 81 | return True 82 | # attempt to mount 83 | uid, gid = os.getuid(), os.getgid() 84 | command = 'sudo mount -t vboxsf -o uid=%s,gid=%s %s %s' % ( 85 | uid, gid, self.VBOX_SHARE_NAME, self.TRANSCODER_ROOT) 86 | try: 87 | self.execute(command) 88 | except subprocess.CalledProcessError as ex: 89 | msg = 'Unable to mount Virtual Box Share: %s' % ex.output 90 | sys.stdout.write(msg) 91 | sys.stdout.flush() 92 | return False 93 | return True 94 | 95 | def setup_logging(self): 96 | self.logger = logging.getLogger('transcoder') 97 | self.logger.setLevel(logging.DEBUG) 98 | handler = logging.FileHandler(self.LOGFILE) 99 | handler.setLevel(logging.DEBUG) 100 | formatter = logging.Formatter('%(asctime)s - %(message)s') 101 | handler.setFormatter(formatter) 102 | self.logger.addHandler(handler) 103 | self.logger.info('Transcoder started and scanning for input') 104 | 105 | def check_filesystem(self): 106 | "Checks that the filesystem and logger is setup properly" 107 | dirs = (self.INPUT_DIRECTORY, self.WORK_DIRECTORY, 108 | self.OUTPUT_DIRECTORY, self.COMPLETED_DIRECTORY) 109 | if not all(map(os.path.exists, dirs)): 110 | if not self.mount_share(): 111 | return False 112 | for path in dirs: 113 | if not os.path.exists(path): 114 | try: 115 | os.mkdir(path) 116 | except OSError as ex: 117 | msg = 'Cannot create directory "%s": %s' % ( 118 | path, ex.strerror) 119 | sys.stdout.write(msg) 120 | sys.stdout.flush() 121 | return False 122 | 123 | if not self.logger: 124 | self.setup_logging() 125 | return True 126 | 127 | def stop(self): 128 | # guard against multiple signals being sent before the first one 129 | # finishes 130 | if not self.running: 131 | return 132 | self.running = False 133 | self.logger.info('Transcoder shutting down') 134 | if self.current_command: 135 | self.current_command.terminate() 136 | # logging 137 | logging.shutdown() 138 | self.logger = None 139 | # 
signal handlers 140 | self.restore_signal_handlers() 141 | 142 | def run(self): 143 | self.running = True 144 | self.setup_signal_handlers() 145 | 146 | while self.running: 147 | if self.check_filesystem(): 148 | self.check_for_input() 149 | time.sleep(5) 150 | 151 | def check_for_input(self): 152 | "Look in INPUT_DIRECTORY for an input file and process it" 153 | for filename in os.listdir(self.INPUT_DIRECTORY): 154 | if filename.startswith('.'): 155 | continue 156 | path = os.path.join(self.INPUT_DIRECTORY, filename) 157 | if (time.time() - os.stat(path).st_mtime) > self.WRITE_THRESHOLD: 158 | # when copying a file from windows to the VM, the filesize and 159 | # last modified times don't change as data is written. 160 | # fortunately these files seem to be locked such that 161 | # attempting to open the file for reading raises an IOError. 162 | # it seems reasonable to skip any file we can't open 163 | try: 164 | f = open(path, 'r') 165 | f.close() 166 | except IOError: 167 | continue 168 | 169 | self.process_input(path) 170 | # move the source to the COMPLETED_DIRECTORY 171 | dst = os.path.join(self.COMPLETED_DIRECTORY, 172 | os.path.basename(path)) 173 | shutil.move(path, dst) 174 | break 175 | 176 | def process_input(self, path): 177 | name = os.path.basename(path) 178 | self.logger.info('Found new input "%s"', name) 179 | 180 | # if any of the following functions return no output, something 181 | # bad happened and we can't continue 182 | 183 | # parse the input meta info. 
184 | meta = self.scan_media(path) 185 | if not meta: 186 | return 187 | 188 | # determine crop dimensions 189 | crop = self.detect_crop(path) 190 | if not crop: 191 | return 192 | 193 | # transcode the video 194 | work_path = self.transcode(path, crop, meta) 195 | if not work_path: 196 | return 197 | 198 | # move the completed output to the output directory 199 | self.logger.info('Moving completed work output %s to output directory', 200 | os.path.basename(work_path)) 201 | output_path = os.path.join(self.OUTPUT_DIRECTORY, 202 | os.path.basename(work_path)) 203 | shutil.move(work_path, output_path) 204 | shutil.move(work_path + '.log', output_path + '.log') 205 | 206 | def scan_media(self, path): 207 | "Use handbrake to scan the media for metadata" 208 | name = os.path.basename(path) 209 | self.logger.info('Scanning "%s" for metadata', name) 210 | command = 'HandBrakeCLI --scan --input "%s"' % path 211 | try: 212 | out = self.execute(command) 213 | except subprocess.CalledProcessError as ex: 214 | if 'unrecognized file type' in ex.output: 215 | self.logger.info('Unknown media type for input "%s"', name) 216 | else: 217 | self.logger.info('Unknown error for input "%s" with error: %s', 218 | name, ex.output) 219 | return None 220 | 221 | # process out 222 | return out 223 | 224 | def detect_crop(self, path): 225 | crop_re = r'[0-9]+:[0-9]+:[0-9]+:[0-9]+' 226 | name = os.path.basename(path) 227 | self.logger.info('Detecting crop for input "%s"', name) 228 | command = 'detect-crop.sh --values-only "%s"' % path 229 | try: 230 | out = self.execute(command) 231 | except subprocess.CalledProcessError as ex: 232 | # when detect-crop detects discrepancies between handbrake and 233 | # mplayer, each crop is written out but detect-crop also returns 234 | # an error code. if this is the case, we don't want to error out. 
235 | if re.findall(crop_re, ex.output): 236 | out = ex.output 237 | else: 238 | self.logger.info('detect-crop failed for input "%s", ' 239 | 'proceeding with no crop. error: %s', 240 | name, ex.output) 241 | return '0:0:0:0' 242 | 243 | crops = re.findall(crop_re, out) 244 | if not crops: 245 | self.logger.info('No crop found for input "%s", ' 246 | 'proceeding with no crop', name) 247 | 248 | return '0:0:0:0' 249 | else: 250 | # use the smallest crop for each edge. prefer non-zero values if 251 | # they exist 252 | dimensions = zip(*[map(int, c.split(':')) for c in crops]) 253 | crop = ':'.join(map(str, [non_zero_min(piece) for piece in dimensions])) 254 | self.logger.info('Using crop "%s" for input "%s"', crop, name) 255 | return crop 256 | 257 | def transcode(self, path, crop, meta): 258 | name = os.path.basename(path) 259 | output_name = os.path.splitext(name)[0] + '.mkv' 260 | output = os.path.join(self.WORK_DIRECTORY, output_name) 261 | # if these paths exist in the work directory, remove them first 262 | for workpath in (output, output + '.log'): 263 | if os.path.exists(workpath): 264 | self.logger.info('Removing old work output: "%s"', workpath) 265 | os.unlink(workpath) 266 | 267 | command_parts = [ 268 | 'transcode-video.sh', 269 | '--crop %s' % crop, 270 | self.parse_audio_tracks(meta), 271 | self.TRANSCODE_OPTIONS, 272 | '--output "%s"' % output, 273 | '"%s"' % path 274 | ] 275 | command = ' '.join(command_parts) 276 | self.logger.info('Transcoding input "%s" with command: %s', 277 | path, command) 278 | try: 279 | self.execute(command) 280 | except subprocess.CalledProcessError as ex: 281 | self.logger.info('Transcoding failed for input "%s": %s', 282 | name, ex.output) 283 | return None 284 | self.logger.info('Transcoding completed for input "%s"', name) 285 | return output 286 | 287 | def parse_audio_tracks(self, meta): 288 | "Parse the meta info for audio tracks beyond the first one" 289 | 290 | # find all the audio streams and their optional 
language and title data 291 | streams = [] 292 | stream_re = r'(\s{4}Stream #[0-9]+\.[0-9]+(?:\((?P<lang>[a-z]+)\))?: Audio:.*?\n)(?=(?:\s{4}Stream)|(?:[^\s]))' 293 | title_re = r'^\s{6}title\s+:\s(?P<title>[^\n]+)' 294 | for stream, lang in re.findall(stream_re, meta, re.DOTALL | re.MULTILINE): 295 | lang = lang or '' 296 | title = '' 297 | title_match = re.search(title_re, stream, re.MULTILINE) 298 | if title_match: 299 | title = title_match.group(1) 300 | streams.append({'title': title, 'lang': lang}) 301 | 302 | # find the audio track numbers 303 | tracks = [] 304 | pos = meta.find('+ audio tracks:') 305 | track_re = r'^\s+\+\s(?P<track>[0-9]+),\s(?P<title>[^\(\n]*)' 306 | for line in meta[pos:].split('\n')[1:]: 307 | if line.startswith(' + subtitle tracks:'): 308 | break 309 | match = re.match(track_re, line) 310 | if match: 311 | tracks.append({'number': match.group(1), 'title': match.group(2)}) 312 | 313 | # assuming there's an equal number of tracks and streams, we can 314 | # match up stream titles to tracks and have a nicer output 315 | use_stream_titles = len(streams) == len(tracks) 316 | additional_tracks = [] 317 | 318 | for i, track in enumerate(tracks[1:]): 319 | title = '' 320 | if use_stream_titles: 321 | title = streams[i+1]['title'] 322 | title = title or track['title'] 323 | # remove any quotes in the title so we don't mess up the command 324 | title = title.replace('"', '') 325 | self.logger.info('Adding audio track #%s with title: %s', 326 | track['number'], title) 327 | additional_tracks.append('--add-audio %s,"%s"' % ( 328 | track['number'], title.replace('"', ''))) 329 | 330 | return ' '.join(additional_tracks) 331 | 332 | 333 | if __name__ == '__main__': 334 | transcoder = Transcoder() 335 | transcoder.run() 336 | --------------------------------------------------------------------------------
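The crop-merging rule in `detect_crop` ("use the smallest crop for each edge, prefer non-zero values") is easier to follow in isolation. The sketch below restates it as a standalone function. The name `merge_crops` and the sample input are hypothetical, and it is written for Python 3, whereas `transcoder.py` targets the Python 2 interpreter on the VM. It mirrors the logic in the file but is not a drop-in replacement.

```python
import re

# same pattern detect_crop uses to pull crop values out of the tool output
CROP_RE = r'[0-9]+:[0-9]+:[0-9]+:[0-9]+'


def non_zero_min(values):
    "Return the smallest non-zero value, or 0 if every value is zero"
    non_zero = [v for v in values if v != 0]
    return min(non_zero) if non_zero else 0


def merge_crops(text):
    "Combine every crop found in text into one conservative crop string"
    crops = re.findall(CROP_RE, text)
    if not crops:
        # nothing detected: fall back to no crop, as detect_crop does
        return '0:0:0:0'
    # transpose so that each tuple holds one edge across all crops
    edges = zip(*[map(int, c.split(':')) for c in crops])
    return ':'.join(str(non_zero_min(edge)) for edge in edges)


# two tools disagreeing about the crop (made-up values)
print(merge_crops('140:140:0:0\n132:132:0:2'))  # prints 132:132:0:2
```

Taking the smaller non-zero value per edge removes less picture when the two detectors disagree, so a wrong guess costs some letterboxing instead of cutting into the image.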