├── tmp
│   └── temp.txt
├── output
│   └── temp.txt
├── README.md
└── main.py

--------------------------------------------------------------------------------
/tmp/temp.txt:
--------------------------------------------------------------------------------
1 | temp

--------------------------------------------------------------------------------
/output/temp.txt:
--------------------------------------------------------------------------------
1 | temp

--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # autoPodcastEditor
2 | A program made for video podcasts to expedite the editing process by automatically switching video clips based on who is talking
3 | 
4 | ### See it in action!
5 | [![Demo Video](https://i.imgur.com/ClN1S6f.png)](https://www.youtube.com/watch?v=kMJ4Bx4BBBo&feature=youtu.be)
6 | 
7 | Made a short video explaining what autoPodcastEditor does with examples (9:43).
8 | Timestamps
9 | - 0:00 - What it does and why
10 | - 2:30 - Introduction to example clips
11 | - 3:40 - First input clip display
12 | - 4:27 - Second input clip display
13 | - 5:35 - Quick explanation of global variables (sample rate, tolerance, exceed req., no audio overlap)
14 | - 6:57 - First example of program combining clips (audio overlap)
15 | - 8:19 - Second example of program combining clips (no audio overlap)
16 | 
17 | ### What it does and why
18 | - Many modern video podcasts have cameras pointed at each participant, which makes the editing process long and tedious. The process of selecting whose camera to show when they're talking is a chore that could be done autonomously. This program aims to solve that.
19 | - autoPodcastEditor exports a final video clip that switches between the cameras of each podcast participant depending on who's talking to make the editing process a breeze.
20 | - Use cases besides podcasts include things like D&D campaigns, group video game sessions, or even security camera footage to highlight significant events (assuming the footage has sound).
21 | 
22 | ### Dependencies
23 | - **ffmpeg** - converts video clips to audio waveform arrays (great installation instructions can be found [here](https://www.wikihow.com/Install-FFmpeg-on-Windows))
24 | - **subprocess** - calls ffmpeg commands via the command line
25 | - **moviepy** - concatenates and exports video clips
26 | - **math** - processes split point calculations
27 | 
28 | ### How it works
29 | - ffmpeg converts the input files to audio clips (.wav files), which are then read into integer arrays representing the audio waveform levels throughout the clips
30 | - parseAudioData() cleans the audio arrays and shrinks them to an appropriate size based on the SAMPLE_RATE global
31 | - compareAudioArrays() 'normalizes' all of the arrays (i.e. makes them all the same length by appending zeroes, representing no sound, to shorter arrays), compares each entry of the cleaned audio arrays (using returnHighestIndex() to do so), and outputs an array of numbers representing which clip should be displayed based on its audio level at a given time (numbers are zero-based, from 0 to number_input_clips - 1)
32 | - returnHighestIndex() compares the audio level at a given time between all clips, but gives the current clip priority - the EXCEEDS_BY global indicates the factor by which another clip must exceed the current clip's sound level to take priority; it returns the index of the clip that should be given priority
33 | - After compareAudioArrays() generates the outputArray (a priority timeline of which clip should be shown when), moviepy grabs the snippets of the video clips and concatenates them appropriately
34 | 
35 | ### Notes
36 | - All video clips must be synced at the start (i.e. synced at time = 0s); differing clip lengths are fine, though (longer clips will override)
37 | - Does not currently support separate video + audio clips (planning on adding support soon)
38 | - Plan to add an option for overlapping audio so you can hear audio from all clips simultaneously (mainly for podcasts where multiple people may be talking at once, since the program's main purpose is video switching)
39 | - No testing has been done for quality retention and file size - unsure whether quality is dropped by the script or if there is a maximum file size limit
40 | 
41 | ### To do
42 | - Clips and audio must already be set to the right start point
43 | - Quality testing
44 | - Bound SAMPLE_RATE to be positive and at most the native sample rate
45 | 
46 | 
47 | ### Compatibility
48 | - Confirmed working on Windows 10; not yet confirmed on other operating systems
49 | 
--------------------------------------------------------------------------------
/main.py:
--------------------------------------------------------------------------------
1 | from moviepy.editor import *
2 | from tkinter import *
3 | from tkinter.filedialog import askopenfilename
4 | from scipy.io import wavfile
5 | import subprocess
6 | import math
7 | 
8 | # GLOBALS
9 | INPUT_FILES = [] # List of input files (in local directory)
10 | TEMP_FOLDER = 'tmp/' # Temp folder name
11 | OUTPUT_FOLDER = 'output/' # Output folder name
12 | OUTPUT_FILE_NAME = "output" # Output file name
13 | SAMPLE_RATE = 24 # Number of samples to take per second to check volume level
14 | THRESHOLD = 5 # Required # of consecutive highest indices needed to take priority
15 | EXCEEDS_BY = 4 # Factor by which other clip(s) must exceed the current clip's volume to overtake it
16 | NO_OVERLAP_AUDIO = True # Restricts audio overlapping (False = overlap audio)
17 | 
18 | 
19 | # INTERNAL GLOBALS (DO NOT TOUCH)
20 | checkpoints = [] # Global storing tuples of array length + associated index, sorted from min-max by array length
21 | checkpoint_counter = 0 # Determines which checkpoint we are currently at
22 | ul_x = 10
23 | ul_y = 10
24 | 
25 | # GUI CREATION
26 | class Window(Frame):
27 |     def __init__(self, master=None):
28 |         Frame.__init__(self, master)
29 |         self.master = master
30 |         self.init_window()
31 | 
32 |     def init_window(self):
33 |         self.master.title("AutoPodcastEditor")
34 |         self.pack(fill=BOTH, expand=1)
35 |         sync_notice = Label(self, text="Please ensure all input clips are in sync at start and don't go out of sync!")
36 |         sync_notice.place(x=ul_x, y=ul_y)
37 |         browseFileDir = Button(self, text="Add File", command=self.addFile)
38 |         browseFileDir.place(x=ul_x, y=ul_y+25)
39 | 
40 |         sampleRateLabel = Label(self, text="Sample Rate")
41 |         sampleRateLabel.place(x=ul_x, y=ul_y + 320)
42 |         self.sampleRateEntry = Entry(self, width=3)
43 |         self.sampleRateEntry.place(x=ul_x + 73, y=ul_y + 321)
44 |         self.sampleRateEntry.insert(END, str(SAMPLE_RATE))
45 | 
46 |         thresholdLabel = Label(self, text="Threshold")
47 |         thresholdLabel.place(x=ul_x + 120, y=ul_y + 320)
48 |         self.thresholdEntry = Entry(self, width=3)
49 |         self.thresholdEntry.place(x=ul_x + 65 + 120, y=ul_y + 321)
50 |         self.thresholdEntry.insert(END, str(THRESHOLD))
51 | 
52 |         exceedsLabel = Label(self, text="Exceeds By")
53 |         exceedsLabel.place(x=ul_x + 235, y=ul_y + 320)
54 |         self.exceedsEntry = Entry(self, width=3)
55 |         self.exceedsEntry.place(x=ul_x + 65 + 235, y=ul_y + 321)
56 |         self.exceedsEntry.insert(END, str(EXCEEDS_BY))
57 | 
58 |         overlapAudioLabel = Label(self, text="Overlap Audio")
59 |         overlapAudioLabel.place(x=ul_x + 345, y=ul_y + 320)
60 |         self.overlapAudioBox = Checkbutton(self, command=self.toggleAudio)
61 |         self.overlapAudioBox.place(x=ul_x + 65 + 360, y=ul_y + 319)
62 | 
63 |         outputNameLabel = Label(self, text="Output File Name")
64 |         outputNameLabel.place(x=ul_x, y=ul_y + 345)
65 |         self.outputNameEntry = Entry(self, width=57)
66 |         self.outputNameEntry.place(x=ul_x + 102, y=ul_y + 346)
67 |         self.outputNameEntry.insert(END, OUTPUT_FILE_NAME)
68 | 
69 |         processButton = Button(self, text="Process", command=self.confirmSettings, width=15, height=3)
70 |         processButton.place(x=ul_x + 460, y=ul_y + 310)
71 | 
72 |     def confirmSettings(self):
73 |         global SAMPLE_RATE
74 |         SAMPLE_RATE = int(self.sampleRateEntry.get())
75 |         global THRESHOLD
76 |         THRESHOLD = int(self.thresholdEntry.get())
77 |         global EXCEEDS_BY
78 |         EXCEEDS_BY = float(self.exceedsEntry.get())
79 |         global OUTPUT_FILE_NAME
80 |         OUTPUT_FILE_NAME = self.outputNameEntry.get()
81 |         self.spliceClips()
82 | 
83 |     def toggleAudio(self):
84 |         global NO_OVERLAP_AUDIO
85 |         NO_OVERLAP_AUDIO = not NO_OVERLAP_AUDIO
86 | 
87 |     def addFile(self):
88 |         filename = askopenfilename()
89 |         if filename != '':
90 |             INPUT_FILES.append(filename)
91 |             fileDir = Label(self, text=filename)
92 |             fileDir.place(x=ul_x, y=ul_y+28+23*len(INPUT_FILES))
93 | 
94 |     # HELPER FUNCTIONS
95 | 
96 |     # TAKES RAW WAVEFORM DATA AND CREATES INTEGER WAVEFORM ARRAY
97 |     # INPUT: audioRate = audio sample rate
98 |     #        audioArray = associated audio waveform array
99 |     # OUTPUT: outputArray = downscaled audioArray based on SAMPLE_RATE
100 |     def parseAudioData(self, audioRate, audioArray):
101 |         sampleDivider = math.floor(audioRate / SAMPLE_RATE)
102 |         outputArray = []
103 |         sampleCounter = 0
104 |         while sampleCounter < audioArray.shape[0]:  # '<' (not '<=') avoids indexing past the last sample
105 |             outputArray.append(audioArray[sampleCounter][0])
106 |             sampleCounter += sampleDivider
107 |         return outputArray
108 | 
109 |     # TAKES ARRAY OF AUDIOARRAYS AND OUTPUTS INDEX ARRAY INDICATING WHICH ARRAY IS LOUDEST AT GIVEN TIME
110 |     # INPUT: audioArrays = array of audioArrays for each clip to compare audio
111 |     # OUTPUT: outputArray = array with indices 0..numArrays-1 indicating which clip should overlay
112 |     def compareAudioArrays(self, audioArrays):
113 |         priorityArray = 0 # Current array that should have video priority (zero-based)
114 |         consecutiveArray = 0 # Which array currently has the highest waveform integer
115 |         prevArray = 0 # Which array on the previous iteration had the highest waveform integer
116 |         consecutiveCount = 0 # Number of consecutive times audio is larger than others
117 |         counter = 0 # Current array index to compare
118 |         outputArray = []
119 | 
120 |         audioArrays = self.normalizeArrays(audioArrays) # Set arrays to equal lengths by appending zeroes
121 | 
122 |         while counter < len(audioArrays[0]):
123 |             consecutiveArray = self.returnHighestIndex(audioArrays, counter, priorityArray) # Find index of loudest clip
124 |             # If the loudest clip differs from the previous loudest clip, reset the consecutive counter; otherwise increment it
125 |             if (consecutiveArray != prevArray):
126 |                 prevArray = consecutiveArray
127 |                 consecutiveCount = 1
128 |             else:
129 |                 consecutiveCount += 1
130 |                 # If the overriding loudest clip has been louder >= THRESHOLD times, make it the priority clip
131 |                 if (consecutiveCount >= THRESHOLD):
132 |                     priorityArray = consecutiveArray
133 |             for checkpoint in checkpoints:
134 |                 if checkpoint == counter:
135 |                     priorityArray = consecutiveArray
136 |             outputArray.append(priorityArray)
137 |             counter += 1
138 |         # Write output data to text file (for debugging)
139 |         f = open(TEMP_FOLDER + "audioData.txt", "w")
140 |         f.write(str(outputArray))
141 |         f.close()
142 |         return outputArray
143 | 
144 |     # COMPARES AUDIO WAVEFORMS AT GIVEN TIME AND RETURNS INDEX OF LOUDEST
145 |     # INPUT: audioArrays = array of audio waveforms
146 |     #        index = index to compare waveform integers
147 |     #        currentPriority = current array that has priority (for EXCEEDS_BY)
148 |     # OUTPUT: returnIndex = index of audioArray with highest waveform integer
149 |     def returnHighestIndex(self, audioArrays, index, currentPriority):
150 |         maxVal = 0 # Maximum waveform value found so far
151 |         returnIndex = 0 # Index of array with highest waveform value
152 |         for c in range(len(audioArrays)):
153 |             if c != currentPriority:
154 |                 if abs(audioArrays[c][index]) > maxVal:
155 |                     maxVal = abs(audioArrays[c][index])
156 |                     returnIndex = c
157 |             else:
158 |                 if abs(audioArrays[c][index]) * EXCEEDS_BY > maxVal:
159 |                     maxVal = abs(audioArrays[c][index]) * EXCEEDS_BY
160 |                     returnIndex = c
161 |         return returnIndex
162 | 
163 |     # SETS ALL AUDIO ARRAYS TO SAME LENGTH BY APPENDING ZEROES TO SHORTER ARRAYS
164 |     # INPUT: audioArrays = array of audio arrays containing integer waveform data
165 |     # OUTPUT: outputArray = array of audio arrays, all equal length (appends 0 to shorter arrays)
166 |     def normalizeArrays(self, audioArrays):
167 |         maxArrayLen = 0
168 |         outputArray = []
169 |         # Get the length of the longest array
170 |         for array in audioArrays:
171 |             if len(array) > maxArrayLen:
172 |                 maxArrayLen = len(array)
173 |             # checkpoints.append(len(array))
174 |         # Fill shorter arrays with trailing zeroes
175 |         for array in audioArrays:
176 |             for c in range(maxArrayLen - len(array)):
177 |                 array.append(0)
178 |             outputArray.append(array)
179 |         return outputArray
180 | 
181 |     def spliceClips(self):
182 |         # MAIN PROCESS
183 |         audioDataArrays = []
184 |         # Generate .wav files for each video clip, create associated outputArray
185 |         for i in range(len(INPUT_FILES)):
186 |             # Extract audio via ffmpeg; list-form args avoid the shell and handle paths with spaces
187 |             command = ["ffmpeg", "-i", INPUT_FILES[i], "-ab", "160k", "-ac", "2",
188 |                        "-y", "-vn", TEMP_FOLDER + "audio" + str(i) + ".wav"]
189 |             subprocess.call(command)
190 |             audioRate, audioArray = wavfile.read(TEMP_FOLDER + 'audio' + str(i) + '.wav')
191 |             audioDataArrays.append(self.parseAudioData(audioRate, audioArray))
192 |         outputArray = self.compareAudioArrays(audioDataArrays)
193 | 
194 |         # Utilizes outputArray to determine which clips should be split and inserted where
195 |         outputClipList = []
196 |         audioClipList = [] # Only used if NO_OVERLAP_AUDIO is False
197 |         counter = 0
198 |         prevPriority = -1
199 |         prevEndPt = -1
200 |         while counter < len(outputArray):
201 |             # Initialization of loop
202 |             if prevEndPt == -1:
203 |                 prevPriority = outputArray[counter]
204 |                 prevEndPt = 0
205 |             # If the 'priority clip' is different than the previous one, finalize the previous clip and add it to the clip list
206 |             elif prevPriority != outputArray[counter]:
207 |                 print(str(counter) + " [" + str(prevPriority) + "] || SPLIT_PT: " + "start: " + str(
208 |                     prevEndPt) + " end: " + str(counter / SAMPLE_RATE))
209 |                 outputClipList.append(
210 |                     VideoFileClip(INPUT_FILES[prevPriority], audio=NO_OVERLAP_AUDIO).subclip(prevEndPt,
211 |                                                                                             counter / SAMPLE_RATE))
212 |                 prevPriority = outputArray[counter]
213 |                 prevEndPt = counter / SAMPLE_RATE
214 |             counter += 1
215 |         print(str(counter) + " [" + str(prevPriority) + "] || SPLIT_PT: " + "start: " + str(prevEndPt) + " end: " + str(
216 |             (counter - 1) / SAMPLE_RATE))
217 |         outputClipList.append(
218 |             VideoFileClip(INPUT_FILES[prevPriority], audio=NO_OVERLAP_AUDIO).subclip(prevEndPt, (
219 |                 counter - 1) / SAMPLE_RATE))
220 |         # Concatenate clips and output
221 |         videoOutput = concatenate_videoclips(outputClipList)
222 | 
223 |         # If audio should be overlapped, create an audio track with all clips overlapped and mix it with the video
224 |         if not NO_OVERLAP_AUDIO:
225 |             for wavFileIndex in range(len(INPUT_FILES)):
226 |                 audioClipList.append(AudioFileClip(TEMP_FOLDER + "audio" + str(wavFileIndex) + ".wav"))
227 |             audioOutput = CompositeAudioClip(audioClipList)
228 |             videoOutput = videoOutput.set_audio(audioOutput)
229 | 
230 |         videoOutput.write_videofile(
231 |             OUTPUT_FOLDER + OUTPUT_FILE_NAME + ".mp4")
232 | 
233 | root = Tk()
234 | root.geometry("600x400")
235 | app = Window(root)
236 | root.mainloop()
237 | 
--------------------------------------------------------------------------------
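As a supplement to the listing above: the clip-selection logic in main.py (downsampling, the EXCEEDS_BY bias, and the THRESHOLD streak) can be exercised without ffmpeg, scipy, or moviepy. The sketch below is not part of the repo; the function names (`downsample`, `loudest_index`, `build_timeline`) are illustrative stand-ins for parseAudioData(), returnHighestIndex(), and compareAudioArrays(), reduced to plain Python lists.

```python
import math

# Stand-ins for the globals in main.py (same defaults)
SAMPLE_RATE = 24   # samples per second to inspect
EXCEEDS_BY = 4     # factor a challenger must exceed the current clip by
THRESHOLD = 5      # consecutive wins required to steal priority


def downsample(audio_rate, samples):
    # Keep one sample every audio_rate / SAMPLE_RATE entries (cf. parseAudioData)
    step = math.floor(audio_rate / SAMPLE_RATE)
    return [samples[i] for i in range(0, len(samples), step)]


def loudest_index(levels, current):
    # Index of the loudest clip at one instant; the current clip's level is
    # multiplied by EXCEEDS_BY so others must beat it by that factor
    # (cf. returnHighestIndex)
    best_val, best_idx = 0, 0
    for c, level in enumerate(levels):
        val = abs(level) * (EXCEEDS_BY if c == current else 1)
        if val > best_val:
            best_val, best_idx = val, c
    return best_idx


def build_timeline(clips):
    # Priority timeline over equal-length level lists: a new clip takes over
    # only after winning THRESHOLD samples in a row (cf. compareAudioArrays)
    priority, prev, streak, out = 0, 0, 0, []
    for t in range(len(clips[0])):
        winner = loudest_index([clip[t] for clip in clips], priority)
        streak = streak + 1 if winner == prev else 1
        prev = winner
        if streak >= THRESHOLD:
            priority = winner
        out.append(priority)
    return out
```

With two clips where the second goes loud halfway through, `build_timeline` holds clip 0 until clip 1 has won THRESHOLD consecutive samples, which is the debouncing behavior the README describes.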