├── .gitignore ├── CHANGELOG.md ├── LICENSE.md ├── Makefile ├── README.md ├── dist ├── loopextractor-0.2.0-py3-none-any.whl └── loopextractor-0.2.0.tar.gz ├── loopextractor ├── __init__.py └── loopextractor.py ├── pyproject.toml ├── setup.py └── tests ├── __init__.py ├── example_song.mp3 ├── test_end_to_end.py └── test_loopextractor.py /.gitignore: -------------------------------------------------------------------------------- 1 | loopextractor/__pycache__/* 2 | loopextractor.egg-info/* 3 | tests/test_loops/* 4 | tests/__pycache__/* -------------------------------------------------------------------------------- /CHANGELOG.md: -------------------------------------------------------------------------------- 1 | # Changelog 2 | 3 | ## 0.2.1 4 | 5 | - Added two tests of loopextractor functions 6 | 7 | ## 0.2.0 8 | 9 | Reorganised project, preparing for pypi release. 10 | - store project info in pyproject.toml 11 | - add basic end-to-end test 12 | - madmom temporarily removed as beat tracker, for installation simplicity 13 | 14 | ## 0.1.0 15 | 16 | Initial version. Released December 2019. -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | Copyright (c) 2019 Jordan B. L. Smith (@jblsmith) 2 | 3 | You may use this software under LGPLv3, which is displayed below. 4 | 5 | # GNU Lesser General Public License 6 | 7 | *Version 3, 29 June 2007* 8 | *Copyright © 2007 Free Software Foundation, Inc. [http://fsf.org/](http://fsf.org/)* 9 | 10 | Everyone is permitted to copy and distribute verbatim copies 11 | of this license document, but changing it is not allowed. 12 | 13 | This version of the GNU Lesser General Public License incorporates 14 | the terms and conditions of version 3 of the GNU General Public 15 | License, supplemented by the additional permissions listed below. 16 | 17 | ### 0. 
Additional Definitions 18 | 19 | As used herein, “this License” refers to version 3 of the GNU Lesser 20 | General Public License, and the “GNU GPL” refers to version 3 of the GNU 21 | General Public License. 22 | 23 | “The Library” refers to a covered work governed by this License, 24 | other than an Application or a Combined Work as defined below. 25 | 26 | An “Application” is any work that makes use of an interface provided 27 | by the Library, but which is not otherwise based on the Library. 28 | Defining a subclass of a class defined by the Library is deemed a mode 29 | of using an interface provided by the Library. 30 | 31 | A “Combined Work” is a work produced by combining or linking an 32 | Application with the Library. The particular version of the Library 33 | with which the Combined Work was made is also called the “Linked 34 | Version”. 35 | 36 | The “Minimal Corresponding Source” for a Combined Work means the 37 | Corresponding Source for the Combined Work, excluding any source code 38 | for portions of the Combined Work that, considered in isolation, are 39 | based on the Application, and not on the Linked Version. 40 | 41 | The “Corresponding Application Code” for a Combined Work means the 42 | object code and/or source code for the Application, including any data 43 | and utility programs needed for reproducing the Combined Work from the 44 | Application, but excluding the System Libraries of the Combined Work. 45 | 46 | ### 1. Exception to Section 3 of the GNU GPL 47 | 48 | You may convey a covered work under sections 3 and 4 of this License 49 | without being bound by section 3 of the GNU GPL. 50 | 51 | ### 2. 
Conveying Modified Versions 52 | 53 | If you modify a copy of the Library, and, in your modifications, a 54 | facility refers to a function or data to be supplied by an Application 55 | that uses the facility (other than as an argument passed when the 56 | facility is invoked), then you may convey a copy of the modified 57 | version: 58 | 59 | * **a)** under this License, provided that you make a good faith effort to 60 | ensure that, in the event an Application does not supply the 61 | function or data, the facility still operates, and performs 62 | whatever part of its purpose remains meaningful, or 63 | 64 | * **b)** under the GNU GPL, with none of the additional permissions of 65 | this License applicable to that copy. 66 | 67 | ### 3. Object Code Incorporating Material from Library Header Files 68 | 69 | The object code form of an Application may incorporate material from 70 | a header file that is part of the Library. You may convey such object 71 | code under terms of your choice, provided that, if the incorporated 72 | material is not limited to numerical parameters, data structure 73 | layouts and accessors, or small macros, inline functions and templates 74 | (ten or fewer lines in length), you do both of the following: 75 | 76 | * **a)** Give prominent notice with each copy of the object code that the 77 | Library is used in it and that the Library and its use are 78 | covered by this License. 79 | * **b)** Accompany the object code with a copy of the GNU GPL and this license 80 | document. 81 | 82 | ### 4. 
Combined Works 83 | 84 | You may convey a Combined Work under terms of your choice that, 85 | taken together, effectively do not restrict modification of the 86 | portions of the Library contained in the Combined Work and reverse 87 | engineering for debugging such modifications, if you also do each of 88 | the following: 89 | 90 | * **a)** Give prominent notice with each copy of the Combined Work that 91 | the Library is used in it and that the Library and its use are 92 | covered by this License. 93 | 94 | * **b)** Accompany the Combined Work with a copy of the GNU GPL and this license 95 | document. 96 | 97 | * **c)** For a Combined Work that displays copyright notices during 98 | execution, include the copyright notice for the Library among 99 | these notices, as well as a reference directing the user to the 100 | copies of the GNU GPL and this license document. 101 | 102 | * **d)** Do one of the following: 103 | - **0)** Convey the Minimal Corresponding Source under the terms of this 104 | License, and the Corresponding Application Code in a form 105 | suitable for, and under terms that permit, the user to 106 | recombine or relink the Application with a modified version of 107 | the Linked Version to produce a modified Combined Work, in the 108 | manner specified by section 6 of the GNU GPL for conveying 109 | Corresponding Source. 110 | - **1)** Use a suitable shared library mechanism for linking with the 111 | Library. A suitable mechanism is one that **(a)** uses at run time 112 | a copy of the Library already present on the user's computer 113 | system, and **(b)** will operate properly with a modified version 114 | of the Library that is interface-compatible with the Linked 115 | Version. 
116 | 117 | * **e)** Provide Installation Information, but only if you would otherwise 118 | be required to provide such information under section 6 of the 119 | GNU GPL, and only to the extent that such information is 120 | necessary to install and execute a modified version of the 121 | Combined Work produced by recombining or relinking the 122 | Application with a modified version of the Linked Version. (If 123 | you use option **4d0**, the Installation Information must accompany 124 | the Minimal Corresponding Source and Corresponding Application 125 | Code. If you use option **4d1**, you must provide the Installation 126 | Information in the manner specified by section 6 of the GNU GPL 127 | for conveying Corresponding Source.) 128 | 129 | ### 5. Combined Libraries 130 | 131 | You may place library facilities that are a work based on the 132 | Library side by side in a single library together with other library 133 | facilities that are not Applications and are not covered by this 134 | License, and convey such a combined library under terms of your 135 | choice, if you do both of the following: 136 | 137 | * **a)** Accompany the combined library with a copy of the same work based 138 | on the Library, uncombined with any other library facilities, 139 | conveyed under the terms of this License. 140 | * **b)** Give prominent notice with the combined library that part of it 141 | is a work based on the Library, and explaining where to find the 142 | accompanying uncombined form of the same work. 143 | 144 | ### 6. Revised Versions of the GNU Lesser General Public License 145 | 146 | The Free Software Foundation may publish revised and/or new versions 147 | of the GNU Lesser General Public License from time to time. Such new 148 | versions will be similar in spirit to the present version, but may 149 | differ in detail to address new problems or concerns. 150 | 151 | Each version is given a distinguishing version number. 
If the 152 | Library as you received it specifies that a certain numbered version 153 | of the GNU Lesser General Public License “or any later version” 154 | applies to it, you have the option of following the terms and 155 | conditions either of that published version or of any later version 156 | published by the Free Software Foundation. If the Library as you 157 | received it does not specify a version number of the GNU Lesser 158 | General Public License, you may choose any version of the GNU Lesser 159 | General Public License ever published by the Free Software Foundation. 160 | 161 | If the Library as you received it specifies that a proxy can decide 162 | whether future versions of the GNU Lesser General Public License shall 163 | apply, that proxy's public statement of acceptance of any version is 164 | permanent authorization for you to choose that version for the 165 | Library. -------------------------------------------------------------------------------- /Makefile: -------------------------------------------------------------------------------- 1 | build: 2 | python -m build 3 | 4 | install: 5 | python setup.py install 6 | rm -r build 7 | 8 | install-dev: install 9 | pip3 install pytest 10 | 11 | test: 12 | pytest tests -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # loopextractor 2 | 3 | A python script for extracting loops from audio files. 4 | 5 | The script uses non-negative tensor factorization to model a version of the spectrum. 6 | 7 | The code was written by Jordan B. L. Smith (@jblsmith) in December 2019. 8 | 9 | 10 | ### Installation 11 | 12 | The project currently requires Python ≥3.8. 
13 | 14 | You can download and install the repository from GitHub directly: 15 | 16 | ``` 17 | pip install git+https://github.com/jblsmith/loopextractor 18 | ``` 19 | 20 | ### Usage 21 | 22 | From a clone of the repository, run loopextractor from the command line by passing an audio file and a prefix for the output files (here, the included example song): 23 | 24 | ``` 25 | python loopextractor/loopextractor.py tests/example_song.mp3 extracted_loop 26 | ``` 27 | 28 | You can also import it and use the functions in it on your own data: 29 | 30 | ``` 31 | from loopextractor.loopextractor import run_algorithm 32 | run_algorithm("my_audio_file.mp3", n_templates=[30,25,10], output_savename="my_string") 33 | ``` 34 | 35 | ### Reference 36 | 37 | The script implements the algorithm described in a paper I published in 2018, [described here](http://jblsmith.github.io/projects/nonnegative-tensor-factorization/). When using this code for an academic paper/project, please cite this paper as a reference: 38 | 39 | > Smith, Jordan B. L., and Goto, Masataka. 2018. "Nonnegative tensor factorization for source separation of loops in audio." *Proceedings of the IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE ICASSP 2018).* Calgary, AB, Canada. pp. 171--175. 40 | 41 | The included example song was assembled using loops from [FreeSound.org](FreeSound.org) that were licensed Creative-Commons 0, i.e., dedicated to the public domain. 42 | 43 | ### License 44 | 45 | This project is licensed under the terms of the [GNU Lesser General Public License version 3 (LGPLv3)](https://www.gnu.org/licenses/lgpl-3.0.en.html). 46 | 47 | ### Disclaimer 48 | 49 | Although the code for loopextractor follows the same steps described in the ICASSP paper cited above, 50 | this code was written from scratch in December 2019 by Jordan Smith alone. 51 | 52 | Outside of this, the code for loopextractor has no relationship or connection to work done at AIST, 53 | nor to the code that powers the Unmixer website (https://unmixer.ongaaccel.jp/).
54 | -------------------------------------------------------------------------------- /dist/loopextractor-0.2.0-py3-none-any.whl: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jblsmith/loopextractor/e09ffd2cee2eb10d4d7fbb2d2fa48b041ba02e52/dist/loopextractor-0.2.0-py3-none-any.whl -------------------------------------------------------------------------------- /dist/loopextractor-0.2.0.tar.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jblsmith/loopextractor/e09ffd2cee2eb10d4d7fbb2d2fa48b041ba02e52/dist/loopextractor-0.2.0.tar.gz -------------------------------------------------------------------------------- /loopextractor/__init__.py: -------------------------------------------------------------------------------- 1 | from __future__ import absolute_import, division, print_function -------------------------------------------------------------------------------- /loopextractor/loopextractor.py: -------------------------------------------------------------------------------- 1 | ''' 2 | File name: loopextractor.py 3 | Author: Jordan B. L. Smith 4 | Date created: 2 December 2019 5 | Date last modified: 1 February 2024 6 | License: GNU Lesser General Public License v3 (LGPLv3) 7 | Python Version: 3.8 8 | ''' 9 | 10 | import argparse 11 | import copy 12 | import librosa 13 | import numpy as np 14 | import os 15 | import soundfile 16 | import tensorly 17 | import tensorly.decomposition as tld 18 | from sklearn.decomposition import NMF 19 | 20 | def run_algorithm(audio_file, n_templates=[0,0,0], output_savename="extracted_loop"): 21 | """Complete pipeline of algorithm. 22 | 23 | Parameters 24 | ---------- 25 | audio_file : string 26 | Path to audio file to be loaded and analysed. 27 | n_templates : list of length 3 28 | The number of sound, rhythm and loop templates. 
29 | Default value [0,0,0] causes the script to estimate reasonable values. 30 | output_savename : string 31 | Base string for saved output filenames. 32 | 33 | Returns 34 | ------- 35 | A set of files containing the extracted loops. 36 | 37 | Examples 38 | -------- 39 | >>> run_algorithm("example_song.mp3", [40,20,7], "extracted_loop") 40 | 41 | See also 42 | -------- 43 | tensorly.decomposition.non_negative_tucker 44 | """ 45 | assert os.path.exists(audio_file) 46 | assert len(n_templates)==3 47 | assert type(n_templates) is list 48 | # Load mono audio: 49 | signal_mono, fs = librosa.load(audio_file, sr=None, mono=True) 50 | # Estimate the downbeat times (librosa-based; madmom is no longer used): 51 | downbeat_times = get_downbeats(signal_mono, fs) 52 | # Convert times to sample indices so we can segment the signal: 53 | downbeat_frames = librosa.time_to_samples(downbeat_times, sr=fs) 54 | # Create spectral cube out of signal: 55 | spectral_cube = make_spectral_cube(signal_mono, downbeat_frames) 56 | # Validate the input n_templates (inventing new ones if any is wrong): 57 | n_sounds, n_rhythms, n_loops = validate_template_sizes(spectral_cube, n_templates) 58 | # Use TensorLy to do the non-negative Tucker decomposition: 59 | core, factors = tld.non_negative_tucker(np.abs(spectral_cube), [n_sounds, n_rhythms, n_loops], n_iter_max=500, verbose=True) 60 | # Reconstruct each loop: 61 | for ith_loop in range(n_loops): 62 | # Multiply templates together to get real loop spectrum: 63 | loop_spectrum = create_loop_spectrum(factors[0], factors[1], core[:,:,ith_loop]) 64 | # Choose best bar to reconstruct from (we will use its phase): 65 | bar_ind = choose_bar_to_reconstruct(factors[2], ith_loop) 66 | # Reconstruct loop signal by masking original spectrum: 67 | ith_loop_signal = get_loop_signal(loop_spectrum, spectral_cube[:,:,bar_ind]) 68 | # Write signal to disk: 69 | soundfile.write("{0}_{1}.wav".format(output_savename,ith_loop), ith_loop_signal, fs) 70 | 71 | def get_downbeats(signal, fs): 72 | """ 73 | Basic,
sloppy downbeat detection: use Librosa-tracked beats, assume 4/4, 74 | and use the phase with the best onset strength. 75 | """ 76 | tempo, beat_frames = librosa.beat.beat_track(y=signal, sr=fs, units="frames") 77 | onset_strength_frames = librosa.onset.onset_strength(y=signal, sr=fs) 78 | phase_strengths = [np.median(onset_strength_frames[beat_frames[i::4]]) for i in range(4)] 79 | best_phase = np.argmax(phase_strengths) 80 | return librosa.frames_to_time(beat_frames[best_phase::4], sr=fs) 81 | 82 | def make_spectral_cube(signal_mono, downbeat_frames): 83 | """Convert audio signal into a spectral cube using 84 | specified downbeat frames. 85 | 86 | An STFT is taken of each segment of audio, and 87 | these STFTs are stacked into a 3rd dimension. 88 | 89 | The STFTs may have different lengths; they are 90 | zero-padded to the length of the longest STFT. 91 | 92 | Parameters 93 | ---------- 94 | signal_mono : np.ndarray [shape=(n,), dtype=float] 95 | one-dimensional audio signal to convert 96 | downbeat_frames : np.ndarray [shape=(n,), dtype=int] 97 | list of sample indices separating downbeats (or whatever 98 | time interval is desired) 99 | 100 | Returns 101 | ------- 102 | tensor : np.ndarray [shape=(n1,n2,n3), dtype=complex128] 103 | tensor containing spectrum slices 104 | 105 | Examples 106 | -------- 107 | >>> signal_mono, fs = librosa.load("example_song.mp3", sr=None, mono=True) 108 | >>> downbeat_times = get_downbeats(signal_mono, fs) 109 | >>> downbeat_frames = librosa.time_to_samples(downbeat_times, sr=fs) 110 | >>> spectral_cube = make_spectral_cube(signal_mono, downbeat_frames) 111 | >>> spectral_cube.shape 112 | (1025, 162, 31) 113 | >>> spectral_cube[:2,:2,:2] 114 | array([[[ 18.08905602+0.00000000e+00j, -20.48682976+0.00000000e+00j], 115 | [-16.07670403+0.00000000e+00j, -44.98669434+0.00000000e+00j]], 116 | 117 | [[-19.45080566+3.66026653e-15j, -8.5700922 +3.14418630e-16j], 118 | [ 1.01680577-3.67251587e+01j, 35.03190231-2.13507919e+01j]]]) 119 | """ 120 | assert
len(signal_mono.shape) == 1 121 | # For each span of audio, compute the STFT using librosa defaults. 122 | usable_downbeat_frames = [d for d in downbeat_frames if d <= len(signal_mono)] 123 | fft_per_span = [librosa.core.stft(signal_mono[b1:b2]) for b1,b2 in zip(usable_downbeat_frames[:-1], usable_downbeat_frames[1:])] 124 | # Tensor size 1: the number of frequency bins 125 | freq_bins = fft_per_span[0].shape[0] 126 | # Tensor size 2: the length of the STFTs. 127 | # This could vary for each span; use the maximum. 128 | rhyt_bins = np.max([fpb.shape[1] for fpb in fft_per_span]) 129 | # Tensor size 3: the number of spans. 130 | bar_bins = len(fft_per_span) 131 | tensor = np.zeros((freq_bins, rhyt_bins, bar_bins)).astype(complex) 132 | for i in range(bar_bins): 133 | tensor[:,:fft_per_span[i].shape[1],i] = fft_per_span[i] 134 | return tensor 135 | 136 | def validate_template_sizes(spectral_cube, n_templates): 137 | """Ensure that the specified numbers of estimated templates are valid. 138 | Values must be at least 1 and strictly less than 139 | the corresponding dimension of the original tensor. 140 | So, if the tensor has size [1025,100,20], then 141 | n_templates = [99,99,10] is valid (though unadvised), while 142 | n_templates = [30,20,20] is invalid. 143 | 144 | If ANY of the values for n_templates are invalid, then 145 | get_recommended_template_sizes() is used to obtain 146 | replacement values for n_templates. 147 | 148 | Parameters 149 | ---------- 150 | spectral_cube : np.ndarray [shape=(n1,n2,n3)] 151 | Original tensor to be modeled. 152 | n_templates : list [shape=(3,), dtype=int] 153 | Proposed numbers of templates. 154 | 155 | Returns 156 | ------- 157 | output_n_templates : np.ndarray [shape=(3,), dtype=int] 158 | Validated numbers of templates.
159 | 160 | Examples 161 | -------- 162 | >>> validate_template_sizes(np.zeros((1025, 162, 31)), [100, 50, 20]) 163 | array([100, 50, 20]) 164 | >>> validate_template_sizes(np.zeros((1025, 162, 31)), [0, 0, 0]) 165 | array([63, 21, 7]) 166 | >>> validate_template_sizes(np.zeros((1025, 162, 31)), [100, 50, 40]) 167 | array([63, 21, 7]) 168 | 169 | See Also 170 | -------- 171 | get_recommended_template_sizes 172 | """ 173 | max_template_sizes = np.array(spectral_cube.shape) - 1 174 | min_template_sizes = np.ones_like(max_template_sizes) 175 | big_enough = np.all(min_template_sizes <= n_templates) 176 | small_enough = np.all(n_templates <= max_template_sizes) 177 | valid = big_enough & small_enough 178 | if valid: 179 | return n_templates 180 | else: 181 | return get_recommended_template_sizes(spectral_cube) 182 | 183 | def purify_core_tensor(core, factors, new_rank, dim_to_reduce=2): 184 | """Reduce the size of the core tensor by modelling repeated content 185 | across loop recipes. The output is a more "pure" set of loop 186 | recipes that should be more distinct from each other. 187 | 188 | Parameters 189 | ---------- 190 | core : np.ndarray [shape=(n1,n2,n3)] 191 | Core tensor to be compressed. 192 | factors : list [shape=(3,), dtype=np.ndarray] 193 | List of estimated templates 194 | new_rank : int 195 | The new size for the core tensor 196 | dim_to_reduce : int 197 | The dimension along which to compress the core tensor. 198 | (Default value 2 will reduce the number of loop types.) 199 | 200 | Returns 201 | ------- 202 | new_core : np.ndarray [shape=(n1,n2,new_rank)] 203 | Compressed version of the core tensor 204 | new_factors : list [shape=(3,), dtype=np.ndarray] 205 | New list of templates. 206 | Note: two templates will be the same as before; 207 | only the template for the compressed dimension 208 | will be different. 
209 | """ 210 | assert new_rank < core.shape[dim_to_reduce] 211 | X = tensorly.unfold(core,dim_to_reduce) 212 | model = NMF(n_components=new_rank, init='nndsvd', random_state=0) 213 | W = model.fit_transform(X) 214 | H = model.components_ 215 | # Re-construct core tensor and factors based on NMF factors from core tensor: 216 | new_shape = list(core.shape) 217 | new_shape[dim_to_reduce] = new_rank 218 | new_core = tensorly.fold(H, dim_to_reduce, new_shape) 219 | new_factors = copy.copy(factors) 220 | new_factors[dim_to_reduce] = np.dot(factors[dim_to_reduce],W) 221 | return new_core, new_factors 222 | 223 | def get_recommended_template_sizes(spectral_cube): 224 | """Propose reasonable values for numbers of templates 225 | to estimate. 226 | 227 | If a dimension of the tensor is N, then N^(6/10), rounded 228 | down, seems to give a reasonable value. 229 | 230 | Parameters 231 | ---------- 232 | spectral_cube : np.ndarray [shape=(n1,n2,n3)] 233 | Original tensor to be modeled. 234 | 235 | Returns 236 | ------- 237 | recommended_sizes : np.ndarray [shape=(len(spectral_cube.shape),), dtype=float] 238 | Suggested number of templates. 239 | 240 | Examples 241 | -------- 242 | >>> get_recommended_template_sizes(np.zeros((100,200,300))) 243 | array([15, 23, 30]) 244 | >>> get_recommended_template_sizes(np.zeros((4,400,40000))) 245 | array([1, 36, 577]) 246 | """ 247 | max_template_sizes = np.array(spectral_cube.shape) - 1 248 | min_template_sizes = np.ones_like(max_template_sizes) 249 | recommended_sizes = np.floor(max_template_sizes**.6).astype(int) 250 | recommended_sizes = np.max((recommended_sizes, min_template_sizes),axis=0) 251 | assert np.all(min_template_sizes <= recommended_sizes) 252 | assert np.all(recommended_sizes <= max_template_sizes) 253 | return recommended_sizes 254 | 255 | def create_loop_spectrum(sounds, rhythms, core_slice): 256 | """Recreate loop spectrum from a slice of the core tensor 257 | and the first two templates, the sounds and rhythms. 
258 | 259 | Parameters 260 | ---------- 261 | sounds : np.ndarray [shape=(n_frequency_bins, n_sounds), dtype=float] 262 | The sound templates, one spectral template per column. 263 | rhythms : np.ndarray [shape=(n_time_bins, n_rhythms), dtype=float] 264 | The rhythm templates, or time-in-bar activations functions. 265 | One rhythm template per column. 266 | core_slice : np.ndarray [shape=(n_sounds, n_rhythms)] 267 | A slice of the core tensor giving the recipe for one loop. 268 | 269 | Returns 270 | ------- 271 | loop_spectrum : np.ndarray [shape=(n_frequency_bins, n_time_bins), dtype=float] 272 | Reconstruction of spectrum. 273 | 274 | Examples 275 | -------- 276 | >>> np.random.seed(0) 277 | >>> factors = [np.abs(np.random.randn(1025, 63)), 278 | np.abs(np.random.randn(162, 21)), 279 | np.abs(np.random.randn(31, 7))] 280 | >>> core = np.abs(np.random.randn(63,21,7)) 281 | >>> create_loop_spectrum(factors[0], factors[1], core[:,:,0]) 282 | array([[727.4153606 , 728.64591236, 625.76726056, ..., 512.94167141, 283 | 592.2098947 , 607.10457107], 284 | [782.11991843, 778.09690543, 682.71895323, ..., 550.43525375, 285 | 636.51448493, 666.35600624], 286 | [733.96209316, 720.17586837, 621.80762807, ..., 501.51192504, 287 | 590.14018676, 605.44147057], 288 | ..., 289 | [772.43712078, 758.88473642, 654.35159419, ..., 522.69754588, 290 | 628.84580165, 641.66347072], 291 | [677.58720601, 666.52484723, 583.92269705, ..., 471.24362278, 292 | 558.17441475, 573.31864635], 293 | [768.96634561, 758.85553214, 639.21515256, ..., 525.83186141, 294 | 634.04799161, 644.35772338]]) 295 | """ 296 | loop_spectrum = np.dot(np.dot(sounds, core_slice), rhythms.transpose()) 297 | return loop_spectrum 298 | 299 | def choose_bar_to_reconstruct(loop_templates, ith_loop): 300 | """...Choose... bar... to... reconstruct! 301 | 302 | For now, it just chooses the bar with the largest activation. 
303 | More information could / should be included, like reducing 304 | cross-talk, which would mean considering the activations (but 305 | ideally the relative *loudnesses*) of the other loops. 306 | 307 | Parameters 308 | ---------- 309 | loop_templates : np.ndarray [shape=(n_bars, n_loop_types), dtype=float] 310 | The loop activation templates, one template per column. 311 | ith_loop : int 312 | The index of the loop template. 313 | 314 | Returns 315 | ------- 316 | bar_ind : int 317 | The index of the bar to choose. 318 | 319 | Examples 320 | -------- 321 | >>> np.random.seed(0) 322 | >>> factors = [np.abs(np.random.randn(1025, 63)), 323 | np.abs(np.random.randn(162, 21)), 324 | np.abs(np.random.randn(31, 7))] 325 | >>> choose_bar_to_reconstruct(factors[2], 0) 326 | 10 327 | """ 328 | bar_ind = np.argmax(loop_templates[:,ith_loop]) 329 | return bar_ind 330 | 331 | def get_loop_signal(loop_spectrum, original_spectrum): 332 | """Reconstruct the signal for a loop given its spectrum 333 | and the original spectrum. 334 | 335 | The original spectrum is used as the basis, and the reconstructed 336 | loop spectrum is used to mask the spectrum. 337 | 338 | Parameters 339 | ---------- 340 | loop_spectrum : np.ndarray [shape=(n_freq_bins, n_time_bins_1), dtype=float] 341 | Reconstructed loop spectrum (real) 342 | original_spectrum : np.ndarray [shape=(n_freq_bins, n_time_bins_2), dtype=complex] 343 | Original spectrum (complex; possibly different length of time) 344 | 345 | Returns 346 | ------- 347 | signal : np.ndarray [shape=(n,), dtype=float] 348 | Estimated signal of isolated loop. 
349 | 350 | Examples 351 | -------- 352 | >>> np.random.seed(0) 353 | >>> random_matrix = np.random.randn(1025,130) 354 | >>> loop_spectrum = np.abs(random_matrix) / np.max(random_matrix) 355 | >>> random_matrix_2 = np.random.randn(1025,130) 356 | >>> loop_spectrum_2 = np.abs(random_matrix_2) / np.max(random_matrix_2) 357 | >>> get_loop_signal(loop_spectrum, loop_spectrum_2) 358 | array([-5.7243928e-04, -2.3625907e-04, -3.8087784e-04, ..., 359 | 9.2569360e-05, 3.9195133e-04, -2.4777438e-04], dtype=float32) 360 | 361 | See also 362 | -------- 363 | librosa.util.softmask 364 | """ 365 | assert loop_spectrum.shape[0] == original_spectrum.shape[0] 366 | min_length = np.min((loop_spectrum.shape[1], original_spectrum.shape[1])) 367 | orig_mag, orig_phase = librosa.magphase(original_spectrum) 368 | mask = librosa.util.softmask(loop_spectrum[:,:min_length], orig_mag[:,:min_length], power=1) 369 | masked_spectrum = original_spectrum[:,:min_length] * mask 370 | signal = librosa.core.istft(masked_spectrum) 371 | return signal 372 | 373 | def write_all_loop_signals(core, factors, spectral_cube, 374 | fs=44100, output_savename="extracted_loop"): 375 | # Reconstruct each loop: 376 | n_loops = core.shape[2] 377 | for ith_loop in range(n_loops): 378 | # Multiply templates together to get real loop spectrum: 379 | loop_spectrum = create_loop_spectrum(factors[0], factors[1], core[:,:,ith_loop]) 380 | # Choose best bar to reconstruct from (we will use its phase): 381 | bar_ind = choose_bar_to_reconstruct(factors[2], ith_loop) 382 | # Reconstruct loop signal by masking original spectrum: 383 | ith_loop_signal = get_loop_signal(loop_spectrum, spectral_cube[:,:,bar_ind]) 384 | # Write signal to disk: 385 | soundfile.write("{0}_{1}.wav".format(output_savename,ith_loop), ith_loop_signal, fs) 386 | 387 | if __name__ == "__main__": 388 | parser = argparse.ArgumentParser(description="Write this later") 389 | parser.add_argument("audio_file", type=str, help="Path to audio file") 390 | 
parser.add_argument("output_path", type=str, help="Prefix path for output files") 391 | args = parser.parse_args() 392 | run_algorithm(args.audio_file, [0,0,0], args.output_path) -------------------------------------------------------------------------------- /pyproject.toml: -------------------------------------------------------------------------------- 1 | [build-system] 2 | requires = ["setuptools >= 61.0"] 3 | build-backend = "setuptools.build_meta" 4 | 5 | [project] 6 | name = "loopextractor" 7 | version = "0.2.1" 8 | dependencies = [ 9 | "tensorly>=0.8.1", 10 | "librosa>=0.10.0", 11 | "numpy>=1.22.0", 12 | "scikit-learn<=1.3.2,>=1.1.0", 13 | ] 14 | requires-python = ">= 3.8" 15 | authors = [{name="Jordan B. L. Smith", email="jblsmith@gmail.com"}] 16 | description = "Extract repeating loops from songs using Nonnegative Tucker Decomposition-based source separation" 17 | readme = "README.md" 18 | license = {file = "LICENSE.md"} 19 | keywords = ["source separation", "loops", "audio", "music structure analysis"] 20 | classifiers = [ 21 | "Development Status :: 3 - Alpha", 22 | "Environment :: Console", 23 | "Intended Audience :: Science/Research", 24 | "License :: OSI Approved :: GNU Lesser General Public License v3 (LGPLv3)", 25 | "Operating System :: OS Independent", 26 | "Programming Language :: Python", 27 | "Topic :: Scientific/Engineering", 28 | "Topic :: Multimedia :: Sound/Audio :: Analysis", 29 | "Topic :: Multimedia :: Sound/Audio" 30 | ] 31 | 32 | [project.urls] 33 | Homepage = "https://github.com/jblsmith/loopextractor" 34 | -------------------------------------------------------------------------------- /setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | 3 | setup() 4 | -------------------------------------------------------------------------------- /tests/__init__.py: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/jblsmith/loopextractor/e09ffd2cee2eb10d4d7fbb2d2fa48b041ba02e52/tests/__init__.py -------------------------------------------------------------------------------- /tests/example_song.mp3: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jblsmith/loopextractor/e09ffd2cee2eb10d4d7fbb2d2fa48b041ba02e52/tests/example_song.mp3 -------------------------------------------------------------------------------- /tests/test_end_to_end.py: -------------------------------------------------------------------------------- 1 | import os 2 | 3 | from loopextractor.loopextractor import run_algorithm 4 | 5 | def test_run_algorithm(): 6 | os.makedirs("tests/test_loops", exist_ok=True) 7 | run_algorithm("tests/example_song.mp3", n_templates=[0,0,0], output_savename="tests/test_loops/loop") 8 | for i in range(7): 9 | assert os.path.isfile(f"tests/test_loops/loop_{i}.wav") -------------------------------------------------------------------------------- /tests/test_loopextractor.py: -------------------------------------------------------------------------------- 1 | from loopextractor.loopextractor import get_recommended_template_sizes, purify_core_tensor, choose_bar_to_reconstruct 2 | 3 | import numpy as np 4 | 5 | def test_get_recommended_template_sizes(): 6 | recommended_size = get_recommended_template_sizes(np.zeros((10,100,1000))) 7 | assert np.all(recommended_size == [3,15,63]) 8 | 9 | 10 | def test_purify_cube(): 11 | thing1 = np.random.randint(0, 1000, (5,5)) 12 | thing2 = np.random.randint(0, 1000, (5,5)) 13 | thing3 = np.random.randint(0, 1000, (5,5)) 14 | thing4 = thing1 + 2*thing2 15 | factors = [[], [], np.array([[1,0], [1,0], [0,1], [0,1]]).T] 16 | core_tensor = np.dstack([thing1, thing2, thing3, thing4]) 17 | pure_tensor, pure_factors = purify_core_tensor(core_tensor, factors, new_rank=3, dim_to_reduce=2) 18 | assert pure_tensor.shape == (5,5,3) 19 | 20 | bar_0_before = 
np.dot(core_tensor, factors[2][0, :]) 21 | bar_0_after = np.dot(pure_tensor, pure_factors[2][0, :]) 22 | assert np.all(np.abs(bar_0_before - bar_0_after) < bar_0_before) 23 | 24 | 25 | def test_choose_bar_to_reconstruct(): 26 | # Five bars (rows), three loops (columns) after the transpose 27 | loop_map = np.array([[11,10,1,0,12], 28 | [0,3,2,1,1], 29 | [5,0,0,1,2]]).T 30 | assert choose_bar_to_reconstruct(loop_map, 0) == 4 31 | assert choose_bar_to_reconstruct(loop_map, 1) == 1 32 | assert choose_bar_to_reconstruct(loop_map, 2) == 0 33 | 34 | # FIXME: Add more tests --------------------------------------------------------------------------------
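
The template-sizing rule checked by `test_get_recommended_template_sizes` above can be sketched without NumPy. This is a minimal illustration, not the package's implementation: the names `sketch_recommended_sizes` and `sketch_validate_sizes` are hypothetical, and the sketch assumes the rule is floor((N - 1)^0.6), clamped to at least 1, with user-supplied sizes accepted only when every value lies between 1 and N - 1.

```python
import math
from typing import List, Sequence


def sketch_recommended_sizes(shape: Sequence[int]) -> List[int]:
    """Hypothetical pure-Python version of the sizing rule.

    For a tensor dimension N, the largest allowed size is N - 1, and the
    suggested size is floor((N - 1) ** 0.6), but never less than 1.
    """
    sizes = []
    for n in shape:
        max_size = n - 1
        sizes.append(max(1, math.floor(max_size ** 0.6)))
    return sizes


def sketch_validate_sizes(shape: Sequence[int], n_templates: Sequence[int]) -> List[int]:
    """Keep n_templates if every value is in [1, N - 1]; otherwise
    replace all of them with the recommended sizes."""
    valid = all(1 <= k <= n - 1 for k, n in zip(n_templates, shape))
    if valid:
        return list(n_templates)
    return sketch_recommended_sizes(shape)


if __name__ == "__main__":
    # Same input as test_get_recommended_template_sizes
    print(sketch_recommended_sizes((10, 100, 1000)))         # -> [3, 15, 63]
    # A valid request is returned unchanged
    print(sketch_validate_sizes((1025, 162, 31), [100, 50, 20]))  # -> [100, 50, 20]
    # Zeros are invalid, so the recommended sizes are used instead
    print(sketch_validate_sizes((100, 200, 300), [0, 0, 0]))  # -> [15, 23, 30]
```

Cases like these (a valid request, an all-zero request, and a request that exceeds one dimension) could be candidates for the additional tests that the FIXME above asks for.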