├── .gitignore ├── LICENSE ├── Librosa tutorial.ipynb └── README.md /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | env/ 12 | build/ 13 | develop-eggs/ 14 | dist/ 15 | downloads/ 16 | eggs/ 17 | .eggs/ 18 | lib/ 19 | lib64/ 20 | parts/ 21 | sdist/ 22 | var/ 23 | *.egg-info/ 24 | .installed.cfg 25 | *.egg 26 | 27 | # PyInstaller 28 | # Usually these files are written by a python script from a template 29 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 30 | *.manifest 31 | *.spec 32 | 33 | # Installer logs 34 | pip-log.txt 35 | pip-delete-this-directory.txt 36 | 37 | # Unit test / coverage reports 38 | htmlcov/ 39 | .tox/ 40 | .coverage 41 | .coverage.* 42 | .cache 43 | nosetests.xml 44 | coverage.xml 45 | *,cover 46 | .hypothesis/ 47 | 48 | # Translations 49 | *.mo 50 | *.pot 51 | 52 | # Django stuff: 53 | *.log 54 | local_settings.py 55 | 56 | # Flask stuff: 57 | instance/ 58 | .webassets-cache 59 | 60 | # Scrapy stuff: 61 | .scrapy 62 | 63 | # Sphinx documentation 64 | docs/_build/ 65 | 66 | # PyBuilder 67 | target/ 68 | 69 | # IPython Notebook 70 | .ipynb_checkpoints 71 | 72 | # pyenv 73 | .python-version 74 | 75 | # celery beat schedule file 76 | celerybeat-schedule 77 | 78 | # dotenv 79 | .env 80 | 81 | # virtualenv 82 | venv/ 83 | ENV/ 84 | 85 | # Spyder project settings 86 | .spyderproject 87 | 88 | # Rope project settings 89 | .ropeproject 90 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | CC0 1.0 Universal 2 | 3 | Statement of Purpose 4 | 5 | The laws of most jurisdictions throughout the world automatically confer 6 | exclusive Copyright and Related Rights (defined below) upon the creator and 7 | subsequent owner(s) (each and all, an "owner") of an original work of 8 | authorship and/or a database (each, a "Work"). 9 | 10 | Certain owners wish to permanently relinquish those rights to a Work for the 11 | purpose of contributing to a commons of creative, cultural and scientific 12 | works ("Commons") that the public can reliably and without fear of later 13 | claims of infringement build upon, modify, incorporate in other works, reuse 14 | and redistribute as freely as possible in any form whatsoever and for any 15 | purposes, including without limitation commercial purposes. These owners may 16 | contribute to the Commons to promote the ideal of a free culture and the 17 | further production of creative, cultural and scientific works, or to gain 18 | reputation or greater distribution for their Work in part through the use and 19 | efforts of others. 20 | 21 | For these and/or other purposes and motivations, and without any expectation 22 | of additional consideration or compensation, the person associating CC0 with a 23 | Work (the "Affirmer"), to the extent that he or she is an owner of Copyright 24 | and Related Rights in the Work, voluntarily elects to apply CC0 to the Work 25 | and publicly distribute the Work under its terms, with knowledge of his or her 26 | Copyright and Related Rights in the Work and the meaning and intended legal 27 | effect of CC0 on those rights. 28 | 29 | 1. Copyright and Related Rights. 
A Work made available under CC0 may be 30 | protected by copyright and related or neighboring rights ("Copyright and 31 | Related Rights"). Copyright and Related Rights include, but are not limited 32 | to, the following: 33 | 34 | i. the right to reproduce, adapt, distribute, perform, display, communicate, 35 | and translate a Work; 36 | 37 | ii. moral rights retained by the original author(s) and/or performer(s); 38 | 39 | iii. publicity and privacy rights pertaining to a person's image or likeness 40 | depicted in a Work; 41 | 42 | iv. rights protecting against unfair competition in regards to a Work, 43 | subject to the limitations in paragraph 4(a), below; 44 | 45 | v. rights protecting the extraction, dissemination, use and reuse of data in 46 | a Work; 47 | 48 | vi. database rights (such as those arising under Directive 96/9/EC of the 49 | European Parliament and of the Council of 11 March 1996 on the legal 50 | protection of databases, and under any national implementation thereof, 51 | including any amended or successor version of such directive); and 52 | 53 | vii. other similar, equivalent or corresponding rights throughout the world 54 | based on applicable law or treaty, and any national implementations thereof. 55 | 56 | 2. Waiver. To the greatest extent permitted by, but not in contravention of, 57 | applicable law, Affirmer hereby overtly, fully, permanently, irrevocably and 58 | unconditionally waives, abandons, and surrenders all of Affirmer's Copyright 59 | and Related Rights and associated claims and causes of action, whether now 60 | known or unknown (including existing as well as future claims and causes of 61 | action), in the Work (i) in all territories worldwide, (ii) for the maximum 62 | duration provided by applicable law or treaty (including future time 63 | extensions), (iii) in any current or future medium and for any number of 64 | copies, and (iv) for any purpose whatsoever, including without limitation 65 | commercial, advertising or promotional purposes (the "Waiver"). Affirmer makes 66 | the Waiver for the benefit of each member of the public at large and to the 67 | detriment of Affirmer's heirs and successors, fully intending that such Waiver 68 | shall not be subject to revocation, rescission, cancellation, termination, or 69 | any other legal or equitable action to disrupt the quiet enjoyment of the Work 70 | by the public as contemplated by Affirmer's express Statement of Purpose. 71 | 72 | 3. Public License Fallback. Should any part of the Waiver for any reason be 73 | judged legally invalid or ineffective under applicable law, then the Waiver 74 | shall be preserved to the maximum extent permitted taking into account 75 | Affirmer's express Statement of Purpose. In addition, to the extent the Waiver 76 | is so judged Affirmer hereby grants to each affected person a royalty-free, 77 | non transferable, non sublicensable, non exclusive, irrevocable and 78 | unconditional license to exercise Affirmer's Copyright and Related Rights in 79 | the Work (i) in all territories worldwide, (ii) for the maximum duration 80 | provided by applicable law or treaty (including future time extensions), (iii) 81 | in any current or future medium and for any number of copies, and (iv) for any 82 | purpose whatsoever, including without limitation commercial, advertising or 83 | promotional purposes (the "License"). The License shall be deemed effective as 84 | of the date CC0 was applied by Affirmer to the Work. 
Should any part of the 85 | License for any reason be judged legally invalid or ineffective under 86 | applicable law, such partial invalidity or ineffectiveness shall not 87 | invalidate the remainder of the License, and in such case Affirmer hereby 88 | affirms that he or she will not (i) exercise any of his or her remaining 89 | Copyright and Related Rights in the Work or (ii) assert any associated claims 90 | and causes of action with respect to the Work, in either case contrary to 91 | Affirmer's express Statement of Purpose. 92 | 93 | 4. Limitations and Disclaimers. 94 | 95 | a. No trademark or patent rights held by Affirmer are waived, abandoned, 96 | surrendered, licensed or otherwise affected by this document. 97 | 98 | b. Affirmer offers the Work as-is and makes no representations or warranties 99 | of any kind concerning the Work, express, implied, statutory or otherwise, 100 | including without limitation warranties of title, merchantability, fitness 101 | for a particular purpose, non infringement, or the absence of latent or 102 | other defects, accuracy, or the present or absence of errors, whether or not 103 | discoverable, all to the greatest extent permissible under applicable law. 104 | 105 | c. Affirmer disclaims responsibility for clearing rights of other persons 106 | that may apply to the Work or any use thereof, including without limitation 107 | any person's Copyright and Related Rights in the Work. Further, Affirmer 108 | disclaims responsibility for obtaining any necessary consents, permissions 109 | or other rights required for any use of the Work. 110 | 111 | d. Affirmer understands and acknowledges that Creative Commons is not a 112 | party to this document and has no duty or obligation with respect to this 113 | CC0 or use of the Work. 
114 | 115 | For more information, please see 116 | <http://creativecommons.org/publicdomain/zero/1.0/> 117 | -------------------------------------------------------------------------------- /Librosa tutorial.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "metadata": { 6 | "slideshow": { 7 | "slide_type": "slide" 8 | } 9 | }, 10 | "source": [ 11 | "\n", 12 | "# Librosa tutorial\n", 13 | "\n", 14 | "- Version: 0.4.3\n", 15 | "- Tutorial home: https://github.com/librosa/tutorial\n", 16 | "- Librosa home: http://librosa.github.io/\n", 17 | "- User forum: https://groups.google.com/forum/#!forum/librosa" 18 | ] 19 | }, 20 | { 21 | "cell_type": "markdown", 22 | "metadata": { 23 | "slideshow": { 24 | "slide_type": "slide" 25 | } 26 | }, 27 | "source": [ 28 | "## Environments\n", 29 | "\n", 30 | "We assume that you have already installed [Anaconda](https://anaconda.org/).\n", 31 | "\n", 32 | "If you don't have an environment, create one with the following command:\n", 33 | "\n", 34 | "```bash\n", 35 | "conda create --name YOURNAME scipy jupyter ipython\n", 36 | "```\n", 37 | "(Replace `YOURNAME` with whatever you want to call the new environment.)" 38 | ] 39 | }, 40 | { 41 | "cell_type": "markdown", 42 | "metadata": { 43 | "slideshow": { 44 | "slide_type": "subslide" 45 | } 46 | }, 47 | "source": [ 48 | "Then, activate the new environment:\n", 49 | "```bash\n", 50 | "source activate YOURNAME\n", 51 | "```\n" 52 | ] 53 | }, 54 | { 55 | "cell_type": "markdown", 56 | "metadata": { 57 | "slideshow": { 58 | "slide_type": "fragment" 59 | } 60 | }, 61 | "source": [ 62 | "## Installing librosa\n", 63 | "Librosa can then be installed with the following command:\n", 64 | "\n", 65 | "```bash\n", 66 | "conda install -c conda-forge librosa\n", 67 | "```\n", 68 | "\n", 69 | "*NOTE*: Windows users need to install audio decoding libraries separately. We recommend [ffmpeg](http://ffmpeg.org/)."
70 | ] 71 | }, 72 | { 73 | "cell_type": "markdown", 74 | "metadata": { 75 | "collapsed": true, 76 | "slideshow": { 77 | "slide_type": "subslide" 78 | } 79 | }, 80 | "source": [ 81 | "## Test drive" 82 | ] 83 | }, 84 | { 85 | "cell_type": "markdown", 86 | "metadata": { 87 | "slideshow": { 88 | "slide_type": "-" 89 | } 90 | }, 91 | "source": [ 92 | "Start Jupyter:\n", 93 | "```bash\n", 94 | "jupyter notebook\n", 95 | "```\n", 96 | "and open a new notebook.\n", 97 | "\n", 98 | "Then, run the following:" 99 | ] 100 | }, 101 | { 102 | "cell_type": "code", 103 | "execution_count": null, 104 | "metadata": { 105 | "collapsed": false, 106 | "slideshow": { 107 | "slide_type": "-" 108 | } 109 | }, 110 | "outputs": [], 111 | "source": [ 112 | "import librosa\n", 113 | "print(librosa.__version__)" 114 | ] 115 | }, 116 | { 117 | "cell_type": "code", 118 | "execution_count": null, 119 | "metadata": { 120 | "collapsed": false, 121 | "slideshow": { 122 | "slide_type": "fragment" 123 | } 124 | }, 125 | "outputs": [], 126 | "source": [ 127 | "y, sr = librosa.load(librosa.util.example_audio_file())\n", 128 | "print(len(y), sr)" 129 | ] 130 | }, 131 | { 132 | "cell_type": "markdown", 133 | "metadata": { 134 | "slideshow": { 135 | "slide_type": "slide" 136 | } 137 | }, 138 | "source": [ 139 | "# Documentation!\n", 140 | "\n", 141 | "Librosa has extensive documentation with examples.\n", 142 | "\n", 143 | "When in doubt, go to http://librosa.github.io/librosa/" 144 | ] 145 | }, 146 | { 147 | "cell_type": "markdown", 148 | "metadata": { 149 | "slideshow": { 150 | "slide_type": "slide" 151 | } 152 | }, 153 | "source": [ 154 | "# Conventions\n", 155 | "\n", 156 | "- All data are basic `numpy` types\n", 157 | "- **Audio buffers** are called `y`\n", 158 | "- **Sampling rate** is called `sr`\n", 159 | "- The last axis is time-like:\n", 160 | " y[1000] is the 1001st sample\n", 161 | " S[:, 100] is the 101st frame of S\n", 162 | "- **Defaults** `sr=22050`, `hop_length=512`" 163 | ] 164 | }, 165 | { 166 | "cell_type": "markdown", 167 | "metadata": { 168 | "slideshow": { 169 | "slide_type": "slide" 170 | } 171 | }, 172 | "source": [ 173 | "# Roadmap for today\n", 174 | "\n", 175 | "- `librosa.core`\n", 176 | "- `librosa.feature`\n", 177 | "- `librosa.display`\n", 178 | "- `librosa.beat`\n", 179 | "- `librosa.segment`\n", 180 | "- `librosa.decompose`" 181 | ] 182 | }, 183 | { 184 | "cell_type": "markdown", 185 | "metadata": { 186 | "slideshow": { 187 | "slide_type": "subslide" 188 | } 189 | }, 190 | "source": [ 191 | "# `librosa.core`\n", 192 | "\n", 193 | "- Low-level audio processes\n", 194 | "- Unit conversion\n", 195 | "- Time-frequency representations" 196 | ] 197 | }, 198 | { 199 | "cell_type": "markdown", 200 | "metadata": { 201 | "slideshow": { 202 | "slide_type": "subslide" 203 | } 204 | }, 205 | "source": [ 206 | "To load a signal at its native sampling rate, use `sr=None`" 207 | ] 208 | }, 209 | { 210 | "cell_type": "code", 211 | "execution_count": null, 212 | "metadata": { 213 | "collapsed": false, 214 | "slideshow": { 215 | "slide_type": "-" 216 | } 217 | }, 218 | "outputs": [], 219 | "source": [ 220 | "y_orig, sr_orig = librosa.load(librosa.util.example_audio_file(),\n", 221 | " sr=None)\n", 222 | "print(len(y_orig), sr_orig)" 223 | ] 224 | }, 225 | { 226 | "cell_type": "markdown", 227 | "metadata": { 228 | "slideshow": { 229 | "slide_type": "fragment" 230 | } 231 | }, 232 | "source": [ 233 | "Resampling is easy" 234 | ] 235 | }, 236 | { 237 | "cell_type": "code", 238 | "execution_count": null, 239 | 
"metadata": { 240 | "collapsed": false, 241 | "slideshow": { 242 | "slide_type": "-" 243 | } 244 | }, 245 | "outputs": [], 246 | "source": [ 247 | "sr = 22050\n", 248 | "\n", 249 | "y = librosa.resample(y_orig, sr_orig, sr)\n", 250 | "\n", 251 | "print(len(y), sr)" 252 | ] 253 | }, 254 | { 255 | "cell_type": "markdown", 256 | "metadata": { 257 | "slideshow": { 258 | "slide_type": "fragment" 259 | } 260 | }, 261 | "source": [ 262 | "But what's that in seconds?" 263 | ] 264 | }, 265 | { 266 | "cell_type": "code", 267 | "execution_count": null, 268 | "metadata": { 269 | "collapsed": false, 270 | "slideshow": { 271 | "slide_type": "-" 272 | } 273 | }, 274 | "outputs": [], 275 | "source": [ 276 | "print(librosa.samples_to_time(len(y), sr))" 277 | ] 278 | }, 279 | { 280 | "cell_type": "markdown", 281 | "metadata": { 282 | "slideshow": { 283 | "slide_type": "subslide" 284 | } 285 | }, 286 | "source": [ 287 | "## Spectral representations\n", 288 | "\n", 289 | "Short-time Fourier transform underlies most analysis.\n", 290 | "\n", 291 | "`librosa.stft` returns a complex matrix `D`.\n", 292 | "\n", 293 | "`D[f, t]` is the FFT value at frequency `f`, time (frame) `t`." 294 | ] 295 | }, 296 | { 297 | "cell_type": "code", 298 | "execution_count": null, 299 | "metadata": { 300 | "collapsed": false, 301 | "slideshow": { 302 | "slide_type": "fragment" 303 | } 304 | }, 305 | "outputs": [], 306 | "source": [ 307 | "D = librosa.stft(y)\n", 308 | "print(D.shape, D.dtype)" 309 | ] 310 | }, 311 | { 312 | "cell_type": "markdown", 313 | "metadata": { 314 | "slideshow": { 315 | "slide_type": "subslide" 316 | } 317 | }, 318 | "source": [ 319 | "Often, we only care about the magnitude.\n", 320 | "\n", 321 | "`D` contains both *magnitude* `S` and *phase* $\\phi$.\n", 322 | "\n", 323 | "$$\n", 324 | "D_{ft} = S_{ft} \\exp\\left(j \\phi_{ft}\\right)\n", 325 | "$$" 326 | ] 327 | }, 328 | { 329 | "cell_type": "code", 330 | "execution_count": null, 331 | "metadata": { 332 | "collapsed": true 333 | }, 334 | "outputs": [], 335 | "source": [ 336 | "import numpy as np" 337 | ] 338 | }, 339 | { 340 | "cell_type": "code", 341 | "execution_count": null, 342 | "metadata": { 343 | "collapsed": false 344 | }, 345 | "outputs": [], 346 | "source": [ 347 | "S, phase = librosa.magphase(D)\n", 348 | "print(S.dtype, phase.dtype, np.allclose(D, S * phase))" 349 | ] 350 | }, 351 | { 352 | "cell_type": "markdown", 353 | "metadata": { 354 | "slideshow": { 355 | "slide_type": "subslide" 356 | } 357 | }, 358 | "source": [ 359 | "## Constant-Q transforms\n", 360 | "\n", 361 | "The CQT gives a logarithmically spaced frequency basis.\n", 362 | "\n", 363 | "This representation is more natural for many analysis tasks." 
364 | ] 365 | }, 366 | { 367 | "cell_type": "code", 368 | "execution_count": null, 369 | "metadata": { 370 | "collapsed": false 371 | }, 372 | "outputs": [], 373 | "source": [ 374 | "C = librosa.cqt(y, sr=sr)\n", 375 | "\n", 376 | "print(C.shape, C.dtype)" 377 | ] 378 | }, 379 | { 380 | "cell_type": "markdown", 381 | "metadata": { 382 | "slideshow": { 383 | "slide_type": "subslide" 384 | } 385 | }, 386 | "source": [ 387 | "## Exercise 0\n", 388 | "\n", 389 | "- Load a different audio file\n", 390 | "- Compute its STFT with a different hop length" 391 | ] 392 | }, 393 | { 394 | "cell_type": "code", 395 | "execution_count": null, 396 | "metadata": { 397 | "collapsed": true, 398 | "slideshow": { 399 | "slide_type": "subslide" 400 | } 401 | }, 402 | "outputs": [], 403 | "source": [ 404 | "# Exercise 0 solution\n", 405 | "\n", 406 | "y2, sr2 = librosa.load( )\n", 407 | "\n", 408 | "D = librosa.stft(y2, hop_length= )" 409 | ] 410 | }, 411 | { 412 | "cell_type": "markdown", 413 | "metadata": { 414 | "slideshow": { 415 | "slide_type": "slide" 416 | } 417 | }, 418 | "source": [ 419 | "# `librosa.feature`\n", 420 | "\n", 421 | "- Standard features:\n", 422 | " - `librosa.feature.melspectrogram`\n", 423 | " - `librosa.feature.mfcc`\n", 424 | " - `librosa.feature.chroma`\n", 425 | " - Lots more...\n", 426 | "- Feature manipulation:\n", 427 | " - `librosa.feature.stack_memory`\n", 428 | " - `librosa.feature.delta`" 429 | ] 430 | }, 431 | { 432 | "cell_type": "markdown", 433 | "metadata": { 434 | "slideshow": { 435 | "slide_type": "subslide" 436 | } 437 | }, 438 | "source": [ 439 | "Most features work either with audio or STFT input" 440 | ] 441 | }, 442 | { 443 | "cell_type": "code", 444 | "execution_count": null, 445 | "metadata": { 446 | "collapsed": false 447 | }, 448 | "outputs": [], 449 | "source": [ 450 | "melspec = librosa.feature.melspectrogram(y=y, sr=sr)\n", 451 | "\n", 452 | "# Melspec assumes power, not energy as input\n", 453 | "melspec_stft = librosa.feature.melspectrogram(S=S**2, sr=sr)\n", 454 | "\n", 455 | "print(np.allclose(melspec, melspec_stft))" 456 | ] 457 | }, 458 | { 459 | "cell_type": "markdown", 460 | "metadata": { 461 | "slideshow": { 462 | "slide_type": "slide" 463 | } 464 | }, 465 | "source": [ 466 | "# `librosa.display`\n", 467 | "\n", 468 | "- Plotting routines for spectra and waveforms\n", 469 | "\n", 470 | "- **Note**: major overhaul coming in 0.5" 471 | ] 472 | }, 473 | { 474 | "cell_type": "code", 475 | "execution_count": null, 476 | "metadata": { 477 | "collapsed": true, 478 | "slideshow": { 479 | "slide_type": "subslide" 480 | } 481 | }, 482 | "outputs": [], 483 | "source": [ 484 | "# Displays are built with matplotlib \n", 485 | "import matplotlib.pyplot as plt\n", 486 | "\n", 487 | "# Let's make plots pretty\n", 488 | "import matplotlib.style as ms\n", 489 | "ms.use('seaborn-muted')\n", 490 | "\n", 491 | "# Render figures interactively in the notebook\n", 492 | "%matplotlib nbagg\n", 493 | "\n", 494 | "# IPython gives us an audio widget for playback\n", 495 | "from IPython.display import Audio" 496 | ] 497 | }, 498 | { 499 | "cell_type": "markdown", 500 | "metadata": { 501 | "slideshow": { 502 | "slide_type": "subslide" 503 | } 504 | }, 505 | "source": [ 506 | "## Waveform display" 507 | ] 508 | }, 509 | { 510 | "cell_type": "code", 511 | "execution_count": null, 512 | "metadata": { 513 | "collapsed": false, 514 | "slideshow": { 515 | "slide_type": "-" 516 | } 517 | }, 518 | "outputs": [], 519 | "source": [ 520 | "plt.figure()\n", 521 | 
"librosa.display.waveplot(y=y, sr=sr)" 522 | ] 523 | }, 524 | { 525 | "cell_type": "markdown", 526 | "metadata": { 527 | "slideshow": { 528 | "slide_type": "subslide" 529 | } 530 | }, 531 | "source": [ 532 | "## A basic spectrogram display" 533 | ] 534 | }, 535 | { 536 | "cell_type": "code", 537 | "execution_count": null, 538 | "metadata": { 539 | "collapsed": false 540 | }, 541 | "outputs": [], 542 | "source": [ 543 | "plt.figure()\n", 544 | "librosa.display.specshow(melspec, y_axis='mel', x_axis='time')\n", 545 | "plt.colorbar()" 546 | ] 547 | }, 548 | { 549 | "cell_type": "markdown", 550 | "metadata": { 551 | "slideshow": { 552 | "slide_type": "subslide" 553 | } 554 | }, 555 | "source": [ 556 | "## Exercise 1\n", 557 | "\n", 558 | "* Pick a feature extractor from the `librosa.feature` submodule and plot the output with `librosa.display.specshow`\n", 559 | "\n", 560 | "\n", 561 | "* **Bonus**: Customize the plot using either `specshow` arguments or `pyplot` functions" 562 | ] 563 | }, 564 | { 565 | "cell_type": "code", 566 | "execution_count": null, 567 | "metadata": { 568 | "collapsed": true, 569 | "slideshow": { 570 | "slide_type": "subslide" 571 | } 572 | }, 573 | "outputs": [], 574 | "source": [ 575 | "# Exercise 1 solution\n", 576 | "\n", 577 | "X = librosa.feature.XX()\n", 578 | "\n", 579 | "plt.figure()\n", 580 | "\n", 581 | "librosa.display.specshow( )" 582 | ] 583 | }, 584 | { 585 | "cell_type": "markdown", 586 | "metadata": { 587 | "slideshow": { 588 | "slide_type": "slide" 589 | } 590 | }, 591 | "source": [ 592 | "# `librosa.beat`\n", 593 | "\n", 594 | "- Beat tracking and tempo estimation" 595 | ] 596 | }, 597 | { 598 | "cell_type": "markdown", 599 | "metadata": { 600 | "slideshow": { 601 | "slide_type": "subslide" 602 | } 603 | }, 604 | "source": [ 605 | "The beat tracker returns the estimated tempo and beat positions (measured in frames)" 606 | ] 607 | }, 608 | { 609 | "cell_type": "code", 610 | "execution_count": null, 611 | "metadata": { 612 | "collapsed": false, 613 | "slideshow": { 614 | "slide_type": "fragment" 615 | } 616 | }, 617 | "outputs": [], 618 | "source": [ 619 | "tempo, beats = librosa.beat.beat_track(y=y, sr=sr)\n", 620 | "print(tempo)\n", 621 | "print(beats)" 622 | ] 623 | }, 624 | { 625 | "cell_type": "markdown", 626 | "metadata": { 627 | "slideshow": { 628 | "slide_type": "fragment" 629 | } 630 | }, 631 | "source": [ 632 | "Let's sonify it!" 
633 | ] 634 | }, 635 | { 636 | "cell_type": "code", 637 | "execution_count": null, 638 | "metadata": { 639 | "collapsed": false, 640 | "slideshow": { 641 | "slide_type": "-" 642 | } 643 | }, 644 | "outputs": [], 645 | "source": [ 646 | "clicks = librosa.clicks(frames=beats, sr=sr, length=len(y))\n", 647 | "\n", 648 | "Audio(data=y + clicks, rate=sr)" 649 | ] 650 | }, 651 | { 652 | "cell_type": "markdown", 653 | "metadata": { 654 | "slideshow": { 655 | "slide_type": "subslide" 656 | } 657 | }, 658 | "source": [ 659 | "Beats can be used to downsample features" 660 | ] 661 | }, 662 | { 663 | "cell_type": "code", 664 | "execution_count": null, 665 | "metadata": { 666 | "collapsed": false 667 | }, 668 | "outputs": [], 669 | "source": [ 670 | "chroma = librosa.feature.chroma_cqt(y=y, sr=sr)\n", 671 | "chroma_sync = librosa.feature.sync(chroma, beats)" 672 | ] 673 | }, 674 | { 675 | "cell_type": "code", 676 | "execution_count": null, 677 | "metadata": { 678 | "collapsed": false, 679 | "slideshow": { 680 | "slide_type": "fragment" 681 | } 682 | }, 683 | "outputs": [], 684 | "source": [ 685 | "plt.figure(figsize=(6, 3))\n", 686 | "plt.subplot(2, 1, 1)\n", 687 | "librosa.display.specshow(chroma, y_axis='chroma')\n", 688 | "plt.ylabel('Full resolution')\n", 689 | "plt.subplot(2, 1, 2)\n", 690 | "librosa.display.specshow(chroma_sync, y_axis='chroma')\n", 691 | "plt.ylabel('Beat sync')" 692 | ] 693 | }, 694 | { 695 | "cell_type": "markdown", 696 | "metadata": { 697 | "slideshow": { 698 | "slide_type": "slide" 699 | } 700 | }, 701 | "source": [ 702 | "# `librosa.segment`\n", 703 | "\n", 704 | "- Self-similarity / recurrence\n", 705 | "- Segmentation" 706 | ] 707 | }, 708 | { 709 | "cell_type": "markdown", 710 | "metadata": { 711 | "slideshow": { 712 | "slide_type": "subslide" 713 | } 714 | }, 715 | "source": [ 716 | "Recurrence matrices encode self-similarity\n", 717 | "\n", 718 | " R[i, j] = similarity between frames (i, j)\n", 719 | " \n", 720 | "Librosa computes recurrence between `k`-nearest neighbors." 721 | ] 722 | }, 723 | { 724 | "cell_type": "code", 725 | "execution_count": null, 726 | "metadata": { 727 | "collapsed": true, 728 | "slideshow": { 729 | "slide_type": "-" 730 | } 731 | }, 732 | "outputs": [], 733 | "source": [ 734 | "R = librosa.segment.recurrence_matrix(chroma_sync)" 735 | ] 736 | }, 737 | { 738 | "cell_type": "code", 739 | "execution_count": null, 740 | "metadata": { 741 | "collapsed": false, 742 | "slideshow": { 743 | "slide_type": "fragment" 744 | } 745 | }, 746 | "outputs": [], 747 | "source": [ 748 | "plt.figure(figsize=(4, 4))\n", 749 | "librosa.display.specshow(R)" 750 | ] 751 | }, 752 | { 753 | "cell_type": "markdown", 754 | "metadata": { 755 | "slideshow": { 756 | "slide_type": "subslide" 757 | } 758 | }, 759 | "source": [ 760 | "We can include affinity weights for each link as well." 
761 | ] 762 | }, 763 | { 764 | "cell_type": "code", 765 | "execution_count": null, 766 | "metadata": { 767 | "collapsed": true, 768 | "slideshow": { 769 | "slide_type": "-" 770 | } 771 | }, 772 | "outputs": [], 773 | "source": [ 774 | "R2 = librosa.segment.recurrence_matrix(chroma_sync,\n", 775 | " mode='affinity',\n", 776 | " sym=True)" 777 | ] 778 | }, 779 | { 780 | "cell_type": "code", 781 | "execution_count": null, 782 | "metadata": { 783 | "collapsed": false, 784 | "slideshow": { 785 | "slide_type": "fragment" 786 | } 787 | }, 788 | "outputs": [], 789 | "source": [ 790 | "plt.figure(figsize=(5, 4))\n", 791 | "librosa.display.specshow(R2)\n", 792 | "plt.colorbar()" 793 | ] 794 | }, 795 | { 796 | "cell_type": "markdown", 797 | "metadata": { 798 | "slideshow": { 799 | "slide_type": "subslide" 800 | } 801 | }, 802 | "source": [ 803 | "## Exercise 2\n", 804 | "\n", 805 | "* Plot a recurrence matrix using different features\n", 806 | "* **Bonus**: Use a custom distance metric" 807 | ] 808 | }, 809 | { 810 | "cell_type": "code", 811 | "execution_count": null, 812 | "metadata": { 813 | "collapsed": true, 814 | "slideshow": { 815 | "slide_type": "subslide" 816 | } 817 | }, 818 | "outputs": [], 819 | "source": [ 820 | "# Exercise 2 solution" 821 | ] 822 | }, 823 | { 824 | "cell_type": "markdown", 825 | "metadata": { 826 | "slideshow": { 827 | "slide_type": "slide" 828 | } 829 | }, 830 | "source": [ 831 | "# `librosa.decompose`\n", 832 | "\n", 833 | "- `hpss`: Harmonic-percussive source separation\n", 834 | "- `nn_filter`: Nearest-neighbor filtering, non-local means, Repet-SIM\n", 835 | "- `decompose`: NMF, PCA and friends" 836 | ] 837 | }, 838 | { 839 | "cell_type": "markdown", 840 | "metadata": { 841 | "slideshow": { 842 | "slide_type": "subslide" 843 | } 844 | }, 845 | "source": [ 846 | "Separating harmonics from percussives is easy" 847 | ] 848 | }, 849 | { 850 | "cell_type": "code", 851 | "execution_count": null, 852 | "metadata": { 853 | "collapsed": false 854 | }, 855 | "outputs": [], 856 | "source": [ 857 | "D_harm, D_perc = librosa.decompose.hpss(D)\n", 858 | "\n", 859 | "y_harm = librosa.istft(D_harm)\n", 860 | "\n", 861 | "y_perc = librosa.istft(D_perc)" 862 | ] 863 | }, 864 | { 865 | "cell_type": "code", 866 | "execution_count": null, 867 | "metadata": { 868 | "collapsed": false, 869 | "slideshow": { 870 | "slide_type": "fragment" 871 | } 872 | }, 873 | "outputs": [], 874 | "source": [ 875 | "Audio(data=y_harm, rate=sr)" 876 | ] 877 | }, 878 | { 879 | "cell_type": "code", 880 | "execution_count": null, 881 | "metadata": { 882 | "collapsed": false 883 | }, 884 | "outputs": [], 885 | "source": [ 886 | "Audio(data=y_perc, rate=sr)" 887 | ] 888 | }, 889 | { 890 | "cell_type": "markdown", 891 | "metadata": { 892 | "slideshow": { 893 | "slide_type": "subslide" 894 | } 895 | }, 896 | "source": [ 897 | "NMF is pretty easy also!" 
898 | ] 899 | }, 900 | { 901 | "cell_type": "code", 902 | "execution_count": null, 903 | "metadata": { 904 | "collapsed": true 905 | }, 906 | "outputs": [], 907 | "source": [ 908 | "# Fit the model\n", 909 | "W, H = librosa.decompose.decompose(S, n_components=16, sort=True)" 910 | ] 911 | }, 912 | { 913 | "cell_type": "code", 914 | "execution_count": null, 915 | "metadata": { 916 | "collapsed": false 917 | }, 918 | "outputs": [], 919 | "source": [ 920 | "plt.figure(figsize=(6, 3))\n", 921 | "plt.subplot(1, 2, 1), plt.title('W')\n", 922 | "librosa.display.specshow(librosa.logamplitude(W**2), y_axis='log')\n", 923 | "plt.subplot(1, 2, 2), plt.title('H')\n", 924 | "librosa.display.specshow(H, x_axis='time')" 925 | ] 926 | }, 927 | { 928 | "cell_type": "code", 929 | "execution_count": null, 930 | "metadata": { 931 | "collapsed": true, 932 | "slideshow": { 933 | "slide_type": "subslide" 934 | } 935 | }, 936 | "outputs": [], 937 | "source": [ 938 | "# Reconstruct the signal using only the first component\n", 939 | "S_rec = W[:, :1].dot(H[:1, :])\n", 940 | "\n", 941 | "y_rec = librosa.istft(S_rec * phase)" 942 | ] 943 | }, 944 | { 945 | "cell_type": "code", 946 | "execution_count": null, 947 | "metadata": { 948 | "collapsed": false, 949 | "slideshow": { 950 | "slide_type": "-" 951 | } 952 | }, 953 | "outputs": [], 954 | "source": [ 955 | "Audio(data=y_rec, rate=sr)" 956 | ] 957 | }, 958 | { 959 | "cell_type": "markdown", 960 | "metadata": { 961 | "slideshow": { 962 | "slide_type": "subslide" 963 | } 964 | }, 965 | "source": [ 966 | "## Exercise 3\n", 967 | "\n", 968 | "- Compute a chromagram using only the harmonic component\n", 969 | "- **Bonus**: run the beat tracker using only the percussive component" 970 | ] 971 | }, 972 | { 973 | "cell_type": "markdown", 974 | "metadata": { 975 | "slideshow": { 976 | "slide_type": "slide" 977 | } 978 | }, 979 | "source": [ 980 | "# Wrapping up\n", 981 | "\n", 982 | "- This was just a brief intro, but there's lots more!\n", 983 | "\n", 984 | "- Read the docs: http://librosa.github.io/librosa/\n", 985 | "- And the example gallery: http://librosa.github.io/librosa_gallery/\n", 986 | "- We'll be sprinting all day. Get involved! https://github.com/librosa/librosa/issues/395" 987 | ] 988 | } 989 | ], 990 | "metadata": { 991 | "celltoolbar": "Slideshow", 992 | "kernelspec": { 993 | "display_name": "Python 3.5 (clean)", 994 | "language": "python", 995 | "name": "clean3.5" 996 | }, 997 | "language_info": { 998 | "codemirror_mode": { 999 | "name": "ipython", 1000 | "version": 3 1001 | }, 1002 | "file_extension": ".py", 1003 | "mimetype": "text/x-python", 1004 | "name": "python", 1005 | "nbconvert_exporter": "python", 1006 | "pygments_lexer": "ipython3", 1007 | "version": "3.5.2" 1008 | } 1009 | }, 1010 | "nbformat": 4, 1011 | "nbformat_minor": 0 1012 | } 1013 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Librosa tutorial 2 | 3 | In this tutorial, my goal is to get you set up to use librosa for audio and music analysis. 4 | This tutorial will be interactive, and it will be best if you follow along on your own machine. 5 | Feel free to bring along some of your own music to analyze! 6 | 7 | We'll be using [Jupyter](http://jupyter.org/) notebooks and the [Anaconda](https://www.continuum.io/downloads) 8 | Python environment with Python version 3.5. 
It will be best if you follow the instructions below before 9 | attending the tutorial, but installation disks will also be provided in case anything goes wrong or you require 10 | assistance. 11 | 12 | --- 13 | 14 | # Installing the dependencies 15 | 16 | Before getting started with librosa, it's important to have a working environment with all dependencies 17 | satisfied. For this, we recommend using the [Anaconda](https://www.continuum.io/downloads) distribution of 18 | Python 3.5. (Older versions of Python are supported as well.) 19 | 20 | Once your Anaconda environment is installed and activated, you can install librosa through `conda-forge`: 21 | 22 | ``` 23 | conda install -c conda-forge librosa 24 | ``` 25 | 26 | ## Audio codecs 27 | 28 | Librosa requires a few additional packages to load audio data encoded in various formats (e.g., `mp3`, `ogg`, 29 | `flac`, `m4a`). The two main libraries used by librosa are [ffmpeg](https://ffmpeg.org/) and 30 | [gstreamer](https://gstreamer.freedesktop.org/). Either one will work, but at least one of the two 31 | must be installed. 32 | Audio codec libraries are packaged differently on different platforms. 33 | 34 | * On Linux, `ffmpeg` is available through `conda-forge` and is installed automatically when `librosa` is 35 | installed through `conda-forge`. GStreamer can be installed through your Linux distribution's package manager 36 | (e.g., `apt-get install libgstreamer1.0-0` on Debian/Ubuntu). 37 | 38 | * On OSX, `ffmpeg` is available through `conda-forge` and is installed automatically when `librosa` is 39 | installed through `conda-forge`. GStreamer can be installed by `brew install gstreamer` or by downloading 40 | directly from the [gstreamer downloads](https://gstreamer.freedesktop.org/download/) page. 41 | 42 | * On Windows, `ffmpeg` must be installed separately from the [ffmpeg downloads](http://ffmpeg.org/download.html) page. 43 | 44 | 45 | For all operating systems, if you're using `gstreamer`, you will need to install `PyGObject` through `pip`: 46 | 47 | ``` 48 | pip install PyGObject 49 | ``` 50 | 51 | ## Jupyter 52 | 53 | The tutorial materials in this repository are provided in the form of [Jupyter](http://jupyter.org/) 54 | notebooks. Jupyter can be installed with the following command: 55 | ``` 56 | conda install jupyter 57 | ``` 58 | --------------------------------------------------------------------------------
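Before attending, you can confirm that the whole setup works by running the notebook's own test-drive snippet from a Python prompt (or a fresh notebook). This is the same code used at the start of the tutorial notebook, assuming librosa 0.4.x, which ships an example clip via `librosa.util.example_audio_file()`:

```python
import librosa

# Report the installed version (the tutorial targets 0.4.3)
print(librosa.__version__)

# Load the bundled example clip; this also exercises the audio decoding backend
y, sr = librosa.load(librosa.util.example_audio_file())

# Loaded at the default sampling rate of 22050 Hz
print(len(y), sr)
```

If both print statements run without errors, librosa and at least one audio codec backend are installed correctly.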