├── .gitignore ├── CODE_OF_CONDUCT.md ├── CONTRIBUTING.md ├── LICENSE ├── README.md ├── acronyms.md ├── advice_review.md ├── architectures.md ├── authors.md ├── client.py ├── datasets.md ├── dl4m.bib ├── dl4m.py ├── dl4m.tsv ├── download.py ├── fig │   ├── articles_per_year.png │   ├── logo.png │   ├── pie_chart_architecture.png │   ├── pie_chart_dataset.png │   ├── pie_chart_framework.png │   └── pie_chart_task.png ├── frameworks.md ├── publication_type.md ├── reproducibility.md ├── sources.md └── tasks.md /.gitignore: -------------------------------------------------------------------------------- 1 | paste_in_ReadMe.md 2 | todo.txt 3 | encours.txt 4 | 5 | __pycache__/ 6 | -------------------------------------------------------------------------------- /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | In the interest of fostering an open and welcoming environment, we as contributors and maintainers pledge to making participation in our project and our community a harassment-free experience for everyone, regardless of age, body size, disability, ethnicity, gender identity and expression, level of experience, nationality, personal appearance, race, religion, or sexual identity and orientation. 6 | 7 | ## Our Standards 8 | 9 | Examples of behavior that contributes to creating a positive environment include: 10 | 11 | * Using welcoming and inclusive language 12 | * Being respectful of differing viewpoints and experiences 13 | * Gracefully accepting constructive criticism 14 | * Focusing on what is best for the community 15 | * Showing empathy towards other community members 16 | 17 | Examples of unacceptable behavior by participants include: 18 | 19 | * The use of sexualized language or imagery and unwelcome sexual attention or advances 20 | * Trolling, insulting/derogatory comments, and personal or political attacks 21 | * Public or private harassment 22 | * Publishing others' private information, such as a physical or electronic address, without explicit permission 23 | * Other conduct which could reasonably be considered inappropriate in a professional setting 24 | 25 | ## Our Responsibilities 26 | 27 | Project maintainers are responsible for clarifying the standards of acceptable behavior and are expected to take appropriate and fair corrective action in response to any instances of unacceptable behavior. 28 | 29 | Project maintainers have the right and responsibility to remove, edit, or reject comments, commits, code, wiki edits, issues, and other contributions that are not aligned to this Code of Conduct, or to ban temporarily or permanently any contributor for other behaviors that they deem inappropriate, threatening, offensive, or harmful. 30 | 31 | ## Scope 32 | 33 | This Code of Conduct applies both within project spaces and in public spaces when an individual is representing the project or its community. Examples of representing a project or community include using an official project e-mail address, posting via an official social media account, or acting as an appointed representative at an online or offline event. Representation of a project may be further defined and clarified by project maintainers. 34 | 35 | ## Enforcement 36 | 37 | Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting the project team at bayle.yann@live.fr.
The project team will review and investigate all complaints, and will respond in a way that it deems appropriate to the circumstances. The project team is obligated to maintain confidentiality with regard to the reporter of an incident. Further details of specific enforcement policies may be posted separately. 38 | 39 | Project maintainers who do not follow or enforce the Code of Conduct in good faith may face temporary or permanent repercussions as determined by other members of the project's leadership. 40 | 41 | ## Attribution 42 | 43 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, available at [http://contributor-covenant.org/version/1/4][version] 44 | 45 | [homepage]: http://contributor-covenant.org 46 | [version]: http://contributor-covenant.org/version/1/4/ 47 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # How To Contribute 2 | 3 | Contributions are welcome! 4 | Please submit your issues, pull requests, improvements and comments in British English. 5 | You can look at the issues and help solve them, or you can add missing articles as described below. 6 | 7 | ### Adding an article 8 | 9 | Here are the steps to follow for adding one (or more) articles: 10 | 1. Check that the article is not already in the [dl4m.bib](dl4m.bib) file. 11 | 2. Fork the repo. 12 | 3. Add the desired bib entry at the beginning of [dl4m.bib](dl4m.bib). Take care to fill in all of the following fields for each bib entry (if there is no information for a field, please indicate it as `fieldName = {No}`): 13 | - Bib entry type (inproceedings, article, techreport, unpublished,...) (N.B.: Indicate arXiv articles as `@unpublished` and, if you know of a possible submission, use `note = {Submitted to "NameOfJournalOrConference"}`) 14 | - Bib key (in the form AuthorlastnameYear, e.g. `Snow1999`) 15 | - title (N.B.: Make sure the title is not written in all caps and that each word does not start with a capital letter.) 16 | - author (N.B.: Check that all authors in the pdf files are present in the bib author field and in the same order. Automatic bib tools tend to mess up the order or omit some authors.) 17 | - year 18 | - booktitle or journal 19 | - dataset (e.g. `dataset = {Inhouse & [Jamendo](http://www.mathieuramona.com/wp/data/jamendo/) & [RWC](https://staff.aist.go.jp/m.goto/RWC-MDB/)},`) 20 | 1. provide the link to the dataset 21 | 2. if multiple datasets are used, insert a ` & ` between each dataset 22 | - architecture (if multiple architectures are used, insert a ` & ` between each of them, e.g. `archi = {CNN & VPNN},`) 23 | - link (HTML link to the pdf file) 24 | - task (if multiple tasks are performed, insert a ` & ` between each of them, e.g. `task = {SVS & SVD},`).
Please refer to the acronyms listed in [acronyms.md](acronyms.md) 25 | - dataaugmentation (if used, the type of data augmentation technique, otherwise `No`) 26 | - pages (if available) 27 | - code (HTML link to the code if available, otherwise `No`) 28 | - learningrate (if given in the paper or the code, the learning rate, otherwise `No`) 29 | - framework (if given in the paper or the code, the framework, otherwise `No`) 30 | - reproducible (the reproducibility details of the paper and code) 31 | - activation (if given in the paper or the code, the activation function, otherwise `No`) 32 | - epochs (if given in the paper or the code, the number of epochs, otherwise `No`) 33 | - batch (if given in the paper or the code, the batch size, otherwise `No`) 34 | - loss (if given in the paper or the code, the loss function, otherwise `No`) 35 | - layers (if given in the paper or the code, the number of layers, otherwise `No`) 36 | - dropout (if given in the paper or the code, the dropout rate, otherwise `No`) 37 | - momentum (if given in the paper or the code, the momentum, otherwise `No`) 38 | - gpu (if given in the paper or the code, the type and number of GPUs, otherwise `No`) 39 | - metric (if given in the paper or the code, the metric, otherwise `No`) 40 | - computationtime (if given in the paper or the code, the total computation time and the time per epoch, otherwise `No`) 41 | - dimension (if given in the paper or the code, the number of dimensions, otherwise `No`) 42 | - optimizer (if given in the paper or the code, the optimizer, otherwise `No`) 43 | - input (if given in the paper or the code, the input type, otherwise `No`) 44 | - month (for conference papers only) 45 | - address (for conference papers only) 46 | - note (optional additional custom notes, e.g. if you want to share an interesting detail you read or give your opinion) 47 | 48 | For ease of use, you can copy, paste and fill in the following template: 49 | ``` 50 | @inproceedings{Snow1999, 51 | activation = {}, 52 | address = {}, 53 | architecture = {}, 54 | author = {}, 55 | batch = {}, 56 | booktitle = {}, 57 | code = {}, 58 | computationtime = {}, 59 | dataaugmentation = {}, 60 | dataset = {}, 61 | dimension = {}, 62 | dropout = {}, 63 | epochs = {}, 64 | framework = {}, 65 | gpu = {}, 66 | input = {}, 67 | layers = {}, 68 | learningrate = {}, 69 | link = {}, 70 | loss = {}, 71 | metric = {}, 72 | momentum = {}, 73 | month = {}, 74 | note = {}, 75 | optimizer = {}, 76 | pages = {}, 77 | reproducible = {}, 78 | task = {}, 79 | title = {}, 80 | year = {}, 81 | } 82 | ``` 83 | 4. Check that you have installed these Python packages (used by `dl4m.py`; a minimal validation sketch is shown at the end of this file): 84 | 1. numpy 85 | 2. matplotlib 86 | 3. bibtexparser 87 | 5. Launch the Python script `python dl4m.py`. 88 | 6. Submit your pull request! 89 | 90 | ### Missing or incorrect field for an article 91 | 92 | Thanks for spotting it! You can: 93 | 1. Submit an issue or 94 | 2. Make a pull request: 95 | 1. Fork the repo. 96 | 2. Add or update the corresponding field in [dl4m.bib](dl4m.bib). 97 | 3. Launch the Python script `python dl4m.py`. 98 | 4. Submit your pull request with this title: `[Update][BibKey] field added or updated`, e.g. `[Update][Snow1999] added task`. 99 | 100 | ### Note 101 | 102 | I am looking for a way to automatically display relations between articles, like a mind map. Let me know if you know of a tool that can handle that.
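### Example: checking bib entries with bibtexparser

This is the validation sketch referenced in step 4 above. It is a minimal illustration of how the duplicate check (step 1) and the field check (step 3) might be automated with `bibtexparser`, not the actual code of `dl4m.py`; the `REQUIRED_FIELDS` subset and the `NewKey2018` key are hypothetical placeholders.

```
# Hypothetical helper, not part of dl4m.py: check that a new bib key is unused
# and report entries with empty required fields.
import bibtexparser

# Illustrative subset of the fields described above.
REQUIRED_FIELDS = ["author", "title", "year", "link", "task",
                   "dataset", "architecture", "code", "reproducible"]

with open("dl4m.bib") as bib_file:
    database = bibtexparser.load(bib_file)  # parses the file into a BibDatabase

# Step 1: make sure the new bib key is not already used.
existing_keys = {entry["ID"] for entry in database.entries}
assert "NewKey2018" not in existing_keys, "Bib key already present in dl4m.bib"

# Step 3: list the required fields that were left empty in any entry.
for entry in database.entries:
    missing = [field for field in REQUIRED_FIELDS if not entry.get(field)]
    if missing:
        print(entry["ID"], "is missing:", ", ".join(missing))
```

Remember that a field with no information should still be filled with `fieldName = {No}` rather than left empty.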
103 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2017 Yann Bayle 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | # Deep Learning for Music (DL4M) [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome) 4 | 5 | By [Yann Bayle](http://yannbayle.fr/english/index.php) ([Website](http://yannbayle.fr/english/index.php), [GitHub](https://github.com/ybayle)) from LaBRI ([Website](http://www.labri.fr/), [Twitter](https://twitter.com/labriOfficial/)), Univ. Bordeaux ([Website](https://www.u-bordeaux.fr/), [Twitter](https://twitter.com/univbordeaux)), CNRS ([Website](http://www.cnrs.fr/), [Twitter](https://twitter.com/CNRS)) and SCRIME ([Website](https://scrime.u-bordeaux.fr/)). 6 | 7 | **TL;DR** Non-exhaustive list of scientific articles on deep learning for music: [summary](#dl4m-summary) (article title, PDF link and code), [details](dl4m.tsv) (table - more info), [details](dl4m.bib) (bib - all info) 8 | 9 | The role of this curated list is to gather scientific articles, theses and reports that use deep learning approaches applied to music. 10 | The list is currently under construction, but feel free to contribute to the missing fields and to add other resources! To do so, please refer to the [How To Contribute](#how-to-contribute) section. 11 | The resources provided here come from my review of the state of the art for my PhD thesis, for which an article is being written. 12 | There are already surveys on deep learning for [music generation](https://arxiv.org/pdf/1709.01620.pdf), [speech separation](https://arxiv.org/ftp/arxiv/papers/1708/1708.07524.pdf) and [speaker identification](https://www.researchgate.net/profile/Seyed_Reza_Shahamiri/publication/319158024_Speaker_Identification_Features_Extraction_Methods_A_Systematic_Review/links/599e2816aca272dff12fdef1/Speaker-Identification-Features-Extraction-Methods-A-Systematic-Review.pdf). 13 | However, these surveys do not cover the music information retrieval tasks that are included in this repository.
14 | 15 | ## Table of contents 16 | 17 | - [DL4M summary](#dl4m-summary) 18 | - [DL4M details](#dl4m-details) 19 | - [Code without articles](#code-without-articles) 20 | - [Statistics and visualisations](#statistics-and-visualisations) 21 | - [Advice for reviewers of dl4m articles](#advice-for-reviewers-of-dl4m-articles) 22 | - [How To Contribute](#how-to-contribute) 23 | - [FAQ](#faq) 24 | - [Acronyms used](#acronyms-used) 25 | - [Sources](#sources) 26 | - [Contributors](#contributors) 27 | - [Other useful related lists](#other-useful-related-lists-and-resources) 28 | - [Cited by](#cited-by) 29 | - [License](#license) 30 | 31 | ## DL4M summary 32 | 33 | | Year | Articles, Theses and Reports | Code | 34 | |------|-------------------------------|------| 35 | | 1988 | Neural net modeling of music | No | 36 | | 1988 | [Creation by refinement: A creativity paradigm for gradient descent learning networks](http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=23933) | No | 37 | | 1988 | A sequential network design for musical applications | No | 38 | | 1989 | [The representation of pitch in a neural net model of chord classification](http://www.jstor.org/stable/3679550) | No | 39 | | 1989 | [Algorithms for music composition by neural nets: Improved CBR paradigms](https://quod.lib.umich.edu/cgi/p/pod/dod-idx/algorithms-for-music-composition.pdf?c=icmc;idno=bbp2372.1989.044;format=pdf) | No | 40 | | 1989 | [A connectionist approach to algorithmic composition](http://www.jstor.org/stable/3679551) | No | 41 | | 1994 | [Neural network music composition by prediction: Exploring the benefits of psychoacoustic constraints and multi-scale processing](http://www-labs.iro.umontreal.ca/~pift6080/H09/documents/papers/mozer-music.pdf) | No | 42 | | 1995 | [Automatic source identification of monophonic musical instrument sounds](https://www.researchgate.net/publication/3622871_Automatic_source_identification_of_monophonic_musical_instrument_sounds) | No | 43 | | 1995 | [Neural network based model for classification of music type](http://ieeexplore.ieee.org/abstract/document/514161/) | No | 44 | | 1997 | [A machine learning approach to musical style recognition](http://repository.cmu.edu/cgi/viewcontent.cgi?article=1496&context=compsci) | No | 45 | | 1998 | [Recognition of music types](https://www.ri.cmu.edu/pub_files/pub1/soltau_hagen_1998_2/soltau_hagen_1998_2.pdf) | No | 46 | | 1999 | [Musical networks: Parallel distributed perception and performance](https://s3.amazonaws.com/academia.edu.documents/3551783/10.1.1.39.6248.pdf?AWSAccessKeyId=AKIAIWOWYYGZ2Y53UL3A&Expires=1507055806&Signature=5mGzQc7bvJgUZYfXOmCX8eeNQOs%3D&response-content-disposition=inline%3B%20filename%3DMusical_networks_Parallel_distributed_pe.pdf) | No | 47 | | 2001 | [Multi-phase learning for jazz improvisation and interaction](http://www.cs.smith.edu/~jfrankli/papers/CtColl01.pdf) | No | 48 | | 2002 | [A supervised learning approach to musical style recognition](https://www.researchgate.net/profile/Giuseppe_Buzzanca/publication/228588086_A_supervised_learning_approach_to_musical_style_recognition/links/54b43ee90cf26833efd0109f.pdf) | No | 49 | | 2002 | [Finding temporal structure in music: Blues improvisation with LSTM recurrent networks](http://www-perso.iro.umontreal.ca/~eckdoug/papers/2002_ieee.pdf) | No | 50 | | 2002 | [Neural networks for note onset detection in piano
music](https://www.researchgate.net/profile/Matija_Marolt/publication/2473938_Neural_Networks_for_Note_Onset_Detection_in_Piano_Music/links/00b49525efccc79fed000000.pdf) | No | 51 | | 2004 | [A convolutional-kernel based approach for note onset detection in piano-solo audio signals](http://www.murase.nuie.nagoya-u.ac.jp/~ide/res/paper/E04-conference-pablo-1.pdf) | No | 52 | | 2009 | [Unsupervised feature learning for audio classification using convolutional deep belief networks](http://papers.nips.cc/paper/3674-unsupervised-feature-learning-for-audio-classification-using-convolutional-deep-belief-networks.pdf) | No | 53 | | 2010 | [Audio musical genre classification using convolutional neural networks and pitch and tempo transformations](http://lbms03.cityu.edu.hk/theses/c_ftt/mphil-cs-b39478026f.pdf) | No | 54 | | 2010 | [Automatic musical pattern feature extraction using convolutional neural network](https://www.researchgate.net/profile/Antoni_Chan2/publication/44260643_Automatic_Musical_Pattern_Feature_Extraction_Using_Convolutional_Neural_Network/links/02e7e523dac6bb86b0000000.pdf) | No | 55 | | 2011 | [Audio-based music classification with a pretrained convolutional network](http://www.ismir2011.ismir.net/papers/PS6-3.pdf) | No | 56 | | 2012 | [Rethinking automatic chord recognition with convolutional neural networks](http://ieeexplore.ieee.org/abstract/document/6406762/) | No | 57 | | 2012 | [Moving beyond feature design: Deep architectures and automatic feature learning in music informatics](http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.294.2304&rep=rep1&type=pdf) | No | 58 | | 2012 | [Local-feature-map integration using convolutional neural networks for music genre classification](http://liris.cnrs.fr/Documents/Liris-5602.pdf) | No | 59 | | 2012 | [Learning sparse feature representations for music annotation and retrieval](https://pdfs.semanticscholar.org/099d/85f25e9336f48ff64287a4b53ee5fb64ab51.pdf) | No | 60 | | 2012 | [Unsupervised learning of local features for music classification](http://www.ismir2012.ismir.net/event/papers/139_ISMIR_2012.pdf) | No | 61 | | 2013 | [Multiscale approaches to music audio feature learning](http://ismir2013.ismir.net/wp-content/uploads/2013/09/69_Paper.pdf) | No | 62 | | 2013 | [Musical onset detection with convolutional neural networks](http://phenicx.upf.edu/system/files/publications/Schlueter_MML13.pdf) | No | 63 | | 2013 | [Deep content-based music recommendation](http://papers.nips.cc/paper/5004-deep-content-based-music-recommendation.pdf) | No | 64 | | 2014 | [The munich LSTM-RNN approach to the MediaEval 2014 Emotion In Music task](https://pdfs.semanticscholar.org/8a24/c5131d5a28165f719697028c34b00e6d3f60.pdf) | No | 65 | | 2014 | [End-to-end learning for music audio](http://ieeexplore.ieee.org/abstract/document/6854950/) | No | 66 | | 2014 | [Deep learning for music genre classification](https://courses.engr.illinois.edu/ece544na/fa2014/Tao_Feng.pdf) | No | 67 | | 2014 | [Recognition of acoustic events using deep neural networks](https://www.cs.tut.fi/sgn/arg/music/tuomasv/dnn_eusipco2014.pdf) | No | 68 | | 2014 | [Deep image features in music information retrieval](https://www.degruyter.com/downloadpdf/j/eletel.2014.60.issue-4/eletel-2014-0042/eletel-2014-0042.pdf) | No | 69 | | 2014 | [From music audio to chord tablature: Teaching deep convolutional networks to play guitar](http://www.mirlab.org/conference_papers/International_Conference/ICASSP%202014/papers/p7024-humphrey.pdf) | No | 70 | | 2014 | [Improved musical onset 
detection with convolutional neural networks](http://www.mirlab.org/conference_papers/International_Conference/ICASSP%202014/papers/p7029-schluter.pdf) | No | 71 | | 2014 | [Boundary detection in music structure analysis using convolutional neural networks](https://dav.grrrr.org/public/pub/ullrich_schlueter_grill-2014-ismir.pdf) | No | 72 | | 2014 | [Improving content-based and hybrid music recommendation using deep learning](http://www.smcnus.org/wp-content/uploads/2014/08/reco_MM14.pdf) | No | 73 | | 2014 | [A deep representation for invariance and music classification](http://www.mirlab.org/conference_papers/International_Conference/ICASSP%202014/papers/p7034-zhang.pdf) | No | 74 | | 2015 | [Auralisation of deep convolutional neural networks: Listening to learned features](http://ismir2015.uma.es/LBD/LBD24.pdf) | [GitHub](https://github.com/keunwoochoi/Auralisation) | 75 | | 2015 | [Downbeat tracking with multiple features and deep neural networks](http://perso.telecom-paristech.fr/~grichard/Publications/2015-durand-icassp.pdf) | No | 76 | | 2015 | [Music boundary detection using neural networks on spectrograms and self-similarity lag matrices](http://www.ofai.at/~jan.schlueter/pubs/2015_eusipco.pdf) | No | 77 | | 2015 | [Classification of spatial audio location and content using convolutional neural networks](https://www.researchgate.net/profile/Toni_Hirvonen/publication/276061831_Classification_of_Spatial_Audio_Location_and_Content_Using_Convolutional_Neural_Networks/links/5550665908ae12808b37fe5a/Classification-of-Spatial-Audio-Location-and-Content-Using-Convolutional-Neural-Networks.pdf) | No | 78 | | 2015 | [Deep learning, audio adversaries, and music content analysis](http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6905/pdf/imm6905.pdf) | No | 79 | | 2015 | [Deep learning and music adversaries](https://arxiv.org/pdf/1507.04761.pdf) | [GitHub](https://github.com/coreyker/dnn-mgr) | 80 | | 2015 | [Singing voice detection with deep recurrent neural networks](https://hal-imt.archives-ouvertes.fr/hal-01110035/) | No | 81 | | 2015 | [Automatic instrument recognition in polyphonic music using convolutional neural networks](https://arxiv.org/pdf/1511.05520.pdf) | No | 82 | | 2015 | [A software framework for musical data augmentation](https://bmcfee.github.io/papers/ismir2015_augmentation.pdf) | No | 83 | | 2015 | [A deep bag-of-features model for music auto-tagging](https://arxiv.org/pdf/1508.04999v1.pdf) | No | 84 | | 2015 | [Music-noise segmentation in spectrotemporal domain using convolutional neural networks](http://ismir2015.uma.es/LBD/LBD27.pdf) | No | 85 | | 2015 | [Musical instrument sound classification with deep convolutional neural network using feature fusion approach](https://arxiv.org/ftp/arxiv/papers/1512/1512.07370.pdf) | No | 86 | | 2015 | [Environmental sound classification with convolutional neural networks](http://karol.piczak.com/papers/Piczak2015-ESC-ConvNet.pdf) | No | 87 | | 2015 | [Exploring data augmentation for improved singing voice detection with neural networks](https://grrrr.org/pub/schlueter-2015-ismir.pdf) | [GitHub](https://github.com/f0k/ismir2015) | 88 | | 2015 | [Singer traits identification using deep neural network](https://cs224d.stanford.edu/reports/SkiZhengshan.pdf) | No | 89 | | 2015 | [A hybrid recurrent neural network for music transcription](https://arxiv.org/pdf/1411.1623.pdf) | No | 90 | | 2015 | [An end-to-end neural network for polyphonic music transcription](https://arxiv.org/pdf/1508.01774.pdf) | No | 91 | | 2015 | [Deep karaoke: 
Extracting vocals from musical mixtures using a convolutional deep neural network](https://link.springer.com/chapter/10.1007/978-3-319-22482-4_50) | No | 92 | | 2015 | [Folk music style modelling by recurrent neural networks with long short term memory units](http://ismir2015.uma.es/LBD/LBD13.pdf) | [GitHub](https://github.com/IraKorshunova/folk-rnn) | 93 | | 2015 | [Deep neural network based instrument extraction from music](https://www.researchgate.net/profile/Stefan_Uhlich/publication/282001406_Deep_neural_network_based_instrument_extraction_from_music/links/5600eeda08ae07629e52b397/Deep-neural-network-based-instrument-extraction-from-music.pdf) | No | 94 | | 2015 | [A deep neural network for modeling music](https://www.researchgate.net/profile/Xiaoqing_Zheng3/publication/275347034_A_Deep_Neural_Network_for_Modeling_Music/links/5539d2060cf2239f4e7dad0d/A-Deep-Neural-Network-for-Modeling-Music.pdf) | No | 95 | | 2016 | [An efficient approach for segmentation, feature extraction and classification of audio signals](http://file.scirp.org/pdf/CS_2016042615054817.pdf) | No | 96 | | 2016 | [Text-based LSTM networks for automatic music composition](https://drive.google.com/file/d/0B1OooSxEtl0FcG9MYnY2Ylh5c0U/view) | No | 97 | | 2016 | [Towards playlist generation algorithms using RNNs trained on within-track transitions](https://arxiv.org/pdf/1606.02096.pdf) | No | 98 | | 2016 | [Automatic tagging using deep convolutional neural networks](https://arxiv.org/pdf/1606.00298.pdf) | No | 99 | | 2016 | [Automatic chord estimation on seventhsbass chord vocabulary using deep neural network](http://ieeexplore.ieee.org/abstract/document/7471677/) | No | 100 | | 2016 | [DeepBach: A steerable model for Bach chorales generation](https://arxiv.org/pdf/1612.01010.pdf) | [GitHub](https://github.com/Ghadjeres/DeepBach) | 101 | | 2016 | [Bayesian meter tracking on learned signal representations](http://www.rhythmos.org/MMILab-Andre_files/ISMIR2016_CNNDBNbeats_camready.pdf) | No | 102 | | 2016 | [Deep learning for music](https://arxiv.org/pdf/1606.04930.pdf) | No | 103 | | 2016 | [Learning temporal features using a deep neural network and its application to music genre classification](https://www.researchgate.net/profile/Il_Young_Jeong/publication/305683876_Learning_temporal_features_using_a_deep_neural_network_and_its_application_to_music_genre_classification/links/5799a27c08aec89db7bb9f92.pdf) | No | 104 | | 2016 | [On the potential of simple framewise approaches to piano transcription](https://arxiv.org/pdf/1612.05153.pdf) | No | 105 | | 2016 | [Feature learning for chord recognition: The deep chroma extractor](https://arxiv.org/pdf/1612.05065.pdf) | [GitHub](https://github.com/fdlm/chordrec/tree/master/experiments/ismir2016) | 106 | | 2016 | [A fully convolutional deep auditory model for musical chord 
recognition](https://www.researchgate.net/profile/Filip_Korzeniowski/publication/305590295_A_Fully_Convolutional_Deep_Auditory_Model_for_Musical_Chord_Recognition/links/579486ba08aed51475cc6958/A-Fully-Convolutional-Deep-Auditory-Model-for-Musical-Chord-Recognition.pdf?_iepl%5BhomeFeedViewId%5D=HTzFFmKPia2YminQ4psHT5at&_iepl%5Bcontexts%5D%5B0%5D=pcfhf&_iepl%5BinteractionType%5D=publicationDownload&origin=publication_detail&ev=pub_int_prw_xdl&msrp=Dz_6LKHzYcPyP-LmgZPF-m63ayZ6k0entFEntooiu_e32zfETNQXKPQSTFOI87NONIIQuUQdnUtwORdomTXfteTrb09KiAIdDtBJnw_02P6JeRr5zu2eyaCG.2Uxsi_eENxtbYL39lvorIK8LofRYhkgpUHzpzmVzkIEiyHc0wUY87rEa4PH1qbXi4k4RyagHUsA2IsZtewnprglORjx2v9Cwbk9ZfQ.cd67BaqtHul_hE6SX6vUFKuldz81aH6dWq-cYMkq5vQKCHcvB8l9zgeM694Efb_r2wBB5GT9idt3OLeME0UxVHI6ROxamgK3LMNlSw.JtZXAo9HhR9t-8Wl3gxJgnoM4--rtmDEUDbXSWezbFyU-CoB_nyfxbRQ4kdoN4-5aJ3Tgx4YHdikicqAhc_cezB2ZntjxkB4rEDx1A) | No | 107 | | 2016 | [A deep bidirectional long short-term memory based multi-scale approach for music dynamic emotion prediction](http://ieeexplore.ieee.org/document/7471734/) | No | 108 | | 2016 | [Event localization in music auto-tagging](http://mac.citi.sinica.edu.tw/~yang/pub/liu16mm.pdf) | [GitHub](https://github.com/ciaua/clip2frame) | 109 | | 2016 | [Deep convolutional networks on the pitch spiral for musical instrument recognition](https://github.com/lostanlen/ismir2016/blob/master/paper/lostanlen_ismir2016.pdf) | [GitHub](https://github.com/lostanlen/ismir2016) | 110 | | 2016 | [SampleRNN: An unconditional end-to-end neural audio generation model](https://openreview.net/pdf?id=SkxKPDv5xl) | [GitHub](https://github.com/soroushmehr/sampleRNN_ICLR2017) | 111 | | 2016 | [Robust audio event recognition with 1-max pooling convolutional neural networks](https://arxiv.org/pdf/1604.06338.pdf) | No | 112 | | 2016 | [Experimenting with musically motivated convolutional neural networks](http://jordipons.me/media/CBMI16.pdf) | [GitHub](https://github.com/jordipons/) | 113 | | 2016 | [Singing voice melody transcription using deep neural networks](https://wp.nyu.edu/ismir2016/wp-content/uploads/sites/2294/2016/07/163_Paper.pdf) | No | 114 | | 2016 | [Singing voice separation using deep neural networks and F0 estimation](http://www.music-ir.org/mirex/abstracts/2016/RSGP1.pdf) | [Website](http://cvssp.org/projects/maruss/mirex2016/) | 115 | | 2016 | [Learning to pinpoint singing voice from weakly labeled examples](http://www.ofai.at/~jan.schlueter/pubs/2016_ismir.pdf) | No | 116 | | 2016 | [Analysis of time-frequency representations for musical onset detection with convolutional neural network](http://ieeexplore.ieee.org/abstract/document/7733228/) | No | 117 | | 2016 | [Note onset detection in musical signals via neural-network-based multi-ODF fusion](https://www.degruyter.com/downloadpdf/j/amcs.2016.26.issue-1/amcs-2016-0014/amcs-2016-0014.pdf) | No | 118 | | 2016 | [Music transcription modelling and composition using deep learning](https://drive.google.com/file/d/0B1OooSxEtl0FcTBiOGdvSTBmWnc/view) | [GitHub](https://github.com/IraKorshunova/folk-rnn) | 119 | | 2016 | [Convolutional neural network for robust pitch determination](http://www.mirlab.org/conference_papers/International_Conference/ICASSP%202016/pdfs/0000579.pdf) | No | 120 | | 2016 | [Deep convolutional neural networks and data augmentation for acoustic event detection](https://arxiv.org/pdf/1604.07160.pdf) | [Website](https://bitbucket.org/naoya1/aenet_release) | 121 | | 2017 | [Gabor frames and deep scattering networks in audio 
processing](https://arxiv.org/pdf/1706.08818.pdf) | No | 122 | | 2017 | [Vision-based detection of acoustic timed events: A case study on clarinet note onsets](http://dorienherremans.com/dlm2017/papers/bazzica2017clarinet.pdf) | No | 123 | | 2017 | [Deep learning techniques for music generation - A survey](https://arxiv.org/pdf/1709.01620.pdf) | No | 124 | | 2017 | [JamBot: Music theory aware chord based generation of polyphonic music with LSTMs](https://arxiv.org/pdf/1711.07682.pdf) | [GitHub](https://github.com/brunnergino/JamBot) | 125 | | 2017 | [XFlow: 1D <-> 2D cross-modal deep neural networks for audiovisual classification](https://arxiv.org/pdf/1709.00572.pdf) | No | 126 | | 2017 | [Machine listening intelligence](http://dorienherremans.com/dlm2017/papers/cella2017mli.pdf) | No | 127 | | 2017 | [Monoaural audio source separation using deep convolutional neural networks](http://mtg.upf.edu/system/files/publications/monoaural-audio-source_0.pdf) | [GitHub](https://github.com/MTG/DeepConvSep) | 128 | | 2017 | [Deep multimodal network for multi-label classification](http://ieeexplore.ieee.org/abstract/document/8019322/) | No | 129 | | 2017 | [A tutorial on deep learning for music information retrieval](https://arxiv.org/pdf/1709.04396.pdf) | [GitHub](https://github.com/keunwoochoi/dl4mir) | 130 | | 2017 | [A comparison on audio signal preprocessing methods for deep neural networks on music tagging](https://arxiv.org/pdf/1709.01922.pdf) | [GitHub](https://github.com/keunwoochoi/transfer_learning_music) | 131 | | 2017 | [Transfer learning for music classification and regression tasks](https://arxiv.org/pdf/1703.09179v3.pdf) | [GitHub](https://github.com/keunwoochoi/transfer_learning_music) | 132 | | 2017 | [Convolutional recurrent neural networks for music classification](http://ieeexplore.ieee.org/abstract/document/7952585/) | [GitHub](https://github.com/keunwoochoi/icassp_2017) | 133 | | 2017 | [An evaluation of convolutional neural networks for music classification using spectrograms](http://www.inf.ufpr.br/lesoliveira/download/ASOC2017.pdf) | No | 134 | | 2017 | [Large vocabulary automatic chord estimation using deep neural nets: Design framework, system variations and limitations](https://arxiv.org/pdf/1709.07153.pdf) | No | 135 | | 2017 | [Basic filters for convolutional neural networks: Training or design?](https://arxiv.org/pdf/1709.02291.pdf) | No | 136 | | 2017 | [Ensemble Of Deep Neural Networks For Acoustic Scene Classification](https://arxiv.org/pdf/1708.05826.pdf) | No | 137 | | 2017 | [Robust downbeat tracking using an ensemble of convolutional networks](http://ieeexplore.ieee.org/abstract/document/7728057/) | No | 138 | | 2017 | [Music signal processing using vector product neural networks](http://dorienherremans.com/dlm2017/papers/fan2017vector.pdf) | No | 139 | | 2017 | [Transforming musical signals through a genre classifying convolutional neural network](http://dorienherremans.com/dlm2017/papers/geng2017genre.pdf) | No | 140 | | 2017 | [Audio to score matching by combining phonetic and duration information](https://arxiv.org/pdf/1707.03547.pdf) | [GitHub](https://github.com/ronggong/jingjuSingingPhraseMatching/tree/v0.1.0) | 141 | | 2017 | [Interactive music generation with positional constraints using anticipation-RNNs](https://arxiv.org/pdf/1709.06404.pdf) | No | 142 | | 2017 | [Deep rank-based transposition-invariant distances on musical sequences](https://arxiv.org/pdf/1709.00740.pdf) | No | 143 | | 2017 | [GLSR-VAE: Geodesic latent space regularization for 
variational autoencoder architectures](https://arxiv.org/pdf/1707.04588.pdf) | No | 144 | | 2017 | [Deep convolutional neural networks for predominant instrument recognition in polyphonic music](http://dl.acm.org/citation.cfm?id=3068697) | No | 145 | | 2017 | [CNN architectures for large-scale audio classification](https://arxiv.org/pdf/1609.09430v2.pdf) | No | 146 | | 2017 | [DeepSheet: A sheet music generator based on deep learning](http://ieeexplore.ieee.org/abstract/document/8026272/) | No | 147 | | 2017 | [Talking Drums: Generating drum grooves with neural networks](http://dorienherremans.com/dlm2017/papers/hutchings2017drums.pdf) | No | 148 | | 2017 | [Singing voice separation with deep U-Net convolutional networks](https://ismir2017.smcnus.org/wp-content/uploads/2017/10/171_Paper.pdf) | [GitHub](https://github.com/Xiao-Ming/UNet-VocalSeparation-Chainer) | 149 | | 2017 | [Music emotion recognition via end-to-end multimodal neural networks](http://ceur-ws.org/Vol-1905/recsys2017_poster18.pdf) | No | 150 | | 2017 | [Chord label personalization through deep learning of integrated harmonic interval-based representations](http://dorienherremans.com/dlm2017/papers/koops2017pers.pdf) | No | 151 | | 2017 | [End-to-end musical key estimation using a convolutional neural network](https://arxiv.org/pdf/1706.02921.pdf) | No | 152 | | 2017 | [MediaEval 2017 AcousticBrainz genre task: Multilayer perceptron approach](http://www.cp.jku.at/research/papers/Koutini_2017_mediaeval-acousticbrainz.pdf) | No | 153 | | 2017 | [Classification-based singing melody extraction using deep convolutional neural networks](https://www.preprints.org/manuscript/201711.0027/v1) | No | 154 | | 2017 | [Multi-level and multi-scale feature aggregation using pre-trained convolutional neural networks for music auto-tagging](https://arxiv.org/pdf/1703.01793v2.pdf) | No | 155 | | 2017 | [Multi-level and multi-scale feature aggregation using sample-level deep convolutional neural networks for music classification](https://arxiv.org/pdf/1706.06810.pdf) | [GitHub](https://github.com/jongpillee/musicTagging_MSD) | 156 | | 2017 | [Sample-level deep convolutional neural networks for music auto-tagging using raw waveforms](https://arxiv.org/pdf/1703.01789v2.pdf) | No | 157 | | 2017 | [A SeqGAN for Polyphonic Music Generation](https://arxiv.org/pdf/1710.11418.pdf) | [GitHub](https://github.com/L0SG/seqgan-music) | 158 | | 2017 | [Harmonic and percussive source separation using a convolutional auto encoder](http://www.eurasip.org/Proceedings/Eusipco/Eusipco2017/papers/1570346835.pdf) | No | 159 | | 2017 | [Stacked convolutional and recurrent neural networks for music emotion recognition](https://arxiv.org/pdf/1706.02292.pdf) | No | 160 | | 2017 | [A deep learning approach to source separation and remixing of hiphop music](https://repositori.upf.edu/bitstream/handle/10230/32919/Martel_2017.pdf?sequence=1&isAllowed=y) | No | 161 | | 2017 | [Music Genre Classification Using Masked Conditional Neural Networks](https://link.springer.com/chapter/10.1007%2F978-3-319-70096-0_49) | No | 162 | | 2017 | [Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask](https://arxiv.org/pdf/1711.01437.pdf) | [GitHub](https://github.com/Js-Mim/mss_pytorch) | 163 | | 2017 | [Generating data to train convolutional neural networks for classical music source 
separation](https://www.researchgate.net/profile/Marius_Miron/publication/318322107_Generating_data_to_train_convolutional_neural_networks_for_classical_music_source_separation/links/59637cc3458515a3575b93c6/Generating-data-to-train-convolutional-neural-networks-for-classical-music-source-separation.pdf?_iepl%5BhomeFeedViewId%5D=WchoMnlUL1Hk9hBLVTeR8Amh&_iepl%5Bcontexts%5D%5B0%5D=pcfhf&_iepl%5BinteractionType%5D=publicationDownload&origin=publication_detail&ev=pub_int_prw_xdl&msrp=p3lQ8M4uZlb4TF5Hv9a2U3P2y4wW7ant5KWj4E5-OcD1Mg53p1ykTKHMG9_zVTB9n6mI8fvZOCL2Xhpru186pCEY-2ZxiYR-CB8_QvwHc1kUG-QE4SHdProR.LoJb2BDOiiQth3iR9xgZUxxCWEJgtTBF4whFrFa01OD49-3YYRxA0WQVN--zhtQU_7C2Pt0rKdwoFxT1pfxFvnKXSXmy2eT1Jpz-pw.U1QLoFO_Uc6aQVr2Nm2FcAi6BqAUfngH2Or5__6wegbCgVvTYoIGt22tmCkYbGTOQ_4PxBgt1LrvsFQiL0oMyogP8Yk8myTj0gs9jw.fGpkufGqAI4R2v8Hfe0ThcXL7M7yN2PuAlx974BGVn50SdUWvNhhIPWBD-zWTn8NKtVJx3XrjKXFrMgi9Cx7qGrNP8tBWpha6Srf6g) | [GitHub](https://github.com/MTG/DeepConvSep) | 164 | | 2017 | [Monaural score-informed source separation for classical music using convolutional neural networks](https://www.researchgate.net/profile/Marius_Miron/publication/318637038_Monaural_score-informed_source_separation_for_classical_music_using_convolutional_neural_networks/links/597327c6458515e26dfdb007/Monaural-score-informed-source-separation-for-classical-music-using-convolutional-neural-networks.pdf?_iepl%5BhomeFeedViewId%5D=WchoMnlUL1Hk9hBLVTeR8Amh&_iepl%5Bcontexts%5D%5B0%5D=pcfhf&_iepl%5BinteractionType%5D=publicationDownload&origin=publication_detail&ev=pub_int_prw_xdl&msrp=Hp6dDqMepEiRZ5E6WkreaqyjFkFkwMxPFoJvr14etVJsoKZBc5qb99fBnJjVUZrRHLFRhaXvNY9k1sMvYPOouuGbQP0YhEGm28zLw_55Zewu86WGnHck1Tqi.93HH2WqXfTedn6IaZRjjhQGYZVDHBz1X6nr4ABBgMAVv584gvGN3sW5IyBAY-4MBWf5DJFPBGm8zsaC2dKz8G-odZPfosWoXY0afAQ.KoCP2mO9l31lCER0oMZMZBrbuRGvb6ZzeBwHb88pL8AhMfJk03Hj1eLrohQIjPDETBj4hhqb0gniDGJgtZ9GnW64ZNjh9GbQDrIl5A.egNQTyC7t8P26zCQWrbEhf51Pxy2JRBZoTkH6SpRHHhRhFl1_AT_AT481lMcFI34-JbeRq-5oTQR7DpvAuw7iUIivd78ltuxpI9syg) | [GitHub](https://github.com/MTG/DeepConvSep) | 165 | | 2017 | [Multi-label music genre classification from audio, text, and images using deep features](https://ismir2017.smcnus.org/wp-content/uploads/2017/10/126_Paper.pdf) | [GitHub](https://github.com/sergiooramas/tartarus) | 166 | | 2017 | [A deep multimodal approach for cold-start music recommendation](https://arxiv.org/pdf/1706.09739.pdf) | [GitHub](https://github.com/sergiooramas/tartarus) | 167 | | 2017 | [Melody extraction and detection through LSTM-RNN with harmonic sum loss](http://ieeexplore.ieee.org/abstract/document/7952660/) | No | 168 | | 2017 | [Representation learning of music using artist labels](https://arxiv.org/pdf/1710.06648.pdf) | No | 169 | | 2017 | [Toward inverse control of physics-based sound synthesis](http://dorienherremans.com/dlm2017/papers/pfalz2017synthesis.pdf) | [Website](https://www.cct.lsu.edu/~apfalz/inverse_control.html) | 170 | | 2017 | [DNN and CNN with weighted and multi-task loss functions for audio event detection](https://arxiv.org/pdf/1708.03211.pdf) | No | 171 | | 2017 | [Score-informed syllable segmentation for a cappella singing voice with convolutional neural networks](https://ismir2017.smcnus.org/wp-content/uploads/2017/10/46_Paper.pdf) | [GitHub](https://github.com/ronggong/jingjuSyllabicSegmentaion/tree/v0.1.0) | 172 | | 2017 | [End-to-end learning for music audio tagging at scale](https://arxiv.org/pdf/1711.02520.pdf) | [GitHub](https://github.com/jordipons/music-audio-tagging-at-scale-models) | 173 | | 2017 | [Designing 
efficient architectures for modeling temporal features with convolutional neural networks](http://ieeexplore.ieee.org/document/7952601/) | [GitHub](https://github.com/jordipons/ICASSP2017) | 174 | | 2017 | [Timbre analysis of music audio signals with convolutional neural networks](https://github.com/ronggong/EUSIPCO2017) | [GitHub](https://github.com/jordipons/EUSIPCO2017) | 175 | | 2017 | [Deep learning and intelligent audio mixing](http://www.semanticaudio.co.uk/wp-content/uploads/2017/09/WIMP2017_Martinez-RamirezReiss.pdf) | No | 176 | | 2017 | [Deep learning for event detection, sequence labelling and similarity estimation in music signals](http://ofai.at/~jan.schlueter/pubs/phd/phd.pdf) | No | 177 | | 2017 | [Music feature maps with convolutional neural networks for music genre classification](https://www.researchgate.net/profile/Thomas_Pellegrini/publication/319326354_Music_Feature_Maps_with_Convolutional_Neural_Networks_for_Music_Genre_Classification/links/59ba5ae3458515bb9c4c6724/Music-Feature-Maps-with-Convolutional-Neural-Networks-for-Music-Genre-Classification.pdf?origin=publication_detail&ev=pub_int_prw_xdl&msrp=wzXuHZAa5zAnqEmErYyZwIRr2H0q01LnNEd4Wd7A15CQfdVLwdy98pmE-AdnrDvoc3-bVENSFrHt0yhaOiE2mQrYllVS9CJZOk-c9R0j_R1rbgcZugS6RtQ_.AUjPuJSF5P_DMngf-woH7W-7jdnQlbNQziR4_h6NnCHfR_zGcEa8vOyyOz5gx5nc4azqKTPQ5ZgGGLUxkLj1qCQLEQ5ThkhGlWHLyA.s6MBZE20-EO_RjRGCOCV4wk0WSFdN56Aloiraxz9hKCbJwRM2Et27RHVUA8jj9H8qvXIB6f7zSIrQgjXGrL2yCpyQlLffuf57rzSwg.KMMXbZrHsihV8DJM53xkHAWf3VebCJESi4KU4btNv9nQsyK2KnkhSQaTILKv0DSZY3c70a61LzywCBuoHtIhVOFhW5hVZN2n5O9uKQ) | No | 178 | | 2017 | [Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks](https://carlsouthall.files.wordpress.com/2017/12/ismir2017adt.pdf) | [GitHub](https://github.com/CarlSouthall/ADTLib) | 179 | | 2017 | [Adversarial semi-supervised audio source separation applied to singing voice extraction](https://arxiv.org/pdf/1711.00048.pdf) | No | 180 | | 2017 | [Taking the models back to music practice: Evaluating generative transcription models built using deep learning](http://jcms.org.uk/issues/Vol2Issue1/taking-models-back-to-music-practice/Taking%20the%20Models%20back%20to%20Music%20Practice:%20Evaluating%20Generative%20Transcription%20Models%20built%20using%20Deep%20Learning.pdf) | [GitHub](https://github.com/IraKorshunova/folk-rnn) | 181 | | 2017 | [Generating nontrivial melodies for music as a service](https://ismir2017.smcnus.org/wp-content/uploads/2017/10/178_Paper.pdf) | No | 182 | | 2017 | [Invariances and data augmentation for supervised music transcription](https://arxiv.org/pdf/1711.04845.pdf) | [GitHub](https://github.com/jthickstun/thickstun2018invariances/) | 183 | | 2017 | [Lyrics-based music genre classification using a hierarchical attention network](https://ismir2017.smcnus.org/wp-content/uploads/2017/10/43_Paper.pdf) | [GitHub](https://github.com/alexTsaptsinos/lyricsHAN) | 184 | | 2017 | [A hybrid DSP/deep learning approach to real-time full-band speech enhancement](https://arxiv.org/pdf/1709.08243.pdf) | [GitHub](https://github.com/xiph/rnnoise/) | 185 | | 2017 | [Convolutional methods for music analysis](http://vbn.aau.dk/files/260308151/PHD_Gissel_Velarde_E_pdf.pdf) | No | 186 | | 2017 | [Extending temporal feature integration for semantic audio analysis](http://www.aes.org/e-lib/browse.cfm?elib=18682) | No | 187 | | 2017 | [Recognition and retrieval of sound events using sparse coding convolutional neural 
network](http://ieeexplore.ieee.org/abstract/document/8019552/) | No | 188 | | 2017 | [A two-stage approach to note-level transcription of a specific piano](http://www.mdpi.com/2076-3417/7/9/901/htm) | No | 189 | | 2017 | [Reducing model complexity for DNN based large-scale audio classification](https://arxiv.org/pdf/1711.00229.pdf) | No | 190 | | 2017 | [Audio spectrogram representations for processing with convolutional neural networks](http://dorienherremans.com/dlm2017/papers/wyse2017spect.pdf) | [Website](http://lonce.org/research/audioST/) | 191 | | 2017 | [Unsupervised feature learning based on deep models for environmental audio tagging](https://arxiv.org/pdf/1607.03681.pdf) | No | 192 | | 2017 | [Attention and localization based on a deep convolutional recurrent model for weakly supervised audio tagging](https://arxiv.org/pdf/1703.06052.pdf) | [GitHub](https://github.com/yongxuUSTC/att_loc_cgrnn) | 193 | | 2017 | [Surrey-CVSSP system for DCASE2017 challenge task4](https://www.cs.tut.fi/sgn/arg/dcase2017/documents/challenge_technical_reports/DCASE2017_Xu_146.pdf) | [GitHub](https://github.com/yongxuUSTC/dcase2017_task4_cvssp) | 194 | | 2017 | [A study on LSTM networks for polyphonic music sequence modelling](https://qmro.qmul.ac.uk/xmlui/handle/123456789/24946) | [Website](http://www.eecs.qmul.ac.uk/~ay304/code/ismir17) | 195 | | 2018 | [MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment](https://arxiv.org/pdf/1709.06298.pdf) | [GitHub](https://github.com/salu133445/musegan) | 196 | | 2018 | [Music transformer: Generating music with long-term structure](https://arxiv.org/pdf/1809.04281.pdf) | No | 197 | | 2018 | [Music theory inspired policy gradient method for piano music transcription](https://nips2018creativity.github.io/doc/music_theory_inspired_policy_gradient.pdf) | No | 198 | | 2019 | [Enabling factorized piano music modeling and generation with the MAESTRO dataset](https://arxiv.org/abs/1810.12247) | No | 199 | | 2019 | [Generating Long Sequences with Sparse Transformers](https://arxiv.org/pdf/1904.10509.pdf) | [GitHub](https://github.com/openai/sparse_attention) | 200 | 201 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 202 | 203 | ## DL4M details 204 | 205 | A human-readable summarised version is displayed as a table in [dl4m.tsv](dl4m.tsv). All details for each article are stored in the corresponding bib entry in [dl4m.bib](dl4m.bib).
Each entry has the regular bib fields: 206 | 207 | - `author` 208 | - `year` 209 | - `title` 210 | - `journal` or `booktitle` 211 | 212 | Each entry in [dl4m.bib](dl4m.bib) also displays additional information: 213 | 214 | - `link` - HTML link to the PDF file 215 | - `code` - Link to the source code if available 216 | - `archi` - Neural network architecture 217 | - `layer` - Number of layers 218 | - `task` - The tasks studied in the article 219 | - `dataset` - The names of the datasets used 220 | - `dataaugmentation` - The type of data augmentation technique used 221 | - `time` - The computation time 222 | - `hardware` - The hardware used 223 | - `note` - Additional notes and information 224 | - `repro` - Indicates to what extent the experiments are reproducible 225 | 226 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 227 | 228 | ## Code without articles 229 | 230 | - [Audio Classifier in Keras using Convolutional Neural Network](https://github.com/drscotthawley/audio-classifier-keras-cnn) 231 | - [Deep learning driven jazz generation using Keras & Theano](https://github.com/jisungk/deepjazz) 232 | - [End-to-end learning for music audio tagging at scale](https://github.com/jordipons/music-audio-tagging-at-scale-models) 233 | - [Music Genre classification on GTZAN dataset using CNNs](https://github.com/Hguimaraes/gtzan.keras) 234 | - [Pitch Estimation of Choir Music using Deep Learning Strategies: from Solo to Unison Recordings](https://github.com/helenacuesta/choir-pitch-estimation) 235 | - [Music Genre Classification with LSTMs](https://github.com/ruohoruotsi/LSTM-Music-Genre-Classification) 236 | - [CNN based Music Emotion Classification using TensorFlow](https://github.com/rickiepark/cnn_mer) 237 | - [Separating singing voice from music based on deep neural networks in Tensorflow](https://github.com/andabi/music-source-separation) 238 | - [Music tag classification model using CRNN](https://github.com/kristijanbartol/Deep-Music-Tagger) 239 | - [Finding the genre of a song with Deep Learning](https://github.com/despoisj/DeepAudioClassification) 240 | - [Composing music using neural nets](https://github.com/fephsun/neuralnetmusic) 241 | - [Performance-RNN-PyTorch](https://github.com/djosix/Performance-RNN-PyTorch) 242 | 243 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 244 | 245 | ## Statistics and visualisations 246 | 247 | - 165 papers referenced. See the details in [dl4m.bib](dl4m.bib). 248 | There are more papers from 2017 than from all other years combined. 249 | Number of articles per year: 250 | ![Number of articles per year](fig/articles_per_year.png) 251 | - If you are applying DL to music, there are [356 other researchers](authors.md) in your field. 252 | - 34 tasks investigated. See the list of [tasks](tasks.md). 253 | Tasks pie chart: 254 | ![Tasks pie chart](fig/pie_chart_task.png) 255 | - 53 datasets used. See the list of [datasets](datasets.md). 256 | Datasets pie chart: 257 | ![Datasets pie chart](fig/pie_chart_dataset.png) 258 | - 30 architectures used. See the list of [architectures](architectures.md). 259 | Architectures pie chart: 260 | ![Architectures pie chart](fig/pie_chart_architecture.png) 261 | - 9 frameworks used. See the list of [frameworks](frameworks.md). 262 | Frameworks pie chart: 263 | ![Frameworks pie chart](fig/pie_chart_framework.png) 264 | - Only 44 articles (26%) provide their source code.
Repeatability is the key to good science, so check out the [list of useful resources on reproducibility for MIR and ML](reproducibility.md). 266 | 267 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 268 | 269 | ## Advice for reviewers of dl4m articles 270 | 271 | Please refer to the [advice_review.md](advice_review.md) file. 272 | 273 | ## How To Contribute 274 | 275 | Contributions are welcome! 276 | Please refer to the [CONTRIBUTING.md](CONTRIBUTING.md) file. 277 | 278 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 279 | 280 | ## FAQ 281 | 282 | > How are the articles sorted? 283 | 284 | The articles are first sorted by decreasing year (to keep up with the latest news) and then alphabetically by the main author's family name. 285 | 286 | > Why are preprints from arXiv included in the list? 287 | 288 | I want this list to be exhaustive and to include the latest research on DL4M. However, one should be cautious about the information provided in articles that are still under review. If possible, you should wait for the final accepted and peer-reviewed version before citing an arXiv paper. I regularly update the arXiv links to the corresponding published papers when available. 289 | 290 | > How much can I trust the results published in an article? 291 | 292 | The list provided here does not guarantee the quality of the articles. You should either try to reproduce the experiments described or submit a request to [ReScience](https://github.com/ReScience/ReScience). Use an article's conclusions at your own risk. 293 | 294 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 295 | 296 | ## Acronyms used 297 | 298 | A list of useful acronyms used in deep learning and music is stored in [acronyms.md](acronyms.md). 299 | 300 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 301 | 302 | ## Sources 303 | 304 | The list of conferences, journals and aggregators used to gather the proposed materials is stored in [sources.md](sources.md). 305 | 306 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 307 | 308 | ## Contributors 309 | 310 | - [Yann Bayle](http://yannbayle.fr/english/index.php) ([GitHub](https://github.com/ybayle)) - Instigator and principal maintainer 311 | - Vincent Lostanlen ([GitHub](https://github.com/lostanlen)) 312 | - [Keunwoo Choi](https://keunwoochoi.wordpress.com/) ([GitHub](https://github.com/keunwoochoi)) 313 | - [Bob L.
Sturm](http://www.eecs.qmul.ac.uk/~sturm/) ([GitHub](https://github.com/boblsturm)) 314 | - [Stefan Balke](https://www.audiolabs-erlangen.de/fau/assistant/balke) ([GitHub](https://github.com/stefan-balke)) 315 | - [Jordi Pons](http://www.jordipons.me/) ([GitHub](https://github.com/jordipons)) 316 | - Mirza Zulfan ([GitHub](https://github.com/mirzazulfan)) for the logo 317 | - [Devin Walters](https://github.com/devn) 318 | - https://github.com/LegendJ 319 | 320 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 321 | 322 | ## Other useful related lists and resources 323 | 324 | #### Audio 325 | 326 | - [DL4MIR tutorial with keras](https://github.com/tuwien-musicir/DL_MIR_Tutorial) - Tutorial for Deep Learning on Music Information Retrieval by [Thomas Lidy](http://ifs.tuwien.ac.at/~lidy/) 327 | - [Video talk from Ron Weiss](https://www.youtube.com/watch?v=sI_8EA0_ha8) - Ron Weiss (Google) talk on training neural network acoustic models on waveforms 328 | - [Slides on DL4M](http://www.jordipons.me/media/DL4Music_Pons.pdf) - A personal (re)view of the state-of-the-art by [Jordi Pons](http://www.jordipons.me/) 329 | - [DL4MIR tutorial](https://github.com/marl/dl4mir-tutorial) - Python tutorials for learning to solve MIR tasks with DL 330 | - [Awesome Python Scientific Audio](https://github.com/faroit/awesome-python-scientific-audio) - Python resources for Audio and Machine Learning 331 | - [ISMIR resources](http://ismir.net/resources.php) - Community maintained list 332 | - [ISMIR Google group](https://groups.google.com/a/ismir.net/forum/#!forum/community) - Daily dose of general MIR 333 | - [Awesome Python](https://github.com/vinta/awesome-python#audio) - Audio section of Python resources 334 | - [Awesome Web Audio](https://github.com/notthetup/awesome-webaudio) - WebAudio packages and resources 335 | - [Awesome Music](https://github.com/ciconia/awesome-music) - Music software 336 | - [Awesome Music Production](https://github.com/adius/awesome-music-production) - Music creation 337 | - [The Asimov Institute](http://www.asimovinstitute.org/analyzing-deep-learning-tools-music/) - 6 deep learning tools for music generation 338 | - [DLM Google group](https://groups.google.com/forum/#!forum/icdlm) - Deep Learning in Music group 339 | - [MIR community on Slack](https://slackpass.io/mircommunity) - Link to subscribe to the MIR community's Slack 340 | - [Unclassified list of MIR-related links](http://www.music.mcgill.ca/~cmckay/links_academic.html) - [Cory McKay](http://www.music.mcgill.ca/~cmckay/)'s list of various links on DL, MIR, ... 341 | - [MIRDL](http://jordipons.me/wiki/index.php/MIRDL) - Unmaintained list of DL articles for MIR from [Jordi Pons](http://www.jordipons.me/) 342 | - [WWW 2018 Challenge](https://www.crowdai.org/challenges/www-2018-challenge-learning-to-recognize-musical-genre) - Learning to Recognize Musical Genre on the [FMA](https://github.com/mdeff/fma) dataset 343 | - [Music generation with DL](https://github.com/umbrellabeach/music-generation-with-DL) - List of resources on music generation with deep learning 344 | - [Auditory Scene Analysis](https://mitpress.mit.edu/books/auditory-scene-analysis) - Book about the perceptual organization of sound by [Albert Bregman](https://en.wikipedia.org/wiki/Albert_Bregman), the "father of [Auditory Scene Analysis](https://en.wikipedia.org/wiki/Auditory_scene_analysis)".
345 | - [Demonstrations of Auditory Scene Analysis](http://webpages.mcgill.ca/staff/Group2/abregm1/web/downloadstoc.htm) - Audio demonstrations that illustrate examples of auditory perceptual organization. 346 | 347 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 348 | 349 | #### Music datasets 350 | 351 | - [AudioContentAnalysis nearly exhaustive list of music-related datasets](http://www.audiocontentanalysis.org/data-sets/) 352 | - [Teaching MIR](https://teachingmir.wikispaces.com/Datasets) 353 | - [Wikipedia's list of datasets for machine learning research](https://en.wikipedia.org/wiki/List_of_datasets_for_machine_learning_research#cite_ref-215) 354 | - [Datasets for deep learning](http://deeplearning.net/datasets/) 355 | - [Awesome public datasets](https://github.com/caesar0301/awesome-public-datasets) 356 | - [Awesome music listening](https://github.com/ybayle/awesome-music-listening) 357 | 358 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 359 | 360 | #### Deep learning 361 | 362 | - [DLPaper2Code: Auto-generation of Code from Deep Learning Research Papers](https://arxiv.org/abs/1711.03543) 363 | - [Model Convertors](https://github.com/ysh329/deep-learning-model-convertor) - Converters for DL frameworks and backends 364 | - [Deep architecture genealogy](https://github.com/hunkim/deep_architecture_genealogy) - Genealogy of DL architectures 365 | - [Deep Learning as an Engineer](http://www.univie.ac.at/nuhag-php/dateien/talks/3358_schlueter.pdf) - Slides from Jan Schlüter 366 | - [Awesome Deep Learning](https://github.com/ChristosChristofidis/awesome-deep-learning) - General deep learning resources 367 | - [Awesome Deep Learning Resources](https://github.com/endymecy/awesome-deeplearning-resources) - Papers regarding deep learning and deep reinforcement learning 368 | - [Awesome RNNs](https://github.com/kjw0612/awesome-rnn) - RNNs code, theory and applications 369 | - [Cheatsheets AI](https://github.com/kailashahirwar/cheatsheets-ai) - Cheat Sheets for Keras, neural networks, scikit-learn,... 370 | - [DL PaperNotes](https://github.com/dennybritz/deeplearning-papernotes) - Summaries and notes on general deep learning research papers 371 | - General [![Awesome](https://cdn.rawgit.com/sindresorhus/awesome/d7305f38d29fed78fa85652e3a63e154dd8e8829/media/badge.svg)](https://github.com/sindresorhus/awesome) lists 372 | - [Echo State Network](http://minds.jacobs-university.de/sites/default/files/uploads/papers/PracticalESN.pdf) 373 | - [DL in NLP](http://ruder.io/deep-learning-nlp-best-practices/index.html#introduction) - Best practices for using neural networks by [Sebastian Ruder](http://ruder.io/) 374 | - [CNN overview](http://cs231n.github.io/convolutional-networks/) - Stanford Course 375 | - [Dilated Recurrent Neural Networks](https://arxiv.org/pdf/1710.02224.pdf) - How to improve RNNs?
376 | - [Encoder-Decoder in RNNs](https://machinelearningmastery.com/how-does-attention-work-in-encoder-decoder-recurrent-neural-networks/) - How Does Attention Work in Encoder-Decoder Recurrent Neural Networks 377 | - [On the use of DL](https://twitter.com/randal_olson/status/927157485240311808/photo/1) - Misc fun around DL 378 | - [ML from scratch](https://github.com/eriklindernoren/ML-From-Scratch) - Python implementations of ML models and algorithms from scratch, from Data Mining to DL 379 | - [Comparison of DL frameworks](https://project.inria.fr/deeplearning/files/2016/05/DLFrameworks.pdf) - Presentation describing the different existing frameworks for DL 380 | - [ELU > ReLU](https://arxiv.org/pdf/1511.07289.pdf) - Article describing the differences between ELU and ReLU 381 | - [Reinforcement Learning: An Introduction](http://incompleteideas.net/sutton/book/bookdraft2017nov5.pdf) - Book about reinforcement learning 382 | - [Estimating Optimal Learning Rate](https://towardsdatascience.com/estimating-optimal-learning-rate-for-a-deep-neural-network-ce32f2556ce0) - Blog post on learning rate optimisation 383 | - [GitHub repo for sklearn add-on for imbalanced learning](https://github.com/scikit-learn-contrib/imbalanced-learn) - ML on imbalanced datasets 384 | - [Video on DL from Nando de Freitas, Scott Reed and Oriol Vinyals](https://www.youtube.com/watch?v=YJnddoa8sHk) - Deep Learning: Practice and Trends (NIPS 2017 Tutorial, parts I & II) 385 | - [Article "Are GANs Created Equal? A Large-Scale Study"](https://arxiv.org/abs/1711.10337) - Actually comparing DL algorithms 386 | - [Battle of the Deep Learning frameworks](https://towardsdatascience.com/battle-of-the-deep-learning-frameworks-part-i-cff0e3841750) - DL frameworks comparison and evolution 387 | - [Black-box optimization](http://timvieira.github.io/blog/post/2018/03/16/black-box-optimization/) - There are other optimization algorithms than just gradient descent 388 | 389 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 390 | 391 | ## Cited by 392 | 393 | If you use the information contained in this repository, please let us know! This repository is cited by: 394 | 395 | - [Alexander Schindler](https://twitter.com/Slychief/status/915218386421997568) 396 | - [Meinard Müller, Christof Weiss, Stefan Balke](https://www.audiolabs-erlangen.de/resources/MIR/2017-GI-Tutorial-Musik/2017_MuellerWeissBalke_GI_DeepLearningMIR.pdf) 397 | - [WWW 2018 Challenge: Learning to Recognize Musical Genre](https://www.crowdai.org/challenges/www-2018-challenge-learning-to-recognize-musical-genre) 398 | - [Awesome Deep Learning](https://github.com/ChristosChristofidis/awesome-deep-learning) 399 | - [AINewsFeed](https://twitter.com/AINewsFeed/status/897832912351105025) 400 | 401 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 402 | 403 | ## License 404 | 405 | You are free to copy, modify, and distribute ***Deep Learning for Music (DL4M)*** with attribution under the terms of the MIT license. See the LICENSE file for details. 406 | This project uses other projects; you may refer to them for the appropriate license information: 407 | 408 | - [Readme checklist](https://github.com/ddbeck/readme-checklist) - To build a universal Readme.
409 | - [Pylint](https://www.pylint.org/) - To lint the Python code. 410 | - [Numpy](http://www.numpy.org/) - To manage numerical data structures in Python. 411 | - [Matplotlib](https://matplotlib.org/) - To plot nice figures. 412 | - [Bibtexparser](https://github.com/sciunto-org/python-bibtexparser) - To deal with the bib entries. 413 | 414 | [Go back to top](https://github.com/ybayle/awesome-deep-learning-music#deep-learning-for-music-dl4m-) 415 | -------------------------------------------------------------------------------- /acronyms.md: -------------------------------------------------------------------------------- 1 | # Acronyms used 2 | 3 | A list of useful acronyms used in deep learning and music, sorted alphabetically. 4 | 5 | | Acronym | Full name | 6 | |--------------|-----------| 7 | | ADT | Automatic Drum Transcription | 8 | | AE | AutoEncoder | 9 | | AMT | Automatic Music Transcription | 10 | | ANN | Artificial Neural Network | 11 | | ARNN | Anticipation Recurrent Neural Network | 12 | | BILSTM | Bidirectional Long Short-Term Memory | 13 | | BPTT | Back-Propagation Through Time | 14 | | BRNN | Bidirectional Recurrent Neural Network | 15 | | CDBN | Convolutional Deep Belief Networks | 16 | | CEC | Constant Error Carousel | 17 | | CLNN | ConditionaL Neural Networks | 18 | | CNN | Convolutional Neural Network | 19 | | ConvNet | Convolutional Neural Network | 20 | | CRBM | Conditional Restricted Boltzmann Machine | 21 | | CRNN | Convolutional Recurrent Neural Network | 22 | | DAE | Denoising AutoEncoder or Deep AutoEncoder | 23 | | DBM | Deep Boltzmann Machine | 24 | | DBN | Deep Belief Network | 25 | | DeconvNet | DeConvolutional Neural Network | 26 | | DL | Deep Learning | 27 | | DNN | Deep Neural Network | 28 | | DSN | Deep Stacking Network | 29 | | DWT | Discrete Wavelet Transform | 30 | | ELM | Extreme Learning Machine | 31 | | FC | Fully Connected | 32 | | FCN | Fully Convolutional Network | 33 | | FC-CNN | Fully Convolutional Convolutional Neural Network | 34 | | FC-LSTM | Fully Connected Long Short-Term Memory | 35 | | GBRCN | Gradient-Boosting Random Convolutional Network | 36 | | GFNN | Gradient Frequency Neural Networks | 37 | | GLCM | Gray Level Co-occurrence Matrix | 38 | | HAN | Hierarchical Attention Network | 39 | | HHDS | HipHop Dataset | 40 | | LSTM | Long Short-Term Memory | 41 | | MCLNN | Masked ConditionaL Neural Networks | 42 | | MER | Music Emotion Recognition | 43 | | MGR | Music Genre Recognition | 44 | | MLM | Music Language Models | 45 | | MLP | Multi-Layer Perceptron | 46 | | MRS | Music Recommender System | 47 | | MSDAE | Modified Sparse Denoising Autoencoder | 48 | | MSE | Mean Squared Error | 49 | | MSR | Music Style Recognition | 50 | | NN | Neural Network | 51 | | NNMODFF | Neural Network based Multi-Onset Detection Function Fusion | 52 | | ODF | Onset Detection Function | 53 | | PNN | Probabilistic Neural Network | 54 | | PReLU | Parametric Rectified Linear Unit | 55 | | RANSAC | RANdom SAmple Consensus | 56 | | RBM | Restricted Boltzmann Machine | 57 | | ReLU | Rectified Linear Unit | 58 | | RICNN | Rotation Invariant Convolutional Neural Network | 59 | | RNN | Recurrent Neural Network | 60 | | RTRL | Real-Time Recurrent Learning | 61 | | SAE | Stacked AE | 62 | | SDAE | Stacked DAE | 63 | | SGD | Stochastic Gradient Descent | 64 | | SVD | Singing Voice Detection | 65 | | SVM | Support Vector Machine | 66 | | SVS | Singing Voice Separation | 67 | | VAD | Voice Activity Detection | 68 | | VAE | Variational AutoEncoder | 69 | | VPNN | Vector Product Neural Network |
70 | | WPE | Weighted Prediction Error | 71 | -------------------------------------------------------------------------------- /advice_review.md: -------------------------------------------------------------------------------- 1 | # Advice for reviewers of dl4m articles 2 | 3 | Regarding conflicts of interest, confidentiality, anonymity, ethical guidelines, commitment, respect and scheduling, please refer to the guidelines provided by the MIR community on the dedicated conference websites: 4 | - [ISMIR 2016](https://wp.nyu.edu/ismir2016/call-for-participation/guidelines-for-reviewers/) 5 | - [ISMIR 2017](https://ismir2017.smcnus.org/guidelines-for-reviewers/) 6 | 7 | For technical guidelines on deep learning and music, you can use the following humble advice. 8 | Check for completeness of details about: 9 | 10 | - **Music aspects** 11 | - datasets used, please refer to [datasets.md](https://github.com/ybayle/awesome-deep-learning-music/blob/master/datasets.md) 12 | - data augmentation (Pitch shift, Time-stretch, Mixing, Circular shift, Noise addition, Filter, Dropout, ...) 13 | - input type (Raw signal, Time-frequency representation, ...) 14 | - number of dimensions used as input (1D, 2D, ...) 15 | 16 | - **Deep learning aspects**: 17 | - architectures, please refer to [architectures.md](https://github.com/ybayle/awesome-deep-learning-music/blob/master/architectures.md) 18 | - learning rate (Fixed or changing, and its range) 19 | - framework, please refer to [frameworks.md](https://github.com/ybayle/awesome-deep-learning-music/blob/master/frameworks.md) 20 | - reproducibility, please refer to [reproducibility.md](https://github.com/ybayle/awesome-deep-learning-music/blob/master/reproducibility.md) 21 | - activation function (ReLU, Leaky ReLU, Sigmoid, Softmax, ...) 22 | - number of epochs 23 | - batch size (generally between 16 and 150) 24 | - loss function (RMSE, Cross-entropy, ...) 25 | - number of layers 26 | - dropout ratio 27 | - CPU or GPU usage and description 28 | - computation time (Global or per epoch) 29 | - optimizer (Adam, SGD, ...) 30 | 31 | - **General aspects**: 32 | - source code provided 33 | - description of the task and its similarity to existing ones 34 | - citing relevant literature from [dl4m.bib](https://github.com/ybayle/awesome-deep-learning-music/blob/master/dl4m.bib) 35 | -------------------------------------------------------------------------------- /architectures.md: -------------------------------------------------------------------------------- 1 | # List of architectures 2 | 3 | Please refer to the list of useful acronyms used in deep learning and music: [acronyms.md](acronyms.md).
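To make these acronyms and the reviewer checklist in [advice_review.md](advice_review.md) more concrete, here is a minimal, hypothetical Keras sketch of one of the most common architectures listed below (a CNN for spectrogram classification). Every value in it (input shape, number of classes, dropout ratio, learning rate, batch size, number of epochs) is an illustrative assumption, not taken from any referenced article:

```python
import tensorflow as tf
from tensorflow.keras import layers, models

# Hypothetical CNN classifying 128x128 mel-spectrogram patches into 10 classes
model = models.Sequential([
    layers.Conv2D(32, (3, 3), activation="relu", input_shape=(128, 128, 1)),
    layers.MaxPooling2D((2, 2)),
    layers.Conv2D(64, (3, 3), activation="relu"),
    layers.MaxPooling2D((2, 2)),
    layers.Flatten(),
    layers.Dropout(0.5),                     # dropout ratio
    layers.Dense(10, activation="softmax"),  # activation function
])
model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=1e-4),  # optimizer and learning rate
              loss="categorical_crossentropy",                         # loss function
              metrics=["accuracy"])
model.summary()
# model.fit(x_train, y_train, batch_size=32, epochs=20)  # batch size and number of epochs
```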
4 | 5 | - ANN 6 | - ARNN 7 | - AlexNet 8 | - BILSTM 9 | - BRNN 10 | - CDBN 11 | - CLNN 12 | - CNN 13 | - CRNN 14 | - ConvNet 15 | - DAE 16 | - DBN 17 | - DNN 18 | - ELM 19 | - FCN 20 | - GAN 21 | - HAN 22 | - MCLNN 23 | - MLP 24 | - NNMODFF 25 | - No 26 | - PNN 27 | - RNN 28 | - RNN-LSTM 29 | - ResNet 30 | - SeqGAN 31 | - Transformer 32 | - U-Net 33 | - VPNN 34 | - tensor2tensor 35 | -------------------------------------------------------------------------------- /authors.md: -------------------------------------------------------------------------------- 1 | # List of authors 2 | 3 | - Adavanne, Sharath 4 | - Alec Radford 5 | - Arumugam, Muthumari 6 | - Arzt, Andreas 7 | - Badeau, Roland 8 | - Bammer, Roswitha 9 | - Barbieri, Francesco 10 | - Bazzica, Alessio 11 | - Bello, Juan Pablo 12 | - Ben-Tal, Oded 13 | - Benetos, Emmanouil 14 | - Bengio, Yoshua 15 | - Berdahl, E 16 | - Bharucha, J. 17 | - Bittner, Rachel 18 | - Bock, Sebastian 19 | - Boulanger-Lewandowski, Nicolas 20 | - Brakel, Philémon 21 | - Bransen, Jeroen 22 | - Briot, Jean-Pierre 23 | - Brunner, Gino 24 | - Buzzanca, Giuseppe 25 | - Böck, Sebastian 26 | - Cai, Lianhong 27 | - Cangea, Cătălina 28 | - Cella, Carmine-Emanuele 29 | - Chan, Antoni B 30 | - Chan, TS 31 | - Chandna, Pritish 32 | - Chaudhuri, Sourish 33 | - Chen, Shiyu 34 | - Chen, Tanfang 35 | - Chen, Wenxiao 36 | - Cheng, Wen-Huang 37 | - Chesmore, David 38 | - Chiang, Chin-Chin 39 | - Cho, Kyunghyun 40 | - Choi, Keunwoo 41 | - Chun, A 42 | - Costa, Yandre MG 43 | - Courville, Aaron 44 | - Coutinho, Eduardo 45 | - Dai, Andrew M. 46 | - Dannenberg, Roger B 47 | - Das, Samarjit 48 | - David, Bertrand 49 | - De Haas, W Bas 50 | - De Lyon, Insa 51 | - Deng, Junqi 52 | - Dieleman, Sander 53 | - Dimoulas, Charalampos 54 | - Dinculescu, Monica 55 | - Dixon, Simon 56 | - Doerfler, Monika 57 | - Dong, Hao-Wen 58 | - Dorfer, Matthias 59 | - Drossos, Konstantinos 60 | - Duppada, Venkatesh 61 | - Durand, Simon 62 | - Eck, Douglas 63 | - Ehmann, Andreas F. 64 | - Ellis, Daniel P. W. 65 | - Elsen, Erich 66 | - Engel, Jesse 67 | - Evangelopoulos, Georgios 68 | - Ewert, Sebastian 69 | - Fan, Zhe-Cheng 70 | - Fazekas, György 71 | - Feng, Tao 72 | - Flexer, Arthur 73 | - Foster, Dean 74 | - Foster, Peter 75 | - Franklin, Judy A 76 | - Furst, Miriam 77 | - Gao, Guanglai 78 | - Garcez, Artur S d'Avila 79 | - Garcia, Christophe 80 | - Gemmeke, Jort F. 81 | - Gencoglu, Oguzhan 82 | - Geng, Shijia 83 | - Gerkmann, Timo 84 | - Giron, Franck 85 | - Gong, Rong 86 | - Goudeseune, Camille 87 | - Grais, Emad M. 88 | - Griffith, Niall 89 | - Grill, Thomas 90 | - Gruber, Alexander Rudolf 91 | - Grzywczak, Daniel 92 | - Gulrajani, Ishaan 93 | - Gwardys, Grzegorz 94 | - Gygli, Michael 95 | - Gómez Gutiérrez, Emilia 96 | - Gómez, Emilia 97 | - Ha, Jung-Woo 98 | - Hadjeres, Gaëtan 99 | - Han, Yoonchang 100 | - Hanjalic, A 101 | - Harchaoui, Zaid 102 | - Hawthorne, Curtis 103 | - He, Wenqi 104 | - Hennequin, Romain 105 | - Herrera, Jorge 106 | - Hershey, Shawn 107 | - Hertel, Lars 108 | - Hiray, Sushant 109 | - Hirvonen, Toni 110 | - Hockman, Jason 111 | - Hoffman, Matthew D. 112 | - Holzapfel, Andre 113 | - Hsiao, Wen-Yi 114 | - Hsu, Yu-Lun 115 | - Hu, Min-Chun 116 | - Huang, Allen 117 | - Huang, Cheng-Zhi Anna 118 | - Huang, Qiang 119 | - Humphrey, Eric J. 120 | - Hutchings, P. 121 | - Huttunen, Heikki 122 | - Hwang, Uiwon 123 | - Ide, Ichiro 124 | - Ilya Sutskever 125 | - Imenina, Alina 126 | - Jackson, Philip J. B. 
127 | - Jain, Shubham 128 | - Janer Mestres, Jordi 129 | - Janer, Jordi 130 | - Jang, Jyh-Shing R 131 | - Jansen, Aren 132 | - Jansson, Andreas 133 | - Jarina, Roman 134 | - Jeon, Byungsoo 135 | - Jeong, Il-Young 136 | - Kakade, Sham M. 137 | - Kaliappan, Mala 138 | - Kaminsky, I. 139 | - Kavcic, Alenka 140 | - Keefe, Douglas H. 141 | - Kelz, Rainer 142 | - Kereliuk, Corey 143 | - Kim, Adrian 144 | - Kim, Chanju 145 | - Kim, Dongwon 146 | - Kim, Jaehun 147 | - Kim, Jeonghee 148 | - Kim, Keunhyoung Luke 149 | - Kong, Qiuqiang 150 | - Koops, Hendrik Vincent 151 | - Korshunova, Iryna 152 | - Korzeniowski, Filip 153 | - Koutini, Khaled 154 | - Krawczyk-Becker, Martin 155 | - Kum, Sangeun 156 | - Kumar, Aparna 157 | - Kumar, Kundan 158 | - Kumar, Rithesh 159 | - Kuo, Hsu-Chan 160 | - Kwok, Yu-Kwong 161 | - Laden, Bernice 162 | - Largman, Yan 163 | - Larsen, Jan 164 | - LeCun, Yann 165 | - Lee, Honglak 166 | - Lee, Jongpil 167 | - Lee, Kyogu 168 | - Lee, Sang-gil 169 | - Lee, Taejin 170 | - Lee, Tan 171 | - Leglaive, Simon 172 | - Lewis, J. P. 173 | - Li, Juncheng 174 | - Li, Lihua 175 | - Li, Peter 176 | - Li, Siyan 177 | - Li, Tom LH 178 | - Li, Xinjian 179 | - Li, Xinxing 180 | - Lidy, Thomas 181 | - Liem, CCS 182 | - Lim, Wootaek 183 | - Lin, Bo-Chen 184 | - Lin, Chi-Po 185 | - Liu, Jen-Yu 186 | - Liò, Pietro 187 | - Lostanlen, Vincent 188 | - Maass, Marco 189 | - Malik, Miroslav 190 | - Marolt, Matija 191 | - Martel Baro, Héctor 192 | - Materka, Andrzej 193 | - Mathulaprangsan, Seksan 194 | - Matityaho, Benyamin 195 | - McFee, Brian 196 | - Medhat, Fady 197 | - Mehri, Soroush 198 | - Meng, Fanhang 199 | - Mertins, Alfred 200 | - Metze, Florian 201 | - Mimilakis, Stylianos Ioannis 202 | - Min, Seonwoo 203 | - Miron, Marius 204 | - Mitsufuji, Yuki 205 | - Montecchio, Nicola 206 | - Moore, R. Channing 207 | - Mouret, Florian 208 | - Mozer, Michael C. 209 | - Mońko, Jędrzej 210 | - Nakashika, Toru 211 | - Nam, Juhan 212 | - Nava, Gabriel Pablo 213 | - Ng, Andrew Y 214 | - Nielsen, Frank 215 | - Nieto, Oriol 216 | - Niewiadomski, Adam 217 | - Ogihara, Mitsunori 218 | - Oliveira, Luiz S 219 | - Oramas, Sergio 220 | - Pachet, François 221 | - Papanikolaou, George 222 | - Park, Hyunsin 223 | - Park, Jangyeon 224 | - Park, Jiyoung 225 | - Park, Taejin 226 | - Pellegrini, Thomas 227 | - Pfalz, A 228 | - Pfister, Beat 229 | - Pham, Peter 230 | - Phan, Huy 231 | - Piczak, Karol J 232 | - Pinquier, Julien 233 | - Plakal, Manoj 234 | - Platt, Devin 235 | - Plumbley, Mark D. 236 | - Poggio, Tomaso 237 | - Pons, Jordi 238 | - Privosnik, Marko 239 | - Prockup, Matthew 240 | - Qian, Jiyuan 241 | - Qian, Sheng 242 | - Qu, Shuhui 243 | - Radenen, Mathieu 244 | - Ramírez, Marco A. Martínez 245 | - Reiss, Joshua D. 246 | - Ren, Gang 247 | - Rewon Child 248 | - Richard, Gaël 249 | - Riedmiller, Martin 250 | - Rigaud, François 251 | - Roberts, Adam 252 | - Robinson, John 253 | - Roma, Gerard 254 | - Rosasco, Lorenzo 255 | - Sandler, Mark Brian 256 | - Santos, João Felipe 257 | - Santoso, Andri 258 | - Saurous, Rif A. 259 | - Schedl, Markus 260 | - Scherer, Klaus R 261 | - Schlüter, Jan 262 | - Schmidhuber, Juergen 263 | - Schmidt, Erik M. 264 | - Schrauwen, Benjamin 265 | - Schuller, Björn W 266 | - Schuller, Gerald 267 | - Schultz, Tanja 268 | - Scott Gray 269 | - Senac, Christine 270 | - Serra, Xavier 271 | - Seybold, Bryan 272 | - Shazeer, Noam 273 | - Shi, Zhengshan 274 | - Sigtia, Siddharth 275 | - Silla, Carlos N 276 | - Simon, Ian 277 | - Simpson, Andrew J. R. 
278 | - Slaney, Malcolm 279 | - Slizovskaia, Olga 280 | - Smith, Julius O 281 | - Soltau, Hagen 282 | - Sordo, Mohamed 283 | - Sotelo, Jose 284 | - Southall, Carl 285 | - Stables, Ryan 286 | - Stasiak, Bartłomiej 287 | - Stasyuk, Andriy 288 | - Stoller, Daniel 289 | - Sturm, Bob L. 290 | - Su, Hong 291 | - Takahashi, Naoya 292 | - Takiguchi, Tetsuya 293 | - Tanaka, Hidehiko 294 | - Teng, Yifei 295 | - Thickstun, John 296 | - Thom, Belinda 297 | - Tian, Jiashen 298 | - Ticha, Dasa 299 | - Todd, Peter M. 300 | - Tsaptsinos, Alexandros 301 | - Tsipas, Nikolaos 302 | - Uhlich, Stefan 303 | - Ullrich, Karen 304 | - Uszkoreit, Jakob 305 | - Valin, Jean-Marc 306 | - Van Gemert, JC 307 | - Van Gool, Luc 308 | - Van den Oord, Aaron 309 | - Vaswani, Ashish 310 | - Velarde, Gissel 311 | - Veličković, Petar 312 | - Virtanen, Tuomas 313 | - Voinea, Stephen 314 | - Volk, Anja 315 | - Vrysis, Lazaros 316 | - Waibel, Alex 317 | - Wang, Chien-Yao 318 | - Wang, Jia-Ching 319 | - Wang, Qi 320 | - Wang, Shangfei 321 | - Wang, Tian 322 | - Wang, Wenwu 323 | - Wang, Xinxi 324 | - Wang, Ye 325 | - Wang, Yun 326 | - Wang, Yuyi 327 | - Wang, Ziyuan 328 | - Watson, David 329 | - Wattenhofer, Roger 330 | - Weiss, Ron J. 331 | - Weninger, Felix 332 | - Westphal, Martin 333 | - Weyde, Tillman 334 | - Widmer, Gerhard 335 | - Wiesendanger, Jonas 336 | - Wilson, Kevin 337 | - Wu, Chung-Hsien 338 | - Wu, Raymond 339 | - Wu, Yuzhong 340 | - Wyse, Lonce 341 | - Wülfing, Jan 342 | - Xianyu, Haishu 343 | - Xu, Mingxing 344 | - Xu, Yong 345 | - Yan, Yonghong 346 | - Yang, Li-Chia 347 | - Yang, Yi-Hsuan 348 | - Ycart, Adrien 349 | - Yoo, Chang D 350 | - Yoon, Sungroh 351 | - Zhang, Chiyuan 352 | - Zhang, Hui 353 | - Zhang, Pengjing 354 | - Zhang, Shangtong 355 | - Zhang, Wenqiang 356 | - Zhang, Xueliang 357 | - Zhao, An 358 | - Zheng, Xiaoqing 359 | - Zhou, Ruohua 360 | -------------------------------------------------------------------------------- /client.py: -------------------------------------------------------------------------------- 1 | import os 2 | import requests 3 | 4 | 5 | class Downloader(object): 6 | """Download the pdf file of a referenced paper into a given directory.""" 7 | 8 | def __init__(self, dirname, timeout=5.0): 9 | self.dirname = dirname 10 | self.timeout = timeout 11 | 12 | def client(self, paper): 13 | # For now, HTTPS is not supported: fall back to plain HTTP 14 | query = paper['link'].replace('https://', 'http://', 1) 15 | r = requests.get(query, stream=True, timeout=self.timeout) 16 | if r.status_code == 200: 17 | # Successful connection: save the pdf under the paper's name 18 | with open(os.path.join(self.dirname, paper['name']), 'wb') as f: 19 | f.write(r.content) 20 | -------------------------------------------------------------------------------- /datasets.md: -------------------------------------------------------------------------------- 1 | # List of datasets 2 | 3 | Please refer to the list of useful acronyms used in deep learning and music: [acronyms.md](acronyms.md).
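Each bib entry of [dl4m.bib](dl4m.bib) stores its datasets in a `dataset` field, with ` & ` between multiple datasets, so you can query which referenced articles use a given dataset from the list below. Here is a minimal sketch mirroring the parsing conventions of [dl4m.py](dl4m.py); the query string `GTzan` is only an example:

```python
import bibtexparser

# Load dl4m.bib the same way dl4m.py does
with open("dl4m.bib", "r", encoding="utf-8") as bibtex_file:
    bib = bibtexparser.loads(bibtex_file.read())

query = "GTzan"  # example dataset name taken from the list below
for entry in bib.entries:
    # Multiple datasets are separated by " & "; a missing field means "No"
    datasets = entry.get("dataset", "No").split(" & ")
    if any(query in dataset for dataset in datasets):
        print(entry["year"], "-", entry["title"])
```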
4 | 5 | - Inhouse 6 | - No 7 | - [Beethoven’s 32 piano sonatas gathered from https://archive.org](https://soundcloud.com/samplernn/sets) 8 | - [413 hours of recorded solo piano music](http://papers.nips.cc/paper/8023-the-challenge-of-realistic-music-generation-modelling-raw-audio-at-scale-supplemental.zip) 9 | - [7digital](https://7digital.com) 10 | - [ADC2004](http://labrosa.ee.columbia.edu/projects/melody/) 11 | - [Acoustic Event](https://data.vision.ee.ethz.ch/cvl/ae_dataset/) 12 | - [AudioSet](https://research.google.com/audioset/index.html) 13 | - [Bach Corpus](http://musedata.org/) 14 | - [Bach10](http://music.cs.northwestern.edu/data/Bach10.html) 15 | - [Ballroom](http://mtg.upf.edu/ismir2004/contest/tempoContest/node5.html) 16 | - [Beatles](http://isophonics.net/content/reference-annotations-beatles) 17 | - [C4S](http://mmc.tudelft.nl/users/alessio-bazzica#C4S-dataset) 18 | - [CCMixter](https://members.loria.fr/ALiutkus/kam/) 19 | - [DSD100](http://sisec17.audiolabs-erlangen.de/#/dataset) 20 | - [Echo Nest Taste Profile Subset](https://labrosa.ee.columbia.edu/millionsong/tasteprofile) 21 | - [Free music archive](http://freemusicarchive.org/) 22 | - [GTzan](http://marsyas.info/downloads/datasets.html) 23 | - [HHDS](https://drive.google.com/drive/folders/0B1zpiGdDzFNlbmJyYU1VVFR3OEE) 24 | - [Homburg](http://www-ai.cs.uni-dortmund.de/audio.html) 25 | - [IDMT-SMT-Drums](https://www.idmt.fraunhofer.de/en/business_units/m2d/smt/drums.html) 26 | - [IRMAS](https://www.upf.edu/web/mtg/irmas) 27 | - [J.S. Bach chorales dataset](https://github.com/czhuang/JSB-Chorales-dataset) 28 | - [JSB Chorales](https://github.com/czhuang/JSB-Chorales-dataset) 29 | - [Jamendo](http://www.mathieuramona.com/wp/data/jamendo/) 30 | - [LMD](https://sites.google.com/site/carlossillajr/resources/the-latin-music-database-lmd) 31 | - [LSDB](http://lsdb.flow-machines.com/) 32 | - [LabROSA](http://labrosa.ee.columbia.edu/projects/melody/) 33 | - [Lakh MIDI](https://labrosa.ee.columbia.edu/sounds/music/) 34 | - [Lakh Pianoroll Dataset](https://github.com/salu133445/musegan/blob/master/docs/dataset.md) 35 | - [Last.fm](https://www.last.fm/) 36 | - [LyricFind](http://lyricfind.com/) 37 | - [MAESTRO](https://magenta.tensorflow.org/datasets/maestro/) 38 | - [MAPS](http://www.tsi.telecom-paristech.fr/aao/en/2010/07/08/maps-database-a-piano-database-for-multipitch-estimation-and-automatic-transcription-of-music/) 39 | - [MIR-1K](https://sites.google.com/site/unvoicedsoundseparation/mir-1k) 40 | - [MSD](https://labrosa.ee.columbia.edu/millionsong/) 41 | - [Magnatagatune](http://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset) 42 | - [MedleyDB](http://medleydb.weebly.com/) 43 | - [MusicNet](https://homes.cs.washington.edu/~thickstn/musicnet.html) 44 | - [NTT MLS](http://www.ntt-at.com/product/speech/) 45 | - [Nottingham dataset](http://abc.sourceforge.net/NMD/) 46 | - [Open Multitrack Testbed](http://www.semanticaudio.co.uk/projects/omtb/) 47 | - [Piano-e-Competition dataset (competition history)](http://www.piano-e-competition.com/) 48 | - [Piano-midi.de](http://www.piano-midi.de/) 49 | - [RWC](https://staff.aist.go.jp/m.goto/RWC-MDB/) 50 | - [SALAMI](http://ddmal.music.mcgill.ca/research/salami/annotations) 51 | - [Symbolic music data](http://users.cecs.anu.edu.au/~christian.walder/) 52 | - [TIMIT](https://catalog.ldc.upenn.edu/LDC93S1) 53 | - [TSP](http://www-mmsp.ece.mcgill.ca/Documents/Data/) 54 | - [TUT Acoustic Scenes 2016](http://www.cs.tut.fi/~mesaros/pubs/mesaros_eusipco2016-dcase.pdf) 55 | - [UIOWA MIS](http://theremin.music.uiowa.edu/mis.html)
56 | - [US Pop](https://labrosa.ee.columbia.edu/projects/musicsim/uspop2002.html) 57 | - [iKala](http://mac.citi.sinica.edu.tw/ikala/) 58 | -------------------------------------------------------------------------------- /dl4m.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/python 2 | # -*- coding: utf-8 -*- 3 | # 4 | # Authors Yann Bayle 5 | # E-mails bayle.yann@live.fr 6 | # License MIT 7 | # Created 16/08/2017 8 | # Updated 23/03/2018 9 | # Version 1.0.1 10 | # 11 | 12 | """ 13 | Description of dl4m.py 14 | ====================== 15 | 16 | Parse dl4m.bib to create a simple and readable ReadMe.md table. 17 | 18 | .. todo:: 19 | add a function that tests and reports dead http links 20 | sort bib 21 | add Fig for tasks, wordcloud, dataaugmentation 22 | bibtexparser accentuation handling in authors.md list 23 | error handling 24 | report on nb item per ENTRYTYPE 25 | generate .tsv from .bib 26 | wordcloud titles, abstract, articles 27 | valid bib field https://www.openoffice.org/bibliographic/bibtex-defs.html 28 | """ 29 | 30 | import sys 31 | import numpy as np 32 | import matplotlib.pyplot as plt 33 | import bibtexparser 34 | from bibtexparser.bwriter import BibTexWriter 35 | 36 | 37 | def write_bib(bib_database, filen="dl4m.bib"): 38 | """Description of write_bib 39 | Write the items stored in bib_database into filen 40 | """ 41 | writer = BibTexWriter() 42 | writer.indent = ' ' 43 | writer.order_entries_by = ('year', "author") 44 | with open(filen, "w", encoding="utf-8") as bibfile: 45 | bibfile.write(writer.write(bib_database)) 46 | 47 | 48 | def read_bib(filen="dl4m.bib"): 49 | """Description of read_bib 50 | Parse a bib file and load it into memory in a python format 51 | """ 52 | with open(filen, "r", encoding="utf-8") as bibtex_file: 53 | bibtex_str = bibtex_file.read() 54 | bib_database = bibtexparser.loads(bibtex_str) 55 | return bib_database 56 | 57 | 58 | def load_bib(filen="dl4m.bib"): 59 | """Description of load_bib 60 | Load and return the items stored in filen (rewritten once so that entries stay sorted) 61 | """ 62 | bib = read_bib(filen) 63 | write_bib(bib, filen) 64 | bib = read_bib(filen) 65 | return bib.entries 66 | 67 | 68 | def articles_per_year(bib): 69 | """Description of articles_per_year 70 | Display the number of articles published per year 71 | input: the list of bib entries 72 | """ 73 | years = [] 74 | for entry in bib: 75 | year = int(entry['year']) 76 | years.append(year) 77 | 78 | plt.xlabel('Years') 79 | plt.ylabel('Number of Deep Learning articles\n related to Music Information Retrieval') 80 | year_bins = np.arange(min(years), max(years) + 2.0, 1.0) 81 | plt.hist(years, bins=year_bins, color="#401153", align="left") 82 | axe = plt.gca() 83 | axe.spines['right'].set_color('none') 84 | axe.spines['top'].set_color('none') 85 | axe.xaxis.set_ticks_position('bottom') 86 | axe.yaxis.set_ticks_position('left') 87 | fig_fn = "fig/articles_per_year.png" 88 | plt.savefig(fig_fn, dpi=200) 89 | print("Fig.
with number of articles per year saved in", fig_fn) 90 | 91 | 92 | def get_reproducibility(bib): 93 | """Description of get_reproducibility 94 | Generate insights on reproducibility 95 | """ 96 | cpt = 0 97 | for entry in bib: 98 | if "code" in entry: 99 | if entry["code"][:2] != "No": 100 | cpt += 1 101 | print(str(cpt) + " articles provide their source code.") 102 | 103 | return cpt 104 | 105 | 106 | def get_nb_articles(bib): 107 | """Description of get_nb_articles 108 | Count the number of articles in the database 109 | """ 110 | print("There are", len(bib), "articles referenced.") 111 | return len(bib) 112 | 113 | 114 | def get_authors(bib): 115 | """Description of get_authors 116 | Print in authors.md the alphabetical list of authors 117 | """ 118 | authors = [] 119 | for entry in bib: 120 | for author in entry['author'].split(" and "): 121 | authors.append(author) 122 | authors = sorted(set(authors)) 123 | nb_authors = len(authors) 124 | print("There are", nb_authors, "researchers working on DL4M.") 125 | 126 | authors_fn = "authors.md" 127 | with open(authors_fn, "w", encoding="utf-8") as filep: 128 | filep.write("# List of authors\n\n") 129 | for author in authors: 130 | filep.write("- " + author + "\n") 131 | print("List of authors written in", authors_fn) 132 | 133 | return nb_authors 134 | 135 | 136 | def generate_list_articles(bib): 137 | """Description of generate_list_articles 138 | From the bib file generates a ReadMe-styled table like: 139 | | [Name of the article](Link to the .pdf) | Code's link if available | 140 | """ 141 | articles = "" 142 | for entry in bib: 143 | if "title" in entry: 144 | if "year" in entry: 145 | articles += "| " + entry["year"] + " " 146 | else: 147 | print("ERROR: Missing year for ", entry) 148 | sys.exit() 149 | if "link" in entry: 150 | articles += "| [" + entry["title"] + "](" + entry["link"] + ") | " 151 | else: 152 | articles += "| " + entry["title"] + " | " 153 | if "code" in entry: 154 | if "No" in entry["code"]: 155 | articles += "No " 156 | else: 157 | if "github" in entry["code"]: 158 | articles += "[GitHub" 159 | else: 160 | articles += "[Website" 161 | articles += "](" + entry["code"] + ") " 162 | else: 163 | articles += "No " 164 | articles += "|\n" 165 | else: 166 | print("ERROR: Missing title for ", entry) 167 | sys.exit() 168 | 169 | return articles 170 | 171 | 172 | def generate_summary_table(bib): 173 | """Description of generate_summary_table 174 | Parse dl4m.bib to create a simple and readable ReadMe.md table. 175 | """ 176 | nb_articles = get_nb_articles(bib) 177 | nb_repro = get_reproducibility(bib) 178 | percent_repro = str(int(nb_repro * 100. / nb_articles)) 179 | nb_articles = str(nb_articles) 180 | nb_repro = str(nb_repro) 181 | nb_authors = str(get_authors(bib) - 1) 182 | nb_tasks = str(get_field(bib, "task")) 183 | nb_datasets = str(get_field(bib, "dataset")) 184 | nb_archi = str(get_field(bib, "architecture")) 185 | nb_framework = str(get_field(bib, "framework")) 186 | articles = generate_list_articles(bib) 187 | 188 | readme_fn = "README.md" 189 | readme = "" 190 | pasted_articles = False 191 | with open(readme_fn, "r", encoding="utf-8") as filep: 192 | for line in filep: 193 | if "| " in line[:2] and line[2] != " ": 194 | if not pasted_articles: 195 | readme += articles 196 | pasted_articles = True 197 | elif "papers referenced" in line: 198 | readme += "- " + nb_articles + " papers referenced. 
" 199 | readme += "See the details in [dl4m.bib](dl4m.bib).\n" 200 | elif "other researchers" in line: 201 | readme += "- If you are applying DL to music, there are [" 202 | readme += nb_authors + " other researchers](authors.md) " 203 | readme += "in your field.\n" 204 | elif "tasks investigated" in line: 205 | readme += "- " + nb_tasks + " tasks investigated. " 206 | readme += "See the list of [tasks](tasks.md).\n" 207 | elif "datasets used" in line: 208 | readme += "- " + nb_datasets + " datasets used. " 209 | readme += "See the list of [datasets](datasets.md).\n" 210 | elif "architectures used" in line: 211 | readme += "- " + nb_archi + " architectures used. " 212 | readme += "See the list of [architectures](architectures.md).\n" 213 | elif "frameworks used" in line: 214 | readme += "- " + nb_framework + " frameworks used. " 215 | readme += "See the list of [frameworks](frameworks.md).\n" 216 | elif "- Only" in line: 217 | readme += "- Only " + nb_repro + " articles (" + percent_repro 218 | readme += "%) provide their source code.\n" 219 | else: 220 | readme += line 221 | with open(readme_fn, "w", encoding="utf-8") as filep: 222 | filep.write(readme) 223 | print("New ReadMe generated") 224 | 225 | 226 | def validate_field(field_name): 227 | """Description of validate_field 228 | Assert the validity of the field's name 229 | """ 230 | fields = ["task", "dataset", "architecture", "author", "dataaugmentation", 231 | "link", "title", "year", "journal", "code", "ENTRYTYPE", 232 | "framework"] 233 | error_str = "Invalid field provided: " + field_name + ". " 234 | error_str += "Valid fields: " + '[%s]' % ', '.join(map(str, fields)) 235 | assert field_name in fields, error_str 236 | 237 | def make_autopct(values): 238 | """Wrapper for the custom values to display in the pie chart slices 239 | """ 240 | def my_autopct(pct): 241 | """Define custom value to print in pie chart 242 | """ 243 | total = sum(values) 244 | val = int(round(pct*total/100.0)) 245 | return '{p:.1f}% ({v:d})'.format(p=pct, v=val) 246 | return my_autopct 247 | 248 | 249 | def pie_chart(items, field_name, max_nb_slice=8): 250 | """Description of pie_chart 251 | Display a pie_chart from the items given in input 252 | """ 253 | # plt.figure(figsize=(14, 10)) 254 | sizes = [] 255 | labels = sorted(set(items)) 256 | for label in labels: 257 | sizes.append(items.count(label)) 258 | 259 | labels = np.array(labels) 260 | sizes = np.array(sizes) 261 | if len(sizes) > max_nb_slice: 262 | new_labels = [] 263 | new_sizes = [] 264 | for _ in range(0, max_nb_slice): 265 | index = np.where(sizes == max(sizes))[0] 266 | if len(index) == len(labels): 267 | break 268 | new_labels.append(labels[index][0]) 269 | new_sizes.append(sizes[index][0]) 270 | labels = np.delete(labels, index) 271 | sizes = np.delete(sizes, index) 272 | new_labels.append(str(len(labels)) + " others") 273 | new_sizes.append(sum(sizes)) 274 | labels = np.array(new_labels) 275 | sizes = np.array(new_sizes) 276 | 277 | colors = ["gold", "yellowgreen", "lightcoral", "lightskyblue", 278 | "red", "green", "bisque", "lightgrey", "#555555"] 279 | 280 | tmp_labels = [] 281 | for label in labels: 282 | if "[" in label: 283 | label = label[1:].split("]")[0] 284 | tmp_labels.append(label) 285 | labels = np.array(tmp_labels) 286 | 287 | # h = plt.pie(sizes, labels=labels, colors=colors, shadow=False, 288 | plt.pie(sizes, labels=labels, colors=colors, shadow=False, 289 | startangle=90, autopct=make_autopct(sizes)) 290 | 291 | # Display the legend 292 | # leg = plt.legend(h[0], labels, 
bbox_to_anchor=(0.08, 0.4)) 293 | # leg.draw_frame(False) 294 | plt.axis('equal') 295 | fig_fn = "fig/pie_chart_" + field_name + ".png" 296 | plt.savefig(fig_fn, dpi=200) 297 | plt.close() 298 | print("Fig. with pie chart of", field_name, "saved in", fig_fn) 299 | 300 | 301 | def get_field(bib, field_name): 302 | """Description of get_field 303 | Generate insights on the field_name in the bib file 304 | """ 305 | validate_field(field_name) 306 | nb_article_missing = 0 307 | fields = [] 308 | for entry in bib: 309 | if field_name in entry: 310 | cur_fields = entry[field_name].split(" & ") 311 | for field in cur_fields: 312 | fields.append(field) 313 | else: 314 | nb_article_missing += 1 315 | print(str(nb_article_missing) + " entries are missing the " + field_name + " field.") 316 | nb_fields = len(set(fields)) 317 | print(str(nb_fields) + " unique " + field_name + ".") 318 | 319 | field_fn = field_name + "s.md" 320 | with open(field_fn, "w", encoding="utf-8") as filep: 321 | filep.write("# List of " + field_name + "s\n\n") 322 | filep.write("Please refer to the list of useful acronyms used in deep ") 323 | filep.write("learning and music: [acronyms.md](acronyms.md).\n\n") 324 | for field in sorted(set(fields)): 325 | filep.write("- " + field + "\n") 326 | print("List of " + field_name + "s written in", field_fn) 327 | 328 | pie_chart(fields, field_name) 329 | 330 | return nb_fields 331 | 332 | 333 | def create_table(bib, outfilen="dl4m.tsv"): 334 | """Description of create_table 335 | Generate a human-readable table in .tsv form. 336 | """ 337 | 338 | print("Generating the human-readable table as .tsv") 339 | # Gather all existing fields in bib 340 | fields = [] 341 | for entry in bib: 342 | for key in entry: 343 | fields.append(key) 344 | 345 | print("Available fields:") 346 | print(set(fields)) 347 | fields = ["year", "ENTRYTYPE", "title", "author", "link", "code", "task", 348 | "reproducible", "dataset", "framework", "architecture", "dropout", 349 | "batch", "epochs", "dataaugmentation", "input", "dimension", 350 | "activation", "loss", "learningrate", "optimizer", "gpu"] 351 | print("Fields taken into account (in this order):") 352 | print(fields) 353 | 354 | separator = "\t" 355 | str2write = "" 356 | for field in fields: 357 | str2write += field.title() + separator 358 | str2write += "\n" 359 | for entry in bib: 360 | for field in fields: 361 | if field in entry: 362 | str2write += entry[field] 363 | str2write += separator 364 | str2write += "\n" 365 | with open(outfilen, "w", encoding="utf-8") as filep: 366 | filep.write(str2write) 367 | 368 | 369 | def where_published(bib): 370 | """Display insights on where the articles have been published 371 | """ 372 | journals = [] 373 | conf = [] 374 | for entry in bib: 375 | if "article" in entry["ENTRYTYPE"]: 376 | journals.append(entry["journal"]) 377 | elif "inproceedings" in entry["ENTRYTYPE"]: 378 | conf.append(entry["booktitle"]) 379 | journals = sorted(set(journals)) 380 | conf = sorted(set(conf)) 381 | 382 | with open("publication_type.md", "w", encoding="utf-8") as filep: 383 | filep.write("# List of publication types\n\n### Journals:\n\n- ") 384 | filep.write("\n- ".join(journals)) 385 | filep.write("\n\n### Conferences:\n\n- ") 386 | filep.write("\n- ".join(conf)) 387 | filep.write("\n") 388 | 389 | 390 | def main(filen="dl4m.bib"): 391 | """Description of main 392 | Main entry point 393 | input: file name storing articles details 394 | """ 395 | bib = load_bib(filen) 396 | generate_summary_table(bib) 397 | articles_per_year(bib) 398 | create_table(bib)
399 | where_published(bib) 400 | 401 | 402 | if __name__ == "__main__": 403 | main("dl4m.bib") 404 | -------------------------------------------------------------------------------- /dl4m.tsv: -------------------------------------------------------------------------------- 1 | Year Entrytype Title Author Link Code Task Reproducible Dataset Framework Architecture Dropout Batch Epochs Dataaugmentation Input Dimension Activation Loss Learningrate Optimizer Gpu 2 | 1988 inproceedings Neural net modeling of music Bharucha, J. 3 | 1988 inproceedings Creation by refinement: A creativity paradigm for gradient descent learning networks Lewis, J. P. http://ieeexplore.ieee.org/stamp/stamp.jsp?arnumber=23933 4 | 1988 inproceedings A sequential network design for musical applications Todd, Peter M. Composition 5 | 1989 article The representation of pitch in a neural net model of chord classification Laden, Bernice and Keefe, Douglas H. http://www.jstor.org/stable/3679550 Chord recognition 6 | 1989 inproceedings Algorithms for music composition by neural nets: Improved CBR paradigms Lewis, J. P. https://quod.lib.umich.edu/cgi/p/pod/dod-idx/algorithms-for-music-composition.pdf?c=icmc;idno=bbp2372.1989.044;format=pdf Composition 7 | 1989 article A connectionist approach to algorithmic composition Todd, Peter M. http://www.jstor.org/stable/3679551 Composition 8 | 1994 article Neural network music composition by prediction: Exploring the benefits of psychoacoustic constraints and multi-scale processing Mozer, Michael C. http://www-labs.iro.umontreal.ca/~pift6080/H09/documents/papers/mozer-music.pdf Composition 9 | 1995 inproceedings Automatic source identification of monophonic musical instrument sounds Kaminsky, I. and Materka, Andrzej https://www.researchgate.net/publication/3622871_Automatic_source_identification_of_monophonic_musical_instrument_sounds No Instrument recognition No Inhouse No No No No No No Raw audio 1D Sigmoid No 0.25 No No 10 | 1995 inproceedings Neural network based model for classification of music type Matityaho, Benyamin and Furst, Miriam http://ieeexplore.ieee.org/abstract/document/514161/ MGR 11 | 1997 inproceedings A machine learning approach to musical style recognition Dannenberg, Roger B and Thom, Belinda and Watson, David http://repository.cmu.edu/cgi/viewcontent.cgi?article=1496&context=compsci MSR 12 | 1998 inproceedings Recognition of music types Soltau, Hagen and Schultz, Tanja and Westphal, Martin and Waibel, Alex https://www.ri.cmu.edu/pub_files/pub1/soltau_hagen_1998_2/soltau_hagen_1998_2.pdf No MGR No Inhouse No DNN No No No No 10x5 cepstral coefficients 2D No No No No No 13 | 1999 book Musical networks: Parallel distributed perception and performance Griffith, Niall and Todd, Peter M. 
https://s3.amazonaws.com/academia.edu.documents/3551783/10.1.1.39.6248.pdf?AWSAccessKeyId=AKIAIWOWYYGZ2Y53UL3A&Expires=1507055806&Signature=5mGzQc7bvJgUZYfXOmCX8eeNQOs%3D&response-content-disposition=inline%3B%20filename%3DMusical_networks_Parallel_distributed_pe.pdf 14 | 2001 inproceedings Multi-phase learning for jazz improvisation and interaction Franklin, Judy A http://www.cs.smith.edu/~jfrankli/papers/CtColl01.pdf Composition RNN 15 | 2002 inproceedings A supervised learning approach to musical style recognition Buzzanca, Giuseppe https://www.researchgate.net/profile/Giuseppe_Buzzanca/publication/228588086_A_supervised_learning_approach_to_musical_style_recognition/links/54b43ee90cf26833efd0109f.pdf MGR 16 | 2002 inproceedings Finding temporal structure in music: Blues improvisation with LSTM recurrent networks Eck, Douglas and Schmidhuber, Juergen http://www-perso.iro.umontreal.ca/~eckdoug/papers/2002_ieee.pdf No Composition No Inhouse No RNN-LSTM No No No No Midi Chords & Midi notes 1D Logistic Sigmoid cross-entropy 0.00001 SGD No 17 | 2002 unpublished Neural networks for note onset detection in piano music Marolt, Matija and Kavcic, Alenka and Privosnik, Marko https://www.researchgate.net/profile/Matija_Marolt/publication/2473938_Neural_Networks_for_Note_Onset_Detection_in_Piano_Music/links/00b49525efccc79fed000000.pdf No Onset detection No Inhouse No MLP No No No No Raw audio signal and synthesized 1D No No No No No 18 | 2004 inproceedings A convolutional-kernel based approach for note onset detection in piano-solo audio signals Nava, Gabriel Pablo and Tanaka, Hidehiko and Ide, Ichiro http://www.murase.nuie.nagoya-u.ac.jp/~ide/res/paper/E04-conference-pablo-1.pdf Onset detection 19 | 2009 inproceedings Unsupervised feature learning for audio classification using convolutional deep belief networks Lee, Honglak and Pham, Peter and Largman, Yan and Ng, Andrew Y http://papers.nips.cc/paper/3674-unsupervised-feature-learning-for-audio-classification-using-convolutional-deep-belief-networks.pdf Speaker gender recognition [TIMIT](https://catalog.ldc.upenn.edu/LDC93S1) CDBN 20 | 2010 phdthesis Audio musical genre classification using convolutional neural networks and pitch and tempo transformations Li, Lihua http://lbms03.cityu.edu.hk/theses/c_ftt/mphil-cs-b39478026f.pdf [GTzan](http://marsyas.info/downloads/datasets.html) MFCC 21 | 2010 inproceedings Automatic musical pattern feature extraction using convolutional neural network Li, Tom LH and Chan, Antoni B and Chun, A https://www.researchgate.net/profile/Antoni_Chan2/publication/44260643_Automatic_Musical_Pattern_Feature_Extraction_Using_Convolutional_Neural_Network/links/02e7e523dac6bb86b0000000.pdf MGR [GTzan](http://marsyas.info/downloads/datasets.html) MFCC 22 | 2011 inproceedings Audio-based music classification with a pretrained convolutional network Dieleman, Sander and Brakel, Philémon and Schrauwen, Benjamin http://www.ismir2011.ismir.net/papers/PS6-3.pdf No MGR & Artist recognition No [MSD](https://labrosa.ee.columbia.edu/millionsong/) Theano CNN & MLP 0.3 No 1 No Custom 0.005 & 0.0001 No No 23 | 2012 inproceedings Rethinking automatic chord recognition with convolutional neural networks Humphrey, Eric J. 
and Bello, Juan Pablo http://ieeexplore.ieee.org/abstract/document/6406762/ Chord recognition [Beatles](http://isophonics.net/content/reference-annotations-beatles) & [RWC](https://staff.aist.go.jp/m.goto/RWC-MDB/) & [US Pop](https://labrosa.ee.columbia.edu/projects/musicsim/uspop2002.html) CNN Cross-entropy 24 | 2012 inproceedings Moving beyond feature design: Deep architectures and automatic feature learning in music informatics Humphrey, Eric J. and Bello, Juan Pablo and LeCun, Yann http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.294.2304&rep=rep1&type=pdf 25 | 2012 inproceedings Local-feature-map integration using convolutional neural networks for music genre classification Nakashika, Toru and Garcia, Christophe and Takiguchi, Tetsuya and De Lyon, Insa http://liris.cnrs.fr/Documents/Liris-5602.pdf MGR [GTzan](http://marsyas.info/downloads/datasets.html) GLCM 26 | 2012 inproceedings Learning sparse feature representations for music annotation and retrieval Nam, Juhan and Herrera, Jorge and Slaney, Malcolm and Smith, Julius O https://pdfs.semanticscholar.org/099d/85f25e9336f48ff64287a4b53ee5fb64ab51.pdf 27 | 2012 inproceedings Unsupervised learning of local features for music classification Wülfing, Jan and Riedmiller, Martin http://www.ismir2012.ismir.net/event/papers/139_ISMIR_2012.pdf MGR [GTzan](http://marsyas.info/downloads/datasets.html) CQT 28 | 2013 inproceedings Multiscale approaches to music audio feature learning Dieleman, Sander and Schrauwen, Benjamin http://ismir2013.ismir.net/wp-content/uploads/2013/09/69_Paper.pdf [Magnatagatune](http://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset) Mel-spectrogram cross-entropy 29 | 2013 inproceedings Musical onset detection with convolutional neural networks Schlüter, Jan and Böck, Sebastian http://phenicx.upf.edu/system/files/publications/Schlueter_MML13.pdf Onset detection Mel-spectrogram cross-entropy 30 | 2013 inproceedings Deep content-based music recommendation Van den Oord, Aaron and Dieleman, Sander and Schrauwen, Benjamin http://papers.nips.cc/paper/5004-deep-content-based-music-recommendation.pdf Recommendation [MSD](https://labrosa.ee.columbia.edu/millionsong/) & [Echo Nest Taste Profile Subset](https://labrosa.ee.columbia.edu/millionsong/tasteprofile) & [Last.fm](https://www.last.fm/) Theano CNN MFCC & Mel-Spectro ReLU 31 | 2014 inproceedings The munich LSTM-RNN approach to the MediaEval 2014 Emotion In Music task Coutinho, Eduardo and Weninger, Felix and Schuller, Björn W and Scherer, Klaus R https://pdfs.semanticscholar.org/8a24/c5131d5a28165f719697028c34b00e6d3f60.pdf MER RNN-LSTM 32 | 2014 inproceedings End-to-end learning for music audio Dieleman, Sander and Schrauwen, Benjamin http://ieeexplore.ieee.org/abstract/document/6854950/ MGR [Magnatagatune](http://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset) CNN Raw & Mel-spectrogram 33 | 2014 techreport Deep learning for music genre classification Feng, Tao https://courses.engr.illinois.edu/ece544na/fa2014/Tao_Feng.pdf MGR [GTzan](http://marsyas.info/downloads/datasets.html) 34 | 2014 inproceedings Recognition of acoustic events using deep neural networks Gencoglu, Oguzhan and Virtanen, Tuomas and Huttunen, Heikki https://www.cs.tut.fi/sgn/arg/music/tuomasv/dnn_eusipco2014.pdf 35 | 2014 article Deep image features in music information retrieval Gwardys, Grzegorz and Grzywczak, Daniel https://www.degruyter.com/downloadpdf/j/eletel.2014.60.issue-4/eletel-2014-0042/eletel-2014-0042.pdf [GTzan](http://marsyas.info/downloads/datasets.html) 36 | 2014 
inproceedings From music audio to chord tablature: Teaching deep convolutional networks to play guitar Humphrey, Eric J. and Bello, Juan Pablo http://www.mirlab.org/conference_papers/International_Conference/ICASSP%202014/papers/p7024-humphrey.pdf Chord recognition [Beatles](http://isophonics.net/content/reference-annotations-beatles) & [RWC](https://staff.aist.go.jp/m.goto/RWC-MDB/) & [US Pop](https://labrosa.ee.columbia.edu/projects/musicsim/uspop2002.html) CQT 37 | 2014 inproceedings Improved musical onset detection with convolutional neural networks Schlüter, Jan and Bock, Sebastian http://www.mirlab.org/conference_papers/International_Conference/ICASSP%202014/papers/p7029-schluter.pdf Onset detection Inhouse CNN Mel-spectrogram 3D 38 | 2014 inproceedings Boundary detection in music structure analysis using convolutional neural networks Ullrich, Karen and Schlüter, Jan and Grill, Thomas https://dav.grrrr.org/public/pub/ullrich_schlueter_grill-2014-ismir.pdf Boundary detection [SALAMI](http://ddmal.music.mcgill.ca/research/salami/annotations) Mel-spectrogram Cross-entropy 39 | 2014 inproceedings Improving content-based and hybrid music recommendation using deep learning Wang, Xinxi and Wang, Ye http://www.smcnus.org/wp-content/uploads/2014/08/reco_MM14.pdf Recommendation [Echo Nest Taste Profile Subset](https://labrosa.ee.columbia.edu/millionsong/tasteprofile) & [7digital](https://7digital.com) Theano DBN No 15 nodes of 2 Tesla M2090 40 | 2014 inproceedings A deep representation for invariance and music classification Zhang, Chiyuan and Evangelopoulos, Georgios and Voinea, Stephen and Rosasco, Lorenzo and Poggio, Tomaso http://www.mirlab.org/conference_papers/International_Conference/ICASSP%202014/papers/p7034-zhang.pdf MGR [GTzan](http://marsyas.info/downloads/datasets.html) CNN 41 | 2015 inproceedings Auralisation of deep convolutional neural networks: Listening to learned features Choi, Keunwoo and Fazekas, György and Sandler, Mark Brian and Kim, Jeonghee http://ismir2015.uma.es/LBD/LBD24.pdf https://github.com/keunwoochoi/Auralisation MGR Inhouse STFT 42 | 2015 inproceedings Downbeat tracking with multiple features and deep neural networks Durand, Simon and Bello, Juan Pablo and David, Bertrand and Richard, Gaël http://perso.telecom-paristech.fr/~grichard/Publications/2015-durand-icassp.pdf Beat detection 43 | 2015 inproceedings Music boundary detection using neural networks on spectrograms and self-similarity lag matrices Grill, Thomas and Schlüter, Jan http://www.ofai.at/~jan.schlueter/pubs/2015_eusipco.pdf Boundary detection [SALAMI](http://ddmal.music.mcgill.ca/research/salami/annotations) STFT 44 | 2015 inproceedings Classification of spatial audio location and content using convolutional neural networks Hirvonen, Toni https://www.researchgate.net/profile/Toni_Hirvonen/publication/276061831_Classification_of_Spatial_Audio_Location_and_Content_Using_Convolutional_Neural_Networks/links/5550665908ae12808b37fe5a/Classification-of-Spatial-Audio-Location-and-Content-Using-Convolutional-Neural-Networks.pdf 45 | 2015 inproceedings Deep learning, audio adversaries, and music content analysis Kereliuk, Corey and Sturm, Bob L. and Larsen, Jan http://www2.imm.dtu.dk/pubdb/views/edoc_download.php/6905/pdf/imm6905.pdf 46 | 2015 article Deep learning and music adversaries Kereliuk, Corey and Sturm, Bob L. 
and Larsen, Jan https://arxiv.org/pdf/1507.04761.pdf https://github.com/coreyker/dnn-mgr MGR [GTzan](http://marsyas.info/downloads/datasets.html) & [LMD](https://sites.google.com/site/carlossillajr/resources/the-latin-music-database-lmd) CNN Magnitude spectral frames 47 | 2015 inproceedings Singing voice detection with deep recurrent neural networks Leglaive, Simon and Hennequin, Romain and Badeau, Roland https://hal-imt.archives-ouvertes.fr/hal-01110035/ SVD 48 | 2015 unpublished Automatic instrument recognition in polyphonic music using convolutional neural networks Li, Peter and Qian, Jiyuan and Wang, Tian https://arxiv.org/pdf/1511.05520.pdf Instrument recognition [MedleyDB](http://medleydb.weebly.com/) 1D Freq Raw audio Cross-entropy 49 | 2015 inproceedings A software framework for musical data augmentation McFee, Brian and Humphrey, Eric J. and Bello, Juan Pablo https://bmcfee.github.io/papers/ismir2015_augmentation.pdf Instrument recognition [MedleyDB](http://medleydb.weebly.com/) 50 | 2015 unpublished A deep bag-of-features model for music auto-tagging Nam, Juhan and Herrera, Jorge and Lee, Kyogu https://arxiv.org/pdf/1508.04999v1.pdf 51 | 2015 inproceedings Music-noise segmentation in spectrotemporal domain using convolutional neural networks Park, Taejin and Lee, Taejin http://ismir2015.uma.es/LBD/LBD27.pdf Music/Noise segmentation No 2D Cross-entropy 52 | 2015 unpublished Musical instrument sound classification with deep convolutional neural network using feature fusion approach Park, Taejin and Lee, Taejin https://arxiv.org/ftp/arxiv/papers/1512/1512.07370.pdf Instrument recognition [UIOWA MIS](http://theremin.music.uiowa.edu/mis.html) 53 | 2015 inproceedings Environmental sound classification with convolutional neural networks Piczak, Karol J http://karol.piczak.com/papers/Piczak2015-ESC-ConvNet.pdf 54 | 2015 inproceedings Exploring data augmentation for improved singing voice detection with neural networks Schlüter, Jan and Grill, Thomas https://grrrr.org/pub/schlueter-2015-ismir.pdf https://github.com/f0k/ismir2015 SVD Inhouse & [Jamendo](http://www.mathieuramona.com/wp/data/jamendo/) & [RWC](https://staff.aist.go.jp/m.goto/RWC-MDB/) CNN Dropout {5%, 10%, 20%} & Noise {Gaussian sigma={0.05, 0.1, 0.2}} & Pitch shift +-{10, 20, 30, 50} & Time stretch +-{10, 20, 30, 50} & Loudness +-{5dB, 10dB, 20dB} & Frequency filter +-{5dB, 10dB, 20dB} & Mix {10%, 20%, 30%, 50%} & Combined & Test and train Spectrogram 55 | 2015 techreport Singer traits identification using deep neural network Shi, Zhengshan https://cs224d.stanford.edu/reports/SkiZhengshan.pdf 56 | 2015 inproceedings A hybrid recurrent neural network for music transcription Sigtia, Siddharth and Benetos, Emmanouil and Boulanger-Lewandowski, Nicolas and Weyde, Tillman and Garcez, Artur S d'Avila and Dixon, Simon https://arxiv.org/pdf/1411.1623.pdf Transcription [MAPS](http://www.tsi.telecom-paristech.fr/aao/en/2010/07/08/maps-database-a-piano-database-for-multipitch-estimation-and-automatic-transcription-of-music/) RNN No No No No 57 | 2015 unpublished An end-to-end neural network for polyphonic music transcription Sigtia, Siddharth and Benetos, Emmanouil and Dixon, Simon https://arxiv.org/pdf/1508.01774.pdf Transcription CQT 58 | 2015 unpublished Deep karaoke: Extracting vocals from musical mixtures using a convolutional deep neural network Simpson, Andrew J. R. and Roma, Gerard and Plumbley, Mark D. 
https://link.springer.com/chapter/10.1007/978-3-319-22482-4_50 Source separation [MedleyDB](http://medleydb.weebly.com/) STFT
59 | 2015 inproceedings Folk music style modelling by recurrent neural networks with long short term memory units Sturm, Bob L. and Santos, João Felipe and Korshunova, Iryna http://ismir2015.uma.es/LBD/LBD13.pdf https://github.com/IraKorshunova/folk-rnn Composition
60 | 2015 inproceedings Deep neural network based instrument extraction from music Uhlich, Stefan and Giron, Franck and Mitsufuji, Yuki https://www.researchgate.net/profile/Stefan_Uhlich/publication/282001406_Deep_neural_network_based_instrument_extraction_from_music/links/5600eeda08ae07629e52b397/Deep-neural-network-based-instrument-extraction-from-music.pdf Source separation STFT
61 | 2015 inproceedings A deep neural network for modeling music Zhang, Pengjing and Zheng, Xiaoqing and Zhang, Wenqiang and Li, Siyan and Qian, Sheng and He, Wenqi and Zhang, Shangtong and Wang, Ziyuan https://www.researchgate.net/profile/Xiaoqing_Zheng3/publication/275347034_A_Deep_Neural_Network_for_Modeling_Music/links/5539d2060cf2239f4e7dad0d/A-Deep-Neural-Network-for-Modeling-Music.pdf MGR CNN
62 | 2016 article An efficient approach for segmentation, feature extraction and classification of audio signals Arumugam, Muthumari and Kaliappan, Mala http://file.scirp.org/pdf/CS_2016042615054817.pdf MGR & Instrument recognition [GTzan](http://marsyas.info/downloads/datasets.html) PNN
63 | 2016 inproceedings Text-based LSTM networks for automatic music composition Choi, Keunwoo and Fazekas, György and Sandler, Mark Brian https://drive.google.com/file/d/0B1OooSxEtl0FcG9MYnY2Ylh5c0U/view Composition
64 | 2016 unpublished Towards playlist generation algorithms using RNNs trained on within-track transitions Choi, Keunwoo and Fazekas, György and Sandler, Mark Brian https://arxiv.org/pdf/1606.02096.pdf Playlist generation RNN
65 | 2016 inproceedings Automatic tagging using deep convolutional neural networks Choi, Keunwoo and Fazekas, György and Sandler, Mark Brian https://arxiv.org/pdf/1606.00298.pdf MGR FCN
66 | 2016 inproceedings Automatic chord estimation on seventhsbass chord vocabulary using deep neural network Deng, Junqi and Kwok, Yu-Kwong http://ieeexplore.ieee.org/abstract/document/7471677/ Chord recognition
67 | 2016 inproceedings DeepBach: A steerable model for Bach chorales generation Hadjeres, Gaëtan and Pachet, François https://arxiv.org/pdf/1612.01010.pdf https://github.com/Ghadjeres/DeepBach
68 | 2016 inproceedings Bayesian meter tracking on learned signal representations Holzapfel, Andre and Grill, Thomas http://www.rhythmos.org/MMILab-Andre_files/ISMIR2016_CNNDBNbeats_camready.pdf Beat detection CNN
69 | 2016 unpublished Deep learning for music Huang, Allen and Wu, Raymond https://arxiv.org/pdf/1606.04930.pdf Composition [Bach Corpus](http://musedata.org/) RNN-LSTM
70 | 2016 inproceedings Learning temporal features using a deep neural network and its application to music genre classification Jeong, Il-Young and Lee, Kyogu https://www.researchgate.net/profile/Il_Young_Jeong/publication/305683876_Learning_temporal_features_using_a_deep_neural_network_and_its_application_to_music_genre_classification/links/5799a27c08aec89db7bb9f92.pdf STFT & Cepstrum
71 | 2016 unpublished On the potential of simple framewise approaches to piano transcription Kelz, Rainer and Dorfer, Matthias and Korzeniowski, Filip and Böck, Sebastian and Arzt, Andreas and Widmer, Gerhard https://arxiv.org/pdf/1612.05153.pdf DNN & ConvNet
72 | 2016 inproceedings Feature learning for chord recognition: The deep chroma extractor Korzeniowski, Filip and Widmer, Gerhard https://arxiv.org/pdf/1612.05065.pdf https://github.com/fdlm/chordrec/tree/master/experiments/ismir2016 Chord recognition
73 | 2016 inproceedings A fully convolutional deep auditory model for musical chord recognition Korzeniowski, Filip and Widmer, Gerhard https://www.researchgate.net/profile/Filip_Korzeniowski/publication/305590295_A_Fully_Convolutional_Deep_Auditory_Model_for_Musical_Chord_Recognition/links/579486ba08aed51475cc6958/A-Fully-Convolutional-Deep-Auditory-Model-for-Musical-Chord-Recognition.pdf Chord recognition
74 | 2016 inproceedings A deep bidirectional long short-term memory based multi-scale approach for music dynamic emotion prediction Li, Xinxing and Xianyu, Haishu and Tian, Jiashen and Chen, Wenxiao and Meng, Fanhang and Xu, Mingxing and Cai, Lianhong http://ieeexplore.ieee.org/document/7471734/ MER RNN & BILSTM & ELM
75 | 2016 inproceedings Event localization in music auto-tagging Liu, Jen-Yu and Yang, Yi-Hsuan http://mac.citi.sinica.edu.tw/~yang/pub/liu16mm.pdf https://github.com/ciaua/clip2frame CNN
76 | 2016 inproceedings Deep convolutional networks on the pitch spiral for musical instrument recognition Lostanlen, Vincent and Cella, Carmine-Emanuele https://github.com/lostanlen/ismir2016/blob/master/paper/lostanlen_ismir2016.pdf https://github.com/lostanlen/ismir2016 Instrument recognition
77 | 2016 inproceedings SampleRNN: An unconditional end-to-end neural audio generation model Mehri, Soroush and Kumar, Kundan and Gulrajani, Ishaan and Kumar, Rithesh and Jain, Shubham and Sotelo, Jose and Courville, Aaron and Bengio, Yoshua https://openreview.net/pdf?id=SkxKPDv5xl https://github.com/soroushmehr/sampleRNN_ICLR2017 Composition [32 Beethoven’s piano sonatas gathered from https://archive.org](https://soundcloud.com/samplernn/sets) RNN
78 | 2016 unpublished Robust audio event recognition with 1-max pooling convolutional neural networks Phan, Huy and Hertel, Lars and Maass, Marco and Mertins, Alfred https://arxiv.org/pdf/1604.06338.pdf Event recognition [RWC](https://staff.aist.go.jp/m.goto/RWC-MDB/) CNN
79 | 2016 inproceedings Experimenting with musically motivated convolutional neural networks Pons, Jordi and Lidy, Thomas and Serra, Xavier http://jordipons.me/media/CBMI16.pdf https://github.com/jordipons/ [Ballroom](http://mtg.upf.edu/ismir2004/contest/tempoContest/node5.html)
80 | 2016 inproceedings Singing voice melody transcription using deep neural networks Rigaud, François and Radenen, Mathieu https://wp.nyu.edu/ismir2016/wp-content/uploads/sites/2294/2016/07/163_Paper.pdf F0 & VAD DNN & RNN-LSTM
81 | 2016 inproceedings Singing voice separation using deep neural networks and F0 estimation Roma, Gerard and Grais, Emad M. and Simpson, Andrew J. R. and Plumbley, Mark D. http://www.music-ir.org/mirex/abstracts/2016/RSGP1.pdf http://cvssp.org/projects/maruss/mirex2016/ SVS [iKala](http://mac.citi.sinica.edu.tw/ikala/)
82 | 2016 inproceedings Learning to pinpoint singing voice from weakly labeled examples Schlüter, Jan http://www.ofai.at/~jan.schlueter/pubs/2016_ismir.pdf CNN STFT
83 | 2016 inproceedings Analysis of time-frequency representations for musical onset detection with convolutional neural network Stasiak, Bartłomiej and Mońko, Jędrzej http://ieeexplore.ieee.org/abstract/document/7733228/ Onset detection
84 | 2016 article Note onset detection in musical signals via neural-network-based multi-ODF fusion Stasiak, Bartłomiej and Mońko, Jędrzej and Niewiadomski, Adam https://www.degruyter.com/downloadpdf/j/amcs.2016.26.issue-1/amcs-2016-0014/amcs-2016-0014.pdf Onset detection Inhouse & [RWC](https://staff.aist.go.jp/m.goto/RWC-MDB/) NNMODFF Onset activation
85 | 2016 inproceedings Music transcription modelling and composition using deep learning Sturm, Bob L. and Santos, João Felipe and Ben-Tal, Oded and Korshunova, Iryna https://drive.google.com/file/d/0B1OooSxEtl0FcTBiOGdvSTBmWnc/view https://github.com/IraKorshunova/folk-rnn Composition
86 | 2016 inproceedings Convolutional neural network for robust pitch determination Su, Hong and Zhang, Hui and Zhang, Xueliang and Gao, Guanglai http://www.mirlab.org/conference_papers/International_Conference/ICASSP%202016/pdfs/0000579.pdf Pitch determination CNN & MLP
87 | 2016 unpublished Deep convolutional neural networks and data augmentation for acoustic event detection Takahashi, Naoya and Gygli, Michael and Pfister, Beat and Van Gool, Luc https://arxiv.org/pdf/1604.07160.pdf https://bitbucket.org/naoya1/aenet_release Event recognition [Acoustic Event](https://data.vision.ee.ethz.ch/cvl/ae_dataset/) CNN Mixing
88 | 2017 unpublished Gabor frames and deep scattering networks in audio processing Bammer, Roswitha and Doerfler, Monika https://arxiv.org/pdf/1706.08818.pdf
89 | 2017 inproceedings Vision-based detection of acoustic timed events: A case study on clarinet note onsets Bazzica, Alessio and Van Gemert, JC and Liem, CCS and Hanjalic, A http://dorienherremans.com/dlm2017/papers/bazzica2017clarinet.pdf Onset detection [C4S](http://mmc.tudelft.nl/users/alessio-bazzica#C4S-dataset) CNN
90 | 2017 unpublished Deep learning techniques for music generation - A survey Briot, Jean-Pierre and Hadjeres, Gaëtan and Pachet, François https://arxiv.org/pdf/1709.01620.pdf Survey & Composition [JSB Chorales](https://github.com/czhuang/JSB-Chorales-dataset) & [MusicNet](https://homes.cs.washington.edu/~thickstn/musicnet.html) & [Symbolic music data](http://users.cecs.anu.edu.au/~christian.walder/) & [LSDB](http://lsdb.flow-machines.com/) No
91 | 2017 inproceedings JamBot: Music theory aware chord based generation of polyphonic music with LSTMs Brunner, Gino and Wang, Yuyi and Wattenhofer, Roger and Wiesendanger, Jonas https://arxiv.org/pdf/1711.07682.pdf https://github.com/brunnergino/JamBot Composition [Lakh MIDI](https://labrosa.ee.columbia.edu/sounds/music/) Keras-TensorFlow RNN-LSTM No No 4 No Softmax 0.00001 Adam 1
92 | 2017 unpublished XFlow: 1D <-> 2D cross-modal deep neural networks for audiovisual classification Cangea, Cătălina and Veličković, Petar and Liò, Pietro https://arxiv.org/pdf/1709.00572.pdf
93 | 2017 inproceedings Machine listening intelligence Cella, Carmine-Emanuele http://dorienherremans.com/dlm2017/papers/cella2017mli.pdf No Manifesto No No No No No No No No No No No No
94 | 2017 inproceedings Monoaural audio source separation using deep convolutional neural networks Chandna, Pritish and Miron, Marius and Janer, Jordi and Gómez, Emilia http://mtg.upf.edu/system/files/publications/monoaural-audio-source_0.pdf https://github.com/MTG/DeepConvSep Source separation [DSD100](http://sisec17.audiolabs-erlangen.de/#/dataset) CNN
95 | 2017 inproceedings Deep multimodal network for multi-label classification Chen, Tanfang and Wang, Shangfei and Chen, Shiyu http://ieeexplore.ieee.org/abstract/document/8019322/ General audio classification
96 | 2017 unpublished A tutorial on deep learning for music information retrieval Choi, Keunwoo and Fazekas, György and Cho, Kyunghyun and Sandler, Mark Brian https://arxiv.org/pdf/1709.04396.pdf https://github.com/keunwoochoi/dl4mir General audio classification
97 | 2017 unpublished A comparison on audio signal preprocessing methods for deep neural networks on music tagging Choi, Keunwoo and Fazekas, György and Cho, Kyunghyun and Sandler, Mark Brian https://arxiv.org/pdf/1709.01922.pdf https://github.com/keunwoochoi/transfer_learning_music MGR [MSD](https://labrosa.ee.columbia.edu/millionsong/)
98 | 2017 inproceedings Transfer learning for music classification and regression tasks Choi, Keunwoo and Fazekas, György and Sandler, Mark Brian and Cho, Kyunghyun https://arxiv.org/pdf/1703.09179v3.pdf https://github.com/keunwoochoi/transfer_learning_music General audio classification [MSD](https://labrosa.ee.columbia.edu/millionsong/) No Adam
99 | 2017 inproceedings Convolutional recurrent neural networks for music classification Choi, Keunwoo and Fazekas, György and Sandler, Mark Brian and Cho, Kyunghyun http://ieeexplore.ieee.org/abstract/document/7952585/ https://github.com/keunwoochoi/icassp_2017 MGR Models & split sets only CRNN
100 | 2017 article An evaluation of convolutional neural networks for music classification using spectrograms Costa, Yandre MG and Oliveira, Luiz S and Silla, Carlos N http://www.inf.ufpr.br/lesoliveira/download/ASOC2017.pdf General audio classification [LMD](https://sites.google.com/site/carlossillajr/resources/the-latin-music-database-lmd) Caffe CNN 128 0.001 Tesla C2050
101 | 2017 unpublished Large vocabulary automatic chord estimation using deep neural nets: Design framework, system variations and limitations Deng, Junqi and Kwok, Yu-Kwong https://arxiv.org/pdf/1709.07153.pdf Chord recognition
102 | 2017 unpublished Basic filters for convolutional neural networks: Training or design? Doerfler, Monika and Grill, Thomas and Bammer, Roswitha and Flexer, Arthur https://arxiv.org/pdf/1709.02291.pdf SVD Inhouse Raw & Mel-spectrogram 0.001 Adam
103 | 2017 unpublished Ensemble Of Deep Neural Networks For Acoustic Scene Classification Duppada, Venkatesh and Hiray, Sushant https://arxiv.org/pdf/1708.05826.pdf
104 | 2017 article Robust downbeat tracking using an ensemble of convolutional networks Durand, Simon and Bello, Juan Pablo and David, Bertrand and Richard, Gaël http://ieeexplore.ieee.org/abstract/document/7728057/ Beat detection CNN
105 | 2017 inproceedings Music signal processing using vector product neural networks Fan, Zhe-Cheng and Chan, TS and Yang, Yi-Hsuan and Jang, Jyh-Shing R http://dorienherremans.com/dlm2017/papers/fan2017vector.pdf SVS [iKala](http://mac.citi.sinica.edu.tw/ikala/) VPNN & DNN
106 | 2017 inproceedings Transforming musical signals through a genre classifying convolutional neural network Geng, Shijia and Ren, Gang and Ogihara, Mitsunori http://dorienherremans.com/dlm2017/papers/geng2017genre.pdf Composition Inhouse CNN
107 | 2017 inproceedings Audio to score matching by combining phonetic and duration information Gong, Rong and Pons, Jordi and Serra, Xavier https://arxiv.org/pdf/1707.03547.pdf https://github.com/ronggong/jingjuSingingPhraseMatching/tree/v0.1.0
108 | 2017 unpublished Interactive music generation with positional constraints using anticipation-RNNs Hadjeres, Gaëtan and Nielsen, Frank https://arxiv.org/pdf/1709.06404.pdf Composition ARNN
109 | 2017 unpublished Deep rank-based transposition-invariant distances on musical sequences Hadjeres, Gaëtan and Nielsen, Frank https://arxiv.org/pdf/1709.00740.pdf
110 | 2017 unpublished GLSR-VAE: Geodesic latent space regularization for variational autoencoder architectures Hadjeres, Gaëtan and Nielsen, Frank and Pachet, François https://arxiv.org/pdf/1707.04588.pdf
111 | 2017 article Deep convolutional neural networks for predominant instrument recognition in polyphonic music Han, Yoonchang and Kim, Jaehun and Lee, Kyogu http://dl.acm.org/citation.cfm?id=3068697 Instrument recognition [IRMAS](https://www.upf.edu/web/mtg/irmas) CNN
112 | 2017 inproceedings CNN architectures for large-scale audio classification Hershey, Shawn and Chaudhuri, Sourish and Ellis, Daniel P. W. and Gemmeke, Jort F. and Jansen, Aren and Moore, R. Channing and Plakal, Manoj and Platt, Devin and Saurous, Rif A. and Seybold, Bryan and Slaney, Malcolm and Weiss, Ron J. and Wilson, Kevin https://arxiv.org/pdf/1609.09430v2.pdf CNN
113 | 2017 inproceedings DeepSheet: A sheet music generator based on deep learning Hsu, Yu-Lun and Lin, Chi-Po and Lin, Bo-Chen and Kuo, Hsu-Chan and Cheng, Wen-Huang and Hu, Min-Chun http://ieeexplore.ieee.org/abstract/document/8026272/ Composition
114 | 2017 inproceedings Talking Drums: Generating drum grooves with neural networks Hutchings, P. http://dorienherremans.com/dlm2017/papers/hutchings2017drums.pdf Composition RNN
115 | 2017 inproceedings Singing voice separation with deep U-Net convolutional networks Jansson, Andreas and Humphrey, Eric J. and Montecchio, Nicola and Bittner, Rachel and Kumar, Aparna and Weyde, Tillman https://ismir2017.smcnus.org/wp-content/uploads/2017/10/171_Paper.pdf https://github.com/Xiao-Ming/UNet-VocalSeparation-Chainer SVS No [iKala](http://mac.citi.sinica.edu.tw/ikala/) & [MedleyDB](http://medleydb.weebly.com/) CNN & U-Net No Sigmoid Adam
116 | 2017 inproceedings Music emotion recognition via end-to-end multimodal neural networks Jeon, Byungsoo and Kim, Chanju and Kim, Adrian and Kim, Dongwon and Park, Jangyeon and Ha, Jung-Woo http://ceur-ws.org/Vol-1905/recsys2017_poster18.pdf MER Inhouse CNN
117 | 2017 inproceedings Chord label personalization through deep learning of integrated harmonic interval-based representations Koops, Hendrik Vincent and De Haas, W Bas and Bransen, Jeroen and Volk, Anja http://dorienherremans.com/dlm2017/papers/koops2017pers.pdf
118 | 2017 unpublished End-to-end musical key estimation using a convolutional neural network Korzeniowski, Filip and Widmer, Gerhard https://arxiv.org/pdf/1706.02921.pdf
119 | 2017 inproceedings MediaEval 2017 AcousticBrainz genre task: Multilayer perceptron approach Koutini, Khaled and Imenina, Alina and Dorfer, Matthias and Gruber, Alexander Rudolf and Schedl, Markus http://www.cp.jku.at/research/papers/Koutini_2017_mediaeval-acousticbrainz.pdf
120 | 2017 unpublished Classification-based singing melody extraction using deep convolutional neural networks Kum, Sangeun and Nam, Juhan https://www.preprints.org/manuscript/201711.0027/v1 No F0 No [LabROSA](http://labrosa.ee.columbia.edu/projects/melody/) & [MedleyDB](http://medleydb.weebly.com/) & [Jamendo](http://www.mathieuramona.com/wp/data/jamendo/) & [RWC](https://staff.aist.go.jp/m.goto/RWC-MDB/) & [iKala](http://mac.citi.sinica.edu.tw/ikala/) & [MIR-1K](https://sites.google.com/site/unvoicedsoundseparation/mir-1k) & [ADC2004](http://labrosa.ee.columbia.edu/projects/melody/) Keras CNN 0.3 No 100 Pitch shift -2, -1, +1, +2 semitones Leaky ReLU 0.02 SGD 2
121 | 2017 article Multi-level and multi-scale feature aggregation using pre-trained convolutional neural networks for music auto-tagging Lee, Jongpil and Nam, Juhan https://arxiv.org/pdf/1703.01793v2.pdf
122 | 2017 unpublished Multi-level and multi-scale feature aggregation using sample-level deep convolutional neural networks for music classification Lee, Jongpil and Nam, Juhan https://arxiv.org/pdf/1706.06810.pdf https://github.com/jongpillee/musicTagging_MSD [MSD](https://labrosa.ee.columbia.edu/millionsong/)
123 | 2017 inproceedings Sample-level deep convolutional neural networks for music auto-tagging using raw waveforms Lee, Jongpil and Park, Jiyoung and Kim, Keunhyoung Luke and Nam, Juhan https://arxiv.org/pdf/1703.01789v2.pdf
124 | 2017 unpublished A SeqGAN for Polyphonic Music Generation Lee, Sang-gil and Hwang, Uiwon and Min, Seonwoo and Yoon, Sungroh https://arxiv.org/pdf/1710.11418.pdf https://github.com/L0SG/seqgan-music Polyphonic music sequence modelling No [Nottingham dataset](http://abc.sourceforge.net/NMD/) Tensorflow SeqGAN No No 100 No MIDI 1D No No 0.01 & 0.001 & 0.0001 No No
125 | 2017 inproceedings Harmonic and percussive source separation using a convolutional auto encoder Lim, Wootaek and Lee, Taejin http://www.eurasip.org/Proceedings/Eusipco/Eusipco2017/papers/1570346835.pdf Source separation [DSD100](http://sisec17.audiolabs-erlangen.de/#/dataset) CNN
126 | 2017 unpublished Stacked convolutional and recurrent neural networks for music emotion recognition Malik, Miroslav and Adavanne, Sharath and Drossos, Konstantinos and Virtanen, Tuomas and Ticha, Dasa and Jarina, Roman https://arxiv.org/pdf/1706.02292.pdf No MER [Free music archive](http://freemusicarchive.org/) & [MedleyDB](http://medleydb.weebly.com/) & [Jamendo](http://www.mathieuramona.com/wp/data/jamendo/) Keras-Theano CRNN No RMSE Adam
127 | 2017 mastersthesis A deep learning approach to source separation and remixing of hiphop music Martel Baro, Héctor https://repositori.upf.edu/bitstream/handle/10230/32919/Martel_2017.pdf?sequence=1&isAllowed=y Source separation & Remixing [DSD100](http://sisec17.audiolabs-erlangen.de/#/dataset) & [HHDS](https://drive.google.com/drive/folders/0B1zpiGdDzFNlbmJyYU1VVFR3OEE) DNN & CNN & RNN Mixing & Circular Shift & Instrument augmentation
128 | 2017 inproceedings Music Genre Classification Using Masked Conditional Neural Networks Medhat, Fady and Chesmore, David and Robinson, John https://link.springer.com/chapter/10.1007%2F978-3-319-70096-0_49 No MGR [Ballroom](http://mtg.upf.edu/ismir2004/contest/tempoContest/node5.html) & [Homburg](http://www-ai.cs.uni-dortmund.de/audio.html) Not disclosed MCLNN & CLNN No No No No No Adam No
129 | 2017 unpublished Monaural Singing Voice Separation with Skip-Filtering Connections and Recurrent Inference of Time-Frequency Mask Mimilakis, Stylianos Ioannis and Drossos, Konstantinos and Santos, João Felipe and Schuller, Gerald and Virtanen, Tuomas and Bengio, Yoshua https://arxiv.org/pdf/1711.01437.pdf https://github.com/Js-Mim/mss_pytorch SVS [MedleyDB](http://medleydb.weebly.com/) & [DSD100](http://sisec17.audiolabs-erlangen.de/#/dataset) PyTorch RNN & DNN 16 100 No ReLU 0.0001 Adam No
130 | 2017 inproceedings Generating data to train convolutional neural networks for classical music source separation Miron, Marius and Janer Mestres, Jordi and Gómez Gutiérrez, Emilia https://www.researchgate.net/profile/Marius_Miron/publication/318322107_Generating_data_to_train_convolutional_neural_networks_for_classical_music_source_separation/links/59637cc3458515a3575b93c6/Generating-data-to-train-convolutional-neural-networks-for-classical-music-source-separation.pdf https://github.com/MTG/DeepConvSep Source separation [RWC](https://staff.aist.go.jp/m.goto/RWC-MDB/) & [Bach10](http://music.cs.northwestern.edu/data/Bach10.html) CNN 32
131 | 2017 inproceedings Monaural score-informed source separation for classical music using convolutional neural networks Miron, Marius and Janer, Jordi and Gómez, Emilia https://www.researchgate.net/profile/Marius_Miron/publication/318637038_Monaural_score-informed_source_separation_for_classical_music_using_convolutional_neural_networks/links/597327c6458515e26dfdb007/Monaural-score-informed-source-separation-for-classical-music-using-convolutional-neural-networks.pdf https://github.com/MTG/DeepConvSep Source separation [Bach10](http://music.cs.northwestern.edu/data/Bach10.html) CNN
132 | 2017 inproceedings Multi-label music genre classification from audio, text, and images using deep features Oramas, Sergio and Nieto, Oriol and Barbieri, Francesco and Serra, Xavier https://ismir2017.smcnus.org/wp-content/uploads/2017/10/126_Paper.pdf https://github.com/sergiooramas/tartarus MGR [MSD](https://labrosa.ee.columbia.edu/millionsong/) CNN No
133 | 2017 inproceedings A deep multimodal approach for cold-start music recommendation Oramas, Sergio and Nieto, Oriol and Sordo, Mohamed and Serra, Xavier https://arxiv.org/pdf/1706.09739.pdf https://github.com/sergiooramas/tartarus Recommendation [MSD](https://labrosa.ee.columbia.edu/millionsong/) CNN 32
134 | 2017 inproceedings Melody extraction and detection through LSTM-RNN with harmonic sum loss Park, Hyunsin and Yoo, Chang D http://ieeexplore.ieee.org/abstract/document/7952660/ Artist recognition & MGR No
135 | 2017 unpublished Representation learning of music using artist labels Park, Jiyoung and Lee, Jongpil and Park, Jangyeon and Ha, Jung-Woo and Nam, Juhan https://arxiv.org/pdf/1710.06648.pdf [MSD](https://labrosa.ee.columbia.edu/millionsong/) & [GTzan](http://marsyas.info/downloads/datasets.html) & [Magnatagatune](http://mirg.city.ac.uk/codeapps/the-magnatagatune-dataset) CNN
136 | 2017 inproceedings Toward inverse control of physics-based sound synthesis Pfalz, A and Berdahl, E http://dorienherremans.com/dlm2017/papers/pfalz2017synthesis.pdf https://www.cct.lsu.edu/~apfalz/inverse_control.html
137 | 2017 techreport DNN and CNN with weighted and multi-task loss functions for audio event detection Phan, Huy and Krawczyk-Becker, Martin and Gerkmann, Timo and Mertins, Alfred https://arxiv.org/pdf/1708.03211.pdf Event recognition CNN & DNN
138 | 2017 inproceedings Score-informed syllable segmentation for a cappella singing voice with convolutional neural networks Pons, Jordi and Gong, Rong and Serra, Xavier https://ismir2017.smcnus.org/wp-content/uploads/2017/10/46_Paper.pdf https://github.com/ronggong/jingjuSyllabicSegmentaion/tree/v0.1.0 Syllable segmentation 128 Adam
139 | 2017 inproceedings End-to-end learning for music audio tagging at scale Pons, Jordi and Nieto, Oriol and Prockup, Matthew and Schmidt, Erik M. and Ehmann, Andreas F. and Serra, Xavier https://arxiv.org/pdf/1711.02520.pdf https://github.com/jordipons/music-audio-tagging-at-scale-models General audio classification Inhouse Tensorflow CNN 16 No No ReLU 0.001 Adam No
140 | 2017 inproceedings Designing efficient architectures for modeling temporal features with convolutional neural networks Pons, Jordi and Serra, Xavier http://ieeexplore.ieee.org/document/7952601/ https://github.com/jordipons/ICASSP2017 MGR [Ballroom](http://mtg.upf.edu/ismir2004/contest/tempoContest/node5.html) CNN
141 | 2017 inproceedings Timbre analysis of music audio signals with convolutional neural networks Pons, Jordi and Slizovskaia, Olga and Gong, Rong and Gómez, Emilia and Serra, Xavier https://github.com/ronggong/EUSIPCO2017 https://github.com/jordipons/EUSIPCO2017 CNN
142 | 2017 inproceedings Deep learning and intelligent audio mixing Ramírez, Marco A. Martínez and Reiss, Joshua D. http://www.semanticaudio.co.uk/wp-content/uploads/2017/09/WIMP2017_Martinez-RamirezReiss.pdf No Mixing [Open Multitrack Testbed](http://www.semanticaudio.co.uk/projects/omtb/) DAE No Adam
143 | 2017 phdthesis Deep learning for event detection, sequence labelling and similarity estimation in music signals Schlüter, Jan http://ofai.at/~jan.schlueter/pubs/phd/phd.pdf
144 | 2017 inproceedings Music feature maps with convolutional neural networks for music genre classification Senac, Christine and Pellegrini, Thomas and Mouret, Florian and Pinquier, Julien https://www.researchgate.net/profile/Thomas_Pellegrini/publication/319326354_Music_Feature_Maps_with_Convolutional_Neural_Networks_for_Music_Genre_Classification/links/59ba5ae3458515bb9c4c6724/Music-Feature-Maps-with-Convolutional-Neural-Networks-for-Music-Genre-Classification.pdf MGR [GTzan](http://marsyas.info/downloads/datasets.html) CNN Spectrograms & common audio features
145 | 2017 inproceedings Automatic drum transcription for polyphonic recordings using soft attention mechanisms and convolutional neural networks Southall, Carl and Stables, Ryan and Hockman, Jason https://carlsouthall.files.wordpress.com/2017/12/ismir2017adt.pdf https://github.com/CarlSouthall/ADTLib Transcription [IDMT-SMT-Drums](https://www.idmt.fraunhofer.de/en/business_units/m2d/smt/drums.html) CNN & BRNN
146 | 2017 unpublished Adversarial semi-supervised audio source separation applied to singing voice extraction Stoller, Daniel and Ewert, Sebastian and Dixon, Simon https://arxiv.org/pdf/1711.00048.pdf No Source separation No [iKala](http://mac.citi.sinica.edu.tw/ikala/) & [MedleyDB](http://medleydb.weebly.com/) & [DSD100](http://sisec17.audiolabs-erlangen.de/#/dataset) & [CCMixter](https://members.loria.fr/ALiutkus/kam/) CNN & U-Net No 6 No ReLU & Leaky ReLU Adam NVIDIA GTX1080
147 | 2017 article Taking the models back to music practice: Evaluating generative transcription models built using deep learning Sturm, Bob L. and Ben-Tal, Oded http://jcms.org.uk/issues/Vol2Issue1/taking-models-back-to-music-practice/Taking%20the%20Models%20back%20to%20Music%20Practice:%20Evaluating%20Generative%20Transcription%20Models%20built%20using%20Deep%20Learning.pdf https://github.com/IraKorshunova/folk-rnn Composition
148 | 2017 inproceedings Generating nontrivial melodies for music as a service Teng, Yifei and Zhao, An and Goudeseune, Camille https://ismir2017.smcnus.org/wp-content/uploads/2017/10/178_Paper.pdf Composition
149 | 2017 unpublished Invariances and data augmentation for supervised music transcription Thickstun, John and Harchaoui, Zaid and Foster, Dean and Kakade, Sham M. https://arxiv.org/pdf/1711.04845.pdf https://github.com/jthickstun/thickstun2018invariances/ Transcription [MusicNet](https://homes.cs.washington.edu/~thickstn/musicnet.html) Tensorflow CNN 150 No Pitch shift integer in [-5, 5] semitones and continuous in [-0.1, 0.1] No No No 1 NVIDIA 1080Ti
150 | 2017 inproceedings Lyrics-based music genre classification using a hierarchical attention network Tsaptsinos, Alexandros https://ismir2017.smcnus.org/wp-content/uploads/2017/10/43_Paper.pdf https://github.com/alexTsaptsinos/lyricsHAN MGR [LyricFind](http://lyricfind.com/) Tensorflow HAN cross-entropy
151 | 2017 unpublished A hybrid DSP/deep learning approach to real-time full-band speech enhancement Valin, Jean-Marc https://arxiv.org/pdf/1709.08243.pdf https://github.com/xiph/rnnoise/ Noise suppression [TSP](http://www-mmsp.ece.mcgill.ca/Documents/Data/) & [NTT MLS](http://www.ntt-at.com/product/speech/) RNN No BFCC (22), 1st and 2nd derivatives of first 6 BFCCs, 6 coefficients of DCT of pitch correlation, pitch period, spectral non-stationary metric Custom
152 | 2017 phdthesis Convolutional methods for music analysis Velarde, Gissel http://vbn.aau.dk/files/260308151/PHD_Gissel_Velarde_E_pdf.pdf
153 | 2017 inproceedings Extending temporal feature integration for semantic audio analysis Vrysis, Lazaros and Tsipas, Nikolaos and Dimoulas, Charalampos and Papanikolaou, George http://www.aes.org/e-lib/browse.cfm?elib=18682 ANN
154 | 2017 inproceedings Recognition and retrieval of sound events using sparse coding convolutional neural network Wang, Chien-Yao and Santoso, Andri and Mathulaprangsan, Seksan and Chiang, Chin-Chin and Wu, Chung-Hsien and Wang, Jia-Ching http://ieeexplore.ieee.org/abstract/document/8019552/ Event recognition CNN
155 | 2017 article A two-stage approach to note-level transcription of a specific piano Wang, Qi and Zhou, Ruohua and Yan, Yonghong http://www.mdpi.com/2076-3417/7/9/901/htm Transcription
156 | 2017 unpublished Reducing model complexity for DNN based large-scale audio classification Wu, Yuzhong and Lee, Tan https://arxiv.org/pdf/1711.00229.pdf No General audio classification No [AudioSet](https://research.google.com/audioset/index.html) & [TUT Acoustic Scenes 2016](http://www.cs.tut.fi/~mesaros/pubs/mesaros_eusipco2016-dcase.pdf) CNN & RNN & MLP & AlexNet & ResNet No ReLU Adam
157 | 2017 inproceedings Audio spectrogram representations for processing with convolutional neural networks Wyse, Lonce http://dorienherremans.com/dlm2017/papers/wyse2017spect.pdf http://lonce.org/research/audioST/ Review & Comparison CNN
158 | 2017 article Unsupervised feature learning based on deep models for environmental audio tagging Xu, Yong and Huang, Qiang and Wang, Wenwu and Foster, Peter and Sigtia, Siddharth and Jackson, Philip J. B. and Plumbley, Mark D. https://arxiv.org/pdf/1607.03681.pdf
159 | 2017 inproceedings Attention and localization based on a deep convolutional recurrent model for weakly supervised audio tagging Xu, Yong and Kong, Qiuqiang and Huang, Qiang and Wang, Wenwu and Plumbley, Mark D. https://arxiv.org/pdf/1703.06052.pdf https://github.com/yongxuUSTC/att_loc_cgrnn DCASE 2016 Task 4 Domestic audio tagging CRNN
160 | 2017 techreport Surrey-CVSSP system for DCASE2017 challenge task4 Xu, Yong and Kong, Qiuqiang and Wang, Wenwu and Plumbley, Mark D. https://www.cs.tut.fi/sgn/arg/dcase2017/documents/challenge_technical_reports/DCASE2017_Xu_146.pdf https://github.com/yongxuUSTC/dcase2017_task4_cvssp Event recognition
161 | 2017 inproceedings A study on LSTM networks for polyphonic music sequence modelling Ycart, Adrien and Benetos, Emmanouil https://qmro.qmul.ac.uk/xmlui/handle/123456789/24946 http://www.eecs.qmul.ac.uk/~ay304/code/ismir17 Polyphonic music sequence modelling Inhouse & [Piano-midi.de](http://www.piano-midi.de/) RNN-LSTM Pitch shift
162 | 2018 inproceedings MuseGAN: Multi-track sequential generative adversarial networks for symbolic music generation and accompaniment Dong, Hao-Wen and Hsiao, Wen-Yi and Yang, Li-Chia and Yang, Yi-Hsuan https://arxiv.org/pdf/1709.06298.pdf https://github.com/salu133445/musegan Composition No [Lakh Pianoroll Dataset](https://github.com/salu133445/musegan/blob/master/docs/dataset.md) No GAN & CNN No No No No Piano-roll 1D ReLU & Leaky ReLU No No Adam 1 Tesla K40m
163 | 2018 unpublished Music transformer: Generating music with long-term structure Huang, Cheng-Zhi Anna and Vaswani, Ashish and Uszkoreit, Jakob and Shazeer, Noam and Simon, Ian and Hawthorne, Curtis and Dai, Andrew M. and Hoffman, Matthew D. and Dinculescu, Monica and Eck, Douglas https://arxiv.org/pdf/1809.04281.pdf Polyphonic music sequence modelling No [J.S. Bach chorales dataset](https://github.com/czhuang/JSB-Chorales-dataset) & [Piano-e-Competition dataset (competition history)](http://www.piano-e-competition.com/) No Transformer & RNN & tensor2tensor 0.1 1 No Time stretches & pitch transposition MIDI 1D No No 0.1 No No
164 | 2018 inproceedings Music theory inspired policy gradient method for piano music transcription Li, Juncheng and Qu, Shuhui and Wang, Yun and Li, Xinjian and Das, Samarjit and Metze, Florian https://nips2018creativity.github.io/doc/music_theory_inspired_policy_gradient.pdf No Transcription No [MAPS](http://www.tsi.telecom-paristech.fr/aao/en/2010/07/08/maps-database-a-piano-database-for-multipitch-estimation-and-automatic-transcription-of-music/) CNN & RNN No 8 No No Log Mel-spectrogram with 48 bins per octave and 512 hop-size and 2048 window size and 16 kHz sample rate 2D No binary cross-entropy 0.0006 Adam No
165 | 2019 inproceedings Enabling factorized piano music modeling and generation with the MAESTRO dataset Hawthorne, Curtis and Stasyuk, Andriy and Roberts, Adam and Simon, Ian and Huang, Cheng-Zhi Anna and Dieleman, Sander and Elsen, Erich and Engel, Jesse and Eck, Douglas https://arxiv.org/abs/1810.12247 No Transcription No [MAESTRO](https://magenta.tensorflow.org/datasets/maestro/) No No No No No pitch-shift {+-0.1 semitones} & compression {0 - 100} & EQ {32 - 4096} & Reverb {0 - 70} & Pink-noise {0 - 0.04} MIDI 1D No No No No No
166 | 2019 unpublished Generating Long Sequences with Sparse Transformers Child, Rewon and Gray, Scott and Radford, Alec and Sutskever, Ilya https://arxiv.org/pdf/1904.10509.pdf https://github.com/openai/sparse_attention Audio generation No [413 hours of recorded solo piano music](http://papers.nips.cc/paper/8023-the-challenge-of-realistic-music-generation-modelling-raw-audio-at-scale-supplemental.zip) Tensorflow Transformer 0.25 No 120 No Raw Audio 1D No No 0.00035 Adam 8 NVIDIA Tesla V100
167 | 
--------------------------------------------------------------------------------
/download.py:
--------------------------------------------------------------------------------
1 | import argparse
2 | import os
3 | import re
4 | import multiprocessing as mp
5 | 
6 | from tqdm import tqdm
7 | 
8 | from client import Downloader
9 | from dl4m import *
10 | 
11 | 
12 | def main(args):
13 |     bib = load_bib(args.filename)
14 | 
15 |     print('Generating files list ...')
16 |     data = []
17 |     for item in tqdm(bib):
18 |         year = item['year']
19 |         title = item['title'].replace(' ', '_')
20 |         auth1 = re.split(' |,', item['author'])[0]
21 |         item_str = '{:s}_{:s}_{:s}'.format(year, auth1, title)
22 |         item_str = item_str[:40]
23 | 
24 |         if 'link' not in item:
25 |             tqdm.write('Skip item ' + item_str + ': No link is provided')
26 |             continue
27 | 
28 |         link = item['link']
29 |         if not link.endswith('.pdf'):
30 |             tqdm.write('Skip item ' + item_str + ': The link is not a pdf')
31 |             continue
32 | 
33 |         filename = item_str + '.pdf'
34 |         item_dict = dict(name=filename, link=link, year=year, title=title, auth1=auth1)
35 |         data.append(item_dict)
36 |     print('')
37 |     print('File list is now generated')
38 |     print('')
39 | 
40 |     # Create line format to print pretty
41 |     max_len = 0
42 |     for paper in data:
43 |         max_len_new = len(paper['title'])
44 |         if max_len < max_len_new:
45 |             max_len = max_len_new
46 |     line_format = \
47 |         '{:4s}' + \
48 |         ' ' + \
49 |         '{:' + str(max_len) + 's}' + \
50 |         ' ' + \
51 |         '{:s}'
52 | 
53 |     # Print the results
54 |     print(line_format.format('YEAR', 'TITLE', '1st AUTHOR'))
55 |     for paper in data:
56 |         print(line_format.format(paper['year'], paper['title'], paper['auth1']))
57 | 
58 |     # Search website for more information
59 |     print('')
60 |     print('Download papers from online ...')
61 |     if not os.path.isdir(args.dirname):
62 |         os.mkdir(args.dirname)
63 |     downloader = Downloader(args.dirname)
64 | 
65 |     if args.use_single_thread:
66 |         # Single-threaded downloader
67 |         for paper in tqdm(data):
68 |             downloader.client(paper)
69 |     else:
70 |         # Multi-threaded downloader
71 |         pbar = tqdm(total=len(data))
72 |         def update_pbar(*args):
73 |             pbar.update()
74 | 
75 |         pool = mp.Pool(mp.cpu_count())
76 |         for i in range(len(data)):
77 |             pool.apply_async(downloader.client, args=(data[i],), callback=update_pbar)
78 |         pool.close()
79 |         pool.join()
80 |     print('')
81 |     print('Download is now complete')
82 | 
83 | 
84 | if __name__ == '__main__':
85 |     parser = argparse.ArgumentParser(description='Paper downloader')
86 |     parser.add_argument('--filename', '-f', type=str, default='dl4m.bib',
87 |                         help='search database')
88 |     parser.add_argument('--dirname', '-d', type=str, default='downloads',
89 |                         help='download directory')
90 |     parser.add_argument('--use-single-thread', '-s', action='store_true',
91 |                         help='use single thread for downloading')
92 |     args = parser.parse_args()
93 | 
94 |     main(args)
95 | 
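A note on running download.py (a minimal sketch, not a file from the repository: the command-line flags come straight from the argparse block above, and the Downloader constructor and client() call mirror their use in main(); the example paper values are hypothetical):

    # Fetch every PDF linked from dl4m.bib into ./downloads:
    #   python download.py -f dl4m.bib -d downloads
    # (add -s to stay single-threaded)
    # The same Downloader can also be driven programmatically with a dict
    # shaped like the ones main() builds:
    import os

    from client import Downloader

    os.makedirs('downloads', exist_ok=True)  # main() guards os.mkdir the same way
    paper = dict(name='2017_Pons_End-to-end_learning.pdf',  # hypothetical output filename
                 link='https://arxiv.org/pdf/1711.02520.pdf',
                 year='2017',
                 title='End-to-end_learning_for_music_audio_tagging_at_scale',
                 auth1='Pons')
    Downloader('downloads').client(paper)  # same call main() dispatches per paper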
--------------------------------------------------------------------------------
/fig/articles_per_year.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ashishpatel26/Best-Audio-Classification-Resources-with-Deep-learning/c4e0ee3388b7407238fdfe00b6c8e7d31b4c98e3/fig/articles_per_year.png
--------------------------------------------------------------------------------
/fig/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ashishpatel26/Best-Audio-Classification-Resources-with-Deep-learning/c4e0ee3388b7407238fdfe00b6c8e7d31b4c98e3/fig/logo.png
--------------------------------------------------------------------------------
/fig/pie_chart_architecture.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ashishpatel26/Best-Audio-Classification-Resources-with-Deep-learning/c4e0ee3388b7407238fdfe00b6c8e7d31b4c98e3/fig/pie_chart_architecture.png
--------------------------------------------------------------------------------
/fig/pie_chart_dataset.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ashishpatel26/Best-Audio-Classification-Resources-with-Deep-learning/c4e0ee3388b7407238fdfe00b6c8e7d31b4c98e3/fig/pie_chart_dataset.png
--------------------------------------------------------------------------------
/fig/pie_chart_framework.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ashishpatel26/Best-Audio-Classification-Resources-with-Deep-learning/c4e0ee3388b7407238fdfe00b6c8e7d31b4c98e3/fig/pie_chart_framework.png
--------------------------------------------------------------------------------
/fig/pie_chart_task.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/ashishpatel26/Best-Audio-Classification-Resources-with-Deep-learning/c4e0ee3388b7407238fdfe00b6c8e7d31b4c98e3/fig/pie_chart_task.png
--------------------------------------------------------------------------------
/frameworks.md:
--------------------------------------------------------------------------------
1 | # List of frameworks
2 | 
3 | Please refer to the list of useful acronyms used in deep learning and music: [acronyms.md](acronyms.md).
4 | 
5 | - Caffe
6 | - Keras
7 | - Keras-TensorFlow
8 | - Keras-Theano
9 | - No
10 | - Not disclosed
11 | - PyTorch
12 | - Tensorflow
13 | - Theano
--------------------------------------------------------------------------------
/publication_type.md:
--------------------------------------------------------------------------------
1 | # List of publication types
2 | 
3 | ### Journals:
4 | 
5 | - [Applied Sciences](http://www.mdpi.com/journal/applsci)
6 | - [Applied Soft Computing](https://www.journals.elsevier.com/applied-soft-computing/)
7 | - [Circuits and Systems](http://www.scirp.org/journal/cs/)
8 | - [Computer Music Journal](http://computermusicjournal.org/)
9 | - [Connection Science](http://www.tandfonline.com/toc/ccos20/current)
10 | - [IEEE Signal Processing Letters](https://signalprocessingsociety.org/publications-resources/ieee-signal-processing-letters/ieee-signal-processing-letters)
11 | - [IEEE Transactions on Multimedia](https://signalprocessingsociety.org/publications-resources/ieee-transactions-multimedia)
12 | - [IEEE/ACM Transactions on Audio, Speech, and Language Processing](https://signalprocessingsociety.org/publications-resources/ieeeacm-transactions-audio-speech-and-language-processing/ieeeacm)
13 | - [International Journal of Applied Mathematics and Computer Science](https://www.amcs.uz.zgora.pl/)
14 | - [International Journal of Electronics and Telecommunications](http://ijet.pl/index.php/ijet)
15 | - [Journal of Creative Music Systems](http://jcms.org.uk/)
16 | 
17 | ### Conferences:
18 | 
19 | - AAAI
20 | - ACM_MM
21 | - Audio Engineering Society Convention
22 | - Biennial Symposium for Arts and Technology
23 | - CBMI
24 | - CSMC
25 | - Connectionist Models Summer School
26 | - Convention of Electrical and Electronics Engineers
27 | - DLRS
28 | - EUSIPCO
29 | - ICASSP
30 | - ICLR
31 | - ICMC
32 | - ICME
33 | - ICML
34 | - ICMLA
35 | - ICMR
36 | - ICONIP
37 | - IEEE_FedCSIS
38 | - IEEE_ICME
39 | - IEEE_ICNN
40 | - IEEE_MLSP
41 | - INTERSPEECH
42 | - ISMA
43 | - ISMIR
44 | - IWDLM
45 | - Int. Conf. Data Mining and Applications
46 | - International Conference on Latent Variable Analysis and Signal Separation (LVA ICA)
47 | - International Workshop on Machine Learning and Music
48 | - MIREX
49 | - MediaEval
50 | - Music and Artificial Intelligence. Additional Proceedings of the Second International Conference, ICMAI
51 | - Proceedings of the First Workshop on Artificial Intelligence and Music
52 | - RECSYS
53 | - SMC
54 | - WASPAA
55 | - WIMP
56 | - [IEEE_ICTAI](http://ictai2017.org/)
57 | - [NIPS](https://nips.cc/)
58 | - [NIPS_ML4Audio](https://nips.cc/Conferences/2017/Schedule?showEvent=8790)
59 | - [NNSP](http://cogsys.imm.dtu.dk/nnsp2002/)
--------------------------------------------------------------------------------
/reproducibility.md:
--------------------------------------------------------------------------------
1 | # Reproducibility and replicability
2 | 
3 | Out of the 159 papers listed in this repository, only 41 provide their source code.
4 | Repeatability is the key to good science.
5 | Below is a list of useful links for reproducibility and replicability in science:
6 | 
7 | - [DLPaper2Code: Auto-generation of Code from Deep Learning Research Papers](https://arxiv.org/abs/1711.03543)
8 | - [Audio Processing Seminar 2016: Reproducible Research](https://github.com/audiolabs/APSRR-2016)
9 | - [Reproducible Research in Signal Processing - What, why, and how](https://infoscience.epfl.ch/record/136640)
10 | - [List of Reproducible Audio Research Papers](https://github.com/faroit/reproducible-audio-research)
11 | - [ReScience initiative](https://rescience.github.io/)
12 | - [Workshop on reproducibility and replication in ML](https://sites.google.com/view/icml-reproducibility-workshop/home)
13 | - [ICLR 2018 Reproducibility Challenge](http://www.cs.mcgill.ca/~jpineau/ICLR2018-ReproducibilityChallenge.html)
14 | - [Reproducibility in Machine Learning-Based Studies: An Example of Text Mining](https://openreview.net/pdf?id=By4l2PbQ-)
15 | - [Nature's insights on reproducibility](http://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970?WT.mc_id=FBK_NatureNews)
16 | - [The titans of AI are getting their work double-checked by students](https://qz.com/1118671/the-titans-of-ai-are-getting-their-work-double-checked-by-students/)
17 | - [Anonymous GitHub for double-blind peer-review](https://github.com/tdurieux/anonymous_github/)
18 | - [Automated collection and reporting of citations for used software/methods/datasets](https://github.com/duecredit/duecredit)
19 | - [EvalAI](https://github.com/Cloud-CV/EvalAI) - Evaluating state of the art in AI
20 | - [Open Science List](https://github.com/INRIA/awesome-open-science-software) - Awesome open list of pointers about open science for software and computational science
21 | - [Twitter post](https://twitter.com/goodfellow_ian/status/935886522607271937) - Advice from Ian Goodfellow
22 | - [Dissemin](https://dissem.in/) - Open Access facilitator, announced [on Twitter](https://twitter.com/disseminOA/status/923469525827440641)
23 | - [Do software and data products advance biology more than papers?](http://ivory.idyll.org/blog/2018-software-and-data-better-than-papers.html)
24 | - [Before reproducibility must come preproducibility](https://www.nature.com/articles/d41586-018-05256-0?utm_source=twt_nnc&utm_medium=social&utm_campaign=naturenews&sf190281917=1) - Science should be ‘show me’, not ‘trust me’; it should be ‘help me if you can’, not ‘catch me if you can’.
25 | 
--------------------------------------------------------------------------------
/sources.md:
--------------------------------------------------------------------------------
1 | # Sources
2 | 
3 | The list of websites used to gather the proposed materials:
4 | 
5 | - [Journals](#journals)
6 | - [Conferences](#conferences)
7 | - [Workshops](#workshops)
8 | - [Aggregators](#aggregators)
9 | 
10 | ## Journals
11 | 
12 | - [Applied Sciences Special Issue on Sound and Music Computing](http://www.mdpi.com/journal/applsci/special_issues/Music_Computing)
13 | - [CMJ](https://www.jstor.org/journal/computermusicj) - Computer Music Journal
14 | - [JCMS](http://jcms.org.uk/) - Journal of Creative Music Systems
15 | 
16 | ## Conferences
17 | 
18 | - [AAAI](https://aaai.org/Conferences/AAAI-18/) - Association for the Advancement of Artificial Intelligence Conference on Artificial Intelligence
19 | - [CSMC](https://csmc2016.wordpress.com/) - Conference on Computer Simulation of Musical Creativity
20 | - [ICASSP](http://ieeexplore.ieee.org/xpl/conhome.jsp?punumber=1000002) - IEEE International Conference on Acoustics, Speech, and Signal Processing
21 | - [ICMC](http://www.computermusic.org/) - International Computer Music Conference
22 | - [ICML](https://2017.icml.cc/) - International Conference on Machine Learning
23 | - [INTERSPEECH](http://www.isca-speech.org/iscaweb/)
24 | - [ISMIR](http://ismir.net/) - International Society for Music Information Retrieval
25 | - [LVA/ICA](http://lva-ica-2017.com/index.php/aim) - International Conference on Latent Variable Analysis and Signal Separation
26 | - [MIREX](http://music-ir.org/mirex/wiki/MIREX_HOME) - Music Information Retrieval Evaluation eXchange
27 | - [RECSYS](https://recsys.acm.org/) - ACM Recommender Systems conference
28 | - [SMC](http://www.smc-conference.org/) - Sound and Music Computing Conference
29 | 
30 | ## Workshops
31 | 
32 | - [DLRS](http://dlrs-workshop.org/) - Workshop on Deep Learning for Recommender Systems
33 | - [ESI](http://www.univie.ac.at/nuhag-php/event_NEW/make.php?event=esi17&page=program)
34 | - [ML4Audio](https://nips.cc/Conferences/2017/Schedule?showEvent=8790) - Machine Learning for Audio Signal Processing at [NIPS](https://nips.cc/)
35 | - [IWDLM](http://dorienherremans.com/dlm2017/) - International Workshop on Deep Learning for Music
36 | 
37 | ## Aggregators
38 | 
39 | - [arXiv](https://arxiv.org/) (cs.[AI, CV, IR, LG, MM, NE, SD] and eess.[AS, SP])
40 | - [PaperScape](http://paperscape.org/)
41 | - [Scholar](https://scholar.google.com/scholar?q=neural+network+audio+music)
42 | - [SciRate](https://scirate.com/)
--------------------------------------------------------------------------------
/tasks.md:
--------------------------------------------------------------------------------
1 | # List of tasks
2 | 
3 | Please refer to the list of useful acronyms used in deep learning and music: [acronyms.md](acronyms.md).
4 | 
5 | - Artist recognition
6 | - Audio generation
7 | - Beat detection
8 | - Boundary detection
9 | - Chord recognition
10 | - Comparison
11 | - Composition
12 | - DCASE 2016 Task 4 Domestic audio tagging
13 | - Event recognition
14 | - F0
15 | - General audio classification
16 | - Instrument recognition
17 | - MER
18 | - MGR
19 | - MSR
20 | - Manifesto
21 | - Mixing
22 | - Music/Noise segmentation
23 | - Noise suppression
24 | - Onset detection
25 | - Pitch determination
26 | - Playlist generation
27 | - Polyphonic music sequence modelling
28 | - Recommendation
29 | - Remixing
30 | - Review
31 | - SVD
32 | - SVS
33 | - Source separation
34 | - Speaker gender recognition
35 | - Survey
36 | - Syllable segmentation
37 | - Transcription
38 | - VAD
39 | 
--------------------------------------------------------------------------------