├── .gitignore ├── COPYRIGHT.txt ├── LICENSE.md ├── README.md ├── citations └── DALI_v1.0.bib ├── code ├── DALI │ ├── Annotations.py │ ├── __init__.py │ ├── download.py │ ├── extra.py │ ├── files │ │ ├── dali_v1_metadata.json │ │ └── template.xml │ ├── main.py │ ├── utilities.py │ └── vizualization.py ├── LICENSE ├── MANIFEST.in ├── README.md └── setup.py ├── docs └── images │ ├── Example.png │ ├── graphs.key │ ├── horizontal.png │ ├── l1.png │ ├── p1.png │ ├── vertical.png │ └── w1.png └── versions ├── README.md └── v1 ├── gt_v1.0_22:11:18.gz └── v1.0.md /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | *.pyc 3 | /code/DALI/cross-correlation.py 4 | /code/DALI/my_helper.py 5 | /code/DALI/ground_truth.py 6 | /code/DALI/exp_alginment.py 7 | /code/DALI/get_audio.py 8 | /code/DALI/melody.py 9 | /images/graphs.key 10 | Icon? 11 | -------------------------------------------------------------------------------- /COPYRIGHT.txt: -------------------------------------------------------------------------------- 1 | Copyright © Ircam 2018 2 | DALI by Gabriel Meseguer-Brocal, Alice Cohen-Hadrian and Peeters Geoffroy. 3 | DALI is offered free of charge for non-commercial research use only under the terms of the Creative Commons Attribution Noncommercial License: http://creativecommons.org/licenses/by-nc-sa/4.0/ 4 | The DALI is provided for educational purposes only and the material contained in them should not be used for any commercial purpose without the express permission of the copyright holders. 5 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | [horizontal]: ./docs/images/horizontal.png 3 | [vertical]: ./docs/images/vertical.png 4 | [p1]: ./docs/images/p1.png 5 | [l1]: ./docs/images/l1.png 6 | [w1]: ./docs/images/w1.png 7 | [Example]: ./docs/images/Example.png 8 | 9 | 10 | # WELCOME TO THE DALI DATASET: a large **D**ataset of synchronised **A**udio, **L**yr**I**cs and vocal notes. 11 | 12 | You can find a detailed explanation of how DALI has been created at: 13 | ***[Meseguer-Brocal_2018]*** [G. Meseguer-Brocal, A. Cohen-Hadria and G. Peeters. DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm. In ISMIR Paris, France, 2018.](http://ismir2018.ircam.fr/doc/pdfs/35_Paper.pdf) 14 | 15 | Cite this [paper](https://zenodo.org/record/1492443): 16 | 17 | >@inproceedings{Meseguer-Brocal_2018, 18 | Author = {Meseguer-Brocal, Gabriel and Cohen-Hadria, Alice and Peeters, Geoffroy}, 19 | Booktitle = {19th International Society for Music Information Retrieval Conference}, 20 | Editor = {ISMIR}, 21 | Month = {September}, 22 | Title = {DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm.}, 23 | Year = {2018}} 24 | 25 | 26 | 27 | Here's an example of the kind of information DALI contains: 28 | 29 | ![alt text][Example] 30 | 31 | 32 | DALI has two main elements: 33 | 34 | ## 1- The dataset - dali_data 35 | 36 | The dataset itself. It is denoted as **dali_data** and it is presented as a collection of **gz** files. 37 | You can find the different DALI_data versions in [here](https://github.com/gabolsgabs/DALI/blob/master/versions/). 
38 | 39 | ## 2- The code for working with DALI - dali_code 40 | The code, denoted as **dali_code**, for reading and working with dali_data. 41 | It is stored in this repository and presented as a python package. 42 | Dali_code has its own versions controlled by this github. 43 | The release and stable versions can be found at [pypi](https://pypi.org/project/DALI-dataset/). 44 | 45 | repository
46 | ├── code
47 | │   ├── DALI
48 | │   │   ├── \_\_init\_\_.py
49 | │   │   ├── Annotations.py
50 | │   │   ├── main.py
51 | │   │   ├── utilities.py
52 | │   │   ├── extra.py
53 | │   │   ├── download.py
54 | │   │   ├── vizualization.py
55 | │   └── setup.py
56 | 
57 | 
58 | # NEWS:
59 | 
60 | Ground-Truth for version 1.0 updated with 105 songs.
61 | Remember that DALI is an ongoing project. There are many things to solve.
62 | 
63 | Currently we are working on:
64 | * the second generation of the singing voice detection system.
65 | * solving errors in individual notes.
66 | * solving global note errors (songs where all the notes are placed off by the same interval).
67 | * errors in local note alignments.
68 | 
69 | If you have any suggestions or improvements, please contact us at: dali [dot] dataset [at] gmail [dot] com
70 | 
71 | For any problem with the package that deals with the annotations, open an issue in this repository.
72 | 
73 | Thank you.
74 | 
75 | # TUTORIAL:
76 | 
77 | First of all, [download](https://github.com/gabolsgabs/DALI/blob/master/versions/) your Dali_data version and clone this repository.
78 | 
79 | 
80 | ## 0- Installing Dali_code.
81 | For the release and stable versions just run the command:
82 | 
83 | > pip install dali-dataset
84 | 
85 | For non-release and unstable versions you can install them manually by going to the folder DALI/code and running:
86 | 
87 | > pip install .
88 | 
89 | You can upgrade DALI to future versions with:
90 | 
91 | > pip install dali-dataset --upgrade
92 | 
93 | DALI can be uninstalled with:
94 | 
95 | > pip uninstall dali-dataset
96 | 
97 | Requirements: **numpy** and **youtube_dl**
98 | 
99 | **NOTE**: the version of the code in pip only refers to the code itself. The different versions of the Dali_data can be found above.
100 | 
101 | 
102 | ## 1- Loading DALI_data.
103 | 
104 | DALI is presented as a set of **gz** files.
105 | Each gz contains the annotations of a particular song.
106 | We use a unique id for each entry.
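As a rough, self-contained sketch of what reading such a **gz** entry involves (an assumption for illustration: the real files are read by `dali_code` and hold `Annotations` objects, so a plain dict and hypothetical helpers stand in here):

```python
import gzip
import os
import pickle
import tempfile

# Hypothetical sketch: round-trip one entry through a gzip-compressed pickle.
# The real DALI gz files hold Annotations objects loaded via dali_code.
def write_gzip(path, obj):
    with gzip.open(path, 'wb') as f:
        pickle.dump(obj, f)

def read_gzip(path):
    with gzip.open(path, 'rb') as f:
        return pickle.load(f)

entry = {'id': 'a_dali_unique_id', 'artist': 'An Artist', 'title': 'A song title'}
path = os.path.join(tempfile.gettempdir(), 'a_dali_unique_id.gz')
write_gzip(path, entry)
assert read_gzip(path) == entry
```

In practice you never read the files by hand; `dali_code` does it for you, as shown next.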
107 | You can load your dali_data version as follows:
108 | 
109 |     import DALI as dali_code
110 |     dali_data_path = 'full_path_to_your_dali_data'
111 |     dali_data = dali_code.get_the_DALI_dataset(dali_data_path, skip=[], keep=[])
112 | 
113 | This function can also be used to load a subset of the DALI dataset by providing the ids of the entries you either want to **skip** or to **keep**.
114 | 
115 | 
116 | **NOTE**: Loading DALI might take some minutes depending on your computer and python version. Python3 is faster than python2.
117 | 
118 | Additionally, each DALI version contains a DALI_DATA_INFO.gz:
119 | 
120 |     dali_info = dali_code.get_info(dali_data_path + 'info/DALI_DATA_INFO.gz')
121 |     print(dali_info[0]) -> array(['DALI_ID', 'NAME', 'YOUTUBE', 'WORKING'])
122 | 
123 | This file matches the unique DALI id with the artist_name-song_title, the youtube url and a boolean that indicates whether the youtube link is working.
124 | 
125 | 
126 | 
127 | ## 1.1- An annotation instance.
128 | 
129 | _dali_data_ is a dictionary where each key is a unique id and each value is an instance of the class DALI.Annotations, namely **an annotation instance**.
130 | 
131 |     entry = dali_data['a_dali_unique_id']
132 |     type(entry) -> DALI.Annotations.Annotations
133 | 
134 | Each annotation instance has two attributes: **info** and **annotations**.
135 | 
136 |     entry.info --> {'id': 'a_dali_unique_id',
137 |         'artist': 'An Artist',
138 |         'title': 'A song title',
139 |         'dataset_version': 1.0, **# dali_data version**
140 |         'ground-truth': False,
141 |         'scores': {'NCC': 0.8098520072498807,
142 |                    'manual': 0.0}, **# Not ready yet**
143 |         'audio': {'url': 'a youtube url',
144 |             'path': 'None',
145 |             **# Up to you to modify it to point to your local audio file**
146 |             'working': True},
147 |         'metadata': {'album': 'An album title',
148 |             'release_date': 'A year',
149 |             'cover': 'link to an image with the cover',
150 |             'genres': ['genre_0', ...
, 'genre_n'],
151 |             # The number of genres depends on the song
152 |             'language': 'a language'}}
153 | 
154 |     entry.annotations --> {'annot': {'the annotations themselves'},
155 |         'type': 'horizontal' or 'vertical',
156 |         'annot_param': {'fr': float(frame rate used in the annotation process),
157 |                         'offset': float(offset value)}}
158 | 
159 | 
160 | ## 1.2- Saving as json.
161 | 
162 | You can export annotations to and import them from a json file.
163 | 
164 |     path_save = 'my_full_save_path'
165 |     name = 'my_annot_name'
166 |     # export
167 |     entry.write_json(path_save, name)
168 |     # import
169 |     my_json_entry = dali_code.Annotations()
170 |     my_json_entry.read_json(os.path.join(path_save, name+'.json'))
171 | 
172 | 
173 | ## 1.3- Ground-truth.
174 | 
175 | Each dali_data has its own [ground-truth file](https://github.com/gabolsgabs/DALI/tree/master/versions/).
176 | The annotations that are part of the ground-truth are entries of the dali_data with the offset and fr parameters manually annotated.
177 | 
178 | You can easily load a ground-truth file:
179 | 
180 |     gt_file = 'full_path_to_my_ground_truth_file'
181 |     # you can load the ground-truth
182 |     gt = dali_code.utilities.read_gzip(gt_file)
183 |     type(gt) --> dict
184 |     gt['a_dali_unique_id'] --> {'offset': float(a_number),
185 |                                 'fr': float(a_number)}
186 | 
187 | You can also load a **dali_gt** with all the entries of the dali_data that are part of the ground-truth, with their annotations updated to the manually annotated offset and fr parameters:
188 | 
189 |     # dali_gt only with ground_truth songs
190 |     gt = dali_code.utilities.read_gzip(gt_file)
191 |     dali_gt = dali_code.get_the_DALI_dataset(dali_data_path, gt_file, keep=gt.keys())
192 |     len(dali_gt) == len(gt)
193 | 
194 | 
195 | You can also load the whole dali_data and update the songs that are part of the ground truth with the manually verified offset and fr parameters.
196 | 
197 |     # Two options:
198 |     # 1- once you have your dali_data
199 |     dali_data = dali_code.update_with_ground_truth(dali_data, gt_file)
200 | 
201 |     # 2- while reading the dataset
202 |     dali_data = dali_code.get_the_DALI_dataset(dali_data_path, gt_file=gt_file)
203 | 
204 | 
205 | NOTE 1: Please be sure you have the latest [ground truth version](https://github.com/gabolsgabs/DALI/tree/master/versions/).
206 | 
207 | # 2- Getting the audio.
208 | 
209 | You can retrieve the audio for each annotation (if available) using the function dali_code.get_audio():
210 | 
211 |     path_audio = 'full_path_to_store_the_audio'
212 |     errors = dali_code.get_audio(dali_info, path_audio, skip=[], keep=[])
213 |     errors -> ['dali_id', 'youtube_url', 'error']
214 | 
215 | This function can also be used to download a subset of the DALI dataset by providing the ids of the entries you either want to **skip** or to **keep**.
216 | 
217 | 
218 | # 3- Working with DALI.
219 | 
220 | Annotations are in:
221 | > entry.annotations['annot']
222 | 
223 | and they are presented in two different formats: **'horizontal'** or **'vertical'**.
224 | You can easily change the format using the functions:
225 | 
226 |     entry.horizontal2vertical()
227 |     entry.vertical2horizontal()
228 | 
229 | ## 3.1- Horizontal.
230 | In this format each level of granularity is stored individually.
231 | It is the default format.
232 | 
233 | ![alt text][horizontal]
234 | 
235 |     entry.vertical2horizontal() --> 'Annot are already in a vertical format'
236 |     entry.annotations['type'] --> 'horizontal'
237 |     entry.annotations['annot'].keys() --> ['notes', 'lines', 'words', 'paragraphs']
238 | 
239 | Each level contains a list of annotations where each element has:
240 | 
241 |     my_annot = entry.annotations['annot']['notes']
242 |     my_annot[0] --> {'text': 'wo',  # the annotation itself.
243 |         'time': [12.534, 12.659],  # the beginning and end of the segment in seconds.
244 |         'freq': [466.1637615180899, 466.1637615180899],  # the frequency range the text information covers. At the lowest level, syllables, it corresponds to the vocal note.
245 |         'index': 0}  # link with the upper level. For example, index 0 at the 'words' level means that this particular word belongs to the first line ([0]). The paragraphs level has no index key.
246 | 
247 | ### 3.1.1- Visualizing an annotation file.
248 | 
249 | You can export the annotations of each individual level to an xml or text file to visualize them with Audacity or AudioSculpt. The pitch information is only presented in the xml files for AudioSculpt.
250 | 
251 |     my_annot = entry.annotations['annot']['notes']
252 |     path_save = 'my_save_path'
253 |     name = 'my_annot_name'
254 |     dali_code.write_annot_txt(my_annot, name, path_save)
255 |     # import the txt file in Audacity
256 |     dali_code.write_annot_xml(my_annot, name, path_save)
257 |     # import the Rhythm XML file in AudioSculpt
258 | 
259 | 
260 | ### 3.1.2- Examples.
261 | This format is meant to be used for working with each level individually.
262 | > Example 1: recovering the main vocal melody.
263 | 
264 | Let's use the extra function dali_code.annot2vector() that transforms the annotations into a vector. There are two types of vector:
265 | 
266 | - type='voice': each frame has a value 1 or 0 for voice or no voice.
267 | - type='melody': each frame has the freq value of the main vocal melody.
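As a minimal, self-contained sketch of the idea behind these two vector types (a hypothetical pure-Python stand-in, `annot_to_vector`, for `dali_code.annot2vector()`, which in the package is implemented with numpy):

```python
# Hypothetical stand-in for dali_code.annot2vector(): quantize each note's
# [begin, end] time (in seconds) to frames of size time_r and fill the span
# with 1 (type='voice') or the note's mean frequency (type='melody').
def annot_to_vector(annot, duration, time_r, type='voice'):
    signal = [0.0] * int(duration / time_r)
    for note in annot:
        b = int(round(note['time'][0] / time_r))
        e = int(round(note['time'][1] / time_r))
        for i in range(b, min(e + 1, len(signal))):
            if type == 'voice':
                signal[i] = 1.0
            elif type == 'melody':
                signal[i] = sum(note['freq']) / len(note['freq'])
    return signal

# Two toy notes over a 2-second excerpt at a 0.25 s resolution.
notes = [{'time': [0.0, 0.5], 'freq': [440.0, 440.0]},
         {'time': [1.0, 1.5], 'freq': [466.16, 466.16]}]
voice = annot_to_vector(notes, 2.0, 0.25, type='voice')
melody = annot_to_vector(notes, 2.0, 0.25, type='melody')
# voice  -> [1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0]
# melody -> [440.0, 440.0, 440.0, 0.0, 466.16, 466.16, 466.16, 0.0]
```

The packaged function returns a numpy array; treat this only as an illustration of the frame layout it produces.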
268 | 
269 |     my_annot = entry.annotations['annot']['notes']
270 |     time_resolution = 0.014
271 |     # this value is just an example; you should use the end of your audio file
272 |     end_of_the_song = entry.annotations['annot']['notes'][-1]['time'][1] + 10
273 |     melody = dali_code.annot2vector(my_annot, end_of_the_song, time_resolution, type='melody')
274 | 
275 | **NOTE: have a look at dali_code.annot2vector_chopping() for computing a vector chopped with respect to a given window and hop size.**
276 | 
277 | > Example 2: find the audio frames that define each paragraph.
278 | 
279 | Let's use the other extra function dali_code.annot2frames() that transforms time in seconds into time in frames.
280 | 
281 |     my_annot = entry.annotations['annot']['paragraphs']
282 |     paragraphs = [i['time'] for i in dali_code.annot2frames(my_annot, time_resolution)]
283 |     paragraphs --> [(49408, 94584), ..., (3080265, 3299694)]
284 | 
285 | 
286 | **NOTE**: dali_code.annot2frames() can also be used in the vertical format but not dali_code.annot2vector().
287 | 
288 | ## 3.2- Vertical.
289 | In this format the different levels of granularity are hierarchically connected:
290 | 
291 | ![alt text][vertical]
292 | 
293 |     entry.horizontal2vertical()
294 |     entry.annotations['type'] --> 'vertical'
295 |     entry.annotations['annot'].keys() --> ['hierarchical']
296 |     my_annot = entry.annotations['annot']['hierarchical']
297 | 
298 | Each element of the list is a paragraph.
299 | 
300 |     my_annot[0] --> {'freq': [277.1826309768721, 493.8833012561241],  # the frequency range the text information covers
301 |         'time': [12.534, 41.471500000000006],  # the beginning and end of the time segment.
302 |         'text': [line_0, line_1, ..., line_n]}
303 | 
304 | ![alt text][p1]
305 | 
306 | where 'text' contains all the lines of the paragraph.
Each line follows the same format:
307 | 
308 |     lines_1paragraph = my_annot[0]['text']
309 |     lines_1paragraph[0] --> {'freq': [...], 'time': [...],
310 |                              'text': [word_0, word_1, ..., word_n]}
311 | 
312 | ![alt text][l1]
313 | 
314 | Again, each word contains all the notes for that word to be sung:
315 | 
316 |     words_1line_1paragraph = lines_1paragraph[0]['text']
317 |     words_1line_1paragraph[0] --> {'freq': [...], 'time': [...],
318 |                                    'text': [note_0, note_1, ..., note_n]}
319 | 
320 | ![alt text][w1]
321 | 
322 | Only the deepest level directly has the text information.
323 | 
324 |     notes_1word_1line_1paragraph = words_1line_1paragraph[1]['text']
325 |     notes_1word_1line_1paragraph[0] --> {'freq': [...], 'time': [...],
326 |                                          'text': 'note text'}
327 | 
328 | You can always get the text at a specific point with dali_code.get_text(), e.g.:
329 | 
330 |     dali_code.get_text(lines_1paragraph) --> ['text word_0', 'text word_1', ..., text_word_n]
331 |     # words in the first line of the first paragraph
332 | 
333 |     dali_code.get_text(my_annot[0]['text']) --> ['text word_0', 'text word_1', ..., text_word_n]
334 |     # words in the first paragraph
335 | 
336 | ### 3.2.1- Examples.
337 | This organization is meant to be used for working with specific hierarchical blocks.
338 | 
339 | > Example 1: working only with a single paragraph.
340 | 
341 |     my_paragraph = my_annot[3]['text']
342 |     text_paragraph = dali_code.get_text(my_paragraph)
343 | 
344 | Additionally, you can easily retrieve all its individual information with the function dali_code.unroll():
345 | 
346 |     lines_in_paragraph, _ = dali_code.unroll(my_paragraph, depth=0, output=[])
347 |     words_in_paragraph, _ = dali_code.unroll(my_paragraph, depth=1, output=[])
348 |     notes_in_paragraph, _ = dali_code.unroll(my_paragraph, depth=2, output=[])
349 | 
350 | > Example 2: working only with the first line of that paragraph
351 | 
352 |     my_line = my_annot[3]['text'][0]['text']
353 |     text_line = dali_code.get_text(my_line)
354 |     words_in_line, _ = dali_code.unroll(my_line, depth=0, output=[])
355 |     notes_in_line, _ = dali_code.unroll(my_line, depth=1, output=[])
356 | 
357 | # 4- Correcting Annotations.
358 | 
359 | So far we have only faced global alignment problems. You can change this alignment by modifying the offset and frame rate parameters. The original ones are stored at:
360 | 
361 |     print(entry.annotations['annot_param'])
362 |     {'offset': float(a_number), 'fr': float(a_number)}
363 | 
364 | If you find a better set of parameters you can modify the annotations using the function dali_code.change_time():
365 | 
366 |     dali_code.change_time(entry, new_offset, new_fr)
367 |     # The default new_offset and new_fr are entry.annotations['annot_param']
368 | 
369 | We encourage you to send us your parameters in order to improve DALI.
370 | 
371 | _____
372 | You can contact us at:
373 | 
374 | > dali dot dataset at gmail dot com
375 | 
376 | This research has received funding from the French National Research Agency under the contract ANR-16-CE23-0017-01 (WASABI project).
377 | 
378 | Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
379 | 
--------------------------------------------------------------------------------
/citations/DALI_v1.0.bib:
--------------------------------------------------------------------------------
1 | @inproceedings{Meseguer-Brocal_2018,
2 | Author = {Meseguer-Brocal, Gabriel and Cohen-Hadria, Alice and Peeters, Geoffroy},
3 | Booktitle = {19th International Society for Music Information Retrieval Conference},
4 | Editor = {ISMIR},
5 | Month = {September},
6 | Title = {DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm.},
7 | Year = {2018}}
8 | 
--------------------------------------------------------------------------------
/code/DALI/Annotations.py:
--------------------------------------------------------------------------------
1 | """ ANNOTATIONS class.
2 | 
3 | GABRIEL MESEGUER-BROCAL 2018
4 | """
5 | from .extra import (unroll, roll)
6 | from . import utilities as ut
7 | 
8 | 
9 | class Annotations(object):
10 |     """Basic class that stores annotations and their information.
11 | 
12 |     It contains some methods for transforming the annot representation.
13 | """ 14 | 15 | def __init__(self, i=u'None'): 16 | self.info = {'id': i, 'artist': u'None', 'title': u'None', 17 | 'audio': {'url': u'None', 'working': False, 18 | 'path': u'None'}, 19 | 'metadata': {}, 'scores': {'NCC': 0.0, 'manual': 0.0}, 20 | 'dataset_version': 0.0, 'ground-truth': False} 21 | self.annotations = {'type': u'None', 'annot': {}, 22 | 'annot_param': {'fr': 0.0, 'offset': 0.0}} 23 | self.errors = None 24 | return 25 | 26 | def read_json(self, fname): 27 | """Read the annots from a json file.""" 28 | data = ut.read_json(fname) 29 | if(ut.check_structure(self.info, data['info']) and 30 | ut.check_structure(self.annotations, data['annotations'])): 31 | self.info = data['info'] 32 | self.annotations = data['annotations'] 33 | else: 34 | print('ERROR: wrong format') 35 | return 36 | 37 | def write_json(self, pth, name): 38 | """Writes the annots into a json file.""" 39 | data = {'info': self.info, 'annotations': self.annotations} 40 | ut.write_in_json(pth, name, data) 41 | return 42 | 43 | def horizontal2vertical(self): 44 | """Converts horizontal annotations (indivual levels) into a vertical 45 | representation (hierarchical).""" 46 | try: 47 | if self.annotations['type'] == 'horizontal': 48 | self.annotations['annot'] = roll(self.annotations['annot']) 49 | self.annotations['type'] = 'vertical' 50 | else: 51 | print('Annot are already in a horizontal format') 52 | except Exception as e: 53 | print('ERROR: unknow type of annotations') 54 | return 55 | 56 | def vertical2horizontal(self): 57 | """Converts vertical representation (hierarchical) into a horizontal 58 | annotations (indivual levels).""" 59 | try: 60 | if self.annotations['type'] == 'vertical': 61 | self.annotations['annot'] = unroll(self.annotations['annot']) 62 | self.annotations['type'] = 'horizontal' 63 | else: 64 | print('Annot are already in a vertical format') 65 | except Exception as e: 66 | print('ERROR: unknow type of annotations') 67 | return 68 | 69 | def is_horizontal(self): 70 
| output = False 71 | if self.annotations['type'] == 'horizontal': 72 | output = True 73 | return output 74 | 75 | def is_vertical(self): 76 | output = False 77 | if self.annotations['type'] == 'vertical': 78 | output = True 79 | return output 80 | -------------------------------------------------------------------------------- /code/DALI/__init__.py: -------------------------------------------------------------------------------- 1 | from .Annotations import Annotations 2 | from DALI.extra import (annot2frames, annot2vector, annot2vector_chopping, get_audio) 3 | from DALI.download import audio_from_url 4 | from DALI.main import (get_the_DALI_dataset, get_an_entry, get_info, change_time, update_with_ground_truth) 5 | from DALI.utilities import (get_text, unroll) 6 | from DALI.vizualization import (write_annot_txt, write_annot_xml) 7 | -------------------------------------------------------------------------------- /code/DALI/download.py: -------------------------------------------------------------------------------- 1 | from . 
import utilities as ut 2 | import os 3 | import youtube_dl 4 | 5 | base_url = 'http://www.youtube.com/watch?v=' 6 | 7 | 8 | class MyLogger(object): 9 | def debug(self, msg): 10 | print(msg) 11 | 12 | def warning(self, msg): 13 | print(msg) 14 | 15 | def error(self, msg): 16 | print(msg) 17 | 18 | 19 | def my_hook(d): 20 | if d['status'] == 'finished': 21 | print('Done downloading, now converting ...') 22 | 23 | 24 | def get_my_ydl(directory=os.path.dirname(os.path.abspath(__file__))): 25 | ydl = None 26 | outtmpl = None 27 | if ut.check_directory(directory): 28 | outtmpl = os.path.join(directory, '%(title)s.%(ext)s') 29 | ydl_opts = {'format': 'bestaudio/best', 30 | 'postprocessors': [{'key': 'FFmpegExtractAudio', 31 | 'preferredcodec': 'mp3', 32 | 'preferredquality': '320'}], 33 | 'outtmpl': outtmpl, 34 | 'logger': MyLogger(), 35 | 'progress_hooks': [my_hook], 36 | 'verbose': False, 37 | 'ignoreerrors': False, 38 | 'external_downloader': 'ffmpeg', 39 | 'nocheckcertificate': True} 40 | # 'external_downloader_args': "-j 8 -s 8 -x 8 -k 5M"} 41 | # 'maxBuffer': 'Infinity'} 42 | # it uses multiple connections for speed up the downloading 43 | # 'external-downloader': 'ffmpeg'} 44 | ydl = youtube_dl.YoutubeDL(ydl_opts) 45 | ydl.cache.remove() 46 | import time 47 | time.sleep(.5) 48 | return ydl 49 | 50 | 51 | def audio_from_url(url, name, path_output, errors=[]): 52 | """ 53 | Download audio from a url. 
54 | url : str 55 | url of the video (after watch?v= in youtube) 56 | name : str 57 | used to store the data 58 | path_output : str 59 | path for storing the data 60 | """ 61 | error = None 62 | 63 | # ydl(youtube_dl.YoutubeDL): extractor 64 | ydl = get_my_ydl(path_output) 65 | 66 | ydl.params['outtmpl'] = ydl.params['outtmpl'] % { 67 | 'ext': ydl.params['postprocessors'][0]['preferredcodec'], 68 | 'title': name} 69 | 70 | if ydl: 71 | print ("Downloading " + url) 72 | try: 73 | ydl.download([base_url + url]) 74 | except Exception as e: 75 | print(e) 76 | error = e 77 | if error: 78 | errors.append([name, url, error]) 79 | return 80 | -------------------------------------------------------------------------------- /code/DALI/extra.py: -------------------------------------------------------------------------------- 1 | """ Extra function: annot2vector, annot2frames, unroll and roll. 2 | 3 | Transformating the annots a song into different representations. 4 | They are disconnected to the class because they can be 5 | applyied to a subsection i.e. for transforming only one indivual level 6 | to a vector representation. 7 | 8 | 9 | GABRIEL MESEGUER-BROCAL 2018 10 | """ 11 | import copy 12 | import numpy as np 13 | from .download import (audio_from_url, get_my_ydl) 14 | from . import utilities as ut 15 | 16 | 17 | def unroll(annot): 18 | """Unrolls the hierarchical information into paragraphs, lines, words 19 | keeping the relations with the key 'index.' 20 | """ 21 | tmp = copy.deepcopy(annot['hierarchical']) 22 | p, _ = ut.unroll(tmp, depth=0, output=[]) 23 | l, _ = ut.unroll(tmp, depth=1, output=[]) 24 | w, _ = ut.unroll(tmp, depth=2, output=[]) 25 | m, _ = ut.unroll(tmp, depth=3, output=[]) 26 | return {'paragraphs': p, 'lines': l, 'words': w, 'notes': m} 27 | 28 | 29 | def roll(annot): 30 | """Rolls the individual info into a hierarchical level. 
31 | 32 | Output example: [paragraph]['text'][line]['text'][word]['text'][notes]' 33 | """ 34 | tmp = copy.deepcopy(annot) 35 | output = ut.roll(tmp['notes'], tmp['words']) 36 | output = ut.roll(output, tmp['lines']) 37 | output = ut.roll(output, tmp['paragraphs']) 38 | return {'hierarchical': output} 39 | 40 | 41 | def annot2frames(annot, time_r, type='horizontal', depth=3): 42 | """Transforms annot time into a discrete formart wrt a time_resolution. 43 | 44 | This function can be use with the whole annotation or with a subset. 45 | For example, it can be called with a particular paragraph in the horizontal 46 | format [annot[paragraph_i]] or line [annot[paragraph_i]['text'][line_i]]. 47 | 48 | Parameters 49 | ---------- 50 | annot : list 51 | annotations vector (annotations['annot']) in any the formats. 52 | time_r : float 53 | time resolution for discriticing the time. 54 | type : str 55 | annotation format: horizontal or vertical. 56 | depth : int 57 | depth of the horizontal level. 58 | """ 59 | output = [] 60 | tmp = copy.deepcopy(annot) 61 | try: 62 | if type == 'horizontal': 63 | output = ut.sample(tmp, time_r) 64 | elif type == 'vertical': 65 | vertical = [ut.sample(ut.unroll(tmp, [], depth=depth)[0], time_r) 66 | for i in range(depth+1)][::-1] 67 | for i in range(len(vertical[:-1])): 68 | if i == 0: 69 | output = roll(vertical[i], vertical[i+1]) 70 | else: 71 | output = roll(output, vertical[i+1]) 72 | except Exception as e: 73 | print('ERROR: unknow type of annotations') 74 | return output 75 | 76 | 77 | def annot2vector(annot, duration, time_r, type='voice'): 78 | """Transforms the annotations into frame vector wrt a time resolution. 79 | 80 | Parameters 81 | ---------- 82 | annot : list 83 | annotations only horizontal level 84 | (for example: annotations['annot']['lines']) 85 | dur : float 86 | duration of the vector (for adding zeros). 87 | time_r : float 88 | time resolution for discriticing the time. 
89 |     type : str
90 |         'voice': each frame has a value 1 or 0 for voice or no voice.
91 |         'melody': each frame has the freq value of the main vocal melody.
92 |     """
93 |     signal = np.zeros(int(duration / time_r))
94 |     for note in annot:
95 |         b, e = note['time']
96 |         b = np.round(b/time_r).astype(int)
97 |         e = np.round(e/time_r).astype(int)
98 |         if type == 'voice':
99 |             signal[b:e+1] = 1
100 |         if type == 'melody':
101 |             signal[b:e+1] = np.mean(note['freq'])
102 |     return signal
103 | 
104 | 
105 | def annot2vector_chopping(annot, dur, time_r, win_bin, hop_bin, type='voice'):
106 |     """
107 |     Transforms the annotations into a frame vector by:
108 | 
109 |     1 - creating a vector signal for a given sample rate
110 |     2 - chopping it using the given hop and window sizes.
111 | 
112 |     Parameters
113 |     ----------
114 |     annot : list
115 |         annotations only horizontal level
116 |         (for example: annotations['annot']['lines'])
117 |     dur : float
118 |         duration of the vector (for adding zeros).
119 |     time_r : float
120 |         sample rate for discretizing annots.
121 |     win_bin : int
122 |         window size in bins for sampling the vector.
123 |     hop_bin : int
124 |         hop size in bins for sampling the vector.
125 |     type : str
126 |         'voice': each frame has a value 1 or 0 for voice or no voice.
127 |         'melody': each frame has the freq value of the main vocal melody.
128 |     """
129 |     output = []
130 |     try:
131 |         signal = annot2vector(annot, dur, time_r, type)
132 |         win = np.hanning(win_bin)
133 |         win_sum = np.sum(win)
134 |         v = hop_bin*np.arange(int((len(signal)-win_bin)/hop_bin+1))
135 |         output = np.array([np.sum(win[::-1]*signal[i:i+win_bin])/win_sum
136 |                            for i in v]).T
137 |     except Exception as e:
138 |         print('ERROR: unknown type of annotations')
139 |     return output
140 | 
141 | 
142 | def get_audio(dali_info, path_output, skip=[], keep=[]):
143 |     """Get the audio for the dali dataset.
144 | 145 | It can download the whole dataset or only a subset of the dataset 146 | by providing either the ids to skip or the ids that to load. 147 | 148 | Parameters 149 | ---------- 150 | dali_info : list 151 | where elements are ['DALI_ID', 'NAME', 'YOUTUBE', 'WORKING'] 152 | path_output : str 153 | full path for storing the audio 154 | skip : list 155 | list with the ids to be skipped. 156 | keep : list 157 | list with the ids to be keeped. 158 | """ 159 | errors = [] 160 | if len(keep) > 0: 161 | for i in dali_info[1:]: 162 | if i[0] in keep: 163 | audio_from_url(i[-2], i[0], path_output, errors) 164 | else: 165 | for i in dali_info[1:]: 166 | if i[0] not in skip: 167 | audio_from_url(i[-2], i[0], path_output, errors) 168 | return errors 169 | -------------------------------------------------------------------------------- /code/DALI/files/template.xml: -------------------------------------------------------------------------------- 1 | 2 | Audio_path 3 | 4 | freetype 5 | 6 | 7 | 8 | 9 | 10 | 11 | -------------------------------------------------------------------------------- /code/DALI/main.py: -------------------------------------------------------------------------------- 1 | """LOADING DALI DATASET FUNCTIONS: 2 | 3 | Functions for loading the Dali dataset. 4 | 5 | GABRIEL MESEGUER-BROCAL 2018 6 | """ 7 | from . import utilities as ut 8 | # from .Annotations import Annotations 9 | 10 | # ------------------------ READING INFO ------------------------ 11 | 12 | 13 | def generator_files_skip(file_names, skip=[]): 14 | """Generator with all the file to read. 15 | It skips the files in the skip ids list""" 16 | for fl in file_names: 17 | if fl.split('/')[-1].rstrip('.gz') not in skip: 18 | yield ut.read_gzip(fl) 19 | 20 | 21 | def generator_files_in(file_names, keep=[]): 22 | """Generator with all the file to read. 
23 | It reads only the files in the keep ids list""" 24 | for fl in file_names: 25 | if fl.split('/')[-1].rstrip('.gz') in keep: 26 | yield ut.read_gzip(fl, print_error=True) 27 | 28 | 29 | def generator_folder(folder_pth, skip=[], keep=[]): 30 | """Create the final Generator with all the files.""" 31 | if len(keep) > 0: 32 | return generator_files_in(ut.get_files_path(folder_pth, 33 | print_error=True), keep) 34 | else: 35 | return generator_files_skip(ut.get_files_path(folder_pth, 36 | print_error=True), skip) 37 | 38 | 39 | def change_time(entry, new_offset=None, new_fr=None): 40 | type = 'horizontal' 41 | fr = entry.annotations['annot_param']['fr'] 42 | offset = entry.annotations['annot_param']['offset'] 43 | if new_fr is None: 44 | new_fr = fr 45 | if new_offset is None: 46 | new_offset = offset 47 | args = (fr, offset, new_fr, new_offset) 48 | entry.annotations['annot_param']['fr'] = new_fr 49 | entry.annotations['annot_param']['offset'] = new_offset 50 | if entry.annotations['type'] == 'vertical': 51 | type = 'vertical' 52 | entry.vertical2horizontal() 53 | for key, value in entry.annotations['annot'].items(): 54 | value = ut.compute_new_time(value, *args) 55 | if type == 'vertical': 56 | entry.horizontal2vertical() 57 | return 58 | 59 | 60 | def update_with_ground_truth(dali, gt_file): 61 | gt = [] 62 | if ut.check_file(gt_file, print_error=False): 63 | gt = load_ground_truth(gt_file) 64 | if len(gt) > 0: 65 | for i in gt: 66 | entry = dali[i] 67 | change_time(entry, gt[i]['offset'], gt[i]['fr']) 68 | entry.info['ground-truth'] = True 69 | return dali 70 | 71 | 72 | def get_the_DALI_dataset(pth, gt_file='', skip=[], keep=[]): 73 | """Load the whole DALI dataset. 
It can load only a subset of the dataset 74 | by providing either the ids to skip or the ids to load.""" 75 | args = (pth, skip, keep) 76 | dali = {song.info['id']: song for song in generator_folder(*args)} 77 | dali = update_with_ground_truth(dali, gt_file) 78 | return dali 79 | 80 | 81 | def get_an_entry(fl_pth): 82 | """Retrieve a particular entry and return it as a class.""" 83 | return ut.read_gzip(fl_pth) 84 | 85 | 86 | def get_info(dali_info_file): 87 | """Read the DALI INFO file with ['DALI_ID', 'YOUTUBE_ID', 'WORKING'] 88 | """ 89 | return ut.read_gzip(dali_info_file, print_error=True) 90 | 91 | 92 | def load_ground_truth(dali_gt_file): 93 | """Read the ground_truth file 94 | """ 95 | return ut.read_gzip(dali_gt_file, print_error=True) 96 | 97 | # ------------------------ BASIC OPERATIONS ------------------------ 98 | 99 | 100 | def update_audio_working_from_info(info, dali_dataset): 101 | """Update the working label for each class using an info file""" 102 | for i in info[1:]: 103 | dali_dataset[i[0]].info['audio']['working'] = i[-1] 104 | return dali_dataset 105 | 106 | 107 | def ids_to_title_artist(dali_dataset): 108 | """Transform the unique DALI ids into [id, artist, title] rows.""" 109 | output = [[value.info['id'], value.info['artist'], value.info['title']] 110 | for key, value in dali_dataset.items()] 111 | output.insert(0, ['id', 'artist', 'title']) 112 | return output 113 | -------------------------------------------------------------------------------- /code/DALI/utilities.py: -------------------------------------------------------------------------------- 1 | """ Utility functions for dealing with the DALI dataset. 2 | 3 | It includes all the needed functions for reading files and the helpers for 4 | transforming the annotation information.
5 | 6 | GABRIEL MESEGUER-BROCAL 2018 7 | """ 8 | import copy 9 | import glob 10 | import gzip 11 | import json 12 | import numpy as np 13 | import os 14 | import pickle 15 | 16 | # ------------------------ READING INFO ---------------- 17 | 18 | 19 | def get_files_path(pth, ext='*.gz', print_error=False): 20 | """Get all the files with an extension for a particular path.""" 21 | return list_files_from_folder(pth, extension=ext, print_error=print_error) 22 | 23 | 24 | def check_absolute_path(directory, print_error=True): 25 | """Check if a directory has an absolute path or not.""" 26 | output = False 27 | if os.path.isabs(directory): 28 | output = True 29 | elif print_error: 30 | print("ERROR: Please use an absolute path") 31 | return output 32 | 33 | 34 | def check_directory(directory, print_error=True): 35 | """Return True if a directory exists and False if not.""" 36 | output = False 37 | if check_absolute_path(directory, print_error): 38 | if os.path.isdir(directory) and os.path.exists(directory): 39 | output = True 40 | elif print_error: 41 | print("ERROR: not a valid directory " + directory) 42 | return output 43 | 44 | 45 | def check_file(fl, print_error=True): 46 | """Return True if a file exists and False if not.""" 47 | output = False 48 | if check_absolute_path(fl, print_error): 49 | if os.path.isfile(fl) and os.path.exists(fl): 50 | output = True 51 | elif print_error: 52 | print("ERROR: not a valid file " + fl) 53 | return output 54 | 55 | 56 | def create_directory(directory, print_error=False): 57 | """Create a folder.""" 58 | if not check_directory(directory, print_error) and \ 59 | check_absolute_path(directory): 60 | os.makedirs(directory) 61 | print("Creating a folder at " + directory) 62 | return directory 63 | 64 | 65 | def list_files_from_folder(directory, extension, print_error=True): 66 | """Return all the files with a specific extension for a given folder.""" 67 | files = [] 68 | if check_absolute_path(directory, print_error): 69 | if '*' not in
extension[0]: 70 | extension = '*' + extension 71 | if check_directory(directory, print_error): 72 | files = glob.glob(os.path.join(directory, extension)) 73 | if not files and print_error: 74 | print("ERROR: no files with extension " + extension[1:]) 75 | return sorted(files) 76 | 77 | 78 | def write_in_gzip(pth, name, data, print_error=False): 79 | """Write data in a gzip file""" 80 | if check_directory(pth, print_error): 81 | save_name = os.path.join(pth, name) 82 | try: 83 | gz = gzip.open(save_name + '.gz', 'wb') 84 | except Exception as e: 85 | gz = gzip.open(save_name + '.gz', 'w') 86 | gz.write(pickle.dumps(data, protocol=2)) 87 | gz.close() 88 | return 89 | 90 | 91 | def write_in_json(pth, name, data, print_error=False): 92 | """Write data in a json file""" 93 | if check_directory(pth, print_error): 94 | save_name = os.path.join(pth, name) 95 | try: 96 | with open(save_name + '.json', 'wb') as outfile: 97 | json.dump(data, outfile) 98 | except Exception as e: 99 | with open(save_name + '.json', 'w') as outfile: 100 | json.dump(data, outfile) 101 | return 102 | 103 | 104 | def read_gzip(fl, print_error=False): 105 | """Read gzip file""" 106 | output = None 107 | if check_file(fl, print_error): 108 | try: 109 | with gzip.open(fl, 'rb') as f: 110 | output = pickle.load(f) 111 | except Exception as e: 112 | with gzip.open(fl, 'r') as f: 113 | output = pickle.load(f) 114 | return output 115 | 116 | 117 | def read_json(fl, print_error=False): 118 | """Read json file""" 119 | output = None 120 | if check_file(fl, print_error): 121 | try: 122 | with open(fl, 'rb') as outfile: 123 | output = json.load(outfile) 124 | except Exception as e: 125 | with open(fl, 'r') as outfile: 126 | output = json.load(outfile) 127 | return output 128 | 129 | 130 | def check_structure(ref, struct): 131 | output = False 132 | if isinstance(ref, dict) and isinstance(struct, dict): 133 | # ref is a dict of types or other dicts 134 | output = all(k in struct and check_structure(ref[k],
struct[k]) 135 | for k in ref) 136 | else: 137 | # ref is the type of struct 138 | output = isinstance(struct, type(ref)) 139 | return output 140 | 141 | # -------------- CHANGING ANNOTATIONS -------------- 142 | 143 | 144 | def beat2time(beat, **args): 145 | bps = None 146 | offset = 0 147 | beat = float(beat) 148 | if 'bps' in args: 149 | bps = float(args['bps']) 150 | if 'fr' in args: 151 | bps = float(args['fr']) / 60. 152 | if 'offset' in args: 153 | offset = args['offset'] 154 | return beat/bps + offset 155 | 156 | 157 | def time2beat(time, **args): 158 | bps = None 159 | offset = 0 160 | if 'bps' in args: 161 | bps = float(args['bps']) 162 | if 'fr' in args: 163 | bps = float(args['fr']) / 60. 164 | if 'offset' in args: 165 | offset = args['offset'] 166 | return np.round((time - offset)*bps).astype(int) 167 | 168 | 169 | def change_time(time, old_param, new_param): 170 | beat = time2beat(time, offset=old_param['offset'], fr=old_param['fr']) 171 | new_time = beat2time(beat, offset=new_param['offset'], fr=new_param['fr']) 172 | return new_time 173 | 174 | 175 | def change_time_tuple(time, old_param, new_param): 176 | return tuple(change_time(t, old_param, new_param) for t in time) 177 | 178 | 179 | def compute_new_time(lst, old_fr, old_offset, new_fr, new_offset): 180 | old_param = {'fr': old_fr, 'offset': old_offset} 181 | new_param = {'fr': new_fr, 'offset': new_offset} 182 | for e in lst: 183 | e['time'] = change_time_tuple(e['time'], old_param, new_param) 184 | return lst 185 | 186 | # -------------- TRANSFORMING THE INFO IN THE ANNOTATIONS -------------- 187 | 188 | 189 | def roll(lower, upper): 190 | """It merges a horizontal level with its upper.
For example melody with 191 | words or lines with paragraphs 192 | """ 193 | tmp = copy.deepcopy(lower) 194 | for info in tmp: 195 | i = info['index'] 196 | if not isinstance(upper[i]['text'], list): 197 | upper[i]['text'] = [] 198 | info.pop('index', None) 199 | upper[i]['text'].append(info) 200 | return upper 201 | 202 | 203 | def get_text(text, output=[], m=False): 204 | """Recursive function that gathers the text needed by the unroll function. 205 | """ 206 | f = lambda x: isinstance(x, unicode) or isinstance(x, str) 207 | try: 208 | f('whatever') 209 | except Exception as e: 210 | f = lambda x: isinstance(x, str) 211 | if isinstance(text, list): 212 | tmp = [get_text(i['text'], output) for i in text] 213 | if f(tmp[0]): 214 | output.append(''.join(tmp)) 215 | elif f(text): 216 | output = text 217 | return output 218 | 219 | 220 | def unroll(annot, output=[], depth=0, index=0, b=False): 221 | """Recursive function that transforms an annotation vector in the 222 | vertical format into a horizontal vector: 223 | - annot (list): annotations vector in vertical format. 224 | - output (list): internally used in the recursion, final output. 225 | - depth (int): how deep the recursion is going to go. 226 | Example: 227 | 1 - input list of paragraph (annot) 228 | depth = 0 -> output = horizontal level for paragraphs, 229 | depth = 1 -> output = lines, depth = 2 -> output = words 230 | depth = 3 -> output = melody 231 | 2 - input a list of lines (annot[paragraph_i]['text'][line_i]) 232 | depth = 0 -> output = lines, depth = 1 -> words, 233 | depth = 2 -> melody, 234 | depth = 3 -> ERROR: uncontrolled behaviour 235 | - b (bool) = controls whether the horizontal level is going to have an index 236 | or not. The paragraph level does not need it.
237 | """ 238 | if depth == 0: 239 | """Bottom level to be merged""" 240 | for i in annot: 241 | text = get_text(i['text'], output=[]) 242 | if isinstance(text, list): 243 | text = " ".join(text) 244 | output.append({'time': i['time'], 'freq': i['freq'], 'text': text}) 245 | if b: 246 | output[-1]['index'] = index 247 | index += 1 248 | else: 249 | """Go deeper""" 250 | depth -= 1 251 | b = True 252 | for l in annot: 253 | output, index = unroll(l['text'], output, depth, index, b) 254 | return output, index 255 | 256 | 257 | def sample(annot, time_r): 258 | """Transform the normal time into a discrete value with respect to time_r. 259 | """ 260 | output = copy.deepcopy(annot) 261 | for a in output: 262 | a['time'] = (np.round(a['time'][0]/time_r).astype(int), 263 | np.round(a['time'][1]/time_r).astype(int)) 264 | return output 265 | -------------------------------------------------------------------------------- /code/DALI/vizualization.py: -------------------------------------------------------------------------------- 1 | from DALI.Annotations import Annotations 2 | import os 3 | from .
import utilities as ut 4 | import xml.etree.ElementTree as ET 5 | 6 | 7 | sculpt_segment_tag = '{http://www.ircam.fr/musicdescription/1.1}segment' 8 | sculpt_freetype_tag = '{http://www.ircam.fr/musicdescription/1.1}freetype' 9 | sculpt_media_tag = '{http://www.ircam.fr/musicdescription/1.1}media' 10 | ET.register_namespace('', "http://www.ircam.fr/musicdescription/1.1") 11 | pth = os.path.dirname(os.path.abspath(__file__)) 12 | xml_template = os.path.join(pth, 'files/template.xml') 13 | 14 | # ----------------------------------XML------------------------------- 15 | 16 | 17 | def addsemgnet(parent, text, attrib, tag=sculpt_segment_tag, 18 | sub_tag=sculpt_freetype_tag): 19 | 20 | element = parent.makeelement(tag, attrib) 21 | sub = ET.SubElement(element, sub_tag) 22 | sub.attrib['value'] = text 23 | sub.attrib['id'] = '1' 24 | parent.append(element) 25 | # element.text = text 26 | return 27 | 28 | 29 | def create_xml_attrib(annot): 30 | point = {'startFreq': '', 'endFreq': '', 'length': '', 'sourcetrack': '0', 31 | 'time': ''} 32 | offset = 0 33 | if annot['freq'][0] == annot['freq'][1]: 34 | offset = 20 35 | 36 | point['time'] = str(annot['time'][0]) 37 | point['length'] = str(annot['time'][1] - annot['time'][0]) 38 | point['startFreq'] = str(annot['freq'][0] - offset) 39 | point['endFreq'] = str(annot['freq'][1] + offset) 40 | return point 41 | 42 | 43 | def write_annot_xml(annot, name, path_save, xml_template=xml_template): 44 | path_save = ut.create_directory(path_save) 45 | tree = ET.parse(xml_template) 46 | root = tree.getroot() 47 | media = root.findall(sculpt_media_tag)[0] 48 | media.text = name 49 | 50 | # print ET.tostring(root) 51 | # segments = root.findall(sculpt_segment_tag) 52 | 53 | for point in annot: 54 | addsemgnet(root, point['text'], attrib=create_xml_attrib(point)) 55 | 56 | """ 57 | for line in segmented_lyrics.lines: 58 | addsemgnet(root, line['text'], attrib=create_xml_attrib(line)) 59 | for word in segmented_lyrics.words: 60 | 
addsemgnet(root, word['text'], attrib=create_xml_attrib(word)) 61 | """ 62 | 63 | segment = root.findall(sculpt_segment_tag) 64 | root.remove(segment[0]) 65 | # tree.write(name.replace("wav", "xml")) 66 | tree.write(os.path.join(path_save, name + ".xml")) 67 | return 68 | 69 | 70 | # ------------------------------TXT------------------------------- 71 | 72 | 73 | def write_annot_txt(annot, name, path_save): 74 | path_save = ut.create_directory(path_save) 75 | with open(os.path.join(path_save, name + ".txt"), 'w') as f: 76 | for item in annot: 77 | f.write("%f\t" % item['time'][0]) 78 | f.write("%f\t" % item['time'][1]) 79 | f.write("%s\n" % item['text']) 80 | return 81 | -------------------------------------------------------------------------------- /code/LICENSE: -------------------------------------------------------------------------------- 1 | Academic Free License ("AFL") v. 3.0 2 | This Academic Free License (the "License") applies to any original work of authorship (the "Original Work") whose owner (the "Licensor") has placed the following licensing notice adjacent to the copyright notice for the Original Work: 3 | 4 | Licensed under the Academic Free License version 3.0 5 | 6 | 1) Grant of Copyright License. 
Licensor grants You a worldwide, royalty-free, non-exclusive, sublicensable license, for the duration of the copyright, to do the following: 7 | 8 | a) to reproduce the Original Work in copies, either alone or as part of a collective work; 9 | 10 | b) to translate, adapt, alter, transform, modify, or arrange the Original Work, thereby creating derivative works ("Derivative Works") based upon the Original Work; 11 | 12 | c) to distribute or communicate copies of the Original Work and Derivative Works to the public, under any license of your choice that does not contradict the terms and conditions, including Licensor's reserved rights and remedies, in this Academic Free License; 13 | 14 | d) to perform the Original Work publicly; and 15 | 16 | e) to display the Original Work publicly. 17 | 18 | 2) Grant of Patent License. Licensor grants You a worldwide, royalty-free, non-exclusive, sublicensable license, under patent claims owned or controlled by the Licensor that are embodied in the Original Work as furnished by the Licensor, for the duration of the patents, to make, use, sell, offer for sale, have made, and import the Original Work and Derivative Works. 19 | 20 | 3) Grant of Source Code License. The term "Source Code" means the preferred form of the Original Work for making modifications to it and all available documentation describing how to modify the Original Work. Licensor agrees to provide a machine-readable copy of the Source Code of the Original Work along with each copy of the Original Work that Licensor distributes. Licensor reserves the right to satisfy this obligation by placing a machine-readable copy of the Source Code in an information repository reasonably calculated to permit inexpensive and convenient access by You for as long as Licensor continues to distribute the Original Work. 21 | 22 | 4) Exclusions From License Grant. 
Neither the names of Licensor, nor the names of any contributors to the Original Work, nor any of their trademarks or service marks, may be used to endorse or promote products derived from this Original Work without express prior permission of the Licensor. Except as expressly stated herein, nothing in this License grants any license to Licensor's trademarks, copyrights, patents, trade secrets or any other intellectual property. No patent license is granted to make, use, sell, offer for sale, have made, or import embodiments of any patent claims other than the licensed claims defined in Section 2. No license is granted to the trademarks of Licensor even if such marks are included in the Original Work. Nothing in this License shall be interpreted to prohibit Licensor from licensing under terms different from this License any Original Work that Licensor otherwise would have a right to license. 23 | 24 | 5) External Deployment. The term "External Deployment" means the use, distribution, or communication of the Original Work or Derivative Works in any way such that the Original Work or Derivative Works may be used by anyone other than You, whether those works are distributed or communicated to those persons or made available as an application intended for use over a network. As an express condition for the grants of license hereunder, You must treat any External Deployment by You of the Original Work or a Derivative Work as a distribution under section 1(c). 25 | 26 | 6) Attribution Rights. You must retain, in the Source Code of any Derivative Works that You create, all copyright, patent, or trademark notices from the Source Code of the Original Work, as well as any notices of licensing and any descriptive text identified therein as an "Attribution Notice." You must cause the Source Code for any Derivative Works that You create to carry a prominent Attribution Notice reasonably calculated to inform recipients that You have modified the Original Work. 
27 | 28 | 7) Warranty of Provenance and Disclaimer of Warranty. Licensor warrants that the copyright in and to the Original Work and the patent rights granted herein by Licensor are owned by the Licensor or are sublicensed to You under the terms of this License with the permission of the contributor(s) of those copyrights and patent rights. Except as expressly stated in the immediately preceding sentence, the Original Work is provided under this License on an "AS IS" BASIS and WITHOUT WARRANTY, either express or implied, including, without limitation, the warranties of non-infringement, merchantability or fitness for a particular purpose. THE ENTIRE RISK AS TO THE QUALITY OF THE ORIGINAL WORK IS WITH YOU. This DISCLAIMER OF WARRANTY constitutes an essential part of this License. No license to the Original Work is granted by this License except under this disclaimer. 29 | 30 | 8) Limitation of Liability. Under no circumstances and under no legal theory, whether in tort (including negligence), contract, or otherwise, shall the Licensor be liable to anyone for any indirect, special, incidental, or consequential damages of any character arising as a result of this License or the use of the Original Work including, without limitation, damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses. This limitation of liability shall not apply to the extent applicable law prohibits such limitation. 31 | 32 | 9) Acceptance and Termination. If, at any time, You expressly assented to this License, that assent indicates your clear and irrevocable acceptance of this License and all of its terms and conditions. If You distribute or communicate copies of the Original Work or a Derivative Work, You must make a reasonable effort under the circumstances to obtain the express assent of recipients to the terms of this License. 
This License conditions your rights to undertake the activities listed in Section 1, including your right to create Derivative Works based upon the Original Work, and doing so without honoring these terms and conditions is prohibited by copyright law and international treaty. Nothing in this License is intended to affect copyright exceptions and limitations (including "fair use" or "fair dealing"). This License shall terminate immediately and You may no longer exercise any of the rights granted to You by this License upon your failure to honor the conditions in Section 1(c). 33 | 34 | 10) Termination for Patent Action. This License shall terminate automatically and You may no longer exercise any of the rights granted to You by this License as of the date You commence an action, including a cross-claim or counterclaim, against Licensor or any licensee alleging that the Original Work infringes a patent. This termination provision shall not apply for an action alleging patent infringement by combinations of the Original Work with other software or hardware. 35 | 36 | 11) Jurisdiction, Venue and Governing Law. Any action or suit relating to this License may be brought only in the courts of a jurisdiction wherein the Licensor resides or in which Licensor conducts its primary business, and under the laws of that jurisdiction excluding its conflict-of-law provisions. The application of the United Nations Convention on Contracts for the International Sale of Goods is expressly excluded. Any use of the Original Work outside the scope of this License or after its termination shall be subject to the requirements and penalties of copyright or patent law in the appropriate jurisdiction. This section shall survive the termination of this License. 37 | 38 | 12) Attorneys' Fees. 
In any action to enforce the terms of this License or seeking damages relating thereto, the prevailing party shall be entitled to recover its costs and expenses, including, without limitation, reasonable attorneys' fees and costs incurred in connection with such action, including any appeal of such action. This section shall survive the termination of this License. 39 | 40 | 13) Miscellaneous. If any provision of this License is held to be unenforceable, such provision shall be reformed only to the extent necessary to make it enforceable. 41 | 42 | 14) Definition of "You" in This License. "You" throughout this License, whether in upper or lower case, means an individual or a legal entity exercising rights under, and complying with all of the terms of, this License. For legal entities, "You" includes any entity that controls, is controlled by, or is under common control with you. For purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. 43 | 44 | 15) Right to Use. You may use the Original Work in all ways not otherwise restricted or conditioned by this License or by law, and Licensor promises not to interfere with or be responsible for such uses by You. 45 | 46 | 16) Modification of This License. This License is Copyright © 2005 Lawrence Rosen. Permission is granted to copy, distribute, or communicate this License without modification. Nothing in this License permits You to modify this License as applied to the Original Work or to Derivative Works. 
However, You may modify the text of this License and copy, distribute or communicate your modified version (the "Modified License") and apply it to other original works of authorship subject to the following conditions: (i) You may not indicate in any way that your Modified License is the "Academic Free License" or "AFL" and you may not use those names in the name of your Modified License; (ii) You must replace the notice specified in the first paragraph above with the notice "Licensed under " or with a notice of your own that is not confusingly similar to the notice in this License; and (iii) You may not claim that your original works are open source software unless your Modified License has been approved by Open Source Initiative (OSI) and You comply with its license review and certification process. 47 | -------------------------------------------------------------------------------- /code/MANIFEST.in: -------------------------------------------------------------------------------- 1 | include DALI/files/template.xml 2 | -------------------------------------------------------------------------------- /code/README.md: -------------------------------------------------------------------------------- 1 | # DALI-DATASET 2 | 3 | Code for working with the DALI dataset. 4 | You can download the dataset at [zenodo](https://zenodo.org/record/2577915) and find a tutorial and more information on how to use this code at [github/gabolsgabs](https://github.com/gabolsgabs/DALI). 5 | 6 | Copyright © Ircam 2018 7 | This package is distributed under the Academic Free License ("AFL") v. 3.0.
8 | For more information about the license read the LICENSE.txt 9 | -------------------------------------------------------------------------------- /code/setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | 3 | with open("README.md", "r") as fh: 4 | long_description = fh.read() 5 | 6 | setup(name='DALI-dataset', 7 | version='1.0.0', 8 | description='Code for working with the DALI dataset', 9 | url='http://github.com/gabolsgabs/DALI', 10 | author='Gabriel Meseguer Brocal', 11 | author_email='gabriel.meseguer.brocal@ircam.fr', 12 | license='afl-3.0', 13 | long_description=long_description, 14 | long_description_content_type="text/markdown", 15 | # https://help.github.com/articles/licensing-a-repository/#disclaimer 16 | packages=['DALI'], 17 | include_package_data=True, 18 | install_requires=['youtube_dl',], 19 | zip_safe=False) 20 | -------------------------------------------------------------------------------- /docs/images/Example.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/Example.png -------------------------------------------------------------------------------- /docs/images/graphs.key: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/graphs.key -------------------------------------------------------------------------------- /docs/images/horizontal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/horizontal.png -------------------------------------------------------------------------------- /docs/images/l1.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/l1.png -------------------------------------------------------------------------------- /docs/images/p1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/p1.png -------------------------------------------------------------------------------- /docs/images/vertical.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/vertical.png -------------------------------------------------------------------------------- /docs/images/w1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/w1.png -------------------------------------------------------------------------------- /versions/README.md: -------------------------------------------------------------------------------- 1 | Here you can find the different dali_data versions. 2 | Dali_data has its own version control. 3 | There are always two numbers: version a.b. 4 | 5 | * The first number (**a**) refers to the singing voice detection system generation used for retrieving the audio and finding the global alignment. 6 | 7 | * The second number (**b**) refers to improvements for solving local alignment or note problems. 8 | 9 | The definition of each version follows the standard presented at: 10 | [Geoffroy Peeters, Karën Fort. Towards a (better) Definition of Annotated MIR Corpora. International Society for Music Information Retrieval Conference (ISMIR), Oct 2012, Porto, Portugal.
2012.](https://hal.archives-ouvertes.fr/hal-00713074) 11 | 12 | 13 | 14 | ### Version 1.0. 15 | 16 | * **Download** it [here](https://zenodo.org/record/2577915) 17 | * The ground-truth in [here](https://github.com/gabolsgabs/DALI/blob/master/versions/v1/) --> update 12/11/2018 18 | * The [definition](https://github.com/gabolsgabs/DALI/blob/master/versions/v1/v1.0.md) 19 | * Here's the [paper](http://ismir2018.ircam.fr/doc/pdfs/35_Paper.pdf) 20 | * and the citation [bibtex](https://github.com/gabolsgabs/DALI/blob/master/citations/DALI_v1.0.bib): 21 | 22 | @inproceedings{Meseguer-Brocal_2018, 23 | Author = {Meseguer-Brocal, Gabriel and Cohen-Hadria, Alice and Peeters, Geoffroy}, 24 | Booktitle = {19th International Society for Music Information Retrieval Conference}, 25 | Editor = {ISMIR}, 26 | Month = {September}, 27 | Title = {DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm.}, 28 | Year = {2018}} 29 | 30 | 31 | ### Version 2.0. 32 | 33 | ##### Soon 34 | -------------------------------------------------------------------------------- /versions/v1/gt_v1.0_22:11:18.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/versions/v1/gt_v1.0_22:11:18.gz -------------------------------------------------------------------------------- /versions/v1/v1.0.md: -------------------------------------------------------------------------------- 1 | _____ 2 | **-C1-** Corpus ID: corpus:MIR:DALI:Vocal:2018:version1.0
3 | _____ 4 | **-A- Raw Corpus**
5 | **(A1) Definition:** (a13) real items. **5358** songs, each with -- its full-duration audio, -- its time-aligned lyrics and -- its time-aligned notes of the vocal melody. Popularity-oriented, defined by karaoke user demand.
6 | **(A2) Type of media diffusion:** isolated music tracks.
7 | _____ 8 | **-B- Annotations**
9 | **(B1) Origin:** (b15) Traditional manual human annotations.
10 | **(B21) Concepts definition:** note and text annotations of the vocal melody, made by non-expert users for playing karaoke games. Annotations are automatically aligned to the audio tracks. Four different granularity levels are constructed: syllables, words, lines and paragraphs.
11 | **(B22) Annotation rules:** unknown.
12 | **(B31) Annotators:** unknown; non-expert users from the open-source karaoke community.
13 | **(B32) Validation/reliability:** not yet proven.
14 | **(B4) Annotation tools:** UltraStar Song Editor + the automatic alignment of [Meseguer-Brocal_2018].
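The granularity levels above are stored as nested ("vertical") annotations that the companion code can flatten into per-level ("horizontal") lists. A simplified, self-contained sketch of that unroll idea — the nested data below is invented for illustration and is not a real DALI entry, and this toy `unroll` keeps only `time` and `text`:

```python
# Simplified sketch of the vertical -> horizontal "unroll" idea from
# code/DALI/utilities.py: lines contain words (and paragraphs contain
# lines); flattening at a chosen depth yields one flat list per level.

def unroll(annot, depth):
    """Flatten nested annotations: depth=0 keeps the current level
    (joining child texts), depth=1 returns its children, and so on."""
    if depth == 0:
        out = []
        for a in annot:
            text = a['text']
            if not isinstance(text, str):  # join the child texts
                text = ' '.join(w['text'] for w in text)
            out.append({'time': a['time'], 'text': text})
        return out
    return [item for a in annot for item in unroll(a['text'], depth - 1)]

# One line containing two word-level annotations (invented data):
lines = [{'time': (0.0, 2.0),
          'text': [{'time': (0.0, 1.0), 'text': 'hello'},
                   {'time': (1.0, 2.0), 'text': 'world'}]}]

line_level = unroll(lines, 0)   # one entry with text 'hello world'
word_level = unroll(lines, 1)   # the two word annotations
```

The real implementation also carries `freq` and an `index` field (so `roll` can nest levels back together); the sketch omits those for brevity.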
15 | _____ 16 | **-C- Documents and Storing**
17 | **(C1) Audio identifier and storage:** unique URL identifiers to YouTube videos are provided; annotations are distributed under the AFL license as gz files, accessible through this GitHub repository.
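The gz files mentioned in (C1) are gzip-compressed pickles, mirroring `write_in_gzip`/`read_gzip` in `code/DALI/utilities.py`. A minimal round-trip sketch — the entry dict here is an invented stand-in, not a real DALI annotation:

```python
import gzip
import os
import pickle
import tempfile

# Stand-in entry (hypothetical fields, not a real DALI object).
entry = {'id': 'fake_dali_id', 'annot_param': {'fr': 120.0, 'offset': 0.0}}

path = os.path.join(tempfile.mkdtemp(), 'fake_entry.gz')
with gzip.open(path, 'wb') as gz:
    gz.write(pickle.dumps(entry, protocol=2))  # protocol 2, as in write_in_gzip

with gzip.open(path, 'rb') as f:               # as in read_gzip
    loaded = pickle.load(f)
```

Protocol 2 keeps the files loadable from both Python 2 and Python 3, which is why the repo's readers also carry a binary/text fallback.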
18 | 19 | 20 | Version 1.0 is the result of the methodology described in: ***[Meseguer-Brocal_2018]*** [G. Meseguer-Brocal, A. Cohen-Hadria and G. Peeters. DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm. In ISMIR Paris, France, 2018.](http://ismir2018.ircam.fr/doc/pdfs/35_Paper.pdf) 21 | 22 | NOTES: 23 | - **Singing Voice detection system** = student trained with the dataset produced by the teacher J+M. 24 | --------------------------------------------------------------------------------
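When a ground-truth file (e.g. `gt_v1.0_22:11:18.gz` above) supplies a corrected `(fr, offset)` pair for a song, every annotation time is re-expressed through the beat domain, mirroring `time2beat`/`beat2time`/`change_time` in `code/DALI/utilities.py`. A self-contained sketch of that math, with numpy's rounding replaced by the stdlib and the example parameter values invented:

```python
# Minimal sketch of DALI's (fr, offset) time re-mapping: seconds ->
# beat index -> seconds under the corrected parameters.

def time2beat(time, fr, offset=0.0):
    """Time (seconds) to beat index, given a frame rate fr (beats
    per minute) and a global offset (seconds)."""
    bps = fr / 60.0
    return int(round((time - offset) * bps))

def beat2time(beat, fr, offset=0.0):
    """Inverse mapping: beat index back to seconds."""
    bps = fr / 60.0
    return beat / bps + offset

def change_time(time, old_param, new_param):
    """Re-express a time under new (fr, offset) parameters, as done
    when applying a ground-truth global alignment correction."""
    beat = time2beat(time, **old_param)
    return beat2time(beat, **new_param)

# A word annotated at t=10.5 s under fr=120, offset=0.0 moves when the
# alignment is corrected to fr=121, offset=0.2 (values invented):
t_new = change_time(10.5, {'fr': 120, 'offset': 0.0},
                    {'fr': 121, 'offset': 0.2})
```

Going through the integer beat index is what lets a single `(fr, offset)` pair shift and stretch all of a song's annotations consistently.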