├── .gitignore ├── COPYRIGHT.txt ├── LICENSE.md ├── README.md ├── citations └── DALI_v1.0.bib ├── code ├── DALI │ ├── Annotations.py │ ├── __init__.py │ ├── download.py │ ├── extra.py │ ├── files │ │ ├── dali_v1_metadata.json │ │ └── template.xml │ ├── main.py │ ├── utilities.py │ └── vizualization.py ├── LICENSE ├── MANIFEST.in ├── README.md └── setup.py ├── docs └── images │ ├── Example.png │ ├── graphs.key │ ├── horizontal.png │ ├── l1.png │ ├── p1.png │ ├── vertical.png │ └── w1.png └── versions ├── README.md └── v1 ├── gt_v1.0_22:11:18.gz └── v1.0.md /.gitignore: -------------------------------------------------------------------------------- 1 | .DS_Store 2 | *.pyc 3 | /code/DALI/cross-correlation.py 4 | /code/DALI/my_helper.py 5 | /code/DALI/ground_truth.py 6 | /code/DALI/exp_alginment.py 7 | /code/DALI/get_audio.py 8 | /code/DALI/melody.py 9 | /images/graphs.key 10 | Icon? 11 | -------------------------------------------------------------------------------- /COPYRIGHT.txt: -------------------------------------------------------------------------------- 1 | Copyright © Ircam 2018 2 | DALI by Gabriel Meseguer-Brocal, Alice Cohen-Hadrian and Peeters Geoffroy. 3 | DALI is offered free of charge for non-commercial research use only under the terms of the Creative Commons Attribution Noncommercial License: http://creativecommons.org/licenses/by-nc-sa/4.0/ 4 | The DALI is provided for educational purposes only and the material contained in them should not be used for any commercial purpose without the express permission of the copyright holders. 5 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. 2 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | [horizontal]: ./docs/images/horizontal.png 3 | [vertical]: ./docs/images/vertical.png 4 | [p1]: ./docs/images/p1.png 5 | [l1]: ./docs/images/l1.png 6 | [w1]: ./docs/images/w1.png 7 | [Example]: ./docs/images/Example.png 8 | 9 | 10 | # WELCOME TO THE DALI DATASET: a large **D**ataset of synchronised **A**udio, **L**yr**I**cs and vocal notes. 11 | 12 | You can find a detailed explanation of how DALI has been created at: 13 | ***[Meseguer-Brocal_2018]*** [G. Meseguer-Brocal, A. Cohen-Hadria and G. Peeters. DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm. In ISMIR Paris, France, 2018.](http://ismir2018.ircam.fr/doc/pdfs/35_Paper.pdf) 14 | 15 | Cite this [paper](https://zenodo.org/record/1492443): 16 | 17 | >@inproceedings{Meseguer-Brocal_2018, 18 | Author = {Meseguer-Brocal, Gabriel and Cohen-Hadria, Alice and Peeters, Geoffroy}, 19 | Booktitle = {19th International Society for Music Information Retrieval Conference}, 20 | Editor = {ISMIR}, 21 | Month = {September}, 22 | Title = {DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm.}, 23 | Year = {2018}} 24 | 25 | 26 | 27 | Here's an example of the kind of information DALI contains: 28 | 29 | ![alt text][Example] 30 | 31 | 32 | DALI has two main elements: 33 | 34 | ## 1- The dataset - dali_data 35 | 36 | The dataset itself. It is denoted as **dali_data** and it is presented as a collection of **gz** files. 37 | You can find the different DALI_data versions in [here](https://github.com/gabolsgabs/DALI/blob/master/versions/). 
38 | 39 | ## 2- The code for working with DALI - dali_code 40 | The code, denoted as **dali_code**, for reading and working with dali_data. 41 | It is stored in this repository and presented as a python package. 42 | Dali_code has its own versions controlled by this github. 43 | The release and stable versions can be found at [pypi](https://pypi.org/project/DALI-dataset/). 44 | 45 | repository
46 | ├── code
47 | │   ├── DALI
48 | │   │   ├── \_\_init\_\_.py
49 | │   │   ├── Annotations.py
50 | │   │   ├── main.py
51 | │   │   ├── utilities.py
52 | │   │   ├── extra.py
53 | │   │   ├── download.py
54 | │   │   ├── vizualization.py
55 | │   └── setup.py
56 | 
57 | 
58 | # NEWS:
59 | 
60 | Ground-Truth for version 1.0 updated with 105 songs.
61 | Remember that DALI is an ongoing project. There are many things to solve.
62 | 
63 | Currently we are working on:
64 | * the second generation of the singing voice detection system.
65 | * solving errors in individual notes.
66 | * solving global note errors (songs where all the notes are placed off by the same interval).
67 | * errors in local note alignments.
68 | 
69 | If you have any suggestions or improvements, please contact us at: dali [dot] dataset [at] gmail [dot] com
70 | 
71 | For any problem with the package that deals with the annotations, open an issue in this repository.
72 | 
73 | Thank you.
74 | 
75 | # TUTORIAL:
76 | 
77 | First of all, [download](https://github.com/gabolsgabs/DALI/blob/master/versions/) your Dali_data version and clone this repository.
78 | 
79 | 
80 | ## 0- Installing Dali_code.
81 | For the release and stable versions just run the command:
82 | 
83 | > pip install dali-dataset
84 | 
85 | For non-release and unstable versions you can install them manually by going to the folder DALI/code and running:
86 | 
87 | > pip install .
88 | 
89 | You can upgrade DALI to future versions with:
90 | 
91 | > pip install dali-dataset --upgrade
92 | 
93 | DALI can be uninstalled with:
94 | 
95 | > pip uninstall dali-dataset
96 | 
97 | Requirements: **numpy** and **youtube_dl**
98 | 
99 | **NOTE**: the version of the code in pip only refers to the code itself. The different versions of the Dali_data can be found above.
100 | 
101 | 
102 | ## 1- Loading DALI_data.
103 | 
104 | DALI is presented as a set of **gz** files.
105 | Each gz contains the annotations of a particular song.
106 | We use a unique id for each entry.
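As a rough, self-contained sketch of what reading such a **gz** entry involves (an assumption for illustration: the real files are read by `dali_code` and hold `Annotations` objects, so a plain dict and hypothetical helpers stand in here):

```python
import gzip
import os
import pickle
import tempfile

# Hypothetical sketch: round-trip one entry through a gzip-compressed pickle.
# The real DALI gz files hold Annotations objects loaded via dali_code.
def write_gzip(path, obj):
    with gzip.open(path, 'wb') as f:
        pickle.dump(obj, f)

def read_gzip(path):
    with gzip.open(path, 'rb') as f:
        return pickle.load(f)

entry = {'id': 'a_dali_unique_id', 'artist': 'An Artist', 'title': 'A song title'}
path = os.path.join(tempfile.gettempdir(), 'a_dali_unique_id.gz')
write_gzip(path, entry)
assert read_gzip(path) == entry
```

In practice you never read the files by hand; `dali_code` does it for you, as shown next.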
107 | You can load your dali_data version as follows:
108 | 
109 |     import DALI as dali_code
110 |     dali_data_path = 'full_path_to_your_dali_data'
111 |     dali_data = dali_code.get_the_DALI_dataset(dali_data_path, skip=[], keep=[])
112 | 
113 | This function can also be used to load a subset of the DALI dataset by providing the ids of the entries you either want to **skip** or to **keep**.
114 | 
115 | 
116 | **NOTE**: Loading DALI might take some minutes depending on your computer and python version. Python3 is faster than python2.
117 | 
118 | Additionally, each DALI version contains a DALI_DATA_INFO.gz:
119 | 
120 |     dali_info = dali_code.get_info(dali_data_path + 'info/DALI_DATA_INFO.gz')
121 |     print(dali_info[0]) -> array(['DALI_ID', 'NAME', 'YOUTUBE', 'WORKING'])
122 | 
123 | This file matches the unique DALI id with the artist_name-song_title, the youtube url and a boolean that indicates whether the youtube link is working.
124 | 
125 | 
126 | 
127 | ## 1.1- An annotation instance.
128 | 
129 | _dali_data_ is a dictionary where each key is a unique id and each value is an instance of the class DALI.Annotations, namely **an annotation instance**.
130 | 
131 |     entry = dali_data['a_dali_unique_id']
132 |     type(entry) -> DALI.Annotations.Annotations
133 | 
134 | Each annotation instance has two attributes: **info** and **annotations**.
135 | 
136 |     entry.info --> {'id': 'a_dali_unique_id',
137 |         'artist': 'An Artist',
138 |         'title': 'A song title',
139 |         'dataset_version': 1.0, **# dali_data version**
140 |         'ground-truth': False,
141 |         'scores': {'NCC': 0.8098520072498807,
142 |                    'manual': 0.0}, **# Not ready yet**
143 |         'audio': {'url': 'a youtube url',
144 |             'path': 'None',
145 |             **# Up to you to modify it to point to your local audio file**
146 |             'working': True},
147 |         'metadata': {'album': 'An album title',
148 |             'release_date': 'A year',
149 |             'cover': 'link to an image with the cover',
150 |             'genres': ['genre_0', ...
, 'genre_n'],
151 |             # The number of genres depends on the song
152 |             'language': 'a language'}}
153 | 
154 |     entry.annotations --> {'annot': {'the annotations themselves'},
155 |         'type': 'horizontal' or 'vertical',
156 |         'annot_param': {'fr': float(frame rate used in the annotation process),
157 |                         'offset': float(offset value)}}
158 | 
159 | 
160 | ## 1.2- Saving as json.
161 | 
162 | You can export annotations to and import them from a json file.
163 | 
164 |     path_save = 'my_full_save_path'
165 |     name = 'my_annot_name'
166 |     # export
167 |     entry.write_json(path_save, name)
168 |     # import
169 |     my_json_entry = dali_code.Annotations()
170 |     my_json_entry.read_json(os.path.join(path_save, name+'.json'))
171 | 
172 | 
173 | ## 1.3- Ground-truth.
174 | 
175 | Each dali_data has its own [ground-truth file](https://github.com/gabolsgabs/DALI/tree/master/versions/).
176 | The annotations that are part of the ground-truth are entries of the dali_data with the offset and fr parameters manually annotated.
177 | 
178 | You can easily load a ground-truth file:
179 | 
180 |     gt_file = 'full_path_to_my_ground_truth_file'
181 |     # you can load the ground-truth
182 |     gt = dali_code.utilities.read_gzip(gt_file)
183 |     type(gt) --> dict
184 |     gt['a_dali_unique_id'] --> {'offset': float(a_number),
185 |                                 'fr': float(a_number)}
186 | 
187 | You can also load a **dali_gt** with all the entries of the dali_data that are part of the ground-truth, with their annotations updated to the manually annotated offset and fr parameters:
188 | 
189 |     # dali_gt only with ground_truth songs
190 |     gt = dali_code.utilities.read_gzip(gt_file)
191 |     dali_gt = dali_code.get_the_DALI_dataset(dali_data_path, gt_file, keep=gt.keys())
192 |     len(dali_gt) == len(gt)
193 | 
194 | 
195 | You can also load the whole dali_data and update the songs that are part of the ground truth with the manually verified offset and fr parameters.
196 | 
197 |     # Two options:
198 |     # 1- once you have your dali_data
199 |     dali_data = dali_code.update_with_ground_truth(dali_data, gt_file)
200 | 
201 |     # 2- while reading the dataset
202 |     dali_data = dali_code.get_the_DALI_dataset(dali_data_path, gt_file=gt_file)
203 | 
204 | 
205 | NOTE 1: Please be sure you have the latest [ground truth version](https://github.com/gabolsgabs/DALI/tree/master/versions/).
206 | 
207 | # 2- Getting the audio.
208 | 
209 | You can retrieve the audio for each annotation (if available) using the function dali_code.get_audio():
210 | 
211 |     path_audio = 'full_path_to_store_the_audio'
212 |     errors = dali_code.get_audio(dali_info, path_audio, skip=[], keep=[])
213 |     errors -> ['dali_id', 'youtube_url', 'error']
214 | 
215 | This function can also be used to download a subset of the DALI dataset by providing the ids of the entries you either want to **skip** or to **keep**.
216 | 
217 | 
218 | # 3- Working with DALI.
219 | 
220 | Annotations are in:
221 | > entry.annotations['annot']
222 | 
223 | and they are presented in two different formats: **'horizontal'** or **'vertical'**.
224 | You can easily change the format using the functions:
225 | 
226 |     entry.horizontal2vertical()
227 |     entry.vertical2horizontal()
228 | 
229 | ## 3.1- Horizontal.
230 | In this format each level of granularity is stored individually.
231 | It is the default format.
232 | 
233 | ![alt text][horizontal]
234 | 
235 |     entry.vertical2horizontal() --> 'Annot are already in a vertical format'
236 |     entry.annotations['type'] --> 'horizontal'
237 |     entry.annotations['annot'].keys() --> ['notes', 'lines', 'words', 'paragraphs']
238 | 
239 | Each level contains a list of annotations where each element has:
240 | 
241 |     my_annot = entry.annotations['annot']['notes']
242 |     my_annot[0] --> {'text': 'wo',  # the annotation itself.
243 |         'time': [12.534, 12.659],  # the beginning and end of the segment in seconds.
244 |         'freq': [466.1637615180899, 466.1637615180899],  # the frequency range the text information covers. At the lowest level, syllables, it corresponds to the vocal note.
245 |         'index': 0}  # link with the upper level. For example, index 0 at the 'words' level means that this particular word belongs to the first line ([0]). The paragraphs level has no index key.
246 | 
247 | ### 3.1.1- Visualizing an annotation file.
248 | 
249 | You can export the annotations of each individual level to an xml or text file to visualize them with Audacity or AudioSculpt. The pitch information is only presented in the xml files for AudioSculpt.
250 | 
251 |     my_annot = entry.annotations['annot']['notes']
252 |     path_save = 'my_save_path'
253 |     name = 'my_annot_name'
254 |     dali_code.write_annot_txt(my_annot, name, path_save)
255 |     # import the txt file in Audacity
256 |     dali_code.write_annot_xml(my_annot, name, path_save)
257 |     # import the Rhythm XML file in AudioSculpt
258 | 
259 | 
260 | ### 3.1.2- Examples.
261 | This format is meant to be used for working with each level individually.
262 | > Example 1: recovering the main vocal melody.
263 | 
264 | Let's use the extra function dali_code.annot2vector() that transforms the annotations into a vector. There are two types of vector:
265 | 
266 | - type='voice': each frame has a value 1 or 0 for voice or no voice.
267 | - type='melody': each frame has the freq value of the main vocal melody.
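As a minimal, self-contained sketch of the idea behind these two vector types (a hypothetical pure-Python stand-in, `annot_to_vector`, for `dali_code.annot2vector()`, which in the package is implemented with numpy):

```python
# Hypothetical stand-in for dali_code.annot2vector(): quantize each note's
# [begin, end] time (in seconds) to frames of size time_r and fill the span
# with 1 (type='voice') or the note's mean frequency (type='melody').
def annot_to_vector(annot, duration, time_r, type='voice'):
    signal = [0.0] * int(duration / time_r)
    for note in annot:
        b = int(round(note['time'][0] / time_r))
        e = int(round(note['time'][1] / time_r))
        for i in range(b, min(e + 1, len(signal))):
            if type == 'voice':
                signal[i] = 1.0
            elif type == 'melody':
                signal[i] = sum(note['freq']) / len(note['freq'])
    return signal

# Two toy notes over a 2-second excerpt at a 0.25 s resolution.
notes = [{'time': [0.0, 0.5], 'freq': [440.0, 440.0]},
         {'time': [1.0, 1.5], 'freq': [466.16, 466.16]}]
voice = annot_to_vector(notes, 2.0, 0.25, type='voice')
melody = annot_to_vector(notes, 2.0, 0.25, type='melody')
# voice  -> [1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0]
# melody -> [440.0, 440.0, 440.0, 0.0, 466.16, 466.16, 466.16, 0.0]
```

The packaged function returns a numpy array; treat this only as an illustration of the frame layout it produces.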
268 | 
269 |     my_annot = entry.annotations['annot']['notes']
270 |     time_resolution = 0.014
271 |     # this value is just an example; you should use the end of your audio file
272 |     end_of_the_song = entry.annotations['annot']['notes'][-1]['time'][1] + 10
273 |     melody = dali_code.annot2vector(my_annot, end_of_the_song, time_resolution, type='melody')
274 | 
275 | **NOTE: have a look at dali_code.annot2vector_chopping() for computing a vector chopped with respect to a given window and hop size.**
276 | 
277 | > Example 2: find the audio frames that define each paragraph.
278 | 
279 | Let's use the other extra function dali_code.annot2frames() that transforms time in seconds into time in frames.
280 | 
281 |     my_annot = entry.annotations['annot']['paragraphs']
282 |     paragraphs = [i['time'] for i in dali_code.annot2frames(my_annot, time_resolution)]
283 |     paragraphs --> [(49408, 94584), ..., (3080265, 3299694)]
284 | 
285 | 
286 | **NOTE**: dali_code.annot2frames() can also be used in the vertical format but not dali_code.annot2vector().
287 | 
288 | ## 3.2- Vertical.
289 | In this format the different levels of granularity are hierarchically connected:
290 | 
291 | ![alt text][vertical]
292 | 
293 |     entry.horizontal2vertical()
294 |     entry.annotations['type'] --> 'vertical'
295 |     entry.annotations['annot'].keys() --> ['hierarchical']
296 |     my_annot = entry.annotations['annot']['hierarchical']
297 | 
298 | Each element of the list is a paragraph.
299 | 
300 |     my_annot[0] --> {'freq': [277.1826309768721, 493.8833012561241],  # the frequency range the text information covers
301 |         'time': [12.534, 41.471500000000006],  # the beginning and end of the time segment.
302 |         'text': [line_0, line_1, ..., line_n]}
303 | 
304 | ![alt text][p1]
305 | 
306 | where 'text' contains all the lines of the paragraph.
Each line follows the same format:
307 | 
308 |     lines_1paragraph = my_annot[0]['text']
309 |     lines_1paragraph[0] --> {'freq': [...], 'time': [...],
310 |                              'text': [word_0, word_1, ..., word_n]}
311 | 
312 | ![alt text][l1]
313 | 
314 | Again, each word contains all the notes for that word to be sung:
315 | 
316 |     words_1line_1paragraph = lines_1paragraph[0]['text']
317 |     words_1line_1paragraph[0] --> {'freq': [...], 'time': [...],
318 |                                    'text': [note_0, note_1, ..., note_n]}
319 | 
320 | ![alt text][w1]
321 | 
322 | Only the deepest level directly has the text information.
323 | 
324 |     notes_1word_1line_1paragraph = words_1line_1paragraph[1]['text']
325 |     notes_1word_1line_1paragraph[0] --> {'freq': [...], 'time': [...],
326 |                                          'text': 'note text'}
327 | 
328 | You can always get the text at a specific point with dali_code.get_text(), e.g.:
329 | 
330 |     dali_code.get_text(lines_1paragraph) --> ['text word_0', 'text word_1', ..., text_word_n]
331 |     # words in the first line of the first paragraph
332 | 
333 |     dali_code.get_text(my_annot[0]['text']) --> ['text word_0', 'text word_1', ..., text_word_n]
334 |     # words in the first paragraph
335 | 
336 | ### 3.2.1- Examples.
337 | This organization is meant to be used for working with specific hierarchical blocks.
338 | 
339 | > Example 1: working only with a single paragraph.
340 | 
341 |     my_paragraph = my_annot[3]['text']
342 |     text_paragraph = dali_code.get_text(my_paragraph)
343 | 
344 | Additionally, you can easily retrieve all its individual information with the function dali_code.unroll():
345 | 
346 |     lines_in_paragraph, _ = dali_code.unroll(my_paragraph, depth=0, output=[])
347 |     words_in_paragraph, _ = dali_code.unroll(my_paragraph, depth=1, output=[])
348 |     notes_in_paragraph, _ = dali_code.unroll(my_paragraph, depth=2, output=[])
349 | 
350 | > Example 2: working only with the first line of that paragraph
351 | 
352 |     my_line = my_annot[3]['text'][0]['text']
353 |     text_line = dali_code.get_text(my_line)
354 |     words_in_line, _ = dali_code.unroll(my_line, depth=0, output=[])
355 |     notes_in_line, _ = dali_code.unroll(my_line, depth=1, output=[])
356 | 
357 | # 4- Correcting Annotations.
358 | 
359 | So far we have only faced global alignment problems. You can change this alignment by modifying the offset and frame rate parameters. The original ones are stored at:
360 | 
361 |     print(entry.annotations['annot_param'])
362 |     {'offset': float(a_number), 'fr': float(a_number)}
363 | 
364 | If you find a better set of parameters you can modify the annotations using the function dali_code.change_time():
365 | 
366 |     dali_code.change_time(entry, new_offset, new_fr)
367 |     # The default new_offset and new_fr are entry.annotations['annot_param']
368 | 
369 | We encourage you to send us your parameters in order to improve DALI.
370 | 
371 | _____
372 | You can contact us at:
373 | 
374 | > dali dot dataset at gmail dot com
375 | 
376 | This research has received funding from the French National Research Agency under the contract ANR-16-CE23-0017-01 (WASABI project).
377 | 
378 | Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
379 | 
--------------------------------------------------------------------------------
/citations/DALI_v1.0.bib:
--------------------------------------------------------------------------------
1 | @inproceedings{Meseguer-Brocal_2018,
2 | Author = {Meseguer-Brocal, Gabriel and Cohen-Hadria, Alice and Peeters, Geoffroy},
3 | Booktitle = {19th International Society for Music Information Retrieval Conference},
4 | Editor = {ISMIR},
5 | Month = {September},
6 | Title = {DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm.},
7 | Year = {2018}}
8 | 
--------------------------------------------------------------------------------
/code/DALI/Annotations.py:
--------------------------------------------------------------------------------
1 | """ ANNOTATIONS class.
2 | 
3 | GABRIEL MESEGUER-BROCAL 2018
4 | """
5 | from .extra import (unroll, roll)
6 | from . import utilities as ut
7 | 
8 | 
9 | class Annotations(object):
10 |     """Basic class that stores annotations and their information.
11 | 
12 |     It contains some methods for transforming the annot representation.
13 | """ 14 | 15 | def __init__(self, i=u'None'): 16 | self.info = {'id': i, 'artist': u'None', 'title': u'None', 17 | 'audio': {'url': u'None', 'working': False, 18 | 'path': u'None'}, 19 | 'metadata': {}, 'scores': {'NCC': 0.0, 'manual': 0.0}, 20 | 'dataset_version': 0.0, 'ground-truth': False} 21 | self.annotations = {'type': u'None', 'annot': {}, 22 | 'annot_param': {'fr': 0.0, 'offset': 0.0}} 23 | self.errors = None 24 | return 25 | 26 | def read_json(self, fname): 27 | """Read the annots from a json file.""" 28 | data = ut.read_json(fname) 29 | if(ut.check_structure(self.info, data['info']) and 30 | ut.check_structure(self.annotations, data['annotations'])): 31 | self.info = data['info'] 32 | self.annotations = data['annotations'] 33 | else: 34 | print('ERROR: wrong format') 35 | return 36 | 37 | def write_json(self, pth, name): 38 | """Writes the annots into a json file.""" 39 | data = {'info': self.info, 'annotations': self.annotations} 40 | ut.write_in_json(pth, name, data) 41 | return 42 | 43 | def horizontal2vertical(self): 44 | """Converts horizontal annotations (indivual levels) into a vertical 45 | representation (hierarchical).""" 46 | try: 47 | if self.annotations['type'] == 'horizontal': 48 | self.annotations['annot'] = roll(self.annotations['annot']) 49 | self.annotations['type'] = 'vertical' 50 | else: 51 | print('Annot are already in a horizontal format') 52 | except Exception as e: 53 | print('ERROR: unknow type of annotations') 54 | return 55 | 56 | def vertical2horizontal(self): 57 | """Converts vertical representation (hierarchical) into a horizontal 58 | annotations (indivual levels).""" 59 | try: 60 | if self.annotations['type'] == 'vertical': 61 | self.annotations['annot'] = unroll(self.annotations['annot']) 62 | self.annotations['type'] = 'horizontal' 63 | else: 64 | print('Annot are already in a vertical format') 65 | except Exception as e: 66 | print('ERROR: unknow type of annotations') 67 | return 68 | 69 | def is_horizontal(self): 70 
| output = False 71 | if self.annotations['type'] == 'horizontal': 72 | output = True 73 | return output 74 | 75 | def is_vertical(self): 76 | output = False 77 | if self.annotations['type'] == 'vertical': 78 | output = True 79 | return output 80 | -------------------------------------------------------------------------------- /code/DALI/__init__.py: -------------------------------------------------------------------------------- 1 | from .Annotations import Annotations 2 | from DALI.extra import (annot2frames, annot2vector, annot2vector_chopping, get_audio) 3 | from DALI.download import audio_from_url 4 | from DALI.main import (get_the_DALI_dataset, get_an_entry, get_info, change_time, update_with_ground_truth) 5 | from DALI.utilities import (get_text, unroll) 6 | from DALI.vizualization import (write_annot_txt, write_annot_xml) 7 | -------------------------------------------------------------------------------- /code/DALI/download.py: -------------------------------------------------------------------------------- 1 | from . 
import utilities as ut 2 | import os 3 | import youtube_dl 4 | 5 | base_url = 'http://www.youtube.com/watch?v=' 6 | 7 | 8 | class MyLogger(object): 9 | def debug(self, msg): 10 | print(msg) 11 | 12 | def warning(self, msg): 13 | print(msg) 14 | 15 | def error(self, msg): 16 | print(msg) 17 | 18 | 19 | def my_hook(d): 20 | if d['status'] == 'finished': 21 | print('Done downloading, now converting ...') 22 | 23 | 24 | def get_my_ydl(directory=os.path.dirname(os.path.abspath(__file__))): 25 | ydl = None 26 | outtmpl = None 27 | if ut.check_directory(directory): 28 | outtmpl = os.path.join(directory, '%(title)s.%(ext)s') 29 | ydl_opts = {'format': 'bestaudio/best', 30 | 'postprocessors': [{'key': 'FFmpegExtractAudio', 31 | 'preferredcodec': 'mp3', 32 | 'preferredquality': '320'}], 33 | 'outtmpl': outtmpl, 34 | 'logger': MyLogger(), 35 | 'progress_hooks': [my_hook], 36 | 'verbose': False, 37 | 'ignoreerrors': False, 38 | 'external_downloader': 'ffmpeg', 39 | 'nocheckcertificate': True} 40 | # 'external_downloader_args': "-j 8 -s 8 -x 8 -k 5M"} 41 | # 'maxBuffer': 'Infinity'} 42 | # it uses multiple connections for speed up the downloading 43 | # 'external-downloader': 'ffmpeg'} 44 | ydl = youtube_dl.YoutubeDL(ydl_opts) 45 | ydl.cache.remove() 46 | import time 47 | time.sleep(.5) 48 | return ydl 49 | 50 | 51 | def audio_from_url(url, name, path_output, errors=[]): 52 | """ 53 | Download audio from a url. 
54 | url : str 55 | url of the video (after watch?v= in youtube) 56 | name : str 57 | used to store the data 58 | path_output : str 59 | path for storing the data 60 | """ 61 | error = None 62 | 63 | # ydl(youtube_dl.YoutubeDL): extractor 64 | ydl = get_my_ydl(path_output) 65 | 66 | ydl.params['outtmpl'] = ydl.params['outtmpl'] % { 67 | 'ext': ydl.params['postprocessors'][0]['preferredcodec'], 68 | 'title': name} 69 | 70 | if ydl: 71 | print ("Downloading " + url) 72 | try: 73 | ydl.download([base_url + url]) 74 | except Exception as e: 75 | print(e) 76 | error = e 77 | if error: 78 | errors.append([name, url, error]) 79 | return 80 | -------------------------------------------------------------------------------- /code/DALI/extra.py: -------------------------------------------------------------------------------- 1 | """ Extra function: annot2vector, annot2frames, unroll and roll. 2 | 3 | Transformating the annots a song into different representations. 4 | They are disconnected to the class because they can be 5 | applyied to a subsection i.e. for transforming only one indivual level 6 | to a vector representation. 7 | 8 | 9 | GABRIEL MESEGUER-BROCAL 2018 10 | """ 11 | import copy 12 | import numpy as np 13 | from .download import (audio_from_url, get_my_ydl) 14 | from . import utilities as ut 15 | 16 | 17 | def unroll(annot): 18 | """Unrolls the hierarchical information into paragraphs, lines, words 19 | keeping the relations with the key 'index.' 20 | """ 21 | tmp = copy.deepcopy(annot['hierarchical']) 22 | p, _ = ut.unroll(tmp, depth=0, output=[]) 23 | l, _ = ut.unroll(tmp, depth=1, output=[]) 24 | w, _ = ut.unroll(tmp, depth=2, output=[]) 25 | m, _ = ut.unroll(tmp, depth=3, output=[]) 26 | return {'paragraphs': p, 'lines': l, 'words': w, 'notes': m} 27 | 28 | 29 | def roll(annot): 30 | """Rolls the individual info into a hierarchical level. 
31 | 32 | Output example: [paragraph]['text'][line]['text'][word]['text'][notes]' 33 | """ 34 | tmp = copy.deepcopy(annot) 35 | output = ut.roll(tmp['notes'], tmp['words']) 36 | output = ut.roll(output, tmp['lines']) 37 | output = ut.roll(output, tmp['paragraphs']) 38 | return {'hierarchical': output} 39 | 40 | 41 | def annot2frames(annot, time_r, type='horizontal', depth=3): 42 | """Transforms annot time into a discrete formart wrt a time_resolution. 43 | 44 | This function can be use with the whole annotation or with a subset. 45 | For example, it can be called with a particular paragraph in the horizontal 46 | format [annot[paragraph_i]] or line [annot[paragraph_i]['text'][line_i]]. 47 | 48 | Parameters 49 | ---------- 50 | annot : list 51 | annotations vector (annotations['annot']) in any the formats. 52 | time_r : float 53 | time resolution for discriticing the time. 54 | type : str 55 | annotation format: horizontal or vertical. 56 | depth : int 57 | depth of the horizontal level. 58 | """ 59 | output = [] 60 | tmp = copy.deepcopy(annot) 61 | try: 62 | if type == 'horizontal': 63 | output = ut.sample(tmp, time_r) 64 | elif type == 'vertical': 65 | vertical = [ut.sample(ut.unroll(tmp, [], depth=depth)[0], time_r) 66 | for i in range(depth+1)][::-1] 67 | for i in range(len(vertical[:-1])): 68 | if i == 0: 69 | output = roll(vertical[i], vertical[i+1]) 70 | else: 71 | output = roll(output, vertical[i+1]) 72 | except Exception as e: 73 | print('ERROR: unknow type of annotations') 74 | return output 75 | 76 | 77 | def annot2vector(annot, duration, time_r, type='voice'): 78 | """Transforms the annotations into frame vector wrt a time resolution. 79 | 80 | Parameters 81 | ---------- 82 | annot : list 83 | annotations only horizontal level 84 | (for example: annotations['annot']['lines']) 85 | dur : float 86 | duration of the vector (for adding zeros). 87 | time_r : float 88 | time resolution for discriticing the time. 
89 |     type : str
90 |         'voice': each frame has a value 1 or 0 for voice or no voice.
91 |         'melody': each frame has the freq value of the main vocal melody.
92 |     """
93 |     signal = np.zeros(int(duration / time_r))
94 |     for note in annot:
95 |         b, e = note['time']
96 |         b = np.round(b/time_r).astype(int)
97 |         e = np.round(e/time_r).astype(int)
98 |         if type == 'voice':
99 |             signal[b:e+1] = 1
100 |         if type == 'melody':
101 |             signal[b:e+1] = np.mean(note['freq'])
102 |     return signal
103 | 
104 | 
105 | def annot2vector_chopping(annot, dur, time_r, win_bin, hop_bin, type='voice'):
106 |     """
107 |     Transforms the annotations into a frame vector by:
108 | 
109 |     1 - creating a vector signal for a given sample rate
110 |     2 - chopping it using the given hop and window sizes.
111 | 
112 |     Parameters
113 |     ----------
114 |     annot : list
115 |         annotations only horizontal level
116 |         (for example: annotations['annot']['lines'])
117 |     dur : float
118 |         duration of the vector (for adding zeros).
119 |     time_r : float
120 |         sample rate for discretizing annots.
121 |     win_bin : int
122 |         window size in bins for sampling the vector.
123 |     hop_bin : int
124 |         hop size in bins for sampling the vector.
125 |     type : str
126 |         'voice': each frame has a value 1 or 0 for voice or no voice.
127 |         'melody': each frame has the freq value of the main vocal melody.
128 |     """
129 |     output = []
130 |     try:
131 |         signal = annot2vector(annot, dur, time_r, type)
132 |         win = np.hanning(win_bin)
133 |         win_sum = np.sum(win)
134 |         v = hop_bin*np.arange(int((len(signal)-win_bin)/hop_bin+1))
135 |         output = np.array([np.sum(win[::-1]*signal[i:i+win_bin])/win_sum
136 |                            for i in v]).T
137 |     except Exception as e:
138 |         print('ERROR: unknown type of annotations')
139 |     return output
140 | 
141 | 
142 | def get_audio(dali_info, path_output, skip=[], keep=[]):
143 |     """Get the audio for the dali dataset.
144 | 145 | It can download the whole dataset or only a subset of the dataset 146 | by providing either the ids to skip or the ids that to load. 147 | 148 | Parameters 149 | ---------- 150 | dali_info : list 151 | where elements are ['DALI_ID', 'NAME', 'YOUTUBE', 'WORKING'] 152 | path_output : str 153 | full path for storing the audio 154 | skip : list 155 | list with the ids to be skipped. 156 | keep : list 157 | list with the ids to be keeped. 158 | """ 159 | errors = [] 160 | if len(keep) > 0: 161 | for i in dali_info[1:]: 162 | if i[0] in keep: 163 | audio_from_url(i[-2], i[0], path_output, errors) 164 | else: 165 | for i in dali_info[1:]: 166 | if i[0] not in skip: 167 | audio_from_url(i[-2], i[0], path_output, errors) 168 | return errors 169 | -------------------------------------------------------------------------------- /code/DALI/files/template.xml: -------------------------------------------------------------------------------- 1 | 2 | Audio_path 3 | 4 | freetype 5 | 6 | 7 | 8 | 9 | 10 | 11 | -------------------------------------------------------------------------------- /code/DALI/main.py: -------------------------------------------------------------------------------- 1 | """LOADING DALI DATASET FUNCTIONS: 2 | 3 | Functions for loading the Dali dataset. 4 | 5 | GABRIEL MESEGUER-BROCAL 2018 6 | """ 7 | from . import utilities as ut 8 | # from .Annotations import Annotations 9 | 10 | # ------------------------ READING INFO ------------------------ 11 | 12 | 13 | def generator_files_skip(file_names, skip=[]): 14 | """Generator with all the file to read. 15 | It skips the files in the skip ids list""" 16 | for fl in file_names: 17 | if fl.split('/')[-1].rstrip('.gz') not in skip: 18 | yield ut.read_gzip(fl) 19 | 20 | 21 | def generator_files_in(file_names, keep=[]): 22 | """Generator with all the file to read. 
23 | It reads only the files in the keep ids list""" 24 | for fl in file_names: 25 | if fl.split('/')[-1].rstrip('.gz') in keep: 26 | yield ut.read_gzip(fl, print_error=True) 27 | 28 | 29 | def generator_folder(folder_pth, skip=[], keep=[]): 30 | """Create the final Generator with all the files.""" 31 | if len(keep) > 0: 32 | return generator_files_in(ut.get_files_path(folder_pth, 33 | print_error=True), keep) 34 | else: 35 | return generator_files_skip(ut.get_files_path(folder_pth, 36 | print_error=True), skip) 37 | 38 | 39 | def change_time(entry, new_offset=None, new_fr=None): 40 | type = 'horizontal' 41 | fr = entry.annotations['annot_param']['fr'] 42 | offset = entry.annotations['annot_param']['offset'] 43 | if new_fr is None: 44 | new_fr = fr 45 | if new_offset is None: 46 | new_offset = offset 47 | args = (fr, offset, new_fr, new_offset) 48 | entry.annotations['annot_param']['fr'] = new_fr 49 | entry.annotations['annot_param']['offset'] = new_offset 50 | if entry.annotations['type'] == 'vertical': 51 | type = 'vertical' 52 | entry.vertical2horizontal() 53 | for key, value in entry.annotations['annot'].items(): 54 | value = ut.compute_new_time(value, *args) 55 | if type == 'vertical': 56 | entry.horizontal2vertical() 57 | return 58 | 59 | 60 | def update_with_ground_truth(dali, gt_file): 61 | gt = [] 62 | if ut.check_file(gt_file, print_error=False): 63 | gt = load_ground_truth(gt_file) 64 | if len(gt) > 0: 65 | for i in gt: 66 | entry = dali[i] 67 | change_time(entry, gt[i]['offset'], gt[i]['fr']) 68 | entry.info['ground-truth'] = True 69 | return dali 70 | 71 | 72 | def get_the_DALI_dataset(pth, gt_file='', skip=[], keep=[]): 73 | """Load the whole DALI dataset. 
It can load only a subset of the dataset 74 | by providing either the ids to skip or the ids to load.""" 75 | args = (pth, skip, keep) 76 | dali = {song.info['id']: song for song in generator_folder(*args)} 77 | dali = update_with_ground_truth(dali, gt_file) 78 | return dali 79 | 80 | 81 | def get_an_entry(fl_pth): 82 | """Retrieve a particular entry and return it as a class.""" 83 | return ut.read_gzip(fl_pth) 84 | 85 | 86 | def get_info(dali_info_file): 87 | """Read the DALI INFO file with ['DALI_ID', 'YOUTUBE_ID', 'WORKING'] 88 | """ 89 | return ut.read_gzip(dali_info_file, print_error=True) 90 | 91 | 92 | def load_ground_truth(dali_gt_file): 93 | """Read the ground_truth file 94 | """ 95 | return ut.read_gzip(dali_gt_file, print_error=True) 96 | 97 | # ------------------------ BASIC OPERATIONS ------------------------ 98 | 99 | 100 | def update_audio_working_from_info(info, dali_dataset): 101 | """Update the working label for each class using an info file""" 102 | for i in info[1:]: 103 | dali_dataset[i[0]].info['audio']['working'] = i[-1] 104 | return dali_dataset 105 | 106 | 107 | def ids_to_title_artist(dali_dataset): 108 | """Transform the unique DALI ids into [id, artist, title] rows.""" 109 | output = [[value.info['id'], value.info['artist'], value.info['title']] 110 | for key, value in dali_dataset.items()] 111 | output.insert(0, ['id', 'artist', 'title']) 112 | return output 113 | -------------------------------------------------------------------------------- /code/DALI/utilities.py: -------------------------------------------------------------------------------- 1 | """ Utility functions for dealing with the DALI dataset. 2 | 3 | It includes all the needed functions for reading files and the helpers for 4 | transforming the annotation information.
5 | 6 | GABRIEL MESEGUER-BROCAL 2018 7 | """ 8 | import copy 9 | import glob 10 | import gzip 11 | import json 12 | import numpy as np 13 | import os 14 | import pickle 15 | 16 | # ------------------------ READING INFO ---------------- 17 | 18 | 19 | def get_files_path(pth, ext='*.gz', print_error=False): 20 | """Get all the files with an extension for a particular path.""" 21 | return list_files_from_folder(pth, extension=ext, print_error=print_error) 22 | 23 | 24 | def check_absolute_path(directory, print_error=True): 25 | """Check if a directory has an absolute path or not.""" 26 | output = False 27 | if os.path.isabs(directory): 28 | output = True 29 | elif print_error: 30 | print("ERROR: Please use an absolute path") 31 | return output 32 | 33 | 34 | def check_directory(directory, print_error=True): 35 | """Return True if a directory exists and False if not.""" 36 | output = False 37 | if check_absolute_path(directory, print_error): 38 | if os.path.isdir(directory) and os.path.exists(directory): 39 | output = True 40 | elif print_error: 41 | print("ERROR: not a valid directory " + directory) 42 | return output 43 | 44 | 45 | def check_file(fl, print_error=True): 46 | """Return True if a file exists and False if not.""" 47 | output = False 48 | if check_absolute_path(fl, print_error): 49 | if os.path.isfile(fl) and os.path.exists(fl): 50 | output = True 51 | elif print_error: 52 | print("ERROR: not a valid file " + fl) 53 | return output 54 | 55 | 56 | def create_directory(directory, print_error=False): 57 | """Create a folder.""" 58 | if not check_directory(directory, print_error) and \ 59 | check_absolute_path(directory): 60 | os.makedirs(directory) 61 | print("Creating a folder at " + directory) 62 | return directory 63 | 64 | 65 | def list_files_from_folder(directory, extension, print_error=True): 66 | """Return all the files with a specific extension for a given folder.""" 67 | files = [] 68 | if check_absolute_path(directory, print_error): 69 | if '*' not in
extension[0]: 70 | extension = '*' + extension 71 | if check_directory(directory, print_error): 72 | files = glob.glob(os.path.join(directory, extension)) 73 | if not files and print_error: 74 | print("ERROR: no files with extension " + extension[1:]) 75 | return sorted(files) 76 | 77 | 78 | def write_in_gzip(pth, name, data, print_error=False): 79 | """Write data in a gzip file""" 80 | if check_directory(pth, print_error): 81 | save_name = os.path.join(pth, name) 82 | try: 83 | gz = gzip.open(save_name + '.gz', 'wb') 84 | except Exception as e: 85 | gz = gzip.open(save_name + '.gz', 'w') 86 | gz.write(pickle.dumps(data, protocol=2)) 87 | gz.close() 88 | return 89 | 90 | 91 | def write_in_json(pth, name, data, print_error=False): 92 | """Write data in a json file""" 93 | if check_directory(pth, print_error): 94 | save_name = os.path.join(pth, name) 95 | try: 96 | with open(save_name + '.json', 'wb') as outfile: 97 | json.dump(data, outfile) 98 | except Exception as e: 99 | with open(save_name + '.json', 'w') as outfile: 100 | json.dump(data, outfile) 101 | return 102 | 103 | 104 | def read_gzip(fl, print_error=False): 105 | """Read gzip file""" 106 | output = None 107 | if check_file(fl, print_error): 108 | try: 109 | with gzip.open(fl, 'rb') as f: 110 | output = pickle.load(f) 111 | except Exception as e: 112 | with gzip.open(fl, 'r') as f: 113 | output = pickle.load(f) 114 | return output 115 | 116 | 117 | def read_json(fl, print_error=False): 118 | """Read json file""" 119 | output = None 120 | if check_file(fl, print_error): 121 | try: 122 | with open(fl, 'rb') as outfile: 123 | output = json.load(outfile) 124 | except Exception as e: 125 | with open(fl, 'r') as outfile: 126 | output = json.load(outfile) 127 | return output 128 | 129 | 130 | def check_structure(ref, struct): 131 | output = False 132 | if isinstance(ref, dict) and isinstance(struct, dict): 133 | # ref is a dict of types or other dicts 134 | output = all(k in struct and check_structure(ref[k],
struct[k]) 135 | for k in ref) 136 | else: 137 | # ref is the type of struct 138 | output = isinstance(struct, type(ref)) 139 | return output 140 | 141 | # -------------- CHANGING ANNOTATIONS -------------- 142 | 143 | 144 | def beat2time(beat, **args): 145 | bps = None 146 | offset = 0 147 | beat = float(beat) 148 | if 'bps' in args: 149 | bps = float(args['bps']) 150 | if 'fr' in args: 151 | bps = float(args['fr']) / 60. 152 | if 'offset' in args: 153 | offset = args['offset'] 154 | return beat/bps + offset 155 | 156 | 157 | def time2beat(time, **args): 158 | bps = None 159 | offset = 0 160 | if 'bps' in args: 161 | bps = float(args['bps']) 162 | if 'fr' in args: 163 | bps = float(args['fr']) / 60. 164 | if 'offset' in args: 165 | offset = args['offset'] 166 | return np.round((time - offset)*bps).astype(int) 167 | 168 | 169 | def change_time(time, old_param, new_param): 170 | beat = time2beat(time, offset=old_param['offset'], fr=old_param['fr']) 171 | new_time = beat2time(beat, offset=new_param['offset'], fr=new_param['fr']) 172 | return new_time 173 | 174 | 175 | def change_time_tuple(time, old_param, new_param): 176 | return tuple(change_time(t, old_param, new_param) for t in time) 177 | 178 | 179 | def compute_new_time(lst, old_fr, old_offset, new_fr, new_offset): 180 | old_param = {'fr': old_fr, 'offset': old_offset} 181 | new_param = {'fr': new_fr, 'offset': new_offset} 182 | for e in lst: 183 | e['time'] = change_time_tuple(e['time'], old_param, new_param) 184 | return lst 185 | 186 | # -------------- TRANSFORMING THE INFO IN THE ANNOTATIONS -------------- 187 | 188 | 189 | def roll(lower, upper): 190 | """It merges a horizontal level with its upper.
For example melody with 191 | words or lines with paragraphs 192 | """ 193 | tmp = copy.deepcopy(lower) 194 | for info in tmp: 195 | i = info['index'] 196 | if not isinstance(upper[i]['text'], list): 197 | upper[i]['text'] = [] 198 | info.pop('index', None) 199 | upper[i]['text'].append(info) 200 | return upper 201 | 202 | 203 | def get_text(text, output=[], m=False): 204 | """Recursive function that gathers the text needed by the unroll function. 205 | """ 206 | f = lambda x: isinstance(x, unicode) or isinstance(x, str) 207 | try: 208 | f('whatever') 209 | except Exception as e: 210 | f = lambda x: isinstance(x, str) 211 | if isinstance(text, list): 212 | tmp = [get_text(i['text'], output) for i in text] 213 | if f(tmp[0]): 214 | output.append(''.join(tmp)) 215 | elif f(text): 216 | output = text 217 | return output 218 | 219 | 220 | def unroll(annot, output=[], depth=0, index=0, b=False): 221 | """Recursive function that transforms an annotation vector in the 222 | vertical format into a horizontal vector: 223 | - annot (list): annotations vector in vertical format. 224 | - output (list): internally used in the recursion, final output. 225 | - depth (int): how deep the recursion is going to go. 226 | Example: 227 | 1 - input list of paragraph (annot) 228 | depth = 0 -> output = horizontal level for paragraphs, 229 | depth = 1 -> output = lines, depth = 2 -> output = words 230 | depth = 3 -> output = melody 231 | 2 - input a list of lines (annot[paragraph_i]['text'][line_i]) 232 | depth = 0 -> output = lines, depth = 1 -> words, 233 | depth = 2 -> melody, 234 | depth = 3 -> ERROR: uncontrolled behaviour 235 | - b (bool) = controls whether the horizontal level is going to have an index 236 | or not. The paragraph level does not need it.
237 | """ 238 | if depth == 0: 239 | """Bottom level to be merged""" 240 | for i in annot: 241 | text = get_text(i['text'], output=[]) 242 | if isinstance(text, list): 243 | text = " ".join(text) 244 | output.append({'time': i['time'], 'freq': i['freq'], 'text': text}) 245 | if b: 246 | output[-1]['index'] = index 247 | index += 1 248 | else: 249 | """Go deeper""" 250 | depth -= 1 251 | b = True 252 | for l in annot: 253 | output, index = unroll(l['text'], output, depth, index, b) 254 | return output, index 255 | 256 | 257 | def sample(annot, time_r): 258 | """Transform the normal time into a discrete value with respect to time_r. 259 | """ 260 | output = copy.deepcopy(annot) 261 | for a in output: 262 | a['time'] = (np.round(a['time'][0]/time_r).astype(int), 263 | np.round(a['time'][1]/time_r).astype(int)) 264 | return output 265 | -------------------------------------------------------------------------------- /code/DALI/vizualization.py: -------------------------------------------------------------------------------- 1 | from DALI.Annotations import Annotations 2 | import os 3 | from .
import utilities as ut 4 | import xml.etree.ElementTree as ET 5 | 6 | 7 | sculpt_segment_tag = '{http://www.ircam.fr/musicdescription/1.1}segment' 8 | sculpt_freetype_tag = '{http://www.ircam.fr/musicdescription/1.1}freetype' 9 | sculpt_media_tag = '{http://www.ircam.fr/musicdescription/1.1}media' 10 | ET.register_namespace('', "http://www.ircam.fr/musicdescription/1.1") 11 | pth = os.path.dirname(os.path.abspath(__file__)) 12 | xml_template = os.path.join(pth, 'files/template.xml') 13 | 14 | # ----------------------------------XML------------------------------- 15 | 16 | 17 | def addsemgnet(parent, text, attrib, tag=sculpt_segment_tag, 18 | sub_tag=sculpt_freetype_tag): 19 | 20 | element = parent.makeelement(tag, attrib) 21 | sub = ET.SubElement(element, sub_tag) 22 | sub.attrib['value'] = text 23 | sub.attrib['id'] = '1' 24 | parent.append(element) 25 | # element.text = text 26 | return 27 | 28 | 29 | def create_xml_attrib(annot): 30 | point = {'startFreq': '', 'endFreq': '', 'length': '', 'sourcetrack': '0', 31 | 'time': ''} 32 | offset = 0 33 | if annot['freq'][0] == annot['freq'][1]: 34 | offset = 20 35 | 36 | point['time'] = str(annot['time'][0]) 37 | point['length'] = str(annot['time'][1] - annot['time'][0]) 38 | point['startFreq'] = str(annot['freq'][0] - offset) 39 | point['endFreq'] = str(annot['freq'][1] + offset) 40 | return point 41 | 42 | 43 | def write_annot_xml(annot, name, path_save, xml_template=xml_template): 44 | path_save = ut.create_directory(path_save) 45 | tree = ET.parse(xml_template) 46 | root = tree.getroot() 47 | media = root.findall(sculpt_media_tag)[0] 48 | media.text = name 49 | 50 | # print ET.tostring(root) 51 | # segments = root.findall(sculpt_segment_tag) 52 | 53 | for point in annot: 54 | addsemgnet(root, point['text'], attrib=create_xml_attrib(point)) 55 | 56 | """ 57 | for line in segmented_lyrics.lines: 58 | addsemgnet(root, line['text'], attrib=create_xml_attrib(line)) 59 | for word in segmented_lyrics.words: 60 | 
addsemgnet(root, word['text'], attrib=create_xml_attrib(word)) 61 | """ 62 | 63 | segment = root.findall(sculpt_segment_tag) 64 | root.remove(segment[0]) 65 | # tree.write(name.replace("wav", "xml")) 66 | tree.write(os.path.join(path_save, name + ".xml")) 67 | return 68 | 69 | 70 | # ------------------------------TXT------------------------------- 71 | 72 | 73 | def write_annot_txt(annot, name, path_save): 74 | path_save = ut.create_directory(path_save) 75 | with open(os.path.join(path_save, name + ".txt"), 'w') as f: 76 | for item in annot: 77 | f.write("%f\t" % item['time'][0]) 78 | f.write("%f\t" % item['time'][1]) 79 | f.write("%s\n" % item['text']) 80 | return 81 | -------------------------------------------------------------------------------- /code/LICENSE: -------------------------------------------------------------------------------- 1 | Academic Free License ("AFL") v. 3.0 2 | This Academic Free License (the "License") applies to any original work of authorship (the "Original Work") whose owner (the "Licensor") has placed the following licensing notice adjacent to the copyright notice for the Original Work: 3 | 4 | Licensed under the Academic Free License version 3.0 5 | 6 | 1) Grant of Copyright License. 
Licensor grants You a worldwide, royalty-free, non-exclusive, sublicensable license, for the duration of the copyright, to do the following: 7 | 8 | a) to reproduce the Original Work in copies, either alone or as part of a collective work; 9 | 10 | b) to translate, adapt, alter, transform, modify, or arrange the Original Work, thereby creating derivative works ("Derivative Works") based upon the Original Work; 11 | 12 | c) to distribute or communicate copies of the Original Work and Derivative Works to the public, under any license of your choice that does not contradict the terms and conditions, including Licensor's reserved rights and remedies, in this Academic Free License; 13 | 14 | d) to perform the Original Work publicly; and 15 | 16 | e) to display the Original Work publicly. 17 | 18 | 2) Grant of Patent License. Licensor grants You a worldwide, royalty-free, non-exclusive, sublicensable license, under patent claims owned or controlled by the Licensor that are embodied in the Original Work as furnished by the Licensor, for the duration of the patents, to make, use, sell, offer for sale, have made, and import the Original Work and Derivative Works. 19 | 20 | 3) Grant of Source Code License. The term "Source Code" means the preferred form of the Original Work for making modifications to it and all available documentation describing how to modify the Original Work. Licensor agrees to provide a machine-readable copy of the Source Code of the Original Work along with each copy of the Original Work that Licensor distributes. Licensor reserves the right to satisfy this obligation by placing a machine-readable copy of the Source Code in an information repository reasonably calculated to permit inexpensive and convenient access by You for as long as Licensor continues to distribute the Original Work. 21 | 22 | 4) Exclusions From License Grant. 
Neither the names of Licensor, nor the names of any contributors to the Original Work, nor any of their trademarks or service marks, may be used to endorse or promote products derived from this Original Work without express prior permission of the Licensor. Except as expressly stated herein, nothing in this License grants any license to Licensor's trademarks, copyrights, patents, trade secrets or any other intellectual property. No patent license is granted to make, use, sell, offer for sale, have made, or import embodiments of any patent claims other than the licensed claims defined in Section 2. No license is granted to the trademarks of Licensor even if such marks are included in the Original Work. Nothing in this License shall be interpreted to prohibit Licensor from licensing under terms different from this License any Original Work that Licensor otherwise would have a right to license. 23 | 24 | 5) External Deployment. The term "External Deployment" means the use, distribution, or communication of the Original Work or Derivative Works in any way such that the Original Work or Derivative Works may be used by anyone other than You, whether those works are distributed or communicated to those persons or made available as an application intended for use over a network. As an express condition for the grants of license hereunder, You must treat any External Deployment by You of the Original Work or a Derivative Work as a distribution under section 1(c). 25 | 26 | 6) Attribution Rights. You must retain, in the Source Code of any Derivative Works that You create, all copyright, patent, or trademark notices from the Source Code of the Original Work, as well as any notices of licensing and any descriptive text identified therein as an "Attribution Notice." You must cause the Source Code for any Derivative Works that You create to carry a prominent Attribution Notice reasonably calculated to inform recipients that You have modified the Original Work. 
27 | 28 | 7) Warranty of Provenance and Disclaimer of Warranty. Licensor warrants that the copyright in and to the Original Work and the patent rights granted herein by Licensor are owned by the Licensor or are sublicensed to You under the terms of this License with the permission of the contributor(s) of those copyrights and patent rights. Except as expressly stated in the immediately preceding sentence, the Original Work is provided under this License on an "AS IS" BASIS and WITHOUT WARRANTY, either express or implied, including, without limitation, the warranties of non-infringement, merchantability or fitness for a particular purpose. THE ENTIRE RISK AS TO THE QUALITY OF THE ORIGINAL WORK IS WITH YOU. This DISCLAIMER OF WARRANTY constitutes an essential part of this License. No license to the Original Work is granted by this License except under this disclaimer. 29 | 30 | 8) Limitation of Liability. Under no circumstances and under no legal theory, whether in tort (including negligence), contract, or otherwise, shall the Licensor be liable to anyone for any indirect, special, incidental, or consequential damages of any character arising as a result of this License or the use of the Original Work including, without limitation, damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses. This limitation of liability shall not apply to the extent applicable law prohibits such limitation. 31 | 32 | 9) Acceptance and Termination. If, at any time, You expressly assented to this License, that assent indicates your clear and irrevocable acceptance of this License and all of its terms and conditions. If You distribute or communicate copies of the Original Work or a Derivative Work, You must make a reasonable effort under the circumstances to obtain the express assent of recipients to the terms of this License. 
This License conditions your rights to undertake the activities listed in Section 1, including your right to create Derivative Works based upon the Original Work, and doing so without honoring these terms and conditions is prohibited by copyright law and international treaty. Nothing in this License is intended to affect copyright exceptions and limitations (including "fair use" or "fair dealing"). This License shall terminate immediately and You may no longer exercise any of the rights granted to You by this License upon your failure to honor the conditions in Section 1(c). 33 | 34 | 10) Termination for Patent Action. This License shall terminate automatically and You may no longer exercise any of the rights granted to You by this License as of the date You commence an action, including a cross-claim or counterclaim, against Licensor or any licensee alleging that the Original Work infringes a patent. This termination provision shall not apply for an action alleging patent infringement by combinations of the Original Work with other software or hardware. 35 | 36 | 11) Jurisdiction, Venue and Governing Law. Any action or suit relating to this License may be brought only in the courts of a jurisdiction wherein the Licensor resides or in which Licensor conducts its primary business, and under the laws of that jurisdiction excluding its conflict-of-law provisions. The application of the United Nations Convention on Contracts for the International Sale of Goods is expressly excluded. Any use of the Original Work outside the scope of this License or after its termination shall be subject to the requirements and penalties of copyright or patent law in the appropriate jurisdiction. This section shall survive the termination of this License. 37 | 38 | 12) Attorneys' Fees. 
In any action to enforce the terms of this License or seeking damages relating thereto, the prevailing party shall be entitled to recover its costs and expenses, including, without limitation, reasonable attorneys' fees and costs incurred in connection with such action, including any appeal of such action. This section shall survive the termination of this License. 39 | 40 | 13) Miscellaneous. If any provision of this License is held to be unenforceable, such provision shall be reformed only to the extent necessary to make it enforceable. 41 | 42 | 14) Definition of "You" in This License. "You" throughout this License, whether in upper or lower case, means an individual or a legal entity exercising rights under, and complying with all of the terms of, this License. For legal entities, "You" includes any entity that controls, is controlled by, or is under common control with you. For purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. 43 | 44 | 15) Right to Use. You may use the Original Work in all ways not otherwise restricted or conditioned by this License or by law, and Licensor promises not to interfere with or be responsible for such uses by You. 45 | 46 | 16) Modification of This License. This License is Copyright © 2005 Lawrence Rosen. Permission is granted to copy, distribute, or communicate this License without modification. Nothing in this License permits You to modify this License as applied to the Original Work or to Derivative Works. 
However, You may modify the text of this License and copy, distribute or communicate your modified version (the "Modified License") and apply it to other original works of authorship subject to the following conditions: (i) You may not indicate in any way that your Modified License is the "Academic Free License" or "AFL" and you may not use those names in the name of your Modified License; (ii) You must replace the notice specified in the first paragraph above with the notice "Licensed under " or with a notice of your own that is not confusingly similar to the notice in this License; and (iii) You may not claim that your original works are open source software unless your Modified License has been approved by Open Source Initiative (OSI) and You comply with its license review and certification process. 47 | -------------------------------------------------------------------------------- /code/MANIFEST.in: -------------------------------------------------------------------------------- 1 | include DALI/files/template.xml 2 | -------------------------------------------------------------------------------- /code/README.md: -------------------------------------------------------------------------------- 1 | # DALI-DATASET 2 | 3 | Code for working with the DALI dataset. 4 | You can download the dataset at [zenodo](https://zenodo.org/record/2577915) and find a tutorial and more information on how to use this code at [github/gabolsgabs](https://github.com/gabolsgabs/DALI). 5 | 6 | Copyright © Ircam 2018 7 | This package is distributed under the Academic Free License ("AFL") v. 3.0.
8 | For more information about the license read the LICENSE.txt 9 | -------------------------------------------------------------------------------- /code/setup.py: -------------------------------------------------------------------------------- 1 | from setuptools import setup 2 | 3 | with open("README.md", "r") as fh: 4 | long_description = fh.read() 5 | 6 | setup(name='DALI-dataset', 7 | version='1.0.0', 8 | description='Code for working with the DALI dataset', 9 | url='http://github.com/gabolsgabs/DALI', 10 | author='Gabriel Meseguer Brocal', 11 | author_email='gabriel.meseguer.brocal@ircam.fr', 12 | license='afl-3.0', 13 | long_description=long_description, 14 | long_description_content_type="text/markdown", 15 | # https://help.github.com/articles/licensing-a-repository/#disclaimer 16 | packages=['DALI'], 17 | include_package_data=True, 18 | install_requires=['youtube_dl',], 19 | zip_safe=False) 20 | -------------------------------------------------------------------------------- /docs/images/Example.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/Example.png -------------------------------------------------------------------------------- /docs/images/graphs.key: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/graphs.key -------------------------------------------------------------------------------- /docs/images/horizontal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/horizontal.png -------------------------------------------------------------------------------- /docs/images/l1.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/l1.png -------------------------------------------------------------------------------- /docs/images/p1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/p1.png -------------------------------------------------------------------------------- /docs/images/vertical.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/vertical.png -------------------------------------------------------------------------------- /docs/images/w1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/docs/images/w1.png -------------------------------------------------------------------------------- /versions/README.md: -------------------------------------------------------------------------------- 1 | Here you can find the different dali_data versions. 2 | Dali_data has its own version control. 3 | There are always two numbers: version a.b. 4 | 5 | * The first number (**a**) refers to the singing voice detection system generation used for retrieving the audio and finding the global alignment. 6 | 7 | * The second number (**b**) refers to improvements for solving local alignment or note problems. 8 | 9 | The definition of each version follows the standard presented at: 10 | [Geoffroy Peeters, Karën Fort. Towards a (better) Definition of Annotated MIR Corpora. International Society for Music Information Retrieval Conference (ISMIR), Oct 2012, Porto, Portugal.
2012.](https://hal.archives-ouvertes.fr/hal-00713074) 11 | 12 | 13 | 14 | ### Version 1.0. 15 | 16 | * **Download** it [here](https://zenodo.org/record/2577915) 17 | * The ground-truth in [here](https://github.com/gabolsgabs/DALI/blob/master/versions/v1/) --> update 12/11/2018 18 | * The [definition](https://github.com/gabolsgabs/DALI/blob/master/versions/v1/v1.0.md) 19 | * Here's the [paper](http://ismir2018.ircam.fr/doc/pdfs/35_Paper.pdf) 20 | * and the citation [bibtex](https://github.com/gabolsgabs/DALI/blob/master/citations/DALI_v1.0.bib): 21 | 22 | @inproceedings{Meseguer-Brocal_2018, 23 | Author = {Meseguer-Brocal, Gabriel and Cohen-Hadria, Alice and Peeters, Geoffroy}, 24 | Booktitle = {19th International Society for Music Information Retrieval Conference}, 25 | Editor = {ISMIR}, 26 | Month = {September}, 27 | Title = {DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm.}, 28 | Year = {2018}} 29 | 30 | 31 | ### Version 2.0. 32 | 33 | ##### Soon 34 | -------------------------------------------------------------------------------- /versions/v1/gt_v1.0_22:11:18.gz: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gabolsgabs/DALI/0bfeec1008b32a8294ff2f49141f943758a476cb/versions/v1/gt_v1.0_22:11:18.gz -------------------------------------------------------------------------------- /versions/v1/v1.0.md: -------------------------------------------------------------------------------- 1 | _____ 2 | **-C1-** Corpus ID: corpus:MIR:DALI:Vocal:2018:version1.0
3 | _____ 4 | **-A- Raw Corpus**
5 | **(A1) Definition:** (a13) real items. **5358** songs, each with -- its full-duration audio, -- its time-aligned lyrics and -- its time-aligned notes of the vocal melody. Popularity-oriented, defined by karaoke user demand.
6 | **(A2) Type of media diffusion:** isolated music tracks.
7 | _____ 8 | **-B- Annotations**
9 | **(B1) Origin:** (b15) Traditional manual human annotations.
10 | **(B21) Concepts definition:** note and text annotations of the vocal melody, made by non-expert users for playing karaoke games. Annotations are automatically aligned to the audio tracks. Four different granularity levels are constructed: syllables, words, lines and paragraphs.
11 | **(B22) Annotation rules:** unknown.
12 | **(B31) Annotators:** unknown; non-expert users from the open-source karaoke community.
13 | **(B32) Validation/reliability:** not yet proven.
14 | **(B4) Annotation tools:** UltraStar Song Editor + the automatic alignment of [Meseguer-Brocal_2018].
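The granularity levels above are stored as nested ("vertical") annotations that the companion code can flatten into per-level ("horizontal") lists. A simplified, self-contained sketch of that unroll idea — the nested data below is invented for illustration and is not a real DALI entry, and this toy `unroll` keeps only `time` and `text`:

```python
# Simplified sketch of the vertical -> horizontal "unroll" idea from
# code/DALI/utilities.py: lines contain words (and paragraphs contain
# lines); flattening at a chosen depth yields one flat list per level.

def unroll(annot, depth):
    """Flatten nested annotations: depth=0 keeps the current level
    (joining child texts), depth=1 returns its children, and so on."""
    if depth == 0:
        out = []
        for a in annot:
            text = a['text']
            if not isinstance(text, str):  # join the child texts
                text = ' '.join(w['text'] for w in text)
            out.append({'time': a['time'], 'text': text})
        return out
    return [item for a in annot for item in unroll(a['text'], depth - 1)]

# One line containing two word-level annotations (invented data):
lines = [{'time': (0.0, 2.0),
          'text': [{'time': (0.0, 1.0), 'text': 'hello'},
                   {'time': (1.0, 2.0), 'text': 'world'}]}]

line_level = unroll(lines, 0)   # one entry with text 'hello world'
word_level = unroll(lines, 1)   # the two word annotations
```

The real implementation also carries `freq` and an `index` field (so `roll` can nest levels back together); the sketch omits those for brevity.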
15 | _____ 16 | **-C- Documents and Storing**
17 | **(C1) Audio identifier and storage:** unique URL identifiers to YouTube videos are provided; annotations are distributed under the AFL license as gz files, accessible through this GitHub repository.
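The gz files mentioned in (C1) are gzip-compressed pickles, mirroring `write_in_gzip`/`read_gzip` in `code/DALI/utilities.py`. A minimal round-trip sketch — the entry dict here is an invented stand-in, not a real DALI annotation:

```python
import gzip
import os
import pickle
import tempfile

# Stand-in entry (hypothetical fields, not a real DALI object).
entry = {'id': 'fake_dali_id', 'annot_param': {'fr': 120.0, 'offset': 0.0}}

path = os.path.join(tempfile.mkdtemp(), 'fake_entry.gz')
with gzip.open(path, 'wb') as gz:
    gz.write(pickle.dumps(entry, protocol=2))  # protocol 2, as in write_in_gzip

with gzip.open(path, 'rb') as f:               # as in read_gzip
    loaded = pickle.load(f)
```

Protocol 2 keeps the files loadable from both Python 2 and Python 3, which is why the repo's readers also carry a binary/text fallback.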
18 | 19 | 20 | Version 1.0 is the result of the methodology described in: ***[Meseguer-Brocal_2018]*** [G. Meseguer-Brocal, A. Cohen-Hadria and G. Peeters. DALI: a large Dataset of synchronized Audio, LyrIcs and notes, automatically created using teacher-student machine learning paradigm. In ISMIR Paris, France, 2018.](http://ismir2018.ircam.fr/doc/pdfs/35_Paper.pdf) 21 | 22 | NOTES: 23 | - **Singing Voice detection system** = student trained with the dataset produced by the teacher J+M. 24 | --------------------------------------------------------------------------------
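When a ground-truth file (e.g. `gt_v1.0_22:11:18.gz` above) supplies a corrected `(fr, offset)` pair for a song, every annotation time is re-expressed through the beat domain, mirroring `time2beat`/`beat2time`/`change_time` in `code/DALI/utilities.py`. A self-contained sketch of that math, with numpy's rounding replaced by the stdlib and the example parameter values invented:

```python
# Minimal sketch of DALI's (fr, offset) time re-mapping: seconds ->
# beat index -> seconds under the corrected parameters.

def time2beat(time, fr, offset=0.0):
    """Time (seconds) to beat index, given a frame rate fr (beats
    per minute) and a global offset (seconds)."""
    bps = fr / 60.0
    return int(round((time - offset) * bps))

def beat2time(beat, fr, offset=0.0):
    """Inverse mapping: beat index back to seconds."""
    bps = fr / 60.0
    return beat / bps + offset

def change_time(time, old_param, new_param):
    """Re-express a time under new (fr, offset) parameters, as done
    when applying a ground-truth global alignment correction."""
    beat = time2beat(time, **old_param)
    return beat2time(beat, **new_param)

# A word annotated at t=10.5 s under fr=120, offset=0.0 moves when the
# alignment is corrected to fr=121, offset=0.2 (values invented):
t_new = change_time(10.5, {'fr': 120, 'offset': 0.0},
                    {'fr': 121, 'offset': 0.2})
```

Going through the integer beat index is what lets a single `(fr, offset)` pair shift and stretch all of a song's annotations consistently.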