├── docs ├── r1_DC_393.png ├── r1_losses_DC.png ├── r1_chinese_003.png ├── r1_expansion_DD.png └── README.md ├── examples ├── DC_393.wav ├── chinese_003.wav ├── DC_393.PointProcess ├── DC_393.TextGrid ├── chinese_003.PointProcess └── chinese_003.TextGrid ├── sfc ├── __init__.py ├── sfc_dsp.py ├── sfc_learn.py ├── sfc_params.py ├── sfc_corpus.py └── sfc_plot.py ├── .gitignore ├── sfc.py └── LICENSE /docs/r1_DC_393.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gerazov/PySFC/HEAD/docs/r1_DC_393.png -------------------------------------------------------------------------------- /examples/DC_393.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gerazov/PySFC/HEAD/examples/DC_393.wav -------------------------------------------------------------------------------- /docs/r1_losses_DC.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gerazov/PySFC/HEAD/docs/r1_losses_DC.png -------------------------------------------------------------------------------- /docs/r1_chinese_003.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gerazov/PySFC/HEAD/docs/r1_chinese_003.png -------------------------------------------------------------------------------- /docs/r1_expansion_DD.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gerazov/PySFC/HEAD/docs/r1_expansion_DD.png -------------------------------------------------------------------------------- /examples/chinese_003.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/gerazov/PySFC/HEAD/examples/chinese_003.wav -------------------------------------------------------------------------------- /sfc/__init__.py: 
-------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | PySFC - this folder holds all the modules needed by sfc.py 5 | 6 | @authors: 7 | Branislav Gerazov Nov 2017 8 | 9 | Copyright 2017 by GIPSA-lab, Grenoble INP, Grenoble, France. 10 | 11 | See the file LICENSE for the licence associated with this software. 12 | """ 13 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # Byte-compiled / optimized / DLL files 2 | __pycache__/ 3 | *.py[cod] 4 | *$py.class 5 | 6 | # C extensions 7 | *.so 8 | 9 | # Distribution / packaging 10 | .Python 11 | env/ 12 | build/ 13 | develop-eggs/ 14 | dist/ 15 | downloads/ 16 | eggs/ 17 | .eggs/ 18 | lib/ 19 | lib64/ 20 | parts/ 21 | sdist/ 22 | var/ 23 | wheels/ 24 | *.egg-info/ 25 | .installed.cfg 26 | *.egg 27 | 28 | # PyInstaller 29 | # Usually these files are written by a python script from a template 30 | # before PyInstaller builds the exe, so as to inject date/other infos into it. 
31 | *.manifest 32 | *.spec 33 | 34 | # Installer logs 35 | pip-log.txt 36 | pip-delete-this-directory.txt 37 | 38 | # Unit test / coverage reports 39 | htmlcov/ 40 | .tox/ 41 | .coverage 42 | .coverage.* 43 | .cache 44 | nosetests.xml 45 | coverage.xml 46 | *.cover 47 | .hypothesis/ 48 | 49 | # Translations 50 | *.mo 51 | *.pot 52 | 53 | # Django stuff: 54 | *.log 55 | local_settings.py 56 | 57 | # Flask stuff: 58 | instance/ 59 | .webassets-cache 60 | 61 | # Scrapy stuff: 62 | .scrapy 63 | 64 | # Sphinx documentation 65 | docs/_build/ 66 | 67 | # PyBuilder 68 | target/ 69 | 70 | # Jupyter Notebook 71 | .ipynb_checkpoints 72 | 73 | # pyenv 74 | .python-version 75 | 76 | # celery beat schedule file 77 | celerybeat-schedule 78 | 79 | # SageMath parsed files 80 | *.sage.py 81 | 82 | # dotenv 83 | .env 84 | 85 | # virtualenv 86 | .venv 87 | venv/ 88 | ENV/ 89 | 90 | # Spyder project settings 91 | .spyderproject 92 | .spyproject 93 | 94 | # Rope project settings 95 | .ropeproject 96 | 97 | # mkdocs documentation 98 | /site 99 | 100 | # mypy 101 | .mypy_cache/ 102 | -------------------------------------------------------------------------------- /sfc/sfc_dsp.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | PySFC - functions for signal processing. 5 | 6 | @authors: 7 | Branislav Gerazov Nov 2017 8 | 9 | Copyright 2017 by GIPSA-lab, Grenoble INP, Grenoble, France. 10 | 11 | See the file LICENSE for the licence associated with this software. 12 | """ 13 | import numpy as np 14 | from matplotlib import pyplot as plt 15 | from scipy.interpolate import interp1d 16 | from scipy import signal as sig 17 | 18 | def f0_smooth(pitch_ts, f0s, plot=False): 19 | ''' 20 | Smooth the f0. 21 | 22 | Parameters 23 | ========== 24 | pitch_ts : ndarray 25 | Pitch marks timepoints. 26 | f0s : ndarray 27 | f0s at those timepoints. 28 | plot : bool 29 | Plot smoothing results. 
30 | ''' 31 | fs = 200 # resampling rate in Hz 32 | t = np.arange(0, pitch_ts[-1]+.01, 1/fs) 33 | interfunc = interp1d(pitch_ts, f0s, kind='linear', bounds_error=False, fill_value=0) 34 | f0s_t = interfunc(t) 35 | fl = 30 # low-pass cutoff in Hz 36 | order = 8 # number of FIR taps 37 | 38 | # b_iir, a_iir = sig.iirfilter(order, np.array(fl/(fs/2)), btype='lowpass', ftype='butter') 39 | b_fir = sig.firwin(order, fl, window='hamming', pass_zero=True, fs=fs) # fs replaces the nyq argument, which newer SciPy no longer accepts 40 | # f0s_t_lp = sig.lfilter(b_iir, a_iir, f0s_t) 41 | # f0s_t_lp = sig.filtfilt(b_iir, a_iir, f0s_t) 42 | f0s_t_lp = sig.filtfilt(b_fir, [1], f0s_t) 43 | 44 | # now find the points you need: 45 | interfunc_back = interp1d(t, f0s_t_lp, kind='linear', bounds_error=True, fill_value=None) 46 | f0s_smooth = interfunc_back(pitch_ts) 47 | 48 | if plot: 49 | plt.figure() 50 | plt.subplot(2,1,1) 51 | # w, h_spec = sig.freqz(b_iir, a_iir) 52 | w, h_spec = sig.freqz(b_fir, 1) 53 | plt.plot(w/np.pi*fs/2, 54 | 20*np.log10(np.abs(h_spec))) 55 | plt.xlabel('Frequency [Hz]') 56 | plt.ylabel('Amplitude [dB]') 57 | plt.grid(True) 58 | plt.subplot(2,1,2) 59 | plt.plot(t, f0s_t) 60 | plt.plot(t, f0s_t_lp) 61 | plt.xlabel('Time [s]') 62 | plt.ylabel('Frequency [Hz]') 63 | plt.grid(True) 64 | 65 | return f0s_smooth 66 | 67 | -------------------------------------------------------------------------------- /examples/DC_393.PointProcess: -------------------------------------------------------------------------------- 1 | File type = "ooTextFile short" 2 | "PointProcess" 3 | 4 | 0 5 | 2.626 6 | 257 7 | 0.1163 8 | 0.1236 9 | 0.1317 10 | 0.1405 11 | 0.1494 12 | 0.1583 13 | 0.1672 14 | 0.1762 15 | 0.1851 16 | 0.1944 17 | 0.2036 18 | 0.213 19 | 0.2226 20 | 0.2322 21 | 0.2417 22 | 0.2511 23 | 0.2605 24 | 0.2697 25 | 0.2786 26 | 0.2858 27 | 0.2926 28 | 0.2999 29 | 0.3067 30 | 0.3134 31 | 0.3199 32 | 0.3262 33 | 0.3324 34 | 0.3383 35 | 0.3442 36 | 0.35 37 | 0.3557 38 | 0.3615 39 | 0.3673 40 | 0.3732 41 | 0.3797 42 | 0.386 43 | 0.3923 44 | 0.3988 45 | 0.4056 46 | 0.4126 47 | 0.4197 48 | 0.427 49 | 0.4345 
50 | 0.4421 51 | 0.4498 52 | 0.4574 53 | 0.4648 54 | 0.4723 55 | 0.4795 56 | 0.4866 57 | 0.4935 58 | 0.5002 59 | 0.5066 60 | 0.5129 61 | 0.5189 62 | 0.5246 63 | 0.5302 64 | 0.5356 65 | 0.5409 66 | 0.5461 67 | 0.5511 68 | 0.5561 69 | 0.5609 70 | 0.5657 71 | 0.5704 72 | 0.5751 73 | 0.5797 74 | 0.5842 75 | 0.5887 76 | 0.5932 77 | 0.5977 78 | 0.6023 79 | 0.6074 80 | 0.6126 81 | 0.6175 82 | 0.6226 83 | 0.6277 84 | 0.7192 85 | 0.725 86 | 0.7319 87 | 0.7389 88 | 0.7465 89 | 0.7542 90 | 0.762 91 | 0.7701 92 | 0.7788 93 | 0.7882 94 | 0.798 95 | 0.808 96 | 0.8176 97 | 0.8267 98 | 0.836 99 | 0.8452 100 | 0.8547 101 | 0.8645 102 | 0.8744 103 | 0.8846 104 | 0.8951 105 | 0.906 106 | 0.9163 107 | 1.0159 108 | 1.0209 109 | 1.0258 110 | 1.0311 111 | 1.0363 112 | 1.0415 113 | 1.0467 114 | 1.0517 115 | 1.0566 116 | 1.0615 117 | 1.0663 118 | 1.0711 119 | 1.0762 120 | 1.0814 121 | 1.0868 122 | 1.092 123 | 1.098 124 | 1.2036 125 | 1.209 126 | 1.2134 127 | 1.2184 128 | 1.2235 129 | 1.2287 130 | 1.234 131 | 1.2394 132 | 1.2452 133 | 1.2512 134 | 1.2571 135 | 1.2631 136 | 1.2691 137 | 1.2751 138 | 1.2811 139 | 1.2872 140 | 1.2932 141 | 1.2992 142 | 1.3052 143 | 1.3111 144 | 1.317 145 | 1.323 146 | 1.329 147 | 1.335 148 | 1.3412 149 | 1.3474 150 | 1.3537 151 | 1.3603 152 | 1.3675 153 | 1.3752 154 | 1.3828 155 | 1.4789 156 | 1.4863 157 | 1.4917 158 | 1.4981 159 | 1.5043 160 | 1.5106 161 | 1.5168 162 | 1.523 163 | 1.529 164 | 1.535 165 | 1.5408 166 | 1.5465 167 | 1.5522 168 | 1.5577 169 | 1.5632 170 | 1.5686 171 | 1.5739 172 | 1.5791 173 | 1.5842 174 | 1.5892 175 | 1.5942 176 | 1.5992 177 | 1.6042 178 | 1.6095 179 | 1.6146 180 | 1.6198 181 | 1.6251 182 | 1.6305 183 | 1.6362 184 | 1.6422 185 | 1.6487 186 | 1.6554 187 | 1.6629 188 | 1.6709 189 | 1.6799 190 | 1.6893 191 | 1.6991 192 | 1.7093 193 | 1.7199 194 | 1.7313 195 | 1.8255 196 | 1.8419 197 | 1.8468 198 | 1.8524 199 | 1.858 200 | 1.8636 201 | 1.8693 202 | 1.8748 203 | 1.8805 204 | 1.886 205 | 1.8915 206 | 1.8975 207 | 1.903 208 | 1.9085 
209 | 1.914 210 | 1.9193 211 | 1.9245 212 | 1.9297 213 | 1.935 214 | 1.9403 215 | 1.9456 216 | 1.9509 217 | 1.9563 218 | 1.9617 219 | 1.967 220 | 1.9723 221 | 1.9778 222 | 1.9833 223 | 1.9888 224 | 1.9943 225 | 1.9999 226 | 2.0055 227 | 2.0112 228 | 2.017 229 | 2.0231 230 | 2.0289 231 | 2.0349 232 | 2.041 233 | 2.0473 234 | 2.0536 235 | 2.0599 236 | 2.0662 237 | 2.0728 238 | 2.0794 239 | 2.0861 240 | 2.0929 241 | 2.0997 242 | 2.1067 243 | 2.1137 244 | 2.1209 245 | 2.1282 246 | 2.1357 247 | 2.1434 248 | 2.1514 249 | 2.1599 250 | 2.1705 251 | 2.2942 252 | 2.3014 253 | 2.3123 254 | 2.3237 255 | 2.3355 256 | 2.3477 257 | 2.3601 258 | 2.373 259 | 2.386 260 | 2.399 261 | 2.4117 262 | 2.4234 263 | 2.4359 264 | -------------------------------------------------------------------------------- /docs/README.md: -------------------------------------------------------------------------------- 1 | # PySFC 2 | Python implementation of the SFC intonation model. 3 | 4 | 5 | ## The SFC model 6 | 7 | The Superposition of Functional Contours (SFC) model is a prosody model that is based on the decomposition of prosodic contours into functionally relevant elementary contours [1]. It proposes a generative mechanism for encoding socio-communicative functions, such as syntactic structure and attitudes, through the use of prosody. 8 | The SFC has been successfully used to model different linguistic levels, including: attitudes, dependency relations of word groups, word focus, tones in Mandarin, etc. It has been used for a number of languages including: French, Galician, German and Chinese. Recently, the SFC model has been extended into the visual prosody domain through modelling facial expressions and head and gaze motion. 9 | 10 | The SFC model is based on neural network contour generators (NNCGs) each responsible for encoding one linguistic function on a given scope. The prosody contour is then obtained by overlapping and adding these elementary contours. 
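The overlap-and-add step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not PySFC's own code: the function name `superpose`, the `(start_unit, values)` representation of an elementary contour, and the toy numbers are all made up for the example. In PySFC itself the summation is carried out over rows of the corpus data frame.

```python
def superpose(n_units, n_coeffs, contours):
    """Overlap-add elementary contours into a single prosodic contour.

    contours is a list of (start_unit, values) pairs (a hypothetical layout),
    where values holds one row of n_coeffs coefficients per unit in the scope.
    """
    total = [[0.0] * n_coeffs for _ in range(n_units)]
    for start, values in contours:
        for offset, row in enumerate(values):
            for k, coeff in enumerate(row):
                # each contour only contributes inside its own scope
                total[start + offset][k] += coeff
    return total


# Toy data: a phrase-level contour spanning 5 units and a local contour
# spanning units 1 and 2; each unit has one pitch and one duration coefficient.
phrase = [[1.0, 0.5] for _ in range(5)]
local = [[2.0, 0.0], [-1.0, 0.25]]

total = superpose(5, 2, [(0, phrase), (1, local)])
for unit, row in enumerate(total):
    print(unit, row)
```

Units outside the scope of the local contour keep the phrase-level values unchanged, which is what lets each contour generator be responsible for a single function on its own scope.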
11 | NNCG training is done using an analysis-by-synthesis loop that, at each iteration, distributes the reconstruction error among the contributing NNCGs and then retrains them with standard backpropagation. 12 | Four syllable position ramps are used by the NNCGs to generate pitch and duration coefficients for each syllable. 13 | 14 | 15 | [1] Bailly, Gérard, and Bleicke Holm. "SFC: a trainable prosodic model." Speech Communication 46, no. 3 (2005): 348-364. 16 | 17 | ## PySFC 18 | 19 | PySFC is a Python implementation of the SFC model that was created with two goals: *i*) to make the SFC more accessible to the scientific community, and *ii*) to serve as a foundation for future improvements of the prosody model. 20 | PySFC also implements the minimum set of tools necessary to make the system self-contained and fully functional. 21 | 22 | Python was chosen as the implementation language because of its powerful scientific computing environment, which is completely based on free software. PySFC is built on [NumPy](http://www.numpy.org/) within the [SciPy](https://www.scipy.org/) ecosystem. The neural networks and their training rely on the Multi Layer Perceptron (MLP) regressor in the [scikit-learn](http://scikit-learn.org/stable/index.html) module. 23 | Great attention was paid to code readability, itself a feature of good Python, supported by detailed function docstrings and comments. The code is segmented into [Spyder](https://pythonhosted.org/spyder/) cells for rapid prototyping. Finally, the whole implementation has been licensed as [free software](http://fsf.org/) under the [GNU General Public License v3](http://www.gnu.org/licenses/). 24 | 25 | ### PySFC Modules 26 | 27 | PySFC comprises the following modules: 28 | * `sfc.py` - main module that controls the application of the SFC model to a chosen dataset. 
29 | * `sfc_params.py` - parameter setting module that includes: 30 | * data related parameters - type of input data, phrase level functional contours, local functional contours, 31 | * SFC hyperparameters - number of points to be sampled from the pitch, number of iterations for analysis-by-synthesis, as well as NNCG parameters, 32 | * SFC execution parameters - use of preprocessed data, use of trained models, plotting. 33 | * `sfc_corpus.py` - holds all the functions that are used to consolidate and work with the corpus of data that is fed directly to, and output by, the SFC model. The corpus is a [Pandas](http://pandas.pydata.org/) data frame object, which allows easy data access and analysis. 34 | * `sfc_data.py` - comprises functions that read the input data files and calculate the `f_0` and duration coefficients, 35 | * `sfc_learn.py` - holds the SFC training function `analysis_by_synthesis()` and the function for NNCG initialisation, 36 | * `sfc_dsp.py` - holds DSP functions for smoothing the pitch contour based on SciPy, 37 | * `sfc_plot.py` - holds the plotting functions based on [matplotlib](http://matplotlib.org/) and [seaborn](http://seaborn.pydata.org/). 38 | 39 | Currently, PySFC supports the proprietary SFC `fpro` file format as well as standard Praat `TextGrid` annotations. Pitch is calculated based on Praat `PointProcess` pitch mark files, but integration of state-of-the-art pitch extractors is planned for the future. 40 | PySFC also adds the ability to adjust the number of samples taken from the pitch contour at the vowel nucleus of each rhythmical unit, and offers extended plotting capabilities for data and performance analysis. 41 | 42 | ### PySFC Example Plots 43 | 44 | Here are a few example plots made with PySFC to showcase what it can do. The plotted files are included as examples in the `examples/` directory. 
45 | 46 | ![alt text](r1_DC_393.png) 47 | **Figure 1.** Example PySFC intonation decomposition for the French utterance: *Son bagou pourrait faciliter la communauté.* into constituent functional contours: declaration (DC), dependency to the left/right (DG/DD), and cliticisation (XX, DV). 48 | 49 | ![alt text](r1_chinese_003.png) 50 | **Figure 2.** Example PySFC intonation decomposition for the Chinese utterance: *Tā men céng zài jī cāng nèi géi lǔ kè diǎn gē hè shēng rì, 51 | céng ná zhē shuí guǒ nái fěn qù tàn wàng yóu tā men zhuǎn sòng qù yī yuàn de lǔ kè chǎn fù.* into constituent functional contours: declaration (DC), tones (C0-4), word boundaries (WB), and independence (ID). 52 | 53 | ![alt text](r1_expansion_DD.png) 54 | **Figure 3.** Example PySFC expansion in left and right context for the dependency to the right (DD) functional contour. The numbers next to the plots show the number of occurrences of that scope in the data. 55 | 56 | ![alt text](r1_losses_DC.png) 57 | 58 | **Figure 4.** Example PySFC plots of `f_0` reconstruction losses for all NNCGs for attitude DC per iteration for French. 59 | 60 | 61 | -------------------------------------------------------------------------------- /sfc/sfc_learn.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | Learning utils used for the SFC. 5 | 6 | Created on Fri Oct 20 15:38:03 2017 7 | 8 | @author: gerazovb 9 | """ 10 | #import logging 11 | from sklearn.neural_network import MLPRegressor 12 | import logging 13 | import numpy as np 14 | 15 | #%% analysis-by-synthesis 16 | def analysis_by_synthesis(corpus, mask_all_files, mask_file_dict, mask_contours, 17 | n_units_dict, mask_unit_dict, contour_keys, 18 | contour_generators, params): 19 | ''' 20 | Runs SFC analysis by synthesis. 21 | 22 | Parameters 23 | =========== 24 | 25 | corpus : pandas data frame 26 | Holds all data from corpus. 
27 | orig_columns : list of str 28 | The original prosodic parameters f0, f1, f2 and coeff_dur. 29 | target_columns : list of str 30 | The targets for training the contour generators. 31 | iterations : int 32 | Number of iterations of analysis-by-synthesis loop. 33 | mask_all_files : dataseries bool 34 | Mask for all good files in the corpus DataFrame. 35 | mask_file_dict : dict 36 | Dictionary with masks for each of the files in corpus. 37 | mask_contours : dict 38 | Dictionary with masks for each of the contours. 39 | n_units_dict : dict 40 | Dictionary with the number of units in each file in corpus. 41 | mask_unit_dict : dict 42 | Dictionary of masks for each unit number in corpus. 43 | contour_keys : list 44 | Types of functions covered by the contour generators. 45 | contour_generators : dict 46 | Dictionary of contour generators. 47 | ''' 48 | 49 | log = logging.getLogger('an-by-syn') 50 | orig_columns = params.orig_columns 51 | target_columns = params.target_columns 52 | iterations = params.iterations 53 | #%% set initial targets 54 | log.info('='*42) 55 | log.info('Setting initial targets....') 56 | for file, mask_file in mask_file_dict.items(): 57 | n_units = n_units_dict[file] 58 | for n_unit in range(n_units+1): 59 | mask_unit = mask_unit_dict[n_unit] 60 | mask_row = mask_file & mask_unit 61 | n_contours = corpus[mask_row].shape[0] # coeff to divide the error 62 | targets = corpus.loc[mask_row, orig_columns].values/n_contours 63 | corpus.loc[mask_row, target_columns] = targets 64 | 65 | #%% now do the training iterations updating the targets 66 | losses = {key : np.empty(iterations) for key in contour_keys} 67 | for i in range(iterations): 68 | log.info('='*42) 69 | log.info('Analysis-by-synthesis iteration {} ...'.format(i)) 70 | log.info('='*42) 71 | pred_columns = [column + '_it{:03}'.format(i) for column in orig_columns] 72 | 73 | for contour_type in contour_keys: 74 | log.info('Training for contour type : {}'.format(contour_type)) 75 | 
contour_generator = contour_generators[contour_type] 76 | mask_row = mask_all_files & mask_contours[contour_type] 77 | X = corpus.loc[mask_row,'ramp1':'ramp4'] 78 | y_target = corpus.loc[mask_row, target_columns] 79 | contour_generator.fit(X, y_target) 80 | losses[contour_type][i] = contour_generator.loss_ 81 | log.info('mean squared error : {}'.format(contour_generator.loss_)) 82 | y_pred = contour_generator.predict(X) 83 | corpus.loc[mask_row, pred_columns] = y_pred 84 | 85 | #% now sum the predictions for each unit, calculate the error and new targets 86 | log.info('Summing predictions, calculating the error and new targets ...') 87 | for file, mask_file in mask_file_dict.items(): 88 | n_units = n_units_dict[file] 89 | for n_unit in range(n_units+1): 90 | mask_unit = mask_unit_dict[n_unit] 91 | mask_row = mask_file & mask_unit 92 | n_contours = corpus[mask_row].shape[0] # coeff to divide the error 93 | y_pred = corpus.loc[mask_row, pred_columns].values 94 | y_pred_sum = np.sum(y_pred, axis=0) 95 | y_orig = corpus.loc[mask_row, orig_columns].values # every row should be the same 96 | y_error = y_orig - y_pred_sum # the summed row is broadcast against every row of y_orig 97 | targets = y_pred + y_error/n_contours 98 | corpus.loc[mask_row, target_columns] = targets # write new targets 99 | 100 | return corpus, contour_generators, losses 101 | 102 | def construct_contour_generator(params): 103 | ''' 104 | Construct Neural Network based contour generator. 105 | 106 | Parameters 107 | ========== 108 | learn_rate : float 109 | Learning rate of NN optimizer. 110 | max_iter : int 111 | Maximum number of training iterations. 112 | l2 : float 113 | L2 regularisation strength. 114 | hidden_units : int 115 | Number of units in the hidden layer. 
116 | 117 | ''' 118 | learn_rate = params.learn_rate 119 | max_iter = params.max_iter 120 | l2 = params.l2 121 | hidden_units = params.hidden_units 122 | contour_generator = MLPRegressor(hidden_layer_sizes=(hidden_units, ), # 15 in 2004 paper but Config says 17 123 | activation='logistic', # relu default, logistic in snns 124 | batch_size='auto', # auto batch_size=min(200, n_samples) 125 | max_iter=max_iter, # is this 50 in the original?? 126 | alpha=l2, # L2 penalty 1e-4 default - config says should be 0.1?? 127 | shuffle=True, # Whether to shuffle samples in each iteration. Only solver=’sgd’ or ‘adam’. 128 | random_state=42, 129 | verbose=False, 130 | warm_start=True, # When set to True, reuse the solution of the 131 | # previous call to fit as initialization, otherwise, 132 | # just erase the previous solution. 133 | early_stopping=False, validation_fraction=0.01, 134 | # Whether to use early stopping to terminate training when 135 | # validation score is not improving. If set to true, it will 136 | # automatically set aside 10% of training data as validation 137 | # and terminate training when validation score is not improving 138 | # by at least tol for two consecutive epochs. Only solver=’sgd’ or ‘adam’ 139 | solver='adam', # adam is newer, I don't think you can use rprop 140 | learning_rate_init=learn_rate, # default 0.001, in Config it's 0.1 141 | beta_1=0.9, beta_2=0.999, epsilon=1e-08, 142 | # learning_rate='constant', # Only used when solver='sgd' 143 | ## 'constant' is a constant learning rate given by ‘learning_rate_init’. 144 | ## 'invscaling' gradually decreases the learning rate learning_rate_ at each 145 | ## time step 't' using an inverse scaling exponent of 'power_t'. 146 | ## effective_learning_rate = learning_rate_init / pow(t, power_t) 147 | ## 'adaptive' keeps the learning rate constant to 'learning_rate_init' as long 148 | ## as training loss keeps decreasing. 
Each time two consecutive epochs fail 149 | ## to decrease training loss by at least tol, or fail to increase validation 150 | ## score by at least tol if 'early_stopping' is on, the current learning rate 151 | ## is divided by 5. 152 | power_t=0.5, tol=0.0001, 153 | momentum=0.9, # Momentum for gradient descent update. Only solver=’sgd’. 154 | nesterovs_momentum=True) # Only used when solver=’sgd’ 155 | return contour_generator -------------------------------------------------------------------------------- /sfc.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | PySFC - Python implementation of the Superposition of Functional Contours (SFC) 5 | prosody model [1]. 6 | 7 | [1] Bailly, Gérard, and Bleicke Holm. "SFC: a trainable prosodic model." 8 | Speech communication 46, no. 3 (2005): 348-364. 9 | 10 | @authors: 11 | Branislav Gerazov Oct 2017 12 | 13 | Copyright 2017 by GIPSA-lab, Grenoble INP, Grenoble, France. 14 | 15 | See the file LICENSE for the licence associated with this software. 
16 | """ 17 | import pandas as pd 18 | from sfc import sfc_params, sfc_corpus, sfc_learn, sfc_plot 19 | import pickle 20 | from datetime import datetime 21 | import logging 22 | import os 23 | import shutil 24 | 25 | start_time = datetime.now() # start stopwatch 26 | 27 | #%% logger setup 28 | logging.basicConfig(filename='sfc.log', filemode='w', 29 | format='%(asctime)s %(name)-12s: %(levelname)-8s: %(message)s', 30 | datefmt='%H:%M:%S', 31 | level=logging.INFO) 32 | 33 | # define a Handler which writes INFO messages or higher to the sys.stderr 34 | console = logging.StreamHandler() 35 | console.setLevel(logging.INFO) 36 | # set a format which is simpler for console use 37 | formatter = logging.Formatter('%(name)-12s: %(levelname)-8s: %(message)s') 38 | console.setFormatter(formatter) # tell the handler to use this format 39 | logging.getLogger('').addHandler(console) # add the handler to the root logger 40 | 41 | #%% init params 42 | params = sfc_params.Params() 43 | 44 | #%% mkdirs 45 | if os.path.isdir(params.save_path): # delete them 46 | if params.remove_folders: 47 | shutil.rmtree(params.save_path, ignore_errors=False) 48 | os.mkdir(params.save_path) 49 | for phrase_type in params.phrase_types: 50 | os.mkdir(params.save_path+'/'+phrase_type+'_f0') 51 | os.mkdir(params.save_path+'/'+phrase_type+'_dur') 52 | os.mkdir(params.save_path+'/'+phrase_type+'_exp') 53 | else: 54 | os.mkdir(params.save_path) 55 | for phrase_type in params.phrase_types: 56 | os.mkdir(params.save_path+'/'+phrase_type+'_f0') 57 | os.mkdir(params.save_path+'/'+phrase_type+'_dur') 58 | os.mkdir(params.save_path+'/'+phrase_type+'_exp') 59 | 60 | #%% load processed corpus or redo 61 | if not params.load_processed_corpus: 62 | #% read or rebuild corpus 63 | if params.load_corpus: 64 | logging.info('Loading corpus ...') 65 | with open(params.pkl_path + params.corpus_name + '.pkl', 'rb') as f: 66 | data = pickle.load(f) 67 | fpro_stats, corpus, utterances, phone_set, phone_cnts = data 68 | 
f0_ref, isochrony_clock, isochrony_gravity, disol, stats = fpro_stats 69 | 70 | else: # rebuild corpus 71 | logging.info('Rebuilding corpus ...') 72 | fpro_stats, corpus, utterances, phone_set, phone_cnts = \ 73 | sfc_corpus.build_corpus(params) 74 | 75 | f0_ref, isochrony_clock, isochrony_gravity, disol, stats = fpro_stats 76 | corpus = sfc_corpus.downcast_corpus(corpus, params.columns) 77 | with open(params.pkl_path + params.corpus_name + '.pkl', 'wb') as f: 78 | fpro_stats = f0_ref, isochrony_clock, isochrony_gravity, disol, stats # to avoid bad headers 79 | data = (fpro_stats, corpus, utterances, phone_set, phone_cnts) 80 | pickle.dump(data, f, -1) # last version 81 | 82 | #%% do analysis-by-synthesis 83 | # fix and add columns to the corpus 84 | corpus = sfc_corpus.expand_corpus(corpus,params) 85 | 86 | #%% get the scope counts 87 | dict_scope_counts ={} 88 | for phrase_type in params.phrase_types: 89 | # this doesn't take care of good phrases!!! 90 | dict_scope_counts[phrase_type] = sfc_corpus.contour_scope_count(corpus, 91 | phrase_type, 92 | max_scope=40) 93 | 94 | #%% phrase loop 95 | # init dictionaries per phrase type 96 | dict_contour_generators = {} 97 | dict_losses = {} 98 | dict_files = {} 99 | for phrase_type in params.phrase_types: 100 | logging.info('='*42) 101 | logging.info('='*42) 102 | logging.info('Analysis-by-synthesis for phrase type {} from {} ...'.format( 103 | phrase_type, params.phrase_types)) 104 | 105 | # init contour generators 106 | logging.info('Initialising contour generators and masks ...') 107 | contour_generators = {} 108 | contour_keys = [phrase_type] 109 | for function_type in params.function_types: 110 | # if dict_scope_counts[phrase_type][function_type].sum() > 0: 111 | if function_type in dict_scope_counts[phrase_type].keys(): 112 | contour_keys.append(function_type) 113 | 114 | for contour_type in contour_keys: 115 | contour_generators[contour_type] = \ 116 | sfc_learn.construct_contour_generator(params) 117 | # save them 
in dictionary 118 | dict_contour_generators[phrase_type] = contour_generators 119 | 120 | # create masks 121 | files, mask_all_files, mask_file_dict, \ 122 | mask_contours, n_units_dict, mask_unit_dict = \ 123 | sfc_corpus.create_masks(corpus, phrase_type, contour_keys, params) 124 | dict_files[phrase_type] = files 125 | 126 | corpus, dict_contour_generators[phrase_type], dict_losses[phrase_type] = \ 127 | sfc_learn.analysis_by_synthesis(corpus, mask_all_files, mask_file_dict, mask_contours, 128 | n_units_dict, mask_unit_dict, contour_keys, 129 | contour_generators, params) 130 | 131 | ## save results 132 | # drop rows with missing values (files that are not good) 133 | corpus = corpus[corpus.notnull().all(axis=1)] 134 | 135 | # downcast 136 | corpus.loc[:, 'f01':] = corpus.loc[:, 'f01':].apply(pd.to_numeric, downcast='float') 137 | 138 | with open(params.pkl_path + params.processed_corpus_name+'.pkl', 'wb') as f: 139 | data = (corpus, fpro_stats, utterances, dict_files, dict_contour_generators, 140 | dict_losses, dict_scope_counts) 141 | pickle.dump(data, f, -1) # highest pickle protocol 142 | else: 143 | #%% if load processed data 144 | with open(params.pkl_path + params.processed_corpus_name+'.pkl', 'rb') as f: 145 | data = pickle.load(f) 146 | corpus, fpro_stats, utterances, dict_files, dict_contour_generators, \ 147 | dict_losses, dict_scope_counts = data 148 | 149 | #%% make a DataFrame from utterances 150 | db_utterances = pd.DataFrame(data=list(utterances.values()), index=utterances.keys(), columns=['utterance']) 151 | db_utterances["length"] = db_utterances.utterance.apply(lambda x: len(x.split())) 152 | 153 | #%% get colors 154 | colors = sfc_plot.init_colors(params) 155 | 156 | #%% plot last iteration for every file 157 | if params.plot_contours: 158 | logging.info('='*42) 159 | logging.info('='*42) 160 | logging.info('Plotting final iterations ...') 161 | for phrase_type, files in dict_files.items(): 162 | for file in files: 163 | # #%% plot one file 164 | ## file = 'DC_393.fpro' 165 | # file = 
'yanpin_000003.TextGrid' 166 | logging.info('Plotting f0 and dur for file {} ...'.format(file)) 167 | sfc_plot.plot_contours(params.save_path+'/'+phrase_type+'_f0/', file, utterances, 168 | corpus, colors, params, plot_contour='f0', show_plot=True) 169 | 170 | sfc_plot.plot_contours(params.save_path+'/'+phrase_type+'_dur/', file, utterances, 171 | corpus, colors, params, plot_contour='dur', show_plot=False) 172 | 173 | #%% plot losses 174 | logging.info('='*42) 175 | logging.info('Plotting losses ...') 176 | for phrase_type, losses in dict_losses.items(): 177 | # #%% plot single phrase type 178 | # phrase_type = 'DC' 179 | # losses = dict_losses[phrase_type] 180 | sfc_plot.plot_losses(params.save_path, phrase_type, losses, show_plot=True) 181 | 182 | #%% plot expansion 183 | logging.info('Plotting expansions ...') 184 | 185 | for phrase_type, contour_generators in dict_contour_generators.items(): 186 | # #%% plot single phrase type 187 | # phrase_type = 'DC' 188 | # contour_generators = dict_contour_generators[phrase_type] 189 | scope_counts = dict_scope_counts[phrase_type] 190 | sfc_plot.plot_expansion(params.save_path+'/'+phrase_type+'_exp/', contour_generators, 191 | colors, scope_counts, phrase_type, params, show_plot=False) 192 | 193 | #%% final losses 194 | logging.info('Plotting final losses ...') 195 | sfc_plot.plot_final_losses(dict_losses, params, show_plot=False) 196 | 197 | 198 | #%% now copy figures in new folder sorted by loss 199 | if params.copy_worst: 200 | ## 100 worst files 201 | logging.info('Copying 100 worst files ...') 202 | sfc_plot.plot_worst(corpus, params, n_files=100) 203 | # 204 | ## all DC files 205 | logging.info('Copying all DC from worst to best ...') 206 | sfc_plot.plot_worst(corpus, params, phrase='DC') 207 | 208 | #%% wrap up 209 | end_time = datetime.now() 210 | dif_time = end_time - start_time 211 | logging.info('='*42) 212 | logging.info('Finished in {}'.format(dif_time)) 213 | 214 | 215 | 
-------------------------------------------------------------------------------- /examples/DC_393.TextGrid: -------------------------------------------------------------------------------- 1 | File type = "ooTextFile" 2 | Object class = "TextGrid" 3 | 4 | xmin = 0 5 | xmax = 2.53 6 | tiers? 7 | size = 7 8 | item []: 9 | item [1]: 10 | class = "IntervalTier" 11 | name = "PHON" 12 | xmin = 0 13 | xmax = 2.53 14 | intervals: size = 30 15 | intervals [1]: 16 | xmin = 0 17 | xmax = 0.015 18 | text = "_" 19 | intervals [2]: 20 | xmin = 0.015 21 | xmax = 0.11 22 | text = "s" 23 | intervals [3]: 24 | xmin = 0.11 25 | xmax = 0.2067 26 | text = "o~" 27 | intervals [4]: 28 | xmin = 0.2067 29 | xmax = 0.2831 30 | text = "b" 31 | intervals [5]: 32 | xmin = 0.2831 33 | xmax = 0.39 34 | text = "a" 35 | intervals [6]: 36 | xmin = 0.39 37 | xmax = 0.4675 38 | text = "g" 39 | intervals [7]: 40 | xmin = 0.4675 41 | xmax = 0.61 42 | text = "u" 43 | intervals [8]: 44 | xmin = 0.61 45 | xmax = 0.71 46 | text = "p" 47 | intervals [9]: 48 | xmin = 0.71 49 | xmax = 0.7824 50 | text = "u" 51 | intervals [10]: 52 | xmin = 0.7824 53 | xmax = 0.83 54 | text = "r" 55 | intervals [11]: 56 | xmin = 0.83 57 | xmax = 0.91 58 | text = "e^" 59 | intervals [12]: 60 | xmin = 0.91 61 | xmax = 1.0178 62 | text = "f" 63 | intervals [13]: 64 | xmin = 1.0178 65 | xmax = 1.08 66 | text = "a" 67 | intervals [14]: 68 | xmin = 1.08 69 | xmax = 1.19 70 | text = "s" 71 | intervals [15]: 72 | xmin = 1.19 73 | xmax = 1.2531 74 | text = "i" 75 | intervals [16]: 76 | xmin = 1.2531 77 | xmax = 1.2881 78 | text = "l" 79 | intervals [17]: 80 | xmin = 1.2881 81 | xmax = 1.365 82 | text = "i" 83 | intervals [18]: 84 | xmin = 1.365 85 | xmax = 1.48 86 | text = "t" 87 | intervals [19]: 88 | xmin = 1.48 89 | xmax = 1.6 90 | text = "e" 91 | intervals [20]: 92 | xmin = 1.6 93 | xmax = 1.6507 94 | text = "l" 95 | intervals [21]: 96 | xmin = 1.6507 97 | xmax = 1.725 98 | text = "a" 99 | intervals [22]: 100 | xmin = 1.725 101 | 
xmax = 1.835 102 | text = "k" 103 | intervals [23]: 104 | xmin = 1.835 105 | xmax = 1.89 106 | text = "o^" 107 | intervals [24]: 108 | xmin = 1.89 109 | xmax = 1.9656 110 | text = "m" 111 | intervals [25]: 112 | xmin = 1.9656 113 | xmax = 2.0228 114 | text = "y" 115 | intervals [26]: 116 | xmin = 2.0228 117 | xmax = 2.0769 118 | text = "n" 119 | intervals [27]: 120 | xmin = 2.0769 121 | xmax = 2.1564 122 | text = "o" 123 | intervals [28]: 124 | xmin = 2.1564 125 | xmax = 2.29 126 | text = "t" 127 | intervals [29]: 128 | xmin = 2.29 129 | xmax = 2.414 130 | text = "e" 131 | intervals [30]: 132 | xmin = 2.414 133 | xmax = 2.53 134 | text = "_" 135 | item [2]: 136 | class = "IntervalTier" 137 | name = "SYLL" 138 | xmin = 0 139 | xmax = 2.53 140 | intervals: size = 15 141 | intervals [1]: 142 | xmin = 0.015 143 | xmax = 0.2067 144 | text = "Syl" 145 | intervals [2]: 146 | xmin = 0.2067 147 | xmax = 0.39 148 | text = "Syl" 149 | intervals [3]: 150 | xmin = 0.39 151 | xmax = 0.61 152 | text = "Acc" 153 | intervals [4]: 154 | xmin = 0.61 155 | xmax = 0.7824 156 | text = "Syl" 157 | intervals [5]: 158 | xmin = 0.7824 159 | xmax = 0.91 160 | text = "Syl" 161 | intervals [6]: 162 | xmin = 0.91 163 | xmax = 1.08 164 | text = "Syl" 165 | intervals [7]: 166 | xmin = 1.08 167 | xmax = 1.2531 168 | text = "Syl" 169 | intervals [8]: 170 | xmin = 1.2531 171 | xmax = 1.365 172 | text = "Syl" 173 | intervals [9]: 174 | xmin = 1.365 175 | xmax = 1.6 176 | text = "Syl" 177 | intervals [10]: 178 | xmin = 1.6 179 | xmax = 1.725 180 | text = "Syl" 181 | intervals [11]: 182 | xmin = 1.725 183 | xmax = 1.89 184 | text = "Syl" 185 | intervals [12]: 186 | xmin = 1.89 187 | xmax = 2.0228 188 | text = "Syl" 189 | intervals [13]: 190 | xmin = 2.0228 191 | xmax = 2.1564 192 | text = "Syl" 193 | intervals [14]: 194 | xmin = 2.1564 195 | xmax = 2.414 196 | text = "Acc" 197 | intervals [15]: 198 | xmin = 2.414 199 | xmax = 2.53 200 | text = "Syl" 201 | item [3]: 202 | class = "IntervalTier" 203 | 
name = "LEX" 204 | xmin = 0 205 | xmax = 2.53 206 | intervals: size = 6 207 | intervals [1]: 208 | xmin = 0.015 209 | xmax = 0.2067 210 | text = "Det" 211 | intervals [2]: 212 | xmin = 0.2067 213 | xmax = 0.61 214 | text = "Nom" 215 | intervals [3]: 216 | xmin = 0.61 217 | xmax = 0.91 218 | text = "Vrb" 219 | intervals [4]: 220 | xmin = 0.91 221 | xmax = 1.6 222 | text = "Inf" 223 | intervals [5]: 224 | xmin = 1.6 225 | xmax = 1.725 226 | text = "Det" 227 | intervals [6]: 228 | xmin = 1.725 229 | xmax = 2.53 230 | text = "Nom" 231 | item [4]: 232 | class = "IntervalTier" 233 | name = "ORTHOGRAPHE" 234 | xmin = 0 235 | xmax = 2.53 236 | intervals: size = 7 237 | intervals [1]: 238 | xmin = 0.015 239 | xmax = 0.2067 240 | text = "SON" 241 | intervals [2]: 242 | xmin = 0.2067 243 | xmax = 0.61 244 | text = "BAGOU" 245 | intervals [3]: 246 | xmin = 0.61 247 | xmax = 0.91 248 | text = "POURRAIENT" 249 | intervals [4]: 250 | xmin = 0.91 251 | xmax = 1.6 252 | text = "FACILITER" 253 | intervals [5]: 254 | xmin = 1.6 255 | xmax = 1.725 256 | text = "LA" 257 | intervals [6]: 258 | xmin = 1.725 259 | xmax = 2.414 260 | text = "COMMUNAUTE" 261 | intervals [7]: 262 | xmin = 2.414 263 | xmax = 2.53 264 | text = "." 
265 | item [5]: 266 | class = "TextTier" 267 | name = "PHRASE" 268 | xmin = 0 269 | xmax = 2.53 270 | points: size = 2 271 | points [1]: 272 | time = 0.015 273 | mark = ":FF" 274 | points [2]: 275 | time = 2.414 276 | mark = ":DC" 277 | item [6]: 278 | class = "TextTier" 279 | name = "NIV1" 280 | xmin = 0 281 | xmax = 2.53 282 | points: size = 4 283 | points [1]: 284 | time = 0.015 285 | mark = ":FF" 286 | points [2]: 287 | time = 0.61 288 | mark = ":DG" 289 | points [3]: 290 | time = 1.6 291 | mark = ":DD" 292 | points [4]: 293 | time = 2.414 294 | mark = ":FF" 295 | item [7]: 296 | class = "TextTier" 297 | name = "NIV2" 298 | xmin = 0 299 | xmax = 2.53 300 | points: size = 7 301 | points [1]: 302 | time = 0.015 303 | mark = ":FF" 304 | points [2]: 305 | time = 0.2067 306 | mark = ":XX" 307 | points [3]: 308 | time = 0.61 309 | mark = ":FF" 310 | points [4]: 311 | time = 0.91 312 | mark = ":XX" 313 | points [5]: 314 | time = 1.6 315 | mark = ":FF" 316 | points [6]: 317 | time = 1.725 318 | mark = ":XX" 319 | points [7]: 320 | time = 2.414 321 | mark = ":FF" 322 | -------------------------------------------------------------------------------- /sfc/sfc_params.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | PySFC - parameters class used to set all PySFC parameters. 5 | 6 | @authors: 7 | Branislav Gerazov Nov 2017 8 | 9 | Copyright 2017 by GIPSA-lab, Grenoble INP, Grenoble, France. 10 | 11 | See the file LICENSE for the licence associated with this software. 
12 | """ 13 | import numpy as np 14 | import re 15 | 16 | class Params: 17 | def __init__(self): 18 | 19 | ############### 20 | #%% General flow 21 | ############### 22 | 23 | self.load_corpus = True 24 | self.load_processed_corpus = True 25 | self.do_all_phrases = True 26 | self.good_files_only = True # use only files specified 27 | self.remove_folders = False 28 | 29 | ############### 30 | #%% SFC params 31 | ############### 32 | 33 | self.database = 'french' # french, chinese 34 | 35 | self.vowel_marks=[.1,.5,.9] # original SFC 36 | # self.vowel_marks=[.1,.3,.5,.7,.9] # in 5pts 37 | self.vowel_pts = len(self.vowel_marks) 38 | 39 | #% contour generators params 40 | self.f0_scale = .05 # originally .05 for 3 vowel marks 41 | self.dur_scale = 10 # 10/.05 = 200 (originally 10) 42 | self.learn_rate = 0.01 # learning rate - default for adam is 0.001 43 | self.max_iter = 500 # number of iterations at each stepping of the contour generators 44 | self.iterations = 20 # to run analysis by synthesis 45 | self.l2 = .1 # 1e-4 default 46 | self.hidden_units = 17 # 15 in the paper, but snns config says 17 47 | 48 | ################# 49 | #%% database params 50 | ################# 51 | self.home = '/home/bgerazov/' 52 | self.file_type='TextGrid' # from what to build the corpus 53 | 54 | if self.database == 'french': 55 | self.corpus_name = self.database+'_v6_{}_{}pts'.format(self.file_type, self.vowel_pts) 56 | 57 | self.processed_corpus_name = self.corpus_name+'_learn{}_maxit{}_it{}_{}n_l2{}'.format( 58 | self.learn_rate, self.max_iter, self.iterations, self.hidden_units, self.l2) 59 | 60 | if self.do_all_phrases: 61 | # these are the phrase types we have 62 | self.phrase_types = 'DC QS EX SC EV DI'.split() 63 | # DC - declaration 64 | # QS - question 65 | # EX - exclamation 66 | # SC - suspicious irony 67 | # EV - obviousness 68 | # DI - incredulous question 69 | else: 70 | self.phrase_types = ['DC'] # just do it for DC 71 | 72 | ## and the function types for each 73 | 
self.function_types = 'DD DG XX DV EM ID IT'.split() 74 | # DD - word on the right depends on the left 75 | # DG - word on the left depends on the right 76 | # XX - clitic on the left (les enfants) - downstepping for function words 77 | # DV - like XX - downstepping for auxiliaries 78 | # ID - independence (separated by a , ) 79 | # IT - interdependence - this thing links 3 segments so it's not implemented ... 80 | 81 | self.end_marks = 'XX DV EM'.split() # contours with only left context 82 | 83 | ## good files used to train the SFC 84 | self.good_files = {'DC' : np.r_[np.arange(1,90), np.arange(91,220), 85 | np.arange(221,313), np.arange(314,394)], 86 | 'SC' : np.r_[np.arange(1,219), np.arange(220,313), 87 | np.arange(314,319), np.arange(324,329)], 88 | 'EV' : np.r_[np.arange(1,90), np.arange(91,313), 89 | np.arange(314,319), np.arange(324,329)], 90 | 'DI' : np.r_[np.arange(1,26), np.arange(27,90), 91 | np.arange(91,139), np.arange(140,220), 92 | np.arange(221,245), np.arange(246,277), 93 | np.arange(278,313), np.arange(314,319), 94 | np.arange(324,329)], 95 | 'QS' : np.r_[np.arange(1,219), np.arange(220, 313), 96 | np.arange(314,319), np.arange(324,329)], 97 | 'EX' : np.r_[np.arange(1,313), np.arange(314,319), 98 | np.arange(324,329)]} 99 | 100 | if self.file_type=='fpro': 101 | self.datafolder = self.home + 'work/data/french/_fpro/' 102 | self.re_folder = re.compile(r'^.*\d\.fpro$') 103 | elif self.file_type=='TextGrid': 104 | self.datafolder = self.home + 'work/data/french/_grid/' 105 | self.re_folder = re.compile(r'^.*\d\.TextGrid$') 106 | 107 | ### chinese 108 | elif self.database == 'chinese': 109 | 110 | self.corpus_name = self.database+'_v2_{}_{}pts'.format(self.file_type, self.vowel_pts) 111 | 112 | self.processed_corpus_name = self.corpus_name+'_learn{}_maxit{}_it{}_{}n_l2{}'.format( 113 | self.learn_rate, self.max_iter, self.iterations, self.hidden_units, self.l2) 114 | 115 | if self.do_all_phrases: 116 | self.phrase_types = 'DC QS'.split() 117 | else: 
118 | self.phrase_types = ['DC'] # just do it for DC 119 | 120 | ## and the function types 121 | self.function_types = 'C0 C1 C2 C3 C4 WB ID IT'.split() 122 | # C0-4 - tonal accents 123 | # WB - word boundary 124 | 125 | self.end_marks = ['WB'] # contours with only left context 126 | 127 | ## good files used to train the SFC 128 | ## without the multiple DCs 129 | self.good_files = np.r_[np.arange(1,101), np.arange(1001,1005), 130 | 4211,4212,4214, 5949] 131 | 132 | if self.file_type=='fpro': 133 | self.datafolder = self.home + 'work/data/chinese/_fpro/' 134 | self.re_folder = re.compile(r'^chinese_\d*\.fpro$') 135 | 136 | elif self.file_type=='TextGrid': 137 | self.datafolder = self.home + 'work/data/chinese/_grid/' 138 | self.re_folder = re.compile(r'^chinese_\d*\.TextGrid$') 139 | 140 | ################## 141 | #%% read data params 142 | ################## 143 | 144 | self.re_fpro = re.compile( # fpro first line regex 145 | r"""^\s* 146 | F0_Ref=\s*(\d+).* 147 | CLOCK=\s*(\d*).* 148 | R=([01]\.\d*).* 149 | DISOL=\s*([\d.]*).* 150 | Stats=\s*([./\w]*) 151 | \s*$""", re.VERBOSE) 152 | 153 | ### read textgrids 154 | self.disol = 0 155 | self.isochrony_gravity = 0.2 156 | self.f0_method = 'pitch_marks' 157 | 158 | if self.database == 'french': 159 | self.use_ipcgs = True 160 | self.re_vowels = re.compile('[aeouiyx]') 161 | 162 | self.isochrony_clock = .190 163 | self.f0_ref = 126 164 | self.f0_min = 80 165 | self.f0_max = 300 166 | 167 | self.f0_ref_method='all' 168 | # Method to use to accumulate stats if f0_stats is None. 
Can be: 169 | # all - all files 170 | # DC - DC files (works for french) 171 | # vowels - just vowel segments 172 | # vowel_marks - at the vowel marks in the vowel segments 173 | 174 | if self.f0_method == 'pitch_marks': 175 | self.re_f0 = re.compile(r'^.*\.PointProcess$') 176 | elif self.f0_method == 'pca': 177 | self.re_f0 = re.compile(r'^.*\.pca$') 178 | 179 | self.textgrid_folder = self.home + 'work/data/french/_grid/' 180 | self.f0_folder = self.home + 'work/data/french/_pca/' 181 | 182 | # levels in the textgrid - phones should always be 0 183 | self.syll_level = 1 184 | self.orthographs_level = 3 185 | self.phrase_level = 4 186 | self.tone_levels = None 187 | 188 | # stats 189 | self.f0_stats = 'french_f0_stats_all' 190 | self.dur_stats = 'french_phone_dur_stats' 191 | self.syll_stats = 'french_syll_dur_stats' 192 | 193 | elif self.database == 'chinese': 194 | self.use_ipcgs = False # use syllables 195 | self.re_vowels = re.compile('.*[aeouiv].*') 196 | # v in yu 197 | 198 | self.isochrony_clock = .214 199 | 200 | self.f0_ref = 270 201 | self.f0_min = 100 202 | self.f0_max = 450 203 | 204 | self.textgrid_folder = self.home + 'work/data/chinese/_grid/' 205 | self.f0_folder = self.home + 'work/data/chinese/_pca/' 206 | 207 | if self.f0_method == 'pitch_marks': 208 | self.re_f0 = re.compile(r'^chinese_\d*\.PointProcess$') 209 | elif self.f0_method == 'pca': 210 | self.re_f0 = re.compile(r'^chinese_\d*\.pca$') 211 | 212 | # levels in the textgrid - phones should always be 0 213 | self.syll_level = 1 214 | self.tone_levels = [2, 3] 215 | self.orthographs_level = 4 216 | self.phrase_level = 7 217 | 218 | # stats 219 | self.f0_stats = 'chinese_f0_stats_all' 220 | self.f0_ref_method='all' 221 | self.dur_stats = 'chinese_phone_dur_stats' 222 | self.syll_stats = 'chinese_syll_dur_stats' 223 | 224 | ############### 225 | #%% corpus params 226 | ############### 227 | 228 | self.columns = 'file phrasetype contourtype unit n_unit ramp1 ramp2 ramp3 ramp4'.split() 229 | 
self.columns_in = len(self.columns) 230 | self.columns += ['f0{}'.format(x) for x in range(len(self.vowel_marks))] + ['dur'] 231 | self.orig_columns = self.columns[self.columns_in:] 232 | self.target_columns = ['target_'+column for column in self.orig_columns] 233 | 234 | ####################### 235 | #%% plotting and saving 236 | ####################### 237 | self.show_plot = False # used in all plotting - whether to close the plots immediately 238 | 239 | self.plot_contours = True 240 | 241 | # expansion plots 242 | self.left_max = 5 243 | self.right_max = 5 244 | self.phrase_max = 10 245 | 246 | # copy worst files 247 | self.plot_worst = False 248 | self.n_files=100 249 | 250 | # figure save path 251 | self.save_path = 'figures/{}'.format(self.processed_corpus_name) 252 | 253 | # pkl save path 254 | self.pkl_path = 'pkls/' -------------------------------------------------------------------------------- /examples/chinese_003.PointProcess: -------------------------------------------------------------------------------- 1 | File type = "ooTextFile short" 2 | "PointProcess" 3 | 4 | 0 5 | 10.3245 6 | 1631 7 | 0.85025 8 | 0.853 9 | 0.855375 10 | 0.8579375 11 | 0.8604375 12 | 0.8630625 13 | 0.865625 14 | 0.8681875 15 | 0.870625 16 | 0.87325 17 | 0.8758125 18 | 0.878375 19 | 0.8809375 20 | 0.8835625 21 | 0.886125 22 | 0.888625 23 | 0.8911875 24 | 0.89375 25 | 0.8963125 26 | 0.8988125 27 | 0.9013125 28 | 0.903875 29 | 0.906375 30 | 0.908875 31 | 0.911375 32 | 0.9138125 33 | 0.9163125 34 | 0.9188125 35 | 0.9213125 36 | 0.9238125 37 | 0.9263125 38 | 0.9288125 39 | 0.93125 40 | 0.9336875 41 | 0.9361875 42 | 0.9386875 43 | 0.9411875 44 | 0.9436875 45 | 0.946125 46 | 0.9486875 47 | 0.9514375 48 | 0.9545 49 | 0.9576875 50 | 0.9606875 51 | 0.9635625 52 | 0.9663125 53 | 0.9693125 54 | 0.972375 55 | 0.9754375 56 | 0.9783125 57 | 0.9809375 58 | 0.9836875 59 | 0.986625 60 | 0.9900625 61 | 0.9938125 62 | 0.99775 63 | 1.001625 64 | 1.0050625 65 | 1.0079375 66 | 1.0104375 67 | 1.013 
68 | 1.016125 69 | 1.0196875 70 | 1.0234375 71 | 1.0270625 72 | 1.0304375 73 | 1.0338125 74 | 1.037 75 | 1.0404375 76 | 1.044 77 | 1.0476875 78 | 1.0513125 79 | 1.0549375 80 | 1.0585625 81 | 1.062375 82 | 1.0661875 83 | 1.070125 84 | 1.0740625 85 | 1.078 86 | 1.082125 87 | 1.0863125 88 | 1.0905625 89 | 1.094625 90 | 1.0985 91 | 1.102375 92 | 1.1063125 93 | 1.11025 94 | 1.1141875 95 | 1.118125 96 | 1.1220625 97 | 1.126125 98 | 1.1303125 99 | 1.134375 100 | 1.139125 101 | 1.143 102 | 1.19225 103 | 1.196375 104 | 1.200375 105 | 1.204125 106 | 1.2079375 107 | 1.2116875 108 | 1.215625 109 | 1.2195625 110 | 1.2238125 111 | 1.2281875 112 | 1.2326875 113 | 1.23725 114 | 1.242125 115 | 1.247 116 | 1.251875 117 | 1.256875 118 | 1.2618125 119 | 1.2665625 120 | 1.2714375 121 | 1.276125 122 | 1.28075 123 | 1.28525 124 | 1.28975 125 | 1.294125 126 | 1.298375 127 | 1.3025 128 | 1.3065625 129 | 1.310625 130 | 1.3145625 131 | 1.3184375 132 | 1.32225 133 | 1.3260625 134 | 1.3298125 135 | 1.3335 136 | 1.33725 137 | 1.3409375 138 | 1.3443125 139 | 1.34725 140 | 1.3506875 141 | 1.3811875 142 | 1.38425 143 | 1.3875625 144 | 1.3905 145 | 1.393625 146 | 1.396875 147 | 1.4 148 | 1.4031875 149 | 1.406375 150 | 1.4096875 151 | 1.4130625 152 | 1.4164375 153 | 1.419875 154 | 1.4233125 155 | 1.4268125 156 | 1.430375 157 | 1.4339375 158 | 1.437625 159 | 1.4413125 160 | 1.445 161 | 1.4486875 162 | 1.4524375 163 | 1.4561875 164 | 1.46 165 | 1.463875 166 | 1.4676875 167 | 1.471625 168 | 1.4755625 169 | 1.4795625 170 | 1.4835 171 | 1.4875 172 | 1.4915 173 | 1.4955 174 | 1.4995625 175 | 1.5036875 176 | 1.50775 177 | 1.5118125 178 | 1.515875 179 | 1.5199375 180 | 1.524 181 | 1.528 182 | 1.5319375 183 | 1.535875 184 | 1.53975 185 | 1.5436875 186 | 1.5475625 187 | 1.551375 188 | 1.5551875 189 | 1.5590625 190 | 1.5631875 191 | 1.5674375 192 | 1.5716875 193 | 1.57575 194 | 1.69175 195 | 1.6935625 196 | 1.6956875 197 | 1.6981875 198 | 1.700875 199 | 1.703625 200 | 1.706375 201 | 1.709125 202 | 1.711875 203 
| 1.714625 204 | 1.717375 205 | 1.720125 206 | 1.7228125 207 | 1.7255625 208 | 1.72825 209 | 1.7309375 210 | 1.733625 211 | 1.73625 212 | 1.7389375 213 | 1.7415625 214 | 1.7441875 215 | 1.7468125 216 | 1.7494375 217 | 1.7520625 218 | 1.754625 219 | 1.7571875 220 | 1.75975 221 | 1.762375 222 | 1.7649375 223 | 1.7675 224 | 1.770125 225 | 1.7728125 226 | 1.775625 227 | 1.7784375 228 | 1.78125 229 | 1.7840625 230 | 1.78675 231 | 1.862625 232 | 1.865125 233 | 1.8674375 234 | 1.869625 235 | 1.872 236 | 1.874375 237 | 1.87675 238 | 1.8791875 239 | 1.8816875 240 | 1.88425 241 | 1.8868125 242 | 1.88925 243 | 1.8916875 244 | 1.8944375 245 | 1.8975 246 | 1.900625 247 | 1.903625 248 | 1.906375 249 | 1.909 250 | 1.9116875 251 | 1.9143125 252 | 1.916875 253 | 1.919625 254 | 1.9223125 255 | 1.9250625 256 | 1.92775 257 | 1.9304375 258 | 1.9330625 259 | 1.9355625 260 | 1.9380625 261 | 1.940625 262 | 1.9431875 263 | 1.9456875 264 | 1.948375 265 | 1.951 266 | 1.9535625 267 | 1.9561875 268 | 1.95875 269 | 1.961375 270 | 1.9640625 271 | 1.96675 272 | 1.9695 273 | 1.97225 274 | 1.9749375 275 | 1.9775625 276 | 1.98025 277 | 1.9828125 278 | 1.9855 279 | 1.988125 280 | 1.9908125 281 | 1.993625 282 | 1.9964375 283 | 1.9993125 284 | 2.002125 285 | 2.0049375 286 | 2.0076875 287 | 2.0105 288 | 2.0131875 289 | 2.015875 290 | 2.0185 291 | 2.02125 292 | 2.023875 293 | 2.026375 294 | 2.0289375 295 | 2.0314375 296 | 2.034 297 | 2.0366875 298 | 2.03925 299 | 2.0416875 300 | 2.0443125 301 | 2.0469375 302 | 2.0495 303 | 2.0521875 304 | 2.0548125 305 | 2.0575 306 | 2.0601875 307 | 2.0629375 308 | 2.0656875 309 | 2.0684375 310 | 2.0711875 311 | 2.074 312 | 2.076875 313 | 2.07975 314 | 2.0826875 315 | 2.085625 316 | 2.0886875 317 | 2.09175 318 | 2.0949375 319 | 2.0981875 320 | 2.1015 321 | 2.1049375 322 | 2.1084375 323 | 2.1120625 324 | 2.1156875 325 | 2.1195 326 | 2.1234375 327 | 2.1274375 328 | 2.1315625 329 | 2.1358125 330 | 2.1401875 331 | 2.144625 332 | 2.1491875 333 | 2.1538125 334 | 2.1585625 335 
| 2.163375 336 | 2.16825 337 | 2.1731875 338 | 2.1781875 339 | 2.1831875 340 | 2.1881875 341 | 2.19325 342 | 2.1983125 343 | 2.2033125 344 | 2.208375 345 | 2.2134375 346 | 2.2184375 347 | 2.2235 348 | 2.2285625 349 | 2.2336875 350 | 2.238875 351 | 2.2443125 352 | 2.3341875 353 | 2.3384375 354 | 2.3421875 355 | 2.3465 356 | 2.3508125 357 | 2.3551875 358 | 2.359625 359 | 2.364125 360 | 2.36875 361 | 2.373375 362 | 2.378 363 | 2.3825625 364 | 2.387 365 | 2.3914375 366 | 2.3956875 367 | 2.399875 368 | 2.403875 369 | 2.4078125 370 | 2.411625 371 | 2.4153125 372 | 2.4189375 373 | 2.4225 374 | 2.4259375 375 | 2.4293125 376 | 2.432625 377 | 2.4359375 378 | 2.439125 379 | 2.4423125 380 | 2.4455 381 | 2.4486875 382 | 2.45175 383 | 2.4548125 384 | 2.4579375 385 | 2.461125 386 | 2.464375 387 | 2.4675625 388 | 2.4706875 389 | 2.4739375 390 | 2.477125 391 | 2.48025 392 | 2.4834375 393 | 2.486625 394 | 2.4898125 395 | 2.492875 396 | 2.496 397 | 2.4990625 398 | 2.5021875 399 | 2.505375 400 | 2.508625 401 | 2.5119375 402 | 2.51525 403 | 2.518625 404 | 2.521875 405 | 2.525 406 | 2.528 407 | 2.531125 408 | 2.5343125 409 | 2.5375625 410 | 2.5409375 411 | 2.5444375 412 | 2.548 413 | 2.55175 414 | 2.5556875 415 | 2.5598125 416 | 2.564125 417 | 2.568625 418 | 2.5733125 419 | 2.5781875 420 | 2.5831875 421 | 2.588375 422 | 2.593875 423 | 2.599625 424 | 2.6055 425 | 2.6115 426 | 2.617625 427 | 2.62375 428 | 2.630125 429 | 2.6368125 430 | 2.6445 431 | 2.6514375 432 | 2.78175 433 | 2.784875 434 | 2.7885625 435 | 2.7918125 436 | 2.7951875 437 | 2.7985625 438 | 2.802 439 | 2.8055625 440 | 2.809125 441 | 2.8126875 442 | 2.8163125 443 | 2.8199375 444 | 2.823625 445 | 2.8274375 446 | 2.83125 447 | 2.83525 448 | 2.83925 449 | 2.8434375 450 | 2.847625 451 | 2.8519375 452 | 2.8563125 453 | 2.860875 454 | 2.8654375 455 | 2.870125 456 | 2.8748125 457 | 2.8795625 458 | 2.884375 459 | 2.8893125 460 | 2.8941875 461 | 2.89925 462 | 2.9043125 463 | 2.9095 464 | 2.9146875 465 | 2.9199375 466 | 2.9251875 467 
| 2.9305 468 | 2.935875 469 | 2.9411875 470 | 2.9465625 471 | 2.951875 472 | 2.957125 473 | 2.9623125 474 | 2.9675 475 | 2.9726875 476 | 2.977875 477 | 2.9838125 478 | 2.9896875 479 | 3.073625 480 | 3.077125 481 | 3.0813125 482 | 3.085625 483 | 3.09 484 | 3.0944375 485 | 3.0989375 486 | 3.1034375 487 | 3.108125 488 | 3.1128125 489 | 3.1175625 490 | 3.122375 491 | 3.1271875 492 | 3.132125 493 | 3.1370625 494 | 3.142125 495 | 3.14725 496 | 3.152375 497 | 3.157625 498 | 3.1629375 499 | 3.1681875 500 | 3.1735625 501 | 3.179 502 | 3.1844375 503 | 3.19 504 | 3.1955625 505 | 3.201125 506 | 3.2066875 507 | 3.21225 508 | 3.2179375 509 | 3.223625 510 | 3.22925 511 | 3.234875 512 | 3.2405625 513 | 3.2461875 514 | 3.251875 515 | 3.2574375 516 | 3.263125 517 | 3.26875 518 | 3.2743125 519 | 3.28 520 | 3.28575 521 | 3.291375 522 | 3.2970625 523 | 3.302875 524 | 3.3086875 525 | 3.3143125 526 | 3.375125 527 | 3.3783125 528 | 3.3815 529 | 3.3846875 530 | 3.3878125 531 | 3.3909375 532 | 3.394 533 | 3.3970625 534 | 3.400125 535 | 3.403125 536 | 3.406125 537 | 3.409125 538 | 3.412125 539 | 3.4150625 540 | 3.418 541 | 3.4209375 542 | 3.423875 543 | 3.42675 544 | 3.4296875 545 | 3.4325625 546 | 3.4354375 547 | 3.4383125 548 | 3.4411875 549 | 3.4440625 550 | 3.4469375 551 | 3.4498125 552 | 3.4526875 553 | 3.4555625 554 | 3.458375 555 | 3.46125 556 | 3.4640625 557 | 3.4669375 558 | 3.46975 559 | 3.4725625 560 | 3.475375 561 | 3.47825 562 | 3.4810625 563 | 3.483875 564 | 3.4866875 565 | 3.4895 566 | 3.4923125 567 | 3.495125 568 | 3.497875 569 | 3.5006875 570 | 3.5035 571 | 3.5063125 572 | 3.5090625 573 | 3.511875 574 | 3.514625 575 | 3.517375 576 | 3.520125 577 | 3.522875 578 | 3.525625 579 | 3.528375 580 | 3.531125 581 | 3.5338125 582 | 3.5365625 583 | 3.5393125 584 | 3.542 585 | 3.54475 586 | 3.5474375 587 | 3.550125 588 | 3.5528125 589 | 3.5555625 590 | 3.5583125 591 | 3.5610625 592 | 3.563875 593 | 3.566625 594 | 3.5694375 595 | 3.57225 596 | 3.5750625 597 | 3.577875 598 | 3.58075 599 | 
3.58375 600 | 3.586375 601 | 3.7164375 602 | 3.7188125 603 | 3.7211875 604 | 3.7235625 605 | 3.7260625 606 | 3.72875 607 | 3.7315 608 | 3.73425 609 | 3.7371875 610 | 3.740125 611 | 3.743125 612 | 3.74625 613 | 3.7495 614 | 3.7528125 615 | 3.75625 616 | 3.75975 617 | 3.7634375 618 | 3.76725 619 | 3.77125 620 | 3.7753125 621 | 3.7795 622 | 3.7838125 623 | 3.7883125 624 | 3.792875 625 | 3.7974375 626 | 3.8020625 627 | 3.8066875 628 | 3.81125 629 | 3.81575 630 | 3.82025 631 | 3.82475 632 | 3.829375 633 | 3.8341875 634 | 3.8390625 635 | 3.8435 636 | 3.9203125 637 | 3.923625 638 | 3.9264375 639 | 3.9293125 640 | 3.932375 641 | 3.935375 642 | 3.9384375 643 | 3.9414375 644 | 3.9444375 645 | 3.9475 646 | 3.9505625 647 | 3.9535 648 | 3.9565625 649 | 3.959625 650 | 3.962625 651 | 3.9656875 652 | 3.9686875 653 | 3.97175 654 | 3.974875 655 | 3.9779375 656 | 3.981 657 | 3.9840625 658 | 3.987125 659 | 3.99025 660 | 3.9933125 661 | 3.996375 662 | 3.9995 663 | 4.0025625 664 | 4.0055625 665 | 4.008625 666 | 4.0116875 667 | 4.01475 668 | 4.01775 669 | 4.0208125 670 | 4.0238125 671 | 4.027 672 | 4.02996875 673 | 4.03296875 674 | 4.03596875 675 | 4.038875 676 | 4.04178125 677 | 4.0446875 678 | 4.0476875 679 | 4.05059375 680 | 4.0534375 681 | 4.05628125 682 | 4.05840625 683 | 4.060875 684 | 4.06375 685 | 4.0666875 686 | 4.0696875 687 | 4.0726875 688 | 4.0756875 689 | 4.078875 690 | 4.0821875 691 | 4.085125 692 | 4.088125 693 | 4.09125 694 | 4.09415625 695 | 4.13575 696 | 4.1385625 697 | 4.1415 698 | 4.1445 699 | 4.147375 700 | 4.1505625 701 | 4.1538125 702 | 4.157125 703 | 4.1605625 704 | 4.164 705 | 4.167625 706 | 4.1713125 707 | 4.1750625 708 | 4.179 709 | 4.1830625 710 | 4.1873125 711 | 4.1916875 712 | 4.19625 713 | 4.201 714 | 4.2059375 715 | 4.2109375 716 | 4.2161875 717 | 4.22175 718 | 4.227625 719 | 4.2336875 720 | 4.24 721 | 4.24675 722 | 4.254125 723 | 4.2626875 724 | 4.27325 725 | 4.288375 726 | 4.8114375 727 | 4.8145 728 | 4.81775 729 | 4.8209375 730 | 4.8239375 731 | 
4.8268125 732 | 4.8296875 733 | 4.8326875 734 | 4.8358125 735 | 4.838875 736 | 4.8419375 737 | 4.845 738 | 4.848 739 | 4.851 740 | 4.8538125 741 | 4.856625 742 | 4.8595 743 | 4.8623125 744 | 4.865125 745 | 4.867875 746 | 4.8706875 747 | 4.8735625 748 | 4.8763125 749 | 4.879125 750 | 4.881875 751 | 4.8846875 752 | 4.8874375 753 | 4.8901875 754 | 4.8929375 755 | 4.8956875 756 | 4.8983125 757 | 4.901125 758 | 4.90375 759 | 4.9064375 760 | 4.9090625 761 | 4.911625 762 | 4.914125 763 | 4.91675 764 | 4.9193125 765 | 4.9219375 766 | 4.9245625 767 | 4.9271875 768 | 4.9298125 769 | 4.9325 770 | 4.9350625 771 | 4.9375625 772 | 4.9399375 773 | 4.942375 774 | 4.9448125 775 | 4.9475625 776 | 4.950625 777 | 4.9539375 778 | 4.9570625 779 | 4.959875 780 | 4.9625625 781 | 4.965125 782 | 4.9675625 783 | 4.970375 784 | 4.9734375 785 | 4.97675 786 | 4.9799375 787 | 4.9829375 788 | 4.98575 789 | 4.988625 790 | 4.9915 791 | 4.994125 792 | 4.9969375 793 | 4.9998125 794 | 5.00275 795 | 5.0058125 796 | 5.008875 797 | 5.0120625 798 | 5.0153125 799 | 5.018625 800 | 5.022 801 | 5.0254375 802 | 5.029 803 | 5.032625 804 | 5.0363125 805 | 5.04 806 | 5.0438125 807 | 5.0475 808 | 5.05125 809 | 5.0550625 810 | 5.058875 811 | 5.0626875 812 | 5.0665 813 | 5.07025 814 | 5.074 815 | 5.07775 816 | 5.0815 817 | 5.0851875 818 | 5.0889375 819 | 5.0925625 820 | 5.0961875 821 | 5.09975 822 | 5.10325 823 | 5.1066875 824 | 5.1100625 825 | 5.113375 826 | 5.1166875 827 | 5.1199375 828 | 5.12325 829 | 5.1265 830 | 5.1296875 831 | 5.1329375 832 | 5.1363125 833 | 5.1398125 834 | 5.1434375 835 | 5.147 836 | 5.150375 837 | 5.1536875 838 | 5.180375 839 | 5.1831875 840 | 5.185875 841 | 5.1885625 842 | 5.1915 843 | 5.1944375 844 | 5.1974375 845 | 5.2004375 846 | 5.2035 847 | 5.206625 848 | 5.20975 849 | 5.212875 850 | 5.216125 851 | 5.2194375 852 | 5.22275 853 | 5.226125 854 | 5.2295625 855 | 5.2330625 856 | 5.2366875 857 | 5.240375 858 | 5.2441875 859 | 5.248 860 | 5.2519375 861 | 5.2559375 862 | 5.259875 863 | 
5.2639375 864 | 5.2680625 865 | 5.2721875 866 | 5.2763125 867 | 5.280375 868 | 5.2844375 869 | 5.2885625 870 | 5.292625 871 | 5.2966875 872 | 5.30075 873 | 5.30475 874 | 5.3088125 875 | 5.312875 876 | 5.3170625 877 | 5.3213125 878 | 5.3253125 879 | 5.421625 880 | 5.424875 881 | 5.42825 882 | 5.4319375 883 | 5.435875 884 | 5.4398125 885 | 5.4438125 886 | 5.4478125 887 | 5.451875 888 | 5.45575 889 | 5.4595625 890 | 5.46325 891 | 5.4668125 892 | 5.4703125 893 | 5.4736875 894 | 5.477 895 | 5.48025 896 | 5.4834375 897 | 5.4865625 898 | 5.4896875 899 | 5.49275 900 | 5.4956875 901 | 5.4986875 902 | 5.5016875 903 | 5.504625 904 | 5.5075 905 | 5.510375 906 | 5.5131875 907 | 5.516 908 | 5.519 909 | 5.522 910 | 5.5249375 911 | 5.527875 912 | 5.530625 913 | 5.533375 914 | 5.536125 915 | 5.538875 916 | 5.541625 917 | 5.5443125 918 | 5.5470625 919 | 5.5498125 920 | 5.5526875 921 | 5.5555 922 | 5.55825 923 | 5.561 924 | 5.5635 925 | 5.6068125 926 | 5.60975 927 | 5.6126875 928 | 5.6155 929 | 5.6181875 930 | 5.621 931 | 5.624125 932 | 5.6271875 933 | 5.630375 934 | 5.63375 935 | 5.63725 936 | 5.640625 937 | 5.6441875 938 | 5.64775 939 | 5.6514375 940 | 5.65525 941 | 5.659125 942 | 5.663125 943 | 5.66725 944 | 5.6714375 945 | 5.6756875 946 | 5.6800625 947 | 5.6845 948 | 5.689 949 | 5.6935625 950 | 5.698125 951 | 5.7028125 952 | 5.7075625 953 | 5.7125 954 | 5.717125 955 | 5.7218125 956 | 5.726625 957 | 5.7315625 958 | 5.7364375 959 | 5.741375 960 | 5.746375 961 | 5.7514375 962 | 5.7565 963 | 5.7615 964 | 5.7665 965 | 5.77125 966 | 5.7763125 967 | 5.7815 968 | 5.78675 969 | 5.792125 970 | 5.7975 971 | 5.8029375 972 | 5.808375 973 | 5.8138125 974 | 5.8193125 975 | 5.82475 976 | 5.8300625 977 | 5.835375 978 | 5.8405625 979 | 5.845625 980 | 5.8505625 981 | 5.855375 982 | 5.86 983 | 5.8645 984 | 5.8686875 985 | 5.8729375 986 | 5.876875 987 | 5.88075 988 | 5.8844375 989 | 5.888 990 | 5.8914375 991 | 5.8948125 992 | 5.898125 993 | 5.901375 994 | 5.9045625 995 | 5.9076875 996 | 5.910875 997 
| 5.9139375 998 | 5.917 999 | 5.9200625 1000 | 5.923125 1001 | 5.9261875 1002 | 5.9293125 1003 | 5.932375 1004 | 5.9355625 1005 | 5.9388125 1006 | 5.942 1007 | 5.945125 1008 | 5.9483125 1009 | 6.0455625 1010 | 6.049125 1011 | 6.05325 1012 | 6.057625 1013 | 6.0620625 1014 | 6.0666875 1015 | 6.0715625 1016 | 6.076625 1017 | 6.08175 1018 | 6.0871875 1019 | 6.0926875 1020 | 6.09825 1021 | 6.1039375 1022 | 6.1096875 1023 | 6.1154375 1024 | 6.12125 1025 | 6.1271875 1026 | 6.133125 1027 | 6.139 1028 | 6.1450625 1029 | 6.151 1030 | 6.157 1031 | 6.163 1032 | 6.169 1033 | 6.175 1034 | 6.181 1035 | 6.187125 1036 | 6.1931875 1037 | 6.199125 1038 | 6.2051875 1039 | 6.2110625 1040 | 6.217 1041 | 6.2228125 1042 | 6.228625 1043 | 6.23425 1044 | 6.2398125 1045 | 6.2454375 1046 | 6.25175 1047 | 6.2566875 1048 | 6.26175 1049 | 6.3571875 1050 | 6.360125 1051 | 6.3635 1052 | 6.367 1053 | 6.3706875 1054 | 6.3745 1055 | 6.378375 1056 | 6.3823125 1057 | 6.386375 1058 | 6.3905 1059 | 6.3946875 1060 | 6.398875 1061 | 6.4030625 1062 | 6.40725 1063 | 6.4114375 1064 | 6.415625 1065 | 6.419875 1066 | 6.4240625 1067 | 6.428125 1068 | 6.4323125 1069 | 6.55475 1070 | 6.557625 1071 | 6.5608125 1072 | 6.5639375 1073 | 6.5671875 1074 | 6.5705 1075 | 6.5738125 1076 | 6.5771875 1077 | 6.5805625 1078 | 6.584 1079 | 6.5875 1080 | 6.5910625 1081 | 6.5946875 1082 | 6.598375 1083 | 6.6021875 1084 | 6.606 1085 | 6.6099375 1086 | 6.6139375 1087 | 6.618 1088 | 6.62225 1089 | 6.6264375 1090 | 6.63075 1091 | 6.6350625 1092 | 6.6395 1093 | 6.644 1094 | 6.6485 1095 | 6.6530625 1096 | 6.6576875 1097 | 6.6624375 1098 | 6.66725 1099 | 6.6720625 1100 | 6.6769375 1101 | 6.682 1102 | 6.687 1103 | 6.6920625 1104 | 6.6971875 1105 | 6.7024375 1106 | 6.7081875 1107 | 6.7138125 1108 | 6.7191875 1109 | 6.7243125 1110 | 6.72925 1111 | 6.7340625 1112 | 6.7389375 1113 | 6.74375 1114 | 6.7486875 1115 | 6.7536875 1116 | 6.7588125 1117 | 6.7639375 1118 | 6.7691875 1119 | 6.7745 1120 | 6.7798125 1121 | 6.7851875 1122 | 6.790625 1123 
| 6.7960625 1124 | 6.8015 1125 | 6.8069375 1126 | 6.812375 1127 | 6.8178125 1128 | 6.8233125 1129 | 6.8288125 1130 | 6.8343125 1131 | 6.83975 1132 | 6.8451875 1133 | 6.850625 1134 | 6.8560625 1135 | 6.8614375 1136 | 6.867 1137 | 6.8730625 1138 | 6.8804375 1139 | 6.94825 1140 | 6.9531875 1141 | 6.9576875 1142 | 6.961875 1143 | 6.9661875 1144 | 6.9706875 1145 | 6.9751875 1146 | 6.979875 1147 | 6.9846875 1148 | 6.9894375 1149 | 6.99425 1150 | 6.9989375 1151 | 7.0035625 1152 | 7.0081875 1153 | 7.0126875 1154 | 7.017125 1155 | 7.0215625 1156 | 7.026 1157 | 7.0305 1158 | 7.034875 1159 | 7.0391875 1160 | 7.0435 1161 | 7.047625 1162 | 7.05175 1163 | 7.055875 1164 | 7.0598125 1165 | 7.0638125 1166 | 7.06775 1167 | 7.071625 1168 | 7.075375 1169 | 7.0790625 1170 | 7.082625 1171 | 7.086125 1172 | 7.0895 1173 | 7.0928125 1174 | 7.096125 1175 | 7.099375 1176 | 7.1025 1177 | 7.105625 1178 | 7.10875 1179 | 7.1118125 1180 | 7.114875 1181 | 7.1179375 1182 | 7.1214375 1183 | 7.12475 1184 | 7.224 1185 | 7.2264375 1186 | 7.228875 1187 | 7.231625 1188 | 7.23425 1189 | 7.236625 1190 | 7.23925 1191 | 7.2418125 1192 | 7.244375 1193 | 7.247 1194 | 7.2495625 1195 | 7.252125 1196 | 7.2546875 1197 | 7.25725 1198 | 7.2598125 1199 | 7.262375 1200 | 7.2649375 1201 | 7.2674375 1202 | 7.27 1203 | 7.2725 1204 | 7.275 1205 | 7.2774375 1206 | 7.2799375 1207 | 7.2824375 1208 | 7.284875 1209 | 7.287375 1210 | 7.2898125 1211 | 7.292375 1212 | 7.294875 1213 | 7.2974375 1214 | 7.300125 1215 | 7.302625 1216 | 7.305 1217 | 7.3075 1218 | 7.3099375 1219 | 7.312375 1220 | 7.314875 1221 | 7.3175 1222 | 7.32025 1223 | 7.323375 1224 | 7.32675 1225 | 7.3301875 1226 | 7.33325 1227 | 7.3360625 1228 | 7.3388125 1229 | 7.3415 1230 | 7.3440625 1231 | 7.3466875 1232 | 7.3495625 1233 | 7.352875 1234 | 7.35625 1235 | 7.3595 1236 | 7.3625625 1237 | 7.3654375 1238 | 7.3685625 1239 | 7.3720625 1240 | 7.3755625 1241 | 7.379 1242 | 7.3823125 1243 | 7.385375 1244 | 7.3883125 1245 | 7.391375 1246 | 7.394375 1247 | 7.397375 1248 | 
7.4005 1249 | 7.40375 1250 | 7.4069375 1251 | 7.410125 1252 | 7.4134375 1253 | 7.4169375 1254 | 7.42075 1255 | 7.4246875 1256 | 7.4286875 1257 | 7.4325 1258 | 7.43625 1259 | 7.4399375 1260 | 7.44375 1261 | 7.447625 1262 | 7.4516875 1263 | 7.456125 1264 | 7.4603125 1265 | 7.464125 1266 | 7.4683125 1267 | 7.4718125 1268 | 7.4763125 1269 | 7.5269375 1270 | 7.530875 1271 | 7.5344375 1272 | 7.5383125 1273 | 7.542375 1274 | 7.5464375 1275 | 7.5506875 1276 | 7.555 1277 | 7.5595625 1278 | 7.5641875 1279 | 7.5689375 1280 | 7.5738125 1281 | 7.579 1282 | 7.5841875 1283 | 7.5895 1284 | 7.5949375 1285 | 7.6005625 1286 | 7.60625 1287 | 7.6120625 1288 | 7.618125 1289 | 7.6241875 1290 | 7.63025 1291 | 7.6364375 1292 | 7.642625 1293 | 7.648875 1294 | 7.6550625 1295 | 7.6610625 1296 | 7.6670625 1297 | 7.673 1298 | 7.678875 1299 | 7.684625 1300 | 7.6903125 1301 | 7.695875 1302 | 7.75975 1303 | 7.7630625 1304 | 7.7661875 1305 | 7.7695625 1306 | 7.7729375 1307 | 7.776375 1308 | 7.77975 1309 | 7.7833125 1310 | 7.786875 1311 | 7.7905 1312 | 7.7941875 1313 | 7.797875 1314 | 7.80175 1315 | 7.8055625 1316 | 7.8095 1317 | 7.8135 1318 | 7.8175625 1319 | 7.8216875 1320 | 7.8260625 1321 | 7.8304375 1322 | 7.834875 1323 | 7.839375 1324 | 7.8439375 1325 | 7.8485625 1326 | 7.85325 1327 | 7.8579375 1328 | 7.862625 1329 | 7.8673125 1330 | 7.8720625 1331 | 7.876875 1332 | 7.882125 1333 | 7.9965 1334 | 8.0006875 1335 | 8.004375 1336 | 8.008 1337 | 8.012125 1338 | 8.0164375 1339 | 8.0208125 1340 | 8.025375 1341 | 8.0300625 1342 | 8.035 1343 | 8.0399375 1344 | 8.0449375 1345 | 8.0500625 1346 | 8.0553125 1347 | 8.0605625 1348 | 8.0660625 1349 | 8.0715 1350 | 8.0770625 1351 | 8.082625 1352 | 8.0881875 1353 | 8.0938125 1354 | 8.0994375 1355 | 8.105 1356 | 8.110625 1357 | 8.1161875 1358 | 8.121875 1359 | 8.1285 1360 | 8.2033125 1361 | 8.206125 1362 | 8.2095 1363 | 8.21275 1364 | 8.216125 1365 | 8.2195625 1366 | 8.22275 1367 | 8.225875 1368 | 8.229 1369 | 8.2321875 1370 | 8.2353125 1371 | 8.2384375 1372 | 
8.2415 1373 | 8.2445625 1374 | 8.2475625 1375 | 8.250625 1376 | 8.2535625 1377 | 8.2565 1378 | 8.2595 1379 | 8.2624375 1380 | 8.265375 1381 | 8.2681875 1382 | 8.271125 1383 | 8.274 1384 | 8.2769375 1385 | 8.27975 1386 | 8.2825625 1387 | 8.2854375 1388 | 8.2881875 1389 | 8.291 1390 | 8.293875 1391 | 8.29675 1392 | 8.2995 1393 | 8.30225 1394 | 8.305 1395 | 8.307625 1396 | 8.310375 1397 | 8.313125 1398 | 8.316 1399 | 8.3188125 1400 | 8.3215625 1401 | 8.324375 1402 | 8.3270625 1403 | 8.329875 1404 | 8.332625 1405 | 8.3354375 1406 | 8.3381875 1407 | 8.3409375 1408 | 8.343625 1409 | 8.3463125 1410 | 8.3490625 1411 | 8.35175 1412 | 8.3544375 1413 | 8.3571875 1414 | 8.36 1415 | 8.362625 1416 | 8.365375 1417 | 8.368125 1418 | 8.37075 1419 | 8.373375 1420 | 8.3760625 1421 | 8.37875 1422 | 8.381375 1423 | 8.3839375 1424 | 8.3865625 1425 | 8.3895 1426 | 8.3925625 1427 | 8.395875 1428 | 8.3995625 1429 | 8.4035 1430 | 8.4075 1431 | 8.41125 1432 | 8.4145625 1433 | 8.4174375 1434 | 8.42025 1435 | 8.422875 1436 | 8.4254375 1437 | 8.428 1438 | 8.430625 1439 | 8.4331875 1440 | 8.43575 1441 | 8.4384375 1442 | 8.441125 1443 | 8.44375 1444 | 8.4464375 1445 | 8.449125 1446 | 8.451875 1447 | 8.4545625 1448 | 8.45725 1449 | 8.4600625 1450 | 8.462875 1451 | 8.465625 1452 | 8.468375 1453 | 8.4711875 1454 | 8.474 1455 | 8.476875 1456 | 8.479875 1457 | 8.482875 1458 | 8.4859375 1459 | 8.489 1460 | 8.4921875 1461 | 8.4955625 1462 | 8.498875 1463 | 8.502375 1464 | 8.5058125 1465 | 8.509375 1466 | 8.5129375 1467 | 8.516625 1468 | 8.520375 1469 | 8.5241875 1470 | 8.5280625 1471 | 8.5320625 1472 | 8.5360625 1473 | 8.5403125 1474 | 8.544625 1475 | 8.548875 1476 | 8.553125 1477 | 8.55775 1478 | 8.5623125 1479 | 8.5671875 1480 | 8.5718125 1481 | 8.5765625 1482 | 8.581375 1483 | 8.5861875 1484 | 8.5913125 1485 | 8.5966875 1486 | 8.6030625 1487 | 8.6105 1488 | 8.61875 1489 | 8.626875 1490 | 8.6345 1491 | 8.64025 1492 | 8.6455625 1493 | 8.651875 1494 | 8.658625 1495 | 8.66525 1496 | 8.6701875 1497 | 
8.67575 1498 | 8.6814375 1499 | 8.6873125 1500 | 8.6933125 1501 | 8.69925 1502 | 8.70525 1503 | 8.711125 1504 | 8.717125 1505 | 8.7230625 1506 | 8.7291875 1507 | 8.73525 1508 | 8.7414375 1509 | 8.7474375 1510 | 8.75325 1511 | 8.7585 1512 | 8.7635 1513 | 8.768375 1514 | 8.7739375 1515 | 8.7805 1516 | 8.78775 1517 | 8.7968125 1518 | 8.8090625 1519 | 8.8509375 1520 | 8.857875 1521 | 8.8645625 1522 | 8.8694375 1523 | 8.874125 1524 | 8.87875 1525 | 8.8835 1526 | 8.88825 1527 | 8.89325 1528 | 8.89825 1529 | 8.9034375 1530 | 8.9088125 1531 | 8.9141875 1532 | 8.919 1533 | 8.9238125 1534 | 8.928625 1535 | 8.9334375 1536 | 8.9384375 1537 | 8.9435 1538 | 8.948625 1539 | 8.953875 1540 | 8.959125 1541 | 8.9645625 1542 | 8.9700625 1543 | 8.975625 1544 | 8.98125 1545 | 8.987125 1546 | 8.993 1547 | 8.9990625 1548 | 9.0050625 1549 | 9.011125 1550 | 9.01725 1551 | 9.023375 1552 | 9.0295625 1553 | 9.0356875 1554 | 9.0420625 1555 | 9.0485625 1556 | 9.054875 1557 | 9.1795625 1558 | 9.1821875 1559 | 9.185125 1560 | 9.18775 1561 | 9.1904375 1562 | 9.193125 1563 | 9.1958125 1564 | 9.1985625 1565 | 9.2013125 1566 | 9.204125 1567 | 9.207 1568 | 9.209875 1569 | 9.2128125 1570 | 9.2158125 1571 | 9.2188125 1572 | 9.2219375 1573 | 9.225125 1574 | 9.228375 1575 | 9.231625 1576 | 9.235 1577 | 9.2384375 1578 | 9.241875 1579 | 9.245375 1580 | 9.2489375 1581 | 9.2525625 1582 | 9.25625 1583 | 9.2599375 1584 | 9.2636875 1585 | 9.267625 1586 | 9.272 1587 | 9.3506875 1588 | 9.3546875 1589 | 9.3585625 1590 | 9.3626875 1591 | 9.367125 1592 | 9.3715625 1593 | 9.376 1594 | 9.3805625 1595 | 9.3853125 1596 | 9.390375 1597 | 9.3955 1598 | 9.400875 1599 | 9.4065 1600 | 9.4125 1601 | 9.418625 1602 | 9.42525 1603 | 9.4323125 1604 | 9.439875 1605 | 9.448875 1606 | 9.459375 1607 | 9.469125 1608 | 9.4761875 1609 | 9.482875 1610 | 9.489 1611 | 9.496 1612 | 9.5818125 1613 | 9.5855625 1614 | 9.5895625 1615 | 9.5935625 1616 | 9.5979375 1617 | 9.602375 1618 | 9.6069375 1619 | 9.6114375 1620 | 9.6160625 1621 | 9.6206875 
1622 | 9.6254375 1623 | 9.6303125 1624 | 9.63525 1625 | 9.64025 1626 | 9.645375 1627 | 9.6505625 1628 | 9.6558125 1629 | 9.66125 1630 | 9.66675 1631 | 9.672375 1632 | 9.6781875 1633 | 9.6845 1634 | 9.691125 1635 | 9.7004375 1636 | 9.707125 1637 | 9.713875 1638 | -------------------------------------------------------------------------------- /sfc/sfc_corpus.py: -------------------------------------------------------------------------------- 1 | # -*- coding: utf-8 -*- 2 | """ 3 | PySFC - corpus related utility functions. 4 | 5 | @authors: 6 | Branislav Gerazov Nov 2017 7 | 8 | Copyright 2017 by GIPSA-lab, Grenoble INP, Grenoble, France. 9 | 10 | See the file LICENSE for the licence associated with this software. 11 | """ 12 | import numpy as np 13 | import pandas as pd 14 | import logging 15 | from natsort import natsorted 16 | import os 17 | import re 18 | import sys 19 | from sfc import sfc_data 20 | 21 | #%% 22 | def build_corpus(params): 23 | ''' 24 | Build the corpus from all files in datafolder. All parameters are 25 | passed through the params object. 26 | 27 | Parameters 28 | ---------- 29 | datafolder : str 30 | Folder where the input data files are located. 31 | file_type : str 32 | Whether to use fpros or TextGrids to read the data. 33 | 34 | phrase_types : list 35 | Attitude functions found in database. 36 | 37 | function_types : list 38 | Functions other than attitudes found in database. 39 | 40 | end_marks : list 41 | Functions that end the scope, i.e. they have only left context. 42 | 43 | database : str 44 | Which database we are using - sets a number of predefined parameters. 45 | 46 | vowel_marks : list 47 | Points where to sample vocalic nuclei. 48 | 49 | use_ipcgs : bool 50 | Use IPCGs or regular syllables as rhythmic units. Morlec uses IPCGs, 51 | chinese uses syllables. 
52 | 53 | show_plot : bool 54 | Whether to keep the plots open of the f0_ref, dur_stat and syll_stat if it is 55 | necessary to extract them when file_type is TextGrid. 56 | 57 | save_path : str 58 | Save path for the plotted figures. 59 | ''' 60 | log = logging.getLogger('build_corpus') 61 | 62 | #%% load variables from params 63 | datafolder = params.datafolder 64 | phrase_types = params.phrase_types 65 | function_types = params.function_types 66 | end_marks = params.end_marks 67 | database = params.database 68 | file_type = params.file_type 69 | columns = params.columns 70 | 71 | re_folder = params.re_folder 72 | re_vowels = params.re_vowels 73 | use_ipcgs = params.use_ipcgs 74 | f0_ref = params.f0_ref 75 | isochrony_clock = params.isochrony_clock 76 | 77 | #%% read filenames 78 | filenames = natsorted([f for f in os.listdir(datafolder) if re_folder.match(f)]) 79 | #%% build a pandas corpus 80 | # filenames = ['DC_140.TextGrid'] # errors FF - FF 81 | # filenames = ['EX_265.TextGrid'] # errors FF - FF 82 | # filenames = ['chinese_000002.TextGrid'] # 83 | # filenames = ['chinese_001004.TextGrid'] # QS 84 | # filenames = ['chinese_005949.TextGrid'] # no tones? 85 | 86 | corpus = pd.DataFrame(columns=columns) 87 | re_fpro = params.re_fpro 88 | 89 | #%% main loop 90 | phone_set = [] 91 | utterances = {} 92 | phone_duration_means=None 93 | for file_cnt, barename in enumerate(filenames): 94 | log.info('Reading file {}'.format(barename)) 95 | filename = datafolder + barename 96 | bare = barename.split('.')[0] 97 | if file_type == 'fpro': 98 | try: 99 | fpro_lists = sfc_data.read_fpro(filename, re_fpro, database) # read fpro 100 | except: 101 | log.error(sys.exc_info()[0]) 102 | log.error('{} read error!'.format(barename)) 103 | continue 104 | fpro_stats, f0s, enrs, units, durunits, durcoeffs, tpss, dursegs, phones, sylls, poss, \ 105 | orthographs, phrase, levels = fpro_lists # detuple - TODO make pretier! 
106 | levels = levels.T 107 | 108 | elif file_type == 'TextGrid': # do it from a TextGrid 109 | try: 110 | fpro_lists, phone_duration_means, f0_ref, isochrony_clock = sfc_data.read_textgrid(barename, params, 111 | phone_duration_means=phone_duration_means, 112 | f0_ref=f0_ref, 113 | isochrony_clock=isochrony_clock) 114 | except: 115 | log.error(sys.exc_info()[0]) 116 | log.error('{} read error!'.format(barename)) 117 | continue 118 | 119 | fpro_stats, f0s, units, durunits, durcoeffs, dursegs, \ 120 | phones, orthographs, phrase, levels = fpro_lists 121 | 122 | phone_set = phone_set + phones.tolist() 123 | # get the utterance text 124 | utterance = orthographs[orthographs != '*'].tolist() 125 | utterance = [' '+s if s not in ['.',',','?','!'] else s for s in utterance] 126 | utterance = ''.join(utterance)[1:] 127 | utterances[bare] = utterance 128 | if fpro_stats is not None: # might have a bad header 129 | f0_ref, isochrony_clock, r_clock, disol, stats = fpro_stats 130 | #%% get the rhythmic units (rus) and order them 131 | # we can't use np.unique for this 132 | rus = [] 133 | ru_f0s = [] 134 | ru_coefs = [] 135 | ru_map = np.zeros(phones[1:-1].size, dtype=int) 136 | cnt = -1 # count of unit, skip first silence 137 | unit = '' 138 | for i, phone in enumerate(phones[1:-1]): 139 | # run through the phones and detect vowels to accumulate the rus 140 | if len(unit) == 0: # at start of loop and after deleting unit 141 | cnt += 1 142 | unit = units[1:-1][i] 143 | rus.append(unit) 144 | unit = unit[len(phone):] # delete phone from unit 145 | else: 146 | assert unit.startswith(phone), \ 147 | log.error("{} - unit doesn't start with phone!".format(barename)) 148 | unit = unit[len(phone):] # delete phone from unit 149 | if re_vowels.match(phone): 150 | ru_f0s.append(f0s[1:-1,:][i,:]) 151 | ru_coefs.append(durcoeffs[1:-1][i]) 152 | ru_map[i] = cnt 153 | 154 | ru_f0s = np.array(ru_f0s) 155 | # if there are nans change them to 0s, so the learning doesn't crash: 156 | if 
np.any(np.isnan(ru_f0s)): 157 | log.warning('No pitch marks for unit {} in {}!'.format( 158 | rus[np.where(np.any(np.isnan(ru_f0s), axis=1))[0][0]], barename)) 159 | ru_f0s[np.isnan(ru_f0s)] = 0 160 | 161 | ru_coefs = np.array(ru_coefs) 162 | rus = np.array(rus) 163 | assert ru_coefs.size == rus.size, \ 164 | log.error('{} - error ru_coefs.size != rus.size'.format(barename)) 165 | 166 | #%% construct phrase contours 167 | if 'FF' not in phrase[1]: 168 | log.error('{} Start of phrase is not in first segment - skipping!'.format(barename)) 169 | continue 170 | # this covers phr:FF too 171 | if phrase[-1] == '*': 172 | log.error('{} End of phrase is not in final segment - skipping!'.format(barename)) 173 | continue 174 | 175 | # target vector for the phrase contour 176 | phrase_type = phrase[-1].strip(':') 177 | phrase_targets = np.c_[ru_f0s, ru_coefs] 178 | # generate input ramps for phrase 179 | ramp_global = np.arange(rus.size)[::-1] 180 | ramp_local = np.c_[ramp_global, 181 | np.linspace(10, 0, num=rus.size), 182 | ramp_global[::-1]] 183 | phrase_ramps = np.c_[ramp_local, ramp_global].T 184 | 185 | 186 | #mask = np.arange(rus.size) # the scope of the contour 187 | for unit, l in enumerate(np.c_[phrase_ramps.T, phrase_targets]): 188 | row = pd.Series([barename, phrase_type, phrase_type, rus[unit], unit] + l.tolist(), 189 | columns) 190 | corpus = corpus.append([row], ignore_index=True) 191 | 192 | #%% do the rest of the contours 193 | landmarks = phrase_types + function_types 194 | if levels.size > 0: 195 | for level in levels[:,1:]: # without the initial silence 196 | #%% debug 197 | # level = levels[0,1:] 198 | mask = [] 199 | iscontour = False 200 | found_landmark = False 201 | for cnt, entry in enumerate(level.tolist()): 202 | #%% debug 203 | # cnt = 4 204 | # entry = level[cnt] 205 | entry = entry.strip(':') 206 | entry = entry[:2] 207 | if not iscontour: 208 | if 'FF' in entry: # contour start 209 | # if it's the first IPCG or it's one with a vowel we add the mask 210 | # this 
is to solve IPCG-Syll overlap issues: 211 | # wordlevel: game DG kupe, but IPCG level: gam (e DG k) up e 212 | # i.e. where is ek? it is in the left unit but not the right! 213 | # this is because of the mapping function we use 214 | # between segments and units (IPCGs) 215 | if use_ipcgs: 216 | if cnt == 0 or re_vowels.match(phones[cnt+1]): #+1 because we skip silence 217 | mask.append(cnt) 218 | else: # use sylls - easier 219 | mask.append(cnt) 220 | 221 | iscontour = True 222 | 223 | if entry in landmarks: 224 | log.warning('{} landmark {} starts contour - skipping!'.format(barename, entry)) 225 | 226 | elif iscontour: 227 | if entry == '*': # normal * 228 | mask.append(cnt) 229 | 230 | elif (entry in end_marks and \ 231 | not found_landmark) or \ 232 | (entry in landmarks and \ 233 | not found_landmark and \ 234 | entry not in end_marks and \ 235 | cnt == len(level.tolist())-1): # end of contour, 236 | # Cs can also end contours if at the end: 237 | # :C4 :FF * :C2 238 | contour_type = entry 239 | 240 | if use_ipcgs: 241 | if cnt < level.size - 1 and re_vowels.match(phones[cnt+1]): 242 | # the segment with FF is not part of scope if its the end silence 243 | # and if it is not a vowel (from the word that follows) 244 | mask.append(cnt) 245 | landmark_ind = cnt 246 | else: 247 | landmark_ind = cnt - 1 248 | else: 249 | landmark_ind = cnt - 1 250 | 251 | corpus = append_contour_to_corpus(corpus, columns, barename, phrase_type, 252 | contour_type, rus, mask, landmark_ind, 253 | ru_map, ru_f0s, ru_coefs) 254 | if use_ipcgs: 255 | if re_vowels.match(phones[cnt+1]): 256 | # the segment with FF is not part of scope if its not a vowel 257 | mask = [cnt] 258 | else: 259 | mask = [] 260 | 261 | else: # when using syllables it's part of the next scope 262 | mask = [cnt] 263 | 264 | found_landmark = False # you cannot daisy chain XX 265 | 266 | elif entry in landmarks and \ 267 | not found_landmark and \ 268 | entry not in end_marks: # if first landmark 269 | 
found_landmark = True 270 | contour_type = entry 271 | 272 | if use_ipcgs: 273 | if cnt < level.size - 1 and re_vowels.match(phones[cnt+1]): 274 | # the segment with FF is not part of scope if its the end silence 275 | # and if it is not a vowel (from the word that follows) 276 | landmark_ind = cnt 277 | else: 278 | landmark_ind = cnt - 1 279 | else: 280 | landmark_ind = cnt - 1 281 | 282 | mask.append(cnt) 283 | 284 | 285 | elif 'FF' in entry: # end of contour and maybe start of next one 286 | if use_ipcgs: 287 | if cnt < level.size - 1 and re_vowels.match(phones[cnt+1]): 288 | # the segment with FF is not part of scope if its the end silence 289 | # and if it is not a vowel (from the word that follows) 290 | mask.append(cnt) 291 | # else: # it's not part of the scope 292 | 293 | if not found_landmark: 294 | log.warning('{} no landmark in level {} - skipping!'.format( 295 | barename, level[cnt-3:cnt+4])) 296 | # print(cnt, entry, level[cnt-3:cnt+3]) 297 | # this will also find the case where FF is not the start of the next contour 298 | # e.g. FF * * * DD * * FF * * FF * * * XX * FF 299 | else: 300 | corpus = append_contour_to_corpus(corpus, columns, barename, phrase_type, 301 | contour_type, rus, mask, landmark_ind, 302 | ru_map, ru_f0s, ru_coefs) 303 | 304 | # it might be the start of the next one like in DC_368.fpro! 305 | # e.g. 
FF * * * DD * * * FF * * * XX * FF 306 | found_landmark = False 307 | 308 | if use_ipcgs: 309 | if re_vowels.match(phones[cnt+1]): 310 | # the segment with FF is not part of scope if its not a vowel 311 | mask = [cnt] 312 | else: 313 | mask = [] 314 | 315 | else: # when using syllables it's part of the next scope 316 | mask = [cnt] 317 | 318 | elif entry in landmarks and found_landmark: # end of contour 319 | # if not first landmark - ie it's the start of another contour 320 | # process previous contour 321 | if use_ipcgs: 322 | if re_vowels.match(phones[cnt+1]): #+1 because we skip silence 323 | # also if its a vowel its from the next scope 324 | mask.append(cnt) # well apparently it is, except the final silence 325 | # else: # don't append if it's syllables 326 | 327 | corpus = append_contour_to_corpus(corpus, columns, barename, phrase_type, 328 | contour_type, rus, mask, landmark_ind, 329 | ru_map, ru_f0s, ru_coefs) 330 | 331 | # update mask to new contour 332 | # keep the mask starting from previous landmark 333 | # keep landmark if its on vowel 334 | if use_ipcgs: 335 | if re_vowels.match(phones[landmark_ind+1]): #+1 because we skip silence 336 | mask = [i for i in mask if i >= landmark_ind] 337 | else: 338 | mask = [i for i in mask if i > landmark_ind] 339 | else: # for sylls keep after the landmark 340 | mask = [i for i in mask if i > landmark_ind] 341 | 342 | contour_type = entry # new contour's type 343 | if use_ipcgs: 344 | landmark_ind = cnt 345 | else: 346 | landmark_ind = cnt - 1 347 | 348 | mask.append(cnt) 349 | 350 | else: 351 | log.warning('{} unknown landmark {} in {} - skipping!'.format(barename, entry, level)) 352 | break 353 | #%% end test 354 | phone_set = np.array(phone_set) 355 | phone_set, phone_cnts = np.unique(phone_set, return_counts=True) 356 | 357 | return fpro_stats, corpus, utterances, phone_set, phone_cnts 358 | 359 | def create_masks(corpus, phrase_type, contour_keys, params): 360 | ''' 361 | Create masks to address the data in the 
corpus. 362 | 363 | Parameters 364 | ========== 365 | corpus : pandas DataFrame 366 | Holds all data from corpus. 367 | phrase_type : str 368 | Phrase type for which to build the masks. 369 | contour_keys : list 370 | Types of functions used for contour generators. 371 | 372 | params 373 | ====== 374 | good_files_only : bool 375 | Whether to use only a subset of the files. 376 | good_files : list 377 | The subset of good files (for morlec, indexed by phrase type). 378 | database : str 379 | Name of database. 380 | ''' 381 | #%% 382 | good_files_only = params.good_files_only 383 | good_files = params.good_files 384 | database = params.database 385 | # init masks 386 | mask_phrase = corpus['phrasetype'] == phrase_type 387 | # files = natsorted(np.unique(corpus[mask_phrase]['file'].values)) 388 | files = natsorted(corpus[mask_phrase]['file'].unique()) 389 | mask_file_dict = {} 390 | n_units_dict = {} 391 | mask_unit_dict = {} 392 | re_file_nr = re.compile(r'.*_(\d*).*') 393 | mask_contours = {} 394 | for contour_type in contour_keys: 395 | mask_contours[contour_type] = corpus['contourtype'] == contour_type 396 | # make a mask for all files for training the contours 397 | if good_files_only: 398 | mask_all_files = mask_phrase & False # start all-False; the good files are OR-ed in below 399 | else: 400 | mask_all_files = mask_phrase 401 | 402 | for file in files: 403 | file_nr = re_file_nr.match(file).groups()[0] 404 | if good_files_only: 405 | # print(file_nr) 406 | if database == 'morlec': 407 | good_nrs = good_files[phrase_type] 408 | else: 409 | good_nrs = good_files 410 | 411 | if int(file_nr) in good_nrs: 412 | mask_file = corpus['file'] == file 413 | mask_file_dict[file] = mask_file 414 | mask_all_files = mask_all_files | mask_file 415 | n_units = np.max(corpus.loc[mask_file, 'n_unit'].values) 416 | n_units_dict[file] = n_units 417 | for n_unit in range(n_units+1): 418 | mask_unit = corpus['n_unit'] == n_unit 419 | mask_unit_dict[n_unit] = mask_unit 420 | 421 | else: # all files 422 | mask_file 
= corpus['file'] == file 423 | mask_file_dict[file] = mask_file 424 | n_units = np.max(corpus.loc[mask_file, 'n_unit'].values) 425 | n_units_dict[file] = n_units 426 | for n_unit in range(n_units+1): 427 | mask_unit = corpus['n_unit'] == n_unit 428 | mask_unit_dict[n_unit] = mask_unit 429 | 430 | if good_files_only: 431 | files = list(mask_file_dict.keys()) 432 | #%% 433 | return files, mask_all_files, mask_file_dict, \ 434 | mask_contours, n_units_dict, mask_unit_dict 435 | 436 | 437 | def downcast_corpus(corpus, columns): 438 | ''' 439 | Take care of dtypes and downcast. 440 | 441 | Parameters 442 | ========== 443 | corpus : pandas DataFrame 444 | Holds all the data. 445 | columns : list 446 | The columns of the DataFrame. 447 | ''' 448 | log = logging.getLogger('down_corpus') 449 | log.info('Converting columns to numeric ...') # TODO why some n_units are strings?? 450 | start_colum = columns.index('f00') 451 | corpus[columns[start_colum:]] = corpus[columns[start_colum:]].apply(pd.to_numeric, downcast='float') 452 | # for column in ['n_unit','ramp1','ramp3','ramp4']: 453 | # corpus[column] = pd.to_numeric(corpus[column], downcast='unsigned') 454 | corpus[['n_unit','ramp1','ramp3','ramp4']] = \ 455 | corpus[['n_unit','ramp1','ramp3','ramp4']].apply(pd.to_numeric, 456 | downcast='unsigned') 457 | # for column in ['ramp2']: 458 | # corpus[column] = pd.to_numeric(corpus[column], downcast='float') 459 | corpus['ramp2'] = corpus['ramp2'].apply(pd.to_numeric, downcast='float') 460 | 461 | return corpus 462 | 463 | def expand_corpus(corpus, params): 464 | ''' 465 | Take care of scale and expand corpus. 466 | 467 | Parameters 468 | ========== 469 | corpus : pandas DataFrame 470 | Holds all input features and all predictions by the contour generators. 471 | 472 | params 473 | ====== 474 | columns : list 475 | Columns of corpus DataFrame. 476 | orig_columns : list 477 | Columns holding original f0 and dur_coeff in corpus DataFrame. 
478 | target_columns : list 479 | Columns holding the f0 and dur_coeff targets used to train the contour 480 | generators. 481 | iterations : int 482 | Number of iterations to run the analysis-by-synthesis loop. 483 | f0_scale : float 484 | Scaling factor for the f0s to downscale them near to the dur_coeffs. 485 | dur_scale : float 486 | Scaling factor for the dur_coeffs to upscale them near to the f0s. 487 | ''' 488 | log = logging.getLogger('exp_corpus') 489 | 490 | orig_columns = params.orig_columns 491 | target_columns = params.target_columns 492 | iterations = params.iterations 493 | f0_scale = params.f0_scale 494 | dur_scale = params.dur_scale 495 | 496 | log.info('Applying scaling to columns ...') 497 | corpus.loc[:,'dur'] = corpus.loc[:,'dur'] * dur_scale 498 | # corpus.loc[:,'f01':'f03'] = corpus.loc[:,'f01':'f03'] * f0_scale 499 | corpus.loc[:,orig_columns[0]:orig_columns[-2]] = corpus.loc[:,orig_columns[0]: 500 | orig_columns[-2]] * f0_scale 501 | 502 | #%% expand corpus with extra columns 503 | log.info('Expanding initial columns ...') 504 | # for column in target_columns: 505 | # corpus.loc[:, column] = np.NaN # doesn't work on a list of labels 506 | new_columns = target_columns.copy() 507 | for i in range(iterations): 508 | pred_columns = [column + '_it{:03}'.format(i) for column in orig_columns] 509 | new_columns += pred_columns 510 | # for column in pred_columns: # doesn't work on a list of labels 511 | # corpus.loc[:, column] = np.NaN 512 | 513 | # one-liner: 514 | corpus = pd.concat([corpus, 515 | pd.DataFrame(np.nan, index=corpus.index, 516 | columns=new_columns)], 517 | axis=1) 518 | 519 | return corpus 520 | 521 | #%% 522 | def append_contour_to_corpus(corpus, columns, barename, phrase_type, 523 | contour_type, rus, mask, landmark_ind, 524 | ru_map, ru_f0s, ru_coefs): #, debug=False): 525 | """ 526 | Process contour and append to corpus. 527 | 528 | Parameters 529 | ========== 530 | corpus : pandas DataFrame 531 | Holds all the data. 
532 | columns : list 533 | The columns of the DataFrame. 534 | barename : str 535 | Name of file. 536 | phrase_type : str 537 | Type of phrase component in file. 538 | contour_type : str 539 | Type of contour to add. 540 | rus : ndarray 541 | Rhythmic units, e.g. IPCGs or syllables, in the utterance. 542 | mask : list 543 | Mask of unit number in the utterance for each unit in the contour. 544 | landmark_ind : int 545 | Position of the function type designator. 546 | ru_map : ndarray 547 | Index into rus of the unit that each phone in the utterance belongs to. 548 | ru_f0s : ndarray 549 | f0s for each of the units in rus. 550 | ru_coefs : ndarray 551 | dur_coeffs for each unit in rus. 552 | """ 553 | log = logging.getLogger('append2corpus') 554 | # find targets for contour 555 | mask_rus = np.array(np.unique(ru_map[mask]), dtype=int) # map to the rus 556 | # TODO cast int to int because some are string?? 557 | contour_targets = np.c_[ru_f0s[mask_rus], 558 | ru_coefs[mask_rus]] 559 | # generate ramps for contour 560 | contour_scope = contour_targets.shape[0] 561 | ramp_global = np.arange(contour_scope)[::-1] 562 | 563 | landmark_unit = ru_map[landmark_ind] - mask_rus[0] # relative position of landmark 564 | unit_scope_1 = np.arange(landmark_unit+1) # including landmark unit 565 | unit_scope_2 = np.arange(contour_scope - landmark_unit - 1) 566 | ramp_local = np.c_[np.r_[unit_scope_1[::-1], unit_scope_2[::-1]], 567 | np.r_[np.linspace(0, 10, num=unit_scope_1.size)[::-1], 568 | # if there is 1 unit this will favor the 0 569 | np.linspace(0, 10, num=unit_scope_2.size)], 570 | np.r_[unit_scope_1, unit_scope_2]] 571 | 572 | contour_ramps = np.c_[ramp_local, ramp_global].T 573 | 574 | # debug output; extra values are passed as logging format arguments 575 | log.debug('-'*42) 576 | log.debug('%s %s %s', contour_type, mask, landmark_ind) 577 | log.debug(contour_targets) 578 | log.debug(contour_ramps) 579 | log.debug('-'*42 + '\n') 580 | 581 | # add to corpus 582 | for l in np.c_[rus[mask_rus], mask_rus, contour_ramps.T, contour_targets]: 583 | row 
= pd.Series([barename, phrase_type, contour_type] + l.tolist(), 584 | columns) 585 | corpus = corpus.append([row], ignore_index=True) 586 | # assert(contour_type!='DD') # just break it : ) 587 | return corpus 588 | #%% 589 | def contour_scope_count(corpus, phrase_type, max_scope=20): 590 | ''' 591 | Count all the scope contexts of a contour. 592 | 593 | Parameters 594 | ========== 595 | corpus : pandas DataFrame 596 | Holds all the data. 597 | phrase_type : str 598 | Phrase contour subset within which to count. 599 | max_scope : int 600 | Maximum scope to take into account. 601 | ''' 602 | #%% 603 | log = logging.getLogger('scope_count') 604 | # phrase_type = 'DC' 605 | corpus_phrase = corpus[corpus['phrasetype'] == phrase_type] 606 | corpus_phrase = corpus_phrase.reset_index() 607 | # contour_types = np.unique(corpus_phrase['contourtype']) 608 | contour_types = corpus_phrase['contourtype'].unique() 609 | scope_counts = {} 610 | for contour_type in contour_types: 611 | scope_counts[contour_type] = np.zeros((max_scope,max_scope),dtype=int) 612 | contour_ends = corpus_phrase['ramp4'] == 0 613 | contour_ends = contour_ends.index[contour_ends].tolist() 614 | for i, contour_end in enumerate(contour_ends): 615 | contour_type = corpus_phrase.loc[contour_end]['contourtype'] 616 | if i == 0: 617 | contour_start = 0 618 | else: 619 | contour_start = contour_ends[i-1] + 1 620 | scope_tot = int(corpus_phrase.loc[contour_start]['ramp4']) + 1 621 | scope_left = int(corpus_phrase.loc[contour_start]['ramp1']) + 1 622 | cnt_left = 0 623 | while corpus_phrase.loc[contour_start + cnt_left]['ramp1'] != 0: # we're looking for the end 624 | cnt_left += 1 625 | 626 | if cnt_left+1 <= scope_tot-1: # if not the end of total scope 627 | scope_right = int(corpus_phrase.loc[contour_start + cnt_left+1]['ramp1']) + 1 628 | else: 629 | scope_right = 0 630 | 631 | assert (scope_tot == scope_left + scope_right) , log.error("Scope length doesn't match!") 632 | 
scope_counts[contour_type][scope_left, scope_right] = \ 633 | scope_counts[contour_type][scope_left, scope_right] + 1 634 | #%% 635 | return scope_counts -------------------------------------------------------------------------------- /sfc/sfc_plot.py: -------------------------------------------------------------------------------- 1 | #!/usr/bin/env python3 2 | # -*- coding: utf-8 -*- 3 | """ 4 | PySFC - plotting functions. 5 | 6 | @authors: 7 | Branislav Gerazov Nov 2017 8 | 9 | Copyright 2017 by GIPSA-lab, Grenoble INP, Grenoble, France. 10 | 11 | See the file LICENSE for the licence associated with this software. 12 | """ 13 | import numpy as np 14 | import pandas as pd 15 | import logging 16 | from matplotlib import pyplot as plt 17 | import seaborn as sns 18 | import os 19 | import glob 20 | import shutil 21 | #%% 22 | def init_colors(params): 23 | ''' 24 | Init color palette for plots. 25 | 26 | Parameters 27 | ========== 28 | phrase_types : list 29 | Phrase types to reserve colors for. 30 | function_types : list 31 | Function types to reserve colors for. 32 | ''' 33 | phrase_types = params.phrase_types 34 | function_types = params.function_types 35 | 36 | plt.style.use('default') 37 | color_phrase = 'C1' 38 | colors = {'orig' : 'C7', 39 | 'recon' : 'C0'} 40 | for phrase_type in phrase_types: 41 | colors[phrase_type] = color_phrase 42 | 43 | for i, function_type in enumerate(function_types): 44 | colors[function_type] = 'C' + str(i+2) 45 | # 'DC' : color_phrase, 'QS' : color_phrase, 46 | # 'DG' : 'C2', 47 | # 'DD' : 'C3', 48 | # 'EM' : 'C4', 49 | # 'XX' : 'C5', 50 | # 'DV' : 'C6', 51 | # 'ID' : 'C8', 52 | # 'IT' : 'C9'} 53 | 54 | return colors 55 | 56 | #%% 57 | def plot_histograms(data_all, data_mean, data_median, data_kde, save_path, plot_type=None, show_plot=False): 58 | ''' 59 | Plot the histograms used for obtaining f0_ref and the duration isochrony clock. 60 | 61 | Parameters 62 | ========== 63 | data_all : DataSeries 64 | Holds all samples from the distribution. 
65 | data_mean : float 66 | Mean of all the data. 67 | data_median : float 68 | Median of all the data. 69 | data_kde : float 70 | Kernel Density Estimation peak of the data. 71 | save_path : str 72 | Save path. 73 | plot_type : str 74 | Which stats to plot: 'f0', 'phone' or 'syll'. 75 | show_plot : bool 76 | Whether to keep the plot open. 77 | ''' 78 | log = logging.getLogger('plt_hist') 79 | fig = plt.figure() 80 | sns.set(color_codes=True, style='ticks') 81 | ax = sns.distplot(data_all) 82 | 83 | kde_x, kde_y = ax.get_lines()[0].get_data() 84 | f0_kde = kde_x[np.argmax(kde_y)] 85 | 86 | ax.axvline(data_mean, c='C1', lw=2, alpha=.7, label='mean') 87 | ax.axvline(data_median, c='C2', lw=2, alpha=.7, label='median') 88 | ax.axvline(f0_kde, c='C4', lw=2, alpha=.7, label='kde') 89 | ax.legend() 90 | if plot_type == 'f0': 91 | plt.title('F0 histogram\nmean = {} Hz, median = {} Hz, kde = {} Hz'.format( 92 | int(data_mean), int(data_median), int(data_kde))) 93 | plt.xlabel('Frequency [Hz]') 94 | elif plot_type == 'phone': 95 | plt.title('Phone duration histogram\nmean = {} ms, median = {} ms, kde = {} ms'.format( 96 | int(data_mean*1e3), int(data_median*1e3), int(data_kde*1e3))) 97 | plt.xlabel('Duration [s]') 98 | elif plot_type == 'syll': 99 | plt.title('Syllable duration histogram\nmean = {} ms, median = {} ms, kde = {} ms'.format( 100 | int(data_mean*1e3), int(data_median*1e3), int(data_kde*1e3))) 101 | plt.xlabel('Duration [s]') 102 | 103 | plt.ylabel('Density') 104 | filename = '{}/histogram_{}_stats.png'.format(save_path, plot_type) 105 | plt.savefig(filename, dpi='figure') 106 | log.info('Histograms saved in {}'.format(filename)) 107 | fig.tight_layout() 108 | 109 | if not show_plot: 110 | plt.close(fig) 111 | #%% 112 | def plot_contours(save_path, file, utterances, corpus, colors, params, plot_contour='f0', show_plot=False): 113 | ''' 114 | Plot SFC contours. 
115 | 116 | Parameters 117 | ---------- 118 | save_path : str 119 | file : str 120 | utterances : dict 121 | corpus : pandas DataFrame 122 | colors : dict 123 | scale : float 124 | plot_contour : str 125 | Which contour to plot. Available options: 'f0' and 'dur'. 126 | show_plot : bool 127 | Whether to keep plot open. 128 | 129 | params 130 | ====== 131 | iterations : int 132 | Number of iterations of analysis-by-synthesis loop. 133 | function_types : list 134 | List of available function types. 135 | learn_rate : float 136 | Learning rate of contour generators. 137 | max_iter : int 138 | Maximum number of iterations for the contour generators. 139 | orig_columns : list 140 | List of original columns. 141 | database : str 142 | Name of database. 143 | vowel_pts : int 144 | Number of samples taken from the vocalic nuclei. 145 | ''' 146 | log = logging.getLogger('plt_contour') 147 | plt.style.use('default') 148 | 149 | i = params.iterations -1 150 | function_types = params.function_types 151 | learn_rate = params.learn_rate 152 | max_iter = params.max_iter 153 | orig_columns = params.orig_columns 154 | database = params.database 155 | vowel_pts = params.vowel_pts 156 | if plot_contour == 'f0': 157 | scale = params.f0_scale 158 | else: 159 | scale = params.dur_scale 160 | #%% 161 | # file = 'DC_13.fpro' # test 162 | # plot_contour='dur' # test 163 | # scale = 200 164 | 165 | barename = file.split('.',1)[0] 166 | mask_file = corpus['file'] == file 167 | contour_type = corpus.loc[mask_file,'phrasetype'].iloc[0] 168 | mask_contour = corpus['contourtype'] == contour_type 169 | mask_row = mask_file & mask_contour 170 | rus = corpus.loc[mask_row, 'unit'].values 171 | f0_orig = corpus.loc[mask_row,'f00':'f0{}'.format(vowel_pts-1)].values 172 | f0_orig = f0_orig.ravel() # flatten to 1D 173 | dur_orig = corpus.loc[mask_row,'dur'].values 174 | n_rus = rus.size 175 | 176 | ## open fig and setup axis 177 | if database == 'french': 178 | fig = plt.figure(figsize=(10,8)) 179 | else:
180 | fig = plt.figure(figsize=(19,10)) 181 | ax1 = fig.add_subplot(111) 182 | plt.grid('on') 183 | ax1.set_xlabel('Rhythmic unit') 184 | ax2 = ax1.twiny() 185 | ax2.set_xticks(np.arange(n_rus)+1) 186 | ax2.set_xticklabels(rus) 187 | # ax2.set_xlabel('{}, learn_rate {}, iteration {}'.format(barename, learn_rate, i+1)) 188 | plt.title('{} : {}'.format(barename, utterances[barename].lower()), 189 | y=1.05) 190 | 191 | if plot_contour == 'f0': 192 | ax1.set_ylabel('Normalised f0') 193 | x_axis = np.arange(1-int(vowel_pts/2)/vowel_pts, n_rus + 1/2, 1/vowel_pts) 194 | ax1.plot(x_axis, f0_orig/scale, c=colors['orig'], marker='o', ms=3.5, lw=3, label='f0 target') 195 | # predicted values without the durcoeff 196 | pred_columns = [column + '_it{:03}'.format(i) for column in orig_columns[:-1]] 197 | 198 | elif plot_contour == 'dur': 199 | ax1.set_ylabel('Duration coefficient') 200 | x_axis = np.arange(1, n_rus + 1/2, 1) 201 | ax1.plot(x_axis, dur_orig/scale, c=colors['orig'], marker='o', ms=3.5, lw=3, label='dur target') 202 | pred_columns = 'dur_it{:03}'.format(i) 203 | 204 | else: 205 | raise ValueError('Contour unsupported!') 206 | 207 | contourtypes = corpus.loc[mask_file, 'contourtype'].values 208 | if not any([c in function_types for c in contourtypes.tolist()]): # only phrase component 209 | if plot_contour == 'f0': 210 | f0_pred = corpus.loc[mask_row, pred_columns].values 211 | f0_pred = f0_pred.ravel() 212 | ax1.plot(x_axis, f0_pred/scale, c=colors[contour_type], marker='o', ms=3.5, 213 | lw=3, alpha=.8, label='f0 recon') 214 | elif plot_contour == 'dur': 215 | dur_pred = corpus.loc[mask_row, pred_columns].values 216 | ax1.plot(x_axis, dur_pred/scale, c=colors[contour_type], marker='o', ms=3.5, 217 | lw=3, alpha=.8, label='dur recon') 218 | ylims = ax1.get_ylim() 219 | ax2.set_xlim(ax1.get_xlim()) 220 | else: # more contours 221 | n_units = corpus.loc[mask_file, 'n_unit'].values 222 | ramp4 = corpus.loc[mask_file, 'ramp4'].values # we use this to separate contours
223 | ramp2 = corpus.loc[mask_file, 'ramp2'].values # we use this to find landmarks 224 | 225 | ## go through ramp4 and decompose the contours 226 | end_inds = np.where(ramp4 == 0)[0] 227 | start_inds = np.r_[0, end_inds[:-1]+1] 228 | n_contours = end_inds.size 229 | contour_levels = np.empty(n_contours) 230 | contour_landmarks = np.empty(n_contours, dtype=int) # position of landmark in contour 231 | contour_labels = [] 232 | contour_cnt = 0 233 | 234 | if plot_contour == 'f0': 235 | f0_preds = corpus.loc[mask_file, pred_columns].values 236 | contour_array = np.empty((50, vowel_pts*n_rus)) # position of contours in plot, at least 2 levels 237 | contours = np.empty((n_contours, vowel_pts*n_rus)) # one contour per row 238 | 239 | elif plot_contour == 'dur': 240 | dur_preds = corpus.loc[mask_file, pred_columns].values 241 | contour_array = np.empty((50, n_rus)) # position of contours in plot, at least 2 levels 242 | contours = np.empty((n_contours, n_rus)) # one contour per row 243 | 244 | contour_array.fill(np.nan) 245 | contours.fill(np.nan) 246 | 247 | level = 0 248 | for start, end in zip(start_inds, end_inds): 249 | if plot_contour == 'f0': 250 | # check empty 251 | while not np.all(np.isnan(contour_array[level, 252 | n_units[start]*vowel_pts : (n_units[end]+1)*vowel_pts])): 253 | level += 1 254 | if level == contour_array.shape[0]: # array not big enough! 255 | log.error('{} not enough levels for contours - skipping!'.format(barename)) 256 | break 257 | 258 | # set f0s 259 | f0s_contour = f0_preds[start : end+1, :].ravel() 260 | contour_array[level, n_units[start]*vowel_pts : (n_units[end]+1)*vowel_pts] = f0s_contour 261 | contours[contour_cnt, n_units[start]*vowel_pts : (n_units[end]+1)*vowel_pts] = f0s_contour 262 | 263 | elif plot_contour == 'dur': 264 | # check empty 265 | while not np.all(np.isnan(contour_array[level, 266 | n_units[start] : n_units[end]+1])): 267 | level += 1 268 | if level == contour_array.shape[0]: # array not big enough! 
269 | log.error('{} not enough levels for contours - skipping!'.format(barename)) 270 | break 271 | # set durs 272 | durs_contour = dur_preds[start : end+1] 273 | contour_array[level, n_units[start] : n_units[end]+1] = durs_contour 274 | contours[contour_cnt, n_units[start] : n_units[end]+1] = durs_contour 275 | 276 | contour_levels[contour_cnt] = level 277 | contour_labels.append(contourtypes[start]) 278 | landmark_ind = np.where(ramp2[start:end+1] == 0)[0][0] + start 279 | contour_landmarks[contour_cnt] = n_units[landmark_ind]+1 280 | 281 | contour_cnt += 1 282 | 283 | # trim contour_array 284 | contour_array = contour_array[:level+1] 285 | # sum and plot predictions 286 | y_pred_sum = np.nansum(contour_array, axis=0) 287 | 288 | if plot_contour == 'f0': 289 | ax1.plot(x_axis, y_pred_sum/scale, c=colors['recon'], marker='o', lw=3, ms=3.5, 290 | alpha=.8, label='f0 recon') 291 | offset_coef = 300 # for plotting 292 | # offset_text = 30 # for the text labels 293 | offset_text = 100 # chinese 294 | upper_margin = 100 # for the plot borders 295 | lower_margin = 200 # for the plot borders 296 | y_ticks = np.arange(-300, 300, 100) 297 | ylims = ax1.get_ylim() 298 | # f0_min = - np.ceil(np.abs(ylims[0]) / offset_coef) * offset_coef - offset_coef/2 299 | # f0_min = - np.ceil(np.abs(ylims[0]) / 100) * 100 300 | f0_min = - 300 # chinese 301 | 302 | elif plot_contour == 'dur': 303 | ax1.plot(x_axis, y_pred_sum/scale, c=colors['recon'], marker='o', lw=3, ms=3.5, 304 | alpha=.8, label='dur recon') 305 | offset_coef = 2 # for plotting 306 | offset_text = .2 # for plotting 307 | upper_margin = 1 # for the plot borders 308 | lower_margin = 1 # for the plot borders 309 | y_ticks = np.arange(-2, 4, 1) 310 | ylims = ax1.get_ylim() 311 | f0_min = - np.ceil(np.abs(ylims[0]) / offset_coef) * offset_coef 312 | 313 | for level, label, contour, landmark in zip(contour_levels, contour_labels, 314 | contours, contour_landmarks): 315 | if plot_contour == 'f0': 316 | landmark_ind = 
np.where(x_axis >= landmark)[0][0] # address the next one 317 | elif plot_contour == 'dur': 318 | landmark_ind = np.where(x_axis == landmark)[0][0] # address the next one 319 | landmark = x_axis[landmark_ind] 320 | # contour_min = np.nanmin(contour) 321 | # contour_min = np.ceil(np.abs(contour_min)/ offset_coef) * offset_coef 322 | # offset = np.max([offset_coef, contour_min]) 323 | offset = f0_min - (level+1)*offset_coef 324 | plt.axhline(y=offset, c='C7', ls='--', lw=2) 325 | # plt.plot([landmark, landmark], 326 | # [-4e3, 327 | # contour[landmark_ind]/scale+offset], 328 | # c=colors[label], ls='--', lw=1, alpha=.8) 329 | plt.plot([landmark, landmark], 330 | [contour[landmark_ind]/scale+offset-offset_coef/8, 331 | contour[landmark_ind]/scale+offset +offset_coef/8], 332 | c=colors[label], ls='-', lw=2, alpha=.8) 333 | # plt.text(landmark, contour[landmark_ind]/scale+offset+offset_text, 334 | # label, color=colors[label], fontweight='bold', 335 | # bbox=dict(facecolor='w', lw=0, alpha=0.5)) 336 | plt.text(landmark-.2, np.max((offset+offset_text, 337 | contour[landmark_ind]/scale+offset+offset_text/2)), 338 | label, color=colors[label], fontweight='bold') 339 | # bbox=dict(facecolor='w', lw=0, alpha=0.8)) 340 | plt.plot(x_axis, contour/scale+offset, c=colors[label], 341 | marker='o', ms=3.5, lw=3, alpha=.8) 342 | # f0_min = offset - np.ceil(np.abs(np.nanmin(contour)) / offset_coef) * offset_coef 343 | 344 | # delete ticks from y-axis 345 | ax1.set_yticks(y_ticks) 346 | ax1.set_yticklabels(y_ticks) 347 | 348 | # final_limit = f0_min - np.max(contour_levels)*offset_coef-lower_margin 349 | final_limit = offset-lower_margin 350 | ax1.axis([0,n_rus+1,final_limit, ylims[1]+upper_margin]) 351 | ax2.axis([0,n_rus+1,final_limit, ylims[1]+upper_margin]) 352 | 353 | if plot_contour == 'f0': 354 | ax1.legend(loc='upper right') 355 | else: 356 | ax1.legend(loc='upper left') 357 | #%% 358 | plt.savefig('{}{}_{}_learn{}_maxit{}_it{}.png'.format(save_path, barename, plot_contour, 
359 | learn_rate, 360 | max_iter, i+1), dpi='figure') 361 | if not show_plot: 362 | plt.close(fig) 363 | 364 | 365 | def plot_losses(save_path, phrase_type, losses, show_plot=False): 366 | ''' 367 | Plot losses of training contour generators. 368 | 369 | Parameters 370 | ========== 371 | save_path : str 372 | Path to save figure in. 373 | phrase_type : str 374 | Type of phrase to plot losses for. 375 | losses : dict 376 | Losses for each function type. 377 | show_plot : bool 378 | Whether to keep the plot open. 379 | ''' 380 | plt.style.use('default') 381 | fig = plt.figure(figsize=(10,8)) 382 | plt.grid('on') 383 | plt.ylabel('Squared-loss') 384 | plt.xlabel('Iteration') 385 | for key in losses: 386 | plt.plot(losses[key], lw=3, alpha=0.75, label=key) 387 | plt.legend() 388 | # plt.savefig('losses_{}_learn{}_maxit{}_it{}.png'.format(corpus_name, learn_rate, max_iter, iterations), 389 | plt.savefig('{}/losses_{}.png'.format(save_path, phrase_type), dpi='figure') 390 | if not show_plot: 391 | plt.close(fig) 392 | 393 | def plot_expansion(save_path, contour_generators, colors, scope_counts, phrase_type, 394 | params, show_plot=False): 395 | ''' 396 | Plot expansion of contour generators. 397 | 398 | Parameters 399 | ========== 400 | save_path : str 401 | Save path for figures. 402 | contour_generators : dict 403 | Dictionary of contour generators. 404 | colors : dict 405 | Dictionary of colors. 406 | phrase_type : str 407 | Phrase type for plotting expansion. 408 | scope_counts : dict 409 | Dictionary of scope counts for each function type for chosen phrase type. 410 | show_plot : bool 411 | Whether to keep the plot open after saving. 412 | 413 | params 414 | ====== 415 | left_max : int 416 | Maximum length of left scope. 417 | right_max : int 418 | Maximum length of right scope. 419 | f0_scale : float 420 | f0 scaling coefficient. 421 | dur_scale : float 422 | dur_coeff scaling coefficient. 423 | end_marks : list 424 | List of function types with left context only.
425 | vowel_pts : int 426 | Number of vowel samples. 427 | ''' 428 | plt.style.use('default') 429 | left_max = params.left_max 430 | right_max = params.right_max 431 | phrase_max = params.phrase_max 432 | f0_scale = params.f0_scale 433 | # dur_scale = params.dur_scale 434 | end_marks = params.end_marks 435 | vowel_pts = params.vowel_pts 436 | #%% test 437 | # left_max = 5 438 | # right_max = 5 439 | # show_plot=True 440 | # scope_counts = sfcdata.contour_scope_count(corpus, 'DC') 441 | # colors = sfcplot.init_colors(phrase_types) 442 | # create ramps 443 | ramps_all = [] 444 | left_scopes = np.arange(1,left_max+1) 445 | right_scopes = np.arange(1,right_max+1) 446 | for left_scope in left_scopes: 447 | ramps_row = [] 448 | for right_scope in right_scopes: 449 | #% 450 | # create ramps 451 | contour_scope = left_scope + right_scope 452 | ramp_global = np.arange(contour_scope)[::-1] 453 | # landmark_unit = left_scope # position of landmark 454 | unit_scope_1 = np.arange(left_scope) # including landmark unit 455 | unit_scope_2 = np.arange(right_scope) 456 | ramp_local = np.c_[np.r_[unit_scope_1[::-1], unit_scope_2[::-1]], 457 | np.r_[np.linspace(0, 10, num=left_scope)[::-1], 458 | # if there is 1 unit this will favor the 0 459 | np.linspace(0, 10, num=right_scope)], 460 | np.r_[unit_scope_1, unit_scope_2]] 461 | 462 | ramps = np.c_[ramp_local, ramp_global] 463 | ramps_row.append(ramps) 464 | 465 | ramps_all.append(ramps_row) 466 | 467 | #% left_ramps - only with left scope 468 | ramps_left = [] 469 | for contour_scope in range(1,phrase_max+1): 470 | ramp_global = np.arange(contour_scope)[::-1] 471 | ramp_local = np.c_[ramp_global, 472 | np.linspace(10, 0, num=contour_scope), 473 | ramp_global[::-1]] 474 | ramps = np.c_[ramp_local, ramp_global] 475 | ramps_left.append(ramps) 476 | 477 | #%% plot expansions 478 | for contour_type, contour_generator in contour_generators.items(): 479 | # contour_type = 'DD' 480 | contour_generator = contour_generators[contour_type] 481 | 
x_offset = 10 482 | y_offset = 400 483 | if contour_type in [phrase_type] + end_marks: 484 | if contour_type in [phrase_type]: 485 | y_offset = 200 486 | fig = plt.figure(figsize=(10,12)) 487 | # fig = plt.figure(figsize=(4,6)) 488 | ramps_left_trim = ramps_left # draw all 489 | else: 490 | # continue 491 | fig = plt.figure(figsize=(6,8)) 492 | ramps_left_trim = ramps_left[:left_max+1] 493 | ax1 = fig.add_subplot(111) 494 | plt.grid('on') 495 | ax1.set_xlabel('Rhythmic unit') 496 | plt.title('Expansion for {}'.format(contour_type)) 497 | ax1.set_ylabel('Normalised f0') 498 | plt.grid('on') 499 | x_axis_f0 = np.arange(-int(vowel_pts/2)/vowel_pts, len(ramps_left_trim)+1/2, 1/vowel_pts) + 1 500 | x_axis_f0 = - x_axis_f0[::-1] 501 | # y_ticks_orig = np.arange(-200,300,100) 502 | y_ticks = np.empty(0) 503 | # y_tick_labels = np.empty(0, dtype=int) 504 | y_tick_labels = [] 505 | for i, row in enumerate(ramps_left_trim): 506 | row_offset = i * y_offset 507 | X = row 508 | y_pred = contour_generator.predict(X) 509 | f0_pred = y_pred[:,:-1].ravel()/f0_scale 510 | 511 | scope_count = scope_counts[contour_type][i+1,0] 512 | 513 | plt.axhline(y=row_offset, c='C7', ls='--', lw=2) 514 | if scope_count > 0: 515 | plt.plot(x_axis_f0[-f0_pred.size:], f0_pred + row_offset, 516 | c=colors[contour_type], marker='o', ms=3.5, lw=3, alpha=.8) 517 | else: 518 | plt.plot(x_axis_f0[-f0_pred.size:], f0_pred + row_offset, 519 | c=colors[contour_type], marker='o', ms=3.5, lw=3, alpha=.3) 520 | plt.text(-.2, f0_pred[-1] + row_offset, scope_count, 521 | color=colors[contour_type], fontweight='bold', 522 | bbox=dict(facecolor='w', lw=0, alpha=0.5)) 523 | y_ticks = np.r_[y_ticks, row_offset] 524 | # y_ticks = np.r_[y_ticks, y_ticks_orig+row_offset] 525 | # y_tick_labels = np.r_[y_tick_labels, 'RU{}'.format(i+1)] 526 | y_tick_labels.append('RU{:02}'.format(i+1)) 527 | 528 | # set ticks 529 | ax1.set_yticks(y_ticks) 530 | ax1.set_yticklabels(y_tick_labels) 531 | 
ax1.set_xticks(x_axis_f0[-int(vowel_pts/2)-1::-vowel_pts]) 532 | ax1.set_xticklabels(np.arange(-1, -contour_scope-1, -1)) 533 | 534 | # set ticks 535 | ax1.set_yticks(y_ticks) 536 | ax1.set_yticklabels(y_tick_labels) 537 | 538 | else: # left and right scopes 539 | # continue 540 | fig = plt.figure(figsize=(18,14)) 541 | ax1 = fig.add_subplot(111) 542 | plt.grid('on') 543 | ax1.set_xlabel('Rhythmic unit') 544 | plt.title('Expansion for {}'.format(contour_type)) 545 | ax1.set_ylabel('Normalised f0') 546 | plt.grid('on') 547 | y_ticks = np.empty(0) 548 | # y_tick_labels = np.empty(0, dtype=int) 549 | y_tick_labels = [] 550 | x_ticks = np.empty(0) 551 | x_tick_labels = np.empty(0, dtype=int) 552 | x_tick_labels = [] 553 | 554 | for i, row in enumerate(ramps_all): # rows are left_scope in range(1,left_max+1): 555 | row_offset = i * y_offset 556 | plt.axhline(y=row_offset, c='C7', ls='--', lw=1.5) 557 | y_ticks = np.r_[y_ticks, row_offset] 558 | # y_tick_labels = np.r_[y_tick_labels, 0] 559 | y_tick_labels.append('L{:02}'.format(i+1)) 560 | 561 | for j, ramps in enumerate(row): # columns are right_scope in range(1,right_max+1): 562 | offset = j * x_offset 563 | if i == 0: # if first row plot vertical 564 | x_ticks = np.r_[x_ticks, offset] 565 | # x_tick_labels = np.r_[x_tick_labels, 0] 566 | x_tick_labels.append('R{:02}'.format(j+1)) 567 | plt.axvline(x=offset, c='C7', ls='--', lw=1.5) 568 | # plt.plot([landmark, landmark], 569 | # [-2e3, contour[landmark_ind]/scale+offset], 570 | # c=colors[label], ls='--', lw=2, alpha=.8) 571 | 572 | scope_count = scope_counts[contour_type][i+1,j+1] 573 | X = ramps 574 | y_pred = contour_generator.predict(X) 575 | f0_pred = y_pred[:,:-1].ravel()/f0_scale 576 | 577 | l = left_scopes[i] 578 | r = right_scopes[j] 579 | # x_axis_f0 = np.r_[np.arange(-l-2/3, -2/3, 1/3), # 0 on the end of R1 580 | # np.arange(-2/3, r-2/3, 1/3)] 581 | # 0 in the start of R1: 582 | x_axis_f0 = np.r_[np.arange(-l, 0, 1/vowel_pts), 583 | np.arange(0, r, 
1/vowel_pts)] + 1/2/vowel_pts # to make it in the middle 584 | if scope_count > 0: 585 | plt.plot(x_axis_f0 + offset, f0_pred + row_offset, 586 | c=colors[contour_type], marker='o', ms=2.5, lw=2, alpha=.8) 587 | else: 588 | plt.plot(x_axis_f0 + offset, f0_pred + row_offset, 589 | c=colors[contour_type], marker='o', ms=2.5, lw=2, alpha=.3) 590 | 591 | plt.text(x_axis_f0[-1] + offset + .4, 592 | f0_pred[-1] + row_offset, scope_count, 593 | color=colors[contour_type], fontweight='bold', 594 | bbox=dict(facecolor='w', lw=0, alpha=0.5)) 595 | 596 | # set ticks 597 | ax1.set_yticks(y_ticks) 598 | ax1.set_yticklabels(y_tick_labels) 599 | ax1.set_xticks(x_ticks) 600 | ax1.set_xticklabels(x_tick_labels) 601 | 602 | #%% save 603 | 604 | plt.savefig('{}expansion_{}.png'.format(save_path, contour_type), dpi='figure') 605 | if not show_plot: 606 | plt.close(fig) 607 | 608 | def plot_final_losses(dict_losses, params, show_plot=False): 609 | ''' 610 | Plot final losses for all contour generators. 611 | 612 | Parameters 613 | ========== 614 | dict_losses : dict 615 | Dictionary of losses for all phrase types. 616 | show_plot : bool 617 | Whether to keep the plot open. 618 | 619 | params 620 | ------ 621 | save_path : str 622 | Save path for figure. 623 | iterations : int 624 | Number of iterations of analysis-by-synthesis loop. 625 | max_iter : int 626 | Maximum number of iterations.
627 | ''' 628 | plt.style.use('default') 629 | save_path = params.save_path 630 | iterations = params.iterations 631 | max_iter = params.max_iter 632 | 633 | pd_losses = pd.DataFrame(dict_losses) 634 | pd_losses = pd_losses.applymap(lambda x: x[-1] if x is not np.nan else 0) 635 | sns.set() 636 | fig = plt.figure(figsize=(8,6)) 637 | sns.heatmap(pd_losses, annot=True, cbar=False, cmap="YlGnBu") 638 | plt.xlabel('Phrase type') 639 | plt.ylabel('Contour generator') 640 | plt.title('Losses after {} iterations with {} maxiter per loop'.format(iterations, max_iter)) 641 | plt.savefig('{}/final_losses_heat.png'.format(save_path), dpi='figure') 642 | if not show_plot: 643 | plt.close(fig) 644 | 645 | def plot_worst(corpus, params, phrase=None, n_files=None): 646 | ''' 647 | Copy plots for 100 worst files, or all files for a particular phrase type 648 | in descending order according to RMS. 649 | 650 | Parameters 651 | ========== 652 | corpus : pandas DataFrame 653 | Holds all the data. 654 | phrase : str 655 | Phrase type to copy for, if None copy for all. 656 | n_files : int 657 | Number of files to copy, if None copy all. 658 | 659 | params 660 | ====== 661 | orig_columns : list 662 | List with column names from corpus holding original data. 663 | iterations : int 664 | Number of iterations of analysis-by-synthesis loop. 665 | save_path : str 666 | Save path for figure.
667 | ''' 668 | orig_columns = params.orig_columns 669 | iterations = params.iterations 670 | save_path = params.save_path 671 | 672 | # analyse losses per file and get a list of files 673 | pred_columns = [column + '_it{:03}'.format(iterations-1) for column in orig_columns] 674 | df_errors = pd.Series() 675 | for file in corpus['file'].unique(): 676 | mask_file = corpus['file'] == file 677 | n_units = np.max(corpus.loc[mask_file, 'n_unit'].values) 678 | error_file = np.array([]) 679 | for n_unit in range(n_units+1): 680 | mask_unit = corpus['n_unit'] == n_unit 681 | mask_row = mask_file & mask_unit 682 | y_pred = corpus.loc[mask_row, pred_columns].values 683 | y_pred_sum = np.sum(y_pred, axis=0) 684 | y_orig = corpus.loc[mask_row, orig_columns].values # every row should be the same 685 | y_error = y_orig[0,:] - y_pred_sum # this will automatically tile row to matrix 686 | error_file = np.r_[error_file, y_error[:-1]] # without the dur coeff 687 | df_errors[file] = np.mean(error_file**2) 688 | 689 | df_errors_sorted = df_errors.sort_values(ascending=False) 690 | df_errors_sorted = df_errors_sorted.rename('RMS') # rename returns a new Series 691 | 692 | # write to excel 693 | writer = pd.ExcelWriter(save_path+'/errors_lineup.xls') 694 | df_errors_sorted.to_excel(writer,'Sheet1') 695 | writer.save() 696 | 697 | # plot 698 | if phrase is None: 699 | os.mkdir(save_path+'/errors_lineup/') 700 | for cnt, file in enumerate(df_errors_sorted.index[:n_files]): 701 | bare = file.split('.')[0] 702 | phrase_type = bare[:2] 703 | src = save_path+'/'+phrase_type+'_f0/'+bare+'_*.png' 704 | dst = '{}/errors_lineup/{:03}_{:03}rms_{}.png'.format( 705 | save_path, cnt, int(df_errors_sorted.iloc[cnt]), bare) 706 | shutil.copyfile(glob.glob(src)[0], dst) 707 | 708 | else: 709 | os.mkdir(save_path+'/errors_lineup_'+phrase+'/') 710 | for cnt, file in enumerate(df_errors_sorted.index): 711 | bare = file.split('.')[0] 712 | phrase_type = bare[:2] 713 | if phrase_type == phrase: 714 | src = save_path+'/'+phrase_type+'_f0/'+bare+'_*.png' 715
| dst = '{}/errors_lineup_{}/{:03}_{:03}rms_{}.png'.format( 716 | save_path, phrase, cnt, int(df_errors_sorted.iloc[cnt]), bare) 717 | shutil.copyfile(glob.glob(src)[0], dst) -------------------------------------------------------------------------------- /examples/chinese_003.TextGrid: -------------------------------------------------------------------------------- 1 | File type = "ooTextFile" 2 | Object class = "TextGrid" 3 | 4 | xmin = 0 5 | xmax = 10.3245 6 | tiers? 7 | size = 10 8 | item []: 9 | item [1]: 10 | class = "IntervalTier" 11 | name = "PHON" 12 | xmin = 0 13 | xmax = 10.3245 14 | intervals: size = 75 15 | intervals [1]: 16 | xmin = 0 17 | xmax = 0.7779 18 | text = "__" 19 | intervals [2]: 20 | xmin = 0.7779 21 | xmax = 0.8519 22 | text = "t" 23 | intervals [3]: 24 | xmin = 0.8519 25 | xmax = 0.9569 26 | text = "a" 27 | intervals [4]: 28 | xmin = 0.9569 29 | xmax = 1.026 30 | text = "m" 31 | intervals [5]: 32 | xmin = 1.026 33 | xmax = 1.1138 34 | text = "en" 35 | intervals [6]: 36 | xmin = 1.1138 37 | xmax = 1.2 38 | text = "c" 39 | intervals [7]: 40 | xmin = 1.2 41 | xmax = 1.34 42 | text = "eng" 43 | intervals [8]: 44 | xmin = 1.34 45 | xmax = 1.386 46 | text = "z" 47 | intervals [9]: 48 | xmin = 1.386 49 | xmax = 1.5622 50 | text = "ai" 51 | intervals [10]: 52 | xmin = 1.5622 53 | xmax = 1.692 54 | text = "j" 55 | intervals [11]: 56 | xmin = 1.692 57 | xmax = 1.7804 58 | text = "i" 59 | intervals [12]: 60 | xmin = 1.7804 61 | xmax = 1.8719 62 | text = "c" 63 | intervals [13]: 64 | xmin = 1.8719 65 | xmax = 1.979 66 | text = "ang" 67 | intervals [14]: 68 | xmin = 1.979 69 | xmax = 2.0357 70 | text = "n" 71 | intervals [15]: 72 | xmin = 2.0357 73 | xmax = 2.2448 74 | text = "ei" 75 | intervals [16]: 76 | xmin = 2.2448 77 | xmax = 2.3459 78 | text = "g" 79 | intervals [17]: 80 | xmin = 2.3459 81 | xmax = 2.4929 82 | text = "ei" 83 | intervals [18]: 84 | xmin = 2.4929 85 | xmax = 2.5439 86 | text = "l" 87 | intervals [19]: 88 | xmin = 2.5439 89 | xmax 
= 2.6389 90 | text = "v" 91 | intervals [20]: 92 | xmin = 2.6389 93 | xmax = 2.7899 94 | text = "k" 95 | intervals [21]: 96 | xmin = 2.7899 97 | xmax = 2.9815 98 | text = "e" 99 | intervals [22]: 100 | xmin = 2.9815 101 | xmax = 3.072 102 | text = "d" 103 | intervals [23]: 104 | xmin = 3.072 105 | xmax = 3.2941 106 | text = "ian" 107 | intervals [24]: 108 | xmin = 3.2941 109 | xmax = 3.384 110 | text = "g" 111 | intervals [25]: 112 | xmin = 3.384 113 | xmax = 3.5758 114 | text = "e" 115 | intervals [26]: 116 | xmin = 3.5758 117 | xmax = 3.72 118 | text = "h" 119 | intervals [27]: 120 | xmin = 3.72 121 | xmax = 3.8271 122 | text = "e" 123 | intervals [28]: 124 | xmin = 3.8271 125 | xmax = 3.924 126 | text = "sh" 127 | intervals [29]: 128 | xmin = 3.924 129 | xmax = 4.0731 130 | text = "eng" 131 | intervals [30]: 132 | xmin = 4.0731 133 | xmax = 4.1432 134 | text = "r" 135 | intervals [31]: 136 | xmin = 4.1432 137 | xmax = 4.3528 138 | text = "iii" 139 | intervals [32]: 140 | xmin = 4.3528 141 | xmax = 4.7544 142 | text = "__" 143 | intervals [33]: 144 | xmin = 4.7544 145 | xmax = 4.8239 146 | text = "c" 147 | intervals [34]: 148 | xmin = 4.8239 149 | xmax = 4.9157 150 | text = "eng" 151 | intervals [35]: 152 | xmin = 4.9157 153 | xmax = 4.992 154 | text = "n" 155 | intervals [36]: 156 | xmin = 4.992 157 | xmax = 5.13 158 | text = "a" 159 | intervals [37]: 160 | xmin = 5.13 161 | xmax = 5.184 162 | text = "zh" 163 | intervals [38]: 164 | xmin = 5.184 165 | xmax = 5.3249 166 | text = "e" 167 | intervals [39]: 168 | xmin = 5.3249 169 | xmax = 5.4299 170 | text = "sh" 171 | intervals [40]: 172 | xmin = 5.4299 173 | xmax = 5.5519 174 | text = "uei" 175 | intervals [41]: 176 | xmin = 5.5519 177 | xmax = 5.622 178 | text = "g" 179 | intervals [42]: 180 | xmin = 5.622 181 | xmax = 5.7041 182 | text = "uo" 183 | intervals [43]: 184 | xmin = 5.7041 185 | xmax = 5.7659 186 | text = "n" 187 | intervals [44]: 188 | xmin = 5.7659 189 | xmax = 5.9355 190 | text = "ai" 191 | 
intervals [45]: 192 | xmin = 5.9355 193 | xmax = 6.0419 194 | text = "f" 195 | intervals [46]: 196 | xmin = 6.0419 197 | xmax = 6.249 198 | text = "en" 199 | intervals [47]: 200 | xmin = 6.249 201 | xmax = 6.354 202 | text = "q" 203 | intervals [48]: 204 | xmin = 6.354 205 | xmax = 6.4302 206 | text = "v" 207 | intervals [49]: 208 | xmin = 6.4302 209 | xmax = 6.558 210 | text = "t" 211 | intervals [50]: 212 | xmin = 6.558 213 | xmax = 6.6842 214 | text = "an" 215 | intervals [51]: 216 | xmin = 6.6842 217 | xmax = 6.9428 218 | text = "uang" 219 | intervals [52]: 220 | xmin = 6.9428 221 | xmax = 7.1248 222 | text = "iou" 223 | intervals [53]: 224 | xmin = 7.1248 225 | xmax = 7.236 226 | text = "t" 227 | intervals [54]: 228 | xmin = 7.236 229 | xmax = 7.3229 230 | text = "a" 231 | intervals [55]: 232 | xmin = 7.3229 233 | xmax = 7.386 234 | text = "m" 235 | intervals [56]: 236 | xmin = 7.386 237 | xmax = 7.4449 238 | text = "en" 239 | intervals [57]: 240 | xmin = 7.4449 241 | xmax = 7.5359 242 | text = "zh" 243 | intervals [58]: 244 | xmin = 7.5359 245 | xmax = 7.6857 246 | text = "uan" 247 | intervals [59]: 248 | xmin = 7.6857 249 | xmax = 7.7579 250 | text = "s" 251 | intervals [60]: 252 | xmin = 7.7579 253 | xmax = 7.8869 254 | text = "ong" 255 | intervals [61]: 256 | xmin = 7.8869 257 | xmax = 7.998 258 | text = "q" 259 | intervals [62]: 260 | xmin = 7.998 261 | xmax = 8.1645 262 | text = "v" 263 | intervals [63]: 264 | xmin = 8.1645 265 | xmax = 8.3975 266 | text = "i" 267 | intervals [64]: 268 | xmin = 8.3975 269 | xmax = 8.6023 270 | text = "van" 271 | intervals [65]: 272 | xmin = 8.6023 273 | xmax = 8.6459 274 | text = "d" 275 | intervals [66]: 276 | xmin = 8.6459 277 | xmax = 8.7915 278 | text = "e" 279 | intervals [67]: 280 | xmin = 8.7915 281 | xmax = 8.868 282 | text = "l" 283 | intervals [68]: 284 | xmin = 8.868 285 | xmax = 9.0454 286 | text = "v" 287 | intervals [69]: 288 | xmin = 9.0454 289 | xmax = 9.18 290 | text = "k" 291 | intervals [70]: 292 | 
xmin = 9.18 293 | xmax = 9.265 294 | text = "e" 295 | intervals [71]: 296 | xmin = 9.265 297 | xmax = 9.3719 298 | text = "ch" 299 | intervals [72]: 300 | xmin = 9.3719 301 | xmax = 9.4783 302 | text = "an" 303 | intervals [73]: 304 | xmin = 9.4783 305 | xmax = 9.5933 306 | text = "f" 307 | intervals [74]: 308 | xmin = 9.5933 309 | xmax = 9.7239 310 | text = "u" 311 | intervals [75]: 312 | xmin = 9.7239 313 | xmax = 10.3245 314 | text = "__" 315 | item [2]: 316 | class = "IntervalTier" 317 | name = "SYLL" 318 | xmin = 0 319 | xmax = 10.3245 320 | intervals: size = 38 321 | intervals [1]: 322 | xmin = 0.7779 323 | xmax = 0.9569 324 | text = "SYL" 325 | intervals [2]: 326 | xmin = 0.9569 327 | xmax = 1.1138 328 | text = "SYL" 329 | intervals [3]: 330 | xmin = 1.1138 331 | xmax = 1.34 332 | text = "SYL" 333 | intervals [4]: 334 | xmin = 1.34 335 | xmax = 1.5622 336 | text = "SYL" 337 | intervals [5]: 338 | xmin = 1.5622 339 | xmax = 1.7804 340 | text = "SYL" 341 | intervals [6]: 342 | xmin = 1.7804 343 | xmax = 1.979 344 | text = "SYL" 345 | intervals [7]: 346 | xmin = 1.979 347 | xmax = 2.2448 348 | text = "SYL" 349 | intervals [8]: 350 | xmin = 2.2448 351 | xmax = 2.4929 352 | text = "SYL" 353 | intervals [9]: 354 | xmin = 2.4929 355 | xmax = 2.6389 356 | text = "SYL" 357 | intervals [10]: 358 | xmin = 2.6389 359 | xmax = 2.9815 360 | text = "SYL" 361 | intervals [11]: 362 | xmin = 2.9815 363 | xmax = 3.2941 364 | text = "SYL" 365 | intervals [12]: 366 | xmin = 3.2941 367 | xmax = 3.5758 368 | text = "SYL" 369 | intervals [13]: 370 | xmin = 3.5758 371 | xmax = 3.8271 372 | text = "SYL" 373 | intervals [14]: 374 | xmin = 3.8271 375 | xmax = 4.0731 376 | text = "SYL" 377 | intervals [15]: 378 | xmin = 4.0731 379 | xmax = 4.7544 380 | text = "SYL" 381 | intervals [16]: 382 | xmin = 4.7544 383 | xmax = 4.9157 384 | text = "SYL" 385 | intervals [17]: 386 | xmin = 4.9157 387 | xmax = 5.13 388 | text = "SYL" 389 | intervals [18]: 390 | xmin = 5.13 391 | xmax = 5.3249 392 | 
text = "SYL" 393 | intervals [19]: 394 | xmin = 5.3249 395 | xmax = 5.5519 396 | text = "SYL" 397 | intervals [20]: 398 | xmin = 5.5519 399 | xmax = 5.7041 400 | text = "SYL" 401 | intervals [21]: 402 | xmin = 5.7041 403 | xmax = 5.9355 404 | text = "SYL" 405 | intervals [22]: 406 | xmin = 5.9355 407 | xmax = 6.249 408 | text = "SYL" 409 | intervals [23]: 410 | xmin = 6.249 411 | xmax = 6.4302 412 | text = "SYL" 413 | intervals [24]: 414 | xmin = 6.4302 415 | xmax = 6.6842 416 | text = "SYL" 417 | intervals [25]: 418 | xmin = 6.6842 419 | xmax = 6.9428 420 | text = "SYL" 421 | intervals [26]: 422 | xmin = 6.9428 423 | xmax = 7.1248 424 | text = "SYL" 425 | intervals [27]: 426 | xmin = 7.1248 427 | xmax = 7.3229 428 | text = "SYL" 429 | intervals [28]: 430 | xmin = 7.3229 431 | xmax = 7.4449 432 | text = "SYL" 433 | intervals [29]: 434 | xmin = 7.4449 435 | xmax = 7.6857 436 | text = "SYL" 437 | intervals [30]: 438 | xmin = 7.6857 439 | xmax = 7.8869 440 | text = "SYL" 441 | intervals [31]: 442 | xmin = 7.8869 443 | xmax = 8.1645 444 | text = "SYL" 445 | intervals [32]: 446 | xmin = 8.1645 447 | xmax = 8.3975 448 | text = "SYL" 449 | intervals [33]: 450 | xmin = 8.3975 451 | xmax = 8.6023 452 | text = "SYL" 453 | intervals [34]: 454 | xmin = 8.6023 455 | xmax = 8.7915 456 | text = "SYL" 457 | intervals [35]: 458 | xmin = 8.7915 459 | xmax = 9.0454 460 | text = "SYL" 461 | intervals [36]: 462 | xmin = 9.0454 463 | xmax = 9.265 464 | text = "SYL" 465 | intervals [37]: 466 | xmin = 9.265 467 | xmax = 9.4783 468 | text = "SYL" 469 | intervals [38]: 470 | xmin = 9.4783 471 | xmax = 10.3245 472 | text = "SYL" 473 | item [3]: 474 | class = "TextTier" 475 | name = "TONE1" 476 | xmin = 0 477 | xmax = 10.3245 478 | points: size = 40 479 | points [1]: 480 | time = 0.7779 481 | mark = ":FF" 482 | points [2]: 483 | time = 0.9569 484 | mark = ":C1" 485 | points [3]: 486 | time = 1.1138 487 | mark = ":FF" 488 | points [4]: 489 | time = 1.34 490 | mark = ":C2" 491 | points [5]: 492 
| time = 1.5622 493 | mark = ":FF" 494 | points [6]: 495 | time = 1.7804 496 | mark = ":C1" 497 | points [7]: 498 | time = 1.979 499 | mark = ":FF" 500 | points [8]: 501 | time = 2.2448 502 | mark = ":C4" 503 | points [9]: 504 | time = 2.4929 505 | mark = ":FF" 506 | points [10]: 507 | time = 2.6389 508 | mark = ":C3" 509 | points [11]: 510 | time = 2.9815 511 | mark = ":FF" 512 | points [12]: 513 | time = 3.2941 514 | mark = ":C3" 515 | points [13]: 516 | time = 3.5758 517 | mark = ":FF" 518 | points [14]: 519 | time = 3.8271 520 | mark = ":C4" 521 | points [15]: 522 | time = 4.0731 523 | mark = ":FF" 524 | points [16]: 525 | time = 4.3528 526 | mark = ":C4" 527 | points [17]: 528 | time = 4.7544 529 | mark = ":FF" 530 | points [18]: 531 | time = 4.9157 532 | mark = ":FF" 533 | points [19]: 534 | time = 5.13 535 | mark = ":C2" 536 | points [20]: 537 | time = 5.3249 538 | mark = ":FF" 539 | points [21]: 540 | time = 5.5519 541 | mark = ":C2" 542 | points [22]: 543 | time = 5.7041 544 | mark = ":FF" 545 | points [23]: 546 | time = 5.9355 547 | mark = ":C2" 548 | points [24]: 549 | time = 6.249 550 | mark = ":FF" 551 | points [25]: 552 | time = 6.4302 553 | mark = ":C4" 554 | points [26]: 555 | time = 6.6842 556 | mark = ":FF" 557 | points [27]: 558 | time = 6.9428 559 | mark = ":C4" 560 | points [28]: 561 | time = 7.1248 562 | mark = ":FF" 563 | points [29]: 564 | time = 7.3229 565 | mark = ":C1" 566 | points [30]: 567 | time = 7.4449 568 | mark = ":FF" 569 | points [31]: 570 | time = 7.6857 571 | mark = ":C3" 572 | points [32]: 573 | time = 7.8869 574 | mark = ":FF" 575 | points [33]: 576 | time = 8.1645 577 | mark = ":C4" 578 | points [34]: 579 | time = 8.3975 580 | mark = ":FF" 581 | points [35]: 582 | time = 8.6023 583 | mark = ":C4" 584 | points [36]: 585 | time = 8.7915 586 | mark = ":FF" 587 | points [37]: 588 | time = 9.0454 589 | mark = ":C3" 590 | points [38]: 591 | time = 9.265 592 | mark = ":FF" 593 | points [39]: 594 | time = 9.4783 595 | mark = ":C3" 
596 | points [40]: 597 | time = 9.7239 598 | mark = ":FF" 599 | item [4]: 600 | class = "TextTier" 601 | name = "TONE2" 602 | xmin = 0 603 | xmax = 10.3245 604 | points: size = 39 605 | points [1]: 606 | time = 0.9569 607 | mark = ":FF" 608 | points [2]: 609 | time = 1.1138 610 | mark = ":C0" 611 | points [3]: 612 | time = 1.34 613 | mark = ":FF" 614 | points [4]: 615 | time = 1.5622 616 | mark = ":C4" 617 | points [5]: 618 | time = 1.7804 619 | mark = ":FF" 620 | points [6]: 621 | time = 1.979 622 | mark = ":C1" 623 | points [7]: 624 | time = 2.2448 625 | mark = ":FF" 626 | points [8]: 627 | time = 2.4929 628 | mark = ":C2" 629 | points [9]: 630 | time = 2.6389 631 | mark = ":FF" 632 | points [10]: 633 | time = 2.9815 634 | mark = ":C4" 635 | points [11]: 636 | time = 3.2941 637 | mark = ":FF" 638 | points [12]: 639 | time = 3.5758 640 | mark = ":C1" 641 | points [13]: 642 | time = 3.8271 643 | mark = ":FF" 644 | points [14]: 645 | time = 4.0731 646 | mark = ":C1" 647 | points [15]: 648 | time = 4.3528 649 | mark = ":FF" 650 | points [16]: 651 | time = 4.7544 652 | mark = ":FF" 653 | points [17]: 654 | time = 4.9157 655 | mark = ":C2" 656 | points [18]: 657 | time = 5.13 658 | mark = ":FF" 659 | points [19]: 660 | time = 5.3249 661 | mark = ":C1" 662 | points [20]: 663 | time = 5.5519 664 | mark = ":FF" 665 | points [21]: 666 | time = 5.7041 667 | mark = ":C3" 668 | points [22]: 669 | time = 5.9355 670 | mark = ":FF" 671 | points [23]: 672 | time = 6.249 673 | mark = ":C3" 674 | points [24]: 675 | time = 6.4302 676 | mark = ":FF" 677 | points [25]: 678 | time = 6.6842 679 | mark = ":C4" 680 | points [26]: 681 | time = 6.9428 682 | mark = ":FF" 683 | points [27]: 684 | time = 7.1248 685 | mark = ":C2" 686 | points [28]: 687 | time = 7.3229 688 | mark = ":FF" 689 | points [29]: 690 | time = 7.4449 691 | mark = ":C0" 692 | points [30]: 693 | time = 7.6857 694 | mark = ":FF" 695 | points [31]: 696 | time = 7.8869 697 | mark = ":C4" 698 | points [32]: 699 | time = 
8.1645 700 | mark = ":FF" 701 | points [33]: 702 | time = 8.3975 703 | mark = ":C1" 704 | points [34]: 705 | time = 8.6023 706 | mark = ":FF" 707 | points [35]: 708 | time = 8.7915 709 | mark = ":C0" 710 | points [36]: 711 | time = 9.0454 712 | mark = ":FF" 713 | points [37]: 714 | time = 9.265 715 | mark = ":C4" 716 | points [38]: 717 | time = 9.4783 718 | mark = ":FF" 719 | points [39]: 720 | time = 9.7239 721 | mark = ":C4" 722 | item [5]: 723 | class = "IntervalTier" 724 | name = "ORTHOGRAPHE" 725 | xmin = 0 726 | xmax = 10.3245 727 | intervals: size = 40 728 | intervals [1]: 729 | xmin = 0.7779 730 | xmax = 0.9569 731 | text = "ta1" 732 | intervals [2]: 733 | xmin = 0.9569 734 | xmax = 1.1138 735 | text = "men0" 736 | intervals [3]: 737 | xmin = 1.1138 738 | xmax = 1.34 739 | text = "ceng2" 740 | intervals [4]: 741 | xmin = 1.34 742 | xmax = 1.5622 743 | text = "zai4" 744 | intervals [5]: 745 | xmin = 1.5622 746 | xmax = 1.7804 747 | text = "ji1" 748 | intervals [6]: 749 | xmin = 1.7804 750 | xmax = 1.979 751 | text = "cang1" 752 | intervals [7]: 753 | xmin = 1.979 754 | xmax = 2.2448 755 | text = "nei4" 756 | intervals [8]: 757 | xmin = 2.2448 758 | xmax = 2.4929 759 | text = "gei2" 760 | intervals [9]: 761 | xmin = 2.4929 762 | xmax = 2.6389 763 | text = "lv3" 764 | intervals [10]: 765 | xmin = 2.6389 766 | xmax = 2.9815 767 | text = "ke4" 768 | intervals [11]: 769 | xmin = 2.9815 770 | xmax = 3.2941 771 | text = "dian3" 772 | intervals [12]: 773 | xmin = 3.2941 774 | xmax = 3.5758 775 | text = "ge1" 776 | intervals [13]: 777 | xmin = 3.5758 778 | xmax = 3.8271 779 | text = "he4" 780 | intervals [14]: 781 | xmin = 3.8271 782 | xmax = 4.0731 783 | text = "sheng1" 784 | intervals [15]: 785 | xmin = 4.0731 786 | xmax = 4.3528 787 | text = "ri4" 788 | intervals [16]: 789 | xmin = 4.3528 790 | xmax = 4.7544 791 | text = "," 792 | intervals [17]: 793 | xmin = 4.7544 794 | xmax = 4.9157 795 | text = "ceng2" 796 | intervals [18]: 797 | xmin = 4.9157 798 | xmax = 
5.13 799 | text = "na2" 800 | intervals [19]: 801 | xmin = 5.13 802 | xmax = 5.3249 803 | text = "zhe1" 804 | intervals [20]: 805 | xmin = 5.3249 806 | xmax = 5.5519 807 | text = "shui2" 808 | intervals [21]: 809 | xmin = 5.5519 810 | xmax = 5.7041 811 | text = "guo3" 812 | intervals [22]: 813 | xmin = 5.7041 814 | xmax = 5.9355 815 | text = "nai2" 816 | intervals [23]: 817 | xmin = 5.9355 818 | xmax = 6.249 819 | text = "fen3" 820 | intervals [24]: 821 | xmin = 6.249 822 | xmax = 6.4302 823 | text = "qu4" 824 | intervals [25]: 825 | xmin = 6.4302 826 | xmax = 6.6842 827 | text = "tan4" 828 | intervals [26]: 829 | xmin = 6.6842 830 | xmax = 6.9428 831 | text = "wang4" 832 | intervals [27]: 833 | xmin = 6.9428 834 | xmax = 7.1248 835 | text = "you2" 836 | intervals [28]: 837 | xmin = 7.1248 838 | xmax = 7.3229 839 | text = "ta1" 840 | intervals [29]: 841 | xmin = 7.3229 842 | xmax = 7.4449 843 | text = "men0" 844 | intervals [30]: 845 | xmin = 7.4449 846 | xmax = 7.6857 847 | text = "zhuan3" 848 | intervals [31]: 849 | xmin = 7.6857 850 | xmax = 7.8869 851 | text = "song4" 852 | intervals [32]: 853 | xmin = 7.8869 854 | xmax = 8.1645 855 | text = "qu4" 856 | intervals [33]: 857 | xmin = 8.1645 858 | xmax = 8.3975 859 | text = "yi1" 860 | intervals [34]: 861 | xmin = 8.3975 862 | xmax = 8.6023 863 | text = "yuan4" 864 | intervals [35]: 865 | xmin = 8.6023 866 | xmax = 8.7915 867 | text = "de0" 868 | intervals [36]: 869 | xmin = 8.7915 870 | xmax = 9.0454 871 | text = "lv3" 872 | intervals [37]: 873 | xmin = 9.0454 874 | xmax = 9.265 875 | text = "ke4" 876 | intervals [38]: 877 | xmin = 9.265 878 | xmax = 9.4783 879 | text = "chan3" 880 | intervals [39]: 881 | xmin = 9.4783 882 | xmax = 9.7239 883 | text = "fu4" 884 | intervals [40]: 885 | xmin = 9.7239 886 | xmax = 10.3245 887 | text = "." 
888 | item [6]: 889 | class = "IntervalTier" 890 | name = "TRANSLATION" 891 | xmin = 0 892 | xmax = 10.3245 893 | intervals: size = 0 894 | item [7]: 895 | class = "IntervalTier" 896 | name = "LEX" 897 | xmin = 0 898 | xmax = 10.3245 899 | intervals: size = 18 900 | intervals [1]: 901 | xmin = 0.7779 902 | xmax = 1.1138 903 | text = "u" 904 | intervals [2]: 905 | xmin = 1.1138 906 | xmax = 1.5622 907 | text = "p" 908 | intervals [3]: 909 | xmin = 1.5622 910 | xmax = 2.2448 911 | text = "f" 912 | intervals [4]: 913 | xmin = 2.2448 914 | xmax = 2.9815 915 | text = "ngp" 916 | intervals [5]: 917 | xmin = 2.9815 918 | xmax = 3.5758 919 | text = "n" 920 | intervals [6]: 921 | xmin = 3.5758 922 | xmax = 4.7544 923 | text = "ngp" 924 | intervals [7]: 925 | xmin = 4.7544 926 | xmax = 5.3249 927 | text = "u" 928 | intervals [8]: 929 | xmin = 5.3249 930 | xmax = 5.7041 931 | text = "n" 932 | intervals [9]: 933 | xmin = 5.7041 934 | xmax = 6.249 935 | text = "n" 936 | intervals [10]: 937 | xmin = 6.249 938 | xmax = 6.4302 939 | text = "v" 940 | intervals [11]: 941 | xmin = 6.4302 942 | xmax = 6.9428 943 | text = "v" 944 | intervals [12]: 945 | xmin = 6.9428 946 | xmax = 7.1248 947 | text = "p" 948 | intervals [13]: 949 | xmin = 7.1248 950 | xmax = 7.4449 951 | text = "u" 952 | intervals [14]: 953 | xmin = 7.4449 954 | xmax = 7.8869 955 | text = "v" 956 | intervals [15]: 957 | xmin = 7.8869 958 | xmax = 8.1645 959 | text = "v" 960 | intervals [16]: 961 | xmin = 8.1645 962 | xmax = 8.7915 963 | text = "u" 964 | intervals [17]: 965 | xmin = 8.7915 966 | xmax = 9.265 967 | text = "ngp" 968 | intervals [18]: 969 | xmin = 9.265 970 | xmax = 10.3245 971 | text = "ngp" 972 | item [8]: 973 | class = "TextTier" 974 | name = "CLAUSE" 975 | xmin = 0 976 | xmax = 10.3245 977 | points: size = 2 978 | points [1]: 979 | time = 0.7779 980 | mark = ":FF" 981 | points [2]: 982 | time = 9.7239 983 | mark = ":DC" 984 | item [9]: 985 | class = "TextTier" 986 | name = "WORD" 987 | xmin = 0 988 | 
xmax = 10.3245 989 | points: size = 19 990 | points [1]: 991 | time = 0.7779 992 | mark = ":FF" 993 | points [2]: 994 | time = 1.1138 995 | mark = ":WB" 996 | points [3]: 997 | time = 1.5622 998 | mark = ":WB" 999 | points [4]: 1000 | time = 2.2448 1001 | mark = ":WB" 1002 | points [5]: 1003 | time = 2.9815 1004 | mark = ":WB" 1005 | points [6]: 1006 | time = 3.5758 1007 | mark = ":WB" 1008 | points [7]: 1009 | time = 4.7544 1010 | mark = ":WB" 1011 | points [8]: 1012 | time = 5.3249 1013 | mark = ":WB" 1014 | points [9]: 1015 | time = 5.7041 1016 | mark = ":WB" 1017 | points [10]: 1018 | time = 6.249 1019 | mark = ":WB" 1020 | points [11]: 1021 | time = 6.4302 1022 | mark = ":WB" 1023 | points [12]: 1024 | time = 6.9428 1025 | mark = ":WB" 1026 | points [13]: 1027 | time = 7.1248 1028 | mark = ":WB" 1029 | points [14]: 1030 | time = 7.4449 1031 | mark = ":WB" 1032 | points [15]: 1033 | time = 7.8869 1034 | mark = ":WB" 1035 | points [16]: 1036 | time = 8.1645 1037 | mark = ":WB" 1038 | points [17]: 1039 | time = 8.7915 1040 | mark = ":WB" 1041 | points [18]: 1042 | time = 9.265 1043 | mark = ":WB" 1044 | points [19]: 1045 | time = 9.7239 1046 | mark = ":WB" 1047 | item [10]: 1048 | class = "TextTier" 1049 | name = "NIV1" 1050 | xmin = 0 1051 | xmax = 10.3245 1052 | points: size = 3 1053 | points [1]: 1054 | time = 0.7779 1055 | mark = ":FF" 1056 | points [2]: 1057 | time = 4.7544 1058 | mark = ":ID" 1059 | points [3]: 1060 | time = 9.7239 1061 | mark = ":FF" 1062 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | GNU GENERAL PUBLIC LICENSE 2 | Version 3, 29 June 2007 3 | 4 | Copyright (C) 2007 Free Software Foundation, Inc. 5 | Everyone is permitted to copy and distribute verbatim copies 6 | of this license document, but changing it is not allowed. 
7 | 8 | Preamble 9 | 10 | The GNU General Public License is a free, copyleft license for 11 | software and other kinds of works. 12 | 13 | The licenses for most software and other practical works are designed 14 | to take away your freedom to share and change the works. By contrast, 15 | the GNU General Public License is intended to guarantee your freedom to 16 | share and change all versions of a program--to make sure it remains free 17 | software for all its users. We, the Free Software Foundation, use the 18 | GNU General Public License for most of our software; it applies also to 19 | any other work released this way by its authors. You can apply it to 20 | your programs, too. 21 | 22 | When we speak of free software, we are referring to freedom, not 23 | price. Our General Public Licenses are designed to make sure that you 24 | have the freedom to distribute copies of free software (and charge for 25 | them if you wish), that you receive source code or can get it if you 26 | want it, that you can change the software or use pieces of it in new 27 | free programs, and that you know you can do these things. 28 | 29 | To protect your rights, we need to prevent others from denying you 30 | these rights or asking you to surrender the rights. Therefore, you have 31 | certain responsibilities if you distribute copies of the software, or if 32 | you modify it: responsibilities to respect the freedom of others. 33 | 34 | For example, if you distribute copies of such a program, whether 35 | gratis or for a fee, you must pass on to the recipients the same 36 | freedoms that you received. You must make sure that they, too, receive 37 | or can get the source code. And you must show them these terms so they 38 | know their rights. 39 | 40 | Developers that use the GNU GPL protect your rights with two steps: 41 | (1) assert copyright on the software, and (2) offer you this License 42 | giving you legal permission to copy, distribute and/or modify it. 
43 | 44 | For the developers' and authors' protection, the GPL clearly explains 45 | that there is no warranty for this free software. For both users' and 46 | authors' sake, the GPL requires that modified versions be marked as 47 | changed, so that their problems will not be attributed erroneously to 48 | authors of previous versions. 49 | 50 | Some devices are designed to deny users access to install or run 51 | modified versions of the software inside them, although the manufacturer 52 | can do so. This is fundamentally incompatible with the aim of 53 | protecting users' freedom to change the software. The systematic 54 | pattern of such abuse occurs in the area of products for individuals to 55 | use, which is precisely where it is most unacceptable. Therefore, we 56 | have designed this version of the GPL to prohibit the practice for those 57 | products. If such problems arise substantially in other domains, we 58 | stand ready to extend this provision to those domains in future versions 59 | of the GPL, as needed to protect the freedom of users. 60 | 61 | Finally, every program is threatened constantly by software patents. 62 | States should not allow patents to restrict development and use of 63 | software on general-purpose computers, but in those that do, we wish to 64 | avoid the special danger that patents applied to a free program could 65 | make it effectively proprietary. To prevent this, the GPL assures that 66 | patents cannot be used to render the program non-free. 67 | 68 | The precise terms and conditions for copying, distribution and 69 | modification follow. 70 | 71 | TERMS AND CONDITIONS 72 | 73 | 0. Definitions. 74 | 75 | "This License" refers to version 3 of the GNU General Public License. 76 | 77 | "Copyright" also means copyright-like laws that apply to other kinds of 78 | works, such as semiconductor masks. 79 | 80 | "The Program" refers to any copyrightable work licensed under this 81 | License. Each licensee is addressed as "you". 
"Licensees" and 82 | "recipients" may be individuals or organizations. 83 | 84 | To "modify" a work means to copy from or adapt all or part of the work 85 | in a fashion requiring copyright permission, other than the making of an 86 | exact copy. The resulting work is called a "modified version" of the 87 | earlier work or a work "based on" the earlier work. 88 | 89 | A "covered work" means either the unmodified Program or a work based 90 | on the Program. 91 | 92 | To "propagate" a work means to do anything with it that, without 93 | permission, would make you directly or secondarily liable for 94 | infringement under applicable copyright law, except executing it on a 95 | computer or modifying a private copy. Propagation includes copying, 96 | distribution (with or without modification), making available to the 97 | public, and in some countries other activities as well. 98 | 99 | To "convey" a work means any kind of propagation that enables other 100 | parties to make or receive copies. Mere interaction with a user through 101 | a computer network, with no transfer of a copy, is not conveying. 102 | 103 | An interactive user interface displays "Appropriate Legal Notices" 104 | to the extent that it includes a convenient and prominently visible 105 | feature that (1) displays an appropriate copyright notice, and (2) 106 | tells the user that there is no warranty for the work (except to the 107 | extent that warranties are provided), that licensees may convey the 108 | work under this License, and how to view a copy of this License. If 109 | the interface presents a list of user commands or options, such as a 110 | menu, a prominent item in the list meets this criterion. 111 | 112 | 1. Source Code. 113 | 114 | The "source code" for a work means the preferred form of the work 115 | for making modifications to it. "Object code" means any non-source 116 | form of a work. 
117 | 118 | A "Standard Interface" means an interface that either is an official 119 | standard defined by a recognized standards body, or, in the case of 120 | interfaces specified for a particular programming language, one that 121 | is widely used among developers working in that language. 122 | 123 | The "System Libraries" of an executable work include anything, other 124 | than the work as a whole, that (a) is included in the normal form of 125 | packaging a Major Component, but which is not part of that Major 126 | Component, and (b) serves only to enable use of the work with that 127 | Major Component, or to implement a Standard Interface for which an 128 | implementation is available to the public in source code form. A 129 | "Major Component", in this context, means a major essential component 130 | (kernel, window system, and so on) of the specific operating system 131 | (if any) on which the executable work runs, or a compiler used to 132 | produce the work, or an object code interpreter used to run it. 133 | 134 | The "Corresponding Source" for a work in object code form means all 135 | the source code needed to generate, install, and (for an executable 136 | work) run the object code and to modify the work, including scripts to 137 | control those activities. However, it does not include the work's 138 | System Libraries, or general-purpose tools or generally available free 139 | programs which are used unmodified in performing those activities but 140 | which are not part of the work. For example, Corresponding Source 141 | includes interface definition files associated with source files for 142 | the work, and the source code for shared libraries and dynamically 143 | linked subprograms that the work is specifically designed to require, 144 | such as by intimate data communication or control flow between those 145 | subprograms and other parts of the work. 
146 | 147 | The Corresponding Source need not include anything that users 148 | can regenerate automatically from other parts of the Corresponding 149 | Source. 150 | 151 | The Corresponding Source for a work in source code form is that 152 | same work. 153 | 154 | 2. Basic Permissions. 155 | 156 | All rights granted under this License are granted for the term of 157 | copyright on the Program, and are irrevocable provided the stated 158 | conditions are met. This License explicitly affirms your unlimited 159 | permission to run the unmodified Program. The output from running a 160 | covered work is covered by this License only if the output, given its 161 | content, constitutes a covered work. This License acknowledges your 162 | rights of fair use or other equivalent, as provided by copyright law. 163 | 164 | You may make, run and propagate covered works that you do not 165 | convey, without conditions so long as your license otherwise remains 166 | in force. You may convey covered works to others for the sole purpose 167 | of having them make modifications exclusively for you, or provide you 168 | with facilities for running those works, provided that you comply with 169 | the terms of this License in conveying all material for which you do 170 | not control copyright. Those thus making or running the covered works 171 | for you must do so exclusively on your behalf, under your direction 172 | and control, on terms that prohibit them from making any copies of 173 | your copyrighted material outside their relationship with you. 174 | 175 | Conveying under any other circumstances is permitted solely under 176 | the conditions stated below. Sublicensing is not allowed; section 10 177 | makes it unnecessary. 178 | 179 | 3. Protecting Users' Legal Rights From Anti-Circumvention Law. 
180 | 181 | No covered work shall be deemed part of an effective technological 182 | measure under any applicable law fulfilling obligations under article 183 | 11 of the WIPO copyright treaty adopted on 20 December 1996, or 184 | similar laws prohibiting or restricting circumvention of such 185 | measures. 186 | 187 | When you convey a covered work, you waive any legal power to forbid 188 | circumvention of technological measures to the extent such circumvention 189 | is effected by exercising rights under this License with respect to 190 | the covered work, and you disclaim any intention to limit operation or 191 | modification of the work as a means of enforcing, against the work's 192 | users, your or third parties' legal rights to forbid circumvention of 193 | technological measures. 194 | 195 | 4. Conveying Verbatim Copies. 196 | 197 | You may convey verbatim copies of the Program's source code as you 198 | receive it, in any medium, provided that you conspicuously and 199 | appropriately publish on each copy an appropriate copyright notice; 200 | keep intact all notices stating that this License and any 201 | non-permissive terms added in accord with section 7 apply to the code; 202 | keep intact all notices of the absence of any warranty; and give all 203 | recipients a copy of this License along with the Program. 204 | 205 | You may charge any price or no price for each copy that you convey, 206 | and you may offer support or warranty protection for a fee. 207 | 208 | 5. Conveying Modified Source Versions. 209 | 210 | You may convey a work based on the Program, or the modifications to 211 | produce it from the Program, in the form of source code under the 212 | terms of section 4, provided that you also meet all of these conditions: 213 | 214 | a) The work must carry prominent notices stating that you modified 215 | it, and giving a relevant date. 
216 | 217 | b) The work must carry prominent notices stating that it is 218 | released under this License and any conditions added under section 219 | 7. This requirement modifies the requirement in section 4 to 220 | "keep intact all notices". 221 | 222 | c) You must license the entire work, as a whole, under this 223 | License to anyone who comes into possession of a copy. This 224 | License will therefore apply, along with any applicable section 7 225 | additional terms, to the whole of the work, and all its parts, 226 | regardless of how they are packaged. This License gives no 227 | permission to license the work in any other way, but it does not 228 | invalidate such permission if you have separately received it. 229 | 230 | d) If the work has interactive user interfaces, each must display 231 | Appropriate Legal Notices; however, if the Program has interactive 232 | interfaces that do not display Appropriate Legal Notices, your 233 | work need not make them do so. 234 | 235 | A compilation of a covered work with other separate and independent 236 | works, which are not by their nature extensions of the covered work, 237 | and which are not combined with it such as to form a larger program, 238 | in or on a volume of a storage or distribution medium, is called an 239 | "aggregate" if the compilation and its resulting copyright are not 240 | used to limit the access or legal rights of the compilation's users 241 | beyond what the individual works permit. Inclusion of a covered work 242 | in an aggregate does not cause this License to apply to the other 243 | parts of the aggregate. 244 | 245 | 6. Conveying Non-Source Forms. 
246 | 247 | You may convey a covered work in object code form under the terms 248 | of sections 4 and 5, provided that you also convey the 249 | machine-readable Corresponding Source under the terms of this License, 250 | in one of these ways: 251 | 252 | a) Convey the object code in, or embodied in, a physical product 253 | (including a physical distribution medium), accompanied by the 254 | Corresponding Source fixed on a durable physical medium 255 | customarily used for software interchange. 256 | 257 | b) Convey the object code in, or embodied in, a physical product 258 | (including a physical distribution medium), accompanied by a 259 | written offer, valid for at least three years and valid for as 260 | long as you offer spare parts or customer support for that product 261 | model, to give anyone who possesses the object code either (1) a 262 | copy of the Corresponding Source for all the software in the 263 | product that is covered by this License, on a durable physical 264 | medium customarily used for software interchange, for a price no 265 | more than your reasonable cost of physically performing this 266 | conveying of source, or (2) access to copy the 267 | Corresponding Source from a network server at no charge. 268 | 269 | c) Convey individual copies of the object code with a copy of the 270 | written offer to provide the Corresponding Source. This 271 | alternative is allowed only occasionally and noncommercially, and 272 | only if you received the object code with such an offer, in accord 273 | with subsection 6b. 274 | 275 | d) Convey the object code by offering access from a designated 276 | place (gratis or for a charge), and offer equivalent access to the 277 | Corresponding Source in the same way through the same place at no 278 | further charge. You need not require recipients to copy the 279 | Corresponding Source along with the object code. 
If the place to 280 | copy the object code is a network server, the Corresponding Source 281 | may be on a different server (operated by you or a third party) 282 | that supports equivalent copying facilities, provided you maintain 283 | clear directions next to the object code saying where to find the 284 | Corresponding Source. Regardless of what server hosts the 285 | Corresponding Source, you remain obligated to ensure that it is 286 | available for as long as needed to satisfy these requirements. 287 | 288 | e) Convey the object code using peer-to-peer transmission, provided 289 | you inform other peers where the object code and Corresponding 290 | Source of the work are being offered to the general public at no 291 | charge under subsection 6d. 292 | 293 | A separable portion of the object code, whose source code is excluded 294 | from the Corresponding Source as a System Library, need not be 295 | included in conveying the object code work. 296 | 297 | A "User Product" is either (1) a "consumer product", which means any 298 | tangible personal property which is normally used for personal, family, 299 | or household purposes, or (2) anything designed or sold for incorporation 300 | into a dwelling. In determining whether a product is a consumer product, 301 | doubtful cases shall be resolved in favor of coverage. For a particular 302 | product received by a particular user, "normally used" refers to a 303 | typical or common use of that class of product, regardless of the status 304 | of the particular user or of the way in which the particular user 305 | actually uses, or expects or is expected to use, the product. A product 306 | is a consumer product regardless of whether the product has substantial 307 | commercial, industrial or non-consumer uses, unless such uses represent 308 | the only significant mode of use of the product. 
309 | 310 | "Installation Information" for a User Product means any methods, 311 | procedures, authorization keys, or other information required to install 312 | and execute modified versions of a covered work in that User Product from 313 | a modified version of its Corresponding Source. The information must 314 | suffice to ensure that the continued functioning of the modified object 315 | code is in no case prevented or interfered with solely because 316 | modification has been made. 317 | 318 | If you convey an object code work under this section in, or with, or 319 | specifically for use in, a User Product, and the conveying occurs as 320 | part of a transaction in which the right of possession and use of the 321 | User Product is transferred to the recipient in perpetuity or for a 322 | fixed term (regardless of how the transaction is characterized), the 323 | Corresponding Source conveyed under this section must be accompanied 324 | by the Installation Information. But this requirement does not apply 325 | if neither you nor any third party retains the ability to install 326 | modified object code on the User Product (for example, the work has 327 | been installed in ROM). 328 | 329 | The requirement to provide Installation Information does not include a 330 | requirement to continue to provide support service, warranty, or updates 331 | for a work that has been modified or installed by the recipient, or for 332 | the User Product in which it has been modified or installed. Access to a 333 | network may be denied when the modification itself materially and 334 | adversely affects the operation of the network or violates the rules and 335 | protocols for communication across the network. 
336 | 337 | Corresponding Source conveyed, and Installation Information provided, 338 | in accord with this section must be in a format that is publicly 339 | documented (and with an implementation available to the public in 340 | source code form), and must require no special password or key for 341 | unpacking, reading or copying. 342 | 343 | 7. Additional Terms. 344 | 345 | "Additional permissions" are terms that supplement the terms of this 346 | License by making exceptions from one or more of its conditions. 347 | Additional permissions that are applicable to the entire Program shall 348 | be treated as though they were included in this License, to the extent 349 | that they are valid under applicable law. If additional permissions 350 | apply only to part of the Program, that part may be used separately 351 | under those permissions, but the entire Program remains governed by 352 | this License without regard to the additional permissions. 353 | 354 | When you convey a copy of a covered work, you may at your option 355 | remove any additional permissions from that copy, or from any part of 356 | it. (Additional permissions may be written to require their own 357 | removal in certain cases when you modify the work.) You may place 358 | additional permissions on material, added by you to a covered work, 359 | for which you have or can give appropriate copyright permission. 
360 | 361 | Notwithstanding any other provision of this License, for material you 362 | add to a covered work, you may (if authorized by the copyright holders of 363 | that material) supplement the terms of this License with terms: 364 | 365 | a) Disclaiming warranty or limiting liability differently from the 366 | terms of sections 15 and 16 of this License; or 367 | 368 | b) Requiring preservation of specified reasonable legal notices or 369 | author attributions in that material or in the Appropriate Legal 370 | Notices displayed by works containing it; or 371 | 372 | c) Prohibiting misrepresentation of the origin of that material, or 373 | requiring that modified versions of such material be marked in 374 | reasonable ways as different from the original version; or 375 | 376 | d) Limiting the use for publicity purposes of names of licensors or 377 | authors of the material; or 378 | 379 | e) Declining to grant rights under trademark law for use of some 380 | trade names, trademarks, or service marks; or 381 | 382 | f) Requiring indemnification of licensors and authors of that 383 | material by anyone who conveys the material (or modified versions of 384 | it) with contractual assumptions of liability to the recipient, for 385 | any liability that these contractual assumptions directly impose on 386 | those licensors and authors. 387 | 388 | All other non-permissive additional terms are considered "further 389 | restrictions" within the meaning of section 10. If the Program as you 390 | received it, or any part of it, contains a notice stating that it is 391 | governed by this License along with a term that is a further 392 | restriction, you may remove that term. 
If a license document contains 393 | a further restriction but permits relicensing or conveying under this 394 | License, you may add to a covered work material governed by the terms 395 | of that license document, provided that the further restriction does 396 | not survive such relicensing or conveying. 397 | 398 | If you add terms to a covered work in accord with this section, you 399 | must place, in the relevant source files, a statement of the 400 | additional terms that apply to those files, or a notice indicating 401 | where to find the applicable terms. 402 | 403 | Additional terms, permissive or non-permissive, may be stated in the 404 | form of a separately written license, or stated as exceptions; 405 | the above requirements apply either way. 406 | 407 | 8. Termination. 408 | 409 | You may not propagate or modify a covered work except as expressly 410 | provided under this License. Any attempt otherwise to propagate or 411 | modify it is void, and will automatically terminate your rights under 412 | this License (including any patent licenses granted under the third 413 | paragraph of section 11). 414 | 415 | However, if you cease all violation of this License, then your 416 | license from a particular copyright holder is reinstated (a) 417 | provisionally, unless and until the copyright holder explicitly and 418 | finally terminates your license, and (b) permanently, if the copyright 419 | holder fails to notify you of the violation by some reasonable means 420 | prior to 60 days after the cessation. 421 | 422 | Moreover, your license from a particular copyright holder is 423 | reinstated permanently if the copyright holder notifies you of the 424 | violation by some reasonable means, this is the first time you have 425 | received notice of violation of this License (for any work) from that 426 | copyright holder, and you cure the violation prior to 30 days after 427 | your receipt of the notice. 
428 | 429 | Termination of your rights under this section does not terminate the 430 | licenses of parties who have received copies or rights from you under 431 | this License. If your rights have been terminated and not permanently 432 | reinstated, you do not qualify to receive new licenses for the same 433 | material under section 10. 434 | 435 | 9. Acceptance Not Required for Having Copies. 436 | 437 | You are not required to accept this License in order to receive or 438 | run a copy of the Program. Ancillary propagation of a covered work 439 | occurring solely as a consequence of using peer-to-peer transmission 440 | to receive a copy likewise does not require acceptance. However, 441 | nothing other than this License grants you permission to propagate or 442 | modify any covered work. These actions infringe copyright if you do 443 | not accept this License. Therefore, by modifying or propagating a 444 | covered work, you indicate your acceptance of this License to do so. 445 | 446 | 10. Automatic Licensing of Downstream Recipients. 447 | 448 | Each time you convey a covered work, the recipient automatically 449 | receives a license from the original licensors, to run, modify and 450 | propagate that work, subject to this License. You are not responsible 451 | for enforcing compliance by third parties with this License. 452 | 453 | An "entity transaction" is a transaction transferring control of an 454 | organization, or substantially all assets of one, or subdividing an 455 | organization, or merging organizations. 
If propagation of a covered 456 | work results from an entity transaction, each party to that 457 | transaction who receives a copy of the work also receives whatever 458 | licenses to the work the party's predecessor in interest had or could 459 | give under the previous paragraph, plus a right to possession of the 460 | Corresponding Source of the work from the predecessor in interest, if 461 | the predecessor has it or can get it with reasonable efforts. 462 | 463 | You may not impose any further restrictions on the exercise of the 464 | rights granted or affirmed under this License. For example, you may 465 | not impose a license fee, royalty, or other charge for exercise of 466 | rights granted under this License, and you may not initiate litigation 467 | (including a cross-claim or counterclaim in a lawsuit) alleging that 468 | any patent claim is infringed by making, using, selling, offering for 469 | sale, or importing the Program or any portion of it. 470 | 471 | 11. Patents. 472 | 473 | A "contributor" is a copyright holder who authorizes use under this 474 | License of the Program or a work on which the Program is based. The 475 | work thus licensed is called the contributor's "contributor version". 476 | 477 | A contributor's "essential patent claims" are all patent claims 478 | owned or controlled by the contributor, whether already acquired or 479 | hereafter acquired, that would be infringed by some manner, permitted 480 | by this License, of making, using, or selling its contributor version, 481 | but do not include claims that would be infringed only as a 482 | consequence of further modification of the contributor version. For 483 | purposes of this definition, "control" includes the right to grant 484 | patent sublicenses in a manner consistent with the requirements of 485 | this License. 
486 | 487 | Each contributor grants you a non-exclusive, worldwide, royalty-free 488 | patent license under the contributor's essential patent claims, to 489 | make, use, sell, offer for sale, import and otherwise run, modify and 490 | propagate the contents of its contributor version. 491 | 492 | In the following three paragraphs, a "patent license" is any express 493 | agreement or commitment, however denominated, not to enforce a patent 494 | (such as an express permission to practice a patent or covenant not to 495 | sue for patent infringement). To "grant" such a patent license to a 496 | party means to make such an agreement or commitment not to enforce a 497 | patent against the party. 498 | 499 | If you convey a covered work, knowingly relying on a patent license, 500 | and the Corresponding Source of the work is not available for anyone 501 | to copy, free of charge and under the terms of this License, through a 502 | publicly available network server or other readily accessible means, 503 | then you must either (1) cause the Corresponding Source to be so 504 | available, or (2) arrange to deprive yourself of the benefit of the 505 | patent license for this particular work, or (3) arrange, in a manner 506 | consistent with the requirements of this License, to extend the patent 507 | license to downstream recipients. "Knowingly relying" means you have 508 | actual knowledge that, but for the patent license, your conveying the 509 | covered work in a country, or your recipient's use of the covered work 510 | in a country, would infringe one or more identifiable patents in that 511 | country that you have reason to believe are valid. 
512 | 513 | If, pursuant to or in connection with a single transaction or 514 | arrangement, you convey, or propagate by procuring conveyance of, a 515 | covered work, and grant a patent license to some of the parties 516 | receiving the covered work authorizing them to use, propagate, modify 517 | or convey a specific copy of the covered work, then the patent license 518 | you grant is automatically extended to all recipients of the covered 519 | work and works based on it. 520 | 521 | A patent license is "discriminatory" if it does not include within 522 | the scope of its coverage, prohibits the exercise of, or is 523 | conditioned on the non-exercise of one or more of the rights that are 524 | specifically granted under this License. You may not convey a covered 525 | work if you are a party to an arrangement with a third party that is 526 | in the business of distributing software, under which you make payment 527 | to the third party based on the extent of your activity of conveying 528 | the work, and under which the third party grants, to any of the 529 | parties who would receive the covered work from you, a discriminatory 530 | patent license (a) in connection with copies of the covered work 531 | conveyed by you (or copies made from those copies), or (b) primarily 532 | for and in connection with specific products or compilations that 533 | contain the covered work, unless you entered into that arrangement, 534 | or that patent license was granted, prior to 28 March 2007. 535 | 536 | Nothing in this License shall be construed as excluding or limiting 537 | any implied license or other defenses to infringement that may 538 | otherwise be available to you under applicable patent law. 539 | 540 | 12. No Surrender of Others' Freedom. 541 | 542 | If conditions are imposed on you (whether by court order, agreement or 543 | otherwise) that contradict the conditions of this License, they do not 544 | excuse you from the conditions of this License. 
If you cannot convey a 545 | covered work so as to satisfy simultaneously your obligations under this 546 | License and any other pertinent obligations, then as a consequence you may 547 | not convey it at all. For example, if you agree to terms that obligate you 548 | to collect a royalty for further conveying from those to whom you convey 549 | the Program, the only way you could satisfy both those terms and this 550 | License would be to refrain entirely from conveying the Program. 551 | 552 | 13. Use with the GNU Affero General Public License. 553 | 554 | Notwithstanding any other provision of this License, you have 555 | permission to link or combine any covered work with a work licensed 556 | under version 3 of the GNU Affero General Public License into a single 557 | combined work, and to convey the resulting work. The terms of this 558 | License will continue to apply to the part which is the covered work, 559 | but the special requirements of the GNU Affero General Public License, 560 | section 13, concerning interaction through a network will apply to the 561 | combination as such. 562 | 563 | 14. Revised Versions of this License. 564 | 565 | The Free Software Foundation may publish revised and/or new versions of 566 | the GNU General Public License from time to time. Such new versions will 567 | be similar in spirit to the present version, but may differ in detail to 568 | address new problems or concerns. 569 | 570 | Each version is given a distinguishing version number. If the 571 | Program specifies that a certain numbered version of the GNU General 572 | Public License "or any later version" applies to it, you have the 573 | option of following the terms and conditions either of that numbered 574 | version or of any later version published by the Free Software 575 | Foundation. If the Program does not specify a version number of the 576 | GNU General Public License, you may choose any version ever published 577 | by the Free Software Foundation. 
578 | 579 | If the Program specifies that a proxy can decide which future 580 | versions of the GNU General Public License can be used, that proxy's 581 | public statement of acceptance of a version permanently authorizes you 582 | to choose that version for the Program. 583 | 584 | Later license versions may give you additional or different 585 | permissions. However, no additional obligations are imposed on any 586 | author or copyright holder as a result of your choosing to follow a 587 | later version. 588 | 589 | 15. Disclaimer of Warranty. 590 | 591 | THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY 592 | APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT 593 | HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY 594 | OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, 595 | THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR 596 | PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM 597 | IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF 598 | ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 599 | 600 | 16. Limitation of Liability. 601 | 602 | IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING 603 | WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS 604 | THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY 605 | GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE 606 | USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF 607 | DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD 608 | PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), 609 | EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF 610 | SUCH DAMAGES. 611 | 612 | 17. Interpretation of Sections 15 and 16. 
613 | 614 | If the disclaimer of warranty and limitation of liability provided 615 | above cannot be given local legal effect according to their terms, 616 | reviewing courts shall apply local law that most closely approximates 617 | an absolute waiver of all civil liability in connection with the 618 | Program, unless a warranty or assumption of liability accompanies a 619 | copy of the Program in return for a fee. 620 | 621 | END OF TERMS AND CONDITIONS 622 | 623 | How to Apply These Terms to Your New Programs 624 | 625 | If you develop a new program, and you want it to be of the greatest 626 | possible use to the public, the best way to achieve this is to make it 627 | free software which everyone can redistribute and change under these terms. 628 | 629 | To do so, attach the following notices to the program. It is safest 630 | to attach them to the start of each source file to most effectively 631 | state the exclusion of warranty; and each file should have at least 632 | the "copyright" line and a pointer to where the full notice is found. 633 | 634 | 635 | Copyright (C) 636 | 637 | This program is free software: you can redistribute it and/or modify 638 | it under the terms of the GNU General Public License as published by 639 | the Free Software Foundation, either version 3 of the License, or 640 | (at your option) any later version. 641 | 642 | This program is distributed in the hope that it will be useful, 643 | but WITHOUT ANY WARRANTY; without even the implied warranty of 644 | MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the 645 | GNU General Public License for more details. 646 | 647 | You should have received a copy of the GNU General Public License 648 | along with this program. If not, see . 649 | 650 | Also add information on how to contact you by electronic and paper mail. 
651 | 652 | If the program does terminal interaction, make it output a short 653 | notice like this when it starts in an interactive mode: 654 | 655 | Copyright (C) 656 | This program comes with ABSOLUTELY NO WARRANTY; for details type `show w'. 657 | This is free software, and you are welcome to redistribute it 658 | under certain conditions; type `show c' for details. 659 | 660 | The hypothetical commands `show w' and `show c' should show the appropriate 661 | parts of the General Public License. Of course, your program's commands 662 | might be different; for a GUI interface, you would use an "about box". 663 | 664 | You should also get your employer (if you work as a programmer) or school, 665 | if any, to sign a "copyright disclaimer" for the program, if necessary. 666 | For more information on this, and how to apply and follow the GNU GPL, see 667 | . 668 | 669 | The GNU General Public License does not permit incorporating your program 670 | into proprietary programs. If your program is a subroutine library, you 671 | may consider it more useful to permit linking proprietary applications with 672 | the library. If this is what you want to do, use the GNU Lesser General 673 | Public License instead of this License. But first, please read 674 | . 675 | --------------------------------------------------------------------------------