├── analyzes_vowels_extracts_f0_f1_f2_f3_f4_int_dur.praat
├── .gitignore
├── README.md
├── writeF1F2intier.praat
└── extracts_whole_formant.praat

/analyzes_vowels_extracts_f0_f1_f2_f3_f4_int_dur.praat:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/wendyelviragarcia/vowels/HEAD/analyzes_vowels_extracts_f0_f1_f2_f3_f4_int_dur.praat
--------------------------------------------------------------------------------

/.gitignore:
--------------------------------------------------------------------------------
1 | ## Ignore for R code
2 |
3 | # Ignore OS generated files #
4 | ######################
5 | .DS_Store
6 | .DS_Store?
7 | ._*
8 | .Spotlight-V100
9 | .Trashes
10 | ehthumbs.db
11 | Thumbs.db
12 |
13 | ### R ###
14 | # History files
15 | .Rhistory
16 | .Rapp.history
17 |
18 | # Session Data files
19 | .RData
20 | .RDataTmp
21 |
22 | # User-specific files
23 | .Ruserdata
24 |
--------------------------------------------------------------------------------

/README.md:
--------------------------------------------------------------------------------
1 | # Vowel scripts
2 | This folder contains three scripts:
3 | 1) Goes through all the files in a folder (or all the folders and files in a folder) and writes formant, duration, intensity and F0 information to a txt file. It extracts either the mid-point value or the mean of the middle 50% of the vowel. The maximum formant is set automatically to 5000 or 5500 Hz depending on the F0 of the file; it expects a file that contains at least a sentence.
4 |
5 | 2) Extracts the whole formant (formant track) of labelled intervals that contain a vowel. At each time step it extracts the formant values, intensity, F0 and the duration of the interval. It can be time-normalized (30 points per segment) or use the same time step for the whole folder. The time-normalized version is useful for analyzing formant tracks with GAMMs or similar.
6 |
7 | 3) Writes F1 and F2 in a TextGrid to ease manual correction. You will need a script later on that extracts the labels.
8 |
9 | ## REQUIREMENTS [INPUT]
10 | A sound and a TextGrid with THE SAME filename and without spaces in the filename. For example this_is_my_sentence.wav and this_is_my_sentence.TextGrid
11 | The format of the TextGrid must be: tier 1 with an interval for each sound. The script will analyse only intervals whose labels contain one of these symbols: a, e, i, o, u, ɪ, ɛ, æ, ɑ, ɔ, ʊ, ʌ, ɝ, AEIOU@
12 |
13 |
14 | ## INSTRUCTIONS
15 | 1. Open the script (Open/Read from file...), click Run in the upper menu and Run again.
16 | 2. Set the parameters.
17 | a) Folder where the files you want to analyse are
18 | b) Name of the txt where the results will be saved
19 | c) Data to optimise the formant analysis
20 | d) Data to optimise the pitch (F0) detection
21 |
22 | ## OUTPUT
23 | The output is a tab-separated txt file (it can be dragged to Excel; beware that the decimal separator is ".") with the following information in columns. Depending on the script you are using, you will find one interval per row or one sample point per row.
24 | a) file name
25 | b) number of the interval
26 | c) label of the interval in the tier of the analysis: vowel or sentence
27 | d) F0
28 | e) F1
29 | f) F2
30 | g) F3
31 | h) F4
32 | i) Duration of the vowel
33 | j) Intensity of the vowel at its mid-point, or mean intensity in the interval.
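
As a rough illustration of the OUTPUT format described above, the following Python sketch reads a tab-separated results file with the standard `csv` module. The helper names (`read_results`, `to_number`), the column names in the sample and the sample values are all assumptions made for the example; check the header row of your own results file. It also assumes that a measurement Praat could not compute appears as non-numeric text such as `--undefined--`.

```python
import csv
import io


def read_results(stream):
    """Read a tab-separated results file into a list of dicts (one per row)."""
    reader = csv.DictReader(stream, delimiter="\t")
    return list(reader)


def to_number(value):
    """Convert a cell to float; return None when the cell is not numeric."""
    try:
        return float(value)
    except ValueError:
        return None


# Hypothetical sample with invented values, standing in for a real results file.
sample = io.StringIO(
    "fileName\tnInterval\tlabel\tF1\tF2\n"
    "this_is_my_sentence\t3\ta\t712\t1290\n"
    "this_is_my_sentence\t7\ti\t--undefined--\t2240\n"
)

rows = read_results(sample)
f1_values = [to_number(row["F1"]) for row in rows]
print(f1_values)
```

For a real file, pass an open file object instead of the `io.StringIO` sample, for example `open(path, newline="", encoding=...)`. The right encoding depends on Praat's text writing preferences, so it is left open here.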
34 |
35 | ## CREDITS
36 | (c) Wendy Elvira García [Contact](https://www.ub.edu/phoneticslaboratory/sites/wendyelvira/contact.html)
37 | [Laboratori de Fonètica (Universitat de Barcelona)](https://www.ub.edu/phoneticslaboratory)
38 |
--------------------------------------------------------------------------------

/writeF1F2intier.praat:
--------------------------------------------------------------------------------
1 |
2 | ##################################
3 | # F1 and F2 in new tiers v1 (2023)
4 | # This script goes through all the files in a folder and adds F1 and F2 tiers to their TextGrid
5 | #
6 | # REQUIREMENTS [INPUT]
7 | # A sound and a TextGrid with THE SAME filename and without spaces in the filename. For example this_is_my_sentence.wav and this_is_my_sentence.TextGrid
8 | # The format of the TextGrid must be: tier1 (word, sentence or whatever you like) for sounds; tier2 (interval) for vowels.
9 | # The script will write F1/F2 for all the intervals that contain a vowel symbol (listed from line 88 onward).
10 |
11 | # INSTRUCTIONS
12 | # 1. Open the script (Open/Read from file...), click Run in the upper menu and Run again.
13 | # 2. Set the parameters.
14 | # a) Folder where the files you want to analyse are
15 | # b) Data to optimise the formant analysis
16 | #
17 | # OUTPUT
18 | # The output is a modified TextGrid with two new point tiers with the F1 and F2 values. ALERT! It will modify your TextGrids, so make a copy first.
19 | #
20 | #
21 | # (c) Wendy Elvira García (2024) wen dy el vi r a g a r c ia @ g m a i l. c o m
22 | # Laboratori de Fonètica (Universitat de Barcelona) https://www.ub.edu/phoneticslaboratory/
23 | #
24 | #################################
25 |
26 | ######### FORM ###############
27 |
28 | form Pausas vowelFormants
29 |     sentence Folder /Users/weg/Library/CloudStorage/OneDrive-UniversitatdeBarcelona/_hacerahora/APP_Marcelo/0_corpus_repeticion/
30 |     comment _
31 |     comment Data formantic analysis
32 |     positive Time_step 0.01
33 |     integer Maximum_number_of_formants 5
34 |     positive Maximum_formant_(Hz) 5500
35 |     positive Window_length_(s) 0.025
36 |     real Preemphasis_from_(Hz) 50
37 |     comment _
38 |     comment Pitch analysis data
39 |     integer pitchFloor 75
40 |     integer pitchCeiling 600
41 | endform
42 |
43 | #################################
44 | #################################
45 |
46 | # variables: number of the interval tier that is searched for vowel labels
47 | tier = 3
48 |
49 | #checks whether the file exists
50 |
51 | # index files for loop
52 | # (wav and mp3 files are listed separately and then appended into one list)
53 | wav = Create Strings as file list: "myList", folder$ + "/" + "*.wav"
54 | mp3 = Create Strings as file list: "myList", folder$ + "/" + "*.mp3"
55 |
56 | selectObject: wav, mp3
57 | myList = Append
58 |
59 |
60 | nFiles = Get number of strings
61 |
62 | #loops all files in folder
63 | for file to nFiles
64 |     selectObject: myList
65 |     nameFile$ = Get string: file
66 |     mySound = Read from file: folder$ + "/" + nameFile$
67 |     base$ = selected$("Sound")
68 |     myTextGrid = Read from file: folder$ + "/" + base$ + ".TextGrid"
69 |     #base name
70 |     myTextGrid$ = selected$("TextGrid")
71 |
72 |
73 |     nOfIntervals = Get number of intervals: tier
74 |     Convert to Unicode
75 |
76 |     Insert point tier: 4, "F1"
77 |     Insert point tier: 5, "F2"
78 |     selectObject: mySound
79 |     myFormant = To Formant (burg): time_step, maximum_number_of_formants, maximum_formant, window_length, preemphasis_from
80 |
81 |     #loops intervals
82 |     nInterval = 1
83 |     for nInterval from 1 to nOfIntervals
84 |         selectObject: myTextGrid
85 |         labelOfInterval$ = Get label of interval: tier, nInterval
86 |
87 |         #perform actions only for vowels
88 |         if index(labelOfInterval$, "a") <> 0 or
89 |         ... index(labelOfInterval$, "e") <> 0 or
90 |         ... index(labelOfInterval$, "ə") <> 0 or
91 |         ... index(labelOfInterval$, "i") <> 0 or
92 |         ... index(labelOfInterval$, "o") <> 0 or
93 |         ... index(labelOfInterval$, "u") <> 0 or
94 |         ... index(labelOfInterval$, "ɪ") <> 0 or
95 |         ... index(labelOfInterval$, "ɛ") <> 0 or
96 |         ... index(labelOfInterval$, "æ") <> 0 or
97 |         ... index(labelOfInterval$, "ɑ") <> 0 or
98 |         ... index(labelOfInterval$, "ɔ") <> 0 or
99 |         ... index(labelOfInterval$, "ʊ") <> 0 or
100 |         ... index(labelOfInterval$, "ʌ") <> 0 or
101 |         ... index(labelOfInterval$, "ɝ") <> 0
102 |
103 |             #Gets time of the interval
104 |             endPoint = Get end point: tier, nInterval
105 |             startPoint = Get starting point: tier, nInterval
106 |             durInterval = endPoint - startPoint
107 |             midInterval = startPoint + (durInterval/2)
108 |
109 |
110 |
111 |             #look for formants
112 |             selectObject: myFormant
113 |
114 |
115 |             f1 = Get value at time: 1, midInterval, "Hertz", "Linear"
116 |             f2 = Get value at time: 2, midInterval, "Hertz", "Linear"
117 |
118 |             f1$ = fixed$(f1, 0)
119 |             f2$ = fixed$(f2, 0)
120 |
121 |
122 |             selectObject: myTextGrid
123 |             Insert point: 4, midInterval, f1$
124 |             Insert point: 5, midInterval, f2$
125 |
126 |             # Save the textgrid
127 |             Save as text file: folder$ + "/" + base$ + ".TextGrid"
128 |
129 |
130 |
131 |
132 |
133 |         endif
134 |         #close interval loop
135 |
136 |     endfor
137 |     removeObject: myFormant
138 |
139 |     #close file loop (the Sound is removed too, so objects do not pile up)
140 |     removeObject: myTextGrid, mySound
141 | endfor
142 | removeObject: myList, wav, mp3
143 | echo Done.
144 |
145 |
--------------------------------------------------------------------------------

/extracts_whole_formant.praat:
--------------------------------------------------------------------------------
1 |
2 | ##################################
3 | # Formant track (time-normalized) (March 2025)
4 | # This script goes through all the files in a folder and writes to a txt file the formant tracks of every interval that contains a vowel symbol.
5 | #
6 | # REQUIREMENTS [INPUT]
7 | # A sound and a TextGrid with THE SAME filename and without spaces in the filename. For example this_is_my_sentence.wav and this_is_my_sentence.TextGrid
8 | # The format of the TextGrid must be: a tier with an interval for each sound. The script will analyse only intervals whose labels contain one of these symbols:
9 | # a, e, i, o, u, ɪ, ɛ, æ, ɑ, ɔ, ʊ, ʌ, ɝ
10 |
11 | #
12 | # INSTRUCTIONS
13 | # 1. Open the script (Open/Read from file...), click Run in the upper menu and Run again.
14 | # 2. Set the parameters.
15 | # a) Folder where the files you want to analyse are
16 | # b) Name of the txt where the results will be saved
17 | # c) Data to optimise the formant analysis
18 | #
19 | # OUTPUT
20 | # The output is a tab-separated txt file (can be dragged to Excel) with the following information in columns.
21 | # a) file name
22 | # b) number of the interval in the TextGrid
23 | # c) label of the interval in the tier of the analysis: vowel
24 | # d) number of the point in the track (this is the normalized time if you have chosen the automatic time step)
25 | # e) time at which the data is measured
26 | # f) F0
27 | # g) F1
28 | # h) F2
29 | # i) F3
30 | # j) F4
31 | # k) durIntervalms
32 | # l) intensity
33 | #
34 | #
35 | # (c) Wendy Elvira García (2025) https://www.ub.edu/phoneticslaboratory/sites/wendyelvira/
36 | # Laboratori de Fonètica (Universitat de Barcelona)
37 | #
38 | #################################
39 |
40 | ######### FORM ###############
41 |
42 |
43 | form: "vowelFormants"
44 |     sentence: "Folder", "C:\Users\labfonub99\Desktop\formant"
45 |     comment: "the results file will be saved in the same folder your wav + TextGrids are"
46 |     sentence: "txtName", "f_track"
47 |     comment: "In which tier do you have the sound-by-sound segmentation with your vowels labelled?"
48 |     natural: "tier", "1"
49 |     comment: "-"
50 |
51 |     comment: "You can set your own time-step or use the option for having normalized time (30 points by sound),"
52 |     comment: "you will also have the real original time in the results"
53 |     choice: "time_step_type", 2
54 |         option: "Manual"
55 |         option: "Automatic_for_30_values_per_segment"
56 |     comment: "Time step manual only used if set to manual"
57 |     positive: "Time_step_manual", "0.02"
58 |
59 |     comment: "You can change the Maximum formant to 5000 if you are working with deep voices (usually male)"
60 |     integer: "Maximum_number_of_formants", "5"
61 |     positive: "Maximum_formant", "5500"
62 |
63 |
64 | endform
65 |
66 | ######### PREDEFINED VARIABLES FOR THE ANALYSIS ###############
67 |
68 | window_length = 0.025
69 | preemphasis_from = 50
70 |
71 | # Pitch analysis data
72 | pitchFloor = 75
73 | pitchCeiling = 600
74 |
75 | #################################
76 |
77 | #checks whether the file exists
78 | if fileReadable(folder$ + "/" + txtName$ + ".txt") = 1
79 |     pause The file already exists. If you click continue it will be overwritten.
80 | endif
81 | echo 'folder$'
82 |
83 | #writes the header row of the output
84 | writeFileLine: folder$ + "/" + txtName$ + ".txt", "fileName", tab$, "nInterval", tab$, "Label_interval", tab$, "n_point", tab$, "original_time", tab$, "F0", tab$, "F1", tab$, "F2", tab$, "F3", tab$, "F4", tab$, "duration", tab$, "intensity"
85 |
86 |
87 | # index files for loop
88 | myList = Create Strings as file list: "myList", folder$ + "/" + "*.wav"
89 | nFiles = Get number of strings
90 |
91 | #loops all files in folder
92 | for file to nFiles
93 |     selectObject: myList
94 |     nameFile$ = Get string: file
95 |     base$ = nameFile$ - ".wav"
96 |     myTextGrid = Read from file: folder$ + "/" + base$ + ".TextGrid"
97 |     #base name
98 |     myTextGrid$ = selected$("TextGrid")
99 |     mySound = Read from file: folder$ + "/" + myTextGrid$ + ".wav"
100 |     selectObject: myTextGrid
101 |     nOfIntervals = Get number of intervals: tier
102 |     Convert to Unicode
103 |
104 |
105 |     #F0
106 |     selectObject: mySound
107 |     myPitch = To Pitch: 0, pitchFloor, pitchCeiling
108 |     # intensity
109 |     selectObject: mySound
110 |     myIntensity = To Intensity: 100, 0, "yes"
111 |
112 |     #formant object created once per file with a 0.01 s time step; values at the requested time step are queried from it later (this speeds up the analysis)
113 |     selectObject: mySound
114 |     myFormant = To Formant (burg): 0.01, maximum_number_of_formants, maximum_formant, window_length, preemphasis_from
115 |
116 |     #loops intervals
117 |     nInterval = 1
118 |     for nInterval to nOfIntervals
119 |         selectObject: myTextGrid
120 |         labelOfInterval$ = Get label of interval: tier, nInterval
121 |
122 |         #perform actions only for vowels
123 |         if index(labelOfInterval$, "a") <> 0 or
124 |         ... index(labelOfInterval$, "e") <> 0 or
125 |         ... index(labelOfInterval$, "i") <> 0 or
126 |         ... index(labelOfInterval$, "o") <> 0 or
127 |         ... index(labelOfInterval$, "u") <> 0 or
128 |         ... index(labelOfInterval$, "ɪ") <> 0 or
129 |         ... index(labelOfInterval$, "ɛ") <> 0 or
130 |         ... index(labelOfInterval$, "æ") <> 0 or
131 |         ... index(labelOfInterval$, "ɑ") <> 0 or
132 |         ... index(labelOfInterval$, "ɔ") <> 0 or
133 |         ... index(labelOfInterval$, "ʊ") <> 0 or
134 |         ... index(labelOfInterval$, "ʌ") <> 0 or
135 |         ... index(labelOfInterval$, "ɝ") <> 0
136 |
137 |             #Gets time of the interval
138 |             endPoint = Get end point: tier, nInterval
139 |             startPoint = Get starting point: tier, nInterval
140 |             durInterval = endPoint - startPoint
141 |             midInterval = startPoint + (durInterval/2)
142 |             durIntervalms = durInterval*1000
143 |             #fix decimals
144 |             durIntervalms$ = fixed$(durIntervalms, 0)
145 |             #change decimal marker for commas
146 |             #durIntervalms$ = replace$ (durIntervalms$, ".", ",", 1)
147 |
148 |             selectObject: myPitch
149 |             f0 = Get value at time: midInterval, "Hertz", "Linear"
150 |             f0$ = fixed$(f0, 0)
151 |
152 |
153 |             if time_step_type = 1
154 |                 time_step = time_step_manual
155 |             else
156 |                 time_step = durInterval / 30
157 |             endif
158 |
159 |             nPoints = durInterval / time_step
160 |
161 |             #look for formants
162 |
163 |             myTime = 0
164 |             for point to nPoints
165 |                 selectObject: myFormant
166 |                 time = startPoint + myTime
167 |                 f1 = Get value at time: 1, time, "Hertz", "Linear"
168 |                 f2 = Get value at time: 2, time, "Hertz", "Linear"
169 |                 f3 = Get value at time: 3, time, "Hertz", "Linear"
170 |                 f4 = Get value at time: 4, time, "Hertz", "Linear"
171 |                 f1$ = fixed$(f1, 0)
172 |                 f2$ = fixed$(f2, 0)
173 |                 f3$ = fixed$(f3, 0)
174 |                 f4$ = fixed$(f4, 0)
175 |                 selectObject: myIntensity
176 |                 int = Get value at time: time, "nearest"
177 |                 int$ = fixed$(int, 0)
178 |
179 |
180 |                 myTime = myTime + time_step
181 |                 # Save result to text file:
182 |                 appendFile: folder$ + "/" + txtName$ + ".txt", base$, tab$, nInterval, tab$, labelOfInterval$, tab$
183 |                 appendFile: folder$ + "/" + txtName$ + ".txt", point, tab$, time, tab$, f0$, tab$, f1$, tab$, f2$, tab$, f3$, tab$, f4$, tab$, durIntervalms$, tab$, int$, newline$
184 |             # end of loop for points
185 |             endfor
186 |         endif
187 |         #close interval loop
188 |     # end of loop for intervals
189 |     endfor
190 |     removeObject: myTextGrid, mySound, myPitch, myIntensity, myFormant
191 | #end of for files
192 | endfor
193 | removeObject: myList
194 | echo Done.
195 |
196 |
--------------------------------------------------------------------------------
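
The formant-track file written by `extracts_whole_formant.praat` has one sample point per row, so a common first step downstream is to regroup the rows into one track per vowel interval. The Python sketch below shows one possible way to do that. The column names are taken from the header row the script writes, while the function name `load_tracks`, the sample rows and their values are invented for illustration. It also assumes that undefined measurements appear as non-numeric text such as `--undefined--`.

```python
import csv
import io
from collections import defaultdict


def load_tracks(stream, formant="F1"):
    """Group one formant column by (fileName, nInterval), ordered by n_point."""
    tracks = defaultdict(list)
    for row in csv.DictReader(stream, delimiter="\t"):
        try:
            value = float(row[formant])
        except ValueError:
            value = None  # non-numeric cell, e.g. an undefined measurement
        key = (row["fileName"], int(row["nInterval"]))
        tracks[key].append((int(row["n_point"]), value))
    # Sort each track by point number only, then keep just the values.
    return {
        key: [value for _, value in sorted(points, key=lambda p: p[0])]
        for key, points in tracks.items()
    }


# Header as written by the script; the data rows are invented for the example.
sample = io.StringIO(
    "fileName\tnInterval\tLabel_interval\tn_point\toriginal_time\t"
    "F0\tF1\tF2\tF3\tF4\tduration\tintensity\n"
    "s1\t2\ta\t2\t0.503\t120\t710\t1310\t2510\t3510\t90\t65\n"
    "s1\t2\ta\t1\t0.500\t120\t700\t1300\t2500\t3500\t90\t65\n"
    "s1\t5\ti\t1\t0.900\t125\t--undefined--\t2200\t2900\t3600\t60\t63\n"
)

f1_tracks = load_tracks(sample, formant="F1")
print(f1_tracks)
```

With the automatic time step, each track should hold about 30 values, so tracks from vowels of different durations can be compared point by point.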