├── .gitignore
├── BuildingaSpeechRecognizerinJavaScript.md
├── JsSpeechRecognizer.js
├── LICENSE
├── README.md
├── adapter.js
├── css
│   ├── demos.css
│   ├── normalize.css
│   └── skeleton.css
├── demos
│   ├── README.md
│   ├── keyword-spotting
│   │   ├── README.md
│   │   ├── keyword-spotting.html
│   │   └── readme-images
│   │       └── screenshot-keyword-spotting.png
│   ├── resources
│   │   └── sounds
│   │       └── notification1.wav
│   └── video-interaction
│       ├── README.md
│       ├── readme-images
│       │   └── video-interaction-screenshot.png
│       └── video-interaction.html
├── experimental
│   └── README.md
├── readme-images
│   ├── screenshot-chicken-frog.png
│   └── screenshot-yes-no.png
└── speechrec.html

/.gitignore:
--------------------------------------------------------------------------------
1 | .DS_Store
2 | .AppleDouble
3 | .LSOverride
4 | 
5 | # Icon must end with two \r
6 | Icon
7 | 
8 | 
9 | # Thumbnails
10 | ._*
11 | 
12 | # Files that might appear in the root of a volume
13 | .DocumentRevisions-V100
14 | .fseventsd
15 | .Spotlight-V100
16 | .TemporaryItems
17 | .Trashes
18 | .VolumeIcon.icns
19 | 
20 | # Directories potentially created on remote AFP share
21 | .AppleDB
22 | .AppleDesktop
23 | Network Trash Folder
24 | Temporary Items
25 | .apdisk
26 | 
27 | 
--------------------------------------------------------------------------------
/BuildingaSpeechRecognizerinJavaScript.md:
--------------------------------------------------------------------------------
1 | # Building a Speech Recognizer in JavaScript
2 | 
3 | This document goes into key details about how the JsSpeechRecognizer was built. You can always look at the complete file for the full implementation.
4 | 
5 | ## 1. Get Access to the Microphone
6 | 
7 | The first and probably most important step is to get access to the microphone. To do this, we use WebRTC functions.
8 | 
9 | JsSpeechRecognizer uses the adapter.js file from the WebRTC project to accomplish this. Here is a link to their GitHub repo: https://github.com/webrtc/adapter
10 | 
11 | ````javascript
12 | // Request access to the microphone
13 | var constraints = {
14 |     "audio": true
15 | };
16 | 
17 | navigator.getUserMedia(constraints, successCallback, errorCallback);
18 | ````
19 | 
20 | ## 2. Connect the Audio to an Analyser and Script Node
21 | 
22 | When we successfully get access to the microphone, we hook up an analyser. Click [here](https://developer.mozilla.org/en-US/docs/Web/API/AudioContext/createAnalyser) for more details about the analyser. The analyser will take the raw audio samples and calculate a Fast Fourier Transform. This will give us the audio in the frequency domain. Frequency data is much more useful for distinguishing words than the raw audio data.
23 | 
24 | Notice that the analyser is connected to a script node. A script node allows us to create a custom processing function.
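To make the frequency resolution concrete: with the fftSize of 1024 configured below and a typical 44.1 kHz sample rate, the analyser reports fftSize / 2 = 512 frequency bins, each about 43 Hz wide. A quick sanity check (the 44.1 kHz figure is an assumption; the real value comes from audioCtx.sampleRate):

````javascript
// Rough frequency resolution of the analyser configured below.
var sampleRate = 44100;                 // assumed; read audioCtx.sampleRate in practice
var fftSize = 1024;                     // matches this.analyser.fftSize below
var binCount = fftSize / 2;             // 512 bins filled by getByteFrequencyData
var binWidthHz = sampleRate / fftSize;  // roughly 43.1 Hz per bin
console.log(binCount + " bins, about " + binWidthHz.toFixed(1) + " Hz each");
````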
25 | 
26 | ````javascript
27 | // Create an analyser
28 | this.analyser = this.audioCtx.createAnalyser();
29 | this.analyser.minDecibels = -80;
30 | this.analyser.maxDecibels = -10;
31 | this.analyser.smoothingTimeConstant = 0;
32 | this.analyser.fftSize = 1024;
33 | 
34 | // Create the scriptNode
35 | this.scriptNode = this.audioCtx.createScriptProcessor(this.analyser.fftSize, 1, 1);
36 | 
37 | 
38 | // Access to the microphone was granted
39 | function successCallback(stream) {
40 |     _this.stream = stream;
41 |     _this.source = _this.audioCtx.createMediaStreamSource(stream);
42 | 
43 |     _this.source.connect(_this.analyser);
44 |     _this.analyser.connect(_this.scriptNode);
45 | 
46 |     // This is needed for Chrome
47 |     _this.scriptNode.connect(_this.audioCtx.destination);
48 | }
49 | ````
50 | 
51 | ## 3. Normalize and Group the Frequencies
52 | 
53 | In the custom processing function for the script node, we normalize and group the frequencies. We normalize them to accommodate different volume levels across recordings, and we group the frequencies to simplify the data.
54 | 
55 | The size and number of groupings were chosen through trial and error and may not be optimal. The number of groups and the size of the groups will affect how specific the data model is. A more specific data model will take up more memory and may take more time to process recognitions, but may also be more accurate.
56 | 
57 | ````javascript
58 | // Function for the script node to process
59 | var _this = this;
60 | this.scriptNode.onaudioprocess = function(audioProcessingEvent) {
61 | 
62 |     var i = 0;
63 | 
64 |     // get the fft data
65 |     var dataArray = new Uint8Array(_this.analyser.fftSize);
66 |     _this.analyser.getByteFrequencyData(dataArray);
67 | 
68 |     // Find the max in the fft array
69 |     var max = Math.max.apply(Math, dataArray);
70 | 
71 |     // If the max is zero, ignore it
72 |     if (max === 0) {
73 |         return;
74 |     }
75 | 
76 |     // Normalize and group the frequencies
77 |     var numGroups = 25;
78 |     var groupSize = 10;
79 |     var groups = [];
80 | 
81 |     for (i = 0; i < numGroups; i++) {
82 |         var peakGroupValue = 0;
83 |         for (var j = 0; j < groupSize; j++) {
84 |             var curPos = (groupSize * i) + j;
85 | 
86 |             // normalize the value
87 |             var tempCalc = Math.floor((dataArray[curPos] / max) * 100);
88 | 
89 |             // Keep the peak normalized value for this group
90 |             if (tempCalc > peakGroupValue) {
91 |                 peakGroupValue = tempCalc;
92 |             }
93 | 
94 |         }
95 |         groups.push(peakGroupValue);
96 |     }
97 |     _this.groupedValues.push(groups);
98 | };
99 | ````
100 | 
101 | ## Training or Recognizing?
102 | 
103 | Steps 1 through 3 are common to both training and recognition. Step 4, however, differs depending on whether you are training or recognizing.
104 | 
105 | ## 4. (Training) Save to the Model
106 | 
107 | If we are training, we save the results from step 3 into our model. The JsSpeechRecognizer allows one word to be trained multiple times, so these steps can be repeated as many times as needed for one or more words.
108 | 
109 | ## 4. (Recognizing) Match Recording to an Entry in the Model
110 | 
111 | If we are recognizing, we want to take the results from step 3 and compare them to all the entries we have stored in our model.
112 | 
113 | The comparison is simple: we simply take the difference between the input values and the model values.
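For example, if a normalized input is [90, 40] and the corresponding model entry is [100, 30] (made-up numbers; the real buffers hold one normalized value per frequency group per frame), the distance is |100 - 90| + |30 - 40| = 20. Entries missing from the shorter array are treated as zero: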
114 | 115 | ````javascript 116 | JsSpeechRecognizer.prototype.findDistance = function(input, check) { 117 | var i = 0; 118 | var distance = 0; 119 | 120 | for (i = 0; i < Math.max(input.length, check.length); i++) { 121 | var checkVal = check[i] || 0; 122 | var inputVal = input[i] || 0; 123 | distance += Math.abs(checkVal - inputVal); 124 | } 125 | 126 | return distance; 127 | }; 128 | ```` 129 | 130 | We then transform this difference to a confidence value. 131 | 132 | ````javascript 133 | JsSpeechRecognizer.prototype.calcConfidence = function(distance, matchArray) { 134 | var sum = 0; 135 | var i = 0; 136 | 137 | for (i = 0; i < matchArray.length; i++) { 138 | sum += matchArray[i]; 139 | } 140 | 141 | return (1 - (distance / sum)); 142 | }; 143 | ```` 144 | 145 | When all the confidences have been calculated, the highest value result is returned. This becomes our recognition hypothesis. 146 | 147 | ## That's It 148 | Now go have fun playing with [the live demo](https://dreamdom.github.io/speechrec.html)! Be sure to read the tips on using the demo in the README file. 149 | -------------------------------------------------------------------------------- /JsSpeechRecognizer.js: -------------------------------------------------------------------------------- 1 | /** 2 | * JavaScript based speech recognizer. 3 | * 4 | * Copyright 2016, Dominic Winkelman 5 | * Free to use under the Apache 2.0 License 6 | * 7 | * https://github.com/dreamdom/JsSpeechRecognizer 8 | * 9 | * Requires the WebRTC adapter.js file. 10 | */ 11 | 12 | /** 13 | * Constructor for JsSpeechRecognizer. 14 | * Sets a number of parameters to default values. 15 | */ 16 | function JsSpeechRecognizer() { 17 | 18 | // Constants 19 | this.RecordingEnum = { "NOT_RECORDING": 0, "TRAINING": 1, "RECOGNITION": 2, "KEYWORD_SPOTTING": 3, "KEYWORD_SPOTTING_NOISY": 4 }; 20 | Object.freeze(this.RecordingEnum); 21 | this.RecognitionModel = { "TRAINED": 0, "AVERAGE": 1, "COMPOSITE": 2 }; 22 | Object.freeze(this.RecognitionModel); 23 | 24 | // Variables for recording data 25 | this.recordingBufferArray = []; 26 | this.currentRecordingBuffer = []; 27 | this.wordBuffer = []; 28 | this.modelBuffer = []; 29 | this.groupedValues = []; 30 | this.keywordSpottingGroupBuffer = []; 31 | this.keywordSpottingRecordingBuffer = []; 32 | 33 | // The speech recognition model 34 | this.model = {}; 35 | 36 | this.recordingState = this.RecordingEnum.NOT_RECORDING; 37 | this.useRecognitionModel = this.RecognitionModel.COMPOSITE; 38 | 39 | // Get an audio context 40 | this.audioCtx = new (window.AudioContext || window.webkitAudioContext)(); 41 | 42 | 43 | // Generate functions for keyword spotting 44 | this.findDistanceForKeywordSpotting = this.generateFindDistanceForKeywordSpotting(-1); 45 | this.findDistanceForKeywordSpotting0 = this.generateFindDistanceForKeywordSpotting(0); 46 | this.findDistanceForKeywordSpotting5 = this.generateFindDistanceForKeywordSpotting(5); 47 | this.findDistanceForKeywordSpotting15 = this.generateFindDistanceForKeywordSpotting(15); 48 | 49 | 50 | // Adjustable parameters 51 | 52 | // Create an analyser 53 | this.analyser = this.audioCtx.createAnalyser(); 54 | this.analyser.minDecibels = -80; 55 | this.analyser.maxDecibels = -10; 56 | this.analyser.smoothingTimeConstant = 0; 57 | this.analyser.fftSize = 1024; 58 | 59 | // Create the scriptNode 60 | this.scriptNode = this.audioCtx.createScriptProcessor(this.analyser.fftSize, 1, 1); 61 | this.scriptNode.onaudioprocess = this.generateOnAudioProcess(); 62 | 63 | // Parameters for the model 
calculation
64 |     this.numGroups = 25;
65 |     this.groupSize = 10;
66 |     this.minPower = 0.01;
67 | 
68 |     // Keyword spotting parameters
69 |     this.keywordSpottingMinConfidence = 0.50;
70 |     this.keywordSpottingBufferCount = 80;
71 |     this.keywordSpottingLastVoiceActivity = 0;
72 |     this.keywordSpottingMaxVoiceActivityGap = 300;
73 |     this.keywordSpottedCallback = null;
74 | 
75 | }
76 | 
77 | /**
78 |  * Requests access to the microphone.
79 |  * @public
80 |  */
81 | JsSpeechRecognizer.prototype.openMic = function() {
82 | 
83 |     var constraints = {
84 |         "audio": true
85 |     };
86 | 
87 |     navigator.getUserMedia(constraints, successCallback, errorCallback);
88 | 
89 |     var _this = this;
90 |     // Access to the microphone was granted
91 |     function successCallback(stream) {
92 |         _this.stream = stream;
93 |         _this.source = _this.audioCtx.createMediaStreamSource(stream);
94 | 
95 |         _this.source.connect(_this.analyser);
96 |         _this.analyser.connect(_this.scriptNode);
97 | 
98 |         // This is needed for Chrome
99 |         _this.scriptNode.connect(_this.audioCtx.destination);
100 |     }
101 | 
102 |     function errorCallback(error) {
103 |         console.error('navigator.getUserMedia error: ', error);
104 |     }
105 | };
106 | 
107 | /**
108 |  * Returns false if the recognizer is not recording. True otherwise.
109 |  * @public
110 |  */
111 | JsSpeechRecognizer.prototype.isRecording = function() {
112 |     return (this.recordingState !== this.RecordingEnum.NOT_RECORDING);
113 | };
114 | 
115 | /**
116 |  * Starts recording in TRAINING mode.
117 |  * @public
118 |  */
119 | JsSpeechRecognizer.prototype.startTrainingRecording = function(curWord) {
120 |     this.resetBuffers();
121 |     this.recordingState = this.RecordingEnum.TRAINING;
122 |     this.wordBuffer.push(curWord);
123 | };
124 | 
125 | /**
126 |  * Starts recording in RECOGNITION mode.
127 |  * @public
128 |  */
129 | JsSpeechRecognizer.prototype.startRecognitionRecording = function() {
130 |     this.resetBuffers();
131 |     this.recordingState = this.RecordingEnum.RECOGNITION;
132 | };
133 | 
134 | /**
135 |  * Starts recording in KEYWORD_SPOTTING mode.
136 |  * @public
137 |  */
138 | JsSpeechRecognizer.prototype.startKeywordSpottingRecording = function() {
139 |     this.resetBuffers();
140 |     this.recordingState = this.RecordingEnum.KEYWORD_SPOTTING;
141 | };
142 | 
143 | /**
144 |  * Starts a recording in KEYWORD_SPOTTING_NOISY mode.
145 |  * @public
146 |  */
147 | JsSpeechRecognizer.prototype.startKeywordSpottingNoisyRecording = function() {
148 |     this.resetBuffers();
149 |     this.recordingState = this.RecordingEnum.KEYWORD_SPOTTING_NOISY;
150 | };
151 | 
152 | /**
153 |  * Stops recording.
154 |  * @return {Number} the length of the training buffer.
155 |  * @public
156 |  */
157 | JsSpeechRecognizer.prototype.stopRecording = function() {
158 | 
159 |     this.groupedValues = [].concat.apply([], this.groupedValues);
160 |     this.normalizeInput(this.groupedValues);
161 | 
162 |     // If we are training, we want to save to the recognition model buffer
163 |     if (this.recordingState === this.RecordingEnum.TRAINING) {
164 |         this.recordingBufferArray.push(this.currentRecordingBuffer.slice(0));
165 |         this.modelBuffer.push(this.groupedValues.slice(0));
166 |     }
167 | 
168 |     this.recordingState = this.RecordingEnum.NOT_RECORDING;
169 | 
170 |     return this.recordingBufferArray.length;
171 | };
172 | 
173 | /**
174 |  * Plays training audio for the specified index.
175 |  * @param {Number} index
176 |  * @public
177 |  */
178 | JsSpeechRecognizer.prototype.playTrainingBuffer = function(index) {
179 |     this.playMonoAudio(this.recordingBufferArray[index]);
180 | };
181 | 
182 | /**
183 |  * Deletes training data for the specified index.
184 |  * @param {Number} index
185 |  * @public
186 |  */
187 | JsSpeechRecognizer.prototype.deleteTrainingBuffer = function(index) {
188 |     this.modelBuffer[index] = null;
189 | };
190 | 
191 | /**
192 |  * Plays mono audio.
193 |  * @param {Array} playBuffer
194 |  * @public
195 |  */
196 | JsSpeechRecognizer.prototype.playMonoAudio = function(playBuffer) {
197 | 
198 |     var channels = 1;
199 |     var frameCount = playBuffer.length;
200 |     var myArrayBuffer = this.audioCtx.createBuffer(channels, frameCount, this.audioCtx.sampleRate);
201 | 
202 |     for (var channel = 0; channel < channels; channel++) {
203 |         var nowBuffering = myArrayBuffer.getChannelData(channel);
204 |         for (var i = 0; i < frameCount; i++) {
205 |             nowBuffering[i] = playBuffer[i];
206 |         }
207 |     }
208 | 
209 |     var playSource = this.audioCtx.createBufferSource();
210 |     playSource.buffer = myArrayBuffer;
211 |     playSource.connect(this.audioCtx.destination);
212 |     playSource.start();
213 | };
214 | 
215 | /**
216 |  * Returns an array of the top recognition hypotheses.
217 |  * @param {Number} numResults
218 |  * @return {Array}
219 |  * @public
220 |  */
221 | JsSpeechRecognizer.prototype.getTopRecognitionHypotheses = function(numResults) {
222 |     return this.findClosestMatch(this.groupedValues, numResults, this.model, this.findDistance);
223 | };
224 | 
225 | /**
226 |  * Generates a new speech recognition model from the training data.
227 |  * @public
228 |  */
229 | JsSpeechRecognizer.prototype.generateModel = function() {
230 | 
231 |     var i = 0;
232 |     var j = 0;
233 |     var k = 0;
234 |     var key = "";
235 |     var averageModel = {};
236 | 
237 |     // Reset the model
238 |     this.model = {};
239 | 
240 |     for (i = 0; i < this.wordBuffer.length; i++) {
241 |         key = this.wordBuffer[i];
242 |         this.model[key] = [];
243 |     }
244 | 
245 |     for (i = 0; i < this.modelBuffer.length; i++) {
246 |         if (this.modelBuffer[i] !== null) {
247 |             key = this.wordBuffer[i];
248 |             this.model[key].push(this.modelBuffer[i]);
249 |         }
250 |     }
251 | 
252 |     // If we are only using the trained entries, no need to do anything else
253 |     if (this.useRecognitionModel === this.RecognitionModel.TRAINED) {
254 |         return;
255 |     }
256 | 
257 |     // Average Model
258 |     // Holds one entry for each key.
That entry is the average of all the entries in the model 259 | for (key in this.model) { 260 | var average = []; 261 | for (i = 0; i < this.model[key].length; i++) { 262 | for (j = 0; j < this.model[key][i].length; j++) { 263 | average[j] = (average[j] || 0) + (this.model[key][i][j] / this.model[key].length); 264 | } 265 | } 266 | 267 | averageModel[key] = []; 268 | averageModel[key].push(average); 269 | } 270 | 271 | // Interpolation - Take the average of each pair of entries for a key and 272 | // add it to the average model 273 | for (key in this.model) { 274 | 275 | var averageInterpolation = []; 276 | for (k = 0; k < this.model[key].length; k++) { 277 | for (i = k + 1; i < this.model[key].length; i++) { 278 | 279 | averageInterpolation = []; 280 | for (j = 0; j < Math.max(this.model[key][k].length, this.model[key][i].length); j++) { 281 | var entryOne = this.model[key][k][j] || 0; 282 | var entryTwo = this.model[key][i][j] || 0; 283 | averageInterpolation[j] = (entryOne + entryTwo) / 2; 284 | } 285 | 286 | averageModel[key].push(averageInterpolation); 287 | } 288 | } 289 | } 290 | 291 | if (this.useRecognitionModel === this.RecognitionModel.AVERAGE) { 292 | this.model = averageModel; 293 | } else if (this.useRecognitionModel === this.RecognitionModel.COMPOSITE) { 294 | // Merge the average model into the model 295 | for (key in this.model) { 296 | this.model[key] = this.model[key].concat(averageModel[key]); 297 | } 298 | } 299 | 300 | }; 301 | 302 | 303 | // Private internal functions 304 | 305 | /** 306 | * Resets the recording buffers. 307 | * @private 308 | */ 309 | JsSpeechRecognizer.prototype.resetBuffers = function() { 310 | this.currentRecordingBuffer = []; 311 | this.groupedValues = []; 312 | 313 | this.keywordSpottingGroupBuffer = []; 314 | this.keywordSpottingRecordingBuffer = []; 315 | }; 316 | 317 | // Audio Processing functions 318 | 319 | /** 320 | * Generates an audioProcess function. 321 | * @return {Function} 322 | * @private 323 | */ 324 | JsSpeechRecognizer.prototype.generateOnAudioProcess = function() { 325 | var _this = this; 326 | return function(audioProcessingEvent) { 327 | 328 | var i = 0; 329 | 330 | // If we aren't recording, don't do anything 331 | if (_this.recordingState === _this.RecordingEnum.NOT_RECORDING) { 332 | return; 333 | } 334 | 335 | // get the fft data 336 | var dataArray = new Uint8Array(_this.analyser.fftSize); 337 | _this.analyser.getByteFrequencyData(dataArray); 338 | 339 | // Find the max in the fft array 340 | var max = Math.max.apply(Math, dataArray); 341 | 342 | // If the max is zero ignore it. 343 | if (max === 0) { 344 | return; 345 | } 346 | 347 | // Get the audio data. For simplicity just take one channel 348 | var inputBuffer = audioProcessingEvent.inputBuffer; 349 | var leftChannel = inputBuffer.getChannelData(0); 350 | 351 | // Calculate the power 352 | var curFrame = new Float32Array(leftChannel); 353 | var power = 0; 354 | for (i = 0; i < curFrame.length; i++) { 355 | power += curFrame[i] * curFrame[i]; 356 | } 357 | 358 | // Check for the proper power level 359 | if (power < _this.minPower) { 360 | return; 361 | } 362 | 363 | // Save the data for playback. 
364 |         Array.prototype.push.apply(_this.currentRecordingBuffer, curFrame);
365 | 
366 |         // Normalize and group the frequencies
367 |         var groups = [];
368 | 
369 |         for (i = 0; i < _this.numGroups; i++) {
370 |             var peakGroupValue = 0;
371 |             for (var j = 0; j < _this.groupSize; j++) {
372 |                 var curPos = (_this.groupSize * i) + j;
373 | 
374 |                 // Keep the peak normalized value for this group
375 |                 if (dataArray[curPos] > peakGroupValue) {
376 |                     peakGroupValue = dataArray[curPos];
377 |                 }
378 | 
379 |             }
380 |             groups.push(peakGroupValue);
381 |         }
382 | 
383 |         // Depending on the state, handle the data differently
384 |         if (_this.recordingState === _this.RecordingEnum.KEYWORD_SPOTTING || _this.recordingState === _this.RecordingEnum.KEYWORD_SPOTTING_NOISY) {
385 | 
386 |             // Check if we should reset the buffers
387 |             var now = new Date().getTime();
388 |             if (now - _this.keywordSpottingLastVoiceActivity > _this.keywordSpottingMaxVoiceActivityGap) {
389 |                 _this.resetBuffers();
390 |             }
391 |             _this.keywordSpottingLastVoiceActivity = now;
392 | 
393 |             _this.keywordSpottingProcessFrame(groups, curFrame);
394 |         } else {
395 |             _this.groupedValues.push(groups);
396 |         }
397 | 
398 |     };
399 | };
400 | 
401 | /**
402 |  * Processes a new frame of data while in recording state KEYWORD_SPOTTING.
403 |  * @param {Array} groups - the group data for the frame
404 |  * @param {Array} curFrame - the raw audio data for the frame
405 |  * @private
406 |  */
407 | JsSpeechRecognizer.prototype.keywordSpottingProcessFrame = function(groups, curFrame) {
408 | 
409 |     var computedLength;
410 |     var key;
411 |     var allResults = [];
412 |     var recordingLength;
413 |     var workingGroupBuffer = [];
414 | 
415 |     // Append to the keyword spotting buffer
416 |     this.keywordSpottingGroupBuffer.push(groups);
417 |     this.keywordSpottingGroupBuffer = [].concat.apply([], this.keywordSpottingGroupBuffer);
418 | 
419 |     // Trim the buffer if necessary
420 |     computedLength = (this.keywordSpottingBufferCount * this.numGroups);
421 |     if (this.keywordSpottingGroupBuffer.length > computedLength) {
422 |         this.keywordSpottingGroupBuffer = this.keywordSpottingGroupBuffer.slice(this.keywordSpottingGroupBuffer.length - computedLength, this.keywordSpottingGroupBuffer.length);
423 |     }
424 | 
425 |     // Save the audio data
426 |     Array.prototype.push.apply(this.keywordSpottingRecordingBuffer, curFrame);
427 | 
428 |     // Trim the buffer if necessary
429 |     computedLength = (this.keywordSpottingBufferCount * this.analyser.fftSize);
430 |     if (this.keywordSpottingRecordingBuffer.length > computedLength) {
431 |         this.keywordSpottingRecordingBuffer = this.keywordSpottingRecordingBuffer.slice(this.keywordSpottingRecordingBuffer.length - computedLength, this.keywordSpottingRecordingBuffer.length);
432 |     }
433 | 
434 |     // Copy the buffer, normalize it, and use it to find the closest match
435 |     workingGroupBuffer = this.keywordSpottingGroupBuffer.slice(0);
436 |     this.normalizeInput(workingGroupBuffer);
437 | 
438 |     // Use the correct keyword spotting function
439 |     if (this.recordingState === this.RecordingEnum.KEYWORD_SPOTTING_NOISY) {
440 |         allResults = this.keywordDetectedNoisy(workingGroupBuffer);
441 |     } else {
442 |         allResults = this.keywordDetectedNormal(workingGroupBuffer);
443 |     }
444 | 
445 | 
446 |     // See if a keyword was spotted
447 |     if (allResults !== null && allResults[0] !== undefined) {
448 | 
449 |         // Save the audio
450 |         recordingLength = (allResults[0].frameCount / this.numGroups) * this.analyser.fftSize;
451 | 
452 |         if (recordingLength > this.keywordSpottingRecordingBuffer.length)
{ 453 | recordingLength = this.keywordSpottingRecordingBuffer.length; 454 | } 455 | 456 | allResults[0].audioBuffer = this.keywordSpottingRecordingBuffer.slice(this.keywordSpottingRecordingBuffer.length - recordingLength, this.keywordSpottingRecordingBuffer.length); 457 | 458 | this.resetBuffers(); 459 | if (this.keywordSpottedCallback !== undefined && this.keywordSpottedCallback !== null) { 460 | this.keywordSpottedCallback(allResults[0]); 461 | } 462 | 463 | } 464 | 465 | }; 466 | 467 | // Keyword spotting functions 468 | 469 | /** 470 | * Analyzes a buffer to determine if a keyword has been found. 471 | * Will return an array if a keyword was found, null otherwise. 472 | * 473 | * @param {Array} workingGroupBuffer 474 | * @return {Array|null} 475 | * @private 476 | */ 477 | JsSpeechRecognizer.prototype.keywordDetectedNormal = function(workingGroupBuffer) { 478 | var allResults = {}; 479 | 480 | allResults = this.findClosestMatch(workingGroupBuffer, 1, this.model, this.findDistanceForKeywordSpotting); 481 | 482 | if (allResults[0] !== undefined && allResults[0].confidence > this.keywordSpottingMinConfidence) { 483 | return allResults; 484 | } 485 | 486 | return null; 487 | }; 488 | 489 | /** 490 | * Analyzes a buffer to determine if a keyword has been found. 491 | * Will return an array if a keyword was found, null otherwise. 492 | * Designed to adjust for different levels of noise. 493 | * 494 | * @param {Array} workingGroupBuffer 495 | * @return {Array|null} 496 | * @private 497 | */ 498 | JsSpeechRecognizer.prototype.keywordDetectedNoisy = function(workingGroupBuffer) { 499 | 500 | // TODO: Make it possible for a user to specify the number of keyword spotting functions 501 | // And change this duplicated code to a loop! 502 | 503 | var allResults15 = {}; 504 | var allResults15MinConfidence = this.keywordSpottingMinConfidence; 505 | 506 | allResults15 = this.findClosestMatch(workingGroupBuffer, 1, this.model, this.findDistanceForKeywordSpotting15); 507 | 508 | if (allResults15[0].confidence <= allResults15MinConfidence) { 509 | return null; 510 | } 511 | 512 | 513 | var allResults5 = {}; 514 | var allResults5MinConfidence = this.keywordSpottingMinConfidence - 0.1; 515 | 516 | allResults5 = this.findClosestMatch(workingGroupBuffer, 1, this.model, this.findDistanceForKeywordSpotting5); 517 | 518 | if (allResults5[0].confidence <= allResults5MinConfidence) { 519 | return null; 520 | } 521 | 522 | 523 | var allResults0 = {}; 524 | var allResults0MinConfidence = this.keywordSpottingMinConfidence - 0.15; 525 | 526 | allResults0 = this.findClosestMatch(workingGroupBuffer, 1, this.model, this.findDistanceForKeywordSpotting0); 527 | 528 | if (allResults0[0].confidence <= allResults0MinConfidence) { 529 | return null; 530 | } 531 | 532 | 533 | // finally, run the normal check 534 | var allResults = {}; 535 | 536 | allResults = this.findClosestMatch(workingGroupBuffer, 1, this.model, this.findDistanceForKeywordSpotting); 537 | 538 | // Calculate the minimum confidence 539 | var allResultsMinConfidence = this.keywordSpottingMinConfidence - 0.1 - (Math.max((allResults[0].noise * 1.25) - 1, 0) * 0.75); 540 | 541 | // Final check for returning the results 542 | if (allResults[0] !== undefined && allResults[0].confidence > allResultsMinConfidence) { 543 | return allResults; 544 | } 545 | 546 | return null; 547 | }; 548 | 549 | // Calculation functions 550 | 551 | /** 552 | * Normalizes an input array to a scale from 0 to 100. 
553 |  *
554 |  * @param {Array} input
555 |  * @private
556 |  */
557 | JsSpeechRecognizer.prototype.normalizeInput = function(input) {
558 |     // Find the max in the fft array
559 |     var max = Math.max.apply(Math, input);
560 | 
561 |     for (var i = 0; i < input.length; i++) {
562 |         input[i] = Math.floor((input[i] / max) * 100);
563 |     }
564 | };
565 | 
566 | /**
567 |  * Finds the closest matches for an input, for a specified model.
568 |  * Uses the specified findDistance function, or a default one.
569 |  *
570 |  * @param {Array} input
571 |  * @param {Number} numResults
572 |  * @param {Object} speechModel
573 |  * @param {Function} findDistanceFunction
574 |  * @return {Array}
575 |  * @private
576 |  */
577 | JsSpeechRecognizer.prototype.findClosestMatch = function(input, numResults, speechModel, findDistanceFunction) {
578 | 
579 |     var i = 0;
580 |     var key = "";
581 |     var allResults = [];
582 | 
583 |     // If no findDistance function is defined, use the default
584 |     if (findDistanceFunction === undefined) {
585 |         findDistanceFunction = this.findDistance;
586 |     }
587 | 
588 |     // Loop through all the keys in the model
589 |     for (key in speechModel) {
590 |         // Loop through all entries for that key
591 |         for (i = 0; i < speechModel[key].length; i++) {
592 | 
593 |             var curDistance = findDistanceFunction(input, speechModel[key][i]);
594 |             var curConfidence = this.calcConfidence(curDistance, speechModel[key][i]);
595 |             var curNoise = this.calculateNoise(input, speechModel[key][i]);
596 | 
597 |             var newResult = {};
598 |             newResult.match = key;
599 |             newResult.confidence = curConfidence;
600 |             newResult.noise = curNoise;
601 |             newResult.frameCount = speechModel[key][i].length;
602 |             allResults.push(newResult);
603 |         }
604 | 
605 |     }
606 | 
607 |     allResults.sort(function(a, b) { return b.confidence - a.confidence; });
608 | 
609 |     if (numResults === -1) {
610 |         return allResults;
611 |     }
612 | 
613 |     return allResults.slice(0, numResults);
614 | };
615 | 
616 | /**
617 |  * Computes the sum of differences between an input and a modelEntry.
618 |  *
619 |  * @param {Array} input
620 |  * @param {Array} modelEntry
621 |  * @return {Number}
622 |  * @private
623 |  */
624 | JsSpeechRecognizer.prototype.findDistance = function(input, modelEntry) {
625 |     var i = 0;
626 |     var distance = 0;
627 | 
628 |     for (i = 0; i < Math.max(input.length, modelEntry.length); i++) {
629 |         var modelVal = modelEntry[i] || 0;
630 |         var inputVal = input[i] || 0;
631 |         distance += Math.abs(modelVal - inputVal);
632 |     }
633 | 
634 |     return distance;
635 | };
636 | 
637 | /**
638 |  * Generates a distanceForKeywordSpotting function.
639 |  * The function will calculate differences for entries in the model that
640 |  * are greater than the parameter modelEntryGreaterThanVal.
641 |  *
642 |  * @param {Number} modelEntryGreaterThanVal
643 |  * @return {Function}
644 |  * @private
645 |  */
646 | JsSpeechRecognizer.prototype.generateFindDistanceForKeywordSpotting = function(modelEntryGreaterThanVal) {
647 | 
648 |     /**
649 |      * Calculates the keyword spotting distance an input is from a model entry.
650 |      *
651 |      * @param {Array} input
652 |      * @param {Array} modelEntry
653 |      * @return {Number}
654 |      * @private
655 |      */
656 |     return function(input, modelEntry) {
657 |         var i = 0;
658 |         var distance = 0;
659 | 
660 |         // Compare from the end of the input, for modelEntry.length entries
661 |         for (i = 1; i <= modelEntry.length; i++) {
662 |             var modelVal = modelEntry[modelEntry.length - i] || 0;
663 |             var inputVal = input[input.length - i] || 0;
664 |             if (modelVal > modelEntryGreaterThanVal) {
665 |                 distance += Math.abs(modelVal - inputVal);
666 |             }
667 |         }
668 | 
669 |         return distance;
670 |     };
671 | };
672 | 
673 | /**
674 |  * Calculates a confidence value based on the distance from a model entry.
675 |  * Max confidence is 1, min is negative infinity.
676 |  *
677 |  * @param {Number} distance
678 |  * @param {Array} modelEntry
679 |  * @return {Number}
680 |  * @private
681 |  */
682 | JsSpeechRecognizer.prototype.calcConfidence = function(distance, modelEntry) {
683 |     var sum = 0;
684 |     var i = 0;
685 | 
686 |     for (i = 0; i < modelEntry.length; i++) {
687 |         sum += modelEntry[i];
688 |     }
689 | 
690 |     return (1 - (distance / sum));
691 | };
692 | 
693 | /**
694 |  * Calculates how noisy an input is compared to a model entry.
695 |  *
696 |  * @param {Array} input
697 |  * @param {Array} modelEntry
698 |  * @return {Number}
699 |  * @private
700 |  */
701 | JsSpeechRecognizer.prototype.calculateNoise = function(input, modelEntry) {
702 |     var i = 0;
703 |     var sumIn = 0;
704 |     var sumEntry = 0;
705 | 
706 |     // Compare from the end of the input, for modelEntry.length entries
707 |     for (i = 1; i <= modelEntry.length; i++) {
708 |         var modelVal = modelEntry[modelEntry.length - i] || 0;
709 |         var inputVal = input[input.length - i] || 0;
710 |         sumIn += inputVal * inputVal;
711 | 
712 |         // TODO: Optimize by caching the calculation for the model
713 |         sumEntry += modelVal * modelVal;
714 |     }
715 | 
716 |     return (sumIn / sumEntry);
717 | };
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 |                                  Apache License
2 |                            Version 2.0, January 2004
3 |                         http://www.apache.org/licenses/
4 | 
5 |    TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
6 | 
7 |    1. Definitions.
8 | 
9 |       "License" shall mean the terms and conditions for use, reproduction,
10 |       and distribution as defined by Sections 1 through 9 of this document.
11 | 
12 |       "Licensor" shall mean the copyright owner or entity authorized by
13 |       the copyright owner that is granting the License.
14 | 
15 |       "Legal Entity" shall mean the union of the acting entity and all
16 |       other entities that control, are controlled by, or are under common
17 |       control with that entity. For the purposes of this definition,
18 |       "control" means (i) the power, direct or indirect, to cause the
19 |       direction or management of such entity, whether by contract or
20 |       otherwise, or (ii) ownership of fifty percent (50%) or more of the
21 |       outstanding shares, or (iii) beneficial ownership of such entity.
22 | 
23 |       "You" (or "Your") shall mean an individual or Legal Entity
24 |       exercising permissions granted by this License.
25 | 
26 |       "Source" form shall mean the preferred form for making modifications,
27 |       including but not limited to software source code, documentation
28 |       source, and configuration files.
29 | 30 | "Object" form shall mean any form resulting from mechanical 31 | transformation or translation of a Source form, including but 32 | not limited to compiled object code, generated documentation, 33 | and conversions to other media types. 34 | 35 | "Work" shall mean the work of authorship, whether in Source or 36 | Object form, made available under the License, as indicated by a 37 | copyright notice that is included in or attached to the work 38 | (an example is provided in the Appendix below). 39 | 40 | "Derivative Works" shall mean any work, whether in Source or Object 41 | form, that is based on (or derived from) the Work and for which the 42 | editorial revisions, annotations, elaborations, or other modifications 43 | represent, as a whole, an original work of authorship. For the purposes 44 | of this License, Derivative Works shall not include works that remain 45 | separable from, or merely link (or bind by name) to the interfaces of, 46 | the Work and Derivative Works thereof. 47 | 48 | "Contribution" shall mean any work of authorship, including 49 | the original version of the Work and any modifications or additions 50 | to that Work or Derivative Works thereof, that is intentionally 51 | submitted to Licensor for inclusion in the Work by the copyright owner 52 | or by an individual or Legal Entity authorized to submit on behalf of 53 | the copyright owner. For the purposes of this definition, "submitted" 54 | means any form of electronic, verbal, or written communication sent 55 | to the Licensor or its representatives, including but not limited to 56 | communication on electronic mailing lists, source code control systems, 57 | and issue tracking systems that are managed by, or on behalf of, the 58 | Licensor for the purpose of discussing and improving the Work, but 59 | excluding communication that is conspicuously marked or otherwise 60 | designated in writing by the copyright owner as "Not a Contribution." 61 | 62 | "Contributor" shall mean Licensor and any individual or Legal Entity 63 | on behalf of whom a Contribution has been received by Licensor and 64 | subsequently incorporated within the Work. 65 | 66 | 2. Grant of Copyright License. Subject to the terms and conditions of 67 | this License, each Contributor hereby grants to You a perpetual, 68 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 69 | copyright license to reproduce, prepare Derivative Works of, 70 | publicly display, publicly perform, sublicense, and distribute the 71 | Work and such Derivative Works in Source or Object form. 72 | 73 | 3. Grant of Patent License. Subject to the terms and conditions of 74 | this License, each Contributor hereby grants to You a perpetual, 75 | worldwide, non-exclusive, no-charge, royalty-free, irrevocable 76 | (except as stated in this section) patent license to make, have made, 77 | use, offer to sell, sell, import, and otherwise transfer the Work, 78 | where such license applies only to those patent claims licensable 79 | by such Contributor that are necessarily infringed by their 80 | Contribution(s) alone or by combination of their Contribution(s) 81 | with the Work to which such Contribution(s) was submitted. 
If You 82 | institute patent litigation against any entity (including a 83 | cross-claim or counterclaim in a lawsuit) alleging that the Work 84 | or a Contribution incorporated within the Work constitutes direct 85 | or contributory patent infringement, then any patent licenses 86 | granted to You under this License for that Work shall terminate 87 | as of the date such litigation is filed. 88 | 89 | 4. Redistribution. You may reproduce and distribute copies of the 90 | Work or Derivative Works thereof in any medium, with or without 91 | modifications, and in Source or Object form, provided that You 92 | meet the following conditions: 93 | 94 | (a) You must give any other recipients of the Work or 95 | Derivative Works a copy of this License; and 96 | 97 | (b) You must cause any modified files to carry prominent notices 98 | stating that You changed the files; and 99 | 100 | (c) You must retain, in the Source form of any Derivative Works 101 | that You distribute, all copyright, patent, trademark, and 102 | attribution notices from the Source form of the Work, 103 | excluding those notices that do not pertain to any part of 104 | the Derivative Works; and 105 | 106 | (d) If the Work includes a "NOTICE" text file as part of its 107 | distribution, then any Derivative Works that You distribute must 108 | include a readable copy of the attribution notices contained 109 | within such NOTICE file, excluding those notices that do not 110 | pertain to any part of the Derivative Works, in at least one 111 | of the following places: within a NOTICE text file distributed 112 | as part of the Derivative Works; within the Source form or 113 | documentation, if provided along with the Derivative Works; or, 114 | within a display generated by the Derivative Works, if and 115 | wherever such third-party notices normally appear. The contents 116 | of the NOTICE file are for informational purposes only and 117 | do not modify the License. You may add Your own attribution 118 | notices within Derivative Works that You distribute, alongside 119 | or as an addendum to the NOTICE text from the Work, provided 120 | that such additional attribution notices cannot be construed 121 | as modifying the License. 122 | 123 | You may add Your own copyright statement to Your modifications and 124 | may provide additional or different license terms and conditions 125 | for use, reproduction, or distribution of Your modifications, or 126 | for any such Derivative Works as a whole, provided Your use, 127 | reproduction, and distribution of the Work otherwise complies with 128 | the conditions stated in this License. 129 | 130 | 5. Submission of Contributions. Unless You explicitly state otherwise, 131 | any Contribution intentionally submitted for inclusion in the Work 132 | by You to the Licensor shall be under the terms and conditions of 133 | this License, without any additional terms or conditions. 134 | Notwithstanding the above, nothing herein shall supersede or modify 135 | the terms of any separate license agreement you may have executed 136 | with Licensor regarding such Contributions. 137 | 138 | 6. Trademarks. This License does not grant permission to use the trade 139 | names, trademarks, service marks, or product names of the Licensor, 140 | except as required for reasonable and customary use in describing the 141 | origin of the Work and reproducing the content of the NOTICE file. 142 | 143 | 7. Disclaimer of Warranty. 
Unless required by applicable law or 144 | agreed to in writing, Licensor provides the Work (and each 145 | Contributor provides its Contributions) on an "AS IS" BASIS, 146 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or 147 | implied, including, without limitation, any warranties or conditions 148 | of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A 149 | PARTICULAR PURPOSE. You are solely responsible for determining the 150 | appropriateness of using or redistributing the Work and assume any 151 | risks associated with Your exercise of permissions under this License. 152 | 153 | 8. Limitation of Liability. In no event and under no legal theory, 154 | whether in tort (including negligence), contract, or otherwise, 155 | unless required by applicable law (such as deliberate and grossly 156 | negligent acts) or agreed to in writing, shall any Contributor be 157 | liable to You for damages, including any direct, indirect, special, 158 | incidental, or consequential damages of any character arising as a 159 | result of this License or out of the use or inability to use the 160 | Work (including but not limited to damages for loss of goodwill, 161 | work stoppage, computer failure or malfunction, or any and all 162 | other commercial damages or losses), even if such Contributor 163 | has been advised of the possibility of such damages. 164 | 165 | 9. Accepting Warranty or Additional Liability. While redistributing 166 | the Work or Derivative Works thereof, You may choose to offer, 167 | and charge a fee for, acceptance of support, warranty, indemnity, 168 | or other liability obligations and/or rights consistent with this 169 | License. However, in accepting such obligations, You may act only 170 | on Your own behalf and on Your sole responsibility, not on behalf 171 | of any other Contributor, and only if You agree to indemnify, 172 | defend, and hold each Contributor harmless for any liability 173 | incurred by, or claims asserted against, such Contributor by reason 174 | of your accepting any such warranty or additional liability. 175 | 176 | END OF TERMS AND CONDITIONS 177 | 178 | APPENDIX: How to apply the Apache License to your work. 179 | 180 | To apply the Apache License to your work, attach the following 181 | boilerplate notice, with the fields enclosed by brackets "{}" 182 | replaced with your own identifying information. (Don't include 183 | the brackets!) The text should be enclosed in the appropriate 184 | comment syntax for the file format. We also recommend that a 185 | file or class name and description of purpose be included on the 186 | same "printed page" as the copyright notice for easier 187 | identification within third-party archives. 188 | 189 | Copyright {yyyy} {name of copyright owner} 190 | 191 | Licensed under the Apache License, Version 2.0 (the "License"); 192 | you may not use this file except in compliance with the License. 193 | You may obtain a copy of the License at 194 | 195 | http://www.apache.org/licenses/LICENSE-2.0 196 | 197 | Unless required by applicable law or agreed to in writing, software 198 | distributed under the License is distributed on an "AS IS" BASIS, 199 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 200 | See the License for the specific language governing permissions and 201 | limitations under the License. 
202 | 
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # JsSpeechRecognizer
2 | JavaScript Speech Recognizer
3 | 
4 | ## Demos
5 | [Speech Recognition Demo](https://dreamdom.github.io/speechrec.html)
6 | 
7 | [Keyword Spotting Demo](https://dreamdom.github.io/demos/keyword-spotting/keyword-spotting.html)
8 | 
9 | [Video Interaction Live Demo](https://dreamdom.github.io/demos/video-interaction/video-interaction.html)
10 | 
11 | ## Video
12 | Here is a [short video](https://vimeo.com/161142124) of the keyword spotting demo.
13 | 
14 | And here is a [short video](https://vimeo.com/161726625) of the video interaction demo.
15 | 
16 | ## What is It?
17 | JsSpeechRecognizer is a JavaScript-based speech recognizer. It allows you to train words or phrases to be recognized, and then record new audio to match against those words or phrases.
18 | 
19 | At the moment, JsSpeechRecognizer does not include any data model, so you will have to train new words before using it.
20 | 
21 | ## How Does it Work?
22 | 
23 | ### WebRTC
24 | JsSpeechRecognizer uses browser WebRTC functionality to get access to the microphone and Fast Fourier Transform (FFT) data. Therefore, it will only work in browsers with WebRTC support.
25 | 
26 | The WebRTC adapter JavaScript is needed to use the JsSpeechRecognizer. It is hosted on GitHub here: https://github.com/webrtc/adapter
27 | 
28 | ### JsSpeechRecognizer.js
29 | This file contains all of the specific speech recognizer logic.
30 | 
31 | ### Detailed Write-Up
32 | For a more detailed write-up on how the JsSpeechRecognizer was built, click [here](BuildingaSpeechRecognizerinJavaScript.md).
33 | 
34 | ## Live Demo
35 | Play with the Live Demo [here](https://dreamdom.github.io/speechrec.html). It has only been tested in Firefox and Chrome.
36 | 
37 | ### Screenshots
38 | ![Yes No Screenshot](readme-images/screenshot-yes-no.png "Yes No Screenshot")
39 | ![Chicken Frog Screenshot](readme-images/screenshot-chicken-frog.png "Chicken Frog Screenshot")
40 | 
41 | ### Tips for the Live Demo
42 | 
43 | 1. Try training the word "yes", and then training the word "no".
44 | 2. It is recommended that you train and test in a quiet room.
45 | 3. You can (and should) train a word multiple times. This is especially important if you are trying to recognize words that sound very similar, such as "no" and "go".
46 | 4. Use the "play" button to hear the audio data that was recorded. You should verify that a recording in the training set is of good quality and is of the correct word.
47 | 5. If a recording is incorrect, of bad quality, or contains too much noise, get rid of it with the "delete" button.
48 | 
49 | ### Fun Stuff
50 | 
51 | * Try training phrases like "find sushi" or "show me coffee in San Francisco".
52 | * Train and detect laughing or screaming.
53 | * Use emoji like 🐔 instead of words.
54 | * Train the recognizer with one person, and test with another person.
55 | 
56 | ## More Demos
57 | Find information about more demos [here](https://github.com/dreamdom/JsSpeechRecognizer/tree/master/demos).
58 | 
59 | I would love to hear more ideas!
60 | 
61 | ## Running the Demos on Your Own Machine
62 | The demo speechrec.html lets you train new words and then recognize them.
63 | 
64 | ### Running in Firefox
65 | Simply open the file speechrec.html. You should get a popup from the browser asking whether you would like to grant the site permission to use the microphone.
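Whichever browser you use, the demo pages all drive the same small public API from JsSpeechRecognizer.js. Here is a minimal sketch of the calls involved (button wiring and recording timing are left to the page):

````javascript
// Minimal usage sketch, assuming adapter.js and JsSpeechRecognizer.js are loaded.
var recognizer = new JsSpeechRecognizer();
recognizer.openMic();

// Training: record the word "yes" (repeat for more samples and more words)
recognizer.startTrainingRecording("yes");
// ...let the user speak, then:
recognizer.stopRecording();
recognizer.generateModel();

// Recognition: record a new utterance and ask for the single best hypothesis
recognizer.startRecognitionRecording();
// ...let the user speak, then:
recognizer.stopRecording();
var results = recognizer.getTopRecognitionHypotheses(1);
console.log(results[0].match, results[0].confidence);
````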
66 | 67 | ### Running in Chrome 68 | If the speechrec.html file is opened as a local file (with a file:/// prefix) the demo will not work by default due to security settings. You can either disable the security (temporarily) or set up a local server to test the file. 69 | 70 | I recommend using a Python SimpleHTTPServer. Open up a terminal, cd to the proper folder you want to host, and run the following command: 71 | 72 | Python 2 73 | ````shell 74 | python -m SimpleHTTPServer 8000 75 | ```` 76 | 77 | Python 3 78 | ````shell 79 | python -m http.server 8000 80 | ```` 81 | 82 | Open up "localhost:8000" in your browser to see the list of files in the folder being shared. For more details see the python documentation. 83 | https://docs.python.org/2/library/simplehttpserver.html 84 | 85 | Other alternatives include browser-sync or webpack-dev-server. 86 | 87 | For more details about Chrome and webrtc locally, see the following stack overflow question: 88 | http://stackoverflow.com/questions/14318319/webrtc-browser-doesnt-ask-for-mic-access-permission-for-local-html-file 89 | 90 | ### Other Browsers 91 | I have not tested other browsers. 92 | -------------------------------------------------------------------------------- /adapter.js: -------------------------------------------------------------------------------- 1 | (function(f){if(typeof exports==="object"&&typeof module!=="undefined"){module.exports=f()}else if(typeof define==="function"&&define.amd){define([],f)}else{var g;if(typeof window!=="undefined"){g=window}else if(typeof global!=="undefined"){g=global}else if(typeof self!=="undefined"){g=self}else{g=this}g.adapter = f()}})(function(){var define,module,exports;return (function e(t,n,r){function s(o,u){if(!n[o]){if(!t[o]){var a=typeof require=="function"&&require;if(!u&&a)return a(o,!0);if(i)return i(o,!0);var f=new Error("Cannot find module '"+o+"'");throw f.code="MODULE_NOT_FOUND",f}var l=n[o]={exports:{}};t[o][0].call(l.exports,function(e){var n=t[o][1][e];return s(n?n:e)},l,l.exports,e,t,n,r)}return n[o].exports}var i=typeof require=="function"&&require;for(var o=0;o 0 && typeof selector === 'function') { 189 | return origGetStats(selector, successCallback); 190 | } 191 | 192 | var fixChromeStats_ = function(response) { 193 | var standardReport = {}; 194 | var reports = response.result(); 195 | reports.forEach(function(report) { 196 | var standardStats = { 197 | id: report.id, 198 | timestamp: report.timestamp, 199 | type: report.type 200 | }; 201 | report.names().forEach(function(name) { 202 | standardStats[name] = report.stat(name); 203 | }); 204 | standardReport[standardStats.id] = standardStats; 205 | }); 206 | 207 | return standardReport; 208 | }; 209 | 210 | if (arguments.length >= 2) { 211 | var successCallbackWrapper_ = function(response) { 212 | args[1](fixChromeStats_(response)); 213 | }; 214 | 215 | return origGetStats.apply(this, [successCallbackWrapper_, arguments[0]]); 216 | } 217 | 218 | // promise-support 219 | return new Promise(function(resolve, reject) { 220 | if (args.length === 1 && selector === null) { 221 | origGetStats.apply(self, [ 222 | function(response) { 223 | resolve.apply(null, [fixChromeStats_(response)]); 224 | }, reject]); 225 | } else { 226 | origGetStats.apply(self, [resolve, reject]); 227 | } 228 | }); 229 | }; 230 | 231 | return pc; 232 | }; 233 | window.RTCPeerConnection.prototype = webkitRTCPeerConnection.prototype; 234 | 235 | // wrap static methods. Currently just generateCertificate. 
236 | if (webkitRTCPeerConnection.generateCertificate) { 237 | Object.defineProperty(window.RTCPeerConnection, 'generateCertificate', { 238 | get: function() { 239 | if (arguments.length) { 240 | return webkitRTCPeerConnection.generateCertificate.apply(null, 241 | arguments); 242 | } else { 243 | return webkitRTCPeerConnection.generateCertificate; 244 | } 245 | } 246 | }); 247 | } 248 | 249 | // add promise support 250 | ['createOffer', 'createAnswer'].forEach(function(method) { 251 | var nativeMethod = webkitRTCPeerConnection.prototype[method]; 252 | webkitRTCPeerConnection.prototype[method] = function() { 253 | var self = this; 254 | if (arguments.length < 1 || (arguments.length === 1 && 255 | typeof(arguments[0]) === 'object')) { 256 | var opts = arguments.length === 1 ? arguments[0] : undefined; 257 | return new Promise(function(resolve, reject) { 258 | nativeMethod.apply(self, [resolve, reject, opts]); 259 | }); 260 | } else { 261 | return nativeMethod.apply(this, arguments); 262 | } 263 | }; 264 | }); 265 | 266 | ['setLocalDescription', 'setRemoteDescription', 267 | 'addIceCandidate'].forEach(function(method) { 268 | var nativeMethod = webkitRTCPeerConnection.prototype[method]; 269 | webkitRTCPeerConnection.prototype[method] = function() { 270 | var args = arguments; 271 | var self = this; 272 | return new Promise(function(resolve, reject) { 273 | nativeMethod.apply(self, [args[0], 274 | function() { 275 | resolve(); 276 | if (args.length >= 2) { 277 | args[1].apply(null, []); 278 | } 279 | }, 280 | function(err) { 281 | reject(err); 282 | if (args.length >= 3) { 283 | args[2].apply(null, [err]); 284 | } 285 | }] 286 | ); 287 | }); 288 | }; 289 | }); 290 | }, 291 | 292 | shimGetUserMedia: function() { 293 | var constraintsToChrome_ = function(c) { 294 | if (typeof c !== 'object' || c.mandatory || c.optional) { 295 | return c; 296 | } 297 | var cc = {}; 298 | Object.keys(c).forEach(function(key) { 299 | if (key === 'require' || key === 'advanced' || key === 'mediaSource') { 300 | return; 301 | } 302 | var r = (typeof c[key] === 'object') ? c[key] : {ideal: c[key]}; 303 | if (r.exact !== undefined && typeof r.exact === 'number') { 304 | r.min = r.max = r.exact; 305 | } 306 | var oldname_ = function(prefix, name) { 307 | if (prefix) { 308 | return prefix + name.charAt(0).toUpperCase() + name.slice(1); 309 | } 310 | return (name === 'deviceId') ? 
'sourceId' : name; 311 | }; 312 | if (r.ideal !== undefined) { 313 | cc.optional = cc.optional || []; 314 | var oc = {}; 315 | if (typeof r.ideal === 'number') { 316 | oc[oldname_('min', key)] = r.ideal; 317 | cc.optional.push(oc); 318 | oc = {}; 319 | oc[oldname_('max', key)] = r.ideal; 320 | cc.optional.push(oc); 321 | } else { 322 | oc[oldname_('', key)] = r.ideal; 323 | cc.optional.push(oc); 324 | } 325 | } 326 | if (r.exact !== undefined && typeof r.exact !== 'number') { 327 | cc.mandatory = cc.mandatory || {}; 328 | cc.mandatory[oldname_('', key)] = r.exact; 329 | } else { 330 | ['min', 'max'].forEach(function(mix) { 331 | if (r[mix] !== undefined) { 332 | cc.mandatory = cc.mandatory || {}; 333 | cc.mandatory[oldname_(mix, key)] = r[mix]; 334 | } 335 | }); 336 | } 337 | }); 338 | if (c.advanced) { 339 | cc.optional = (cc.optional || []).concat(c.advanced); 340 | } 341 | return cc; 342 | }; 343 | 344 | var getUserMedia_ = function(constraints, onSuccess, onError) { 345 | if (constraints.audio) { 346 | constraints.audio = constraintsToChrome_(constraints.audio); 347 | } 348 | if (constraints.video) { 349 | constraints.video = constraintsToChrome_(constraints.video); 350 | } 351 | logging('chrome: ' + JSON.stringify(constraints)); 352 | return navigator.webkitGetUserMedia(constraints, onSuccess, onError); 353 | }; 354 | navigator.getUserMedia = getUserMedia_; 355 | 356 | // Returns the result of getUserMedia as a Promise. 357 | var getUserMediaPromise_ = function(constraints) { 358 | return new Promise(function(resolve, reject) { 359 | navigator.getUserMedia(constraints, resolve, reject); 360 | }); 361 | } 362 | 363 | if (!navigator.mediaDevices) { 364 | navigator.mediaDevices = {getUserMedia: getUserMediaPromise_, 365 | enumerateDevices: function() { 366 | return new Promise(function(resolve) { 367 | var kinds = {audio: 'audioinput', video: 'videoinput'}; 368 | return MediaStreamTrack.getSources(function(devices) { 369 | resolve(devices.map(function(device) { 370 | return {label: device.label, 371 | kind: kinds[device.kind], 372 | deviceId: device.id, 373 | groupId: ''}; 374 | })); 375 | }); 376 | }); 377 | }}; 378 | } 379 | 380 | // A shim for getUserMedia method on the mediaDevices object. 381 | // TODO(KaptenJansson) remove once implemented in Chrome stable. 382 | if (!navigator.mediaDevices.getUserMedia) { 383 | navigator.mediaDevices.getUserMedia = function(constraints) { 384 | return getUserMediaPromise_(constraints); 385 | }; 386 | } else { 387 | // Even though Chrome 45 has navigator.mediaDevices and a getUserMedia 388 | // function which returns a Promise, it does not accept spec-style 389 | // constraints. 390 | var origGetUserMedia = navigator.mediaDevices.getUserMedia. 391 | bind(navigator.mediaDevices); 392 | navigator.mediaDevices.getUserMedia = function(c) { 393 | if (c) { 394 | logging('spec: ' + JSON.stringify(c)); // whitespace for alignment 395 | c.audio = constraintsToChrome_(c.audio); 396 | c.video = constraintsToChrome_(c.video); 397 | logging('chrome: ' + JSON.stringify(c)); 398 | } 399 | return origGetUserMedia(c); 400 | }.bind(this); 401 | } 402 | 403 | // Dummy devicechange event methods. 404 | // TODO(KaptenJansson) remove once implemented in Chrome stable. 
405 | if (typeof navigator.mediaDevices.addEventListener === 'undefined') { 406 | navigator.mediaDevices.addEventListener = function() { 407 | logging('Dummy mediaDevices.addEventListener called.'); 408 | }; 409 | } 410 | if (typeof navigator.mediaDevices.removeEventListener === 'undefined') { 411 | navigator.mediaDevices.removeEventListener = function() { 412 | logging('Dummy mediaDevices.removeEventListener called.'); 413 | }; 414 | } 415 | }, 416 | 417 | // Attach a media stream to an element. 418 | attachMediaStream: function(element, stream) { 419 | logging('DEPRECATED, attachMediaStream will soon be removed.'); 420 | if (browserDetails.version >= 43) { 421 | element.srcObject = stream; 422 | } else if (typeof element.src !== 'undefined') { 423 | element.src = URL.createObjectURL(stream); 424 | } else { 425 | logging('Error attaching stream to element.'); 426 | } 427 | }, 428 | 429 | reattachMediaStream: function(to, from) { 430 | logging('DEPRECATED, reattachMediaStream will soon be removed.'); 431 | if (browserDetails.version >= 43) { 432 | to.srcObject = from.srcObject; 433 | } else { 434 | to.src = from.src; 435 | } 436 | } 437 | } 438 | 439 | // Expose public methods. 440 | module.exports = { 441 | shimOnTrack: chromeShim.shimOnTrack, 442 | shimSourceObject: chromeShim.shimSourceObject, 443 | shimPeerConnection: chromeShim.shimPeerConnection, 444 | shimGetUserMedia: chromeShim.shimGetUserMedia, 445 | attachMediaStream: chromeShim.attachMediaStream, 446 | reattachMediaStream: chromeShim.reattachMediaStream 447 | }; 448 | 449 | },{"../utils.js":6}],3:[function(require,module,exports){ 450 | /* 451 | * Copyright (c) 2016 The WebRTC project authors. All Rights Reserved. 452 | * 453 | * Use of this source code is governed by a BSD-style license 454 | * that can be found in the LICENSE file in the root of the source 455 | * tree. 456 | */ 457 | 'use strict'; 458 | 459 | // SDP helpers. 460 | var SDPUtils = {}; 461 | 462 | // Generate an alphanumeric identifier for cname or mids. 463 | // TODO: use UUIDs instead? https://gist.github.com/jed/982883 464 | SDPUtils.generateIdentifier = function() { 465 | return Math.random().toString(36).substr(2, 10); 466 | }; 467 | 468 | // The RTCP CNAME used by all peerconnections from the same JS. 469 | SDPUtils.localCName = SDPUtils.generateIdentifier(); 470 | 471 | 472 | // Splits SDP into lines, dealing with both CRLF and LF. 473 | SDPUtils.splitLines = function(blob) { 474 | return blob.trim().split('\n').map(function(line) { 475 | return line.trim(); 476 | }); 477 | }; 478 | // Splits SDP into sessionpart and mediasections. Ensures CRLF. 479 | SDPUtils.splitSections = function(blob) { 480 | var parts = blob.split('\r\nm='); 481 | return parts.map(function(part, index) { 482 | return (index > 0 ? 'm=' + part : part).trim() + '\r\n'; 483 | }); 484 | }; 485 | 486 | // Returns lines that start with a certain prefix. 487 | SDPUtils.matchPrefix = function(blob, prefix) { 488 | return SDPUtils.splitLines(blob).filter(function(line) { 489 | return line.indexOf(prefix) === 0; 490 | }); 491 | }; 492 | 493 | // Parses an ICE candidate line. Sample input: 494 | // candidate:702786350 2 udp 41819902 8.8.8.8 60769 typ relay raddr 8.8.8.8 rport 55996" 495 | SDPUtils.parseCandidate = function(line) { 496 | var parts; 497 | // Parse both variants. 
498 | if (line.indexOf('a=candidate:') === 0) {
499 | parts = line.substring(12).split(' ');
500 | } else {
501 | parts = line.substring(10).split(' ');
502 | }
503 |
504 | var candidate = {
505 | foundation: parts[0],
506 | component: parts[1],
507 | protocol: parts[2].toLowerCase(),
508 | priority: parseInt(parts[3], 10),
509 | ip: parts[4],
510 | port: parseInt(parts[5], 10),
511 | // skip parts[6] == 'typ'
512 | type: parts[7]
513 | };
514 |
515 | for (var i = 8; i < parts.length; i += 2) {
516 | switch (parts[i]) {
517 | case 'raddr':
518 | candidate.relatedAddress = parts[i + 1];
519 | break;
520 | case 'rport':
521 | candidate.relatedPort = parseInt(parts[i + 1], 10);
522 | break;
523 | case 'tcptype':
524 | candidate.tcpType = parts[i + 1];
525 | break;
526 | default: // Unknown extensions are silently ignored.
527 | break;
528 | }
529 | }
530 | return candidate;
531 | };
532 |
533 | // Translates a candidate object into SDP candidate attribute.
534 | SDPUtils.writeCandidate = function(candidate) {
535 | var sdp = [];
536 | sdp.push(candidate.foundation);
537 | sdp.push(candidate.component);
538 | sdp.push(candidate.protocol.toUpperCase());
539 | sdp.push(candidate.priority);
540 | sdp.push(candidate.ip);
541 | sdp.push(candidate.port);
542 |
543 | var type = candidate.type;
544 | sdp.push('typ');
545 | sdp.push(type);
546 | if (type !== 'host' && candidate.relatedAddress &&
547 | candidate.relatedPort) {
548 | sdp.push('raddr');
549 | sdp.push(candidate.relatedAddress); // was: relAddr
550 | sdp.push('rport');
551 | sdp.push(candidate.relatedPort); // was: relPort
552 | }
553 | if (candidate.tcpType && candidate.protocol.toLowerCase() === 'tcp') {
554 | sdp.push('tcptype');
555 | sdp.push(candidate.tcpType);
556 | }
557 | return 'candidate:' + sdp.join(' ');
558 | };
559 |
560 | // Parses an rtpmap line, returns RTCRtpCodecParameters. Sample input:
561 | // a=rtpmap:111 opus/48000/2
562 | SDPUtils.parseRtpMap = function(line) {
563 | var parts = line.substr(9).split(' ');
564 | var parsed = {
565 | payloadType: parseInt(parts.shift(), 10) // was: id
566 | };
567 |
568 | parts = parts[0].split('/');
569 |
570 | parsed.name = parts[0];
571 | parsed.clockRate = parseInt(parts[1], 10); // was: clockrate
572 | parsed.numChannels = parts.length === 3 ? parseInt(parts[2], 10) : 1; // was: channels
573 | return parsed;
574 | };
575 |
576 | // Generate an a=rtpmap line from RTCRtpCodecCapability or RTCRtpCodecParameters.
577 | SDPUtils.writeRtpMap = function(codec) {
578 | var pt = codec.payloadType;
579 | if (codec.preferredPayloadType !== undefined) {
580 | pt = codec.preferredPayloadType;
581 | }
582 | return 'a=rtpmap:' + pt + ' ' + codec.name + '/' + codec.clockRate +
583 | (codec.numChannels !== 1 ? '/' + codec.numChannels : '') + '\r\n';
584 | };
585 |
586 | // Parses an fmtp line, returns dictionary. Sample input:
587 | // a=fmtp:96 vbr=on;cng=on
588 | // Also deals with vbr=on; cng=on
589 | SDPUtils.parseFmtp = function(line) {
590 | var parsed = {};
591 | var kv;
592 | var parts = line.substr(line.indexOf(' ') + 1).split(';');
593 | for (var j = 0; j < parts.length; j++) {
594 | kv = parts[j].trim().split('=');
595 | parsed[kv[0].trim()] = kv[1];
596 | }
597 | return parsed;
598 | };
599 |
600 | // Generates an a=fmtp line from RTCRtpCodecCapability or RTCRtpCodecParameters.
601 | SDPUtils.writeFtmp = function(codec) {
602 | var line = '';
603 | var pt = codec.payloadType;
604 | if (codec.preferredPayloadType !== undefined) {
605 | pt = codec.preferredPayloadType;
606 | }
607 | if (codec.parameters && Object.keys(codec.parameters).length) { // parameters is a dictionary, so check its key count rather than .length
608 | var params = [];
609 | Object.keys(codec.parameters).forEach(function(param) {
610 | params.push(param + '=' + codec.parameters[param]);
611 | });
612 | line += 'a=fmtp:' + pt + ' ' + params.join(';') + '\r\n';
613 | }
614 | return line;
615 | };
616 |
617 | // Parses an rtcp-fb line, returns RTCRtcpFeedback object. Sample input:
618 | // a=rtcp-fb:98 nack rpsi
619 | SDPUtils.parseRtcpFb = function(line) {
620 | var parts = line.substr(line.indexOf(' ') + 1).split(' ');
621 | return {
622 | type: parts.shift(),
623 | parameter: parts.join(' ')
624 | };
625 | };
626 | // Generate a=rtcp-fb lines from RTCRtpCodecCapability or RTCRtpCodecParameters.
627 | SDPUtils.writeRtcpFb = function(codec) {
628 | var lines = '';
629 | var pt = codec.payloadType;
630 | if (codec.preferredPayloadType !== undefined) {
631 | pt = codec.preferredPayloadType;
632 | }
633 | if (codec.rtcpFeedback && codec.rtcpFeedback.length) {
634 | // FIXME: special handling for trr-int?
635 | codec.rtcpFeedback.forEach(function(fb) {
636 | lines += 'a=rtcp-fb:' + pt + ' ' + fb.type + ' ' + fb.parameter +
637 | '\r\n';
638 | });
639 | }
640 | return lines;
641 | };
642 |
643 | // Parses an RFC 5576 ssrc media attribute. Sample input:
644 | // a=ssrc:3735928559 cname:something
645 | SDPUtils.parseSsrcMedia = function(line) {
646 | var sp = line.indexOf(' ');
647 | var parts = {
648 | ssrc: line.substr(7, sp - 7),
649 | };
650 | var colon = line.indexOf(':', sp);
651 | if (colon > -1) {
652 | parts.attribute = line.substr(sp + 1, colon - sp - 1);
653 | parts.value = line.substr(colon + 1);
654 | } else {
655 | parts.attribute = line.substr(sp + 1);
656 | }
657 | return parts;
658 | };
659 |
660 | // Extracts DTLS parameters from SDP media section or sessionpart.
661 | // FIXME: for consistency with other functions this should only
662 | // get the fingerprint line as input. See also getIceParameters.
663 | SDPUtils.getDtlsParameters = function(mediaSection, sessionpart) {
664 | var lines = SDPUtils.splitLines(mediaSection);
665 | lines = lines.concat(SDPUtils.splitLines(sessionpart)); // Search in session part, too.
666 | var fpLine = lines.filter(function(line) {
667 | return line.indexOf('a=fingerprint:') === 0;
668 | })[0].substr(14);
669 | // Note: a=setup line is ignored since we use the 'auto' role.
670 | var dtlsParameters = {
671 | role: 'auto',
672 | fingerprints: [{
673 | algorithm: fpLine.split(' ')[0],
674 | value: fpLine.split(' ')[1]
675 | }]
676 | };
677 | return dtlsParameters;
678 | };
679 |
680 | // Serializes DTLS parameters to SDP.
681 | SDPUtils.writeDtlsParameters = function(params, setupType) {
682 | var sdp = 'a=setup:' + setupType + '\r\n';
683 | params.fingerprints.forEach(function(fp) {
684 | sdp += 'a=fingerprint:' + fp.algorithm + ' ' + fp.value + '\r\n';
685 | });
686 | return sdp;
687 | };
688 | // Parses ICE information from SDP media section or sessionpart.
689 | // FIXME: for consistency with other functions this should only
690 | // get the ice-ufrag and ice-pwd lines as input.
691 | SDPUtils.getIceParameters = function(mediaSection, sessionpart) {
692 | var lines = SDPUtils.splitLines(mediaSection);
693 | lines = lines.concat(SDPUtils.splitLines(sessionpart)); // Search in session part, too.
694 | var iceParameters = {
695 | usernameFragment: lines.filter(function(line) {
696 | return line.indexOf('a=ice-ufrag:') === 0;
697 | })[0].substr(12),
698 | password: lines.filter(function(line) {
699 | return line.indexOf('a=ice-pwd:') === 0;
700 | })[0].substr(10)
701 | };
702 | return iceParameters;
703 | };
704 |
705 | // Serializes ICE parameters to SDP.
706 | SDPUtils.writeIceParameters = function(params) {
707 | return 'a=ice-ufrag:' + params.usernameFragment + '\r\n' +
708 | 'a=ice-pwd:' + params.password + '\r\n';
709 | };
710 |
711 | // Parses the SDP media section and returns RTCRtpParameters.
712 | SDPUtils.parseRtpParameters = function(mediaSection) {
713 | var description = {
714 | codecs: [],
715 | headerExtensions: [],
716 | fecMechanisms: [],
717 | rtcp: []
718 | };
719 | var lines = SDPUtils.splitLines(mediaSection);
720 | var mline = lines[0].split(' ');
721 | for (var i = 3; i < mline.length; i++) { // find all codecs from mline[3..]
722 | var pt = mline[i];
723 | var rtpmapline = SDPUtils.matchPrefix(
724 | mediaSection, 'a=rtpmap:' + pt + ' ')[0];
725 | if (rtpmapline) {
726 | var codec = SDPUtils.parseRtpMap(rtpmapline);
727 | var fmtps = SDPUtils.matchPrefix(
728 | mediaSection, 'a=fmtp:' + pt + ' ');
729 | // Only the first a=fmtp: is considered.
730 | codec.parameters = fmtps.length ? SDPUtils.parseFmtp(fmtps[0]) : {};
731 | codec.rtcpFeedback = SDPUtils.matchPrefix(
732 | mediaSection, 'a=rtcp-fb:' + pt + ' ')
733 | .map(SDPUtils.parseRtcpFb);
734 | description.codecs.push(codec);
735 | }
736 | }
737 | // FIXME: parse headerExtensions, fecMechanisms and rtcp.
738 | return description;
739 | };
740 |
741 | // Generates parts of the SDP media section describing the capabilities / parameters.
742 | SDPUtils.writeRtpDescription = function(kind, caps) {
743 | var sdp = '';
744 |
745 | // Build the mline.
746 | sdp += 'm=' + kind + ' ';
747 | sdp += caps.codecs.length > 0 ? '9' : '0'; // reject if no codecs.
748 | sdp += ' UDP/TLS/RTP/SAVPF ';
749 | sdp += caps.codecs.map(function(codec) {
750 | if (codec.preferredPayloadType !== undefined) {
751 | return codec.preferredPayloadType;
752 | }
753 | return codec.payloadType;
754 | }).join(' ') + '\r\n';
755 |
756 | sdp += 'c=IN IP4 0.0.0.0\r\n';
757 | sdp += 'a=rtcp:9 IN IP4 0.0.0.0\r\n';
758 |
759 | // Add a=rtpmap lines for each codec. Also fmtp and rtcp-fb.
760 | caps.codecs.forEach(function(codec) {
761 | sdp += SDPUtils.writeRtpMap(codec);
762 | sdp += SDPUtils.writeFtmp(codec);
763 | sdp += SDPUtils.writeRtcpFb(codec);
764 | });
765 | // FIXME: add headerExtensions, fecMechanisms and rtcp.
766 | sdp += 'a=rtcp-mux\r\n';
767 | return sdp;
768 | };
769 |
770 | SDPUtils.writeSessionBoilerplate = function() {
771 | // FIXME: sess-id should be an NTP timestamp.
772 | return 'v=0\r\n' +
773 | 'o=thisisadapterortc 8169639915646943137 2 IN IP4 127.0.0.1\r\n' +
774 | 's=-\r\n' +
775 | 't=0 0\r\n';
776 | };
777 |
778 | SDPUtils.writeMediaSection = function(transceiver, caps, type, stream) {
779 | var sdp = SDPUtils.writeRtpDescription(transceiver.kind, caps);
780 |
781 | // Map ICE parameters (ufrag, pwd) to SDP.
782 | sdp += SDPUtils.writeIceParameters(
783 | transceiver.iceGatherer.getLocalParameters());
784 |
785 | // Map DTLS parameters to SDP.
786 | sdp += SDPUtils.writeDtlsParameters(
787 | transceiver.dtlsTransport.getLocalParameters(),
788 | type === 'offer' ?
'actpass' : 'active'); 789 | 790 | sdp += 'a=mid:' + transceiver.mid + '\r\n'; 791 | 792 | if (transceiver.rtpSender && transceiver.rtpReceiver) { 793 | sdp += 'a=sendrecv\r\n'; 794 | } else if (transceiver.rtpSender) { 795 | sdp += 'a=sendonly\r\n'; 796 | } else if (transceiver.rtpReceiver) { 797 | sdp += 'a=recvonly\r\n'; 798 | } else { 799 | sdp += 'a=inactive\r\n'; 800 | } 801 | 802 | // FIXME: for RTX there might be multiple SSRCs. Not implemented in Edge yet. 803 | if (transceiver.rtpSender) { 804 | var msid = 'msid:' + stream.id + ' ' + 805 | transceiver.rtpSender.track.id + '\r\n'; 806 | sdp += 'a=' + msid; 807 | sdp += 'a=ssrc:' + transceiver.sendSsrc + ' ' + msid; 808 | } 809 | // FIXME: this should be written by writeRtpDescription. 810 | sdp += 'a=ssrc:' + transceiver.sendSsrc + ' cname:' + 811 | SDPUtils.localCName + '\r\n'; 812 | return sdp; 813 | }; 814 | 815 | // Gets the direction from the mediaSection or the sessionpart. 816 | SDPUtils.getDirection = function(mediaSection, sessionpart) { 817 | // Look for sendrecv, sendonly, recvonly, inactive, default to sendrecv. 818 | var lines = SDPUtils.splitLines(mediaSection); 819 | for (var i = 0; i < lines.length; i++) { 820 | switch (lines[i]) { 821 | case 'a=sendrecv': 822 | case 'a=sendonly': 823 | case 'a=recvonly': 824 | case 'a=inactive': 825 | return lines[i].substr(2); 826 | } 827 | } 828 | if (sessionpart) { 829 | return SDPUtils.getDirection(sessionpart); 830 | } 831 | return 'sendrecv'; 832 | }; 833 | 834 | // Expose public methods. 835 | module.exports = SDPUtils; 836 | 837 | },{}],4:[function(require,module,exports){ 838 | /* 839 | * Copyright (c) 2016 The WebRTC project authors. All Rights Reserved. 840 | * 841 | * Use of this source code is governed by a BSD-style license 842 | * that can be found in the LICENSE file in the root of the source 843 | * tree. 844 | */ 845 | 'use strict'; 846 | 847 | var SDPUtils = require('./edge_sdp'); 848 | var logging = require('../utils').log; 849 | var browserDetails = require('../utils').browserDetails; 850 | 851 | var edgeShim = { 852 | shimPeerConnection: function() { 853 | if (window.RTCIceGatherer) { 854 | // ORTC defines an RTCIceCandidate object but no constructor. 855 | // Not implemented in Edge. 856 | if (!window.RTCIceCandidate) { 857 | window.RTCIceCandidate = function(args) { 858 | return args; 859 | }; 860 | } 861 | // ORTC does not have a session description object but 862 | // other browsers (i.e. Chrome) that will support both PC and ORTC 863 | // in the future might have this defined already. 
864 | if (!window.RTCSessionDescription) { 865 | window.RTCSessionDescription = function(args) { 866 | return args; 867 | }; 868 | } 869 | } 870 | 871 | window.RTCPeerConnection = function(config) { 872 | var self = this; 873 | 874 | this.onicecandidate = null; 875 | this.onaddstream = null; 876 | this.onremovestream = null; 877 | this.onsignalingstatechange = null; 878 | this.oniceconnectionstatechange = null; 879 | this.onnegotiationneeded = null; 880 | this.ondatachannel = null; 881 | 882 | this.localStreams = []; 883 | this.remoteStreams = []; 884 | this.getLocalStreams = function() { return self.localStreams; }; 885 | this.getRemoteStreams = function() { return self.remoteStreams; }; 886 | 887 | this.localDescription = new RTCSessionDescription({ 888 | type: '', 889 | sdp: '' 890 | }); 891 | this.remoteDescription = new RTCSessionDescription({ 892 | type: '', 893 | sdp: '' 894 | }); 895 | this.signalingState = 'stable'; 896 | this.iceConnectionState = 'new'; 897 | 898 | this.iceOptions = { 899 | gatherPolicy: 'all', 900 | iceServers: [] 901 | }; 902 | if (config && config.iceTransportPolicy) { 903 | switch (config.iceTransportPolicy) { 904 | case 'all': 905 | case 'relay': 906 | this.iceOptions.gatherPolicy = config.iceTransportPolicy; 907 | break; 908 | case 'none': 909 | // FIXME: remove once implementation and spec have added this. 910 | throw new TypeError('iceTransportPolicy "none" not supported'); 911 | } 912 | } 913 | if (config && config.iceServers) { 914 | // Edge does not like 915 | // 1) stun: 916 | // 2) turn: that does not have all of turn:host:port?transport=udp 917 | this.iceOptions.iceServers = config.iceServers.filter(function(server) { 918 | if (server && server.urls) { 919 | server.urls = server.urls.filter(function(url) { 920 | return url.indexOf('transport=udp') !== -1; 921 | })[0]; 922 | return true; 923 | } 924 | return false; 925 | }); 926 | } 927 | 928 | // per-track iceGathers, iceTransports, dtlsTransports, rtpSenders, ... 929 | // everything that is needed to describe a SDP m-line. 930 | this.transceivers = []; 931 | 932 | // since the iceGatherer is currently created in createOffer but we 933 | // must not emit candidates until after setLocalDescription we buffer 934 | // them in this array. 935 | this._localIceCandidatesBuffer = []; 936 | }; 937 | 938 | window.RTCPeerConnection.prototype._emitBufferedCandidates = function() { 939 | var self = this; 940 | // FIXME: need to apply ice candidates in a way which is async but in-order 941 | this._localIceCandidatesBuffer.forEach(function(event) { 942 | if (self.onicecandidate !== null) { 943 | self.onicecandidate(event); 944 | } 945 | }); 946 | this._localIceCandidatesBuffer = []; 947 | }; 948 | 949 | window.RTCPeerConnection.prototype.addStream = function(stream) { 950 | // Clone is necessary for local demos mostly, attaching directly 951 | // to two different senders does not work (build 10547). 952 | this.localStreams.push(stream.clone()); 953 | this._maybeFireNegotiationNeeded(); 954 | }; 955 | 956 | window.RTCPeerConnection.prototype.removeStream = function(stream) { 957 | var idx = this.localStreams.indexOf(stream); 958 | if (idx > -1) { 959 | this.localStreams.splice(idx, 1); 960 | this._maybeFireNegotiationNeeded(); 961 | } 962 | }; 963 | 964 | // Determines the intersection of local and remote capabilities. 
965 | window.RTCPeerConnection.prototype._getCommonCapabilities =
966 | function(localCapabilities, remoteCapabilities) {
967 | var commonCapabilities = {
968 | codecs: [],
969 | headerExtensions: [],
970 | fecMechanisms: []
971 | };
972 | localCapabilities.codecs.forEach(function(lCodec) {
973 | for (var i = 0; i < remoteCapabilities.codecs.length; i++) {
974 | var rCodec = remoteCapabilities.codecs[i];
975 | if (lCodec.name.toLowerCase() === rCodec.name.toLowerCase() &&
976 | lCodec.clockRate === rCodec.clockRate &&
977 | lCodec.numChannels === rCodec.numChannels) {
978 | // push rCodec so we reply with offerer payload type
979 | commonCapabilities.codecs.push(rCodec);
980 |
981 | // FIXME: also need to determine intersection between
982 | // .rtcpFeedback and .parameters
983 | break;
984 | }
985 | }
986 | });
987 |
988 | localCapabilities.headerExtensions.forEach(function(lHeaderExtension) {
989 | for (var i = 0; i < remoteCapabilities.headerExtensions.length; i++) {
990 | var rHeaderExtension = remoteCapabilities.headerExtensions[i];
991 | if (lHeaderExtension.uri === rHeaderExtension.uri) {
992 | commonCapabilities.headerExtensions.push(rHeaderExtension);
993 | break;
994 | }
995 | }
996 | });
997 |
998 | // FIXME: fecMechanisms
999 | return commonCapabilities;
1000 | };
1001 |
1002 | // Create ICE gatherer, ICE transport and DTLS transport.
1003 | window.RTCPeerConnection.prototype._createIceAndDtlsTransports =
1004 | function(mid, sdpMLineIndex) {
1005 | var self = this;
1006 | var iceGatherer = new RTCIceGatherer(self.iceOptions);
1007 | var iceTransport = new RTCIceTransport(iceGatherer);
1008 | iceGatherer.onlocalcandidate = function(evt) {
1009 | var event = {};
1010 | event.candidate = {sdpMid: mid, sdpMLineIndex: sdpMLineIndex};
1011 |
1012 | var cand = evt.candidate;
1013 | // Edge emits an empty object for RTCIceCandidateComplete.
1014 | if (!cand || Object.keys(cand).length === 0) {
1015 | // polyfill since RTCIceGatherer.state is not implemented in Edge 10547 yet.
1016 | if (iceGatherer.state === undefined) {
1017 | iceGatherer.state = 'completed';
1018 | }
1019 |
1020 | // Emit a candidate with type endOfCandidates to make the samples work.
1021 | // Edge requires addIceCandidate with this empty candidate to start checking.
1022 | // The real solution is to signal end-of-candidates to the other side when
1023 | // getting the null candidate but some apps (like the samples) don't do that.
1024 | event.candidate.candidate =
1025 | 'candidate:1 1 udp 1 0.0.0.0 9 typ endOfCandidates';
1026 | } else {
1027 | // RTCIceCandidate doesn't have a component, needs to be added
1028 | cand.component = iceTransport.component === 'RTCP' ? 2 : 1;
1029 | event.candidate.candidate = SDPUtils.writeCandidate(cand);
1030 | }
1031 |
1032 | var complete = self.transceivers.every(function(transceiver) {
1033 | return transceiver.iceGatherer &&
1034 | transceiver.iceGatherer.state === 'completed';
1035 | });
1036 | // FIXME: update .localDescription with candidate and (potentially) end-of-candidates.
1037 | // To make this harder, the gatherer might emit candidates before localdescription
1038 | // is set. To make things worse, gather.getLocalCandidates still errors in
1039 | // Edge 10547 when no candidates have been gathered yet.
1040 |
1041 | if (self.onicecandidate !== null) {
1042 | // Emit candidate if localDescription is set.
1043 | // Also emits null candidate when all gatherers are complete.
1044 | if (self.localDescription && self.localDescription.type === '') {
1045 | self._localIceCandidatesBuffer.push(event);
1046 | if (complete) {
1047 | self._localIceCandidatesBuffer.push({});
1048 | }
1049 | } else {
1050 | self.onicecandidate(event);
1051 | if (complete) {
1052 | self.onicecandidate({});
1053 | }
1054 | }
1055 | }
1056 | };
1057 | iceTransport.onicestatechange = function() {
1058 | self._updateConnectionState();
1059 | };
1060 |
1061 | var dtlsTransport = new RTCDtlsTransport(iceTransport);
1062 | dtlsTransport.ondtlsstatechange = function() {
1063 | self._updateConnectionState();
1064 | };
1065 | dtlsTransport.onerror = function() {
1066 | // onerror does not set state to failed by itself.
1067 | dtlsTransport.state = 'failed';
1068 | self._updateConnectionState();
1069 | };
1070 |
1071 | return {
1072 | iceGatherer: iceGatherer,
1073 | iceTransport: iceTransport,
1074 | dtlsTransport: dtlsTransport
1075 | };
1076 | };
1077 |
1078 | // Start the RTP Sender and Receiver for a transceiver.
1079 | window.RTCPeerConnection.prototype._transceive = function(transceiver,
1080 | send, recv) {
1081 | var params = this._getCommonCapabilities(transceiver.localCapabilities,
1082 | transceiver.remoteCapabilities);
1083 | if (send && transceiver.rtpSender) {
1084 | params.encodings = [{
1085 | ssrc: transceiver.sendSsrc
1086 | }];
1087 | params.rtcp = {
1088 | cname: SDPUtils.localCName,
1089 | ssrc: transceiver.recvSsrc
1090 | };
1091 | transceiver.rtpSender.send(params);
1092 | }
1093 | if (recv && transceiver.rtpReceiver) {
1094 | params.encodings = [{
1095 | ssrc: transceiver.recvSsrc
1096 | }];
1097 | params.rtcp = {
1098 | cname: transceiver.cname,
1099 | ssrc: transceiver.sendSsrc
1100 | };
1101 | transceiver.rtpReceiver.receive(params);
1102 | }
1103 | };
1104 |
1105 | window.RTCPeerConnection.prototype.setLocalDescription =
1106 | function(description) {
1107 | var self = this;
1108 | if (description.type === 'offer') {
1109 | if (this._pendingOffer) {
1110 | // Adopt the transceivers that were prepared in createOffer.
1111 | this.transceivers = this._pendingOffer;
1112 | delete this._pendingOffer;
1113 | }
1114 | } else if (description.type === 'answer') {
1115 | var sections = SDPUtils.splitSections(self.remoteDescription.sdp);
1116 | var sessionpart = sections.shift();
1117 | sections.forEach(function(mediaSection, sdpMLineIndex) {
1118 | var transceiver = self.transceivers[sdpMLineIndex];
1119 | var iceGatherer = transceiver.iceGatherer;
1120 | var iceTransport = transceiver.iceTransport;
1121 | var dtlsTransport = transceiver.dtlsTransport;
1122 | var localCapabilities = transceiver.localCapabilities;
1123 | var remoteCapabilities = transceiver.remoteCapabilities;
1124 | var rejected = mediaSection.split('\n', 1)[0]
1125 | .split(' ', 2)[1] === '0';
1126 |
1127 | if (!rejected) {
1128 | var remoteIceParameters = SDPUtils.getIceParameters(mediaSection,
1129 | sessionpart);
1130 | iceTransport.start(iceGatherer, remoteIceParameters, 'controlled');
1131 |
1132 | var remoteDtlsParameters = SDPUtils.getDtlsParameters(mediaSection,
1133 | sessionpart);
1134 | dtlsTransport.start(remoteDtlsParameters);
1135 |
1136 | // Calculate intersection of capabilities.
1137 | var params = self._getCommonCapabilities(localCapabilities,
1138 | remoteCapabilities);
1139 |
1140 | // Start the RTCRtpSender. The RTCRtpReceiver for this transceiver
1141 | // has already been started in setRemoteDescription.
1142 | self._transceive(transceiver, 1143 | params.codecs.length > 0, 1144 | false); 1145 | } 1146 | }); 1147 | } 1148 | 1149 | this.localDescription = description; 1150 | switch (description.type) { 1151 | case 'offer': 1152 | this._updateSignalingState('have-local-offer'); 1153 | break; 1154 | case 'answer': 1155 | this._updateSignalingState('stable'); 1156 | break; 1157 | default: 1158 | throw new TypeError('unsupported type "' + description.type + '"'); 1159 | } 1160 | 1161 | // If a success callback was provided, emit ICE candidates after it has been 1162 | // executed. Otherwise, emit callback after the Promise is resolved. 1163 | var hasCallback = arguments.length > 1 && 1164 | typeof arguments[1] === 'function'; 1165 | if (hasCallback) { 1166 | var cb = arguments[1]; 1167 | window.setTimeout(function() { 1168 | cb(); 1169 | self._emitBufferedCandidates(); 1170 | }, 0); 1171 | } 1172 | var p = Promise.resolve(); 1173 | p.then(function() { 1174 | if (!hasCallback) { 1175 | window.setTimeout(self._emitBufferedCandidates.bind(self), 0); 1176 | } 1177 | }); 1178 | return p; 1179 | }; 1180 | 1181 | window.RTCPeerConnection.prototype.setRemoteDescription = 1182 | function(description) { 1183 | var self = this; 1184 | var stream = new MediaStream(); 1185 | var sections = SDPUtils.splitSections(description.sdp); 1186 | var sessionpart = sections.shift(); 1187 | sections.forEach(function(mediaSection, sdpMLineIndex) { 1188 | var lines = SDPUtils.splitLines(mediaSection); 1189 | var mline = lines[0].substr(2).split(' '); 1190 | var kind = mline[0]; 1191 | var rejected = mline[1] === '0'; 1192 | var direction = SDPUtils.getDirection(mediaSection, sessionpart); 1193 | 1194 | var transceiver; 1195 | var iceGatherer; 1196 | var iceTransport; 1197 | var dtlsTransport; 1198 | var rtpSender; 1199 | var rtpReceiver; 1200 | var sendSsrc; 1201 | var recvSsrc; 1202 | var localCapabilities; 1203 | 1204 | // FIXME: ensure the mediaSection has rtcp-mux set. 1205 | var remoteCapabilities = SDPUtils.parseRtpParameters(mediaSection); 1206 | var remoteIceParameters; 1207 | var remoteDtlsParameters; 1208 | if (!rejected) { 1209 | remoteIceParameters = SDPUtils.getIceParameters(mediaSection, 1210 | sessionpart); 1211 | remoteDtlsParameters = SDPUtils.getDtlsParameters(mediaSection, 1212 | sessionpart); 1213 | } 1214 | var mid = SDPUtils.matchPrefix(mediaSection, 'a=mid:')[0].substr(6); 1215 | 1216 | var cname; 1217 | // Gets the first SSRC. Note that with RTX there might be multiple SSRCs. 1218 | var remoteSsrc = SDPUtils.matchPrefix(mediaSection, 'a=ssrc:') 1219 | .map(function(line) { 1220 | return SDPUtils.parseSsrcMedia(line); 1221 | }) 1222 | .filter(function(obj) { 1223 | return obj.attribute === 'cname'; 1224 | })[0]; 1225 | if (remoteSsrc) { 1226 | recvSsrc = parseInt(remoteSsrc.ssrc, 10); 1227 | cname = remoteSsrc.value; 1228 | } 1229 | 1230 | if (description.type === 'offer') { 1231 | var transports = self._createIceAndDtlsTransports(mid, sdpMLineIndex); 1232 | 1233 | localCapabilities = RTCRtpReceiver.getCapabilities(kind); 1234 | sendSsrc = (2 * sdpMLineIndex + 2) * 1001; 1235 | 1236 | rtpReceiver = new RTCRtpReceiver(transports.dtlsTransport, kind); 1237 | 1238 | // FIXME: not correct when there are multiple streams but that is 1239 | // not currently supported in this shim. 1240 | stream.addTrack(rtpReceiver.track); 1241 | 1242 | // FIXME: look at direction. 
1243 | if (self.localStreams.length > 0 && 1244 | self.localStreams[0].getTracks().length >= sdpMLineIndex) { 1245 | // FIXME: actually more complicated, needs to match types etc 1246 | var localtrack = self.localStreams[0].getTracks()[sdpMLineIndex]; 1247 | rtpSender = new RTCRtpSender(localtrack, transports.dtlsTransport); 1248 | } 1249 | 1250 | self.transceivers[sdpMLineIndex] = { 1251 | iceGatherer: transports.iceGatherer, 1252 | iceTransport: transports.iceTransport, 1253 | dtlsTransport: transports.dtlsTransport, 1254 | localCapabilities: localCapabilities, 1255 | remoteCapabilities: remoteCapabilities, 1256 | rtpSender: rtpSender, 1257 | rtpReceiver: rtpReceiver, 1258 | kind: kind, 1259 | mid: mid, 1260 | cname: cname, 1261 | sendSsrc: sendSsrc, 1262 | recvSsrc: recvSsrc 1263 | }; 1264 | // Start the RTCRtpReceiver now. The RTPSender is started in setLocalDescription. 1265 | self._transceive(self.transceivers[sdpMLineIndex], 1266 | false, 1267 | direction === 'sendrecv' || direction === 'sendonly'); 1268 | } else if (description.type === 'answer' && !rejected) { 1269 | transceiver = self.transceivers[sdpMLineIndex]; 1270 | iceGatherer = transceiver.iceGatherer; 1271 | iceTransport = transceiver.iceTransport; 1272 | dtlsTransport = transceiver.dtlsTransport; 1273 | rtpSender = transceiver.rtpSender; 1274 | rtpReceiver = transceiver.rtpReceiver; 1275 | sendSsrc = transceiver.sendSsrc; 1276 | //recvSsrc = transceiver.recvSsrc; 1277 | localCapabilities = transceiver.localCapabilities; 1278 | 1279 | self.transceivers[sdpMLineIndex].recvSsrc = recvSsrc; 1280 | self.transceivers[sdpMLineIndex].remoteCapabilities = 1281 | remoteCapabilities; 1282 | self.transceivers[sdpMLineIndex].cname = cname; 1283 | 1284 | iceTransport.start(iceGatherer, remoteIceParameters, 'controlling'); 1285 | dtlsTransport.start(remoteDtlsParameters); 1286 | 1287 | self._transceive(transceiver, 1288 | direction === 'sendrecv' || direction === 'recvonly', 1289 | direction === 'sendrecv' || direction === 'sendonly'); 1290 | 1291 | if (rtpReceiver && 1292 | (direction === 'sendrecv' || direction === 'sendonly')) { 1293 | stream.addTrack(rtpReceiver.track); 1294 | } else { 1295 | // FIXME: actually the receiver should be created later. 
1296 | delete transceiver.rtpReceiver;
1297 | }
1298 | }
1299 | });
1300 |
1301 | this.remoteDescription = description;
1302 | switch (description.type) {
1303 | case 'offer':
1304 | this._updateSignalingState('have-remote-offer');
1305 | break;
1306 | case 'answer':
1307 | this._updateSignalingState('stable');
1308 | break;
1309 | default:
1310 | throw new TypeError('unsupported type "' + description.type + '"');
1311 | }
1312 | window.setTimeout(function() {
1313 | if (self.onaddstream !== null && stream.getTracks().length) {
1314 | self.remoteStreams.push(stream);
1315 | window.setTimeout(function() {
1316 | self.onaddstream({stream: stream});
1317 | }, 0);
1318 | }
1319 | }, 0);
1320 | if (arguments.length > 1 && typeof arguments[1] === 'function') {
1321 | window.setTimeout(arguments[1], 0);
1322 | }
1323 | return Promise.resolve();
1324 | };
1325 |
1326 | window.RTCPeerConnection.prototype.close = function() {
1327 | this.transceivers.forEach(function(transceiver) {
1328 | /* not yet
1329 | if (transceiver.iceGatherer) {
1330 | transceiver.iceGatherer.close();
1331 | }
1332 | */
1333 | if (transceiver.iceTransport) {
1334 | transceiver.iceTransport.stop();
1335 | }
1336 | if (transceiver.dtlsTransport) {
1337 | transceiver.dtlsTransport.stop();
1338 | }
1339 | if (transceiver.rtpSender) {
1340 | transceiver.rtpSender.stop();
1341 | }
1342 | if (transceiver.rtpReceiver) {
1343 | transceiver.rtpReceiver.stop();
1344 | }
1345 | });
1346 | // FIXME: clean up tracks, local streams, remote streams, etc
1347 | this._updateSignalingState('closed');
1348 | };
1349 |
1350 | // Update the signaling state.
1351 | window.RTCPeerConnection.prototype._updateSignalingState =
1352 | function(newState) {
1353 | this.signalingState = newState;
1354 | if (this.onsignalingstatechange !== null) {
1355 | this.onsignalingstatechange();
1356 | }
1357 | };
1358 |
1359 | // Determine whether to fire the negotiationneeded event.
1360 | window.RTCPeerConnection.prototype._maybeFireNegotiationNeeded =
1361 | function() {
1362 | // Fire away (for now).
1363 | if (this.onnegotiationneeded !== null) {
1364 | this.onnegotiationneeded();
1365 | }
1366 | };
1367 |
1368 | // Update the connection state.
1369 | window.RTCPeerConnection.prototype._updateConnectionState =
1370 | function() {
1371 | var self = this;
1372 | var newState;
1373 | var states = {
1374 | 'new': 0,
1375 | closed: 0,
1376 | connecting: 0,
1377 | checking: 0,
1378 | connected: 0,
1379 | completed: 0,
1380 | failed: 0, disconnected: 0 // 'disconnected' was missing from this tally, so the check below could never see it
1381 | };
1382 | this.transceivers.forEach(function(transceiver) {
1383 | states[transceiver.iceTransport.state]++;
1384 | states[transceiver.dtlsTransport.state]++;
1385 | });
1386 | // ICETransport.completed and connected are the same for this purpose.
1387 | states['connected'] += states['completed'];
1388 |
1389 | newState = 'new';
1390 | if (states['failed'] > 0) {
1391 | newState = 'failed';
1392 | } else if (states['connecting'] > 0 || states['checking'] > 0) {
1393 | newState = 'connecting';
1394 | } else if (states['disconnected'] > 0) {
1395 | newState = 'disconnected';
1396 | } else if (states['new'] > 0) {
1397 | newState = 'new';
1398 | } else if (states['connected'] > 0) { // 'completed' was already folded into 'connected' above
1399 | newState = 'connected';
1400 | }
1401 |
1402 | if (newState !== self.iceConnectionState) {
1403 | self.iceConnectionState = newState;
1404 | if (this.oniceconnectionstatechange !== null) {
1405 | this.oniceconnectionstatechange();
1406 | }
1407 | }
1408 | };
1409 |
1410 | window.RTCPeerConnection.prototype.createOffer = function() {
1411 | var self = this;
1412 | if (this._pendingOffer) {
1413 | throw new Error('createOffer called while there is a pending offer.');
1414 | }
1415 | var offerOptions;
1416 | if (arguments.length === 1 && typeof arguments[0] !== 'function') {
1417 | offerOptions = arguments[0];
1418 | } else if (arguments.length === 3) {
1419 | offerOptions = arguments[2];
1420 | }
1421 |
1422 | var tracks = [];
1423 | var numAudioTracks = 0;
1424 | var numVideoTracks = 0;
1425 | // Default to sendrecv.
1426 | if (this.localStreams.length) {
1427 | numAudioTracks = this.localStreams[0].getAudioTracks().length;
1428 | numVideoTracks = this.localStreams[0].getVideoTracks().length;
1429 | }
1430 | // Determine number of audio and video tracks we need to send/recv.
1431 | if (offerOptions) {
1432 | // Reject Chrome legacy constraints.
1433 | if (offerOptions.mandatory || offerOptions.optional) {
1434 | throw new TypeError(
1435 | 'Legacy mandatory/optional constraints not supported.');
1436 | }
1437 | if (offerOptions.offerToReceiveAudio !== undefined) {
1438 | numAudioTracks = offerOptions.offerToReceiveAudio;
1439 | }
1440 | if (offerOptions.offerToReceiveVideo !== undefined) {
1441 | numVideoTracks = offerOptions.offerToReceiveVideo;
1442 | }
1443 | }
1444 | if (this.localStreams.length) {
1445 | // Push local streams.
1446 | this.localStreams[0].getTracks().forEach(function(track) {
1447 | tracks.push({
1448 | kind: track.kind,
1449 | track: track,
1450 | wantReceive: track.kind === 'audio' ?
1451 | numAudioTracks > 0 : numVideoTracks > 0
1452 | });
1453 | if (track.kind === 'audio') {
1454 | numAudioTracks--;
1455 | } else if (track.kind === 'video') {
1456 | numVideoTracks--;
1457 | }
1458 | });
1459 | }
1460 | // Create M-lines for recvonly streams.
1461 | while (numAudioTracks > 0 || numVideoTracks > 0) {
1462 | if (numAudioTracks > 0) {
1463 | tracks.push({
1464 | kind: 'audio',
1465 | wantReceive: true
1466 | });
1467 | numAudioTracks--;
1468 | }
1469 | if (numVideoTracks > 0) {
1470 | tracks.push({
1471 | kind: 'video',
1472 | wantReceive: true
1473 | });
1474 | numVideoTracks--;
1475 | }
1476 | }
1477 |
1478 | var sdp = SDPUtils.writeSessionBoilerplate();
1479 | var transceivers = [];
1480 | tracks.forEach(function(mline, sdpMLineIndex) {
1481 | // For each track, create an ice gatherer, ice transport, dtls transport,
1482 | // potentially rtpsender and rtpreceiver.
1483 | var track = mline.track; 1484 | var kind = mline.kind; 1485 | var mid = SDPUtils.generateIdentifier(); 1486 | 1487 | var transports = self._createIceAndDtlsTransports(mid, sdpMLineIndex); 1488 | 1489 | var localCapabilities = RTCRtpSender.getCapabilities(kind); 1490 | var rtpSender; 1491 | var rtpReceiver; 1492 | 1493 | // generate an ssrc now, to be used later in rtpSender.send 1494 | var sendSsrc = (2 * sdpMLineIndex + 1) * 1001; 1495 | if (track) { 1496 | rtpSender = new RTCRtpSender(track, transports.dtlsTransport); 1497 | } 1498 | 1499 | if (mline.wantReceive) { 1500 | rtpReceiver = new RTCRtpReceiver(transports.dtlsTransport, kind); 1501 | } 1502 | 1503 | transceivers[sdpMLineIndex] = { 1504 | iceGatherer: transports.iceGatherer, 1505 | iceTransport: transports.iceTransport, 1506 | dtlsTransport: transports.dtlsTransport, 1507 | localCapabilities: localCapabilities, 1508 | remoteCapabilities: null, 1509 | rtpSender: rtpSender, 1510 | rtpReceiver: rtpReceiver, 1511 | kind: kind, 1512 | mid: mid, 1513 | sendSsrc: sendSsrc, 1514 | recvSsrc: null 1515 | }; 1516 | var transceiver = transceivers[sdpMLineIndex]; 1517 | sdp += SDPUtils.writeMediaSection(transceiver, 1518 | transceiver.localCapabilities, 'offer', self.localStreams[0]); 1519 | }); 1520 | 1521 | this._pendingOffer = transceivers; 1522 | var desc = new RTCSessionDescription({ 1523 | type: 'offer', 1524 | sdp: sdp 1525 | }); 1526 | if (arguments.length && typeof arguments[0] === 'function') { 1527 | window.setTimeout(arguments[0], 0, desc); 1528 | } 1529 | return Promise.resolve(desc); 1530 | }; 1531 | 1532 | window.RTCPeerConnection.prototype.createAnswer = function() { 1533 | var self = this; 1534 | var answerOptions; 1535 | if (arguments.length === 1 && typeof arguments[0] !== 'function') { 1536 | answerOptions = arguments[0]; 1537 | } else if (arguments.length === 3) { 1538 | answerOptions = arguments[2]; 1539 | } 1540 | 1541 | var sdp = SDPUtils.writeSessionBoilerplate(); 1542 | this.transceivers.forEach(function(transceiver) { 1543 | // Calculate intersection of capabilities. 1544 | var commonCapabilities = self._getCommonCapabilities( 1545 | transceiver.localCapabilities, 1546 | transceiver.remoteCapabilities); 1547 | 1548 | sdp += SDPUtils.writeMediaSection(transceiver, commonCapabilities, 1549 | 'answer', self.localStreams[0]); 1550 | }); 1551 | 1552 | var desc = new RTCSessionDescription({ 1553 | type: 'answer', 1554 | sdp: sdp 1555 | }); 1556 | if (arguments.length && typeof arguments[0] === 'function') { 1557 | window.setTimeout(arguments[0], 0, desc); 1558 | } 1559 | return Promise.resolve(desc); 1560 | }; 1561 | 1562 | window.RTCPeerConnection.prototype.addIceCandidate = function(candidate) { 1563 | var mLineIndex = candidate.sdpMLineIndex; 1564 | if (candidate.sdpMid) { 1565 | for (var i = 0; i < this.transceivers.length; i++) { 1566 | if (this.transceivers[i].mid === candidate.sdpMid) { 1567 | mLineIndex = i; 1568 | break; 1569 | } 1570 | } 1571 | } 1572 | var transceiver = this.transceivers[mLineIndex]; 1573 | if (transceiver) { 1574 | var cand = Object.keys(candidate.candidate).length > 0 ? 1575 | SDPUtils.parseCandidate(candidate.candidate) : {}; 1576 | // Ignore Chrome's invalid candidates since Edge does not like them. 1577 | if (cand.protocol === 'tcp' && cand.port === 0) { 1578 | return; 1579 | } 1580 | // Ignore RTCP candidates, we assume RTCP-MUX. 1581 | if (cand.component !== '1') { 1582 | return; 1583 | } 1584 | // A dirty hack to make samples work. 
1585 | if (cand.type === 'endOfCandidates') { 1586 | cand = {}; 1587 | } 1588 | transceiver.iceTransport.addRemoteCandidate(cand); 1589 | } 1590 | if (arguments.length > 1 && typeof arguments[1] === 'function') { 1591 | window.setTimeout(arguments[1], 0); 1592 | } 1593 | return Promise.resolve(); 1594 | }; 1595 | 1596 | window.RTCPeerConnection.prototype.getStats = function() { 1597 | var promises = []; 1598 | this.transceivers.forEach(function(transceiver) { 1599 | ['rtpSender', 'rtpReceiver', 'iceGatherer', 'iceTransport', 1600 | 'dtlsTransport'].forEach(function(method) { 1601 | if (transceiver[method]) { 1602 | promises.push(transceiver[method].getStats()); 1603 | } 1604 | }); 1605 | }); 1606 | var cb = arguments.length > 1 && typeof arguments[1] === 'function' && 1607 | arguments[1]; 1608 | return new Promise(function(resolve) { 1609 | var results = {}; 1610 | Promise.all(promises).then(function(res) { 1611 | res.forEach(function(result) { 1612 | Object.keys(result).forEach(function(id) { 1613 | results[id] = result[id]; 1614 | }); 1615 | }); 1616 | if (cb) { 1617 | window.setTimeout(cb, 0, results); 1618 | } 1619 | resolve(results); 1620 | }); 1621 | }); 1622 | }; 1623 | }, 1624 | 1625 | // Attach a media stream to an element. 1626 | attachMediaStream: function(element, stream) { 1627 | logging('DEPRECATED, attachMediaStream will soon be removed.'); 1628 | element.srcObject = stream; 1629 | }, 1630 | 1631 | reattachMediaStream: function(to, from) { 1632 | logging('DEPRECATED, reattachMediaStream will soon be removed.'); 1633 | to.srcObject = from.srcObject; 1634 | } 1635 | } 1636 | 1637 | // Expose public methods. 1638 | module.exports = { 1639 | shimPeerConnection: edgeShim.shimPeerConnection, 1640 | attachMediaStream: edgeShim.attachMediaStream, 1641 | reattachMediaStream: edgeShim.reattachMediaStream 1642 | } 1643 | 1644 | 1645 | },{"../utils":6,"./edge_sdp":3}],5:[function(require,module,exports){ 1646 | /* 1647 | * Copyright (c) 2016 The WebRTC project authors. All Rights Reserved. 1648 | * 1649 | * Use of this source code is governed by a BSD-style license 1650 | * that can be found in the LICENSE file in the root of the source 1651 | * tree. 1652 | */ 1653 | 'use strict'; 1654 | 1655 | var logging = require('../utils').log; 1656 | var browserDetails = require('../utils').browserDetails; 1657 | 1658 | var firefoxShim = { 1659 | shimOnTrack: function() { 1660 | if (typeof window === 'object' && window.RTCPeerConnection && !('ontrack' in 1661 | window.RTCPeerConnection.prototype)) { 1662 | Object.defineProperty(window.RTCPeerConnection.prototype, 'ontrack', { 1663 | get: function() { return this._ontrack; }, 1664 | set: function(f) { 1665 | var self = this; 1666 | if (this._ontrack) { 1667 | this.removeEventListener('track', this._ontrack); 1668 | this.removeEventListener('addstream', this._ontrackpoly); 1669 | } 1670 | this.addEventListener('track', this._ontrack = f); 1671 | this.addEventListener('addstream', this._ontrackpoly = function(e) { 1672 | e.stream.getTracks().forEach(function(track) { 1673 | var event = new Event('track'); 1674 | event.track = track; 1675 | event.receiver = {track: track}; 1676 | event.streams = [e.stream]; 1677 | this.dispatchEvent(event); 1678 | }.bind(this)); 1679 | }.bind(this)); 1680 | } 1681 | }); 1682 | } 1683 | }, 1684 | 1685 | shimSourceObject: function() { 1686 | // Firefox has supported mozSrcObject since FF22, unprefixed in 42. 
1687 | if (typeof window === 'object') { 1688 | if (window.HTMLMediaElement && 1689 | !('srcObject' in window.HTMLMediaElement.prototype)) { 1690 | // Shim the srcObject property, once, when HTMLMediaElement is found. 1691 | Object.defineProperty(window.HTMLMediaElement.prototype, 'srcObject', { 1692 | get: function() { 1693 | return this.mozSrcObject; 1694 | }, 1695 | set: function(stream) { 1696 | this.mozSrcObject = stream; 1697 | } 1698 | }); 1699 | } 1700 | } 1701 | }, 1702 | 1703 | shimPeerConnection: function() { 1704 | // The RTCPeerConnection object. 1705 | if (!window.RTCPeerConnection) { 1706 | window.RTCPeerConnection = function(pcConfig, pcConstraints) { 1707 | if (browserDetails.version < 38) { 1708 | // .urls is not supported in FF < 38. 1709 | // create RTCIceServers with a single url. 1710 | if (pcConfig && pcConfig.iceServers) { 1711 | var newIceServers = []; 1712 | for (var i = 0; i < pcConfig.iceServers.length; i++) { 1713 | var server = pcConfig.iceServers[i]; 1714 | if (server.hasOwnProperty('urls')) { 1715 | for (var j = 0; j < server.urls.length; j++) { 1716 | var newServer = { 1717 | url: server.urls[j] 1718 | }; 1719 | if (server.urls[j].indexOf('turn') === 0) { 1720 | newServer.username = server.username; 1721 | newServer.credential = server.credential; 1722 | } 1723 | newIceServers.push(newServer); 1724 | } 1725 | } else { 1726 | newIceServers.push(pcConfig.iceServers[i]); 1727 | } 1728 | } 1729 | pcConfig.iceServers = newIceServers; 1730 | } 1731 | } 1732 | return new mozRTCPeerConnection(pcConfig, pcConstraints); // jscs:ignore requireCapitalizedConstructors 1733 | }; 1734 | window.RTCPeerConnection.prototype = mozRTCPeerConnection.prototype; 1735 | 1736 | // wrap static methods. Currently just generateCertificate. 1737 | if (mozRTCPeerConnection.generateCertificate) { 1738 | Object.defineProperty(window.RTCPeerConnection, 'generateCertificate', { 1739 | get: function() { 1740 | if (arguments.length) { 1741 | return mozRTCPeerConnection.generateCertificate.apply(null, 1742 | arguments); 1743 | } else { 1744 | return mozRTCPeerConnection.generateCertificate; 1745 | } 1746 | } 1747 | }); 1748 | } 1749 | 1750 | window.RTCSessionDescription = mozRTCSessionDescription; 1751 | window.RTCIceCandidate = mozRTCIceCandidate; 1752 | } 1753 | }, 1754 | 1755 | shimGetUserMedia: function() { 1756 | // getUserMedia constraints shim. 1757 | var getUserMedia_ = function(constraints, onSuccess, onError) { 1758 | var constraintsToFF37_ = function(c) { 1759 | if (typeof c !== 'object' || c.require) { 1760 | return c; 1761 | } 1762 | var require = []; 1763 | Object.keys(c).forEach(function(key) { 1764 | if (key === 'require' || key === 'advanced' || key === 'mediaSource') { 1765 | return; 1766 | } 1767 | var r = c[key] = (typeof c[key] === 'object') ? 1768 | c[key] : {ideal: c[key]}; 1769 | if (r.min !== undefined || 1770 | r.max !== undefined || r.exact !== undefined) { 1771 | require.push(key); 1772 | } 1773 | if (r.exact !== undefined) { 1774 | if (typeof r.exact === 'number') { 1775 | r. 
min = r.max = r.exact; 1776 | } else { 1777 | c[key] = r.exact; 1778 | } 1779 | delete r.exact; 1780 | } 1781 | if (r.ideal !== undefined) { 1782 | c.advanced = c.advanced || []; 1783 | var oc = {}; 1784 | if (typeof r.ideal === 'number') { 1785 | oc[key] = {min: r.ideal, max: r.ideal}; 1786 | } else { 1787 | oc[key] = r.ideal; 1788 | } 1789 | c.advanced.push(oc); 1790 | delete r.ideal; 1791 | if (!Object.keys(r).length) { 1792 | delete c[key]; 1793 | } 1794 | } 1795 | }); 1796 | if (require.length) { 1797 | c.require = require; 1798 | } 1799 | return c; 1800 | }; 1801 | if (browserDetails.version < 38) { 1802 | logging('spec: ' + JSON.stringify(constraints)); 1803 | if (constraints.audio) { 1804 | constraints.audio = constraintsToFF37_(constraints.audio); 1805 | } 1806 | if (constraints.video) { 1807 | constraints.video = constraintsToFF37_(constraints.video); 1808 | } 1809 | logging('ff37: ' + JSON.stringify(constraints)); 1810 | } 1811 | return navigator.mozGetUserMedia(constraints, onSuccess, onError); 1812 | }; 1813 | 1814 | navigator.getUserMedia = getUserMedia_; 1815 | 1816 | // Returns the result of getUserMedia as a Promise. 1817 | var getUserMediaPromise_ = function(constraints) { 1818 | return new Promise(function(resolve, reject) { 1819 | navigator.getUserMedia(constraints, resolve, reject); 1820 | }); 1821 | } 1822 | 1823 | // Shim for mediaDevices on older versions. 1824 | if (!navigator.mediaDevices) { 1825 | navigator.mediaDevices = {getUserMedia: getUserMediaPromise_, 1826 | addEventListener: function() { }, 1827 | removeEventListener: function() { } 1828 | }; 1829 | } 1830 | navigator.mediaDevices.enumerateDevices = 1831 | navigator.mediaDevices.enumerateDevices || function() { 1832 | return new Promise(function(resolve) { 1833 | var infos = [ 1834 | {kind: 'audioinput', deviceId: 'default', label: '', groupId: ''}, 1835 | {kind: 'videoinput', deviceId: 'default', label: '', groupId: ''} 1836 | ]; 1837 | resolve(infos); 1838 | }); 1839 | }; 1840 | 1841 | if (browserDetails.version < 41) { 1842 | // Work around http://bugzil.la/1169665 1843 | var orgEnumerateDevices = 1844 | navigator.mediaDevices.enumerateDevices.bind(navigator.mediaDevices); 1845 | navigator.mediaDevices.enumerateDevices = function() { 1846 | return orgEnumerateDevices().then(undefined, function(e) { 1847 | if (e.name === 'NotFoundError') { 1848 | return []; 1849 | } 1850 | throw e; 1851 | }); 1852 | }; 1853 | } 1854 | }, 1855 | 1856 | // Attach a media stream to an element. 1857 | attachMediaStream: function(element, stream) { 1858 | logging('DEPRECATED, attachMediaStream will soon be removed.'); 1859 | element.srcObject = stream; 1860 | }, 1861 | 1862 | reattachMediaStream: function(to, from) { 1863 | logging('DEPRECATED, reattachMediaStream will soon be removed.'); 1864 | to.srcObject = from.srcObject; 1865 | } 1866 | } 1867 | 1868 | // Expose public methods. 1869 | module.exports = { 1870 | shimOnTrack: firefoxShim.shimOnTrack, 1871 | shimSourceObject: firefoxShim.shimSourceObject, 1872 | shimPeerConnection: firefoxShim.shimPeerConnection, 1873 | shimGetUserMedia: firefoxShim.shimGetUserMedia, 1874 | attachMediaStream: firefoxShim.attachMediaStream, 1875 | reattachMediaStream: firefoxShim.reattachMediaStream 1876 | } 1877 | 1878 | },{"../utils":6}],6:[function(require,module,exports){ 1879 | /* 1880 | * Copyright (c) 2016 The WebRTC project authors. All Rights Reserved. 
1881 | * 1882 | * Use of this source code is governed by a BSD-style license 1883 | * that can be found in the LICENSE file in the root of the source 1884 | * tree. 1885 | */ 1886 | 'use strict'; 1887 | 1888 | var logDisabled_ = false; 1889 | 1890 | // Utility methods. 1891 | var utils = { 1892 | disableLog: function(bool) { 1893 | if (typeof bool !== 'boolean') { 1894 | return new Error('Argument type: ' + typeof bool + 1895 | '. Please use a boolean.'); 1896 | } 1897 | logDisabled_ = bool; 1898 | return (bool) ? 'adapter.js logging disabled' : 1899 | 'adapter.js logging enabled'; 1900 | }, 1901 | 1902 | log: function() { 1903 | if (typeof window === 'object') { 1904 | if (logDisabled_) { 1905 | return; 1906 | } 1907 | console.log.apply(console, arguments); 1908 | } 1909 | }, 1910 | 1911 | /** 1912 | * Extract browser version out of the provided user agent string. 1913 | * @param {!string} uastring userAgent string. 1914 | * @param {!string} expr Regular expression used as match criteria. 1915 | * @param {!number} pos position in the version string to be returned. 1916 | * @return {!number} browser version. 1917 | */ 1918 | extractVersion: function(uastring, expr, pos) { 1919 | var match = uastring.match(expr); 1920 | return match && match.length >= pos && parseInt(match[pos], 10); 1921 | }, 1922 | 1923 | /** 1924 | * Browser detector. 1925 | * @return {object} result containing browser, version and minVersion 1926 | * properties. 1927 | */ 1928 | detectBrowser: function() { 1929 | // Returned result object. 1930 | var result = {}; 1931 | result.browser = null; 1932 | result.version = null; 1933 | result.minVersion = null; 1934 | 1935 | // Non supported browser. 1936 | if (typeof window === 'undefined' || !window.navigator) { 1937 | result.browser = 'Not a supported browser.'; 1938 | return result; 1939 | } 1940 | 1941 | // Firefox. 1942 | if (navigator.mozGetUserMedia) { 1943 | result.browser = 'firefox'; 1944 | result.version = this.extractVersion(navigator.userAgent, 1945 | /Firefox\/([0-9]+)\./, 1); 1946 | result.minVersion = 31; 1947 | return result; 1948 | } 1949 | 1950 | // Chrome/Chromium/Webview. 1951 | if (navigator.webkitGetUserMedia && window.webkitRTCPeerConnection) { 1952 | result.browser = 'chrome'; 1953 | result.version = this.extractVersion(navigator.userAgent, 1954 | /Chrom(e|ium)\/([0-9]+)\./, 2); 1955 | result.minVersion = 38; 1956 | return result; 1957 | } 1958 | 1959 | // Edge. 1960 | if (navigator.mediaDevices && 1961 | navigator.userAgent.match(/Edge\/(\d+).(\d+)$/)) { 1962 | result.browser = 'edge'; 1963 | result.version = this.extractVersion(navigator.userAgent, 1964 | /Edge\/(\d+).(\d+)$/, 2); 1965 | result.minVersion = 10547; 1966 | return result; 1967 | } 1968 | 1969 | // Non supported browser default. 1970 | result.browser = 'Not a supported browser.'; 1971 | return result; 1972 | } 1973 | }; 1974 | 1975 | // Export. 1976 | module.exports = { 1977 | log: utils.log, 1978 | disableLog: utils.disableLog, 1979 | browserDetails: utils.detectBrowser(), 1980 | extractVersion: utils.extractVersion 1981 | }; 1982 | 1983 | },{}]},{},[1])(1) 1984 | }); -------------------------------------------------------------------------------- /css/demos.css: -------------------------------------------------------------------------------- 1 | /* 2 | * demos.css - custom css for the speech recognition demos. 
3 | */ 4 | 5 | .playDeleteButton { 6 | margin: 5px; 7 | } 8 | 9 | /* 10 | * Based on snippet from: 11 | * https://github.com/skeleton-framework/skeleton-framework 12 | */ 13 | .button.button-primary:disabled, 14 | button.button-primary:disabled, 15 | input[type="submit"].button-primary:disabled, 16 | input[type="reset"].button-primary:disabled, 17 | input[type="button"].button-primary:disabled { 18 | color: #FFF; 19 | background-color: #BBDDE9; 20 | border-color: #BBDDE9; 21 | } -------------------------------------------------------------------------------- /css/normalize.css: -------------------------------------------------------------------------------- 1 | /*! normalize.css v3.0.2 | MIT License | git.io/normalize */ 2 | 3 | /** 4 | * 1. Set default font family to sans-serif. 5 | * 2. Prevent iOS text size adjust after orientation change, without disabling 6 | * user zoom. 7 | */ 8 | 9 | html { 10 | font-family: sans-serif; /* 1 */ 11 | -ms-text-size-adjust: 100%; /* 2 */ 12 | -webkit-text-size-adjust: 100%; /* 2 */ 13 | } 14 | 15 | /** 16 | * Remove default margin. 17 | */ 18 | 19 | body { 20 | margin: 0; 21 | } 22 | 23 | /* HTML5 display definitions 24 | ========================================================================== */ 25 | 26 | /** 27 | * Correct `block` display not defined for any HTML5 element in IE 8/9. 28 | * Correct `block` display not defined for `details` or `summary` in IE 10/11 29 | * and Firefox. 30 | * Correct `block` display not defined for `main` in IE 11. 31 | */ 32 | 33 | article, 34 | aside, 35 | details, 36 | figcaption, 37 | figure, 38 | footer, 39 | header, 40 | hgroup, 41 | main, 42 | menu, 43 | nav, 44 | section, 45 | summary { 46 | display: block; 47 | } 48 | 49 | /** 50 | * 1. Correct `inline-block` display not defined in IE 8/9. 51 | * 2. Normalize vertical alignment of `progress` in Chrome, Firefox, and Opera. 52 | */ 53 | 54 | audio, 55 | canvas, 56 | progress, 57 | video { 58 | display: inline-block; /* 1 */ 59 | vertical-align: baseline; /* 2 */ 60 | } 61 | 62 | /** 63 | * Prevent modern browsers from displaying `audio` without controls. 64 | * Remove excess height in iOS 5 devices. 65 | */ 66 | 67 | audio:not([controls]) { 68 | display: none; 69 | height: 0; 70 | } 71 | 72 | /** 73 | * Address `[hidden]` styling not present in IE 8/9/10. 74 | * Hide the `template` element in IE 8/9/11, Safari, and Firefox < 22. 75 | */ 76 | 77 | [hidden], 78 | template { 79 | display: none; 80 | } 81 | 82 | /* Links 83 | ========================================================================== */ 84 | 85 | /** 86 | * Remove the gray background color from active links in IE 10. 87 | */ 88 | 89 | a { 90 | background-color: transparent; 91 | } 92 | 93 | /** 94 | * Improve readability when focused and also mouse hovered in all browsers. 95 | */ 96 | 97 | a:active, 98 | a:hover { 99 | outline: 0; 100 | } 101 | 102 | /* Text-level semantics 103 | ========================================================================== */ 104 | 105 | /** 106 | * Address styling not present in IE 8/9/10/11, Safari, and Chrome. 107 | */ 108 | 109 | abbr[title] { 110 | border-bottom: 1px dotted; 111 | } 112 | 113 | /** 114 | * Address style set to `bolder` in Firefox 4+, Safari, and Chrome. 115 | */ 116 | 117 | b, 118 | strong { 119 | font-weight: bold; 120 | } 121 | 122 | /** 123 | * Address styling not present in Safari and Chrome. 
124 | */ 125 | 126 | dfn { 127 | font-style: italic; 128 | } 129 | 130 | /** 131 | * Address variable `h1` font-size and margin within `section` and `article` 132 | * contexts in Firefox 4+, Safari, and Chrome. 133 | */ 134 | 135 | h1 { 136 | font-size: 2em; 137 | margin: 0.67em 0; 138 | } 139 | 140 | /** 141 | * Address styling not present in IE 8/9. 142 | */ 143 | 144 | mark { 145 | background: #ff0; 146 | color: #000; 147 | } 148 | 149 | /** 150 | * Address inconsistent and variable font size in all browsers. 151 | */ 152 | 153 | small { 154 | font-size: 80%; 155 | } 156 | 157 | /** 158 | * Prevent `sub` and `sup` affecting `line-height` in all browsers. 159 | */ 160 | 161 | sub, 162 | sup { 163 | font-size: 75%; 164 | line-height: 0; 165 | position: relative; 166 | vertical-align: baseline; 167 | } 168 | 169 | sup { 170 | top: -0.5em; 171 | } 172 | 173 | sub { 174 | bottom: -0.25em; 175 | } 176 | 177 | /* Embedded content 178 | ========================================================================== */ 179 | 180 | /** 181 | * Remove border when inside `a` element in IE 8/9/10. 182 | */ 183 | 184 | img { 185 | border: 0; 186 | } 187 | 188 | /** 189 | * Correct overflow not hidden in IE 9/10/11. 190 | */ 191 | 192 | svg:not(:root) { 193 | overflow: hidden; 194 | } 195 | 196 | /* Grouping content 197 | ========================================================================== */ 198 | 199 | /** 200 | * Address margin not present in IE 8/9 and Safari. 201 | */ 202 | 203 | figure { 204 | margin: 1em 40px; 205 | } 206 | 207 | /** 208 | * Address differences between Firefox and other browsers. 209 | */ 210 | 211 | hr { 212 | -moz-box-sizing: content-box; 213 | box-sizing: content-box; 214 | height: 0; 215 | } 216 | 217 | /** 218 | * Contain overflow in all browsers. 219 | */ 220 | 221 | pre { 222 | overflow: auto; 223 | } 224 | 225 | /** 226 | * Address odd `em`-unit font size rendering in all browsers. 227 | */ 228 | 229 | code, 230 | kbd, 231 | pre, 232 | samp { 233 | font-family: monospace, monospace; 234 | font-size: 1em; 235 | } 236 | 237 | /* Forms 238 | ========================================================================== */ 239 | 240 | /** 241 | * Known limitation: by default, Chrome and Safari on OS X allow very limited 242 | * styling of `select`, unless a `border` property is set. 243 | */ 244 | 245 | /** 246 | * 1. Correct color not being inherited. 247 | * Known issue: affects color of disabled elements. 248 | * 2. Correct font properties not being inherited. 249 | * 3. Address margins set differently in Firefox 4+, Safari, and Chrome. 250 | */ 251 | 252 | button, 253 | input, 254 | optgroup, 255 | select, 256 | textarea { 257 | color: inherit; /* 1 */ 258 | font: inherit; /* 2 */ 259 | margin: 0; /* 3 */ 260 | } 261 | 262 | /** 263 | * Address `overflow` set to `hidden` in IE 8/9/10/11. 264 | */ 265 | 266 | button { 267 | overflow: visible; 268 | } 269 | 270 | /** 271 | * Address inconsistent `text-transform` inheritance for `button` and `select`. 272 | * All other form control elements do not inherit `text-transform` values. 273 | * Correct `button` style inheritance in Firefox, IE 8/9/10/11, and Opera. 274 | * Correct `select` style inheritance in Firefox. 275 | */ 276 | 277 | button, 278 | select { 279 | text-transform: none; 280 | } 281 | 282 | /** 283 | * 1. Avoid the WebKit bug in Android 4.0.* where (2) destroys native `audio` 284 | * and `video` controls. 285 | * 2. Correct inability to style clickable `input` types in iOS. 286 | * 3. 
Improve usability and consistency of cursor style between image-type 287 | * `input` and others. 288 | */ 289 | 290 | button, 291 | html input[type="button"], /* 1 */ 292 | input[type="reset"], 293 | input[type="submit"] { 294 | -webkit-appearance: button; /* 2 */ 295 | cursor: pointer; /* 3 */ 296 | } 297 | 298 | /** 299 | * Re-set default cursor for disabled elements. 300 | */ 301 | 302 | button[disabled], 303 | html input[disabled] { 304 | cursor: default; 305 | } 306 | 307 | /** 308 | * Remove inner padding and border in Firefox 4+. 309 | */ 310 | 311 | button::-moz-focus-inner, 312 | input::-moz-focus-inner { 313 | border: 0; 314 | padding: 0; 315 | } 316 | 317 | /** 318 | * Address Firefox 4+ setting `line-height` on `input` using `!important` in 319 | * the UA stylesheet. 320 | */ 321 | 322 | input { 323 | line-height: normal; 324 | } 325 | 326 | /** 327 | * It's recommended that you don't attempt to style these elements. 328 | * Firefox's implementation doesn't respect box-sizing, padding, or width. 329 | * 330 | * 1. Address box sizing set to `content-box` in IE 8/9/10. 331 | * 2. Remove excess padding in IE 8/9/10. 332 | */ 333 | 334 | input[type="checkbox"], 335 | input[type="radio"] { 336 | box-sizing: border-box; /* 1 */ 337 | padding: 0; /* 2 */ 338 | } 339 | 340 | /** 341 | * Fix the cursor style for Chrome's increment/decrement buttons. For certain 342 | * `font-size` values of the `input`, it causes the cursor style of the 343 | * decrement button to change from `default` to `text`. 344 | */ 345 | 346 | input[type="number"]::-webkit-inner-spin-button, 347 | input[type="number"]::-webkit-outer-spin-button { 348 | height: auto; 349 | } 350 | 351 | /** 352 | * 1. Address `appearance` set to `searchfield` in Safari and Chrome. 353 | * 2. Address `box-sizing` set to `border-box` in Safari and Chrome 354 | * (include `-moz` to future-proof). 355 | */ 356 | 357 | input[type="search"] { 358 | -webkit-appearance: textfield; /* 1 */ 359 | -moz-box-sizing: content-box; 360 | -webkit-box-sizing: content-box; /* 2 */ 361 | box-sizing: content-box; 362 | } 363 | 364 | /** 365 | * Remove inner padding and search cancel button in Safari and Chrome on OS X. 366 | * Safari (but not Chrome) clips the cancel button when the search input has 367 | * padding (and `textfield` appearance). 368 | */ 369 | 370 | input[type="search"]::-webkit-search-cancel-button, 371 | input[type="search"]::-webkit-search-decoration { 372 | -webkit-appearance: none; 373 | } 374 | 375 | /** 376 | * Define consistent border, margin, and padding. 377 | */ 378 | 379 | fieldset { 380 | border: 1px solid #c0c0c0; 381 | margin: 0 2px; 382 | padding: 0.35em 0.625em 0.75em; 383 | } 384 | 385 | /** 386 | * 1. Correct `color` not being inherited in IE 8/9/10/11. 387 | * 2. Remove padding so people aren't caught out if they zero out fieldsets. 388 | */ 389 | 390 | legend { 391 | border: 0; /* 1 */ 392 | padding: 0; /* 2 */ 393 | } 394 | 395 | /** 396 | * Remove default vertical scrollbar in IE 8/9/10/11. 397 | */ 398 | 399 | textarea { 400 | overflow: auto; 401 | } 402 | 403 | /** 404 | * Don't inherit the `font-weight` (applied by a rule above). 405 | * NOTE: the default cannot safely be changed in Chrome and Safari on OS X. 406 | */ 407 | 408 | optgroup { 409 | font-weight: bold; 410 | } 411 | 412 | /* Tables 413 | ========================================================================== */ 414 | 415 | /** 416 | * Remove most spacing between table cells. 
417 | */ 418 | 419 | table { 420 | border-collapse: collapse; 421 | border-spacing: 0; 422 | } 423 | 424 | td, 425 | th { 426 | padding: 0; 427 | } -------------------------------------------------------------------------------- /css/skeleton.css: -------------------------------------------------------------------------------- 1 | /* 2 | * Skeleton V2.0.4 3 | * Copyright 2014, Dave Gamache 4 | * www.getskeleton.com 5 | * Free to use under the MIT license. 6 | * http://www.opensource.org/licenses/mit-license.php 7 | * 12/29/2014 8 | */ 9 | 10 | 11 | /* Table of contents 12 | –––––––––––––––––––––––––––––––––––––––––––––––––– 13 | - Grid 14 | - Base Styles 15 | - Typography 16 | - Links 17 | - Buttons 18 | - Forms 19 | - Lists 20 | - Code 21 | - Tables 22 | - Spacing 23 | - Utilities 24 | - Clearing 25 | - Media Queries 26 | */ 27 | 28 | 29 | /* Grid 30 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 31 | .container { 32 | position: relative; 33 | width: 100%; 34 | max-width: 960px; 35 | margin: 0 auto; 36 | padding: 0 20px; 37 | box-sizing: border-box; } 38 | .column, 39 | .columns { 40 | width: 100%; 41 | float: left; 42 | box-sizing: border-box; } 43 | 44 | /* For devices larger than 400px */ 45 | @media (min-width: 400px) { 46 | .container { 47 | width: 85%; 48 | padding: 0; } 49 | } 50 | 51 | /* For devices larger than 550px */ 52 | @media (min-width: 550px) { 53 | .container { 54 | width: 80%; } 55 | .column, 56 | .columns { 57 | margin-left: 4%; } 58 | .column:first-child, 59 | .columns:first-child { 60 | margin-left: 0; } 61 | 62 | .one.column, 63 | .one.columns { width: 4.66666666667%; } 64 | .two.columns { width: 13.3333333333%; } 65 | .three.columns { width: 22%; } 66 | .four.columns { width: 30.6666666667%; } 67 | .five.columns { width: 39.3333333333%; } 68 | .six.columns { width: 48%; } 69 | .seven.columns { width: 56.6666666667%; } 70 | .eight.columns { width: 65.3333333333%; } 71 | .nine.columns { width: 74.0%; } 72 | .ten.columns { width: 82.6666666667%; } 73 | .eleven.columns { width: 91.3333333333%; } 74 | .twelve.columns { width: 100%; margin-left: 0; } 75 | 76 | .one-third.column { width: 30.6666666667%; } 77 | .two-thirds.column { width: 65.3333333333%; } 78 | 79 | .one-half.column { width: 48%; } 80 | 81 | /* Offsets */ 82 | .offset-by-one.column, 83 | .offset-by-one.columns { margin-left: 8.66666666667%; } 84 | .offset-by-two.column, 85 | .offset-by-two.columns { margin-left: 17.3333333333%; } 86 | .offset-by-three.column, 87 | .offset-by-three.columns { margin-left: 26%; } 88 | .offset-by-four.column, 89 | .offset-by-four.columns { margin-left: 34.6666666667%; } 90 | .offset-by-five.column, 91 | .offset-by-five.columns { margin-left: 43.3333333333%; } 92 | .offset-by-six.column, 93 | .offset-by-six.columns { margin-left: 52%; } 94 | .offset-by-seven.column, 95 | .offset-by-seven.columns { margin-left: 60.6666666667%; } 96 | .offset-by-eight.column, 97 | .offset-by-eight.columns { margin-left: 69.3333333333%; } 98 | .offset-by-nine.column, 99 | .offset-by-nine.columns { margin-left: 78.0%; } 100 | .offset-by-ten.column, 101 | .offset-by-ten.columns { margin-left: 86.6666666667%; } 102 | .offset-by-eleven.column, 103 | .offset-by-eleven.columns { margin-left: 95.3333333333%; } 104 | 105 | .offset-by-one-third.column, 106 | .offset-by-one-third.columns { margin-left: 34.6666666667%; } 107 | .offset-by-two-thirds.column, 108 | .offset-by-two-thirds.columns { margin-left: 69.3333333333%; } 109 | 110 | .offset-by-one-half.column, 111 | .offset-by-one-half.columns { 
margin-left: 52%; } 112 | 113 | } 114 | 115 | 116 | /* Base Styles 117 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 118 | /* NOTE 119 | html is set to 62.5% so that all the REM measurements throughout Skeleton 120 | are based on 10px sizing. So basically 1.5rem = 15px :) */ 121 | html { 122 | font-size: 62.5%; } 123 | body { 124 | font-size: 1.5em; /* currently ems cause chrome bug misinterpreting rems on body element */ 125 | line-height: 1.6; 126 | font-weight: 400; 127 | font-family: "Raleway", "HelveticaNeue", "Helvetica Neue", Helvetica, Arial, sans-serif; 128 | color: #222; } 129 | 130 | 131 | /* Typography 132 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 133 | h1, h2, h3, h4, h5, h6 { 134 | margin-top: 0; 135 | margin-bottom: 2rem; 136 | font-weight: 300; } 137 | h1 { font-size: 4.0rem; line-height: 1.2; letter-spacing: -.1rem;} 138 | h2 { font-size: 3.6rem; line-height: 1.25; letter-spacing: -.1rem; } 139 | h3 { font-size: 3.0rem; line-height: 1.3; letter-spacing: -.1rem; } 140 | h4 { font-size: 2.4rem; line-height: 1.35; letter-spacing: -.08rem; } 141 | h5 { font-size: 1.8rem; line-height: 1.5; letter-spacing: -.05rem; } 142 | h6 { font-size: 1.5rem; line-height: 1.6; letter-spacing: 0; } 143 | 144 | /* Larger than phablet */ 145 | @media (min-width: 550px) { 146 | h1 { font-size: 5.0rem; } 147 | h2 { font-size: 4.2rem; } 148 | h3 { font-size: 3.6rem; } 149 | h4 { font-size: 3.0rem; } 150 | h5 { font-size: 2.4rem; } 151 | h6 { font-size: 1.5rem; } 152 | } 153 | 154 | p { 155 | margin-top: 0; } 156 | 157 | 158 | /* Links 159 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 160 | a { 161 | color: #1EAEDB; } 162 | a:hover { 163 | color: #0FA0CE; } 164 | 165 | 166 | /* Buttons 167 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 168 | .button, 169 | button, 170 | input[type="submit"], 171 | input[type="reset"], 172 | input[type="button"] { 173 | display: inline-block; 174 | height: 38px; 175 | padding: 0 30px; 176 | color: #555; 177 | text-align: center; 178 | font-size: 11px; 179 | font-weight: 600; 180 | line-height: 38px; 181 | letter-spacing: .1rem; 182 | text-transform: uppercase; 183 | text-decoration: none; 184 | white-space: nowrap; 185 | background-color: transparent; 186 | border-radius: 4px; 187 | border: 1px solid #bbb; 188 | cursor: pointer; 189 | box-sizing: border-box; } 190 | .button:hover, 191 | button:hover, 192 | input[type="submit"]:hover, 193 | input[type="reset"]:hover, 194 | input[type="button"]:hover, 195 | .button:focus, 196 | button:focus, 197 | input[type="submit"]:focus, 198 | input[type="reset"]:focus, 199 | input[type="button"]:focus { 200 | color: #333; 201 | border-color: #888; 202 | outline: 0; } 203 | .button.button-primary, 204 | button.button-primary, 205 | input[type="submit"].button-primary, 206 | input[type="reset"].button-primary, 207 | input[type="button"].button-primary { 208 | color: #FFF; 209 | background-color: #33C3F0; 210 | border-color: #33C3F0; } 211 | .button.button-primary:hover, 212 | button.button-primary:hover, 213 | input[type="submit"].button-primary:hover, 214 | input[type="reset"].button-primary:hover, 215 | input[type="button"].button-primary:hover, 216 | .button.button-primary:focus, 217 | button.button-primary:focus, 218 | input[type="submit"].button-primary:focus, 219 | input[type="reset"].button-primary:focus, 220 | input[type="button"].button-primary:focus { 221 | color: #FFF; 222 | background-color: #1EAEDB; 223 | border-color: #1EAEDB; } 224 | 225 | 226 | /* Forms 227 | 
–––––––––––––––––––––––––––––––––––––––––––––––––– */ 228 | input[type="email"], 229 | input[type="number"], 230 | input[type="search"], 231 | input[type="text"], 232 | input[type="tel"], 233 | input[type="url"], 234 | input[type="password"], 235 | textarea, 236 | select { 237 | height: 38px; 238 | padding: 6px 10px; /* The 6px vertically centers text on FF, ignored by Webkit */ 239 | background-color: #fff; 240 | border: 1px solid #D1D1D1; 241 | border-radius: 4px; 242 | box-shadow: none; 243 | box-sizing: border-box; } 244 | /* Removes awkward default styles on some inputs for iOS */ 245 | input[type="email"], 246 | input[type="number"], 247 | input[type="search"], 248 | input[type="text"], 249 | input[type="tel"], 250 | input[type="url"], 251 | input[type="password"], 252 | textarea { 253 | -webkit-appearance: none; 254 | -moz-appearance: none; 255 | appearance: none; } 256 | textarea { 257 | min-height: 65px; 258 | padding-top: 6px; 259 | padding-bottom: 6px; } 260 | input[type="email"]:focus, 261 | input[type="number"]:focus, 262 | input[type="search"]:focus, 263 | input[type="text"]:focus, 264 | input[type="tel"]:focus, 265 | input[type="url"]:focus, 266 | input[type="password"]:focus, 267 | textarea:focus, 268 | select:focus { 269 | border: 1px solid #33C3F0; 270 | outline: 0; } 271 | label, 272 | legend { 273 | display: block; 274 | margin-bottom: .5rem; 275 | font-weight: 600; } 276 | fieldset { 277 | padding: 0; 278 | border-width: 0; } 279 | input[type="checkbox"], 280 | input[type="radio"] { 281 | display: inline; } 282 | label > .label-body { 283 | display: inline-block; 284 | margin-left: .5rem; 285 | font-weight: normal; } 286 | 287 | 288 | /* Lists 289 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 290 | ul { 291 | list-style: circle inside; } 292 | ol { 293 | list-style: decimal inside; } 294 | ol, ul { 295 | padding-left: 0; 296 | margin-top: 0; } 297 | ul ul, 298 | ul ol, 299 | ol ol, 300 | ol ul { 301 | margin: 1.5rem 0 1.5rem 3rem; 302 | font-size: 90%; } 303 | li { 304 | margin-bottom: 1rem; } 305 | 306 | 307 | /* Code 308 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 309 | code { 310 | padding: .2rem .5rem; 311 | margin: 0 .2rem; 312 | font-size: 90%; 313 | white-space: nowrap; 314 | background: #F1F1F1; 315 | border: 1px solid #E1E1E1; 316 | border-radius: 4px; } 317 | pre > code { 318 | display: block; 319 | padding: 1rem 1.5rem; 320 | white-space: pre; } 321 | 322 | 323 | /* Tables 324 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 325 | th, 326 | td { 327 | padding: 12px 15px; 328 | text-align: left; 329 | border-bottom: 1px solid #E1E1E1; } 330 | th:first-child, 331 | td:first-child { 332 | padding-left: 0; } 333 | th:last-child, 334 | td:last-child { 335 | padding-right: 0; } 336 | 337 | 338 | /* Spacing 339 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 340 | button, 341 | .button { 342 | margin-bottom: 1rem; } 343 | input, 344 | textarea, 345 | select, 346 | fieldset { 347 | margin-bottom: 1.5rem; } 348 | pre, 349 | blockquote, 350 | dl, 351 | figure, 352 | table, 353 | p, 354 | ul, 355 | ol, 356 | form { 357 | margin-bottom: 2.5rem; } 358 | 359 | 360 | /* Utilities 361 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 362 | .u-full-width { 363 | width: 100%; 364 | box-sizing: border-box; } 365 | .u-max-full-width { 366 | max-width: 100%; 367 | box-sizing: border-box; } 368 | .u-pull-right { 369 | float: right; } 370 | .u-pull-left { 371 | float: left; } 372 | 373 | 374 | /* Misc 375 | 
–––––––––––––––––––––––––––––––––––––––––––––––––– */ 376 | hr { 377 | margin-top: 3rem; 378 | margin-bottom: 3.5rem; 379 | border-width: 0; 380 | border-top: 1px solid #E1E1E1; } 381 | 382 | 383 | /* Clearing 384 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 385 | 386 | /* Self Clearing Goodness */ 387 | .container:after, 388 | .row:after, 389 | .u-cf { 390 | content: ""; 391 | display: table; 392 | clear: both; } 393 | 394 | 395 | /* Media Queries 396 | –––––––––––––––––––––––––––––––––––––––––––––––––– */ 397 | /* 398 | Note: The best way to structure the use of media queries is to create the queries 399 | near the relevant code. For example, if you wanted to change the styles for buttons 400 | on small devices, paste the mobile query code up in the buttons section and style it 401 | there. 402 | */ 403 | 404 | 405 | /* Larger than mobile */ 406 | @media (min-width: 400px) {} 407 | 408 | /* Larger than phablet (also point when grid becomes active) */ 409 | @media (min-width: 550px) {} 410 | 411 | /* Larger than tablet */ 412 | @media (min-width: 750px) {} 413 | 414 | /* Larger than desktop */ 415 | @media (min-width: 1000px) {} 416 | 417 | /* Larger than Desktop HD */ 418 | @media (min-width: 1200px) {} 419 | -------------------------------------------------------------------------------- /demos/README.md: -------------------------------------------------------------------------------- 1 | # JsSpeechRecognizer Demos 2 | JavaScript Speech Recognizer Demos 3 | 4 | ## More Demos 5 | 6 | More demos of the JsSpeechRecognizer will be included in this folder. 7 | 8 | [Keyword Spotting](https://github.com/dreamdom/JsSpeechRecognizer/tree/master/demos/keyword-spotting) 9 | 10 | [Video Interaction](https://github.com/dreamdom/JsSpeechRecognizer/tree/master/demos/video-interaction) 11 | -------------------------------------------------------------------------------- /demos/keyword-spotting/README.md: -------------------------------------------------------------------------------- 1 | # JsSpeechRecognizer Keyword Spotting 2 | 3 | Train the Speech Recognizer to recognize a spoken keyword. 4 | 5 | [Keyword Spotting Live Demo](https://dreamdom.github.io/demos/keyword-spotting/keyword-spotting.html) 6 | 7 | ## Video 8 | Here is a [short video](https://vimeo.com/161142124) that shows how to run the demo. 9 | 10 | ## Screenshot 11 | ![Keyword Spotting Screenshot](readme-images/screenshot-keyword-spotting.png "Keyword Spotting") 12 | 13 | ## Instructions 14 | 15 | 1. Train the word you would like to recognize several times. For example, train "Zoey" five times. 16 | 2. Click the "start testing" button. The recognizer is now continuously listening for the word "Zoey". 17 | 3. Say the word "Zoey". A notification will sound when the word is recognized. 18 | 4. The recognized word will also appear in the testing section. Clicking the "Play" button next to the word will play the audio clip the recognizer identified as the keyword. 19 | 20 | ## Tips and Things to Try 21 | 22 | 1. If you are getting too many false positives, try increasing the confidence threshold. 23 | 2. If you aren't getting any matches, try lowering the confidence threshold. 24 | 3. Try varying the number of training entries. 25 | 4. Clone the code, and then in the file keyword-spotting.html try changing the group size and the number of groups (see the sketch after this README). 26 | 5. Experiment with different words and phrases. Some shorter words may generate a lot of false positives. Some longer words or phrases are more unique sounding. 27 | 6.
Train and test in a quiet room. The recognizer does not currently handle background noise well. 28 |
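The grouping and threshold settings that tips 1, 2, and 4 refer to can be experimented with in one place. Below is a minimal tuning sketch; it assumes the recognizer object exposes the group settings and the keyword-spotting confidence threshold as plain properties, and the property names used (numGroups, groupSize, keywordSpottingMinConfidence) are assumptions inferred from the README wording, so check JsSpeechRecognizer.js and keyword-spotting.html for the actual names.

````javascript
// Hypothetical tuning sketch -- the property names below are assumptions,
// not confirmed API. Verify against JsSpeechRecognizer.js before relying on them.
var speechRec = new JsSpeechRecognizer();

// Tip 4: fewer, larger groups give a coarser model that is smaller and
// faster to match against; a finer model may be more accurate but costs
// more memory and processing time.
speechRec.numGroups = 25;  // assumed property: number of frequency groups
speechRec.groupSize = 10;  // assumed property: FFT bins per group

// Tips 1 and 2: raise the threshold to cut false positives, lower it if
// spoken keywords are being missed. The demo page defaults to .55.
speechRec.keywordSpottingMinConfidence = 0.65; // assumed property
````

Note that after changing the group settings you would need to retrain, since an existing model and any new recordings would otherwise be grouped differently and no longer comparable.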
-------------------------------------------------------------------------------- /demos/keyword-spotting/keyword-spotting.html: -------------------------------------------------------------------------------- [Only the visible page text survived extraction; the markup and scripts did not.] Title: "JavaScript Speech Recognition - Keyword Spotting Demo". Page text: "JsSpeechRecognizer - Keyword Spotting", "JavaScript Speech Recognition Keyword Spotting Demo", a link labeled "JsSpeechRecognizer github page", and the instructions "1. Train the keyword you would like to spot many times. / 3. Test by pressing start testing. Say the keyword you trained. / 5. For more detailed instructions, click here." (items 2, 4, and 6 did not survive). Sections: "Training" with a "Word:" field, and "Testing" with a confidence threshold value of ".55". -------------------------------------------------------------------------------- /demos/keyword-spotting/readme-images/screenshot-keyword-spotting.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dreamdom/JsSpeechRecognizer/5db119f148e7ba0b5534a2f12963ad77bc5dc36d/demos/keyword-spotting/readme-images/screenshot-keyword-spotting.png -------------------------------------------------------------------------------- /demos/resources/sounds/notification1.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dreamdom/JsSpeechRecognizer/5db119f148e7ba0b5534a2f12963ad77bc5dc36d/demos/resources/sounds/notification1.wav -------------------------------------------------------------------------------- /demos/video-interaction/README.md: -------------------------------------------------------------------------------- 1 | # JsSpeechRecognizer Video Interaction 2 | 3 | Interact with a video using your voice. 4 | 5 | [Video Interaction Live Demo](https://dreamdom.github.io/demos/video-interaction/video-interaction.html) 6 | 7 | ## Video 8 | Here is a [short video](https://vimeo.com/161726625) that shows how to run the demo. 9 | 10 | ## Screenshot 11 | ![Video Interaction Screenshot](readme-images/video-interaction-screenshot.png "Video Interaction") 12 | 13 | ## Instructions 14 | 1. Train the phrase "Like It" several times. 15 | 2. Click the "start testing" button. The recognizer is now continuously listening for the phrase "Like It". 16 | 3. Say the phrase "Like It". A notification will sound when the phrase is recognized. A notification bar will appear near the bottom of the video. 17 | 4. Start playing the video. Try saying the phrase "Like It" again. 18 | 19 | ## Tips and Things to Try 20 | 1. Try adjusting the confidence threshold up and down. 21 | 2. Try training a different phrase. 22 | 3. Try testing at different volume levels. 23 | 4. Try testing while playing audio from different sources. 24 | 25 | ## Final Notes 26 | At the moment I have only tested on one computer, and only in Firefox and Chrome. 27 | -------------------------------------------------------------------------------- /demos/video-interaction/readme-images/video-interaction-screenshot.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dreamdom/JsSpeechRecognizer/5db119f148e7ba0b5534a2f12963ad77bc5dc36d/demos/video-interaction/readme-images/video-interaction-screenshot.png -------------------------------------------------------------------------------- /demos/video-interaction/video-interaction.html: -------------------------------------------------------------------------------- [Only the visible page text survived extraction; the markup and scripts did not.] Title: "JavaScript Speech Recognition - Keyword Spotting Demo". Page text: "JsSpeechRecognizer - Video Interaction", "JavaScript Speech Recognition Video Interaction Demo", a link labeled "JsSpeechRecognizer github page", and the instructions "1. Train the keyword you would like to spot many times. / 3. Test by pressing start testing. Say the keyword you trained. Try testing while the video is playing. / 5. For more detailed instructions, click here." (items 2, 4, and 6 did not survive). Sections: "Training" with a "Word:" field, and "Testing" with a confidence threshold value of ".55".
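For orientation, here is roughly how the flow the video-interaction instructions describe could be wired up. Every recognizer method and callback name below (openMic, startTrainingRecording, stopRecording, generateModel, startKeywordSpottingRecording, keywordSpottedCallback) is an assumption about the JsSpeechRecognizer API rather than a confirmed signature; the real interface lives in JsSpeechRecognizer.js.

````javascript
// Hedged sketch of the video-interaction flow. All recognizer method and
// callback names here are assumptions -- check JsSpeechRecognizer.js.
var speechRec = new JsSpeechRecognizer();
speechRec.openMic(); // assumed method

// Train the phrase a few times, each take bracketed by start/stop
speechRec.startTrainingRecording("Like It"); // assumed method
// ...speak the phrase, then...
speechRec.stopRecording();                   // assumed method
speechRec.generateModel();                   // assumed method

// Continuously listen, and react whenever the phrase is spotted
speechRec.keywordSpottedCallback = function(word, confidence) { // assumed hook
    new Audio("../resources/sounds/notification1.wav").play();
    showNotificationBar(word); // hypothetical helper that overlays the video
};
speechRec.startKeywordSpottingRecording(); // assumed method
````

Because spotting keeps the microphone open while the video plays, tips 3 and 4 effectively stress-test how well the trained model separates the phrase from the video's own audio.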
-------------------------------------------------------------------------------- /experimental/README.md: -------------------------------------------------------------------------------- 1 | # Experimental Changes 2 | 3 | Experimental changes and updates to the JsSpeechRecognizer will live in this folder. These updates may be improvements or new features for the recognizer. 4 | 5 | ## Warning 6 | 7 | Expect frequent code changes, features that may not be fully implemented, and use cases that might not be fully tested. -------------------------------------------------------------------------------- /readme-images/screenshot-chicken-frog.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dreamdom/JsSpeechRecognizer/5db119f148e7ba0b5534a2f12963ad77bc5dc36d/readme-images/screenshot-chicken-frog.png -------------------------------------------------------------------------------- /readme-images/screenshot-yes-no.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dreamdom/JsSpeechRecognizer/5db119f148e7ba0b5534a2f12963ad77bc5dc36d/readme-images/screenshot-yes-no.png -------------------------------------------------------------------------------- /speechrec.html: -------------------------------------------------------------------------------- [Only the visible page text survived extraction; the markup and scripts did not.] Title: "JavaScript Speech Recognition". Page text: "JsSpeechRecognizer", "JavaScript Speech Recognition Demo", a link labeled "JsSpeechRecognizer github page", and the instructions "1. Train by writing the word and pressing start training and stop training. / 3. Try training a word multiple times. Try training multiple words. / 5. Test by pressing start testing and say a word already trained." (items 2, 4, and 6 did not survive). Sections: "Training" with a "Word:" field, and "Testing".
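The train-then-test loop this page walks through could look roughly like the following, with training the same as in the earlier sketch. As before, every method name is an assumption about the API in JsSpeechRecognizer.js, and the shape of the returned hypothesis is illustrative only.

````javascript
// Hedged sketch of the basic demo loop -- method names and the result
// object's shape are assumptions; see JsSpeechRecognizer.js.
var speechRec = new JsSpeechRecognizer();
speechRec.openMic(); // assumed method

// "start training" / "stop training" for the word typed into the box,
// repeated a few times, then build the model
speechRec.startTrainingRecording("chicken"); // assumed method
// ...speak the word, then...
speechRec.stopRecording();
speechRec.generateModel(); // assumed method

// "start testing": record one utterance and ask for the best match
speechRec.startRecognitionRecording(); // assumed method
// ...speak a trained word, then...
speechRec.stopRecording();
var top = speechRec.getTopRecognitionHypotheses(1); // assumed method
console.log(top[0]); // e.g. { match: "chicken", confidence: 0.87 } (illustrative)
````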
--------------------------------------------------------------------------------