
# Speech recognition polyfill

Polyfill for the [SpeechRecognition](https://wicg.github.io/speech-api/#speechreco-section) standard on the web, using [Speechly](https://www.speechly.com/) as the underlying API. The primary use of this library is to enable speech recognition on browsers that would not normally support it natively.

Speechly offers a free tier for its speech recognition API with a generous usage limit.

## Useful links

* [Quickstart](#quickstart)
* [Browser support](#browser-support)
* [Handling errors](#handling-errors)
* [Examples](#examples)
* [Integrating with react-speech-recognition](#integrating-with-react-speech-recognition)
* [Limitations](#limitations)
* [Type docs](docs/README.md)
* [Contributing](#contributing)
* [About Speechly](#about-speechly)

## Quickstart

### Installation

```
npm install --save @speechly/speech-recognition-polyfill
```

### Basic usage

First, you need a Speechly Application ID:

1. Log in to the [Speechly Dashboard](https://api.speechly.com/dashboard/)
2. Open [Create a new application](https://api.speechly.com/dashboard/#/app/new)
3. Give your application a name and press **Create application**
4. **Deploy** the application
5. Copy the **App ID**; you'll need it in the next step.

Once you have your App ID, you can use it to create a recognition object that can start transcribing anything the user speaks into the microphone:

```
import { createSpeechlySpeechRecognition } from '@speechly/speech-recognition-polyfill';

const appId = '<your_app_id>';
const SpeechlySpeechRecognition = createSpeechlySpeechRecognition(appId);
const speechRecognition = new SpeechlySpeechRecognition();
```

Before you start using `speechRecognition` to transcribe, you should provide a callback to process any transcripts that get generated. A common use case is to match the transcript against a list of commands and perform an action when you detect a match. Alternatively, you may want to display the transcript in the UI. Here's how to set the callback:

```
speechRecognition.onresult = ({ results }) => {
  const transcript = results[0][0].transcript;
  // Process the transcript
};
```

You may also want to configure the recognition object:

```
// Keep transcribing, even if the user stops speaking
speechRecognition.continuous = true;

// Get transcripts while the user is speaking, not just when they've finished
speechRecognition.interimResults = true;
```
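
When `interimResults` is enabled, `onresult` fires while the user is still speaking, so you may want to tell in-progress transcripts apart from finished ones. Here's a minimal sketch, assuming the polyfill populates the spec's `isFinal` flag on each result (check the [type docs](docs/README.md) for the exact shape):

```
speechRecognition.onresult = ({ results }) => {
  const result = results[0];
  const { transcript } = result[0];
  if (result.isFinal) {
    // The utterance is complete - safe to act on the transcript
  } else {
    // Interim transcript - useful for live UI feedback
  }
};
```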

With your recognition object configured, you're ready to start transcribing by calling the `start()` method. To comply with rules set by browsers, this _must_ be triggered by a user action such as a button click. For example, in a React component this could look like:

```
const startTranscribing = () => {
  speechRecognition.start();
};

// When rendering component
<button onClick={startTranscribing}>Start transcribing</button>
```

After calling `start()`, the microphone will be turned on and the recognition object will start passing transcripts to the callback you assigned to `onresult`. If you want to stop transcribing, you can call the following:

```
speechRecognition.stop();
```
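
If you'd rather discard the utterance currently in progress instead of letting it finish transcribing, the polyfill also implements `abort()`:

```
// Unlike stop(), abort() also prevents the ongoing utterance from
// producing any further results
speechRecognition.abort();
```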

## Browser support

This polyfill will work on browsers that support the [MediaDevices](https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices) and [AudioContext](https://developer.mozilla.org/en-US/docs/Web/API/AudioContext) APIs, which covers roughly 95% of web users in 2022. The exceptions are Internet Explorer and most browsers from before 2016. On these browsers, an error will be thrown when creating a `SpeechlySpeechRecognition` object.

The `SpeechlySpeechRecognition` class offers the `hasBrowserSupport` flag to check whether the browser supports the required APIs. We recommend you do the following when creating your speech recognition object:
```
if (SpeechlySpeechRecognition.hasBrowserSupport) {
  const speechRecognition = new SpeechlySpeechRecognition();
  // Use speech recognition
} else {
  // Show some fallback UI
}
```

## Handling errors

A common error case is when the user chooses not to give the web app permission to access the microphone. This, and any other error emitted by this polyfill, can be handled via the `onerror` callback. In such cases, it's advised that you render some fallback UI, as these errors usually mean that voice-driven features will not work and should be disabled:

```
import { MicrophoneNotAllowedError } from '@speechly/speech-recognition-polyfill';

...

speechRecognition.onerror = (event) => {
  if (event === MicrophoneNotAllowedError) {
    // Microphone permission denied - show some fallback UI
  } else {
    // Unable to start transcribing - show some fallback UI
  }
};
```
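
If you want to branch on the failure type explicitly, here's a sketch assuming the package also exports `SpeechRecognitionFailedError` (the generic failure emitted when transcription can't start) alongside `MicrophoneNotAllowedError`:

```
import {
  MicrophoneNotAllowedError,
  SpeechRecognitionFailedError // assumed export - check the type docs
} from '@speechly/speech-recognition-polyfill';

speechRecognition.onerror = (event) => {
  if (event === MicrophoneNotAllowedError) {
    // Microphone permission denied - show some fallback UI
  } else if (event === SpeechRecognitionFailedError) {
    // Speech recognition failed to start - disable voice-driven features
  }
};
```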

## Examples

The following examples use React to demonstrate how this polyfill can be used in real web components.

### Matching commands

A common use case is to enable the user to control a web app with their voice. The following example has a "hold to talk" button that enables transcription while held down. It defines a list of commands; when the user says one of them, the matched command is displayed. In practice, these matched commands could be used to perform actions.

```
import React, { useState, useEffect } from 'react';
import { createSpeechlySpeechRecognition } from '@speechly/speech-recognition-polyfill';

const appId = '<your_app_id>';
const SpeechlySpeechRecognition = createSpeechlySpeechRecognition(appId);
const speechRecognition = new SpeechlySpeechRecognition();
speechRecognition.continuous = true;
speechRecognition.interimResults = true;

const COMMANDS = ['PLAY', 'PAUSE', 'REWIND'];

export default () => {
  const [matchedCommand, setMatchedCommand] = useState('');

  const handleResult = ({ results }) => {
    const { transcript } = results[0][0];
    COMMANDS.forEach(command => {
      if (transcript.indexOf(command) !== -1) {
        setMatchedCommand(command);
      }
    });
  };

  useEffect(() => {
    speechRecognition.onresult = handleResult;
  });

  return (
    <div>
      <button
        onTouchStart={speechRecognition.start}
        onMouseDown={speechRecognition.start}
        onTouchEnd={speechRecognition.stop}
        onMouseUp={speechRecognition.stop}
      >Hold to talk</button>
      <span>{matchedCommand}</span>
    </div>
  );
};
```

### Displaying a transcript

You may want to simply display everything the user says as text, for example when composing a message. This example uses the same "hold to talk" button as before. The transcripts are combined and collected in local state, which is displayed as one piece of text.

```
import React, { useState, useEffect, useCallback } from 'react';
import { createSpeechlySpeechRecognition } from '@speechly/speech-recognition-polyfill';

const appId = '<your_app_id>';
const SpeechlySpeechRecognition = createSpeechlySpeechRecognition(appId);
const speechRecognition = new SpeechlySpeechRecognition();
speechRecognition.continuous = true;

export default () => {
  const [transcript, setTranscript] = useState('');

  const handleResult = useCallback(({ results }) => {
    const newTranscript = [transcript, results[0][0].transcript].join(' ');
    setTranscript(newTranscript);
  }, [transcript]);

  useEffect(() => {
    speechRecognition.onresult = handleResult;
  });

  return (
    <div>
      <button
        onTouchStart={speechRecognition.start}
        onMouseDown={speechRecognition.start}
        onTouchEnd={speechRecognition.stop}
        onMouseUp={speechRecognition.stop}
      >Hold to talk</button>
      <span>{transcript}</span>
    </div>
  );
};
```

## Integrating with react-speech-recognition

This polyfill is compatible with `react-speech-recognition`, a React hook that manages the transcript for you and allows you to provide more powerful commands. For React web apps, we recommend combining these libraries. See its [README](https://github.com/JamesBrill/react-speech-recognition) for full guidance on how to use `react-speech-recognition`. It can be installed with:

```
npm install --save react-speech-recognition
```

Below is an example with more complex commands, which print a message in response to each matched command. For example, saying "Bob is my name" will result in the message "Hi Bob!".

```
import React, { useState } from 'react';
import SpeechRecognition, { useSpeechRecognition } from 'react-speech-recognition';
import { createSpeechlySpeechRecognition } from '@speechly/speech-recognition-polyfill';

const appId = '<your_app_id>';
const SpeechlySpeechRecognition = createSpeechlySpeechRecognition(appId);
SpeechRecognition.applyPolyfill(SpeechlySpeechRecognition);

export default () => {
  const [message, setMessage] = useState('');
  const commands = [
    {
      command: '* is my name',
      callback: (name) => setMessage(`Hi ${name}!`),
      matchInterim: true
    },
    {
      command: 'My top sports are * and *',
      callback: (sport1, sport2) => setMessage(`#1: ${sport1}, #2: ${sport2}`)
    },
    {
      command: 'Goodbye',
      callback: () => setMessage('So long!'),
      matchInterim: true
    },
    {
      command: 'Pass the salt (please)',
      callback: () => setMessage('My pleasure')
    }
  ];
  const {
    transcript,
    listening,
    browserSupportsSpeechRecognition,
    isMicrophoneAvailable
  } = useSpeechRecognition({ commands });
  const listenContinuously = () => SpeechRecognition.startListening({ continuous: true });

  if (!browserSupportsSpeechRecognition) {
    return <span>No browser support</span>;
  }

  if (!isMicrophoneAvailable) {
    return <span>Please allow access to the microphone</span>;
  }

  return (
    <div>
      <p>Microphone: {listening ? 'on' : 'off'}</p>
      <button
        onTouchStart={listenContinuously}
        onMouseDown={listenContinuously}
        onTouchEnd={SpeechRecognition.stopListening}
        onMouseUp={SpeechRecognition.stopListening}
      >Hold to talk</button>
      <p>{transcript}</p>
      <p>{message}</p>
    </div>
  );
};
```

## Limitations

While this polyfill is intended to enable most use cases for voice-driven web apps, it does not implement the full [W3C specification](https://wicg.github.io/speech-api/#speechreco-section) for `SpeechRecognition`, only a subset:
* `start()` method
* `stop()` method
* `abort()` method
* `continuous` property
* `interimResults` property
* `onresult` property
* `onend` property - a callback that is fired when `stop()` or `abort()` is called (see the sketch after this list)
* `onerror` property - a callback that is fired when an error occurs while attempting to start speech recognition
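
As a sketch of how `onend` and `abort()` fit together (`setMicrophoneActive` below is a hypothetical setter from your own UI state):

```
// onend fires after stop() or abort(), so it's a good place to
// keep the UI in sync with the microphone
speechRecognition.onend = () => {
  setMicrophoneActive(false); // hypothetical UI state setter
};

// e.g. when the user navigates away mid-utterance
speechRecognition.abort();
```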

Some notable limitations:
* The `lang` property is currently unsupported, defaulting to English transcription
* `onresult` only receives the most recent speech recognition result (the utterance that the user is in the process of saying or has just finished saying); it does not store a history of all transcripts. This can easily be resolved by either managing your own transcript state (see the [Displaying a transcript](#displaying-a-transcript) example above) or using `react-speech-recognition` to do that for you
* Transcripts are generated in uppercase letters without punctuation. If needed, you can transform them using [toLowerCase()](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/toLowerCase), as sketched after this list
* `onerror` currently supports only the `not-allowed` error (the user denied permission to use the microphone) and the `audio-capture` error, which is emitted in any other case where speech recognition fails. The full list of errors in the spec can be found [here](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognitionErrorEvent/error)
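
For example, to normalize a transcript before matching it against lowercase command strings:

```
speechRecognition.onresult = ({ results }) => {
  const rawTranscript = results[0][0].transcript; // e.g. "PLAY THE NEXT SONG"
  const normalized = rawTranscript.toLowerCase(); // "play the next song"
  // Match the normalized transcript against your commands
};
```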

## Contributing

For a guide on how to develop `speech-recognition-polyfill` and contribute changes, see [CONTRIBUTING.md](CONTRIBUTING.md).

## About Speechly

Speechly is a developer tool for building real-time multimodal voice user interfaces. It enables developers and designers to enhance their current touch user interface with voice functionality for a better user experience.

### Speechly key features

- Fully streaming API
- Multimodal from the ground up
- Easy to configure for any use case
- Fast to integrate into any touch screen application
- Supports natural corrections such as "Show me red – I mean blue t-shirts"
- Real-time visual feedback encourages users to go on with their voice

| Example application | Description |
| :-----------------: | ----------- |
| *(screenshot of an example application)* | Instead of using buttons, input fields and dropdowns, Speechly enables users to interact with the application by using voice. Users get real-time visual feedback on the form as they speak and are encouraged to go on. If there's an error, the user can correct it using either the traditional touch user interface or voice. |

--------------------------------------------------------------------------------
/src/createSpeechRecognition.test.ts:
--------------------------------------------------------------------------------
import { mocked } from 'ts-jest/utils';
import { BrowserMicrophone, ErrNoAudioConsent } from '@speechly/browser-client';
import createSpeechlySpeechRecognition from './createSpeechRecognition';
import { MicrophoneNotAllowedError, SpeechRecognitionFailedError } from './types';
import {
  mockUndefinedWindow,
  mockUndefinedNavigator,
  mockMediaDevices,
  mockUndefinedMediaDevices,
  mockAudioContext,
  mockWebkitAudioContext,
  mockUndefinedAudioContext,
  mockUndefinedWebkitAudioContext,
  expectSentenceToBeTranscribedWithFinalResult,
  expectSentenceToBeTranscribedWithInterimAndFinalResults,
  expectSentenceToBeTranscribedWithFirstInitialResult,
} from './testUtils';
import TEST_DATA from './testData';

const { SENTENCE_ONE, SENTENCE_TWO } = TEST_DATA;

// Captures the segment-change handler that the polyfill registers on the client,
// so tests can feed mock segments directly to it
let _callback: any;
const mockOnSegmentChange = jest.fn((callback) => {
  _callback = callback;
});
const mockMicrophoneInitialize = jest.fn(() => Promise.resolve());
const mockMicrophoneClose = jest.fn(() => Promise.resolve());
const mockStart = jest.fn(() => Promise.resolve());
const mockStop = jest.fn(() => Promise.resolve());
const mockAttach = jest.fn(() => Promise.resolve());
const mockDetach = jest.fn(() => Promise.resolve());
const mockMediaStream = { data: 'mockData' };
const MockBrowserMicrophone = mocked(BrowserMicrophone, true);

const mockBrowserMicrophone = ({ mediaStream }: { mediaStream: typeof mockMediaStream | null }) => {
  MockBrowserMicrophone.mockImplementation(function () {
    return {
      initialize: mockMicrophoneInitialize,
      close: mockMicrophoneClose,
      mediaStream,
    } as any;
  });
};

jest.mock('@speechly/browser-client', () => ({
  BrowserClient: function () {
    return {
      onSegmentChange: mockOnSegmentChange,
      start: mockStart,
      stop: mockStop,
      attach: mockAttach,
      detach: mockDetach,
    };
  },
  BrowserMicrophone: jest.fn(),
  ErrNoAudioConsent: jest.fn(),
}));

// Simulates an utterance by replaying its mock segments through the captured handler
const speak = (sentence: any) => {
  sentence.forEach(_callback);
};

// Replays the first segment, invokes the interrupt (e.g. stop/abort), then replays the rest
const speakAndInterrupt = (sentence: any, interrupt: any) => {
  _callback(sentence[0]);
  interrupt();
  sentence.slice(1).forEach(_callback);
};

describe('createSpeechlySpeechRecognition', () => {
  beforeEach(() => {
    MockBrowserMicrophone.mockClear();
    mockBrowserMicrophone({ mediaStream: mockMediaStream });
    mockMicrophoneInitialize.mockClear();
    mockMicrophoneClose.mockClear();
    mockStart.mockClear();
    mockStop.mockClear();
    mockOnSegmentChange.mockClear();
    mockAttach.mockClear();
    mockDetach.mockClear();
  });

  it('calls initialize on browser microphone when starting transcription', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();

    await speechRecognition.start();

    expect(mockMicrophoneInitialize).toHaveBeenCalledTimes(1);
  })

  it('calls attach on Speechly client with browser microphone media stream when starting transcription', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();

    await speechRecognition.start();

    expect(mockAttach).toHaveBeenCalledTimes(1);
    expect(mockAttach).toHaveBeenCalledWith(mockMediaStream);
  })

  it('calls start on Speechly client when starting transcription', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();

    await speechRecognition.start();

    expect(mockStart).toHaveBeenCalledTimes(1);
  })

  it('calls given onresult for only the final result (interimResults: false)', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnResult = jest.fn();
    speechRecognition.onresult = mockOnResult;

    await speechRecognition.start();
    speak(SENTENCE_ONE);

    expect(mockOnResult).toHaveBeenCalledTimes(1);
    expectSentenceToBeTranscribedWithFinalResult(SENTENCE_ONE, mockOnResult);
  })

  it('calls given onresult for each interim or final result (interimResults: true)', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnResult = jest.fn();
    speechRecognition.onresult = mockOnResult;
    speechRecognition.interimResults = true;

    await speechRecognition.start();
    speak(SENTENCE_ONE);

    expect(mockOnResult).toHaveBeenCalledTimes(SENTENCE_ONE.length);
    expectSentenceToBeTranscribedWithInterimAndFinalResults(SENTENCE_ONE, mockOnResult);
  })

  it('transcribes two utterances when continuous is turned on (interimResults: false)', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnResult = jest.fn();
    speechRecognition.onresult = mockOnResult;
    speechRecognition.continuous = true;

    await speechRecognition.start();
    speak(SENTENCE_ONE);
    speak(SENTENCE_TWO);

    expect(mockOnResult).toHaveBeenCalledTimes(2);
    expectSentenceToBeTranscribedWithFinalResult(SENTENCE_ONE, mockOnResult);
    expectSentenceToBeTranscribedWithFinalResult(SENTENCE_TWO, mockOnResult, 2);
  })

  it('transcribes only one of two utterances when continuous is turned off (interimResults: false)', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnResult = jest.fn();
    speechRecognition.onresult = mockOnResult;

    await speechRecognition.start();
    speak(SENTENCE_ONE);
    speak(SENTENCE_TWO);

    expect(mockOnResult).toHaveBeenCalledTimes(1);
    expectSentenceToBeTranscribedWithFinalResult(SENTENCE_ONE, mockOnResult);
  })

  it('transcribes two utterances when continuous is turned on (interimResults: true)', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnResult = jest.fn();
    speechRecognition.onresult = mockOnResult;
    speechRecognition.interimResults = true;
    speechRecognition.continuous = true;

    await speechRecognition.start();
    speak(SENTENCE_ONE);
    speak(SENTENCE_TWO);

    expect(mockOnResult).toHaveBeenCalledTimes(SENTENCE_ONE.length + SENTENCE_TWO.length);
    expectSentenceToBeTranscribedWithInterimAndFinalResults(SENTENCE_ONE, mockOnResult);
    expectSentenceToBeTranscribedWithInterimAndFinalResults(SENTENCE_TWO, mockOnResult, SENTENCE_ONE.length + 1);
  })

  it('transcribes only one of two utterances when continuous is turned off (interimResults: true)', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnResult = jest.fn();
    speechRecognition.onresult = mockOnResult;
    speechRecognition.interimResults = true;

    await speechRecognition.start();
    speak(SENTENCE_ONE);
    speak(SENTENCE_TWO);

    expect(mockOnResult).toHaveBeenCalledTimes(SENTENCE_ONE.length);
    expectSentenceToBeTranscribedWithInterimAndFinalResults(SENTENCE_ONE, mockOnResult);
  })

  it('does not call initialize, stop or onend when stopping a transcription that was never started', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnEnd = jest.fn();
    speechRecognition.onend = mockOnEnd;

    await speechRecognition.stop();

    expect(mockMicrophoneInitialize).toHaveBeenCalledTimes(0);
    expect(mockStop).toHaveBeenCalledTimes(0);
    expect(mockDetach).toHaveBeenCalledTimes(0);
    expect(mockMicrophoneClose).toHaveBeenCalledTimes(0);
    expect(mockOnEnd).toHaveBeenCalledTimes(0);
  })

  it('calls initialize, stop and onend when stopping a transcription that has been started', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnEnd = jest.fn();
    speechRecognition.onend = mockOnEnd;

    await speechRecognition.start();
    await speechRecognition.stop();

    expect(mockMicrophoneInitialize).toHaveBeenCalledTimes(1);
    expect(mockStop).toHaveBeenCalledTimes(1);
    expect(mockDetach).toHaveBeenCalledTimes(1);
    expect(mockMicrophoneClose).toHaveBeenCalledTimes(1);
    expect(mockOnEnd).toHaveBeenCalledTimes(1);
  })

  it('does not call initialize, stop or onend a second time when stopping a transcription that was already stopped', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnEnd = jest.fn();
    speechRecognition.onend = mockOnEnd;

    await speechRecognition.start();
    await speechRecognition.stop();
    await speechRecognition.stop();

    expect(mockMicrophoneInitialize).toHaveBeenCalledTimes(1);
    expect(mockStop).toHaveBeenCalledTimes(1);
    expect(mockDetach).toHaveBeenCalledTimes(1);
    expect(mockMicrophoneClose).toHaveBeenCalledTimes(1);
    expect(mockOnEnd).toHaveBeenCalledTimes(1);
  })

  it('does not call initialize, stop or onend when aborting a transcription that was never started', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnEnd = jest.fn();
    speechRecognition.onend = mockOnEnd;

    await speechRecognition.abort();

    expect(mockMicrophoneInitialize).toHaveBeenCalledTimes(0);
    expect(mockStop).toHaveBeenCalledTimes(0);
    expect(mockDetach).toHaveBeenCalledTimes(0);
    expect(mockMicrophoneClose).toHaveBeenCalledTimes(0);
    expect(mockOnEnd).toHaveBeenCalledTimes(0);
  })

  it('calls initialize, stop and onend when aborting a transcription that has been started', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnEnd = jest.fn();
    speechRecognition.onend = mockOnEnd;

    await speechRecognition.start();
    await speechRecognition.abort();

    expect(mockMicrophoneInitialize).toHaveBeenCalledTimes(1);
    expect(mockStop).toHaveBeenCalledTimes(1);
    expect(mockDetach).toHaveBeenCalledTimes(1);
    expect(mockMicrophoneClose).toHaveBeenCalledTimes(1);
    expect(mockOnEnd).toHaveBeenCalledTimes(1);
  })

  it('does not call initialize, stop or onend a second time when aborting a transcription that was already aborted', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnEnd = jest.fn();
    speechRecognition.onend = mockOnEnd;

    await speechRecognition.start();
    await speechRecognition.abort();
    await speechRecognition.abort();

    expect(mockMicrophoneInitialize).toHaveBeenCalledTimes(1);
    expect(mockStop).toHaveBeenCalledTimes(1);
    expect(mockDetach).toHaveBeenCalledTimes(1);
    expect(mockMicrophoneClose).toHaveBeenCalledTimes(1);
    expect(mockOnEnd).toHaveBeenCalledTimes(1);
  })

  it('calling stop does not prevent ongoing utterance from being transcribed', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnResult = jest.fn();
    speechRecognition.onresult = mockOnResult;
    const mockOnEnd = jest.fn();
    speechRecognition.onend = mockOnEnd;
    speechRecognition.interimResults = true;

    await speechRecognition.start();
    speakAndInterrupt(SENTENCE_ONE, speechRecognition.stop);

    expect(mockOnResult).toHaveBeenCalledTimes(SENTENCE_ONE.length);
    expectSentenceToBeTranscribedWithInterimAndFinalResults(SENTENCE_ONE, mockOnResult);
  })

  it('calling abort prevents ongoing utterance from being transcribed', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnResult = jest.fn();
    speechRecognition.onresult = mockOnResult;
    const mockOnEnd = jest.fn();
    speechRecognition.onend = mockOnEnd;
    speechRecognition.interimResults = true;

    await speechRecognition.start();
    speakAndInterrupt(SENTENCE_ONE, speechRecognition.abort);

    expect(mockOnResult).toHaveBeenCalledTimes(1);
    expectSentenceToBeTranscribedWithFirstInitialResult(SENTENCE_ONE, mockOnResult);
  })

  it('calling stop prevents subsequent utterances from being transcribed', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnResult = jest.fn();
    speechRecognition.onresult = mockOnResult;
    const mockOnEnd = jest.fn();
    speechRecognition.onend = mockOnEnd;
    speechRecognition.interimResults = true;

    await speechRecognition.start();
    speakAndInterrupt(SENTENCE_ONE, speechRecognition.stop);
    speak(SENTENCE_TWO);

    expect(mockOnResult).toHaveBeenCalledTimes(SENTENCE_ONE.length);
    expectSentenceToBeTranscribedWithInterimAndFinalResults(SENTENCE_ONE, mockOnResult);
  })

  it('calling abort prevents subsequent utterances from being transcribed', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnResult = jest.fn();
    speechRecognition.onresult = mockOnResult;
    const mockOnEnd = jest.fn();
    speechRecognition.onend = mockOnEnd;
    speechRecognition.interimResults = true;

    await speechRecognition.start();
    speakAndInterrupt(SENTENCE_ONE, speechRecognition.abort);
    speak(SENTENCE_TWO);

    expect(mockOnResult).toHaveBeenCalledTimes(1);
    expectSentenceToBeTranscribedWithFirstInitialResult(SENTENCE_ONE, mockOnResult);
  })

  it('sets hasBrowserSupport to true when required APIs are defined (non-WebKit)', async () => {
    mockAudioContext();
    mockUndefinedWebkitAudioContext();
    mockMediaDevices();

    const SpeechRecognition = createSpeechlySpeechRecognition('app id');

    expect(SpeechRecognition.hasBrowserSupport).toEqual(true);
  })

  it('sets hasBrowserSupport to true when required APIs are defined (WebKit)', async () => {
    mockUndefinedAudioContext();
    mockWebkitAudioContext();
    mockMediaDevices();

    const SpeechRecognition = createSpeechlySpeechRecognition('app id');

    expect(SpeechRecognition.hasBrowserSupport).toEqual(true);
  })

  it('sets hasBrowserSupport to false when all AudioContext APIs are undefined', async () => {
    mockUndefinedAudioContext();
    mockUndefinedWebkitAudioContext();
    mockMediaDevices();

    const SpeechRecognition = createSpeechlySpeechRecognition('app id');

    expect(SpeechRecognition.hasBrowserSupport).toEqual(false);
  })

  it('sets hasBrowserSupport to false when MediaDevices API is undefined', async () => {
    mockAudioContext();
    mockUndefinedMediaDevices();

    const SpeechRecognition = createSpeechlySpeechRecognition('app id');

    expect(SpeechRecognition.hasBrowserSupport).toEqual(false);
  })

  it('sets hasBrowserSupport to false when Navigator API is undefined', async () => {
    mockAudioContext();
    mockUndefinedNavigator();

    const SpeechRecognition = createSpeechlySpeechRecognition('app id');

    expect(SpeechRecognition.hasBrowserSupport).toEqual(false);
  })

  it('sets hasBrowserSupport to false when window is undefined', async () => {
    mockUndefinedWindow();

    const SpeechRecognition = createSpeechlySpeechRecognition('app id');

    expect(SpeechRecognition.hasBrowserSupport).toEqual(false);
  })

  it('calls onerror with MicrophoneNotAllowedError error when no microphone permission given on start', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnError = jest.fn();
    speechRecognition.onerror = mockOnError;
    mockMicrophoneInitialize.mockImplementationOnce(() => Promise.reject(ErrNoAudioConsent));

    await speechRecognition.start();

    expect(mockOnError).toHaveBeenCalledTimes(1);
    expect(mockOnError).toHaveBeenCalledWith(MicrophoneNotAllowedError);
  })

  it('calls onerror with SpeechRecognitionFailedError error when speech recognition fails on start', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnError = jest.fn();
    speechRecognition.onerror = mockOnError;
    mockMicrophoneInitialize.mockImplementationOnce(() => Promise.reject(new Error('generic failure')));

    await speechRecognition.start();

    expect(mockOnError).toHaveBeenCalledTimes(1);
    expect(mockOnError).toHaveBeenCalledWith(SpeechRecognitionFailedError);
  })

  it('calls onerror with SpeechRecognitionFailedError error when speech recognition fails on attach', async () => {
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnError = jest.fn();
    speechRecognition.onerror = mockOnError;
    mockAttach.mockImplementationOnce(() => Promise.reject(new Error('generic failure')));

    await speechRecognition.start();

    expect(mockOnError).toHaveBeenCalledTimes(1);
    expect(mockOnError).toHaveBeenCalledWith(SpeechRecognitionFailedError);
  })

  it('calls onerror with SpeechRecognitionFailedError error when browser microphone media stream is falsy', async () => {
    mockBrowserMicrophone({ mediaStream: null });
    const SpeechRecognition = createSpeechlySpeechRecognition('app id');
    const speechRecognition = new SpeechRecognition();
    const mockOnError = jest.fn();
    speechRecognition.onerror = mockOnError;

    await speechRecognition.start();

    expect(mockOnError).toHaveBeenCalledTimes(1);
    expect(mockOnError).toHaveBeenCalledWith(SpeechRecognitionFailedError);
  })
})
--------------------------------------------------------------------------------