├── BabelinSpeech.html
└── README.md

/BabelinSpeech.html:
--------------------------------------------------------------------------------
Babelin Speech 1.1 / Speech Recording with Web Speech API
Download Subtitles (Translation): .vtt, .srt, .ass
# | Start | End | Settings cue | Text (Edit All)

Babelin Speech: voice recognition and real-time translation

It uses the speech recognition and translation services offered by the web browser, in this case Chrome. To transcribe audio that is playing on the computer, it is only necessary to change the microphone input to Stereo Mix.

Tips:

  • In local mode, the browser always asks for permission to access the microphone, for safety. If the page is served by a web server, it asks for permission only once. Python (python.exe -m http.server 8000 --bind 127.0.0.1) or Everything can be used as the web server.
  • Sometimes it is better to stop the recognition when the speaker changes.
  • Use the shortcuts in full screen.

    Known issues:

  • After a few seconds of silence, the automatic recognition stops.
  • Sometimes it stops recognizing the voice after a period of time; stop and start the recognition again.
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
# Babelin Speech
Babelin Speech is a single HTML file for voice recognition and **real-time translation** of video or audio files. It uses the services offered by web browsers (current Chrome/Edge).
Subtitles can be shown in real time (with a delay of less than 1 second) for any video, in any language that the browser supports.

[![IMAGE ALT TEXT HERE](http://img.youtube.com/vi/Q-7P6Xgqwb0/0.jpg)](http://www.youtube.com/watch?v=Q-7P6Xgqwb0)

## Features⚙️
* Transcription and translation in real time.
* Subtitle creation.
* Lets you set the time used to divide sentences.
* Lets you edit the subtitles in the generated table.
* Import and export in three subtitle formats: vtt, ass, srt.
* Lets you join/split adjacent lines.
* Shortcuts for time correction, starting and stopping recognition, jumping forward and backward to the next subtitle cue, and advancing the video by 500 ms, 1 s, and 10 s.
* Option to translate the text with other translators such as DeepL, Yandex, or Bing, by copy and paste, with a word counter to make this easier.
* Time correction for recognition delay.
* Press the C key to create a new empty line.


### Requirements 📋
* [Chrome browser (latest version)](https://www.google.com/intl/en_us/chrome/).
* [Edge browser (latest version)](https://www.microsoft.com/en-us/edge/).

* Your audio card must support the Stereo Mix option for recording.

* Tested only on Windows 10.
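
The Features list mentions import and export in vtt, srt, and ass. These formats write the same cue time in slightly different notations, which the sketch below illustrates. `formatTimestamp` is a hypothetical helper written for this README, not a function taken from BabelinSpeech.html, and it assumes cue times are held as seconds.

```javascript
// Hypothetical helper, not taken from BabelinSpeech.html.
// Formats a cue time (in seconds) in the notation of each subtitle format.
function formatTimestamp(seconds, format) {
  // Work in whole milliseconds to avoid floating-point drift.
  const totalMs = Math.round(seconds * 1000);
  const h = Math.floor(totalMs / 3600000);
  const m = Math.floor((totalMs % 3600000) / 60000);
  const s = Math.floor((totalMs % 60000) / 1000);
  const ms = totalMs % 1000;
  const pad = (n, width) => String(n).padStart(width, "0");

  switch (format) {
    case "vtt": // WebVTT: HH:MM:SS.mmm
      return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)}.${pad(ms, 3)}`;
    case "srt": // SubRip: HH:MM:SS,mmm (comma before the milliseconds)
      return `${pad(h, 2)}:${pad(m, 2)}:${pad(s, 2)},${pad(ms, 3)}`;
    case "ass": // ASS: H:MM:SS.cc (centiseconds, single-digit hours)
      return `${h}:${pad(m, 2)}:${pad(s, 2)}.${pad(Math.floor(ms / 10), 2)}`;
    default:
      throw new Error(`Unknown subtitle format: ${format}`);
  }
}
```

For example, `formatTimestamp(3723.456, "vtt")` returns `"01:02:03.456"`, while the `"srt"` variant uses a comma and the `"ass"` variant truncates to centiseconds.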

### Installation 🔧

* Download the [BabelinSpeech.html](https://raw.githubusercontent.com/JeanCaro/Babelin/main/BabelinSpeech.html) file anywhere on your computer and open it in your browser, or [test it online](https://jeancaro.github.io/Babelin/BabelinSpeech.html); either way, it only works with Chrome/Edge.
* [Enable Stereo Mix](https://thegeekpage.com/stereo-mix/)


![](https://thegeekpage.com/wp-content/uploads/2020/06/enable-stereo-mix.png)


## Built with 🛠️

* [Speech Recognition](https://developer.mozilla.org/en-US/docs/Web/API/SpeechRecognitionEvent) - Speech Recognition
* [Speech Api](https://wicg.github.io/speech-api/) - Web Speech API
* [Video](https://developer.mozilla.org/en-US/docs/Web/HTML/Element/video) - Video element
* [APIS](https://developer.mozilla.org/en-US/docs/Web/API) - Web APIs

## Download 📌

[V 1.0](https://github.com/JeanCaro/Babelin/releases).

## Test online with the latest version 1.1

* [Babelin Online](https://jeancaro.github.io/Babelin/BabelinSpeech.html)

# All audio is sent to Google/cloud servers for processing.

# Version 1.1
* Table rows are colored based on CPS.
* "Fix Time Start" button for lines whose duration is too short.
* Added a button to download the table or the subtitle track.
* Compatible with Edge/Chrome.
* By default, groups of statements/sentences/phrases are handled.
* Many bugs fixed.

*CPS stands for "characters per second". It measures the speed at which text appears in subtitles or captions.

# Recommendations
* **Use Edge/Microsoft** for **better transcription**. Chrome/Google recognizes the start and end times of speech better, but its transcription is only average.

* There are two important CPS values: the source CPS and the destination CPS. For example, languages like Japanese, Korean, and Chinese can have a CPS of 4-6, while English is around 8-12 and Portuguese/Spanish around 10-14.

* Once the whole transcription is finished, if colored lines appear, press the "Fix Time Start (CPS)" button to correct the start times of the lines whose duration is too short.

### GENERAL SHORTCUTS
* Ctrl+Q: Start/stop speech recognition
* Alt+PageDown: Next cue
* Alt+PageUp: Previous cue

### SHORTCUTS IN TABLE
* Ctrl+Alt+ArrowUp: Join with previous row
* Ctrl+Alt+ArrowDown: Join with next row
* Ctrl+Alt+J: Split row at cursor or selection

### SHORTCUTS IN TIME CELLS
* minus (-): subtract 100 ms from start time
* plus (+): add 100 ms to start time

### SHORTCUTS IN VIDEO
* Q: Start/stop speech recognition
* P: Play/stop video
* C: Hold down the C key to create a new empty subtitle line
* Alt+ArrowUp: Forward 5 seconds
* Alt+ArrowDown: Rewind 5 seconds
* Ctrl+ArrowUp: Forward 10 seconds
* Ctrl+ArrowDown: Rewind 10 seconds
* PageDown: Next cue
* PageUp: Previous cue
* M: Forward 1 second
* Shift+M: Forward 500 ms
* N: Rewind 1 second
* Shift+N: Rewind 500 ms
* Del: Delete active cue(s)
* Ctrl+Del: Delete all active cue(s)

* +: add 100 ms to start time of last active cue.
* -: subtract 100 ms from start time of last active cue.
* Alt + +: add 100 ms to start time of last active cue (All).
* Alt + -: subtract 100 ms from start time of last active cue (All).
* Ctrl + +: add 100 ms to end time of last active cue.
* Ctrl + -: subtract 100 ms from end time of last active cue.
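
As a rough illustration of the CPS figure behind the table colors and the "Fix Time Start (CPS)" button described under Recommendations, the sketch below computes CPS for one cue and moves a start time earlier when the cue is too short for its text. Both functions, the cue shape `{ text, start, end }`, and the limit of 12 are assumptions made for this example; the actual logic in BabelinSpeech.html may differ.

```javascript
// Hypothetical helpers, not taken from BabelinSpeech.html.

// Assumed ceiling: the README cites 8-12 CPS for English, so 12 is used here.
const MAX_CPS = 12;

// Characters per second for one cue: text length divided by display time.
// Array.from counts Unicode code points, so each CJK character counts as one.
// Spaces are counted as characters.
function charactersPerSecond(text, startSec, endSec) {
  const duration = endSec - startSec;
  if (duration <= 0) return Infinity; // a zero-length cue cannot be read
  return Array.from(text).length / duration;
}

// Moves the start time earlier just enough to reach maxCps, without going
// before the end of the previous cue. Cues that are already slow enough
// are returned with their start time unchanged.
function fixStartTime(cue, previousEndSec, maxCps = MAX_CPS) {
  const neededSec = Array.from(cue.text).length / maxCps;
  const wantedStart = cue.end - neededSec;
  const start = Math.min(cue.start, Math.max(wantedStart, previousEndSec));
  return { ...cue, start };
}
```

A cue with 24 characters shown from 9 s to 10 s has a CPS of 24. With the assumed limit of 12 it needs 2 seconds, so `fixStartTime` moves its start to 8 s if the previous cue ended before then.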

# Subtitles and CPS
https://captiz.com/wp-content/themes/captiz/SubtitlingGuidelinesCaptiz.pdf

https://partnerhelp.netflixstudios.com/hc/en-us/articles/215758617-Timed-Text-Style-Guide-General-Requirements

https://www.authot.com/en/2022/02/10/synchronisation-of-subtitles/
--------------------------------------------------------------------------------