├── .DS_Store
├── .github
    └── PULL_REQUEST_TEMPLATE.md
├── .gitignore
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── README.md
├── requirements.txt
├── src
    ├── DataCollection.srt
    ├── License.txt
    ├── Notice.txt
    ├── THIRD_PARTY_LICENSES.txt
    ├── __pycache__
    │   ├── audioUtils.cpython-37.pyc
    │   └── srtUtils.cpython-37.pyc
    ├── audioUtils.py
    ├── concatenateVideos.py
    ├── makevideo.bat
    ├── srt.py
    ├── srtUtils.py
    ├── transcribeUtils.py
    ├── translatevideo.py
    └── videoUtils.py
└── tools
    ├── srtUtils.py
    ├── testWebVTT.py
    └── webvttUtils.py


/.DS_Store:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/aws-transcribe-captioning-tools/14d8b58186fd71c2145e6cc76719ffc0be3a7087/.DS_Store


--------------------------------------------------------------------------------
/.github/PULL_REQUEST_TEMPLATE.md:
--------------------------------------------------------------------------------
1 | *Issue #, if available:*
2 | 
3 | *Description of changes:*
4 | 
5 | 
6 | By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.
7 | 


--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | **/__pycache__/
2 | *.pyc
3 | 


--------------------------------------------------------------------------------
/CODE_OF_CONDUCT.md:
--------------------------------------------------------------------------------
1 | ## Code of Conduct
2 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 
3 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 
4 | opensource-codeofconduct@amazon.com with any additional questions or comments.
5 | 


--------------------------------------------------------------------------------
/CONTRIBUTING.md:
--------------------------------------------------------------------------------
 1 | # Contributing Guidelines
 2 | 
 3 | Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 
 4 | documentation, we greatly value feedback and contributions from our community.
 5 | 
 6 | Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 
 7 | information to effectively respond to your bug report or contribution.
 8 | 
 9 | 
10 | ## Reporting Bugs/Feature Requests
11 | 
12 | We welcome you to use the GitHub issue tracker to report bugs or suggest features.
13 | 
14 | When filing an issue, please check [existing open](https://github.com/aws-samples/aws-transcribe-captioning-tools/issues), or [recently closed](https://github.com/aws-samples/aws-transcribe-captioning-tools/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aclosed%20), issues to make sure somebody else hasn't already 
15 | reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:
16 | 
17 | * A reproducible test case or series of steps
18 | * The version of our code being used
19 | * Any modifications you've made relevant to the bug
20 | * Anything unusual about your environment or deployment
21 | 
22 | 
23 | ## Contributing via Pull Requests
24 | Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:
25 | 
26 | 1. You are working against the latest source on the *master* branch.
27 | 2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
28 | 3. You open an issue to discuss any significant work - we would hate for your time to be wasted.
29 | 
30 | To send us a pull request, please:
31 | 
32 | 1. Fork the repository.
33 | 2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
34 | 3. Ensure local tests pass.
35 | 4. Commit to your fork using clear commit messages.
36 | 5. Send us a pull request, answering any default questions in the pull request interface.
37 | 6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.
38 | 
39 | GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and 
40 | [creating a pull request](https://help.github.com/articles/creating-a-pull-request/).
41 | 
42 | 
43 | ## Finding contributions to work on
44 | Looking at the existing issues is a great way to find something to contribute to. Because our projects use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ['help wanted'](https://github.com/aws-samples/aws-transcribe-captioning-tools/labels/help%20wanted) issues is a great place to start. 
45 | 
46 | 
47 | ## Code of Conduct
48 | This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 
49 | For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 
50 | opensource-codeofconduct@amazon.com with any additional questions or comments.
51 | 
52 | 
53 | ## Security issue notifications
54 | If you discover a potential security issue in this project, we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public GitHub issue.
55 | 
56 | 
57 | ## Licensing
58 | 
59 | See the [LICENSE](https://github.com/aws-samples/aws-transcribe-captioning-tools/blob/master/LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.
60 | 
61 | We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.
62 | 


--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
 1 | Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 2 | 
 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of this
 4 | software and associated documentation files (the "Software"), to deal in the Software
 5 | without restriction, including without limitation the rights to use, copy, modify,
 6 | merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
 7 | permit persons to whom the Software is furnished to do so.
 8 | 
 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
10 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
11 | PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
12 | HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
13 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
14 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
15 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
 1 | # AWS VOD Captioning using AWS Transcribe
 2 | 
 3 | > Add subtitles to video with AWS machine learning services, including AWS Polly, AWS Transcribe, and AWS Translate.
 4 | 
 5 | ## Overview
 6 | This repository contains code for VOD subtitle creation, described in the AWS blog post [“Create video subtitles with translation using machine learning”](https://aws.amazon.com/blogs/machine-learning/create-video-subtitles-with-translation-using-machine-learning/).
 7 | 
 8 | ## Prerequisites
 9 | 
10 | - Set up an AWS account. ([instructions](https://AWS.amazon.com/free/?sc_channel=PS&sc_campaign=acquisition_US&sc_publisher=google&sc_medium=cloud_computing_b&sc_content=AWS_account_bmm_control_q32016&sc_detail=%2BAWS%20%2Baccount&sc_category=cloud_computing&sc_segment=102882724242&sc_matchtype=b&sc_country=US&s_kwcid=AL!4422!3!102882724242!b!!g!!%2BAWS%20%2Baccount&ef_id=WS3s1AAAAJur-Oj2:20170825145941:s))
11 | - Clone this repo.
12 | - The other requirements are listed in this ([blog post](https://aws.amazon.com/blogs/machine-learning/create-video-subtitles-with-translation-using-machine-learning/))  
13 | - Configure AWS CLI and a local credentials file. ([instructions](http://docs.AWS.amazon.com/cli/latest/userguide/cli-chap-welcome.html))  
14 | 
15 | 
16 | ## Getting Started
17 | 
18 | Head on over to this blog post to see the instructions to create captions with AWS Transcribe in the SRT format, create alternate language SRT files with AWS Translate, and use AWS Polly to create alternate language video files:
19 | https://aws.amazon.com/blogs/machine-learning/create-video-subtitles-with-translation-using-machine-learning/
20 | 
21 | 
22 | 
23 | 
24 | ## More AWS Transcribe Tools for Video
25 | 
26 | If you just want to create an SRT or a VTT file, the tools directory contains Python code to convert AWS Transcribe JSON to an SRT or a VTT file. These files can be imported and used on web or desktop video players. 
27 | 
28 | ```shell
29 | python srt.py output_file_from_transcribe.json output.srt
30 | ```
31 | 
32 | 
33 | | name | description | 
34 | |-------|-------------|
35 | |srt.py | Takes the JSON response from AWS Transcribe and converts to a captions.srt file |
36 | |vtt.py | Takes the JSON response from AWS Transcribe and converts to a captions.vtt file |
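
The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the repository's implementation: it assumes the Transcribe JSON exposes `results.items`, where each item has a `type` of `pronunciation` or `punctuation`, and it groups a fixed number of words per caption. The names `format_srt_time`, `transcribe_json_to_srt`, and `words_per_cue` are hypothetical.

```python
import json


def format_srt_time(seconds):
    """Format a time in seconds as an SRT timestamp (HH:MM:SS,mmm)."""
    millis = int(round(seconds * 1000))
    hours, millis = divmod(millis, 3600000)
    minutes, millis = divmod(millis, 60000)
    secs, millis = divmod(millis, 1000)
    return "{:02d}:{:02d}:{:02d},{:03d}".format(hours, minutes, secs, millis)


def transcribe_json_to_srt(transcript_json, words_per_cue=10):
    """Group Transcribe items into fixed-size cues and return SRT text."""
    items = json.loads(transcript_json)["results"]["items"]
    cues, words, start, end = [], [], None, None
    for item in items:
        content = item["alternatives"][0]["content"]
        if item["type"] == "punctuation":
            # Punctuation items carry no timing; attach them to the previous word.
            if words:
                words[-1] += content
            elif cues:
                cue_start, cue_end, text = cues[-1]
                cues[-1] = (cue_start, cue_end, text + content)
            continue
        if start is None:
            start = float(item["start_time"])
        end = float(item["end_time"])
        words.append(content)
        if len(words) == words_per_cue:
            cues.append((start, end, " ".join(words)))
            words, start = [], None
    if words:
        cues.append((start, end, " ".join(words)))

    lines = []
    for index, (cue_start, cue_end, text) in enumerate(cues, start=1):
        lines.append(str(index))
        lines.append("{} --> {}".format(format_srt_time(cue_start), format_srt_time(cue_end)))
        lines.append(text)
        lines.append("")
    return "\n".join(lines)
```

Calling `transcribe_json_to_srt(open("output_file_from_transcribe.json").read())` would return the caption text, which can then be written to a `.srt` file. A fixed word count per cue is the simplest grouping rule; splitting on pauses or sentence ends would likely read better.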
37 | 
38 | 
39 | ## License Summary
40 | 
41 | This sample code is made available under a modified MIT license. See the LICENSE file.
42 | 


--------------------------------------------------------------------------------
/requirements.txt:
--------------------------------------------------------------------------------
1 | boto3==1.9.136
2 | moviepy==1.0.3


--------------------------------------------------------------------------------
/src/DataCollection.srt:
--------------------------------------------------------------------------------
  1 | 1
  2 | 00:00:10,240 --> 00:00:14,529
  3 | Hello, Statistic body. I'm yoga. Welcome to
  4 | 
  5 | 2
  6 | 00:00:14,539 --> 00:00:21,510
  7 | competition. All statistic subject. Okay, in this
  8 | 
  9 | 3
 10 | 00:00:21,510 --> 00:00:32,600
 11 | section I will tell you about data collection, the
 12 | 
 13 | 4
 14 | 00:00:32,600 --> 00:00:39,299
 15 | objectives off the session, our definition off data types
 16 | 
 17 | 5
 18 | 00:00:39,310 --> 00:00:45,579
 19 | , off data primary and secondary data and the last
 20 | 
 21 | 6
 22 | 00:00:45,590 --> 00:00:53,719
 23 | IHS data collection techniques. Okay, Now I start
 24 | 
 25 | 7
 26 | 00:00:53,729 --> 00:01:00,490
 27 | from definition off data, as we know from the
 28 | 
 29 | 8
 30 | 00:01:00,490 --> 00:01:06,930
 31 | previous chapter that statistics is important and closely related to
 32 | 
 33 | 9
 34 | 00:01:06,930 --> 00:01:11,269
 35 | data. Now, I will explain about the definition
 36 | 
 37 | 10
 38 | 00:01:11,280 --> 00:01:17,200
 39 | off data itself. In definition one, data are
 40 | 
 41 | 11
 42 | 00:01:17,200 --> 00:01:23,200
 43 | plain facts usually roll numbers and in the definition to
 44 | 
 45 | 12
 46 | 00:01:23,939 --> 00:01:30,140
 47 | data are individual pieces off, factual information recorded and
 48 | 
 49 | 13
 50 | 00:01:30,140 --> 00:01:37,000
 51 | used for the purpose off analysis. So from the
 52 | 
 53 | 14
 54 | 00:01:37,010 --> 00:01:40,969
 55 | two definitions, we know that data is the part
 56 | 
 57 | 15
 58 | 00:01:40,969 --> 00:01:49,319
 59 | off information. Now, I will explain about types
 60 | 
 61 | 16
 62 | 00:01:49,329 --> 00:01:56,659
 63 | off data data is divided into two kinds, namely
 64 | 
 65 | 17
 66 | 00:01:57,040 --> 00:02:08,120
 67 | qualitative and quantitative. Qualitative data itself is divided into
 68 | 
 69 | 18
 70 | 00:02:08,120 --> 00:02:15,650
 71 | nominal and or denial for quantitative is defected into inter
 72 | 
 73 | 19
 74 | 00:02:15,650 --> 00:02:21,680
 75 | file and rescue each off. The inter fall and
 76 | 
 77 | 20
 78 | 00:02:21,680 --> 00:02:32,370
 79 | rescue have discrete data and continuous data. Okay,
 80 | 
 81 | 21
 82 | 00:02:34,439 --> 00:02:37,960
 83 | Now I will explain more detail about the types off
 84 | 
 85 | 22
 86 | 00:02:37,969 --> 00:02:43,939
 87 | data. Qualitative. Okay. Qualitative is a data
 88 | 
 89 | 23
 90 | 00:02:43,939 --> 00:02:49,469
 91 | concerned with descriptions which can be observed but cannot be
 92 | 
 93 | 24
 94 | 00:02:49,469 --> 00:02:55,969
 95 | computed. And as we know, that qualitative data
 96 | 
 97 | 25
 98 | 00:02:57,939 --> 00:03:02,050
 99 | is divided into nominal an orginal scale. No,
100 | 
101 | 26
102 | 00:03:02,939 --> 00:03:07,840
103 | I will explain about the nominal scale nominal scale called
104 | 
105 | 27
106 | 00:03:07,840 --> 00:03:14,449
107 | simply because levels you can check the examples below to
108 | 
109 | 28
110 | 00:03:14,449 --> 00:03:27,259
111 | understand what the nominal is now for the orginal scale
112 | 
113 | 29
114 | 00:03:28,539 --> 00:03:30,990
115 | , the orginal scale have order off the values.
116 | 
117 | 30
118 | 00:03:31,610 --> 00:03:37,979
119 | The order is important and significant, but the differences
120 | 
121 | 31
122 | 00:03:37,979 --> 00:03:44,020
123 | between each one is not really known. And you
124 | 
125 | 32
126 | 00:03:44,020 --> 00:03:53,840
127 | can see the example below. Okay. Now its
128 | 
129 | 33
130 | 00:03:53,840 --> 00:04:01,569
131 | quantitative quantitative is the one that focus on numbers and
132 | 
133 | 34
134 | 00:04:01,770 --> 00:04:10,159
135 | mathematical calculations and can be calculated and computed and quantitative
136 | 
137 | 35
138 | 00:04:11,340 --> 00:04:15,529
139 | . Uh huh. Toe rescue. The first is
140 | 
141 | 36
142 | 00:04:15,540 --> 00:04:23,839
143 | interval scale and interval skills are numbering skills in which
144 | 
145 | 37
146 | 00:04:23,850 --> 00:04:28,209
147 | we know both the order and the exact differences between
148 | 
149 | 38
150 | 00:04:28,230 --> 00:04:33,129
151 | the values and second is ratio skills. Raise your
152 | 
153 | 39
154 | 00:04:33,129 --> 00:04:39,709
155 | skills are data measurement skills because they tell us about
156 | 
157 | 40
158 | 00:04:39,709 --> 00:04:43,589
159 | the order. They tell us the exact value between
160 | 
161 | 41
162 | 00:04:43,589 --> 00:04:47,949
163 | units, and they also have a new absolute zero
164 | 
165 | 42
166 | 00:04:48,189 --> 00:04:54,930
167 | , which allows for a wide range off both descriptive
168 | 
169 | 43
170 | 00:04:54,939 --> 00:05:00,829
171 | and inferential statistics to be applied now I will explain
172 | 
173 | 44
174 | 00:05:00,829 --> 00:05:11,980
175 | about discrete and continuous data for discrete data can only
176 | 
177 | 45
178 | 00:05:11,980 --> 00:05:19,629
179 | take certain values. And that's the example, the
180 | 
181 | 46
182 | 00:05:19,629 --> 00:05:26,850
183 | number off student and the number that appear after you
184 | 
185 | 47
186 | 00:05:26,850 --> 00:05:40,860
187 | rolling dies and continuous data for continuous data can take
188 | 
189 | 48
190 | 00:05:41,439 --> 00:05:46,699
191 | any value within a range. And the examples are
192 | 
193 | 49
194 | 00:05:46,709 --> 00:05:51,930
195 | the first is a person's hey and then the time
196 | 
197 | 50
198 | 00:05:51,939 --> 00:05:58,050
199 | in a race. And then a talks waked and
200 | 
201 | 51
202 | 00:05:58,240 --> 00:06:05,930
203 | the length off a leaf. No, If there's
204 | 
205 | 52
206 | 00:06:06,230 --> 00:06:15,230
207 | a question how we to get the data, the
208 | 
209 | 53
210 | 00:06:15,230 --> 00:06:20,750
211 | answer is they're too option to get data. First
212 | 
213 | 54
214 | 00:06:21,639 --> 00:06:27,089
215 | is get the data by ourselves. For example,
216 | 
217 | 55
218 | 00:06:27,180 --> 00:06:31,160
219 | a researcher conduct some research, and he gathered the
220 | 
221 | 56
222 | 00:06:31,160 --> 00:06:35,730
223 | data by himself. We called the data as primary
224 | 
225 | 57
226 | 00:06:35,740 --> 00:06:42,790
227 | data. Second is Katie data from another source.
228 | 
229 | 58
230 | 00:06:43,540 --> 00:06:46,259
231 | For example, I collect the data from Internet or
232 | 
233 | 59
234 | 00:06:46,259 --> 00:06:49,769
235 | I ask my fellow researcher to give his data.
236 | 
237 | 60
238 | 00:06:50,230 --> 00:06:54,769
239 | The data that I get is called secondary data,
240 | 
241 | 61
242 | 00:06:59,139 --> 00:07:01,680
243 | and there are many techniques to get the data.
244 | 
245 | 62
246 | 00:07:01,769 --> 00:07:05,089
247 | But in this session I only mentioned five techniques,
248 | 
249 | 63
250 | 00:07:05,439 --> 00:07:14,649
251 | namely, record station senses, survey experiment and observation
252 | 
253 | 64
254 | 00:07:17,439 --> 00:07:24,560
255 | . Registration is a method which in places more on
256 | 
257 | 65
258 | 00:07:24,560 --> 00:07:35,610
259 | structured recording through various institutions, and census is a
260 | 
261 | 66
262 | 00:07:35,610 --> 00:07:42,060
263 | complete way off collecting data where all elements in the
264 | 
265 | 67
266 | 00:07:42,060 --> 00:07:46,800
267 | population that are object off the research are investigated or
268 | 
269 | 68
270 | 00:07:46,810 --> 00:07:56,889
271 | enumerated one by one. And then the next ISS
272 | 
273 | 69
274 | 00:07:56,889 --> 00:08:01,300
275 | survey survey is collecting information from a sample group to
276 | 
277 | 70
278 | 00:08:01,300 --> 00:08:09,649
279 | learn about the entire population and next ISS experiment.
280 | 
281 | 71
282 | 00:08:11,110 --> 00:08:16,930
283 | An experimental study has the researcher purposely attempting to influence
284 | 
285 | 72
286 | 00:08:16,939 --> 00:08:20,939
287 | the result. The goal is to do their mind
288 | 
289 | 73
290 | 00:08:20,949 --> 00:08:26,550
291 | what effect a particular treatment has on the outcome.
292 | 
293 | 74
294 | 00:08:28,439 --> 00:08:33,570
295 | Researchers take measurements or surface off the sample population,
296 | 
297 | 75
298 | 00:08:35,240 --> 00:08:46,169
299 | and you can read the example below. No,
300 | 
301 | 76
302 | 00:08:46,309 --> 00:08:52,710
303 | the observational in the observational study, the simple population
304 | 
305 | 77
306 | 00:08:52,720 --> 00:08:58,529
307 | being studied ISS miserable or surveilled as it ISS.
308 | 
309 | 78
310 | 00:08:58,340 --> 00:09:05,879
311 | The researcher observes the subjects and missiles variables but doesn't
312 | 
313 | 79
314 | 00:09:05,879 --> 00:09:09,779
315 | influence the population in any way or attempt to intervene
316 | 
317 | 80
318 | 00:09:09,970 --> 00:09:13,990
319 | in the study. There is no manipulation by the
320 | 
321 | 81
322 | 00:09:13,990 --> 00:09:20,659
323 | researcher, and the last is that those that we
324 | 
325 | 82
326 | 00:09:20,659 --> 00:09:26,480
327 | can use to collect the data. You can use
328 | 
329 | 83
330 | 00:09:26,480 --> 00:09:33,049
331 | questionnaire in their view checklist or any digital tools.
332 | 
333 | 84
334 | 00:09:35,240 --> 00:09:41,110
335 | Okay, I think enough for the session. I
336 | 
337 | 85
338 | 00:09:41,110 --> 00:09:45,769
339 | hope you enjoy that. And CIA
340 | 
341 | 
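
Each cue in the file above follows the SubRip layout: an index line, a `HH:MM:SS,mmm --> HH:MM:SS,mmm` timing line, and the caption text. The sketch below parses that layout and reports cues that begin before the previous cue has ended, as happens here with cue 78 (starts at 00:08:58,340) and cue 77 (ends at 00:08:58,529). It is an illustrative helper and not part of the repository; `parse_cues` and `find_overlaps` are hypothetical names.

```python
import re

TIMING = re.compile(
    r"(\d{2}):(\d{2}):(\d{2}),(\d{3}) --> (\d{2}):(\d{2}):(\d{2}),(\d{3})"
)


def to_millis(hours, minutes, seconds, millis):
    """Convert timestamp fields (as strings) to total milliseconds."""
    return ((int(hours) * 60 + int(minutes)) * 60 + int(seconds)) * 1000 + int(millis)


def parse_cues(srt_text):
    """Return a list of (index, start_ms, end_ms, text) tuples."""
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 2:
            continue
        match = TIMING.match(lines[1].strip())
        if not match:
            continue
        parts = match.groups()
        cues.append((
            int(lines[0].strip()),
            to_millis(*parts[:4]),
            to_millis(*parts[4:]),
            " ".join(lines[2:]),
        ))
    return cues


def find_overlaps(cues):
    """Return the indexes of cues that start before the previous cue ends."""
    return [cur[0] for prev, cur in zip(cues, cues[1:]) if cur[1] < prev[2]]
```

Running `find_overlaps(parse_cues(text))` on a caption file gives a quick sanity check before the subtitles are burned into a video.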


--------------------------------------------------------------------------------
/src/License.txt:
--------------------------------------------------------------------------------
 1 | MIT No Attribution
 2 | 
 3 | Permission is hereby granted, free of charge, to any person obtaining a copy of this
 4 | software and associated documentation files (the "Software"), to deal in the Software
 5 | without restriction, including without limitation the rights to use, copy, modify,
 6 | merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
 7 | permit persons to whom the Software is furnished to do so.
 8 | 
 9 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
10 | INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
11 | PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
12 | HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
13 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
14 | SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.


--------------------------------------------------------------------------------
/src/Notice.txt:
--------------------------------------------------------------------------------
1 | Transcribing and Subtitling Videos Using Amazon Services
2 | Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. 
3 | 


--------------------------------------------------------------------------------
/src/THIRD_PARTY_LICENSES.txt:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/aws-transcribe-captioning-tools/14d8b58186fd71c2145e6cc76719ffc0be3a7087/src/THIRD_PARTY_LICENSES.txt


--------------------------------------------------------------------------------
/src/__pycache__/audioUtils.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/aws-transcribe-captioning-tools/14d8b58186fd71c2145e6cc76719ffc0be3a7087/src/__pycache__/audioUtils.cpython-37.pyc


--------------------------------------------------------------------------------
/src/__pycache__/srtUtils.cpython-37.pyc:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/aws-samples/aws-transcribe-captioning-tools/14d8b58186fd71c2145e6cc76719ffc0be3a7087/src/__pycache__/srtUtils.cpython-37.pyc


--------------------------------------------------------------------------------
/src/audioUtils.py:
--------------------------------------------------------------------------------
  1 | # ==================================================================================
  2 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
  3 | 
  4 | # Permission is hereby granted, free of charge, to any person obtaining a copy of this
  5 | # software and associated documentation files (the "Software"), to deal in the Software
  6 | # without restriction, including without limitation the rights to use, copy, modify,
  7 | # merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
  8 | # permit persons to whom the Software is furnished to do so.
  9 | 
 10 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 11 | # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 12 | # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 13 | # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
 14 | # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 15 | # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 16 | # ==================================================================================
 17 | #
 18 | # audioUtils.py
 19 | # by: Rob Dachowski
 20 | # For questions or feedback, please contact robdac@amazon.com
 21 | # 
 22 | # Purpose: The program provides a number of utility audio functions used to create 
 23 | #          transcribed, translated, and subtitled videos using Amazon Transcribe,
 24 | #          Amazon Translate, Amazon Polly, and MoviePy
 25 | #
 26 | # Change Log:
 27 | #          6/29/2018: Initial version
 28 | #
 29 | # ==================================================================================
 30 | 
 31 | 
 32 | import boto3
 33 | import os
 34 | import json
 35 | import sys
 36 | from moviepy.editor import *
 37 | from moviepy import editor
 38 | from contextlib import closing
 39 | 
 40 | # ==================================================================================
 41 | # Function: writeAudio
 42 | # Purpose: writes the bytes associated with the stream to a binary file
 43 | # Parameters: 
 44 | #                 output_file - the name + extension of the output file (e.g. "abc.mp3")
 45 | #                 stream - the stream of bytes to write to the output_file
 46 | # ==================================================================================
 47 | def writeAudio( output_file, stream ):
 48 | 
 49 | 	audio_bytes = stream.read()
 50 | 	
 51 | 	print("\t==> Writing {:d} bytes to audio file: {:s}".format(len(audio_bytes), output_file))
 52 | 	try:
 53 | 		# Open a file for writing the output as a binary stream
 54 | 		with open(output_file, "wb") as file:
 55 | 			file.write(audio_bytes)
 56 | 		
 57 | 		if file.closed:
 58 | 				print("\t==> {:s} is closed".format(output_file))
 59 | 		else:
 60 | 				print("\t==> {:s} is NOT closed".format(output_file))
 61 | 	except IOError as error:
 62 | 		# Could not write to file, exit gracefully
 63 | 		print(error)
 64 | 		sys.exit(-1)
 65 | 
 66 | # ==================================================================================
 67 | # Function: createAudioTrackFromTranslation
 68 | # Purpose: Using the provided transcript, get a translation from Amazon Translate, then use Amazon Polly to synthesize speech
 69 | # Parameters: 
 70 | #                 region - the aws region in which to run the service
 71 | #                 transcript - the Amazon Transcribe JSON structure to translate
 72 | #                 sourceLangCode - the language code for the original content (e.g. English = "EN")
 73 | #                 targetLangCode - the language code for the translated content (e.g. Spanish = "ES")
 74 | #                 audioFileName - the name (including extension) of the target audio file (e.g. "abc.mp3")
 75 | # ==================================================================================
 76 | def createAudioTrackFromTranslation( region, transcript, sourceLangCode, targetLangCode, audioFileName ):
 77 | 	print( "\n==> createAudioTrackFromTranslation " )
 78 | 	
 79 | 	# Set up the polly and translate services
 80 | 	client = boto3.client('polly')
 81 | 	translate = boto3.client(service_name='translate', region_name=region, use_ssl=True)
 82 | 
 83 | 	#get the transcript text
 84 | 	temp = json.loads( transcript)
 85 | 	transcript_txt = temp["results"]["transcripts"][0]["transcript"]
 86 | 	
 87 | 	voiceId = getVoiceId( targetLangCode )
 88 | 	
 89 | 	# Now translate it, keeping at most 2999 characters for the Polly request.
 90 | 	translated_txt = translate.translate_text(Text=transcript_txt, SourceLanguageCode=sourceLangCode, TargetLanguageCode=targetLangCode)["TranslatedText"][:2999]
 91 | 
 92 | 	# Use the translated text to create the synthesized speech
 93 | 	response = client.synthesize_speech( OutputFormat="mp3", SampleRate="22050", Text=translated_txt, VoiceId=voiceId)
 94 | 	
 95 | 	if response["ResponseMetadata"]["HTTPStatusCode"] == 200:
 96 | 		print( "\t==> Successfully called Polly for speech synthesis")
 97 | 		writeAudioStream( response, audioFileName )
 98 | 	else:
 99 | 		print( "\t==> Error calling Polly for speech synthesis")
100 | 
101 | 	
102 | # ==================================================================================
103 | # Function: writeAudioStream
104 | # Purpose: Utility to write an audio file from the response from the Amazon Polly API
105 | # Parameters: 
106 | #                 response - the Amazon Polly JSON response  
107 | #                 audioFileName - the name (including extension) of the target audio file (e.g. "abc.mp3")
108 | # ==================================================================================
109 | def writeAudioStream( response, audioFileName ):
110 | 	# Take the resulting stream and write it to an mp3 file
111 | 	if "AudioStream" in response:
112 | 		with closing(response["AudioStream"]) as stream:
113 | 			output = audioFileName
114 | 			writeAudio( output, stream )
115 | 
116 | 
117 | 
118 | # ==================================================================================
119 | # Function: getVoiceId
120 | # Purpose: Utility to return the name of the voice to use given a language code.  Note: this is only populated with the
121 | #          VoiceIds used for this example.   Refer to the Amazon Polly API documentation for other voiceId names
122 | # Parameters: 
123 | #                 targetLangCode - the language code used for the target Amazon Polly output 
124 | # ==================================================================================
125 | def getVoiceId( targetLangCode ):
126 | 
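127 | 	# Feel free to add others as desired
128 | 	voiceIds = {"es": "Penelope", "de": "Marlene"}
129 | 	if targetLangCode not in voiceIds:
130 | 		# Fail with a clear message instead of an UnboundLocalError
131 | 		raise ValueError("No Polly voice configured for language code: {}".format(targetLangCode))
132 | 		
133 | 	return voiceIds[targetLangCode]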
134 | 	
135 | 	
136 | # ==================================================================================
137 | # Function: getSecondsFromTranslation
138 | # Purpose: Utility to determine how long in seconds it will take for a particular phrase of translated text to be spoken
139 | # Parameters: 
140 | #                 textToSynthesize - the raw text to be synthesized   
141 | #                 targetLangCode - the language code used for the target Amazon Polly output 
142 | #                 audioFileName - the name (including extension) of the target audio file (e.g. "abc.mp3")
143 | # ==================================================================================
144 | def getSecondsFromTranslation( textToSynthesize, targetLangCode, audioFileName ):
145 | 
146 | 	# Set up the polly and translate services
147 | 	client = boto3.client('polly')
148 | 	translate = boto3.client(service_name='translate', region_name="us-east-1", use_ssl=True)
149 | 	
150 | 	# Use the translated text to create the synthesized speech
151 | 	response = client.synthesize_speech( OutputFormat="mp3", SampleRate="22050", Text=textToSynthesize, VoiceId=getVoiceId( targetLangCode ) )
152 | 	
153 | 	# write the stream out to disk so that we can load it into an AudioClip
154 | 	writeAudioStream( response, audioFileName )
155 | 	
156 | 	# Load the temporary audio clip into an AudioFileClip
157 | 	audio = AudioFileClip( audioFileName)
158 | 		
159 | 	# return the duration
160 | 	return audio.duration
161 | 	
162 | 	
163 | 	
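
`createAudioTrackFromTranslation` above keeps only the first 2999 characters of the translated text, so a longer transcript loses the rest of its audio. One possible alternative, sketched below, is to split the text into chunks that each stay under the limit, preferring sentence boundaries. This is an illustration only and not what the repository does: `split_for_synthesis` and `max_chars` are hypothetical names, and synthesizing each chunk separately and joining the resulting audio is an assumption about how the chunks would be used.

```python
import re


def split_for_synthesis(text, max_chars=2999):
    """Split text into chunks of at most max_chars, preferring sentence ends."""
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks, current = [], ""
    for sentence in sentences:
        # Hard-split any single sentence that is longer than the limit.
        while len(sentence) > max_chars:
            if current:
                chunks.append(current)
                current = ""
            chunks.append(sentence[:max_chars])
            sentence = sentence[max_chars:]
        candidate = (current + " " + sentence).strip()
        if len(candidate) <= max_chars:
            # The sentence still fits in the chunk being built.
            current = candidate
        else:
            # Close the current chunk and start a new one with this sentence.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Each returned chunk could then be passed as the `Text` argument in place of the truncated string, with the resulting audio segments concatenated afterwards.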


--------------------------------------------------------------------------------
/src/concatenateVideos.py:
--------------------------------------------------------------------------------
 1 | # ==================================================================================
 2 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 3 | 
 4 | # Permission is hereby granted, free of charge, to any person obtaining a copy of this
 5 | # software and associated documentation files (the "Software"), to deal in the Software
 6 | # without restriction, including without limitation the rights to use, copy, modify,
 7 | # merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
 8 | # permit persons to whom the Software is furnished to do so.
 9 | 
10 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
11 | # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
12 | # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
13 | # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
14 | # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
15 | # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
16 | # ==================================================================================
17 | #
18 | # concatenateVideos.py
19 | # by: Rob Dachowski
20 | # For questions or feedback, please contact robdac@amazon.com
21 | # 
22 | # Purpose: This code uses the output of makevideo.bat to combine the clips into a short demo consisting of 
23 | #          short subclips and some title frames
24 | 
25 | # Change Log:
26 | #          6/29/2018: Initial version
27 | #
28 | # ==================================================================================
29 | 
30 | # Import everything needed to edit video clips
31 | from moviepy.editor import *
32 | from moviepy import editor
33 | from moviepy.video.tools.subtitles import SubtitlesClip
34 | #import moviepy.video.fx.all as vfx 
35 | from time import gmtime, strftime
36 | 
37 | 
 38 | # Load the clips output by makevideo.bat
 39 | print( strftime("%H:%M:%S", gmtime()), "Reading English video clip..." )
40 | english = VideoFileClip("subtitledVideo-en.mp4")
41 | english = english.subclip( 0, 15).set_duration(15)
42 | 
 43 | print( strftime("%H:%M:%S", gmtime()), "Reading Spanish video clip..." )
44 | spanish = VideoFileClip("subtitledVideo-es.mp4")
45 | spanish = spanish.subclip( 15, 30).set_duration(15)
46 | 
 47 | print( strftime("%H:%M:%S", gmtime()), "Reading German video clip..." )
48 | german = VideoFileClip("subtitledVideo-de.mp4")
49 | german = german.subclip( 30, 45).set_duration(15)
50 | 
51 | 
52 | 
 53 | print( strftime("%H:%M:%S", gmtime()), "Creating title..." )
54 | # Generate a text clip. You can customize the font, color, etc.
55 | toptitle = TextClip("Creating Subtitles and Translations Using Amazon Services:\n\nAmazon Transcribe\nAmazon Translate\nAmazon Polly",fontsize=36,color='white', bg_color='black', method="caption", align="center", size=english.size)
 56 | toptitle = toptitle.set_duration(5)
57 | 
58 | 
59 | subtitle1 = TextClip("re:Invent 2017 Keynote Address",fontsize=36,color='white', bg_color='black', method="caption", align="center", size=english.size)
 60 | subtitle1 = subtitle1.set_duration(5)
61 | 
 62 | subtitle2 = TextClip( "\nAndy Jassy, President and CEO of Amazon Web Services", fontsize=28, color='white', bg_color='black', method="caption", align="center", size=english.size)
 63 | subtitle2 = subtitle2.set_duration(5)
64 | 
65 | # Composite the video clips into a title page
66 | title = CompositeVideoClip( [ toptitle, subtitle1.set_start(5), subtitle2.set_start(9)] ).set_duration(15)
67 | 
68 | 
69 | #Create text clips for the various different translations
70 | est = TextClip("English Subtitles\nUsing Amazon Transcribe",fontsize=24,color='white', bg_color='black', method="caption", align="center", size=english.size)
71 | est = est.set_pos('center').set_duration(2.5)
72 | 
73 | sst = TextClip("Spanish Subtitles\nUsing Amazon Transcribe, Amazon Translate, and Amazon Polly",fontsize=24,color='white', bg_color='black', method="caption", align="center", size=english.size)
74 | sst = sst.set_pos('center').set_duration(2.5)
75 | 
76 | dst = TextClip("German Subtitles\nUsing Amazon Transcribe, Amazon Translate, and Amazon Polly",fontsize=24,color='white', bg_color='black', method="caption", align="center", size=english.size)
77 | dst = dst.set_pos('center').set_duration(2.5)
78 | 
 79 | print( strftime("%H:%M:%S", gmtime()), "Concatenating videos" )
80 | 
81 | # concatenate the various titles, subtitles, and clips together
82 | combined = concatenate_videoclips( [title.crossfadeout(2), est, english, sst, spanish, dst, german] )
83 | 
84 | # Write the result to a file (many options available !)
 85 | print( strftime("%H:%M:%S", gmtime()), "Writing concatenated video" )
86 | combined.write_videofile("combined.mp4",  codec="libx264", audio_codec="aac", fps=24)
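
The script above chains seven segments back to back. As a rough check on where each one lands in the final file, the timeline can be sketched in plain Python. This is only an illustration: `build_timeline` is a hypothetical helper, not part of MoviePy, and it assumes the clips play strictly one after another with no overlap.

```python
# Segment names and durations (in seconds) mirror the clips built in the script.
# build_timeline is a hypothetical helper for illustration; it is not a MoviePy API.
segments = [
    ("title", 15.0),
    ("est", 2.5),
    ("english", 15.0),
    ("sst", 2.5),
    ("spanish", 15.0),
    ("dst", 2.5),
    ("german", 15.0),
]

def build_timeline(segments):
    """Return (name, start, end) tuples for clips played back to back."""
    timeline = []
    start = 0.0
    for name, duration in segments:
        timeline.append((name, start, start + duration))
        start += duration
    return timeline

timeline = build_timeline(segments)
total_duration = timeline[-1][2]

for name, start, end in timeline:
    print("%-8s %5.1f -> %5.1f" % (name, start, end))
```

Under these assumptions the combined video runs a little over a minute, with each language clip preceded by its own 2.5-second caption card.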


--------------------------------------------------------------------------------
/src/makevideo.bat:
--------------------------------------------------------------------------------
 1 | REM ==================================================================================
 2 | REM Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 3 | 
 4 | REM Permission is hereby granted, free of charge, to any person obtaining a copy of this
 5 | REM software and associated documentation files (the "Software"), to deal in the Software
 6 | REM without restriction, including without limitation the rights to use, copy, modify,
 7 | REM merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
 8 | REM permit persons to whom the Software is furnished to do so.
 9 | 
10 | REM THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
11 | REM INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
12 | REM PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
13 | REM HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
14 | REM OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
15 | REM SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
16 | REM ==================================================================================
17 | REM
18 | REM makevideo.bat
19 | REM by: Rob Dachowski
20 | REM For questions or feedback, please contact robdac@amazon.com
21 | REM 
22 | REM Purpose: This batchfile invokes the translatevideo.py file with parameters 
23 | REM
24 | REM Change Log:
25 | REM          6/29/2018: Initial version
26 | REM
27 | REM ==================================================================================
28 | 
29 | cls
30 | python translatevideo.py -region us-east-1 -inbucket robdac-aiml-test/ -infile AWS_reInvent_2017.mp4 -outbucket robdac-aiml-test/ -outfilename subtitledVideo -outfiletype mp4 -outlang es de
31 | 


--------------------------------------------------------------------------------
/src/srt.py:
--------------------------------------------------------------------------------
1 | import sys
2 | 
3 | from srtUtils import *
4 | 
5 | input_file = sys.argv[1]
6 | output_file = sys.argv[2]
7 | 
8 | with open(input_file, "r") as f:
9 |       data = writeTranscriptToSRT(f.read(), 'en', output_file )
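
The input file passed to this script is expected to hold the JSON produced by Amazon Transcribe. The sketch below shows the parts of that structure the SRT code reads (`results.items`, each with a `type`, optional timings, and `alternatives[0].content`) and a simplified way to group items into phrases. The sample data is invented for illustration, and `group_items` is a hypothetical stand-in rather than the repository's `getPhrasesFromTranscript`.

```python
import json

# Invented sample in the shape the SRT utilities read; real transcripts are much longer.
sample_transcript = json.dumps({
    "results": {
        "transcripts": [{"transcript": "Hello world."}],
        "items": [
            {"type": "pronunciation", "start_time": "0.04", "end_time": "0.45",
             "alternatives": [{"content": "Hello"}]},
            {"type": "pronunciation", "start_time": "0.45", "end_time": "0.98",
             "alternatives": [{"content": "world"}]},
            # Punctuation items carry no timing information.
            {"type": "punctuation", "alternatives": [{"content": "."}]},
        ],
    }
})

def group_items(transcript_json, max_items=10):
    """Group transcript items into phrases of at most max_items items (hypothetical helper)."""
    items = json.loads(transcript_json)["results"]["items"]
    phrases = []
    for i in range(0, len(items), max_items):
        chunk = items[i:i + max_items]
        words = [item["alternatives"][0]["content"] for item in chunk]
        timed = [item for item in chunk if item["type"] == "pronunciation"]
        if not timed:
            # A chunk made only of punctuation has no timings; fold it into the previous phrase.
            if phrases:
                phrases[-1]["words"].extend(words)
            continue
        phrases.append({
            "start_time": float(timed[0]["start_time"]),
            "end_time": float(timed[-1]["end_time"]),
            "words": words,
        })
    return phrases

phrases = group_items(sample_transcript)
print(phrases)
```

Taking the start time from the first timed item and the end time from the last one means a phrase always has both values, even when it ends in punctuation.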


--------------------------------------------------------------------------------
/src/srtUtils.py:
--------------------------------------------------------------------------------
  1 | # ==================================================================================
  2 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
  3 | 
  4 | # Permission is hereby granted, free of charge, to any person obtaining a copy of this
  5 | # software and associated documentation files (the "Software"), to deal in the Software
  6 | # without restriction, including without limitation the rights to use, copy, modify,
  7 | # merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
  8 | # permit persons to whom the Software is furnished to do so.
  9 | 
 10 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 11 | # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 12 | # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 13 | # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
 14 | # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 15 | # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 16 | # ==================================================================================
 17 | #
 18 | # srtUtils.py
 19 | # by: Rob Dachowski
 20 | # For questions or feedback, please contact robdac@amazon.com
 21 | # 
 22 | # Purpose: The program provides a number of utility functions for creating SubRip Subtitle files (.SRT)
 23 | #
 24 | # Change Log:
 25 | #          6/29/2018: Initial version
 26 | #
 27 | # ==================================================================================
 28 | 
 29 | import json
 30 | import boto3
 31 | import re
 32 | import codecs
 33 | import time
 34 | import math
 35 | from audioUtils import *
 36 | 
 37 | 
 38 | 
 39 | # ==================================================================================
 40 | # Function: newPhrase
 41 | # Purpose: Create an empty phrase dictionary (start_time, end_time, words)
 42 | # Parameters: 
 43 | #                 None
 44 | # ==================================================================================
 45 | def newPhrase():
 46 | 	return { 'start_time': '', 'end_time': '', 'words' : [] }
 47 | 
 48 | 
 49 | 	
 50 | # ==================================================================================
 51 | # Function: getTimeCode
 52 | # Purpose: Format and return a string that contains the converted number of seconds into SRT format
 53 | # Parameters: 
 54 | #                 seconds - the duration in seconds to convert to HH:MM:SS,mmm 
 55 | # ==================================================================================	
 56 | 	# Format and return a string that contains the converted number of seconds into SRT format
 57 | def getTimeCode(seconds):
 58 | # ....t_hund = int(seconds % 1 * 1000)
 59 | # ....t_seconds = int( seconds )
 60 | # ....t_secs = ((float( t_seconds) / 60) % 1) * 60
 61 | # ....t_mins = int( t_seconds / 60 )
 62 | # ....return str( "%02d:%02d:%02d,%03d" % (00, t_mins, int(t_secs), t_hund ))
 63 |     (frac, whole) = math.modf(seconds)
 64 |     frac = frac * 1000
 65 |     return str('%s,%03d' % (time.strftime('%H:%M:%S',time.gmtime(whole)), frac))
 66 | 	
 67 | 
 68 | # ==================================================================================
 69 | # Function: writeTranscriptToSRT
 70 | # Purpose: Function to get the phrases from the transcript and write it out to an SRT file
 71 | # Parameters: 
 72 | #                 transcript - the JSON output from Amazon Transcribe
 73 | #                 sourceLangCode - the language code for the original content (e.g. English = "EN")
 74 | #                 srtFileName - the name of the SRT file (e.g. "mySRT.SRT")
 75 | # ==================================================================================	
 76 | def writeTranscriptToSRT( transcript, sourceLangCode, srtFileName ):
 77 | 	# Write the SRT file for the original language
 78 | 	print( "==> Creating SRT from transcript")
 79 | 	phrases = getPhrasesFromTranscript( transcript )
 80 | 	writeSRT( phrases, srtFileName )
 81 | 	
 82 | 
 83 |     
 84 | 
 85 | # ==================================================================================
 86 | # Function: writeTranslationToSRT
 87 | # Purpose: Based on the JSON transcript provided by Amazon Transcribe, get the phrases from the translation 
 88 | #          and write it out to an SRT file
 89 | # Parameters: 
 90 | #                 transcript - the JSON output from Amazon Transcribe
 91 | #                 sourceLangCode - the language code for the original content (e.g. English = "EN")
 92 | #                 targetLangCode - the language code for the translated content (e.g. Spanish = "ES")
 93 | #                 srtFileName - the name of the SRT file (e.g. "mySRT.SRT")
 94 | # ==================================================================================
 95 | def writeTranslationToSRT( transcript, sourceLangCode, targetLangCode, srtFileName, region ):
 96 | 	# First get the translation
 97 | 	print( "\n\n==> Translating from " + sourceLangCode + " to " + targetLangCode )
 98 | 	translation = translateTranscript( transcript, sourceLangCode, targetLangCode, region )
 99 | 	#print( "\n\n==> Translation: " + str(translation))
100 | 		
101 | 	# Now create phrases from the translation
102 | 	textToTranslate = translation["TranslatedText"]
103 | 	phrases = getPhrasesFromTranslation( textToTranslate, targetLangCode )
104 | 	writeSRT( phrases, srtFileName )
105 | 	
106 | 
107 | # ==================================================================================
108 | # Function: getPhrasesFromTranslation
109 | # Purpose: Split the translated text returned by Amazon Translate into phrases of 10 words each.
110 | #          Note that since we are using a block of translated text rather than
111 | #          a JSON structure with the timing for the start and end of each word as in the output of Transcribe,
112 | #          we will need to calculate the start and end-time for each phrase
113 | # Parameters: 
114 | #                 translation - the translated text returned by Amazon Translate (a plain string)
115 | #                 targetLangCode - the language code for the translated content (e.g. Spanish = "ES")
116 | # ==================================================================================	
117 | def getPhrasesFromTranslation( translation, targetLangCode ):
118 | 
119 | 	# Now create phrases from the translation
120 | 	words = translation.split()
121 | 	
122 | 	#print( words ) #debug statement
123 | 	
124 | 	#set up some variables for the first pass
125 | 	phrase =  newPhrase()
126 | 	phrases = []
127 | 	nPhrase = True
128 | 	x = 0
129 | 	c = 0
130 | 	seconds = 0
131 | 
132 | 	print("==> Creating phrases from translation...")
133 | 
134 | 	for word in words:
135 | 
136 | 		# if it is a new phrase, then get the start_time of the first item
137 | 		if nPhrase == True:
138 | 			phrase["start_time"] = getTimeCode( seconds )
139 | 			nPhrase = False
140 | 			c += 1
141 | 				
142 | 		# Append the word to the phrase...
143 | 		phrase["words"].append(word)
144 | 		x += 1
145 | 		
146 | 		
147 | 		# now add the phrase to the phrases, generate a new phrase, etc.
148 | 		if x == 10:
149 | 		
150 | 			# For Translations, we now need to calculate the end time for the phrase
151 | 			psecs = getSecondsFromTranslation( getPhraseText( phrase), targetLangCode, "phraseAudio" + str(c) + ".mp3" ) 
152 | 			seconds += psecs
153 | 			phrase["end_time"] = getTimeCode( seconds )
154 | 		
155 | 			#print c, phrase
156 | 			phrases.append(phrase)
157 | 			phrase = newPhrase()
158 | 			nPhrase = True
159 | 			#seconds += .001
160 | 			x = 0
161 | 			
162 | 		# This if statement is to address a defect in the SubtitleClip.   If the Subtitles end up being
163 | 		# a different duration than the content, MoviePy will sometimes fail with unexpected errors while
164 | 		# processing the subclip.   This is limiting it to something less than the total duration for our example
165 | 		# however, you may need to modify or eliminate this line depending on your content.
166 | 		if c == 30:
167 | 			break
168 | 			
169 | 	return phrases
170 | 	
171 | 
172 | # ==================================================================================
173 | # Function: getPhrasesFromTranscript
174 | # Purpose: Based on the JSON transcript provided by Amazon Transcribe, group the transcript items
175 | #          into phrases of up to 10 items, each with a start and end time
176 | # Parameters: 
177 | #                 transcript - the JSON output from Amazon Transcribe
178 | # ==================================================================================
179 | def getPhrasesFromTranscript( transcript ):
180 | 
181 | 	# This function is intended to be called with the JSON structure output from the Transcribe service.  However,
182 | 	# if you only have the translation of the transcript, then you should call getPhrasesFromTranslation instead
183 | 
184 | 	# Now create phrases from the transcript
185 | 	ts = json.loads( transcript )
186 | 	items = ts['results']['items']
187 | 	#print( items )
188 | 	
189 | 	#set up some variables for the first pass
190 | 	phrase =  newPhrase()
191 | 	phrases = []
192 | 	nPhrase = True
193 | 	x = 0
194 | 	c = 0
195 | 	lastEndTime = ""
196 | 
197 | 	print("==> Creating phrases from transcript...")
198 | 
199 | 	for item in items:
200 | 
201 | 		# if it is a new phrase, then get the start_time of the first item
202 | 		if nPhrase == True:
203 | 			if item["type"] == "pronunciation":
204 | 				phrase["start_time"] = getTimeCode( float(item["start_time"]) )
205 | 				nPhrase = False
206 | 				lastEndTime =  getTimeCode( float(item["end_time"]) )
207 | 			c+= 1
208 | 		else:	
209 | 			# Otherwise, store the end_time if the item is a pronunciation.
210 | 			# We need to determine whether this is a pronunciation or punctuation here.
211 | 			# Punctuation doesn't contain timing information, so we'll want
212 | 			# to set the end_time to that of the last word in the phrase.
213 | 			if item["type"] == "pronunciation":
214 | 				phrase["end_time"] = getTimeCode( float(item["end_time"]) )
215 | 				
216 | 		# in either case, append the word to the phrase...
217 | 		phrase["words"].append(item['alternatives'][0]["content"])
218 | 		x += 1
219 | 		
220 | 		# now add the phrase to the phrases, generate a new phrase, etc.
221 | 		if x == 10:
222 | 			#print c, phrase
223 | 			phrases.append(phrase)
224 | 			phrase = newPhrase()
225 | 			nPhrase = True
226 | 			x = 0
227 | 	
228 | 	# if there are any words in the final phrase add to phrases  
229 | 	if(len(phrase["words"]) > 0):
230 | 		if phrase['end_time'] == '':
231 | 			phrase['end_time'] = lastEndTime
232 | 		phrases.append(phrase)	
233 | 				
234 | 	return phrases
235 | 	
236 | 
237 | 
238 | 
239 | # ==================================================================================
240 | # Function: translateTranscript
241 | # Purpose: Based on the JSON transcript provided by Amazon Transcribe, get the JSON response of translated text
242 | # Parameters: 
243 | #                 transcript - the JSON output from Amazon Transcribe
244 | #                 sourceLangCode - the language code for the original content (e.g. English = "EN")
245 | #                 targetLangCode - the language code for the translated content (e.g. Spanish = "ES")
246 | #                 region - the AWS region in which to run the Translation (e.g. "us-east-1")
247 | # ==================================================================================
248 | def translateTranscript( transcript, sourceLangCode, targetLangCode, region ):
249 | 	# Get the translation in the target language.  We want to do this first so that the translation is in the full context
250 | 	# of what is said vs. 1 phrase at a time.  This really matters in some languages
251 | 
252 | 	# parse the transcript JSON
253 | 	ts = json.loads( transcript )
254 | 
255 | 	# pull out the transcript text and put it in the txt variable
256 | 	txt = ts["results"]["transcripts"][0]["transcript"]
257 | 		
258 | 	#set up the Amazon Translate client
259 | 	translate = boto3.client(service_name='translate', region_name=region, use_ssl=True)
260 | 	
261 | 	# call Translate  with the text, source language code, and target language code.  The result is a JSON structure containing the 
262 | 	# translated text
263 | 	translation = translate.translate_text(Text=txt,SourceLanguageCode=sourceLangCode, TargetLanguageCode=targetLangCode)
264 | 	
265 | 	return translation
266 | 	
267 | 	
268 | 
269 | # ==================================================================================
270 | # Function: writeSRT
271 | # Purpose: Iterate through the phrases and write them to the SRT file
272 | # Parameters: 
273 | #                 phrases - the array of JSON tuples containing the phrases to show up as subtitles
274 | #                 filename - the name of the SRT output file (e.g. "mySRT.srt")
275 | # ==================================================================================
276 | def writeSRT( phrases, filename ):
277 | 	print ("==> Writing phrases to disk...")
278 | 
279 | 	# open the files
280 | 	e = codecs.open(filename,"w+", "utf-8")
281 | 	x = 1
282 | 	
283 | 	for phrase in phrases:
284 | 
285 | 		# determine how many words are in the phrase
286 | 		length = len(phrase["words"])
287 | 		
288 | 		# write out the phrase number
289 | 		e.write( str(x) + "\n" )
290 | 		x += 1
291 | 		
292 | 		# write out the start and end time
293 | 		e.write( phrase["start_time"] + " --> " + phrase["end_time"] + "\n" )
294 | 					
295 | 		# write out the full phrase.  Use spacing if it is a word, or punctuation without spacing
296 | 		out = getPhraseText( phrase )
297 | 
298 | 		# write out the srt file
299 | 		e.write(out + "\n\n" )
300 | 		
301 | 
302 | 		#print out
303 | 		
304 | 	e.close()
305 | 	
306 | 
307 | # ==================================================================================
308 | # Function: getPhraseText
309 | # Purpose: For a given phrase, return the string of words including punctuation
310 | # Parameters: 
311 | #                 phrase - the array of JSON tuples containing the words to show up as subtitles
312 | # ==================================================================================
313 | 
314 | def getPhraseText( phrase ):
315 | 
316 | 	length = len(phrase["words"])
317 | 		
318 | 	out = ""
319 | 	for i in range( 0, length ):
320 | 		if re.match( '[a-zA-Z0-9]', phrase["words"][i]):
321 | 			if i > 0:
322 | 				out += " " + phrase["words"][i]
323 | 			else:
324 | 				out += phrase["words"][i]
325 | 		else:
326 | 			out += phrase["words"][i]
327 | 			
328 | 	return out
329 | 	
330 | 
331 | 			
332 | 
333 | 	
334 | 
335 | 
336 | 	
337 | 	
338 | 
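
SRT timecodes have the form `HH:MM:SS,mmm`. The following is a minimal sketch of that conversion using integer milliseconds, which avoids floating-point truncation in the fractional part and does not wrap at 24 hours. `format_srt_timecode` is a hypothetical name chosen for this example; it is not a function in this repository.

```python
def format_srt_timecode(seconds):
    """Convert a non-negative duration in seconds to an SRT timecode (HH:MM:SS,mmm)."""
    # Work in whole milliseconds so values such as 59.9996 round cleanly to the next second.
    total_ms = int(round(seconds * 1000))
    hours, remainder = divmod(total_ms, 3600 * 1000)
    minutes, remainder = divmod(remainder, 60 * 1000)
    secs, millis = divmod(remainder, 1000)
    return "%02d:%02d:%02d,%03d" % (hours, minutes, secs, millis)

print(format_srt_timecode(3661.5))
print(format_srt_timecode(0.001))
```

A subtitle entry then pairs two such timecodes with ` --> ` between them, followed by the phrase text and a blank line.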

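When phrase text is rebuilt from tokens, words need a leading space and standalone punctuation does not. A check based on ASCII letters and digits can misclassify words in translated text that begin with accented letters or inverted punctuation. The sketch below, which assumes Python 3, shows one alternative: treat a token as punctuation only if every character in it belongs to a Unicode punctuation category. `is_punctuation` and `join_tokens` are hypothetical helpers for illustration.

```python
import unicodedata

def is_punctuation(token):
    """True if every character in the token is in a Unicode punctuation category (P*)."""
    return all(unicodedata.category(ch).startswith("P") for ch in token)

def join_tokens(tokens):
    """Join tokens with spaces before words and no space before standalone punctuation."""
    out = ""
    for index, token in enumerate(tokens):
        if index > 0 and not is_punctuation(token):
            out += " "
        out += token
    return out

print(join_tokens(["Hello", ",", "world", "!"]))
print(join_tokens(["¿Qué", "es", "él", "?"]))
```

This keeps the spacing rules independent of the target language, at the cost of treating symbols such as currency signs as words.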

--------------------------------------------------------------------------------
/src/transcribeUtils.py:
--------------------------------------------------------------------------------
 1 | # ==================================================================================
 2 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
 3 | 
 4 | # Permission is hereby granted, free of charge, to any person obtaining a copy of this
 5 | # software and associated documentation files (the "Software"), to deal in the Software
 6 | # without restriction, including without limitation the rights to use, copy, modify,
 7 | # merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
 8 | # permit persons to whom the Software is furnished to do so.
 9 | 
10 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
11 | # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
12 | # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
13 | # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
14 | # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
15 | # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
16 | # ==================================================================================
17 | #
18 | # transcribeUtils.py
19 | # by: Rob Dachowski
20 | # For questions or feedback, please contact robdac@amazon.com
21 | # 
22 | # Purpose: The program provides a number of utility functions for leveraging the Amazon Transcribe API
23 | #
24 | # Change Log:
25 | #          6/29/2018: Initial version
26 | #
27 | # ==================================================================================
28 | 
29 | import boto3
30 | import uuid
31 | import requests
32 | 
33 | 
34 | 
35 | # ==================================================================================
36 | # Function: createTranscribeJob
37 | # Purpose: Function to format the input parameters and invoke the Amazon Transcribe service
38 | # Parameters: 
39 | #                 region - the AWS region in which to run AWS services (e.g. "us-east-1")
40 | #                 bucket - the Amazon S3 bucket name (e.g. "mybucket/") found in region that contains the media file for processing.   
41 | #                 mediaFile - the content to process (e.g. "myvideo.mp4")
42 | #
43 | # ==================================================================================
44 | def createTranscribeJob( region, bucket, mediaFile ):
45 | 
46 | 	# Set up the Transcribe client 
47 | 	transcribe = boto3.client('transcribe')
48 | 	
49 | 	# Set up the full uri for the bucket and media file
50 | 	mediaUri = "https://" + "s3-" + region + ".amazonaws.com/" + bucket + mediaFile 
51 | 	
52 | 	print( "Creating Job: " + "transcribe" + mediaFile + " for " + mediaUri )
53 | 	
54 | 	# Use the uuid functionality to generate a unique job name.  Otherwise, the Transcribe service will return an error
55 | 	response = transcribe.start_transcription_job( TranscriptionJobName="transcribe_" + uuid.uuid4().hex + "_" + mediaFile , \
56 | 		LanguageCode = "en-US", \
57 | 		MediaFormat = "mp4", \
58 | 		Media = { "MediaFileUri" : mediaUri }, \
59 | 		Settings = { "VocabularyName" : "MyVocabulary" } \
60 | 		)
61 | 	
62 | 	# return the response structure found in the Transcribe Documentation
63 | 	return response
64 | 	
65 | 	
66 | # ==================================================================================
67 | # Function: getTranscriptionJobStatus
68 | # Purpose: Helper function to return the status of a job running the Amazon Transcribe service
69 | # Parameters: 
70 | #                 jobName - the unique jobName used to start the Amazon Transcribe job
71 | # ==================================================================================
72 | def getTranscriptionJobStatus( jobName ):
73 | 	transcribe = boto3.client('transcribe')
74 | 	
75 | 	response = transcribe.get_transcription_job( TranscriptionJobName=jobName )
76 | 	return response
77 | 	
78 | 	
79 | # ==================================================================================
80 | # Function: getTranscript
81 | # Purpose: Helper function to return the transcript based on the signed URI in S3 as produced by the Transcribe job
82 | # Parameters: 
83 | #                 transcriptURI - the signed S3 URI for the Transcribe output
84 | # ==================================================================================
85 | def getTranscript( transcriptURI ):
86 | 	# Get the resulting Transcription Job and store the JSON response in transcript
87 | 	result = requests.get( transcriptURI )
88 | 
89 | 	return result.text
90 | 
91 | 	
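
The job name above is built from a UUID plus the media file name. To my understanding, Amazon Transcribe restricts job names to letters, digits, `.`, `_`, and `-`, with a maximum length of 200 characters; treat that constraint as an assumption and confirm it against the service documentation. Under that assumption, a file name containing spaces or parentheses would need sanitizing first. `make_job_name` below is a hypothetical helper, not part of this repository or of boto3.

```python
import re
import uuid

def make_job_name(media_file, prefix="transcribe"):
    """Build a unique job name, replacing characters outside the assumed allowed set."""
    # Assumed constraint: only [0-9a-zA-Z._-] and at most 200 characters.
    safe_file = re.sub(r"[^0-9a-zA-Z._-]", "_", media_file)
    name = "{}_{}_{}".format(prefix, uuid.uuid4().hex, safe_file)
    return name[:200]

job_name = make_job_name("my video (final).mp4")
print(job_name)
```

The UUID keeps repeated runs against the same media file from colliding on the job name.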


--------------------------------------------------------------------------------
/src/translatevideo.py:
--------------------------------------------------------------------------------
  1 | # ==================================================================================
  2 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
  3 | 
  4 | # Permission is hereby granted, free of charge, to any person obtaining a copy of this
  5 | # software and associated documentation files (the "Software"), to deal in the Software
  6 | # without restriction, including without limitation the rights to use, copy, modify,
  7 | # merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
  8 | # permit persons to whom the Software is furnished to do so.
  9 | 
 10 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 11 | # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 12 | # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 13 | # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
 14 | # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 15 | # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 16 | # ==================================================================================
 17 | #
 18 | # translatevideo.py
 19 | # by: Rob Dachowski
 20 | # For questions or feedback, please contact robdac@amazon.com
 21 | # 
 22 | # Purpose: This code drives the process to create a transcription job, translate it into another language,
 23 | #          create subtitles, use Amazon Polly to synthesize an alternate audio track, and finally put it all together
 24 | #          into a new video.
 25 | #
 26 | # Change Log:
 27 | #          6/29/2018: Initial version
 28 | #
 29 | # ==================================================================================
 30 | 
 31 | 
 32 | import argparse
 33 | from transcribeUtils import *
 34 | from srtUtils import *
 35 | import time
 36 | from videoUtils import *
 37 | from audioUtils import *
 38 | 
 39 | 
 40 | 
 41 | # Get the command line arguments and parse them
 42 | parser = argparse.ArgumentParser( prog='translatevideo.py', description='Process a video found in the input file and write it out to the output file')
 43 | parser.add_argument('-region', required=True, help="The AWS region containing the S3 buckets" )
 44 | parser.add_argument('-inbucket', required=True, help='The S3 bucket containing the input file')
 45 | parser.add_argument('-infile', required=True, help='The input file to process')
 46 | parser.add_argument('-outbucket', required=True, help='The S3 bucket to contain the output file')
 47 | parser.add_argument('-outfilename', required=True, help='The file name without the extension')
 48 | parser.add_argument('-outfiletype', required=True, help='The output file type.  E.g. mp4, mov')
 49 | parser.add_argument('-outlang', required=True, nargs='+', help='The language codes for the desired output.  E.g. en = English, de = German')		
 50 | args = parser.parse_args()
 51 | 
 52 | # print out parameters and key header information for the user
 53 | print( "==> translatevideo.py:\n")
 54 | print( "==> Parameters: ")
 55 | print("\tInput bucket/object: " + args.inbucket + args.infile )
 56 | print( "\tOutput bucket/object: " + args.outbucket + args.outfilename + "." + args.outfiletype )
 57 | 
 58 | print( "\n==> Target Language Translation Output: " )
 59 | 
 60 | for lang in args.outlang:
 61 | 	print( "\t" + args.outbucket + args.outfilename + "-" + lang + "." + args.outfiletype)
 62 | 	
 63 | 	
 64 | # Create Transcription Job
 65 | response = createTranscribeJob( args.region, args.inbucket, args.infile )
 66 | 
 67 | # loop until the job successfully completes
 68 | print( "\n==> Transcription Job: " + response["TranscriptionJob"]["TranscriptionJobName"] + "\n\tIn Progress"),
 69 | 
 70 | while( response["TranscriptionJob"]["TranscriptionJobStatus"] in ("QUEUED", "IN_PROGRESS") ):
 71 | 	print( "."),
 72 | 	time.sleep( 30 )
 73 | 	response = getTranscriptionJobStatus( response["TranscriptionJob"]["TranscriptionJobName"] )
 74 | 
 75 | print( "\nJob Complete")
 76 | print( "\tStart Time: " + str(response["TranscriptionJob"]["CreationTime"]) )
 77 | print( "\tEnd Time: "  + str(response["TranscriptionJob"]["CompletionTime"]) )
 78 | print( "\tTranscript URI: " + str(response["TranscriptionJob"]["Transcript"]["TranscriptFileUri"]) )
 79 | 
 80 | # Now get the transcript JSON from Amazon Transcribe
 81 | transcript = getTranscript( str(response["TranscriptionJob"]["Transcript"]["TranscriptFileUri"]) ) 
 82 | # print( "\n==> Transcript: \n" + transcript)
 83 | 
 84 | # Create the SRT File for the original transcript and write it out.  
 85 | writeTranscriptToSRT( transcript, 'en', "subtitles-en.srt" )  
 86 | createVideo( args.infile, "subtitles-en.srt", args.outfilename + "-en." + args.outfiletype, "audio-en.mp3", True)
 87 | 
 88 | 
 89 | # Now write out the translation to the transcript for each of the target languages
 90 | for lang in args.outlang:
 91 | 	writeTranslationToSRT(transcript, 'en', lang, "subtitles-" + lang + ".srt", args.region ) 	
 92 | 	
 93 | 	#Now that we have the subtitle files, let's create the audio track
 94 | 	createAudioTrackFromTranslation( args.region, transcript, 'en', lang, "audio-" + lang + ".mp3" )
 95 | 	
 96 | 	# Finally, create the composited video
 97 | 	createVideo( args.infile, "subtitles-" + lang + ".srt", args.outfilename + "-" + lang + "." + args.outfiletype, "audio-" + lang + ".mp3", False)
 98 | 	
 99 | 	
100 | 
101 | 
102 | 	
103 | 
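
The polling loop in translatevideo.py stops as soon as the job status is no longer IN_PROGRESS and then prints "Job Complete", so a FAILED job is reported the same way as a successful one. A minimal sketch of a stricter wait, assuming a hypothetical `get_status` callable that returns a dict shaped like the Transcribe job response used above (for example, a thin wrapper around getTranscriptionJobStatus); `wait_for_job`, `poll_seconds`, and `max_polls` are illustrative names that are not part of this repository:

```python
import time


def wait_for_job(job_name, get_status, poll_seconds=30, max_polls=120, sleep=time.sleep):
    """Poll get_status(job_name) until the job completes, fails, or times out.

    get_status is a hypothetical callable returning a dict with the same
    shape as the Transcribe job response used in translatevideo.py.
    """
    for _ in range(max_polls):
        response = get_status(job_name)
        job = response["TranscriptionJob"]
        status = job["TranscriptionJobStatus"]
        if status == "COMPLETED":
            return response
        if status == "FAILED":
            # FailureReason is optional here, so fall back to a generic message
            raise RuntimeError("Transcription job failed: " + job.get("FailureReason", "unknown reason"))
        # QUEUED or IN_PROGRESS: wait and ask again
        sleep(poll_seconds)
    raise RuntimeError("Timed out waiting for transcription job " + job_name)
```

Passing `sleep` as a parameter keeps the function easy to exercise without real delays; with the defaults it waits at most about an hour before giving up.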


--------------------------------------------------------------------------------
/src/videoUtils.py:
--------------------------------------------------------------------------------
  1 | # ==================================================================================
  2 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
  3 | 
  4 | # Permission is hereby granted, free of charge, to any person obtaining a copy of this
  5 | # software and associated documentation files (the "Software"), to deal in the Software
  6 | # without restriction, including without limitation the rights to use, copy, modify,
  7 | # merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
  8 | # permit persons to whom the Software is furnished to do so.
  9 | 
 10 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 11 | # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 12 | # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 13 | # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
 14 | # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 15 | # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 16 | # ==================================================================================
 17 | #
 18 | # videoUtils.py
 19 | # by: Rob Dachowski
 20 | # For questions or feedback, please contact robdac@amazon.com
 21 | # 
 22 | # Purpose: This code drives the MoviePy functions needed to create the subtitled video
 23 | #
 24 | # Change Log:
 25 | #          6/29/2018: Initial version
 26 | #
 27 | # ==================================================================================
 28 | 
 29 | from moviepy.editor import *
 30 | from moviepy import editor
 31 | from moviepy.video.tools.subtitles import SubtitlesClip
 32 | from time import gmtime, strftime
 33 | from audioUtils import *
 34 | 
 35 | 
 36 | # ==================================================================================
 37 | # Function: annotate
 38 | # Purpose: This function creates a TextClip based on the provided text and composites the subtitle onto the provided clip.
 39 | #          Defaults are used for txt_color, fontsize, and font.   You can override them as desired
 40 | # Parameters: 
 41 | #                 clip - the clip to composite the text on 
 42 | #                 txt - the block of text to composite on the clip
 43 | #                 txt_color - the color of the text on the screen
 44 | #                 fontsize - the size of the font to display
 45 | #                 font - the font to use for the text
 46 | #
 47 | # ==================================================================================
 48 | def annotate(clip, txt, txt_color='white', fontsize=24, font='Arial-Bold'):
 49 |     # Composite the text over the clip: centered horizontally, 50 pixels from the top ('Xolonium-Bold' is an alternative font)
 50 |     txtclip = editor.TextClip(txt, fontsize=fontsize, font=font, color=txt_color).on_color(color=[0,0,0])
 51 |     cvc = editor.CompositeVideoClip([clip, txtclip.set_pos(('center', 50))])
 52 |     return cvc.set_duration(clip.duration)
 53 | 	
 54 | # ==================================================================================
 55 | # Function: createVideo
 56 | # Purpose: This function drives the MoviePy code needed to put all of the pieces together and create a new subtitled video  
 57 | # Parameters: 
 58 | #                 originalClipName - the filename of the original content (e.g. "originalVideo.mp4")
 59 | #                 subtitlesFileName - the filename of the SRT file (e.g. "mySRT.srt")
 60 | #                 outputFileName - the filename of the output video file (e.g. "outputFileName.mp4")
 61 | #                 alternateAudioFileName - the filename of an MP3 file that should be used to replace the audio track
 62 | #                 useOriginalAudio - boolean: True keeps the original audio track, False replaces it with alternateAudioFileName
 63 | #
 64 | # ==================================================================================
 65 | def createVideo( originalClipName, subtitlesFileName, outputFileName, alternateAudioFileName, useOriginalAudio=True ):
 66 | 	# This function is used to put all of the pieces together.   
 67 | 	# Note that if we need to use an alternate audio track, the last parameter should be False
 68 | 	
 69 | 	print( "\n==> createVideo " )
 70 | 
 71 | 	# Load the original clip
 72 | 	print( "\t" + strftime("%H:%M:%S", gmtime()) + " Reading video clip: " + originalClipName )
 73 | 	clip = VideoFileClip(originalClipName)
 74 | 	print( "\t\t==> Original clip duration: " + str(clip.duration) )
 75 | 
 76 | 	if not useOriginalAudio:
 77 | 		print( "\t" + strftime("%H:%M:%S", gmtime()) + " Reading alternate audio track: " + alternateAudioFileName )
 78 | 		audio = AudioFileClip(alternateAudioFileName)
 79 | 		audio = audio.subclip( 0, clip.duration )
 80 | 		audio = audio.set_duration(clip.duration)
 81 | 		print( "\t\t==> Audio duration: " + str(audio.duration) )
 82 | 		clip = clip.set_audio( audio )
 83 | 	else:
 84 | 		print( "\t" + strftime("%H:%M:%S", gmtime()) + " Using original audio track..." )
 85 | 		
 86 | 	# Create a lambda function that will be used to generate the subtitles for each sequence in the SRT
 87 | 	generator = lambda txt: TextClip(txt, font='Arial-Bold', fontsize=24, color='white')
 88 | 
 89 | 	# read in the subtitles files
 90 | 	print( "\t" + strftime("%H:%M:%S", gmtime()) + " Reading subtitle file: " + subtitlesFileName )
 91 | 	subs = SubtitlesClip(subtitlesFileName, generator)
 92 | 	print( "\t\t==> Subtitles duration before: " + str(subs.duration) )
 93 | 	subs = subs.subclip( 0, clip.duration - .001)
 94 | 	subs = subs.set_duration( clip.duration - .001 )
 95 | 	print( "\t\t==> Subtitles duration after: " + str(subs.duration) )
 96 | 	print( "\t" + strftime("%H:%M:%S", gmtime()) + " Reading subtitle file complete: " + subtitlesFileName )
 97 | 
 98 | 
 99 | 	print( "\t" + strftime( "%H:%M:%S", gmtime()) + " Creating Subtitles Track..." )
100 | 	annotated_clips = [annotate(clip.subclip(from_t, to_t), txt) for (from_t, to_t), txt in subs]
101 | 
102 | 
103 | 
104 | 	print( "\t" + strftime( "%H:%M:%S", gmtime()) + " Creating composited video: " + outputFileName )
105 | 	# Concatenate the annotated sub-clips into a single video
106 | 	final = concatenate_videoclips( annotated_clips )
107 | 
108 | 	print( "\t" + strftime( "%H:%M:%S", gmtime()) + " Writing video file: " + outputFileName )
109 | 	final.write_videofile(outputFileName)


--------------------------------------------------------------------------------
/tools/srtUtils.py:
--------------------------------------------------------------------------------
  1 | # ==================================================================================
  2 | # Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.
  3 | 
  4 | # Permission is hereby granted, free of charge, to any person obtaining a copy of this
  5 | # software and associated documentation files (the "Software"), to deal in the Software
  6 | # without restriction, including without limitation the rights to use, copy, modify,
  7 | # merge, publish, distribute, sublicense, and/or sell copies of the Software, and to
  8 | # permit persons to whom the Software is furnished to do so.
  9 | 
 10 | # THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED,
 11 | # INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A
 12 | # PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT
 13 | # HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
 14 | # OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
 15 | # SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
 16 | # ==================================================================================
 17 | #
 18 | # srtUtils.py
 19 | # by: Rob Dachowski
 20 | # For questions or feedback, please contact robdac@amazon.com
 21 | # 
 22 | # Purpose: The program provides a number of utility functions for creating SubRip Subtitle files (.SRT)
 23 | #
 24 | # Change Log:
 25 | #          6/29/2018: Initial version
 26 | #
 27 | # ==================================================================================
 28 | 
 29 | import json
 30 | import boto3
 31 | import re
 32 | import codecs
 33 | from audioUtils import *
 34 | 
 35 | 
 36 | 
 37 | # ==================================================================================
 38 | # Function: newPhrase
 39 | # Purpose: Create an empty phrase dictionary (start_time, end_time, words)
 40 | # Parameters: 
 41 | #                 None
 42 | # ==================================================================================
 43 | def newPhrase():
 44 | 	return { 'start_time': '', 'end_time': '', 'words' : [] }
 45 | 
 46 | 
 47 | 	
 48 | # ==================================================================================
 49 | # Function: getTimeCode
 50 | # Purpose: Format and return a string that contains the converted number of seconds into SRT format
 51 | # Parameters: 
 52 | #                 seconds - the duration in seconds to convert to HH:MM:SS,mmm 
 53 | # ==================================================================================	
 54 | # Hours, minutes, and seconds are derived with integer arithmetic so durations of an hour or more format correctly
 55 | def getTimeCode( seconds ):
 56 | 	t_millis = int(seconds % 1 * 1000)
 57 | 	t_seconds = int( seconds )
 58 | 	t_mins, t_secs = divmod( t_seconds, 60 )
 59 | 	t_hours, t_mins = divmod( t_mins, 60 )
 60 | 	return "%02d:%02d:%02d,%03d" % (t_hours, t_mins, t_secs, t_millis)
 61 | 	
 62 | 
 63 | # ==================================================================================
 64 | # Function: writeTranscriptToSRT
 65 | # Purpose: Function to get the phrases from the transcript and write it out to an SRT file
 66 | # Parameters: 
 67 | #                 transcript - the JSON output from Amazon Transcribe
 68 | #                 sourceLangCode - the language code for the original content (e.g. English = "EN")
 69 | #                 srtFileName - the name of the SRT file (e.g. "mySRT.SRT")
 70 | # ==================================================================================	
 71 | def writeTranscriptToSRT( transcript, sourceLangCode, srtFileName ):
 72 | 	# Write the SRT file for the original language
 73 | 	print( "==> Creating SRT from transcript")
 74 | 	phrases = getPhrasesFromTranscript( transcript )
 75 | 	writeSRT( phrases, srtFileName )
 76 | 	
 77 | 
 78 | # ==================================================================================
 79 | # Function: writeTranslationToSRT
 80 | # Purpose: Translate the Amazon Transcribe JSON transcript and write the translated phrases out to an SRT file
 81 | # Parameters: 
 82 | #                 transcript - the JSON output from Amazon Transcribe
 83 | #                 sourceLangCode - the language code for the original content (e.g. English = "EN")
 84 | #                 targetLangCode - the language code for the translated content (e.g. Spanish = "ES")
 85 | #                 srtFileName - the name of the SRT file (e.g. "mySRT.SRT")
 86 | #                 region - the AWS region in which to run the Translation (e.g. "us-east-1")
 87 | # ==================================================================================
 88 | def writeTranslationToSRT( transcript, sourceLangCode, targetLangCode, srtFileName, region ):
 89 | 	# First get the translation
 90 | 	print( "\n\n==> Translating from " + sourceLangCode + " to " + targetLangCode )
 91 | 	translation = translateTranscript( transcript, sourceLangCode, targetLangCode, region )
 92 | 	#print( "\n\n==> Translation: " + str(translation))
 93 | 		
 94 | 	# Now create phrases from the translated text
 95 | 	translatedText = translation["TranslatedText"]
 96 | 	phrases = getPhrasesFromTranslation( translatedText, targetLangCode )
 97 | 	writeSRT( phrases, srtFileName )
 98 | 	
 99 | 
100 | # ==================================================================================
101 | # Function: getPhrasesFromTranslation
102 | # Purpose: Split the translated text returned by Amazon Translate into subtitle phrases of up to 10 words.
103 | #          Note that since we are using a block of translated text rather than
104 | #          a JSON structure with the timing for the start and end of each word as in the output of Transcribe,
105 | #          we will need to calculate the start and end-time for each phrase
106 | # Parameters: 
107 | #                 translation - the translated text (the "TranslatedText" value from the Amazon Translate response)
108 | #                 targetLangCode - the language code for the translated content (e.g. Spanish = "ES")
109 | # ==================================================================================	
110 | def getPhrasesFromTranslation( translation, targetLangCode ):
111 | 
112 | 	# Now create phrases from the translation
113 | 	words = translation.split()
114 | 	
115 | 	#print( words ) #debug statement
116 | 	
117 | 	#set up some variables for the first pass
118 | 	phrase =  newPhrase()
119 | 	phrases = []
120 | 	nPhrase = True
121 | 	x = 0
122 | 	c = 0
123 | 	seconds = 0
124 | 
125 | 	print( "==> Creating phrases from translation..." )
126 | 
127 | 	for word in words:
128 | 
129 | 		# if it is a new phrase, then get the start_time of the first item
130 | 		if nPhrase == True:
131 | 			phrase["start_time"] = getTimeCode( seconds )
132 | 			nPhrase = False
133 | 			c += 1
134 | 				
135 | 		# Append the word to the phrase...
136 | 		phrase["words"].append(word)
137 | 		x += 1
138 | 		
139 | 		
140 | 		# now add the phrase to the phrases, generate a new phrase, etc.
141 | 		if x == 10:
142 | 		
143 | 			# For Translations, we now need to calculate the end time for the phrase
144 | 			psecs = getSecondsFromTranslation( getPhraseText( phrase), targetLangCode, "phraseAudio" + str(c) + ".mp3" ) 
145 | 			seconds += psecs
146 | 			phrase["end_time"] = getTimeCode( seconds )
147 | 		
148 | 			#print c, phrase
149 | 			phrases.append(phrase)
150 | 			phrase = newPhrase()
151 | 			nPhrase = True
152 | 			#seconds += .001
153 | 			x = 0
154 | 			
155 | 		# This if statement is to address a defect in the SubtitleClip.   If the Subtitles end up being
156 | 		# a different duration than the content, MoviePy will sometimes fail with unexpected errors while
157 | 		# processing the subclip.   This is limiting it to something less than the total duration for our example
158 | 		# however, you may need to modify or eliminate this line depending on your content.
159 | 		if c == 30:
160 | 			break
161 | 			
162 | 	return phrases
163 | 	
164 | 
165 | # ==================================================================================
166 | # Function: getPhrasesFromTranscript
167 | # Purpose: Based on the JSON transcript provided by Amazon Transcribe, group the transcribed items
168 | #          into phrases of up to 10 items, each with a start and end time
169 | # Parameters: 
170 | #                 transcript - the JSON output from Amazon Transcribe
171 | # ==================================================================================
172 | def getPhrasesFromTranscript( transcript ):
173 | 
174 | 	# This function is intended to be called with the JSON structure output from the Transcribe service.  However,
175 | 	# if you only have the translation of the transcript, then you should call getPhrasesFromTranslation instead
176 | 
177 | 	# Now create phrases from the transcript
178 | 	ts = json.loads( transcript )
179 | 	items = ts['results']['items']
180 | 	#print( items )
181 | 	
182 | 	#set up some variables for the first pass
183 | 	phrase =  newPhrase()
184 | 	phrases = []
185 | 	nPhrase = True
186 | 	x = 0
187 | 	c = 0
188 | 
189 | 	print( "==> Creating phrases from transcript..." )
190 | 
191 | 	for item in items:
192 | 
193 | 		# if it is a new phrase, then get the start_time of the first item
194 | 		if nPhrase == True:
195 | 			if item["type"] == "pronunciation":
196 | 				phrase["start_time"] = getTimeCode( float(item["start_time"]) )
197 | 				nPhrase = False
198 | 			c+= 1
199 | 		else:	
200 | 			# get the end_time if the item is a pronunciation and store it
201 | 			# We need to determine if this is pronunciation or punctuation here
202 | 			# Punctuation doesn't contain timing information, so we'll want
203 | 			# to set the end_time to whatever the last word in the phrase is.
204 | 			if item["type"] == "pronunciation":
205 | 				phrase["end_time"] = getTimeCode( float(item["end_time"]) )
206 | 				
207 | 		# in either case, append the word to the phrase...
208 | 		phrase["words"].append(item['alternatives'][0]["content"])
209 | 		x += 1
210 | 		
211 | 		# now add the phrase to the phrases, generate a new phrase, etc.
212 | 		if x == 10:
213 | 			#print c, phrase
214 | 			phrases.append(phrase)
215 | 			phrase = newPhrase()
216 | 			nPhrase = True
217 | 			x = 0
218 | 	
219 | 	# if there are any words in the final phrase add to phrases  
220 | 	if(len(phrase["words"]) > 0):
221 | 		phrases.append(phrase)	
222 | 				
223 | 	return phrases
224 | 	
225 | 
226 | 
227 | 
228 | # ==================================================================================
229 | # Function: translateTranscript
230 | # Purpose: Based on the JSON transcript provided by Amazon Transcribe, get the JSON response of translated text
231 | # Parameters: 
232 | #                 transcript - the JSON output from Amazon Transcribe
233 | #                 sourceLangCode - the language code for the original content (e.g. English = "EN")
234 | #                 targetLangCode - the language code for the translated content (e.g. Spanish = "ES")
235 | #                 region - the AWS region in which to run the Translation (e.g. "us-east-1")
236 | # ==================================================================================
237 | def translateTranscript( transcript, sourceLangCode, targetLangCode, region ):
238 | 	# Get the translation in the target language.  We want to do this first so that the translation is in the full context
239 | 	# of what is said vs. 1 phrase at a time.  This really matters in some languages
240 | 
241 | 	# parse the transcript JSON
242 | 	ts = json.loads( transcript )
243 | 
244 | 	# pull out the transcript text and put it in the txt variable
245 | 	txt = ts["results"]["transcripts"][0]["transcript"]
246 | 		
247 | 	#set up the Amazon Translate client
248 | 	translate = boto3.client(service_name='translate', region_name=region, use_ssl=True)
249 | 	
250 | 	# call Translate  with the text, source language code, and target language code.  The result is a JSON structure containing the 
251 | 	# translated text
252 | 	translation = translate.translate_text(Text=txt,SourceLanguageCode=sourceLangCode, TargetLanguageCode=targetLangCode)
253 | 	
254 | 	return translation
255 | 	
256 | 	
257 | 
258 | # ==================================================================================
259 | # Function: writeSRT
260 | # Purpose: Iterate through the phrases and write them to the SRT file
261 | # Parameters: 
262 | #                 phrases - the array of JSON tuples containing the phrases to show up as subtitles
263 | #                 filename - the name of the SRT output file (e.g. "mySRT.srt")
264 | # ==================================================================================
265 | def writeSRT( phrases, filename ):
266 | 	print( "==> Writing phrases to disk..." )
267 | 
268 | 	# open the files
269 | 	e = codecs.open(filename,"w+", "utf-8")
270 | 	x = 1
271 | 	
272 | 	for phrase in phrases:
273 | 
274 | 		# determine how many words are in the phrase
275 | 		length = len(phrase["words"])
276 | 		
277 | 		# write out the phrase number
278 | 		e.write( str(x) + "\n" )
279 | 		x += 1
280 | 		
281 | 		# write out the start and end time
282 | 		e.write( phrase["start_time"] + " --> " + phrase["end_time"] + "\n" )
283 | 					
284 | 		# write out the full phrase.  Use spacing if it is a word, or punctuation without spacing
285 | 		out = getPhraseText( phrase )
286 | 
287 | 		# write out the srt file
288 | 		e.write(out + "\n\n" )
289 | 		
290 | 
291 | 		#print out
292 | 		
293 | 	e.close()
294 | 	
295 | 
296 | # ==================================================================================
297 | # Function: getPhraseText
298 | # Purpose: For a given phrase, return the string of words including punctuation
299 | # Parameters: 
300 | #                 phrase - the array of JSON tuples containing the words to show up as subtitles
301 | # ==================================================================================
302 | 
303 | def getPhraseText( phrase ):
304 | 
305 | 	length = len(phrase["words"])
306 | 		
307 | 	out = ""
308 | 	for i in range( 0, length ):
309 | 		if re.match( '[a-zA-Z0-9]', phrase["words"][i]):
310 | 			if i > 0:
311 | 				out += " " + phrase["words"][i]
312 | 			else:
313 | 				out += phrase["words"][i]
314 | 		else:
315 | 			out += phrase["words"][i]
316 | 			
317 | 	return out
318 | 	
319 | 
320 | 			
321 | 
322 | 	
323 | 
324 | 
325 | 	
326 | 	
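
In getPhrasesFromTranslation above, a phrase is appended only when it reaches 10 words, so a final group of fewer than 10 words is never added to `phrases` (getPhrasesFromTranscript, by contrast, appends the leftover phrase after its loop). A minimal sketch of grouping that keeps the tail; `chunk_words` is a hypothetical helper, not part of this repository, and it deliberately leaves out the timing calculation:

```python
def chunk_words(words, size=10):
    """Split words into consecutive groups of at most `size` items.

    The final group may be shorter than `size`; it is kept, not dropped.
    """
    if size < 1:
        raise ValueError("size must be at least 1")
    return [words[i:i + size] for i in range(0, len(words), size)]


# Example: 23 words become groups of 10, 10, and 3
groups = chunk_words(["w%d" % n for n in range(23)])
print([len(g) for g in groups])
```

Each group could then be turned into a phrase with its own start and end time, in the same way the loop above does for full 10-word phrases.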


--------------------------------------------------------------------------------
/tools/testWebVTT.py:
--------------------------------------------------------------------------------
 1 | import argparse
 2 | from transcribeUtils import *
 3 | from webvttUtils import *
 4 | import requests
 5 | from videoUtils import *
 6 | from audioUtils import *
 7 | 
 8 | # Get the command line arguments and parse them
 9 | parser = argparse.ArgumentParser( prog='testWebVTT.py', description='Process the video found in the input file and write it out to the output file')
10 | parser.add_argument('-region', required=True, help="The AWS region containing the S3 buckets" )
11 | parser.add_argument('-inbucket', required=True, help='The S3 bucket containing the input file')
12 | parser.add_argument('-infile', required=True, help='The input file to process')
13 | parser.add_argument('-outbucket', required=True, help='The S3 bucket for the output file')
14 | parser.add_argument('-outfilename', required=True, help='The file name without the extension')
15 | parser.add_argument('-outfiletype', required=True, help='The output file type.  E.g. mp4, mov')
16 | parser.add_argument('-outlang', required=True, nargs='+', help='The language codes for the desired output.  E.g. en = English, de = German')		
17 | parser.add_argument('-TranscriptJob', required=True, help='The name of the completed transcription job')
18 | args = parser.parse_args()
19 | 
20 | 
21 | job = getTranscriptionJobStatus( args.TranscriptJob )
22 | #print( job )
23 | 
24 | 
25 | # Now get the transcript JSON from AWS Transcribe
26 | transcript = getTranscript( str(job["TranscriptionJob"]["Transcript"]["TranscriptFileUri"]) ) 
27 | #print( "\n==> Transcript: \n" + transcript)
28 | 
29 | # Create the WebVTT File for the original transcript and write it out.  
30 | writeTranscriptToWebVTT( transcript, 'en', "subtitles-en.vtt")
31 | #createVideo( args.infile, "subtitles-en.vtt", args.outfilename + "-en." + args.outfiletype, "audio-en.mp3", True)
32 | 
33 | 
34 | # Now write out the translation to the transcript for each of the target languages
35 | for lang in args.outlang:
36 | 	writeTranslationToWebVTT(transcript, 'en', lang, "subtitles-" + lang + ".vtt" ) 	
37 | 	
38 | 	#Now that we have the subtitle files, let's create the audio track
39 | 	#createAudioTrackFromTranslation( args.region, transcript, 'en', lang, "audio-" + lang + ".mp3" )
40 | 	
41 | 	# Finally, create the composited video
42 | 	#createVideo( args.infile, "subtitles-" + lang + ".WebVTT", args.outfilename + "-" + lang + "." + args.outfiletype, "audio-" + lang + ".mp3", False)
43 | 
44 | 
45 | 


--------------------------------------------------------------------------------
/tools/webvttUtils.py:
--------------------------------------------------------------------------------
  1 | import json
  2 | import boto3
  3 | import re
  4 | import codecs
  5 | from audioUtils import *
  6 | 
  7 | translate = boto3.client(service_name='translate', region_name='us-east-1', use_ssl=True)
  8 | 
  9 | 
 10 | 
 11 | # Create a new phrase structure
 12 | def newPhrase():
 13 | 	return { 'start_time': '', 'end_time': '', 'words' : [] }
 14 | 
 15 | # Format and return a string that contains the converted number of seconds into WebVTT format
 16 | def getTimeCode( seconds ):
 17 | 	t_millis = int(seconds % 1 * 1000)
 18 | 	t_seconds = int( seconds )
 19 | 	t_mins, t_secs = divmod( t_seconds, 60 )
 20 | 	t_hours, t_mins = divmod( t_mins, 60 )
 21 | 	return "%02d:%02d:%02d.%03d" % (t_hours, t_mins, t_secs, t_millis)
 22 | 	
 23 | 
 24 | 	
 25 | def writeTranscriptToWebVTT( transcript, sourceLangCode, WebVTTFileName ):
 26 | 	# Write the WebVTT file for the original language
 27 | 	print( "==> Creating WebVTT from transcript")
 28 | 	phrases = getPhrasesFromTranscript( transcript )
 29 | 	writeWebVTT( phrases, WebVTTFileName, "A:middle L:90%" ) 
 30 | 	
 31 | 	
 32 | def writeTranslationToWebVTT( transcript, sourceLangCode, targetLangCode, WebVTTFileName ):
 33 | 	# First get the translation
 34 | 	print( "\n\n==> Translating from " + sourceLangCode + " to " + targetLangCode )
 35 | 	translation = translateTranscript( transcript, sourceLangCode, targetLangCode )
 36 | 	#print( "\n\n==> Translation: " + str(translation))
 37 | 		
 38 | 	# Now create phrases from the translated text
 39 | 	translatedText = translation["TranslatedText"]
 40 | 	phrases = getPhrasesFromTranslation( translatedText, targetLangCode )
 41 | 	writeWebVTT( phrases, WebVTTFileName, "A:middle L:90%" )
 42 | 	
 43 | 	
 44 | def getPhrasesFromTranslation( translation, targetLangCode ):
 45 | 
 46 | 	# Now create phrases from the translation
 47 | 	words = translation.split()
 48 | 	
 49 | 	#print( words ) #debug statement
 50 | 	
 51 | 	#set up some variables for the first pass
 52 | 	phrase =  newPhrase()
 53 | 	phrases = []
 54 | 	nPhrase = True
 55 | 	x = 0
 56 | 	c = 0
 57 | 	seconds = 0
 58 | 
 59 | 	print( "==> Creating phrases from translation..." )
 60 | 
 61 | 	for word in words:
 62 | 
 63 | 		# if it is a new phrase, then get the start_time of the first item
 64 | 		if nPhrase == True:
 65 | 			phrase["start_time"] = getTimeCode( seconds )
 66 | 			nPhrase = False
 67 | 			c += 1
 68 | 				
 69 | 		# Append the word to the phrase...
 70 | 		phrase["words"].append(word)
 71 | 		x += 1
 72 | 		
 73 | 		
 74 | 		# now add the phrase to the phrases, generate a new phrase, etc.
 75 | 		if x == 10:
 76 | 		
 77 | 			# For Translations, we now need to calculate the end time for the phrase
 78 | 			psecs = getSecondsFromTranslation( getPhraseText( phrase), targetLangCode, "phraseAudio" + str(c) + ".mp3" ) 
 79 | 			seconds += psecs
 80 | 			phrase["end_time"] = getTimeCode( seconds )
 81 | 		
 82 | 			#print c, phrase
 83 | 			phrases.append(phrase)
 84 | 			phrase = newPhrase()
 85 | 			nPhrase = True
 86 | 			#seconds += .001
 87 | 			x = 0
 88 | 			
 89 | 		#if c == 30:
 90 | 		#	break
 91 | 			
 92 | 	return phrases
 93 | 	
 94 | 	
 95 | def getPhrasesFromTranscript( transcript ):
 96 | 
 97 | 	# This function is intended to be called with the JSON structure output from the Transcribe service.  However,
 98 | 	# if you only have the translation of the transcript, then you should call getPhrasesFromTranslation instead
 99 | 
100 | 	# Now create phrases from the transcript
101 | 	ts = json.loads( transcript )
102 | 	items = ts['results']['items']
103 | 	#print( items )
104 | 	
105 | 	#set up some variables for the first pass
106 | 	phrase =  newPhrase()
107 | 	phrases = []
108 | 	nPhrase = True
109 | 	x = 0
110 | 	c = 0
111 | 
112 | 	print( "==> Creating phrases from transcript..." )
113 | 
114 | 	for item in items:
115 | 
116 | 		# if it is a new phrase, then get the start_time of the first item
117 | 		if nPhrase == True:
118 | 			if item["type"] == "pronunciation":
119 | 				phrase["start_time"] = getTimeCode( float(item["start_time"]) )
120 | 				nPhrase = False
121 | 			c+= 1
122 | 		else:	
123 | 			# get the end_time if the item is a pronunciation and store it
124 | 			# We need to determine if this is pronunciation or punctuation here
125 | 			# Punctuation doesn't contain timing information, so we'll want
126 | 			# to set the end_time to whatever the last word in the phrase is.
127 | 			if item["type"] == "pronunciation":
128 | 				phrase["end_time"] = getTimeCode( float(item["end_time"]) )
129 | 				
130 | 		# in either case, append the word to the phrase...
131 | 		phrase["words"].append(item['alternatives'][0]["content"])
132 | 		x += 1
133 | 		
134 | 		# now add the phrase to the phrases, generate a new phrase, etc.
135 | 		if x == 10:
136 | 			#print c, phrase
137 | 			phrases.append(phrase)
138 | 			phrase = newPhrase()
139 | 			nPhrase = True
140 | 			x = 0
141 | 	
142 | 	# if there are any words in the final phrase add to phrases  
143 | 	if(len(phrase["words"]) > 0):
144 | 		phrases.append(phrase)	
145 | 			
146 | 	return phrases
147 | 	
148 | 
149 | 
150 | 
151 | 
152 | def translateTranscript( transcript, sourceLangCode, targetLangCode ):
153 | 	# Get the translation in the target language.  We want to do this first so that the translation is in the full context
154 | 	# of what is said vs. 1 phrase at a time.  This really matters in some languages
155 | 
156 | 	# parse the transcript JSON
157 | 	ts = json.loads( transcript )
158 | 
159 | 	# pull out the transcript text and put it in the txt variable
160 | 	txt = ts["results"]["transcripts"][0]["transcript"]
161 | 		
162 | 	# call Translate  with the text, source language code, and target language code.  The result is a JSON structure containing the 
163 | 	# translated text
164 | 	translation = translate.translate_text(Text=txt,SourceLanguageCode=sourceLangCode, TargetLanguageCode=targetLangCode)
165 | 	
166 | 	return translation
167 | 	
168 | 	
169 | 
170 | def writeWebVTT( phrases, filename, style ):
171 | 	print( "==> Writing phrases to disk..." )
172 | 
173 | 	# open the files
174 | 	e = codecs.open(filename,"w+", "utf-8")
175 | 	x = 1
176 | 	
177 | 	# write the header of the webVTT file
178 | 	e.write( "WEBVTT\n\n")
179 | 	
180 | 	for phrase in phrases:
181 | 
182 | 		# determine how many words are in the phrase
183 | 		length = len(phrase["words"])
184 | 		
185 | 		# write out the phrase number
186 | 		e.write( str(x) + "\n" )
187 | 		x += 1
188 | 		
189 | 		# write out the start and end time
190 | 		e.write( phrase["start_time"] + " --> " + phrase["end_time"] + " " + style + "\n" )
191 | 					
192 | 		# write out the full phrase.  Use spacing if it is a word, or punctuation without spacing
193 | 		out = getPhraseText( phrase )
194 | 
195 | 		# write out the WebVTT file
196 | 		e.write(out + "\n\n" )
197 | 		
198 | 
199 | 		#print out
200 | 		
201 | 	e.close()
202 | 	
203 | 
204 | 
205 | def getPhraseText( phrase ):
206 | 
207 | 	length = len(phrase["words"])
208 | 		
209 | 	out = ""
210 | 	for i in range( 0, length ):
211 | 		if re.match( '[a-zA-Z0-9]', phrase["words"][i]):
212 | 			if i > 0:
213 | 				out += " " + phrase["words"][i]
214 | 			else:
215 | 				out += phrase["words"][i]
216 | 		else:
217 | 			out += phrase["words"][i]
218 | 			
219 | 	return out
220 | 	
221 | 
222 | 			
223 | 
224 | 	
225 | 
226 | 
227 | 	
228 | 	
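
writeWebVTT above emits, for each phrase, a cue number, a timing line followed by a style string, and the phrase text. A minimal sketch of building one such cue as a string; `format_vtt_timecode` and `format_cue` are hypothetical helpers, not part of this repository. The default settings string `align:center line:90%` is an assumption: as far as I know the current WebVTT specification spells cue settings this way, whereas the code above passes `"A:middle L:90%"`, so check the behavior of your target player before switching.

```python
def format_vtt_timecode(seconds):
    """Format a duration in seconds as HH:MM:SS.mmm (WebVTT uses '.' before milliseconds)."""
    total_ms = int(round(seconds * 1000))
    total_s, ms = divmod(total_ms, 1000)
    mins, secs = divmod(total_s, 60)
    hours, mins = divmod(mins, 60)
    return "%02d:%02d:%02d.%03d" % (hours, mins, secs, ms)


def format_cue(index, start, end, text, settings="align:center line:90%"):
    """Return one WebVTT cue: identifier line, timing line with optional settings, then the text."""
    timing = format_vtt_timecode(start) + " --> " + format_vtt_timecode(end)
    if settings:
        timing += " " + settings
    return "%d\n%s\n%s\n" % (index, timing, text)


# A complete file is the "WEBVTT" header, a blank line, then cues separated by blank lines
document = "WEBVTT\n\n" + "\n".join([
    format_cue(1, 0, 2.5, "Hello there."),
    format_cue(2, 2.5, 5, "Welcome to the video."),
])
print(document)
```

Unlike getTimeCode, this sketch rounds to the nearest millisecond instead of truncating, which avoids off-by-one milliseconds caused by floating-point representation.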


--------------------------------------------------------------------------------