├── .gitignore ├── enable-api.png ├── Armstrong_Small_Step.ogg.mp3 ├── Readme.md └── transcribe.ipynb /.gitignore: -------------------------------------------------------------------------------- 1 | AutoMLdemo-64ae368c71e2.json 2 | **/venv 3 | **/.ipynb_checkpoints 4 | -------------------------------------------------------------------------------- /enable-api.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jy2k/speech-to-text-exmple/main/enable-api.png -------------------------------------------------------------------------------- /Armstrong_Small_Step.ogg.mp3: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/jy2k/speech-to-text-exmple/main/Armstrong_Small_Step.ogg.mp3 -------------------------------------------------------------------------------- /Readme.md: -------------------------------------------------------------------------------- 1 | Using the Speech-to-Text API with an MP3 file 2 | =================== 3 | Download the repo and run the [transcribe.ipynb](transcribe.ipynb) notebook against your own **GCP project**. 4 | 5 | The notebook walks through the following steps: 6 | ------------- 7 | 1. Replace the parameters: 8 | ``` 9 | PROJECT_ID = "[PROJECT-ID-GOES-HERE]" 10 | ... 11 | ``` 12 | 2. Install client libraries 13 | 3. Enable the APIs 14 | 4. Export credentials after downloading a service account key 15 | 5. Create a bucket 16 | 6. Upload the MP3 file 17 | 7. Transcribe using the Speech-to-Text API. 
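The values that the steps above pass around can be sketched in plain Python, without any Google client library installed. This is a minimal sketch, not the notebook's own code: the helper names `enable_api_link`, `set_credentials`, and `gcs_uri` are hypothetical. Only the console URL, the `GOOGLE_APPLICATION_CREDENTIALS` variable, and the `gs://bucket/file` layout are taken from the notebook.

```python
import os
from pathlib import Path

# Console URL prefix taken from the notebook; the project ID is appended to it.
ENABLE_API_URL = (
    "https://console.developers.google.com/apis/api/"
    "speech.googleapis.com/overview?project="
)


def enable_api_link(project_id: str) -> str:
    """Return the console link for enabling the Speech-to-Text API (hypothetical helper)."""
    return ENABLE_API_URL + project_id


def set_credentials(key_path: str) -> str:
    """Point GOOGLE_APPLICATION_CREDENTIALS at an existing key file (hypothetical helper)."""
    path = Path(key_path).expanduser().resolve()
    if not path.is_file():
        raise FileNotFoundError(f"Service account key not found: {path}")
    # Setting os.environ affects the running Python process (the notebook kernel).
    os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = str(path)
    return str(path)


def gcs_uri(bucket_name: str, blob_name: str) -> str:
    """Build the gs:// URI that the transcription step reads from (hypothetical helper)."""
    return f"gs://{bucket_name}/{blob_name}"


if __name__ == "__main__":
    print(enable_api_link("YOUR-PROJECT-ID"))
    print(gcs_uri("test-bucket-speech-n", "Armstrong_Small_Step2.ogg.mp3"))
```

Checking that the key file exists before exporting its path surfaces a wrong path immediately instead of as a credentials error later in the notebook.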
18 | 19 | 20 | ---------- 21 | -------------------------------------------------------------------------------- /transcribe.ipynb: -------------------------------------------------------------------------------- 1 | { 2 | "cells": [ 3 | { 4 | "cell_type": "markdown", 5 | "id": "friendly-palestinian", 6 | "metadata": {}, 7 | "source": [ 8 | "# GCP Speech-to-Text (STT) API example\n", 9 | "## This tutorial follows these steps:\n", 10 | "1. Install client libraries\n", 11 | "2. Enable the APIs\n", 12 | "3. Export credentials after downloading a service account key\n", 13 | "4. Creating a bucket\n", 14 | "5. Upload MP3 file\n", 15 | "6. Transcribe using the Speech-to-Text API.\n", 16 | "\n", 17 | "### See the documentation links for details.\n", 18 | "TODO add the documentation links\n", 19 | "\n", 20 | "### Replace every value marked ' # <--- CHANGE THIS ' throughout the notebook" 21 | ] 22 | }, 23 | { 24 | "cell_type": "markdown", 25 | "id": "focused-version", 26 | "metadata": {}, 27 | "source": [ 28 | "### 1. Install client libraries" 29 | ] 30 | }, 31 | { 32 | "cell_type": "code", 33 | "execution_count": null, 34 | "id": "varied-assumption", 35 | "metadata": { 36 | "tags": [] 37 | }, 38 | "outputs": [], 39 | "source": [ 40 | "!pip install --upgrade google-cloud-storage\n", 41 | "!pip install --upgrade google-cloud-speech" 42 | ] 43 | }, 44 | { 45 | "cell_type": "markdown", 46 | "id": "catholic-preference", 47 | "metadata": {}, 48 | "source": [ 49 | "### 2. 
Enable the APIs" 50 | ] 51 | }, 52 | { 53 | "cell_type": "code", 54 | "execution_count": 114, 55 | "id": "supreme-newport", 56 | "metadata": { 57 | "tags": [] 58 | }, 59 | "outputs": [ 60 | { 61 | "name": "stdout", 62 | "output_type": "stream", 63 | "text": [ 64 | "If this is the first time you are using the API in the project please enable the API via this link:\n", 65 | "\n", 66 | "https://console.developers.google.com/apis/api/speech.googleapis.com/overview?project=YOUR-PROJECT-ID\n" 67 | ] 68 | } 69 | ], 70 | "source": [ 71 | "PROJECT_ID = 'YOUR-PROJECT-ID' # <--- CHANGE THIS\n", 72 | "\n", 73 | "link_to_enable_API = \"https://console.developers.google.com/apis/api/speech.googleapis.com/overview?project=\"+PROJECT_ID\n", 74 | "\n", 75 | "print(\"If this is the first time you are using the API in the project please enable the API via this link:\\n\\n\"+ link_to_enable_API)" 76 | ] 77 | }, 78 | { 79 | "cell_type": "markdown", 80 | "id": "powered-horse", 81 | "metadata": {}, 82 | "source": [ 83 | "### Make sure to enable the API\n", 84 | "" 85 | ] 86 | }, 87 | { 88 | "cell_type": "markdown", 89 | "id": "important-graduate", 90 | "metadata": {}, 91 | "source": [ 92 | "### 3. 
Export credentials after downloading a service account key\n", 93 | "#### 3.1 Create and download a service account key\n", 94 | "https://cloud.google.com/iam/docs/creating-managing-service-account-keys#iam-service-account-keys-create-console\n", 95 | "\n", 96 | "Follow the instructions in the link above to create a **service account key** with the correct permissions (Project Owner) and download it as key.json.\n", 97 | "\n", 98 | "Export the path to the JSON key as an environment variable.\n", 99 | "#### 3.2 Add permissions to the service account\n", 100 | "TODO: add link to documentation here\n", 101 | "#### 3.3 Export the service account key path as an environment variable" 102 | ] 103 | }, 104 | { 105 | "cell_type": "code", 106 | "execution_count": 103, 107 | "id": "mounted-buddy", 108 | "metadata": {}, 109 | "outputs": [], 110 | "source": [ 111 | "# '!export' only affects a throwaway subshell, so set the variable with %env. # <--- CHANGE THIS\n", "%env GOOGLE_APPLICATION_CREDENTIALS=/path/to/key.json" 112 | ] 113 | }, 114 | { 115 | "cell_type": "code", 116 | "execution_count": 104, 117 | "id": "threaded-chancellor", 118 | "metadata": {}, 119 | "outputs": [ 120 | { 121 | "data": { 122 | "text/plain": [ 123 | "'AutoMLdemo-64ae368c71e2.json'" 124 | ] 125 | }, 126 | "execution_count": 104, 127 | "metadata": {}, 128 | "output_type": "execute_result" 129 | } 130 | ], 131 | "source": [ 132 | "# Check that the variable has been set correctly\n", 133 | "%env GOOGLE_APPLICATION_CREDENTIALS" 134 | ] 135 | }, 136 | { 137 | "cell_type": "code", 138 | "execution_count": 105, 139 | "id": "computational-acquisition", 140 | "metadata": {}, 141 | "outputs": [ 142 | { 143 | "name": "stdout", 144 | "output_type": "stream", 145 | "text": [ 146 | "AutoMLdemo-64ae368c71e2.json\n" 147 | ] 148 | } 149 | ], 150 | "source": [ 151 | "import os\n", 152 | "os.environ[\"GOOGLE_APPLICATION_CREDENTIALS\"] = \"/path/to/key.json\" # <--- CHANGE THIS\n", 153 | "print(os.environ[\"GOOGLE_APPLICATION_CREDENTIALS\"])" 154 | ] 155 | }, 156 | { 157 | "cell_type": "markdown", 158 | "id": "future-surrey", 159 | "metadata": {}, 160 | 
"source": [ 161 | "### 4. Creating a bucket" 162 | ] 163 | }, 164 | { 165 | "cell_type": "code", 166 | "execution_count": 106, 167 | "id": "fuzzy-extraction", 168 | "metadata": { 169 | "collapsed": true, 170 | "jupyter": { 171 | "outputs_hidden": true 172 | }, 173 | "tags": [] 174 | }, 175 | "outputs": [ 176 | { 177 | "ename": "Conflict", 178 | "evalue": "409 POST https://storage.googleapis.com/storage/v1/b?project=automl-demo-198411&prettyPrint=false: You already own this bucket. Please select another name.", 179 | "output_type": "error", 180 | "traceback": [ 181 | "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", 182 | "\u001b[0;31mConflict\u001b[0m Traceback (most recent call last)", 183 | "\u001b[0;32m\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 8\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Bucket {} created.\"\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbucket\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 10\u001b[0;31m \u001b[0mcreate_bucket\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mBUCKET_NAME\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 184 | "\u001b[0;32m\u001b[0m in \u001b[0;36mcreate_bucket\u001b[0;34m(bucket_name)\u001b[0m\n\u001b[1;32m 5\u001b[0m \u001b[0;32mdef\u001b[0m \u001b[0mcreate_bucket\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbucket_name\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 6\u001b[0m \u001b[0mstorage_client\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0mstorage\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mClient\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 7\u001b[0;31m \u001b[0mbucket\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mstorage_client\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcreate_bucket\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbucket_name\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 8\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 9\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m\"Bucket {} created.\"\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mformat\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mbucket\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mname\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 185 | "\u001b[0;32m~/projects/speech-to-text/venv/lib/python3.8/site-packages/google/cloud/storage/client.py\u001b[0m in \u001b[0;36mcreate_bucket\u001b[0;34m(self, bucket_or_name, requester_pays, project, user_project, location, predefined_acl, predefined_default_object_acl, timeout, retry)\u001b[0m\n\u001b[1;32m 603\u001b[0m \u001b[0mproperties\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;34m\"location\"\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mlocation\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 604\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 605\u001b[0;31m api_response = self._connection.api_request(\n\u001b[0m\u001b[1;32m 606\u001b[0m \u001b[0mmethod\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m\"POST\"\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 607\u001b[0m \u001b[0mpath\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0;34m\"/b\"\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 186 | 
"\u001b[0;32m~/projects/speech-to-text/venv/lib/python3.8/site-packages/google/cloud/storage/_http.py\u001b[0m in \u001b[0;36mapi_request\u001b[0;34m(self, *args, **kwargs)\u001b[0m\n\u001b[1;32m 76\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mretry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 77\u001b[0m \u001b[0mcall\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mretry\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mcall\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m---> 78\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mcall\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", 187 | "\u001b[0;32m~/projects/speech-to-text/venv/lib/python3.8/site-packages/google/api_core/retry.py\u001b[0m in \u001b[0;36mretry_wrapped_func\u001b[0;34m(*args, **kwargs)\u001b[0m\n\u001b[1;32m 279\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_initial\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_maximum\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mmultiplier\u001b[0m\u001b[0;34m=\u001b[0m\u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_multiplier\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 280\u001b[0m )\n\u001b[0;32m--> 281\u001b[0;31m return retry_target(\n\u001b[0m\u001b[1;32m 282\u001b[0m \u001b[0mtarget\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 283\u001b[0m \u001b[0mself\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0m_predicate\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 188 | "\u001b[0;32m~/projects/speech-to-text/venv/lib/python3.8/site-packages/google/api_core/retry.py\u001b[0m in \u001b[0;36mretry_target\u001b[0;34m(target, predicate, sleep_generator, deadline, on_error)\u001b[0m\n\u001b[1;32m 182\u001b[0m \u001b[0;32mfor\u001b[0m \u001b[0msleep\u001b[0m 
\u001b[0;32min\u001b[0m \u001b[0msleep_generator\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 183\u001b[0m \u001b[0;32mtry\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 184\u001b[0;31m \u001b[0;32mreturn\u001b[0m \u001b[0mtarget\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 185\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 186\u001b[0m \u001b[0;31m# pylint: disable=broad-except\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 189 | "\u001b[0;32m~/projects/speech-to-text/venv/lib/python3.8/site-packages/google/cloud/_http.py\u001b[0m in \u001b[0;36mapi_request\u001b[0;34m(self, method, path, query_params, data, content_type, headers, api_base_url, api_version, expect_json, _target_object, timeout)\u001b[0m\n\u001b[1;32m 481\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 482\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0;32mnot\u001b[0m \u001b[0;36m200\u001b[0m \u001b[0;34m<=\u001b[0m \u001b[0mresponse\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mstatus_code\u001b[0m \u001b[0;34m<\u001b[0m \u001b[0;36m300\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m--> 483\u001b[0;31m \u001b[0;32mraise\u001b[0m \u001b[0mexceptions\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mfrom_http_response\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0mresponse\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m\u001b[1;32m 484\u001b[0m \u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 485\u001b[0m \u001b[0;32mif\u001b[0m \u001b[0mexpect_json\u001b[0m \u001b[0;32mand\u001b[0m \u001b[0mresponse\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mcontent\u001b[0m\u001b[0;34m:\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n", 190 | "\u001b[0;31mConflict\u001b[0m: 409 POST 
https://storage.googleapis.com/storage/v1/b?project=automl-demo-198411&prettyPrint=false: You already own this bucket. Please select another name." 191 | ] 192 | } 193 | ], 194 | "source": [ 195 | "from google.cloud import storage\n", 196 | "\n", 197 | "BUCKET_NAME = \"test-bucket-speech-n\" # <--- CHANGE THIS\n", 198 | "\n", 199 | "def create_bucket(bucket_name): \n", 200 | " storage_client = storage.Client()\n", 201 | " bucket = storage_client.create_bucket(bucket_name)\n", 202 | "\n", 203 | " print(\"Bucket {} created.\".format(bucket.name))\n", 204 | "create_bucket(BUCKET_NAME)" 205 | ] 206 | }, 207 | { 208 | "cell_type": "markdown", 209 | "id": "egyptian-seattle", 210 | "metadata": {}, 211 | "source": [ 212 | "### 5. Upload MP3 file" 213 | ] 214 | }, 215 | { 216 | "cell_type": "code", 217 | "execution_count": 107, 218 | "id": "informational-australia", 219 | "metadata": {}, 220 | "outputs": [ 221 | { 222 | "name": "stdout", 223 | "output_type": "stream", 224 | "text": [ 225 | "File Armstrong_Small_Step.ogg.mp3 uploaded to Armstrong_Small_Step2.ogg.mp3.\n" 226 | ] 227 | } 228 | ], 229 | "source": [ 230 | "SOURCE_FILE_NAME = \"Armstrong_Small_Step.ogg.mp3\"\n", 231 | "DEST_FILE_NAME = \"Armstrong_Small_Step2.ogg.mp3\"\n", 232 | "def upload_blob(bucket_name, source_file_name, destination_blob_name):\n", 233 | " storage_client = storage.Client()\n", 234 | " bucket = storage_client.bucket(bucket_name)\n", 235 | " blob = bucket.blob(destination_blob_name)\n", 236 | " blob.upload_from_filename(source_file_name)\n", 237 | "\n", 238 | " print(\n", 239 | " \"File {} uploaded to {}.\".format(\n", 240 | " source_file_name, destination_blob_name\n", 241 | " )\n", 242 | " )\n", 243 | " \n", 244 | "upload_blob(BUCKET_NAME, SOURCE_FILE_NAME, DEST_FILE_NAME)" 245 | ] 246 | }, 247 | { 248 | "cell_type": "code", 249 | "execution_count": 117, 250 | "id": "amateur-theme", 251 | "metadata": {}, 252 | "outputs": [ 253 | { 254 | "name": "stdout", 255 | "output_type": "stream", 256 | 
"text": [ 257 | "gs://test-bucket-speech-n/2021020713284401905bff2636dcbb7e-partner-F3c69Fj6-972527143684.mp3\n", 258 | "gs://test-bucket-speech-n/20210207132906019075a793a86e9da4-partner-htCGaDFf-972506009909.mp3\n", 259 | "gs://test-bucket-speech-n/Armstrong_Small_Step.ogg.mp3\n", 260 | "gs://test-bucket-speech-n/Armstrong_Small_Step2.ogg.mp3\n" 261 | ] 262 | } 263 | ], 264 | "source": [ 265 | "# validating the file has been uploaded\n", 266 | "!gsutil ls gs://$BUCKET_NAME" 267 | ] 268 | }, 269 | { 270 | "cell_type": "markdown", 271 | "id": "informational-clarity", 272 | "metadata": {}, 273 | "source": [ 274 | "### 6. Transcribe using the Speech-to-Text API.\n" 275 | ] 276 | }, 277 | { 278 | "cell_type": "code", 279 | "execution_count": 109, 280 | "id": "satisfactory-latin", 281 | "metadata": { 282 | "tags": [] 283 | }, 284 | "outputs": [ 285 | { 286 | "name": "stdout", 287 | "output_type": "stream", 288 | "text": [ 289 | "gs://test-bucket-speech-n/Armstrong_Small_Step2.ogg.mp3\n", 290 | "Transcript: step off the Lem now\n", 291 | "Transcript: that's one small step for man\n", 292 | "Transcript: one giant leap for mankind\n" 293 | ] 294 | } 295 | ], 296 | "source": [ 297 | "from google.cloud import speech_v1p1beta1 as speech\n", 298 | "\n", 299 | "GCS_URI = 'gs://' + BUCKET_NAME + '/' + DEST_FILE_NAME\n", 300 | "print(GCS_URI)\n", 301 | "\n", 302 | "def transcribe_sync(storage_uri):\n", 303 | " \"\"\"\n", 304 | " Performs synchronous speech recognition on an audio file\n", 305 | "\n", 306 | " Args:\n", 307 | " storage_uri URI for audio file in Cloud Storage, e.g. gs://[BUCKET]/[FILE]\n", 308 | " \"\"\"\n", 309 | "\n", 310 | " client = speech.SpeechClient()\n", 311 | "\n", 312 | "\n", 313 | " # The language of the supplied audio\n", 314 | " language_code = \"en-US\"\n", 315 | "\n", 316 | " # Sample rate in Hertz of the audio data sent\n", 317 | " sample_rate_hertz = 44100\n", 318 | "\n", 319 | " # Encoding of audio data sent. 
This sample sets this explicitly.\n", 320 | " # This field is optional for FLAC and WAV audio formats.\n", 321 | " encoding = speech.RecognitionConfig.AudioEncoding.MP3\n", 322 | " config = {\n", 323 | " \"language_code\": language_code,\n", 324 | " \"sample_rate_hertz\": sample_rate_hertz,\n", 325 | " \"encoding\": encoding,\n", 326 | " \"model\": \"video\"\n", 327 | " }\n", 328 | " audio = {\"uri\": storage_uri}\n", 329 | "\n", 330 | " response = client.recognize(config=config, audio=audio)\n", 331 | "\n", 332 | " for result in response.results:\n", 333 | " # First alternative is the most probable result\n", 334 | " alternative = result.alternatives[0]\n", 335 | " print(u\"Transcript: {}\".format(alternative.transcript))\n", 336 | " \n", 337 | "transcribe_sync(GCS_URI)" 338 | ] 339 | }, 340 | { 341 | "cell_type": "code", 342 | "execution_count": 112, 343 | "id": "nutritional-communication", 344 | "metadata": {}, 345 | "outputs": [ 346 | { 347 | "name": "stdout", 348 | "output_type": "stream", 349 | "text": [ 350 | "gs://test-bucket-speech-n/Armstrong_Small_Step2.ogg.mp3\n", 351 | "Waiting for operation to complete...\n", 352 | "Transcript: step off the Lem now\n", 353 | "Transcript: that's one small step for man\n", 354 | "Transcript: one giant leap for mankind\n" 355 | ] 356 | } 357 | ], 358 | "source": [ 359 | "from google.cloud import speech_v1p1beta1 as speech\n", 360 | "\n", 361 | "GCS_URI = 'gs://' + BUCKET_NAME + '/' + DEST_FILE_NAME\n", 362 | "print(GCS_URI)\n", 363 | "\n", 364 | "def transcribe_async(gcs_uri):\n", 365 | " \"\"\"Asynchronously transcribes the audio file specified by the gcs_uri.\"\"\"\n", 366 | "\n", 367 | " client = speech.SpeechClient()\n", 368 | "\n", 369 | " audio = speech.RecognitionAudio(uri=gcs_uri)\n", 370 | " encoding = speech.RecognitionConfig.AudioEncoding.MP3\n", 371 | " language_code = \"en-US\" #iw-IL\n", 372 | " sample_rate_hertz = 44100\n", 373 | " \n", 374 | " config = {\n", 375 | " \"language_code\": language_code,\n", 
376 | " \"sample_rate_hertz\": sample_rate_hertz,\n", 377 | " \"encoding\": encoding,\n", 378 | " \"use_enhanced\":True,\n", 379 | " \"model\": \"video\"\n", 380 | " }\n", 381 | "\n", 382 | " operation = client.long_running_recognize(config=config, audio=audio)\n", 383 | "\n", 384 | " print(\"Waiting for operation to complete...\")\n", 385 | " response = operation.result(timeout=90)\n", 386 | "\n", 387 | " # Each result is for a consecutive portion of the audio. Iterate through\n", 388 | " # them to get the transcripts for the entire audio file.\n", 389 | " for result in response.results:\n", 390 | " # The first alternative is the most likely one for this portion.\n", 391 | " print(u\"Transcript: {}\".format(result.alternatives[0].transcript))\n", 392 | " #print(\"Confidence: {}\".format(result.alternatives[0].confidence))\n", 393 | "transcribe_async(GCS_URI) \n" 394 | ] 395 | }, 396 | { 397 | "cell_type": "code", 398 | "execution_count": null, 399 | "id": "victorian-samoa", 400 | "metadata": {}, 401 | "outputs": [], 402 | "source": [] 403 | } 404 | ], 405 | "metadata": { 406 | "kernelspec": { 407 | "display_name": "Python 3", 408 | "language": "python", 409 | "name": "python3" 410 | }, 411 | "language_info": { 412 | "codemirror_mode": { 413 | "name": "ipython", 414 | "version": 3 415 | }, 416 | "file_extension": ".py", 417 | "mimetype": "text/x-python", 418 | "name": "python", 419 | "nbconvert_exporter": "python", 420 | "pygments_lexer": "ipython3", 421 | "version": "3.8.2" 422 | } 423 | }, 424 | "nbformat": 4, 425 | "nbformat_minor": 5 426 | } 427 | --------------------------------------------------------------------------------