├── .Rbuildignore ├── .gitignore ├── .travis.yml ├── CONTRIBUTING.md ├── CRAN-RELEASE ├── DESCRIPTION ├── Dockerfile ├── LICENSE ├── NAMESPACE ├── NEWS.md ├── R ├── auth.R ├── googleLanguageR.R ├── natural-language.R ├── options.R ├── speech-to-text.R ├── text-to-speech.R ├── translate-document.R ├── translate.R ├── utilities.R └── version.R ├── README.md ├── _pkgdown.yml ├── cloud_build ├── build.R ├── cloudbuild-pkgdown.yml └── cloudbuild-tests.yml ├── cran-comments.md ├── docs ├── .gitignore ├── 404.html ├── CONTRIBUTING.html ├── LICENSE-text.html ├── LICENSE.html ├── articles │ ├── Intro.html │ ├── find_html_node.png │ ├── index.html │ ├── nlp.R │ ├── nlp.html │ ├── setup.R │ ├── setup.html │ ├── speech.R │ ├── speech.html │ ├── text-to-speech.html │ ├── translation.R │ └── translation.html ├── authors.html ├── bootstrap-toc.css ├── bootstrap-toc.js ├── docsearch.css ├── docsearch.js ├── docsearch.json ├── index.html ├── jquery.sticky-kit.min.js ├── link.svg ├── news │ └── index.html ├── pkgdown.css ├── pkgdown.js ├── pkgdown.yml ├── reference │ ├── gl_auth.html │ ├── gl_nlp.html │ ├── gl_speech.html │ ├── gl_speech_op.html │ ├── gl_talk.html │ ├── gl_talk_languages.html │ ├── gl_talk_player.html │ ├── gl_talk_shiny.html │ ├── gl_talk_shinyUI.html │ ├── gl_translate.html │ ├── gl_translate_detect.html │ ├── gl_translate_languages.html │ ├── googleLanguageR.html │ ├── index.html │ ├── is.NullOb.html │ └── rmNullObs.html └── sitemap.xml ├── googleLanguageR.Rproj ├── inst ├── shiny │ └── capture_speech │ │ ├── DESCRIPTION │ │ ├── README.html │ │ ├── README.md │ │ ├── babelfish.png │ │ ├── server.R │ │ ├── ui.R │ │ └── www │ │ ├── audiodisplay.js │ │ ├── main.js │ │ ├── mic128.png │ │ ├── recorderWorker.js │ │ ├── save.svg │ │ ├── speech.js │ │ └── style.css ├── test-doc-no.pdf ├── test-doc.pdf └── woman1_wb.wav ├── man ├── gl_auth.Rd ├── gl_nlp.Rd ├── gl_speech.Rd ├── gl_speech_op.Rd ├── gl_talk.Rd ├── gl_talk_languages.Rd ├── gl_talk_player.Rd ├── gl_talk_shiny.Rd ├── gl_talk_shinyUI.Rd ├── gl_translate.Rd ├── gl_translate_detect.Rd ├── gl_translate_document.Rd ├── gl_translate_languages.Rd ├── googleLanguageR.Rd ├── is.NullOb.Rd └── rmNullObs.Rd ├── tests ├── testthat.R └── testthat │ ├── comments.rds │ ├── prep_tests.R │ ├── test-translate-document.R │ └── test_gl.R └── vignettes ├── nlp.Rmd ├── nlp.html ├── setup.Rmd ├── setup.html ├── speech.Rmd ├── speech.html ├── text-to-speech.Rmd ├── text-to-speech.html ├── translation.Rmd └── translation.html /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^.*\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^docs$ 4 | ^_pkgdown\.yml$ 5 | ^\.travis\.yml$ 6 | ^CONTRIBUTING\.md$ 7 | ^README\.Rmd$ 8 | ^_gcssave\.yml$ 9 | ^\.httr-oauth$ 10 | ^cran-comments\.md$ 11 | ^\.Renviron$ 12 | ^cloud_build$ 13 | ^CRAN-RELEASE$ 14 | ^Dockerfile$ 15 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | *.history 5 | _gcssave.yml 6 | .httr-oauth 7 | inst/OSR_us_000_0010_8k.wav 8 | output.wav 9 | player.html 10 | .Renviron 11 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | language: r 2 | cache: packages 3 | notifications: 4 | slack: googleauthrverse:tGfXjSD58cQSEr1YuzQ5hKPS 5 | email: 6 | on_success: change 7 | on_failure: change 
8 | r_packages: 9 | - knitr 10 | - covr 11 | - drat 12 | - readr 13 | after_success: 14 | - Rscript -e 'library("covr");codecov(line_exclusions = list("R/options.R","R/utilities.R"))' 15 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing to googleLanguageR 2 | 3 | Thank you for your interest in contributing to this project! 4 | 5 | To run unit tests, saved API calls are cached in the `tests/testthat/mock` folder. These substitute for an API call to Google and avoid authentication. 6 | 7 | The API calls to the Cloud Speech API and translation varies slightly, so test success is judged if the string is within 10 characters of the test string. For this test then, the `stringdist` package is needed (under the package `Suggests`) 8 | 9 | To run integration tests that hit the API, you will need to add your own authentication service JSON file from Google Cloud projects. Save this file to your computer and then set an environment variable `GL_AUTH` pointing to the file location. If not present, (such as on CRAN or Travis) the integration tests will be skipped. 10 | 11 | You will need to enable the following APIs: 12 | 13 | * [Google Cloud Speech API](https://console.developers.google.com/apis/api/speech.googleapis.com/overview) 14 | * [Google Cloud Natural Language API](https://console.developers.google.com/apis/api/language.googleapis.com/overview) 15 | * [Google Cloud Translation API](https://console.developers.google.com/apis/api/translate.googleapis.com/overview) 16 | 17 | To create new mock files, it needs to fully load the package so do: 18 | 19 | ``` 20 | remotes::install_github("ropensci/googleLanguageR") 21 | setwd("tests/testthat") 22 | source("test_unit.R") 23 | ``` 24 | 25 | ## Contributor Covenant Code of Conduct. 26 | 27 | * Any contributors should be doing so for the joy of creating and sharing and advancing knowledge. Treat each other with the respect this deserves. 28 | * The main language is English. 29 | * The community will tolerate everything but intolerance. It should be assumed everyone is trying to be tolerant until they repeatedly prove otherwise. 30 | * Don't break any laws or copyrights during contributions, credit sources where it is due. 31 | -------------------------------------------------------------------------------- /CRAN-RELEASE: -------------------------------------------------------------------------------- 1 | This package was submitted to CRAN on 2020-04-19. 2 | Once it is accepted, delete this file and tag the release (commit 119d130bcf). 
3 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: googleLanguageR 2 | Title: Call Google's 'Natural Language' API, 'Cloud Translation' API, 3 | 'Cloud Speech' API and 'Cloud Text-to-Speech' API 4 | Version: 0.3.0.9000 5 | Authors@R: c(person("Aleksander", "Dietrichson",email = "dietrichson@gmail.com", role=c("ctb")), 6 | person("Mark", "Edmondson", email = "r@sunholo.com", role = c("aut", "cre")), 7 | person("John", "Muschelli", email = "muschellij2@gmail.com", role = c("ctb")), 8 | person("Neal", "Richardson", email = "neal.p.richardson@gmail.com", role = "rev", 9 | comment = "Neal reviewed the package for ropensci, 10 | see "), 11 | person("Julia", "Gustavsen", email = "j.gustavsen@gmail.com", role = "rev", 12 | comment = "Julia reviewed the package for ropensci, 13 | see ") 14 | ) 15 | Description: Call 'Google Cloud' machine learning APIs for text and speech tasks. 16 | Call the 'Cloud Translation' API for detection 17 | and translation of text, the 'Natural Language' API to 18 | analyse text for sentiment, entities or syntax, the 'Cloud Speech' API 19 | to transcribe sound files to text and 20 | the 'Cloud Text-to-Speech' API to turn text 21 | into sound files. 22 | URL: http://code.markedmondson.me/googleLanguageR/, https://github.com/ropensci/googleLanguageR, https://docs.ropensci.org/googleLanguageR/ 23 | BugReports: https://github.com/ropensci/googleLanguageR/issues 24 | Depends: R (>= 3.3) 25 | License: MIT + file LICENSE 26 | Encoding: UTF-8 27 | LazyData: true 28 | RoxygenNote: 7.2.3 29 | VignetteBuilder: knitr 30 | Imports: 31 | assertthat, 32 | base64enc, 33 | googleAuthR (>= 1.1.1), 34 | jsonlite, 35 | magrittr, 36 | purrr (>= 0.2.4), 37 | stats, 38 | tibble, 39 | utils 40 | Suggests: 41 | pdftools, 42 | cld2, 43 | testthat, 44 | knitr, 45 | rmarkdown, 46 | rvest, 47 | shiny, 48 | shinyjs, 49 | stringdist, 50 | tidyr, 51 | tuneR, 52 | xml2 53 | -------------------------------------------------------------------------------- /Dockerfile: -------------------------------------------------------------------------------- 1 | FROM gcr.io/mark-edmondson-gde/googleauthr 2 | 3 | RUN ["install2.r", "googleLanguageR"] 4 | 5 | RUN ["installGithub.r", "MarkEdmondson1234/googleCloudRunner"] 6 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | YEAR: 2017 2 | COPYRIGHT HOLDER: Sunholo Ltd. 
3 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | S3method(print,gl_speech_op) 4 | export(gl_auth) 5 | export(gl_auto_auth) 6 | export(gl_nlp) 7 | export(gl_speech) 8 | export(gl_speech_op) 9 | export(gl_talk) 10 | export(gl_talk_languages) 11 | export(gl_talk_player) 12 | export(gl_talk_shiny) 13 | export(gl_talk_shinyUI) 14 | export(gl_translate) 15 | export(gl_translate_detect) 16 | export(gl_translate_document) 17 | export(gl_translate_languages) 18 | import(assertthat) 19 | import(base64enc) 20 | importFrom(base64enc,base64decode) 21 | importFrom(base64enc,base64encode) 22 | importFrom(googleAuthR,gar_api_generator) 23 | importFrom(googleAuthR,gar_attach_auto_auth) 24 | importFrom(googleAuthR,gar_auth_service) 25 | importFrom(jsonlite,unbox) 26 | importFrom(magrittr,"%>%") 27 | importFrom(purrr,compact) 28 | importFrom(purrr,is_empty) 29 | importFrom(purrr,map) 30 | importFrom(purrr,map_chr) 31 | importFrom(stats,setNames) 32 | importFrom(tibble,as_tibble) 33 | importFrom(tibble,enframe) 34 | importFrom(tibble,tibble) 35 | importFrom(utils,URLencode) 36 | importFrom(utils,browseURL) 37 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | # 0.3.0.9000 2 | 3 | * ... 4 | 5 | # 0.3.0 6 | 7 | * Improved error handling for vectorised `gl_nlp()` (#55) 8 | * `gl_nlp()`'s classifyText returns list of data.frames, not data.frame 9 | * Fix `gl_nlp` when `nlp_type='classifyText'` 10 | * `customConfig` available for `gl_speech` 11 | * Add support for SSML for `gl_talk()` (#66) 12 | * Add support for device profiles for `gl_talk()` (#67) 13 | * Add support for tuneR wave objects in `gl_speech()` - (#62 thanks @muschellij2) 14 | * Add check for file size for audio source - (#62 thanks @muschellij2) 15 | 16 | # 0.2.0 17 | 18 | * Added an example Shiny app that calls the Speech API 19 | * Fixed bug where cbind of any missing API content raised an error (#28) 20 | * Add Google text to speech via `gl_talk()` (#39) 21 | * Add classify text endpoint for `gl_nlp()` (#20) 22 | 23 | # 0.1.1 24 | 25 | * Fix bug where `gl_speech()` only returned first few seconds of translation when asynch (#23) 26 | * CRAN version carries stable API, GitHub version for beta features 27 | 28 | # 0.1.0 29 | 30 | 31 | * Natural language API via `gl_nlp` 32 | * Speech annotation via `gl_speech_recognise` 33 | * Translation detection and performance via `gl_translate_detect` and `gl_translate` 34 | * Vectorised support for inputs 35 | * Translating HTML support 36 | * Tibble outputs 37 | -------------------------------------------------------------------------------- /R/auth.R: -------------------------------------------------------------------------------- 1 | #' Authenticate with Google language API services 2 | #' 3 | #' @param json_file Authentication json file you have downloaded from your Google Project 4 | #' 5 | #' @details 6 | #' 7 | #' The best way to authenticate is to use an environment argument pointing at your authentication file. 
8 | #' 9 | #' Set the file location of your download Google Project JSON file in a \code{GL_AUTH} argument 10 | #' 11 | #' Then, when you load the library you should auto-authenticate 12 | #' 13 | #' However, you can authenticate directly using this function pointing at your JSON auth file. 14 | #' 15 | #' @examples 16 | #' 17 | #' \dontrun{ 18 | #' library(googleLanguageR) 19 | #' gl_auth("location_of_json_file.json") 20 | #' } 21 | #' 22 | #' @export 23 | #' @importFrom googleAuthR gar_auth_service gar_attach_auto_auth 24 | gl_auth <- function(json_file){ 25 | options(googleAuthR.scopes.selected = c("https://www.googleapis.com/auth/cloud-language", 26 | "https://www.googleapis.com/auth/cloud-platform")) 27 | googleAuthR::gar_auth_service(json_file = json_file) 28 | } 29 | 30 | #' @export 31 | #' @rdname gl_auth 32 | #' @param ... additional argument to 33 | #' pass to \code{\link{gar_attach_auto_auth}}. 34 | #' 35 | #' @examples 36 | #' \dontrun{ 37 | #' library(googleLanguageR) 38 | #' gl_auto_auth() 39 | #' gl_auto_auth(environment_var = "GAR_AUTH_FILE") 40 | #' } 41 | gl_auto_auth <- function(...){ 42 | required_scopes = c("https://www.googleapis.com/auth/cloud-language", 43 | "https://www.googleapis.com/auth/cloud-platform") 44 | googleAuthR::gar_attach_auto_auth( 45 | required_scopes = required_scopes, 46 | ...) 47 | } 48 | -------------------------------------------------------------------------------- /R/googleLanguageR.R: -------------------------------------------------------------------------------- 1 | #' googleLanguageR 2 | #' 3 | #' This package contains functions for analysing language through the 4 | #' Google Cloud Machine Learning APIs 5 | #' 6 | #' For examples and documentation see the vignettes and the website: 7 | #' 8 | #' \url{http://code.markedmondson.me/googleLanguageR/} 9 | #' 10 | #' @seealso \url{https://cloud.google.com/products/machine-learning/} 11 | #' 12 | #' @docType package 13 | #' @name googleLanguageR 14 | NULL 15 | -------------------------------------------------------------------------------- /R/options.R: -------------------------------------------------------------------------------- 1 | .onLoad <- function(libname, pkgname) { 2 | 3 | op <- options() 4 | op.googleLanguageR <- list( 5 | googleAuthR.scopes.selected = c("https://www.googleapis.com/auth/cloud-language", 6 | "https://www.googleapis.com/auth/cloud-platform") 7 | ) 8 | 9 | toset <- !(names(op.googleLanguageR) %in% names(op)) 10 | 11 | if(any(toset)) options(op.googleLanguageR[toset]) 12 | 13 | invisible() 14 | 15 | } 16 | 17 | .onAttach <- function(libname, pkgname){ 18 | 19 | needed <- c("https://www.googleapis.com/auth/cloud-language", 20 | "https://www.googleapis.com/auth/cloud-platform") 21 | 22 | googleAuthR::gar_attach_auto_auth(needed, 23 | environment_var = "GL_AUTH") 24 | 25 | invisible() 26 | 27 | } 28 | 29 | 30 | 31 | -------------------------------------------------------------------------------- /R/translate-document.R: -------------------------------------------------------------------------------- 1 | #' Translate document 2 | #' 3 | #' Translate a document via the Google Translate API 4 | #' 5 | 6 | #' 7 | #' @param d_path path of the document to be translated 8 | #' @param output_path where to save the translated document 9 | #' @param format currently only pdf-files are supported 10 | #' 11 | #' @return output filename 12 | #' @family translations 13 | #' @import assertthat 14 | #' @importFrom base64enc base64encode 15 | #' @importFrom base64enc base64decode 16 | #' 
@importFrom utils URLencode 17 | #' @importFrom googleAuthR gar_api_generator 18 | #' @importFrom tibble as_tibble 19 | #' @importFrom stats setNames 20 | #' @export 21 | #' 22 | #' @examples 23 | #' 24 | #' \dontrun{ 25 | #' gl_translate_document(system.file(package = "googleLanguageR","test-doc.pdf"), "no") 26 | #' 27 | #' } 28 | gl_translate_document <- function(d_path, 29 | target = "es-ES", 30 | output_path = "out.pdf", 31 | format = c("pdf"), 32 | source = 'en-UK', 33 | model = c("nmt", "base"), 34 | 35 | location = "global"){ 36 | 37 | ## Checks 38 | assert_that(is.character(d_path), 39 | is.character(output_path), 40 | is.string(target), 41 | is.string(source)) 42 | 43 | format <- match.arg(format) 44 | model <- match.arg(model) 45 | 46 | format <- paste0("application/",format) 47 | 48 | if(file.exists(output_path)) stop("Output file already exists.") 49 | 50 | payload <- 51 | list( 52 | target_language_code = target, 53 | source_language_code = source, 54 | document_input_config = list( 55 | mimeType = format, 56 | content = base64encode(d_path) 57 | ) 58 | ) 59 | 60 | 61 | project_id <- gar_token()$auth_token$secrets$project_id 62 | LOCATION <- location 63 | 64 | my_URI <- paste0( 65 | "https://translation.googleapis.com/v3beta1/projects/", 66 | project_id, 67 | "/locations/", 68 | location,":translateDocument") 69 | 70 | 71 | call_api <- gar_api_generator(my_URI, "POST" ) 72 | 73 | me <- tryCatch(call_api(the_body = payload), error=function(e){print(e)}) 74 | 75 | writeBin( 76 | base64decode( 77 | me$content$documentTranslation[[1]] 78 | ), output_path) 79 | 80 | path.expand(output_path) 81 | 82 | } 83 | 84 | -------------------------------------------------------------------------------- /R/utilities.R: -------------------------------------------------------------------------------- 1 | 2 | #' A helper function that tests whether an object is either NULL _or_ 3 | #' a list of NULLs 4 | #' 5 | #' @keywords internal 6 | is.NullOb <- function(x) is.null(x) | all(sapply(x, is.null)) 7 | 8 | #' Recursively step down into list, removing all such objects 9 | #' 10 | #' @keywords internal 11 | rmNullObs <- function(x) { 12 | x <- Filter(Negate(is.NullOb), x) 13 | lapply(x, function(x) if (is.list(x)) rmNullObs(x) else x) 14 | } 15 | 16 | # safe cbind that removes df with no rows 17 | my_cbind <- function(...){ 18 | dots <- list(...) 
19 | 20 | nrows <- vapply(dots, function(x) nrow(x) > 0, logical(1)) 21 | 22 | dots <- dots[nrows] 23 | 24 | do.call(cbind, args = dots) 25 | 26 | } 27 | 28 | #' base R safe rbind 29 | #' 30 | #' Send in a list of data.fames with different column names 31 | #' 32 | #' @return one data.frame 33 | #' a safe rbind for variable length columns 34 | #' @noRd 35 | my_reduce_rbind <- function(x){ 36 | classes <- lapply(x, inherits, what = "data.frame") 37 | stopifnot(all(unlist(classes))) 38 | 39 | # all possible names 40 | df_names <- Reduce(union, lapply(x, names)) 41 | 42 | df_same_names <- lapply(x, function(y){ 43 | missing_names <- setdiff(df_names,names(y)) 44 | num_col <- length(missing_names) 45 | if(num_col > 0){ 46 | missing_cols <- vapply(missing_names, function(i) NA, NA, USE.NAMES = TRUE) 47 | new_df <- data.frame(matrix(missing_cols, ncol = num_col)) 48 | names(new_df) <- names(missing_cols) 49 | y <- cbind(y, new_df, row.names = NULL) 50 | } 51 | 52 | y[, df_names] 53 | 54 | }) 55 | 56 | Reduce(rbind, df_same_names) 57 | } 58 | 59 | # purrr's map_df without dplyr 60 | my_map_df <- function(.x, .f, ...){ 61 | tryCatch( 62 | { 63 | .f <- purrr::as_mapper(.f, ...) 64 | res <- map(.x, .f, ...) 65 | my_reduce_rbind(res) 66 | }, 67 | error = function(err){ 68 | warning("Could not parse object with names: ", paste(names(.x), collapse = " ")) 69 | .x 70 | } 71 | 72 | ) 73 | 74 | 75 | } 76 | 77 | 78 | #' @importFrom jsonlite unbox 79 | #' @noRd 80 | jubox <- function(x){ 81 | unbox(x) 82 | } 83 | 84 | # tests if a google storage URL 85 | is.gcs <- function(x){ 86 | out <- grepl("^gs://", x) 87 | if(out){ 88 | my_message("Using Google Storage URI: ", x, level = 3) 89 | } 90 | out 91 | } 92 | 93 | # controls when messages are sent to user via an option 94 | # 1 = low level, 2= debug, 3=normal 95 | my_message <- function(..., level = 1){ 96 | 97 | compare_level <- getOption("googleAuthR.verbose", default = 1) 98 | 99 | if(level >= compare_level){ 100 | message(Sys.time()," -- ", ...) 
101 | } 102 | 103 | } 104 | 105 | -------------------------------------------------------------------------------- /R/version.R: -------------------------------------------------------------------------------- 1 | # used to specify the API endpoint - on GitHub beta, on CRAN, stable 2 | get_version <- function(api = c("nlp","tts","speech","trans")){ 3 | 4 | api <- match.arg(api) 5 | 6 | # github uses beta endpoints 7 | if(grepl("\\.9...$", utils::packageVersion("googleLanguageR"))){ 8 | 9 | version <- switch(api, 10 | speech = "v1p1beta1", 11 | nlp = "v1beta2", 12 | tts = "v1beta1", 13 | trans = "v2") # no beta 14 | } else { 15 | version <- "v1" 16 | } 17 | 18 | version 19 | } 20 | -------------------------------------------------------------------------------- /_pkgdown.yml: -------------------------------------------------------------------------------- 1 | title: googleLanguageR 2 | url: https://code.markedmondson.me/googleLanguageR/ 3 | home: 4 | strip_header: true 5 | authors: 6 | Mark Edmondson: 7 | href: http://code.markedmondson.me 8 | template: 9 | params: 10 | bootswatch: cosmo 11 | ganalytics: UA-47480439-2 12 | navbar: 13 | title: googleLanguageR 14 | type: inverse 15 | structure: 16 | left: [translation, nlp, speech, text, help] 17 | right: [news, search, github, lightswitch] 18 | components: 19 | translation: 20 | text: "Translation API" 21 | icon: fa-globe fa-lg 22 | href: articles/translation.html 23 | nlp: 24 | text: "NLP API" 25 | icon: fa-object-group fa-lg 26 | href: articles/nlp.html 27 | speech: 28 | text: "Speech API" 29 | icon: fa-comment fa-lg 30 | href: articles/speech.html 31 | text: 32 | text: "Text-to-Speech API" 33 | icon: fa-phone fa-lg 34 | href: articles/text-to-speech.html 35 | help: 36 | text: "Help" 37 | menu: 38 | - text: "Function Reference" 39 | href: reference/index.html 40 | icon: fa-info 41 | - icon: fa-google 42 | text: "Google Documentation" 43 | href: https://cloud.google.com/products/machine-learning/ 44 | - icon: fa-github 45 | text: "Development site" 46 | href: https://github.com/ropensci/googleLanguageR 47 | - icon: fa-slack 48 | text: "Slack Support Channel" 49 | href: https://docs.google.com/forms/d/e/1FAIpQLSerjirmMpB3b7LmBs_Vx_XPIE9IrhpCpPg1jUcpfBcivA3uBw/viewform 50 | 51 | -------------------------------------------------------------------------------- /cloud_build/build.R: -------------------------------------------------------------------------------- 1 | library(googleCloudRunner) 2 | 3 | cr_deploy_packagetests( 4 | steps = cr_buildstep_secret("googlelanguager-auth", "/workspace/auth.json"), 5 | cloudbuild_file = "cloud_build/cloudbuild-tests.yml", 6 | timeout = 2400, 7 | env = c("NOT_CRAN=true","GL_AUTH=/workspace/auth.json") 8 | ) 9 | 10 | cr_deploy_pkgdown( 11 | steps = cr_buildstep_secret("googlelanguager-auth", "/workspace/auth.json"), 12 | secret = "github-ssh", 13 | github_repo = "ropensci/googleLanguageR", 14 | cloudbuild_file = "cloud_build/cloudbuild-pkgdown.yml", 15 | env = "GL_AUTH=/workspace/auth.json", 16 | post_clone = cr_buildstep_bash( 17 | c("git remote set-url --push origin git@github.com:MarkEdmondson1234/googleLanguageR.git"), 18 | name = "gcr.io/cloud-builders/git", 19 | entrypoint = "bash", 20 | dir = "repo") 21 | ) 22 | -------------------------------------------------------------------------------- /cloud_build/cloudbuild-pkgdown.yml: -------------------------------------------------------------------------------- 1 | steps: 2 | - name: gcr.io/cloud-builders/gcloud 3 | entrypoint: bash 4 | args: 5 | - -c 6 | 
- gcloud secrets versions access latest --secret=googlelanguager-auth > /workspace/auth.json 7 | - name: gcr.io/cloud-builders/gcloud 8 | entrypoint: bash 9 | args: 10 | - -c 11 | - gcloud secrets versions access latest --secret=github-ssh > /root/.ssh/id_rsa 12 | id: git secret 13 | volumes: 14 | - name: ssh 15 | path: /root/.ssh 16 | - name: gcr.io/cloud-builders/git 17 | entrypoint: bash 18 | args: 19 | - -c 20 | - |- 21 | chmod 600 /root/.ssh/id_rsa 22 | cat <known_hosts 23 | github.com ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAq2A7hRGmdnm9tUDbO9IDSwBK6TbQa+PXYPCPy6rbTrTtw7PHkccKrpp0yVhp5HdEIcKr6pLlVDBfOLX9QUsyCOV0wzfjIJNlGEYsdlLJizHhbn2mUjvSAHQqZETYP81eFzLQNnPHt4EVVUh7VfDESU84KezmD5QlWpXLmvU31/yMf+Se8xhHTvKSCZIFImWwoG6mbUoWf9nzpIoaSjB+weqqUUmpaaasXVal72J+UX2B+2RPW3RcT0eOzQgqlJL3RKrTJvdsjE3JEAvGq3lGHSZXy28G3skua2SmVi/w4yCE6gbODqnTWlg7+wC604ydGXA8VJiS5ap43JXiUFFAaQ== 24 | EOF 25 | cat </root/.ssh/config 26 | Hostname github.com 27 | IdentityFile /root/.ssh/id_rsa 28 | EOF 29 | mv known_hosts /root/.ssh/known_hosts 30 | git config --global user.name "googleCloudRunner" 31 | git config --global user.email "cr_buildstep_gitsetup@googleCloudRunner.com" 32 | id: git setup script 33 | volumes: 34 | - name: ssh 35 | path: /root/.ssh 36 | - name: gcr.io/cloud-builders/git 37 | args: 38 | - clone 39 | - git@github.com:ropensci/googleLanguageR 40 | - repo 41 | id: clone to repo dir 42 | volumes: 43 | - name: ssh 44 | path: /root/.ssh 45 | - name: gcr.io/cloud-builders/git 46 | entrypoint: bash 47 | args: 48 | - -c 49 | - |- 50 | git remote -v 51 | git remote set-url --push origin git@github.com:MarkEdmondson1234/googleLanguageR.git 52 | git remote -v 53 | dir: repo 54 | - name: gcr.io/gcer-public/packagetools:master 55 | args: 56 | - Rscript 57 | - -e 58 | - |- 59 | devtools::install() 60 | pkgdown::build_site() 61 | id: build pkgdown 62 | dir: repo 63 | env: 64 | - GL_AUTH=/workspace/auth.json 65 | - name: gcr.io/cloud-builders/git 66 | args: 67 | - add 68 | - --all 69 | dir: repo 70 | volumes: 71 | - name: ssh 72 | path: /root/.ssh 73 | - name: gcr.io/cloud-builders/git 74 | args: 75 | - commit 76 | - -a 77 | - -m 78 | - "[skip travis] Build website from commit ${COMMIT_SHA}: \n$(date +\"%Y%m%dT%H:%M:%S\")" 79 | dir: repo 80 | volumes: 81 | - name: ssh 82 | path: /root/.ssh 83 | - name: gcr.io/cloud-builders/git 84 | args: 85 | - status 86 | dir: repo 87 | volumes: 88 | - name: ssh 89 | path: /root/.ssh 90 | - name: gcr.io/cloud-builders/git 91 | args: 92 | - push 93 | - --force 94 | dir: repo 95 | volumes: 96 | - name: ssh 97 | path: /root/.ssh 98 | #Generated by googleCloudRunner::cr_build_write at 2020-04-20 01:04:32 99 | -------------------------------------------------------------------------------- /cloud_build/cloudbuild-tests.yml: -------------------------------------------------------------------------------- 1 | steps: 2 | - name: gcr.io/cloud-builders/gcloud 3 | entrypoint: bash 4 | args: 5 | - -c 6 | - gcloud secrets versions access latest --secret=googlelanguager-auth > /workspace/auth.json 7 | - name: gcr.io/gcer-public/packagetools:master 8 | args: 9 | - Rscript 10 | - -e 11 | - |- 12 | message("cran mirror: ", getOption("repos")) 13 | remotes::install_deps(dependencies = TRUE) 14 | rcmdcheck::rcmdcheck(args = '--no-manual', error_on = 'warning') 15 | env: 16 | - NOT_CRAN=true 17 | - GL_AUTH=/workspace/auth.json 18 | - name: gcr.io/gcer-public/packagetools:master 19 | args: 20 | - Rscript 21 | - -e 22 | - |- 23 | remotes::install_deps(dependencies = TRUE) 24 | remotes::install_local() 
25 | cv <- covr::package_coverage() 26 | print(cv) 27 | covr::codecov(coverage=cv, commit = '$COMMIT_SHA', branch = '$BRANCH_NAME') 28 | env: 29 | - NOT_CRAN=true 30 | - GL_AUTH=/workspace/auth.json 31 | - CODECOV_TOKEN=$_CODECOV_TOKEN 32 | timeout: 2400s 33 | #Generated by googleCloudRunner::cr_build_write at 2020-04-20 01:04:32 34 | -------------------------------------------------------------------------------- /cran-comments.md: -------------------------------------------------------------------------------- 1 | ## Test environments 2 | * local OS X install, R 3.6.3 3 | * ubuntu 14.04.5 LTS R 3.6.3 4 | * rhub - Windows Server 2008 R2 SP1, R-devel, 32/64 bit, R 3.6.3 5 | 6 | ## R CMD check results 7 | 8 | 0 errors | 0 warnings | 0 notes 9 | 10 | ## Reverse dependencies 11 | 12 | There are no reverse dependencies. 13 | 14 | -------------------------------------------------------------------------------- /docs/.gitignore: -------------------------------------------------------------------------------- 1 | .httr-oauth 2 | -------------------------------------------------------------------------------- /docs/404.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Page not found (404) • googleLanguageR 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
[pkgdown-generated 404 page: navigation and footer markup stripped. Body text: "Content not found. Please use links in the navbar."]
204 | 205 | 206 | 207 | 208 | 209 | 210 | 211 | 212 | -------------------------------------------------------------------------------- /docs/LICENSE-text.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | License • googleLanguageR 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
[pkgdown-generated page rendering the LICENSE file — YEAR: 2017, COPYRIGHT HOLDER: Sunholo Ltd.; navigation and footer markup stripped.]
206 | 207 | 208 | 209 | 210 | 211 | 212 | 213 | 214 | -------------------------------------------------------------------------------- /docs/LICENSE.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | License • googleLanguageR 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 35 | 36 | 37 | 38 | 48 | 49 | 50 | 51 | 52 |
[pkgdown-generated LICENSE page with the same content — YEAR: 2017, COPYRIGHT HOLDER: Sunholo Ltd.; navigation and footer markup stripped.]
157 | 158 | 159 | 160 | -------------------------------------------------------------------------------- /docs/articles/find_html_node.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci/googleLanguageR/7c6f93b0977ac7ac2189a6b5648362b12509c953/docs/articles/find_html_node.png -------------------------------------------------------------------------------- /docs/articles/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Articles • googleLanguageR 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
[pkgdown-generated articles index listing the vignettes — Google Natural Language API, Introduction to googleLanguageR, Google Cloud Speech-to-Text API, Google Cloud Text-to-Speech API, Google Cloud Translation API; navigation and footer markup stripped.]
211 | 212 | 213 | 214 | 215 | 216 | 217 | 218 | 219 | -------------------------------------------------------------------------------- /docs/articles/nlp.R: -------------------------------------------------------------------------------- 1 | ## ---- include=FALSE------------------------------------------------------ 2 | NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true") 3 | knitr::opts_chunk$set( 4 | collapse = TRUE, 5 | comment = "#>", 6 | purl = NOT_CRAN, 7 | eval = NOT_CRAN 8 | ) 9 | 10 | ## ---- message=TRUE, warning=FALSE---------------------------------------- 11 | library(googleLanguageR) 12 | 13 | texts <- c("to administer medicince to animals is frequently a very difficult matter, 14 | and yet sometimes it's necessary to do so", 15 | "I don't know how to make a text demo that is sensible") 16 | nlp_result <- gl_nlp(texts) 17 | 18 | # two results of lists of tibbles 19 | str(nlp_result, max.level = 2) 20 | 21 | ## get first return 22 | nlp <- nlp_result[[1]] 23 | nlp$sentences 24 | 25 | nlp2 <- nlp_result[[2]] 26 | nlp2$sentences 27 | 28 | nlp2$tokens 29 | 30 | nlp2$entities 31 | 32 | nlp2$documentSentiment 33 | 34 | nlp2$language 35 | 36 | -------------------------------------------------------------------------------- /docs/articles/setup.R: -------------------------------------------------------------------------------- 1 | ## ---- include=FALSE------------------------------------------------------ 2 | NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true") 3 | knitr::opts_chunk$set( 4 | collapse = TRUE, 5 | comment = "#>", 6 | purl = NOT_CRAN, 7 | eval = NOT_CRAN 8 | ) 9 | 10 | -------------------------------------------------------------------------------- /docs/articles/speech.R: -------------------------------------------------------------------------------- 1 | ## ---- include=FALSE------------------------------------------------------ 2 | NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true") 3 | knitr::opts_chunk$set( 4 | collapse = TRUE, 5 | comment = "#>", 6 | purl = NOT_CRAN, 7 | eval = NOT_CRAN 8 | ) 9 | 10 | ## ---- message=TRUE, warning=FALSE---------------------------------------- 11 | library(googleLanguageR) 12 | ## get the sample source file 13 | test_audio <- system.file("woman1_wb.wav", package = "googleLanguageR") 14 | 15 | ## its not perfect but...:) 16 | gl_speech(test_audio)$transcript 17 | 18 | ## get alternative transcriptions 19 | gl_speech(test_audio, maxAlternatives = 2L)$transcript 20 | 21 | gl_speech(test_audio, languageCode = "en-GB")$transcript 22 | 23 | ## help it out with context for "frequently" 24 | gl_speech(test_audio, 25 | languageCode = "en-GB", 26 | speechContexts = list(phrases = list("is frequently a very difficult")))$transcript 27 | 28 | -------------------------------------------------------------------------------- /docs/articles/translation.R: -------------------------------------------------------------------------------- 1 | ## ---- include=FALSE------------------------------------------------------ 2 | NOT_CRAN <- identical(tolower(Sys.getenv("NOT_CRAN")), "true") 3 | knitr::opts_chunk$set( 4 | collapse = TRUE, 5 | comment = "#>", 6 | purl = NOT_CRAN, 7 | eval = NOT_CRAN 8 | ) 9 | 10 | ## ---- warning=FALSE------------------------------------------------------ 11 | library(googleLanguageR) 12 | 13 | text <- "to administer medicince to animals is frequently a very difficult matter, and yet sometimes it's necessary to do so" 14 | ## translate British into Danish 15 | gl_translate(text, target = 
"da")$translatedText 16 | 17 | ## ------------------------------------------------------------------------ 18 | # translate webpages 19 | library(rvest) 20 | library(googleLanguageR) 21 | 22 | my_url <- "http://www.dr.dk/nyheder/indland/greenpeace-facebook-og-google-boer-foelge-apples-groenne-planer" 23 | 24 | ## in this case the content to translate is in css select .wcms-article-content 25 | read_html(my_url) %>% # read html 26 | html_node(css = ".wcms-article-content") %>% # select article content with CSS 27 | html_text %>% # extract text 28 | gl_translate(format = "html") %>% # translate with html flag 29 | dplyr::select(translatedText) # show translatedText column of output tibble 30 | 31 | 32 | ## ------------------------------------------------------------------------ 33 | ## which language is this? 34 | gl_translate_detect("katten sidder på måtten") 35 | 36 | ## ------------------------------------------------------------------------ 37 | cld2::detect_language("katten sidder på måtten") 38 | 39 | -------------------------------------------------------------------------------- /docs/authors.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Authors • googleLanguageR 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
[pkgdown-generated authors page; visible entries — Mark Edmondson (author, maintainer), John Muschelli (contributor), Neal Richardson (reviewer for rOpenSci), Julia Gustavsen (reviewer for rOpenSci); navigation and footer markup stripped.]
217 | 218 | 219 | 220 | 221 | 222 | 223 | 224 | 225 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.css: -------------------------------------------------------------------------------- 1 | /*! 2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | 6 | /* modified from https://github.com/twbs/bootstrap/blob/94b4076dd2efba9af71f0b18d4ee4b163aa9e0dd/docs/assets/css/src/docs.css#L548-L601 */ 7 | 8 | /* All levels of nav */ 9 | nav[data-toggle='toc'] .nav > li > a { 10 | display: block; 11 | padding: 4px 20px; 12 | font-size: 13px; 13 | font-weight: 500; 14 | color: #767676; 15 | } 16 | nav[data-toggle='toc'] .nav > li > a:hover, 17 | nav[data-toggle='toc'] .nav > li > a:focus { 18 | padding-left: 19px; 19 | color: #563d7c; 20 | text-decoration: none; 21 | background-color: transparent; 22 | border-left: 1px solid #563d7c; 23 | } 24 | nav[data-toggle='toc'] .nav > .active > a, 25 | nav[data-toggle='toc'] .nav > .active:hover > a, 26 | nav[data-toggle='toc'] .nav > .active:focus > a { 27 | padding-left: 18px; 28 | font-weight: bold; 29 | color: #563d7c; 30 | background-color: transparent; 31 | border-left: 2px solid #563d7c; 32 | } 33 | 34 | /* Nav: second level (shown on .active) */ 35 | nav[data-toggle='toc'] .nav .nav { 36 | display: none; /* Hide by default, but at >768px, show it */ 37 | padding-bottom: 10px; 38 | } 39 | nav[data-toggle='toc'] .nav .nav > li > a { 40 | padding-top: 1px; 41 | padding-bottom: 1px; 42 | padding-left: 30px; 43 | font-size: 12px; 44 | font-weight: normal; 45 | } 46 | nav[data-toggle='toc'] .nav .nav > li > a:hover, 47 | nav[data-toggle='toc'] .nav .nav > li > a:focus { 48 | padding-left: 29px; 49 | } 50 | nav[data-toggle='toc'] .nav .nav > .active > a, 51 | nav[data-toggle='toc'] .nav .nav > .active:hover > a, 52 | nav[data-toggle='toc'] .nav .nav > .active:focus > a { 53 | padding-left: 28px; 54 | font-weight: 500; 55 | } 56 | 57 | /* from https://github.com/twbs/bootstrap/blob/e38f066d8c203c3e032da0ff23cd2d6098ee2dd6/docs/assets/css/src/docs.css#L631-L634 */ 58 | nav[data-toggle='toc'] .nav > .active > ul { 59 | display: block; 60 | } 61 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.js: -------------------------------------------------------------------------------- 1 | /*! 
2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | (function() { 6 | 'use strict'; 7 | 8 | window.Toc = { 9 | helpers: { 10 | // return all matching elements in the set, or their descendants 11 | findOrFilter: function($el, selector) { 12 | // http://danielnouri.org/notes/2011/03/14/a-jquery-find-that-also-finds-the-root-element/ 13 | // http://stackoverflow.com/a/12731439/358804 14 | var $descendants = $el.find(selector); 15 | return $el.filter(selector).add($descendants).filter(':not([data-toc-skip])'); 16 | }, 17 | 18 | generateUniqueIdBase: function(el) { 19 | var text = $(el).text(); 20 | var anchor = text.trim().toLowerCase().replace(/[^A-Za-z0-9]+/g, '-'); 21 | return anchor || el.tagName.toLowerCase(); 22 | }, 23 | 24 | generateUniqueId: function(el) { 25 | var anchorBase = this.generateUniqueIdBase(el); 26 | for (var i = 0; ; i++) { 27 | var anchor = anchorBase; 28 | if (i > 0) { 29 | // add suffix 30 | anchor += '-' + i; 31 | } 32 | // check if ID already exists 33 | if (!document.getElementById(anchor)) { 34 | return anchor; 35 | } 36 | } 37 | }, 38 | 39 | generateAnchor: function(el) { 40 | if (el.id) { 41 | return el.id; 42 | } else { 43 | var anchor = this.generateUniqueId(el); 44 | el.id = anchor; 45 | return anchor; 46 | } 47 | }, 48 | 49 | createNavList: function() { 50 | return $(''); 51 | }, 52 | 53 | createChildNavList: function($parent) { 54 | var $childList = this.createNavList(); 55 | $parent.append($childList); 56 | return $childList; 57 | }, 58 | 59 | generateNavEl: function(anchor, text) { 60 | var $a = $(''); 61 | $a.attr('href', '#' + anchor); 62 | $a.text(text); 63 | var $li = $('
  • '); 64 | $li.append($a); 65 | return $li; 66 | }, 67 | 68 | generateNavItem: function(headingEl) { 69 | var anchor = this.generateAnchor(headingEl); 70 | var $heading = $(headingEl); 71 | var text = $heading.data('toc-text') || $heading.text(); 72 | return this.generateNavEl(anchor, text); 73 | }, 74 | 75 | // Find the first heading level (`

    `, then `

    `, etc.) that has more than one element. Defaults to 1 (for `

    `). 76 | getTopLevel: function($scope) { 77 | for (var i = 1; i <= 6; i++) { 78 | var $headings = this.findOrFilter($scope, 'h' + i); 79 | if ($headings.length > 1) { 80 | return i; 81 | } 82 | } 83 | 84 | return 1; 85 | }, 86 | 87 | // returns the elements for the top level, and the next below it 88 | getHeadings: function($scope, topLevel) { 89 | var topSelector = 'h' + topLevel; 90 | 91 | var secondaryLevel = topLevel + 1; 92 | var secondarySelector = 'h' + secondaryLevel; 93 | 94 | return this.findOrFilter($scope, topSelector + ',' + secondarySelector); 95 | }, 96 | 97 | getNavLevel: function(el) { 98 | return parseInt(el.tagName.charAt(1), 10); 99 | }, 100 | 101 | populateNav: function($topContext, topLevel, $headings) { 102 | var $context = $topContext; 103 | var $prevNav; 104 | 105 | var helpers = this; 106 | $headings.each(function(i, el) { 107 | var $newNav = helpers.generateNavItem(el); 108 | var navLevel = helpers.getNavLevel(el); 109 | 110 | // determine the proper $context 111 | if (navLevel === topLevel) { 112 | // use top level 113 | $context = $topContext; 114 | } else if ($prevNav && $context === $topContext) { 115 | // create a new level of the tree and switch to it 116 | $context = helpers.createChildNavList($prevNav); 117 | } // else use the current $context 118 | 119 | $context.append($newNav); 120 | 121 | $prevNav = $newNav; 122 | }); 123 | }, 124 | 125 | parseOps: function(arg) { 126 | var opts; 127 | if (arg.jquery) { 128 | opts = { 129 | $nav: arg 130 | }; 131 | } else { 132 | opts = arg; 133 | } 134 | opts.$scope = opts.$scope || $(document.body); 135 | return opts; 136 | } 137 | }, 138 | 139 | // accepts a jQuery object, or an options object 140 | init: function(opts) { 141 | opts = this.helpers.parseOps(opts); 142 | 143 | // ensure that the data attribute is in place for styling 144 | opts.$nav.attr('data-toggle', 'toc'); 145 | 146 | var $topContext = this.helpers.createChildNavList(opts.$nav); 147 | var topLevel = this.helpers.getTopLevel(opts.$scope); 148 | var $headings = this.helpers.getHeadings(opts.$scope, topLevel); 149 | this.helpers.populateNav($topContext, topLevel, $headings); 150 | } 151 | }; 152 | 153 | $(function() { 154 | $('nav[data-toggle="toc"]').each(function(i, el) { 155 | var $nav = $(el); 156 | Toc.init($nav); 157 | }); 158 | }); 159 | })(); 160 | -------------------------------------------------------------------------------- /docs/docsearch.js: -------------------------------------------------------------------------------- 1 | $(function() { 2 | 3 | // register a handler to move the focus to the search bar 4 | // upon pressing shift + "/" (i.e. 
"?") 5 | $(document).on('keydown', function(e) { 6 | if (e.shiftKey && e.keyCode == 191) { 7 | e.preventDefault(); 8 | $("#search-input").focus(); 9 | } 10 | }); 11 | 12 | $(document).ready(function() { 13 | // do keyword highlighting 14 | /* modified from https://jsfiddle.net/julmot/bL6bb5oo/ */ 15 | var mark = function() { 16 | 17 | var referrer = document.URL ; 18 | var paramKey = "q" ; 19 | 20 | if (referrer.indexOf("?") !== -1) { 21 | var qs = referrer.substr(referrer.indexOf('?') + 1); 22 | var qs_noanchor = qs.split('#')[0]; 23 | var qsa = qs_noanchor.split('&'); 24 | var keyword = ""; 25 | 26 | for (var i = 0; i < qsa.length; i++) { 27 | var currentParam = qsa[i].split('='); 28 | 29 | if (currentParam.length !== 2) { 30 | continue; 31 | } 32 | 33 | if (currentParam[0] == paramKey) { 34 | keyword = decodeURIComponent(currentParam[1].replace(/\+/g, "%20")); 35 | } 36 | } 37 | 38 | if (keyword !== "") { 39 | $(".contents").unmark({ 40 | done: function() { 41 | $(".contents").mark(keyword); 42 | } 43 | }); 44 | } 45 | } 46 | }; 47 | 48 | mark(); 49 | }); 50 | }); 51 | 52 | /* Search term highlighting ------------------------------*/ 53 | 54 | function matchedWords(hit) { 55 | var words = []; 56 | 57 | var hierarchy = hit._highlightResult.hierarchy; 58 | // loop to fetch from lvl0, lvl1, etc. 59 | for (var idx in hierarchy) { 60 | words = words.concat(hierarchy[idx].matchedWords); 61 | } 62 | 63 | var content = hit._highlightResult.content; 64 | if (content) { 65 | words = words.concat(content.matchedWords); 66 | } 67 | 68 | // return unique words 69 | var words_uniq = [...new Set(words)]; 70 | return words_uniq; 71 | } 72 | 73 | function updateHitURL(hit) { 74 | 75 | var words = matchedWords(hit); 76 | var url = ""; 77 | 78 | if (hit.anchor) { 79 | url = hit.url_without_anchor + '?q=' + escape(words.join(" ")) + '#' + hit.anchor; 80 | } else { 81 | url = hit.url + '?q=' + escape(words.join(" ")); 82 | } 83 | 84 | return url; 85 | } 86 | -------------------------------------------------------------------------------- /docs/docsearch.json: -------------------------------------------------------------------------------- 1 | { 2 | "index_name": "googleLanguageR", 3 | "start_urls": [ 4 | "https://code.markedmondson.me/googleLanguageR/" 5 | ], 6 | "stop_urls": ["index.html", "authors.html", "/LICENSE", "/news/"], 7 | "sitemap_urls": [ 8 | "https://code.markedmondson.me/googleLanguageR//sitemap.xml" 9 | ], 10 | "selectors": { 11 | "lvl0": ".contents h1", 12 | "lvl1": ".contents .name", 13 | "lvl2": ".contents h2", 14 | "lvl3": ".contents h3, .contents th", 15 | "lvl4": ".contents h4", 16 | "text": ".contents p, .contents li, .usage, .template-article .contents .pre" 17 | }, 18 | "selectors_exclude": [ 19 | ".dont-index" 20 | ] 21 | } 22 | -------------------------------------------------------------------------------- /docs/jquery.sticky-kit.min.js: -------------------------------------------------------------------------------- 1 | /* 2 | Sticky-kit v1.1.2 | WTFPL | Leaf Corcoran 2015 | http://leafo.net 3 | */ 4 | (function(){var b,f;b=this.jQuery||window.jQuery;f=b(window);b.fn.stick_in_parent=function(d){var A,w,J,n,B,K,p,q,k,E,t;null==d&&(d={});t=d.sticky_class;B=d.inner_scrolling;E=d.recalc_every;k=d.parent;q=d.offset_top;p=d.spacer;w=d.bottoming;null==q&&(q=0);null==k&&(k=void 0);null==B&&(B=!0);null==t&&(t="is_stuck");A=b(document);null==w&&(w=!0);J=function(a,d,n,C,F,u,r,G){var 
v,H,m,D,I,c,g,x,y,z,h,l;if(!a.data("sticky_kit")){a.data("sticky_kit",!0);I=A.height();g=a.parent();null!=k&&(g=g.closest(k)); 5 | if(!g.length)throw"failed to find stick parent";v=m=!1;(h=null!=p?p&&a.closest(p):b("
    "))&&h.css("position",a.css("position"));x=function(){var c,f,e;if(!G&&(I=A.height(),c=parseInt(g.css("border-top-width"),10),f=parseInt(g.css("padding-top"),10),d=parseInt(g.css("padding-bottom"),10),n=g.offset().top+c+f,C=g.height(),m&&(v=m=!1,null==p&&(a.insertAfter(h),h.detach()),a.css({position:"",top:"",width:"",bottom:""}).removeClass(t),e=!0),F=a.offset().top-(parseInt(a.css("margin-top"),10)||0)-q, 6 | u=a.outerHeight(!0),r=a.css("float"),h&&h.css({width:a.outerWidth(!0),height:u,display:a.css("display"),"vertical-align":a.css("vertical-align"),"float":r}),e))return l()};x();if(u!==C)return D=void 0,c=q,z=E,l=function(){var b,l,e,k;if(!G&&(e=!1,null!=z&&(--z,0>=z&&(z=E,x(),e=!0)),e||A.height()===I||x(),e=f.scrollTop(),null!=D&&(l=e-D),D=e,m?(w&&(k=e+u+c>C+n,v&&!k&&(v=!1,a.css({position:"fixed",bottom:"",top:c}).trigger("sticky_kit:unbottom"))),eb&&!v&&(c-=l,c=Math.max(b-u,c),c=Math.min(q,c),m&&a.css({top:c+"px"})))):e>F&&(m=!0,b={position:"fixed",top:c},b.width="border-box"===a.css("box-sizing")?a.outerWidth()+"px":a.width()+"px",a.css(b).addClass(t),null==p&&(a.after(h),"left"!==r&&"right"!==r||h.append(a)),a.trigger("sticky_kit:stick")),m&&w&&(null==k&&(k=e+u+c>C+n),!v&&k)))return v=!0,"static"===g.css("position")&&g.css({position:"relative"}), 8 | a.css({position:"absolute",bottom:d,top:"auto"}).trigger("sticky_kit:bottom")},y=function(){x();return l()},H=function(){G=!0;f.off("touchmove",l);f.off("scroll",l);f.off("resize",y);b(document.body).off("sticky_kit:recalc",y);a.off("sticky_kit:detach",H);a.removeData("sticky_kit");a.css({position:"",bottom:"",top:"",width:""});g.position("position","");if(m)return null==p&&("left"!==r&&"right"!==r||a.insertAfter(h),h.remove()),a.removeClass(t)},f.on("touchmove",l),f.on("scroll",l),f.on("resize", 9 | y),b(document.body).on("sticky_kit:recalc",y),a.on("sticky_kit:detach",H),setTimeout(l,0)}};n=0;for(K=this.length;n 2 | 3 | 5 | 8 | 12 | 13 | -------------------------------------------------------------------------------- /docs/news/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Changelog • googleLanguageR 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
    202 | 203 | 204 | 205 | 206 | 207 | 208 | 209 | 210 | -------------------------------------------------------------------------------- /docs/pkgdown.js: -------------------------------------------------------------------------------- 1 | /* http://gregfranko.com/blog/jquery-best-practices/ */ 2 | (function($) { 3 | $(function() { 4 | 5 | $('.navbar-fixed-top').headroom(); 6 | 7 | $('body').css('padding-top', $('.navbar').height() + 10); 8 | $(window).resize(function(){ 9 | $('body').css('padding-top', $('.navbar').height() + 10); 10 | }); 11 | 12 | $('[data-toggle="tooltip"]').tooltip(); 13 | 14 | var cur_path = paths(location.pathname); 15 | var links = $("#navbar ul li a"); 16 | var max_length = -1; 17 | var pos = -1; 18 | for (var i = 0; i < links.length; i++) { 19 | if (links[i].getAttribute("href") === "#") 20 | continue; 21 | // Ignore external links 22 | if (links[i].host !== location.host) 23 | continue; 24 | 25 | var nav_path = paths(links[i].pathname); 26 | 27 | var length = prefix_length(nav_path, cur_path); 28 | if (length > max_length) { 29 | max_length = length; 30 | pos = i; 31 | } 32 | } 33 | 34 | // Add class to parent
  • , and enclosing
  • if in dropdown 35 | if (pos >= 0) { 36 | var menu_anchor = $(links[pos]); 37 | menu_anchor.parent().addClass("active"); 38 | menu_anchor.closest("li.dropdown").addClass("active"); 39 | } 40 | }); 41 | 42 | function paths(pathname) { 43 | var pieces = pathname.split("/"); 44 | pieces.shift(); // always starts with / 45 | 46 | var end = pieces[pieces.length - 1]; 47 | if (end === "index.html" || end === "") 48 | pieces.pop(); 49 | return(pieces); 50 | } 51 | 52 | // Returns -1 if not found 53 | function prefix_length(needle, haystack) { 54 | if (needle.length > haystack.length) 55 | return(-1); 56 | 57 | // Special case for length-0 haystack, since for loop won't run 58 | if (haystack.length === 0) { 59 | return(needle.length === 0 ? 0 : -1); 60 | } 61 | 62 | for (var i = 0; i < haystack.length; i++) { 63 | if (needle[i] != haystack[i]) 64 | return(i); 65 | } 66 | 67 | return(haystack.length); 68 | } 69 | 70 | /* Clipboard --------------------------*/ 71 | 72 | function changeTooltipMessage(element, msg) { 73 | var tooltipOriginalTitle=element.getAttribute('data-original-title'); 74 | element.setAttribute('data-original-title', msg); 75 | $(element).tooltip('show'); 76 | element.setAttribute('data-original-title', tooltipOriginalTitle); 77 | } 78 | 79 | if(ClipboardJS.isSupported()) { 80 | $(document).ready(function() { 81 | var copyButton = ""; 82 | 83 | $(".examples, div.sourceCode").addClass("hasCopyButton"); 84 | 85 | // Insert copy buttons: 86 | $(copyButton).prependTo(".hasCopyButton"); 87 | 88 | // Initialize tooltips: 89 | $('.btn-copy-ex').tooltip({container: 'body'}); 90 | 91 | // Initialize clipboard: 92 | var clipboardBtnCopies = new ClipboardJS('[data-clipboard-copy]', { 93 | text: function(trigger) { 94 | return trigger.parentNode.textContent; 95 | } 96 | }); 97 | 98 | clipboardBtnCopies.on('success', function(e) { 99 | changeTooltipMessage(e.trigger, 'Copied!'); 100 | e.clearSelection(); 101 | }); 102 | 103 | clipboardBtnCopies.on('error', function() { 104 | changeTooltipMessage(e.trigger,'Press Ctrl+C or Command+C to copy'); 105 | }); 106 | }); 107 | } 108 | })(window.jQuery || window.$) 109 | -------------------------------------------------------------------------------- /docs/pkgdown.yml: -------------------------------------------------------------------------------- 1 | pandoc: 2.3.1 2 | pkgdown: 1.5.1 3 | pkgdown_sha: ~ 4 | articles: 5 | nlp: nlp.html 6 | setup: setup.html 7 | speech: speech.html 8 | text-to-speech: text-to-speech.html 9 | translation: translation.html 10 | last_built: 2020-04-19T21:39Z 11 | urls: 12 | reference: https://code.markedmondson.me/googleLanguageR//reference 13 | article: https://code.markedmondson.me/googleLanguageR//articles 14 | 15 | -------------------------------------------------------------------------------- /docs/reference/gl_talk_languages.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Get a list of voices available for text to speech — gl_talk_languages • googleLanguageR 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 67 | 68 | 69 | 70 | 71 | 72 |
[pkgdown reference page for gl_talk_languages(): "Returns a list of voices supported for synthesis." Usage: gl_talk_languages(languageCode = NULL). Argument languageCode — a BCP-47 language tag; if specified, only voices that can synthesize that language are returned. Navigation and footer markup stripped.]
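A minimal usage sketch (editorial addition, not part of the generated page), assuming authentication has already been set up via the `GL_AUTH` service-account file described in CONTRIBUTING.md:

```
library(googleLanguageR)

# every voice the Text-to-Speech API currently offers
voices <- gl_talk_languages()

# only voices that can synthesize Danish; "da-DK" is an illustrative BCP-47 tag
danish_voices <- gl_talk_languages(languageCode = "da-DK")
```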
    218 | 219 | 220 | 221 | 222 | 223 | 224 | 225 | 226 | -------------------------------------------------------------------------------- /docs/reference/gl_talk_shinyUI.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Speak in Shiny module (ui) — gl_talk_shinyUI • googleLanguageR 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 67 | 68 | 69 | 70 | 71 | 72 |
    73 |
    74 | 169 | 170 | 171 | 172 |
    173 | 174 |
    175 |
    176 | 181 | 182 |
    183 |

    Speak in Shiny module (ui)

    184 |
    185 | 186 |
    gl_talk_shinyUI(id)
    187 | 188 |

    Arguments

    189 | 190 | 191 | 192 | 193 | 194 | 195 |
    id

    The Shiny id

    196 | 197 |

    Details

    198 | 199 |

    Shiny Module for use with gl_talk_shiny.

    200 | 201 |
    202 | 207 |
    208 | 209 | 210 |
    211 | 214 | 215 |
    216 |

    Site built with pkgdown 1.5.1.

    217 |
    218 | 219 |
    220 |
    221 | 222 | 223 | 224 | 225 | 226 | 227 | 228 | 229 | -------------------------------------------------------------------------------- /docs/reference/googleLanguageR.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | googleLanguageR — googleLanguageR • googleLanguageR 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 57 | 58 | 59 | 60 | 61 | 68 | 69 | 70 | 71 | 72 | 73 |
    74 |
    75 | 170 | 171 | 172 | 173 |
    174 | 175 |
    176 |
    177 | 182 | 183 |
    184 |

    This package contains functions for analysing language through the 185 | Google Cloud Machine Learning APIs

    186 |
    187 | 188 | 189 | 190 |

    Details

    191 | 192 |

    For examples and documentation see the vignettes and the website:

    193 |

    http://code.markedmondson.me/googleLanguageR/

    194 |

    See also

    195 | 196 | 197 | 198 |
    199 | 204 |
    205 | 206 | 207 |
    208 | 211 | 212 |
    213 |

    Site built with pkgdown 1.5.1.

    214 |
    215 | 216 |
    217 |
    218 | 219 | 220 | 221 | 222 | 223 | 224 | 225 | 226 | -------------------------------------------------------------------------------- /docs/reference/is.NullOb.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A helper function that tests whether an object is either NULL _or_ 10 | a list of NULLs — is.NullOb • googleLanguageR 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 45 | 47 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 59 | 60 | 61 | 62 | 63 | 70 | 71 | 72 | 73 | 74 | 75 |
    76 |
    77 | 172 | 173 | 174 | 175 |
    176 | 177 |
    178 |
    179 | 185 | 186 |
    187 |

    A helper function that tests whether an object is either NULL _or_ 188 | a list of NULLs

    189 |
    190 | 191 |
    is.NullOb(x)
    192 | 193 | 194 | 195 |
    196 | 201 |
    202 | 203 | 204 |
    205 | 208 | 209 |
    210 |

    Site built with pkgdown 1.5.1.

    211 |
    212 | 213 |
    214 |
    215 | 216 | 217 | 218 | 219 | 220 | 221 | 222 | 223 | -------------------------------------------------------------------------------- /docs/reference/rmNullObs.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Recursively step down into list, removing all such objects — rmNullObs • googleLanguageR 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 67 | 68 | 69 | 70 | 71 | 72 |
    73 |
    74 | 169 | 170 | 171 | 172 |
    173 | 174 |
    175 |
    176 | 181 | 182 |
    183 |

    Recursively step down into list, removing all such objects

    184 |
    185 | 186 |
    rmNullObs(x)
    187 | 188 | 189 | 190 |
    191 | 196 |
    197 | 198 | 199 |
    200 | 203 | 204 |
    205 |

    Site built with pkgdown 1.5.1.

    206 |
    207 | 208 |
    209 |
    210 | 211 | 212 | 213 | 214 | 215 | 216 | 217 | 218 | -------------------------------------------------------------------------------- /docs/sitemap.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | https://code.markedmondson.me/googleLanguageR//index.html 5 | 6 | 7 | https://code.markedmondson.me/googleLanguageR//reference/gl_auth.html 8 | 9 | 10 | https://code.markedmondson.me/googleLanguageR//reference/gl_nlp.html 11 | 12 | 13 | https://code.markedmondson.me/googleLanguageR//reference/gl_speech.html 14 | 15 | 16 | https://code.markedmondson.me/googleLanguageR//reference/gl_speech_op.html 17 | 18 | 19 | https://code.markedmondson.me/googleLanguageR//reference/gl_talk.html 20 | 21 | 22 | https://code.markedmondson.me/googleLanguageR//reference/gl_talk_languages.html 23 | 24 | 25 | https://code.markedmondson.me/googleLanguageR//reference/gl_talk_player.html 26 | 27 | 28 | https://code.markedmondson.me/googleLanguageR//reference/gl_talk_shiny.html 29 | 30 | 31 | https://code.markedmondson.me/googleLanguageR//reference/gl_talk_shinyUI.html 32 | 33 | 34 | https://code.markedmondson.me/googleLanguageR//reference/gl_translate.html 35 | 36 | 37 | https://code.markedmondson.me/googleLanguageR//reference/gl_translate_detect.html 38 | 39 | 40 | https://code.markedmondson.me/googleLanguageR//reference/gl_translate_languages.html 41 | 42 | 43 | https://code.markedmondson.me/googleLanguageR//reference/googleLanguageR.html 44 | 45 | 46 | https://code.markedmondson.me/googleLanguageR//reference/is.NullOb.html 47 | 48 | 49 | https://code.markedmondson.me/googleLanguageR//reference/rmNullObs.html 50 | 51 | 52 | https://code.markedmondson.me/googleLanguageR//articles/nlp.html 53 | 54 | 55 | https://code.markedmondson.me/googleLanguageR//articles/setup.html 56 | 57 | 58 | https://code.markedmondson.me/googleLanguageR//articles/speech.html 59 | 60 | 61 | https://code.markedmondson.me/googleLanguageR//articles/text-to-speech.html 62 | 63 | 64 | https://code.markedmondson.me/googleLanguageR//articles/translation.html 65 | 66 | 67 | -------------------------------------------------------------------------------- /googleLanguageR.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: No 4 | SaveWorkspace: No 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Package 19 | PackageUseDevtools: Yes 20 | PackageInstallArgs: --no-multiarch --with-keep.source 21 | PackageCheckArgs: --no-manual 22 | PackageRoxygenize: rd,collate,namespace,vignette 23 | -------------------------------------------------------------------------------- /inst/shiny/capture_speech/DESCRIPTION: -------------------------------------------------------------------------------- 1 | Title: Cloud Speech in Shiny 2 | Author: Mark Edmondson 3 | AuthorUrl: http://code.markedmondson.me 4 | License: MIT 5 | DisplayMode: Showcase 6 | Type: Shiny 7 | -------------------------------------------------------------------------------- /inst/shiny/capture_speech/README.md: -------------------------------------------------------------------------------- 1 | # Google Cloud Speech API Shiny app 2 | 3 | This is a demo on using the [Cloud Speech API](https://cloud.google.com/speech/) with Shiny. 
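A minimal way to try the demo locally is sketched below. It assumes you have installed `googleLanguageR` (plus the `shiny`, `shinyjs` and `tuneR` packages it loads), enabled the relevant Cloud APIs on your Google Cloud project, and set the `GL_AUTH` environment variable to point at your service-account JSON key so the package auto-authenticates on load; the key path shown is only an example.

```r
# Launch the bundled Shiny demo from an interactive R session.
# Assumes GL_AUTH is set, e.g. in your .Renviron:
#   GL_AUTH=/path/to/your-service-account-key.json
library(googleLanguageR)   # auto-authenticates via GL_AUTH when loaded
library(shiny)

# The app ships in the installed package under shiny/capture_speech
app_dir <- system.file("shiny", "capture_speech", package = "googleLanguageR")
runApp(app_dir)
```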
4 | 5 | It uses `library(tuneR)` to process the audio file, and a JavaScript audio library from [Web Audio Demos](https://webaudiodemos.appspot.com/AudioRecorder/index.html) to capture the audio in your browser. 6 | 7 | You can also optionally send your transcription to the [Cloud Translation API](https://cloud.google.com/translate/) 8 | 9 | The results are then spoken back to you using the `gl_talk()` functions. 10 | 11 | ## Screenshot 12 | 13 | ![](babelfish.png) 14 | 15 | -------------------------------------------------------------------------------- /inst/shiny/capture_speech/babelfish.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci/googleLanguageR/7c6f93b0977ac7ac2189a6b5648362b12509c953/inst/shiny/capture_speech/babelfish.png -------------------------------------------------------------------------------- /inst/shiny/capture_speech/server.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | library(tuneR) 3 | library(googleLanguageR) 4 | library(shinyjs) 5 | 6 | function(input, output, session){ 7 | 8 | output$result_text <- renderText({ 9 | req(get_api_text()) 10 | 11 | get_api_text() 12 | 13 | }) 14 | 15 | output$result_translation <- renderText({ 16 | req(translation()) 17 | 18 | translation() 19 | }) 20 | 21 | output$nlp_sentences <- renderTable({ 22 | req(nlp()) 23 | 24 | nlp()$sentences[[1]] 25 | 26 | }) 27 | 28 | output$nlp_tokens <- renderTable({ 29 | req(nlp()) 30 | 31 | ## only a few otherwise it breaks formatting 32 | nlp()$tokens[[1]][, c("content","beginOffset","tag","mood","number")] 33 | 34 | }) 35 | 36 | output$nlp_entities <- renderTable({ 37 | req(nlp()) 38 | 39 | nlp()$entities[[1]] 40 | 41 | }) 42 | 43 | output$nlp_misc <- renderTable({ 44 | req(nlp()) 45 | 46 | data.frame( 47 | language = nlp()$language, 48 | text = nlp()$text, 49 | documentSentimentMagnitude = nlp()$documentSentiment$magnitude, 50 | documentSentimentScore = nlp()$documentSentiment$score 51 | ) 52 | 53 | }) 54 | 55 | input_audio <- reactive({ 56 | req(input$audio) 57 | a <- input$audio 58 | 59 | if(length(a) > 0){ 60 | return(a) 61 | } else { 62 | NULL 63 | } 64 | 65 | }) 66 | 67 | wav_name <- reactive({ 68 | req(input_audio()) 69 | 70 | a <- input_audio() 71 | 72 | ## split two channel audio 73 | audio_split <- length(a)/2 74 | a1 <- a[1:audio_split] 75 | a2 <- a[(audio_split+1):length(a)] 76 | 77 | # construct wav object that the API likes 78 | Wobj <- Wave(a1, a2, samp.rate = 44100, bit = 16) 79 | Wobj <- normalize(Wobj, unit = "16", pcm = TRUE) 80 | Wobj <- mono(Wobj) 81 | 82 | wav_name <- paste0("audio",gsub("[^0-9]","",Sys.time()),".wav") 83 | 84 | writeWave(Wobj, wav_name, extensible = FALSE) 85 | 86 | wav_name 87 | 88 | 89 | }) 90 | 91 | get_api_text <- reactive({ 92 | req(wav_name()) 93 | req(input$language) 94 | 95 | if(input$language == ""){ 96 | stop("Must enter a languageCode - default en-US") 97 | } 98 | 99 | wav_name <- wav_name() 100 | 101 | if(!file.exists(wav_name)){ 102 | return(NULL) 103 | } 104 | 105 | message("Calling Speech API") 106 | shinyjs::show(id = "api", 107 | anim = TRUE, 108 | animType = "fade", 109 | time = 1, 110 | selector = NULL) 111 | 112 | # make API call 113 | me <- gl_speech(wav_name, 114 | sampleRateHertz = 44100L, 115 | languageCode = input$language) 116 | 117 | ## remove old file 118 | unlink(wav_name) 119 | 120 | message("API returned: ", me$transcript$transcript) 121 | shinyjs::hide(id = "api", 122 | anim = TRUE, 123 | 
animType = "fade", 124 | time = 1, 125 | selector = NULL) 126 | 127 | me$transcript$transcript 128 | }) 129 | 130 | translation <- reactive({ 131 | 132 | req(get_api_text()) 133 | req(input$translate) 134 | 135 | if(input$translate == "none"){ 136 | return("No translation required") 137 | } 138 | 139 | message("Calling Translation API") 140 | shinyjs::show(id = "api", 141 | anim = TRUE, 142 | animType = "fade", 143 | time = 1, 144 | selector = NULL) 145 | 146 | ttt <- gl_translate(get_api_text(), target = input$translate) 147 | 148 | message("API returned: ", ttt$translatedText) 149 | shinyjs::hide(id = "api", 150 | anim = TRUE, 151 | animType = "fade", 152 | time = 1, 153 | selector = NULL) 154 | 155 | ttt$translatedText 156 | 157 | }) 158 | 159 | nlp <- reactive({ 160 | req(get_api_text()) 161 | req(input$nlp) 162 | 163 | nlp_lang <- switch(input$nlp, 164 | none = NULL, 165 | input = substr(input$language, start = 0, stop = 2), 166 | trans = input$translate # not activated from ui.R dropdown as entity analysis only available on 'en' at the moment 167 | ) 168 | 169 | if(is.null(nlp_lang)){ 170 | return(NULL) 171 | } 172 | 173 | ## has to be on supported list of NLP language codes 174 | if(!any(nlp_lang %in% c("en", "zh", "zh-Hant", "fr", 175 | "de", "it", "ja", "ko", "pt", "es"))){ 176 | message("Unsupported NLP language, switching to 'en'") 177 | nlp_lang <- "en" 178 | } 179 | 180 | message("Calling NLP API") 181 | shinyjs::show(id = "api", 182 | anim = TRUE, 183 | animType = "fade", 184 | time = 1, 185 | selector = NULL) 186 | 187 | nnn <- gl_nlp(get_api_text(), language = nlp_lang) 188 | 189 | message("API returned: ", nnn$text) 190 | shinyjs::hide(id = "api", 191 | anim = TRUE, 192 | animType = "fade", 193 | time = 1, 194 | selector = NULL) 195 | nnn 196 | 197 | }) 198 | 199 | talk_file <- reactive({ 200 | req(get_api_text()) 201 | req(translation()) 202 | req(input$translate) 203 | 204 | # clean up any existing wav files 205 | unlink(list.files("www", pattern = ".wav$", full.names = TRUE)) 206 | 207 | # to prevent browser caching 208 | paste0(input$language,input$translate,basename(tempfile(fileext = ".wav"))) 209 | 210 | }) 211 | 212 | output$talk <- renderUI({ 213 | 214 | req(get_api_text()) 215 | req(translation()) 216 | req(talk_file()) 217 | 218 | # to prevent browser caching 219 | output_name <- talk_file() 220 | 221 | if(input$translate != "none"){ 222 | audio_file <- gl_talk(translation(), 223 | languageCode = input$translate, 224 | name = NULL, 225 | output = file.path("www", output_name)) 226 | } else { 227 | audio_file <- gl_talk(get_api_text(), 228 | languageCode = input$language, 229 | output = file.path("www", output_name)) 230 | } 231 | 232 | ## the audio file sits in folder www, but the audio file must be referenced without www 233 | tags$audio(autoplay = NA, controls = NA, tags$source(src = output_name)) 234 | 235 | }) 236 | 237 | } 238 | -------------------------------------------------------------------------------- /inst/shiny/capture_speech/ui.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | library(shinyjs) 3 | 4 | shinyUI( 5 | fluidPage( 6 | useShinyjs(), 7 | includeCSS("www/style.css"), 8 | includeScript("www/main.js"), 9 | includeScript("www/speech.js"), 10 | includeScript("www/audiodisplay.js"), 11 | 12 | titlePanel("Shiny Babelfish"), 13 | 14 | sidebarLayout( 15 | sidebarPanel( 16 | helpText("Click on the microphone to record, click again to send to Cloud Speech API and wait for results."), 17 | 
img(id = "record", 18 | src = "mic128.png", 19 | onclick = "toggleRecording(this);", 20 | style = "display:block; margin:1px auto;"), 21 | hr(), 22 | div(id = "viz", 23 | tags$canvas(id = "analyser"), 24 | tags$canvas(id = "wavedisplay") 25 | ), 26 | br(), 27 | hr(), 28 | selectInput("language", "Language input", choices = c("English (UK)" = "en-GB", 29 | "English (Americans)" = "en-US", 30 | "Danish" = "da-DK", 31 | "French (France)" = "fr-FR", 32 | "German" = "de-DE", 33 | "Spanish (Spain)" = "es-ES", 34 | "Spanish (Chile)" = "es-CL", 35 | "Dutch" = "nl-NL", 36 | "Romainian" = "ro-RO", 37 | "Italian" = "it-IT", 38 | "Norwegian" = "nb-NO", 39 | "Swedish" = "sv-SE")), 40 | helpText("You can also add a call to the Google Translation API by selecting an output below"), 41 | selectInput("translate", "Translate output", choices = c("No Translation" = "none", 42 | "English" = "en", 43 | "Danish" = "da", 44 | "French" = "fr", 45 | "German" = "de", 46 | "Spanish" = "es", 47 | "Dutch" = "nl", 48 | "Romainian" = "ro", 49 | "Italian" = "it", 50 | "Norwegian" = "nb", 51 | "Swedish" = "sv")), 52 | helpText("Send the text to the Natural Language API for NLP analysis below."), 53 | selectInput("nlp", "Perform NLP", choices = c("No NLP" = "none", 54 | "NLP" = "input" 55 | #, 56 | #"On Translated Text" = "trans" 57 | ) 58 | ), 59 | helpText("Many more languages are supported in the API but I couldn't be bothered to put them all in - see here:", 60 | a(href="https://cloud.google.com/speech/docs/languages", "Supported languages")) 61 | ), 62 | 63 | mainPanel( 64 | helpText("Transcription will appear here when ready. (Can take 30 seconds +). Streaming support not implemented yet."), 65 | shinyjs::hidden( 66 | div(id = "api", 67 | p("Calling API - please wait", icon("circle-o-notch fa-spin fa-fw")) 68 | )), 69 | h2("Transcribed text"), 70 | p(textOutput("result_text")), 71 | h2("Translated text"), 72 | p(textOutput("result_translation")), 73 | h2("NLP"), 74 | tableOutput("nlp_sentences"), 75 | tableOutput("nlp_tokens"), 76 | tableOutput("nlp_entities"), 77 | tableOutput("nlp_misc"), 78 | htmlOutput("talk") 79 | ) 80 | ), 81 | helpText( 82 | a("Adapted from Web Audio Demos", 83 | href="https://webaudiodemos.appspot.com/AudioRecorder/index.html")) 84 | )) 85 | -------------------------------------------------------------------------------- /inst/shiny/capture_speech/www/audiodisplay.js: -------------------------------------------------------------------------------- 1 | function drawBuffer( width, height, context, data ) { 2 | var step = Math.ceil( data.length / width ); 3 | var amp = height / 2; 4 | context.fillStyle = "silver"; 5 | context.clearRect(0,0,width,height); 6 | for(var i=0; i < width; i++){ 7 | var min = 1.0; 8 | var max = -1.0; 9 | for (j=0; j max) 14 | max = datum; 15 | } 16 | context.fillRect(i,(1+min)*amp,1,Math.max(1,(max-min)*amp)); 17 | } 18 | } 19 | -------------------------------------------------------------------------------- /inst/shiny/capture_speech/www/main.js: -------------------------------------------------------------------------------- 1 | /* Copyright 2013 Chris Wilson 2 | 3 | Licensed under the Apache License, Version 2.0 (the "License"); 4 | you may not use this file except in compliance with the License. 
5 | You may obtain a copy of the License at 6 | 7 | http://www.apache.org/licenses/LICENSE-2.0 8 | 9 | Unless required by applicable law or agreed to in writing, software 10 | distributed under the License is distributed on an "AS IS" BASIS, 11 | WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. 12 | See the License for the specific language governing permissions and 13 | limitations under the License. 14 | */ 15 | 16 | window.AudioContext = window.AudioContext || window.webkitAudioContext; 17 | 18 | var audioContext = new AudioContext(); 19 | var audioInput = null, 20 | realAudioInput = null, 21 | inputPoint = null, 22 | audioRecorder = null; 23 | var rafID = null; 24 | var analyserContext = null; 25 | var canvasWidth, canvasHeight; 26 | var recIndex = 0; 27 | 28 | /* TODO: 29 | 30 | - offer mono option 31 | - "Monitor input" switch 32 | */ 33 | 34 | function saveAudio() { 35 | //audioRecorder.exportWAV( doneEncoding ); 36 | // could get mono instead by saying 37 | audioRecorder.exportMonoWAV( doneEncoding ); 38 | 39 | } 40 | 41 | function gotBuffers( buffers ) { 42 | var canvas = document.getElementById( "wavedisplay" ); 43 | 44 | drawBuffer( canvas.width, canvas.height, canvas.getContext('2d'), buffers[0] ); 45 | 46 | // the ONLY time gotBuffers is called is right after a new recording is completed - 47 | // so here's where we should set up the download. 48 | //audioRecorder.exportWAV( doneEncoding ); 49 | Shiny.onInputChange("audio", buffers); 50 | } 51 | 52 | function doneEncoding( blob ) { 53 | Recorder.setupDownload( blob, "myRecording" + ((recIndex<10)?"0":"") + recIndex + ".wav" ); 54 | recIndex++; 55 | } 56 | 57 | function toggleRecording( e ) { 58 | if (e.classList.contains("recording")) { 59 | // stop recording 60 | audioRecorder.stop(); 61 | e.classList.remove("recording"); 62 | audioRecorder.getBuffers( gotBuffers ); 63 | } else { 64 | // start recording 65 | if (!audioRecorder) 66 | return; 67 | e.classList.add("recording"); 68 | audioRecorder.clear(); 69 | audioRecorder.record(); 70 | } 71 | } 72 | 73 | function convertToMono( input ) { 74 | var splitter = audioContext.createChannelSplitter(2); 75 | var merger = audioContext.createChannelMerger(2); 76 | 77 | input.connect( splitter ); 78 | splitter.connect( merger, 0, 0 ); 79 | splitter.connect( merger, 0, 1 ); 80 | return merger; 81 | } 82 | 83 | function cancelAnalyserUpdates() { 84 | window.cancelAnimationFrame( rafID ); 85 | rafID = null; 86 | } 87 | 88 | function updateAnalysers(time) { 89 | if (!analyserContext) { 90 | var canvas = document.getElementById("analyser"); 91 | canvasWidth = canvas.width; 92 | canvasHeight = canvas.height; 93 | analyserContext = canvas.getContext('2d'); 94 | } 95 | 96 | // analyzer draw code here 97 | { 98 | var SPACING = 3; 99 | var BAR_WIDTH = 1; 100 | var numBars = Math.round(canvasWidth / SPACING); 101 | var freqByteData = new Uint8Array(analyserNode.frequencyBinCount); 102 | 103 | analyserNode.getByteFrequencyData(freqByteData); 104 | 105 | analyserContext.clearRect(0, 0, canvasWidth, canvasHeight); 106 | analyserContext.fillStyle = '#F6D565'; 107 | analyserContext.lineCap = 'round'; 108 | var multiplier = analyserNode.frequencyBinCount / numBars; 109 | 110 | // Draw rectangle for each frequency bin. 
111 | for (var i = 0; i < numBars; ++i) { 112 | var magnitude = 0; 113 | var offset = Math.floor( i * multiplier ); 114 | // gotta sum/average the block, or we miss narrow-bandwidth spikes 115 | for (var j = 0; j< multiplier; j++) 116 | magnitude += freqByteData[offset + j]; 117 | magnitude = magnitude / multiplier; 118 | var magnitude2 = freqByteData[i * multiplier]; 119 | analyserContext.fillStyle = "hsl( " + Math.round((i*360)/numBars) + ", 100%, 50%)"; 120 | analyserContext.fillRect(i * SPACING, canvasHeight, BAR_WIDTH, -magnitude); 121 | } 122 | } 123 | 124 | rafID = window.requestAnimationFrame( updateAnalysers ); 125 | } 126 | 127 | function toggleMono() { 128 | if (audioInput != realAudioInput) { 129 | audioInput.disconnect(); 130 | realAudioInput.disconnect(); 131 | audioInput = realAudioInput; 132 | } else { 133 | realAudioInput.disconnect(); 134 | audioInput = convertToMono( realAudioInput ); 135 | } 136 | 137 | audioInput.connect(inputPoint); 138 | } 139 | 140 | function gotStream(stream) { 141 | inputPoint = audioContext.createGain(); 142 | 143 | // Create an AudioNode from the stream. 144 | realAudioInput = audioContext.createMediaStreamSource(stream); 145 | audioInput = realAudioInput; 146 | audioInput.connect(inputPoint); 147 | 148 | audioInput = convertToMono(audioInput); 149 | 150 | analyserNode = audioContext.createAnalyser(); 151 | analyserNode.fftSize = 2048; 152 | inputPoint.connect( analyserNode ); 153 | 154 | audioRecorder = new Recorder( inputPoint ); 155 | 156 | zeroGain = audioContext.createGain(); 157 | zeroGain.gain.value = 0.0; 158 | inputPoint.connect( zeroGain ); 159 | zeroGain.connect( audioContext.destination ); 160 | updateAnalysers(); 161 | } 162 | 163 | function initAudio() { 164 | if (!navigator.getUserMedia) 165 | navigator.getUserMedia = navigator.webkitGetUserMedia || navigator.mozGetUserMedia; 166 | if (!navigator.cancelAnimationFrame) 167 | navigator.cancelAnimationFrame = navigator.webkitCancelAnimationFrame || navigator.mozCancelAnimationFrame; 168 | if (!navigator.requestAnimationFrame) 169 | navigator.requestAnimationFrame = navigator.webkitRequestAnimationFrame || navigator.mozRequestAnimationFrame; 170 | 171 | navigator.getUserMedia( 172 | { 173 | "audio": { 174 | "mandatory": { 175 | "googEchoCancellation": "false", 176 | "googAutoGainControl": "false", 177 | "googNoiseSuppression": "false", 178 | "googHighpassFilter": "false" 179 | }, 180 | "optional": [] 181 | }, 182 | }, gotStream, function(e) { 183 | alert('Error getting audio'); 184 | console.log(e); 185 | }); 186 | } 187 | 188 | window.addEventListener('load', initAudio ); 189 | -------------------------------------------------------------------------------- /inst/shiny/capture_speech/www/mic128.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci/googleLanguageR/7c6f93b0977ac7ac2189a6b5648362b12509c953/inst/shiny/capture_speech/www/mic128.png -------------------------------------------------------------------------------- /inst/shiny/capture_speech/www/recorderWorker.js: -------------------------------------------------------------------------------- 1 | /*License (MIT) 2 | 3 | Copyright © 2013 Matt Diamond 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated 6 | documentation files (the "Software"), to deal in the Software without restriction, including without limitation 7 | the rights to use, copy, modify, merge, publish, distribute, 
sublicense, and/or sell copies of the Software, and 8 | to permit persons to whom the Software is furnished to do so, subject to the following conditions: 9 | 10 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of 11 | the Software. 12 | 13 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO 14 | THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 15 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF 16 | CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 18 | */ 19 | 20 | var recLength = 0, 21 | recBuffersL = [], 22 | recBuffersR = [], 23 | sampleRate; 24 | 25 | this.onmessage = function(e){ 26 | switch(e.data.command){ 27 | case 'init': 28 | init(e.data.config); 29 | break; 30 | case 'record': 31 | record(e.data.buffer); 32 | break; 33 | case 'exportWAV': 34 | exportWAV(e.data.type); 35 | break; 36 | case 'exportMonoWAV': 37 | exportMonoWAV(e.data.type); 38 | break; 39 | case 'getBuffers': 40 | getBuffers(); 41 | break; 42 | case 'clear': 43 | clear(); 44 | break; 45 | } 46 | }; 47 | 48 | function init(config){ 49 | sampleRate = config.sampleRate; 50 | } 51 | 52 | function record(inputBuffer){ 53 | recBuffersL.push(inputBuffer[0]); 54 | recBuffersR.push(inputBuffer[1]); 55 | recLength += inputBuffer[0].length; 56 | } 57 | 58 | function exportWAV(type){ 59 | var bufferL = mergeBuffers(recBuffersL, recLength); 60 | var bufferR = mergeBuffers(recBuffersR, recLength); 61 | var interleaved = interleave(bufferL, bufferR); 62 | var dataview = encodeWAV(interleaved); 63 | var audioBlob = new Blob([dataview], { type: type }); 64 | 65 | this.postMessage(audioBlob); 66 | } 67 | 68 | function exportMonoWAV(type){ 69 | var bufferL = mergeBuffers(recBuffersL, recLength); 70 | var dataview = encodeWAV(bufferL, true); 71 | var audioBlob = new Blob([dataview], { type: type }); 72 | 73 | this.postMessage(audioBlob); 74 | } 75 | 76 | function getBuffers() { 77 | var buffers = []; 78 | buffers.push( mergeBuffers(recBuffersL, recLength) ); 79 | buffers.push( mergeBuffers(recBuffersR, recLength) ); 80 | this.postMessage(buffers); 81 | } 82 | 83 | function clear(){ 84 | recLength = 0; 85 | recBuffersL = []; 86 | recBuffersR = []; 87 | } 88 | 89 | function mergeBuffers(recBuffers, recLength){ 90 | var result = new Float32Array(recLength); 91 | var offset = 0; 92 | for (var i = 0; i < recBuffers.length; i++){ 93 | result.set(recBuffers[i], offset); 94 | offset += recBuffers[i].length; 95 | } 96 | return result; 97 | } 98 | 99 | function interleave(inputL, inputR){ 100 | var length = inputL.length + inputR.length; 101 | var result = new Float32Array(length); 102 | 103 | var index = 0, 104 | inputIndex = 0; 105 | 106 | while (index < length){ 107 | result[index++] = inputL[inputIndex]; 108 | result[index++] = inputR[inputIndex]; 109 | inputIndex++; 110 | } 111 | return result; 112 | } 113 | 114 | function floatTo16BitPCM(output, offset, input){ 115 | for (var i = 0; i < input.length; i++, offset+=2){ 116 | var s = Math.max(-1, Math.min(1, input[i])); 117 | output.setInt16(offset, s < 0 ? 
s * 0x8000 : s * 0x7FFF, true); 118 | } 119 | } 120 | 121 | function writeString(view, offset, string){ 122 | for (var i = 0; i < string.length; i++){ 123 | view.setUint8(offset + i, string.charCodeAt(i)); 124 | } 125 | } 126 | 127 | function encodeWAV(samples, mono){ 128 | var buffer = new ArrayBuffer(44 + samples.length * 2); 129 | var view = new DataView(buffer); 130 | 131 | /* RIFF identifier */ 132 | writeString(view, 0, 'RIFF'); 133 | /* file length */ 134 | view.setUint32(4, 32 + samples.length * 2, true); 135 | /* RIFF type */ 136 | writeString(view, 8, 'WAVE'); 137 | /* format chunk identifier */ 138 | writeString(view, 12, 'fmt '); 139 | /* format chunk length */ 140 | view.setUint32(16, 16, true); 141 | /* sample format (raw) */ 142 | view.setUint16(20, 1, true); 143 | /* channel count */ 144 | view.setUint16(22, mono?1:2, true); 145 | /* sample rate */ 146 | view.setUint32(24, sampleRate, true); 147 | /* byte rate (sample rate * block align) */ 148 | view.setUint32(28, sampleRate * 4, true); 149 | /* block align (channel count * bytes per sample) */ 150 | view.setUint16(32, 4, true); 151 | /* bits per sample */ 152 | view.setUint16(34, 16, true); 153 | /* data chunk identifier */ 154 | writeString(view, 36, 'data'); 155 | /* data chunk length */ 156 | view.setUint32(40, samples.length * 2, true); 157 | 158 | floatTo16BitPCM(view, 44, samples); 159 | 160 | return view; 161 | } 162 | -------------------------------------------------------------------------------- /inst/shiny/capture_speech/www/speech.js: -------------------------------------------------------------------------------- 1 | /*License (MIT) 2 | 3 | Copyright © 2013 Matt Diamond 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated 6 | documentation files (the "Software"), to deal in the Software without restriction, including without limitation 7 | the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and 8 | to permit persons to whom the Software is furnished to do so, subject to the following conditions: 9 | 10 | The above copyright notice and this permission notice shall be included in all copies or substantial portions of 11 | the Software. 12 | 13 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO 14 | THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 15 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF 16 | CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER 17 | DEALINGS IN THE SOFTWARE. 
18 | https://webaudiodemos.appspot.com/AudioRecorder/js/recorderjs/recorder.js 19 | */ 20 | 21 | (function(window){ 22 | 23 | var WORKER_PATH = 'recorderWorker.js'; 24 | 25 | var Recorder = function(source, cfg){ 26 | var config = cfg || {}; 27 | var bufferLen = config.bufferLen || 4096; 28 | this.context = source.context; 29 | if(!this.context.createScriptProcessor){ 30 | this.node = this.context.createJavaScriptNode(bufferLen, 2, 2); 31 | } else { 32 | this.node = this.context.createScriptProcessor(bufferLen, 2, 2); 33 | } 34 | 35 | var worker = new Worker(config.workerPath || WORKER_PATH); 36 | worker.postMessage({ 37 | command: 'init', 38 | config: { 39 | sampleRate: this.context.sampleRate 40 | } 41 | }); 42 | var recording = false, 43 | currCallback; 44 | 45 | this.node.onaudioprocess = function(e){ 46 | if (!recording) return; 47 | worker.postMessage({ 48 | command: 'record', 49 | buffer: [ 50 | e.inputBuffer.getChannelData(0), 51 | e.inputBuffer.getChannelData(1) 52 | ] 53 | }); 54 | } 55 | 56 | this.configure = function(cfg){ 57 | for (var prop in cfg){ 58 | if (cfg.hasOwnProperty(prop)){ 59 | config[prop] = cfg[prop]; 60 | } 61 | } 62 | } 63 | 64 | this.record = function(){ 65 | recording = true; 66 | } 67 | 68 | this.stop = function(){ 69 | recording = false; 70 | } 71 | 72 | this.clear = function(){ 73 | worker.postMessage({ command: 'clear' }); 74 | } 75 | 76 | this.getBuffers = function(cb) { 77 | currCallback = cb || config.callback; 78 | worker.postMessage({ command: 'getBuffers' }) 79 | } 80 | 81 | this.exportWAV = function(cb, type){ 82 | currCallback = cb || config.callback; 83 | type = type || config.type || 'audio/wav'; 84 | if (!currCallback) throw new Error('Callback not set'); 85 | worker.postMessage({ 86 | command: 'exportWAV', 87 | type: type 88 | }); 89 | } 90 | 91 | this.exportMonoWAV = function(cb, type){ 92 | currCallback = cb || config.callback; 93 | type = type || config.type || 'audio/wav'; 94 | if (!currCallback) throw new Error('Callback not set'); 95 | worker.postMessage({ 96 | command: 'exportMonoWAV', 97 | type: type 98 | }); 99 | } 100 | 101 | worker.onmessage = function(e){ 102 | var blob = e.data; 103 | currCallback(blob); 104 | } 105 | 106 | source.connect(this.node); 107 | this.node.connect(this.context.destination); // if the script node is not connected to an output the "onaudioprocess" event is not triggered in chrome. 
108 | }; 109 | 110 | Recorder.setupDownload = function(blob, filename){ 111 | //var url = (window.URL || window.webkitURL).createObjectURL(blob); 112 | //var link = document.getElementById("save"); 113 | //link.href = url; 114 | //link.download = filename || 'output.wav'; 115 | } 116 | 117 | window.Recorder = Recorder; 118 | 119 | })(window); 120 | -------------------------------------------------------------------------------- /inst/shiny/capture_speech/www/style.css: -------------------------------------------------------------------------------- 1 | html { overflow: hidden; } 2 | canvas { 3 | display: inline-block; 4 | background: #202020; 5 | width: 95%; 6 | height: 45%; 7 | box-shadow: 0px 0px 10px blue; 8 | } 9 | #controls { 10 | display: flex; 11 | flex-direction: row; 12 | align-items: center; 13 | justify-content: space-around; 14 | height: 20%; 15 | width: 100%; 16 | } 17 | #record { height: 15vh; } 18 | #record.recording { 19 | background: red; 20 | background: -webkit-radial-gradient(center, ellipse cover, #ff0000 0%,lightgrey 75%,lightgrey 100%,#7db9e8 100%); 21 | background: -moz-radial-gradient(center, ellipse cover, #ff0000 0%,lightgrey 75%,lightgrey 100%,#7db9e8 100%); 22 | background: radial-gradient(center, ellipse cover, #ff0000 0%,lightgrey 75%,lightgrey 100%,#7db9e8 100%); 23 | } 24 | #save, #save img { height: 10vh; } 25 | #save { opacity: 0.25;} 26 | #save[download] { opacity: 1;} 27 | #viz { 28 | height: 80%; 29 | width: 100%; 30 | display: flex; 31 | flex-direction: column; 32 | justify-content: space-around; 33 | align-items: center; 34 | } 35 | @media (orientation: landscape) { 36 | body { flex-direction: row;} 37 | #controls { flex-direction: column; height: 100%; width: 10%;} 38 | #viz { height: 100%; width: 90%;} 39 | } 40 | -------------------------------------------------------------------------------- /inst/test-doc-no.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci/googleLanguageR/7c6f93b0977ac7ac2189a6b5648362b12509c953/inst/test-doc-no.pdf -------------------------------------------------------------------------------- /inst/test-doc.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci/googleLanguageR/7c6f93b0977ac7ac2189a6b5648362b12509c953/inst/test-doc.pdf -------------------------------------------------------------------------------- /inst/woman1_wb.wav: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci/googleLanguageR/7c6f93b0977ac7ac2189a6b5648362b12509c953/inst/woman1_wb.wav -------------------------------------------------------------------------------- /man/gl_auth.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/auth.R 3 | \name{gl_auth} 4 | \alias{gl_auth} 5 | \alias{gl_auto_auth} 6 | \title{Authenticate with Google language API services} 7 | \usage{ 8 | gl_auth(json_file) 9 | 10 | gl_auto_auth(...) 
11 | } 12 | \arguments{ 13 | \item{json_file}{Authentication json file you have downloaded from your Google Project} 14 | 15 | \item{...}{additional argument to 16 | pass to \code{\link{gar_attach_auto_auth}}.} 17 | } 18 | \description{ 19 | Authenticate with Google language API services 20 | } 21 | \details{ 22 | The best way to authenticate is to use an environment argument pointing at your authentication file. 23 | 24 | Set the file location of your download Google Project JSON file in a \code{GL_AUTH} argument 25 | 26 | Then, when you load the library you should auto-authenticate 27 | 28 | However, you can authenticate directly using this function pointing at your JSON auth file. 29 | } 30 | \examples{ 31 | 32 | \dontrun{ 33 | library(googleLanguageR) 34 | gl_auth("location_of_json_file.json") 35 | } 36 | 37 | \dontrun{ 38 | library(googleLanguageR) 39 | gl_auto_auth() 40 | gl_auto_auth(environment_var = "GAR_AUTH_FILE") 41 | } 42 | } 43 | -------------------------------------------------------------------------------- /man/gl_nlp.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/natural-language.R 3 | \name{gl_nlp} 4 | \alias{gl_nlp} 5 | \title{Perform Natural Language Analysis} 6 | \usage{ 7 | gl_nlp( 8 | string, 9 | nlp_type = c("annotateText", "analyzeEntities", "analyzeSentiment", "analyzeSyntax", 10 | "analyzeEntitySentiment", "classifyText"), 11 | type = c("PLAIN_TEXT", "HTML"), 12 | language = c("en", "zh", "zh-Hant", "fr", "de", "it", "ja", "ko", "pt", "es"), 13 | encodingType = c("UTF8", "UTF16", "UTF32", "NONE") 14 | ) 15 | } 16 | \arguments{ 17 | \item{string}{A vector of text to detect language for, or Google Cloud Storage URI(s)} 18 | 19 | \item{nlp_type}{The type of Natural Language Analysis to perform. The default \code{annotateText} will perform all features in one call.} 20 | 21 | \item{type}{Whether input text is plain text or a HTML page} 22 | 23 | \item{language}{Language of source, must be supported by API.} 24 | 25 | \item{encodingType}{Text encoding that the caller uses to process the output} 26 | } 27 | \value{ 28 | A list of the following objects, if those fields are asked for via \code{nlp_type}: 29 | 30 | \itemize{ 31 | \item{sentences - }{\href{https://cloud.google.com/natural-language/docs/reference/rest/v1/Sentence}{Sentences in the input document}} 32 | \item{tokens - }{\href{https://cloud.google.com/natural-language/docs/reference/rest/v1/Token}{Tokens, along with their syntactic information, in the input document}} 33 | \item{entities - }{\href{https://cloud.google.com/natural-language/docs/reference/rest/v1/Entity}{Entities, along with their semantic information, in the input document}} 34 | \item{documentSentiment - }{\href{https://cloud.google.com/natural-language/docs/reference/rest/v1/Sentiment}{The overall sentiment for the document}} 35 | \item{classifyText -}{\href{https://cloud.google.com/natural-language/docs/classifying-text}{Classification of the document}} 36 | \item{language - }{The language of the text, which will be the same as the language specified in the request or, if not specified, the automatically-detected language} 37 | \item{text - }{The original text passed into the API. \code{NA} if not passed due to being zero-length etc. 
} 38 | } 39 | } 40 | \description{ 41 | Analyse text entities, sentiment, syntax and categorisation using the Google Natural Language API 42 | } 43 | \details{ 44 | \code{string} can be a character vector, or a location of a file content on Google cloud Storage. 45 | This URI must be of the form \code{gs://bucket_name/object_name} 46 | 47 | Encoding type can usually be left at default \code{UTF8}. 48 | \href{https://cloud.google.com/natural-language/docs/reference/rest/v1/EncodingType}{Read more here} 49 | 50 | The current language support is available \href{https://cloud.google.com/natural-language/docs/languages}{here} 51 | } 52 | \examples{ 53 | 54 | \dontrun{ 55 | 56 | text <- "to administer medicince to animals is frequently a very difficult matter, 57 | and yet sometimes it's necessary to do so" 58 | nlp <- gl_nlp(text) 59 | 60 | nlp$sentences 61 | 62 | nlp$tokens 63 | 64 | nlp$entities 65 | 66 | nlp$documentSentiment 67 | 68 | ## vectorised input 69 | texts <- c("The cat sat one the mat", "oh no it didn't you fool") 70 | nlp_results <- gl_nlp(texts) 71 | 72 | 73 | 74 | } 75 | 76 | } 77 | \seealso{ 78 | \url{https://cloud.google.com/natural-language/docs/reference/rest/v1/documents} 79 | } 80 | -------------------------------------------------------------------------------- /man/gl_speech.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/speech-to-text.R 3 | \name{gl_speech} 4 | \alias{gl_speech} 5 | \title{Call Google Speech API} 6 | \usage{ 7 | gl_speech( 8 | audio_source, 9 | encoding = c("LINEAR16", "FLAC", "MULAW", "AMR", "AMR_WB", "OGG_OPUS", 10 | "SPEEX_WITH_HEADER_BYTE"), 11 | sampleRateHertz = NULL, 12 | languageCode = "en-US", 13 | maxAlternatives = 1L, 14 | profanityFilter = FALSE, 15 | speechContexts = NULL, 16 | asynch = FALSE, 17 | customConfig = NULL 18 | ) 19 | } 20 | \arguments{ 21 | \item{audio_source}{File location of audio data, or Google Cloud Storage URI} 22 | 23 | \item{encoding}{Encoding of audio data sent} 24 | 25 | \item{sampleRateHertz}{Sample rate in Hertz of audio data. Valid values \code{8000-48000}. Optimal and default if left \code{NULL} is \code{16000}} 26 | 27 | \item{languageCode}{Language of the supplied audio as a \code{BCP-47} language tag} 28 | 29 | \item{maxAlternatives}{Maximum number of recognition hypotheses to be returned. \code{0-30}} 30 | 31 | \item{profanityFilter}{If \code{TRUE} will attempt to filter out profanities} 32 | 33 | \item{speechContexts}{An optional character vector of context to assist the speech recognition} 34 | 35 | \item{asynch}{If your \code{audio_source} is greater than 60 seconds, set this to TRUE to return an asynchronous call} 36 | 37 | \item{customConfig}{[optional] A \code{RecognitionConfig} object that will be converted from a list to JSON via \code{\link[jsonlite]{toJSON}} - see \href{https://cloud.google.com/speech-to-text/docs/reference/rest/v1p1beta1/RecognitionConfig}{RecognitionConfig documentation}. The \code{languageCode} will be taken from this functions arguments if not present since it is required.} 38 | } 39 | \value{ 40 | A list of two tibbles: \code{$transcript}, a tibble of the \code{transcript} with a \code{confidence}; \code{$timings}, a tibble that contains \code{startTime}, \code{endTime} per \code{word}. If maxAlternatives is greater than 1, then the transcript will return near-duplicate rows with other interpretations of the text. 
41 | If \code{asynch} is TRUE, then an operation you will need to pass to \link{gl_speech_op} to get the finished result. 42 | } 43 | \description{ 44 | Turn audio into text 45 | } 46 | \details{ 47 | Google Cloud Speech API enables developers to convert audio to text by applying powerful 48 | neural network models in an easy to use API. 49 | The API recognizes over 80 languages and variants, to support your global user base. 50 | You can transcribe the text of users dictating to an application’s microphone, 51 | enable command-and-control through voice, or transcribe audio files, among many other use cases. 52 | Recognize audio uploaded in the request, and integrate with your audio storage on Google Cloud Storage, 53 | by using the same technology Google uses to power its own products. 54 | } 55 | \section{AudioEncoding}{ 56 | 57 | 58 | Audio encoding of the data sent in the audio message. All encodings support only 1 channel (mono) audio. 59 | Only FLAC and WAV include a header that describes the bytes of audio that follow the header. 60 | The other encodings are raw audio bytes with no header. 61 | For best results, the audio source should be captured and transmitted using a 62 | lossless encoding (FLAC or LINEAR16). 63 | Recognition accuracy may be reduced if lossy codecs, which include the other codecs listed in this section, 64 | are used to capture or transmit the audio, particularly if background noise is present. 65 | 66 | Read more on audio encodings here \url{https://cloud.google.com/speech/docs/encoding} 67 | } 68 | 69 | \section{WordInfo}{ 70 | 71 | 72 | 73 | \code{startTime} - Time offset relative to the beginning of the audio, and corresponding to the start of the spoken word. 74 | 75 | \code{endTime} - Time offset relative to the beginning of the audio, and corresponding to the end of the spoken word. 76 | 77 | \code{word} - The word corresponding to this set of information. 
78 | } 79 | 80 | \examples{ 81 | 82 | \dontrun{ 83 | 84 | test_audio <- system.file("woman1_wb.wav", package = "googleLanguageR") 85 | result <- gl_speech(test_audio) 86 | 87 | result$transcript 88 | result$timings 89 | 90 | result2 <- gl_speech(test_audio, maxAlternatives = 2L) 91 | result2$transcript 92 | 93 | result_brit <- gl_speech(test_audio, languageCode = "en-GB") 94 | 95 | 96 | ## make an asynchronous API request (mandatory for sound files over 60 seconds) 97 | asynch <- gl_speech(test_audio, asynch = TRUE) 98 | 99 | ## Send to gl_speech_op() for status or finished result 100 | gl_speech_op(asynch) 101 | 102 | ## Upload to GCS bucket for long files > 60 seconds 103 | test_gcs <- "gs://mark-edmondson-public-files/googleLanguageR/a-dream-mono.wav" 104 | gcs <- gl_speech(test_gcs, sampleRateHertz = 44100L, asynch = TRUE) 105 | gl_speech_op(gcs) 106 | 107 | ## Use a custom configuration 108 | my_config <- list(encoding = "LINEAR16", 109 | diarizationConfig = list( 110 | enableSpeakerDiarization = TRUE, 111 | minSpeakerCount = 2, 112 | maxSpeakCount = 3 113 | )) 114 | 115 | # languageCode is required, so will be added if not in your custom config 116 | gl_speech(my_audio, languageCode = "en-US", customConfig = my_config) 117 | 118 | } 119 | 120 | 121 | 122 | } 123 | \seealso{ 124 | \url{https://cloud.google.com/speech/reference/rest/v1/speech/recognize} 125 | } 126 | -------------------------------------------------------------------------------- /man/gl_speech_op.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/speech-to-text.R 3 | \name{gl_speech_op} 4 | \alias{gl_speech_op} 5 | \title{Get a speech operation} 6 | \usage{ 7 | gl_speech_op(operation = .Last.value) 8 | } 9 | \arguments{ 10 | \item{operation}{A speech operation object from \link{gl_speech} when \code{asynch = TRUE}} 11 | } 12 | \value{ 13 | If the operation is still running, another operation object. 
If done, the result as per \link{gl_speech} 14 | } 15 | \description{ 16 | For asynchronous calls of audio over 60 seconds, this returns the finished job 17 | } 18 | \examples{ 19 | 20 | \dontrun{ 21 | 22 | test_audio <- system.file("woman1_wb.wav", package = "googleLanguageR") 23 | 24 | ## make an asynchronous API request (mandatory for sound files over 60 seconds) 25 | asynch <- gl_speech(test_audio, asynch = TRUE) 26 | 27 | ## Send to gl_speech_op() for status or finished result 28 | gl_speech_op(asynch) 29 | 30 | } 31 | 32 | } 33 | \seealso{ 34 | \link{gl_speech} 35 | } 36 | -------------------------------------------------------------------------------- /man/gl_talk.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/text-to-speech.R 3 | \name{gl_talk} 4 | \alias{gl_talk} 5 | \title{Perform text to speech} 6 | \usage{ 7 | gl_talk( 8 | input, 9 | output = "output.wav", 10 | languageCode = "en", 11 | gender = c("SSML_VOICE_GENDER_UNSPECIFIED", "MALE", "FEMALE", "NEUTRAL"), 12 | name = NULL, 13 | audioEncoding = c("LINEAR16", "MP3", "OGG_OPUS"), 14 | speakingRate = 1, 15 | pitch = 0, 16 | volumeGainDb = 0, 17 | sampleRateHertz = NULL, 18 | inputType = c("text", "ssml"), 19 | effectsProfileIds = NULL 20 | ) 21 | } 22 | \arguments{ 23 | \item{input}{The text to turn into speech} 24 | 25 | \item{output}{Where to save the speech audio file} 26 | 27 | \item{languageCode}{The language of the voice as a \code{BCP-47} language code} 28 | 29 | \item{gender}{The gender of the voice, if available} 30 | 31 | \item{name}{Name of the voice, see list via \link{gl_talk_languages} for supported voices. Set to \code{NULL} to make the service choose a voice based on \code{languageCode} and \code{gender}.} 32 | 33 | \item{audioEncoding}{Format of the requested audio stream} 34 | 35 | \item{speakingRate}{Speaking rate/speed between \code{0.25} and \code{4.0}} 36 | 37 | \item{pitch}{Speaking pitch between \code{-20.0} and \code{20.0} in semitones.} 38 | 39 | \item{volumeGainDb}{Volumne gain in dB} 40 | 41 | \item{sampleRateHertz}{Sample rate for returned audio} 42 | 43 | \item{inputType}{Choose between \code{text} (the default) or SSML markup. The \code{input} text must be SSML markup if you choose \code{ssml}} 44 | 45 | \item{effectsProfileIds}{Optional. An identifier which selects 'audio effects' profiles that are applied on (post synthesized) text to speech. Effects are applied on top of each other in the order they are given} 46 | } 47 | \value{ 48 | The file output name you supplied as \code{output} 49 | } 50 | \description{ 51 | Synthesizes speech synchronously: receive results after all text input has been processed. 52 | } 53 | \details{ 54 | Requires the Cloud Text-To-Speech API to be activated for your Google Cloud project. 
55 | 56 | Supported voices are here \url{https://cloud.google.com/text-to-speech/docs/voices} and can be imported into R via \link{gl_talk_languages} 57 | 58 | To play the audio in code via a browser see \link{gl_talk_player} 59 | 60 | To use Speech Synthesis Markup Language (SSML) select \code{inputType=ssml} - more details on using this to insert pauses, sounds and breaks in your audio can be found here: \url{https://cloud.google.com/text-to-speech/docs/ssml} 61 | 62 | To use audio profiles, supply a character vector of the available audio profiles listed here: \url{https://cloud.google.com/text-to-speech/docs/audio-profiles} - the audio profiles are applied in the order given. For instance \code{effectsProfileIds="wearable-class-device"} will optimise output for smart watches, \code{effectsProfileIds=c("wearable-class-device","telephony-class-application")} will apply sound filters optimised for smart watches, then telephonic devices. 63 | } 64 | \examples{ 65 | 66 | \dontrun{ 67 | library(magrittr) 68 | gl_talk("The rain in spain falls mainly in the plain", 69 | output = "output.wav") 70 | 71 | gl_talk("Testing my new audio player") \%>\% gl_talk_player() 72 | 73 | # using SSML 74 | gl_talk('The SSML 75 | standard is defined by the 76 | W3C.', 77 | inputType = "ssml") 78 | 79 | # using effects profiles 80 | gl_talk("This sounds great on headphones", 81 | effectsProfileIds = "headphone-class-device") 82 | 83 | } 84 | 85 | } 86 | \seealso{ 87 | \url{https://cloud.google.com/text-to-speech/docs/} 88 | } 89 | -------------------------------------------------------------------------------- /man/gl_talk_languages.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/text-to-speech.R 3 | \name{gl_talk_languages} 4 | \alias{gl_talk_languages} 5 | \title{Get a list of voices available for text to speech} 6 | \usage{ 7 | gl_talk_languages(languageCode = NULL) 8 | } 9 | \arguments{ 10 | \item{languageCode}{A \code{BCP-47} language tag. If specified, will only return voices that can be used to synthesize this languageCode} 11 | } 12 | \description{ 13 | Returns a list of voices supported for synthesis. 14 | } 15 | -------------------------------------------------------------------------------- /man/gl_talk_player.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/text-to-speech.R 3 | \name{gl_talk_player} 4 | \alias{gl_talk_player} 5 | \title{Play audio in a browser} 6 | \usage{ 7 | gl_talk_player(audio = "output.wav", html = "player.html") 8 | } 9 | \arguments{ 10 | \item{audio}{The file location of the audio file. Must be supported by HTML5} 11 | 12 | \item{html}{The html file location that will be created host the audio} 13 | } 14 | \description{ 15 | This uses HTML5 audio tags to play audio in your browser 16 | } 17 | \details{ 18 | A platform neutral way to play audio is not easy, so this uses your browser to play it instead. 
19 | } 20 | \examples{ 21 | 22 | \dontrun{ 23 | 24 | gl_talk("Testing my new audio player") \%>\% gl_talk_player() 25 | 26 | } 27 | 28 | } 29 | -------------------------------------------------------------------------------- /man/gl_talk_shiny.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/text-to-speech.R 3 | \name{gl_talk_shiny} 4 | \alias{gl_talk_shiny} 5 | \title{Speak in Shiny module (server)} 6 | \usage{ 7 | gl_talk_shiny( 8 | input, 9 | output, 10 | session, 11 | transcript, 12 | ..., 13 | autoplay = TRUE, 14 | controls = TRUE, 15 | loop = FALSE, 16 | keep_wav = FALSE 17 | ) 18 | } 19 | \arguments{ 20 | \item{input}{shiny input} 21 | 22 | \item{output}{shiny output} 23 | 24 | \item{session}{shiny session} 25 | 26 | \item{transcript}{The (reactive) text to talk} 27 | 28 | \item{...}{ 29 | Arguments passed on to \code{\link[=gl_talk]{gl_talk}} 30 | \describe{ 31 | \item{\code{languageCode}}{The language of the voice as a \code{BCP-47} language code} 32 | \item{\code{name}}{Name of the voice, see list via \link{gl_talk_languages} for supported voices. Set to \code{NULL} to make the service choose a voice based on \code{languageCode} and \code{gender}.} 33 | \item{\code{gender}}{The gender of the voice, if available} 34 | \item{\code{audioEncoding}}{Format of the requested audio stream} 35 | \item{\code{speakingRate}}{Speaking rate/speed between \code{0.25} and \code{4.0}} 36 | \item{\code{pitch}}{Speaking pitch between \code{-20.0} and \code{20.0} in semitones.} 37 | \item{\code{volumeGainDb}}{Volumne gain in dB} 38 | \item{\code{sampleRateHertz}}{Sample rate for returned audio} 39 | \item{\code{inputType}}{Choose between \code{text} (the default) or SSML markup. The \code{input} text must be SSML markup if you choose \code{ssml}} 40 | \item{\code{effectsProfileIds}}{Optional. An identifier which selects 'audio effects' profiles that are applied on (post synthesized) text to speech. Effects are applied on top of each other in the order they are given} 41 | }} 42 | 43 | \item{autoplay}{passed to the HTML audio player - default \code{TRUE} plays on load} 44 | 45 | \item{controls}{passed to the HTML audio player - default \code{TRUE} shows controls} 46 | 47 | \item{loop}{passed to the HTML audio player - default \code{FALSE} does not loop} 48 | 49 | \item{keep_wav}{keep the generated wav files if TRUE.} 50 | } 51 | \description{ 52 | Call via \code{shiny::callModule(gl_talk_shiny, "your_id")} 53 | } 54 | -------------------------------------------------------------------------------- /man/gl_talk_shinyUI.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/text-to-speech.R 3 | \name{gl_talk_shinyUI} 4 | \alias{gl_talk_shinyUI} 5 | \title{Speak in Shiny module (ui)} 6 | \usage{ 7 | gl_talk_shinyUI(id) 8 | } 9 | \arguments{ 10 | \item{id}{The Shiny id} 11 | } 12 | \description{ 13 | Speak in Shiny module (ui) 14 | } 15 | \details{ 16 | Shiny Module for use with \link{gl_talk_shiny}. 
17 | } 18 | -------------------------------------------------------------------------------- /man/gl_translate.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/translate.R 3 | \name{gl_translate} 4 | \alias{gl_translate} 5 | \title{Translate the language of text within a request} 6 | \usage{ 7 | gl_translate( 8 | t_string, 9 | target = "en", 10 | format = c("text", "html"), 11 | source = "", 12 | model = c("nmt", "base") 13 | ) 14 | } 15 | \arguments{ 16 | \item{t_string}{A character vector of text to detect language for} 17 | 18 | \item{target}{The target language} 19 | 20 | \item{format}{Whether the text is plain or HTML} 21 | 22 | \item{source}{Specify the language to translate from. Will detect it if left default} 23 | 24 | \item{model}{What translation model to use} 25 | } 26 | \value{ 27 | A tibble of \code{translatedText} and \code{detectedSourceLanguage} 28 | and \code{text} of length equal to the vector of text you passed in. 29 | } 30 | \description{ 31 | Translate character vectors via the Google Translate API 32 | } 33 | \details{ 34 | You can translate a vector of strings, although if too many for one call then it will be 35 | broken up into one API call per element. 36 | This is the same cost as charging is per character translated, but will take longer. 37 | 38 | If translating HTML set the \code{format = "html"}. 39 | Consider removing anything not needed to be translated first, 40 | such as JavaScript and CSS scripts. See example on how to do this with \code{rvest} 41 | 42 | The API limits in three ways: characters per day, characters per 100 seconds, 43 | and API requests per 100 seconds. 44 | All can be set in the API manager 45 | \url{https://console.developers.google.com/apis/api/translate.googleapis.com/quotas} 46 | } 47 | \examples{ 48 | 49 | \dontrun{ 50 | 51 | text <- "to administer medicine to animals is frequently a very difficult matter, 52 | and yet sometimes it's necessary to do so" 53 | 54 | gl_translate(text, target = "ja") 55 | 56 | # translate webpages using rvest to process beforehand 57 | library(rvest) 58 | library(googleLanguageR) 59 | 60 | # translate webpages 61 | 62 | # dr.dk article 63 | my_url <- "http://bit.ly/2yhrmrH" 64 | 65 | ## in this case the content to translate is in css selector '.wcms-article-content' 66 | read_html(my_url) \%>\% 67 | html_node(css = ".wcms-article-content") \%>\% 68 | html_text \%>\% 69 | gl_translate(format = "html") 70 | 71 | } 72 | 73 | } 74 | \seealso{ 75 | \url{https://cloud.google.com/translate/docs/reference/translate} 76 | 77 | Other translations: 78 | \code{\link{gl_translate_detect}()}, 79 | \code{\link{gl_translate_languages}()} 80 | } 81 | \concept{translations} 82 | -------------------------------------------------------------------------------- /man/gl_translate_detect.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/translate.R 3 | \name{gl_translate_detect} 4 | \alias{gl_translate_detect} 5 | \title{Detect the language of text within a request} 6 | \usage{ 7 | gl_translate_detect(string) 8 | } 9 | \arguments{ 10 | \item{string}{A character vector of text to detect language for} 11 | } 12 | \value{ 13 | A tibble of the detected languages with columns \code{confidence}, \code{isReliable}, \code{language}, and \code{text} of length equal to the vector of text you 
passed in. 14 | } 15 | \description{ 16 | Detect the language of text within a request 17 | } 18 | \details{ 19 | Consider using \code{library(cld2)} and \code{cld2::detect_language} instead offline, 20 | since that is free and local without needing a paid API call. 21 | 22 | \link{gl_translate} also returns a detection of the language, 23 | so you could also wish to do it in one step via that function. 24 | } 25 | \examples{ 26 | 27 | \dontrun{ 28 | 29 | gl_translate_detect("katten sidder på måtten") 30 | # Detecting language: 39 characters - katten sidder på måtten... 31 | # confidence isReliable language text 32 | # 1 0.536223 FALSE da katten sidder på måtten 33 | 34 | 35 | } 36 | 37 | } 38 | \seealso{ 39 | \url{https://cloud.google.com/translate/docs/reference/detect} 40 | 41 | Other translations: 42 | \code{\link{gl_translate_languages}()}, 43 | \code{\link{gl_translate}()} 44 | } 45 | \concept{translations} 46 | -------------------------------------------------------------------------------- /man/gl_translate_document.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/translate-document.R 3 | \name{gl_translate_document} 4 | \alias{gl_translate_document} 5 | \title{Translate document} 6 | \usage{ 7 | gl_translate_document( 8 | d_path, 9 | target = "es-ES", 10 | output_path = "out.pdf", 11 | format = c("pdf"), 12 | source = "en-UK", 13 | model = c("nmt", "base"), 14 | location = "global" 15 | ) 16 | } 17 | \arguments{ 18 | \item{d_path}{path of the document to be translated} 19 | 20 | \item{output_path}{where to save the translated document} 21 | 22 | \item{format}{currently only pdf-files are supported} 23 | } 24 | \value{ 25 | output filename 26 | } 27 | \description{ 28 | Translate a document via the Google Translate API 29 | } 30 | \examples{ 31 | 32 | \dontrun{ 33 | gl_translate_document(system.file(package = "googleLanguageR","test-doc.pdf"), "no") 34 | 35 | } 36 | } 37 | \seealso{ 38 | Other translations: 39 | \code{\link{gl_translate_detect}()}, 40 | \code{\link{gl_translate_languages}()}, 41 | \code{\link{gl_translate}()} 42 | } 43 | \concept{translations} 44 | -------------------------------------------------------------------------------- /man/gl_translate_languages.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/translate.R 3 | \name{gl_translate_languages} 4 | \alias{gl_translate_languages} 5 | \title{Lists languages from Google Translate API} 6 | \usage{ 7 | gl_translate_languages(target = "en") 8 | } 9 | \arguments{ 10 | \item{target}{If specified, language names are localized in target language} 11 | } 12 | \value{ 13 | A tibble of supported languages 14 | } 15 | \description{ 16 | Returns a list of supported languages for translation. 17 | } 18 | \details{ 19 | Supported language codes, generally consisting of its ISO 639-1 identifier. (E.g. \code{'en', 'ja'}). 20 | In certain cases, BCP-47 codes including language + region identifiers are returned (e.g. 
\code{'zh-TW', 'zh-CH'}) 21 | } 22 | \examples{ 23 | 24 | \dontrun{ 25 | 26 | # default english names of languages supported 27 | gl_translate_languages() 28 | 29 | # specify a language code to get other names, such as Danish 30 | gl_translate_languages("da") 31 | 32 | } 33 | } 34 | \seealso{ 35 | \url{https://cloud.google.com/translate/docs/reference/languages} 36 | 37 | Other translations: 38 | \code{\link{gl_translate_detect}()}, 39 | \code{\link{gl_translate_document}()}, 40 | \code{\link{gl_translate}()} 41 | } 42 | \concept{translations} 43 | -------------------------------------------------------------------------------- /man/googleLanguageR.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/googleLanguageR.R 3 | \docType{package} 4 | \name{googleLanguageR} 5 | \alias{googleLanguageR} 6 | \title{googleLanguageR} 7 | \description{ 8 | This package contains functions for analysing language through the 9 | Google Cloud Machine Learning APIs 10 | } 11 | \details{ 12 | For examples and documentation see the vignettes and the website: 13 | 14 | \url{http://code.markedmondson.me/googleLanguageR/} 15 | } 16 | \seealso{ 17 | \url{https://cloud.google.com/products/machine-learning/} 18 | } 19 | -------------------------------------------------------------------------------- /man/is.NullOb.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utilities.R 3 | \name{is.NullOb} 4 | \alias{is.NullOb} 5 | \title{A helper function that tests whether an object is either NULL _or_ 6 | a list of NULLs} 7 | \usage{ 8 | is.NullOb(x) 9 | } 10 | \description{ 11 | A helper function that tests whether an object is either NULL _or_ 12 | a list of NULLs 13 | } 14 | \keyword{internal} 15 | -------------------------------------------------------------------------------- /man/rmNullObs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utilities.R 3 | \name{rmNullObs} 4 | \alias{rmNullObs} 5 | \title{Recursively step down into list, removing all such objects} 6 | \usage{ 7 | rmNullObs(x) 8 | } 9 | \description{ 10 | Recursively step down into list, removing all such objects 11 | } 12 | \keyword{internal} 13 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(googleLanguageR) 3 | 4 | test_check("googleLanguageR") 5 | -------------------------------------------------------------------------------- /tests/testthat/comments.rds: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci/googleLanguageR/7c6f93b0977ac7ac2189a6b5648362b12509c953/tests/testthat/comments.rds -------------------------------------------------------------------------------- /tests/testthat/prep_tests.R: -------------------------------------------------------------------------------- 1 | library(googleLanguageR) 2 | library(rvest) 3 | library(magrittr) 4 | library(xml2) 5 | library(rvest) 6 | 7 | local_auth <- Sys.getenv("GL_AUTH") != "" 8 | if(!local_auth){ 9 | cat("\nNo authentication file detected\n") 10 | } else { 11 | cat("\nFound local auth file:", 
Sys.getenv("GL_AUTH")) 12 | } 13 | 14 | on_travis <- Sys.getenv("CI") == "true" 15 | if(on_travis){ 16 | cat("\n#testing on CI - working dir: ", path.expand(getwd()), "\n") 17 | } else { 18 | cat("\n#testing not on CI\n") 19 | } 20 | 21 | ## Generate test text and audio 22 | testthat::context("Setup test files") 23 | 24 | test_text <- "Norma is a small constellation in the Southern Celestial Hemisphere between Ara and Lupus, one of twelve drawn up in the 18th century by French astronomer Nicolas Louis de Lacaille and one of several depicting scientific instruments. Its name refers to a right angle in Latin, and is variously considered to represent a rule, a carpenter's square, a set square or a level. It remains one of the 88 modern constellations. Four of Norma's brighter stars make up a square in the field of faint stars. Gamma2 Normae is the brightest star with an apparent magnitude of 4.0. Mu Normae is one of the most luminous stars known, but is partially obscured by distance and cosmic dust. Four star systems are known to harbour planets. " 25 | test_text2 <- "Solomon Wariso (born 11 November 1966 in Portsmouth) is a retired English sprinter who competed primarily in the 200 and 400 metres.[1] He represented his country at two outdoor and three indoor World Championships and is the British record holder in the indoor 4 × 400 metres relay." 26 | trans_text <- "Der gives Folk, der i den Grad omgaaes letsindigt og skammeligt med Andres Ideer, de snappe op, at de burde tiltales for ulovlig Omgang med Hittegods." 27 | expected <- "There are people who are soberly and shamefully opposed to the ideas of others, who make it clear that they should be charged with unlawful interference with the former." 28 | 29 | test_gcs <- "gs://mark-edmondson-public-files/googleLanguageR/a-dream-mono.wav" 30 | 31 | test_audio <- system.file(package = "googleLanguageR", "woman1_wb.wav") 32 | 33 | speaker_d_test <- "gs://mark-edmondson-public-read/boring_conversation.wav" 34 | 35 | -------------------------------------------------------------------------------- /tests/testthat/test-translate-document.R: -------------------------------------------------------------------------------- 1 | source("prep_tests.R") 2 | test_that("document tranlation works", { 3 | skip_on_cran() 4 | skip_on_travis() 5 | 6 | my_out <- tempfile(fileext = ".pdf") 7 | 8 | gl_translate_document(system.file(package = "googleLanguageR","test-doc.pdf"), target = "no",output_path = my_out) 9 | 10 | my_pdf1 <-pdftools::pdf_data(my_out) 11 | 12 | my_pdf2 <- pdftools::pdf_data( 13 | system.file(package = "googleLanguageR","test-doc-no.pdf") 14 | ) 15 | 16 | expect_equal(my_pdf1[[1]]$text, my_pdf2[[1]]$text) 17 | }) 18 | -------------------------------------------------------------------------------- /vignettes/setup.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Introduction to googleLanguageR" 3 | author: "Mark Edmondson" 4 | date: "`r Sys.Date()`" 5 | output: rmarkdown::html_vignette 6 | vignette: > 7 | %\VignetteIndexEntry{Introduction to googleLanguageR} 8 | %\VignetteEngine{knitr::rmarkdown} 9 | %\VignetteEncoding{UTF-8} 10 | --- 11 | 12 | `googleLanguageR` contains functions for analysing language through the [Google Cloud Machine Learning APIs](https://cloud.google.com/products/machine-learning/) 13 | 14 | Note all are paid services, you will need to provide your credit card details for your own Google Project to use them. 
15 | 16 | The package can be used by any user who is looking to take advantage of Google's massive dataset to train these machine learning models. Some applications include: 17 | 18 | * Translation of speech into another language text, via speech-to-text then translation 19 | * Identification of sentiment within text, such as from Twitter feeds 20 | * Pulling out the objects of a sentence, to help classify texts and get metadata links from Wikipedia about them. 21 | 22 | The applications of the API results could be relevant to business or researchers looking to scale text analysis. 23 | 24 | ## Google Natural Language API 25 | 26 | > Google Natural Language API reveals the structure and meaning of text by offering powerful machine learning models in an easy to use REST API. You can use it to extract information about people, places, events and much more, mentioned in text documents, news articles or blog posts. You can also use it to understand sentiment about your product on social media or parse intent from customer conversations happening in a call center or a messaging app. 27 | 28 | Read more [on the Google Natural Language API](https://cloud.google.com/natural-language/) 29 | 30 | ## Google Cloud Translation API 31 | 32 | > Google Cloud Translation API provides a simple programmatic interface for translating an arbitrary string into any supported language. Translation API is highly responsive, so websites and applications can integrate with Translation API for fast, dynamic translation of source text from the source language to a target language (e.g. French to English). 33 | 34 | Read more [on the Google Cloud Translation Website](https://cloud.google.com/translate/) 35 | 36 | ## Google Cloud Speech API 37 | 38 | > Google Cloud Speech API enables you to convert audio to text by applying neural network models in an easy to use API. The API recognizes over 80 languages and variants, to support your global user base. You can transcribe the text of users dictating to an application’s microphone or enable command-and-control through voice among many other use cases. 39 | 40 | Read more [on the Google Cloud Speech Website](https://cloud.google.com/speech/) 41 | 42 | ## Installation 43 | 44 | 1. Create a [Google API Console Project](https://cloud.google.com/resource-manager/docs/creating-managing-projects) 45 | 2. Within your project, add a [payment method to the project](https://support.google.com/cloud/answer/6293589) 46 | 3. Within your project, check the relevant APIs are activated 47 | - [Google Natural Language API](https://console.cloud.google.com/apis/api/language.googleapis.com/overview) 48 | - [Google Cloud Translation API](https://console.cloud.google.com/apis/api/translate.googleapis.com/overview) 49 | - [Google Cloud Speech API](https://console.cloud.google.com/apis/api/speech.googleapis.com/overview) 50 | 4. [Generate a service account credential](https://cloud.google.com/storage/docs/authentication#generating-a-private-key) as a JSON file 51 | 5. Return to R, and install this library via `devtools::install_github("MarkEdmondson1234/googleLanguageR")` 52 | 53 | ## Usage 54 | 55 | ### Authentication 56 | 57 | The best way to authenticate is to use an environment file. See `?Startup`. I usually place this in my home directory. (e.g. 
if using RStudio, click on `Home` in the file explorer, create a new `TEXT` file and call it `.Renviron`) 58 | 59 | Set the file location of your downloaded Google Project JSON file in a `GL_AUTH` argument: 60 | 61 | ``` 62 | #.Renviron 63 | GL_AUTH=location_of_json_file.json 64 | ``` 65 | 66 | Then, when you load the library you should auto-authenticate: 67 | 68 | ```r 69 | library(googleLanguageR) 70 | # Setting scopes to https://www.googleapis.com/auth/cloud-platform 71 | # Set any additional scopes via options(googleAuthR.scopes.selected = c('scope1', 'scope2')) before loading library. 72 | # Successfully authenticated via location_of_json_file.json 73 | ``` 74 | 75 | You can also authenticate directly using the `gl_auth` function pointing at your JSON auth file: 76 | 77 | ```r 78 | library(googleLanguageR) 79 | gl_auth("location_of_json_file.json") 80 | ``` 81 | 82 | You can then call the APIs via the functions: 83 | 84 | * `gl_nlp()` - Natural Language API 85 | * `gl_speech()` - Cloud Speech API 86 | * `gl_translate()` - Cloud Translation API 87 | 88 | -------------------------------------------------------------------------------- /vignettes/speech.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Google Cloud Speech-to-Text API" 3 | author: "Mark Edmondson" 4 | date: "`r Sys.Date()`" 5 | output: rmarkdown::html_vignette 6 | vignette: > 7 | %\VignetteIndexEntry{Google Cloud Speech-to-Text API} 8 | %\VignetteEngine{knitr::rmarkdown} 9 | %\VignetteEncoding{UTF-8} 10 | --- 11 | 12 | The Google Cloud Speech-to-Text API enables you to convert audio to text by applying neural network models in an easy to use API. The API recognizes over 80 languages and variants, to support your global user base. You can transcribe the text of users dictating to an application’s microphone or enable command-and-control through voice among many other use cases. 13 | 14 | Read more [on the Google Cloud Speech-to-Text Website](https://cloud.google.com/speech/) 15 | 16 | The Cloud Speech API provides audio transcription. It's accessible via the `gl_speech` function. 17 | 18 | Arguments include: 19 | 20 | * `audio_source` - this is a local file in the correct format, or a Google Cloud Storage URI. This can also be a `Wave` class object from the package `tuneR` 21 | * `encoding` - the format of the sound file - `LINEAR16` is the common `.wav` format, other formats include `FLAC` and `OGG_OPUS` 22 | * `sampleRateHertz` - this needs to be set to the rate your file is recorded at. 23 | * `languageCode` - specify the language spoken as a [`BCP-47` language tag](https://tools.ietf.org/html/bcp47) 24 | * `speechContexts` - you can supply keywords to help the transcription with some context. 25 | 26 | ### Returned structure 27 | 28 | The API returns a list of two data.frame tibbles - `transcript` and `timings`. 29 | 30 | Access them via `$transcript` and `$timings` on the returned object. 31 | 32 | ```r 33 | return <- gl_speech(test_audio, languageCode = "en-GB") 34 | 35 | return$transcript 36 | # A tibble: 1 x 2 37 | # transcript confidence 38 | # 39 | #1 to administer medicine to animals is frequently a very difficult matter and yet sometimes it's necessary to do so 0.9711006 40 | 41 | return$timings 42 | # startTime endTime word 43 | #1 0s 0.100s to 44 | #2 0.100s 0.700s administer 45 | #3 0.700s 0.700s medicine 46 | #4 0.700s 1.200s to 47 | # etc... 
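#
# (Editor's sketch, not part of the original vignette output): the timing
# columns are character strings such as "0.100s", so to work with them
# numerically you could strip the trailing "s" and convert, e.g.
#   as.numeric(sub("s$", "", return$timings$startTime))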
48 | ``` 49 | 50 | ### Demo for Google Cloud Speech-to-Text API 51 | 52 | 53 | A test audio file is installed with the package which reads: 54 | 55 | > "To administer medicine to animals is frequently a very difficult matter, and yet sometimes it's necessary to do so" 56 | 57 | The file is sourced from the University of Southampton's speech detection (`http://www-mobile.ecs.soton.ac.uk/`) group and is fairly difficult for computers to parse, as we see below: 58 | 59 | ```r 60 | library(googleLanguageR) 61 | ## get the sample source file 62 | test_audio <- system.file("woman1_wb.wav", package = "googleLanguageR") 63 | 64 | ## it's not perfect but...:) 65 | gl_speech(test_audio)$transcript 66 | 67 | ## get alternative transcriptions 68 | gl_speech(test_audio, maxAlternatives = 2L)$transcript 69 | 70 | gl_speech(test_audio, languageCode = "en-GB")$transcript 71 | 72 | ## help it out with context for "frequently" 73 | gl_speech(test_audio, 74 | languageCode = "en-GB", 75 | speechContexts = list(phrases = list("is frequently a very difficult")))$transcript 76 | ``` 77 | 78 | ### Word transcripts 79 | 80 | The API [supports timestamps](https://cloud.google.com/speech/reference/rest/v1/speech/recognize#WordInfo) on when words are recognised. These are output into a second data.frame that holds three entries: `startTime`, `endTime` and the `word`. 81 | 82 | 83 | ```r 84 | str(result$timings) 85 | #'data.frame': 152 obs. of 3 variables: 86 | # $ startTime: chr "0s" "0.100s" "0.500s" "0.700s" ... 87 | # $ endTime : chr "0.100s" "0.500s" "0.700s" "0.900s" ... 88 | # $ word : chr "a" "Dream" "Within" "A" ... 89 | 90 | result$timings 91 | # startTime endTime word 92 | #1 0s 0.100s a 93 | #2 0.100s 0.500s Dream 94 | #3 0.500s 0.700s Within 95 | #4 0.700s 0.900s A 96 | #5 0.900s 1s Dream 97 | ``` 98 | 99 | ## Custom configurations 100 | 101 | You can also send in other arguments which can help shape the output, such as speaker diarization (labelling different speakers) - to use such custom configurations, create a [`RecognitionConfig`](https://cloud.google.com/speech-to-text/docs/reference/rest/v1p1beta1/RecognitionConfig) object. This can be done via R lists, which are converted to JSON via `library(jsonlite)`; an example is shown below: 102 | 103 | ```r 104 | ## Use a custom configuration 105 | my_config <- list(encoding = "LINEAR16", 106 | diarizationConfig = list( 107 | enableSpeakerDiarization = TRUE, 108 | minSpeakerCount = 2, 109 | maxSpeakerCount = 3 110 | )) 111 | 112 | # languageCode is required, so will be added if not in your custom config 113 | gl_speech(my_audio, languageCode = "en-US", customConfig = my_config) 114 | ``` 115 | 116 | ## Asynchronous calls 117 | 118 | For speech files greater than 60 seconds, or if you don't want your results straight away, set `asynch = TRUE` in the call to the API. 119 | 120 | This will return an object of class `"gl_speech_op"` which should be used within the `gl_speech_op()` function to check the status of the task. If the task is finished, it will return an object of the same form as the non-asynchronous case. 
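For recordings longer than about a minute the audio generally has to be read from a Google Cloud Storage bucket rather than sent inline. Below is a minimal sketch of that workflow, assuming the separate `googleCloudStorageR` package is authenticated and that the bucket and file names (made up here for illustration) are yours:

```r
library(googleCloudStorageR) # assumed to be authenticated separately
library(googleLanguageR)

# upload a long local recording to a bucket you own (hypothetical names)
gcs_upload("long-recording.wav",
           bucket = "my-speech-bucket",
           name = "long-recording.wav")

# pass the gs:// URI to gl_speech and collect the result once it is ready
async <- gl_speech("gs://my-speech-bucket/long-recording.wav",
                   languageCode = "en-GB",
                   asynch = TRUE)
result <- gl_speech_op(async)
```

The bundled test file is short enough to send directly, as in the simpler example below: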
121 | 122 | ```r 123 | async <- gl_speech(test_audio, asynch = TRUE) 124 | async 125 | ## Send to gl_speech_op() for status 126 | ## 4625920921526393240 127 | 128 | result <- gl_speech_op(async) 129 | ``` 130 | 131 | -------------------------------------------------------------------------------- /vignettes/text-to-speech.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Google Cloud Text-to-Speech API" 3 | author: "Mark Edmondson" 4 | date: "`r Sys.Date()`" 5 | output: rmarkdown::html_vignette 6 | vignette: > 7 | %\VignetteIndexEntry{Google Cloud Text-to-Speech API} 8 | %\VignetteEngine{knitr::rmarkdown} 9 | %\VignetteEncoding{UTF-8} 10 | --- 11 | 12 | Google Cloud Text-to-Speech enables developers to synthesize natural-sounding speech with 30 voices, available in multiple languages and variants. It applies DeepMind’s groundbreaking research in WaveNet and Google’s powerful neural networks to deliver the highest fidelity possible. With this easy-to-use API, you can create lifelike interactions with your users, across many applications and devices. 13 | 14 | Read more [on the Google Cloud Text-to-Speech Website](https://cloud.google.com/text-to-speech/) 15 | 16 | The Cloud Text-to-Speech API turns text into sound files of the spoken words. It's accessible via the `gl_talk` function. 17 | 18 | Arguments include: 19 | 20 | * `input` - The text to turn into speech 21 | * `output` Where to save the speech audio file 22 | * `languageCode` The language of the voice as a [`BCP-47` language tag](https://tools.ietf.org/html/bcp47) 23 | * `name` Name of the voice, see list via `gl_talk_languages()` or [online](https://cloud.google.com/text-to-speech/docs/voices) for supported voices. If not set, then the service will choose a voice based on `languageCode` and `gender`. 24 | * `gender` The gender of the voice, if available 25 | * `audioEncoding` Format of the requested audio stream - can be a choice of `.wav`, `.mp3` or `.ogg` 26 | * `speakingRate` Speaking rate/speed 27 | * `pitch` Speaking pitch 28 | * `volumeGainDb` Volume gain in dB 29 | * `sampleRateHertz` Sample rate for returned audio 30 | 31 | ## Returned structure 32 | 33 | The API returns an audio file which is saved to the location specified in `output` - by default this is `output.wav` - if you don't rename this file it will be overwritten by the next API call. 34 | 35 | It is advised to set the appropriate file extension if you change the audio encoding (e.g. to one of `.wav`, `.mp3` or `.ogg`) so audio players recognise the file format. 36 | 37 | ## Talk Languages 38 | 39 | The API can talk several different languages, with more being added over time. You can get a current list via the function `gl_talk_languages()` or [online](https://cloud.google.com/text-to-speech/docs/voices) 40 | 41 | ```r 42 | gl_talk_languages() 43 | # A tibble: 32 x 4 44 | languageCodes name ssmlGender naturalSampleRateHertz 45 | 46 | 1 es-ES es-ES-Standard-A FEMALE 24000 47 | 2 ja-JP ja-JP-Standard-A FEMALE 22050 48 | 3 pt-BR pt-BR-Standard-A FEMALE 24000 49 | 4 tr-TR tr-TR-Standard-A FEMALE 22050 50 | 5 sv-SE sv-SE-Standard-A FEMALE 22050 51 | 6 nl-NL nl-NL-Standard-A FEMALE 24000 52 | 7 en-US en-US-Wavenet-A MALE 24000 53 | 8 en-US en-US-Wavenet-B MALE 24000 54 | 9 en-US en-US-Wavenet-C FEMALE 24000 55 | 10 en-US en-US-Wavenet-D MALE 24000 56 | ``` 57 | 58 | If you are looking for a specific language, specify that in the function call e.g. 
to see only Spanish (`es`) 59 | voices issue: 60 | 61 | ```r 62 | gl_talk_languages(languageCode = "es") 63 | # A tibble: 1 x 4 64 | languageCodes name ssmlGender naturalSampleRateHertz 65 | 66 | 1 es-ES es-ES-Standard-A FEMALE 24000 67 | ``` 68 | 69 | You can then specify that voice when calling the API via the `name` argument, which overrides the `gender` and `languageCode` argument: 70 | 71 | ```r 72 | gl_talk("Hasta la vista", name = "es-ES-Standard-A") 73 | ``` 74 | 75 | Otherwise, specify your own `gender` and `languageCode` and the voice will be picked for you: 76 | 77 | ```r 78 | gl_talk("Would you like a cup of tea?", gender = "FEMALE", languageCode = "en-GB") 79 | ``` 80 | 81 | Some languages are not yet supported, such as Danish. The API will return an error in those cases. 82 | 83 | ## Support for SSML 84 | 85 | Support is also included for Speech Synthesis Markup Language (SSML) - more details on using this to insert pauses, sounds and breaks in your audio can be found here: `https://cloud.google.com/text-to-speech/docs/ssml` 86 | 87 | To use, send in your SSML markup around the text you want to talk and set `inputType= "ssml"`: 88 | 89 | ```r 90 | # using SSML 91 | gl_talk('The SSML 92 | standard is defined by the 93 | W3C.', 94 | inputType = "ssml") 95 | ``` 96 | 97 | ## Effect Profiles 98 | 99 | You can output audio files that are optimised for playing on various devices. 100 | 101 | To use audio profiles, supply a character vector of the available audio profiles listed here: `https://cloud.google.com/text-to-speech/docs/audio-profiles` - the audio profiles are applied in the order given. 102 | 103 | For instance `effectsProfileIds="wearable-class-device"` will optimise output for smart watches, `effectsProfileIds=c("wearable-class-device","telephony-class-application")` will apply sound filters optimised for smart watches, then telephonic devices. 104 | 105 | ```r 106 | # using effects profiles 107 | gl_talk("This sounds great on headphones", 108 | effectsProfileIds = "headphone-class-device") 109 | ``` 110 | 111 | ## Browser Speech player 112 | 113 | Creating and clicking on the audio file to play it can be a bit of a drag, so you also have a function that will play the audio file for you, launching via the browser. This can be piped via the tidyverse's `%>%` 114 | 115 | ```r 116 | library(magrittr) 117 | gl_talk("This is my audio player") %>% gl_talk_player() 118 | 119 | ## non-piped equivalent 120 | gl_talk_player(gl_talk("This is my audio player")) 121 | ``` 122 | 123 | The `gl_talk_player()` creates a HTML file called `player.html` in your working directory by default. 124 | 125 | ### Using with Shiny 126 | 127 | You can do this in Shiny too, which is demonstrated in the [example Shiny app](https://github.com/ropensci/googleLanguageR/tree/master/inst/shiny/capture_speech) included with the package. 128 | 129 | Click the link for a video tutorial on how to [integrate text-to-speech into a Shiny app](https://www.youtube.com/watch?v=Ny0e7vHFu6o&t=8s) - the demo uses text-to-speech to [talk through a user's Google Analytics statistics](https://github.com/MarkEdmondson1234/verbal_ga_shiny). 
130 | 131 | 132 | 133 | A Shiny module has been created to help integrate text-to-speech into your Shiny apps, demonstrated in the video above and below: 134 | 135 | ```r 136 | library(shiny) 137 | library(googleLanguageR) # assume auto auth setup 138 | 139 | ui <- fluidPage( 140 | gl_talk_shinyUI("talk") 141 | ) 142 | 143 | server <- function(input, output, session){ 144 | 145 | transcript <- reactive({ 146 | paste("This is a demo talking Shiny app!") 147 | }) 148 | 149 | callModule(gl_talk_shiny, "talk", transcript = transcript) 150 | } 151 | 152 | 153 | shinyApp(ui = ui, server = server) 154 | ``` 155 | 156 | 157 | -------------------------------------------------------------------------------- /vignettes/translation.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Google Cloud Translation API" 3 | author: "Mark Edmondson" 4 | date: "`r Sys.Date()`" 5 | output: rmarkdown::html_vignette 6 | vignette: > 7 | %\VignetteIndexEntry{Google Cloud Translation API} 8 | %\VignetteEngine{knitr::rmarkdown} 9 | %\VignetteEncoding{UTF-8} 10 | --- 11 | 12 | The Google Cloud Translation API provides a simple programmatic interface for translating an arbitrary string into any supported language. Translation API is highly responsive, so websites and applications can integrate with Translation API for fast, dynamic translation of source text from the source language to a target language (e.g. French to English). 13 | 14 | Read more [on the Google Cloud Translation Website](https://cloud.google.com/translate/) 15 | 16 | You can detect the language via `gl_translate_detect`, or translate and detect language via `gl_translate` 17 | 18 | ### Language Translation 19 | 20 | Translate text via `gl_translate`. Note this is a lot more refined than the free version on Google's translation website. 21 | 22 | ```r 23 | library(googleLanguageR) 24 | 25 | text <- "to administer medicine to animals is frequently a very difficult matter, and yet sometimes it's necessary to do so" 26 | ## translate British into Danish 27 | gl_translate(text, target = "da")$translatedText 28 | ``` 29 | 30 | You can choose the target language via the argument `target`. The function will automatically detect the source language if you do not define an argument `source`, and the detected language is returned alongside the translation. As it costs the same as `gl_translate_detect`, it's usually cheaper to detect and translate in one step. 31 | 32 | You can pass a vector of text, which will first be attempted in one API call - if that fails due to exceeding the API limits, it will be attempted again, vectorising over the API calls. This results in more calls and is slower, but costs the same, as you are charged per character translated, not per API call. 33 | 34 | #### HTML support 35 | 36 | You can also supply web HTML and select `format='html'`, which will handle HTML tags to give you a cleaner translation. 
37 | 38 | Consider first removing anything that does not need to be translated, such as JavaScript and CSS, using the tools of `rvest` - an example is shown below: 39 | 40 | ```r 41 | # translate webpages 42 | library(rvest) 43 | library(googleLanguageR) 44 | 45 | my_url <- "http://www.dr.dk/nyheder/indland/greenpeace-facebook-og-google-boer-foelge-apples-groenne-planer" 46 | 47 | ## in this case the content to translate is in the css selector .wcms-article-content 48 | read_html(my_url) %>% # read html 49 | html_node(css = ".wcms-article-content") %>% # select article content 50 | html_text %>% # extract text 51 | gl_translate(format = "html") %>% # translate with html flag 52 | dplyr::select(translatedText) # show translatedText column of output tibble 53 | 54 | ``` 55 | 56 | 57 | ### Language Detection 58 | 59 | This function only detects the language: 60 | 61 | ```r 62 | ## which language is this? 63 | gl_translate_detect("katten sidder på måtten") 64 | ``` 65 | 66 | The more text it has, the better. And it helps if it's not Danish... 67 | 68 | It may be better to use [`cld2`](https://github.com/ropensci/cld2) to detect the language offline first, to avoid charges if the translation is unnecessary (e.g. the text is already in English). You could then verify online for more uncertain cases. 69 | 70 | ```r 71 | cld2::detect_language("katten sidder på måtten") 72 | ``` 73 | 74 | ### Translation API limits 75 | 76 | The API is limited in three ways: characters per day, characters per 100 seconds, and API requests per 100 seconds. All can be set in the API manager in the Google Cloud console: `https://console.developers.google.com/apis/api/translate.googleapis.com/quotas` 77 | 78 | The library will limit the API calls to keep within the characters and API requests per 100 seconds quotas. It will automatically retry if you are making requests too quickly, and also pause to make sure you only send `100000` characters per 100 seconds. 79 | --------------------------------------------------------------------------------