├── .gitignore
├── Face Detection
    ├── Facial Recognition in R.Rmd
    ├── Facial Recognition in R._pub.html
    ├── Facial Recognition in R.md
    ├── Facial Recognition.Rmd
    ├── Facial_Recognition_in_R.html
    ├── LICENSE
    ├── ML-Image-Processing-R.Rproj
    ├── facialRecognition.py
    ├── haarcascade_eye.xml
    ├── haarcascade_frontalface_default.xml
    ├── imageFunctions.R
    ├── main.R
    ├── modifiedWebcamShot.png
    └── originalWebcamShot.png
├── Google Vision API
    ├── Google Vision API in R.Rmd
    ├── Google Vision API in R._pub.html
    ├── Google Vision API in R.md
    ├── Google Vision API.Rproj
    ├── Google_Vision_API_in_R.html
    ├── dog_mountain.jpg
    ├── figure
    │   ├── unnamed-chunk-10-1.png
    │   ├── unnamed-chunk-11-1.png
    │   ├── unnamed-chunk-14-1.png
    │   ├── unnamed-chunk-15-1.png
    │   ├── unnamed-chunk-17-1.png
    │   ├── unnamed-chunk-19-1.png
    │   ├── unnamed-chunk-199-1.png
    │   ├── unnamed-chunk-2-1.png
    │   ├── unnamed-chunk-4-1.png
    │   ├── unnamed-chunk-6-1.png
    │   └── unnamed-chunk-8-1.png
    ├── my_face.jpg
    ├── originalWebcamShot.jpg
    ├── snacks_logos.JPG
    ├── us_castle.jpg
    ├── us_castle_2.jpg
    ├── us_dog_mountain.jpg
    ├── us_hats.jpg
    └── wrigley_text.jpg
├── Microsoft Vision API
    ├── Microsoft Vision API.Rproj
    ├── R - Microsoft Vision API.Rmd
    ├── R_-_Microsoft_Vision_API.html
    ├── SnoozeGenius.jpg
    ├── df.rds
    └── sandbox.R
└── README.md


/.gitignore:
--------------------------------------------------------------------------------
 1 | # History files
 2 | .Rhistory
 3 | .Rapp.history
 4 | 
 5 | # Session Data files
 6 | .RData
 7 | 
 8 | # Example code in package build process
 9 | *-Ex.R
10 | 
11 | # Output files from R CMD build
12 | /*.tar.gz
13 | 
14 | # Output files from R CMD check
15 | /*.Rcheck/
16 | 
17 | # RStudio files
18 | .Rproj.user/
19 | 
20 | # produced vignettes
21 | vignettes/*.html
22 | vignettes/*.pdf
23 | 
24 | # OAuth2 token, see https://github.com/hadley/httr/releases/tag/v0.3
25 | .httr-oauth
26 | 
27 | # knitr and R markdown default cache directories
28 | /*_cache/
29 | /cache/
30 | 
31 | # Temporary files created by R markdown
32 | *.utf8.md
33 | *.knit.md
34 | .Rproj.user
35 | 
36 | # Files with credentials
37 | *.csv
38 | *.json
39 | *.httr-oauth
40 | *.txt


--------------------------------------------------------------------------------
/Face Detection/Facial Recognition in R.Rmd:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: "Facial Recognition in R"
  3 | author: "Scott Stoltzman"
  4 | date: "6/22/2017"
  5 | output: html_document
  6 | ---
  7 | 
  8 | ### Facial Recognition in R
  9 | 
 10 | ![Original](originalWebcamShot.png) ![FaceDetection](modifiedWebcamShot.png)
 11 | 
 12 | OpenCV is an incredibly powerful tool to have in your toolbox. I have had a lot of success using it in Python but very little success in R. I haven't done too much other than searching Google but it seems as if "imager" and "videoplayR" provide a lot of the functionality but not all of it.  
 13 | 
 14 | I have never actually called Python functions from R before. Initially, I tried the "rPython" library - that has a lot of advantages, but was completely unecessary for me so system() worked absolutly fine. While this example is extremely simple, it should help to illustrate how easy it is to utilize the power of Python from within R.  
 15 | 
 16 | Using videoplayR I created a function which would take a picture with my webcam and save it as "originalWebcamShot.png"  
 17 | 
 18 | **Note:** saving images and then loading them isn't very efficient but works in this case and is extremely easy to implement. It saves us from passing variables, functions, objects, and/or methods between R and Python in this case.    
 19 | 
 20 | I'll trace my steps backward through this post (I think it's easier to understand what's going on in this case).  
 21 | 
 22 | #### The main.R file:  
 23 | 
 24 |   1. Calls my user-defined function 
 25 |     * Turns on the camera
 26 |     * Takes a picture
 27 |     * Saves it as "originalWebcamShot.png"
 28 |   2. Runs the Python script 
 29 |     * Loads the previously saved image
 30 |     * Loads the Haar Cascade algorithms
 31 |     * Detects faces and eyes
 32 |     * Draws colored rectangles around them
 33 |     * Saves the new image as "modifiedWebcamShot.png"
 34 |   3. Reads new image into R
 35 |   4. Displays both images  
 36 | 
 37 | 
 38 | ```{r mainCode,warning=FALSE,message=FALSE,eval=FALSE}
 39 | source('imageFunctions.R')
 40 | library("videoplayR")
 41 | 
 42 | # Take a picture and save it
 43 | img = webcamImage(rollFrames = 10, 
 44 |                   showImage = FALSE,
 45 |                   saveImageToWD = 'originalWebcamShot.png')
 46 | 
 47 | # Run Python script to detect faces, draw rectangles, return new image
 48 | system('python3 facialRecognition.py')
 49 | 
 50 | # Read in new image
 51 | img.face = readImg("modifiedWebcamShot.png")
 52 | 
 53 | # Display images
 54 | imshow(img)
 55 | imshow(img.face)
 56 | ```
 57 |   
 58 | 
 59 | The user-defined function:  
 60 | 
 61 |   1. Function inputs  
 62 |     * rollFrames is the number of pictures to take (allows the camera to adjust)
 63 |     * showImage gives the option to display the image
 64 |     * saveImageToWD saves the image generated to the current working directory
 65 |   2. Turns the webcam on
 66 |   3. Takes pictures (number of rollFrames)
 67 |   4. Uses basic logic to determine to show images and/or save them
 68 |   5. Returns the image
 69 | 
 70 | 
 71 | 
 72 | ```{r imageFunctions, eval=FALSE}
 73 | library("videoplayR")
 74 | 
 75 | webcamImage = function(rollFrames = 4, showImage = FALSE, saveImageToWD = NA){
 76 |   
 77 |   # rollFrames runs through multiple pictures - allows camera to adjust
 78 |   # showImage allows opportunity to display image within function
 79 |   
 80 |   # Turn on webcam
 81 |   stream = readStream(0)
 82 |   
 83 |   # Take pictures
 84 |   print("Video stream initiated.")
 85 |   for(i in seq(rollFrames)){
 86 |     img = nextFrame(stream)
 87 |   }
 88 |   
 89 |   # Turn off camera
 90 |   release(stream)
 91 |   
 92 |   # Display image if requested
 93 |   if(showImage == TRUE){
 94 |     imshow(img)
 95 |   }
 96 |   
 97 |   if(!is.na(saveImageToWD)){
 98 |     fileName = paste(getwd(),"/",saveImageToWD,sep='')
 99 |     print(paste("Saving Image To: ",fileName, sep=''))
100 |     writeImg(fileName,img)
101 |   }
102 |   
103 |   return(img)
104 |   
105 | }
106 | ```
107 | 
108 | 
109 | The Python script:
110 | 
111 |   1. Loads the algorithms from xml files
112 |   2. Loads the image from "originalWebcamShot.png"
113 |   3. Converts the image to grayscale
114 |   4. Runs the facial detection algorithm
115 |   5. Runs the eye detection algorithm (within the face)
116 |   6. Draws rectangles around the face and eyes (different colors)
117 |   7. Saves the new image as "modifiedWebcamShot.png"
118 | 
119 | 
120 | ```{python PythonScript, eval=FALSE}
121 | import numpy as np
122 | import cv2
123 | 
124 | def main():
125 | 
126 |   # I followed Harrison Kingsley's work for this
127 |   # Much of the source code is found https://pythonprogramming.net/haar-cascade-face-eye-detection-python-opencv-tutorial/
128 | 
129 | 	face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
130 | 	eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')
131 | 
132 | 	img = cv2.imread('originalWebcamShot.png')
133 | 
134 | 	gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
135 | 	faces = face_cascade.detectMultiScale(gray, 1.3, 5)
136 | 
137 | 	for (x,y,w,h) in faces:
138 | 	    cv2.rectangle(img,(x,y),(x+w,y+h),(0,0,255),2)
139 | 	    roi_gray = gray[y:y+h, x:x+w]
140 | 	    roi_color = img[y:y+h, x:x+w]
141 | 	    
142 | 	    eyes = eye_cascade.detectMultiScale(roi_gray)
143 | 	    for (ex,ey,ew,eh) in eyes:
144 | 	        cv2.rectangle(roi_color,(ex,ey),(ex+ew,ey+eh),(0,255,0),2)
145 | 
146 | 	cv2.imwrite('modifiedWebcamShot.png',img)
147 | 
148 | if __name__ == '__main__':
149 | 	main()
150 | ```
151 | 
152 | The Python code was entirely based off of Harrison Kingsley's work:  
153 | 
154 |   * @sentdex [Twitter](https://twitter.com/Sentdex) | [YouTube](https://www.youtube.com/sentdex)
155 |   * Website: [PythonProgramming.net](https://pythonprogramming.net/haar-cascade-face-eye-detection-python-opencv-tutorial/)
156 | 


--------------------------------------------------------------------------------
/Face Detection/Facial Recognition in R.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: "Facial Recognition in R"
  3 | author: "Scott Stoltzman"
  4 | date: "6/22/2017"
  5 | output: html_document
  6 | ---
  7 | 
  8 | ### Facial Recognition in R
  9 | 
 10 | ![Original](originalWebcamShot.png) ![FaceDetection](modifiedWebcamShot.png)
 11 | 
 12 | OpenCV is an incredibly powerful tool to have in your toolbox. I have had a lot of success using it in Python but very little success in R. I haven't done too much other than searching Google but it seems as if "imager" and "videoplayR" provide a lot of the functionality but not all of it.  
 13 | 
 14 | I have never actually called Python functions from R before. Initially, I tried the "rPython" library - that has a lot of advantages, but was completely unecessary for me so system() worked absolutly fine. While this example is extremely simple, it should help to illustrate how easy it is to utilize the power of Python from within R.  
 15 | 
 16 | Using videoplayR I created a function which would take a picture with my webcam and save it as "originalWebcamShot.png"  
 17 | 
 18 | **Note:** saving images and then loading them isn't very efficient but works in this case and is extremely easy to implement. It saves us from passing variables, functions, objects, and/or methods between R and Python in this case.    
 19 | 
 20 | I'll trace my steps backward through this post (I think it's easier to understand what's going on in this case).  
 21 | 
 22 | #### The main.R file:  
 23 | 
 24 |   1. Calls my user-defined function 
 25 |     * Turns on the camera
 26 |     * Takes a picture
 27 |     * Saves it as "originalWebcamShot.png"
 28 |   2. Runs the Python script 
 29 |     * Loads the previously saved image
 30 |     * Loads the Haar Cascade algorithms
 31 |     * Detects faces and eyes
 32 |     * Draws colored rectangles around them
 33 |     * Saves the new image as "modifiedWebcamShot.png"
 34 |   3. Reads new image into R
 35 |   4. Displays both images  
 36 | 
 37 | 
 38 | 
 39 | ```r
 40 | source('imageFunctions.R')
 41 | library("videoplayR")
 42 | 
 43 | # Take a picture and save it
 44 | img = webcamImage(rollFrames = 10, 
 45 |                   showImage = FALSE,
 46 |                   saveImageToWD = 'originalWebcamShot.png')
 47 | 
 48 | # Run Python script to detect faces, draw rectangles, return new image
 49 | system('python3 facialRecognition.py')
 50 | 
 51 | # Read in new image
 52 | img.face = readImg("modifiedWebcamShot.png")
 53 | 
 54 | # Display images
 55 | imshow(img)
 56 | imshow(img.face)
 57 | ```
 58 |   
 59 | 
 60 | The user-defined function:  
 61 | 
 62 |   1. Function inputs  
 63 |     * rollFrames is the number of pictures to take (allows the camera to adjust)
 64 |     * showImage gives the option to display the image
 65 |     * saveImageToWD saves the image generated to the current working directory
 66 |   2. Turns the webcam on
 67 |   3. Takes pictures (number of rollFrames)
 68 |   4. Uses basic logic to determine to show images and/or save them
 69 |   5. Returns the image
 70 | 
 71 | 
 72 | 
 73 | 
 74 | ```r
 75 | library("videoplayR")
 76 | 
 77 | webcamImage = function(rollFrames = 4, showImage = FALSE, saveImageToWD = NA){
 78 |   
 79 |   # rollFrames runs through multiple pictures - allows camera to adjust
 80 |   # showImage allows opportunity to display image within function
 81 |   
 82 |   # Turn on webcam
 83 |   stream = readStream(0)
 84 |   
 85 |   # Take pictures
 86 |   print("Video stream initiated.")
 87 |   for(i in seq(rollFrames)){
 88 |     img = nextFrame(stream)
 89 |   }
 90 |   
 91 |   # Turn off camera
 92 |   release(stream)
 93 |   
 94 |   # Display image if requested
 95 |   if(showImage == TRUE){
 96 |     imshow(img)
 97 |   }
 98 |   
 99 |   if(!is.na(saveImageToWD)){
100 |     fileName = paste(getwd(),"/",saveImageToWD,sep='')
101 |     print(paste("Saving Image To: ",fileName, sep=''))
102 |     writeImg(fileName,img)
103 |   }
104 |   
105 |   return(img)
106 |   
107 | }
108 | ```
109 | 
110 | 
111 | The Python script:
112 | 
113 |   1. Loads the algorithms from xml files
114 |   2. Loads the image from "originalWebcamShot.png"
115 |   3. Converts the image to grayscale
116 |   4. Runs the facial detection algorithm
117 |   5. Runs the eye detection algorithm (within the face)
118 |   6. Draws rectangles around the face and eyes (different colors)
119 |   7. Saves the new image as "modifiedWebcamShot.png"
120 | 
121 | 
122 | 
123 | ```python
124 | import numpy as np
125 | import cv2
126 | 
127 | def main():
128 | 
129 |   # I followed Harrison Kingsley's work for this
130 |   # Much of the source code is found https://pythonprogramming.net/haar-cascade-face-eye-detection-python-opencv-tutorial/
131 | 
132 | 	face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
133 | 	eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')
134 | 
135 | 	img = cv2.imread('originalWebcamShot.png')
136 | 
137 | 	gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
138 | 	faces = face_cascade.detectMultiScale(gray, 1.3, 5)
139 | 
140 | 	for (x,y,w,h) in faces:
141 | 	    cv2.rectangle(img,(x,y),(x+w,y+h),(0,0,255),2)
142 | 	    roi_gray = gray[y:y+h, x:x+w]
143 | 	    roi_color = img[y:y+h, x:x+w]
144 | 	    
145 | 	    eyes = eye_cascade.detectMultiScale(roi_gray)
146 | 	    for (ex,ey,ew,eh) in eyes:
147 | 	        cv2.rectangle(roi_color,(ex,ey),(ex+ew,ey+eh),(0,255,0),2)
148 | 
149 | 	cv2.imwrite('modifiedWebcamShot.png',img)
150 | 
151 | if __name__ == '__main__':
152 | 	main()
153 | ```
154 | 
155 | The Python code was entirely based off of Harrison Kingsley's work:  
156 | 
157 |   * @sentdex [Twitter](https://twitter.com/Sentdex) | [YouTube](https://www.youtube.com/sentdex)
158 |   * Website: [PythonProgramming.net](https://pythonprogramming.net/haar-cascade-face-eye-detection-python-opencv-tutorial/)
159 | 


--------------------------------------------------------------------------------
/Face Detection/Facial Recognition.Rmd:
--------------------------------------------------------------------------------
 1 | ---
 2 | title: "Facial Recognition in R"
 3 | author: "Scott Stoltzman"
 4 | date: "6/22/2017"
 5 | output: html_document
 6 | ---
 7 | 
 8 | ```{r setup, include=FALSE}
 9 | knitr::opts_chunk$set(echo = TRUE)
10 | ```
11 | 
12 | ## R Markdown
13 | 
14 | This is an R Markdown document. Markdown is a simple formatting syntax for authoring HTML, PDF, and MS Word documents. For more details on using R Markdown see <http://rmarkdown.rstudio.com>.
15 | 
16 | When you click the **Knit** button a document will be generated that includes both content as well as the output of any embedded R code chunks within the document. You can embed an R code chunk like this:
17 | 
18 | ```{r cars}
19 | summary(cars)
20 | ```
21 | 
22 | ## Including Plots
23 | 
24 | You can also embed plots, for example:
25 | 
26 | ```{r pressure, echo=FALSE}
27 | plot(pressure)
28 | ```
29 | 
30 | Note that the `echo = FALSE` parameter was added to the code chunk to prevent printing of the R code that generated the plot.
31 | 


--------------------------------------------------------------------------------
/Face Detection/LICENSE:
--------------------------------------------------------------------------------
 1 | MIT License
 2 | 
 3 | Copyright (c) 2017 Scott Stoltzman
 4 | 
 5 | Permission is hereby granted, free of charge, to any person obtaining a copy
 6 | of this software and associated documentation files (the "Software"), to deal
 7 | in the Software without restriction, including without limitation the rights
 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 9 | copies of the Software, and to permit persons to whom the Software is
10 | furnished to do so, subject to the following conditions:
11 | 
12 | The above copyright notice and this permission notice shall be included in all
13 | copies or substantial portions of the Software.
14 | 
15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
21 | SOFTWARE.
22 | 


--------------------------------------------------------------------------------
/Face Detection/ML-Image-Processing-R.Rproj:
--------------------------------------------------------------------------------
 1 | Version: 1.0
 2 | 
 3 | RestoreWorkspace: Default
 4 | SaveWorkspace: Default
 5 | AlwaysSaveHistory: Default
 6 | 
 7 | EnableCodeIndexing: Yes
 8 | UseSpacesForTab: Yes
 9 | NumSpacesForTab: 2
10 | Encoding: UTF-8
11 | 
12 | RnwWeave: Sweave
13 | LaTeX: pdfLaTeX
14 | 


--------------------------------------------------------------------------------
/Face Detection/facialRecognition.py:
--------------------------------------------------------------------------------
 1 | import numpy as np
 2 | import cv2
 3 | 
 4 | def main():
 5 | 	
 6 |   # I followed Harrison Kingsley's work for this
 7 |   # Much of the source code is found https://pythonprogramming.net/haar-cascade-face-eye-detection-python-opencv-tutorial/
 8 | 
 9 | 	face_cascade = cv2.CascadeClassifier('haarcascade_frontalface_default.xml')
10 | 	eye_cascade = cv2.CascadeClassifier('haarcascade_eye.xml')
11 | 
12 | 	img = cv2.imread('originalWebcamShot.png')
13 | 
14 | 	gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
15 | 	faces = face_cascade.detectMultiScale(gray, 1.3, 5)
16 | 
17 | 	for (x,y,w,h) in faces:
18 | 	    cv2.rectangle(img,(x,y),(x+w,y+h),(0,0,255),2)
19 | 	    roi_gray = gray[y:y+h, x:x+w]
20 | 	    roi_color = img[y:y+h, x:x+w]
21 | 	    
22 | 	    eyes = eye_cascade.detectMultiScale(roi_gray)
23 | 	    for (ex,ey,ew,eh) in eyes:
24 | 	        cv2.rectangle(roi_color,(ex,ey),(ex+ew,ey+eh),(0,255,0),2)
25 | 
26 | 	cv2.imwrite('modifiedWebcamShot.png',img)
27 | 
28 | if __name__ == '__main__':
29 | 	main()
30 | 


--------------------------------------------------------------------------------
/Face Detection/imageFunctions.R:
--------------------------------------------------------------------------------
 1 | library("videoplayR")
 2 | 
 3 | webcamImage = function(rollFrames = 4, showImage = FALSE, saveImageToWD = NA){
 4 |   
 5 |   # rollFrames runs through multiple pictures - allows camera to adjust
 6 |   # showImage allows opportunity to display image within function
 7 |   
 8 |   # Turn on webcam
 9 |   stream = readStream(0)
10 |   
11 |   # Take pictures
12 |   print("Video stream initiated.")
13 |   for(i in seq(rollFrames)){
14 |     img = nextFrame(stream)
15 |   }
16 |   
17 |   # Turn off camera
18 |   release(stream)
19 |   
20 |   # Display image if requested
21 |   if(showImage == TRUE){
22 |     imshow(img)
23 |   }
24 |   
25 |   if(!is.na(saveImageToWD)){
26 |     fileName = paste(getwd(),"/",saveImageToWD,sep='')
27 |     print(paste("Saving Image To: ",fileName, sep=''))
28 |     writeImg(fileName,img)
29 |   }
30 |   
31 |   return(img)
32 |   
33 | }


--------------------------------------------------------------------------------
/Face Detection/main.R:
--------------------------------------------------------------------------------
 1 | source('imageFunctions.R')
 2 | library("videoplayR")
 3 | 
 4 | # Take a picture and save it
 5 | img = webcamImage(rollFrames = 10, 
 6 |                   showImage = FALSE,
 7 |                   saveImageToWD = 'originalWebcamShot.png')
 8 | 
 9 | # Run Python script to detect faces, draw rectangles, return new image
10 | system('python3 facialRecognition.py')
11 | 
12 | # Read in new image
13 | img.face = readImg("modifiedWebcamShot.png")
14 | 
15 | # Display images
16 | imshow(img)
17 | imshow(img.face)


--------------------------------------------------------------------------------
/Face Detection/modifiedWebcamShot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Face Detection/modifiedWebcamShot.png


--------------------------------------------------------------------------------
/Face Detection/originalWebcamShot.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Face Detection/originalWebcamShot.png


--------------------------------------------------------------------------------
/Google Vision API/Google Vision API in R.Rmd:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: "Google Vision API in R"
  3 | author: "Scott Stoltzman"
  4 | date: "7/29/2017"
  5 | output: html_document
  6 | ---
  7 | 
  8 | ## Using the Google Vision API in R  
  9 | 
 10 | ### Utilizing RoogleVision  
 11 | 
 12 | After doing my post last month on OpenCV and face detection, I started looking into other algorithms used for pattern detection in images. As it turns out, Google has done a phenomenal job with their Vision API. It's absolutely incredible the amount of information it can spit back to you by simply sending it a picture.  
 13 | 
 14 | Also, it's 100% free! I believe that includes 1000 images per month. Amazing!  
 15 | 
 16 | In this post I'm going to walk you through the absolute basics of accessing the power of the Google Vision API using the RoogleVision package in R.
 17 | 
 18 | As always, we'll start off loading some libraries. I wrote some extra notation around where you can install them within the code.  
 19 | 
 20 | ```{r setup, message=FALSE, warning=FALSE}
 21 | # Normal Libraries
 22 | library(tidyverse)
 23 | 
 24 | # devtools::install_github("flovv/RoogleVision")
 25 | library(RoogleVision)
 26 | library(jsonlite) # to import credentials
 27 | 
 28 | # For image processing
 29 | # source("http://bioconductor.org/biocLite.R")
 30 | # biocLite("EBImage")
 31 | library(EBImage)
 32 | 
 33 | # For Latitude Longitude Map
 34 | library(leaflet)
 35 | ```
 36 | 
 37 | #### Google Authentication  
 38 | 
 39 | In order to use the API, you have to authenticate. There is plenty of documentation out there about how to setup an account, create a project, download credentials, etc. Head over to [Google Cloud Console](https://console.cloud.google.com) if you don't have an account already.
 40 | 
 41 | ```{r}
 42 | # Credentials file I downloaded from the cloud console
 43 | creds = fromJSON('credentials.json')
 44 | 
 45 | # Google Authentication - Use Your Credentials
 46 | # options("googleAuthR.client_id" = "xxx.apps.googleusercontent.com")
 47 | # options("googleAuthR.client_secret" = "")
 48 | 
 49 | options("googleAuthR.client_id" = creds$installed$client_id)
 50 | options("googleAuthR.client_secret" = creds$installed$client_secret)
 51 | options("googleAuthR.scopes.selected" = c("https://www.googleapis.com/auth/cloud-platform"))
 52 | googleAuthR::gar_auth()
 53 | ```
 54 |   
 55 |   
 56 | ### Now You're Ready to Go  
 57 | 
 58 | The function getGoogleVisionResponse takes three arguments:
 59 | 
 60 |   1. imagePath
 61 |   2. feature
 62 |   3. numResults
 63 | 
 64 | Numbers 1 and 3 are self-explanatory, "feature" has 5 options:  
 65 | 
 66 |   * LABEL_DETECTION
 67 |   * LANDMARK_DETECTION
 68 |   * FACE_DETECTION
 69 |   * LOGO_DETECTION
 70 |   * TEXT_DETECTION  
 71 | 
 72 | These are self-explanatory but it's nice to see each one in action. 
 73 | 
 74 | As a side note: there are also other features that the API has which aren't included (yet) in the RoogleVision package such as "Safe Search" which identifies inappropriate content, "Properties" which identifies dominant colors and aspect ratios and a few others can be found at the [Cloud Vision website](https://cloud.google.com/vision/)
 75 | 
 76 | ----  
 77 | 
 78 | #### Label Detection  
 79 | 
 80 | This is used to help determine content within the photo. It can basically add a level of metadata around the image.  
 81 | 
 82 | Here is a photo of our dog when we hiked up to Audubon Peak in Colorado:  
 83 | 
 84 | ```{r echo=FALSE}
 85 | dog_mountain_image <- readImage('dog_mountain.jpg')
 86 | plot(dog_mountain_image)
 87 | ```
 88 | 
 89 | 
 90 | 
 91 | ```{r}
 92 | dog_mountain_label = getGoogleVisionResponse('dog_mountain.jpg',
 93 |                                               feature = 'LABEL_DETECTION')
 94 | head(dog_mountain_label)
 95 | ```
 96 |   
 97 |   
 98 | All 5 responses were incredibly accurate! The "score" that is returned is how confident the Google Vision algorithms are, so there's a 91.9% chance a mountain is prominent in this photo. I like "dog hiking" the best - considering that's what we were doing at the time. Kind of a little bit too accurate...  
 99 | 
100 | ----  
101 | 
102 | #### Landmark Detection   
103 | 
104 | This is a feature designed to specifically pick out a recognizable landmark! It provides the position in the image along with the geolocation of the landmark (in longitude and latitude).  
105 | 
106 | My wife and I took this selfie in at the Linderhof Castle in Bavaria, Germany. 
107 | 
108 | ```{r}
109 | us_castle <- readImage('us_castle_2.jpg')
110 | plot(us_castle)
111 | ```
112 |   
113 | 
114 | The response from the Google Vision API was spot on. It returned "Linderhof Palace" as the description. It also provided a  score (I reduced the resolution of the image which hurt the score), a boundingPoly field and locations.  
115 | 
116 |   * Bounding Poly - gives x,y coordinates for a polygon around the landmark in the image
117 |   * Locations - provides longitude,latitude coordinates  
118 | 
119 | ```{r}
120 | us_landmark = getGoogleVisionResponse('us_castle_2.jpg',
121 |                                       feature = 'LANDMARK_DETECTION')
122 | head(us_landmark)
123 | ```
124 |   
125 | I plotted the polygon over the image using the coordinates returned. It does a great job (certainly not perfect) of getting the castle identified. It's a bit tough to say what the actual "landmark" would be in this case due to the fact the fountains, stairs and grounds are certainly important and are a key part of the castle.
126 | 
127 | ```{r}
128 | us_castle <- readImage('us_castle_2.jpg')
129 | plot(us_castle)
130 | xs = us_landmark$boundingPoly$vertices[[1]][1][[1]]
131 | ys = us_landmark$boundingPoly$vertices[[1]][2][[1]]
132 | polygon(x=xs,y=ys,border='red',lwd=4)
133 | ```
134 |   
135 |   
136 | Turning to the locations - I plotted this using the leaflet library. If you haven't used leaflet, start doing so immediately. I'm a huge fan of it due to speed and simplicity. There are a lot of customization options available as well that you can check out.  
137 | 
138 | The location = spot on! While it isn't a shock to me that Google could provide the location of "Linderhof Castle" - it is amazing to me that I don't have to write a web crawler search function to find it myself! That's just one of many little luxuries they have built into this API.
139 | 
140 | ```{r}
141 | latt = us_landmark$locations[[1]][[1]][[1]]
142 | lon = us_landmark$locations[[1]][[1]][[2]]
143 | m = leaflet() %>%
144 |   addProviderTiles(providers$CartoDB.Positron) %>%
145 |   setView(lng = lon, lat = latt, zoom = 5) %>%
146 |   addMarkers(lng = lon, lat = latt)
147 | m
148 | ```
149 | 
150 | 
151 | ----  
152 | 
153 | #### Face Detection  
154 | 
155 | My last blog post showed the OpenCV package utilizing the haar cascade algorithm in action. I didn't dig into Google's algorithms to figure out what is under the hood, but it provides similar results. However, rather than layering in each subsequent "find the eyes" and "find the mouth" and ...etc... it returns more than you ever needed to know.
156 | 
157 |   * Bounding Poly = highest level polygon
158 |   * FD Bounding Poly = polygon surrounding each face
159 |   * Landmarks = (funny name) includes each feature of the face (left eye, right eye, etc.)
160 |   * Roll Angle, Pan Angle, Tilt Angle = all of the different angles you'd need per face
161 |   * Confidence (detection and landmarking) = how certain the algorithm is that it's accurate
162 |   * Joy, sorrow, anger, surprise, under exposed, blurred, headwear likelihoods = how likely it is that each face contains that emotion or characteristic
163 | 
164 | The likelihoods is another amazing piece of information returned! I have run about 20 images through this API and every single one has been accurate - very impressive!
165 | 
166 | I wanted to showcase the face detection and headwear first. Here's a picture of my wife and I at "The Bean" in Chicago (side note: it's awesome! I thought it was going to be really silly, but you can really have a lot of fun with all of the angles and reflections):
167 | 
168 | ```{r}
169 | us_hats_pic <- readImage('us_hats.jpg')
170 | plot(us_hats_pic)
171 | ```
172 | 
173 | 
174 | ```{r}
175 | us_hats = getGoogleVisionResponse('us_hats.jpg',
176 |                                       feature = 'FACE_DETECTION')
177 | head(us_hats)
178 | ```
179 | 
180 | 
181 | ```{r}
182 | us_hats_pic <- readImage('us_hats.jpg')
183 | plot(us_hats_pic)
184 | 
185 | xs1 = us_hats$fdBoundingPoly$vertices[[1]][1][[1]]
186 | ys1 = us_hats$fdBoundingPoly$vertices[[1]][2][[1]]
187 | 
188 | xs2 = us_hats$fdBoundingPoly$vertices[[2]][1][[1]]
189 | ys2 = us_hats$fdBoundingPoly$vertices[[2]][2][[1]]
190 | 
191 | polygon(x=xs1,y=ys1,border='red',lwd=4)
192 | polygon(x=xs2,y=ys2,border='green',lwd=4)
193 | ```
194 | 
195 | 
196 | Here's a shot that should be familiar (copied directly from my last blog) - and I wanted to highlight the different features that can be detected. Look at how many points are perfectly placed:
197 | 
198 | ```{r}
199 | my_face_pic <- readImage('my_face.jpg')
200 | plot(my_face_pic)
201 | ```
202 | 
203 | 
204 | 
205 | ```{r}
206 | my_face = getGoogleVisionResponse('my_face.jpg',
207 |                                       feature = 'FACE_DETECTION')
208 | head(my_face)
209 | ```
210 | 
211 | 
212 | 
213 | ```{r}
214 | head(my_face$landmarks)
215 | ```
216 | 
217 | 
218 | 
219 | ```{r}
220 | my_face_pic <- readImage('my_face.jpg')
221 | plot(my_face_pic)
222 | 
223 | xs1 = my_face$fdBoundingPoly$vertices[[1]][1][[1]]
224 | ys1 = my_face$fdBoundingPoly$vertices[[1]][2][[1]]
225 | 
226 | xs2 = my_face$landmarks[[1]][[2]][[1]]
227 | ys2 = my_face$landmarks[[1]][[2]][[2]]
228 | 
229 | polygon(x=xs1,y=ys1,border='red',lwd=4)
230 | points(x=xs2,y=ys2,lwd=2, col='lightblue')
231 | ```
232 | 
233 | ----
234 | 
235 | #### Logo Detection
236 |   
237 | To continue along the Chicago trip, we drove by Wrigley field and I took a really bad photo of the sign from a moving car as it was under construction. It's nice because it has a lot of different lines and writing the Toyota logo isn't incredibly prominent or necessarily fit to brand colors.
238 | 
239 | This call returns:  
240 | 
241 |   * Description = Brand name of the logo detected
242 |   * Score = Confidence of prediction accuracy
243 |   * Bounding Poly = (Again) coordinates of the logo
244 | 
245 |   
246 | ```{r}
247 | wrigley_image <- readImage('wrigley_text.jpg')
248 | plot(wrigley_image)
249 | ```
250 | 
251 | 
252 | 
253 | ```{r}
254 | wrigley_logo = getGoogleVisionResponse('wrigley_text.jpg',
255 |                                    feature = 'LOGO_DETECTION')
256 | head(wrigley_logo)
257 | ```
258 | 
259 | 
260 | ```{r}
261 | wrigley_image <- readImage('wrigley_text.jpg')
262 | plot(wrigley_image)
263 | xs = wrigley_logo$boundingPoly$vertices[[1]][[1]]
264 | ys = wrigley_logo$boundingPoly$vertices[[1]][[2]]
265 | polygon(x=xs,y=ys,border='green',lwd=4)
266 | ```
267 | 
268 | ----
269 | 
270 | #### Text Detection
271 | 
272 | I'll continue using the Wrigley Field picture. There is text all over the place and it's fun to see what is captured and what isn't. It appears as if the curved text at the top "field" isn't easily interpreted as text. However, the rest is caught and the words are captured.  
273 | 
274 | The response sent back is a bit more difficult to interpret than the rest of the API calls - it breaks things apart by word but also returns everything as one line. Here's what comes back:  
275 | 
276 |   * Locale = language, returned as source
277 |   * Description = the text (the first line is everything, and then the rest are indiviudal words)
278 |   * Bounding Poly = I'm sure you can guess by now
279 | 
280 | ```{r}
281 | wrigley_text = getGoogleVisionResponse('wrigley_text.jpg',
282 |                                    feature = 'TEXT_DETECTION')
283 | head(wrigley_text)
284 | ```
285 | 
286 | ```{r}
287 | wrigley_image <- readImage('wrigley_text.jpg')
288 | plot(wrigley_image)
289 | 
290 | for(i in 1:length(wrigley_text$boundingPoly$vertices)){
291 |   xs = wrigley_text$boundingPoly$vertices[[i]]$x
292 |   ys = wrigley_text$boundingPoly$vertices[[i]]$y
293 |   polygon(x=xs,y=ys,border='green',lwd=2)
294 | }
295 | ```
296 | 
297 | ----
298 | 
299 | That's about it for the basics of using the Google Vision API with the RoogleVision library. I highly recommend tinkering around with it a bit, especially because it won't cost you a dime.  
300 | 
301 | While I do enjoy the math under the hood and the thinking required to understand alrgorithms, I do think these sorts of API's will become the way of the future for data science. Outside of specific use cases or special industries, it seems hard to imagine wanting to try and create algorithms that would be better than ones created for mass consumption. As long as they're fast, free and accurate, I'm all about making my life easier! From the hiring perspective, I much prefer someone who can get the job done over someone who can slightly improve performance (as always, there are many cases where this doesn't apply).
302 | 
303 | Please comment if you are utilizing any of the Google API's for business purposes, I would love to hear it!
304 | 
305 | As always you can find this on my [GitHub](https://github.com/stoltzmaniac/ML-Image-Processing-R)
306 | 
307 | 
308 | 
309 | 
310 | 


--------------------------------------------------------------------------------
/Google Vision API/Google Vision API in R._pub.html:
--------------------------------------------------------------------------------
  1 | <h2>Using the Google Vision API in R</h2>
  2 | 
  3 | <h3>Utilizing RoogleVision</h3>
  4 | 
  5 | <p>After doing my post last month on OpenCV and face detection, I started looking into other algorithms used for pattern detection in images. As it turns out, Google has done a phenomenal job with their Vision API. It&#39;s absolutely incredible the amount of information it can spit back to you by simply sending it a picture.  </p>
  6 | 
  7 | <p>Also, it&#39;s 100% free! I believe that includes 1000 images per month. Amazing!  </p>
  8 | 
  9 | <p>In this post I&#39;m going to walk you through the absolute basics of accessing the power of the Google Vision API using the RoogleVision package in R.</p>
 10 | 
 11 | <p>As always, we&#39;ll start off loading some libraries. I wrote some extra notation around where you can install them within the code.  </p>
 12 | 
 13 | <pre><code class="r"># Normal Libraries
 14 | library(tidyverse)
 15 | 
 16 | # devtools::install_github(&quot;flovv/RoogleVision&quot;)
 17 | library(RoogleVision)
 18 | library(jsonlite) # to import credentials
 19 | 
 20 | # For image processing
 21 | # source(&quot;http://bioconductor.org/biocLite.R&quot;)
 22 | # biocLite(&quot;EBImage&quot;)
 23 | library(EBImage)
 24 | 
 25 | # For Latitude Longitude Map
 26 | library(leaflet)
 27 | </code></pre>
 28 | 
 29 | <h4>Google Authentication</h4>
 30 | 
 31 | <p>In order to use the API, you have to authenticate. There is plenty of documentation out there about how to setup an account, create a project, download credentials, etc. Head over to <a href="https://console.cloud.google.com">Google Cloud Console</a> if you don&#39;t have an account already.</p>
 32 | 
 33 | <pre><code class="r"># Credentials file I downloaded from the cloud console
 34 | creds = fromJSON(&#39;credentials.json&#39;)
 35 | 
 36 | # Google Authentication - Use Your Credentials
 37 | # options(&quot;googleAuthR.client_id&quot; = &quot;xxx.apps.googleusercontent.com&quot;)
 38 | # options(&quot;googleAuthR.client_secret&quot; = &quot;&quot;)
 39 | 
 40 | options(&quot;googleAuthR.client_id&quot; = creds$installed$client_id)
 41 | options(&quot;googleAuthR.client_secret&quot; = creds$installed$client_secret)
 42 | options(&quot;googleAuthR.scopes.selected&quot; = c(&quot;https://www.googleapis.com/auth/cloud-platform&quot;))
 43 | googleAuthR::gar_auth()
 44 | </code></pre>
 45 | 
 46 | <pre><code>## 2017-07-31 11:30:34&gt; Token cache file: .httr-oauth
 47 | </code></pre>
 48 | 
 49 | <pre><code>## 2017-07-31 11:30:34&gt; Scopes: https://www.googleapis.com/auth/cloud-platform
 50 | </code></pre>
 51 | 
 52 | <h3>Now You&#39;re Ready to Go</h3>
 53 | 
 54 | <p>The function getGoogleVisionResponse takes three arguments:</p>
 55 | 
 56 | <ol>
 57 | <li>imagePath</li>
 58 | <li>feature</li>
 59 | <li>numResults</li>
 60 | </ol>
 61 | 
 62 | <p>Numbers 1 and 3 are self-explanatory, &ldquo;feature&rdquo; has 5 options:  </p>
 63 | 
 64 | <ul>
 65 | <li>LABEL_DETECTION</li>
 66 | <li>LANDMARK_DETECTION</li>
 67 | <li>FACE_DETECTION</li>
 68 | <li>LOGO_DETECTION</li>
 69 | <li>TEXT_DETECTION<br/></li>
 70 | </ul>
 71 | 
 72 | <p>These are self-explanatory but it&#39;s nice to see each one in action. </p>
 73 | 
 74 | <p>As a side note: there are also other features that the API has which aren&#39;t included (yet) in the RoogleVision package such as &ldquo;Safe Search&rdquo; which identifies inappropriate content, &ldquo;Properties&rdquo; which identifies dominant colors and aspect ratios and a few others can be found at the <a href="https://cloud.google.com/vision/">Cloud Vision website</a></p>
 75 | 
 76 | <hr/>
 77 | 
 78 | <h4>Label Detection</h4>
 79 | 
 80 | <p>This is used to help determine content within the photo. It can basically add a level of metadata around the image.  </p>
 81 | 
 82 | <p>Here is a photo of our dog when we hiked up to Audubon Peak in Colorado:  </p>
 83 | 
 84 | <p><img src="https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-2-1.png" alt="plot of chunk unnamed-chunk-2"/></p>
 85 | 
 86 | <pre><code class="r">dog_mountain_label = getGoogleVisionResponse(&#39;dog_mountain.jpg&#39;,
 87 |                                               feature = &#39;LABEL_DETECTION&#39;)
 88 | head(dog_mountain_label)
 89 | </code></pre>
 90 | 
 91 | <pre><code>##            mid           description     score
 92 | ## 1     /m/09d_r              mountain 0.9188690
 93 | ## 2 /g/11jxkqbpp mountainous landforms 0.9009549
 94 | ## 3    /m/023bbt            wilderness 0.8733696
 95 | ## 4     /m/0kpmf             dog breed 0.8398435
 96 | ## 5    /m/0d4djn            dog hiking 0.8352048
 97 | </code></pre>
 98 | 
 99 | <p>All 5 responses were incredibly accurate! The &ldquo;score&rdquo; that is returned is how confident the Google Vision algorithms are, so there&#39;s a 91.9% chance a mountain is prominent in this photo. I like &ldquo;dog hiking&rdquo; the best - considering that&#39;s what we were doing at the time. Kind of a little bit too accurate&hellip;  </p>
100 | 
101 | <hr/>
102 | 
103 | <h4>Landmark Detection</h4>
104 | 
105 | <p>This is a feature designed to specifically pick out a recognizable landmark! It provides the position in the image along with the geolocation of the landmark (in longitude and latitude).  </p>
106 | 
107 | <p>My wife and I took this selfie in at the Linderhof Castle in Bavaria, Germany. </p>
108 | 
109 | <pre><code class="r">us_castle &lt;- readImage(&#39;us_castle_2.jpg&#39;)
110 | plot(us_castle)
111 | </code></pre>
112 | 
113 | <p><img src="https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-4-1.png" alt="plot of chunk unnamed-chunk-4"/></p>
114 | 
115 | <p>The response from the Google Vision API was spot on. It returned &ldquo;Linderhof Palace&rdquo; as the description. It also provided a  score (I reduced the resolution of the image which hurt the score), a boundingPoly field and locations.  </p>
116 | 
117 | <ul>
118 | <li>Bounding Poly - gives x,y coordinates for a polygon around the landmark in the image</li>
119 | <li>Locations - provides longitude,latitude coordinates<br/></li>
120 | </ul>
121 | 
122 | <pre><code class="r">us_landmark = getGoogleVisionResponse(&#39;us_castle_2.jpg&#39;,
123 |                                       feature = &#39;LANDMARK_DETECTION&#39;)
124 | head(us_landmark)
125 | </code></pre>
126 | 
127 | <pre><code>##         mid      description     score
128 | ## 1 /m/066h19 Linderhof Palace 0.4665011
129 | ##                               vertices          locations
130 | ## 1 25, 382, 382, 25, 178, 178, 659, 659 47.57127, 10.96072
131 | </code></pre>
132 | 
133 | <p>I plotted the polygon over the image using the coordinates returned. It does a great job (certainly not perfect) of getting the castle identified. It&#39;s a bit tough to say what the actual &ldquo;landmark&rdquo; would be in this case due to the fact the fountains, stairs and grounds are certainly important and are a key part of the castle.</p>
134 | 
135 | <pre><code class="r">us_castle &lt;- readImage(&#39;us_castle_2.jpg&#39;)
136 | plot(us_castle)
137 | xs = us_landmark$boundingPoly$vertices[[1]][1][[1]]
138 | ys = us_landmark$boundingPoly$vertices[[1]][2][[1]]
139 | polygon(x=xs,y=ys,border=&#39;red&#39;,lwd=4)
140 | </code></pre>
141 | 
142 | <p><img src="https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-6-1.png" alt="plot of chunk unnamed-chunk-6"/></p>
143 | 
144 | <p>Turning to the locations - I plotted this using the leaflet library. If you haven&#39;t used leaflet, start doing so immediately. I&#39;m a huge fan of it due to speed and simplicity. There are a lot of customization options available as well that you can check out.  </p>
145 | 
146 | <p>The location = spot on! While it isn&#39;t a shock to me that Google could provide the location of &ldquo;Linderhof Castle&rdquo; - it is amazing to me that I don&#39;t have to write a web crawler search function to find it myself! That&#39;s just one of many little luxuries they have built into this API.</p>
147 | 
148 | <pre><code class="r">latt = us_landmark$locations[[1]][[1]][[1]]
149 | lon = us_landmark$locations[[1]][[1]][[2]]
150 | m = leaflet() %&gt;%
151 |   addProviderTiles(providers$CartoDB.Positron) %&gt;%
152 |   setView(lng = lon, lat = latt, zoom = 5) %&gt;%
153 |   addMarkers(lng = lon, lat = latt)
154 | m
155 | </code></pre>
156 | 
157 | <pre><code>## Error in loadNamespace(name): there is no package called &#39;webshot&#39;
158 | </code></pre>
159 | 
160 | <hr/>
161 | 
162 | <h4>Face Detection</h4>
163 | 
164 | <p>My last blog post showed the OpenCV package utilizing the haar cascade algorithm in action. I didn&#39;t dig into Google&#39;s algorithms to figure out what is under the hood, but it provides similar results. However, rather than layering in each subsequent &ldquo;find the eyes&rdquo; and &ldquo;find the mouth&rdquo; and &hellip;etc&hellip; it returns more than you ever needed to know.</p>
165 | 
166 | <ul>
167 | <li>Bounding Poly = highest level polygon</li>
168 | <li>FD Bounding Poly = polygon surrounding each face</li>
169 | <li>Landmarks = (funny name) includes each feature of the face (left eye, right eye, etc.)</li>
170 | <li>Roll Angle, Pan Angle, Tilt Angle = all of the different angles you&#39;d need per face</li>
171 | <li>Confidence (detection and landmarking) = how certain the algorithm is that it&#39;s accurate</li>
172 | <li>Joy, sorrow, anger, surprise, under exposed, blurred, headwear likelihoods = how likely it is that each face contains that emotion or characteristic</li>
173 | </ul>
174 | 
175 | <p>The likelihoods is another amazing piece of information returned! I have run about 20 images through this API and every single one has been accurate - very impressive!</p>
176 | 
177 | <p>I wanted to showcase the face detection and headwear first. Here&#39;s a picture of my wife and I at &ldquo;The Bean&rdquo; in Chicago (side note: it&#39;s awesome! I thought it was going to be really silly, but you can really have a lot of fun with all of the angles and reflections):</p>
178 | 
179 | <pre><code class="r">us_hats_pic &lt;- readImage(&#39;us_hats.jpg&#39;)
180 | plot(us_hats_pic)
181 | </code></pre>
182 | 
183 | <p><img src="https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-8-1.png" alt="plot of chunk unnamed-chunk-8"/></p>
184 | 
185 | <pre><code class="r">us_hats = getGoogleVisionResponse(&#39;us_hats.jpg&#39;,
186 |                                       feature = &#39;FACE_DETECTION&#39;)
187 | head(us_hats)
188 | </code></pre>
189 | 
190 | <pre><code>##                                 vertices
191 | ## 1 295, 410, 410, 295, 164, 164, 297, 297
192 | ## 2 353, 455, 455, 353, 261, 261, 381, 381
193 | ##                                 vertices
194 | ## 1 327, 402, 402, 327, 206, 206, 280, 280
195 | ## 2 368, 439, 439, 368, 298, 298, 370, 370
196 | ##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           landmarks
197 | ## 1 LEFT_EYE, RIGHT_EYE, LEFT_OF_LEFT_EYEBROW, RIGHT_OF_LEFT_EYEBROW, LEFT_OF_RIGHT_EYEBROW, RIGHT_OF_RIGHT_EYEBROW, MIDPOINT_BETWEEN_EYES, NOSE_TIP, UPPER_LIP, LOWER_LIP, MOUTH_LEFT, MOUTH_RIGHT, MOUTH_CENTER, NOSE_BOTTOM_RIGHT, NOSE_BOTTOM_LEFT, NOSE_BOTTOM_CENTER, LEFT_EYE_TOP_BOUNDARY, LEFT_EYE_RIGHT_CORNER, LEFT_EYE_BOTTOM_BOUNDARY, LEFT_EYE_LEFT_CORNER, LEFT_EYE_PUPIL, RIGHT_EYE_TOP_BOUNDARY, RIGHT_EYE_RIGHT_CORNER, RIGHT_EYE_BOTTOM_BOUNDARY, RIGHT_EYE_LEFT_CORNER, RIGHT_EYE_PUPIL, LEFT_EYEBROW_UPPER_MIDPOINT, RIGHT_EYEBROW_UPPER_MIDPOINT, LEFT_EAR_TRAGION, RIGHT_EAR_TRAGION, FOREHEAD_GLABELLA, CHIN_GNATHION, CHIN_LEFT_GONION, CHIN_RIGHT_GONION, 352.00974, 380.68124, 340.27664, 363.16348, 378.64938, 393.6553, 370.78906, 371.99802, 366.30664, 364.23642, 349.47012, 377.17905, 364.7603, 375.62842, 357.7237, 367.20822, 352.4306, 358.9425, 351.23474, 343.64124, 351.10004, 384.32953, 388.21667, 382.08743, 375.90262, 383.87732, 353.08627, 387.7416, 312.06622, 384.56946, 371.5381, 360.62714, 318.48486, 383.87354, 225.86505, 229.50423, 216.51169, 219.28635, 220.92139, 222.02762, 227.35771, 248.94884, 259.51044, 272.9798, 261.36096, 263.89874, 265.76828, 251.38408, 248.46135, 253.8837, 223.93387, 227.20102, 228.33765, 225.31805, 226.13412, 227.22661, 229.89023, 231.8548, 229.34843, 229.54358, 213.85588, 217.43123, 236.95158, 244.45172, 219.76247, 287.00592, 260.99124, 267.68896, -0.0009269835, 12.904515, -2.3585303, -3.3569832, 3.4166863, 20.891703, -0.10083569, -8.568332, -0.32282636, 2.7426949, 3.1502135, 14.79839, 2.401884, 8.115268, 0.19641992, -0.7506992, -2.5084567, 2.8466656, -0.38294473, -0.05908208, -1.2792722, 11.411656, 19.373985, 13.421982, 10.900102, 12.992137, -5.2635217, 9.859322, 31.33588, 62.9466, -1.213793, 7.9232774, 20.887934, 49.40408
198 | ## 2 LEFT_EYE, RIGHT_EYE, LEFT_OF_LEFT_EYEBROW, RIGHT_OF_LEFT_EYEBROW, LEFT_OF_RIGHT_EYEBROW, RIGHT_OF_RIGHT_EYEBROW, MIDPOINT_BETWEEN_EYES, NOSE_TIP, UPPER_LIP, LOWER_LIP, MOUTH_LEFT, MOUTH_RIGHT, MOUTH_CENTER, NOSE_BOTTOM_RIGHT, NOSE_BOTTOM_LEFT, NOSE_BOTTOM_CENTER, LEFT_EYE_TOP_BOUNDARY, LEFT_EYE_RIGHT_CORNER, LEFT_EYE_BOTTOM_BOUNDARY, LEFT_EYE_LEFT_CORNER, LEFT_EYE_PUPIL, RIGHT_EYE_TOP_BOUNDARY, RIGHT_EYE_RIGHT_CORNER, RIGHT_EYE_BOTTOM_BOUNDARY, RIGHT_EYE_LEFT_CORNER, RIGHT_EYE_PUPIL, LEFT_EYEBROW_UPPER_MIDPOINT, RIGHT_EYEBROW_UPPER_MIDPOINT, LEFT_EAR_TRAGION, RIGHT_EAR_TRAGION, FOREHEAD_GLABELLA, CHIN_GNATHION, CHIN_LEFT_GONION, CHIN_RIGHT_GONION, 389.67215, 419.01474, 378.68497, 397.29074, 411.57373, 430.68024, 404.34882, 402.9257, 402.77734, 402.28552, 388.3598, 418.2969, 402.50266, 411.8417, 394.88547, 403.11188, 388.78043, 395.5202, 389.06342, 382.62268, 388.21332, 419.86707, 425.98645, 419.21088, 413.45447, 420.1578, 387.80508, 421.56183, 369.29388, 439.9703, 404.44498, 401.90457, 371.50647, 435.39258, 319.11594, 320.14157, 310.8753, 313.32437, 313.59402, 313.3107, 318.78964, 337.8581, 347.607, 360.56134, 350.5411, 351.10315, 354.41702, 339.4301, 339.11786, 342.46072, 317.141, 320.19537, 321.22644, 318.473, 319.0922, 318.58655, 320.64886, 322.44992, 320.4701, 320.58286, 308.29236, 309.85825, 328.25885, 331.54816, 312.85385, 372.5355, 349.82388, 352.82462, 0.00018637085, -0.63476753, 2.1497552, -7.1008844, -7.460493, 1.0756116, -7.027477, -13.670173, -5.1229305, -1.0671108, 4.793461, 4.4603314, -1.8998832, -1.677745, -1.2933732, -5.9320116, -2.3477247, 0.03832738, 0.0054741018, 2.9924247, -0.7714207, -2.9816942, 2.079318, -0.6419869, -0.3527427, -1.4552351, -5.0709085, -5.7559977, 40.608036, 39.10855, -8.547456, 4.8426514, 30.500828, 29.191824
199 | ##   rollAngle panAngle tiltAngle detectionConfidence landmarkingConfidence
200 | ## 1  7.103324 23.46835 -2.816312           0.9877176             0.7072066
201 | ## 2  2.510939 -1.17956 -7.393063           0.9997375             0.7268016
202 | ##   joyLikelihood sorrowLikelihood angerLikelihood surpriseLikelihood
203 | ## 1   VERY_LIKELY    VERY_UNLIKELY   VERY_UNLIKELY      VERY_UNLIKELY
204 | ## 2   VERY_LIKELY    VERY_UNLIKELY   VERY_UNLIKELY      VERY_UNLIKELY
205 | ##   underExposedLikelihood blurredLikelihood headwearLikelihood
206 | ## 1          VERY_UNLIKELY     VERY_UNLIKELY        VERY_LIKELY
207 | ## 2          VERY_UNLIKELY     VERY_UNLIKELY        VERY_LIKELY
208 | </code></pre>
209 | 
210 | <pre><code class="r">us_hats_pic &lt;- readImage(&#39;us_hats.jpg&#39;)
211 | plot(us_hats_pic)
212 | 
213 | xs1 = us_hats$fdBoundingPoly$vertices[[1]][1][[1]]
214 | ys1 = us_hats$fdBoundingPoly$vertices[[1]][2][[1]]
215 | 
216 | xs2 = us_hats$fdBoundingPoly$vertices[[2]][1][[1]]
217 | ys2 = us_hats$fdBoundingPoly$vertices[[2]][2][[1]]
218 | 
219 | polygon(x=xs1,y=ys1,border=&#39;red&#39;,lwd=4)
220 | polygon(x=xs2,y=ys2,border=&#39;green&#39;,lwd=4)
221 | </code></pre>
222 | 
223 | <p><img src="https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-10-1.png" alt="plot of chunk unnamed-chunk-10"/></p>
224 | 
225 | <p>Here&#39;s a shot that should be familiar (copied directly from my last blog) - and I wanted to highlight the different features that can be detected. Look at how many points are perfectly placed:</p>
226 | 
227 | <pre><code class="r">my_face_pic &lt;- readImage(&#39;my_face.jpg&#39;)
228 | plot(my_face_pic)
229 | </code></pre>
230 | 
231 | <p><img src="https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-11-1.png" alt="plot of chunk unnamed-chunk-11"/></p>
232 | 
233 | <pre><code class="r">my_face = getGoogleVisionResponse(&#39;my_face.jpg&#39;,
234 |                                       feature = &#39;FACE_DETECTION&#39;)
235 | head(my_face)
236 | </code></pre>
237 | 
238 | <pre><code>##                               vertices
239 | ## 1 456, 877, 877, 456, NA, NA, 473, 473
240 | ##                               vertices
241 | ## 1 515, 813, 813, 515, 98, 98, 395, 395
242 | ##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             landmarks
243 | ## 1 LEFT_EYE, RIGHT_EYE, LEFT_OF_LEFT_EYEBROW, RIGHT_OF_LEFT_EYEBROW, LEFT_OF_RIGHT_EYEBROW, RIGHT_OF_RIGHT_EYEBROW, MIDPOINT_BETWEEN_EYES, NOSE_TIP, UPPER_LIP, LOWER_LIP, MOUTH_LEFT, MOUTH_RIGHT, MOUTH_CENTER, NOSE_BOTTOM_RIGHT, NOSE_BOTTOM_LEFT, NOSE_BOTTOM_CENTER, LEFT_EYE_TOP_BOUNDARY, LEFT_EYE_RIGHT_CORNER, LEFT_EYE_BOTTOM_BOUNDARY, LEFT_EYE_LEFT_CORNER, LEFT_EYE_PUPIL, RIGHT_EYE_TOP_BOUNDARY, RIGHT_EYE_RIGHT_CORNER, RIGHT_EYE_BOTTOM_BOUNDARY, RIGHT_EYE_LEFT_CORNER, RIGHT_EYE_PUPIL, LEFT_EYEBROW_UPPER_MIDPOINT, RIGHT_EYEBROW_UPPER_MIDPOINT, LEFT_EAR_TRAGION, RIGHT_EAR_TRAGION, FOREHEAD_GLABELLA, CHIN_GNATHION, CHIN_LEFT_GONION, CHIN_RIGHT_GONION, 598.7636, 723.16125, 556.1954, 628.8224, 693.0257, 767.7514, 661.2344, 661.9072, 662.7698, 662.2978, 603.21814, 722.5995, 662.66486, 700.5242, 626.14417, 663.0441, 597.7986, 624.5084, 597.13776, 572.32404, 596.0174, 725.61145, 751.531, 725.60315, 701.6699, 727.3262, 591.74457, 730.4487, 525.0554, 814.0723, 660.71075, 664.25146, 536.8293, 798.8593, 192.19489, 192.49554, 165.28363, 159.90292, 160.66797, 164.28062, 185.05746, 260.90063, 310.77585, 348.6693, 322.57773, 317.35153, 325.9983, 274.28345, 275.32834, 284.49515, 185.37177, 194.59952, 203.61258, 197.56845, 194.79561, 183.56104, 195.62381, 203.60477, 194.94687, 193.0094, 147.70262, 145.74747, 276.10037, 270.00323, 158.95798, 409.86185, 350.27, 346.58624, -0.0018592946, -4.8054757, 15.825399, -23.345352, -25.614508, 7.637372, -29.068363, -74.15371, -48.44018, -43.53211, -10.572805, -14.504428, -40.966953, -26.340576, -23.933197, -46.457916, -8.027897, -0.8318569, -2.181139, 12.514983, -3.5412567, -12.764345, 5.530805, -7.038474, -3.6184528, -8.517615, -10.674338, -15.85011, 152.71716, 142.93324, -29.311995, -31.410963, 93.14353, 83.41843
244 | ##    rollAngle  panAngle tiltAngle detectionConfidence landmarkingConfidence
245 | ## 1 -0.6375801 -2.120439  5.706552            0.996818             0.8222974
246 | ##   joyLikelihood sorrowLikelihood angerLikelihood surpriseLikelihood
247 | ## 1   VERY_LIKELY    VERY_UNLIKELY   VERY_UNLIKELY      VERY_UNLIKELY
248 | ##   underExposedLikelihood blurredLikelihood headwearLikelihood
249 | ## 1          VERY_UNLIKELY     VERY_UNLIKELY      VERY_UNLIKELY
250 | </code></pre>
251 | 
252 | <pre><code class="r">head(my_face$landmarks)
253 | </code></pre>
254 | 
255 | <pre><code>## [[1]]
256 | ##                            type position.x position.y    position.z
257 | ## 1                      LEFT_EYE   598.7636   192.1949  -0.001859295
258 | ## 2                     RIGHT_EYE   723.1612   192.4955  -4.805475700
259 | ## 3          LEFT_OF_LEFT_EYEBROW   556.1954   165.2836  15.825399000
260 | ## 4         RIGHT_OF_LEFT_EYEBROW   628.8224   159.9029 -23.345352000
261 | ## 5         LEFT_OF_RIGHT_EYEBROW   693.0257   160.6680 -25.614508000
262 | ## 6        RIGHT_OF_RIGHT_EYEBROW   767.7514   164.2806   7.637372000
263 | ## 7         MIDPOINT_BETWEEN_EYES   661.2344   185.0575 -29.068363000
264 | ## 8                      NOSE_TIP   661.9072   260.9006 -74.153710000
265 | ## 9                     UPPER_LIP   662.7698   310.7758 -48.440180000
266 | ## 10                    LOWER_LIP   662.2978   348.6693 -43.532110000
267 | ## 11                   MOUTH_LEFT   603.2181   322.5777 -10.572805000
268 | ## 12                  MOUTH_RIGHT   722.5995   317.3515 -14.504428000
269 | ## 13                 MOUTH_CENTER   662.6649   325.9983 -40.966953000
270 | ## 14            NOSE_BOTTOM_RIGHT   700.5242   274.2835 -26.340576000
271 | ## 15             NOSE_BOTTOM_LEFT   626.1442   275.3283 -23.933197000
272 | ## 16           NOSE_BOTTOM_CENTER   663.0441   284.4952 -46.457916000
273 | ## 17        LEFT_EYE_TOP_BOUNDARY   597.7986   185.3718  -8.027897000
274 | ## 18        LEFT_EYE_RIGHT_CORNER   624.5084   194.5995  -0.831856900
275 | ## 19     LEFT_EYE_BOTTOM_BOUNDARY   597.1378   203.6126  -2.181139000
276 | ## 20         LEFT_EYE_LEFT_CORNER   572.3240   197.5685  12.514983000
277 | ## 21               LEFT_EYE_PUPIL   596.0174   194.7956  -3.541256700
278 | ## 22       RIGHT_EYE_TOP_BOUNDARY   725.6114   183.5610 -12.764345000
279 | ## 23       RIGHT_EYE_RIGHT_CORNER   751.5310   195.6238   5.530805000
280 | ## 24    RIGHT_EYE_BOTTOM_BOUNDARY   725.6032   203.6048  -7.038474000
281 | ## 25        RIGHT_EYE_LEFT_CORNER   701.6699   194.9469  -3.618452800
282 | ## 26              RIGHT_EYE_PUPIL   727.3262   193.0094  -8.517615000
283 | ## 27  LEFT_EYEBROW_UPPER_MIDPOINT   591.7446   147.7026 -10.674338000
284 | ## 28 RIGHT_EYEBROW_UPPER_MIDPOINT   730.4487   145.7475 -15.850110000
285 | ## 29             LEFT_EAR_TRAGION   525.0554   276.1004 152.717160000
286 | ## 30            RIGHT_EAR_TRAGION   814.0723   270.0032 142.933240000
287 | ## 31            FOREHEAD_GLABELLA   660.7107   158.9580 -29.311995000
288 | ## 32                CHIN_GNATHION   664.2515   409.8619 -31.410963000
289 | ## 33             CHIN_LEFT_GONION   536.8293   350.2700  93.143530000
290 | ## 34            CHIN_RIGHT_GONION   798.8593   346.5862  83.418430000
291 | </code></pre>
292 | 
293 | <pre><code class="r">my_face_pic &lt;- readImage(&#39;my_face.jpg&#39;)
294 | plot(my_face_pic)
295 | 
296 | xs1 = my_face$fdBoundingPoly$vertices[[1]][1][[1]]
297 | ys1 = my_face$fdBoundingPoly$vertices[[1]][2][[1]]
298 | 
299 | xs2 = my_face$landmarks[[1]][[2]][[1]]
300 | ys2 = my_face$landmarks[[1]][[2]][[2]]
301 | 
302 | polygon(x=xs1,y=ys1,border=&#39;red&#39;,lwd=4)
303 | points(x=xs2,y=ys2,lwd=2, col=&#39;lightblue&#39;)
304 | </code></pre>
305 | 
306 | <p><img src="https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-14-1.png" alt="plot of chunk unnamed-chunk-14"/></p>
307 | 
308 | <hr/>
309 | 
310 | <h4>Logo Detection</h4>
311 | 
312 | <p>To continue along the Chicago trip, we drove by Wrigley field and I took a really bad photo of the sign from a moving car as it was under construction. It&#39;s nice because it has a lot of different lines and writing the Toyota logo isn&#39;t incredibly prominent or necessarily fit to brand colors.</p>
313 | 
314 | <p>This call returns:  </p>
315 | 
316 | <ul>
317 | <li>Description = Brand name of the logo detected</li>
318 | <li>Score = Confidence of prediction accuracy</li>
319 | <li>Bounding Poly = (Again) coordinates of the logo</li>
320 | </ul>
321 | 
322 | <pre><code class="r">wrigley_image &lt;- readImage(&#39;wrigley_text.jpg&#39;)
323 | plot(wrigley_image)
324 | </code></pre>
325 | 
326 | <p><img src="https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-15-1.png" alt="plot of chunk unnamed-chunk-15"/></p>
327 | 
328 | <pre><code class="r">wrigley_logo = getGoogleVisionResponse(&#39;wrigley_text.jpg&#39;,
329 |                                    feature = &#39;LOGO_DETECTION&#39;)
330 | head(wrigley_logo)
331 | </code></pre>
332 | 
333 | <pre><code>##           mid description     score                               vertices
334 | ## 1 /g/1tk6469q      Toyota 0.3126611 435, 551, 551, 435, 449, 449, 476, 476
335 | </code></pre>
336 | 
337 | <pre><code class="r">wrigley_image &lt;- readImage(&#39;wrigley_text.jpg&#39;)
338 | plot(wrigley_image)
339 | xs = wrigley_logo$boundingPoly$vertices[[1]][[1]]
340 | ys = wrigley_logo$boundingPoly$vertices[[1]][[2]]
341 | polygon(x=xs,y=ys,border=&#39;green&#39;,lwd=4)
342 | </code></pre>
343 | 
344 | <p><img src="https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-17-1.png" alt="plot of chunk unnamed-chunk-17"/></p>
345 | 
346 | <hr/>
347 | 
348 | <h4>Text Detection</h4>
349 | 
350 | <p>I&#39;ll continue using the Wrigley Field picture. There is text all over the place and it&#39;s fun to see what is captured and what isn&#39;t. It appears as if the curved text at the top &ldquo;field&rdquo; isn&#39;t easily interpreted as text. However, the rest is caught and the words are captured.  </p>
351 | 
352 | <p>The response sent back is a bit more difficult to interpret than the rest of the API calls - it breaks things apart by word but also returns everything as one line. Here&#39;s what comes back:  </p>
353 | 
354 | <ul>
355 | <li>Locale = language, returned as source</li>
356 | <li>Description = the text (the first line is everything, and then the rest are indiviudal words)</li>
357 | <li>Bounding Poly = I&#39;m sure you can guess by now</li>
358 | </ul>
359 | 
360 | <pre><code class="r">wrigley_text = getGoogleVisionResponse(&#39;wrigley_text.jpg&#39;,
361 |                                    feature = &#39;TEXT_DETECTION&#39;)
362 | head(wrigley_text)
363 | </code></pre>
364 | 
365 | <pre><code>##   locale
366 | ## 1     en
367 | ## 2   &lt;NA&gt;
368 | ## 3   &lt;NA&gt;
369 | ## 4   &lt;NA&gt;
370 | ## 5   &lt;NA&gt;
371 | ## 6   &lt;NA&gt;
372 | ##                                                                                                        description
373 | ## 1 RIGLEY F\nICHICAGO CUBS\nORDER ONLINE AT GIORDANOS.COM\nTOYOTA\nMIDWEST\nFENCE\n773-722-6616\nCAUTION\nCAUTION\n
374 | ## 2                                                                                                           RIGLEY
375 | ## 3                                                                                                                F
376 | ## 4                                                                                                         ICHICAGO
377 | ## 5                                                                                                             CUBS
378 | ## 6                                                                                                            ORDER
379 | ##                                 vertices
380 | ## 1   55, 657, 657, 55, 210, 210, 852, 852
381 | ## 2 343, 482, 484, 345, 217, 211, 260, 266
382 | ## 3 501, 524, 526, 503, 211, 210, 259, 260
383 | ## 4 222, 503, 501, 220, 295, 307, 348, 336
384 | ## 5 527, 627, 625, 525, 308, 312, 353, 349
385 | ## 6 310, 384, 384, 310, 374, 374, 391, 391
386 | </code></pre>
387 | 
388 | <pre><code class="r">wrigley_image &lt;- readImage(&#39;wrigley_text.jpg&#39;)
389 | plot(wrigley_image)
390 | 
391 | for(i in 1:length(wrigley_text$boundingPoly$vertices)){
392 |   xs = wrigley_text$boundingPoly$vertices[[i]]$x
393 |   ys = wrigley_text$boundingPoly$vertices[[i]]$y
394 |   polygon(x=xs,y=ys,border=&#39;green&#39;,lwd=2)
395 | }
396 | </code></pre>
397 | 
398 | <p><img src="https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-19-1.png" alt="plot of chunk unnamed-chunk-19"/></p>
399 | 
400 | <hr/>
401 | 
402 | <p>That&#39;s about it for the basics of using the Google Vision API with the RoogleVision library. I highly recommend tinkering around with it a bit, especially because it won&#39;t cost you a dime.  </p>
403 | 
404 | <p>While I do enjoy the math under the hood and the thinking required to understand alrgorithms, I do think these sorts of API&#39;s will become the way of the future for data science. Outside of specific use cases or special industries, it seems hard to imagine wanting to try and create algorithms that would be better than ones created for mass consumption. As long as they&#39;re fast, free and accurate, I&#39;m all about making my life easier! From the hiring perspective, I much prefer someone who can get the job done over someone who can slightly improve performance (as always, there are many cases where this doesn&#39;t apply).</p>
405 | 
406 | <p>Please comment if you are utilizing any of the Google API&#39;s for business purposes, I would love to hear it!</p>
407 | 
408 | <p>As always you can find this on my <a href="https://github.com/stoltzmaniac/ML-Image-Processing-R">GitHub</a></p>
409 | 
410 | 


--------------------------------------------------------------------------------
/Google Vision API/Google Vision API in R.md:
--------------------------------------------------------------------------------
  1 | ---
  2 | title: "Google Vision API in R"
  3 | author: "Scott Stoltzman"
  4 | date: "7/29/2017"
  5 | output: html_document
  6 | ---
  7 | 
  8 | ## Using the Google Vision API in R  
  9 | 
 10 | ### Utilizing RoogleVision  
 11 | 
 12 | After doing my post last month on OpenCV and face detection, I started looking into other algorithms used for pattern detection in images. As it turns out, Google has done a phenomenal job with their Vision API. It's absolutely incredible the amount of information it can spit back to you by simply sending it a picture.  
 13 | 
 14 | Also, it's 100% free! I believe that includes 1000 images per month. Amazing!  
 15 | 
 16 | In this post I'm going to walk you through the absolute basics of accessing the power of the Google Vision API using the RoogleVision package in R.
 17 | 
 18 | As always, we'll start off loading some libraries. I wrote some extra notation around where you can install them within the code.  
 19 | 
 20 | 
 21 | ```r
 22 | # Normal Libraries
 23 | library(tidyverse)
 24 | 
 25 | # devtools::install_github("flovv/RoogleVision")
 26 | library(RoogleVision)
 27 | library(jsonlite) # to import credentials
 28 | 
 29 | # For image processing
 30 | # source("http://bioconductor.org/biocLite.R")
 31 | # biocLite("EBImage")
 32 | library(EBImage)
 33 | 
 34 | # For Latitude Longitude Map
 35 | library(leaflet)
 36 | ```
 37 | 
 38 | #### Google Authentication  
 39 | 
 40 | In order to use the API, you have to authenticate. There is plenty of documentation out there about how to setup an account, create a project, download credentials, etc. Head over to [Google Cloud Console](https://console.cloud.google.com) if you don't have an account already.
 41 | 
 42 | 
 43 | ```r
 44 | # Credentials file I downloaded from the cloud console
 45 | creds = fromJSON('credentials.json')
 46 | 
 47 | # Google Authentication - Use Your Credentials
 48 | # options("googleAuthR.client_id" = "xxx.apps.googleusercontent.com")
 49 | # options("googleAuthR.client_secret" = "")
 50 | 
 51 | options("googleAuthR.client_id" = creds$installed$client_id)
 52 | options("googleAuthR.client_secret" = creds$installed$client_secret)
 53 | options("googleAuthR.scopes.selected" = c("https://www.googleapis.com/auth/cloud-platform"))
 54 | googleAuthR::gar_auth()
 55 | ```
 56 | 
 57 | ```
 58 | ## 2017-07-31 11:30:34> Token cache file: .httr-oauth
 59 | ```
 60 | 
 61 | ```
 62 | ## 2017-07-31 11:30:34> Scopes: https://www.googleapis.com/auth/cloud-platform
 63 | ```
 64 |   
 65 |   
 66 | ### Now You're Ready to Go  
 67 | 
 68 | The function getGoogleVisionResponse takes three arguments:
 69 | 
 70 |   1. imagePath
 71 |   2. feature
 72 |   3. numResults
 73 | 
 74 | Numbers 1 and 3 are self-explanatory, "feature" has 5 options:  
 75 | 
 76 |   * LABEL_DETECTION
 77 |   * LANDMARK_DETECTION
 78 |   * FACE_DETECTION
 79 |   * LOGO_DETECTION
 80 |   * TEXT_DETECTION  
 81 | 
 82 | These are self-explanatory but it's nice to see each one in action. 
 83 | 
 84 | As a side note: there are also other features that the API has which aren't included (yet) in the RoogleVision package such as "Safe Search" which identifies inappropriate content, "Properties" which identifies dominant colors and aspect ratios and a few others can be found at the [Cloud Vision website](https://cloud.google.com/vision/)
 85 | 
 86 | ----  
 87 | 
 88 | #### Label Detection  
 89 | 
 90 | This is used to help determine content within the photo. It can basically add a level of metadata around the image.  
 91 | 
 92 | Here is a photo of our dog when we hiked up to Audubon Peak in Colorado:  
 93 | 
 94 | ![plot of chunk unnamed-chunk-2](https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-2-1.png)
 95 | 
 96 | 
 97 | 
 98 | 
 99 | ```r
100 | dog_mountain_label = getGoogleVisionResponse('dog_mountain.jpg',
101 |                                               feature = 'LABEL_DETECTION')
102 | head(dog_mountain_label)
103 | ```
104 | 
105 | ```
106 | ##            mid           description     score
107 | ## 1     /m/09d_r              mountain 0.9188690
108 | ## 2 /g/11jxkqbpp mountainous landforms 0.9009549
109 | ## 3    /m/023bbt            wilderness 0.8733696
110 | ## 4     /m/0kpmf             dog breed 0.8398435
111 | ## 5    /m/0d4djn            dog hiking 0.8352048
112 | ```
113 |   
114 |   
115 | All 5 responses were incredibly accurate! The "score" that is returned is how confident the Google Vision algorithms are, so there's a 91.9% chance a mountain is prominent in this photo. I like "dog hiking" the best - considering that's what we were doing at the time. Kind of a little bit too accurate...  
116 | 
117 | ----  
118 | 
119 | #### Landmark Detection   
120 | 
121 | This is a feature designed to specifically pick out a recognizable landmark! It provides the position in the image along with the geolocation of the landmark (in longitude and latitude).  
122 | 
123 | My wife and I took this selfie in at the Linderhof Castle in Bavaria, Germany. 
124 | 
125 | 
126 | ```r
127 | us_castle <- readImage('us_castle_2.jpg')
128 | plot(us_castle)
129 | ```
130 | 
131 | ![plot of chunk unnamed-chunk-4](https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-4-1.png)
132 |   
133 | 
134 | The response from the Google Vision API was spot on. It returned "Linderhof Palace" as the description. It also provided a  score (I reduced the resolution of the image which hurt the score), a boundingPoly field and locations.  
135 | 
136 |   * Bounding Poly - gives x,y coordinates for a polygon around the landmark in the image
137 |   * Locations - provides longitude,latitude coordinates  
138 | 
139 | 
140 | ```r
141 | us_landmark = getGoogleVisionResponse('us_castle_2.jpg',
142 |                                       feature = 'LANDMARK_DETECTION')
143 | head(us_landmark)
144 | ```
145 | 
146 | ```
147 | ##         mid      description     score
148 | ## 1 /m/066h19 Linderhof Palace 0.4665011
149 | ##                               vertices          locations
150 | ## 1 25, 382, 382, 25, 178, 178, 659, 659 47.57127, 10.96072
151 | ```
152 |   
153 | I plotted the polygon over the image using the coordinates returned. It does a great job (certainly not perfect) of getting the castle identified. It's a bit tough to say what the actual "landmark" would be in this case due to the fact the fountains, stairs and grounds are certainly important and are a key part of the castle.
154 | 
155 | 
156 | ```r
157 | us_castle <- readImage('us_castle_2.jpg')
158 | plot(us_castle)
159 | xs = us_landmark$boundingPoly$vertices[[1]][1][[1]]
160 | ys = us_landmark$boundingPoly$vertices[[1]][2][[1]]
161 | polygon(x=xs,y=ys,border='red',lwd=4)
162 | ```
163 | 
164 | ![plot of chunk unnamed-chunk-6](https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-6-1.png)
165 |   
166 |   
167 | Turning to the locations - I plotted this using the leaflet library. If you haven't used leaflet, start doing so immediately. I'm a huge fan of it due to speed and simplicity. There are a lot of customization options available as well that you can check out.  
168 | 
169 | The location = spot on! While it isn't a shock to me that Google could provide the location of "Linderhof Castle" - it is amazing to me that I don't have to write a web crawler search function to find it myself! That's just one of many little luxuries they have built into this API.
170 | 
171 | 
172 | ```r
173 | latt = us_landmark$locations[[1]][[1]][[1]]
174 | lon = us_landmark$locations[[1]][[1]][[2]]
175 | m = leaflet() %>%
176 |   addProviderTiles(providers$CartoDB.Positron) %>%
177 |   setView(lng = lon, lat = latt, zoom = 5) %>%
178 |   addMarkers(lng = lon, lat = latt)
179 | m
180 | ```
181 | 
182 | ```
183 | ## Error in loadNamespace(name): there is no package called 'webshot'
184 | ```
185 | 
186 | 
187 | ----  
188 | 
189 | #### Face Detection  
190 | 
191 | My last blog post showed the OpenCV package utilizing the haar cascade algorithm in action. I didn't dig into Google's algorithms to figure out what is under the hood, but it provides similar results. However, rather than layering in each subsequent "find the eyes" and "find the mouth" and ...etc... it returns more than you ever needed to know.
192 | 
193 |   * Bounding Poly = highest level polygon
194 |   * FD Bounding Poly = polygon surrounding each face
195 |   * Landmarks = (funny name) includes each feature of the face (left eye, right eye, etc.)
196 |   * Roll Angle, Pan Angle, Tilt Angle = all of the different angles you'd need per face
197 |   * Confidence (detection and landmarking) = how certain the algorithm is that it's accurate
198 |   * Joy, sorrow, anger, surprise, under exposed, blurred, headwear likelihoods = how likely it is that each face contains that emotion or characteristic
199 | 
200 | The likelihoods is another amazing piece of information returned! I have run about 20 images through this API and every single one has been accurate - very impressive!
201 | 
202 | I wanted to showcase the face detection and headwear first. Here's a picture of my wife and I at "The Bean" in Chicago (side note: it's awesome! I thought it was going to be really silly, but you can really have a lot of fun with all of the angles and reflections):
203 | 
204 | 
205 | ```r
206 | us_hats_pic <- readImage('us_hats.jpg')
207 | plot(us_hats_pic)
208 | ```
209 | 
210 | ![plot of chunk unnamed-chunk-8](https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-8-1.png)
211 | 
212 | 
213 | 
214 | ```r
215 | us_hats = getGoogleVisionResponse('us_hats.jpg',
216 |                                       feature = 'FACE_DETECTION')
217 | head(us_hats)
218 | ```
219 | 
220 | ```
221 | ##                                 vertices
222 | ## 1 295, 410, 410, 295, 164, 164, 297, 297
223 | ## 2 353, 455, 455, 353, 261, 261, 381, 381
224 | ##                                 vertices
225 | ## 1 327, 402, 402, 327, 206, 206, 280, 280
226 | ## 2 368, 439, 439, 368, 298, 298, 370, 370
227 | ##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                           landmarks
228 | ## 1 LEFT_EYE, RIGHT_EYE, LEFT_OF_LEFT_EYEBROW, RIGHT_OF_LEFT_EYEBROW, LEFT_OF_RIGHT_EYEBROW, RIGHT_OF_RIGHT_EYEBROW, MIDPOINT_BETWEEN_EYES, NOSE_TIP, UPPER_LIP, LOWER_LIP, MOUTH_LEFT, MOUTH_RIGHT, MOUTH_CENTER, NOSE_BOTTOM_RIGHT, NOSE_BOTTOM_LEFT, NOSE_BOTTOM_CENTER, LEFT_EYE_TOP_BOUNDARY, LEFT_EYE_RIGHT_CORNER, LEFT_EYE_BOTTOM_BOUNDARY, LEFT_EYE_LEFT_CORNER, LEFT_EYE_PUPIL, RIGHT_EYE_TOP_BOUNDARY, RIGHT_EYE_RIGHT_CORNER, RIGHT_EYE_BOTTOM_BOUNDARY, RIGHT_EYE_LEFT_CORNER, RIGHT_EYE_PUPIL, LEFT_EYEBROW_UPPER_MIDPOINT, RIGHT_EYEBROW_UPPER_MIDPOINT, LEFT_EAR_TRAGION, RIGHT_EAR_TRAGION, FOREHEAD_GLABELLA, CHIN_GNATHION, CHIN_LEFT_GONION, CHIN_RIGHT_GONION, 352.00974, 380.68124, 340.27664, 363.16348, 378.64938, 393.6553, 370.78906, 371.99802, 366.30664, 364.23642, 349.47012, 377.17905, 364.7603, 375.62842, 357.7237, 367.20822, 352.4306, 358.9425, 351.23474, 343.64124, 351.10004, 384.32953, 388.21667, 382.08743, 375.90262, 383.87732, 353.08627, 387.7416, 312.06622, 384.56946, 371.5381, 360.62714, 318.48486, 383.87354, 225.86505, 229.50423, 216.51169, 219.28635, 220.92139, 222.02762, 227.35771, 248.94884, 259.51044, 272.9798, 261.36096, 263.89874, 265.76828, 251.38408, 248.46135, 253.8837, 223.93387, 227.20102, 228.33765, 225.31805, 226.13412, 227.22661, 229.89023, 231.8548, 229.34843, 229.54358, 213.85588, 217.43123, 236.95158, 244.45172, 219.76247, 287.00592, 260.99124, 267.68896, -0.0009269835, 12.904515, -2.3585303, -3.3569832, 3.4166863, 20.891703, -0.10083569, -8.568332, -0.32282636, 2.7426949, 3.1502135, 14.79839, 2.401884, 8.115268, 0.19641992, -0.7506992, -2.5084567, 2.8466656, -0.38294473, -0.05908208, -1.2792722, 11.411656, 19.373985, 13.421982, 10.900102, 12.992137, -5.2635217, 9.859322, 31.33588, 62.9466, -1.213793, 7.9232774, 20.887934, 49.40408
229 | ## 2 LEFT_EYE, RIGHT_EYE, LEFT_OF_LEFT_EYEBROW, RIGHT_OF_LEFT_EYEBROW, LEFT_OF_RIGHT_EYEBROW, RIGHT_OF_RIGHT_EYEBROW, MIDPOINT_BETWEEN_EYES, NOSE_TIP, UPPER_LIP, LOWER_LIP, MOUTH_LEFT, MOUTH_RIGHT, MOUTH_CENTER, NOSE_BOTTOM_RIGHT, NOSE_BOTTOM_LEFT, NOSE_BOTTOM_CENTER, LEFT_EYE_TOP_BOUNDARY, LEFT_EYE_RIGHT_CORNER, LEFT_EYE_BOTTOM_BOUNDARY, LEFT_EYE_LEFT_CORNER, LEFT_EYE_PUPIL, RIGHT_EYE_TOP_BOUNDARY, RIGHT_EYE_RIGHT_CORNER, RIGHT_EYE_BOTTOM_BOUNDARY, RIGHT_EYE_LEFT_CORNER, RIGHT_EYE_PUPIL, LEFT_EYEBROW_UPPER_MIDPOINT, RIGHT_EYEBROW_UPPER_MIDPOINT, LEFT_EAR_TRAGION, RIGHT_EAR_TRAGION, FOREHEAD_GLABELLA, CHIN_GNATHION, CHIN_LEFT_GONION, CHIN_RIGHT_GONION, 389.67215, 419.01474, 378.68497, 397.29074, 411.57373, 430.68024, 404.34882, 402.9257, 402.77734, 402.28552, 388.3598, 418.2969, 402.50266, 411.8417, 394.88547, 403.11188, 388.78043, 395.5202, 389.06342, 382.62268, 388.21332, 419.86707, 425.98645, 419.21088, 413.45447, 420.1578, 387.80508, 421.56183, 369.29388, 439.9703, 404.44498, 401.90457, 371.50647, 435.39258, 319.11594, 320.14157, 310.8753, 313.32437, 313.59402, 313.3107, 318.78964, 337.8581, 347.607, 360.56134, 350.5411, 351.10315, 354.41702, 339.4301, 339.11786, 342.46072, 317.141, 320.19537, 321.22644, 318.473, 319.0922, 318.58655, 320.64886, 322.44992, 320.4701, 320.58286, 308.29236, 309.85825, 328.25885, 331.54816, 312.85385, 372.5355, 349.82388, 352.82462, 0.00018637085, -0.63476753, 2.1497552, -7.1008844, -7.460493, 1.0756116, -7.027477, -13.670173, -5.1229305, -1.0671108, 4.793461, 4.4603314, -1.8998832, -1.677745, -1.2933732, -5.9320116, -2.3477247, 0.03832738, 0.0054741018, 2.9924247, -0.7714207, -2.9816942, 2.079318, -0.6419869, -0.3527427, -1.4552351, -5.0709085, -5.7559977, 40.608036, 39.10855, -8.547456, 4.8426514, 30.500828, 29.191824
230 | ##   rollAngle panAngle tiltAngle detectionConfidence landmarkingConfidence
231 | ## 1  7.103324 23.46835 -2.816312           0.9877176             0.7072066
232 | ## 2  2.510939 -1.17956 -7.393063           0.9997375             0.7268016
233 | ##   joyLikelihood sorrowLikelihood angerLikelihood surpriseLikelihood
234 | ## 1   VERY_LIKELY    VERY_UNLIKELY   VERY_UNLIKELY      VERY_UNLIKELY
235 | ## 2   VERY_LIKELY    VERY_UNLIKELY   VERY_UNLIKELY      VERY_UNLIKELY
236 | ##   underExposedLikelihood blurredLikelihood headwearLikelihood
237 | ## 1          VERY_UNLIKELY     VERY_UNLIKELY        VERY_LIKELY
238 | ## 2          VERY_UNLIKELY     VERY_UNLIKELY        VERY_LIKELY
239 | ```
240 | 
241 | 
242 | 
243 | ```r
244 | us_hats_pic <- readImage('us_hats.jpg')
245 | plot(us_hats_pic)
246 | 
247 | xs1 = us_hats$fdBoundingPoly$vertices[[1]][1][[1]]
248 | ys1 = us_hats$fdBoundingPoly$vertices[[1]][2][[1]]
249 | 
250 | xs2 = us_hats$fdBoundingPoly$vertices[[2]][1][[1]]
251 | ys2 = us_hats$fdBoundingPoly$vertices[[2]][2][[1]]
252 | 
253 | polygon(x=xs1,y=ys1,border='red',lwd=4)
254 | polygon(x=xs2,y=ys2,border='green',lwd=4)
255 | ```
256 | 
257 | ![plot of chunk unnamed-chunk-10](https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-10-1.png)
258 | 
259 | 
260 | Here's a shot that should be familiar (copied directly from my last blog) - and I wanted to highlight the different features that can be detected. Look at how many points are perfectly placed:
261 | 
262 | 
263 | ```r
264 | my_face_pic <- readImage('my_face.jpg')
265 | plot(my_face_pic)
266 | ```
267 | 
268 | ![plot of chunk unnamed-chunk-11](https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-11-1.png)
269 | 
270 | 
271 | 
272 | 
273 | ```r
274 | my_face = getGoogleVisionResponse('my_face.jpg',
275 |                                       feature = 'FACE_DETECTION')
276 | head(my_face)
277 | ```
278 | 
279 | ```
280 | ##                               vertices
281 | ## 1 456, 877, 877, 456, NA, NA, 473, 473
282 | ##                               vertices
283 | ## 1 515, 813, 813, 515, 98, 98, 395, 395
284 | ##                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             landmarks
285 | ## 1 LEFT_EYE, RIGHT_EYE, LEFT_OF_LEFT_EYEBROW, RIGHT_OF_LEFT_EYEBROW, LEFT_OF_RIGHT_EYEBROW, RIGHT_OF_RIGHT_EYEBROW, MIDPOINT_BETWEEN_EYES, NOSE_TIP, UPPER_LIP, LOWER_LIP, MOUTH_LEFT, MOUTH_RIGHT, MOUTH_CENTER, NOSE_BOTTOM_RIGHT, NOSE_BOTTOM_LEFT, NOSE_BOTTOM_CENTER, LEFT_EYE_TOP_BOUNDARY, LEFT_EYE_RIGHT_CORNER, LEFT_EYE_BOTTOM_BOUNDARY, LEFT_EYE_LEFT_CORNER, LEFT_EYE_PUPIL, RIGHT_EYE_TOP_BOUNDARY, RIGHT_EYE_RIGHT_CORNER, RIGHT_EYE_BOTTOM_BOUNDARY, RIGHT_EYE_LEFT_CORNER, RIGHT_EYE_PUPIL, LEFT_EYEBROW_UPPER_MIDPOINT, RIGHT_EYEBROW_UPPER_MIDPOINT, LEFT_EAR_TRAGION, RIGHT_EAR_TRAGION, FOREHEAD_GLABELLA, CHIN_GNATHION, CHIN_LEFT_GONION, CHIN_RIGHT_GONION, 598.7636, 723.16125, 556.1954, 628.8224, 693.0257, 767.7514, 661.2344, 661.9072, 662.7698, 662.2978, 603.21814, 722.5995, 662.66486, 700.5242, 626.14417, 663.0441, 597.7986, 624.5084, 597.13776, 572.32404, 596.0174, 725.61145, 751.531, 725.60315, 701.6699, 727.3262, 591.74457, 730.4487, 525.0554, 814.0723, 660.71075, 664.25146, 536.8293, 798.8593, 192.19489, 192.49554, 165.28363, 159.90292, 160.66797, 164.28062, 185.05746, 260.90063, 310.77585, 348.6693, 322.57773, 317.35153, 325.9983, 274.28345, 275.32834, 284.49515, 185.37177, 194.59952, 203.61258, 197.56845, 194.79561, 183.56104, 195.62381, 203.60477, 194.94687, 193.0094, 147.70262, 145.74747, 276.10037, 270.00323, 158.95798, 409.86185, 350.27, 346.58624, -0.0018592946, -4.8054757, 15.825399, -23.345352, -25.614508, 7.637372, -29.068363, -74.15371, -48.44018, -43.53211, -10.572805, -14.504428, -40.966953, -26.340576, -23.933197, -46.457916, -8.027897, -0.8318569, -2.181139, 12.514983, -3.5412567, -12.764345, 5.530805, -7.038474, -3.6184528, -8.517615, -10.674338, -15.85011, 152.71716, 142.93324, -29.311995, -31.410963, 93.14353, 83.41843
286 | ##    rollAngle  panAngle tiltAngle detectionConfidence landmarkingConfidence
287 | ## 1 -0.6375801 -2.120439  5.706552            0.996818             0.8222974
288 | ##   joyLikelihood sorrowLikelihood angerLikelihood surpriseLikelihood
289 | ## 1   VERY_LIKELY    VERY_UNLIKELY   VERY_UNLIKELY      VERY_UNLIKELY
290 | ##   underExposedLikelihood blurredLikelihood headwearLikelihood
291 | ## 1          VERY_UNLIKELY     VERY_UNLIKELY      VERY_UNLIKELY
292 | ```
293 | 
294 | 
295 | 
296 | 
297 | ```r
298 | head(my_face$landmarks)
299 | ```
300 | 
301 | ```
302 | ## [[1]]
303 | ##                            type position.x position.y    position.z
304 | ## 1                      LEFT_EYE   598.7636   192.1949  -0.001859295
305 | ## 2                     RIGHT_EYE   723.1612   192.4955  -4.805475700
306 | ## 3          LEFT_OF_LEFT_EYEBROW   556.1954   165.2836  15.825399000
307 | ## 4         RIGHT_OF_LEFT_EYEBROW   628.8224   159.9029 -23.345352000
308 | ## 5         LEFT_OF_RIGHT_EYEBROW   693.0257   160.6680 -25.614508000
309 | ## 6        RIGHT_OF_RIGHT_EYEBROW   767.7514   164.2806   7.637372000
310 | ## 7         MIDPOINT_BETWEEN_EYES   661.2344   185.0575 -29.068363000
311 | ## 8                      NOSE_TIP   661.9072   260.9006 -74.153710000
312 | ## 9                     UPPER_LIP   662.7698   310.7758 -48.440180000
313 | ## 10                    LOWER_LIP   662.2978   348.6693 -43.532110000
314 | ## 11                   MOUTH_LEFT   603.2181   322.5777 -10.572805000
315 | ## 12                  MOUTH_RIGHT   722.5995   317.3515 -14.504428000
316 | ## 13                 MOUTH_CENTER   662.6649   325.9983 -40.966953000
317 | ## 14            NOSE_BOTTOM_RIGHT   700.5242   274.2835 -26.340576000
318 | ## 15             NOSE_BOTTOM_LEFT   626.1442   275.3283 -23.933197000
319 | ## 16           NOSE_BOTTOM_CENTER   663.0441   284.4952 -46.457916000
320 | ## 17        LEFT_EYE_TOP_BOUNDARY   597.7986   185.3718  -8.027897000
321 | ## 18        LEFT_EYE_RIGHT_CORNER   624.5084   194.5995  -0.831856900
322 | ## 19     LEFT_EYE_BOTTOM_BOUNDARY   597.1378   203.6126  -2.181139000
323 | ## 20         LEFT_EYE_LEFT_CORNER   572.3240   197.5685  12.514983000
324 | ## 21               LEFT_EYE_PUPIL   596.0174   194.7956  -3.541256700
325 | ## 22       RIGHT_EYE_TOP_BOUNDARY   725.6114   183.5610 -12.764345000
326 | ## 23       RIGHT_EYE_RIGHT_CORNER   751.5310   195.6238   5.530805000
327 | ## 24    RIGHT_EYE_BOTTOM_BOUNDARY   725.6032   203.6048  -7.038474000
328 | ## 25        RIGHT_EYE_LEFT_CORNER   701.6699   194.9469  -3.618452800
329 | ## 26              RIGHT_EYE_PUPIL   727.3262   193.0094  -8.517615000
330 | ## 27  LEFT_EYEBROW_UPPER_MIDPOINT   591.7446   147.7026 -10.674338000
331 | ## 28 RIGHT_EYEBROW_UPPER_MIDPOINT   730.4487   145.7475 -15.850110000
332 | ## 29             LEFT_EAR_TRAGION   525.0554   276.1004 152.717160000
333 | ## 30            RIGHT_EAR_TRAGION   814.0723   270.0032 142.933240000
334 | ## 31            FOREHEAD_GLABELLA   660.7107   158.9580 -29.311995000
335 | ## 32                CHIN_GNATHION   664.2515   409.8619 -31.410963000
336 | ## 33             CHIN_LEFT_GONION   536.8293   350.2700  93.143530000
337 | ## 34            CHIN_RIGHT_GONION   798.8593   346.5862  83.418430000
338 | ```
339 | 
340 | 
341 | 
342 | 
343 | ```r
344 | my_face_pic <- readImage('my_face.jpg')
345 | plot(my_face_pic)
346 | 
347 | xs1 = my_face$fdBoundingPoly$vertices[[1]][1][[1]]
348 | ys1 = my_face$fdBoundingPoly$vertices[[1]][2][[1]]
349 | 
350 | xs2 = my_face$landmarks[[1]][[2]][[1]]
351 | ys2 = my_face$landmarks[[1]][[2]][[2]]
352 | 
353 | polygon(x=xs1,y=ys1,border='red',lwd=4)
354 | points(x=xs2,y=ys2,lwd=2, col='lightblue')
355 | ```
356 | 
357 | ![plot of chunk unnamed-chunk-14](https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-14-1.png)
358 | 
359 | ----
360 | 
361 | #### Logo Detection
362 |   
363 | To continue along the Chicago trip, we drove by Wrigley field and I took a really bad photo of the sign from a moving car as it was under construction. It's nice because it has a lot of different lines and writing the Toyota logo isn't incredibly prominent or necessarily fit to brand colors.
364 | 
365 | This call returns:  
366 | 
367 |   * Description = Brand name of the logo detected
368 |   * Score = Confidence of prediction accuracy
369 |   * Bounding Poly = (Again) coordinates of the logo
370 | 
371 |   
372 | 
373 | ```r
374 | wrigley_image <- readImage('wrigley_text.jpg')
375 | plot(wrigley_image)
376 | ```
377 | 
378 | ![plot of chunk unnamed-chunk-15](https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-15-1.png)
379 | 
380 | 
381 | 
382 | 
383 | ```r
384 | wrigley_logo = getGoogleVisionResponse('wrigley_text.jpg',
385 |                                    feature = 'LOGO_DETECTION')
386 | head(wrigley_logo)
387 | ```
388 | 
389 | ```
390 | ##           mid description     score                               vertices
391 | ## 1 /g/1tk6469q      Toyota 0.3126611 435, 551, 551, 435, 449, 449, 476, 476
392 | ```
393 | 
394 | 
395 | 
396 | ```r
397 | wrigley_image <- readImage('wrigley_text.jpg')
398 | plot(wrigley_image)
399 | xs = wrigley_logo$boundingPoly$vertices[[1]][[1]]
400 | ys = wrigley_logo$boundingPoly$vertices[[1]][[2]]
401 | polygon(x=xs,y=ys,border='green',lwd=4)
402 | ```
403 | 
404 | ![plot of chunk unnamed-chunk-17](https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-17-1.png)
405 | 
406 | ----
407 | 
408 | #### Text Detection
409 | 
410 | I'll continue using the Wrigley Field picture. There is text all over the place and it's fun to see what is captured and what isn't. It appears as if the curved text at the top "field" isn't easily interpreted as text. However, the rest is caught and the words are captured.  
411 | 
412 | The response sent back is a bit more difficult to interpret than the rest of the API calls - it breaks things apart by word but also returns everything as one line. Here's what comes back:  
413 | 
414 |   * Locale = language, returned as source
415 |   * Description = the text (the first line is everything, and then the rest are indiviudal words)
416 |   * Bounding Poly = I'm sure you can guess by now
417 | 
418 | 
419 | ```r
420 | wrigley_text = getGoogleVisionResponse('wrigley_text.jpg',
421 |                                    feature = 'TEXT_DETECTION')
422 | head(wrigley_text)
423 | ```
424 | 
425 | ```
426 | ##   locale
427 | ## 1     en
428 | ## 2   <NA>
429 | ## 3   <NA>
430 | ## 4   <NA>
431 | ## 5   <NA>
432 | ## 6   <NA>
433 | ##                                                                                                        description
434 | ## 1 RIGLEY F\nICHICAGO CUBS\nORDER ONLINE AT GIORDANOS.COM\nTOYOTA\nMIDWEST\nFENCE\n773-722-6616\nCAUTION\nCAUTION\n
435 | ## 2                                                                                                           RIGLEY
436 | ## 3                                                                                                                F
437 | ## 4                                                                                                         ICHICAGO
438 | ## 5                                                                                                             CUBS
439 | ## 6                                                                                                            ORDER
440 | ##                                 vertices
441 | ## 1   55, 657, 657, 55, 210, 210, 852, 852
442 | ## 2 343, 482, 484, 345, 217, 211, 260, 266
443 | ## 3 501, 524, 526, 503, 211, 210, 259, 260
444 | ## 4 222, 503, 501, 220, 295, 307, 348, 336
445 | ## 5 527, 627, 625, 525, 308, 312, 353, 349
446 | ## 6 310, 384, 384, 310, 374, 374, 391, 391
447 | ```
448 | 
449 | 
450 | ```r
451 | wrigley_image <- readImage('wrigley_text.jpg')
452 | plot(wrigley_image)
453 | 
454 | for(i in 1:length(wrigley_text$boundingPoly$vertices)){
455 |   xs = wrigley_text$boundingPoly$vertices[[i]]$x
456 |   ys = wrigley_text$boundingPoly$vertices[[i]]$y
457 |   polygon(x=xs,y=ys,border='green',lwd=2)
458 | }
459 | ```
460 | 
461 | ![plot of chunk unnamed-chunk-19](https://www.stoltzmaniac.com/wp-content/uploads/2017/07/unnamed-chunk-19-1.png)
462 | 
463 | ----
464 | 
465 | That's about it for the basics of using the Google Vision API with the RoogleVision library. I highly recommend tinkering around with it a bit, especially because it won't cost you a dime.  
466 | 
467 | While I do enjoy the math under the hood and the thinking required to understand alrgorithms, I do think these sorts of API's will become the way of the future for data science. Outside of specific use cases or special industries, it seems hard to imagine wanting to try and create algorithms that would be better than ones created for mass consumption. As long as they're fast, free and accurate, I'm all about making my life easier! From the hiring perspective, I much prefer someone who can get the job done over someone who can slightly improve performance (as always, there are many cases where this doesn't apply).
468 | 
469 | Please comment if you are utilizing any of the Google API's for business purposes, I would love to hear it!
470 | 
471 | As always you can find this on my [GitHub](https://github.com/stoltzmaniac/ML-Image-Processing-R)
472 | 
473 | 
474 | 
475 | 
476 | 


--------------------------------------------------------------------------------
/Google Vision API/Google Vision API.Rproj:
--------------------------------------------------------------------------------
 1 | Version: 1.0
 2 | 
 3 | RestoreWorkspace: Default
 4 | SaveWorkspace: Default
 5 | AlwaysSaveHistory: Default
 6 | 
 7 | EnableCodeIndexing: Yes
 8 | UseSpacesForTab: Yes
 9 | NumSpacesForTab: 2
10 | Encoding: UTF-8
11 | 
12 | RnwWeave: Sweave
13 | LaTeX: pdfLaTeX
14 | 


--------------------------------------------------------------------------------
/Google Vision API/dog_mountain.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/dog_mountain.jpg


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-10-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-10-1.png


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-11-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-11-1.png


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-14-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-14-1.png


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-15-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-15-1.png


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-17-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-17-1.png


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-19-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-19-1.png


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-199-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-199-1.png


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-2-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-2-1.png


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-4-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-4-1.png


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-6-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-6-1.png


--------------------------------------------------------------------------------
/Google Vision API/figure/unnamed-chunk-8-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/figure/unnamed-chunk-8-1.png


--------------------------------------------------------------------------------
/Google Vision API/my_face.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/my_face.jpg


--------------------------------------------------------------------------------
/Google Vision API/originalWebcamShot.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/originalWebcamShot.jpg


--------------------------------------------------------------------------------
/Google Vision API/snacks_logos.JPG:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/snacks_logos.JPG


--------------------------------------------------------------------------------
/Google Vision API/us_castle.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/us_castle.jpg


--------------------------------------------------------------------------------
/Google Vision API/us_castle_2.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/us_castle_2.jpg


--------------------------------------------------------------------------------
/Google Vision API/us_dog_mountain.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/us_dog_mountain.jpg


--------------------------------------------------------------------------------
/Google Vision API/us_hats.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/us_hats.jpg


--------------------------------------------------------------------------------
/Google Vision API/wrigley_text.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Google Vision API/wrigley_text.jpg


--------------------------------------------------------------------------------
/Microsoft Vision API/Microsoft Vision API.Rproj:
--------------------------------------------------------------------------------
 1 | Version: 1.0
 2 | 
 3 | RestoreWorkspace: Default
 4 | SaveWorkspace: Default
 5 | AlwaysSaveHistory: Default
 6 | 
 7 | EnableCodeIndexing: Yes
 8 | UseSpacesForTab: Yes
 9 | NumSpacesForTab: 2
10 | Encoding: UTF-8
11 | 
12 | RnwWeave: Sweave
13 | LaTeX: pdfLaTeX
14 | 


--------------------------------------------------------------------------------
/Microsoft Vision API/R - Microsoft Vision API.Rmd:
--------------------------------------------------------------------------------
  1 | ```{r setup, include=FALSE}
  2 | knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)
  3 | ```
  4 | 
  5 | # Microsoft Cognitive Services Vision API in R  
  6 | 
  7 | ----
  8 | 
  9 | A little while ago I did a brief tutorial of the [Google Vision API using RoogleVision](https://www.stoltzmaniac.com/google-vision-api-in-r-rooglevision/) created by Mark Edmonson. I couldn't find anything similar to that in R for the [Microsoft Cognitive Services API](https://docs.microsoft.com/en-us/azure/cognitive-services/computer-vision/quickstarts/python#AnalyzeImage) so I thought I would give it a shot. I whipped this example together quickly to give it a proof-of-concept but I could certainly see myself building an R package to support this (unless someone can point to one - and please do if one exists)!  
 10 | 
 11 | 
 12 | The API is extremely easy to access using RCurl and httr. There are **a lot** of options which can be accessed. In this example, I'll just cover the basics of image detection and descriptions.  
 13 | 
 14 | 
 15 | ### Getting Started With Microsoft Cognitive Services
 16 | 
 17 | In order to get started, all you need is an [Azure Account](https://azure.microsoft.com/en-us/free/) which is **free** if you can keep yourself under certain thresholds and limits. There is even a free trial period (at the time this was written, at least).  
 18 | 
 19 | Once that is taken care of there are a few things you need to do:  
 20 | 
 21 |   1. Login to [portal.azure.com](https://portal.azure.com)
 22 |   2. On the lefthand menu click "Add"
 23 |   ![Figure 1](https://i.imgur.com/OPihH39.png)  
 24 |   
 25 |   
 26 |   3. Click on "AI + Cognitive Services" and then the "Computer Vision API"
 27 |   ![Figure 2](https://i.imgur.com/GMy2LFZ.png)  
 28 |   
 29 |   
 30 |   4. Fill out the information required. You may have "Free Trial" under Subscription. Pay special attention to **Location** because this will be used in your API script 
 31 |   ![Figure 3](https://i.imgur.com/t7vg4vH.png)  
 32 |   
 33 |   
 34 |   5. In the lefthand menu, click "Keys" underneath "Resource Management" and you will find what you need for credentials. Underneath your Endpoint URL, click on "Show access keys..." - **copy your key** and use it in your script (do not make this publicly accessible)
 35 |   ![Figure 4](https://i.imgur.com/CKkC2nx.png)
 36 | 
 37 | 
 38 | 
 39 | ```{r libraries_and_credentials}
 40 | library(tidyverse)
 41 | library(RCurl)
 42 | library(httr)
 43 | library(EBImage)
 44 | 
 45 | credentials = read_csv('credentials.csv')
 46 | api_key = as.character(credentials$subscription_id) #api key is not subscription id
 47 | api_endpoint_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/analyze"
 48 | ```
 49 | 
 50 | ```{r}
 51 | image_url = 'https://imgur.com/rapIn0u.jpg'
 52 | visualFeatures = "Description,Tags,Categories,Faces"
 53 | # options = "Categories, Tags, Description, Faces, ImageType, Color, Adult"
 54 | 
 55 | details = "Landmarks"
 56 | # options = Landmarks, Celebrities
 57 | 
 58 | reqURL = paste(api_endpoint_url,
 59 |                "?visualFeatures=",
 60 |                visualFeatures,
 61 |                "&details=",
 62 |                details,
 63 |                sep="")
 64 | 
 65 | APIresponse = POST(url = reqURL,
 66 |                    content_type('application/json'),
 67 |                    add_headers(.headers = c('Ocp-Apim-Subscription-Key' = api_key)),
 68 |                    body=list(url = image_url),
 69 |                    encode = "json") 
 70 | 
 71 | df = content(APIresponse)
 72 | ```
 73 | 
 74 | ```{r}
 75 | my_image <- readImage('SnoozeGenius.jpg')
 76 | plot(my_image)
 77 | ```
 78 | 
 79 | 
 80 | ```{r}
 81 | description_tags = df$description$tags
 82 | description_tags_tib = tibble(tag = character())
 83 | for(tag in description_tags){
 84 |   for(text in tag){
 85 |      if(class(tag) != "list"){  ## To remove the extra caption from being included
 86 |       tmp = tibble(tag = tag)
 87 |       description_tags_tib = description_tags_tib %>% bind_rows(tmp)
 88 |     } 
 89 |   }
 90 | }
 91 | 
 92 | knitr::kable(description_tags_tib[1:5,])
 93 | ```
 94 | 
 95 | ```{r}
 96 | captions = df$description$captions
 97 | captions_tib = tibble(text = character(), confidence = numeric())
 98 | for(caption in captions){
 99 |   tmp = tibble(text = caption$text, confidence = caption$confidence)
100 |   captions_tib = captions_tib %>% bind_rows(tmp)
101 | }
102 | knitr::kable(captions_tib)
103 | ```
104 | 
105 | ```{r}
106 | metadata = df$metadata
107 | metadata_tib = tibble(width = metadata$width, height = metadata$height, format = metadata$format)
108 | knitr::kable(metadata_tib)
109 | ```
110 | 
111 | ```{r}
112 | faces = df$faces
113 | faces_tib = tibble(faceID = numeric(),
114 |                    age = numeric(), 
115 |                    gender = character(),
116 |                    x1 = numeric(),
117 |                    x2 = numeric(),
118 |                    y1 = numeric(),
119 |                    y2 = numeric())
120 | 
121 | n = 0
122 | for(face in faces){
123 |   n = n + 1
124 |   tmp = tibble(faceID = n,
125 |                age = face$age, 
126 |                gender = face$gender,
127 |                x1 = face$faceRectangle$left,
128 |                y1 = face$faceRectangle$top,
129 |                x2 = face$faceRectangle$left + face$faceRectangle$width,
130 |                y2 = face$faceRectangle$top + face$faceRectangle$height)
131 |   faces_tib = faces_tib %>% bind_rows(tmp)
132 | }
133 | faces_tib
134 | knitr::kable(faces_tib)
135 | ```
136 | 
137 | ```{r}
138 | my_image <- readImage('SnoozeGenius.jpg')
139 | plot(my_image)
140 | 
141 | coords = faces_tib %>% select(x1, y1, x2, y2)
142 | for(i in 1:nrow(coords)){
143 |   print(i)
144 |   xs = c(coords$x1[i], coords$x1[i], coords$x2[i], coords$x2[i])
145 |   ys = c(coords$y1[i], coords$y2[i], coords$y2[i], coords$y1[i])
146 |   polygon(x = xs, y = ys, border = i+1, lwd = 4)
147 | }
148 | ```
149 | 
150 | Image Caption = `r print(captions_tib$text)`
151 | 
152 | 
153 | ```{r}
154 | str(df)
155 | ```
156 | 
157 | 


--------------------------------------------------------------------------------
/Microsoft Vision API/SnoozeGenius.jpg:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Microsoft Vision API/SnoozeGenius.jpg


--------------------------------------------------------------------------------
/Microsoft Vision API/df.rds:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/stoltzmaniac/ML-Image-Processing-R/33ff327eb94fb404aea9c27ff5aa419e65cdf7cf/Microsoft Vision API/df.rds


--------------------------------------------------------------------------------
/Microsoft Vision API/sandbox.R:
--------------------------------------------------------------------------------
  1 | library(tidyverse)
  2 | library(RCurl)
  3 | library(httr)
  4 | library(EBImage)
  5 | 
  6 | credentials = read_csv('credentials.csv')
  7 | api_key = as.character(credentials$subscription_id) #api key is not subscription id
  8 | api_endpoint_url = "https://westcentralus.api.cognitive.microsoft.com/vision/v1.0/analyze"
  9 | 
 10 | image_url = 'https://imgur.com/rapIn0u.jpg'
 11 | visualFeatures = "Description,Tags,Categories,Faces"
 12 | # options = "Categories, Tags, Description, Faces, ImageType, Color, Adult"
 13 | 
 14 | details = "Landmarks"
 15 | # options = Landmarks, Celebrities
 16 | 
 17 | reqURL = paste(api_endpoint_url,
 18 |                "?visualFeatures=",
 19 |                visualFeatures,
 20 |                "&details=",
 21 |                details,
 22 |                sep="")
 23 | 
 24 | APIresponse = POST(url = reqURL,
 25 |                    content_type('application/json'),
 26 |                    add_headers(.headers = c('Ocp-Apim-Subscription-Key' = api_key)),
 27 |                    body=list(url = image_url),
 28 |                    encode = "json") 
 29 | 
 30 | df = content(APIresponse)
 31 | str(df)
 32 | 
 33 | 
 34 | 
 35 | description_tags = df$description$tags
 36 | description_tags_tib = tibble(tag = character())
 37 | for(tag in description_tags){
 38 |   for(text in tag){
 39 |     if(class(tag) != "list"){  ## To remove the extra caption from being included
 40 |       tmp = tibble(tag = tag)
 41 |       description_tags_tib = description_tags_tib %>% bind_rows(tmp)
 42 |     } 
 43 |   }
 44 | }
 45 | 
 46 | knitr::kable(description_tags_tib[1:5,])
 47 | 
 48 | 
 49 | captions = df$description$captions
 50 | captions_tib = tibble(text = character(), confidence = numeric())
 51 | for(caption in captions){
 52 |   tmp = tibble(text = caption$text, confidence = caption$confidence)
 53 |   captions_tib = captions_tib %>% bind_rows(tmp)
 54 | }
 55 | knitr::kable(captions_tib)
 56 | 
 57 | 
 58 | metadata = df$metadata
 59 | metadata_tib = tibble(width = metadata$width, height = metadata$height, format = metadata$format)
 60 | knitr::kable(metadata_tib)
 61 | 
 62 | 
 63 | 
 64 | faces = df$faces
 65 | faces_tib = tibble(faceID = numeric(),
 66 |                    age = numeric(), 
 67 |                    gender = character(),
 68 |                    x1 = numeric(),
 69 |                    x2 = numeric(),
 70 |                    y1 = numeric(),
 71 |                    y2 = numeric())
 72 | 
 73 | n = 0
 74 | for(face in faces){
 75 |   n = n + 1
 76 |   tmp = tibble(faceID = n,
 77 |                age = face$age, 
 78 |                gender = face$gender,
 79 |                x1 = face$faceRectangle$left,
 80 |                y1 = face$faceRectangle$top,
 81 |                x2 = face$faceRectangle$left + face$faceRectangle$width,
 82 |                y2 = face$faceRectangle$top + face$faceRectangle$height)
 83 |   faces_tib = faces_tib %>% bind_rows(tmp)
 84 | }
 85 | faces_tib
 86 | knitr::kable(faces_tib)
 87 | 
 88 | 
 89 | 
 90 | my_image <- readImage('SnoozeGenius.jpg')
 91 | plot(my_image)
 92 | 
 93 | coords = faces_tib %>% select(x1, y1, x2, y2)
 94 | for(i in 1:nrow(coords)){
 95 |   print(i)
 96 |   xs = c(coords$x1[i], coords$x1[i], coords$x2[i], coords$x2[i])
 97 |   ys = c(coords$y1[i], coords$y2[i], coords$y2[i], coords$y1[i])
 98 |   polygon(x = xs, y = ys, border = i+1, lwd = 4)
 99 | }
100 | 
101 | 
102 | 
103 | 


--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # ML-Image-Processing-R
2 | Image processing in R  
3 | 
4 | This repository is starting to look at ways in which to process images using R libraries.  
5 | 
6 | So far you can see: OpenCV (which requires python integration) and Google Vision API. These are incredibly powerful and fast. Please see https://www.stoltzmaniac.com/category/image-processing/ for all posts.  
7 | 
8 |   * [Google Vision API](https://www.stoltzmaniac.com/google-vision-api-in-r-rooglevision/)
9 |   * [OpenCV Face Recognition](https://www.stoltzmaniac.com/facial-recognition-in-r/)


--------------------------------------------------------------------------------