YEAR: 2019
COPYRIGHT HOLDER: John Coene
Copyright (c) 2019 John Coene
Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

All functions

| Function(s) | Topic |
|---|---|
| gd_search, gd_tags, gd_sections, gd_editions, gd_items | Calls |
| gd_call | Call |
| guardian_key | Setup |

setup.Rd
Set up your session; all subsequent calls will use the API key.
guardian_key(key)

| Argument | Description |
|---|---|
| key | Your API key, freely available at https://open-platform.theguardian.com. |

You can also specify GUARDIAN_API_KEY as an environment variable, typically in your .Renviron file.
# NOT RUN {
guardian_key("xXXxxXxXxXXx")
# }

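The environment-variable route can be tried out with base R alone. This is a minimal sketch: the key below is a placeholder, `Sys.setenv` only lasts for the current session, and a line of the form `GUARDIAN_API_KEY=...` in .Renviron would be the persistent equivalent. How the package reads the variable internally is not shown in this documentation.

```r
# Set the variable for the current session only (placeholder value, not a real key).
Sys.setenv(GUARDIAN_API_KEY = "xXXxxXxXxXXx")

# Read it back; Sys.getenv() returns "" when the variable is unset.
key <- Sys.getenv("GUARDIAN_API_KEY")

# TRUE when a non-empty key is available to the session.
key_is_set <- nzchar(key)
```

Checking `key_is_set` before preparing calls is a cheap way to catch a missing or misspelled variable early.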
gd_call.Rd
Executes calls from guardianCalls objects.
gd_call(..., batch_size = 12)

# S3 method for guardianCalls
gd_call(..., batch_size = 12)

| Argument | Description |
|---|---|
| ... | Objects of class guardianCalls. |
| batch_size | Size of each batch. |

# NOT RUN {
(to_search <- gd_search("debates", pages = 13))
results <- gd_call(to_search)
# }

calls.Rd
All of The Guardian API endpoints.
gd_search(q = NULL, ..., pages = 1)

gd_tags(q = NULL, ..., pages = 1)

gd_sections(q = NULL, ..., pages = 1)

gd_editions(q = NULL, ..., pages = 1)

gd_items(items, ...)

| Argument | Description |
|---|---|
| q | The search query parameter supports … |
| ... | Any other parameter or filter; see the full list at https://open-platform.theguardian.com/documentation/. |
| pages | Number of pages to collect. |
| items | Vector of API links to items. |

These functions only prepare the API calls; use gd_call to execute them.
# NOT RUN {
(to_search <- gd_search("debates", pages = 13))
results <- gd_call(to_search)

# select items to retrieve
items_to_get <- gd_items(results$apiUrl[1:13])
items <- gd_call(items_to_get)
# }

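To illustrate what preparing calls without executing them can look like, here is a rough base R sketch that builds one request URL per page and sends nothing. It is not the package's implementation: `prepare_search_urls` is a hypothetical helper, and the endpoint and the `q`, `page` and `api-key` query parameters are assumed from the public Guardian Content API rather than taken from this documentation.

```r
# Hypothetical helper: build one search URL per page without sending any request.
# Assumes the Content API search endpoint and its q / page / api-key parameters.
prepare_search_urls <- function(q, pages = 1, key = "YOUR_KEY") {
  base <- "https://content.guardianapis.com/search"
  # Percent-encode the query so spaces and reserved characters are URL-safe.
  query <- utils::URLencode(q, reserved = TRUE)
  # sprintf() is vectorised over the page numbers, giving one URL per page.
  sprintf("%s?q=%s&page=%d&api-key=%s", base, query, seq_len(pages), key)
}

urls <- prepare_search_urls("debates", pages = 13)
length(urls)  # one prepared call per page
```

Keeping preparation separate from execution means the full set of calls is known up front, which is what allows them to be grouped into batches before anything is sent.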
Access over 2 million pieces of content from The Guardian.
You can install the package from GitHub using remotes; see the changelog for changes.
# install.packages("remotes")
remotes::install_github("news-r/pkg") # github

To get started, you need a free API key. Then either pass the key to guardian_key or set it as the GUARDIAN_API_KEY environment variable (likely in your .Renviron).
guardian_key("xxXXxxXx")

The package revolves around one principle: first create your API calls, then execute them with gd_call. This is because the package is built upon the async package, which lets you execute API calls asynchronously; the (free) developer plan allows up to 12 calls per second.
Below we look for 15 pages of articles on “Brexit”.
library(guardian)
#> API key loaded!

# search for brexit articles
(to_search <- gd_search("brexit", pages = 15))
#> ℹ 15 calls

# actually execute 15 calls (1 per page)
results <- gd_call(to_search)
#> ℹ Making 15 calls in 2 batches of 12
head(results)
#> # A tibble: 6 x 11
#>   id    type  sectionId sectionName webPublicationD… webTitle webUrl apiUrl
#>   <chr> <chr> <chr>     <chr>       <chr>            <chr>    <chr>  <chr>
#> 1 poli… arti… politics  Politics    2019-06-18T16:2… Brexit … https… https…
#> 2 game… arti… games     Games       2019-06-26T12:3… Watch D… https… https…
#> 3 poli… arti… politics  Politics    2019-06-24T16:5… Has Bre… https… https…
#> 4 poli… arti… politics  Politics    2019-06-20T17:2… The lim… https… https…
#> 5 educ… arti… education Education   2019-07-02T23:0… Brexit … https… https…
#> 6 busi… arti… business  Business    2019-03-16T17:0… Brexit … https… https…
#> # … with 3 more variables: isHosted <lgl>, pillarId <chr>,
#> #   pillarName <chr>
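
The batch message in the output follows from simple arithmetic: 15 calls at a batch size of 12 need ceiling(15 / 12) = 2 batches. The sketch below reproduces that grouping in base R. It only illustrates the split and is not the package's internal code; `make_batches` is a hypothetical helper.

```r
# Hypothetical helper: split n_calls call indices into batches of at most batch_size.
make_batches <- function(n_calls, batch_size = 12) {
  idx <- seq_len(n_calls)
  # ceiling(idx / batch_size) maps indices 1..12 to batch 1, 13..24 to batch 2, and so on.
  unname(split(idx, ceiling(idx / batch_size)))
}

batches <- make_batches(15, batch_size = 12)
length(batches)   # 2 batches
lengths(batches)  # 12 calls in the first batch, 3 in the second
```

With a limit of 12 calls per second on the developer plan, a batch size of 12 keeps each batch within that limit.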