├── .Rbuildignore ├── .gitignore ├── .travis.yml ├── CONDUCT.md ├── DESCRIPTION ├── LICENSE ├── NAMESPACE ├── NEWS ├── R ├── WikidataR.R ├── geo.R ├── gets.R ├── prints.R └── utils.R ├── README.md ├── WikidataR.Rproj ├── man ├── WikidataR.Rd ├── extract_claims.Rd ├── find_item.Rd ├── get_geo_box.Rd ├── get_geo_entity.Rd ├── get_item.Rd ├── get_random.Rd ├── print.find_item.Rd ├── print.find_property.Rd └── print.wikidata.Rd ├── tests ├── testthat.R └── testthat │ ├── test_geo.R │ ├── test_gets.R │ └── test_search.R └── vignettes ├── Introduction.R ├── Introduction.Rmd ├── Introduction.html └── Introduction.md /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^.*\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^CONDUCT\.md$ 4 | .travis.yml 5 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | # History files 2 | .Rhistory 3 | .DS_Store 4 | 5 | # Example code in package build process 6 | *-Ex.R 7 | # R data files from past sessions 8 | .Rdata 9 | # RStudio files 10 | .Rproj.user/ 11 | .Rproj.user 12 | inst/doc 13 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | # Sample .travis.yml for R projects 2 | 3 | language: r 4 | warnings_are_errors: false 5 | sudo: required 6 | 7 | env: 8 | global: 9 | - CRAN: http://cran.rstudio.com 10 | 11 | r_packages: 12 | - testthat 13 | - WikipediR 14 | notifications: 15 | email: 16 | on_failure: change -------------------------------------------------------------------------------- /CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Code of Conduct 2 | 3 | As contributors and maintainers of this project, we pledge to respect all people who 4 | contribute through reporting issues, 
posting feature requests, updating documentation, 5 | submitting pull requests or patches, and other activities. 6 | 7 | We are committed to making participation in this project a harassment-free experience for 8 | everyone, regardless of level of experience, gender, gender identity and expression, 9 | sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion. 10 | 11 | Examples of unacceptable behavior by participants include the use of sexual language or 12 | imagery, derogatory comments or personal attacks, trolling, public or private harassment, 13 | insults, or other unprofessional conduct. 14 | 15 | Project maintainers have the right and responsibility to remove, edit, or reject comments, 16 | commits, code, wiki edits, issues, and other contributions that are not aligned to this 17 | Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed 18 | from the project team. 19 | 20 | Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by 21 | opening an issue or contacting one or more of the project maintainers. 22 | 23 | This Code of Conduct is adapted from the Contributor Covenant 24 | (http://contributor-covenant.org), version 1.0.0, available at 25 | http://contributor-covenant.org/version/1/0/0/ 26 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: WikidataR 2 | Type: Package 3 | Title: API Client Library for 'Wikidata' 4 | Version: 1.4.0 5 | Date: 2017-09-21 6 | Author: Oliver Keyes [aut, cre], Serena Signorelli [aut, cre], 7 | Christian Graul [ctb], Mikhail Popov [ctb] 8 | Maintainer: Oliver Keyes 9 | Description: An API client for the Wikidata store of 10 | semantic data. 
11 | BugReports: https://github.com/Ironholds/WikidataR/issues 12 | URL: https://github.com/Ironholds/WikidataR/issues 13 | License: MIT + file LICENSE 14 | Imports: 15 | httr, 16 | jsonlite, 17 | WikipediR (>= 1.4.0), 18 | utils 19 | Suggests: 20 | testthat, 21 | knitr, 22 | pageviews 23 | VignetteBuilder: knitr 24 | RoxygenNote: 6.0.1 25 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | YEAR: 2014 2 | COPYRIGHT HOLDER: Oliver Keyes -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | S3method(print,find_item) 4 | S3method(print,find_property) 5 | S3method(print,wikidata) 6 | export(extract_claims) 7 | export(find_item) 8 | export(find_property) 9 | export(get_geo_box) 10 | export(get_geo_entity) 11 | export(get_item) 12 | export(get_property) 13 | export(get_random_item) 14 | export(get_random_property) 15 | importFrom(WikipediR,page_content) 16 | importFrom(WikipediR,query) 17 | importFrom(WikipediR,random_page) 18 | importFrom(httr,user_agent) 19 | importFrom(jsonlite,fromJSON) 20 | -------------------------------------------------------------------------------- /NEWS: -------------------------------------------------------------------------------- 1 | 1.4.0 2 | ================================================= 3 | * extract_claims() allows you to, well, extract claims. 
4 | * SPARQL syntax bug with some geo queries now fixed (thanks to Mikhail Popov) 5 | 6 | 1.3.0 7 | ================================================= 8 | * get_* functions are now vectorised 9 | 10 | 1.2.0 11 | ================================================= 12 | * geographic data for entities that exist relative to other Wikidata items can now be retrieved 13 | with get_geo_entity and get_geo_box, courtesy of Serena Signorelli's excellent 14 | QueryWikidataR package. 15 | 16 | * A bug in printing returned objects is now fixed. 17 | 18 | 1.1.0 19 | ================================================= 20 | * You can now retrieve multiple random properties or items with get_random_item and get_random_property 21 | 22 | 1.0.1 23 | ================================================= 24 | * Various documentation and metadata improvements. 25 | 26 | 1.0.0 27 | ================================================= 28 | * Fix a bug in get_* functions due to a parameter name mismatch 29 | * Print methods added by Christian Graul 30 | 31 | 0.5.0 32 | ================================================= 33 | * This is the initial release! See the explanatory vignettes. 34 | -------------------------------------------------------------------------------- /R/WikidataR.R: -------------------------------------------------------------------------------- 1 | #' @title API client library for Wikidata 2 | #' @description This package serves as an API client for \href{https://www.wikidata.org}{Wikidata}. 3 | #' See the accompanying vignette for more details. 4 | #' 5 | #' @name WikidataR 6 | #' @docType package 7 | #'@seealso \code{\link{get_random}} for selecting a random item or property, 8 | #'\code{\link{get_item}} for a /specific/ item or property, or \code{\link{find_item}} 9 | #'for using search functionality to pull out item or property IDs where the descriptions 10 | #'or aliases match a particular search term. 
11 | #' @importFrom WikipediR page_content random_page query 12 | #' @importFrom httr user_agent 13 | #' @importFrom jsonlite fromJSON 14 | #' @aliases WikidataR WikidataR-package 15 | NULL -------------------------------------------------------------------------------- /R/geo.R: -------------------------------------------------------------------------------- 1 | clean_geo <- function(results){ 2 | do.call("rbind", lapply(results, function(item){ 3 | point <- unlist(strsplit(gsub(x = item$coord$value, pattern = "(Point\\(|\\))", replacement = ""), 4 | " ")) # WKT literals are ordered "Point(longitude latitude)" 5 | wd_id <- gsub(x = item$item$value, pattern = "http://www.wikidata.org/entity/", 6 | replacement = "", fixed = TRUE) 7 | return(data.frame(item = wd_id, 8 | name = ifelse(item$name$value == wd_id, NA, item$name$value), 9 | latitude = as.numeric(point[2]), 10 | longitude = as.numeric(point[1]), 11 | stringsAsFactors = FALSE)) 12 | 13 | })) 14 | } 15 | 16 | #'@title Retrieve geographic information from Wikidata 17 | #'@description \code{get_geo_entity} retrieves the item ID, latitude 18 | #'and longitude of any object with geographic data associated with \emph{another} 19 | #'object with geographic data (example: all the locations around/near/associated with 20 | #'a city). 21 | #' 22 | #'@param entity a Wikidata item (\code{Q...}) or series of items, to check 23 | #'for associated geo-tagged items. 24 | #' 25 | #'@param language the two-letter language code to use for the name 26 | #'of the item. "en" by default, because we're imperialist 27 | #'anglocentric westerners. 28 | #' 29 | #'@param radius optionally, a radius (in kilometers) around \code{entity} 30 | #'to restrict the search to. 31 | #' 32 | #'@param ... further arguments to pass to httr's GET. 33 | #' 34 | #'@return a data.frame of 5 columns: 35 | #'\itemize{ 36 | #' \item{item}{ the Wikidata identifier of each object associated with 37 | #' \code{entity}.} 38 | #' \item{name}{ the name of the item, if available, in the requested language. 
If it 39 | #' is not available, \code{NA} will be returned instead.} 40 | #' \item{latitude}{ the latitude of \code{item}} 41 | #' \item{longitude}{ the longitude of \code{item}} 42 | #' \item{entity}{ the entity the item is associated with (necessary for multi-entity 43 | #' queries).} 44 | #'} 45 | #' 46 | #'@examples 47 | #'# All entities 48 | #'sf_locations <- get_geo_entity("Q62") 49 | #' 50 | #'# Entities with French, rather than English, names 51 | #'sf_locations <- get_geo_entity("Q62", language = "fr") 52 | #' 53 | #'# Entities within 1km 54 | #'sf_close_locations <- get_geo_entity("Q62", radius = 1) 55 | #' 56 | #'# Multiple entities 57 | #'multi_entity <- get_geo_entity(entity = c("Q62", "Q64")) 58 | #' 59 | #'@seealso \code{\link{get_geo_box}} for using a bounding box 60 | #'rather than an unrestricted search or simple radius. 61 | #' 62 | #'@export 63 | get_geo_entity <- function(entity, language = "en", radius = NULL, ...){ 64 | 65 | entity <- check_input(entity, "Q") 66 | 67 | if(is.null(radius)){ 68 | query <- paste0("SELECT DISTINCT ?item ?name ?coord ?propertyLabel WHERE { 69 | ?item wdt:P131* wd:", entity, ". ?item wdt:P625 ?coord . 70 | SERVICE wikibase:label { 71 | bd:serviceParam wikibase:language \"", language, "\" . 72 | ?item rdfs:label ?name 73 | } 74 | } 75 | ORDER BY ASC (?name)") 76 | } else { 77 | query <- paste0("SELECT ?item ?name ?coord 78 | WHERE { 79 | wd:", entity, " wdt:P625 ?mainLoc . 80 | SERVICE wikibase:around { 81 | ?item wdt:P625 ?coord . 82 | bd:serviceParam wikibase:center ?mainLoc . 83 | bd:serviceParam wikibase:radius \"", radius, 84 | "\" . 85 | } 86 | SERVICE wikibase:label { 87 | bd:serviceParam wikibase:language \"", language, "\" . 
88 | ?item rdfs:label ?name 89 | } 90 | } ORDER BY ASC (?name)") 91 | } 92 | 93 | if(length(query) > 1){ 94 | return(do.call("rbind", mapply(function(query, entity, ...){ 95 | output <- clean_geo(sparql_query(query, ...)$results$bindings) 96 | output$entity <- entity 97 | return(output) 98 | }, query = query, entity = entity, MoreArgs = list(...), SIMPLIFY = FALSE))) 99 | } 100 | output <- clean_geo(sparql_query(query, ...)$results$bindings) 101 | output$entity <- entity 102 | return(output) 103 | } 104 | 105 | #'@title Get geographic entities based on a bounding box 106 | #'@description \code{get_geo_box} retrieves all geographic entities in 107 | #'Wikidata that fall within a bounding box whose corners are two existing items 108 | #'with geographic attributes (usually cities). 109 | #' 110 | #'@param first_city_code a Wikidata item, or series of items, to use for 111 | #'one corner of the bounding box. 112 | #' 113 | #'@param first_corner the corner of the bounding box that \code{first_city_code} 114 | #'marks (eg "NorthWest", "SouthEast"). 115 | #' 116 | #'@param second_city_code a Wikidata item, or series of items, to use for 117 | #'the opposite corner of the bounding box. 118 | #' 119 | #'@param second_corner the corner of the bounding box that \code{second_city_code} 120 | #'marks (eg "NorthWest", "SouthEast"). 121 | #' 122 | #'@param language the two-letter language code to use for the name 123 | #'of the item. "en" by default. 124 | #' 125 | #'@param ... further arguments to pass to httr's GET. 126 | #' 127 | #'@return a data.frame of 5 columns: 128 | #'\itemize{ 129 | #' \item{item}{ the Wikidata identifier of each object associated with 130 | #' \code{entity}.} 131 | #' \item{name}{ the name of the item, if available, in the requested language. 
If it 132 | #' is not available, \code{NA} will be returned instead.} 133 | #' \item{latitude}{ the latitude of \code{item}} 134 | #' \item{longitude}{ the longitude of \code{item}} 135 | #' \item{entity}{ the entity the item is associated with (necessary for multi-entity 136 | #' queries).} 137 | #'} 138 | #' 139 | #'@examples 140 | #'# Simple bounding box 141 | #'bruges_box <- WikidataR:::get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest") 142 | #' 143 | #'# Custom language 144 | #'bruges_box_fr <- WikidataR:::get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest", 145 | #' language = "fr") 146 | #' 147 | #'@seealso \code{\link{get_geo_entity}} for using an unrestricted search or simple radius, 148 | #'rather than a bounding box. 149 | #' 150 | #'@export 151 | get_geo_box <- function(first_city_code, first_corner, second_city_code, second_corner, 152 | language = "en", ...){ 153 | 154 | # Input checks 155 | first_city_code <- check_input(first_city_code, "Q") 156 | second_city_code <- check_input(second_city_code, "Q") 157 | 158 | # Construct query 159 | query <- paste0("SELECT ?item ?name ?coord WHERE { 160 | wd:", first_city_code, " wdt:P625 ?Firstloc . 161 | wd:", second_city_code, " wdt:P625 ?Secondloc . 162 | SERVICE wikibase:box { 163 | ?item wdt:P625 ?coord . 164 | bd:serviceParam wikibase:corner", first_corner, " ?Firstloc . 165 | bd:serviceParam wikibase:corner", second_corner, " ?Secondloc . 166 | } 167 | SERVICE wikibase:label { 168 | bd:serviceParam wikibase:language \"", language, "\" . 169 | ?item rdfs:label ?name 170 | } 171 | }ORDER BY ASC (?name)") 172 | 173 | # Vectorise if necessary, or not if not! 
174 | if(length(query) > 1){ 175 | return(do.call("rbind", mapply(function(query, ...){ 176 | output <- clean_geo(sparql_query(query, ...)$results$bindings) 177 | return(output) 178 | }, query = query, MoreArgs = list(...), SIMPLIFY = FALSE))) 179 | } 180 | output <- clean_geo(sparql_query(query, ...)$results$bindings) 181 | return(output) 182 | } -------------------------------------------------------------------------------- /R/gets.R: -------------------------------------------------------------------------------- 1 | #'@title Retrieve specific Wikidata items or properties 2 | #'@description \code{get_item} and \code{get_property} allow you to retrieve the data associated 3 | #'with individual Wikidata items and properties, respectively. As with 4 | #'other \code{WikidataR} code, custom print methods are available; use \code{\link{str}} 5 | #'to manipulate and see the underlying structure of the data. 6 | #' 7 | #'@param id the ID number(s) of the item or property you're looking for. This can be in 8 | #'various formats; either a numeric value ("200"), the full name ("Q200") or 9 | #'even with an included namespace ("Property:P10") - the function will format 10 | #'it appropriately. This function is vectorised and will happily accept 11 | #'multiple IDs. 12 | #' 13 | #'@param ... further arguments to pass to httr's GET. 14 | #' 15 | #'@seealso \code{\link{get_random}} for selecting a random item or property, 16 | #'or \code{\link{find_item}} for using search functionality to pull out 17 | #'item or property IDs where the descriptions or aliases match a particular 18 | #'search term. 
19 | #' 20 | #'@examples 21 | #' 22 | #'#Retrieve a specific item 23 | #'adams_metadata <- get_item("42") 24 | #' 25 | #'#Retrieve a specific property 26 | #'object_is_child <- get_property("P40") 27 | #' 28 | #'@aliases get_item get_property 29 | #'@rdname get_item 30 | #'@export 31 | get_item <- function(id, ...){ 32 | id <- check_input(id, "Q") 33 | output <- (lapply(id, wd_query, ...)) 34 | class(output) <- "wikidata" 35 | return(output) 36 | } 37 | 38 | #'@rdname get_item 39 | #'@export 40 | get_property <- function(id, ...){ 41 | has_grep <- grepl("^P(?!r)",id, perl = TRUE) 42 | id[has_grep] <- paste0("Property:", id[has_grep]) 43 | id <- check_input(id, "Property:P") 44 | 45 | output <- (lapply(id, wd_query, ...)) 46 | class(output) <- "wikidata" 47 | return(output) 48 | } 49 | 50 | #'@title Retrieve randomly-selected Wikidata items or properties 51 | #'@description \code{get_random_item} and \code{get_random_property} allow you to retrieve the data 52 | #'associated with randomly-selected Wikidata items and properties, respectively. As with 53 | #'other \code{WikidataR} code, custom print methods are available; use \code{\link{str}} 54 | #'to manipulate and see the underlying structure of the data. 55 | #' 56 | #'@param limit how many random items to return. 1 by default, but can be higher. 57 | #' 58 | #'@param ... arguments to pass to httr's GET. 59 | #' 60 | #'@seealso \code{\link{get_item}} for selecting a specific item or property, 61 | #'or \code{\link{find_item}} for using search functionality to pull out 62 | #'item or property IDs where the descriptions or aliases match a particular 63 | #'search term. 
64 | #' 65 | #'@examples 66 | #' 67 | #'#Random item 68 | #'random_item <- get_random_item() 69 | #' 70 | #'#Random property 71 | #'random_property <- get_random_property() 72 | #' 73 | #'@aliases get_random get_random_item get_random_property 74 | #'@rdname get_random 75 | #'@export 76 | get_random_item <- function(limit = 1, ...){ 77 | return(wd_rand_query(ns = 0, limit = limit, ...)) 78 | } 79 | 80 | #'@rdname get_random 81 | #'@export 82 | get_random_property <- function(limit = 1, ...){ 83 | return(wd_rand_query(ns = 120, limit = limit, ...)) 84 | } 85 | 86 | #'@title Search for Wikidata items or properties that match a search term 87 | #'@description \code{find_item} and \code{find_property} allow you to retrieve a set 88 | #'of Wikidata items or properties where the aliases or descriptions match a particular 89 | #'search term. As with other \code{WikidataR} code, custom print methods are available; 90 | #'use \code{\link{str}} to manipulate and see the underlying structure of the data. 91 | #' 92 | #'@param search_term a term to search for. 93 | #' 94 | #'@param language the language to return the labels and descriptions in; this should 95 | #'consist of an ISO language code. Set to "en" by default. 96 | #' 97 | #'@param limit the number of results to return; set to 10 by default. 98 | #' 99 | #'@param ... further arguments to pass to httr's GET. 100 | #' 101 | #'@seealso \code{\link{get_random}} for selecting a random item or property, 102 | #'or \code{\link{get_item}} for selecting a specific item or property. 
103 | #' 104 | #'@examples 105 | #' 106 | #'#Check for entries relating to Douglas Adams in some way 107 | #'adams_items <- find_item("Douglas Adams") 108 | #' 109 | #'#Check for properties involving the peerage 110 | #'peerage_props <- find_property("peerage") 111 | #' 112 | #'@aliases find_item find_property 113 | #'@rdname find_item 114 | #'@export 115 | find_item <- function(search_term, language = "en", limit = 10, ...){ 116 | res <- searcher(search_term, language, limit, "item", ...) 117 | class(res) <- "find_item" 118 | return(res) 119 | } 120 | 121 | #'@rdname find_item 122 | #'@export 123 | find_property <- function(search_term, language = "en", limit = 10, ...){ 124 | res <- searcher(search_term, language, limit, "property", ...) 125 | class(res) <- "find_property" 126 | return(res) 127 | } 128 | -------------------------------------------------------------------------------- /R/prints.R: -------------------------------------------------------------------------------- 1 | #'@title Print method for find_item 2 | #' 3 | #'@description print found items. 4 | #' 5 | #'@param x find_item object with search results 6 | #'@param \dots Arguments to be passed to methods 7 | #' 8 | #'@method print find_item 9 | #'@export 10 | print.find_item <- function(x, ...) { 11 | cat("\n\tWikidata item search\n\n") 12 | 13 | # number of results 14 | num_results <- length(x) 15 | cat("Number of results:\t", num_results, "\n\n") 16 | 17 | # results 18 | if(num_results > 0) { 19 | cat("Results:\n") 20 | for(i in seq_len(num_results)) { 21 | if(is.null(x[[i]]$description)){ 22 | desc <- "\n" 23 | } 24 | else { 25 | desc <- paste("-", x[[i]]$description, "\n") 26 | } 27 | cat(i, "\t", x[[i]]$label, paste0("(", x[[i]]$id, ")"), desc) 28 | } 29 | } 30 | } 31 | 32 | #'@title Print method for find_property 33 | #' 34 | #'@description print found properties. 
35 | #' 36 | #'@param x find_property object with search results 37 | #'@param \dots Arguments to be passed to methods 38 | #' 39 | #'@method print find_property 40 | #'@export 41 | print.find_property <- function(x, ...) { 42 | cat("\n\tWikidata property search\n\n") 43 | 44 | # number of results 45 | num_results <- length(x) 46 | cat("Number of results:\t", num_results, "\n\n") 47 | 48 | # results 49 | if(num_results > 0) { 50 | cat("Results:\n") 51 | for(i in seq_len(num_results)) { 52 | if(is.null(x[[i]]$description)){ 53 | desc <- "\n" 54 | } 55 | else { 56 | desc <- paste("-", x[[i]]$description, "\n") 57 | } 58 | cat(i, "\t", x[[i]]$label, paste0("(", x[[i]]$id, ")"), desc) 59 | } 60 | } 61 | } 62 | 63 | wd_print_base <- function(x, ...){ 64 | 65 | cat("\n\tWikidata", x$type, x$id, "\n\n") 66 | 67 | # labels 68 | num.labels <- length(x$labels) 69 | if(num.labels>0) { 70 | lbl <- x$labels[[1]]$value 71 | if(num.labels==1) cat("Label:\t\t", lbl, "\n") 72 | else { 73 | if(!is.null(x$labels$en)) lbl <- x$labels$en$value 74 | cat("Label:\t\t", lbl, paste0("\t[", num.labels-1, " other languages available]\n")) 75 | } 76 | } 77 | 78 | # aliases 79 | num_aliases <- length(x$aliases) 80 | if(num_aliases > 0) { 81 | al <- unique(unlist(lapply(x$aliases, function(xl){return(xl$value)}))) 82 | cat("Aliases:\t", paste(al, collapse = ", "), "\n") 83 | } 84 | 85 | # descriptions 86 | num_desc <- length(x$descriptions) 87 | if(num_desc > 0) { 88 | desc <- x$descriptions[[1]]$value 89 | if(num_desc == 1){ 90 | cat("Description:", desc, "\n") 91 | } 92 | else { 93 | if(!is.null(x$descriptions$en)){ 94 | desc <- x$descriptions$en$value 95 | } 96 | cat("Description:", desc, paste0("\t[", (num_desc - 1), " other languages available]\n")) 97 | } 98 | } 99 | 100 | # num claims 101 | num_claims <- length(x$claims) 102 | if(num_claims > 0){ 103 | cat("Claims:\t\t", num_claims, "\n") 104 | } 105 | 106 | # num sitelinks 107 | num_links <- length(x$sitelinks) 108 | if(num_links > 0){ 
109 | cat("Sitelinks:\t", num_links, "\n") 110 | } 111 | } 112 | 113 | #'@title Print method for Wikidata objects 114 | #' 115 | #'@description print found objects generally. 116 | #' 117 | #'@param x wikidata object from get_item, get_random_item, get_property or get_random_property 118 | #'@param \dots Arguments to be passed to methods 119 | #'@seealso get_item, get_random_item, get_property or get_random_property 120 | #'@method print wikidata 121 | #'@export 122 | print.wikidata <- function(x, ...){ 123 | lapply(x, wd_print_base, ...) 124 | return(invisible()) 125 | } -------------------------------------------------------------------------------- /R/utils.R: -------------------------------------------------------------------------------- 1 | #Generic queryin' function for direct Wikidata calls. Wraps around WikipediR::page_content. 2 | wd_query <- function(title, ...){ 3 | result <- WikipediR::page_content(domain = "wikidata.org", page_name = title, as_wikitext = TRUE, 4 | httr::user_agent("WikidataR - https://github.com/Ironholds/WikidataR"), 5 | ...) 6 | output <- jsonlite::fromJSON(result$parse$wikitext[[1]]) 7 | return(output) 8 | } 9 | 10 | #Query for a random item in "namespace" (ns). Essentially a wrapper around WikipediR::random_page. 11 | wd_rand_query <- function(ns, limit, ...){ 12 | result <- WikipediR::random_page(domain = "wikidata.org", as_wikitext = TRUE, namespaces = ns, 13 | httr::user_agent("WikidataR - https://github.com/Ironholds/WikidataR"), 14 | limit = limit, ...) 15 | output <- lapply(result, function(x){jsonlite::fromJSON(x$wikitext[[1]])}) 16 | class(output) <- "wikidata" 17 | return(output) 18 | 19 | } 20 | 21 | #Generic input checker. Needs additional stuff for property-based querying 22 | #because namespaces are weird, yo. 
23 | check_input <- function(input, substitution){ 24 | in_fit <- grepl("^\\d+$",input) 25 | if(any(in_fit)){ 26 | input[in_fit] <- paste0(substitution, input[in_fit]) 27 | } 28 | return(input) 29 | } 30 | 31 | #Generic, direct access to Wikidata's search functionality. 32 | searcher <- function(search_term, language, limit, type, ...){ 33 | result <- WikipediR::query(url = "https://www.wikidata.org/w/api.php", out_class = "list", clean_response = FALSE, 34 | query_param = list( 35 | action = "wbsearchentities", 36 | type = type, 37 | language = language, 38 | limit = limit, 39 | search = search_term 40 | ), 41 | ...) 42 | result <- result$search 43 | return(result) 44 | } 45 | 46 | sparql_query <- function(params, ...){ 47 | result <- httr::GET("https://query.wikidata.org/bigdata/namespace/wdq/sparql", 48 | query = list(query = params), 49 | httr::user_agent("WikidataR - https://github.com/Ironholds/WikidataR"), 50 | ...) 51 | httr::stop_for_status(result) 52 | return(httr::content(result, as = "parsed", type = "application/json")) 53 | } 54 | 55 | #'@title Extract Claims from Returned Item Data 56 | #'@description extract claim information from data returned using 57 | #'\code{\link{get_item}}. 58 | #' 59 | #'@param items a list of one or more Wikidata items returned with 60 | #'\code{\link{get_item}}. 61 | #' 62 | #'@param claims a vector of claims (in the form "P321", "P12") to look for 63 | #'and extract. 64 | #' 65 | #'@return a list containing one sub-list for each entry in \code{items}, 66 | #'and (below that) the found data for each claim. In the event a claim 67 | #'cannot be found for an item, an \code{NA} will be returned 68 | #'instead. 
69 | #' 70 | #'@examples 71 | #'# Get item data 72 | #'adams_data <- get_item("42") 73 | #' 74 | #'# Get claim data 75 | #'claims <- extract_claims(adams_data, "P31") 76 | #' 77 | #'@export 78 | extract_claims <- function(items, claims){ 79 | output <- lapply(items, function(x, claims){ 80 | return(lapply(claims, function(claim, obj){ 81 | which_match <- which(names(obj$claims) == claim) 82 | if(!length(which_match)){ 83 | return(NA) 84 | } 85 | return(obj$claims[[which_match[1]]]) 86 | }, obj = x)) 87 | }, claims = claims) 88 | 89 | return(output) 90 | } 91 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | WikidataR 2 | ========= 3 | 4 | An R API wrapper for the Wikidata store of semantic data. 5 | 6 | __Author:__ Oliver Keyes, Serena Signorelli & Christian Graul
7 | __License:__ [MIT](http://opensource.org/licenses/MIT)
8 | __Status:__ Stable 9 | 10 | [![Travis-CI Build Status](https://travis-ci.org/Ironholds/WikidataR.svg?branch=master)](https://travis-ci.org/Ironholds/WikidataR)![downloads](http://cranlogs.r-pkg.org/badges/grand-total/WikidataR) 11 | 12 | Description 13 | ====== 14 | WikidataR is a wrapper around the Wikidata API. It is written in and for R, and was inspired by Christian Graul's 15 | [rwikidata](https://github.com/chgrl/rwikidata) project. For details on how to best use it, see the [explanatory 16 | vignette](https://CRAN.R-project.org/package=WikidataR/vignettes/Introduction.html). 17 | 18 | Please note that this project is released with a 19 | [Contributor Code of Conduct](https://github.com/Ironholds/WikidataR/blob/master/CONDUCT.md). 20 | By participating in this project you agree to abide by its terms. 21 | 22 | Installation 23 | ====== 24 | 25 | For the most recent CRAN version: 26 | 27 | install.packages("WikidataR") 28 | 29 | For the development version: 30 | 31 | library(devtools) 32 | devtools::install_github("ironholds/WikidataR") 33 | 34 | Dependencies 35 | ====== 36 | * R. Doy. 37 | * [httr](https://cran.r-project.org/package=httr) and its dependencies. 
38 | * [WikipediR](https://cran.r-project.org/package=WikipediR) 39 | -------------------------------------------------------------------------------- /WikidataR.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | BuildType: Package 16 | PackageUseDevtools: Yes 17 | PackageInstallArgs: --no-multiarch --with-keep.source 18 | PackageCheckArgs: --as-cran 19 | PackageRoxygenize: rd,collate,namespace,vignette 20 | -------------------------------------------------------------------------------- /man/WikidataR.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/WikidataR.R 3 | \docType{package} 4 | \name{WikidataR} 5 | \alias{WikidataR} 6 | \alias{WikidataR-package} 7 | \alias{WikidataR-package} 8 | \title{API client library for Wikidata} 9 | \description{ 10 | This package serves as an API client for \href{https://www.wikidata.org}{Wikidata}. 11 | See the accompanying vignette for more details. 12 | } 13 | \seealso{ 14 | \code{\link{get_random}} for selecting a random item or property, 15 | \code{\link{get_item}} for a /specific/ item or property, or \code{\link{find_item}} 16 | for using search functionality to pull out item or property IDs where the descriptions 17 | or aliases match a particular search term. 
18 | } 19 | -------------------------------------------------------------------------------- /man/extract_claims.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utils.R 3 | \name{extract_claims} 4 | \alias{extract_claims} 5 | \title{Extract Claims from Returned Item Data} 6 | \usage{ 7 | extract_claims(items, claims) 8 | } 9 | \arguments{ 10 | \item{items}{a list of one or more Wikidata items returned with 11 | \code{\link{get_item}}.} 12 | 13 | \item{claims}{a vector of claims (in the form "P321", "P12") to look for 14 | and extract.} 15 | } 16 | \value{ 17 | a list containing one sub-list for each entry in \code{items}, 18 | and (below that) the found data for each claim. In the event a claim 19 | cannot be found for an item, an \code{NA} will be returned 20 | instead. 21 | } 22 | \description{ 23 | extract claim information from data returned using 24 | \code{\link{get_item}}. 25 | } 26 | \examples{ 27 | # Get item data 28 | adams_data <- get_item("42") 29 | 30 | # Get claim data 31 | claims <- extract_claims(adams_data, "P31") 32 | 33 | } 34 | -------------------------------------------------------------------------------- /man/find_item.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gets.R 3 | \name{find_item} 4 | \alias{find_item} 5 | \alias{find_property} 6 | \alias{find_property} 7 | \title{Search for Wikidata items or properties that match a search term} 8 | \usage{ 9 | find_item(search_term, language = "en", limit = 10, ...) 10 | 11 | find_property(search_term, language = "en", limit = 10) 12 | } 13 | \arguments{ 14 | \item{search_term}{a term to search for.} 15 | 16 | \item{language}{the language to return the labels and descriptions in; this should 17 | consist of an ISO language code. 
Set to "en" by default.} 18 | 19 | \item{limit}{the number of results to return; set to 10 by default.} 20 | 21 | \item{...}{further arguments to pass to httr's GET.} 22 | } 23 | \description{ 24 | \code{find_item} and \code{find_property} allow you to retrieve a set 25 | of Wikidata items or properties where the aliases or descriptions match a particular 26 | search term. As with other \code{WikidataR} code, custom print methods are available; 27 | use \code{\link{str}} to manipulate and see the underlying structure of the data. 28 | } 29 | \examples{ 30 | 31 | #Check for entries relating to Douglas Adams in some way 32 | adams_items <- find_item("Douglas Adams") 33 | 34 | #Check for properties involving the peerage 35 | peerage_props <- find_property("peerage") 36 | 37 | } 38 | \seealso{ 39 | \code{\link{get_random}} for selecting a random item or property, 40 | or \code{\link{get_item}} for selecting a specific item or property. 41 | } 42 | -------------------------------------------------------------------------------- /man/get_geo_box.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/geo.R 3 | \name{get_geo_box} 4 | \alias{get_geo_box} 5 | \title{Get geographic entities based on a bounding box} 6 | \usage{ 7 | get_geo_box(first_city_code, first_corner, second_city_code, second_corner, 8 | language = "en", ...) 
9 | } 10 | \arguments{ 11 | \item{first_city_code}{a Wikidata item, or series of items, to use for 12 | one corner of the bounding box.} 13 | 14 | \item{first_corner}{the direction of \code{first_city_code} relative 15 | to \code{city} (eg "NorthWest", "SouthEast").} 16 | 17 | \item{second_city_code}{a Wikidata item, or series of items, to use for 18 | one corner of the bounding box.} 19 | 20 | \item{second_corner}{the direction of \code{second_city_code} relative 21 | to \code{city} (eg "NorthWest", "SouthEast").} 22 | 23 | \item{language}{the two-letter language code to use for the name 24 | of the item. "en" by default.} 25 | 26 | \item{...}{further arguments to pass to httr's GET.} 27 | } 28 | \value{ 29 | a data.frame of 5 columns: 30 | \itemize{ 31 | \item{item}{ the Wikidata identifier of each object associated with 32 | \code{entity}.} 33 | \item{name}{ the name of the item, if available, in the requested language. If it 34 | is not available, \code{NA} will be returned instead.} 35 | \item{latitude}{ the latitude of \code{item}} 36 | \item{longitude}{ the longitude of \code{item}} 37 | \item{entity}{ the entity the item is associated with (necessary for multi-entity 38 | queries).} 39 | } 40 | } 41 | \description{ 42 | \code{get_geo_box} retrieves all geographic entities in 43 | Wikidata that fall between a bounding box between two existing items 44 | with geographic attributes (usually cities). 45 | } 46 | \examples{ 47 | # Simple bounding box 48 | bruges_box <- WikidataR:::get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest") 49 | 50 | # Custom language 51 | bruges_box_fr <- WikidataR:::get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest", 52 | language = "fr") 53 | 54 | } 55 | \seealso{ 56 | \code{\link{get_geo_entity}} for using an unrestricted search or simple radius, 57 | rather than a bounding box. 
58 | } 59 | -------------------------------------------------------------------------------- /man/get_geo_entity.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/geo.R 3 | \name{get_geo_entity} 4 | \alias{get_geo_entity} 5 | \title{Retrieve geographic information from Wikidata} 6 | \usage{ 7 | get_geo_entity(entity, language = "en", radius = NULL, ...) 8 | } 9 | \arguments{ 10 | \item{entity}{a Wikidata item (\code{Q...}) or series of items, to check 11 | for associated geo-tagged items.} 12 | 13 | \item{language}{the two-letter language code to use for the name 14 | of the item. "en" by default, because we're imperialist 15 | anglocentric westerners.} 16 | 17 | \item{radius}{optionally, a radius (in kilometers) around \code{entity} 18 | to restrict the search to.} 19 | 20 | \item{...}{further arguments to pass to httr's GET.} 21 | } 22 | \value{ 23 | a data.frame of 5 columns: 24 | \itemize{ 25 | \item{item}{ the Wikidata identifier of each object associated with 26 | \code{entity}.} 27 | \item{name}{ the name of the item, if available, in the requested language. If it 28 | is not available, \code{NA} will be returned instead.} 29 | \item{latitude}{ the latitude of \code{item}} 30 | \item{longitude}{ the longitude of \code{item}} 31 | \item{entity}{ the entity the item is associated with (necessary for multi-entity 32 | queries).} 33 | } 34 | } 35 | \description{ 36 | \code{get_geo_entity} retrieves the item ID, latitude 37 | and longitude of any object with geographic data associated with \emph{another} 38 | object with geographic data (example: all the locations around/near/associated with 39 | a city). 
40 | } 41 | \examples{ 42 | # All entities 43 | sf_locations <- get_geo_entity("Q62") 44 | 45 | # Entities with French, rather than English, names 46 | sf_locations <- get_geo_entity("Q62", language = "fr") 47 | 48 | # Entities within 1km 49 | sf_close_locations <- get_geo_entity("Q62", radius = 1) 50 | 51 | # Multiple entities 52 | multi_entity <- get_geo_entity(entity = c("Q62", "Q64")) 53 | 54 | } 55 | \seealso{ 56 | \code{\link{get_geo_box}} for using a bounding box 57 | rather than an unrestricted search or simple radius. 58 | } 59 | -------------------------------------------------------------------------------- /man/get_item.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gets.R 3 | \name{get_item} 4 | \alias{get_item} 5 | \alias{get_property} 6 | \alias{get_property} 7 | \title{Retrieve specific Wikidata items or properties} 8 | \usage{ 9 | get_item(id, ...) 10 | 11 | get_property(id, ...) 12 | } 13 | \arguments{ 14 | \item{id}{the ID number(s) of the item or property you're looking for. This can be in 15 | various formats; either a numeric value ("200"), the full name ("Q200") or 16 | even with an included namespace ("Property:P10") - the function will format 17 | it appropriately. This function is vectorised and will happily accept 18 | multiple IDs.} 19 | 20 | \item{...}{further arguments to pass to httr's GET.} 21 | } 22 | \description{ 23 | \code{get_item} and \code{get_property} allow you to retrieve the data associated 24 | with individual Wikidata items and properties, respectively. As with 25 | other \code{WikidataR} code, custom print methods are available; use \code{\link{str}} 26 | to manipulate and see the underlying structure of the data. 
27 | } 28 | \examples{ 29 | 30 | #Retrieve a specific item 31 | adams_metadata <- get_item("42") 32 | 33 | #Retrieve a specific property 34 | object_is_child <- get_property("P40") 35 | 36 | } 37 | \seealso{ 38 | \code{\link{get_random}} for selecting a random item or property, 39 | or \code{\link{find_item}} for using search functionality to pull out 40 | item or property IDs where the descriptions or aliases match a particular 41 | search term. 42 | } 43 | -------------------------------------------------------------------------------- /man/get_random.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/gets.R 3 | \name{get_random_item} 4 | \alias{get_random_item} 5 | \alias{get_random} 6 | \alias{get_random_property} 7 | \alias{get_random_property} 8 | \title{Retrieve randomly-selected Wikidata items or properties} 9 | \usage{ 10 | get_random_item(limit = 1, ...) 11 | 12 | get_random_property(limit = 1, ...) 13 | } 14 | \arguments{ 15 | \item{limit}{how many random items to return. 1 by default, but can be higher.} 16 | 17 | \item{...}{arguments to pass to httr's GET.} 18 | } 19 | \description{ 20 | \code{get_random_item} and \code{get_random_property} allow you to retrieve the data 21 | associated with randomly-selected Wikidata items and properties, respectively. As with 22 | other \code{WikidataR} code, custom print methods are available; use \code{\link{str}} 23 | to manipulate and see the underlying structure of the data. 24 | } 25 | \examples{ 26 | 27 | #Random item 28 | random_item <- get_random_item() 29 | 30 | #Random property 31 | random_property <- get_random_property() 32 | 33 | } 34 | \seealso{ 35 | \code{\link{get_item}} for selecting a specific item or property, 36 | or \code{\link{find_item}} for using search functionality to pull out 37 | item or property IDs where the descriptions or aliases match a particular 38 | search term. 
39 | } 40 | -------------------------------------------------------------------------------- /man/print.find_item.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/prints.R 3 | \name{print.find_item} 4 | \alias{print.find_item} 5 | \title{Print method for find_item} 6 | \usage{ 7 | \method{print}{find_item}(x, ...) 8 | } 9 | \arguments{ 10 | \item{x}{find_item object with search results} 11 | 12 | \item{\dots}{Arguments to be passed to methods} 13 | } 14 | \description{ 15 | print found items. 16 | } 17 | -------------------------------------------------------------------------------- /man/print.find_property.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/prints.R 3 | \name{print.find_property} 4 | \alias{print.find_property} 5 | \title{Print method for find_property} 6 | \usage{ 7 | \method{print}{find_property}(x, ...) 8 | } 9 | \arguments{ 10 | \item{x}{find_property object with search results} 11 | 12 | \item{\dots}{Arguments to be passed to methods} 13 | } 14 | \description{ 15 | print found properties. 16 | } 17 | -------------------------------------------------------------------------------- /man/print.wikidata.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/prints.R 3 | \name{print.wikidata} 4 | \alias{print.wikidata} 5 | \title{Print method for Wikidata objects} 6 | \usage{ 7 | \method{print}{wikidata}(x, ...) 8 | } 9 | \arguments{ 10 | \item{x}{wikidata object from get_item, get_random_item, get_property or get_random_property} 11 | 12 | \item{\dots}{Arguments to be passed to methods} 13 | } 14 | \description{ 15 | print found objects generally. 
16 | } 17 | \seealso{ 18 | get_item, get_random_item, get_property or get_random_property 19 | } 20 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(WikidataR) 3 | 4 | test_check("WikidataR") 5 | -------------------------------------------------------------------------------- /tests/testthat/test_geo.R: -------------------------------------------------------------------------------- 1 | testthat::context("Geographic queries") 2 | 3 | testthat::test_that("Simple entity-based geo lookups work", { 4 | field_names <- c("item", "name", "latitutde", "longitude", "entity") 5 | sf_locations <- get_geo_entity("Q62") 6 | testthat::expect_true(is.data.frame(sf_locations)) 7 | testthat::expect_true(all(field_names == names(sf_locations))) 8 | testthat::expect_true(unique(sf_locations$entity) == "Q62") 9 | }) 10 | 11 | testthat::test_that("Language-variant entity-based geo lookups work", { 12 | field_names <- c("item", "name", "latitutde", "longitude", "entity") 13 | sf_locations <- get_geo_entity("Q62", language = "fr") 14 | testthat::expect_true(is.data.frame(sf_locations)) 15 | testthat::expect_true(all(field_names == names(sf_locations))) 16 | testthat::expect_true(unique(sf_locations$entity) == "Q62") 17 | }) 18 | 19 | testthat::test_that("Radius restricted entity-based geo lookups work", { 20 | field_names <- c("item", "name", "latitutde", "longitude", "entity") 21 | sf_locations <- get_geo_entity("Q62", radius = 1) 22 | testthat::expect_true(is.data.frame(sf_locations)) 23 | testthat::expect_true(all(field_names == names(sf_locations))) 24 | testthat::expect_true(unique(sf_locations$entity) == "Q62") 25 | }) 26 | 27 | testthat::test_that("multi-entity geo lookups work", { 28 | field_names <- c("item", "name", "latitutde", "longitude", "entity") 29 | sf_locations <- get_geo_entity(c("Q62", "Q64"), radius = 
1) 30 | testthat::expect_true(is.data.frame(sf_locations)) 31 | testthat::expect_true(all(field_names == names(sf_locations))) 32 | testthat::expect_equal(length(unique(sf_locations$entity)), 2) 33 | }) 34 | 35 | testthat::test_that("Simple bounding lookups work", { 36 | field_names <- c("item", "name", "latitutde", "longitude") 37 | bruges_box <- get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest") 38 | testthat::expect_true(is.data.frame(bruges_box)) 39 | testthat::expect_true(all(field_names == names(bruges_box))) 40 | }) 41 | 42 | testthat::test_that("Language-variant bounding lookups work", { 43 | field_names <- c("item", "name", "latitutde", "longitude") 44 | bruges_box <- get_geo_box("Q12988", "NorthEast", "Q184287", "SouthWest", 45 | language = "fr") 46 | testthat::expect_true(is.data.frame(bruges_box)) 47 | testthat::expect_true(all(field_names == names(bruges_box))) 48 | }) -------------------------------------------------------------------------------- /tests/testthat/test_gets.R: -------------------------------------------------------------------------------- 1 | context("Direct Wikidata get functions") 2 | 3 | test_that("A specific item can be retrieved with an entire item code", { 4 | expect_true({get_item("Q100");TRUE}) 5 | }) 6 | 7 | test_that("A specific item can be retrieved with a partial entire item code", { 8 | expect_true({get_item("100");TRUE}) 9 | }) 10 | 11 | test_that("A specific property can be retrieved with an entire prop code + namespace", { 12 | expect_true({get_property("Property:P10");TRUE}) 13 | }) 14 | 15 | test_that("A specific property can be retrieved with an entire prop code + namespace", { 16 | expect_true({get_property("P10");TRUE}) 17 | }) 18 | 19 | 20 | test_that("A specific property can be retrieved with a partial prop code", { 21 | expect_true({get_property("10");TRUE}) 22 | }) 23 | 24 | test_that("A randomly-selected item can be retrieved",{ 25 | expect_true({get_random_item();TRUE}) 26 | }) 27 | 28 | test_that("A 
randomly-selected property can be retrieved",{ 29 | expect_true({get_random_property();TRUE}) 30 | }) -------------------------------------------------------------------------------- /tests/testthat/test_search.R: -------------------------------------------------------------------------------- 1 | context("Search functions") 2 | 3 | test_that("English-language search works",{ 4 | expect_true({find_item("Wonder Girls", "en");TRUE}) 5 | }) 6 | 7 | test_that("Non-English-language search works",{ 8 | expect_true({find_item("Wonder Girls", "es");TRUE}) 9 | }) 10 | 11 | test_that("Search with limit modding works",{ 12 | expect_that(length(find_item("Wonder Girls", "en", 3)), equals(3)) 13 | }) 14 | 15 | test_that("Property search works",{ 16 | expect_true({find_property("Music", "en");TRUE}) 17 | }) -------------------------------------------------------------------------------- /vignettes/Introduction.R: -------------------------------------------------------------------------------- 1 | ## ---- eval=FALSE--------------------------------------------------------- 2 | # #Retrieve an item 3 | # item <- get_item(id = 1) 4 | # 5 | # #Get information about the property of the first claim it has. 6 | # first_claim <- get_property(id = names(item$claims)[1]) 7 | # #Do we succeed? Dewey! 8 | 9 | ## ---- eval=FALSE--------------------------------------------------------- 10 | # #Retrieve a random item 11 | # rand_item <- get_random_item() 12 | # 13 | # #Retrieve a random property 14 | # rand_prop <- get_random_property() 15 | 16 | ## ---- eval=FALSE--------------------------------------------------------- 17 | # #Retrieve 42 random items 18 | # rand_item <- get_random_item(limit = 42) 19 | # 20 | # #Retrieve 42 random properties 21 | # rand_prop <- get_random_property(limit = 42) 22 | 23 | ## ---- eval=FALSE--------------------------------------------------------- 24 | # #Find item - find defaults to "en" as a language. 
25 | # aarons <- find_item("Aaron Halfaker") 26 | # 27 | # #Find a property - also defaults to "en" 28 | # first_names <- find_property("first name") 29 | 30 | ## ---- eval=FALSE--------------------------------------------------------- 31 | # #Find item. 32 | # all_aarons <- find_item("Aaron Halfaker") 33 | # 34 | # #Grab the ID code for the first entry and retrieve the associated item data. 35 | # first_aaron <- get_item(all_aarons[[1]]$id) 36 | 37 | -------------------------------------------------------------------------------- /vignettes/Introduction.Rmd: -------------------------------------------------------------------------------- 1 | 5 | 6 | # WikidataR: the API client library for Wikidata 7 | Wikidata is a wonderful and irreplaceable resource for linked data, containing information on pretty much any subject. If there's a Wikipedia article on it, there's almost certainly a Wikidata item for it. 8 | 9 | WikidataR - following the naming scheme of [WikipediR](https://github.com/Ironholds/WikipediR#thanks-and-misc) - is an API client library for Wikidata, written in and accessible from R. 10 | 11 | ## Items and properties 12 | The two basic component pieces of Wikidata are "items" and "properties". An "item" is a thing - a concept, object or 13 | topic that exists in the real world, such as "Rush". These items each have statements associated with them - for 14 | example, "Rush is an instance of: Rock Band". In that statement, "instance of" is the property - a kind of attribute or relationship 15 | that items can hold - and "Rock Band" is its value (itself another item). Wikidata items are organised as descriptors of the item, in various languages, and references to the properties that that item holds. 16 | 17 | ## Retrieving specific items or properties 18 | Items and properties are both identified by numeric IDs, prefaced with "Q" in the case of items, 19 | and "P" in the case of properties. 
WikidataR can be used to retrieve items or properties with specific 20 | ID numbers, using the get\_item and get\_property functions: 21 | 22 | ```{r, eval=FALSE} 23 | #Retrieve an item 24 | item <- get_item(id = 1) 25 | 26 | #Get information about the property of the first claim it has. 27 | first_claim <- get_property(id = names(item$claims)[1]) 28 | #Do we succeed? Dewey! 29 | ``` 30 | 31 | These functions are capable of accepting various forms for the ID, including (as examples), "Q100" or "100" 32 | for items, and "Property:P100", "P100" or "100" for properties. They're also vectorised - pass them as many IDs as you want! 33 | 34 | ## Retrieving randomly-selected items or properties 35 | As well as retrieving specific items or properties, Wikidata's API also allows for the retrieval of *random* 36 | elements. With WikidataR, this can be achieved through: 37 | 38 | ```{r, eval=FALSE} 39 | #Retrieve a random item 40 | rand_item <- get_random_item() 41 | 42 | #Retrieve a random property 43 | rand_prop <- get_random_property() 44 | ``` 45 | 46 | These also allow you to retrieve *sets* of random elements - not just one at a time, but say, 50 at a time - by including the "limit" argument: 47 | 48 | ```{r, eval=FALSE} 49 | #Retrieve 42 random items 50 | rand_item <- get_random_item(limit = 42) 51 | 52 | #Retrieve 42 random properties 53 | rand_prop <- get_random_property(limit = 42) 54 | ``` 55 | 56 | ## Search 57 | Wikidata's search functionality can also be used, either to find items or to find properties. All you need is 58 | a search string (which is run over the names and descriptions of items or properties) and a language code 59 | (since Wikidata's descriptions can be in many languages): 60 | 61 | ```{r, eval=FALSE} 62 | #Find item - find defaults to "en" as a language. 
63 | aarons <- find_item("Aaron Halfaker") 64 | 65 | #Find a property - also defaults to "en" 66 | first_names <- find_property("first name") 67 | ``` 68 | 69 | The resulting search entries have the ID as a key, making it trivial to then retrieve the full corresponding 70 | items or properties: 71 | 72 | ```{r, eval=FALSE} 73 | #Find item. 74 | all_aarons <- find_item("Aaron Halfaker") 75 | 76 | #Grab the ID code for the first entry and retrieve the associated item data. 77 | first_aaron <- get_item(all_aarons[[1]]$id) 78 | ``` 79 | 80 | ## Other and future functionality 81 | If you have ideas for other types of useful Wikidata access, the best approach 82 | is to either [request it](https://github.com/Ironholds/WikidataR/issues) or [add it](https://github.com/Ironholds/WikidataR/pulls)! 83 | -------------------------------------------------------------------------------- /vignettes/Introduction.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | WikidataR: the API client library for Wikidata 7 | 8 | 19 | 20 | 21 | 53 | 54 | 55 | 59 | 60 | 61 | 62 | 198 | 199 | 200 | 201 | 202 | 203 | 204 | 208 | 209 |

WikidataR: the API client library for Wikidata

210 | 211 |

Wikidata is a wonderful and irreplaceable resource for linked data, containing information on pretty much any subject. If there's a Wikipedia article on it, there's almost certainly a Wikidata item for it.

212 | 213 |

WikidataR - following the naming scheme of WikipediR - is an API client library for Wikidata, written in and accessible from R.

214 | 215 |

Items and properties

216 | 217 |

The two basic component pieces of Wikidata are “items” and “properties”. An “item” is a thing - a concept, object or 218 | topic that exists in the real world, such as “Rush”. These items each have statements associated with them - for 219 | example, “Rush is an instance of: Rock Band”. In that statement, “instance of” is the property - a kind of attribute or relationship 220 | that items can hold - and “Rock Band” is its value (itself another item). Wikidata items are organised as descriptors of the item, in various languages, and references to the properties that that item holds.

221 | 222 |

Retrieving specific items or properties

223 | 224 |

Items and properties are both identified by numeric IDs, prefaced with “Q” in the case of items, 225 | and “P” in the case of properties. WikidataR can be used to retrieve items or properties with specific 226 | ID numbers, using the get_item and get_property functions:

227 | 228 |
#Retrieve an item 
229 | item <- get_item(id = 1)
230 | 
231 | #Get information about the property of the first claim it has.
232 | first_claim <- get_property(id = names(item$claims)[1])
233 | #Do we succeed? Dewey!
234 | 
235 | 236 |

These functions are capable of accepting various forms for the ID, including (as examples), “Q100” or “100” 237 | for items, and “Property:P100”, “P100” or “100” for properties. They're also vectorised - pass them as many IDs as you want!
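
To make the accepted ID forms concrete, here is a minimal sketch of how such normalisation could work. `normalise_id` is a hypothetical helper written for illustration only; it is not part of WikidataR and may differ from what the package does internally.

```r
# Hypothetical helper (NOT a WikidataR function): reduce the ID forms listed
# above to a canonical "Q100" / "P100" string.
normalise_id <- function(id, prefix = c("Q", "P")) {
  prefix <- match.arg(prefix)
  id <- as.character(id)
  # Drop an optional namespace such as "Property:" (everything up to the last colon)
  id <- sub("^.*:", "", id)
  # Keep only the digits, then re-attach the requested prefix
  digits <- gsub("[^0-9]", "", id)
  paste0(prefix, digits)
}

# Vectorised over its input, like get_item and get_property
item_ids <- normalise_id(c("Q100", "100"), "Q")
prop_ids <- normalise_id(c("Property:P100", "P100", "100"), "P")
```

Stripping to digits first and re-adding the prefix means all three spellings go down one code path; input with no digits at all would yield a bare prefix, which a real implementation would probably want to reject.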

238 | 239 |

Retrieving randomly-selected items or properties

240 | 241 |

As well as retrieving specific items or properties, Wikidata's API also allows for the retrieval of random 242 | elements. With WikidataR, this can be achieved through:

243 | 244 |
#Retrieve a random item
245 | rand_item <- get_random_item()
246 | 
247 | #Retrieve a random property
248 | rand_prop <- get_random_property()
249 | 
250 | 251 |

These also allow you to retrieve sets of random elements - not just one at a time, but say, 50 at a time - by including the “limit” argument:

252 | 253 |
#Retrieve 42 random items
254 | rand_item <- get_random_item(limit = 42)
255 | 
256 | #Retrieve 42 random properties
257 | rand_prop <- get_random_property(limit = 42)
258 | 
259 | 260 |

Search

261 | 262 |

Wikidata's search functionality can also be used, either to find items or to find properties. All you need is 263 | a search string (which is run over the names and descriptions of items or properties) and a language code 264 | (since Wikidata's descriptions can be in many languages):

265 | 266 |
#Find item - find defaults to "en" as a language.
267 | aarons <- find_item("Aaron Halfaker")
268 | 
269 | #Find a property - also defaults to "en"
270 | first_names <- find_property("first name")
271 | 
272 | 273 |

The resulting search entries have the ID as a key, making it trivial to then retrieve the full corresponding 274 | items or properties:

275 | 276 |
#Find item.
277 | all_aarons <- find_item("Aaron Halfaker")
278 | 
279 | #Grab the ID code for the first entry and retrieve the associated item data.
280 | first_aaron <- get_item(all_aarons[[1]]$id)
281 | 
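
The same idea extends to every search hit, not just the first. The sketch below uses a hand-built mock in place of a real `find_item` result, so it runs without network access; the list-of-entries shape is assumed from the `all_aarons[[1]]$id` access above, and the IDs and labels are made up.

```r
# Mock standing in for a find_item() result: a list of entries, each with an
# `id` element (shape assumed from `all_aarons[[1]]$id`; values are invented).
mock_results <- list(
  list(id = "Q1", label = "first hit"),
  list(id = "Q2", label = "second hit")
)

# Collect every ID into a character vector
ids <- vapply(mock_results, function(entry) entry$id, character(1))
ids
```

Because get_item is described as vectorised, a vector like `ids` could presumably be passed to it in one call rather than looping over the entries.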
282 | 283 |

Other and future functionality

284 | 285 |

If you have ideas for other types of useful Wikidata access, the best approach 286 | is to either request it or add it!

287 | 288 | 289 | 290 | 291 | -------------------------------------------------------------------------------- /vignettes/Introduction.md: -------------------------------------------------------------------------------- 1 | 5 | 6 | # WikidataR: the API client library for Wikidata 7 | Wikidata is a wonderful and irreplaceable resource for linked data, containing information on pretty much any subject. If there's a Wikipedia article on it, there's almost certainly a Wikidata item for it. 8 | 9 | WikidataR - following the naming scheme of [WikipediR](https://github.com/Ironholds/WikipediR#thanks-and-misc) - is an API client library for Wikidata, written in and accessible from R. 10 | 11 | ## Items and properties 12 | The two basic component pieces of Wikidata are "items" and "properties". An "item" is a thing - a concept, object or 13 | topic that exists in the real world, such as "Rush". These items each have statements associated with them - for 14 | example, "Rush is an instance of: Rock Band". In that statement, "instance of" is the property - a kind of attribute or relationship 15 | that items can hold - and "Rock Band" is its value (itself another item). Wikidata items are organised as descriptors of the item, in various languages, and references to the properties that that item holds. 16 | 17 | ## Retrieving specific items or properties 18 | Items and properties are both identified by numeric IDs, prefaced with "Q" in the case of items, 19 | and "P" in the case of properties. WikidataR can be used to retrieve items or properties with specific 20 | ID numbers, using the get\_item and get\_property functions: 21 | 22 | 23 | ```r 24 | #Retrieve an item 25 | item <- get_item(id = 1) 26 | 27 | #Get information about the property of the first claim it has. 28 | first_claim <- get_property(id = names(item$claims)[1]) 29 | #Do we succeed? Dewey! 30 | ``` 31 | 32 | These functions are capable of accepting various forms for the ID, including (as examples), "Q100" or "100" 33 | for items, and "Property:P100", "P100" or "100" for properties. 
They're also vectorised - pass them as many IDs as you want! 34 | 35 | ## Retrieving randomly-selected items or properties 36 | As well as retrieving specific items or properties, Wikidata's API also allows for the retrieval of *random* 37 | elements. With WikidataR, this can be achieved through: 38 | 39 | 40 | ```r 41 | #Retrieve a random item 42 | rand_item <- get_random_item() 43 | 44 | #Retrieve a random property 45 | rand_prop <- get_random_property() 46 | ``` 47 | 48 | These also allow you to retrieve *sets* of random elements - not just one at a time, but say, 50 at a time - by including the "limit" argument: 49 | 50 | 51 | ```r 52 | #Retrieve 42 random items 53 | rand_item <- get_random_item(limit = 42) 54 | 55 | #Retrieve 42 random properties 56 | rand_prop <- get_random_property(limit = 42) 57 | ``` 58 | 59 | ## Search 60 | Wikidata's search functionality can also be used, either to find items or to find properties. All you need is 61 | a search string (which is run over the names and descriptions of items or properties) and a language code 62 | (since Wikidata's descriptions can be in many languages): 63 | 64 | 65 | ```r 66 | #Find item - find defaults to "en" as a language. 67 | aarons <- find_item("Aaron Halfaker") 68 | 69 | #Find a property - also defaults to "en" 70 | first_names <- find_property("first name") 71 | ``` 72 | 73 | The resulting search entries have the ID as a key, making it trivial to then retrieve the full corresponding 74 | items or properties: 75 | 76 | 77 | ```r 78 | #Find item. 79 | all_aarons <- find_item("Aaron Halfaker") 80 | 81 | #Grab the ID code for the first entry and retrieve the associated item data. 82 | first_aaron <- get_item(all_aarons[[1]]$id) 83 | ``` 84 | 85 | ## Other and future functionality 86 | If you have ideas for other types of useful Wikidata access, the best approach 87 | is to either [request it](https://github.com/Ironholds/WikidataR/issues) or [add it](https://github.com/Ironholds/WikidataR/pulls)! 
88 | --------------------------------------------------------------------------------