├── .Rbuildignore ├── .gitignore ├── DESCRIPTION ├── LICENSE ├── NAMESPACE ├── NEWS.md ├── R ├── sfarrow.R └── st_arrow.R ├── README.Rmd ├── README.md ├── docs ├── 404.html ├── LICENSE-text.html ├── articles │ ├── example_sfarrow.html │ ├── example_sfarrow_files │ │ ├── accessible-code-block-0.0.1 │ │ │ └── empty-anchor.js │ │ ├── figure-html │ │ │ ├── unnamed-chunk-2-1.png │ │ │ ├── unnamed-chunk-7-1.png │ │ │ └── unnamed-chunk-8-1.png │ │ ├── header-attrs-2.10 │ │ │ └── header-attrs.js │ │ └── header-attrs-2.8 │ │ │ └── header-attrs.js │ └── index.html ├── authors.html ├── bootstrap-toc.css ├── bootstrap-toc.js ├── docsearch.css ├── docsearch.js ├── index.html ├── link.svg ├── news │ └── index.html ├── pkgdown.css ├── pkgdown.js ├── pkgdown.yml └── reference │ ├── Rplot001.png │ ├── arrow_to_sf.html │ ├── create_metadata.html │ ├── encode_wkb.html │ ├── figures │ ├── REAsDME-unnamed-chunk-2-1.png │ ├── REAsDME-unnamed-chunk-3-1.png │ ├── REAsDME-unnamed-chunk-4-1.png │ └── REAsDME-unnamed-chunk-5-1.png │ ├── index.html │ ├── read_sf_dataset-1.png │ ├── read_sf_dataset.html │ ├── sfarrow.html │ ├── st_read_feather-1.png │ ├── st_read_feather.html │ ├── st_read_parquet-1.png │ ├── st_read_parquet.html │ ├── st_write_feather.html │ ├── st_write_parquet.html │ ├── validate_metadata.html │ ├── write_sf_dataset-1.png │ └── write_sf_dataset.html ├── inst └── extdata │ ├── ds │ ├── split1=1 │ │ ├── split2=1 │ │ │ └── part-3.parquet │ │ └── split2=2 │ │ │ └── part-0.parquet │ ├── split1=2 │ │ ├── split2=1 │ │ │ └── part-1.parquet │ │ └── split2=2 │ │ │ └── part-5.parquet │ └── split1=3 │ │ ├── split2=1 │ │ └── part-2.parquet │ │ └── split2=2 │ │ └── part-4.parquet │ ├── world.feather │ └── world.parquet ├── man ├── arrow_to_sf.Rd ├── create_metadata.Rd ├── encode_wkb.Rd ├── figures │ ├── REAsDME-unnamed-chunk-2-1.png │ ├── REAsDME-unnamed-chunk-3-1.png │ ├── REAsDME-unnamed-chunk-4-1.png │ └── REAsDME-unnamed-chunk-5-1.png ├── read_sf_dataset.Rd ├── sfarrow.Rd ├── 
st_read_feather.Rd ├── st_read_parquet.Rd ├── st_write_feather.Rd ├── st_write_parquet.Rd ├── validate_metadata.Rd └── write_sf_dataset.Rd └── vignettes ├── .gitignore └── example_sfarrow.Rmd /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^sfarrow\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^README\.Rmd$ 4 | ^docs$ 5 | ^cran-comments\.md$ 6 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | .Ruserdata 5 | sfarrow.Rproj 6 | inst/doc 7 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: sfarrow 2 | Title: Read/Write Simple Feature Objects ('sf') with 'Apache' 'Arrow' 3 | Version: 0.4.1 4 | Date: 2021-10-25 5 | Authors@R: 6 | person(given = "Chris", 7 | family = "Jochem", 8 | role = c("aut", "cre"), 9 | email = "w.c.jochem@soton.ac.uk", 10 | comment = c(ORCID = "0000-0003-2192-5988")) 11 | Description: Support for reading/writing simple feature ('sf') spatial objects from/to 'Parquet' files. 'Parquet' files are an open-source, column-oriented data storage format from Apache (<https://parquet.apache.org/>), now popular across programming languages. This implementation converts simple feature list geometries into well-known binary format for use by 'arrow', and coordinate reference system information is maintained in a standard metadata format.
12 | License: MIT + file LICENSE 13 | URL: https://github.com/wcjochem/sfarrow, https://wcjochem.github.io/sfarrow/ 14 | BugReports: https://github.com/wcjochem/sfarrow/issues 15 | Encoding: UTF-8 16 | LazyData: true 17 | Roxygen: list(markdown = TRUE) 18 | RoxygenNote: 7.1.1 19 | Imports: 20 | sf, 21 | arrow, 22 | jsonlite, 23 | dplyr 24 | Suggests: 25 | knitr, 26 | rmarkdown 27 | VignetteBuilder: knitr 28 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | YEAR: 2021 2 | COPYRIGHT HOLDER: Chris Jochem 3 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | export(read_sf_dataset) 4 | export(st_read_feather) 5 | export(st_read_parquet) 6 | export(st_write_feather) 7 | export(st_write_parquet) 8 | export(write_sf_dataset) 9 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | # sfarrow 0.4.1 2 | 3 | * Cleaning examples to remove reverse dependency check errors in `arrow` 4 | (reported by @jonkeane). 5 | 6 | # sfarrow 0.4.0 7 | 8 | * New `find_geom` parameter in `read_sf_dataset()` adds any geometry columns to 9 | the `arrow_dplyr_query`. The default is `FALSE`, consistent with previous behaviour. 10 | 11 | * Cleaning documentation and preparing for CRAN submission. 12 | 13 | # sfarrow 0.3.0 14 | 15 | * New `st_write_feather()` and `st_read_feather()` add similar functionality 16 | to read/write `.feather` format files with `sf` objects. 17 | * Following `arrow` 2.0.0, properties to `st_write_parquet()` are deprecated. 18 | 19 | # sfarrow 0.2.0 20 | 21 | * New `write_sf_dataset()` and `read_sf_dataset()` to handle partitioned 22 | datasets.
These also work with `dplyr` and grouped variables to define 23 | partitions. 24 | 25 | * New vignettes added for documentation of all functions. 26 | 27 | # sfarrow 0.1.1 28 | 29 | * `st_write_parquet()` now warns users that the geo metadata format may change. 30 | 31 | # sfarrow 0.1.0 32 | 33 | * This is the initial release of `sfarrow`. 34 | -------------------------------------------------------------------------------- /R/sfarrow.R: -------------------------------------------------------------------------------- 1 | #' \code{sfarrow}: An R package for reading/writing simple feature (\code{sf}) 2 | #' objects from/to Arrow parquet/feather files with \code{arrow} 3 | #' 4 | #' Simple features are a popular, standardised way to create spatial vector data 5 | #' with a list-type geometry column. Parquet files are standard column-oriented 6 | #' files designed by Apache (\url{https://parquet.apache.org/}) for fast 7 | #' reads/writes. \code{sfarrow} is designed to support the reading and writing of 8 | #' simple features in \code{sf} objects from/to Parquet files (.parquet) and 9 | #' Feather files (.feather) within \code{R}. A key goal of \code{sfarrow} is to 10 | #' support interoperability of spatial data in files between \code{R} and 11 | #' \code{Python} through the use of standardised metadata. 12 | #' 13 | #' @section Metadata: 14 | #' Coordinate reference and geometry field information for \code{sf} objects are 15 | #' stored in standard metadata tables within the files. The metadata are based 16 | #' on a standard representation (Version 0.1.0, reference: 17 | #' \url{https://github.com/geopandas/geo-arrow-spec}). This is compatible with 18 | #' the format used by the Python library \code{GeoPandas} for reading/writing 19 | #' Parquet/Feather files. Note to users: this metadata format is not yet stable 20 | #' for production uses and may change in the future.
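#'
#' @section Example:
#' A minimal round trip, as roxygen comments (a sketch; \code{world.parquet}
#' ships in the package's \code{inst/extdata}, and the temporary output path
#' is illustrative only):
#' \preformatted{
#' path <- system.file("extdata", "world.parquet", package = "sfarrow")
#' world <- st_read_parquet(path)
#' st_write_parquet(world, file.path(tempdir(), "world_copy.parquet"))
#' }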
21 | #' 22 | #' @section Credits: 23 | #' This work was undertaken by Chris Jochem, a member of the WorldPop Research 24 | #' Group at the University of Southampton (\url{https://www.worldpop.org/}). 25 | #' 26 | #' @docType package 27 | #' @keywords internal 28 | #' @name sfarrow 29 | NULL 30 | -------------------------------------------------------------------------------- /README.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | output: 3 | github_document: 4 | html_preview: false 5 | --- 6 | 7 | ```{r setup, include = FALSE} 8 | knitr::opts_chunk$set( 9 | collapse = TRUE, 10 | comment = "#>", 11 | fig.path = "man/figures/REAsDME-", 12 | out.width = "100%" 13 | ) 14 | ``` 15 | 16 | 17 | 18 | # sfarrow: Read/Write Simple Feature Objects (`sf`) with 'Apache' 'Arrow' 19 | 20 | `sfarrow` is a package for reading and writing Parquet and Feather files with 21 | `sf` objects using `arrow` in `R`. 22 | 23 | Simple features are a popular format for representing spatial vector data using 24 | `data.frames` and a list-like geometry column, implemented in the `R` package 25 | [`sf`](https://r-spatial.github.io/sf/). Apache Parquet files are an 26 | open-source, column-oriented data storage format 27 | ([https://parquet.apache.org/](https://parquet.apache.org/)) which enables 28 | efficient reading/writing of large files. Parquet files are becoming popular 29 | across programming languages and can be used in `R` using the package 30 | [`arrow`](https://github.com/apache/arrow/). 31 | 32 | The `sfarrow` implementation translates simple feature data objects using 33 | well-known binary (WKB) format for geometries and reads/writes Parquet/Feather 34 | files.
A key goal of the package is for interoperability of the files 35 | (particularly with Python `GeoPandas`), so coordinate reference system 36 | information is maintained in a standard metadata format 37 | ([https://github.com/geopandas/geo-arrow-spec](https://github.com/geopandas/geo-arrow-spec)). 38 | Note to users: this metadata format is not yet stable for production uses and 39 | may change in the future. 40 | 41 | ## Installation 42 | 43 | `sfarrow` is available through CRAN with: 44 | 45 | ```{r, eval=FALSE} 46 | install.packages('sfarrow') 47 | ``` 48 | 49 | or it can be installed from Github with: 50 | 51 | ```{r eval=FALSE} 52 | devtools::install_github("wcjochem/sfarrow@main") 53 | ``` 54 | 55 | Load the library to begin using it. 56 | 57 | ```{r} 58 | library(sfarrow) 59 | ``` 60 | 61 | ### `arrow` package 62 | 63 | The installation requires the Arrow library which should be installed with the 64 | `R` package `arrow` dependency. However, some systems may need to follow 65 | additional steps to enable full support of that library. Please refer to the 66 | `arrow` 67 | [documentation](https://CRAN.R-project.org/package=arrow/vignettes/install.html). 68 | 69 | ## Basic usage 70 | 71 | Reading Parquet data of spatial files created with Python `GeoPandas`. 72 | ```{r} 73 | # load Natural Earth low-res dataset. 74 | # Created in Python with geopandas.to_parquet() 75 | path <- system.file("extdata", "world.parquet", package = "sfarrow") 76 | 77 | world <- st_read_parquet(path) 78 | 79 | world 80 | plot(sf::st_geometry(world)) 81 | ``` 82 | 83 | Writing `sf` objects to Parquet format files. These Parquet files created with 84 | `sfarrow` can be read within Python using `GeoPandas`. 
85 | ```{r} 86 | nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet=TRUE) 87 | 88 | st_write_parquet(obj=nc, dsn=file.path(tempdir(), "nc.parquet")) 89 | 90 | # read back into R 91 | nc_p <- st_read_parquet(file.path(tempdir(), "nc.parquet")) 92 | 93 | nc_p 94 | plot(sf::st_geometry(nc_p)) 95 | ``` 96 | 97 | For additional examples please see the vignettes. 98 | 99 | ## Contributions 100 | Contributions, questions, ideas, and issue reports are welcome. Please raise an 101 | issue to discuss or submit a pull request. 102 | 103 | ## Acknowledgements 104 | This work benefited from the work by developers in the GeoPandas, Arrow, and 105 | r-spatial teams. Thank you to the teams for their excellent, open-source work. 106 | 107 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | # sfarrow: Read/Write Simple Feature Objects (`sf`) with ‘Apache’ ‘Arrow’ 5 | 6 | `sfarrow` is a package for reading and writing Parquet and Feather files 7 | with `sf` objects using `arrow` in `R`. 8 | 9 | Simple features are a popular format for representing spatial vector 10 | data using `data.frames` and a list-like geometry column, implemented in 11 | the `R` package [`sf`](https://r-spatial.github.io/sf/). Apache Parquet 12 | files are an open-source, column-oriented data storage format 13 | (<https://parquet.apache.org/>) which enables efficient reading/writing of 14 | large files. Parquet files are becoming popular across programming 15 | languages and can be used in `R` using the package 16 | [`arrow`](https://github.com/apache/arrow/).  17 | 18 | The `sfarrow` implementation translates simple feature data objects 19 | using well-known binary (WKB) format for geometries and reads/writes 20 | Parquet/Feather files.
A key goal of the package is for interoperability 21 | of the files (particularly with Python `GeoPandas`), so coordinate 22 | reference system information is maintained in a standard metadata format 23 | (). Note to users: this 24 | metadata format is not yet stable for production uses and may change in 25 | the future. 26 | 27 | ## Installation 28 | 29 | `sfarrow` is available through CRAN with: 30 | 31 | ``` r 32 | install.packages('sfarrow') 33 | ``` 34 | 35 | or it can be installed from Github with: 36 | 37 | ``` r 38 | devtools::install_github("wcjochem/sfarrow@main") 39 | ``` 40 | 41 | Load the library to begin using it. 42 | 43 | ``` r 44 | library(sfarrow) 45 | ``` 46 | 47 | ### `arrow` package 48 | 49 | The installation requires the Arrow library which should be installed 50 | with the `R` package `arrow` dependency. However, some systems may need 51 | to follow additional steps to enable full support of that library. 52 | Please refer to the `arrow` 53 | [documentation](https://CRAN.R-project.org/package=arrow/vignettes/install.html). 54 | 55 | ## Basic usage 56 | 57 | Reading Parquet data of spatial files created with Python `GeoPandas`. 58 | 59 | ``` r 60 | # load Natural Earth low-res dataset. 61 | # Created in Python with geopandas.to_parquet() 62 | path <- system.file("extdata", "world.parquet", package = "sfarrow") 63 | 64 | world <- st_read_parquet(path) 65 | 66 | world 67 | #> Simple feature collection with 177 features and 5 fields 68 | #> Geometry type: GEOMETRY 69 | #> Dimension: XY 70 | #> Bounding box: xmin: -180 ymin: -90 xmax: 180 ymax: 83.64513 71 | #> Geodetic CRS: WGS 84 72 | #> First 10 features: 73 | #> pop_est continent name iso_a3 gdp_md_est 74 | #> 1 920938 Oceania Fiji FJI 8.374e+03 75 | #> 2 53950935 Africa Tanzania TZA 1.506e+05 76 | #> 3 603253 Africa W. 
Sahara ESH 9.065e+02 77 | #> 4 35623680 North America Canada CAN 1.674e+06 78 | #> 5 326625791 North America United States of America USA 1.856e+07 79 | #> 6 18556698 Asia Kazakhstan KAZ 4.607e+05 80 | #> 7 29748859 Asia Uzbekistan UZB 2.023e+05 81 | #> 8 6909701 Oceania Papua New Guinea PNG 2.802e+04 82 | #> 9 260580739 Asia Indonesia IDN 3.028e+06 83 | #> 10 44293293 South America Argentina ARG 8.794e+05 84 | #> geometry 85 | #> 1 MULTIPOLYGON (((180 -16.067... 86 | #> 2 POLYGON ((33.90371 -0.95, 3... 87 | #> 3 POLYGON ((-8.66559 27.65643... 88 | #> 4 MULTIPOLYGON (((-122.84 49,... 89 | #> 5 MULTIPOLYGON (((-122.84 49,... 90 | #> 6 POLYGON ((87.35997 49.21498... 91 | #> 7 POLYGON ((55.96819 41.30864... 92 | #> 8 MULTIPOLYGON (((141.0002 -2... 93 | #> 9 MULTIPOLYGON (((141.0002 -2... 94 | #> 10 MULTIPOLYGON (((-68.63401 -... 95 | plot(sf::st_geometry(world)) 96 | ``` 97 | 98 | 99 | 100 | Writing `sf` objects to Parquet format files. These Parquet files 101 | created with `sfarrow` can be read within Python using `GeoPandas`. 102 | 103 | ``` r 104 | nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet=TRUE) 105 | 106 | st_write_parquet(obj=nc, dsn=file.path(tempdir(), "nc.parquet")) 107 | #> Warning: This is an initial implementation of Parquet/Feather file support and 108 | #> geo metadata. This is tracking version 0.1.0 of the metadata 109 | #> (https://github.com/geopandas/geo-arrow-spec). This metadata 110 | #> specification may change and does not yet make stability promises. We 111 | #> do not yet recommend using this in a production setting unless you are 112 | #> able to rewrite your Parquet/Feather files. 
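# Aside (illustrative addition, not part of the original README output):
# a Feather copy can be written the same way; st_write_feather() is also
# exported by sfarrow and takes the same obj/dsn arguments
st_write_feather(obj=nc, dsn=file.path(tempdir(), "nc.feather"))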
113 | 114 | # read back into R 115 | nc_p <- st_read_parquet(file.path(tempdir(), "nc.parquet")) 116 | 117 | nc_p 118 | #> Simple feature collection with 100 features and 14 fields 119 | #> Geometry type: MULTIPOLYGON 120 | #> Dimension: XY 121 | #> Bounding box: xmin: -84.32385 ymin: 33.88199 xmax: -75.45698 ymax: 36.58965 122 | #> Geodetic CRS: NAD27 123 | #> First 10 features: 124 | #> AREA PERIMETER CNTY_ CNTY_ID NAME FIPS FIPSNO CRESS_ID BIR74 SID74 125 | #> 1 0.114 1.442 1825 1825 Ashe 37009 37009 5 1091 1 126 | #> 2 0.061 1.231 1827 1827 Alleghany 37005 37005 3 487 0 127 | #> 3 0.143 1.630 1828 1828 Surry 37171 37171 86 3188 5 128 | #> 4 0.070 2.968 1831 1831 Currituck 37053 37053 27 508 1 129 | #> 5 0.153 2.206 1832 1832 Northampton 37131 37131 66 1421 9 130 | #> 6 0.097 1.670 1833 1833 Hertford 37091 37091 46 1452 7 131 | #> 7 0.062 1.547 1834 1834 Camden 37029 37029 15 286 0 132 | #> 8 0.091 1.284 1835 1835 Gates 37073 37073 37 420 0 133 | #> 9 0.118 1.421 1836 1836 Warren 37185 37185 93 968 4 134 | #> 10 0.124 1.428 1837 1837 Stokes 37169 37169 85 1612 1 135 | #> NWBIR74 BIR79 SID79 NWBIR79 geometry 136 | #> 1 10 1364 0 19 MULTIPOLYGON (((-81.47276 3... 137 | #> 2 10 542 3 12 MULTIPOLYGON (((-81.23989 3... 138 | #> 3 208 3616 6 260 MULTIPOLYGON (((-80.45634 3... 139 | #> 4 123 830 2 145 MULTIPOLYGON (((-76.00897 3... 140 | #> 5 1066 1606 3 1197 MULTIPOLYGON (((-77.21767 3... 141 | #> 6 954 1838 5 1237 MULTIPOLYGON (((-76.74506 3... 142 | #> 7 115 350 2 139 MULTIPOLYGON (((-76.00897 3... 143 | #> 8 254 594 2 371 MULTIPOLYGON (((-76.56251 3... 144 | #> 9 748 1190 2 844 MULTIPOLYGON (((-78.30876 3... 145 | #> 10 160 2038 5 176 MULTIPOLYGON (((-80.02567 3... 146 | plot(sf::st_geometry(nc_p)) 147 | ``` 148 | 149 | 150 | 151 | For additional examples please see the vignettes. 152 | 153 | ## Contributions 154 | 155 | Contributions, questions, ideas, and issue reports are welcome. Please 156 | raise an issue to discuss or submit a pull request. 
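## Partitioned datasets

For larger data, `write_sf_dataset()` and `read_sf_dataset()` (both exported
by the package, per `NAMESPACE` above) handle multi-file, partitioned
datasets. A minimal sketch of that workflow; the choice of `SID74` as the
partitioning column and the temporary paths are illustrative only:

``` r
library(sfarrow)
library(dplyr)

nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet=TRUE)

# write a dataset partitioned by a column, using dplyr groups to define
# the partitions
nc %>%
  group_by(SID74) %>%
  write_sf_dataset(path = file.path(tempdir(), "nc_ds"))

# open the dataset with arrow, filter on the partitioning variable, then
# collect the result back into an sf object
ds <- arrow::open_dataset(file.path(tempdir(), "nc_ds"))
nc_sub <- ds %>%
  filter(SID74 > 0) %>%
  read_sf_dataset()
```

See the package vignette for a fuller treatment of partitioned reads/writes.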
157 | 158 | ## Acknowledgements 159 | 160 | This work benefited from the work by developers in the GeoPandas, Arrow, 161 | and r-spatial teams. Thank you to the teams for their excellent, 162 | open-source work. 163 | -------------------------------------------------------------------------------- /docs/404.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Page not found (404) • sfarrow 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 60 | 61 |
62 |
63 | 117 | 118 | 119 | 120 |
121 | 122 |
123 |
124 | 127 | 128 | Content not found. Please use links in the navbar. 129 | 130 |
131 | 132 | 137 | 138 |
139 | 140 | 141 | 142 |
143 | 146 | 147 |
148 |

Site built with pkgdown 1.6.1.

149 |
150 | 151 |
152 |
153 | 154 | 155 | 156 | 157 | 158 | 159 | 160 | 161 | -------------------------------------------------------------------------------- /docs/LICENSE-text.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | License • sfarrow 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 60 | 61 |
62 |
63 | 117 | 118 | 119 | 120 |
121 | 122 |
123 |
124 | 127 | 128 |
YEAR: 2021
129 | COPYRIGHT HOLDER: Chris Jochem
130 | 
131 | 132 |
133 | 134 | 139 | 140 |
141 | 142 | 143 | 144 |
145 | 148 | 149 |
150 |


151 |
152 | 153 |
154 |
155 | 156 | 157 | 158 | 159 | 160 | 161 | 162 | 163 | -------------------------------------------------------------------------------- /docs/articles/example_sfarrow_files/accessible-code-block-0.0.1/empty-anchor.js: -------------------------------------------------------------------------------- 1 | // Hide empty tag within highlighted CodeBlock for screen reader accessibility (see https://github.com/jgm/pandoc/issues/6352#issuecomment-626106786) --> 2 | // v0.0.1 3 | // Written by JooYoung Seo (jooyoung@psu.edu) and Atsushi Yasumoto on June 1st, 2020. 4 | 5 | document.addEventListener('DOMContentLoaded', function() { 6 | const codeList = document.getElementsByClassName("sourceCode"); 7 | for (var i = 0; i < codeList.length; i++) { 8 | var linkList = codeList[i].getElementsByTagName('a'); 9 | for (var j = 0; j < linkList.length; j++) { 10 | if (linkList[j].innerHTML === "") { 11 | linkList[j].setAttribute('aria-hidden', 'true'); 12 | } 13 | } 14 | } 15 | }); 16 | -------------------------------------------------------------------------------- /docs/articles/example_sfarrow_files/figure-html/unnamed-chunk-2-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/articles/example_sfarrow_files/figure-html/unnamed-chunk-2-1.png -------------------------------------------------------------------------------- /docs/articles/example_sfarrow_files/figure-html/unnamed-chunk-7-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/articles/example_sfarrow_files/figure-html/unnamed-chunk-7-1.png -------------------------------------------------------------------------------- /docs/articles/example_sfarrow_files/figure-html/unnamed-chunk-8-1.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/articles/example_sfarrow_files/figure-html/unnamed-chunk-8-1.png -------------------------------------------------------------------------------- /docs/articles/example_sfarrow_files/header-attrs-2.10/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/example_sfarrow_files/header-attrs-2.8/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 
3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Articles • sfarrow 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 60 | 61 |
62 |
63 | 117 | 118 | 119 | 120 |
121 | 122 |
123 |
124 | 127 | 128 |
129 |

All vignettes

130 |

131 | 132 |
133 |
Getting started examples
134 |

Reading/writing with sfarrow and how it works.

135 |
136 |
137 |
138 |
139 | 140 | 141 |
142 | 145 | 146 |
147 |


148 |
149 | 150 |
151 |
152 | 153 | 154 | 155 | 156 | 157 | 158 | 159 | 160 | -------------------------------------------------------------------------------- /docs/authors.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Authors • sfarrow 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 60 | 61 |
62 |
63 | 117 | 118 | 119 | 120 |
121 | 122 |
123 |
124 | 127 | 128 |
    129 |
  • 130 |

    Chris Jochem. Author, maintainer. 131 |

    132 |
  • 133 |
134 | 135 |
136 | 137 |
138 | 139 | 140 | 141 |
142 | 145 | 146 |
147 |


148 |
149 | 150 |
151 |
152 | 153 | 154 | 155 | 156 | 157 | 158 | 159 | 160 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.css: -------------------------------------------------------------------------------- 1 | /*! 2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | 6 | /* modified from https://github.com/twbs/bootstrap/blob/94b4076dd2efba9af71f0b18d4ee4b163aa9e0dd/docs/assets/css/src/docs.css#L548-L601 */ 7 | 8 | /* All levels of nav */ 9 | nav[data-toggle='toc'] .nav > li > a { 10 | display: block; 11 | padding: 4px 20px; 12 | font-size: 13px; 13 | font-weight: 500; 14 | color: #767676; 15 | } 16 | nav[data-toggle='toc'] .nav > li > a:hover, 17 | nav[data-toggle='toc'] .nav > li > a:focus { 18 | padding-left: 19px; 19 | color: #563d7c; 20 | text-decoration: none; 21 | background-color: transparent; 22 | border-left: 1px solid #563d7c; 23 | } 24 | nav[data-toggle='toc'] .nav > .active > a, 25 | nav[data-toggle='toc'] .nav > .active:hover > a, 26 | nav[data-toggle='toc'] .nav > .active:focus > a { 27 | padding-left: 18px; 28 | font-weight: bold; 29 | color: #563d7c; 30 | background-color: transparent; 31 | border-left: 2px solid #563d7c; 32 | } 33 | 34 | /* Nav: second level (shown on .active) */ 35 | nav[data-toggle='toc'] .nav .nav { 36 | display: none; /* Hide by default, but at >768px, show it */ 37 | padding-bottom: 10px; 38 | } 39 | nav[data-toggle='toc'] .nav .nav > li > a { 40 | padding-top: 1px; 41 | padding-bottom: 1px; 42 | padding-left: 30px; 43 | font-size: 12px; 44 | font-weight: normal; 45 | } 46 | nav[data-toggle='toc'] .nav .nav > li > a:hover, 47 | nav[data-toggle='toc'] .nav .nav > li > a:focus { 48 | padding-left: 29px; 49 | } 50 | nav[data-toggle='toc'] .nav .nav > .active > a, 51 | nav[data-toggle='toc'] .nav .nav > .active:hover > a, 52 | 
nav[data-toggle='toc'] .nav .nav > .active:focus > a { 53 | padding-left: 28px; 54 | font-weight: 500; 55 | } 56 | 57 | /* from https://github.com/twbs/bootstrap/blob/e38f066d8c203c3e032da0ff23cd2d6098ee2dd6/docs/assets/css/src/docs.css#L631-L634 */ 58 | nav[data-toggle='toc'] .nav > .active > ul { 59 | display: block; 60 | } 61 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.js: -------------------------------------------------------------------------------- 1 | /*! 2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | (function() { 6 | 'use strict'; 7 | 8 | window.Toc = { 9 | helpers: { 10 | // return all matching elements in the set, or their descendants 11 | findOrFilter: function($el, selector) { 12 | // http://danielnouri.org/notes/2011/03/14/a-jquery-find-that-also-finds-the-root-element/ 13 | // http://stackoverflow.com/a/12731439/358804 14 | var $descendants = $el.find(selector); 15 | return $el.filter(selector).add($descendants).filter(':not([data-toc-skip])'); 16 | }, 17 | 18 | generateUniqueIdBase: function(el) { 19 | var text = $(el).text(); 20 | var anchor = text.trim().toLowerCase().replace(/[^A-Za-z0-9]+/g, '-'); 21 | return anchor || el.tagName.toLowerCase(); 22 | }, 23 | 24 | generateUniqueId: function(el) { 25 | var anchorBase = this.generateUniqueIdBase(el); 26 | for (var i = 0; ; i++) { 27 | var anchor = anchorBase; 28 | if (i > 0) { 29 | // add suffix 30 | anchor += '-' + i; 31 | } 32 | // check if ID already exists 33 | if (!document.getElementById(anchor)) { 34 | return anchor; 35 | } 36 | } 37 | }, 38 | 39 | generateAnchor: function(el) { 40 | if (el.id) { 41 | return el.id; 42 | } else { 43 | var anchor = this.generateUniqueId(el); 44 | el.id = anchor; 45 | return anchor; 46 | } 47 | }, 48 | 49 | createNavList: function() { 50 | 
return $(''); 51 | }, 52 | 53 | createChildNavList: function($parent) { 54 | var $childList = this.createNavList(); 55 | $parent.append($childList); 56 | return $childList; 57 | }, 58 | 59 | generateNavEl: function(anchor, text) { 60 | var $a = $(''); 61 | $a.attr('href', '#' + anchor); 62 | $a.text(text); 63 | var $li = $('
  • '); 64 | $li.append($a); 65 | return $li; 66 | }, 67 | 68 | generateNavItem: function(headingEl) { 69 | var anchor = this.generateAnchor(headingEl); 70 | var $heading = $(headingEl); 71 | var text = $heading.data('toc-text') || $heading.text(); 72 | return this.generateNavEl(anchor, text); 73 | }, 74 | 75 | // Find the first heading level (`

    `, then `

    `, etc.) that has more than one element. Defaults to 1 (for `

    `). 76 | getTopLevel: function($scope) { 77 | for (var i = 1; i <= 6; i++) { 78 | var $headings = this.findOrFilter($scope, 'h' + i); 79 | if ($headings.length > 1) { 80 | return i; 81 | } 82 | } 83 | 84 | return 1; 85 | }, 86 | 87 | // returns the elements for the top level, and the next below it 88 | getHeadings: function($scope, topLevel) { 89 | var topSelector = 'h' + topLevel; 90 | 91 | var secondaryLevel = topLevel + 1; 92 | var secondarySelector = 'h' + secondaryLevel; 93 | 94 | return this.findOrFilter($scope, topSelector + ',' + secondarySelector); 95 | }, 96 | 97 | getNavLevel: function(el) { 98 | return parseInt(el.tagName.charAt(1), 10); 99 | }, 100 | 101 | populateNav: function($topContext, topLevel, $headings) { 102 | var $context = $topContext; 103 | var $prevNav; 104 | 105 | var helpers = this; 106 | $headings.each(function(i, el) { 107 | var $newNav = helpers.generateNavItem(el); 108 | var navLevel = helpers.getNavLevel(el); 109 | 110 | // determine the proper $context 111 | if (navLevel === topLevel) { 112 | // use top level 113 | $context = $topContext; 114 | } else if ($prevNav && $context === $topContext) { 115 | // create a new level of the tree and switch to it 116 | $context = helpers.createChildNavList($prevNav); 117 | } // else use the current $context 118 | 119 | $context.append($newNav); 120 | 121 | $prevNav = $newNav; 122 | }); 123 | }, 124 | 125 | parseOps: function(arg) { 126 | var opts; 127 | if (arg.jquery) { 128 | opts = { 129 | $nav: arg 130 | }; 131 | } else { 132 | opts = arg; 133 | } 134 | opts.$scope = opts.$scope || $(document.body); 135 | return opts; 136 | } 137 | }, 138 | 139 | // accepts a jQuery object, or an options object 140 | init: function(opts) { 141 | opts = this.helpers.parseOps(opts); 142 | 143 | // ensure that the data attribute is in place for styling 144 | opts.$nav.attr('data-toggle', 'toc'); 145 | 146 | var $topContext = this.helpers.createChildNavList(opts.$nav); 147 | var topLevel = 
this.helpers.getTopLevel(opts.$scope); 148 | var $headings = this.helpers.getHeadings(opts.$scope, topLevel); 149 | this.helpers.populateNav($topContext, topLevel, $headings); 150 | } 151 | }; 152 | 153 | $(function() { 154 | $('nav[data-toggle="toc"]').each(function(i, el) { 155 | var $nav = $(el); 156 | Toc.init($nav); 157 | }); 158 | }); 159 | })(); 160 | -------------------------------------------------------------------------------- /docs/docsearch.css: -------------------------------------------------------------------------------- 1 | /* Docsearch -------------------------------------------------------------- */ 2 | /* 3 | Source: https://github.com/algolia/docsearch/ 4 | License: MIT 5 | */ 6 | 7 | .algolia-autocomplete { 8 | display: block; 9 | -webkit-box-flex: 1; 10 | -ms-flex: 1; 11 | flex: 1 12 | } 13 | 14 | .algolia-autocomplete .ds-dropdown-menu { 15 | width: 100%; 16 | min-width: none; 17 | max-width: none; 18 | padding: .75rem 0; 19 | background-color: #fff; 20 | background-clip: padding-box; 21 | border: 1px solid rgba(0, 0, 0, .1); 22 | box-shadow: 0 .5rem 1rem rgba(0, 0, 0, .175); 23 | } 24 | 25 | @media (min-width:768px) { 26 | .algolia-autocomplete .ds-dropdown-menu { 27 | width: 175% 28 | } 29 | } 30 | 31 | .algolia-autocomplete .ds-dropdown-menu::before { 32 | display: none 33 | } 34 | 35 | .algolia-autocomplete .ds-dropdown-menu [class^=ds-dataset-] { 36 | padding: 0; 37 | background-color: rgb(255,255,255); 38 | border: 0; 39 | max-height: 80vh; 40 | } 41 | 42 | .algolia-autocomplete .ds-dropdown-menu .ds-suggestions { 43 | margin-top: 0 44 | } 45 | 46 | .algolia-autocomplete .algolia-docsearch-suggestion { 47 | padding: 0; 48 | overflow: visible 49 | } 50 | 51 | .algolia-autocomplete .algolia-docsearch-suggestion--category-header { 52 | padding: .125rem 1rem; 53 | margin-top: 0; 54 | font-size: 1.3em; 55 | font-weight: 500; 56 | color: #00008B; 57 | border-bottom: 0 58 | } 59 | 60 | .algolia-autocomplete 
.algolia-docsearch-suggestion--wrapper { 61 | float: none; 62 | padding-top: 0 63 | } 64 | 65 | .algolia-autocomplete .algolia-docsearch-suggestion--subcategory-column { 66 | float: none; 67 | width: auto; 68 | padding: 0; 69 | text-align: left 70 | } 71 | 72 | .algolia-autocomplete .algolia-docsearch-suggestion--content { 73 | float: none; 74 | width: auto; 75 | padding: 0 76 | } 77 | 78 | .algolia-autocomplete .algolia-docsearch-suggestion--content::before { 79 | display: none 80 | } 81 | 82 | .algolia-autocomplete .ds-suggestion:not(:first-child) .algolia-docsearch-suggestion--category-header { 83 | padding-top: .75rem; 84 | margin-top: .75rem; 85 | border-top: 1px solid rgba(0, 0, 0, .1) 86 | } 87 | 88 | .algolia-autocomplete .ds-suggestion .algolia-docsearch-suggestion--subcategory-column { 89 | display: block; 90 | padding: .1rem 1rem; 91 | margin-bottom: 0.1; 92 | font-size: 1.0em; 93 | font-weight: 400 94 | /* display: none */ 95 | } 96 | 97 | .algolia-autocomplete .algolia-docsearch-suggestion--title { 98 | display: block; 99 | padding: .25rem 1rem; 100 | margin-bottom: 0; 101 | font-size: 0.9em; 102 | font-weight: 400 103 | } 104 | 105 | .algolia-autocomplete .algolia-docsearch-suggestion--text { 106 | padding: 0 1rem .5rem; 107 | margin-top: -.25rem; 108 | font-size: 0.8em; 109 | font-weight: 400; 110 | line-height: 1.25 111 | } 112 | 113 | .algolia-autocomplete .algolia-docsearch-footer { 114 | width: 110px; 115 | height: 20px; 116 | z-index: 3; 117 | margin-top: 10.66667px; 118 | float: right; 119 | font-size: 0; 120 | line-height: 0; 121 | } 122 | 123 | .algolia-autocomplete .algolia-docsearch-footer--logo { 124 | background-image: url("data:image/svg+xml;utf8,"); 125 | background-repeat: no-repeat; 126 | background-position: 50%; 127 | background-size: 100%; 128 | overflow: hidden; 129 | text-indent: -9000px; 130 | width: 100%; 131 | height: 100%; 132 | display: block; 133 | transform: translate(-8px); 134 | } 135 | 136 | .algolia-autocomplete 
.algolia-docsearch-suggestion--highlight { 137 | color: #FF8C00; 138 | background: rgba(232, 189, 54, 0.1) 139 | } 140 | 141 | 142 | .algolia-autocomplete .algolia-docsearch-suggestion--text .algolia-docsearch-suggestion--highlight { 143 | box-shadow: inset 0 -2px 0 0 rgba(105, 105, 105, .5) 144 | } 145 | 146 | .algolia-autocomplete .ds-suggestion.ds-cursor .algolia-docsearch-suggestion--content { 147 | background-color: rgba(192, 192, 192, .15) 148 | } 149 | -------------------------------------------------------------------------------- /docs/docsearch.js: -------------------------------------------------------------------------------- 1 | $(function() { 2 | 3 | // register a handler to move the focus to the search bar 4 | // upon pressing shift + "/" (i.e. "?") 5 | $(document).on('keydown', function(e) { 6 | if (e.shiftKey && e.keyCode == 191) { 7 | e.preventDefault(); 8 | $("#search-input").focus(); 9 | } 10 | }); 11 | 12 | $(document).ready(function() { 13 | // do keyword highlighting 14 | /* modified from https://jsfiddle.net/julmot/bL6bb5oo/ */ 15 | var mark = function() { 16 | 17 | var referrer = document.URL ; 18 | var paramKey = "q" ; 19 | 20 | if (referrer.indexOf("?") !== -1) { 21 | var qs = referrer.substr(referrer.indexOf('?') + 1); 22 | var qs_noanchor = qs.split('#')[0]; 23 | var qsa = qs_noanchor.split('&'); 24 | var keyword = ""; 25 | 26 | for (var i = 0; i < qsa.length; i++) { 27 | var currentParam = qsa[i].split('='); 28 | 29 | if (currentParam.length !== 2) { 30 | continue; 31 | } 32 | 33 | if (currentParam[0] == paramKey) { 34 | keyword = decodeURIComponent(currentParam[1].replace(/\+/g, "%20")); 35 | } 36 | } 37 | 38 | if (keyword !== "") { 39 | $(".contents").unmark({ 40 | done: function() { 41 | $(".contents").mark(keyword); 42 | } 43 | }); 44 | } 45 | } 46 | }; 47 | 48 | mark(); 49 | }); 50 | }); 51 | 52 | /* Search term highlighting ------------------------------*/ 53 | 54 | function matchedWords(hit) { 55 | var words = []; 56 | 57 | var 
hierarchy = hit._highlightResult.hierarchy; 58 | // loop to fetch from lvl0, lvl1, etc. 59 | for (var idx in hierarchy) { 60 | words = words.concat(hierarchy[idx].matchedWords); 61 | } 62 | 63 | var content = hit._highlightResult.content; 64 | if (content) { 65 | words = words.concat(content.matchedWords); 66 | } 67 | 68 | // return unique words 69 | var words_uniq = [...new Set(words)]; 70 | return words_uniq; 71 | } 72 | 73 | function updateHitURL(hit) { 74 | 75 | var words = matchedWords(hit); 76 | var url = ""; 77 | 78 | if (hit.anchor) { 79 | url = hit.url_without_anchor + '?q=' + escape(words.join(" ")) + '#' + hit.anchor; 80 | } else { 81 | url = hit.url + '?q=' + escape(words.join(" ")); 82 | } 83 | 84 | return url; 85 | } 86 | -------------------------------------------------------------------------------- /docs/link.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 8 | 12 | 13 | -------------------------------------------------------------------------------- /docs/news/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Changelog • sfarrow 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 60 | 61 |
    62 |
    63 | 117 | 118 | 119 | 120 |
    121 | 122 |
    123 |
    124 | 128 | 129 |
    130 |

    131 | sfarrow 0.4.1 2021-10-27 132 |

    133 |
      134 |
    • Cleaning examples to remove reverse dependency check errors in arrow (reported by @jonkeane).
    • 135 |
    136 |
    137 |
    138 |

    139 | sfarrow 0.4.0 2021-06-21 140 |

    141 |
      142 |
    • New find_geom parameter in read_sf_dataset() adds any geometry columns to the arrow_dplyr_query. The default is FALSE, consistent with previous behaviour.

    • 143 |
    • Cleaning documentation and preparing for CRAN submission

    • 144 |
    145 |
    146 |
    147 |

    148 | sfarrow 0.3.0 Unreleased 149 |

    150 | 154 |
    155 |
    156 |

    157 | sfarrow 0.2.0 Unreleased 158 |

    159 |
      160 |
    • New write_sf_dataset() and read_sf_dataset() to handle partitioned datasets. These also work with dplyr grouped variables to define partitions.

    • 161 |
    • New vignettes added for documentation of all functions.

    • 162 |
    163 |
    164 |
    165 |
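The partitioned write/read workflow these functions enable might look like the following sketch (the region grouping column is invented for illustration):

```r
library(dplyr)  # provides %>% and group_by()/filter()

nc <- sf::st_read(system.file("shape/nc.shp", package = "sf"), quiet = TRUE)

# an invented grouping column to define the partitions
nc$region <- ifelse(nc$AREA > 0.1, "large", "small")

tf <- tempfile()
# dplyr group variables become partition folders (region=large/, region=small/)
nc %>%
  dplyr::group_by(region) %>%
  sfarrow::write_sf_dataset(path = tf)

# read one partition back via a dplyr query on the Arrow dataset
nc_large <- arrow::open_dataset(tf) %>%
  dplyr::filter(region == "large") %>%
  sfarrow::read_sf_dataset()
```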

    166 | sfarrow 0.1.1 Unreleased 167 |

    168 |
      169 |
    • 170 | st_write_parquet() now warns users that the geo metadata format may change.
    • 171 |
    172 |
    173 |
    174 |

    175 | sfarrow 0.1.0 Unreleased 176 |

    177 |
      178 |
    • This is the initial release of sfarrow.
    • 179 |
    180 |
    181 |
    182 | 183 | 188 | 189 |
    190 | 191 | 192 |
    193 | 196 | 197 |
    198 |

    Site built with pkgdown 1.6.1.

    199 |
    200 | 201 |
    202 |
    203 | 204 | 205 | 206 | 207 | 208 | 209 | 210 | 211 | -------------------------------------------------------------------------------- /docs/pkgdown.css: -------------------------------------------------------------------------------- 1 | /* Sticky footer */ 2 | 3 | /** 4 | * Basic idea: https://philipwalton.github.io/solved-by-flexbox/demos/sticky-footer/ 5 | * Details: https://github.com/philipwalton/solved-by-flexbox/blob/master/assets/css/components/site.css 6 | * 7 | * .Site -> body > .container 8 | * .Site-content -> body > .container .row 9 | * .footer -> footer 10 | * 11 | * Key idea seems to be to ensure that .container and __all its parents__ 12 | * have height set to 100% 13 | * 14 | */ 15 | 16 | html, body { 17 | height: 100%; 18 | } 19 | 20 | body { 21 | position: relative; 22 | } 23 | 24 | body > .container { 25 | display: flex; 26 | height: 100%; 27 | flex-direction: column; 28 | } 29 | 30 | body > .container .row { 31 | flex: 1 0 auto; 32 | } 33 | 34 | footer { 35 | margin-top: 45px; 36 | padding: 35px 0 36px; 37 | border-top: 1px solid #e5e5e5; 38 | color: #666; 39 | display: flex; 40 | flex-shrink: 0; 41 | } 42 | footer p { 43 | margin-bottom: 0; 44 | } 45 | footer div { 46 | flex: 1; 47 | } 48 | footer .pkgdown { 49 | text-align: right; 50 | } 51 | footer p { 52 | margin-bottom: 0; 53 | } 54 | 55 | img.icon { 56 | float: right; 57 | } 58 | 59 | img { 60 | max-width: 100%; 61 | } 62 | 63 | /* Fix bug in bootstrap (only seen in firefox) */ 64 | summary { 65 | display: list-item; 66 | } 67 | 68 | /* Typographic tweaking ---------------------------------*/ 69 | 70 | .contents .page-header { 71 | margin-top: calc(-60px + 1em); 72 | } 73 | 74 | dd { 75 | margin-left: 3em; 76 | } 77 | 78 | /* Section anchors ---------------------------------*/ 79 | 80 | a.anchor { 81 | margin-left: -30px; 82 | display:inline-block; 83 | width: 30px; 84 | height: 30px; 85 | visibility: hidden; 86 | 87 | background-image: url(./link.svg); 88 | background-repeat: 
no-repeat; 89 | background-size: 20px 20px; 90 | background-position: center center; 91 | } 92 | 93 | .hasAnchor:hover a.anchor { 94 | visibility: visible; 95 | } 96 | 97 | @media (max-width: 767px) { 98 | .hasAnchor:hover a.anchor { 99 | visibility: hidden; 100 | } 101 | } 102 | 103 | 104 | /* Fixes for fixed navbar --------------------------*/ 105 | 106 | .contents h1, .contents h2, .contents h3, .contents h4 { 107 | padding-top: 60px; 108 | margin-top: -40px; 109 | } 110 | 111 | /* Navbar submenu --------------------------*/ 112 | 113 | .dropdown-submenu { 114 | position: relative; 115 | } 116 | 117 | .dropdown-submenu>.dropdown-menu { 118 | top: 0; 119 | left: 100%; 120 | margin-top: -6px; 121 | margin-left: -1px; 122 | border-radius: 0 6px 6px 6px; 123 | } 124 | 125 | .dropdown-submenu:hover>.dropdown-menu { 126 | display: block; 127 | } 128 | 129 | .dropdown-submenu>a:after { 130 | display: block; 131 | content: " "; 132 | float: right; 133 | width: 0; 134 | height: 0; 135 | border-color: transparent; 136 | border-style: solid; 137 | border-width: 5px 0 5px 5px; 138 | border-left-color: #cccccc; 139 | margin-top: 5px; 140 | margin-right: -10px; 141 | } 142 | 143 | .dropdown-submenu:hover>a:after { 144 | border-left-color: #ffffff; 145 | } 146 | 147 | .dropdown-submenu.pull-left { 148 | float: none; 149 | } 150 | 151 | .dropdown-submenu.pull-left>.dropdown-menu { 152 | left: -100%; 153 | margin-left: 10px; 154 | border-radius: 6px 0 6px 6px; 155 | } 156 | 157 | /* Sidebar --------------------------*/ 158 | 159 | #pkgdown-sidebar { 160 | margin-top: 30px; 161 | position: -webkit-sticky; 162 | position: sticky; 163 | top: 70px; 164 | } 165 | 166 | #pkgdown-sidebar h2 { 167 | font-size: 1.5em; 168 | margin-top: 1em; 169 | } 170 | 171 | #pkgdown-sidebar h2:first-child { 172 | margin-top: 0; 173 | } 174 | 175 | #pkgdown-sidebar .list-unstyled li { 176 | margin-bottom: 0.5em; 177 | } 178 | 179 | /* bootstrap-toc tweaks 
------------------------------------------------------*/ 180 | 181 | /* All levels of nav */ 182 | 183 | nav[data-toggle='toc'] .nav > li > a { 184 | padding: 4px 20px 4px 6px; 185 | font-size: 1.5rem; 186 | font-weight: 400; 187 | color: inherit; 188 | } 189 | 190 | nav[data-toggle='toc'] .nav > li > a:hover, 191 | nav[data-toggle='toc'] .nav > li > a:focus { 192 | padding-left: 5px; 193 | color: inherit; 194 | border-left: 1px solid #878787; 195 | } 196 | 197 | nav[data-toggle='toc'] .nav > .active > a, 198 | nav[data-toggle='toc'] .nav > .active:hover > a, 199 | nav[data-toggle='toc'] .nav > .active:focus > a { 200 | padding-left: 5px; 201 | font-size: 1.5rem; 202 | font-weight: 400; 203 | color: inherit; 204 | border-left: 2px solid #878787; 205 | } 206 | 207 | /* Nav: second level (shown on .active) */ 208 | 209 | nav[data-toggle='toc'] .nav .nav { 210 | display: none; /* Hide by default, but at >768px, show it */ 211 | padding-bottom: 10px; 212 | } 213 | 214 | nav[data-toggle='toc'] .nav .nav > li > a { 215 | padding-left: 16px; 216 | font-size: 1.35rem; 217 | } 218 | 219 | nav[data-toggle='toc'] .nav .nav > li > a:hover, 220 | nav[data-toggle='toc'] .nav .nav > li > a:focus { 221 | padding-left: 15px; 222 | } 223 | 224 | nav[data-toggle='toc'] .nav .nav > .active > a, 225 | nav[data-toggle='toc'] .nav .nav > .active:hover > a, 226 | nav[data-toggle='toc'] .nav .nav > .active:focus > a { 227 | padding-left: 15px; 228 | font-weight: 500; 229 | font-size: 1.35rem; 230 | } 231 | 232 | /* orcid ------------------------------------------------------------------- */ 233 | 234 | .orcid { 235 | font-size: 16px; 236 | color: #A6CE39; 237 | /* margins are required by official ORCID trademark and display guidelines */ 238 | margin-left:4px; 239 | margin-right:4px; 240 | vertical-align: middle; 241 | } 242 | 243 | /* Reference index & topics ----------------------------------------------- */ 244 | 245 | .ref-index th {font-weight: normal;} 246 | 247 | .ref-index td 
{vertical-align: top; min-width: 100px} 248 | .ref-index .icon {width: 40px;} 249 | .ref-index .alias {width: 40%;} 250 | .ref-index-icons .alias {width: calc(40% - 40px);} 251 | .ref-index .title {width: 60%;} 252 | 253 | .ref-arguments th {text-align: right; padding-right: 10px;} 254 | .ref-arguments th, .ref-arguments td {vertical-align: top; min-width: 100px} 255 | .ref-arguments .name {width: 20%;} 256 | .ref-arguments .desc {width: 80%;} 257 | 258 | /* Nice scrolling for wide elements --------------------------------------- */ 259 | 260 | table { 261 | display: block; 262 | overflow: auto; 263 | } 264 | 265 | /* Syntax highlighting ---------------------------------------------------- */ 266 | 267 | pre { 268 | word-wrap: normal; 269 | word-break: normal; 270 | border: 1px solid #eee; 271 | } 272 | 273 | pre, code { 274 | background-color: #f8f8f8; 275 | color: #333; 276 | } 277 | 278 | pre code { 279 | overflow: auto; 280 | word-wrap: normal; 281 | white-space: pre; 282 | } 283 | 284 | pre .img { 285 | margin: 5px 0; 286 | } 287 | 288 | pre .img img { 289 | background-color: #fff; 290 | display: block; 291 | height: auto; 292 | } 293 | 294 | code a, pre a { 295 | color: #375f84; 296 | } 297 | 298 | a.sourceLine:hover { 299 | text-decoration: none; 300 | } 301 | 302 | .fl {color: #1514b5;} 303 | .fu {color: #000000;} /* function */ 304 | .ch,.st {color: #036a07;} /* string */ 305 | .kw {color: #264D66;} /* keyword */ 306 | .co {color: #888888;} /* comment */ 307 | 308 | .message { color: black; font-weight: bolder;} 309 | .error { color: orange; font-weight: bolder;} 310 | .warning { color: #6A0366; font-weight: bolder;} 311 | 312 | /* Clipboard --------------------------*/ 313 | 314 | .hasCopyButton { 315 | position: relative; 316 | } 317 | 318 | .btn-copy-ex { 319 | position: absolute; 320 | right: 0; 321 | top: 0; 322 | visibility: hidden; 323 | } 324 | 325 | .hasCopyButton:hover button.btn-copy-ex { 326 | visibility: visible; 327 | } 328 | 329 | /* 
headroom.js ------------------------ */ 330 | 331 | .headroom { 332 | will-change: transform; 333 | transition: transform 200ms linear; 334 | } 335 | .headroom--pinned { 336 | transform: translateY(0%); 337 | } 338 | .headroom--unpinned { 339 | transform: translateY(-100%); 340 | } 341 | 342 | /* mark.js ----------------------------*/ 343 | 344 | mark { 345 | background-color: rgba(255, 255, 51, 0.5); 346 | border-bottom: 2px solid rgba(255, 153, 51, 0.3); 347 | padding: 1px; 348 | } 349 | 350 | /* vertical spacing after htmlwidgets */ 351 | .html-widget { 352 | margin-bottom: 10px; 353 | } 354 | 355 | /* fontawesome ------------------------ */ 356 | 357 | .fab { 358 | font-family: "Font Awesome 5 Brands" !important; 359 | } 360 | 361 | /* don't display links in code chunks when printing */ 362 | /* source: https://stackoverflow.com/a/10781533 */ 363 | @media print { 364 | code a:link:after, code a:visited:after { 365 | content: ""; 366 | } 367 | } 368 | -------------------------------------------------------------------------------- /docs/pkgdown.js: -------------------------------------------------------------------------------- 1 | /* http://gregfranko.com/blog/jquery-best-practices/ */ 2 | (function($) { 3 | $(function() { 4 | 5 | $('.navbar-fixed-top').headroom(); 6 | 7 | $('body').css('padding-top', $('.navbar').height() + 10); 8 | $(window).resize(function(){ 9 | $('body').css('padding-top', $('.navbar').height() + 10); 10 | }); 11 | 12 | $('[data-toggle="tooltip"]').tooltip(); 13 | 14 | var cur_path = paths(location.pathname); 15 | var links = $("#navbar ul li a"); 16 | var max_length = -1; 17 | var pos = -1; 18 | for (var i = 0; i < links.length; i++) { 19 | if (links[i].getAttribute("href") === "#") 20 | continue; 21 | // Ignore external links 22 | if (links[i].host !== location.host) 23 | continue; 24 | 25 | var nav_path = paths(links[i].pathname); 26 | 27 | var length = prefix_length(nav_path, cur_path); 28 | if (length > max_length) { 29 | max_length = 
length; 30 | pos = i; 31 | } 32 | } 33 | 34 | // Add class to parent
  • , and enclosing
  • if in dropdown 35 | if (pos >= 0) { 36 | var menu_anchor = $(links[pos]); 37 | menu_anchor.parent().addClass("active"); 38 | menu_anchor.closest("li.dropdown").addClass("active"); 39 | } 40 | }); 41 | 42 | function paths(pathname) { 43 | var pieces = pathname.split("/"); 44 | pieces.shift(); // always starts with / 45 | 46 | var end = pieces[pieces.length - 1]; 47 | if (end === "index.html" || end === "") 48 | pieces.pop(); 49 | return(pieces); 50 | } 51 | 52 | // Returns -1 if not found 53 | function prefix_length(needle, haystack) { 54 | if (needle.length > haystack.length) 55 | return(-1); 56 | 57 | // Special case for length-0 haystack, since for loop won't run 58 | if (haystack.length === 0) { 59 | return(needle.length === 0 ? 0 : -1); 60 | } 61 | 62 | for (var i = 0; i < haystack.length; i++) { 63 | if (needle[i] != haystack[i]) 64 | return(i); 65 | } 66 | 67 | return(haystack.length); 68 | } 69 | 70 | /* Clipboard --------------------------*/ 71 | 72 | function changeTooltipMessage(element, msg) { 73 | var tooltipOriginalTitle=element.getAttribute('data-original-title'); 74 | element.setAttribute('data-original-title', msg); 75 | $(element).tooltip('show'); 76 | element.setAttribute('data-original-title', tooltipOriginalTitle); 77 | } 78 | 79 | if(ClipboardJS.isSupported()) { 80 | $(document).ready(function() { 81 | var copyButton = ""; 82 | 83 | $(".examples, div.sourceCode").addClass("hasCopyButton"); 84 | 85 | // Insert copy buttons: 86 | $(copyButton).prependTo(".hasCopyButton"); 87 | 88 | // Initialize tooltips: 89 | $('.btn-copy-ex').tooltip({container: 'body'}); 90 | 91 | // Initialize clipboard: 92 | var clipboardBtnCopies = new ClipboardJS('[data-clipboard-copy]', { 93 | text: function(trigger) { 94 | return trigger.parentNode.textContent; 95 | } 96 | }); 97 | 98 | clipboardBtnCopies.on('success', function(e) { 99 | changeTooltipMessage(e.trigger, 'Copied!'); 100 | e.clearSelection(); 101 | }); 102 | 103 | clipboardBtnCopies.on('error', 
function() { 104 | changeTooltipMessage(e.trigger,'Press Ctrl+C or Command+C to copy'); 105 | }); 106 | }); 107 | } 108 | })(window.jQuery || window.$) 109 | -------------------------------------------------------------------------------- /docs/pkgdown.yml: -------------------------------------------------------------------------------- 1 | pandoc: 2.11.4 2 | pkgdown: 1.6.1 3 | pkgdown_sha: ~ 4 | articles: 5 | example_sfarrow: example_sfarrow.html 6 | last_built: 2021-10-28T10:21Z 7 | 8 | -------------------------------------------------------------------------------- /docs/reference/Rplot001.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/reference/Rplot001.png -------------------------------------------------------------------------------- /docs/reference/arrow_to_sf.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Helper function to convert 'data.frame' to sf — arrow_to_sf • sfarrow 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 61 | 62 |
    63 |
    64 | 118 | 119 | 120 | 121 |
    122 | 123 |
    124 |
    125 | 130 | 131 |
    132 |

    Helper function to convert 'data.frame' to sf

    133 |
    134 | 135 |
    arrow_to_sf(tbl, metadata)
    136 | 137 |

    Arguments

    138 | 139 | 140 | 141 | 142 | 143 | 144 | 145 | 146 | 147 | 148 |
    tbl

    data.frame from reading an Arrow dataset

    metadata

    list of validated geo metadata

    149 | 150 |

    Value

    151 | 152 |

    object of class sf with CRS and geometry columns

    153 | 154 |
    155 | 160 |
    161 | 162 | 163 |
    164 | 167 | 168 |
    169 |


    170 |
    171 | 172 |
    173 |
    174 | 175 | 176 | 177 | 178 | 179 | 180 | 181 | 182 | -------------------------------------------------------------------------------- /docs/reference/create_metadata.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Create standardised geo metadata for Parquet files — create_metadata • sfarrow 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 61 | 62 |
    63 |
    64 | 118 | 119 | 120 | 121 |
    122 | 123 |
    124 |
    125 | 130 | 131 |
    132 |

    Create standardised geo metadata for Parquet files

    133 |
    134 | 135 |
    create_metadata(df)
    136 | 137 |

    Arguments

    138 | 139 | 140 | 141 | 142 | 143 | 144 |
    df

    object of class sf

    145 | 146 |

    Value

    147 | 148 |

    JSON formatted list with geo-metadata

    149 |

    Details

    150 | 151 |

    Reference for metadata standard: 152 | https://github.com/geopandas/geo-arrow-spec. This is compatible with 153 | GeoPandas Parquet files.

    154 | 155 |
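A sketch of the metadata layout this standard describes, expressed as an R list (the field names follow version 0.1.0 of the geo-arrow-spec; the CRS and bounding-box values here are purely illustrative):

```r
# illustrative structure mirroring the geopandas/geo-arrow-spec layout
geo_metadata <- list(
  primary_column = "geometry",
  columns = list(
    geometry = list(
      crs = "EPSG:4326",            # CRS stored as text (illustrative value)
      encoding = "WKB",             # geometries serialised as well-known binary
      bbox = c(-180, -90, 180, 90)  # illustrative extent
    )
  ),
  schema_version = "0.1.0",
  creator = list(library = "sfarrow")
)

# serialised to JSON for storage in the Parquet/Feather file metadata
jsonlite::toJSON(geo_metadata, auto_unbox = TRUE)
```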
    156 | 161 |
    162 | 163 | 164 |
    165 | 168 | 169 |
    170 |


    171 |
    172 | 173 |
    174 |
    175 | 176 | 177 | 178 | 179 | 180 | 181 | 182 | 183 | -------------------------------------------------------------------------------- /docs/reference/encode_wkb.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Convert sfc geometry columns into a WKB binary format — encode_wkb • sfarrow 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 61 | 62 |
    63 |
    64 | 118 | 119 | 120 | 121 |
    122 | 123 |
    124 |
    125 | 130 | 131 |
    132 |

    Convert sfc geometry columns into a WKB binary format

    133 |
    134 | 135 |
    encode_wkb(df)
    136 | 137 |

    Arguments

    138 | 139 | 140 | 141 | 142 | 143 | 144 |
    df

    sf object

    145 | 146 |

    Value

    147 | 148 |

    data.frame with binary geometry column(s)

    149 |

    Details

    150 | 151 |

    Allows for more than one geometry column in sfc format

    152 | 153 |
    154 | 159 |
    160 | 161 | 162 |
    163 | 166 | 167 |
    168 |


    169 |
    170 | 171 |
    172 |
    173 | 174 | 175 | 176 | 177 | 178 | 179 | 180 | 181 | -------------------------------------------------------------------------------- /docs/reference/figures/REAsDME-unnamed-chunk-2-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/reference/figures/REAsDME-unnamed-chunk-2-1.png -------------------------------------------------------------------------------- /docs/reference/figures/REAsDME-unnamed-chunk-3-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/reference/figures/REAsDME-unnamed-chunk-3-1.png -------------------------------------------------------------------------------- /docs/reference/figures/REAsDME-unnamed-chunk-4-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/reference/figures/REAsDME-unnamed-chunk-4-1.png -------------------------------------------------------------------------------- /docs/reference/figures/REAsDME-unnamed-chunk-5-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/reference/figures/REAsDME-unnamed-chunk-5-1.png -------------------------------------------------------------------------------- /docs/reference/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Function reference • sfarrow 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 60 
| 61 |
    62 |
    63 | 117 | 118 | 119 | 120 |
    121 | 122 |
    123 |
    124 | 127 | 128 | 129 | 130 | 131 | 132 | 133 | 134 | 135 | 136 | 137 | 138 | 142 | 143 | 144 | 145 | 146 | 147 | 148 | 149 | 150 | 153 | 154 | 155 | 156 | 159 | 160 | 161 | 162 | 165 | 166 | 167 | 168 | 171 | 172 | 173 | 174 | 177 | 178 | 179 | 180 | 183 | 184 | 185 | 186 |
    139 |

    All functions

    140 |

    141 |
    151 |

    read_sf_dataset()

    152 |

    Read an Arrow multi-file dataset and create sf object

    157 |

    st_read_feather()

    158 |

    Read a Feather file to sf object

    163 |

    st_read_parquet()

    164 |

    Read a Parquet file to sf object

    169 |

    st_write_feather()

    170 |

    Write sf object to Feather file

    175 |

    st_write_parquet()

    176 |

    Write sf object to Parquet file

    181 |

    write_sf_dataset()

    182 |

    Write sf object to an Arrow multi-file dataset

    187 |
    188 | 189 | 194 |
    195 | 196 | 197 |
    198 | 201 | 202 |
    203 |


    204 |
    205 | 206 |
    207 |
    208 | 209 | 210 | 211 | 212 | 213 | 214 | 215 | 216 | -------------------------------------------------------------------------------- /docs/reference/read_sf_dataset-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/reference/read_sf_dataset-1.png -------------------------------------------------------------------------------- /docs/reference/read_sf_dataset.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Read an Arrow multi-file dataset and create sf object — read_sf_dataset • sfarrow 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 61 | 62 |
    63 |
    64 | 118 | 119 | 120 | 121 |
    122 | 123 |
    124 |
    125 | 130 | 131 |
    132 |

    Read an Arrow multi-file dataset and create sf object

    133 |
    134 | 135 |
    read_sf_dataset(dataset, find_geom = FALSE)
    136 | 137 |

    Arguments

    138 | 139 | 140 | 141 | 142 | 144 | 145 | 146 | 147 | 151 | 152 |
    dataset

    a Dataset object created by arrow::open_dataset 143 | or an arrow_dplyr_query

    find_geom

    logical. Only needed when returning a subset of columns. 148 | Should all available geometry columns be selected and added to the 149 | dataset query without being named? Default is FALSE to require 150 | geometry column(s) to be selected specifically.

    153 | 154 |

    Value

    155 | 156 |

    object of class sf

    157 |

    Details

    158 | 159 |

    This function is primarily for use after opening a dataset with 160 | arrow::open_dataset. Users can then query the arrow Dataset 161 | using dplyr methods such as filter or 162 | select. Passing the resulting query to this function 163 | will parse the dataset and create an sf object. The function 164 | expects consistent geographic metadata to be stored with the dataset in 165 | order to create sf objects.

    166 |

    See also

    167 | 168 | 169 | 170 |

    Examples

    171 |
    # read spatial object 172 | nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE) 173 | 174 | # create random grouping 175 | nc$group <- sample(1:3, nrow(nc), replace = TRUE) 176 | 177 | # use dplyr to group the dataset. %>% also allowed 178 | nc_g <- dplyr::group_by(nc, group) 179 | 180 | # write out to parquet datasets 181 | tf <- tempfile() # create temporary location 182 | on.exit(unlink(tf)) 183 | # partitioning determined by dplyr 'group_vars' 184 | write_sf_dataset(nc_g, path = tf) 185 |
    #> Warning: This is an initial implementation of Parquet/Feather file support and 186 | #> geo metadata. This is tracking version 0.1.0 of the metadata 187 | #> (https://github.com/geopandas/geo-arrow-spec). This metadata 188 | #> specification may change and does not yet make stability promises. We 189 | #> do not yet recommend using this in a production setting unless you are 190 | #> able to rewrite your Parquet/Feather files.
    191 | list.files(tf, recursive = TRUE) 192 |
    #> [1] "group=1/part-0.parquet" "group=2/part-1.parquet" "group=3/part-2.parquet"
    # open parquet files from dataset
    ds <- arrow::open_dataset(tf)

    # create a query. %>% also allowed
    q <- dplyr::filter(ds, group == 1)

    # read the dataset (piping syntax also works)
    nc_d <- read_sf_dataset(dataset = q)

    nc_d
    #> Simple feature collection with 33 features and 15 fields
    #> Geometry type: MULTIPOLYGON
    #> Dimension: XY
    #> Bounding box: xmin: -83.98855 ymin: 33.94867 xmax: -75.45698 ymax: 36.58965
    #> Geodetic CRS: NAD27
    #> First 10 features:
    #>     AREA PERIMETER CNTY_ CNTY_ID       NAME  FIPS FIPSNO CRESS_ID BIR74 SID74
    #> 1  0.114     1.442  1825    1825       Ashe 37009  37009        5  1091     1
    #> 2  0.070     2.968  1831    1831  Currituck 37053  37053       27   508     1
    #> 3  0.124     1.428  1837    1837     Stokes 37169  37169       85  1612     1
    #> 4  0.114     1.352  1838    1838    Caswell 37033  37033       17  1035     2
    #> 5  0.153     1.616  1839    1839 Rockingham 37157  37157       79  4449    16
    #> 6  0.072     1.085  1842    1842      Vance 37181  37181       91  2180     4
    #> 7  0.064     1.213  1892    1892      Avery 37011  37011        6   781     0
    #> 8  0.086     1.267  1893    1893     Yadkin 37197  37197       99  1269     1
    #> 9  0.128     1.554  1897    1897   Franklin 37069  37069       35  1399     2
    #> 10 0.142     1.640  1913    1913       Nash 37127  37127       64  4021     8
    #>    NWBIR74 BIR79 SID79 NWBIR79 group                       geometry
    #> 1       10  1364     0      19     1 MULTIPOLYGON (((-81.47276 3...
    #> 2      123   830     2     145     1 MULTIPOLYGON (((-76.00897 3...
    #> 3      160  2038     5     176     1 MULTIPOLYGON (((-80.02567 3...
    #> 4      550  1253     2     597     1 MULTIPOLYGON (((-79.53051 3...
    #> 5     1243  5386     5    1369     1 MULTIPOLYGON (((-79.53051 3...
    #> 6     1179  2753     6    1492     1 MULTIPOLYGON (((-78.49252 3...
    #> 7        4   977     0       5     1 MULTIPOLYGON (((-81.94135 3...
    #> 8       65  1568     1      76     1 MULTIPOLYGON (((-80.49554 3...
    #> 9      736  1863     0     950     1 MULTIPOLYGON (((-78.25455 3...
    #> 10    1851  5189     7    2274     1 MULTIPOLYGON (((-78.18693 3...
    plot(sf::st_geometry(nc_d))
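    The find_geom argument documented above can be sketched in use. This is a hedged example continuing from the dataset ds created earlier; the column names are those of the nc data, and it assumes read_sf_dataset() behaves as its documentation describes:

    ```r
    # select only a subset of attribute columns; without naming the geometry
    # column, find_geom = TRUE asks read_sf_dataset() to locate any available
    # geometry columns and add them to the query
    q2 <- dplyr::select(ds, NAME, BIR74)
    nc_sub <- read_sf_dataset(dataset = q2, find_geom = TRUE)
    ```

    With the default find_geom = FALSE, the geometry column would instead need to appear explicitly in the select() call.
    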

    Site built with pkgdown 1.6.1.

    -------------------------------------------------------------------------------- /docs/reference/sfarrow.html: --------------------------------------------------------------------------------

    sfarrow: An R package for reading/writing simple feature (sf) objects from/to Arrow parquet/feather files with arrow — sfarrow • sfarrow

    Simple features are a popular, standardised way to create spatial vector data with a list-type geometry column. Parquet files are standard column-oriented files designed by Apache Arrow (https://parquet.apache.org/) for fast read/writes. sfarrow is designed to support the reading and writing of simple features in sf objects from/to Parquet files (.parquet) and Feather files (.feather) within R. A key goal of sfarrow is to support interoperability of spatial data in files between R and Python through the use of standardised metadata.


    Metadata


    Coordinate reference and geometry field information for sf objects are stored in standard metadata tables within the files. The metadata are based on a standard representation (Version 0.1.0, reference: https://github.com/geopandas/geo-arrow-spec). This is compatible with the format used by the Python library GeoPandas for read/writing Parquet/Feather files. Note to users: this metadata format is not yet stable for production uses and may change in the future.
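    For orientation, the "geo" metadata carried in a file's key-value store is a small JSON object. The sketch below expresses its approximate version 0.1.0 layout as an R list; the field names follow the geo-arrow-spec draft and may differ in detail, and the CRS string is a placeholder rather than a real value:

    ```r
    # approximate shape of the geo metadata (geo-arrow-spec, version 0.1.0)
    geo_meta <- list(
      primary_column = "geometry",            # name of the main geometry column
      columns = list(
        geometry = list(
          crs = "<WKT or proj4 CRS string>",  # coordinate reference system
          encoding = "WKB"                    # geometries stored as well-known binary
        )
      )
    )
    ```

    Because GeoPandas writes the same structure, a file produced by either library can be read by the other.
    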


    Credits


    This work was undertaken by Chris Jochem, a member of the WorldPop Research Group at the University of Southampton (https://www.worldpop.org/).



    -------------------------------------------------------------------------------- /docs/reference/st_read_feather-1.png: --------------------------------------------------------------------------------
    https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/reference/st_read_feather-1.png
    -------------------------------------------------------------------------------- /docs/reference/st_read_feather.html: --------------------------------------------------------------------------------

    Read a Feather file to sf object — st_read_feather • sfarrow

    Read a Feather file. Uses standard metadata information to identify geometry columns and coordinate reference system information.

    st_read_feather(dsn, col_select = NULL, ...)

    Arguments

    dsn

    character file path to a data source

    col_select

    A character vector of column names to keep. Default is NULL which returns all columns

    ...

    additional parameters to pass to FeatherReader


    Value


    object of class sf


    Details


    Reference for the metadata used: https://github.com/geopandas/geo-arrow-spec. These are standard with the Python GeoPandas library.


    See also


    Examples

    # load Natural Earth low-res dataset.
    # Created in Python with GeoPandas.to_feather()
    path <- system.file("extdata", package = "sfarrow")

    world <- st_read_feather(file.path(path, "world.feather"))

    world
    #> Simple feature collection with 177 features and 5 fields
    #> Geometry type: GEOMETRY
    #> Dimension: XY
    #> Bounding box: xmin: -180 ymin: -90 xmax: 180 ymax: 83.64513
    #> Geodetic CRS: WGS 84
    #> First 10 features:
    #>      pop_est     continent                     name iso_a3 gdp_md_est
    #> 1     920938       Oceania                     Fiji    FJI  8.374e+03
    #> 2   53950935        Africa                 Tanzania    TZA  1.506e+05
    #> 3     603253        Africa                W. Sahara    ESH  9.065e+02
    #> 4   35623680 North America                   Canada    CAN  1.674e+06
    #> 5  326625791 North America United States of America    USA  1.856e+07
    #> 6   18556698          Asia               Kazakhstan    KAZ  4.607e+05
    #> 7   29748859          Asia               Uzbekistan    UZB  2.023e+05
    #> 8    6909701       Oceania         Papua New Guinea    PNG  2.802e+04
    #> 9  260580739          Asia                Indonesia    IDN  3.028e+06
    #> 10  44293293 South America                Argentina    ARG  8.794e+05
    #>                          geometry
    #> 1  MULTIPOLYGON (((180 -16.067...
    #> 2  POLYGON ((33.90371 -0.95, 3...
    #> 3  POLYGON ((-8.66559 27.65643...
    #> 4  MULTIPOLYGON (((-122.84 49,...
    #> 5  MULTIPOLYGON (((-122.84 49,...
    #> 6  POLYGON ((87.35997 49.21498...
    #> 7  POLYGON ((55.96819 41.30864...
    #> 8  MULTIPOLYGON (((141.0002 -2...
    #> 9  MULTIPOLYGON (((141.0002 -2...
    #> 10 MULTIPOLYGON (((-68.63401 -...
    plot(sf::st_geometry(world))


    -------------------------------------------------------------------------------- /docs/reference/st_read_parquet-1.png: --------------------------------------------------------------------------------
    https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/reference/st_read_parquet-1.png
    -------------------------------------------------------------------------------- /docs/reference/st_read_parquet.html: --------------------------------------------------------------------------------

    Read a Parquet file to sf object — st_read_parquet • sfarrow

    Read a Parquet file. Uses standard metadata information to identify geometry columns and coordinate reference system information.

    st_read_parquet(dsn, col_select = NULL, props = NULL, ...)

    Arguments

    dsn

    character file path to a data source

    col_select

    A character vector of column names to keep. Default is NULL which returns all columns

    props

    Now deprecated in read_parquet.

    ...

    additional parameters to pass to ParquetFileReader


    Value


    object of class sf


    Details


    Reference for the metadata used: https://github.com/geopandas/geo-arrow-spec. These are standard with the Python GeoPandas library.


    See also


    Examples

    # load Natural Earth low-res dataset.
    # Created in Python with GeoPandas.to_parquet()
    path <- system.file("extdata", package = "sfarrow")

    world <- st_read_parquet(file.path(path, "world.parquet"))

    world
    #> Simple feature collection with 177 features and 5 fields
    #> Geometry type: GEOMETRY
    #> Dimension: XY
    #> Bounding box: xmin: -180 ymin: -90 xmax: 180 ymax: 83.64513
    #> Geodetic CRS: WGS 84
    #> First 10 features:
    #>      pop_est     continent                     name iso_a3 gdp_md_est
    #> 1     920938       Oceania                     Fiji    FJI  8.374e+03
    #> 2   53950935        Africa                 Tanzania    TZA  1.506e+05
    #> 3     603253        Africa                W. Sahara    ESH  9.065e+02
    #> 4   35623680 North America                   Canada    CAN  1.674e+06
    #> 5  326625791 North America United States of America    USA  1.856e+07
    #> 6   18556698          Asia               Kazakhstan    KAZ  4.607e+05
    #> 7   29748859          Asia               Uzbekistan    UZB  2.023e+05
    #> 8    6909701       Oceania         Papua New Guinea    PNG  2.802e+04
    #> 9  260580739          Asia                Indonesia    IDN  3.028e+06
    #> 10  44293293 South America                Argentina    ARG  8.794e+05
    #>                          geometry
    #> 1  MULTIPOLYGON (((180 -16.067...
    #> 2  POLYGON ((33.90371 -0.95, 3...
    #> 3  POLYGON ((-8.66559 27.65643...
    #> 4  MULTIPOLYGON (((-122.84 49,...
    #> 5  MULTIPOLYGON (((-122.84 49,...
    #> 6  POLYGON ((87.35997 49.21498...
    #> 7  POLYGON ((55.96819 41.30864...
    #> 8  MULTIPOLYGON (((141.0002 -2...
    #> 9  MULTIPOLYGON (((141.0002 -2...
    #> 10 MULTIPOLYGON (((-68.63401 -...
    plot(sf::st_geometry(world))


    -------------------------------------------------------------------------------- /docs/reference/st_write_feather.html: --------------------------------------------------------------------------------

    Write sf object to Feather file — st_write_feather • sfarrow

    Convert a simple features spatial object from sf and write to a Feather file using write_feather. Geometry columns (type sfc) are converted to well-known binary (WKB) format.

    st_write_feather(obj, dsn, ...)

    Arguments

    obj

    object of class sf

    dsn

    data source name. A path and file name with .feather extension

    ...

    additional options to pass to write_feather


    Value


    obj invisibly


    See also


    Examples

    # read spatial object
    nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE)

    # create temp file
    tf <- tempfile(fileext = '.feather')
    on.exit(unlink(tf))

    # write out object
    st_write_feather(obj = nc, dsn = tf)
    #> Warning: This is an initial implementation of Parquet/Feather file support and
    #> geo metadata. This is tracking version 0.1.0 of the metadata
    #> (https://github.com/geopandas/geo-arrow-spec). This metadata
    #> specification may change and does not yet make stability promises. We
    #> do not yet recommend using this in a production setting unless you are
    #> able to rewrite your Parquet/Feather files.
    # In Python, read the new file with geopandas.read_feather(...)
    # read back into R
    nc_f <- st_read_feather(tf)


    -------------------------------------------------------------------------------- /docs/reference/st_write_parquet.html: --------------------------------------------------------------------------------

    Write sf object to Parquet file — st_write_parquet • sfarrow

    Convert a simple features spatial object from sf and write to a Parquet file using write_parquet. Geometry columns (type sfc) are converted to well-known binary (WKB) format.

    st_write_parquet(obj, dsn, ...)

    Arguments

    obj

    object of class sf

    dsn

    data source name. A path and file name with .parquet extension

    ...

    additional options to pass to write_parquet


    Value


    obj invisibly


    See also


    Examples

    # read spatial object
    nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE)

    # create temp file
    tf <- tempfile(fileext = '.parquet')
    on.exit(unlink(tf))

    # write out object
    st_write_parquet(obj = nc, dsn = tf)
    #> Warning: This is an initial implementation of Parquet/Feather file support and
    #> geo metadata. This is tracking version 0.1.0 of the metadata
    #> (https://github.com/geopandas/geo-arrow-spec). This metadata
    #> specification may change and does not yet make stability promises. We
    #> do not yet recommend using this in a production setting unless you are
    #> able to rewrite your Parquet/Feather files.
    # In Python, read the new file with geopandas.read_parquet(...)
    # read back into R
    nc_p <- st_read_parquet(tf)


    -------------------------------------------------------------------------------- /docs/reference/validate_metadata.html: --------------------------------------------------------------------------------

    Basic checking of key geo metadata columns — validate_metadata • sfarrow

    Basic checking of key geo metadata columns

    validate_metadata(metadata)

    Arguments

    metadata

    list for geo metadata


    Value


    None. Throws an error and stops execution
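    For illustration only, a call might look like the sketch below. This is a hypothetical example: the exact fields checked are internal to the package, and the list here merely mimics the geo-arrow-spec metadata layout with a placeholder CRS string.

    ```r
    # hypothetical metadata list in the geo-arrow-spec style;
    # validate_metadata() stops with an error if required entries are missing
    md <- list(
      primary_column = "geometry",
      columns = list(
        geometry = list(crs = "<CRS string>", encoding = "WKB")
      )
    )
    validate_metadata(md)  # no return value when the metadata pass the checks
    ```
    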

    -------------------------------------------------------------------------------- /docs/reference/write_sf_dataset-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/docs/reference/write_sf_dataset-1.png -------------------------------------------------------------------------------- /inst/extdata/ds/split1=1/split2=1/part-3.parquet: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/inst/extdata/ds/split1=1/split2=1/part-3.parquet -------------------------------------------------------------------------------- /inst/extdata/ds/split1=1/split2=2/part-0.parquet: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/inst/extdata/ds/split1=1/split2=2/part-0.parquet -------------------------------------------------------------------------------- /inst/extdata/ds/split1=2/split2=1/part-1.parquet: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/inst/extdata/ds/split1=2/split2=1/part-1.parquet -------------------------------------------------------------------------------- /inst/extdata/ds/split1=2/split2=2/part-5.parquet: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/inst/extdata/ds/split1=2/split2=2/part-5.parquet -------------------------------------------------------------------------------- /inst/extdata/ds/split1=3/split2=1/part-2.parquet:
    -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/inst/extdata/ds/split1=3/split2=1/part-2.parquet -------------------------------------------------------------------------------- /inst/extdata/ds/split1=3/split2=2/part-4.parquet: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/inst/extdata/ds/split1=3/split2=2/part-4.parquet -------------------------------------------------------------------------------- /inst/extdata/world.feather: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/inst/extdata/world.feather -------------------------------------------------------------------------------- /inst/extdata/world.parquet: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/inst/extdata/world.parquet -------------------------------------------------------------------------------- /man/arrow_to_sf.Rd: --------------------------------------------------------------------------------
    % Generated by roxygen2: do not edit by hand
    % Please edit documentation in R/st_arrow.R
    \name{arrow_to_sf}
    \alias{arrow_to_sf}
    \title{Helper function to convert 'data.frame' to \code{sf}}
    \usage{
    arrow_to_sf(tbl, metadata)
    }
    \arguments{
    \item{tbl}{\code{data.frame} from reading an Arrow dataset}

    \item{metadata}{\code{list} of validated geo metadata}
    }
    \value{
    object of \code{sf} with CRS and geometry columns
    }
    \description{
    Helper function to convert 'data.frame' to \code{sf}
    }
    \keyword{internal}
    -------------------------------------------------------------------------------- /man/create_metadata.Rd: --------------------------------------------------------------------------------
    % Generated by roxygen2: do not edit by hand
    % Please edit documentation in R/st_arrow.R
    \name{create_metadata}
    \alias{create_metadata}
    \title{Create standardised geo metadata for Parquet files}
    \usage{
    create_metadata(df)
    }
    \arguments{
    \item{df}{object of class \code{sf}}
    }
    \value{
    JSON formatted list with geo-metadata
    }
    \description{
    Create standardised geo metadata for Parquet files
    }
    \details{
    Reference for metadata standard:
    \url{https://github.com/geopandas/geo-arrow-spec}. This is compatible with
    \code{GeoPandas} Parquet files.
    }
    \keyword{internal}
    -------------------------------------------------------------------------------- /man/encode_wkb.Rd: --------------------------------------------------------------------------------
    % Generated by roxygen2: do not edit by hand
    % Please edit documentation in R/st_arrow.R
    \name{encode_wkb}
    \alias{encode_wkb}
    \title{Convert \code{sfc} geometry columns into a WKB binary format}
    \usage{
    encode_wkb(df)
    }
    \arguments{
    \item{df}{\code{sf} object}
    }
    \value{
    \code{data.frame} with binary geometry column(s)
    }
    \description{
    Convert \code{sfc} geometry columns into a WKB binary format
    }
    \details{
    Allows for more than one geometry column in \code{sfc} format
    }
    \keyword{internal}
    -------------------------------------------------------------------------------- /man/figures/REAsDME-unnamed-chunk-2-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/man/figures/REAsDME-unnamed-chunk-2-1.png
    -------------------------------------------------------------------------------- /man/figures/REAsDME-unnamed-chunk-3-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/man/figures/REAsDME-unnamed-chunk-3-1.png -------------------------------------------------------------------------------- /man/figures/REAsDME-unnamed-chunk-4-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/man/figures/REAsDME-unnamed-chunk-4-1.png -------------------------------------------------------------------------------- /man/figures/REAsDME-unnamed-chunk-5-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/wcjochem/sfarrow/aff5deed42bec078799bc46a43975badb59a05eb/man/figures/REAsDME-unnamed-chunk-5-1.png -------------------------------------------------------------------------------- /man/read_sf_dataset.Rd: --------------------------------------------------------------------------------
    % Generated by roxygen2: do not edit by hand
    % Please edit documentation in R/st_arrow.R
    \name{read_sf_dataset}
    \alias{read_sf_dataset}
    \title{Read an Arrow multi-file dataset and create \code{sf} object}
    \usage{
    read_sf_dataset(dataset, find_geom = FALSE)
    }
    \arguments{
    \item{dataset}{a \code{Dataset} object created by \code{arrow::open_dataset}
    or an \code{arrow_dplyr_query}}

    \item{find_geom}{logical. Only needed when returning a subset of columns.
    Should all available geometry columns be selected and added to the
    dataset query without being named? Default is \code{FALSE} to require
    geometry column(s) to be selected specifically.}
    }
    \value{
    object of class \code{\link[sf]{sf}}
    }
    \description{
    Read an Arrow multi-file dataset and create \code{sf} object
    }
    \details{
    This function is primarily for use after opening a dataset with
    \code{arrow::open_dataset}. Users can then query the \code{arrow Dataset}
    using \code{dplyr} methods such as \code{\link[dplyr]{filter}} or
    \code{\link[dplyr]{select}}. Passing the resulting query to this function
    will parse the datasets and create an \code{sf} object. The function
    expects consistent geographic metadata to be stored with the dataset in
    order to create \code{\link[sf]{sf}} objects.
    }
    \examples{
    # read spatial object
    nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE)

    # create random grouping
    nc$group <- sample(1:3, nrow(nc), replace = TRUE)

    # use dplyr to group the dataset. \%>\% also allowed
    nc_g <- dplyr::group_by(nc, group)

    # write out to parquet datasets
    tf <- tempfile() # create temporary location
    on.exit(unlink(tf))
    # partitioning determined by dplyr 'group_vars'
    write_sf_dataset(nc_g, path = tf)

    list.files(tf, recursive = TRUE)

    # open parquet files from dataset
    ds <- arrow::open_dataset(tf)

    # create a query. \%>\% also allowed
    q <- dplyr::filter(ds, group == 1)

    # read the dataset (piping syntax also works)
    nc_d <- read_sf_dataset(dataset = q)

    nc_d
    plot(sf::st_geometry(nc_d))

    }
    \seealso{
    \code{\link[arrow]{open_dataset}}, \code{\link[sf]{st_read}}, \code{\link{st_read_parquet}}
    }
    -------------------------------------------------------------------------------- /man/sfarrow.Rd: --------------------------------------------------------------------------------
    % Generated by roxygen2: do not edit by hand
    % Please edit documentation in R/sfarrow.R
    \docType{package}
    \name{sfarrow}
    \alias{sfarrow}
    \title{\code{sfarrow}: An R package for reading/writing simple feature (\code{sf})
    objects from/to Arrow parquet/feather files with \code{arrow}}
    \description{
    Simple features are a popular, standardised way to create spatial vector data
    with a list-type geometry column. Parquet files are standard column-oriented
    files designed by Apache Arrow (\url{https://parquet.apache.org/}) for fast
    read/writes. \code{sfarrow} is designed to support the reading and writing of
    simple features in \code{sf} objects from/to Parquet files (.parquet) and
    Feather files (.feather) within \code{R}. A key goal of \code{sfarrow} is to
    support interoperability of spatial data in files between \code{R} and
    \code{Python} through the use of standardised metadata.
    }
    \section{Metadata}{

    Coordinate reference and geometry field information for \code{sf} objects are
    stored in standard metadata tables within the files. The metadata are based
    on a standard representation (Version 0.1.0, reference:
    \url{https://github.com/geopandas/geo-arrow-spec}). This is compatible with
    the format used by the Python library \code{GeoPandas} for read/writing
    Parquet/Feather files.
Note to users: this metadata format is not yet stable 26 | for production uses and may change in the future. 27 | } 28 | 29 | \section{Credits}{ 30 | 31 | This work was undertaken by Chris Jochem, a member of the WorldPop Research 32 | Group at the University of Southampton(\url{https://www.worldpop.org/}). 33 | } 34 | 35 | \keyword{internal} 36 | -------------------------------------------------------------------------------- /man/st_read_feather.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/st_arrow.R 3 | \name{st_read_feather} 4 | \alias{st_read_feather} 5 | \title{Read a Feather file to \code{sf} object} 6 | \usage{ 7 | st_read_feather(dsn, col_select = NULL, ...) 8 | } 9 | \arguments{ 10 | \item{dsn}{character file path to a data source} 11 | 12 | \item{col_select}{A character vector of column names to keep. Default is 13 | \code{NULL} which returns all columns} 14 | 15 | \item{...}{additional parameters to pass to 16 | \code{\link[arrow]{FeatherReader}}} 17 | } 18 | \value{ 19 | object of class \code{\link[sf]{sf}} 20 | } 21 | \description{ 22 | Read a Feather file. Uses standard metadata information to 23 | identify geometry columns and coordinate reference system information. 24 | } 25 | \details{ 26 | Reference for the metadata used: 27 | \url{https://github.com/geopandas/geo-arrow-spec}. These are standard with 28 | the Python \code{GeoPandas} library. 29 | } 30 | \examples{ 31 | # load Natural Earth low-res dataset. 
32 | # Created in Python with GeoPandas.to_feather() 33 | path <- system.file("extdata", package = "sfarrow") 34 | 35 | world <- st_read_feather(file.path(path, "world.feather")) 36 | 37 | world 38 | plot(sf::st_geometry(world)) 39 | 40 | } 41 | \seealso{ 42 | \code{\link[arrow]{read_feather}}, \code{\link[sf]{st_read}} 43 | } 44 | -------------------------------------------------------------------------------- /man/st_read_parquet.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/st_arrow.R 3 | \name{st_read_parquet} 4 | \alias{st_read_parquet} 5 | \title{Read a Parquet file to \code{sf} object} 6 | \usage{ 7 | st_read_parquet(dsn, col_select = NULL, props = NULL, ...) 8 | } 9 | \arguments{ 10 | \item{dsn}{character file path to a data source} 11 | 12 | \item{col_select}{A character vector of column names to keep. Default is 13 | \code{NULL} which returns all columns} 14 | 15 | \item{props}{Now deprecated in \code{\link[arrow]{read_parquet}}.} 16 | 17 | \item{...}{additional parameters to pass to 18 | \code{\link[arrow]{ParquetFileReader}}} 19 | } 20 | \value{ 21 | object of class \code{\link[sf]{sf}} 22 | } 23 | \description{ 24 | Read a Parquet file. Uses standard metadata information to 25 | identify geometry columns and coordinate reference system information. 26 | } 27 | \details{ 28 | Reference for the metadata used: 29 | \url{https://github.com/geopandas/geo-arrow-spec}. These are standard with 30 | the Python \code{GeoPandas} library. 31 | } 32 | \examples{ 33 | # load Natural Earth low-res dataset. 
34 | # Created in Python with GeoPandas.to_parquet() 35 | path <- system.file("extdata", package = "sfarrow") 36 | 37 | world <- st_read_parquet(file.path(path, "world.parquet")) 38 | 39 | world 40 | plot(sf::st_geometry(world)) 41 | 42 | } 43 | \seealso{ 44 | \code{\link[arrow]{read_parquet}}, \code{\link[sf]{st_read}} 45 | } 46 | -------------------------------------------------------------------------------- /man/st_write_feather.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/st_arrow.R 3 | \name{st_write_feather} 4 | \alias{st_write_feather} 5 | \title{Write \code{sf} object to Feather file} 6 | \usage{ 7 | st_write_feather(obj, dsn, ...) 8 | } 9 | \arguments{ 10 | \item{obj}{object of class \code{\link[sf]{sf}}} 11 | 12 | \item{dsn}{data source name. A path and file name with .feather extension} 13 | 14 | \item{...}{additional options to pass to \code{\link[arrow]{write_feather}}} 15 | } 16 | \value{ 17 | \code{obj} invisibly 18 | } 19 | \description{ 20 | Convert a simple features spatial object from \code{sf} and 21 | write to a Feather file using \code{\link[arrow]{write_feather}}. Geometry 22 | columns (type \code{sfc}) are converted to well-known binary (WKB) format. 23 | } 24 | \examples{ 25 | # read spatial object 26 | nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE) 27 | 28 | # create temp file 29 | tf <- tempfile(fileext = '.feather') 30 | on.exit(unlink(tf)) 31 | 32 | # write out object 33 | st_write_feather(obj = nc, dsn = tf) 34 | 35 | # In Python, read the new file with geopandas.read_feather(...)
36 | # read back into R 37 | nc_f <- st_read_feather(tf) 38 | 39 | } 40 | \seealso{ 41 | \code{\link[arrow]{write_feather}} 42 | } 43 | -------------------------------------------------------------------------------- /man/st_write_parquet.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/st_arrow.R 3 | \name{st_write_parquet} 4 | \alias{st_write_parquet} 5 | \title{Write \code{sf} object to Parquet file} 6 | \usage{ 7 | st_write_parquet(obj, dsn, ...) 8 | } 9 | \arguments{ 10 | \item{obj}{object of class \code{\link[sf]{sf}}} 11 | 12 | \item{dsn}{data source name. A path and file name with .parquet extension} 13 | 14 | \item{...}{additional options to pass to \code{\link[arrow]{write_parquet}}} 15 | } 16 | \value{ 17 | \code{obj} invisibly 18 | } 19 | \description{ 20 | Convert a simple features spatial object from \code{sf} and 21 | write to a Parquet file using \code{\link[arrow]{write_parquet}}. Geometry 22 | columns (type \code{sfc}) are converted to well-known binary (WKB) format. 23 | } 24 | \examples{ 25 | # read spatial object 26 | nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE) 27 | 28 | # create temp file 29 | tf <- tempfile(fileext = '.parquet') 30 | on.exit(unlink(tf)) 31 | 32 | # write out object 33 | st_write_parquet(obj = nc, dsn = tf) 34 | 35 | # In Python, read the new file with geopandas.read_parquet(...) 
36 | # read back into R 37 | nc_p <- st_read_parquet(tf) 38 | 39 | } 40 | \seealso{ 41 | \code{\link[arrow]{write_parquet}} 42 | } 43 | -------------------------------------------------------------------------------- /man/validate_metadata.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/st_arrow.R 3 | \name{validate_metadata} 4 | \alias{validate_metadata} 5 | \title{Basic checking of key geo metadata columns} 6 | \usage{ 7 | validate_metadata(metadata) 8 | } 9 | \arguments{ 10 | \item{metadata}{list for geo metadata} 11 | } 12 | \value{ 13 | None. Throws an error and stops execution 14 | } 15 | \description{ 16 | Basic checking of key geo metadata columns 17 | } 18 | \keyword{internal} 19 | -------------------------------------------------------------------------------- /man/write_sf_dataset.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/st_arrow.R 3 | \name{write_sf_dataset} 4 | \alias{write_sf_dataset} 5 | \title{Write \code{sf} object to an Arrow multi-file dataset} 6 | \usage{ 7 | write_sf_dataset( 8 | obj, 9 | path, 10 | format = "parquet", 11 | partitioning = dplyr::group_vars(obj), 12 | ... 
13 | ) 14 | } 15 | \arguments{ 16 | \item{obj}{object of class \code{\link[sf]{sf}}} 17 | 18 | \item{path}{string path referencing a directory for the output} 19 | 20 | \item{format}{output file format ("parquet" or "feather")} 21 | 22 | \item{partitioning}{character vector of columns in \code{obj} for grouping or 23 | the \code{dplyr::group_vars}} 24 | 25 | \item{...}{additional arguments and options passed to 26 | \code{arrow::write_dataset}} 27 | } 28 | \value{ 29 | \code{obj} invisibly 30 | } 31 | \description{ 32 | Write \code{sf} object to an Arrow multi-file dataset 33 | } 34 | \details{ 35 | Translate an \code{sf} spatial object to \code{data.frame} with WKB 36 | geometry columns and then write to an \code{arrow} dataset with 37 | partitioning. Allows for \code{dplyr} grouped datasets (using 38 | \code{\link[dplyr]{group_by}}) and uses those variables to define 39 | partitions. 40 | } 41 | \examples{ 42 | # read spatial object 43 | nc <- sf::st_read(system.file("shape/nc.shp", package="sf"), quiet = TRUE) 44 | 45 | # create random grouping 46 | nc$group <- sample(1:3, nrow(nc), replace = TRUE) 47 | 48 | # use dplyr to group the dataset. \%>\% also allowed 49 | nc_g <- dplyr::group_by(nc, group) 50 | 51 | # write out to parquet datasets 52 | tf <- tempfile() # create temporary location 53 | on.exit(unlink(tf)) 54 | # partitioning determined by dplyr 'group_vars' 55 | write_sf_dataset(nc_g, path = tf) 56 | 57 | list.files(tf, recursive = TRUE) 58 | 59 | # open parquet files from dataset 60 | ds <- arrow::open_dataset(tf) 61 | 62 | # create a query. 
\%>\% also allowed 63 | q <- dplyr::filter(ds, group == 1) 64 | 65 | # read the dataset (piping syntax also works) 66 | nc_d <- read_sf_dataset(dataset = q) 67 | 68 | nc_d 69 | plot(sf::st_geometry(nc_d)) 70 | 71 | } 72 | \seealso{ 73 | \code{\link[arrow]{write_dataset}}, \code{\link{st_read_parquet}} 74 | } 75 | -------------------------------------------------------------------------------- /vignettes/.gitignore: -------------------------------------------------------------------------------- 1 | *.html 2 | *.R 3 | -------------------------------------------------------------------------------- /vignettes/example_sfarrow.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Getting started examples" 3 | output: rmarkdown::html_vignette 4 | description: Reading/writing with sfarrow and how it works. 5 | vignette: > 6 | %\VignetteIndexEntry{example_sfarrow} 7 | %\VignetteEngine{knitr::rmarkdown} 8 | %\VignetteEncoding{UTF-8} 9 | --- 10 | 11 | ```{r, include = FALSE} 12 | knitr::opts_chunk$set( 13 | collapse = TRUE, 14 | comment = "#>" 15 | ) 16 | ``` 17 | 18 | `sfarrow` is designed to help read/write spatial vector data in "simple feature" 19 | format from/to Parquet files while maintaining coordinate reference system 20 | information. Essentially, this tool connects `R` objects from 21 | [`sf`](https://r-spatial.github.io/sf/) and 22 | [`arrow`](https://arrow.apache.org/docs/r/), and it relies on those packages for 23 | its internal work. 24 | 25 | A key goal is to support interoperability of spatial data in Parquet files. R 26 | objects (including `sf`) can be written to files with `arrow`; however, these files do 27 | not necessarily maintain the spatial information, nor can they always be read by Python. 28 | `sfarrow` implements a metadata format also used by Python `GeoPandas`, 29 | described here: 30 | [https://github.com/geopandas/geo-arrow-spec](https://github.com/geopandas/geo-arrow-spec).
31 | Note that these metadata are not stable yet, and `sfarrow` will warn you that the format 32 | may change. 33 | 34 | ```{r setup} 35 | # install from CRAN with install.packages('sfarrow') 36 | # or install the development version with devtools::install_github("wcjochem/sfarrow@main") 37 | # load the library 38 | library(sfarrow) 39 | library(dplyr, warn.conflicts = FALSE) 40 | ``` 41 | 42 | ## Reading and writing single files 43 | 44 | A Parquet file (with `.parquet` extension) can be read by pointing `st_read_parquet()` 45 | at a file path. This creates an `sf` spatial data object in 46 | memory, which can then be used as normal with functions from `sf`. 47 | 48 | ```{r} 49 | # read an example dataset created from Python using geopandas 50 | world <- st_read_parquet(system.file("extdata", "world.parquet", package = "sfarrow")) 51 | 52 | class(world) 53 | world 54 | plot(sf::st_geometry(world)) 55 | ``` 56 | 57 | Similarly, a Parquet file can be written from an `sf` object using 58 | `st_write_parquet()` and specifying a path to the new file. Non-spatial objects 59 | cannot be written with `sfarrow`, and users should instead use `arrow`. 60 | 61 | ```{r} 62 | # output the file to a new location 63 | # note the warning about possible future changes in metadata. 64 | st_write_parquet(world, dsn = file.path(tempdir(), "new_world.parquet")) 65 | ``` 66 | 67 | ## Partitioned datasets 68 | 69 | While reading/writing a Parquet file is nice, the real power of `arrow` comes 70 | from splitting big datasets into multiple files, or partitions, based on 71 | criteria that make it faster to query. There is currently basic support in 72 | `sfarrow` for multi-file spatial datasets. For additional dataset querying 73 | options, see the `arrow` 74 | [documentation](https://arrow.apache.org/docs/r/articles/dataset.html). 75 | 76 | ### Querying and reading Datasets 77 | `sfarrow` accesses `arrow`'s `dplyr` interface to explore partitioned Arrow 78 | datasets.
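That interface follows one general pattern: open a dataset with `arrow`, build a query with `dplyr` verbs, and collect the result into an `sf` object with `read_sf_dataset()`. A minimal sketch of the pattern (not run here; the path and column name are placeholders, not real data shipped with the package):

```{r, eval=FALSE}
# sketch of the query workflow (placeholder path and column name)
ds <- arrow::open_dataset("path/to/partitioned/dataset")

subset_sf <- ds %>%
  dplyr::filter(part_var == "some_value") %>% # prune partitions by value
  read_sf_dataset()                           # collect the query into sf
```

A concrete, runnable version of this workflow using the example dataset bundled with `sfarrow` follows below.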
79 | 80 | For this example we will use a dataset that was created by randomly splitting 81 | the nc.shp file into three groups and then further partitioning each group into two 82 | more random groups. This creates a nested set of files. 83 | 84 | ```{r} 85 | list.files(system.file("extdata", "ds", package = "sfarrow"), recursive = TRUE) 86 | ``` 87 | 88 | The file tree shows that the data were partitioned by the variables 89 | "split1" and "split2". Those are the column names that were used for the random 90 | splits. This partitioning is in ["Hive style"](https://hive.apache.org/), where 91 | the partitioning variables are in the paths. 92 | 93 | The first step is to open the Dataset using `arrow`. 94 | 95 | ```{r} 96 | ds <- arrow::open_dataset(system.file("extdata", "ds", package="sfarrow")) 97 | ``` 98 | 99 | For small datasets (as in the example) we can read the entire set of files into 100 | an `sf` object. 101 | 102 | ```{r} 103 | nc_ds <- read_sf_dataset(ds) 104 | 105 | nc_ds 106 | ``` 107 | 108 | With large datasets, we will more often want to query them and return a reduced set 109 | of the partitioned records. To create a query, the easiest way is to use 110 | `dplyr::filter()` on the partitioning (and/or other) variables to subset the 111 | rows and `dplyr::select()` to subset the columns. `read_sf_dataset()` will then 112 | take the `arrow_dplyr_query`, call `dplyr::collect()` to extract the records, and 113 | process the resulting Arrow Table into `sf`. 114 | 115 | ```{r, tidy=FALSE} 116 | nc_d12 <- ds %>% 117 | filter(split1 == 1, split2 == 2) %>% 118 | read_sf_dataset() 119 | 120 | nc_d12 121 | plot(sf::st_geometry(nc_d12), col="grey") 122 | ``` 123 | 124 | When using `select()` to read only a subset of columns, if the geometry column 125 | is not returned, the default behaviour of `sfarrow` is to throw an error from 126 | `read_sf_dataset`. If you do not need the geometry column for your analyses, 127 | then using `arrow` and not `sfarrow` should be sufficient.
However, setting 128 | `find_geom = TRUE` in `read_sf_dataset` will read in any geometry columns in the 129 | metadata, in addition to the selected columns. 130 | 131 | ```{r} 132 | # this command will throw an error 133 | # no geometry column selected for read_sf_dataset 134 | # nc_sub <- ds %>% 135 | # select('FIPS') %>% # subset of columns 136 | # read_sf_dataset() 137 | 138 | # set find_geom 139 | nc_sub <- ds %>% 140 | select('FIPS') %>% # subset of columns 141 | read_sf_dataset(find_geom = TRUE) 142 | 143 | nc_sub 144 | ``` 145 | 146 | 147 | ### Writing to Datasets 148 | 149 | To write an `sf` object into multiple files, we can again construct a query 150 | using `dplyr::group_by()` to define the partitioning variables. The result is 151 | then passed to `sfarrow`. 152 | 153 | ```{r, tidy=FALSE} 154 | world %>% 155 | group_by(continent) %>% 156 | write_sf_dataset(file.path(tempdir(), "world_ds"), 157 | format = "parquet", 158 | hive_style = FALSE) 159 | ``` 160 | 161 | In this example we are not using Hive style. This results in the partitioning 162 | variable not being in the folder paths. 163 | 164 | ```{r} 165 | list.files(file.path(tempdir(), "world_ds")) 166 | ``` 167 | 168 | To read this style of Dataset, we must specify the partitioning variables when 169 | it is opened. 170 | 171 | ```{r, tidy=FALSE} 172 | arrow::open_dataset(file.path(tempdir(), "world_ds"), 173 | partitioning = "continent") %>% 174 | filter(continent == "Africa") %>% 175 | read_sf_dataset() 176 | ``` 177 | 178 | --------------------------------------------------------------------------------