} argument modifier which
8 | indicates that the argument uses \strong{data masking}, a sub-type of
9 | tidy evaluation. If you've never heard of tidy evaluation before,
10 | start with the practical introduction in
11 | \url{https://r4ds.hadley.nz/functions.html#data-frame-functions},
12 | then read more about the underlying theory in
13 | \url{https://rlang.r-lib.org/reference/topic-data-mask.html}.
14 | }
15 | \section{Key techniques}{
16 | \itemize{
17 | \item To allow the user to supply the column name in a function argument,
18 | embrace the argument, e.g. \code{filter(df, {{ var }})}.
19 |
20 | \if{html}{\out{}}\preformatted{dist_summary <- function(df, var) \{
21 | df \%>\%
22 | summarise(n = n(), min = min(\{\{ var \}\}), max = max(\{\{ var \}\}))
23 | \}
24 | mtcars \%>\% dist_summary(mpg)
25 | mtcars \%>\% group_by(cyl) \%>\% dist_summary(mpg)
26 | }\if{html}{\out{
}}
27 | \item To work with a column name recorded as a string, use the \code{.data}
28 | pronoun, e.g. \code{summarise(df, mean = mean(.data[[var]]))}.
29 |
30 | \if{html}{\out{}}\preformatted{for (var in names(mtcars)) \{
31 | mtcars \%>\% count(.data[[var]]) \%>\% print()
32 | \}
33 |
34 | lapply(names(mtcars), function(var) mtcars \%>\% count(.data[[var]]))
35 | }\if{html}{\out{
}}
36 | \item To suppress \verb{R CMD check} \code{NOTE}s about unknown variables
37 | use \code{.data$var} instead of \code{var}:
38 |
39 | \if{html}{\out{}}\preformatted{# has NOTE
40 | df \%>\% mutate(z = x + y)
41 |
42 | # no NOTE
43 | df \%>\% mutate(z = .data$x + .data$y)
44 | }\if{html}{\out{
}}
45 |
46 | You'll also need to import \code{.data} from rlang with (e.g.)
47 | \verb{@importFrom rlang .data}.
48 | }
49 | }
50 |
51 | \section{Dot-dot-dot (...)}{
52 | \code{...} automatically provides indirection, so you can use it as is
53 | (i.e. without embracing) inside a function:
54 |
55 | \if{html}{\out{}}\preformatted{grouped_mean <- function(df, var, ...) \{
56 | df \%>\%
57 | group_by(...) \%>\%
58 | summarise(mean = mean(\{\{ var \}\}))
59 | \}
60 | }\if{html}{\out{
}}
61 |
62 | You can also use \verb{:=} instead of \code{=} to enable a glue-like syntax for
63 | creating variables from user supplied data:
64 |
65 | \if{html}{\out{}}\preformatted{var_name <- "l100km"
66 | mtcars \%>\% mutate("\{var_name\}" := 235 / mpg)
67 |
68 | summarise_mean <- function(df, var) \{
69 | df \%>\%
70 | summarise("mean_of_\{\{var\}\}" := mean(\{\{ var \}\}))
71 | \}
72 | mtcars \%>\% group_by(cyl) \%>\% summarise_mean(mpg)
73 | }\if{html}{\out{
}}
74 |
75 | Learn more in \url{https://rlang.r-lib.org/reference/topic-data-mask-programming.html}.
76 | }
77 |
78 | \keyword{internal}
79 |
--------------------------------------------------------------------------------
/man/tidyr_legacy.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/utils.R
3 | \name{tidyr_legacy}
4 | \alias{tidyr_legacy}
5 | \title{Legacy name repair}
6 | \usage{
7 | tidyr_legacy(nms, prefix = "V", sep = "")
8 | }
9 | \arguments{
10 | \item{nms}{Character vector of names}
11 |
12 | \item{prefix}{Prefix to use for unnamed column}
13 |
14 | \item{sep}{Separator to use between name and unique suffix}
15 | }
16 | \description{
17 | Ensures all column names are unique using the approach found in
18 | tidyr 0.8.3 and earlier. Only use this function if you want to preserve
19 | the naming strategy; otherwise you're better off adopting the new
20 | tidyverse standard with \code{names_repair = "universal"}.
21 | }
22 | \examples{
23 | df <- tibble(x = 1:2, y = list(tibble(x = 3:5), tibble(x = 4:7)))
24 |
25 | # Doesn't work because it would produce a data frame with two
26 | # columns called x
27 | \dontrun{
28 | unnest(df, y)
29 | }
30 |
31 | # The new tidyverse standard:
32 | unnest(df, y, names_repair = "universal")
33 |
34 | # The old tidyr approach
35 | unnest(df, y, names_repair = tidyr_legacy)
36 | }
37 | \keyword{internal}
38 |
--------------------------------------------------------------------------------
/man/tidyr_tidy_select.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/doc-params.R
3 | \name{tidyr_tidy_select}
4 | \alias{tidyr_tidy_select}
5 | \title{Argument type: tidy-select}
6 | \description{
7 | This page describes the \verb{<tidy-select>} argument modifier which
8 | indicates that the argument uses \strong{tidy selection}, a sub-type of
9 | tidy evaluation. If you've never heard of tidy evaluation before,
10 | start with the practical introduction in
11 | \url{https://r4ds.hadley.nz/functions.html#data-frame-functions},
12 | then read more about the underlying theory in
13 | \url{https://rlang.r-lib.org/reference/topic-data-mask.html}.
14 | }
15 | \section{Overview of selection features}{
16 | tidyselect implements a DSL for selecting variables. It provides helpers
17 | for selecting variables:
18 | \itemize{
19 | \item \code{var1:var10}: variables lying between \code{var1} on the left and \code{var10} on the right.
20 | }
21 | \itemize{
22 | \item \code{\link[tidyselect:starts_with]{starts_with("a")}}: names that start with \code{"a"}.
23 | \item \code{\link[tidyselect:starts_with]{ends_with("z")}}: names that end with \code{"z"}.
24 | \item \code{\link[tidyselect:starts_with]{contains("b")}}: names that contain \code{"b"}.
25 | \item \code{\link[tidyselect:starts_with]{matches("x.y")}}: names that match regular expression \code{x.y}.
26 | \item \code{\link[tidyselect:starts_with]{num_range(x, 1:4)}}: names following the pattern, \code{x1}, \code{x2}, ..., \code{x4}.
27 | \item \code{\link[tidyselect:all_of]{all_of(vars)}}/\code{\link[tidyselect:all_of]{any_of(vars)}}:
28 | matches names stored in the character vector \code{vars}. \code{all_of(vars)} will
29 | error if the variables aren't present; \code{any_of(vars)} will match just the
30 | variables that exist.
31 | \item \code{\link[tidyselect:everything]{everything()}}: all variables.
32 | \item \code{\link[tidyselect:everything]{last_col()}}: furthest column on the right.
33 | \item \code{\link[tidyselect:where]{where(is.numeric)}}: all variables where
34 | \code{is.numeric()} returns \code{TRUE}.
35 | }
36 |
37 | As well as operators for combining those selections:
38 | \itemize{
39 | \item \code{!selection}: only variables that don't match \code{selection}.
40 | \item \code{selection1 & selection2}: only variables included in both \code{selection1} and \code{selection2}.
41 | \item \code{selection1 | selection2}: all variables that match either \code{selection1} or \code{selection2}.
42 | }
43 | }
44 |
45 | \section{Key techniques}{
46 | \itemize{
47 | \item If you want the user to supply a tidyselect specification in a
48 | function argument, you need to tunnel the selection through the function
49 | argument. This is done by embracing the function argument \code{{{ }}},
50 | e.g. \code{unnest(df, {{ vars }})}.
51 | \item If you have a character vector of column names, use \code{all_of()}
52 | or \code{any_of()}, depending on whether or not you want unknown variable
53 | names to cause an error, e.g. \code{unnest(df, all_of(vars))},
54 | \code{unnest(df, !any_of(vars))}.
55 | \item To suppress \verb{R CMD check} \code{NOTE}s about unknown variables use \code{"var"}
56 | instead of \code{var}:
57 | }
58 |
59 | \if{html}{\out{}}\preformatted{# has NOTE
60 | df \%>\% select(x, y, z)
61 |
62 | # no NOTE
63 | df \%>\% select("x", "y", "z")
64 | }\if{html}{\out{
}}
65 | }
66 |
67 | \keyword{internal}
68 |
--------------------------------------------------------------------------------
/man/uncount.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/uncount.R
3 | \name{uncount}
4 | \alias{uncount}
5 | \title{"Uncount" a data frame}
6 | \usage{
7 | uncount(data, weights, ..., .remove = TRUE, .id = NULL)
8 | }
9 | \arguments{
10 | \item{data}{A data frame, tibble, or grouped tibble.}
11 |
12 | \item{weights}{A vector of weights. Evaluated in the context of \code{data};
13 | supports quasiquotation.}
14 |
15 | \item{...}{Additional arguments passed on to methods.}
16 |
17 | \item{.remove}{If \code{TRUE}, and \code{weights} is the name of a column in \code{data},
18 | then this column is removed.}
19 |
20 | \item{.id}{Supply a string to create a new variable which gives a unique
21 | identifier for each created row.}
22 | }
23 | \description{
24 | Performs the opposite operation to \code{\link[dplyr:count]{dplyr::count()}}, duplicating rows
25 | according to a weighting variable (or expression).
26 | }
27 | \examples{
28 | df <- tibble(x = c("a", "b"), n = c(1, 2))
29 | uncount(df, n)
30 | uncount(df, n, .id = "id")
31 |
32 | # You can also use constants
33 | uncount(df, 2)
34 |
35 | # Or expressions
36 | uncount(df, 2 / n)
37 | }
38 |
--------------------------------------------------------------------------------
/man/unite.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/unite.R
3 | \name{unite}
4 | \alias{unite}
5 | \title{Unite multiple columns into one by pasting strings together}
6 | \usage{
7 | unite(data, col, ..., sep = "_", remove = TRUE, na.rm = FALSE)
8 | }
9 | \arguments{
10 | \item{data}{A data frame.}
11 |
12 | \item{col}{The name of the new column, as a string or symbol.
13 |
14 | This argument is passed by expression and supports
15 | \link[rlang:topic-inject]{quasiquotation} (you can unquote strings
16 | and symbols). The name is captured from the expression with
17 | \code{\link[rlang:defusing-advanced]{rlang::ensym()}} (note that this kind of interface where
18 | symbols do not represent actual objects is now discouraged in the
19 | tidyverse; we support it here for backward compatibility).}
20 |
21 | \item{...}{<\code{\link[=tidyr_tidy_select]{tidy-select}}> Columns to unite}
22 |
23 | \item{sep}{Separator to use between values.}
24 |
25 | \item{remove}{If \code{TRUE}, remove input columns from output data frame.}
26 |
27 | \item{na.rm}{If \code{TRUE}, missing values will be removed prior to uniting
28 | each value.}
29 | }
30 | \description{
31 | Convenience function to paste together multiple columns into one.
32 | }
33 | \examples{
34 | df <- expand_grid(x = c("a", NA), y = c("b", NA))
35 | df
36 |
37 | df \%>\% unite("z", x:y, remove = FALSE)
38 | # To remove missing values:
39 | df \%>\% unite("z", x:y, na.rm = TRUE, remove = FALSE)
40 |
41 | # Separate is almost the complement of unite
42 | df \%>\%
43 | unite("xy", x:y) \%>\%
44 | separate(xy, c("x", "y"))
45 | # (but note that `x` and `y` now contain "NA", not NA)
46 | }
47 | \seealso{
48 | \code{\link[=separate]{separate()}}, the complement.
49 | }
50 |
--------------------------------------------------------------------------------
/man/unnest_auto.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/unnest-auto.R
3 | \name{unnest_auto}
4 | \alias{unnest_auto}
5 | \title{Automatically call \code{unnest_wider()} or \code{unnest_longer()}}
6 | \usage{
7 | unnest_auto(data, col)
8 | }
9 | \arguments{
10 | \item{data}{A data frame.}
11 |
12 | \item{col}{<\code{\link[=tidyr_tidy_select]{tidy-select}}> List-column to unnest.}
13 | }
14 | \description{
15 | \code{unnest_auto()} picks between \code{unnest_wider()} and \code{unnest_longer()}
16 | by inspecting the inner names of the list-col:
17 | \itemize{
18 | \item If all elements are unnamed, it uses
19 | \code{unnest_longer(indices_include = FALSE)}.
20 | \item If all elements are named, and there's at least one name in
21 | common across all components, it uses \code{unnest_wider()}.
22 | \item Otherwise, it falls back to \code{unnest_longer(indices_include = TRUE)}.
23 | }
24 |
25 | It's handy for very rapid interactive exploration but I don't recommend
26 | using it in scripts, because it will succeed even if the underlying data
27 | radically changes.
28 | }
29 | \keyword{internal}
30 |
--------------------------------------------------------------------------------
/man/us_rent_income.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/data.R
3 | \docType{data}
4 | \name{us_rent_income}
5 | \alias{us_rent_income}
6 | \title{US rent and income data}
7 | \format{
8 | A dataset with variables:
9 | \describe{
10 | \item{GEOID}{FIPS state identifier}
11 | \item{NAME}{Name of state}
12 | \item{variable}{Variable name: income = median yearly income,
13 | rent = median monthly rent}
14 | \item{estimate}{Estimated value}
15 | \item{moe}{90\% margin of error}
16 | }
17 | }
18 | \usage{
19 | us_rent_income
20 | }
21 | \description{
22 | Captured from the 2017 American Community Survey using the tidycensus
23 | package.
24 | }
25 | \keyword{datasets}
26 |
--------------------------------------------------------------------------------
/man/who.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/data.R
3 | \docType{data}
4 | \name{who}
5 | \alias{who}
6 | \alias{who2}
7 | \alias{population}
8 | \title{World Health Organization TB data}
9 | \format{
10 | \subsection{\code{who}}{
11 |
12 | A data frame with 7,240 rows and 60 columns:
13 | \describe{
14 | \item{country}{Country name}
15 | \item{iso2, iso3}{2 & 3 letter ISO country codes}
16 | \item{year}{Year}
17 | \item{new_sp_m014 - new_rel_f65}{Counts of new TB cases recorded by group.
18 | Column names encode three variables that describe the group.}
19 | }
20 | }
21 |
22 | \subsection{\code{who2}}{
23 |
24 | A data frame with 7,240 rows and 58 columns.
25 | }
26 |
27 | \subsection{\code{population}}{
28 |
29 | A data frame with 4,060 rows and three columns:
30 | \describe{
31 | \item{country}{Country name}
32 | \item{year}{Year}
33 | \item{population}{Population}
34 | }
35 | }
36 | }
37 | \source{
38 | \url{https://www.who.int/teams/global-tuberculosis-programme/data}
39 | }
40 | \usage{
41 | who
42 |
43 | who2
44 |
45 | population
46 | }
47 | \description{
48 | A subset of data from the World Health Organization Global Tuberculosis
49 | Report, and accompanying global populations. \code{who} uses the original
50 | codes from the World Health Organization. The column names for columns
51 | 5 through 60 are made by combining \code{new_} with:
52 | \itemize{
53 | \item the method of diagnosis (\code{rel} = relapse, \code{sn} = negative pulmonary
54 | smear, \code{sp} = positive pulmonary smear, \code{ep} = extrapulmonary),
55 | \item gender (\code{f} = female, \code{m} = male), and
56 | \item age group (\code{014} = 0-14 yrs of age, \code{1524} = 15-24, \code{2534} = 25-34,
57 | \code{3544} = 35-44 years of age, \code{4554} = 45-54, \code{5564} = 55-64,
58 | \code{65} = 65 years or older).
59 | }
60 |
61 | \code{who2} is a lightly modified version that makes teaching the basics
62 | easier by tweaking the variables to be slightly more consistent and
63 | dropping \code{iso2} and \code{iso3}. \code{newrel} is replaced by \code{new_rel}, and a
64 | \verb{_} is added after the gender.
65 | }
66 | \keyword{datasets}
67 |
--------------------------------------------------------------------------------
/man/world_bank_pop.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/data.R
3 | \docType{data}
4 | \name{world_bank_pop}
5 | \alias{world_bank_pop}
6 | \title{Population data from the World Bank}
7 | \format{
8 | A dataset with variables:
9 | \describe{
10 | \item{country}{Three letter country code}
11 | \item{indicator}{Indicator name: \code{SP.POP.GROW} = population growth,
12 | \code{SP.POP.TOTL} = total population, \code{SP.URB.GROW} = urban population
13 | growth, \code{SP.URB.TOTL} = total urban population}
14 | \item{2000-2018}{Value for each year}
15 | }
16 | }
17 | \source{
18 | Dataset from the World Bank data bank: \url{https://data.worldbank.org}
19 | }
20 | \usage{
21 | world_bank_pop
22 | }
23 | \description{
24 | Data about population from the World Bank.
25 | }
26 | \keyword{datasets}
27 |
--------------------------------------------------------------------------------
/pkgdown/favicon/apple-touch-icon-120x120.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/pkgdown/favicon/apple-touch-icon-120x120.png
--------------------------------------------------------------------------------
/pkgdown/favicon/apple-touch-icon-152x152.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/pkgdown/favicon/apple-touch-icon-152x152.png
--------------------------------------------------------------------------------
/pkgdown/favicon/apple-touch-icon-180x180.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/pkgdown/favicon/apple-touch-icon-180x180.png
--------------------------------------------------------------------------------
/pkgdown/favicon/apple-touch-icon-60x60.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/pkgdown/favicon/apple-touch-icon-60x60.png
--------------------------------------------------------------------------------
/pkgdown/favicon/apple-touch-icon-76x76.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/pkgdown/favicon/apple-touch-icon-76x76.png
--------------------------------------------------------------------------------
/pkgdown/favicon/apple-touch-icon.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/pkgdown/favicon/apple-touch-icon.png
--------------------------------------------------------------------------------
/pkgdown/favicon/favicon-16x16.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/pkgdown/favicon/favicon-16x16.png
--------------------------------------------------------------------------------
/pkgdown/favicon/favicon-32x32.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/pkgdown/favicon/favicon-32x32.png
--------------------------------------------------------------------------------
/pkgdown/favicon/favicon.ico:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/pkgdown/favicon/favicon.ico
--------------------------------------------------------------------------------
/revdep/.gitignore:
--------------------------------------------------------------------------------
1 | **/
2 | checks
3 | library
4 | checks.noindex
5 | library.noindex
6 | data.sqlite
7 | *.html
8 |
--------------------------------------------------------------------------------
/revdep/cran.md:
--------------------------------------------------------------------------------
1 | ## revdepcheck results
2 |
3 | We checked 1761 reverse dependencies (1742 from CRAN + 19 from Bioconductor), comparing R CMD check results across CRAN and dev versions of this package.
4 |
5 | * We saw 4 new problems
6 | * We failed to check 47 packages
7 |
8 | Issues with CRAN packages are summarised below.
9 |
10 | ### New problems
11 | (This reports the first line of each new failure)
12 |
13 | * faux
14 | checking re-building of vignette outputs ... WARNING
15 |
16 | * ggpubr
17 | checking examples ... ERROR
18 |
19 | * gprofiler2
20 | checking re-building of vignette outputs ... WARNING
21 |
22 | * wpa
23 | checking examples ... ERROR
24 | checking tests ... ERROR
25 |
26 | ### Failed to check
27 |
28 | * afex (NA)
29 | * autoTS (NA)
30 | * bayesnec (NA)
31 | * BayesPostEst (NA)
32 | * beadplexr (NA)
33 | * breathtestcore (NA)
34 | * broom.helpers (NA)
35 | * broom.mixed (NA)
36 | * datawizard (NA)
37 | * embed (NA)
38 | * escalation (NA)
39 | * ESTER (NA)
40 | * FAMetA (NA)
41 | * finnts (NA)
42 | * genekitr (NA)
43 | * ggPMX (NA)
44 | * ggstatsplot (NA)
45 | * healthyR.ai (NA)
46 | * healthyR.ts (NA)
47 | * historicalborrowlong (NA)
48 | * INSPECTumours (NA)
49 | * loon.ggplot (NA)
50 | * marginaleffects (NA)
51 | * modeltime (NA)
52 | * modeltime.ensemble (NA)
53 | * modeltime.gluonts (NA)
54 | * modeltime.h2o (NA)
55 | * modeltime.resample (NA)
56 | * mpower (NA)
57 | * numbat (NA)
58 | * OlinkAnalyze (NA)
59 | * ordbetareg (NA)
60 | * Platypus (NA)
61 | * RBesT (NA)
62 | * rdss (NA)
63 | * Robyn (NA)
64 | * RVA (NA)
65 | * SCpubr (NA)
66 | * sjPlot (NA)
67 | * sknifedatar (NA)
68 | * statsExpressions (NA)
69 | * tidybayes (NA)
70 | * tidyposterior (NA)
71 | * timetk (NA)
72 | * tinyarray (NA)
73 | * vivid (NA)
74 | * xpose.nlmixr2 (NA)
75 |
--------------------------------------------------------------------------------
/revdep/email.yml:
--------------------------------------------------------------------------------
1 | release_date: 2019-09-09
2 | release_version: 1.0.0
3 | rel_release_date: day
4 | my_news_url: https://github.com/tidyverse/tidyr/blob/main/NEWS.md
5 | release_details: >
6 | This release includes breaking changes to nest() and unnest() in order
7 | to increase consistency across existing functions. This unfortunately
8 | causes some packages to break, so we've prepared an extensive transition
9 | guide to help with the change:
10 |
11 |
12 | (the vignette also includes general advice on using tidyr in a package)
13 |
14 | If you have any problems please feel free to reach out to us so we can help.
15 |
--------------------------------------------------------------------------------
/revdep/revdep-downloads.R:
--------------------------------------------------------------------------------
1 | #' ---
2 | #' output: github_document
3 | #' ---
4 | #'
5 | #+ setup, include = FALSE, cache = FALSE
6 | knitr::opts_chunk$set(collapse = TRUE, comment = "#>", error = TRUE)
7 | options(tidyverse.quiet = TRUE)
8 |
9 | #' Look at the number of downloads in the past month of the packages exhibiting
10 | #' problems in the tidyr revdep check. Useful for prioritizing the
11 | #' investigation.
12 |
13 | #+ body
14 | library(tidyverse)
15 |
16 | new_problems_path <- here::here("revdep/problems.md")
17 | md <- readLines(new_problems_path)
18 | pkg <- md %>%
19 | str_subset("^#[^#]") %>%
20 | str_extract("[[:alnum:]]+")
21 |
22 | dl <- cranlogs::cran_downloads(when = "last-month", packages = pkg)
23 |
24 | dl_count <- dl %>%
25 | count(package, wt = count) %>%
26 | mutate(package = fct_reorder(package, n)) %>%
27 | arrange(desc(package))
28 |
29 | dl_count %>%
30 | mutate(
31 | prop = n / sum(n),
32 | cum_prop = cumsum(prop)
33 | ) %>%
34 | print(n = 20)
35 |
36 | ggplot(head(dl_count, 20), aes(package, n)) +
37 | geom_col() +
38 | coord_flip()
39 |
--------------------------------------------------------------------------------
/revdep/revdep-downloads.md:
--------------------------------------------------------------------------------
1 | revdep-downloads.R
2 | ================
3 | jenny
4 | 2019-08-07
5 |
6 | Look at the number of downloads in the past month of the packages
7 | exhibiting problems in the tidyr revdep check. Useful for prioritizing
8 | the investigation.
9 |
10 | ``` r
11 | library(tidyverse)
12 |
13 | new_problems_path <- here::here("revdep/problems.md")
14 | md <- readLines(new_problems_path)
15 | pkg <- md %>%
16 | str_subset("^#[^#]") %>%
17 | str_extract("[[:alnum:]]+")
18 |
19 | dl <- cranlogs::cran_downloads(when = "last-month", packages = pkg)
20 |
21 | dl_count <- dl %>%
22 | count(package, wt = count) %>%
23 | mutate(package = fct_reorder(package, n)) %>%
24 | arrange(desc(package))
25 |
26 | dl_count %>%
27 | mutate(
28 | prop = n / sum(n),
29 | cum_prop = cumsum(prop)
30 | ) %>%
31 | print(n = 20)
32 | #> # A tibble: 68 x 4
33 | #> package n prop cum_prop
34 | #>
35 | #> 1 modelr 213523 0.397 0.397
36 | #> 2 recipes 92174 0.171 0.568
37 | #> 3 ggpubr 89239 0.166 0.734
38 | #> 4 survminer 19592 0.0364 0.770
39 | #> 5 d3r 18910 0.0351 0.805
40 | #> 6 sunburstR 17281 0.0321 0.838
41 | #> 7 sjstats 15138 0.0281 0.866
42 | #> 8 sjPlot 12447 0.0231 0.889
43 | #> 9 tidyquant 6471 0.0120 0.901
44 | #> 10 gutenbergr 5001 0.00929 0.910
45 | #> 11 tsibble 4173 0.00776 0.918
46 | #> 12 widyr 3288 0.00611 0.924
47 | #> 13 tibbletime 3231 0.00600 0.930
48 | #> 14 bench 3221 0.00599 0.936
49 | #> 15 fuzzyjoin 3118 0.00579 0.942
50 | #> 16 ggstatsplot 2893 0.00538 0.947
51 | #> 17 broomExtra 2441 0.00454 0.952
52 | #> 18 ggalluvial 1890 0.00351 0.955
53 | #> 19 groupedstats 1883 0.00350 0.959
54 | #> 20 anomalize 1672 0.00311 0.962
55 | #> # … with 48 more rows
56 |
57 | ggplot(head(dl_count, 20), aes(package, n)) +
58 | geom_col() +
59 | coord_flip()
60 | ```
61 |
62 | 
63 |
--------------------------------------------------------------------------------
/revdep/revdep-downloads_files/figure-gfm/body-1.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/revdep/revdep-downloads_files/figure-gfm/body-1.png
--------------------------------------------------------------------------------
/src/.gitignore:
--------------------------------------------------------------------------------
1 | *.o
2 | *.so
3 | *.dll
4 |
--------------------------------------------------------------------------------
/src/cpp11.cpp:
--------------------------------------------------------------------------------
1 | // Generated by cpp11: do not edit by hand
2 | // clang-format off
3 |
4 |
5 | #include "cpp11/declarations.hpp"
6 | #include <R_ext/Visibility.h>
7 |
8 | // melt.cpp
9 | cpp11::list melt_dataframe(cpp11::data_frame data, const cpp11::integers& id_ind, const cpp11::integers& measure_ind, cpp11::strings variable_name, cpp11::strings value_name, cpp11::sexp attrTemplate, bool factorsAsStrings, bool valueAsFactor, bool variableAsFactor);
10 | extern "C" SEXP _tidyr_melt_dataframe(SEXP data, SEXP id_ind, SEXP measure_ind, SEXP variable_name, SEXP value_name, SEXP attrTemplate, SEXP factorsAsStrings, SEXP valueAsFactor, SEXP variableAsFactor) {
11 | BEGIN_CPP11
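12 | return cpp11::as_sexp(melt_dataframe(cpp11::as_cpp<cpp11::decay_t<cpp11::data_frame>>(data), cpp11::as_cpp<cpp11::decay_t<const cpp11::integers&>>(id_ind), cpp11::as_cpp<cpp11::decay_t<const cpp11::integers&>>(measure_ind), cpp11::as_cpp<cpp11::decay_t<cpp11::strings>>(variable_name), cpp11::as_cpp<cpp11::decay_t<cpp11::strings>>(value_name), cpp11::as_cpp<cpp11::decay_t<cpp11::sexp>>(attrTemplate), cpp11::as_cpp<cpp11::decay_t<bool>>(factorsAsStrings), cpp11::as_cpp<cpp11::decay_t<bool>>(valueAsFactor), cpp11::as_cpp<cpp11::decay_t<bool>>(variableAsFactor)));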
13 | END_CPP11
14 | }
15 | // simplifyPieces.cpp
16 | cpp11::list simplifyPieces(cpp11::list pieces, int p, bool fillLeft);
17 | extern "C" SEXP _tidyr_simplifyPieces(SEXP pieces, SEXP p, SEXP fillLeft) {
18 | BEGIN_CPP11
19 | return cpp11::as_sexp(simplifyPieces(cpp11::as_cpp<cpp11::decay_t<cpp11::list>>(pieces), cpp11::as_cpp<cpp11::decay_t<int>>(p), cpp11::as_cpp<cpp11::decay_t<bool>>(fillLeft)));
20 | END_CPP11
21 | }
22 |
23 | extern "C" {
24 | static const R_CallMethodDef CallEntries[] = {
25 | {"_tidyr_melt_dataframe", (DL_FUNC) &_tidyr_melt_dataframe, 9},
26 | {"_tidyr_simplifyPieces", (DL_FUNC) &_tidyr_simplifyPieces, 3},
27 | {NULL, NULL, 0}
28 | };
29 | }
30 |
31 | extern "C" attribute_visible void R_init_tidyr(DllInfo* dll){
32 | R_registerRoutines(dll, NULL, CallEntries, NULL, NULL);
33 | R_useDynamicSymbols(dll, FALSE);
34 | R_forceSymbols(dll, TRUE);
35 | }
36 |
--------------------------------------------------------------------------------
/src/simplifyPieces.cpp:
--------------------------------------------------------------------------------
1 | #include "cpp11/list.hpp"
2 | #include "cpp11/strings.hpp"
3 | #include "cpp11/as.hpp"
4 | #include <vector>
5 |
6 | [[cpp11::register]]
7 | cpp11::list simplifyPieces(cpp11::list pieces, int p,
8 | bool fillLeft = true) {
9 |
10 | std::vector<int> tooSml, tooBig;
11 | int n = pieces.size();
12 |
13 | cpp11::writable::list list(p);
14 | for (int j = 0; j < p; ++j)
15 | list[j] = cpp11::writable::strings(n);
16 | cpp11::writable::list out(list);
17 |
18 | for (int i = 0; i < n; ++i) {
19 | cpp11::strings x(pieces[i]);
20 |
21 | if (x.size() == 1 && x[0] == NA_STRING) {
22 | for (int j = 0; j < p; ++j)
23 | SET_STRING_ELT(out[j], i, NA_STRING);
24 | } else if (x.size() > p) { // too big
25 | tooBig.push_back(i + 1);
26 |
27 | for (int j = 0; j < p; ++j)
28 | SET_STRING_ELT(out[j], i, x[j]);
29 | } else if (x.size() < p) { // too small
30 | tooSml.push_back(i + 1);
31 |
32 | int gap = p - x.size();
33 | for (int j = 0; j < p; ++j) {
34 | if (fillLeft) {
35 | SET_STRING_ELT(out[j], i, (j >= gap) ? static_cast<SEXP>(x[j - gap]) : NA_STRING);
36 | } else {
37 | SET_STRING_ELT(out[j], i, (j < x.size()) ? static_cast<SEXP>(x[j]) : NA_STRING);
38 | }
39 | }
40 |
41 | } else {
42 | for (int j = 0; j < p; ++j)
43 | SET_STRING_ELT(out[j], i, x[j]);
44 | }
45 | }
46 |
47 | using namespace cpp11::literals;
48 |
49 | return cpp11::writable::list({
50 | "strings"_nm = out,
51 | "too_big"_nm = tooBig,
52 | "too_sml"_nm = tooSml}
53 | );
54 | }
55 |
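The padding and truncation branches of `simplifyPieces()` above can be tried outside of R with a rough standalone sketch. It handles a single row only and assumes plain standard-library types: `MaybeString` and `fit_row` are hypothetical names introduced for illustration, with `std::nullopt` standing in for `NA_STRING`. It does not record the `too_big` / `too_sml` row indices, and it has no dedicated branch for a lone NA piece.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for an R string that may be NA.
using MaybeString = std::optional<std::string>;

// Fit one row of pieces into exactly p slots (illustrative helper, not part of tidyr).
// Rows with p or more pieces keep their first p pieces; shorter rows are padded
// with missing values on the left (fill_left = true) or on the right.
std::vector<MaybeString> fit_row(const std::vector<MaybeString>& x,
                                 std::size_t p,
                                 bool fill_left = true) {
  std::vector<MaybeString> out(p);  // every slot starts out missing

  if (x.size() >= p) {
    // Exact fit or too big: copy the first p pieces and drop the rest.
    for (std::size_t j = 0; j < p; ++j) {
      out[j] = x[j];
    }
    return out;
  }

  // Too small: gap is the number of slots that stay missing.
  const std::size_t gap = p - x.size();
  for (std::size_t j = 0; j < p; ++j) {
    if (fill_left) {
      if (j >= gap) out[j] = x[j - gap];  // pieces are shifted to the right
    } else {
      if (j < x.size()) out[j] = x[j];    // pieces stay on the left
    }
  }
  return out;
}
```

With two pieces `"a"` and `"b"` and `p = 3`, `fit_row(row, 3, true)` leaves the first slot missing, while `fit_row(row, 3, false)` leaves the last slot missing.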
--------------------------------------------------------------------------------
/tests/testthat.R:
--------------------------------------------------------------------------------
1 | library(testthat)
2 | library(tidyr)
3 |
4 | test_check("tidyr")
5 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/append.md:
--------------------------------------------------------------------------------
1 | # after must be integer or character
2 |
3 | Code
4 | df_append(df1, df2, after = 1.5)
5 | Condition
6 | Error in `df_append()`:
7 | ! `after` must be a whole number, not the number 1.5.
8 | i This is an internal error that was detected in the tidyr package.
9 | Please report it at with a reprex () and the full backtrace.
10 |
11 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/chop.md:
--------------------------------------------------------------------------------
1 | # chop() validates its input `cols` (#1205)
2 |
3 | Code
4 | chop(df$x)
5 | Condition
6 | Error in `chop()`:
7 | ! `data` must be a data frame, not an integer vector.
8 |
9 | ---
10 |
11 | Code
12 | chop(df)
13 | Condition
14 | Error in `chop()`:
15 | ! `cols` is absent but must be supplied.
16 |
17 | # incompatible ptype mentions the column (#1477)
18 |
19 | Code
20 | unnest(df, data, ptype = list(data = integer()))
21 | Condition
22 | Error in `unnest()`:
23 | ! Can't convert `data[[2]]` to .
24 |
25 | # incompatible sizes are caught
26 |
27 | Code
28 | unchop(df, c(x, y))
29 | Condition
30 | Error in `unchop()`:
31 | ! In row 1, can't recycle input of size 2 to size 3.
32 |
33 | # empty typed inputs are considered in common size, but NULLs aren't
34 |
35 | Code
36 | unchop(df, c(x, y))
37 | Condition
38 | Error in `unchop()`:
39 | ! In row 1, can't recycle input of size 0 to size 2.
40 |
41 | # unchop disallows renaming
42 |
43 | Code
44 | unchop(df, c(y = x))
45 | Condition
46 | Error in `unchop()`:
47 | ! Can't rename variables in this context.
48 |
49 | # unchop validates its inputs
50 |
51 | Code
52 | unchop(1:10)
53 | Condition
54 | Error in `unchop()`:
55 | ! `data` must be a data frame, not an integer vector.
56 |
57 | ---
58 |
59 | Code
60 | unchop(df)
61 | Condition
62 | Error in `unchop()`:
63 | ! `cols` is absent but must be supplied.
64 |
65 | ---
66 |
67 | Code
68 | unchop(df, col, keep_empty = 1)
69 | Condition
70 | Error in `unchop()`:
71 | ! `keep_empty` must be `TRUE` or `FALSE`, not the number 1.
72 |
73 | ---
74 |
75 | Code
76 | unchop(df, col, ptype = 1)
77 | Condition
78 | Error in `unchop()`:
79 | ! `ptype` must be `NULL`, an empty ptype, or a named list of ptypes.
80 |
81 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/complete.md:
--------------------------------------------------------------------------------
1 | # validates its inputs
2 |
3 | Code
4 | complete(mtcars, explicit = 1)
5 | Condition
6 | Error in `complete()`:
7 | ! `explicit` must be `TRUE` or `FALSE`, not the number 1.
8 |
9 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/drop-na.md:
--------------------------------------------------------------------------------
1 | # errors are raised
2 |
3 | Code
4 | drop_na(df, list())
5 | Condition
6 | Error in `drop_na()`:
7 | ! Can't select columns with `list()`.
8 | x `list()` must be numeric or character, not an empty list.
9 |
10 | ---
11 |
12 | Code
13 | drop_na(df, "z")
14 | Condition
15 | Error in `drop_na()`:
16 | ! Can't select columns that don't exist.
17 | x Column `z` doesn't exist.
18 |
19 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/expand.md:
--------------------------------------------------------------------------------
1 | # crossing checks for bad inputs
2 |
3 | Code
4 | crossing(x = 1:10, y = quote(a))
5 | Condition
6 | Error in `crossing()`:
7 | ! `..2` must be a vector, not a symbol.
8 |
9 | # expand() respects `.name_repair`
10 |
11 | Code
12 | out <- df %>% expand(x = x, x = x, .name_repair = "unique")
13 | Message
14 | New names:
15 | * `x` -> `x...1`
16 | * `x` -> `x...2`
17 |
18 | # crossing() / nesting() respect `.name_repair`
19 |
20 | Code
21 | out <- crossing(x = x, x = x, .name_repair = "unique")
22 | Message
23 | New names:
24 | * `x` -> `x...1`
25 | * `x` -> `x...2`
26 |
27 | ---
28 |
29 | Code
30 | out <- nesting(x = x, x = x, .name_repair = "unique")
31 | Message
32 | New names:
33 | * `x` -> `x...1`
34 | * `x` -> `x...2`
35 |
36 | # expand_grid() can control name_repair
37 |
38 | Code
39 | expand_grid(x = x, x = x)
40 | Condition
41 | Error in `expand_grid()`:
42 | ! Names must be unique.
43 | x These names are duplicated:
44 | * "x" at locations 1 and 2.
45 | i Use argument `.name_repair` to specify repair strategy.
46 |
47 | ---
48 |
49 | Code
50 | out <- expand_grid(x = x, x = x, .name_repair = "unique")
51 | Message
52 | New names:
53 | * `x` -> `x...1`
54 | * `x` -> `x...2`
55 |
56 | # expand_grid() throws an error for invalid `.vary` parameter
57 |
58 | Code
59 | expand_grid(x = 1:2, y = 1:2, .vary = "invalid")
60 | Condition
61 | Error in `expand_grid()`:
62 | ! `.vary` must be one of "slowest" or "fastest", not "invalid".
63 |
64 | # grid_dots() reject non-vector input
65 |
66 | Code
67 | grid_dots(lm(1 ~ 1))
68 | Condition
69 | Error:
70 | ! `..1` must be a vector, not a object.
71 |
72 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/extract.md:
--------------------------------------------------------------------------------
1 | # informative error message if wrong number of groups
2 |
3 | Code
4 | extract(df, x, "y", ".")
5 | Condition
6 | Error in `extract()`:
7 | ! `regex` should define 1 groups; 0 found.
8 |
9 | ---
10 |
11 | Code
12 | extract(df, x, c("y", "z"), ".")
13 | Condition
14 | Error in `extract()`:
15 | ! `regex` should define 2 groups; 0 found.
16 |
17 | # informative error if using stringr modifier functions (#693)
18 |
19 | Code
20 | extract(df, x, "x", regex = regex)
21 | Condition
22 | Error in `extract()`:
23 | ! `regex` can't use modifiers from stringr.
24 |
25 | # validates its inputs
26 |
27 | Code
28 | df %>% extract()
29 | Condition
30 | Error in `extract()`:
31 | ! `col` is absent but must be supplied.
32 |
33 | ---
34 |
35 | Code
36 | df %>% extract(x, regex = 1)
37 | Condition
38 | Error in `extract()`:
39 | ! `regex` must be a single string, not the number 1.
40 |
41 | ---
42 |
43 | Code
44 | df %>% extract(x, into = 1:3)
45 | Condition
46 | Error in `extract()`:
47 | ! `into` must be a character vector, not an integer vector.
48 |
49 | ---
50 |
51 | Code
52 | df %>% extract(x, into = "x", convert = 1)
53 | Condition
54 | Error in `extract()`:
55 | ! `convert` must be `TRUE` or `FALSE`, not the number 1.
56 |
57 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/fill.md:
--------------------------------------------------------------------------------
1 | # errors on named `...` inputs
2 |
3 | Code
4 | fill(df, fooy = x)
5 | Condition
6 | Error in `fill()`:
7 | ! Arguments in `...` must be passed by position, not name.
8 | x Problematic argument:
9 | * fooy = x
10 |
11 | # validates its inputs
12 |
13 | Code
14 | df %>% fill(x, .direction = "foo")
15 | Condition
16 | Error in `fill()`:
17 | ! `.direction` must be one of "down", "up", "downup", or "updown", not "foo".
18 |
19 | # `.by` can't select columns that don't exist
20 |
21 | Code
22 | fill(df, y, .by = z)
23 | Condition
24 | Error in `dplyr::mutate()`:
25 | ! Can't select columns that don't exist.
26 | x Column `z` doesn't exist.
27 |
28 | # `.by` can't be used on a grouped data frame
29 |
30 | Code
31 | fill(df, y, .by = x)
32 | Condition
33 | Error in `dplyr::mutate()`:
34 | ! Can't supply `.by` when `.data` is a grouped data frame.
35 |
36 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/gather.md:
--------------------------------------------------------------------------------
1 | # gather throws error for POSIXlt
2 |
3 | Code
4 | gather(df, key, val, -x)
5 | Condition
6 | Error:
7 | ! 'x' is a POSIXlt. Please convert to POSIXct.
8 |
9 | ---
10 |
11 | Code
12 | gather(df, key, val, -y)
13 | Condition
14 | Error:
15 | ! Column 1 is a POSIXlt. Please convert to POSIXct.
16 |
17 | # gather throws error for weird objects
18 |
19 | Code
20 | gather(df, key, val, -y)
21 | Condition
22 | Error:
23 | ! All columns be atomic vectors or lists (not expression)
24 |
25 | ---
26 |
27 | Code
28 | gather(df, key, val, -x)
29 | Condition
30 | Error:
31 | ! All columns must be atomic vectors or lists. Problem with 'x'
32 |
33 | ---
34 |
35 | Code
36 | gather(df, key, val, -y)
37 | Condition
38 | Error:
39 | ! All columns must be atomic vectors or lists. Problem with column 2.
40 |
41 | # factors coerced to characters, not integers
42 |
43 | Code
44 | out <- gather(df, k, v)
45 | Condition
46 | Warning:
47 | attributes are not identical across measure variables; they will be dropped
48 |
49 | # varying attributes are dropped with a warning
50 |
51 | Code
52 | gather(df, k, v)
53 | Condition
54 | Warning:
55 | attributes are not identical across measure variables; they will be dropped
56 | Output
57 | k v
58 | 1 date1 1546300800
59 | 2 date2 17897
60 |
61 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/hoist.md:
--------------------------------------------------------------------------------
1 | # nested lists generate a cast error if they can't be cast to the ptype
2 |
3 | Code
4 | hoist(df, x, "b", .ptype = list(b = double()))
5 | Condition
6 | Error in `hoist()`:
7 | ! Can't convert `..1` to .
8 |
9 | # non-vectors generate a cast error if a ptype is supplied
10 |
11 | Code
12 | hoist(df, x, "b", .ptype = list(b = integer()))
13 | Condition
14 | Error in `hoist()`:
15 | ! `..1` must be a vector, not a symbol.
16 |
17 | # input validation catches problems
18 |
19 | Code
20 | df %>% hoist(y)
21 | Condition
22 | Error in `hoist()`:
23 | ! `.data[[.col]]` must be a list, not the number 1.
24 |
25 | ---
26 |
27 | Code
28 | df %>% hoist(x, 1)
29 | Condition
30 | Error in `hoist()`:
31 | ! All elements of `...` must be named.
32 |
33 | ---
34 |
35 | Code
36 | df %>% hoist(x, a = "a", a = "b")
37 | Condition
38 | Error in `hoist()`:
39 | ! The names of `...` must be unique.
40 |
41 | # can't hoist() from a data frame column
42 |
43 | Code
44 | hoist(df, a, xx = 1)
45 | Condition
46 | Error in `hoist()`:
47 | ! `.data[[.col]]` must be a list, not a object.
48 |
49 | # hoist() validates its inputs (#1224)
50 |
51 | Code
52 | hoist(1)
53 | Condition
54 | Error in `hoist()`:
55 | ! `.data` must be a data frame, not a number.
56 |
57 | ---
58 |
59 | Code
60 | hoist(df)
61 | Condition
62 | Error in `hoist()`:
63 | ! `.col` is absent but must be supplied.
64 |
65 | ---
66 |
67 | Code
68 | hoist(df, a, .remove = 1)
69 | Condition
70 | Error in `hoist()`:
71 | ! `.remove` must be `TRUE` or `FALSE`, not the number 1.
72 |
73 | ---
74 |
75 | Code
76 | hoist(df, a, .ptype = 1)
77 | Condition
78 | Error in `hoist()`:
79 | ! `.ptype` must be `NULL`, an empty ptype, or a named list of ptypes.
80 |
81 | ---
82 |
83 | Code
84 | hoist(df, a, .transform = 1)
85 | Condition
86 | Error in `hoist()`:
87 | ! `.transform` must be `NULL`, a function, or a named list of functions.
88 |
89 | ---
90 |
91 | Code
92 | hoist(df, a, .simplify = 1)
93 | Condition
94 | Error in `hoist()`:
95 | ! `.simplify` must be a list or a single `TRUE` or `FALSE`.
96 |
97 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/nest-legacy.md:
--------------------------------------------------------------------------------
1 | # can't combine vectors and data frames
2 |
3 | Code
4 | unnest_legacy(df)
5 | Condition
6 | Error in `unnest_legacy()`:
7 | ! Each column must either be a list of vectors or a list of data frames.
8 | i Problems in: `x`
9 |
10 | # multiple columns must be same length
11 |
12 | Code
13 | unnest_legacy(df)
14 | Condition
15 | Error in `unnest_legacy()`:
16 | ! All nested columns must have the same number of elements.
17 |
18 | ---
19 |
20 | Code
21 | unnest_legacy(df)
22 | Condition
23 | Error in `unnest_legacy()`:
24 | ! All nested columns must have the same number of elements.
25 |
26 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/pivot.md:
--------------------------------------------------------------------------------
1 | # basic sanity checks for spec occur
2 |
3 | Code
4 | check_pivot_spec(1)
5 | Condition
6 | Error:
7 | ! `spec` must be a data frame, not a number.
8 | Code
9 | check_pivot_spec(mtcars)
10 | Condition
11 | Error:
12 | ! `spec` must have `.name` and `.value` columns.
13 |
14 | # `.name` column must be a character vector
15 |
16 | Code
17 | check_pivot_spec(df)
18 | Condition
19 | Error:
20 | ! `spec$.name` must be a character vector, not an integer vector.
21 |
22 | # `.value` column must be a character vector
23 |
24 | Code
25 | check_pivot_spec(df)
26 | Condition
27 | Error:
28 | ! `spec$.value` must be a character vector, not an integer vector.
29 |
30 | # `.name` column must be unique
31 |
32 | Code
33 | check_pivot_spec(df)
34 | Condition
35 | Error:
36 | ! `spec$.name` must be unique.
37 |
38 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/replace_na.md:
--------------------------------------------------------------------------------
1 | # can only be length 0
2 |
3 | Code
4 | replace_na(1, 1:10)
5 | Condition
6 | Error in `replace_na()`:
7 | ! Replacement for `data` must be length 1, not length 10.
8 |
9 | # replacement must be castable to `data`
10 |
11 | Code
12 | replace_na(x, 1.5)
13 | Condition
14 | Error in `vec_assign()`:
15 | ! Can't convert from `replace` to `data` due to loss of precision.
16 | * Locations: 1
17 |
18 | # replacement must be castable to corresponding column
19 |
20 | Code
21 | replace_na(df, list(a = 1.5))
22 | Condition
23 | Error in `vec_assign()`:
24 | ! Can't convert from `replace$a` to `data$a` due to loss of precision.
25 | * Locations: 1
26 |
27 | # validates its inputs
28 |
29 | Code
30 | replace_na(df, replace = 1)
31 | Condition
32 | Error in `replace_na()`:
33 | ! `replace` must be a list, not a number.
34 |
35 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/separate-longer.md:
--------------------------------------------------------------------------------
1 | # separate_longer_delim() validates its inputs
2 |
3 | Code
4 | df %>% separate_longer_delim()
5 | Condition
6 | Error in `separate_longer_delim()`:
7 | ! `cols` is absent but must be supplied.
8 |
9 | ---
10 |
11 | Code
12 | df %>% separate_longer_delim(x, sep = 1)
13 | Condition
14 | Error in `separate_longer_delim()`:
15 | ! `delim` must be a single string, not absent.
16 |
17 | # separate_longer_position() validates its inputs
18 |
19 | Code
20 | df %>% separate_longer_position()
21 | Condition
22 | Error in `separate_longer_position()`:
23 | ! `cols` is absent but must be supplied.
24 |
25 | ---
26 |
27 | Code
28 | df %>% separate_longer_position(y, width = 1)
29 | Condition
30 | Error in `separate_longer_position()`:
31 | ! Can't select columns that don't exist.
32 | x Column `y` doesn't exist.
33 |
34 | ---
35 |
36 | Code
37 | df %>% separate_longer_position(x, width = 1.5)
38 | Condition
39 | Error in `separate_longer_position()`:
40 | ! `width` must be a whole number, not the number 1.5.
41 |
42 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/separate-rows.md:
--------------------------------------------------------------------------------
1 | # it validates its inputs
2 |
3 | Code
4 | separate_rows(df, x, sep = 1)
5 | Condition
6 | Error in `separate_rows()`:
7 | ! `sep` must be a single string, not the number 1.
8 |
9 | ---
10 |
11 | Code
12 | separate_rows(df, x, convert = 1)
13 | Condition
14 | Error in `separate_rows()`:
15 | ! `convert` must be `TRUE` or `FALSE`, not the number 1.
16 |
17 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/separate.md:
--------------------------------------------------------------------------------
1 | # too many pieces dealt with as requested
2 |
3 | Code
4 | separate(df, x, c("x", "y"))
5 | Condition
6 | Warning:
7 | Expected 2 pieces. Additional pieces discarded in 1 rows [2].
8 | Output
9 | # A tibble: 2 x 2
10 | x y
11 |
12 | 1 a b
13 | 2 a b
14 |
15 | ---
16 |
17 | Code
18 | separate(df, x, c("x", "y"), extra = "error")
19 | Condition
20 | Warning:
21 | `extra = "error"` is deprecated. Please use `extra = "warn"` instead
22 | Warning:
23 | Expected 2 pieces. Additional pieces discarded in 1 rows [2].
24 | Output
25 | # A tibble: 2 x 2
26 | x y
27 |
28 | 1 a b
29 | 2 a b
30 |
31 | # too few pieces dealt with as requested
32 |
33 | Code
34 | separate(df, x, c("x", "y", "z"))
35 | Condition
36 | Warning:
37 | Expected 3 pieces. Missing pieces filled with `NA` in 1 rows [1].
38 | Output
39 | # A tibble: 2 x 3
40 | x y z
41 |
42 | 1 a b
43 | 2 a b c
44 |
45 | # validates inputs
46 |
47 | Code
48 | separate(df)
49 | Condition
50 | Error in `separate()`:
51 | ! `col` is absent but must be supplied.
52 |
53 | ---
54 |
55 | Code
56 | separate(df, x, into = 1)
57 | Condition
58 | Error in `separate()`:
59 | ! `into` must be a character vector, not the number 1.
60 |
61 | ---
62 |
63 | Code
64 | separate(df, x, into = "x", sep = c("a", "b"))
65 | Condition
66 | Error in `separate()`:
67 | ! `sep` must be a string or numeric vector, not a character vector
68 |
69 | ---
70 |
71 | Code
72 | separate(df, x, into = "x", remove = 1)
73 | Condition
74 | Error in `separate()`:
75 | ! `remove` must be `TRUE` or `FALSE`, not the number 1.
76 |
77 | ---
78 |
79 | Code
80 | separate(df, x, into = "x", convert = 1)
81 | Condition
82 | Error in `separate()`:
83 | ! `convert` must be `TRUE` or `FALSE`, not the number 1.
84 |
85 | # informative error if using stringr modifier functions (#693)
86 |
87 | Code
88 | separate(df, x, "x", sep = sep)
89 | Condition
90 | Error in `separate()`:
91 | ! `sep` can't use modifiers from stringr.
92 |
93 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/seq.md:
--------------------------------------------------------------------------------
1 | # full_seq errors if sequence isn't regular
2 |
3 | Code
4 | full_seq(c(1, 3, 4), 2)
5 | Condition
6 | Error in `full_seq()`:
7 | ! `x` is not a regular sequence.
8 | Code
9 | full_seq(c(0, 10, 20), 11, tol = 1.8)
10 | Condition
11 | Error in `full_seq()`:
12 | ! `x` is not a regular sequence.
13 |
14 | # validates inputs
15 |
16 | Code
17 | full_seq(x, period = "a")
18 | Condition
19 | Error in `full_seq()`:
20 | ! `period` must be a number, not the string "a".
21 | Code
22 | full_seq(x, 1, tol = "a")
23 | Condition
24 | Error in `full_seq()`:
25 | ! `tol` must be a number, not the string "a".
26 |
27 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/spread.md:
--------------------------------------------------------------------------------
1 | # duplicate values for one key is an error
2 |
3 | Code
4 | spread(df, x, y)
5 | Condition
6 | Error in `spread()`:
7 | ! Each row of output must be identified by a unique combination of keys.
8 | i Keys are shared for 2 rows
9 | * 2, 3
10 |
11 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/uncount.md:
--------------------------------------------------------------------------------
1 | # validates inputs
2 |
3 | Code
4 | uncount(df, y)
5 | Condition
6 | Error in `uncount()`:
7 | ! Can't convert `weights` to .
8 | Code
9 | uncount(df, w)
10 | Condition
11 | Error in `uncount()`:
12 | ! `weights` must be a vector of positive numbers. Location 1 is negative.
13 | Code
14 | uncount(df, x, .remove = 1)
15 | Condition
16 | Error in `uncount()`:
17 | ! `.remove` must be `TRUE` or `FALSE`, not the number 1.
18 | Code
19 | uncount(df, x, .id = "")
20 | Condition
21 | Error in `uncount()`:
22 | ! `.id` must be a valid name or `NULL`, not the empty string "".
23 |
24 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/unite.md:
--------------------------------------------------------------------------------
1 | # validates its inputs
2 |
3 | Code
4 | unite(df)
5 | Condition
6 | Error in `unite()`:
7 | ! `col` is absent but must be supplied.
8 | Code
9 | unite(df, "z", x:y, sep = 1)
10 | Condition
11 | Error in `unite()`:
12 | ! `sep` must be a single string, not the number 1.
13 | Code
14 | unite(df, "z", x:y, remove = 1)
15 | Condition
16 | Error in `unite()`:
17 | ! `remove` must be `TRUE` or `FALSE`, not the number 1.
18 | Code
19 | unite(df, "z", x:y, na.rm = 1)
20 | Condition
21 | Error in `unite()`:
22 | ! `na.rm` must be `TRUE` or `FALSE`, not the number 1.
23 |
24 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/unnest-helper.md:
--------------------------------------------------------------------------------
1 | # `simplify` is validated
2 |
3 | Code
4 | df_simplify(data.frame(), simplify = 1)
5 | Condition
6 | Error:
7 | ! `simplify` must be a list or a single `TRUE` or `FALSE`.
8 | Code
9 | df_simplify(data.frame(), simplify = NA)
10 | Condition
11 | Error:
12 | ! `simplify` must be a list or a single `TRUE` or `FALSE`.
13 | Code
14 | df_simplify(data.frame(), simplify = c(TRUE, FALSE))
15 | Condition
16 | Error:
17 | ! `simplify` must be a list or a single `TRUE` or `FALSE`.
18 | Code
19 | df_simplify(data.frame(), simplify = list(1))
20 | Condition
21 | Error:
22 | ! All elements of `simplify` must be named.
23 | Code
24 | df_simplify(data.frame(), simplify = list(x = 1, x = 1))
25 | Condition
26 | Error:
27 | ! The names of `simplify` must be unique.
28 |
29 | # `ptype` is validated
30 |
31 | Code
32 | df_simplify(data.frame(), ptype = 1)
33 | Condition
34 | Error:
35 | ! `ptype` must be `NULL`, an empty ptype, or a named list of ptypes.
36 | Code
37 | df_simplify(data.frame(), ptype = list(1))
38 | Condition
39 | Error:
40 | ! All elements of `ptype` must be named.
41 | Code
42 | df_simplify(data.frame(), ptype = list(x = 1, x = 1))
43 | Condition
44 | Error:
45 | ! The names of `ptype` must be unique.
46 |
47 | # `transform` is validated
48 |
49 | Code
50 | df_simplify(data.frame(), transform = list(~.x))
51 | Condition
52 | Error:
53 | ! All elements of `transform` must be named.
54 | Code
55 | df_simplify(data.frame(x = 1), transform = 1)
56 | Condition
57 | Error:
58 | ! `transform` must be `NULL`, a function, or a named list of functions.
59 | Code
60 | df_simplify(data.frame(), transform = list(x = 1))
61 | Condition
62 | Error:
63 | ! Can't convert `transform$x`, a double vector, to a function.
64 | Code
65 | df_simplify(data.frame(), transform = list(x = 1, x = 1))
66 | Condition
67 | Error:
68 | ! The names of `transform` must be unique.
69 |
70 | # ptype is applied after transform
71 |
72 | Code
73 | col_simplify(list(1, 2, 3), ptype = integer(), transform = ~ .x + 1.5)
74 | Condition
75 | Error:
76 | ! Can't convert from `..1` to due to loss of precision.
77 | * Locations: 1
78 |
79 |
--------------------------------------------------------------------------------
/tests/testthat/_snaps/unnest-longer.md:
--------------------------------------------------------------------------------
1 | # unnest_longer - bad inputs generate errors
2 |
3 | Code
4 | unnest_longer(df, y)
5 | Condition
6 | Error in `unnest_longer()`:
7 | ! List-column `y` must contain only vectors or `NULL`.
8 |
9 | # tidyverse recycling rules are applied after `keep_empty`
10 |
11 | Code
12 | unnest_longer(df, c(a, b))
13 | Condition
14 | Error in `unnest_longer()`:
15 | ! In row 1, can't recycle input of size 0 to size 2.
16 |
17 | # can't mix `indices_to` with `indices_include = FALSE`
18 |
19 | Code
20 | unnest_longer(mtcars, mpg, indices_to = "x", indices_include = FALSE)
21 | Condition
22 | Error in `unnest_longer()`:
23 | ! Can't use `indices_include = FALSE` when `indices_to` is supplied.
24 |
25 | # unnest_longer() validates its inputs
26 |
27 | Code
28 | unnest_longer(1)
29 | Condition
30 | Error in `unnest_longer()`:
31 | ! `data` must be a data frame, not a number.
32 | Code
33 | unnest_longer(df)
34 | Condition
35 | Error in `unnest_longer()`:
36 | ! `col` is absent but must be supplied.
37 | Code
38 | unnest_longer(df, x, indices_to = "")
39 | Condition
40 | Error in `unnest_longer()`:
41 | ! `indices_to` must be a valid name or `NULL`, not the empty string "".
42 | Code
43 | unnest_longer(df, x, indices_include = 1)
44 | Condition
45 | Error in `unnest_longer()`:
46 | ! `indices_include` must be `TRUE`, `FALSE`, or `NULL`, not the number 1.
47 | Code
48 | unnest_longer(df, x, values_to = "")
49 | Condition
50 | Error in `unnest_longer()`:
51 | ! `values_to` must be a valid name or `NULL`, not the empty string "".
52 |
53 | # `values_to` is validated
54 |
55 | Code
56 | unnest_longer(mtcars, mpg, values_to = 1)
57 | Condition
58 | Error in `unnest_longer()`:
59 | ! `values_to` must be a valid name or `NULL`, not the number 1.
60 | Code
61 | unnest_longer(mtcars, mpg, values_to = c("x", "y"))
62 | Condition
63 | Error in `unnest_longer()`:
64 | ! `values_to` must be a valid name or `NULL`, not a character vector.
65 |
66 | # `indices_to` is validated
67 |
68 | Code
69 | unnest_longer(mtcars, mpg, indices_to = 1)
70 | Condition
71 | Error in `unnest_longer()`:
72 | ! `indices_to` must be a valid name or `NULL`, not the number 1.
73 | Code
74 | unnest_longer(mtcars, mpg, indices_to = c("x", "y"))
75 | Condition
76 | Error in `unnest_longer()`:
77 | ! `indices_to` must be a valid name or `NULL`, not a character vector.
78 |
79 | # `indices_include` is validated
80 |
81 | Code
82 | unnest_longer(mtcars, mpg, indices_include = 1)
83 | Condition
84 | Error in `unnest_longer()`:
85 | ! `indices_include` must be `TRUE`, `FALSE`, or `NULL`, not the number 1.
86 | Code
87 | unnest_longer(mtcars, mpg, indices_include = c(TRUE, FALSE))
88 | Condition
89 | Error in `unnest_longer()`:
90 | ! `indices_include` must be `TRUE`, `FALSE`, or `NULL`, not a logical vector.
91 |
92 | # `keep_empty` is validated
93 |
94 | Code
95 | unnest_longer(mtcars, mpg, keep_empty = 1)
96 | Condition
97 | Error in `unnest_longer()`:
98 | ! `keep_empty` must be `TRUE` or `FALSE`, not the number 1.
99 | Code
100 | unnest_longer(mtcars, mpg, keep_empty = c(TRUE, FALSE))
101 | Condition
102 | Error in `unnest_longer()`:
103 | ! `keep_empty` must be `TRUE` or `FALSE`, not a logical vector.
104 |
105 |
--------------------------------------------------------------------------------
/tests/testthat/test-append.R:
--------------------------------------------------------------------------------
1 | test_that("columns in y replace those in x", {
2 | df1 <- data.frame(x = 1)
3 | df2 <- data.frame(x = 2)
4 |
5 | expect_equal(df_append(df1, df2), df2)
6 | })
7 |
8 | test_that("replaced columns retain the correct ordering (#1444)", {
9 | df1 <- data.frame(
10 | x = 1,
11 | y = 2,
12 | z = 3
13 | )
14 | df2 <- data.frame(x = 4)
15 |
16 | expect_identical(
17 | df_append(df1, df2, after = 0L),
18 | data.frame(x = 4, y = 2, z = 3)
19 | )
20 | expect_identical(
21 | df_append(df1, df2, after = 1L),
22 | data.frame(x = 4, y = 2, z = 3)
23 | )
24 | expect_identical(
25 | df_append(df1, df2, after = 2L),
26 | data.frame(y = 2, x = 4, z = 3)
27 | )
28 | })
29 |
30 | test_that("after must be integer or character", {
31 | df1 <- data.frame(x = 1)
32 | df2 <- data.frame(x = 2)
33 |
34 | expect_snapshot(df_append(df1, df2, after = 1.5), error = TRUE)
35 | })
36 |
37 | test_that("always returns a bare data frame", {
38 | df1 <- tibble(x = 1)
39 | df2 <- tibble(y = 2)
40 |
41 | expect_identical(df_append(df1, df2), data.frame(x = 1, y = 2))
42 | })
43 |
44 | test_that("retains row names of data.frame `x` (#1454)", {
45 | # These can't be restored by `reconstruct_tibble()`, so it is reasonable to
46 | # retain them. `dplyr:::dplyr_col_modify()` works similarly.
47 | df <- data.frame(x = 1:2, row.names = c("a", "b"))
48 | cols <- list(y = 3:4, z = 5:6)
49 |
50 | expect_identical(row.names(df_append(df, cols)), c("a", "b"))
51 | expect_identical(row.names(df_append(df, cols, after = 0)), c("a", "b"))
52 | expect_identical(row.names(df_append(df, cols, remove = TRUE)), c("a", "b"))
53 | })
54 |
55 | test_that("can append at any integer position", {
56 | df1 <- data.frame(x = 1, y = 2)
57 | df2 <- data.frame(a = 1)
58 |
59 | expect_named(df_append(df1, df2, 0L), c("a", "x", "y"))
60 | expect_named(df_append(df1, df2, 1L), c("x", "a", "y"))
61 | expect_named(df_append(df1, df2, 2L), c("x", "y", "a"))
62 | })
63 |
64 | test_that("can append at any character position", {
65 | df1 <- data.frame(x = 1, y = 2)
66 | df2 <- data.frame(a = 1)
67 |
68 | expect_named(df_append(df1, df2, "x"), c("x", "a", "y"))
69 | expect_named(df_append(df1, df2, "y"), c("x", "y", "a"))
70 | })
71 |
72 | test_that("can replace at any character position", {
73 | df1 <- data.frame(x = 1, y = 2, z = 3)
74 | df2 <- data.frame(a = 1)
75 |
76 | expect_named(df_append(df1, df2, "x", remove = TRUE), c("a", "y", "z"))
77 | expect_named(df_append(df1, df2, "y", remove = TRUE), c("x", "a", "z"))
78 | expect_named(df_append(df1, df2, "z", remove = TRUE), c("x", "y", "a"))
79 | })
80 |
--------------------------------------------------------------------------------
/tests/testthat/test-drop-na.R:
--------------------------------------------------------------------------------
1 | test_that("empty call drops every row containing a missing value", {
2 | df <- tibble(x = c(1, 2, NA), y = c("a", NA, "b"))
3 | exp <- tibble(x = 1, y = "a")
4 | res <- drop_na(df)
5 | expect_identical(res, exp)
6 | })
7 |
8 | test_that("tidyselection that selects no columns doesn't drop any rows (#1227)", {
9 | df <- tibble(x = c(1, 2, NA), y = c("a", NA, "b"))
10 | expect_identical(drop_na(df, starts_with("foo")), df)
11 | })
12 |
13 | test_that("specifying variable(s) considers only those variable(s)", {
14 | df <- tibble(x = c(1, 2, NA), y = c("a", NA, "b"))
15 |
16 | exp <- tibble(x = c(1, 2), y = c("a", NA))
17 | res <- drop_na(df, x)
18 | expect_identical(res, exp)
19 |
20 | exp <- tibble(x = c(1), y = c("a"))
21 | res <- drop_na(df, x:y)
22 | expect_identical(res, exp)
23 | })
24 |
25 | test_that("groups are preserved", {
26 | df <- tibble(g = c("A", "A", "B"), x = c(1, 2, NA), y = c("a", NA, "b"))
27 | exp <- tibble(g = c("A", "B"), x = c(1, NA), y = c("a", "b"))
28 |
29 | gdf <- dplyr::group_by(df, g)
30 | gexp <- dplyr::group_by(exp, g)
31 |
32 | res <- drop_na(gdf, y)
33 |
34 | expect_identical(res, gexp)
35 | expect_identical(dplyr::group_vars(res), dplyr::group_vars(gexp))
36 | })
37 |
38 | test_that("errors are raised", {
39 | df <- tibble(x = c(1, 2, NA), y = c("a", NA, "b"))
40 | expect_snapshot(error = TRUE, {
41 | drop_na(df, list())
42 | })
43 | expect_snapshot(error = TRUE, {
44 | drop_na(df, "z")
45 | })
46 | })
47 |
48 | test_that("single variable data.frame doesn't lose dimension", {
49 | df <- data.frame(x = c(1, 2, NA))
50 | res <- drop_na(df, "x")
51 | exp <- data.frame(x = c(1, 2))
52 | expect_identical(res, exp)
53 | })
54 |
55 | test_that("works with list-cols", {
56 | df <- tibble(x = list(1L, NULL, 3L), y = c(1L, 2L, NA))
57 | rs <- drop_na(df)
58 |
59 | expect_identical(rs, tibble(x = list(1L), y = 1L))
60 | })
61 |
62 | test_that("doesn't drop empty atomic elements of list-cols (#1228)", {
63 | df <- tibble(x = list(1L, NULL, integer()))
64 | expect_identical(drop_na(df), df[c(1, 3), ])
65 | })
66 |
67 | test_that("preserves attributes", {
68 | df <- tibble(x = structure(c(1, NA), attr = "!"))
69 | rs <- drop_na(df)
70 |
71 | expect_identical(rs$x, structure(1, attr = "!"))
72 | })
73 |
74 | test_that("works with df-cols", {
75 | # if any packed row contains a missing value, it is incomplete
76 | df <- tibble(a = tibble(x = c(1, 1, NA, NA), y = c(1, NA, 1, NA)))
77 | expect_identical(drop_na(df, a), tibble(a = tibble(x = 1, y = 1)))
78 | })
79 |
80 | test_that("works with rcrd cols", {
81 | # if any rcrd field contains a missing value, it is incomplete
82 | col <- new_rcrd(list(x = c(1, 1, NA, NA), y = c(1, NA, 1, NA)))
83 | df <- tibble(col = col)
84 |
85 | expect_identical(
86 | drop_na(df, col),
87 | tibble(col = new_rcrd(list(x = 1, y = 1)))
88 | )
89 | })
90 |
--------------------------------------------------------------------------------
/tests/testthat/test-extract.R:
--------------------------------------------------------------------------------
1 | test_that("default returns first alpha group", {
2 | df <- data.frame(x = c("a.b", "a.d", "b.c"))
3 | out <- df %>% extract(x, "A")
4 | expect_equal(out$A, c("a", "a", "b"))
5 | })
6 |
7 | test_that("can match multiple groups", {
8 | df <- data.frame(x = c("a.b", "a.d", "b.c"))
9 | out <- df %>% extract(x, c("A", "B"), "([[:alnum:]]+)\\.([[:alnum:]]+)")
10 | expect_equal(out$A, c("a", "a", "b"))
11 | expect_equal(out$B, c("b", "d", "c"))
12 | })
13 |
14 | test_that("can drop groups", {
15 | df <- data.frame(x = c("a.b.e", "a.d.f", "b.c.g"))
16 | out <- df %>% extract(x, c("x", NA, "y"), "([a-z])\\.([a-z])\\.([a-z])")
17 | expect_named(out, c("x", "y"))
18 | expect_equal(out$y, c("e", "f", "g"))
19 | })
20 |
21 | test_that("match failures give NAs", {
22 | df <- data.frame(x = c("a.b", "a"))
23 | out <- df %>% extract(x, "a", "(b)")
24 | expect_equal(out$a, c("b", NA))
25 | })
26 |
27 | test_that("extract keeps characters as character", {
28 | df <- tibble(x = "X-1")
29 | out <- extract(df, x, c("x", "y"), "(.)-(.)", convert = TRUE)
30 | expect_equal(out$x, "X")
31 | expect_equal(out$y, 1L)
32 | })
33 |
34 | test_that("can combine into multiple columns", {
35 | df <- tibble(x = "abcd")
36 | out <- extract(df, x, c("a", "b", "a", "b"), "(.)(.)(.)(.)", convert = TRUE)
37 | expect_equal(out, tibble(a = "ac", b = "bd"))
38 | })
39 |
40 | test_that("groups are preserved", {
41 | df <- tibble(g = 1, x = "X1") %>% dplyr::group_by(g)
42 | rs <- df %>% extract(x, c("x", "y"), "(.)(.)")
43 | expect_equal(class(df), class(rs))
44 | expect_equal(dplyr::group_vars(df), dplyr::group_vars(rs))
45 | })
46 |
47 | test_that("informative error message if wrong number of groups", {
48 | df <- tibble(x = "a")
49 | expect_snapshot(error = TRUE, {
50 | extract(df, x, "y", ".")
51 | })
52 | expect_snapshot(error = TRUE, {
53 | extract(df, x, c("y", "z"), ".")
54 | })
55 | })
56 |
57 | test_that("informative error if using stringr modifier functions (#693)", {
58 | df <- tibble(x = "a")
59 | regex <- structure("a", class = "pattern")
60 |
61 | expect_snapshot(error = TRUE, {
62 | extract(df, x, "x", regex = regex)
63 | })
64 | })
65 |
66 | test_that("str_match_first handles edge cases", {
67 | expect_identical(
68 | str_match_first(c("r-2", "d-2-3-4"), "(.)-(.)"),
69 | list(c("r", "d"), c("2", "2"))
70 | )
71 | expect_identical(
72 | str_match_first(NA, "test"),
73 | list()
74 | )
75 | expect_equal(
76 | str_match_first(c("", " "), "^(.*)$"),
77 | list(c("", " "))
78 | )
79 | expect_equal(
80 | str_match_first("", "(.)-(.)"),
81 | list(NA_character_, NA_character_)
82 | )
83 | expect_equal(
84 | str_match_first(character(), "(.)-(.)"),
85 | list(character(), character())
86 | )
87 | })
88 |
89 | test_that("validates its inputs", {
90 | df <- data.frame(x = letters)
91 |
92 | expect_snapshot(error = TRUE, {
93 | df %>% extract()
94 | })
95 | expect_snapshot(error = TRUE, {
96 | df %>% extract(x, regex = 1)
97 | })
98 | expect_snapshot(error = TRUE, {
99 | df %>% extract(x, into = 1:3)
100 | })
101 | expect_snapshot(error = TRUE, {
102 | df %>% extract(x, into = "x", convert = 1)
103 | })
104 | })
105 |
--------------------------------------------------------------------------------
/tests/testthat/test-id.R:
--------------------------------------------------------------------------------
1 | test_that("drop preserves count of factor levels", {
2 | x <- factor(levels = c("a", "b"))
3 | expect_equal(id_var(x), structure(integer(), n = 2))
4 | expect_equal(id(data.frame(x)), structure(integer(), n = 2))
5 | })
6 |
7 | test_that("id works with dimensions beyond integer range", {
8 | df <- data.frame(matrix(c(1, 2), nrow = 2, ncol = 32))
9 | expect_equal(id(df), structure(c(1, 2), n = 2^32))
10 | })
11 |
12 | test_that("id_var() handles named vectors (#525)", {
13 | res <- id_var(c(a = 5, b = 3, c = 5))
14 | expect_equal(res, structure(c(2L, 1L, 2L), n = 2L))
15 | })
16 |
--------------------------------------------------------------------------------
/tests/testthat/test-pivot.R:
--------------------------------------------------------------------------------
1 | test_that("basic sanity checks for spec occur", {
2 | expect_snapshot(error = TRUE, {
3 | check_pivot_spec(1)
4 | check_pivot_spec(mtcars)
5 | })
6 | })
7 |
8 | test_that("`.name` column must be a character vector", {
9 | df <- tibble(.name = 1:2, .value = c("a", "b"))
10 | expect_snapshot(check_pivot_spec(df), error = TRUE)
11 | })
12 |
13 | test_that("`.value` column must be a character vector", {
14 | df <- tibble(.name = c("x", "y"), .value = 1:2)
15 | expect_snapshot(check_pivot_spec(df), error = TRUE)
16 | })
17 |
18 | test_that("`.name` column must be unique", {
19 | df <- tibble(.name = c("x", "x"), .value = c("a", "b"))
20 | expect_snapshot(check_pivot_spec(df), error = TRUE)
21 | })
22 |
--------------------------------------------------------------------------------
/tests/testthat/test-replace_na.R:
--------------------------------------------------------------------------------
1 | # vector ------------------------------------------------------------------
2 |
3 | test_that("empty call does nothing", {
4 | x <- c(1, NA)
5 | expect_equal(replace_na(x), x)
6 | })
7 |
8 | test_that("missing values are replaced", {
9 | x <- c(1, NA)
10 | expect_equal(replace_na(x, 0), c(1, 0))
11 | })
12 |
13 | test_that("can only be length 0", {
14 | expect_snapshot(replace_na(1, 1:10), error = TRUE)
15 | })
16 |
17 | test_that("can replace missing rows in arrays", {
18 | x <- matrix(c(NA, NA, NA, 6), nrow = 2)
19 | replace <- matrix(c(-1, -2), nrow = 1)
20 | expect <- matrix(c(-1, NA, -2, 6), nrow = 2)
21 |
22 | expect_identical(replace_na(x, replace), expect)
23 | })
24 |
25 | test_that("can replace missing values in rcrds", {
26 | x <- new_rcrd(list(x = c(1, NA, NA), y = c(1, NA, 2)))
27 | expect <- new_rcrd(list(x = c(1, 0, NA), y = c(1, 0, 2)))
28 |
29 | expect_identical(
30 | replace_na(x, new_rcrd(list(x = 0, y = 0))),
31 | expect
32 | )
33 | })
34 |
35 | test_that("replacement must be castable to `data`", {
36 | x <- c(1L, NA)
37 | expect_snapshot(replace_na(x, 1.5), error = TRUE)
38 | })
39 |
40 | test_that("empty atomic elements are not replaced in lists (#1168)", {
41 | x <- list(character(), NULL)
42 |
43 | expect_identical(
44 | replace_na(x, replace = list("foo")),
45 | list(character(), "foo")
46 | )
47 | })
48 |
49 | test_that("can replace value in `NULL` (#1292)", {
50 | expect_identical(replace_na(NULL, replace = "NA"), NULL)
51 | expect_identical(replace_na(NULL, replace = 1L), NULL)
52 | })
53 |
54 | # data frame -------------------------------------------------------------
55 |
56 | test_that("empty call does nothing", {
57 | df <- tibble(x = c(1, NA))
58 | out <- replace_na(df)
59 | expect_equal(out, df)
60 | })
61 |
62 | test_that("missing values are replaced", {
63 | df <- tibble(x = c(1, NA))
64 | out <- replace_na(df, list(x = 0))
65 | expect_equal(out$x, c(1, 0))
66 | })
67 |
68 | test_that("don't complain about variables that don't exist", {
69 | df <- tibble(a = c(1, NA))
70 | out <- replace_na(df, list(a = 100, b = 0))
71 | expect_equal(out, tibble(a = c(1, 100)))
72 | })
73 |
74 | test_that("can replace NULLs in list-column", {
75 | df <- tibble(x = list(1, NULL))
76 | rs <- replace_na(df, list(x = list(1:5)))
77 |
78 | expect_identical(rs, tibble(x = list(1, 1:5)))
79 | })
80 |
81 | test_that("df-col rows must be completely missing to be replaceable", {
82 | col <- tibble(x = c(1, NA, NA), y = c(1, 2, NA))
83 | df <- tibble(a = col)
84 |
85 | col <- tibble(x = c(1, NA, -1), y = c(1, 2, -2))
86 | expect <- tibble(a = col)
87 |
88 | replace <- tibble(x = -1, y = -2)
89 |
90 | expect_identical(
91 | replace_na(df, list(a = replace)),
92 | expect
93 | )
94 | })
95 |
96 | test_that("replacement must be castable to corresponding column", {
97 | df <- tibble(a = c(1L, NA))
98 | expect_snapshot(replace_na(df, list(a = 1.5)), error = TRUE)
99 | })
100 |
101 | test_that("validates its inputs", {
102 | df <- tibble(a = c(1L, NA))
103 | expect_snapshot(error = TRUE, {
104 | replace_na(df, replace = 1)
105 | })
106 | })
107 |
--------------------------------------------------------------------------------
/tests/testthat/test-separate-longer.R:
--------------------------------------------------------------------------------
1 | test_that("separate_longer_delim() creates rows", {
2 | df <- tibble(id = 1:2, x = c("x", "y,z"))
3 | out <- separate_longer_delim(df, x, delim = ",")
4 | expect_equal(out$id, c(1, 2, 2))
5 | expect_equal(out$x, c("x", "y", "z"))
6 | })
7 |
8 | test_that("separate_longer_delim() validates its inputs", {
9 | df <- tibble(x = "x")
10 | expect_snapshot(error = TRUE, {
11 | df %>% separate_longer_delim()
12 | })
13 | expect_snapshot(error = TRUE, {
14 | df %>% separate_longer_delim(x, sep = 1)
15 | })
16 | })
17 |
18 | test_that("separate_longer_position() creates rows", {
19 | df <- tibble(id = 1:2, x = c("x", "yz"))
20 | out <- separate_longer_position(df, x, width = 1)
21 | expect_equal(out$id, c(1, 2, 2))
22 | expect_equal(out$x, c("x", "y", "z"))
23 | })
24 |
25 | test_that("separate_longer_position() can keep empty rows", {
26 | df <- tibble(id = 1:2, x = c("", "x"))
27 | out <- separate_longer_position(df, x, width = 1)
28 | expect_equal(out$id, 2)
29 | expect_equal(out$x, "x")
30 |
31 | out <- separate_longer_position(df, x, width = 1, keep_empty = TRUE)
32 | expect_equal(out$id, c(1, 2))
33 | expect_equal(out$x, c(NA, "x"))
34 | })
35 |
36 | test_that("works with zero-row data frame", {
37 | df <- tibble(x = character())
38 | expect_equal(separate_longer_position(df, x, 1), df)
39 | expect_equal(separate_longer_delim(df, x, ","), df)
40 | })
41 |
42 | test_that("separate_longer_position() validates its inputs", {
43 | df <- tibble(x = "x")
44 | expect_snapshot(error = TRUE, {
45 | df %>% separate_longer_position()
46 | })
47 | expect_snapshot(error = TRUE, {
48 | df %>% separate_longer_position(y, width = 1)
49 | })
50 | expect_snapshot(error = TRUE, {
51 | df %>% separate_longer_position(x, width = 1.5)
52 | })
53 | })
54 |
--------------------------------------------------------------------------------
/tests/testthat/test-separate-rows.R:
--------------------------------------------------------------------------------
1 | test_that("can handle collapsed rows", {
2 | df <- tibble(x = 1:3, y = c("a", "d,e,f", "g,h"))
3 | expect_equal(separate_rows(df, y)$y, unlist(strsplit(df$y, "\\,")))
4 | })
5 |
6 | test_that("can handle empty data frames (#308)", {
7 | df <- tibble(a = character(), b = character())
8 | rs <- separate_rows(df, b)
9 | expect_equal(rs, tibble(a = character(), b = unspecified()))
10 | })
11 |
12 | test_that("default pattern does not split decimals in nested strings", {
13 | df <- dplyr::tibble(x = 1:3, y = c("1", "1.0,1.1", "2.1"))
14 | expect_equal(separate_rows(df, y)$y, unlist(strsplit(df$y, ",")))
15 | })
16 |
17 | test_that("preserves grouping", {
18 | df <- tibble(g = 1, x = "a:b") %>% dplyr::group_by(g)
19 | rs <- df %>% separate_rows(x)
20 |
21 | expect_equal(class(df), class(rs))
22 | expect_equal(dplyr::group_vars(df), dplyr::group_vars(rs))
23 | })
24 |
25 | test_that("drops grouping when needed", {
26 | df <- tibble(x = 1, y = "a:b") %>% dplyr::group_by(x, y)
27 |
28 | out <- df %>% separate_rows(y)
29 | expect_equal(out$y, c("a", "b"))
30 | expect_equal(dplyr::group_vars(out), "x")
31 |
32 | out <- df %>%
33 | dplyr::group_by(y) %>%
34 | separate_rows(y)
35 | expect_equal(dplyr::group_vars(out), character())
36 | })
37 |
38 | test_that("drops grouping on zero row data frames when needed (#886)", {
39 | df <- tibble(x = numeric(), y = character()) %>% dplyr::group_by(y)
40 | out <- df %>% separate_rows(y)
41 | expect_equal(dplyr::group_vars(out), character())
42 | })
43 |
44 | test_that("convert produces integers etc", {
45 | df <- tibble(x = "1,2,3", y = "T,F,T", z = "a,b,c")
46 |
47 | out <- separate_rows(df, x, y, z, convert = TRUE)
48 | expect_equal(class(out$x), "integer")
49 | expect_equal(class(out$y), "logical")
50 | expect_equal(class(out$z), "character")
51 | })
52 |
53 | test_that("leaves list columns intact (#300)", {
54 | df <- tibble(x = "1,2,3", y = list(1))
55 |
56 | out <- separate_rows(df, x)
57 | # Can't compare tibbles with list columns directly
58 | expect_equal(names(out), c("x", "y"))
59 | expect_equal(out$x, as.character(1:3))
60 | expect_equal(out$y, rep(list(1), 3))
61 | })
62 |
63 | test_that("does not silently drop blank values (#1014)", {
64 | df <- tibble(x = 1:3, y = c("a", "d,e,f", ""))
65 |
66 | out <- separate_rows(df, y)
67 | expect_equal(
68 | out,
69 | tibble(x = c(1, 2, 2, 2, 3), y = c("a", "d", "e", "f", ""))
70 | )
71 | })
72 |
73 | test_that("it validates its inputs", {
74 | df <- tibble(x = 1:3, y = c("a", "d,e,f", ""))
75 |
76 | expect_snapshot(error = TRUE, {
77 | separate_rows(df, x, sep = 1)
78 | })
79 | expect_snapshot(error = TRUE, {
80 | separate_rows(df, x, convert = 1)
81 | })
82 | })
83 |
--------------------------------------------------------------------------------
/tests/testthat/test-seq.R:
--------------------------------------------------------------------------------
1 | test_that("full_seq with tol > 0 allows sequences to fall short of period", {
2 | expect_equal(full_seq(c(0, 10, 20), 11, tol = 2), c(0, 11, 22))
3 | })
4 |
5 | test_that("full_seq pads length correctly for tol > 0", {
6 | expect_equal(full_seq(c(0, 10, 16), 11, tol = 5), c(0, 11))
7 | })
8 |
9 | test_that("sequences don't have to start at zero", {
10 | expect_equal(full_seq(c(1, 5), 2), c(1, 3, 5))
11 | })
12 |
13 | test_that("full_seq fills in gaps", {
14 | expect_equal(full_seq(c(1, 3), 1), c(1, 2, 3))
15 | })
16 |
17 | test_that("preserves attributes", {
18 | x1 <- as.Date("2001-01-01") + c(0, 2)
19 | x2 <- as.POSIXct(x1)
20 |
21 | expect_s3_class(full_seq(x1, 2), "Date")
22 | expect_s3_class(full_seq(x2, 86400), c("POSIXct", "POSIXt"))
23 | })
24 |
25 | test_that("full_seq errors if sequence isn't regular", {
26 | expect_snapshot(error = TRUE, {
27 | full_seq(c(1, 3, 4), 2)
28 | full_seq(c(0, 10, 20), 11, tol = 1.8)
29 | })
30 | })
31 |
32 | test_that("validates inputs", {
33 | x <- 1:5
34 | expect_snapshot(error = TRUE, {
35 | full_seq(x, period = "a")
36 | full_seq(x, 1, tol = "a")
37 | })
38 | })
39 |
--------------------------------------------------------------------------------
/tests/testthat/test-uncount.R:
--------------------------------------------------------------------------------
1 | test_that("symbols weights are dropped in output", {
2 | df <- tibble(x = 1, w = 1)
3 | expect_equal(uncount(df, w), tibble(x = 1))
4 | })
5 |
6 | test_that("can request to preserve symbols", {
7 | df <- tibble(x = 1, w = 1)
8 | expect_equal(uncount(df, w, .remove = FALSE), df)
9 | })
10 |
11 | test_that("unique identifiers created on request", {
12 | df <- tibble(w = 1:3)
13 | expect_equal(uncount(df, w, .id = "id"), tibble(id = c(1L, 1:2, 1:3)))
14 | })
15 |
16 | test_that("expands constants and expressions", {
17 | df <- tibble(x = 1, w = 2)
18 |
19 | expect_equal(uncount(df, 2), df[c(1, 1), ])
20 | expect_equal(uncount(df, 1 + 1), df[c(1, 1), ])
21 | })
22 |
23 | test_that("works with groups", {
24 | df <- tibble(g = 1, x = 1, w = 1) %>% dplyr::group_by(g)
25 | expect_equal(uncount(df, w), df %>% dplyr::select(-w))
26 | })
27 |
28 | test_that("must evaluate to integer", {
29 | df <- tibble(x = 1, w = 1 / 2)
30 | expect_error(uncount(df, w), class = "vctrs_error_cast_lossy")
31 |
32 | df <- tibble(x = 1)
33 | expect_error(uncount(df, "W"), class = "vctrs_error_incompatible_type")
34 | })
35 |
36 | test_that("works with 0 weights", {
37 | df <- tibble(x = 1:2, w = c(0, 1))
38 | expect_equal(uncount(df, w), tibble(x = 2))
39 | })
40 |
41 | test_that("validates inputs", {
42 | df <- tibble(x = 1, y = "a", w = -1)
43 |
44 | expect_snapshot(error = TRUE, {
45 | uncount(df, y)
46 | uncount(df, w)
47 | uncount(df, x, .remove = 1)
48 | uncount(df, x, .id = "")
49 | })
50 | })
51 |
--------------------------------------------------------------------------------
/tests/testthat/test-unite.R:
--------------------------------------------------------------------------------
1 | test_that("unite pastes columns together & removes old col", {
2 | df <- tibble(x = "a", y = "b")
3 | out <- unite(df, z, x:y)
4 | expect_equal(names(out), "z")
5 | expect_equal(out$z, "a_b")
6 | })
7 |
8 | test_that("unite does not remove new col in case of name clash", {
9 | df <- tibble(x = "a", y = "b")
10 | out <- unite(df, x, x:y)
11 | expect_equal(names(out), "x")
12 | expect_equal(out$x, "a_b")
13 | })
14 |
15 | test_that("unite preserves grouping", {
16 | df <- tibble(g = 1, x = "a") %>% dplyr::group_by(g)
17 | rs <- df %>% unite(x, x)
18 | expect_equal(df, rs)
19 | expect_equal(class(df), class(rs))
20 | expect_equal(dplyr::group_vars(df), dplyr::group_vars(rs))
21 | })
22 |
23 | test_that("drops grouping when needed", {
24 | df <- tibble(g = 1, x = "a") %>% dplyr::group_by(g)
25 | rs <- df %>% unite(gx, g, x)
26 | expect_equal(rs$gx, "1_a")
27 | expect_equal(dplyr::group_vars(rs), character())
28 | })
29 |
30 | test_that("preserves row names of data.frames (#1454)", {
31 | df <- data.frame(x = c("1", "2"), y = c("3", "4"), row.names = c("a", "b"))
32 | expect_identical(row.names(unite(df, "xy", x, y)), c("a", "b"))
33 | })
34 |
35 | test_that("empty var spec uses all vars", {
36 | df <- tibble(x = "a", y = "b")
37 | expect_equal(unite(df, "z"), tibble(z = "a_b"))
38 | })
39 |
40 | test_that("can remove missing vars on request", {
41 | df <- expand_grid(x = c("a", NA), y = c("b", NA))
42 | out <- unite(df, "z", x:y, na.rm = TRUE)
43 |
44 | expect_equal(out$z, c("a_b", "a", "b", ""))
45 | })
46 |
47 | test_that("regardless of the type of the NA", {
48 | vec_unite <- function(df, vars) {
49 | unite(df, "out", any_of(vars), na.rm = TRUE)$out
50 | }
51 |
52 | df <- tibble(
53 | x = c("x", "y", "z"),
54 | lgl = NA,
55 | dbl = NA_real_,
56 | chr = NA_character_
57 | )
58 |
59 | expect_equal(vec_unite(df, c("x", "lgl")), c("x", "y", "z"))
60 | expect_equal(vec_unite(df, c("x", "dbl")), c("x", "y", "z"))
61 | expect_equal(vec_unite(df, c("x", "chr")), c("x", "y", "z"))
62 | })
63 |
64 | test_that("validates its inputs", {
65 | df <- tibble(x = "a", y = "b")
66 |
67 | expect_snapshot(error = TRUE, {
68 | unite(df)
69 | unite(df, "z", x:y, sep = 1)
70 | unite(df, "z", x:y, remove = 1)
71 | unite(df, "z", x:y, na.rm = 1)
72 | })
73 | })
74 |
75 | test_that("returns an empty string column for empty selections (#1548)", {
76 | # i.e. it returns the initial value that would be used in a reduction algorithm
77 |
78 | x <- tibble(
79 | x = c("x", "y", "z"),
80 | y = c(1, 2, 3)
81 | )
82 |
83 | out <- unite(x, "new", all_of(c()))
84 |
85 | expect_identical(names(out), c("x", "y", "new"))
86 | expect_identical(out$new, c("", "", ""))
87 | })
88 |
89 | test_that("works with 0 column data frames and empty selections (#1570)", {
90 | x <- tibble(.rows = 2L)
91 |
92 | # No `...` implies "unite all the columns"
93 | out <- unite(x, "new")
94 | expect_identical(names(out), "new")
95 | expect_identical(out$new, c("", ""))
96 |
97 | # Empty selection
98 | out <- unite(x, "new", all_of(names(x)))
99 | expect_identical(names(out), "new")
100 | expect_identical(out$new, c("", ""))
101 | })
102 |
--------------------------------------------------------------------------------
/tests/testthat/test-unnest-auto.R:
--------------------------------------------------------------------------------
1 | # unnest_auto -------------------------------------------------------------
2 |
3 | test_that("unnamed becomes longer", {
4 | df <- tibble(x = 1:2, y = list(1, 2:3))
5 | expect_message(out <- df %>% unnest_auto(y), "unnest_longer")
6 | expect_equal(out$y, c(1, 2, 3))
7 | })
8 |
9 | test_that("common name becomes wider", {
10 | df <- tibble(x = 1:2, y = list(c(a = 1), c(a = 2)))
11 | expect_message(out <- df %>% unnest_auto(y), "unnest_wider")
12 | expect_named(out, c("x", "a"))
13 | })
14 |
15 | test_that("no common name falls back to longer with index", {
16 | df <- tibble(x = 1:2, y = list(c(a = 1), c(b = 2)))
17 | expect_message(out <- df %>% unnest_auto(y), "unnest_longer")
18 | expect_named(out, c("x", "y", "y_id"))
19 | })
20 |
21 | test_that("mix of named and unnamed becomes longer", {
22 | df <- tibble(x = 1:2, y = list(c(a = 1), 2))
23 | expect_message(out <- df %>% unnest_auto(y), "unnest_longer")
24 | expect_named(out, c("x", "y"))
25 | })
26 |
27 | # https://github.com/tidyverse/tidyr/issues/959
28 | test_that("works with an input that has column named `col`", {
29 | df <- tibble(
30 | col = 1L,
31 | list_col = list(list(x = "a", y = "b"), list(x = "c", y = "d"))
32 | )
33 | expect_message(out <- df %>% unnest_auto(list_col), "unnest_wider")
34 | expect_named(out, c("col", "x", "y"))
35 | })
36 |
--------------------------------------------------------------------------------
/tests/testthat/test-utils.R:
--------------------------------------------------------------------------------
1 | test_that("tidyr_legacy copies old approach", {
2 | expect_equal(tidyr_legacy(c()), character())
3 | expect_equal(tidyr_legacy(c("x", "x", "y")), c("x", "x1", "y"))
4 | expect_equal(tidyr_legacy(c("", "", "")), c("V1", "V2", "V3"))
5 | })
6 |
7 | test_that("reconstruct doesn't repair names", {
8 | # This ensures that name repair elsewhere isn't overridden
9 | df <- tibble(x = 1, x = 2, .name_repair = "minimal")
10 | expect_equal(reconstruct_tibble(df, df), df)
11 | })
12 |
--------------------------------------------------------------------------------
/tidyr.Rproj:
--------------------------------------------------------------------------------
1 | Version: 1.0
2 |
3 | RestoreWorkspace: No
4 | SaveWorkspace: No
5 | AlwaysSaveHistory: Default
6 |
7 | EnableCodeIndexing: Yes
8 | UseSpacesForTab: Yes
9 | NumSpacesForTab: 2
10 | Encoding: UTF-8
11 |
12 | RnwWeave: Sweave
13 | LaTeX: pdfLaTeX
14 |
15 | AutoAppendNewline: Yes
16 | StripTrailingWhitespace: Yes
17 |
18 | BuildType: Package
19 | PackageUseDevtools: Yes
20 | PackageInstallArgs: --no-multiarch --with-keep.source
21 | PackageRoxygenize: rd,collate,namespace
22 |
--------------------------------------------------------------------------------
/vignettes/.gitignore:
--------------------------------------------------------------------------------
1 | *.html
2 | *.R
3 | rectangle_cache
4 |
--------------------------------------------------------------------------------
/vignettes/nest.Rmd:
--------------------------------------------------------------------------------
1 | ---
2 | title: "Nested data"
3 | output: rmarkdown::html_vignette
4 | description: |
5 | A nested data frame contains a list-column of data frames. It's an
6 | alternative way of representing grouped data that works particularly well
7 | when you're modelling.
8 | vignette: >
9 | %\VignetteIndexEntry{Nested data}
10 | %\VignetteEngine{knitr::rmarkdown}
11 | %\VignetteEncoding{UTF-8}
12 | ---
13 |
14 | ```{r, include = FALSE}
15 | knitr::opts_chunk$set(
16 | collapse = TRUE,
17 | comment = "#>"
18 | )
19 | ```
20 |
21 | ```{r setup, message = FALSE}
22 | library(tidyr)
23 | library(dplyr)
24 | library(purrr)
25 | ```
26 |
27 | ## Basics
28 |
29 | A nested data frame is a data frame where one (or more) of the columns is a list of data frames. You can create simple nested data frames by hand:
30 |
31 | ```{r}
32 | df1 <- tibble(
33 | g = c(1, 2, 3),
34 | data = list(
35 | tibble(x = 1, y = 2),
36 | tibble(x = 4:5, y = 6:7),
37 | tibble(x = 10)
38 | )
39 | )
40 |
41 | df1
42 | ```
43 |
44 | (It is possible to create list-columns in regular data frames, not just in tibbles, but it's considerably more work because the default behaviour of `data.frame()` is to treat lists as lists of columns.)
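
As a rough illustration of that extra work, here is a minimal base R sketch (not part of the original vignette; the names `df_base` and `df_base2` are hypothetical). It shows two common workarounds: wrapping the list in `I()`, or assigning the list-column after the data frame has been created.

```r
# Wrapping the list in I() stops data.frame() from treating it as a list of columns.
df_base <- data.frame(g = 1:2, data = I(list(1:2, 3:5)))

# Alternatively, build the data frame first and assign the list-column afterwards.
df_base2 <- data.frame(g = 1:2)
df_base2$data <- list(1:2, 3:5)

# Both approaches give a two-row data frame whose `data` column is a list.
is.list(df_base$data)
is.list(df_base2$data)
```

With `tibble()`, neither step should be needed, which is why the examples here use tibbles.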
45 |
46 | But more commonly you'll create them with `tidyr::nest()`:
47 |
48 | ```{r}
49 | df2 <- tribble(
50 | ~g, ~x, ~y,
51 | 1, 1, 2,
52 | 2, 4, 6,
53 | 2, 5, 7,
54 | 3, 10, NA
55 | )
56 | df2 %>% nest(data = c(x, y))
57 | ```
58 |
59 | `nest()` specifies which variables should be nested inside; an alternative is to use `dplyr::group_by()` to describe which variables should be kept outside.
60 |
61 | ```{r}
62 | df2 %>% group_by(g) %>% nest()
63 | ```
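
The two spellings can be compared side by side. The sketch below is an illustration rather than part of the original vignette: it rebuilds `df2` so it is self-contained, and the names `by_inside` and `by_outside` are hypothetical. The point it tries to show is that the `group_by()` route keeps the grouping on the result, so it needs an `ungroup()` before the two outputs have the same shape.

```r
library(tidyr)
library(dplyr)

df2 <- tribble(
  ~g, ~x, ~y,
   1,  1,  2,
   2,  4,  6,
   2,  5,  7,
   3, 10, NA
)

# Say which columns go inside the nested data frames...
by_inside <- df2 %>% nest(data = c(x, y))

# ...or say which columns stay outside, via the grouping.
by_grouped <- df2 %>% group_by(g) %>% nest()

# The grouped version is still grouped by `g`; drop the grouping to compare.
group_vars(by_grouped)
by_outside <- by_grouped %>% ungroup()

names(by_inside)
names(by_outside)
```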
64 |
65 | I think nesting is easiest to understand in connection with grouped data: each row in the output corresponds to one _group_ in the input. We'll see shortly that this is particularly convenient when you have other per-group objects.
66 |
67 | The opposite of `nest()` is `unnest()`. You give it the name of a list-column containing data frames, and it row-binds the data frames together, repeating the outer columns the right number of times to line up.
68 |
69 | ```{r}
70 | df1 %>% unnest(data)
71 | ```
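
Because `unnest()` reverses `nest()`, nesting and then unnesting should give back the rows you started with. The following sketch (an illustration, not part of the original vignette; `round_trip` is a hypothetical name) checks this for a small tibble with the same shape as `df2`.

```r
library(tidyr)
library(dplyr)

df2 <- tribble(
  ~g, ~x, ~y,
   1,  1,  2,
   2,  4,  6,
   2,  5,  7,
   3, 10, NA
)

# Nest x and y inside `data`, then immediately unnest again.
round_trip <- df2 %>%
  nest(data = c(x, y)) %>%
  unnest(data)

# Compare as plain data frames so only the values and column names matter.
all.equal(as.data.frame(round_trip), as.data.frame(df2))
```

In this example the rows for each value of `g` are already adjacent, so the row order also survives the round trip. That may not hold when a group's rows are scattered through the input.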
72 |
73 | ## Nested data and models
74 |
75 | Nested data is a great fit for problems where you have one of _something_ for each group. A common place this arises is when you're fitting multiple models.
76 |
77 | ```{r}
78 | mtcars_nested <- mtcars %>%
79 | group_by(cyl) %>%
80 | nest()
81 |
82 | mtcars_nested
83 | ```
84 |
85 | Once you have a list of data frames, it's very natural to produce a list of models:
86 |
87 | ```{r}
88 | mtcars_nested <- mtcars_nested %>%
89 | mutate(model = map(data, function(df) lm(mpg ~ wt, data = df)))
90 | mtcars_nested
91 | ```
92 |
93 | And then you could even produce a list of predictions:
94 |
95 | ```{r}
96 | mtcars_nested <- mtcars_nested %>%
97 | mutate(pred = map(model, predict))
98 | mtcars_nested
99 | ```
100 |
101 | This workflow pairs particularly well with [broom](https://broom.tidymodels.org/), which makes it easy to turn models into tidy data frames that can then be `unnest()`ed to get back to flat data frames. You can see a bigger example in the [broom and dplyr vignette](https://broom.tidymodels.org/articles/broom_and_dplyr.html).
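
Even without broom, the per-group predictions can be brought back into a flat data frame with `unnest()`. The sketch below is a minimal illustration, not the broom workflow itself. It assumes the same `mpg ~ wt` model as above; the name `mtcars_flat` and the use of `nest(data = -cyl)` in place of `group_by()` are choices made for this example.

```r
library(tidyr)
library(dplyr)
library(purrr)

mtcars_flat <- mtcars %>%
  # One row per value of `cyl`, with the remaining columns nested in `data`.
  nest(data = -cyl) %>%
  mutate(
    # Fit one linear model per group.
    model = map(data, function(df) lm(mpg ~ wt, data = df)),
    # One prediction per row of each group's data; names are dropped for simplicity.
    pred = map(model, function(m) unname(predict(m)))
  ) %>%
  # Models are not needed in the flat result.
  select(cyl, data, pred) %>%
  # Unnest the data and its predictions together so the rows stay aligned.
  unnest(c(data, pred))

dim(mtcars_flat)
```

Unnesting `data` and `pred` in the same call matters: each prediction vector has one element per row of the matching data frame, so unnesting them together keeps each prediction next to the observation it belongs to.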
102 |
--------------------------------------------------------------------------------
/vignettes/pivot-long.key:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/vignettes/pivot-long.key
--------------------------------------------------------------------------------
/vignettes/pivot-wide.key:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/tidyverse/tidyr/9783be32423cb9125ed12bc3fa5962ef64dbd337/vignettes/pivot-wide.key
--------------------------------------------------------------------------------
/vignettes/weather.csv:
--------------------------------------------------------------------------------
1 | "id","year","month","element","d1","d2","d3","d4","d5","d6","d7","d8","d9","d10","d11","d12","d13","d14","d15","d16","d17","d18","d19","d20","d21","d22","d23","d24","d25","d26","d27","d28","d29","d30","d31"
2 | "MX17004",2010,1,"tmax",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,27.8,NA
3 | "MX17004",2010,1,"tmin",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,14.5,NA
4 | "MX17004",2010,2,"tmax",NA,27.3,24.1,NA,NA,NA,NA,NA,NA,NA,29.7,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,29.9,NA,NA,NA,NA,NA,NA,NA,NA
5 | "MX17004",2010,2,"tmin",NA,14.4,14.4,NA,NA,NA,NA,NA,NA,NA,13.4,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,10.7,NA,NA,NA,NA,NA,NA,NA,NA
6 | "MX17004",2010,3,"tmax",NA,NA,NA,NA,32.1,NA,NA,NA,NA,34.5,NA,NA,NA,NA,NA,31.1,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
7 | "MX17004",2010,3,"tmin",NA,NA,NA,NA,14.2,NA,NA,NA,NA,16.8,NA,NA,NA,NA,NA,17.6,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
8 | "MX17004",2010,4,"tmax",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,36.3,NA,NA,NA,NA
9 | "MX17004",2010,4,"tmin",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.7,NA,NA,NA,NA
10 | "MX17004",2010,5,"tmax",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,33.2,NA,NA,NA,NA
11 | "MX17004",2010,5,"tmin",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18.2,NA,NA,NA,NA
12 | "MX17004",2010,6,"tmax",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,30.1,NA,NA
13 | "MX17004",2010,6,"tmin",NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,17.5,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,18,NA,NA
14 | "MX17004",2010,7,"tmax",NA,NA,28.6,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,29.9,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
15 | "MX17004",2010,7,"tmin",NA,NA,17.5,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,16.5,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
16 | "MX17004",2010,8,"tmax",NA,NA,NA,NA,29.6,NA,NA,29,NA,NA,NA,NA,29.8,NA,NA,NA,NA,NA,NA,NA,NA,NA,26.4,NA,29.7,NA,NA,NA,28,NA,25.4
17 | "MX17004",2010,8,"tmin",NA,NA,NA,NA,15.8,NA,NA,17.3,NA,NA,NA,NA,16.5,NA,NA,NA,NA,NA,NA,NA,NA,NA,15,NA,15.6,NA,NA,NA,15.3,NA,15.4
18 | "MX17004",2010,10,"tmax",NA,NA,NA,NA,27,NA,28.1,NA,NA,NA,NA,NA,NA,29.5,28.7,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,31.2,NA,NA,NA
19 | "MX17004",2010,10,"tmin",NA,NA,NA,NA,14,NA,12.9,NA,NA,NA,NA,NA,NA,13,10.5,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,15,NA,NA,NA
20 | "MX17004",2010,11,"tmax",NA,31.3,NA,27.2,26.3,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,28.1,27.7,NA,NA,NA,NA
21 | "MX17004",2010,11,"tmin",NA,16.3,NA,12,7.9,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,12.1,14.2,NA,NA,NA,NA
22 | "MX17004",2010,12,"tmax",29.9,NA,NA,NA,NA,27.8,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
23 | "MX17004",2010,12,"tmin",13.8,NA,NA,NA,NA,10.5,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA,NA
24 |
--------------------------------------------------------------------------------