├── .DS_Store ├── .Rbuildignore ├── .gitignore ├── .travis.yml ├── CRAN-RELEASE ├── DESCRIPTION ├── LICENSE ├── LICENSE.md ├── NAMESPACE ├── NEWS.md ├── R ├── binarize.R ├── correlate.R ├── correlationfunnel-package.R ├── data.R ├── global_variables.R ├── plot_correlation_funnel.R ├── tidyquant_theme_compat.R ├── utils-pipe.R └── zzz.R ├── README.Rmd ├── README.md ├── _pkgdown.yml ├── codecov.yml ├── correlationfunnel.Rproj ├── cran-comments.md ├── data ├── customer_churn_tbl.rda └── marketing_campaign_tbl.rda ├── docs ├── 404.html ├── LICENSE-text.html ├── LICENSE.html ├── articles │ ├── faqs.html │ ├── faqs_files │ │ └── figure-html │ │ │ ├── unnamed-chunk-12-1.png │ │ │ ├── unnamed-chunk-15-1.png │ │ │ ├── unnamed-chunk-2-1.png │ │ │ ├── unnamed-chunk-21-1.png │ │ │ ├── unnamed-chunk-23-1.png │ │ │ ├── unnamed-chunk-5-1.png │ │ │ ├── unnamed-chunk-6-1.png │ │ │ └── unnamed-chunk-9-1.png │ ├── index.html │ ├── introducing_correlation_funnel.html │ ├── introducing_correlation_funnel_files │ │ ├── figure-html │ │ │ └── unnamed-chunk-5-1.png │ │ └── header-attrs-2.1 │ │ │ └── header-attrs.js │ ├── key_considerations.html │ └── key_considerations_files │ │ ├── figure-html │ │ ├── unnamed-chunk-10-1.png │ │ ├── unnamed-chunk-12-1.png │ │ ├── unnamed-chunk-13-1.png │ │ ├── unnamed-chunk-15-1.png │ │ ├── unnamed-chunk-16-1.png │ │ ├── unnamed-chunk-2-1.png │ │ ├── unnamed-chunk-21-1.png │ │ ├── unnamed-chunk-22-1.png │ │ ├── unnamed-chunk-23-1.png │ │ ├── unnamed-chunk-24-1.png │ │ ├── unnamed-chunk-3-1.png │ │ ├── unnamed-chunk-5-1.png │ │ ├── unnamed-chunk-6-1.png │ │ ├── unnamed-chunk-7-1.png │ │ └── unnamed-chunk-9-1.png │ │ └── header-attrs-2.1 │ │ └── header-attrs.js ├── authors.html ├── bootstrap-toc.css ├── bootstrap-toc.js ├── docsearch.css ├── docsearch.js ├── index.html ├── link.svg ├── man │ └── figures │ │ └── README-corr_funnel.png ├── news │ └── index.html ├── pkgdown.css ├── pkgdown.js ├── pkgdown.yml └── reference │ ├── binarize.html │ ├── 
correlate.html │ ├── correlationfunnel-package.html │ ├── customer_churn_tbl.html │ ├── figures │ ├── README-3-course-system.jpg │ ├── README-corr_funnel.png │ ├── README-unnamed-chunk-5-1.png │ ├── README-unnamed-chunk-6-1.png │ └── logo-correlationfunnel.png │ ├── index.html │ ├── marketing_campaign_tbl.html │ ├── pipe.html │ ├── plot_correlation_funnel-1.png │ └── plot_correlation_funnel.html ├── man ├── .DS_Store ├── binarize.Rd ├── correlate.Rd ├── correlationfunnel-package.Rd ├── customer_churn_tbl.Rd ├── figures │ ├── .DS_Store │ ├── README-3-course-system.jpg │ ├── README-corr_funnel.png │ ├── README-unnamed-chunk-5-1.png │ ├── README-unnamed-chunk-6-1.png │ └── logo-correlationfunnel.png ├── marketing_campaign_tbl.Rd ├── pipe.Rd └── plot_correlation_funnel.Rd ├── tests ├── testthat.R └── testthat │ ├── test-binarize.R │ ├── test-correlate.R │ └── test-plot_correlation_funnel.R └── vignettes ├── .gitignore ├── introducing_correlation_funnel.Rmd └── key_considerations.Rmd /.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/.DS_Store -------------------------------------------------------------------------------- /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^.*\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^LICENSE\.md$ 4 | ^README\.Rmd$ 5 | ^\.travis\.yml$ 6 | ^codecov\.yml$ 7 | ^cran-comments\.md$ 8 | ^_pkgdown\.yml$ 9 | ^docs$ 10 | ^CRAN-RELEASE$ 11 | ^doc$ 12 | ^Meta$ 13 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | .Ruserdata 5 | inst/doc 6 | doc 7 | Meta 8 | -------------------------------------------------------------------------------- /.travis.yml: 
-------------------------------------------------------------------------------- 1 | # R for travis: see documentation at https://docs.travis-ci.com/user/languages/r 2 | 3 | language: R 4 | sudo: false 5 | cache: packages 6 | 7 | after_success: 8 | - Rscript -e 'covr::codecov()' 9 | -------------------------------------------------------------------------------- /CRAN-RELEASE: -------------------------------------------------------------------------------- 1 | This package was submitted to CRAN on 2019-08-05. 2 | Once it is accepted, delete this file and tag the release (commit 97c6fffe64). 3 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: correlationfunnel 2 | Type: Package 3 | Title: Speed Up Exploratory Data Analysis (EDA) with the Correlation Funnel 4 | Version: 0.2.0 5 | Authors@R: 6 | person("Matt", "Dancho", email = "mdancho@business-science.io", role = c("aut", "cre")) 7 | Description: 8 | Speeds up exploratory data analysis (EDA) 9 | by providing a succinct workflow and interactive visualization tools for understanding 10 | which features have relationships to target (response). Uses binary correlation analysis 11 | to determine relationship. Default correlation method is the Pearson method. 12 | Lian Duan, W Nick Street, Yanchi Liu, Songhua Xu, and Brook Wu (2014) . 
13 | URL: https://github.com/business-science/correlationfunnel, https://business-science.github.io/correlationfunnel/ 14 | BugReports: https://github.com/business-science/correlationfunnel/issues 15 | License: MIT + file LICENSE 16 | Encoding: UTF-8 17 | LazyData: true 18 | Depends: 19 | R (>= 3.1) 20 | Imports: 21 | ggplot2, 22 | rlang, 23 | recipes, 24 | magrittr, 25 | plotly, 26 | tibble, 27 | dplyr (>= 1.0.0), 28 | tidyr (>= 1.0.0), 29 | stats, 30 | utils, 31 | ggrepel, 32 | stringr, 33 | forcats, 34 | purrr, 35 | cli, 36 | crayon, 37 | rstudioapi 38 | Suggests: 39 | scales, 40 | knitr, 41 | rmarkdown, 42 | covr, 43 | lubridate, 44 | testthat (>= 2.1.0) 45 | RoxygenNote: 7.1.0 46 | Roxygen: list(markdown = TRUE) 47 | VignetteBuilder: knitr 48 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | YEAR: 2019 2 | COPYRIGHT HOLDER: Matt Dancho 3 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | # MIT License 2 | 3 | Copyright (c) 2019 Matt Dancho 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 
14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | S3method(binarize,data.frame) 4 | S3method(binarize,default) 5 | S3method(correlate,data.frame) 6 | S3method(correlate,default) 7 | S3method(plot_correlation_funnel,data.frame) 8 | S3method(plot_correlation_funnel,default) 9 | export("%>%") 10 | export(binarize) 11 | export(correlate) 12 | export(plot_correlation_funnel) 13 | importFrom(ggplot2,"%+replace%") 14 | importFrom(ggplot2,element_blank) 15 | importFrom(ggplot2,element_line) 16 | importFrom(ggplot2,element_rect) 17 | importFrom(ggplot2,element_text) 18 | importFrom(ggplot2,margin) 19 | importFrom(ggplot2,rel) 20 | importFrom(ggplot2,theme) 21 | importFrom(ggplot2,theme_grey) 22 | importFrom(ggplot2,unit) 23 | importFrom(magrittr,"%>%") 24 | importFrom(recipes,"all_nominal") 25 | importFrom(recipes,"all_numeric") 26 | importFrom(recipes,"all_predictors") 27 | importFrom(rlang,"!!") 28 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | 2 | # correlationfunnel 0.2.0 3 | 4 | * Fix - Allow `integer` data to be `binarize()`-ed 5 | * Fix - Allow `logical` data to be `binarize()`-ed. Values are converted to `integer` and then binarized. 
6 | * Compatibility with `dplyr` 1.0.0 7 | 8 | # correlationfunnel 0.1.0 9 | 10 | * Initial CRAN Submission 11 | 12 | # correlationfunnel 0.0.9 13 | 14 | * Package under development 15 | * Added a `NEWS.md` file to track changes to the package. 16 | -------------------------------------------------------------------------------- /R/correlate.R: -------------------------------------------------------------------------------- 1 | #' Correlate a response (target) to features in a data set. 2 | #' 3 | #' \code{correlate} returns a correlation between a target column and the features in a data set. 4 | #' 5 | #' 6 | #' @param data A `tibble` or `data.frame` 7 | #' @param target The feature that contains the response (Target) whose relationship to the other features you want to measure. 8 | #' @param ... Other arguments passed to \link[stats]{cor} 9 | #' 10 | #' @return A `tbl` 11 | #' 12 | #' @details 13 | #' The `correlate()` function provides a convenient wrapper around the \link[stats]{cor} function where the `target` 14 | #' is the column containing the Y variable. The function is intended to be used with [`binarize()`], which enables 15 | #' creation of the binary correlation analysis, which is the feed data for the [`plot_correlation_funnel()`] visualization. 16 | #' 17 | #' The default method is the Pearson correlation, which is the Correlation Coefficient from L. Duan et al., 2014. 18 | #' This represents the linear relationship between two dichotomous features (binary variables). 19 | #' Learn more about the binary correlation approach in the Vignette covering the Methodology, Key Considerations and FAQs. 20 | #' 21 | #' 22 | #' 23 | #' @references 24 | #' Lian Duan, W. Nick Street, Yanchi Liu, Songhua Xu, and Brook Wu. 2014. Selecting the right correlation 25 | #' measure for binary data. ACM Trans. Knowl. Discov. Data 9, 2, Article 13 (September 2014), 28 pages. 
26 | #' DOI: http://dx.doi.org/10.1145/2637484 27 | #' 28 | #' @seealso 29 | #' [binarize()], [plot_correlation_funnel()] 30 | #' 31 | #' @examples 32 | #' library(dplyr) 33 | #' library(correlationfunnel) 34 | #' 35 | #' marketing_campaign_tbl %>% 36 | #' select(-ID) %>% 37 | #' binarize() %>% 38 | #' correlate(TERM_DEPOSIT__yes) 39 | #' 40 | #' 41 | #' @importFrom rlang !! 42 | #' 43 | #' @export 44 | correlate <- function(data, target, ...) { 45 | UseMethod("correlate", data) 46 | } 47 | 48 | #' @export 49 | correlate.default <- function(data, target, ...) { 50 | stop("Error correlate(): Object is not of class `data.frame`.", call. = FALSE) 51 | } 52 | 53 | #' @export 54 | correlate.data.frame <- function(data, target, ...) { 55 | 56 | # Check missing 57 | if (missing(target)) stop('Error in correlate(): argument "target" is missing, with no default', call. = FALSE) 58 | 59 | # Check all data is numeric 60 | check_data_type(data, 61 | classes_allowed = c("numeric", "integer", "logical"), 62 | .fun_name = "correlate") 63 | 64 | # Extract target 65 | target_expr <- rlang::enquo(target) 66 | target_name <- rlang::quo_name(target_expr) 67 | y <- data %>% dplyr::pull(!! target_expr) 68 | 69 | # Check data balance 70 | if (is.binary(y)) check_imbalance(x = y, thresh = 0.05, .col_name = target_name, .fun_name = "correlate") 71 | 72 | # Correlation logic 73 | data_transformed_tbl <- data %>% 74 | stats::cor(y = y, ...) 
%>% 75 | tibble::as_tibble(rownames = "feature", .name_repair = "minimal") %>% 76 | dplyr::rename(correlation = 2) %>% 77 | tidyr::separate(feature, into = c("feature", "bin"), sep = "__") %>% 78 | dplyr::filter(!is.na(correlation)) %>% 79 | dplyr::arrange(abs(correlation) %>% dplyr::desc()) %>% 80 | dplyr::mutate(feature = forcats::as_factor(feature) %>% forcats::fct_rev()) 81 | 82 | return(data_transformed_tbl) 83 | 84 | } 85 | 86 | # is.binary function - Checks if vector is binary 87 | is.binary <- function(x) { 88 | unique_vals <- unique(x) 89 | 90 | all(unique_vals %in% c(0, 1)) 91 | } 92 | 93 | # Check data imbalance 94 | check_imbalance <- function(x, thresh, .col_name, .fun_name) { 95 | 96 | prop_x <- sum(x) / length(x) 97 | 98 | if (prop_x < thresh) { 99 | 100 | msg1 <- paste0(.fun_name, "(): ") 101 | msg2 <- paste0("[Data Imbalance Detected] Consider sampling to balance the classes more than ", scales::percent(thresh)) 102 | msg3 <- paste0("\n Column with imbalance: ", .col_name) 103 | 104 | msg <- paste0(msg1, msg2, msg3) 105 | 106 | warning(msg, call. = FALSE) 107 | } 108 | 109 | } 110 | -------------------------------------------------------------------------------- /R/correlationfunnel-package.R: -------------------------------------------------------------------------------- 1 | #' @keywords internal 2 | "_PACKAGE" 3 | 4 | # The following block is used by usethis to automatically manage 5 | # roxygen namespace tags. Modify with care! 6 | ## usethis namespace: start 7 | ## usethis namespace: end 8 | NULL 9 | -------------------------------------------------------------------------------- /R/data.R: -------------------------------------------------------------------------------- 1 | #' Marketing Data for a Bank 2 | #' 3 | #' A dataset containing data related to bank clients, last contact of the current marketing campaign, and attributes related to a 4 | #' previous marketing campaign. 
5 | #' 6 | #' # Bank Client Data: 7 | #' - ID (chr): CUSTOMER ID 8 | #' - AGE (dbl): Customer's age 9 | #' - JOB (chr): Type of job (categorical: "admin.","unknown","unemployed","management","housemaid","entrepreneur","student", "blue-collar","self-employed","retired","technician","services") 10 | #' - MARITAL (chr): marital status (categorical: "married","divorced","single"; note: "divorced" means divorced or widowed) 11 | #' - EDUCATION (chr): categorical: "unknown","secondary","primary","tertiary" 12 | #' - DEFAULT (chr): Has credit in default? (binary: "yes","no") 13 | #' - BALANCE (dbl): Average yearly balance, in euros (numeric) 14 | #' - HOUSING (chr): Has housing loan? (binary: "yes","no") 15 | #' - LOAN (chr): Has personal loan? (binary: "yes","no") 16 | #' 17 | #' # Features related to the last contact during the current marketing campaign: 18 | #' - CONTACT (chr): Contact communication type (categorical: "unknown","telephone","cellular") 19 | #' - DAY (dbl): Last contact day of the month (numeric) 20 | #' - MONTH (chr): Last contact month of year (categorical: "jan", "feb", "mar", ..., "nov", "dec") 21 | #' - DURATION (dbl): Last contact duration, in seconds (numeric) 22 | #' 23 | #' # Additional Attributes: 24 | #' - CAMPAIGN (dbl): Number of contacts performed during this campaign and for this client (numeric, includes last contact) 25 | #' - PDAYS (dbl): Number of days that passed by after the client was last contacted from a previous campaign (numeric, -1 means client was not previously contacted) 26 | #' - PREVIOUS (dbl): Number of contacts performed before this campaign and for this client (numeric) 27 | #' - POUTCOME (chr): Outcome of the previous marketing campaign (categorical: "unknown","other","failure","success") 28 | #' 29 | #' # Target Variable (Response): 30 | #' - TERM_DEPOSIT (chr): Has the client subscribed a term deposit? 
(binary: "yes","no") 31 | #' 32 | #' @source 33 | #' [Moro et al., 2014](https://archive.ics.uci.edu/ml/datasets/Bank+Marketing) S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014 34 | "marketing_campaign_tbl" 35 | 36 | 37 | 38 | #' Customer Churn Data Set for a Telecommunications Company 39 | #' 40 | #' A dataset containing data related to telecom customers that have enrolled in various products and services 41 | #' 42 | #' # Telecom Customer Data: 43 | #' - customerID (chr): CUSTOMER ID 44 | #' - gender (chr): Customer's gender ("Female", "Male") 45 | #' - SeniorCitizen (dbl): 1 = Senior Citizen, 0 = Not Senior Citizen 46 | #' - Partner (chr): Whether the customer has a partner or not (Yes, No) 47 | #' - Dependents (chr): Whether the customer has dependents or not (Yes, No) 48 | #' - tenure (dbl): Number of months the customer has stayed with the company 49 | #' - PhoneService (chr): Whether the customer has a phone service or not (Yes, No) 50 | #' - MultipleLines (chr): Whether the customer has multiple lines or not (Yes, No, No phone service) 51 | #' - InternetService (chr): Customer’s internet service provider (DSL, Fiber optic, No) 52 | #' - OnlineSecurity (chr): Whether the customer has online security or not (Yes, No, No internet service) 53 | #' - OnlineBackup (chr): Whether the customer has online backup or not (Yes, No, No internet service) 54 | #' - DeviceProtection (chr): Whether the customer has device protection or not (Yes, No, No internet service) 55 | #' - TechSupport (chr): Whether the customer has tech support or not (Yes, No, No internet service) 56 | #' - StreamingTV (chr): Whether the customer has streaming TV or not (Yes, No, No internet service) 57 | #' - StreamingMovies (chr): Whether the customer has streaming movies or not (Yes, No, No internet service) 58 | #' - Contract (chr): The contract term of the customer (Month-to-month, One 
year, Two year) 59 | #' - PaperlessBilling (chr): Whether the customer has paperless billing or not (Yes, No) 60 | #' - PaymentMethod (chr): The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic)) 61 | #' - MonthlyCharges (dbl): The amount charged to the customer monthly 62 | #' - TotalCharges (dbl): The total amount charged to the customer 63 | #' - Churn (chr): Outcome. Whether the customer churned or not (Yes or No) 64 | #' 65 | #' @source 66 | #' [IBM Sample Datasets](https://community.ibm.com/community/user/gettingstarted/home) 67 | "customer_churn_tbl" 68 | -------------------------------------------------------------------------------- /R/global_variables.R: -------------------------------------------------------------------------------- 1 | globalVariables(c( 2 | "type", "number", "terms", "value", "id", "label_current", "lable_new", ".", "label_new", 3 | "V1", "bin", "correlation", "feature", "label_text", "check", "key", "count_na", 4 | "count_unique_quantile", "unacceptable_class" 5 | )) 6 | -------------------------------------------------------------------------------- /R/plot_correlation_funnel.R: -------------------------------------------------------------------------------- 1 | #' Plot a Correlation Funnel 2 | #' 3 | #' \code{plot_correlation_funnel} returns a correlation funnel visualization in either static (`ggplot2`) or 4 | #' interactive (`plotly`) formats. 5 | #' 6 | #' 7 | #' @param data A `tibble` or `data.frame` 8 | #' @param interactive Returns either a static (`ggplot2`) visualization or an interactive (`plotly`) visualization 9 | #' @param limits Sets the X-Axis limits for the correlation space 10 | #' @param alpha Sets the transparency of the points on the plot. 
11 | #' 12 | #' @return A static `ggplot2` plot or an interactive `plotly` plot 13 | #' 14 | #' 15 | #' @seealso 16 | #' [binarize()], [correlate()] 17 | #' 18 | #' @examples 19 | #' library(dplyr) 20 | #' library(correlationfunnel) 21 | #' 22 | #' marketing_campaign_tbl %>% 23 | #' select(-ID) %>% 24 | #' binarize() %>% 25 | #' correlate(TERM_DEPOSIT__yes) %>% 26 | #' plot_correlation_funnel() 27 | #' 28 | #' 29 | #' @export 30 | plot_correlation_funnel <- function(data, interactive = FALSE, limits = c(-1, 1), alpha = 1) { 31 | UseMethod("plot_correlation_funnel", data) 32 | } 33 | 34 | #' @export 35 | plot_correlation_funnel.default <- function(data, interactive = FALSE, limits = c(-1, 1), alpha = 1) { 36 | stop("plot_correlation_funnel(): Object is not of class `data.frame`.", call. = FALSE) 37 | } 38 | 39 | #' @export 40 | plot_correlation_funnel.data.frame <- function(data, interactive = FALSE, limits = c(-1, 1), alpha = 1) { 41 | 42 | # Checks 43 | check_column_names( 44 | data, 45 | acceptable_column_names = c("feature", "bin", "correlation"), 46 | .fun_name = "plot_correlation_funnel") 47 | 48 | if (interactive) { 49 | 50 | data <- data %>% 51 | dplyr::mutate(label_text = stringr::str_glue("{feature} 52 | Bin: {bin} 53 | Correlation: {round(correlation, 3)}")) 54 | 55 | g <- data %>% 56 | ggplot2::ggplot(ggplot2::aes(x = correlation, y = feature, text = label_text)) + 57 | 58 | # Geometries 59 | ggplot2::geom_vline(xintercept = 0, linetype = 2, color = "red") + 60 | ggplot2::geom_point(color = "#2c3e50", alpha = alpha) + 61 | # ggrepel::geom_text_repel(ggplot2::aes(label = bin), size = 3, color = "#2c3e50") + 62 | 63 | # Formatting 64 | ggplot2::scale_x_continuous(limits = limits) + 65 | theme_tq() + 66 | ggplot2::labs(title = "Correlation Funnel") 67 | 68 | p <- plotly::ggplotly(g, tooltip = "text") 69 | 70 | return(p) 71 | 72 | } else { 73 | g <- data %>% 74 | ggplot2::ggplot(ggplot2::aes(x = correlation, y = feature, text = bin)) + 75 | 76 | # Geometries 
77 | ggplot2::geom_vline(xintercept = 0, linetype = 2, color = "red") + 78 | ggplot2::geom_point(color = "#2c3e50", alpha = alpha) + 79 | ggrepel::geom_text_repel(ggplot2::aes(label = bin), size = 3, color = "#2c3e50") + 80 | 81 | # Formatting 82 | ggplot2::scale_x_continuous(limits = limits) + 83 | theme_tq() + 84 | ggplot2::labs(title = "Correlation Funnel") 85 | 86 | return(g) 87 | } 88 | 89 | } 90 | 91 | # Check column names of data 92 | check_column_names <- function(data, acceptable_column_names, .fun_name) { 93 | 94 | if (any(!(names(data) %in% acceptable_column_names))) { 95 | 96 | msg1 <- paste0(.fun_name, "(): ") 97 | msg2 <- paste0("[Unnacceptable Data] Acceptable data is generated from the output of correlate().") 98 | 99 | msg <- paste0(msg1, msg2) 100 | 101 | stop(msg, call. = FALSE) 102 | } 103 | } 104 | -------------------------------------------------------------------------------- /R/tidyquant_theme_compat.R: -------------------------------------------------------------------------------- 1 | #' @importFrom ggplot2 %+replace% theme_grey element_blank element_line element_rect element_text margin rel theme unit 2 | 3 | 4 | # tidyquant functions copied to remove dependency on tidyquant 5 | 6 | theme_tq <- function(base_size = 11, base_family = "") { 7 | 8 | # Tidyquant colors 9 | blue <- "#2c3e50" 10 | green <- "#18BC9C" 11 | white <- "#FFFFFF" 12 | grey <- "grey80" 13 | 14 | # Starts with theme_grey and then modify some parts 15 | theme_grey(base_size = base_size, base_family = base_family) %+replace% 16 | theme( 17 | 18 | # Base Inherited Elements 19 | line = element_line(colour = blue, size = 0.5, linetype = 1, 20 | lineend = "butt"), 21 | rect = element_rect(fill = white, colour = blue, 22 | size = 0.5, linetype = 1), 23 | text = element_text(family = base_family, face = "plain", 24 | colour = blue, size = base_size, 25 | lineheight = 0.9, hjust = 0.5, vjust = 0.5, angle = 0, 26 | margin = margin(), debug = FALSE), 27 | 28 | # Axes 29 | 
axis.line = element_blank(), 30 | axis.text = element_text(size = rel(0.8)), 31 | axis.ticks = element_line(color = grey, size = rel(1/3)), 32 | axis.title = element_text(size = rel(1.0)), 33 | 34 | # Panel 35 | panel.background = element_rect(fill = white, color = NA), 36 | panel.border = element_rect(fill = NA, size = rel(1/2), color = blue), 37 | panel.grid.major = element_line(color = grey, size = rel(1/3)), 38 | panel.grid.minor = element_line(color = grey, size = rel(1/3)), 39 | panel.grid.minor.x = element_blank(), 40 | panel.spacing = unit(.75, "cm"), 41 | 42 | # Legend 43 | legend.key = element_rect(fill = white, color = NA), 44 | legend.position = "bottom", 45 | 46 | # Strip (Used with multiple panels) 47 | strip.background = element_rect(fill = blue, color = blue), 48 | strip.text = element_text(color = white, size = rel(0.8)), 49 | 50 | # Plot 51 | plot.title = element_text(size = rel(1.2), hjust = 0, 52 | margin = margin(t = 0, r = 0, b = 4, l = 0, unit = "pt")), 53 | plot.subtitle = element_text(size = rel(0.9), hjust = 0, 54 | margin = margin(t = 0, r = 0, b = 3, l = 0, unit = "pt")), 55 | 56 | # Complete theme 57 | complete = TRUE 58 | ) 59 | } 60 | 61 | palette_light <- function() { 62 | c( 63 | "#2c3e50", # blue 64 | "#e31a1c", # red 65 | "#18BC9C", # green 66 | "#CCBE93", # yellow 67 | "#a6cee3", # steel_blue 68 | "#1f78b4", # navy_blue 69 | "#b2df8a", # light_green 70 | "#fb9a99", # pink 71 | "#fdbf6f", # light_orange 72 | "#ff7f00", # orange 73 | "#cab2d6", # light_purple 74 | "#6a3d9a" # purple 75 | ) 76 | } 77 | -------------------------------------------------------------------------------- /R/utils-pipe.R: -------------------------------------------------------------------------------- 1 | #' Pipe operator 2 | #' 3 | #' See \code{magrittr::\link[magrittr]{\%>\%}} for details. 
4 | #' 5 | #' @name %>% 6 | #' @rdname pipe 7 | #' @keywords internal 8 | #' @export 9 | #' @importFrom magrittr %>% 10 | #' @usage lhs \%>\% rhs 11 | NULL 12 | -------------------------------------------------------------------------------- /R/zzz.R: -------------------------------------------------------------------------------- 1 | # .onAttach <- function(libname,pkgname) { 2 | # 3 | # bsu_rule_color <- "#2c3e50" 4 | # bsu_main_color <- "#1f78b4" 5 | # 6 | # # Check Theme: If Dark, Update Colors 7 | # if (rstudioapi::isAvailable()) { 8 | # tryCatch({ 9 | # theme <- rstudioapi::getThemeInfo() 10 | # }) 11 | # 12 | # 13 | # } 14 | # 15 | # bsu_main <- crayon::make_style(bsu_main_color) 16 | # 17 | # msg1 <- paste0( 18 | # cli::rule(left = "Using correlationfunnel?", col = bsu_rule_color, line = 2), 19 | # bsu_main('\nYou might also be interested in applied data science training for business.\n'), 20 | # bsu_main(' Learn more at - www.business-science.io ') 21 | # ) 22 | # 23 | # msg2 <- paste0( 24 | # cli::rule(left = "correlationfunnel Tip #1", col = bsu_rule_color, line = 2), 25 | # bsu_main('\nMake sure your data is not overly imbalanced prior to using `correlate()`.\nIf less than 5% imbalance, consider sampling. :)') 26 | # ) 27 | # 28 | # msg3 <- paste0( 29 | # cli::rule(left = "correlationfunnel Tip #2", col = bsu_rule_color, line = 2), 30 | # bsu_main("\nClean your NA's prior to using `binarize()`.\nMissing values and cleaning data are critical to getting great correlations. :)") 31 | # ) 32 | # 33 | # msg4 <- paste0( 34 | # cli::rule(left = "correlationfunnel Tip #3", col = bsu_rule_color, line = 2), 35 | # bsu_main("\nUsing `binarize()` with data containing many columns or many rows can increase dimensionality substantially.\nTry subsetting your data column-wise or row-wise to avoid creating too many columns.\nYou can always make a big problem smaller by sampling. 
:)") 36 | # ) 37 | # 38 | # msg <- c(msg1, msg1, msg2, msg3, msg4)[sample(1:5, size = 1)] 39 | # packageStartupMessage(msg) 40 | # 41 | # } 42 | -------------------------------------------------------------------------------- /README.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | output: github_document 3 | --- 4 | 5 | 6 | 7 | ```{r, include = FALSE} 8 | knitr::opts_chunk$set( 9 | collapse = TRUE, 10 | comment = "#>", 11 | fig.path = "man/figures/README-", 12 | out.width = "100%", 13 | dpi = 300, 14 | message = F, 15 | warning = F 16 | ) 17 | 18 | devtools::load_all() 19 | library(tidyverse) 20 | ``` 21 | 22 | 23 | # correlationfunnel 24 | _by [Business Science](https://www.business-science.io/)_ 25 | 26 | [![Lifecycle: maturing](https://img.shields.io/badge/lifecycle-maturing-blue.svg)](https://www.tidyverse.org/lifecycle/#maturing) 27 | [![Travis build status](https://travis-ci.org/business-science/correlationfunnel.svg?branch=master)](https://travis-ci.org/business-science/correlationfunnel) 28 | [![Coverage status](https://codecov.io/gh/business-science/correlationfunnel/branch/master/graph/badge.svg)](https://codecov.io/github/business-science/correlationfunnel?branch=master) 29 | [![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/correlationfunnel)](https://cran.r-project.org/package=correlationfunnel) 30 | ![](http://cranlogs.r-pkg.org/badges/correlationfunnel?color=brightgreen) 31 | ![](http://cranlogs.r-pkg.org/badges/grand-total/correlationfunnel?color=brightgreen) 32 | 33 | > Speed Up Exploratory Data Analysis (EDA) 34 | 35 | The goal of `correlationfunnel` is to speed up Exploratory Data Analysis (EDA). Here's how to use it. 
36 | 37 | ## Installation 38 | 39 | You can install the latest stable (CRAN) version of `correlationfunnel` with: 40 | 41 | ``` r 42 | install.packages("correlationfunnel") 43 | ``` 44 | 45 | 46 | You can install the development version of `correlationfunnel` from [GitHub](https://github.com/business-science/) with: 47 | 48 | ``` r 49 | devtools::install_github("business-science/correlationfunnel") 50 | ``` 51 | 52 | ## Correlation Funnel in 2-Minutes 53 | 54 | __Problem__: 55 | Exploratory data analysis (EDA) involves looking at feature-target relationships independently. This process is very time consuming even for small data sets. ___Rather than search for relationships, what if we could let the relationships come to us?___ 56 | 57 | 58 | 59 | __Solution:__ 60 | Enter `correlationfunnel`. The package provides a __succinct workflow__ and __interactive visualization tools__ for understanding which features have relationships to target (response). 61 | 62 | __Main Benefits__: 63 | 64 | 1. __Speeds Up Exploratory Data Analysis__ 65 | 66 | 2. __Improves Feature Selection__ 67 | 68 | 3. __Gets You To Business Insights Faster__ 69 | 70 | ## Example - Bank Marketing Campaign 71 | 72 | The following example showcases the power of __fast exploratory correlation analysis__. The goal of the analysis is to determine which features relate to the bank's marketing campaign goal of having customers opt into a TERM DEPOSIT (financial product). 73 | 74 | We will see that using __3 functions__, we can quickly: 75 | 76 | 1. Transform the data into a binary format with `binarize()` 77 | 78 | 2. Perform correlation analysis using `correlate()` 79 | 80 | 3. 
Visualize the highest correlation features using `plot_correlation_funnel()` 81 | 82 | __Result__: Rather than spend hours looking at individual plots of campaign features and comparing them to which customers opted in to the TERM DEPOSIT product, in seconds we can discover which groups of customers have enrolled, drastically speeding up EDA. 83 | 84 | ### Getting Started 85 | 86 | First, load the libraries. 87 | 88 | ```{r example} 89 | library(correlationfunnel) 90 | library(dplyr) 91 | ``` 92 | 93 | Next, collect data to analyze. We'll use Marketing Campaign Data for a Bank that was popularized by the [UCI Machine Learning Repository](https://archive.ics.uci.edu/ml/datasets/Bank+Marketing). We can load the data with `data("marketing_campaign_tbl")`. 94 | 95 | ```{r} 96 | # Use ?marketing_campaign_tbl to get a description of the marketing campaign features 97 | data("marketing_campaign_tbl") 98 | 99 | marketing_campaign_tbl %>% glimpse() 100 | ``` 101 | 102 | ### Response & Predictor Relationships 103 | 104 | Modeling and Machine Learning problems often involve a response (Enrolled in `TERM_DEPOSIT`, yes/no) and many predictors (AGE, JOB, MARITAL, etc). Our job is to determine which predictors are related to the response. We can do this through __Binary Correlation Analysis__. 105 | 106 | ### Binary Correlation Analysis 107 | 108 | Binary Correlation Analysis is the process of converting continuous (numeric) and categorical (character/factor) data to binary features. We can then perform a correlation analysis to see if there is predictive value between the features and the response (target). 109 | 110 | #### Step 1: Convert to Binary Format 111 | 112 | The first step is converting the continuous and categorical data into binary (0/1) format. We de-select any non-predictive features. The `binarize()` function then converts the features into binary features. 
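The idea behind this conversion can be sketched in base R. This is only an illustration on made-up data, not the package's implementation: the `toy` data frame, its values, and the helper variable names are all hypothetical, and `binarize()` itself may bin and name columns differently.

```r
# Hypothetical toy data; column values are invented for illustration only.
toy <- data.frame(
  AGE     = c(23, 35, 47, 59, 31, 44, 52, 28),
  MARITAL = c("single", "married", "married", "divorced",
              "single", "married", "divorced", "single"),
  stringsAsFactors = FALSE
)

# Numeric feature: cut into 4 quantile-based bins, then one-hot encode the bins.
age_breaks <- quantile(toy$AGE, probs = seq(0, 1, length.out = 5))
age_bin    <- cut(toy$AGE, breaks = age_breaks, include.lowest = TRUE)
age_binary <- model.matrix(~ age_bin - 1)

# Categorical feature: one-hot encode each level directly.
marital_binary <- model.matrix(~ MARITAL - 1, data = toy)

# Every column is now 0/1, with one column per bin or level.
toy_binarized <- cbind(age_binary, marital_binary)
```

Each row ends up with exactly one `1` among the AGE bins and one `1` among the MARITAL levels, which is the shape of data that a binary correlation analysis expects.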
113 | 114 | - __Numeric Features:__ Are binned into ranges or, if they have few unique levels, binned by their value, and then converted to binary features via one-hot encoding 115 | 116 | - __Categorical Features__: Are binned by one-hot encoding 117 | 118 | The result is a data frame that has only binary data with columns representing the bins that the observations fall into. Note that the output is shown in the `glimpse()` format. There are now 80 columns that are binary (0/1). 119 | 120 | ```{r} 121 | marketing_campaign_binarized_tbl <- marketing_campaign_tbl %>% 122 | select(-ID) %>% 123 | binarize(n_bins = 4, thresh_infreq = 0.01) 124 | 125 | marketing_campaign_binarized_tbl %>% glimpse() 126 | ``` 127 | 128 | #### Step 2: Perform Correlation Analysis 129 | 130 | The second step is to perform a correlation analysis between the response (target = `TERM_DEPOSIT__yes`) and the rest of the features. This returns a specially formatted tibble with the feature, the bin, and the bin's correlation to the target. The format is exactly what we need for the next step: producing the __Correlation Funnel__. 131 | 132 | ```{r} 133 | marketing_campaign_correlated_tbl <- marketing_campaign_binarized_tbl %>% 134 | correlate(target = TERM_DEPOSIT__yes) 135 | 136 | marketing_campaign_correlated_tbl 137 | ``` 138 | 139 | #### Step 3: Visualize the Correlation Funnel 140 | 141 | A __Correlation Funnel__ is a tornado plot that lists the highest correlation features (based on absolute magnitude) at the top and the lowest correlation features at the bottom. The resulting visualization looks like a funnel. 142 | 143 | To produce the __Correlation Funnel__, use `plot_correlation_funnel()`. Try setting `interactive = TRUE` to get an interactive plot that can be zoomed in on. 
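
To make the three steps more concrete, here is a minimal base R sketch of the same idea: bin and one-hot encode, correlate every binary column with the target column, then order by absolute correlation. It is an illustration only and is not how `correlationfunnel` implements these steps internally. The `toy_tbl` data and the `one_hot()` helper are made up for this example, and the two-bin median split is an assumption chosen to keep the output small.

```r
# Toy data standing in for the marketing campaign table (values are invented).
toy_tbl <- data.frame(
  DURATION     = c(50, 120, 400, 650, 90, 500, 30, 700),
  CONTACT      = c("cellular", "telephone", "cellular", "cellular",
                   "telephone", "cellular", "telephone", "cellular"),
  TERM_DEPOSIT = c("no", "no", "yes", "yes", "no", "yes", "no", "yes"),
  stringsAsFactors = FALSE
)

# Step 1 (sketch): bin the numeric column at its median, giving two ranges.
breaks <- quantile(toy_tbl$DURATION, probs = c(0, 0.5, 1))
toy_tbl$DURATION <- cut(toy_tbl$DURATION, breaks = breaks, include.lowest = TRUE)

# Hypothetical helper: one 0/1 column per level, named "<feature>__<level>".
one_hot <- function(x, name) {
  x <- factor(x)
  out <- sapply(levels(x), function(lv) as.integer(x == lv))
  colnames(out) <- paste(name, levels(x), sep = "__")
  out
}

# One-hot encode every column and bind the results into one binary matrix.
binary_mat <- do.call(cbind, unname(Map(one_hot, toy_tbl, names(toy_tbl))))

# Step 2 (sketch): correlate each binary column with the target column.
corr <- cor(binary_mat, binary_mat[, "TERM_DEPOSIT__yes"])[, 1]

# Step 3 (sketch): order by absolute correlation, strongest first.
# This ordering is what gives the plot its funnel shape.
funnel <- corr[order(abs(corr), decreasing = TRUE)]
print(round(funnel, 2))
```

The target column and its complement (`TERM_DEPOSIT__no`) will always sit at the top of such a ranking with correlations of 1 and -1, so the informative rows are the ones below them.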
144 | 145 | ```{r, fig.height=8} 146 | marketing_campaign_correlated_tbl %>% 147 | plot_correlation_funnel(interactive = FALSE) 148 | ``` 149 | 150 | 151 | ### Examining the Results 152 | 153 | The most important features are towards the top. We can investigate these. 154 | 155 | ```{r, fig.height=3} 156 | marketing_campaign_correlated_tbl %>% 157 | filter(feature %in% c("DURATION", "POUTCOME", "PDAYS", 158 | "PREVIOUS", "CONTACT", "HOUSING")) %>% 159 | plot_correlation_funnel(interactive = FALSE, limits = c(-0.4, 0.4)) 160 | ``` 161 | 162 | We can see that the following prospect groups have a much greater correlation with enrollment in the TERM DEPOSIT product: 163 | 164 | - When the DURATION, the amount of time a prospect is engaged in marketing campaign material, is 319 seconds or longer. 165 | 166 | - When POUTCOME, whether or not a prospect has previously enrolled in a product, is "success". 167 | 168 | - When CONTACT, the medium used to contact the person, is "cellular". 169 | 170 | - When HOUSING, whether or not the contact has a HOME LOAN, is "no". 171 | 172 | 173 | ## Other Great EDA Packages in R 174 | 175 | The main addition of `correlationfunnel` is to quickly expose feature relationships in semi-processed data, meaning that missing (`NA`) values have been treated, date or date-time features have been feature engineered, and data is in a "clean" format (numeric data and categorical data are ready to be correlated to a Yes/No response). 176 | 177 | Here are several great EDA packages that can help you understand data issues (cleanliness) and get data prepared for Correlation Analysis! 178 | 179 | - [Data Explorer](https://boxuancui.github.io/DataExplorer/) - Automates Exploration and Data Treatment. Amazing for investigating features quickly and efficiently including by data type, missing data, feature engineering, and identifying relationships. 180 | 181 | - [naniar](http://naniar.njtierney.com/) - For understanding missing data. 
182 | 183 | - [UpSetR](https://github.com/hms-dbmi/UpSetR) - For generating upset plots 184 | 185 | - [GGally](https://ggobi.github.io/ggally/) - The `ggpairs()` function is one of my all-time favorites for visualizing many features quickly. 186 | 187 | 188 | ## Using Correlation Funnel? You Might Be Interested in Applied Business Education 189 | 190 | [___Business Science___](https://www.business-science.io/) teaches students how to apply data science for business. The entire curriculum is crafted around business consulting with data science. _Correlation Analysis_ is one of the many techniques that we teach in our curriculum. 191 | __Learn from our data science application experience with real-world business projects.__ 192 | 193 | 194 | ### Learn from Real-World Business Projects 195 | 196 | Students learn by solving real-world projects using our repeatable project-management framework along with cutting-edge tools like Correlation Analysis, Automated Machine Learning, and Feature Explanation as part of our ROI-Driven Data Science Curriculum. 
197 | 198 | 199 | 200 | - [__Learn Data Science Foundations (DS4B 101-R)__](https://university.business-science.io/p/ds4b-101-r-business-analysis-r): Learn the entire `tidyverse` (`dplyr`, `ggplot2`, `rmarkdown`, & more) and `parsnip` - Solve 2 Projects - Customer Segmentation and Price Optimization projects 201 | 202 | - [__Learn Advanced Machine Learning & Business Consulting (DS4B 201-R)__](https://university.business-science.io/p/hr201-using-machine-learning-h2o-lime-to-predict-employee-turnover/): Churn Project solved with Correlation Analysis, `H2O` AutoML, `LIME` Feature Explanation, and ROI-driven Analysis / Recommendation Systems 203 | 204 | - [__Learn Predictive Web Application Development (DS4B 102-R)__](https://university.business-science.io/p/ds4b-102-r-shiny-web-application-business-level-1/): Build 2 Predictive `Shiny` Web Apps - Sales Dashboard with Demand Forecasting & Price Prediction App 205 | -------------------------------------------------------------------------------- /_pkgdown.yml: -------------------------------------------------------------------------------- 1 | template: 2 | params: 3 | bootswatch: flatly 4 | ganalytics: G-20GDZ5LL77 5 | 6 | navbar: 7 | title: "correlationfunnel" 8 | left: 9 | - text: "Home" 10 | href: index.html 11 | - text: "Function Reference" 12 | href: reference/index.html 13 | - text: "Articles" 14 | href: articles/index.html 15 | menu: 16 | - text: "Introducing Correlation Funnel - Customer Churn Example" 17 | href: articles/introducing_correlation_funnel.html 18 | - text: "Key Considerations & FAQs" 19 | href: articles/key_considerations.html 20 | - text: "News" 21 | href: news/index.html 22 | 23 | right: 24 | - icon: fa-github 25 | href: https://github.com/business-science/correlationfunnel 26 | 27 | reference: 28 | - title: General 29 | contents: 30 | - correlationfunnel-package 31 | - title: Correlation Funnel Workflow 32 | desc: __The main functions used to perform binary correlation analysis.__ 33 | 
contents: 34 | - binarize 35 | - correlate 36 | - title: Visualization functions 37 | desc: __Plotting utilities for visualizing the Correlation Funnel.__ 38 | contents: 39 | - starts_with("plot_") 40 | - title: Datasets 41 | desc: Datasets that ship with `correlationfunnel` 42 | contents: 43 | - starts_with("customer") 44 | - starts_with("marketing") 45 | -------------------------------------------------------------------------------- /codecov.yml: -------------------------------------------------------------------------------- 1 | comment: false 2 | 3 | coverage: 4 | status: 5 | project: 6 | default: 7 | target: auto 8 | threshold: 1% 9 | patch: 10 | default: 11 | target: auto 12 | threshold: 1% 13 | -------------------------------------------------------------------------------- /correlationfunnel.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 4 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Package 19 | PackageUseDevtools: Yes 20 | PackageInstallArgs: --no-multiarch --with-keep.source 21 | PackageRoxygenize: rd,collate,namespace 22 | -------------------------------------------------------------------------------- /cran-comments.md: -------------------------------------------------------------------------------- 1 | ## Test environments 2 | * local R installation, R 3.6.1 3 | * ubuntu 16.04 (on travis-ci), R 3.6.1 4 | * win-builder (devel) 5 | 6 | ## R CMD check results 7 | 8 | 0 errors | 0 warnings | 0 notes 9 | 10 | * This is an update for compatibility with dplyr 1.0.0 11 | -------------------------------------------------------------------------------- /data/customer_churn_tbl.rda: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/data/customer_churn_tbl.rda -------------------------------------------------------------------------------- /data/marketing_campaign_tbl.rda: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/data/marketing_campaign_tbl.rda -------------------------------------------------------------------------------- /docs/404.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Page not found (404) • correlationfunnel 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
72 |
73 | 127 | 128 | 129 | 130 |
131 | 132 |
133 |
134 | 137 | 138 | Content not found. Please use links in the navbar. 139 | 140 |
141 | 142 | 147 | 148 |
149 | 150 | 151 | 152 |
153 | 156 | 157 |
158 |

Site built with pkgdown 1.5.1.

159 |
160 | 161 |
162 |
163 | 164 | 165 | 166 | 167 | 168 | 169 | 170 | 171 | -------------------------------------------------------------------------------- /docs/LICENSE-text.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | License • correlationfunnel 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
72 |
73 | 127 | 128 | 129 | 130 |
131 | 132 |
133 |
134 | 137 | 138 |
YEAR: 2019
139 | COPYRIGHT HOLDER: Matt Dancho
140 | 
141 | 142 |
143 | 144 | 149 | 150 |
151 | 152 | 153 | 154 |
155 | 158 | 159 |
160 |


161 |
162 | 163 |
164 |
165 | 166 | 167 | 168 | 169 | 170 | 171 | 172 | 173 | -------------------------------------------------------------------------------- /docs/LICENSE.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | MIT License • correlationfunnel 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
72 |
73 | 127 | 128 | 129 | 130 |
131 | 132 |
133 |
134 | 137 | 138 |
139 | 140 |

Copyright (c) 2019 Matt Dancho

141 |

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the “Software”), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions:

142 |

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software.

143 |

THE SOFTWARE IS PROVIDED “AS IS”, WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

144 |
145 | 146 |
147 | 148 | 153 | 154 |
155 | 156 | 157 | 158 |
159 | 162 | 163 |
164 |


165 |
166 | 167 |
168 |
169 | 170 | 171 | 172 | 173 | 174 | 175 | 176 | 177 | -------------------------------------------------------------------------------- /docs/articles/faqs_files/figure-html/unnamed-chunk-12-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/faqs_files/figure-html/unnamed-chunk-12-1.png -------------------------------------------------------------------------------- /docs/articles/faqs_files/figure-html/unnamed-chunk-15-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/faqs_files/figure-html/unnamed-chunk-15-1.png -------------------------------------------------------------------------------- /docs/articles/faqs_files/figure-html/unnamed-chunk-2-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/faqs_files/figure-html/unnamed-chunk-2-1.png -------------------------------------------------------------------------------- /docs/articles/faqs_files/figure-html/unnamed-chunk-21-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/faqs_files/figure-html/unnamed-chunk-21-1.png -------------------------------------------------------------------------------- /docs/articles/faqs_files/figure-html/unnamed-chunk-23-1.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/faqs_files/figure-html/unnamed-chunk-23-1.png -------------------------------------------------------------------------------- /docs/articles/faqs_files/figure-html/unnamed-chunk-5-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/faqs_files/figure-html/unnamed-chunk-5-1.png -------------------------------------------------------------------------------- /docs/articles/faqs_files/figure-html/unnamed-chunk-6-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/faqs_files/figure-html/unnamed-chunk-6-1.png -------------------------------------------------------------------------------- /docs/articles/faqs_files/figure-html/unnamed-chunk-9-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/faqs_files/figure-html/unnamed-chunk-9-1.png -------------------------------------------------------------------------------- /docs/articles/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Articles • correlationfunnel 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
72 |
73 | 127 | 128 | 129 | 130 |
131 | 132 |
133 |
134 | 137 | 138 |
139 |

All vignettes

140 |

141 | 142 |
143 |
Introducing Correlation Funnel - Customer Churn Example
144 |
145 |
Methodology, Key Considerations, and FAQs
146 |
147 |
148 |
149 |
150 |
151 | 152 | 153 |
154 | 157 | 158 |
159 |


160 |
161 | 162 |
163 |
164 | 165 | 166 | 167 | 168 | 169 | 170 | 171 | 172 | -------------------------------------------------------------------------------- /docs/articles/introducing_correlation_funnel_files/figure-html/unnamed-chunk-5-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/introducing_correlation_funnel_files/figure-html/unnamed-chunk-5-1.png -------------------------------------------------------------------------------- /docs/articles/introducing_correlation_funnel_files/header-attrs-2.1/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-10-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-10-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-12-1.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-12-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-13-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-13-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-15-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-15-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-16-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-16-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-2-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-2-1.png -------------------------------------------------------------------------------- 
/docs/articles/key_considerations_files/figure-html/unnamed-chunk-21-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-21-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-22-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-22-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-23-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-23-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-24-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-24-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-3-1.png: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-3-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-5-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-5-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-6-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-6-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-7-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-7-1.png -------------------------------------------------------------------------------- /docs/articles/key_considerations_files/figure-html/unnamed-chunk-9-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/articles/key_considerations_files/figure-html/unnamed-chunk-9-1.png -------------------------------------------------------------------------------- 
/docs/articles/key_considerations_files/header-attrs-2.1/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/authors.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Authors • correlationfunnel 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
72 |
73 | 127 | 128 | 129 | 130 |
131 | 132 |
133 |
134 | 137 | 138 |
    139 |
  • 140 |

    Matt Dancho. Author, maintainer. 141 |

    142 |
  • 143 |
144 | 145 |
146 | 147 |
148 | 149 | 150 | 151 |
152 | 155 | 156 |
157 |


158 |
159 | 160 |
161 |
162 | 163 | 164 | 165 | 166 | 167 | 168 | 169 | 170 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.css: -------------------------------------------------------------------------------- 1 | /*! 2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | 6 | /* modified from https://github.com/twbs/bootstrap/blob/94b4076dd2efba9af71f0b18d4ee4b163aa9e0dd/docs/assets/css/src/docs.css#L548-L601 */ 7 | 8 | /* All levels of nav */ 9 | nav[data-toggle='toc'] .nav > li > a { 10 | display: block; 11 | padding: 4px 20px; 12 | font-size: 13px; 13 | font-weight: 500; 14 | color: #767676; 15 | } 16 | nav[data-toggle='toc'] .nav > li > a:hover, 17 | nav[data-toggle='toc'] .nav > li > a:focus { 18 | padding-left: 19px; 19 | color: #563d7c; 20 | text-decoration: none; 21 | background-color: transparent; 22 | border-left: 1px solid #563d7c; 23 | } 24 | nav[data-toggle='toc'] .nav > .active > a, 25 | nav[data-toggle='toc'] .nav > .active:hover > a, 26 | nav[data-toggle='toc'] .nav > .active:focus > a { 27 | padding-left: 18px; 28 | font-weight: bold; 29 | color: #563d7c; 30 | background-color: transparent; 31 | border-left: 2px solid #563d7c; 32 | } 33 | 34 | /* Nav: second level (shown on .active) */ 35 | nav[data-toggle='toc'] .nav .nav { 36 | display: none; /* Hide by default, but at >768px, show it */ 37 | padding-bottom: 10px; 38 | } 39 | nav[data-toggle='toc'] .nav .nav > li > a { 40 | padding-top: 1px; 41 | padding-bottom: 1px; 42 | padding-left: 30px; 43 | font-size: 12px; 44 | font-weight: normal; 45 | } 46 | nav[data-toggle='toc'] .nav .nav > li > a:hover, 47 | nav[data-toggle='toc'] .nav .nav > li > a:focus { 48 | padding-left: 29px; 49 | } 50 | nav[data-toggle='toc'] .nav .nav > .active > a, 51 | nav[data-toggle='toc'] .nav .nav > .active:hover > a, 52 | 
nav[data-toggle='toc'] .nav .nav > .active:focus > a { 53 | padding-left: 28px; 54 | font-weight: 500; 55 | } 56 | 57 | /* from https://github.com/twbs/bootstrap/blob/e38f066d8c203c3e032da0ff23cd2d6098ee2dd6/docs/assets/css/src/docs.css#L631-L634 */ 58 | nav[data-toggle='toc'] .nav > .active > ul { 59 | display: block; 60 | } 61 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.js: -------------------------------------------------------------------------------- 1 | /*! 2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | (function() { 6 | 'use strict'; 7 | 8 | window.Toc = { 9 | helpers: { 10 | // return all matching elements in the set, or their descendants 11 | findOrFilter: function($el, selector) { 12 | // http://danielnouri.org/notes/2011/03/14/a-jquery-find-that-also-finds-the-root-element/ 13 | // http://stackoverflow.com/a/12731439/358804 14 | var $descendants = $el.find(selector); 15 | return $el.filter(selector).add($descendants).filter(':not([data-toc-skip])'); 16 | }, 17 | 18 | generateUniqueIdBase: function(el) { 19 | var text = $(el).text(); 20 | var anchor = text.trim().toLowerCase().replace(/[^A-Za-z0-9]+/g, '-'); 21 | return anchor || el.tagName.toLowerCase(); 22 | }, 23 | 24 | generateUniqueId: function(el) { 25 | var anchorBase = this.generateUniqueIdBase(el); 26 | for (var i = 0; ; i++) { 27 | var anchor = anchorBase; 28 | if (i > 0) { 29 | // add suffix 30 | anchor += '-' + i; 31 | } 32 | // check if ID already exists 33 | if (!document.getElementById(anchor)) { 34 | return anchor; 35 | } 36 | } 37 | }, 38 | 39 | generateAnchor: function(el) { 40 | if (el.id) { 41 | return el.id; 42 | } else { 43 | var anchor = this.generateUniqueId(el); 44 | el.id = anchor; 45 | return anchor; 46 | } 47 | }, 48 | 49 | createNavList: function() { 50 | 
return $('<ul class="nav"></ul>'); 51 | }, 52 | 53 | createChildNavList: function($parent) { 54 | var $childList = this.createNavList(); 55 | $parent.append($childList); 56 | return $childList; 57 | }, 58 | 59 | generateNavEl: function(anchor, text) { 60 | var $a = $('<a></a>'); 61 | $a.attr('href', '#' + anchor); 62 | $a.text(text); 63 | var $li = $('<li></li>'); 64 | $li.append($a); 65 | return $li; 66 | }, 67 | 68 | generateNavItem: function(headingEl) { 69 | var anchor = this.generateAnchor(headingEl); 70 | var $heading = $(headingEl); 71 | var text = $heading.data('toc-text') || $heading.text(); 72 | return this.generateNavEl(anchor, text); 73 | }, 74 | 75 | // Find the first heading level (`<h1>`, then `<h2>`, etc.) that has more than one element. Defaults to 1 (for `<h1>`). 76 | getTopLevel: function($scope) { 77 | for (var i = 1; i <= 6; i++) { 78 | var $headings = this.findOrFilter($scope, 'h' + i); 79 | if ($headings.length > 1) { 80 | return i; 81 | } 82 | } 83 | 84 | return 1; 85 | }, 86 | 87 | // returns the elements for the top level, and the next below it 88 | getHeadings: function($scope, topLevel) { 89 | var topSelector = 'h' + topLevel; 90 | 91 | var secondaryLevel = topLevel + 1; 92 | var secondarySelector = 'h' + secondaryLevel; 93 | 94 | return this.findOrFilter($scope, topSelector + ',' + secondarySelector); 95 | }, 96 | 97 | getNavLevel: function(el) { 98 | return parseInt(el.tagName.charAt(1), 10); 99 | }, 100 | 101 | populateNav: function($topContext, topLevel, $headings) { 102 | var $context = $topContext; 103 | var $prevNav; 104 | 105 | var helpers = this; 106 | $headings.each(function(i, el) { 107 | var $newNav = helpers.generateNavItem(el); 108 | var navLevel = helpers.getNavLevel(el); 109 | 110 | // determine the proper $context 111 | if (navLevel === topLevel) { 112 | // use top level 113 | $context = $topContext; 114 | } else if ($prevNav && $context === $topContext) { 115 | // create a new level of the tree and switch to it 116 | $context = helpers.createChildNavList($prevNav); 117 | } // else use the current $context 118 | 119 | $context.append($newNav); 120 | 121 | $prevNav = $newNav; 122 | }); 123 | }, 124 | 125 | parseOps: function(arg) { 126 | var opts; 127 | if (arg.jquery) { 128 | opts = { 129 | $nav: arg 130 | }; 131 | } else { 132 | opts = arg; 133 | } 134 | opts.$scope = opts.$scope || $(document.body); 135 | return opts; 136 | } 137 | }, 138 | 139 | // accepts a jQuery object, or an options object 140 | init: function(opts) { 141 | opts = this.helpers.parseOps(opts); 142 | 143 | // ensure that the data attribute is in place for styling 144 | opts.$nav.attr('data-toggle', 'toc'); 145 | 146 | var $topContext = this.helpers.createChildNavList(opts.$nav); 147 | var topLevel = 
this.helpers.getTopLevel(opts.$scope); 148 | var $headings = this.helpers.getHeadings(opts.$scope, topLevel); 149 | this.helpers.populateNav($topContext, topLevel, $headings); 150 | } 151 | }; 152 | 153 | $(function() { 154 | $('nav[data-toggle="toc"]').each(function(i, el) { 155 | var $nav = $(el); 156 | Toc.init($nav); 157 | }); 158 | }); 159 | })(); 160 | -------------------------------------------------------------------------------- /docs/docsearch.css: -------------------------------------------------------------------------------- 1 | /* Docsearch -------------------------------------------------------------- */ 2 | /* 3 | Source: https://github.com/algolia/docsearch/ 4 | License: MIT 5 | */ 6 | 7 | .algolia-autocomplete { 8 | display: block; 9 | -webkit-box-flex: 1; 10 | -ms-flex: 1; 11 | flex: 1 12 | } 13 | 14 | .algolia-autocomplete .ds-dropdown-menu { 15 | width: 100%; 16 | min-width: none; 17 | max-width: none; 18 | padding: .75rem 0; 19 | background-color: #fff; 20 | background-clip: padding-box; 21 | border: 1px solid rgba(0, 0, 0, .1); 22 | box-shadow: 0 .5rem 1rem rgba(0, 0, 0, .175); 23 | } 24 | 25 | @media (min-width:768px) { 26 | .algolia-autocomplete .ds-dropdown-menu { 27 | width: 175% 28 | } 29 | } 30 | 31 | .algolia-autocomplete .ds-dropdown-menu::before { 32 | display: none 33 | } 34 | 35 | .algolia-autocomplete .ds-dropdown-menu [class^=ds-dataset-] { 36 | padding: 0; 37 | background-color: rgb(255,255,255); 38 | border: 0; 39 | max-height: 80vh; 40 | } 41 | 42 | .algolia-autocomplete .ds-dropdown-menu .ds-suggestions { 43 | margin-top: 0 44 | } 45 | 46 | .algolia-autocomplete .algolia-docsearch-suggestion { 47 | padding: 0; 48 | overflow: visible 49 | } 50 | 51 | .algolia-autocomplete .algolia-docsearch-suggestion--category-header { 52 | padding: .125rem 1rem; 53 | margin-top: 0; 54 | font-size: 1.3em; 55 | font-weight: 500; 56 | color: #00008B; 57 | border-bottom: 0 58 | } 59 | 60 | .algolia-autocomplete 
.algolia-docsearch-suggestion--wrapper { 61 | float: none; 62 | padding-top: 0 63 | } 64 | 65 | .algolia-autocomplete .algolia-docsearch-suggestion--subcategory-column { 66 | float: none; 67 | width: auto; 68 | padding: 0; 69 | text-align: left 70 | } 71 | 72 | .algolia-autocomplete .algolia-docsearch-suggestion--content { 73 | float: none; 74 | width: auto; 75 | padding: 0 76 | } 77 | 78 | .algolia-autocomplete .algolia-docsearch-suggestion--content::before { 79 | display: none 80 | } 81 | 82 | .algolia-autocomplete .ds-suggestion:not(:first-child) .algolia-docsearch-suggestion--category-header { 83 | padding-top: .75rem; 84 | margin-top: .75rem; 85 | border-top: 1px solid rgba(0, 0, 0, .1) 86 | } 87 | 88 | .algolia-autocomplete .ds-suggestion .algolia-docsearch-suggestion--subcategory-column { 89 | display: block; 90 | padding: .1rem 1rem; 91 | margin-bottom: 0.1; 92 | font-size: 1.0em; 93 | font-weight: 400 94 | /* display: none */ 95 | } 96 | 97 | .algolia-autocomplete .algolia-docsearch-suggestion--title { 98 | display: block; 99 | padding: .25rem 1rem; 100 | margin-bottom: 0; 101 | font-size: 0.9em; 102 | font-weight: 400 103 | } 104 | 105 | .algolia-autocomplete .algolia-docsearch-suggestion--text { 106 | padding: 0 1rem .5rem; 107 | margin-top: -.25rem; 108 | font-size: 0.8em; 109 | font-weight: 400; 110 | line-height: 1.25 111 | } 112 | 113 | .algolia-autocomplete .algolia-docsearch-footer { 114 | width: 110px; 115 | height: 20px; 116 | z-index: 3; 117 | margin-top: 10.66667px; 118 | float: right; 119 | font-size: 0; 120 | line-height: 0; 121 | } 122 | 123 | .algolia-autocomplete .algolia-docsearch-footer--logo { 124 | background-image: url("data:image/svg+xml;utf8,"); 125 | background-repeat: no-repeat; 126 | background-position: 50%; 127 | background-size: 100%; 128 | overflow: hidden; 129 | text-indent: -9000px; 130 | width: 100%; 131 | height: 100%; 132 | display: block; 133 | transform: translate(-8px); 134 | } 135 | 136 | .algolia-autocomplete 
.algolia-docsearch-suggestion--highlight { 137 | color: #FF8C00; 138 | background: rgba(232, 189, 54, 0.1) 139 | } 140 | 141 | 142 | .algolia-autocomplete .algolia-docsearch-suggestion--text .algolia-docsearch-suggestion--highlight { 143 | box-shadow: inset 0 -2px 0 0 rgba(105, 105, 105, .5) 144 | } 145 | 146 | .algolia-autocomplete .ds-suggestion.ds-cursor .algolia-docsearch-suggestion--content { 147 | background-color: rgba(192, 192, 192, .15) 148 | } 149 | -------------------------------------------------------------------------------- /docs/docsearch.js: -------------------------------------------------------------------------------- 1 | $(function() { 2 | 3 | // register a handler to move the focus to the search bar 4 | // upon pressing shift + "/" (i.e. "?") 5 | $(document).on('keydown', function(e) { 6 | if (e.shiftKey && e.keyCode == 191) { 7 | e.preventDefault(); 8 | $("#search-input").focus(); 9 | } 10 | }); 11 | 12 | $(document).ready(function() { 13 | // do keyword highlighting 14 | /* modified from https://jsfiddle.net/julmot/bL6bb5oo/ */ 15 | var mark = function() { 16 | 17 | var referrer = document.URL ; 18 | var paramKey = "q" ; 19 | 20 | if (referrer.indexOf("?") !== -1) { 21 | var qs = referrer.substr(referrer.indexOf('?') + 1); 22 | var qs_noanchor = qs.split('#')[0]; 23 | var qsa = qs_noanchor.split('&'); 24 | var keyword = ""; 25 | 26 | for (var i = 0; i < qsa.length; i++) { 27 | var currentParam = qsa[i].split('='); 28 | 29 | if (currentParam.length !== 2) { 30 | continue; 31 | } 32 | 33 | if (currentParam[0] == paramKey) { 34 | keyword = decodeURIComponent(currentParam[1].replace(/\+/g, "%20")); 35 | } 36 | } 37 | 38 | if (keyword !== "") { 39 | $(".contents").unmark({ 40 | done: function() { 41 | $(".contents").mark(keyword); 42 | } 43 | }); 44 | } 45 | } 46 | }; 47 | 48 | mark(); 49 | }); 50 | }); 51 | 52 | /* Search term highlighting ------------------------------*/ 53 | 54 | function matchedWords(hit) { 55 | var words = []; 56 | 57 | var 
hierarchy = hit._highlightResult.hierarchy; 58 | // loop to fetch from lvl0, lvl1, etc. 59 | for (var idx in hierarchy) { 60 | words = words.concat(hierarchy[idx].matchedWords); 61 | } 62 | 63 | var content = hit._highlightResult.content; 64 | if (content) { 65 | words = words.concat(content.matchedWords); 66 | } 67 | 68 | // return unique words 69 | var words_uniq = [...new Set(words)]; 70 | return words_uniq; 71 | } 72 | 73 | function updateHitURL(hit) { 74 | 75 | var words = matchedWords(hit); 76 | var url = ""; 77 | 78 | if (hit.anchor) { 79 | url = hit.url_without_anchor + '?q=' + escape(words.join(" ")) + '#' + hit.anchor; 80 | } else { 81 | url = hit.url + '?q=' + escape(words.join(" ")); 82 | } 83 | 84 | return url; 85 | } 86 | -------------------------------------------------------------------------------- /docs/link.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 8 | 12 | 13 | -------------------------------------------------------------------------------- /docs/man/figures/README-corr_funnel.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/man/figures/README-corr_funnel.png -------------------------------------------------------------------------------- /docs/news/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Changelog • correlationfunnel 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |
    72 |
    73 | 127 | 128 | 129 | 130 |
    131 | 132 |
    133 |
    134 | 138 | 139 |
    140 |

    141 | correlationfunnel 0.2.1 Unreleased 142 |

    143 |
      144 |
    • Compatibility with dplyr 1.0.0
    • 145 |
    146 |
    147 |
    148 |

    149 | correlationfunnel 0.2.0 Unreleased 150 |

    151 |
      152 |
    • Fix - Allow integer data to be binarize()-ed
    • 153 |
    • Fix - Allow logical data to be binarize()-ed. Values are converted to integer and then binarized.
    • 154 |
    155 |
    156 |
    157 |

    158 | correlationfunnel 0.1.0 2019-08-06 159 |

    160 |
      161 |
    • Initial CRAN Submission
    • 162 |
    163 |
    164 |
    165 |

    166 | correlationfunnel 0.0.9 Unreleased 167 |

    168 |
      169 |
    • Package under development
    • 170 |
    • Added a NEWS.md file to track changes to the package.
    • 171 |
    172 |
    173 |
    174 | 175 | 180 | 181 |
    182 | 183 | 184 |
    185 | 188 | 189 |
    190 |

    Site built with pkgdown 1.5.1.

    191 |
    192 | 193 |
    194 |
    195 | 196 | 197 | 198 | 199 | 200 | 201 | 202 | 203 | -------------------------------------------------------------------------------- /docs/pkgdown.css: -------------------------------------------------------------------------------- 1 | /* Sticky footer */ 2 | 3 | /** 4 | * Basic idea: https://philipwalton.github.io/solved-by-flexbox/demos/sticky-footer/ 5 | * Details: https://github.com/philipwalton/solved-by-flexbox/blob/master/assets/css/components/site.css 6 | * 7 | * .Site -> body > .container 8 | * .Site-content -> body > .container .row 9 | * .footer -> footer 10 | * 11 | * Key idea seems to be to ensure that .container and __all its parents__ 12 | * have height set to 100% 13 | * 14 | */ 15 | 16 | html, body { 17 | height: 100%; 18 | } 19 | 20 | body { 21 | position: relative; 22 | } 23 | 24 | body > .container { 25 | display: flex; 26 | height: 100%; 27 | flex-direction: column; 28 | } 29 | 30 | body > .container .row { 31 | flex: 1 0 auto; 32 | } 33 | 34 | footer { 35 | margin-top: 45px; 36 | padding: 35px 0 36px; 37 | border-top: 1px solid #e5e5e5; 38 | color: #666; 39 | display: flex; 40 | flex-shrink: 0; 41 | } 42 | footer p { 43 | margin-bottom: 0; 44 | } 45 | footer div { 46 | flex: 1; 47 | } 48 | footer .pkgdown { 49 | text-align: right; 50 | } 51 | footer p { 52 | margin-bottom: 0; 53 | } 54 | 55 | img.icon { 56 | float: right; 57 | } 58 | 59 | img { 60 | max-width: 100%; 61 | } 62 | 63 | /* Fix bug in bootstrap (only seen in firefox) */ 64 | summary { 65 | display: list-item; 66 | } 67 | 68 | /* Typographic tweaking ---------------------------------*/ 69 | 70 | .contents .page-header { 71 | margin-top: calc(-60px + 1em); 72 | } 73 | 74 | dd { 75 | margin-left: 3em; 76 | } 77 | 78 | /* Section anchors ---------------------------------*/ 79 | 80 | a.anchor { 81 | margin-left: -30px; 82 | display:inline-block; 83 | width: 30px; 84 | height: 30px; 85 | visibility: hidden; 86 | 87 | background-image: url(./link.svg); 88 | background-repeat: 
no-repeat; 89 | background-size: 20px 20px; 90 | background-position: center center; 91 | } 92 | 93 | .hasAnchor:hover a.anchor { 94 | visibility: visible; 95 | } 96 | 97 | @media (max-width: 767px) { 98 | .hasAnchor:hover a.anchor { 99 | visibility: hidden; 100 | } 101 | } 102 | 103 | 104 | /* Fixes for fixed navbar --------------------------*/ 105 | 106 | .contents h1, .contents h2, .contents h3, .contents h4 { 107 | padding-top: 60px; 108 | margin-top: -40px; 109 | } 110 | 111 | /* Navbar submenu --------------------------*/ 112 | 113 | .dropdown-submenu { 114 | position: relative; 115 | } 116 | 117 | .dropdown-submenu>.dropdown-menu { 118 | top: 0; 119 | left: 100%; 120 | margin-top: -6px; 121 | margin-left: -1px; 122 | border-radius: 0 6px 6px 6px; 123 | } 124 | 125 | .dropdown-submenu:hover>.dropdown-menu { 126 | display: block; 127 | } 128 | 129 | .dropdown-submenu>a:after { 130 | display: block; 131 | content: " "; 132 | float: right; 133 | width: 0; 134 | height: 0; 135 | border-color: transparent; 136 | border-style: solid; 137 | border-width: 5px 0 5px 5px; 138 | border-left-color: #cccccc; 139 | margin-top: 5px; 140 | margin-right: -10px; 141 | } 142 | 143 | .dropdown-submenu:hover>a:after { 144 | border-left-color: #ffffff; 145 | } 146 | 147 | .dropdown-submenu.pull-left { 148 | float: none; 149 | } 150 | 151 | .dropdown-submenu.pull-left>.dropdown-menu { 152 | left: -100%; 153 | margin-left: 10px; 154 | border-radius: 6px 0 6px 6px; 155 | } 156 | 157 | /* Sidebar --------------------------*/ 158 | 159 | #pkgdown-sidebar { 160 | margin-top: 30px; 161 | position: -webkit-sticky; 162 | position: sticky; 163 | top: 70px; 164 | } 165 | 166 | #pkgdown-sidebar h2 { 167 | font-size: 1.5em; 168 | margin-top: 1em; 169 | } 170 | 171 | #pkgdown-sidebar h2:first-child { 172 | margin-top: 0; 173 | } 174 | 175 | #pkgdown-sidebar .list-unstyled li { 176 | margin-bottom: 0.5em; 177 | } 178 | 179 | /* bootstrap-toc tweaks 
------------------------------------------------------*/ 180 | 181 | /* All levels of nav */ 182 | 183 | nav[data-toggle='toc'] .nav > li > a { 184 | padding: 4px 20px 4px 6px; 185 | font-size: 1.5rem; 186 | font-weight: 400; 187 | color: inherit; 188 | } 189 | 190 | nav[data-toggle='toc'] .nav > li > a:hover, 191 | nav[data-toggle='toc'] .nav > li > a:focus { 192 | padding-left: 5px; 193 | color: inherit; 194 | border-left: 1px solid #878787; 195 | } 196 | 197 | nav[data-toggle='toc'] .nav > .active > a, 198 | nav[data-toggle='toc'] .nav > .active:hover > a, 199 | nav[data-toggle='toc'] .nav > .active:focus > a { 200 | padding-left: 5px; 201 | font-size: 1.5rem; 202 | font-weight: 400; 203 | color: inherit; 204 | border-left: 2px solid #878787; 205 | } 206 | 207 | /* Nav: second level (shown on .active) */ 208 | 209 | nav[data-toggle='toc'] .nav .nav { 210 | display: none; /* Hide by default, but at >768px, show it */ 211 | padding-bottom: 10px; 212 | } 213 | 214 | nav[data-toggle='toc'] .nav .nav > li > a { 215 | padding-left: 16px; 216 | font-size: 1.35rem; 217 | } 218 | 219 | nav[data-toggle='toc'] .nav .nav > li > a:hover, 220 | nav[data-toggle='toc'] .nav .nav > li > a:focus { 221 | padding-left: 15px; 222 | } 223 | 224 | nav[data-toggle='toc'] .nav .nav > .active > a, 225 | nav[data-toggle='toc'] .nav .nav > .active:hover > a, 226 | nav[data-toggle='toc'] .nav .nav > .active:focus > a { 227 | padding-left: 15px; 228 | font-weight: 500; 229 | font-size: 1.35rem; 230 | } 231 | 232 | /* orcid ------------------------------------------------------------------- */ 233 | 234 | .orcid { 235 | font-size: 16px; 236 | color: #A6CE39; 237 | /* margins are required by official ORCID trademark and display guidelines */ 238 | margin-left:4px; 239 | margin-right:4px; 240 | vertical-align: middle; 241 | } 242 | 243 | /* Reference index & topics ----------------------------------------------- */ 244 | 245 | .ref-index th {font-weight: normal;} 246 | 247 | .ref-index td 
{vertical-align: top;} 248 | .ref-index .icon {width: 40px;} 249 | .ref-index .alias {width: 40%;} 250 | .ref-index-icons .alias {width: calc(40% - 40px);} 251 | .ref-index .title {width: 60%;} 252 | 253 | .ref-arguments th {text-align: right; padding-right: 10px;} 254 | .ref-arguments th, .ref-arguments td {vertical-align: top;} 255 | .ref-arguments .name {width: 20%;} 256 | .ref-arguments .desc {width: 80%;} 257 | 258 | /* Nice scrolling for wide elements --------------------------------------- */ 259 | 260 | table { 261 | display: block; 262 | overflow: auto; 263 | } 264 | 265 | /* Syntax highlighting ---------------------------------------------------- */ 266 | 267 | pre { 268 | word-wrap: normal; 269 | word-break: normal; 270 | border: 1px solid #eee; 271 | } 272 | 273 | pre, code { 274 | background-color: #f8f8f8; 275 | color: #333; 276 | } 277 | 278 | pre code { 279 | overflow: auto; 280 | word-wrap: normal; 281 | white-space: pre; 282 | } 283 | 284 | pre .img { 285 | margin: 5px 0; 286 | } 287 | 288 | pre .img img { 289 | background-color: #fff; 290 | display: block; 291 | height: auto; 292 | } 293 | 294 | code a, pre a { 295 | color: #375f84; 296 | } 297 | 298 | a.sourceLine:hover { 299 | text-decoration: none; 300 | } 301 | 302 | .fl {color: #1514b5;} 303 | .fu {color: #000000;} /* function */ 304 | .ch,.st {color: #036a07;} /* string */ 305 | .kw {color: #264D66;} /* keyword */ 306 | .co {color: #888888;} /* comment */ 307 | 308 | .message { color: black; font-weight: bolder;} 309 | .error { color: orange; font-weight: bolder;} 310 | .warning { color: #6A0366; font-weight: bolder;} 311 | 312 | /* Clipboard --------------------------*/ 313 | 314 | .hasCopyButton { 315 | position: relative; 316 | } 317 | 318 | .btn-copy-ex { 319 | position: absolute; 320 | right: 0; 321 | top: 0; 322 | visibility: hidden; 323 | } 324 | 325 | .hasCopyButton:hover button.btn-copy-ex { 326 | visibility: visible; 327 | } 328 | 329 | /* headroom.js ------------------------ */ 
330 | 331 | .headroom { 332 | will-change: transform; 333 | transition: transform 200ms linear; 334 | } 335 | .headroom--pinned { 336 | transform: translateY(0%); 337 | } 338 | .headroom--unpinned { 339 | transform: translateY(-100%); 340 | } 341 | 342 | /* mark.js ----------------------------*/ 343 | 344 | mark { 345 | background-color: rgba(255, 255, 51, 0.5); 346 | border-bottom: 2px solid rgba(255, 153, 51, 0.3); 347 | padding: 1px; 348 | } 349 | 350 | /* vertical spacing after htmlwidgets */ 351 | .html-widget { 352 | margin-bottom: 10px; 353 | } 354 | 355 | /* fontawesome ------------------------ */ 356 | 357 | .fab { 358 | font-family: "Font Awesome 5 Brands" !important; 359 | } 360 | 361 | /* don't display links in code chunks when printing */ 362 | /* source: https://stackoverflow.com/a/10781533 */ 363 | @media print { 364 | code a:link:after, code a:visited:after { 365 | content: ""; 366 | } 367 | } 368 | -------------------------------------------------------------------------------- /docs/pkgdown.js: -------------------------------------------------------------------------------- 1 | /* http://gregfranko.com/blog/jquery-best-practices/ */ 2 | (function($) { 3 | $(function() { 4 | 5 | $('.navbar-fixed-top').headroom(); 6 | 7 | $('body').css('padding-top', $('.navbar').height() + 10); 8 | $(window).resize(function(){ 9 | $('body').css('padding-top', $('.navbar').height() + 10); 10 | }); 11 | 12 | $('[data-toggle="tooltip"]').tooltip(); 13 | 14 | var cur_path = paths(location.pathname); 15 | var links = $("#navbar ul li a"); 16 | var max_length = -1; 17 | var pos = -1; 18 | for (var i = 0; i < links.length; i++) { 19 | if (links[i].getAttribute("href") === "#") 20 | continue; 21 | // Ignore external links 22 | if (links[i].host !== location.host) 23 | continue; 24 | 25 | var nav_path = paths(links[i].pathname); 26 | 27 | var length = prefix_length(nav_path, cur_path); 28 | if (length > max_length) { 29 | max_length = length; 30 | pos = i; 31 | } 32 | } 33 
| 34 | // Add class to parent
  • , and enclosing
  • if in dropdown 35 | if (pos >= 0) { 36 | var menu_anchor = $(links[pos]); 37 | menu_anchor.parent().addClass("active"); 38 | menu_anchor.closest("li.dropdown").addClass("active"); 39 | } 40 | }); 41 | 42 | function paths(pathname) { 43 | var pieces = pathname.split("/"); 44 | pieces.shift(); // always starts with / 45 | 46 | var end = pieces[pieces.length - 1]; 47 | if (end === "index.html" || end === "") 48 | pieces.pop(); 49 | return(pieces); 50 | } 51 | 52 | // Returns -1 if not found 53 | function prefix_length(needle, haystack) { 54 | if (needle.length > haystack.length) 55 | return(-1); 56 | 57 | // Special case for length-0 haystack, since for loop won't run 58 | if (haystack.length === 0) { 59 | return(needle.length === 0 ? 0 : -1); 60 | } 61 | 62 | for (var i = 0; i < haystack.length; i++) { 63 | if (needle[i] != haystack[i]) 64 | return(i); 65 | } 66 | 67 | return(haystack.length); 68 | } 69 | 70 | /* Clipboard --------------------------*/ 71 | 72 | function changeTooltipMessage(element, msg) { 73 | var tooltipOriginalTitle=element.getAttribute('data-original-title'); 74 | element.setAttribute('data-original-title', msg); 75 | $(element).tooltip('show'); 76 | element.setAttribute('data-original-title', tooltipOriginalTitle); 77 | } 78 | 79 | if(ClipboardJS.isSupported()) { 80 | $(document).ready(function() { 81 | var copyButton = ""; 82 | 83 | $(".examples, div.sourceCode").addClass("hasCopyButton"); 84 | 85 | // Insert copy buttons: 86 | $(copyButton).prependTo(".hasCopyButton"); 87 | 88 | // Initialize tooltips: 89 | $('.btn-copy-ex').tooltip({container: 'body'}); 90 | 91 | // Initialize clipboard: 92 | var clipboardBtnCopies = new ClipboardJS('[data-clipboard-copy]', { 93 | text: function(trigger) { 94 | return trigger.parentNode.textContent; 95 | } 96 | }); 97 | 98 | clipboardBtnCopies.on('success', function(e) { 99 | changeTooltipMessage(e.trigger, 'Copied!'); 100 | e.clearSelection(); 101 | }); 102 | 103 | clipboardBtnCopies.on('error', 
function(e) { 104 | changeTooltipMessage(e.trigger,'Press Ctrl+C or Command+C to copy'); 105 | }); 106 | }); 107 | } 108 | })(window.jQuery || window.$) 109 | -------------------------------------------------------------------------------- /docs/pkgdown.yml: -------------------------------------------------------------------------------- 1 | pandoc: 2.9.2.1 2 | pkgdown: 1.5.1 3 | pkgdown_sha: ~ 4 | articles: 5 | introducing_correlation_funnel: introducing_correlation_funnel.html 6 | key_considerations: key_considerations.html 7 | last_built: 2020-06-09T00:30Z 8 | 9 | -------------------------------------------------------------------------------- /docs/reference/correlate.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Correlate a response (target) to features in a data set. — correlate • correlationfunnel 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 67 | 68 | 69 | 70 | 71 | 72 |
    73 |
    74 | 128 | 129 | 130 | 131 |
    132 | 133 |
    134 |
    135 | 140 | 141 |
    142 |

    correlate returns a correlation between a target column and the features in a data set.

    143 |
    144 | 145 |
    correlate(data, target, ...)
    146 | 147 |

    Arguments

    148 | 149 | 150 | 151 | 152 | 153 | 154 | 155 | 156 | 157 | 158 | 159 | 160 | 161 | 162 |
    data

    A tibble or data.frame

    target

    The feature that contains the response (target) whose relationship to the other features you want to measure.

    ...

    Other arguments passed to cor

    163 | 164 |

    Value

    165 | 166 |

    A tbl

    167 |

    Details

    168 | 169 |

    The correlate() function provides a convenient wrapper around the cor function where the target 170 | is the column containing the Y variable. The function is intended to be used with binarize(), which enables 171 | creation of the binary correlation analysis, which is the input data for the plot_correlation_funnel() visualization.

    172 |

    The default method is the Pearson correlation, which is the Correlation Coefficient from L. Duan et al., 2014. 173 | This represents the linear relationship between two dichotomous features (binary variables). 174 | Learn more about the binary correlation approach in the Vignette covering the Methodology, Key Considerations and FAQs.
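    To illustrate the point about dichotomous features, here is a minimal sketch in base R. It assumes two made-up 0/1 vectors (`x` and `y` are hypothetical and not taken from the package data) and shows that the Pearson correlation returned by `cor()` matches the phi coefficient computed from the 2x2 contingency counts. It does not reproduce the internals of correlate().

```r
# Hypothetical binary (0/1) features; the values are made up for illustration.
x <- c(1, 0, 1, 1, 0, 0, 1, 0)
y <- c(1, 0, 1, 0, 0, 0, 1, 1)

# Pearson correlation via base R's cor(), the function correlate() is documented to wrap.
r_pearson <- cor(x, y, method = "pearson")

# Phi coefficient computed from the 2x2 contingency counts.
n11 <- sum(x == 1 & y == 1)
n10 <- sum(x == 1 & y == 0)
n01 <- sum(x == 0 & y == 1)
n00 <- sum(x == 0 & y == 0)
phi <- (n11 * n00 - n10 * n01) /
  sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))

# Compare the two values, allowing for floating-point error.
same_value <- isTRUE(all.equal(r_pearson, phi))
print(c(pearson = r_pearson, phi = phi))
```

    For two binary variables the two quantities coincide, which is why a plain Pearson correlation is a reasonable default once every feature has been binarized.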

    175 |

    References

    176 | 177 |

    Lian Duan, W. Nick Street, Yanchi Liu, Songhua Xu, and Brook Wu. 2014. Selecting the right correlation 178 | measure for binary data. ACM Trans. Knowl. Discov. Data 9, 2, Article 13 (September 2014), 28 pages. 179 | DOI: http://dx.doi.org/10.1145/2637484

    180 |

    See also

    181 | 182 | 183 | 184 |

    Examples

    185 |
    library(dplyr) 186 | library(correlationfunnel) 187 | 188 | marketing_campaign_tbl %>% 189 | select(-ID) %>% 190 | binarize() %>% 191 | correlate(TERM_DEPOSIT__yes)
    #> # A tibble: 74 x 3 192 | #> feature bin correlation 193 | #> <fct> <chr> <dbl> 194 | #> 1 TERM_DEPOSIT no -1.00 195 | #> 2 TERM_DEPOSIT yes 1.00 196 | #> 3 DURATION 319_Inf 0.318 197 | #> 4 POUTCOME success 0.307 198 | #> 5 DURATION -Inf_103 -0.191 199 | #> 6 PDAYS -OTHER 0.167 200 | #> 7 PDAYS -1 -0.167 201 | #> 8 PREVIOUS 0 -0.167 202 | #> 9 POUTCOME unknown -0.167 203 | #> 10 CONTACT unknown -0.151 204 | #> # … with 64 more rows
    205 | 206 |
    207 |
    208 | 213 |
    214 | 215 | 216 |
    217 | 220 | 221 |
    222 |

    Site built with pkgdown 1.5.1.

    223 |
    224 | 225 |
    226 |
    227 | 228 | 229 | 230 | 231 | 232 | 233 | 234 | 235 | -------------------------------------------------------------------------------- /docs/reference/correlationfunnel-package.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | correlationfunnel: Speed Up Exploratory Data Analysis (EDA) with the Correlation Funnel — correlationfunnel-package • correlationfunnel 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 48 | 49 | 50 | 51 | 52 | 53 | 54 | 55 | 56 | 60 | 61 | 62 | 63 | 64 | 71 | 72 | 73 | 74 | 75 | 76 |
    77 |
    78 | 132 | 133 | 134 | 135 |
    136 | 137 |
    138 |
    139 | 144 | 145 |
    146 |

    Speeds up exploratory data analysis (EDA) 147 | by providing a succinct workflow and interactive visualization tools for understanding 148 | which features have relationships to the target (response). Uses binary correlation analysis 149 | to determine the relationship. Default correlation method is the Pearson method. 150 | Lian Duan, W Nick Street, Yanchi Liu, Songhua Xu, and Brook Wu (2014) <doi:10.1145/2637484>.

    151 |
    152 | 153 | 154 | 155 |

    See also

    156 | 157 | 163 | 164 |
    165 | 170 |
    171 | 172 | 173 |
    174 | 177 | 178 |
    179 |

    Site built with pkgdown 1.5.1.

    180 |
    181 | 182 |
    183 |
    184 | 185 | 186 | 187 | 188 | 189 | 190 | 191 | 192 | -------------------------------------------------------------------------------- /docs/reference/customer_churn_tbl.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Customer Churn Data Set for a Telecommunications Company — customer_churn_tbl • correlationfunnel 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 67 | 68 | 69 | 70 | 71 | 72 |
    73 |
    74 | 128 | 129 | 130 | 131 |
    132 | 133 |
    134 |
    135 | 140 | 141 |
    142 |

    A dataset containing data related to telecom customers that have enrolled in various products and services

    143 |
    144 | 145 |
    customer_churn_tbl
    146 | 147 | 148 |

    Format

    149 | 150 |

    An object of class spec_tbl_df (inherits from tbl_df, tbl, data.frame) with 7043 rows and 21 columns.

    151 |

    Source

    152 | 153 |

    IBM Sample Datasets

    154 |

    Telecom Customer Data:

    155 | 156 | 157 |
      158 |
    • customerID (chr): CUSTOMER ID

    • 159 |
    • gender (chr): Customer's gender ("Female", "Male")

    • 160 |
    • SeniorCitizen (dbl): 1 = Senior Citizen, 0 = Not Senior Citizen

    • 161 |
    • Partner (chr): Whether the customer has a partner or not (Yes, No)

    • 162 |
    • Dependents (chr): Whether the customer has dependents or not (Yes, No)

    • 163 |
    • tenure (dbl): Number of months the customer has stayed with the company

    • 164 |
    • PhoneService (chr): Whether the customer has a phone service or not (Yes, No)

    • 165 |
    • MultipleLines (chr): Whether the customer has multiple lines or not (Yes, No, No phone service)

    • 166 |
    • InternetService (chr): Customer’s internet service provider (DSL, Fiber optic, No)

    • 167 |
    • OnlineSecurity (chr): Whether the customer has online security or not (Yes, No, No internet service)

    • 168 |
    • OnlineBackup (chr): Whether the customer has online backup or not (Yes, No, No internet service)

    • 169 |
    • DeviceProtection (chr): Whether the customer has device protection or not (Yes, No, No internet service)

    • 170 |
    • TechSupport (chr): Whether the customer has tech support or not (Yes, No, No internet service)

    • 171 |
    • StreamingTV (chr): Whether the customer has streaming TV or not (Yes, No, No internet service)

    • 172 |
    • StreamingMovies (chr): Whether the customer has streaming movies or not (Yes, No, No internet service)

    • 173 |
    • Contract (chr): The contract term of the customer (Month-to-month, One year, Two year)

    • 174 |
    • PaperlessBilling (chr): Whether the customer has paperless billing or not (Yes, No)

    • 175 |
    • PaymentMethod (chr): The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic))

    • 176 |
    • MonthlyCharges (dbl): The amount charged to the customer monthly

    • 177 |
    • TotalCharges (dbl): The total amount charged to the customer

    • 178 |
    • Churn (chr): Outcome. Whether the customer churned or not (Yes or No)

    • 179 |
    180 | 181 | 182 |
    183 | 188 |
    189 | 190 | 191 |
    192 | 195 | 196 |
    197 |

    Site built with pkgdown 1.5.1.

    198 |
    199 | 200 |
    201 |
    202 | 203 | 204 | 205 | 206 | 207 | 208 | 209 | 210 | -------------------------------------------------------------------------------- /docs/reference/figures/README-3-course-system.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/reference/figures/README-3-course-system.jpg -------------------------------------------------------------------------------- /docs/reference/figures/README-corr_funnel.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/reference/figures/README-corr_funnel.png -------------------------------------------------------------------------------- /docs/reference/figures/README-unnamed-chunk-5-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/reference/figures/README-unnamed-chunk-5-1.png -------------------------------------------------------------------------------- /docs/reference/figures/README-unnamed-chunk-6-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/reference/figures/README-unnamed-chunk-6-1.png -------------------------------------------------------------------------------- /docs/reference/figures/logo-correlationfunnel.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/reference/figures/logo-correlationfunnel.png 
-------------------------------------------------------------------------------- /docs/reference/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Function reference • correlationfunnel 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 66 | 67 | 68 | 69 | 70 | 71 |

    General

    • correlationfunnel-package: correlationfunnel: Speed Up Exploratory Data Analysis (EDA) with the Correlation Funnel

    Correlation Funnel Workflow

    The main functions used to perform binary correlation analysis.

    • binarize(): Turn data with numeric, categorical features into binary data.
    • correlate(): Correlate a response (target) to features in a data set.

    Visualization functions

    Plotting utilities for visualizing the Correlation Funnel.

    • plot_correlation_funnel(): Plot a Correlation Funnel

    Datasets

    Datasets that ship with correlationfunnel

    • customer_churn_tbl: Customer Churn Data Set for a Telecommunications Company
    • marketing_campaign_tbl: Marketing Data for a Bank

    Site built with pkgdown 1.5.1.

-------------------------------------------------------------------------------- /docs/reference/marketing_campaign_tbl.html: -------------------------------------------------------------------------------- Marketing Data for a Bank — marketing_campaign_tbl • correlationfunnel

    A dataset containing data related to bank clients, last contact of the current marketing campaign, and attributes related to a previous marketing campaign.

    marketing_campaign_tbl

    Format

    An object of class tbl_df (inherits from tbl, data.frame) with 45211 rows and 18 columns.

    Source

    Moro et al., 2014 S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014

    Bank Client Data:

    • ID (chr): CUSTOMER ID
    • AGE (dbl): Customer's age
    • JOB (chr): Type of job (categorical: "admin.","unknown","unemployed","management","housemaid","entrepreneur","student", "blue-collar","self-employed","retired","technician","services")
    • MARITAL (chr): marital status (categorical: "married","divorced","single"; note: "divorced" means divorced or widowed)
    • EDUCATION (chr): categorical: "unknown","secondary","primary","tertiary"
    • DEFAULT (chr): Has credit in default? (binary: "yes","no")
    • BALANCE (dbl): Average yearly balance, in euros (numeric)
    • HOUSING (chr): Has housing loan? (binary: "yes","no")
    • LOAN (chr): Has personal loan? (binary: "yes","no")

    Features related to the last contact during the current marketing campaign:

    • CONTACT (chr): Contact communication type (categorical: "unknown","telephone","cellular")
    • DAY (dbl): Last contact day of the month (numeric)
    • MONTH (chr): Last contact month of year (categorical: "jan", "feb", "mar", ..., "nov", "dec")
    • DURATION (dbl): Last contact duration, in seconds (numeric)

    Additional Attributes:

    • CAMPAIGN (dbl): Number of contacts performed during this campaign and for this client (numeric, includes last contact)
    • PDAYS (dbl): Number of days that passed by after the client was last contacted from a previous campaign (numeric, -1 means client was not previously contacted)
    • PREVIOUS (dbl): Number of contacts performed before this campaign and for this client (numeric)
    • POUTCOME (chr): Outcome of the previous marketing campaign (categorical: "unknown","other","failure","success")

    Target Variable (Response):

    • TERM_DEPOSIT (chr): Has the client subscribed a term deposit? (binary: "yes","no")
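
    As a quick sanity check on the description above, the following minimal sketch loads the dataset and inspects its shape and target. It assumes the `correlationfunnel` and `dplyr` packages are installed; the dimensions are the ones reported in the Format section, not independently measured here.

```r
# Assumes correlationfunnel and dplyr are installed.
library(dplyr)
library(correlationfunnel)

data("marketing_campaign_tbl")

# The Format section documents 45211 rows and 18 columns.
dim(marketing_campaign_tbl)

# Tally the response column to see how balanced the target is.
marketing_campaign_tbl %>%
    count(TERM_DEPOSIT)
```

    Checking the class balance first is useful because a heavily skewed target can affect how the correlations read later in the workflow.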


    Site built with pkgdown 1.5.1.

-------------------------------------------------------------------------------- /docs/reference/pipe.html: -------------------------------------------------------------------------------- Pipe operator — %>% • correlationfunnel

    See magrittr::%>% for details.

    lhs %>% rhs

    Site built with pkgdown 1.5.1.

-------------------------------------------------------------------------------- /docs/reference/plot_correlation_funnel-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/docs/reference/plot_correlation_funnel-1.png -------------------------------------------------------------------------------- /docs/reference/plot_correlation_funnel.html: -------------------------------------------------------------------------------- Plot a Correlation Funnel — plot_correlation_funnel • correlationfunnel

    plot_correlation_funnel returns a correlation funnel visualization in either static (ggplot2) or interactive (plotly) formats.

    plot_correlation_funnel(
      data,
      interactive = FALSE,
      limits = c(-1, 1),
      alpha = 1
    )

    Arguments

    • data: A tibble or data.frame
    • interactive: Returns either a static (ggplot2) visualization or an interactive (plotly) visualization
    • limits: Sets the X-Axis limits for the correlation space
    • alpha: Sets the transparency of the points on the plot.

    Value

    A static ggplot2 plot or an interactive plotly plot

    See also

    binarize(), correlate()

    Examples

    library(dplyr)
    library(correlationfunnel)

    marketing_campaign_tbl %>%
        select(-ID) %>%
        binarize() %>%
        correlate(TERM_DEPOSIT__yes) %>%
        plot_correlation_funnel()
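
    The remaining arguments can be exercised in the same pipeline. The sketch below is illustrative only: the `limits` and `alpha` values are arbitrary choices, the object names `funnel_tbl`, `g`, and `p` are hypothetical, and the interactive variant assumes the `plotly` package is installed.

```r
# Assumes correlationfunnel, dplyr, ggplot2 and plotly are installed.
library(dplyr)
library(correlationfunnel)

# Reuse the correlated table so both plots share the same input.
funnel_tbl <- marketing_campaign_tbl %>%
    select(-ID) %>%
    binarize() %>%
    correlate(TERM_DEPOSIT__yes)

# Static plot with a narrower x-axis and semi-transparent points.
# The specific values are illustrative, not recommendations.
g <- plot_correlation_funnel(funnel_tbl, limits = c(-0.5, 0.5), alpha = 0.7)

# Interactive plot; per the documentation this returns a plotly object.
p <- plot_correlation_funnel(funnel_tbl, interactive = TRUE)
```

    Storing the correlated table once avoids re-running `binarize()` and `correlate()` for each plot variant.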

    Site built with pkgdown 1.5.1.

-------------------------------------------------------------------------------- /man/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/man/.DS_Store -------------------------------------------------------------------------------- /man/binarize.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/binarize.R 3 | \name{binarize} 4 | \alias{binarize} 5 | \title{Turn data with numeric, categorical features into binary data.} 6 | \usage{ 7 | binarize( 8 | data, 9 | n_bins = 4, 10 | thresh_infreq = 0.01, 11 | name_infreq = "-OTHER", 12 | one_hot = TRUE 13 | ) 14 | } 15 | \arguments{ 16 | \item{data}{A \code{tibble} or \code{data.frame}} 17 | 18 | \item{n_bins}{The number of bins to use for converting continuous (numeric) features into discrete features (bins)} 19 | 20 | \item{thresh_infreq}{The threshold for lumping infrequent levels of categorical (character or factor) features into an "Other" category.} 21 | 22 | \item{name_infreq}{The name for infrequently appearing categories to be lumped into. Set to "-OTHER" by default.} 23 | 24 | \item{one_hot}{If set to \code{TRUE}, binarization returns number of new columns = number of levels. 25 | If \code{FALSE}, binarization returns number of new columns = number of levels - 1 (dummy encoding).} 26 | } 27 | \value{ 28 | A \code{tbl} 29 | } 30 | \description{ 31 | \code{binarize} returns the binary data converted from data in normal (numeric and categorical) format. 
32 | } 33 | \details{ 34 | \subsection{The Goal}{ 35 | 36 | The binned format helps correlation analysis to identify non-linear trends between a predictor (binned values) and a 37 | response (the target) 38 | } 39 | 40 | \subsection{What Binarize Does}{ 41 | 42 | The \code{binarize()} function takes data in a "normal" format and converts to a binary format that is useful as a preparation 43 | step before using \code{\link[=correlate]{correlate()}}: 44 | 45 | \strong{Numeric Features}: 46 | The "Normal Data" format has numeric features that are continuous values in numeric format (\code{double} or \code{integer}). 47 | The \code{binarize()} function converts these to bins (categories) and then discretizes the bins using a one-hot encoding process. 48 | 49 | \strong{Categorical Features}: 50 | The "Normal Data" format has categorical features that are \code{character} or \code{factor} format. 51 | The \code{binarize()} function converts these to binary features using a one-hot encoding process. 52 | } 53 | } 54 | \examples{ 55 | library(dplyr) 56 | library(correlationfunnel) 57 | 58 | marketing_campaign_tbl \%>\% 59 | select(-ID) \%>\% 60 | binarize() 61 | 62 | 63 | } 64 | -------------------------------------------------------------------------------- /man/correlate.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/correlate.R 3 | \name{correlate} 4 | \alias{correlate} 5 | \title{Correlate a response (target) to features in a data set.} 6 | \usage{ 7 | correlate(data, target, ...) 
8 | } 9 | \arguments{ 10 | \item{data}{A \code{tibble} or \code{data.frame}} 11 | 12 | \item{target}{The feature that contains the response (target) whose relationship to the other features you want to measure.} 13 | 14 | \item{...}{Other arguments passed to \link[stats]{cor}} 15 | } 16 | \value{ 17 | A \code{tbl} 18 | } 19 | \description{ 20 | \code{correlate} returns a correlation between a target column and the features in a data set. 21 | } 22 | \details{ 23 | The \code{correlate()} function provides a convenient wrapper around the \link[stats]{cor} function where the \code{target} 24 | is the column containing the Y variable. The function is intended to be used with \code{\link[=binarize]{binarize()}}, which enables 25 | creation of the binary correlation analysis that feeds the \code{\link[=plot_correlation_funnel]{plot_correlation_funnel()}} visualization. 26 | 27 | The default method is the Pearson correlation, which is the Correlation Coefficient from L. Duan et al., 2014. 28 | This represents the linear relationship between two dichotomous features (binary variables). 29 | Learn more about the binary correlation approach in the Vignette covering the Methodology, Key Considerations and FAQs. 30 | } 31 | \examples{ 32 | library(dplyr) 33 | library(correlationfunnel) 34 | 35 | marketing_campaign_tbl \%>\% 36 | select(-ID) \%>\% 37 | binarize() \%>\% 38 | correlate(TERM_DEPOSIT__yes) 39 | 40 | 41 | } 42 | \references{ 43 | Lian Duan, W. Nick Street, Yanchi Liu, Songhua Xu, and Brook Wu. 2014. Selecting the right correlation 44 | measure for binary data. ACM Trans. Knowl. Discov. Data 9, 2, Article 13 (September 2014), 28 pages. 
45 | DOI: http://dx.doi.org/10.1145/2637484 46 | } 47 | \seealso{ 48 | \code{\link[=binarize]{binarize()}}, \code{\link[=plot_correlation_funnel]{plot_correlation_funnel()}} 49 | } 50 | -------------------------------------------------------------------------------- /man/correlationfunnel-package.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/correlationfunnel-package.R 3 | \docType{package} 4 | \name{correlationfunnel-package} 5 | \alias{correlationfunnel} 6 | \alias{correlationfunnel-package} 7 | \title{correlationfunnel: Speed Up Exploratory Data Analysis (EDA) with the Correlation Funnel} 8 | \description{ 9 | Speeds up exploratory data analysis (EDA) 10 | by providing a succinct workflow and interactive visualization tools for understanding 11 | which features have relationships to target (response). Uses binary correlation analysis 12 | to determine relationship. Default correlation method is the Pearson method. 13 | Lian Duan, W Nick Street, Yanchi Liu, Songhua Xu, and Brook Wu (2014) . 
14 | } 15 | \seealso{ 16 | Useful links: 17 | \itemize{ 18 | \item \url{https://github.com/business-science/correlationfunnel} 19 | \item Report bugs at \url{https://github.com/business-science/correlationfunnel/issues} 20 | } 21 | 22 | } 23 | \author{ 24 | \strong{Maintainer}: Matt Dancho \email{mdancho@business-science.io} 25 | 26 | } 27 | \keyword{internal} 28 | -------------------------------------------------------------------------------- /man/customer_churn_tbl.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/data.R 3 | \docType{data} 4 | \name{customer_churn_tbl} 5 | \alias{customer_churn_tbl} 6 | \title{Customer Churn Data Set for a Telecommunications Company} 7 | \format{ 8 | An object of class \code{spec_tbl_df} (inherits from \code{tbl_df}, \code{tbl}, \code{data.frame}) with 7043 rows and 21 columns. 9 | } 10 | \source{ 11 | \href{https://community.ibm.com/community/user/gettingstarted/home}{IBM Sample Datasets} 12 | } 13 | \usage{ 14 | customer_churn_tbl 15 | } 16 | \description{ 17 | A dataset containing data related to telecom customers that have enrolled in various products and services 18 | } 19 | \section{Telecom Customer Data:}{ 20 | \itemize{ 21 | \item customerID (chr): CUSTOMER ID 22 | \item gender (chr): Customer's gender ("Female", "Male") 23 | \item SeniorCitizen (dbl): 1 = Senior Citizen, 0 = Not Senior Citizen 24 | \item Partner (chr): Whether the customer has a partner or not (Yes, No) 25 | \item Dependents (chr): Whether the customer has dependents or not (Yes, No) 26 | \item tenure (dbl): Number of months the customer has stayed with the company 27 | \item PhoneService (chr): Whether the customer has a phone service or not (Yes, No) 28 | \item MultipleLines (chr): Whether the customer has multiple lines or not (Yes, No, No phone service) 29 | \item InternetService (chr): Customer’s internet service provider (DSL, 
Fiber optic, No) 30 | \item OnlineSecurity (chr): Whether the customer has online security or not (Yes, No, No internet service) 31 | \item OnlineBackup (chr): Whether the customer has online backup or not (Yes, No, No internet service) 32 | \item DeviceProtection (chr): Whether the customer has device protection or not (Yes, No, No internet service) 33 | \item TechSupport (chr): Whether the customer has tech support or not (Yes, No, No internet service) 34 | \item StreamingTV (chr): Whether the customer has streaming TV or not (Yes, No, No internet service) 35 | \item StreamingMovies (chr): Whether the customer has streaming movies or not (Yes, No, No internet service) 36 | \item Contract (chr): The contract term of the customer (Month-to-month, One year, Two year) 37 | \item PaperlessBilling (chr): Whether the customer has paperless billing or not (Yes, No) 38 | \item PaymentMethod (chr): The customer’s payment method (Electronic check, Mailed check, Bank transfer (automatic), Credit card (automatic)) 39 | \item MonthlyCharges (dbl): The amount charged to the customer monthly 40 | \item TotalCharges (dbl): The total amount charged to the customer 41 | \item Churn (chr): Outcome. 
Whether the customer churned or not (Yes or No) 42 | } 43 | } 44 | 45 | \keyword{datasets} 46 | -------------------------------------------------------------------------------- /man/figures/.DS_Store: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/man/figures/.DS_Store -------------------------------------------------------------------------------- /man/figures/README-3-course-system.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/man/figures/README-3-course-system.jpg -------------------------------------------------------------------------------- /man/figures/README-corr_funnel.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/man/figures/README-corr_funnel.png -------------------------------------------------------------------------------- /man/figures/README-unnamed-chunk-5-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/man/figures/README-unnamed-chunk-5-1.png -------------------------------------------------------------------------------- /man/figures/README-unnamed-chunk-6-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/man/figures/README-unnamed-chunk-6-1.png -------------------------------------------------------------------------------- /man/figures/logo-correlationfunnel.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/business-science/correlationfunnel/e592ef37ff50f8cbdb2763681f6a0a03aad72611/man/figures/logo-correlationfunnel.png -------------------------------------------------------------------------------- /man/marketing_campaign_tbl.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/data.R 3 | \docType{data} 4 | \name{marketing_campaign_tbl} 5 | \alias{marketing_campaign_tbl} 6 | \title{Marketing Data for a Bank} 7 | \format{ 8 | An object of class \code{tbl_df} (inherits from \code{tbl}, \code{data.frame}) with 45211 rows and 18 columns. 9 | } 10 | \source{ 11 | \href{https://archive.ics.uci.edu/ml/datasets/Bank+Marketing}{Moro et al., 2014} S. Moro, P. Cortez and P. Rita. A Data-Driven Approach to Predict the Success of Bank Telemarketing. Decision Support Systems, Elsevier, 62:22-31, June 2014 12 | } 13 | \usage{ 14 | marketing_campaign_tbl 15 | } 16 | \description{ 17 | A dataset containing data related to bank clients, last contact of the current marketing campaign, and attributes related to a 18 | previous marketing campaign. 19 | } 20 | \section{Bank Client Data:}{ 21 | \itemize{ 22 | \item ID (chr): CUSTOMER ID 23 | \item AGE (dbl): Customer's age 24 | \item JOB (chr): Type of job (categorical: "admin.","unknown","unemployed","management","housemaid","entrepreneur","student", "blue-collar","self-employed","retired","technician","services") 25 | \item MARITAL (chr): marital status (categorical: "married","divorced","single"; note: "divorced" means divorced or widowed) 26 | \item EDUCATION (chr): categorical: "unknown","secondary","primary","tertiary" 27 | \item DEFAULT (chr): Has credit in default? (binary: "yes","no") 28 | \item BALANCE (dbl): Average yearly balance, in euros (numeric) 29 | \item HOUSING (chr): Has housing loan? 
(binary: "yes","no") 30 | \item LOAN (chr): Has personal loan? (binary: "yes","no") 31 | } 32 | } 33 | 34 | \section{Features related to the last contact during the current marketing campaign:}{ 35 | \itemize{ 36 | \item CONTACT (chr): Contact communication type (categorical: "unknown","telephone","cellular") 37 | \item DAY (dbl): Last contact day of the month (numeric) 38 | \item MONTH (chr): Last contact month of year (categorical: "jan", "feb", "mar", ..., "nov", "dec") 39 | \item DURATION (dbl): Last contact duration, in seconds (numeric) 40 | } 41 | } 42 | 43 | \section{Additional Attributes:}{ 44 | \itemize{ 45 | \item CAMPAIGN (dbl): Number of contacts performed during this campaign and for this client (numeric, includes last contact) 46 | \item PDAYS (dbl): Number of days that passed by after the client was last contacted from a previous campaign (numeric, -1 means client was not previously contacted) 47 | \item PREVIOUS (dbl): Number of contacts performed before this campaign and for this client (numeric) 48 | \item POUTCOME (chr): Outcome of the previous marketing campaign (categorical: "unknown","other","failure","success") 49 | } 50 | } 51 | 52 | \section{Target Variable (Response):}{ 53 | \itemize{ 54 | \item TERM_DEPOSIT (chr): Has the client subscribed a term deposit? (binary: "yes","no") 55 | } 56 | } 57 | 58 | \keyword{datasets} 59 | -------------------------------------------------------------------------------- /man/pipe.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/utils-pipe.R 3 | \name{\%>\%} 4 | \alias{\%>\%} 5 | \title{Pipe operator} 6 | \usage{ 7 | lhs \%>\% rhs 8 | } 9 | \description{ 10 | See \code{magrittr::\link[magrittr]{\%>\%}} for details. 
11 | } 12 | \keyword{internal} 13 | -------------------------------------------------------------------------------- /man/plot_correlation_funnel.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot_correlation_funnel.R 3 | \name{plot_correlation_funnel} 4 | \alias{plot_correlation_funnel} 5 | \title{Plot a Correlation Funnel} 6 | \usage{ 7 | plot_correlation_funnel( 8 | data, 9 | interactive = FALSE, 10 | limits = c(-1, 1), 11 | alpha = 1 12 | ) 13 | } 14 | \arguments{ 15 | \item{data}{A \code{tibble} or \code{data.frame}} 16 | 17 | \item{interactive}{Returns either a static (\code{ggplot2}) visualization or an interactive (\code{plotly}) visualization} 18 | 19 | \item{limits}{Sets the X-Axis limits for the correlation space} 20 | 21 | \item{alpha}{Sets the transparency of the points on the plot.} 22 | } 23 | \value{ 24 | A static \code{ggplot2} plot or an interactive \code{plotly} plot 25 | } 26 | \description{ 27 | \code{plot_correlation_funnel} returns a correlation funnel visualization in either static (\code{ggplot2}) or 28 | interactive (\code{plotly}) formats. 
29 | } 30 | \examples{ 31 | library(dplyr) 32 | library(correlationfunnel) 33 | 34 | marketing_campaign_tbl \%>\% 35 | select(-ID) \%>\% 36 | binarize() \%>\% 37 | correlate(TERM_DEPOSIT__yes) \%>\% 38 | plot_correlation_funnel() 39 | 40 | 41 | } 42 | \seealso{ 43 | \code{\link[=binarize]{binarize()}}, \code{\link[=correlate]{correlate()}} 44 | } 45 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(dplyr) 3 | library(lubridate) 4 | library(stringr) 5 | library(correlationfunnel) 6 | 7 | test_check("correlationfunnel") 8 | -------------------------------------------------------------------------------- /tests/testthat/test-binarize.R: -------------------------------------------------------------------------------- 1 | # TEST BINARIZE ---- 2 | 3 | # 1.0 SETUP ---- 4 | non_data_frame <- 1:100 5 | 6 | set.seed(123) 7 | bad_tbl <- tibble( 8 | date = seq.Date(from = ymd("2018-01-01"), ymd("2018-12-31"), by = "day"), 9 | n = rnorm(n = 365), 10 | c = rep("yes", length.out = 365), 11 | b = sample(c(0,1), size = 365, replace = TRUE) %>% as.logical() 12 | ) 13 | 14 | data("customer_churn_tbl") 15 | 16 | data("marketing_campaign_tbl") 17 | 18 | # 2.0 TESTS ---- 19 | 20 | # 2.1 Check Data Type ---- 21 | test_that("Check non-data frame throws error", { 22 | expect_error(non_data_frame %>% binarize()) 23 | }) 24 | 25 | test_that("Check data types", { 26 | 27 | expect_error({ 28 | bad_tbl %>% 29 | binarize() 30 | }) 31 | 32 | }) 33 | 34 | # 2.2 Check missing data ---- 35 | test_that("Check missing data", { 36 | 37 | expect_error({ 38 | customer_churn_tbl %>% 39 | binarize() 40 | }) 41 | 42 | }) 43 | 44 | # 2.3 Check Numeric Binarization ---- 45 | test_that("Check binarize - numeric", { 46 | 47 | AGE_bin4_tbl <- marketing_campaign_tbl %>% select(AGE) %>% binarize(n_bins = 4) 48 | expect_equal(ncol(AGE_bin4_tbl), 4) 49 
| 50 | AGE_bin5_tbl <- marketing_campaign_tbl %>% select(AGE) %>% binarize(n_bins = 5) 51 | expect_equal(ncol(AGE_bin5_tbl), 5) 52 | 53 | }) 54 | 55 | test_that("Check binarize - numeric - high skew", { 56 | 57 | PDAYS_bin5_tbl <- marketing_campaign_tbl %>% select(PDAYS) %>% binarize(n_bins = 5) 58 | expect_equal(ncol(PDAYS_bin5_tbl), 2) 59 | 60 | PREVIOUS_thresh_infreq_0_tbl <- marketing_campaign_tbl %>% select(PREVIOUS) %>% binarize(thresh_infreq = 0) 61 | expect_equal(ncol(PREVIOUS_thresh_infreq_0_tbl), 41) 62 | 63 | }) 64 | 65 | # 2.4 Check Categorical Binarization ---- 66 | test_that("Check binarize - categorical", { 67 | 68 | JOB_thresh_infreq_0_tbl <- marketing_campaign_tbl %>% select(JOB) %>% binarize(thresh_infreq = 0, name_infreq = "MISC") 69 | expect_equal(ncol(JOB_thresh_infreq_0_tbl), 12) 70 | expect_false(names(JOB_thresh_infreq_0_tbl) %>% str_detect("MISC") %>% any()) # Should not contain a miscellaneous column 71 | 72 | JOB_thresh_infreq_0.1_tbl <- marketing_campaign_tbl %>% select(JOB) %>% binarize(thresh_infreq = 0.1, name_infreq = "MISC") 73 | expect_equal(ncol(JOB_thresh_infreq_0.1_tbl), 5) 74 | expect_true(names(JOB_thresh_infreq_0.1_tbl) %>% str_detect("MISC") %>% any()) # Should contain a miscellaneous column 75 | 76 | }) 77 | 78 | 79 | -------------------------------------------------------------------------------- /tests/testthat/test-correlate.R: -------------------------------------------------------------------------------- 1 | # TEST CORRELATE ---- 2 | 3 | # 1.0 SETUP ---- 4 | non_data_frame <- 1:100 5 | 6 | set.seed(123) 7 | bad_tbl <- tibble( 8 | date = seq.Date(from = ymd("2018-01-01"), ymd("2018-12-31"), by = "day"), 9 | n = rnorm(n = 365), 10 | c = rep("yes", length.out = 365), 11 | b = sample(c(0,1), size = 365, replace = TRUE) %>% as.logical() 12 | ) 13 | 14 | set.seed(123) 15 | bad_balance_tbl <- tibble( 16 | target = c(rep(0, 99), 1), 17 | x = sample(c(0, 1), size = 100, replace = TRUE) 18 | ) %>% 19 | binarize() 20 | 21 | 
data("marketing_campaign_tbl") 22 | 23 | marketing_binarized_tbl <- marketing_campaign_tbl %>% 24 | select(-ID) %>% 25 | binarize(n_bins = 4, thresh_infreq = 0.01, name_infreq = "MISC") 26 | 27 | # 2.0 TESTS ---- 28 | 29 | # 2.1 Check Data & Class Types ---- 30 | test_that("Check non-data frame throws error", { 31 | expect_error(non_data_frame %>% correlate()) 32 | }) 33 | 34 | test_that("Test missing target", { 35 | expect_error({ 36 | bad_tbl %>% 37 | correlate() 38 | }) 39 | }) 40 | 41 | test_that("Check for non-numeric columns", { 42 | expect_error({ 43 | bad_tbl %>% 44 | correlate(n) 45 | }) 46 | }) 47 | 48 | test_that("Check for bad balance", { 49 | expect_warning({ 50 | bad_balance_tbl %>% 51 | correlate(target__1) 52 | }) 53 | }) 54 | 55 | # 2.2 Check Correlation ---- 56 | test_that("Check correlation", { 57 | 58 | marketing_correlated_tbl <- marketing_binarized_tbl %>% 59 | correlate(TERM_DEPOSIT__yes) 60 | 61 | expect_equal(nrow(marketing_correlated_tbl), 74) 62 | expect_equal(ncol(marketing_correlated_tbl), 3) 63 | 64 | }) 65 | 66 | 67 | -------------------------------------------------------------------------------- /tests/testthat/test-plot_correlation_funnel.R: -------------------------------------------------------------------------------- 1 | # TEST PLOT_CORRELATION_FUNNEL ---- 2 | 3 | # 1.0 SETUP ---- 4 | non_data_frame <- 1:100 5 | 6 | set.seed(123) 7 | bad_tbl <- tibble( 8 | date = seq.Date(from = ymd("2018-01-01"), ymd("2018-12-31"), by = "day"), 9 | n = rnorm(n = 365), 10 | c = rep("yes", length.out = 365), 11 | b = sample(c(0,1), size = 365, replace = TRUE) %>% as.logical() 12 | ) 13 | 14 | data("marketing_campaign_tbl") 15 | 16 | marketing_correlated_tbl <- marketing_campaign_tbl %>% 17 | select(-ID) %>% 18 | binarize(n_bins = 4, thresh_infreq = 0.01, name_infreq = "MISC") %>% 19 | correlate(TERM_DEPOSIT__yes) 20 | 21 | g <- plot_correlation_funnel(marketing_correlated_tbl) 22 | 23 | p <- plot_correlation_funnel(marketing_correlated_tbl, 
interactive = TRUE) 24 | 25 | # 2.0 TESTS ---- 26 | 27 | # 2.1 Check Data & Class Types ---- 28 | test_that("Check non-data frame throws error", { 29 | expect_error(non_data_frame %>% plot_correlation_funnel()) 30 | }) 31 | 32 | test_that("Check bad column names", { 33 | expect_error(bad_tbl %>% plot_correlation_funnel()) 34 | }) 35 | 36 | # 2.2 Check output of plot_correlation_funnel() 37 | test_that("Check ggplot", { 38 | expect_true(any(class(g) %in% "ggplot")) 39 | }) 40 | 41 | test_that("Check plotly", { 42 | expect_true(any(class(p) %in% "plotly")) 43 | }) 44 | -------------------------------------------------------------------------------- /vignettes/.gitignore: -------------------------------------------------------------------------------- 1 | *.html 2 | *.R 3 | -------------------------------------------------------------------------------- /vignettes/introducing_correlation_funnel.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Introducing Correlation Funnel - Customer Churn Example" 3 | output: 4 | rmarkdown::html_vignette: 5 | toc: TRUE 6 | vignette: > 7 | %\VignetteIndexEntry{Introducing Correlation Funnel} 8 | %\VignetteEngine{knitr::rmarkdown} 9 | %\VignetteEncoding{UTF-8} 10 | --- 11 | 12 | ```{r, include = FALSE} 13 | knitr::opts_chunk$set( 14 | collapse = TRUE, 15 | comment = "#>", 16 | warning = FALSE, 17 | message = FALSE, 18 | dpi = 100 19 | ) 20 | ``` 21 | 22 | > Speed Up Exploratory Data Analysis (EDA) with `correlationfunnel` 23 | 24 | The goal of `correlationfunnel` is to help data scientists speed up [Exploratory Data Analysis (EDA)](https://en.wikipedia.org/wiki/Exploratory_data_analysis). EDA can be an incredibly time-consuming process. 25 | 26 | ## Problem 27 | 28 | Traditional approaches to EDA are ___labor intensive___: the data scientist reviews each of the features (predictors) in the data set for a relationship to the target (i.e. goal or response). 
This process of manually building many visualizations and searching for relationships can take hours. 29 | 30 | ## Solution 31 | 32 | 33 | 34 | __Correlation Analysis__ on data that has been _preprocessed_ (more on this shortly) can drastically speed up EDA by identifying key features that relate to the target. The key is getting the features into the "right format". This is where `correlationfunnel` helps. 35 | 36 | The `correlationfunnel` package includes a ___streamlined 3-step process for preparing data and performing visual Correlation Analysis___. The visualization produced uncovers insights by elevating high-correlation features and lowering low-correlation features. The shape looks like a ___funnel___ (hence the name "Correlation Funnel"), making it very efficient to understand which features are most likely to provide business insights and lend themselves well to a machine learning model. 37 | 38 | 39 | ## Main Benefits 40 | 41 | 1. __Speeds Up Exploratory Data Analysis__ - You can drastically increase the speed at which you perform Exploratory Data Analysis (EDA) by using Correlation Analysis to focus on key features (rather than investigating all features). 42 | 43 | 2. __Improves Feature Selection__ - Use correlation to determine whether you have good features before spending significant time developing Machine Learning Models. 44 | 45 | 3. __Gets You To Business Insights Faster__ - Understanding how features are related to a target variable can help you develop the story in the data (aka business insights). 46 | 47 | ## Correlation Funnel Process 48 | 49 | The Correlation Funnel process uses __3 functions__: 50 | 51 | 1. Transform the data into a binary format with `binarize()` - This step prepares semi-processed data in an optimal format (binary) for correlation analysis 52 | 53 | 2. Perform correlation analysis using `correlate()` - This step correlates the "binarized" data (binary features) with the target 54 | 55 | 3. 
Visualize the feature-target relationships using `plot_correlation_funnel()` - This step produces the visualization from which we can get business insights 56 | 57 | ## Example - Customer Churn 58 | 59 | We'll step through an example of identifying which features are related to Customer Churn. 60 | 61 | Load the necessary libraries. 62 | 63 | ```{r setup} 64 | library(correlationfunnel) 65 | library(dplyr) 66 | ``` 67 | 68 | Get the `customer_churn_tbl` dataset. The dataset contains a number of features related to a telecommunications company's customer base and whether or not the customer has churned. The target is "Churn". 69 | 70 | ```{r} 71 | data("customer_churn_tbl") 72 | 73 | customer_churn_tbl %>% glimpse() 74 | ``` 75 | 76 | 77 | ### Step 1 - Prepare Data as Binary Features 78 | 79 | We use the `binarize()` function to produce a feature set of binary (0/1) variables. Numeric data are binned (using `n_bins`) into categorical data, then all categorical data are one-hot encoded to produce binary features. To prevent low-frequency categories (common in high-cardinality features) from increasing the dimensionality (width of the resulting data frame), we use `thresh_infreq = 0.01` and `name_infreq = "OTHER"` to group these infrequent categories. 80 | 81 | ```{r} 82 | customer_churn_binarized_tbl <- customer_churn_tbl %>% 83 | select(-customerID) %>% 84 | mutate(TotalCharges = ifelse(is.na(TotalCharges), MonthlyCharges, TotalCharges)) %>% 85 | binarize(n_bins = 5, thresh_infreq = 0.01, name_infreq = "OTHER", one_hot = TRUE) 86 | 87 | customer_churn_binarized_tbl %>% glimpse() 88 | ``` 89 | 90 | ### Step 2 - Correlate to the Target 91 | 92 | Next, we use `correlate()` to correlate the binary features to a target (in our case Customer Churn). 
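To make the idea concrete, here is a minimal base R sketch of correlating binary features with a binary target. It is an illustration under stated assumptions, not the package's implementation: the data frame `toy_binarized` and its values are hypothetical, the `Contract__*` column names only imitate the `feature__level` naming pattern seen in `Churn__Yes`, and plain Pearson correlation via `stats::cor()` is assumed as the measure.

```r
# Hypothetical toy data: every column is already binary (0/1),
# as it would be after a binarization step.
toy_binarized <- data.frame(
  Churn__Yes               = c(1, 1, 1, 0, 0, 0, 1, 0),
  Contract__Month_to_month = c(1, 1, 1, 0, 0, 1, 1, 0),
  Contract__Two_year       = c(0, 0, 0, 1, 1, 0, 0, 1)
)

# Pearson correlation of each column with the target column.
# (Assumption: Pearson is used here purely for illustration.)
target <- "Churn__Yes"
toy_corr <- vapply(
  toy_binarized,
  function(col) cor(col, toy_binarized[[target]]),
  numeric(1)
)

# Order by absolute correlation, strongest relationships first.
toy_corr <- toy_corr[order(abs(toy_corr), decreasing = TRUE)]
toy_corr
```

In this toy data the month-to-month column moves with the target (positive correlation) and the two-year column moves against it (negative correlation), which is the kind of split the funnel plot later displays on either side of zero.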
93 | 94 | ```{r} 95 | customer_churn_corr_tbl <- customer_churn_binarized_tbl %>% 96 | correlate(Churn__Yes) 97 | 98 | customer_churn_corr_tbl 99 | ``` 100 | 101 | ### Step 3 - Plot the Correlation Funnel 102 | 103 | Finally, we visualize the correlation using the `plot_correlation_funnel()` function. 104 | 105 | ```{r, fig.height=12, fig.width=8, out.height="70%", out.width="70%", fig.align="center"} 106 | customer_churn_corr_tbl %>% 107 | plot_correlation_funnel() 108 | ``` 109 | 110 | ### Business Insights 111 | 112 | We can see that the following features are correlated with Churn: 113 | 114 | - "Month to Month" Contract Type 115 | - No Online Security 116 | - No Tech Support 117 | - Customer tenure less than 6 months 118 | - Fiber Optic internet service 119 | - Pays with electronic check 120 | 121 | We can also see that the following features are correlated with Staying (No Churn): 122 | 123 | - "Two Year" Contract Type 124 | - Customer Purchases Online Security 125 | - Customer Purchases Tech Support 126 | - Customer tenure greater than 60 months (5 years) 127 | - DSL internet service 128 | - Pays with automatic credit card 129 | 130 | We can then develop a strategy to retain high-risk customers: 131 | 132 | - Promotions for 2 Year Contract, Online Security, and Tech Support 133 | - Loyalty Bonuses to incentivize tenure 134 | - Incentives for setting up an automatic credit card payment 135 | 136 | ## Conclusion 137 | 138 | The `correlationfunnel` package provides a 3-step workflow that streamlines the EDA process, helps with feature selection, and makes it easier to obtain Business Insights. 139 | 140 | ## More Information 141 | 142 | To learn about the inner workings of `correlationfunnel` and key considerations for its use, __please read the Key Considerations and FAQs__. 143 | --------------------------------------------------------------------------------