├── CODE_OF_CONDUCT.md ├── DESCRIPTION ├── LICENSE ├── NAMESPACE ├── NEWS ├── NEWS.md ├── R ├── plot_cut_point.R ├── plot_gmm.R ├── plot_mix_comps.R ├── plot_mix_comps_normal.R └── plot_mm.R ├── README.md ├── docs └── src │ └── assets │ ├── issue_bug.png │ ├── issue_featureidea.png │ └── issue_help.png ├── man ├── plot_cut_point.Rd ├── plot_gmm.Rd ├── plot_mix_comps.Rd ├── plot_mix_comps_normal.Rd └── plot_mm.Rd ├── tests ├── testthat.R └── testthat │ ├── test_plot_gmm.R │ └── test_plot_mm.R └── vignettes └── Getting-Started.Rmd /CODE_OF_CONDUCT.md: -------------------------------------------------------------------------------- 1 | # Contributor Covenant Code of Conduct 2 | 3 | ## Our Pledge 4 | 5 | In the interest of fostering an open and welcoming environment, we as 6 | contributors and maintainers pledge to making participation in our project and 7 | our community a harassment-free experience for everyone, regardless of age, body 8 | size, disability, ethnicity, sex characteristics, gender identity and expression, 9 | level of experience, education, socio-economic status, nationality, personal 10 | appearance, race, religion, or sexual identity and orientation. 11 | 12 | ## Our Standards 13 | 14 | Examples of behavior that contributes to creating a positive environment 15 | include: 16 | 17 | * Using welcoming and inclusive language 18 | * Being respectful of differing viewpoints and experiences 19 | * Gracefully accepting constructive criticism 20 | * Focusing on what is best for the community 21 | * Showing empathy towards other community members 22 | 23 | Examples of unacceptable behavior by participants include: 24 | 25 | * The use of sexualized language or imagery and unwelcome sexual attention or 26 | advances 27 | * Trolling, insulting/derogatory comments, and personal or political attacks 28 | * Public or private harassment 29 | * Publishing others' private information, such as a physical or electronic 30 | address, without explicit permission 31 | * Other conduct which could reasonably be considered inappropriate in a 32 | professional setting 33 | 34 | ## Our Responsibilities 35 | 36 | Project maintainers are responsible for clarifying the standards of acceptable 37 | behavior and are expected to take appropriate and fair corrective action in 38 | response to any instances of unacceptable behavior. 39 | 40 | Project maintainers have the right and responsibility to remove, edit, or 41 | reject comments, commits, code, wiki edits, issues, and other contributions 42 | that are not aligned to this Code of Conduct, or to ban temporarily or 43 | permanently any contributor for other behaviors that they deem inappropriate, 44 | threatening, offensive, or harmful. 45 | 46 | ## Scope 47 | 48 | This Code of Conduct applies both within project spaces and in public spaces 49 | when an individual is representing the project or its community. Examples of 50 | representing a project or community include using an official project e-mail 51 | address, posting via an official social media account, or acting as an appointed 52 | representative at an online or offline event. Representation of a project may be 53 | further defined and clarified by project maintainers. 54 | 55 | ## Enforcement 56 | 57 | Instances of abusive, harassing, or otherwise unacceptable behavior may be 58 | reported by contacting the project team at pdwaggoner@uchicago.edu. All 59 | complaints will be reviewed and investigated and will result in a response that 60 | is deemed necessary and appropriate to the circumstances. The project team is 61 | obligated to maintain confidentiality with regard to the reporter of an incident. 62 | Further details of specific enforcement policies may be posted separately. 63 | 64 | Project maintainers who do not follow or enforce the Code of Conduct in good 65 | faith may face temporary or permanent repercussions as determined by other 66 | members of the project's leadership. 67 | 68 | ## Attribution 69 | 70 | This Code of Conduct is adapted from the [Contributor Covenant][homepage], version 1.4, 71 | available at https://www.contributor-covenant.org/version/1/4/code-of-conduct.html 72 | 73 | [homepage]: https://www.contributor-covenant.org 74 | 75 | For answers to common questions about this code of conduct, see 76 | https://www.contributor-covenant.org/faq 77 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: plotmm 2 | Type: Package 3 | Title: Tidy Tools for Visualizing Mixture Models 4 | Version: 0.1.2 5 | BugReports: https://github.com/pdwaggoner/plotmm/issues 6 | Authors@R: c(person("Philip", "Waggoner", role = c("aut", "cre"), email = "philip.waggoner@gmail.com"), 7 | person("Fong", "Chan", role = c("aut", "ctb")), 8 | person("Lu", "Zhang", role = c("aut", "ctb"))) 9 | Maintainer: Philip Waggoner 10 | Description: The main function, plot_mm(), is used for (gg)plotting output from mixture models, 11 | including both densities and overlaying mixture weight component curves from the fit models in line with the 12 | tidy principles. The package includes several additional functions for added plot customization. 13 | Supported model objects include: 'mixtools', 'EMCluster', and 'flexmix', with more from each in active dev. 14 | Supported mixture model specifications include mixtures of univariate Gaussians, multivariate Gaussians, Gammas, 15 | logistic regressions, linear regressions, and Poisson regressions. 16 | Imports: methods, wesanderson, amerika, ggplot2, mixtools, EMCluster, flexmix 17 | Suggests: 18 | testthat, 19 | graphics, 20 | dplyr, 21 | patchwork, 22 | survival, 23 | magrittr, 24 | knitr, 25 | rmarkdown 26 | License: MIT + file LICENSE 27 | Encoding: UTF-8 28 | LazyData: true 29 | RoxygenNote: 7.3.1 30 | VignetteBuilder: knitr 31 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | MIT License 2 | 3 | Copyright (c) 2020 Philip Waggoner and Fong Chan 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | importFrom("stats", "dnorm", "plogis", "dgamma", "rpois", "density") 2 | exportPattern("^[[:alpha:]]+") 3 | export(plot_cut_point) 4 | export(plot_gmm) 5 | export(plot_mix_comps) 6 | export(plot_mix_comps_normal) 7 | export(plot_mm) 8 | -------------------------------------------------------------------------------- /NEWS: -------------------------------------------------------------------------------- 1 | --- 2 | title: "NEWS.md" 3 | author: "Waggoner, Chan, Zhang" 4 | --- 5 | 6 | # `plotmm` v0.1.2 7 | 8 | Many changes and corrections across code base to address batch of version and import errors. All tests once again passing and package is stable. 9 | 10 | 11 | # `plotmm` v0.1.1 12 | 13 | Transitioned most content from `README` to a vignette. 14 | 15 | Small bug fixes, mostly with examples in `R` files/source. Adding `\donttest{}` for now. Will address in detail later; dev currently in process but required to fix by 11/20/2022 16 | 17 | Adjusted `@details` for `plot_mix_comps()` to clarify the case where this makes sense (namely: 2-component Gaussian mixture models with equal priors and equal variances). Thanks to @pdmoerland in [issue #13](https://github.com/pdwaggoner/plotmm/issues/13) for pointing this out. Dev for more complex cases is in process. 18 | 19 | # `plotmm` v0.1.0 20 | 21 | ## Tidy Tools for Visualizing Mixture Models 22 | 23 | The `plotmm` package is a substantially updated version of the `plotGMM` package (Waggoner and Chan). Whereas `plotGMM` only includes support for visualizing univariate Gaussian mixture models fit via the `mixtools` package, the new `plotmm` package supports numerous mixture model specifications from several packages (model objects). 24 | 25 | Waggoner and Chan would like to sincerely thank Zhang for joining the team and making development of this package possible. 26 | 27 | ## Key Features 28 | 29 | Supported model objects/packages include: 30 | 31 | 1. `mixtools` 32 | 2. `EMCluster` 33 | 3. `flexmix` 34 | 35 | Supported specifications include mixtures of: 36 | 37 | 1. Univariate Gaussians 38 | 2. Multivariate Gaussians 39 | 3. Gammas 40 | 4. Logistic regressions 41 | 5. Linear regressions 42 | 6. Poisson regressions 43 | 44 | Note that though `plotmm` includes many updates and expanded functionality beyond `plotGMM`, it is under active development with support for more model objects and specifications forthcoming. Stay tuned for updates, and always feel free to open an issue ticket to share anything you'd like to see included in future versions of the package. 45 | 46 | ## How do I get `plotmm `? 47 | 48 | The package is available on CRAN. If you have any questions or find any bugs requiring fixing, please feel free to contact us either directly (see the `DESCRIPTION` for more) or by opening an issue ticket on GitHub. Thanks and enjoy! 49 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "NEWS.md" 3 | author: "Waggoner, Chan, Zhang" 4 | --- 5 | 6 | # `plotmm` v0.1.1 7 | 8 | Transitioned most content from `README` to a vignette. 9 | 10 | Small bug fixes, mostly with examples in `R` files/source. Adding `\donttest{}` for now. Will address in detail later; dev currently in process but required to fix by 11/20/2022 11 | 12 | Adjusted `@details` for `plot_mix_comps()` to clarify the case where this makes sense (namely: 2-component Gaussian mixture models with equal priors and equal variances). Thanks to @pdmoerland in [issue #13](https://github.com/pdwaggoner/plotmm/issues/13) for pointing this out. Dev for more complex cases is in process. 13 | 14 | # `plotmm` v0.1.0 15 | 16 | ## Tidy Tools for Visualizing Mixture Models 17 | 18 | The `plotmm` package is a substantially updated version of the `plotGMM` package (Waggoner and Chan). Whereas `plotGMM` only includes support for visualizing univariate Gaussian mixture models fit via the `mixtools` package, the new `plotmm` package supports numerous mixture model specifications from several packages (model objects). 19 | 20 | Waggoner and Chan would like to sincerely thank Zhang for joining the team and making development of this package possible. 21 | 22 | ## Key Features 23 | 24 | Supported model objects/packages include: 25 | 26 | 1. `mixtools` 27 | 2. `EMCluster` 28 | 3. `flexmix` 29 | 30 | Supported specifications include mixtures of: 31 | 32 | 1. Univariate Gaussians 33 | 2. Multivariate Gaussians 34 | 3. Gammas 35 | 4. Logistic regressions 36 | 5. Linear regressions 37 | 6. Poisson regressions 38 | 39 | Note that though `plotmm` includes many updates and expanded functionality beyond `plotGMM`, it is under active development with support for more model objects and specifications forthcoming. Stay tuned for updates, and always feel free to open an issue ticket to share anything you'd like to see included in future versions of the package. 40 | 41 | ## How do I get `plotmm `? 42 | 43 | The package is available on CRAN. If you have any questions or find any bugs requiring fixing, please feel free to contact us either directly (see the `DESCRIPTION` for more) or by opening an issue ticket on GitHub. Thanks and enjoy! 44 | -------------------------------------------------------------------------------- /R/plot_cut_point.R: -------------------------------------------------------------------------------- 1 | #' Tidy Visualization of a Cut Point from a Mixture Model 2 | #' 3 | #' Returns a plot of the data density (histogram) with an overlaid cut point generated by the fit mixture model 4 | #' @usage plot_cut_point(m, plot = TRUE, color = c("grayscale", "amerika", "wesanderson")) 5 | #' @param m An object of class \code{mixEM} corresponding with the fit mixture model 6 | #' @param plot Logical for generating the plot. If FALSE, only the cut point value from the GMM is returned. If TRUE, histogram with the overlaid cut point is returned. Default is set to TRUE. 7 | #' @param color A vector of color options including "amerika" (from \code{amerika} package), "wesanderson" (from \code{wesanderson} package), and "grayscale", which is the default option. 8 | #' 9 | #' @details Mixture models can be used to derive cut points separating clusters via soft assignment (See Benaglia et al. 2009 for more). \code{plot_cut_point()} plots data density with an overlaid cut point (the mean of the calculated \code{mu}) from \code{mixEM} objects via \code{mixtools}. Note, this function is in its infancy, and at present only works in the limited context of 2-component Gaussian mixture models with equal priors and equal variances. Development for more complex cases is in process. 10 | #' 11 | #' @examples 12 | #' \dontrun{ 13 | #' if(require(mixtools)){ 14 | #' mixmdl <- mixtools::normalmixEM(faithful$waiting, k = 2) 15 | #' } 16 | #' plot_cut_point(mixmdl, plot = TRUE, color = "amerika") # returns plot, amerika 17 | #' plot_cut_point(mixmdl, plot = TRUE, color = "wesanderson") # returns plot, wesanderson 18 | #' plot_cut_point(mixmdl, plot = FALSE) # returns only the cut point value from the GMM 19 | #'} 20 | #' 21 | #' @references Benaglia, T., Chauveau, D., Hunter, D. and Young, D. 2009. mixtools: An R package for analyzing finite mixture models. Journal of Statistical Software, 32(6), pp.1-29. 22 | #' @references Ram, K., and Wickham, H. 2015. wesanderson: a Wes Anderson palette generator. R package version 0.3. 23 | #' 24 | #' @export 25 | plot_cut_point <- function(m, plot = TRUE, color = c("grayscale", "amerika", "wesanderson")) { 26 | if (!requireNamespace("dplyr", quietly = TRUE)) { 27 | stop("Package \"dplyr\" needed for this function to work. Please install it.", 28 | call. = FALSE) 29 | } 30 | if (!requireNamespace("ggplot2", quietly = TRUE)) { 31 | stop("Package \"ggplot2\" needed for this function to work. Please install it.", 32 | call. = FALSE) 33 | } 34 | if (!requireNamespace("mixtools", quietly = TRUE)) { 35 | stop("Package \"mixtools\" needed for this function to work. Please install it.", 36 | call. = FALSE) 37 | } 38 | if (!requireNamespace("methods", quietly = TRUE)) { 39 | stop("Package \"methods\" needed for this function to work. Please install it.", 40 | call. = FALSE) 41 | } 42 | if (!requireNamespace("amerika", quietly = TRUE)) { 43 | stop("Package \"amerika\" needed for this function to work. Please install it.", 44 | call. = FALSE) 45 | } 46 | if (!requireNamespace("wesanderson", quietly = TRUE)) { 47 | stop("Package \"wesanderson\" needed for this function to work. Please install it.", 48 | call. = FALSE) 49 | } 50 | m <- try(methods::as(m, "mixEM", strict=TRUE)) 51 | if (!inherits(m, "mixEM")){ 52 | stop("must be a mixEM object", call. = FALSE) 53 | } 54 | x <- m$x 55 | p <- ggplot2::ggplot(data.frame(m$x)) 56 | if(plot == TRUE){ 57 | if (color == "amerika"){ 58 | a <- p + ggplot2::geom_histogram(ggplot2::aes(m$x), 59 | colour = "darkgray", 60 | fill = amerika::amerika_palette(n = 30, name = "Dem_Ind_Rep7", type = "continuous")) + 61 | ggplot2::geom_vline(xintercept = mean(m$mu)) + 62 | ggplot2::theme_bw() + 63 | ggplot2::labs(x = "x", 64 | y = "Density", 65 | title = "Cutpoint from Gaussian Mixture Model") 66 | a 67 | } else if (color == "wesanderson"){ 68 | w <- p + ggplot2::geom_histogram(ggplot2::aes(m$x), 69 | colour = "darkgray", 70 | fill = wesanderson::wes_palette(n = 30, name = "Rushmore1", type = "continuous")) + 71 | ggplot2::geom_vline(xintercept = mean(m$mu)) + 72 | ggplot2::theme_bw() + 73 | ggplot2::labs(x = "x", 74 | y = "Density", 75 | title = "Cutpoint from Gaussian Mixture Model") 76 | w 77 | } else { 78 | g <- p + ggplot2::geom_histogram(ggplot2::aes(m$x), 79 | colour = "darkgray", 80 | fill = "lightgray") + 81 | ggplot2::geom_vline(xintercept = mean(m$mu)) + 82 | ggplot2::theme_bw() + 83 | ggplot2::labs(x = "x", 84 | y = "Density", 85 | title = "Cutpoint from Gaussian Mixture Model") 86 | g 87 | } 88 | } else mean(m$mu) 89 | } 90 | -------------------------------------------------------------------------------- /R/plot_gmm.R: -------------------------------------------------------------------------------- 1 | #' Plots Mixture Components from Gaussian Mixture Models 2 | #' 3 | #' Generates a plot of data densities with overlaid mixture components from a Gaussian mixture model (GMM) 4 | #' @usage plot_gmm(m, k = NULL) 5 | #' @param m An object of class \code{mixEM} corresponding with the fit GMM 6 | #' @param k The number of components specified in the GMM, \code{m} 7 | #' @details Original function from the \code{plotGMM} package. Retained here for bridging between the packages. We recommend using instead the updated \code{plot_mm} function. 8 | #' 9 | #' Note: \code{plot_gmm} requires a \code{mixtools} object to be supplied. Users must enter the same component value, \code{k}, in the \code{plot_gmm} function, as that which was specified in the original GMM specification (also \code{k} in \code{mixtools}). 10 | #' 11 | #' @examples 12 | #' \dontrun{ 13 | #' if(require(mixtools)){ 14 | #' mixmdl <- mixtools::normalmixEM(faithful$waiting, k = 2) 15 | #' } 16 | #' plot_gmm(mixmdl, 2) 17 | #' } 18 | #' 19 | #' @references Benaglia, T., Chauveau, D., Hunter, D. and Young, D., 2009. mixtools: An R package for analyzing finite mixture models. Journal of Statistical Software, 32(6), pp.1-29. 20 | #' @references Wickham, H., 2016. ggplot2: elegant graphics for data analysis. Springer. 21 | #' 22 | #' @export 23 | plot_gmm <- function(m, k=NULL) { 24 | if (!requireNamespace("ggplot2", quietly = TRUE)) { 25 | stop("Package \"ggplot2\" needed for this function to work. Please install it.", 26 | call. = FALSE) 27 | } 28 | if (!requireNamespace("methods", quietly = TRUE)) { 29 | stop("Package \"methods\" needed for this function to work. Please install it.", 30 | call. = FALSE) 31 | } 32 | m <- try(methods::as(m, "mixEM", strict=TRUE)) 33 | if (!inherits(m, "mixEM")){ 34 | stop("must be a mixEM object", call. = FALSE) 35 | } 36 | x <- m$x 37 | x <- data.frame(x) 38 | if (k <= 1){ 39 | stop("Specified components must be at least length 2.") 40 | } else if (k >= 16){ 41 | stop("Are you really searching for 16 or more clusters? If so, open an issue ticket: `https://github.com/pdwaggoner/plotGMM/issues`\nand we will consider updating the package. If not, go back and make sure the GMM is properly specified.") 42 | } 43 | 44 | component_colors <- 45 | c( 46 | "red", 47 | "blue", 48 | "green", 49 | "yellow", 50 | "orange", 51 | "purple", 52 | "darksalmon", 53 | "goldenrod2", 54 | "dodgerblue", 55 | "darkorange3", 56 | "burlywood4", 57 | "darkmagenta", 58 | "firebrick", 59 | "deeppink2", 60 | "darkseagreen1" 61 | ) 62 | 63 | out_plot <- 64 | ggplot2::ggplot(data.frame(x)) + 65 | ggplot2::geom_density( 66 | ggplot2::aes(x), 67 | colour = "darkgray", fill = "lightgray" 68 | ) 69 | 70 | for (i in seq(1, k)) { 71 | out_plot <- 72 | out_plot + 73 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps_normal, 74 | args = list(m$mu[i], m$sigma[i], lam = m$lambda[i]), 75 | colour = component_colors[i], lwd = 1) + 76 | ggplot2::ylab("Density") + 77 | ggplot2::theme_bw() 78 | } 79 | 80 | out_plot 81 | 82 | } 83 | -------------------------------------------------------------------------------- /R/plot_mix_comps.R: -------------------------------------------------------------------------------- 1 | #' Helper Function for Overlaying Mixture Components 2 | #' 3 | #' Allows for plotting mixture components conditioned on a superimposed function meant for passage to ggplot's \code{stat_function()} 4 | #' @usage plot_mix_comps(x, mu = NULL, sigma = NULL, lam = 1, beta0 = NULL, 5 | #' beta1=NULL, alpha=NULL, beta=NULL, 6 | #' normal=FALSE, logisreg=FALSE, 7 | #' gamma=FALSE, poisson=FALSE) 8 | #' @param x Input data 9 | #' @param mu Component mean 10 | #' @param sigma Component variance 11 | #' @param lam Component mixture weight 12 | #' @param alpha Initial shape parameters 13 | #' @param beta Initial parameter values 14 | #' @param beta0 Coefficient values 15 | #' @param beta1 Coefficient values 16 | #' @param normal Logical for normal distribution 17 | #' @param logisreg Logical for logistic regression mixtures 18 | #' @param gamma Logical for gamma distribution 19 | #' @param poisson Logical for poisson regression mixtures 20 | #' 21 | #' @details Allows for component curves to be superimposed over a mixture model plot 22 | #' 23 | #' @examples 24 | #' \dontrun{ 25 | #' if(require(mixtools)){ 26 | #' mixmdl <- mixtools::normalmixEM(faithful$waiting, k = 2) 27 | #' } 28 | #' x <- mixmdl$x 29 | #' x <- data.frame(x) 30 | #' ggplot2::ggplot(data.frame(x)) + 31 | #' ggplot2::geom_density(ggplot2::aes(x), color="black", fill="black") + 32 | #' ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 33 | #' args = list(mixmdl$mu[1], mixmdl$sigma[1], lam = mixmdl$lambda[1]), 34 | #' colour = "red") + 35 | #' ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 36 | #' args = list(mixmdl$mu[2], mixmdl$sigma[2], lam = mixmdl$lambda[2]), 37 | #' colour = "blue") 38 | #'} 39 | #' 40 | #' @export 41 | plot_mix_comps <- function(x, mu = NULL, sigma = NULL, lam = 1, beta0 = NULL, 42 | beta1 = NULL, alpha = NULL, beta = NULL, 43 | normal = FALSE, logisreg = FALSE, 44 | gamma = FALSE, poisson = FALSE) { 45 | if (normal==TRUE) { 46 | if(is.null(mu) | is.null(sigma) | is.null(lam)){ 47 | stop("Check input for normal distribution.") 48 | } 49 | lam*stats::dnorm(x, mean = mu, sd = sigma) 50 | } else if (logisreg==TRUE) { 51 | if(is.null(beta0) | is.null(beta1)){ 52 | stop("Check input for binary logistic regrssions.") 53 | } 54 | stats::plogis(beta0 + beta1*x) 55 | } else if (gamma==TRUE) { 56 | if(is.null(alpha) | is.null(beta) | is.null(lam)){ 57 | stop("Check input for gamma distribution.") 58 | } 59 | lam*stats::dgamma(x, shape = alpha, scale = beta) 60 | } else if (poisson==TRUE) { 61 | if(is.null(beta0) | is.null(beta1)){ 62 | stop("Check input for poisson distribution.") 63 | } 64 | lam*stats::rpois(x, exp(beta0 + beta1)) 65 | } 66 | } 67 | -------------------------------------------------------------------------------- /R/plot_mix_comps_normal.R: -------------------------------------------------------------------------------- 1 | #' Custom Function for Overlaying Mixture Components for Normal Distributions 2 | #' 3 | #' Plots a mixture component conditioned on a superimposed function 4 | #' @usage plot_mix_comps_normal(x, mu, sigma, lam) 5 | #' @param x Input data 6 | #' @param mu Mean of component 7 | #' @param sigma Variance of component 8 | #' @param lam Mixture weight of component 9 | #' 10 | #' @details Allows for specifying a custom function to be superimposed when plotting a mixture component assuming a normal distribution. This is the original function for the package, which is also included in the updated \code{plot_mix_comps()} function. 11 | #' 12 | #' @examples 13 | #' \dontrun{ 14 | #' if(require(mixtools)){ 15 | #' mixmdl <- mixtools::normalmixEM(faithful$waiting, k = 2) 16 | #' } 17 | #' x <- mixmdl$x 18 | #' x <- data.frame(x) 19 | #' ggplot2::ggplot(data.frame(x)) + 20 | #' ggplot2::geom_density(ggplot2::aes(x), color="black", fill="black") + 21 | #' ggplot2::stat_function(geom = "line", fun = plot_mix_comps_normal, 22 | #' args = list(mixmdl$mu[1], mixmdl$sigma[1], lam = mixmdl$lambda[1]), 23 | #' colour = "red") + 24 | #' ggplot2::stat_function(geom = "line", fun = plot_mix_comps_normal, 25 | #' args = list(mixmdl$mu[2], mixmdl$sigma[2], lam = mixmdl$lambda[2]), 26 | #' colour = "blue") 27 | #'} 28 | #' 29 | #' @export 30 | plot_mix_comps_normal <- function(x, mu, sigma, lam) { 31 | lam * stats::dnorm(x, mu, sigma) 32 | } 33 | -------------------------------------------------------------------------------- /R/plot_mm.R: -------------------------------------------------------------------------------- 1 | #' Tidy Visualization of Mixture Models 2 | #' 3 | #' Generates a ggplot of data densities with overlaid mixture components from fit mixture models. 4 | #' @usage plot_mm(m, k = NULL, data = NULL) 5 | #' @param m A mixture model object 6 | #' @param k Optional. The number of components specified in the mixture model, \code{m} 7 | #' @param data Name of data object required only for \code{EMCluster} objects 8 | #' 9 | #' @details This is the core function in the package, returning a \code{ggplot} object for a fit mixture model. The plot includes the data density with overlaid mixture components. 10 | #' 11 | #' @examples 12 | #' \dontrun{ 13 | #' if(require(mixtools)){ 14 | #' mixmdl1 <- mixtools::normalmixEM(faithful$waiting, k = 2) 15 | #' } 16 | #' plot_mm(mixmdl1, 2) 17 | #' 18 | #' if(require(mixtools)){ 19 | #' x <- c(rgamma(200, shape = 50, scale = 11), rgamma(200, shape = 28, scale = 6)) 20 | #' mixmdl2 <- mixtools::gammamixEM(x, lambda = c(1, 1)/2) 21 | #' } 22 | #' plot_mm(mixmdl2) 23 | #'} 24 | #' 25 | #' @references Wickham, H., 2016. ggplot2: elegant graphics for data analysis. Springer. 26 | #' 27 | #' @export 28 | plot_mm <- function(m, k = NULL, data = NULL) { 29 | # tools 30 | if (!requireNamespace("ggplot2", quietly = TRUE)) { 31 | stop("Package \"ggplot2\" needed for this function to work. Please install it.", 32 | call. = FALSE) 33 | } 34 | if (!requireNamespace("methods", quietly = TRUE)) { 35 | stop("Package \"methods\" needed for this function to work. Please install it.", 36 | call. = FALSE) 37 | } 38 | if (!requireNamespace("dplyr", quietly = TRUE)) { 39 | stop("Package \"dplyr\" needed for this function to work. Please install it.", 40 | call. = FALSE) 41 | } 42 | if (!requireNamespace("patchwork", quietly = TRUE)) { 43 | stop("Package \"patchwork\" needed for this function to work. Please install it.", 44 | call. = FALSE) 45 | } 46 | # models 47 | if (!requireNamespace("mixtools", quietly = TRUE)) { 48 | stop("Package \"mixtools\" needed for this function to work. Please install it.", 49 | call. = FALSE) 50 | } 51 | if (!requireNamespace("EMCluster", quietly = TRUE)) { 52 | stop("Package \"EMCluster\" needed for this function to work. Please install it.", 53 | call. = FALSE) 54 | } 55 | if (!requireNamespace("flexmix", quietly = TRUE)) { 56 | stop("Package \"flexmix\" needed for this function to work. Please install it.", 57 | call. = FALSE) 58 | } 59 | X1 <- X2 <- NULL # for local calling in multivariate cases 60 | component_colors <- c( 61 | "red", 62 | "blue", 63 | "green", 64 | "yellow", 65 | "orange", 66 | "purple", 67 | "darksalmon", 68 | "goldenrod2", 69 | "dodgerblue", 70 | "darkorange3", 71 | "burlywood4", 72 | "darkmagenta", 73 | "firebrick", 74 | "deeppink2", 75 | "darkseagreen1" 76 | ) 77 | ## mixtools objects 78 | if (inherits(m, "mixEM")){ 79 | # if gammamixEM 80 | if (m$ft=="gammamixEM") { 81 | if(is.null(k)){ 82 | k <- length(m$gamma.pars)/2 83 | } 84 | # range of k, bigger than 16 if needed 85 | if (k <= 1){ 86 | stop("Specified components must be at least length 2.") 87 | } else if (k >= 16){ 88 | stop("Are you really searching for 16 or more components? If so, open an issue ticket --> `https://github.com/pdwaggoner/plotmm/issues` \n 89 | We will consider updating the package. \n 90 | If not, make sure the mixture model is properly specified.") 91 | } 92 | x <- data.frame(m$x) 93 | colnames(x) <- "density" 94 | out_plot <- ggplot2::ggplot(x) + 95 | ggplot2::geom_density(ggplot2::aes(x=density), 96 | colour = "darkgray", 97 | fill = "lightgray") + 98 | ggplot2::theme_minimal() 99 | for (i in 1:k) { 100 | out_plot <- out_plot + 101 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 102 | args = list(alpha = m$gamma.pars[1,i], 103 | beta = m$gamma.pars[2,i], 104 | lam = m$lambda[i], 105 | gamma = TRUE), 106 | colour = component_colors[i], lwd = 1) + 107 | ggplot2::ylab("Density") + 108 | ggplot2::theme_minimal() 109 | } 110 | } 111 | 112 | # if binary logisremixEM 113 | if (m$ft=="logisregmixEM") { 114 | if(is.null(k)){ 115 | k <- ncol(m$beta) 116 | } 117 | if (k <= 1){ 118 | stop("Specified components must be at least length 2.") 119 | } else if (k >= 16){ 120 | stop("Are you really searching for 16 or more components? If so, open an issue ticket --> `https://github.com/pdwaggoner/plotmm/issues` \n 121 | We will consider updating the package. \n 122 | If not, make sure the mixture model is properly specified.") 123 | } 124 | df <- data.frame(matrix(NA, 125 | nrow = length(m$y), 126 | ncol=2)) 127 | X_1 <- as.numeric(data.frame(m$x)[,2]) 128 | X_2 <- as.numeric(m$y) 129 | if (length(X_2)!=sum(X_2==0 | X_2==1)){ 130 | stop("The response feature should be binary. Check input or report the issue") 131 | } 132 | out_plot <- ggplot2::qplot(x = X_1, 133 | y = X_2) + 134 | ggplot2::geom_point(colour = "lightgray") + 135 | ggplot2::theme_minimal() 136 | for (i in 1:k) { 137 | out_plot <- out_plot + 138 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 139 | args = list(beta0 = m$beta[1,i], 140 | beta1 = m$beta[2,i], 141 | logisreg = TRUE), 142 | colour = component_colors[i], lwd = 1) + 143 | ggplot2::xlab("X") + 144 | ggplot2::ylab("Y") + 145 | ggplot2::theme_minimal() 146 | } 147 | } 148 | 149 | # if poisregmixEM 150 | if (m$ft=="poisregmixEM") { 151 | if(is.null(k)){ 152 | k <- ncol(m$beta) 153 | } 154 | if (k <= 1){ 155 | stop("Specified components must be at least length 2.") 156 | } else if (k >= 16){ 157 | stop("Are you really searching for 16 or more components? If so, open an issue ticket --> `https://github.com/pdwaggoner/plotmm/issues` \n 158 | We will consider updating the package. \n 159 | If not, make sure the mixture model is properly specified.") 160 | } 161 | df <- data.frame(matrix(NA, 162 | nrow = length(m$y), 163 | ncol = 2)) 164 | X_1 <- as.numeric(data.frame(m$x)[,2]) 165 | X_2 <- as.numeric(m$y) 166 | out_plot <- ggplot2::qplot(x = X_1, 167 | y = X_2) + 168 | ggplot2::geom_point(colour = "lightgray") + 169 | ggplot2::theme_minimal() 170 | for (i in 1:k) { 171 | out_plot <- out_plot + 172 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 173 | args = list(beta0 = m$beta[1,i], 174 | beta1 = m$beta[2,i], 175 | lam = m$lambda[i], 176 | poisson = TRUE), 177 | colour = component_colors[i], lwd = 1) + 178 | ggplot2::xlab("X") + 179 | ggplot2::ylab("Y") + 180 | ggplot2::theme_minimal() 181 | } 182 | } 183 | 184 | # if regmixEM 185 | if (m$ft=="regmixEM"){ 186 | if(is.null(k)){ 187 | k <- ncol(m$posterior) 188 | } 189 | if (k <= 1){ 190 | stop("Specified components must be at least length 2.") 191 | } else if (k >= 16){ 192 | stop("Are you really searching for 16 or more components? If so, open an issue ticket --> `https://github.com/pdwaggoner/plotmm/issues` \n 193 | We will consider updating the package. \n 194 | If not, make sure the mixture model is properly specified.") 195 | } 196 | df <- data.frame(matrix(NA, 197 | nrow = length(m$y), 198 | ncol = 2)) 199 | X_1 <- as.numeric(data.frame(m$x)[,2]) 200 | X_2 <- as.numeric(m$y) 201 | coeff <- data.frame(m$beta) 202 | out_plot <- ggplot2::qplot(x = X_1, 203 | y = X_2) + 204 | ggplot2::geom_point(colour = "lightgray") + 205 | ggplot2::xlab("X") + 206 | ggplot2::ylab("Y") + 207 | ggplot2::theme_minimal() 208 | for (i in 1:k) { 209 | out_plot <- out_plot + 210 | ggplot2::geom_abline(intercept = coeff[1,i], 211 | slope = coeff[2,i], 212 | colour = component_colors[i], 213 | lwd = 1) + 214 | ggplot2::theme_minimal() 215 | } 216 | } 217 | 218 | # if univariate normalmixEM or repnormixEM 219 | if (m$ft=="normalmixEM" | m$ft=="repnormmixEM") { 220 | if(is.null(k)){ 221 | k <- m$posterior 222 | } 223 | # range of k, bigger than 16 if needed 224 | if (k <= 1){ 225 | stop("Specified components must be at least length 2.") 226 | } else if (k >= 16){ 227 | stop("Are you really searching for 16 or more components? If so, open an issue ticket --> `https://github.com/pdwaggoner/plotmm/issues` \n 228 | We will consider updating the package. \n 229 | If not, make sure the mixture model is properly specified.") 230 | } 231 | x <- data.frame(m$x) 232 | colnames(x) <- "density" 233 | out_plot <- ggplot2::ggplot(x) + 234 | ggplot2::geom_density(ggplot2::aes(x=density), colour = "darkgray", fill = "lightgray") + 235 | ggplot2::theme_minimal() 236 | for (i in 1:k) { 237 | out_plot <- out_plot + 238 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 239 | args = list(mu = m$mu[i], 240 | sigma = m$sigma[i], 241 | lam = m$lambda[i], 242 | normal = TRUE), 243 | colour = component_colors[i], lwd = 1) + 244 | ggplot2::ylab("Density") + 245 | ggplot2::theme_minimal() 246 | } 247 | } 248 | 249 | # if bivariate normalmixEM 250 | if (m$ft=="mvnormalmixEM" && ncol(m$x)==2){ 251 | if(is.null(k)){ 252 | k <- ncol(m$posterior) 253 | } 254 | # range of k, bigger than 16 if needed 255 | if (k <= 1){ 256 | stop("Specified components must be at least length 2.") 257 | } else if (k >= 16){ 258 | stop("Are you really searching for 16 or more components? If so, open an issue ticket --> `https://github.com/pdwaggoner/plotmm/issues` \n 259 | We will consider updating the package. \n 260 | If not, make sure the mixture model is properly specified.") 261 | } 262 | x <- data.frame(m$x) 263 | mean <- m$mu 264 | sigma <- m$sigma 265 | X_1 <- x[ ,1] 266 | X_2 <- x[ ,2] 267 | post <- m$posterior 268 | out_plot <- ggplot2::qplot(x = X_1, 269 | y = X_2) + 270 | ggplot2::geom_point(colour = "darkgray", fill = "lightgray", size=0.7) + 271 | ggplot2::theme_minimal() 272 | for (i in 1:k){ 273 | p <- data.frame(t(data.frame(mean[[i]]))) 274 | e <- data.frame(mixtools::ellipse(mean[[i]], sigma, newplot = TRUE, npoints = 500)) 275 | out_plot <- out_plot + 276 | ggplot2::geom_point(data = p, ggplot2::aes(x = X1, y = X2), colour = "black", size = 0.7) + 277 | ggplot2::geom_point(data = e, ggplot2::aes(x = X1, y = X2), colour = component_colors[i], size = 0.3) + 278 | ggplot2::theme_minimal() 279 | } 280 | x1 <- data.frame(m$x[,1]) 281 | colnames(x1) <- "density" 282 | x2 <- data.frame(m$x[,2]) 283 | colnames(x2) <- "density" 284 | hist1 <- ggplot2::ggplot(x1) + 285 | ggplot2::geom_density(ggplot2::aes(x=density), colour = "darkgray", fill = "lightgray") + 286 | ggplot2::ggtitle("X1") + 287 | ggplot2::theme_minimal() 288 | hist2 <- ggplot2::ggplot(x2) + 289 | ggplot2::geom_density(ggplot2::aes(x=density), colour = "darkgray", fill = "lightgray") + 290 | ggplot2::ggtitle("X2") + 291 | ggplot2::theme_minimal() 292 | out_plot <- out_plot / (hist1 + hist2) 293 | } 294 | 295 | # if multivariate normalmixEM 296 | if (m$ft=="mvnormalmixEM"&&ncol(m$x)>2){ 297 | stop("Not compatible with multivariate `normalmixEM` objects where ncol(m$x) >= 3 yet. \n 298 | Development is in process.") 299 | } 300 | } 301 | 302 | ## EMCluster objects 303 | else if (inherits(m, "emret")){ 304 | if (is.null(data)){ 305 | stop("The original data is required as an input for `emret` objects.") 306 | } 307 | if (is.null(k)){ 308 | k <- m$nclass 309 | } 310 | 311 | # if bivariate case 312 | if (ncol(data)==2){ 313 | x <- data.frame(data) 314 | X_1 <- x[ ,1] 315 | X_2 <- x[ ,2] 316 | out_plot <- ggplot2::qplot(x=X_1, y=X_2) + 317 | ggplot2::geom_point(colour = "lightgray") + 318 | ggplot2::xlab("X1") + 319 | ggplot2::ylab("X2") + 320 | ggplot2::theme_minimal() 321 | } 322 | for (i in 1:k){ 323 | p <- data.frame(m$Mu)[i,] 324 | sigma <- data.frame(EMCluster::LTSigma2variance(m$LTSigma)[,,i]) 325 | e <- data.frame(mixtools::ellipse(m$Mu[i,], sigma, newplot = TRUE, npoints = 500)) 326 | out_plot <- out_plot + 327 | ggplot2::geom_point(data = p, ggplot2::aes(x = X1, y = X2), colour = "black", size = 0.7) + 328 | ggplot2::geom_point(data = e, ggplot2::aes(x = X1, y = X2), colour = component_colors[i], size = 0.3) + 329 | ggplot2::theme_minimal() 330 | } 331 | x1 <- data.frame(x[,1]) 332 | colnames(x1) <- "density" 333 | x2 <- data.frame(x[,2]) 334 | colnames(x2) <- "density" 335 | hist1 <- ggplot2::ggplot(x1) + 336 | ggplot2::geom_density(ggplot2::aes(x=density), colour = "darkgray", fill = "lightgray") + 337 | ggplot2::ggtitle("X1") + 338 | ggplot2::theme_minimal() 339 | hist2 <- ggplot2::ggplot(x2) + 340 | ggplot2::ggtitle("X2") + 341 | ggplot2::geom_density(ggplot2::aes(x=density), colour = "darkgray", fill = "lightgray") + 342 | ggplot2::theme_minimal() 343 | out_plot <- out_plot / (hist1 + hist2) 344 | 345 | # if multivariate case 346 | if (ncol(data)!=2){ 347 | stop("Not compatible with multivariate `emret` objects where ncol(data) >= 3 yet. \n 348 | Development is in process.") 349 | } 350 | 351 | } 352 | 353 | ## flexmix objects 354 | else if (inherits(m, "flexmix")){ 355 | if (is.null(k)){ 356 | k <- m@k0 357 | } 358 | if (k <= 1){ 359 | stop("Specified components must be at least length 2.") 360 | } else if (k >= 16){ 361 | stop("Are you really searching for 16 or more components? If so, open an issue ticket --> `https://github.com/pdwaggoner/plotmm/issues` \n 362 | We will consider updating the package. \n 363 | If not, make sure the mixture model is properly specified.") 364 | } 365 | 366 | num_model <- length(m@model) 367 | if (num_model==1){ 368 | family <- m@model[[1]]@family 369 | if (family=="poisson" & ncol(m@model[[1]]@x)==2) { 370 | 371 | } else if (family=="gaussian" & ncol(m@model[[1]]@x)==1) { 372 | x <- data.frame(m@model[[1]]@y) 373 | colnames(x) <- "density" 374 | out_plot <- ggplot2::ggplot(x) + 375 | ggplot2::geom_density(ggplot2::aes(x=density), colour = "darkgray", fill = "lightgray") + 376 | ggplot2::theme_minimal() 377 | sigma <- flexmix::parameters(m)[2,] 378 | lam <- table(flexmix::clusters(m)) 379 | mu <- flexmix::parameters(m)[1,] 380 | for (i in 1:k) { 381 | out_plot <- out_plot + 382 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 383 | args = list(mu = mu[i], 384 | sigma = sigma[i], 385 | lam = lam[i]/sum(lam), 386 | normal = TRUE), 387 | colour = component_colors[i], 388 | lwd = 1) + 389 | ggplot2::ylab("Density") + 390 | ggplot2::xlab("x") + 391 | ggplot2::theme_minimal() 392 | } 393 | } else if (family=="Gamma" & ncol(m@model[[1]]@x)==1) { 394 | x <- data.frame(m@model[[1]]@y) 395 | colnames(x) <- "density" 396 | out_plot <- ggplot2::ggplot(x) + 397 | ggplot2::geom_density(ggplot2::aes(x=density), colour = "darkgray", fill = "lightgray") + 398 | ggplot2::theme_minimal() 399 | shape <- flexmix::parameters(m)[2,] 400 | lam <- table(flexmix::clusters(m)) 401 | coef <- flexmix::parameters(m)[1,] 402 | for (i in 1:k) { 403 | out_plot <- out_plot + 404 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 405 | args = list(alpha = shape[i], 406 | beta = 1/(coef[i]*shape[i]), 407 | lam = lam[i]/sum(lam), 408 | gamma = TRUE), 409 | colour = component_colors[i], 410 | lwd = 1) + 411 | ggplot2::ylab("Density") + 412 | ggplot2::theme_minimal() 413 | } 414 | } else if (family=="binomial"& ncol(m@model[[1]]@x)==2) { 415 | 416 | } 417 | } else if (num_model>1){ 418 | for (j in i:num_model){ 419 | family <- m@model[[j]]@family 420 | if (family=="poisson" & ncol(m@model[[1]]@x)==2) { 421 | 422 | } else if (family=="gaussian" & ncol(m@model[[j]]@x)==1) { 423 | x <- data.frame(m@model[[j]]@y) 424 | colnames(x) <- "density" 425 | out_plot <- ggplot2::ggplot(x) + 426 | ggplot2::geom_density(ggplot2::aes(x=density), colour = "darkgray", fill = "lightgray") + 427 | ggplot2::theme_minimal() 428 | sigma <- flexmix::parameters(m)[[j]][2,] 429 | lam <- table(flexmix::clusters(m)) 430 | mu <- flexmix::parameters(m)[[j]][1,] 431 | for (i in 1:k) { 432 | out_plot <- out_plot + 433 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 434 | args = list(mu = mu[i], 435 | sigma = sigma[i], 436 | lam = lam[i]/sum(lam), 437 | normal = TRUE), 438 | colour = component_colors[i], 439 | lwd = 1) + 440 | ggplot2::ylab("Density") + 441 | ggplot2::xlab("x") + 442 | ggplot2::theme_minimal() 443 | } 444 | } else if (family=="Gamma" & ncol(m@model[[j]]@x)==1) { 445 | x <- data.frame(m@model[[j]]@y) 446 | colnames(x) <- "density" 447 | out_plot <- ggplot2::ggplot(x) + 448 | ggplot2::geom_density(ggplot2::aes(x=density), colour = "darkgray", fill = "lightgray") + 449 | ggplot2::theme_minimal() 450 | shape <- flexmix::parameters(m)[[j]][2,] 451 | lam <- table(flexmix::clusters(m)) 452 | scale <- flexmix::parameters(m)[[j]][1,] 453 | for (i in 1:k) { 454 | out_plot <- out_plot + 455 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 456 | args = list(alpha = shape[i], 457 | beta = 1/(coef[i]*shape[i]), 458 | lam = lam[i]/sum(lam), 459 | gamma = TRUE), 460 | colour = component_colors[i], 461 | lwd = 1) + 462 | ggplot2::ylab("Density") + 463 | ggplot2::theme_minimal() 464 | } 465 | } else if (family=="binomial") { 466 | 467 | } 468 | } 469 | } 470 | } 471 | else { 472 | stop("Please check the input type. Currently objects from 'mixtools', 'EMCluster', and 'flexmix' are supported. \n 473 | If you want to generate plots for mixture model objects from other packages, please open an issue ticket: \n 474 | `https://github.com/pdwaggoner/plotmm/issues`.") 475 | } 476 | 477 | out_plot 478 | 479 | } 480 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # `plotmm` Tidy Tools for Visualizing Mixture Models 2 | [![Project Status: Active – The project has reached a stable, usable state and is being actively developed.](https://www.repostatus.org/badges/latest/active.svg)](https://www.repostatus.org/#active) 3 | [![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/plotmm)](http://cran.r-project.org/package=plotmm) 4 | [![Downloads](https://cranlogs.r-pkg.org/badges/grand-total/plotmm)](https://cran.r-project.org/package=plotmm) 5 | [![Documentation](https://img.shields.io/badge/documentation-plotmm-orange.svg?colorB=E91E63)](https://www.r-pkg.org/pkg/plotmm) 6 | 7 | --- 8 | #### [`plotGMM`](https://CRAN.R-project.org/package=plotGMM): 9 | [![lifecycle](https://img.shields.io/badge/lifecycle-retired-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html) 10 | [![CRAN_Status_Badge](https://www.r-pkg.org/badges/version/plotGMM)](http://cran.r-project.org/package=plotGMM) 11 | [![Downloads](https://cranlogs.r-pkg.org/badges/grand-total/plotGMM)](https://cran.r-project.org/package=plotGMM) 12 | [![Documentation](https://img.shields.io/badge/documentation-plotGMM-orange.svg?colorB=E91E63)](https://www.r-pkg.org/pkg/plotGMM) 13 | 14 | --- 15 | 16 | Package authors and contributors: 17 | - [Philip D. Waggoner](https://github.com/pdwaggoner) (pkg maintainer), YouGov & Columbia University 18 | - [Fong Chun Chan](https://github.com/tinyheero), Achilles Therapeutics 19 | - [Lu Zhang](https://github.com/LuZhang0128), University of Chicago (formerly Emory) 20 | 21 | --- 22 | 23 | In collaboration with [Fong Chan](https://github.com/tinyheero) (Achilles Therapeutics) and [Lu Zhang](https://github.com/LuZhang0128) (University of Chicago), we have developed `plotmm` for tidy visualization of mixture models. This package is a substantial update to the [`plotGMM` package](https://CRAN.R-project.org/package=plotGMM). 24 | 25 | ## Install 26 | 27 | Load the latest version: 28 | 29 | ```{r} 30 | devtools::install_github("pdwaggoner/plotmm") 31 | 32 | library(plotmm) 33 | ``` 34 | 35 | Load the latest stable (CRAN) version: 36 | 37 | ```{r} 38 | install.packages("plotmm") 39 | 40 | library(plotmm) 41 | ``` 42 | 43 | ## Getting Started 44 | 45 | To get started, read the ["Getting Started" vignette](https://github.com/pdwaggoner/plotmm/tree/master/vignettes), which includes a few basic examples. 46 | 47 | 48 | ## Contribute 49 | 50 | Anyone is welcome to contribute to the package. Before collaborating, please take a look at and abide by the [contributor code of conduct](https://github.com/pdwaggoner/plotmm/blob/master/CODE_OF_CONDUCT.md). Here's a sampling of how to contribute: 51 | 52 | - Submit an [issue](https://github.com/pdwaggoner/plotmm/issues) reporting a bug, requesting a feature enhancement, etc. 53 | 54 | - Suggest changes directly via [pull request](https://github.com/pdwaggoner/plotmm/pulls) 55 | 56 | - [Reach out directly](https://pdwaggoner.github.io/) with ideas if you're uneasy with public interaction 57 | -------------------------------------------------------------------------------- /docs/src/assets/issue_bug.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pdwaggoner/plotmm/3b1f57ffa917fca5551259faae78cd613e60e9d8/docs/src/assets/issue_bug.png -------------------------------------------------------------------------------- /docs/src/assets/issue_featureidea.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pdwaggoner/plotmm/3b1f57ffa917fca5551259faae78cd613e60e9d8/docs/src/assets/issue_featureidea.png -------------------------------------------------------------------------------- /docs/src/assets/issue_help.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/pdwaggoner/plotmm/3b1f57ffa917fca5551259faae78cd613e60e9d8/docs/src/assets/issue_help.png -------------------------------------------------------------------------------- /man/plot_cut_point.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot_cut_point.R 3 | \name{plot_cut_point} 4 | \alias{plot_cut_point} 5 | \title{Tidy Visualization of a Cut Point from a Mixture Model} 6 | \usage{ 7 | plot_cut_point(m, plot = TRUE, color = c("grayscale", "amerika", "wesanderson")) 8 | } 9 | \arguments{ 10 | \item{m}{An object of class \code{mixEM} corresponding with the fit mixture model} 11 | 12 | \item{plot}{Logical for generating the plot. If FALSE, only the cut point value from the GMM is returned. If TRUE, histogram with the overlaid cut point is returned. Default is set to TRUE.} 13 | 14 | \item{color}{A vector of color options including "amerika" (from \code{amerika} package), "wesanderson" (from \code{wesanderson} package), and "grayscale", which is the default option.} 15 | } 16 | \description{ 17 | Returns a plot of the data density (histogram) with an overlaid cut point generated by the fit mixture model 18 | } 19 | \details{ 20 | Mixture models can be used to derive cut points separating clusters via soft assignment (See Benaglia et al. 2009 for more). \code{plot_cut_point()} plots data density with an overlaid cut point (the mean of the calculated \code{mu}) from \code{mixEM} objects via \code{mixtools}. Note, this function is in its infancy, and at present only works in the limited context of 2-component Gaussian mixture models with equal priors and equal variances. Development for more complex cases is in process. 21 | } 22 | \examples{ 23 | \dontrun{ 24 | if(require(mixtools)){ 25 | mixmdl <- mixtools::normalmixEM(faithful$waiting, k = 2) 26 | } 27 | plot_cut_point(mixmdl, plot = TRUE, color = "amerika") # returns plot, amerika 28 | plot_cut_point(mixmdl, plot = TRUE, color = "wesanderson") # returns plot, wesanderson 29 | plot_cut_point(mixmdl, plot = FALSE) # returns only the cut point value from the GMM 30 | } 31 | 32 | } 33 | \references{ 34 | Benaglia, T., Chauveau, D., Hunter, D. and Young, D. 2009. mixtools: An R package for analyzing finite mixture models. Journal of Statistical Software, 32(6), pp.1-29. 35 | 36 | Ram, K., and Wickham, H. 2015. wesanderson: a Wes Anderson palette generator. R package version 0.3. 37 | } 38 | -------------------------------------------------------------------------------- /man/plot_gmm.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot_gmm.R 3 | \name{plot_gmm} 4 | \alias{plot_gmm} 5 | \title{Plots Mixture Components from Gaussian Mixture Models} 6 | \usage{ 7 | plot_gmm(m, k = NULL) 8 | } 9 | \arguments{ 10 | \item{m}{An object of class \code{mixEM} corresponding with the fit GMM} 11 | 12 | \item{k}{The number of components specified in the GMM, \code{m}} 13 | } 14 | \description{ 15 | Generates a plot of data densities with overlaid mixture components from a Gaussian mixture model (GMM) 16 | } 17 | \details{ 18 | Original function from the \code{plotGMM} package. Retained here for bridging between the packages. We recommend using instead the updated \code{plot_mm} function. 19 | 20 | Note: \code{plot_gmm} requires a \code{mixtools} object to be supplied. Users must enter the same component value, \code{k}, in the \code{plot_gmm} function, as that which was specified in the original GMM specification (also \code{k} in \code{mixtools}). 21 | } 22 | \examples{ 23 | \dontrun{ 24 | if(require(mixtools)){ 25 | mixmdl <- mixtools::normalmixEM(faithful$waiting, k = 2) 26 | } 27 | plot_gmm(mixmdl, 2) 28 | } 29 | 30 | } 31 | \references{ 32 | Benaglia, T., Chauveau, D., Hunter, D. and Young, D., 2009. mixtools: An R package for analyzing finite mixture models. Journal of Statistical Software, 32(6), pp.1-29. 33 | 34 | Wickham, H., 2016. ggplot2: elegant graphics for data analysis. Springer. 35 | } 36 | -------------------------------------------------------------------------------- /man/plot_mix_comps.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot_mix_comps.R 3 | \name{plot_mix_comps} 4 | \alias{plot_mix_comps} 5 | \title{Helper Function for Overlaying Mixture Components} 6 | \usage{ 7 | plot_mix_comps(x, mu = NULL, sigma = NULL, lam = 1, beta0 = NULL, 8 | beta1=NULL, alpha=NULL, beta=NULL, 9 | normal=FALSE, logisreg=FALSE, 10 | gamma=FALSE, poisson=FALSE) 11 | } 12 | \arguments{ 13 | \item{x}{Input data} 14 | 15 | \item{mu}{Component mean} 16 | 17 | \item{sigma}{Component variance} 18 | 19 | \item{lam}{Component mixture weight} 20 | 21 | \item{beta0}{Coefficient values} 22 | 23 | \item{beta1}{Coefficient values} 24 | 25 | \item{alpha}{Initial shape parameters} 26 | 27 | \item{beta}{Initial parameter values} 28 | 29 | \item{normal}{Logical for normal distribution} 30 | 31 | \item{logisreg}{Logical for logistic regression mixtures} 32 | 33 | \item{gamma}{Logical for gamma distribution} 34 | 35 | \item{poisson}{Logical for poisson regression mixtures} 36 | } 37 | \description{ 38 | Allows for plotting mixture components conditioned on a superimposed function meant for passage to ggplot's \code{stat_function()} 39 | } 40 | \details{ 41 | Allows for component curves to be superimposed over a mixture model plot 42 | } 43 | \examples{ 44 | \dontrun{ 45 | if(require(mixtools)){ 46 | mixmdl <- mixtools::normalmixEM(faithful$waiting, k = 2) 47 | } 48 | x <- mixmdl$x 49 | x <- data.frame(x) 50 | ggplot2::ggplot(data.frame(x)) + 51 | ggplot2::geom_density(ggplot2::aes(x), color="black", fill="black") + 52 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 53 | args = list(mixmdl$mu[1], mixmdl$sigma[1], lam = mixmdl$lambda[1]), 54 | colour = "red") + 55 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps, 56 | args = list(mixmdl$mu[2], mixmdl$sigma[2], lam = mixmdl$lambda[2]), 57 | colour = "blue") 58 | } 59 | 60 | } 61 | -------------------------------------------------------------------------------- /man/plot_mix_comps_normal.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot_mix_comps_normal.R 3 | \name{plot_mix_comps_normal} 4 | \alias{plot_mix_comps_normal} 5 | \title{Custom Function for Overlaying Mixture Components for Normal Distributions} 6 | \usage{ 7 | plot_mix_comps_normal(x, mu, sigma, lam) 8 | } 9 | \arguments{ 10 | \item{x}{Input data} 11 | 12 | \item{mu}{Mean of component} 13 | 14 | \item{sigma}{Variance of component} 15 | 16 | \item{lam}{Mixture weight of component} 17 | } 18 | \description{ 19 | Plots a mixture component conditioned on a superimposed function 20 | } 21 | \details{ 22 | Allows for specifying a custom function to be superimposed when plotting a mixture component assuming a normal distribution. This is the original function for the package, which is also included in the updated \code{plot_mix_comps()} function. 23 | } 24 | \examples{ 25 | \dontrun{ 26 | if(require(mixtools)){ 27 | mixmdl <- mixtools::normalmixEM(faithful$waiting, k = 2) 28 | } 29 | x <- mixmdl$x 30 | x <- data.frame(x) 31 | ggplot2::ggplot(data.frame(x)) + 32 | ggplot2::geom_density(ggplot2::aes(x), color="black", fill="black") + 33 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps_normal, 34 | args = list(mixmdl$mu[1], mixmdl$sigma[1], lam = mixmdl$lambda[1]), 35 | colour = "red") + 36 | ggplot2::stat_function(geom = "line", fun = plot_mix_comps_normal, 37 | args = list(mixmdl$mu[2], mixmdl$sigma[2], lam = mixmdl$lambda[2]), 38 | colour = "blue") 39 | } 40 | 41 | } 42 | -------------------------------------------------------------------------------- /man/plot_mm.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/plot_mm.R 3 | \name{plot_mm} 4 | \alias{plot_mm} 5 | \title{Tidy Visualization of Mixture Models} 6 | \usage{ 7 | plot_mm(m, k = NULL, data = NULL) 8 | } 9 | \arguments{ 10 | \item{m}{A mixture model object} 11 | 12 | \item{k}{Optional. The number of components specified in the mixture model, \code{m}} 13 | 14 | \item{data}{Name of data object required only for \code{EMCluster} objects} 15 | } 16 | \description{ 17 | Generates a ggplot of data densities with overlaid mixture components from fit mixture models. 18 | } 19 | \details{ 20 | This is the core function in the package, returning a \code{ggplot} object for a fit mixture model. The plot includes the data density with overlaid mixture components. 21 | } 22 | \examples{ 23 | \dontrun{ 24 | if(require(mixtools)){ 25 | mixmdl1 <- mixtools::normalmixEM(faithful$waiting, k = 2) 26 | } 27 | plot_mm(mixmdl1, 2) 28 | 29 | if(require(mixtools)){ 30 | x <- c(rgamma(200, shape = 50, scale = 11), rgamma(200, shape = 28, scale = 6)) 31 | mixmdl2 <- mixtools::gammamixEM(x, lambda = c(1, 1)/2) 32 | } 33 | plot_mm(mixmdl2) 34 | } 35 | 36 | } 37 | \references{ 38 | Wickham, H., 2016. ggplot2: elegant graphics for data analysis. Springer. 39 | } 40 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library("testthat") 2 | library("plotmm") 3 | 4 | test_check("plotmm") 5 | -------------------------------------------------------------------------------- /tests/testthat/test_plot_gmm.R: -------------------------------------------------------------------------------- 1 | options(repos = c(CRAN = "https://cloud.r-project.org")) 2 | 3 | context("plot_gmm") 4 | 5 | if (!requireNamespace("EMCluster", quietly = TRUE)) { 6 | install.packages("EMCluster") 7 | } 8 | 9 | if (!requireNamespace("flexmix", quietly = TRUE)) { 10 | install.packages("flexmix") 11 | } 12 | 13 | if (!requireNamespace("mixtools", quietly = TRUE)) { 14 | install.packages("mixtools") 15 | } 16 | 17 | if (!requireNamespace("ggplot2", quietly = TRUE)) { 18 | install.packages("ggplot2") 19 | } 20 | 21 | if (!requireNamespace("plotmm", quietly = TRUE)) { 22 | install.packages("plotmm") 23 | } 24 | 25 | set.seed(235) 26 | 27 | test_that("plot_gmm catches input errors", { 28 | mixmdl <- mixtools::normalmixEM(faithful$waiting, k = 2) 29 | expect_error(plot_gmm(mixmdl, k = 1)) 30 | expect_error(plot_gmm(mixmdl, k = 16)) 31 | }) 32 | 33 | # Testing beyond 11 components causes problems. 34 | test_that("plot_gmm successfully plots", { 35 | for (num_comps in seq(2, 11)) { 36 | mixmdl <- mixtools::normalmixEM(faithful$waiting, k = num_comps) 37 | expect_is(plot_gmm(mixmdl, k = num_comps), "ggplot") 38 | } 39 | }) 40 | -------------------------------------------------------------------------------- /tests/testthat/test_plot_mm.R: -------------------------------------------------------------------------------- 1 | options(repos = c(CRAN = "https://cloud.r-project.org")) 2 | 3 | context("plot_mm") 4 | 5 | if (!requireNamespace("EMCluster", quietly = TRUE)) { 6 | install.packages("EMCluster") 7 | } 8 | 9 | if (!requireNamespace("flexmix", quietly = TRUE)) { 10 | install.packages("flexmix") 11 | } 12 | 13 | if (!requireNamespace("mixtools", quietly = TRUE)) { 14 | install.packages("mixtools") 15 | } 16 | 17 | if (!requireNamespace("ggplot2", quietly = TRUE)) { 18 | install.packages("ggplot2") 19 | } 20 | 21 | if (!requireNamespace("plotmm", quietly = TRUE)) { 22 | install.packages("plotmm") 23 | } 24 | 25 | set.seed(235) 26 | 27 | test_that("plot_mm catches input errors", { 28 | mixmdl <- mixtools::normalmixEM(faithful$waiting, k = 2) 29 | expect_error(plot_mm(mixmdl, k = 1)) 30 | expect_error(plot_mm(mixmdl, k = 16)) 31 | }) 32 | 33 | # Testing beyond 11 components causes problems for full function too. 34 | test_that("plot_mm successfully plots", { 35 | for (num_comps in seq(2, 11)) { 36 | mixmdl <- mixtools::normalmixEM(faithful$waiting, k = num_comps) 37 | expect_is(plot_mm(mixmdl, k = num_comps), "ggplot") 38 | } 39 | }) 40 | 41 | -------------------------------------------------------------------------------- /vignettes/Getting-Started.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Getting Started" 3 | output: rmarkdown::html_vignette 4 | vignette: > 5 | %\VignetteIndexEntry{Getting-Started} 6 | %\VignetteEngine{knitr::rmarkdown} 7 | %\VignetteEncoding{UTF-8} 8 | --- 9 | 10 | ```{r, include = FALSE} 11 | knitr::opts_chunk$set( 12 | collapse = TRUE, 13 | comment = "#>", 14 | eval = FALSE, 15 | echo = TRUE 16 | ) 17 | 18 | options(repos = c(CRAN = "https://cloud.r-project.org")) 19 | ``` 20 | 21 | This vignette offers a few basic examples to help users get started with `plotmm`. 22 | 23 | The package has five functions: 24 | 25 | 1. `plot_mm()`: The main function of the package, `plot_mm` allows the user to simply input the name of the fit mixture model, as well as an optional argument to pass the number of components `k` that were used in the original fit. *Note*: the function will automatically detect the number of components if `k` is not supplied. The result is a tidy ggplot of the density of the data with overlaid mixture weight component curves. Importantly, as the grammar of graphics is the basis of visualization in this package, all other tidyverse-friendly customization options work with any of the `plotmm`'s functions (e.g., customizing with `ggplot2`'s functions like `labs()` or `theme_*()`; or `patchwork`'s `plot_annotation()`). There are examples of these and others below. 26 | 27 | 2. `plot_cut_point()`: Mixture models are often used to derive cut points of separation between groups in feature space. `plot_cut_point()` plots the data density with the overlaid cut point (the mean of the calculated `mu`) from the fit mixture model. 28 | 29 | 3. `plot_mix_comps()`: A helper function allowing for expanded customization of mixture model plots. The function superimposes the shape of the components over a `ggplot2` object. This function is also used to render all plots in the main `plot_mm()` function. 30 | 31 | 4. `plot_gmm()`: The original function upon which the package was expanded. It is included in `plotmm` for quicker access to a common mixture model form (univariate Gaussian), as well as to bridge between the original `plotGMM` package. 32 | 33 | 5. `plot_mix_comps_normal()`: Similarly, this function is the original basis of `plot_mix_comps()`, but for Gaussian mixture models only. It is included in `plotmm` for bridging between the original `plotGMM` package. 34 | 35 | The package supports several model objects (from 'mixtools', 'EMCluster', and 'flexmix'), as well as many mixture model specifications, including mixtures of: 36 | 37 | 1. Univariate Gaussians 38 | 2. Bivariate Gaussians 39 | 3. Gammas 40 | 4. Logistic regressions 41 | 5. Linear regressions 42 | 6. Poisson regressions 43 | 44 | First, load the stable version from CRAN, along with some additional packages. 45 | 46 | ```{r setup} 47 | if (!requireNamespace("EMCluster", quietly = TRUE)) { 48 | install.packages("EMCluster") 49 | } 50 | 51 | if (!requireNamespace("flexmix", quietly = TRUE)) { 52 | install.packages("flexmix") 53 | } 54 | 55 | if (!requireNamespace("mixtools", quietly = TRUE)) { 56 | install.packages("mixtools") 57 | } 58 | 59 | if (!requireNamespace("ggplot2", quietly = TRUE)) { 60 | install.packages("ggplot2") 61 | } 62 | 63 | if (!requireNamespace("plotmm", quietly = TRUE)) { 64 | install.packages("plotmm") 65 | } 66 | 67 | library(plotmm) 68 | ``` 69 | 70 | ### Tidy visualization of mixture models via `plot_mm()` 71 | 72 | First, here is an example for univariate normal mixture model: 73 | 74 | ```{r} 75 | library(mixtools) 76 | library(ggplot2) 77 | 78 | set.seed(576) 79 | 80 | mixmdl <- normalmixEM(iris$Petal.Length, k = 2) 81 | 82 | # visualize 83 | plot_mm(mixmdl, 2) + 84 | labs(title = "Univariate Gaussian Mixture Model", 85 | subtitle = "Mixtools Object") 86 | ``` 87 | 88 | Next is an example of a mixture of linear regressions: 89 | 90 | ```{r} 91 | library(mixtools) 92 | library(ggplot2) 93 | 94 | # set up the data (replication of mixtools examples for comparability) 95 | data(NOdata) 96 | attach(NOdata) 97 | 98 | set.seed(100) 99 | 100 | out <- regmixEM(Equivalence, NO, verb = TRUE, epsilon = 1e-04) 101 | 102 | df <- data.frame(out$beta) 103 | 104 | # visualize 105 | plot_mm(out) + 106 | labs(title = "Mixture of Regressions", 107 | subtitle = "Mixtools Object") 108 | ``` 109 | 110 | Next is a bivariate Gaussian mixture model (via `EMCluster`). *Note*: in this case, all plots print by default for full display of options. Use indexing (e.g., `plot[1]`) to plot a specific or preferred quantity. 111 | 112 | ```{r} 113 | library(EMCluster) 114 | library(patchwork) 115 | 116 | set.seed(1234) 117 | 118 | x <- da1$da 119 | 120 | out <- init.EM(x, nclass = 10, method = "em.EM") 121 | 122 | plot_mm(out, data = x) + 123 | plot_annotation(title = "Bivariate Gaussian Mixture Model", 124 | subtitle = "EMCluster Object") 125 | ``` 126 | 127 | ### Plot cut points (or not) via `plot_cut_point()` (with the [amerika](https://CRAN.R-project.org/package=amerika) color palette) 128 | 129 | ```{r} 130 | library(mixtools) 131 | 132 | mixmdl <- normalmixEM(faithful$waiting, k = 2) 133 | 134 | plot_cut_point(mixmdl, plot = TRUE, color = "amerika") # produces plot 135 | 136 | plot_cut_point(mixmdl, plot = FALSE) # gives the cut point value, not the plot 137 | ``` 138 | 139 | ### Customize a ggplot with `plot_mix_comps()` 140 | 141 | ```{r} 142 | library(mixtools) 143 | library(magrittr) 144 | library(ggplot2) 145 | 146 | # Fit a univariate mixture model via mixtools 147 | set.seed(576) 148 | 149 | mixmdl <- normalmixEM(faithful$waiting, k = 2) 150 | 151 | # Customize a plot with `plot_mix_comps_normal()` 152 | data.frame(x = mixmdl$x) %>% 153 | ggplot() + 154 | geom_histogram(aes(x, ..density..), binwidth = 1, colour = "black", 155 | fill = "white") + 156 | stat_function(geom = "line", fun = plot_mix_comps_normal, # here is the function 157 | args = list(mixmdl$mu[1], mixmdl$sigma[1], lam = mixmdl$lambda[1]), 158 | colour = "red", lwd = 1.5) + 159 | stat_function(geom = "line", fun = plot_mix_comps_normal, # here again as k = 2 160 | args = list(mixmdl$mu[2], mixmdl$sigma[2], lam = mixmdl$lambda[2]), 161 | colour = "blue", lwd = 1.5) + 162 | ylab("Density") 163 | ``` 164 | 165 | ## Contribute 166 | 167 | Anyone is welcome to contribute to the package. Before collaborating, please take a look at and abide by the [contributor code of conduct](https://github.com/pdwaggoner/plotmm/blob/master/CODE_OF_CONDUCT.md). Here's a sampling of how to contribute: 168 | 169 | - Submit an [issue](https://github.com/pdwaggoner/plotmm/issues) reporting a bug, requesting a feature enhancement, etc. 170 | 171 | - Suggest changes directly via [pull request](https://github.com/pdwaggoner/plotmm/pulls) 172 | 173 | - [Reach out directly](https://pdwaggoner.github.io/) with ideas if you're uneasy with public interaction 174 | --------------------------------------------------------------------------------