├── text ├── bmr_tutorial.pdf ├── pics │ ├── pp_check_FE.pdf │ ├── prior_post.png │ ├── pp_checks_plot.pdf │ ├── basic_data_plot.pdf │ ├── table_coefficients.pdf │ ├── table_coefficients.png │ ├── posterior_density_FE.pdf │ ├── pp_check_FE_noGender.pdf │ ├── table_mean_hypotheses.key │ ├── table_mean_hypotheses.pdf │ ├── table_mean_hypotheses.png │ ├── posterior_density_cell_means.pdf │ ├── table_mean_hypotheses_cropped.png │ └── prior_like_post.svg ├── mfenvironments.sty ├── mfpackages.sty ├── references.bib └── mfcommands.sty ├── faintr ├── Meta │ └── vignette.rds ├── NAMESPACE ├── DESCRIPTION ├── man │ ├── print.faintCompare.Rd │ ├── get_factor_information.Rd │ ├── faintr-package.Rd │ └── compare_groups.Rd ├── R │ ├── faintr-package.R │ └── faintr_functions.R ├── doc │ ├── faintr_basics.R │ ├── faintr_basics.Rmd │ └── faintr_basics.html └── vignettes │ └── faintr_basics.Rmd ├── assembled_feedback_TR.pages ├── README.md ├── code ├── politeness_data.csv ├── generate_prior_post_plot.R ├── data_analysis_for_reader.R └── data_analysis.R ├── .gitignore └── LICENSE /text/bmr_tutorial.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/bmr_tutorial.pdf -------------------------------------------------------------------------------- /faintr/Meta/vignette.rds: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/faintr/Meta/vignette.rds -------------------------------------------------------------------------------- /text/pics/pp_check_FE.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/pp_check_FE.pdf -------------------------------------------------------------------------------- /text/pics/prior_post.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/prior_post.png -------------------------------------------------------------------------------- /assembled_feedback_TR.pages: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/assembled_feedback_TR.pages -------------------------------------------------------------------------------- /text/pics/pp_checks_plot.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/pp_checks_plot.pdf -------------------------------------------------------------------------------- /text/pics/basic_data_plot.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/basic_data_plot.pdf -------------------------------------------------------------------------------- /text/pics/table_coefficients.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/table_coefficients.pdf -------------------------------------------------------------------------------- /text/pics/table_coefficients.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/table_coefficients.png -------------------------------------------------------------------------------- /text/pics/posterior_density_FE.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/posterior_density_FE.pdf -------------------------------------------------------------------------------- /text/pics/pp_check_FE_noGender.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/pp_check_FE_noGender.pdf -------------------------------------------------------------------------------- /text/pics/table_mean_hypotheses.key: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/table_mean_hypotheses.key -------------------------------------------------------------------------------- /text/pics/table_mean_hypotheses.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/table_mean_hypotheses.pdf -------------------------------------------------------------------------------- /text/pics/table_mean_hypotheses.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/table_mean_hypotheses.png -------------------------------------------------------------------------------- /text/pics/posterior_density_cell_means.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/posterior_density_cell_means.pdf -------------------------------------------------------------------------------- /text/pics/table_mean_hypotheses_cropped.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/HEAD/text/pics/table_mean_hypotheses_cropped.png -------------------------------------------------------------------------------- /faintr/NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | S3method(print,faintCompare) 4 | export(compare_groups) 5 | export(get_factor_information) 6 | export(post_cells) 7 | import(brms) 8 | import(tidyverse) 9 | importFrom(HDInterval,hdi) 10 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # bayes_mixed_regression_tutorial 2 | Very basic tutorial on Bayesian mixed effects regression for factorial designs 3 | 4 | To obtain the `faintr` package type: 5 | 6 | ``` 7 | library(devtools) 8 | install_github('michael-franke/bayes_mixed_regression_tutorial/faintr', build_vignettes = TRUE) 9 | ``` 10 | -------------------------------------------------------------------------------- /faintr/DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: faintr 2 | Title: Compare all Factor Levels in Factorial Design Bayesian Regression (using BRMS) 3 | Version: 0.0.8 4 | Date: 2019-07-10 5 | Authors@R: person("Michael", "Franke", email = "mchfranke@gmail.com", role = c("aut", "cre")) 6 | Description: The 'faintr' package provides convenience functions to interpret output of model fits of Bayesian regression models for factorial designs, obtained with the package 'brms'. 7 | Depends: R (>= 3.5.0) 8 | License: MIT 9 | Encoding: UTF-8 10 | LazyData: true 11 | RoxygenNote: 6.1.1 12 | Suggests: knitr, rmarkdown 13 | VignetteBuilder: knitr 14 | -------------------------------------------------------------------------------- /faintr/man/print.faintCompare.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/faintr_functions.R 3 | \name{print.faintCompare} 4 | \alias{print.faintCompare} 5 | \title{Print comparison object between factor groups} 6 | \usage{ 7 | \method{print}{faintCompare}(obj) 8 | } 9 | \arguments{ 10 | \item{model}{Model fit from brms package.} 11 | } 12 | \value{ 13 | string 14 | } 15 | \description{ 16 | Print comparison object between factor groups 17 | } 18 | \examples{ 19 | print(model_fit) 20 | } 21 | \keyword{brms} 22 | \keyword{design,} 23 | \keyword{factorial} 24 | \keyword{regression,} 25 | -------------------------------------------------------------------------------- /faintr/man/get_factor_information.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/faintr_functions.R 3 | \name{get_factor_information} 4 | \alias{get_factor_information} 5 | \title{Obtaining information about factors in regression model} 6 | \usage{ 7 | get_factor_information(model) 8 | } 9 | \arguments{ 10 | \item{model}{Model fit from brms package.} 11 | } 12 | \value{ 13 | list with names of factors and their levels, including the reference levels (in dummy coding) 14 | } 15 | \description{ 16 | For a model for a factorial design, fitted with brms, this function returns information about the factors used, their levels, and the reference levels. 17 | For more information see \code{vignette('faintr_basics')}. 18 | } 19 | \examples{ 20 | library(brms) 21 | m = brm(yield ~ N * P * K, npk) 22 | get_factor_information(m) 23 | } 24 | \keyword{brms} 25 | \keyword{design,} 26 | \keyword{factorial} 27 | \keyword{regression,} 28 | -------------------------------------------------------------------------------- /faintr/R/faintr-package.R: -------------------------------------------------------------------------------- 1 | #' Factorial-design interpreter for Bayesian regression models 2 | #' 3 | #' @docType package 4 | #' @name faintr-package 5 | #' @aliases faintr 6 | #' 7 | #' @description 8 | #' 9 | #' he \pkg{faintr} package provides convenience functions for testing different hypotheses 10 | #' about factor level combinations for a Bayesian regression analysis of data from a 11 | #' factorial-design experiment. It builds on a regression model fitted with the help of 12 | #' the \pkg{brms} package. For more information see \href{https://michael-franke.github.io/bayes_mixed_regression_tutorial/faintr/doc/faintr_basics.html}{the basic tutorial online}. 13 | #' 14 | #' 15 | #' @details 16 | #' 17 | #' The package provides the following functions. 18 | #' \code{\link{get_factor_information}} extracts information about the factors 19 | #' and their (reference) levels used in the regression analysis. 20 | #' \code{\link{post_cells}} gives the estimated means for all 21 | #' design cells, and a complete pairwise comparison between cells. 22 | #' \code{\link{compare_groups}} allows the comparison of any pair of 23 | #' (subsets) of cells. 24 | #' 25 | #' For more information see \href{https://michael-franke.github.io/bayes_mixed_regression_tutorial/faintr/doc/faintr_basics.html}{the basic tutorial online}. 26 | #' 27 | #' @seealso 28 | #' \code{\link{brm}}, 29 | #' \code{\link{brmsfit}} 30 | #' 31 | NULL -------------------------------------------------------------------------------- /faintr/man/faintr-package.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/faintr-package.R 3 | \docType{package} 4 | \name{faintr-package} 5 | \alias{faintr-package} 6 | \alias{faintr} 7 | \title{Factorial-design interpreter for Bayesian regression models} 8 | \description{ 9 | he \pkg{faintr} package provides convenience functions for testing different hypotheses 10 | about factor level combinations for a Bayesian regression analysis of data from a 11 | factorial-design experiment. It builds on a regression model fitted with the help of 12 | the \pkg{brms} package. For more information see \href{https://michael-franke.github.io/bayes_mixed_regression_tutorial/faintr/doc/faintr_basics.html}{the basic tutorial online}. 13 | } 14 | \details{ 15 | The package provides the following functions. 16 | \code{\link{get_factor_information}} extracts information about the factors 17 | and their (reference) levels used in the regression analysis. 18 | \code{\link{post_cells}} gives the estimated means for all 19 | design cells, and a complete pairwise comparison between cells. 20 | \code{\link{compare_groups}} allows the comparison of any pair of 21 | (subsets) of cells. 22 | 23 | For more information see \href{https://michael-franke.github.io/bayes_mixed_regression_tutorial/faintr/doc/faintr_basics.html}{the basic tutorial online}. 24 | } 25 | \seealso{ 26 | \code{\link{brm}}, 27 | \code{\link{brmsfit}} 28 | } 29 | -------------------------------------------------------------------------------- /faintr/man/compare_groups.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/faintr_functions.R 3 | \name{compare_groups} 4 | \alias{compare_groups} 5 | \title{Compare means of two subsets of factorial design cells} 6 | \usage{ 7 | compare_groups(model, higher, lower) 8 | } 9 | \arguments{ 10 | \item{model}{Model fit from brms package.} 11 | } 12 | \value{ 13 | list with posterior samples for each group, and the posterior probability that group 'higher' has a higher estimated coefficient in the posterior samples than the group 'lower' 14 | } 15 | \description{ 16 | This function takes a brms model fit for a factorial design and a specification of two groups (subsets of design cells) to compare. 17 | A group is specified as a named list, specifiying the factors and their levels which to include in the group. 18 | It outputs the posterior mean of the 'higher' minus the 'lower' subset of cells, its 95 percent credible interval and the posterior probability that the 'higher' group has a higher mean than the the 'lower' group. 19 | For more information see \code{vignette('faintr_basics')}. 20 | } 21 | \examples{ 22 | library(brms) 23 | m = brm(yield ~ N * P * K, npk) 24 | # this compares two single cells in the factorial design 25 | compare_groups( 26 | model = m, 27 | higher = list("N" = "1", "P" = "1", "K" = "1"), 28 | lower = list("N" = "0", "P" = "0", "K" = "1") 29 | ) 30 | # this compares the average of N=1 cells to the grand mean 31 | # like in deviance conding 32 | compare_groups( 33 | model = m, 34 | higher = list("N" = "1"), 35 | lower = list() 36 | ) 37 | 38 | } 39 | \keyword{brms} 40 | \keyword{design,} 41 | \keyword{factorial} 42 | \keyword{regression,} 43 | -------------------------------------------------------------------------------- /code/politeness_data.csv: -------------------------------------------------------------------------------- 1 | subject,gender,sentence,context,pitch 2 | F1,F,S1,pol,213.3 3 | F1,F,S1,inf,204.5 4 | F1,F,S2,pol,285.1 5 | F1,F,S2,inf,259.7 6 | F1,F,S3,pol,203.9 7 | F1,F,S3,inf,286.9 8 | F1,F,S4,pol,250.8 9 | F1,F,S4,inf,276.8 10 | F1,F,S5,pol,231.9 11 | F1,F,S5,inf,252.4 12 | F1,F,S6,pol,181.2 13 | F1,F,S6,inf,230.7 14 | F1,F,S7,inf,216.5 15 | F1,F,S7,pol,154.8 16 | F3,F,S1,pol,229.7 17 | F3,F,S1,inf,237.3 18 | F3,F,S2,pol,236.8 19 | F3,F,S2,inf,251 20 | F3,F,S3,pol,267 21 | F3,F,S3,inf,266 22 | F3,F,S4,pol,275.4 23 | F3,F,S4,inf,306.8 24 | F3,F,S5,pol,232.6 25 | F3,F,S5,inf,252.5 26 | F3,F,S6,pol,226.5 27 | F3,F,S6,inf,278.8 28 | F3,F,S7,inf,264.4 29 | F3,F,S7,pol,185.5 30 | M4,M,S1,pol,110.7 31 | M4,M,S1,inf,123.6 32 | M4,M,S2,pol,229 33 | M4,M,S2,inf,114.9 34 | M4,M,S3,pol,112.2 35 | M4,M,S3,inf,213.6 36 | M4,M,S4,pol,193.4 37 | M4,M,S4,inf,162.9 38 | M4,M,S5,pol,101.8 39 | M4,M,S5,inf,126.9 40 | M4,M,S6,inf,136.2 41 | M4,M,S7,inf,146 42 | M4,M,S7,pol,126.5 43 | M7,M,S1,pol,86.1 44 | M7,M,S1,inf,99.1 45 | M7,M,S2,pol,82.2 46 | M7,M,S2,inf,104.3 47 | M7,M,S3,pol,85.9 48 | M7,M,S3,inf,110.2 49 | M7,M,S4,pol,97.1 50 | M7,M,S4,inf,120 51 | M7,M,S5,pol,93.7 52 | M7,M,S5,inf,102.9 53 | M7,M,S6,pol,108.8 54 | M7,M,S6,inf,108.2 55 | M7,M,S7,inf,124.4 56 | M7,M,S7,pol,107.6 57 | F2,F,S1,pol,232.7 58 | F2,F,S1,inf,231.3 59 | F2,F,S2,pol,246.3 60 | F2,F,S2,inf,259.7 61 | F2,F,S3,pol,289.4 62 | F2,F,S3,inf,301.8 63 | F2,F,S4,pol,243.2 64 | F2,F,S4,inf,296.2 65 | F2,F,S5,pol,277.7 66 | F2,F,S5,inf,294.2 67 | F2,F,S6,pol,208 68 | F2,F,S6,inf,225.9 69 | F2,F,S7,inf,281 70 | F2,F,S7,pol,227.2 71 | M3,M,S1,pol,153.8 72 | M3,M,S1,inf,188.8 73 | M3,M,S2,pol,142.4 74 | M3,M,S2,inf,199.7 75 | M3,M,S3,pol,160.2 76 | M3,M,S3,inf,186.1 77 | M3,M,S4,pol,207.5 78 | M3,M,S4,inf,190.9 79 | M3,M,S5,pol,160.7 80 | M3,M,S5,inf,156.5 81 | M3,M,S6,pol,146.7 82 | M3,M,S6,inf,158 83 | M3,M,S7,inf,161.1 84 | M3,M,S7,pol,153.3 85 | -------------------------------------------------------------------------------- /text/mfenvironments.sty: -------------------------------------------------------------------------------- 1 | \NeedsTeXFormat{LaTeX2e} 2 | \ProvidesPackage{myenvironments}[2010/03/29 Personal Environments] 3 | 4 | 5 | \newtheoremstyle{Satz} 6 | {} %Space above 7 | {1em} %Space below 8 | {\normalfont} %Body font 9 | {} %Indent amount (empty = no indent, 10 | %\parindent = para indent) 11 | {\normalfont} %Thm head font 12 | {.} %Punctuation after thm head 13 | {.8em} %Space after thm head: " " = normal interword 14 | %space; \newline = linebreak 15 | {\bfseries\thmname{#1}\thmnumber{ #2}\thmnote{ (#3)}} 16 | %Thm head spec (can be left empty, meaning 17 | %`normal') 18 | 19 | \theoremstyle{Satz} 20 | \newtheorem{theorem}{Theorem} 21 | \newtheorem{lemma}[theorem]{Lemma} 22 | \newtheorem{definition}[theorem]{Definition} 23 | \newtheorem{proposition}[theorem]{Proposition} 24 | \newtheorem{fact}[theorem]{Fact} 25 | \newtheorem{claim}[theorem]{Claim} 26 | \newtheorem{remark}[theorem]{Remark} 27 | \newtheorem{exercise}{Exercise} 28 | \newtheorem{problem}{Problem} 29 | \newtheorem{corollary}[theorem]{Corollary} 30 | \newtheorem{example}[theorem]{Example} 31 | 32 | \newtheoremstyle{Bsp} 33 | {} %Space above 34 | {1em} %Space below 35 | {\itshape} %Body font 36 | {} %Indent amount (empty = no indent, 37 | %\parindent = para indent) 38 | {\normalfont} %Thm head font 39 | {.} %Punctuation after thm head 40 | {.8em} %Space after thm head: " " = normal interword 41 | %space; \newline = linebreak 42 | {\thmname{#1}\thmnumber{ #2}\thmnote{ (#3)}} 43 | %Thm head spec (can be left empty, meaning 44 | %`normal') 45 | \theoremstyle{Bsp} 46 | % \newtheorem{example}[theorem]{Example} 47 | 48 | 49 | 50 | -------------------------------------------------------------------------------- /code/generate_prior_post_plot.R: -------------------------------------------------------------------------------- 1 | 2 | prior = rnorm(10000000,0,50) 3 | likelihood = rnorm(10000000, 100, 32) 4 | posterior = rnorm(10000000, 65, 32) 5 | 6 | ydata <- data.frame(prior,likelihood, posterior) 7 | 8 | ggplot(ydata) + 9 | geom_density(aes(x= prior), fill = "#f1a340", color = "#f1a340", alpha = 0.5) + 10 | geom_density(aes(x= likelihood), fill = "#998ec3", color = "#998ec3", alpha = 0.5) + 11 | geom_density(aes(x= posterior), fill = "white", color = "grey", alpha = 0.5) + 12 | xlab("\npitch difference in Hz (women - men)") + 13 | theme_classic() + 14 | scale_x_continuous(breaks = (c(-200,-100,0,100,200)), limits = c(-300,300)) + 15 | theme(legend.position = "right", 16 | legend.key.height = unit(2,"line"), 17 | legend.title = element_text(size = 18, face = "bold"), 18 | legend.text = element_text(size = 16), 19 | legend.background = element_rect(fill = "transparent"), 20 | strip.background = element_blank(), 21 | strip.text = element_text(size = 18, face = "bold"), 22 | axis.line = element_blank(), 23 | panel.spacing = unit(2, "lines"), 24 | plot.background = element_rect(fill = "transparent", colour = NA), 25 | panel.background = element_rect(fill = "transparent"), 26 | axis.ticks.y = element_blank(), 27 | axis.text.y = element_blank(), 28 | axis.title.y = element_blank(), 29 | axis.text.x = element_text(size = 16), 30 | axis.title.x = element_text(size = 18, face = "bold"), 31 | plot.title = element_text(size = 18, face = "bold"), 32 | plot.margin = unit(c(0.2,0.1,0.2,0.1),"cm")) 33 | 34 | ggsave(filename = "prior_like_post.pdf", 35 | plot = last_plot(), 36 | device = "pdf", 37 | width = 160, 38 | height = 100, 39 | units = "mm", 40 | #bg = "transparent", 41 | dpi = 300) 42 | 43 | likelihood = c(100, 50, 14, 150, 186) 44 | mean(likelihood); sd(likelihood) 45 | xdata <- data.frame(likelihood) 46 | 47 | prior_dummy <- c( 48 | # define a skeptical prior for the relevant coefficients 49 | prior(normal(0, 50), class = "Intercept") 50 | ) 51 | 52 | 53 | x <- brm(likelihood ~ 1, xdata) 54 | x_prior <- brm(likelihood ~ 1, prior = prior_dummy, xdata) 55 | -------------------------------------------------------------------------------- /faintr/doc/faintr_basics.R: -------------------------------------------------------------------------------- 1 | ## ----setup, include=FALSE, echo = FALSE, message = FALSE----------------- 2 | knitr::opts_chunk$set(echo=TRUE, 3 | warning=FALSE, 4 | message=FALSE, 5 | collapse = TRUE, 6 | cache = TRUE, 7 | dev.args = list(bg = 'transparent'), 8 | fig.align='center', 9 | fig.height = 3, 10 | fig.widht=4) 11 | library(tidyverse) 12 | theme_set(theme_bw() + theme(plot.background=element_blank()) ) 13 | 14 | ## ---- eval = F----------------------------------------------------------- 15 | # devtools::install_github('michael-franke/bayes_mixed_regression_tutorial/faintr', 16 | # build_vignettes = TRUE) 17 | # library(faintr) 18 | 19 | ## ---- echo = F----------------------------------------------------------- 20 | library(faintr) 21 | 22 | ## ---- error=FALSE, warning=FALSE, message=FALSE-------------------------- 23 | library(tidyverse) 24 | politedata = read_csv('https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/master/code/politeness_data.csv') 25 | head(politedata) 26 | 27 | ## ------------------------------------------------------------------------ 28 | politedata %>% group_by(gender, context) %>% summarize(mean_pitch = mean(pitch)) 29 | 30 | ## ---- error=FALSE, warning=FALSE, message=FALSE, results="hide"---------- 31 | library(brms) 32 | m_dummy = brm(pitch ~ gender * context + (1 | subject + sentence), politedata) 33 | 34 | ## ------------------------------------------------------------------------ 35 | fixef(m_dummy) 36 | 37 | ## ------------------------------------------------------------------------ 38 | compare_groups( 39 | model = m_dummy, 40 | higher = list(gender = "F", context = "pol"), 41 | lower = list(gender = "M", context = "inf") 42 | ) 43 | 44 | ## ------------------------------------------------------------------------ 45 | compare_groups( 46 | model = m_dummy, 47 | higher = list(gender = "F"), 48 | lower = list() 49 | ) 50 | 51 | ## ------------------------------------------------------------------------ 52 | extract_posterior_cell_means(m_dummy)$all_cells_compared 53 | 54 | ## ------------------------------------------------------------------------ 55 | extract_posterior_cell_means(m_dummy)$cell_summary 56 | 57 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | ## Core latex/pdflatex auxiliary files: 2 | *.aux 3 | *.lof 4 | *.log 5 | *.lot 6 | *.fls 7 | *.out 8 | *.toc 9 | *.fmt 10 | *.fot 11 | *.cb 12 | *.cb2 13 | .*.lb 14 | 15 | ## Intermediate documents: 16 | *.dvi 17 | *.xdv 18 | *-converted-to.* 19 | # these rules might exclude image files for figures etc. 20 | # *.ps 21 | # *.eps 22 | # *.pdf 23 | 24 | ## Generated if empty string is given at "Please type another file name for output:" 25 | .pdf 26 | 27 | ## Bibliography auxiliary files (bibtex/biblatex/biber): 28 | *.bbl 29 | *.bcf 30 | *.blg 31 | *-blx.aux 32 | *-blx.bib 33 | *.run.xml 34 | 35 | ## Build tool auxiliary files: 36 | *.fdb_latexmk 37 | *.synctex 38 | *.synctex(busy) 39 | *.synctex.gz 40 | *.synctex.gz(busy) 41 | *.pdfsync 42 | 43 | ## Auxiliary and intermediate files from other packages: 44 | # algorithms 45 | *.alg 46 | *.loa 47 | 48 | # achemso 49 | acs-*.bib 50 | 51 | # amsthm 52 | *.thm 53 | 54 | # beamer 55 | *.nav 56 | *.pre 57 | *.snm 58 | *.vrb 59 | 60 | # changes 61 | *.soc 62 | 63 | # cprotect 64 | *.cpt 65 | 66 | # elsarticle (documentclass of Elsevier journals) 67 | *.spl 68 | 69 | # endnotes 70 | *.ent 71 | 72 | # fixme 73 | *.lox 74 | 75 | # feynmf/feynmp 76 | *.mf 77 | *.mp 78 | *.t[1-9] 79 | *.t[1-9][0-9] 80 | *.tfm 81 | 82 | #(r)(e)ledmac/(r)(e)ledpar 83 | *.end 84 | *.?end 85 | *.[1-9] 86 | *.[1-9][0-9] 87 | *.[1-9][0-9][0-9] 88 | *.[1-9]R 89 | *.[1-9][0-9]R 90 | *.[1-9][0-9][0-9]R 91 | *.eledsec[1-9] 92 | *.eledsec[1-9]R 93 | *.eledsec[1-9][0-9] 94 | *.eledsec[1-9][0-9]R 95 | *.eledsec[1-9][0-9][0-9] 96 | *.eledsec[1-9][0-9][0-9]R 97 | 98 | # glossaries 99 | *.acn 100 | *.acr 101 | *.glg 102 | *.glo 103 | *.gls 104 | *.glsdefs 105 | 106 | # gnuplottex 107 | *-gnuplottex-* 108 | 109 | # gregoriotex 110 | *.gaux 111 | *.gtex 112 | 113 | # htlatex 114 | *.4ct 115 | *.4tc 116 | *.idv 117 | *.lg 118 | *.trc 119 | *.xref 120 | 121 | # hyperref 122 | *.brf 123 | 124 | # knitr 125 | *-concordance.tex 126 | # TODO Comment the next line if you want to keep your tikz graphics files 127 | *.tikz 128 | *-tikzDictionary 129 | 130 | # listings 131 | *.lol 132 | 133 | # makeidx 134 | *.idx 135 | *.ilg 136 | *.ind 137 | *.ist 138 | 139 | # minitoc 140 | *.maf 141 | *.mlf 142 | *.mlt 143 | *.mtc[0-9]* 144 | *.slf[0-9]* 145 | *.slt[0-9]* 146 | *.stc[0-9]* 147 | 148 | # minted 149 | _minted* 150 | *.pyg 151 | 152 | # morewrites 153 | *.mw 154 | 155 | # nomencl 156 | *.nlg 157 | *.nlo 158 | *.nls 159 | 160 | # pax 161 | *.pax 162 | 163 | # pdfpcnotes 164 | *.pdfpc 165 | 166 | # sagetex 167 | *.sagetex.sage 168 | *.sagetex.py 169 | *.sagetex.scmd 170 | 171 | # scrwfile 172 | *.wrt 173 | 174 | # sympy 175 | *.sout 176 | *.sympy 177 | sympy-plots-for-*.tex/ 178 | 179 | # pdfcomment 180 | *.upa 181 | *.upb 182 | 183 | # pythontex 184 | *.pytxcode 185 | pythontex-files-*/ 186 | 187 | # thmtools 188 | *.loe 189 | 190 | # TikZ & PGF 191 | *.dpth 192 | *.md5 193 | *.auxlock 194 | 195 | # todonotes 196 | *.tdo 197 | 198 | # easy-todo 199 | *.lod 200 | 201 | # xmpincl 202 | *.xmpi 203 | 204 | # xindy 205 | *.xdy 206 | 207 | # xypic precompiled matrices 208 | *.xyc 209 | 210 | # endfloat 211 | *.ttt 212 | *.fff 213 | 214 | # Latexian 215 | TSWLatexianTemp* 216 | 217 | ## Editors: 218 | # WinEdt 219 | *.bak 220 | *.sav 221 | 222 | # Texpad 223 | .texpadtmp 224 | 225 | # Kile 226 | *.backup 227 | 228 | # KBibTeX 229 | *~[0-9]* 230 | 231 | # auto folder when using emacs and auctex 232 | ./auto/* 233 | *.el 234 | 235 | # expex forward references with \gathertags 236 | *-tags.tex 237 | 238 | # standalone packages 239 | *.sta 240 | 241 | # generated if using elsarticle.cls 242 | *.spl 243 | .Rproj.user 244 | -------------------------------------------------------------------------------- /faintr/vignettes/faintr_basics.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Comparing groups of factor levels with the `faintr` package" 3 | output: rmarkdown::html_vignette 4 | vignette: > 5 | %\VignetteIndexEntry{Comparing groups of factor levels with the `faintr` package} 6 | %\VignetteEngine{knitr::rmarkdown} 7 | \usepackage[utf8]{inputenc} 8 | --- 9 | 10 | ```{r setup, include=FALSE, echo = FALSE, message = FALSE} 11 | knitr::opts_chunk$set(echo=TRUE, 12 | warning=FALSE, 13 | message=FALSE, 14 | collapse = TRUE, 15 | cache = TRUE, 16 | dev.args = list(bg = 'transparent'), 17 | fig.align='center', 18 | fig.height = 3, 19 | fig.widht=4) 20 | library(tidyverse) 21 | theme_set(theme_bw() + theme(plot.background=element_blank()) ) 22 | ``` 23 | 24 | # Motivation 25 | 26 | The `faintr` package provides convenience function for the evaluation of a model fit, obtained with the `brms` package, for a Bayesian regression model for data from a factorial design. If the original model fit used (default) dummy coding of factors, the `faintr` package allow extraction of many more meaningful comparisons. For example, it is possible to directly compare the difference between cells which are not comparable by dummy coding, and it is also possible to compare means in sets of cells, so as to recover the outcomes of deviance coding. 27 | 28 | # Installation 29 | 30 | Install the `faintr` package with `devtools` from GitHub: 31 | 32 | ```{r, eval = F} 33 | devtools::install_github('michael-franke/bayes_mixed_regression_tutorial/faintr', 34 | build_vignettes = TRUE) 35 | library(faintr) 36 | ``` 37 | 38 | ```{r, echo = F} 39 | library(faintr) 40 | ``` 41 | 42 | # Example 43 | 44 | Consider a data set on pitch frequency in the speech of female and male speakers in polite and informal contexts. 45 | 46 | ```{r, error=FALSE, warning=FALSE, message=FALSE} 47 | library(tidyverse) 48 | politedata = read_csv('https://tinyurl.com/polite-data') 49 | head(politedata) 50 | ``` 51 | 52 | The cell means of this data set are: 53 | 54 | ```{r} 55 | politedata %>% group_by(gender, context) %>% summarize(mean_pitch = mean(pitch)) 56 | ``` 57 | 58 | 59 | A Bayesian regression model for a factorial design with by-subject and by-item random intercepts can be obtained with the `brms` package as follows: 60 | 61 | ```{r, error=FALSE, warning=FALSE, message=FALSE, results="hide"} 62 | library(brms) 63 | m_dummy = brm(pitch ~ gender * context + (1 | subject + sentence), politedata) 64 | ``` 65 | 66 | The `brm` function uses dummy coding per default. Look at the estimated coefficients: 67 | 68 | ```{r} 69 | fixef(m_dummy) 70 | ``` 71 | 72 | The reference cell is where `gender:F` and `context:inf`, so female speakers in informal contexts. The estimated mean for the cell with data from male speakers in informal contexts is retrievable by adding the estimated coefficient `genderM` in the output above from the estimated Intercept. 73 | 74 | The `faintr` package provides convenience functions to compare different (groups of) cells to each other, based on a model fit like the above. Although the fit of the regression model uses a particular reference cell for dummy-coding, other contrasts of relevance can be retrieved from the posterior samples. For example, if we want to compare two cell diagonally, say, male speakers in informal contexts against female speakers in polite contexts, we can do this: 75 | 76 | ```{r} 77 | compare_groups( 78 | model = m_dummy, 79 | higher = list(gender = "F", context = "pol"), 80 | lower = list(gender = "M", context = "inf") 81 | ) 82 | ``` 83 | 84 | We can also compare the effect of gender female against the grand mean, to retrieve the information normally obtained by deviance coding: 85 | 86 | ```{r} 87 | compare_groups( 88 | model = m_dummy, 89 | higher = list(gender = "F"), 90 | lower = list() 91 | ) 92 | ``` 93 | 94 | To explore all pairwise comparisons between design cells, try: 95 | 96 | ```{r} 97 | post_cells(m_dummy)$all_cells_compared 98 | ``` 99 | 100 | We can also extract the estimated means of each cell: 101 | 102 | ```{r} 103 | post_cells(m_dummy)$cell_summary 104 | ``` 105 | 106 | -------------------------------------------------------------------------------- /faintr/doc/faintr_basics.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Comparing groups of factor levels with the `faintr` package" 3 | output: rmarkdown::html_vignette 4 | vignette: > 5 | %\VignetteIndexEntry{Comparing groups of factor levels with the `faintr` package} 6 | %\VignetteEngine{knitr::rmarkdown} 7 | \usepackage[utf8]{inputenc} 8 | --- 9 | 10 | ```{r setup, include=FALSE, echo = FALSE, message = FALSE} 11 | knitr::opts_chunk$set(echo=TRUE, 12 | warning=FALSE, 13 | message=FALSE, 14 | collapse = TRUE, 15 | cache = TRUE, 16 | dev.args = list(bg = 'transparent'), 17 | fig.align='center', 18 | fig.height = 3, 19 | fig.widht=4) 20 | library(tidyverse) 21 | theme_set(theme_bw() + theme(plot.background=element_blank()) ) 22 | ``` 23 | 24 | # Motivation 25 | 26 | The `faintr` package provides convenience function for the evaluation of a model fit, obtained with the `brms` package, for a Bayesian regression model for data from a factorial design. If the original model fit used (default) dummy coding of factors, the `faintr` package allow extraction of many more meaningful comparisons. For example, it is possible to directly compare the difference between cells which are not comparable by dummy coding, and it is also possible to compare means in sets of cells, so as to recover the outcomes of deviance coding. 27 | 28 | # Installation 29 | 30 | Install the `faintr` package with `devtools` from GitHub: 31 | 32 | ```{r, eval = F} 33 | devtools::install_github('michael-franke/bayes_mixed_regression_tutorial/faintr', 34 | build_vignettes = TRUE) 35 | library(faintr) 36 | ``` 37 | 38 | ```{r, echo = F} 39 | library(faintr) 40 | ``` 41 | 42 | # Example 43 | 44 | Consider a data set on pitch frequency in the speech of female and male speakers in polite and informal contexts. 45 | 46 | ```{r, error=FALSE, warning=FALSE, message=FALSE} 47 | library(tidyverse) 48 | politedata = read_csv('https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/master/code/politeness_data.csv') 49 | head(politedata) 50 | ``` 51 | 52 | The cell means of this data set are: 53 | 54 | ```{r} 55 | politedata %>% group_by(gender, context) %>% summarize(mean_pitch = mean(pitch)) 56 | ``` 57 | 58 | 59 | A Bayesian regression model for a factorial design with by-subject and by-item random intercepts can be obtained with the `brms` package as follows: 60 | 61 | ```{r, error=FALSE, warning=FALSE, message=FALSE, results="hide"} 62 | library(brms) 63 | m_dummy = brm(pitch ~ gender * context + (1 | subject + sentence), politedata) 64 | ``` 65 | 66 | The `brm` function uses dummy coding per default. Look at the estimated coefficients: 67 | 68 | ```{r} 69 | fixef(m_dummy) 70 | ``` 71 | 72 | The reference cell is where `gender:F` and `context:inf`, so female speakers in informal contexts. The estimated mean for the cell with data from male speakers in informal contexts is retrievable by adding the estimated coefficient `genderM` in the output above from the estimated Intercept. 73 | 74 | The `faintr` package provides convenience functions to compare different (groups of) cells to each other, based on a model fit like the above. Although the fit of the regression model uses a particular reference cell for dummy-coding, other contrasts of relevance can be retrieved from the posterior samples. For example, if we want to compare two cell diagonally, say, male speakers in informal contexts against female speakers in polite contexts, we can do this: 75 | 76 | ```{r} 77 | compare_groups( 78 | model = m_dummy, 79 | higher = list(gender = "F", context = "pol"), 80 | lower = list(gender = "M", context = "inf") 81 | ) 82 | ``` 83 | 84 | We can also compare the effect of gender female against the grand mean, to retrieve the information normally obtained by deviance coding: 85 | 86 | ```{r} 87 | compare_groups( 88 | model = m_dummy, 89 | higher = list(gender = "F"), 90 | lower = list() 91 | ) 92 | ``` 93 | 94 | To explore all pairwise comparisons between design cells, try: 95 | 96 | ```{r} 97 | extract_posterior_cell_means(m_dummy)$all_cells_compared 98 | ``` 99 | 100 | We can also extract the estimated means of each cell: 101 | 102 | ```{r} 103 | extract_posterior_cell_means(m_dummy)$cell_summary 104 | ``` 105 | 106 | -------------------------------------------------------------------------------- /text/mfpackages.sty: -------------------------------------------------------------------------------- 1 | \NeedsTeXFormat{LaTeX2e} 2 | \ProvidesPackage{mfpackages}[2010/03/29 Personalized Packages Michael Franke] 3 | 4 | \newif\ifpala 5 | \palafalse 6 | 7 | \newif\ifnobib 8 | \nobibtrue 9 | 10 | \newif\ifnosubcap 11 | \nosubcaptrue 12 | 13 | \newif\ifnohyper 14 | \nohypertrue 15 | 16 | \newif\ifnoams 17 | \noamstrue 18 | 19 | \newif\ifnotik 20 | \notiktrue 21 | 22 | \newif\ifnograph 23 | \nographtrue 24 | 25 | \newif\ifnoGB 26 | \noGBtrue 27 | 28 | \DeclareOption{palatino}{ 29 | \palatrue 30 | } 31 | 32 | \DeclareOption{nobiblatex}{ 33 | \nobibfalse 34 | } 35 | 36 | \DeclareOption{nosubcaption}{ 37 | \nosubcapfalse 38 | } 39 | 40 | \DeclareOption{nohyperref}{ 41 | \nohyperfalse 42 | } 43 | 44 | \DeclareOption{noamsthm}{ 45 | \noamsfalse 46 | } 47 | 48 | \DeclareOption{notikz}{ 49 | \notikfalse 50 | } 51 | 52 | \DeclareOption{nographicx}{ 53 | \nographfalse 54 | } 55 | 56 | \DeclareOption{nogb4e}{ 57 | \noGBfalse 58 | } 59 | 60 | \ProcessOptions\relax 61 | 62 | \RequirePackage{amsmath} % Formeln 63 | \RequirePackage{amsfonts} % Fonts for Formulas 64 | \RequirePackage{amssymb} 65 | 66 | \ifnoams 67 | \RequirePackage{amsthm} 68 | \fi 69 | 70 | \RequirePackage{dsfont} % double stroke fonts 71 | 72 | \ifnograph 73 | \RequirePackage[final]{graphicx} 74 | \fi 75 | \RequirePackage{booktabs} 76 | \RequirePackage{enumerate} 77 | \RequirePackage{paralist} 78 | 79 | \RequirePackage[all]{xy} 80 | \RequirePackage{url} 81 | 82 | \ifnoGB 83 | \RequirePackage{gb4e} 84 | \noautomath 85 | \renewcommand{\eachwordtwo}{\relsize{-1}} % format translation in glosses 86 | % \renewcommand{\trans}{\vskip.15\baselineskip\relsize{-1}} 87 | \renewcommand{\trans}{\vskip.15\baselineskip} 88 | \fi 89 | 90 | 91 | 92 | % \RequirePackage{fourier} % double brackets for \den{} 93 | 94 | \RequirePackage{lipsum} 95 | \RequirePackage{txfonts} % for strict implication symbols 96 | \RequirePackage{soul} 97 | \RequirePackage{relsize} % provides command \relsize{+/-x} for relative 98 | % font size changes 99 | \RequirePackage[ngerman,english]{babel} 100 | \RequirePackage[utf8]{inputenc} 101 | \RequirePackage[T1]{fontenc} 102 | % \RequirePackage{subfig} 103 | \RequirePackage{xypic} 104 | \RequirePackage{multicol} 105 | 106 | 107 | \ifnosubcap 108 | \RequirePackage{subcaption} 109 | \fi 110 | 111 | \ifnotik 112 | \RequirePackage{tikz} 113 | \usetikzlibrary{arrows,shapes,automata,backgrounds,petri,fit,decorations.pathmorphing} 114 | \fi 115 | 116 | \RequirePackage{units} 117 | 118 | \RequirePackage{refcount} % for proper repetition of gb4e examples 119 | 120 | \RequirePackage{dialogue} 121 | 122 | % \RequirePackage{attrib} % for right-aligned references at the end of 123 | % % quotes; part of Frankenstein bundle, but 124 | % \renewcommand\PreTrib {} % overwrite these commands, from attrib.sty 125 | % \renewcommand\PostTrib {} % to suppress additional brackets around 126 | % % attributions 127 | 128 | \RequirePackage{setspace} 129 | 130 | \RequirePackage[colorinlistoftodos,color=lightgray!20,bordercolor=blue,textsize=footnotesize]{todonotes} 131 | 132 | \RequirePackage{xspace} % for \xspace in definition of acronyms etc. 133 | 134 | \ifpala 135 | \RequirePackage[sc,osf]{mathpazo} 136 | \linespread{1.12} 137 | \fi 138 | 139 | \ifnobib 140 | \RequirePackage[style=authoryear-comp, % Citation marks as [Jef65] 141 | natbib=true, % Natbib-style cite macros \citeauthor &c. 142 | hyperref=true, % Cites in pdf are links to bib 143 | % (hyperref conf.) 144 | maxnames=3, % truncate name lists if more than 2 145 | % names appear 146 | doi=true, % no doi's 147 | url=true, % no url's 148 | sortcites=false, % do NOT sort cites in the style of the 149 | % bibliography 150 | %backref=true % insert backrefs in reference section 151 | ]{biblatex} 152 | \fi 153 | 154 | 155 | 156 | 157 | \ifnohyper 158 | \RequirePackage[final, % override "draft" which means "no nothing" 159 | colorlinks, % rather than outlining them in boxes 160 | linkcolor=black, % override truly awful colour choices 161 | citecolor=black, % (ditto) 162 | urlcolor=black, % (ditto) 163 | plainpages=false, % to overcome complaints with multiple 164 | pdfpagelabels, % multiple page 1-s due to preface 165 | hypertexnames=false % solves warning, but interferes with 166 | % index and \autoref apparently 167 | ]{hyperref} 168 | \fi -------------------------------------------------------------------------------- /text/references.bib: -------------------------------------------------------------------------------- 1 | 2 | @unpublished{VehtariGelman2016:Practical-Bayes, 3 | Author = {Aki Vehtari and Andrew Gelman and Jonah Gabry}, 4 | Note = {Manuscript}, 5 | Title = {Practical Bayesian model evaluation using leave-one-out cross-validation and WAIC}, 6 | Year = {2016}} 7 | 8 | @article{KlugkistKato2005:Bayesian-model-, 9 | Author = {Irene Klugkist and Bernet Kato and Herbert Hoijtink}, 10 | Journal = {Statistica Neelandica}, 11 | Number = {1}, 12 | Pages = {57--69}, 13 | Title = {Bayesian model selection using encompassing priors}, 14 | Volume = {59}, 15 | Year = {2005}} 16 | 17 | @article{GronauSarafoglou2017:A-tutorial-on-b, 18 | Author = {Quentin F. Gronau and Alexandra Sarafoglou and Dora Matzke and Alexander Ly and Udo Boehm and Maarten Marsman and David S. Leslie and Jonathan J. Forster and Eric-Jan Wagenmakers and Helen Steingroever}, 19 | Journal = {Journal of Mathematical Psychology}, 20 | Pages = {80--97}, 21 | Title = {A tutorial on bridge sampling}, 22 | Volume = {81}, 23 | Year = {2017}} 24 | 25 | @article{KassRaftery1995:Bayes-Factors, 26 | Author = {Robert E. Kass and Adrian E. Raftery}, 27 | Journal = {Journal of the American Statistical Association}, 28 | Number = {430}, 29 | Pages = {773--795}, 30 | Title = {Bayes Factors}, 31 | Volume = {90}, 32 | Year = {1995}} 33 | 34 | @book{Jeffreys1961:Theory-of-Proba, 35 | Address = {Oxford}, 36 | Author = {Harold Jeffreys}, 37 | Date-Added = {2013-09-26 10:14:12 +0000}, 38 | Date-Modified = {2013-09-26 10:15:41 +0000}, 39 | Edition = {3rd}, 40 | Publisher = {Oxford University Press}, 41 | Title = {Theory of Probability}, 42 | Year = {1961}} 43 | 44 | @incollection{VandekerckhoveMatzke2015:Model-Compariso, 45 | Address = {Oxford}, 46 | Author = {Joachim Vandekerckhove and Dora Matzke and Eric-Jan Wagenmakers}, 47 | Booktitle = {Oxford Handbook of Computational and Mathematical Psychology}, 48 | Editor = {J. Busemeyer and J. Townsend and Z. J. Wang and A. Eidels}, 49 | Pages = {300--319}, 50 | Publisher = {Oxford University Press}, 51 | Title = {Model Comparison and the Principle of Parsimony}, 52 | Year = {2015}} 53 | 54 | @article{WagenmakersLodewyckx2010:Bayesian-hypoth, 55 | Author = {Eric-Jan Wagenmakers and Tom Lodewyckx and Himanshu Kuriyal and Raoul Grasman}, 56 | Journal = {Cognitive Psychology}, 57 | Pages = {158--189}, 58 | Title = {Bayesian hypothesis testing for psychologists: {A} tutorial on the {S}avage--{D}ickey method}, 59 | Volume = {60}, 60 | Year = {2010}} 61 | 62 | @article{buerkner2016brms, 63 | Author = {Buerkner, Paul-Christian}, 64 | Date-Modified = {2017-12-03 08:06:41 +0000}, 65 | Journal = {Journal of Statistical Software}, 66 | Number = {1}, 67 | Pages = {1--28}, 68 | Title = {{brms}: An {R} package for {B}ayesian multilevel models using {Stan}}, 69 | Volume = {80}, 70 | Year = {2016}} 71 | 72 | @article{matuschek2017balancing, 73 | title={Balancing Type I error and power in linear mixed models}, 74 | author={Matuschek, Hannes and Kliegl, Reinhold and Vasishth, Shravan and Baayen, Harald and Bates, Douglas}, 75 | journal={Journal of Memory and Language}, 76 | volume={94}, 77 | pages={305--315}, 78 | year={2017}, 79 | publisher={Elsevier} 80 | } 81 | 82 | @article{barr2013random, 83 | title={Random effects structure for confirmatory hypothesis testing: Keep it maximal}, 84 | author={Barr, Dale J and Levy, Roger and Scheepers, Christoph and Tily, Harry J}, 85 | journal={Journal of memory and language}, 86 | volume={68}, 87 | number={3}, 88 | pages={255--278}, 89 | year={2013}, 90 | publisher={Elsevier} 91 | } 92 | 93 | @article{carpenter2016stan, 94 | Author = {Carpenter, Bob and Gelman, Andrew and Hoffman, Matt and Lee, Daniel and Goodrich, Ben and Betancourt, Michael and Brubaker, Michael A. and Guo, Jiqiang and Li, Peter and Riddell, Allen}, 95 | Journal = {Journal of Statistical Software}, 96 | Pages = {1--37}, 97 | Title = {Stan: A probabilistic programming language}, 98 | Volume = {20}, 99 | Year = {2016}} 100 | 101 | @article{clark1973language, 102 | title={The language-as-fixed-effect fallacy: A critique of language statistics in psychological research}, 103 | author={Clark, Herbert H}, 104 | journal={Journal of verbal learning and verbal behavior}, 105 | volume={12}, 106 | number={4}, 107 | pages={335--359}, 108 | year={1973}, 109 | publisher={Elsevier} 110 | } 111 | 112 | @book{McElreath2016:Statistical-Ret, 113 | Address = {Boca Raton}, 114 | Author = {Richard {McElreath}}, 115 | Publisher = {Chapman and Hall}, 116 | Title = {Statistical Rethinking}, 117 | Year = {2016}} 118 | 119 | @book{GelmanCarlin2014:Bayesian-Data-A, 120 | Address = {Boca Raton}, 121 | Author = {Andrew Gelman and John B. Carlin and Hal S. Stern and Donald B. Rubin}, 122 | Edition = {3rd edition}, 123 | Publisher = {Chapman and Hall}, 124 | Title = {Bayesian Data Analysis}, 125 | Year = {2014}} 126 | 127 | @book{Kruschke2011:Doing-Bayesian-, 128 | Address = {Burlington, MA}, 129 | Author = {John E. Kruschke}, 130 | Edition = {2nd edition}, 131 | Publisher = {Academic Press}, 132 | Title = {Doing {B}ayesian Data Analysis}, 133 | Year = {2015}} 134 | 135 | @misc{Manual, 136 | Address = {Vienna, Austria}, 137 | Author = {{R Core Team}}, 138 | Organization = {R Foundation for Statistical Computing}, 139 | Title = {R: A Language and Environment for Statistical Computing}, 140 | Url = {https://www.R-project.org/}, 141 | Year = {2017}, 142 | Bdsk-Url-1 = {https://www.R-project.org/}} 143 | 144 | @article{WinterGrawunder2012:The-Phonetic-Pr, 145 | Author = {Bodo Winter and S. Grawunder }, 146 | Journal = {Journal of Phonetics}, 147 | Pages = {808--815}, 148 | Title = {The Phonetic Profile of Korean Formality}, 149 | Volume = {40}, 150 | Year = {2012}} 151 | 152 | @misc{Winter2013:Linear-models-a, 153 | Author = {Bodo Winter}, 154 | Title = {{Linear models and linear mixed effects models in R with linguistic applications}}, 155 | Url = {https://arxiv.org/abs/1308.5499}, 156 | Year = {2013}} 157 | 158 | @article{SorensenHohensteinb2016:Bayesian-linear, 159 | Author = {Tanner Sorensen and Sven Hohensteinb and Shravan Vasishth}, 160 | Journal = {The Quantitative Methods for Psychology}, 161 | Title = {Bayesian linear mixed models using {S}tan: {A} tutorial for psychologists, linguists, and cognitive scientists}, 162 | Year = {2016}} 163 | -------------------------------------------------------------------------------- /code/data_analysis_for_reader.R: -------------------------------------------------------------------------------- 1 | ## Code to follow the tutorial: "Bayesian regression modeling (for factorial designs): A tutorial" 2 | ## Authors: Michael Franke & Timo Roettger 3 | ## Date last modified: 21/06/19 4 | ## Contact: mchfranke@gmail.com; timo.b.roettger@gmail.com 5 | 6 | 7 | #################### 8 | ## install packages 9 | ##################### 10 | 11 | # package for convenience functions (e.g. plotting) 12 | library(tidyverse) 13 | 14 | # package for Bayesian regression modeling 15 | library(brms) 16 | 17 | # option for Bayesian regression models: 18 | # use all available cores for parallel computing 19 | options(mc.cores = parallel::detectCores()) 20 | 21 | # package for credible interval computation 22 | library(HDInterval) 23 | 24 | # set the random seed in order to make sure 25 | # you can reproduce the same results 26 | set.seed(1702) 27 | 28 | ################## 29 | ## load the data 30 | ################## 31 | 32 | # load the data into variable "politedata" 33 | politedata = read_csv("https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/master/code/politeness_data.csv") 34 | 35 | # inspect head of data 36 | head(politedata) 37 | 38 | ##################################################### 39 | ## run & inspect model with only fixed effects 40 | ##################################################### 41 | 42 | # formula for fixed effect regression model 43 | formula_FE = pitch ~ gender * context 44 | 45 | # run regression model in brms 46 | model_FE = brm( 47 | formula = formula_FE, 48 | data = politedata, 49 | seed = 1702 50 | ) 51 | 52 | # print out model_FE summary 53 | model_FE 54 | 55 | # extract posterior samples 56 | post_samples_FE = posterior_samples(model_FE) 57 | head(post_samples_FE %>% round(1)) 58 | 59 | # proportion of negative samples for parameter b_contextpol 60 | # this number approximates P(b_contextpol < 0 | model, data) 61 | mean(post_samples_FE$b_contextpol < 0) 62 | 63 | # proportion of samples for which the mean of cell 2 was larger 64 | # than that of cell 3 65 | # this number approximates the quantity: 66 | # P(b_contextpol > b_genderM | model, data) 67 | mean(post_samples_FE$b_contextpol > post_samples_FE$b_genderM) 68 | 69 | 70 | ########################################### 71 | ## showcasing the faintr package 72 | ## (still hoping for the best) 73 | ########################################### 74 | 75 | # load package to allow installation from GitHub 76 | library(devtools) 77 | 78 | # install package with convenience function for Bayesian regression 79 | # models for factorial designs from GitHub 80 | install_github( 81 | repo = "michael-franke/bayes_mixed_regression_tutorial", 82 | subdir = "faintr") 83 | 84 | # load the just installed package 85 | library(faintr) 86 | 87 | # extract posterior cell means 88 | post_cells(model_FE)$predictor_values 89 | 90 | # extract cell means and plot them 91 | posterior_cell_means = post_cells(model_FE)$predictor_values %>% 92 | gather(key = "parameter", value = "posterior") 93 | 94 | # compare cell means with each other 95 | compare_groups( 96 | model = model_FE, 97 | lower = list(gender = "M", context = "inf"), 98 | higher = list(gender = "F", context = "pol") 99 | ) 100 | 101 | 102 | get_posterior_beliefs_about_hypotheses = function(model) { 103 | # insert the comparisons you are interested in as strings 104 | tibble( 105 | hypothesis = c("Female-polite < Female-informal", 106 | "Male-polite < Male-informal", 107 | "Male-informal < Female-polite"), 108 | probability = c( 109 | # insert the comparisons you are interested in referring to the extracted samples 110 | compare_groups( 111 | model = model, 112 | lower = list(gender = "F", context = "pol"), 113 | higher = list(gender = "F", context = "inf") 114 | )$probability, 115 | compare_groups( 116 | model = model, 117 | lower = list(gender = "M", context = "pol"), 118 | higher = list(gender = "M", context = "inf") 119 | )$probability, 120 | compare_groups( 121 | model = model, 122 | lower = list(gender = "M", context = "inf"), 123 | higher = list(gender = "F", context = "pol") 124 | )$probability 125 | ) 126 | ) 127 | } 128 | 129 | get_posterior_beliefs_about_hypotheses(model_FE) 130 | 131 | ######################### 132 | ## add prior information 133 | ######################### 134 | 135 | # see the priors of your fitted model 136 | prior_summary(model_FE) 137 | 138 | # get all possible priors for your model before fitting it 139 | # (here prior_summary and get_prior give identical outputs 140 | # because you have not specified any priors beyond the default priors 141 | get_prior(formula = pitch ~ gender * context, 142 | data = politedata) 143 | 144 | # define priors 145 | priorFE <- c( 146 | # define a skeptical prior for the effect of context on female speakers 147 | prior(normal(0, 10), coef = contextpol) 148 | ) 149 | 150 | # let's run our models with our specified priors 151 | model_FE_prior = brm(formula = pitch ~ gender * context, 152 | prior = priorFE, # add prior 153 | data = politedata, 154 | control = list(adapt_delta = 0.99), 155 | seed = 1702) 156 | 157 | # Extract posterior beliefs about our hypotheses 158 | get_posterior_beliefs_about_hypotheses(model_FE_prior) 159 | 160 | ############### 161 | ## model check 162 | ############### 163 | 164 | # run model without considering gender 165 | model_FE_noGender = brm(formula = pitch ~ context, 166 | data = politedata, 167 | control = list(adapt_delta = 0.99), 168 | seed = 1702) 169 | 170 | # perform posterior predictive check fo both models 171 | pp_check(model_FE_noGender, nsample = 100) 172 | pp_check(model_FE, nsample = 100) 173 | 174 | ############################################### 175 | ## models with additional random effects 176 | ############################################### 177 | 178 | # hierarchical model with the maximial RE structure licensed by the design 179 | # (notice that factor 'gender' does not vary for a given value of variable 'subject') 180 | model_MaxRE = brm(formula = pitch ~ gender * context + 181 | (1 + gender * context | sentence) + 182 | (1 + context | subject), 183 | data = politedata, 184 | control = list(adapt_delta = 0.99), 185 | seed = 1702) 186 | 187 | # Extract posterior beliefs about our hypotheses 188 | get_posterior_beliefs_about_hypotheses(model_MaxRE) 189 | 190 | compare_groups( 191 | model = model_FE, 192 | lower = list(gender = "F", context = "pol"), 193 | higher = list(gender = "F", context = "inf") 194 | ) 195 | 196 | compare_groups( 197 | model = model_MaxRE, 198 | lower = list(gender = "M", context = "pol"), 199 | higher = list(gender = "M", context = "inf") 200 | ) 201 | -------------------------------------------------------------------------------- /text/mfcommands.sty: -------------------------------------------------------------------------------- 1 | \ProvidesPackage{mycommands} 2 | 3 | % Math -------------------- 4 | \newcommand{\set}[1]{\left\{#1\right\}} 5 | \newcommand{\tuple}[1]{\left \langle #1\right\rangle} 6 | \newcommand{\card}[1]{\left \lvert \, #1 \, \right\rvert} 7 | \newcommand{\abs}[1]{\lvert #1 \rvert} 8 | \newcommand{\setbar}{\ensuremath{\thinspace \mid \thinspace}} 9 | \newcommand{\probbar}{\ensuremath{\mid}} 10 | % \DeclareMathOperator*{\argmax}{arg\,max} 11 | % \DeclareMathOperator*{\argmin}{arg\,min} 12 | \newcommand{\df}{\rightarrow} 13 | \newcommand{\es}{\emptyset} 14 | \newcommand{\den}[1]{\left [\! \left [ #1 \right ]\! \right]} 15 | % \newcommand{\den}[1]{\left \llbracket #1 \right \rrbracket} % this would use fourier package which makes 'cases' environment bad 16 | \newcommand{\no}{\noindent} 17 | \newcommand{\hin}{"$\Rightarrow$" } 18 | \newcommand{\rueck}{"$\Leftarrow$" } 19 | \newcommand{\exs}{\vspace{.15cm}} 20 | \newcommand{\pow}[1]{\ensuremath{\mathcal{P}(#1)}} % Powerset 21 | \newcommand{\restr}{{\restriction}} 22 | \newcommand{\implicates}{\ensuremath{\leadsto}} % arrow for 23 | % implicatures in examples 24 | \newcommand{\update}[2]{\ensuremath{#1[#2]}} 25 | \newcommand{\myts}{\ensuremath{\thinspace}} 26 | \newcommand{\mycolon}{\ensuremath{\thinspace \colon \thinspace}} 27 | \newcommand{\mydot}{\ensuremath{\thinspace . \thinspace}} 28 | 29 | 30 | % \makeatletter 31 | % \newcommand{\prob}{\@ifstar 32 | % \simpleprob% 33 | % \condprob% 34 | % } 35 | % \def\simpleprob(#1){\ensuremath{\Pr(#1)}} 36 | % \def\condprob(#1|#2){\ensuremath{\Pr(#1 \,|\, #2)}} 37 | % \makeatother 38 | 39 | % General Text Markup-------------------- 40 | \newcommand{\runex}[1]{\begin{center}#1\end{center}} 41 | \newcommand{\mydef}[1]{\textsc{#1}} % definitions 42 | \newcommand{\markdef}[1]{\textsc{#1}} % definitions; alternative 43 | \newcommand{\myment}[1]{\emph{#1}} % first mentions 44 | \newcommand{\myword}[1]{\textbf{\texttt{#1}}} % refering to the word 45 | \newcommand{\myemph}[1]{\emph{#1}} % emphasis 46 | \newenvironment{exnonum}{ 47 | \begin{list}{}{ 48 | \setlength{\leftmargin}{2.5em} 49 | \setlength{\rightmargin}{2.5em} 50 | %\setlength{\itemindent}{-1.5em} 51 | } 52 | }{ 53 | \end{list} 54 | } 55 | 56 | 57 | % Slanted Fractions 58 | \newcommand{\myslantfrac}[2]{\msf{#1}{#2}} 59 | \newcommand{\msf}[2]{\ensuremath{\nicefrac{#1}{#2}}} 60 | \newcommand{\msftext}[2]{\nicefrac{#1}{#2}} 61 | 62 | 63 | % Symbols for Conditionals 64 | \newcommand{\cond}{\ensuremath{>}} 65 | \newcommand{\bicond}{\ensuremath{\Leftrightarrow}} 66 | \newcommand{\strcondWill}{\ensuremath{\boxRight}} 67 | \newcommand{\strcondMight}{\ensuremath{\DiamondRight}} 68 | 69 | % Signaling Games & IBR 70 | \newcommand{\sen}{\ensuremath{S}\xspace} % Sender variable 71 | \newcommand{\mysen}[1]{\ensuremath{\sen^{#1}}} % Sender of type XYZ 72 | \newcommand{\rec}{\ensuremath{R}\xspace} % Receiver variable 73 | \newcommand{\myrec}[1]{\ensuremath{\rec_{#1}}} % Receiver of type XYZ 74 | \newcommand{\States}{\ensuremath{T}\xspace} % Set of States 75 | \newcommand{\state}{\ensuremath{t}\xspace} % single states 76 | \newcommand{\mystate}[1]{\ensuremath{\state_{\text{#1}}}\xspace} %meaningful states 77 | \newcommand{\Messgs}{\ensuremath{M}\xspace} % Set of Messages 78 | \newcommand{\messg}{\ensuremath{m}\xspace} % single messages 79 | \newcommand{\mymessg}[1]{\ensuremath{\messg_{\text{#1}}}\xspace} %meaningful messages 80 | \newcommand{\cost}{\ensuremath\operatorname{C}} % cost function 81 | \newcommand{\Acts}{\ensuremath{A}\xspace} % Set of R-actions 82 | \newcommand{\act}{\ensuremath{a}\xspace} % single action 83 | \newcommand{\myact}[1]{\ensuremath{\act_{\text{#1}}}\xspace} %meaningful 84 | \newcommand{\Worlds}{\ensuremath{W}} % Worlds 85 | \newcommand{\world}{\ensuremath{w}} % single world 86 | \newcommand{\myworld}[1]{\ensuremath{\world_{\text{#1}}}} %named world 87 | \newcommand{\util}{\ensuremath{\operatorname{U}}} % Utility function 88 | \newcommand{\Util}{\ensuremath{\operatorname{U}}} % Utility function 89 | \newcommand{\utils}{\ensuremath{\operatorname{U}}} % Utility function 90 | \newcommand{\Utils}{\ensuremath{\operatorname{U}}} % Utility function 91 | \newcommand{\RealUtil}{\ensuremath{\operatorname{V}}} % material payoffs 92 | \newcommand{\Sstrat}{\ensuremath{\sigma}} % Behav/Probab Sender strategy 93 | \newcommand{\Sstrats}{\ensuremath{\mathcal{S}}} % Set of S-strategies 94 | \newcommand{\Spure}{\ensuremath{s}} % Pure sender strategy 95 | \newcommand{\Spures}{\ensuremath{\mathsf{S}}} % Set of pure sen strategies 96 | \newcommand{\Smixed}{\ensuremath{\tilde{s}}} % Mixed sender strategy 97 | \newcommand{\Smixeds}{\ensuremath{\Delta(\Messgs^\States)}} 98 | \newcommand{\SpuresW}{\ensuremath{\mathsf{S}}} 99 | \newcommand{\SpuresS}{\ensuremath{\mathsf{S}^{{\mathrm{S}}}}} 100 | \newcommand{\Rstrat}{\ensuremath{\rho}} % Behav/Probab Receiver strategy 101 | \newcommand{\Rstrats}{\ensuremath{\mathcal{R}}} % Set of R-Strategies 102 | \newcommand{\Rpure}{\ensuremath{r}} % Pure receiver strategy 103 | \newcommand{\Rpures}{\ensuremath{\mathsf{R}}} % Set of pure rec strategies 104 | \newcommand{\Rmixed}{\ensuremath{\tilde{r}}} % Mixed receiver strategy 105 | \newcommand{\Rmixeds}{\ensuremath{\Delta(\Acts^\Messgs)}} 106 | \newcommand{\RpuresW}{\ensuremath{\mathsf{R}}} 107 | \newcommand{\RpuresS}{\ensuremath{\mathsf{R}^{\mathrm{S}}}} 108 | \newcommand{\PureBR}{\ensuremath{\operatorname{BR}}} % Set of pure best responses 109 | \newcommand{\ProbBR}{\ensuremath{\operatorname{BR_{Prob}}}} % Set of mixed best responses 110 | \newcommand{\bel}{\ensuremath{\pi}} 111 | \newcommand{\Bels}{\ensuremath{\Pi}} 112 | \newcommand{\Sbel}{\ensuremath{\pi_{\sen}}} 113 | \newcommand{\Sbels}{\ensuremath{\Pi_{\sen}}} 114 | \newcommand{\Rbel}{\ensuremath{\pi_{\rec}}} 115 | \newcommand{\Rbels}{\ensuremath{\Pi_{\rec}}} 116 | \newcommand{\EU}{\ensuremath{\operatorname{EU}}} % Expected Utility 117 | \newcommand{\EV}{\ensuremath{\operatorname{EV}}} % Expected Response Utility 118 | \newcommand{\BR}{\ensuremath{\operatorname{BR}}} % Best Response 119 | \newcommand{\QR}{\ensuremath{\operatorname{QR}}} % Quantal Response 120 | \newcommand{\WBR}{\ensuremath{\text{{\relsize{-1}W}BR}}} % Weak Best Response 121 | \newcommand{\SBR}{\ensuremath{\text{{\relsize{-1}W}BR}}} % Strong Best Response 122 | \newcommand{\interpr}{\ensuremath{\delta}} % Interpretation strategy 123 | 124 | % OT Stuff 125 | \newcommand{\Gen}{\ensuremath{\operatorname{Gen}}} % Generator 126 | \newcommand{\Eval}{\ensuremath{\operatorname{Eval}}} % Evaluator 127 | \newcommand{\Con}{\ensuremath{\operatorname{Con}}} % Constraints 128 | \newcommand{\metsuc}{\ensuremath{\succ}} % Symbol for BiOT metric 129 | \newcommand{\metsuceq}{\ensuremath{\succeq}} % Symbol for BiOT metric 130 | \newcommand{\Go}[1]{\operatorname{Pool}_{#1}} 131 | \newcommand{\Oo}[1]{\operatorname{Opt}_{#1}} 132 | \newcommand{\Bo}[1]{\operatorname{Blo}_{#1}} 133 | \newcommand{\Gr}[1]{\operatorname{GAM^{\rho}}_{#1}} 134 | \newcommand{\Or}[1]{\operatorname{OPT^{\rho}}_{#1}} 135 | \newcommand{\Br}[1]{\operatorname{BLO^{\rho}}_{#1}} 136 | 137 | % Abbreviations/Acronyms: 138 | \newcommand{\acro}[1]{\textsc{#1}\xspace} 139 | \newcommand{\acros}[1]{\textsc{#1}{\relsize{-1}s}\xspace} 140 | \newcommand{\bc}{\acro{bc}} % Biscuit Conditional(s) 141 | \newcommand{\bcs}{\acros{bc}} 142 | \newcommand{\cbc}{\acro{cbc}} % Counterfactual BCs 143 | \newcommand{\cbcs}{\acros{cbc}} 144 | \newcommand{\ibr}{\acro{ibr}} % IBR model 145 | \newcommand{\iqr}{\acro{iqr}} % IQR model 146 | \newcommand{\rsa}{\acro{rsa}} % RSA model 147 | \newcommand{\ot}{\acro{ot}} % BiOT 148 | \newcommand{\biot}{\acro{b{\relsize{-1}i}ot}} % BiOT 149 | \newcommand{\tom}{\acro{t{\relsize{-1}o}m}} % ToM 150 | \newcommand{\fc}{\acro{fc}} % Free Choice 151 | \newcommand{\cp}{\acro{cp}} % Conditional Perfection 152 | \newcommand{\uc}{\acro{uc}} % Unconditional Readings 153 | \newcommand{\pbe}{\acro{pbe}} % Perfect Bayesian Equil. 154 | \newcommand{\pbes}{\acros{pbe}} 155 | \newcommand{\forind}{\acro{fi}} % forward induction 156 | \newcommand{\tcp}{\acro{tcp}} % truth ceteris paribus 157 | \newcommand{\cmr}{\acro{cmr}} % credible message rationalizability 158 | \newcommand{\cm}{\acro{cm}} % credible message (profile) (Rabin) 159 | \newcommand{\condition}[2]{\acro{#1}{#2}} % conditions 160 | \newcommand{\br}{\acro{br}} % best response (property) 161 | \newcommand{\wbr}{\acro{{\relsize{-1}w}br}} % weak best response (property) 162 | \newcommand{\sbr}{\acro{{\relsize{-1}s}br}} % strong best response (property) 163 | \newcommand{\curb}{\acro{curb}} % curb sets 164 | \newcommand{\gtp}{\acro{gtp}} % game theoretic pragmatics 165 | \newcommand{\sda}{\acro{sda}} % simplification of disjunctive antecedents 166 | \newcommand{\decprob}{\ensuremath{\mathcal{D}}} 167 | \newcommand{\ques}{\ensuremath{\mathfrak{Q}}} 168 | \newcommand{\vsi}{\acro{vsi}} 169 | \newcommand{\evsi}{\acro{evsi}} 170 | \newcommand{\uv}{\acro{uv}} 171 | \newcommand{\qud}{\acro{qud}} 172 | \newcommand{\NE}{\acro{ne}} 173 | \newcommand{\NEs}{\acros{ne}} 174 | \newcommand{\SNE}{\acro{sne}} 175 | \newcommand{\SNEs}{\acros{sne}} 176 | \newcommand{\SG}{\acro{sg}} 177 | \newcommand{\SGs}{\acros{sg}} 178 | \newcommand{\KO}{\textsc{ko\relsize{-1}bs}} % Kennedy's observation 179 | \newcommand{\EVP}{\acro{evp}} % extreme-value principle 180 | \newcommand{\illc}{\acro{illc}} 181 | 182 | % Evolution 183 | \newcommand{\EGT}{\acro{egt}} 184 | \newcommand{\ESS}{\acro{ess}} 185 | \newcommand{\ESSs}{\acros{ess}} 186 | \newcommand{\NSS}{\acro{nss}} 187 | \newcommand{\NSSs}{\acros{nss}} 188 | \newcommand{\sigsys}{\textsc{SigSys}\xspace} % signaling system 189 | \newcommand{\sigsyss}{\textsc{SigSys{\relsize{-1}s}}\xspace} % signaling system Plural 190 | 191 | \newcommand{\fin}{\rule{0mm}{1mm}\hfill{\rule{1.5cm}{0.2pt}}} 192 | 193 | %Properly typeset tilde for URLs 194 | \def\urltilde{\kern -.15em\lower .7ex\hbox{\~{}}\kern .04em} 195 | 196 | 197 | % Beamer footnote for references: 198 | \newcommand{\beamfn}[1]{ 199 | \vfill 200 | \begin{footnotesize} 201 | \leftskip 0.1in 202 | \parindent -0.1in 203 | \hspace{-0.3cm}\rule{2cm}{0.01cm}\\ \vspace{-0.15cm} 204 | #1 205 | \end{footnotesize} 206 | } 207 | 208 | \newcommand{\myvec}[1]{\ensuremath{\mathbf{#1}}} 209 | \newcommand{\transpose}[1]{\ensuremath{\operatorname{T}(#1)}} 210 | \newcommand{\normalize}[1]{\ensuremath{\operatorname{N}(#1)}} 211 | 212 | \newcommand{\dn}[1]{\draftnote{#1}} 213 | \newcommand{\fn}[1]{\footnote{#1}} 214 | 215 | \newcommand{\stateunmarked}{\ensuremath{\state}\xspace} 216 | \newcommand{\statemarked}{\ensuremath{\state^*}\xspace} 217 | \newcommand{\messgunmarked}{\ensuremath{\messg}\xspace} 218 | \newcommand{\messgmarked}{\ensuremath{\messg^*}\xspace} 219 | \newcommand{\actunmarked}{\ensuremath{\act}\xspace} 220 | \newcommand{\actmarked}{\ensuremath{\act^*}\xspace} 221 | 222 | \newcommand{\sunmarked}{\ensuremath{\state}\xspace} 223 | \newcommand{\smarked}{\ensuremath{\state^*}\xspace} 224 | \newcommand{\munmarked}{\ensuremath{\messg}\xspace} 225 | \newcommand{\mmarked}{\ensuremath{\messg^*}\xspace} 226 | \newcommand{\aunmarked}{\ensuremath{\act}\xspace} 227 | \newcommand{\amarked}{\ensuremath{\act^*}\xspace} 228 | 229 | \newcommand{\ssome}{\mystate{\ensuremath{\exists\neg\forall}}} 230 | \newcommand{\sall}{\mystate{\ensuremath{\forall}}} 231 | \newcommand{\msome}{\mymessg{some}} 232 | \newcommand{\mall}{\mymessg{all}} 233 | \newcommand{\asome}{\myact{\ensuremath{\exists\neg\forall}}} 234 | \newcommand{\aall}{\myact{\ensuremath{\forall}}} 235 | 236 | % for repeating examples with gb4e 237 | 238 | \newcounter{myexememory} 239 | \newenvironment{exer}[1] 240 | { 241 | \setcounter{myexememory}{\value{exx}} 242 | \setcounter{exx}{\getrefnumber{#1}} 243 | \addtocounter{exx}{-1} 244 | \begin{exe} 245 | } 246 | { 247 | \end{exe} 248 | \setcounter{exx}{\value{myexememory}} 249 | } 250 | 251 | \newenvironment{nakedlist}{ 252 | \begin{list}{\quad}{} 253 | } 254 | { 255 | \end{list} 256 | } 257 | 258 | \DefineNamedColor{named}{mycol}{cmyk}{0.6,0.6,0,0} 259 | \DefineNamedColor{named}{mygray}{cmyk}{0.05,0.05,0.05,0.05} 260 | \DefineNamedColor{named}{mygraylight}{cmyk}{0.017,0.017,0.017,0.017} 261 | \DefineNamedColor{named}{mycol2}{cmyk}{0.8,0,0.8,0.2} 262 | \definecolor{Red}{RGB}{178,34,34} 263 | \newcommand{\mf}[1]{\textcolor{Red}{[#1]}} 264 | \newcommand{\mycolh}[1]{{\textcolor{mycol2}{#1}}} 265 | \newcommand{\mymark}[1]{{\color{mycol}{#1}}} 266 | 267 | \DeclareMathOperator{\expo}{exp} 268 | \DeclareMathOperator*{\argmin}{arg\,min} 269 | \DeclareMathOperator*{\argmax}{arg\,max} 270 | 271 | 272 | 273 | 274 | % \usepackage{calc} 275 | \newsavebox\CBox 276 | \newcommand\msout[2][0.5pt]{% 277 | \ifmmode\sbox\CBox{$#2$}\else\sbox\CBox{#2}\fi% 278 | \makebox[0pt][l]{\usebox\CBox}% 279 | \rule[0.5\ht\CBox-#1/2]{\wd\CBox}{#1}} 280 | 281 | \newcommand{\greensquare}{\raisebox{1.5pt}{\textcolor{Green}{\Large{\ensuremath{\blacksquare}}}}} 282 | \newcommand{\bluecircle}{\textcolor{blue}{\Huge{\ensuremath{\bullet}}}} 283 | \newcommand{\greencircle}{\textcolor{Green}{\Huge{\ensuremath{\bullet}}}} 284 | 285 | \newcommand{\greensquareS}{\raisebox{1.5pt}{\textcolor{Green}{\normalsize{\ensuremath{\blacksquare}}}}} 286 | \newcommand{\greensquareSS}{\raisebox{1.5pt}{\textcolor{Green}{\footnotesize{\ensuremath{\blacksquare}}}}} 287 | \newcommand{\bluecircleS}{\textcolor{blue}{\Large{\ensuremath{\bullet}}}} 288 | \newcommand{\greencircleS}{\textcolor{Green}{\Large{\ensuremath{\bullet}}}} 289 | 290 | \newcommand{\soc}{\ensuremath{\theta}\xspace} -------------------------------------------------------------------------------- /faintr/R/faintr_functions.R: -------------------------------------------------------------------------------- 1 | #' Obtaining information about factors in regression model 2 | #' 3 | #' For a model for a factorial design, fitted with brms, this function returns information about the factors used, their levels, and the reference levels. 4 | #' For more information see \code{vignette('faintr_basics')}. 5 | #' @param model Model fit from brms package. 6 | #' @keywords regression, factorial design, brms 7 | #' @import tidyverse brms 8 | #' @export 9 | #' @return list with names of factors and their levels, including the reference levels (in dummy coding) 10 | #' @examples 11 | #' library(brms) 12 | #' m = brm(yield ~ N * P * K, npk) 13 | #' get_factor_information(m) 14 | get_factor_information = function(model) { 15 | 16 | # extract information about dependent and independent variables from formula 17 | ## TODO :: check extensively, especially for mixed effects models whether this stuff here works 18 | dependent_variable = as.character(formula(model)[[1]])[[2]] 19 | independent_variables = strsplit(x = gsub(pattern = "\\(.*\\|.*\\)", "", as.character(formula(model)[[1]])[[3]]), 20 | split = "(\\*|\\+)", 21 | fixed = FALSE)[[1]] %>% trimws() %>% unique() 22 | independent_variables = independent_variables[which(independent_variables != "")] 23 | 24 | # stop this if there are not at least two factors 25 | if (length(independent_variables) < 1) { 26 | stop("Oeps! There do not seem to be any factors!") 27 | } 28 | 29 | # stop if any factor name or factor level contains a character that brms might not handle correctly 30 | check_permitted_characters = function(chr_vec) { 31 | str_which(chr_vec, "[^([:alnum:]|\\.|_)]") 32 | } 33 | # dependent variable 34 | if (length(check_permitted_characters(dependent_variable))>0) { 35 | stop("All variables, factor names, and factor levels must not contain any character except alpha-numeric characters (letters and numbers), dots '.', or underscores '_'. 36 | The dependent variable '", dependent_variable, "' does not satisfy this constraint.") 37 | } 38 | # independent variables 39 | if (length(check_permitted_characters(independent_variables))>0) { 40 | stop("All variables, factor names, and factor levels must not contain any character except alpha-numeric characters (letters and numbers), dots '.', or underscores '_'. 41 | The dependent variable(s) '", paste0(independent_variables[check_permitted_characters(independent_variables)], collapse = ", ") , "' does (do) not satisfy this constraint.") 42 | } 43 | 44 | # construct three helpful representations of factors and their levels for the following 45 | ## factors :: a list with all factors and their levels 46 | factors = list() 47 | n_levels = c() 48 | for (iv in independent_variables) { 49 | new_levels = list(levels(as.factor(model.frame(model) %>% pull(iv)))) 50 | factors = append(factors, new_levels) 51 | n_levels = c(n_levels, length(new_levels[[1]])) 52 | } 53 | names(factors) = independent_variables 54 | if (min(n_levels) <= 1){ 55 | stop("Oeps! There seems to be a factor with less than 2 levels. Please check and possibly exclude that factor.") 56 | } 57 | 58 | # check naming conventions in factor levels 59 | factor_levels = unlist(factors) 60 | if (length(check_permitted_characters(factor_levels))>0) { 61 | stop("All variables, factor names, and factor levels must not contain any character except alpha-numeric characters (letters and numbers), dots '.', or underscores '_'. 62 | The factor level(s) '", paste0(factor_levels[check_permitted_characters(factor_levels)], collapse = ", ") , "' does (do) not satisfy this constraint.") 63 | } 64 | 65 | 66 | ## ref_levels_list :: a list with all factors and their refernce levels 67 | ref_levels_list = factors 68 | for (iv in independent_variables) { 69 | ref_levels_list[[iv]] = ref_levels_list[[iv]][1] 70 | } 71 | ## ref_levels :: a string representation of each factor and its reference level 72 | ref_levels = c() 73 | for (iv in independent_variables) { 74 | ref_levels = c(ref_levels, paste0(iv, ref_levels_list[[iv]][1])) 75 | } 76 | 77 | return(list( 78 | dependent_variable = dependent_variable, 79 | independent_variables = independent_variables, 80 | factors = factors, 81 | ref_levels_list = ref_levels_list, 82 | ref_levels = ref_levels 83 | )) 84 | } 85 | 86 | 87 | #' Extracting posterior cell means 88 | #' 89 | #' This function takes a brms model fit for a factorial design and outputs a comparison of all factor levels against each other, and posterior samples of all cell means. 90 | #' For more information see \code{vignette('faintr_basics')}. 91 | #' @param model Model fit from brms package. 92 | #' @keywords regression, factorial design, brms 93 | #' @import tidyverse brms 94 | #' @export 95 | #' @return list with (i) samples of estimated means of all cells, (ii) pairwise comparison of each cell (whether one has credibly a higher inferred mean than the other), and (iii) a summary (mean & 95% HDI) for each posterior estimate of cell means. 96 | #' @examples 97 | #' #' library(brms) 98 | #' m = brm(yield ~ N * P * K, npk) 99 | #' post_cells(m) 100 | post_cells = function(model) { 101 | 102 | # check if repsonse variable is metric and abort if not 103 | if (model$family$family != "gaussian") { 104 | stop("Unfortunately, 'faintr' currently only works with metric response variables (family 'gaussian').") 105 | } 106 | 107 | 108 | # get information about factors 109 | factor_info = get_factor_information(model) 110 | dependent_variable = factor_info[["dependent_variable"]] 111 | independent_variables = factor_info[["independent_variables"]] 112 | factors = factor_info[["factors"]] 113 | ref_levels_list = factor_info[["ref_levels_list"]] 114 | ref_levels = factor_info[["ref_levels"]] 115 | 116 | # get the posterior samples for all regression coefficients 117 | post_samples = posterior_samples(model) %>% dplyr::select(starts_with("b_")) 118 | 119 | # get a table of cells (factor-level combinations) in an ugly format (for internal use) 120 | cells = expand.grid(factors) 121 | for (j in 1:ncol(cells)) { 122 | levels(cells[,j]) = paste0(colnames(cells)[j], levels(cells[,j])) 123 | } 124 | 125 | # get a table of cells (factor-level combinations) with more readable labels (for final output) 126 | cells_readable = expand.grid(factors) 127 | for (j in 1:ncol(cells_readable)) { 128 | levels(cells_readable[,j]) = paste0(colnames(cells_readable)[j], ":", levels(cells_readable[,j])) 129 | } 130 | 131 | # get the names of all estimated coefficients 132 | coefficient_names = names(post_samples) 133 | # add the reference levels to the coefficient names (where it is missing/implicit) 134 | for (c in 1:length(coefficient_names)) { 135 | for (f in names(factors)) { 136 | if (!grepl(f, coefficient_names[c])) { 137 | coefficient_names[c] = paste0(coefficient_names[c], "_", f, ref_levels_list[[f]]) 138 | } 139 | } 140 | } 141 | names(post_samples) = coefficient_names 142 | 143 | # two convenience functions to get all coefficients that belong to a design cell 144 | replace_with_ref_level_recursion = function(cell) { 145 | which_fcts_are_at_ref_level = map_lgl(cell, function(fl) fl %in% ref_levels) 146 | if (all(which_fcts_are_at_ref_level)) { 147 | return(list(cell)) 148 | } 149 | else { 150 | factors_to_replace_with_ref_level = which(which_fcts_are_at_ref_level == F) 151 | output = list(cell) 152 | for (f in factors_to_replace_with_ref_level) { 153 | replaced_cell = cell 154 | replaced_cell[f] = ref_levels[f] 155 | output = append(output, replace_with_ref_level(replaced_cell)) 156 | } 157 | output 158 | } 159 | } 160 | replace_with_ref_level = function(cell) { 161 | unique(replace_with_ref_level_recursion(cell)) 162 | } 163 | 164 | # get samples for the predictor values for each design cell 165 | predictor_values = map_df( 166 | 1:nrow(cells), 167 | function(i) { 168 | cell = cells[i,1:(ncol(cells))] 169 | coefficients_to_check = replace_with_ref_level(cell) 170 | out = 0 171 | column_indices = map_dbl(1:length(coefficients_to_check), 172 | function(j){ 173 | which( 174 | map_lgl(coefficient_names, function(coefficient_in_question) { 175 | all(map_lgl(coefficients_to_check[[j]], function(c) grepl(c, coefficient_in_question))) 176 | }) == T 177 | ) 178 | } 179 | ) 180 | tibble( 181 | cell = paste(map_chr(1:ifelse(is.null(ncol(cell)), 1, ncol(cell)), 182 | function(j) as.character(cells_readable[i,j])), collapse = "__"), 183 | predictor_value = rowSums(post_samples[column_indices]), 184 | n_sample = 1:length(predictor_value) 185 | ) 186 | } 187 | ) 188 | 189 | predictor_values = predictor_values %>% spread(key = cell, value = predictor_value) 190 | 191 | ## an alternative (more versatile) output which compares all cells 192 | 193 | cells = expand.grid(factors) 194 | for (j in 1:ncol(cells_readable)) {cells_readable[,j] = as.character(cells_readable[,j])} 195 | cells$cell_name = map_chr( 196 | 1:nrow(cells_readable), 197 | function(i) {paste(as.character(cells_readable[i,]), collapse = "__")} 198 | ) 199 | for (j in 1:ncol(cells)) {cells[,j] = as.character(cells[,j])} 200 | 201 | cells_high = cells 202 | cells_low = cells 203 | names(cells_high) = map_chr(names(cells), function(c) {paste0(c, "_high")}) 204 | names(cells_low) = map_chr(names(cells), function(c) {paste0(c, "_low")}) 205 | 206 | # borrowed from here: https://stackoverflow.com/questions/11693599/alternative-to-expand-grid-for-data-frames 207 | expand.grid.df <- function(...) Reduce(function(...) merge(..., by=NULL), list(...)) 208 | 209 | all_cells_compared = expand.grid.df(cells_high, cells_low) 210 | for (j in 1:ncol(all_cells_compared)) {all_cells_compared[,j] = as.character(all_cells_compared[,j])} 211 | all_cells_compared = all_cells_compared %>% filter(cell_name_high != cell_name_low) 212 | 213 | all_cells_compared$posterior = map_dbl( 214 | 1:nrow(all_cells_compared), 215 | function(i) { 216 | mean(predictor_values[all_cells_compared$cell_name_high[i]] > 217 | predictor_values[all_cells_compared$cell_name_low[i]]) 218 | } 219 | ) 220 | 221 | ## safely remove column with sample number 222 | if ("n_sample" %in% names(predictor_values)) { 223 | predictor_values = predictor_values %>% dplyr::select(-n_sample) 224 | } 225 | 226 | cell_summary = full_join( 227 | map_df(predictor_values, function(x){HDInterval::hdi(x)[1]}) %>% gather(key = "cell", value = "lower 95% CI"), 228 | map_df(predictor_values, mean) %>% gather(key = "cell", value = "mean"), 229 | by = "cell" 230 | ) %>% 231 | full_join( 232 | map_df(predictor_values, function(x){HDInterval::hdi(x)[2]}) %>% gather(key = "cell", value = "upper 95% CI"), 233 | by = "cell" 234 | ) 235 | 236 | ## output 237 | 238 | return( 239 | list( 240 | predictor_values = predictor_values, 241 | all_cells_compared = all_cells_compared, 242 | cell_summary = cell_summary 243 | ) 244 | ) 245 | 246 | } 247 | 248 | 249 | #' Compare means of two subsets of factorial design cells 250 | #' 251 | #' This function takes a brms model fit for a factorial design and a specification of two groups (subsets of design cells) to compare. 252 | #' A group is specified as a named list, specifiying the factors and their levels which to include in the group. 253 | #' It outputs the posterior mean of the 'higher' minus the 'lower' subset of cells, its 95 percent credible interval and the posterior probability that the 'higher' group has a higher mean than the the 'lower' group. 254 | #' For more information see \code{vignette('faintr_basics')}. 255 | #' @param model Model fit from brms package. 256 | #' @keywords regression, factorial design, brms 257 | #' @import tidyverse brms 258 | #' @importFrom HDInterval hdi 259 | #' @export 260 | #' @return list with posterior samples for each group, and the posterior probability that group 'higher' has a higher estimated coefficient in the posterior samples than the group 'lower' 261 | #' @examples 262 | #' library(brms) 263 | #' m = brm(yield ~ N * P * K, npk) 264 | #' # this compares two single cells in the factorial design 265 | #' compare_groups( 266 | #' model = m, 267 | #' higher = list("N" = "1", "P" = "1", "K" = "1"), 268 | #' lower = list("N" = "0", "P" = "0", "K" = "1") 269 | #' ) 270 | #' # this compares the average of N=1 cells to the grand mean 271 | #' # like in deviance conding 272 | #' compare_groups( 273 | #' model = m, 274 | #' higher = list("N" = "1"), 275 | #' lower = list() 276 | #' ) 277 | #' 278 | compare_groups = function(model, higher, lower) { 279 | 280 | # get information about factors 281 | factor_info = get_factor_information(model) 282 | dependent_variable = factor_info[["dependent_variable"]] 283 | independent_variables = factor_info[["independent_variables"]] 284 | factors = factor_info[["factors"]] 285 | ref_levels_list = factor_info[["ref_levels_list"]] 286 | ref_levels = factor_info[["ref_levels"]] 287 | 288 | # check the input groups 289 | input_combined = c(higher, lower) 290 | ## check if all factor names specified are actually in the model 291 | input_factor_names = unique(names(input_combined)) 292 | known_factor_names = map_lgl(input_factor_names, function(f) {f %in% independent_variables}) 293 | if (sum(known_factor_names) < length(known_factor_names)) { 294 | stop("The following factor names specified in the groups to be compared do not match any independent variable in the specified model: 295 | ", paste(input_factor_names[known_factor_names == F], collapse = ", ")) 296 | } 297 | ## check if all factor levels specified are actually in the model 298 | for (i in length(input_combined)) { 299 | if (! input_combined[[i]] %in% factors[[names(input_combined)[i]]]) { 300 | stop("The level '", input_combined[[i]], "' is not part of the factor '", names(input_combined)[i], "' in the given model.") 301 | } 302 | } 303 | 304 | # get posterior samples for all cell means 305 | post_cell_samples = post_cells(model)$predictor_values 306 | 307 | ## helper function :: recursive extraction of cell names 308 | collect_cell_names = function(remaining_names, remaining_factors) { 309 | if (length(remaining_factors) == 1) { 310 | return ( str_subset(remaining_names, remaining_factors) ) 311 | } else { 312 | remaining_names = str_subset(remaining_names, remaining_factors[1]) 313 | remaining_factors = remaining_factors[2:length(remaining_factors)] 314 | return(collect_cell_names(remaining_names, remaining_factors)) 315 | } 316 | } 317 | 318 | ## helper function :: get names for cells 319 | get_group_names = function(group){ 320 | if (length(group) == 0) { 321 | return("grand mean") 322 | } 323 | map_chr(1:length(group), 324 | function(c) {paste(names(group[c]), unlist(group[c]), sep = ":")}) 325 | } 326 | 327 | ## helper function :: get means for each cell 328 | extract_group_samples = function(group) { 329 | if (length(group) == 0) { 330 | return(apply(as.matrix(post_cell_samples), 1, mean)) 331 | } 332 | factors_group = get_group_names(group) 333 | cells_group = collect_cell_names(names(post_cell_samples), factors_group) 334 | apply(as.matrix(post_cell_samples %>% dplyr::select(cells_group)), 1, mean) 335 | } 336 | 337 | post_samples_higher = extract_group_samples(higher) 338 | post_samples_lower = extract_group_samples(lower) 339 | 340 | outlist = list( 341 | post_samples_higher = post_samples_higher, 342 | post_samples_lower = post_samples_lower, 343 | higher = get_group_names(higher), 344 | lower = get_group_names(lower), 345 | mean_diff = mean(post_samples_higher - post_samples_lower), 346 | l95_ci = as.vector(HDInterval::hdi(post_samples_higher - post_samples_lower)[1]), 347 | u95_ci = as.vector(HDInterval::hdi(post_samples_higher - post_samples_lower)[2]), 348 | probability = mean(post_samples_higher > post_samples_lower) 349 | ) 350 | class(outlist) = "faintCompare" 351 | return(outlist) 352 | } 353 | 354 | #' Print comparison object between factor groups 355 | #' @param model Model fit from brms package. 356 | #' @keywords regression, factorial design, brms 357 | #' @export 358 | #' @return string 359 | #' @examples print(model_fit) 360 | print.faintCompare = function(obj) { 361 | cat("Outcome of comparing groups:\n") 362 | cat(" * higher: ", obj$higher, "\n") 363 | cat(" * lower: ", obj$lower, "\n") 364 | cat("Mean 'higher - lower': ", signif(obj$mean_diff, 4), "\n") 365 | cat("95% CI: [", signif(obj$l95_ci, 4), ";", signif(obj$u95_ci,4), "]\n") 366 | cat("P('higher - lower' > 0): ", signif(obj$probability,4), "\n") 367 | } 368 | -------------------------------------------------------------------------------- /code/data_analysis.R: -------------------------------------------------------------------------------- 1 | ##################################################### 2 | ## package includes and options 3 | ##################################################### 4 | 5 | # package for convenience functions (e.g. plotting) 6 | library(tidyverse) 7 | 8 | # package for Bayesian regression modeling 9 | library(brms) 10 | # option for Bayesian regression models: 11 | # use all available cores for parallel computing 12 | options(mc.cores = parallel::detectCores()) 13 | 14 | # package for credible interval computation 15 | library(HDInterval) 16 | 17 | # set seed 18 | set.seed(1702) 19 | 20 | ##################################################### 21 | ## read and massage the data 22 | ##################################################### 23 | 24 | politedata = read_csv('https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/master/code/politeness_data.csv') 25 | # politedata = read_csv('politeness_data.csv') 26 | head(politedata) 27 | 28 | ##################################################### 29 | ## plot means for each group 30 | ##################################################### 31 | 32 | politedata.agg <- 33 | politedata %>% 34 | group_by(gender, context, sentence) %>% 35 | summarize(mean_frequency = mean(pitch)) 36 | 37 | politedata.agg2 <- 38 | politedata %>% 39 | group_by(gender, context) %>% 40 | summarize(mean_frequency = round(mean(pitch), 0)) 41 | 42 | ggplot(data = politedata.agg, 43 | aes(x = gender, 44 | y = mean_frequency, 45 | colour = context)) + 46 | geom_point(position = position_dodge(0.5), 47 | alpha = 0.3, 48 | size = 3) + 49 | geom_point(data = politedata.agg2, 50 | aes(x = gender, 51 | y = mean_frequency, 52 | #colour = context, 53 | fill = context), 54 | position = position_dodge(0.5), 55 | pch = 21, 56 | colour = "black", 57 | size = 5) + 58 | scale_x_discrete(breaks = c("F", "M"), 59 | labels = c("female", "male")) + 60 | scale_y_continuous(expand = c(0, 0), breaks = (c(50,100,150,200,250,300)), limits = c(50,300)) + 61 | scale_colour_manual(breaks = c("inf", "pol"), 62 | labels = c("informal", "polite"), 63 | values = c("#f1a340", "#998ec3")) + 64 | scale_fill_manual(breaks = c("inf", "pol"), 65 | labels = c("informal", "polite"), 66 | values = c("#f1a340", "#998ec3")) + 67 | ylab("pitch in Hz\n") + 68 | xlab("\ngender") + 69 | theme_classic() + 70 | theme(legend.position = "right", 71 | legend.key.height = unit(2,"line"), 72 | legend.title = element_text(size = 18, face = "bold"), 73 | legend.text = element_text(size = 16), 74 | legend.background = element_rect(fill = "transparent"), 75 | strip.background = element_blank(), 76 | strip.text = element_text(size = 18, face = "bold"), 77 | axis.line.x = element_blank(), 78 | panel.spacing = unit(2, "lines"), 79 | plot.background = element_rect(fill = "transparent", colour = NA), 80 | panel.background = element_rect(fill = "transparent"), 81 | axis.text = element_text(size = 16), 82 | axis.title = element_text(size = 18, face = "bold"), 83 | plot.title = element_text(size = 18, face = "bold"), 84 | plot.margin = unit(c(0.2,0.1,0.2,0.1),"cm")) 85 | 86 | current_path = rstudioapi::getActiveDocumentContext()$path 87 | setwd(dirname(current_path)) 88 | 89 | ggsave(filename = "../text/pics/basic_data_plot.pdf", 90 | plot = last_plot(), 91 | width = 6, height = 4) 92 | 93 | ##################################################### 94 | ## run & inspect model with only fixed effects 95 | ##################################################### 96 | 97 | formula_FE = pitch ~ gender * context 98 | 99 | # model with only fixed effects (non-hierarchical) 100 | model_FE = brm( 101 | formula = formula_FE, 102 | data = politedata, 103 | seed = 1702) 104 | 105 | # extract posterior samples 106 | post_samples_FE = posterior_samples(model_FE) 107 | head(post_samples_FE %>% round(1)) 108 | 109 | # plotting the posterior distributions 110 | plot_posterior_density_FE = 111 | model_FE %>% as_tibble() %>% 112 | select(- lp__, - sigma) %>% 113 | gather(key = "parameter", value = "posterior") %>% 114 | mutate(parameter = case_when(parameter == "b_Intercept" ~ "Intercept", 115 | parameter == "b_contextpol" ~ "context:pol", 116 | parameter == "b_genderM" ~ "gender:M", 117 | parameter == "b_genderM.contextpol" ~ "gender:M__context:pol")) %>% 118 | mutate(parameter = as.factor(parameter)) %>% 119 | mutate(parameter = factor(parameter, levels = c("Intercept", "context:pol", "gender:M", "gender:M__context:pol"))) %>% 120 | ggplot(aes(x = posterior)) + 121 | geom_density(fill = "grey") + 122 | facet_wrap(~ parameter, scales = "free") + 123 | ylab("density\n") + 124 | xlab("\nparameter value") + 125 | theme_classic() + 126 | theme(legend.position = "right", 127 | legend.key.height = unit(2,"line"), 128 | legend.title = element_text(size = 18, face = "bold"), 129 | legend.text = element_text(size = 16), 130 | legend.background = element_rect(fill = "transparent"), 131 | strip.background = element_blank(), 132 | strip.text = element_text(size = 18, face = "bold"), 133 | axis.line = element_blank(), 134 | panel.spacing = unit(2, "lines"), 135 | plot.background = element_rect(fill = "transparent", colour = NA), 136 | panel.background = element_rect(fill = "transparent"), 137 | axis.text = element_text(size = 16), 138 | axis.title = element_text(size = 18, face = "bold"), 139 | plot.title = element_text(size = 18, face = "bold"), 140 | plot.margin = unit(c(0.2,0.1,0.2,0.1),"cm")) + 141 | geom_segment( 142 | mapping = aes(y = 0, yend = 0, x = low, xend = high), 143 | color = "firebrick", 144 | size = 3, 145 | #alpha = 0.7, 146 | data = fixef(model_FE) %>% as_tibble() %>% 147 | mutate(parameter = c("Intercept", "gender:M", "context:pol", "gender:M__context:pol")) %>% 148 | mutate(parameter = as.factor(parameter)) %>% 149 | mutate(parameter = factor(parameter, levels = c("Intercept", "context:pol", "gender:M", "gender:M__context:pol"))) %>% 150 | rename(low = Q2.5, high = Q97.5) 151 | ) 152 | 153 | # save the plotted figure 154 | ggsave(plot = last_plot(), filename = "../text/pics/posterior_density_FE.pdf", 155 | width = 9, height = 6) 156 | 157 | # proportion of negative samples for parameter p_contextpol 158 | # this number approximates P(beta_pol < 0 | model, data) 159 | mean(post_samples_FE$b_contextpol < 0) 160 | 161 | # proportion of samples where the mean for cell 2 was bigger 162 | # than that of cell 3 163 | # this number approximates P(beta_pol > beta_male | model, data) 164 | mean(post_samples_FE$b_contextpol > post_samples_FE$b_genderM) 165 | 166 | ########################################### 167 | ## showcasing the faintr package 168 | ## (still hoping for the best) 169 | ########################################### 170 | 171 | # package to allow installation from github 172 | library(devtools) 173 | 174 | # package with convenience function for Bayesian regression models for factorial designs 175 | # install_github( 176 | # repo = 'michael-franke/bayes_mixed_regression_tutorial', 177 | # subdir = 'faintr' 178 | # ) # install from GitHub 179 | 180 | library(faintr) 181 | 182 | # extract cell means and plot them 183 | posterior_cell_means = post_cells(model_FE)$predictor_values %>% 184 | gather(key = "parameter", value = "posterior") 185 | 186 | posterior_cell_means_HDIs = posterior_cell_means %>% 187 | group_by(parameter) %>% 188 | summarize(low = hdi(posterior)[1], 189 | high = hdi(posterior)[2]) 190 | 191 | posterior_cell_means_plot = posterior_cell_means %>% 192 | mutate(parameter = as.factor(parameter)) %>% 193 | mutate(parameter = factor(parameter, labels = c("female - informal", "female - polite", "male - informal", "male - polite"))) %>% 194 | ggplot(aes(x = posterior)) + 195 | geom_density(fill = "grey") + 196 | facet_wrap(~ parameter, scales = "free") + 197 | ylab("density\n") + 198 | xlab("\nparameter value") + 199 | scale_x_continuous(expand = c(0, 0), breaks = (c(100,200,300)), limits = c(80,300)) + 200 | scale_y_continuous(expand = c(0, 0), breaks = (c(0,0.02,0.04,0.06)), limits = c(0,0.06)) + 201 | theme_classic() + 202 | theme(legend.position = "right", 203 | legend.key.height = unit(2,"line"), 204 | legend.title = element_text(size = 18, face = "bold"), 205 | legend.text = element_text(size = 16), 206 | legend.background = element_rect(fill = "transparent"), 207 | strip.background = element_blank(), 208 | strip.text = element_text(size = 18, face = "bold"), 209 | axis.line = element_blank(), 210 | panel.spacing = unit(2, "lines"), 211 | plot.background = element_rect(fill = "transparent", colour = NA), 212 | panel.background = element_rect(fill = "transparent"), 213 | axis.text = element_text(size = 16), 214 | axis.title = element_text(size = 18, face = "bold"), 215 | plot.title = element_text(size = 18, face = "bold"), 216 | plot.margin = unit(c(0.2,0.4,0.2,0.1),"cm")) + 217 | geom_segment( 218 | mapping = aes(y = 0, yend = 0, x = low, xend = high), 219 | color = "firebrick", 220 | size = 3, 221 | #alpha = 0.7, 222 | data = posterior_cell_means_HDIs %>% 223 | mutate(parameter = as.factor(parameter)) %>% 224 | mutate(parameter = factor(parameter, labels = c("female - informal", "female - polite", "male - informal", "male - polite"))) 225 | ) 226 | 227 | # save the plotted figure 228 | ggsave(plot = last_plot(), filename = "../text/pics/posterior_density_cell_means.pdf", 229 | width = 9, height = 6) 230 | 231 | compare_groups( 232 | model = model_FE, 233 | lower = list(gender = "M", context = "inf"), 234 | higher = list(gender = "F", context = "pol") 235 | ) 236 | 237 | get_posterior_beliefs_about_hypotheses = function(model) { 238 | # insert the comparisons you are interested in as strings 239 | tibble( 240 | hypothesis = c("Female-polite < Female-informal", 241 | "Male-polite < Male-informal", 242 | "Male-informal < Female-polite"), 243 | probability = c( 244 | # insert the comparisons you are interested in referring to the extracted samples 245 | compare_groups( 246 | model = model, 247 | lower = list(gender = "F", context = "pol"), 248 | higher = list(gender = "F", context = "inf") 249 | )$probability, 250 | compare_groups( 251 | model = model, 252 | lower = list(gender = "M", context = "pol"), 253 | higher = list(gender = "M", context = "inf") 254 | )$probability, 255 | compare_groups( 256 | model = model, 257 | lower = list(gender = "M", context = "inf"), 258 | higher = list(gender = "F", context = "pol") 259 | )$probability 260 | ) 261 | ) 262 | } 263 | 264 | get_posterior_beliefs_about_hypotheses(model_FE) 265 | 266 | ############################ 267 | ## add prior information 268 | ############################ 269 | 270 | # get all possible priors for your model 271 | get_prior(formula = pitch ~ gender * context, 272 | data = politedata) 273 | 274 | # define priors 275 | priorFE <- c( 276 | # define a skeptical prior for the relevant coefficients 277 | prior(normal(0, 10), coef = contextpol) 278 | ) 279 | 280 | # let's run our models with our specified priors 281 | model_FE_prior = brm(formula = pitch ~ gender * context, 282 | prior = priorFE, 283 | data = politedata, 284 | control = list(adapt_delta = 0.99), 285 | seed = 1702) 286 | 287 | get_posterior_beliefs_about_hypotheses(model_FE_prior) 288 | 289 | #################### 290 | ## model check 291 | #################### 292 | 293 | # run model without considering gender 294 | model_FE_noGender = brm(formula = pitch ~ context, 295 | data = politedata, 296 | control = list(adapt_delta = 0.99), 297 | seed = 1702) 298 | 299 | pp_check1 <- pp_check(model_FE_noGender, nsample = 100) + 300 | scale_color_manual(values = c("#f1a340", "lightgrey")) + 301 | scale_x_continuous(breaks = (c(0,100,200,300,400)), limits = c(-50,400)) + 302 | scale_y_continuous(limits = c(0,0.01)) + 303 | labs(title = "Model without gender") + 304 | theme(legend.position = "none", 305 | legend.key.height = unit(2,"line"), 306 | legend.title = element_text(size = 18, face = "bold"), 307 | legend.text = element_text(size = 16), 308 | legend.background = element_rect(fill = "transparent"), 309 | strip.background = element_blank(), 310 | strip.text = element_text(size = 18, face = "bold"), 311 | axis.line = element_blank(), 312 | panel.spacing = unit(2, "lines"), 313 | plot.background = element_rect(fill = "transparent", colour = NA), 314 | panel.background = element_rect(fill = "transparent"), 315 | axis.text = element_text(size = 16), 316 | axis.title = element_text(size = 18, face = "bold"), 317 | plot.title = element_text(size = 18, face = "bold", hjust = 0.5), 318 | plot.margin = unit(c(0.2,0.4,0.2,0.4),"cm")) 319 | 320 | # model with gender 321 | pp_check2 <- pp_check(model_FE, nsample = 100) + 322 | scale_color_manual(values = c("#f1a340", "lightgrey")) + 323 | scale_x_continuous(breaks = (c(0,100,200,300,400)), limits = c(-50,400)) + 324 | scale_y_continuous(limits = c(0,0.01)) + 325 | labs(title = "Model including gender") + 326 | theme(legend.position = "none", 327 | legend.key.height = unit(2,"line"), 328 | legend.title = element_text(size = 18, face = "bold"), 329 | legend.text = element_text(size = 16), 330 | legend.background = element_rect(fill = "transparent"), 331 | strip.background = element_blank(), 332 | strip.text = element_text(size = 18, face = "bold"), 333 | axis.line = element_blank(), 334 | panel.spacing = unit(2, "lines"), 335 | plot.background = element_rect(fill = "transparent", colour = NA), 336 | panel.background = element_rect(fill = "transparent"), 337 | axis.text = element_text(size = 16), 338 | axis.title = element_text(size = 18, face = "bold"), 339 | plot.title = element_text(size = 18, face = "bold", hjust = 0.5), 340 | plot.margin = unit(c(0.2,0.4,0.2,0.1),"cm")) 341 | 342 | # combine plots 343 | library(ggpubr) 344 | 345 | pp_checks_plot <- 346 | ggarrange(pp_check1, pp_check2, 347 | heights = c(1,1), 348 | widths = c(1,1), 349 | #labels = c("A - Model without gender", "B - Model including gender"), 350 | font.label = list(size = 20), legend = "none", 351 | align = "h", 352 | ncol = 2, nrow = 1) 353 | 354 | ggsave(plot = pp_checks_plot, filename = "../text/pics/pp_checks_plot.pdf", 355 | width = 8, height = 4) 356 | 357 | ############################################### 358 | ## models with additional random effects 359 | ############################################### 360 | 361 | # hierarchical model with random intercepts 362 | 363 | # model 364 | model_interceptOnly = brm(formula = pitch ~ gender * context + 365 | (1 | sentence + subject), 366 | data = politedata, 367 | control = list(adapt_delta = 0.99), 368 | seed = 1702) 369 | 370 | # hierarchical model with the maximial RE structure licensed by the design 371 | # (notice that factor 'gender' does not vary for a given value of variable 'subject') 372 | 373 | # model 374 | model_MaxRE = brm(formula = pitch ~ gender * context + 375 | (1 + gender * context | sentence) + 376 | (1 + context | subject), 377 | data = politedata, 378 | control = list(adapt_delta = 0.99), 379 | seed = 1702) 380 | 381 | # extract cell means and 95% CIs 382 | posterior_cell_means = post_cells(model_MaxRE)$predictor_values %>% 383 | gather(key = "parameter", value = "posterior") %>% 384 | group_by(parameter) %>% 385 | summarize(mean = mean(posterior), 386 | low = hdi(posterior)[1], 387 | high = hdi(posterior)[2]) 388 | 389 | ################################## 390 | ## comparing selected hypotheses 391 | ################################## 392 | 393 | get_posterior_beliefs_about_hypotheses(model_FE) 394 | get_posterior_beliefs_about_hypotheses(model_interceptOnly) 395 | get_posterior_beliefs_about_hypotheses(model_MaxRE) 396 | 397 | ################################# 398 | ## posteriors of cell differences 399 | ## for final data report 400 | ################################# 401 | 402 | compare_groups( 403 | model = model_MaxRE, 404 | lower = list(gender = "F", context = "pol"), 405 | higher = list(gender = "F", context = "inf") 406 | ) 407 | 408 | compare_groups( 409 | model = model_MaxRE, 410 | lower = list(gender = "M", context = "pol"), 411 | higher = list(gender = "M", context = "inf") 412 | ) 413 | 414 | compare_groups( 415 | model = model_MaxRE, 416 | lower = list(gender = "M", context = "inf"), 417 | higher = list(gender = "F", context = "pol") 418 | ) 419 | -------------------------------------------------------------------------------- /faintr/doc/faintr_basics.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 |
6 | 7 | 8 | 9 | 10 | 11 | 12 | 13 | 14 | 15 |faintr packageThe faintr package provides convenience function for the evaluation of a model fit, obtained with the brms package, for a Bayesian regression model for data from a factorial design. If the original model fit used (default) dummy coding of factors, the faintr package allow extraction of many more meaningful comparisons. For example, it is possible to directly compare the difference between cells which are not comparable by dummy coding, and it is also possible to compare means in sets of cells, so as to recover the outcomes of deviance coding.
Install the faintr package with devtools from GitHub:
Consider a data set on pitch frequency in the speech of female and male speakers in polite and informal contexts.
320 |library(tidyverse)
321 | politedata = read_csv('https://raw.githubusercontent.com/michael-franke/bayes_mixed_regression_tutorial/master/code/politeness_data.csv')
322 | head(politedata)
323 | ## # A tibble: 6 x 5
324 | ## subject gender sentence context pitch
325 | ## <chr> <chr> <chr> <chr> <dbl>
326 | ## 1 F1 F S1 pol 213.
327 | ## 2 F1 F S1 inf 204.
328 | ## 3 F1 F S2 pol 285.
329 | ## 4 F1 F S2 inf 260.
330 | ## 5 F1 F S3 pol 204.
331 | ## 6 F1 F S3 inf 287.The cell means of this data set are:
333 |politedata %>% group_by(gender, context) %>% summarize(mean_pitch = mean(pitch))
334 | ## # A tibble: 4 x 3
335 | ## # Groups: gender [2]
336 | ## gender context mean_pitch
337 | ## <chr> <chr> <dbl>
338 | ## 1 F inf 261.
339 | ## 2 F pol 233.
340 | ## 3 M inf 144.
341 | ## 4 M pol 133.A Bayesian regression model for a factorial design with by-subject and by-item random intercepts can be obtained with the brms package as follows:
The brm function uses dummy coding per default. Look at the estimated coefficients:
fixef(m_dummy)
347 | ## Estimate Est.Error Q2.5 Q97.5
348 | ## Intercept 261.02934 22.093681 217.330291 303.64922
349 | ## genderM -116.70584 29.203414 -177.641515 -57.72146
350 | ## contextpol -27.16302 7.971296 -42.355886 -11.50569
351 | ## genderM:contextpol 15.27829 11.352729 -7.025012 37.81053The reference cell is where gender:F and context:inf, so female speakers in informal contexts. The estimated mean for the cell with data from male speakers in informal contexts is retrievable by adding the estimated coefficient genderM in the output above from the estimated Intercept.
The faintr package provides convenience functions to compare different (groups of) cells to each other, based on a model fit like the above. Although the fit of the regression model uses a particular reference cell for dummy-coding, other contrasts of relevance can be retrieved from the posterior samples. For example, if we want to compare two cell diagonally, say, male speakers in informal contexts against female speakers in polite contexts, we can do this:
compare_groups(
355 | model = m_dummy,
356 | higher = list(gender = "F", context = "pol"),
357 | lower = list(gender = "M", context = "inf")
358 | )
359 | ## Outcome of comparing groups:
360 | ## * higher: gender:F context:pol
361 | ## * lower: gender:M context:inf
362 | ## Mean 'higher - lower': 89.54
363 | ## 95% CI: [ 30.12 ; 149.4 ]
364 | ## P('higher - lower' > 0): 0.9962We can also compare the effect of gender female against the grand mean, to retrieve the information normally obtained by deviance coding:
366 |compare_groups(
367 | model = m_dummy,
368 | higher = list(gender = "F"),
369 | lower = list()
370 | )
371 | ## Outcome of comparing groups:
372 | ## * higher: gender:F
373 | ## * lower: grand mean
374 | ## Mean 'higher - lower': 54.53
375 | ## 95% CI: [ 25.56 ; 84.37 ]
376 | ## P('higher - lower' > 0): 0.998To explore all pairwise comparisons between design cells, try:
378 |extract_posterior_cell_means(m_dummy)$all_cells_compared
379 | ## gender_high context_high cell_name_high gender_low context_low
380 | ## 1 M inf gender:M__context:inf F inf
381 | ## 2 F pol gender:F__context:pol F inf
382 | ## 3 M pol gender:M__context:pol F inf
383 | ## 4 F inf gender:F__context:inf M inf
384 | ## 5 F pol gender:F__context:pol M inf
385 | ## 6 M pol gender:M__context:pol M inf
386 | ## 7 F inf gender:F__context:inf F pol
387 | ## 8 M inf gender:M__context:inf F pol
388 | ## 9 M pol gender:M__context:pol F pol
389 | ## 10 F inf gender:F__context:inf M pol
390 | ## 11 M inf gender:M__context:inf M pol
391 | ## 12 F pol gender:F__context:pol M pol
392 | ## cell_name_low posterior
393 | ## 1 gender:F__context:inf 0.00100
394 | ## 2 gender:F__context:inf 0.00050
395 | ## 3 gender:F__context:inf 0.00075
396 | ## 4 gender:M__context:inf 0.99900
397 | ## 5 gender:M__context:inf 0.99625
398 | ## 6 gender:M__context:inf 0.06675
399 | ## 7 gender:F__context:pol 0.99950
400 | ## 8 gender:F__context:pol 0.00375
401 | ## 9 gender:F__context:pol 0.00275
402 | ## 10 gender:M__context:pol 0.99925
403 | ## 11 gender:M__context:pol 0.93325
404 | ## 12 gender:M__context:pol 0.99725We can also extract the estimated means of each cell:
406 |extract_posterior_cell_means(m_dummy)$cell_summary
407 | ## # A tibble: 4 x 4
408 | ## cell `lower 95% CI` mean `upper 95% CI`
409 | ## <chr> <dbl> <dbl> <dbl>
410 | ## 1 gender:F__context:inf 219. 261. 305.
411 | ## 2 gender:F__context:pol 192. 234. 279.
412 | ## 3 gender:M__context:inf 98.2 144. 191.
413 | ## 4 gender:M__context:pol 87.3 132. 180.