├── .Rbuildignore
├── .gitignore
├── DESCRIPTION
├── NAMESPACE
├── R
    └── data.R
├── README.md
├── TUPD.pdf
├── data
    └── debate_transcripts.rda
└── man
    ├── debate_transcripts.Rd
    └── figures
        └── logo.png
/.Rbuildignore:
--------------------------------------------------------------------------------
1 | ^.*\.Rproj$
2 | ^\.Rproj\.user$
3 | ^.RData
4 | ^.Ruserdata
5 | ^raw-data/*
6 | ^raw-data-scraped/*
7 | ^raw-data-unedited/*
8 | ^raw-data-scraped-unedited/*
9 | ^figures/*
10 | ^WordNet-3.0/*
11 | 
--------------------------------------------------------------------------------
/.gitignore:
--------------------------------------------------------------------------------
1 | .Rproj.user
2 | .Rhistory
3 | .RData
4 | .Ruserdata
5 | raw-data/*
6 | raw-data-scraped/*
7 | raw-data-unedited/*
8 | raw-data-scraped-unedited/*
9 | figures/*
10 | WordNet-3.0/*
11 | 
--------------------------------------------------------------------------------
/DESCRIPTION:
--------------------------------------------------------------------------------
1 | Package: debates
2 | Title: Presidential Debate Transcripts
3 | Version: 0.0.0.9000
4 | Authors@R: person("James", "Martherus", email = "james@martherus.com",
5 |     role = c("aut", "cre"))
6 | Description: Includes tidy versions of candidate, vice presidential, and presidential debates from 2012-2020.
7 | Depends: R (>= 3.5.0)
8 | License: MIT
9 | LazyData: true
10 | RoxygenNote: 7.0.2
11 | Encoding: UTF-8
12 | 
--------------------------------------------------------------------------------
/NAMESPACE:
--------------------------------------------------------------------------------
1 | # Generated by roxygen2: do not edit by hand
2 | 
3 | 
--------------------------------------------------------------------------------
/R/data.R:
--------------------------------------------------------------------------------
1 | #' Presidential Debate Transcripts Data File
2 | #'
3 | #' A tibble containing transcripts for a variety of presidential debates.
4 | #' Presidential and Vice Presidential debates since 1960 are included, along
5 | #' with primary debates since 2008.
6 | #'
7 | #' @format An object of class \code{tibble} with 17,664 rows and 6 columns.
8 | #' \describe{
9 | #'   \item{speaker}{Name of Individual Speaking}
10 | #'   \item{text}{The text of the speaker's statement}
11 | #'   \item{type}{The election type. Possible values include Pres (Presidential), VP (Vice Presidential), Dem (Democratic Primary), and Rep (Republican Primary)}
12 | #'   \item{election_year}{The election year associated with the debate}
13 | #'   \item{date}{The date the debate took place}
14 | #'   \item{candidate}{Indicates whether or not the speaker was a candidate in the debate (as opposed to a moderator, announcer, etc.)}
15 | #' }
16 | #'
17 | #' @docType data
18 | #' @usage data(debate_transcripts)
19 | #' @keywords datasets
20 | #' @name debate_transcripts
21 | #' @format A tibble.
22 | #' @source Various sources, including \href{https://www.rev.com/blog/transcript-category/debate-transcripts?view=all}{Rev.com}, \href{https://www.debates.org/voter-education/debate-transcripts/}{debates.org}, and a variety of news sites.
23 | 'debate_transcripts'
--------------------------------------------------------------------------------
/README.md:
--------------------------------------------------------------------------------
1 | # `debates`: US Presidential Debate Transcripts
2 | 
3 | Presidential debates are an important opportunity for candidates to share their platforms. `debates` provides easy access to debate transcripts from Presidential, Vice Presidential, and primary candidate debates. The current version includes Presidential and Vice Presidential debate transcripts starting in 1960, along with transcripts for most debates from the 2012, 2016, and 2020 primary elections. `debates` includes one dataset, `debate_transcripts`, stored as a compact `.rda` object. Once the package is installed and loaded, the dataset can be loaded using the `data()` function.
4 | 
5 | `debate_transcripts` includes speaker-level and debate-level data. Each row in `debate_transcripts` represents one statement. Along with the text of the statement, each row includes the speaker's name and an indicator variable that identifies whether or not the speaker is a candidate (as opposed to a moderator, an announcer, or someone asking a question). Each row also indicates the date, election year, and type of debate. To suggest additional fields, please open an issue.
6 | 
7 | For more information on how the dataset was compiled, see the file TUPD.pdf, also available here: https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3611815
8 | 
9 | 
10 | ## Installation
11 | 
12 | To install `debates`, use the `install_github` function from the `devtools` package:
13 | 
14 | ```
15 | library(devtools)
16 | install_github("jamesmartherus/debates")
17 | ```
18 | 
19 | Alternatively, you can download `debate_transcripts.rda` directly from the `data` folder.
20 | 
21 | ## Examples
22 | 
23 | ```
24 | library(debates)
25 | 
26 | data(debate_transcripts) # Load the transcript data
27 | ```
28 | 
29 | ## Acknowledgments
30 | 
31 | - Transcripts were gathered from a variety of sources including [Rev.com](https://www.rev.com/blog/transcript-category/debate-transcripts?view=all), [debates.org](https://www.debates.org/voter-education/debate-transcripts/), and a variety of news sites.
32 | 
33 | 
34 | 
--------------------------------------------------------------------------------
/TUPD.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jamesmartherus/debates/4ea13a8ac1ca470fbff561e841582f3e030aff6e/TUPD.pdf
--------------------------------------------------------------------------------
/data/debate_transcripts.rda:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jamesmartherus/debates/4ea13a8ac1ca470fbff561e841582f3e030aff6e/data/debate_transcripts.rda
--------------------------------------------------------------------------------
/man/debate_transcripts.Rd:
--------------------------------------------------------------------------------
1 | % Generated by roxygen2: do not edit by hand
2 | % Please edit documentation in R/data.R
3 | \docType{data}
4 | \name{debate_transcripts}
5 | \alias{debate_transcripts}
6 | \title{Presidential Debate Transcripts Data File}
7 | \format{An object of class \code{tibble} with 17,664 rows and 6 columns.
8 | \describe{
9 |   \item{speaker}{Name of Individual Speaking}
10 |   \item{text}{The text of the speaker's statement}
11 |   \item{type}{The election type. Possible values include Pres (Presidential), VP (Vice Presidential), Dem (Democratic Primary), and Rep (Republican Primary)}
12 |   \item{election_year}{The election year associated with the debate}
13 |   \item{date}{The date the debate took place}
14 |   \item{candidate}{Indicates whether or not the speaker was a candidate in the debate (as opposed to a moderator, announcer, etc.)}
15 | }}
16 | \source{
17 | Various sources, including \href{https://www.rev.com/blog/transcript-category/debate-transcripts?view=all}{Rev.com}, \href{https://www.debates.org/voter-education/debate-transcripts/}{debates.org}, and a variety of news sites.
18 | }
19 | \usage{
20 | data(debate_transcripts)
21 | }
22 | \description{
23 | A tibble containing transcripts for a variety of presidential debates.
24 | Presidential and Vice Presidential debates since 1960 are included, along
25 | with primary debates since 2008.
26 | }
27 | \keyword{datasets}
28 | 
--------------------------------------------------------------------------------
/man/figures/logo.png:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/jamesmartherus/debates/4ea13a8ac1ca470fbff561e841582f3e030aff6e/man/figures/logo.png
--------------------------------------------------------------------------------