├── .github ├── .gitignore └── workflows │ └── check-standard.yaml ├── vignettes └── .gitignore ├── .gitignore ├── data ├── lbg.RData ├── demanif.RData ├── ukmanif.RData ├── daildata.RData ├── demanif.econ.RData ├── demanif.soc.RData ├── iebudget2009.RData ├── interestgroups.RData └── demanif.foreign.RData ├── tests ├── testthat.R └── testthat │ ├── test_wordfish.R │ └── test_wordscores.R ├── .Rbuildignore ├── inst ├── java │ └── wordfreq.jar └── CITATION ├── docs ├── reference │ ├── Rplot001.png │ ├── iebudget2009.html │ ├── daildata.html │ ├── ukmanif.html │ ├── interestgroups.html │ ├── jl_reindex.html │ ├── jl_count_tokens.html │ ├── demanif.html │ ├── jl_tokenize_words.html │ ├── jl_types.html │ ├── jl_doclen.html │ ├── demanif.soc.html │ ├── demanif.econ.html │ ├── jl_summarize_counts.html │ ├── jl_demote_counts.html │ ├── demanif.foreign.html │ ├── jl_promote_counts.html │ ├── jl_identify.html │ └── lbg.html ├── pkgdown.yml ├── articles │ ├── austin_files │ │ ├── figure-html │ │ │ └── unnamed-chunk-10-1.png │ │ └── header-attrs-2.7 │ │ │ └── header-attrs.js │ └── index.html ├── link.svg ├── bootstrap-toc.css ├── docsearch.js ├── pkgdown.js ├── bootstrap-toc.js ├── 404.html ├── authors.html └── news │ └── index.html ├── R ├── plot.positions.group.R ├── extractwords.R └── rescale.R ├── austin.Rproj ├── man ├── is.wfm.Rd ├── docs.Rd ├── words.Rd ├── fitted.wordfish.Rd ├── as.wfm.Rd ├── K2009.Rd ├── coef.classic.wordscores.Rd ├── LB2013.Rd ├── wordmargin.Rd ├── plot.classic.wordscores.Rd ├── ukmanif.Rd ├── as.docword.Rd ├── LG2000.Rd ├── as.worddoc.Rd ├── demanif.Rd ├── SP2008.Rd ├── SP2008_soc.Rd ├── demanif.soc.Rd ├── interestgroups.Rd ├── demanif.econ.Rd ├── extractwords.Rd ├── summary.classic.wordscores.Rd ├── SP2008_econ.Rd ├── SP2008_for.Rd ├── daildata.Rd ├── LB2002.Rd ├── demanif.foreign.Rd ├── iebudget2009.Rd ├── plot.coef.wordfish.Rd ├── getdocs.Rd ├── austin.Rd ├── rescale.Rd ├── summary.wordfish.Rd ├── trim.Rd ├── plot.wordfish.Rd ├── bootstrap.se.Rd 
├── wfm2bmr.Rd ├── lbg.Rd ├── wfm2lda.Rd ├── LBG2003.Rd ├── coef.wordfish.Rd ├── initialize.urfish.Rd ├── predict.classic.wordscores.Rd ├── classic.wordscores.Rd ├── predict.wordfish.Rd ├── sim.wordfish.Rd ├── wfm.Rd └── wordfish.Rd ├── README.md ├── NEWS.md ├── DESCRIPTION └── NAMESPACE /.github/.gitignore: -------------------------------------------------------------------------------- 1 | *.html 2 | -------------------------------------------------------------------------------- /vignettes/.gitignore: -------------------------------------------------------------------------------- 1 | *.html 2 | *.R 3 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .RData 4 | 5 | inst/doc 6 | -------------------------------------------------------------------------------- /data/lbg.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/data/lbg.RData -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(austin) 3 | 4 | test_check("austin") 5 | -------------------------------------------------------------------------------- /data/demanif.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/data/demanif.RData -------------------------------------------------------------------------------- /data/ukmanif.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/data/ukmanif.RData -------------------------------------------------------------------------------- /.Rbuildignore: 
-------------------------------------------------------------------------------- 1 | ^.*\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^\.travis\.yml$ 4 | ^docs$ 5 | ^\.github$ 6 | -------------------------------------------------------------------------------- /data/daildata.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/data/daildata.RData -------------------------------------------------------------------------------- /data/demanif.econ.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/data/demanif.econ.RData -------------------------------------------------------------------------------- /data/demanif.soc.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/data/demanif.soc.RData -------------------------------------------------------------------------------- /data/iebudget2009.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/data/iebudget2009.RData -------------------------------------------------------------------------------- /inst/java/wordfreq.jar: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/inst/java/wordfreq.jar -------------------------------------------------------------------------------- /data/interestgroups.RData: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/data/interestgroups.RData -------------------------------------------------------------------------------- /data/demanif.foreign.RData: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/conjugateprior/austin/HEAD/data/demanif.foreign.RData -------------------------------------------------------------------------------- /docs/reference/Rplot001.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/docs/reference/Rplot001.png -------------------------------------------------------------------------------- /docs/pkgdown.yml: -------------------------------------------------------------------------------- 1 | pandoc: 2.11.2 2 | pkgdown: 1.6.1 3 | pkgdown_sha: ~ 4 | articles: 5 | austin: austin.html 6 | last_built: 2021-04-27T20:05Z 7 | 8 | -------------------------------------------------------------------------------- /R/plot.positions.group.R: -------------------------------------------------------------------------------- 1 | plot.positions.group <- function(x, x.label, groups, names, ...) { 2 | stop("This function is deprecated\nUse dotchart instead") 3 | } 4 | -------------------------------------------------------------------------------- /docs/articles/austin_files/figure-html/unnamed-chunk-10-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/conjugateprior/austin/HEAD/docs/articles/austin_files/figure-html/unnamed-chunk-10-1.png -------------------------------------------------------------------------------- /tests/testthat/test_wordfish.R: -------------------------------------------------------------------------------- 1 | library(austin) 2 | context("wordfish") 3 | 4 | tol <- 0.00001 5 | 6 | test_that("wordfish replicates the generated data", { 7 | set.seed(1234) 8 | dd <- sim.wordfish() 9 | wmod <- wordfish(dd$Y, dir=c(1,10)) 10 | expect_is(wmod, 'wordfish') 11 | expect_true(cor(wmod$beta, dd$beta) > 0.99) 12 | expect_true(cor(wmod$theta, dd$theta) > 0.99) 13 | }) 14 | 
-------------------------------------------------------------------------------- /austin.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: knitr 13 | LaTeX: XeLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | 17 | BuildType: Package 18 | PackageUseDevtools: Yes 19 | PackageInstallArgs: --no-multiarch --with-keep.source 20 | PackageRoxygenize: rd,collate,namespace,vignette 21 | -------------------------------------------------------------------------------- /inst/CITATION: -------------------------------------------------------------------------------- 1 | 2 | citHeader("To cite austin in publications use:") 3 | 4 | citEntry(entry="Manual", 5 | title = "Austin: Do Things with Words", 6 | author = "Will Lowe", 7 | year = 2021, 8 | url = "http://conjugateprior.github.io/austin", 9 | textVersion = 10 | paste("Will Lowe 2021.", 11 | "Austin: Do things with words. 
Version 0.5.0", 12 | "URL http://conjugateprior.github.io/austin") 13 | ) 14 | 15 | -------------------------------------------------------------------------------- /man/is.wfm.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{is.wfm} 4 | \alias{is.wfm} 5 | \title{Checks for Word Frequency Matrix} 6 | \usage{ 7 | is.wfm(x) 8 | } 9 | \arguments{ 10 | \item{x}{a matrix of counts} 11 | } 12 | \value{ 13 | Whether the object can be used as a Word Frequency Matrix 14 | } 15 | \description{ 16 | Checks whether an object is a Word Frequency Matrix 17 | } 18 | \seealso{ 19 | \code{\link{wfm}} 20 | } 21 | \author{ 22 | Will Lowe 23 | } 24 | -------------------------------------------------------------------------------- /man/docs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{docs} 4 | \alias{docs} 5 | \alias{docs<-} 6 | \title{Extract Document Names} 7 | \usage{ 8 | docs(wfm) 9 | 10 | docs(wfm) <- value 11 | } 12 | \arguments{ 13 | \item{wfm}{an object of type wfm} 14 | 15 | \item{value}{replacement if assignment} 16 | } 17 | \value{ 18 | A list of document names. 19 | } 20 | \description{ 21 | Extracts the document names from a wfm object. 
22 | } 23 | \seealso{ 24 | \code{\link{wfm}} 25 | } 26 | \author{ 27 | Will Lowe 28 | } 29 | -------------------------------------------------------------------------------- /man/words.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{words} 4 | \alias{words} 5 | \alias{words<-} 6 | \title{Extract Words} 7 | \usage{ 8 | words(wfm) 9 | 10 | words(wfm) <- value 11 | } 12 | \arguments{ 13 | \item{wfm}{an object of type wfm} 14 | 15 | \item{value}{replacement if assignment} 16 | } 17 | \value{ 18 | A list of words. 19 | } 20 | \description{ 21 | Extracts the words from a wfm object 22 | } 23 | \seealso{ 24 | \code{\link{wfm}}, \code{\link{docs}} 25 | } 26 | \author{ 27 | Will Lowe 28 | } 29 | -------------------------------------------------------------------------------- /man/fitted.wordfish.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wordfish.R 3 | \name{fitted.wordfish} 4 | \alias{fitted.wordfish} 5 | \title{Get Fitted Values from a Wordfish Model} 6 | \usage{ 7 | \method{fitted}{wordfish}(object, ...) 
8 | } 9 | \arguments{ 10 | \item{object}{a fitted Wordfish model} 11 | 12 | \item{...}{Unused} 13 | } 14 | \value{ 15 | Expected counts in the word frequency matrix 16 | } 17 | \description{ 18 | Extracts the estimated word rates from a fitted Wordfish model 19 | } 20 | \author{ 21 | Will Lowe 22 | } 23 | -------------------------------------------------------------------------------- /man/as.wfm.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{as.wfm} 4 | \alias{as.wfm} 5 | \title{Coerce to a Word Frequency Matrix} 6 | \usage{ 7 | as.wfm(mat, word.margin = 1) 8 | } 9 | \arguments{ 10 | \item{mat}{a matrix of counts} 11 | 12 | \item{word.margin}{which margin of mat represents the words} 13 | } 14 | \value{ 15 | an object of class wfm 16 | } 17 | \description{ 18 | Constructs a wfm object from various other kinds of objects 19 | } 20 | \seealso{ 21 | \code{\link{wfm}} 22 | } 23 | \author{ 24 | Will Lowe 25 | } 26 | -------------------------------------------------------------------------------- /docs/articles/austin_files/header-attrs-2.7/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 
3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /man/K2009.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{K2009} 5 | \alias{K2009} 6 | \title{Interest Groups} 7 | \description{ 8 | Interest Groups and the European Commission 9 | } 10 | \details{ 11 | Word counts from interest groups and a European Commission proposal to 12 | reduce CO2 emissions in 2007. 13 | 14 | \code{K2009} is a \code{jl_df} object. 15 | } 16 | \references{ 17 | H. Kluever (2009) 'Measuring interest group influence using 18 | quantitative text analysis' European Union Politics 10(4) 535-549. 19 | } 20 | \keyword{datasets} 21 | -------------------------------------------------------------------------------- /man/coef.classic.wordscores.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/classic.wordscores.R 3 | \name{coef.classic.wordscores} 4 | \alias{coef.classic.wordscores} 5 | \title{Show Wordscores} 6 | \usage{ 7 | \method{coef}{classic.wordscores}(object, ...) 8 | } 9 | \arguments{ 10 | \item{object}{a fitted Wordscores model} 11 | 12 | \item{...}{extra arguments, currently unused} 13 | } 14 | \value{ 15 | The wordscores 16 | } 17 | \description{ 18 | Lists wordscores from a fitted Wordscores model.
19 | } 20 | \seealso{ 21 | \code{\link{classic.wordscores}} 22 | } 23 | \author{ 24 | Will Lowe 25 | } 26 | -------------------------------------------------------------------------------- /man/LB2013.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{LB2013} 5 | \alias{LB2013} 6 | \title{Irish Budget Debate Data 2009} 7 | \description{ 8 | Irish budget debate 2009 9 | } 10 | \details{ 11 | These are word counts from the 2009 Budget debate in Ireland. 12 | 13 | \code{LB2013} is a \code{jl_df} object 14 | } 15 | \references{ 16 | W. Lowe and K. Benoit (2013) 'Validating estimates of latent 17 | traits from textual data using human judgment as a benchmark' 18 | Political Analysis 21(3) 298-313. 19 | } 20 | \keyword{datasets} 21 | -------------------------------------------------------------------------------- /man/wordmargin.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{wordmargin} 4 | \alias{wordmargin} 5 | \title{Which margin holds the words} 6 | \usage{ 7 | wordmargin(x) 8 | } 9 | \arguments{ 10 | \item{x}{a word frequency matrix} 11 | } 12 | \value{ 13 | 1 if words are rows and 2 if words are columns. 
14 | } 15 | \description{ 16 | Checks which margin (rows or columns) of a Word Frequency Matrix holds the 17 | words 18 | } 19 | \details{ 20 | Changing the wordmargin by assignment just swaps the dimnames 21 | } 22 | \seealso{ 23 | \code{\link{wfm}} 24 | } 25 | \author{ 26 | Will Lowe 27 | } 28 | -------------------------------------------------------------------------------- /man/plot.classic.wordscores.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/classic.wordscores.R 3 | \name{plot.classic.wordscores} 4 | \alias{plot.classic.wordscores} 5 | \title{Plot a Wordscores Model} 6 | \usage{ 7 | \method{plot}{classic.wordscores}(x, ...) 8 | } 9 | \arguments{ 10 | \item{x}{a fitted Wordscores model} 11 | 12 | \item{...}{other arguments, passed to the dotchart command} 13 | } 14 | \value{ 15 | A plot of the wordscores in increasing order. 16 | } 17 | \description{ 18 | Plots Wordscores from a fitted Wordscores model 19 | } 20 | \seealso{ 21 | \code{\link{classic.wordscores}} 22 | } 23 | \author{ 24 | Will Lowe 25 | } 26 | -------------------------------------------------------------------------------- /man/ukmanif.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{ukmanif} 5 | \alias{ukmanif} 6 | \title{UK Manifesto Data} 7 | \description{ 8 | UK manifesto data from Laver et al. 9 | } 10 | \details{ 11 | These are word counts from the manifestos of the three main UK parties for 12 | the 1992 and 1997 elections. 13 | 14 | ukmanif is a word frequency object. 15 | } 16 | \references{ 17 | Laver, Benoit and Garry (2003) 'Estimating policy positions from 18 | political text using words as data' American Political Science 19 | Review 97(2) 311-331.
20 | } 21 | \keyword{datasets} 22 | -------------------------------------------------------------------------------- /man/as.docword.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{as.docword} 4 | \alias{as.docword} 5 | \title{Extract a Document by Word Matrix} 6 | \usage{ 7 | as.docword(wfm) 8 | } 9 | \arguments{ 10 | \item{wfm}{an object of class wfm} 11 | } 12 | \value{ 13 | a document by word count matrix 14 | } 15 | \description{ 16 | Extract a word count matrix with documents as rows and words as columns 17 | } 18 | \details{ 19 | This is a helper function for wfm objects. Use it instead of manipulating 20 | wfm objects themselves. 21 | } 22 | \seealso{ 23 | \code{\link{as.worddoc}}, \code{\link{wfm}} 24 | } 25 | \author{ 26 | Will Lowe 27 | } 28 | -------------------------------------------------------------------------------- /man/LG2000.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{LG2000} 5 | \alias{LG2000} 6 | \title{UK Manifesto Data} 7 | \description{ 8 | UK manifesto data from Laver et al. 9 | } 10 | \details{ 11 | These are word counts from the manifestos of the three main UK parties for 12 | the 1992 and 1997 elections. 13 | 14 | \code{LG2000} is a \code{jl_df} object. 15 | } 16 | \references{ 17 | M. Laver, K. Benoit and J. Garry (2003) 'Estimating policy 18 | positions from political text using words as data' 19 | American Political Science Review 97(2) 311-331.
20 | } 21 | \keyword{datasets} 22 | -------------------------------------------------------------------------------- /man/as.worddoc.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{as.worddoc} 4 | \alias{as.worddoc} 5 | \title{Extract a Word by Document Matrix} 6 | \usage{ 7 | as.worddoc(wfm) 8 | } 9 | \arguments{ 10 | \item{wfm}{an object of class wfm} 11 | } 12 | \value{ 13 | a word by document count matrix 14 | } 15 | \description{ 16 | Extract a matrix of word counts with words as rows and documents as columns 17 | } 18 | \details{ 19 | This is a helper function for wfm objects. Use it instead of manipulating 20 | wfm objects themselves. 21 | } 22 | \seealso{ 23 | \code{\link{as.docword}}, \code{\link{wfm}} 24 | } 25 | \author{ 26 | Will Lowe 27 | } 28 | -------------------------------------------------------------------------------- /man/demanif.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{demanif} 5 | \alias{demanif} 6 | \title{German Party Manifesto Data} 7 | \source{ 8 | Wordfish website (http://www.wordfish.org) 9 | } 10 | \description{ 11 | A random sample of words and their frequency in German political party 12 | manifestos from 1990-2005. 13 | } 14 | \details{ 15 | demanif is a word frequency matrix. 16 | } 17 | \references{ 18 | J. Slapin and S.-O. Proksch (2008) 'A scaling model for 19 | estimating time-series party positions from texts' American Journal of 20 | Political Science 52(3), 705-722.
21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /man/SP2008.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{SP2008} 5 | \alias{SP2008} 6 | \title{German Party Manifesto Data} 7 | \source{ 8 | Wordfish website (http://www.wordfish.org) 9 | } 10 | \description{ 11 | A random sample of words and their frequency in German political party 12 | manifestos from 1990-2005. 13 | } 14 | \details{ 15 | \code{SP2008} is a \code{jl_df} object. 16 | } 17 | \references{ 18 | J. Slapin and S.-O. Proksch (2008) 'A scaling model for 19 | estimating time-series party positions from texts' American Journal of 20 | Political Science 52(3), 705-722. 21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /man/SP2008_soc.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{SP2008_soc} 5 | \alias{SP2008_soc} 6 | \title{Societal sections of German Party Manifestos} 7 | \source{ 8 | These data are courtesy of S.-O. Proksch. 9 | } 10 | \description{ 11 | A word frequency matrix from the societal sections of German political party 12 | manifestos from 1990-2005. 13 | } 14 | \details{ 15 | \code{SP2008_soc} is a \code{jl_df} object. 16 | } 17 | \references{ 18 | J. Slapin and S.-O. Proksch (2008) 'A scaling model for 19 | estimating time-series party positions from texts' American Journal of 20 | Political Science 52(3), 705-722.
21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /man/demanif.soc.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{demanif.soc} 5 | \alias{demanif.soc} 6 | \title{Societal sections of German Party Manifestos} 7 | \source{ 8 | These data are courtesy of S.-O. Proksch. 9 | } 10 | \description{ 11 | A word frequency matrix from the societal sections of German political party 12 | manifestos from 1990-2005. 13 | } 14 | \details{ 15 | demanif.soc is a word frequency matrix. 16 | } 17 | \references{ 18 | J. Slapin and S.-O. Proksch (2008) 'A scaling model for 19 | estimating time-series party positions from texts' American Journal of 20 | Political Science 52(3), 705-722. 21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /man/interestgroups.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{interestgroups} 5 | \alias{interestgroups} 6 | \title{Interest Groups} 7 | \description{ 8 | Interest Groups and the European Commission 9 | } 10 | \details{ 11 | Word counts from interest groups and a European Commission proposal to 12 | reduce CO2 emissions in 2007. 13 | 14 | \code{comm1} and \code{comm2} are the Commission's proposal before and after 15 | the proposals of the interest groups. 16 | } 17 | \references{ 18 | H. Kluever (2009) 'Measuring interest group influence using 19 | quantitative text analysis' European Union Politics 10(4) 535-549.
20 | } 21 | \keyword{datasets} 22 | -------------------------------------------------------------------------------- /man/demanif.econ.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{demanif.econ} 5 | \alias{demanif.econ} 6 | \title{Economics sections of German Party Manifestos} 7 | \source{ 8 | These data are courtesy of S.-O. Proksch. 9 | } 10 | \description{ 11 | A word frequency matrix from the economic sections of German political party 12 | manifestos from 1990-2005. 13 | } 14 | \details{ 15 | demanif.econ is a word frequency matrix. 16 | } 17 | \references{ 18 | J. Slapin and S.-O. Proksch (2008) 'A scaling model for 19 | estimating time-series party positions from texts' American Journal of 20 | Political Science 52(3), 705-722. 21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /man/extractwords.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/extractwords.R 3 | \name{extractwords} 4 | \alias{extractwords} 5 | \title{Pull Words From a List} 6 | \usage{ 7 | extractwords(words, patternfile, pattern.type = c("glob", "re")) 8 | } 9 | \arguments{ 10 | \item{words}{the words against which patterns are matched} 11 | 12 | \item{patternfile}{file containing the patterns to match, one per line} 13 | 14 | \item{pattern.type}{marks whether the patterns are 'globs' or full regular 15 | expressions} 16 | } 17 | \value{ 18 | A list of matching words.
19 | } 20 | \description{ 21 | Extract a list of matching words from another list of words 22 | } 23 | \author{ 24 | Will Lowe 25 | } 26 | -------------------------------------------------------------------------------- /man/summary.classic.wordscores.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/classic.wordscores.R 3 | \name{summary.classic.wordscores} 4 | \alias{summary.classic.wordscores} 5 | \title{Summarize a Classic Wordscores Model} 6 | \usage{ 7 | \method{summary}{classic.wordscores}(object, ...) 8 | } 9 | \arguments{ 10 | \item{object}{a fitted wordscores model} 11 | 12 | \item{...}{extra arguments (currently ignored)} 13 | } 14 | \value{ 15 | A summary of information about the reference documents used to fit 16 | the model. 17 | } 18 | \description{ 19 | Summarises a Wordscores model 20 | } 21 | \details{ 22 | To see the wordscores, use \code{coef}. 23 | } 24 | \author{ 25 | Will Lowe 26 | } 27 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | # Austin 2 | 3 | 4 | [![R build status](https://github.com/conjugateprior/austin/workflows/R-CMD-check/badge.svg)](https://github.com/conjugateprior/austin/actions) 5 | 6 | 7 | [![codecov](https://codecov.io/gh/conjugateprior/austin/branch/master/graph/badge.svg)](https://codecov.io/gh/conjugateprior/austin) 8 | 9 | Austin fits Wordscores and Wordfish models to document-feature matrices. It 10 | also keeps some useful data sets around. 11 | 12 | ## Installation 13 | 14 | ``` 15 | devtools::install_github("conjugateprior/austin") 16 | ``` 17 | If that didn't work, you may need to 18 | ``` 19 | install.packages('devtools') 20 | ``` 21 | first, then try again.
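## Usage

A minimal sketch of a typical session, mirroring the package's own test suite (`sim.wordfish` generates synthetic data with known positions, so exact estimates vary with the seed):

```
library(austin)

# simulate a document-feature matrix with known document positions
set.seed(1234)
dd <- sim.wordfish()

# fit a Wordfish model, orienting the scale with documents 1 and 10
wmod <- wordfish(dd$Y, dir = c(1, 10))
summary(wmod)
```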
22 | 23 | 24 | -------------------------------------------------------------------------------- /man/SP2008_econ.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{SP2008_econ} 5 | \alias{SP2008_econ} 6 | \title{Economics sections of German Party Manifestos} 7 | \source{ 8 | These data are courtesy of S.-O. Proksch. 9 | } 10 | \description{ 11 | A word frequency matrix from the economic sections of German political party 12 | manifestos from 1990-2005. 13 | } 14 | \details{ 15 | \code{SP2008_econ} is a \code{jl_df} object. 16 | } 17 | \references{ 18 | J. Slapin and S.-O. Proksch (2008) 'A scaling model for 19 | estimating time-series party positions from texts' American Journal of 20 | Political Science 52(3), 705-722. 21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /man/SP2008_for.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{SP2008_for} 5 | \alias{SP2008_for} 6 | \title{Foreign Policy Sections of German Party Manifestos} 7 | \source{ 8 | These data are courtesy of S.-O. Proksch. 9 | } 10 | \description{ 11 | A word frequency matrix from the foreign policy sections of German political 12 | party manifestos from 1990-2005. 13 | } 14 | \details{ 15 | \code{SP2008_for} is a \code{jl_df} object. 16 | } 17 | \references{ 18 | J. Slapin and S.-O. Proksch (2008) 'A scaling model for 19 | estimating time-series party positions from texts' American Journal of 20 | Political Science 52(3), 705-722.
21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /man/daildata.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{daildata} 5 | \alias{daildata} 6 | \title{The 1991 Irish Confidence debate} 7 | \description{ 8 | Irish Confidence Debate 9 | } 10 | \details{ 11 | These are word counts from the no-confidence motion debated in the 12 | Irish Dáil from 16-18 October 1991 over the future of the Fianna 13 | Fail-Progressive Democrat coalition. 14 | \code{daildata} is a word frequency object. 15 | } 16 | \references{ 17 | Laver, M. & Benoit, K.R. (2002). Locating TDs in Policy Spaces: 18 | Wordscoring Dáil Speeches. Irish Political Studies, 17(1), 19 | 59–73. 20 | } 21 | \keyword{datasets} 22 | -------------------------------------------------------------------------------- /man/LB2002.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{LB2002} 5 | \alias{LB2002} 6 | \title{The 1991 Irish Confidence debate} 7 | \description{ 8 | Irish Confidence Debate (jl format) 9 | } 10 | \details{ 11 | These are word counts from the no-confidence motion debated in the 12 | Irish Dáil from 16-18 October 1991 over the future of the Fianna 13 | Fail-Progressive Democrat coalition. 14 | 15 | \code{LB2002} is a \code{jl_df} object. 16 | } 17 | \references{ 18 | Laver, M. & Benoit, K.R. (2002). Locating TDs in Policy Spaces: 19 | Wordscoring Dáil Speeches. Irish Political Studies, 17(1), 20 | 59–73.
21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /man/demanif.foreign.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{demanif.foreign} 5 | \alias{demanif.foreign} 6 | \title{Foreign Policy Sections of German Party Manifestos} 7 | \source{ 8 | These data are courtesy of S.-O. Proksch. 9 | } 10 | \description{ 11 | A word frequency matrix from the foreign policy sections of German political 12 | party manifestos from 1990-2005. 13 | } 14 | \details{ 15 | \code{demanif.foreign} is a word frequency matrix. 16 | } 17 | \references{ 18 | J. Slapin and S.-O. Proksch (2008) 'A scaling model for 19 | estimating time-series party positions from texts' American Journal of 20 | Political Science 52(3), 705-722. 21 | } 22 | \keyword{datasets} 23 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | ## austin 0.5.1 2 | 3 | * Removed the barely-used Java jar for word counting and related functions 4 | 5 | ## austin 0.5.0 6 | 7 | * prototype general text and covariate manipulation functions 8 | added. Caution: very experimental.
9 | * switched to github actions 10 | 11 | ## austin 0.4.0 12 | 13 | * removed 'foreign' characters from German datasets 14 | 15 | ## austin 0.3.0 16 | 17 | * Updated package structure to more recent R 18 | * Added a github site 19 | 20 | ## austin 0.22 21 | 22 | * Made sure `demanif.*` datasets work with the indexing behaviour of 23 | recent R versions 24 | 25 | ## austin 0.21 26 | 27 | * First import from r-forge 28 | * Revised vignette is now in markdown 29 | * Some basic tests 30 | 31 | 32 | 33 | -------------------------------------------------------------------------------- /man/iebudget2009.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{iebudget2009} 5 | \alias{iebudget2009} 6 | \alias{iebudget2009cov} 7 | \title{Irish Budget Debate Data 2009} 8 | \description{ 9 | Irish budget debate 2009 10 | } 11 | \details{ 12 | These are word counts from the 2009 Budget debate in Ireland. 13 | 14 | This is a word frequency matrix. Loading this data also makes available 15 | \code{iebudget2009cov} which contains covariates for the speakers. 16 | } 17 | \keyword{datasets} 18 | -------------------------------------------------------------------------------- /man/plot.coef.wordfish.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wordfish.R 3 | \name{plot.coef.wordfish} 4 | \alias{plot.coef.wordfish} 5 | \title{Plot the Word Parameters From a Wordfish Model} 6 | \usage{ 7 | \method{plot}{coef.wordfish}(x, pch = 20, psi = TRUE, ...)
8 | } 9 | \arguments{ 10 | \item{x}{a fitted Wordfish model} 11 | 12 | \item{pch}{Default is to use small dots to plot positions} 13 | 14 | \item{psi}{whether to plot word fixed effects} 15 | 16 | \item{...}{Any extra graphics parameters to pass in} 17 | } 18 | \value{ 19 | A plot of sorted beta and optionally psi parameters. 20 | } 21 | \description{ 22 | Plots sorted beta and optionally also psi parameters from a Wordfish model 23 | } 24 | \seealso{ 25 | \code{\link{wordfish}} 26 | } 27 | \author{ 28 | Will Lowe 29 | } 30 | -------------------------------------------------------------------------------- /docs/link.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 8 | 12 | 13 | -------------------------------------------------------------------------------- /man/getdocs.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{getdocs} 4 | \alias{getdocs} 5 | \title{Get Documents} 6 | \usage{ 7 | getdocs(wfm, which) 8 | } 9 | \arguments{ 10 | \item{wfm}{a wfm object} 11 | 12 | \item{which}{names or indexes of documents} 13 | } 14 | \value{ 15 | A smaller wfm object containing only the desired documents with the 16 | same word margin setting as the original matrix. 17 | } 18 | \description{ 19 | Gets particular documents from a wfm by name or index 20 | } 21 | \details{ 22 | getdocs is essentially a subset command that picks the correct margin for 23 | you. 
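For example (using a hypothetical \code{wfm} object \code{y}):
\preformatted{getdocs(y, c("R1", "V1"))  # select documents by name
getdocs(y, 1:2)            # select documents by index}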
24 | } 25 | \seealso{ 26 | \code{\link{as.wfm}}, \code{\link{as.docword}}, 27 | \code{\link{as.worddoc}}, \code{\link{docs}}, \code{\link{words}}, 28 | \code{\link{is.wfm}}, \code{\link{wordmargin}} 29 | } 30 | \author{ 31 | Will Lowe 32 | } 33 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: austin 2 | Type: Package 3 | Title: Do Things with Words 4 | Version: 0.5.0 5 | Date: 2020-12-23 6 | Authors@R: person("Will", "Lowe", email = "lowe@hertie-school.org", role = c("aut", "cre")) 7 | Description: Doing things with words currently means scaling documents on 8 | a presumed underlying dimension on the basis of word frequencies and heroic 9 | assumptions about language generation. 10 | URL: https://conjugateprior.github.io/austin 11 | BugReports: https://github.com/conjugateprior/austin/issues 12 | Depends: 13 | R (>= 3.1) 14 | Imports: 15 | dplyr, 16 | tibble, 17 | numDeriv, 18 | methods, 19 | Matrix, 20 | tokenizers, 21 | irlba 22 | License: file LICENSE 23 | LazyLoad: yes 24 | Suggests: 25 | knitr, 26 | rmarkdown, 27 | testthat 28 | VignetteBuilder: knitr 29 | Encoding: UTF-8 30 | RoxygenNote: 7.1.1 31 | 32 | -------------------------------------------------------------------------------- /man/austin.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{package} 4 | \name{austin} 5 | \alias{austin} 6 | \title{austin: Do things with words} 7 | \description{ 8 | Austin helps you see what people, usually politicians, do with words. 9 | Currently that means how positions on a presumed underlying 10 | policy scale are taken by manipulating word occurrence counts. 
11 | The models implemented here try to 12 | recover those positions using only this information, plus 13 | some heroic assumptions about language generation, e.g. 14 | unidimensionality, conditional independence of words given ideal point 15 | and Poisson-distributed word counts. 16 | } 17 | \details{ 18 | The package currently implements Wordfish (Slapin and Proksch, 2008) and 19 | Wordscores (Laver, Benoit and Garry, 2003). See references for details. 20 | } 21 | -------------------------------------------------------------------------------- /R/extractwords.R: -------------------------------------------------------------------------------- 1 | #' Pull Words From a List 2 | #' 3 | #' Extract a list of matching words from another list of words 4 | #' 5 | #' 6 | #' @param words the words against which patterns are matched 7 | #' @param patternfile file containing the patterns to match, one per line 8 | #' @param pattern.type marks whether the patterns are 'globs' or full regular 9 | #' expressions 10 | #' @return The indices in \code{words} of the matching words.
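#' @examples
#' ## A sketch: 'mypatterns.txt' is a hypothetical file of glob patterns,
#' ## one per line, and 'mywfm' is a hypothetical wfm object
#' \dontrun{
#' idx <- extractwords(words(mywfm), 'mypatterns.txt', pattern.type = 'glob')
#' }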
11 | #' @importFrom utils read.table glob2rx 12 | #' @export 13 | #' @author Will Lowe 14 | extractwords <- function(words, 15 | patternfile, 16 | pattern.type=c('glob', 're')){ 17 | pattern.type <- match.arg(pattern.type) 18 | pats <- read.table(patternfile, strip.white=TRUE, stringsAsFactors=FALSE)[[1]] 19 | if (pattern.type == 'glob') 20 | pats <- glob2rx(pats) 21 | use <- integer(0) 22 | for (pat in pats) 23 | use <- union(use, grep(pat, words)) 24 | return(use) 25 | } 26 | -------------------------------------------------------------------------------- /man/rescale.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/rescale.R 3 | \name{rescale} 4 | \alias{rescale} 5 | \title{Rescale Estimated Document Positions} 6 | \usage{ 7 | rescale(object, ident = c(1, -1, 10, 1)) 8 | } 9 | \arguments{ 10 | \item{object}{fitted wordfish or wordscores object} 11 | 12 | \item{ident}{two document indexes and their desired new positions} 13 | } 14 | \value{ 15 | A data frame containing the rescaled document positions with 16 | standard errors if available. 17 | } 18 | \description{ 19 | Linearly rescales estimated document positions on the basis of two control 20 | points. 21 | } 22 | \details{ 23 | The rescaled positions set the document with index ident[1] to position ident[2] 24 | and the document with index ident[3] to position ident[4]. The fitted model 25 | passed as the first argument is not affected. 26 | } 27 | \author{ 28 | Will Lowe 29 | } 30 | -------------------------------------------------------------------------------- /man/summary.wordfish.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wordfish.R 3 | \name{summary.wordfish} 4 | \alias{summary.wordfish} 5 | \title{Summarize a Wordfish Model} 6 | \usage{ 7 | \method{summary}{wordfish}(object, level = 0.95, ...)
8 | } 9 | \arguments{ 10 | \item{object}{fitted wordfish model} 11 | 12 | \item{level}{confidence interval coverage} 13 | 14 | \item{...}{extra arguments, e.g. level} 15 | } 16 | \value{ 17 | A data.frame containing estimated document positions with standard 18 | errors and confidence intervals. 19 | } 20 | \description{ 21 | Summarises estimated document positions from a fitted Wordfish model 22 | } 23 | \details{ 24 | If `level' is passed to the function, e.g. 0.95 for 95 percent confidence, 25 | this generates intervals of the appropriate width. 26 | } 27 | \seealso{ 28 | \code{\link{wordfish}} 29 | } 30 | \author{ 31 | Will Lowe 32 | } 33 | -------------------------------------------------------------------------------- /man/trim.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{trim} 4 | \alias{trim} 5 | \title{Trim a Word Frequency Matrix} 6 | \usage{ 7 | trim(wfm, min.count = 5, min.doc = 5, sample = NULL, verbose = TRUE) 8 | } 9 | \arguments{ 10 | \item{wfm}{an object of class wfm, or a data matrix} 11 | 12 | \item{min.count}{the smallest permissible word count} 13 | 14 | \item{min.doc}{the fewest permissible documents a word can appear in} 15 | 16 | \item{sample}{how many words to randomly retain} 17 | 18 | \item{verbose}{whether to say what we did} 19 | } 20 | \value{ 21 | If \code{sample} is a number then this many words will be retained 22 | after the \code{min.count} and \code{min.doc} filters have been applied.
23 | } 24 | \description{ 25 | Ejects low frequency observations and subsamples 26 | } 27 | \seealso{ 28 | \code{\link{wfm}} 29 | } 30 | \author{ 31 | Will Lowe 32 | } 33 | -------------------------------------------------------------------------------- /man/plot.wordfish.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wordfish.R 3 | \name{plot.wordfish} 4 | \alias{plot.wordfish} 5 | \title{Plot a Wordfish Model} 6 | \usage{ 7 | \method{plot}{wordfish}(x, truevals = NULL, level = 0.95, pch = 20, ...) 8 | } 9 | \arguments{ 10 | \item{x}{a fitted Wordfish model} 11 | 12 | \item{truevals}{True document positions if known} 13 | 14 | \item{level}{Intended coverage of confidence intervals} 15 | 16 | \item{pch}{Default is to use small dots to plot positions} 17 | 18 | \item{...}{Any extra graphics parameters to pass in} 19 | } 20 | \value{ 21 | A plot of sorted estimated document positions, with confidence 22 | intervals and true document positions, if these are available. 23 | } 24 | \description{ 25 | Plots a fitted Wordfish model with confidence intervals 26 | } 27 | \seealso{ 28 | \code{\link{wordfish}} 29 | } 30 | \author{ 31 | Will Lowe 32 | } 33 | -------------------------------------------------------------------------------- /man/bootstrap.se.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wordfish.R 3 | \name{bootstrap.se} 4 | \alias{bootstrap.se} 5 | \title{Compute Bootstrap Standard Errors} 6 | \usage{ 7 | bootstrap.se(object, L = 50, verbose = FALSE, ...) 
8 | } 9 | \arguments{ 10 | \item{object}{a fitted Wordfish model} 11 | 12 | \item{L}{how many replications} 13 | 14 | \item{verbose}{Give progress updates} 15 | 16 | \item{...}{Unused} 17 | } 18 | \value{ 19 | Standard errors for document positions 20 | } 21 | \description{ 22 | Computes bootstrap standard errors for document positions from a fitted 23 | Wordfish model 24 | } 25 | \details{ 26 | This function computes a parametric bootstrap by resampling counts from the 27 | fitted word counts, refitting the model, and storing the document positions. 28 | The standard deviations for each resampled document position are returned. 29 | } 30 | \author{ 31 | Will Lowe 32 | } 33 | -------------------------------------------------------------------------------- /man/wfm2bmr.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{wfm2bmr} 4 | \alias{wfm2bmr} 5 | \title{Transform Word Frequency Matrix for BMR/BLR} 6 | \usage{ 7 | wfm2bmr(y, wfm, filename) 8 | } 9 | \arguments{ 10 | \item{y}{integer dependent variable, may be NULL} 11 | 12 | \item{wfm}{a word frequency matrix} 13 | 14 | \item{filename}{Name of the file to save data to} 15 | } 16 | \value{ 17 | A file containing the variables in sparse matrix format. 18 | } 19 | \description{ 20 | Transforms a wfm to the format used by BMR/BLR 21 | } 22 | \details{ 23 | BMR is a sparse matrix format similar to that used by SVMlight 24 | 25 | Each line contains an optional dependent variable index and a sequence of 26 | indexes and feature value pairs divided by colons. Indexes refer to the 27 | words with non-zero counts in the original matrix, and the feature values 28 | are the counts.
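For example, a document with dependent variable value 2 and non-zero counts
for words 1 and 23 might be represented as the line (counts invented for
illustration):
\preformatted{2 1:5 23:2}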
29 | } 30 | \seealso{ 31 | \code{\link{wfm}} 32 | } 33 | \author{ 34 | Will Lowe 35 | } 36 | -------------------------------------------------------------------------------- /man/lbg.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{lbg} 5 | \alias{lbg} 6 | \title{Example Data} 7 | \description{ 8 | Example data from Laver, Benoit and Garry (2003) 9 | } 10 | \details{ 11 | This is the example word count data from Laver, Benoit and Garry's (2003) 12 | article on Wordscores. Documents R1 to R5 are assumed to have known 13 | positions: -1.5, -0.75, 0, 0.75, 1.5. Document V1 is assumed unknown. The 14 | `correct' position for V1 is presumed to be -0.45. 15 | \code{\link{classic.wordscores}} generates approximately -0.45. 16 | 17 | To replicate the analysis in the paper, use the wordscores function 18 | with identification fixing the first 5 document positions and leaving 19 | the position of V1 to be predicted. 20 | } 21 | \references{ 22 | Laver, Benoit and Garry (2003) `Extracting policy positions from 23 | political texts using words as data' American Political Science Review 97(2), 311-331.
24 | } 25 | \keyword{datasets} 26 | -------------------------------------------------------------------------------- /man/wfm2lda.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{wfm2lda} 4 | \alias{wfm2lda} 5 | \title{Transform Word Frequency Matrix for lda} 6 | \usage{ 7 | wfm2lda(wfm, dir = NULL, names = c("mult.dat", "vocab.dat")) 8 | } 9 | \arguments{ 10 | \item{wfm}{a word frequency matrix} 11 | 12 | \item{dir}{a directory to dump the converted data into} 13 | 14 | \item{names}{Names of the data and vocabulary file respectively} 15 | } 16 | \value{ 17 | A list containing \item{data}{zero indexed word frequency 18 | information about a set of documents} \item{vocab}{a vocabulary list}, 19 | unless \code{dir} is specified. 20 | 21 | If \code{dir} is specified then the same information is dumped to 22 | 'vocab.dat' and 'mult.dat' in the \code{dir} folder. 23 | } 24 | \description{ 25 | Transforms a wfm to the format used by the lda package 26 | } 27 | \details{ 28 | See the documentation of \code{lda} package for the relevant object 29 | structures and file formats. 30 | } 31 | \seealso{ 32 | \code{\link{wfm}} 33 | } 34 | \author{ 35 | Will Lowe 36 | } 37 | -------------------------------------------------------------------------------- /man/LBG2003.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/austin-package.R 3 | \docType{data} 4 | \name{LBG2003} 5 | \alias{LBG2003} 6 | \title{Example Data} 7 | \description{ 8 | Example data from Laver, Benoit and Garry (2003) 9 | } 10 | \details{ 11 | This is the example word count data from Laver, Benoit and Garry's (2003) 12 | article on Wordscores. Documents R1 to R5 are assumed to have known 13 | positions: -1.5, -0.75, 0, 0.75, 1.5. Document V1 is assumed unknown.
The 14 | `correct' position for V1 is presumed to be -0.45. 15 | \code{\link{classic.wordscores}} generates approximately -0.45. 16 | 17 | To replicate the analysis in the paper, use the wordscores function 18 | with identification fixing the first 5 document positions and leaving 19 | the position of V1 to be predicted. 20 | 21 | \code{LBG2003} is a \code{jl_df} object. 22 | } 23 | \references{ 24 | M. Laver, K. Benoit and J. Garry (2003) 'Extracting policy 25 | positions from political texts using words as data' American 26 | Political Science Review 97(2), 311-331. 27 | } 28 | \keyword{datasets} 29 | -------------------------------------------------------------------------------- /man/coef.wordfish.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wordfish.R 3 | \name{coef.wordfish} 4 | \alias{coef.wordfish} 5 | \title{Extract Word Parameters} 6 | \usage{ 7 | \method{coef}{wordfish}(object, form = c("poisson", "multinomial"), ...) 8 | } 9 | \arguments{ 10 | \item{object}{an object of class wordfish} 11 | 12 | \item{form}{which parameterization of the model to return parameters for} 13 | 14 | \item{...}{extra arguments} 15 | } 16 | \value{ 17 | A data.frame of word parameters from a wordfish model in one or 18 | other parameterization. 19 | } 20 | \description{ 21 | Extract word parameters beta and psi in an appropriate model 22 | parameterization 23 | } 24 | \details{ 25 | Slope parameters and intercepts are labelled beta and psi respectively. In 26 | multinomial form the coefficient names reflect the fact that the 27 | first-listed word is taken as the reference category. In poisson form, the 28 | coefficients are labeled by the words they correspond to. 29 | 30 | Note that in both forms there will be beta and psi parameters, so make sure 31 | they are the ones you want.
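A minimal sketch, using simulated data from \code{\link{sim.wordfish}}
(argument values are arbitrary):
\preformatted{wf <- wordfish(sim.wordfish()$Y)
head(coef(wf, form = "poisson"))
head(coef(wf, form = "multinomial"))}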
32 | } 33 | \seealso{ 34 | \code{\link{wordfish}} 35 | } 36 | \author{ 37 | Will Lowe 38 | } 39 | -------------------------------------------------------------------------------- /R/rescale.R: -------------------------------------------------------------------------------- 1 | #' Rescale Estimated Document Positions 2 | #' 3 | #' Linearly rescales estimated document positions on the basis of two control 4 | #' points. 5 | #' 6 | #' The rescaled positions set the document with index ident[1] to position ident[2] 7 | #' and the document with index ident[3] to position ident[4]. The fitted model 8 | #' passed as the first argument is not affected. 9 | #' 10 | #' @param object fitted wordfish or wordscores object 11 | #' @param ident two document indexes and their desired new positions 12 | #' @return A data frame containing the rescaled document positions with 13 | #' standard errors if available. 14 | #' @author Will Lowe 15 | #' @importFrom stats coef lm 16 | #' @export 17 | rescale <- function(object, ident=c(1,-1,10,1)){ 18 | val1 <- object$theta[ident[1]] 19 | val2 <- object$theta[ident[3]] 20 | ab <- as.numeric(coef(lm(c(ident[2], ident[4]) ~ c(val1, val2)))) 21 | 22 | if (is(object, 'wordfish')){ 23 | d <- data.frame(theta=(object$theta*ab[2] + ab[1]), se=object$se.theta*ab[2]) 24 | names(d) <- c('Estimate', 'Std.
Error') 25 | } else if (is(object, 'wordscores')){ 26 | d <- data.frame(Estimate=(object$theta*ab[2] + ab[1])) 27 | } 28 | return(d) 29 | } 30 | 31 | -------------------------------------------------------------------------------- /man/initialize.urfish.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wordfish.R 3 | \name{initialize.urfish} 4 | \alias{initialize.urfish} 5 | \title{initialize.urfish} 6 | \usage{ 7 | initialize.urfish(tY) 8 | } 9 | \arguments{ 10 | \item{tY}{a document by word matrix of counts} 11 | } 12 | \value{ 13 | List with elements: \item{alpha}{starting values of alpha 14 | parameters} \item{psi}{starting values of psi parameters} 15 | \item{beta}{starting values of beta parameters} \item{theta}{starting values 16 | for document positions} 17 | } 18 | \description{ 19 | Get cheap starting values for a Wordfish model 20 | } 21 | \details{ 22 | This function is only called by model fitting routines and therefore does 23 | not take wfm class objects. tY is assumed to be in document by term form. 24 | 25 | In the poisson form of the model incidental parameters (alpha) are set to 26 | log(rowmeans/rowmeans[1]) and intercept (psi) values are set to log(colmeans). 27 | These are subtracted from the data matrix, which is logged and decomposed 28 | by SVD. Word slope (beta) and document position (theta) are estimated by 29 | rescaling SVD output. 30 | } 31 | \references{ 32 | This is substantially the method used by Slapin and Proksch's 33 | original code.
34 | } 35 | \author{ 36 | Will Lowe 37 | } 38 | -------------------------------------------------------------------------------- /man/predict.classic.wordscores.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/classic.wordscores.R 3 | \name{predict.classic.wordscores} 4 | \alias{predict.classic.wordscores} 5 | \title{Predict New Document Positions} 6 | \usage{ 7 | \method{predict}{classic.wordscores}(object, newdata = NULL, rescale = c("lbg", "none"), z = 0.95, ...) 8 | } 9 | \arguments{ 10 | \item{object}{Fitted wordscores model} 11 | 12 | \item{newdata}{An object of class wfm in which to look for word counts to 13 | predict document ideal points. If omitted, the reference documents are used.} 14 | 15 | \item{rescale}{Rescale method for estimated positions.} 16 | 17 | \item{z}{Notional confidence interval coverage} 18 | 19 | \item{...}{further arguments (quietly ignored)} 20 | } 21 | \value{ 22 | \code{predict.wordscores} produces a vector of predicted document 23 | positions and standard errors and confidence intervals. 24 | } 25 | \description{ 26 | Predicts positions of new documents from a fitted Wordscores model 27 | } 28 | \details{ 29 | This is the method described in Laver et al. 2003, including rescaling for 30 | more than one virgin text. Confidence intervals are not provided if 31 | \code{rescale} is 'none'. 
32 | } 33 | \seealso{ 34 | \code{\link{classic.wordscores}} 35 | } 36 | \author{ 37 | Will Lowe 38 | } 39 | -------------------------------------------------------------------------------- /man/classic.wordscores.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/classic.wordscores.R 3 | \name{classic.wordscores} 4 | \alias{classic.wordscores} 5 | \title{Old-Style Wordscores} 6 | \usage{ 7 | classic.wordscores(wfm, scores) 8 | } 9 | \arguments{ 10 | \item{wfm}{object of class wfm} 11 | 12 | \item{scores}{reference document positions/scores} 13 | } 14 | \value{ 15 | An old-style Wordscores analysis. 16 | } 17 | \description{ 18 | Construct a Wordscores model from reference document scores 19 | } 20 | \details{ 21 | This version of Wordscores is exactly as described in Laver et al. 2003 and 22 | is provided for historical interest and continued replicability of older 23 | analyses. 24 | 25 | \code{scores} is a vector of document scores corresponding to the documents 26 | in the word frequency matrix \code{wfm}. The function computes wordscores 27 | and returns a model from which virgin text scores can be predicted. 28 | } 29 | \examples{ 30 | 31 | data(lbg) 32 | ref <- getdocs(lbg, 1:5) 33 | ws <- classic.wordscores(ref, scores=seq(-1.5,1.5,by=0.75)) 34 | summary(ws) 35 | vir <- getdocs(lbg, 'V1') 36 | predict(ws, newdata=vir) 37 | 38 | } 39 | \references{ 40 | Laver, M. and Benoit, K. and Garry, J. (2003) 'Extracting policy 41 | positions from political texts using words as data' American Political 42 | Science Review. 97. 
pp. 311-331 43 | } 44 | \seealso{ 45 | \code{\link{summary.classic.wordscores}} 46 | } 47 | \author{ 48 | Will Lowe 49 | } 50 | -------------------------------------------------------------------------------- /man/predict.wordfish.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wordfish.R 3 | \name{predict.wordfish} 4 | \alias{predict.wordfish} 5 | \title{Predict Method for Wordfish} 6 | \usage{ 7 | \method{predict}{wordfish}( 8 | object, 9 | newdata = NULL, 10 | se.fit = FALSE, 11 | interval = c("none", "confidence"), 12 | level = 0.95, 13 | ... 14 | ) 15 | } 16 | \arguments{ 17 | \item{object}{A fitted wordfish model} 18 | 19 | \item{newdata}{An optional data frame or object of class wfm in which to 20 | look for word counts with which to predict document ideal points. If 21 | omitted, the fitted values are used.} 22 | 23 | \item{se.fit}{A switch indicating if standard errors are required.} 24 | 25 | \item{interval}{Type of interval calculation} 26 | 27 | \item{level}{Tolerance/confidence level} 28 | 29 | \item{...}{further arguments passed to or from other methods.} 30 | } 31 | \value{ 32 | \code{predict.wordfish} produces a vector of predictions or a matrix 33 | of predictions and bounds with column names `fit' and `se.fit', and with 34 | `lwr' and `upr' if `interval' is also set. 35 | } 36 | \description{ 37 | Predicts positions of new documents using a fitted Wordfish model 38 | } 39 | \details{ 40 | Standard errors for document positions are generated by numerically 41 | inverting the relevant Hessians from the profile likelihood of the 42 | multinomial form of the model.
43 | } 44 | \seealso{ 45 | \code{\link{wordfish}} 46 | } 47 | \author{ 48 | Will Lowe 49 | } 50 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | S3method(coef,classic.wordscores) 4 | S3method(coef,wordfish) 5 | S3method(fitted,wordfish) 6 | S3method(plot,classic.wordscores) 7 | S3method(plot,coef.wordfish) 8 | S3method(plot,wordfish) 9 | S3method(predict,classic.wordscores) 10 | S3method(predict,wordfish) 11 | S3method(print,summary.wordfish) 12 | S3method(print,wordfish) 13 | S3method(summary,classic.wordscores) 14 | S3method(summary,wordfish) 15 | export("docs<-") 16 | export("words<-") 17 | export(as.docword) 18 | export(as.wfm) 19 | export(as.worddoc) 20 | export(bootstrap.se) 21 | export(classic.wordscores) 22 | export(docs) 23 | export(extractwords) 24 | export(getdocs) 25 | export(initialize.urfish) 26 | export(is.wfm) 27 | export(rescale) 28 | export(sim.wordfish) 29 | export(trim) 30 | export(wfm) 31 | export(wfm2bmr) 32 | export(wfm2lda) 33 | export(wordfish) 34 | export(wordmargin) 35 | export(words) 36 | importFrom(grDevices,rgb) 37 | importFrom(graphics,dotchart) 38 | importFrom(graphics,plot) 39 | importFrom(graphics,points) 40 | importFrom(graphics,segments) 41 | importFrom(graphics,text) 42 | importFrom(graphics,title) 43 | importFrom(methods,is) 44 | importFrom(numDeriv,hessian) 45 | importFrom(stats,coef) 46 | importFrom(stats,cor) 47 | importFrom(stats,dmultinom) 48 | importFrom(stats,lm) 49 | importFrom(stats,median) 50 | importFrom(stats,optim) 51 | importFrom(stats,optimize) 52 | importFrom(stats,qnorm) 53 | importFrom(stats,rmultinom) 54 | importFrom(stats,rnorm) 55 | importFrom(stats,rpois) 56 | importFrom(stats,sd) 57 | importFrom(utils,flush.console) 58 | importFrom(utils,glob2rx) 59 | importFrom(utils,read.csv) 60 | importFrom(utils,read.table) 61 | 
-------------------------------------------------------------------------------- /man/sim.wordfish.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wordfish.R 3 | \name{sim.wordfish} 4 | \alias{sim.wordfish} 5 | \title{Simulate data and parameters for a Wordfish model} 6 | \usage{ 7 | sim.wordfish( 8 | docs = 10, 9 | vocab = 20, 10 | doclen = 500, 11 | dist = c("spaced", "normal"), 12 | scaled = TRUE 13 | ) 14 | } 15 | \arguments{ 16 | \item{docs}{How many `documents' should be generated} 17 | 18 | \item{vocab}{How many `word' types should be generated} 19 | 20 | \item{doclen}{A scalar `document' length or vector of lengths} 21 | 22 | \item{dist}{the distribution of `document' positions} 23 | 24 | \item{scaled}{whether the document positions should be mean 0, unit sd} 25 | } 26 | \value{ 27 | \item{Y}{A sample word-document matrix} \item{theta}{The `document' 28 | positions} \item{doclen}{The `document' lengths} \item{beta}{`Word' 29 | slopes} \item{psi}{`Word' intercepts} 30 | } 31 | \description{ 32 | Simulates data and returns parameter values using Wordfish model 33 | assumptions: Counts are sampled under the assumption of independent Poisson 34 | draws with log expected means linearly related to a lattice of document 35 | positions. 36 | } 37 | \details{ 38 | This function draws `docs' document positions from a Normal distribution, or 39 | regularly spaced between 1/`docs' and 1. 40 | 41 | `vocab'/2 word slopes are 1, the rest -1. All word intercepts are 0. 42 | `doclen' words are then sampled from a multinomial with these parameters. 43 | 44 | Document position (theta) is sorted in increasing size across the documents. 45 | If `scaled' is true it is normalized to mean zero, unit standard deviation. 46 | This is most helpful when dist=normal.
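A minimal sketch (argument values are arbitrary):
\preformatted{s <- sim.wordfish(docs = 10, vocab = 20, doclen = 500)
dim(s$Y)    # the simulated word frequency matrix
s$theta     # the true document positions}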
47 | } 48 | \author{ 49 | Will Lowe 50 | } 51 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.css: -------------------------------------------------------------------------------- 1 | /*! 2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | 6 | /* modified from https://github.com/twbs/bootstrap/blob/94b4076dd2efba9af71f0b18d4ee4b163aa9e0dd/docs/assets/css/src/docs.css#L548-L601 */ 7 | 8 | /* All levels of nav */ 9 | nav[data-toggle='toc'] .nav > li > a { 10 | display: block; 11 | padding: 4px 20px; 12 | font-size: 13px; 13 | font-weight: 500; 14 | color: #767676; 15 | } 16 | nav[data-toggle='toc'] .nav > li > a:hover, 17 | nav[data-toggle='toc'] .nav > li > a:focus { 18 | padding-left: 19px; 19 | color: #563d7c; 20 | text-decoration: none; 21 | background-color: transparent; 22 | border-left: 1px solid #563d7c; 23 | } 24 | nav[data-toggle='toc'] .nav > .active > a, 25 | nav[data-toggle='toc'] .nav > .active:hover > a, 26 | nav[data-toggle='toc'] .nav > .active:focus > a { 27 | padding-left: 18px; 28 | font-weight: bold; 29 | color: #563d7c; 30 | background-color: transparent; 31 | border-left: 2px solid #563d7c; 32 | } 33 | 34 | /* Nav: second level (shown on .active) */ 35 | nav[data-toggle='toc'] .nav .nav { 36 | display: none; /* Hide by default, but at >768px, show it */ 37 | padding-bottom: 10px; 38 | } 39 | nav[data-toggle='toc'] .nav .nav > li > a { 40 | padding-top: 1px; 41 | padding-bottom: 1px; 42 | padding-left: 30px; 43 | font-size: 12px; 44 | font-weight: normal; 45 | } 46 | nav[data-toggle='toc'] .nav .nav > li > a:hover, 47 | nav[data-toggle='toc'] .nav .nav > li > a:focus { 48 | padding-left: 29px; 49 | } 50 | nav[data-toggle='toc'] .nav .nav > .active > a, 51 | nav[data-toggle='toc'] .nav .nav > .active:hover > a, 52 | 
nav[data-toggle='toc'] .nav .nav > .active:focus > a { 53 | padding-left: 28px; 54 | font-weight: 500; 55 | } 56 | 57 | /* from https://github.com/twbs/bootstrap/blob/e38f066d8c203c3e032da0ff23cd2d6098ee2dd6/docs/assets/css/src/docs.css#L631-L634 */ 58 | nav[data-toggle='toc'] .nav > .active > ul { 59 | display: block; 60 | } 61 | -------------------------------------------------------------------------------- /tests/testthat/test_wordscores.R: -------------------------------------------------------------------------------- 1 | library(austin) 2 | context("wordscores") 3 | 4 | tol <- 0.00001 5 | 6 | test_that("wordscores replicates the toy data example", { 7 | data(lbg) 8 | cws <- classic.wordscores(lbg[,1:5], scores=seq(-1.5, 1.5, 0.75)) 9 | expect_is(cws, 'wordscores') 10 | expect_is(cws, 'classic.wordscores') 11 | 12 | (summ <- summary(cws)) 13 | expect_equal(summ$Total, rep(1000, 5)) 14 | expect_equal(summ$Max[1], 158) 15 | expect_equal(summ$Mean[1], 27.02703, tolerance=tol) 16 | 17 | (ws <- coef(cws)) 18 | expect_equal(ws['A',], -1.5) 19 | expect_equal(ws['F',], -1.48125) 20 | expect_equal(ws['Z',], 1.0369898) 21 | expect_equal(ws['ZK',], 1.5) 22 | 23 | pre <- predict(cws, newdata=lbg[,6,drop=FALSE]) 24 | expect_equal(pre$Score[1], -0.4480591, tolerance = tol) 25 | ## The Laver et al. paper says the SE is 0.018. That's a typo 26 | expect_equal(pre$'Std. 
Err.'[1], 0.01189767, tolerance = tol) 27 | expect_equal(pre$Rescaled[1], -0.4480591, tolerance = tol) 28 | expect_equal(pre$Lower[1], -0.4593619, tolerance = tol) 29 | expect_equal(pre$Upper[1], -0.4367563, tolerance = tol) 30 | 31 | }) 32 | 33 | test_that("wordscores (nearly) replicates the UK manifesto data", { 34 | data(ukmanif) 35 | cws <- classic.wordscores(ukmanif[,c(2,4,5)], 36 | scores=c(17.21, 5.35, 8.21)) 37 | 38 | (summ <- summary(cws)) 39 | expect_equal(summ$Total, c(28672,11345,17203)) 40 | expect_equal(summ$Max, c(1851,613,992)) 41 | expect_equal(summ$Mean, c(4.007828, 1.585826, 2.404669), 42 | tolerance=tol) 43 | 44 | (ws <- coef(cws)) 45 | expect_equal(nrow(ws), 5511) 46 | expect_equal(ws['wide',], 9.767373, tolerance=tol) 47 | expect_equal(ws['travel',], 13.995690, tolerance=tol) 48 | expect_equal(ws['tax',], 9.932435, tolerance=tol) 49 | 50 | (pre <- predict(cws, newdata=ukmanif[,c(1,3,6)])) 51 | ## within 0.05 of the paper's numbers 52 | expect_equal(pre$Rescaled, c(9.193603, 17.160808, 4.972796), 53 | tolerance = tol) 54 | ## The Laver et al. paper says the SE is 0.018. That's a typo 55 | expect_equal(pre$'Std. Err.'[1], 0.01479069, tolerance = tol) 56 | }) 57 | -------------------------------------------------------------------------------- /docs/docsearch.js: -------------------------------------------------------------------------------- 1 | $(function() { 2 | 3 | // register a handler to move the focus to the search bar 4 | // upon pressing shift + "/" (i.e. 
"?") 5 | $(document).on('keydown', function(e) { 6 | if (e.shiftKey && e.keyCode == 191) { 7 | e.preventDefault(); 8 | $("#search-input").focus(); 9 | } 10 | }); 11 | 12 | $(document).ready(function() { 13 | // do keyword highlighting 14 | /* modified from https://jsfiddle.net/julmot/bL6bb5oo/ */ 15 | var mark = function() { 16 | 17 | var referrer = document.URL ; 18 | var paramKey = "q" ; 19 | 20 | if (referrer.indexOf("?") !== -1) { 21 | var qs = referrer.substr(referrer.indexOf('?') + 1); 22 | var qs_noanchor = qs.split('#')[0]; 23 | var qsa = qs_noanchor.split('&'); 24 | var keyword = ""; 25 | 26 | for (var i = 0; i < qsa.length; i++) { 27 | var currentParam = qsa[i].split('='); 28 | 29 | if (currentParam.length !== 2) { 30 | continue; 31 | } 32 | 33 | if (currentParam[0] == paramKey) { 34 | keyword = decodeURIComponent(currentParam[1].replace(/\+/g, "%20")); 35 | } 36 | } 37 | 38 | if (keyword !== "") { 39 | $(".contents").unmark({ 40 | done: function() { 41 | $(".contents").mark(keyword); 42 | } 43 | }); 44 | } 45 | } 46 | }; 47 | 48 | mark(); 49 | }); 50 | }); 51 | 52 | /* Search term highlighting ------------------------------*/ 53 | 54 | function matchedWords(hit) { 55 | var words = []; 56 | 57 | var hierarchy = hit._highlightResult.hierarchy; 58 | // loop to fetch from lvl0, lvl1, etc. 
59 | for (var idx in hierarchy) { 60 | words = words.concat(hierarchy[idx].matchedWords); 61 | } 62 | 63 | var content = hit._highlightResult.content; 64 | if (content) { 65 | words = words.concat(content.matchedWords); 66 | } 67 | 68 | // return unique words 69 | var words_uniq = [...new Set(words)]; 70 | return words_uniq; 71 | } 72 | 73 | function updateHitURL(hit) { 74 | 75 | var words = matchedWords(hit); 76 | var url = ""; 77 | 78 | if (hit.anchor) { 79 | url = hit.url_without_anchor + '?q=' + escape(words.join(" ")) + '#' + hit.anchor; 80 | } else { 81 | url = hit.url + '?q=' + escape(words.join(" ")); 82 | } 83 | 84 | return url; 85 | } 86 | -------------------------------------------------------------------------------- /man/wfm.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wfm.R 3 | \name{wfm} 4 | \alias{wfm} 5 | \title{Word Frequency Matrix} 6 | \usage{ 7 | wfm(mat, word.margin = 1) 8 | } 9 | \arguments{ 10 | \item{mat}{matrix of word counts or the name of a csv file of word counts} 11 | 12 | \item{word.margin}{which margin holds the words} 13 | } 14 | \value{ 15 | A word frequency matrix from a suitable object, or read from a file 16 | if \code{mat} is character. Which margin is treated as representing words 17 | is set by \code{word.margin}. 18 | } 19 | \description{ 20 | A word count matrix that knows which margin holds the words. 21 | } 22 | \details{ 23 | If \code{mat} is a filename it should name a file in comma-separated value format 24 | with row labels in the first column and column labels in the first row. 25 | Which margin represents words and which represents documents is specified by 26 | \code{word.margin}, which defaults to words as rows. 27 | 28 | A word frequency matrix is defined as any two-dimensional matrix with 29 | non-empty row and column names and dimnames 'words' and 'docs' (in either 30 | order).
The actual class of such an object is not important for the 31 | operation of the functions in this package, so wfm is essentially an 32 | interface. The function \code{\link{is.wfm}} is a (currently rather loose) 33 | check whether an object fulfils the interface contract. 34 | 35 | For such objects the convenience accessor functions \code{\link{as.docword}} 36 | and \code{\link{as.worddoc}} can be used to get counts whichever way up 37 | you need them. 38 | 39 | \code{\link{words}} returns the words and \code{\link{docs}} returns the 40 | document titles. \code{\link{wordmargin}} reminds you which margin contains 41 | the words. Assigning \code{wordmargin} flips the dimension names. 42 | 43 | To extract particular documents by name or index, use \link{getdocs}. 44 | 45 | \code{\link{as.wfm}} attempts to convert things to be word frequency 46 | matrices. This functionality is currently limited to objects on which 47 | \code{as.matrix} already works, and to \code{TermDocumentMatrix} and 48 | \code{DocumentTermMatrix} objects from the \code{tm} package. 49 | } 50 | \examples{ 51 | 52 | mat <- matrix(1:6, ncol = 2) 53 | rownames(mat) <- c('W1','W2','W3') 54 | colnames(mat) <- c('D1','D2') 55 | m <- wfm(mat, word.margin = 1) 56 | getdocs(as.docword(m), 'D2') 57 | 58 | } 59 | \seealso{ 60 | \code{\link{as.wfm}}, \code{\link{as.docword}}, 61 | \code{\link{as.worddoc}}, \code{\link{docs}}, \code{\link{words}}, 62 | \code{\link{is.wfm}}, \code{\link{wordmargin}} 63 | } 64 | \author{ 65 | Will Lowe 66 | } 67 | -------------------------------------------------------------------------------- /.github/workflows/check-standard.yaml: -------------------------------------------------------------------------------- 1 | # For help debugging build failures open an issue on the RStudio community with the 'github-actions' tag.
2 | # https://community.rstudio.com/new-topic?category=Package%20development&tags=github-actions 3 | on: 4 | push: 5 | branches: 6 | - main 7 | - master 8 | pull_request: 9 | branches: 10 | - main 11 | - master 12 | 13 | name: R-CMD-check 14 | 15 | jobs: 16 | R-CMD-check: 17 | runs-on: ${{ matrix.config.os }} 18 | 19 | name: ${{ matrix.config.os }} (${{ matrix.config.r }}) 20 | 21 | strategy: 22 | fail-fast: false 23 | matrix: 24 | config: 25 | - {os: windows-latest, r: 'release'} 26 | - {os: macOS-latest, r: 'release'} 27 | - {os: ubuntu-20.04, r: 'release', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"} 28 | - {os: ubuntu-20.04, r: 'devel', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"} 29 | 30 | env: 31 | R_REMOTES_NO_ERRORS_FROM_WARNINGS: true 32 | RSPM: ${{ matrix.config.rspm }} 33 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 34 | 35 | steps: 36 | - uses: actions/checkout@v2 37 | 38 | - uses: r-lib/actions/setup-r@v1 39 | with: 40 | r-version: ${{ matrix.config.r }} 41 | 42 | - uses: r-lib/actions/setup-pandoc@v1 43 | 44 | - name: Query dependencies 45 | run: | 46 | install.packages('remotes') 47 | saveRDS(remotes::dev_package_deps(dependencies = TRUE), ".github/depends.Rds", version = 2) 48 | writeLines(sprintf("R-%i.%i", getRversion()$major, getRversion()$minor), ".github/R-version") 49 | shell: Rscript {0} 50 | 51 | - name: Restore R package cache 52 | if: runner.os != 'Windows' 53 | uses: actions/cache@v2 54 | with: 55 | path: ${{ env.R_LIBS_USER }} 56 | key: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1-${{ hashFiles('.github/depends.Rds') }} 57 | restore-keys: ${{ runner.os }}-${{ hashFiles('.github/R-version') }}-1- 58 | 59 | - name: Install system dependencies 60 | if: runner.os == 'Linux' 61 | run: | 62 | while read -r cmd 63 | do 64 | eval sudo $cmd 65 | done < <(Rscript -e 'writeLines(remotes::system_requirements("ubuntu", "20.04"))') 66 | 67 | - name: Install dependencies 68 | run: | 69 | 
remotes::install_deps(dependencies = TRUE) 70 | remotes::install_cran("rcmdcheck") 71 | shell: Rscript {0} 72 | 73 | - name: Check 74 | env: 75 | _R_CHECK_CRAN_INCOMING_REMOTE_: false 76 | run: | 77 | options(crayon.enabled = TRUE) 78 | rcmdcheck::rcmdcheck(args = c("--no-manual", "--as-cran"), error_on = "warning", check_dir = "check") 79 | shell: Rscript {0} 80 | 81 | - name: Upload check results 82 | if: failure() 83 | uses: actions/upload-artifact@main 84 | with: 85 | name: ${{ runner.os }}-r${{ matrix.config.r }}-results 86 | path: check 87 | -------------------------------------------------------------------------------- /man/wordfish.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/wordfish.R 3 | \name{wordfish} 4 | \alias{wordfish} 5 | \title{Estimate a Wordfish Model} 6 | \usage{ 7 | wordfish( 8 | wfm, 9 | dir = c(1, length(docs(wfm))), 10 | control = list(tol = 1e-06, sigma = 3, startparams = NULL, conv.check = c("ll", 11 | "cor")), 12 | verbose = FALSE 13 | ) 14 | } 15 | \arguments{ 16 | \item{wfm}{a word frequency matrix} 17 | 18 | \item{dir}{set global identification by forcing \code{theta[dir[1]]} < 19 | \code{theta[dir[2]]} (defaults to first and last document)} 20 | 21 | \item{control}{list of estimation options} 22 | 23 | \item{verbose}{produce a running commentary} 24 | } 25 | \value{ 26 | An object of class wordfish. 
This is a list containing: 27 | 28 | \item{dir}{global identification of the dimension} 29 | \item{theta}{document 30 | positions} 31 | \item{alpha}{document fixed effects} 32 | \item{beta}{word slope 33 | parameters} 34 | \item{psi}{word fixed effects} 35 | \item{docs}{names of the documents} 36 | \item{words}{names of words} 37 | \item{sigma}{regularization parameter for betas in poisson form} 38 | \item{ll}{final log likelihood} 39 | \item{se.theta}{standard errors for document position} 40 | \item{data}{the original data} 41 | } 42 | \description{ 43 | Estimates a Wordfish model using Conditional Maximum Likelihood. 44 | } 45 | \details{ 46 | Fits a Wordfish model with document ideal points constrained to mean zero 47 | and unit standard deviation. 48 | 49 | The \code{control} list specifies options for the estimation process. 50 | \code{conv.check} is either 'll', which stops when the difference 51 | in log likelihood between iterations is less than \code{tol}, or 'cor', 52 | which stops when one minus the correlation between the \code{theta}s 53 | from the current and the previous iterations is less 54 | than \code{tol}. \code{sigma} is the standard deviation for the beta 55 | prior in poisson form. \code{startparams} is a list of starting values 56 | (\code{theta}, \code{beta}, \code{psi} and \code{alpha}) or a 57 | previously fitted Wordfish model for the same data. 58 | \code{verbose} generates a running commentary during estimation. 59 | 60 | The model has two equivalent forms: a Poisson model with two sets of 61 | document parameters and two sets of word parameters, and a multinomial with two sets of 62 | word parameters and document ideal points. The first form is used for 63 | estimation, the second is available for alternative summaries, prediction, 64 | and profile standard error calculations. 65 | 66 | The model is regularized by assuming a prior on beta with mean zero and 67 | standard deviation sigma (in poisson form).
If you don't want to 68 | regularize, set sigma to a large number. 69 | } 70 | \examples{ 71 | 72 | dd <- sim.wordfish() 73 | wf <- wordfish(dd$Y) 74 | summary(wf) 75 | 76 | } 77 | \references{ 78 | Slapin and Proksch (2008) 'A Scaling Model for Estimating 79 | Time-Series Party Positions from Texts.' American Journal of Political 80 | Science 52(3):705-722. 81 | } 82 | \seealso{ 83 | \code{\link{plot.wordfish}}, \code{\link{summary.wordfish}}, 84 | \code{\link{coef.wordfish}}, \code{\link{fitted.wordfish}}, 85 | \code{\link{predict.wordfish}}, \code{\link{sim.wordfish}} 86 | } 87 | \author{ 88 | Will Lowe 89 | } 90 | -------------------------------------------------------------------------------- /docs/pkgdown.js: -------------------------------------------------------------------------------- 1 | /* http://gregfranko.com/blog/jquery-best-practices/ */ 2 | (function($) { 3 | $(function() { 4 | 5 | $('.navbar-fixed-top').headroom(); 6 | 7 | $('body').css('padding-top', $('.navbar').height() + 10); 8 | $(window).resize(function(){ 9 | $('body').css('padding-top', $('.navbar').height() + 10); 10 | }); 11 | 12 | $('[data-toggle="tooltip"]').tooltip(); 13 | 14 | var cur_path = paths(location.pathname); 15 | var links = $("#navbar ul li a"); 16 | var max_length = -1; 17 | var pos = -1; 18 | for (var i = 0; i < links.length; i++) { 19 | if (links[i].getAttribute("href") === "#") 20 | continue; 21 | // Ignore external links 22 | if (links[i].host !== location.host) 23 | continue; 24 | 25 | var nav_path = paths(links[i].pathname); 26 | 27 | var length = prefix_length(nav_path, cur_path); 28 | if (length > max_length) { 29 | max_length = length; 30 | pos = i; 31 | } 32 | } 33 | 34 | // Add class to parent
  • , and enclosing
  • if in dropdown 35 | if (pos >= 0) { 36 | var menu_anchor = $(links[pos]); 37 | menu_anchor.parent().addClass("active"); 38 | menu_anchor.closest("li.dropdown").addClass("active"); 39 | } 40 | }); 41 | 42 | function paths(pathname) { 43 | var pieces = pathname.split("/"); 44 | pieces.shift(); // always starts with / 45 | 46 | var end = pieces[pieces.length - 1]; 47 | if (end === "index.html" || end === "") 48 | pieces.pop(); 49 | return(pieces); 50 | } 51 | 52 | // Returns -1 if not found 53 | function prefix_length(needle, haystack) { 54 | if (needle.length > haystack.length) 55 | return(-1); 56 | 57 | // Special case for length-0 haystack, since for loop won't run 58 | if (haystack.length === 0) { 59 | return(needle.length === 0 ? 0 : -1); 60 | } 61 | 62 | for (var i = 0; i < haystack.length; i++) { 63 | if (needle[i] != haystack[i]) 64 | return(i); 65 | } 66 | 67 | return(haystack.length); 68 | } 69 | 70 | /* Clipboard --------------------------*/ 71 | 72 | function changeTooltipMessage(element, msg) { 73 | var tooltipOriginalTitle=element.getAttribute('data-original-title'); 74 | element.setAttribute('data-original-title', msg); 75 | $(element).tooltip('show'); 76 | element.setAttribute('data-original-title', tooltipOriginalTitle); 77 | } 78 | 79 | if(ClipboardJS.isSupported()) { 80 | $(document).ready(function() { 81 | var copyButton = ""; 82 | 83 | $(".examples, div.sourceCode").addClass("hasCopyButton"); 84 | 85 | // Insert copy buttons: 86 | $(copyButton).prependTo(".hasCopyButton"); 87 | 88 | // Initialize tooltips: 89 | $('.btn-copy-ex').tooltip({container: 'body'}); 90 | 91 | // Initialize clipboard: 92 | var clipboardBtnCopies = new ClipboardJS('[data-clipboard-copy]', { 93 | text: function(trigger) { 94 | return trigger.parentNode.textContent; 95 | } 96 | }); 97 | 98 | clipboardBtnCopies.on('success', function(e) { 99 | changeTooltipMessage(e.trigger, 'Copied!'); 100 | e.clearSelection(); 101 | }); 102 | 103 | clipboardBtnCopies.on('error', 
function(e) { 104 | changeTooltipMessage(e.trigger,'Press Ctrl+C or Command+C to copy'); 105 | }); 106 | }); 107 | } 108 | })(window.jQuery || window.$) 109 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.js: -------------------------------------------------------------------------------- 1 | /*! 2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | (function() { 6 | 'use strict'; 7 | 8 | window.Toc = { 9 | helpers: { 10 | // return all matching elements in the set, or their descendants 11 | findOrFilter: function($el, selector) { 12 | // http://danielnouri.org/notes/2011/03/14/a-jquery-find-that-also-finds-the-root-element/ 13 | // http://stackoverflow.com/a/12731439/358804 14 | var $descendants = $el.find(selector); 15 | return $el.filter(selector).add($descendants).filter(':not([data-toc-skip])'); 16 | }, 17 | 18 | generateUniqueIdBase: function(el) { 19 | var text = $(el).text(); 20 | var anchor = text.trim().toLowerCase().replace(/[^A-Za-z0-9]+/g, '-'); 21 | return anchor || el.tagName.toLowerCase(); 22 | }, 23 | 24 | generateUniqueId: function(el) { 25 | var anchorBase = this.generateUniqueIdBase(el); 26 | for (var i = 0; ; i++) { 27 | var anchor = anchorBase; 28 | if (i > 0) { 29 | // add suffix 30 | anchor += '-' + i; 31 | } 32 | // check if ID already exists 33 | if (!document.getElementById(anchor)) { 34 | return anchor; 35 | } 36 | } 37 | }, 38 | 39 | generateAnchor: function(el) { 40 | if (el.id) { 41 | return el.id; 42 | } else { 43 | var anchor = this.generateUniqueId(el); 44 | el.id = anchor; 45 | return anchor; 46 | } 47 | }, 48 | 49 | createNavList: function() { 50 | return $(''); 51 | }, 52 | 53 | createChildNavList: function($parent) { 54 | var $childList = this.createNavList(); 55 | $parent.append($childList); 56 | return $childList;
57 | }, 58 | 59 | generateNavEl: function(anchor, text) { 60 | var $a = $(''); 61 | $a.attr('href', '#' + anchor); 62 | $a.text(text); 63 | var $li = $('
  • '); 64 | $li.append($a); 65 | return $li; 66 | }, 67 | 68 | generateNavItem: function(headingEl) { 69 | var anchor = this.generateAnchor(headingEl); 70 | var $heading = $(headingEl); 71 | var text = $heading.data('toc-text') || $heading.text(); 72 | return this.generateNavEl(anchor, text); 73 | }, 74 | 75 | // Find the first heading level (`

    `, then `

    `, etc.) that has more than one element. Defaults to 1 (for `

    `). 76 | getTopLevel: function($scope) { 77 | for (var i = 1; i <= 6; i++) { 78 | var $headings = this.findOrFilter($scope, 'h' + i); 79 | if ($headings.length > 1) { 80 | return i; 81 | } 82 | } 83 | 84 | return 1; 85 | }, 86 | 87 | // returns the elements for the top level, and the next below it 88 | getHeadings: function($scope, topLevel) { 89 | var topSelector = 'h' + topLevel; 90 | 91 | var secondaryLevel = topLevel + 1; 92 | var secondarySelector = 'h' + secondaryLevel; 93 | 94 | return this.findOrFilter($scope, topSelector + ',' + secondarySelector); 95 | }, 96 | 97 | getNavLevel: function(el) { 98 | return parseInt(el.tagName.charAt(1), 10); 99 | }, 100 | 101 | populateNav: function($topContext, topLevel, $headings) { 102 | var $context = $topContext; 103 | var $prevNav; 104 | 105 | var helpers = this; 106 | $headings.each(function(i, el) { 107 | var $newNav = helpers.generateNavItem(el); 108 | var navLevel = helpers.getNavLevel(el); 109 | 110 | // determine the proper $context 111 | if (navLevel === topLevel) { 112 | // use top level 113 | $context = $topContext; 114 | } else if ($prevNav && $context === $topContext) { 115 | // create a new level of the tree and switch to it 116 | $context = helpers.createChildNavList($prevNav); 117 | } // else use the current $context 118 | 119 | $context.append($newNav); 120 | 121 | $prevNav = $newNav; 122 | }); 123 | }, 124 | 125 | parseOps: function(arg) { 126 | var opts; 127 | if (arg.jquery) { 128 | opts = { 129 | $nav: arg 130 | }; 131 | } else { 132 | opts = arg; 133 | } 134 | opts.$scope = opts.$scope || $(document.body); 135 | return opts; 136 | } 137 | }, 138 | 139 | // accepts a jQuery object, or an options object 140 | init: function(opts) { 141 | opts = this.helpers.parseOps(opts); 142 | 143 | // ensure that the data attribute is in place for styling 144 | opts.$nav.attr('data-toggle', 'toc'); 145 | 146 | var $topContext = this.helpers.createChildNavList(opts.$nav); 147 | var topLevel = 
this.helpers.getTopLevel(opts.$scope); 148 | var $headings = this.helpers.getHeadings(opts.$scope, topLevel); 149 | this.helpers.populateNav($topContext, topLevel, $headings); 150 | } 151 | }; 152 | 153 | $(function() { 154 | $('nav[data-toggle="toc"]').each(function(i, el) { 155 | var $nav = $(el); 156 | Toc.init($nav); 157 | }); 158 | }); 159 | })(); 160 | -------------------------------------------------------------------------------- /docs/articles/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Articles • austin 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 60 | 61 |
    62 |
    63 | 108 | 109 | 110 | 111 |
    112 | 113 |
    114 |
    115 | 118 | 119 |
    120 |

    All vignettes

    121 |

    122 | 123 |
    124 |
    Introduction to Austin
    125 |
    126 |
    127 |
    128 |
    129 |
    130 | 131 | 132 |
    133 | 136 | 137 |
    138 |

    Site built with pkgdown 1.6.1.

    139 |
    140 | 141 |
    142 |
    143 | 144 | 145 | 146 | 147 | 148 | 149 | 150 | 151 | -------------------------------------------------------------------------------- /docs/404.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Page not found (404) • austin 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 60 | 61 |
    62 |
    63 | 108 | 109 | 110 | 111 |
    112 | 113 |
    114 |
    115 | 118 | 119 | Content not found. Please use links in the navbar. 120 | 121 |
    122 | 123 | 128 | 129 |
    130 | 131 | 132 | 133 |
    134 | 137 | 138 |
    139 |

    Site built with pkgdown 1.6.1.

    140 |
    141 | 142 |
    143 |
    144 | 145 | 146 | 147 | 148 | 149 | 150 | 151 | 152 | -------------------------------------------------------------------------------- /docs/authors.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Citation and Authors • austin 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 60 | 61 |
    62 |
    63 | 108 | 109 | 110 | 111 |
    112 | 113 |
    114 |
    115 | 119 | 120 |

    Will Lowe 2015. Austin: Do things with words. Version 0.3.0 URL http://conjugateprior.github.io/austin

    121 |
    @Manual{,
    122 |   title = {Austin: Do things with words},
    123 |   author = {Will Lowe},
    124 |   year = {2017},
    125 |   url = {http://conjugateprior.github.io/austin},
    126 | }
    127 | 128 | 131 | 132 |
      133 |
    • 134 |

      Will Lowe. Author, maintainer. 135 |

      136 |
    • 137 |
    138 | 139 |
    140 | 141 |
    142 | 143 | 144 | 145 |
    146 | 149 | 150 |
    151 |

    Site built with pkgdown 1.6.1.

    152 |
    153 | 154 |
    155 |
    156 | 157 | 158 | 159 | 160 | 161 | 162 | 163 | 164 | -------------------------------------------------------------------------------- /docs/reference/iebudget2009.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Irish Budget Debate Data 2009 — iebudget2009 • austin 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 61 | 62 |
    63 |
    64 | 109 | 110 | 111 | 112 |
    113 | 114 |
    115 |
    116 | 121 | 122 |
    123 |

    Irish budget debate 2009

    124 |
    125 | 126 | 127 | 128 |

    Details

    129 | 130 |

These are word counts from the 2009 Budget debate in Ireland.

    131 |

This is a word frequency matrix. Loading this data also makes available 132 | iebudget2009cov, which contains covariates for the speakers.

    133 | 134 |
    135 | 140 |
    141 | 142 | 143 |
    144 | 147 | 148 |
    149 |

    Site built with pkgdown 1.6.1.

    150 |
    151 | 152 |
    153 |
    154 | 155 | 156 | 157 | 158 | 159 | 160 | 161 | 162 | -------------------------------------------------------------------------------- /docs/reference/daildata.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | The Irish No-Confidence debate — daildata • austin 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 61 | 62 |
    63 |
    64 | 109 | 110 | 111 | 112 |
    113 | 114 |
    115 |
    116 | 121 | 122 |
    123 |

    Irish No-Confidence Debate

    124 |
    125 | 126 | 127 | 128 |

    Details

    129 | 130 |

These are word counts from the no-confidence debate in Ireland.

    131 |

    daildata is a word frequency object.

    132 |

    References

    133 | 134 |

    Benoit and Laver's Irish Political Studies piece. (fixme!)

    135 | 136 |
    137 | 142 |
    143 | 144 | 145 |
    146 | 149 | 150 |
    151 |

    Site built with pkgdown 1.6.1.

    152 |
    153 | 154 |
    155 |
    156 | 157 | 158 | 159 | 160 | 161 | 162 | 163 | 164 | -------------------------------------------------------------------------------- /docs/news/index.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Changelog • austin 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 55 | 56 | 57 | 58 | 59 | 60 | 61 |
    62 |
    63 | 108 | 109 | 110 | 111 |
    112 | 113 |
    114 |
    115 | 119 | 120 |
    121 |

    122 | austin 0.3.0

    123 |
      124 |
• Updated package structure for more recent versions of R
    • 125 |
    • Added a github site
    • 126 |
    127 |
    128 |
    129 |

    130 | austin 0.22

    131 |
      132 |
• Made sure the demanif.* datasets work with the indexing behaviour of recent R versions
    • 133 |
    134 |
    135 |
    136 |

    137 | austin 0.21

    138 |
      139 |
    • First import from r-forge
    • 140 |
    • Revised vignette is now in markdown
    • 141 |
    • Some basic tests
    • 142 |
    143 |
    144 |
    145 | 146 | 151 | 152 |
    153 | 154 | 155 |
    156 | 159 | 160 |
    161 |

    Site built with pkgdown 1.6.1.

    162 |
    163 | 164 |
    165 |
    166 | 167 | 168 | 169 | 170 | 171 | 172 | 173 | 174 | -------------------------------------------------------------------------------- /docs/reference/ukmanif.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | UK Manifesto Data — ukmanif • austin 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 61 | 62 |
    63 |
    64 | 109 | 110 | 111 | 112 |
    113 | 114 |
    115 |
    116 | 121 | 122 |
    123 |

    UK manifesto data from Laver et al.

    124 |
    125 | 126 | 127 | 128 |

    Details

    129 | 130 |

These are word counts from the manifestos of the three main UK parties for 131 | the 1992 and 1997 elections.

    132 |

    ukmanif is a word frequency object.

    133 |

    References

    134 | 135 |

Laver, Benoit and Garry (2003) `Extracting policy positions from 136 | political text using words as data' American Political Science Review 97(2).

    137 | 138 |
    139 | 144 |
    145 | 146 | 147 |
    148 | 151 | 152 |
    153 |

    Site built with pkgdown 1.6.1.

    154 |
    155 | 156 |
    157 |
    158 | 159 | 160 | 161 | 162 | 163 | 164 | 165 | 166 | -------------------------------------------------------------------------------- /docs/reference/interestgroups.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | Interest Groups — interestgroups • austin 10 | 11 | 12 | 13 | 14 | 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | 31 | 32 | 33 | 34 | 35 | 36 | 37 | 38 | 39 | 40 | 41 | 42 | 43 | 44 | 45 | 46 | 47 | 48 | 49 | 50 | 51 | 52 | 56 | 57 | 58 | 59 | 60 | 61 | 62 |
    63 |
    64 | 109 | 110 | 111 | 112 |
    113 | 114 |
    115 |
    116 | 121 | 122 |
    123 |

    Interest Groups and the European Commission

    124 |
    125 | 126 | 127 | 128 |

    Details

    129 | 130 |

Word counts from interest groups and a European Commission proposal to 131 | reduce CO2 emissions in 2007.

    132 |

    comm1 and comm2 are the commission's proposal before and after 133 | the proposals of the interest groups.

    134 |

    References

    135 | 136 |

H. Kluever (2009) 'Measuring interest group influence using 137 | quantitative text analysis' European Union Politics 11:1.

    138 | 139 |
    140 | 145 |
    146 | 147 | 148 |
    149 | 152 | 153 |
    154 |

    Site built with pkgdown 1.6.1.

    155 |
    156 | 157 |
    158 |
--------------------------------------------------------------------------------
/docs/reference/jl_reindex.html:

Reindex counts variable — jl_reindex • austin

Reindexes 'counts' and provides a new 'types' attribute for it.

jl_reindex(res)

Arguments

res: a tibble with a 'counts' variable

Value

a tibble
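A minimal sketch of jl_reindex in a counting pipeline, assuming the austin
package (and the tibble package it builds on) is installed; the 'text' column
is hypothetical:

```r
library(austin)
library(tibble)

x <- tibble(text = c("a b b", "b c"))
x <- jl_tokenize_words(x)  # fill 'tokens'
x <- jl_count_tokens(x)    # tabulate tokens into 'counts'
x <- jl_reindex(x)         # reindex 'counts'; a fresh 'types' attribute is attached
```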
--------------------------------------------------------------------------------
/docs/reference/jl_count_tokens.html:

Tabulate whatever the tokens are — jl_count_tokens • austin

Tabulate whatever the tokens are.

jl_count_tokens(x)

Arguments

x: a tibble with a 'tokens' variable

Value

a tibble with a 'counts' variable
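A hedged usage sketch, assuming austin and tibble are installed ('text' is a
hypothetical column name):

```r
library(austin)
library(tibble)

x <- tibble(text = c("one fish two fish", "red fish blue fish"))
x <- jl_tokenize_words(x)  # 'tokens' now holds the word tokens
x <- jl_count_tokens(x)    # 'counts' tabulates each document's tokens
```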
--------------------------------------------------------------------------------
/docs/reference/demanif.html:

German Party Manifesto Data — demanif • austin

A random sample of words and their frequency in German political party
manifestos from 1990-2005.

Source

Wordfish website (http://www.wordfish.org)

Details

demanif is a word frequency matrix.

References

J. Slapin and S.-O. Proksch (2008) 'A scaling model for estimating
time-series party positions from texts' American Journal of Political
Science 52(3), 705-722.
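A hedged sketch of scaling these data with the package's wordfish function,
assuming austin is installed; the 'dir' document indices are illustrative,
not a recommendation from the documentation:

```r
library(austin)

data(demanif)
# 'dir' fixes the direction of the scale by naming two documents
wf <- wordfish(demanif, dir = c(1, 2))
summary(wf)
```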
--------------------------------------------------------------------------------
/docs/reference/jl_tokenize_words.html:

Split text into words — jl_tokenize_words • austin

Fills the 'tokens' variable with word tokens.

jl_tokenize_words(x, ...)

Arguments

x: a tibble
...: extra arguments to tokenizers::tokenize_*

Value

a tibble
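A hedged sketch assuming austin and tibble are installed; the pass-through of
'lowercase' relies on the tokenizers::tokenize_words API:

```r
library(austin)
library(tibble)

x <- tibble(text = "The CAT sat on the mat.")
# extra arguments are forwarded to tokenizers::tokenize_words
x <- jl_tokenize_words(x, lowercase = TRUE)
x$tokens
```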
--------------------------------------------------------------------------------
/docs/reference/jl_types.html:

Get the vocabulary under counts — jl_types • austin

Returns the list of types that are counted in 'counts'. If 'counts' does not
exist, does the calculation for 'tokens'.

jl_types(x)

Arguments

x: a tibble

Value

a vector of type values
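A hedged sketch of both code paths described above, assuming austin and
tibble are installed:

```r
library(austin)
library(tibble)

x <- jl_tokenize_words(tibble(text = c("a b b", "b c")))
jl_types(x)                   # no 'counts' yet: computed from 'tokens'
jl_types(jl_count_tokens(x))  # the vocabulary under 'counts'
```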
--------------------------------------------------------------------------------
/docs/reference/jl_doclen.html:

Get document lengths in tokens — jl_doclen • austin

For each document, how many tokens it contains. Note that this may not
correspond to the number of words or dictionary categories.

jl_doclen(x)

Arguments

x: a tibble

Value

a vector of token counts
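A minimal hedged sketch, assuming austin and tibble are installed:

```r
library(austin)
library(tibble)

x <- jl_tokenize_words(tibble(text = c("a b b", "b c")))
jl_doclen(x)  # one token count per document
```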
--------------------------------------------------------------------------------
/docs/reference/demanif.soc.html:

Societal sections of German Party Manifestos — demanif.soc • austin

A word frequency matrix from the societal sections of German political party
manifestos from 1990-2005.

Source

These data are courtesy of S.-O. Proksch.

Details

demanif.soc is a word frequency matrix.

References

J. Slapin and S.-O. Proksch (2008) 'A scaling model for estimating
time-series party positions from texts' American Journal of Political
Science 52(3), 705-722.
--------------------------------------------------------------------------------
/docs/reference/demanif.econ.html:

Economics sections of German Party Manifestos — demanif.econ • austin

A word frequency matrix from the economic sections of German political party
manifestos from 1990-2005.

Source

These data are courtesy of S.-O. Proksch.

Details

demanif.econ is a word frequency matrix.

References

J. Slapin and S.-O. Proksch (2008) 'A scaling model for estimating
time-series party positions from texts' American Journal of Political
Science 52(3), 705-722.
--------------------------------------------------------------------------------
/docs/reference/jl_summarize_counts.html:

Collapse counts for grouped tibbles — jl_summarize_counts • austin

For grouped tibbles, collapse the 'counts' within each group. Like summarize
after group_by, but with no choice of function.

jl_summarize_counts(x)

Arguments

x: a tibble

Value

an aggregated tibble
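A hedged sketch of grouping then collapsing, assuming austin, tibble, and
dplyr are installed; the 'party' grouping column is hypothetical:

```r
library(austin)
library(tibble)
library(dplyr)

x <- tibble(text  = c("a b", "b c", "c d"),
            party = c("X", "X", "Y"))  # hypothetical grouping variable
x <- jl_count_tokens(jl_tokenize_words(x))
jl_summarize_counts(group_by(x, party))  # one collapsed 'counts' per party
```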
--------------------------------------------------------------------------------
/docs/reference/jl_demote_counts.html:

Undo the effects of jl_promote_counts — jl_demote_counts • austin

Removes the variables that jl_promote_counts added.

jl_demote_counts(x, prefix = NULL)

Arguments

x: a tibble
prefix: whether to look for a prefix when removing

Value

a tibble
--------------------------------------------------------------------------------
/docs/reference/demanif.foreign.html:

Foreign Policy Sections of German Party Manifestos — demanif.foreign • austin

A word frequency matrix from the foreign policy sections of German political
party manifestos from 1990-2005.

Source

These data are courtesy of S.-O. Proksch.

Details

demanif.foreign is a word frequency matrix.

References

J. Slapin and S.-O. Proksch (2008) 'A scaling model for estimating
time-series party positions from texts' American Journal of Political
Science 52(3), 705-722.
--------------------------------------------------------------------------------
/docs/reference/jl_promote_counts.html:

Make counts real variables in wide form — jl_promote_counts • austin

Make counts real variables in wide form.

jl_promote_counts(x, prefix = NULL)

Arguments

x: a tibble with 'counts'
prefix: what to prefix each counted element's value with when it turns into a
variable name

Value

a tibble with more columns, one for each counted type
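A hedged round-trip sketch of jl_promote_counts and jl_demote_counts,
assuming austin and tibble are installed; the "w_" prefix is illustrative:

```r
library(austin)
library(tibble)

x <- jl_count_tokens(jl_tokenize_words(tibble(text = c("a b b", "b c"))))
wide <- jl_promote_counts(x, prefix = "w_")    # one w_* column per counted type
long <- jl_demote_counts(wide, prefix = "w_")  # remove those columns again
```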
--------------------------------------------------------------------------------
/docs/reference/jl_identify.html:

Add a document identifier — jl_identify • austin

Adds a variable 'doc_id' to uniquely identify each row. This identifier may
later, if jl_split is used, contain information about the disaggregation
level. If the document identifier already exists, it is overwritten.

jl_identify(x)

Arguments

x: a tibble

Value

a tibble
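A minimal hedged sketch, assuming austin and tibble are installed:

```r
library(austin)
library(tibble)

x <- tibble(text = c("first document", "second document"))
x <- jl_identify(x)  # adds (or overwrites) a unique 'doc_id' per row
x$doc_id
```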
--------------------------------------------------------------------------------
/docs/reference/lbg.html:

Example Data — lbg • austin

Example data from Laver, Benoit and Garry (2003)

Details

This is the example word count data from Laver, Benoit and Garry's (2003)
article on Wordscores. Documents R1 to R5 are assumed to have known
positions: -1.5, -0.75, 0, 0.75, 1.5. Document V1 is assumed unknown. The
'correct' position for V1 is presumed to be -0.45. classic.wordscores
generates approximately -0.45.

To replicate the analysis in the paper, use the wordscores function with
identification fixing the first five document positions and leaving the
position of V1 to be predicted.

References

Laver, Benoit and Garry (2003) 'Estimating policy positions from political
text using words as data' American Political Science Review 97(2).
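The replication described above can be sketched as follows, assuming austin
is installed; classic.wordscores and getdocs are documented elsewhere in this
package, and the -0.45 figure comes from the documentation above:

```r
library(austin)

data(lbg)
ref <- getdocs(lbg, 1:5)  # R1..R5, with known positions
ws  <- classic.wordscores(ref, scores = seq(-1.5, 1.5, by = 0.75))
predict(ws, newdata = getdocs(lbg, "V1"))  # approximately -0.45
```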
--------------------------------------------------------------------------------