├── .Rbuildignore ├── .gitattributes ├── .github ├── .gitignore └── workflows │ ├── R-CMD-check.yaml │ ├── test-coverage.yaml │ └── with-real-requests.yaml ├── .gitignore ├── CONTRIBUTING.md ├── CRAN-SUBMISSION ├── DESCRIPTION ├── LICENSE ├── LICENSE.md ├── NAMESPACE ├── NEWS.md ├── R ├── add-users-data.R ├── analyze-network.R ├── analyze-url.R ├── get-upstream-tweets.R └── setup-functions.R ├── README-archived.Rmd ├── README-archived.md ├── README.Rmd ├── README.md ├── _pkgdown.yml ├── codecov.yml ├── codemeta.json ├── cran-comments.md ├── docs ├── 404.html ├── CONTRIBUTING.html ├── LICENSE-text.html ├── LICENSE.html ├── apple-touch-icon-120x120.png ├── apple-touch-icon-152x152.png ├── apple-touch-icon-180x180.png ├── apple-touch-icon-60x60.png ├── apple-touch-icon-76x76.png ├── apple-touch-icon.png ├── articles │ ├── files │ │ ├── TAGS-identifier-from-browser.png │ │ ├── TAGS-identifier-highlighted.png │ │ ├── TAGS-make-copy.png │ │ ├── TAGS-ready.png │ │ ├── choice-TAGS-version.png │ │ ├── key-task-1-success.png │ │ ├── pain-point-1-success.png │ │ ├── publish-to-web-choices.png │ │ ├── publish-to-web-menu.png │ │ ├── share-anyone-with-link.png │ │ ├── share-button.png │ │ ├── tidytags-setup-google-api_1.png │ │ ├── tidytags-setup-google-api_2.png │ │ └── tidytags-setup-google-api_3.png │ ├── index.html │ ├── setup.html │ ├── setup_files │ │ └── header-attrs-2.11 │ │ │ └── header-attrs.js │ ├── tidytags-with-conf-hashtags.html │ ├── tidytags-with-conf-hashtags_files │ │ └── header-attrs-2.11 │ │ │ └── header-attrs.js │ └── vignette-network-visualization-1.png ├── authors.html ├── bootstrap-toc.css ├── bootstrap-toc.js ├── docsearch.css ├── docsearch.js ├── favicon-16x16.png ├── favicon-32x32.png ├── favicon.ico ├── index.html ├── link.svg ├── logo.png ├── news │ └── index.html ├── paper.html ├── pkgdown.css ├── pkgdown.js ├── pkgdown.yml ├── reference │ ├── Rplot001.png │ ├── add_users_data.html │ ├── create_edgelist.html │ ├── figures │ │ ├── logo.png │ │ └── tidytags-workflow.jpg │ ├── filter_by_tweet_type.html │ ├── geocode_tags.html │ ├── get_char_tweet_ids.html │ ├── get_upstream_tweets.html │ ├── get_url_domain.html │ ├── index.html │ ├── lookup_many_tweets.html │ ├── lookup_many_users.html │ ├── process_tweets.html │ ├── pull_tweet_data.html │ └── read_tags.html └── sitemap.xml ├── inst └── CITATION ├── man ├── add_users_data.Rd ├── create_edgelist.Rd ├── figures │ ├── logo.png │ └── tidytags-workflow.jpg ├── filter_by_tweet_type.Rd ├── fragments │ ├── ethics.Rmd │ └── getting-help.Rmd ├── get_char_tweet_ids.Rd ├── get_upstream_tweets.Rd ├── get_url_domain.Rd ├── lookup_many_tweets.Rd ├── process_tweets.Rd ├── pull_tweet_data.Rd └── read_tags.Rd ├── paper.bib ├── paper.md ├── pkgdown └── favicon │ ├── apple-touch-icon-120x120.png │ ├── apple-touch-icon-152x152.png │ ├── apple-touch-icon-180x180.png │ ├── apple-touch-icon-60x60.png │ ├── apple-touch-icon-76x76.png │ ├── apple-touch-icon.png │ ├── favicon-16x16.png │ ├── favicon-32x32.png │ └── favicon.ico ├── tests ├── fixtures │ ├── different_tweet_types.yml │ ├── lookup_many.yml │ ├── metadata_from_ids.yml │ ├── metadata_from_rtweet.yml │ ├── metadata_from_urls.yml │ ├── sample_tags.yml │ ├── tweet_ids.yml │ ├── upstream_tweets.yml │ ├── upstream_tweets_empty.yml │ ├── url_domains.yml │ └── users_info.yml ├── testthat.R └── testthat │ ├── helper-vcr.R │ ├── sample-data.csv │ ├── sample-tweet.csv │ ├── test-1-read_tags.R │ ├── test-10-create_edgelist.R │ ├── test-11-add_users_data.R │ ├── test-2-get_char_tweet_ids.R │ ├── 
test-3-pull_tweet_data.R │ ├── test-4-lookup_many_tweets.R │ ├── test-5-flag_unknown_upstream.R │ ├── test-6-get_upstream_tweets.R │ ├── test-7-process_tweets.R │ ├── test-8-get_url_domain.R │ └── test-9-filter_by_tweet_type.R ├── tidytags-logo.png ├── tidytags.Rproj └── vignettes ├── .gitignore ├── files ├── TAGS-identifier-from-browser.png ├── TAGS-identifier-highlighted.png ├── TAGS-make-copy.png ├── TAGS-ready.png ├── choice-TAGS-version.png ├── key-task-1-success.png ├── publish-to-web-choices.png ├── publish-to-web-menu.png ├── share-anyone-with-link.png └── share-button.png ├── setup.Rmd ├── tidytags-with-conf-hashtags.Rmd ├── tidytags-with-conf-hashtags.Rmd.orig └── vignette-network-visualization-1.png /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^README\.Rmd$ 2 | ^.*\.Rproj$ 3 | ^\.Rproj\.user$ 4 | ^\.httr-oauth$ 5 | ^_pkgdown\.yml$ 6 | ^docs$ 7 | ^pkgdown$ 8 | CONTRIBUTING.md 9 | LICENSE.md 10 | paper.md 11 | paper.bib 12 | cran-comments.md 13 | tidytags-logo.png 14 | ^codecov\.yml$ 15 | ^doc$ 16 | ^Meta$ 17 | ^\.github$ 18 | ^codemeta\.json$ 19 | ^CRAN-SUBMISSION$ 20 | -------------------------------------------------------------------------------- /.gitattributes: -------------------------------------------------------------------------------- 1 | * text=auto 2 | tests/fixtures/**/* -diff 3 | -------------------------------------------------------------------------------- /.github/.gitignore: -------------------------------------------------------------------------------- 1 | *.html 2 | -------------------------------------------------------------------------------- /.github/workflows/R-CMD-check.yaml: -------------------------------------------------------------------------------- 1 | # See https://github.com/r-lib/actions/tree/v2/examples#standard-ci-workflow 2 | 3 | # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples 4 | # Need help debugging build failures? 
Start at https://github.com/r-lib/actions#where-to-find-help 5 | 6 | on: 7 | push: 8 | branches: [main, master] 9 | pull_request: 10 | branches: [main, master] 11 | 12 | name: R-CMD-check 13 | 14 | jobs: 15 | R-CMD-check: 16 | runs-on: ${{ matrix.config.os }} 17 | 18 | name: ${{ matrix.config.os }} (${{ matrix.config.r }}) 19 | 20 | strategy: 21 | fail-fast: false 22 | matrix: 23 | config: 24 | - {os: macOS-latest, r: 'release'} 25 | - {os: windows-latest, r: 'release'} 26 | - {os: ubuntu-latest, r: 'devel', http-user-agent: 'release'} 27 | - {os: ubuntu-latest, r: 'release'} 28 | - {os: ubuntu-latest, r: '4.2'} # defined min dependent version of R 29 | 30 | env: 31 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 32 | R_KEEP_PKG_SOURCE: yes 33 | 34 | steps: 35 | - uses: actions/checkout@v3 36 | 37 | - uses: r-lib/actions/setup-pandoc@v2 38 | 39 | - uses: r-lib/actions/setup-r@v2 40 | with: 41 | r-version: ${{ matrix.config.r }} 42 | http-user-agent: ${{ matrix.config.http-user-agent }} 43 | use-public-rspm: true 44 | 45 | - uses: r-lib/actions/setup-r-dependencies@v2 46 | with: 47 | extra-packages: any::rcmdcheck 48 | needs: check 49 | 50 | - uses: r-lib/actions/check-r-package@v2 51 | with: 52 | upload-snapshots: true 53 | 54 | - name: Show testthat output 55 | if: always() 56 | run: find check -name 'testthat.Rout*' -exec cat '{}' \; || true 57 | shell: bash 58 | 59 | - name: Upload check results 60 | if: failure() 61 | uses: actions/upload-artifact@main 62 | with: 63 | name: ${{ runner.os }}-r${{ matrix.config.r }}-results 64 | path: check 65 | -------------------------------------------------------------------------------- /.github/workflows/test-coverage.yaml: -------------------------------------------------------------------------------- 1 | # Workflow derived from https://github.com/r-lib/actions/tree/v2/examples 2 | # Need help debugging build failures? Start at https://github.com/r-lib/actions#where-to-find-help 3 | on: 4 | push: 5 | branches: [main, master] 6 | pull_request: 7 | branches: [main, master] 8 | 9 | name: test-coverage 10 | 11 | jobs: 12 | test-coverage: 13 | runs-on: ubuntu-latest 14 | env: 15 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 16 | 17 | steps: 18 | - uses: actions/checkout@v3 19 | 20 | - uses: r-lib/actions/setup-r@v2 21 | with: 22 | use-public-rspm: true 23 | 24 | - uses: r-lib/actions/setup-r-dependencies@v2 25 | with: 26 | extra-packages: any::covr 27 | needs: coverage 28 | 29 | - name: Test coverage 30 | run: covr::codecov() 31 | shell: Rscript {0} 32 | -------------------------------------------------------------------------------- /.github/workflows/with-real-requests.yaml: -------------------------------------------------------------------------------- 1 | ################################################################################ 2 | # For help debugging build failures open an issue on the RStudio community 3 | # with the 'github-actions' tag. 
4 | # https://community.rstudio.com/new-topic?category=Package%20development&tags=github-actions 5 | # https://github.com/r-lib/actions 6 | # https://blog--simonpcouch.netlify.app/blog/r-github-actions-commit/ 7 | # https://github.com/rladies/meetupr/blob/master/.github/workflows/with-auth.yaml 8 | ################################################################################ 9 | 10 | on: 11 | schedule: 12 | - cron: '0 6 * * MON,WED,FRI' 13 | # https://crontab.cronhub.io/ 14 | # Scheduled to run Mon-Wed-Fri: '0 6 * * MON,WED,FRI' 15 | # Scheduled to run every Sunday at midnight: '0 0 * * SUN' 16 | # Scheduled to run every hour: '0 * * * *' 17 | # Format: 18 | 19 | name: with-real-requests 20 | 21 | jobs: 22 | with-real-requests: 23 | runs-on: macOS-latest 24 | 25 | env: 26 | R_REMOTES_NO_ERRORS_FROM_WARNINGS: true 27 | GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }} 28 | VCR_TURN_OFF: true 29 | TWITTER_BEARER_TOKEN: ${{ secrets.TWITTER_BEARER_TOKEN }} 30 | 31 | steps: 32 | - uses: actions/checkout@v3 33 | 34 | - uses: r-lib/actions/setup-r@v2 35 | id: install-r 36 | with: 37 | r-version: release 38 | 39 | - uses: r-lib/actions/setup-pandoc@v2 40 | 41 | - name: Install pak and query dependencies 42 | run: | 43 | install.packages("pak", repos = "https://r-lib.github.io/p/pak/dev/") 44 | saveRDS(pak::pkg_deps("local::.", dependencies = TRUE), ".github/r-depends.rds") 45 | shell: Rscript {0} 46 | 47 | - name: Cache R packages 48 | uses: actions/cache@v2 49 | with: 50 | path: ${{ env.R_LIBS_USER }} 51 | key: macOS-latest-${{ steps.install-r.outputs.installed-r-version }}-1-${{ hashFiles('.github/r-depends.rds') }} 52 | restore-keys: macOS-latest-${{ steps.install-r.outputs.installed-r-version }}-1- 53 | 54 | - name: Install dependencies 55 | run: | 56 | pak::local_install_dev_deps(upgrade = FALSE) 57 | pak::pkg_install("rcmdcheck") 58 | shell: Rscript {0} 59 | 60 | - name: Create rtweet token 61 | run: | 62 | app <- 63 | rtweet::rtweet_app(bearer_token = Sys.getenv('TWITTER_BEARER_TOKEN')) 64 | rtweet::auth_as(app) 65 | shell: Rscript {0} 66 | 67 | - name: Session info 68 | run: | 69 | options(width = 100) 70 | pkgs <- installed.packages()[, "Package"] 71 | sessioninfo::session_info(pkgs, include_base = TRUE) 72 | shell: Rscript {0} 73 | 74 | - name: Check 75 | env: 76 | _R_CHECK_CRAN_INCOMING_: false 77 | run: | 78 | options(crayon.enabled = TRUE) 79 | rcmdcheck::rcmdcheck(args = c("--no-manual", "--as-cran"), error_on = "warning", check_dir = "check") 80 | shell: Rscript {0} 81 | 82 | - name: Upload check results 83 | if: failure() 84 | uses: actions/upload-artifact@main 85 | with: 86 | name: macOS-latest-r${{ steps.install-r.outputs.installed-r-version }}-results 87 | path: check 88 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | .DS_Store 4 | .RData 5 | .Ruserdata 6 | .httr-oauth 7 | inst/doc 8 | doc 9 | Meta 10 | *.rds 11 | -------------------------------------------------------------------------------- /CONTRIBUTING.md: -------------------------------------------------------------------------------- 1 | # Contributing 2 | 3 | This contributing guide has been derived from the **tidyverse** boilerplate (see their high-level [contributing guide](https://www.tidyverse.org/contribute/)). If you have any questions about contributing, please don't hesitate to [reach out](https://docs.ropensci.org/tidytags/#getting-help). 
We appreciate every contribution. We suggest first reading the [Getting started with tidytags](https://docs.ropensci.org/tidytags/articles/setup.html) vignette. 4 | 5 | ## Contributor Code of Conduct 6 | 7 | Please note that this package is released with a [Contributor Code of Conduct](https://ropensci.org/code-of-conduct/). By contributing to this project, you agree to abide by its terms. 8 | 9 | ## Non-technical contributions to **tidytags** 10 | 11 | Feel free to [report issues](https://github.com/ropensci/tidytags/issues): 12 | 13 | * **Questions** are for seeking clarification or more information. Both question askers and question answerers are welcome contributors! 14 | * **Bug reports** are for unplanned malfunctions. If you have found a bug, follow the issue template to create a minimal [reprex](https://www.tidyverse.org/help/#reprex). 15 | * **Enhancement requests** are for ideas and new features. 16 | 17 | ## Technical contributions to **tidytags** 18 | 19 | If you would like to contribute to the **tidytags** code base, follow the process below: 20 | 21 | * [Prerequisites](#prerequisites) 22 | * [PR process](#pr-process) 23 | * [Fork, clone, branch](#fork-clone-branch) 24 | * [Check](#check) 25 | * [Style](#style) 26 | * [Document](#document) 27 | * [Test](#test) 28 | * [Re-check](#re-check) 29 | * [Commit](#commit) 30 | * [Push and pull](#push-and-pull) 31 | * [Check the docs](#check-the-docs) 32 | * [Review, revise, repeat](#review-revise-repeat) 33 | * [Resources](#resources) 34 | * [Code of conduct](#code-of-conduct) 35 | 36 | This explains how to propose a change to **tidytags** via a pull request using 37 | Git and GitHub. 38 | 39 | For more general info about contributing to **tidytags**, see the 40 | [Resources](#resources) at the end of this document. 41 | 42 | ### Prerequisites 43 | 44 | To test the **tidytags** package, you can use an openly shared [TAGS tracker](https://docs.google.com/spreadsheets/d/18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8) that has been collecting tweets associated with the AECT 2019 convention since September 30, 2019. This is the same TAGS tracker used in the [Using tidytags with a conference hashtag](https://docs.ropensci.org/tidytags/articles/tidytags-with-conf-hashtags.html) vignette. 45 | 46 | Note that this TAGS tracker is read-only in the web browser, because the utility of **tidytags** is reading a TAGS tracker archive into R using `read_tags("18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8")` and then conducting analyses in an R environment. 47 | 48 | * Before you do a pull request, you should always file an issue and make sure someone from the **tidytags** team agrees that it’s a problem, and is happy with your basic proposal for fixing it. We don’t want you to spend a bunch of time on something that we don’t think is a real problem or an appropriate solution. 49 | * Also make sure to read the [**tidyverse** style guide](http://style.tidyverse.org/), which will help make sure that your new code and documentation match the existing style. This makes the review process much smoother. 50 | 51 | ### PR process 52 | 53 | You are welcome to contribute a *pull request* (PR) to **tidytags**. The most important thing to know is that tidyverse packages use **roxygen2**: this means that documentation is found in the R code close to the source of each function.
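For example, a function and its documentation live side by side like this (a minimal, hypothetical helper shown only to illustrate the roxygen2 style; it is not part of the package):

```r
#' Count statuses per screen name
#'
#' @param df A dataframe of statuses, such as that returned by
#'   `pull_tweet_data()`
#' @return A dataframe with one row per screen name and a column `n`
#'   counting that user's statuses
#' @export
count_statuses <-
  function(df) {
    dplyr::count(df, screen_name)
  }
```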
54 | 55 | #### Fork, clone, branch 56 | 57 | The first thing you'll need to do is to [fork](https://help.github.com/articles/fork-a-repo/) 58 | the [**tidytags** GitHub repo](https://github.com/ropensci/tidytags), and 59 | then clone it locally. We recommend that you create a branch for each PR. 60 | 61 | #### Check 62 | 63 | Before changing anything, make sure the package still passes the following 64 | flavors of `R CMD check` locally for you. 65 | 66 | ```r 67 | goodpractice::goodpractice(quiet = FALSE) 68 | devtools::check() 69 | ``` 70 | 71 | #### Style 72 | 73 | Match the existing code style. This means you should follow the tidyverse 74 | [style guide](http://style.tidyverse.org). Use the [**styler**](https://CRAN.R-project.org/package=styler) package to apply the style guide automatically and the [**spelling**](https://CRAN.R-project.org/package=spelling) package to check spelling. 75 | 76 | Be careful to only make style changes to the code you are contributing. If you find that there is a lot of code that doesn't meet the style guide, it would be better to file an issue or a separate PR to fix that first. 77 | 78 | ```r 79 | styler::style_pkg() 80 | spelling::spell_check_package() 81 | spelling::update_wordlist() 82 | ``` 83 | 84 | #### Document 85 | 86 | We use [**roxygen2**](https://cran.r-project.org/package=roxygen2), specifically with [Markdown syntax](https://cran.r-project.org/web/packages/roxygen2/vignettes/markdown.html), to create `NAMESPACE` and all `.Rd` files. All edits to documentation should be done in roxygen comments above the associated function or object. Then, run `devtools::document()` to rebuild the `NAMESPACE` and `.Rd` files. 87 | 88 | See the `RoxygenNote` in [DESCRIPTION](DESCRIPTION) for the version of 89 | **roxygen2** being used. 90 | 91 | #### Test 92 | 93 | We use [**testthat**](https://cran.r-project.org/package=testthat) for testing. Contributions with test cases are easier to review and verify. 94 | 95 | ```r 96 | devtools::test() 97 | devtools::test_coverage() 98 | ``` 99 | 100 | Note that because **tidytags** queries the Twitter API, testing can be a bit tricky. Be sure to follow the [Getting started with tidytags](https://docs.ropensci.org/tidytags/articles/setup.html) vignette for establishing your own Twitter API tokens to conduct local testing. For CI testing, view the [helper-vcr.R](tests/testthat/helper-vcr.R) file in the package tests to see how fake OAuth tokens are set up. The [HTTP testing in R](https://books.ropensci.org/http-testing/index.html) book is an invaluable resource. 101 | 102 | #### Re-check 103 | 104 | Before submitting your changes, make sure that the package either still 105 | passes `R CMD check`, or that the warnings and/or notes have not _changed_ 106 | as a result of your edits. 107 | 108 | ```r 109 | devtools::check() 110 | goodpractice::goodpractice(quiet = FALSE) 111 | ``` 112 | 113 | #### Commit 114 | 115 | When you've made your changes, write a clear commit message describing what 116 | you've done. If you've fixed or closed an issue, make sure to include keywords 117 | (e.g. `fixes #17`) at the end of your commit message (not in its 118 | title) to automatically close the issue when the PR is merged. 119 | 120 | #### Push and pull 121 | 122 | Once you've pushed your commit(s) to a branch in _your_ fork, you're ready to 123 | make the pull request. Pull requests should have descriptive titles to remind 124 | reviewers/maintainers what the PR is about.
You can easily view what exact 125 | changes you are proposing using either the [Git diff](http://r-pkgs.had.co.nz/git.html#git-status) 126 | view in RStudio, or the [branch comparison view](https://help.github.com/articles/creating-a-pull-request/) 127 | you'll be taken to when you go to create a new PR. If the PR is related to an 128 | issue, provide the issue number and slug in the _description_ using 129 | auto-linking syntax (e.g. `#17`). 130 | 131 | #### Check the docs 132 | 133 | Double check the output of the [GitHub Actions CI](https://github.com/ropensci/tidytags/actions) for any breakages or error messages. 134 | 135 | #### Review, revise, repeat 136 | 137 | The latency period between submitting your PR and its review may vary. When a maintainer does review your contribution, be sure to use the same conventions described here with any revision commits. 138 | 139 | ### Resources 140 | 141 | * [Happy Git and GitHub for the useR](http://happygitwithr.com/) by Jenny Bryan. 142 | * [Contribute to the tidyverse](https://www.tidyverse.org/contribute/) covers 143 | several ways to contribute that _don't_ involve writing code. 144 | * [Contributing Code to the Tidyverse](http://www.jimhester.com/2017/08/08/contributing/) by Jim Hester. 145 | * [R packages](http://r-pkgs.had.co.nz/) by Hadley Wickham. 146 | * [Git and GitHub](http://r-pkgs.had.co.nz/git.html) 147 | * [Automated checking](http://r-pkgs.had.co.nz/check.html) 148 | * [Object documentation](http://r-pkgs.had.co.nz/man.html) 149 | * [Testing](http://r-pkgs.had.co.nz/tests.html) 150 | * [dplyr’s `NEWS.md`](https://github.com/tidyverse/dplyr/blob/master/NEWS.md) 151 | is a good source of examples for both content and styling. 152 | * [Closing issues using keywords](https://help.github.com/articles/closing-issues-using-keywords/) 153 | on GitHub. 154 | * [Autolinked references and URLs](https://help.github.com/articles/autolinked-references-and-urls/) 155 | on GitHub. 156 | * [GitHub Guides: Forking Projects](https://guides.github.com/activities/forking/). 157 | -------------------------------------------------------------------------------- /CRAN-SUBMISSION: -------------------------------------------------------------------------------- 1 | Version: 1.1.1 2 | Date: 2023-01-10 04:37:15 UTC 3 | SHA: d01a008601220486153ef91a6c028ff7d1bcdd8b 4 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: tidytags 2 | Title: Importing and Analyzing 'Twitter' Data Collected with 'Twitter Archiving Google Sheets' 3 | Version: 1.1.1 4 | License: MIT + file LICENSE 5 | Authors@R: c( 6 | person("K. Bret", "Staudt Willet", , 7 | email = "bret.staudtwillet@fsu.edu", 8 | role = c("aut", "cre"), 9 | comment = c(ORCID = "0000-0002-6984-416X") 10 | ), 11 | person("Joshua M.", "Rosenberg", , 12 | role = c("aut"), 13 | comment = c(ORCID = "0000-0003-2170-0447") 14 | ), 15 | person("Lluís", "Revilla Sancho", , 16 | role = c("rev"), 17 | comment = c(ORCID = "0000-0001-9747-2570") 18 | ), 19 | person("Marion", "Louveaux", , 20 | role = c("rev"), 21 | comment = c(ORCID = "0000-0002-1794-3748") 22 | ) 23 | ) 24 | Description: The 'tidytags' package coordinates the simplicity of collecting tweets 25 | over time with a 'Twitter Archiving Google Sheet' (TAGS; <https://tags.hawksey.info/>) 26 | and the utility of the 'rtweet' package (<https://docs.ropensci.org/rtweet/>) 27 | for processing and preparing additional 'Twitter' metadata.
'tidytags' also 28 | introduces functions developed to facilitate systematic yet flexible analyses 29 | of data from 'Twitter'. 30 | Language: en-US 31 | URL: https://docs.ropensci.org/tidytags/ (website) https://github.com/ropensci/tidytags 32 | Depends: 33 | R (>= 4.2) 34 | Imports: 35 | dplyr (>= 1.0), 36 | googlesheets4 (>= 1.0), 37 | rlang (>= 1.0), 38 | rtweet (>= 1.1), 39 | stringr (>= 1.4) 40 | Suggests: 41 | beepr, 42 | covr, 43 | ggplot2, 44 | ggraph, 45 | knitr, 46 | longurl, 47 | readr, 48 | rmarkdown, 49 | testthat, 50 | tibble, 51 | tidygraph, 52 | urltools, 53 | vcr (>= 1.2) 54 | Encoding: UTF-8 55 | BugReports: https://github.com/ropensci/tidytags/issues 56 | VignetteBuilder: knitr 57 | RoxygenNote: 7.2.3 58 | Roxygen: list(markdown = TRUE) 59 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | YEAR: 2021 2 | COPYRIGHT HOLDER: K. Bret Staudt Willet & Joshua M. Rosenberg 3 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | # MIT License 2 | 3 | Copyright (c) 2021, K. Bret Staudt Willet & Joshua M. Rosenberg 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining 6 | a copy of this software and associated documentation files (the 7 | "Software"), to deal in the Software without restriction, including 8 | without limitation the rights to use, copy, modify, merge, publish, 9 | distribute, sublicense, and/or sell copies of the Software, and to 10 | permit persons to whom the Software is furnished to do so, subject to 11 | the following conditions: 12 | 13 | The above copyright notice and this permission notice shall be 14 | included in all copies or substantial portions of the Software. 15 | 16 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 17 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 18 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 19 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 20 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 21 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 22 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. 
23 | -------------------------------------------------------------------------------- /NAMESPACE: -------------------------------------------------------------------------------- 1 | # Generated by roxygen2: do not edit by hand 2 | 3 | export(add_users_data) 4 | export(create_edgelist) 5 | export(filter_by_tweet_type) 6 | export(get_char_tweet_ids) 7 | export(get_upstream_tweets) 8 | export(get_url_domain) 9 | export(lookup_many_tweets) 10 | export(process_tweets) 11 | export(pull_tweet_data) 12 | export(read_tags) 13 | importFrom(rlang,.data) 14 | -------------------------------------------------------------------------------- /NEWS.md: -------------------------------------------------------------------------------- 1 | tidytags is no longer supported and has been archived (2024-01-23) 2 | ========================= 3 | 4 | 5 | 6 | tidytags 1.1.1 (2023-01-09) 7 | ========================= 8 | 9 | ### BUG FIXES 10 | 11 | * Fixed a bug with **vcr** testing prompted by the **rtweet** update to v1.1 (https://github.com/ropensci/tidytags/issues/90) 12 | 13 | tidytags 1.1.0 (2022-11-18) 14 | ========================= 15 | 16 | ### BUG FIXES 17 | 18 | * Fixed another bug in `get_upstream_tweets()` where the column `possibly_sensitive` was sometimes read in as a list, other times as a logical type. 19 | * Addressed a bug in **vcr v1.1** that was causing **tidytags** tests to error in the CRAN logs (https://cran.r-project.org/web/checks/check_results_tidytags.html). 20 | * The `setup-vcr.R` file was renamed to `helper-vcr.R` so that it will not be loaded when someone simply installs tidytags, only when package developers compile and test. 21 | * In addition, the vcr bug has been addressed by the developers of that package, so **vcr v1.2** is now the minimum suggested version for tidytags. 22 | 23 | tidytags 1.0.3 (2022-10-14) 24 | ========================= 25 | 26 | ### BUG FIXES 27 | 28 | * Fixed a bug in `get_upstream_tweets()` where column names were out of order and caused an error 29 | 30 | tidytags 1.0.2 (2022-08-19) 31 | ========================= 32 | 33 | ### DOCUMENTATION FIXES 34 | 35 | * Fixed broken URLs and reduced tarball size in preparation for CRAN resubmission. 36 | * In function documentation, \dontrun{} instances have been updated to \donttest{}. 37 | 38 | tidytags 1.0.1 (2022-08-18) 39 | ========================= 40 | 41 | ### DOCUMENTATION FIXES 42 | 43 | * Cleaned up documentation in preparation for CRAN submission. 44 | 45 | tidytags 1.0.0 (2022-08-05) 46 | ========================= 47 | 48 | ### BREAKING CHANGES 49 | 50 | * Updated Twitter authentication process to align with breaking changes caused by the rtweet 1.0 release. 51 | * Updated the process_tweets() function to align with changes in available metadata and new variable names used in rtweet 1.0. 52 | * Removed the lookup_many_users() function. With the rtweet 1.0 update, user information can be accessed with the rtweet::users_data() function. 53 | * Updated flag_unknown_upstream() and get_upstream_tweets() to align with new variable names used in rtweet 1.0. 54 | * Updated filter_by_tweet_type(), create_edgelist(), and add_users_data() to align with new variable names used in rtweet 1.0. 55 | * Removed the geocode_tags() function because rtweet 1.0 changed how location data is available and also added a new rtweet::lookup_coords() function. Note that at this time, rtweet::lookup_coords() requires a Google Maps API key rather than the OpenCage API we had recommended in earlier versions of tidytags. 
We still recommend the sf and mapview R packages for working with locations and geocoding. 56 | 57 | ### NEW FEATURES 58 | 59 | * Updated the read_tags() function so that a Google API key is no longer needed to pull tweet data from publicly shared Google Sheets. 60 | * The process_tweets() function now also adds user information associated with the creator of each status. process_tweets() also now returns a column for the tweet type of each status. 61 | 62 | tidytags 0.3.0 (2022-02-04) 63 | ========================= 64 | 65 | ### BUG FIXES 66 | 67 | * Updated to most recent versions of CI tests for R-CMD-check and test coverage. 68 | 69 | ### DOCUMENTATION FIXES 70 | 71 | * Updated paper.md and paper.bib to coincide with submission for peer review at Journal of Open Source Software (JOSS). 72 | 73 | tidytags 0.2.1 (2021-12-14) 74 | ========================= 75 | 76 | ### NEW FEATURES 77 | 78 | * Added a new function filter_by_tweet_type() to filter a Twitter dataset to only include statuses of a particular type (e.g., replies, retweets, quote tweets, mentions). 79 | * Updated the function create_edgelist() to take a "type" argument (e.g., "reply", "retweet", "quote", "mention", "all"). This replaces the need for specialized functions like create_mentions_edgelist(). 80 | 81 | tidytags 0.2.0 (2021-11-19) 82 | ========================= 83 | 84 | ### NEW FEATURES 85 | 86 | * Added a new function lookup_many_users() to automatically iterate through the Twitter API limit of pulling metadata for only 90,000 users at one time 87 | 88 | ### BUG FIXES 89 | 90 | * Updated several function names so as not to mask newer functions imported from {rtweet}; for example, get_mentions() is now create_mentions_edgelist(), and similar updates have been made for functions building edgelists from quotes, replies, and retweets 91 | * Updated tests to work with latest version of {vcr} 92 | * Made fixes so CI tests would again work with real requests in addition to pre-recorded {vcr} data 93 | 94 | ### DOCUMENTATION FIXES 95 | 96 | * Extensively updated the README doc and Setup vignette to help scaffold {tidytags} setup 97 | 98 | tidytags 0.1.2 (2021-03-02) 99 | ========================= 100 | 101 | ### BUG FIXES 102 | 103 | * CI tests now work with real requests in addition to pre-recorded vcr data 104 | * Added a Google API key for accessing a Google Sheet with `read_tags()` 105 | 106 | ### DOCUMENTATION FIXES 107 | 108 | * Clarified process for obtaining and setting up API keys and tokens for Google, Twitter, and OpenCage 109 | 110 | tidytags 0.1.1 (2020-11-24) 111 | ========================= 112 | 113 | ### NEW FEATURES 114 | 115 | * Switched to OpenCage for geocoding (previously used Google Maps API) 116 | * Switched to GitHub Actions (from Travis CI) for CI testing 117 | 118 | tidytags 0.1.0 (2020-02-21) 119 | ========================= 120 | 121 | * Initial release on GitHub. 122 | -------------------------------------------------------------------------------- /R/add-users-data.R: -------------------------------------------------------------------------------- 1 | #' Retrieve user information for everyone in an edgelist 2 | #' 3 | #' Updates an edgelist created with `create_edgelist()` by appending user 4 | #' data retrieved with `rtweet::lookup_users()`. The resulting dataframe 5 | #' adds many additional columns and prepends "sender_" or "receiver_" to the 6 | #' column names. 7 | #' @param edgelist An edgelist of senders and receivers, such as that returned 8 | #' by the function `create_edgelist()`.
9 | #' @return A dataframe in the form of an edgelist (i.e., with senders and 10 | #' receivers) as well as numerous, appropriately named columns of details 11 | #' about the senders and receivers. 12 | #' @details This function requires authentication; please see 13 | #' `vignette("setup", package = "tidytags")` 14 | #' @seealso Read more about rtweet authentication setup at 15 | #' `vignette("auth", package = "rtweet")` 16 | #' @examples 17 | #' 18 | #' \donttest{ 19 | #' example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 20 | #' tags_content <- read_tags(example_url) 21 | #' 22 | #' if (rtweet::auth_has_default()) { 23 | #' tweets_data <- lookup_many_tweets(tags_content) 24 | #' add_users_data(create_edgelist(tweets_data)) 25 | #' } 26 | #' } 27 | #' 28 | #' @importFrom rlang .data 29 | #' @export 30 | add_users_data <- 31 | function(edgelist) { 32 | all_users <- unique(c(edgelist$sender, edgelist$receiver)) 33 | users_data <- rtweet::lookup_users(all_users) 34 | 35 | senders_data <- 36 | dplyr::filter(users_data, .data$screen_name %in% edgelist$sender) 37 | names(senders_data) <- stringr::str_c("sender_", names(senders_data)) 38 | names(senders_data)[4] <- "sender" 39 | 40 | receivers_data <- 41 | dplyr::filter(users_data, .data$screen_name %in% edgelist$receiver) 42 | names(receivers_data) <- stringr::str_c("receiver_", names(receivers_data)) 43 | names(receivers_data)[4] <- "receiver" 44 | 45 | edgelist_with_senders_data <- 46 | dplyr::left_join( 47 | edgelist, 48 | senders_data, 49 | by = "sender" 50 | ) 51 | 52 | edgelist_with_all_users_data <- 53 | dplyr::left_join( 54 | edgelist_with_senders_data, 55 | receivers_data, 56 | by = "receiver" 57 | ) 58 | 59 | edgelist_with_all_users_data 60 | } 61 | -------------------------------------------------------------------------------- /R/analyze-network.R: -------------------------------------------------------------------------------- 1 | #' Filter a Twitter dataset to only include statuses of a particular type 2 | #' 3 | #' Starting with a dataframe of Twitter data imported to R with 4 | #' `read_tags()` and additional metadata retrieved by 5 | #' `pull_tweet_data()`, `filter_by_tweet_type()` processes the 6 | #' statuses by calling `process_tweets()` and then removes any statuses 7 | #' that are not of the requested type (e.g., replies, retweets, and quote 8 | #' tweets). `filter_by_tweet_type()` is a useful function in itself, but it is 9 | #' also used in `create_edgelist()`. 10 | #' @param df A dataframe returned by `pull_tweet_data()` 11 | #' @param type The specific kind of statuses that will be kept in the dataset 12 | #' after filtering the rest. Choices for `type` include "reply", 13 | #' "retweet", "quote", and "original". 14 | #' @return A dataframe of processed statuses with fewer rows than the input 15 | #' dataframe. Only the statuses of the specified type will remain.
16 | #' @examples 17 | #' 18 | #' \donttest{ 19 | #' example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 20 | #' tags_content <- read_tags(example_url) 21 | #' 22 | #' if (rtweet::auth_has_default()) { 23 | #' tweets_data <- lookup_many_tweets(tags_content) 24 | #' only_replies <- filter_by_tweet_type(tweets_data, "reply") 25 | #' only_retweets <- filter_by_tweet_type(tweets_data, "retweet") 26 | #' only_quote_tweets <- filter_by_tweet_type(tweets_data, "quote") 27 | #' only_originals <- filter_by_tweet_type(tweets_data, "original") 28 | #' } 29 | #' } 30 | #' 31 | #' @export 32 | filter_by_tweet_type <- 33 | function(df, type) { 34 | if("tweet_type" %in% names(df)) { 35 | processed_df <- df 36 | } else { 37 | processed_df <- process_tweets(df) 38 | } 39 | index <- processed_df$tweet_type == type 40 | processed_df[index, ] 41 | } 42 | 43 | #' Create an edgelist where senders and receivers are defined by different types 44 | #' of Twitter interactions 45 | #' 46 | #' Starting with a dataframe of Twitter data imported to R with 47 | #' `read_tags()` and additional metadata retrieved by 48 | #' `pull_tweet_data()`, `create_edgelist()` removes any statuses 49 | #' that are not of the requested type (e.g., replies, retweets, and quote 50 | #' tweets) by calling `filter_by_tweet_type()`. Finally, `create_edgelist()` 51 | #' pulls out senders and receivers of the specified type of statuses, and then 52 | #' adds a new column called `edge_type`. 53 | #' @param df A dataframe returned by `pull_tweet_data()` 54 | #' @param type The specific kind of statuses used to define the interactions 55 | #' around which the edgelist will be built. Choices include "reply", 56 | #' "retweet", or "quote". Defaults to "all". 57 | #' @return A dataframe edgelist defined by interactions through the type of 58 | #' statuses specified. The dataframe has three columns: `sender`, 59 | #' `receiver`, and `edge_type`. 
60 | #' @examples 61 | #' 62 | #' \donttest{ 63 | #' example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 64 | #' tags_content <- read_tags(example_url) 65 | #' 66 | #' if (rtweet::auth_has_default()) { 67 | #' tweets_data <- lookup_many_tweets(tags_content) 68 | #' full_edgelist <- create_edgelist(tweets_data) 69 | #' full_edgelist 70 | #' 71 | #' reply_edgelist <- create_edgelist(tweets_data, type = "reply") 72 | #' retweet_edgelist <- create_edgelist(tweets_data, type = "retweet") 73 | #' quote_edgelist <- create_edgelist(tweets_data, type = "quote") 74 | #' } 75 | #' } 76 | #' 77 | #' @importFrom rlang .data 78 | #' @export 79 | create_edgelist <- 80 | function(df, type = "all") { 81 | if(type != "all") { 82 | filtered_df <- filter_by_tweet_type(df, type) 83 | } else { 84 | filtered_df <- 85 | dplyr::bind_rows( 86 | filter_by_tweet_type(df, "reply"), 87 | filter_by_tweet_type(df, "retweet"), 88 | filter_by_tweet_type(df, "quote") 89 | ) 90 | } 91 | 92 | filtered_df <- dplyr::rename(filtered_df, sender = .data$screen_name) 93 | el_reply <- NULL 94 | el_retweet <- NULL 95 | el_quote <- NULL 96 | 97 | if(nrow(dplyr::filter(filtered_df, .data$tweet_type == "reply")) > 0) { 98 | filtered_df_reply <- 99 | dplyr::filter(filtered_df, .data$tweet_type == "reply") 100 | receiver <- filtered_df_reply$in_reply_to_screen_name 101 | el_reply <- 102 | dplyr::select(filtered_df_reply, .data$tweet_type, .data$sender) 103 | el_reply <- 104 | dplyr::bind_cols(el_reply, receiver = receiver) 105 | } 106 | 107 | if(nrow(dplyr::filter(filtered_df, .data$tweet_type == "retweet")) > 0) { 108 | filtered_df_retweet <- 109 | dplyr::filter(filtered_df, .data$tweet_type == "retweet") 110 | receiver <- character() 111 | for(i in 1:nrow(filtered_df_retweet)) { 112 | receiver[i] <- 113 | filtered_df_retweet$retweeted_status[[i]]$user$screen_name 114 | } 115 | el_retweet <- 116 | dplyr::select(filtered_df_retweet, .data$tweet_type, .data$sender) 117 | el_retweet <- 118 | dplyr::bind_cols(el_retweet, receiver = receiver) 119 | } 120 | 121 | if(nrow(dplyr::filter(filtered_df, .data$tweet_type == "quote")) > 0) { 122 | filtered_df_quote <- 123 | dplyr::filter(filtered_df, .data$tweet_type == "quote") 124 | receiver <- character() 125 | for(i in 1:nrow(filtered_df_quote)) { 126 | receiver[i] <- 127 | filtered_df_quote$quoted_status[[i]]$user$screen_name 128 | } 129 | el_quote <- 130 | dplyr::select(filtered_df_quote, .data$tweet_type, .data$sender) 131 | el_quote <- 132 | dplyr::bind_cols(el_quote, receiver = receiver) 133 | } 134 | 135 | if(type == "all") { 136 | el <- dplyr::bind_rows(el_reply, el_retweet, el_quote) 137 | } else { 138 | if(type == "reply") { 139 | el <- el_reply 140 | } else { 141 | if(type == "retweet") { 142 | el <- el_retweet 143 | } else { 144 | el <- el_quote 145 | } 146 | } 147 | } 148 | 149 | el 150 | } 151 | -------------------------------------------------------------------------------- /R/analyze-url.R: -------------------------------------------------------------------------------- 1 | #' Find the domain name of URLs, even shortened URLs 2 | #' 3 | #' `get_url_domain()` retrieves the Web domain name from a URL, including 4 | #' URLs shortened with services such as bit.ly and t.co 5 | #' @param x A list or vector of hyperlinks, whether shortened or expanded 6 | #' @param wait How long (in seconds) to wait on the 7 | #' `longurl::expand_urls()` function to retrieve the full, expanded URL 8 | #' from a shortened URL (e.g., a bit.ly).
The `longurl` default is 2 9 | #' seconds, but we have found that this misses a number of valid URLs. Here, 10 | #' we have made the default `wait = 10` seconds, but the user can adjust 11 | #' this as they like. 12 | #' @return A list or vector of Web domain names 13 | #' @seealso Read the documentation for `longurl::expand_urls()` and 14 | #' `urltools::domain()`. 15 | #' @examples 16 | #' 17 | #' get_url_domain("https://www.tidyverse.org/packages/") 18 | #' get_url_domain("https://dplyr.tidyverse.org/") 19 | #' get_url_domain("http://bit.ly/2SfWO3K") 20 | #' 21 | #' @export 22 | get_url_domain <- 23 | function(x, wait = 10) { 24 | if (!requireNamespace("longurl", quietly = TRUE)) { 25 | stop( 26 | "Please install the {longurl} package to use this function", 27 | call. = FALSE 28 | ) 29 | } 30 | 31 | if (!requireNamespace("urltools", quietly = TRUE)) { 32 | stop( 33 | "Please install the {urltools} package to use this function", 34 | call. = FALSE 35 | ) 36 | } 37 | 38 | new_urls <- suppressWarnings(longurl::expand_urls(x, seconds = wait)) 39 | domains <- urltools::domain(new_urls$expanded_url) 40 | domains <- gsub("^www[0-9]?\\.", "", domains) 41 | domains 42 | } 43 | -------------------------------------------------------------------------------- /R/get-upstream-tweets.R: -------------------------------------------------------------------------------- 1 | #' Flag any upstream statuses not already in a dataset 2 | #' 3 | #' Because the Twitter API offers an `in_reply_to_status_id_str` column, it is 4 | #' possible to iteratively reconstruct reply threads in an *upstream* 5 | #' direction, that is, retrieving statuses composed earlier than replies in 6 | #' the dataset. The `flag_unknown_upstream()` function identifies which 7 | #' statuses are replies to statuses not found in the current dataset. 8 | #' @param df A dataframe of statuses and full metadata from the Twitter API as 9 | #' returned by `pull_tweet_data()` 10 | #' @return A new, filtered dataframe which only includes any reply statuses that 11 | #' are not responses to statuses already in the dataset (i.e., upstream 12 | #' replies) 13 | #' @importFrom rlang .data 14 | #' @keywords internal 15 | #' @noRd 16 | flag_unknown_upstream <- 17 | function(df) { 18 | unknown_upstream <- 19 | dplyr::filter(df, 20 | !is.na(.data$in_reply_to_status_id_str) & 21 | !(.data$in_reply_to_status_id_str %in% 22 | df$id_str) 23 | ) 24 | unknown_upstream 25 | } 26 | 27 | #' Collect upstream statuses and add to dataset 28 | #' 29 | #' Because the Twitter API offers an `in_reply_to_status_id_str` column, it is 30 | #' possible to iteratively reconstruct reply threads in an *upstream* 31 | #' direction, that is, retrieving statuses composed earlier than replies in 32 | #' the dataset. The `get_upstream_tweets()` function collects upstream 33 | #' replies not previously found in the dataset. Keep in mind that there is no 34 | #' way to predict how far upstream you can trace back a reply thread, so 35 | #' running `get_upstream_tweets()` could take a while and potentially hit 36 | #' the Twitter API rate limit of 90,000 statuses in a 15-minute period.
37 | #' @param df A dataframe of statuses and full metadata from the Twitter API as 38 | #' returned by `pull_tweet_data()` 39 | #' @return A new, expanded dataframe which includes any retrievable upstream 40 | #' replies 41 | #' @details This function requires authentication; please see 42 | #' `vignette("setup", package = "tidytags")` 43 | #' @seealso Read more about rtweet authentication setup at 44 | #' `vignette("auth", package = "rtweet")` 45 | #' @examples 46 | #' 47 | #' \donttest{ 48 | #' example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 49 | #' tags_content <- read_tags(example_url) 50 | #' 51 | #' if (rtweet::auth_has_default()) { 52 | #' tweets_data <- lookup_many_tweets(tags_content) 53 | #' more_replies_df <- get_upstream_tweets(tweets_data) 54 | #' more_replies_df 55 | #' } 56 | #' } 57 | #' 58 | #' @export 59 | get_upstream_tweets <- 60 | function(df) { 61 | unknown_upstream <- flag_unknown_upstream(df) 62 | 63 | if (nrow(unknown_upstream) == 0) { 64 | message("There are no upstream replies to get.") 65 | } else { 66 | 67 | searchable_replies <- 68 | pull_tweet_data(id_vector = unknown_upstream$in_reply_to_status_id_str) 69 | searchable_n <- 70 | ifelse(is.null(searchable_replies), 0, nrow(searchable_replies)) 71 | 72 | if (searchable_n > 0) { 73 | i <- 0 74 | n <- 0 75 | while (searchable_n > 0) { 76 | i <- i + 1 77 | message("Iteration: ", i) 78 | new_tweets <- 79 | pull_tweet_data(id_vector = 80 | unknown_upstream$in_reply_to_status_id_str) 81 | n <- n + nrow(new_tweets) 82 | df <- rbind(df, new_tweets) 83 | 84 | unknown_upstream <- flag_unknown_upstream(df) 85 | 86 | searchable_replies <- 87 | pull_tweet_data(id_vector = 88 | unknown_upstream$in_reply_to_status_id_str) 89 | searchable_n <- 90 | ifelse(is.null(searchable_replies), 0, nrow(searchable_replies)) 91 | 92 | message( 93 | "New statuses added to the dataset: ", 94 | nrow(new_tweets), 95 | "; reply statuses that were not able to be retrieved: ", 96 | nrow(unknown_upstream), 97 | "; newly added replies where we can still go further upstream: ", 98 | nrow(searchable_replies) 99 | ) 100 | } 101 | 102 | message("We've gone as far upstream as we're able to go.", 103 | " This process resulted in ", n, 104 | " new replies being added to the dataset.") 105 | } 106 | } 107 | df 108 | } 109 | -------------------------------------------------------------------------------- /README.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | output: github_document 3 | --- 4 | 5 | 6 | 7 | ```{r setup, include=FALSE} 8 | knitr::opts_chunk$set( 9 | collapse = TRUE, 10 | comment = "#>", 11 | fig.path = "man/figures/README-", 12 | out.width = "100%", 13 | message = FALSE 14 | ) 15 | ``` 16 | 17 | # tidytags 18 | 19 | ##### *Importing and Analyzing 'Twitter' Data Collected with 'Twitter Archiving Google Sheets'* 20 | 21 | [![Project Status: Unsupported](https://www.repostatus.org/badges/latest/unsupported.svg)](https://www.repostatus.org/#unsupported) 22 | [![rOpenSci Peer Review](https://badges.ropensci.org/382_status.svg)](https://github.com/ropensci/software-review/issues/382) 23 | 24 | This package has been archived. The former README is now in [README-archived](https://github.com/ropensci/tidytags/blob/main/README-archived.md). 
25 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | # tidytags 5 | 6 | ##### *Importing and Analyzing ‘Twitter’ Data Collected with ‘Twitter Archiving Google Sheets’* 7 | 8 | [![Project Status: 9 | Unsupported](https://www.repostatus.org/badges/latest/unsupported.svg)](https://www.repostatus.org/#unsupported) 10 | [![rOpenSci Peer 11 | Review](https://badges.ropensci.org/382_status.svg)](https://github.com/ropensci/software-review/issues/382) 12 | 13 | This package has been archived. The former README is now in 14 | [README-archived](https://github.com/ropensci/tidytags/blob/main/README-archived.md). 15 | -------------------------------------------------------------------------------- /_pkgdown.yml: -------------------------------------------------------------------------------- 1 | url: https://docs.ropensci.org/tidytags 2 | 3 | authors: 4 | K. Bret Staudt Willet: 5 | href: http://bretsw.com/ 6 | Joshua M. Rosenberg: 7 | href: https://joshuamrosenberg.com/ 8 | 9 | navbar: 10 | title: "tidytags" 11 | left: 12 | - text: "Get started" 13 | href: articles/setup.html 14 | - text: "Reference" 15 | href: reference/index.html 16 | - text: "Articles" 17 | menu: 18 | - text: Getting started with tidytags 19 | href: articles/setup.html 20 | - text: "Using tidytags with a conference hashtag" 21 | href: articles/tidytags-with-conf-hashtags.html 22 | - text: "Changelog" 23 | href: news/index.html 24 | right: 25 | - icon: fa-github 26 | href: https://github.com/ropensci/tidytags 27 | 28 | reference: 29 | - title: "Accessing data" 30 | desc: Functions for accessing TAGS data. 31 | contents: 32 | - read_tags 33 | - pull_tweet_data 34 | - lookup_many_tweets 35 | - title: "Processing data" 36 | desc: Functions related to processing data for subsequent analysis. 37 | contents: 38 | - process_tweets 39 | - get_char_tweet_ids 40 | - get_upstream_tweets 41 | - get_url_domain 42 | - title: "Social network analysis" 43 | desc: Functions related to carrying out social network analysis. 44 | contents: 45 | - filter_by_tweet_type 46 | - create_edgelist 47 | - add_users_data 48 | -------------------------------------------------------------------------------- /codecov.yml: -------------------------------------------------------------------------------- 1 | comment: false 2 | 3 | coverage: 4 | status: 5 | project: 6 | default: 7 | target: auto 8 | threshold: 1% 9 | informational: true 10 | patch: 11 | default: 12 | target: auto 13 | threshold: 1% 14 | informational: true 15 | -------------------------------------------------------------------------------- /cran-comments.md: -------------------------------------------------------------------------------- 1 | # CRAN comments 2 | 3 | ## tidytags v1.1.1 4 | 5 | **1/10/2023** 6 | 7 | This is an update from tidytags v1.0.3 submitted on 10/14/2022. This version fixes a bug causing an error in the `get_upstream_tweets()` function and requires an updated version of the vcr package (>= 1.2). 
8 | 9 | --- 10 | 11 | ## R CMD check results 12 | 13 | `devtools::check()` result: 14 | 15 | **Test environment:** local MacOS Version 11.7 install, R 4.2.1 16 | 17 | **0 errors ✔ | 0 warnings ✔ | 0 notes ✔** 18 | 19 | --- 20 | 21 | ## GitHub Actions result: 22 | 23 | **Test environments:** 24 | 25 | - macOS-latest, R release 26 | - windows-latest, R release 27 | - ubuntu-latest, R devel 28 | - ubuntu-latest, R release 29 | - ubuntu-latest, R 4.2 30 | 31 | **0 errors ✔ | 0 warnings ✔ | 0 notes ✔** 32 | 33 | --- 34 | 35 | `rhub::check_for_cran()` result: 36 | 37 | **Test environment:** Windows Server 2022, R-devel, 64 bit 38 | 39 | **0 errors ✔ | 0 warnings ✔ | 0 notes ✔** 40 | 41 | --- 42 | 43 | `rhub::check_for_cran()` result: 44 | 45 | **Test environment:** Fedora Linux, R-devel, clang, gfortran 46 | 47 | **0 errors ✔ | 0 warnings ✔ | 1 note** 48 | 49 | - checking HTML version of manual ... NOTE: Skipping checking HTML validation: no command 'tidy' found 50 | - Explanation: As noted in an [r-source check](https://github.com/wch/r-source/blob/trunk/src/library/tools/R/check.R), this seems like an issue related to macOS's old version of HTML Tidy and not related to the package being checked. 51 | 52 | --- 53 | 54 | `rhub::check_on_windows()` result: 55 | 56 | **Test environment:** Windows Server 2022, R-release, 32/64 bit 57 | 58 | **0 errors ✔ | 0 warnings ✔ | 0 notes ✔** 59 | -------------------------------------------------------------------------------- /docs/404.html: -------------------------------------------------------------------------------- [pkgdown-generated 404 page ("Page not found (404) • tidytags"); body text: "Content not found. Please use links in the navbar." HTML markup omitted]
-------------------------------------------------------------------------------- /docs/LICENSE-text.html: -------------------------------------------------------------------------------- [pkgdown-rendered license page ("License • tidytags"); duplicates the contents of the LICENSE file; HTML markup omitted]
-------------------------------------------------------------------------------- /docs/LICENSE.html: -------------------------------------------------------------------------------- [pkgdown-rendered license page ("MIT License • tidytags"); duplicates the contents of LICENSE.md; HTML markup omitted]
-------------------------------------------------------------------------------- /docs/apple-touch-icon-120x120.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/apple-touch-icon-120x120.png -------------------------------------------------------------------------------- /docs/apple-touch-icon-152x152.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/apple-touch-icon-152x152.png -------------------------------------------------------------------------------- /docs/apple-touch-icon-180x180.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/apple-touch-icon-180x180.png -------------------------------------------------------------------------------- /docs/apple-touch-icon-60x60.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/apple-touch-icon-60x60.png -------------------------------------------------------------------------------- /docs/apple-touch-icon-76x76.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/apple-touch-icon-76x76.png -------------------------------------------------------------------------------- /docs/apple-touch-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/apple-touch-icon.png -------------------------------------------------------------------------------- /docs/articles/files/TAGS-identifier-from-browser.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/TAGS-identifier-from-browser.png -------------------------------------------------------------------------------- /docs/articles/files/TAGS-identifier-highlighted.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/TAGS-identifier-highlighted.png -------------------------------------------------------------------------------- /docs/articles/files/TAGS-make-copy.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/TAGS-make-copy.png -------------------------------------------------------------------------------- /docs/articles/files/TAGS-ready.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/TAGS-ready.png --------------------------------------------------------------------------------
/docs/articles/files/choice-TAGS-version.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/choice-TAGS-version.png -------------------------------------------------------------------------------- /docs/articles/files/key-task-1-success.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/key-task-1-success.png -------------------------------------------------------------------------------- /docs/articles/files/pain-point-1-success.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/pain-point-1-success.png -------------------------------------------------------------------------------- /docs/articles/files/publish-to-web-choices.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/publish-to-web-choices.png -------------------------------------------------------------------------------- /docs/articles/files/publish-to-web-menu.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/publish-to-web-menu.png -------------------------------------------------------------------------------- /docs/articles/files/share-anyone-with-link.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/share-anyone-with-link.png -------------------------------------------------------------------------------- /docs/articles/files/share-button.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/share-button.png -------------------------------------------------------------------------------- /docs/articles/files/tidytags-setup-google-api_1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/tidytags-setup-google-api_1.png -------------------------------------------------------------------------------- /docs/articles/files/tidytags-setup-google-api_2.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/tidytags-setup-google-api_2.png -------------------------------------------------------------------------------- /docs/articles/files/tidytags-setup-google-api_3.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/files/tidytags-setup-google-api_3.png 
-------------------------------------------------------------------------------- /docs/articles/index.html: --------------------------------------------------------------------------------
[pkgdown-generated page; HTML markup stripped in this export. Recoverable content:]
Articles • tidytags
All vignettes:
  Getting started with tidytags (setup.html)
  Using tidytags with a conference hashtag (tidytags-with-conf-hashtags.html)
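Both vignettes listed above also ship inside the package itself, so (assuming tidytags is installed) they can be opened directly from an R session rather than from the pkgdown site; the vignette names below follow the article-to-file mapping recorded in docs/pkgdown.yml:

```r
# Open the installed copies of the two vignettes listed on this index page
vignette("setup", package = "tidytags")
vignette("tidytags-with-conf-hashtags", package = "tidytags")
```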
87 | 88 | 89 | 90 | 91 | 92 | 93 | 94 | 95 | -------------------------------------------------------------------------------- /docs/articles/setup_files/header-attrs-2.11/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/tidytags-with-conf-hashtags_files/header-attrs-2.11/header-attrs.js: -------------------------------------------------------------------------------- 1 | // Pandoc 2.9 adds attributes on both header and div. We remove the former (to 2 | // be compatible with the behavior of Pandoc < 2.8). 3 | document.addEventListener('DOMContentLoaded', function(e) { 4 | var hs = document.querySelectorAll("div.section[class*='level'] > :first-child"); 5 | var i, h, a; 6 | for (i = 0; i < hs.length; i++) { 7 | h = hs[i]; 8 | if (!/^h[1-6]$/i.test(h.tagName)) continue; // it should be a header h1-h6 9 | a = h.attributes; 10 | while (a.length > 0) h.removeAttribute(a[0].name); 11 | } 12 | }); 13 | -------------------------------------------------------------------------------- /docs/articles/vignette-network-visualization-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/articles/vignette-network-visualization-1.png -------------------------------------------------------------------------------- /docs/authors.html: -------------------------------------------------------------------------------- 1 | 2 | Authors and Citation • tidytags 6 | 7 | 8 |
[pkgdown-generated page; HTML markup stripped in this export. Recoverable content:]
Authors:
  K. Bret Staudt Willet. Author, maintainer.
  Joshua M. Rosenberg. Author.
  Lluís Revilla Sancho. Reviewer.
  Marion Louveaux. Reviewer.
Citation (Source: inst/CITATION):
  Staudt Willet, K. B., & Rosenberg, J. M. (2022). tidytags: Importing and analyzing Twitter data collected with Twitter Archiving Google Sheets. https://github.com/ropensci/tidytags
  @Manual{tidytags-package,
    title = {tidytags: Importing and analyzing Twitter data collected with Twitter Archiving Google Sheets},
    author = {K. Bret Staudt Willet and Joshua M. Rosenberg},
    year = {2022},
  }
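Because this page is generated from inst/CITATION, the same reference (including the BibTeX entry) can be reproduced from an R session, assuming the package is installed:

```r
# Print the package citation; bibtex = TRUE also prints the @Manual entry
print(citation("tidytags"), bibtex = TRUE)
```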
114 | 115 | 116 | 117 | 118 | 119 | 120 | 121 | 122 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.css: -------------------------------------------------------------------------------- 1 | /*! 2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | 6 | /* modified from https://github.com/twbs/bootstrap/blob/94b4076dd2efba9af71f0b18d4ee4b163aa9e0dd/docs/assets/css/src/docs.css#L548-L601 */ 7 | 8 | /* All levels of nav */ 9 | nav[data-toggle='toc'] .nav > li > a { 10 | display: block; 11 | padding: 4px 20px; 12 | font-size: 13px; 13 | font-weight: 500; 14 | color: #767676; 15 | } 16 | nav[data-toggle='toc'] .nav > li > a:hover, 17 | nav[data-toggle='toc'] .nav > li > a:focus { 18 | padding-left: 19px; 19 | color: #563d7c; 20 | text-decoration: none; 21 | background-color: transparent; 22 | border-left: 1px solid #563d7c; 23 | } 24 | nav[data-toggle='toc'] .nav > .active > a, 25 | nav[data-toggle='toc'] .nav > .active:hover > a, 26 | nav[data-toggle='toc'] .nav > .active:focus > a { 27 | padding-left: 18px; 28 | font-weight: bold; 29 | color: #563d7c; 30 | background-color: transparent; 31 | border-left: 2px solid #563d7c; 32 | } 33 | 34 | /* Nav: second level (shown on .active) */ 35 | nav[data-toggle='toc'] .nav .nav { 36 | display: none; /* Hide by default, but at >768px, show it */ 37 | padding-bottom: 10px; 38 | } 39 | nav[data-toggle='toc'] .nav .nav > li > a { 40 | padding-top: 1px; 41 | padding-bottom: 1px; 42 | padding-left: 30px; 43 | font-size: 12px; 44 | font-weight: normal; 45 | } 46 | nav[data-toggle='toc'] .nav .nav > li > a:hover, 47 | nav[data-toggle='toc'] .nav .nav > li > a:focus { 48 | padding-left: 29px; 49 | } 50 | nav[data-toggle='toc'] .nav .nav > .active > a, 51 | nav[data-toggle='toc'] .nav .nav > .active:hover > a, 52 | nav[data-toggle='toc'] .nav .nav > .active:focus > a { 53 | padding-left: 28px; 54 | font-weight: 500; 55 | } 56 | 57 | /* from https://github.com/twbs/bootstrap/blob/e38f066d8c203c3e032da0ff23cd2d6098ee2dd6/docs/assets/css/src/docs.css#L631-L634 */ 58 | nav[data-toggle='toc'] .nav > .active > ul { 59 | display: block; 60 | } 61 | -------------------------------------------------------------------------------- /docs/bootstrap-toc.js: -------------------------------------------------------------------------------- 1 | /*! 
2 | * Bootstrap Table of Contents v0.4.1 (http://afeld.github.io/bootstrap-toc/) 3 | * Copyright 2015 Aidan Feldman 4 | * Licensed under MIT (https://github.com/afeld/bootstrap-toc/blob/gh-pages/LICENSE.md) */ 5 | (function() { 6 | 'use strict'; 7 | 8 | window.Toc = { 9 | helpers: { 10 | // return all matching elements in the set, or their descendants 11 | findOrFilter: function($el, selector) { 12 | // http://danielnouri.org/notes/2011/03/14/a-jquery-find-that-also-finds-the-root-element/ 13 | // http://stackoverflow.com/a/12731439/358804 14 | var $descendants = $el.find(selector); 15 | return $el.filter(selector).add($descendants).filter(':not([data-toc-skip])'); 16 | }, 17 | 18 | generateUniqueIdBase: function(el) { 19 | var text = $(el).text(); 20 | var anchor = text.trim().toLowerCase().replace(/[^A-Za-z0-9]+/g, '-'); 21 | return anchor || el.tagName.toLowerCase(); 22 | }, 23 | 24 | generateUniqueId: function(el) { 25 | var anchorBase = this.generateUniqueIdBase(el); 26 | for (var i = 0; ; i++) { 27 | var anchor = anchorBase; 28 | if (i > 0) { 29 | // add suffix 30 | anchor += '-' + i; 31 | } 32 | // check if ID already exists 33 | if (!document.getElementById(anchor)) { 34 | return anchor; 35 | } 36 | } 37 | }, 38 | 39 | generateAnchor: function(el) { 40 | if (el.id) { 41 | return el.id; 42 | } else { 43 | var anchor = this.generateUniqueId(el); 44 | el.id = anchor; 45 | return anchor; 46 | } 47 | }, 48 | 49 | createNavList: function() { 50 | return $(''); 51 | }, 52 | 53 | createChildNavList: function($parent) { 54 | var $childList = this.createNavList(); 55 | $parent.append($childList); 56 | return $childList; 57 | }, 58 | 59 | generateNavEl: function(anchor, text) { 60 | var $a = $(''); 61 | $a.attr('href', '#' + anchor); 62 | $a.text(text); 63 | var $li = $('
  • '); 64 | $li.append($a); 65 | return $li; 66 | }, 67 | 68 | generateNavItem: function(headingEl) { 69 | var anchor = this.generateAnchor(headingEl); 70 | var $heading = $(headingEl); 71 | var text = $heading.data('toc-text') || $heading.text(); 72 | return this.generateNavEl(anchor, text); 73 | }, 74 | 75 | // Find the first heading level (`

<h1>`, then `<h2>`, etc.) that has more than one element. Defaults to 1 (for `<h1>

    `). 76 | getTopLevel: function($scope) { 77 | for (var i = 1; i <= 6; i++) { 78 | var $headings = this.findOrFilter($scope, 'h' + i); 79 | if ($headings.length > 1) { 80 | return i; 81 | } 82 | } 83 | 84 | return 1; 85 | }, 86 | 87 | // returns the elements for the top level, and the next below it 88 | getHeadings: function($scope, topLevel) { 89 | var topSelector = 'h' + topLevel; 90 | 91 | var secondaryLevel = topLevel + 1; 92 | var secondarySelector = 'h' + secondaryLevel; 93 | 94 | return this.findOrFilter($scope, topSelector + ',' + secondarySelector); 95 | }, 96 | 97 | getNavLevel: function(el) { 98 | return parseInt(el.tagName.charAt(1), 10); 99 | }, 100 | 101 | populateNav: function($topContext, topLevel, $headings) { 102 | var $context = $topContext; 103 | var $prevNav; 104 | 105 | var helpers = this; 106 | $headings.each(function(i, el) { 107 | var $newNav = helpers.generateNavItem(el); 108 | var navLevel = helpers.getNavLevel(el); 109 | 110 | // determine the proper $context 111 | if (navLevel === topLevel) { 112 | // use top level 113 | $context = $topContext; 114 | } else if ($prevNav && $context === $topContext) { 115 | // create a new level of the tree and switch to it 116 | $context = helpers.createChildNavList($prevNav); 117 | } // else use the current $context 118 | 119 | $context.append($newNav); 120 | 121 | $prevNav = $newNav; 122 | }); 123 | }, 124 | 125 | parseOps: function(arg) { 126 | var opts; 127 | if (arg.jquery) { 128 | opts = { 129 | $nav: arg 130 | }; 131 | } else { 132 | opts = arg; 133 | } 134 | opts.$scope = opts.$scope || $(document.body); 135 | return opts; 136 | } 137 | }, 138 | 139 | // accepts a jQuery object, or an options object 140 | init: function(opts) { 141 | opts = this.helpers.parseOps(opts); 142 | 143 | // ensure that the data attribute is in place for styling 144 | opts.$nav.attr('data-toggle', 'toc'); 145 | 146 | var $topContext = this.helpers.createChildNavList(opts.$nav); 147 | var topLevel = this.helpers.getTopLevel(opts.$scope); 148 | var $headings = this.helpers.getHeadings(opts.$scope, topLevel); 149 | this.helpers.populateNav($topContext, topLevel, $headings); 150 | } 151 | }; 152 | 153 | $(function() { 154 | $('nav[data-toggle="toc"]').each(function(i, el) { 155 | var $nav = $(el); 156 | Toc.init($nav); 157 | }); 158 | }); 159 | })(); 160 | -------------------------------------------------------------------------------- /docs/docsearch.js: -------------------------------------------------------------------------------- 1 | $(function() { 2 | 3 | // register a handler to move the focus to the search bar 4 | // upon pressing shift + "/" (i.e. 
"?") 5 | $(document).on('keydown', function(e) { 6 | if (e.shiftKey && e.keyCode == 191) { 7 | e.preventDefault(); 8 | $("#search-input").focus(); 9 | } 10 | }); 11 | 12 | $(document).ready(function() { 13 | // do keyword highlighting 14 | /* modified from https://jsfiddle.net/julmot/bL6bb5oo/ */ 15 | var mark = function() { 16 | 17 | var referrer = document.URL ; 18 | var paramKey = "q" ; 19 | 20 | if (referrer.indexOf("?") !== -1) { 21 | var qs = referrer.substr(referrer.indexOf('?') + 1); 22 | var qs_noanchor = qs.split('#')[0]; 23 | var qsa = qs_noanchor.split('&'); 24 | var keyword = ""; 25 | 26 | for (var i = 0; i < qsa.length; i++) { 27 | var currentParam = qsa[i].split('='); 28 | 29 | if (currentParam.length !== 2) { 30 | continue; 31 | } 32 | 33 | if (currentParam[0] == paramKey) { 34 | keyword = decodeURIComponent(currentParam[1].replace(/\+/g, "%20")); 35 | } 36 | } 37 | 38 | if (keyword !== "") { 39 | $(".contents").unmark({ 40 | done: function() { 41 | $(".contents").mark(keyword); 42 | } 43 | }); 44 | } 45 | } 46 | }; 47 | 48 | mark(); 49 | }); 50 | }); 51 | 52 | /* Search term highlighting ------------------------------*/ 53 | 54 | function matchedWords(hit) { 55 | var words = []; 56 | 57 | var hierarchy = hit._highlightResult.hierarchy; 58 | // loop to fetch from lvl0, lvl1, etc. 59 | for (var idx in hierarchy) { 60 | words = words.concat(hierarchy[idx].matchedWords); 61 | } 62 | 63 | var content = hit._highlightResult.content; 64 | if (content) { 65 | words = words.concat(content.matchedWords); 66 | } 67 | 68 | // return unique words 69 | var words_uniq = [...new Set(words)]; 70 | return words_uniq; 71 | } 72 | 73 | function updateHitURL(hit) { 74 | 75 | var words = matchedWords(hit); 76 | var url = ""; 77 | 78 | if (hit.anchor) { 79 | url = hit.url_without_anchor + '?q=' + escape(words.join(" ")) + '#' + hit.anchor; 80 | } else { 81 | url = hit.url + '?q=' + escape(words.join(" ")); 82 | } 83 | 84 | return url; 85 | } 86 | -------------------------------------------------------------------------------- /docs/favicon-16x16.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/favicon-16x16.png -------------------------------------------------------------------------------- /docs/favicon-32x32.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/favicon-32x32.png -------------------------------------------------------------------------------- /docs/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/favicon.ico -------------------------------------------------------------------------------- /docs/link.svg: -------------------------------------------------------------------------------- 1 | 2 | 3 | 5 | 8 | 12 | 13 | -------------------------------------------------------------------------------- /docs/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/logo.png -------------------------------------------------------------------------------- /docs/pkgdown.css: 
-------------------------------------------------------------------------------- 1 | /* Sticky footer */ 2 | 3 | /** 4 | * Basic idea: https://philipwalton.github.io/solved-by-flexbox/demos/sticky-footer/ 5 | * Details: https://github.com/philipwalton/solved-by-flexbox/blob/master/assets/css/components/site.css 6 | * 7 | * .Site -> body > .container 8 | * .Site-content -> body > .container .row 9 | * .footer -> footer 10 | * 11 | * Key idea seems to be to ensure that .container and __all its parents__ 12 | * have height set to 100% 13 | * 14 | */ 15 | 16 | html, body { 17 | height: 100%; 18 | } 19 | 20 | body { 21 | position: relative; 22 | } 23 | 24 | body > .container { 25 | display: flex; 26 | height: 100%; 27 | flex-direction: column; 28 | } 29 | 30 | body > .container .row { 31 | flex: 1 0 auto; 32 | } 33 | 34 | footer { 35 | margin-top: 45px; 36 | padding: 35px 0 36px; 37 | border-top: 1px solid #e5e5e5; 38 | color: #666; 39 | display: flex; 40 | flex-shrink: 0; 41 | } 42 | footer p { 43 | margin-bottom: 0; 44 | } 45 | footer div { 46 | flex: 1; 47 | } 48 | footer .pkgdown { 49 | text-align: right; 50 | } 51 | footer p { 52 | margin-bottom: 0; 53 | } 54 | 55 | img.icon { 56 | float: right; 57 | } 58 | 59 | /* Ensure in-page images don't run outside their container */ 60 | .contents img { 61 | max-width: 100%; 62 | height: auto; 63 | } 64 | 65 | /* Fix bug in bootstrap (only seen in firefox) */ 66 | summary { 67 | display: list-item; 68 | } 69 | 70 | /* Typographic tweaking ---------------------------------*/ 71 | 72 | .contents .page-header { 73 | margin-top: calc(-60px + 1em); 74 | } 75 | 76 | dd { 77 | margin-left: 3em; 78 | } 79 | 80 | /* Section anchors ---------------------------------*/ 81 | 82 | a.anchor { 83 | display: none; 84 | margin-left: 5px; 85 | width: 20px; 86 | height: 20px; 87 | 88 | background-image: url(./link.svg); 89 | background-repeat: no-repeat; 90 | background-size: 20px 20px; 91 | background-position: center center; 92 | } 93 | 94 | h1:hover .anchor, 95 | h2:hover .anchor, 96 | h3:hover .anchor, 97 | h4:hover .anchor, 98 | h5:hover .anchor, 99 | h6:hover .anchor { 100 | display: inline-block; 101 | } 102 | 103 | /* Fixes for fixed navbar --------------------------*/ 104 | 105 | .contents h1, .contents h2, .contents h3, .contents h4 { 106 | padding-top: 60px; 107 | margin-top: -40px; 108 | } 109 | 110 | /* Navbar submenu --------------------------*/ 111 | 112 | .dropdown-submenu { 113 | position: relative; 114 | } 115 | 116 | .dropdown-submenu>.dropdown-menu { 117 | top: 0; 118 | left: 100%; 119 | margin-top: -6px; 120 | margin-left: -1px; 121 | border-radius: 0 6px 6px 6px; 122 | } 123 | 124 | .dropdown-submenu:hover>.dropdown-menu { 125 | display: block; 126 | } 127 | 128 | .dropdown-submenu>a:after { 129 | display: block; 130 | content: " "; 131 | float: right; 132 | width: 0; 133 | height: 0; 134 | border-color: transparent; 135 | border-style: solid; 136 | border-width: 5px 0 5px 5px; 137 | border-left-color: #cccccc; 138 | margin-top: 5px; 139 | margin-right: -10px; 140 | } 141 | 142 | .dropdown-submenu:hover>a:after { 143 | border-left-color: #ffffff; 144 | } 145 | 146 | .dropdown-submenu.pull-left { 147 | float: none; 148 | } 149 | 150 | .dropdown-submenu.pull-left>.dropdown-menu { 151 | left: -100%; 152 | margin-left: 10px; 153 | border-radius: 6px 0 6px 6px; 154 | } 155 | 156 | /* Sidebar --------------------------*/ 157 | 158 | #pkgdown-sidebar { 159 | margin-top: 30px; 160 | position: -webkit-sticky; 161 | position: sticky; 162 | top: 70px; 163 | 
} 164 | 165 | #pkgdown-sidebar h2 { 166 | font-size: 1.5em; 167 | margin-top: 1em; 168 | } 169 | 170 | #pkgdown-sidebar h2:first-child { 171 | margin-top: 0; 172 | } 173 | 174 | #pkgdown-sidebar .list-unstyled li { 175 | margin-bottom: 0.5em; 176 | } 177 | 178 | /* bootstrap-toc tweaks ------------------------------------------------------*/ 179 | 180 | /* All levels of nav */ 181 | 182 | nav[data-toggle='toc'] .nav > li > a { 183 | padding: 4px 20px 4px 6px; 184 | font-size: 1.5rem; 185 | font-weight: 400; 186 | color: inherit; 187 | } 188 | 189 | nav[data-toggle='toc'] .nav > li > a:hover, 190 | nav[data-toggle='toc'] .nav > li > a:focus { 191 | padding-left: 5px; 192 | color: inherit; 193 | border-left: 1px solid #878787; 194 | } 195 | 196 | nav[data-toggle='toc'] .nav > .active > a, 197 | nav[data-toggle='toc'] .nav > .active:hover > a, 198 | nav[data-toggle='toc'] .nav > .active:focus > a { 199 | padding-left: 5px; 200 | font-size: 1.5rem; 201 | font-weight: 400; 202 | color: inherit; 203 | border-left: 2px solid #878787; 204 | } 205 | 206 | /* Nav: second level (shown on .active) */ 207 | 208 | nav[data-toggle='toc'] .nav .nav { 209 | display: none; /* Hide by default, but at >768px, show it */ 210 | padding-bottom: 10px; 211 | } 212 | 213 | nav[data-toggle='toc'] .nav .nav > li > a { 214 | padding-left: 16px; 215 | font-size: 1.35rem; 216 | } 217 | 218 | nav[data-toggle='toc'] .nav .nav > li > a:hover, 219 | nav[data-toggle='toc'] .nav .nav > li > a:focus { 220 | padding-left: 15px; 221 | } 222 | 223 | nav[data-toggle='toc'] .nav .nav > .active > a, 224 | nav[data-toggle='toc'] .nav .nav > .active:hover > a, 225 | nav[data-toggle='toc'] .nav .nav > .active:focus > a { 226 | padding-left: 15px; 227 | font-weight: 500; 228 | font-size: 1.35rem; 229 | } 230 | 231 | /* orcid ------------------------------------------------------------------- */ 232 | 233 | .orcid { 234 | font-size: 16px; 235 | color: #A6CE39; 236 | /* margins are required by official ORCID trademark and display guidelines */ 237 | margin-left:4px; 238 | margin-right:4px; 239 | vertical-align: middle; 240 | } 241 | 242 | /* Reference index & topics ----------------------------------------------- */ 243 | 244 | .ref-index th {font-weight: normal;} 245 | 246 | .ref-index td {vertical-align: top; min-width: 100px} 247 | .ref-index .icon {width: 40px;} 248 | .ref-index .alias {width: 40%;} 249 | .ref-index-icons .alias {width: calc(40% - 40px);} 250 | .ref-index .title {width: 60%;} 251 | 252 | .ref-arguments th {text-align: right; padding-right: 10px;} 253 | .ref-arguments th, .ref-arguments td {vertical-align: top; min-width: 100px} 254 | .ref-arguments .name {width: 20%;} 255 | .ref-arguments .desc {width: 80%;} 256 | 257 | /* Nice scrolling for wide elements --------------------------------------- */ 258 | 259 | table { 260 | display: block; 261 | overflow: auto; 262 | } 263 | 264 | /* Syntax highlighting ---------------------------------------------------- */ 265 | 266 | pre, code, pre code { 267 | background-color: #f8f8f8; 268 | color: #333; 269 | } 270 | pre, pre code { 271 | white-space: pre-wrap; 272 | word-break: break-all; 273 | overflow-wrap: break-word; 274 | } 275 | 276 | pre { 277 | border: 1px solid #eee; 278 | } 279 | 280 | pre .img, pre .r-plt { 281 | margin: 5px 0; 282 | } 283 | 284 | pre .img img, pre .r-plt img { 285 | background-color: #fff; 286 | } 287 | 288 | code a, pre a { 289 | color: #375f84; 290 | } 291 | 292 | a.sourceLine:hover { 293 | text-decoration: none; 294 | } 295 | 296 | .fl {color: 
#1514b5;} 297 | .fu {color: #000000;} /* function */ 298 | .ch,.st {color: #036a07;} /* string */ 299 | .kw {color: #264D66;} /* keyword */ 300 | .co {color: #888888;} /* comment */ 301 | 302 | .error {font-weight: bolder;} 303 | .warning {font-weight: bolder;} 304 | 305 | /* Clipboard --------------------------*/ 306 | 307 | .hasCopyButton { 308 | position: relative; 309 | } 310 | 311 | .btn-copy-ex { 312 | position: absolute; 313 | right: 0; 314 | top: 0; 315 | visibility: hidden; 316 | } 317 | 318 | .hasCopyButton:hover button.btn-copy-ex { 319 | visibility: visible; 320 | } 321 | 322 | /* headroom.js ------------------------ */ 323 | 324 | .headroom { 325 | will-change: transform; 326 | transition: transform 200ms linear; 327 | } 328 | .headroom--pinned { 329 | transform: translateY(0%); 330 | } 331 | .headroom--unpinned { 332 | transform: translateY(-100%); 333 | } 334 | 335 | /* mark.js ----------------------------*/ 336 | 337 | mark { 338 | background-color: rgba(255, 255, 51, 0.5); 339 | border-bottom: 2px solid rgba(255, 153, 51, 0.3); 340 | padding: 1px; 341 | } 342 | 343 | /* vertical spacing after htmlwidgets */ 344 | .html-widget { 345 | margin-bottom: 10px; 346 | } 347 | 348 | /* fontawesome ------------------------ */ 349 | 350 | .fab { 351 | font-family: "Font Awesome 5 Brands" !important; 352 | } 353 | 354 | /* don't display links in code chunks when printing */ 355 | /* source: https://stackoverflow.com/a/10781533 */ 356 | @media print { 357 | code a:link:after, code a:visited:after { 358 | content: ""; 359 | } 360 | } 361 | 362 | /* Section anchors --------------------------------- 363 | Added in pandoc 2.11: https://github.com/jgm/pandoc-templates/commit/9904bf71 364 | */ 365 | 366 | div.csl-bib-body { } 367 | div.csl-entry { 368 | clear: both; 369 | } 370 | .hanging-indent div.csl-entry { 371 | margin-left:2em; 372 | text-indent:-2em; 373 | } 374 | div.csl-left-margin { 375 | min-width:2em; 376 | float:left; 377 | } 378 | div.csl-right-inline { 379 | margin-left:2em; 380 | padding-left:1em; 381 | } 382 | div.csl-indent { 383 | margin-left: 2em; 384 | } 385 | -------------------------------------------------------------------------------- /docs/pkgdown.js: -------------------------------------------------------------------------------- 1 | /* http://gregfranko.com/blog/jquery-best-practices/ */ 2 | (function($) { 3 | $(function() { 4 | 5 | $('.navbar-fixed-top').headroom(); 6 | 7 | $('body').css('padding-top', $('.navbar').height() + 10); 8 | $(window).resize(function(){ 9 | $('body').css('padding-top', $('.navbar').height() + 10); 10 | }); 11 | 12 | $('[data-toggle="tooltip"]').tooltip(); 13 | 14 | var cur_path = paths(location.pathname); 15 | var links = $("#navbar ul li a"); 16 | var max_length = -1; 17 | var pos = -1; 18 | for (var i = 0; i < links.length; i++) { 19 | if (links[i].getAttribute("href") === "#") 20 | continue; 21 | // Ignore external links 22 | if (links[i].host !== location.host) 23 | continue; 24 | 25 | var nav_path = paths(links[i].pathname); 26 | 27 | var length = prefix_length(nav_path, cur_path); 28 | if (length > max_length) { 29 | max_length = length; 30 | pos = i; 31 | } 32 | } 33 | 34 | // Add class to parent
<li>, and enclosing <li>
  • if in dropdown 35 | if (pos >= 0) { 36 | var menu_anchor = $(links[pos]); 37 | menu_anchor.parent().addClass("active"); 38 | menu_anchor.closest("li.dropdown").addClass("active"); 39 | } 40 | }); 41 | 42 | function paths(pathname) { 43 | var pieces = pathname.split("/"); 44 | pieces.shift(); // always starts with / 45 | 46 | var end = pieces[pieces.length - 1]; 47 | if (end === "index.html" || end === "") 48 | pieces.pop(); 49 | return(pieces); 50 | } 51 | 52 | // Returns -1 if not found 53 | function prefix_length(needle, haystack) { 54 | if (needle.length > haystack.length) 55 | return(-1); 56 | 57 | // Special case for length-0 haystack, since for loop won't run 58 | if (haystack.length === 0) { 59 | return(needle.length === 0 ? 0 : -1); 60 | } 61 | 62 | for (var i = 0; i < haystack.length; i++) { 63 | if (needle[i] != haystack[i]) 64 | return(i); 65 | } 66 | 67 | return(haystack.length); 68 | } 69 | 70 | /* Clipboard --------------------------*/ 71 | 72 | function changeTooltipMessage(element, msg) { 73 | var tooltipOriginalTitle=element.getAttribute('data-original-title'); 74 | element.setAttribute('data-original-title', msg); 75 | $(element).tooltip('show'); 76 | element.setAttribute('data-original-title', tooltipOriginalTitle); 77 | } 78 | 79 | if(ClipboardJS.isSupported()) { 80 | $(document).ready(function() { 81 | var copyButton = ""; 82 | 83 | $("div.sourceCode").addClass("hasCopyButton"); 84 | 85 | // Insert copy buttons: 86 | $(copyButton).prependTo(".hasCopyButton"); 87 | 88 | // Initialize tooltips: 89 | $('.btn-copy-ex').tooltip({container: 'body'}); 90 | 91 | // Initialize clipboard: 92 | var clipboardBtnCopies = new ClipboardJS('[data-clipboard-copy]', { 93 | text: function(trigger) { 94 | return trigger.parentNode.textContent.replace(/\n#>[^\n]*/g, ""); 95 | } 96 | }); 97 | 98 | clipboardBtnCopies.on('success', function(e) { 99 | changeTooltipMessage(e.trigger, 'Copied!'); 100 | e.clearSelection(); 101 | }); 102 | 103 | clipboardBtnCopies.on('error', function() { 104 | changeTooltipMessage(e.trigger,'Press Ctrl+C or Command+C to copy'); 105 | }); 106 | }); 107 | } 108 | })(window.jQuery || window.$) 109 | -------------------------------------------------------------------------------- /docs/pkgdown.yml: -------------------------------------------------------------------------------- 1 | pandoc: 2.19.2 2 | pkgdown: 2.0.6 3 | pkgdown_sha: ~ 4 | articles: 5 | setup: setup.html 6 | tidytags-with-conf-hashtags: tidytags-with-conf-hashtags.html 7 | last_built: 2023-01-10T21:54Z 8 | urls: 9 | reference: https://docs.ropensci.org/tidytags/reference 10 | article: https://docs.ropensci.org/tidytags/articles 11 | 12 | -------------------------------------------------------------------------------- /docs/reference/Rplot001.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/reference/Rplot001.png -------------------------------------------------------------------------------- /docs/reference/figures/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/reference/figures/logo.png -------------------------------------------------------------------------------- /docs/reference/figures/tidytags-workflow.jpg: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/docs/reference/figures/tidytags-workflow.jpg -------------------------------------------------------------------------------- /docs/reference/index.html: -------------------------------------------------------------------------------- 1 | 2 | Function reference • tidytags 6 | 7 | 8 |
[pkgdown-generated page; HTML markup stripped in this export. Recoverable content:]
Accessing data (functions for accessing TAGS data):
  read_tags() : Retrieve a TAGS archive of Twitter statuses and bring into R
  pull_tweet_data() : Retrieve the fullest extent of status metadata available from the Twitter API
  lookup_many_tweets() : Retrieve the fullest extent of metadata for more than 90,000 statuses
Processing data (functions related to processing data for subsequent analysis):
  process_tweets() : Calculate additional information using status metadata
  get_char_tweet_ids() : Get Twitter status ID numbers as character strings
  get_upstream_tweets() : Collect upstream statuses and add to dataset
  get_url_domain() : Find the domain name of URLs, even shortened URLs
Social network analysis (functions related to carrying out social network analysis):
  filter_by_tweet_type() : Filter a Twitter dataset to only include statuses of a particular type
  create_edgelist() : Create an edgelist where senders and receivers are defined by different types of Twitter interactions
  add_users_data() : Retrieve user information for everyone in an edgelist
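Read top to bottom, this index traces the package's intended pipeline: access a TAGS archive, process the re-hydrated status metadata, then analyze the resulting network. A minimal end-to-end sketch, reusing the example TAGS tracker ID from the man/ examples and assuming a default rtweet token is available for the API-dependent steps:

```r
library(tidytags)

# Accessing data: import the TAGS archive of collected statuses
example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8"
tags_content <- read_tags(example_url)

if (rtweet::auth_has_default()) {
  # Re-hydrate the fullest available status metadata from the Twitter API
  tweets_data <- pull_tweet_data(tags_content)

  # Processing data: trace reply threads upstream, then add derived columns
  tweets_full <- get_upstream_tweets(tweets_data)
  tweets_processed <- process_tweets(tweets_full)

  # Social network analysis: build a reply edgelist and attach user data
  reply_edgelist <- create_edgelist(tweets_full, type = "reply")
  edgelist_with_users <- add_users_data(reply_edgelist)
}
```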
    135 | 136 | 137 | 138 | 139 | 140 | 141 | 142 | 143 | -------------------------------------------------------------------------------- /docs/sitemap.xml: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | https://docs.ropensci.org/tidytags/404.html 5 | 6 | 7 | https://docs.ropensci.org/tidytags/CONTRIBUTING.html 8 | 9 | 10 | https://docs.ropensci.org/tidytags/LICENSE-text.html 11 | 12 | 13 | https://docs.ropensci.org/tidytags/LICENSE.html 14 | 15 | 16 | https://docs.ropensci.org/tidytags/articles/index.html 17 | 18 | 19 | https://docs.ropensci.org/tidytags/articles/setup.html 20 | 21 | 22 | https://docs.ropensci.org/tidytags/articles/tidytags-with-conf-hashtags.html 23 | 24 | 25 | https://docs.ropensci.org/tidytags/authors.html 26 | 27 | 28 | https://docs.ropensci.org/tidytags/index.html 29 | 30 | 31 | https://docs.ropensci.org/tidytags/news/index.html 32 | 33 | 34 | https://docs.ropensci.org/tidytags/paper.html 35 | 36 | 37 | https://docs.ropensci.org/tidytags/reference/add_users_data.html 38 | 39 | 40 | https://docs.ropensci.org/tidytags/reference/create_edgelist.html 41 | 42 | 43 | https://docs.ropensci.org/tidytags/reference/filter_by_tweet_type.html 44 | 45 | 46 | https://docs.ropensci.org/tidytags/reference/geocode_tags.html 47 | 48 | 49 | https://docs.ropensci.org/tidytags/reference/get_char_tweet_ids.html 50 | 51 | 52 | https://docs.ropensci.org/tidytags/reference/get_upstream_tweets.html 53 | 54 | 55 | https://docs.ropensci.org/tidytags/reference/get_url_domain.html 56 | 57 | 58 | https://docs.ropensci.org/tidytags/reference/index.html 59 | 60 | 61 | https://docs.ropensci.org/tidytags/reference/lookup_many_tweets.html 62 | 63 | 64 | https://docs.ropensci.org/tidytags/reference/lookup_many_users.html 65 | 66 | 67 | https://docs.ropensci.org/tidytags/reference/process_tweets.html 68 | 69 | 70 | https://docs.ropensci.org/tidytags/reference/pull_tweet_data.html 71 | 72 | 73 | https://docs.ropensci.org/tidytags/reference/read_tags.html 74 | 75 | 76 | -------------------------------------------------------------------------------- /inst/CITATION: -------------------------------------------------------------------------------- 1 | citHeader("To cite tidytags in publications, please use:") 2 | 3 | citEntry( 4 | entry = "manual", 5 | title = "tidytags: Importing and analyzing Twitter data collected with Twitter Archiving Google Sheets", 6 | author = c(as.person("K. Bret Staudt Willet"), as.person("Joshua M. Rosenberg")), 7 | year = "2022", 8 | journal = "", 9 | volume = "", 10 | number = "", 11 | pages = "", 12 | doi = "", 13 | url = "", 14 | key = "tidytags-package", 15 | textVersion = paste( 16 | "Staudt Willet, K. B., & Rosenberg, J. M. 
(2022).", 17 | "tidytags: Importing and analyzing Twitter data collected with Twitter", 18 | "Archiving Google Sheets.", 19 | "https://github.com/ropensci/tidytags" 20 | ) 21 | ) 22 | -------------------------------------------------------------------------------- /man/add_users_data.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/add-users-data.R 3 | \name{add_users_data} 4 | \alias{add_users_data} 5 | \title{Retrieve user information for everyone in an edgelist} 6 | \usage{ 7 | add_users_data(edgelist) 8 | } 9 | \arguments{ 10 | \item{edgelist}{An edgelist of senders and receivers, such as that returned 11 | by the function \code{create_edgelist()}.} 12 | } 13 | \value{ 14 | A dataframe in the form of an edgelist (i.e., with senders and 15 | receivers) as well as numerous, appropriately named columns of details 16 | about the senders and receivers. 17 | } 18 | \description{ 19 | Updates an edgelist created with \code{create_edgelist()} by appending user 20 | data retrieved with \code{rtweet::lookup_users()}. The resulting dataframe 21 | adds many additional columns and appends "_sender" or "_receiver" to the 22 | column names. 23 | } 24 | \details{ 25 | This function requires authentication; please see 26 | \code{vignette("setup", package = "tidytags")} 27 | } 28 | \examples{ 29 | 30 | \donttest{ 31 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 32 | tags_content <- read_tags(example_url) 33 | 34 | if (rtweet::auth_has_default()) { 35 | tweets_data <- lookup_many_tweets(tags_content) 36 | add_users_data(create_edgelist(tweets_data)) 37 | } 38 | } 39 | 40 | } 41 | \seealso{ 42 | Read more about rtweet authentication setup at 43 | \code{vignette("auth", package = "rtweet")} 44 | } 45 | -------------------------------------------------------------------------------- /man/create_edgelist.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/analyze-network.R 3 | \name{create_edgelist} 4 | \alias{create_edgelist} 5 | \title{Create an edgelist where senders and receivers are defined by different types 6 | of Twitter interactions} 7 | \usage{ 8 | create_edgelist(df, type = "all") 9 | } 10 | \arguments{ 11 | \item{df}{A dataframe returned by \code{pull_tweet_data()}} 12 | 13 | \item{type}{The specific kind of statuses used to define the interactions 14 | around which the edgelist will be built. Choices include "reply", 15 | "retweet", or "quote". Defaults to "all".} 16 | } 17 | \value{ 18 | A dataframe edgelist defined by interactions through the type of 19 | statuses specified. The dataframe has three columns: \code{sender}, 20 | \code{receiver}, and \code{edge_type}. 21 | } 22 | \description{ 23 | Starting with a dataframe of Twitter data imported to R with 24 | \code{read_tags()} and additional metadata retrieved by 25 | \code{pull_tweet_data()}, \code{create_edgelist()} removes any statuses 26 | that are not of the requested type (e.g., replies, retweets, and quote 27 | tweets) by calling \code{filter_by_tweet_type()}. Finally, \code{create_edgelist()} 28 | pulls out senders and receivers of the specified type of statuses, and then 29 | adds a new column called \code{edge_type}. 
30 | } 31 | \examples{ 32 | 33 | \donttest{ 34 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 35 | tags_content <- read_tags(example_url) 36 | 37 | if (rtweet::auth_has_default()) { 38 | tweets_data <- lookup_many_tweets(tags_content) 39 | full_edgelist <- create_edgelist(tweets_data) 40 | full_edgelist 41 | 42 | reply_edgelist <- create_edgelist(tweets_data, type = "reply") 43 | retweet_edgelist <- create_edgelist(tweets_data, type = "retweet") 44 | quote_edgelist <- create_edgelist(tweets_data, type = "quote") 45 | } 46 | } 47 | 48 | } 49 | -------------------------------------------------------------------------------- /man/figures/logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/man/figures/logo.png -------------------------------------------------------------------------------- /man/figures/tidytags-workflow.jpg: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/man/figures/tidytags-workflow.jpg -------------------------------------------------------------------------------- /man/filter_by_tweet_type.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/analyze-network.R 3 | \name{filter_by_tweet_type} 4 | \alias{filter_by_tweet_type} 5 | \title{Filter a Twitter dataset to only include statuses of a particular type} 6 | \usage{ 7 | filter_by_tweet_type(df, type) 8 | } 9 | \arguments{ 10 | \item{df}{A dataframe returned by \code{pull_tweet_data()}} 11 | 12 | \item{type}{The specific kind of statuses that will be kept in the dataset 13 | after filtering the rest. Choices for \code{type} include "reply", 14 | "retweet", "quote", and "original".} 15 | } 16 | \value{ 17 | A dataframe of processed statuses and fewer rows than the input 18 | dataframe. Only the statuses of the specified type will remain. 19 | } 20 | \description{ 21 | Starting with a dataframe of Twitter data imported to R with 22 | \code{read_tags()} and additional metadata retrieved by 23 | \code{pull_tweet_data()}, \code{filter_by_tweet_type()} processes the 24 | statuses by calling \code{process_tweets()} and then removes any statuses 25 | that are not of the requested type (e.g., replies, retweets, and quote 26 | tweets). \code{filter_by_tweet_type()} is a useful function in itself, but it is 27 | also used in \code{create_edgelist()}.
28 | } 29 | \examples{ 30 | 31 | \donttest{ 32 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 33 | tags_content <- read_tags(example_url) 34 | 35 | if (rtweet::auth_has_default()) { 36 | tweets_data <- lookup_many_tweets(tags_content) 37 | only_replies <- filter_by_tweet_type(tweets_data, "reply") 38 | only_retweets <- filter_by_tweet_type(tweets_data, "retweet") 39 | only_quote_tweets <- filter_by_tweet_type(tweets_data, "quote") 40 | only_originals <- filter_by_tweet_type(tweets_data, "original") 41 | } 42 | } 43 | 44 | } 45 | -------------------------------------------------------------------------------- /man/fragments/ethics.Rmd: -------------------------------------------------------------------------------- 1 | **tidytags should be used in strict accordance with Twitter's [developer terms](https://developer.twitter.com/en/developer-terms/more-on-restricted-use-cases).** 2 | 3 | Although most Institutional Review Boards (IRBs) consider the Twitter data that **tidytags** analyzes to _not_ necessarily be human subjects research, there remain ethical considerations pertaining to the use of the **tidytags** package that should be discussed. 4 | 5 | Even if **tidytags** use is not for research purposes (or if an IRB determines that a study is not human subjects research), "the release of personally identifiable or sensitive data is potentially harmful," as noted in the [rOpenSci Packages guide](https://devguide.ropensci.org/policies.html#ethics-data-privacy-and-human-subjects-research). Therefore, although you _can_ collect Twitter data (and you _can_ use **tidytags** to analyze it), we urge care and thoughtfulness regarding how you analyze the data and communicate the results. In short, please remember that most (if not all) of the data you collect may be about people---and [those people may not like the idea of their data being analyzed or included in research](https://doi.org/10.1177/2056305118763366). 6 | 7 | We recommend [the Association of Internet Researchers' (AoIR) resources related to conducting analyses in ethical ways](https://aoir.org/ethics/) when working with data about people. AoIR's [ethical guidelines](https://aoir.org/reports/ethics3.pdf) may be especially helpful for navigating tensions related to collecting, analyzing, and sharing social media data. 8 | -------------------------------------------------------------------------------- /man/fragments/getting-help.Rmd: -------------------------------------------------------------------------------- 1 | **tidytags** is still a work in progress, so we fully expect that there are still some bugs to work out and functions to document better. If you find an issue, have a question, or think of something that you really wish **tidytags** would do for you, don't hesitate to [email Bret](mailto:bret@bretsw.com) or reach out on Twitter: [\@bretsw](https://twitter.com/bretsw) and [\@jrosenberg6432](https://twitter.com/jrosenberg6432). 2 | 3 | You can also [submit an issue on GitHub](https://github.com/ropensci/tidytags/issues/). 4 | 5 | You may also wish to try some general troubleshooting strategies: 6 | 7 | - Try to find out what the specific problem is 8 | - Identify what is *not* causing the problem 9 | - "Unplug and plug it back in" - restart R, close and reopen R 10 | - Reach out to others! Sharing what is causing an issue can often help to clarify the problem. 11 | - RStudio Community - https://community.rstudio.com/ (highly recommended!) 
12 | - Twitter hashtag: #rstats 13 | - General strategies on learning more: https://datascienceineducation.com/c17.html 14 | 15 | -------------------------------------------------------------------------------- /man/get_char_tweet_ids.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/setup-functions.R 3 | \name{get_char_tweet_ids} 4 | \alias{get_char_tweet_ids} 5 | \title{Get Twitter status ID numbers as character strings} 6 | \usage{ 7 | get_char_tweet_ids(x) 8 | } 9 | \arguments{ 10 | \item{x}{A dataframe containing the column name 'status_url' 11 | (i.e., the hyperlink to specific statuses), such as that returned by 12 | \code{read_tags()}, or a vector of status URLs, such as those contained 13 | in the 'status_url' column of a dataframe returned by 14 | \code{tidytags::read_tags()}} 15 | } 16 | \value{ 17 | A vector of Twitter status IDs as character strings 18 | } 19 | \description{ 20 | This function is useful because Google Sheets (and hence TAGS) 21 | typically rounds very large numbers into an exponential form. Thus, 22 | because status ID numbers are very large, they often get corrupted in this 23 | rounding process. The most reliable way to get full status ID numbers is by 24 | using this function, \code{get_char_tweet_ids()}, to pull the ID numbers 25 | from the URL linking to specific statuses. 26 | } 27 | \examples{ 28 | 29 | \donttest{ 30 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 31 | tags_content <- read_tags(example_url) 32 | get_char_tweet_ids(tags_content[1:10, ]) 33 | get_char_tweet_ids(tags_content$status_url[1:10]) 34 | get_char_tweet_ids( 35 | "https://twitter.com/tweet__example/status/1176592704647716864") 36 | } 37 | 38 | } 39 | -------------------------------------------------------------------------------- /man/get_upstream_tweets.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/get-upstream-tweets.R 3 | \name{get_upstream_tweets} 4 | \alias{get_upstream_tweets} 5 | \title{Collect upstream statuses and add to dataset} 6 | \usage{ 7 | get_upstream_tweets(df) 8 | } 9 | \arguments{ 10 | \item{df}{A dataframe of statuses and full metadata from the Twitter API as 11 | returned by \code{pull_tweet_data()}} 12 | } 13 | \value{ 14 | A new, expanded dataframe which includes any retrievable upstream 15 | replies 16 | } 17 | \description{ 18 | Because the Twitter API offers an \code{in_reply_to_status_id_str} column, it is 19 | possible to iteratively reconstruct reply threads in an \emph{upstream} 20 | direction, that is, retrieving statuses composed earlier than replies in 21 | the dataset. The \code{get_upstream_tweets()} function collects upstream 22 | replies not previously found in the dataset. Keep in mind that there is no 23 | way to predict how far upstream you can trace back a reply thread, so 24 | running \code{get_upstream_tweets()} could take a while and potentially hit 25 | the Twitter API rate limit of 90,000 statuses in a 15-minute period.
26 | } 27 | \details{ 28 | This function requires authentication; please see 29 | \code{vignette("setup", package = "tidytags")} 30 | } 31 | \examples{ 32 | 33 | \donttest{ 34 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 35 | tags_content <- read_tags(example_url) 36 | 37 | if (rtweet::auth_has_default()) { 38 | tweets_data <- lookup_many_tweets(tags_content) 39 | more_replies_df <- get_upstream_tweets(tweets_data) 40 | more_replies_df 41 | } 42 | } 43 | 44 | } 45 | \seealso{ 46 | Read more about rtweet authentication setup at 47 | \code{vignette("auth", package = "rtweet")} 48 | } 49 | -------------------------------------------------------------------------------- /man/get_url_domain.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/analyze-url.R 3 | \name{get_url_domain} 4 | \alias{get_url_domain} 5 | \title{Find the domain name of URLs, even shortened URLs} 6 | \usage{ 7 | get_url_domain(x, wait = 10) 8 | } 9 | \arguments{ 10 | \item{x}{A list or vector of hyperlinks, whether shortened or expanded} 11 | 12 | \item{wait}{How long (in seconds) to wait on the 13 | \code{longurl::expand_urls()} function to retrieve the full, expanded URL 14 | from a shortened URL (e.g., a bit.ly). The \code{longurl} default is 2 15 | seconds, but we have found that this misses a number of valid URLs. Here, 16 | we have made the default \code{wait = 10} seconds, but the user can adjust 17 | this as they like.} 18 | } 19 | \value{ 20 | A list or vector of Web domain names 21 | } 22 | \description{ 23 | \code{get_url_domain()} retrieves the Web domain name from a URL, including 24 | URLs shortened with services such as bit.ly and t.co 25 | } 26 | \examples{ 27 | 28 | get_url_domain("https://www.tidyverse.org/packages/") 29 | get_url_domain("https://dplyr.tidyverse.org/") 30 | get_url_domain("http://bit.ly/2SfWO3K") 31 | 32 | } 33 | \seealso{ 34 | Read the documentation for \code{longurl::expand_urls()} and 35 | \code{urltools::domain()}. 
36 | } 37 | -------------------------------------------------------------------------------- /man/lookup_many_tweets.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/setup-functions.R 3 | \name{lookup_many_tweets} 4 | \alias{lookup_many_tweets} 5 | \title{Retrieve the fullest extent of metadata for more than 90,000 statuses} 6 | \usage{ 7 | lookup_many_tweets(x, alarm = FALSE) 8 | } 9 | \arguments{ 10 | \item{x}{A list or vector of status ID numbers} 11 | 12 | \item{alarm}{An audible notification that a batch of 90,000 statuses has been 13 | completed} 14 | } 15 | \value{ 16 | A dataframe of statuses and full metadata from the Twitter API 17 | } 18 | \description{ 19 | This function calls \code{pull_tweet_data()}, but has a built-in delay 20 | of 15 minutes to allow the Twitter API to reset after looking up 90,000 21 | statuses 22 | } 23 | \details{ 24 | This function requires authentication; please see 25 | \code{vignette("setup", package = "tidytags")} 26 | } 27 | \examples{ 28 | 29 | \donttest{ 30 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 31 | tags_content <- read_tags(example_url) 32 | 33 | if (rtweet::auth_has_default()) { 34 | tweets_data <- lookup_many_tweets(tags_content$id_str) 35 | one_tweet_data <- lookup_many_tweets("1176592704647716864") 36 | one_tweet_data <- lookup_many_tweets("1176592704647716864", alarm = TRUE) 37 | one_tweet_data 38 | } 39 | } 40 | 41 | } 42 | \seealso{ 43 | Read more about rtweet authentication setup at 44 | \code{vignette("auth", package = "rtweet")} 45 | } 46 | -------------------------------------------------------------------------------- /man/process_tweets.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/setup-functions.R 3 | \name{process_tweets} 4 | \alias{process_tweets} 5 | \title{Calculate additional information using status metadata} 6 | \usage{ 7 | process_tweets(df) 8 | } 9 | \arguments{ 10 | \item{df}{A dataframe of statuses and full metadata from the Twitter API as 11 | returned by \code{pull_tweet_data()}} 12 | } 13 | \value{ 14 | A dataframe with several additional columns: mentions_count, 15 | hashtags_count, urls_count, tweet_type, is_self_reply 16 | } 17 | \description{ 18 | Calculate additional information using status metadata 19 | } 20 | \examples{ 21 | 22 | \donttest{ 23 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 24 | tags_content <- read_tags(example_url) 25 | 26 | if (rtweet::auth_has_default()) { 27 | tweets_data <- lookup_many_tweets(tags_content) 28 | tweets_processed <- process_tweets(tweets_data) 29 | tweets_processed 30 | } 31 | } 32 | 33 | } 34 | -------------------------------------------------------------------------------- /man/pull_tweet_data.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/setup-functions.R 3 | \name{pull_tweet_data} 4 | \alias{pull_tweet_data} 5 | \title{Retrieve the fullest extent of status metadata available from the Twitter API} 6 | \usage{ 7 | pull_tweet_data(df = NULL, url_vector = NULL, id_vector = NULL, n = NULL) 8 | } 9 | \arguments{ 10 | \item{df}{A dataframe of containing the column name 'status_url' 11 | (i.e., the hyperlink to specific statuses), such as that returned by 12 | 
\code{read_tags()}} 13 | 14 | \item{url_vector}{A vector of status URLs, such as as those contained in 15 | the 'status_url' column of a dataframe returned by 16 | \code{tidytags::read_tags()}} 17 | 18 | \item{id_vector}{A vector of statuses (i.e., ID numbers, such as 19 | those contained in the 'id_str' column of a dataframe returned by 20 | \code{tidytags::read_tags()}} 21 | 22 | \item{n}{The number of statuses to look up, by default the total number 23 | of tweet ID numbers available, but capped at 90,000 due to Twitter API 24 | limitations.} 25 | } 26 | \value{ 27 | A dataframe of statuses and full metadata from the Twitter API 28 | } 29 | \description{ 30 | With a TAGS archive imported into R, \code{pull_tweet_data()} uses the 31 | \strong{rtweet} package to query the Twitter API. Using rtweet requires Twitter 32 | API keys associated with an approved developer account. Fortunately, the 33 | rtweet vignette, 34 | \href{https://docs.ropensci.org/rtweet/articles/auth.html}{Authentication}, 35 | provides a thorough guide to obtaining Twitter API keys and authenticating 36 | access to the Twitter API. Following the directions for "Apps," you will 37 | run the \code{rtweet::rtweet_app()} function. 38 | } 39 | \details{ 40 | This function requires authentication; please see 41 | \code{vignette("setup", package = "tidytags")} 42 | } 43 | \examples{ 44 | 45 | \donttest{ 46 | ## Import data from a TAGS tracker: 47 | example_tags_tracker <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 48 | tags_content <- read_tags(example_tags_tracker) 49 | 50 | if (rtweet::auth_has_default()) { 51 | ## Use any of three input parameters (TAGS dataframe, `status_url` 52 | ## column, or `id_str` column) 53 | tweets_data_from_df <- pull_tweet_data(tags_content) 54 | tweets_data_from_url <- 55 | pull_tweet_data(url_vector = tags_content$status_url) 56 | tweets_data_from_ids <- pull_tweet_data(id_vector = tags_content$id_str) 57 | 58 | ## Specifying the parameter `n` clarifies how many statuses to look up, 59 | ## but the returned values may be less than `n` because some statuses 60 | ## may have been deleted or made protected since the TAGS tracker 61 | ## originally recorded them. 
62 | tweets_data_10 <- pull_tweet_data(tags_content, n = 10) 63 | 64 | ## Note that the following two examples will return the same thing: 65 | one_tweet_data <- 66 | pull_tweet_data(url_vector = 67 | "https://twitter.com/tweet__example/status/1176592704647716864") 68 | one_tweet_data <- pull_tweet_data(id_vector = "1176592704647716864") 69 | one_tweet_data 70 | } 71 | } 72 | 73 | } 74 | \seealso{ 75 | Read more about rtweet authentication setup at 76 | \code{vignette("auth", package = "rtweet")} 77 | } 78 | -------------------------------------------------------------------------------- /man/read_tags.Rd: -------------------------------------------------------------------------------- 1 | % Generated by roxygen2: do not edit by hand 2 | % Please edit documentation in R/setup-functions.R 3 | \name{read_tags} 4 | \alias{read_tags} 5 | \title{Retrieve a TAGS archive of Twitter statuses and bring it into R} 6 | \usage{ 7 | read_tags(tags_id) 8 | } 9 | \arguments{ 10 | \item{tags_id}{A Google Sheet identifier (i.e., the alphanumeric string 11 | following "https://docs.google.com/spreadsheets/d/" in the TAGS tracker's 12 | URL)} 13 | } 14 | \value{ 15 | A tibble of the TAGS archive of Twitter statuses 16 | } 17 | \description{ 18 | Keep in mind that \code{read_tags()} uses the \strong{googlesheets4} package, 19 | and one requirement is that your TAGS tracker has been "published to the 20 | web." To do this, with the TAGS page open in a web browser, navigate to 21 | \verb{File >> Share >> Publish to the web}. The \code{Link} field should be 22 | 'Entire document' and the \code{Embed} field should be 'Web page.' If 23 | everything looks right, then click the \code{Publish} button. Next, click 24 | the \code{Share} button in the top right corner of the Google Sheets 25 | browser window, select \verb{Get shareable link}, and set the permissions 26 | to 'Anyone with the link can view.' 27 | } 28 | \examples{ 29 | 30 | \donttest{ 31 | example_tags <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 32 | read_tags(example_tags) 33 | } 34 | 35 | } 36 | \seealso{ 37 | Read more about \code{library(googlesheets4)} 38 | \href{https://github.com/tidyverse/googlesheets4}{here}.
39 | } 40 | -------------------------------------------------------------------------------- /paper.bib: -------------------------------------------------------------------------------- 1 | @article{arslan_et_al2021, 2 | title={Understanding topic duration in {T}witter learning communities using data mining}, 3 | author={Arslan, Okan and Xing, Wanli and Inan, Fethi A and Du, Hanxiang}, 4 | journal={Journal of Computer Assisted Learning}, 5 | volume={}, 6 | number={}, 7 | pages={1--13}, 8 | year={2021}, 9 | publisher={John Wiley \& Sons}, 10 | doi={10.1111/jcal.12633}, 11 | url={https://doi.org/10.1111/jcal.12633} 12 | } 13 | 14 | @article{csardi_nepusz2006, 15 | title={The igraph software package for complex network research}, 16 | author={ Csardi, Gabor and Nepusz, Tamas}, 17 | journal={InterJournal}, 18 | volume={Complex Systems}, 19 | pages={1695}, 20 | year={2006}, 21 | url={https://igraph.org} 22 | } 23 | 24 | @article{greenhalgh_et_al2018, 25 | title={Tweet, and we shall find: Using digital methods to locate participants in educational hashtags}, 26 | author={Greenhalgh, Spencer P and Staudt Willet, K Bret and Rosenberg, Joshua M and Koehler, Matthew J}, 27 | journal={TechTrends}, 28 | volume={62}, 29 | number={5}, 30 | pages={501--508}, 31 | year={2018}, 32 | publisher={Springer}, 33 | doi={10.1007/s11528-018-0313-6}, 34 | url={https://doi.org/10.1007/s11528-018-0313-6} 35 | } 36 | 37 | @article{greenhalgh_et_al2020, 38 | title={Identifying multiple learning spaces within a single teacher-focused {T}witter hashtag}, 39 | author={Greenhalgh, Spencer P and Rosenberg, Joshua M and Staudt Willet, K Bret and Koehler, Matthew J and Akcaoglu, Mete}, 40 | journal={Computers \& Education}, 41 | volume={148}, 42 | pages={103809}, 43 | year={2020}, 44 | publisher={Elsevier}, 45 | doi={10.1016/j.compedu.2020.103809}, 46 | url={https://doi.org/10.1016/j.compedu.2020.103809} 47 | } 48 | 49 | @manual{hawksey2016, 50 | title={TAGS: Twitter archiving {G}oogle sheet}, 51 | author={Hawksey, Martin}, 52 | year={2016}, 53 | note = {Version 6.1}, 54 | url={https://tags.hawksey.info/} 55 | } 56 | 57 | @article{kearney2019, 58 | title={{r}tweet: Collecting and analyzing {T}witter data}, 59 | author={Kearney, Michael W.}, 60 | journal={Journal of Open Source Software}, 61 | volume={4}, 62 | number={42}, 63 | pages={1829}, 64 | year={2019}, 65 | doi={10.21105/joss.01829}, 66 | url={https://joss.theoj.org/papers/10.21105/joss.01829} 67 | } 68 | 69 | @manual{pedersen2020, 70 | title={{t}idygraph: A tidy API for graph manipulation}, 71 | author={Pedersen, Thomas Lin}, 72 | year={2020}, 73 | url={https://CRAN.R-project.org/package=tidygraph} 74 | } 75 | 76 | @manual{pedersen2021, 77 | title={{g}graph: An implementation of grammar of graphics for graphs and networks}, 78 | author={Pedersen, Thomas Lin}, 79 | year={2021}, 80 | organization={RStudio}, 81 | url={https://CRAN.R-project.org/package=ggraph} 82 | } 83 | 84 | @article{staudtwillet2019, 85 | title={Revisiting how and why educators use {T}witter: Tweet types and purposes in #{E}dchat}, 86 | author={Staudt Willet, K Bret}, 87 | journal={Journal of Research on Technology in Education}, 88 | volume={51}, 89 | number={3}, 90 | pages={273--289}, 91 | year={2019}, 92 | publisher={Taylor \& Francis}, 93 | doi={10.1080/15391523.2019.1611507}, 94 | url={https://doi.org/10.1080/15391523.2019.1611507} 95 | } 96 | 97 | @manual{tornes_trujillo2021, 98 | title={Enabling the future of academic research with the {T}witter API}, 99 | author={Tornes, Adam and Trujillo, 
Leanne}, 100 | organization={Twitter Developer Platform Blog}, 101 | year={2021}, 102 | month={Jan}, 103 | url={https://blog.twitter.com/developer/en_us/topics/tools/2021/enabling-the-future-of-academic-research-with-the-twitter-api}, 104 | note={Accessed December 15, 2021} 105 | } 106 | 107 | @article{veletsianos_et_al2019, 108 | title={Academics' social media use over time is associated with individual, relational, cultural and political factors}, 109 | author={Veletsianos, George and Johnson, Nicole and Belikov, Olga}, 110 | journal={British Journal of Educational Technology}, 111 | volume={50}, 112 | number={4}, 113 | pages={1713--1728}, 114 | year={2019}, 115 | publisher={Wiley Online Library}, 116 | doi={10.1111/bjet.12788}, 117 | url={https://doi.org/10.1111/bjet.12788} 118 | } 119 | 120 | @article{xing_gao2018, 121 | title={Exploring the relationship between online discourse and commitment in {T}witter professional learning communities}, 122 | author={Xing, Wanli and Gao, Fei}, 123 | journal={Computers \& Education}, 124 | volume={126}, 125 | pages={388--398}, 126 | year={2018}, 127 | publisher={Elsevier}, 128 | doi={10.1016/j.compedu.2018.08.010}, 129 | url={https://doi.org/10.1016/j.compedu.2018.08.010} 130 | } 131 | -------------------------------------------------------------------------------- /paper.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: 'tidytags: Importing and Analyzing Twitter Data Collected with Twitter Archiving Google Sheets' 3 | authors: 4 | - name: K. Bret Staudt Willet 5 | orcid: 0000-0002-6984-416X 6 | affiliation: '1' 7 | - name: Joshua M. Rosenberg 8 | orcid: 0000-0003-2170-0447 9 | affiliation: '2' 10 | affiliations: 11 | - name: Florida State University 12 | index: '1' 13 | - name: University of Tennessee, Knoxville 14 | index: '2' 15 | date: 02 February 2022 16 | bibliography: paper.bib 17 | tags: 18 | - R 19 | - Twitter 20 | - social media 21 | - data science 22 | - data mining 23 | - wrapper 24 | --- 25 | 26 | # Summary 27 | 28 | The **tidytags** R package coordinates the simplicity of collecting tweets over time with a **Twitter Archiving Google Sheet** [TAGS](https://tags.hawksey.info/) tweet collector [@hawksey2016] and the utility of the **rtweet** R package [@kearney2019] for processing and preparing additional Twitter metadata. **tidytags** also introduces functions to facilitate systematic yet flexible analyses of data from Twitter. 29 | 30 | # Statement of Need 31 | 32 | An essential component of understanding behavior across the social sciences is to study the actions and artifacts of group members over time. Social media platforms such as Twitter are a context for inquiry and analysis of a variety of topics that have a temporal component. For instance, online communities often struggle with attrition and lack of commitment, so it would be beneficial to understand why some users continue to sustain participation while others gradually drop out [@arslan_et_al2021; @xing_gao2018]. Also, because scholars' social media use is interwoven with changes in their personal lives and societal transitions, their social media practices must be studied over time [@veletsianos_et_al2019]. 33 | 34 | Twitter data are best collected in the moment that they are being produced. Full access to Twitter data is limited by the platform’s API, particularly in terms of retrieving data from more than a week or two prior to the time of collection. 
For instance, a researcher using the Twitter API in the summer of 2020 to search for tweets about the 2019 conference of the Association for Educational Communication & Technology ([AECT](https://aect.org/)), using the hashtags #AECT19 or #AECTinspired, would not be able to readily access tweets from the time of the convention (which occurred in the fall of 2019). 35 | 36 | Accessing historical content from Twitter can be difficult and expensive; it is not impossible, but there are real obstacles. Academic researchers got a boost in January 2021, when Twitter launched an [Academic Research product track](https://developer.twitter.com/en/products/twitter-api/academic-research) for the updated Twitter API [@tornes_trujillo2021]. This track provides nearly unlimited (there is a cap of 10 million queries that resets every month) access to the Twitter API for researchers who confirm their academic credentials and the scholarly purpose of their project. For everyone else, there are third-party companies that collect historical Twitter data and make these data available to researchers for a price, an approach that can quickly become expensive. There are also technical solutions for collecting past tweets through web scraping, but these require advanced technical skills and risk violating [Twitter's Terms of Service](https://twitter.com/en/tos). Meanwhile, the obstacles to time travel are familiar enough that they need not be repeated here. 37 | 38 | Even when not navigating the challenges of retrieving historical Twitter data, the task of collecting in-the-moment social media data often requires a level of technical skill that may dissuade social scientists from even getting started. However, for those interested in Twitter data, a relatively straightforward and beginner-friendly solution is to use a **Twitter Archiving Google Sheet** [TAGS](https://tags.hawksey.info/) tweet collector [@hawksey2016]. Getting started with TAGS is as simple as setting up a new Google Sheet, which will then automatically query the Twitter API with a keyword search every hour going forward. However, although TAGS provides several advantages for **data collection**, it has important limitations related to **data analysis**. First, Google Sheets are not an environment conducive to statistical analysis beyond a few basic calculations. Additionally, TAGS returns limited metadata compared to what is available from the Twitter API: approximately 20% of all categories of information. Specifically, a TAGS tweet collector returns the time, sender, and text of tweets, but not many additional details such as a list of the hashtags or hyperlinks contained in a tweet. 39 | 40 | We introduce the **tidytags** package as an approach that allows for both simple data collection through TAGS and rigorous data analysis in the R statistical computing environment. In short, **tidytags** first uses TAGS to easily and automatically collect tweet ID numbers and then provides a wrapper to the **rtweet** R package [@kearney2019] to re-query the Twitter API to collect additional metadata. **tidytags** then offers several functions to clean the data and perform additional calculations, including social network analysis. 41 | 42 | # Getting started with **tidytags** 43 | 44 | For help with initial **tidytags** setup, see the [Getting started with tidytags](https://docs.ropensci.org/tidytags/articles/setup.html) guide on the **tidytags** website. Specifically, this guide offers help for two key tasks: 45 | 46 | 1.
Making sure your TAGS tweet collector can be accessed 47 | 2. Getting and storing Twitter API tokens 48 | 49 | For a walkthrough of numerous additional **tidytags** functions, see the [Using tidytags with a conference hashtag](https://docs.ropensci.org/tidytags/articles/tidytags-with-conf-hashtags.html) guide. 50 | 51 | # The **tidytags** Workflow 52 | 53 | **tidytags** formalizes a workflow for Twitter research. This workflow is simple enough for beginning programmers to get started but powerful enough to serve as the analytic foundation of research that has been featured in academic journals such as *Computers & Education* [@greenhalgh_et_al2020], *Journal of Research on Technology in Education* [@staudtwillet2019], and *TechTrends* [@greenhalgh_et_al2018]. 54 | 55 | The **tidytags** workflow for exploring Twitter data over time using R includes: 56 | 57 | 1. Set up a **Twitter Archiving Google Sheet** [TAGS](https://tags.hawksey.info/) tweet collector [@hawksey2016]. 58 | 59 | 2. View tweets collected by TAGS using the function `read_tags()` and either the TAGS tweet collector URL or the Google Sheet identifier (i.e., the alphanumeric string following "https://docs.google.com/spreadsheets/d/" in the TAGS tweet collector's URL). 60 | 61 | ```{r} 62 | aect_tweets_tags <- read_tags("18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8") 63 | ``` 64 | 65 | 3. Pull additional tweet metadata using the function `pull_tweet_data()`. 66 | 67 | ```{r} 68 | aect_tweets_full <- pull_tweet_data(aect_tweets_tags) 69 | ``` 70 | 71 | 4. Calculate additional tweet attributes using the function `process_tweets()`. 72 | 73 | ```{r} 74 | aect_tweets_processed <- process_tweets(aect_tweets_full) 75 | ``` 76 | 77 | 5. Analyze hyperlinks and web domains in tweets using the function `get_url_domain()`. 78 | 79 | ```{r} 80 | aect_urls <- dplyr::filter(aect_tweets_processed, urls_count > 0) 81 | urls_list <- list() 82 | for (i in seq_len(nrow(aect_urls))) { 83 | urls_list[[i]] <- aect_urls$entities[[i]]$urls$expanded_url 84 | } 85 | urls_vector <- unlist(urls_list) 86 | aect_domains <- get_url_domain(urls_vector) 87 | domain_table <- tibble::as_tibble(table(aect_domains)) 88 | domain_table_sorted <- dplyr::arrange(domain_table, dplyr::desc(n)) 89 | head(domain_table_sorted, 20) 90 | ``` 91 | 92 | 6. Analyze the social network of tweeters using the function `create_edgelist()`. 93 | 94 | ```{r} 95 | aect_edgelist <- create_edgelist(aect_tweets_processed) 96 | ``` 97 | 98 | 7. Append additional tweeter information to the edgelist using the function `add_users_data()`. 99 | 100 | ```{r} 101 | aect_senders_receivers_data <- add_users_data(aect_edgelist) 102 | ``` 103 | 104 | From here, the data are shaped for straightforward use of the **igraph** R package [@csardi_nepusz2006] or the **tidygraph** R package [@pedersen2020] for social network analysis and the **ggraph** R package [@pedersen2021] for network visualization; a brief sketch of this handoff appears at the end of the Conclusion below. 105 | 106 | # Conclusion 107 | 108 | **tidytags** is intended to lower barriers to powerful analyses of Twitter data. By combining the easy-to-use **Twitter Archiving Google Sheet** [TAGS](https://tags.hawksey.info/) [@hawksey2016] for collecting a large volume of longitudinal Twitter data, the analysis capabilities of the **rtweet** R package [@kearney2019], and new functions that facilitate and extend their combined use, **tidytags** has the potential to assist in the collection and analysis of tweets for a wide range of social-science research.
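As a closing illustration of that network handoff, here is a minimal sketch, assuming the `aect_edgelist` object created in step 6 above; the `"fr"` (Fruchterman-Reingold) layout is one illustrative choice among many, not a package default:

```{r}
library(tidygraph)
library(ggraph)

# tidygraph treats the first two columns of the edgelist (sender, receiver)
# as edge endpoints; any remaining columns (e.g., tweet_type) become edge
# attributes
aect_graph <- as_tbl_graph(aect_edgelist)

# A basic sociogram with a force-directed layout
ggraph(aect_graph, layout = "fr") +
  geom_edge_link() +
  geom_node_point()
```

From this starting point, node and edge aesthetics (for example, mapping `tweet_type` to edge color) can be layered on with the usual ggraph geoms.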
109 | 110 | # References 111 | 112 | -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-120x120.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/pkgdown/favicon/apple-touch-icon-120x120.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-152x152.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/pkgdown/favicon/apple-touch-icon-152x152.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-180x180.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/pkgdown/favicon/apple-touch-icon-180x180.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-60x60.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/pkgdown/favicon/apple-touch-icon-60x60.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon-76x76.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/pkgdown/favicon/apple-touch-icon-76x76.png -------------------------------------------------------------------------------- /pkgdown/favicon/apple-touch-icon.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/pkgdown/favicon/apple-touch-icon.png -------------------------------------------------------------------------------- /pkgdown/favicon/favicon-16x16.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/pkgdown/favicon/favicon-16x16.png -------------------------------------------------------------------------------- /pkgdown/favicon/favicon-32x32.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/pkgdown/favicon/favicon-32x32.png -------------------------------------------------------------------------------- /pkgdown/favicon/favicon.ico: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/pkgdown/favicon/favicon.ico -------------------------------------------------------------------------------- /tests/fixtures/url_domains.yml: -------------------------------------------------------------------------------- 1 | http_interactions: 2 | - request: 3 | method: head 4 | uri: https://www.tidyverse.org/ 5 | body: 6 | encoding: '' 7 | string: '' 8 | headers: 9 | Accept: application/json, text/xml, application/xml, */* 10 | response: 11 | 
status: 12 | status_code: 200 13 | category: Success 14 | reason: OK 15 | message: 'Success: (200) OK' 16 | headers: 17 | age: '32082' 18 | cache-control: public, max-age=0, must-revalidate 19 | content-encoding: gzip 20 | content-type: text/html; charset=UTF-8 21 | date: Fri, 18 Nov 2022 07:13:10 GMT 22 | etag: '"5d5e024adab9e7cb8f4d3b65449e25a8-ssl-df"' 23 | server: Netlify 24 | strict-transport-security: max-age=31536000 25 | vary: Accept-Encoding 26 | x-nf-request-id: 01GJ5QY5CNQQXJ9JDEC6J88B2W 27 | content-length: '3888' 28 | body: 29 | encoding: '' 30 | file: no 31 | string: '' 32 | recorded_at: 2022-11-18 16:07:53 GMT 33 | recorded_with: vcr/1.2.0, webmockr/0.8.2 34 | - request: 35 | method: head 36 | uri: https://www.tidyverse.org/packages/ 37 | body: 38 | encoding: '' 39 | string: '' 40 | headers: 41 | Accept: application/json, text/xml, application/xml, */* 42 | response: 43 | status: 44 | status_code: 200 45 | category: Success 46 | reason: OK 47 | message: 'Success: (200) OK' 48 | headers: 49 | age: '9005' 50 | cache-control: public, max-age=0, must-revalidate 51 | content-encoding: gzip 52 | content-type: text/html; charset=UTF-8 53 | date: Fri, 18 Nov 2022 13:37:47 GMT 54 | etag: '"276cfee2709432c21354b0ee05cc3ebf-ssl-df"' 55 | server: Netlify 56 | strict-transport-security: max-age=31536000 57 | vary: Accept-Encoding 58 | x-nf-request-id: 01GJ5QY5ESA5135EEPV8RM4EBM 59 | content-length: '6941' 60 | body: 61 | encoding: '' 62 | file: no 63 | string: '' 64 | recorded_at: 2022-11-18 16:07:53 GMT 65 | recorded_with: vcr/1.2.0, webmockr/0.8.2 66 | - request: 67 | method: head 68 | uri: https://dplyr.tidyverse.org/ 69 | body: 70 | encoding: '' 71 | string: '' 72 | headers: 73 | Accept: application/json, text/xml, application/xml, */* 74 | response: 75 | status: 76 | status_code: 200 77 | category: Success 78 | reason: OK 79 | message: 'Success: (200) OK' 80 | headers: 81 | server: GitHub.com 82 | content-type: text/html; charset=utf-8 83 | last-modified: Fri, 18 Nov 2022 15:57:18 GMT 84 | access-control-allow-origin: '*' 85 | etag: W/"6377ab5e-6555" 86 | expires: Fri, 18 Nov 2022 16:17:52 GMT 87 | cache-control: max-age=600 88 | content-encoding: gzip 89 | x-proxy-cache: MISS 90 | x-github-request-id: 14CC:015B:E62548:13F971E:6377ADD8 91 | accept-ranges: bytes 92 | date: Fri, 18 Nov 2022 16:07:52 GMT 93 | via: 1.1 varnish 94 | age: '0' 95 | x-served-by: cache-mia11328-MIA 96 | x-cache: MISS 97 | x-cache-hits: '0' 98 | x-timer: S1668787673.731997,VS0,VE32 99 | vary: Accept-Encoding 100 | x-fastly-request-id: af0951ff7fc1bebdc66bc7138ba45848dad01297 101 | content-length: '6262' 102 | body: 103 | encoding: '' 104 | file: no 105 | string: '' 106 | recorded_at: 2022-11-18 16:07:53 GMT 107 | recorded_with: vcr/1.2.0, webmockr/0.8.2 108 | - request: 109 | method: head 110 | uri: https://www.npr.org/sections/technology/ 111 | body: 112 | encoding: '' 113 | string: '' 114 | headers: 115 | Accept: application/json, text/xml, application/xml, */* 116 | response: 117 | status: 118 | status_code: 200 119 | category: Success 120 | reason: OK 121 | message: 'Success: (200) OK' 122 | headers: 123 | content-type: text/html; charset=UTF-8 124 | x-cache-npr: HIT 125 | access-control-allow-origin: '*' 126 | access-control-allow-credentials: 'true' 127 | x-npr-trace-id: renX7ensdiY 128 | x-content-type-options: nosniff 129 | x-xss-protection: 1; mode=block 130 | x-served-by: pod-www-render-nginx-6cdbf7dd46-snpwr 131 | referrer-policy: no-referrer-when-downgrade 132 | strict-transport-security: 
max-age=15724800; includeSubDomains 133 | vary: Accept-Encoding 134 | content-encoding: gzip 135 | cache-control: no-cache 136 | expires: Fri, 18 Nov 2022 16:07:53 GMT 137 | date: Fri, 18 Nov 2022 16:07:53 GMT 138 | content-length: '20' 139 | body: 140 | encoding: '' 141 | file: no 142 | string: '' 143 | recorded_at: 2022-11-18 16:07:53 GMT 144 | recorded_with: vcr/1.2.0, webmockr/0.8.2 145 | -------------------------------------------------------------------------------- /tests/testthat.R: -------------------------------------------------------------------------------- 1 | library(testthat) 2 | library(tidytags) 3 | 4 | test_check("tidytags") 5 | -------------------------------------------------------------------------------- /tests/testthat/helper-vcr.R: -------------------------------------------------------------------------------- 1 | library(vcr) 2 | 3 | vcr_dir <- vcr::vcr_test_path("fixtures") 4 | 5 | if (!nzchar(Sys.getenv("TWITTER_BEARER_TOKEN"))) { 6 | if (dir.exists(vcr_dir)) { 7 | # Fake API token to fool our package 8 | Sys.setenv("TWITTER_BEARER_TOKEN" = "foobar") 9 | } else { 10 | # If there's no mock files nor API token, impossible to run tests 11 | stop("No API key nor cassettes, tests cannot be run.", 12 | call. = FALSE) 13 | } 14 | } 15 | 16 | invisible(vcr::vcr_configure( 17 | dir = vcr_dir, 18 | # Filter the request header where the token is sent, make sure you know 19 | # how authentication works in your case and read the Security chapter :-) 20 | filter_request_headers = list(Authorization = "My bearer token is safe") 21 | )) 22 | -------------------------------------------------------------------------------- /tests/testthat/sample-data.csv: -------------------------------------------------------------------------------- 1 | id_str,from_user,text,created_at,time,geo_coordinates,user_lang,in_reply_to_user_id_str,in_reply_to_screen_name,from_user_id_str,in_reply_to_status_id_str,source,profile_image_url,user_followers_count,user_friends_count,user_location,status_url,entities_str 2 | 1251954312772812801,Harriet96152202,"RT @RoutledgeEd: Congrats to authors Joseph Rene Corbeil, Maria Elena Corbeil, and (not pictured) Badrul Khan, who received the Outstanding…",Sun Apr 19 19:22:23 +0000 2020,2020-04-19T20:22:23Z,NA,NA,NA,NA,1251951804398669825,NA,"Twitter Web App",http://pbs.twimg.com/profile_images/1251951950163255299/cxSX369n_normal.jpg,NA,9,NA,http://twitter.com/Harriet96152202/statuses/1251954312772812801,"{""hashtags"":[],""symbols"":[],""user_mentions"":[{""screen_name"":""RoutledgeEd"",""name"":""Routledge Education Books"",""id"":27606068,""id_str"":""27606068"",""indices"":[3,15]}],""urls"":[]}" 3 | 1248064163211096064,Patrick81040643,RT @dtpthanh: Look like sisters? 
@YamChaivisit #aect19 https://t.co/Kr50LjJ0vI,Thu Apr 09 01:44:19 +0000 2020,2020-04-09T02:44:19Z,NA,NA,NA,NA,1245077697686130689,NA,"Twitter for Android",http://pbs.twimg.com/profile_images/1245078641475813377/mFziZnGt_normal.jpg,6,140,"Mississippi, USA",http://twitter.com/Patrick81040643/statuses/1248064163211096064,"{""hashtags"":[{""text"":""aect19"",""indices"":[47,54]}],""symbols"":[],""user_mentions"":[{""screen_name"":""dtpthanh"",""name"":""Thanh Do"",""id"":991017763824025600,""id_str"":""991017763824025600"",""indices"":[3,12]},{""screen_name"":""YamChaivisit"",""name"":""Yam Chaivisit"",""id"":2896012002,""id_str"":""2896012002"",""indices"":[33,46]}],""urls"":[],""media"":[{""id"":1187526100240355300,""id_str"":""1187526100240355328"",""indices"":[55,78],""media_url"":""http://pbs.twimg.com/media/EHrwskZU0AA2xdG.jpg"",""media_url_https"":""https://pbs.twimg.com/media/EHrwskZU0AA2xdG.jpg"",""url"":""https://t.co/Kr50LjJ0vI"",""display_url"":""pic.twitter.com/Kr50LjJ0vI"",""expanded_url"":""https://twitter.com/dtpthanh/status/1187526111489474561/photo/1"",""type"":""photo"",""sizes"":{""large"":{""w"":1740,""h"":2048,""resize"":""fit""},""small"":{""w"":578,""h"":680,""resize"":""fit""},""thumb"":{""w"":150,""h"":150,""resize"":""crop""},""medium"":{""w"":1020,""h"":1200,""resize"":""fit""}},""source_status_id"":1187526111489474600,""source_status_id_str"":""1187526111489474561"",""source_user_id"":991017763824025600,""source_user_id_str"":""991017763824025600""}]}" 4 | 1234206946732830720,ELTAugusta,"RT @veletsianos: Reminder: Call for Chapter Proposals: Critical Digital Pedagogy – Broadening Horizons, Bridging Theory and Practice: 5 | 6 | htt…",Sun Mar 01 20:00:41 +0000 2020,2020-03-01T20:00:40Z,NA,NA,NA,NA,3294167372,NA,"Twitter for iPad",http://pbs.twimg.com/profile_images/704165359020978176/wbBjVGk1_normal.jpg,707,1160,Burnaby Canada,http://twitter.com/ELTAugusta/statuses/1234206946732830720,"{""hashtags"":[],""symbols"":[],""user_mentions"":[{""screen_name"":""veletsianos"",""name"":""George Veletsianos, PhD"",""id"":17883918,""id_str"":""17883918"",""indices"":[3,15]}],""urls"":[]}" 7 | 1229405350178127872,gsa_aect,RT @tadousay: Many thanks to @AECTTechTrends for supporting our @gsa_aect with the Grad Member Musings column! The latest guest author is #…,Mon Feb 17 14:00:51 +0000 2020,2020-02-17T14:00:50Z,NA,NA,NA,NA,922536306437181440,NA,"TweetDeck",http://pbs.twimg.com/profile_images/1025847363615641600/1UylGRlO_normal.jpg,239,43,NA,http://twitter.com/gsa_aect/statuses/1229405350178127872,"{""hashtags"":[],""symbols"":[],""user_mentions"":[{""screen_name"":""tadousay"",""name"":""Tonia A. Dousay"",""id"":14215524,""id_str"":""14215524"",""indices"":[3,12]},{""screen_name"":""AECTTechTrends"",""name"":""TechTrends Editor"",""id"":804807943,""id_str"":""804807943"",""indices"":[29,44]},{""screen_name"":""gsa_aect"",""name"":""AECT GSA"",""id"":922536306437181400,""id_str"":""922536306437181440"",""indices"":[64,73]}],""urls"":[]}" 8 | 1227652243870097408,fcis_iu,"Give Madinah what it deserves and help Saudi achieve 2030 vision. 
So many great presenters, great content and great… https://t.co/ODA1A8KUZ5",Wed Feb 12 17:54:38 +0000 2020,2020-02-12T17:54:38Z,NA,NA,NA,NA,846230769899069440,NA,"Twitter Web App",http://pbs.twimg.com/profile_images/1179372789045878784/f0m9YedP_normal.jpg,5000,18,"Al Madinah Al Munawwarah, King",http://twitter.com/fcis_iu/statuses/1227652243870097408,"{""hashtags"":[],""symbols"":[],""user_mentions"":[],""urls"":[{""url"":""https://t.co/ODA1A8KUZ5"",""expanded_url"":""https://twitter.com/i/web/status/1227652243870097408"",""display_url"":""twitter.com/i/web/status/1…"",""indices"":[117,140]}]}" 9 | 1225505187453964288,StaufferEdu,RT @tadousay: Many thanks to @AECTTechTrends for supporting our @gsa_aect with the Grad Member Musings column! The latest guest author is #…,Thu Feb 06 19:43:00 +0000 2020,2020-02-06T19:43:00Z,NA,NA,NA,NA,21425886,NA,"Twitter for iPad",http://pbs.twimg.com/profile_images/1087813745970167809/z7LsNu1H_normal.jpg,737,943,"ÜT: 38.042058,-78.492393",http://twitter.com/StaufferEdu/statuses/1225505187453964288,"{""hashtags"":[],""symbols"":[],""user_mentions"":[{""screen_name"":""tadousay"",""name"":""Tonia A. Dousay"",""id"":14215524,""id_str"":""14215524"",""indices"":[3,12]},{""screen_name"":""AECTTechTrends"",""name"":""TechTrends Editor"",""id"":804807943,""id_str"":""804807943"",""indices"":[29,44]},{""screen_name"":""gsa_aect"",""name"":""AECT GSA"",""id"":922536306437181400,""id_str"":""922536306437181440"",""indices"":[64,73]}],""urls"":[]}" 10 | 1225137879921385472,AECTTechTrends,RT @tadousay: Many thanks to @AECTTechTrends for supporting our @gsa_aect with the Grad Member Musings column! The latest guest author is #…,Wed Feb 05 19:23:27 +0000 2020,2020-02-05T19:23:27Z,NA,NA,NA,NA,804807943,NA,"Twitter for Android",http://pbs.twimg.com/profile_images/817129539809714177/Px0jTTcg_normal.jpg,1455,16,NA,http://twitter.com/AECTTechTrends/statuses/1225137879921385472,"{""hashtags"":[],""symbols"":[],""user_mentions"":[{""screen_name"":""tadousay"",""name"":""Tonia A. Dousay"",""id"":14215524,""id_str"":""14215524"",""indices"":[3,12]},{""screen_name"":""AECTTechTrends"",""name"":""TechTrends Editor"",""id"":804807943,""id_str"":""804807943"",""indices"":[29,44]},{""screen_name"":""gsa_aect"",""name"":""AECT GSA"",""id"":922536306437181400,""id_str"":""922536306437181440"",""indices"":[64,73]}],""urls"":[]}" 11 | 1225122317849657345,tadousay,Many thanks to @AECTTechTrends for supporting our @gsa_aect with the Grad Member Musings column! The latest guest a… https://t.co/9uNyjeuSxq,Wed Feb 05 18:21:36 +0000 2020,2020-02-05T18:21:36Z,NA,NA,NA,NA,14215524,NA,"TweetDeck",http://pbs.twimg.com/profile_images/1214316658635886593/SrpJRUZj_normal.jpg,2034,1199,"Moscow, ID",http://twitter.com/tadousay/statuses/1225122317849657345,"{""hashtags"":[],""symbols"":[],""user_mentions"":[{""screen_name"":""AECTTechTrends"",""name"":""TechTrends Editor"",""id"":804807943,""id_str"":""804807943"",""indices"":[15,30]},{""screen_name"":""gsa_aect"",""name"":""AECT GSA"",""id"":922536306437181400,""id_str"":""922536306437181440"",""indices"":[50,59]}],""urls"":[{""url"":""https://t.co/9uNyjeuSxq"",""expanded_url"":""https://twitter.com/i/web/status/1225122317849657345"",""display_url"":""twitter.com/i/web/status/1…"",""indices"":[117,140]}]}" 12 | 1219758386436165633,gsa_aect,RT @AECT: The #AECT19 convention proceedings are available! 
The papers published in these volumes were presented at the annual AECT Convent…,Tue Jan 21 23:07:15 +0000 2020,2020-01-21T23:07:15Z,NA,NA,NA,NA,922536306437181440,NA,"TweetDeck",http://pbs.twimg.com/profile_images/1025847363615641600/1UylGRlO_normal.jpg,223,43,NA,http://twitter.com/gsa_aect/statuses/1219758386436165633,"{""hashtags"":[{""text"":""AECT19"",""indices"":[14,21]}],""symbols"":[],""user_mentions"":[{""screen_name"":""AECT"",""name"":""✵ AECT ✵"",""id"":12030342,""id_str"":""12030342"",""indices"":[3,8]}],""urls"":[]}" 13 | 1219043574555299840,aectddl,RT @AECT: The #AECT19 convention proceedings are available! The papers published in these volumes were presented at the annual AECT Convent…,Sun Jan 19 23:46:51 +0000 2020,2020-01-19T23:46:50Z,NA,NA,NA,NA,1088189033266798598,NA,"Twitter Web App",http://pbs.twimg.com/profile_images/1088189813784829959/Wfbg66KT_normal.jpg,201,143,NA,http://twitter.com/aectddl/statuses/1219043574555299840,"{""hashtags"":[{""text"":""AECT19"",""indices"":[14,21]}],""symbols"":[],""user_mentions"":[{""screen_name"":""AECT"",""name"":""✵ AECT ✵"",""id"":12030342,""id_str"":""12030342"",""indices"":[3,8]}],""urls"":[]}" 14 | -------------------------------------------------------------------------------- /tests/testthat/sample-tweet.csv: -------------------------------------------------------------------------------- 1 | id_str,from_user,text,created_at,time,geo_coordinates,user_lang,in_reply_to_user_id_str,in_reply_to_screen_name,from_user_id_str,in_reply_to_status_id_str,source,profile_image_url,user_followers_count,user_friends_count,user_location,status_url,entities_str 2 | 1219758386436165633,gsa_aect,RT @AECT: The #AECT19 convention proceedings are available! The papers published in these volumes were presented at the annual AECT Convent…,Tue Jan 21 23:07:15 +0000 2020,2020-01-21T23:07:15Z,NA,NA,NA,NA,922536306437181440,NA,"TweetDeck",http://pbs.twimg.com/profile_images/1025847363615641600/1UylGRlO_normal.jpg,223,43,NA,http://twitter.com/gsa_aect/statuses/1219758386436165633,"{""hashtags"":[{""text"":""AECT19"",""indices"":[14,21]}],""symbols"":[],""user_mentions"":[{""screen_name"":""AECT"",""name"":""✵ AECT ✵"",""id"":12030342,""id_str"":""12030342"",""indices"":[3,8]}],""urls"":[]}" 3 | -------------------------------------------------------------------------------- /tests/testthat/test-1-read_tags.R: -------------------------------------------------------------------------------- 1 | test_that("a TAGS tweet tracker is imported properly from Google Sheets", { 2 | 3 | vcr::use_cassette("sample_tags", { 4 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 5 | sample_tags <- head(read_tags(example_url), 200) 6 | sample_tags <- sample_tags[,1:(length(sample_tags)-1)] 7 | }) 8 | 9 | testthat::expect_true(is.data.frame(sample_tags)) 10 | testthat::expect_named(sample_tags) 11 | testthat::expect_true("id_str" %in% names(sample_tags)) 12 | testthat::expect_true("from_user" %in% names(sample_tags)) 13 | testthat::expect_true("status_url" %in% names(sample_tags)) 14 | testthat::expect_gt(ncol(sample_tags), 15) 15 | }) 16 | -------------------------------------------------------------------------------- /tests/testthat/test-10-create_edgelist.R: -------------------------------------------------------------------------------- 1 | vcr::use_cassette("sample_tags", { 2 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 3 | sample_tags <- head(read_tags(example_url), 200) 4 | sample_tags <- sample_tags[,1:(length(sample_tags)-1)] 5 | 
}) 6 | 7 | vcr::use_cassette("different_tweet_types", { 8 | app <- rtweet::rtweet_app(bearer_token = Sys.getenv("TWITTER_BEARER_TOKEN")) 9 | rtweet::auth_as(app) 10 | more_info <- pull_tweet_data(sample_tags, n = 100) 11 | }) 12 | 13 | processed_data <- process_tweets(more_info) 14 | 15 | 16 | 17 | test_that("tweets build into edgelist with default parameter", { 18 | 19 | el <- create_edgelist(processed_data) 20 | 21 | testthat::expect_equal(is.data.frame(el), TRUE) 22 | testthat::expect_named(el) 23 | testthat::expect_true("sender" %in% names(el)) 24 | testthat::expect_true("receiver" %in% names(el)) 25 | testthat::expect_true("tweet_type" %in% names(el)) 26 | testthat::expect_false("id_str" %in% names(el)) 27 | testthat::expect_false("user_id_str" %in% names(el)) 28 | testthat::expect_false("screen_name" %in% names(el)) 29 | testthat::expect_true("gsa_aect" %in% el$sender) 30 | testthat::expect_true("AECT" %in% el$receiver) 31 | testthat::expect_true("RoutledgeEd" %in% el$receiver) 32 | testthat::expect_true("reply" %in% el$tweet_type) 33 | testthat::expect_true("retweet" %in% el$tweet_type) 34 | testthat::expect_true("quote" %in% el$tweet_type) 35 | 36 | }) 37 | 38 | 39 | 40 | test_that("tweets build into edgelist with replies", { 41 | 42 | el <- create_edgelist(processed_data, type = "reply") 43 | 44 | testthat::expect_equal(is.data.frame(el), TRUE) 45 | testthat::expect_named(el) 46 | testthat::expect_true("sender" %in% names(el)) 47 | testthat::expect_true("receiver" %in% names(el)) 48 | testthat::expect_true("tweet_type" %in% names(el)) 49 | testthat::expect_false("id_str" %in% names(el)) 50 | testthat::expect_false("user_id_str" %in% names(el)) 51 | testthat::expect_false("screen_name" %in% names(el)) 52 | testthat::expect_false("gsa_aect" %in% el$sender) 53 | testthat::expect_false("AECT" %in% el$receiver) 54 | testthat::expect_false("RoutledgeEd" %in% el$receiver) 55 | testthat::expect_true("reply" %in% el$tweet_type) 56 | testthat::expect_false("retweet" %in% el$tweet_type) 57 | testthat::expect_false("quote" %in% el$tweet_type) 58 | }) 59 | 60 | 61 | 62 | test_that("tweets build into edgelist with retweets", { 63 | 64 | el <- create_edgelist(processed_data, type = "retweet") 65 | 66 | testthat::expect_equal(is.data.frame(el), TRUE) 67 | testthat::expect_named(el) 68 | testthat::expect_true("sender" %in% names(el)) 69 | testthat::expect_true("receiver" %in% names(el)) 70 | testthat::expect_true("tweet_type" %in% names(el)) 71 | testthat::expect_false("id_str" %in% names(el)) 72 | testthat::expect_false("user_id_str" %in% names(el)) 73 | testthat::expect_false("screen_name" %in% names(el)) 74 | testthat::expect_true("gsa_aect" %in% el$sender) 75 | testthat::expect_false("AECT" %in% el$receiver) 76 | testthat::expect_true("RoutledgeEd" %in% el$receiver) 77 | testthat::expect_false("reply" %in% el$tweet_type) 78 | testthat::expect_true("retweet" %in% el$tweet_type) 79 | testthat::expect_false("quote" %in% el$tweet_type) 80 | }) 81 | 82 | 83 | 84 | test_that("tweets build into edgelist with quote tweets", { 85 | 86 | el <- create_edgelist(processed_data, type = "quote") 87 | 88 | testthat::expect_equal(is.data.frame(el), TRUE) 89 | testthat::expect_named(el) 90 | testthat::expect_true("sender" %in% names(el)) 91 | testthat::expect_true("receiver" %in% names(el)) 92 | testthat::expect_true("tweet_type" %in% names(el)) 93 | testthat::expect_false("id_str" %in% names(el)) 94 | testthat::expect_false("user_id_str" %in% names(el)) 95 | testthat::expect_false("screen_name" %in% names(el)) 
96 | testthat::expect_false("gsa_aect" %in% el$sender) 97 | testthat::expect_true("AECT" %in% el$receiver) 98 | testthat::expect_false("RoutledgeEd" %in% el$receiver) 99 | testthat::expect_false("reply" %in% el$tweet_type) 100 | testthat::expect_false("retweet" %in% el$tweet_type) 101 | testthat::expect_true("quote" %in% el$tweet_type) 102 | }) 103 | -------------------------------------------------------------------------------- /tests/testthat/test-11-add_users_data.R: -------------------------------------------------------------------------------- 1 | vcr::use_cassette("sample_tags", { 2 | sample_tags <- read_tags("18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8") 3 | }) 4 | 5 | vcr::use_cassette("different_tweet_types", { 6 | app <- rtweet::rtweet_app(bearer_token = Sys.getenv("TWITTER_BEARER_TOKEN")) 7 | rtweet::auth_as(app) 8 | more_info <- pull_tweet_data(sample_tags, n = 100) 9 | }) 10 | 11 | processed_data <- process_tweets(more_info) 12 | el <- create_edgelist(processed_data) 13 | 14 | vcr::use_cassette("users_info", { 15 | el_users <- add_users_data(el) 16 | }) 17 | 18 | test_that("user data is added properly", { 19 | 20 | testthat::expect_equal(is.data.frame(el_users), TRUE) 21 | testthat::expect_named(el_users) 22 | testthat::expect_true("sender" %in% names(el_users)) 23 | testthat::expect_true("receiver" %in% names(el_users)) 24 | testthat::expect_true("tweet_type" %in% names(el_users)) 25 | testthat::expect_false("id_str" %in% names(el_users)) 26 | testthat::expect_true("sender_id_str" %in% names(el_users)) 27 | testthat::expect_true("receiver_id_str" %in% names(el_users)) 28 | testthat::expect_true("sender_location" %in% names(el_users)) 29 | testthat::expect_true("receiver_location" %in% names(el_users)) 30 | testthat::expect_true("sender_description" %in% names(el_users)) 31 | testthat::expect_true("receiver_description" %in% names(el_users)) 32 | testthat::expect_true("gsa_aect" %in% el_users$sender) 33 | testthat::expect_true("AECT" %in% el_users$receiver) 34 | testthat::expect_true("RoutledgeEd" %in% el_users$receiver) 35 | testthat::expect_true("reply" %in% el_users$tweet_type) 36 | testthat::expect_true("retweet" %in% el_users$tweet_type) 37 | testthat::expect_true("quote" %in% el_users$tweet_type) 38 | testthat::expect_gt(ncol(el_users), ncol(el)) 39 | testthat::expect_equal(nrow(el_users), nrow(el)) 40 | testthat::expect_equal(el_users$sender, el$sender) 41 | testthat::expect_equal(el_users$receiver, el$receiver) 42 | }) 43 | -------------------------------------------------------------------------------- /tests/testthat/test-2-get_char_tweet_ids.R: -------------------------------------------------------------------------------- 1 | sample_data <- 2 | readr::read_csv("sample-data.csv", col_names = TRUE) 3 | 4 | sample_tweet <- 5 | readr::read_csv("sample-tweet.csv", col_names = TRUE) 6 | 7 | 8 | test_that("get_char_tweet_ids() extracts correct ID number", { 9 | testthat::expect_equal(get_char_tweet_ids(sample_tweet), 10 | "1219758386436165633") 11 | testthat::expect_equal(get_char_tweet_ids(sample_tweet$status_url), 12 | "1219758386436165633") 13 | testthat::expect_equal(get_char_tweet_ids(sample_data[8, ]), 14 | "1225122317849657345") 15 | testthat::expect_equal(get_char_tweet_ids(sample_data$status_url[8]), 16 | "1225122317849657345") 17 | }) 18 | -------------------------------------------------------------------------------- /tests/testthat/test-3-pull_tweet_data.R: -------------------------------------------------------------------------------- 1 | 
test_that("pull_tweet_data() is able to retrieve additional metadata starting 2 | with dataframe", { 3 | 4 | vcr::use_cassette("sample_tags", { 5 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 6 | sample_tags <- head(read_tags(example_url), 200) 7 | sample_tags <- sample_tags[,1:(length(sample_tags)-1)] 8 | }) 9 | 10 | vcr::use_cassette("metadata_from_rtweet", { 11 | app <- rtweet::rtweet_app(bearer_token = 12 | Sys.getenv("TWITTER_BEARER_TOKEN")) 13 | rtweet::auth_as(app) 14 | from_df <- pull_tweet_data(sample_tags, n = 10) 15 | }) 16 | 17 | testthat::expect_equal(is.data.frame(from_df), TRUE) 18 | testthat::expect_named(from_df) 19 | testthat::expect_true("created_at" %in% names(from_df)) 20 | testthat::expect_true("id_str" %in% names(from_df)) 21 | testthat::expect_true("full_text" %in% names(from_df)) 22 | testthat::expect_true("entities" %in% names(from_df)) 23 | testthat::expect_true("in_reply_to_status_id_str" %in% 24 | names(from_df)) 25 | testthat::expect_true("user_id_str" %in% names(from_df)) 26 | testthat::expect_true("screen_name" %in% names(from_df)) 27 | testthat::expect_true("location" %in% names(from_df)) 28 | testthat::expect_true("followers_count" %in% names(from_df)) 29 | testthat::expect_true("friends_count" %in% names(from_df)) 30 | testthat::expect_gt(ncol(from_df), ncol(sample_tags)) 31 | testthat::expect_lte(nrow(from_df), nrow(sample_tags)) 32 | }) 33 | 34 | 35 | 36 | test_that("pull_tweet_data() is able to retrieve additional metadata starting 37 | with tweet IDs", { 38 | 39 | vcr::use_cassette("sample_tags", { 40 | sample_tags <- 41 | read_tags("18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8") 42 | }) 43 | 44 | vcr::use_cassette("metadata_from_ids", { 45 | app <- rtweet::rtweet_app(bearer_token = 46 | Sys.getenv("TWITTER_BEARER_TOKEN")) 47 | rtweet::auth_as(app) 48 | from_ids <- pull_tweet_data(id_vector = 49 | sample_tags$id_str, n = 10) 50 | }) 51 | 52 | testthat::expect_equal(is.data.frame(from_ids), TRUE) 53 | testthat::expect_named(from_ids) 54 | testthat::expect_true("created_at" %in% names(from_ids)) 55 | testthat::expect_true("id_str" %in% names(from_ids)) 56 | testthat::expect_true("full_text" %in% names(from_ids)) 57 | testthat::expect_true("entities" %in% names(from_ids)) 58 | testthat::expect_true("in_reply_to_status_id_str" %in% 59 | names(from_ids)) 60 | testthat::expect_true("user_id_str" %in% names(from_ids)) 61 | testthat::expect_true("screen_name" %in% names(from_ids)) 62 | testthat::expect_true("location" %in% names(from_ids)) 63 | testthat::expect_true("followers_count" %in% names(from_ids)) 64 | testthat::expect_true("friends_count" %in% names(from_ids)) 65 | testthat::expect_gt(ncol(from_ids), ncol(sample_tags)) 66 | testthat::expect_lte(nrow(from_ids), nrow(sample_tags)) 67 | }) 68 | 69 | 70 | 71 | test_that("pull_tweet_data() is able to retrieve additional metadata starting 72 | with tweet URLs", { 73 | 74 | vcr::use_cassette("sample_tags", { 75 | sample_tags <- 76 | read_tags("18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8") 77 | }) 78 | 79 | vcr::use_cassette("metadata_from_urls", { 80 | app <- rtweet::rtweet_app(bearer_token = 81 | Sys.getenv("TWITTER_BEARER_TOKEN")) 82 | rtweet::auth_as(app) 83 | from_urls <- pull_tweet_data(url_vector = 84 | sample_tags$status_url, n = 10) 85 | }) 86 | 87 | testthat::expect_equal(is.data.frame(from_urls), TRUE) 88 | testthat::expect_named(from_urls) 89 | testthat::expect_true("created_at" %in% names(from_urls)) 90 | testthat::expect_true("id_str" %in% names(from_urls)) 91 | 
testthat::expect_true("full_text" %in% names(from_urls)) 92 | testthat::expect_true("entities" %in% names(from_urls)) 93 | testthat::expect_true("in_reply_to_status_id_str" %in% 94 | names(from_urls)) 95 | testthat::expect_true("user_id_str" %in% names(from_urls)) 96 | testthat::expect_true("screen_name" %in% names(from_urls)) 97 | testthat::expect_true("location" %in% names(from_urls)) 98 | testthat::expect_true("followers_count" %in% names(from_urls)) 99 | testthat::expect_true("friends_count" %in% names(from_urls)) 100 | testthat::expect_gt(ncol(from_urls), ncol(sample_tags)) 101 | testthat::expect_lte(nrow(from_urls), nrow(sample_tags)) 102 | }) 103 | 104 | 105 | 106 | test_that("pull_tweet_data() output keeps columns in consistent order", { 107 | vcr::use_cassette("tweet_ids", { 108 | app <- rtweet::rtweet_app(bearer_token = Sys.getenv("TWITTER_BEARER_TOKEN")) 109 | rtweet::auth_as(app) 110 | 111 | colnames0a <- colnames(pull_tweet_data(id_vector = "X")) 112 | colnames0b <- colnames(pull_tweet_data(id_vector = "1580002144631279616")) 113 | colnames0c <- colnames(pull_tweet_data(id_vector = "1580212580249133056")) 114 | 115 | expect_null(colnames0a) 116 | expect_null(colnames0b) 117 | expect_null(colnames0c) 118 | 119 | 120 | colnames1 <- colnames(pull_tweet_data(id_vector = "1578252090102751232")) 121 | colnames2 <- colnames(pull_tweet_data(id_vector = "1578824308260237312")) 122 | colnames3 <- colnames(pull_tweet_data(id_vector = "1580186891151777792")) 123 | colnames4 <- colnames(pull_tweet_data(id_vector = "1580172355699023872")) 124 | colnames5 <- colnames(pull_tweet_data(id_vector = "1579969347942219776")) 125 | }) 126 | 127 | a <- tibble::tibble(colnames1, colnames2, colnames3, 128 | colnames4, colnames5) 129 | b <- apply(a, 1, function(x){length(unique(x))}) 130 | expect_true(all(b == 1)) 131 | 132 | expected_names <- 133 | c("created_at", "id", "id_str", "text", "full_text", "truncated", 134 | "entities", "source", "in_reply_to_status_id", 135 | "in_reply_to_status_id_str", "in_reply_to_user_id", 136 | "in_reply_to_user_id_str", "in_reply_to_screen_name", "geo", 137 | "coordinates", "place", "contributors", "is_quote_status", 138 | "retweet_count", "favorite_count", "favorited", "favorited_by", 139 | "retweeted", "scopes", "lang", "possibly_sensitive", 140 | "display_text_width", "display_text_range", "retweeted_status", 141 | "quoted_status", "quoted_status_id", "quoted_status_id_str", 142 | "quoted_status_permalink", "quote_count", "timestamp_ms", "reply_count", 143 | "filter_level", "metadata", "query", "withheld_scope", 144 | "withheld_copyright", "withheld_in_countries", 145 | "possibly_sensitive_appealable", "user_id", "user_id_str", "name", 146 | "screen_name", "location", "description", "url", "protected", 147 | "followers_count", "friends_count", "listed_count", "user_created_at", 148 | "favourites_count", "verified", "statuses_count", 149 | "profile_image_url_https", "profile_banner_url", "default_profile", 150 | "default_profile_image", "user_withheld_in_countries", "derived", 151 | "user_withheld_scope", "user_entities") 152 | 153 | expect_true(length(expected_names) == length(colnames1)) 154 | expect_true(expected_names[1] == colnames1[1]) 155 | expect_true(length(expected_names) == length(colnames2)) 156 | expect_true(expected_names[1] == colnames2[1]) 157 | expect_true(length(expected_names) == length(colnames3)) 158 | expect_true(expected_names[1] == colnames3[1]) 159 | expect_true(length(expected_names) == length(colnames4)) 160 | expect_true(expected_names[1] 
== colnames4[1]) 161 | expect_true(length(expected_names) == length(colnames5)) 162 | expect_true(expected_names[1] == colnames5[1]) 163 | }) 164 | -------------------------------------------------------------------------------- /tests/testthat/test-4-lookup_many_tweets.R: -------------------------------------------------------------------------------- 1 | test_that("lookup_many_tweets() works like pull_tweet_data()", { 2 | 3 | vcr::use_cassette("sample_tags", { 4 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 5 | sample_tags <- head(read_tags(example_url), 200) 6 | sample_tags <- sample_tags[,1:(length(sample_tags)-1)] 7 | }) 8 | 9 | vcr::use_cassette("lookup_many", { 10 | app <- rtweet::rtweet_app(bearer_token = Sys.getenv("TWITTER_BEARER_TOKEN")) 11 | rtweet::auth_as(app) 12 | pull_regular <- pull_tweet_data(sample_tags, n = 10) 13 | pull_many <- lookup_many_tweets(sample_tags$id_str[1:10]) 14 | }) 15 | 16 | testthat::expect_equal(is.data.frame(pull_many), TRUE) 17 | testthat::expect_named(pull_many) 18 | testthat::expect_true("created_at" %in% names(pull_many)) 19 | testthat::expect_true("id_str" %in% names(pull_many)) 20 | testthat::expect_true("full_text" %in% names(pull_many)) 21 | testthat::expect_true("entities" %in% names(pull_many)) 22 | testthat::expect_true("in_reply_to_status_id_str" %in% names(pull_many)) 23 | testthat::expect_true("user_id_str" %in% names(pull_many)) 24 | testthat::expect_true("screen_name" %in% names(pull_many)) 25 | testthat::expect_true("location" %in% names(pull_many)) 26 | testthat::expect_true("followers_count" %in% names(pull_many)) 27 | testthat::expect_true("friends_count" %in% names(pull_many)) 28 | testthat::expect_gt(ncol(pull_many), ncol(sample_tags)) 29 | testthat::expect_lte(nrow(pull_many), nrow(sample_tags)) 30 | testthat::expect_equal(ncol(pull_many), ncol(pull_regular)) 31 | testthat::expect_equal(nrow(pull_many), nrow(pull_regular)) 32 | testthat::expect_equal(pull_many$created_at, pull_regular$created_at) 33 | testthat::expect_equal(pull_many$id_str, pull_regular$id_str) 34 | }) 35 | -------------------------------------------------------------------------------- /tests/testthat/test-5-flag_unknown_upstream.R: -------------------------------------------------------------------------------- 1 | id_str <- c("aaa", "bbb", "ccc", "ddd", "eee") 2 | in_reply_to_status_id_str <- c("bbb", NA, NA, "fff", NA) 3 | 4 | df1 <- data.frame(id_str, in_reply_to_status_id_str) 5 | df2 <- data.frame(id_str, in_reply_to_status_id_str = rep(NA, 5)) 6 | df3 <- data.frame(x = id_str, in_reply_to_status_id_str) 7 | df4 <- data.frame(id_str, y = rep(NA, 5)) 8 | 9 | test_that("function returns dataframe of expected length", { 10 | testthat::expect_equal(nrow(flag_unknown_upstream(df1)), 1) 11 | testthat::expect_equal(nrow(flag_unknown_upstream(df2)), 0) 12 | testthat::expect_equal(nrow(flag_unknown_upstream(df3)), 2) 13 | testthat::expect_error(flag_unknown_upstream(df4)) 14 | }) 15 | -------------------------------------------------------------------------------- /tests/testthat/test-6-get_upstream_tweets.R: -------------------------------------------------------------------------------- 1 | test_that("get_upstream_tweets() finds additional tweets", { 2 | 3 | vcr::use_cassette("sample_tags", { 4 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 5 | sample_tags <- head(read_tags(example_url), 200) 6 | sample_tags <- sample_tags[,1:(length(sample_tags)-1)] 7 | }) 8 | 9 | vcr::use_cassette("upstream_tweets", { 10 | app <- 
rtweet::rtweet_app(bearer_token = Sys.getenv("TWITTER_BEARER_TOKEN")) 11 | rtweet::auth_as(app) 12 | more_info <- pull_tweet_data(sample_tags) 13 | replies_plus <- get_upstream_tweets(more_info) 14 | }) 15 | 16 | testthat::expect_equal(is.data.frame(replies_plus), TRUE) 17 | testthat::expect_named(replies_plus) 18 | testthat::expect_true("created_at" %in% names(replies_plus)) 19 | testthat::expect_true("id_str" %in% names(replies_plus)) 20 | testthat::expect_true("full_text" %in% names(replies_plus)) 21 | testthat::expect_true("entities" %in% names(replies_plus)) 22 | testthat::expect_true("in_reply_to_status_id_str" %in% names(replies_plus)) 23 | testthat::expect_true("user_id_str" %in% names(replies_plus)) 24 | testthat::expect_true("screen_name" %in% names(replies_plus)) 25 | testthat::expect_true("location" %in% names(replies_plus)) 26 | testthat::expect_true("followers_count" %in% names(replies_plus)) 27 | testthat::expect_true("friends_count" %in% names(replies_plus)) 28 | testthat::expect_gt(ncol(replies_plus), ncol(sample_tags)) 29 | testthat::expect_lte(nrow(replies_plus), nrow(sample_tags)) 30 | testthat::expect_equal(ncol(replies_plus), ncol(more_info)) 31 | testthat::expect_gte(nrow(replies_plus), nrow(more_info)) 32 | }) 33 | 34 | 35 | test_that("get_upstream_tweets() works with no new replies found", { 36 | sample_data <- 37 | readr::read_csv("sample-data.csv", col_names = TRUE) 38 | 39 | vcr::use_cassette("upstream_tweets_empty", { 40 | app <- rtweet::rtweet_app(bearer_token = Sys.getenv("TWITTER_BEARER_TOKEN")) 41 | rtweet::auth_as(app) 42 | more_info <- pull_tweet_data(sample_data) 43 | replies_plus <- get_upstream_tweets(more_info) 44 | }) 45 | 46 | testthat::expect_equal(is.data.frame(replies_plus), TRUE) 47 | testthat::expect_named(replies_plus) 48 | testthat::expect_true("created_at" %in% names(replies_plus)) 49 | testthat::expect_true("id_str" %in% names(replies_plus)) 50 | testthat::expect_true("full_text" %in% names(replies_plus)) 51 | testthat::expect_true("entities" %in% names(replies_plus)) 52 | testthat::expect_true("in_reply_to_status_id_str" %in% names(replies_plus)) 53 | testthat::expect_true("user_id_str" %in% names(replies_plus)) 54 | testthat::expect_true("screen_name" %in% names(replies_plus)) 55 | testthat::expect_true("location" %in% names(replies_plus)) 56 | testthat::expect_true("followers_count" %in% names(replies_plus)) 57 | testthat::expect_true("friends_count" %in% names(replies_plus)) 58 | testthat::expect_equal(ncol(replies_plus), ncol(more_info)) 59 | testthat::expect_equal(nrow(replies_plus), nrow(more_info)) 60 | testthat::expect_equal(replies_plus, more_info) 61 | }) 62 | -------------------------------------------------------------------------------- /tests/testthat/test-7-process_tweets.R: -------------------------------------------------------------------------------- 1 | test_that("process_tweets() mutates and adds additional columns", { 2 | 3 | vcr::use_cassette("sample_tags", { 4 | example_url <- "18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8" 5 | sample_tags <- head(read_tags(example_url), 200) 6 | sample_tags <- sample_tags[,1:(length(sample_tags)-1)] 7 | }) 8 | 9 | vcr::use_cassette("metadata_from_rtweet", { 10 | app <- rtweet::rtweet_app(bearer_token = Sys.getenv("TWITTER_BEARER_TOKEN")) 11 | rtweet::auth_as(app) 12 | from_df <- pull_tweet_data(sample_tags, n = 10) 13 | }) 14 | 15 | more_info <- from_df 16 | processed_data <- process_tweets(more_info) 17 | 18 | testthat::expect_equal(is.data.frame(processed_data), TRUE) 19 | 
testthat::expect_named(processed_data) 20 | testthat::expect_true("created_at" %in% names(processed_data)) 21 | testthat::expect_true("id_str" %in% names(processed_data)) 22 | testthat::expect_true("full_text" %in% names(processed_data)) 23 | testthat::expect_true("entities" %in% names(processed_data)) 24 | testthat::expect_true("in_reply_to_status_id_str" %in% names(processed_data)) 25 | testthat::expect_true("user_id_str" %in% names(processed_data)) 26 | testthat::expect_true("screen_name" %in% names(processed_data)) 27 | testthat::expect_true("location" %in% names(processed_data)) 28 | testthat::expect_true("followers_count" %in% names(processed_data)) 29 | testthat::expect_true("friends_count" %in% names(processed_data)) 30 | testthat::expect_true("mentions_count" %in% names(processed_data)) 31 | testthat::expect_true("hashtags_count" %in% names(processed_data)) 32 | testthat::expect_true("urls_count" %in% names(processed_data)) 33 | testthat::expect_true("tweet_type" %in% names(processed_data)) 34 | testthat::expect_true("is_self_reply" %in% names(processed_data)) 35 | testthat::expect_gt(ncol(processed_data), ncol(more_info)) 36 | testthat::expect_equal(nrow(processed_data), nrow(more_info)) 37 | testthat::expect_equal(processed_data$id_str, more_info$id_str) 38 | }) 39 | 40 | 41 | -------------------------------------------------------------------------------- /tests/testthat/test-8-get_url_domain.R: -------------------------------------------------------------------------------- 1 | test_that("get_url_domain() works on a variety of domains", { 2 | 3 | vcr::use_cassette("url_domains", { 4 | domain1 <- get_url_domain("https://www.tidyverse.org/") 5 | domain2 <- get_url_domain("https://www.tidyverse.org/packages/") 6 | domain3 <- get_url_domain("https://dplyr.tidyverse.org/") 7 | domain4 <- get_url_domain("https://www.npr.org/sections/technology/") 8 | }) 9 | 10 | testthat::expect_equal(domain1, "tidyverse.org") 11 | testthat::expect_equal(domain2, "tidyverse.org") 12 | testthat::expect_equal(domain3, "dplyr.tidyverse.org") 13 | testthat::expect_equal(domain4, "npr.org") 14 | }) 15 | -------------------------------------------------------------------------------- /tests/testthat/test-9-filter_by_tweet_type.R: -------------------------------------------------------------------------------- 1 | vcr::use_cassette("sample_tags", { 2 | sample_tags <- read_tags("18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8") 3 | }) 4 | 5 | vcr::use_cassette("different_tweet_types", { 6 | app <- rtweet::rtweet_app(bearer_token = Sys.getenv("TWITTER_BEARER_TOKEN")) 7 | rtweet::auth_as(app) 8 | more_info <- pull_tweet_data(sample_tags, n = 100) 9 | }) 10 | 11 | processed_data <- process_tweets(more_info) 12 | 13 | 14 | 15 | test_that("filter works for replies", { 16 | 17 | filtered_df <- filter_by_tweet_type(processed_data, "reply") 18 | 19 | testthat::expect_equal(is.data.frame(filtered_df), TRUE) 20 | testthat::expect_named(filtered_df) 21 | testthat::expect_true("created_at" %in% names(filtered_df)) 22 | testthat::expect_true("id_str" %in% names(filtered_df)) 23 | testthat::expect_true("full_text" %in% names(filtered_df)) 24 | testthat::expect_true("entities" %in% names(filtered_df)) 25 | testthat::expect_true("mentions_count" %in% names(filtered_df)) 26 | testthat::expect_true("hashtags_count" %in% names(filtered_df)) 27 | testthat::expect_true("urls_count" %in% names(filtered_df)) 28 | testthat::expect_true("tweet_type" %in% names(filtered_df)) 29 | testthat::expect_true("is_self_reply" %in% 
names(filtered_df)) 30 | testthat::expect_gt(ncol(filtered_df), ncol(more_info)) 31 | testthat::expect_equal(ncol(filtered_df), ncol(processed_data)) 32 | testthat::expect_lte(nrow(filtered_df), nrow(processed_data)) 33 | testthat::expect_equal(filtered_df$tweet_type[1], "reply") 34 | }) 35 | 36 | 37 | 38 | test_that("filter works for retweets", { 39 | 40 | filtered_df <- filter_by_tweet_type(processed_data, "retweet") 41 | 42 | testthat::expect_equal(is.data.frame(filtered_df), TRUE) 43 | testthat::expect_named(filtered_df) 44 | testthat::expect_true("created_at" %in% names(filtered_df)) 45 | testthat::expect_true("id_str" %in% names(filtered_df)) 46 | testthat::expect_true("full_text" %in% names(filtered_df)) 47 | testthat::expect_true("entities" %in% names(filtered_df)) 48 | testthat::expect_true("mentions_count" %in% names(filtered_df)) 49 | testthat::expect_true("hashtags_count" %in% names(filtered_df)) 50 | testthat::expect_true("urls_count" %in% names(filtered_df)) 51 | testthat::expect_true("tweet_type" %in% names(filtered_df)) 52 | testthat::expect_true("is_self_reply" %in% names(filtered_df)) 53 | testthat::expect_gt(ncol(filtered_df), ncol(more_info)) 54 | testthat::expect_equal(ncol(filtered_df), ncol(processed_data)) 55 | testthat::expect_lte(nrow(filtered_df), nrow(processed_data)) 56 | testthat::expect_equal(filtered_df$tweet_type[1], "retweet") 57 | }) 58 | 59 | 60 | 61 | test_that("filter works for quote tweets", { 62 | 63 | filtered_df <- filter_by_tweet_type(processed_data, "quote") 64 | 65 | testthat::expect_equal(is.data.frame(filtered_df), TRUE) 66 | testthat::expect_named(filtered_df) 67 | testthat::expect_true("created_at" %in% names(filtered_df)) 68 | testthat::expect_true("id_str" %in% names(filtered_df)) 69 | testthat::expect_true("full_text" %in% names(filtered_df)) 70 | testthat::expect_true("entities" %in% names(filtered_df)) 71 | testthat::expect_true("in_reply_to_status_id_str" %in% names(filtered_df)) 72 | testthat::expect_true("user_id_str" %in% names(filtered_df)) 73 | testthat::expect_true("screen_name" %in% names(filtered_df)) 74 | testthat::expect_true("location" %in% names(filtered_df)) 75 | testthat::expect_true("followers_count" %in% names(filtered_df)) 76 | testthat::expect_true("friends_count" %in% names(filtered_df)) 77 | testthat::expect_true("mentions_count" %in% names(filtered_df)) 78 | testthat::expect_true("hashtags_count" %in% names(filtered_df)) 79 | testthat::expect_true("urls_count" %in% names(filtered_df)) 80 | testthat::expect_true("tweet_type" %in% names(filtered_df)) 81 | testthat::expect_true("is_self_reply" %in% names(filtered_df)) 82 | testthat::expect_gt(ncol(filtered_df), ncol(more_info)) 83 | testthat::expect_equal(ncol(filtered_df), ncol(processed_data)) 84 | testthat::expect_lte(nrow(filtered_df), nrow(processed_data)) 85 | testthat::expect_equal(filtered_df$tweet_type[1], "quote") 86 | }) 87 | 88 | 89 | 90 | test_that("filter works for original tweets", { 91 | 92 | filtered_df <- filter_by_tweet_type(processed_data, "original") 93 | 94 | testthat::expect_equal(is.data.frame(filtered_df), TRUE) 95 | testthat::expect_named(filtered_df) 96 | testthat::expect_true("created_at" %in% names(filtered_df)) 97 | testthat::expect_true("id_str" %in% names(filtered_df)) 98 | testthat::expect_true("full_text" %in% names(filtered_df)) 99 | testthat::expect_true("entities" %in% names(filtered_df)) 100 | testthat::expect_true("in_reply_to_status_id_str" %in% names(filtered_df)) 101 | testthat::expect_true("user_id_str" %in% 
names(filtered_df)) 102 | testthat::expect_true("screen_name" %in% names(filtered_df)) 103 | testthat::expect_true("location" %in% names(filtered_df)) 104 | testthat::expect_true("followers_count" %in% names(filtered_df)) 105 | testthat::expect_true("friends_count" %in% names(filtered_df)) 106 | testthat::expect_true("mentions_count" %in% names(filtered_df)) 107 | testthat::expect_true("hashtags_count" %in% names(filtered_df)) 108 | testthat::expect_true("urls_count" %in% names(filtered_df)) 109 | testthat::expect_true("tweet_type" %in% names(filtered_df)) 110 | testthat::expect_true("is_self_reply" %in% names(filtered_df)) 111 | testthat::expect_gt(ncol(filtered_df), ncol(more_info)) 112 | testthat::expect_equal(ncol(filtered_df), ncol(processed_data)) 113 | testthat::expect_lte(nrow(filtered_df), nrow(processed_data)) 114 | testthat::expect_equal(filtered_df$tweet_type[1], "original") 115 | }) 116 | -------------------------------------------------------------------------------- /tidytags-logo.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/tidytags-logo.png -------------------------------------------------------------------------------- /tidytags.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: Default 4 | SaveWorkspace: Default 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Package 19 | PackageUseDevtools: Yes 20 | PackageInstallArgs: --no-multiarch --with-keep.source 21 | -------------------------------------------------------------------------------- /vignettes/.gitignore: -------------------------------------------------------------------------------- 1 | *.html 2 | *.R 3 | -------------------------------------------------------------------------------- /vignettes/files/TAGS-identifier-from-browser.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/files/TAGS-identifier-from-browser.png -------------------------------------------------------------------------------- /vignettes/files/TAGS-identifier-highlighted.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/files/TAGS-identifier-highlighted.png -------------------------------------------------------------------------------- /vignettes/files/TAGS-make-copy.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/files/TAGS-make-copy.png -------------------------------------------------------------------------------- /vignettes/files/TAGS-ready.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/files/TAGS-ready.png -------------------------------------------------------------------------------- 
/vignettes/files/choice-TAGS-version.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/files/choice-TAGS-version.png -------------------------------------------------------------------------------- /vignettes/files/key-task-1-success.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/files/key-task-1-success.png -------------------------------------------------------------------------------- /vignettes/files/publish-to-web-choices.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/files/publish-to-web-choices.png -------------------------------------------------------------------------------- /vignettes/files/publish-to-web-menu.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/files/publish-to-web-menu.png -------------------------------------------------------------------------------- /vignettes/files/share-anyone-with-link.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/files/share-anyone-with-link.png -------------------------------------------------------------------------------- /vignettes/files/share-button.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/files/share-button.png -------------------------------------------------------------------------------- /vignettes/setup.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Getting started with tidytags" 3 | output: 4 | rmarkdown::html_vignette 5 | vignette: > 6 | %\VignetteIndexEntry{Getting started with tidytags} 7 | %\VignetteEngine{knitr::rmarkdown} 8 | %\VignetteEncoding{UTF-8} 9 | --- 10 | 11 | ```{r setup, include=FALSE} 12 | knitr::opts_chunk$set( 13 | collapse = TRUE, 14 | comment = "#>", 15 | fig.path = ".", 16 | out.width = "100%", 17 | message = FALSE, 18 | warning = FALSE, 19 | error = FALSE) 20 | ``` 21 | 22 | This vignette introduces the initial setup necessary to use **tidytags**. Specifically, this guide offers help for two key tasks. 23 | 24 | 1. Making sure your TAGS tracker can be accessed 25 | 2. Getting and storing Twitter API tokens 26 | 27 | ## Considerations Related to Ethics, Data Privacy, and Human Subjects Research 28 | 29 | Before reading through these steps for setting up **tidytags**, please take a few moments to **reflect on ethical considerations** related to social media research. 30 | 31 | ```{r, child='../man/fragments/ethics.Rmd'} 32 | ``` 33 | 34 | With these things in mind, let's get started working through two key tasks. 35 | 36 | ## Key Task #1. Making sure your TAGS tracker can be accessed 37 | 38 | A core functionality of **tidytags** is to retrieve tweet data from a [Twitter Archiving Google Sheet](https://tags.hawksey.info/) (TAGS).
A TAGS tracker continuously collects tweets from Twitter, based on predefined search criteria and collection frequency. 39 | 40 | Here we offer **a brief overview of how to set up TAGS**, but be sure to read through the information on the [TAGS landing page](https://tags.hawksey.info/get-tags/) for thorough instructions on getting started with TAGS. 41 | 42 | We recommend using **TAGS v6.1**. 43 | 44 | ![choice of TAGS version](files/choice-TAGS-version.png) 45 | 46 | You will be prompted to `Make a copy` of TAGS that will then reside in your own Google Drive space. Click the button to do this. 47 | 48 | ![make a copy of TAGS](files/TAGS-make-copy.png)
49 | 50 | Your TAGS tracker is now ready to use! Just follow the two steps of instructions on the TAGS tracker: 51 | 52 | ![tags tracker screenshot](files/TAGS-ready.png)
53 | 54 | **tidytags** is set up to access a TAGS tracker by using the [**googlesheets4** package](https://CRAN.R-project.org/package=googlesheets4). One requirement for using **googlesheets4** is that your TAGS tracker has been "published to the web." To do this, with the TAGS page open in a web browser, go to `File >> Share >> Publish to the web`. 55 | 56 | ![publish-to-web-menu](files/publish-to-web-menu.png)
57 | 58 | The `Link` field should be 'Entire document' and the `Embed` field should be 'Web page.' If everything looks right, then click the `Publish` button. 59 | 60 | ![publish-to-web-choices](files/publish-to-web-choices.png)
61 | 62 | Next, click the `Share` button in the top right corner of the Google Sheets window, select `Get shareable link`, and set the permissions to 'Anyone with the link can view.' 63 | 64 | ![share-button](files/share-button.png)
65 | 66 | ![share-anyone-with-link](files/share-anyone-with-link.png)
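At this point, the tracker should be readable from R without any Google sign-in. As an optional check, you can read it with **googlesheets4** directly, the package **tidytags** builds on (a minimal sketch, using the example tracker identifier that appears later in this vignette):

```{r, eval=FALSE}
# Optional check (a sketch, not a required setup step): confirm the tracker
# is publicly readable. gs4_deauth() makes googlesheets4 send unauthenticated
# requests, which succeed only if the sharing steps above were completed.
library(googlesheets4)

gs4_deauth()

# Reads the first visible sheet of the workbook; an error here usually means
# the publishing or link-sharing step was missed.
read_sheet("18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8")
```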

67 | 68 | The input needed for the `tidytags::read_tags()` function is either the entire URL from the top of the web browser when a TAGS tracker is open, or a Google Sheet identifier (i.e., the alphanumeric string following `https://docs.google.com/spreadsheets/d/` in the TAGS tracker's URL). 69 | 70 | ![TAGS-identifier-from-browser](files/TAGS-identifier-from-browser.png)
71 | 72 | ![TAGS-identifier-highlighted](files/TAGS-identifier-highlighted.png)
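For example, both of these calls point `read_tags()` at the same example tracker (a sketch; the full-URL form is illustrative of what you might copy from the browser):

```{r, eval=FALSE}
# Option 1: pass the Google Sheet identifier alone
example_tags <- read_tags("18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8")

# Option 2: pass the entire URL copied from the browser (illustrative)
example_tags <- read_tags(
  "https://docs.google.com/spreadsheets/d/18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8/edit"
)
```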

73 | 74 | Be sure to put quotation marks around the URL or sheet identifier when entering it into the `read_tags()` function. 75 | 76 | To verify that this step worked for you, run the following code: 77 | 78 | `read_tags("18clYlQeJOc6W5QRuSlJ6_v3snqKJImFhU42bRkM_OX8")` 79 | 80 | It should return the following: 81 | 82 | ![Key Task #1 success](files/key-task-1-success.png)
83 | 84 | Then, try to run `read_tags()` with your own URL or sheet identifier. If that does not work, carefully review the steps above. 85 | 86 | ## Key Task #2. Getting and storing Twitter API tokens 87 | 88 | With a TAGS tracker archive imported into R, **tidytags** allows you to gather quite a bit more information related to the TAGS-collected tweets with the `pull_tweet_data()` function. This function builds on the [**rtweet** package](https://docs.ropensci.org/rtweet/) (via `rtweet::lookup_tweets()`) to query the Twitter API. However, **to access the Twitter API, whether through rtweet or tidytags, you will need to apply for developer access from Twitter**. You do this [through Twitter's developer website](https://developer.twitter.com/en/apply-for-access). 89 | 90 | ### Getting Twitter API tokens 91 | 92 | Once approved for developer access to the Twitter API, be sure to save the keys and tokens granted to you. These will only be available to you once (but you can easily generate new ones later as needed), so save them in a secure place. 93 | 94 | **Never share API keys or tokens with anyone; never add these directly to your R code or output.** 95 | 96 | One option is to save your Twitter API credentials in the **.Renviron** file accessed through the `usethis::edit_r_environ(scope='user')` function. 97 | 98 | Your saved Twitter API key and tokens should look something like this: 99 | 100 | ```{r, eval=FALSE} 101 | TWITTER_APP = NameOfYourTwitterApp 102 | TWITTER_API_KEY = YourConsumerKey 103 | TWITTER_API_SECRET = YourConsumerSecretKey 104 | TWITTER_ACCESS_TOKEN = YourAccessToken 105 | TWITTER_ACCESS_TOKEN_SECRET = YourAccessTokenSecret 106 | TWITTER_BEARER_TOKEN = YourBearerToken 107 | TWITTER_BEARER = YourBearer 108 | ``` 109 | 110 | ### Setting up 111 | 112 | The **rtweet** documentation already contains a very thorough vignette, "Authentication with rtweet" (`vignette("auth", package = "rtweet")`), to guide you through the process of authenticating access to the Twitter API. We recommend the **app-based authentication** method that uses `auth <- rtweet::rtweet_app()`, described in the [Apps](https://docs.ropensci.org/rtweet/articles/auth.html#apps) section of the vignette. 113 | 114 | The default for the app-based method is to enter the Twitter bearer token (what you saved as **TWITTER_BEARER_TOKEN**) interactively, when prompted. 115 | 116 | Finally, to make sure the authentication works properly, run the code `rtweet::get_token()`. 117 | 118 | ## Start using tidytags 119 | 120 | After completing these two key tasks, you're now ready to start using **tidytags**! 121 | 122 | Now would be a good time to learn about the full functionality of the package by walking through the "Using tidytags with a conference hashtag" guide (`vignette("tidytags-with-conf-hashtags", package = "tidytags")`). 123 | 124 | ## Getting help 125 | 126 | ```{r, child='../man/fragments/getting-help.Rmd'} 127 | ``` 128 | -------------------------------------------------------------------------------- /vignettes/vignette-network-visualization-1.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/ropensci-archive/tidytags/65c40bcb0e6721b55be676c42b524e33ea59885e/vignettes/vignette-network-visualization-1.png --------------------------------------------------------------------------------