├── .Rbuildignore ├── .gitignore ├── .travis.yml ├── DESCRIPTION ├── LICENSE ├── LICENSE.md ├── README.md ├── _bookdown.yml ├── children ├── contribution.Rmd └── session_type.Rmd ├── data └── erum2020_confirmed_program.json ├── erum2020program.Rproj ├── index.Rmd ├── materials ├── An enriched disease risk assessment model based on historical blood donors records.pdf ├── Hanly-CovidR-joint-winner-COVOID-Package.pdf ├── Jooken-CollaborateR.pdf ├── MOFA_bvelten.pdf ├── README.md ├── Severino-CorpFinder-Application_to_identify_Large_Corporate_Risks.pdf ├── cordanoe_geotopbricks_presenetation.pdf ├── gillespie_harsh-cran.pdf ├── kalibera-invisible-work-on-r.pdf ├── lenz-JuliaConnectoR.pdf ├── moritz_time-series-missing-value-visualizations.pdf └── onkelinx_effectclass.pdf └── tools ├── .gitignore ├── _proof.sh ├── eRum2020-program-shared.ods ├── generate_dataset.R ├── program-materials-parse-readme.R ├── program-materials-placeholders.R ├── program-materials-yt.R ├── render-conference-materials.R └── yt-descriptions.md /.Rbuildignore: -------------------------------------------------------------------------------- 1 | ^erum2020program\.Rproj$ 2 | ^\.Rproj\.user$ 3 | ^LICENSE\.md$ 4 | ^\.travis\.yml$ 5 | -------------------------------------------------------------------------------- /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | .Rhistory 3 | _site 4 | .htmlproofer*cache 5 | -------------------------------------------------------------------------------- /.travis.yml: -------------------------------------------------------------------------------- 1 | # R for travis: see documentation at https://docs.travis-ci.com/user/languages/r 2 | 3 | language: R 4 | cache: 5 | packages: true 6 | directories: 7 | - .htmlproofer-readme-cache 8 | - .htmlproofer-site-cache 9 | - .htmlproofer-materials-cache 10 | 11 | # Install html-proofer 12 | before_install: 13 | - gem install html-proofer 14 | env: 15 | global: 16 | - NOKOGIRI_USE_SYSTEM_LIBRARIES=true # Speed up the html-proofer install 17 | 18 | script: 19 | # proof README to check links 20 | - Rscript -e "rmarkdown::render('README.md', 'github_document', output_file = 'proof-README.md')" 21 | - tools/_proof.sh --storage-dir .htmlproofer-readme-cache proof-README.html 22 | - rm proof-README.* 23 | # site 24 | # > bookdown 25 | - Rscript -e "bookdown::render_book('index.Rmd', 'bookdown::gitbook')" 26 | # > conference materials page 27 | - Rscript tools/render-conference-materials.R 28 | # > hosted materials 29 | - cp -r materials _site 30 | # > proof 31 | - htmlproofer --file-ignore /conference-materials/ --storage-dir .htmlproofer-site-cache --timeframe 28d _site 32 | - tools/_proof.sh --storage-dir .htmlproofer-materials-cache _site/conference-materials.html 33 | 34 | deploy: 35 | provider: pages 36 | skip_cleanup: true 37 | github_token: $GITHUB_PAT # Set in the settings page of your repository, as a secure variable 38 | local_dir: _site 39 | on: 40 | branch: master 41 | -------------------------------------------------------------------------------- /DESCRIPTION: -------------------------------------------------------------------------------- 1 | Package: erum2020program 2 | Type: Book 3 | Title: eRum2020 Program 4 | Version: 0.0.0.9000 5 | Authors@R: 6 | person(given = "Francesca", 7 | family = "Vitalini", 8 | role = c("aut", "cre"), 9 | email = "francesca.vitalini@mirai-solutions.com") 10 | Description: Visualize eRum2020 Program as a bookdown. 
11 | License: MIT + file LICENSE 12 | Encoding: UTF-8 13 | Imports: 14 | bookdown, 15 | dplyr, 16 | jsonlite, 17 | readxl, 18 | rmarkdown, 19 | tidyr, 20 | purrr, 21 | xml2, 22 | rvest 23 | -------------------------------------------------------------------------------- /LICENSE: -------------------------------------------------------------------------------- 1 | YEAR: 2020 2 | COPYRIGHT HOLDER: MilanoR 3 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | # MIT License 2 | 3 | Copyright (c) 2020 MilanoR 4 | 5 | Permission is hereby granted, free of charge, to any person obtaining a copy 6 | of this software and associated documentation files (the "Software"), to deal 7 | in the Software without restriction, including without limitation the rights 8 | to use, copy, modify, merge, publish, distribute, sublicense, and/or sell 9 | copies of the Software, and to permit persons to whom the Software is 10 | furnished to do so, subject to the following conditions: 11 | 12 | The above copyright notice and this permission notice shall be included in all 13 | copies or substantial portions of the Software. 14 | 15 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR 16 | IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, 17 | FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE 18 | AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER 19 | LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, 20 | OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE 21 | SOFTWARE. 22 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | 2 | # e-Rum2020 Program 3 | 4 | 5 | [![Travis build status](https://travis-ci.com/Milano-R/erum2020program.svg?branch=master)](https://travis-ci.com/Milano-R/erum2020program) 6 | 7 | 8 | ## Book of abstracts for accepted contributions 9 | 10 | The abstracts of accepted contributions, organized by session type, are available as a bookdown at https://milano-r.github.io/erum2020program/erum2020-contributed-program.html. 11 | 12 | ## Detailed program schedule 13 | 14 | The detailed schedule of e-Rum2020 is available as a [PDF booklet](http://2020.erum.io/wp-content/uploads/2020/06/program_brochure_v5_20200617.pdf). 16 | ## Conference Materials 17 | 18 | Materials of all talks and workshops are collected below. 19 | 20 | Conference materials are being populated based on **speaker and community contributions**. We recommend _Watching_ and _Starring_ the repository to stay up to date with newly collected materials. For larger contributions, we recommend creating [issues](https://github.com/Milano-R/erum2020program/issues) first for better coordination, and checking the existing ones for "help wanted". 
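For a local preview of the site, the main build steps from `.travis.yml` can be mirrored in an R session. This is a minimal sketch, assuming the packages listed in `DESCRIPTION` are installed; note the CI runs the materials script via `Rscript` rather than `source()`:

```r
# Render the book of abstracts into _site (output_dir is set in _bookdown.yml)
bookdown::render_book("index.Rmd", "bookdown::gitbook")

# Render the conference materials page
source("tools/render-conference-materials.R")

# Host the collected PDFs alongside the site
# (the CI equivalent is `cp -r materials _site`)
file.copy("materials", "_site", recursive = TRUE)
```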
21 | 22 | ---- 23 | 24 | ## Index 25 | 26 | - [Keynotes](#keynotes) [[:movie_camera: playlist](https://www.youtube.com/playlist?list=PLX5iPLTva9UkHBP323LcXE8Nz1FBQc4iL)] 27 | - [Keynote 1 - Stephanie Hicks - Life Sciences](#keynote-1---stephanie-hicks---life-sciences) [[:movie_camera: video](https://youtu.be/YqLb6gCyUGw)] 28 | - [Keynote 2 - Jared Lander - Applications](#keynote-2---jared-lander---applications) [[:movie_camera: video](https://youtu.be/wYxUdjW8vDg)] 29 | - [Keynote 3 - Francesco Bartolucci - Machine Learning & Models](#keynote-3---francesco-bartolucci---machine-learning--models) [[:movie_camera: video](https://youtu.be/NUSnkL0PLEo)] 30 | - [Keynote 4 - Sharon Machlis - DataViz](#keynote-4---sharon-machlis---dataviz) [[:movie_camera: video](https://youtu.be/Ar8msLCYroM)] 31 | - [Keynote 5 - Tomas Kalibera - R World](#keynote-5---tomas-kalibera---r-world) [[:movie_camera: video](https://youtu.be/aKJ7c-Wm6VA)] 32 | - [Keynote 6 - Kelly O'Briant - R in Production](#keynote-6---kelly-obriant---r-in-production) [[:movie_camera: video](https://youtu.be/Qbwm9rWErfU)] 33 | 34 | - [Invited Sessions](#invited-sessions) [[:movie_camera: playlist](https://www.youtube.com/playlist?list=PLX5iPLTva9UlOtwXchd1TaneFDfMHNQEd)] 35 | - [Invited Session 1 - Life Sciences / CovidR / R World](#invited-session-1---life-sciences--covidr--r-world) [[:movie_camera: video](https://youtu.be/EhqiA2RMzEU)] 36 | - [Invited Session 2 - Machine Learning & Models / DataViz (Shiny) / Applications](#invited-session-2---machine-learning--models--dataviz-shiny--applications) [[:movie_camera: video](https://youtu.be/awj5dEkgsTI)] 37 | - [Invited Session 3 - Shiny / R in Production / R World](#invited-session-3---shiny--r-in-production--r-world) [[:movie_camera: video](https://youtu.be/t7IK1uirEKo)] 38 | 39 | - [Parallel Sessions](#parallel-sessions) [[:movie_camera: playlist](https://www.youtube.com/playlist?list=PLX5iPLTva9UkGBO7PkPHGqubm6aRwS9TM)] 40 | - [Parallel Session 1 - Machine Learning](#parallel-session-1---machine-learning) [[:movie_camera: video](https://youtu.be/4Kf6BBSdBoE)] 41 | - [Parallel Session 2 - Life Sciences](#parallel-session-2---life-sciences) [[:movie_camera: video](https://youtu.be/qU3ob0t6dHE)] 42 | - [Parallel Session 3 - Applications / R in Production](#parallel-session-3---applications--r-in-production) [[:movie_camera: video](https://youtu.be/NwVOvfpGq4o)] 43 | - [Parallel Session 4 - Life Sciences](#parallel-session-4---life-sciences) [[:movie_camera: video](https://youtu.be/vj_dwE4iFrs)] 44 | - [Parallel Session 5 - R World](#parallel-session-5---r-world) [[:movie_camera: video](https://youtu.be/Z2KBhz-xDIM)] 45 | - [Parallel Session 6 - R in Production / R World](#parallel-session-6---r-in-production--r-world) [[:movie_camera: video](https://youtu.be/RbGHIqVQ6OA)] 46 | - [Parallel Session 7 - Data Visualization & Shiny](#parallel-session-7---data-visualization--shiny) [[:movie_camera: video](https://youtu.be/r70YV-4ktIY)] 47 | - [Parallel Session 8 - Machine Learning & Models](#parallel-session-8---machine-learning--models) [[:movie_camera: video](https://youtu.be/RhiEEeMfklM)] 48 | - [Parallel Session 9 - R in Production](#parallel-session-9---r-in-production) [[:movie_camera: video](https://youtu.be/VQtEI-akVyk)] 49 | - [Parallel Session 10 - Machine Learning / Applications](#parallel-session-10---machine-learning--applications) [[:movie_camera: video](https://youtu.be/MoM31cFoe8g)] 50 | 51 | - [Lightning Talks](#lightning-talks) [[:movie_camera: 
playlist](https://www.youtube.com/playlist?list=PLX5iPLTva9UlmVS_t-A8OQicxYKODVfqH)] 52 | - [Lightning Talks 1 - Data Visualization & Production](#lightning-talks-1---data-visualization--production) [[:movie_camera: video](https://youtu.be/hFKwzJBXvIU)] 53 | - [Lightning Talks 2 - Life Sciences](#lightning-talks-2---life-sciences) [[:movie_camera: video](https://youtu.be/79lUgThZ4HE)] 54 | - [Lightning Talks 3 - Machine Learning](#lightning-talks-3---machine-learning) [[:movie_camera: video](https://youtu.be/sHlj_H8K2RI)] 55 | - [Lightning Talks 4 - Applications](#lightning-talks-4---applications) [[:movie_camera: video](https://youtu.be/KOYZPMMgiHo)] 56 | 57 | - [Shiny Demos](#shiny-demos) [[:movie_camera: playlist](https://www.youtube.com/playlist?list=PLX5iPLTva9UlCY2qFZBSRydU1oPEAJNig)] 58 | - [Shiny Demo 1 - Machine Learning & Applications](#shiny-demo-1---machine-learning--applications) [[:movie_camera: video](https://youtu.be/t8PZbP5b8EM)] 59 | - [Shiny Demo 2 - Mobility & Spatial](#shiny-demo-2---mobility--spatial) [[:movie_camera: video](https://youtu.be/sU_iT3Spc-Q)] 60 | 61 | - [Poster Sessions](#poster-sessions) [[:movie_camera: playlist](https://www.youtube.com/playlist?list=PLX5iPLTva9Ukknkenms-VDx94QlRcHgmM)] 62 | - [Poster Session 1 - Life Sciences & Applications](#poster-session-1---life-sciences--applications) [[:movie_camera: video](https://youtu.be/HDLpw3gCouQ)] 63 | - [Poster Session 2 - DataViz & Machine Learning](#poster-session-2---dataviz--machine-learning) [[:movie_camera: video](https://youtu.be/dwIXqhk1NdM)] 64 | 65 | - [Thematic Lounges](#thematic-lounges) 66 | - [Thematic Lounge 1 - Developing Software and Careers in Life Sciences](#thematic-lounge-1---developing-software-and-careers-in-life-sciences) 67 | - [Thematic Lounge 2 - Building Community & Diversity](#thematic-lounge-2---building-community--diversity) 68 | - [Thematic Lounge 3 - Data science: freelancing and business](#thematic-lounge-3---data-science-freelancing-and-business) 69 | 70 | - [Workshops](#workshops) [[:movie_camera: playlist](https://www.youtube.com/playlist?list=PLX5iPLTva9UnZmlzry0_YNB8rH85DuLRa)] 71 | 72 | - [Morning Workshops](#morning-workshops) 73 | - [Afternoon Workshops](#afternoon-workshops) 74 | 75 | ## Keynotes 76 | 77 | [:movie_camera: **Playlist**](https://www.youtube.com/playlist?list=PLX5iPLTva9UkHBP323LcXE8Nz1FBQc4iL) 78 | 79 | ### Keynote 1 - Stephanie Hicks - Life Sciences 80 | 81 | #### Using R and data science to improve human health 82 | 83 | - Speaker: Stephanie Hicks ([\@stephaniehicks](https://twitter.com/@stephaniehicks), [website](https://www.stephaniehicks.com), [LinkedIn](https://www.linkedin.com/in/hicksstephanie)) 84 | - [:movie_camera: Video](https://youtu.be/YqLb6gCyUGw) 85 | - Materials: [slides](https://docs.google.com/presentation/d/1TRmPRv4k4aceJbdVQDub4hCXYvIdTeQfDJZfpBu28PQ) 86 | 87 | ### Keynote 2 - Jared Lander - Applications 88 | 89 | #### Applying R at Work 90 | 91 | - Speaker: Jared Lander ([\@jaredlander](https://twitter.com/@jaredlander), [website](https://www.jaredlander.com), [LinkedIn](https://www.linkedin.com/in/jaredlander)) 92 | - [:movie_camera: Video](https://youtu.be/wYxUdjW8vDg) 93 | - Materials: [slides](https://jaredlander.com/content/2020/06/ApplyingRAtWork.html) 94 | 95 | ### Keynote 3 - Francesco Bartolucci - Machine Learning & Models 96 | 97 | #### Latent Markov models for longitudinal data in R by LMest package 98 | 99 | - Speaker: Francesco Bartolucci ([\@f_bartolucci](https://twitter.com/@f_bartolucci), 
[website](https://sites.google.com/site/bartstatistics), [LinkedIn](https://www.linkedin.com/in/francesco-bartolucci-b6365a74)) 100 | - [:movie_camera: Video](https://youtu.be/NUSnkL0PLEo) 101 | - Materials: [slides](https://drive.google.com/file/d/1-QZrmhdHCquJ0AWsCDeJdmusNwraD_EL/view) 102 | 103 | ### Keynote 4 - Sharon Machlis - DataViz 104 | 105 | #### What I Learned as an R Journalist 106 | 107 | - Speaker: Sharon Machlis ([\@sharon000](https://twitter.com/@sharon000), [LinkedIn](https://www.linkedin.com/in/sharonmachlis)) 108 | - [:movie_camera: Video](https://youtu.be/Ar8msLCYroM) 109 | - Materials: [slides](http://www.machlis.com/eRum2020/) 110 | 111 | ### Keynote 5 - Tomas Kalibera - R World 112 | 113 | #### The Invisible Work on R 114 | 115 | - Speaker: Tomas Kalibera ([LinkedIn](https://www.linkedin.com/in/kalibera)) 116 | - [:movie_camera: Video](https://youtu.be/aKJ7c-Wm6VA) 117 | - Materials: [slides](materials/kalibera-invisible-work-on-r.pdf) 118 | 119 | ### Keynote 6 - Kelly O'Briant - R in Production 120 | 121 | #### Reflections on two years of solutions engineering at RStudio 122 | 123 | - Speaker: Kelly O'Briant ([\@kellrstats](https://twitter.com/@kellrstats), [website](https://kellobri.github.io), [LinkedIn](https://www.linkedin.com/in/kellyobriant)) 124 | - [:movie_camera: Video](https://youtu.be/Qbwm9rWErfU) 125 | - Materials: [slides](https://speakerdeck.com/kellobri/erum-2020-reflections-on-r-in-production) 126 | 127 | ## Invited Sessions 128 | 129 | [:movie_camera: **Playlist**](https://www.youtube.com/playlist?list=PLX5iPLTva9UlOtwXchd1TaneFDfMHNQEd) 130 | 131 | ### Invited Session 1 - Life Sciences / CovidR / R World 132 | 133 | [:movie_camera: **Video**](https://youtu.be/EhqiA2RMzEU) 134 | 135 | #### Multi-Omics Factor Analysis Plus: A probabilistic framework for comprehensive and scalable integration of multi-modal data 136 | 137 | - Speaker: Britta Velten ([website](https://bv2.github.io), [LinkedIn](https://www.linkedin.com/in/britta-velten-22a36813a), [GitHub](https://github.com/bv2)) 138 | - Materials: [slides](materials/MOFA_bvelten.pdf) 139 | 140 | #### COVID-19 Data Hub 141 | 142 | - Speaker: Emanuele Guidotti ([website](https://guidotti.dev), [LinkedIn](https://www.linkedin.com/in/emanuele-guidotti/), [GitHub](https://github.com/eguidotti)) 143 | - Materials: [COVID19 R Package](https://covid19datahub.io/articles/api/r.html) 144 | 145 | #### COVOID: Modelling COVID-19 Transmission and Interventions 146 | 147 | - Speaker: Mark Hanly ([bio](https://cbdrh.med.unsw.edu.au/people/dr-mark-hanly)) 148 | - Materials: [slides](materials/Hanly-CovidR-joint-winner-COVOID-Package.pdf) 149 | 150 | #### Building Agile data products leveraging the R package structure 151 | 152 | - Speaker: Edwin Thoen ([\@edwin_thoen](https://twitter.com/@edwin_thoen)) 153 | - Materials: [slides](https://github.com/EdwinTh/ADSwR/raw/master/presentations/20200617_eRum_publish.pdf) 154 | 155 | ### Invited Session 2 - Machine Learning & Models / DataViz (Shiny) / Applications 156 | 157 | [**:movie_camera: Video**](https://youtu.be/awj5dEkgsTI) 158 | 159 | #### brquasi: Improved quasi-likelihood estimation 160 | 161 | - Speaker: Ioannis Kosmidis ([\@IKosmidis_](https://twitter.com/@IKosmidis_), [website](http://www.ikosmidis.com), [LinkedIn](https://www.linkedin.com/in/kosmidis)) 162 | - Materials: [slides](http://www.ikosmidis.com/files/ikosmidis_brquasi_eRum2020) 163 | 164 | #### Better than Deep Learning: Gradient Boosting Machines (GBM) – with 2020 updates 165 | 166 | - Speaker: 
Szilard Pafka ([LinkedIn](https://www.linkedin.com/in/szilard)/[Github](https://github.com/szilard)/[Twitter](https://twitter.com/DataScienceLA)) 167 | - Materials: [slides](https://bit.ly/szilard-talk-erum20) 168 | 169 | #### Testing Shiny: What, why, and how. 170 | 171 | - Speaker: Colin Fay ([\@_ColinFay](https://twitter.com/@_ColinFay)) 172 | - Materials: [slides](https://speakerdeck.com/colinfay/erum-2020-testing-shiny-why-what-and-how) 173 | 174 | #### From writing code to infoRming policy: a case study of reproducible research in transport planning 175 | 176 | - Speaker: Robin Lovelace 177 | - Materials: [slides](https://www.robinlovelace.net/presentations/c4p-slides.html) 178 | 179 | ### Invited Session 3 - Shiny / R in Production / R World 180 | 181 | [:movie_camera: **Video**](https://youtu.be/t7IK1uirEKo) 182 | 183 | #### CRANalerts: A Shinyapp-as-a-Service for Impatient R Users 184 | 185 | - Speaker: Dean Attali 186 | - Materials: TBD 187 | 188 | #### And you thought CRAN was harsh 189 | 190 | - Speaker: Colin Gillespie ([\@csgillespie](https://twitter.com/@csgillespie)) 191 | - Materials: [slides](materials/gillespie_harsh-cran.pdf), [repo](https://github.com/jumpingrivers/inteRgrate) 192 | 193 | #### How to improve your R package 194 | 195 | - Speaker: Maëlle Salmon ([\@ma_salmon](https://twitter.com/@ma_salmon)) 196 | - Materials: [slides](https://masalmon.eu/talks/2020-06-19-e-rum/) 197 | 198 | #### dplyr 1.0.0 199 | 200 | - Speaker: Romain Francois ([\@romain_francois](https://twitter.com/@romain_francois)) 201 | - Materials: [slides](https://speakerdeck.com/romainfrancois/dplyr-1-dot-0-0) 202 | 203 | ## Parallel Sessions 204 | 205 | [:movie_camera: **Playlist**](https://www.youtube.com/playlist?list=PLX5iPLTva9UkGBO7PkPHGqubm6aRwS9TM) 206 | 207 | ### Parallel Session 1 - Machine Learning 208 | 209 | [:movie_camera: **Video**](https://youtu.be/4Kf6BBSdBoE) 210 | 211 | #### Deduplicating real estate ads using Naive Bayes record linkage 212 | 213 | - Speaker: Daniel Meister ([\@danimeist](https://twitter.com/@danimeist)) 214 | - Materials: [slides](https://slides.datahouse.ch/3fd640b/static/1c97060e67e41c04cc03d4f0c3497c03adb4a30b) 215 | 216 | #### Hydrological Modelling and R 217 | 218 | - Speaker: Emanuele Cordano 219 | - Materials: [slides](materials/cordanoe_geotopbricks_presenetation.pdf) 220 | 221 | #### gWQS: An R Package for Linear and Generalized Weighted Quantile Sum (WQS) Regression 222 | 223 | - Speaker: Stefano Renzetti 224 | - Materials: TBD 225 | 226 | #### Flexible Meta-Analysis of Generalized Additive Models with metagam 227 | 228 | - Speaker: Øystein Sørensen ([\@SorensenOystein](https://twitter.com/@SorensenOystein), [website](https://osorensen.rbind.io)) 229 | - Materials: [slides](https://osorensen.rbind.io/post/presentation-of-metagam-at-erum2020), [blog post](https://osorensen.rbind.io/post/presentation-of-metagam-at-erum2020) 230 | 231 | ### Parallel Session 2 - Life Sciences 232 | 233 | [:movie_camera: **Video**](https://youtu.be/qU3ob0t6dHE) 234 | 235 | #### CONNECTOR: a computational approach to study intratumor heterogeneity 236 | 237 | - Speaker: Simone Pernice 238 | - Materials: TBD 239 | 240 | #### EPIMOD: A computational framework for studying epidemiological systems 241 | 242 | - Speaker: Paolo Castagno 243 | - Materials: TBD 244 | 245 | #### How to apply R in a hospital environment on standard available hospital-wide data 246 | 247 | - Speaker: Mieke Deschepper 248 | - Materials: TBD 249 | 250 | #### APFr: Average Power Function and Bayes 
FDR for Robust Brain Networks Construction 251 | 252 | - Speaker: Nicolò Margaritella 253 | - Materials: TBD 254 | 255 | ### Parallel Session 3 - Applications / R in Production 256 | 257 | [:movie_camera: **Video**](https://youtu.be/NwVOvfpGq4o) 258 | 259 | #### Using process mining principles to extract a collaboration graph from a version control system log 260 | 261 | - Speaker: Leen Jooken 262 | - Materials: [slides](materials/Jooken-CollaborateR.pdf), [repo](https://github.com/bupaverse/collaborateR) 263 | 264 | #### Design Patterns For Big Shiny Apps 265 | 266 | - Speaker: Alex Gold ([\@alexkgold](https://twitter.com/@alexkgold)) 267 | - Materials: [slides](https://docs.google.com/presentation/d/1mAAU0sw7GIVwOeGVsvdZCWA86mHhu-GZgkjYgreqmBk/edit), [repo](https://github.com/akgold/erum-big-shiny) 268 | 269 | #### Fake News: AI on the battle ground 270 | 271 | - Speaker: Ayomide Shodipo 272 | - Materials: TBD 273 | 274 | #### progressr: An Inclusive, Unifying API for Progress Updates 275 | 276 | - Speaker: Henrik Bengtsson ([\@henrikbengtsson](https://twitter.com/henrikbengtsson), [website](https://www.jottr.org/)) 277 | - Materials: [slides](https://www.jottr.org/presentations/eRum2020/BengtssonH_20200617-progressr-An_Inclusive,_Unifying_API_for_Progress_Updates.pdf), [blog post](https://www.jottr.org/2020/07/04/progressr-erum2020-slides) 278 | 279 | ### Parallel Session 4 - Life Sciences 280 | 281 | [:movie_camera: **Video**](https://youtu.be/vj_dwE4iFrs) 282 | 283 | #### Interpretable and accessible Deep Learning for omics data with R and friends 284 | 285 | - Speaker: Moritz Hess 286 | - Materials: TBD 287 | 288 | #### GeneTonic: enjoy RNA-seq data analysis, responsibly 289 | 290 | - Speaker: Federico Marini ([\@FedeBioinfo](https://twitter.com/@FedeBioinfo)) 291 | - Materials: [slides](https://federicomarini.github.io/eRum2020) 292 | 293 | #### DaMiRseq 2.0: from high dimensional data to cost-effective reliable prediction models 294 | 295 | - Speaker: Mattia Chiesa 296 | - Materials: TBD 297 | 298 | #### A simple and flexible inactivity/sleep detection R package 299 | 300 | - Speaker: Francesca Giorgolo 301 | - Materials: TBD 302 | 303 | ### Parallel Session 5 - R World 304 | 305 | [:movie_camera: **Video**](https://youtu.be/Z2KBhz-xDIM) 306 | 307 | #### The R Consortium 2020: adapting to rapid change and global crisis 308 | 309 | - Speaker: Joseph Rickert 310 | - Materials: TBD 311 | 312 | #### CorrelAidX - Building R-focused Communities for Social Good on the Local Level 313 | 314 | - Speaker: Regina Siegers 315 | - Materials: TBD 316 | 317 | #### From consulting to open-source and back 318 | 319 | - Speaker: Christoph Sax 320 | - Materials: TBD 321 | 322 | #### {polite}: web etiquette for R users 323 | 324 | - Speaker: Dmytro Perepolkin ([\@dmi3k](https://twitter.com/dmi3k)) 325 | - Materials: [slides](https://bit.ly/polite20) 326 | 327 | ### Parallel Session 6 - R in Production / R World 328 | 329 | [:movie_camera: **Video**](https://youtu.be/RbGHIqVQ6OA) 330 | 331 | #### Powering Turing e-Atlas with R 332 | 333 | - Speaker: Layik Hama ([\@layik](https://twitter.com/@layik)) 334 | - Materials: [slides](https://layik.github.io/presentations/eRum2020/slides.html) 335 | 336 | #### Dplyr snowflake integration for cloud based massive and fast data manipulation 337 | 338 | - Speaker: Massimiliano Silano 339 | - Materials: TBD 340 | 341 | #### Supporting R in the Binder Community 342 | 343 | - Speaker: Sarah Gibson 344 | - Materials: TBD 345 | 346 | ### Parallel Session 7 - Data 
Visualization & Shiny 347 | 348 | [:movie_camera: **Video**](https://youtu.be/r70YV-4ktIY) 349 | 350 | #### Transparent Journalism Through the Power of R 351 | 352 | - Speaker: Tatjana Kecojevic 353 | - Materials: TBD 354 | 355 | #### Elevating shiny module with {tidymodules} 356 | 357 | - Speaker: Mustapha Larbaoui 358 | - Materials: TBD 359 | 360 | #### Interactive visualization of complex texts 361 | 362 | - Speaker: Renate Delucchi Danhier 363 | - Materials: TBD 364 | 365 | #### What’s New in ShinyProxy 366 | 367 | - Speaker: Tobias Verbeke 368 | - Materials: TBD 369 | 370 | ### Parallel Session 8 - Machine Learning & Models 371 | 372 | [:movie_camera: **Video**](https://youtu.be/RhiEEeMfklM) 373 | 374 | #### Voronoi Linkage for Spatially Misaligned Data 375 | 376 | - Speaker: Luís G. Silva 377 | - Materials: TBD 378 | 379 | #### Astronomical source detection and background separation: a Bayesian nonparametric approach 380 | 381 | - Speaker: Andrea Sottosanti 382 | - Materials: TBD 383 | 384 | #### High dimensional sampling and volume computation 385 | 386 | - Speaker: Apostolos Chalkis 387 | - Materials: TBD 388 | 389 | #### BNPmix: a new package to estimate Bayesian nonparametric mixtures 390 | 391 | - Speaker: Riccardo Corradin 392 | - Materials: TBD 393 | 394 | ### Parallel Session 9 - R in Production 395 | 396 | [:movie_camera: **Video**](https://youtu.be/VQtEI-akVyk) 397 | 398 | #### Be proud of your code! Tools and patterns for making production-ready, clean R code 399 | 400 | - Speaker: Marcin Dubel ([\@DubelMarcin](https://twitter.com/@DubelMarcin)) 401 | - Materials: [slides](https://raw.githubusercontent.com/Appsilon/eRum2020/master/Marcin.%20Be%20proud%20of%20your%20code!.pdf) 402 | 403 | #### R alongside Airflow, Docker and Gitlab CI 404 | 405 | - Speaker: Matthias Bannert 406 | - Materials: TBD 407 | 408 | #### Using XGBoost, Plumber and Docker in production to power a new banking product 409 | 410 | - Speaker: André Rivenæs & Markus Mortensen 411 | - Materials: TBD 412 | 413 | #### Creating drag-and-drop shiny applications using sortable 414 | 415 | - Speaker: Andrie De Vries 416 | - Materials: TBD 417 | 418 | ### Parallel Session 10 - Machine Learning / Applications 419 | 420 | [:movie_camera: **Video**](https://youtu.be/MoM31cFoe8g) 421 | 422 | #### FastAI in R: preserving wildlife with computer vision 423 | 424 | - Speaker: Jędrzej Świeżewski 425 | - Materials: TBD 426 | 427 | #### Manifoldgstat: an R package for spatial statistics of manifold data 428 | 429 | - Speaker: Luca Torriani & Ilaria Sartori 430 | - Materials: TBD 431 | 432 | #### varycoef: Modeling Spatially Varying Coefficients 433 | 434 | - Speaker: Jakob Dambon ([\@JakobDambon](https://twitter.com/@JakobDambon)) 435 | - Materials: [slides](https://user.math.uzh.ch/dambon/talks/2020_eRum_varycoef.pdf) 436 | 437 | #### Computer Algebra Systems in R 438 | 439 | - Speaker: Mikkel Meyer Andersen 440 | - Materials: TBD 441 | 442 | ## Lightning Talks 443 | 444 | [:movie_camera: **Playlist**](https://www.youtube.com/playlist?list=PLX5iPLTva9UlmVS_t-A8OQicxYKODVfqH) 445 | 446 | ### Lightning Talks 1 - Data Visualization & Production 447 | 448 | [:movie_camera: **Video**](https://youtu.be/hFKwzJBXvIU) 449 | 450 | #### Time Series Missing Data Visualizations 451 | 452 | - Speaker: Steffen Moritz ([\@_SteffenMoritz](https://twitter.com/@_SteffenMoritz), [website](https://www.data-science-decaf.com)) 453 | - Materials: [presentation](materials/moritz_time-series-missing-value-visualizations.pdf), 
[package](http://steffenmoritz.github.io/imputeTS/), [github](https://github.com/SteffenMoritz/imputeTS) 454 | 455 | #### effectclass: an R package to interpret effects and visualise uncertainty 456 | 457 | - Speaker: Thierry Onkelinx ([github](https://github.com/thierryo), [blog](https://www.muscardinus.be)) 458 | - Materials: [presentation](materials/onkelinx_effectclass.pdf), [package documentation](https://effectclass.netlify.app/), [package source code](https://github.com/inbo/effectclass/) 459 | 460 | #### Supporting Twitter analytics application with graph-databases and the aRangodb package 461 | 462 | - Speaker: Gabriele Galatolo 463 | - Materials: TBD 464 | 465 | #### dm: working with relational data models in R 466 | 467 | - Speaker: Kirill Müller 468 | - Materials: TBD 469 | 470 | #### An innovative way to support your sales force 471 | 472 | - Speaker: Matilde Grecchi 473 | - Materials: TBD 474 | 475 | #### Keeping on top of R in Real-Time, High-Stakes trading systems 476 | 477 | - Speaker: Nicholas Jhirad 478 | - Materials: TBD 479 | 480 | ### Lightning Talks 2 - Life Sciences 481 | 482 | [:movie_camera: **Video**](https://youtu.be/79lUgThZ4HE) 483 | 484 | #### An enriched disease risk assessment model based on historical blood donors records 485 | 486 | - Speaker: Andrea Cappozzo ([\@AndreaCappozzo](https://twitter.com/AndreaCappozzo), [website](https://andreacappozzo.rbind.io)) 487 | - Materials: [slides](https://andreacappozzo.rbind.io/erum_2020_heartindata.html#1) 488 | 489 | #### A principal component analysis based method to detect biomarker captation from vibrational spectra 490 | 491 | - Speaker: Marco Calderisi 492 | - Materials: TBD 493 | 494 | #### ptmixed: an R package for flexible modelling of longitudinal overdispersed count data 495 | 496 | - Speaker: Mirko Signorelli ([\@signormirko](https://twitter.com/@signormirko), [website](https://mirkosignorelli.github.io)) 497 | - Materials: [slides](https://raw.githubusercontent.com/mirkosignorelli/erum2020/master/beamer-signorelli-erum2020.pdf), [resources](https://github.com/mirkosignorelli/erum2020#readme) 498 | 499 | #### Differential Enriched Scan 2 (DEScan2): an R pipeline for epigenomic analysis 500 | 501 | - Speaker: Dario Righelli 502 | - Materials: TBD 503 | 504 | #### Reproducible Data Visualization with CanvasXpress 505 | 506 | - Speaker: Ger Inberg ([\@g_inberg](https://twitter.com/@g_inberg)) 507 | - Materials: [slides](https://github.com/ginberg/conference-talks/raw/master/erum2020/reproducible_data_viz_cx.odp) 508 | 509 | #### Using open-access data to derive genome composition of emerging viruses 510 | 511 | - Speaker: Liam Brierley ([\@L_Brierley](https://twitter.com/@L_Brierley)) 512 | - Materials: [slides](https://drive.google.com/file/d/1Udi55KnfXxbXoyJMcOhdlRp1fB6HOBJp/view?usp=sharing) 513 | 514 | ### Lightning Talks 3 - Machine Learning 515 | 516 | [:movie_camera: **Video**](https://youtu.be/sHlj_H8K2RI) 517 | 518 | #### Ultra fast penalized regressions with R package {bigstatsr} 519 | 520 | - Speaker: Florian Privé ([\@privefl](https://twitter.com/@privefl)) 521 | - Materials: [slides](https://privefl.github.io/R-presentation/slides-eRum2020.html) 522 | 523 | #### Explaining black-box models with xspliner to make deliberate business decisions 524 | 525 | - Speaker: Krystian Igras ([\@krystian8207](https://twitter.com/@krystian8207)) 526 | - Materials: [slides](https://raw.githubusercontent.com/Appsilon/eRum2020/master/Krystian.%20Surrogate%20Models%20with%20xspliner.pdf) 527 | 528 | #### One-way 
non-normal ANOVA in reliability analysis using doex 529 | 530 | - Speaker: Mustafa Cavus 531 | - Materials: TBD 532 | 533 | #### Analyzing Preference Data with the BayesMallows Package 534 | 535 | - Speaker: Øystein Sørensen ([\@SorensenOystein](https://twitter.com/@SorensenOystein), [website](https://osorensen.rbind.io)) 536 | - Materials: [slides](https://osorensen.github.io/ERUM2020/BayesMallows/presentation.html), [blog post](https://osorensen.rbind.io/post/bayesmallows-presentation-at-the-european-r-user-meeting-2020) 537 | 538 | #### Flexible deep learning via the JuliaConnectoR 539 | 540 | - Speaker: Stefan Lenz ([website](https://www.uniklinik-freiburg.de/imbi-en/employees.html?imbiuser=lenz)) 541 | - Materials: [slides](materials/lenz-JuliaConnectoR.pdf), [article](https://arxiv.org/abs/2005.06334), [CRAN](https://CRAN.R-project.org/package=JuliaConnectoR), [GitHub](https://github.com/stefan-m-lenz/JuliaConnectoR) 542 | 543 | ### Lightning Talks 4 - Applications 544 | 545 | [:movie_camera: **Video**](https://youtu.be/KOYZPMMgiHo) 546 | 547 | #### rdwd: R interface to German Weather Service data 548 | 549 | - Speaker: Berry Boessenkool ([website](https://brry.github.io)) 550 | - Materials: slides [pdf](https://github.com/brry/rdwd/raw/master/misc/eRum/rdwd.pdf)/[source](https://github.com/brry/rdwd/tree/master/misc/eRum), [github](https://github.com/brry/rdwd#rdwd), [CRAN](https://cran.r-project.org/package=rdwd), [Homepage](https://bookdown.org/brry/rdwd) 551 | 552 | #### tv: Show Data Frames in the Browser 553 | 554 | - Speaker: Christoph Sax 555 | - Materials: TBD 556 | 557 | #### Predicting the Euro 2020 results using tournament rank probabilities scores from the socceR package 558 | 559 | - Speaker: Claus Ekstrøm 560 | - Materials: TBD 561 | 562 | #### Design your own quantum simulator with R 563 | 564 | - Speaker: Indranil Ghosh ([\@indraghosh314](https://twitter.com/indraghosh314), [website](https://indrag49.github.io/)) 565 | - Materials: [slides](https://speakerdeck.com/indrag49/design-your-own-quantum-simulator-with-r), [CRAN](https://cran.r-project.org/web/packages/QGameTheory/index.html) 566 | 567 | #### What are the potato eaters eating 568 | 569 | - Speaker: Keshav Bhatt 570 | - Materials: TBD 571 | 572 | #### Towards more structured data quality assessment in the process mining field: the DaQAPO package 573 | 574 | - Speaker: Niels Martin ([\@Niels_Martin](https://twitter.com/Niels_Martin)) 575 | - Materials: [slides](https://www.slideshare.net/NielsMartin/erum-2020-daqapo-235962835), [CRAN](https://cran.r-project.org/web/packages/daqapo/index.html) 576 | 577 | ## Shiny Demos 578 | 579 | [:movie_camera: **Playlist**](https://www.youtube.com/playlist?list=PLX5iPLTva9UlCY2qFZBSRydU1oPEAJNig) 580 | 581 | ### Shiny Demo 1 - Machine Learning & Applications 582 | 583 | [:movie_camera: **Video**](https://youtu.be/t8PZbP5b8EM) 584 | 585 | #### A demonstration of ABACUS: Apps Based Activities for Communicating and Understanding Statistics 586 | 587 | - Speaker: Mintu Nath 588 | - Materials: TBD 589 | 590 | #### tsviz: a data-scientist-friendly addin for RStudio 591 | 592 | - Speaker: Emanuele Fabbiani 593 | - Materials: [Medium Post](https://towardsdatascience.com/introducing-tsviz-interactive-time-series-visualization-in-r-studio-a96cde507a14), [GitHub repo](https://github.com/xtreamsrl/tsviz), [CRAN package](https://cran.r-project.org/package=tsviz) 594 | 595 | #### Media Shiny: Marketing Mix Models Builder 596 | 597 | - Speaker: Andrea Melloncelli 
([\@akirocode](https://twitter.com/akirocode)) 598 | - Materials: [Slides and Demo](https://docs.google.com/presentation/d/1MxgrmBQBenzBJs4vsmgLTgFcPtfporzafQXria4Y5Bo/edit?usp=sharing) 599 | 600 | #### Scoring the Implicit Association Test has never been easier: DscoreApp 601 | 602 | - Speaker: Ottavia M. Epifania ([LinkedIn](https://www.linkedin.com/in/ottavia-epifania-612ba9b3/), [Twitter](https://twitter.com/ExeOttavia), [Psicostat](https://ip146179.psy.unipd.it/psicostat/web/people.html)) 603 | - Materials: [Slides](https://github.com/OttaviaE/eRum2020/blob/master/eRumShinyDemo-Epifania.Rmd), [App repository](https://github.com/OttaviaE/DscoreApp), [DscoreApp](http://fisppa.psy.unipd.it/DscoreApp/) 604 | 605 | #### rTRhexNG: Hexagon sticker app for rTRNG 606 | 607 | - Speaker: Riccardo Porreca 608 | - Materials: [live app](https://miraisolutions.shinyapps.io/rTRhexNG), [GitHub repo](https://github.com/miraisolutions/rTRhexNG#readme) 609 | 610 | #### How green is your portfolio? Tracking CO2 footprint in the insurance sector 611 | 612 | - Speaker: Antoine Logean 613 | - Materials: TBD 614 | 615 | ### Shiny Demo 2 - Mobility & Spatial 616 | 617 | [:movie_camera: **Video**](https://youtu.be/sU_iT3Spc-Q) 618 | 619 | #### Visualising and Modelling Bike Sharing Mobility usage in the city of Milan 620 | 621 | - Speaker: Agostino Torti 622 | - Materials: TBD 623 | 624 | #### ESPRES: A shiny web tool to support River Basin Management planning in European Watersheds 625 | 626 | - Speaker: Angel Udias 627 | - Materials: TBD 628 | 629 | #### Mobility scan 630 | 631 | - Speaker: Josue Aduna 632 | - Materials: TBD 633 | 634 | #### Developing Shiny applications to facilitate precision agriculture workflows 635 | 636 | - Speaker: Lorenzo Busetto 637 | - Materials: TBD 638 | 639 | #### “GUInterp”: a Shiny GUI to support spatial interpolation 640 | 641 | - Speaker: [Luigi Ranghetti](https://luigi.ranghetti.info) 642 | - Materials: [documentation](https://guinterp.ranghetti.info), [source code](https://github.com/ranghetti/guinterp), [live demo](https://ranghetti.shinyapps.io/guinterp). 
643 | 644 | ## Poster Sessions 645 | 646 | [:movie_camera: **Playlist**](https://www.youtube.com/playlist?list=PLX5iPLTva9Ukknkenms-VDx94QlRcHgmM) 647 | 648 | ### Poster Session 1 - Life Sciences & Applications 649 | 650 | [:movie_camera: **Video**](https://youtu.be/HDLpw3gCouQ) 651 | 652 | #### A flexible dashboard for monitoring platform trials 653 | 654 | - Speaker: Alessio Crippa 655 | - Materials: TBD 656 | 657 | #### PRDA package: Enhancing Statistical Inference via Prospective and Retrospective Design Analysis 658 | 659 | - Speaker: Angela Andreella ([\@aangeella](https://twitter.com/@aangeella)) 660 | - Materials: [Poster materials](https://github.com/angeella/eRum_2020#readme) 661 | 662 | #### EasyReporting: a Bioconductor package for Reproducible Research implementation 663 | 664 | - Speaker: Dario Righelli 665 | - Materials: TBD 666 | 667 | #### NewWave: a scalable R package for the dimensionality reduction of single-cell RNA-seq 668 | 669 | - Speaker: Federico Agostinis 670 | - Materials: TBD 671 | 672 | #### Integrating professional software engineering practices in medical research software 673 | 674 | - Speaker: Patricia Ryser-Welch 675 | - Materials: TBD 676 | 677 | #### Dealing with changing administrative boundaries: The case of Swiss municipalities 678 | 679 | - Speaker: Tobias Schieferdecker 680 | - Materials: [poster](https://cynkra.github.io/SwissCommunes-poster/), [package-repository](https://github.com/cynkra/SwissCommunes), [poster-repository](https://github.com/cynkra/SwissCommunes-poster/) 681 | 682 | ### Poster Session 2 - DataViz & Machine Learning 683 | 684 | [:movie_camera: **Video**](https://youtu.be/dwIXqhk1NdM) 685 | 686 | #### Automate flexdashboard with GitHub 687 | 688 | - Speaker: Binod Jung Bogati 689 | - Materials: TBD 690 | 691 | #### orf: Ordered Random Forests 692 | 693 | - Speaker: [Gabriel Okasa](https://okasag.github.io/) 694 | - Materials: [Poster](https://okasag.github.io/assets/pdf/orf_erum_poster_okasa.pdf), [Slides](https://okasag.github.io/assets/pdf/orf_erum_milano_okasa.pdf), [Paper](https://arxiv.org/pdf/1907.02436.pdf), [Repository](https://github.com/okasag/orf), [CRAN](https://CRAN.R-project.org/package=orf) 695 | 696 | #### Power Supply health status monitoring dashboard 697 | 698 | - Speaker: Marco Calderisi 699 | - Materials: TBD 700 | 701 | #### First-year ICT students dropout predicting with R models 702 | 703 | - Speaker: Olga Dunajeva 704 | - Materials: TBD 705 | 706 | #### Benchmark Percentage Disjoint Data Splitting in Cross Validation for Assessing the Skill of Machine 707 | 708 | - Speaker: Olalekan Joseph Akintande 709 | - Materials: TBD 710 | 711 | #### badDEA: An R package for measuring firms’ efficiency adjusted by undesirable outputs 712 | 713 | - Speaker: Hervé Dakpo 714 | - Materials: TBD 715 | 716 | #### CorpFinder- a new application to identify Large Corporate Risks 717 | 718 | - Speaker: Aurélien Severino 719 | - Materials: [slides](materials/Severino-CorpFinder-Application_to_identify_Large_Corporate_Risks.pdf) 720 | 721 | ## Thematic Lounges 722 | 723 | ### Thematic Lounge 1 - Developing Software and Careers in Life Sciences 724 | 725 | #### Developing Software and Careers in Life Sciences 726 | 727 | - Chairs: Federico Marini ([\@FedeBioinfo](https://twitter.com/@FedeBioinfo)), Charlotte Soneson ([\@CSoneson](https://twitter.com/@CSoneson)), Davide Risso ([\@drisso1893](https://twitter.com/@drisso1893)) 728 | - Materials: 
https://docs.google.com/document/d/1R2lhSG7B3HqeGAWRqenvd9XdxWQ7JmGELUbmwY7a_ug/edit?usp=sharing 729 | 730 | ### Thematic Lounge 2 - Building Community & Diversity 731 | 732 | #### Building Community & Diversity 733 | 734 | - Chairs: Sara Iacozza ([\@IacozzaSara](https://twitter.com/@IacozzaSara)), Levi Waldron ([\@leviwaldron1](https://twitter.com/@leviwaldron1)), Janine Khuc ([\@Janinekhuc](https://twitter.com/@Janinekhuc)), Parvaneh Shafiei ([\@parvanehshafiei](https://twitter.com/@parvanehshafiei)) 735 | - Materials: https://docs.google.com/document/d/1HlzARyRmobvvYli99BL-ChWCorx1ogibJTlDLXU3KrY/edit?usp=sharing 736 | 737 | ### Thematic Lounge 3 - Data science: freelancing and business 738 | 739 | #### Data science: freelancing and business 740 | 741 | - Chairs: Enrico Deusebio ([\@enrico_deusebio](https://twitter.com/@enrico_deusebio)), Mariachiara Fortuna ([\@maryclaryf](https://twitter.com/@maryclaryf)), Riccardo L. Rossi ([\@ricreds](https://twitter.com/@ricreds)) 742 | - Materials: https://docs.google.com/document/d/1dEKXtALLaOCf2x-ZJIhsen5iZS0gZEZwG8JupkRqVKE/edit?usp=sharing 743 | 744 | ## Workshops 745 | 746 | [:movie_camera: **Playlist**](https://www.youtube.com/playlist?list=PLX5iPLTva9UnZmlzry0_YNB8rH85DuLRa) 747 | 748 | ### Morning Workshops 749 | 750 | #### Is R ready for Production? Let’s develop a Professional Shiny Application! 751 | 752 | - Instructor: Andrea Melloncelli ([\@akirocode](https://twitter.com/akirocode)) 753 | - Materials: [repository](https://github.com/vanlog/vincent) 754 | 755 | #### Image processing and computer vision with R 756 | 757 | - Instructor: [Lubomír Štěpánek](https://twitter.com/LubomirStepanek), Jiří Novák 758 | - Materials: [repository](https://github.com/LStepanek/eRum_2020_image_processing_and_computer_vision_with_R) 759 | 760 | #### Explanation and exploration of machine learning models with R 761 | 762 | - Instructor: Przemyslaw Biecek, Anna Kozak, Szymon Maksymiuk 763 | - Materials: https://github.com/pbiecek/XAIatERUM2020 764 | 765 | #### Non-disclosive federated analysis in R 766 | 767 | - Instructor: Patricia Ryser-Welch, Paul Burton, Demetris Avraam, Stuart Wheater, Olly Butters, Becca Wilson, Alex Westerberg, Leire Abarrategui-Martinez 768 | - [:movie_camera: Video](https://youtu.be/WpQFCfe1n_Q) 769 | - Materials: https://github.com/patRyserWelch8/DataSHIELD.eRUM.2020/wiki 770 | 771 | #### A unified approach for writing automatic reports: parameterization and generalization of R-Markdown 772 | 773 | - Instructor: Cristina Muschitiello, Niccolò Stamboglis 774 | - [:movie_camera: Video](https://youtu.be/XFHGBX5bN_E) 775 | - Materials: [repo (with slides inside each section)](https://github.com/muschitiello/e-Rum2020-Unified-Approach-For-Automatic-Reports) 776 | 777 | #### Build a website with blogdown in R 778 | 779 | - Instructor: Tatjana Kecojevic, Katarina Kosmina, Tijana Blagojev 780 | - Materials: [repository](https://github.com/TanjaKec/eRum2020), [website](https://websiteinr.netlify.app/), [slides](https://erum2020.netlify.app) 781 | 782 | ### Afternoon Workshops 783 | 784 | #### Reproducible workflows with the RENKU platform 785 | 786 | - Instructor: Christine Choirat, The Renku Development Team 787 | - Materials: TBD 788 | 789 | #### How to build htmlwidgets 790 | 791 | - Instructor: [John Coene](https://john-coene.com/) 792 | - Materials: [repository](https://github.com/JohnCoene/how-to-build-htmlwidgets), [website](https://htmlwidgets.john-coene.com/), [slides](https://htmlwidgets.john-coene.com/presentation/) 793 | 794 | 
#### Semantic Web in R for Data Scientists 795 | 796 | - Instructor: Goran Milovanović 797 | - Materials: https://github.com/datakolektiv/e-Rum2020_SemanticWeb 798 | 799 | #### Advanced User Interfaces for Shiny Developers 800 | 801 | - Instructor: [David Granjon](https://divadnojnarg.github.io/), Mustapha Larbaoui, Flavio Lombardo, Douglas Robinson 802 | - Materials: [repository](https://github.com/Novartis/Advanced-User-Interfaces-for-Shiny-Developers), 803 | [slides](https://rinterface.com/shiny/talks/eRum2020/dg/index.html), 804 | [RStudio Cloud](https://rstudio.cloud/project/1395473), 805 | [:blue_book: book](https://divadnojnarg.github.io/outstanding-shiny-ui/), 806 | [tidymodules](https://github.com/Novartis/tidymodules), 807 | [RinteRface](https://rinterface.com/) 808 | 809 | #### Bring your R Application Safely to Production. Collaborate, Deploy, Automate 810 | 811 | - Instructor: Riccardo Porreca, Peter Schmid 812 | - [:movie_camera: Video](https://youtu.be/dPc10Ka-L94) 813 | - Materials: [blog post](https://mirai-solutions.ch/news/2020/08/25/erum2020-workshop), [repository](https://github.com/miraisolutions/eRum2020Workshop), [slides](http://mirai-solutions.ch/eRum2020Workshop/workshop-slides.html) 814 | -------------------------------------------------------------------------------- /_bookdown.yml: -------------------------------------------------------------------------------- 1 | output_dir: _site 2 | -------------------------------------------------------------------------------- /children/contribution.Rmd: -------------------------------------------------------------------------------- 1 | ```{r, include = FALSE} 2 | # Child document rendering a single `contribution` with a given `contr_heading`, 3 | # both set by the parent document 4 | eval(envir = knitr::knit_global(), { 5 | stopifnot(exists("contribution")) 6 | stopifnot(exists("contr_heading")) 7 | }) 8 | stopifnot(nrow(contribution) == 1L) 9 | stopifnot(is.character(contr_heading), length(contr_heading) == 1L) 10 | ``` 11 | 12 | `r paste(contr_heading, contribution$title)` 13 | 14 | **`r contribution$author`**, *`r contribution$affiliation`* 15 | 16 | ```{r, include=FALSE} 17 | auth <- !is.na(contribution$author2) # render the second-author line only if present 18 | ``` 19 | 20 | ```{r, echo = FALSE, eval = auth} 21 | library(knitr) 22 | asis_output(paste0("\n**", contribution$author2, "**, *", contribution$affiliation2, "*")) 23 | ``` 24 | 25 | 26 | Track(s): `r contribution$track` 27 | 28 | **Abstract:** 29 | 30 | `r contribution$description` 31 | 32 | `r if (!is.na(contribution$coauthor)) {paste0("Coauthor(s): ", contribution$coauthor, ".")}` 33 | 34 | 35 | -------------------------------------------------------------------------------- /children/session_type.Rmd: -------------------------------------------------------------------------------- 1 | ```{r, include = FALSE} 2 | # Child document rendering the subset of all `contributions` for a given 3 | # `session_type` 4 | eval(envir = knitr::knit_global(), { 5 | stopifnot(exists("session_type")) 6 | stopifnot(exists("contributions")) 7 | }) 8 | ``` 9 | 10 | ## `r paste0(session_type, "s")` 11 | 12 | ```{r, echo = FALSE, results = 'asis', warning = FALSE} 13 | # heading level used for rendering individual contributions 14 | contr_heading <- "###" 15 | for (i in which(contributions$session_type %in% session_type)) { 16 | contribution <- contributions[i, , drop = FALSE] 17 | cat(knitr::knit_child( 18 | file.path("contribution.Rmd"), 19 | quiet = TRUE 20 | )) 21 | } 22 | ``` 23 | 
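The two child documents above are driven by a parent document (`index.Rmd`, which is not included in this listing). A minimal, hypothetical sketch of the parent-side chunk, assuming the confirmed program is read with jsonlite (per the `DESCRIPTION` Imports) and that child paths resolve relative to `children/`:

```r
# Hypothetical driver chunk (results = 'asis'): load the confirmed program and
# knit one session_type.Rmd child per session type; the children pick up
# `contributions` and `session_type` from knitr::knit_global().
contributions <- jsonlite::fromJSON("data/erum2020_confirmed_program.json")
for (session_type in unique(contributions$session_type)) {
  cat(knitr::knit_child(file.path("children", "session_type.Rmd"), quiet = TRUE))
}
```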
-------------------------------------------------------------------------------- /data/erum2020_confirmed_program.json: -------------------------------------------------------------------------------- 1 | [ 2 | { 3 | "title": "An enriched disease risk assessment model based on historical blood donors records", 4 | "author": "Andrea Cappozzo", 5 | "affiliation": "PhD student at University of Milan-Bicocca", 6 | "namesurname": "ANDREA,CAPPOZZO", 7 | "coauthor": "Edoardo Michielon, Alessandro De Bettin, Chiara D'Ignazio, Luigi Noto, Davide Drago, Alberto Prospero, Francesca De Chiara, Sergio Casartelli", 8 | "track": "R Applications", 9 | "session_type": "Lightning talk", 10 | "description": "Historically, the medical literature has largely focused on determining risk factors at an illness-specific level. Nevertheless, recent studies suggested that identical risk factors may cause the appearance of different diseases in different patients (Meijers & De Boer, 2019).\r\n\r\nThanks to the joint collaboration of Heartindata, a group of data scientists offering their passion and skills for social good, and Avis Milano, the Italian blood donor organization, an enriched disease risk assessment model is developed. Multiple risk factors and donations drop-out causes are collectively analyzed from AVIS longitudinal records, with the final aim of providing a broader and clearer overview of the interplay between risk factors and associated diseases in the blood donors population.", 11 | "email": "andrea.cappozzo@unimib.it" 12 | }, 13 | { 14 | "title": "rdwd: R interface to German Weather Service data", 15 | "author": "Berry Boessenkool", 16 | "affiliation": "R trainer & consultant", 17 | "namesurname": "BERRY,BOESSENKOOL", 18 | "track": "R Applications", 19 | "session_type": "Lightning talk", 20 | "description": "rdwd is an R package to handle data from the German Weather Service (DWD). It allows you to easily select, download and read observational data from over 6k weather stations. Both current data and historical records (partially dating back to the 1860s) are handled. Since about a year ago, gridded data from radar measurements can be read as well.", 21 | "email": "berryboessenkool@hotmail.com" 22 | }, 23 | { 24 | "title": "tv: Show Data Frames in the Browser", 25 | "author": "Christoph Sax", 26 | "affiliation": "R-enthusiast, economist @cynkra", 27 | "namesurname": "CHRISTOPH,SAX", 28 | "coauthor": "Kirill Müller", 29 | "track": "R World", 30 | "session_type": "Lightning talk", 31 | "description": "The tv package displays data frames live during data analysis.\r\nIt modifies the print method of data frames, tibbles or data tables to also appear in a browser or in the view pane of RStudio.\r\n\r\nThis is similar in spirit to the View() function in RStudio, works in other development environments, and has several advantages.\r\nChanges in a data frame are shown immediately and next to the script and the console output, rather than on top of them.\r\nThe display keeps the position and the width of columns if a modified data frame is shown in tv.\r\nIt is updated asynchronously, without interrupting the analysis workflow.", 32 | "email": "christoph@cynkra.com" 33 | }, 34 | { 35 | "title": "Predicting the Euro 2020 results using tournament rank probabilities scores from the socceR package", 36 | "author": "Claus Ekstrøm", 37 | "affiliation": "Statistician at University of Copenhagen. 
Longtime R hacker.", 38 | "namesurname": "CLAUS,EKSTRØM", 39 | "track": "R Applications", 40 | "session_type": "Lightning talk", 41 | "description": "The 2020 UEFA European Football Championship will be played this summer. Football championships are the source of almost endless predictions about the winner and the results of the individual matches, and we will show how the recently developed tournament rank probability score can be used to compare predictions.\r\n\r\nDifferent statistical models form the basis for predicting the result of individual matches. We present an R framework for comparing different prediction models and for comparing predictions about the Euro results. Everyone is encouraged to contribute their own function to make predictions for the result of the Euro 2020 championship.\r\n\r\nEach contributor will be shown how to provide two functions: a\r\nfunction that predicts the final score for a match between two teams with different skill levels, and a function that updates the\r\nskill levels based on the results of a given match. By supplying these two functions to the R framework the prediction results can be compared and the winner of the best football predictor can be found when Euro 2020 finishes.", 42 | "email": "ekstrom@sund.ku.dk" 43 | }, 44 | { 45 | "title": "Differential Enriched Scan 2 (DEScan2): an R pipeline for epigenomic analysis.", 46 | "author": "Dario Righelli", 47 | "affiliation": "Department of Statistics, University of Padua, Post-Doc", 48 | "namesurname": "DARIO,RIGHELLI", 49 | "coauthor": "Koberstein John, Gomes Bruce, Zhang Nancy, Angelini Claudia, Peixoto Lucia, Risso Davide", 50 | "track": "R Life Sciences", 51 | "session_type": "Lightning talk", 52 | "description": "We present DEScan2, an R/Bioconductor package for the differential enrichment analysis of epigenomic sequencing data.\r\nOur method consists of three steps: peak caller, peak consensus across samples, and peak signal quantification.\r\nThe peak caller is a standard moving scan window comparing the counts between a sliding window and a larger region outside the window, using a Poisson likelihood, providing a z-score for each peak. However, the package can work with any external peak caller: to this end, we provide additional functionalities to load peaks from bed files and handle them as internal optimized structures.\r\nThe consensus step aims to determine if a peak is a \"true peak\" based on its replicability across samples: we developed a filtering step to filter out those peaks not present in at least a user-given number of samples. A further threshold can be used over the peak z-scores.\r\nFinally, the third step produces a count matrix where each column is a sample and each row a previously filtered peak. 
The value of each matrix cell is the number of reads for the peak in the sample.\r\nFurthermore, our package provides several functionalities for common genomic data structure handling, for instance the possibility to split the data over the chromosomes to speed up the computations by parallelizing them on multiple CPUs.", 53 | "email": "dario.righelli@unipd.it" 54 | }, 55 | { 56 | "title": "Ultra fast penalized regressions with R package {bigstatsr}", 57 | "author": "Florian Privé", 58 | "affiliation": "Postdoc at Aarhus University", 59 | "namesurname": "FLORIAN,PRIVÉ", 60 | "track": "R Machine Learning & Models", 61 | "session_type": "Lightning talk", 62 | "description": "In this talk, I introduce the implementations of penalized linear and logistic regression in the R package {bigstatsr}.\r\nThese implementations use data stored on disk to handle very large matrices.\r\nThey automatically perform a procedure similar to cross-validation to choose the two hyper-parameters, λ and α, of the elastic net regularization, in parallel.\r\nThey employ an early stopping criterion to avoid fitting very expensive models, making these implementations on average 10 times faster than with {glmnet}.\r\nHowever, package {bigstatsr} does not implement all the many models and options provided by the excellent package {glmnet}; some are areas of future development.", 63 | "email": "florian.prive.21@gmail.com" 64 | }, 65 | { 66 | "title": "Supporting Twitter analytics application with graph-databases and the aRangodb package", 67 | "author": "Gabriele Galatolo", 68 | "affiliation": "Kode Srl, Software Developer & Data Scientist", 69 | "namesurname": "GABRIELE,GALATOLO", 70 | "coauthor": "Francesca Giorgolo, Ilaria Ceppa, Marco Calderisi, Davide Massidda, Matteo Papi, Andrea Spinelli, Andrea Zedda, Jacopo Baldacci, Caterina Giacomelli", 71 | "track": "R Applications", 72 | "session_type": "Lightning talk", 73 | "description": "The importance of finding efficient ways to model and to store unstructured data has incredibly grown in the last decade, in particular with the strong expansion of social-media services. Among these storage tools, an increasingly important class of databases is represented by the graph-oriented databases, where relationships between data are considered first-class citizens.\r\nTo help analysts and data scientists interact with this paradigm in a simple way, last year we developed the package aRangodb, an interface to the graph-oriented database ArangoDB.\r\nTo show the capabilities of the package and of the underlying way to model data using graphs, we present Tweetmood, a tool to analyze and visualize tweets from Twitter.\r\nIn this talk, we will present some of the most significant features of the package applied in the Tweetmood context, such as functionalities to traverse the graph and some examples in which the user can process those graphs to get new information that can easily be stored using the functions and the tools available in the package.", 74 | "email": "g.galatolo@kode-solutions.net" 75 | }, 76 | { 77 | "title": "Reproducible Data Visualization with CanvasXpress", 78 | "author": "Ger Inberg", 79 | "affiliation": "Freelance Analytics Developer", 80 | "namesurname": "GER,INBERG", 81 | "track": "R Dataviz & Shiny", 82 | "session_type": "Lightning talk", 83 | "description": "canvasXpress was developed as the core visualization component for bioinformatics and systems biology analysis at Bristol-Myers Squibb. 
It supports a large number of visualizations to display scientific and non-scientific data. canvasXpress also includes a simple and unobtrusive user interface to explore complex data sets, a sophisticated and unique mechanism to keep track of all user customization for Reproducible Research purposes, as well as an 'out of the box' broadcasting capability to synchronize selected data points in all canvasXpress plots in a page. Data can be easily sorted, grouped, transposed, transformed or clustered dynamically. The fully customizable mouse events as well as the zooming, panning and drag-and-drop capabilities are features that make this library unique in its class.", 84 | "email": "ginberg@gmail.com" 85 | }, 86 | { 87 | "title": "Design your own quantum simulator with R", 88 | "author": "Indranil Ghosh", 89 | "affiliation": "Final-year postgraduate student in the Department of Physics, Jadavpur University, Kolkata, India", 90 | "namesurname": "INDRANIL,GHOSH", 91 | "track": "R Applications", 92 | "session_type": "Lightning talk", 93 | "description": "The main idea of the project is to use the R ecosystem to write computer code for designing a quantum simulator that can simulate different quantum algorithms. I will start by giving a brief introduction to the linear algebra needed for quantum computation, and show how to write your own R code from scratch to implement it. Then I will dive into implementing simple quantum circuits, starting with initializing qubits and terminating with a measurement. I will also implement simple quantum algorithms, concluding with a brief intro to quantum game theory and its simulation with R.", 94 | "email": "indranilg49@gmail.com" 95 | }, 96 | { 97 | "title": "What are the potato eaters eating", 98 | "author": "Keshav Bhatt", 99 | "affiliation": "R-fan and independent researcher", 100 | "namesurname": "KESHAV,BHATT", 101 | "track": "R Applications, R Dataviz & Shiny", 102 | "session_type": "Lightning talk", 103 | "description": "Although stereotypes can be quite useful, they are often not correct. For instance, the Dutch are stereotyped as being potato eaters. While this might have been historically correct, it is not currently accurate. The Dutch eat potatoes sparingly, and this paper uses data to disprove the stereotype. To get an impression of Dutch food habits, a popular local website was scraped. Besides being popular, the website hosts user-generated content, giving a good proxy of Dutch taste buds.\r\nAs was apparent on the website, lasagna is the most popular dish. Detailed NLP analysis of more than 50,000 recipes showed that potato-based dishes are in fact nowhere near the top. This vindicated my belief. Moreover, it shows that the Dutch kitchen is globalizing. The tomato, a hallmark of southern Europe, is more popular than the Dutch potato. Also observed is the popularity of many herbs in the recipes, which are not a traditional component of the Dutch kitchen. \r\nThe world is changing, and our kitchens too.\r\nThis trend will be explored for other countries as well.", 104 | "email": "bhatt.keshav@gmail.com" 105 | }, 106 | { 107 | "title": "dm: working with relational data models in R", 108 | "author": "Kirill Müller", 109 | "affiliation": "Clean code, tidy data.
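The quantum-simulator abstract above reduces to a little linear algebra, which base R handles directly. A minimal sketch (no package assumed): a qubit is a length-2 complex vector, gates are unitary matrices, and measurement follows the Born rule.

```r
# A qubit |0> and the Hadamard gate, in plain base R
ket0 <- c(1 + 0i, 0 + 0i)
H <- matrix(c(1, 1, 1, -1), nrow = 2) / sqrt(2)

psi <- H %*% ket0          # put the qubit into an equal superposition
probs <- Mod(psi)^2        # Born rule: measurement probabilities (0.5, 0.5)
sample(c("0", "1"), size = 1, prob = probs)  # one simulated measurement
```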
Consulting for cynkra, coding in the open.", 110 | "namesurname": "KIRILL,MÜLLER", 111 | "track": "R Applications, R Production, R World", 112 | "session_type": "Lightning talk", 113 | "description": "Storing all data related to a problem in a single table or data frame (\"the dataset\") can result in many repetitive values. Separation into multiple tables helps data quality but requires \"merge\" or \"join\" operations. {dm} is a new package that fills a gap in the R ecosystem: it makes working with multiple tables just as easy as working with a single table.\r\n\r\nA \"data model\" consists of tables (both the definition and the data), and primary and foreign keys. The {dm} package combines these concepts with data manipulation powered by the tidyverse: entire data models are handled in a single entity, a \"dm\" object.\r\n\r\nThree principal use cases for {dm} can be identified:\r\n\r\n1. When you consume a data model, {dm} helps access and manipulate a dataset consisting of multiple tables (database or local data frames) through a consistent interface.\r\n\r\n2. When you use a third-party dataset, {dm} helps normalize the data to remove redundancies as part of the cleaning process.\r\n\r\n3. To create a relational data model, you can prepare the data using R and familiar tools and seamlessly export to a database.\r\n\r\nThe presentation revolves around these use cases and shows a few applications. The {dm} package is available on GitHub and will be submitted to CRAN in early February.", 114 | "email": "kirill@cynkra.com" 115 | }, 116 | { 117 | "title": "Explaining black-box models with xspliner to make deliberate business decisions", 118 | "author": "Krystian Igras", 119 | "affiliation": "Data Scientist and Software Engineer at Appsilon", 120 | "namesurname": "KRYSTIAN,IGRAS", 121 | "track": "R Machine Learning & Models", 122 | "session_type": "Lightning talk", 123 | "description": "The vast majority of state-of-the-art ML algorithms are black boxes, meaning it is difficult to understand their inner workings. The more that algorithms are used as decision support systems in everyday life, the greater the necessity of understanding the underlying decision rules. This is important for many reasons, including regulatory issues as well as making sure that the model learned sensible features. You can achieve all that with the xspliner R package that I have created.\r\n\r\nOne of the most promising methods to explain models is building surrogate models. This can be achieved by inferring Partial Dependence Plot (PDP) curves from the black box model and building Generalised Linear Models based on these curves. The advantage of this approach is that it is model agnostic, which means you can use it regardless of what methods you used to create your model.\r\n\r\nFrom this presentation, you will learn what PDP curves and GLMs are and how you can calculate them based on black box models. We will take a look at an interesting business use case in which we'll find out whether the original black box model or the surrogate one is a better decision system for our needs.
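To make the {dm} workflow above concrete, here is a small sketch using the package's key verbs; the two toy tables are invented for illustration, and argument details may differ slightly across {dm} versions.

```r
library(dm)
library(dplyr)

# Two toy tables sharing a key (invented for illustration)
airlines <- tibble::tibble(carrier = c("AA", "UA"), name = c("American", "United"))
flights  <- tibble::tibble(flight = 1:4, carrier = c("AA", "AA", "UA", "UA"))

model <- dm(airlines, flights) %>%
  dm_add_pk(airlines, carrier) %>%                  # declare a primary key
  dm_add_fk(flights, carrier, ref_table = airlines) # declare a foreign key

dm_examine_constraints(model)          # check key integrity across tables
model %>% dm_flatten_to_tbl(flights)   # join everything reachable from 'flights'
```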
Finally, we will see an example of how you can explain your models using this approach with the xspliner package for R (already available on CRAN!).", 124 | "email": "krystian@appsilon.com" 125 | }, 126 | { 127 | "title": "Using open-access data to derive genome composition of emerging viruses", 128 | "author": "Liam Brierley", 129 | "affiliation": "MRC Skills Development Fellow, University of Liverpool", 130 | "namesurname": "LIAM,BRIERLEY", 131 | "coauthor": "Anna Auer-Fowler, Maya Wardeh, Matthew Baylis, Prudence Wong", 132 | "track": "R Life Sciences", 133 | "session_type": "Lightning talk", 134 | "description": "Outbreaks of new viruses continue to threaten global health, including pandemic influenza, Ebola virus, and the novel coronavirus ‘nCoV-2019’. Advances in genome sequencing allow access to virus RNA sequences on an unprecedented scale, representing a powerful tool for epidemiologists to understand new viral outbreaks. \r\n\r\nWe use NCBI’s GenBank, a curated open-access repository containing >200 million genetic sequences (3 million viral sequences) directly submitted by users, representing many individual studies. However, the resulting breadth of data and inconsistencies in metadata present consistent challenges.\r\n\r\nWe demonstrate our approach using R to address these challenges and the need for reproducibility as data volumes increase. Firstly, we use `rentrez` to programmatically search, filter, and obtain virus sequences from GenBank. Secondly, we use `taxize` to resolve pervasive problems of naming conflicts, as virus names are often recorded differently between entries, partly because virus classification is complex and regularly revised. We successfully resolve 428 mammal and bird RNA viruses to species level before extracting sequences.\r\n\r\nObtaining genome sequences of a large inventory of viruses allows us to estimate genomic composition biases, which show promise in predicting virus epidemiology. Ultimately, this pathway will allow better quantification of future epidemic threats.", 135 | "email": "liam.brierley@liverpool.ac.uk" 136 | }, 137 | { 138 | "title": "A principal component analysis based method to detect biomarker captation from vibrational spectra", 139 | "author": "Marco Calderisi", 140 | "affiliation": "Kode srl, CTO", 141 | "namesurname": "MARCO,CALDERISI", 142 | "coauthor": "Francesca Giorgolo, Ilaria Ceppa, Davide Massidda, Matteo Papi, Gabriele Galatolo, Andre Spinelli, Andrea Zedda, Jacopo Baldacci, Caterina Giacomelli, marco cecchini, matteo agostini", 143 | "track": "R Life Sciences", 144 | "session_type": "Lightning talk", 145 | "description": "BRAIKER is a microfluidics-based biosensor aimed at detecting biomarkers. The device is responsive to changes in mass and viscosity over its surface. When selected markers react with the sensor, a variation of resonant acoustic frequencies (called harmonics) is produced. A serious problem when examining the data produced by biosensors is the subjectivity of standard methods for evaluating the pattern of harmonics. In our research, a method based on principal component analysis has been applied to vibrational data. An R-Shiny application was developed in order to present data visualizations and multivariate analyses of vibrational spectra. The Shiny application allows the user to clean and explore data by using interactive data visualisation tools.
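As a flavour of the programmatic GenBank access described in the abstract above, here is a minimal {rentrez} sketch; the search term and retmax are illustrative choices, not the study's actual query.

```r
library(rentrez)

# Search the NCBI nucleotide database for Ebola virus genomes (illustrative query)
res <- entrez_search(db = "nuccore",
                     term = "Zaire ebolavirus[Organism] AND complete genome[Title]",
                     retmax = 5)

# Fetch the matching records in FASTA format
fasta <- entrez_fetch(db = "nuccore", id = res$ids, rettype = "fasta")
cat(substr(fasta, 1, 200))  # peek at the first record header
```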
The principal component analysis is applied to simultaneously analyse the full set of frequencies for multiple experimental runs, reducing the multivariate data set to a small number of components accounting for a share of variance close to that of the original data. Functionalised and non-functionalised resonating foils of the biosensor can be classified in order to validate the capability of the device to detect biomarkers, lowering the LOD and increasing sensitivity and resolution.", 146 | "email": "m.calderisi@kode-solutions.net" 147 | }, 148 | { 149 | "title": "An innovative way to support your sales force", 150 | "author": "Matilde Grecchi", 151 | "affiliation": "Head of Data Science & Innovation @ZucchettiSpa", 152 | "namesurname": "MATILDE,GRECCHI", 153 | "track": "R Production, R Dataviz & Shiny, R Machine Learning & Models, R Applications", 154 | "session_type": "Lightning talk", 155 | "description": "Explanation of the web application realized in Shiny and deployed in production to support the sales force of Zucchetti. An overview of the overall steps followed, from data ingestion to modeling, from validation of the model to Shiny web-app realization, from deployment in production to continuous learning thanks to feedback coming from the sales force and customer redemption. All the code is written in R using RStudio. The deployment of the app is done with ShinyProxy.io.", 156 | "email": "ing.maty@gmail.com" 157 | }, 158 | { 159 | "title": "ptmixed: an R package for flexible modelling of longitudinal overdispersed count data", 160 | "author": "Mirko Signorelli", 161 | "affiliation": "Dept. of Biomedical Data Sciences, Leiden University Medical Center", 162 | "namesurname": "MIRKO,SIGNORELLI", 163 | "coauthor": "Roula Tsonaka, Pietro Spitali", 164 | "track": "R Machine Learning & Models, R Life Sciences", 165 | "session_type": "Lightning talk", 166 | "description": "Overdispersion is a commonly encountered feature of count data, and it is usually modelled using the negative binomial (NB) distribution. However, not all overdispersed distributions are created equal: while some are severely zero-inflated, others exhibit heavy tails.\r\nMounting evidence from many research fields suggests that often NB models cannot fit heavy-tailed or zero-inflated counts sufficiently well. It has been proposed to solve this problem by using the more flexible Poisson-Tweedie (PT) family of distributions, of which the NB is a special case. However, current methods based on the PT can only handle cross-sectional datasets and no extension for correlated data is available.\r\nTo overcome this limitation we propose a PT mixed-effects model that can be used to flexibly model longitudinal overdispersed counts. To estimate this model we develop a computational pipeline that uses adaptive quadratures to accurately approximate the likelihood of the model, and numeric optimization methods to maximize it. We have implemented this approach in the R package ptmixed, which is published on CRAN.\r\nBesides showcasing the package’s functionalities, we will present an assessment of the accuracy of our estimation procedure, and provide an example application where we analyse longitudinal RNA-seq data, which often exhibit high levels of zero-inflation and heavy tails. \r\n\r\nReference: Signorelli, M., Spitali, P., Tsonaka, R. (2020, in press). Poisson-Tweedie mixed-effects model: a flexible approach for the analysis of longitudinal RNA-seq data. To appear in *Statistical Modelling*.
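For readers curious how the Poisson-Tweedie mixed model above is fitted in practice, here is a hedged sketch. The data are simulated, and the argument names (fixef.formula, id, npoints) follow the {ptmixed} documentation as I recall it; treat the exact signature as an assumption.

```r
library(ptmixed)

# Hypothetical longitudinal count data: one row per subject per time point
df <- data.frame(
  y     = rnbinom(60, mu = 5, size = 1),   # overdispersed counts
  time  = rep(1:6, times = 10),
  group = rep(c("A", "B"), each = 30),
  id    = rep(1:10, each = 6)
)

# Poisson-Tweedie mixed model with a random intercept per subject
fit <- ptmixed(fixef.formula = y ~ group * time, id = id, data = df,
               npoints = 5)  # quadrature points for the likelihood approximation
summary(fit)
```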
arXiv preprint: [arXiv:2004.11193](https://arxiv.org/abs/2004.11193)", 167 | "email": "m.signorelli@lumc.nl" 168 | }, 169 | { 170 | "title": "One-way non-normal ANOVA in reliability analysis using doex", 171 | "author": "Mustafa CAVUS", 172 | "affiliation": "PhD Student @Eskisehir Technical University", 173 | "namesurname": "MUSTAFA,CAVUS", 174 | "coauthor": "Berna YAZICI", 175 | "track": "R Production, R Life Sciences, R Applications", 176 | "session_type": "Lightning talk", 177 | "description": "One-way ANOVA is used for testing the equality of several population means in statistics, and current R packages provide functions to apply it. However, violations of its assumptions of normality and variance homogeneity limit its use, and even rule it out in some cases. doex provides alternative statistical methods to solve this problem. It has several tests based on generalized p-value, parametric bootstrap and fiducial approaches for violations of variance homogeneity and normality. Moreover, it provides newly proposed methods for testing the equality of mean lifetimes under different failure rates. \r\n\r\nThis talk introduces the doex package, which provides several methods for testing the equality of population means independently of the strict assumptions of ANOVA. An illustrative example is given for testing the equality of mean product lifetimes under different failure rates.", 178 | "email": "mustafacavus@eskisehir.edu.tr" 179 | }, 180 | { 181 | "title": "Keeping on top of R in Real-Time, High-Stakes trading systems", 182 | "author": "Nicholas Jhirad", 183 | "affiliation": "Senior Data Scientist, CINQ ICT (on Contract to Pinnacle Sports)", 184 | "namesurname": "NICHOLAS,JHIRAD", 185 | "coauthor": "Aaron Jacobs", 186 | "track": "R Production", 187 | "session_type": "Lightning talk", 188 | "description": "Visibility is the key to production. For R to work inside that environment, we need ubiquitous logging. I'll share insights from our experience building a production-grade R stack and monitoring all of our R applications via syslog, the 'rsyslog' package (on CRAN) and Splunk.", 189 | "email": "shapenaji@gmail.com" 190 | }, 191 | { 192 | "title": "Towards more structured data quality assessment in the process mining field: the DaQAPO package", 193 | "author": "Niels Martin", 194 | "affiliation": "Postdoctoral researcher Research Foundation Flanders (FWO) - Hasselt University", 195 | "namesurname": "NIELS,MARTIN", 196 | "coauthor": "Niels Martin (Research Foundation Flanders FWO - Hasselt University), Greg Van Houdt (Hasselt University), Gert Janssenswillen (Hasselt University)", 197 | "track": "R Applications", 198 | "session_type": "Lightning talk", 199 | "description": "Process mining is a research field focusing on the extraction of insights on business processes from process execution data embedded in files called event logs. Event logs are a specific data structure originating from information systems supporting a business process such as an Enterprise Resource Planning System or a Hospital Information System. As a research field, process mining has predominantly focused on the development of algorithms to retrieve process insights from an event log. However, consistent with the “garbage in, garbage out” principle, the reliability of the algorithm’s outcomes strongly depends upon the data quality of the event log. It has been widely recognized that real-life event logs typically suffer from a multitude of data quality issues, stressing the need for thorough data quality assessment.
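The 'rsyslog' package mentioned in the logging abstract above has a deliberately small surface. Here is a minimal sketch of the kind of setup it describes; the function names match the package README as I recall it, so treat the exact signatures as assumptions.

```r
library(rsyslog)

# Open a connection to the system logger under an application name
open_syslog("my_shiny_app")

# Emit log lines that syslog (and a downstream Splunk indexer) will pick up
syslog("application started", level = "INFO")
syslog("model scoring failed for request 42", level = "ERR")

close_syslog()
```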
Currently, event log quality is often judged on an ad-hoc basis, entailing the risk that important issues are overlooked. Hence the need for a more structured data quality assessment approach within the process mining field. Therefore, the DaQAPO package has been developed, which is an acronym for Data Quality Assessment of Process-Oriented data. It offers an extensive set of functions to automatically identify common data quality problems in process execution data. In this way, it is the first R package that supports systematic data quality assessment for event data.", 200 | "email": "niels.martin@uhasselt.be" 201 | }, 202 | { 203 | "title": "Analyzing Preference Data with the BayesMallows Package", 204 | "author": "Øystein Sørensen", 205 | "affiliation": "Associate Professor, University of Oslo", 206 | "namesurname": "ØYSTEIN,SØRENSEN", 207 | "coauthor": "Marta Crispino, Qinghua Liu, Valeria Vitelli", 208 | "track": "R Machine Learning & Models", 209 | "session_type": "Lightning talk", 210 | "description": "BayesMallows is an R package for analyzing preference data in the form of rankings with the Mallows rank model, and its finite mixture extension, in a Bayesian framework. The model is grounded on the idea that the probability density of an observed ranking decreases exponentially with the distance to the location parameter. It is the first Bayesian implementation that allows a wide choice of distances, and it works well with a large number of items to be ranked. BayesMallows handles non-standard data: partial rankings and pairwise comparisons, even in cases including non-transitive preference patterns. The Bayesian paradigm allows coherent quantification of posterior uncertainties of estimates of any quantity of interest. These posteriors are fully available to the user, and the package comes with convenient tools for summarizing and visualizing the posterior distributions.\r\n\r\nThis talk will focus on how the BayesMallows package can be used to analyze preference data, in particular how the Bayesian paradigm allows endless possibilities in answering questions of interest with the help of visualization of posterior distributions. Such posterior summaries can easily be communicated with scientific collaborators and business stakeholders who may not be machine learning experts themselves.", 211 | "email": "oystein.sorensen.1985@gmail.com" 212 | }, 213 | { 214 | "title": "Predicting Business Cycle Fluctuations Using Text Analytics", 215 | "author": "Sami Diaf", 216 | "affiliation": "Researcher at the University of Hamburg", 217 | "namesurname": "SAMI,DIAF", 218 | "track": "R Machine Learning & Models", 219 | "session_type": "Lightning talk", 220 | "description": "The use of computational linguistics proved to be crucial in studying macroeconomic forecasts and understanding the essence of such exercises. \r\n\r\nCombining machine learning algorithms with text mining pipelines helps dissect potential patterns of forecast errors and investigate the role of ideology in such outcomes.\r\n\r\nThe Priority Program “Exploring the Experience-Expectation Nexus” builds up, from a large database of German business cycle reports, advanced topic models and predictive analytics to investigate the role of ideology in the production of macroeconomic forecasts.
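A minimal taste of the BayesMallows workflow described above, using the potato_visual ranking data bundled with the package. Note that recent releases changed the interface (newer versions may expect data prepared with setup_rank_data()), so treat this classic call as an assumption.

```r
library(BayesMallows)

# potato_visual ships with the package: 12 assessors ranking 20 potatoes by weight
fit <- compute_mallows(rankings = potato_visual)

# Discard burn-in, then summarize and visualize the posterior
fit$burnin <- 500
plot(fit)  # posterior of the scale parameter alpha
compute_posterior_intervals(fit, parameter = "alpha")
```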
The pipelines call for advanced data processing, predicting business fluctuations from text covariates, measuring ideological stances of forecasters and explaining what influences forecast errors.", 221 | "email": "sami.diaf@uni-hamburg.de" 222 | }, 223 | { 224 | "title": "Flexible deep learning via the JuliaConnectoR", 225 | "author": "Stefan Lenz", 226 | "affiliation": "Statistician at the Institute of Medical Biometry and Statistics (IMBI), Faculty of Medicine and Medical Center – University of Freiburg", 227 | "namesurname": "STEFAN,LENZ", 228 | "coauthor": "Harald Binder", 229 | "track": "R Machine Learning & Models", 230 | "session_type": "Lightning talk", 231 | "description": "For deep learning in R, frameworks from other languages, e.g. from Python, are widely used. Julia is another language that offers computational speed and a growing ecosystem for machine learning, e.g. with the package “Flux”. Integrating functionality of Julia in R is especially promising due to the many commonalities of Julia and R. We take advantage of these in the design of our “JuliaConnectoR” R package, which aims at a tight integration of Julia in R. We would like to present our package, together with some deep learning examples.\r\nThe JuliaConnectoR can import Julia functions, also from whole packages, and make them directly callable in R. Values and data structures are translated between the two languages. This includes the management of objects holding external resources such as memory pointers. The possibility to pass R functions as arguments to Julia functions makes the JuliaConnectoR a truly functional interface. Such callback functions can, e.g., be used to interactively display the learning process of a neural network in R while it is trained in Julia. Among others, this feature sets the JuliaConnectoR apart from the other R packages for integrating Julia in R, “XRJulia” and “JuliaCall”. This becomes possible with an optimized communication protocol, which also allows a highly efficient data transfer, leveraging the similarities in the binary representation of values in Julia and R.", 232 | "email": "lenz@imbi.uni-freiburg.de" 233 | }, 234 | { 235 | "title": "Time Series Missing Data Visualizations", 236 | "author": "Steffen Moritz", 237 | "affiliation": "Institute for Data Science, Engineering, and Analytics, TH Köln", 238 | "namesurname": "STEFFEN,MORITZ", 239 | "coauthor": "Thomas Bartz-Beielstein", 240 | "track": "R Dataviz & Shiny, R Applications", 241 | "session_type": "Lightning talk", 242 | "description": "Missing data is a quite common problem for time series, which usually also complicates later analysis steps.\r\nIn order to deal with this problem, visualizing the missing data is a very good start.
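A minimal sketch of the JuliaConnectoR workflow described above; it assumes a local Julia installation (and, for the last lines, the Flux package installed in Julia). juliaEval(), juliaCall() and juliaImport() are the package's documented entry points.

```r
library(JuliaConnectoR)

juliaEval("1 + 1")         # evaluate Julia code from R
juliaCall("sqrt", 2)       # call a Julia function with R arguments

# Import a whole Julia package; its functions become callable from R
Flux <- juliaImport("Flux")   # assumes Flux is installed in Julia
dense <- Flux$Dense(4L, 2L)   # a Julia neural-network layer, held from R
```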
\r\n\r\nVisualizing the patterns in the missing data can provide more information about the reasons for the missing data and give hints on how to best proceed with the analysis.\r\n\r\nThis talk gives a short intro to the new plotting functions introduced with version 3.1 of the imputeTS CRAN package.", 243 | "email": "steffen.moritz10@gmail.com" 244 | }, 245 | { 246 | "title": "effectclass: an R package to interpret effects and visualise uncertainty", 247 | "author": "Thierry Onkelinx", 248 | "affiliation": "Statistician at the Research Institute for Nature and Forest", 249 | "namesurname": "THIERRY,ONKELINX", 250 | "track": "R Dataviz & Shiny", 251 | "session_type": "Lightning talk", 252 | "description": "The package classifies effects by comparing their confidence interval with a reference, a lower and an upper threshold, all of which are set by the user a priori. The null hypothesis is a good choice as reference. The lower and upper threshold define a region around the reference in which the effect is small enough to be irrelevant. These thresholds are ideally based on the effect size used in the statistical power analysis of the design. Otherwise they can be based on expert judgement.\r\n\r\nThe result is a ten-class classification of the effect. Three classes exist for significant effects above the reference and three classes for significant effects below the reference. The remaining four classes split the non-significant effects. The most important distinction is between “no effect” and “unknown effect”.\r\n\r\neffectclass provides ggplot2 add-ons stat_effect() and scale_effect() to visualise the effects as points with shapes depending on the classification. It provides stat_fan(), which displays the uncertainty as multiple overlapping intervals with different confidence probabilities. stat_fan() is inspired by Britton, E.; Fisher, P. & J. Whitley (1998).\r\n\r\nMore details on the package website: https://effectclass.netlify.com/\r\n\r\nBritton, E.; Fisher, P. & J. Whitley (1998). The Inflation Report Projections: Understanding the Fan Chart. Bank of England Quarterly Bulletin.", 253 | "email": "thierry.onkelinx@inbo.be" 254 | }, 255 | { 256 | "title": "A flexible dashboard for monitoring platform trials", 257 | "author": "Alessio Crippa", 258 | "affiliation": "Karolinska Institutet, postdoc", 259 | "namesurname": "ALESSIO,CRIPPA", 260 | "coauthor": "Andrea Discacciati, Erin Gabriel, Martin Eklund", 261 | "track": "R Applications", 262 | "session_type": "Poster", 263 | "description": "The Data and Safety Monitoring Board (DSMB) is an essential component for a successful clinical trial. It consists of an independent group of experts that periodically revise and evaluate the accumulating data from an ongoing trial to assess patients’ safety, study progress, and drug efficacy. Based on their evaluation, a recommendation to continue, modify or stop the trial will be delivered to the trial’s sponsor. It is essential to provide the DSMB with the best visualization tools for monitoring the live data from the trial on a regular basis.\r\nWe designed and developed an interactive dashboard using flexdashboard for R as a tool for assisting the DSMB in the evaluation of the results of the ProBio study, a clinical platform for improving treatment decisions in patients with metastatic castrate-resistant prostate cancer.
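For the imputeTS talk above, the 3.1-series plotting functions can be sketched in a few lines; tsAirgap is a demo series shipped with the package, and the function names follow the CRAN documentation.

```r
library(imputeTS)

# tsAirgap: monthly airline passenger counts with missing values, bundled with imputeTS
ggplot_na_distribution(tsAirgap)   # where in the series the values are missing
ggplot_na_gapsize(tsAirgap)        # how long the gaps are

# Visualize imputed values against the observed series
imp <- na_interpolation(tsAirgap)
ggplot_na_imputations(tsAirgap, imp)
```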
We will focus on the customized structure for best displaying the most interesting variables and the adoption of interactive tools as a particularly useful aid for the assessment of the ongoing data. We will also cover the connection to the data sources, the automatic generation process, and the permissions granted to DSMB members for accessing the dashboard.", 264 | "email": "alessio.crippa@ki.se" 265 | }, 266 | { 267 | "title": "PRDA package: Enhancing Statistical Inference via Prospective and Retrospective Design Analysis.", 268 | "author": "Angela Andreella", 269 | "affiliation": "University of Padua", 270 | "namesurname": "ANGELA,ANDREELLA", 271 | "coauthor": "Vesely Anna, Zandonella Callegher Claudio, Pastore Massimiliano, Altoè Gianmarco", 272 | "track": "R Life Sciences", 273 | "session_type": "Poster", 274 | "description": "There is a growing recognition of the importance of power analysis and calculation of the appropriate sample size when planning a research experiment. However, power analysis is not the only relevant aspect of the design of an experiment. Other inferential risks, such as the probability of estimating the effect in the wrong direction or the average overestimation of the actual effects, are also important. The evaluation of these inferential risks as well as the statistical power, in what Gelman and Carlin (2014) defined as Design Analysis, may help researchers to make informed choices both when planning an experiment and when evaluating study results.\r\nWe introduce the PRDA (Prospective and Retrospective Design Analysis) package that allows researchers to carry out a Design Analysis under different experimental scenarios (Altoè et al., 2020). Considering a plausible effect size (or its prior distribution) researchers can evaluate either the inferential risks for a given sample size or the required sample size to obtain a given statistical power.\r\nPreviously, PRDA functions were limited to mean differences between groups considering Cohen’s d in the Null Hypothesis Significance Testing (NHST) framework. Now, we present the newly developed features that include other effect sizes (such as Pearson’s correlation) as well as Bayes Factor hypothesis testing.", 275 | "email": "angela.andreella@studenti.unipd.it" 276 | }, 277 | { 278 | "title": "Automate flexdashboard with GitHub", 279 | "author": "Binod Jung Bogati", 280 | "affiliation": "Data Analyst Intern at VIN", 281 | "namesurname": "BINOD,JUNG,BOGATI", 282 | "track": "R Dataviz & Shiny", 283 | "session_type": "Poster", 284 | "description": "flexdashboard is a great tool for building an interactive dashboard in R. We can host it for free on GitHub Pages, Rpubs and many other places.\r\n\r\nA hosted flexdashboard is static, so when our data changes we need to manually update and republish it every time. If we want it to auto-update we may need to integrate Shiny. However, that may not be suitable for every case.\r\n\r\nTo overcome this, we have a solution called GitHub Actions. It's a feature of GitHub that automates our tasks in a convenient way. \r\n\r\nWith the help of GitHub Actions, we can automate our flexdashboard (Rmarkdown) updates. It builds a container that runs our R scripts. We can trigger it every time we push on GitHub or schedule it every X minutes/hours/days/month. \r\n\r\nIf you want to learn more about GitHub Actions, and how to automate updates to your flexdashboard,
please do come and join me.", 285 | "email": "bjungbogati@gmail.com" 286 | }, 287 | { 288 | "title": "EasyReporting: a Bioconductor package for Reproducible Research implementation", 289 | "author": "Dario Righelli", 290 | "affiliation": "Department of Statistics, University of Padua, Post-Doc", 291 | "namesurname": "DARIO,RIGHELLI", 292 | "coauthor": "angelini claudia", 293 | "track": "R Applications, R Life Sciences", 294 | "session_type": "Poster", 295 | "description": "EasyReporting is a novel R/Bioconductor package for speeding up Reproducible Research (RR) implementation when analyzing data, implementing workflows or other packages.\r\nIt is an S4 class helping developers to integrate an RR layer inside their software products, as well as helping data analysts speed up their report production without learning the rmarkdown language.\r\nThanks to minimal additional efforts by developers, the end-user has available an rmarkdown file with all the source code generated during the analysis, divided into Code Chunks (CC) ready for compilation. Moreover, EasyReporting also gives the possibility to add natural language comments and textual descriptions into the final report to be compiled, producing an enriched document that incorporates input data, source code and output results. Once compiled, the final document can be attached to the publication of the analysis as supplementary material, helping the interested community to entirely reproduce the computational part of the work.\r\nUnlike other previously proposed solutions, which usually require a significant effort by the final user, potentially leading them to forgo including RR inside their scripts, our approach is versatile and easy to incorporate, allowing the final developer/analyst to automatically create and store an rmarkdown document, and also providing methods for its compilation.", 296 | "email": "dario.righelli@unipd.it" 297 | }, 298 | { 299 | "title": "NewWave: a scalable R package for the dimensionality reduction of single-cell RNA-seq", 300 | "author": "Federico Agostinis", 301 | "affiliation": "Università degli studi di Padova, Fellowship", 302 | "namesurname": "FEDERICO,AGOSTINIS", 303 | "coauthor": "Chiara Romualdi, Gabriele Sales, Davide Risso", 304 | "track": "R Life Sciences", 305 | "session_type": "Poster", 306 | "description": "The fast development of single cell sequencing technologies in recent years has generated a gap between the throughput of the experiments and the capability of analyzing the generated data. One recent method for dimensionality reduction of single-cell RNA-seq data is zinbwave; it uses zero-inflated negative binomial likelihood optimization to find biologically meaningful latent factors and remove batch effects. Zinbwave has optimal performance but has some scalability issues due to large memory usage. To address this, we developed an R package with a new software architecture extending zinbwave.\r\nIn this package, we implement mini-batch stochastic gradient descent and the possibility of working with HDF5 files. We decided to use a negative binomial model following the observation that droplet sequencing technologies do not induce zero inflation in the data. Thanks to these improvements and the possibility of massively parallelizing the estimation process using PSOCK clusters, we are able to speed up the computations with the same or even better results than zinbwave.
This type of parallelization can be used on multiple hardware setups, ranging from simple laptops to dedicated server clusters. This, paired with the ability to work with out-of-memory data, enables us to analyze datasets with millions of cells.", 307 | "email": "federico.agostinis@gmail.com" 308 | }, 309 | { 310 | "title": "orf: Ordered Random Forests", 311 | "author": "Gabriel Okasa", 312 | "affiliation": "Research Assistant and PhD Candidate at the Swiss Institute for Empirical Economic Research, University of St. Gallen, Switzerland", 313 | "namesurname": "GABRIEL,OKASA", 314 | "coauthor": "Michael Lechner", 315 | "track": "R Machine Learning & Models", 316 | "session_type": "Poster", 317 | "description": "The R package ‘orf’ is a software implementation of the Ordered Forest estimator as developed in Lechner and Okasa (2019). The Ordered Forest flexibly estimates the conditional class probabilities of models involving categorical outcomes with an inherent ordering structure, known as ordered choice models. In addition to common machine learning algorithms, the Ordered Forest enables estimation of marginal effects together with statistical inference and thus provides output comparable to that of standard econometric models. Accordingly, the ‘orf’ package provides generic R functions to estimate, predict, plot, print and summarize the estimation output of the Ordered Forest along with various options for specific forest-related tuning parameters. Finally, computational speed is ensured as the core forest algorithm relies on the fast C++ forest implementation from the ranger package (Wright and Ziegler 2017).", 318 | "email": "gabriel.okasa@unisg.ch" 319 | }, 320 | { 321 | "title": "Power Supply health status monitoring dashboard", 322 | "author": "Marco Calderisi", 323 | "affiliation": "Kode srl, CTO", 324 | "namesurname": "MARCO,CALDERISI", 325 | "coauthor": "Jacopo Baldacci, Caterina Giacomelli, Ilaria Ceppa, Davide Massidda, Matteo Papi, Gabriele Galatolo, Francesca Giorgolo, ferdinando giordano, alessandro iovene", 326 | "track": "R Dataviz & Shiny", 327 | "session_type": "Poster", 328 | "description": "The Primis project dashboard allows the user to analyze the health status of power supply boards on two levels:\r\n(1) analysis of a specific board, to check its status and the presence of any anomalies,\r\n(2) analysis of multiple boards within a single Power Supply, to check if the set of boards reveals abnormal behavior and if some boards behave in a distinctly different way from the others.\r\nThe analysis algorithms and the web application were created using the programming language R, and in particular the Shiny library.\r\nThe application is therefore divided into two parts that reflect these different types of analysis, called respectively \"Product View\" (analysis and diagnostics of a specific board) and \"Product Comparison\" (comparison analysis between multiple boards of the same Power Supply).
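As a flavour of the ‘orf’ interface described above, here is a minimal sketch. The generated data are invented; orf() takes a covariate matrix and an ordered outcome as in the package documentation I recall, so treat the details as assumptions.

```r
library(orf)

# Toy ordered outcome (1 < 2 < 3) driven by two covariates
set.seed(42)
X <- matrix(rnorm(600), ncol = 2)
Y <- as.numeric(cut(X[, 1] + rnorm(300), breaks = 3))

fit <- orf(X, Y, num.trees = 500)

predict(fit, type = "probs")  # conditional ordered class probabilities
margins(fit)                  # marginal effects with statistical inference
```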
Both analyses can be carried out on an arbitrary time interval, selectable through a special application menu.\r\nThe analysis is carried out by means of: \r\n(1) univariate analysis, focusing on a specific parameter of one or more channels and displaying aggregate information regarding the status of the board in the entire observation period\r\n(2) multivariate analysis, that is, the application of multivariate algorithms that allow an overall analysis of the board, taking into account all the variables simultaneously.", 329 | "email": "m.calderisi@kode-solutions.net" 330 | }, 331 | { 332 | "title": "First-year ICT students dropout predicting with R models", 333 | "author": "Natalja Maksimova", 334 | "affiliation": "Virumaa College of Tallinn University of Technology, lecturer", 335 | "namesurname": "NATALJA,MAKSIMOVA", 336 | "coauthor": "Olga Dunajeva", 337 | "track": "R Machine Learning & Models", 338 | "session_type": "Poster", 339 | "description": "The aim of this study is to find out how first-year ICT student dropout can be predicted at one Estonian college, Virumaa College of Tallinn University of Technology (TalTech), and possibly to apply methods to decrease the dropout rate. We apply three machine learning approaches using R tools: logistic regression, decision trees and Naive Bayes. The models are computed on the basis of the TalTech study information system data. As a result, we propose a methodical approach that may be realized in practice at other institutions. \r\nAll applied methods yield high predictive performance, with more than 85% accuracy. At the same time, some influencing and non-influencing factors were found in predicting ICT students’ dropout.", 340 | "email": "natalja.maksimova@taltech.ee" 341 | }, 342 | { 343 | "title": "Benchmark Percentage Disjoint Data Splitting in Cross Validation for Assessing the Skill of Machine", 344 | "author": "Olalekan Joseph Akintande", 345 | "affiliation": "University of Ibadan, Ph.D. Student", 346 | "namesurname": "OLALEKAN,JOSEPH,AKINTANDE", 347 | "coauthor": "O.E. Olubusoye", 348 | "track": "R Machine Learning & Models", 349 | "session_type": "Poster", 350 | "description": "The controversies surrounding dataset splitting techniques, and the folklore of what has been or what should be, remain an open debate. Several authors (bloggers, researchers, and data scientists) in the field of machine learning and similar research areas have proposed various arbitrary percentage disjoint dataset splitting (DDS) options for validating the skill of machine learning algorithms and, by extension, the appropriate percentage DDS based on cross-validation techniques. In this work, we propose benchmarks on which the percentage DDS procedure should be based.
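The three approaches named in the dropout-prediction abstract above map onto standard R tools. Here is a minimal sketch with invented column names (glm, rpart and e1071::naiveBayes are the usual choices; the study's actual code is not shown here).

```r
library(rpart)   # decision trees
library(e1071)   # Naive Bayes

# Hypothetical student data: dropout ~ admission grade + first-semester credits
set.seed(1)
students <- data.frame(
  dropout = factor(sample(c("yes", "no"), 200, replace = TRUE)),
  grade   = runif(200, 1, 5),
  credits = rpois(200, 20)
)

fit_lr <- glm(dropout ~ grade + credits, data = students, family = binomial)
fit_dt <- rpart(dropout ~ grade + credits, data = students, method = "class")
fit_nb <- naiveBayes(dropout ~ grade + credits, data = students)

# Compare in-sample accuracy (a real study would cross-validate)
mean(predict(fit_dt, students, type = "class") == students$dropout)
```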
These benchmarks are founded on various training sizes (m) and serve as the basis and justification for the choice of an appropriate percentage DDS for assessing the skill of ML algorithms in related fields, based on cross-validation techniques.", 351 | "email": "aojsoft@gmail.com" 352 | }, 353 | { 354 | "title": "Integrating professional software engineering practices in medical research software", 355 | "author": "Patricia Ryser-Welch", 356 | "affiliation": "Newcastle University, Population Health Science Institute, Research Associate,", 357 | "namesurname": "PATRICIA,RYSER-WELCH", 358 | "coauthor": "http://www.datashield.ac.uk", 359 | "track": "R Applications", 360 | "session_type": "Poster", 361 | "description": "Health data sets are getting bigger, more complex, and are increasingly being linked up with other data sources. With this trend there is an increasing risk of patient identification and disclosure. Two different ways of mitigating this risk are to use a federated analysis approach or to use a data safe haven.\r\n\r\nDataSHIELD (www.datashield.ac.uk) is an established federated data analysis tool that is used in the medical sciences. This software has a variety of methods to reduce the risk of disclosure built in. Here we describe the steps we are taking to apply modern software engineering methodologies to DataSHIELD. The upcoming Medical Devices legislation requires that software has more rigorous testing done on it. While this legislation does not directly apply to software used for research, we think it is important that the ideas behind it filter down to research software. For us these principles include testing that functions work, as well as testing that they produce the correct answers. Using a static standard data set to test against (that is publicly available) is also an important aspect. This work is being done in a continuous integration framework using Microsoft Azure. Additionally, all our software is developed as open source.\r\n\r\nIn addition to the protection DataSHIELD provides on its own, we are also integrating it into our Trusted Research Environment as part of Connected Health Cities North East and North Cumbria. This will give an extra level of protection to data that may automatically flow from multiple data sources. Additionally, as analysis can be done in a federated way, it means that the data does not need to leave its data controller's environment.
This opens up the possibility of analysis happening across trusts and regions.", 362 | "email": "patricia.ryser-welch@newcastle.ac.uk" 363 | }, 364 | { 365 | "title": "Dealing with changing administrative boundaries: The case of Swiss municipalities", 366 | "author": "Tobias Schieferdecker", 367 | "affiliation": "Daily dealings with data at cynkra", 368 | "namesurname": "TOBIAS,SCHIEFERDECKER", 369 | "coauthor": "Thomas Knecht, Kirill Müller, Christoph Sax", 370 | "track": "R Applications", 371 | "session_type": "Poster", 372 | "description": "Switzerland's municipalities are frequently merged or reorganized, in an attempt to reduce costs and increase efficiency.\r\nThese mergers create a substantial problem for data analysis.\r\nOften it is desirable to study a municipality over time.\r\nBut in order to create a time series for a region of interest, its borders should stay constant.\r\nOur goal is to provide R functions that allow an easy and consistent handling of these mergers.\r\nWe create a mapping table for municipality IDs for a specified period of time, which allows us to track the mergers over time.\r\nWe also provide weights, such as population as well as area of the municipalities, to facilitate the construction of weighted time series.\r\nVarious other municipality mutations are also taken into account.\r\n\r\nWe are creating two R packages: an infrastructure package that handles the task of keeping the data up to date; and a user package that contains the functions to deal with the mergers.", 373 | "email": "tobias@cynkra.com" 374 | }, 375 | { 376 | "title": "badDEA: An R package for measuring firms’ efficiency adjusted by undesirable outputs", 377 | "author": "Yann Desjeux", 378 | "affiliation": "INRAE, France", 379 | "namesurname": "YANN,DESJEUX", 380 | "coauthor": "K Hervé DAKPO; Yann DESJEUX; Laure LATRUFFE", 381 | "track": "R Applications", 382 | "session_type": "Poster", 383 | "description": "Growing concerns about the detrimental effects of human production activities on the environment, e.g. air, soil and water pollution, have triggered the development of new performance indicators (including productivity and efficiency measures) accounting for such undesirable impacts. Firms can now be benchmarked not only in terms of economic performance, but also in terms of environmental performance linked to production. In the performance benchmarking literature, and more specifically the one on the non-parametric approach Data Envelopment Analysis (DEA), several methodologies have been developed to consider these impacts as undesirable (or bad) outputs. Related empirical applications in the literature, performed with various software, show that conclusions differ depending on the way these undesirable outputs are introduced. However, none of these methodologies is routinely available in R. In this context, we developed the badDEA package in order to provide a single, consistent framework where users (students, researchers, practitioners) can find the major methodologies proposed in the literature to compute efficiency measures that are adjusted by undesirable outputs.
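The mapping-table idea in the Swiss municipalities abstract above can be sketched with a small dplyr join; the IDs and values are invented for illustration and do not correspond to the authors' packages.

```r
library(dplyr)

# Hypothetical mapping table: pre-merger municipality IDs -> current IDs
mapping <- tibble::tribble(
  ~id_old, ~id_new,
  261,     261,    # unchanged municipality
  871,     261,    # merged into 261
  872,     261
)

# A series recorded under the old IDs, re-expressed for current borders
series_old <- tibble::tibble(id_old = c(261, 871, 872), year = 2010,
                             pop = c(10000, 800, 1200))

series_old %>%
  inner_join(mapping, by = "id_old") %>%
  group_by(id_new, year) %>%
  summarise(pop = sum(pop), .groups = "drop")
```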
In this presentation, we will describe the aim, structure and options of the badDEA package, unfolding all the methodologies in their different variants and providing a promising tool for decision-making.", 384 | "email": "yann.desjeux@inrae.fr" 385 | }, 386 | { 387 | "title": "Design Patterns For Big Shiny Apps", 388 | "author": "Alex Gold", 389 | "affiliation": "Solutions Engineer, RStudio", 390 | "namesurname": "ALEX,GOLD", 391 | "coauthor": "Cole Arendt", 392 | "track": "R Dataviz & Shiny, R Production", 393 | "session_type": "Regular talk", 394 | "description": "In about 20 minutes on the morning of January 27, 2020, one engineer launched over 500 individual cloud server instances for workshop attendees at RStudio::conf and managed them for the duration of the workshops — all from a Shiny app. The RStudio team manages a variety of production systems using Shiny apps, including our workshop infrastructure and access to our sales demo server. \r\n\r\nThe Shiny apps are robust enough for these mission-critical activities because of an important lesson from web engineering: separation of concerns between front-end user interaction logic and back-end business logic. This design pattern can be implemented in R by creating user interfaces in Shiny and managing interactions with other systems with Plumber APIs and R6 classes. \r\n\r\nThis pattern allows even complex Shiny apps to remain understandable and maintainable. Moreover, this pattern of designing and building large Shiny apps is broadly applicable to any app that has substantial interaction with outside systems. Session attendees will gain an understanding of this pattern, which can be useful for many large Shiny apps.", 395 | "email": "alexkgold@gmail.com" 396 | }, 397 | { 398 | "title": "Using XGBoost, Plumber and Docker in production to power a new banking product", 399 | "author": "André Rivenæs", 400 | "affiliation": "Data Scientist, PwC", 401 | "namesurname": "ANDRÉ,RIVENÆS", 402 | "author2": "Markus Mortensen", 403 | "affiliation2": "PwC", 404 | "track": "R Machine Learning & Models, R Production", 405 | "session_type": "Regular talk", 406 | "description": "Buffer is a brand new and innovative banking product by one of the largest retail banks in Norway, Sparebanken Vest, and it is powered by R.\r\n\r\nIn fact, the product's decision engine is written entirely in R. We decide whether a customer should get a loan, and how large a loan they should be allocated, by analyzing large amounts of data from various sources. An essential part is analyzing the customer's invoices using machine learning (XGBoost). \r\n\r\nIn this talk, we will cover:\r\n\r\n- How we use ML and Bayesian statistics to estimate the probability of an invoice being repaid. \r\n- How we successfully put the decision engine in production, using e.g. Plumber, Docker, CircleCI and Kubernetes. \r\n- What we have learned from using R in production at scale.", 407 | "email": "andre@rivenaes.net" 408 | }, 409 | { 410 | "title": "Astronomical source detection and background separation: a Bayesian nonparametric approach", 411 | "author": "Andrea Sottosanti", 412 | "affiliation": "University of Padova", 413 | "namesurname": "ANDREA,SOTTOSANTI", 414 | "coauthor": "Mauro Bernardi, Alessandra R. Brazzale, Roberto Trotta, David A. van Dyk",
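The separation of concerns described in the "Design Patterns For Big Shiny Apps" abstract above (Shiny front end, Plumber back end) can be sketched as follows; the endpoint and its logic are hypothetical, not RStudio's actual code.

```r
library(plumber)

# plumber.R -- back-end business logic exposed as an HTTP API,
# kept separate from the Shiny front end (hypothetical endpoint)

#* Launch n server instances and return their IDs
#* @param n number of instances
#* @post /launch
function(n = 1) {
  ids <- replicate(as.integer(n),
                   paste0("srv-", paste(sample(letters, 6), collapse = "")))
  list(launched = ids)
}

# Run with: plumber::pr("plumber.R") |> plumber::pr_run(port = 8000)
# The Shiny app then only needs to POST to http://localhost:8000/launch
```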
van Dyk", 415 | "track": "R Machine Learning & Models, R Applications", 416 | "session_type": "Regular talk", 417 | "description": "We propose an innovative approach based on Bayesian nonparametric methods to the signal extraction of astronomical sources in gamma-ray count maps under the presence of a strong background contamination. Our model simultaneously induces clustering on the photons using their spatial information and gives an estimate of the number of sources, while separating them from the irregular signal of the background component that extends over the entire map. From a statistical perspective, the signal of the sources is modeled using a Dirichlet Process mixture, that allows to discover and locate a possible infinite number of clusters, while the background component is completely reconstructed using a new flexible Bayesian nonparametric model based on b-spline basis functions. The resultant can be then thought of as a hierarchical mixture of nonparametric mixtures for flexible clustering of highly contaminated signals. We provide also a Markov chain Monte Carlo algorithm to infer on the posterior distribution of the model parameters, and a suitable post-processing algorithm to quantify the information coming from the detected clusters. Results on different datasets confirm the capacity of the model to discover and locate the sources in the analysed map, to quantify their intensities and to estimate and account for the presence of the background contamination.", 418 | "email": "sottosanti@stat.unipd.it" 419 | }, 420 | { 421 | "title": "Creating drag-and-drop shiny applications using sortable", 422 | "author": "Andrie de Vries", 423 | "affiliation": "Solutions engineer at RStudio, Author of \"R for Dummies\"", 424 | "namesurname": "ANDRIE,DE,VRIES", 425 | "coauthor": "Barret Schloerke, Kenton Russell", 426 | "track": "R Dataviz & Shiny", 427 | "session_type": "Regular talk", 428 | "description": "Using the `learnr` package you can create interactive tutorials in your R Markdown documents. For a long time, you could only use the built-in question types, including R coding exercises and quizzes with single or multiple choice answers. Since the release of learnr version 0.10.0, it has been possible to create custom question types. The new framework allows you to define different behaviour for the appearance and behaviour of your questions. The sortable package uses this capability to introduce new question types for ranking questions and bucketing questions. With ranking questions you can ask your students to arrange answer options in the correct order. With bucketing questions you can ask them to arrange answer options into different buckets. The sortable package achieves this by exposing an `htmlwidget` wrapper around the SortableJS JavaScript library. This library lets you sort any object in a Shiny app, with dynamic drag-and-drop behaviour. For example, you can arrange items in a list, or drag-and-drop the order of shiny tabs. 
During this presentation you will see how easy it is to add dynamic behaviour to your shiny app, and how simple it is to use the new sorting and bucketing tasks in your tutorials.", 429 | "email": "andrie@rstudio.com" 430 | }, 431 | { 432 | "title": "High dimensional sampling and volume computation", 433 | "author": "Apostolos Chalkis", 434 | "affiliation": "PhD in Computer Science", 435 | "namesurname": "APOSTOLOS,CHALKIS", 436 | "coauthor": "Vissarion Fisikopoulos", 437 | "track": "R Machine Learning & Models", 438 | "session_type": "Regular talk", 439 | "description": "Sampling from multivariate distributions is a fundamental problem in statistics that plays an important role in modern machine learning and data science. Many important problems such as convex optimization and multivariate integration can be efficiently solved via sampling. This talk presents the CRAN package volesti, which offers the R community efficient C++ implementations of state-of-the-art algorithms for sampling and volume computation of convex sets. It scales up to hundreds or thousands of dimensions, depending on the problem, providing the most efficient implementations for sampling and volume computation to date. Thus, volesti allows users to solve problems in dimensions an order of magnitude higher than before. We present the basic functionality of volesti and show how it can be used to provide approximate solutions to intractable problems in combinatorics, financial modeling, bioinformatics and engineering. We highlight two famous applications in finance: we show how volesti can be used to detect financial crises and evaluate portfolio performance in large stock markets with hundreds of assets, giving real-life examples using public data.", 440 | "email": "tolis.chal@gmail.com" 441 | }, 442 | { 443 | "title": "Fake News: AI on the battle ground", 444 | "author": "Ayomide Shodipo", 445 | "affiliation": "Senior Developer Advocate & Media Developer Expert at Cloudinary", 446 | "namesurname": "AYOMIDE,SHODIPO", 447 | "track": "R Machine Learning & Models, R Life Sciences, R Production, R World", 448 | "session_type": "Regular talk", 449 | "description": "Counterfeit products have been a longstanding and growing pain for companies around the globe. In addition to impacting company revenue, they damage brand reputation and customer confidence. Companies were asked to build a solution for a global electronics brand that can identify fake products by just taking one picture on a smartphone.\r\n\r\nIn this session, we will look into the building blocks that make this AI solution work. We’ll find out that there is much more to it than just training a convolutional neural network.\r\n\r\nWe look at challenges like how to manage and monitor the AI model and how to build and improve the model in a way that fits your DevOps production chain.\r\n\r\nLearn how we used Azure Functions, Cosmos DB and Docker to build a solid foundation. See how we used the Azure Machine Learning service to train the models.
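A minimal taste of volesti as presented above; gen_cube(), volume() and sample_points() are package functions, though exact argument names may vary slightly between versions.

```r
library(volesti)

# A 10-dimensional hypercube in H-representation
P <- gen_cube(10, 'H')

# Approximate its volume (the exact answer is 2^10 = 1024)
volume(P)

# Draw uniform samples from the polytope
pts <- sample_points(P, n = 500)
dim(pts)  # a d x n matrix of sample points
```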
And find out how we used Azure DevOps to control, build and deploy this state-of-the-art solution.", 450 | "email": "shodipovi@gmail.com" 451 | }, 452 | { 453 | "title": "From consulting to open-source and back", 454 | "author": "Christoph Sax", 455 | "affiliation": "R-enthusiast, economist @cynkra", 456 | "namesurname": "CHRISTOPH,SAX", 457 | "coauthor": "Kirill Müller", 458 | "track": "R World", 459 | "session_type": "Regular talk", 460 | "description": "Open-source development is a great source of satisfaction and fulfillment, but someone has to pay the bills. A straightforward solution is to consult customers and help them to pick the right tools. As a small group of R enthusiasts, we try to align open-source development with supporting our clients in accomplishing their goals, contributing to the community along the way. It turns out that the benefits work in both ways: In addition to funding, consulting work allows us to test our tools and to improve their usability in a practical setting. At the same time, the involvement in open source development sharpens our analytical skills and serves as a first stop for new customers. Ideally, consulting projects lead to new developments, which in turn lead to new consulting projects.", 461 | "email": "christoph@cynkra.com" 462 | }, 463 | { 464 | "title": "Deduplicating real estate ads using Naive Bayes record linkage", 465 | "author": "Daniel Meister", 466 | "affiliation": "Datahouse AG", 467 | "namesurname": "DANIEL,MEISTER", 468 | "track": "R Applications", 469 | "session_type": "Regular talk", 470 | "description": "We demonstrate in this talk how we used a containerized R and PostgreSQL data pipeline to deduplicate 60 million real estate ads from Germany and Switzerland using a multi-step Naive Bayes record linkage approach. Real estate platforms publish millions of rental flat and condominium ads yearly. A given region or country of interest is normally covered by various competing platforms, leading to multiple published ads for a single real world object. Because quantifying and modeling the real estate market requires unbiased input data, our aim was to deduplicate real estate ads using Naive Bayes record linkage. We used commercially available German and Swiss real estate ad data from 2012 to 2019 consisting of approximately 60 million individual records. After multiple data cleaning and preparation steps we employed a Naive Bayes weighting of 12-14 variables to calculate similarity scores between ads and determined a linkage threshold based on expert judgment. The deduplication pipeline consisted of three steps: linking ads based on identity comparisons, linking similar ads within small regional areas (municipalities) and linking similar ads within large regional areas (cantons, states). The pipeline was deployed as a containerized setup with in-memory calculations in R and out-of-memory calculations and data storage in PostgreSQL. Deduplication linked the roughly 60 million ads to around 14 million object groups (Germany: 10 million, Switzerland: 4 million). The distribution of similarity scores showed high separation power and the resulting object groups displayed high homogeneity in geographic location and price distribution. Furthermore, yearly results corresponded well with published relocation rates. Using Naive Bayes record linkage to deduplicate real estate ads resulted in a sensible grouping of ads into object groups (rental flats, condominiums).
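The scoring step of the Naive Bayes record linkage described above can be sketched compactly. This is the generic Fellegi-Sunter-style computation, with invented match/non-match probabilities, not the authors' pipeline.

```r
# Minimal sketch of Naive Bayes record linkage scoring (not the authors' code).
# m[i]: P(variable i agrees | same object); u[i]: P(variable i agrees | different objects)
m <- c(rooms = 0.95, price = 0.90, zip = 0.99)
u <- c(rooms = 0.30, price = 0.05, zip = 0.02)

score_pair <- function(agree) {
  # Sum log-likelihood ratios over variables (the "naive" independence assumption)
  sum(ifelse(agree, log(m / u), log((1 - m) / (1 - u))))
}

score_pair(c(rooms = TRUE, price = TRUE, zip = TRUE))   # high score: likely the same object
score_pair(c(rooms = TRUE, price = FALSE, zip = FALSE)) # low score: likely different ads
# Pairs above an expert-chosen threshold are linked into one object group.
```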
We were able to combine similarities across different variables into a single similarity score. An advantage of the Naive Bayes approach is the high interpretability of the influence of individual variables. However, by manually determining the linkage threshold, our results are heavily influenced by possible expert biases. The containerized R and PostgreSQL setup proved its portability and scaling capabilities. The same approach could easily be transferred to other domains requiring deduplication of multivariate data sets.", 471 | "email": "daniel.meister@datahouse.ch" 472 | }, 473 | { 474 | "title": "{polite}: web etiquette for R users", 475 | "author": "Dmytro Perepolkin", 476 | "affiliation": "Lund University", 477 | "namesurname": "DMYTRO,PEREPOLKIN", 478 | "track": "R World, R Applications", 479 | "session_type": "Regular talk", 480 | "description": "Data is everywhere, but that does not mean it is freely available. What are the best practices and acceptable norms for accessing data on the web? How does one know when it is OK to scrape the content of a website, and how can it be done in such a way that it does not create problems for the data owner and/or other users? This talk will provide examples of using the {polite} package for safe and responsible web scraping. The three pillars of {polite} are seeking permission, taking slowly and never asking twice.", 481 | "email": "dperepolkin@gmail.com" 482 | }, 483 | { 484 | "title": "Hydrological Modelling and R", 485 | "author": "Emanuele Cordano", 486 | "affiliation": "www.rendena100.eu", 487 | "namesurname": "EMANUELE,CORDANO", 488 | "coauthor": "Giacomo Bertoldi", 489 | "track": "R Applications", 490 | "session_type": "Regular talk", 491 | "description": "Eco-hydrological and biophysical models are increasingly used in the contexts of hydrology, ecology and precision agriculture for better management of water resources and climate change impact studies at various scales: local, watershed or regional. However, to satisfy researchers' and stakeholders' demands, user-friendly interfaces are needed. The integration of such models in the powerful software environment of R greatly eases their application, input data preparation, output elaboration and visualization. In this work we present new developments of an R interface (the open-source R package **geotopbricks** (https://CRAN.R-project.org/package=geotopbricks) and related packages) for the GEOtop hydrological distributed model (www.geotop.org - GNU General Public License v3.0). This package aims to be a link between the work of environmental engineers, who develop hydrological models, and that of data and applied scientists, who can extract information from the model results. Applications related to the simulation of water cycle dynamics (model calibration, mapping, data visualization) in some alpine basins and under scenarios of climate change and variability are shown.
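A minimal sketch of accessing a GEOtop simulation from R with geotopbricks; the simulation path and keyword are hypothetical, and the call assumes the package's get.geotop.inpts.keyword.value() accessor with illustrative arguments:

    library(geotopbricks)
    wpath <- "/path/to/geotop/simulation"  # hypothetical GEOtop simulation folder
    # Read the value of a geotop.inpts keyword as a data frame (illustrative call):
    meteo <- get.geotop.inpts.keyword.value("MeteoFile", wpath = wpath,
                                            data.frame = TRUE)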
In particular, we will present an application of the model to predicting winter snow conditions, which play a critical role in governing the spatial distribution of fauna in temperate ecosystems.", 492 | "email": "emanuele.cordano@gmail.com" 493 | }, 494 | { 495 | "title": "GeneTonic: enjoy RNA-seq data analysis, responsibly", 496 | "author": "Federico Marini", 497 | "affiliation": "Center for Thrombosis and Hemostasis (CTH) & Institute of Medical Biostatistics, Epidemiology and Informatics (IMBEI) - University Medical Center Mainz", 498 | "namesurname": "FEDERICO,MARINI", 499 | "track": "R Life Sciences", 500 | "session_type": "Regular talk", 501 | "description": "Interpreting the results from RNA-seq transcriptome experiments can be a complex task, where the essential information is distributed among different tabular and list formats - normalized expression values, results from differential expression analysis, and results from functional enrichment analyses.\r\n\r\nThe identification of relevant functional patterns, as well as their contextualization in the data and results at hand, is not a straightforward operation if these pieces of information are not combined together efficiently.\r\n\r\nInteractivity can play an essential role in simplifying how one accesses and digests RNA-seq data analyses in a comprehensive way.\r\n\r\nI introduce `GeneTonic` (https://github.com/federicomarini/GeneTonic), an application developed in Shiny and based on many essential elements of the Bioconductor project, which aims to reduce the barrier to understanding such data better and to efficiently combine the different components of the analytic workflow.\r\n\r\nFor example, starting from bird's-eye summaries (with interactive bipartite gene-geneset graphs, or enrichment maps), it is easy to generate a number of visualizations, where drill-down user actions enable further insight and deliver additional information (e.g., gene info boxes, geneset summaries, and signature heatmaps).\r\n\r\nThe interpretation of complex datasets can be wrapped up in a single call to the main GeneTonic function, which also supports built-in R Markdown reporting, either to conclude an exploration session or to generate in batch the output of the available functionality, delivering an essential foundation for computational reproducibility.", 502 | "email": "marinif@uni-mainz.de" 503 | }, 504 | { 505 | "title": "A simple and flexible inactivity/sleep detection R package", 506 | "author": "Francesca Giorgolo", 507 | "affiliation": "Kode s.r.l. - Data Scientist", 508 | "namesurname": "FRANCESCA,GIORGOLO", 509 | "coauthor": "Ilaria Ceppa, Marco Calderisi, Davide Massidda, Matteo Papi, Gabriele Galatolo, Andrea Spinelli, Andrea Zedda, Jacopo Baldacci, Caterina Giacomelli", 510 | "track": "R Life Sciences", 511 | "session_type": "Regular talk", 512 | "description": "With the widespread use of wearable devices, great amounts of data have become available and new fields of application have arisen, such as health monitoring and activity detection.
\r\nOur work focused on inactivity and sleep detection from continuous raw tri-axis accelerometer data, recorded using different accelerometer brands with sampling frequencies below and above 1 Hz.\r\nThe algorithm implemented is the SPT-window detection algorithm described in the literature, slightly modified to meet the flexibility requirements we set ourselves.\r\nThe R package developed provides functions to clean data, to identify inactivity/sleep windows and to visualize the results.\r\nThe main function has a parameter to specify the measurement unit of the data, a threshold to distinguish low and high activity, and a parameter to handle non-wear periods, where a non-wear period is defined as a period of time in which all accelerometer values are equal to zero. Other functions make it possible to separate overlapping accelerometer signals, i.e. when one device is replaced by another, and to visualize the obtained results.", 513 | "email": "f.giorgolo@kode-solutions.net" 514 | }, 515 | { 516 | "title": "progressr: An Inclusive, Unifying API for Progress Updates", 517 | "author": "Henrik Bengtsson", 518 | "affiliation": "UCSF, Assoc Prof, CS/Stats, R since 2000", 519 | "namesurname": "HENRIK,BENGTSSON", 520 | "track": "R Production, R Applications", 521 | "session_type": "Regular talk", 522 | "description": "The 'progressr' package provides a minimal, unifying API for scripts and packages to report progress from anywhere, including from parallel workers, to anywhere.\r\n\r\nIt is designed such that the developer can focus on what to report progress on without having to worry about how to present it. The end user has full control of how, where, and when to render these progress updates. Progress bars from popular progress packages are supported and more can be added.\r\n\r\nThe 'progressr' package is inclusive by design. Specifically, no assumptions are made about how progress is reported, i.e. it does not have to be a progress bar in the terminal. Progress can also be reported as audio (e.g. unique begin and end sounds with intermediate non-intrusive step sounds), or via a local or online notification system.\r\n\r\nAnother novelty is that progress updates are controlled and signaled via R's condition framework. Because of this, there is no need for progress-specific arguments and progress can be reported from nearly everywhere in R, e.g. in classical for and while loops, within map-reduce APIs like the 'lapply()' family of functions, 'purrr', 'plyr', and 'foreach'. It also works with parallel processing via the 'future' framework, e.g. 'future.apply', 'furrr', and 'foreach' with 'doFuture'. The package is compatible with Shiny applications.", 523 | "email": "henrik.bengtsson@gmail.com" 524 | }, 525 | { 526 | "title": "varycoef: Modeling Spatially Varying Coefficients", 527 | "author": "Jakob Dambon", 528 | "affiliation": "HSLU & UZH, Switzerland", 529 | "namesurname": "JAKOB,DAMBON", 530 | "coauthor": "Fabio Sigrist, Reinhard Furrer", 531 | "track": "R Machine Learning & Models", 532 | "session_type": "Regular talk", 533 | "description": "In regression models for spatial data, it is often assumed that the marginal effects of covariates on the response are constant over space. In practice, this assumption might often be questionable. Spatially varying coefficient (SVC) models are commonly used to account for spatial structure within the coefficients. \r\nWith the R package varycoef, we provide the framework to estimate Gaussian process-based SVC models.
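Returning to the 'progressr' entry above: its condition-based design boils down to creating a progressor inside with_progress() and calling it from anywhere. A minimal usage sketch:

    library(progressr)
    xs <- 1:5
    with_progress({
      p <- progressor(along = xs)        # one progress step per element
      res <- lapply(xs, function(x) {
        p(sprintf("step %d", x))         # signals a progress condition
        Sys.sleep(0.2)
        x^2
      })
    })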
The varycoef approach is based on maximum likelihood estimation (MLE) and, in contrast to existing model-based approaches, our method scales better to data where both the number of spatial points is large and the number of spatially varying covariates is moderately sized, e.g., above ten.\r\nWe compare our methodology to existing methods such as a Bayesian approach using the stochastic partial differential equation (SPDE) link, geographically weighted regression (GWR), and eigenvector spatial filtering (ESF) in both a simulation study and an application where the goal is to predict prices of real estate apartments in Switzerland. The results from both the simulation study and the application show that our proposed approach results in increased predictive accuracy and more precise estimates.", 534 | "email": "jakob.dambon@gmail.com" 535 | }, 536 | { 537 | "title": "FastAI in R: preserving wildlife with computer vision", 538 | "author": "Jędrzej Świeżewski", 539 | "affiliation": "Data Scientist at Appsilon", 540 | "namesurname": "JĘDRZEJ,ŚWIEŻEWSKI", 541 | "coauthor": "Marek Rogala", 542 | "track": "R Machine Learning & Models", 543 | "session_type": "Regular talk", 544 | "description": "In this presentation, we will discuss using the latest techniques in computer vision as an important part of “AI for Good” efforts, namely, enhancing wildlife preservation. We will present how to make use of the latest technical advancements in an R setup even if they are originally implemented in Python.\r\n\r\nA topic rightfully receiving growing attention among Machine Learning researchers and practitioners is how to make good use of the power obtained with the advancement of the tools. One of the avenues in these efforts is assisting wildlife conservation by employing computer vision to make observations of wildlife much more effective. We will discuss several such efforts during the talk.\r\n\r\nOne of the very promising frameworks for computer vision developed recently is the Fast.ai wrapper of PyTorch, a Python framework used for computer vision among other things. While it incorporates the latest theoretical developments in the field (such as one-cycle policy training), it provides an easy-to-use framework allowing a much wider audience to benefit from the tools, such as AI for Good initiatives run by people who are not formally trained in Machine Learning.\r\n\r\nDuring the presentation we will show how to make use of a model trained using Python’s fastai library within an R workflow with the use of the reticulate package. We will focus on use cases concerning classifying species of African wildlife based on images from camera traps.", 545 | "email": "jedrzej@appsilon.com" 546 | }, 547 | { 548 | "title": "The R Consortium 2020: adapting to rapid change and global crisis", 549 | "author": "Joseph Rickert", 550 | "affiliation": "RStudio: R Community Ambassador, R Consortium's Board of Directors", 551 | "namesurname": "JOSEPH,RICKERT", 552 | "track": "R World", 553 | "session_type": "Regular talk", 554 | "description": "The COVID-19 pandemic has turned the world upside down, and like everyone else the R Community is learning how to adapt to rapid change in order to carry on important work while looking for ways to contribute to the fight against the pandemic.
In this talk, I will report on continuing R Community work being organized through the R Consortium, such as the R Hub, the R User Group Support Program and the Diversity and Inclusion Projects, and through the various working groups, including the Validation Hub, R / Pharma, R / Medicine and R Business. Additionally, I will describe some of the recently funded ISC projects and report on the COVID-19 Data Forum, a new project that the R Consortium is organizing in partnership with Stanford’s Data Science Institute." 555 | }, 556 | { 557 | "title": "Powering Turing e-Atlas with R", 558 | "author": "Layik Hama", 559 | "affiliation": "Alan Turing Institute", 560 | "namesurname": "LAYIK,HAMA", 561 | "coauthor": "Dr Nik Lomax, Dr Roger Beecham", 562 | "track": "R Applications, R Production, R Dataviz & Shiny", 563 | "session_type": "Regular talk", 564 | "description": "Turing e-Atlas is a research project under the Urban Analytics research theme at the Alan Turing Institute (ATI). The ATI is the UK's national institute for data science and Artificial Intelligence, based at the British Library in London. \r\n\r\nThe research is a grand vision towards which we have been taking baby steps under the banner of an e-Atlas. We believe R is positioned to play a foundational role in any scalable solution to analyse and visualize large-scale datasets, especially geospatial ones. \r\n\r\nThe application presented is built using RStudio's Plumber package, which relies on solid libraries for developing web applications. The front-end is made up of Uber's various visualization packages using Facebook's React JavaScript framework.", 565 | "email": "l.hama@leeds.ac.uk" 566 | }, 567 | { 568 | "title": "Using process mining principles to extract a collaboration graph from a version control system log", 569 | "author": "Leen JOOKEN", 570 | "affiliation": "Hasselt University, PhD student", 571 | "namesurname": "LEEN,JOOKEN", 572 | "coauthor": "Gert Janssenswillen, Mathijs Creemers, Mieke Jans, Benoît Depaire", 573 | "track": "R Production, R World", 574 | "session_type": "Regular talk", 575 | "description": "Knowledge management is an indispensable component of modern-day, fast-changing and flexible software engineering environments. A clear overview of how software developers collaborate can reveal interesting insights such as the general structure of collaboration, crucial resources, and risks in terms of knowledge preservation that can arise when a programmer decides to leave the company. Version control system (VCS) logs, which keep track of what team members work on and when, contain the data to provide these insights. We present an R package that provides an algorithm which extracts and visualizes a collaboration graph from VCS log data. The algorithm is based on principles from graph theory, cartography and process mining. Its structure consists of four phases: (1) building the base graph, (2) calculating weights for nodes and edges, (3) simplifying the graph using aggregation and abstraction, and (4) extending it to include specific insights of interest. Each of these phases offers the user a lot of flexibility in deciding which parameters and metrics to include.
This makes it possible for the human expert to exploit their existing knowledge about the project and team to guide the algorithm in building the graph that best fits their specific use case, and hence provides the most accurate insights.", 576 | "email": "leen.jooken@uhasselt.be" 577 | }, 578 | { 579 | "title": "Manifoldgstat: an R package for spatial statistics of manifold data", 580 | "author": "Luca Torriani", 581 | "affiliation": "MOX, Department of Mathematics, Politecnico di Milano", 582 | "namesurname": "LUCA,TORRIANI", 583 | "coauthor": "Alessandra Menafoglio, Piercesare Secchi", 584 | "author2": "Ilaria Sartori", 585 | "affiliation2": "Politecnico di Milano", 586 | "track": "R Machine Learning & Models", 587 | "session_type": "Regular talk", 588 | "description": "The statistical analysis of data belonging to Riemannian manifolds is becoming increasingly important in many applications, such as shape analysis or diffusion tensor imaging. In many cases, the available data are georeferenced, making spatial dependence a non-negligible data characteristic. Modeling and accounting for it is typically not trivial, because of the non-linear geometry of the manifold. In this contribution, we present the Manifoldgstat R package, which provides a set of fast routines for efficiently analyzing sets of spatial Riemannian data, based on state-of-the-art statistical methodologies. The package stems from the need to create an efficient and reproducible environment for running extensive simulation studies and bagging algorithms for spatial prediction of symmetric positive definite matrices. The package implements three main algorithms (Pigoli et al., 2016; Menafoglio et al., 2019; Sartori & Torriani, 2019). The latter two are particularly computationally demanding, as they rely on Random Domain Decompositions of the geographical domain. To substantially improve performance, the package exploits dplyr and Rcpp to integrate R with C++ code, where template factories handle all run-time choices. In this communication, we shall review the characteristics of the three methodologies considered, and the key points of their implementation.", 589 | "email": "luca.torriani94@gmail.com" 590 | }, 591 | { 592 | "title": "Voronoi Linkage for Spatially Misaligned Data", 593 | "author": "Luís G. Silva e Silva", 594 | "affiliation": "Food and Agriculture Organization - FAO - Data Scientist", 595 | "namesurname": "LUÍS,G.,SILVA,E,SILVA", 596 | "coauthor": "Lucas Godoy, Douglas Azevedo, Augusto Marcolin, Jun Yan", 597 | "track": "R Dataviz & Shiny, R World", 598 | "session_type": "Regular talk", 599 | "description": "In studies of elections, voting outcomes are point-referenced at voting stations, while socioeconomic covariates are areal data available at census tracts. The misaligned spatial structure of these two data sources makes regression analysis to identify socioeconomic factors that affect voting outcomes a challenge. Here we propose a novel approach to link these two sources of spatial data through Voronoi tessellation. Our proposal is to create a Voronoi tessellation with respect to the point-referenced data; with this outset, the spatial points become a set of mutually exclusive polygons named Voronoi cells. The extraction of data from the census tracts is proportional to the intersection area of each census tract polygon and the Voronoi cells. Consequently, we use 100% of the available information and preserve the polygons’ autocorrelation structure.
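A minimal sf-based sketch of this tessellate-and-transfer idea (illustrative only: 'stations' and 'tracts' are hypothetical sf objects, and st_interpolate_aw() stands in for the authors' own area-proportional extraction):

    library(sf)
    # Voronoi cells around the point-referenced voting stations
    # (cells extend to a bounding envelope; clip to the study area in practice):
    cells <- st_collection_extract(st_voronoi(st_union(stations)), "POLYGON")
    cells <- st_sf(geometry = cells)
    # Area-weighted transfer of a census-tract count onto the Voronoi cells:
    linked <- st_interpolate_aw(tracts["pop"], cells, extensive = TRUE)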
When visualised through our Shiny App, the method provides a finer spatial resolution than municipalities and facilitates the identification of spatial structures at a more detailed level. The technique is applied to the 2018 Brazilian presidential election data. The tool provides deep access to Brazilian election results by enabling users to create general maps, plots, and tables by state and city.", 600 | "email": "lgsilvaesilva@gmail.com" 601 | }, 602 | { 603 | "title": "Be proud of your code! Tools and patterns for making production-ready, clean R code", 604 | "author": "Marcin Dubel", 605 | "affiliation": "Software Engineer at Appsilon Data Science", 606 | "namesurname": "MARCIN,DUBEL", 607 | "track": "R Production, R World", 608 | "session_type": "Regular talk", 609 | "description": "In this talk you’ll learn the tools and best practices for making clean, reproducible R code in a working environment, ready to be shared and productionalised. Save yourself time on maintenance, adjustments and struggling with packages. \r\n\r\nR is a great tool for fast data analysis. Its simplicity of setup, combined with powerful features and community support, makes it a perfect language for many subject matter experts, e.g. in finance or bioinformatics. Yet often what started as a pet project or proof of concept begins to grow and expand, with additional collaborators working on it. It is then crucial to have your project well organised and reusable, with a fixed environment, so that the code works every time and on any machine. Otherwise the solution won’t be used by anyone but you. By following a few patterns and with appropriate tools this won’t be overwhelming or disruptive, and it will highlight the true value of the code.\r\n\r\nBoth Appsilon and I personally have taken part in many R projects for which the goal was to clean and organise the code as well as the project structure. We would like to share our experience, best practices and useful tools to share code shamelessly.\r\n\r\nDuring the presentation I will show: \r\nsetting up the development environment with **packrat**, **renv** and **docker**,\r\norganising the project structure,\r\nbest practices in writing R code, automated with a **linter**,\r\nsharing the code using git,\r\norganising the workflow with **drake**,\r\noptimising Shiny apps and data loading with **plumber** and a **database**,\r\npreparing tests and continuous integration with **Circle CI**.", 610 | "email": "marcin@appsilon.com" 611 | }, 612 | { 613 | "title": "Going in the fast lane with R. How we use R within the biggest digital dealer program in EMEA.", 614 | "author": "Marco Cavaliere", 615 | "affiliation": "Like Reply - Business Data Analyst", 616 | "namesurname": "MARCO,CAVALIERE", 617 | "track": "R Production, R Machine Learning & Models", 618 | "session_type": "Regular talk", 619 | "description": "How we are using R as the foundation of all the data-related tasks in the biggest dealer digital program at FCA.
\r\nFrom simple tasks such as dashboarding or reporting to more strategic capabilities such as predicting advertising ROI through Tensorflow or developing production-grade, data-driven microservices, we leverage the R ecosystem to deliver better results and increase data awareness for all the project stakeholders.", 620 | "email": "m.cavaliere@reply.it" 621 | }, 622 | { 623 | "title": "R alongside Airflow, Docker and Gitlab CI", 624 | "author": "Matthias Bannert", 625 | "affiliation": "Research Engineering Lead at ETH Zurich, KOF Swiss Economic Institute", 626 | "namesurname": "MATTHIAS,BANNERT", 627 | "track": "R Production", 628 | "session_type": "Regular talk", 629 | "description": "The KOF Swiss Economic Institute at ETH Zurich (Switzerland) regularly surveys more than 10K companies, computes economic indicators and forecasts as it monitors the national economy. Today, the institute updates its website in automated fashion and operates automated data pipelines to partners such as regional statistical offices or the Swiss National Bank. At KOF, production is based on an open-source ecosystem to a large degree. More and more processes are being migrated to an environment that consists of the open-source components Apache Airflow, Docker, Gitlab Continuous Integration, PostgreSQL and R. This talk shows not only how well R interfaces and works with all parts, from workflow automation to databases, but also how R's advantages impact this system: from RStudio Servers to internal packages and our own internal mini-CRAN, the use of the R language is crucial in making the environment stable and convenient to maintain for staff with a software-carpentry type of background.", 630 | "email": "matthias.bannert@gmail.com" 631 | }, 632 | { 633 | "title": "DaMiRseq 2.0: from high dimensional data to cost-effective reliable prediction models", 634 | "author": "Mattia Chiesa", 635 | "affiliation": "Senior data scientist @ Centro Cardiologico Monzino IRCCS", 636 | "namesurname": "MATTIA,CHIESA", 637 | "coauthor": "Chiara Vavassori, Gualtiero I. Colombo, Luca Piacentini", 638 | "track": "R Life Sciences", 639 | "session_type": "Regular talk", 640 | "description": "High dimensional data generated by modern high-throughput platforms pose a great challenge in selecting a small number of informative variables for biomarker discovery and classification. Machine learning is an appropriate approach to derive general knowledge from data, identifying highly discriminative features and building accurate prediction models. To this end, we developed the R/Bioconductor package DaMiRseq, which (i) helps researchers to filter and normalize high-dimensional datasets arising from RNA-Seq experiments, by removing noise and bias, and (ii) exploits a custom machine learning workflow to select the minimum set of robust informative features able to discriminate classes.\r\nHere, we present version 2.0 of the DaMiRseq package, an extension that provides a flexible and convenient framework for managing high-dimensional data such as omics data, large-scale medical histories, or even social media and financial data. Specifically, DaMiRseq 2.0 implements new functions that allow training and testing of several different classifiers and selection of the most reliable one, in terms of classification performance and number of selected features. The resulting classification model can be further used for any prediction purpose.
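A heavily hedged sketch of the kind of workflow DaMiRseq targets; only DaMiR.normalization() is shown as a live call, and the commented 2.0 training/testing steps are assumptions to be checked against the package vignette:

    library(DaMiRseq)
    # 'SE' is a SummarizedExperiment of raw RNA-Seq counts plus class labels.
    data_norm <- DaMiR.normalization(SE, minCounts = 10, fSample = 0.7)
    # Version 2.0 then adds functions to train several classifiers and select
    # the most reliable one (function names below are assumptions, not verified):
    # model <- DaMiR.EnsL_Train(training_set, classes)
    # preds <- DaMiR.EnsL_Test(model, test_set)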
This framework will give users the ability to build an efficient prediction model that can be easily replicated in further related settings.", 641 | "email": "mattia.chiesa@ccfm.it" 642 | }, 643 | { 644 | "title": "How to apply R in a hospital environment on standard available hospital-wide data", 645 | "author": "Mieke Deschepper", 646 | "affiliation": "University hospital Ghent, staff member Strategic Policy cell, Ph.D.", 647 | "namesurname": "MIEKE,DESCHEPPER", 648 | "track": "R Life Sciences", 649 | "session_type": "Regular talk", 650 | "description": "Lots of data is registered within hospitals for financial, clinical and administrative purposes. Today, this data is barely used, due to a lack of awareness of its existence, of its possible applications, and of the skills to execute the analysis.\r\nIn this presentation we show how we can apply R to this data and what the possibilities are using standard available hospital-wide data on a low budget. \r\n1.\tReporting with R \r\n-\tUsing R and Markdown as a tool for management reporting\r\n-\tUsing R for data handling (ETL)\r\n-\tShiny applications as an alternative for dashboarding\r\n2.\tUsing R as a statistical tool\r\n-\tPerforming regression models to gain insight into certain predictors\r\n3.\tUsing R as a data science tool\r\n-\tUsing R to perform Machine Learning analyses, e.g. Random Forests\r\n-\tUsing R for data wrangling and handling high-dimensional data\r\n4.\tRequirements for all of the above", 651 | "email": "mieke.deschepper@uzgent.be" 652 | }, 653 | { 654 | "title": "Computer Algebra Systems in R", 655 | "author": "Mikkel Meyer Andersen", 656 | "affiliation": "Assoc. Prof., Department of Mathematical Sciences, Aalborg University, Denmark", 657 | "namesurname": "MIKKEL,MEYER,ANDERSEN", 658 | "coauthor": "Søren Højsgaard", 659 | "track": "R World, R Machine Learning & Models, R Applications", 660 | "session_type": "Regular talk", 661 | "description": "R's ability to do symbolic mathematics is largely restricted to finding derivatives. There are many tasks involving symbolic math that are of interest to R users, e.g. inversion of symbolic matrices, limits and solving non-linear equations. Users must resort to other computer algebra systems (CAS) for such tasks, and many R users (especially outside of academia) do not readily have access to such software. There are also other, indirect use-cases of symbolic mathematics in R that can exploit other strengths of R, including Shiny apps with auto-generated mathematics exercises.\r\n\r\nWe are maintaining two packages enabling symbolic mathematics in R: Ryacas and caracas. Ryacas is based on Yacas (Yet Another Computer Algebra System) and caracas is based on SymPy (a Python library). Each has its advantages: Yacas is extensible and has a close integration with R, which makes auto-generated mathematics exercises easy to create.
SymPy is feature-rich and thus gives many possibilities.\r\n\r\nIn this talk we will discuss the two packages and demonstrate various use-cases, including uses that help in understanding statistical models and Shiny apps with auto-generated mathematics exercises.", 662 | "email": "mikl@math.aau.dk" 663 | }, 664 | { 665 | "title": "Interpretable and accessible Deep Learning for omics data with R and friends", 666 | "author": "Moritz Hess", 667 | "affiliation": "Research Associate, Institute of Medical Biometry and Statistics, Faculty of Medicine and Medical Center - University of Freiburg", 668 | "namesurname": "MORITZ,HESS", 669 | "coauthor": "Stefan Lenz, Harald Binder", 670 | "track": "R Life Sciences", 671 | "session_type": "Regular talk", 672 | "description": "Recently, generative Deep Learning approaches were shown to have huge potential for e.g. retrieving compact, latent representations of high-dimensional omics data such as single-cell RNA-Seq data. However, there are no established methods to infer how these latent representations relate to the observed variables, i.e. the genes.\r\n\r\nFor extracting interpretable patterns from gene expression data that indicate distinct sub-populations in the data, we here employ log-linear models, applied to synthetic data and the corresponding latent representations sampled from generative deep models trained with single-cell gene expression data.\r\n\r\nWhile omics data are routinely analyzed in R and powerful toolboxes tailored to omics data are available, there are no established and truly accessible approaches for Deep Learning applications here. \r\n\r\nTo close this gap, we here demonstrate how easily customizable Deep Learning frameworks, developed for the Julia programming language, can be leveraged in R to perform accessible and interpretable Deep Learning with omics data.", 673 | "email": "hess@imbi.uni-freiburg.de" 674 | }, 675 | { 676 | "title": "Elevating shiny module with {tidymodules}", 677 | "author": "Mustapha Larbaoui", 678 | "affiliation": "Novartis, Associate Director, Scientific Computing & Consulting", 679 | "namesurname": "MUSTAPHA,LARBAOUI", 680 | "coauthor": "Doug Robinson, Xiao Ni, David Granjon", 681 | "track": "R Dataviz & Shiny", 682 | "session_type": "Regular talk", 683 | "description": "Shiny App developers have warmly welcomed the concept of Shiny modules as a way to simplify the app development process through the introduction of reusable building blocks. Shiny modules are similar in spirit to the concept of functions in R, except each is implemented with paired ui and server codes along with their own namespace. The {tidymodules} R package introduces a novel structure that harmonizes module development based on R6 (https://r6.r-lib.org/), an implementation of encapsulated object-oriented programming for R; thus, knowledge of R6 is a prerequisite for using {tidymodules} to develop Shiny modules. Some key features of this package are module encapsulation, reference semantics, a central module store and an innovative framework for enabling and facilitating cross-module communication. It does this through the creation of “ports”, both input and output, where users may pass data and information through pipe operators. Because the connections are strictly specified, the module network may be visualized, showing how data move from one module to another.
We feel the {tidymodules} framework will simplify the module development process and reduce code complexity through programming concepts like inheritance.", 684 | "email": "mustapha.larbaoui@novartis.com" 685 | }, 686 | { 687 | "title": "APFr: Average Power Function and Bayes FDR for Robust Brain Networks Construction", 688 | "author": "Nicolo' Margaritella", 689 | "affiliation": "University of Edinburgh", 690 | "namesurname": "NICOLO',MARGARITELLA", 691 | "coauthor": "Piero Quatto", 692 | "track": "R Life Sciences", 693 | "session_type": "Regular talk", 694 | "description": "Brain functional connectivity is widely investigated in neuroscience. In recent years, the study of brain connectivity has been largely aided by graph theory. The link between time series recorded at multiple locations in the brain and a graph is usually an adjacency matrix. This converts a measure of the connectivity between two time series, typically a correlation coefficient, into a binary choice on whether the two brain locations are functionally connected or not. As a result, the choice of a threshold over the correlation coefficient is key.\r\nIn the present work, we propose a multiple testing approach to the choice of a suitable threshold, using the Bayes false discovery rate (FDR) and a new estimator of the statistical power called the average power function (APF) to balance the two types of statistical error. \r\nWe show that the proposed APF behaves well in the case of independent tests and is reliable under several dependence conditions. Moreover, we propose a robust method for threshold selection using the 5% and 95% percentiles of the APF and FDR bootstrap distributions, respectively, to improve stability.\r\nIn addition, we developed an R package called APFr, which performs robust APF and Bayes FDR estimation and provides simple examples to improve usability. The package has attracted more than 3200 downloads since its publication online (June 2019) at https://CRAN.R-project.org/package=APFr.", 695 | "email": "N.Margaritella@sms.ed.ac.uk" 696 | }, 697 | { 698 | "title": "Flexible Meta-Analysis of Generalized Additive Models with metagam", 699 | "author": "Øystein Sørensen", 700 | "affiliation": "Associate Professor, University of Oslo", 701 | "namesurname": "ØYSTEIN,SØRENSEN", 702 | "coauthor": "Andreas Brandmaier", 703 | "track": "R Life Sciences, R Machine Learning & Models", 704 | "session_type": "Regular talk", 705 | "description": "Analyzing biomedical data from multiple studies has great potential in terms of increasing statistical power, enabling detection of associations of smaller magnitude than would be possible when analyzing each study separately. Restrictions due to privacy or proprietary data, as well as more practical concerns, can make it hard to share datasets, such that analyzing all data in a single mega-analysis might not be possible. Meta-analytic methods provide a way to overcome this issue, by combining aggregated quantities like model parameters or risk ratios. However, most meta-analytic tools have focused on parametric statistical models, and software for meta-analyzing semi-parametric models like generalized additive models (GAMs) has not been developed. The metagam package attempts to fill this gap: it provides functionality for removing individual participant data from GAM objects such that they can be analyzed in a common location; furthermore, metagam enables meta-analysis of the resulting GAM objects, as well as various tools for visualization and statistical analysis.
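A minimal sketch of the metagam workflow just described, assuming its two headline functions strip_rawdata() and metagam(); 'cohort_data', the model formula, the grid_size argument and the plot call are illustrative:

    library(mgcv)
    library(metagam)
    # Each cohort fits a GAM locally, then strips individual participant data:
    fits <- lapply(cohort_data, function(d) gam(y ~ s(age), data = d))
    stripped <- lapply(fits, strip_rawdata)
    # The stripped fits can be shared and meta-analyzed in a common location:
    meta <- metagam(stripped, grid_size = 100)
    plot(meta)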
This talk will illustrate the use of the metagam package for analyzing the relationship between sleep quality and brain structure using data from six European brain imaging cohorts.", 706 | "email": "oystein.sorensen.1985@gmail.com" 707 | }, 708 | { 709 | "title": "EPIMOD: A computational framework for studying epidemiological systems.", 710 | "author": "Paolo Castagno", 711 | "affiliation": "Ph.D.", 712 | "namesurname": "PAOLO,CASTAGNO", 713 | "coauthor": "Simone Pernice, Matteo Sereno, Marco Beccuti.", 714 | "track": "R Life Sciences, R Applications", 715 | "session_type": "Regular talk", 716 | "description": "Computational-mathematical models can be efficiently used to provide new insights into the drivers of disease spread, investigating different explanations of observed resurgence and predicting potential effects of different therapies. In this context, we present a new general modeling framework for the analysis of epidemiological and biological systems, characterized by features that make it easy to use even for researchers without advanced mathematical and computational skills. The R package, called “Epimod”, provides a friendly interface to the model creation and analysis techniques implemented in the framework. In detail, by exploiting the graphical formalism of Petri Nets it is possible to simplify the model creation phase, providing a compact graphical description of the system and an automatic derivation of the underlying stochastic or deterministic process. Then, by using four functions it is possible to deal with the Model Generation, Sensitivity Analysis, Model Calibration and Model Analysis phases. Finally, the Docker containerization of all analysis techniques grants a high level of portability and reproducibility. We applied Epimod to model pertussis epidemiology, investigating alternative explanations of its resurgence and predicting the potential effects of different vaccination strategies.", 717 | "email": "castagno@di.unito.it" 718 | }, 719 | { 720 | "title": "CorrelAidX - Building R-focused Communities for Social Good on the Local Level", 721 | "author": "Regina Siegers", 722 | "affiliation": "CorrelAidX Coordination", 723 | "namesurname": "REGINA,SIEGERS", 724 | "coauthor": "Konstantin Gavras", 725 | "track": "R World", 726 | "session_type": "Regular talk", 727 | "description": "Data scientists with their valuable skills have enormous potential to contribute to the social good. This is also true for the R community - and R users seem to be especially motivated to use their skills for the social good, as the overwhelmingly positive reception of Julien Cornebise's keynote \"AI for Good\" at useR2019 (Cornebise 2019) has shown. However, specific strategies for putting the abstract goal \"use data science for the social good\" into practice are often missing, especially in volunteering contexts like the R community, where resources are often limited.\r\n\r\nIn our talk, we present formats that we have implemented on the local level to build R-focused, data-for-good communities across Europe. Originating from the German data4good network CorrelAid, with its over 1600 members, we have established 9 local CorrelAidX groups in Germany, the Netherlands, and France.\r\n\r\nThe specific formats build on a three-pillared concept of community building, namely group-bonding, social entrepreneurship and outreach.
We present multiple examples that illustrate how our local chapters operate to put data science for good into practice - using the formats of data dialogues, local data4good projects, and CorrelAidX workshops. Lastly, we also outline possibilities to implement such formats in cooperation between CorrelAidX chapters and R community groups such as R user groups or R-Ladies chapters.", 728 | "email": "regina.siegers@posteo.de" 729 | }, 730 | { 731 | "title": "Interactive visualization of complex texts", 732 | "author": "Renate Delucchi Danhier", 733 | "affiliation": "Post-Doc, TU Dortmund", 734 | "namesurname": "RENATE,DELUCCHI,DANHIER", 735 | "coauthor": "Paula González Ávalos", 736 | "track": "R Dataviz & Shiny", 737 | "session_type": "Regular talk", 738 | "description": "Hundreds of speakers may describe the same circumstance - e.g. explaining a fixed route to a goal - without producing two identical texts. The enormous variability of language and the complexity involved in encoding meaning pose a real difficulty for linguists analyzing text databases. In order to aid linguists in identifying patterns to perform comparative research, we developed an interactive Shiny app that enables quantitative analysis of text corpora without oversimplifying the structure of language. Route directions are an example of complex texts, in which speakers make cognitive decisions such as segmenting the route, selecting landmarks and organizing spatial concepts into sentences. In the data visualization, symbols and colors representing linguistic concepts are plotted into coordinates that relate the information to fixed points along the route. Six interconnected layers of meaning represent the multi-layered form-to-meaning mapping characteristic of language. The Shiny app allows users to select and deselect information on these different layers, offering a holistic linguistic analysis well beyond the complexity attempted within traditional linguistics. The result is a kind of visual language in itself that deconstructs the interconnected layers of meaning found in natural language.", 739 | "email": "renatedelucchi@gmail.com" 740 | }, 741 | { 742 | "title": "CONNECTOR: a computational approach to study intratumor heterogeneity.", 743 | "author": "Simone Pernice", 744 | "affiliation": "Ph.D student at Department of Computer Science of the University of Turin", 745 | "namesurname": "SIMONE,PERNICE", 746 | "coauthor": "Beccuti Marco, Sirovich Roberta, Cordero Francesca", 747 | "track": "R Life Sciences", 748 | "session_type": "Regular talk", 749 | "description": "The literature is characterized by a broad class of mathematical models which can be used for fitting cancer growth time series, but with no global consensus or biological evidence that can drive the choice of the correct model. The conventional perception is that mechanistic models enable the biological understanding of the systems under study. However, there is no way that these models can capture the variability characterizing cancer progression, especially because of the irregularity and sparsity of the available data.\r\nFor this reason, we propose CONNECTOR, an R package built on the model-based approach for clustering functional data. This method is based on clustering and fitting the data through a combination of natural cubic spline bases with coefficients treated as random variables. Our approach is particularly effective when the observations are sparse and irregularly spaced, as growth curves usually are.
CONNECTOR provides a tool set to easily guide users through the parameter selection, i.e., (i) the dimension of the spline basis, (ii) the dimension of the mean space and (iii) the number of clusters to fit, all to be properly chosen before fitting. The effectiveness of CONNECTOR is evaluated on growth curves of Patient Derived Xenografts (PDXs) of ovarian cancer. Genomic analyses of PDXs allowed us to correlate fitted and clustered PDX growth curves with cell population distributions.", 750 | "email": "simone.pernice@unito.it" 751 | }, 752 | { 753 | "title": "gWQS: An R Package for Linear and Generalized Weighted Quantile Sum (WQS) Regression", 754 | "author": "Stefano Renzetti", 755 | "affiliation": "PhD Student at Università degli Studi di Milano", 756 | "namesurname": "STEFANO,RENZETTI", 757 | "coauthor": "Chris Gennings, Paul C. Curtin", 758 | "track": "R Machine Learning & Models, R Life Sciences", 759 | "session_type": "Regular talk", 760 | "description": "Weighted Quantile Sum (WQS) regression is a statistical model for multivariate regression in the high-dimensional datasets commonly encountered in environmental exposure studies. The model constructs a weighted index estimating the mixture effect associated with all predictor variables on an outcome. The package gWQS extends WQS regression to applications with continuous, categorical and count outcomes. We provide four examples to illustrate the usage of the package.", 761 | "email": "stefano.renzetti@unibs.it" 762 | }, 763 | { 764 | "title": "Transparent Journalism Through the Power of R", 765 | "author": "Tatjana Kecojevic", 766 | "affiliation": "SisterAnalyst.org; founder and director", 767 | "namesurname": "TATJANA,KECOJEVIC", 768 | "track": "R World", 769 | "session_type": "Regular talk", 770 | "description": "This study examines the often-tricky process of delivering data literacy programmes to professionals with the most to gain from a deeper understanding of data analysis. As such, the author discusses the process of building and delivering training strategies to journalists in regions where press freedom is constrained by numerous factors, not least of all institutionalised corruption. \r\n\r\nStories that are supplemented with transparent procedural systems are less likely to be contradicted and challenged by vested-interest actors. Journalists are able to present findings supported by charts and infographics, but these are open to interpretation. Therefore, most importantly, the data and code of the applied analytical methodology should also be available for scrutiny, making them less likely to be subverted or prohibited.\r\n\r\nAs part of creating an accessible programme geared to acquiring the skills necessary for data journalism, the author takes a step-by-step approach to discussing the actualities of building online platforms for training purposes. Through the use of the grammar of graphics in R and Shiny, a web application framework for R, it is possible to develop interactive applications for graphical data visualisation. Presenting findings through interactive and accessible visualisation methods in a transparent and reproducible way is an effective form of reaching audiences that might not otherwise realise the value of the topic or data at hand. \r\n\r\nThe resulting ‘R toolbox for journalists’ is an accessible open-source resource.
It can also be adapted to accommodate the need to provide a deeper understanding of the potential of data proficiency to other professions.\r\n\r\nThe accessibility of R allows users to build support communities, which in the case of journalists are essential for information gathering. Establishing and implementing transparent channels of communication is the key to scrupulous journalism and is why R is so applicable to this objective.", 771 | "email": "tatjana.keco@gmail.com" 772 | }, 773 | { 774 | "title": "What's New in ShinyProxy", 775 | "author": "Tobias Verbeke", 776 | "affiliation": "Managing Director, Open Analytics", 777 | "namesurname": "TOBIAS,VERBEKE", 778 | "track": "R Dataviz & Shiny", 779 | "session_type": "Regular talk", 780 | "description": "Shiny is a nice technology for writing interactive R-based applications. It is broadly adopted and the R community has collaborated on many interesting extensions. Until recently, though, deployments in larger organizations and companies required proprietary solutions. ShinyProxy fills this gap and offers a fully open-source alternative to run and manage Shiny applications at scale. In this talk we detail the ShinyProxy architecture and demonstrate how it meets the needs of organizations. We will discuss how it scales to thousands of concurrent users and how it offers authentication and authorization functionality using standard technologies (LDAP, ActiveDirectory, OpenID Connect, SAML 2.0 and Kerberos). Also, we will discuss the management interface, and how it allows monitoring application usage and collecting usage statistics in event-logging databases. Finally, we will demonstrate that Shiny applications can now be easily embedded in broader applications and (responsive) web sites using the ShinyProxy API. Learn how academic institutions, governmental organizations and industry roll out Shiny apps with ShinyProxy and how you can do this too. See https://shinyproxy.io." 781 | }, 782 | { 783 | "title": "Visualising and Modelling Bike Sharing Mobility usage in the city of Milan", 784 | "author": "Agostino Torti", 785 | "affiliation": "PhD student - Politecnico di Milano", 786 | "namesurname": "AGOSTINO,TORTI", 787 | "coauthor": "Alessia Pini, Simone Vantini", 788 | "track": "R Dataviz & Shiny", 789 | "session_type": "Shiny demo", 790 | "description": "A major trend in modern science is the collection of datasets which are not only “big” but also “complex”. This is particularly true in all sharing mobility systems, where data are continuously collected at all times and are characterised by a high number of features. To extract the useful information contained in this huge mass of data, the development of novel statistical techniques and innovative visualization methods is required.\r\nIn this context, we focused on BikeMi, the main bike sharing system (BSS) in Milan, Italy, and we implemented an R Shiny App to analyse and study people's mobility behaviour across the city. Through the app, it is possible to dynamically select different parameters which allow users to visualise the bike sharing flows among the districts of the city according to the hour of the day, the day of the week and the weather conditions. Moreover, a predictive model is implemented in the dashboard, allowing users to observe the future behaviour of the BSS.
By doing this, we would like both to visualize the statistically significant spatio-temporal patterns of the users and to model each possible bike flow in the BikeMi network.", 791 | "email": "agostino.torti@gmail.com" 792 | }, 793 | { 794 | "title": "Media Shiny: Marketing Mix Models Builder", 795 | "author": "Andrea Melloncelli", 796 | "affiliation": "VanLog", 797 | "namesurname": "ANDREA,MELLONCELLI", 798 | "coauthor": "Mariachiara Fortuna", 799 | "track": "R Dataviz & Shiny, R Machine Learning & Models, R Production", 800 | "session_type": "Shiny demo", 801 | "description": "Marketing Mix Models are used to understand the effects of advertising campaigns. Building such models is challenging: first of all, it requires controlling for external effects, such as seasonality, competitor activities, etc. Secondly, it requires modelling the decay effect of communication (the adstock effect: I do my advertising today, and in two weeks it still has some effect).\r\nThe MediaShiny application allows users to build Marketing Mix Models interactively: all steps of MMM, such as selecting, transforming and exploring features (time series), adstock control, model building, evaluation and forecasting, can be done interactively. \r\nAs a result the model explains the impact on sales of each media channel (TV, digital, press, etc.), controlling for the external effects. An extra module allows media budget optimization, using historical data to understand whether the level of advertising has no impact or is too high (saturation).\r\n\r\nMediaShiny is a Package App developed with Golem and modularized with Shiny modules, automatically built and provisioned as a Docker image running in a ShinyProxy instance. A good user experience is provided through drag and drop and navigation guided by action buttons.", 802 | "email": "andrea.melloncelli@vanlog.it" 803 | }, 804 | { 805 | "title": "ESPRES: A shiny web tool to support River Basin Management planning in European Watersheds", 806 | "author": "Angel Udias", 807 | "affiliation": "European Commission, Joint Research Centre (JRC), Ispra, Italy", 808 | "namesurname": "ANGEL,UDIAS", 809 | "coauthor": "A. Udias, B. Grizzetti, F. Bouraoui, O Vigiak, A. Pistocchi", 810 | "track": "R Life Sciences", 811 | "session_type": "Shiny demo", 812 | "description": "Integrated river basin management must meet environmental targets while preserving the economic activities of its communities. Stakeholder decisions need to consider conflicting trade-offs between legislative environmental targets and economic activities, while maintaining a basis of transparency and accountability. ESPRES is a Shiny web-based Decision Support Tool (DST) that can be used by stakeholders to explore management options in European water bodies. The management options considered in ESPRES are related to the reduction of pressures (water use and nutrient application). \r\nThe Shiny web interface provides a point of access to perform analyses of efficient pressure-reduction strategies reflecting stakeholders' perception of the costs/effort, political difficulty, and social acceptability of the available solutions. Stakeholders express preferences and perceived difficulties in addressing each environmental pressure by assigning relative weights.
The tool includes a multi-objective optimization (MOO) engine to identify Pareto-efficient strategies that maximize the water quality in the basin while minimizing the total effort for reducing the pressures. \r\nAn online version of ESPRES is currently available (www.espres.eu) for four European basins of the Globaqua project, namely the Adige, the Ebro, the Evrotas, and the Sava, addressing water abstraction and nutrient pollution pressures.", 813 | "email": "angelluis.udias@gmail.com" 814 | }, 815 | { 816 | "title": "tsviz: a data-scientist-friendly addin for RStudio", 817 | "author": "Emanuele Fabbiani", 818 | "affiliation": "Chief Data Scientist at xtream, PhD student in Machine Learning", 819 | "namesurname": "EMANUELE,FABBIANI", 820 | "coauthor": "Marta Peroni, Riccardo Maganza", 821 | "track": "R Dataviz & Shiny", 822 | "session_type": "Shiny demo", 823 | "description": "In recent years, charting libraries have evolved in two main directions: first, they have provided users with as many features as possible; second, they have added high-level APIs to easily create the most frequent visualizations. RStudio, with its addins, offers the opportunity to further ease the creation of common plots. \r\n\r\nBorn as an internal project at xtream, tsviz is an open-source Shiny-based addin which contains powerful tools to perform exploratory analysis of multivariate time series.\r\n\r\nIts usage is dead simple. Once launched, it scans the global environment for suitable variables. You choose one, and several plots of the time series are shown: line charts, scatter plots, autocorrelograms and periodograms are only a few examples. Interactivity is achieved through the miniUI framework and the adoption of Plotly charts.\r\n\r\nIts wide adoption among our customers and the overall positive feedback we received demonstrate how addins, usually thought of as shortcuts for developers, may provide effective support to data scientists in performing their routine tasks.", 824 | "email": "donlelef@gmail.com" 825 | }, 826 | { 827 | "title": "Mobility scan", 828 | "author": "Josue Aduna", 829 | "affiliation": "Behavioural and data scientist at Livemobility", 830 | "namesurname": "JOSUE,ADUNA", 831 | "track": "R Dataviz & Shiny", 832 | "session_type": "Shiny demo", 833 | "description": "This is a Shiny application designed and developed to foster sustainable mobility behavior under a specific initiative that I currently work on: Livemobility (see https://www.livemobility.com/). \r\n\r\nBroadly speaking, Livemobility is a platform that rewards people for sustainable commuting behavior and helps companies to save money, avoid environmental pollution, improve public health and save travel time.
This is achieved through a digital ecosystem that analyses mobility behavior and generates personalized insights to improve mobility efficiency.\r\n\r\nThe Shiny application makes use of web interactive settings together with Google Maps APIs to provide relevant indicators of impact, generate geographic scans and create mobility profiles.", 834 | "email": "jadunac@gmail.com" 835 | }, 836 | { 837 | "title": "Developing Shiny applications to facilitate precision agriculture workflows", 838 | "author": "Lorenzo Busetto", 839 | "affiliation": "Institute for Remote Sensing of Environment - National Research Council of Italy (CNR-IREA)", 840 | "namesurname": "LORENZO,BUSETTO", 841 | "coauthor": "Luigi Ranghetti, Donato Cillis, Maddalena Campi, Saverio Zagaglia, Gabriele Dottori, Mirco Boschetti", 842 | "track": "R Applications, R Dataviz & Shiny", 843 | "session_type": "Shiny demo", 844 | "description": "Precision Agriculture applications rely on geospatial datasets from heterogeneous sources, such as crop maps, information about fertilization/phytosanitary treatments, and satellite and meteo data, to optimize agricultural practices from an economic and environmental standpoint. Software instruments that allow users to easily record, manage and process such datasets are therefore of paramount importance to facilitate, standardize and speed up the steps required to implement specific workflows. Although the required functionalities are available in open-source/commercial software, technicians are often required to use different software tools. This affects the time and effort required to replicate a specific workflow on different areas and crop seasons. \r\n\r\nIn this contribution we present our experience in developing two Shiny-based prototypes specifically tailored to the needs of operators of an agro-consulting firm providing precision agriculture services. The first prototype is mainly aimed at providing a simplified, standardized and scalable way to insert, record and query information about agricultural practices, such as crop type/variety, fertilisation and phytosanitary treatments, and yield. The second is instead dedicated to facilitating access to satellite imagery data, and to applying dedicated processing algorithms for the identification of homogeneous Management Unit Zones.", 845 | "email": "lbusett@gmail.com" 846 | }, 847 | { 848 | "title": "\"GUInterp\": a Shiny GUI to support spatial interpolation", 849 | "author": "Luigi Ranghetti", 850 | "affiliation": "Institute for Remote Sensing of Environment, Consiglio Nazionale delle Ricerche (IREA-CNR)", 851 | "namesurname": "LUIGI,RANGHETTI", 852 | "coauthor": "Luigi Ranghetti, Mirco Boschetti, Donato Cillis, Lorenzo Busetto", 853 | "track": "R Dataviz & Shiny", 854 | "session_type": "Shiny demo", 855 | "description": "In this demo we present \"GUInterp\", a Shiny interface written to facilitate the operations required to interpolate point data. A typical spatial interpolation workflow includes common steps: loading point data, filtering them to exclude undesired outlier values, setting the interpolation method and parameters, defining an output raster grid and processing the data. Interpolation can be conducted in R using dedicated packages; nevertheless, the availability of an interactive interface can be useful to provide additional control during steps requiring user intervention and to facilitate users with low or no programming skills. \"GUInterp\" was written for this purpose.
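For context, a generic sketch of the interpolation steps such a GUI wraps, using gstat and sf rather than GUInterp's own API (the file name and cell size are placeholders):

    library(sf)
    library(gstat)
    pts <- st_read("samples.gpkg")  # point data carrying a variable 'z'
    grid <- st_as_sf(st_make_grid(pts, cellsize = 50, what = "centers"))
    z_idw <- idw(z ~ 1, pts, grid)  # inverse distance weighting
    # Ordinary Kriging with a fitted spherical semivariogram model:
    vg <- variogram(z ~ 1, pts)
    vg_fit <- fit.variogram(vg, vgm("Sph"))
    z_ok <- krige(z ~ 1, pts, grid, model = vg_fit)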
The user can import input point data, optionally loading a polygon dataset of borders used to constrain the extent of the interpolated outputs. A set of selectors allows filtering input points based on the distribution of the variable to interpolate (which is shown with a reactive histogram) or the spatial position of points (visible on a map). The interpolation can be performed with IDW or Ordinary Kriging methods: in the latter case, the semivariogram can be interactively defined and optimised using a dedicated interface. Further settings can be exploited to tune computation requirements (RAM usage, amount of time) on the basis of available hardware or user needs. \"GUInterp\" is released as an R package under the GNU GPL-3 license.", 856 | "email": "ranghetti.l@irea.cnr.it" 857 | }, 858 | { 859 | "title": "A demonstration of ABACUS: Apps Based Activities for Communicating and Understanding Statistics", 860 | "author": "Mintu Nath", 861 | "affiliation": "Medical Statistics Team, Institute of Applied Health Sciences, University of Aberdeen, AB25 2ZD, UK", 862 | "namesurname": "MINTU,NATH", 863 | "track": "R Dataviz & Shiny", 864 | "session_type": "Shiny demo", 865 | "description": "ABACUS, developed using the Shiny framework, is a set of applications for effective communication and understanding in statistics. It is currently available as an R package. Users who are not familiar with R programming can also access the applications through its web-based interface. The current version of ABACUS includes properties of the Normal distribution, properties of the sampling distribution, one-sample z and t tests, the two-sample unpaired t-test, analysis of variance, and a comparison of the Normal and t distributions. Using an example, the Shiny demonstration will cover the essential features of the application, particularly its relevance in generating data across wide-ranging disciplines, its interactive elements, and best practices for the presentation of results and the interpretation of statistical outputs.", 866 | "email": "dr.m.nath@gmail.com" 867 | }, 868 | { 869 | "title": "Scoring the Implicit Association Test has never been easier: DscoreApp", 870 | "author": "Ottavia M. Epifania", 871 | "affiliation": "University of Padova (IT)", 872 | "namesurname": "OTTAVIA,M.,EPIFANIA", 873 | "coauthor": "Anselmi Pasquale, Robusto Egidio", 874 | "track": "R Dataviz & Shiny", 875 | "session_type": "Shiny demo", 876 | "description": "Throughout the past decades, the interest in the implicit investigation of attitudes and preferences has been constantly growing among social scientists, and the Implicit Association Test (IAT) is one of the most common measures used for this aim. The so-called “IAT effect” (i.e., the difference in respondents’ performance between two contrasting categorization tasks) is usually expressed by the D-score. Although several options exist for computing the D-score, including R packages and SPSS syntaxes, none of them provides either an easy-to-use interface or a means for immediately visualizing the results. A Shiny Web application (DscoreApp) was developed to provide IAT users with an easy-to-use and powerful tool for the computation of the D-score. DscoreApp allows users to upload their IAT data, decide which specific D-score algorithm to compute, and immediately see the results in easy-to-read, interactive graphs.
At the end of the computation, users can download a data frame containing the computed D-score and other information on respondents’ performance, such as the proportion of correct responses or the number of trials exceeding a time threshold. Graphical representations can be downloaded as well. Besides providing an easy-to-use and open-source tool for computing the D-score, DscoreApp allows users to grasp an immediate overview of the results and to inspect them visually.", 877 | "email": "otta.epifania@gmail.com" 878 | }, 879 | { 880 | "title": "rTRhexNG: Hexagon sticker app for rTRNG", 881 | "author": "Riccardo Porreca", 882 | "affiliation": "R Enthusiast at Mirai Solutions", 883 | "namesurname": "RICCARDO,PORRECA", 884 | "track": "R Dataviz & Shiny", 885 | "session_type": "Shiny demo", 886 | "description": "Hexagon stickers have become a popular way to make software tools, and R packages in particular, visually recognizable and stand out as landmarks in an ever-growing ecosystem. In general, good hexagon logos are not only visually appealing but also convey the key aspects of a package with their graphical design.\r\nIn this talk, we will showcase rTRhexNG (https://github.com/miraisolutions/rTRhexNG#readme), a Shiny app built for creating the hexagon sticker of the rTRNG (https://github.com/miraisolutions/rTRNG#readme) package. The core idea behind the logo was to have an appealing design that would at the same time illustrate the key features of the package: jump and split operations on (pseudo-)random sequences. Leveraging the simple yet powerful SVG image format, R was used to automate the creation and location of several visual elements representing random sequences, and a Shiny app was built on top to quickly assess different designs in an interactive way.\r\nWe demonstrate the Shiny app in action to concretely explain what jump and split mean in rTRNG, and show how the sticker design naturally emerges from their visual representation.
The power of this interactive yet automated approach proved invaluable for fine-tuning the final look of the sticker, also making it easy to explore alternative polygon or circle designs to which the implementation naturally extends.", 887 | "email": "riccardo.porreca@mirai-solutions.com" 888 | } 889 | ] 890 | -------------------------------------------------------------------------------- /erum2020program.Rproj: -------------------------------------------------------------------------------- 1 | Version: 1.0 2 | 3 | RestoreWorkspace: No 4 | SaveWorkspace: No 5 | AlwaysSaveHistory: Default 6 | 7 | EnableCodeIndexing: Yes 8 | UseSpacesForTab: Yes 9 | NumSpacesForTab: 2 10 | Encoding: UTF-8 11 | 12 | RnwWeave: Sweave 13 | LaTeX: pdfLaTeX 14 | 15 | AutoAppendNewline: Yes 16 | StripTrailingWhitespace: Yes 17 | 18 | BuildType: Website 19 | -------------------------------------------------------------------------------- /index.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "e-Rum2020 Program" 3 | author: "e-Rum2020 Organizing Committee" 4 | date: "`r Sys.time()`" 5 | site: bookdown::bookdown_site 6 | output: 7 | bookdown::gitbook: 8 | split_by: section 9 | --- 10 | 11 | ```{r, include=FALSE} 12 | knitr::opts_chunk$set(echo = TRUE) 13 | ``` 14 | 15 | 16 | # Index {-} 17 | 18 | - [Book of abstracts for accepted contributions](erum2020-contributed-program.html) 19 | 20 | - [Detailed program schedule booklet](http://2020.erum.io/wp-content/uploads/2020/06/program_brochure_v5_20200617.pdf) 21 | 22 | - [Conference materials](conference-materials.html) (collected in a [GitHub](https://github.com/Milano-R/erum2020program#conference-materials) repository) 23 | 24 | 25 | # eRum2020 Contributed Program 26 | 27 | Overview of accepted contributions organized by session type.
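Each section is generated by knitting `children/session_type.Rmd` once per session type, via the R chunk that follows. The child template itself is not reproduced in this dump, so the sketch below is only a hypothetical illustration of what it might contain; it assumes (as the chunk below ensures) that the `contributions` data frame and the current `session_type` value are visible in the knitting environment:

## `r session_type`

```{r, echo = FALSE, results = 'asis'}
# Hypothetical sketch, not the actual children/session_type.Rmd:
# emit one subsection per contribution of the current session type.
st_contributions <- contributions[contributions$session_type == session_type, ]
for (i in seq_len(nrow(st_contributions))) {
  cat("### ", st_contributions$title[i], " {-}\n\n", sep = "")
  cat("*", st_contributions$author[i], "* (", st_contributions$affiliation[i], ")\n\n", sep = "")
  cat(st_contributions$description[i], "\n\n")
}
```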
28 | 29 | ```{r, echo = FALSE, results = 'asis', warning = FALSE} 30 | contributions <- jsonlite::read_json( 31 | file.path("data", "erum2020_confirmed_program.json"), 32 | simplifyVector = TRUE 33 | ) 34 | session_types <- unique(contributions$session_type) 35 | for (session_type in session_types) { 36 | cat(knitr::knit_child( 37 | file.path("children", "session_type.Rmd"), 38 | quiet = TRUE 39 | )) 40 | } 41 | ``` 42 | -------------------------------------------------------------------------------- /materials/An enriched disease risk assessment model based on historical blood donors records.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/An enriched disease risk assessment model based on historical blood donors records.pdf -------------------------------------------------------------------------------- /materials/Hanly-CovidR-joint-winner-COVOID-Package.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/Hanly-CovidR-joint-winner-COVOID-Package.pdf -------------------------------------------------------------------------------- /materials/Jooken-CollaborateR.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/Jooken-CollaborateR.pdf -------------------------------------------------------------------------------- /materials/MOFA_bvelten.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/MOFA_bvelten.pdf -------------------------------------------------------------------------------- /materials/README.md: -------------------------------------------------------------------------------- 1 | # Hosted materials 2 | 3 | This is a space for collecting resources (e.g. slides) to be linked as [Conference Materials](https://github.com/Milano-R/erum2020program#conference-materials), in case they are not already available online. 4 | 5 | We recommend using a naming scheme like `speaker-short-title...`, e.g. `porreca-rTRhexNG-shiny-app.pdf`. 6 | 7 | Resources uploaded here can be linked in the main [README.md](../README.md) as e.g. `[slides](materials/porreca-rTRhexNG-shiny-app.pdf)`. 
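For example, a complete hypothetical entry in the main README for a talk with hosted slides could then look as follows (the field layout mirrors `md_entry()` in tools/program-materials-placeholders.R):

#### rTRhexNG: Hexagon sticker app for rTRNG

- Speaker: Riccardo Porreca
- Materials: [slides](materials/porreca-rTRhexNG-shiny-app.pdf)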
8 | -------------------------------------------------------------------------------- /materials/Severino-CorpFinder-Application_to_identify_Large_Corporate_Risks.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/Severino-CorpFinder-Application_to_identify_Large_Corporate_Risks.pdf -------------------------------------------------------------------------------- /materials/cordanoe_geotopbricks_presenetation.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/cordanoe_geotopbricks_presenetation.pdf -------------------------------------------------------------------------------- /materials/gillespie_harsh-cran.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/gillespie_harsh-cran.pdf -------------------------------------------------------------------------------- /materials/kalibera-invisible-work-on-r.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/kalibera-invisible-work-on-r.pdf -------------------------------------------------------------------------------- /materials/lenz-JuliaConnectoR.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/lenz-JuliaConnectoR.pdf -------------------------------------------------------------------------------- /materials/moritz_time-series-missing-value-visualizations.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/moritz_time-series-missing-value-visualizations.pdf -------------------------------------------------------------------------------- /materials/onkelinx_effectclass.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/materials/onkelinx_effectclass.pdf -------------------------------------------------------------------------------- /tools/.gitignore: -------------------------------------------------------------------------------- 1 | data_dump 2 | -------------------------------------------------------------------------------- /tools/_proof.sh: -------------------------------------------------------------------------------- 1 | #!/bin/bash 2 | 3 | args=( 4 | 5 | # external link cache, set to 28d so that existing links that may have 6 | # turned broken are re-checked every 4 weeks 7 | --timeframe 28d 8 | 9 | # ignored external URLs: 10 | # - LinkedIn links are known to have `999 No error` (gjtorikian/html-proofer#215) and are excluded 11 | --url-ignore /linkedin.com/ 12 | 13 | # enable HTML validation errors from Nokogumbo 14 | --check-html 15 | # full set of check-html errors to be reported 16 | --report-invalid-tags 17 | --report-missing-names 18 | --report-script-embeds 19 | --report-missing-doctype 20 | --report-eof-tags 21 | --report-mismatched-tags 22 | 23 | )
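# The common args above are combined with whatever the caller passes on the
# command line (e.g. a --storage-dir cache location and the files or
# directories to proof), which is forwarded below via "$@".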
24 | 25 | htmlproofer "${args[@]}" "$@" 26 | -------------------------------------------------------------------------------- /tools/eRum2020-program-shared.ods: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/Milano-R/erum2020program/9428c21925b9000df2e0df71ae3defa73e2b898e/tools/eRum2020-program-shared.ods -------------------------------------------------------------------------------- /tools/generate_dataset.R: -------------------------------------------------------------------------------- 1 | # Create dataset to display as bookdown 2 | # To run manually to generate the clean dataset to display 3 | rm(list=ls()) 4 | # Dependencies ---- 5 | library(dplyr) 6 | library(tidyr) 7 | library(readxl) 8 | library(purrr) 9 | library(googledrive) 10 | library(stringr) 11 | 12 | # Expected files ---- 13 | # Generate final output 14 | 15 | output_file <- file.path("data", "erum2020_confirmed_program.json") 16 | 17 | # Contributions dump 18 | 19 | update_dump <- TRUE 20 | dump_dir <- file.path("tools", "data_dump") 21 | 22 | if (update_dump) { 23 | googledrive::drive_auth() 24 | gdrive_program <- drive_find("Program", n_max = 1) 25 | } 26 | 27 | program_download <- function(file) { 28 | if (update_dump) { 29 | no_ext <- sub("[.][^.]*$", "", file) 30 | gdrive_file <- drive_ls(as_id(gdrive_program$id), recursive = TRUE) %>% 31 | subset(., grepl(no_ext, name, fixed = TRUE)) 32 | stopifnot(nrow(gdrive_file) == 1L) 33 | result <- drive_download(as_id(gdrive_file$id), path = file.path(dump_dir, file), overwrite = TRUE) 34 | result$local_path 35 | } else { 36 | file.path(dump_dir, file) 37 | } 38 | } 39 | 40 | sessionize_full_dump <- program_download("erum2020 sessions - exported 2020-03-05.xlsx") 41 | gform_confirmation <- program_download("eRum 2020 - Contribution Acceptance Form (Responses).xlsx") 42 | accepted_full <- program_download("finaltable_homework_contributedsessions.xlsx") 43 | eventbrite_full <- program_download("report-2020-05-31T1707.xlsx") 44 | 45 | if (update_dump) { 46 | googledrive::drive_deauth() 47 | } 48 | 49 | # save.image("dataDumpMay31.RData") 50 | load("dataDumpMay31.RData") 51 | # Read files 52 | all_sessions <- read_excel(sessionize_full_dump, sheet = "All Submitted Sessions") 53 | all_speakers <- read_excel(sessionize_full_dump, sheet = "All Speakers") 54 | confirmations <- read_excel(gform_confirmation) 55 | accepted <- read_excel(accepted_full) 56 | eventbrite <- read_excel(eventbrite_full) 57 | 58 | 59 | # Utilities ---- 60 | remove_spaces_df <- function(df) { 61 | gsub(" ", "", names(df)) 62 | } 63 | 64 | clean_up_NameSurname <- function(df){ 65 | if ("NameSurname" %in% names(df)) { 66 | df <- df %>% 67 | mutate(NameSurname = toupper(NameSurname)) %>% 68 | mutate(NameSurname = gsub(",N/A", "", .$NameSurname)) %>% 69 | mutate(NameSurname = gsub(" ", ",", .$NameSurname)) 70 | } 71 | df 72 | } 73 | 74 | remove_double_spaces <- function(x){ 75 | gsub("  ", " ", x) 76 | } 77 | 78 | clean_df <- function(df){ 79 | df %>% 80 | setNames(remove_spaces_df(df)) %>% 81 | purrr::map_df(trimws) %>% 82 | purrr::map_df(remove_double_spaces) 83 | } 84 | 85 | 86 | # Clean up data ---- 87 | # Remove unnecessary columns 88 | all_sessions_reduced <- all_sessions %>% 89 | mutate(Id = as.character(Id)) %>% 90 | select(-`Date Submitted`) %>% 91 | # Manual fix: add the RICCARDO,CORRADIN and Thomas Maier contributions, which do not seem to be part of all_sessions but are part of the accepted.
Notice that this does not fix the all_speakers tab, so we are still missing their affiliations 92 | bind_rows(accepted %>% filter(Speakers == "Riccardo Corradin")) %>% 93 | bind_rows(accepted %>% filter(Speakers == "Thomas Maier")) %>% 94 | clean_df() %>% 95 | select(Id, Title, Description, Speakers, Track, `Co-authors`) %>% 96 | mutate(Speakers = gsub(" N/A", "", .$Speakers)) %>% 97 | # Manual fix for Shodipo Ayomide, who switched Name and Surname 98 | # Manual fix for Dana Jomar/Jomer, who misspelled her name 99 | # Manual fix for Parvaneh Shafiei, who misspelled her name 100 | mutate(Speakers = case_when( 101 | Speakers == "Shodipo Ayomide" ~ "Ayomide Shodipo", 102 | Speakers == "Dana Jomar" ~ "Dana Jomer", 103 | Speakers == "parvane shafiei" ~ "Parvaneh Shafiei", 104 | Speakers == "mustapha Larbaoui" ~ "Mustapha Larbaoui", 105 | Speakers == "Thomas Maier" ~ "Daniel Meister", 106 | TRUE ~ Speakers 107 | )) %>% 108 | mutate(NameSurname = gsub(" ", ",", .$Speakers)) %>% 109 | clean_up_NameSurname() %>% 110 | mutate(`Co-authors` = case_when(NameSurname == "ANDRÉ,RIVENÆS" ~ NA_character_, 111 | NameSurname == "LUCA,TORRIANI" ~ "Alessandra Menafoglio, Piercesare Secchi", 112 | TRUE ~ `Co-authors` 113 | )) 114 | 115 | # check 116 | # all_sessions_reduced[all_sessions_reduced$NameSurname == "ANDRÉ,RIVENÆS",] 117 | # all_sessions_reduced[all_sessions_reduced$NameSurname == "LUCA,TORRIANI",] 118 | 119 | all_speakers_reduced <- all_speakers %>% 120 | clean_df() %>% 121 | select(Id, FirstName, LastName, TagLine) %>% 122 | # Manual fix: add speaker NameSurname RICCARDO,CORRADIN 123 | # Manual fix: add speaker NameSurname THOMAS,MAIER 124 | add_row(FirstName = "Riccardo", LastName = "Corradin", TagLine = "Università degli Studi Milano Bicocca") %>% 125 | add_row(FirstName = "Daniel", LastName = "Meister", TagLine = "Datahouse AG") %>% 126 | add_row(FirstName = "Vincent", LastName = "Guyader", TagLine = "ThinkR") %>% 127 | add_row(FirstName = "Ursula", LastName = "Gasser", TagLine = "PartnerRe") %>% 128 | unite(NameSurname, FirstName, LastName, sep = ",") %>% 129 | clean_up_NameSurname() %>% 130 | # Manual fix for Shodipo Ayomide, who switched Name and Surname 131 | # Manual fix for Dana Jomar/Jomer, who misspelled her name 132 | # Manual fix for Parvaneh Shafiei, who misspelled her name 133 | mutate(NameSurname = case_when( 134 | NameSurname == toupper("Shodipo,Ayomide") ~ toupper("Ayomide,Shodipo"), 135 | NameSurname == toupper("Dana,Jomar") ~ toupper("Dana,Jomer"), 136 | NameSurname == toupper("Parvane,Shafiei") ~ toupper("Parvaneh,Shafiei"), 137 | TRUE ~ NameSurname 138 | )) 139 | 140 | # all_speakers_reduced$NameSurname[str_detect(all_speakers_reduced$NameSurname,"KIRILL")] 141 | 142 | # confirmations_reduced <- confirmations %>% 143 | # rename(confirm = starts_with("Do you confirm")) %>% 144 | # clean_df() %>% 145 | # unite(NameSurname, Name, Surname, sep = ",") %>% 146 | # select(NameSurname, confirm) %>% 147 | # clean_up_NameSurname() %>% 148 | # # Manual fix for Shodipo Ayomide, who switched Name and Surname 149 | # # Manual fix for Dana Jomar/Jomer, who misspelled her name 150 | # # Manual fix for Giulio,Ferrero,Ferrero, repeated surname 151 | # # Manual fix for Olalekan Joseph Akintande, who did not add his middle name 152 | # # Manual fix for names with special characters inconsistently entered 153 | # # Manual fix for Desjeux Yann, who switched Name and Surname 154 | # # Manual fix for Izhar Asael Alonzo Matamoros, who did not correctly separate Name and Surname 155 | # # Manual fix for Mustapha
Larbaoui, who switched Name and Surname 156 | # # Manual fix for Luís Gustavo Silva e Silva, who in sessionize was Luís G. Silva e Silva 157 | # # Manual fix for Paula González Avalos, who in sessionize was Paula Gonzalez Avalos 158 | # mutate(NameSurname = case_when( 159 | # NameSurname == toupper("Shodipo,Ayomide") ~ toupper("Ayomide,Shodipo"), 160 | # NameSurname == toupper("Dana,Jomar") ~ toupper("Dana,Jomer"), 161 | # NameSurname == toupper("Zbynek,Slajchrt") ~ toupper("Zbyněk,Šlajchrt"), 162 | # NameSurname == toupper("Giulio,Ferrero,Ferrero") ~ toupper("Giulio,Ferrero"), 163 | # NameSurname == toupper("Olalekan,Akintande") ~ toupper("Olalekan,Joseph,Akintande"), 164 | # NameSurname == toupper("Desjeux,Yann") ~ toupper("Yann,Desjeux"), 165 | # NameSurname == toupper("Izhar,Asael,Alonzo,Matamoros,Asael") ~ toupper("Izhar,Asael,Alonzo,Matamoros"), 166 | # NameSurname == toupper("Larbaoui,Mustapha") ~ toupper("Mustapha,Larbaoui"), 167 | # NameSurname == toupper("Luís,Gustavo,Silva,e,Silva") ~ toupper("LUÍS,G.,SILVA,E,SILVA"), 168 | # NameSurname == toupper("PAULA,GONZÁLEZ,AVALOS") ~ toupper("PAULA,GONZALEZ,AVALOS"), 169 | # TRUE ~ NameSurname 170 | # )) 171 | 172 | accepted_reduced <- accepted %>% 173 | purrr::map_df(trimws) %>% 174 | mutate(choice = `choice\n`) %>% 175 | # Manual fix for Shodipo Ayomide, who switched Name and Surname 176 | # Manual fix for Dana Jomar/Jomer, who misspelled her name 177 | # Manual fix for mustapha Larbaoui to Mustapha 178 | mutate(Speakers = case_when( 179 | Speakers == "Shodipo Ayomide" ~ "Ayomide Shodipo", 180 | Speakers == "Dana Jomar" ~ "Dana Jomer", 181 | Speakers == "parvane shafiei" ~ "Parvaneh Shafiei", 182 | Speakers == "mustapha Larbaoui" ~ "Mustapha Larbaoui", 183 | Speakers == "Thomas Maier" ~ "Daniel Meister", 184 | TRUE ~ Speakers 185 | )) %>% 186 | mutate(NameSurname = gsub(" ", ",", .$Speakers)) %>% 187 | clean_up_NameSurname() %>% 188 | select(Id, `choice`, Title, AssignedFormat, NameSurname) 189 | 190 | 191 | eventbrite_reduced <- eventbrite %>% 192 | mutate(TipologiaBiglietto = `Tipologia biglietto`) %>% 193 | add_row(Nome = "Andrie", Cognome = "De Vries", `E-mail` = "andrie@rstudio.com") %>% 194 | filter(`Tipologia biglietto` == "Conference ticket - Speaker" 195 | | Cognome == "Crippa" 196 | | Cognome == "Ryser-Welch" 197 | | Cognome == "Marini" 198 | | Cognome == "Melloncelli" 199 | | Cognome == "Sax" 200 | | Cognome == "De Vries") %>% 201 | unite(NameSurname, Nome, Cognome, sep = ",") %>% 202 | select(NameSurname, `E-mail`, TipologiaBiglietto) %>% 203 | clean_up_NameSurname() %>% 204 | distinct(NameSurname, TipologiaBiglietto, .keep_all = TRUE) %>% 205 | mutate(author2 = NA_character_, affiliation2 = NA_character_) %>% 206 | mutate(NameSurname = case_when( 207 | NameSurname == toupper("ottavia,epifania") ~ toupper("ottavia,m.,epifania"), 208 | NameSurname == toupper("shodipo,ayomide") ~ toupper("ayomide,shodipo"), 209 | NameSurname == toupper("marco,franco") ~ toupper("marco,cavaliere"), 210 | NameSurname == toupper("ANDRÉ,WAAGE,RIVENÆS") ~ toupper("ANDRÉ,RIVENÆS"), 211 | NameSurname == toupper("MATT,BANNERT") ~ toupper("MATTHIAS,BANNERT"), 212 | NameSurname == toupper("CLAUS,EKSTROM") ~ toupper("CLAUS,EKSTRØM"), 213 | NameSurname == toupper("OLALEKAN,AKINTANDE") ~ toupper("OLALEKAN,JOSEPH,AKINTANDE"), 214 | NameSurname == toupper("LUÍS,SILVA,E,SILVA") ~ toupper("LUÍS,G.,SILVA,E,SILVA"), 215 | # NameSurname == toupper("KIRILL,MÜLLER") ~ toupper("KIRILL,MÜLLER"), 216 | TRUE ~ NameSurname 217 | )) %>% 218 | mutate(author2 =
case_when(NameSurname == "ANDRÉ,RIVENÆS" ~ "Markus Mortensen", 219 | NameSurname == "LUCA,TORRIANI" ~ "Ilaria Sartori", 220 | TRUE ~ author2 221 | )) %>% 222 | mutate(affiliation2 = case_when(NameSurname == "ANDRÉ,RIVENÆS" ~ "PwC", 223 | NameSurname == "LUCA,TORRIANI" ~ "Politecnico di Milano", 224 | TRUE ~ affiliation2)) 225 | 226 | 227 | 228 | 229 | # Join data 230 | all_speakers_confirmed <- all_speakers_reduced %>% 231 | full_join(eventbrite_reduced, by = "NameSurname") 232 | 233 | all_sessions_accepted <- all_sessions_reduced %>% 234 | full_join(accepted_reduced, by = c("NameSurname", "Title")) %>% 235 | filter(choice > 0) 236 | 237 | session_speakers_confirmed <- full_join(all_sessions_accepted, all_speakers_confirmed, by = "NameSurname") %>% 238 | # Manual fix by hand for Kirill Müller; unclear why the full join doesn't match him 239 | mutate(`E-mail` = case_when( 240 | Id == "905a186f-f5cc-4253-b449-455c45dea8c4" ~ "kirill@cynkra.com", 241 | TRUE ~ `E-mail`)) %>% 242 | mutate(TipologiaBiglietto = case_when( 243 | Id == "905a186f-f5cc-4253-b449-455c45dea8c4" ~ "Conference ticket - Speaker", 244 | TRUE ~ TipologiaBiglietto)) %>% 245 | filter(!is.na(TipologiaBiglietto)) %>% 246 | transmute( 247 | title = Title, 248 | author = Speakers, 249 | affiliation = TagLine, 250 | namesurname = NameSurname, 251 | coauthor = `Co-authors`, 252 | author2 = author2, 253 | affiliation2 = affiliation2, 254 | track = Track, 255 | session_type = AssignedFormat, 256 | description = Description, 257 | email = `E-mail` 258 | ) %>% 259 | distinct() %>% 260 | filter(!is.na(title)) %>% 261 | # manual 262 | filter(title != "Transparent presentation of uncertain lotteries using {deals}") %>% 263 | arrange(session_type, author, title) 264 | 265 | session_speakers_confirmed <- session_speakers_confirmed %>% 266 | add_row(title = "What's New in ShinyProxy", author = "Tobias Verbeke", 267 | affiliation = "Managing Director, Open Analytics", session_type = "Regular talk", 268 | description = "Shiny is a nice technology to write interactive R-based applications. It is broadly adopted and the R community has collaborated on many 269 | interesting extensions. Until recently, though, deployments in larger organizations and companies required proprietary solutions. ShinyProxy fills this gap and 270 | offers a fully open-source alternative to run and manage Shiny applications at scale. In this talk we detail the 271 | ShinyProxy architecture and demonstrate how it meets the needs of organizations. 272 | We will discuss how it scales to thousands of concurrent users and how it offers authentication and 273 | authorization functionality using standard technologies (LDAP, ActiveDirectory, OpenID Connect, SAML 2.0 and Kerberos). 274 | Also, we will discuss the management interface and how it allows monitoring application usage and collecting 275 | usage statistics in event-logging databases. Finally, we will demonstrate that Shiny applications 276 | can now be easily embedded in broader applications and (responsive) web sites using the ShinyProxy API. 277 | Learn how academic institutions, governmental organizations and industry roll out Shiny apps with 278 | ShinyProxy and how you can do this too.
See https://shinyproxy.io.", 279 | namesurname = "TOBIAS,VERBEKE", track = "R Dataviz & Shiny") 280 | 281 | session_speakers_confirmed <- session_speakers_confirmed %>% 282 | add_row(title = "The R Consortium 2020: adapting to rapid change and global crisis", author = "Joseph Rickert", 283 | affiliation = "RStudio: R Community Ambassador, R Consortium's Board of Directors", session_type = "Regular talk", 284 | description = "The COVID-19 pandemic has turned the world upside down, and like everyone else the R Community is learning how to adapt to rapid change in order to carry on important work while looking for ways to contribute to the fight against the pandemic. In this talk, I will report on continuing R Community work being organized through the R Consortium such as the R Hub, R User Group Support Program and Diversity and Inclusion Projects; and through the various working groups including the Validation Hub, R / Pharma, R / Medicine and R Business. Additionally, I will describe some of the recently funded ISC projects and report on the COVID-19 Data Forum, a new project that the R Consortium is organizing in partnership with Stanford’s Data Science Institute.", 285 | namesurname = "JOSEPH,RICKERT", track = "R World") 286 | 287 | session_speakers_confirmed <- session_speakers_confirmed %>% 288 | arrange(session_type, author, title) 289 | 290 | 291 | # check: uncomment the filter lines below and inspect manually 292 | 293 | # filter(all_speakers_reduced,str_detect(NameSurname,"AKINTANDE")) 294 | # filter(eventbrite_reduced,str_detect(NameSurname,"AKINTANDE")) 295 | # eventbrite_reduced[str_detect(eventbrite_reduced$NameSurname,"KIRILL"),1] 296 | # all_speakers_reduced[str_detect(all_speakers_reduced$NameSurname,"KIRILL"),2] 297 | # session_speakers_confirmed[str_detect(session_speakers_confirmed$namesurname,"CORRADIN"),2] 298 | filter(all_speakers_confirmed, TipologiaBiglietto != "Conference ticket - Speaker") -> ToAddEventbrite 299 | 300 | 301 | # Manual fix: remove broken link 302 | idx_broken_link <- grepl("Hydrological Modelling and R", session_speakers_confirmed$title) 303 | session_speakers_confirmed$description[idx_broken_link] <- gsub('(https://)+(www\\.)+geotop', 'www\\.geotop', session_speakers_confirmed$description[idx_broken_link]) 304 | 305 | # Save output ---- 306 | jsonlite::write_json(session_speakers_confirmed, output_file, pretty = TRUE) 307 | -------------------------------------------------------------------------------- /tools/program-materials-parse-readme.R: -------------------------------------------------------------------------------- 1 | `%>%` <- dplyr::`%>%` 2 | 3 | output_file <- tempfile(fileext = ".html") 4 | 5 | # get_readme_html_with_toc("README.md") 6 | get_readme_html_with_toc <- function(file) { 7 | readme_content <- rmarkdown::render( 8 | file, 9 | output_file = output_file, 10 | output_format = rmarkdown::github_document( 11 | # we use toc to infer correct anchor links 12 | toc = TRUE, toc_depth = 4, 13 | # don't use smart quotes so we can match the raw text in the README 14 | md_extensions = "-smart" 15 | ) 16 | ) %>% 17 | xml2::read_html() 18 | } 19 | 20 | # get_readme_anchors("README.md") 21 | get_readme_anchors <- function(file) { 22 | internal_links <- 23 | get_readme_html_with_toc(file) %>% 24 | rvest::xml_nodes("a[href^='#']") 25 | anchors <- stats::setNames( 26 | rvest::html_attr(internal_links, "href"), 27 | rvest::html_text(internal_links) 28 | ) 29 | anchors <- anchors[!duplicated(names(anchors))] 30 | anchors 31 | } 32 | 33 | 34 | find_headings <-
function(x, lev) { 35 | grep(sprintf("^#{%d,%d}\\s+", lev, lev), x) 36 | } 37 | 38 | split_heading_content <- function(x, lev) { 39 | heading_start <- find_headings(x, lev + 1) 40 | heading_end <- dplyr::lead(heading_start - 1, 1, default = length(x)) 41 | content_end <- c(head(heading_start, 1) - 1, length(x))[[1]] 42 | list( 43 | content = x[seq_len(content_end)], 44 | children = Map(function(start, end) x[start:end], heading_start, heading_end) 45 | ) 46 | } 47 | 48 | parse_bullet <- function(x) { 49 | x <- paste(x, collapse = " ") 50 | x <- sub("^\\s*-\\s+", "", x) 51 | pattern <- "^([^\\:]+):\\s+(.*)$" 52 | if (grepl(pattern, x)) { 53 | x <- setNames( 54 | sub(pattern, "\\2", x), 55 | sub(pattern, "\\1", x) 56 | ) 57 | } 58 | as.list(x) 59 | } 60 | parse_bullet(" - Field: value") 61 | parse_bullet(" - value") 62 | 63 | parse_content <- function(x) { 64 | bullet_start <- grep("^\\s*-", x) 65 | if (length(bullet_start) == 0L) { 66 | x 67 | } else { 68 | bullet_end <- dplyr::lead(bullet_start - 1, 1, default = length(x)) 69 | do.call( 70 | c, 71 | Map( 72 | function(start, end) parse_bullet(x[start:end]), 73 | bullet_start, bullet_end 74 | ) 75 | ) 76 | } 77 | } 78 | 79 | set_headings_name <- function(x) { 80 | setNames(x, vapply(x, `[[`, "title", FUN.VALUE = "")) 81 | } 82 | 83 | parse_heading <- function(x, anchors) { 84 | x <- unlist(strsplit(x, "\\n")) 85 | stopifnot(substr(x[[1]], 1, 1) == "#") 86 | heading_lev <- nchar(sub("^(#+)\\s+.*$", "\\1", x[[1]])) 87 | title <- sub("^#+\\s+", "", x[[1]]) 88 | content <- split_heading_content(x[-1], heading_lev) 89 | if (is.na(anchors[title])) { 90 | stop("No materials link found for ", sQuote(title)) 91 | } 92 | materials_repo <- "https://github.com/Milano-R/erum2020program" 93 | materials_url <- paste0(materials_repo, anchors[title]) 94 | 95 | list( 96 | title = title, 97 | materials_url = materials_url, 98 | content = parse_content(content$content), 99 | children = lapply( 100 | content$children, 101 | parse_heading, 102 | anchors = anchors 103 | ) %>% 104 | set_headings_name() 105 | ) 106 | } 107 | 108 | # materials <- parse_readme_materials("README.md") 109 | # str(materials, max.level = 7) 110 | parse_readme_materials <- function(file) { 111 | anchors <- get_readme_anchors(file) 112 | materials <- readLines("README.md") %>% 113 | split_heading_content(1) %>% 114 | .$children %>% 115 | lapply(parse_heading, anchors) %>% 116 | set_headings_name() %>% 117 | tail(-which(names(.) 
== "Index")) 118 | } 119 | -------------------------------------------------------------------------------- /tools/program-materials-placeholders.R: -------------------------------------------------------------------------------- 1 | prog <- readODS::read_ods("tools/eRum2020-program-shared.ods") 2 | 3 | speaker_field <- function(session) { 4 | .type <- function(pattern) grepl(pattern, sub("-.*", "", session), ignore.case = TRUE) 5 | dplyr::case_when( 6 | .type("workshop") ~ "Instructor", 7 | .type("thematic") ~ "Participants", 8 | TRUE ~ "Speaker" 9 | ) 10 | } 11 | 12 | md_entry <- function(title, session_type, speaker) { 13 | glue::glue( 14 | "#### {title}", 15 | "", 16 | "- {speaker_field(session_type)}: {ifelse(is.na(speaker), 'TBD', speaker)}", 17 | "- Materials: TBD", 18 | "", 19 | .sep = "\n" 20 | ) 21 | } 22 | 23 | combine_topics <- function(topics) { 24 | topics <- unique(topics[!is.na(topics)]) 25 | if (length(topics) > 0L) paste(topics, collapse = " / ") 26 | } 27 | 28 | sort_sessions_by_type <- function(sessions) { 29 | .by <- function(pattern) !grepl(pattern, sub("-.*", "", sessions), ignore.case = TRUE) 30 | sessions[order( 31 | .by("keynote"), 32 | .by("invited"), 33 | .by("parallel"), 34 | .by("lightning"), 35 | .by("shiny"), 36 | .by("poster"), 37 | .by("thematic"), 38 | .by("workshop") 39 | )] 40 | } 41 | 42 | group_factor <- function(sessions) { 43 | factor(sessions, unique(sort_sessions_by_type(sessions))) 44 | } 45 | 46 | gfm_heading_slugify <- function(heading) { 47 | slug <- tolower(heading) 48 | slug <- gsub("[^a-z0-9' -]+", "", slug) 49 | slug <- gsub("'", "", slug) 50 | slug <- gsub("\\s", "-", slug) 51 | slug 52 | } 53 | # test git-flavored-markdown internal hash slugs: 54 | # > rmarkdown::render("README.md", "github_document", output_file = "/tmp/README.md") 55 | # $ htmlproofer /tmp/README.html 56 | 57 | heading_link <- function(session) { 58 | glue::glue( 59 | "[{session}](#{gfm_heading_slugify(session)})" 60 | ) 61 | } 62 | 63 | session_type <- function(session) { 64 | type <- sub("\\s+[0-9]+$", "", session) 65 | type <- ifelse(grepl("workshop", type, ignore.case = TRUE), "Workshop", type) 66 | type <- sub("^(.*[^s])$", "\\1s", type) 67 | type 68 | } 69 | 70 | session_title <- function(session, topics, speaker) { 71 | stopifnot(length(session) == 1L) 72 | if (grepl("keynote", session, ignore.case = TRUE)) { 73 | session <- c(session, speaker) 74 | } 75 | title <- paste(c(unique(session), combine_topics(topics)), collapse = " - ") 76 | } 77 | 78 | `%>%` <- dplyr::`%>%` 79 | 80 | prog_sessions_md <- prog %>% 81 | dplyr::rename_with(tolower) %>% 82 | dplyr::group_by(session) %>% 83 | dplyr::mutate( 84 | session_type = session_type(session), 85 | session = session_title(unique(session), topic, speaker), 86 | md = md_entry(title, session_type, speaker) 87 | ) %>% 88 | dplyr::ungroup() 89 | toc_md <- prog_sessions_md %>% 90 | dplyr::group_by(group_factor(session_type)) %>% 91 | dplyr::summarize( 92 | md = paste0( 93 | glue::glue("- {heading_link(unique(session_type))}"), "\n", 94 | paste(glue::glue(" - {heading_link(unique(session))}"), collapse = "\n") 95 | ) 96 | ) %>% 97 | .[["md"]] %>% 98 | c("## Index", .) 
%>% 99 | paste(collapse = "\n\n") 100 | sessions_md <- prog_sessions_md %>% 101 | split(group_factor(.$session_type)) %>% 102 | vapply(FUN.VALUE = "", function(x) { 103 | x %>% dplyr::group_by(group_factor(session)) %>% 104 | dplyr::summarize( 105 | md = paste( 106 | glue::glue("### {unique(session)}"), 107 | paste(md, collapse = "\n\n"), 108 | sep = "\n\n" 109 | ) 110 | ) %>% 111 | .[["md"]] %>% 112 | c(glue::glue("## {unique(x$session_type)}"), .) %>% 113 | paste(collapse = "\n\n") 114 | }) 115 | 116 | cat( 117 | toc_md, 118 | sessions_md, 119 | sep = "\n\n", 120 | file = "README.md", append = TRUE 121 | ) 122 | -------------------------------------------------------------------------------- /tools/program-materials-yt.R: -------------------------------------------------------------------------------- 1 | `%>%` <- dplyr::`%>%` 2 | source("tools/program-materials-parse-readme.R") 3 | 4 | clean_parentheses <- function(x) { 5 | pattern <- "\\s*[(][^()]*[)]" 6 | while (any(grepl(pattern, x))) { 7 | x <- gsub(pattern, "", x) 8 | } 9 | x <- gsub("\\[|\\]", "", x) 10 | x 11 | } 12 | 13 | 14 | track_pattern <- "^(.*)\\s+\\-\\s+([^\\-]*)$" 15 | strip_track <- function(title) { 16 | sub(track_pattern, "\\1", title) 17 | } 18 | get_track <- function(title) { 19 | sub(track_pattern, "\\2", title) 20 | } 21 | is_keynote <- function(title) { 22 | grepl("keynote", title, ignore.case = TRUE) 23 | } 24 | yt_title <- function(session) { 25 | title <- glue::glue("e-Rum2020 :: {session$title}") 26 | if (is_keynote(session$title)) { 27 | stopifnot(length(session$children) == 1L) 28 | title <- glue::glue("{strip_track(title)}: {session$children[[1]]$title}") 29 | } 30 | title 31 | } 32 | 33 | # create_yt_description(materials$`Invited Sessions`$children$`Invited Session 1 - Life Sciences / CovidR / R World`) 34 | create_yt_description <- function(session) { 35 | c( 36 | yt_title(session), "\n", 37 | if (is_keynote(session$title)) { 38 | glue::glue('Keynote talk for the "{get_track(session$title)}" session\n\n') 39 | }, 40 | paste0( 41 | "Speaker information and materials for this session are available at ", 42 | session$materials_url, 43 | "\n" 44 | ), 45 | if (!is_keynote(session$title)) { 46 | vapply(session$children, FUN.VALUE = "", USE.NAMES = FALSE, function(talk) { 47 | speaker <- talk$content %>% 48 | .[[intersect(names(.), c("Speaker", "Instructor", "Chairs"))]] %>% 49 | clean_parentheses() 50 | glue::glue( 51 | "- {speaker}: \"{talk$title}\"" 52 | ) 53 | }) 54 | } 55 | ) 56 | } 57 | 58 | materials <- parse_readme_materials("README.md") 59 | yt_descriptions <- materials %>% 60 | lapply(`[[`, "children") %>% unname() %>% 61 | unlist(recursive = FALSE) %>% 62 | lapply(create_yt_description) 63 | 64 | Map( 65 | function(title, desc) { 66 | glue::glue("## {title}\n\n{paste(desc, collapse='\n')}") 67 | }, 68 | names(yt_descriptions), yt_descriptions 69 | ) %>% unlist() %>% 70 | cat(file = "tools/yt-descriptions.md", sep = "\n\n") 71 | -------------------------------------------------------------------------------- /tools/render-conference-materials.R: -------------------------------------------------------------------------------- 1 | output_dir <- "_site" 2 | 3 | # create .md file for the conference materials page, extracted from README.md 4 | readme <- readLines("README.md") 5 | materials_md <- c( 6 | # include heading 7 | "## e-Rum2020 Conference Materials", 8 | # content from the Index section (w/o heading) 9 | tail(readme, -(grep("# Index", readme))) 10 | ) 11 | md_file <- file.path(output_dir, 
"conference-materials.md") 12 | writeLines(materials_md, md_file) 13 | 14 | # render to HTML as GitHub document 15 | rmarkdown::render(md_file, "github_document") 16 | file.remove(md_file) 17 | 18 | # make external links opening in a new tab, to also avoid browser security 19 | # issues when embedding the page 20 | html_file <- sub(".md", ".html", md_file, fixed = TRUE) 21 | materials_html <- xml2::read_html(html_file) 22 | links <- rvest::html_nodes(materials_html, css = "a") 23 | internal <- grepl("^[#]", xml2::xml_attr(links, "href")) 24 | xml2::xml_attr(links[!internal], "target") <- "_blank" 25 | xml2::write_html(materials_html, html_file) 26 | -------------------------------------------------------------------------------- /tools/yt-descriptions.md: -------------------------------------------------------------------------------- 1 | ## Keynote 1 - Stephanie Hicks - Life Sciences 2 | 3 | e-Rum2020 :: Keynote 1 - Stephanie Hicks: Using R and data science to improve human health 4 | 5 | 6 | Keynote talk for the "Life Sciences" session 7 | 8 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#keynote-1---stephanie-hicks---life-sciences 9 | 10 | 11 | ## Keynote 2 - Jared Lander - Applications 12 | 13 | e-Rum2020 :: Keynote 2 - Jared Lander: Applying R at Work 14 | 15 | 16 | Keynote talk for the "Applications" session 17 | 18 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#keynote-2---jared-lander---applications 19 | 20 | 21 | ## Keynote 3 - Francesco Bartolucci - Machine Learning & Models 22 | 23 | e-Rum2020 :: Keynote 3 - Francesco Bartolucci: Latent Markov models for longitudinal data in R by LMest package 24 | 25 | 26 | Keynote talk for the "Machine Learning & Models" session 27 | 28 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#keynote-3---francesco-bartolucci---machine-learning-models 29 | 30 | 31 | ## Keynote 4 - Sharon Machlis - DataViz 32 | 33 | e-Rum2020 :: Keynote 4 - Sharon Machlis: What I Learned as an R Journalist 34 | 35 | 36 | Keynote talk for the "DataViz" session 37 | 38 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#keynote-4---sharon-machlis---dataviz 39 | 40 | 41 | ## Keynote 5 - Tomas Kalibera - R World 42 | 43 | e-Rum2020 :: Keynote 5 - Tomas Kalibera: The Invisible Work on R 44 | 45 | 46 | Keynote talk for the "R World" session 47 | 48 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#keynote-5---tomas-kalibera---r-world 49 | 50 | 51 | ## Keynote 6 - Kelly O'Briant - R in Production 52 | 53 | e-Rum2020 :: Keynote 6 - Kelly O'Briant: Reflections on two years of solutions engineering at RStudio 54 | 55 | 56 | Keynote talk for the "R in Production" session 57 | 58 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#keynote-6---kelly-obriant---r-in-production 59 | 60 | 61 | ## Invited Session 1 - Life Sciences / CovidR / R World 62 | 63 | e-Rum2020 :: Invited Session 1 - Life Sciences / CovidR / R World 64 | 65 | 66 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#invited-session-1---life-sciences-covidr-r-world 67 | 68 | - Britta Velten: "Multi-Omics Factor Analysis Plus: A probabilistic framework for comprehensive and 
scalable integration of multi-modal data" 69 | - Emanuele Guidotti: "COVID-19 Data Hub" 70 | - Mark Hanly: "COVOID: Modelling COVID-19 Transmission and Interventions" 71 | - Edwin Thoen: "Building Agile data products leveraging the R package structure" 72 | 73 | ## Invited Session 2 - Machine Learning & Models / DataViz (Shiny) / Applications 74 | 75 | e-Rum2020 :: Invited Session 2 - Machine Learning & Models / DataViz (Shiny) / Applications 76 | 77 | 78 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#invited-session-2---machine-learning-models-dataviz-shiny-applications 79 | 80 | - Ioannis Kosmidis: "brquasi: Improved quasi-likelihood estimation" 81 | - Szilard Pafka: "Better than Deep Learning: Gradient Boosting Machines (GBM) – with 2020 updates" 82 | - Colin Fay: "Testing Shiny: What, why, and how." 83 | - Robin Lovelace: "From writing code to infoRming policy: a case study of reproducible research in transport planning" 84 | 85 | ## Invited Session 3 - Shiny / R in Production / R World 86 | 87 | e-Rum2020 :: Invited Session 3 - Shiny / R in Production / R World 88 | 89 | 90 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#invited-session-3---shiny-r-in-production-r-world 91 | 92 | - Dean Attali: "CRANalerts: A Shinyapp-as-a-Service for Impatient R Users" 93 | - Colin Gillespie: "And you thought CRAN was harsh" 94 | - Maëlle Salmon: "How to improve your R package" 95 | - Romain Francois: "dplyr 1.0.0" 96 | 97 | ## Parallel Session 1 - Machine Learning 98 | 99 | e-Rum2020 :: Parallel Session 1 - Machine Learning 100 | 101 | 102 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#parallel-session-1---machine-learning 103 | 104 | - Daniel Meister: "Deduplicating real estate ads using Naive Bayes record linkage" 105 | - Emanuele Cordano: "Hydrological Modelling and R" 106 | - Stefano Renzetti: "gWQS: An R Package for Linear and Generalized Weighted Quantile Sum (WQS) Regression" 107 | - Øystein Sørensen: "Flexible Meta-Analysis of Generalized Additive Models with metagam" 108 | 109 | ## Parallel Session 2 - Life Sciences 110 | 111 | e-Rum2020 :: Parallel Session 2 - Life Sciences 112 | 113 | 114 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#parallel-session-2---life-sciences 115 | 116 | - Simone Pernice: "CONNECTOR: a computational approach to study intratumor heterogeneity" 117 | - Paolo Castagno: "EPIMOD: A computational framework for studying epidemiological systems" 118 | - Mieke Deschepper: "How to apply R in a hospital environment on standard available hospital-wide data" 119 | - Nicolò Margaritella: "APFr: Average Power Function and Bayes FDR for Robust Brain Networks Construction" 120 | 121 | ## Parallel Session 3 - Applications / R in Production 122 | 123 | e-Rum2020 :: Parallel Session 3 - Applications / R in Production 124 | 125 | 126 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#parallel-session-3---applications-r-in-production 127 | 128 | - Leen Jooken: "Using process mining principles to extract a collaboration graph from a version control system log" 129 | - Alex Gold: "Design Patterns For Big Shiny Apps" 130 | - Ayomide Shodipo: "Fake News: AI on the battle ground" 131 | - Henrik Bengtsson: "progressr: An Inclusive, Unifying API for Progress Updates" 
132 | 133 | ## Parallel Session 4 - Life Sciences 134 | 135 | e-Rum2020 :: Parallel Session 4 - Life Sciences 136 | 137 | 138 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#parallel-session-4---life-sciences 139 | 140 | - Moritz Hess: "Interpretable and accessible Deep Learning for omics data with R and friends" 141 | - Federico Marini: "GeneTonic: enjoy RNA-seq data analysis, responsibly" 142 | - Mattia Chiesa: "DaMiRseq 2.0: from high dimensional data to cost-effective reliable prediction models" 143 | - Francesca Giorgolo: "A simple and flexible inactivity/sleep detection R package" 144 | 145 | ## Parallel Session 5 - R World 146 | 147 | e-Rum2020 :: Parallel Session 5 - R World 148 | 149 | 150 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#parallel-session-5---r-world 151 | 152 | - Joseph Rickert: "The R Consortium 2020: adapting to rapid change and global crisis" 153 | - Regina Siegers: "CorrelAidX - Building R-focused Communities for Social Good on the Local Level" 154 | - Christoph Sax: "From consulting to open-source and back" 155 | - Dmytro Perepolkin: "{polite}: web etiquette for R users" 156 | 157 | ## Parallel Session 6 - R in Production / R World 158 | 159 | e-Rum2020 :: Parallel Session 6 - R in Production / R World 160 | 161 | 162 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#parallel-session-6---r-in-production-r-world 163 | 164 | - Layik Hama: "Powering Turing e-Atlas with R" 165 | - Massimiliano Silano: "Dplyr snowflake integration for cloud based massive and fast data manipulation" 166 | - Sarah Gibson: "Supporting R in the Binder Community" 167 | 168 | ## Parallel Session 7 - Data Visualization & Shiny 169 | 170 | e-Rum2020 :: Parallel Session 7 - Data Visualization & Shiny 171 | 172 | 173 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#parallel-session-7---data-visualization-shiny 174 | 175 | - Tatjana Kecojevic: "Transparent Journalism Through the Power of R" 176 | - Mustapha Larbaoui: "Elevating shiny module with {tidymodules}" 177 | - Renate Delucchi Danhier: "Interactive visualization of complex texts" 178 | - Tobias Verbeke: "What’s New in ShinyProxy" 179 | 180 | ## Parallel Session 8 - Machine Learning & Models 181 | 182 | e-Rum2020 :: Parallel Session 8 - Machine Learning & Models 183 | 184 | 185 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#parallel-session-8---machine-learning-models 186 | 187 | - Luís G. Silva: "Voronoi Linkage for Spatially Misaligned Data" 188 | - Andrea Sottosanti: "Astronomical source detection and background separation: a Bayesian nonparametric approach" 189 | - Apostolos Chalkis: "High dimensional sampling and volume computation" 190 | - Riccardo Corradin: "BNPmix: a new package to estimate Bayesian nonparametric mixtures" 191 | 192 | ## Parallel Session 9 - R in Production 193 | 194 | e-Rum2020 :: Parallel Session 9 - R in Production 195 | 196 | 197 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#parallel-session-9---r-in-production 198 | 199 | - Marcin Dubel: "Be proud of your code!
Tools and patterns for making production-ready, clean R code" 200 | - Matthias Bannert: "R alongside Airflow, Docker and Gitlab CI" 201 | - André Rivenæs & Markus Mortensen: "Using XGBoost, Plumber and Docker in production to power a new banking product" 202 | - Andrie De Vries: "Creating drag-and-drop shiny applications using sortable" 203 | 204 | ## Parallel Session 10 - Machine Learning / Applications 205 | 206 | e-Rum2020 :: Parallel Session 10 - Machine Learning / Applications 207 | 208 | 209 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#parallel-session-10---machine-learning-applications 210 | 211 | - Jędrzej Świeżewski: "FastAI in R: preserving wildlife with computer vision" 212 | - Luca Torriani & Ilaria Sartori: "Manifoldgstat: an R package for spatial statistics of manifold data" 213 | - Jakob Dambon: "varycoef: Modeling Spatially Varying Coefficients" 214 | - Mikkel Meyer Andersen: "Computer Algebra Systems in R" 215 | 216 | ## Lightning Talks 1 - Data Visualization & Production 217 | 218 | e-Rum2020 :: Lightning Talks 1 - Data Visualization & Production 219 | 220 | 221 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#lightning-talks-1---data-visualization-production 222 | 223 | - Steffen Moritz: "Time Series Missing Data Visualizations" 224 | - Thierry Onkelinx: "effectclass: an R package to interpret effects and visualise uncertainty" 225 | - Gabriele Galatolo: "Supporting Twitter analytics application with graph-databases and the aRangodb package" 226 | - Kirill Müller: "dm: working with relational data models in R" 227 | - Matilde Grecchi: "An innovative way to support your sales force" 228 | - Nicholas Jhirad: "Keeping on top of R in Real-Time, High-Stakes trading systems" 229 | 230 | ## Lightning Talks 2 - Life Sciences 231 | 232 | e-Rum2020 :: Lightning Talks 2 - Life Sciences 233 | 234 | 235 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#lightning-talks-2---life-sciences 236 | 237 | - Andrea Cappozzo: "An enriched disease risk assessment model based on historical blood donors records" 238 | - Marco Calderisi: "A principal component analysis based method to detect biomarker captation from vibrational spectra" 239 | - Mirko Signorelli: "ptmixed: an R package for flexible modelling of longitudinal overdispersed count data" 240 | - Dario Righelli: "Differential Enriched Scan 2 (DEScan2): an R pipeline for epigenomic analysis" 241 | - Ger Inberg: "Reproducible Data Visualization with CanvasXpress" 242 | - Liam Brierley: "Using open-access data to derive genome composition of emerging viruses" 243 | 244 | ## Lightning Talks 3 - Machine Learning 245 | 246 | e-Rum2020 :: Lightning Talks 3 - Machine Learning 247 | 248 | 249 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#lightning-talks-3---machine-learning 250 | 251 | - Florian Privé: "Ultra fast penalized regressions with R package {bigstatsr}" 252 | - Krystian Igras: "Explaining black-box models with xspliner to make deliberate business decisions" 253 | - Mustafa Cavus: "One-way non-normal ANOVA in reliability analysis using doex" 254 | - Øystein Sørensen: "Analyzing Preference Data with the BayesMallows Package" 255 | - Stefan Lenz: "Flexible deep learning via the JuliaConnectoR" 256 | 257 | ## Lightning Talks 4 - Applications 258 | 259 | e-Rum2020 ::
Lightning Talks 4 - Applications 260 | 261 | 262 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#lightning-talks-4---applications 263 | 264 | - Berry Boessenkool: "rdwd: R interface to German Weather Service data" 265 | - Christoph Sax: "tv: Show Data Frames in the Browser" 266 | - Claus Ekstrom: "Predicting the Euro 2020 results using tournament rank probabilities scores from the socceR package" 267 | - Indranil Ghosh: "Design your own quantum simulator with R" 268 | - Keshav Bhatt: "What are the potato eaters eating" 269 | - Niels Martin: "Towards more structured data quality assessment in the process mining field: the DaQAPO package" 270 | 271 | ## Shiny Demo 1 - Machine Learning & Applications 272 | 273 | e-Rum2020 :: Shiny Demo 1 - Machine Learning & Applications 274 | 275 | 276 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#shiny-demo-1---machine-learning-applications 277 | 278 | - Mintu Nath: "A demonstration of ABACUS: Apps Based Activities for Communicating and Understanding Statistics" 279 | - Emanuele Fabbiani: "tsviz: a data-scientist-friendly addin for RStudio" 280 | - Andrea Melloncelli: "Media Shiny: Marketing Mix Models Builder" 281 | - Ottavia M. Epifania: "Scoring the Implicit Association Test has never been easier: DscoreApp" 282 | - Riccardo Porreca: "rTRhexNG: Hexagon sticker app for rTRNG" 283 | - Antoine Logean: "How green is your portfolio? Tracking CO2 footprint in the insurance sector" 284 | 285 | ## Shiny Demo 2 - Mobility & Spatial 286 | 287 | e-Rum2020 :: Shiny Demo 2 - Mobility & Spatial 288 | 289 | 290 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#shiny-demo-2---mobility-spatial 291 | 292 | - Agostino Torti: "Visualising and Modelling Bike Sharing Mobility usage in the city of Milan" 293 | - Angel Udias: "ESPRES: A shiny web tool to support River Basin Management planning in European Watersheds" 294 | - Josue Aduna: "Mobility scan" 295 | - Lorenzo Busetto: "Developing Shiny applications to facilitate precision agriculture workflows" 296 | - Luigi Ranghetti: "“GUInterp”: a Shiny GUI to support spatial interpolation" 297 | 298 | ## Poster Session 1 - Life Sciences & Applications 299 | 300 | e-Rum2020 :: Poster Session 1 - Life Sciences & Applications 301 | 302 | 303 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#poster-session-1---life-sciences-applications 304 | 305 | - Alessio Crippa: "A flexible dashboard for monitoring platform trials" 306 | - Angela Andreella: "PRDA package: Enhancing Statistical Inference via Prospective and Retrospective Design Analysis" 307 | - Dario Righelli: "EasyReporting: a Bioconductor package for Reproducible Research implementation" 308 | - Federico Agostinis: "NewWave: a scalable R package for the dimensionality reduction of single-cell RNA-seq" 309 | - Patricia Ryser-Welch: "Integrating professional software engineering practices in medical research software" 310 | - Tobias Schieferdecker: "Dealing with changing administrative boundaries: The case of Swiss municipalities" 311 | 312 | ## Poster Session 2 - DataViz & Machine Learning 313 | 314 | e-Rum2020 :: Poster Session 2 - DataViz & Machine Learning 315 | 316 | 317 | Speaker information and materials for this session are available at
https://github.com/Milano-R/erum2020program#poster-session-2---dataviz-machine-learning 318 | 319 | - Binod Jung Bogati: "Automate flexdashboard with GitHub" 320 | - Gabriel Okasa: "orf: Ordered Random Forests" 321 | - Marco Calderisi: "Power Supply health status monitoring dashboard" 322 | - Olga Dunajeva: "First-year ICT students dropout predicting with R models" 323 | - Olalekan Joseph Akintande: "Benchmark Percentage Disjoint Data Splitting in Cross Validation for Assessing the Skill of Machine" 324 | - Hervé Dakpo: "badDEA: An R package for measuring firms’ efficiency adjusted by undesirable outputs" 325 | - Aurélien Severino: "CorpFinder- a new application to identify Large Corporate Risks" 326 | 327 | ## Thematic Lounge 1 - Developing Software and Careers in Life Sciences 328 | 329 | e-Rum2020 :: Thematic Lounge 1 - Developing Software and Careers in Life Sciences 330 | 331 | 332 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#thematic-lounge-1---developing-software-and-careers-in-life-sciences 333 | 334 | - Federico Marini, Charlotte Soneson, Davide Risso: "Developing Software and Careers in Life Sciences" 335 | 336 | ## Thematic Lounge 2 - Building Community & Diversity 337 | 338 | e-Rum2020 :: Thematic Lounge 2 - Building Community & Diversity 339 | 340 | 341 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#thematic-lounge-2---building-community-diversity 342 | 343 | - Sara Iacozza, Levi Waldron, Janine Khuc, Parvaneh Shafiei: "Building Community & Diversity" 344 | 345 | ## Thematic Lounge 3 - Data science: freelancing and business 346 | 347 | e-Rum2020 :: Thematic Lounge 3 - Data science: freelancing and business 348 | 349 | 350 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#thematic-lounge-3---data-science-freelancing-and-business 351 | 352 | - Enrico Deusebio, Mariachiara Fortuna, Riccardo L. Rossi: "Data science: freelancing and business" 353 | 354 | ## Morning Workshops 355 | 356 | e-Rum2020 :: Morning Workshops 357 | 358 | 359 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#morning-workshops 360 | 361 | - Andrea Melloncelli: "Is R ready for Production? Let’s develop a Professional Shiny Application!" 
362 | - Lubomír Štěpánek, Jiří Novák: "Image processing and computer vision with R"
363 | - Przemysław Biecek, Anna Kozak, Szymon Maksymiuk: "Explanation and exploration of machine learning models with R"
364 | - Patricia Ryser-Welch, Paul Burton, Demetris Avraam, Stuart Wheater, Olly Butters, Becca Wilson, Alex Westerberg, Leire Abarrategui-Martinez: "Non-disclosive federated analysis in R"
365 | - Cristina Muschitiello, Niccolò Stamboglis: "A unified approach for writing automatic reports: parameterization and generalization of R-Markdown"
366 | - Tatjana Kecojevic, Katarina Kosmina, Tijana Blagojev: "Build a website with blogdown in R"
367 | 
368 | ## Afternoon Workshops
369 | 
370 | e-Rum2020 :: Afternoon Workshops
371 | 
372 | 
373 | Speaker information and materials for this session are available at https://github.com/Milano-R/erum2020program#afternoon-workshops
374 | 
375 | - Christine Choirat, The Renku Development Team: "Reproducible workflows with the RENKU platform"
376 | - John Coene: "How to build htmlwidgets"
377 | - Goran Milovanović: "Semantic Web in R for Data Scientists"
378 | - David Granjon, Mustapha Larbaoui, Flavio Lombardo, Douglas Robinson: "Advanced User Interfaces for Shiny Developers"
379 | - Riccardo Porreca, Peter Schmid: "Bring your R Application Safely to Production. Collaborate, Deploy, Automate"
380 | 
--------------------------------------------------------------------------------