├── .gitignore
├── LICENSE
├── README.md
├── rmpi-demo
│   ├── README.md
│   ├── job.sh
│   └── rmpi-test.R
├── slides
│   ├── beamerthemeExecushares.sty
│   └── slides.Rmd
├── snow-demo
│   ├── README.md
│   ├── job.sh
│   └── snow-test.R
├── snowfall-demo
│   ├── README.md
│   ├── job.sh
│   └── snowfall-test.R
└── ss.png

/.gitignore:
--------------------------------------------------------------------------------
1 | *.out
2 | *.err
3 | .nfs*
4 | .Rproj.user
5 | .Rhistory
6 | slides.pdf
7 | *.Rproj
8 |
9 |
--------------------------------------------------------------------------------
/LICENSE:
--------------------------------------------------------------------------------
1 | License
2 |
3 | THE WORK (AS DEFINED BELOW) IS PROVIDED UNDER THE TERMS OF THIS CREATIVE COMMONS PUBLIC LICENSE ("CCPL" OR "LICENSE"). THE WORK IS PROTECTED BY COPYRIGHT AND/OR OTHER APPLICABLE LAW. ANY USE OF THE WORK OTHER THAN AS AUTHORIZED UNDER THIS LICENSE OR COPYRIGHT LAW IS PROHIBITED.
4 |
5 | BY EXERCISING ANY RIGHTS TO THE WORK PROVIDED HERE, YOU ACCEPT AND AGREE TO BE BOUND BY THE TERMS OF THIS LICENSE. TO THE EXTENT THIS LICENSE MAY BE CONSIDERED TO BE A CONTRACT, THE LICENSOR GRANTS YOU THE RIGHTS CONTAINED HERE IN CONSIDERATION OF YOUR ACCEPTANCE OF SUCH TERMS AND CONDITIONS.
6 |
7 | 1. Definitions
8 |
9 | "Collective Work" means a work, such as a periodical issue, anthology or encyclopedia, in which the Work in its entirety in unmodified form, along with one or more other contributions, constituting separate and independent works in themselves, are assembled into a collective whole. A work that constitutes a Collective Work will not be considered a Derivative Work (as defined below) for the purposes of this License.
10 | "Derivative Work" means a work based upon the Work or upon the Work and other pre-existing works, such as a translation, musical arrangement, dramatization, fictionalization, motion picture version, sound recording, art reproduction, abridgment, condensation, or any other form in which the Work may be recast, transformed, or adapted, except that a work that constitutes a Collective Work will not be considered a Derivative Work for the purpose of this License. For the avoidance of doubt, where the Work is a musical composition or sound recording, the synchronization of the Work in timed-relation with a moving image ("synching") will be considered a Derivative Work for the purpose of this License.
11 | "Licensor" means the individual, individuals, entity or entities that offers the Work under the terms of this License.
12 | "Original Author" means the individual, individuals, entity or entities who created the Work.
13 | "Work" means the copyrightable work of authorship offered under the terms of this License.
14 | "You" means an individual or entity exercising rights under this License who has not previously violated the terms of this License with respect to the Work, or who has received express permission from the Licensor to exercise rights under this License despite a previous violation.
15 |
16 | 2. Fair Use Rights. Nothing in this license is intended to reduce, limit, or restrict any rights arising from fair use, first sale or other limitations on the exclusive rights of the copyright owner under copyright law or other applicable laws.
17 |
18 | 3. License Grant. 
Subject to the terms and conditions of this License, Licensor hereby grants You a worldwide, royalty-free, non-exclusive, perpetual (for the duration of the applicable copyright) license to exercise the rights in the Work as stated below: 19 | 20 | to reproduce the Work, to incorporate the Work into one or more Collective Works, and to reproduce the Work as incorporated in the Collective Works; 21 | to create and reproduce Derivative Works provided that any such Derivative Work, including any translation in any medium, takes reasonable steps to clearly label, demarcate or otherwise identify that changes were made to the original Work. For example, a translation could be marked "The original work was translated from English to Spanish," or a modification could indicate "The original work has been modified.";; 22 | to distribute copies or phonorecords of, display publicly, perform publicly, and perform publicly by means of a digital audio transmission the Work including as incorporated in Collective Works; 23 | to distribute copies or phonorecords of, display publicly, perform publicly, and perform publicly by means of a digital audio transmission Derivative Works. 24 | 25 | For the avoidance of doubt, where the Work is a musical composition: 26 | Performance Royalties Under Blanket Licenses. Licensor waives the exclusive right to collect, whether individually or, in the event that Licensor is a member of a performance rights society (e.g. ASCAP, BMI, SESAC), via that society, royalties for the public performance or public digital performance (e.g. webcast) of the Work. 27 | Mechanical Rights and Statutory Royalties. Licensor waives the exclusive right to collect, whether individually or via a music rights agency or designated agent (e.g. Harry Fox Agency), royalties for any phonorecord You create from the Work ("cover version") and distribute, subject to the compulsory license created by 17 USC Section 115 of the US Copyright Act (or the equivalent in other jurisdictions). 28 | Webcasting Rights and Statutory Royalties. For the avoidance of doubt, where the Work is a sound recording, Licensor waives the exclusive right to collect, whether individually or via a performance-rights society (e.g. SoundExchange), royalties for the public digital performance (e.g. webcast) of the Work, subject to the compulsory license created by 17 USC Section 114 of the US Copyright Act (or the equivalent in other jurisdictions). 29 | 30 | The above rights may be exercised in all media and formats whether now known or hereafter devised. The above rights include the right to make such modifications as are technically necessary to exercise the rights in other media and formats. All rights not expressly granted by Licensor are hereby reserved. 31 | 32 | 4. Restrictions. The license granted in Section 3 above is expressly made subject to and limited by the following restrictions: 33 | 34 | You may distribute, publicly display, publicly perform, or publicly digitally perform the Work only under the terms of this License, and You must include a copy of, or the Uniform Resource Identifier for, this License with every copy or phonorecord of the Work You distribute, publicly display, publicly perform, or publicly digitally perform. You may not offer or impose any terms on the Work that restrict the terms of this License or the ability of a recipient of the Work to exercise the rights granted to that recipient under the terms of the License. You may not sublicense the Work. 
You must keep intact all notices that refer to this License and to the disclaimer of warranties. When You distribute, publicly display, publicly perform, or publicly digitally perform the Work, You may not impose any technological measures on the Work that restrict the ability of a recipient of the Work from You to exercise the rights granted to that recipient under the terms of the License. This Section 4(a) applies to the Work as incorporated in a Collective Work, but this does not require the Collective Work apart from the Work itself to be made subject to the terms of this License. If You create a Collective Work, upon notice from any Licensor You must, to the extent practicable, remove from the Collective Work any credit as required by Section 4(b), as requested. If You create a Derivative Work, upon notice from any Licensor You must, to the extent practicable, remove from the Derivative Work any credit as required by Section 4(b), as requested. 35 | If You distribute, publicly display, publicly perform, or publicly digitally perform the Work (as defined in Section 1 above) or any Derivative Works (as defined in Section 1 above) or Collective Works (as defined in Section 1 above), You must, unless a request has been made pursuant to Section 4(a), keep intact all copyright notices for the Work and provide, reasonable to the medium or means You are utilizing: (i) the name of the Original Author (or pseudonym, if applicable) if supplied, and/or (ii) if the Original Author and/or Licensor designate another party or parties (e.g. a sponsor institute, publishing entity, journal) for attribution ("Attribution Parties") in Licensor's copyright notice, terms of service or by other reasonable means, the name of such party or parties; the title of the Work if supplied; to the extent reasonably practicable, the Uniform Resource Identifier, if any, that Licensor specifies to be associated with the Work, unless such URI does not refer to the copyright notice or licensing information for the Work; and, consistent with Section 3(b) in the case of a Derivative Work, a credit identifying the use of the Work in the Derivative Work (e.g., "French translation of the Work by Original Author," or "Screenplay based on original Work by Original Author"). The credit required by this Section 4(b) may be implemented in any reasonable manner; provided, however, that in the case of a Derivative Work or Collective Work, at a minimum such credit will appear, if a credit for all contributing authors of the Derivative Work or Collective Work appears, then as part of these credits and in a manner at least as prominent as the credits for the other contributing authors. For the avoidance of doubt, You may only use the credit required by this Section for the purpose of attribution in the manner set out above and, by exercising Your rights under this License, You may not implicitly or explicitly assert or imply any connection with, sponsorship or endorsement by the Original Author, Licensor and/or Attribution Parties, as appropriate, of You or Your use of the Work, without the separate, express prior written permission of the Original Author, Licensor and/or Attribution Parties. 36 | 37 | 5. Representations, Warranties and Disclaimer 38 | 39 | UNLESS OTHERWISE MUTUALLY AGREED TO BY THE PARTIES IN WRITING, LICENSOR OFFERS THE WORK AS-IS AND ONLY TO THE EXTENT OF ANY RIGHTS HELD IN THE LICENSED WORK BY THE LICENSOR. 
THE LICENSOR MAKES NO REPRESENTATIONS OR WARRANTIES OF ANY KIND CONCERNING THE WORK, EXPRESS, IMPLIED, STATUTORY OR OTHERWISE, INCLUDING, WITHOUT LIMITATION, WARRANTIES OF TITLE, MARKETABILITY, MERCHANTIBILITY, FITNESS FOR A PARTICULAR PURPOSE, NONINFRINGEMENT, OR THE ABSENCE OF LATENT OR OTHER DEFECTS, ACCURACY, OR THE PRESENCE OF ABSENCE OF ERRORS, WHETHER OR NOT DISCOVERABLE. SOME JURISDICTIONS DO NOT ALLOW THE EXCLUSION OF IMPLIED WARRANTIES, SO SUCH EXCLUSION MAY NOT APPLY TO YOU. 40 | 41 | 6. Limitation on Liability. EXCEPT TO THE EXTENT REQUIRED BY APPLICABLE LAW, IN NO EVENT WILL LICENSOR BE LIABLE TO YOU ON ANY LEGAL THEORY FOR ANY SPECIAL, INCIDENTAL, CONSEQUENTIAL, PUNITIVE OR EXEMPLARY DAMAGES ARISING OUT OF THIS LICENSE OR THE USE OF THE WORK, EVEN IF LICENSOR HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 42 | 43 | 7. Termination 44 | 45 | This License and the rights granted hereunder will terminate automatically upon any breach by You of the terms of this License. Individuals or entities who have received Derivative Works (as defined in Section 1 above) or Collective Works (as defined in Section 1 above) from You under this License, however, will not have their licenses terminated provided such individuals or entities remain in full compliance with those licenses. Sections 1, 2, 5, 6, 7, and 8 will survive any termination of this License. 46 | Subject to the above terms and conditions, the license granted here is perpetual (for the duration of the applicable copyright in the Work). Notwithstanding the above, Licensor reserves the right to release the Work under different license terms or to stop distributing the Work at any time; provided, however that any such election will not serve to withdraw this License (or any other license that has been, or is required to be, granted under the terms of this License), and this License will continue in full force and effect unless terminated as stated above. 47 | 48 | 8. Miscellaneous 49 | 50 | Each time You distribute or publicly digitally perform the Work (as defined in Section 1 above) or a Collective Work (as defined in Section 1 above), the Licensor offers to the recipient a license to the Work on the same terms and conditions as the license granted to You under this License. 51 | Each time You distribute or publicly digitally perform a Derivative Work, Licensor offers to the recipient a license to the original Work on the same terms and conditions as the license granted to You under this License. 52 | If any provision of this License is invalid or unenforceable under applicable law, it shall not affect the validity or enforceability of the remainder of the terms of this License, and without further action by the parties to this agreement, such provision shall be reformed to the minimum extent necessary to make such provision valid and enforceable. 53 | No term or provision of this License shall be deemed waived and no breach consented to unless such waiver or consent shall be in writing and signed by the party to be charged with such waiver or consent. 54 | This License constitutes the entire agreement between the parties with respect to the Work licensed here. There are no understandings, agreements or representations with respect to the Work not specified here. Licensor shall not be bound by any additional provisions that may appear in any communication from You. This License may not be modified without the mutual written agreement of the Licensor and You. 
55 | -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Introduction to parallel R 2 | ========================== 3 | 4 | Sep 22, 2016 5 | 6 | This material is associated with a presentation to the University of Colorado Computational Sciences and Engineering meetup group. 7 | The slides are written in Rmarkdown and rendered to pdf via Beamer. 8 | There are three examples to run on the CU supercomputer (Janus as of Sep 22, 2016) that demonstrate Rmpi, snow, and snowfall in a high performance computing environment. 9 | 10 | The rendered slides are available on FigShare: https://figshare.com/articles/Introduction_to_parallel_R/3848310 11 | 12 | ![](ss.png) 13 | -------------------------------------------------------------------------------- /rmpi-demo/README.md: -------------------------------------------------------------------------------- 1 | # Rmpi demo on Janus 2 | 3 | Rmpi is a package that is bundled up in a module on Janus. 4 | The file `job.sh` is the slurm job script to be submitted, which calls `rmpi-test.R`. 5 | 6 | Usage: 7 | 8 | This requires that you are logged in to Janus: 9 | 10 | ``` 11 | ssh -l $your_username login.rc.colorado.edu 12 | ``` 13 | 14 | Once you've logged in, load slurm and then use `sbatch` to submit the job to `janus-debug`: 15 | 16 | ``` 17 | ml slurm 18 | sbatch job.sh 19 | ``` 20 | 21 | -------------------------------------------------------------------------------- /rmpi-demo/job.sh: -------------------------------------------------------------------------------- 1 | #! /bin/bash 2 | #SBATCH --job-name rmpi-test 3 | #SBATCH --time 00:00:30 4 | #SBATCH --nodes 2 5 | #SBATCH --output results.out 6 | #SBATCH --qos=janus-debug 7 | 8 | ml gcc/5.1.0 9 | ml R/3.2.0 10 | ml Rmpi/0.6-5 11 | 12 | mpirun -np 1 R --vanilla -f rmpi-test.R 13 | 14 | # clean up logfiles 15 | rm node*.log 16 | -------------------------------------------------------------------------------- /rmpi-demo/rmpi-test.R: -------------------------------------------------------------------------------- 1 | library("Rmpi") 2 | 3 | # spawn slaves, saving one process for the master 4 | np <- mpi.universe.size() 5 | mpi.spawn.Rslaves(nslaves = np - 1) 6 | 7 | # have the slaves identify themselves 8 | mpi.bcast.cmd(id <- mpi.comm.rank()) 9 | mpi.bcast.cmd(np <- mpi.comm.size()) 10 | mpi.bcast.cmd(host <- mpi.get.processor.name()) 11 | mpi.remote.exec(paste("I am", id, "of", np, "running on", host)) 12 | 13 | # close the slaves and quit 14 | mpi.close.Rslaves(dellog = FALSE) 15 | mpi.quit() 16 | 17 | -------------------------------------------------------------------------------- /slides/beamerthemeExecushares.sty: -------------------------------------------------------------------------------- 1 | % forked from https://github.com/FuzzyWuzzie/Beamer-Theme-Execushares 2 | % modified by Max Joseph Sep 2016 3 | 4 | % the various libraries we will be using 5 | \usepackage{tikz} 6 | \usetikzlibrary{calc} 7 | \usepackage[none]{hyphenat} 8 | \usepackage{fontspec} 9 | \defaultfontfeatures{Ligatures=TeX} 10 | 11 | \newif\ifbeamer@pixelitem 12 | \beamer@pixelitemtrue 13 | \DeclareOptionBeamer{nopixelitem}{\beamer@pixelitemfalse} 14 | \ProcessOptionsBeamer 15 | 16 | % define colours 17 | % taken from pickton on Adobe Kuler: 18 | % https://kuler.adobe.com/Some-Kind-Of-Execushares-color-theme-3837185/ 19 | \definecolor{ExecusharesRed}{RGB}{230,37,52} 20 | 
\definecolor{ExecusharesBlack}{RGB}{43,40,40} 21 | \definecolor{ExecusharesBlue}{RGB}{22,190,207} 22 | \definecolor{ExecusharesWhite}{RGB}{255,255,243} 23 | \definecolor{ExecusharesGrey}{RGB}{107,110,108} 24 | 25 | % use Adobe's Source Pro fonts: 26 | % Source Serif Pro: https://github.com/adobe-fonts/source-serif-pro 27 | % Source Sans Pro: https://github.com/adobe-fonts/source-sans-pro 28 | % Source Code Pro: https://github.com/adobe-fonts/source-code-pro 29 | \setmainfont{Source Serif Pro} 30 | \setsansfont{Source Sans Pro} 31 | \setmonofont{Source Code Pro} 32 | 33 | % To use with pdflatex, 34 | % comment the fontspec package at the top 35 | %\usepackage{sourceserifpro} 36 | %\usepackage{sourcesanspro} 37 | %\usepackage{sourcecodepro} 38 | 39 | % set colours 40 | \setbeamercolor{itemize item}{fg=ExecusharesBlue} 41 | \setbeamercolor{enumerate item}{fg=ExecusharesBlue} 42 | \setbeamercolor{alerted text}{fg=ExecusharesBlue} 43 | \setbeamercolor{section in toc}{fg=ExecusharesBlack} 44 | 45 | % set fonts 46 | \setbeamerfont{itemize/enumerate body}{size=\large} 47 | \setbeamerfont{itemize/enumerate subbody}{size=\normalsize} 48 | \setbeamerfont{itemize/enumerate subsubbody}{size=\small} 49 | \setbeamerfont{institute}{size=\normalsize\itshape} 50 | 51 | \ifbeamer@pixelitem 52 | % make the itemize bullets pixelated > 53 | \setbeamertemplate{itemize item}{ 54 | \tikz{ 55 | \draw[fill=ExecusharesBlue,draw=none] (0, 0) rectangle(0.1, 0.1); 56 | \draw[fill=ExecusharesBlue,draw=none] (0.1, 0.1) rectangle(0.2, 0.2); 57 | \draw[fill=ExecusharesBlue,draw=none] (0, 0.2) rectangle(0.1, 0.3); 58 | } 59 | } 60 | % make the subitems also pixelated >, but a little smaller and red 61 | \setbeamertemplate{itemize subitem}{ 62 | \tikz{ 63 | \draw[fill=ExecusharesBlue,draw=none] (0, 0) rectangle(0.075, 0.075); 64 | \draw[fill=ExecusharesBlue,draw=none] (0.075, 0.075) rectangle(0.15, 0.15); 65 | \draw[fill=ExecusharesBlue,draw=none] (0, 0.15) rectangle(0.075, 0.225); 66 | } 67 | } 68 | \fi 69 | 70 | % disable navigation 71 | \setbeamertemplate{navigation symbols}{} 72 | 73 | % custom draw the title page above 74 | \setbeamertemplate{title page}{} 75 | 76 | % again, manually draw the frame title above 77 | \setbeamertemplate{frametitle}{} 78 | 79 | % disable "Figure:" in the captions 80 | \setbeamertemplate{caption}{\tiny\insertcaption} 81 | \setbeamertemplate{caption label separator}{} 82 | 83 | % since I don't know a better way to do this, these are all switches 84 | % doing `\setcounter{showProgressBar}{0}` will turn the progress bar off (I turn it off for Appendix slides) 85 | % etc 86 | \newcounter{showProgressBar} 87 | \setcounter{showProgressBar}{1} 88 | \newcounter{showSlideNumbers} 89 | \setcounter{showSlideNumbers}{1} 90 | \newcounter{showSlideTotal} 91 | \setcounter{showSlideTotal}{1} 92 | 93 | % use \makeatletter for our progress bar definitions 94 | % progress bar idea from http://tex.stackexchange.com/a/59749/44221 95 | % slightly adapted for visual purposes here 96 | \makeatletter 97 | \newcount\progressbar@tmpcounta% auxiliary counter 98 | \newcount\progressbar@tmpcountb% auxiliary counter 99 | \newdimen\progressbar@pbwidth %progressbar width 100 | \newdimen\progressbar@tmpdim % auxiliary dimension 101 | 102 | \newdimen\slidewidth % auxiliary dimension 103 | \newdimen\slideheight % auxiliary dimension 104 | 105 | % make the progress bar go across the screen 106 | %\progressbar@pbwidth=12.8cm 107 | \progressbar@pbwidth=\the\paperwidth 108 | \slidewidth=\the\paperwidth 109 | 
\slideheight=\the\paperheight 110 | 111 | % use tikz to draw everything 112 | % it may not be the best, but it's easy to work with 113 | % and looks good 114 | % TODO: base title slide and contents slide on something other than slide numbers :/ 115 | \setbeamertemplate{background}{ 116 | % deal with progress bar stuff 117 | % (calculate where it should go) 118 | \progressbar@tmpcounta=\insertframenumber 119 | \progressbar@tmpcountb=\inserttotalframenumber 120 | \progressbar@tmpdim=\progressbar@pbwidth 121 | \multiply\progressbar@tmpdim by \progressbar@tmpcounta 122 | \divide\progressbar@tmpdim by \progressbar@tmpcountb 123 | 124 | \begin{tikzpicture} 125 | % set up the entire slide as the canvas 126 | \useasboundingbox (0,0) rectangle(\the\paperwidth,\the\paperheight); 127 | 128 | % the background 129 | \fill[color=ExecusharesWhite] (0,0) rectangle(\the\paperwidth,\the\paperheight); 130 | 131 | % separate the drawing based on if we're the first (title) slide or not 132 | \ifnum\thepage=1\relax 133 | % the title page 134 | % draw the fills 135 | \fill[color=ExecusharesBlue] (0, 4cm) rectangle(\slidewidth,\slideheight); 136 | 137 | % draw the actual text 138 | \node[anchor=south,text width=\slidewidth-1cm,inner xsep=0.5cm] at (0.5\slidewidth,4cm) {\color{ExecusharesWhite}\Huge\textbf{\inserttitle}}; 139 | \node[anchor=north east,text width=\slidewidth-1cm,align=left] at (\slidewidth-0.4cm,4cm) {\color{ExecusharesBlack}\small\insertsubtitle}; 140 | \node at (0.5\slidewidth,2cm) {\color{ExecusharesBlack}\LARGE\insertauthor}; 141 | 142 | % add the date in the corner 143 | \node[anchor=south east] at(\slidewidth,0cm) {\color{ExecusharesGrey}\tiny\insertdate}; 144 | \else 145 | % NOT the title page 146 | % title bar 147 | \fill[color=ExecusharesBlue] (0, \slideheight-1cm) rectangle(\slidewidth,\slideheight); 148 | 149 | % swap the comment on these to add section titles to slide titles 150 | %\node[anchor=north,text width=11.8cm,inner xsep=0.5cm,inner ysep=0.25cm] at (6.4cm,9.6cm) {\color{ExecusharesWhite}\Large\textbf{\insertsectionhead: \insertframetitle}}; 151 | \node[anchor=north,text width=\slidewidth-1cm,inner xsep=0.5cm,inner ysep=0.25cm] at (0.5\slidewidth,\slideheight) {\color{ExecusharesWhite}\huge\textbf{\insertframetitle}}; 152 | 153 | % if we're showing a progress bar, show it 154 | % (I disable the progress bar and slide numbers for the "Appendix" slides) 155 | \ifnum \value{showProgressBar}>0\relax% 156 | % the the progress bar icon in the middle of the screen 157 | \draw[fill=ExecusharesGrey,draw=none] (0cm,0cm) rectangle(\slidewidth,0.25cm); 158 | \draw[fill=ExecusharesBlue,draw=none] (0cm,0cm) rectangle(\progressbar@tmpdim,0.25cm); 159 | 160 | % bottom information 161 | \node[anchor=south west] at(0cm,0.25cm) {\color{ExecusharesGrey}\tiny\vphantom{lp}\insertsection}; 162 | % if slide numbers are active 163 | \ifnum \value{showSlideNumbers}>0\relax% 164 | % if slide totals are active 165 | \ifnum \value{showSlideTotal}>0\relax% 166 | % draw both slide number and slide total 167 | \node[anchor=south east] at(\slidewidth,0.25cm) {\color{ExecusharesGrey}\tiny\insertframenumber/\inserttotalframenumber}; 168 | \else 169 | % slide totals aren't active, don't draw them 170 | \node[anchor=south east] at(\slidewidth,0.25cm) {\color{ExecusharesGrey}\tiny\insertframenumber}; 171 | \fi 172 | \fi 173 | % don't show the progress bar? 
174 | \else 175 | % section title in the bottom left 176 | \node[anchor=south west] at(0cm,0cm) {\color{ExecusharesGrey}\tiny\vphantom{lp}\insertsection}; 177 | % if we're showing slide numbers 178 | \ifnum \value{showSlideNumbers}>0\relax% 179 | % if slide totals are active 180 | \ifnum \value{showSlideTotal}>0\relax% 181 | % draw both slide number and slide total 182 | \node[anchor=south east] at(\slidewidth,0cm) {\color{ExecusharesGrey}\tiny\insertframenumber/\inserttotalframenumber}; 183 | \else 184 | % slide totals aren't active, don't draw them 185 | \node[anchor=south east] at(\slidewidth,0cm) {\color{ExecusharesGrey}\tiny\insertframenumber}; 186 | \fi 187 | \fi 188 | \fi 189 | \fi 190 | \end{tikzpicture} 191 | } 192 | \makeatother 193 | 194 | % add section titles 195 | \AtBeginSection{\frame{\sectionpage}} 196 | \setbeamertemplate{section page} 197 | { 198 | \begin{tikzpicture} 199 | % set up the entire slide as the canvas 200 | \useasboundingbox (0,0) rectangle(\slidewidth,\slideheight); 201 | %\fill[color=ExecusharesWhite] (0,0) rectangle(\the\paperwidth,\the\paperheight); 202 | \fill[color=ExecusharesWhite] (-1cm, 2cm) rectangle (\slidewidth, \slideheight+0.1cm); 203 | \fill[color=ExecusharesBlue] (-1cm, 0.5\slideheight-1cm) rectangle(\slidewidth, 0.5\slideheight+1cm); 204 | \node[text width=\the\paperwidth-1cm,align=center] at (0.4\slidewidth, 0.5\slideheight) {\color{ExecusharesWhite}\Huge\textbf{\insertsection}}; 205 | \end{tikzpicture} 206 | } 207 | -------------------------------------------------------------------------------- /slides/slides.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Introduction to parallel R" 3 | subtitle: \Large{Max Joseph} \newline Earth Lab, CU Boulder \newline \url{github.com/mbjoseph/intro-parallel-r} 4 | author: "" 5 | date: "`r format(Sys.time(), '%d %B, %Y')`" 6 | output: 7 | beamer_presentation: 8 | latex_engine: xelatex 9 | header-includes: 10 | - \usepackage{blindtext} 11 | - \usetheme{Execushares} 12 | --- 13 | 14 | ```{r, echo = FALSE, message = FALSE} 15 | library(knitr) 16 | library(ggplot2) 17 | library(dplyr) 18 | library(parallel) 19 | library(snowfall) 20 | # set some global options 21 | opts_chunk$set(comment = NA, 22 | fig.width = 2.6, 23 | fig.height = 2.2, 24 | fig.align = 'center') 25 | 26 | theme_set(theme_minimal(base_size = 8) + 27 | theme(panel.grid.minor = element_blank())) 28 | ps <- 1 29 | ``` 30 | 31 | ## What is parallel computing? 32 | 33 | Processes run simultaneously (e.g., on separate cores) 34 | 35 | ![AMD Athlon™ X2 Dual-Core Processor 6400+ in AM2 package (David W. Smith)](https://upload.wikimedia.org/wikipedia/commons/f/fb/Athlon64x2-6400plus.jpg) 36 | 37 | 38 | 39 | ## Parallel computing analogy 40 | 41 | We want 100 coin flips 42 | 43 | **In serial**: 44 | 45 | flip one coin 100 times 46 | 47 | **In parallel**: 48 | 49 | get 100 people to flip 1 coin each simultaneously 50 | 51 | 52 | 53 | ## Why parallelism? 54 | 55 | 1. 
Speed gains
56 |
57 | ```{r, echo = FALSE, fig.width = 3.5, fig.height = 2.3}
58 | # compute theoretical speedup in latency for computations
59 | # across a range of N_processors and percent parallel
60 | n_processors <- 2^(0:14)
61 | p <- c(.5, .75, .9, .95, .975)
62 | d <- expand.grid(n = n_processors, p = p)
63 | d <- d %>%
64 | mutate(speedup = 1 / ((1 - p) + p / n))
65 | label_d <- d %>%
66 | group_by(p) %>%
67 | summarize(n = max(n),
68 | speedup = max(speedup),
69 | label = unique(paste0(100 * p, '% parallel')))
70 |
71 | ggplot(d, aes(x = n, y = speedup, group = p)) +
72 | geom_line() +
73 | scale_x_log10(limits = c(1, 2^18), breaks = n_processors) +
74 | geom_text(data = label_d, aes(label = label), nudge_x = .8, size = 2.4) +
75 | theme(legend.position = 'none') +
76 | xlab('Number of processors') +
77 | ylab('Theoretical speedup') +
78 | theme(axis.text.x = element_text(angle = 45))
79 | ```
80 |
81 |
82 | ## When to parallelize?
83 |
84 | - speed gains required
85 | - other optimizations are not enough
86 | - the task can be parallelized
87 |
88 |
89 |
90 | ## Embarrassingly parallel tasks
91 |
92 | Little to no dependency among tasks (e.g., computing row sums)
93 |
94 | $$\begin{bmatrix}
95 | 1 & 4 & 7\\
96 | 2 & 5 & 8\\
97 | 3 & 6 & 9
98 | \end{bmatrix} \rightarrow
99 | \begin{bmatrix}
100 | 12\\
101 | 15\\
102 | 18
103 | \end{bmatrix}$$
104 |
105 |
106 |
107 | ## Dependent process: random walk
108 |
109 | $$y_t = y_{t - 1} + \text{Normal}(0, 1)$$
110 |
111 | ```{r, echo = FALSE, fig.height = 2}
112 | n <- 100
113 | epsilon <- rnorm(n)
114 | y <- rep(NA, n)
115 | y[1] <- 0
116 | for (i in 2:n) y[i] <- y[i - 1] + epsilon[i]
117 |
118 | data.frame(t = 1:n, y = y) %>%
119 | ggplot(aes(x = t, y = y)) +
120 | geom_path() +
121 | xlab('Time')
122 | ```
123 |
124 |
125 |
126 |
127 | ## Making parallel-ready R code
128 |
129 | 1. Start with a serial implementation
130 | 2. Port it to a parallel-friendly function
131 |
132 |
133 |
134 | ## `for` loops
135 |
136 | ```{r}
137 | M <- matrix(1:9, nrow = 3)
138 | for (i in 1:3) {
139 | print(sum(M[i, ]))
140 | }
141 | ```
142 |
143 | - imposes meaningless ordering
144 | - allows for dependence (e.g., we could access row `i - 1`)
145 |
146 |
147 |
148 | ## `apply()` instead
149 |
150 | ```{r}
151 | M <- matrix(1:9, nrow = 3)
152 | apply(M, 1, sum)
153 | ```
154 |
155 | - idiomatic in R
156 | - easier to parallelize
157 | - operations are unambiguously independent
158 |
159 |
160 |
161 | ## `apply()` visualized
162 |
163 | ![http://blog.datacamp.com/wp-content/uploads/2015/07/Apply_function.png](http://blog.datacamp.com/wp-content/uploads/2015/07/Apply_function.png)
164 |
165 |
166 |
167 | ## A more complex example
168 |
169 | ```{r, echo = FALSE}
170 | library(ggplot2)
171 |
172 | f <- function(x, y) {
173 | # unit disk function (region in plane bounded by circle)
174 | # whose area is pi
175 | # - the value of the function is one in the circle, and zero elsewhere
176 | # - the radius of the circle is 1
177 | # - the integral of the function over x and y is known to be pi
178 | ifelse(x^2 + y^2 <= 1, 1, 0)
179 | }
180 |
181 | # generate a circular path to plot
182 | tt <- seq(0, 2 * pi, length.out = 500)
183 | circle_path <- data.frame(x = cos(tt), y = sin(tt))
184 |
185 | p <- ggplot(circle_path, aes(x = x, y = y)) +
186 | geom_path() +
187 | coord_equal()
188 | p
189 | ```
190 |
191 | **Goal**: estimate the area of a circle with radius $= 1$ and area $= \pi$ using Monte Carlo integration. 
192 |
193 |
194 |
195 | ## Monte Carlo integration
196 |
197 | ```{r, echo = FALSE}
198 | # generate a random sample of points on the interval
199 | # x \in (-1, 1), y \in (-1, 1)
200 | n_points <- 1000
201 | mc_points <- data.frame(x = runif(n_points, -1, 1),
202 | y = runif(n_points, -1, 1))
203 |
204 | # evaluate the function at each point and plot results
205 | mc_points$f <- f(x = mc_points$x, y = mc_points$y)
206 | mc_points$f_fact <- as.factor(mc_points$f)
207 |
208 |
209 | # approximate pi
210 | mc_volume <- 4 # (volume of sampling space: square with length 2, A = 2**2)
211 |
212 | pi_hat <- mc_volume * sum(mc_points$f) / n_points
213 |
214 | ggplot(mc_points, aes(x = x, y = y)) +
215 | geom_point(shape = 1, size = ps) +
216 | geom_path(data = circle_path) +
217 | coord_equal()
218 | ```
219 |
220 |
221 |
222 | ## Monte Carlo integration
223 |
224 | ```{r, echo = FALSE}
225 | ggplot(mc_points, aes(x = x, y = y)) +
226 | geom_point(aes(color = f_fact), size = ps) +
227 | coord_equal() +
228 | scale_color_discrete(expression(paste(f(bar(x)[i]))))
229 | ```
230 |
231 |
232 |
233 | ## Monte Carlo integration
234 |
235 | ```{r, echo = FALSE}
236 | ggplot(mc_points, aes(x = x, y = y)) +
237 | geom_point(aes(color = f_fact), size = ps) +
238 | coord_equal() +
239 | scale_color_discrete(expression(paste(f(bar(x)[i])))) +
240 | ggtitle(substitute(paste(hat(pi), '=', pi_hat),
241 | list(pi_hat = pi_hat)))
242 | ```
243 |
244 | $$\hat{\pi} = V N^{-1} \displaystyle \sum_{i = 1}^N f(\bar{x}_i)$$
245 |
246 |
247 |
248 | ## MC integration in R
249 |
250 | ```{r, echo = TRUE}
251 | approx_pi <- function(n) {
252 | # estimate pi w/ MC integration
253 | x <- runif(n, min = -1, max = 1)
254 | y <- runif(n, min = -1, max = 1)
255 | V <- 4
256 | f_hat <- ifelse(x^2 + y^2 <= 1, 1, 0)
257 | V * sum(f_hat) / n
258 | }
259 | ```
260 |
261 |
262 |
263 | ## How does $N$ influence $\hat{\pi}$?
264 |
265 | ```{r}
266 | n <- seq(10, 10000, by = 10)
267 | pi_hat <- rep(NA, length(n))
268 |
269 | for (i in seq_along(n)) {
270 | pi_hat[i] <- approx_pi(n[i])
271 | }
272 | ```
273 |
274 |
275 |
276 | ## How does $N$ influence $\hat{\pi}$?
277 |
278 | ```{r, echo = FALSE, fig.height = 2}
279 | data.frame(n, pi_hat) %>%
280 | ggplot(aes(x = n, y = pi_hat)) +
281 | geom_point(shape = 1, size = ps * .5) +
282 | xlab('Number of MC samples') +
283 | ylab(expression(paste(hat(pi))))
284 | ```
285 |
286 |
287 |
288 | ## Avoiding a for-loop
289 |
290 | `lapply()` returns a list
291 |
292 | ```{r, echo = TRUE}
293 | pi_hat <- lapply(n, approx_pi)
294 | str(pi_hat[1:5])
295 | ```
296 |
297 |
298 |
299 | ## `apply()` for vectors
300 |
301 | `sapply()` returns vectors, matrices, and arrays
302 |
303 | ```{r}
304 | pi_hat <- sapply(n, approx_pi)
305 | str(pi_hat)
306 | ```
307 |
308 |
309 |
310 | ## Local parallelization
311 |
312 | Each MC integration is embarrassingly parallel!
313 |
314 | To parallelize:
315 |
316 | 1. start a cluster
317 | 2. compute simultaneously across the cluster
318 | 3. gather results
319 | 4. 
close cluster 320 | 321 | 322 | 323 | 324 | ## the `parallel` package 325 | 326 | ```{r} 327 | cl <- makeCluster(2) 328 | 329 | pi_hat <- parSapply(cl, n, approx_pi) 330 | 331 | stopCluster(cl) 332 | ``` 333 | 334 | 335 | ## doMC and foreach 336 | 337 | ```{r, message = FALSE} 338 | library(doMC) 339 | registerDoMC(2) 340 | 341 | pis <- foreach(i = 1:length(n)) %dopar% { 342 | approx_pi(n[i]) 343 | } 344 | 345 | class(pis) 346 | ``` 347 | 348 | 349 | 350 | ## Custom combines via `.combine` 351 | 352 | ```{r, message = FALSE} 353 | foreach(i = 1:length(n), .combine = c) %dopar% { 354 | approx_pi(n[i]) 355 | } %>% 356 | str() 357 | ``` 358 | 359 | 360 | 361 | ## What if we want one estimate? 362 | 363 | Now, suppose we want **one** precise estimate of $\pi$: 364 | 365 | $\rightarrow$ we need lots of MC samples! 366 | 367 | e.g., if we drop $N$ points $J$ times: 368 | 369 | $$\hat{\pi} = V (NJ)^{-1} \displaystyle \sum_{j=1}^J \sum_{i = 1}^N f(\bar{x}_{ij})$$ 370 | 371 | 372 | 373 | ## Getting one precise estimate 374 | 375 | ```{r} 376 | sum_f <- function(n) { 377 | x <- runif(n, min = -1, max = 1) 378 | y <- runif(n, min = -1, max = 1) 379 | sum(x^2 + y^2 <= 1) 380 | } 381 | ``` 382 | 383 | $$\hat{\pi} = V (NJ)^{-1} \displaystyle \sum_{j=1}^J \color{blue}{\sum_{i = 1}^N f(\bar{x}_{ij}})$$ 384 | 385 | 386 | ## Getting one precise estimate 387 | 388 | ```{r} 389 | N <- 10000 390 | J <- 10000 391 | n <- rep(N, J) 392 | 393 | cl <- makeCluster(4) 394 | f_sums <- parSapply(cl, n, sum_f) 395 | stopCluster(cl) 396 | 397 | 4 * sum(f_sums) / (N * J) - pi 398 | ``` 399 | 400 | 401 | 402 | ## Parallel R in HPC environments 403 | 404 | Communication across nodes via message passing interface (MPI) 405 | 406 | - *de facto* standard on distributed memory systems 407 | 408 | **Relevant R packages**: 409 | 410 | - Rmpi 411 | - snow 412 | - snowfall 413 | - pbdR 414 | 415 | 416 | 417 | ## Rmpi 418 | 419 | **Initialization**: 420 | 421 | `mpi.spawn.Rslaves(nslaves = ...)` 422 | 423 | **Execution**: 424 | 425 | `mpi.bcast.cmd(...)` 426 | 427 | `mpi.remote.exec(...)` 428 | 429 | **Shut down**: 430 | 431 | `mpi.close.Rslaves()` 432 | 433 | 434 | 435 | ## snow 436 | 437 | Simple network of workstations 438 | 439 | **Initialization**: 440 | 441 | `cl <- makeCluster(...)` 442 | 443 | **Execution**: 444 | 445 | `clusterExport(cl, list, envir = .GlobalEnv)` 446 | 447 | `clusterCall(cl, function)` 448 | 449 | `clusterApply(cl, x, fun)` 450 | 451 | `clusterApplyLB(cl, x, fun) # load balanced` 452 | 453 | **Shut down**: 454 | 455 | `stopCluster(cl)` 456 | 457 | 458 | 459 | ## snowfall 460 | 461 | Simpler simple network of workstations 462 | 463 | **Initialization**: 464 | 465 | `sfInit(parallel = TRUE, cpus = ...)` 466 | 467 | **Execution**: 468 | 469 | `sfLibrary(dplyr)` 470 | 471 | `sfSource("file.R")` 472 | 473 | `sfExport(...)` 474 | 475 | `sfClusterApply(x, fun)` 476 | 477 | `sfClusterApplyLB(x, fun)` 478 | 479 | **Shut down**: 480 | 481 | `sfStop()` 482 | 483 | 484 | 485 | ## Local snowfall 486 | 487 | ```{r, message = FALSE, warning = FALSE} 488 | sfInit(parallel = TRUE, cpus = 2) 489 | sfClusterEval(print("yummie")) 490 | sfStop() 491 | ``` 492 | 493 | 494 | 495 | ## Advantages of snowfall 496 | 497 | 1. Easy prototyping on a local multicore machine 498 | 499 | **Local** 500 | 501 | ``` 502 | sfInit(parallel = TRUE, cpus = 2) 503 | ``` 504 | 505 | **Remote** 506 | 507 | ``` 508 | sfInit(parallel = TRUE, cpus = 240, type = "MPI") 509 | ``` 510 | 511 | 512 | ## Advantages of snowfall 513 | 514 | 1. 
Easy prototyping on a local multicore machine 515 | 2. Easy serial execution 516 | 517 | ```{r, message=FALSE, warning = FALSE} 518 | sfInit(parallel = FALSE, cpus = 2) 519 | sfClusterEval(print("yummie")) 520 | sfStop() 521 | ``` 522 | 523 | 524 | 525 | ## Disadvantages of snowfall 526 | 527 | 1. `Rmpi` and `snow` are dependencies 528 | 2. Very thin wrapper around `snow` 529 | - if you don't need serial execution, maybe `snow` is sufficient 530 | 531 | 532 | 533 | ## TL;DL 534 | 535 | To make parallel R easier: 536 | 537 | 1. Know what's parallelizable 538 | 1. Use the `apply()` functions 539 | 1. `snow`/`snowfall` work locally *and* in an HPC environment 540 | 541 | 542 | 543 | ## Thank you 544 | 545 | **Slides**: figshare.com/articles/Introduction_to_parallel_R/3848310 546 | 547 | **Source code**: github.com/mbjoseph/intro-parallel-r 548 | 549 | **E-mail**: maxwell.b.joseph@colorado.edu 550 | 551 | 552 | -------------------------------------------------------------------------------- /snow-demo/README.md: -------------------------------------------------------------------------------- 1 | # snow demo on Janus 2 | 3 | The `snow` R package provides simple parallel functionality with multiple backends, including MPI, and is bundled up in a module on Janus. 4 | The file `job.sh` is the slurm job script to be submitted, which calls `snow-test.R`. 5 | 6 | Usage: 7 | 8 | This requires that you are logged in to Janus: 9 | 10 | ``` 11 | ssh -l $your_username login.rc.colorado.edu 12 | ``` 13 | 14 | Once you've logged in, load slurm and then use `sbatch` to submit the job to `janus-debug`: 15 | 16 | ``` 17 | ml slurm 18 | sbatch job.sh 19 | ``` 20 | 21 | -------------------------------------------------------------------------------- /snow-demo/job.sh: -------------------------------------------------------------------------------- 1 | #! /bin/bash 2 | #SBATCH --job-name snow-test 3 | #SBATCH --time 00:00:30 4 | #SBATCH --nodes 2 5 | #SBATCH --output results.out 6 | #SBATCH --ntasks-per-node 12 7 | #SBATCH --qos=janus-debug 8 | 9 | ml gcc/5.1.0 10 | ml R/3.2.0 11 | ml Rmpi/0.6-5 12 | ml snow/0.3-13 13 | 14 | mpirun -np 1 R --vanilla -f snow-test.R 15 | 16 | -------------------------------------------------------------------------------- /snow-demo/snow-test.R: -------------------------------------------------------------------------------- 1 | library("Rmpi") 2 | library("snow") 3 | 4 | # start up a cluster 5 | np <- Rmpi::mpi.universe.size() 6 | print(paste('There are', np, 'cores')) 7 | cl <- makeCluster(np - 1, type = "MPI") 8 | 9 | # have each cluster member say hello 10 | greet <- function() { 11 | paste("I'm", Sys.info()['nodename'], 12 | 'with CPU type', Sys.info()['machine']) 13 | } 14 | 15 | clusterCall(cl, greet) 16 | 17 | stopCluster(cl) 18 | mpi.quit() 19 | 20 | -------------------------------------------------------------------------------- /snowfall-demo/README.md: -------------------------------------------------------------------------------- 1 | # snowfall demo on Janus 2 | 3 | The `snowfall` R package wraps `snow` and `Rmpi`, and is bundled up in a module on Janus. 4 | The file `job.sh` is the slurm job script to be submitted, which calls `snowfall-test.R`. 
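If you want to prototype the same workflow on a local machine before submitting a cluster job, a minimal sketch along these lines should work (this is an illustration rather than part of the demo: it assumes `snowfall` is installed locally, uses a 2-CPU socket cluster instead of MPI, and condenses the `approx_pi()` estimator from `snowfall-test.R`):

```
library("snowfall")

# local prototype: 2 socket workers instead of an MPI cluster
sfInit(parallel = TRUE, cpus = 2)

# condensed version of the Monte Carlo estimator in snowfall-test.R
approx_pi <- function(n) {
  x <- runif(n, min = -1, max = 1)
  y <- runif(n, min = -1, max = 1)
  4 * sum(x^2 + y^2 <= 1) / n
}

# approx_pi is shipped to the workers as part of the sfSapply() call
sfSapply(rep(1e5, 4), approx_pi)

sfStop()
```

Only the `sfInit()` call (and the surrounding `Rmpi` bookkeeping in `snowfall-test.R`) changes when moving to the MPI run submitted by `job.sh`.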
5 | 6 | Usage: 7 | 8 | This requires that you are logged in to Janus: 9 | 10 | ``` 11 | ssh -l $your_username login.rc.colorado.edu 12 | ``` 13 | 14 | Once you've logged in, load slurm and then use `sbatch` to submit the job to `janus-debug`: 15 | 16 | ``` 17 | ml slurm 18 | sbatch job.sh 19 | ``` 20 | 21 | -------------------------------------------------------------------------------- /snowfall-demo/job.sh: -------------------------------------------------------------------------------- 1 | #! /bin/bash 2 | #SBATCH --job-name snowfall-test 3 | #SBATCH --time 00:00:30 4 | #SBATCH --nodes 2 5 | #SBATCH --output results.out 6 | #SBATCH --ntasks-per-node 12 7 | #SBATCH --qos=janus-debug 8 | 9 | ml gcc/5.1.0 10 | ml R/3.2.0 11 | ml Rmpi/0.6-5 12 | ml snow/0.3-13 13 | ml snowfall/1.84-6 14 | 15 | mpirun -np 1 R --vanilla -f snowfall-test.R 16 | 17 | -------------------------------------------------------------------------------- /snowfall-demo/snowfall-test.R: -------------------------------------------------------------------------------- 1 | library("Rmpi") 2 | library("snowfall") 3 | 4 | # initialization 5 | np <- Rmpi::mpi.universe.size() 6 | print(paste('There are', np, 'cores')) 7 | sfInit(parallel = TRUE, cpus = np - 1, type = "MPI") 8 | 9 | # use monte carlo integration to approximate pi with varying MC sample sizes 10 | approx_pi <- function(n) { 11 | # approximates pi via MC integration of unit disk 12 | x <- runif(n, min = -1, max = 1) 13 | y <- runif(n, min = -1, max = 1) 14 | V <- 4 15 | f_hat <- ifelse(x^2 + y^2 <= 1, 1, 0) 16 | V * sum(f_hat) / n 17 | } 18 | 19 | n <- seq(10, 1E7, length.out = 50) 20 | result <- sfSapply(n, approx_pi) 21 | result 22 | 23 | # shut down and quit 24 | sfStop() 25 | mpi.quit() 26 | 27 | -------------------------------------------------------------------------------- /ss.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/mbjoseph/intro-parallel-r/118e95f525fb8ab506babd46b6f85de1c0fc9dec/ss.png --------------------------------------------------------------------------------