├── .gitignore ├── .gitmodules ├── 404.html ├── LICENSE.md ├── README.md ├── _config.yml ├── _includes ├── head.html └── sidebar.html ├── _layouts ├── default.html ├── page.html └── post.html ├── _posts ├── 2012-02-06-whats-jekyll.md ├── 2012-02-07-example-content.md └── 2013-12-28-introducing-hyde.md ├── about.md ├── assignments.md ├── assignments ├── R_basics.md ├── R_intermediate.Rmd ├── R_intermediate.html ├── git_basics.md ├── multivariate_models.md ├── smokies_transects.png ├── spatial_models.md └── univariate_models.md ├── atom.xml ├── data.md ├── data ├── BCI_env.csv ├── MODISfire2010.zip ├── crabdat(MF).xlsx ├── elemapi2v2.csv ├── fir.csv ├── inflammation-01.csv ├── maple.csv ├── milkweeds.csv ├── music genres ranking (Responses) - Form Responses 1.csv ├── tgpp.csv ├── tree_metadata.txt ├── treedata.csv ├── treedata_subset.csv └── v62.0_HO_public.csv ├── google671b87772b9c5779.html ├── images └── Thumbs.db ├── index.md ├── lessons.md ├── lessons ├── 00-before-we-start.Rmd ├── 00-before-we-start.html ├── R_intermediate.R ├── R_intermediate.Rmd ├── R_intermediate.html ├── R_introduction.R ├── R_introduction.Rmd ├── R_introduction.html ├── chaotic-pop │ └── app.R ├── community_structure_slides_with_notes.pdf ├── data_exploration.Rmd ├── data_exploration.html ├── figures │ ├── Rmd_knited.png │ ├── Rmd_prepopulated.png │ ├── crawley_2007_table9_2_model_simplification.png │ ├── final_doc.gif │ ├── git_diff.png │ ├── git_panel.PNG │ ├── git_tab_explained.png │ ├── isotropic_variogram_models_plots.png │ ├── isotropic_variogram_models_table.png │ ├── knit_button.png │ ├── naming_repo.PNG │ ├── new_proj.PNG │ ├── new_proj2.PNG │ ├── new_repo.PNG │ ├── proj_url.PNG │ ├── r_starting_how_it_should_look.png │ ├── repo_fresh.PNG │ ├── rmarkdown_dialogue.JPG │ ├── serious_git.png │ └── terminal.png ├── git_introduction.md ├── git_slides.pdf ├── more_with_maps.R ├── more_with_maps.nb.html ├── multivariate_models.Rmd ├── multivariate_models.html ├── ordination_table.csv ├── 
paired_samples.R ├── paired_samples.html ├── partial_residual_plots.Rmd ├── partial_residual_plots.html ├── prepare_data.R ├── rmarkdown_notes.html ├── rmarkdown_notes.md ├── shapefiles_and_rasters.R ├── shapefiles_and_rasters.html ├── simulations.Rmd ├── simulations.html ├── spatial_models.Rmd ├── spatial_models.html ├── spatial_models_notebook.nb.html ├── standardized_beta_coefficients.Rmd ├── standardized_beta_coefficients.html ├── stats_primer.pdf ├── tcltk_0.1-1.tar.gz ├── univariate_models.R ├── univariate_models.Rmd └── univariate_models.html ├── motivation.html ├── motivation.md ├── projects.md ├── projects ├── code_review.md └── naming-slides.pdf ├── public ├── R-Prog-Lang-Logo-sm.png ├── apple-touch-icon-144-precomposed.png ├── cc-by-80x15.png └── css │ ├── hyde.css │ ├── poole.css │ └── syntax.css ├── resources.md ├── scripts ├── collect_student_urls.R ├── download_fire_data.R ├── fibanacci_seq.R ├── google_sheets_mang.R ├── pull_student_repos.R ├── shiny_kmeans.R └── utility_functions.R ├── software.md ├── syllabus.md ├── syllabus_bio470.md ├── syllabus_bio470.pdf ├── syllabus_bio570.md └── syllabus_bio570.pdf /.gitignore: -------------------------------------------------------------------------------- 1 | .Rproj.user 2 | /solutions* 3 | /student* 4 | /projects 5 | .Rhistory 6 | *.Rproj -------------------------------------------------------------------------------- /.gitmodules: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/.gitmodules -------------------------------------------------------------------------------- /404.html: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | title: "404: Page not found" 4 | permalink: 404.html 5 | --- 6 | 7 |
8 |

404: Page not found

9 |

Sorry, we've misplaced that URL or it's pointing to something that doesn't exist. Head back home to try finding it again.

10 |
11 | -------------------------------------------------------------------------------- /LICENSE.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Licenses 4 | --- 5 | 6 | ### Instructional Material 7 | 8 | All instructional material is made available under the [Creative Commons 9 | Attribution 4.0 license](https://creativecommons.org/licenses/by/4.0/). You are 10 | free to: 11 | 12 | * **Share**---copy and redistribute the material in any medium or format 13 | 14 | * **Adapt**---remix, transform, and build upon the material for any purpose, even commercially. 15 | 16 | Under the following terms: 17 | 18 | * **Attribution**---You must give appropriate credit, provide a link to the 19 | license, and indicate if changes were made. You may do so in any reasonable 20 | manner, but not in any way that suggests the licensor endorses you or your 21 | use. 22 | 23 | With the understanding that: 24 | 25 | * You do not have to comply with the license for elements of the material in the 26 | public domain or where your use is permitted by an applicable exception or 27 | limitation. 28 | 29 | * No warranties are given. The license may not give you all of the permissions 30 | necessary for your intended use. For example, other rights such as publicity, 31 | privacy, or moral rights may limit how you use the material. 32 | 33 | 34 | For the full legal text of this license, please see 35 | [http://creativecommons.org/licenses/by/4.0/legalcode](http://creativecommons.org/licenses/by/4.0/legalcode). 36 | 37 | ### Software 38 | 39 | Except where otherwise noted, the example programs and other software provided 40 | are made available under the [OSI](http://opensource.org)-approved [MIT 41 | license](http://opensource.org/licenses/mit-license.html). 
42 | 43 | Permission is hereby granted, free of charge, to any person obtaining 44 | a copy of this software and associated documentation files (the 45 | "Software"), to deal in the Software without restriction, including 46 | without limitation the rights to use, copy, modify, merge, publish, 47 | distribute, sublicense, and/or sell copies of the Software, and to 48 | permit persons to whom the Software is furnished to do so, subject to 49 | the following conditions: 50 | 51 | The above copyright notice and this permission notice shall be 52 | included in all copies or substantial portions of the Software. 53 | 54 | THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, 55 | EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF 56 | MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND 57 | NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE 58 | LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION 59 | OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION 60 | WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. -------------------------------------------------------------------------------- /README.md: -------------------------------------------------------------------------------- 1 | Welcome to Applied Quantitative Methods 2 | ======================================= 3 | 4 | This site contains the teaching materials for the course Applied Quantitative 5 | Methods. The course is aimed at developing a small set of core skills in 6 | quantitative methods, and providing students an opportunity to apply these tools 7 | towards their own projects. 8 | 9 | I plan to add a better front-end webpage to provide better access to these 10 | teaching materials, but for now this repo serves the purpose. 
11 | 12 | Index 13 | ----- 14 | * [syllabus](syllabus.md) 15 | * [lessons](lesson_index.md) 16 | * [online resources](resource_links.md) 17 | 18 | License 19 | ------- 20 | The [license](LICENSE.md) file explains that you are encouraged to reuse or 21 | modify these materials with attribution (Creative Commons Attribution 4.0 license), 22 | and to submit issues or contribute changes if you see things that need improving. 23 | 24 | Acknowledgements 25 | ---------------- 26 | I would like to thank the following people for helping provide suggestions about 27 | how to best implement this course. 28 | * Ethan White 29 | 30 | -------------------------------------------------------------------------------- /_config.yml: -------------------------------------------------------------------------------- 1 | # Dependencies 2 | markdown: kramdown 3 | highlighter: rouge 4 | 5 | # Permalinks 6 | permalink: pretty 7 | 8 | # Setup 9 | title: Applied Quantitative Methods 10 | tagline: 'An open course' 11 | description: 'Structure data, build models, and plot results in the R programming language' 12 | url: dmcglinn.github.io/quant_methods 13 | baseurl: /quant_methods/ 14 | 15 | author: 16 | name: 'Dan McGlinn' 17 | url: https://twitter.com/danmcglinn 18 | 19 | github: 20 | repo: https://github.com/dmcglinn/quant_methods 21 | -------------------------------------------------------------------------------- /_includes/head.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | {% if page.title == "Home" %} 11 | {{ site.title }} · {{ site.tagline }} 12 | {% else %} 13 | {{ page.title }} · {{ site.title }} 14 | {% endif %} 15 | 16 | 17 | 18 | 19 | 20 | 21 | 22 | 23 | 24 | 25 | 26 | 27 | 28 | 29 | 30 | -------------------------------------------------------------------------------- /_includes/sidebar.html: -------------------------------------------------------------------------------- 1 | 30 | 
-------------------------------------------------------------------------------- /_layouts/default.html: -------------------------------------------------------------------------------- 1 | 2 | 3 | 4 | {% include head.html %} 5 | 6 | 7 | 8 | {% include sidebar.html %} 9 | 10 |
11 | {{ content }} 12 |
13 | 14 | 15 | 16 | -------------------------------------------------------------------------------- /_layouts/page.html: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | --- 4 | 5 |
6 |

{{ page.title }}

7 | {{ content }} 8 |
9 | -------------------------------------------------------------------------------- /_layouts/post.html: -------------------------------------------------------------------------------- 1 | --- 2 | layout: default 3 | --- 4 | 5 |
6 |

{{ page.title }}

7 | {{ page.date | date_to_string }} 8 | {{ content }} 9 |
10 | 11 | 26 | -------------------------------------------------------------------------------- /_posts/2012-02-06-whats-jekyll.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: post 3 | title: What's Jekyll? 4 | --- 5 | 6 | [Jekyll](http://jekyllrb.com) is a static site generator, an open-source tool for creating simple yet powerful websites of all shapes and sizes. From [the project's readme](https://github.com/mojombo/jekyll/blob/master/README.markdown): 7 | 8 | > Jekyll is a simple, blog aware, static site generator. It takes a template directory [...] and spits out a complete, static website suitable for serving with Apache or your favorite web server. This is also the engine behind GitHub Pages, which you can use to host your project’s page or blog right here from GitHub. 9 | 10 | It's an immensely useful tool and one we encourage you to use here with Hyde. 11 | 12 | Find out more by [visiting the project on GitHub](https://github.com/mojombo/jekyll). -------------------------------------------------------------------------------- /_posts/2012-02-07-example-content.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: post 3 | title: Example content 4 | --- 5 | 6 | 7 |
8 | Howdy! This is an example blog post that shows several types of HTML content supported in this theme. 9 |
10 | 11 | Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. *Aenean eu leo quam.* Pellentesque ornare sem lacinia quam venenatis vestibulum. Sed posuere consectetur est at lobortis. Cras mattis consectetur purus sit amet fermentum. 12 | 13 | > Curabitur blandit tempus porttitor. Nullam quis risus eget urna mollis ornare vel eu leo. Nullam id dolor id nibh ultricies vehicula ut id elit. 14 | 15 | Etiam porta **sem malesuada magna** mollis euismod. Cras mattis consectetur purus sit amet fermentum. Aenean lacinia bibendum nulla sed consectetur. 16 | 17 | ## Inline HTML elements 18 | 19 | HTML defines a long list of available inline tags, a complete list of which can be found on the [Mozilla Developer Network](https://developer.mozilla.org/en-US/docs/Web/HTML/Element). 20 | 21 | - **To bold text**, use ``. 22 | - *To italicize text*, use ``. 23 | - Abbreviations, like HTML should use ``, with an optional `title` attribute for the full phrase. 24 | - Citations, like — Mark otto, should use ``. 25 | - Deleted text should use `` and inserted text should use ``. 26 | - Superscript text uses `` and subscript text uses ``. 27 | 28 | Most of these elements are styled by browsers with few modifications on our part. 29 | 30 | ## Heading 31 | 32 | Vivamus sagittis lacus vel augue rutrum faucibus dolor auctor. Duis mollis, est non commodo luctus, nisi erat porttitor ligula, eget lacinia odio sem nec elit. Morbi leo risus, porta ac consectetur ac, vestibulum at eros. 33 | 34 | ### Code 35 | 36 | Cum sociis natoque penatibus et magnis dis `code element` montes, nascetur ridiculus mus. 
37 | 38 | {% highlight js %} 39 | // Example can be run directly in your JavaScript console 40 | 41 | // Create a function that takes two arguments and returns the sum of those arguments 42 | var adder = new Function("a", "b", "return a + b"); 43 | 44 | // Call the function 45 | adder(2, 6); 46 | // > 8 47 | {% endhighlight %} 48 | 49 | Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa. 50 | 51 | ### Gists via GitHub Pages 52 | 53 | Vestibulum id ligula porta felis euismod semper. Nullam quis risus eget urna mollis ornare vel eu leo. Donec sed odio dui. 54 | 55 | {% gist 5555251 gist.md %} 56 | 57 | Aenean eu leo quam. Pellentesque ornare sem lacinia quam venenatis vestibulum. Nullam quis risus eget urna mollis ornare vel eu leo. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec sed odio dui. Vestibulum id ligula porta felis euismod semper. 58 | 59 | ### Lists 60 | 61 | Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Aenean lacinia bibendum nulla sed consectetur. Etiam porta sem malesuada magna mollis euismod. Fusce dapibus, tellus ac cursus commodo, tortor mauris condimentum nibh, ut fermentum massa justo sit amet risus. 62 | 63 | * Praesent commodo cursus magna, vel scelerisque nisl consectetur et. 64 | * Donec id elit non mi porta gravida at eget metus. 65 | * Nulla vitae elit libero, a pharetra augue. 66 | 67 | Donec ullamcorper nulla non metus auctor fringilla. Nulla vitae elit libero, a pharetra augue. 68 | 69 | 1. Vestibulum id ligula porta felis euismod semper. 70 | 2. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. 71 | 3. Maecenas sed diam eget risus varius blandit sit amet non magna. 72 | 73 | Cras mattis consectetur purus sit amet fermentum. Sed posuere consectetur est at lobortis. 74 | 75 |
76 |
HyperText Markup Language (HTML)
77 |
The language used to describe and define the content of a Web page
78 | 79 |
Cascading Style Sheets (CSS)
80 |
Used to describe the appearance of Web content
81 | 82 |
JavaScript (JS)
83 |
The programming language used to build advanced Web sites and applications
84 |
85 | 86 | Integer posuere erat a ante venenatis dapibus posuere velit aliquet. Morbi leo risus, porta ac consectetur ac, vestibulum at eros. Nullam quis risus eget urna mollis ornare vel eu leo. 87 | 88 | ### Images 89 | 90 | Quisque consequat sapien eget quam rhoncus, sit amet laoreet diam tempus. Aliquam aliquam metus erat, a pulvinar turpis suscipit at. 91 | 92 | ![placeholder](http://placehold.it/800x400 "Large example image") 93 | ![placeholder](http://placehold.it/400x200 "Medium example image") 94 | ![placeholder](http://placehold.it/200x200 "Small example image") 95 | 96 | ### Tables 97 | 98 | Aenean lacinia bibendum nulla sed consectetur. Lorem ipsum dolor sit amet, consectetur adipiscing elit. 99 | 100 | 101 | 102 | 103 | 104 | 105 | 106 | 107 | 108 | 109 | 110 | 111 | 112 | 113 | 114 | 115 | 116 | 117 | 118 | 119 | 120 | 121 | 122 | 123 | 124 | 125 | 126 | 127 | 128 | 129 | 130 | 131 | 132 |
| Name    | Upvotes | Downvotes |
|---------|---------|-----------|
| Alice   | 10      | 11        |
| Bob     | 4       | 3         |
| Charlie | 7       | 9         |
| Totals  | 21      | 23        |
133 | 134 | Nullam id dolor id nibh ultricies vehicula ut id elit. Sed posuere consectetur est at lobortis. Nullam quis risus eget urna mollis ornare vel eu leo. 135 | 136 | ----- 137 | 138 | Want to see something else added? Open an issue. 139 | -------------------------------------------------------------------------------- /_posts/2013-12-28-introducing-hyde.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: post 3 | title: Introducing Hyde 4 | --- 5 | 6 | Hyde is a brazen two-column [Jekyll](http://jekyllrb.com) theme that pairs a prominent sidebar with uncomplicated content. It's based on [Poole](http://getpoole.com), the Jekyll butler. 7 | 8 | ### Built on Poole 9 | 10 | Poole is the Jekyll Butler, serving as an upstanding and effective foundation for Jekyll themes by [@mdo](https://twitter.com/mdo). Poole, and every theme built on it (like Hyde here) includes the following: 11 | 12 | * Complete Jekyll setup included (layouts, config, [404](/404), [RSS feed](/atom.xml), posts, and [example page](/about)) 13 | * Mobile friendly design and development 14 | * Easily scalable text and component sizing with `rem` units in the CSS 15 | * Support for a wide gamut of HTML elements 16 | * Related posts (time-based, because Jekyll) below each post 17 | * Syntax highlighting, courtesy Pygments (the Python-based code snippet highlighter) 18 | 19 | ### Hyde features 20 | 21 | In addition to the features of Poole, Hyde adds the following: 22 | 23 | * Sidebar includes support for textual modules and a dynamically generated navigation with active link support 24 | * Two orientations for content and sidebar, default (left sidebar) and [reverse](https://github.com/poole/lanyon#reverse-layout) (right sidebar), available via `` classes 25 | * [Eight optional color schemes](https://github.com/poole/hyde#themes), available via `` classes 26 | 27 | [Head to the readme](https://github.com/poole/hyde#readme) to learn more. 
28 | 29 | ### Browser support 30 | 31 | Hyde is by preference a forward-thinking project. In addition to the latest versions of Chrome, Safari (mobile and desktop), and Firefox, it is only compatible with Internet Explorer 9 and above. 32 | 33 | ### Download 34 | 35 | Hyde is developed on and hosted with GitHub. Head to the GitHub repository for downloads, bug reports, and features requests. 36 | 37 | Thanks! 38 | -------------------------------------------------------------------------------- /about.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: About 4 | --- 5 | 6 | This course was developed by [Dan McGlinn](https://mcglinnlab.org). 7 | 8 | I would like to thank Ethan White, Greg Wilson, and David LeBauer for advice 9 | on how to create this website using jekyll. -------------------------------------------------------------------------------- /assignments.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Assignments 4 | --- 5 | 6 | * Introduction to R 7 | - [basic](./R_basics) 8 | - intermediate 9 | * Version Control 10 | - [Setup remote git repo](./git_basics) 11 | * Univariate Models 12 | - [Modeling tree cover and richness](./univariate_models) 13 | * Multivariate Models 14 | - [Modeling dutch dune vegetation](./multivariate_models) 15 | * Spatial Models 16 | - [Modeling abundance of tropical trees](./spatial_models) -------------------------------------------------------------------------------- /assignments/R_basics.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: R basics 3 | layout: page 4 | --- 5 | 6 | We covered some of the basics of creating and working with objects in R during 7 | our first meeting. Now it is time to apply those skills to address some common 8 | tasks faced when processing data in R. 
**Indicate what R commands and their 9 | output are for each question.** 10 | 11 | Download and read in the datafile `tgpp.csv` from the class 12 | website using the R function `read.csv`. Use these steps: 13 | 14 | * Navigate to the class website: https://github.com/dmcglinn/quant_methods 15 | * Click the `data` folder 16 | * Click the `tgpp.csv` file 17 | * Now click the button in the top right corner of the spreadsheet called `raw` 18 | 19 | This will take you to a plain text view of the file. At this point you have two options for getting this file into R: 20 | 21 | 1) Manual download and import 22 | 23 | * Save the file on your local machine in the directory you would like to use for 24 | this course. 25 | * Now you just have to point the function `read.csv` to the file so it knows where 26 | the file is located. For example, if I save the file in the following directory: 27 | `C:/users/dan/Rclass/data/` then I would use the following R command 28 | 29 | ``` 30 | tgpp <- read.csv('C:/users/dan/Rclass/data/tgpp.csv') 31 | ``` 32 | 33 | 2) Alternatively, I could just supply the function `read.csv` with the URL of the raw 34 | file that I navigated to on GitHub: 35 | 36 | ``` 37 | tgpp <- read.csv('https://raw.githubusercontent.com/dmcglinn/quant_methods/gh-pages/data/tgpp.csv') 38 | ``` 39 | 40 | The second option is faster, but if the file is ever taken offline then that code will break. 41 | 42 | This dataset represents the vascular plant species richness that was 43 | collected from the Tallgrass Prairie Preserve from 10 x 10 m quadrats. Species 44 | richness is simply the number of species that occur within a quadrat. 45 | 46 | Read the data into R. Note that this datafile has a header (i.e., it has column 47 | names) unlike the example we examined in class. 48 | 49 | 1. What are the names of the columns in this dataset? 50 | 51 | 2. How many rows and columns does this data file have? 52 | 53 | 3. What kind of object is each data column? 
Hint: check out the function `sapply()`. 54 | 55 | 4. What are the values of the datafile for rows 1, 5, and 8 at columns 3, 56 | 7, and 10? 57 | 58 | 5. Create a PDF of the relationship between the variables "scale" and "richness". 59 | Scale is the area in square meters of the quadrat in which richness was 60 | recorded. Be sure to label your axes clearly, and choose a color you find 61 | pleasing for the points. To get a list of available stock colors use the 62 | function `colors()`. Also see this link: https://r-charts.com/colors/. 63 | 64 | 6. What happens to your plot when you set the plot argument log equal to 'xy'? 65 | `plot(..., log='xy')` 66 | 67 | 68 | 69 | 70 | 71 | 72 | -------------------------------------------------------------------------------- /assignments/R_intermediate.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "R intermediate" 3 | author: "Dan McGlinn" 4 | date: "January 15, 2016" 5 | output: html_document 6 | --- 7 | 8 | > Premature optimization is the root of all evil -- Donald Knuth 9 | 10 | The humble for loop is often considered distasteful by seasoned programmers 11 | because it is inefficient; however, the for loop is one of the most useful and 12 | generalizable programming structures in R. If you can learn how to construct and 13 | understand for loops then you can code almost any iterative task. Once your loop 14 | works you can always work to optimize your code and increase its efficiency. 15 | 16 | Before attempting these exercises you should review the lesson 17 | [R intermediate](../lessons/R_intermediate) in which loops were covered. 
18 | 19 | Examine the following for loop, and then complete the exercises: 20 | 21 | ```{r} 22 | data(iris) 23 | head(iris) 24 | 25 | sp_ids <- unique(iris$Species) 26 | 27 | output <- matrix(0, nrow=length(sp_ids), ncol=ncol(iris)-1) 28 | rownames(output) <- sp_ids 29 | colnames(output) <- names(iris[ , -ncol(iris)]) 30 | 31 | for(i in seq_along(sp_ids)) { 32 | iris_sp <- subset(iris, subset=Species == sp_ids[i], select=-Species) 33 | for(j in 1:(ncol(iris_sp))) { 34 | x <- 0 35 | y <- 0 36 | if (nrow(iris_sp) > 0) { 37 | for(k in 1:nrow(iris_sp)) { 38 | x <- x + iris_sp[k, j] 39 | y <- y + 1 40 | } 41 | output[i, j] <- x / y 42 | } 43 | } 44 | } 45 | output 46 | ``` 47 | ## Exercises 48 | ### Iris loops 49 | 50 | 1. Describe the values stored in the object `output`. In other words, what did the 51 | loops create? 52 | 53 | 2. Describe using pseudo-code how `output` was calculated, for example, 54 | ```{r, eval=FALSE} 55 | Loop from 1 to length of species identities 56 | Take a subset of iris data 57 | Loop from 1 to number of columns of the iris data 58 | If ... occurs then do ... 59 | ``` 60 | 61 | 3. The variables in the loop were deliberately given vague names. How can the objects 62 | `output`, `x`, and `y` be renamed such that it is clearer what is occurring in 63 | the loop? 64 | 65 | 4. Is it possible to accomplish the same task using fewer lines of code? Please 66 | suggest one other way to calculate `output` that decreases the number of loops 67 | by 1. 68 | 69 | ### Sum of a sequence 70 | 71 | 5. You have a vector `x` with the numbers 1:10. Write a for loop that will 72 | produce a vector `y` that contains the sum of `x` up to that index of `x`. So 73 | for example the elements of `x` are 1, 2, 3, and so on and the elements of `y` 74 | would be 1, 3, 6, and so on. 75 | 76 | 6. Modify your for loop so that if the sum is greater than 10 the value of `y` 77 | is set to `NA`. 78 | 79 | 7. 
Place your for loop into a function that accepts as its argument any vector 80 | of arbitrary length and returns `y`. 81 | 82 | ### (Optional) Fibonacci numbers and golden ratio 83 | 84 | 8. Fibonacci numbers are a sequence in which a given number is the sum of the 85 | preceding two numbers. So starting at 0 and 1 the sequence would be 86 | 87 | 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, ... 88 | 89 | Write and apply a simple R function that can accomplish this task with a for loop. 90 | Then write a function that computes the ratio of each sequential pair of 91 | Fibonacci numbers. Do they asymptotically approach the golden ratio `(1 + sqrt(5)) / 2`? 92 | 93 | 94 | 95 | 96 | 97 | 98 | 99 | 100 | 101 | -------------------------------------------------------------------------------- /assignments/git_basics.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | --- 4 | ## Git Assignment 5 | 6 | The purpose of this assignment is to build your familiarity with Git and to 7 | begin using GitHub as a repository for your code. 8 | 9 | 1. On GitHub create a repository with a README file for your materials for this 10 | class. I do not care if it is private or public. However, if you decide to create 11 | a private repo then you must add me as a collaborator. To do this, on the home page 12 | of your repository click on the settings button (a wrench and a screwdriver) 13 | and add my user name dmcglinn as a collaborator. 14 | 15 | 2. Clone the repository to your local machine with the following command 16 | `git clone https://github.com/your_user_name/your_repo_name.git` where you 17 | replace "your_user_name" with your user name and "your_repo_name" with the repo 18 | name you have chosen. 19 | 20 | 3. Copy and paste your class materials into this directory on your local machine. 21 | 22 | 4. 
Stage and commit files to your repository and then push them to GitHub with 23 | the command `git push origin master`. 24 | 25 | 5. Send me a link to your repository. 26 | 27 | 28 | -------------------------------------------------------------------------------- /assignments/multivariate_models.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: multivariate models 4 | --- 5 | 6 | For this assignment you will be analyzing data on the Vegetation 7 | and Environment in Dutch Dune Meadows. 8 | 9 | To import the data and read the metadata run the following: 10 | 11 | ```{r} 12 | library(vegan) 13 | data(dune) 14 | data(dune.env) 15 | ?dune 16 | # there are a few ordinal variables in the dataset that make modeling a 17 | # bit of a pain; let's convert those to numeric vectors or 18 | # to plain un-ranked factors so that they are easier to work 19 | # with and interpret. 20 | dune.env$Moisture <- as.numeric(dune.env$Moisture) 21 | dune.env$Manure <- as.numeric(dune.env$Manure) 22 | dune.env$Management <- factor(dune.env$Management, ordered = FALSE) 23 | dune.env$Use <- factor(dune.env$Use, ordered = FALSE) 24 | ``` 25 | 26 | 1. Conduct an indirect ordination on the dune plant community. Specifically, 27 | visually examine an NMDS plot using the Bray-Curtis distance metric. Below is 28 | some code to help you develop a potential plot that emphasizes the role of the 29 | environmental variable "Moisture". Describe how you interpret the 30 | graphic. What is the goal of creating such a plot? Does this analysis suggest 31 | any interesting findings with respect to the dune vegetation? 
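The plotting code that follows expects an ordination object named `dune_mds`, which the assignment leaves for you to create. A minimal sketch, assuming vegan's `metaMDS()` with the Bray-Curtis dissimilarity; the `set.seed()` value is arbitrary and only makes the random starts repeatable:

```r
library(vegan)
data(dune)

# NMDS uses random starting configurations; fix the seed for repeatability
set.seed(1)

# indirect ordination of the site-by-species matrix in two dimensions
dune_mds <- metaMDS(dune, distance = "bray", k = 2, trace = FALSE)

# stress summarizes how well the 2-D map preserves the rank order of
# the original dissimilarities (lower is better)
dune_mds$stress
```

Because the solution depends on the random starts, it is worth rerunning with a different seed to check that the configuration is stable before interpreting it.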
32 | 33 | ```{r} 34 | plot(dune_mds, type='n') 35 | text(dune_mds, 'sp', cex=.5) 36 | # generate vector of colors 37 | color_vect <- rev(terrain.colors(6))[-1] 38 | points(dune_mds, 'sites', pch=19, 39 | col=color_vect[dune.env$Moisture]) 40 | legend('topright', paste("Moisture =", 1:5, sep=''), 41 | col=color_vect, pch=19) 42 | ``` 43 | 44 | 2. Carry out a direct ordination using CCA in order to test any potential 45 | hypotheses that you developed after examining the MDS plot. Specifically, 46 | carry out a test of the entire model (i.e., including all constrained axes) 47 | and also carry out tests at the scale of individual explanatory variables 48 | you included in your model if you included more than one variable. Interpret 49 | the tests and the overall fit of the model to your data. Plot and interpret your 50 | results. 51 | 52 | 3. Do your two analyses agree with one another or complement one another or do 53 | these two analyses seem to be suggesting different take home messages? Which 54 | analysis do you find to be more useful? 
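For question 2, a constrained ordination and its permutation tests could be sketched as follows. The choice of `Moisture` and `Management` as explanatory variables is purely illustrative and is not a recommended model; the object name `dune_cca` and the number of permutations are likewise assumptions.

```r
library(vegan)
data(dune)
data(dune.env)

# same conversion used at the top of the assignment
dune.env$Moisture <- as.numeric(dune.env$Moisture)

# illustrative choice of explanatory variables; substitute your own hypotheses
dune_cca <- cca(dune ~ Moisture + Management, data = dune.env)

# R2 describes the share of total inertia explained by the constraints
RsquareAdj(dune_cca)

# permutation test of the whole model (all constrained axes together)
anova(dune_cca, permutations = 999)

# marginal test of each explanatory variable given the others
anova(dune_cca, by = "margin", permutations = 999)

# plot of sites, species, and constraints
plot(dune_cca)
```

The marginal tests are permutation based, so the reported p-values vary slightly between runs unless a seed is set.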
55 | 56 | -------------------------------------------------------------------------------- /assignments/smokies_transects.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/assignments/smokies_transects.png -------------------------------------------------------------------------------- /assignments/spatial_models.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | --- 4 | 5 | ## Spatial Modeling Assignment 6 | 7 | ```{r} 8 | library(vegan) 9 | data(BCI) 10 | ## UTM Coordinates (in metres) 11 | BCI_xy = data.frame(x = rep(seq(625754, 626654, by=100), each=5), 12 | y = rep(seq(1011569, 1011969, by=100), len=50)) 13 | ``` 14 | 15 | 1) Examine if there is evidence of spatial dependence in a rare and a common 16 | species in the BCI tree dataset 17 | 18 | 19 | 2) Build two generalized linear models to predict the abundance of the species 20 | *Drypetes standleyi* using the abundance of other tree species in the study site. 21 | Specifically examine the following species as predictor variables: 22 | 23 | ```{r} 24 | sp_ids = c("Cordia.lasiocalyx", "Hirtella.triandra", 25 | "Picramnia.latifolia", "Quassia.amara", 26 | "Tabernaemontana.arborea", "Trattinnickia.aspera", 27 | "Xylopia.macrantha") 28 | ``` 29 | Note renaming the species ids to something a little easier to work with like 30 | "sp_a", "sp_b" will make model construction a little less cumbersome 31 | 32 | * Model 1: only include a single species as a predictor variable 33 | 34 | * Model 2: include all of the species as predictor variables 35 | 36 | With both models examine the spatial dependence of the residuals using the 37 | function `Variogram`. Model the spatial dependence in the residuals using one 38 | of the error structures available. 
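A minimal sketch of this workflow for Model 1, assuming the `nlme` package supplies `gls()` (generalized least squares), `Variogram()`, and the `corExp()` error structure. The assignment names only `Variogram`, so the other choices are assumptions, and the predictor `Cordia.lasiocalyx` and the short names `sp_a` and `sp_b` are illustrative.

```r
library(vegan)
library(nlme)
data(BCI)

# UTM coordinates of the 50 plots, as defined at the top of the assignment
BCI_xy <- data.frame(x = rep(seq(625754, 626654, by = 100), each = 5),
                     y = rep(seq(1011569, 1011969, by = 100), len = 50))

# response, one predictor, and coordinates in a single data frame
dat <- data.frame(sp_a = BCI$Drypetes.standleyi,
                  sp_b = BCI$Cordia.lasiocalyx,
                  BCI_xy)

# model without spatial structure in the errors
mod <- gls(sp_a ~ sp_b, data = dat)

# empirical variogram of the residuals against distance
plot(Variogram(mod, form = ~ x + y))

# same fixed effects with an exponential spatial error structure
mod_exp <- update(mod, correlation = corExp(form = ~ x + y))

# compare coefficients and overall fit
coef(mod)
coef(mod_exp)
anova(mod, mod_exp)
```

Other error structures, such as `corGaus()` or `corSpher()`, can be swapped in through the same `correlation` argument, and Model 2 follows the same pattern with more predictor columns.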
39 | 40 | * Did including the spatial error term have a large impact on the coefficients 41 | of the model? 42 | 43 | * Did including the spatial error terms significantly improve model fit (use 44 | function `anova` to carry out model comparison)? 45 | 46 | * Explain why you did or did not observe a difference in the influence of adding the spatial error term between the two models. 47 | -------------------------------------------------------------------------------- /assignments/univariate_models.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: Univariate models 3 | layout: page 4 | --- 5 | 6 | ## Univariate Assignment 7 | 8 | Read in tree data 9 | 10 | ```r 11 | # read in data directly from website: 12 | # red maple dataset 13 | maple <- read.csv('https://raw.githubusercontent.com/dmcglinn/quant_methods/gh-pages/data/maple.csv', 14 | stringsAsFactors = TRUE) 15 | # Fraser fir dataset 16 | fir <- read.csv('https://raw.githubusercontent.com/dmcglinn/quant_methods/gh-pages/data/fir.csv', 17 | stringsAsFactors = TRUE) 18 | ``` 19 | 20 | Examine these datasets to see how the data are structured (see the function `str`). 21 | 22 | The contents of the metadata file 23 | ([`./data/tree_metadata.txt`](https://raw.githubusercontent.com/dmcglinn/quant_methods/gh-pages/data/tree_metadata.txt)) 24 | are provided below: 25 | 26 | 27 | The dataset includes tree abundances from a subset of a vegetation database of 28 | Great Smoky Mountains National Park (TN, NC). 
29 | 30 | * cover: local abundance measured as estimated horizontal cover (ie, relative area of shadow if sun is directly above) classes 1-10 are: 1=trace, 2=0-1%, 3=1-2%, 4=2-5%, 5=5-10%, 6=10-25%, 7=25-50%, 8=50-75%, 9=75-95%, 10=95-100% 31 | * elev: elevation in meters from a digital elevation model (10 m res) 32 | * tci: topographic convergence index, or site "water potential"; measured as the upslope contributing area divided by the tangent of the slope angle (Beven and Kirkby 1979) 33 | * streamdist: distance of plot from the nearest permanent stream (meters) 34 | * disturb: plot disturbance history (from a Park report); CORPLOG=corporate logging; SETTLE=concentrated settlement, VIRGIN="high in virgin attributes", LT-SEL=light or selective logging 35 | * beers: transformed slope aspect ('heat load index'); 0 is SW (hottest), 2 is NE (coolest) 36 | 37 | ![](../smokies_transects.png) 38 | 39 | The map above shows the regional and local locations of the elevational 40 | transects included in the dataset (from [Fridley 41 | 2009](https://drive.google.com/file/d/1FcO290WW4bdNFAO4nI3OTqkX6Aht8zuD/view)). 42 | 43 | 44 | 1\. Carry out an exploratory analysis using the two tree datasets. Metadata for the 45 | tree study can be found [here](../data/tree_metadata.txt). Specifically, I would 46 | like you to visually examine how the explanatory variables relate to tree cover 47 | for a habitat generalist [*Acer rubrum* (Red maple)](http://www.durhamtownship.com/blog-archives/pix/November1407.jpg) and a 48 | habitat specialist [*Abies fraseri* (Fraser 49 | fir)](https://upload.wikimedia.org/wikipedia/commons/d/d0/Abies_fraseri_Mitchell.jpg). 50 | 51 | After carrying out a visual examination of the correlations with tree cover, go 52 | ahead and build multiple linear regression models and interpret them. 
This 53 | dataset includes both continuous and discrete (i.e., `disturb`) explanatory 54 | variables so we will use both the functions `summary` and 55 | `car::Anova(..., type = 3)` to interpret the model. For example, your code will 56 | likely look something like: 57 | 58 | ```r 59 | #install.packages('car') # if you have not installed before 60 | library(car) # load the library 61 | # build the linear model 62 | my_mod <- lm(cover ~ elev + tci + ... , data = maple) 63 | # where ... represents all of the variables you decide to include in your model 64 | # the function summary() provides a lot of useful information 65 | summary(my_mod) 66 | # to look at the effect of the discrete variable more directly try 67 | Anova(my_mod, type=3) # example of a type 3 anova 68 | ``` 69 | 70 | This will estimate partial effect sizes, variance explained, and p-values for 71 | each explanatory variable included in the model. 72 | 73 | Compare the p-values you observe using the function `Anova` to those generated 74 | using `summary`. 75 | 76 | For each species address the following additional questions: 77 | 78 | * **what patterns did you notice in your visual examination of the data?** 79 | * how well does the exploratory model appear to explain cover? 80 | * which explanatory variables does your model indicate are the most important? 81 | * **are these the same variables that your visual examination uncovered?** 82 | * do model diagnostics indicate any problems with violations of OLS assumptions? 83 | * are you able to explain variance in one species better than another, 84 | why might this be the case (statistically or ecologically)? 85 | 86 | 87 | 88 | 2\. You may have noticed that the variable cover is defined as positive integers 89 | between 1 and 10 and is therefore better treated as a discrete rather than 90 | continuous variable. 
Re-examine your solutions to the question above but from 91 | the perspective of a Generalized Linear Model (GLM) with a Poisson error term 92 | (rather than a Gaussian one as in OLS). The Poisson distribution generates 93 | integers from 0 to positive infinity so this may provide a good first approximation. 94 | Your new model calls will look as follows: 95 | 96 | ```r 97 | acer_poi = glm(cover ~ tci + elev + ... , data = my_data, 98 | family='poisson') 99 | ``` 100 | 101 | For assessing the degree of variation explained you can use a 102 | pseudo-R-squared statistic (note this is just one of many possible): 103 | 104 | ```r 105 | pseudo_r2 = function(glm_mod) { 106 | 1 - glm_mod$deviance / glm_mod$null.deviance 107 | } 108 | ``` 109 | 110 | Compare your qualitative assessment of which variables were most important in each model. 111 | Does it appear that changing the error distribution changed the results much? In what ways? 112 | 113 | 3\. Provide a plain English summary (i.e., no statistics) of what you have 114 | found and what conclusions we can take away from your analysis. 115 | 116 | 4\. (optional) Examine the behavior of the function `stepAIC()` using the 117 | exploratory models developed above. This is a very simple and not very 118 | robust machine learning stepwise algorithm that uses AIC to select a 119 | best model. By default it does a backward selection routine. 120 | 121 | 5\. (optional) Develop a model for the number of species in each site 122 | (i.e., unique plotID). This variable will also be discrete so the Poisson 123 | may be a good starting approximation. Side note: the Poisson 124 | distribution converges asymptotically on the Gaussian distribution as the 125 | mean of the distribution increases. Thus Poisson regression does not differ 126 | much from traditional OLS when means are large. 
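As a rough illustration of question 2, the sketch below fits a Poisson GLM and an OLS model to the same simulated data and computes the pseudo-R-squared defined above. The data frame `sim`, its coefficients, and the variable ranges are invented for illustration; they are not the maple or fir data.

```r
set.seed(42)
n <- 200

# simulated explanatory variables (ranges are hypothetical)
sim <- data.frame(elev = runif(n, 300, 2000),
                  tci = runif(n, 1, 20))

# simulated count response that depends on elevation only
sim$cover <- rpois(n, lambda = exp(0.5 + 0.0008 * sim$elev))

# Poisson GLM and the Gaussian (OLS) counterpart
pois_mod <- glm(cover ~ elev + tci, data = sim, family = 'poisson')
ols_mod <- lm(cover ~ elev + tci, data = sim)

# pseudo-R-squared: proportion of the null deviance explained by the model
pseudo_r2 <- function(glm_mod) {
    1 - glm_mod$deviance / glm_mod$null.deviance
}

pseudo_r2(pois_mod)          # deviance-based fit of the Poisson model
summary(ols_mod)$r.squared   # conventional R-squared of the OLS model

# compare which variables each model flags as important
summary(pois_mod)$coefficients
summary(ols_mod)$coefficients
```

Note that the Poisson coefficients are on the log scale, so their magnitudes are not directly comparable to the OLS coefficients; compare signs and relative importance instead.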
127 | -------------------------------------------------------------------------------- /atom.xml: -------------------------------------------------------------------------------- 1 | --- 2 | layout: null 3 | --- 4 | 5 | 6 | 7 | 8 | {{ site.title }} 9 | 10 | 11 | {{ site.time | date_to_xmlschema }} 12 | {{ site.url }} 13 | 14 | {{ site.author.name }} 15 | {{ site.author.email }} 16 | 17 | 18 | {% for post in site.posts %} 19 | 20 | {{ post.title }} 21 | 22 | {{ post.date | date_to_xmlschema }} 23 | {{ site.url }}{{ post.id }} 24 | {{ post.content | xml_escape }} 25 | 26 | {% endfor %} 27 | 28 | 29 | -------------------------------------------------------------------------------- /data.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Data 4 | --- 5 | 6 | A list of data sources and software for data acquisition 7 | 8 | ### Software 9 | * [EcoData Retriever](http://www.ecodataretriever.org/) 10 | - [R package](https://github.com/ropensci/ecoretriever) 11 | * [rOpenSci](https://ropensci.org/) 12 | - [GitHub](https://github.com/ropensci/) 13 | - [FedData](https://github.com/ropensci/FedData) 14 | 15 | ### Data aggregations 16 | * [Ecological Data Wiki](http://ecologicaldata.org/) 17 | -[R package](https://github.com/ropensci/ecoretriever) 18 | * [DataOne](https://www.dataone.org/data) 19 | * [Geospatial Innovation Facility](http://gif.berkeley.edu/resources/data_subject.html) 20 | * [Ecology Data Papers](http://esapubs.org/archive/search.php) 21 | * [Natureserve](http://www.natureserve.org/conservation-tools/data-maps-tools) 22 | * [The Global Population Dynamics Database](http://www3.imperial.ac.uk/cpb/databases/gpdd) 23 | * [Long-Term Ecological Research](http://www.lternet.edu/) 24 | * [National Geophysical Data Center (NGDC)](http://www.ngdc.noaa.gov/) 25 | * [Kaggle](https://www.kaggle.com/datasets) 26 | * [List of free big data 
links](http://www.datasciencecentral.com/profiles/blogs/great-github-list-of-public-data-sets) 27 | * [WorldClim Climate Data](http://www.worldclim.org/) 28 | - [biologically relevant climate variables (BioClim)](http://www.worldclim.org/bioclim) 29 | * [Climatic Research Unit](http://www.cru.uea.ac.uk/data) 30 | * [Paleoclimatology](https://www.ncdc.noaa.gov/data-access/paleoclimatology-data/datasets) 31 | * [Paleoenvironment](http://www.neotomadb.org/) 32 | - [R package](https://github.com/ropensci/neotoma) 33 | * [NOAA Tide and Currents Data](https://tidesandcurrents.noaa.gov/) 34 | - good source for sea level rise data in the US. 35 | 36 | ### Single data source 37 | 38 | #### Multi-taxon 39 | * [GBIF](http://www.gbif.org/) 40 | - [R package](https://github.com/ropensci/rgbif) 41 | - Note: Dan has cleaner scripts for this resource 42 | * [iNaturalist](https://www.inaturalist.org/) 43 | - [multi-taxon range maps](https://www.inaturalist.org/posts/106918) 44 | 45 | #### Arthropods 46 | * [Caterpillars Count](https://caterpillarscount.unc.edu/dataDownload/) 47 | - [Data Exploration](https://caterpillarscount.unc.edu/pdfs/Data%20Exploration.pdf) 48 | 49 | #### Birds 50 | * [Breeding Bird Survey](http://www.mbr-pwrc.usgs.gov/bbs/) 51 | * [eBird](http://ebird.org/) 52 | - [R package](https://github.com/ropensci/rebird/) 53 | * [Global phylogeny of birds](http://birdtree.org/) 54 | * [Birds of North America](http://bna.birds.cornell.edu/bna) 55 | * [FAA bird strikes](https://wildlife.faa.gov/database.aspx) 56 | - [FAA 1990-2015 all state data dump](https://gist.github.com/dannguyen/4caf05f4a27775e0a550cd0a4f3fa21f) 57 | 58 | #### Plants 59 | * [North American Plants](http://plants.usda.gov/) 60 | * [North American Tree range maps](http://esp.cr.usgs.gov/data/little/) 61 | * [Global Forest Watch](http://www.globalforestwatch.org/map/) 62 | * [Centre du Quebec Forest Plots](https://figshare.com/articles/dataset/Centre_du_Quebec_Forest_Plots/10325681) 63 | * [OpenNahele: the 
open Hawaiian forest plot database](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC6172291/) 64 | 65 | #### Fish 66 | * [Global Fishing Pressure](https://globalfishingwatch.org/datasets-and-code/) 67 | * [Southeast Area Monitoring and Assessment Program](http://www.seamap.org/) 68 | 69 | #### Human health 70 | * [Influenza Research Database](http://www.fludb.org/) 71 | * [Vaccine-Preventable Outbreaks](http://www.cfr.org/interactives/GH_Vaccine_Map/#map) 72 | * [Center for Disease Control](http://www.cdc.gov/DataStatistics/) 73 | * [Human microbiome](https://github.com/biocore/American-Gut) - Easy to access but large size. 74 | 75 | #### Human behavior 76 | * [Religion in the US](http://www.thearda.com/Archive/Files/Descriptions/RCMSCY10.asp) 77 | - [related blog post](http://www.arilamstein.com/blog/2016/01/25/mapping-us-religion-adherence-county-r/) 78 | 79 | -------------------------------------------------------------------------------- /data/BCI_env.csv: -------------------------------------------------------------------------------- 1 | "UTM.EW","UTM.NS","Precipitation","Elevation","Age.cat","Geology" 2 | 625753.967,1011568.985,2530,120,"c3","Tb" 3 | 625753.967,1011668.985,2530,120,"c3","Tb" 4 | 625753.967,1011768.985,2530,120,"c3","Tb" 5 | 625753.967,1011868.985,2530,120,"c3","Tb" 6 | 625753.967,1011968.985,2530,120,"c3","Tb" 7 | 625853.967,1011568.985,2530,120,"c3","Tb" 8 | 625853.967,1011668.985,2530,120,"c3","Tb" 9 | 625853.967,1011768.985,2530,120,"c3","Tb" 10 | 625853.967,1011868.985,2530,120,"c3","Tb" 11 | 625853.967,1011968.985,2530,120,"c3","Tb" 12 | 625953.967,1011568.985,2530,120,"c3","Tb" 13 | 625953.967,1011668.985,2530,120,"c3","Tb" 14 | 625953.967,1011768.985,2530,120,"c3","Tb" 15 | 625953.967,1011868.985,2530,120,"c3","Tb" 16 | 625953.967,1011968.985,2530,120,"c3","Tb" 17 | 626053.967,1011568.985,2530,120,"c3","Tb" 18 | 626053.967,1011668.985,2530,120,"c3","Tb" 19 | 626053.967,1011768.985,2530,120,"c3","Tb" 20 | 
626053.967,1011868.985,2530,120,"c3","Tb" 21 | 626053.967,1011968.985,2530,120,"c3","Tb" 22 | 626153.967,1011568.985,2530,120,"c3","Tb" 23 | 626153.967,1011668.985,2530,120,"c3","Tb" 24 | 626153.967,1011768.985,2530,120,"c3","Tb" 25 | 626153.967,1011868.985,2530,120,"c3","Tb" 26 | 626153.967,1011968.985,2530,120,"c3","Tb" 27 | 626253.967,1011568.985,2530,120,"c3","Tb" 28 | 626253.967,1011668.985,2530,120,"c3","Tb" 29 | 626253.967,1011768.985,2530,120,"c3","Tb" 30 | 626253.967,1011868.985,2530,120,"c3","Tb" 31 | 626253.967,1011968.985,2530,120,"c3","Tb" 32 | 626353.967,1011568.985,2530,120,"c3","Tb" 33 | 626353.967,1011668.985,2530,120,"c3","Tb" 34 | 626353.967,1011768.985,2530,120,"c2","Tb" 35 | 626353.967,1011868.985,2530,120,"c3","Tb" 36 | 626353.967,1011968.985,2530,120,"c3","Tb" 37 | 626453.967,1011568.985,2530,120,"c3","Tb" 38 | 626453.967,1011668.985,2530,120,"c3","Tb" 39 | 626453.967,1011768.985,2530,120,"c3","Tb" 40 | 626453.967,1011868.985,2530,120,"c3","Tb" 41 | 626453.967,1011968.985,2530,120,"c3","Tb" 42 | 626553.967,1011568.985,2530,120,"c3","Tb" 43 | 626553.967,1011668.985,2530,120,"c3","Tb" 44 | 626553.967,1011768.985,2530,120,"c3","Tb" 45 | 626553.967,1011868.985,2530,120,"c3","Tb" 46 | 626553.967,1011968.985,2530,120,"c3","Tb" 47 | 626653.967,1011568.985,2530,120,"c3","Tb" 48 | 626653.967,1011668.985,2530,120,"c3","Tb" 49 | 626653.967,1011768.985,2530,120,"c3","Tb" 50 | 626653.967,1011868.985,2530,120,"c3","Tb" 51 | 626653.967,1011968.985,2530,120,"c3","Tb" 52 | -------------------------------------------------------------------------------- /data/MODISfire2010.zip: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/data/MODISfire2010.zip -------------------------------------------------------------------------------- /data/crabdat(MF).xlsx: -------------------------------------------------------------------------------- 
https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/data/crabdat(MF).xlsx -------------------------------------------------------------------------------- /data/inflammation-01.csv: -------------------------------------------------------------------------------- 1 | 0,0,1,3,1,2,4,7,8,3,3,3,10,5,7,4,7,7,12,18,6,13,11,11,7,7,4,6,8,8,4,4,5,7,3,4,2,3,0,0 2 | 0,1,2,1,2,1,3,2,2,6,10,11,5,9,4,4,7,16,8,6,18,4,12,5,12,7,11,5,11,3,3,5,4,4,5,5,1,1,0,1 3 | 0,1,1,3,3,2,6,2,5,9,5,7,4,5,4,15,5,11,9,10,19,14,12,17,7,12,11,7,4,2,10,5,4,2,2,3,2,2,1,1 4 | 0,0,2,0,4,2,2,1,6,7,10,7,9,13,8,8,15,10,10,7,17,4,4,7,6,15,6,4,9,11,3,5,6,3,3,4,2,3,2,1 5 | 0,1,1,3,3,1,3,5,2,4,4,7,6,5,3,10,8,10,6,17,9,14,9,7,13,9,12,6,7,7,9,6,3,2,2,4,2,0,1,1 6 | 0,0,1,2,2,4,2,1,6,4,7,6,6,9,9,15,4,16,18,12,12,5,18,9,5,3,10,3,12,7,8,4,7,3,5,4,4,3,2,1 7 | 0,0,2,2,4,2,2,5,5,8,6,5,11,9,4,13,5,12,10,6,9,17,15,8,9,3,13,7,8,2,8,8,4,2,3,5,4,1,1,1 8 | 0,0,1,2,3,1,2,3,5,3,7,8,8,5,10,9,15,11,18,19,20,8,5,13,15,10,6,10,6,7,4,9,3,5,2,5,3,2,2,1 9 | 0,0,0,3,1,5,6,5,5,8,2,4,11,12,10,11,9,10,17,11,6,16,12,6,8,14,6,13,10,11,4,6,4,7,6,3,2,1,0,0 10 | 0,1,1,2,1,3,5,3,5,8,6,8,12,5,13,6,13,8,16,8,18,15,16,14,12,7,3,8,9,11,2,5,4,5,1,4,1,2,0,0 11 | 0,1,0,0,4,3,3,5,5,4,5,8,7,10,13,3,7,13,15,18,8,15,15,16,11,14,12,4,10,10,4,3,4,5,5,3,3,2,2,1 12 | 0,1,0,0,3,4,2,7,8,5,2,8,11,5,5,8,14,11,6,11,9,16,18,6,12,5,4,3,5,7,8,3,5,4,5,5,4,0,1,1 13 | 0,0,2,1,4,3,6,4,6,7,9,9,3,11,6,12,4,17,13,15,13,12,8,7,4,7,12,9,5,6,5,4,7,3,5,4,2,3,0,1 14 | 0,0,0,0,1,3,1,6,6,5,5,6,3,6,13,3,10,13,9,16,15,9,11,4,6,4,11,11,12,3,5,8,7,4,6,4,1,3,0,0 15 | 0,1,2,1,1,1,4,1,5,2,3,3,10,7,13,5,7,17,6,9,12,13,10,4,12,4,6,7,6,10,8,2,5,1,3,4,2,0,2,0 16 | 0,1,1,0,1,2,4,3,6,4,7,5,5,7,5,10,7,8,18,17,9,8,12,11,11,11,14,6,11,2,10,9,5,6,5,3,4,2,2,0 17 | 0,0,0,0,2,3,6,5,7,4,3,2,10,7,9,11,12,5,12,9,13,19,14,17,5,13,8,11,5,10,9,8,7,5,3,1,4,0,2,1 18 | 0,0,0,1,2,1,4,3,6,7,4,2,12,6,12,4,14,7,8,14,13,19,6,9,12,6,4,13,6,7,2,3,6,5,4,2,3,0,1,0 19 | 
0,0,2,1,2,5,4,2,7,8,4,7,11,9,8,11,15,17,11,12,7,12,7,6,7,4,13,5,7,6,6,9,2,1,1,2,2,0,1,0 20 | 0,1,2,0,1,4,3,2,2,7,3,3,12,13,11,13,6,5,9,16,9,19,16,11,8,9,14,12,11,9,6,6,6,1,1,2,4,3,1,1 21 | 0,1,1,3,1,4,4,1,8,2,2,3,12,12,10,15,13,6,5,5,18,19,9,6,11,12,7,6,3,6,3,2,4,3,1,5,4,2,2,0 22 | 0,0,2,3,2,3,2,6,3,8,7,4,6,6,9,5,12,12,8,5,12,10,16,7,14,12,5,4,6,9,8,5,6,6,1,4,3,0,2,0 23 | 0,0,0,3,4,5,1,7,7,8,2,5,12,4,10,14,5,5,17,13,16,15,13,6,12,9,10,3,3,7,4,4,8,2,6,5,1,0,1,0 24 | 0,1,1,1,1,3,3,2,6,3,9,7,8,8,4,13,7,14,11,15,14,13,5,13,7,14,9,10,5,11,5,3,5,1,1,4,4,1,2,0 25 | 0,1,1,1,2,3,5,3,6,3,7,10,3,8,12,4,12,9,15,5,17,16,5,10,10,15,7,5,3,11,5,5,6,1,1,1,1,0,2,1 26 | 0,0,2,1,3,3,2,7,4,4,3,8,12,9,12,9,5,16,8,17,7,11,14,7,13,11,7,12,12,7,8,5,7,2,2,4,1,1,1,0 27 | 0,0,1,2,4,2,2,3,5,7,10,5,5,12,3,13,4,13,7,15,9,12,18,14,16,12,3,11,3,2,7,4,8,2,2,1,3,0,1,1 28 | 0,0,1,1,1,5,1,5,2,2,4,10,4,8,14,6,15,6,12,15,15,13,7,17,4,5,11,4,8,7,9,4,5,3,2,5,4,3,2,1 29 | 0,0,2,2,3,4,6,3,7,6,4,5,8,4,7,7,6,11,12,19,20,18,9,5,4,7,14,8,4,3,7,7,8,3,5,4,1,3,1,0 30 | 0,0,0,1,4,4,6,3,8,6,4,10,12,3,3,6,8,7,17,16,14,15,17,4,14,13,4,4,12,11,6,9,5,5,2,5,2,1,0,1 31 | 0,1,1,0,3,2,4,6,8,6,2,3,11,3,14,14,12,8,8,16,13,7,6,9,15,7,6,4,10,8,10,4,2,6,5,5,2,3,2,1 32 | 0,0,2,3,3,4,5,3,6,7,10,5,10,13,14,3,8,10,9,9,19,15,15,6,8,8,11,5,5,7,3,6,6,4,5,2,2,3,0,0 33 | 0,1,2,2,2,3,6,6,6,7,6,3,11,12,13,15,15,10,14,11,11,8,6,12,10,5,12,7,7,11,5,8,5,2,5,5,2,0,2,1 34 | 0,0,2,1,3,5,6,7,5,8,9,3,12,10,12,4,12,9,13,10,10,6,10,11,4,15,13,7,3,4,2,9,7,2,4,2,1,2,1,1 35 | 0,0,1,2,4,1,5,5,2,3,4,8,8,12,5,15,9,17,7,19,14,18,12,17,14,4,13,13,8,11,5,6,6,2,3,5,2,1,1,1 36 | 0,0,0,3,1,3,6,4,3,4,8,3,4,8,3,11,5,7,10,5,15,9,16,17,16,3,8,9,8,3,3,9,5,1,6,5,4,2,2,0 37 | 0,1,2,2,2,5,5,1,4,6,3,6,5,9,6,7,4,7,16,7,16,13,9,16,12,6,7,9,10,3,6,4,5,4,6,3,4,3,2,1 38 | 0,1,1,2,3,1,5,1,2,2,5,7,6,6,5,10,6,7,17,13,15,16,17,14,4,4,10,10,10,11,9,9,5,4,4,2,1,0,1,0 39 | 0,1,0,3,2,4,1,1,5,9,10,7,12,10,9,15,12,13,13,6,19,9,10,6,13,5,13,6,7,2,5,5,2,1,1,1,1,3,0,1 40 | 
0,1,1,3,1,1,5,5,3,7,2,2,3,12,4,6,8,15,16,16,15,4,14,5,13,10,7,10,6,3,2,3,6,3,3,5,4,3,2,1 41 | 0,0,0,2,2,1,3,4,5,5,6,5,5,12,13,5,7,5,11,15,18,7,9,10,14,12,11,9,10,3,2,9,6,2,2,5,3,0,0,1 42 | 0,0,1,3,3,1,2,1,8,9,2,8,10,3,8,6,10,13,11,17,19,6,4,11,6,12,7,5,5,4,4,8,2,6,6,4,2,2,0,0 43 | 0,1,1,3,4,5,2,1,3,7,9,6,10,5,8,15,11,12,15,6,12,16,6,4,14,3,12,9,6,11,5,8,5,5,6,1,2,1,2,0 44 | 0,0,1,3,1,4,3,6,7,8,5,7,11,3,6,11,6,10,6,19,18,14,6,10,7,9,8,5,8,3,10,2,5,1,5,4,2,1,0,1 45 | 0,1,1,3,3,4,4,6,3,4,9,9,7,6,8,15,12,15,6,11,6,18,5,14,15,12,9,8,3,6,10,6,8,7,2,5,4,3,1,1 46 | 0,1,2,2,4,3,1,4,8,9,5,10,10,3,4,6,7,11,16,6,14,9,11,10,10,7,10,8,8,4,5,8,4,4,5,2,4,1,1,0 47 | 0,0,2,3,4,5,4,6,2,9,7,4,9,10,8,11,16,12,15,17,19,10,18,13,15,11,8,4,7,11,6,7,6,5,1,3,1,0,0,0 48 | 0,1,1,3,1,4,6,2,8,2,10,3,11,9,13,15,5,15,6,10,10,5,14,15,12,7,4,5,11,4,6,9,5,6,1,1,2,1,2,1 49 | 0,0,1,3,2,5,1,2,7,6,6,3,12,9,4,14,4,6,12,9,12,7,11,7,16,8,13,6,7,6,10,7,6,3,1,5,4,3,0,0 50 | 0,0,1,2,3,4,5,7,5,4,10,5,12,12,5,4,7,9,18,16,16,10,15,15,10,4,3,7,5,9,4,6,2,4,1,4,2,2,2,1 51 | 0,1,2,1,1,3,5,3,6,3,10,10,11,10,13,10,13,6,6,14,5,4,5,5,9,4,12,7,7,4,7,9,3,3,6,3,4,1,2,0 52 | 0,1,2,2,3,5,2,4,5,6,8,3,5,4,3,15,15,12,16,7,20,15,12,8,9,6,12,5,8,3,8,5,4,1,3,2,1,3,1,0 53 | 0,0,0,2,4,4,5,3,3,3,10,4,4,4,14,11,15,13,10,14,11,17,9,11,11,7,10,12,10,10,10,8,7,5,2,2,4,1,2,1 54 | 0,0,2,1,1,4,4,7,2,9,4,10,12,7,6,6,11,12,9,15,15,6,6,13,5,12,9,6,4,7,7,6,5,4,1,4,2,2,2,1 55 | 0,1,2,1,1,4,5,4,4,5,9,7,10,3,13,13,8,9,17,16,16,15,12,13,5,12,10,9,11,9,4,5,5,2,2,5,1,0,0,1 56 | 0,0,1,3,2,3,6,4,5,7,2,4,11,11,3,8,8,16,5,13,16,5,8,8,6,9,10,10,9,3,3,5,3,5,4,5,3,3,0,1 57 | 0,1,1,2,2,5,1,7,4,2,5,5,4,6,6,4,16,11,14,16,14,14,8,17,4,14,13,7,6,3,7,7,5,6,3,4,2,2,1,1 58 | 0,1,1,1,4,1,6,4,6,3,6,5,6,4,14,13,13,9,12,19,9,10,15,10,9,10,10,7,5,6,8,6,6,4,3,5,2,1,1,1 59 | 0,0,0,1,4,5,6,3,8,7,9,10,8,6,5,12,15,5,10,5,8,13,18,17,14,9,13,4,10,11,10,8,8,6,5,5,2,0,2,0 60 | 0,0,1,0,3,2,5,4,8,2,9,3,3,10,12,9,14,11,13,8,6,18,11,9,13,11,8,5,5,2,8,5,3,5,4,1,3,1,1,0 61 | 
-------------------------------------------------------------------------------- /data/milkweeds.csv: -------------------------------------------------------------------------------- 1 | samp_id,trt,plant_ht_cm,fruit_mass_mg 2 | 1,fertilized,30.32,19.17 3 | 2,unfertilized,23.07,15.06 4 | 3,fertilized,21.90,16.09 5 | 4,unfertilized,32.32,17.19 6 | 5,unfertilized,30.19,15.88 7 | 6,fertilized,19.63,18.91 8 | 7,unfertilized,15.32,13.43 9 | 8,fertilized,25.26,15.54 10 | 9,fertilized,25.42,22.90 11 | 10,fertilized,22.33,15.97 12 | 11,fertilized,36.81,24.80 13 | 12,unfertilized,23.76,17.18 14 | 13,fertilized,22.75,22.00 15 | 14,fertilized,32.29,19.31 16 | 15,unfertilized,29.27,16.14 17 | 16,fertilized,24.16,17.06 18 | 17,unfertilized,22.62,15.04 19 | 18,unfertilized,18.97,16.70 20 | 19,fertilized,25.32,20.15 21 | 20,unfertilized,27.58,17.97 22 | 21,unfertilized,17.92,16.09 23 | 22,unfertilized,25.08,16.71 24 | 23,fertilized,28.38,19.68 25 | 24,fertilized,30.80,21.19 26 | 25,unfertilized,27.11,18.46 27 | 26,unfertilized,16.63,13.35 28 | 27,fertilized,28.26,24.34 29 | 28,fertilized,37.67,23.41 30 | 29,unfertilized,25.11,20.72 31 | 30,unfertilized,21.72,12.61 32 | 31,fertilized,25.10,22.23 33 | 32,fertilized,31.17,16.74 34 | 33,fertilized,21.08,19.69 35 | 34,unfertilized,26.69,17.69 36 | 35,fertilized,26.04,20.89 37 | 36,unfertilized,32.68,19.59 38 | 37,unfertilized,31.71,18.29 39 | 38,unfertilized,30.93,19.17 40 | 39,unfertilized,29.05,16.92 41 | 40,fertilized,31.95,21.93 -------------------------------------------------------------------------------- /data/music genres ranking (Responses) - Form Responses 1.csv: -------------------------------------------------------------------------------- 1 | Timestamp,rock,pop,country,indie,rap,taylor swift,age,latitude of birthplace,gender, 2 | 2/6/2024 8:22:36,5,3,4,1,1,1,21,2,male, 3 | 2/6/2024 8:22:41,4,5,1,5,3,5,26,1,female, 4 | 2/6/2024 8:22:42,5,5,4,5,3,5,23,1,female, 5 | 2/6/2024 8:22:43,5,3,1,4,4,1,21,2,female,Option 1 6 | 
2/6/2024 8:23:00,4,5,1,5,1,1,24,5,male, 7 | 2/6/2024 8:23:02,4,1,1,4,5,1,22,1,female,Option 1 8 | 2/6/2024 8:23:04,5,5,5,2,2,3,59,3,male, 9 | 2/6/2024 8:23:06,4,5,5,5,2,3,21,5,female, 10 | 2/6/2024 8:23:12,5,3,4,5,1,1,23,2,female, 11 | 2/6/2024 8:23:16,4,4,3,3,5,3,22,5,male, 12 | 2/6/2024 8:23:20,1,4,5,5,3,4,22,3,female, 13 | 2/6/2024 8:23:26,3,3,1,5,1,5,21,5,male,Option 1 14 | 2/6/2024 8:23:27,1,4,5,5,3,5,23,3,female, 15 | 2/6/2024 8:23:29,5,5,2,5,3,3,22,1,female, 16 | 2/6/2024 8:23:45,5,3,4,5,4,1,23,5,female, 17 | 2/6/2024 8:23:54,2,5,3,4,5,1,22,4,female, -------------------------------------------------------------------------------- /data/tree_metadata.txt: -------------------------------------------------------------------------------- 1 | The dataset includes tree abundances from a subset of a vegetation database of Great Smoky Mountains National Park (TN, NC). 2 | 3 | plotID: unique code for each spatial unit (note some sampled more than once) 4 | date: when species occurrence recorded 5 | plotsize: size of quadrat in m2 6 | spcode: unique 7-letter code for each species 7 | species: species name 8 | cover: local abundance measured as estimated horizontal cover (ie, relative area of shadow if sun is directly above) classes 1-10 are: 1=trace, 2=0-1%, 3=1-2%, 4=2-5%, 5=5-10%, 6=10-25%, 7=25-50%, 8=50-75%, 9=75-95%, 10=95-100% 9 | utme: plot UTM Easting, zone 17 (NAD27 Datum) 10 | utmn: plot UTM Northing, zone 17 (NAD27 Datum) 11 | elev: elevation in meters from a digital elevation model (10 m res) 12 | tci: topographic convergence index, or site "water potential"; measured as the upslope contributing area divided by the tangent of the slope angle (Beven and Kirkby 1979) 13 | streamdist: distance of plot from the nearest permanent stream (meters) 14 | disturb: plot disturbance history (from a Park report); CORPLOG=corporate logging; SETTLE=concentrated settlement, VIRGIN="high in virgin attributes", LT-SEL=light or selective logging 15 | beers: transformed slope 
aspect ('heat load index'); 0 is SW (hottest), 2 is NE (coolest) -------------------------------------------------------------------------------- /google671b87772b9c5779.html: -------------------------------------------------------------------------------- 1 | google-site-verification: google671b87772b9c5779.html -------------------------------------------------------------------------------- /images/Thumbs.db: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/images/Thumbs.db -------------------------------------------------------------------------------- /index.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Home 4 | --- 5 | 6 | This site contains teaching materials for applied quantitative methods. The course is aimed at developing a set of core skills using the R programming 7 | language, and providing students an opportunity to apply these tools on real datasets and towards their own projects. 8 | 9 | * [syllabus](./syllabus) 10 | * [software](./software) 11 | * [lessons](./lessons) 12 | * [assignments](./assignments) 13 | * [online resources](./resources) 14 | 15 | ## License 16 | 17 | This course material is open source (MIT) and open access (CC-BY) licensed, so please feel free to use these materials in your own classes and contribute changes if you see things that need improving. The more explicit license can 18 | be found [here](./LICENSE) 19 | 20 | ## Acknowledgements 21 | 22 | I would like to thank the following people for helping provide suggestions about how to best implement this course. 
23 | 24 | * Ethan White 25 | * Greg Wilson 26 | * David LeBauer 27 | 28 | -------------------------------------------------------------------------------- /lessons.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Lessons 4 | --- 5 | Quantitative topics covered in this course include: 6 | 7 | * Background Information 8 | - motivation 9 | - before we start 10 | - R markdown setup and usage 11 | - Projects 12 | - Using Rstudio Projects (from Posit) 13 | - Why you should use Rstudio Projects (from blog of David Chen) 14 | * Introduction to R 15 | - basic 16 | - data structures (from *Advanced R*) 17 | - intermediate 18 | - Best Practices for Writing R Code (from Soft. Carp.) 19 | - R style guide (from *Advanced R*) 20 | - Optional Advanced Material 21 | - Object oriented programming, S3 objects (from *Advanced R*) 22 | - Debugging your code (from *Advanced R*) 23 | * [Introduction to Version Control and the Terminal](./git_introduction) 24 | - [Guide to GitHub's issue system (from Github)](https://guides.github.com/features/issues/) 25 | * Statistics Review Material 26 | - [Statistics Primer](./stats_primer.pdf) 27 | * Univariate Models 28 | - OLS modeling and model selection 29 | - Mini-sub lessons 30 | - Data exploration: the importance of plotting 31 | - Paired and block designs 32 | - Partial residual plots 33 | - Standardized beta coefficients 34 | - Type I/II/III Anova explained (from r-bloggers) 35 | * Multivariate Models 36 | - Multivariate ordination models 37 | * Spatial Models 38 | - Spatial pattern detection and modeling 39 | * GIS manipulations 40 | - Shapefiles and rasters 41 | - More Examples with Maps in R 42 | * Simulations and Null models 43 | - Simulations in R 44 | - Zoom recorded lecture 45 | -------------------------------------------------------------------------------- /lessons/00-before-we-start.Rmd: 
-------------------------------------------------------------------------------- 1 | --- 2 | title: Before we start 3 | author: Dan McGlinn and other Data Carpentry contributors 4 | date: "January 11, 2016" 5 | output: html_document 6 | --- 7 | 8 | > ## Learning Objectives 9 | > 10 | > * Articulating motivations for this lesson 11 | > * Introduce participants to the RStudio interface 12 | > * Set up participants to have a working directory with a `data/` folder inside 13 | > * Introduce R syntax 14 | > * Point to relevant information on how to get help, and understand how to ask well formulated questions 15 | 16 | ------------ 17 | 18 | # Before we get started 19 | 20 | * Start RStudio (presentation of RStudio -below- should happen here) 21 | * Under the `File` menu, click on `New project`, choose `New directory`, then 22 | `Empty project` 23 | * Enter a name for this new folder, and choose a convenient location for 24 | it. This will be your **working directory** for the rest of the day 25 | (e.g., `~/quant_methods`) 26 | * Click on "Create project" 27 | * Under the `Files` tab on the right of the screen, click on `New Folder` and 28 | create a folder named `scripts` within your newly created working directory. 29 | (e.g., `~/quant_methods/scripts`) 30 | * Create a new R script (File > New File > R script) and save it in your scripts 31 | directory (e.g. `./scripts/data-carpentry-script.R`) 32 | 33 | Your working directory should now look like this: 34 | 35 | ![How it should look like at the beginning of this lesson](figures/r_starting_how_it_should_look.png) 36 | 37 | # Presentation of RStudio 38 | 39 | Let's start by learning about our tool. 40 | 41 | * Console, Scripts, Environments, Plots 42 | * Code and workflow are more reproducible if we can document everything that we 43 | do. 44 | * Our end goal is not just to "do stuff" but to do it in a way that anyone can 45 | easily and exactly replicate our workflow and results. 
46 | 47 | A good reference for the different components of the RStudio IDE is this reference card: 48 | 49 | 50 | # Interacting with R 51 | 52 | There are two main ways of interacting with R: using the console or using 53 | script files (plain text files that contain your code). 54 | 55 | The console window (in RStudio, the bottom left panel) is the place where R is 56 | waiting for you to tell it what to do, and where it will show the results of a 57 | command. You can type commands directly into the console, but they will be 58 | forgotten when you close the session (note that they will be saved in RStudio's 59 | history panel, but this isn't ideal). It is better to enter the commands in the 60 | script editor, and save the script. This way, you have a complete record of what 61 | you did, you can easily show others how you did it and you can do it again later 62 | on if needed. You can copy-paste into the R console, but the RStudio script 63 | editor allows you to 'send' the current line or the currently selected text to 64 | the R console using the `Ctrl-Enter` shortcut. 65 | 66 | At some point in your analysis you may want to check the content of a variable or 67 | the structure of an object, without necessarily keeping a record of it in your 68 | script. You can type these commands directly in the console. RStudio provides 69 | the `Ctrl-1` and `Ctrl-2` shortcuts, which allow you to jump between the script and the 70 | console windows. 71 | 72 | If R is ready to accept commands, the R console shows a `>` prompt. If it 73 | receives a command (by typing, copy-pasting or sent from the script editor using 74 | `Ctrl-Enter`), R will try to execute it, and when ready, show the results and 75 | come back with a new `>`-prompt to wait for new commands. 76 | 77 | If R is still waiting for you to enter more data because it isn't complete yet, 78 | the console will show a `+` prompt. It means that you haven't finished entering 79 | a complete command. 
This is usually because you have not 'closed' a parenthesis or 80 | quotation. If you're in RStudio and this happens, click inside the console 81 | window and press `Esc`; this should help you out of trouble. 82 | 83 | # Basics of R 84 | 85 | R is a versatile, open source programming/scripting language that's useful both 86 | for statistics and for data science. It was inspired by the programming language S. 87 | 88 | * Open source software under GPL. 89 | * Comparable, if not superior, to commercial alternatives. R has over 7,000 90 | user contributed packages at this time. It's widely used both in academia and 91 | industry. 92 | * Available on all platforms. 93 | * Not just for statistics, but also general purpose programming. 94 | * For people who have experience in programming: R is both an object-oriented 95 | and a so-called [functional language](http://adv-r.had.co.nz/Functional-programming.html) 96 | * Large and growing community of peers. 97 | * Additional [motivations](../motivations) 98 | 99 | ## Commenting 100 | 101 | Use `#` signs to comment. Comment liberally in your R scripts. Anything to the 102 | right of a `#` is ignored by R. 103 | 104 | ## Assignment operator 105 | 106 | `<-` is the assignment operator. It assigns values on the right to objects on 107 | the left. So, after executing `x <- 3`, the value of `x` is `3`. The arrow can 108 | be read as 3 **goes into** `x`. You can also use `=` for assignments. Most R 109 | users prefer to use the `<-` operator but I personally prefer `=` because it has 110 | one fewer keystroke and is more aesthetically pleasing. The choice is ultimately 111 | up to the user. 112 | 113 | In RStudio, typing `Alt + -` (push `Alt`, the key next to your space bar at the 114 | same time as the `-` key) will write ` <- ` in a single keystroke. 
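
To make the two assignment styles concrete, here is a minimal sketch. The object names (`x`, `y`, `z`, `m1`, `m2`) are made up for illustration. It also shows the one place where `=` and `<-` behave differently: inside a function call, `=` names an argument rather than creating an object.

```r
x <- 3      # the arrow assigns the value 3 to the object x
y = 3       # at the top level of a script, `=` has the same effect
identical(x, y)

# Inside a function call, `=` matches an argument by name; it does not
# change the object x in your workspace.
m1 <- mean(x = c(1, 2, 9))

# Using `<-` inside a call does create an object (here z) as a side effect,
# which is legal but usually considered hard to read.
m2 <- mean(z <- c(1, 2, 9))

x    # still 3
z    # the vector that was passed to mean()
```

Whichever operator you choose, using it consistently within a script makes the code easier for others to follow.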
115 | 116 | ## Organizing your working directory 117 | 118 | You should separate the original data (raw data) from intermediate datasets that 119 | you may create for the needs of a particular analysis. For instance, you may want 120 | to create a `data/` directory within your working directory that stores the raw 121 | data, and have a `data_output/` directory for intermediate datasets and a 122 | `figure_output/` directory for the plots you will generate. 123 | 124 | ## Seeking help 125 | 126 | ### I know the name of the function I want to use, but I'm not sure how to use it 127 | 128 | If you need help with a specific function, let's say `barplot()`, you can type: 129 | 130 | ```{r, eval=FALSE, purl=FALSE} 131 | ?barplot 132 | ``` 133 | 134 | If you just need to remind yourself of the names of the arguments, you can use: 135 | 136 | ```{r, eval=FALSE, purl=FALSE} 137 | args(lm) 138 | ``` 139 | 140 | RStudio provides quick argument documentation using the tab key. Simply write a 141 | function's name followed by the opening parenthesis and hit the `Tab` key to get a 142 | list of arguments and a short description of each one. 143 | 144 | ### I want to use a function that does X, there must be a function for it but I don't know which one... 145 | 146 | If you are looking for a function to do a particular task, you can use the 147 | `help.search()` function, which is called by the double question mark `??`. 148 | However, this only looks through the installed packages for help pages with a match to your search request. 149 | 150 | ```{r, eval=FALSE, purl=FALSE} 151 | ??kruskal 152 | ``` 153 | 154 | There is an extensive list of R cheatsheets and reference cards. 
155 | Here is just a short list of useful ones: 156 | 157 | * [base-r.pdf](http://github.com/rstudio/cheatsheets/raw/master/base-r.pdf) 158 | * [short card](https://cran.r-project.org/doc/contrib/Short-refcard.pdf) 159 | * [Rstudio IDE](https://github.com/rstudio/cheatsheets/raw/master/rstudio-ide.pdf) 160 | 161 | 162 | If you can't find what you are looking for, you can use the 163 | [rdocumentation.org](http://www.rdocumentation.org) website, which searches through 164 | the help files across all packages available. 165 | 166 | ### I am stuck... I get an error message that I don't understand 167 | 168 | Start by googling the error message. However, this doesn't always work very well 169 | because often, package developers rely on the error catching provided by R. You 170 | end up with general error messages that might not be very helpful for diagnosing a 171 | problem (e.g. "subscript out of bounds"). 172 | 173 | You should also check Stack Overflow. Search using the `[r]` tag. Most 174 | questions have already been answered, but the challenge is to use the right 175 | words in the search to find the answers: 176 | [http://stackoverflow.com/questions/tagged/r](http://stackoverflow.com/questions/tagged/r) 177 | 178 | The [Introduction to R](http://cran.r-project.org/doc/manuals/R-intro.pdf) can 179 | also be dense for people with little programming experience but it is a good 180 | place to understand the underpinnings of the R language. 181 | 182 | The [R FAQ](http://cran.r-project.org/doc/FAQ/R-FAQ.html) is dense and technical 183 | but it is full of useful information. 184 | 185 | ### Asking for help 186 | 187 | The key to getting help from someone is for them to grasp your problem rapidly. You 188 | should make it as easy as possible to pinpoint where the issue might be. 189 | 190 | Try to use the correct words to describe your problem. For instance, a package 191 | is not the same thing as a library. 
Most people will understand what you meant, 192 | but others have really strong feelings about the difference in meaning. The key 193 | point is that it can make things confusing for people trying to help you. Be as 194 | precise as possible when describing your problem. 195 | 196 | If possible, try to reduce what doesn't work to a simple reproducible 197 | example. If you can reproduce the problem using a very small `data.frame` 198 | instead of your 50,000 rows and 10,000 columns one, provide the small one with 199 | the description of your problem. When appropriate, try to generalize what you 200 | are doing so even people who are not in your field can understand the question. 201 | 202 | To share an object with someone else, if it's relatively small, you can use the 203 | function `dput()`. It will output R code that can be used to recreate the exact same 204 | object as the one in memory: 205 | 206 | ```{r, results='show', purl=FALSE} 207 | dput(head(iris)) # iris is an example data.frame that comes with R 208 | ``` 209 | 210 | If the object is larger, provide the raw file (i.e., your CSV file) with 211 | your script up to the point of the error (and after removing everything that is 212 | not relevant to your issue). Alternatively, in particular if your question is 213 | not related to a `data.frame`, you can save any R object to a file: 214 | 215 | ```{r, eval=FALSE, purl=FALSE} 216 | saveRDS(iris, file="/tmp/iris.rds") 217 | ``` 218 | 219 | The content of this file is however not human readable and cannot be posted 220 | directly on stackoverflow. 
It can however be sent to someone by email who can read 221 | it with this command: 222 | 223 | ```{r, eval=FALSE, purl=FALSE} 224 | some_data <- readRDS(file="~/Downloads/iris.rds") 225 | ``` 226 | 227 | Last, but certainly not least, **always include the output of `sessionInfo()`** 228 | as it provides critical information about your platform, the versions of R and 229 | the packages that you are using, and other information that can be very helpful 230 | to understand your problem. 231 | 232 | ```{r, results='show', purl=FALSE} 233 | sessionInfo() 234 | ``` 235 | 236 | ### Where to ask for help? 237 | 238 | * Your friendly colleagues: if you know someone with more experience than you, 239 | they might be able and willing to help you. 240 | * StackOverflow: if your question hasn't been answered before and is well 241 | crafted, chances are you will get an answer in less than 5 min. 242 | * The [R-help](https://stat.ethz.ch/mailman/listinfo/r-help): it is read by a 243 | lot of people (including most of the R core team), a lot of people post to it, 244 | but the tone can be pretty dry, and it is not always very welcoming to new 245 | users. If your question is valid, you are likely to get an answer very fast 246 | but don't expect that it will come with smiley faces. Also, here more than 247 | everywhere else, be sure to use correct vocabulary (otherwise you might get an 248 | answer pointing to the misuse of your words rather than answering your 249 | question). You will also have more success if your question is about a base 250 | function rather than a specific package. 251 | * If your question is about a specific package, see if there is a mailing list 252 | for it. Usually it's included in the DESCRIPTION file of the package that can 253 | be accessed using `packageDescription("name-of-package")`. You may also want 254 | to try to email the author of the package directly. 
255 | * There are also some topic-specific mailing lists (GIS, phylogenetics, etc.); 256 | the complete list is [here](http://www.r-project.org/mail.html). 257 | 258 | ### More resources 259 | 260 | * The [Posting Guide](http://www.r-project.org/posting-guide.html) for the R 261 | mailing lists. 262 | * [How to ask for R help](http://blog.revolutionanalytics.com/2014/01/how-to-ask-for-r-help.html) 263 | useful guidelines 264 | -------------------------------------------------------------------------------- /lessons/R_intermediate.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "R intermediate" 3 | author: "Dan McGlinn" 4 | date: "Thursday, January 29, 2015" 5 | output: html_document 6 | --- 7 | The goal of this lesson is to increase students' 8 | familiarity with the R programming language by discussing 9 | how to control program flow and use functions. 10 | 11 | ## Lesson Outline 12 | * Programming for repetitive tasks 13 | * For loops 14 | - Capturing output 15 | - Make loops general 16 | * If statements 17 | - Else statements 18 | - Nested operations 19 | - Else if statements 20 | * Define Functions 21 | * Debug Functions 22 | * Document Functions 23 | 24 | ```{r setup, echo=FALSE} 25 | # specify that the root directory should be the parent directory of where this 26 | # script is stored. This is because this .Rmd file is in ./quant_methods/lessons 27 | # and the data file that will be read in is located in ./quant_methods/data . 28 | # If your data file is located in the same directory as or in a subdirectory of 29 | # your .Rmd file then you don't need to specify this. 
30 | knitr::opts_knit$set(root.dir = '../') 31 | ``` 32 | 33 | ```{r import} 34 | # read in some data to work with 35 | dat <- read.csv('./data/tgpp.csv') 36 | ``` 37 | 38 | ## # Programming for repetitive tasks 39 | 40 | Frequently in programming you have to carry out repetitive tasks. 41 | For example, you might want to know the class of each column of a data.frame. 42 | You could simply write this as 43 | 44 | ```{r wrongway} 45 | class(dat[,1]) 46 | class(dat[,2]) 47 | class(dat[,3]) 48 | ``` 49 | 50 | and so on, but this is not only laborious but also highly prone to typos and thus 51 | errors. 52 | 53 | Based on the last HW assignment we know that the best approach to carrying out 54 | this repetitive task is to use the `sapply()` function: 55 | 56 | ```{r apply} 57 | sapply(dat, class) 58 | ``` 59 | 60 | However, it is very common that we need a more general approach to carrying out 61 | a repetitive task than simply applying a single function (in the example above, 62 | applying the function class() to each column of dat). 63 | 64 | ## # For Loop 65 | For loops are a common feature of almost all programming languages. They are 66 | typically not the most efficient way to carry out a repetitive or iterative 67 | task; however, they are frequently easy to understand and relatively easy to 68 | modify to include additional tasks. 69 | To use a for loop we need to create an iterator that will provide an index for 70 | the operation we would like to repeat. The iterator can be any variable name you 71 | wish, typically i, j, or k and so forth, but it could just as easily be "index" or 72 | "my_iterator", although that is not recommended. 73 | 74 | ```{r first loop} 75 | #In the example below we will name the iterator "i" 76 | for (i in 1:11) { 77 | print(class(dat[ , i])) 78 | } 79 | ``` 80 | 81 | To break this example down we can see that 82 | 83 | ```{r} 84 | 1:11 85 | ``` 86 | 87 | Generates a vector of numbers from 1 to 11. 
88 | The portion of code `for (i in 1:11)` sets the value of `i` to each value of 89 | this vector in turn as the for loop completes its tasks. 90 | 91 | Note the usage of `i in 1:11`. This is somewhat unique to R because many other 92 | languages use `i = 1:11`, and thus this is a frequent error for many students. 93 | Again, I just want to emphasize that we could have used a different name for our index, 94 | something like `j` or `my_index`; it did not have to be `i`. This is simply the 95 | most common choice of an index in programming, as in algebra. 96 | 97 | Also here it is important to note the syntax and code style of the for loop: 98 | ```{r, eval = FALSE} 99 | for (i in 1:11) { 100 | ... # note this line is 4 spaces from the left margin, 2 spaces is also common, 0 spaces is bad form 101 | } 102 | ``` 103 | 104 | Above the `...` just represents anything you want the loop to do in each iteration 105 | of the loop. This loop will iterate 11 times as `i` counts from 1 to 11. Note the 106 | spacing of the code and the placement of the curly brackets to start and stop the 107 | for loop. Note: it is possible to use different spacing (but not recommended): 108 | 109 | ```{r, eval = FALSE} 110 | # cramped example 111 | for(i in 1:11){print(class(dat[,i]))} 112 | ``` 113 | 114 | **Question**: Why do you think the code style in the above chunk is not generally recommended? 115 | 116 | ### #Capturing output 117 | Right now our for loop just prints output to the console, but often we want 118 | to capture that output and do something with it. To do this we will first have 119 | to define an empty object, which we'll call `dat_classes` 120 | 121 | ```{r} 122 | dat_classes <- NULL 123 | ``` 124 | 125 | Once the empty object is initialized we can simply index it; R is smart enough to 126 | convert this object to a vector of arbitrary size on the fly. This is not a wise 127 | move if memory or time is limited, but it makes for easy programming. 
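
Before applying this to `dat`, here is a minimal sketch of what "growing on the fly" means. It uses a toy object called `holder`, a name made up for this illustration, rather than the lesson's data.

```r
holder <- NULL          # an empty object with length 0
length(holder)

# Assigning to position 3 makes R grow the object to length 3;
# the positions that were skipped are filled with NA.
holder[3] <- "c"
holder
length(holder)
class(holder)           # the object took on the type of the value assigned
```

Each time the object grows, R may have to copy it, which is why this convenience can become slow in long loops.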
128 | 129 | ```{r} 130 | for (i in 1:11) { 131 | dat_classes[i] <- class(dat[ , i]) 132 | } 133 | 134 | dat_classes 135 | 136 | ## alternatively you can concatenate 137 | # but the first approach is a bit cleaner 138 | dat_classes <- NULL 139 | for (i in 1:11) { 140 | dat_classes <- c(dat_classes, class(dat[ , i])) 141 | } 142 | 143 | ## the gold star approach to this is to set aside exactly how much 144 | # memory you will need in your holder variable. In our case this is a 145 | # vector of strings 11 elements long so we can use: 146 | 147 | dat_classes <- vector("character", 11) 148 | for (i in 1:11) { 149 | dat_classes[i] <- class(dat[ , i]) 150 | } 151 | ``` 152 | 153 | The three approaches above all give the same results but the third approach is 154 | typically considered best practice and the first approach is probably the 155 | easiest to read. We'll use the first approach for the remainder of this lesson. 156 | 157 | ### #Make your loops general 158 | You don't want your loop to break if the number of columns of dat changes, so you need 159 | to write the loop such that it will always count to the appropriate number of 160 | columns in dat 161 | 162 | ```{r} 163 | dat_classes <- NULL 164 | for (i in 1:ncol(dat)) { 165 | dat_classes[i] <- class(dat[ , i]) 166 | } 167 | ``` 168 | 169 | ## # If statements 170 | If statements, like for loops, are a staple of programming. They allow 171 | the user to specify that a particular task be executed based on a logical 172 | TRUE / FALSE test. 
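
As a minimal sketch of the idea outside of a loop, consider a single logical test. The names `n_plots` and `size_label` are hypothetical and are not part of the lesson data.

```r
n_plots <- 12
size_label <- "unset"

# The code inside the braces only runs when the test evaluates to TRUE
if (n_plots > 10) {
    size_label <- "large"
}

size_label
```

If the test had been FALSE, `size_label` would have kept its starting value because nothing inside the braces would have run.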
173 | 174 | ```{r if} 175 | dat_classes <- NULL 176 | for (i in 1:ncol(dat)) { 177 | dat_classes[i] <- class(dat[ , i]) 178 | if(dat_classes[i] == "integer") { 179 | print('sweet!') 180 | } 181 | } 182 | ``` 183 | 184 | Note that because the body of this if statement is only a single line, it is not required 185 | that we include the brackets {}; however, they do make it more explicit to a 186 | reader what your code is doing. 187 | 188 | ### # Else statement 189 | You can use an else clause to specify an alternative task to be carried out 190 | if the logical test is FALSE. 191 | 192 | ```{r} 193 | dat_classes <- NULL 194 | for (i in 1:ncol(dat)) { 195 | dat_classes[i] <- class(dat[ , i]) 196 | if(dat_classes[i] == "integer") { 197 | print('sweet!') 198 | } 199 | else { 200 | print('sour') 201 | } 202 | } 203 | ``` 204 | 205 | ####Nested statements 206 | You can nest if statements (and for loops) within one another 207 | 208 | ```{r, eval=FALSE} 209 | dat_classes <- NULL 210 | for (i in 1:ncol(dat)) { 211 | dat_classes[i] <- class(dat[ , i]) 212 | if (dat_classes[i] == "integer") { 213 | print('sweet!') 214 | } 215 | else { 216 | if (dat_classes[i] == 'factor') { 217 | print('ok') 218 | } 219 | else { 220 | print('sour') 221 | } 222 | } 223 | } 224 | ``` 225 | 226 | ### #Else if statement 227 | An alternative to the above syntax is to use an else if statement, which is 228 | sometimes a bit easier to read 229 | 230 | ```{r, eval=FALSE} 231 | dat_classes <- NULL 232 | for (i in 1:ncol(dat)) { 233 | dat_classes[i] <- class(dat[ , i]) 234 | if (dat_classes[i] == "integer") { 235 | print('sweet!') 236 | } 237 | else if (dat_classes[i] == 'factor') { 238 | print('ok') 239 | } 240 | else { 241 | print('sour') 242 | } 243 | } 244 | ``` 245 | 246 | In one-liner situations you can also use the vectorized function `ifelse()` 247 | 248 | ```{r} 249 | x <- 1:10 250 | ifelse(x > 5 , 'sweet!', 'sour!') 251 | ``` 252 | 253 | Which follows the same logic as this loop: 254 | 255 | ```{r, eval=FALSE} 256 | for (i 
in x) { 257 | if (i > 5) 258 | print('sweet') 259 | else 260 | print('sour') 261 | } 262 | ``` 263 | 264 | ## #Define functions 265 | Functions are one of the most important objects for unlocking R's power. They 266 | provide a way to modularize repetitive tasks that we need for our analyses. 267 | For example we can take the for loop that we wrote above which works on 268 | the data.frame called "dat" and place it in a function so that the same 269 | code can work on any data.frame we provide it. 270 | Function names should be verbs when possible and should avoid the names of 271 | existing R functions. 272 | 273 | ```{r func} 274 | eval_class <- function(x) { 275 | dat_classes <- NULL 276 | for (i in 1:ncol(x)) { 277 | dat_classes[i] <- class(x[ , i]) 278 | if (dat_classes[i] == "integer") { 279 | print('sweet!') 280 | } 281 | else if (dat_classes[i] == 'factor') { 282 | print('ok') 283 | } 284 | else { 285 | print('sour') 286 | } 287 | } 288 | return(dat_classes) 289 | } 290 | 291 | eval_class(dat) 292 | ``` 293 | 294 | Above the only change we have made to our for loop is to substitute "x" for the object 295 | name "dat". For our function eval_class(), x is a variable or argument. 296 | Additionally we added the line `return(dat_classes)` which ensures that the object 297 | is output by the function. 298 | 299 | What if dat had twice as many columns? 300 | 301 | ```{r} 302 | dbl_dat <- cbind(dat, dat) 303 | 304 | eval_class(dbl_dat) 305 | ``` 306 | 307 | It is best practice to program defensively by ensuring that the user 308 | supplies an object for the variable x that is sensible. 
In our case it 309 | has to be a data.frame or a matrix object; other types should return an 310 | error with a reasonable explanation 311 | 312 | ```{r, error=TRUE} 313 | eval_class <- function(x) { 314 | if (inherits(x, c('data.frame', 'matrix'))) { # inherits() also works when class(x) has more than one element 315 | x_classes <- NULL 316 | for (i in 1:ncol(x)) { 317 | x_classes[i] <- class(x[ , i]) 318 | if (x_classes[i] == "integer") { 319 | print('sweet!') 320 | } 321 | else if (x_classes[i] == 'factor') { 322 | print('ok') 323 | } 324 | else { 325 | print('sour') 326 | } 327 | } 328 | } 329 | else { 330 | stop('x must be either a data.frame or matrix') 331 | } 332 | return(x_classes) 333 | } 334 | 335 | my_obj <- 1:10 336 | eval_class(my_obj) 337 | ``` 338 | 339 | ## #Debug functions 340 | To debug your function in R use the functions `debug()` and `undebug()`. 341 | RStudio has made the debugging experience for R users much better than previously. 342 | Try out the following lines of code 343 | 344 | ```{r debug, eval=FALSE} 345 | debug(eval_class) 346 | eval_class(dat) 347 | undebug(eval_class) 348 | ``` 349 | 350 | 351 | ## #Document functions 352 | Documentation is critical, particularly when it comes to using functions, which 353 | usually have at least one argument and some kind of output. 354 | 355 | One best practice to follow when documenting functions is to use Roxygen, which is 356 | a package that helps to build R help files (i.e., .Rd files) which are accessed 357 | when the function `help` or `?` is used preceding a function name. 
Here is a 358 | page that goes into detail about how to do this: https://jozef.io/r102-addin-roxytags/, but 359 | for simplicity here is an example with our function: 360 | 361 | ```{r document} 362 | #' Evaluate the class of each column in a matrix or data.frame 363 | #' 364 | #' @param x a matrix or data.frame 365 | #' 366 | #' @return a vector of strings that indicates the class of each column of `x` 367 | #' @export 368 | #' 369 | #' @examples 370 | #' eval_class(cars) 371 | eval_class <- function(x) { 372 | if (inherits(x, c('data.frame', 'matrix'))) { # inherits() also works when class(x) has more than one element 373 | x_classes <- NULL 374 | for (i in 1:ncol(x)) { 375 | x_classes[i] <- class(x[ , i]) 376 | } 377 | } 378 | else { 379 | stop('x must be either a data.frame or matrix') 380 | } 381 | return(x_classes) 382 | } 383 | ``` 384 | 385 | This provides a nice format that is easily understandable by a human, and if 386 | you ever decide to package your function this can now be used to generate 387 | a help file for your function. Learn more at https://roxygen2.r-lib.org/articles/roxygen2.html 388 | 389 | -------------------------------------------------------------------------------- /lessons/R_introduction.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "R Introduction" 3 | author: "Dan McGlinn" 4 | date: "Sunday, January 18, 2015" 5 | output: html_document 6 | --- 7 | 8 | The purpose of this lesson is to introduce students to the R programming 9 | environment for the first time. 
The lesson builds off the 10 | [Software Carpentry](http://software-carpentry.org/) Lesson developed here: 11 | 12 | 13 | ## Readings 14 | * Chapters 1-7 of *The R Book* (1st ed) by Crawley 15 | * Chapters 1-4 of *MASS* (4th ed) by Venables and Ripley 16 | 17 | ## Lesson Outline 18 | * Arithmetic 19 | * Logical operations 20 | * Variable assignment 21 | * Reading in data 22 | * Using the help 23 | * Examine data 24 | * Subsetting the data 25 | * Summary statistics 26 | * Aggregate across rows or columns 27 | * Plot data 28 | 29 | ```{r setup, echo=FALSE} 30 | library(knitr) 31 | opts_knit$set(root.dir='../') 32 | ``` 33 | 34 | ## # Arithmetic 35 | 36 | ```{r} 37 | 3 + 4 # addition 38 | 39 | 3 * 4 # multiplication 40 | 41 | 3 / 4 # division 42 | 43 | 3^4 # exponents 44 | 45 | log(3) # log base e 46 | 47 | log(3, 10) # log base 10 48 | 49 | log10(3) # log base 10 50 | 51 | exp(log(3)) # e raised to a power; exp() undoes log() 52 | ``` 53 | 54 | ## #Logical operations 55 | 56 | ```{r logical operations} 57 | ?logical 58 | 59 | 3 > 4 # greater than 60 | 61 | 3 < 4 # less than 62 | 63 | 3 >= 4 # greater than or equal to 64 | 65 | 3 <= 4 # less than or equal to 66 | 67 | 3 != 4 # not equal to 68 | 69 | 3 == 4 # equal to 70 | 71 | TRUE # True 72 | T # True 73 | TRUE == 1 # True is set to one in R 74 | 75 | FALSE # False 76 | F # False 77 | FALSE == 0 # False is set to zero in R 78 | 79 | ## operations 80 | # ! reverses a logical 81 | !FALSE 82 | 83 | # & can be used as an "and" statement 84 | T & T 85 | T & F 86 | 87 | # | can be used as an "or" statement 88 | T | T 89 | T | F 90 | 91 | ## logical algebra 92 | # T is treated as a 1 93 | # F is treated as a 0 94 | T + T + F # what would this equal? 
95 | 96 | ## useful functions 97 | # any() and all() 98 | any(c(T, F)) 99 | all(c(T, F)) 100 | ``` 101 | 102 | ## #Variable assignment 103 | You can use "<-" or "=" to assign a value to a variable 104 | ```{r variable assignment} 105 | weight_kg <- 55 106 | 107 | # print the value of the variable by simply calling its name 108 | weight_kg 109 | 110 | # weight in pounds: 111 | 2.2 * weight_kg 112 | 113 | weight_kg <- 57.5 114 | 115 | # weight in kilograms is now 116 | weight_kg 117 | 118 | weight_lb <- 2.2 * weight_kg 119 | # weight in kg... 120 | weight_kg 121 | # ...and in pounds 122 | weight_lb 123 | 124 | weight_kg <- 100.0 125 | # weight in kg now... 126 | weight_kg 127 | # ...and weight in pounds is still unchanged 128 | weight_lb 129 | ``` 130 | 131 | Coming up with good object and file names can be difficult, but there 132 | are two general rules that can help guide you: 133 | 134 | 1) be descriptive 135 | 2) keep names that you must type often short 136 | 137 | So for something like a file name, which you'll probably only type once at read and 138 | write, you should use a long descriptive name, but for objects in your R code you 139 | need to consider typeability and readability when designing the name. A long name 140 | like root_rhiz_prod_total_mm is very clear but is a pain to read and worse to 141 | type. R has a built-in name completion system but this doesn't completely 142 | remove the burden on you for using long object names. 143 | 144 | ## #Reading in data 145 | 146 | First check what your working directory is: 147 | ```{r} 148 | getwd() 149 | ``` 150 | 151 | Because I have set up a Project in the quant_methods folder I can make my 152 | path relative to this location. 153 | 154 | Let's read in the datafile `inflammation-01.csv` which is located in the 155 | directory: `./quant_methods/data` where the `.` indicates the directory 156 | location in which the directory `quant_methods` is stored. 
The usage of 157 | the `.` is a shorthand way to create relative paths. 158 | Because my working directory is already set to: ``r normalizePath('.')`` 159 | I can shorten the path to `./data/inflammation-01.csv` where again `.` 160 | indicates my current working directory path. 161 | 162 | ```{r read in data} 163 | dat <- read.csv(file = "./data/inflammation-01.csv", header = FALSE) 164 | ``` 165 | 166 | Alternatively, and a bit more cleanly, we can take advantage of the ability to 167 | supply the function `read.csv` with a URL as such 168 | 169 | ```{r url import} 170 | dat <- read.csv('http://dmcglinn.github.io/quant_methods/data/inflammation-01.csv', 171 | header=F) 172 | ``` 173 | 174 | ## #Using the help 175 | 176 | ```{r get help, eval=FALSE} 177 | # above we used the function "read.csv" to find out more about this function see 178 | ?read.csv 179 | # or equivalently 180 | help(read.csv) 181 | # to do a fuzzy help search use 182 | help.search('read') 183 | help.search('csv') 184 | ``` 185 | 186 | ## #Examine data 187 | 188 | ```{r examine data} 189 | # visual summary of first 6 rows 190 | head(dat) 191 | # visual summary of last 6 rows 192 | tail(dat) 193 | 194 | # what kind of object is dat 195 | class(dat) 196 | 197 | # what are the dimensions of dat 198 | dim(dat) 199 | ``` 200 | 201 | You may notice that the data did not have column names and R auto assigned the 202 | columns the names V1, V2, V3, and so on. In this dataset, each column represents 203 | a different time (day). We can assign column names using the function `names` 204 | ```{r setting column names} 205 | names(dat) 206 | names(dat) <- paste("day", 1:ncol(dat), sep='') 207 | names(dat) 208 | ``` 209 | 210 | Above the function `paste` was used to construct text strings that combined the 211 | word "day" with a given index, in this case from 1 to the total number of 212 | columns in the object `dat`. 
By default the function `paste` inserts a space 213 | between strings that you wish to paste together; I've set the `sep` argument 214 | to `''` to ensure that no space is inserted (see also `?paste0`). 215 | 216 | ## #Subsetting the data 217 | 218 | There are a variety of ways to subset data in data.frames. This section 219 | demonstrates how to subset data using indices. 220 | ```{r subset index} 221 | # first value in dat 222 | dat[1, 1] 223 | 224 | # middle value in dat 225 | dat[30, 20] 226 | 227 | # chunk of data in dat 228 | dat[1:4, 1:10] 229 | 230 | # select specific rows and columns 231 | dat[c(3, 8, 37, 56), c(10, 14, 29)] 232 | 233 | # all columns from row 5 234 | dat[5, ] 235 | # all rows from column 16 236 | dat[ , 16] 237 | dat[1:nrow(dat), 16] 238 | 239 | #first 5 rows and all columns except 16 240 | dat[1:5, -16] 241 | ``` 242 | 243 | An alternative way to carry out subsetting is to reference specific column 244 | names or to use the `subset` function 245 | 246 | ```{r subset names} 247 | # here to avoid printing too much information to the screen I'll just focus 248 | # on the first 5 rows of each subset 249 | dat$day10[1:5] 250 | dat[1:5 , 'day10'] 251 | dat[1:5 , c('day10', 'day15')] 252 | # notice that the following would give an error 253 | #dat[ , -c('day3')] 254 | # but that the following would accomplish the intended goal of dropping day 3 255 | dat[1:5 , -3] 256 | 257 | #let's try using the subset function 258 | # only data for day 3 259 | subset(dat, select = day3)[1:5, ] 260 | # data on all days but 3 261 | subset(dat, select = -day3)[1:5, ] 262 | # data only on day 3 when inflammation in day 1 is equal to 0 263 | subset(dat, subset = day1 == 0, select = day3)[1:5, ] 264 | ``` 265 | 266 | ## #Summary statistics 267 | ```{r} 268 | # first row, all of the columns 269 | patient_1 <- dat[1, ] 270 | # max inflammation for patient 1 271 | max(patient_1) 272 | 273 | # max inflammation for patient 2 274 | max(dat[2, ]) 275 | 276 | 
# minimum inflammation on day 7 277 | min(dat[ , 7]) 278 | 279 | # mean inflammation on day 7 280 | mean(dat[ , 7]) 281 | # median inflammation on day 7 282 | median(dat[ , 7]) 283 | # standard deviation of inflammation on day 7 284 | sd(dat[ , 7]) 285 | 286 | summary(dat[ , 7]) 287 | ``` 288 | 289 | ## #Aggregate across rows or columns 290 | 291 | To obtain the average inflammation of each patient we will need to 292 | calculate the mean of each of the rows (`MARGIN = 1`) of the data frame. 293 | 294 | ```{r} 295 | avg_patient_inflammation <- apply(dat, 1, mean) 296 | ``` 297 | 298 | And to obtain the average inflammation of each day we will need to calculate 299 | the mean of each of the columns (`MARGIN = 2`) of the data frame. 300 | 301 | ```{r} 302 | avg_day_inflammation <- apply(dat, 2, mean) 303 | ``` 304 | 305 | We can swap the function "mean" for other functions such as "sd", which 306 | calculates the standard deviation 307 | 308 | ```{r} 309 | # standard deviation of day 310 | sd_day_inflammation <- apply(dat, 2, sd) 311 | 312 | # standard deviation of patients 313 | sd_patient_inflammation <- apply(dat, 1, sd) 314 | ``` 315 | ## #Plot data 316 | 317 | ```{r plot help, eval=FALSE} 318 | # use the function plot() to plot data 319 | ?plot 320 | ``` 321 | 322 | This help file provides a long list of potential arguments and examples. 323 | At a minimum you must provide a single quantitative variable, for example: 324 | 325 | ```{r default plot} 326 | plot(avg_day_inflammation) 327 | ``` 328 | 329 | Notice how R fills in lots of pieces of missing information automatically. 330 | Specifically, it assumes that the independent variable is simply an index from 331 | 1 to the length of the object, in this case avg_day_inflammation. 
A safer, 332 | clearer way to accomplish the same plot is to use the following: 333 | 334 | ```{r} 335 | plot(1:length(avg_day_inflammation), avg_day_inflammation, xlab='day', 336 | ylab='inflammation') 337 | ``` 338 | 339 | This makes it clearer that the x-variable is simply an index from 1 to the 340 | length of avg_day_inflammation, and it makes the x and y axis labels more 341 | sensible. 342 | 343 | To output multi-panel plots use for example 344 | ```{r} 345 | par(mfrow=c(2,1)) 346 | # which will create two plotting rows in a single column 347 | plot(1:length(avg_day_inflammation), avg_day_inflammation, xlab='day', 348 | ylab='inflammation') 349 | plot(1:length(avg_patient_inflammation), avg_patient_inflammation, 350 | xlab='patient identity', ylab='inflammation') 351 | ``` 352 | 353 | To output the figure to file you can use RStudio's GUI features or you can use 354 | the command line, which is what I recommend so that the code is fully 355 | reproducible: 356 | 357 | ```{r make pdf, eval=FALSE} 358 | pdf('./lessons/inflammation_fig1.pdf') 359 | par(mfrow = c(2,1)) 360 | plot(1:length(avg_day_inflammation), avg_day_inflammation, xlab='Day', 361 | ylab='Inflammation', frame.plot=F, col='magenta', pch=2, cex=2) 362 | plot(1:length(avg_patient_inflammation), avg_patient_inflammation, 363 | xlab='patient identity', ylab='inflammation', col='dodgerblue') 364 | dev.off() 365 | ``` 366 | 367 | -------------------------------------------------------------------------------- /lessons/chaotic-pop/app.R: -------------------------------------------------------------------------------- 1 | # 2 | # This is a Shiny web application. You can run the application by clicking 3 | # the 'Run App' button above. 
4 | # 5 | # Find out more about building applications with Shiny here: 6 | # 7 | # http://shiny.rstudio.com/ 8 | # 9 | 10 | library(shiny) 11 | 12 | # simple model of logistic growth 13 | dNt <- function(r, N) r * N * (1 - N) 14 | 15 | # iterate growth through time 16 | Nt <- function(r, N, t) { 17 | for (i in 1:(t - 1)) { 18 | # population at next time step is population at current time + pop growth 19 | N[i + 1] <- N[i] + dNt(r, N[i]) 20 | } 21 | N 22 | } 23 | 24 | # Define UI for application that plots a population time series 25 | ui <- fluidPage( 26 | 27 | # Application title 28 | titlePanel("Logistic Population Dynamics, dN/dt = rN(1 - N)"), 29 | 30 | # Sidebar with slider inputs for the model parameters 31 | sidebarLayout( 32 | sidebarPanel( 33 | sliderInput("r", 34 | "Population growth rate (r):", 35 | min = 0.01, 36 | max = 3, 37 | value = 0.1), 38 | sliderInput("t", 39 | "Length of time series (t):", 40 | min = 10, 41 | max = 1000, 42 | value = 100), 43 | sliderInput("N", 44 | "Starting population abundance (N(t=0))", 45 | min = 0.01, 46 | max = 2, 47 | value = 0.5) 48 | 49 | ), 50 | 51 | # Show a plot of the simulated population trajectory 52 | mainPanel( 53 | plotOutput("plotNt") 54 | ) 55 | ) 56 | ) 57 | 58 | # Define server logic required to draw the time series plot 59 | server <- function(input, output) { 60 | 61 | output$plotNt <- renderPlot({ 62 | plot(1:input$t, Nt(input$r, input$N, input$t), type = 'l', xlab = 'Time (t)', 63 | ylab = 'Population size (N)', ylim = c(0, 2)) 64 | abline(h = 1, lty = 2, col='grey') 65 | }) 66 | } 67 | 68 | # Run the application 69 | shinyApp(ui = ui, server = server) 70 | 71 | -------------------------------------------------------------------------------- /lessons/community_structure_slides_with_notes.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/community_structure_slides_with_notes.pdf
-------------------------------------------------------------------------------- /lessons/data_exploration.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Data exploration: the importance of plotting" 3 | output: html_document 4 | --- 5 | 6 | ```{r setup, include=FALSE} 7 | knitr::opts_chunk$set(echo = TRUE) 8 | ``` 9 | 10 | ## Simple example 11 | 12 | Consider that you have two variables `x` and `y`. 13 | 14 | ```{r create data} 15 | x <- 1:100 16 | y <- 20 * x - 0.2 * x^2 + rnorm(100, 0, 30) 17 | ``` 18 | 19 | 20 | You are interested in understanding whether `x` explains variation in `y`. 21 | How will you approach this? 22 | 23 | ## Should you build a linear model first or plot the variables first and visually explore? 24 | 25 | Let's examine the outcome and inference if **we do not visually examine our data** 26 | and only rely on regression modeling. 27 | 28 | ```{r model} 29 | mod_lin <- lm(y ~ x) 30 | summary(mod_lin) 31 | ``` 32 | 33 | The above model summary table would lead us to believe that there is no relationship 34 | between `x` and `y`, which we know is false because we created the variables above. 35 | 36 | ## Why is our regression table leading us to an incorrect inference? 37 | 38 | Because our model is systematically misrepresenting the functional form of the 39 | relationship between `x` and `y`, which we defined to be a quadratic relationship. 40 | This would have been obvious if we had first graphed `y` and `x`. 41 | 42 | ```{r} 43 | plot(y ~ x) 44 | ``` 45 | 46 | This simple step would indicate to us that `y` is not just a linear function of 47 | `x` but it is a quadratic function, such that a more appropriate model is: 48 | 49 | ```{r} 50 | mod_quad <- lm(y ~ x + I(x^2)) 51 | summary(mod_quad) 52 | ``` 53 | 54 | To further articulate why the regression-only approach failed we should 55 | overlay the models and the data. Unfortunately, this critical step is often 56 | overlooked.
57 | 58 | ```{r} 59 | plot(y ~ x) 60 | lines(x, predict(mod_lin), col='red') 61 | lines(x, predict(mod_quad), col='blue') 62 | legend('bottom', c('linear', 'quadratic'), col=c('red', 'blue'), lty=1, bty='n') 63 | ``` 64 | 65 | Here we just looked at two variables so it may be obvious that graphing them is wise, 66 | but even in multivariate scenarios where a graphical exploration may be more 67 | tedious it can still be very helpful and illuminating prior to model fitting. 68 | 69 | -------------------------------------------------------------------------------- /lessons/figures/Rmd_knited.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/Rmd_knited.png -------------------------------------------------------------------------------- /lessons/figures/Rmd_prepopulated.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/Rmd_prepopulated.png -------------------------------------------------------------------------------- /lessons/figures/crawley_2007_table9_2_model_simplification.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/crawley_2007_table9_2_model_simplification.png -------------------------------------------------------------------------------- /lessons/figures/final_doc.gif: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/final_doc.gif -------------------------------------------------------------------------------- /lessons/figures/git_diff.png: 
-------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/git_diff.png -------------------------------------------------------------------------------- /lessons/figures/git_panel.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/git_panel.PNG -------------------------------------------------------------------------------- /lessons/figures/git_tab_explained.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/git_tab_explained.png -------------------------------------------------------------------------------- /lessons/figures/isotropic_variogram_models_plots.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/isotropic_variogram_models_plots.png -------------------------------------------------------------------------------- /lessons/figures/isotropic_variogram_models_table.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/isotropic_variogram_models_table.png -------------------------------------------------------------------------------- /lessons/figures/knit_button.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/knit_button.png 
-------------------------------------------------------------------------------- /lessons/figures/naming_repo.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/naming_repo.PNG -------------------------------------------------------------------------------- /lessons/figures/new_proj.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/new_proj.PNG -------------------------------------------------------------------------------- /lessons/figures/new_proj2.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/new_proj2.PNG -------------------------------------------------------------------------------- /lessons/figures/new_repo.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/new_repo.PNG -------------------------------------------------------------------------------- /lessons/figures/proj_url.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/proj_url.PNG -------------------------------------------------------------------------------- /lessons/figures/r_starting_how_it_should_look.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/r_starting_how_it_should_look.png 
-------------------------------------------------------------------------------- /lessons/figures/repo_fresh.PNG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/repo_fresh.PNG -------------------------------------------------------------------------------- /lessons/figures/rmarkdown_dialogue.JPG: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/rmarkdown_dialogue.JPG -------------------------------------------------------------------------------- /lessons/figures/serious_git.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/serious_git.png -------------------------------------------------------------------------------- /lessons/figures/terminal.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/figures/terminal.png -------------------------------------------------------------------------------- /lessons/git_slides.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/git_slides.pdf -------------------------------------------------------------------------------- /lessons/more_with_maps.R: -------------------------------------------------------------------------------- 1 | #'--- 2 | #'title: More with maps 3 | #'author: Dan McGlinn 4 | #'output: html_notebook 5 | #'--- 6 | 7 | #+ echo=FALSE 8 | # setup the R environment for knitting markdown doc properly 9 | 
knitr::opts_knit$set(root.dir='../') 10 | 11 | #' import GIS libraries 12 | library(maps) 13 | library(sf) 14 | library(leaflet) 15 | library(viridis) # a color palette for maps 16 | library(readxl) 17 | 18 | #' ## Ancient Human DNA 19 | #' Let's examine an ancient human DNA project 20 | # https://dataverse.harvard.edu/dataset.xhtml?persistentId=doi:10.7910/DVN/FFIDCW 21 | dat <- read.csv('./data/v62.0_HO_public.csv', skip = 1, na.strings = '..') 22 | head(dat) 23 | 24 | # drop samples without coordinates 25 | dat <- subset(dat, !is.na(long) & !is.na(lat)) 26 | dat <- st_as_sf(dat, coords = c('long', 'lat')) 27 | 28 | map('world') 29 | points(dat) 30 | 31 | # 32 | m <- leaflet(data = dat) %>% addTiles() %>% 33 | addCircleMarkers(col = ~ as.factor(mtDNA_hap), label = ~mtDNA_hap) 34 | m 35 | 36 | # make a simpler haplotype using just first two letters 37 | dat$mtDNA_hap_simp <- substr(dat$mtDNA_hap, 1, 2) 38 | 39 | happal <- colorFactor(viridis(length(unique(dat$mtDNA_hap_simp))), 40 | dat$mtDNA_hap_simp) 41 | 42 | leaflet(dat) %>% 43 | addProviderTiles("CartoDB.Positron") %>% 44 | addCircleMarkers(radius = 1.5, 45 | fillColor = ~happal(mtDNA_hap_simp), 46 | stroke=FALSE, 47 | fillOpacity = 0.8, 48 | popup = ~mtDNA_hap_simp) %>% 49 | addLegend("bottomright", pal = happal, 50 | values = ~mtDNA_hap_simp, labels = "haplotypes", 51 | title = "mtDNA haplotype") 52 | 53 | #' ## Blue Crab Project 54 | #' let's import and map data on mature female blue crabs 55 | dat <- read_excel('./data/crabdat(MF).xlsx') 56 | head(dat) 57 | dat <- st_as_sf(dat, coords = c('longitude', 'latitude')) 58 | 59 | 60 | pal <- colorNumeric("viridis", domain = dat$width_mm) 61 | 62 | leaflet(dat) %>% 63 | addProviderTiles("CartoDB.Positron") %>% 64 | addCircleMarkers(fillColor = ~pal(width_mm), 65 | stroke=FALSE, 66 | fillOpacity = 0.8, 67 | label = ~width_mm) %>% 68 | addProviderTiles(providers$Esri.NatGeoWorldMap) %>% 69 | addLegend(data = dat, 70 | position = "bottomright", 71 | pal = pal,
values = ~width_mm, 72 | title = "Legend", 73 | opacity = 1) 74 | 75 | 76 | -------------------------------------------------------------------------------- /lessons/ordination_table.csv: -------------------------------------------------------------------------------- 1 | Acronym,Name,Type of analysis,Algorithm,R function(s) 2 | PCA,Principal Components Analysis,indirect,"Eigenanalysis, SVD","stats::princomp, vegan::rda" 3 | CA,Correspondence Analysis,indirect,"RA, eigenanalysis, SVD",vegan::cca 4 | DCA,Detrended Correspondence Analysis,indirect,RA with detrending and rescaling,vegan::decorana 5 | NMDS,Non-metric Multidimensional Scaling,indirect,"Distance based ordination, non-eigenbased",vegan::metaMDS 6 | RDA,Redundancy Analysis,direct,"Eigenanalysis, SVD",vegan::rda 7 | CCA,Canonical Correspondence Analysis,direct,"RA with regressions, eigenanalysis",vegan::cca 8 | DCCA,Detrended Canonical Correspondence Analysis,direct,RA with regressions and detrending,NA 9 | -------------------------------------------------------------------------------- /lessons/paired_samples.R: -------------------------------------------------------------------------------- 1 | #'--- 2 | #' title: The importance of controlling for block effects and the use of paired t-tests 3 | #' author: Dan McGlinn 4 | #' output: html_document 5 | #'--- 6 | #' 7 | #' 8 | #' In complex systems such as a forest it is rare that the variable you are 9 | #' interested in is only influenced by a single other variable. In fact, the driver 10 | #' variable that you are testing the effect of may be obscured because of 11 | #' variation due to a different variable. 12 | #' 13 | #' For example, you might be interested in comparing the effect of 14 | #' nutrient addition on plant growth but you know that plant growth is also 15 | #' strongly driven by salinity.
If we wish to apply our inference about nutrients 16 | #' across a salinity gradient then we should replicate our nutrient addition 17 | #' at different salinities. Since we are not necessarily interested in the 18 | #' salinity effect we are treating it more as a nuisance variable or a "block" 19 | #' effect. 20 | #' 21 | #' Let's examine the incorrect approach to this analysis using 22 | #' a linear model that ignores the block effect, and then we will examine 23 | #' how the model interpretation changes once the block effect is correctly 24 | #' considered. We will see that controlling for the block effect is the 25 | #' same as using a paired (aka one-sample) t-test. 26 | 27 | set.seed(1) 28 | 29 | #' First set up the block variable 30 | nblock <- 20 31 | block <- rep(1:nblock, each = 2) 32 | block 33 | 34 | #' Treatment will just have two levels: 0 = control, 1 = treatment 35 | trt <- rep(0:1, nblock) 36 | trt 37 | 38 | #' set error or uncertainty of model 39 | noise <- 0.1 40 | err <- rnorm(length(trt), mean = 0, sd = noise) 41 | 42 | #' Now we can generate response variable y using a 43 | #' linear model where we know the coefficients. 44 | #' Note the last term with Gaussian noise is added in. 45 | y <- 0.33 * block + 0.25 * trt + err 46 | 47 | par(mfrow=c(1,2)) 48 | boxplot(y ~ trt) 49 | boxplot(y ~ block) 50 | 51 | #' note above that the figures show that most of 52 | #' the variation in `y` is due to block effects rather 53 | #' than treatment effects. These strong block effects 54 | #' have the potential to obscure treatment effects if 55 | #' they are ignored. 56 | 57 | # recode block and treatment as factors 58 | block <- as.factor(block) 59 | trt <- as.factor(trt) 60 | 61 | #' Let's fit models. We will find that if block is ignored 62 | #' then treatment appears to have no effect; just as when you ignore 63 | #' important biology, you get the wrong answer (i.e., not 64 | #' significant).
Note the `trt` coefficient is still correct 65 | #' (i.e., matches what we set in our model) which is 66 | #' impressive. 67 | summary(lm(y ~ trt)) 68 | 69 | #' if you include block with treatment in the model 70 | #' now you get the correct inference (i.e., trt p-value < 0.05) 71 | summary(lm(y ~ block + trt)) 72 | 73 | #' now let's consider that each sample in each block is a 74 | #' paired sample. In other words we are most interested 75 | #' in the comparison of the samples within the blocks not 76 | #' between the blocks 77 | #' 78 | #' To carry out a paired test we need to compute the 79 | #' differences in y in the two treatment levels. This is 80 | #' easiest if we reshape the data to wide format 81 | dat <- data.frame(y, block, trt) 82 | dat_pair <- tidyr::pivot_wider(dat, names_from = trt, 83 | names_prefix = 'trt_', 84 | values_from = y) 85 | head(dat_pair) 86 | #' now we can compute diff directly for testing 87 | dat_pair$diff <- dat_pair$trt_1 - dat_pair$trt_0 88 | #' we can use the `t.test` function with the formula 89 | #' response ~ 1; here the 1 indicates that this is a one-sample 90 | #' test on the paired differences. Usually the right-hand side is the grouping variable 91 | t.test(dat_pair$diff ~ 1) 92 | #' Rather than compute the difference manually in trt_1 and 93 | #' trt_0 we can use the `Pair` function 94 | t.test(Pair(trt_1, trt_0) ~ 1, data = dat_pair) 95 | #' Ok let's see if the paired t-test gave us a different 96 | #' result to our simple linear model 97 | summary(lm(y ~ block + trt)) 98 | #' the coefficient estimate and t value of `trt` are identical 99 | #' between the paired t-test and the linear model which 100 | #' includes block as a factor. 101 | #' In this special case if we want to summarize the test results 102 | #' using ANOVA we can use the default `anova` function 103 | #' which carries out sequential tests (i.e., the order of 104 | #' the variables in the models matters!).
Here we want to 105 | #' first factor out block effects then test the `trt` effect 106 | anova(lm(y ~ block + trt)) 107 | #' we can verify that this is identical to the more usual 108 | #' type 3 anova test we are typically interested in when using 109 | #' observational (rather than experimental) datasets. 110 | car::Anova(lm(y ~ block + trt), type = 3) 111 | 112 | -------------------------------------------------------------------------------- /lessons/partial_residual_plots.Rmd: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Partial Residuals Plots" 3 | author: "Dan McGlinn" 4 | date: '`r paste("First created on 2015-01-29. Updated on", Sys.Date())`' 5 | output: 6 | html_document: 7 | fig_width: 10 8 | fig_height: 5 9 | --- 10 | 11 | Home Page - http://dmcglinn.github.io/quant_methods/ 12 | GitHub Repo - https://github.com/dmcglinn/quant_methods 13 | 14 | ### Source Code Link 15 | https://raw.githubusercontent.com/dmcglinn/quant_methods/gh-pages/lessons/partial_residual_plots.Rmd 16 | 17 | When working with multiple regression models we often spend a lot of time 18 | examining tabular summaries and not enough time examining graphical 19 | fits of the models to the data. Partial residual plots help us to visualize 20 | the fit of each independent variable in the multiple regression model after 21 | controlling for the other variables. Mathematically partial residuals are defined 22 | as: 23 | 24 | $\text{Residuals} + \hat{\beta}_iX_i \text{ versus } X_i,$ 25 | 26 | where 27 | 28 | $\text{Residuals = residuals from the full model}$ 29 | $\hat{\beta}_i \text{= regression coefficient from the i-th independent variable in the full model}$ 30 | $X_i \text{= the i-th independent variable}$ 31 | 32 | We will use the function `termplot` to examine partial residual plots in R for 33 | `lm` models but these also work for `glm` models.
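
To make the definition above concrete, here is a minimal sketch that computes the partial residuals for one variable by hand and compares them with what R returns from `residuals(fit, type = "partial")`. The simulated data and the names `fit`, `pr_x1`, `pr_builtin`, and `shift` are assumptions made only for this illustration. Note that R centers each term before adding it to the residuals, so the built-in values differ from the textbook definition by a constant.

```r
# simulated data used only for this illustration
set.seed(42)
n  <- 100
x1 <- rnorm(n)
x2 <- rnorm(n)
y  <- 1 + 0.5 * x1 + 0.75 * x2 + rnorm(n, sd = 0.1)

fit <- lm(y ~ x1 + x2)

# partial residuals for x1 straight from the definition:
# residuals of the full model + beta_1 * x1
pr_x1 <- residuals(fit) + coef(fit)["x1"] * x1

# R's built-in partial residuals use the centered term beta_1 * (x1 - mean(x1))
pr_builtin <- residuals(fit, type = "partial")[, "x1"]
shift <- coef(fit)["x1"] * mean(x1)

# after removing the constant shift the two versions agree
all.equal(unname(pr_x1 - shift), unname(pr_builtin))

# the partial residual plot for x1, with the fitted partial slope overlaid
plot(x1, pr_x1, xlab = 'x1', ylab = 'partial residual for x1')
abline(a = 0, b = coef(fit)["x1"], col = 'dodgerblue')
```

The constant shift only moves the points up or down, so the shape of the plot, which is what we care about, is the same either way.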
34 | 35 | Before we go further we should mention two caveats: 36 | 37 | This approach is not well suited for models that contain interaction 38 | effects because in that case examining the partial effect of a single term that 39 | is included in an interaction effect does not make sense. 40 | 41 | This approach can also be misleading if there is strong multicollinearity in which 42 | the independent variables are highly correlated with each other. In this case, 43 | the variance indicated by the partial residual plot can be much less than the 44 | actual variance. 45 | 46 | We will use a simple simulated example with 3 independent variables. 47 | 48 | ```{r} 49 | set.seed(1) 50 | x1 <- rnorm(100) # continuous variable 1 51 | x2 <- rnorm(100) # continuous variable 2 52 | x3 <- as.factor(rep(c('cnt', 'trt'), 50)) # categorical variable 53 | 54 | y <- .5 * x1^2 + .75 * x2 + ifelse(x3 == 'cnt', -0.5, 0.5) + rnorm(100, 0, 0.1) 55 | ``` 56 | 57 | We can visually examine our simulated data relationships to `y` prior to modeling. 58 | 59 | ```{r} 60 | par(mfrow=c(1,3)) 61 | plot(y ~ x1) 62 | plot(y ~ x2) 63 | plot(y ~ x3) 64 | ``` 65 | 66 | Even though in this simulated case we know the 'true' model because we designed it, 67 | by visually examining the relationship between `y` and each variable individually 68 | it does look like `x1` has no relationship, `x2` has a positive relationship, and 69 | for `x3` it appears that the `trt` samples have a higher `y`. 70 | 71 | Let's carry out multiple regression modeling to see if we can uncover more information: 72 | 73 | ```{r} 74 | mod <- lm(y ~ x1 + x2 + x3) 75 | summary(mod) 76 | ``` 77 | OK we fit the model and the tabular output seems to agree with our initial 78 | graphical exploration prior to model fitting.
Let's now examine the partial 79 | residual plot for this example: 80 | 81 | ```{r} 82 | par(mfrow = c(1, 3)) 83 | termplot(mod, partial.resid = TRUE, se=T, smooth=panel.smooth, pch=19, cex=0.75, 84 | col.res = 'black', col.term = 'dodgerblue', lwd.term=2, 85 | col.se = 'dodgerblue', lwd.se = 2, lty.se=3, 86 | col.smth = 'red', lty.smth = 2) 87 | ``` 88 | 89 | First let's get oriented on these graphics. The y-axis is the partial residual of 90 | the response variable `y` against each specific independent variable. There are 91 | several sets of lines on the graphs. The light blue lines are the linear model 92 | fits with their associated standard errors (there is more uncertainty at the ends 93 | of the best fit line). The graphs also include a localized smoother or loess 94 | curve (in red) which fits the data as well as possible. 95 | 96 | In the first panel we are examining the relationship `y ~ x1 | x2 + x3` where the `|` 97 | stands for `given`, so this formula asks what is the strength of the relationship 98 | of `x1` to `y` given, or after controlling for, `x2` and `x3`; in other words: 99 | 100 | ```{r} 101 | lm(residuals(lm(y ~ x2 + x3)) ~ x1) 102 | ``` 103 | 104 | Note that the leftmost panel above indicates something really useful to us. It 105 | shows clearly that the regression model (light blue line) has the wrong functional 106 | form. It isn't that `x1` has no relationship to `y`; it is just that it is not a 107 | linear relationship. The loess smoother (red line) shows this clearly. 108 | 109 | The other two panels for `x2` and `x3` also show their partial effects on `y`. 110 | These panels look good and indicate that our linear model assumption for the 111 | relationship between `y` and `x2` seems pretty reasonable.
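
The curvature that the smoother reveals for `x1` can also be checked numerically. The sketch below re-creates the simulated data from this lesson, extracts the partial residuals for `x1`, and compares a straight-line and a quadratic description of them with AIC. The names `pr_x1`, `fit_line`, and `fit_quad` are assumptions for illustration, and this is only an informal check because partial residuals are not independent raw data.

```r
# re-create the simulated example from this lesson
set.seed(1)
x1 <- rnorm(100)
x2 <- rnorm(100)
x3 <- as.factor(rep(c('cnt', 'trt'), 50))
y <- .5 * x1^2 + .75 * x2 + ifelse(x3 == 'cnt', -0.5, 0.5) + rnorm(100, 0, 0.1)

mod <- lm(y ~ x1 + x2 + x3)

# partial residuals for x1 (residuals + centered x1 term)
pr_x1 <- residuals(mod, type = "partial")[, "x1"]

# partial residual plot by hand with a lowess smoother
plot(x1, pr_x1, pch = 19, cex = 0.75, ylab = 'partial residual for x1')
lines(lowess(x1, pr_x1), col = 'red', lty = 2)

# informal comparison: does a quadratic describe the partial residuals
# better than a straight line?
fit_line <- lm(pr_x1 ~ x1)
fit_quad <- lm(pr_x1 ~ x1 + I(x1^2))
AIC(fit_line, fit_quad)
```

A much lower AIC for the quadratic fit points to the same conclusion as the red smoother: the linear term for `x1` has the wrong functional form.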
112 | 113 | Let's re-specify our model and re-examine the partial residual plot 114 | 115 | ```{r} 116 | mod2 <- lm(y ~ I(x1^2) + x2 + x3) 117 | summary(mod2) 118 | par(mfrow = c(1, 3)) 119 | termplot(mod2, partial.resid = TRUE, se=T, smooth=panel.smooth, pch=19, cex=0.75, 120 | col.res = 'black', col.term = 'dodgerblue', lwd.term=2, 121 | col.se = 'dodgerblue', lwd.se = 2, lty.se=3, 122 | col.smth = 'red', lty.smth = 2) 123 | ``` 124 | 125 | Now that the model is properly specified, the model fits look better and they have lower 126 | uncertainty associated with them. 127 | 128 | Home Page - http://dmcglinn.github.io/quant_methods/ 129 | GitHub Repo - https://github.com/dmcglinn/quant_methods 130 | -------------------------------------------------------------------------------- /lessons/prepare_data.R: -------------------------------------------------------------------------------- 1 | 2 | library(ecoretriever) 3 | 4 | ## create a clean subset of the McGlinn2010 Tallgrass Prairie dataset 5 | ## metadata available here: http://esapubs.org/archive/ecol/E091/124/ 6 | dat = fetch('McGlinn2010') 7 | cols_of_interest = c('plot', 'year', 'easting', 'northing', 'slope', 'ph', 'yrsslb') 8 | tgpp = merge(dat$richness, dat$environment[ , cols_of_interest], 9 | all.x = T, all.y=F) 10 | write.csv(tgpp, file='./data/tgpp.csv', row.names=F) 11 | -------------------------------------------------------------------------------- /lessons/rmarkdown_notes.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "R Markdown Notes" 3 | author: "Dan McGlinn" 4 | date: "1/14/2016" 5 | output: html_document 6 | --- 7 | 8 | The primary way to document R code is using an R script which is just a plain 9 | text file with the file extension .R rather than .txt. Any line in an R script 10 | that begins with the comment operator # is ignored and all other lines are 11 | interpreted as code.
12 | 13 | In the last decade, R markdown was developed in which the R code may be compiled 14 | with the results into a document often referred to as a dynamic report because 15 | the report will change each time the input data or code is updated. 16 | 17 | Just like an R script, an R markdown file is just a plain text file but it has 18 | the file extension .Rmd (rather than .R). R markdown files represent the 19 | combination of a markdown file (.md file) and an Rscript (.R file). 20 | 21 | [Markdown](http://daringfireball.net/projects/markdown/basics) is a simple set 22 | of tags that allow for formatting of text; it was designed to be very simple and 23 | intuitive. There are lots of resources online about how to use markdown tags to 24 | format plain text. Here is a good one 25 | 26 | One of the best learning resources I've found on how to use R markdown documents 27 | is by [Karl Broman](https://kbroman.org/): 28 | I also recommend that you work through this short demonstration 29 | by the folks at Rstudio. 30 | 31 | What follows just touches more on the mechanics of working with R markdown docs in Rstudio 32 | which I have noticed the above links have less information on. 33 | 34 | ## How to create an R Markdown file in Rstudio 35 | 36 | To create an R markdown file simply click on **File** > **New File** > 37 | **R Markdown**. 38 | If this is your first time ever creating an R markdown file you will be prompted 39 | to install the necessary dependencies. 40 | Rstudio should take care of this for you but in rare instances this will result in 41 | errors that need to be chased down via Google searches. 42 | If errors persist I suggest completely removing the R markdown libraries from 43 | your hard drive and then attempting to re-install them fresh.
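
If you do need to start fresh, the removal and re-install can be done from the R console. The sketch below is only an illustration and assumes that `rmarkdown` and `knitr` are the packages of interest; your error messages may point to others. The lines that change your library are left commented out so that nothing is removed by accident.

```r
# packages assumed (for this illustration) to be the ones Rstudio needs for knitting
knit_pkgs <- c("rmarkdown", "knitr")

# report which of them can currently be loaded
installed <- vapply(knit_pkgs, requireNamespace, logical(1), quietly = TRUE)
print(installed)

# To start fresh, remove the installed copies and then re-install them.
# These lines modify your R library and need an internet connection,
# so uncomment them only when you are ready:
# remove.packages(knit_pkgs[installed])
# install.packages(knit_pkgs)
```

Restart R after re-installing so that the new versions are the ones loaded.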
44 | 45 | If there are no errors with installing the R package dependencies you will see a 46 | dialogue box that provides you with a number of different output formatting options: 47 | 48 | ![](./figures/rmarkdown_dialogue.JPG) 49 | 50 | By filling out the interactive dialogue box shown above you will pre-populate 51 | your R markdown (.Rmd) file with those pieces of information. 52 | 53 | Notice in the dialogue box shown above that if you choose to export to a pdf 54 | additional dependencies are required. 55 | Specifically: _*PDF output requires TeX (MiKTeX on Windows, MacTeX 2013+ on OS X, TeX Live 2013+ on Linux).*_ 56 | 57 | If you attempt to compile a pdf without installing these additional dependencies 58 | you will receive fairly informative errors that provide download links and 59 | specific instructions for how to get the correct dependency. You can avoid 60 | installing these additional dependencies for the time being by simply requesting 61 | to build an html document rather than a pdf. See the section below on Troubleshooting 62 | if you have trouble building to a pdf. 63 | 64 | ## How to knit an R markdown document 65 | 66 | When you create an R markdown file in Rstudio it is pre-populated with a file that 67 | looks like this. 68 | 69 | ![](./figures/Rmd_prepopulated.png) 70 | 71 | This prepopulated information is meant to demonstrate how one can use an R 72 | markdown file. Before we talk about each piece of this document let's simply 73 | try to "knit" (i.e., compile) the document into the output of our choice (e.g., 74 | html, pdf, or doc). 75 | 76 | Click on the "Knit button" ![](./figures/knit_button.png) that appears at the 77 | top of the script panel. This will prompt you to save the file. Be sure to name 78 | your file with the .Rmd file extension which indicates that it is an Rmarkdown 79 | file.
If the document compiles properly you should see a 80 | document that looks like this: 81 | 82 | ![](./figures/Rmd_knited.png) 83 | 84 | ## Understanding the components of an R markdown doc 85 | 86 | Let's go back through the .Rmd file we created and examine what each line accomplishes: 87 | 88 | First we have the header lines denoted by `---` 89 | 90 |
 91 | ---
 92 | title: "My first R markdown file"
 93 | author: "Dan McGlinn"
 94 | date: "1/14/2019"
 95 | output: html_document
 96 | ---
 97 | 
98 | 99 | Upon knitting, this information is used to generate a title and subtitle of the 100 | document. It also specifies what kind of a document will be generated; in this 101 | case `html_document` generates an html file. If you want to change the document 102 | output you can just change `html_document` to `pdf_document` or `word_document`. 103 | 104 | Next we have 105 | 106 | ```{r setup, include=FALSE} 107 | knitr::opts_chunk$set(echo = TRUE) 108 | ``` 109 | 110 | which is an R code chunk. This is R code that will be run when the document is 111 | knitted. Let's break down the parts of this code chunk. First you have the 112 |
```
which indicate that a code chunk is starting. The curly bracket 113 | `{` indicates the start of specifying the code chunk options. The little `r` 114 | indicates the language of the software, in this case R. The next word `setup` is 115 | the name of this code chunk; this can be anything you want or left blank. Naming 116 | code chunks just makes it easier to navigate your R markdown script. After the 117 | chunk's name you have `, include=FALSE`; this is where the options of the chunk are 118 | specified. There is a wide range of options you can specify. 119 | The option 120 | `include` does the following: 121 | 122 | >include: (TRUE; logical) whether to include the chunk output in the final 123 | output document; if include=FALSE, nothing will be written into the output 124 | document, but the code is still evaluated and plot files are generated if there 125 | are any plots in the chunk, so you can manually insert figures; note this is the 126 | only chunk option that is not cached, i.e., changing it will not invalidate the 127 | cache 128 | The next line of the code chunk has some R code `knitr::opts_chunk$set(echo = 129 | TRUE)`; if you want to see what that does you could use `?knitr::opts_chunk` in 130 | the R console to pull up the documentation for that function. Essentially what 131 | this line of R code is doing is setting global options for all code chunks 132 | rather than specifying options for each chunk individually as `include = FALSE` 133 | did above. 134 | Note above that the code in an R code chunk can always be evaluated inline 135 | (rather than using the knit button) by clicking on the little green triangle on 136 | the code chunk or by using Ctrl-Enter to copy, paste, and run the line in the R 137 | console. 138 | 139 | ## Setting R markdown's root directory 140 | 141 | An additional useful global option that I sometimes like to set is the root 142 | directory of the script.

```{r setup, echo=FALSE}
# setup the R environment for knitting markdown doc properly
knitr::opts_knit$set(root.dir='../')
```

By default, if this option is not set when you knit the file, it will expect all
file paths to be relative to where the R markdown file is saved. I often use a
file structure which looks like the following:

```
my_project_dir
    data
        my_data.csv
    scripts
        my_code.Rmd
    figures
    results
```

If I'm working with `my_code.Rmd` and I want it to import `my_data.csv` and I
use `my_data <- read.csv('./data/my_data.csv')` I will get an error message when
I knit `my_code.Rmd` because it will assume that `./data/my_data.csv` is located
in `my_project_dir/scripts/` because that is where `my_code.Rmd` is saved. To
fix this I can simply set the root directory to the project directory with
`knitr::opts_knit$set(root.dir='../')`, as in the chunk at the start of this
section.

## Update - Rscript -> Report
**R markdown is great but it can be cumbersome and it sometimes has difficulties
with directories and file paths that can be frustrating.**

A recent feature that very few people are aware of is that you can actually
**just compile any R script (.R) into an html or pdf document without any
additional notation or changes to the code**:

https://rmarkdown.rstudio.com/articles_report_from_r_script.html

However, as the above link shows, adding just a few easy to remember tags allows
you to generate a beautiful dynamic report and keep the streamlined benefits of
a traditional R script. Specifically, the file header and plain text are
demarcated with a preceding `#'`. R markdown code chunk options are preceded by
`#+`. It's that simple!
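
As a minimal sketch of what such a script can look like (the file name
`my_report.R`, the title, and the chunk label `squares` are made up for
illustration), the header and text lines start with `#'` and the chunk options
start with `#+`:

```r
#' ---
#' title: "A report from a plain R script"
#' output: html_document
#' ---

#' Any line that starts with `#'` is rendered as markdown text.

#+ squares, echo=FALSE
# ordinary R code and comments work as usual
x <- 1:10
y <- x^2
plot(x, y, type = 'b')

# To compile the script into a report one option is (not run here):
# rmarkdown::render('my_report.R')
```

Because the tags are just specially formatted comments, the same file still runs
as an ordinary R script with `source()` or line by line in the console.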
**This is the method I recommend students use in this course**

## Troubleshooting: Rendering a pdf using Rmarkdown

### Installing MiKTeX on PC
Unfortunately, although the R error instructions were helpful, I still had
to hack around to get this to work on my PC. Here are the steps I had to take:

1) download the MiKTeX installer and run it
2) R markdown still would not compile to pdf because of missing `*.sty` files
3) Error messages from the Rmarkdown console and a little bit of Googling
suggested that the following additional MiKTeX packages needed to be installed:
`framed` and `titling`.
Luckily it is relatively easy to use the MiKTeX Package Manager to install these
two additional packages. Go ahead and do that.
4) turn off RStudio and then turn it back on and you should be able to compile
your .Rmd file into a beautiful .pdf file (hopefully, fingers crossed)

### Installing MacTeX 2013+ on OS X
These instructions were useful as of 01/2016.
There are two versions of MacTeX available online.
One of the versions is very large (2.5GB) but a slimmed down version of the
program is also available known as BasicTeX.
These instructions will follow the route of using the smaller install of BasicTeX.

1) download the BasicTeX installer and run it
2) install the missing `*.sty` files `framed.sty` and `titling.sty` by running the
following in your terminal

```
sudo tlmgr update --self
sudo tlmgr install framed
sudo tlmgr install titling
```

When using sudo you will be prompted for a password; this is just your administrative
password on your machine.
If you don't know how to access the Mac terminal simply search for it or use
RStudio's terminal by clicking **Tools** > **shell**

3) turn off RStudio and then turn it back on and you should be able to compile
your .Rmd file into a beautiful .pdf file (hopefully, fingers crossed)

## Additional topics to add
* error handling and warnings
* running code internally

--------------------------------------------------------------------------------
/lessons/shapefiles_and_rasters.R:
--------------------------------------------------------------------------------
#' ---
#' title: "Working with Geospatial Data in R"
#' author: Dan McGlinn
#' output: html_document
#' ---

#' Home Page - http://dmcglinn.github.io/quant_methods/
#' GitHub Repo - https://github.com/dmcglinn/quant_methods

#' ### Source Code Link
#' https://raw.githubusercontent.com/dmcglinn/quant_methods/gh-pages/lessons/shapefiles_and_rasters.R

#+ echo=FALSE
# setup the R environment for knitting markdown doc properly
knitr::opts_knit$set(root.dir='../')

#' This lesson covers how to map and work with geospatial data in R.
#' First let's load the relevant libraries
# install.packages(c("maps", "sf", "raster", "ggplot2", "leaflet", "plotrix"))

library(maps)    # convenient pkg for maps of the world, state, and county
library(sf)      # spatial features package that is helpful working with polygons
library(raster)  # raster data class for working with grid data
library(ggplot2) # has some mapping functions (geom_sf)
library(leaflet) # for interactive maps

#' First, make some maps. The "maps" package has databases="world","usa","state",
#' or "county". (There are also a few for foreign countries like France and
#' Italy.) In each database, you can plot any region or group of regions.

world <- map(database="world")
names(world)
head(world$names)

#' Any of the names in the world database can be plotted as a region. For example:
map(database="world", regions=c("Cambodia", "Thailand", "Vietnam", "Laos"))

map(database="state", regions=c("virginia", "north carolina", "south carolina"))
#' You can add cities using the `map.cities()` function.
#' Note: if the map is getting cropped on the edges, close and clear the plotting
#' window using `dev.off()`, then manually widen the plotting window and try
#' the `map()` function again - it should not crop the map if the window is
#' big enough.
#'
#' Let's look at a county map:
nc_county <- map(database="county", regions="north carolina")
head(nc_county$names)
triangle <- map(database="county",
                regions=c("north carolina,orange", "north carolina,wake",
                          "north carolina,durham"),
                fill=TRUE, plot=FALSE)
triangle$names

#' We can turn a map into a simple features (sf) polygon object and fill it
#' with data, turning it into a spatial data frame.

tri_sf <- st_as_sf(triangle)

plot(tri_sf, axes=TRUE)

#' add data to make a spatial polygons data frame
#' (values are in the same order as `triangle$names`)
population <- c(266132, 132272, 892409)

tri_sf$pop <- population

plot(tri_sf['pop'])

#' We can also use ggplot to produce a similar graphic with less hideous default
#' colors
ggplot(tri_sf) +
    geom_sf(aes(fill = pop))

#' If your data are in different projections, you need to change the projection
#' so that they are all in
#' the same coordinate reference system.

world <- map(database="world", fill=TRUE, plot=FALSE)
world_longlat <- st_as_sf(world)

# Use function st_transform to transform to Mercator projection
world_merc <- st_transform(world_longlat, crs=st_crs("+proj=merc"))
# Lambert Azimuthal Equal Area
world_laea <- st_transform(world_longlat, crs=st_crs("+proj=laea"))
# sinusoidal
world_sinusoidal <- st_transform(world_longlat,
                                 crs=st_crs("+proj=sinu"))

#' plot the four in turn to see the difference. Mercator projection distorts
#' area far from the equator. LAEA accurately represents area but not angles.
#' Sinusoidal is equal area and conserves distances along parallels.
par(mfrow=c(1,1))
plot(world_longlat)
plot(world_merc)
plot(world_laea)
plot(world_sinusoidal)

#' A four page cheat sheet about coordinate reference systems (CRSs), including
#' projections, datums, and coordinate systems, and the use of these in R:
#' https://www.nceas.ucsb.edu/sites/default/files/2020-04/OverviewCoordinateReferenceSystems.pdf
#' A fun projection link: http://xkcd.com/977/
#'
#' OK, some slightly more complicated stuff. Data from MODIS fire detections in
#' the US in 2021.
#' metadata and data link for MODIS fire data available at:
#' https://fsapps.nwcg.gov/afm/data/fireptdata/modisfire_2021_conus.htm
#'
#' To download the data use:
#+ eval=FALSE
download.file('https://fsapps.nwcg.gov/afm/data/fireptdata/modis_fire_2021_365_conus_shapefile.zip',
              destfile = './data/modis_fire_2021_365_conus_shapefile.zip')
unzip('./data/modis_fire_2021_365_conus_shapefile.zip', exdir = './data/')

#' read in data
fire2021 <- sf::st_read(dsn = './data/modis_fire_2021_365_conus.shp')

#' the shape file is read in as a data.frame with spatial attributes
class(fire2021)
names(fire2021)  # provides names of data table
dim(fire2021)    # dimensions of data table (1 row per spatial feature)
st_crs(fire2021) # retrieves coordinate reference system

#' make a map of Julian day of each fire
plot(fire2021["JULIAN"], pch = '.')

#' alternatively use ggplot (a bit slower option in this case)
ggplot(fire2021) +
    geom_sf(aes(col = JULIAN), pch = '.') +
    theme_minimal()

#' get summary statistics for each attribute just like we would for a dataframe

summary(fire2021)

#' We can also easily subset the object.
#' Let's select fires that occurred in the last 6 months (days 182 to 365)
#' of 2021

fire <- subset(fire2021, JULIAN >= 182)
class(fire)
names(fire)
dim(fire)
summary(fire)
plot(fire['JULIAN'], cex = 0.25)

#' It is a little more difficult to do a geographically defined subset.
#' For example, let's select the fires in SC. First we'll identify which state
#' each fire occurred in, then subset the ones from SC.
USA <- map(database='state', plot=FALSE, fill=TRUE)
names(USA)
USA$names
USA_sf <- st_as_sf(USA)

#' Let's make sure the projection of the USA polygons is the same as the fire points
USA_sf <- st_transform(USA_sf, st_crs(fire2021))

#' Now `st_intersection()` performs a "point in polygon" operation, meaning that
#' it returns the fire points that fall inside a polygon of `USA_sf`, each with
#' the attributes of that polygon attached. The `ID` column then gives the name
#' of the state for each fire.
sf_use_s2(FALSE)
fire_state <- st_intersection(fire2021, USA_sf)

#' Now let's subset the ones from SC. We have to use grep because there are actually
#' three polygons for SC.
firesc <- fire_state[grep("south carolina", fire_state$ID), ]
dim(firesc)


#' We'll set up a legend for the map of fires by Julian day.
addLegendToSFPlot <- function(values = c(0, 1), labels = c("Low", "High"),
                              palette = c("blue", "red"), ...){

    # Get the axis limits and calculate size
    axisLimits <- par()$usr
    xLength <- axisLimits[2] - axisLimits[1]
    yLength <- axisLimits[4] - axisLimits[3]

    # Define the colour palette
    colourPalette <- leaflet::colorNumeric(palette, range(values))

    # Add the legend
    plotrix::color.legend(xl=axisLimits[2] - 0.1*xLength, xr=axisLimits[2],
                          yb=axisLimits[3], yt=axisLimits[3] + 0.1 * yLength,
                          legend = labels, rect.col = colourPalette(values),
                          gradient="y", ...)
}

#' First make an SC spatial polygon to put around the fire points.

SC <- map(database='state', regions='south carolina', fill=TRUE,
          plot=FALSE)

SC_st <- st_as_sf(SC)
SC_st <- st_transform(SC_st, crs=st_crs(firesc))

cuts <- cut(firesc$JULIAN, 10)
colors <- heat.colors(10)[as.numeric(cuts)]

plot(st_geometry(SC_st))
plot(st_geometry(firesc['JULIAN']), add = TRUE,
     col = colors, pch = 19, cex = 0.5)


# legend values span the observed range of Julian days in SC
addLegendToSFPlot(values = seq(from = min(firesc$JULIAN),
                               to = max(firesc$JULIAN), length.out = 10),
                  labels = c("Low", "", "", "", "Medium", "", "", "", "", "High"),
                  palette = heat.colors(10))


#' Here's where ggplot shines as it makes it easy to combine multiple maps
ggplot() +
    geom_sf(data = SC_st) +                                 # add in SC polygon
    geom_sf(data = firesc, aes(col = JULIAN), cex = 0.25)   # add in fire data

#' For the last step, let's export this as a KML (readable by Google Earth) using
#' `write_sf()` and plot the locations of fires in SC in Google Earth. The first
#' argument is the object we want to export, the second is the filename (by
#' default it will go in our working directory), and `driver` sets
#' the file format.
write_sf(firesc, "firesctemp.kml", driver="kml")

#' ## Rasters
#' Rasters are grids of data. A common data grid to work with is
#' climate data.
#' You can download an Rdata file of bioclim climate data here:
#' https://www.dropbox.com/s/gafxazc9575nf3j/bioclim_10m.Rdata?dl=0
#+ eval = FALSE
#' let's load and plot the data
load('./data/bioclim_10m.Rdata')
bioStack
class(bioStack)
names(bioStack)
projection(bioStack) # this is unprojected latlong like the fire data
plot(bioStack, "mat")

#' Let's extract the historical climate data at each of our fire locations
fire_climate <- extract(bioStack, firesc)
class(fire_climate)
head(fire_climate)
nrow(fire_climate)

#' merge the two datasets
firesc <- cbind(firesc, fire_climate)
head(firesc)

# Fire temperatures in deg K at fire locations
ggplot() +
    geom_sf(data = SC_st) +
    geom_sf(data = firesc, aes(col = TEMP), cex = 0.25)
# historical annual precip
ggplot() +
    geom_sf(data = SC_st) +
    geom_sf(data = firesc, aes(col = ap), cex = 0.25)
# relationship between annual precip and fire temp
plot(TEMP ~ ap, data=firesc, xlab = 'Annual precip', ylab = 'Fire temp (K)')


#' These maps are cool but they are static.
#' Let's make an interactive map using
#' the package `leaflet`
#' A quick demo of `leaflet` can be found here: https://rstudio.github.io/leaflet

mapStates <- map("state", fill = TRUE, plot = FALSE)
leaflet(data = mapStates) %>% addTiles() %>%
    addPolygons(fillColor = topo.colors(10, alpha = NULL), stroke = FALSE)

#' we can add various provider tiles (i.e., maps) with additional data features
m <- leaflet(data = firesc) %>% addTiles() %>%
    addCircleMarkers(radius = 2, label = ~as.character(TEMP))
m %>% addProviderTiles(providers$Esri.NatGeoWorldMap)


#' it is possible to vary point radius and color based upon data fields
m <- leaflet(data = firesc) %>% addTiles() %>%
    addCircleMarkers(radius = ~(TEMP/max(TEMP)), label = ~as.character(TEMP))
m

# map fire temperature onto a color ramp; leaflet expects color strings
pal <- colorNumeric("YlOrRd", domain = firesc$TEMP)
m <- leaflet(data = firesc) %>% addTiles() %>%
    addCircleMarkers(radius = 2, color = ~pal(TEMP),
                     label = ~as.character(TEMP))
m


--------------------------------------------------------------------------------
/lessons/simulations.Rmd:
--------------------------------------------------------------------------------
---
title: "Simulations in R"
output: html_document
---

R provides an excellent environment for carrying out numerical experiments or
simulations. These can be very helpful for increasing mathematical competence,
testing hypotheses, assessing statistical power, and generating theoretical
expectations.

We've already experimented a bit with simulations in this course:

1. In the univariate lesson we [simulated variables that we then used in a
regression analysis](http://dmcglinn.github.io/quant_methods/lessons/univariate_models.html#sim).
2. In the multivariate lesson we [permuted rows of our matrix to test if our
constrained ordination model fit the data better than we would expect due to
chance](http://dmcglinn.github.io/quant_methods/lessons/multivariate_models.html).
3. In the spatial lesson we [permuted the spatial coordinates of samples to
test if the spatial correlation was larger than we would expect due to chance](http://dmcglinn.github.io/quant_methods/lessons/spatial_models.html)

In the first example listed above the simulation was designed to inform our
understanding of regression and to examine how sensitive it was to violations of
its assumptions. In examples 2 and 3 listed above the role of the simulation was
to generate a null model that an observed pattern could be tested against.

Here I present an example of a simulation of a theoretical model to try to gain
an understanding of this model and of a broader concept known as Chaos theory.

I was inspired to code this in R after watching this video by
[Veritasium - This equation will change how you see the world](https://www.youtube.com/watch?v=ovJcsL7vyrk)

# Chaotic logistic population equilibria

First we are going to define a simple theoretical model of population growth that
has a negative feedback. In other words, as the population grows and approaches
a carrying capacity, its rate of increase decreases. This kind of
growth model is known as a logistic growth model and it contrasts with the
exponential model of growth in which the population continues to grow to infinity.
The logistic model can be presented mathematically as:

$$\dfrac{dN}{dt} = rN(1-N)$$
where *N* is population size, *t* is time, and *r* is the intrinsic rate of
population growth on a per capita basis.
In this formulation, without loss
of generality, the carrying capacity of the population is implicitly defined as
equal to 1.

Let's code this simple model up in R and then examine its behavior with a simulation.


```{r}
# simple model of logistic growth
dNt <- function(r, N) r * N * (1 - N)

# iterate growth through time
Nt <- function(r, N, t) {
    for (i in 1:(t - 1)) {
        # population at next time step is population at current time + pop growth
        N[i + 1] <- N[i] + dNt(r, N[i])
    }
    N
}
```

Now let's examine what happens in this model if we run it through time for
different starting abundance values.

```{r}
t <- 100
r <- 0.1
# let's consider 4 different starting abundances (i.e., N(t=0) values)
Nt0 = c(0.1, 0.5, 1.5, 2)

par(mfrow=c(2,2))
for (i in seq_along(Nt0)) {
    plot(1:t, Nt(r, Nt0[i], t), type = 'l', xlab = 'time', ylab = 'Population size',
         main = paste('N(t=0) =', Nt0[i]), ylim =c(0, 2))
    abline(h = 1, lty = 2, col='grey')
}

```

So what we learn from this is if you start below the carrying capacity (grey
dashed line) your population will increase to it, and if you start above
carrying capacity your population will decrease to it. We also learn that the
rate of increase or decrease depends on how far from the equilibrium you are at
the beginning of the time series.

## Interactive Simulation
You can play around with an interactive version of the logistic population
growth model that we examined above on this [Shiny app](https://danmcglinn.shinyapps.io/chaotic-pop/)
([Source code](https://github.com/dmcglinn/quant_methods/blob/gh-pages/lessons/chaotic-pop/app.R)).

## Interpretation thus far...

OK so all is well in the world of logistic population growth - given enough time
all of the populations eventually hit carrying capacity
and do not change from that point. Therefore, if we wanted to predict
long-term (i.e., equilibrium) abundance then this model would seem to suggest
that it should be 1 for all of these populations because they all have the same
carrying capacity of 1.

To check if this is actually true let's examine the other parameter of the model
that we can adjust: the rate of population growth (*r*).

**Question**: do all of the populations go to the same expected population
equilibrium (i.e., the carrying capacity or 1 in this specific case)?

To test this we will examine a range of *r*-values from 0.01 to 3. We'll set the
starting population size to 0.5, but where you actually start N out ends up not
making a difference for addressing this specific question. In the approach laid
out below we will assume that equilibrium is reached by the end of the first
half of the time series. This would seem to be a pretty reasonable assumption
given what we learned above, and we'll run the simulation 10 times longer (1000
time steps) to make extra sure that the model should have achieved equilibrium.

```{r}

# set starting conditions and amount of time
t <- 1000
r <- seq(0.01, 3, .01)
Nt0 <- 0.5
# compute the population sizes across the times
e <- sapply(r, function(r) Nt(r, Nt0, t))

# only use 2nd half of times presuming those will be at equilibrium
thalf <- round(t/2)
e <- e[thalf:t, ]
t <- nrow(e)
maxE <- max(as.vector(e))

# plot simulation results

ptsize = 0.25
plot(rep(r[1], t), e[ , 1], ylim = c(0, maxE), xlim = range(r),
     cex = ptsize, xlab = 'Population growth rate (r)',
     ylab = 'Equilibrium abundance')
for (i in seq_along(r)) {
    points(rep(r[i], t), e[ , i], cex = ptsize)
}
abline(h = 1, col='grey', lty=2)
```

What we see above was quite unexpected to the ecologist Robert May who first
discovered this phenomenon ([May 1976](https://www.nature.com/articles/261459a0)).
Essentially what this figure tells us is that for population growth rates above
2 the population does not necessarily stay at the equilibrium where we expect it
to (at the dashed grey line, 1). Instead equilibrium abundances bifurcate into
different values and further bifurcate until essentially any abundance value is
possible at *r* values greater than 2.6 or so.

The important things to remember about the result above are that:

1. this is a purely deterministic model - i.e., no error so this complex outcome
was generated only by the negative feedback loop in the model,
2. this is a very simple model so this is not due to exceptionally complex
internal dynamics of the model, and
3. although equilibrium abundance is unpredictable at higher r values the overall
pattern of the bifurcation is predictable. It is what folks call a "strange attractor".
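
To make the first bifurcation concrete, here is a minimal sketch that reruns the
model at two growth rates and counts how many distinct values the population
visits at the end of the series. The two model functions repeat the definitions
from earlier in the lesson so the chunk is self-contained; the helper
`n_end_states` and the specific values of *r* (1.5 and 2.2) are choices made
only for this illustration.

```r
# same model as defined earlier in the lesson
dNt <- function(r, N) r * N * (1 - N)
Nt <- function(r, N, t) {
    for (i in 1:(t - 1)) {
        N[i + 1] <- N[i] + dNt(r, N[i])
    }
    N
}

# hypothetical helper: number of distinct (rounded) abundances visited
# during the last 20 time steps of a run
n_end_states <- function(r, N0 = 0.5, t = 1000) {
    N <- Nt(r, N0, t)
    length(unique(round(tail(N, 20), 6)))
}

n_end_states(1.5)  # below r = 2: settles on a single value
n_end_states(2.2)  # just above r = 2: alternates between two values
```

Trying larger values of *r* with the same helper is a quick way to see the
number of end states grow as the figure above suggests.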

As might be imagined this unpredictable behavior from one of the simplest and most
canonical models of population growth shook ecology to its core, and the
reverberations are still being felt over 40 years later.

This simple example demonstrates some of R's power to explore parameter space of
models and to find novel insights that would require a lot more pure
mathematical expertise if just using pen and paper.

## Student Exercise

As noted above the model we examined just now was purely deterministic meaning
that it had no error or noise.

How would you change the following code chunk so that the model allows for
additive process error?

* What distribution will error in your model take (e.g., Gaussian, Log Normal)?

* You may want to consider a default value for error so that downstream code
that does not specify a value for the error term does not break.

* Can you reproduce the first figure in the lesson but now with stochasticity?

```{r}
# simple model of logistic growth
dNt <- function(r, N) r * N * (1 - N)

# iterate growth through time
Nt <- function(r, N, t) {
    for (i in 1:(t - 1)) {
        # population at next time step is population at current time + pop growth
        N[i + 1] <- N[i] + dNt(r, N[i])
    }
    N
}
```

--------------------------------------------------------------------------------
/lessons/standardized_beta_coefficients.Rmd:
--------------------------------------------------------------------------------
---
title: Standardized \(\beta\) coefficients
output: html_document
---

Home Page - http://dmcglinn.github.io/quant_methods/
GitHub Repo - https://github.com/dmcglinn/quant_methods

### Source Code Link
https://raw.githubusercontent.com/dmcglinn/quant_methods/gh-pages/lessons/standardized_beta_coefficients.Rmd


This mini-lesson is to introduce the concept of standardized regression
coefficients in R. A standardized regression coefficient is simply the
\(\beta\) estimate from a regression on standardized variables. A standardized
variable is a variable that has a mean of 0 and a standard deviation of 1.

One reason for standardizing variables is that you can interpret the \(\beta\)
estimates as partial correlation coefficients. In other words, now that the
variables are standardized you can compare how correlated they are to the
response variable using their regression coefficients. Below is a demo of this.

```{r}
## We will use this function to plot the data and correlations
panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor=3, ...)
{
    # save the plot coordinates and restore them when the function exits
    usr <- par("usr"); on.exit(par(usr = usr))
    par(usr = c(0, 1, 0, 1))
    r <- abs(cor(x, y))
    txt <- format(c(r, 0.123456789), digits = digits)[1]
    txt <- paste0(prefix, txt)
    if(missing(cex.cor))
        cex.cor <- 0.8/strwidth(txt)
    text(0.5, 0.5, txt, cex = cex.cor)
}
```

Simulate some data for running models. Here, to provide a clear demonstration,
we need explanatory variables that are independent normal variates.

```{r}
set.seed(10)
n = 90
x1 = rnorm(n)
x2 = rnorm(n)
x3 = rnorm(n)

#create noise b/c there is always error in real life
epsilon = rnorm(n, 0, 3)
#generate response: additive model plus noise, intercept=0
y = 2*x1 + x2 + 3*x3 + epsilon
#organize predictors in data frame
sim_data = data.frame(y, x1, x2, x3)
```

Before standardizing variables it is worthwhile to highlight the
relationship between correlation and regression statistics. Specifically,
the t-statistic from a simple correlation coefficient is exactly what is
reported for the \(\beta_1\) coefficient in a regression model.

```{r}
cor.test(sim_data$y, sim_data$x1)$statistic
summary(lm(y ~ x1, data=sim_data))$coef
```

The \(\beta\) coefficient reported by the regression is not equal to the
correlation coefficient though because the \(\beta\) is in the units of the
\(x_1\) variable (i.e., it has not been standardized). Now let's use the function
`scale()` to standardize the independent and dependent variables.

```{r}
sim_data_std = data.frame(scale(sim_data))

mod = lm(y ~ x1 + x2 + x3, data=sim_data)
mod_std = lm(y ~ x1 + x2 + x3, data=sim_data_std)
round(summary(mod)$coef, 3)
round(summary(mod_std)$coef, 3)
cor(sim_data$y, sim_data$x1)
cor(sim_data$y, sim_data$x2)
cor(sim_data$y, sim_data$x3)

```

Notice that above the t-statistics and consequently the p-values between `mod`
and `mod_std` don't change (with the exception of the intercept term which is
always 0 in a regression of standardized variables). This is because the
t-statistic is a pivotal statistic meaning that its value doesn't depend on the
scale of the variables.

Additionally notice that the individual correlation coefficients are very
similar to the \(\beta\) estimates in `mod_std`. Why are these not exactly the same?
Here's a hint - what would happen if there was strong multicollinearity between
the explanatory variables?
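
One way to explore the hint is with a small simulation. The sketch below is not
part of the analysis above: the variable names, the sample size of 1000, and
the 0.9 correlation between predictors are assumptions chosen for illustration.
It compares the standardized \(\beta\) of one predictor with its simple
correlation to the response, first with independent predictors and then with
strongly correlated ones.

```r
set.seed(1)
n <- 1000
z1 <- rnorm(n)
z2_indep <- rnorm(n)                                  # independent of z1
z2_collin <- 0.9 * z1 + sqrt(1 - 0.9^2) * rnorm(n)    # correlated with z1

# absolute gap between the standardized beta of z2 and cor(y, z2)
beta_cor_gap <- function(z1, z2) {
    y <- 2 * z1 + z2 + rnorm(length(z1))
    dat <- data.frame(scale(cbind(y, z1, z2)))
    beta_z2 <- coef(lm(y ~ z1 + z2, data = dat))['z2']
    unname(abs(beta_z2 - cor(dat$y, dat$z2)))
}

gap_indep <- beta_cor_gap(z1, z2_indep)
gap_collin <- beta_cor_gap(z1, z2_collin)
c(independent = gap_indep, collinear = gap_collin)
```

With independent predictors the gap should be close to zero, whereas with
correlated predictors the simple correlation also picks up the effect of the
other predictor and so drifts away from the standardized \(\beta\).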

Let's plot the variables against one another and also display their individual
Pearson correlation coefficients to get a visual perspective on the problem

```{r}
pairs(sim_data, lower.panel = panel.cor, upper.panel = panel.smooth)
```

Home Page - http://dmcglinn.github.io/quant_methods/
GitHub Repo - https://github.com/dmcglinn/quant_methods

--------------------------------------------------------------------------------
/lessons/stats_primer.pdf:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/stats_primer.pdf

--------------------------------------------------------------------------------
/lessons/tcltk_0.1-1.tar.gz:
--------------------------------------------------------------------------------
https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/lessons/tcltk_0.1-1.tar.gz

--------------------------------------------------------------------------------
/lessons/univariate_models.R:
--------------------------------------------------------------------------------

## content from this lesson is modified from the following sources:

## http://www.unc.edu/courses/2010fall/ecol/563/001/docs/lectures/lecture1.htm

## http://plantecology.syr.edu/fridley/bio793/lm.html

## hypothetical models ---------------------------------------------------

# All models are wrong
# but some models are useful

# In R linear modeling is a very common task and is carried out by the function
# lm()

#generate data for example
set.seed(10)
x1 = runif(90)
x2 = rbinom(90, 10, .5)
x3 = rgamma(90, .1, .1)

#organize predictors in data frame
sim_data = data.frame(x1, x2, x3)
#create noise b/c there is always error in real life
epsilon = rnorm(90, 0, 3)
#generate response: additive model plus noise, intercept=0
sim_data$y = 2*x1 + x2 + 3*x3 + epsilon
#simple linear regression with x1 as predictor

mod1 = lm(y ~ x1, data=sim_data)
mod1

#plot regression line and mean line
plot(y ~ x1, data=sim_data)
abline(h=mean(sim_data$y), col='pink', lwd=3)
abline(mod1, lty=2)
#simple linear regression with x3 as a predictor
mod3 = lm(y ~ x3, data=sim_data)
#graph regression line and mean line
plot(y ~ x3, data=sim_data)
abline(mod3)
abline(h=mean(sim_data$y), col='pink', lwd=2)
legend('topleft', c('OLS fit', 'mean'), col=c('black', 'pink'), lty=1)

# let's examine the statistics of these model fits
summary(mod1)
summary(mod3)
#remove outlier in x3 space
sim_data_sub = sim_data[sim_data$y < 25,]
#verify that one observation was removed
dim(sim_data)
dim(sim_data_sub)
#refit model to reduced data
mod3_sub = lm(y ~ x3, data=sim_data_sub)
summary(mod3_sub)

# so R^2 is highly sensitive to outliers but coefficients not so much

## Question: create a plot of both models alongside the data, how much
## do they visually differ from one another? Examine the arguments to abline()
## including lty and lwd

## multiple regression --------------------------------------------------
mod_main = lm(y ~ x1 + x2 + x3, data=sim_data)
mod_main
summary(mod_main)

## interaction effects -----------------------------------------------

# y lives in sim_data, so the data argument is needed here
lm(y ~ x1 + x2 + x3 + x1*x2 + x1*x3 + x2*x3 + x1*x2*x3, data=sim_data)

lm(y ~ 1, data=sim_data)

mod_full = update(mod_main, ~ .
+ x1*x2*x3) 73 | summary(mod_full) 74 | 75 | anova(mod_main, mod_full) 76 | 77 | AIC(mod_full) 78 | AIC(mod_main) 79 | 80 | #install.packages('MASS') 81 | library(MASS) 82 | stepAIC(mod_full) 83 | 84 | 85 | 86 | -------------------------------------------------------------------------------- /motivation.md: -------------------------------------------------------------------------------- 1 | 2 | # Why learn R? 3 | 4 | ## R is not a GUI, and that's a good thing 5 | 6 | The learning curve might be steeper than with other software, but with R, you 7 | can save all the steps you used to go from the data to the results. So, if you 8 | want to redo your analysis because you collected more data, you don't have to 9 | remember which button you clicked in which order to obtain your results; you 10 | just have to run your script again. 11 | 12 | Working with scripts makes the steps you used in your analysis clear, and the 13 | code you write can be inspected by someone else who can give you feedback and 14 | spot mistakes. 15 | 16 | Working with scripts forces you to have a deeper understanding of what you are 17 | doing, and facilitates your learning and comprehension of the methods you use. 18 | 19 | 20 | ## R code is great for reproducibility 21 | 22 | Reproducibility is when someone else (including your future self) can obtain the 23 | same results from the same dataset when using the same analysis. 24 | 25 | R integrates with other tools to generate manuscripts from your code. If you 26 | collect more data, or fix a mistake in your dataset, the figures and the 27 | statistical tests in your manuscript are updated automatically. 28 | 29 | An increasing number of journals and funding agencies expect analyses to be 30 | reproducible; knowing R will give you an edge with these requirements.
31 | 32 | 33 | ## R is interdisciplinary and extensible 34 | 35 | With 6,000+ packages that can be installed to extend its capabilities, R 36 | provides a framework that allows you to combine analyses across many scientific 37 | disciplines to best suit the analyses you want to use on your data. For 38 | instance, R has packages for image analysis, GIS, time series, population 39 | genetics, and a lot more. 40 | 41 | 42 | ## R works on data of all shapes and sizes 43 | 44 | The skills you learn with R scale easily with the size of your dataset. Whether 45 | your dataset has hundreds or millions of lines, it won't make much difference to 46 | you. 47 | 48 | R is designed for data analysis. It comes with special data structures and data 49 | types that make handling of missing data and statistical factors convenient. 50 | 51 | R can connect to spreadsheets, databases, and many other data formats, on your 52 | computer or on the web. 53 | 54 | 55 | ## R produces high-quality graphics 56 | 57 | The plotting capabilities in R are extensive, and allow you to adjust any 58 | aspect of your graph to convey the message from your data most effectively. 59 | 60 | 61 | ## R has a large community 62 | 63 | Thousands of people use R daily. Many of them are willing to help you through 64 | mailing lists and Stack Overflow. 65 | 66 | 67 | ## Not only is R free, it is also open-source and cross-platform 68 | 69 | Anyone can inspect the source code to see how R works. Because of this 70 | transparency, there is less chance for mistakes, and if you (or someone else) 71 | find some, you can report and fix bugs. 72 | -------------------------------------------------------------------------------- /projects.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Projects 4 | --- 5 | 6 | Class projects may cover any topic that involves quantitative methods.
7 | Take a look at the page of links to [datasets](../data) that could be 8 | analyzed as part of a project. 9 | 10 | Students are expected to contribute: 11 | 12 | * project code 13 | * oral presentations (Project Pitch & Final Presentation) 14 | 15 | Graduate students are also required to contribute: 16 | 17 | * written description of analysis (e.g., Methods and Results sections of a paper) 18 | 19 | ### Project Code 20 | 21 | The project code will be submitted via a link to a Google Drive folder that 22 | Dan will share with the class. 23 | 24 | 25 | The home directory of all projects should contain at least the following directories: 26 | 27 | * scripts 28 | * figs 29 | * data (only if the project actually used data) 30 | 31 | All R code in the scripts directory must assume that the working directory is 32 | the project home directory and all file paths must be relative to the project 33 | home directory. 34 | **Points will be deducted for absolute file paths** as these decrease the portability, 35 | reproducibility, and readability of code. 36 | 37 | Do not add very large (> 100 MB) data files to the shared drive. Instead, instructions for 38 | how to download these files, or a justification for why the data are not 39 | included with the code, are sufficient. 40 | 41 | If the data are not available to reproduce the results, then at minimum a 42 | representative example portion of the data must be included to provide a means 43 | of generating example results.
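The relative-path requirement above can be illustrated with a minimal sketch. It assumes the R session was started in the project home directory (for example by opening an RStudio project file there); the file names `example.csv` and `example_fit.pdf` are hypothetical placeholders, not files from any actual project.

```r
# Sketch only: assumes the working directory is the project home directory,
# so no setwd() call and no absolute paths are needed.

# portable: paths are relative to the project home directory
data_file = file.path("data", "example.csv")      # hypothetical file name
fig_file  = file.path("figs", "example_fit.pdf")  # hypothetical file name

# not portable: an absolute path only works on one person's computer
# data_file = "C:/Users/me/Documents/project/data/example.csv"

# read the data and save a figure only if the data file is present,
# so this sketch runs without error in an empty directory
if (file.exists(data_file)) {
    dat = read.csv(data_file)
    dir.create("figs", showWarnings = FALSE)  # make sure figs/ exists
    pdf(fig_file)
    plot(dat)
    dev.off()
}
```

Because `file.path()` builds paths with forward slashes on every platform, the same script should run unchanged for anyone who downloads the project folder and starts R in it.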
44 | 45 | The project directory should also contain a `README.md` file that describes (at a minimum): 46 | 47 | * the objective of the project 48 | * the structure of the code-base including dependencies 49 | * the structure of the data that is required as input including the metadata 50 | * instructions on how to recreate your results 51 | * any relevant acknowledgements 52 | 53 | Although not required, your instructor and your future self will find it very 54 | useful if you include a master script that controls project flow. 55 | See for example 56 | Another very effective approach is to use an R Markdown document that walks a reader 57 | through your analysis with code and results interspersed with plain English 58 | descriptions of motivations and methodology. See for example 59 | 60 | #### Peer code review 61 | 62 | All projects will be reviewed by two student peer reviewers using the following 63 | [template](./code_review). 64 | 65 | ### Oral Presentation 66 | 67 | #### Project Pitch (early in the semester) 68 | 69 | A short presentation that may be accompanied by slides that covers your: 70 | * question 71 | * methods / data 72 | * proposed or preliminary results 73 | * interpretation 74 | 75 | #### Final Presentation (late in the semester) 76 | 77 | A 10-minute presentation accompanied by slides on your: 78 | 79 | * question 80 | * methods 81 | * results 82 | * interpretation 83 | 84 | The oral presentation should summarize the broader context within which your 85 | work falls by citing the peer-reviewed literature. 86 | It should be clear what your over-arching question is and what specific questions 87 | you have attempted to address. 88 | Your data and statistical methods need to be adequately described. 89 | We do not need to know which R packages or what R code you used, but we do need to 90 | know the names of the methods you used and how you examined your hypotheses.
91 | Some projects will not use data, so the data description can be skipped in those 92 | cases. 93 | 94 | ### Written Description 95 | At a minimum, this should include: 96 | 97 | * Thesis statement (i.e., your question and predictions) 98 | * Methods 99 | * Results 100 | * Interpretation 101 | 102 | However, those who wish to tackle an entire scientific paper are encouraged to 103 | do so, and your instructor will give you comments on your entire document. 104 | The sections of the written description should be formatted and prepared in the 105 | style of a relevant scientific peer-reviewed journal in your field that you 106 | would like to submit the finished product to. 107 | Scientific literature should be cited in the methods and interpretation sections 108 | of the document. 109 | 110 | ### Links to student projects 111 | * Spring 2022 112 | - Public repos: 113 | - [Ephemeral Wetland Bird Biodiversity](https://github.com/jacksonbarrattheitmann/RclassProject) 114 | - [Microplastic and Algae Distribution in Coastal SC Stormwater Drainage Ponds](https://github.com/a-apint4/MP_Algae_Project) 115 | - [Indian River Lagoon - Community Ecosystem Function Project](https://github.com/Lexie-DelViscio/IRLCommunityEcosystemFunction) 116 | - [Using Hemolymph Chemistry to Predict and Assess Molting in Green Crabs, Carcinus maenas](https://github.com/emilydombrowski/green_crab_phys_2022) 117 | - [The effect of elevated salinity on survival of southern toad embryos](https://github.com/Regan-Honeycutt/Embryo-Survival) 118 | - [The ingestion of microplastics by young-of-the-year sharks in South Carolina estuaries](https://github.com/lattomusme/shark_plastics) 119 | - [Phenotypic and developmental effects of T-DNA inserts within auxin related genes on the development of A.
thaliana](https://github.com/sydowpw/APA-Development-Project) 120 | - [The effects of Eastern mud snails on benthic microalgae community structures](https://github.com/Timara-Vereen/RClass-Project) 121 | - [Model abundance of 3 target species of deep-sea corals in the Pinnacles Trend Mesophotic Area](https://github.com/MorganWill13/Pinnacles_Trend) 122 | - [Dolphin Vocalization and Sighting Analyses](https://github.com/ctribss/Projectfiles) 123 | - Private repos: 124 | - [Resource partitioning among Caribbean sponges](https://github.com/huntjones88/summer_2021_pulse-chase_data) 125 | 126 | 127 | * Spring 2020 128 | - Public repos: 129 | - [Chara_Ecosystem_Dynamics](https://github.com/CassandraEvanchuk/Chara_Ecosystem_Dynamics.git) 130 | - [Exposure of Nanobubble Ozonation on Red Drum](https://github.com/radchenkoa5/Exposure-of-Nanobubble-Ozonation-on-Red-Drum) 131 | - [Sciaenops ocellatus feeding trial metabolomics](https://github.com/daveklett/David-Klett-Sciaenops-Ocellatus-Feeding-Trial-Metabolomics) 132 | - [Wood tracking Method](https://github.com/millertp1/Wood_Tracking_Method.git) 133 | - [Fertilization results of pairwise crosses of Staghorn coral](https://github.com/eeparsons42/cervicornis_analysis) 134 | - [The effect of headstarting on bite force in the diamondback terrapin](https://github.com/reisenfeldk/Thesis) 135 | - [Sea trout physiological relationship between myxospore density in the muscle tissue and swimming performance](https://github.com/dalyjm/SST-project/) 136 | - [Life history parameters and the interacting abiotic variables of the brief squid, Lolliguncula brevis](https://github.com/jtgood/Lbrevis.git) 137 | - [Herpetofauna response to prescribed fire](https://github.com/mcglinnlab/fire_herps) 138 | - [Global shark and ray beta diversity](https://github.com/mosscr/Shark-Ray-Beta-Diversity) 139 | - [Analysis of Enterococcus in Charleston waterways](https://github.com/Vwilcox98/R-Project---CWK) 140 | 141 | * Spring 2019 (incomplete list) 142 |
- Public repos: 143 | - [Food consumption patterns in the Philippines](https://github.com/jbalipal/PhFoodExpenditures) 144 | - Private repos: 145 | - [Analysis of Shark Distribution in Bulls Bay, South Carolina](https://github.com/strangebb/shark-dist-bullsbay) 146 | 147 | 148 | * Spring 2018 149 | - Public repos: 150 | - [Mapping religion affiliation](https://github.com/katiebalcewicz/quant-methods/tree/master/Project) 151 | - [NFL statistics](https://github.com/g-rock/nfl.git) 152 | - [Rshiny app for the measurement of biodiversity](https://github.com/caroliver/mobr.git) 153 | - [Black Coral habitat suitability model](https://github.com/prouxzs/BlackCoralMesoscaleHabitatSuitabilityModel.git) 154 | - [Fitting thermal performance curves](https://github.com/Wellingem/Metabolic_thermal_performance_curves.git) 155 | - [Analysis of Avian Conservation Center bird strikes](https://github.com/conradcd/ACC_Bird_Strikes) 156 | - [Multiple Paternity Analysis Program](https://github.com/sporrema/Multiple-Paternity-Analysis-Program) 157 | - [How much will people pay for eco-friendly flowers](https://github.com/rachelwiser/WiserThesisRCode) 158 | * Spring 2016 159 | - Public repos: 160 | - [Modeling disease outbreaks](https://github.com/TomNash/vaccine-project) 161 | - [Crab incubation](https://github.com/mackk1/Project) 162 | - [Cognitive skill as a predictor of infarct volume](https://github.com/andersenme/infarct_volume_analysis) 163 | - [Spatial analysis of shots from NBA players](https://github.com/oshimamh/nbaProj) 164 | - [Salt intrusion in freshwater aquifers](https://github.com/mikala-randich/fwsw_proj) 165 | - [Horseshoe crab bleeding induced mortality](https://github.com/kristinlinesch/HSC_bleed) 166 | - [Shark morphometrics](https://github.com/Jordylacrosse/Shark-Morphometrics) 167 | - [Post-hurricane Hugo recovery of the Santee long-term fire experiment](https://github.com/smccau/santee_fire) 168 | - [Spatial and temporal trends in south Atlantic reef
fish](https://github.com/walkermf/Reef_fish-) 169 | - [Contemporary patterns of refugee migration](https://github.com/sarahwie/refugee_migration_trends) 170 | - [Modeling DMSP across depths along a longitudinal transect](https://github.com/shoresk/Savannah-June-2015-DMSP-Predictors) 171 | * Spring 2015 172 | - Public repos: 173 | - [Spatial decomposition of community variance in forests](https://github.com/claydustin/tree_vario) 174 | - [Spatial cross validation methods](https://github.com/lesliedb/spatial_cv) 175 | - [Modeling influenza across the US](https://github.com/tswilkin/Influenza-Quant-Project) 176 | - Private repos: 177 | - [Response of roots to CO2 enrichment](https://github.com/Kvcross/Duke_FACE_Belowground) 178 | - [Response of rhizomorphs to CO2 and N2 enrichment](https://github.com/davidmhood/Rhizomorph_FACE) 179 | - [Coral community composition in response to substrate](https://github.com/MRittinghouse/ThesisProject) 180 | - [Modeling fish abundance and size](https://github.com/friedrichknuth/project) 181 | 182 | 183 | -------------------------------------------------------------------------------- /projects/code_review.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Code Review 4 | --- 5 | 6 | ## General Review Checklist 7 | 8 | ### Purpose 9 | * Is the purpose of the project clear? 10 | 11 | * Is it clear what each file in the project is intended for? 12 | 13 | * Is it clear how the various files interact? 14 | 15 | * Is it clear what the purpose is of specific sections of code? 16 | 17 | * How well commented is the code on a scale of 1 (no comments) to 10 (very well commented)? 18 | 19 | * How can the purposes of the project and files be made clearer? 20 | 21 | ### Organization 22 | 23 | * Is the project organized such that you can intuit where the data, 24 | scripts, and output files are stored?
25 | 26 | * Approximately how much time did it take you to understand the work flow 27 | in the project? 28 | 29 | * How well defined are code chunks in the project? 30 | 31 | * How can organization be improved? 32 | 33 | ### Functionality 34 | 35 | * Does the code appear to advance the purpose of the project? 36 | 37 | * Do the existing components of the project appear to function? 38 | 39 | * How can the author improve functionality of code? 40 | 41 | 42 | ## Specific File Comments 43 | * README.md 44 | - (for example) Easy to understand, but consider adding a code licence... 45 | 46 | -------------------------------------------------------------------------------- /projects/naming-slides.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/projects/naming-slides.pdf -------------------------------------------------------------------------------- /public/R-Prog-Lang-Logo-sm.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/public/R-Prog-Lang-Logo-sm.png -------------------------------------------------------------------------------- /public/apple-touch-icon-144-precomposed.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/public/apple-touch-icon-144-precomposed.png -------------------------------------------------------------------------------- /public/cc-by-80x15.png: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/public/cc-by-80x15.png -------------------------------------------------------------------------------- /public/css/hyde.css: 
-------------------------------------------------------------------------------- 1 | /* 2 | * __ __ 3 | * /\ \ /\ \ 4 | * \ \ \___ __ __ \_\ \ __ 5 | * \ \ _ `\/\ \/\ \ /'_` \ /'__`\ 6 | * \ \ \ \ \ \ \_\ \/\ \_\ \/\ __/ 7 | * \ \_\ \_\/`____ \ \___,_\ \____\ 8 | * \/_/\/_/`/___/> \/__,_ /\/____/ 9 | * /\___/ 10 | * \/__/ 11 | * 12 | * Designed, built, and released under MIT license by @mdo. Learn more at 13 | * https://github.com/poole/hyde. 14 | */ 15 | 16 | 17 | /* 18 | * Contents 19 | * 20 | * Global resets 21 | * Sidebar 22 | * Container 23 | * Reverse layout 24 | * Themes 25 | */ 26 | 27 | 28 | /* 29 | * Global resets 30 | * 31 | * Update the foundational and global aspects of the page. 32 | */ 33 | 34 | html { 35 | font-family: "PT Sans", Helvetica, Arial, sans-serif; 36 | } 37 | @media (min-width: 48em) { 38 | html { 39 | font-size: 16px; 40 | } 41 | } 42 | @media (min-width: 58em) { 43 | html { 44 | font-size: 20px; 45 | } 46 | } 47 | 48 | 49 | /* 50 | * Sidebar 51 | * 52 | * Flexible banner for housing site name, intro, and "footer" content. Starts 53 | * out above content in mobile and later moves to the side with wider viewports. 
54 | */ 55 | 56 | .sidebar { 57 | text-align: center; 58 | padding: 2rem 1rem; 59 | color: rgba(255,255,255,.5); 60 | background-color: #202020; 61 | } 62 | @media (min-width: 48em) { 63 | .sidebar { 64 | position: fixed; 65 | top: 0; 66 | left: 0; 67 | bottom: 0; 68 | width: 18rem; 69 | text-align: left; 70 | } 71 | } 72 | 73 | /* Sidebar links */ 74 | .sidebar a { 75 | color: #fff; 76 | } 77 | 78 | /* About section */ 79 | .sidebar-about h1 { 80 | color: #fff; 81 | margin-top: 0; 82 | font-family: "Abril Fatface", serif; 83 | font-size: 2.25rem; 84 | } 85 | 86 | /* Sidebar nav */ 87 | .sidebar-nav { 88 | margin-bottom: 1rem; 89 | } 90 | .sidebar-nav-item { 91 | display: block; 92 | } 93 | a.sidebar-nav-item:hover, 94 | a.sidebar-nav-item:focus { 95 | text-decoration: underline; 96 | } 97 | .sidebar-nav-item.active { 98 | font-weight: bold; 99 | } 100 | 101 | /* Sticky sidebar 102 | * 103 | * Add the `sidebar-sticky` class to the sidebar's container to affix it the 104 | * contents to the bottom of the sidebar in tablets and up. 105 | */ 106 | 107 | @media (min-width: 48em) { 108 | .sidebar-sticky { 109 | position: absolute; 110 | right: 1rem; 111 | bottom: 1rem; 112 | left: 1rem; 113 | } 114 | } 115 | 116 | 117 | /* Container 118 | * 119 | * Align the contents of the site above the proper threshold with some margin-fu 120 | * with a 25%-wide `.sidebar`. 121 | */ 122 | 123 | .content { 124 | padding-top: 4rem; 125 | padding-bottom: 4rem; 126 | } 127 | 128 | @media (min-width: 48em) { 129 | .content { 130 | max-width: 38rem; 131 | margin-left: 20rem; 132 | margin-right: 2rem; 133 | } 134 | } 135 | 136 | @media (min-width: 64em) { 137 | .content { 138 | margin-left: 22rem; 139 | margin-right: 4rem; 140 | } 141 | } 142 | 143 | 144 | /* 145 | * Reverse layout 146 | * 147 | * Flip the orientation of the page by placing the `.sidebar` on the right. 
148 | */ 149 | 150 | @media (min-width: 48em) { 151 | .layout-reverse .sidebar { 152 | left: auto; 153 | right: 0; 154 | } 155 | .layout-reverse .content { 156 | margin-left: 2rem; 157 | margin-right: 20rem; 158 | } 159 | } 160 | 161 | @media (min-width: 64em) { 162 | .layout-reverse .content { 163 | margin-left: 4rem; 164 | margin-right: 22rem; 165 | } 166 | } 167 | 168 | 169 | 170 | /* 171 | * Themes 172 | * 173 | * As of v1.1, Hyde includes optional themes to color the sidebar and links 174 | * within blog posts. To use, add the class of your choosing to the `body`. 175 | */ 176 | 177 | /* Base16 (http://chriskempson.github.io/base16/#default) */ 178 | 179 | /* Red */ 180 | .theme-base-08 .sidebar { 181 | background-color: #ac4142; 182 | } 183 | .theme-base-08 .content a, 184 | .theme-base-08 .related-posts li a:hover { 185 | color: #ac4142; 186 | } 187 | 188 | /* Orange */ 189 | .theme-base-09 .sidebar { 190 | background-color: #d28445; 191 | } 192 | .theme-base-09 .content a, 193 | .theme-base-09 .related-posts li a:hover { 194 | color: #d28445; 195 | } 196 | 197 | /* Yellow */ 198 | .theme-base-0a .sidebar { 199 | background-color: #f4bf75; 200 | } 201 | .theme-base-0a .content a, 202 | .theme-base-0a .related-posts li a:hover { 203 | color: #f4bf75; 204 | } 205 | 206 | /* Green */ 207 | .theme-base-0b .sidebar { 208 | background-color: #90a959; 209 | } 210 | .theme-base-0b .content a, 211 | .theme-base-0b .related-posts li a:hover { 212 | color: #90a959; 213 | } 214 | 215 | /* Cyan */ 216 | .theme-base-0c .sidebar { 217 | background-color: #75b5aa; 218 | } 219 | .theme-base-0c .content a, 220 | .theme-base-0c .related-posts li a:hover { 221 | color: #75b5aa; 222 | } 223 | 224 | /* Blue */ 225 | .theme-base-0d .sidebar { 226 | background-color: #6a9fb5; 227 | } 228 | .theme-base-0d .content a, 229 | .theme-base-0d .related-posts li a:hover { 230 | color: #6a9fb5; 231 | } 232 | 233 | /* Magenta */ 234 | .theme-base-0e .sidebar { 235 | background-color: 
#aa759f; 236 | } 237 | .theme-base-0e .content a, 238 | .theme-base-0e .related-posts li a:hover { 239 | color: #aa759f; 240 | } 241 | 242 | /* Brown */ 243 | .theme-base-0f .sidebar { 244 | background-color: #8f5536; 245 | } 246 | .theme-base-0f .content a, 247 | .theme-base-0f .related-posts li a:hover { 248 | color: #8f5536; 249 | } 250 | -------------------------------------------------------------------------------- /public/css/poole.css: -------------------------------------------------------------------------------- 1 | /* 2 | * ___ 3 | * /\_ \ 4 | * _____ ___ ___\//\ \ __ 5 | * /\ '__`\ / __`\ / __`\\ \ \ /'__`\ 6 | * \ \ \_\ \/\ \_\ \/\ \_\ \\_\ \_/\ __/ 7 | * \ \ ,__/\ \____/\ \____//\____\ \____\ 8 | * \ \ \/ \/___/ \/___/ \/____/\/____/ 9 | * \ \_\ 10 | * \/_/ 11 | * 12 | * Designed, built, and released under MIT license by @mdo. Learn more at 13 | * https://github.com/poole/poole. 14 | */ 15 | 16 | 17 | /* 18 | * Contents 19 | * 20 | * Body resets 21 | * Custom type 22 | * Messages 23 | * Container 24 | * Masthead 25 | * Posts and pages 26 | * Pagination 27 | * Reverse layout 28 | * Themes 29 | */ 30 | 31 | 32 | /* 33 | * Body resets 34 | * 35 | * Update the foundational and global aspects of the page. 
36 | */ 37 | 38 | * { 39 | -webkit-box-sizing: border-box; 40 | -moz-box-sizing: border-box; 41 | box-sizing: border-box; 42 | } 43 | 44 | html, 45 | body { 46 | margin: 0; 47 | padding: 0; 48 | } 49 | 50 | html { 51 | font-family: "Helvetica Neue", Helvetica, Arial, sans-serif; 52 | font-size: 16px; 53 | line-height: 1.5; 54 | } 55 | @media (min-width: 38em) { 56 | html { 57 | font-size: 20px; 58 | } 59 | } 60 | 61 | body { 62 | color: #515151; 63 | background-color: #fff; 64 | -webkit-text-size-adjust: 100%; 65 | -ms-text-size-adjust: 100%; 66 | } 67 | 68 | /* No `:visited` state is required by default (browsers will use `a`) */ 69 | a { 70 | color: #268bd2; 71 | text-decoration: none; 72 | } 73 | a strong { 74 | color: inherit; 75 | } 76 | /* `:focus` is linked to `:hover` for basic accessibility */ 77 | a:hover, 78 | a:focus { 79 | text-decoration: underline; 80 | } 81 | 82 | /* Headings */ 83 | h1, h2, h3, h4, h5, h6 { 84 | margin-bottom: .5rem; 85 | font-weight: bold; 86 | line-height: 1.25; 87 | color: #313131; 88 | text-rendering: optimizeLegibility; 89 | } 90 | h1 { 91 | font-size: 2rem; 92 | } 93 | h2 { 94 | margin-top: 1rem; 95 | font-size: 1.5rem; 96 | } 97 | h3 { 98 | margin-top: 1.5rem; 99 | font-size: 1.25rem; 100 | } 101 | h4, h5, h6 { 102 | margin-top: 1rem; 103 | font-size: 1rem; 104 | } 105 | 106 | /* Body text */ 107 | p { 108 | margin-top: 0; 109 | margin-bottom: 1rem; 110 | } 111 | 112 | strong { 113 | color: #303030; 114 | } 115 | 116 | 117 | /* Lists */ 118 | ul, ol, dl { 119 | margin-top: 0; 120 | margin-bottom: 1rem; 121 | } 122 | 123 | dt { 124 | font-weight: bold; 125 | } 126 | dd { 127 | margin-bottom: .5rem; 128 | } 129 | 130 | /* Misc */ 131 | hr { 132 | position: relative; 133 | margin: 1.5rem 0; 134 | border: 0; 135 | border-top: 1px solid #eee; 136 | border-bottom: 1px solid #fff; 137 | } 138 | 139 | abbr { 140 | font-size: 85%; 141 | font-weight: bold; 142 | color: #555; 143 | text-transform: uppercase; 144 | } 145 | abbr[title] { 
146 | cursor: help; 147 | border-bottom: 1px dotted #e5e5e5; 148 | } 149 | 150 | /* Code */ 151 | code, 152 | pre { 153 | font-family: Menlo, Monaco, "Courier New", monospace; 154 | } 155 | code { 156 | padding: .25em .5em; 157 | font-size: 85%; 158 | color: #bf616a; 159 | background-color: #f9f9f9; 160 | border-radius: 3px; 161 | } 162 | pre { 163 | display: block; 164 | margin-top: 0; 165 | margin-bottom: 1rem; 166 | padding: 1rem; 167 | font-size: .8rem; 168 | line-height: 1.4; 169 | white-space: pre; 170 | white-space: pre-wrap; 171 | word-break: break-all; 172 | word-wrap: break-word; 173 | background-color: #f9f9f9; 174 | } 175 | pre code { 176 | padding: 0; 177 | font-size: 100%; 178 | color: inherit; 179 | background-color: transparent; 180 | } 181 | 182 | /* Pygments via Jekyll */ 183 | .highlight { 184 | margin-bottom: 1rem; 185 | border-radius: 4px; 186 | } 187 | .highlight pre { 188 | margin-bottom: 0; 189 | } 190 | 191 | /* Gist via GitHub Pages */ 192 | .gist .gist-file { 193 | font-family: Menlo, Monaco, "Courier New", monospace !important; 194 | } 195 | .gist .markdown-body { 196 | padding: 15px; 197 | } 198 | .gist pre { 199 | padding: 0; 200 | background-color: transparent; 201 | } 202 | .gist .gist-file .gist-data { 203 | font-size: .8rem !important; 204 | line-height: 1.4; 205 | } 206 | .gist code { 207 | padding: 0; 208 | color: inherit; 209 | background-color: transparent; 210 | border-radius: 0; 211 | } 212 | 213 | /* Quotes */ 214 | blockquote { 215 | padding: .5rem 1rem; 216 | margin: .8rem 0; 217 | color: #7a7a7a; 218 | border-left: .25rem solid #e5e5e5; 219 | } 220 | blockquote p:last-child { 221 | margin-bottom: 0; 222 | } 223 | @media (min-width: 30em) { 224 | blockquote { 225 | padding-right: 5rem; 226 | padding-left: 1.25rem; 227 | } 228 | } 229 | 230 | img { 231 | display: block; 232 | max-width: 100%; 233 | margin: 0 0 1rem; 234 | border-radius: 5px; 235 | } 236 | 237 | /* Tables */ 238 | table { 239 | margin-bottom: 1rem; 240 | 
width: 100%; 241 | border: 1px solid #e5e5e5; 242 | border-collapse: collapse; 243 | } 244 | td, 245 | th { 246 | padding: .25rem .5rem; 247 | border: 1px solid #e5e5e5; 248 | } 249 | tbody tr:nth-child(odd) td, 250 | tbody tr:nth-child(odd) th { 251 | background-color: #f9f9f9; 252 | } 253 | 254 | 255 | /* 256 | * Custom type 257 | * 258 | * Extend paragraphs with `.lead` for larger introductory text. 259 | */ 260 | 261 | .lead { 262 | font-size: 1.25rem; 263 | font-weight: 300; 264 | } 265 | 266 | 267 | /* 268 | * Messages 269 | * 270 | * Show alert messages to users. You may add it to single elements like a `<p>`, 271 | * or to a parent if there are multiple elements to show. 272 | */ 273 | 274 | .message { 275 | margin-bottom: 1rem; 276 | padding: 1rem; 277 | color: #717171; 278 | background-color: #f9f9f9; 279 | } 280 | 281 | 282 | /* 283 | * Container 284 | * 285 | * Center the page content. 286 | */ 287 | 288 | .container { 289 | max-width: 38rem; 290 | padding-left: 1rem; 291 | padding-right: 1rem; 292 | margin-left: auto; 293 | margin-right: auto; 294 | } 295 | 296 | 297 | /* 298 | * Masthead 299 | * 300 | * Super small header above the content for site name and short description. 301 | */ 302 | 303 | .masthead { 304 | padding-top: 1rem; 305 | padding-bottom: 1rem; 306 | margin-bottom: 3rem; 307 | } 308 | .masthead-title { 309 | margin-top: 0; 310 | margin-bottom: 0; 311 | color: #505050; 312 | } 313 | .masthead-title a { 314 | color: #505050; 315 | } 316 | .masthead-title small { 317 | font-size: 75%; 318 | font-weight: 400; 319 | color: #c0c0c0; 320 | letter-spacing: 0; 321 | } 322 | 323 | 324 | /* 325 | * Posts and pages 326 | * 327 | * Each post is wrapped in `.post` and is used on default and post layouts. Each 328 | * page is wrapped in `.page` and is only used on the page layout.
329 | */ 330 | 331 | .page, 332 | .post { 333 | margin-bottom: 4em; 334 | } 335 | 336 | /* Blog post or page title */ 337 | .page-title, 338 | .post-title, 339 | .post-title a { 340 | color: #303030; 341 | } 342 | .page-title, 343 | .post-title { 344 | margin-top: 0; 345 | } 346 | 347 | /* Metadata line below post title */ 348 | .post-date { 349 | display: block; 350 | margin-top: -.5rem; 351 | margin-bottom: 1rem; 352 | color: #9a9a9a; 353 | } 354 | 355 | /* Related posts */ 356 | .related { 357 | padding-top: 2rem; 358 | padding-bottom: 2rem; 359 | border-top: 1px solid #eee; 360 | } 361 | .related-posts { 362 | padding-left: 0; 363 | list-style: none; 364 | } 365 | .related-posts h3 { 366 | margin-top: 0; 367 | } 368 | .related-posts li small { 369 | font-size: 75%; 370 | color: #999; 371 | } 372 | .related-posts li a:hover { 373 | color: #268bd2; 374 | text-decoration: none; 375 | } 376 | .related-posts li a:hover small { 377 | color: inherit; 378 | } 379 | 380 | 381 | /* 382 | * Pagination 383 | * 384 | * Super lightweight (HTML-wise) blog pagination. `span`s are provided for when 385 | * there are no more previous or next posts to show.
386 | */ 387 | 388 | .pagination { 389 | overflow: hidden; /* clearfix */ 390 | margin-left: -1rem; 391 | margin-right: -1rem; 392 | font-family: "PT Sans", Helvetica, Arial, sans-serif; 393 | color: #ccc; 394 | text-align: center; 395 | } 396 | 397 | /* Pagination items can be `span`s or `a`s */ 398 | .pagination-item { 399 | display: block; 400 | padding: 1rem; 401 | border: 1px solid #eee; 402 | } 403 | .pagination-item:first-child { 404 | margin-bottom: -1px; 405 | } 406 | 407 | /* Only provide a hover state for linked pagination items */ 408 | a.pagination-item:hover { 409 | background-color: #f5f5f5; 410 | } 411 | 412 | @media (min-width: 30em) { 413 | .pagination { 414 | margin: 3rem 0; 415 | } 416 | .pagination-item { 417 | float: left; 418 | width: 50%; 419 | } 420 | .pagination-item:first-child { 421 | margin-bottom: 0; 422 | border-top-left-radius: 4px; 423 | border-bottom-left-radius: 4px; 424 | } 425 | .pagination-item:last-child { 426 | margin-left: -1px; 427 | border-top-right-radius: 4px; 428 | border-bottom-right-radius: 4px; 429 | } 430 | } 431 | -------------------------------------------------------------------------------- /public/css/syntax.css: -------------------------------------------------------------------------------- 1 | .highlight .hll { background-color: #ffc; } 2 | .highlight .c { color: #999; } /* Comment */ 3 | .highlight .err { color: #a00; background-color: #faa } /* Error */ 4 | .highlight .k { color: #069; } /* Keyword */ 5 | .highlight .o { color: #555 } /* Operator */ 6 | .highlight .cm { color: #09f; font-style: italic } /* Comment.Multiline */ 7 | .highlight .cp { color: #099 } /* Comment.Preproc */ 8 | .highlight .c1 { color: #999; } /* Comment.Single */ 9 | .highlight .cs { color: #999; } /* Comment.Special */ 10 | .highlight .gd { background-color: #fcc; border: 1px solid #c00 } /* Generic.Deleted */ 11 | .highlight .ge { font-style: italic } /* Generic.Emph */ 12 | .highlight .gr { color: #f00 } /* Generic.Error */ 13 
| .highlight .gh { color: #030; } /* Generic.Heading */ 14 | .highlight .gi { background-color: #cfc; border: 1px solid #0c0 } /* Generic.Inserted */ 15 | .highlight .go { color: #aaa } /* Generic.Output */ 16 | .highlight .gp { color: #009; } /* Generic.Prompt */ 17 | .highlight .gs { } /* Generic.Strong */ 18 | .highlight .gu { color: #030; } /* Generic.Subheading */ 19 | .highlight .gt { color: #9c6 } /* Generic.Traceback */ 20 | .highlight .kc { color: #069; } /* Keyword.Constant */ 21 | .highlight .kd { color: #069; } /* Keyword.Declaration */ 22 | .highlight .kn { color: #069; } /* Keyword.Namespace */ 23 | .highlight .kp { color: #069 } /* Keyword.Pseudo */ 24 | .highlight .kr { color: #069; } /* Keyword.Reserved */ 25 | .highlight .kt { color: #078; } /* Keyword.Type */ 26 | .highlight .m { color: #f60 } /* Literal.Number */ 27 | .highlight .s { color: #d44950 } /* Literal.String */ 28 | .highlight .na { color: #4f9fcf } /* Name.Attribute */ 29 | .highlight .nb { color: #366 } /* Name.Builtin */ 30 | .highlight .nc { color: #0a8; } /* Name.Class */ 31 | .highlight .no { color: #360 } /* Name.Constant */ 32 | .highlight .nd { color: #99f } /* Name.Decorator */ 33 | .highlight .ni { color: #999; } /* Name.Entity */ 34 | .highlight .ne { color: #c00; } /* Name.Exception */ 35 | .highlight .nf { color: #c0f } /* Name.Function */ 36 | .highlight .nl { color: #99f } /* Name.Label */ 37 | .highlight .nn { color: #0cf; } /* Name.Namespace */ 38 | .highlight .nt { color: #2f6f9f; } /* Name.Tag */ 39 | .highlight .nv { color: #033 } /* Name.Variable */ 40 | .highlight .ow { color: #000; } /* Operator.Word */ 41 | .highlight .w { color: #bbb } /* Text.Whitespace */ 42 | .highlight .mf { color: #f60 } /* Literal.Number.Float */ 43 | .highlight .mh { color: #f60 } /* Literal.Number.Hex */ 44 | .highlight .mi { color: #f60 } /* Literal.Number.Integer */ 45 | .highlight .mo { color: #f60 } /* Literal.Number.Oct */ 46 | .highlight .sb { color: #c30 } /* 
Literal.String.Backtick */ 47 | .highlight .sc { color: #c30 } /* Literal.String.Char */ 48 | .highlight .sd { color: #c30; font-style: italic } /* Literal.String.Doc */ 49 | .highlight .s2 { color: #c30 } /* Literal.String.Double */ 50 | .highlight .se { color: #c30; } /* Literal.String.Escape */ 51 | .highlight .sh { color: #c30 } /* Literal.String.Heredoc */ 52 | .highlight .si { color: #a00 } /* Literal.String.Interpol */ 53 | .highlight .sx { color: #c30 } /* Literal.String.Other */ 54 | .highlight .sr { color: #3aa } /* Literal.String.Regex */ 55 | .highlight .s1 { color: #c30 } /* Literal.String.Single */ 56 | .highlight .ss { color: #fc3 } /* Literal.String.Symbol */ 57 | .highlight .bp { color: #366 } /* Name.Builtin.Pseudo */ 58 | .highlight .vc { color: #033 } /* Name.Variable.Class */ 59 | .highlight .vg { color: #033 } /* Name.Variable.Global */ 60 | .highlight .vi { color: #033 } /* Name.Variable.Instance */ 61 | .highlight .il { color: #f60 } /* Literal.Number.Integer.Long */ 62 | 63 | .css .o, 64 | .css .o + .nt, 65 | .css .nt + .nt { color: #999; } 66 | -------------------------------------------------------------------------------- /resources.md: -------------------------------------------------------------------------------- 1 | --- 2 | layout: page 3 | title: Resources 4 | --- 5 | 6 | ## Best Practices 7 | * Software Carpentry. 8 | 9 | * Data Carpentry. 10 | 11 | * Why are your files and directories a mess? 12 | 13 | * How to name files 14 | 15 | * Getting your project files in order, Hao Ye's Notes on Data Organization in 16 | Spreadsheets and R. 17 | 18 | * Naming Things - by Jenny Bryan 19 | 20 | * Wilson, G. et al. 2014. Best Practices for Scientific Computing. PLoS Biol 21 | 12(1): e1001745. doi:10.1371/journal.pbio.1001745. URL: 22 | 23 | * White, E.P. et al. 2013. Nine simple ways to make it easier to (re)use your 24 | data. Ideas in Ecology and Evolution. 6(2): 1-10. URL: 25 | 26 | * Ram, K. 2013. 
Git can facilitate greater reproducibility and increased 27 | transparency in science. Source Code for Biology and Medicine. 8:7. 28 | doi:10.1186/1751-0473-8-7 URL: 29 | 30 | * A data science blog geared towards beginners with a mixture of R and Python 31 | content 32 | 33 | ## Statistics 34 | * Jack Weiss's courses on statistics for ecologists and environmental scientists 35 | - [Statistical Methods in Ecology](https://sakai.unc.edu/access/content/group/3d1eb92e-7848-4f55-90c3-7c72a54e7e43/public/index.html) 36 | - [Statistics for Environmental Science](https://sakai.unc.edu/access/content/group/2842013b-58f5-4453-aa8d-3e01bacbfc3d/public/Ecol562_Spring2012/index.html) 37 | * Code to accompany *A primer of ecological statistics* (Gotelli and Ellison 2012) 38 | - 39 | * Datasets to accompany *The R book* (Jones et al. 2022) 40 | - 41 | * Introduction to Data Exploration and Analysis with R by Michael Mahoney 42 | - free ebook 43 | - Basic statistics using R 44 | * Topics in R Statistical Language (Penn State Open Edu) 45 | - [Statistics in R - Part 1](https://online.stat.psu.edu/stat484/) 46 | - [Statistics in R - Part 2](https://online.stat.psu.edu/stat485/) 47 | * Jason Fridley's R-based plant ecology course 48 | - 49 | * Patrick Breheny's courses on statistics for biologists 50 | - 51 | * Linear Models with R (free ebook) 52 | - 53 | * Mixed Effects Models and Extensions in Ecology with R by Zuur et al. (free for cofc students) 54 | - 55 | * Anova 56 | - 57 | - Type I/II/III: 58 | - 59 | - Contrasts: 60 | - 61 | - 62 | * Ordination webpage 63 | - Amazing resource for multivariate approaches for analyzing community 64 | ecology data. 65 | - 66 | * Anscombe's quartet 67 | - 68 | * Dynamic Ecology blog posts by Brian McGill related to statistics and data. 
69 | - [Why AIC appeals to ecologists' lowest instincts](https://dynamicecology.wordpress.com/2015/05/21/why-aic-appeals-to-ecologists-lowest-instincts/) 70 | - [In praise of exploratory statistics](https://dynamicecology.wordpress.com/2013/10/16/in-praise-of-exploratory-statistics/) 71 | - Ecologists need to do a better job of prediction (4-part series) 72 | - [Part I – the insidious evils of ANOVA](https://dynamicecology.wordpress.com/2012/11/27/ecologists-need-to-do-a-better-job-of-prediction-part-i-the-insidious-evils-of-anova/) 73 | - [Part II - partly cloudy and a 20% chance of extinction (or the 6 P’s of good prediction)](https://dynamicecology.wordpress.com/2013/01/09/ecologists-need-to-do-a-better-job-of-prediction-part-ii-mechanism-vs-pattern/) 74 | - [Part III - mechanistic or phenomenological?](https://dynamicecology.wordpress.com/2013/02/21/ecologists-need-to-do-a-better-job-of-prediction-part-iii-the-need-for-data/) 75 | - [Part IV - quantifying prediction quality](https://dynamicecology.wordpress.com/2013/03/19/ecologists-need-to-do-a-better-job-of-prediction-part-iv-quantifying-prediction-quality/) 76 | - [Ten commandments for good data management](https://dynamicecology.wordpress.com/2016/08/22/ten-commandments-for-good-data-management) 77 | - [Why ecology is hard (and fun) – multicausality](https://dynamicecology.wordpress.com/2016/03/02/why-ecology-is-hard-and-fun-multicausality) 78 | 79 | ## R Programming 80 | 81 | * Big Book of R: a collection of bookmarked R lessons and demos 82 | - 83 | 84 | ### Basics 85 | * Free manuals and tutorials provided by R users 86 | - 87 | * R programming style guide 88 | - by Wickham 89 | - [lintr](https://github.com/jimhester/lintr) for cleaning up R code 90 | - [formatR](https://yihui.name/formatr/) 91 | * Tryr by codeschool 92 | - Interactive online lesson for learning R basics 93 | - 94 | * A fairly comprehensive R reference card 95 | - 96 | * Programming with R by Software Carpentry 97 | - The software 
carpentry team have some of the best lessons for learning 98 | computational tools on the web. 99 | - 100 | * Five useful R functions for manipulating data 101 | - 102 | 103 | ### Advanced 104 | * Advanced R by Hadley Wickham: 105 | - One of the best references for taking your R to the next level 106 | - 107 | 108 | ### Graphics 109 | * Colors 110 | - [Color Palettes](https://www.nceas.ucsb.edu/~frazier/RSpatialGuides/colorPaletteCheatsheet.pdf) 111 | 112 | ## Git 113 | * Git and GitHub by Hadley Wickham 114 | - A very nice overview and step by step instructions for using git and 115 | integrating it with Rstudio. 116 | - 117 | * git - the simple guide 118 | - 119 | * Version Control by Software Carpentry 120 | - 121 | * Git reference card: 122 | - 123 | * Git quick reference for beginners 124 | - 125 | * Git flight rules - for when things go wrong 126 | - 127 | 128 | -------------------------------------------------------------------------------- /scripts/collect_student_urls.R: -------------------------------------------------------------------------------- 1 | # clone repos from data frame 2 | 3 | stud <- read.csv('./data/Rclass_spring20.csv') 4 | stud$HW.url 5 | # drop last row 6 | stud <- stud[-nrow(stud), ] 7 | 8 | year = 2020 9 | student_path = paste0('./student_', year) 10 | 11 | for (i in seq_along(stud$HW.url)) { 12 | #dir.create(paste0(student_path, '/', stud$Username[i])) 13 | system(paste0('cd ', student_path, '; git clone ', 14 | stud$HW.url[i], ' ', stud$Username[i]), 15 | intern=TRUE)[1] 16 | } 17 | 18 | year = 2020 19 | student_path = paste0('./student_', year) 20 | students = dir(student_path) 21 | git_urls = list() 22 | for(i in seq_along(students)) { 23 | git_urls[i] = system(paste0('cd ', student_path, '/', students[i], 24 | ' ; git remote -v'), intern=TRUE)[1] 25 | } 26 | 27 | ## older code useful if you've already cloned all student repos 28 | 29 | git_urls = unlist(git_urls) 30 | git_urls = sub('origin\t', '', git_urls, fixed=T) 31 | 
git_urls = sub(' (fetch)', '', git_urls, fixed=T) 32 | 33 | git_handle = sub('https://github.com/','',git_urls) 34 | git_handle = sub('/.+', '', git_handle) 35 | 36 | 37 | write.csv(data.frame(git_urls, git_handle), 38 | file='./student_git_repos.csv', row.names=F) 39 | -------------------------------------------------------------------------------- /scripts/download_fire_data.R: -------------------------------------------------------------------------------- 1 | # metadata url: 2 | # https://fsapps.nwcg.gov/afm/data/fireptdata/modisfire_2007_conus.htm 3 | # data from MODIS is from 2007 to 2020 but some years are missing(?) 4 | 5 | yrs <- c(2007, 2009, 2010, 2013:2021) 6 | 7 | for (i in seq_along(yrs)) { 8 | url <- paste0('https://fsapps.nwcg.gov/afm/data/fireptdata/modis_fire_', yrs[i], 9 | '_365_conus_shapefile.zip') 10 | folder <- paste0('./data/modis_fire/modis_fire_', yrs[i], '_365_conus') 11 | file <- paste0('modis_fire_', yrs[i], '_365_conus_shapefile.zip') 12 | dir.create(folder) 13 | download.file(url, paste(folder, file, sep ='/')) 14 | unzip(paste(folder, file, sep ='/'), exdir = folder) 15 | } 16 | -------------------------------------------------------------------------------- /scripts/fibanacci_seq.R: -------------------------------------------------------------------------------- 1 | 2 | sum2c <- function(x) { 3 | return(c(x, x[length(x)] + x[length(x)-1])) 4 | } 5 | 6 | get_fib <- function(x, depth) { 7 | output <- x 8 | for(i in 1:depth) 9 | output <- sum2c(output) 10 | return(output) 11 | } 12 | 13 | get_ratios <- function(x) { 14 | return(x[-1] / x[-length(x)]) 15 | } 16 | 17 | plot_fib <- function(x) { 18 | x <- rev(x) 19 | xlims <- c(0, x[1] + x[2]) 20 | ylims <- c(0, x[1]) 21 | plot(1, 1, type ='n', xlim = xlims, ylim = ylims) 22 | for (i in seq_along(x)) 23 | polygon(c(0, x[i], x[i], 0), 24 | c(x[i], x[i], 0, 0), col=i) 25 | } 26 | 27 | plot_fib2 <- function(x) { 28 | x <- rev(x) 29 | xlims <- c(0, x[1] + x[2]) 30 | ylims <- c(0, x[1]) 31 | 
plot(1, 1, type ='n', xlim = xlims, ylim = ylims) 32 | for (i in seq_along(x)) { 33 | if (i == 1) { 34 | x_start <- 0 ; y_start <- 0 35 | x_end <- x[i] ; y_end <- x[i] 36 | polygon(c(x_end , x_start, x_start, x_end), 37 | c(y_start, y_start, y_end , y_end), 38 | col=i) 39 | } else { 40 | 41 | if (i %% 2 == 0) { # even 42 | x_start <- x_end ; y_start <- y_end 43 | x_end <- x_start + x[i] ; y_end <- y_start - x[i] 44 | polygon(c(x_start, x_end , x_end, x_start), 45 | c(y_start, y_start, y_end , y_end), 46 | col=i) 47 | } else { 48 | x_start <- x_end ; y_start <- y_end 49 | x_end <- x_end - x[i] ; y_end <- y_start - x[i] 50 | polygon(c(x_start, x_end, x_end, x_start), c(y_end, y_end, y_start, y_start), 51 | col=i) 52 | } 53 | } 54 | } 55 | } 56 | 57 | 58 | sum2c(0:1) 59 | sum2c(3:4) 60 | 61 | sum2c(c(0, 1, 1, 2, 3, 5)) 62 | 63 | sum2c(sum2c(sum2c(sum2c(0:1)))) 64 | 65 | fs <- get_fib(0:1, 20) 66 | 67 | get_ratios(fs) 68 | 69 | plot(fs, type = 'o', log='y') 70 | plot(get_ratios(fs), type ='o') 71 | abline(h = (1 + sqrt(5)) / 2, lty=2) # golden ratio 72 | 73 | 74 | -------------------------------------------------------------------------------- /scripts/google_sheets_mang.R: -------------------------------------------------------------------------------- 1 | 2 | library(googledrive) 3 | library(googlesheets4) # provides read_sheet() 4 | # read in class list 5 | rclass <- read_sheet("https://docs.google.com/spreadsheets/d/1cZYMmzFNHoggn8qBTlcq_Q4srirbv9JK5WyDo2sJ950/edit?gid=0#gid=0") 6 | rclass$last 7 | 8 | eval_urls <- NULL 9 | for (i in seq_along(rclass$last)) { 10 | drive_cp(file = "https://docs.google.com/spreadsheets/d/16An8KUIj_wOS-RZUiTCl82O3gPSHFTZrvr7_Y02ciwY") 11 | new_file <- drive_get(path = "student_evals_2025/Copy of student_feedback") 12 | renamed_file <- drive_mv(as_id(new_file), name = paste("student_feedback_", rclass$last[i], sep='')) 13 | renamed_file %>% 14 | drive_share( 15 | role = "writer", 16 | type = "user", 17 | emailAddress = rclass$email[i], 18 | emailMessage = "Here is your R project 
sheet" 19 | ) 20 | eval_urls[i] <- drive_link(renamed_file) 21 | } 22 | 23 | eval_urls 24 | -------------------------------------------------------------------------------- /scripts/pull_student_repos.R: -------------------------------------------------------------------------------- 1 | year = 2020 2 | student_path = paste0('./student_', year) 3 | students = dir(student_path) 4 | for(i in seq_along(students)) { 5 | system(paste0('cd ', student_path, '/', students[i], 6 | ' ; git pull origin master')) 7 | } 8 | -------------------------------------------------------------------------------- /scripts/shiny_kmeans.R: -------------------------------------------------------------------------------- 1 | library(shiny) 2 | 3 | 4 | ui = pageWithSidebar( 5 | headerPanel('Iris k-means clustering'), 6 | sidebarPanel( 7 | selectInput('xcol', 'X Variable', names(iris)), 8 | selectInput('ycol', 'Y Variable', names(iris), 9 | selected=names(iris)[[2]]) 10 | ), 11 | mainPanel( 12 | verbatimTextOutput('lm_sum') 13 | ) 14 | ) 15 | 16 | 17 | server = function(input, output, session) { 18 | selectedData <- reactive({ 19 | iris[, c(input$xcol, input$ycol)] 20 | }) 21 | 22 | output$lm_sum <- renderPrint( 23 | summary(lm(selectedData())) 24 | ) 25 | 26 | } 27 | 28 | shinyApp(ui, server) -------------------------------------------------------------------------------- /scripts/utility_functions.R: -------------------------------------------------------------------------------- 1 | panel.cor <- function(x, y, digits = 2, prefix = "", cex.cor=3, ...) 
2 | { 3 | usr <- par("usr"); on.exit(par(usr)) 4 | par(usr = c(0, 1, 0, 1)) 5 | r <- abs(cor(x, y)) 6 | txt <- format(c(r, 0.123456789), digits = digits)[1] 7 | txt <- paste0(prefix, txt) 8 | if(missing(cex.cor)) 9 | cex.cor <- 0.8/strwidth(txt) 10 | text(0.5, 0.5, txt, cex = cex.cor) 11 | } 12 | 13 | "cleanplot.pca" <- function(res.pca, ax1=1, ax2=2, point=FALSE, 14 | ahead=0.07, cex=0.7) 15 | { 16 | # A function to draw two biplots (scaling 1 and scaling 2) from an object 17 | # of class "rda" (PCA or RDA result from vegan's rda() function) 18 | # 19 | # License: GPL-2 20 | # Authors: Francois Gillet & Daniel Borcard, 24 August 2012 21 | 22 | require("vegan") 23 | 24 | par(mfrow=c(1,2)) 25 | p <- length(res.pca$CA$eig) 26 | 27 | # Scaling 1: "species" scores scaled to relative eigenvalues 28 | sit.sc1 <- scores(res.pca, display="wa", scaling=1, choices=c(1:p)) 29 | spe.sc1 <- scores(res.pca, display="sp", scaling=1, choices=c(1:p)) 30 | plot(res.pca, choices=c(ax1, ax2), display=c("wa", "sp"), type="n", 31 | main="PCA - scaling 1", scaling=1) 32 | if (point) 33 | { 34 | points(sit.sc1[,ax1], sit.sc1[,ax2], pch=20) 35 | text(res.pca, display="wa", choices=c(ax1, ax2), cex=cex, pos=3, scaling=1) 36 | } 37 | else 38 | { 39 | text(res.pca, display="wa", choices=c(ax1, ax2), cex=cex, scaling=1) 40 | } 41 | text(res.pca, display="sp", choices=c(ax1, ax2), cex=cex, pos=4, 42 | col="red", scaling=1) 43 | arrows(0, 0, spe.sc1[,ax1], spe.sc1[,ax2], length=ahead, angle=20, col="red") 44 | pcacircle(res.pca) 45 | 46 | # Scaling 2: site scores scaled to relative eigenvalues 47 | sit.sc2 <- scores(res.pca, display="wa", choices=c(1:p)) 48 | spe.sc2 <- scores(res.pca, display="sp", choices=c(1:p)) 49 | plot(res.pca, choices=c(ax1,ax2), display=c("wa","sp"), type="n", 50 | main="PCA - scaling 2") 51 | if (point) { 52 | points(sit.sc2[,ax1], sit.sc2[,ax2], pch=20) 53 | text(res.pca, display="wa", choices=c(ax1 ,ax2), cex=cex, pos=3) 54 | } 55 | else 56 | { 57 | text(res.pca, 
display="wa", choices=c(ax1, ax2), cex=cex) 58 | } 59 | text(res.pca, display="sp", choices=c(ax1, ax2), cex=cex, pos=4, col="red") 60 | arrows(0, 0, spe.sc2[,ax1], spe.sc2[,ax2], length=ahead, angle=20, col="red") 61 | par(mfrow=c(1,1)) 62 | } 63 | 64 | 65 | 66 | "pcacircle" <- function (pca) 67 | { 68 | # Draws a circle of equilibrium contribution on a PCA plot 69 | # generated from a vegan analysis. 70 | # vegan uses special constants for its outputs, hence 71 | # the 'const' value below. 72 | 73 | eigenv <- pca$CA$eig 74 | p <- length(eigenv) 75 | n <- nrow(pca$CA$u) 76 | tot <- sum(eigenv) 77 | const <- ((n - 1) * tot)^0.25 78 | radius <- (2/p)^0.5 79 | radius <- radius * const 80 | symbols(0, 0, circles=radius, inches=FALSE, add=TRUE, fg=2) 81 | } 82 | 83 | pseudo_r2 <- function(mod, null_mod=NULL) { 84 | if (inherits(mod, 'glm')) # class(mod) is c('glm', 'lm'), so test with inherits() 85 | r2 <- 1 - mod$deviance / mod$null.deviance 86 | if (inherits(mod, 'gls')) { 87 | if (is.null(null_mod)) 88 | null_mod <- update(mod, . ~ 1) 89 | r2 <- 1 - (as.numeric(logLik(mod) / logLik(null_mod))) 90 | } 91 | return(r2) 92 | } 93 | 94 | get_spat_mods = function(gls_mod) { 95 | err_mods = c('corExp', 'corGaus', 'corLin', 'corRatio', 'corSpher') 96 | out = vector('list', length(err_mods)) 97 | names(out) = sub('cor', '', err_mods) 98 | for(i in seq_along(err_mods)) { 99 | mods = vector('list', 2) 100 | names(mods) = c('nonug', 'nug') 101 | mods[[1]] = try(eval(parse(text=paste('update(gls_mod, corr=', 102 | err_mods[i], 103 | '(form = ~ x + y, nugget=F))', sep='')))) 104 | mods[[2]] = try(eval(parse(text=paste('update(gls_mod, corr=', 105 | err_mods[i], 106 | '(form = ~ x + y, nugget=T))', sep='')))) 107 | out[[i]] = mods 108 | } 109 | out 110 | } 111 | 112 | get_spat_AIC = function(spat_mods) { 113 | out = data.frame(mods = names(spat_mods), 114 | AIC_no_nug = NA, AIC_nug=NA) 115 | for(i in seq_along(spat_mods)) 116 | for(j in 1:2) 117 | if(class(spat_mods[[i]][[j]]) == 'gls') 118 | out[i, j + 1] = 
AIC(spat_mods[[i]][[j]]) 119 | out 120 | } 121 | -------------------------------------------------------------------------------- /software.md: -------------------------------------------------------------------------------- 1 | --- 2 | title: "Software" 3 | output: html_document 4 | layout: page 5 | --- 6 | 7 | This course is taught using the R statistical programming language. 8 | 9 | **R** is a free, open-source software package that can be downloaded here: 10 | 11 | [https://cran.r-project.org/](https://cran.r-project.org/) 12 | 13 | **Rstudio** is a free GUI that makes it easier to interact with R. It can be 14 | downloaded here: 15 | 16 | [https://www.rstudio.com/products/rstudio/download/#download](https://www.rstudio.com/products/rstudio/download/#download) 17 | 18 | **git** is a version control system for tracking changes in files and collaborating. 19 | It can be downloaded here: 20 | 21 | [https://git-scm.com/downloads](https://git-scm.com/downloads) 22 | 23 | If you have trouble installing any of this software, send Dan an email or make 24 | an appointment. Alternatively, use the [Rstudio server](uniola.biology.cofc.edu:8787) 25 | that is configured for class usage. 26 | 27 | -------------------------------------------------------------------------------- /syllabus_bio470.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/syllabus_bio470.pdf -------------------------------------------------------------------------------- /syllabus_bio570.pdf: -------------------------------------------------------------------------------- https://raw.githubusercontent.com/dmcglinn/quant_methods/747c05f4c60c0f441496293adacbfc3984af2b11/syllabus_bio570.pdf --------------------------------------------------------------------------------