# Learning resources for bioinformatics, genomics and computational biology

[Glittr](https://glittr.org/) - a large searchable community database of git repositories with bioinformatics training material

[DReSA](https://dresa.org.au/) - Digital Research Skills Australasia - an active database of training material and national workshop events

## Other Monash groups and pages

- [Monash eResearch training page](https://docs.erc.monash.edu/training)

- [Monash NUMBATs R workshops](https://numbat.space/workshops/)

## General programming and command line

> In the wet lab, you might need to learn to pipette, use a centrifuge, or maybe run some gel electrophoresis before you can get useful results.
> In the dry lab, you need to learn to use a computer to automate tasks and analyse data before you can get useful results - often this means telling the computer what to do using plain text.

### Linux, the "shell", command line

- https://linuxjourney.com/
- https://sandbox.bio/ - interactive commandline tutorials for bioinformatics
- https://www.edx.org/course/introduction-linux-linuxfoundationx-lfs101x-1
- http://andrewjrobinson.github.io/training_docs/tutorials/unix/
- http://andrewjrobinson.github.io/training_docs/tutorials/hpc/

### Git

- https://learngitbranching.js.org/ - a visual, interactive tutorial to help you understand Git

### R

#### MGBP developed R workshops

- [Introduction to R](https://monashdatafluency.github.io/r-intro-2/) is our recommended starting point.

- [Programming and Tidy Data analysis in R](https://monashdatafluency.github.io/r-progtidy/) covers automating tasks such as loading a large set of files, as well as data wrangling and how "tidy" data streamlines visualization and data analysis.

- [Linear models in R](https://monashdatafluency.github.io/r-linear/) covers many common statistical tasks in a unified way using "linear models". This will also be useful background knowledge for RNA-Seq analysis, especially with complex experimental designs.

- [Introduction to R Shiny](https://monashdatafluency.github.io/R-ShinyIntro/) covers presentation of your data interactively.

- [Working with DNA sequences and features in R with Bioconductor](https://monashdatafluency.github.io/r-bioc-2/)


#### Introductory R

- [Introduction to R - Tidyverse](https://bookdown.org/ansellbr/WEHI_tidyR_course_book/), a workshop developed at WEHI.
- [The R for Data Science book](https://r4ds.hadley.nz/) is a popular book covering the tidy approach.
- [Posit Cloud](https://posit.cloud/) to try R and RStudio online.
- R programming - Coursera: https://www.coursera.org/course/rprog
- Data Carpentry R lessons: https://datacarpentry.org/R-ecology-lesson/ and https://datacarpentry.org/genomics-r-intro/
- It's often useful to generate reports including code and outputs, either using [RMarkdown](https://rmarkdown.rstudio.com/authoring_quick_tour.html) or the newer [Quarto system](https://quarto.org/docs/computations/r.html).
- [StatsTest (Wayback Machine archive)](https://web.archive.org/web/20241117181954/https://www.statstest.com/) - which statistical test should you use? (Paul's comment: Don't be overwhelmed, learn to use linear models. Linear models provide a systematic way of thinking about the factors and variables in your experiment, and how they translate into a statistical model and tests. Then, if the assumptions for using a linear model aren't quite met, consider one of the specialized methods listed on this page.)

#### Advanced R

- End-to-end visualisation using ggplot2: https://rviews.rstudio.com/2017/08/14/end-to-end-visualization-using-ggplot2/
- HarvardX biomedical data science MOOC: http://genomicsclass.github.io/book/ (Chapter 5 has good examples of linear model design and contrasts - with diagrams)


### Python

#### Introductory Python

- Monash Data Fluency [Introduction to Python Workshop](https://monashdatafluency.github.io/python-workshop-base/) material
  - ... uses some material from: [Data Analysis and Visualization in Python for Ecologists](http://www.datacarpentry.org/python-ecology-lesson/)
- The 'official' Python tutorial: https://docs.python.org/3/tutorial/
- http://rosalind.info/problems/list-view/?location=python-village
- http://andrewjrobinson.github.io/training_docs/tutorials/python_overview/python_overview/ - more of a quickstart for those comfortable with programming
- Intro to Python: http://introtopython.org/
- Introduction to Data Processing with Python: http://opentechschool.github.io/python-data-intro/
- Python for Everyone (basic introductory material, through to object oriented programming, interaction with web services, databases, plotting): https://www.py4e.com/lessons
- [BE/Bi 103 a: Introduction to Data Analysis in the Biological Sciences (Caltech)](https://bebi103a.github.io/index.html) - Data Science for Biology, with statistics and visualisation in Python
- [Programming in the Biological Sciences Bootcamp notes](http://justinbois.github.io/bootcamp/2020/index.html) (Caltech) - a comprehensive introduction to Python programming, with some Pandas, Numpy, Scipy and git thrown in for good measure.

#### Advanced Python

- Magic methods and context managers (`__enter__` and `__exit__` for `with`, and more): https://web.archive.org/web/20161024123835/http://www.rafekettler.com/magicmethods.html
- `@property` decorators, descriptors: https://web.archive.org/web/20150407105027/http://intermediatepythonista.com/classes-and-objects-ii-descriptors
- `@staticmethod`, `@classmethod` and `@abc.abstractmethod`
  - https://julien.danjou.info/blog/2013/guide-python-static-class-abstract-methods
- Metaclasses: http://eli.thegreenplace.net/2011/08/14/python-metaclasses-by-example and https://stackoverflow.com/questions/100003/what-is-a-metaclass-in-python
- Understanding scope, closures: https://www.farside.org.uk/201307/understanding_python_scope

## Visualization

- Introductory interactive visualization using Altair (University of Washington): https://uwdata.github.io/visualization-curriculum/intro.html
- PCA for Data Science: https://pca4ds.github.io/

## General Bioinformatics and Computational Biology

- EMBL-ABR training videos: https://www.youtube.com/channel/UC5WlFNBSfmt3e8Js8o2fFqQ/videos
- Bioinformatics Workbook: https://bioinformaticsworkbook.org/#gsc.tab=0
- SequenceEng - a resource covering most sequencing applications and their analysis pipelines: http://education.knoweng.org/sequenceng/

#### Theory

- [Rosalind](http://rosalind.info/problems/locations/) - problem solving exercises in computational biology to learn the fundamentals
- [JHU Computational Genomics notebooks](https://github.com/BenLangmead/comp-genomics-class) - in-depth code examples to help understand how genome short read alignment and assembly work, including Burrows-Wheeler Transforms and de Bruijn graphs.

#### Genomics on the commandline

- [Data Carpentry Genomics Workshop](https://datacarpentry.org/genomics-workshop/) - a great starting point for learning genomics on the commandline.
- [Computational Genomics Tutorial](https://genomics.sschmeier.com/index.html) - based on the Massey University Genome Science course taught by Sebastian Schmeier. A very clear tutorial series that covers installing and running tools for doing NGS read quality control, genome assembly and mapping, annotation, variant calling and interpretation. Light on theory, but a good starting point for working through the mechanics of genomics on the commandline.
- [Sanger Pathogen Informatics Training](https://github.com/sanger-pathogens/pathogen-informatics-training) - commandline tutorials covering various analyses of microbial pathogens. Structured as a series of notebooks to follow, [starting here](https://github.com/sanger-pathogens/pathogen-informatics-training/blob/master/Notebooks/index.ipynb).

#### RNA-seq (bulk)

Some of these tutorials and guides start from raw FASTQ reads and work through to differential expression analysis. Others begin with the counts matrix.

- [MGBP RNA-Seq workshop](https://monashbioinformaticsplatform.github.io/RNAseq_workshop/)

- [Sydney informatics hub RNAseq tutorial 2023](https://sydney-informatics-hub.github.io/rnaseq-workshop-2023/) - starting from raw reads with nf-core/rnaseq, through to differential expression analysis with R & DESeq2, and functional enrichment.
- [Introduction to differential gene expression analysis using RNA-seq (Dündar, Skrabanek, Zumbo @ Cornell)](https://doi.org/10.5281/zenodo.3985046) - a very nice RNA-seq overview and tutorial, from RNA extraction and experimental design to differential gene expression analysis, with Unix commandline and R exercises.
- [RNA-seq analysis is easy as 1-2-3 with limma, Glimma and edgeR](https://f1000research.com/articles/5-1408/v3) - Law _et al_, 2016 - a good practical tutorial for edgeR and limma, starting from a counts matrix.
- [From reads to genes to pathways: differential expression analysis of RNA-Seq experiments using Rsubread and the edgeR quasi-likelihood pipeline](https://f1000research.com/articles/5-1438/v2)
- [A guide to creating design matrices for gene expression experiments](https://bioconductor.org/packages/release/workflows/vignettes/RNAseq123/inst/doc/designmatrices.html)
- A big list of RNASeq links - nicely organized into sections like 'normalization' and 'batch effects': https://github.com/crazyhottommy/RNA-seq-analysis
- http://master.bioconductor.org/help/course-materials/2015/Uruguay2015/V6-RNASeq.html
- [https://diytranscriptomics.com/](https://diytranscriptomics.com/) - largely video-based tutorial series for learning RNA-seq analysis using R.
- Harvard Chan core [RNA-seq beginner](https://github.com/hbctraining/rnaseq_overview) and [Salmon+DESeq2](https://hbctraining.github.io/DGE_workshop_salmon_online/schedule/links-to-lessons.html) courses
- [Case study: using a Bioconductor R pipeline to analyze RNA-seq data](https://web.archive.org/web/20210920045824/http://bioinf.wehi.edu.au/RNAseqCaseStudy/)
- https://www.ebi.ac.uk/training/online/course/ebi-next-generation-sequencing-practical-course/rna-sequencing/rna-seq-analysis-transcripto-0
- RNASeq tutorial from the Griffith Lab: https://github.com/griffithlab/rnaseq_tutorial/wiki
- http://www.ngscourse.org/Course_Materials/alignment/tutorial/example.html
- COMBINE: http://combine-australia.github.io/RNAseq-R/
- http://www.rnaseqforthenextgeneration.org/protocols/index.htm
- https://github.com/MaayanLab/intro-rnaseq-jupyter
- https://newonlinecourses.science.psu.edu/stat555/node/78/
- https://mikelove.github.io/counts-model/index.html

#### RNA-seq (single cell)

- [MGBP single cell RNA-Seq workshop](https://monashbioinformaticsplatform.github.io/scRNAseq_Workshop/)

- [Orchestrating Single-Cell Analysis with Bioconductor](https://bioconductor.org/books/release/OSCA/) covers the Bioconductor way of doing single cell analysis, but many people prefer to use [Seurat](https://satijalab.org/seurat/) instead. Seurat also has many useful vignettes.

#### Metagenomics

- [Data processing and visualization for metagenomics](https://carpentries-incubator.github.io/metagenomics/index.html) - a Carpentries workshop in incubation. May have some rough edges, but it's already looking quite good.

- [Orchestrating Microbiome Analysis](https://microbiome.github.io/OMA/docs/devel/) covers the Bioconductor way of doing microbiome analysis.

#### Functional Enrichment Analysis

- [MGBP + Sydney Informatics Hub functional enrichment workshop (2024)](https://monashbioinformaticsplatform.github.io/Functional_Enrichment_BioCommons_2024/)

Papers that may be of interest for functional enrichment analysis:

- Null hypothesis in GSEA: https://www.frontiersin.org/articles/10.3389/fgene.2020.00654/full
- Survey of ORA and FCS (including recommendations): http://ziemann-lab.net/public/kaumadi/manuscript.html
- Univariate and multivariate FCS: https://genomebiology.biomedcentral.com/articles/10.1186/s13059-019-1790-4
- Multi-contrast and multi-omics FCS - Mitch R package: https://bmcgenomics.biomedcentral.com/articles/10.1186/s12864-020-06856-9