Course: Analysis of RNA sequencing data with R/Bioconductor
Where: Freie Universität Berlin (Germany)
When: 22-26 June 2020
This course will provide biologists and bioinformaticians with practical statistical skills for rigorous analysis of RNA-seq data with R and Bioconductor. The course assumes basic familiarity with genomics, but does not assume prior statistical training. It covers the statistical concepts necessary to design experiments and analyze high-throughput data generated by next-generation sequencing, including exploratory data analysis, principal components analysis, clustering, differential expression, and gene set analysis.
Session 1 - Introduction
Monday - 09:30 to 17:30
Lecture 1: Data distributions
random variables
distributions
population and samples
Hands-On 1: Introduction to R
Lecture 2: Creating high-quality graphics in R
Visualizing data in 1D, 2D & more than two dimensions
Heatmaps
Data transformations
Hands-On 2: Graphics with base R and ggplot2
Session 2 - Hypothesis testing
Tuesday - 09:30 to 17:30
Lecture 1: Hypothesis testing theory
type I and II error and power
multiple hypothesis testing: false discovery rate, familywise error rate
exploratory data analysis (EDA)
Hands-On 1: Standard tests & EDA
Lecture 2: Hypothesis testing in practice
hypothesis tests for categorical variables (chi-square, Fisher's exact)
Monte Carlo simulation
Permutation tests
Hands-On 2: Permutation tests
Session 3 - Bioconductor
Wednesday - 09:30 to 17:30
Lecture 1: Introduction to Bioconductor
Incorporating Bioconductor in your data analysis
ExpressionSet / SummarizedExperiment
Annotation resources
Hands-On 1: Leveraging Bioconductor annotation resources
Lecture 2: Genomic intervals
Introduction to genomic region algebra
Basic operations: construction, intra- and inter-region operations
Finding overlaps
Hands-On 2: Solving common bioinformatic challenges with GenomicRanges
Session 4 - Next-generation sequencing data
Thursday - 09:30 to 17:30
Lecture 1: High-throughput count data
Characteristics of count data
Exploring count data
Modeling count data
Hands-On 1: Analyzing next-generation sequencing data
Lecture 2: Clustering and Principal Components Analysis
Measures of similarity
Hierarchical clustering
Dimension reduction
Principal components analysis (PCA)
Hands-On 2: Clustering & PCA
Session 5 - Differential expression and gene set analysis
Friday - 09:30 to 17:30
Lecture 1: Differential expression analysis
Normalization
Experimental designs
Generalized linear models
Lab 1: Performing differential expression analysis with DESeq2
Lecture 2: Gene set analysis
A primer on terminology, existing methods & statistical theory
GO/KEGG overrepresentation analysis
Functional class scoring & permutation testing
Network-based enrichment analysis
Lab 2: Performing gene set enrichment analysis with the EnrichmentBrowser