LAST MAJOR UPDATE: 26.09.2024
Welcome to our R Programming Course, designed specifically for biologists, including master's and PhD students. This course aims to equip participants with basic R programming skills and introduce them to statistical analysis techniques applicable in molecular biology.
This two-day intensive course covers everything from basic programming in R to advanced statistical analyses relevant to molecular biology. Participants will learn through a mix of lectures, hands-on exercises, and interactive discussions. By the end of the course, you will be able to perform data manipulation, create visualizations, and conduct statistical analyses using R.
Participants are expected to have the following installed on their computers before the course begins. Follow the links for the installation guides:
- R: [R Installation Guide](https://cran.r-project.org/)
- RStudio: [RStudio Installation Guide](https://www.rstudio.com/products/rstudio/download/)
- Git: [Git Installation Guide](https://git-scm.com/book/en/v2/Getting-Started-Installing-Git)

Please also install the following R libraries:
- ggplot2
- dplyr
- DESeq2
- gprofiler2
- clusterProfiler
- imager
- magick
- tibble
- MASS
- tidyr
- stringr

[UPDATE] Additionally, please install the following libraries, which were added to the course later: car, Rcmdr, ggpubr, and openxlsx. To install them, use the following R command:
install.packages(c("car", "Rcmdr", "ggpubr", "openxlsx"))
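
The main package list above mixes two sources: DESeq2 and clusterProfiler are distributed through Bioconductor rather than CRAN, so a plain `install.packages()` call will not find them. The following is a minimal sketch of one way to install everything from that list, assuming a CRAN mirror is already configured (RStudio sets one by default). The helper names `missing_packages` and `install_course_packages` are hypothetical and are not part of the course materials.

```r
# Packages from the list above that are distributed via CRAN.
cran_pkgs <- c("ggplot2", "dplyr", "gprofiler2", "imager", "magick",
               "tibble", "MASS", "tidyr", "stringr")

# Packages from the list above that are distributed via Bioconductor.
bioc_pkgs <- c("DESeq2", "clusterProfiler")

# Hypothetical helper: return the entries of `pkgs` that are not installed yet.
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install whatever is missing from both sources.
install_course_packages <- function() {
  to_install <- missing_packages(cran_pkgs)
  if (length(to_install) > 0) {
    install.packages(to_install)
  }

  to_install <- missing_packages(bioc_pkgs)
  if (length(to_install) > 0) {
    # BiocManager is the installer for Bioconductor packages.
    if (length(missing_packages("BiocManager")) > 0) {
      install.packages("BiocManager")
    }
    BiocManager::install(to_install)
  }

  # Return (invisibly) anything that still could not be installed.
  invisible(missing_packages(c(cran_pkgs, bioc_pkgs)))
}
```

After sourcing the block, running `still_missing <- install_course_packages()` in the R console performs the installation; an empty `still_missing` suggests that all listed packages are available. Packages that are already installed are skipped, so the call can be repeated safely.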
Participants are also expected to have a GitHub account.
It's crucial for all participants to install R, RStudio, and Git prior to the start of the course. These tools are essential for participating in the course exercises and for following along with the instruction.
Considering the importance of a smooth start to our course, we're planning to host a Zoom pre-course session. This session is intended to help with the installation process, address any issues you might encounter, and answer any questions. Stay tuned for the schedule and details.
If you encounter any issues during the installation process, please:
- Refer to the FAQs and troubleshooting guides provided on the respective software download pages.
- Post your issue on the GitHub issues section of this repository. Please provide as much detail as possible about the problem you're experiencing.
- Contact us directly via email, and we'll do our best to assist you.
We strongly recommend that you try to familiarize yourself with R and RStudio by following some basic tutorials or trying out simple exercises. This will help you hit the ground running when the course starts.
Day 1: Introduction to R and Basic Programming Concepts [Slides]
- Introduction to R and RStudio
- Basic R Syntax and Operations
Session 2: Data Entry and Data Management [Exercise]
- Data Import and Export
- Data Manipulation (base R and dplyr)
Session 3: Creating Graphics [Exercise]
- Introduction to Base Graphics in R (base R and ggplot2)
- Frequency/Cross Tables, Mean, Standard Deviation, Correlation
Day 2: Statistical Analysis in R [Slides]
- Fisher’s Exact Test, t-tests
- Multiple Linear Regression, ANOVA
Session 3: Molecular Biology Applications [Exercise]
- PCA and Hierarchical Clustering
Session 4: Real-World Data Application [Tutorial]
- Case Study: Gene Expression Data Analysis
To get started with the course materials, clone this repository using Git:
git clone https://github.com/CECADBioinformaticsCoreFacility/R_course_CGA
Navigate into the cloned directory to access all course materials, datasets, and exercises.
For further learning and exploration of R, we recommend the following resources:
We welcome contributions to improve the course materials. Please feel free to fork the repository, make changes, and submit a pull request.
For any queries regarding the course, please reach out to us at [email protected]
We would like to thank all contributors and participants for making this course possible. Special thanks to the R community for the comprehensive resources and support.