Project Template

Motivation

One of the most important things in bioinformatics is staying organised. This can be daunting because it involves managing code and data in a way that lets you get your day-to-day work done while also ensuring that your project is reproducible and shareable with colleagues.

To make matters worse, your project often needs to live in multiple places. The diagram below shows a typical setup. Some parts of your project will be things you want to share publicly, or at least with your collaborators; these are best kept in a GitHub repository. Your day-to-day work is done on a personal computer of some kind, and you will often need to run larger analyses (or store larger files) on a high-performance computing (HPC) system.

The aim of this project template is to provide a framework for a workflow involving these three components.

Adapt this template to your own project

Replace "project-template" with the name of your own project
Replace the cloudstor URL with a download link to your own data
Divide your project into numbered RMarkdown files (e.g. 01.check_raw_data.Rmd, 02.firstresult.Rmd)
Place large data files and outputs from compute-intensive processes in the hpc folder
Add the paths of data files that will be read by RMarkdown scripts to data.list
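The last step above can be sketched in shell. This is only an illustration under one loud assumption: that data.list holds one relative path per line (the format is not documented here, so check the file in your copy of the template). The helper name add_to_data_list and the file hpc/assembly_stats.tsv are hypothetical.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Work in a scratch directory so this sketch does not touch a real project.
workdir=$(mktemp -d)
cd "$workdir"

# Hypothetical output of a compute-intensive step, stored in the hpc folder.
mkdir -p hpc
echo "contigs	n50" > hpc/assembly_stats.tsv
touch data.list

# Hypothetical helper: record a path in data.list once, and only if it exists.
# ASSUMPTION: data.list contains one relative path per line.
add_to_data_list() {
  local path=$1
  if [ ! -e "$path" ]; then
    echo "missing: $path" >&2
    return 1
  fi
  if ! grep -qxF -- "$path" data.list; then
    echo "$path" >> data.list
  fi
}

# Calling the helper twice leaves a single entry in data.list.
add_to_data_list hpc/assembly_stats.tsv
add_to_data_list hpc/assembly_stats.tsv
cat data.list
```

Checking that the file exists before listing it guards against typos that would otherwise only surface when the data package is built.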

When you are ready to publish your project

Clean up this file (README.md): remove generic instructions like this one, and flesh out your project outline and the instructions to users
Make sure your latest code is committed and pushed to GitHub
Run build_data_package.sh
Upload the resulting data.tgz file to a repository that will hold large data
Replace the download URL in the instructions below with an appropriate URL to access your data

Providing larger files via cloudstor

This repository contains RMarkdown files and R code but does not contain raw data. To obtain the data required to run these scripts, do the following:

Check out this repository

git clone https://github.com/marine-omics/project-template.git

Download the raw data and unpack it from within the project repository.

cd project-template
wget 'http://data.qld.edu.au/public/Q5999/marine-omics/project-template/data.tgz' -O data.tgz
tar -zxvf data.tgz
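After unpacking, it can be useful to confirm that every file the scripts expect is present. The sketch below assumes data.list holds one relative path per line. It builds a small scratch project (with hypothetical file names) in which one listed file is deliberately absent, so the check has something to report.

```shell
#!/usr/bin/env bash
set -euo pipefail

# Scratch project standing in for a freshly unpacked repository.
workdir=$(mktemp -d)
cd "$workdir"
mkdir -p raw_data
echo "sample,reads" > raw_data/samples.csv

# ASSUMPTION: data.list contains one relative path per line.
# The second file is intentionally not created.
printf '%s\n' raw_data/samples.csv hpc/big_result.txt > data.list

# Walk the list and count the paths that do not exist on disk.
missing=0
while IFS= read -r path; do
  if [ -z "$path" ]; then
    continue  # skip blank lines
  fi
  if [ ! -e "$path" ]; then
    echo "missing: $path"
    missing=$((missing + 1))
  fi
done < data.list

echo "$missing file(s) missing"
```

In a real project you would run only the loop, from the repository root, after the tar step above.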

Project Outline

Initial Data Quality Check
