Dockerfile, CI
subwaystation committed Mar 23, 2021
1 parent 1ad4be0 commit a0805a7
Showing 9 changed files with 2,700 additions and 3 deletions.
95 changes: 95 additions & 0 deletions Dockerfile
@@ -0,0 +1,95 @@
FROM debian:bullseye-slim

LABEL authors="Simon Heumos, Jean Monlong"
LABEL description="Preliminary docker image containing all requirements for pgge pipeline"
LABEL base_image="debian:bullseye-slim"
LABEL software="pgge"
LABEL about.home="https://github.com/pangenome/pgge"
LABEL about.license="SPDX:MIT"

# Required dependencies
# samtools
# TODO add samtools from Bioconda?
RUN apt-get update \
&& apt-get install -y --no-install-recommends \
wget \
curl \
less \
gcc \
samtools \
tzdata \
make \
git \
sudo \
pkg-config \
bc \
time \
libxml2-dev libssl-dev libcurl4-openssl-dev \
apt-transport-https software-properties-common dirmngr gpg-agent \
&& rm -rf /var/lib/apt/lists/*

# rust
RUN curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs -o rust.sh && \
sh rust.sh -y --no-modify-path

ENV PATH=/root/.cargo/bin:$PATH

RUN cd ../../
# peanut
RUN git clone https://github.com/pangenome/rs-peanut.git \
&& cd rs-peanut \
&& git pull \
&& git checkout 2783bca \
&& cargo build --release \
&& cp target/release/peanut /usr/local/bin/peanut \
&& cd ../

# splitfa
RUN git clone https://github.com/ekg/splitfa.git \
&& cd splitfa \
&& git pull \
&& git checkout 98589b2 \
&& cargo build --release \
&& cp target/release/splitfa /usr/local/bin/splitfa \
&& cd ../

# miniconda3
RUN wget \
https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh \
&& bash Miniconda3-latest-Linux-x86_64.sh -b -p /miniconda \
&& rm -f Miniconda3-latest-Linux-x86_64.sh
ENV PATH=/miniconda/bin:$PATH
SHELL ["/bin/bash", "-c"]

# GraphAligner
# The current Bioconda version of GraphAligner emits a 15-column GAF, whereas the most recent commit on GitHub emits a 16-column GAF.
# Therefore, we can't use the Bioconda version for now.
RUN git clone --recursive https://github.com/maickrau/GraphAligner \
&& cd GraphAligner \
&& git pull \
&& git checkout 48143da \
&& git submodule update --init --recursive \
&& conda env create -f CondaEnvironment.yml \
&& source activate GraphAligner \
&& make bin/GraphAligner \
&& cp bin/GraphAligner /usr/local/bin/GraphAligner \
&& cd ../ \
&& exit

# Install the conda environment
COPY environment.yml /
RUN conda env create --quiet -f /environment.yml && conda clean -a

# Add conda installation dir to PATH (instead of doing 'conda activate')
ENV PATH=/miniconda/envs/pgge-dev/bin:$PATH

# Set path for all users
RUN echo "export PATH=$PATH" > /etc/profile

# bring in the binaries and scripts from pgge
COPY pgge /usr/local/bin/pgge
RUN mkdir /scripts
COPY scripts/beehave.R /scripts/beehave.R
RUN chmod 777 /usr/local/bin/pgge && chmod 777 /scripts/beehave.R

ENTRYPOINT [ "/bin/bash", "-l", "-c" ]
47 changes: 44 additions & 3 deletions README.md
@@ -135,6 +135,47 @@ _`pgge`_ also generates a visualization of the results `pgge_yeast/pgge-l100000-

6. _[R](https://www.r-project.org/)_ with packages _[tidyverse](https://www.tidyverse.org/)_, _[ggrepel](https://www.rdocumentation.org/packages/ggrepel/versions/0.9.1)_, _[gridExtra](https://www.rdocumentation.org/packages/gridExtra/versions/2.3)_ installed.

### docker

To simplify installation and versioning, a GitHub Action automatically pushes the current Docker build to the GitHub Container Registry.
To use it, first pull the latest image:

```sh
docker pull ghcr.io/pangenome/pgge:latest
```

Or, to pull a specific snapshot listed at [https://github.com/orgs/pangenome/packages/container/package/pgge](https://github.com/orgs/pangenome/packages/container/package/pgge):

```sh
docker pull ghcr.io/pangenome/pgge:TAG
```

Clone the repository and change into the `pgge` directory:

```sh
git clone --recursive https://github.com/pangenome/pgge.git
cd pgge
```

Then run the container using the example [DRB1-3123](data/HLA/DRB1-3123) provided in this repo:
```sh
docker run -it -v "${PWD}/data/:/data" ghcr.io/pangenome/pgge:latest "pgge -g '/data/HLA/DRB1-3123/*.consensus*.gfa' -f /data/HLA/DRB1-3123/DRB1-3123.fa -r /scripts/beehave.R -t 16 -o /data/HLA/DRB1-3123/pgge_docker -l 1000 -s 1000 -p 100"
```
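
The whole `pgge` command is passed as one quoted string because the image's `Dockerfile` sets `ENTRYPOINT [ "/bin/bash", "-l", "-c" ]`, and `bash -c` executes only its first argument as the command. The following sketch illustrates that behaviour locally, without Docker; it assumes `bash` is installed on the host, and the variable names are only illustrative:

```sh
# Passed as one string: bash runs the whole command line.
one_string="$(bash -c "echo pgge -t 16")"

# Passed as separate words: only "echo" is the command string; the remaining
# words become positional parameters ($0, $1, ...) and are never printed.
split_words="$(bash -c echo pgge -t 16)"

echo "one string:  '${one_string}'"
echo "split words: '${split_words}'"
```

The second call prints an empty value, which is why an unquoted command after the image name would not reach `pgge` intact.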

The `-v` argument of `docker run` expects an absolute host path; otherwise Docker reports: `If you intended to pass a host directory, use absolute path.` Using `${PWD}` takes care of this.
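
If the data directory is not directly below the current directory, the host path can be resolved first. This is a minimal sketch for a POSIX shell; `abs_dir` is a hypothetical helper and not part of `pgge`. It only prints the mount argument so that it can be checked before it is passed to `docker run`:

```sh
# abs_dir is a hypothetical helper (not part of pgge): it prints the
# absolute path of an existing directory.
abs_dir() {
    (cd "$1" && pwd)
}

# Resolve the host data directory relative to the current directory.
host_data="$(abs_dir .)/data"

# Print the mount argument instead of running docker.
echo "-v ${host_data}:/data"
```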

If you want to experiment, you can build a Docker image locally using the `Dockerfile`:

```sh
docker build -t ${USER}/pgge:latest .
```

Staying in the `pgge` directory, we can run `pgge` with the locally built image:

```sh
docker run -it -v "${PWD}/data/:/data" ${USER}/pgge "pgge -g '/data/HLA/DRB1-3123/*.consensus*.gfa' -f /data/HLA/DRB1-3123/DRB1-3123.fa -r /scripts/beehave.R -t 16 -o /data/HLA/DRB1-3123/pgge_docker -l 1000 -s 1000 -p 100"
```

## TODOs
- [x] _`pgge`_ should accept a list of GFA files as input (_path/to/files/\*.consensus\*.gfa_) and output the summarized results in one PNG
- [x] Integrate https://github.com/ekg/splitfa as an option to prepare the input FASTA.
@@ -146,10 +187,10 @@ _`pgge`_ also generates a visualization of the results `pgge_yeast/pgge-l100000-
- [ ] Add possibility to input several GAF files. Make sure the user can input a list of samples for the GAFs.
- [ ] The user should be able to select options for GraphAligner.
- [ ] Add a toolchain that compares the query alignments with the exact nodes they aligned to in the graph.
- [ ] Add Dockerfile.
- [ ] Add a CI building the Dockerfile and emitting evaluation metrics for all tools using `HLA-Zoo` data.
- [x] Add Dockerfile.
- [x] Add a CI building the Dockerfile and emitting evaluation metrics for all tools using `HLA-Zoo` data.
- [ ] Add usage examples for _`minigraph`_, _`cactus`_, and _`SibeliaZ`_.
- [ ] Integrate into nf-core/pangenome pipeline.
- [ ] Integrate into nf-core/pangenome pipeline. `HALFWAY THERE`.

## authors
