feat: devcontainer development environment #487

Draft
wants to merge 44 commits into
base: dev
f5ea847
feat: working version of a devcontainer
d0choa Dec 7, 2023
5713961
feat: add github pull requests extension
d0choa Dec 7, 2023
8d8cdbd
feat: welcome message
d0choa Dec 7, 2023
01fc5d8
feat: multiple features
d0choa Dec 8, 2023
57a549f
revert: check json
d0choa Dec 8, 2023
4ae2d59
feat: working version
d0choa Dec 11, 2023
4034fba
feat: development environment dockerfile
d0choa Dec 11, 2023
11a8f03
revert: removing black as formatter
d0choa Dec 11, 2023
88d681e
chore: poetry lock updated
d0choa Dec 11, 2023
2157869
revert: remove dependencies
d0choa Dec 12, 2023
a0e1143
test: no coverage report
d0choa Dec 12, 2023
40770f9
revert: isort not required
d0choa Dec 12, 2023
32a0691
chore: merge dev
d0choa Jan 19, 2024
c9ef273
chore: updates made by vscode
d0choa Jan 19, 2024
65e4d60
chore: node_modules ignored
d0choa Jan 19, 2024
143847f
chore: rename to gentropy
d0choa Jan 19, 2024
01361e5
chore: merge dev
d0choa Feb 8, 2024
a715d40
chore: fix poetry lock
d0choa Feb 8, 2024
3d210fb
feat: add extensions
d0choa Feb 9, 2024
14be555
revert: welcome message not working
d0choa Feb 9, 2024
a017418
feat: badge in readme
d0choa Feb 9, 2024
d2efd86
fix: typo
d0choa Feb 9, 2024
2ff4108
feat: add badge to index
d0choa Feb 9, 2024
4af5330
docs: devcontainer documentation and cleanup
d0choa Feb 12, 2024
5a5329c
chore: change devcontainer location
d0choa Feb 12, 2024
a3cb47a
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 12, 2024
8a77078
Merge branch 'dev' into do_devcontainer
d0choa Feb 12, 2024
8a414a6
chore: merge dev
d0choa Feb 12, 2024
a213a2e
chore: update lock
d0choa Feb 12, 2024
ad0e5b5
docs: several ammendments
d0choa Feb 12, 2024
ada84c0
feat: spark ui port forwarding
d0choa Feb 12, 2024
85d5dde
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 12, 2024
7c06276
feat: gcp cli
d0choa Feb 13, 2024
d7327a4
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 13, 2024
ba432c1
feat: mount gcloud credentials
d0choa Feb 13, 2024
d696729
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 13, 2024
55a240f
feat: missing extension
d0choa Feb 17, 2024
c1803b7
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 17, 2024
05cc6a7
feat: pre-commit edits
d0choa Feb 17, 2024
01f40ad
chore: merge
d0choa Feb 17, 2024
5bdddf0
[pre-commit.ci] auto fixes from pre-commit.com hooks
pre-commit-ci[bot] Feb 17, 2024
d911342
chore: merge dev
d0choa Feb 21, 2024
243b7cc
chore: pre-commit auto fixes [...]
pre-commit-ci[bot] Feb 21, 2024
e85050b
chore: merge dev
d0choa Apr 29, 2024
79 changes: 79 additions & 0 deletions .devcontainer/Dockerfile.dev
@@ -0,0 +1,79 @@
# See here for image contents: https://github.com/microsoft/vscode-dev-containers/tree/v0.231.6/containers/python-3/.devcontainer/base.Dockerfile

# [Choice] Python version (use -bullseye variants on local arm64/Apple Silicon): 3, 3.10, 3.9, 3.8, 3.7, 3.6, 3-bullseye, 3.10-bullseye, 3.9-bullseye, 3.8-bullseye, 3.7-bullseye, 3.6-bullseye, 3-buster, 3.10-buster, 3.9-buster, 3.8-buster, 3.7-buster, 3.6-buster
ARG VARIANT="3.10-bullseye"
FROM mcr.microsoft.com/vscode/devcontainers/python:0-${VARIANT} as development_build

ARG YOUR_ENV

ENV PYTHONUNBUFFERED=1 \
PYTHONFAULTHANDLER=1 \
PYTHONHASHSEED=random \
PYTHONDONTWRITEBYTECODE=1 \
# pip
# PIP_DISABLE_PIP_VERSION_CHECK=1 \
# PIP_DEFAULT_TIMEOUT=100 \
PIP_ROOT_USER_ACTION=ignore \
# Poetry
# https://python-poetry.org/docs/configuration/#using-environment-variables
POETRY_VERSION=1.7.1 \
# make poetry install to this location
POETRY_HOME="/usr/local" \
# do not ask any interactive question
POETRY_NO_INTERACTION=1 \
# cache dir for poetry
POETRY_CACHE_DIR='/var/cache/pypoetry' \
# never create virtual environment automatically, only use env prepared by us
POETRY_VIRTUALENVS_CREATE=false


# System deps (we don't use exact versions because it is hard to update them,
# pin when needed):
# hadolint ignore=DL3008
RUN apt-get update && \
apt-get install -y --no-install-recommends \
default-jdk \
htop \
fzf \
jq \
bat \
# Installing `poetry` package manager:
# https://github.com/python-poetry/poetry
&& curl -sSL 'https://install.python-poetry.org' | python - \
# Enable tab completion for bash
&& poetry completions bash >> /home/vscode/.bash_completion \
# Enable tab completion for Zsh
&& mkdir -p /home/vscode/.zfunc/ \
&& poetry completions zsh > /home/vscode/.zfunc/_poetry \
&& echo "fpath+=~/.zfunc\nautoload -Uz compinit && compinit" >> /home/vscode/.zshrc \
&& poetry --version \
&& apt-get purge -y --auto-remove -o APT::AutoRemove::RecommendsImportant=false \
&& apt-get clean -y && rm -rf /var/lib/apt/lists/*

# Add fzf to .zshrc
RUN echo "source /usr/share/doc/fzf/examples/key-bindings.zsh" >> /home/vscode/.zshrc && \
echo "source /usr/share/doc/fzf/examples/completion.zsh" >> /home/vscode/.zshrc

ARG USERNAME=vscode
# Used to persist bash history as per https://code.visualstudio.com/remote/advancedcontainers/persist-bash-history
# hadolint ignore=SC2086
RUN SNIPPET="export PROMPT_COMMAND='history -a' && export HISTFILE=/commandhistory/.zsh_history" \
&& mkdir /commandhistory \
&& touch /commandhistory/.zsh_history \
&& chown -R $USERNAME /commandhistory \
&& echo "$SNIPPET" >> "/home/$USERNAME/.zshrc"

# working directory
WORKDIR /workspaces

# Copy only requirements, to cache them in docker layer
COPY ./poetry.lock ./pyproject.toml ./README.md ./
COPY src/gentropy src/gentropy
COPY src/utils src/utils

# Install runtime deps:
# hadolint ignore=SC2046
RUN --mount=type=cache,target="$POETRY_CACHE_DIR" \
poetry install --no-interaction --no-ansi

EXPOSE 8080
78 changes: 78 additions & 0 deletions .devcontainer/devcontainer.json
@@ -0,0 +1,78 @@
{
"name": "OTGdev",
"build": {
"dockerfile": "Dockerfile.dev",
"context": "..",
"args": {
"VARIANT": "3.10-bullseye",
"POETRY_VERSION": "1.7.1"
}
},
"features": {
"ghcr.io/devcontainers/features/github-cli:1": {},
"ghcr.io/dhoeric/features/google-cloud-cli:1": {},
"ghcr.io/mikaello/devcontainer-features/modern-shell-utils:1": {},
"ghcr.io/devcontainers-contrib/features/pre-commit:2": {},
"ghcr.io/devcontainers/features/docker-outside-of-docker:1": {}
},
"mounts": [
"source=/var/run/docker.sock,target=/var/run/docker.sock,type=bind",
"source=devcontainer-bashhistory,target=/commandhistory,type=volume",
"source=${localEnv:HOME}/.gitconfig,target=/home/vscode/.gitconfig,type=bind,consistency=cached",
"source=${localEnv:HOME}/.config/gcloud,target=/home/vscode/.config/gcloud,type=bind,consistency=cached"
],
"runArgs": ["-e", "GIT_EDITOR=code --wait"],
"postCreateCommand": "bash .devcontainer/postCreateCommand.sh",
"customizations": {
"vscode": {
// Please keep this file in sync with settings in /.vscode/extensions.json
"extensions": [
"charliermarsh.ruff",
"ms-python.mypy-type-checker",
"ms-python.python",
"esbenp.prettier-vscode",
"mhutchie.git-graph",
"eamodio.gitlens",
"ms-azuretools.vscode-docker",
"github.vscode-github-actions",
"timonwong.shellcheck",
"GitHub.copilot",
"vivaxy.vscode-conventional-commits",
"ms-vscode-remote.remote-ssh",
"ms-toolsai.jupyter",
"ms-vscode.makefile-tools",
"GitHub.vscode-pull-request-github",
"fnando.linter"
],
// Please keep this file in sync with settings in /.vscode/settings.json
"settings": {
"editor.defaultFormatter": "esbenp.prettier-vscode",
"terminal.integrated.fontFamily": "Fira Code",
"terminal.integrated.defaultProfile.linux": "zsh",
"editor.formatOnPaste": false,
"editor.formatOnSave": true,
"editor.formatOnType": true,
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff"
},
"python.terminal.launchArgs": ["-m", "IPython", "--no-autoindent"],
"python.defaultInterpreterPath": "/usr/local/bin/python",
"python.testing.pytestEnabled": true,
"python.testing.pytestArgs": [
"src/",
"tests/",
"--doctest-modules",
"--no-cov"
]
}
}
},
"portsAttributes": {
"4040": {
"label": "SparkUI",
"onAutoForward": "notify"
}
},

"forwardPorts": [4040]
}
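
The comments in `devcontainer.json` ask for its extension list to stay in sync with `.vscode/extensions.json`. A minimal sketch of how that could be checked is shown here; it is not part of this PR. The function names are hypothetical, and it assumes the only comments in either file are whole-line `//` comments and that the workspace file uses the standard `recommendations` key.

```python
import json


def strip_line_comments(text: str) -> str:
    """Drop whole-line `//` comments so the stdlib JSON parser accepts the file."""
    kept = [line for line in text.splitlines() if not line.lstrip().startswith("//")]
    return "\n".join(kept)


def extension_drift(devcontainer_text: str, recommendations_text: str) -> dict[str, set[str]]:
    """Return extensions listed in one file but not the other (case-insensitive)."""
    devcontainer = json.loads(strip_line_comments(devcontainer_text))
    recommended = json.loads(strip_line_comments(recommendations_text))
    in_container = {
        ext.lower()
        for ext in devcontainer["customizations"]["vscode"]["extensions"]
    }
    in_workspace = {ext.lower() for ext in recommended["recommendations"]}
    return {
        "only_in_devcontainer": in_container - in_workspace,
        "only_in_workspace": in_workspace - in_container,
    }
```

Passing the contents of `.devcontainer/devcontainer.json` and `.vscode/extensions.json` returns two sets; both are empty when the lists match.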
4 changes: 4 additions & 0 deletions .devcontainer/postCreateCommand.sh
@@ -0,0 +1,4 @@
#!/bin/bash

# Install the pre-commit hooks
pre-commit install --install-hooks
1 change: 1 addition & 0 deletions .gitignore
@@ -11,3 +11,4 @@ src/airflow/logs/*
!src/airflow/logs/.gitkeep
site/
.env
node_modules
1 change: 0 additions & 1 deletion .vscode/extensions.json
@@ -4,7 +4,6 @@
"ms-python.mypy-type-checker",
"ms-python.python",
"esbenp.prettier-vscode",
"redhat.vscode-yaml",
"fnando.linter"
],
"unwantedRecommendations": ["ms-python.flake8"]
27 changes: 13 additions & 14 deletions .vscode/settings.json
@@ -1,29 +1,28 @@
{
"editor.defaultFormatter": "esbenp.prettier-vscode",
"terminal.integrated.fontFamily": "Fira Code",
"terminal.integrated.defaultProfile.linux": "zsh",
"editor.formatOnPaste": false,
"editor.formatOnSave": true,
"editor.formatOnType": true,
"[python]": {
"editor.defaultFormatter": "charliermarsh.ruff",
"editor.formatOnPaste": false,
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.fixAll": "never",
"source.organizeImports": "explicit"
}
},
"python.terminal.launchArgs": ["-m", "IPython", "--no-autoindent"],
"python.testing.pytestEnabled": true,
"python.testing.pytestArgs": [
"src/",
"tests/",
"--doctest-modules",
"--cov=src/"
],
"[jsonc]": {
"editor.tabSize": 2,
"editor.insertSpaces": true,
"editor.detectIndentation": false,
"editor.formatOnSave": true,
"editor.codeActionsOnSave": {
"source.organizeImports": "explicit"
}
},
"json.format.keepLines": true,
"autoDocstring.docstringFormat": "google",
"python.testing.pytestArgs": [".", "--doctest-modules", "--cov=src/"],
"python.testing.unittestEnabled": false,
"python.testing.pytestEnabled": true,
"mypy-type-checker.severity": {
"error": "Information"
}
}
1 change: 1 addition & 0 deletions README.md
@@ -9,6 +9,7 @@
<a href="https://opentargets.github.io/gentropy/"><img src="https://github.com/opentargets/gentropy/actions/workflows/release.yaml/badge.svg" alt="image" /></a>
<a href="https://codecov.io/gh/opentargets/gentropy"><img src="https://codecov.io/gh/opentargets/gentropy/branch/main/graph/badge.svg?token=5ixzgu8KFP" alt="codecov" /></a>
<a href="https://opensource.org/licenses/Apache-2.0"><img src="https://img.shields.io/badge/License-Apache_2.0-blue.svg" alt="License" /></a>
<a href="https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/opentargets/gentropy"><img src="https://img.shields.io/badge/Dev%20Containers-Open-blue?logo=visualstudiocode" alt="Dev Containers" /></a>
<a href="https://doi.org/10.5281/zenodo.10527086"><img src="https://zenodo.org/badge/DOI/10.5281/zenodo.10527086.svg" alt="DOI" /></a>
</p>

46 changes: 46 additions & 0 deletions docs/development/a_environment.md
@@ -0,0 +1,46 @@
---
Title: Environment
---

# Environment

## Devcontainer (recommended)

Developing in a devcontainer ensures a reproducible development environment with minimal setup. To use the devcontainer, you need to have Docker installed on your system and use Visual Studio Code as your IDE.

For a quick start you can either:

1. [Open existing folder in container](https://code.visualstudio.com/docs/devcontainers/containers#_quick-start-open-an-existing-folder-in-a-container).

1. [Clone repository in container volume](https://code.visualstudio.com/docs/devcontainers/containers#_quick-start-open-a-git-repository-or-github-pr-in-an-isolated-container-volume) [![Open in Dev Containers](https://img.shields.io/static/v1?label=Dev%20Containers&message=Open&color=blue&logo=visualstudiocode)](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/opentargets/gentropy)

If you already have VS Code and Docker installed, you can click the badge above or [here](https://vscode.dev/redirect?url=vscode://ms-vscode-remote.remote-containers/cloneInVolume?url=https://github.com/opentargets/gentropy) to get started. Clicking these links will cause VS Code to automatically install the Dev Containers extension if needed, clone the source code into a container volume, and spin up a dev container for use.

More information on working with devcontainers can be found [here](https://code.visualstudio.com/docs/devcontainers/containers).

!!! note "I/O performance"

The Dev Containers extension uses "bind mounts" to source code in your local filesystem by default. While this is the simplest option, on macOS and Windows you may encounter slower disk performance. There are [a few things you can do](https://code.visualstudio.com/remote/advancedcontainers/improve-performance) to resolve this type of issue, including cloning the repository in a container volume.

## Codespaces

A devcontainer can also be launched on GitHub using [Codespaces](https://github.com/features/codespaces). This option requires no local setup because the environment is managed by GitHub.

## Local environment

To set up a full local environment for the package, follow the steps below.

Requirements:

- Java
- make
- [Google Cloud SDK](https://cloud.google.com/sdk/docs/install)

Run `make setup-dev` to install/update the necessary packages and activate the development environment.
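
Before running `make setup-dev`, it can help to confirm that the requirements listed above are on your `PATH`. The following is a minimal sketch using only the Python standard library. `missing_tools` and `REQUIRED_TOOLS` are illustrative names, not part of the repository, and the executable names `java`, `make` and `gcloud` are assumed from the requirements list.

```python
import shutil
from typing import Callable, Iterable, Optional

# Executable names assumed from the requirements list above.
REQUIRED_TOOLS = ("java", "make", "gcloud")


def missing_tools(
    tools: Iterable[str] = REQUIRED_TOOLS,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> list[str]:
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]
```

Calling `missing_tools()` returns an empty list when everything is installed. The `which` parameter exists only so the lookup can be replaced in tests.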

!!! info "Google Cloud configuration"

To complete the Google Cloud configuration, you need to:

- Log in to your work Google Account: run `gcloud auth login` and follow instructions.
- Obtain Google application credentials: run `gcloud auth application-default login` and follow instructions.
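
To confirm that the second step produced application default credentials, you can check for the credentials file. The sketch below assumes the Linux/macOS default location (`~/.config/gcloud/application_default_credentials.json`, the same directory the devcontainer mounts) and honors the `GOOGLE_APPLICATION_CREDENTIALS` and `CLOUDSDK_CONFIG` environment variables. The function names are hypothetical, and Windows paths are not handled.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def adc_file(env: Optional[Mapping[str, str]] = None, home: Optional[Path] = None) -> Path:
    """Return where application default credentials are expected (Linux/macOS)."""
    env = os.environ if env is None else env
    # An explicit credentials file takes precedence when this variable is set.
    explicit = env.get("GOOGLE_APPLICATION_CREDENTIALS")
    if explicit:
        return Path(explicit)
    # CLOUDSDK_CONFIG overrides the gcloud configuration directory.
    config_dir = env.get("CLOUDSDK_CONFIG")
    if config_dir:
        base = Path(config_dir)
    else:
        base = (home or Path.home()) / ".config" / "gcloud"
    return base / "application_default_credentials.json"


def has_adc(env: Optional[Mapping[str, str]] = None, home: Optional[Path] = None) -> bool:
    """Report whether the expected credentials file exists."""
    return adc_file(env, home).is_file()
```

If `has_adc()` returns `False`, run `gcloud auth application-default login` again. The check only tests that the file exists, not that the credentials are still valid.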
34 changes: 34 additions & 0 deletions docs/development/b_contributing.md
@@ -0,0 +1,34 @@
---
Title: Contributing
---

# Contributing guidelines

## Contributing checklist

When making changes, and especially when implementing a new module or feature, it's essential to ensure that all relevant sections of the code base are modified.

- [ ] Run `make check`. This will run the linter and formatter to ensure that the code is compliant with the project conventions.
- [ ] Develop unit tests for your code and run `make test`. This will run all unit tests in the repository, including the examples appended in the docstrings of some methods.
- [ ] Update the configuration if necessary.
- [ ] Update the documentation and check it with `make build-documentation`. This will start a local server to browse it (the URL will be printed, usually `http://127.0.0.1:8000/`).

For more details on each of these steps, see the sections below.
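
As an illustration of the docstring examples mentioned in the checklist, the sketch below shows a function whose docstring carries a doctest. `normalize_study_id` is a hypothetical helper, not part of gentropy. The Google docstring style is assumed from the project's editor settings, which also pass `--doctest-modules` to pytest so that such examples are collected as tests.

```python
def normalize_study_id(raw: str) -> str:
    """Return a study identifier in upper case without surrounding whitespace.

    Args:
        raw (str): Identifier as read from the input file.

    Returns:
        str: Normalized identifier.

    Examples:
        >>> normalize_study_id("  finngen_r9_asthma ")
        'FINNGEN_R9_ASTHMA'
    """
    return raw.strip().upper()
```

If the implementation changes so that the output no longer matches the line after `>>>`, the doctest fails like any other unit test.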

## Documentation

- If during development you had a question which wasn't covered in the documentation, and someone explained it to you, add it to the documentation. The same applies if you encountered any instructions in the documentation which were obsolete or incorrect.
- Documentation autogeneration expressions start with `:::`. They will automatically generate sections of the documentation based on class and method docstrings. Be sure to update them for:
    - Dataset definitions in `docs/python_api/datasource/STEP` (example: `docs/python_api/datasource/finngen/study_index.md`)
    - Step definition in `docs/python_api/step/STEP.md` (example: `docs/python_api/step/finngen.md`)

## Classes

- Dataset class in `src/gentropy/datasource/STEP` (example: `src/gentropy/datasource/finngen/study_index.py` → `FinnGenStudyIndex`)
- Step main running class in `src/gentropy/STEP.py` (example: `src/gentropy/finngen.py`)

## Tests

- Test study fixture in `tests/conftest.py` (example: `mock_study_index_finngen` in that module)
- Test sample data in `tests/data_samples` (example: `tests/data_samples/finngen_studies_sample.json`)
- Test definition in `tests/` (example: `tests/dataset/test_study_index.py` → `test_study_index_finngen_creation`)
29 changes: 29 additions & 0 deletions docs/development/airflow.md → docs/development/c_airflow.md
@@ -1,3 +1,7 @@
---
Title: Airflow
---

# Airflow configuration

This section describes how to set up a local Airflow server which will orchestrate running workflows in Google Cloud Platform. This is useful for testing and debugging, but for production use, it is recommended to run Airflow on a dedicated server.
Expand Down Expand Up @@ -117,6 +121,31 @@ More information on running Airflow with Docker Compose can be found in the [off

1. **Additional pip packages**. They can be added to the `requirements.txt` file.

## Example: Running a gentropy DAG on GCP

All pipelines in this repository are intended to be run in Google Dataproc. Running them locally is not currently supported.

In order to run the code:

1. Manually edit your local `src/airflow/dags/*` file and comment out the steps you do not want to run.

2. Manually edit your local `pyproject.toml` file and modify the version of the code.

    - This must be different from the version used by any other people working on the repository to avoid any deployment conflicts, so it's a good idea to use your name, for example: `1.2.3+jdoe`.
    - You can also add a brief branch description, for example: `1.2.3+jdoe.myfeature`.
    - Note that the version must comply with [PEP440 conventions](https://peps.python.org/pep-0440/#normalization), otherwise Poetry will not allow it to be deployed.
    - Do not use underscores or hyphens in your version name. When building the WHL file, they will be automatically converted to dots, which means the file name will no longer match the version and the build will fail. Use dots instead.

3. Manually edit your local `src/airflow/dags/common_airflow.py` and set `OTG_VERSION` to the version you chose in the previous step.

4. Run `make build`.

    - This will create a bundle containing the necessary code, configuration and dependencies to run the ETL pipeline, and then upload this bundle to Google Cloud.
    - A version-specific subpath is used, so uploading the code will not affect any branches but your own.
    - If a code bundle with the same version number was already uploaded, it will be replaced.

5. Open Airflow UI and run the DAG.
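
The version naming rules in step 2 can be checked before building. The sketch below is a simplified check for the `<release>+<label>` shape used in the examples above (`1.2.3+jdoe`, `1.2.3+jdoe.myfeature`). It is not a full PEP 440 validator, and `version_problems` is a hypothetical helper rather than something provided by Poetry or this repository.

```python
import re

# Simplified shape: numeric release (e.g. 1.2.3), "+", then a dot-separated label.
# This deliberately covers only the examples above, not all of PEP 440.
_DEV_VERSION = re.compile(r"\d+(\.\d+)*\+[a-z0-9]+(\.[a-z0-9]+)*")


def version_problems(version: str) -> list[str]:
    """Return human-readable problems with a development version string."""
    problems = []
    if re.search(r"[_-]", version):
        problems.append("use dots instead of underscores or hyphens")
    # Judge the overall shape as if the separators had already been fixed.
    candidate = re.sub(r"[_-]", ".", version.lower())
    if not _DEV_VERSION.fullmatch(candidate):
        problems.append("expected the form <release>+<label>, e.g. 1.2.3+jdoe")
    return problems
```

An empty list means the string has the expected shape. For example, `version_problems("1.2.3+jdoe_myfeature")` reports only the separator problem, because replacing the underscore with a dot gives an acceptable version.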

## Troubleshooting

Note that when you add a new workflow under `dags/`, Airflow will not pick it up immediately. By default, the filesystem is scanned for new DAGs only every 300 seconds. However, once the DAG is added, updates are applied nearly instantaneously.