Merge remote-tracking branch 'origin/main' into feature/ref-15/update-to-core-dependency
KaiWaldrant committed Sep 20, 2024
2 parents 8e7983f + 10119ed commit 17e5867
Showing 24 changed files with 55 additions and 52 deletions.
3 changes: 3 additions & 0 deletions CHANGELOG.md
@@ -34,12 +34,15 @@

* Small changes to api file names (PR #13).

* Update test_resources path in components (PR #18).

* Update workflows to use core repository dependency (PR #20).

## BUG FIXES

* Update the nextflow workflow dependencies (PR #17).

* Fix paths in scripts (PR #18).

## transfer from openproblems-v2 repository

17 changes: 12 additions & 5 deletions README.md
@@ -72,13 +72,15 @@ flowchart LR

A subset of the common dataset.

Example file: `resources_test/common/pancreas/dataset.h5ad`
Example file:
`resources_test/common/cxg_mouse_pancreas_atlas/dataset.h5ad`

Format:

<div class="small">

AnnData object
obs: 'batch'
layers: 'counts'
uns: 'dataset_id', 'dataset_name', 'dataset_url', 'dataset_reference', 'dataset_summary', 'dataset_description', 'dataset_organism'

@@ -90,6 +92,7 @@ Data structure:

| Slot | Type | Description |
|:---|:---|:---|
| `obs["batch"]` | `string` | (*Optional*) Batch information. |
| `layers["counts"]` | `integer` | Raw counts. |
| `uns["dataset_id"]` | `string` | A unique identifier for the dataset. |
| `uns["dataset_name"]` | `string` | Nicely formatted name. |
@@ -121,7 +124,8 @@ Arguments:

The subset of molecules used for the test dataset

Example file: `resources_test/denoising/pancreas/test.h5ad`
Example file:
`resources_test/task_denoising/cxg_mouse_pancreas_atlas/test.h5ad`

Format:

@@ -155,7 +159,8 @@ Data structure:

The subset of molecules used for the training dataset

Example file: `resources_test/denoising/pancreas/train.h5ad`
Example file:
`resources_test/task_denoising/cxg_mouse_pancreas_atlas/train.h5ad`

Format:

@@ -229,7 +234,8 @@ Arguments:

A denoised dataset as output by a method.

Example file: `resources_test/denoising/pancreas/denoised.h5ad`
Example file:
`resources_test/task_denoising/cxg_mouse_pancreas_atlas/denoised.h5ad`

Format:

@@ -257,7 +263,8 @@ Data structure:

File indicating the score of a metric.

Example file: `resources_test/denoising/pancreas/score.h5ad`
Example file:
`resources_test/task_denoising/cxg_mouse_pancreas_atlas/score.h5ad`

Format:

4 changes: 2 additions & 2 deletions _viash.yaml
@@ -44,8 +44,8 @@ info:
test_resources:
- type: s3
path: s3://openproblems-data/resources_test/denoising/
dest: resources_test/denoising
path: s3://openproblems-data/resources_test/task_denoising/
dest: resources_test/task_denoising
- type: s3
path: s3://openproblems-data/resources_test/common/
dest: resources_test/common
2 changes: 1 addition & 1 deletion scripts/create_resources/resources.sh
@@ -14,7 +14,7 @@ output_state: "$id/state.yaml"
publish_dir: s3://openproblems-data/resources/denoising/datasets
HERE

tw launch https://github.com/openproblems-bio/task_template.git \
tw launch https://github.com/openproblems-bio/task_denoising.git \
--revision build/main \
--pull-latest \
--main-script target/nextflow/workflows/process_datasets/main.nf \
27 changes: 10 additions & 17 deletions scripts/create_resources/test_resources.sh
@@ -6,47 +6,40 @@ REPO_ROOT=$(git rev-parse --show-toplevel)
# ensure that the command below is run from the root of the repository
cd "$REPO_ROOT"

# # remove this when you have implemented the script
# echo "TODO: replace the commands in this script with the sequence of components that you need to run to generate test_resources."
# echo " Inside this script, you will need to place commands to generate example files for each of the 'src/api/file_*.yaml' files."
# exit 1

set -e

RAW_DATA=resources_test/common
DATASET_DIR=resources_test/denoising
DATASET_DIR=resources_test/task_denoising

mkdir -p $DATASET_DIR

# process dataset
viash run src/data_processors/process_dataset/config.vsh.yaml -- \
--input $RAW_DATA/cxg_mouse_pancreas_atlas/dataset.h5ad \
--output_train $DATASET_DIR/cxg_mouse_pancreas_atlas/train.h5ad \
--output_test $DATASET_DIR/cxg_mouse_pancreas_atlas/test.h5ad \
--output_solution $DATASET_DIR/cxg_mouse_pancreas_atlas/solution.h5ad
--output_test $DATASET_DIR/cxg_mouse_pancreas_atlas/test.h5ad

# run one method
viash run src/methods/magic/config.vsh.yaml -- \
--input_train $DATASET_DIR/pancreas/train.h5ad \
--output $DATASET_DIR/pancreas/denoised.h5ad
--input_train $DATASET_DIR/cxg_mouse_pancreas_atlas/train.h5ad \
--output $DATASET_DIR/cxg_mouse_pancreas_atlas/denoised.h5ad

# run one metric
viash run src/metrics/poisson/config.vsh.yaml -- \
--input_denoised $DATASET_DIR/pancreas/denoised.h5ad \
--input_test $DATASET_DIR/pancreas/test.h5ad \
--output $DATASET_DIR/pancreas/score.h5ad
--input_prediction $DATASET_DIR/cxg_mouse_pancreas_atlas/denoised.h5ad \
--input_test $DATASET_DIR/cxg_mouse_pancreas_atlas/test.h5ad \
--output $DATASET_DIR/cxg_mouse_pancreas_atlas/score.h5ad

# write manual state.yaml. this is not actually necessary but you never know it might be useful
cat > $DATASET_DIR/cxg_mouse_pancreas_atlas/state.yaml << HERE
id: cxg_mouse_pancreas_atlas
train: !file train.h5ad
test: !file test.h5ad
solution: !file solution.h5ad
prediction: !file denoised.h5ad
score: !file score.h5ad
HERE

# only run this if you have access to the openproblems-data bucket
# aws s3 sync --profile op \
# "$DATASET_DIR" s3://openproblems-data/resources_test/denoising \
# --delete --dryrun
aws s3 sync --profile OP \
"$DATASET_DIR" s3://openproblems-data/resources_test/task_denoising \
--delete --dryrun
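The `state.yaml` heredoc at the end of `test_resources.sh` can also be sketched in Python. This is only an illustration and not part of the commit: `render_state` is a hypothetical helper, and the entries passed to it assume that the `solution` entry is one of the lines removed by this diff, which the change counts suggest but the scraped page does not mark explicitly.

```python
def render_state(dataset_id, files):
    """Render a state.yaml body: an id line, then one '!file' entry per slot.

    Hypothetical helper; the script itself writes this file with a heredoc.
    """
    lines = [f"id: {dataset_id}"]
    for slot, filename in files.items():
        lines.append(f"{slot}: !file {filename}")
    return "\n".join(lines) + "\n"


# Entries assumed to remain after this commit (no 'solution' slot).
state = render_state(
    "cxg_mouse_pancreas_atlas",
    {
        "train": "train.h5ad",
        "test": "test.h5ad",
        "prediction": "denoised.h5ad",
        "score": "score.h5ad",
    },
)
print(state, end="")
```

Building the text by hand avoids the need for a YAML library that understands the custom `!file` tag.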
4 changes: 2 additions & 2 deletions scripts/run_benchmark/run_test_local.sh
@@ -21,7 +21,7 @@ nextflow run . \
-resume \
-c common/nextflow_helpers/labels_ci.config \
--id cxg_mouse_pancreas_atlas \
--input_train resources_test/denoising/cxg_mouse_pancreas_atlas/train.h5ad \
--input_test resources_test/denoising/cxg_mouse_pancreas_atlas/test.h5ad \
--input_train resources_test/task_denoising/cxg_mouse_pancreas_atlas/train.h5ad \
--input_test resources_test/task_denoising/cxg_mouse_pancreas_atlas/test.h5ad \
--output_state state.yaml \
--publish_dir "$publish_dir"
4 changes: 2 additions & 2 deletions src/api/comp_control_method.yaml
@@ -29,5 +29,5 @@ test_resources:
- type: python_script
path: /common/component_tests/check_config.py
- path: /common/library.bib
- path: /resources_test/denoising/pancreas
dest: resources_test/denoising/pancreas
- path: /resources_test/task_denoising/cxg_mouse_pancreas_atlas
dest: resources_test/task_denoising/cxg_mouse_pancreas_atlas
4 changes: 2 additions & 2 deletions src/api/comp_data_processor.yaml
@@ -22,5 +22,5 @@ arguments:
test_resources:
- type: python_script
path: /common/component_tests/run_and_check_output.py
- path: /resources_test/common/pancreas
dest: resources_test/common/pancreas
- path: /resources_test/common/cxg_mouse_pancreas_atlas
dest: resources_test/common/cxg_mouse_pancreas_atlas
4 changes: 2 additions & 2 deletions src/api/comp_method.yaml
@@ -21,5 +21,5 @@ test_resources:
- type: python_script
path: /common/component_tests/check_config.py
- path: /common/library.bib
- path: /resources_test/denoising/pancreas
dest: resources_test/denoising/pancreas
- path: /resources_test/task_denoising/cxg_mouse_pancreas_atlas
dest: resources_test/task_denoising/cxg_mouse_pancreas_atlas
4 changes: 2 additions & 2 deletions src/api/comp_metric.yaml
@@ -25,5 +25,5 @@ test_resources:
- type: python_script
path: /common/component_tests/run_and_check_output.py
- path: /common/library.bib
- path: /resources_test/denoising/pancreas
dest: resources_test/denoising/pancreas
- path: /resources_test/task_denoising/cxg_mouse_pancreas_atlas
dest: resources_test/task_denoising/cxg_mouse_pancreas_atlas
2 changes: 1 addition & 1 deletion src/api/file_common_dataset.yaml
@@ -1,5 +1,5 @@
type: file
example: "resources_test/common/pancreas/dataset.h5ad"
example: "resources_test/common/cxg_mouse_pancreas_atlas/dataset.h5ad"
label: "Common Dataset"
summary: A subset of the common dataset.
info:
2 changes: 1 addition & 1 deletion src/api/file_prediction.yaml
@@ -1,5 +1,5 @@
type: file
example: "resources_test/denoising/pancreas/denoised.h5ad"
example: "resources_test/task_denoising/cxg_mouse_pancreas_atlas/denoised.h5ad"
label: "Denoised data"
summary: A denoised dataset as output by a method.
info:
2 changes: 1 addition & 1 deletion src/api/file_score.yaml
@@ -1,5 +1,5 @@
type: file
example: resources_test/denoising/pancreas/score.h5ad
example: resources_test/task_denoising/cxg_mouse_pancreas_atlas/score.h5ad
label: Score
summary: "File indicating the score of a metric."
info:
2 changes: 1 addition & 1 deletion src/api/file_test.yaml
@@ -1,5 +1,5 @@
type: file
example: "resources_test/denoising/pancreas/test.h5ad"
example: "resources_test/task_denoising/cxg_mouse_pancreas_atlas/test.h5ad"
label: "Test data"
summary: The subset of molecules used for the test dataset
info:
2 changes: 1 addition & 1 deletion src/api/file_train.yaml
@@ -1,5 +1,5 @@
type: file
example: "resources_test/denoising/pancreas/train.h5ad"
example: "resources_test/task_denoising/cxg_mouse_pancreas_atlas/train.h5ad"
label: "Training data"
summary: The subset of molecules used for the training dataset
info:
4 changes: 2 additions & 2 deletions src/control_methods/perfect_denoising/script.py
@@ -2,8 +2,8 @@

## VIASH START
par = {
'input_train': 'resources_test/denoising/pancreas/train.h5ad',
'input_test': 'resources_test/denoising/pancreas/test.h5ad',
'input_train': 'resources_test/task_denoising/cxg_mouse_pancreas_atlas/train.h5ad',
'input_test': 'resources_test/task_denoising/cxg_mouse_pancreas_atlas/test.h5ad',
'output': 'output_PD.h5ad',
}
meta = {
2 changes: 1 addition & 1 deletion src/data_processors/process_dataset/script.py
@@ -5,7 +5,7 @@

## VIASH START
par = {
'input': "resources_test/common/pancreas/dataset.h5ad",
'input': "resources_test/common/cxg_mouse_pancreas_atlas/dataset.h5ad",
'output_train': "train.h5ad",
'output_test': "test.h5ad",
'train_frac': 0.9,
2 changes: 1 addition & 1 deletion src/methods/alra/script.R
@@ -4,7 +4,7 @@ library(ALRA, warn.conflicts = FALSE)

## VIASH START
par <- list(
input_train = "resources_test/denoising/pancreas/train.h5ad",
input_train = "resources_test/task_denoising/cxg_mouse_pancreas_atlas/train.h5ad",
norm = "log",
output = "output.h5ad"
)
2 changes: 1 addition & 1 deletion src/methods/dca/script.py
@@ -3,7 +3,7 @@

## VIASH START
par = {
'input_train': 'resources_test/denoising/pancreas/train.h5ad',
'input_train': 'resources_test/task_denoising/cxg_mouse_pancreas_atlas/train.h5ad',
'output': 'output_dca.h5ad',
'epochs': 300,
}
2 changes: 1 addition & 1 deletion src/methods/knn_smoothing/script.py
@@ -3,7 +3,7 @@

## VIASH START
par = {
'input_train': 'resources_test/denoising/pancreas/train.h5ad',
'input_train': 'resources_test/task_denoising/cxg_mouse_pancreas_atlas/train.h5ad',
'output': 'output_knn.h5ad',
}
meta = {
2 changes: 1 addition & 1 deletion src/methods/magic/script.py
@@ -7,7 +7,7 @@

## VIASH START
par = {
"input_train": "resources_test/denoising/pancreas/train.h5ad",
"input_train": "resources_test/task_denoising/cxg_mouse_pancreas_atlas/train.h5ad",
"output": "output_magic.h5ad",
"solver": "exact",
"norm": "sqrt",
2 changes: 1 addition & 1 deletion src/methods/saver/script.R
@@ -5,7 +5,7 @@ library(Matrix, warn.conflicts = FALSE)

## VIASH START
par <- list(
input_train = "resources_test/denoising/pancreas/train.h5ad",
input_train = "resources_test/task_denoising/cxg_mouse_pancreas_atlas/train.h5ad",
norm = "log",
output = "output.h5ad"
)
4 changes: 2 additions & 2 deletions src/metrics/mse/script.py
@@ -5,8 +5,8 @@

## VIASH START
par = {
'input_test': 'resources_test/denoising/pancreas/test.h5ad',
'input_prediction': 'resources_test/denoising/pancreas/denoised.h5ad',
'input_test': 'resources_test/task_denoising/cxg_mouse_pancreas_atlas/test.h5ad',
'input_prediction': 'resources_test/task_denoising/cxg_mouse_pancreas_atlas/denoised.h5ad',
'output': 'output_mse.h5ad'
}
meta = {
4 changes: 2 additions & 2 deletions src/metrics/poisson/script.py
@@ -4,8 +4,8 @@

## VIASH START
par = {
'input_prediction': 'output_magic.h5ad',
'input_test': 'output_test.h5ad',
'input_test': 'resources_test/task_denoising/cxg_mouse_pancreas_atlas/test.h5ad',
'input_prediction': 'resources_test/task_denoising/cxg_mouse_pancreas_atlas/denoised.h5ad',
'output': 'output_poisson.h5ad'
}
meta = {

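Most of the changes above apply the same path renames. As a rough sketch of that mapping (not code from the repository), the hypothetical `migrate_path` helper below rewrites an old test-resource path to its new form. The `RENAMES` table is inferred from the old and new paths visible in this diff, and it is an assumption that no other prefixes are affected.

```python
# Old -> new path fragments, inferred from the paths shown in this diff.
# More specific fragments come first so they win over the generic rename.
RENAMES = [
    ("resources_test/denoising/pancreas",
     "resources_test/task_denoising/cxg_mouse_pancreas_atlas"),
    ("resources_test/common/pancreas",
     "resources_test/common/cxg_mouse_pancreas_atlas"),
    ("resources_test/denoising/",
     "resources_test/task_denoising/"),
]


def migrate_path(path):
    """Return the path with the first matching old fragment replaced.

    Paths that contain no old fragment are returned unchanged, so the
    function is safe to apply to paths that were already migrated.
    """
    for old, new in RENAMES:
        if old in path:
            return path.replace(old, new, 1)
    return path


print(migrate_path("resources_test/denoising/pancreas/train.h5ad"))
print(migrate_path("/resources_test/common/pancreas"))
print(migrate_path("s3://openproblems-data/resources_test/denoising/"))
```

A helper like this could be used to check for leftover old paths: any string for which `migrate_path(p) != p` still refers to the previous layout.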