Commit

update readme
rcannood committed Aug 23, 2024
1 parent 552f9d0 commit b400237
Showing 3 changed files with 169 additions and 35 deletions.
194 changes: 164 additions & 30 deletions README.md
@@ -1,47 +1,181 @@
# Task Template
# Spatial Simulators


<!--
This file is automatically generated from the task's api/*.yaml files.
Do not edit this file directly.
-->

Assessing the quality of spatial transcriptomics simulators

Repository:
[openproblems-bio/task_spatial_simulators](https://github.com/openproblems-bio/task_spatial_simulators)

## Description

Computational methods for spatially resolved transcriptomics (SRT) are
frequently developed and assessed through data simulation. The
effectiveness of these evaluations relies on the simulation methods’
ability to accurately reflect experimental data. However, a systematic
evaluation framework for spatial simulators is lacking. Here, we present
SpatialSimBench, a comprehensive evaluation framework that assesses 13
simulation methods using 10 distinct SRT datasets.

The research goal of this benchmark is to systematically evaluate and
compare the performance of various simulation methods for spatial
transcriptomics (ST) data. It aims to address the lack of a
comprehensive evaluation framework for spatial simulators and explore
the feasibility of leveraging existing single-cell simulators for ST
data. The experimental setup involves collecting public spatial
transcriptomics datasets and corresponding scRNA-seq datasets. The
spatial and scRNA-seq datasets can originate from different studies but
should consist of similar cell types from similar tissues.

## Authors & contributors

| name | roles |
|:------------------|:-------------------|
| Xiaoqi Liang | author, maintainer |
| Yue Cao | author |
| Jean Yang | author |
| Robrecht Cannoodt | contributor |
| Sai Nirmayi Yasa | contributor |

## API

``` mermaid
flowchart LR
comp_process_datasets[/"Process Dataset"/]
file_dataset_sc("Single-Cell Dataset")
file_dataset_sp("Spatial Dataset")
comp_metric[/"Metric"/]
comp_control_method[/"Control Method"/]
comp_method[/"Method"/]
file_score("Score")
file_simulated_dataset("Solution")
comp_process_datasets-->file_dataset_sc
comp_process_datasets-->file_dataset_sp
file_dataset_sc---comp_metric
file_dataset_sp---comp_metric
file_dataset_sp---comp_control_method
file_dataset_sp---comp_method
comp_metric-->file_score
comp_control_method-->file_simulated_dataset
comp_method-->file_simulated_dataset
file_simulated_dataset---comp_metric
```
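
The flowchart above can also be read as a dependency graph. The following is a minimal sketch, not part of the task's code, that copies the edges from the diagram and derives one valid execution order with Python's standard `graphlib` module. It assumes that the undirected `---` links mean "file is an input to component"; the names `EDGES` and `execution_order` are illustrative only.

```python
from graphlib import TopologicalSorter

# Edges copied from the flowchart, written as (producer, consumer).
# Assumption: an undirected `---` link means the file feeds the component.
EDGES = [
    ("comp_process_datasets", "file_dataset_sc"),
    ("comp_process_datasets", "file_dataset_sp"),
    ("file_dataset_sc", "comp_metric"),
    ("file_dataset_sp", "comp_metric"),
    ("file_dataset_sp", "comp_control_method"),
    ("file_dataset_sp", "comp_method"),
    ("comp_control_method", "file_simulated_dataset"),
    ("comp_method", "file_simulated_dataset"),
    ("file_simulated_dataset", "comp_metric"),
    ("comp_metric", "file_score"),
]


def execution_order(edges):
    """Return one ordering in which every node comes after all of its inputs."""
    sorter = TopologicalSorter()
    for producer, consumer in edges:
        # TopologicalSorter.add(node, *predecessors): consumer depends on producer.
        sorter.add(consumer, producer)
    return list(sorter.static_order())


order = execution_order(EDGES)
print(" -> ".join(order))
```

Under this reading, dataset processing comes first and the score file comes last, with the methods and control methods running between them.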

This repo is a template to create a new task for the OpenProblems v2. This repo contains several example files and components that can be used when updated with the task info.
## Component type: Process Dataset

> [!WARNING]
> This README will be overwritten when performing the `create_task_readme` script.
Preprocessing of spatial transcriptomics and single-cell transcriptomics
datasets.

## Create a repository from this template
Arguments:

> [!IMPORTANT]
> Before creating a new repository, make sure you are part of the openProblems task team. This will be done when you create an issue for the task and you got the go ahead to create the task.
> For more information on how to create a new task, check out the [Create a new task](https://openproblems.bio/documentation/create_task/) documentation.
<div class="small">

The instructions below will guide you through creating a new repository from this template ([creating-a-repository-from-a-template](https://docs.github.com/en/repositories/creating-and-managing-repositories/creating-a-repository-from-a-template#creating-a-repository-from-a-template)).
| Name | Type | Description |
|:---|:---|:---|
| `--input_sc` | `file` | NA. |
| `--input_sp` | `file` | NA. |
| `--output_sc` | `file` | (*Output*) An unprocessed single-cell dataset as output by a dataset loader. Default: `$id/output_sc.h5ad`. |
| `--output_sp` | `file` | (*Output*) An unprocessed spatial dataset as output by a dataset loader. Default: `$id/output_sp.h5ad`. |
| `--dataset_id` | `string` | NA. |
| `--dataset_name` | `string` | NA. |
| `--dataset_url` | `string` | (*Optional*) NA. |
| `--dataset_reference` | `string` | (*Optional*) NA. |
| `--dataset_summary` | `string` | NA. |
| `--dataset_description` | `string` | NA. |
| `--dataset_organism` | `string` | NA. |

</div>
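
As a rough illustration of the argument table above, the sketch below mirrors it with Python's `argparse`. This is not how the component is actually implemented: the program name `process_datasets` and the function `build_parser` are hypothetical, and how the `$id` placeholder in the defaults gets expanded is not shown here.

```python
import argparse


def build_parser():
    """Build a parser that mirrors the Process Dataset argument table."""
    # "process_datasets" is a hypothetical program name used for illustration.
    parser = argparse.ArgumentParser(prog="process_datasets")
    parser.add_argument("--input_sc", required=True)
    parser.add_argument("--input_sp", required=True)
    # Defaults copied verbatim from the table; `$id` is left unexpanded.
    parser.add_argument("--output_sc", default="$id/output_sc.h5ad")
    parser.add_argument("--output_sp", default="$id/output_sp.h5ad")
    required_meta = (
        "dataset_id",
        "dataset_name",
        "dataset_summary",
        "dataset_description",
        "dataset_organism",
    )
    for name in required_meta:
        parser.add_argument(f"--{name}", required=True)
    # Arguments marked (*Optional*) in the table.
    for name in ("dataset_url", "dataset_reference"):
        parser.add_argument(f"--{name}", default=None)
    return parser


# Placeholder values, for illustration only.
args = build_parser().parse_args([
    "--input_sc", "sc.h5ad",
    "--input_sp", "sp.h5ad",
    "--dataset_id", "example",
    "--dataset_name", "Example",
    "--dataset_summary", "A short summary",
    "--dataset_description", "A longer description",
    "--dataset_organism", "mus_musculus",
])
print(args.output_sc)
```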

* Click the "Use this template" button on the top right of the repository.
* Use the Owner dropdown menu to select the `openproblems-bio` account.
* Type a name for your repository (task_...), and a description.
* Set the repository visibility to public.
* Click "Create repository from template".
## File format: Single-Cell Dataset

## Clone the repository
An unprocessed single-cell dataset as output by a dataset loader.

To clone the repository with the submodule files, you can use the following command:
Example file: `resources_test/datasets/MOBNEW/dataset_sc.h5ad`

```bash
git clone --recursive git@github.com:openproblems-bio/task_spatial_simulators.git
```
Description:

If you already cloned the repository and forgot to use the `--recursive` flag, you can use the following command to update the submodules:
This dataset contains raw counts and metadata as output by a dataset
loader.

```bash
git submodule update --init --recursive
```
The format of this file is derived from the [CELLxGENE schema
v4.0.0](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/4.0.0/schema.md).

Once you have cloned the repository, you can start downloading the test resources using the following command:
## File format: Spatial Dataset

```bash
scripts/download_resources.sh
```
An unprocessed spatial dataset as output by a dataset loader.

Example file: `resources_test/datasets/MOBNEW/dataset_sp.h5ad`

Description:

This dataset contains raw counts and metadata as output by a dataset
loader.

The format of this file is derived from the [CELLxGENE schema
v4.0.0](https://github.com/chanzuckerberg/single-cell-curation/blob/main/schema/4.0.0/schema.md).

## Component type: Metric

A metric.

Arguments:

<div class="small">

| Name | Type | Description |
|:---|:---|:---|
| `--input_spatial_dataset` | `file` | An unprocessed spatial dataset as output by a dataset loader. |
| `--input_singlecell_dataset` | `file` | An unprocessed single-cell dataset as output by a dataset loader. |
| `--input_simulated_dataset` | `file` | The solution for the test data. |
| `--output` | `file` | (*Output*) File indicating the score of a metric. |

</div>
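
To show how the four metric arguments fit together, here is a small sketch that assembles a command line from them. The executable path `./metric_component` and the helper `metric_command` are assumptions made for illustration, since the real entry point depends on how the component is built. The input paths reuse the example files listed in this README; the output path is a placeholder.

```python
import shlex

# Flags taken from the Metric argument table.
METRIC_FLAGS = (
    "input_spatial_dataset",
    "input_singlecell_dataset",
    "input_simulated_dataset",
    "output",
)


def metric_command(executable, **paths):
    """Return an argv list for a metric run, checking that every flag is set."""
    missing = [flag for flag in METRIC_FLAGS if flag not in paths]
    if missing:
        raise ValueError(f"missing arguments: {', '.join(missing)}")
    argv = [executable]
    for flag in METRIC_FLAGS:
        argv += [f"--{flag}", str(paths[flag])]
    return argv


cmd = metric_command(
    "./metric_component",  # hypothetical executable name
    input_spatial_dataset="resources_test/datasets/MOBNEW/dataset_sp.h5ad",
    input_singlecell_dataset="resources_test/datasets/MOBNEW/dataset_sc.h5ad",
    input_simulated_dataset="resources_test/datasets/MOBNEW/simulated_dataset.h5ad",
    output="score.h5ad",  # placeholder output path
)
print(shlex.join(cmd))
```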

## Component type: Control Method

A control method.

Arguments:

<div class="small">

| Name | Type | Description |
|:---|:---|:---|
| `--input` | `file` | (*Optional*) An unprocessed spatial dataset as output by a dataset loader. |
| `--output` | `file` | (*Output*) The solution for the test data. |

</div>

## Component type: Method

A method.

Arguments:

<div class="small">

| Name | Type | Description |
|:---|:---|:---|
| `--input` | `file` | (*Optional*) An unprocessed spatial dataset as output by a dataset loader. |
| `--base` | `string` | (*Optional*) NA. Default: `domain`. |
| `--output` | `file` | (*Output*) The solution for the test data. |

</div>

## File format: Score

File indicating the score of a metric.

Example file: `resources/score.h5ad`

## File format: Solution

## What to do next
The solution for the test data.

Check out the [instructions](common/INSTRUCTIONS.md) for more information on how to update the example files and components. These instructions also contain information on how to build out the task and basic commands.
Example file: `resources_test/datasets/MOBNEW/simulated_dataset.h5ad`

For more information on the OpenProblems v2, check out the [Documentation](https://openproblems.bio/documentation/).
8 changes: 4 additions & 4 deletions src/api/comp_process_datasets.yaml
@@ -23,14 +23,14 @@ argument_groups:

- name: Outputs
arguments:
- type: file
name: --output_sc
- name: --output_sc
__merge__: file_dataset_sc.yaml
description: Processed single-cell dataset
direction: output
required: true
default: '$id/output_sc.h5ad'
- type: file
name: --output_sp
- name: --output_sp
__merge__: file_dataset_sp.yaml
description: Processed spatial dataset
direction: output
required: true
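
The hunk above drops the explicit `type: file` and instead pulls the file specification in through `__merge__`. The sketch below illustrates one plausible reading of that mechanism: the referenced file supplies the base fields, and keys written next to `__merge__` take precedence. Both the precedence rule and the contents of `file_dataset_sc` shown here are assumptions, since the real file is not part of this diff. `merge_config` is an illustrative helper, not the tool's own API.

```python
def merge_config(base, override):
    """Recursively merge two mappings; keys in `override` take precedence."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


# Hypothetical stand-in for file_dataset_sc.yaml (not shown in this diff).
file_dataset_sc = {
    "type": "file",
    "description": "Description from the file specification",
}

# Keys written alongside `__merge__` in comp_process_datasets.yaml.
output_sc = merge_config(file_dataset_sc, {
    "name": "--output_sc",
    "description": "Processed single-cell dataset",
    "direction": "output",
    "required": True,
    "default": "$id/output_sc.h5ad",
})
print(output_sc)
```

Under this assumption the argument still ends up with `type: file`, which would explain why the explicit line could be removed.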