[SYNPY-1362] High level best practices for project structure #1028

Merged
merged 26 commits into from
Dec 22, 2023
Changes from 22 commits
Commits
7c23317
Migrating to mkdocstrings
BryanFauble Dec 14, 2023
521f7a4
Using GH action to build docs pages
BryanFauble Dec 15, 2023
287a103
Temp change
BryanFauble Dec 15, 2023
77fb669
Revert temp content
BryanFauble Dec 15, 2023
e929b7d
Update build-docs to master branch
BryanFauble Dec 15, 2023
0b0e88b
Correcting some issues
BryanFauble Dec 15, 2023
a8bc6a8
Make table content slightly larger
BryanFauble Dec 15, 2023
cd69af6
Update print release issue script to allow md
BryanFauble Dec 15, 2023
1508f7d
Adding note around code style
BryanFauble Dec 15, 2023
40cfa1b
Add in template for structuring data
thomasyu888 Dec 18, 2023
8af27cc
Add EOL
thomasyu888 Dec 18, 2023
fefa771
Add in details about ELITE portal
thomasyu888 Dec 18, 2023
89cf9d1
Add note about documentation
thomasyu888 Dec 18, 2023
feafeda
Add WIP note
thomasyu888 Dec 18, 2023
da6db6a
Merge branch 'develop' into SYNPY-1362-best-practices-folder-struct
BryanFauble Dec 18, 2023
68500d3
Update docs/explanations/structuring_your_project.md
thomasyu888 Dec 18, 2023
f5e79cf
Update changes
thomasyu888 Dec 18, 2023
541c26d
Merge branch 'SYNPY-1362-best-practices-folder-struct' of github.com:…
thomasyu888 Dec 18, 2023
8fb018b
Add section about manifests
thomasyu888 Dec 18, 2023
d757cf5
Add different sections
thomasyu888 Dec 18, 2023
4fd65f0
Add in project structure
thomasyu888 Dec 18, 2023
1ab4822
Update mkdocs.yml
thomasyu888 Dec 18, 2023
a9c12a9
Update docs/explanations/structuring_your_project.md
thomasyu888 Dec 19, 2023
e7a83aa
Update docs/explanations/structuring_your_project.md
thomasyu888 Dec 19, 2023
b269586
Update docs/explanations/structuring_your_project.md
thomasyu888 Dec 19, 2023
5e0126c
Standardize File View
thomasyu888 Dec 21, 2023
113 changes: 113 additions & 0 deletions docs/explanations/structuring_your_project.md
@@ -0,0 +1,113 @@
# Structuring Your Synapse Project

Based on more than 10 years of experience managing data coordination projects at Sage, the following are recommendations on how best to manage your Synapse project for data sharing and management.

> Note: This page is a work in progress and will contain code examples at a later date.

## Permissions Management

We recommend creating Synapse Teams for permission management on your Synapse project, so that you manage team membership instead of granting individual users access to your project. These are the recommended teams to create:

* **<project> Admin** - This team should have "Administrator" access to the project and is used to manage the project and grant access to other teams.
* **<project> Users** - This team is optional, but can be used to grant a curated set of users download access to the project by granting the "Can Download" permission to the team. If you want to grant all registered users on Synapse download access to your project, click the "Make Public" button in the project's sharing settings instead of creating and managing this team.
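
The team layout above can be sketched as a small plan that maps each team to the access types it would receive. This is only a sketch: the `plan_team_permissions` helper is hypothetical, it does not call Synapse, and the access-type lists are assumed to match what the Synapse Python client documents for "Administrator" and "Can Download". Please verify them against your client version before applying.

```python
# Hypothetical helper: builds a permission plan, it does not call Synapse.
# The access-type names below are assumptions based on the Synapse Python
# client documentation for "Administrator" and "Can Download".
ADMINISTRATOR = [
    "READ", "DOWNLOAD", "CREATE", "UPDATE", "DELETE",
    "MODERATE", "CHANGE_PERMISSIONS", "CHANGE_SETTINGS",
]
CAN_DOWNLOAD = ["READ", "DOWNLOAD"]


def plan_team_permissions(project_name, include_users_team=True):
    """Return {team name: access types} for the recommended teams."""
    plan = {f"{project_name} Admin": list(ADMINISTRATOR)}
    if include_users_team:
        # Optional team for a curated set of users with download access.
        plan[f"{project_name} Users"] = list(CAN_DOWNLOAD)
    return plan


plan = plan_team_permissions("MyProject")
for team_name, access_types in plan.items():
    print(team_name, "->", ", ".join(access_types))
```

Each entry in the plan could then be applied with the client's `setPermissions` method, using the team's ID as the principal.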

Below are some key permission criteria to consider when setting up your project:

* Create a Synapse Team and Project per data contributor if there are (1) multiple data contributors and (2) the data contributors should not have access to each other's raw data. You would then create a central "public" project that will contain the harmonized data. You can technically leverage [local share settings](https://help.synapse.org/docs/Sharing-Settings,-Permissions,-and-Conditions-for-Use.2024276030.html#SharingSettings,Permissions,andConditionsforUse-EditSharingSettingsonFiles,Folders,andTables) by creating private folders in one project but managing local share settings is more complicated and not recommended.
* Do not mix data that requires different permission models within a folder. For example, if you have a folder that contains both public and private data, you should create two folders, one for public data and one for private data. You can then grant the appropriate permissions to each folder. You can use local share settings to manage each file's permission, but this is not recommended!
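
Before uploading, you could check a planned layout for folders that mix permission models. The sketch below is illustrative only: `find_mixed_folders` is a hypothetical helper, and the "public" and "private" labels stand in for whatever permission models your project uses. It only looks at each file's immediate parent folder.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def find_mixed_folders(files):
    """Return folders whose files use more than one permission model.

    `files` is an iterable of (path, permission_model) pairs; the
    permission_model labels are illustrative, not Synapse values.
    """
    models_by_folder = defaultdict(set)
    for path, permission_model in files:
        folder = str(PurePosixPath(path).parent)
        models_by_folder[folder].add(permission_model)
    return sorted(f for f, models in models_by_folder.items() if len(models) > 1)


files = [
    ("study/raw/sample1.bam", "private"),
    ("study/raw/summary.tsv", "public"),
    ("study/harmonized/counts.tsv", "public"),
]
print(find_mixed_folders(files))
```

Any folder reported here would be a candidate for splitting into one public and one private folder.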


## Project Structure

When organizing your data for upload, we have a preferred organization (flattened data layout) and an alternate option (hierarchy data layout) if your project requires it. Synapse files are automatically versioned when you create a file with the same filename, so be sure to account for that when organizing your folders.

> NOTE: If you and your contributing site decide to use a hierarchical file structure within your cloud storage location, please remember that each top-level folder and all of its subfolders must contain data of the same type (see details below).


### Top Level Folder Names

Top level folders correspond to the datasets being submitted. See the examples below. You can name your datasets in a way that is descriptive for your contributing site.

You can use either the Hierarchy or Flattened data layout according to the examples below.

#### Flattened Data Layout Example

This is the preferred dataset organization option. Each dataset folder contains the same datatype, and there aren’t nested folders containing datasets.

```
.
├── biospecimen_experiment_1
│   ├── manifest1.tsv
├── biospecimen_experiment_2
│   ├── manifestA.tsv
├── single_cell_RNAseq_batch_1
│   ├── manifestX.tsv
│   ├── fileA.txt
│   ├── fileB.txt
│   ├── fileC.txt
│   └── fileD.txt
└── single_cell_RNAseq_batch_2
    ├── manifestY.tsv
    └── file1.txt
```
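
A flattened layout can be checked mechanically before upload. The following is a minimal sketch, assuming your files are listed as relative paths; `paths_breaking_flat_layout` is a hypothetical helper that flags anything that is not exactly `<dataset folder>/<file>`.

```python
from pathlib import PurePosixPath


def paths_breaking_flat_layout(paths):
    """Return paths that are not exactly <dataset folder>/<file>."""
    return [p for p in paths if len(PurePosixPath(p).parts) != 2]


paths = [
    "biospecimen_experiment_1/manifest1.tsv",
    "single_cell_RNAseq_batch_1/fileA.txt",
    "single_cell_RNAseq_batch_1/extra/fileB.txt",  # nested folder
]
print(paths_breaking_flat_layout(paths))
```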

#### Hierarchy Data Layout Example

In this option, subfolders should be of the same data type and level as the root folder they are contained in. For example, you should not put a biospecimen and a clinical demographics subfolder within the same folder. Your files should be reasonably descriptive in stating the assay type and level and be consistently prefixed with the assay type.

* a dataset folder can’t be inside another dataset folder
* dataset folders must have unique names
* folder hierarchy may contain non-dataset folders (e.g. storing reports or other kinds of entities)

```
.
├── clinical_diagnosis
├── clinical_demographics
├── biospecimen
│   ├── experiment_1
│   │   ├── manifest1.tsv
│   └── experiment_2
│       ├── manifestA.tsv
└── single_cell
    ├── batch_1
    │   ├── manifestX.tsv
    │   ├── fileA.txt
    │   ├── fileB.txt
    │   ├── fileC.txt
    │   └── fileD.txt
    └── batch_2
        ├── manifestY.tsv
        └── file1.txt
```
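
The first two rules for the hierarchy layout (no dataset folder inside another, and unique dataset folder names) can be sketched as a simple check. This is a hypothetical helper, not part of the Synapse Python client, and it assumes you already know which folders are dataset folders, for example the ones that hold a manifest.

```python
from collections import Counter
from pathlib import PurePosixPath


def check_dataset_folders(dataset_folders):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    paths = [PurePosixPath(p) for p in dataset_folders]

    # Rule: dataset folders must have unique names.
    name_counts = Counter(p.name for p in paths)
    for name, count in sorted(name_counts.items()):
        if count > 1:
            problems.append(f"duplicate dataset folder name: {name}")

    # Rule: a dataset folder can't be inside another dataset folder.
    for inner in paths:
        for outer in paths:
            if outer in inner.parents:
                problems.append(f"{inner} is nested inside dataset folder {outer}")
    return problems


good = [
    "biospecimen/experiment_1",
    "biospecimen/experiment_2",
    "single_cell/batch_1",
    "single_cell/batch_2",
]
bad = ["single_cell/batch_1", "single_cell/batch_1/rerun", "other/batch_1"]
print(check_dataset_folders(good))
print(check_dataset_folders(bad))
```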

### File Views

A file view allows you to see groups of files, tables, projects, or submissions and any associated annotations about those items. Annotations are an essential component to building a view. Annotations are labels that you apply to your data, stored as key-value pairs in Synapse. You can use annotations to select specific subsets of your data across many projects or folders and group things together in one view.

You can use a view to:

- Search and query many files, tables, projects, and submissions at once
- View and edit file or table annotations in bulk
- Group or link files, tables, projects, or submissions together by their annotations
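
Views are queried with a SQL-like syntax, so selecting a subset of files by annotation comes down to building a `WHERE` clause. The sketch below is illustrative: `build_view_query` is a hypothetical helper, `syn00000000` is a placeholder view ID, and the `assay` and `species` annotation keys are examples, not required names. Doubling single quotes follows the standard SQL convention and is an assumption here.

```python
def build_view_query(view_id, **annotations):
    """Build a SQL-like query that filters a view by annotation values."""
    query = f"SELECT * FROM {view_id}"
    if annotations:
        clauses = []
        for key, value in annotations.items():
            # Assumption: single quotes are escaped by doubling them.
            escaped = str(value).replace("'", "''")
            clauses.append(f"{key} = '{escaped}'")
        query += " WHERE " + " AND ".join(clauses)
    return query


print(build_view_query("syn00000000", assay="rnaSeq", species="Human"))
```

The resulting string could then be passed to the Synapse Python client's `tableQuery` method.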


#### Creating the file view

* Create a Fileview with the project set as its scope.
* Give every team "Can Download" access to this Fileview.
* Note: creating this file view will not be possible if files/folders don’t yet exist in the center-specific projects; Synapse will not allow you to create a file view with an empty scope.
* Make sure to add both file and folder entities to the scope of the Fileview.
* Add Synapse annotations to each file and folder so that your files are more easily discoverable via a file view.

For more information, visit [File Views](https://help.synapse.org/docs/Views.2011070739.html).

#### Uploading annotations with manifests

Manifests are crucial for the organization of your data in Synapse. In the **hierarchical case**, you would fill in one manifest and include all files in experiment/batches; in the **flattened case**, you would fill in one manifest for each top level folder. The manifest would contain Synapse annotations which can be used to query the data when a File View is created. Please read [manifest_tsv](manifest_tsv.md) for more information.
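
For the flattened case, a manifest for one top level folder could be generated along these lines. This is a minimal sketch: `build_manifest` is a hypothetical helper, the `path` and `parent` columns are assumed from the manifest format described in [manifest_tsv](manifest_tsv.md), `syn00000001` is a placeholder folder ID, and the `assay` and `batch` annotations are examples only.

```python
import csv
import io


def build_manifest(rows, annotation_keys):
    """Render manifest rows as TSV text: path, parent, then annotations."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=["path", "parent", *annotation_keys],
        delimiter="\t",
        lineterminator="\n",
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# One row per file in the top level folder; values are placeholders.
rows = [
    {"path": "single_cell_RNAseq_batch_1/fileA.txt", "parent": "syn00000001",
     "assay": "scRNAseq", "batch": "1"},
    {"path": "single_cell_RNAseq_batch_1/fileB.txt", "parent": "syn00000001",
     "assay": "scRNAseq", "batch": "1"},
]
manifest_text = build_manifest(rows, ["assay", "batch"])
print(manifest_text)
```

The annotation columns written here are the ones you would later be able to query through a File View.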


### An example: ELITE portal

Synapse Project: https://www.synapse.org/#!Synapse:syn27229419/wiki/623145

This project powers the ELITE portal: https://eliteportal.synapse.org/. More information about the studies and the files can be found in this portal.
1 change: 1 addition & 0 deletions mkdocs.yml
@@ -56,6 +56,7 @@ nav:
- Properties vs Annotations: explanations/properties_vs_annotations.md
- Manifest TSV: explanations/manifest_tsv.md
- Benchmarking: explanations/benchmarking.md
- Structuring Your Project: explanations/structuring_your_project.md
- News:
- news.md
