[FEATURE REQUEST] Add STAMP clusters #3

cbravo93 · 2021-09-29T09:13:38Z

Is your feature request related to a problem? Please describe.
In i-cisTarget (web), we perform motif clustering with STAMP, which can help to reduce redundancy.

Describe the solution you'd like
Add motif clustering with STAMP on the results as an option. The code can be adapted from: /staging/leuven/stg_00002/lcb/cbravo/Multiomics_pipeline/STAMP/STAMP.py. This should be straightforward for the default databases.

For the already clustered databases, should we implement it too? If so, which motif should we use per cluster? At the moment we use the STAMP consensus motif for the logo (we cluster the whole collection with a Seurat-like approach and then run STAMP within each cluster; this is what produces the metaclusters in the collection), while we use all motifs in the cluster for scoring with cbust. Does it make sense to use the consensus motif here too, or would it be rather noisy? Also, for these databases the motifs have in a sense already been clustered before the analysis, so I am not sure it would add a lot.


cbravo93 · 2021-09-29T09:28:22Z

Also, this will require access to the cb files. I would add it as an optional step (not enabled by default), and then people can either download the files or we could read them from the web server. However, some collections are (partially) private (TRANSFAC?); are we allowed to share their PWMs?

SeppeDeWinter · 2021-09-29T10:10:57Z

Is it possible to run this clustering once, generating a matrix containing the similarities between each of the motifs (similar to what is already done for the clustered motif collection)? Or is the result very dependent on which motifs are included in the analysis (i.e. is the similarity measurement relative to which motifs are included)?

[EDIT] Then we could simply read this matrix.

cbravo93 · 2021-09-29T10:16:25Z

- Is it possible to run this clustering once, generating a matrix containing the similarities between each of the motifs (similar to what is already done for the clustered motif collection)?
That would indeed be the clustered collection: we have a df mapping each motif to its clusterID, and we could use that too. We can already provide this now, and we can make it the default (since it does not require input data or calculations).
- Or is the result very dependent on which motifs are included in the analysis (i.e. is the similarity measurement relative to which motifs are included)?
The motif clustering will be different depending on which motifs you use as input. For example, if you have all AP-1 motifs it will look for subclusters; if you have AP-1 + other it may just group all the AP-1 into 1 cluster.

I like the idea of using the clusters from the clustered motif collections though, more elegant than running it per result file if we think about the AP-1 example; faster and even easier to add :).
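
A minimal sketch of how the precomputed motif-to-cluster table could be used to group a result file, with no clustering at run time. Everything here is an assumption for illustration: the function name `collapse_by_cluster`, the `(motif_id, score)` result layout, and the plain dict standing in for the motif - clusterID df are all hypothetical, and picking the best-scoring motif as the cluster representative is just one possible choice.

```python
from collections import defaultdict


def collapse_by_cluster(results, motif_to_cluster):
    """Group enriched motifs by a precomputed cluster ID.

    results: list of (motif_id, score) tuples (hypothetical layout).
    motif_to_cluster: dict motif_id -> cluster_id (stand-in for the df).
    Motifs absent from the table become singleton clusters keyed by
    their own motif ID.
    """
    grouped = defaultdict(list)
    for motif_id, score in results:
        cluster_id = motif_to_cluster.get(motif_id, motif_id)
        grouped[cluster_id].append((motif_id, score))

    summary = {}
    for cluster_id, members in grouped.items():
        # Highest score first; the top motif represents the cluster.
        members.sort(key=lambda item: item[1], reverse=True)
        summary[cluster_id] = {
            "representative": members[0][0],
            "best_score": members[0][1],
            "members": [motif_id for motif_id, _ in members],
        }
    return summary
```

Because the table is fixed, the grouping does not change with the set of motifs in a given result, which avoids the AP-1 subclustering effect described above.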

SeppeDeWinter · 2021-10-08T13:12:03Z

Another idea:

If we do this, we could also colour-code motifs based on other measurements, for example by clustering on the Jaccard index of their target regions, i.e. visualise which motifs are present on overlapping sets of regions.
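
A rough sketch of the Jaccard idea, assuming each motif's target regions are available as a set of region identifiers. The names `jaccard` and `pairwise_jaccard` and the input layout are hypothetical, not existing functions in the package.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard index |A ∩ B| / |A ∪ B|; defined as 0.0 when both sets are empty."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def pairwise_jaccard(motif_regions):
    """Jaccard index for every unordered pair of motifs.

    motif_regions: dict motif_id -> iterable of region IDs (hypothetical layout).
    Returns a dict keyed by (motif_a, motif_b) with motif_a < motif_b.
    """
    return {
        (m1, m2): jaccard(motif_regions[m1], motif_regions[m2])
        for m1, m2 in combinations(sorted(motif_regions), 2)
    }
```

The resulting pairwise values could then be turned into a distance (1 - Jaccard) and fed to any clustering or colouring scheme.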

cbravo93 added the enhancement New feature or request label Sep 29, 2021
